
Fix/llm self hosted stack review #91

Merged
Smana merged 2 commits into main from fix/llm-self-hosted-stack-review on May 13, 2026

Conversation


@Smana Smana commented May 13, 2026

No description provided.

Smana added 2 commits May 13, 2026 22:46
…mage optimization

FR/EN:
- Rewrite paged attention explanation: naive malloc worst-case → vLLM OS paging analogy
- Fix OpenCode framing: "just discovered" instead of claiming daily use
- Rewrite monitoring intro: multi-axis signal, LLM-specific indicators, existing obs stack
- Rephrase AI Gateway section intro: positional framing instead of overselling hook
- Simplify gen_ai labels paragraph (remove enumerate + "piece that changes everything")
- Replace "géostratégique" / "geostrategic" with "géopolitique" / "geopolitical"
- Fix "booming" → "rapidly evolving"

EN only:
- Defrenchify 18 literal translation artifacts (bears directly on → affects; Thanks to this → With this setup; whole gap → key difference; constrained tasks → well-scoped tasks; etc.)
- Fix "a L4" → "an L4"

Images (FR + EN):
- thumbnail.png: resize 2424×1728 → 1600px wide + lossless recompress (6.9 MB → 2.1 MB, −70%)
- grafana-dashboard-llm.png: lossless recompress (389 KB → 314 KB, −19%)
- architecture-vllm.png: lossless recompress (204 KB → 110 KB, −46%)
…n FR/EN

Technical accuracy:
- Replace invalid vLLM metric names (kv_cache_usage_perc → gpu_cache_usage_perc,
  drop _total suffix on prompt_tokens/generation_tokens, remove the non-existent
  num_requests_max metric)
- Match KEDA trigger YAML to the actual KCL composition in cloud-native-ref:
  scalar(vector(maxNumSeqs)) divisor and the real 0.6 GPU-cache threshold
- Drop misleading KEDA HTTP add-on entry from References — points at a
  commercial vendor URL and the component isn't deployed
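
The corrected metric names and threshold above can be sketched as a KEDA trigger. Only `vllm:gpu_cache_usage_perc`, the `0.6` threshold, and the `scalar(vector(...))` divisor pattern come from this PR; every other field (resource names, namespace, Prometheus address, the `32` standing in for `maxNumSeqs`) is a placeholder, and the authoritative version lives in the KCL composition in cloud-native-ref:

```yaml
# Hypothetical ScaledObject sketch; illustrative names, not the deployed config.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: vllm-scaler            # placeholder
  namespace: ai                # placeholder
spec:
  scaleTargetRef:
    name: vllm                 # placeholder
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring:9090   # placeholder
        # 0.6 is the real GPU-cache threshold per this PR.
        threshold: "0.6"
        # Corrected metric name (was kv_cache_usage_perc); the
        # scalar(vector(...)) divisor normalizes by maxNumSeqs as in
        # the KCL composition -- 32 here is a stand-in value.
        query: |
          avg(vllm:gpu_cache_usage_perc) / scalar(vector(32))
```

The `scalar(vector(N))` idiom turns a constant into a PromQL scalar usable as a divisor, which lets the composition template `maxNumSeqs` into the query without a recording rule.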

FR/EN parity (FR canonical):
- Trim EN-only paragraphs that drifted: 7B fragility qualifier, FinOps tagline,
  "Let's be honest" sentence in Promptfoo, "hybrid future" closing aphorism

Prose polish:
- Fix typo (celà → cela)
- Rewrite L64 with cloud-native-ref link + "aucune donnée ne quitte
  l'infrastructure" / "no data leaves the infrastructure"
- Resolve EN "exposed/exposed" repetition
- Tighten FR layer-walk transition and "less price pressure" wording in EN
@Smana Smana merged commit 4dc26f9 into main May 13, 2026
1 check passed
@Smana Smana deleted the fix/llm-self-hosted-stack-review branch May 13, 2026 21:24
