Add the endpoint for legacy v1 completion and the local 8bit quantisation by baixiac · Pull Request #39 · CogStack/CogStack-ModelServe

baixiac · 2026-02-11T17:43:56Z

feat: auto-utilise the local chat template if detected
feat: add the option to generate full sentences
feat: add the option for local 8bit quantisation
feat: add the gpt oss chat template
fix: skip quantisation if the model being loaded is already quantised
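
The interaction between the 8bit option and the "already quantised" fix can be sketched roughly as below. This is an illustration only, not the code merged in this PR: the function names and the `load_in_8bit` flag are hypothetical, and it assumes the Hugging Face convention that a pre-quantised checkpoint carries a `quantization_config` entry in its `config.json`.

```python
from typing import Any, Mapping


def is_prequantised(model_config: Mapping[str, Any]) -> bool:
    """Return True if the config has a non-empty `quantization_config` entry.

    Assumption: pre-quantised checkpoints record their quantisation
    settings under this key in config.json.
    """
    return bool(model_config.get("quantization_config"))


def should_quantise_to_8bit(model_config: Mapping[str, Any], load_in_8bit: bool) -> bool:
    """Decide whether local 8bit quantisation should be applied at load time.

    Hypothetical helper: quantise only when the caller asked for it and the
    model being loaded is not already quantised.
    """
    return load_in_8bit and not is_prequantised(model_config)


# A full-precision model with the option enabled gets quantised locally.
print(should_quantise_to_8bit({}, load_in_8bit=True))
# A checkpoint that already ships a quantisation config is left untouched.
print(should_quantise_to_8bit({"quantization_config": {"quant_method": "gptq"}}, load_in_8bit=True))
```

Keeping the decision in a small pure function makes the skip rule easy to unit test without loading any model weights.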


baixiac merged commit de5e049 into main Feb 13, 2026
7 checks passed

baixiac deleted the llm-serving branch February 13, 2026 15:03


baixiac merged 1 commit into main from llm-serving

baixiac commented Feb 11, 2026
