Add the endpoint for legacy v1 completion and the local 8bit quantisation #39

Merged

baixiac merged 1 commit into main from llm-serving on Feb 13, 2026
Conversation

baixiac (Member) commented Feb 11, 2026

feat: auto-utilise the local chat template if detected
feat: add the option to generate full sentences
feat: add the option for local 8bit quantisation
feat: add the gpt oss chat template
fix: skip quantisation if the model being loaded is already quantised
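
As a rough illustration of the 8bit option, the chat-template auto-detection, and the skip-if-already-quantised fix listed above, a minimal sketch using Hugging Face transformers and bitsandbytes might look like the following. The `load_model` helper and the `load_in_8bit` flag name are assumptions for illustration, not the code in this PR.

```python
from transformers import (
    AutoConfig,
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
)

def load_model(model_path: str, load_in_8bit: bool = False):
    """Load a causal LM, optionally in 8bit, skipping quantisation for
    checkpoints that are already quantised (hypothetical helper)."""
    config = AutoConfig.from_pretrained(model_path)
    # A checkpoint saved in quantised form carries a quantization_config;
    # in that case it must not be quantised a second time.
    already_quantised = getattr(config, "quantization_config", None) is not None

    quant_config = None
    if load_in_8bit and not already_quantised:
        quant_config = BitsAndBytesConfig(load_in_8bit=True)

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    # Auto-utilise the chat template bundled with the checkpoint, if any.
    has_chat_template = getattr(tokenizer, "chat_template", None) is not None

    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        quantization_config=quant_config,
        device_map="auto",
    )
    return model, tokenizer, has_chat_template
```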

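For the legacy v1 completion endpoint named in the PR title, a hedged sketch of an OpenAI-style `/v1/completions` route is given below, assuming a FastAPI server. The `generate_text` stub stands in for the server's actual generation call and is not part of this PR.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class CompletionRequest(BaseModel):
    model: str
    prompt: str
    max_tokens: int = 16
    temperature: float = 1.0

def generate_text(prompt: str, max_new_tokens: int, temperature: float) -> str:
    # Placeholder for the real generation path (e.g. a model loaded with the
    # helper sketched above); echoes the prompt so the example runs.
    return prompt + " ..."

@app.post("/v1/completions")
def create_completion(req: CompletionRequest):
    text = generate_text(
        req.prompt,
        max_new_tokens=req.max_tokens,
        temperature=req.temperature,
    )
    # Response shaped like the legacy OpenAI text-completion payload.
    return {
        "object": "text_completion",
        "model": req.model,
        "choices": [{"text": text, "index": 0, "finish_reason": "stop"}],
    }
```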
baixiac merged commit de5e049 into main on Feb 13, 2026
7 checks passed
baixiac deleted the llm-serving branch on February 13, 2026 at 15:03