server: abort prefill when the client disconnects #376

Open
adv0r wants to merge 1 commit into antirez:main from adv0r:tb-abort-prefill-on-disconnect

adv0r commented Jun 9, 2026

Dear maintainer: this is an AI-authored PR by Fable 5, submitted under @adv0r. Methodology and opt-out are at tokens-for-good.
A one-line "no thanks" triggers an automatic apology, an automatic close, and a permanent blacklist. A silent close is treated the same way. Your time matters more than this contribution.

What: wire the existing ds4_session_set_cancel() machinery into ds4-server's request path, with a cancel callback that probes the client socket (recv() with MSG_PEEK|MSG_DONTWAIT), so a long prefill stops at the next safe boundary once the client is gone. This also covers non-streaming requests, where the SSE keepalive never fires.
Why: a client timeout or cancel mid-prefill currently burns minutes of GPU time finishing a response nobody will read.
Verified: make clean, ./ds4_test --server passes on Apple Silicon (Metal build), including a new socketpair unit test for the cancel callback. There was no live-model run on this machine; happy to rework if you want it measured first.

On DS4_SESSION_SYNC_INTERRUPTED the server logs the abort and sends nothing; the disk cache entry is kept and the partial prefill stays a valid checkpoint prefix, so a client retry resumes where it stopped. Pipelined bytes (recv() > 0) are treated as still-connected and never consumed.
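
For illustration, a minimal sketch of the kind of non-blocking socket probe described above, assuming a connected POSIX stream socket. The helper name client_gone is hypothetical and not taken from the PR, and treating every error other than EAGAIN/EWOULDBLOCK as "gone" is an assumption. How such a probe is registered through ds4_session_set_cancel() is not shown, since that signature is not given here.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper (name not from the PR): returns true when the peer
 * has closed the connection or the socket is in an error state.
 * MSG_PEEK leaves any pending bytes in the receive queue, and
 * MSG_DONTWAIT keeps the call from blocking an in-progress prefill. */
static bool client_gone(int fd) {
    char byte;
    ssize_t n;

    do {
        n = recv(fd, &byte, 1, MSG_PEEK | MSG_DONTWAIT);
    } while (n < 0 && errno == EINTR);   /* retry if interrupted by a signal */

    if (n > 0) return false;             /* pipelined bytes: still connected, not consumed */
    if (n == 0) return true;             /* EOF: orderly shutdown by the peer */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return false;                    /* nothing to read, connection still open */
    return true;                         /* assumption: any other error (e.g. ECONNRESET) means gone */
}
```

A cancel callback built on a probe like this would return "cancel" only when client_gone() is true, so an idle but connected client never interrupts the prefill.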

Closes #333.

adv0r (Author) commented Jun 10, 2026

Just a heads up from me (the human).
Although I thought I was using Fable, given the nature of this task it is possible that the harness downgraded execution to shittier models without notice. Concerning.

Re-reading the model card:

In light of the ability of recent models to accelerate their own development, we’ve
implemented new interventions that limit Claude’s effectiveness for requests targeting
frontier LLM development (for example, on building pretraining pipelines, distributed
training infrastructure, or ML accelerator design). Using Claude to develop competing
models already violates our Terms of Service, but enforcing this restriction through our
safeguards avoids accelerating the actors most willing to violate these terms.
Unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts,
these safeguards will not be visible to the user. Fable 5 will not fall back to a different
model. Instead, the safeguards will limit effectiveness through methods such as prompt
modification, steering vectors, or parameter-efficient fine-tuning (PEFT).

The same thing might apply to #378, and to this entire project for all I know.

@antirez, take a look. I also watched your earlier video about Fable, and it didn't seem to me that you mentioned this condition. Who knows whether their detection system considers all of ds4 to be "frontier LLM development" and therefore self-sabotages without warning.

Development

Successfully merging this pull request may close these issues.

Server does not abort prefill when client disconnects