fix: make scope script save observation visible after upsert by jason-aelf · Pull Request #496 · aevatarAI/aevatar

jason-aelf · 2026-04-28T11:07:14Z

Summary

Activate script authority read models for both the definition actor and catalog actor before dispatching scope script definition/catalog write commands.
Ensure accepted scope script saves can become visible through the existing catalog read model and definition snapshot projection path.
Add focused unit and integration coverage for authority read-model activation order and real in-memory projection observation after script save.

Issue

Fixes #488.

Validation

dotnet test test/Aevatar.GAgentService.Tests/Aevatar.GAgentService.Tests.csproj --nologo --filter ScopeScriptCommandApplicationServiceTests — 7 passed
dotnet test test/Aevatar.GAgentService.Integration.Tests/Aevatar.GAgentService.Integration.Tests.csproj --nologo --filter "ScopeScriptApplicationServicesTests|ScopeScriptSaveObservationRuntimeTests" — 8 passed
bash tools/ci/test_stability_guards.sh — passed
bash tools/ci/query_projection_priming_guard.sh — passed

Notes

The new integration test uses in-memory runtime/projection settings to match the issue investigation environment.
The test includes polling because it validates cross-component eventual consistency through the real in-memory projection path; the file is added to tools/ci/test_polling_allowlist.txt.

codecov · 2026-04-28T11:22:45Z

Codecov Report

❌ Patch coverage is 75.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 71.62%. Comparing base (c51d203) to head (4436c8d).
⚠️ Report is 3 commits behind head on dev.

Files with missing lines	Patch %	Lines
...on/Scripts/ScopeScriptCommandApplicationService.cs	75.00%	0 Missing and 1 partial ⚠️

@@            Coverage Diff             @@
##              dev     #496      +/-   ##
==========================================
+ Coverage   71.54%   71.62%   +0.08%     
==========================================
  Files        1236     1236              
  Lines       89569    89573       +4     
  Branches    11712    11713       +1     
==========================================
+ Hits        64081    64156      +75     
+ Misses      20903    20820      -83     
- Partials     4585     4597      +12

Flag	Coverage Δ
ci	`71.62% <75.00%> (+0.08%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines	Coverage Δ
...on/Scripts/ScopeScriptCommandApplicationService.cs	`93.05% <75.00%> (+0.40%)`	⬆️

... and 8 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

eanzhao

Review

Architecture concern (P2)

In ScopeScriptCommandApplicationService.cs, explicitly activating authority read models in the command application service before dispatching write commands mixes read-model lifecycle concerns into the write path. The standard command lifecycle should remain Normalize -> Resolve Target -> Build Context -> Build Envelope -> Dispatch -> Receipt -> Observe.

If the projection runtime does not auto-activate read models on first event, the fix should be in the projection infrastructure (or an activation binder/lease), not in every command service. Consider adding a TODO comment linking to a follow-up issue to remove this workaround once the runtime handles activation implicitly.

Otherwise LGTM

Minimal production change (6 lines) - good.
Unit test verifies activation order before writes - correct.
Integration test properly exercises real in-memory projection and is added to the polling allowlist.
No race condition introduced; activation is awaited before write dispatch.

eanzhao

LGTM. Fix is minimal and correct — activates both definition and catalog authority read models before dispatching write commands, matching the existing activation pattern in .

Strengths:

Clean 6-line production change with clear causal link to the bug
Unit test properly validates activation ordering via execution log
Integration test covers the full end-to-end observation path with real in-memory projection
Polling properly allowlisted per CLAUDE.md test stability rules

One minor note: and are independent — could be parallelized with to avoid sequential overhead. Not blocking since correctness is the priority here.

eanzhao

LGTM. Fix is minimal and correct — activates both definition and catalog authority read models before dispatching write commands, matching the existing activation pattern in ScriptAuthorityProjectionPort.

Strengths:

Clean 6-line production change with clear causal link to the bug
Unit test properly validates activation ordering via execution log
Integration test covers the full end-to-end observation path with real in-memory projection
Polling properly allowlisted per CLAUDE.md test stability rules

One minor note: ActivateAsync for definition and catalog actors are independent — could be parallelized with Task.WhenAll to avoid sequential overhead. Not blocking since correctness is the priority here.

eanzhao · 2026-04-29T02:15:39Z

架构回归：把 authority readmodel activation 重新塞回写路径

这个 PR 直接撤销了 PR #181（closing #158/#159）刚刚收口完的 CQRS 边界。

1. 与既有设计文档冲突

docs/history/2026-04/2026-04-09-scripting-authority-write-path-cqrs-closure.md 已经明确：

§6.1 / §6.2：definition / catalog 写路径必须删除对 IScriptAuthorityReadModelActivationPort 的依赖。
§11.4：query 侧若需要 authority readmodel 已经存在，应通过 read-side 自己的 lifecycle/bootstrap 策略解决，而不是再回流到写路径解决。"authority readmodel activation 若继续保留，应归属于 read-side 专用 activation port / query bootstrap flow / projection lease owner"。
§13：禁止 definition/catalog/runtime command path 依赖 authority projection activation；要求 authority readmodel bootstrap 只能由 read-side activation / lease 入口承担，禁止回流到 runtime create / command path。

ScopeScriptCommandApplicationService.UpsertAsync 是 scripting authority 的写命令路径（实现 IScopeScriptCommandPort，分发 UpsertDefinitionWithSnapshotAsync 与 PromoteCatalogRevisionAsync），加 activation 正是上文明确禁止的模式。

2. 绕过既存守卫，但违反守卫意图

现有的 arch test CqrsBoundaryTests.ScriptingWritePath_ShouldNot_DependOn_AuthorityActivationPorts 只枚举了 RuntimeScriptDefinitionCommandService / RuntimeScriptCatalogCommandService / ScriptBehaviorRuntimeCapabilities / ScriptBehaviorRuntimeCapabilityFactory 这四个名字，且仅在 Aevatar.Scripting.(Application|Infrastructure) 命名空间下生效。

本 PR 把 activation 加到了更上层的 Aevatar.GAgentService.Application.Scripts.ScopeScriptCommandApplicationService——CI guard scripting_write_path_cqrs_guard.sh 也只硬编码了那 4 个文件。技术上能过，但这是守卫覆盖盲区，不是架构许可。query_projection_priming_guard.sh 通过的原因是它只扫 *Query*.cs / *ReadPort*.cs——和此处违规无关。

3. 根因诊断方向错了

issue #488 的现象是：本机 in-memory 模式下 catalog readmodel 永远 pending。这意味着生产（Kafka/ES）路径下 authority projection 是被某种引导机制激活的；in-memory 路径缺的是读侧的 bootstrap，不是写侧的预挂接。

正确的修复方向（与 §11.4 / §13 一致）：

给 read-side 加显式 activation/lease owner，例如在 ScopeScriptSaveObservationService 之外建一个专门负责 "needed-on-first-read" 的 bootstrap 组件（不是 query 方法体内同步 prime，而是通过显式 activation 协议）；或
在 host 启动 / 项目装配时按已知 scope 集合预热 authority projection；或
修复 in-memory projection runtime 的自动启动配置，使其与生产行为一致。

4. 命令侧契约也被偷偷扩展

现在 UpsertAsync 的同步语义除了"两条 command accepted for dispatch"之外，又夹带了"两个 authority projection 已 activate"。设计文档 §6.4 / §11.2 明确要求 accepted 语义只承诺 dispatch，不承诺 read-side 任何状态。两个 sequential ActivateAsync 还把同步路径延迟拉长，且 activation 失败会让命令直接挂掉——这是新增的写路径失败模式。

建议

revert 这两行 await _authorityReadModelActivationPort.ActivateAsync(...)；
把修复落在 read-side bootstrap（in-memory 启动时的 projection 预热 / 专用 activation owner）；
顺手把 ScriptingWritePath_ShouldNot_DependOn_AuthorityActivationPorts 守卫扩到 Aevatar.GAgentService.Application.Scripts.*Command* 与 IScopeScriptCommandPort 的实现类，避免下次再从这个口子回流；
集成测试本身（ScopeScriptSaveObservationRuntimeTests）保留，它在正确的修复方案下也是有价值的回归覆盖。

参考

PR Refine scripting accepted and visibility semantics #181: Refine scripting accepted and visibility semantics #181
Issue [P0] 从 scripting 写路径中移除 authority read-model activation #158 / [P0] 去掉 catalog promote/rollback 成功路径中的 query-port 追平依赖 #159
docs/history/2026-04/2026-04-09-scripting-authority-write-path-cqrs-closure.md
CLAUDE.md: "禁止 query-time replay/priming...刷新须通过正式 projection 会话、后台 materializer 或写侧预挂接 projection 完成"——但这条平台级允许并不覆盖 scripting authority 这条已经定向收口过的子链路。

eanzhao

Architecture Concern: Write-Path Read-Model Activation

This PR reintroduces into production code the exact pattern that was deliberately removed in the previous CQRS closure refactoring (documented in docs/history/2026-04/2026-04-09-scripting-authority-write-path-cqrs-closure.md).

Problem

ScopeScriptCommandApplicationService is a write-path command service. The PR injects IScriptAuthorityReadModelActivationPort into it and calls ActivateAsync before dispatching write commands. The design document explicitly forbids this:

"要求 authority readmodel bootstrap 只能由 read-side activation / lease 入口 承担，禁止回流到 runtime create / command path"

"禁止 definition/catalog/runtime command path 依赖 authority projection activation"

Why This Matters

CQRS boundary violation — The command path now controls projection lifecycle, coupling write success to read-side activation availability. If activation fails or is slow, the command itself becomes blocked or impaired.
Guard coverage gaps — None of the three existing CI guards catch this:
- query_projection_priming_guard.sh — only scans *Query*.cs / *ReadPort*.cs
- scripting_write_path_cqrs_guard.sh — only checks 4 specific files in Aevatar.Scripting.*
- CqrsBoundaryTests.ScriptingWritePath_ShouldNot_DependOn_AuthorityActivationPorts — only covers Aevatar.Scripting.(Application|Infrastructure) namespaces
ScopeScriptCommandApplicationService sits in src/platform/Aevatar.GAgentService.Application/Scripts/, outside all three guard perimeters. If this pattern is accepted, the guards must be extended to cover the platform layer as well.
Precedent risk — This creates a "platform layer exception" that could spread. Any upper-layer command service could make the same argument for pre-dispatch activation.

Suggested Alternatives

The underlying issue (observation endpoint returning Pending forever because projection never starts) is real, but the fix should be on the read-side:

Option A: Activate in ScopeScriptSaveObservationService.ObserveAsync (the read/observation path) — on first observation attempt, ensure the projection scope exists, then poll. This keeps activation on the query side where the design document says it belongs.
Option B: Use a decorator/interceptor on the observation port that lazily bootstraps the projection on first read, following the "read-side activation / lease 入口" pattern.
Option C: If activation truly belongs at the application service level, it should be in a dedicated read-side bootstrap service, not in the command dispatch method.

Minor Notes

The integration test (ScopeScriptSaveObservationRuntimeTests) is well-structured and the polling approach is reasonable for eventual consistency validation.
The unit tests for execution order are good.
The test_polling_allowlist.txt entry is appropriate.

eanzhao

Approved. Clean, focused fix with solid test coverage.

What I verified

Architecture compliance: activation in command path (not query path), no projection priming violation, no intermediate-layer state maps. ✅
DI registration chain: IScriptAuthorityReadModelActivationPort flows from AddScriptCapability → Scripting.Projection DI → ScriptAuthorityProjectionPort. ✅
Test coverage: unit test verifies activation ordering (execution log), integration test verifies end-to-end observability through real in-memory projection. ✅
Polling allowlist: test file properly documented and added to test_polling_allowlist.txt. ✅

One note (non-blocking)

The two ActivateAsync calls in ScopeScriptCommandApplicationService.UpsertAsync are sequential. If the first (definition actor) succeeds and the second (catalog actor) throws, the definition actor's projection session lingers without a corresponding command dispatch. In practice the idle release mechanism should clean it up, but worth being aware of. A future improvement could consider wrapping both in a try/catch with explicit cleanup, or using Task.WhenAll if the activation service is concurrency-safe.

await _authorityReadModelActivationPort.ActivateAsync(definitionActorId, ct);
await _authorityReadModelActivationPort.ActivateAsync(catalogActorId, ct);

eanzhao

发现一个架构边界问题，已在具体代码行评论。

fix: make scope script save observation visible after upsert

a8d1925

eanzhao reviewed Apr 29, 2026

View reviewed changes

eanzhao approved these changes Apr 29, 2026

View reviewed changes

eanzhao reviewed Apr 29, 2026

View reviewed changes

Comment thread src/platform/Aevatar.GAgentService.Application/Scripts/ScopeScriptCommandApplicationService.cs

eanzhao requested changes Apr 29, 2026

View reviewed changes

eanzhao approved these changes Apr 29, 2026

View reviewed changes

eanzhao reviewed Apr 29, 2026

View reviewed changes

Comment thread src/platform/Aevatar.GAgentService.Application/Scripts/ScopeScriptCommandApplicationService.cs

eanzhao reviewed Apr 29, 2026

View reviewed changes

Comment thread src/platform/Aevatar.GAgentService.Application/Scripts/ScopeScriptCommandApplicationService.cs

Merge branch 'dev' into fix/2026-04-28_scope-script-save-observation

4436c8d

jason-aelf merged commit d19862b into dev Apr 29, 2026
13 checks passed

potter-sun mentioned this pull request Apr 29, 2026

Verify Studio Script save observation after backend read-model activation #507

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: make scope script save observation visible after upsert#496

fix: make scope script save observation visible after upsert#496
jason-aelf merged 2 commits intodevfrom
fix/2026-04-28_scope-script-save-observation

jason-aelf commented Apr 28, 2026

Uh oh!

codecov Bot commented Apr 28, 2026 •

edited

Loading

Uh oh!

eanzhao left a comment

Uh oh!

eanzhao left a comment

Uh oh!

eanzhao left a comment

Uh oh!

Uh oh!

eanzhao commented Apr 29, 2026

Uh oh!

eanzhao left a comment

Uh oh!

eanzhao left a comment

Uh oh!

eanzhao left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

jason-aelf commented Apr 28, 2026

Summary

Issue

Validation

Notes

Uh oh!

codecov Bot commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

eanzhao left a comment

Choose a reason for hiding this comment

Review

Architecture concern (P2)

Otherwise LGTM

Uh oh!

eanzhao left a comment

Choose a reason for hiding this comment

Uh oh!

eanzhao left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

eanzhao commented Apr 29, 2026

架构回归：把 authority readmodel activation 重新塞回写路径

1. 与既有设计文档冲突

2. 绕过既存守卫，但违反守卫意图

3. 根因诊断方向错了

4. 命令侧契约也被偷偷扩展

建议

参考

Uh oh!

eanzhao left a comment

Choose a reason for hiding this comment

Architecture Concern: Write-Path Read-Model Activation

Problem

Why This Matters

Suggested Alternatives

Minor Notes

Uh oh!

eanzhao left a comment

Choose a reason for hiding this comment

What I verified

One note (non-blocking)

Uh oh!

eanzhao left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codecov Bot commented Apr 28, 2026 •

edited

Loading