Merge main into presence-refactor by WillieHabi · Pull Request #24336 · microsoft/FluidFramework

WillieHabi · 2025-04-11T21:58:14Z

Description

Adding a skipped test which reproduces an inconsistency in merge tree related to rollback --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

## Description Historian denyList is used to prevent DoS issues with our services. However, we do not need to use the denyList for delete ops. This PR removes the denyList from ops that are handled by the delete APIs of Historian.

…at for old loader) (#24276) We need to flush whenever incoming ops are processed (even non-runtime ops), since we track the reference sequence number of the in-progress batch and throw if it ever changes mid-batch. For this reason, we recently started calling `flush` in `ContainerRuntime.process`. In old loader code, `ContainerRuntime.process` is only called for runtime messages, so it is possible for the reference sequence number to advance without ContainerRuntime flushing. This change adds another flush call in response to DeltaManager's "op" event to close this gap. It's OK to call flush more often than strictly needed; it's a cheap no-op if there's nothing to flush.
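The invariant described above can be illustrated with a minimal sketch. Everything here is hypothetical (`BatchingRuntime`, `onOp`, the `Op` shape); it is not the `ContainerRuntime` code, only a model of "flush on every incoming op so the reference sequence number never changes mid-batch".

```typescript
// Hypothetical model, not the actual ContainerRuntime implementation.
type Op = { sequenceNumber: number; isRuntimeMessage: boolean };

class BatchingRuntime {
	private pending: string[] = [];
	private batchRefSeq: number | undefined;
	private currentRefSeq = 0;
	public readonly sent: string[][] = [];

	// Queue a local op. The batch is pinned to the reference sequence
	// number that was current when its first op was submitted.
	public submit(contents: string): void {
		if (this.batchRefSeq === undefined) {
			this.batchRefSeq = this.currentRefSeq;
		} else if (this.batchRefSeq !== this.currentRefSeq) {
			throw new Error("Reference sequence number changed mid-batch");
		}
		this.pending.push(contents);
	}

	// Send whatever is pending. A cheap no-op when the batch is empty.
	public flush(): void {
		if (this.pending.length === 0) {
			return;
		}
		this.sent.push(this.pending);
		this.pending = [];
		this.batchRefSeq = undefined;
	}

	// Invoked for every incoming op, runtime message or not:
	// flush first, then let the reference sequence number advance.
	public onOp(op: Op): void {
		this.flush();
		this.currentRefSeq = op.sequenceNumber;
	}
}
```

If `onOp` were only invoked for runtime messages, a non-runtime op could advance the sequence number while a batch is open, and the next `submit` would throw.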

## Bug

Recursively assigning an unhydrated and unparented node to itself hangs forever. Note that doing the same for a parented node does throw a usage error (added a test for this scenario).

## Fix

The root cause of this bug is that it gets stuck in a while loop in `TreeNodeKernel` trying to emit the `subtreeChangedAfterBatch` event for the subtree of the node. Added logic in `adoptBy` to detect recursion and throw a usage error. This is where the logic to detect this for parented nodes happens.

[AB#32207](https://dev.azure.com/fluidframework/235294da-091d-4c29-84fc-cdfc3d90890b/_workitems/edit/32207)
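As a rough illustration of the kind of check described in the fix, the sketch below refuses to attach a node anywhere inside its own subtree. The `SketchNode` class and its `adopt` method are hypothetical stand-ins, not the `TreeNodeKernel` or `adoptBy` code.

```typescript
// Hypothetical stand-in types; not the SharedTree implementation.
class UsageError extends Error {}

class SketchNode {
	public parent: SketchNode | undefined;
	public readonly children: SketchNode[] = [];

	// Attach `child` under this node, refusing to create a cycle.
	public adopt(child: SketchNode): void {
		// Walk from this node up to its root. If `child` is on that path,
		// adopting it would make the node its own ancestor.
		for (let n: SketchNode | undefined = this; n !== undefined; n = n.parent) {
			if (n === child) {
				throw new UsageError("A node cannot be inserted into its own subtree");
			}
		}
		child.parent = this;
		this.children.push(child);
	}
}
```

Failing fast here means any later walk over the subtree (for example, to emit change events) never sees a cycle and so cannot loop forever.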

## Description Remove alpha+legacy tagging on some non-exported API. Group exports to indicate which ones are actually intended to be used.

…oundBatchMessage in Outbox and related code (#24287)

This is a glorified "rename", splitting the type `BatchMessage` into two cases: `LocalBatchMessage` and `OutboundBatchMessage`, converting between the two when necessary. The intention is to change no behavior with this change. Here are the non-typing changes (that affect runtime code):

* Renaming the property `contents` to `serializedOp` for the Local batch case
* Shallow-copy the whole batch and each message in `Outbox.virtualizeBatch` to convert from `LocalBatch` to `OutboundBatch`.
  * This is where we copy `serializedOp` over to `contents`, which paves the way for `serializedOp` to stop being serialized in the next PR, which is the end goal here
* Minor refactoring of `createEmptyGroupedBatch` to support both Local/Outbound forms

### Context

As ops are accumulated into a batch in the ContainerRuntime, they're currently serialized immediately. However, we plan to keep those unserialized, and only stringify when virtualizing (grouping, compressing) in preparation to send to the server. See #24281. This is the first step there. We should model pre/post virtualized batches differently because they are different. Pre-virtualization batch messages are called `LocalBatchMessage` and post-virtualization ones are called `OutboundBatchMessage`.
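A simplified sketch of the Local-to-Outbound conversion may help picture the split. The type names and the `serializedOp`/`contents` properties come from the description above; the reduced field set (`referenceSequenceNumber` only) and the helper name `toOutboundBatch` are assumptions made for illustration, and the real types carry more fields.

```typescript
// Simplified shapes: the real types in the runtime have more fields.
interface LocalBatchMessage {
	serializedOp: string;
	referenceSequenceNumber: number; // assumed field, for illustration
}
interface OutboundBatchMessage {
	contents: string;
	referenceSequenceNumber: number; // assumed field, for illustration
}
interface LocalBatch {
	messages: LocalBatchMessage[];
}
interface OutboundBatch {
	messages: OutboundBatchMessage[];
}

// Hypothetical helper: shallow-copies the batch and each message,
// moving `serializedOp` over to `contents` along the way.
function toOutboundBatch(batch: LocalBatch): OutboundBatch {
	return {
		messages: batch.messages.map(({ serializedOp, ...rest }) => ({
			...rest,
			contents: serializedOp,
		})),
	};
}
```

Keeping the two shapes distinct lets the compiler flag any code that treats a not-yet-virtualized message as something ready to send.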

Removes lerna.json files since they are no longer used. Also removes pipeline references to lerna.json, and cleans up a few other references in other files. There is a reference to lerna in a comment in the mocha-test-setup project. I left it as-is, because it still seems relevant. Split off from #24072.

## Description Provide alpha APIs for accessing document contents without view schema

## Description

This PR adds a document denyList for Alfred and Nexus, similar to what we have in Historian. It is applied to all Alfred APIs. The list has two features:

1) A list of tenants to block: This allows blocking all requests from a given tenant. Useful for situations when we get DDoS-ed by one tenant.
2) A list of documents to block: Blocks documents that are large in size and can cause OOM issues.

One change here from the previous implementation is that we no longer use the tenantId -> documentId map. This is to make OCE life easier: one GUID is easier to type/copy-paste than two GUIDs. Since DocIds are GUIDs, chances of collision are negligible.

For some `/documents` APIs, we can skip document-level checks, as these are either `GET` calls for document data (not ops) or `DELETE` calls to delete the document.

This PR also has TODOs to remove the implementation in Historian and use the implementation in routerlicious instead.
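The two-list design can be sketched as follows. The class name, method names, and the `skipDocumentCheck` flag are hypothetical; the actual routerlicious implementation may look quite different.

```typescript
// Hypothetical sketch of a deny list with independent tenant and document lists.
class DocumentDenyList {
	private readonly tenants: Set<string>;
	private readonly documents: Set<string>;

	constructor(tenantIds: string[], documentIds: string[]) {
		this.tenants = new Set(tenantIds);
		this.documents = new Set(documentIds);
	}

	// Blocks every request from a listed tenant.
	public isTenantDenied(tenantId: string): boolean {
		return this.tenants.has(tenantId);
	}

	// Documents are keyed by document ID alone, with no tenant -> document map.
	public isDocumentDenied(documentId: string): boolean {
		return this.documents.has(documentId);
	}

	// Routes that only read document data or delete the document can pass
	// `skipDocumentCheck` to bypass the document-level check.
	public isDenied(tenantId: string, documentId: string, skipDocumentCheck = false): boolean {
		if (this.isTenantDenied(tenantId)) {
			return true;
		}
		return !skipDocumentCheck && this.isDocumentDenied(documentId);
	}
}
```

The tenant check always applies, so a blocked tenant stays blocked even on routes that skip the document-level check.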

## Description Back when the lazy entity types were originally implemented, the FlexTree API for object nodes exposed fields as properties, which could collide with members used in the implementation, so symbols were used to avoid this. As these properties are no longer used, these APIs can be migrated away from symbols for cleaner code and easier debugging.

…24304)

## Description

When using the VS Code test browser as configured by the tree workspace to debug tests, a handled exception is thrown whose message suggested this change. Making this change seems to work, and the exception is gone, which makes debugging with break on caught exceptions nicer.

## Breaking Changes

This changes when the debugger breaks, making it break less ;)

## Description Follow-up PR to #24298 to consume the new denyList into Historian. Replaces the old denyList implementation.

## Description Updates NextJS to address https://nvd.nist.gov/vuln/detail/CVE-2025-29927. Of note, we only use NextJS in a test application, nothing that gets deployed anywhere.

…24312) ## Description Tweak SharedTree tests to make less direct use of the SharedTreeClass and instead prefer its interfaces. Also reduce unnecessary use of its factory. These changes make the test suite more compatible with changes like #23729 which adjust how the kernel is used to implement the SharedObject.

## Description If building client with ES2022, this fixes the only compile error. Since using ES2022 saves over 2% (7309 bytes) on bundle size (measured using SharedTree bundles which includes all of the runtime shared tree uses as well), making it easy to enable and experiment with seems like a good idea. Also building targeting ES2022 provides a better debug experience for JS private properties by not polyfilling them to weak map lookups.

…ndle size. (#24161) ## Description This reduces the bundle size of container runtime by ~20KB. Delay-load parts of the summary module, such as RunningSummarizer and SummaryGenerator, to reduce the container runtime bundle size. This effort is part of reducing the fluid bundle size for loop, so as to make up for some of the increased bundle size due to the new shared tree. --------- Co-authored-by: Jatin Garg <jatingarg@Jatins-MacBook-Pro-2.local>

Docusaurus generates only two inline scripts during our website build: one for banner insertion and one for setting the dark/light theme. A search for script tags in our build output confirms these are the only ones. After adding their respective hashes and testing the website in a staging environment, I confirmed that these scripts are not flagged by the CSP. I also edited some files in our docs folder (react and docs) to see whether the generated scripts changed, but they remained the same. In addition, this adds an extra step to our website deployment pipeline that generates hashes for every script in our index.html and verifies that they match the ones hard-coded in our config file. This gives us a failure if someone changes something in our code that changes the generated inline scripts. No check on other files is needed, since the 'self' option in the CSP allows us to run external scripts (in contrast to inline scripts, for which we have removed the 'unsafe-inline' option). This also moves script-src to report-only mode again to have a test period.
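A hash check of this kind could look roughly like the sketch below. It is not the pipeline step from this PR: the function names are hypothetical, and the regex-based extraction is a simplification that assumes well-formed `<script>` tags (a real check would more likely use an HTML parser). The hash format, `'sha256-<base64 digest>'`, is the standard CSP hash-source syntax.

```typescript
import { createHash } from "node:crypto";

// CSP hash source for an inline script body: 'sha256-<base64 digest>'.
function cspHash(inlineScript: string): string {
	const digest = createHash("sha256").update(inlineScript, "utf8").digest("base64");
	return `'sha256-${digest}'`;
}

// Simplified extraction: bodies of <script> tags that have no src attribute.
function extractInlineScripts(html: string): string[] {
	const pattern = /<script(?![^>]*\ssrc=)[^>]*>([\s\S]*?)<\/script>/gi;
	const bodies: string[] = [];
	let match: RegExpExecArray | null;
	while ((match = pattern.exec(html)) !== null) {
		bodies.push(match[1] ?? "");
	}
	return bodies;
}

// Returns the hashes of inline scripts that are missing from the allow list.
// A non-empty result would fail the deployment step.
function findUnlistedHashes(html: string, allowedHashes: string[]): string[] {
	const allowed = new Set(allowedHashes);
	return extractInlineScripts(html)
		.map(cspHash)
		.filter((hash) => !allowed.has(hash));
}
```

Because the hash covers the exact script text, any change to a generated inline script, even whitespace, produces a new hash and surfaces as a mismatch.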

Fixes a bug seen in rollback where tie breaks over a rolled back insert stopped too soon due to a bad condition in insert. Also cleans up some rollback code to use consistent stamping.

## Description Update the version of the uuid package. It is now written in TypeScript, avoids throwing a caught exception on load, and now officially supports node 20 (which we use). More details in [the changelog](https://github.com/uuidjs/uuid/blob/main/CHANGELOG.md). The updated version of uuid, at least the subset of it pulled into the tree package, is 1041 bytes, where the old one was 868, so this does regress bundle size by 173 bytes uncompressed: the regression should be less for the compressed and parsed sizes. I think 173 bytes is acceptable for the developer workflow improvement of not having exceptions on startup (once the server update is pulled into client) or dealing with the separate types package. The tree bundle being considered here is 357568 bytes, making this a regression of approximately 0.048%, about the same as we would see from adding a nice UsageError to an API.

…ed batching scenario (#24292) ## Description With recent fixes to merge-tree, we're able to enable a bunch of previously skipped fuzz+regression tests around grouped batching. This PR does so. I also checked the status of turning on reconnect + grouped batching for obliterate and interval collection. There was one minor / easily fixed issue which this PR addresses and adds regression tests for. The remaining issues look like they occur much more rarely and are probably fundamentally related to known bugs in the segment normalization codepath. I've updated references in various comments to reflect this. --------- Co-authored-by: Abram Sanderson <absander@microsoft.com>

…e.flush throws (#24318) Flush is often called from secondary callstacks such that an error thrown will not propagate up to user code. However, failure to flush is a fatal error that we cannot recover from. This change adds a try-catch to `ContainerRuntime.flush` to ensure that the container is always closed if an error is thrown. I also went through all the errors thrown underneath there and made some tweaks to make them easier to interpret. e.g. using `DataProcessingError` more, since this is an error type we expect consumers to monitor closely.
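The pattern can be sketched as below. `SketchRuntime`, `flushCore`, and `closedWith` are hypothetical names, and rethrowing after closing is an assumption of this sketch rather than something the description states.

```typescript
// Hypothetical sketch; not the ContainerRuntime implementation.
class SketchRuntime {
	// Records the error the container was closed with, if any.
	public closedWith: Error | undefined;
	private readonly flushCore: () => void;

	constructor(flushCore: () => void) {
		this.flushCore = flushCore;
	}

	// Stand-in for closing the container; only the first error is kept.
	private close(error: Error): void {
		if (this.closedWith === undefined) {
			this.closedWith = error;
		}
	}

	public flush(): void {
		try {
			this.flushCore();
		} catch (e) {
			// flush may run on a secondary callstack where a thrown error never
			// reaches user code, so close explicitly before propagating.
			const error = e instanceof Error ? e : new Error(String(e));
			this.close(error);
			throw error; // assumption: the sketch rethrows after closing
		}
	}
}
```

Closing inside the catch guarantees the fatal state is visible even when nothing upstream observes the thrown error.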

This PR bumps all semver deps to 7.7.1 and @types/semver to 7.7.0. Context: #24248 (comment)

## Description Alfred, Nexus and Historian validate access tokens by making API calls to Riddler. We have seen a significant number of these calls fail with a 401 or 403 error, indicating that Riddler has determined the same tokens to be invalid multiple times. To reduce the calls between these services and Riddler, this PR introduces a means to cache invalid tokens. These tokens can be cached in plaintext since they are invalid and cannot become valid in the future.
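A minimal in-memory sketch of such a negative cache follows. All names are hypothetical, the expiry (`ttlMs`) is an assumption added only to bound memory, and treating status 200 as "valid" is likewise an assumption; the actual change may use a shared cache and different semantics.

```typescript
// Hypothetical in-memory negative cache keyed by the raw (invalid) token.
class InvalidTokenCache {
	private readonly entries = new Map<string, number>(); // token -> expiry time (ms)
	private readonly ttlMs: number;
	private readonly now: () => number;

	constructor(ttlMs: number, now: () => number = Date.now) {
		this.ttlMs = ttlMs;
		this.now = now;
	}

	public markInvalid(token: string): void {
		this.entries.set(token, this.now() + this.ttlMs);
	}

	public isKnownInvalid(token: string): boolean {
		const expiry = this.entries.get(token);
		if (expiry === undefined) {
			return false;
		}
		if (this.now() >= expiry) {
			this.entries.delete(token); // expired entries are dropped lazily
			return false;
		}
		return true;
	}
}

// `validateRemotely` stands in for the API call to Riddler and resolves
// to an HTTP status code.
async function isTokenValid(
	token: string,
	cache: InvalidTokenCache,
	validateRemotely: (token: string) => Promise<number>,
): Promise<boolean> {
	if (cache.isKnownInvalid(token)) {
		return false; // skip the remote call entirely
	}
	const status = await validateRemotely(token);
	if (status === 401 || status === 403) {
		cache.markInvalid(token);
		return false;
	}
	return status === 200;
}
```

Only rejections are cached: a valid token can later expire or be revoked, so positive results are always re-checked in this sketch.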

…le path (#24317) In my GitHub CodeSpace, the `Debug Current Mocha Test (auto build)` debug config seems to be using the Windows entry for `runtimeExecutable` even though it's a Linux environment, so F5 to debug has been broken for me. Copilot suggested I add an explicit Linux platform-specific `runtimeExecutable` path to the `launch.json` file, which fixed the issue.

changesets have a new format, but the build-tools were only recently updated in client, so old changesets need to be manually converted.

ChumpChief and others added 26 commits April 9, 2025 10:39

Remove FluidDataStoreContext.request (#24293)

5f19024

MergeTree: Test for rollback inconsistency. (#24296)

c4b64a4

Removed denyList from Historian delete calls (#24299)

b4a63d5

Cleanup map export tagging (#24297)

d7914c9

Add ITreeAlpha and related APIs (#24225)

18b6e05

Consumed new denyList into Historian (#24315)

aa254a2

build: Update NextJS to address CVE (#24277)

7ba81ea

MergeTree: Fix tie break over removed segments below min seq (#24301)

6bfd258

bump semver to 7.7.1 (#24327)

298dbd6

build: Update changesets to new format (#24335)

9f45885

WillieHabi requested review from a team as code owners April 11, 2025 21:58

WillieHabi closed this Apr 11, 2025

WillieHabi wants to merge 26 commits into test/client/presence-refactor from main


13 participants
