v5.4.0 — Parse Server 8/9 compatibility, hybrid-search RAG, and MCP streaming transport#15
Merged
Pull request overview
This PR updates the test/developer workflow to run reliably under Bundler while also shipping a sizable 5.4.0 feature set: hybrid lexical+vector search (RRF + optional $rankFusion), retrieval reranking, embedding spend caps, tighter vector exposure defaults (serialization + webhooks), and expanded MFA/email/push integration coverage plus supporting docker test-stack wiring.
Changes:
- Run test files via `bundle exec` and add test helper support for post-signup login to obtain a live session token under Parse Server 9.x behavior.
- Add major RAG/search and security-related features: hybrid search + reranking, spend-cap metering, vector visibility controls, and vector scrubbing in webhook payloads.
- Expand integration/unit coverage, examples, and docs; bump version to 5.4.0 and update Parse Server docker/test-stack configuration (MFA/auth, push, capturing email).
Reviewed changes
Copilot reviewed 52 out of 54 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| test/test_helper_integration.rb | Adds login_after_signup! helper to ensure live session tokens in integration tests. |
| test/lib/parse/vector_visibility_test.rb | Adds unit tests for new vector_visibility behavior and webhook redaction expectations. |
| test/lib/parse/vector_searchable_hybrid_test.rb | Tests Class.hybrid_search wrapper kwargs threading and object hydration with hybrid metadata. |
| test/lib/parse/vector_search_hybrid_test.rb | Tests RRF math, $rankFusion probe caching, native pipeline shape, and orchestration behavior. |
| test/lib/parse/user_save_signup_integration_test.rb | Adjusts integration tests to log in after signup to obtain session tokens. |
| test/lib/parse/user_authdata_strip_test.rb | Adds test ensuring MFA authData is reduced to leak-safe status only. |
| test/lib/parse/retrieval_retrieve_test.rb | Updates retrieval tests for hybrid routing and rerank behavior. |
| test/lib/parse/retrieval_reranker_test.rb | Adds unit tests for reranker protocol, fixture, and Cohere adapter parsing/redaction. |
| test/lib/parse/push_integration_test.rb | Fixes integration setup and adds server-backed push/install/audience lifecycle tests with cleanup. |
| test/lib/parse/mfa_totp_flow_integration_test.rb | Adds end-to-end TOTP MFA integration tests against MFA-enabled server. |
| test/lib/parse/mfa_test.rb | Fixes provisioning URI assertion to account for URL encoding. |
| test/lib/parse/live_query_integration_test.rb | Makes LiveQuery event tests deterministic by using public ACL objects + timeouts. |
| test/lib/parse/embeddings_spend_cap_test.rb | Adds unit tests for embedding token spend cap behavior. |
| test/lib/parse/embed_pending_test.rb | Adds unit tests for embedding backfill and compute_embedding!. |
| test/lib/parse/email_verification_disruptive_test.rb | Adds disruptive integration test that recreates server with email verification enabled. |
| test/lib/parse/client_rest_password_reset_integration_test.rb | Adds client-mode password reset integration coverage using capturing adapter. |
| test/cloud/dummy-push-adapter.js | Adds test-only push adapter enabling deterministic _PushStatus lifecycle tests. |
| test/cloud/capturing-email-adapter.js | Adds test-only email adapter capturing outgoing messages into Parse for assertions. |
| scripts/start-parse.sh | Wires push adapter, MFA auth config file, capturing email adapter, and public server URL for test stack. |
| scripts/docker/Dockerfile.parse | Pins Parse Server image tag to 9.9.0 for specific security fixes/features. |
| scripts/docker/docker-compose.verifyemail.yml | Adds compose override enabling verifyUserEmails for disruptive test. |
| README.md | Adds “What’s new in 5.4” summary, examples section, and bundle exec testing guidance; bumps version text. |
| Rakefile | Routes per-file test execution through bundle exec and adds optional MFA login flow for console. |
| lib/parse/webhooks/payload.rb | Adds vector column scrubbing for webhook payload object/original/update/objects. |
| lib/parse/vector_search/hybrid.rb | Introduces hybrid search module (RRF + optional native $rankFusion path + probe cache). |
| lib/parse/two_factor_auth/user_extension.rb | Fixes MFA API calls to pass session/master opts correctly; revises disable flow. |
| lib/parse/stack/version.rb | Bumps gem version to 5.4.0. |
| lib/parse/retrieval/retriever.rb | Enables hybrid: and rerank: in retrieval, adds rerank integration and hybrid wiring. |
| lib/parse/retrieval/reranker/cohere.rb | Adds Cohere /v2/rerank adapter with hardened HTTP handling. |
| lib/parse/retrieval/reranker.rb | Adds reranker protocol/base, fixture reranker, and Cohere autoload. |
| lib/parse/retrieval/agent_tool.rb | Charges embedding spend cap for semantic_search and maps breaches to rate-limit errors. |
| lib/parse/pipeline_security.rb | Allows $rankFusion as an Atlas stage-0 operator. |
| lib/parse/model/object.rb | Documents default ACL policy and adds hybrid score/ranks accessors; updates vector serialization default logic. |
| lib/parse/model/core/vector_searchable.rb | Adds vector_visibility DSL and hybrid_search API + hybrid hit builder. |
| lib/parse/model/core/embed_managed.rb | Adds compute_embedding! and embed_pending! bulk backfill API. |
| lib/parse/model/classes/user.rb | Preserves leak-safe MFA status projection while stripping sensitive authData; adds email verification request APIs. |
| lib/parse/model/classes/audience.rb | Fixes _Audience.query persistence by storing JSON string on wire and exposing Hash API. |
| lib/parse/model/acl.rb | Clarifies default ACL policy documentation (:owner_else_private). |
| lib/parse/embeddings/spend_cap.rb | Adds per-tenant embedding spend cap implementation (disabled by default). |
| lib/parse/embeddings.rb | Requires new spend cap module. |
| lib/parse/atlas_search.rb | Reduces role cache TTL default from 120s to 30s. |
| lib/parse/api/users.rb | Adds POST /verificationEmailRequest client method with rate-limit tracking. |
| Gemfile.lock | Bumps gem version and adds rotp/rqrcode (and deps). |
| Gemfile | Adds rotp and rqrcode to test/development group for MFA tests/QR. |
| examples/README.md | Adds index of runnable example scripts and common setup. |
| examples/rag_chatbot.rb | Adds end-to-end RAG example using managed embeddings + agent retrieval + LLM add-in. |
| examples/live_query_listener.rb | Adds interactive LiveQuery listener example scoped to a user session. |
| examples/basic_server.rb | Adds privileged (master-key) setup + schema/CRUD example. |
| examples/basic_client.rb | Adds unprivileged client example demonstrating ACL enforcement. |
| docs/mongodb_direct_guide.md | Documents enforcement behavior for Atlas index stages including $rankFusion and hybrid search. |
| docs/client_sdk_guide.md | Links to new runnable client/server examples. |
| docs/atlas_vector_search_guide.md | Documents hybrid search, reranking, and spend cap behavior; links to RAG example. |
| CHANGELOG.md | Adds detailed 5.4.0 changelog entry. |
| .gitignore | Ensures examples/README.md is not ignored. |
Comment on lines +125 to +127

    + # The account label is URL-encoded in a valid otpauth URI ("@" -> "%40"),
    + # so decode before asserting the address is present.
    + assert CGI.unescape(uri).include?("test@example.com"), "Should include account name"
0e3a5ee to e9ca91a
Comment on lines 15 to 17

      def setup
        skip "Integration tests require PARSE_TEST_USE_DOCKER=true" unless ENV["PARSE_TEST_USE_DOCKER"]
Comment on lines +171 to +175

    +   Parse::Embeddings::SpendCap.charge!(tenant_id: tenant_id, tokens: tokens)
    + rescue Parse::Embeddings::SpendCap::Exceeded => e
    +   raise Parse::Agent::RateLimitExceeded.new(
    +     retry_after: e.retry_after || e.window, limit: e.limit, window: e.window,
    +   )
Comment on lines 147 to +150

      @query = hash[:query]
    - @objects = hash[:objects] || []
    + @objects = Array(hash[:objects]).map do |o|
    +   self.class.scrub_vector_columns(self.class.scrub_credentials(o))
    + end
9ddaec9 to 1f084f1
Bump to 5.4.0 and add a broad set of RAG/vector, MFA, webhook, and server-compatibility features and fixes.

Highlights:
- Hybrid search & reranking: client-side reciprocal-rank fusion (RRF) for lexical + vector (`Class.hybrid_search`), native `$rankFusion` detection, and a cross-encoder reranker protocol with a Cohere adapter and deterministic test fixture.
- Retrieval & embeddings: `Parse::Retrieval` now accepts `hybrid:` and `rerank:`, `Parse::Embeddings::SpendCap` per-tenant token caps, `Class.embed_pending!` and `Parse::Object#compute_embedding!`, and vector visibility controls with webhook redaction.
- MFA and auth: fixes to MFA enrollment/disable flows, `Parse::User#verify_password`, improved `mfa_status` reporting, interactive console MFA login support (`rake client:console`) and docs.
- Webhooks & triggers: expanded trigger allowlist (including file/connection `@` pseudo-classes), new Cloud Code Webhooks guide and runnable webhook example, and clearer guidance on ActiveModel vs Parse trigger mapping.
- Server capability probing & behavior toggles: `Parse.server_supports?` / `Parse.server_features` probe, `Query#read_pref` fix, LiveQuery subscription options (`keys`/`fields`) fix, `Query#exclude_keys`, `Query#hint`, aggregate raw options, and other Parse Server compatibility fixes.
- MCP & transport: Streamable HTTP transport option, default dispatcher cap and abandoned-dispatcher observability.
- Tooling & docs: many docs and examples added (examples/, docs/webhooks_guide.md), README/CHANGELOG updates, added test coverage and new integration/unit tests.
- Build/test tweaks: Gemfile adds rotp and rqrcode for MFA tests; Rakefile updated to prompt/handle MFA and always run tests via `bundle exec` to avoid minitest activation issues.

Also includes numerous code, test, and docs changes to implement and exercise the above features.
v5.4.0 — Parse Server 8/9 compatibility, hybrid-search RAG, and MCP streaming transport
This branch is the 5.4.0 release (dev2 → main). It spans: a layer of Parse Server 8.x/9.x compatibility fixes and capability detection, full webhook trigger coverage (including the non-object auth/LiveQuery triggers and a fix that makes `beforeFind`/`afterFind` actually route), hybrid (lexical + vector) retrieval with reranking, and a consolidated MCP Streamable HTTP transport with disconnection hardening. It also carries MFA, email-verification, `Parse::Audience`, and test-tooling fixes.

Breaking changes
- `Parse::User#disable_mfa_master_key!` now fails closed. Because it bypasses MFA verification via the master key, it refuses to run without an authorization signal: pass `admin_role:` (the library verifies the operator's role membership) or `allow_unverified: true` (assert the operator was authorized out-of-band). Callers that previously passed only `authorized_by:` now raise `Parse::MFA::ForbiddenError`. Migration: add `admin_role:` or `allow_unverified: true`.
- `Parse.call_function` / `client.call_function` results are now ORM-typed for registered classes. The decoder (see "Parse Server 8.x / 9.x compatibility") rebuilds a returned object of a registered class into a `Parse::Object`, so an in-process Ruby caller reading a field goes through the property getter and gets the ORM type: an `enum`/`symbolize:` property yields a Symbol (`:active`, not `"active"`), a date yields `Parse::Date`, and a pointer yields `Parse::Pointer`, where a pre-5.4.0 caller read the raw JSON String/Hash. Scope is narrow: it affects only the client-side return of a cloud-function call read in Ruby; HTTP/JSON consumers are unaffected (JSON has no symbols), and webhook trigger returns (`beforeSave`/`afterSave`/etc.) are unaffected since those responses go to Parse Server and never pass through the result decoder. Migration: a cloud function whose JSON shape is a contract should return an explicit plain Hash and coerce typed fields (`status: obj.status.to_s`) rather than whole `Parse::Object`s; returning objects (and `obj.as_json`, which still emits the `__type` envelope) is what triggers the rebuild. Ruby callers reading typed fields should expect the ORM type or normalize at the read site.

Parse Server 8.x / 9.x compatibility
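For the capability probe (`Parse.server_supports?`), the lookup order of advertised flag first, version inference second, and fail open last could be sketched as follows. This is a hypothetical stand-alone illustration, not the SDK's implementation: `supports?`, `MIN_VERSION`, the flat flag names, and the `parseServerVersion` key are all assumptions made for the example.

```ruby
require "rubygems" # Gem::Version, used here only for version comparison

# Hypothetical table: the minimum server version assumed to ship a capability.
MIN_VERSION = {
  live_query_keys: Gem::Version.new("7.0.0"),
  typed_function_results: Gem::Version.new("8.0.0"),
}.freeze

# server_info is a Hash shaped like a serverInfo response (assumed shape).
def supports?(server_info, capability)
  features = server_info["features"]
  key = capability.to_s

  # 1. Prefer an explicitly advertised feature flag.
  return features[key] == true if features.is_a?(Hash) && features.key?(key)

  # 2. Otherwise infer from the reported version.
  minimum = MIN_VERSION[capability]
  version = server_info["parseServerVersion"]

  # 3. Fail open when nothing can be inferred.
  return true if minimum.nil? || version.nil? || !Gem::Version.correct?(version)

  Gem::Version.new(version) >= minimum
end
```

Failing open means an unknown capability or an unparsable version is treated as supported, which matches the stated intent of assuming the current server line.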
- `Query#read_pref` now rides the REST query body (`readPreference`), not just the `X-Parse-Read-Preference` header. Parse Server maps no such header, so over REST the preference was silently ignored and every scoped read hit the primary.
- LiveQuery subscriptions now send the `keys` subscription option (Parse Server 7.0 renamed it from `fields`), with `fields` kept as an alias for older servers. A projected subscription was silently receiving every column on 7.0+.
- Cloud-function results now decode `__type`-encoded Parse objects back into `Parse::Object` / `Parse::Pointer` (Parse Server 8.0 began encoding returned objects; 9.0 made it unconditional). Decoding is conservative: an unregistered class is left as a raw Hash, and plain data passes through untouched. This makes `call_function` results ORM-typed for registered classes (enums read as Symbols, etc.) for in-process Ruby callers; see Breaking changes above for the consequence and migration.
- `Parse.server_supports?(:capability)` / `Parse.server_features`: a capability probe built on the memoized `serverInfo` fetch, so future server changes can be feature-gated rather than discovered by breakage. It prefers the advertised `features` block and falls back to version inference, failing open to the current server line.
- `Query#explain` surfaces actionable guidance (proactive one-shot warning + reactive message, and the `explain_query` agent tool) when it hits Parse Server 9.0's `allowPublicExplain: false` default, instead of a bare 403.

Webhook trigger coverage
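Because a find payload carries no class name, routing for `beforeFind`/`afterFind` has to come from the webhook URL path `<endpoint>/<trigger>/<className>`. A minimal sketch of extracting and charset-validating that segment follows; `route_from_path`, `FIND_TRIGGERS`, and the allowed character set are hypothetical and the SDK's actual rules may differ.

```ruby
# Hypothetical allowlist of triggers routed from the path.
FIND_TRIGGERS = %w[beforeFind afterFind].freeze

# Assumed charset: letters, digits, underscore, with an optional leading
# "_" (system classes) or "@" (pseudo-classes).
CLASS_NAME = /\A[_@]?[A-Za-z][A-Za-z0-9_]*\z/

# Returns { trigger:, class_name: } or nil when the path is not a valid route.
def route_from_path(path)
  trigger, class_name = path.to_s.split("/").last(2)
  return nil unless FIND_TRIGGERS.include?(trigger)
  return nil unless CLASS_NAME.match?(class_name.to_s)

  { trigger: trigger, class_name: class_name }
end
```

Validating before the segment is used as a routing key keeps encoded separators and traversal fragments from ever reaching the dispatcher.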
- Webhook coverage now includes the non-object auth triggers (`beforeLogin`, `afterLogin`, `afterLogout`, `beforePasswordResetRequest`) and the LiveQuery triggers (`beforeConnect`, `beforeSubscribe`, `afterEvent`). `Parse::Webhooks::Payload` gains matching predicates (`before_login?` … `after_event?`, plus `auth_trigger?` / `live_query_trigger?`), an `event` accessor and `clients`/`subscriptions` connection counters, and captures the top-level `sessionToken` that connect/subscribe carry into `#session_token` (so `#user_client` / `#user_agent` work, while keeping the token out of `as_json` and the request log). Dispatch matches Parse Server's response contract: the body is ignored for all seven, so a `before*` handler returning `false` (which Parse Server would resolve as `{success:false}` and allow) is converted to a rejection. None of these run ActiveModel `save`/`create`/`destroy` callbacks even though the auth triggers carry a `_User`/`_Session`.
- `beforeFind` / `afterFind` webhook triggers now route. Parse Server omits the class name from the find payload body entirely (the matched `objects` carry no `className` and there is no top-level one), so the SDK could not resolve `parse_class` and the dispatcher never invoked the handler. The class is now threaded from the webhook URL path (`<endpoint>/<trigger>/<className>`) into the `Payload`. This is also a correctness fix, not just feature completeness: an unrouted `afterFind` returned `{"success": true}` (not an objects array), which Parse Server rejects, so a registered `afterFind` previously broke every matching query with a connection error. The path segment is charset-validated before use as a routing key.
- `:vector` columns are now stripped from `afterFind` webhook payload `objects` (the route-derived class is the only way to resolve the model, since the find payload carries no className; `vector_visibility :public` classes keep them).
- File (`@File`) and connection (`@Connect`) triggers now have a full register/fetch/delete lifecycle; `Parse::API::PathSegment.trigger_class_name!` accepts the `@`-prefixed pseudo-classes.
- `beforeCreate` / `afterCreate` are no longer presented as registerable webhook triggers (Parse Server has no such type); they remain ActiveModel callbacks that run inside the `beforeSave`/`afterSave` handler. Registering a create trigger raises a clear error pointing to the save trigger.

Webhook handler ergonomics
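A bare `return` inside a block run via `instance_exec` raises `LocalJumpError` once the method that created the block has returned, whereas the same block installed as a method gives `return` ordinary semantics. The sketch below demonstrates the difference in plain Ruby; the `Payload` stand-in, `run_with_instance_exec`, `run_as_method`, and `build_handler` are hypothetical names, not the SDK's dispatcher.

```ruby
# Stand-in payload; not the SDK's Parse::Webhooks::Payload.
class Payload
  def allowed?
    false
  end

  # Old style: evaluate the block with the payload as self.
  def run_with_instance_exec(&handler)
    instance_exec(&handler)
  end

  # New style (sketch): install the block as a method, then call it.
  # Blocks used as method bodies get method-like `return` semantics.
  def run_as_method(&handler)
    define_singleton_method(:__handler, &handler)
    __handler
  end
end

# Handlers are often built inside a method (initializer, config block) that
# has already returned by the time the webhook fires.
def build_handler
  proc do
    return :rejected unless allowed?
    :accepted
  end
end

payload = Payload.new
begin
  payload.run_with_instance_exec(&build_handler)
rescue LocalJumpError => e
  puts "instance_exec: #{e.class}"
end
puts "as method: #{payload.run_as_method(&build_handler).inspect}"
```

In both cases `self` inside the handler is the payload, so only the meaning of `return` changes.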
- Handlers can now use `return` to set the return value. Handlers previously ran via `instance_exec`, so a bare `return` raised `LocalJumpError` when the handler was defined inside a method (initializer, class body, config block); they now run as a method on the payload, giving `return` ordinary semantics. The legacy idioms (last expression, `next`, `break`) still set the result, and `self` is still the payload.
- `payload.after_response { … }` (alias `defer`) runs work after the webhook response is sent, off the client's critical path (search indexing, cache warming, fan-out). Uses `rack.after_reply` (Puma/Unicorn) when available, else a detached thread; callbacks run in registration order, are isolated, and fire only on the success path. In-process only (does not survive a worker restart); use a durable queue for work that must happen.

Webhook trigger coverage audit
- `Parse::Webhooks.trigger_audit` is a master-key operator audit that cross-references three sources of trigger truth across every registered class and reports where they drift: a model's ActiveModel callbacks, the locally registered webhook blocks (`Parse::Webhooks.routes`), and the triggers actually registered with Parse Server (`hooks/triggers`). It surfaces the non-obvious rule that a callback runs server-side for non-Ruby clients only when both a local webhook block and the matching server trigger are registered; a callback declared on its own is inert for JS/Swift/REST/Dashboard writes. Findings: `callbacks_inert`, `route_not_registered`, `orphan_server_trigger`, and `local_only_callbacks` (`*_update`/`*_validation` callbacks no server trigger can run). Framework-internal callbacks are filtered out by source location. Returns a Hash (or a human-readable summary with `pretty: true`); `network: false` audits callbacks against local routes without a master key.

Parse Server feature coverage
- `context:` propagation on `create_object` / `update_object`, `call_function` / `call_function_with_session`, and `Parse.call_function`, serialized to `X-Parse-Cloud-Context` and exposed to Cloud Code triggers; `Webhooks::Payload#context` reads it on the receive side.
- `Parse::User#verify_password(password)` / `Users#verify_password(username, password)` validate credentials via `POST /verifyPassword` (credentials in the body, mirroring `login`, so the plaintext password stays out of URLs/logs) without minting a session: a step-up / re-auth primitive.
- `Parse::Error::EmailNotVerifiedError` from `Parse::User.login!` distinguishes "verify your email" (`preventLoginWithUnverifiedEmail`, code 205) from bad credentials. It subclasses `Parse::Error::AuthenticationError`, so existing `rescue AuthenticationError` handlers keep catching it (non-breaking).
- `Query#exclude_keys(*fields)` (`excludeKeys`), LiveQuery `subscribe(watch: [...])` (update events only when named fields change, 7.0+), `Query#aggregate(pipeline, raw_values:, raw_field_names:)` (9.9.0 `rawValues`/`rawFieldNames`), `Query#hint(index_name)` (REST + mongo-direct), and the `:field.contained_by => [...]` (`$containedBy`) constraint.

Retrieval (RAG): hybrid search + reranking
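Reciprocal-rank fusion scores each document by summing `1 / (c + rank)` over every ranked list it appears in, so a document ranked well by both the lexical and the vector branch rises to the top. A minimal sketch of that pure fusion math, assuming 1-based ranks and the conventional constant 60; the method name `rrf_fuse` and its `rank_constant:` keyword are illustrative and are not the signature of `Parse::VectorSearch::Hybrid.rrf`.

```ruby
# Fuse several ranked lists of ids with reciprocal-rank fusion.
# rankings: Array of Arrays, each ordered best-first.
# Returns [[id, score], ...] sorted by descending fused score (ties by id).
def rrf_fuse(rankings, rank_constant: 60)
  scores = Hash.new(0.0)
  rankings.each do |ids|
    ids.each_with_index do |id, index|
      rank = index + 1 # ranks are 1-based
      scores[id] += 1.0 / (rank_constant + rank)
    end
  end
  scores.sort_by { |id, score| [-score, id] }
end

lexical = %w[a b c] # ids from the lexical branch, best first
vector  = %w[b d a] # ids from the vector branch, best first

rrf_fuse([lexical, vector]).each do |id, score|
  puts format("%s %.6f", id, score)
end
```

Only ranks enter the formula, so the two branches never need comparable raw scores.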
- `Class.hybrid_search(text:, lexical:, vector:, k:, fusion:)` fuses a lexical Atlas Search branch with a `$vectorSearch` branch via reciprocal-rank fusion (RRF). Two independent aggregations are required because `$vectorSearch` must be stage 0; each branch enforces ACL/CLP/`protectedFields` independently before fusion, so fused rows are already access-filtered. Results carry `#hybrid_score`, `#hybrid_ranks`, `#vector_score` / `#search_score`.
- `Parse::VectorSearch::Hybrid.rrf` (pure fusion math) and `.rank_fusion_supported?` (Atlas 8.0+ native `$rankFusion` detection via a cached behavioural probe, not version-string parsing).
- `Parse::Retrieval::Reranker` cross-encoder protocol with a deterministic `Reranker::Fixture` and a `Reranker::Cohere` adapter (`/v2/rerank`); `Parse::Retrieval.retrieve` now accepts `hybrid:` and `rerank:` (previously reserved, raising `NotImplementedError`), with tenant scope enforced authoritatively in both branches.
- `Parse::Embeddings::SpendCap`: opt-in per-tenant cumulative embedding-token cap with hard-refuse, charged at the `semantic_search` agent-tool boundary (admin agents exempt).
- `PipelineSecurity` admits `$rankFusion` (read-only, stage-0 Atlas operator) for the opt-in native path.

Retrieval (RAG): completeness
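The vector visibility rule can be pictured as a filter over a serialized object: vector columns are dropped unless the class is `:public` or the caller explicitly asks for them. A hypothetical sketch follows; the `MODELS` registry, `strip_vectors`, and the `embedding` field name are assumptions for illustration and do not mirror the SDK's internals.

```ruby
# Hypothetical registry: class name => vector fields and visibility setting.
MODELS = {
  "Article"   => { vector_fields: %w[embedding], visibility: :owner_only },
  "PublicDoc" => { vector_fields: %w[embedding], visibility: :public },
}.freeze

# Returns a copy of `object` without vector fields unless they should be kept.
# An explicit include_vectors: (true or false) overrides the class setting.
def strip_vectors(class_name, object, include_vectors: nil)
  model = MODELS[class_name]
  return object if model.nil? # unknown class: no vector fields to resolve

  keep = include_vectors.nil? ? model[:visibility] == :public : include_vectors
  return object if keep

  object.reject { |field, _| model[:vector_fields].include?(field.to_s) }
end

row = { "title" => "Hello", "embedding" => [0.1, 0.2] }
p strip_vectors("Article", row)
p strip_vectors("Article", row, include_vectors: true)
```

Defaulting to the restrictive setting means a class has to opt in before its embeddings leave the process.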
- `Class.embed_pending!` backfills null `:vector` fields via objectId-cursor pagination; `Parse::Object#compute_embedding!` forces a digest-tracked in-place recompute without a save.
- `vector_visibility :owner_only | :public` controls whether `:vector` properties appear in `as_json` by default (`:owner_only` is the safe default; an explicit `include_vectors:` always wins).
- Webhook payloads strip `:vector` columns from `object`/`original`/`update`/`objects` by default (a `:public` class keeps them).

MCP: Streamable HTTP transport + disconnection hardening
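Enforcing a declared per-tool timeout amounts to validating the value at registration and wrapping the handler call in `Timeout.timeout` at invocation. A minimal sketch under those assumptions; `ToolRegistry` and this local `ToolTimeoutError` are stand-ins written for the example, not `Parse::Agent::Tools` itself.

```ruby
require "timeout"

# Stand-in error class for the example.
class ToolTimeoutError < StandardError; end

class ToolRegistry
  DEFAULT_TIMEOUT = 30 # seconds

  def initialize
    @tools = {}
  end

  # Reject a non-positive timeout up front rather than at call time.
  def register(name, timeout: DEFAULT_TIMEOUT, &handler)
    unless timeout.is_a?(Numeric) && timeout.positive?
      raise ArgumentError, "timeout: must be positive, got #{timeout.inspect}"
    end
    @tools[name] = { timeout: timeout, handler: handler }
  end

  # Run the handler under its declared deadline.
  def invoke(name, *args)
    tool = @tools.fetch(name)
    Timeout.timeout(tool[:timeout]) { tool[:handler].call(*args) }
  rescue Timeout::Error
    raise ToolTimeoutError, "#{name} exceeded #{tool[:timeout]}s"
  end
end

tools = ToolRegistry.new
tools.register(:echo) { |text| text }
tools.register(:slow, timeout: 0.05) { sleep 1 }

puts tools.invoke(:echo, "hi")
begin
  tools.invoke(:slow)
rescue ToolTimeoutError => e
  puts e.message
end
```

A tool that legitimately needs more than the default has to say so at registration, which is the migration the description calls out.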
- `Parse::Agent::MCPRackApp.new(transport: :streamable_http)` (and `Parse::Agent.rack_app(transport:)`) enables the full MCP 2025-06-18 Streamable HTTP transport in one switch (POST→SSE streaming plus the server→client `GET /` notification stream), equivalent to `streaming: true, notifications: true`. Streamable HTTP is now documented as the primary embedded-Rack transport. `transport:` is a closed enum (`:streamable_http` / `:legacy` / `nil`); passing it alongside an explicit `streaming:`/`notifications:`, or an unknown value, raises `ArgumentError`. `rack_app { ... }` keeps its non-streaming behavior until it opts in. The switch needs a streaming-capable Rack server (Puma/Falcon/Unicorn) and has no effect under the WEBrick `MCPServer`.
- `max_concurrent_dispatchers:` now defaults to a finite 100 (was `nil`/unlimited), so a streaming surface is bounded out of the box: the cap fires a `503` JSON-RPC `-32000` instead of spawning unbounded orphan-prone threads. Pass an explicit integer to resize, or `nil` to knowingly run uncapped (logs a one-time warning); a non-positive/non-integer value raises `ArgumentError`.
- `MCPRackApp.abandoned_dispatcher_count` (process-wide counter of genuine orphans) plus a `parse.agent.mcp_dispatcher_abandoned` `ActiveSupport::Notifications` event on every premature close. On disconnect the dispatcher's cancellation token is tripped and the orphan is bounded by the per-tool `Timeout` and the clean MongoDB/REST I/O deadlines; it is intentionally not force-killed (a `Thread#kill` would skip the driver's connection-invalidation and risk returning a half-used pooled connection).
- Custom tools registered via `Parse::Agent::Tools.register` now have their declared `timeout:` (default 30s) actually enforced: `Tools.invoke` wraps the handler in `Timeout.timeout`, raising `ToolTimeoutError` (previously the custom-handler path ran unbounded). `register` rejects a non-positive `timeout:`. Migration: a custom tool that legitimately runs longer than 30s must now declare an explicit `timeout:`.

Auth and accounts
- `Parse::User` MFA lifecycle: `setup_mfa!`, `setup_sms_mfa!`, `confirm_sms_mfa!`, `disable_mfa!`, `disable_mfa_master_key!` no longer raise an internal argument error before reaching the server; `mfa_enabled?` / `mfa_status` report correctly after an ordinary fetch (a leak-safe `{status: "enabled"}` projection is preserved while the TOTP secret and recovery codes are stripped). Self-service `disable_mfa!` proves possession of the current code then unlinks the provider, confirming the disable from the server's own view.
- `rake client:console` prompts for a TOTP/recovery code (or reads `PARSE_LOGIN_MFA`) when logging into an enrolled account.
- `Parse::User.request_email_verification(email)` (and the instance form) re-sends the verification email for a registered, unverified user, mirroring `request_password_reset`.
- `Parse::Audience#query` is stored as a JSON string on the wire to match Parse Server's `_Audience.query` column type, so saving a hash query no longer fails the server schema check. Public API unchanged (assign/read a `Hash`).

Performance and tooling
- `Parse::AtlasSearch` `role_cache_ttl` now defaults to 30s (was 120) so role grants/revokes reflect in `$search` ACL decisions sooner.
- Per-file test tasks now run under Bundler (`bundle exec ruby`) to avoid a minitest activation/load error on individual files; README documents the requirement.
- Documentation clarifies the default ACL `:owner_else_private` policy, its private fallback, and how to override it via `set_default_acl` / `acl_policy`.

Notes for reviewers
- The native `$rankFusion` path is opt-in (`fusion: { method: :rrf_native }`) and falls back to client-side fusion when the cluster does not support it. Detection and the native pipeline shape are unit-tested, but native execution is not exercised in CI (no Atlas 8.0 cluster).
- Abandoned dispatchers are bounded (per-tool `Timeout` + clean I/O deadlines) but not force-killed; this was reviewed against the connection-pool corruption risk a `Thread#kill` would introduce.
- […] `beforeConnect` is effectively in-process only.
- `Reranker::Cohere` `/v2/rerank` response parsing is tested against a stubbed HTTP layer rather than a live key.