diff --git a/.gitignore b/.gitignore index 49b2bd0..feffc44 100644 --- a/.gitignore +++ b/.gitignore @@ -49,6 +49,8 @@ logs !docs/mongodb_index_optimization_guide.md !docs/atlas_vector_search_guide.md !docs/usage_guide.md +!docs/webhooks_guide.md !SECURITY.md !docs/client_sdk_guide.md !docs/acl_clp_guide.md +!examples/README.md diff --git a/CHANGELOG.md b/CHANGELOG.md index 552d5e6..4600c0d 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,466 @@ ## parse-stack-next Changelog +### 5.4.0 + +#### Parse Server 8.x / 9.x compatibility fixes + +A set of latent correctness fixes for behaviors that changed across Parse +Server versions, plus a capability-detection layer so future server changes +can be feature-gated rather than discovered by breakage. + +- **FIXED**: `Query#read_pref` now rides the REST query body + (`readPreference`), not just the `X-Parse-Read-Preference` header. Parse + Server reads read preference from request options and maps no such header, + so over REST the header alone was silently ignored and every read hit the + primary. Scoped reads now route correctly; the mongo-direct path (which + passes the preference straight to the driver) was already correct. The + normalized token (`PRIMARY`, `SECONDARY_PREFERRED`, …) is emitted verbatim, + matching Parse Server's accepted values. +- **FIXED**: LiveQuery field projection now emits the `keys` subscription + option (and keeps `fields` for older servers). Parse Server 7.0 renamed the + projection option from `fields` to `keys`; a frame carrying only `fields` + was ignored on 7.0+, so projected subscriptions silently received every + column. `subscribe(keys: […])` is accepted on `Query#subscribe`, + `Klass.subscribe`, and the LiveQuery client, with `fields:` retained as an + alias. +- **BREAKING**: cloud-function results now decode `__type`-encoded Parse objects + back into `Parse::Object` / `Parse::Pointer`. 
Parse Server 8.0 began encoding + returned objects and 9.0 made it unconditional, so `Parse.call_function` + returned an encoded dictionary where callers expected attribute access. + Decoding is conservative: only a fully-shaped Object/Pointer envelope is + converted, an Object of an unregistered class is left as a raw Hash (so no + attributes are lost), and plain data passes through untouched. This changes the + read type of typed fields for **in-process Ruby callers**: a returned object of + a *registered* class is now an ORM instance, so reading a field goes through the + property getter and carries its ORM type — an `enum` / `symbolize:` property + yields a Symbol (`:active`, not `"active"`), a date yields a `Parse::Date`, and + a pointer field yields a `Parse::Pointer`, where a pre-5.4.0 caller read the raw + JSON String/Hash. HTTP/JSON consumers are unaffected — JSON has no symbols, so + values re-serialize to strings on the way out. Migration: a cloud function whose + JSON shape is a contract should return an explicit plain Hash and coerce typed + fields (`status: obj.status.to_s`) rather than returning whole `Parse::Object`s + — returning objects (and `as_json`, which still emits the `__type` envelope) is + what triggers the client-side rebuild; reading `obj.as_json` back through a + caller does not escape it. Ruby callers that read typed fields off a result + should expect the ORM type or normalize at the read site. (`lib/parse/client.rb`) +- **IMPROVED**: `Query#explain` surfaces actionable guidance when it hits a + permission error — Parse Server 9.0 defaults `allowPublicExplain` to false, + so a non-master explain now reports that it requires the master key or + `allowPublicExplain: true` instead of a bare 403. +- **NEW**: `Parse.server_supports?(:capability)` and `Parse.server_features` + (and the client-level equivalents) expose a capability probe built on the + memoized `serverInfo` fetch. 
It prefers the advertised `features` block where + present and falls back to version inference for behavior flags the block does + not carry, failing open to the current server line when the version is + unknown. +- **IMPROVED**: the webhook trigger allowlist now mirrors Parse Server's full + set of trigger types, so registering `beforeLogin`, `afterLogin`, + `afterLogout`, `beforePasswordResetRequest`, `beforeConnect`, + `beforeSubscribe`, and `afterEvent` hooks is no longer pre-rejected by the + SDK. +- **NEW**: first-class webhook routing for the non-object trigger shapes — the + authentication triggers (`beforeLogin`, `afterLogin`, `afterLogout`, + `beforePasswordResetRequest`) and the LiveQuery triggers (`beforeConnect`, + `beforeSubscribe`, `afterEvent`). `Parse::Webhooks::Payload` gains matching + predicates (`before_login?` … `after_event?`, plus `auth_trigger?` / + `live_query_trigger?`), an `event` accessor (the `afterEvent` event type) and + `clients` / `subscriptions` connection counters, and captures the top-level + `sessionToken` that connect/subscribe carry into `#session_token` (so + `#user_client` / `#user_agent` work, while keeping the token out of `as_json` + and the request log). Dispatch matches Parse Server's response contract: the + response body is ignored for all seven triggers, so a handler that returns + `false` from a `before*` variant — which Parse Server would otherwise resolve + as `{success:false}` and **allow** — is converted to a rejection (`error!` + remains the explicit, message-carrying form), while a returned object is + normalized to a success no-op rather than serialized back. None of these run + ActiveModel `save` / `create` / `destroy` callbacks, even though the auth + triggers carry a `_User` / `_Session`. `@Connect` / `@File` trigger paths now + route correctly. LiveQuery triggers are delivered over HTTP only in a + co-located single-process setup; `beforeConnect` is effectively in-process + only. 
+- **NEW**: file (`@File`) and connection (`@Connect`) triggers now have a full + register/fetch/delete lifecycle through the SDK. + `Parse::API::PathSegment.trigger_class_name!` accepts Parse Server's + `@`-prefixed pseudo-classes for trigger paths (previously `fetch_trigger` / + `delete_trigger` rejected the leading `@`, so an `@File` / `@Connect` trigger + could be created but not managed). +- **CHANGED**: `beforeCreate` / `afterCreate` are no longer presented as + registerable webhook triggers — Parse Server has no such trigger type and + rejects them. They remain ActiveModel callbacks (`before_create` / + `after_create`) that run inside the `beforeSave` / `afterSave` handler for new + objects, so registering `beforeSave` / `afterSave` enables both the save and + create callbacks. Attempting to register a create trigger now raises a clear + error pointing to the save trigger instead of failing with a server-side + "invalid hook declaration". +- **FIXED**: `beforeFind` / `afterFind` webhook triggers now route. Parse Server + omits the class name from the find payload body entirely (the matched + `objects` carry no `className` and there is no top-level one), so the SDK could + not resolve `parse_class` for a find request and the dispatcher never invoked + the registered handler. The class is now threaded from the webhook URL path + (`//`) into the `Parse::Webhooks::Payload`, so + find handlers fire. This also matters for correctness, not just feature + completeness: an unrouted `afterFind` returned `{"success": true}` (not an + objects array), which Parse Server rejects — so a registered `afterFind` + previously broke every matching query with a connection error rather than + no-op'ing. The path segment is charset-validated before use as a routing key. +- **FIXED**: `:vector` columns are now stripped from `afterFind` webhook payload + `objects`. 
Because the find payload carries no class name, the route-derived + class is the only way to resolve the model and its declared `:vector` fields; + the previous per-element `className` lookup found nothing and left embeddings + in the payload. (`vector_visibility :public` classes keep them, as elsewhere.) +- **IMPROVED**: `Query#explain` now warns proactively (one-shot) when a clearly + non-master explain runs against a server known to restrict it (Parse Server + 9.0+ defaults `allowPublicExplain` to false), and the `explain_query` agent + tool surfaces the same guidance on a permission error — both in addition to + the existing reactive message. The warning is suppressed for master-key and + unknown-version calls to avoid noise on a flag `/serverInfo` does not expose. +- **DOCS**: added a Cloud Code Webhooks guide and a runnable + `examples/webhook_server.rb`, plus a README section on how ActiveModel + callbacks relate to Parse Server trigger types, the synchronous-latency model + for `afterSave`, and the inbound replay/freshness protection. + +#### `exclude_keys` honored on the direct-MongoDB read path + +- **IMPROVED**: `Query#exclude_keys` now takes effect on the mongo-direct read + path (`results_direct`, `first_direct`, and an aggregation that auto-promotes + to direct MongoDB, such as an `$inQuery`/`$notInQuery` pointer constraint). + MongoDB's `$project` is an allowlist with no denylist equivalent, so the + excluded fields were previously dropped silently and the full object came + back. The SDK now applies the denylist as a post-fetch sanitize over the + decoded results — the MongoDB query itself is unchanged. On this path the + strip is recursive by field name (it removes the field at every depth, + including inside included/nested objects), which is broader than the REST + path's top-level/dotted `excludeKeys` scoping. 
Decode-critical reserved + fields (`objectId`, `className`, `__type`, `createdAt`, `updatedAt`, `ACL`, + and their Mongo storage-form names) are never stripped, so excluding one of + them is a no-op rather than a way to break object reconstruction. + `exclude_keys` remains a result-shaping convenience, not an ACL/CLP boundary — + use `keys` or `protectedFields` to keep a field from leaving the database. + +#### Webhook handler blocks support explicit `return` + +- **IMPROVED**: a registered webhook handler (`Parse::Webhooks.route`, + `webhook`, `webhook_function`) can now use an explicit `return value` to set + its result. Previously the block ran through `instance_exec`, so a bare + `return` raised `LocalJumpError: unexpected return` whenever the handler was + defined inside a method (an initializer, a class body, a config block) — the + only way to return a value was to make it the last expression. Handlers now + run as a method on the request payload, giving `return` ordinary + method semantics: + + ```ruby + Parse::Webhooks.route :before_save, :Post do + post = parse_object + return post if post.title.present? # now works + error! "title required" + end + ``` + + The legacy idioms are unchanged — the last expression's value, `next value`, + and `break value` all still set the result — and `self` is still the payload, + so `parse_object`, `params`, and `error!` resolve directly inside the block. + `raise` / `error!` and returning `false` from a `before_save` halt the save + exactly as before. `return` ends the handler, so it cannot be followed by more + work in the same block; to run work after the response, use `after_response` + (below). +- **NEW**: `payload.after_response { … }` (alias `defer`) registers work to run + **after** the webhook response has been sent, off the client's critical path — + for search indexing, cache warming, or fan-out that should not add latency to + the save/function the client is waiting on. 
Under a server that exposes + `rack.after_reply` (Puma, Unicorn) the block runs once the response is flushed + to the socket on the same worker thread; otherwise it falls back to a detached + thread. Multiple callbacks run in registration order and each is isolated, so + one raising affects neither the response nor the others. Callbacks are + dispatched only on the success path (a rejected `before_save` does not trigger + follow-up work) and only when the payload is processed through the mounted + `Parse::Webhooks` Rack app. The work runs after the response is flushed, which + is not a guarantee about when the row commits, and it runs in-process (it does + not survive a worker restart) — for work that *must* happen, use a durable job + queue. + +#### Webhook trigger coverage audit + +- **NEW**: `Parse::Webhooks.trigger_audit` — a master-key operator audit that + cross-references three sources of trigger truth across every registered class + and reports where they drift: a model's ActiveModel callbacks + (`before_save` / `after_save` / `after_create` / ...), the locally registered + webhook blocks (`Parse::Webhooks.routes`), and the triggers actually + registered with Parse Server (`hooks/triggers`). It surfaces the non-obvious + rule that a callback runs server-side for non-Ruby clients only when both a + local webhook block and the matching server trigger are registered — a + callback declared on its own is inert for JS/Swift/REST/Dashboard writes. + Findings include `callbacks_inert` (callbacks that will not run for non-Ruby + clients), `route_not_registered` (a local block with no server trigger), + `orphan_server_trigger` (a server trigger with no local handler), and + `local_only_callbacks` (`*_update` / `*_validation` callbacks that no server + trigger can ever run). Framework-internal callbacks are filtered out by source + location so the report shows only app-defined logic. 
Returns a Hash by + default, a human-readable summary with `pretty: true`, and `network: false` + audits callbacks against local routes without a master key. See the Cloud Code + Webhooks guide for details. + +#### Parse Server feature-coverage additions + +Closes a set of backend capabilities the SDK did not previously surface. + +- **NEW**: `context` propagation. Pass `context:` to `create_object` / + `update_object`, `call_function` / `call_function_with_session`, and + `Parse.call_function`; it is serialized to the `X-Parse-Cloud-Context` header + (`Parse::Protocol::CLOUD_CONTEXT`) and made available to Cloud Code triggers. + On the receive side, `Parse::Webhooks::Payload#context` exposes the incoming + context (not credential-scrubbed). Backward compatible — omitting `context:` + sends nothing. +- **NEW**: `Parse::User#verify_password(password)` and + `Parse::API::Users#verify_password(username, password)` validate credentials + via `POST /verifyPassword` (credentials in the request body, mirroring + `login`) without minting a session — a step-up / re-authentication primitive. + POST is used over the GET form so the plaintext password stays out of the URL + (and therefore out of access logs, proxy logs, and the response cache key). +- **NEW**: `Parse::Error::EmailNotVerifiedError` is raised from + `Parse::User.login!` when Parse Server rejects a login because the account's + email is unverified (`preventLoginWithUnverifiedEmail`; Parse Server returns + code 205 in this context). It subclasses `Parse::Error::AuthenticationError`, + so existing `rescue Parse::Error::AuthenticationError` handlers keep catching + the unverified-email case (no breaking change — it was a plain + `AuthenticationError` before); callers who want to distinguish "verify your + email" from "bad credentials" rescue the narrower subclass first. 
+- **NEW**: `Query#exclude_keys(*fields)` emits the Parse Server `excludeKeys` + parameter (a server-side field denylist, the complement of `keys`) — fetch a + row minus large columns (e.g. a managed `:vector`) without enumerating every + field you do want. +- **NEW**: LiveQuery `watch` — `subscribe(watch: [...])` (on `Klass.subscribe`, + `Query#subscribe`, and the LiveQuery client) requests update events only when + the named fields change, cutting event volume on busy subscriptions. Emitted + as the `watch` subscription option (Parse Server 7.0+). +- **NEW**: `Query#aggregate(pipeline, raw_values:, raw_field_names:)` forwards + the Parse Server 9.9.0 `rawValues` / `rawFieldNames` aggregation options + through the REST aggregate path. +- **NEW**: `Query#hint(index_name)` forces a specific index. Emitted in the + compiled REST query body and forwarded to the mongo-direct path + (`Parse::MongoDB.aggregate` / `find` `hint:`), so a bad plan diagnosed with + `explain` can be corrected without dropping to `mongosh`. +- **NEW**: `:field.contained_by => [...]` (`$containedBy`) query constraint — + matches when the array field's values are all within the supplied set (the + inverse of `$all`), rounding out array-operator coverage. + +#### Hybrid search and reranking for RAG + +- **NEW**: `Class.hybrid_search(text:, lexical:, vector:, k:, fusion:)` fuses a + lexical Atlas Search branch with a `$vectorSearch` branch using + reciprocal-rank fusion (RRF). Lexical search captures exact-token matches + (proper nouns, codes); vector search captures paraphrase; fusing the two beats + either alone on most workloads. Each branch enforces ACL/CLP/`protectedFields` + independently before fusion, so results are already access-filtered — there is + no separate hydration fetch to secure. 
+ + ```ruby + Article.hybrid_search( + text: "how do I reset my password", + lexical: { index: "article_search" }, + vector: { num_candidates: 200 }, + k: 20, + fusion: { k_constant: 60, weights: { lexical: 0.4, vector: 0.6 } }, + ) + ``` + + Returned objects carry `#hybrid_score`, `#hybrid_ranks`, and (when the branch + contributed) `#vector_score` / `#search_score`. +- **NEW**: `Parse::VectorSearch::Hybrid.rrf` exposes the pure RRF fusion math, + and `Parse::VectorSearch::Hybrid.rank_fusion_supported?` detects Atlas 8.0+ + native `$rankFusion` via a cached behavioural probe (1-hour TTL) rather than + version-string parsing. +- **NEW**: `Parse::Retrieval::Reranker` cross-encoder reranking protocol with a + deterministic `Reranker::Fixture` (zero-network, for tests) and a + `Reranker::Cohere` adapter (`/v2/rerank`). A reranker reorders retrieved + documents by relevance before chunking. +- **NEW**: `Parse::Retrieval.retrieve` now accepts `hybrid:` (route through + `hybrid_search`) and `rerank:` (a reranker that reorders documents and sets the + chunk score to the cross-encoder relevance). Both kwargs were previously + reserved and raised `NotImplementedError`. When `tenant_scope:` is supplied, the + tenant constraint is enforced authoritatively in BOTH hybrid branches: a + caller-supplied `vector_filter` / `lexical` filter can narrow the result set but + can no longer replace (and thereby escape) the tenant scope. +- **NEW**: `Parse::Embeddings::SpendCap` adds an opt-in per-tenant cumulative + embedding token cap with hard-refuse semantics. The `semantic_search` agent + tool charges the estimated query tokens against the caller's tenant budget on + every call (attacker-controlled chat input embeds text); a breach surfaces as a + rate-limited tool error. Disabled by default; admin agents are exempt. The token + estimate takes the larger of a character- and a byte-based heuristic so + multibyte input (e.g. 
CJK, emoji) is not undercounted — the chars/4 ratio only + holds for ASCII, and this estimate is the sole basis for the refuse decision. +- **CHANGED**: `PipelineSecurity::ALLOWED_STAGES` and `STAGE0_ONLY_ATLAS_STAGES` + admit `$rankFusion` (Atlas 8.0+ native server-side RRF) — a read-only, + stage-0 Atlas operator like `$vectorSearch`. +- **NOTE**: Hybrid fusion runs client-side by default. The native single-roundtrip + `$rankFusion` path is opt-in (`fusion: { method: :rrf_native }`) and falls back + to client-side fusion when the cluster does not support it; detection and the + native pipeline shape ship, but live results route through the always-enforced + two-aggregate client path unless native is explicitly requested. When native + fusion does execute, top-level rows are re-verified against the scope's `_rperm` + after the fusion stage and fail closed (a row must carry an `_rperm` that + explicitly satisfies the scope), and `numCandidates` is clamped to Atlas's + `[limit, 10000]` range to match `Parse::VectorSearch`. + +#### RAG completeness: bulk embed, vector visibility, webhook redaction + +- **NEW**: `Class.embed_pending!` backfills embeddings for records whose managed + `:vector` field is still null, using objectId-cursor pagination (robust to the + result set shrinking as records embed). Intended as a master-key maintenance + operation; supports `field:`, `batch_size:`, `limit:`, and `where:`. +- **NEW**: `Parse::Object#compute_embedding!` forces an in-place recompute of a + record's managed embedding(s) without a save (digest-tracked — a provider call + happens only when the source changed). +- **NEW**: `vector_visibility :owner_only | :public` class-level DSL controls + whether a class's `:vector` properties are included in `as_json` by default + (`:owner_only` omits, the safe default; `:public` includes). An explicit + `include_vectors:` in the `as_json` call always wins. 
+- **IMPROVED**: Webhook trigger payloads now strip declared `:vector` columns from + `object` / `original` / `update` / `objects` by default, mirroring the `as_json` + default. A class that opts into `vector_visibility :public` keeps its vectors in + the payload. Embeddings are large and leak ML signal; a handler has no reason to + receive them. + +#### Fix `Parse::Audience` hash-query persistence + +- **FIXED**: `Parse::Audience#query` is now stored as a JSON string on the wire, + matching Parse Server's built-in `_Audience.query` column (which is typed + `String`). The property previously serialized as an object, so every save of + an audience with a hash query was rejected by the server with a schema + mismatch (`expected String but got Object`). The public API is unchanged — + assign a `Hash` and read a `Hash` back; the value is encoded to JSON on save + and decoded on load, reading back as a `HashWithIndifferentAccess` so both + string and symbol keys resolve. + +#### `Parse::MFA` write, status, and disable fixes + +- **FIXED**: `Parse::User#setup_mfa!`, `#setup_sms_mfa!`, `#confirm_sms_mfa!`, + `#disable_mfa!`, and `#disable_mfa_master_key!` raised an `ArgumentError` + (nested `opts:`) before reaching the server. Each passed its session token or + master-key flag wrapped in an `opts:` hash, which the current `Parse::Client` + request layer rejects; the credentials are now passed as direct keyword + arguments so MFA enrollment and disable calls work. +- **FIXED**: `Parse::User#mfa_enabled?` and `#mfa_status` now report correctly + after an ordinary fetch. The SDK strips `authData` from fetched users to avoid + leaking the TOTP secret and recovery codes that Parse Server returns there; + the strip now preserves a non-sensitive `{ "status" => "enabled" }` projection + (and nothing else — the secret and recovery codes are still removed), so the + status methods read true instead of always reporting "not enabled". 
+- **FIXED**: `Parse::User#disable_mfa!` (self-service disable) now works. Parse + Server's TOTP adapter has no first-class self-disable, so the SDK first proves + possession of the current code, then unlinks the MFA provider. A wrong code is + rejected with `Parse::MFA::VerificationError` and leaves MFA enabled. The + current-code step is now classified positively — a rejected code raises + `VerificationError`, while any other failure (transport, session, server error) + surfaces as a `Parse::Client::ResponseError` instead of being mislabeled a + verification failure. The disable is confirmed authoritatively from the + server's own view (a disabled account's own session-token read returns no + `authData.mfa`) rather than from the in-memory projection, and the local + `mfa_enabled?` / `mfa_status` state is cleared to reflect the disable so a + subsequent read on the same object does not report a stale `enabled`. +- **FIXED**: `Parse::User#disable_mfa_master_key!` now clears the in-memory MFA + status after disabling, so `mfa_enabled?` / `mfa_status` report the truth on + the same object without requiring a fresh load. +- **BREAKING**: `Parse::User#disable_mfa_master_key!` now fails closed. Because it + bypasses MFA verification entirely via the master key, it refuses to run without + an authorization signal: pass `admin_role:` for the library to verify the + operator's role membership, or `allow_unverified: true` to explicitly assert that + the caller has already authorized the operator out-of-band. Callers that + previously passed only `authorized_by:` now raise `Parse::MFA::ForbiddenError`; + add `admin_role:` or `allow_unverified: true` to migrate. `authorized_by:` + remains required and is still validated first. +- **FIXED**: `Parse::User#mfa_enabled?` / `#mfa_status` no longer report `enabled` + for a user whose `authData.mfa` carries an explicit non-`enabled` status with a + stale residual secret or recovery code; an explicit status is now authoritative. 
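
The `authData` handling described in the MFA fixes above can be sketched in plain Ruby. This is a minimal illustration under stated assumptions, not the SDK's code: the helper names `sanitize_mfa_auth_data` and `mfa_enabled_from?` are hypothetical, the `secret` and `recovery` key names are assumed for the example, and treating a bare secret as "enabled" when no status is present is likewise an assumption. It shows the two behaviors the entries name: only a `status` projection survives the strip, and an explicit status wins over a stale residual secret.

```ruby
# Hypothetical helper (not part of parse-stack-next): reduce authData to a
# non-sensitive projection that carries only the MFA status.
def sanitize_mfa_auth_data(auth_data)
  mfa = auth_data.is_a?(Hash) ? auth_data["mfa"] : nil
  return {} unless mfa.is_a?(Hash)

  status = mfa["status"]
  # Assumption: with no explicit status, a stored secret implies enrollment.
  # An explicit status is never overridden by a residual secret.
  status ||= "enabled" if mfa.key?("secret")
  status ? { "mfa" => { "status" => status } } : {}
end

# Hypothetical predicate reading the sanitized projection only.
def mfa_enabled_from?(sanitized)
  sanitized.dig("mfa", "status") == "enabled"
end

# Sample payloads; the key names "secret" and "recovery" are illustrative.
raw = {
  "mfa" => { "status" => "enabled", "secret" => "JBSWY3DPEHPK3PXP", "recovery" => ["a1", "b2"] },
}
stale = { "mfa" => { "status" => "disabled", "secret" => "JBSWY3DPEHPK3PXP" } }

enrolled = sanitize_mfa_auth_data(raw)    # secret and recovery codes dropped
residual = sanitize_mfa_auth_data(stale)  # explicit "disabled" is kept as-is
```

In this sketch `mfa_enabled_from?(enrolled)` is true while `mfa_enabled_from?(residual)` is false, and neither projection retains the secret.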
+ +#### Interactive console MFA login + +- **NEW**: `rake client:console` now logs in MFA-enrolled accounts. When the + server reports that an additional MFA factor is required, the console prompts + for a TOTP / recovery code (or reads `PARSE_LOGIN_MFA` for non-interactive + use) and completes the login via `Parse::User.login_with_mfa`. A + password-only login of a non-enrolled account is unaffected. + +#### Request email-address verification + +- **NEW**: `Parse::User.request_email_verification(email)` and the instance + `Parse::User#request_email_verification` ask Parse Server to (re)send the + address-verification email for a registered, not-yet-verified user (the + `POST /verificationEmailRequest` endpoint). The server must have an email + adapter and `verifyUserEmails` enabled. Mirrors `request_password_reset`: + rate-limited per email, returns a Boolean, and raises + `Parse::Error::ServiceUnavailableError` on a misconfigured server. + +#### Faster AtlasSearch role-cache expiry + +- **CHANGED**: `Parse::AtlasSearch` `role_cache_ttl` now defaults to 30 seconds + (was 120). The shorter TTL expires cached user-to-role mappings sooner, so a + role grant or revoke is reflected in `$search` ACL decisions faster, at the + cost of slightly more frequent role lookups. Override via + `Parse::AtlasSearch.configure(role_cache_ttl:)`. + +#### MCP Streamable HTTP transport switch + +- **NEW**: `Parse::Agent::MCPRackApp.new(transport: :streamable_http)` (and the + `Parse::Agent.rack_app(transport: :streamable_http)` convenience) enables the + full MCP 2025-06-18 Streamable HTTP transport with one switch — POST→SSE + streaming plus the server→client `GET /` notification stream — instead of + setting `streaming: true, notifications: true` separately. Streamable HTTP is + now documented as the primary transport for embedded Rack deployments. 
+ + ```ruby + mcp_app = Parse::Agent.rack_app(transport: :streamable_http) do |env| + # auth factory returning a Parse::Agent + end + ``` + + `transport:` is a closed enum (`:streamable_http`, `:legacy`, or `nil`). + `resource_subscriptions: true` may be combined with `:streamable_http` to + upgrade the server→client bus to its LiveQuery-backed resource-subscription + posture. +- **CHANGED**: Passing `transport: :streamable_http` together with an explicit + `streaming:` or `notifications:` raises `ArgumentError` (the switch already + owns those toggles); any `transport:` value outside the closed enum also + raises. +- **NOTE**: The default transport is unchanged — an existing + `Parse::Agent.rack_app { ... }` keeps its non-streaming buffered-JSON + behavior until it opts in. The switch requires a streaming-capable Rack server + (Puma, Falcon, Unicorn) and has no effect under the WEBrick-backed + `MCPServer`, which cannot stream. +- **CHANGED**: `Parse::Agent::MCPRackApp` `max_concurrent_dispatchers:` now + defaults to a finite **100** (`DEFAULT_MAX_CONCURRENT_DISPATCHERS`) instead of + `nil` (unlimited). Enabling a streaming surface is now bounded out of the box: + once the cap is reached, a new SSE request or `GET /` listening stream is + refused with a `503` JSON-RPC `-32000` ("server busy") rather than spawning an + unbounded number of orphan-prone threads. The cap applies separately to + request-scoped SSE and listening streams (effective ceiling up to 2x). Pass an + explicit positive integer to resize it, or `max_concurrent_dispatchers: nil` + to knowingly run uncapped (which logs a one-time construction warning). A + non-positive or non-integer value now raises `ArgumentError`. +- **NEW**: Observability for SSE dispatchers abandoned by a client disconnect. 
+ `Parse::Agent::MCPRackApp.abandoned_dispatcher_count` is a process-wide + cumulative counter, and each abandonment emits a + `parse.agent.mcp_dispatcher_abandoned` `ActiveSupport::Notifications` event + (`reason:`, `dispatcher_alive:`, `request_id:`) so operators can detect + disconnect-against-slow-tool pressure. On disconnect the dispatcher's + cancellation token is tripped (cooperative exit) and its lifetime is bounded + by the per-tool `Timeout` plus the clean MongoDB/REST I/O deadlines; the + orphan is intentionally NOT force-killed, because a `Thread#kill` would skip + the database driver's connection-invalidation and risk returning a half-used + pooled connection to a later request. +- **CHANGED**: Custom tools registered via `Parse::Agent::Tools.register` now + have their declared `timeout:` (default 30s) actually enforced — + `Tools.invoke` wraps the handler in `Timeout.timeout`, raising + `Parse::Agent::ToolTimeoutError` when it is exceeded (previously the stored + timeout was not applied to the custom-handler path, so a blocking or looping + handler ran unbounded and could hold an MCP streaming dispatcher slot after a + client disconnect). Built-in tools are unaffected (they already self-applied + their timeout). **Migration:** a custom tool that legitimately runs longer + than 30s must now declare an explicit `timeout:` (e.g. + `register(..., timeout: 120)`); a tool that exceeds its budget will otherwise + raise `ToolTimeoutError`. `register` now also rejects a non-positive + `timeout:` with `ArgumentError` (a `0` would make `Timeout.timeout` a no-op + and silently disable the bound). + ### 5.3.0 #### Run webhook handlers as the calling user diff --git a/Gemfile b/Gemfile index e871f98..54d440a 100644 --- a/Gemfile +++ b/Gemfile @@ -32,5 +32,12 @@ group :test, :development do gem "puma" gem "sinatra" gem "rack-test" + # MFA / TOTP test infrastructure (Parse::MFA, two_factor_auth). 
+ # rotp: generates TOTP secrets and time-based codes so the MFA unit and + # integration tests can enroll and log in against Parse Server's + # TOTP adapter (SHA1 / 6 digits / 30s — rotp's defaults match). + # rqrcode: renders the provisioning QR code exercised by Parse::MFA.qr_code. + gem "rotp" + gem "rqrcode" # gem "thin" # for yard server - disabled due to eventmachine compilation issues end diff --git a/Gemfile.lock b/Gemfile.lock index 2d660e5..d1db41e 100644 --- a/Gemfile.lock +++ b/Gemfile.lock @@ -1,7 +1,7 @@ PATH remote: . specs: - parse-stack-next (5.3.0) + parse-stack-next (5.4.0) activemodel (>= 6.1, < 9) activesupport (>= 6.1, < 9) connection_pool (>= 2.2, < 4) @@ -39,6 +39,7 @@ GEM bundler-audit (0.9.3) bundler (>= 1.2.0) thor (~> 1.0) + chunky_png (1.4.0) coderay (1.1.3) concurrent-ruby (1.3.6) connection_pool (3.0.2) @@ -54,7 +55,7 @@ GEM faraday-net_http (>= 2.0, < 3.5) json logger - faraday-net_http (3.4.3) + faraday-net_http (3.4.4) net-http (~> 0.5) faraday-net_http_persistent (2.3.1) faraday (~> 2.5) @@ -72,7 +73,7 @@ GEM prism (>= 1.3.0) rdoc (>= 4.0.0) reline (>= 0.4.2) - json (2.19.5) + json (2.19.8) logger (1.7.0) method_source (1.1.0) minitest (6.0.6) @@ -104,7 +105,7 @@ GEM coderay (~> 1.1) method_source (~> 1.0) reline (>= 0.6.0) - psych (5.3.1) + psych (5.4.0) date stringio puma (8.0.2) @@ -133,6 +134,11 @@ GEM connection_pool reline (0.6.3) io-console (~> 0.5) + rotp (6.3.0) + rqrcode (3.2.0) + chunky_png (~> 1.0) + rqrcode_core (~> 2.0) + rqrcode_core (2.1.0) ruby-progressbar (1.13.0) rufo (0.18.2) securerandom (0.4.1) @@ -181,6 +187,8 @@ DEPENDENCIES rake redcarpet redis + rotp + rqrcode rufo sinatra webrick diff --git a/README.md b/README.md index 6a8eec9..e6f9993 100644 --- a/README.md +++ b/README.md @@ -4,6 +4,15 @@ A full-featured Ruby client SDK for [Parse Server](http://parseplatform.org/). 
[parse-stack-next](https://github.com/neurosynq/parse-stack-next) is a Ruby client SDK, REST client, and Active Model ORM for [Parse Server](http://parseplatform.org/), combining a low-level API client, a query engine, an object-relational mapper (ORM), and a Cloud Code Webhooks rack application in a single gem. +### What's new in 5.4 + +- **5.4.0 — Hybrid search + reranking for RAG** — `Class.hybrid_search(text:, lexical:, vector:, k:, fusion:)` fuses a lexical Atlas Search branch with a `$vectorSearch` branch using reciprocal-rank fusion (RRF): lexical search nails exact tokens (codes, proper nouns), vector search nails paraphrase, and fusing the two beats either alone. Each branch enforces ACL/CLP independently before fusion (no separate hydration fetch to secure); results carry `#hybrid_score` / `#hybrid_ranks`. `Parse::VectorSearch::Hybrid.rank_fusion_supported?` detects Atlas 8.0+ native `$rankFusion` by a cached behavioural probe (native execution is opt-in; client-side RRF is the always-enforced default). `Parse::Retrieval::Reranker` adds cross-encoder reranking (`Reranker::Cohere` over `/v2/rerank`, plus a deterministic `Reranker::Fixture`), wired into `Parse::Retrieval.retrieve(hybrid:, rerank:)`. `Parse::Embeddings::SpendCap` adds an opt-in per-tenant embedding token cap (hard-refuse) at the `semantic_search` agent-tool boundary. See [CHANGELOG.md](./CHANGELOG.md) and [`docs/atlas_vector_search_guide.md`](./docs/atlas_vector_search_guide.md) +- **5.4.0 — Vector backfill, visibility, and webhook redaction** — `Class.embed_pending!` backfills embeddings for records whose managed `:vector` field is null (objectId-cursor pagination); `Parse::Object#compute_embedding!` forces an in-place recompute without a save; `vector_visibility :owner_only | :public` controls whether a class's vectors appear in `as_json` by default; and webhook trigger payloads now strip declared `:vector` columns by default (a `:public` class keeps them). 
See [CHANGELOG.md](./CHANGELOG.md) +- **5.4.0 — TOTP multi-factor auth works end to end** — the `Parse::User` MFA lifecycle is now fully functional and exercised against a real MFA-enabled Parse Server. `setup_mfa!(secret:, token:)` enrolls TOTP and returns recovery codes; `Parse::User.login_with_mfa(user, pass, code)` completes a second-factor login; `mfa_enabled?` / `mfa_status` report enrollment after an ordinary fetch — the SDK strips the raw TOTP secret and recovery codes that Parse Server returns in `authData` but preserves a leak-safe `{status: "enabled"}` projection so the status reads correctly without exposing the secret; `disable_mfa!(current_token:)` turns MFA off after re-validating the current code (a wrong code raises `Parse::MFA::VerificationError`), and `disable_mfa_master_key!(authorized_by:)` is the operator override. Each MFA write also no longer raises an internal argument error before reaching the server. Interactively, `rake client:console` now prompts for a TOTP / recovery code (or reads `PARSE_LOGIN_MFA`) when logging into an enrolled account. See [CHANGELOG.md](./CHANGELOG.md) +- **5.4.0 — Request email-address verification** — `Parse::User.request_email_verification(email)` and the instance `Parse::User#request_email_verification` ask Parse Server to (re)send the verification email for a registered, not-yet-verified user, mirroring `request_password_reset` (per-email rate limiting, Boolean return). Requires a server email adapter with `verifyUserEmails` enabled. See [CHANGELOG.md](./CHANGELOG.md) +- **5.4.0 — Audience hash queries persist correctly** — `Parse::Audience#query` is now stored as a JSON string on the wire to match Parse Server's `_Audience.query` column type, so saving an audience with a `Hash` query no longer fails the server schema check. The public API is unchanged — assign a `Hash`, read a `Hash` back. 
See [CHANGELOG.md](./CHANGELOG.md) +- **5.4.0 — Faster AtlasSearch role-cache expiry** — `Parse::AtlasSearch` `role_cache_ttl` now defaults to 30 seconds (was 120) so a role grant or revoke is reflected in `$search` ACL decisions sooner, at the cost of slightly more frequent role lookups. See [CHANGELOG.md](./CHANGELOG.md) + ### What's new in 5.3 - **5.3.0 — Run webhook handlers (and clients) as the calling user** — Parse Server embeds the caller's live session token in every trigger webhook fired by a logged-in user. A handler can now opt in to acting on the server *as that user* — full ACL/CLP/`protectedFields` enforcement, no master key. `payload.session_token` exposes the captured token (`nil` for master-key requests; still scrubbed from `payload.user`/`payload.object`/`as_json`/logs); `payload.user_agent` returns a client-mode `Parse::Agent`, and `payload.user_client` a non-master `Parse::Client` with the token **bound** so even raw REST calls authorize as the user. The same user-scoped client is available client-side via `Parse::User#session_client` and the `Parse::Client#become(token)` primitive, with `Parse::Client#with_session { … }` for block scoping. Backed by a new `Parse::Client.new(session_token:)` option. See [Acting as the calling user](#acting-as-the-calling-user) @@ -209,9 +218,20 @@ result = Parse.call_function :myFunctionName, {param: value} ``` +## Examples + +Runnable, self-contained scripts live in [`examples/`](examples/) — see +[`examples/README.md`](examples/README.md) for the full index. Highlights: + +- [`basic_server.rb`](examples/basic_server.rb) — master-key setup: models, schema, CRUD + queries. +- [`basic_client.rb`](examples/basic_client.rb) — unprivileged client with row-level ACL enforcement. +- [`live_query_listener.rb`](examples/live_query_listener.rb) — interactive LiveQuery console scoped to a user's session. 
+- [`rag_chatbot.rb`](examples/rag_chatbot.rb) — retrieval-augmented generation with `semantic_search` + an OpenAI/Anthropic add-in. +- [`transaction_example.rb`](examples/transaction_example.rb) — atomic multi-object transactions. + ## Release History -**Current version: 5.0.1** | **Ruby 3.2+ required** +**Current version: 5.4.0** | **Ruby 3.2+ required** The 5.0 highlights (vector search / RAG, pooled Redis cache, AS::N instrumentation, MCP transport hardening, GraphQL type generation) are summarized in the [What's new in 5.0](#whats-new-in-50) section above. Earlier releases are recorded below. @@ -1586,8 +1606,11 @@ user.mfa_status # => :enabled, :disabled, or :unknown # Disable MFA (requires current token) user.disable_mfa!(current_token: "123456") -# Admin reset (master key) — authorized_by must be a Parse::User -user.disable_mfa_master_key!(authorized_by: admin_user) +# Admin reset (master key) — fails closed: pass either an admin_role: +# for the library to verify, or allow_unverified: true to assert that +# you have already authorized the operator out-of-band. +user.disable_mfa_master_key!(authorized_by: admin_user, admin_role: "Admin") +# or: user.disable_mfa_master_key!(authorized_by: admin_user, allow_unverified: true) ``` **SMS MFA (requires Parse Server SMS callback):** @@ -4917,6 +4940,32 @@ The `parse_object` handed to your handler is the **full object as Parse Server s For any `after_*` hook, return values are not needed since Parse does not utilize them. You may also register as many `after_save` or `after_delete` handlers as you prefer, all of them will be called. +For `before_save` (and functions), the handler's value **is** the response Parse Server acts on — return the (possibly mutated) `parse_object` to allow the write, or `false` / `error!` to reject it. 
You can set that value with an explicit `return` or as the block's last expression; both work, as do the proc idioms `next value` / `break value`: + +```ruby +Parse::Webhooks.route :before_save, :Artist do + artist = parse_object + return artist if artist.name.present? # explicit early return + error! "name is required" # raise to reject the save +end +``` + +`self` inside the block is the `Parse::Webhooks::Payload`, so `parse_object`, `params`, and `error!` are available directly. As anywhere in Ruby, `return` ends the handler immediately — to run work *after* the response is sent, use `after_response` (below) rather than code after the `return`. + +#### Deferring work until after the response + +`payload.after_response { … }` (alias `defer`) registers work to run **after** the webhook response has been sent, off the critical path of the save or function the client is waiting on. The handler returns its value synchronously; the deferred block runs afterward — ideal for search indexing, cache warming, or fan-out that should not add latency. + +```ruby +Parse::Webhooks.route :after_save, :Post do + post = parse_object + after_response { SearchIndex.reindex(post.id) } # runs after the reply is sent + post +end +``` + +Under Puma or Unicorn the block runs via `rack.after_reply` once the response is flushed (same worker thread, zero added round-trip latency); on a server without it (e.g. WEBrick) it falls back to a detached thread. Multiple blocks run in order and are isolated — one raising affects neither the response nor the others. Notes: deferred blocks run **only on the success path** (a rejected `before_save` runs none), "after the response" is **not** "after the row commits" (don't rely on the persisted row inside the block), and the work is **in-process and best-effort** — it dies with the worker, so for anything that *must* happen use a durable job queue (Sidekiq / ActiveJob). 
Blocks are drained only when the payload runs through the mounted `Parse::Webhooks` Rack app (a no-op under direct `run_function` / `call_route`). See the [Cloud Code Webhooks Guide](docs/webhooks_guide.md#deferring-work-until-after-the-response). + > **Your model's `after_save` callbacks run here too.** When an `after_save` / > `after_create` trigger fires, the webhook rebuilds the `Parse::Object` from the > payload and runs that model's ActiveModel `after_save` / `after_create` @@ -4928,6 +4977,57 @@ For any `after_*` hook, return values are not needed since Parse does not utiliz > for saves from other clients (JS / iOS / REST), the webhook runs them, since > the SDK never had the chance. +#### ActiveModel callbacks vs. Parse Server triggers + +The SDK exposes the full ActiveModel lifecycle on every model +(`before_validation`, `before_save`/`after_save`, `before_create`/`after_create`, +`before_update`/`after_update`, `before_destroy`/`after_destroy`). Parse Server, +separately, exposes a fixed set of **webhook trigger types**. They are not +one-to-one — the SDK maps between them, and a webhook must be **registered** for +your ActiveModel logic to run server-side for non-Ruby clients (JS / iOS / REST / +Dashboard). Without a registered webhook, that logic runs only in the Ruby +process that initiated the save. + +Supported Parse Server trigger types: `beforeSave`/`afterSave`, +`beforeDelete`/`afterDelete`, `beforeFind`/`afterFind`, `beforeLogin`/`afterLogin`, +`afterLogout`, `beforePasswordResetRequest`, `beforeConnect`, +`beforeSubscribe`/`afterEvent`, and file triggers on the `@File` pseudo-class. 
+ +The **authentication** triggers (`beforeLogin`/`afterLogin`/`afterLogout`/ +`beforePasswordResetRequest`) and **LiveQuery** triggers (`beforeConnect`/ +`beforeSubscribe`/`afterEvent`) route as first-class shapes — predicates +(`before_login?` … `after_event?`, `auth_trigger?`/`live_query_trigger?`), an +`event` accessor, and top-level `sessionToken` capture into `payload.session_token`. +None of them run ActiveModel `save`/`create`/`destroy` callbacks, even though the +auth triggers carry a `_User`/`_Session`. Parse Server **ignores the response body** +for all of them, so the only signal that affects the operation is rejection, and +only on the `before*` variants: returning `false` (or calling `error!`) from a +`before_login`/`before_connect`/`before_subscribe`/`before_password_reset_request` +handler denies the operation, while anything else is a success no-op. (LiveQuery +triggers are delivered over HTTP only in a co-located single-process setup; +`beforeConnect` is effectively in-process-only.) + +Key relationship — **`beforeSave`/`afterSave` carry the create variants**. Parse +Server has **no `beforeCreate`/`afterCreate` trigger** (it rejects them). The SDK +runs your `before_create`/`after_create` callbacks *inside* the +`beforeSave`/`afterSave` handler for new objects, in ActiveModel order +(`before_save → before_create`, `after_create → after_save`). So **registering a +`beforeSave` webhook enables both `before_save` and `before_create`**, and +`afterSave` enables both `after_save` and `after_create`. Requesting a create +webhook raises with guidance pointing you at the save trigger. + +> **`after_save` is synchronous and on the critical path.** Parse Server waits +> for the webhook to return before completing the client's write — even on +> `afterSave`, whose return value is a no-op. 
Treat `after_save` as a place to +> **enqueue** background work, not to run long logic inline, and avoid saving +> other objects inside it (each cascading save fires more webhooks). `beforeSave` +> can mutate or reject the write, so it is necessarily inline — keep it lean. + +For the full picture — trigger types, registration, the synchronous-latency +model, the Ruby-initiated dedup, and inbound replay/freshness protection — see +the [Cloud Code Webhooks Guide](docs/webhooks_guide.md) and +[`examples/webhook_server.rb`](examples/webhook_server.rb). + #### Trigger object state Because the trigger payload is server-authoritative, the `parse_object` your @@ -5764,6 +5864,13 @@ The integration tests use Docker Compose to spin up a Parse Server instance with - Docker and Docker Compose installed - Ruby environment with bundler +> **Always run the suite with `bundle exec`.** Newer `minitest` (6.0+) moved +> `minitest/mock` out into a separate gem, so a bare `ruby`/`rake` invocation +> activates minitest 6 and then fails to load `minitest/mock`, aborting every +> test at load time with `cannot load such file -- minitest/mock (LoadError)`. +> Running through bundler pins the locked versions and avoids this. If you hit +> that LoadError, prefix the command with `bundle exec`. + #### Setup and Running Tests 1. **Enable Docker Tests**: Set the environment variable to enable Docker-based tests: @@ -5848,6 +5955,56 @@ docker compose -f scripts/docker/docker-compose.test.yml up -d curl -s http://localhost:29337/parse/health # -> {"status":"ok"} ``` +#### Network Exposure and the Preflight Guard + +Every service binds to loopback (`127.0.0.1`) by default, and the default +credentials above are committed to this repository — safe in combination, +since nothing off the host can reach them. Each bind is overridable +(`PARSE_BIND`, `MONGO_BIND`, `REDIS_BIND`, `DASHBOARD_BIND`) for the +occasional need to attach a remote client while debugging. 
+ +That override is a footgun: pointing a bind at `0.0.0.0` while the default +credentials are still in force would publish an admin-credentialed stack +(Mongo `admin:password`, master key `psnextItMasterKey`) onto your LAN. A +`preflight` service runs before anything else and **refuses to start the +stack** in exactly that case. To proceed, do one of: + +```bash +# 1. Keep it loopback (the default) — just omit the *_BIND override. + +# 2. Supply real credentials instead of the committed test defaults. +PARSE_MASTER_KEY="$(openssl rand -hex 24)" \ +MONGO_ROOT_PASSWORD="$(openssl rand -hex 24)" \ +MONGO_BIND=0.0.0.0 \ + docker compose -f scripts/docker/docker-compose.test.yml up -d + +# 3. Acknowledge the exposure on a trusted, isolated network. +ALLOW_INSECURE_BIND=1 MONGO_BIND=0.0.0.0 \ + docker compose -f scripts/docker/docker-compose.test.yml up -d +``` + +#### Secret Injection (real credentials) + +The committed defaults are deliberately non-secret, so the loopback stack +needs no secrets manager. If you point the stack at *real or shared* +credentials (option 2 above, or a staging Mongo), keep them out of your +shell history and the compose file by injecting them at launch. The stack +reads plain environment variables, so any injector works: + +```bash +# 1Password CLI — secrets resolved from an op:// .env reference file. +op run --env-file=.env.secrets -- \ + docker compose -f scripts/docker/docker-compose.test.yml up -d + +# Doppler — secrets pulled from a configured project/config. +doppler run -- \ + docker compose -f scripts/docker/docker-compose.test.yml up -d +``` + +Use the committed `.env.sample` as the reference for which variables each +side expects; copy it to a gitignored `.env` (or an `op://`-referenced +`.env.secrets`) and fill in real values there. 
+ #### Environment Variables The defaults above are baked into the Compose file and the test helpers, so the diff --git a/Rakefile b/Rakefile index 286294a..f5e63c5 100644 --- a/Rakefile +++ b/Rakefile @@ -77,12 +77,57 @@ def client_console_token! pwd = $stdin.gets.to_s end end - u = Parse::User.login(user, pwd.chomp) + u = console_login_with_optional_mfa(user, pwd.chomp) abort "[client:console] login failed for #{user.inspect}" if u.nil? || u.session_token.to_s.empty? puts "Logged in as #{u.username} (#{u.id})." u.session_token end +# Log `user` in, transparently handling an MFA-enrolled account. If the server +# reports that additional MFA auth is required, prompt for a TOTP / recovery +# code (or read +PARSE_LOGIN_MFA+ for non-interactive use) and retry via +# {Parse::User.login_with_mfa}. Returns a logged-in {Parse::User}, or nil when +# the credentials themselves are rejected (so the caller's "login failed" abort +# still fires for a bad password). +def console_login_with_optional_mfa(user, pwd) + # Parse Server signals "this account needs an MFA token" two ways depending on + # the error code path: a returned error response ("Missing additional + # authData ...") or a raised Parse::Error for the OTHER_CAUSE (code <= 100) + # variant. Treat both as "prompt for MFA"; anything else is a real credential + # failure and must NOT trigger an MFA prompt. + mfa_indicator = /additional\s+authData|missing.*mfa|\bMFA\b/i + begin + response = Parse.client.login(user, pwd) + if response.success? + return Parse::User.with_authdata_trust { Parse::User.build(response.result) } + end + return nil unless response.error.to_s.match?(mfa_indicator) + rescue Parse::Error, Parse::Client::ResponseError => e + raise unless e.message.to_s.match?(mfa_indicator) + end + + token = ENV["PARSE_LOGIN_MFA"].to_s.strip + if token.empty? 
+ print "MFA token (authenticator code or recovery code): " + token = $stdin.gets.to_s.strip + end + abort "[client:console] MFA token required for #{user.inspect}" if token.empty? + + # A wrong/expired token can surface either as Parse::MFA::VerificationError or, + # depending on the server error code path, as a generic Parse::Error (e.g. + # ServiceUnavailableError for the OTHER_CAUSE code) or a nil return. Since a + # token was supplied here, treat any failure as an MFA verification failure + # and abort cleanly rather than letting an unhandled exception escape. + result = + begin + Parse::User.login_with_mfa(user, pwd, token) + rescue Parse::MFA::VerificationError, Parse::Error => e + abort "[client:console] MFA verification failed for #{user.inspect}: #{e.message}" + end + abort "[client:console] MFA verification failed for #{user.inspect}" if result.nil? + result +end + # Default test task runs all tests with Docker enabled. # # `*disruptive*` tests are EXCLUDED here: they stop/restart the shared @@ -131,7 +176,11 @@ def run_test_files!(label, files, log:) puts "[#{n}/#{total}] #{file}" puts "=" * 80 t0 = Time.now - ok = system("PARSE_TEST_USE_DOCKER=true ruby -Ilib:test #{file}") + # Always go through `bundle exec` so the locked gem versions win. With a + # bare `ruby`, RubyGems activates the newest installed minitest (6.0.x), + # which dropped the bundled `minitest/mock`; the standalone `minitest-mock` + # gem then can't co-activate and `test_helper.rb` fails to load every file. + ok = system("PARSE_TEST_USE_DOCKER=true bundle exec ruby -Ilib:test #{file}") dt = Time.now - t0 results << [file, ok, dt] summary = format("[%d/%d] %-4s %7.1fs %s", n, total, ok ? "PASS" : "FAIL", dt, file) @@ -203,7 +252,7 @@ namespace :test do puts "=" * 80 # Each file runs in its own process so a server outage in one cannot # bleed into the next. 
- system("PARSE_TEST_USE_DOCKER=true ruby -Ilib:test #{file}") || begin + system("PARSE_TEST_USE_DOCKER=true bundle exec ruby -Ilib:test #{file}") || begin # A disruptive test may have left the server down on failure; bring # it back so a follow-up run / other tasks start from a clean state. system("docker start #{ENV["PSNEXT_PREFIX"] || "psnext-it"}-server", out: IO::NULL, err: IO::NULL) diff --git a/docs/atlas_vector_search_guide.md b/docs/atlas_vector_search_guide.md index 8ce7f43..415bc3a 100644 --- a/docs/atlas_vector_search_guide.md +++ b/docs/atlas_vector_search_guide.md @@ -372,6 +372,10 @@ embed-time chunking), use one of these patterns: ## Retrieval (RAG) +> For an end-to-end runnable script — managed `embed`, `agent_searchable`, +> `semantic_search`, and an OpenAI/Anthropic generation add-in — see +> [`examples/rag_chatbot.rb`](../examples/rag_chatbot.rb). + `Parse::Retrieval` (`Parse::RAG` is an alias) sits on top of `find_similar`. `Parse::Retrieval.retrieve` embeds a natural-language query, runs Atlas `$vectorSearch` through `find_similar` (so ACL/CLP are @@ -395,8 +399,88 @@ chunks = Parse::Retrieval.retrieve( # => Array — { id, score, content, source, metadata } ``` -`rerank:` and `hybrid:` are reserved on the signature and raise -`NotImplementedError` if supplied. +`retrieve` also accepts `hybrid:` (fuse a lexical branch with the vector +branch — see [Hybrid search](#hybrid-search-vector--lexical) below) and +`rerank:` (reorder retrieved documents with a cross-encoder before +chunking — see [Reranking](#reranking)). Both were reserved in earlier +releases and now ship in 5.4.0. + +### Hybrid search (vector + lexical) + +`Class.hybrid_search` runs a lexical Atlas Search (`$search`) branch and a +`$vectorSearch` branch as **two independent aggregations**, then fuses +their ranked results with reciprocal-rank fusion (RRF). 
Two aggregations +(not a single `$facet`) is mandatory: `$vectorSearch` is prohibited inside +`$facet` / `$lookup` / `$unionWith` and must be stage 0 of its pipeline. +Each branch enforces ACL/CLP/`protectedFields` independently before +fusion (via `Parse::AtlasSearch.search` and `Parse::VectorSearch.search`), +so the fused rows are already access-filtered — there is no separate +hydration fetch. + +```ruby +hits = Article.hybrid_search( + text: "how do I reset my password", # embedded for the vector branch; + # also the default lexical query + lexical: { index: "article_search", fields: %w[title body] }, + vector: { index: "article_embedding_idx", num_candidates: 200 }, + k: 20, + fusion: { k_constant: 60, weights: { lexical: 0.4, vector: 0.6 } }, + session_token: user.session_token, # ACL scope, applied to BOTH branches +) +# => Array; each carries #hybrid_score, #hybrid_ranks, +# and #vector_score / #search_score when that branch contributed. +``` + +**RRF math.** `fused_score(d) = Σ_b weight_b / (k_constant + rank_b(d))`, +where `rank_b(d)` is the document's 1-based rank in branch `b`. A larger +`k_constant` (default 60) flattens the contribution curve. `weights` +defaults to 1.0 per branch. `Parse::VectorSearch::Hybrid.rrf` exposes the +pure fusion if you want to fuse pre-fetched ranked lists yourself. + +**Native `$rankFusion` (Atlas 8.0+).** +`Parse::VectorSearch::Hybrid.rank_fusion_supported?(collection)` detects +the native server-side fusion stage via a cached behavioural probe (1-hour +TTL — not version-string parsing). Native execution is **opt-in** +(`fusion: { method: :rrf_native }`) and falls back to the client-side path +when the cluster does not support it; the default `:rrf` always fuses +client-side, which is the fully-enforced, deterministic path. `$rankFusion` +is admitted to `PipelineSecurity::ALLOWED_STAGES` for the native path. 
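
The RRF formula above can be sketched in plain Ruby. This is a minimal illustration of the math only, assuming each branch is an ordered list of unique document ids (best first). `rrf_fuse` and its return shape are hypothetical names chosen for this example; they are not the signature of `Parse::VectorSearch::Hybrid.rrf`.

```ruby
# Illustrative reciprocal-rank fusion; NOT the SDK's implementation.
# `branches` maps a branch name to an ordered list of document ids (best first).
# Assumption: ids are unique within each branch.
def rrf_fuse(branches, k_constant: 60, weights: {})
  scores = Hash.new(0.0)
  ranks  = Hash.new { |hash, id| hash[id] = {} }

  branches.each do |branch, ids|
    weight = weights.fetch(branch, 1.0).to_f   # to_f avoids Integer division
    ids.each_with_index do |id, index|
      rank = index + 1                         # ranks are 1-based
      scores[id] += weight / (k_constant + rank)
      ranks[id][branch] = rank
    end
  end

  # Highest fused score first; the id breaks ties so the order is deterministic.
  scores
    .sort_by { |id, score| [-score, id.to_s] }
    .map { |id, score| { id: id, score: score, ranks: ranks[id] } }
end

fused = rrf_fuse(
  { lexical: %w[a b c], vector: %w[b d a] },
  weights: { lexical: 0.4, vector: 0.6 },
)
puts fused.map { |row| row[:id] }.inspect   # prints ["b", "a", "d", "c"]
```

Document `b` wins because it ranks first in the more heavily weighted vector branch and second in the lexical branch. `d` and `c` each appear in only one branch, so they receive a single term of the sum and fall to the bottom.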
+ +`Parse::Retrieval.retrieve(hybrid: true, ...)` routes through +`hybrid_search` and chunks the fused results; pass `hybrid: { lexical:, +vector:, fusion: }` to configure the branches. Tenant scope is folded into +**both** branches (the vector Atlas pre-filter and the lexical +post-`$search` `$match`) so neither leaks cross-tenant document existence. + +### Reranking + +A reranker reorders retrieved documents by a cross-encoder relevance score +**before** chunking. Pass any object answering +`#rerank(query:, documents:, top_n:)` — typically a +`Parse::Retrieval::Reranker::Base` subclass: + +```ruby +reranker = Parse::Retrieval::Reranker::Cohere.new( + api_key: ENV.fetch("COHERE_API_KEY"), model: "rerank-v3.5", +) +chunks = Parse::Retrieval.retrieve( + query: "reset my password", klass: Article, k: 30, + rerank: reranker, rerank_top_n: 5, # keep the 5 most relevant docs +) +# Reranked chunks' score is the cross-encoder relevance_score. +``` + +`Reranker::Fixture` is a deterministic, zero-network reranker (lexical +token overlap) for tests. The `Reranker::Base` protocol validates inputs, +bounds `top_n`, rejects out-of-range indices, and sorts descending — +adapters implement only the network call (`#rerank_scores`). + +> **Spend cap.** The `semantic_search` agent tool charges the estimated +> query-embedding tokens against the caller's tenant budget via +> `Parse::Embeddings::SpendCap` (opt-in; `configure(limit_tokens:, +> window:)`). A breach hard-refuses (surfaced to the agent as a +> rate-limited tool error). Admin agents are exempt; direct +> `find_similar` / `retrieve` callers are not metered. ### Chunkers diff --git a/docs/client_sdk_guide.md b/docs/client_sdk_guide.md index 447c462..348d876 100644 --- a/docs/client_sdk_guide.md +++ b/docs/client_sdk_guide.md @@ -11,6 +11,11 @@ go over REST, and authorization is carried by the user's `sessionToken`. Every claim below is locked in by the integration tests under `test/lib/parse/client_*_integration_test.rb`. 
+For a runnable starting point, see +[`examples/basic_client.rb`](../examples/basic_client.rb) (a no-master client +with a row-level ACL-enforcement demo) and its master-key counterpart +[`examples/basic_server.rb`](../examples/basic_server.rb). + --- ## Why a separate guide? diff --git a/docs/mcp_guide.md b/docs/mcp_guide.md index 5e55f1c..a052b67 100644 --- a/docs/mcp_guide.md +++ b/docs/mcp_guide.md @@ -7,7 +7,7 @@ The Model Context Protocol (MCP) is a standardized JSON-RPC 2.0-based interface Three deployment modes are available: - **Standalone HTTP server (`MCPServer`)** — a WEBrick process for dedicated MCP deployments. -- **Rack-mountable adapter (`MCPRackApp`)** — embeds inside an existing Sinatra or Rails application. +- **Rack-mountable adapter (`MCPRackApp`)** — embeds inside an existing Sinatra or Rails application. This is the primary deployment for the MCP 2025-06-18 Streamable HTTP transport; enable it with `transport: :streamable_http` (see [Streamable HTTP transport](#streamable-http-transport-primary)). - **Direct in-process dispatcher (`MCPDispatcher`)** — a pure function for in-process usage, custom transports, and testing. --- @@ -191,6 +191,42 @@ map("/mcp") { run mcp_app } map("/") { run ->(env) { [200, {"Content-Type" => "text/plain"}, ["ok"]] } } ``` +#### Streamable HTTP transport (primary) + +The MCP 2025-06-18 **Streamable HTTP** transport is the recommended transport for `MCPRackApp`. It is a single connection model in which the client `POST`s JSON-RPC requests (receiving either a buffered JSON reply or, with `Accept: text/event-stream`, a streamed SSE reply) and holds open a long-lived `GET` request to receive server-initiated notifications. Session termination is signalled with `DELETE` carrying the `Mcp-Session-Id`. + +Enable the whole transport with one switch: + +```ruby +mcp_app = Parse::Agent.rack_app(transport: :streamable_http) do |env| + # ... auth factory ... 
+end +``` + +`transport: :streamable_http` is exactly equivalent to `streaming: true, notifications: true` — it turns on POST→SSE streaming and the server→client `GET /` notification stream together. Add `resource_subscriptions: true` alongside it to upgrade the server→client bus from the plain notification posture to the LiveQuery-backed resource-subscription posture: + +```ruby +mcp_app = Parse::Agent.rack_app( + transport: :streamable_http, + resource_subscriptions: true, # optional: bridge LiveQuery resource updates +) do |env| + # ... +end +``` + +`transport:` is a closed enum: + +| Value | Effect | +|-------|--------| +| `:streamable_http` | Full Streamable HTTP transport (`streaming: true` + `notifications: true`). | +| `:legacy` / `nil` (default) | Historical behavior: buffered JSON responses, no server→client stream. The standalone SSE/JSON path below remains a supported fallback. | + +Passing `transport: :streamable_http` together with an explicit `streaming:` or `notifications:` raises `ArgumentError` (the switch already owns those toggles); any value other than the two above also raises. The default is unchanged, so an existing `Parse::Agent.rack_app { ... }` keeps its non-streaming JSON behavior until you opt in. + +**WEBrick cannot deliver Streamable HTTP.** The switch — like `streaming:` — has no effect under the WEBrick-backed standalone `MCPServer`, which buffers responses and cannot hold the `GET` stream open. Use Puma, Falcon, or Unicorn for a real Streamable HTTP deployment. + +The remaining subsections document the individual toggles `transport: :streamable_http` consolidates, for operators who need finer control or are reading older configurations. + #### MCP progress notifications via SSE (opt-in) **WEBrick cannot stream.** The standalone `MCPServer` is WEBrick-based and buffers the full response before sending. 
Setting `streaming: true` on an `MCPRackApp` mounted under WEBrick silently degrades to a single buffered response with concatenated SSE events. SSE streaming requires a Rack server that supports streaming response bodies — **Puma, Falcon, or Unicorn**. Verify your deployment uses one of these before relying on `streaming: true`. @@ -537,10 +573,29 @@ Parse Server version and its `masterKeyIps` configuration.) soft cap *equal to* `max_concurrent_dispatchers`. So the effective steady-state ceiling across both surfaces is up to **2× `max_concurrent_dispatchers`** (up to N request-scoped SSE dispatchers plus N listening streams). Size the value - with that 2× factor in mind (e.g. relative to your Puma `max_threads`). Leaving - it unset (the default `nil`) leaves both surfaces uncapped; the app logs a + with that 2× factor in mind (e.g. relative to your Puma `max_threads`). + `max_concurrent_dispatchers:` defaults to a finite **100** + (`Parse::Agent::MCPRackApp::DEFAULT_MAX_CONCURRENT_DISPATCHERS`), so a + streaming surface is bounded out of the box — once the cap is reached a new + SSE request or listening stream is refused with a `503` JSON-RPC `-32000` + ("server busy"). Pass an explicit positive integer to resize it, or + `max_concurrent_dispatchers: nil` to knowingly run uncapped (the app logs a one-time warning at construction when a streaming or subscription/notification - surface is enabled without a cap. + surface is enabled with `nil`). A non-positive or non-integer value raises + `ArgumentError`. +- **Client disconnect mid-tool-call.** When a client drops the connection while + a tool is still running, the SSE worker is torn down and the dispatcher's + cancellation token is tripped, so a cooperative tool (one that checks + `agent.cancelled?` at a checkpoint) exits promptly. 
A tool blocked inside a + Mongo/REST roundtrip cannot observe the token, but its slot is reclaimed when + the per-tool `Timeout` or the clean MongoDB `socket_timeout` (10s) / REST + `timeout` (30s) deadline fires — through the driver's clean error path. The + orphaned dispatcher is **intentionally not force-killed**: a `Thread#kill` + would bypass the driver's connection-invalidation and could return a half-used + pooled connection to a later request. To observe how often disconnects abandon + in-flight work, watch the cumulative + `Parse::Agent::MCPRackApp.abandoned_dispatcher_count` or subscribe to the + `parse.agent.mcp_dispatcher_abandoned` `ActiveSupport::Notifications` event. ### Listening-stream ownership diff --git a/docs/mongodb_direct_guide.md b/docs/mongodb_direct_guide.md index e5160d1..d1f48d2 100644 --- a/docs/mongodb_direct_guide.md +++ b/docs/mongodb_direct_guide.md @@ -173,6 +173,58 @@ set the same kwargs on the query for chainable composition. Related: `first_direct(n)` for the first N rows, `count_direct` for a count-only query. Both accept the same auth kwargs. +#### Field projection: `keys` and `exclude_keys` + +The two field-selection options behave differently on the direct path +because MongoDB's `$project` is an allowlist, not a denylist: + +- **`keys` (allowlist)** compiles to a `$project` stage in the direct + pipeline, so the projection runs server-side in MongoDB — only the + named fields (plus the reserved envelope: `_id`, `_created_at`, + `_updated_at`, `_acl`) leave the database. + +- **`exclude_keys` (denylist)** has no `$project` equivalent, so Parse + Stack honors it as a **post-fetch sanitize**: the pipeline is + unchanged, and the SDK recursively strips every key with a matching + name from the decoded results in Ruby. The fields still travel from + MongoDB to the client — this is a result-shaping convenience, not a + data-minimization or access-control boundary. 
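
The post-fetch sanitize described in the `exclude_keys` bullet can be sketched as follows. This is a minimal illustration, assuming decoded results are plain nested Hashes and Arrays. `strip_keys`, `strip_names`, and `RESERVED_KEYS` are hypothetical names for this example, not the SDK's API; the reserved list mirrors the fields this guide says are always retained.

```ruby
require "set"

# Hypothetical reserved-name list: API names plus their Mongo storage forms.
RESERVED_KEYS = %w[
  objectId className __type createdAt updatedAt ACL
  _id _created_at _updated_at _acl
].to_set.freeze

# Drop every Hash key named in `excluded`, at any depth. Reserved names are
# removed from the exclusion set first, so excluding one of them is a no-op.
def strip_keys(value, excluded)
  names = excluded.map(&:to_s).to_set - RESERVED_KEYS
  strip_names(value, names)
end

# Recursive walk: rebuild Hashes without the excluded names, descend into
# Arrays, and return scalars unchanged. The input is not mutated.
def strip_names(value, names)
  case value
  when Hash
    value.each_with_object({}) do |(key, child), out|
      out[key] = strip_names(child, names) unless names.include?(key.to_s)
    end
  when Array
    value.map { |child| strip_names(child, names) }
  else
    value
  end
end

row = {
  "objectId" => "s1", "title" => "Intro", "name" => "draft",
  "artist" => { "__type" => "Object", "className" => "Artist",
                "objectId" => "a1", "name" => "Ada" },
}
cleaned = strip_keys(row, [:name, :objectId])
puts cleaned.inspect
```

In this sketch `name` disappears from both the top-level row and the nested `artist` object, while `objectId` survives at both levels because it is reserved. The data was still fetched in full before being stripped, which is why the guide treats this as result shaping and not as an access-control boundary.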
+ +```ruby +# Allowlist — projected server-side via $project +Song.query.keys(:title, :artist).results_direct + +# Denylist — stripped client-side after fetch +Song.query.exclude_keys(:internal_notes).results_direct +``` + +Two consequences specific to the direct path: + +1. **Recursive by name.** `exclude_keys(:name)` removes `name` at every + depth, including inside included/nested objects — so a query that + includes a pointer also strips the pointed-to object's `name`. This + is broader than Parse Server's REST `excludeKeys`, which is + path-scoped (top-level or dotted) and would leave the nested field + intact. The same query can therefore return different shapes on the + REST and direct paths. + +2. **Reserved fields are never stripped.** `objectId`, `className`, + `__type`, `createdAt`, `updatedAt`, `ACL`, and their Mongo + storage-form names (`_id`, `_created_at`, `_updated_at`, `_acl`) are + always retained, so excluding one of them is a no-op rather than a + way to break object reconstruction. + +The sanitize applies to the object/decoded result paths +(`results_direct`, `first_direct`, and the auto-promoted +`$inQuery`/`$notInQuery` aggregation). The raw aggregation accessor +(`aggregate(...).raw`) returns documents untouched. + +Because `exclude_keys` here is a projection convenience and not an +ACL/CLP/`protectedFields` boundary, the security contract in +[Security](#security) is unaffected — to keep a field from leaving the +database, use `keys` (allowlist) or `protectedFields`, not +`exclude_keys`. + ### `Query#aggregate(pipeline, mongo_direct: true)` ```ruby @@ -233,9 +285,35 @@ raw = Parse::MongoDB.find( ``` Convenience wrapper around `db.find`. Accepts `limit:`, `skip:`, `sort:`, -`projection:`, `max_time_ms:`. When `:limit` is omitted the call applies +`projection:`, `hint:`, `max_time_ms:`. When `:limit` is omitted the call applies `DEFAULT_FIND_LIMIT = 1000` and warns; pass `limit: 0` to opt out. 
+### Forcing an index with `hint` + +When the query planner picks a sub-optimal index on a large collection, +`Query#hint` forces a specific one. It applies on **both** paths — the REST body +(`hint` parameter, Parse Server 7.4.0+) and the mongo-direct path — so a plan you +diagnosed with `Query#explain` can be corrected without dropping to `mongosh`. + +```ruby +# Diagnose, then force the index, on the mongo-direct path: +Post.query(:status => "published").order(:created_at.desc).hint("status_1_created_at_-1") + .results_direct + +# A key pattern works too: +Post.query(:status => "published").hint({ "status" => 1, "createdAt" => -1 }).count_direct +``` + +On the mongo-direct path the hint is forwarded to the driver as the Mongo `hint` +option: `results_direct` / `count_direct` / `distinct_direct` pass it to +`Parse::MongoDB.aggregate` (`hint:` → the aggregation `hint` option), and the +primitives `Parse::MongoDB.aggregate(..., hint:)` and +`Parse::MongoDB.find(..., hint:)` accept it directly. The index name (a `String`) +or a key pattern (`Hash`) are both accepted; an unknown index name is rejected by +MongoDB, which is the intended fail-fast signal that the hint is stale. + +`hint` is unset by default (the planner chooses); it is purely an override. + ### Geo queries Three geo query constraints land in v4.4.0 alongside a direct @@ -620,6 +698,20 @@ ACL/CLP enforcement if the SDK applies it. As of **v4.4.0**, the SDK applies that enforcement on the mongo-direct path when the caller supplies a scope. Five layers compose: +> **Atlas index entry points share this enforcement.** The Atlas-index +> stages (`$vectorSearch`, `$search`, `$rankFusion`) must be stage 0 of +> their pipeline, so they cannot route through `Parse::MongoDB.aggregate` +> (which prepends an ACL `$match` at stage 0). 
`Parse::VectorSearch.search` +> (`find_similar`), `Parse::AtlasSearch.search`, and +> `Parse::VectorSearch::Hybrid` (`Class.hybrid_search`, v5.4.0) therefore +> reproduce the same enforcement chain **inline** — the ACL `_rperm` +> `$match` is appended AFTER the index stage, and CLP / `protectedFields` / +> the internal-fields denylist run post-fetch — so the same scope kwargs +> (`session_token:` / `acl_user:` / `acl_role:` / `master:`) and the same +> contract apply. Hybrid search fuses two independently-enforced branches, +> so fused rows are already access-filtered. `$rankFusion` was added to the +> strict-mode allowlist (Layer 1) in v5.4.0 for the opt-in native path. + ### Layer 1: Pipeline-security denylist (always on) `Parse::PipelineSecurity` refuses dangerous operators at any depth in diff --git a/docs/usage_guide.md b/docs/usage_guide.md index 83f4f08..7675028 100644 --- a/docs/usage_guide.md +++ b/docs/usage_guide.md @@ -83,10 +83,20 @@ Song.query.order(:plays.desc).skip(10).limit(20).results # Include related objects Song.all(includes: [:album, :comments]) -# Select specific fields +# Select specific fields (allowlist) Song.all(keys: [:title, :artist]) + +# Omit specific fields (denylist) +Song.query.exclude_keys(:internal_notes).results ``` +> On the mongo-direct read path, `keys` is projected server-side while +> `exclude_keys` is applied as a recursive post-fetch sanitize (it strips +> matching field names at every depth and never removes reserved fields +> such as `objectId`). See the +> [Direct MongoDB Integration Guide](mongodb_direct_guide.md) for the +> exact semantics and how it differs from the REST path. + ## Aggregation ```ruby diff --git a/docs/webhooks_guide.md b/docs/webhooks_guide.md new file mode 100644 index 0000000..9a03bcf --- /dev/null +++ b/docs/webhooks_guide.md @@ -0,0 +1,418 @@ +# Cloud Code Webhooks Guide + +Webhooks are how `parse-stack-next` runs **server-side** trigger logic. 
They are +the bridge between Parse Server and your Ruby code: Parse Server calls back into +a Ruby Rack app on a matching trigger, and your model's ActiveModel callbacks +(and any webhook blocks) run there. + +This is a server-side-only concern. A pure client (or a server with no +registered webhooks) runs all of its trigger logic locally in ActiveModel and +nothing inside Parse Server. + +## Why register a webhook at all + +A `Parse::Object`'s ActiveModel callbacks run in the process that initiates the +save: + +- A **Ruby-initiated** save (this SDK) runs `before_save`, `after_create`, etc. + locally, before/after the REST call. +- A save from a **non-Ruby client** — the JS/Swift SDKs, a raw REST call, or the + Parse Dashboard — never touches your Ruby process. That trigger logic is + simply skipped server-side. + +Registering a webhook closes that gap. Once Parse Server has a `beforeSave` +webhook for a class, it calls your Ruby app on every save from every client, and +your callbacks run server-side for all of them. + +**The rule:** your ActiveModel logic applies to non-Ruby clients **only if the +webhook is registered.** + +## ActiveModel hooks vs Parse Server triggers + +The SDK exposes the full ActiveModel lifecycle on every `Parse::Object`. Parse +Server, separately, exposes a fixed set of webhook trigger types. They are not +one-to-one — the SDK maps between them. 
+ +### ActiveModel callbacks (Ruby side) + +| Callback | Fires | +|----------|-------| +| `before_validation` / `after_validation` | around local validation | +| `before_save` / `after_save` | around every save (create **and** update) | +| `before_create` / `after_create` | around the first save of a new object | +| `before_update` / `after_update` | around saves of an existing object | +| `before_destroy` / `after_destroy` | around delete | + +### Parse Server webhook trigger types (server side) + +| Trigger | className | Notes | +|---------|-----------|-------| +| `beforeSave` / `afterSave` | a class | create **and** update | +| `beforeDelete` / `afterDelete` | a class | | +| `beforeFind` / `afterFind` | a class | | +| `beforeLogin` / `afterLogin` | `_User` | login-side hooks | +| `afterLogout` | `_Session` | | +| `beforePasswordResetRequest` | `_User` | | +| `beforeSave` / `afterSave` / `beforeDelete` / `beforeFind` / `afterFind` | `@File` | file triggers | +| `beforeConnect` | `@Connect` | LiveQuery connection (connection-global) | +| `beforeSubscribe` / `afterEvent` | a class | LiveQuery subscription / events | + +### How they relate + +- **`beforeSave` / `afterSave` carry the create variants.** Parse Server has **no + `beforeCreate` / `afterCreate` trigger** — it rejects them. The SDK runs your + `before_create` / `after_create` callbacks *inside* the `beforeSave` / + `afterSave` handler, gated on whether the object is new. So **registering a + `beforeSave` webhook enables both `before_save` and `before_create`**; + registering `afterSave` enables both `after_save` and `after_create`. + + Asking for a create webhook fails fast with guidance: + + ```ruby + Post.webhook(:after_create) { … } + # ArgumentError: There is no after_create webhook. 
Register `webhook :after_save` + # instead — your after_create ActiveModel callbacks run inside the after_save + # handler for new objects (registering after_save enables BOTH the after_save + # and after_create callbacks). + ``` + +- **Trigger order is honored.** Within the save handler the SDK runs callbacks in + ActiveModel order: `before_save` then `before_create` on the way in, + `after_create` then `after_save` on the way out. + +- **`@File` and `@Connect` are pseudo-classes.** File triggers register against + the `@File` className; the connection-global LiveQuery trigger uses `@Connect`. + The SDK accepts both for the full register/fetch/delete lifecycle. + +- **`beforeFind` / `afterFind` are result-side, not object-side.** Unlike the + save/delete triggers, a find payload carries no single `object` — `beforeFind` + exposes the incoming `query` (via `payload.query`) and `afterFind` exposes the + matched rows (via `payload.objects`). And unlike `afterSave` (whose return + value Parse Server ignores), **`afterFind` is result-rewriting**: whatever the + handler returns *replaces* the rows sent to the client, so it can filter or + redact results. It also adds a webhook round-trip to every matching query, so + register it deliberately. + + One non-obvious detail the SDK handles for you: **Parse Server does not put the + class name anywhere in the find payload body** — the matched objects omit + `className` and there is no top-level one. The SDK derives the class from the + webhook URL path (`//`) so your `afterFind` / + `beforeFind` block routes correctly and `payload.parse_class` resolves. (If you + build a `Payload` yourself in a test, pass the class as the second argument: + `Parse::Webhooks::Payload.new(body, "MyClass")`.) 
+ + Because the class is resolved from the route, declared `:vector` columns are + stripped from `afterFind` `payload.objects` by default, exactly as they are + from `object`/`original`/`update` on the other triggers (a + `vector_visibility :public` class keeps them). One consequence to keep in + mind: an `afterFind` handler that returns `payload.objects` to pass results + through passes the *vector-scrubbed* rows on to the client — which matches the + `as_json` default (an `owner_only` class never exposes vectors anyway). Return + your own array if you need different columns. + +- **Auth triggers (`beforeLogin` / `afterLogin` / `afterLogout` / + `beforePasswordResetRequest`) and LiveQuery triggers (`beforeConnect` / + `beforeSubscribe` / `afterEvent`) are routed as first-class shapes** — they + are not object save/delete triggers, so **none of them run ActiveModel + `save` / `create` / `destroy` callbacks**, even the login/logout/reset ones + that carry a `_User` or `_Session`. + + Identify them with the matching predicates — `before_login?`, `after_login?`, + `after_logout?`, `before_password_reset_request?`, `before_connect?`, + `before_subscribe?`, `after_event?` — or the category helpers `auth_trigger?` + / `live_query_trigger?`. Useful accessors by shape: + + | Trigger | what the payload carries | + |---------|--------------------------| + | `beforeLogin` | the user being authenticated as **`payload.parse_object`** (a `_User`). `payload.user` is **`nil`** — auth isn't complete yet. | + | `afterLogin` | both `payload.parse_object` and `payload.user` (the now-authenticated user). | + | `afterLogout` | the session as `payload.parse_object` (a `_Session`). | + | `beforePasswordResetRequest` | the target user as `payload.parse_object`. | + | `beforeConnect` | connection-global: no object; the caller token (if any) in `payload.session_token`; counts in `payload.clients` / `payload.subscriptions`. 
| + | `beforeSubscribe` | shaped like `beforeFind` — `payload.query` / `payload.parse_query`; className comes from the route. Caller token in `payload.session_token`. | + | `afterEvent` | the event type in `payload.event` (`create` / `enter` / `update` / `leave` / `delete`), plus `payload.object` / `payload.original`. | + + > The login footgun: during `beforeLogin` reach for `payload.parse_object`, + > **not** `payload.user` (which is `nil`). For connect/subscribe the live + > session token is at the top level of the payload, not nested under a user — + > the SDK captures it into `payload.session_token` (so `payload.user_client` / + > `payload.user_agent` work) and keeps it out of `as_json` and the request log. + + **Response contract — what you return matters only for the `before*` ones.** + Parse Server **ignores the response body for all seven** of these triggers + (its webhook response handler resolves `{}` regardless). The *only* way a + handler affects the operation is by **rejecting** it, and only the `before*` + variants can be rejected (an `after*` trigger fires after the fact): + + ```ruby + Parse::Webhooks.route(:before_login, "_User") do |payload| + error!("account suspended") if payload.parse_object.suspended? # denies login + # returning false also denies (mapped to the error response); anything else + # — including the user object — succeeds as a no-op + end + + Parse::Webhooks.route(:after_event, "Post") do |payload| + AuditLog.record(payload.event, payload.parse_id) # observe-only; return value ignored + end + ``` + + Note the asymmetry with `before_save`: Parse Server treats a `{success:false}` + body as **allow** (only an `{error}` body rejects). So "return `false` to deny + login" only works because the SDK converts that `false` into an error response + for you. `error!(message)` is the explicit, message-carrying form. + + **LiveQuery delivery caveat.** `beforeConnect` / `beforeSubscribe` / + `afterEvent` fire inside the LiveQuery server. 
They are delivered to an HTTP + webhook **only in a co-located, single-process LiveQuery setup**; with a + separate LiveQuery server they are in-process (`Parse.Cloud`) only. + `beforeConnect` in particular carries a live client handle that does not + serialize over HTTP, so it is effectively in-process-only. Register them when + you know your topology supports it. + +## Defining and registering webhooks + +```ruby +Parse::Webhooks.key = ENV.fetch("PARSE_WEBHOOK_KEY") # matches Parse Server's webhookKey + +class Post < Parse::Object + property :title, :string + + before_save :normalize # runs server-side once beforeSave is registered + after_create :index_for_search # runs inside the afterSave handler for new posts + + webhook :before_save do # optional block, in addition to callbacks + parse_object # return the object (or `false` to halt the save) + end +end +``` + +Register with Parse Server (once, at deploy — requires the master key). +`endpoint` is the public HTTPS URL where the Rack app is reachable: + +```ruby +Parse::Webhooks.register_functions!("https://hooks.example.com/webhooks") +Parse::Webhooks.register_triggers!("https://hooks.example.com/webhooks") +``` + +Mount the Rack app (`config.ru`): + +```ruby +require_relative "app/webhooks" +run Parse::Webhooks +``` + +See [`examples/webhook_server.rb`](../examples/webhook_server.rb) for a complete, +runnable setup. + +## Auditing trigger coverage + +The wiring above has three independent moving parts, and a callback runs +server-side only when all three line up: + +1. the model's **ActiveModel callback** (`after_save :send_email`), +2. a **local webhook route** so the router has a handler to run (the + `webhook :after_save` block, or `Parse::Webhooks.route(:after_save, "Post")`), +3. the **server trigger** registered with Parse Server (`register_triggers!`), + so Parse Server actually POSTs to your app. 
+ +Declaring the callback alone does nothing for a non-Ruby client — the save +never touches your Ruby process. It is easy for these three to drift: a new +`after_save` callback with no block, a `webhook` block you never registered, or +a stale server trigger pointing at a class whose block was removed. + +`Parse::Webhooks.trigger_audit` cross-references all three across every +registered class and reports the gaps. The server comparison reads the +master-key-only `hooks/triggers` endpoint, so it needs a master-key client; +pass `network: false` to audit callbacks against local routes only. + +```ruby +puts Parse::Webhooks.trigger_audit(pretty: true) # human-readable summary +report = Parse::Webhooks.trigger_audit # Hash report +Parse::Webhooks.trigger_audit(network: false) # local-only, no master key +``` + +The audit emits four kinds of findings: + +- **`callbacks_inert`** — a model has callbacks mapping to a trigger + (`after_save` / `after_create` → `afterSave`, etc.) but the local block and/or + the server trigger is missing, so they never fire for non-Ruby clients. The + `missing:` list says which piece to add. This is the headline gap. +- **`route_not_registered`** — a local `webhook :X` block exists but the trigger + isn't on the server, so Parse Server never calls it. Fix by running + `register_triggers!`. +- **`orphan_server_trigger`** — a server trigger is registered but no local block + handles it; every matching operation pays a webhook round-trip that does + nothing. +- **`local_only_callbacks`** — informational: `before_update` / `after_update` + and `before_validation` / `after_validation` callbacks have **no** Parse Server + trigger that can run them (the webhook router runs only the save and create + chains). They fire for Ruby-initiated saves but never for non-Ruby clients, + and no registration changes that. 
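The four finding kinds above amount to set comparisons between callbacks, local routes, and server triggers. The following is a rough sketch of that cross-reference in plain Ruby, not the SDK's implementation: `trigger_for` and `audit` are hypothetical helpers, the input shapes are invented for illustration, and the destroy-to-delete mapping is an assumption.

```ruby
require "set"

# Assumed callback -> trigger mapping for this sketch. A nil result marks a
# callback that no Parse Server trigger can run (update/validation callbacks).
def trigger_for(callback)
  {
    before_save: "beforeSave", before_create: "beforeSave",
    after_save: "afterSave", after_create: "afterSave",
    before_destroy: "beforeDelete", after_destroy: "afterDelete",
  }[callback]
end

# callbacks: { "Post" => [:after_save, ...] }
# routes / server: arrays of [class_name, trigger] pairs
def audit(callbacks:, routes:, server:)
  routes = routes.to_set
  server = server.to_set
  findings = { callbacks_inert: [], local_only_callbacks: [] }

  callbacks.each do |klass, names|
    names.each do |name|
      trigger = trigger_for(name)
      if trigger.nil?
        findings[:local_only_callbacks] << [klass, name]
        next
      end
      pair = [klass, trigger]
      missing = []
      missing << :local_route unless routes.include?(pair)
      missing << :server_trigger unless server.include?(pair)
      findings[:callbacks_inert] << { pair: pair, missing: missing } unless missing.empty?
    end
  end

  findings[:callbacks_inert].uniq! # after_save + after_create share one trigger
  findings[:route_not_registered] = (routes - server).to_a
  findings[:orphan_server_trigger] = (server - routes).to_a
  findings
end

report = audit(
  callbacks: { "Post" => [:after_save, :before_update] },
  routes: [["Post", "beforeSave"]],
  server: [["Comment", "afterSave"]],
)
```

In this sample, `Post`'s `after_save` is inert (both pieces missing), its `before_update` is local-only, the `beforeSave` route was never registered, and the `Comment` server trigger is an orphan.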
+ +Wire it into CI or a deploy check to fail fast on a coverage gap: + +```ruby +inert = Parse::Webhooks.trigger_audit[:summary][:findings][:callbacks_inert].to_i +abort "Webhook coverage gaps detected" if inert.positive? +``` + +## Returning a value from a handler + +A handler block runs with `self` bound to the `Parse::Webhooks::Payload`, so +inside it you can call `parse_object`, `params`, `error!`, etc. directly. The +value the handler produces is what Parse Server receives: for `before_save`, +return the (possibly mutated) `parse_object` to allow the write, or `false` / +`error!` to reject it. + +You can set that value either with an explicit `return` or by letting it be the +block's last expression — both work: + +```ruby +Parse::Webhooks.route :before_save, :Post do + post = parse_object + + return post if post.title.present? # explicit early return + error! "title is required" # raise to reject the save +end + +# Equivalent, using the last-expression value: +Parse::Webhooks.route :before_save, :Post do + post = parse_object + post.title.present? ? post : error!("title is required") +end +``` + +The legacy proc idioms remain valid too — `next value` and `break value` both +set the result. `return`, like anywhere in Ruby, ends the handler immediately, +so nothing written after it in the same block runs. To run work *after* the +response, use [`after_response`](#deferring-work-until-after-the-response) +rather than writing code after the `return`. + +## Deferring work until after the response + +`payload.after_response { … }` (alias `defer`) registers a block to run **after** +the webhook response has been sent to Parse Server — off the critical path of the +save or function the client is waiting on. The handler still returns its value +synchronously (that value is the response Parse Server acts on); the deferred +block runs afterward. Use it for follow-up work that should not add latency: +search indexing, cache warming, fan-out notifications. 
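As a rough picture of the mechanism, the sketch below defers a block in a Rack app using only plain Ruby. It assumes the server exposes `rack.after_reply` in the Rack env as an array of callables (as Puma does) and otherwise falls back to a detached thread. `defer_after_reply` is a hypothetical helper for illustration, not the SDK's code.

```ruby
# Hypothetical helper: schedule `work` to run after the reply is written.
# Each deferred block is wrapped so that one failure cannot affect the others.
def defer_after_reply(env, &work)
  safe = lambda do
    begin
      work.call
    rescue StandardError => e
      warn "deferred work failed: #{e.class}: #{e.message}"
    end
  end

  hooks = env["rack.after_reply"]
  if hooks.respond_to?(:<<)
    hooks << safe      # the server drains these once the response is flushed
    nil
  else
    Thread.new(&safe)  # fallback: one detached thread, no pool or cap
  end
end

# Simulate a server that supports rack.after_reply.
ran = []
env = { "rack.after_reply" => [] }
defer_after_reply(env) { ran << :first }
defer_after_reply(env) { raise "boom" } # isolated: logged, does not stop the rest
defer_after_reply(env) { ran << :second }

# Nothing has run yet; the server would call the hooks after the reply.
ran_before_reply = ran.dup
env["rack.after_reply"].each(&:call)

# Simulate a server without rack.after_reply: the thread fallback.
fallback = defer_after_reply({}) { ran << :threaded }
fallback.join
```

The hook array preserves registration order, which is why the first and second blocks run in sequence around the failing one.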
+ +```ruby +Parse::Webhooks.route :after_save, :Post do + post = parse_object + after_response { SearchIndex.reindex(post.id) } # runs after the reply is sent + post +end +``` + +How it runs: + +- **Under Puma or Unicorn** the block is enqueued on `rack.after_reply` and runs + once the response is flushed to the socket, on the same worker thread — so it + adds nothing to the client's round-trip. +- **On a server without `rack.after_reply`** (e.g. WEBrick) it falls back to a + detached thread per request with deferred work — there is no pool or cap, so + under high request volume those threads can accumulate. Run the webhook app + under **Puma or Unicorn in production** (both provide `rack.after_reply`, which + runs the work on the existing worker thread with no extra thread spawned); the + thread fallback is best treated as a development-server convenience. +- Multiple `after_response` blocks run in registration order, and each is + isolated — one raising affects neither the response nor the others. +- `self` inside the block is the payload, so `parse_object`, `params`, etc. are + available (it closes over the handler's scope). + +Things to know before relying on it: + +- **Success path only.** Deferred blocks run only when the handler produced a + successful response. If a `before_save` rejects the write (`error!`, a raise, + or returning `false`), its registered `after_response` blocks do **not** run. +- **"After the response" is not "after the row commits."** The block runs after + the *response* is flushed. For `before_save` that is before Parse Server has + committed the write; even for `after_save` the SDK does not guarantee commit + ordering relative to the deferred block. Do not rely on the persisted row being + readable inside it. +- **In-process and best-effort.** The work runs in the web worker and does not + survive a restart, crash, or deploy. 
For work that *must* happen — payment + capture, irreversible side effects — hand it to a durable job queue + (Sidekiq / ActiveJob) instead; `after_response` is for latency-shedding, not + durability. +- **Mounted-app only.** Deferred blocks are drained by the `Parse::Webhooks` Rack + app. Invoking a handler directly (`Parse::Webhooks.run_function`, or calling + `call_route` in a unit test) does not run them — `after_response` is a no-op + there. +- **Capturing `user_client` / `user_agent` extends the token's lifetime.** A + deferred block closes over the payload, so referencing `payload.user_client` / + `payload.user_agent` (or `payload.session_token`) keeps the caller's live + session token in memory until the block finishes — beyond the synchronous + request. That is fine and expected when the deferred work needs to act as the + caller; just don't capture them when the work doesn't need the user's + authority (use a master-key client instead), so the token isn't pinned longer + than necessary. + +## Latency: webhooks are synchronous + +Every registered webhook adds a **separate, synchronous HTTP round-trip** to the +client's operation. Parse Server **waits for the webhook to return before +proceeding** — and it waits even on `afterSave`, despite the afterSave return +value being a no-op. + +This has direct design consequences for `afterSave` (and `afterDelete`): + +- **Enqueue, don't execute.** Treat `after_save` as a place to hand work to a + background job, not to do long-running logic inline. Anything slow here is + added latency on every save, for every client. For in-process follow-up that + doesn't need a durable queue, [`after_response`](#deferring-work-until-after-the-response) + moves it off the client's round-trip; for anything that *must* happen, use a + real job queue. +- **Avoid saving other objects during an afterSave.** Each cascading save fires + its own webhooks, which can fire more — a latency cascade. 
If you must, do it + in a background job, not inline in the handler. + +`beforeSave` is necessarily inline (it can mutate or reject the write), so keep +it lean and deterministic. + +## Server-side dedup: two distinct mechanisms + +Two different "dedup" systems protect webhook handling. They solve different +problems — don't conflate them. + +### 1. Ruby-initiated dedup (keep logic local, prevent double-runs) + +When a save is initiated by **this SDK with the master key**, Parse Stack tags +the request as trusted-Ruby-initiated (an `_RB_` request-id marker plus the +master key). It has already run the model's `before_save` / `after_save` / +`after_create` ActiveModel callbacks **locally**. The webhook therefore does +**not** re-run those callbacks — that would double-fire side effects (e.g. an +`after_save :send_email` would send two emails per save). + +The intent is to keep trigger logic local when possible and run it exactly once. +Note that any logic in the **webhook block itself** still runs; only the +duplicate ActiveModel callback pass is skipped. A spoofed `_RB_` marker without +the master key does not get this treatment — the callbacks run in the webhook as +usual. + +### 2. Server-initiated replay / freshness protection (inbound) + +This protects the webhook endpoint against **replayed inbound POSTs** — +`lib/parse/webhooks/replay_protection.rb`: + +- **Always-on body + request-id dedup.** A bounded LRU records a digest of each + `(request_id, body)`; a duplicate seen within `replay_window_seconds` is + rejected with `"Webhook replay detected."`. No cooperation from Parse Server is + required; this stops in-window replays. +- **Opt-in HMAC freshness verification.** Set a `signing_secret` and the receiver + verifies two headers: + - `X-Parse-Webhook-Timestamp` — Unix epoch seconds; requests outside + `signing_max_skew_seconds` (default 300) are rejected as stale. 
+ - `X-Parse-Webhook-Signature` — hex HMAC-SHA256 of `"#{timestamp}.#{body}"` + keyed with the signing secret. + +```ruby +Parse::Webhooks::ReplayProtection.signing_secret = ENV["PARSE_WEBHOOK_SIGNING_SECRET"] +Parse::Webhooks::ReplayProtection.replay_window_seconds = 120 +Parse::Webhooks::ReplayProtection.signing_max_skew_seconds = 300 +``` + +This is **inbound** protection and is unrelated to request **idempotency** +(`X-Parse-Request-Id`), which dedups the SDK's own **outbound** retries on the +Parse Server side. Different direction, different mechanism. diff --git a/examples/README.md b/examples/README.md new file mode 100644 index 0000000..224a3ce --- /dev/null +++ b/examples/README.md @@ -0,0 +1,46 @@ +# Examples + +Runnable scripts that exercise `parse-stack-next` against a live Parse Server. +Each file is self-contained and reads its configuration from environment +variables. Start here: + +| Script | Demonstrates | Needs | +|---|---|---| +| [`basic_server.rb`](basic_server.rb) | Privileged (master-key) setup: define models, push schema with `auto_upgrade!`, full CRUD + queries with a `belongs_to`. | app id, REST key, **master key** | +| [`basic_client.rb`](basic_client.rb) | Unprivileged client (no master key): login/signup, `with_session`, and a row-level **ACL enforcement** demo (the owner reads a record; an anonymous caller gets `nil`). | app id, REST key | +| [`live_query_listener.rb`](live_query_listener.rb) | Interactive LiveQuery console: subscribes scoped to a user's session token and prints create / update / delete events until Ctrl-C — you only "hear" what that user may read. | app id, REST key, LiveQuery URL | +| [`rag_chatbot.rb`](rag_chatbot.rb) | Retrieval-augmented generation: managed `embed`, `agent_searchable`, `semantic_search` via `Parse::Agent`, plus an OpenAI/Anthropic generation add-in. 
| app id, REST key, master key, `OPENAI_API_KEY` (+ Atlas) | +| [`transaction_example.rb`](transaction_example.rb) | Atomic multi-object operations via `Parse::Object.transaction`. | app id, REST key | + +## Common setup + +All scripts read a Parse connection from the environment: + +```bash +export PARSE_SERVER_URL=http://localhost:1337/parse +export PARSE_APP_ID=your-app-id +export PARSE_REST_KEY=your-rest-api-key +export PARSE_MASTER_KEY=your-master-key # server-side scripts only +``` + +Then run any script with the gem on the load path: + +```bash +ruby -Ilib examples/basic_server.rb +``` + +## Suggested order + +1. **`basic_server.rb`** — defines and provisions the `Artist`, `Song`, and + `Post` classes the other scripts use. Run it first. +2. **`basic_client.rb`** — see how the same SDK behaves without the master key, + and watch Parse Server enforce a row-level ACL. +3. **`live_query_listener.rb`** — leave it running, then create/update/destroy + `Post`s from another terminal (or the dashboard) and watch them stream in. +4. **`rag_chatbot.rb`** — requires an Atlas-backed server and an embedding key; + see [`../docs/atlas_vector_search_guide.md`](../docs/atlas_vector_search_guide.md) + for the vector-search setup. + +> Each script's header comment lists the exact environment variables and any +> prerequisites (e.g. `basic_client.rb` needs the `Post` class to already +> exist, which `basic_server.rb` provisions). diff --git a/examples/basic_client.rb b/examples/basic_client.rb new file mode 100644 index 0000000..63f8b7f --- /dev/null +++ b/examples/basic_client.rb @@ -0,0 +1,93 @@ +#!/usr/bin/env ruby +# frozen_string_literal: true + +# Basic Client Setup for parse-stack-next +# +# The UNPRIVILEGED side: configure the SDK WITHOUT a master key — the way a +# mobile app, browser, or untrusted worker uses it. 
There is no admin escape +# hatch, so authorization is carried per-call by the user's sessionToken and +# Parse Server is the enforcement boundary (CLP rejects, ACL filters rows, +# protectedFields strips columns). +# +# This example logs a user in and shows that a row-level ACL actually blocks +# reads: the owning user can read their object; an anonymous client cannot. +# +# See basic_server.rb for the privileged (master-key) counterpart. +# +# Prerequisite: the `Post` class must already exist on the server. A no-master +# client cannot create a class when Parse Server's allowClientClassCreation is +# false (the default since 5.0), so run examples/basic_server.rb first (it +# provisions Post with the master key) — or create the class yourself. +# +# Run it (REST key only — no master key in this process): +# export PARSE_SERVER_URL=http://localhost:1337/parse +# export PARSE_APP_ID=... PARSE_REST_KEY=... +# ruby examples/basic_client.rb + +require "parse-stack-next" + +# --------------------------------------------------------------------------- +# 1. Configure a no-master-key client +# --------------------------------------------------------------------------- +Parse.setup( + server_url: ENV.fetch("PARSE_SERVER_URL", "http://localhost:1337/parse"), + app_id: ENV.fetch("PARSE_APP_ID"), + api_key: ENV.fetch("PARSE_REST_KEY"), + master_key: nil, # explicit: never set this from env in client builds + logging: false, +) + +# Belt-and-suspenders: prove the master key really is absent. +raise "master key leaked into a client process!" unless Parse.client.master_key.nil? + +class Post < Parse::Object + property :title, :string + property :body, :string +end + +# --------------------------------------------------------------------------- +# 2. Authenticate (log in, or sign up on first run) +# --------------------------------------------------------------------------- +USERNAME = "ada" +PASSWORD = "p4ssw0rd!" 
+ +# Parse::User.login returns nil on bad/unknown credentials (it does not raise), +# so fall back to signup the first time. +user = Parse::User.login(USERNAME, PASSWORD) || + Parse::User.signup(USERNAME, PASSWORD, "ada@example.com") + +puts "Logged in as #{user.username} (#{user.id})" +puts "Session token: #{user.session_token[0, 8]}…" + +# --------------------------------------------------------------------------- +# 3. Create an owner-only object AS the user +# --------------------------------------------------------------------------- +# `with_session` authorizes every REST-routed op in the block as this user. +post = user.with_session do + p = Post.new(title: "My private note", body: "Only Ada may read this.") + # Owner-only ACL: grant read+write to this user, no public access. + acl = Parse::ACL.new # empty == no public, no one + acl.apply(user.id, true, true) # this user: read + write + p.acl = acl + p.save + p +end +puts "Created Post #{post.id} with an owner-only ACL" + +# --------------------------------------------------------------------------- +# 4. Read it back AS the owner — succeeds +# --------------------------------------------------------------------------- +as_owner = user.with_session { Post.find(post.id) } +puts "As owner -> #{as_owner ? "READ OK: #{as_owner.title.inspect}" : "BLOCKED"}" + +# --------------------------------------------------------------------------- +# 5. Read it back ANONYMOUSLY (no session token) — blocked by the ACL +# --------------------------------------------------------------------------- +# No master key + no session => a plain REST request the ACL filters out. +# `first` returns nil rather than raising when the row is not visible. +anon = Post.first(objectId: post.id) +puts "Anonymous -> #{anon ? "READ OK (unexpected!): #{anon.title.inspect}" : "BLOCKED (nil) — ACL enforced"}" + +# Takeaway: identical SDK calls return the row for the owner and nil for an +# unauthorized caller. 
That difference is Parse Server enforcing the ACL — +# the client SDK simply threads the auth context and reports the verdict. diff --git a/examples/basic_server.rb b/examples/basic_server.rb new file mode 100644 index 0000000..6ba5a94 --- /dev/null +++ b/examples/basic_server.rb @@ -0,0 +1,109 @@ +#!/usr/bin/env ruby +# frozen_string_literal: true + +# Basic Server-Side Setup for parse-stack-next +# +# The privileged way an app/server boots the SDK: configure a client WITH the +# master key, define a model, push its schema, and do CRUD + queries. Because +# the master key is present, Parse Server treats every request as an admin +# operation (ACL / CLP / protectedFields are bypassed) — which is exactly what +# you want for a trusted backend, and exactly what you must NOT do in an +# untrusted client (see basic_client.rb for that side). +# +# Run it: +# export PARSE_SERVER_URL=http://localhost:1337/parse +# export PARSE_APP_ID=... PARSE_REST_KEY=... PARSE_MASTER_KEY=... +# ruby examples/basic_server.rb + +require "parse-stack-next" + +# --------------------------------------------------------------------------- +# 1. Configure the (master-key) client +# --------------------------------------------------------------------------- +Parse.setup( + server_url: ENV.fetch("PARSE_SERVER_URL", "http://localhost:1337/parse"), + app_id: ENV.fetch("PARSE_APP_ID"), + api_key: ENV.fetch("PARSE_REST_KEY"), + master_key: ENV.fetch("PARSE_MASTER_KEY"), +) + +# --------------------------------------------------------------------------- +# 2. Define models +# --------------------------------------------------------------------------- +class Artist < Parse::Object + property :name, :string, required: true + property :country, :string +end + +class Song < Parse::Object + property :title, :string, required: true + property :plays, :integer, default: 0 + property :released_on, :date + + belongs_to :artist # stored as a Pointer +end + +# Provisioned here for the companion basic_client.rb. 
A no-master client can't +# create a class when Parse Server's allowClientClassCreation is false (the +# default since Parse Server 5.0), so the trusted side defines it up front. +class Post < Parse::Object + property :title, :string + property :body, :string +end + +# --------------------------------------------------------------------------- +# 3. Push the schema (server-side only — needs the master key) +# --------------------------------------------------------------------------- +# auto_upgrade! creates the class and any missing columns on Parse Server to +# match the model definition. Run it at boot / deploy, not on every request. +Artist.auto_upgrade! +Song.auto_upgrade! +Post.auto_upgrade! + +# --------------------------------------------------------------------------- +# 4. Create +# --------------------------------------------------------------------------- +artist = Artist.create!(name: "Daft Punk", country: "FR") + +song = Song.new(title: "One More Time", plays: 1_000, artist: artist) +song.save # => true (returns false + sets .errors on failure) +puts "Created Song #{song.id}: #{song.title}" + +# create! is `new(attrs).save!` in one call (raises on failure): +Song.create!(title: "Harder, Better, Faster, Stronger", plays: 2_500, artist: artist) + +# --------------------------------------------------------------------------- +# 5. Read +# --------------------------------------------------------------------------- +found = Song.query(:objectId => song.id).include(:artist).first # eager-load the pointer +puts "Fetched: #{found.title} by #{found.artist.name}" + +first_hit = Song.first(title: "One More Time") +puts "First match plays: #{first_hit.plays}" + +# --------------------------------------------------------------------------- +# 6. 
Update +# --------------------------------------------------------------------------- +song.plays += 1 +song.save +puts "Updated plays: #{song.plays}" + +# --------------------------------------------------------------------------- +# 7. Query +# --------------------------------------------------------------------------- +# DataMapper-style constraints. Symbol operators (:plays.gt) build comparisons; +# order / limit chain on. +popular = Song.query(:plays.gt => 1_500) + .where(artist: artist) + .order(:plays.desc) + .limit(10) + .results +puts "Popular songs: #{popular.map(&:title).join(', ')}" + +puts "Total songs by #{artist.name}: #{Song.count(artist: artist)}" + +# --------------------------------------------------------------------------- +# 8. Delete +# --------------------------------------------------------------------------- +song.destroy +puts "Destroyed #{song.id}" diff --git a/examples/live_query_listener.rb b/examples/live_query_listener.rb new file mode 100644 index 0000000..52feddf --- /dev/null +++ b/examples/live_query_listener.rb @@ -0,0 +1,98 @@ +#!/usr/bin/env ruby +# frozen_string_literal: true + +# Live Query Listener for parse-stack-next +# +# An interactive console listener — like a tail / `rails console` that just +# prints: it logs in as a user, opens a LiveQuery subscription scoped to that +# user's session token, and prints every event (create / update / delete / +# enter / leave) until you press Ctrl-C. +# +# Because the subscription carries the user's `sessionToken`, Parse Server +# enforces ACLs on the live stream too: you only receive events for objects +# that user is allowed to read — "whatever the user can hear." Swap in a +# master-key subscription (use_master_key: true) to hear everything. +# +# Prerequisite: the `Post` class must exist on the server (run +# examples/basic_server.rb first, which provisions it). You also need a Parse +# Server with the LiveQuery websocket server enabled for the Post class. 
+# +# Run it, then in another terminal create/update/destroy Posts (e.g. via +# examples/basic_client.rb or the dashboard) and watch them stream in: +# export PARSE_SERVER_URL=http://localhost:1337/parse +# export PARSE_APP_ID=... PARSE_REST_KEY=... +# export PARSE_LIVE_QUERY_URL=ws://localhost:1337/parse # ws:// or wss:// +# ruby examples/live_query_listener.rb + +require "parse-stack-next" +require "parse/live_query" + +# --------------------------------------------------------------------------- +# 1. Configure the REST client + the LiveQuery websocket client +# --------------------------------------------------------------------------- +Parse.setup( + server_url: ENV.fetch("PARSE_SERVER_URL", "http://localhost:1337/parse"), + app_id: ENV.fetch("PARSE_APP_ID"), + api_key: ENV.fetch("PARSE_REST_KEY"), + master_key: nil, # a plain client — the session token does the scoping + logging: false, +) + +Parse.live_query_enabled = true +Parse::LiveQuery.configure do |config| + config.url = ENV.fetch("PARSE_LIVE_QUERY_URL", "ws://localhost:1337/parse") + config.application_id = ENV.fetch("PARSE_APP_ID") + config.client_key = ENV.fetch("PARSE_REST_KEY") +end + +class Post < Parse::Object + property :title, :string + property :body, :string +end + +# --------------------------------------------------------------------------- +# 2. Authenticate (so the subscription is ACL-scoped to this user) +# --------------------------------------------------------------------------- +USERNAME = ENV.fetch("PARSE_USERNAME", "ada") +PASSWORD = ENV.fetch("PARSE_PASSWORD", "p4ssw0rd!") + +user = Parse::User.login(USERNAME, PASSWORD) || + Parse::User.signup(USERNAME, PASSWORD, "ada@example.com") +puts "Listening as #{user.username} (#{user.id}) — only Posts this user can read.\n\n" + +# --------------------------------------------------------------------------- +# 3. 
Open the subscription + register handlers +# --------------------------------------------------------------------------- +def stamp = Time.now.strftime("%H:%M:%S") + +# `where:` narrows the live query (omit it to hear every readable Post); +# `session_token:` is what makes Parse Server apply this user's ACL to the +# stream. Use `use_master_key: true` instead to listen to everything. +subscription = Post.subscribe( + where: {}, # e.g. { :title.exists => true } + session_token: user.session_token, +) + +subscription.on(:subscribe) { puts "[#{stamp}] subscribed — waiting for events…" } +subscription.on(:create) { |post| puts "[#{stamp}] CREATE #{post.id} #{post.title.inspect}" } +subscription.on(:update) { |post, _orig| puts "[#{stamp}] UPDATE #{post.id} #{post.title.inspect}" } +subscription.on(:delete) { |post| puts "[#{stamp}] DELETE #{post.id}" } +subscription.on(:enter) { |post, _orig| puts "[#{stamp}] ENTER #{post.id} (now matches query)" } +subscription.on(:leave) { |post, _orig| puts "[#{stamp}] LEAVE #{post.id} (no longer matches)" } +subscription.on(:error) { |err| warn "[#{stamp}] ERROR #{err}" } + +# --------------------------------------------------------------------------- +# 4. Block and print until Ctrl-C +# --------------------------------------------------------------------------- +running = true +trap("INT") do + running = false # keep the handler tiny — just flip the flag +end + +puts "Press Ctrl-C to stop.\n\n" +sleep 0.2 while running # events arrive on the websocket thread + +puts "\nStopping…" +subscription.unsubscribe +Parse::LiveQuery.reset! # closes the websocket connection +puts "Done." diff --git a/examples/rag_chatbot.rb b/examples/rag_chatbot.rb new file mode 100644 index 0000000..a593871 --- /dev/null +++ b/examples/rag_chatbot.rb @@ -0,0 +1,221 @@ +#!/usr/bin/env ruby +# frozen_string_literal: true + +# RAG Chatbot Example for parse-stack-next +# +# A retrieval-augmented-generation (RAG) chatbot in a single file: +# +# 1. 
Store documents in Parse; the SDK manages their embeddings on save. +# 2. Retrieve the most relevant passages for a question with the +# `semantic_search` agent tool. +# 3. Hand those passages to an LLM to write the answer. +# +# The SDK owns the **R** (retrieval) and the embedding lifecycle. The **G** +# (generation) is a thin add-in over the OpenAI or Anthropic HTTP API, shown +# here with zero extra gems (`net/http` only). +# +# Retrieval (`Parse::Retrieval` / `semantic_search`) shipped in v5.2; managed +# `embed` and the embeddings provider registry shipped in v5.1. Vector search +# runs against MongoDB Atlas (`$vectorSearch`), so point the SDK at an +# Atlas-backed Parse Server / Mongo URI. +# +# Run it: +# export OPENAI_API_KEY=sk-... # embeddings + (optionally) generation +# export ANTHROPIC_API_KEY=sk-ant-... # if using the Anthropic backend +# export PARSE_SERVER_URL=http://localhost:1337/parse +# export PARSE_APP_ID=... PARSE_REST_KEY=... PARSE_MASTER_KEY=... +# ruby examples/rag_chatbot.rb + +require "parse-stack-next" +require "net/http" +require "json" + +# --------------------------------------------------------------------------- +# 1. Embedding provider + Parse connection +# --------------------------------------------------------------------------- + +# text-embedding-3-small is 1536-dim. Register it under :openai so the model's +# :vector property can resolve it by name at save / query time. +Parse::Embeddings.register( + :openai, + Parse::Embeddings::OpenAI.new(api_key: ENV.fetch("OPENAI_API_KEY")), +) + +Parse.setup( + server_url: ENV.fetch("PARSE_SERVER_URL", "http://localhost:1337/parse"), + app_id: ENV.fetch("PARSE_APP_ID"), + api_key: ENV.fetch("PARSE_REST_KEY"), + master_key: ENV.fetch("PARSE_MASTER_KEY"), +) + +# --------------------------------------------------------------------------- +# 2. 
The model +# --------------------------------------------------------------------------- +# +# `embed` declares a MANAGED embedding: list the source fields, name the +# :vector property they feed, and the SDK recomputes the vector on `save` +# whenever those fields change (digest-tracked, so a no-op save makes zero +# provider calls). You never assign `embedding` yourself — it is +# write-protected. Set title/body and save. +# +# `agent_searchable` opts the class into the `semantic_search` agent tool and +# declares which fields an agent may filter on. +class KnowledgeArticle < Parse::Object + property :title, :string + property :body, :string + property :category, :string + + property :embedding, :vector, dimensions: 1536, provider: :openai + + # title + body feed :embedding, recomputed on save. + embed :title, :body, into: :embedding + + # Opt into semantic_search; allow filtering on :category. + agent_searchable field: :embedding, filter_fields: %i[category] + + # Declare the Atlas $vectorSearch index that retrieval needs. + mongo_search_index "knowledge_embedding", + { fields: [{ type: "vector", path: "embedding", + numDimensions: 1536, similarity: "cosine" }] }, + type: "vectorSearch" +end + +# --------------------------------------------------------------------------- +# 3. The LLM generation add-in (NOT part of the SDK) +# --------------------------------------------------------------------------- +# +# ~15 lines of HTTP per backend. Both take the retrieved chunks as context and +# return an answer grounded in them. +module ChatAnswerer + PROMPT = <<~SYS + You are a support assistant. Answer ONLY from the context below. + If the context does not contain the answer, say you don't know. 
+ SYS + + module_function + + def context(chunks) + chunks.map { |c| "- #{c[:content]}" }.join("\n") + end + + # --- OpenAI backend --- + def openai(question, chunks, model: "gpt-4o-mini") + post("https://api.openai.com/v1/chat/completions", + { "Authorization" => "Bearer #{ENV.fetch('OPENAI_API_KEY')}" }, + { model: model, + messages: [ + { role: "system", content: PROMPT }, + { role: "user", + content: "Context:\n#{context(chunks)}\n\nQuestion: #{question}" }, + ] }) + .dig("choices", 0, "message", "content") + end + + # --- Anthropic backend --- + def anthropic(question, chunks, model: "claude-opus-4-8") + post("https://api.anthropic.com/v1/messages", + { "x-api-key" => ENV.fetch("ANTHROPIC_API_KEY"), + "anthropic-version" => "2023-06-01" }, + { model: model, max_tokens: 1024, system: PROMPT, + messages: [ + { role: "user", + content: "Context:\n#{context(chunks)}\n\nQuestion: #{question}" }, + ] }) + .dig("content", 0, "text") + end + + def post(url, headers, body) + uri = URI(url) + req = Net::HTTP::Post.new(uri, { "Content-Type" => "application/json" }.merge(headers)) + req.body = JSON.generate(body) + res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) } + JSON.parse(res.body) + end +end + +# --------------------------------------------------------------------------- +# 4. Retrieval helper +# --------------------------------------------------------------------------- +# +# An unscoped Parse::Agent (no session_token: / acl_user: / acl_role:) runs in +# master posture using the master key already on the default client from +# Parse.setup — there is NO `master_key:` constructor argument. +# +# To scope retrieval to a signed-in user (so the bot only ever sees documents +# that user may read), construct it as: +# Parse::Agent.new(session_token: user.session_token) +# +# `execute(:semantic_search, ...)` returns +# { success:, data: { chunks:, documents:, count: } }. 
+def retrieve(agent, question, k: 4) + result = agent.execute(:semantic_search, + class_name: "KnowledgeArticle", + query: question, + k: k) + raise result[:error].to_s unless result[:success] + + # Each chunk: { id:, score:, content:, metadata: { object_id:, ... } }. + # The parent record lives once in data[:documents], keyed by objectId. + result[:data][:chunks] +end + +# --------------------------------------------------------------------------- +# 5. Seed a corpus + chat loop (runs when executed directly) +# --------------------------------------------------------------------------- + +CORPUS = [ + { title: "Resetting your password", + body: "Open Settings, choose Security, then Reset Password. A link is emailed to you.", + category: "account" }, + { title: "Exporting your data", + body: "Use Settings > Export to download a ZIP of all your documents as JSON.", + category: "data" }, + { title: "Billing cycles", + body: "Plans renew monthly on the date you subscribed. Cancel anytime before renewal.", + category: "billing" }, +].freeze + +# Bulk back-fill / precompute outside the managed-save path: call the provider +# directly. `embed_text_batched` splits into the provider's recommended batch +# size (OpenAI: 100) and returns one vector per string, in order. (Not used by +# the demo below — `save` handles embedding — but shown for completeness.) +def precompute_vectors(texts) + provider = Parse::Embeddings.provider(:openai) + provider.embed_text_batched(texts, input_type: :search_document) +end + +def seed_corpus! + # $vectorSearch needs an Atlas vector index on `embedding`, or retrieval + # raises IndexNotResolved. The model declares it; apply it once. This uses + # the mongo-direct writer (requires Parse::MongoDB configured against Atlas) + # — or create the same index in the Atlas UI. 
+ KnowledgeArticle.apply_search_indexes!(wait: true) + + # Managed embedding means ingestion is just `save`: each save that changes a + # source field makes one embedding call; unchanged re-saves make none. + CORPUS.each { |attrs| KnowledgeArticle.new(attrs).save } +end + +def chat_loop(backend: :anthropic) + # Master posture (reads everything). Emits a one-time master-key warning to + # stderr; silence it with Parse::Agent.suppress_master_key_warning = true. + agent = Parse::Agent.new + + puts "Ask a question (Ctrl-D to quit):" + while (question = $stdin.gets&.strip) + next if question.empty? + + chunks = retrieve(agent, question) + answer = ChatAnswerer.public_send(backend, question, chunks) + + puts "\n#{answer}\n" + sources = chunks.map { |c| c.dig(:metadata, :object_id) }.uniq.join(", ") + puts " (sources: #{sources})\n\n" + end +end + +if __FILE__ == $PROGRAM_NAME + seed_corpus! + # Pick :openai or :anthropic for the generation step. + chat_loop(backend: :anthropic) +end diff --git a/examples/webhook_server.rb b/examples/webhook_server.rb new file mode 100644 index 0000000..7ee11cd --- /dev/null +++ b/examples/webhook_server.rb @@ -0,0 +1,111 @@ +#!/usr/bin/env ruby +# frozen_string_literal: true + +# Cloud Code Webhooks for parse-stack-next +# +# Webhooks are how a Ruby backend runs SERVER-SIDE trigger logic. Without them, +# a Parse::Object's ActiveModel callbacks (before_save, after_create, …) run +# ONLY in the Ruby process that initiated the save. A write that comes from a +# JS/Swift/REST client — or the Parse Dashboard — never touches your Ruby code, +# so that logic is silently skipped server-side. +# +# Registering a webhook flips that: Parse Server calls back into this Ruby app +# on the matching trigger, and your ActiveModel callbacks + webhook blocks run +# for EVERY client, not just Ruby ones. +# +# This file is a Rack app. 
Mount it (config.ru): +# +# require_relative "examples/webhook_server" +# run Parse::Webhooks +# +# and point Parse Server's `webhookKey` at the same value as Parse::Webhooks.key. +# +# See docs/webhooks_guide.md for the full picture (trigger types, the +# ActiveModel↔Parse hook relationship, latency, and replay protection). + +require "parse-stack-next" + +# --------------------------------------------------------------------------- +# 1. Configure the (master-key) client + the webhook key +# --------------------------------------------------------------------------- +Parse.setup( + server_url: ENV.fetch("PARSE_SERVER_URL", "http://localhost:1337/parse"), + app_id: ENV.fetch("PARSE_APP_ID"), + api_key: ENV.fetch("PARSE_REST_KEY"), + master_key: ENV.fetch("PARSE_MASTER_KEY"), +) + +# Shared secret Parse Server sends as `X-Parse-Webhook-Key`. Set the same value +# in Parse Server's `webhookKey` option. Requests without it are rejected. +Parse::Webhooks.key = ENV.fetch("PARSE_WEBHOOK_KEY") + +# Optional: server-initiated replay/freshness protection (see the guide). The +# body+request-id dedup is always on; an HMAC signing secret adds freshness +# verification of X-Parse-Webhook-Timestamp / X-Parse-Webhook-Signature. +Parse::Webhooks::ReplayProtection.signing_secret = ENV["PARSE_WEBHOOK_SIGNING_SECRET"] + +# --------------------------------------------------------------------------- +# 2. A model with both ActiveModel callbacks AND webhook blocks +# --------------------------------------------------------------------------- +class Post < Parse::Object + property :title, :string, required: true + property :slug, :string + property :published, :boolean, default: false + + # ActiveModel callbacks. For a Ruby-initiated save these run locally; for a + # NON-Ruby client they run here only if a beforeSave/afterSave webhook is + # registered for Post. Registering beforeSave enables BOTH before_save and + # before_create; afterSave enables both after_save and after_create. 
+ before_save :normalize_slug + before_create { self.published = false } # created drafts start unpublished + after_create :enqueue_welcome # see note on afterSave below + + def normalize_slug + self.slug = title.to_s.downcase.strip.gsub(/[^a-z0-9]+/, "-") if title_changed? + end + + def enqueue_welcome + # AFTER-SAVE BEST PRACTICE: enqueue, don't execute. Parse Server blocks the + # client's write until this webhook returns — even though afterSave's return + # is a no-op — so long work here adds latency to every save. And do NOT save + # another object here if you can avoid it: each cascading save fires more + # webhooks (a latency cascade). Hand off to a background job instead. + # BackgroundJobs.enqueue(:index_post, id) + end +end + +Post.auto_upgrade! + +# --------------------------------------------------------------------------- +# 3. Webhook blocks (optional) — server-side logic without a Ruby model callback +# --------------------------------------------------------------------------- +# A block runs in the scope of a Parse::Webhooks::Payload. A beforeSave block +# returns `parse_object` (the SDK turns it into the changes Parse Server wants); +# returning `false` halts the save. +class Post + webhook :before_save do + # `parse_object` is the incoming object; mutate it, then return it. + parse_object + end + + # afterSave/afterDelete blocks may register more than one handler. + webhook :after_save do + # keep this short or enqueue — see enqueue_welcome above. + true + end +end + +# NOTE: there is no `webhook :before_create` / `:after_create`. Parse Server has +# no such trigger — register beforeSave/afterSave and your create callbacks fire +# within them for new objects. Asking for a create webhook raises with guidance. + +# --------------------------------------------------------------------------- +# 4. 
Register the webhooks with Parse Server (run once at deploy, needs master key) +# --------------------------------------------------------------------------- +# `endpoint` is the public HTTPS URL where THIS Rack app is reachable. +if ENV["PARSE_WEBHOOK_ENDPOINT"] + endpoint = ENV.fetch("PARSE_WEBHOOK_ENDPOINT") # e.g. https://hooks.example.com/webhooks + Parse::Webhooks.register_functions!(endpoint) + Parse::Webhooks.register_triggers!(endpoint) + puts "Registered webhooks at #{endpoint}" +end diff --git a/lib/parse/agent/mcp_rack_app.rb b/lib/parse/agent/mcp_rack_app.rb index 2bfed7b..6d3e384 100644 --- a/lib/parse/agent/mcp_rack_app.rb +++ b/lib/parse/agent/mcp_rack_app.rb @@ -20,6 +20,34 @@ class Agent # (method, content-type, body-size, and JSON-parse checks) and then # delegates to Parse::Agent::MCPDispatcher.call for all protocol handling. # + # == Transport (`transport: :streamable_http`) + # + # The MCP 2025-06-18 "Streamable HTTP" transport is the recommended, + # primary transport. Rather than toggling its constituent pieces + # individually (`streaming:` for POST→SSE, `notifications:` for the + # server→client `GET /` stream), pass `transport: :streamable_http` to + # enable the whole transport with one switch: + # + # app = Parse::Agent::MCPRackApp.new(transport: :streamable_http) { |env| ... } + # + # That is exactly equivalent to `streaming: true, notifications: true`. + # `resource_subscriptions: true` may still be added alongside it to + # upgrade the server→client bus from the plain notification posture to + # the LiveQuery-backed resource-subscription posture. + # + # `transport:` is a closed enum — `:streamable_http`, `:legacy`, or `nil`. + # `:legacy` and `nil` both select the historical default (no streaming, no + # server→client stream); the standalone SSE/JSON behavior remains a + # supported fallback. 
Passing `transport: :streamable_http` together with + # an explicit `streaming:` or `notifications:` raises `ArgumentError`, + # since the switch already owns those toggles. + # + # The default is unchanged (`transport: nil`): an existing + # `MCPRackApp.new { ... }` keeps its non-streaming JSON behavior. A + # streaming-capable Rack server (Puma, Falcon, Unicorn) is required for + # `:streamable_http` to have any effect — the WEBrick-backed `MCPServer` + # buffers responses and cannot deliver it. + # # == SSE Streaming (MCP progress notifications) # # When constructed with `streaming: true`, requests that include @@ -83,6 +111,16 @@ class MCPRackApp # Default heartbeat interval in seconds when streaming is enabled. DEFAULT_HEARTBEAT_INTERVAL = 2 + # Default bound on concurrently-active streaming dispatchers — and, + # separately, on concurrently-open listening streams — when the + # `max_concurrent_dispatchers:` constructor argument is omitted. Finite by + # default so that enabling a streaming surface (request-scoped SSE or the + # long-lived `GET /` stream) does not silently expose an unbounded + # orphan-thread DoS surface. The cap is applied SEPARATELY to each + # surface, so the effective ceiling across both is up to 2x this value. + # Pass an explicit `nil` to knowingly opt into the unbounded surface. + DEFAULT_MAX_CONCURRENT_DISPATCHERS = 100 + # Seconds to wait for a human's elicitation reply before failing # closed (refusing the destructive op). Generous by default — a # human-in-the-loop approver needs time the tool timeout doesn't @@ -110,6 +148,27 @@ class MCPRackApp @listening_stream_count = 0 @listening_stream_mutex = Mutex.new + # Process-wide CUMULATIVE counter of GENUINE orphaned dispatchers — a + # client disconnected (stream closed before its response was delivered) + # WHILE the dispatcher thread was still running (see + # {.abandoned_dispatcher_count}). 
It deliberately excludes the + # already-finished-but-undelivered case (dispatcher had pushed its + # response but the client dropped before {#each} popped it), which is a + # delivery miss, not an orphan holding a connection-pool slot. A monotonic + # counter (not a live gauge like the two above): operators watch its rate + # of increase to detect a disconnect storm against slow tools, which is the + # orphan-thread pressure signal. (The companion + # `parse.agent.mcp_dispatcher_abandoned` notification fires for EVERY + # premature close and carries a `dispatcher_alive` flag, so subscribers can + # also observe the delivery-miss case and filter on `dispatcher_alive: + # true` for orphans.) The orphaned dispatcher is cooperatively cancelled + # (its token is tripped) and bounded in duration by the per-tool Timeout + # and the clean MongoDB/REST I/O timeouts; it is intentionally NOT + # force-killed (see {SSEBody#close} for why a hard kill would risk + # connection-pool corruption). + @abandoned_dispatcher_count = 0 + @abandoned_dispatcher_mutex = Mutex.new + # Drop env keys that would have come from underscore-form HTTP header # names. The Rack-spec-compliant interpretation of HTTP headers maps # `X-MCP-API-Key` and `X_MCP_API_KEY` to the same env key @@ -171,13 +230,20 @@ def self.strip_underscore_smuggled_headers!(env) # @param heartbeat_interval [Numeric] seconds between progress heartbeat # events when streaming is active. Defaults to DEFAULT_HEARTBEAT_INTERVAL. # Ignored when `streaming: false`. - # @param max_concurrent_dispatchers [Integer, nil] when set, limits the - # number of concurrently active dispatcher threads across all SSE - # connections served by this app instance. When the limit is reached a - # new SSE request immediately receives a 503 JSON-RPC error envelope - # (`-32000` "server busy") rather than spawning another dispatcher. - # Defaults to `nil` (unlimited). 
Use `active_dispatcher_count` to - # monitor current concurrency from operator tooling. + # @param max_concurrent_dispatchers [Integer, nil] limits the number of + # concurrently active dispatcher threads across all SSE connections + # served by this app instance (and, separately, the number of open + # listening streams). When the limit is reached a new SSE request + # immediately receives a 503 JSON-RPC error envelope (`-32000` "server + # busy") rather than spawning another dispatcher. + # + # Defaults to a finite {DEFAULT_MAX_CONCURRENT_DISPATCHERS} (100) — so a + # streaming surface is bounded out of the box rather than unbounded. + # Pass an explicit positive `Integer` to set the cap, or `nil` to + # knowingly opt into the unbounded surface (which warns at + # construction). A non-positive or non-integer value raises + # `ArgumentError`. Use `active_dispatcher_count` to monitor current + # concurrency from operator tooling. # @param pre_auth_rate_limiter [#check!, nil] optional rate limiter # consulted at the top of every request, BEFORE the agent_factory is # invoked. Closes the factory-amplification DoS where each malformed @@ -234,17 +300,30 @@ def self.strip_underscore_smuggled_headers!(env) # adapter). Takes precedence over `resource_subscriptions:`. When nil # and `resource_subscriptions: true`, a default in-process manager is # constructed. + # @param transport [Symbol, nil] MCP transport selector. Pass + # `:streamable_http` to enable the full MCP 2025-06-18 Streamable HTTP + # transport in one switch — exactly equivalent to `streaming: true, + # notifications: true` (POST→SSE plus the server→client `GET /` + # stream). `resource_subscriptions: true` may still be combined to + # upgrade the bus to its LiveQuery-backed posture. `:legacy` (or the + # default `nil`) selects the historical non-streaming behavior; the + # standalone SSE/JSON path stays a supported fallback. Any other value + # raises `ArgumentError`. 
Passing `:streamable_http` together with an + # explicit `streaming:` or `notifications:` also raises, since the + # switch already owns those toggles. Requires a streaming-capable Rack + # server (Puma, Falcon, Unicorn); has no effect under WEBrick. # @raise [ArgumentError] if both or neither of agent_factory/block are given. def initialize(agent_factory: nil, max_body_size: DEFAULT_MAX_BODY_SIZE, - logger: nil, streaming: false, + logger: nil, streaming: nil, heartbeat_interval: DEFAULT_HEARTBEAT_INTERVAL, - max_concurrent_dispatchers: nil, + max_concurrent_dispatchers: DEFAULT_MAX_CONCURRENT_DISPATCHERS, pre_auth_rate_limiter: nil, allowed_origins: nil, require_custom_header: nil, resource_subscriptions: false, subscription_manager: nil, - notifications: false, + notifications: nil, + transport: nil, approval_timeout: DEFAULT_APPROVAL_TIMEOUT, principal_resolver: nil, health_path: nil, &block) @@ -258,11 +337,41 @@ def initialize(agent_factory: nil, max_body_size: DEFAULT_MAX_BODY_SIZE, raise ArgumentError, "pre_auth_rate_limiter must respond to #check!" end + # `transport:` is the consolidation switch over the granular + # `streaming:` / `notifications:` toggles. `streaming` and + # `notifications` default to nil (not false) precisely so we can tell + # "operator left it alone" from "operator explicitly set it" and raise + # on a conflicting combination instead of silently letting the switch + # win. Closed enum — unknown values fail closed. + unless transport.nil? || %i[legacy streamable_http].include?(transport) + raise ArgumentError, + "transport: must be :streamable_http, :legacy, or nil, got #{transport.inspect}" + end + if transport == :streamable_http + unless streaming.nil? && notifications.nil? 
+ raise ArgumentError, + "transport: :streamable_http already enables streaming and the server-initiated " \ + "notification stream; do not also pass streaming:/notifications: " \ + "(resource_subscriptions: may still be combined to upgrade the bus to LiveQuery)" + end + streaming = true + notifications = true + end + # Collapse the nil sentinel to the historical default for the + # remainder of the constructor (and @streaming below). + streaming = false if streaming.nil? + notifications = false if notifications.nil? + @agent_factory = agent_factory || block @max_body_size = max_body_size @logger = logger @streaming = streaming @heartbeat_interval = heartbeat_interval + # The dispatcher cap defaults to the finite DEFAULT_MAX_CONCURRENT_DISPATCHERS + # (set in the signature). An explicit positive Integer overrides it; an + # explicit nil knowingly opts into the unbounded surface; anything else + # is a config error and raises. + validate_max_concurrent_dispatchers!(max_concurrent_dispatchers) @max_concurrent_dispatchers = max_concurrent_dispatchers @pre_auth_rate_limiter = pre_auth_rate_limiter @allowed_origins = normalize_allowed_origins(allowed_origins) @@ -319,21 +428,24 @@ def initialize(agent_factory: nil, max_body_size: DEFAULT_MAX_BODY_SIZE, Parse::Agent::MCPSubscriptions::Manager.new(logger: @logger, supported: false) end - # Warn operators who enable a streaming surface without a concurrency - # cap. Both request-scoped SSE (streaming:) and the long-lived GET - # listening stream (resource_subscriptions:/notifications:, which set + # Warn operators who enable a streaming surface AND have explicitly + # opted into an unbounded dispatcher cap. 
Both request-scoped SSE + # (streaming:) and the long-lived GET listening stream + # (resource_subscriptions:/notifications:, which set # @subscription_manager) spawn per-connection threads; an unbounded # endpoint is a practical DoS surface — a slow or hostile client opening # connections faster than they close can exhaust the host thread pool and # downstream Parse connection pool. The cap bounds each surface # SEPARATELY, so the effective ceiling is up to 2x max_concurrent_dispatchers - # across both. Leaving the default `nil` (unlimited) preserves backward - # compatibility, but we tell the operator once at construction. + # across both. The default is now the finite DEFAULT_MAX_CONCURRENT_DISPATCHERS, + # so a nil here means the operator deliberately chose `nil` (unbounded) — + # we warn once at construction so the choice is visible. if (streaming || @subscription_manager) && @max_concurrent_dispatchers.nil? surface = streaming ? "streaming: true" : "resource_subscriptions/notifications" - line = "[Parse::Agent::MCPRackApp] #{surface} with max_concurrent_dispatchers: nil (unlimited). " \ - "Set a finite cap (e.g. 100, or 2x your Puma max_threads) to bound the orphan-thread DoS surface. " \ - "See docs/mcp_guide.md for sizing guidance." + line = "[Parse::Agent::MCPRackApp] #{surface} with an explicitly unbounded dispatcher cap " \ + "(max_concurrent_dispatchers: nil). This is an orphan-thread DoS surface. " \ + "Prefer the finite default (#{DEFAULT_MAX_CONCURRENT_DISPATCHERS}) or pass a value sized to " \ + "~2x your Puma max_threads. See docs/mcp_guide.md for sizing guidance." if @logger @logger.warn(line) else @@ -409,6 +521,28 @@ def self.adjust_listening_stream_count(delta) @listening_stream_mutex.synchronize { @listening_stream_count += delta } end + # Process-wide CUMULATIVE count of GENUINE orphaned dispatchers — a client + # disconnect that closed the stream while the dispatcher thread was still + # running. 
Excludes already-finished-but-undelivered closes (a delivery + # miss, not an orphan). Unlike {.active_dispatcher_count} / + # {.active_listening_stream_count} this is a monotonic total, not a live + # gauge — operators alert on its *rate* of increase, the orphan-thread + # pressure signal under a disconnect-against-slow-tools storm. EVERY + # premature close (orphan or delivery-miss) also emits a + # `parse.agent.mcp_dispatcher_abandoned` ActiveSupport::Notifications event + # carrying `dispatcher_alive:`, so subscribers wanting the broader + # delivery-miss signal can filter there. Reset is not supported (counters + # are process-lifetime); subtract a baseline if you need a windowed delta. + def self.abandoned_dispatcher_count + @abandoned_dispatcher_mutex.synchronize { @abandoned_dispatcher_count } + end + + # @api private — increment the cumulative abandoned-dispatcher counter. + # Called by {SSEBody#close} on the client-disconnect path. + def self.record_abandoned_dispatcher! + @abandoned_dispatcher_mutex.synchronize { @abandoned_dispatcher_count += 1 } + end + # Rack interface. # # @param env [Hash] Rack environment @@ -1104,6 +1238,10 @@ def initialize(progress_token, req_id, interval, logger, ->(t, i) { t.join(i) } @queue = Queue.new @worker = nil + # The dispatcher thread spawned inside @worker. Published under + # @close_mutex once started so {#close} can snapshot its liveness for + # the abandonment signal. Never force-killed (see #close). + @dispatcher_thread = nil # Flipped to true by #each when the DONE sentinel is consumed. # #close uses this to decide whether to trip the cancellation # token (false = client disconnect) or skip the trip (true = @@ -1158,38 +1296,56 @@ def each # sentinel was not consumed by {#each}), this is interpreted as # a client disconnect and: # - # 1. The cancellation token (if any) is tripped BEFORE the - # worker is killed, so tools that observe `agent.cancelled?` - # at a checkpoint can exit cooperatively. 
The kill becomes - # the fallback for tools stuck inside a blocking I/O call. + # 1. The cancellation token (if any) is tripped, so a tool that + # observes `agent.cancelled?` at a checkpoint exits + # cooperatively. The orphaned dispatcher is NOT force-killed + # (see below); its lifetime is bounded by the per-tool + # Timeout and the clean MongoDB/REST I/O deadlines. + # 2. The abandonment is recorded — a `parse.agent.mcp_dispatcher_abandoned` + # notification is emitted for every premature close, and the + # process-wide {MCPRackApp.abandoned_dispatcher_count} counter is + # bumped when the dispatcher was still running (a genuine orphan) — + # so operators can see disconnect-against-slow-tool pressure even + # though each orphan is individually bounded. # - # When called AFTER normal completion, the token is NOT tripped - # — the request finished on its own; cancellation would only - # confuse a tool that races to check the flag. + # When called AFTER normal completion, neither happens — the + # request finished on its own; cancellation would only confuse a + # tool that races to check the flag, and there is nothing to + # report. # # Either path: - # - Kills the worker thread if still alive. + # - Kills the WORKER thread (the heartbeat loop) if still alive. # - Invokes the on_close hook so MCPRackApp can deregister # the token from its per-app registry. Failures in the hook # are logged and swallowed — close must always succeed. # - # Cancellation note: blocking I/O calls (MongoDB query, Parse - # REST roundtrip) do not observe the token until they return. - # The Ruby-level `Timeout.timeout` already wrapping each tool is - # the hard upper bound on wasted work; cancellation reduces it, - # not eliminates it. 
+ # Why the dispatcher is not force-killed: a `Thread#kill` (or a + # foreign `Thread#raise`) skips the DB driver's rescue-based + # connection-invalidation, so `connection_pool`'s `ensure` could + # return a half-used connection to the pool and corrupt a later + # request that reuses it. Blocking I/O calls do not observe the + # cancellation token, but they ARE bounded by the per-tool + # `Timeout.timeout` (Tools::TOOL_TIMEOUTS, 5–60s) and the clean + # MongoDB `socket_timeout` (10s) / REST `timeout` (30s) deadlines, + # which reclaim the connection-pool slot through the driver's + # clean error path. Cooperative cancellation reduces wasted work; + # the bounded timeouts cap it; a forcible kill is intentionally + # avoided. def close # Idempotent — concurrent invocations from the I/O fiber and # a disconnect-handler thread short-circuit after the first # caller wins the mutex. completed_normally = nil + dispatcher_alive = false @close_mutex.synchronize do return if @closed @closed = true completed_normally = @completed_normally + dispatcher_alive = @dispatcher_thread&.alive? || false end unless completed_normally @cancellation_token&.cancel!(reason: :client_disconnect) + record_abandonment(dispatcher_alive) end @worker&.kill if @worker&.alive? @worker = nil @@ -1227,6 +1383,36 @@ def close private + # Record a client-disconnect abandonment. `dispatcher_alive` reports + # whether the dispatcher was still running at close time (true = a + # genuine mid-flight orphan holding its slot; false = it had already + # finished but the DONE sentinel was never consumed — a delivery miss). + # + # The cumulative counter tracks GENUINE orphans only (gated on + # `dispatcher_alive`), matching {MCPRackApp.abandoned_dispatcher_count}'s + # contract. The `parse.agent.mcp_dispatcher_abandoned` notification fires + # for EVERY premature close and carries the flag, so subscribers can see + # delivery misses too and filter on `dispatcher_alive: true` for orphans. 
+ # Best-effort and fully guarded — observability must never break stream + # teardown. + # + # Subscriber discipline matches the rest of the SDK's instrumentation: + # subscribers run synchronously on the thread that calls close (a Rack + # I/O fiber or a disconnect-handler thread); keep them cheap. + def record_abandonment(dispatcher_alive) + MCPRackApp.record_abandoned_dispatcher! if dispatcher_alive + return unless defined?(ActiveSupport::Notifications) + ActiveSupport::Notifications.instrument( + "parse.agent.mcp_dispatcher_abandoned", + reason: :client_disconnect, + dispatcher_alive: dispatcher_alive, + request_id: @req_id, + ) + rescue StandardError => e + line = "[Parse::Agent::MCPRackApp::SSEBody] abandonment-record error: #{e.class}: #{e.message}" + @logger ? @logger.warn(line) : warn(line) + end + def start_worker # Subscribe to listChanged events BEFORE spawning the worker # so any registry mutation that races with the start of the @@ -1250,40 +1436,62 @@ def start_worker # every @interval seconds until the call completes OR until a # tool starts reporting its own progress (@tool_progress_reported). # - # Cancellation note: if the consumer disconnects (close is called), - # the outer @worker is killed but dispatcher_thread is orphaned and - # runs to completion. A proper cancellation mechanism (e.g. passing - # a cancel token into MCPDispatcher) is a separate deferred item - # (see CHANGELOG / project plans). + # Cancellation contract on client disconnect (close is called): + # the outer @worker is killed and the dispatcher thread is + # cooperatively cancelled — {#close} trips the cancellation token + # so a tool checking `agent.cancelled?` at a checkpoint exits + # promptly. The dispatcher is NOT force-killed: a `Thread#kill` + # (or a foreign `Thread#raise`) would skip the DB driver's + # rescue-based connection-invalidation, so connection_pool's + # ensure could check a half-used connection back in and corrupt a + # subsequent request. 
Instead the orphan's lifetime is bounded by + # (a) for BUILT-IN tools, the per-tool `Timeout.timeout` budget + # (Tools::TOOL_TIMEOUTS, 5–60s, applied inside each built-in tool + # via Tools.with_timeout) and (b) the clean MongoDB `socket_timeout` + # (10s) / REST `timeout` (30s) I/O deadlines, which DO route through + # the driver's clean error path. For CUSTOM registered tools the + # handler IS wrapped by Tools.invoke in its declared `timeout:` + # (default 30s; register rejects a non-positive value), so a + # blocking or looping custom handler is bounded just like a + # built-in. (A handler that swallows ToolTimeoutError or blocks in + # an uninterruptible C call can still evade Timeout, but the default + # path is bounded.) The + # max_concurrent_dispatchers: cap still bounds how MANY orphans can + # exist; the abandonment counter + `parse.agent.mcp_dispatcher_abandoned` + # notification surface how OFTEN it happens (see {#close} and + # {.abandoned_dispatcher_count}). # - # Each dispatcher_thread is tagged with :parse_mcp_dispatcher so - # operators can observe concurrency via - # Parse::Agent::MCPRackApp.active_dispatcher_count. Orphaned - # dispatchers (from client disconnects) are counted until they - # complete naturally. Forcible kill is intentionally not attempted - # here — killing threads inside MCPDispatcher.call risks leaving - # agent state corrupt. The max_concurrent_dispatchers: constructor - # option provides a concurrency cap that fires 503 before a new - # dispatcher is admitted. - dispatcher_thread = Thread.new do - Thread.current[:parse_mcp_dispatcher] = true - begin - # The block receives the SSEBody's progress callback so - # tools running inside MCPDispatcher.call can emit - # notifications/progress events without coupling to - # SSEBody internals. 
- result = @dispatcher_blk.call(@progress_callback) - rescue StandardError => e - # Log the unexpected failure (MCPDispatcher.call normally catches - # StandardError internally; anything reaching here is unusual). - line = "[Parse::Agent::MCPRackApp::SSEBody] Dispatcher error: #{e.class}: #{e.message}" - if @logger - @logger.warn(line) - else - warn line + # Each dispatcher thread is tagged with :parse_mcp_dispatcher so + # operators can observe live concurrency via + # {.active_dispatcher_count}. The spawn and the publish to + # @dispatcher_thread happen together under @close_mutex so a + # concurrent {#close} (e.g. an out-of-band disconnect-handler + # thread) observes the thread as either entirely absent or fully + # published — never a created-but-unpublished orphan it would + # miscount as a delivery miss. + dispatcher_thread = nil + @close_mutex.synchronize do + dispatcher_thread = Thread.new do + Thread.current[:parse_mcp_dispatcher] = true + begin + # The block receives the SSEBody's progress callback so + # tools running inside MCPDispatcher.call can emit + # notifications/progress events without coupling to + # SSEBody internals. + result = @dispatcher_blk.call(@progress_callback) + rescue StandardError => e + # Log the unexpected failure (MCPDispatcher.call normally catches + # StandardError internally; anything reaching here is unusual). + line = "[Parse::Agent::MCPRackApp::SSEBody] Dispatcher error: #{e.class}: #{e.message}" + if @logger + @logger.warn(line) + else + warn line + end + result = { status: 200, body: build_error_envelope(e) } end - result = { status: 200, body: build_error_envelope(e) } end + @dispatcher_thread = dispatcher_thread end while dispatcher_thread.alive? @@ -1850,6 +2058,21 @@ def unauthorized_body }) end + # Validate the `max_concurrent_dispatchers:` argument. 
A positive Integer + # caps the streaming surface; an explicit `nil` is a knowing opt-in to the + # unbounded surface (warned about at construction); anything else is a + # code-level config error and raises loudly. + # + # @param value [Object] the constructor argument. + # @raise [ArgumentError] when value is neither nil nor a positive Integer. + def validate_max_concurrent_dispatchers!(value) + return if value.nil? + unless value.is_a?(Integer) && value >= 1 + raise ArgumentError, + "max_concurrent_dispatchers must be a positive Integer or nil (unbounded), got #{value.inspect}" + end + end + # Normalize the allowed-origins kwarg into a frozen Array of # downcased entries. Returns nil when the caller passed nil or an # empty array (no check configured). Each entry retains its diff --git a/lib/parse/agent/tools.rb b/lib/parse/agent/tools.rb index d0d8514..ed66f06 100644 --- a/lib/parse/agent/tools.rb +++ b/lib/parse/agent/tools.rb @@ -1066,7 +1066,10 @@ class << self # @param description [String] human-readable description (required) # @param parameters [Hash] JSON Schema object definition (required) # @param permission [Symbol] :readonly, :write, or :admin (required) - # @param timeout [Integer] seconds before ToolTimeoutError (default: 30) + # @param timeout [Integer] positive seconds before ToolTimeoutError + # (default: 30). Enforced by Tools.invoke, which wraps the handler in + # Timeout.timeout. Must be >= 1 (a non-positive value raises + # ArgumentError, since Timeout.timeout(0) would disable the bound). # @param handler [Proc] lambda(agent, **args) -> Hash (required) # @param client_safe [Boolean] when +true+, the tool is dispatchable # from a client-mode agent (one whose client has no master_key). @@ -1089,9 +1092,14 @@ class << self # - Bypass the agent_fields allowlist enforced by built-in tools when # they return raw Parse::Object instances. Project fields manually # in the handler. 
- # - Receive a Timeout.timeout budget derived from TOOL_TIMEOUTS (or the - # custom :timeout kwarg), but note that Parse Server's REST surface - # does not accept maxTimeMS — the only timeout is the Ruby-level one. + # - Be wrapped by Tools.invoke in a Timeout.timeout budget equal to the + # handler's declared :timeout kwarg (default 30s) — so a blocking or + # looping handler is bounded and raises ToolTimeoutError. (Built-in + # tools derive their budget from TOOL_TIMEOUTS; a registered handler + # uses its own :timeout.) Note that Parse Server's REST surface does + # not accept maxTimeMS — the only timeout is this Ruby-level one, so + # a handler that ignores `agent.cancelled?` is interrupted only when + # the Timeout fires. # # Treat the handler list as part of your application's trust boundary: # register at boot from code you control; never accept registrations @@ -1120,6 +1128,17 @@ def register(name:, description:, parameters:, handler:, end category_str = category.to_s raise ArgumentError, "category must be a non-empty string" if category_str.empty? + # Guarantee an enforceable wall-clock bound: Tools.invoke wraps the + # handler in Timeout.timeout(timeout), and Timeout.timeout(0) means + # "no timeout" — so a 0 (or fractional value that floors to 0) would + # silently leave the handler unbounded. Require a positive integer of + # seconds. Operators who genuinely need a long-running tool pass a + # large value, not 0. + if timeout.to_i < 1 + raise ArgumentError, + "timeout must be a positive integer number of seconds (got #{timeout.inspect}); " \ + "Timeout.timeout(0) would disable the bound" + end sym = name.to_sym # NEW-TOOLS-6: refuse names that collide with a builtin tool. The @@ -1207,15 +1226,28 @@ def reset_subscribers! # Dispatch a tool call. Registered tools take precedence over builtins # only when both share a name; otherwise each path is exclusive. 
# + # A registered handler is wrapped in `with_timeout(sym)` so its declared + # `timeout:` (default DEFAULT_TIMEOUT, 30s) is actually enforced — + # without this, a custom handler that blocks or loops forever has no + # wall-clock bound and (over the MCP streaming transport) can hold a + # dispatcher slot indefinitely after a client disconnect. Built-in tools + # are NOT wrapped here: each built-in already applies `with_timeout` + # inside its own body, so wrapping the `else` branch would double-wrap. + # `register` rejects a non-positive `timeout:`, so the budget here is + # always >= 1s (Timeout.timeout(0) would otherwise mean "no timeout"). + # # @param agent [Parse::Agent] the agent instance # @param name [Symbol, String] tool name # @param kwargs [Hash] keyword arguments forwarded to handler or builtin + # @raise [Parse::Agent::ToolTimeoutError] if a registered handler exceeds + # its declared timeout (handled by Agent#execute and the approval + # preview, which both rescue it). def invoke(agent, name, **kwargs) sym = name.to_sym entry = REGISTRY_MUTEX.synchronize { @registry[sym] } if entry - entry[:handler].call(agent, **kwargs) + with_timeout(sym) { entry[:handler].call(agent, **kwargs) } else Tools.send(sym, agent, **kwargs) end @@ -4901,6 +4933,14 @@ def explain_query(agent, class_name:, where: nil, **_kwargs) response = agent.client.find_objects(class_name, query, **agent.request_opts) unless response.success? + # Parse Server 9.0+ defaults `allowPublicExplain` to false, so a + # non-master agent's explain is rejected. Surface that as actionable + # guidance instead of a bare permission error. + if response.respond_to?(:permission_denied?) && response.permission_denied? + raise "Explain failed: #{response.error} — Parse Server 9.0+ defaults " \ + "allowPublicExplain to false; query explain requires a master-key agent " \ + "or `allowPublicExplain: true` in the server's databaseOptions." 
+ end raise "Explain failed: #{response.error}" end diff --git a/lib/parse/api/aggregate.rb b/lib/parse/api/aggregate.rb index bb4986a..7c8e94a 100644 --- a/lib/parse/api/aggregate.rb +++ b/lib/parse/api/aggregate.rb @@ -66,10 +66,16 @@ def aggregate_objects(className, query = {}, headers: {}, **opts) # @param pipeline [Array] the MongoDB aggregation pipeline stages. # @param opts [Hash] additional options to pass to the {Parse::Client} request. # @param headers [Hash] additional HTTP headers to send with the request. + # @param raw_values [Boolean] when true, adds +rawValues: true+ to the request + # so Parse Server returns un-decoded field values (PS 9.9.0+, #10438). + # @param raw_field_names [Boolean] when true, adds +rawFieldNames: true+ to the + # request so Parse Server returns original (un-decoded) field names (PS 9.9.0+). # @return [Parse::Response] # @see Parse::Query - def aggregate_pipeline(className, pipeline = [], headers: {}, **opts) + def aggregate_pipeline(className, pipeline = [], headers: {}, raw_values: false, raw_field_names: false, **opts) query = { pipeline: pipeline.to_json } + query[:rawValues] = true if raw_values + query[:rawFieldNames] = true if raw_field_names response = request :get, aggregate_uri_path(className), query: query, headers: headers, opts: opts response.parse_class = className if response.present? response diff --git a/lib/parse/api/cloud_functions.rb b/lib/parse/api/cloud_functions.rb index 0ac4fdd..6aa9348 100644 --- a/lib/parse/api/cloud_functions.rb +++ b/lib/parse/api/cloud_functions.rb @@ -12,10 +12,16 @@ module CloudFunctions # @param opts [Hash] additional options for the request. # @option opts [String] :session_token The session token for authenticated requests. # @option opts [String] :master_key Whether to use the master key for this request. + # @param context [Hash, nil] an optional caller context forwarded as the + # +X-Parse-Cloud-Context+ header. 
Parse Server maps it to + # +req.info.context+ in the cloud function handler. + # Omit or pass +nil+ to leave behavior unchanged. # @return [Parse::Response] - def call_function(name, body = {}, opts: {}) + def call_function(name, body = {}, opts: {}, context: nil) safe = Parse::API::PathSegment.identifier!(name, kind: "function name") - request :post, "functions/#{safe}", body: body, opts: opts + headers = {} + headers[Parse::Protocol::CLOUD_CONTEXT] = context.to_json unless context.nil? + request :post, "functions/#{safe}", body: body, headers: headers, opts: opts end # Trigger a job. @@ -35,11 +41,13 @@ def trigger_job(name, body = {}, opts: {}) # @param name [String] the name of the cloud function. # @param body [Hash] the parameters to forward to the function. # @param session_token [String] the session token for authenticated requests. + # @param context [Hash, nil] an optional caller context forwarded as the + # +X-Parse-Cloud-Context+ header. # @return [Parse::Response] - def call_function_with_session(name, body = {}, session_token) + def call_function_with_session(name, body = {}, session_token, context: nil) opts = {} opts[:session_token] = session_token if session_token.present? - call_function(name, body, opts: opts) + call_function(name, body, opts: opts, context: context) end # Trigger a job with a specific session token. diff --git a/lib/parse/api/hooks.rb b/lib/parse/api/hooks.rb index 8a41f2e..7aa3947 100644 --- a/lib/parse/api/hooks.rb +++ b/lib/parse/api/hooks.rb @@ -7,15 +7,52 @@ module API module Hooks # @!visibility private HOOKS_PREFIX = "hooks/" - # The allowed set of Parse triggers. - TRIGGER_NAMES = [:afterCreate, :afterDelete, :afterFind, :afterSave, :beforeDelete, :beforeFind, :beforeSave].freeze + # The allowed set of Parse webhook triggers. Mirrors Parse Server's + # `triggers.Types` so registration of the auth / LiveQuery / password- + # reset hooks is no longer pre-rejected by the SDK. 
+ # + # NOTE: this allowlist gates *registration* only. The webhook router in + # {Parse::Webhooks} currently shapes payloads for the object triggers + # (before/after save/delete/find); the login / connect / subscribe / + # password-reset payloads carry a different shape (no `object`) and + # their first-class routing is a follow-up. `beforeConnect` is a + # connection-global trigger whose Parse-canonical className is the + # `@Connect` sentinel; file triggers use `@File`. Both are accepted by + # the trigger-className validator ({Parse::API::PathSegment.trigger_class_name!}). + TRIGGER_NAMES = [ + :afterDelete, :afterFind, :afterSave, + :beforeDelete, :beforeFind, :beforeSave, + :beforeLogin, :afterLogin, :afterLogout, :beforePasswordResetRequest, + :beforeConnect, :beforeSubscribe, :afterEvent, + ].freeze # @!visibility private - TRIGGER_NAMES_LOCAL = [:after_create, :after_delete, :after_find, :after_save, :before_delete, :before_find, :before_save].freeze + TRIGGER_NAMES_LOCAL = [ + :after_delete, :after_find, :after_save, + :before_delete, :before_find, :before_save, + :before_login, :after_login, :after_logout, :before_password_reset_request, + :before_connect, :before_subscribe, :after_event, + ].freeze + + # `beforeCreate` / `afterCreate` are NOT Parse Server trigger types — + # Parse Server rejects them ("invalid hook declaration"). They exist only + # as Parse-Stack ActiveModel callbacks (`before_create` / `after_create`), + # which the webhook router runs INSIDE the `beforeSave` / `afterSave` + # handler for new objects (gated on `original.nil?`). So there is nothing + # to register for them — register `beforeSave` / `afterSave` instead and + # the create callbacks fire within it. 
# @!visibility private def _verify_trigger(triggerName) - triggerName = triggerName.to_s.camelize(:lower).to_sym - raise ArgumentError, "Invalid trigger name #{triggerName}" unless TRIGGER_NAMES.include?(triggerName) - triggerName + camel = triggerName.to_s.camelize(:lower).to_sym + if %i[beforeCreate afterCreate].include?(camel) + save = camel == :beforeCreate ? "beforeSave" : "afterSave" + callback = camel == :beforeCreate ? "before_create" : "after_create" + raise ArgumentError, + "Parse Server has no #{camel} webhook trigger. Register a " \ + "#{save} webhook instead — Parse Stack runs your #{callback} " \ + "ActiveModel callbacks within the #{save} handler for new objects." + end + raise ArgumentError, "Invalid trigger name #{camel}" unless TRIGGER_NAMES.include?(camel) + camel end # Fetch all defined cloud code functions. @@ -74,7 +111,7 @@ def triggers # @see TRIGGER_NAMES def fetch_trigger(triggerName, className) triggerName = _verify_trigger(triggerName) - safe_class = Parse::API::PathSegment.identifier!(className, kind: "class name") + safe_class = Parse::API::PathSegment.trigger_class_name!(className, kind: "class name") request :get, "#{HOOKS_PREFIX}triggers/#{safe_class}/#{triggerName}" end @@ -98,7 +135,7 @@ def create_trigger(triggerName, className, url) # @see Parse::API::Hooks::TRIGGER_NAMES def update_trigger(triggerName, className, url) triggerName = _verify_trigger(triggerName) - safe_class = Parse::API::PathSegment.identifier!(className, kind: "class name") + safe_class = Parse::API::PathSegment.trigger_class_name!(className, kind: "class name") request :put, "#{HOOKS_PREFIX}triggers/#{safe_class}/#{triggerName}", body: { url: url } end @@ -109,7 +146,7 @@ def update_trigger(triggerName, className, url) # @see Parse::API::Hooks::TRIGGER_NAMES def delete_trigger(triggerName, className) triggerName = _verify_trigger(triggerName) - safe_class = Parse::API::PathSegment.identifier!(className, kind: "class name") + safe_class = 
Parse::API::PathSegment.trigger_class_name!(className, kind: "class name") request :put, "#{HOOKS_PREFIX}triggers/#{safe_class}/#{triggerName}", body: { __op: "Delete" } end end diff --git a/lib/parse/api/objects.rb b/lib/parse/api/objects.rb index 72c241f..7da4220 100644 --- a/lib/parse/api/objects.rb +++ b/lib/parse/api/objects.rb @@ -84,8 +84,15 @@ def uri_path(className, id = nil) # @param body [Hash] the body of the request. # @param opts [Hash] additional options to pass to the {Parse::Client} request. # @param headers [Hash] additional HTTP headers to send with the request. + # @param context [Hash, nil] an optional caller context forwarded as the + # +X-Parse-Cloud-Context+ header. Parse Server maps it to + # +req.info.context+ inside beforeSave/afterSave cloud triggers. + # Omit or pass +nil+ to leave behavior unchanged. # @return [Parse::Response] - def create_object(className, body = {}, headers: {}, **opts) + def create_object(className, body = {}, headers: {}, context: nil, **opts) + unless context.nil? + headers = headers.merge(Parse::Protocol::CLOUD_CONTEXT => context.to_json) + end response = request :post, uri_path(className), body: body, headers: headers, opts: opts response.parse_class = className if response.present? response @@ -135,8 +142,15 @@ def find_objects(className, query = {}, headers: {}, **opts) # @param body [Hash] The key value pairs to update. # @param opts [Hash] additional options to pass to the {Parse::Client} request. # @param headers [Hash] additional HTTP headers to send with the request. + # @param context [Hash, nil] an optional caller context forwarded as the + # +X-Parse-Cloud-Context+ header. Parse Server maps it to + # +req.info.context+ inside beforeSave/afterSave cloud triggers. + # Omit or pass +nil+ to leave behavior unchanged. 
# @return [Parse::Response] - def update_object(className, id, body = {}, headers: {}, **opts) + def update_object(className, id, body = {}, headers: {}, context: nil, **opts) + unless context.nil? + headers = headers.merge(Parse::Protocol::CLOUD_CONTEXT => context.to_json) + end response = request :put, uri_path(className, id), body: body, headers: headers, opts: opts response.parse_class = className if response.present? response diff --git a/lib/parse/api/path_segment.rb b/lib/parse/api/path_segment.rb index bcf86ce..33adb97 100644 --- a/lib/parse/api/path_segment.rb +++ b/lib/parse/api/path_segment.rb @@ -45,6 +45,39 @@ def identifier!(value, kind: "name") s end + # Parse trigger className pattern: a normal identifier, OR one of Parse + # Server's `@`-prefixed pseudo-classes (`@File` for file triggers, + # `@Connect` for the connection-global LiveQuery trigger). The optional + # leading `@` is the only relaxation; the rest stays path-safe (no `/`, + # `.`, or `..`). + TRIGGER_CLASS_PATTERN = /\A@?[A-Za-z_][A-Za-z0-9_]*\z/.freeze + + # Validate a className used in a webhook-trigger path + # (`hooks/triggers//`). Same as {.identifier!} but + # additionally accepts the `@File` / `@Connect` pseudo-classes that Parse + # Server uses for file and connection triggers. `create_trigger` carries + # the className in the request BODY (not the path), so it already accepts + # these; this keeps fetch / update / delete symmetric with create. + # + # @param value the className to validate. + # @param kind [String] human-readable name for error messages. + # @return [String] the validated className. + # @raise [ArgumentError] if blank or otherwise fails the pattern. + def trigger_class_name!(value, kind: "class name") + s = value.to_s + if s.empty? + raise ArgumentError, "#{kind} must not be empty" + end + unless TRIGGER_CLASS_PATTERN.match?(s) + raise ArgumentError, + "#{kind} #{s.inspect} contains characters that are not allowed in " \ + "a Parse trigger class name. 
Names must match " \ + "/\\A@?[A-Za-z_][A-Za-z0-9_]*\\z/ (an identifier, optionally an " \ + "@-prefixed pseudo-class such as @File or @Connect)." + end + s + end + # Validate and percent-encode a less-restrictive path segment, used # for file names which can contain hyphens, periods, and other # filename-safe characters but must never contain a literal `/`, diff --git a/lib/parse/api/server.rb b/lib/parse/api/server.rb index 1a0fbd0..3e76539 100644 --- a/lib/parse/api/server.rb +++ b/lib/parse/api/server.rb @@ -59,6 +59,100 @@ def server_version server_info.present? ? @server_info[:parseServerVersion] : nil end + # The `features` block advertised by `GET /serverInfo`. Parse Server + # surfaces coarse capability groups here (`globalConfig`, `hooks`, + # `cloudCode`, `logs`, `push`, `schemas`), each a Hash of booleans. + # This is authoritative where present but intentionally coarse — it + # does NOT carry fine-grained behavior flags like "public explain" or + # the LiveQuery `keys` rename. Those are resolved by version inference + # in {#server_supports?}. + # @return [Hash] the advertised features block, or `{}` if unavailable. + def server_features + info = server_info + return {} unless info.is_a?(Hash) + feats = info[:features] + feats.is_a?(Hash) ? feats : {} + end + + # Capability table consumed by {#server_supports?}. Each entry is + # version-inferred (we cannot read these off the coarse `features` + # block) with one of two predicates: + # + # - `since:` — the capability EXISTS on this version and newer. An + # unknown/unparseable server version resolves to `true` + # (fail-open-to-modern: assume the current server line, matching the + # deprecation gate's posture). + # - `until:` — the capability existed BELOW this version and was + # removed/restricted at it. Unknown version resolves to `false` + # (the modern server no longer offers it). 
+ # + # `feature:` (a `[group, flag]` pair) lets a future capability prefer + # the advertised `features` block when Parse Server genuinely surfaces + # it there; absent that, the version predicate decides. + # @!visibility private + CAPABILITIES = { + # LiveQuery subscription field projection: the `fields` option was + # renamed `keys` in Parse Server 7.0.0 (DEPPS9 / #8852). The SDK + # emits both, so this is informational rather than gating. + livequery_keys_option: { since: "7.0.0" }, + # Cloud functions encode returned Parse.Object values as `__type` + # dictionaries: default flipped to `true` in 8.0.0, made + # unconditional (option removed) in 9.0.0. + cloud_object_encoding: { since: "8.0.0" }, + # Non-master `explain` on a query: `allowPublicExplain` defaulted to + # `false` in 9.0.0, so a session-scoped explain that worked on 8.x + # is rejected on 9.x unless the operator re-enables it. + public_explain: { until: "9.0.0" }, + # Aggregation `rawValues` / `rawFieldNames` options added in 9.9.0 + # (#10438). + aggregate_raw_values: { since: "9.9.0" }, + }.freeze + + # Capability probe against the connected Parse Server. Builds on the + # already-memoized {#server_info} (no extra round-trip beyond the one + # `serverInfo` fetch) and the coarse `features` block, falling back to + # version inference for behavior flags the `features` block does not + # carry. + # + # Fails OPEN to the modern server line: when the server version cannot + # be determined (offline unit tests, a `serverInfo` outage, a wire + # surprise), a `since:` capability resolves `true` and an `until:` + # capability resolves `false` — i.e. "assume the current server", + # mirroring {#warn_if_deprecated_server_version!}. + # + # @example + # client.server_supports?(:public_explain) # => false on PS 9.x + # client.server_supports?(:aggregate_raw_values) + # @param feature [Symbol] a key of {CAPABILITIES}. + # @return [Boolean] whether the connected server supports the feature. 
+ # @raise [ArgumentError] for an unknown capability key (typo guard). + def server_supports?(feature) + spec = CAPABILITIES[feature] + raise ArgumentError, "Unknown Parse Server capability #{feature.inspect}" if spec.nil? + + # Prefer the advertised features block when a capability declares a + # `[group, flag]` path AND the server actually surfaces it. + if (path = spec[:feature]) + group, flag = path + advertised = server_features.dig(group.to_s, flag.to_s) + advertised = server_features.dig(group, flag) if advertised.nil? + return advertised == true unless advertised.nil? + end + + version = server_version.to_s + if (floor = spec[:since]) + # Supported on `floor` and newer. Unknown version => assume modern => true. + return true if version.empty? + !server_version_below?(version, floor) + elsif (ceiling = spec[:until]) + # Supported strictly below `ceiling`. Unknown version => assume modern => false. + return false if version.empty? + server_version_below?(version, ceiling) + else + false + end + end + private # One-shot deprecation warning. The check runs once per client diff --git a/lib/parse/api/users.rb b/lib/parse/api/users.rb index d97a096..b4f9f3f 100644 --- a/lib/parse/api/users.rb +++ b/lib/parse/api/users.rb @@ -14,7 +14,11 @@ module Users # @!visibility private LOGIN_PATH = "login" # @!visibility private + VERIFY_PASSWORD_PATH = "verifyPassword" + # @!visibility private REQUEST_PASSWORD_RESET = "requestPasswordReset" + # @!visibility private + VERIFICATION_EMAIL_REQUEST = "verificationEmailRequest" # Fetch a {Parse::User} for a given objectId. # @param id [String] the user objectid @@ -130,6 +134,26 @@ def request_password_reset(email, headers: {}, **opts) response end + # Request that Parse Server (re)send the email-address verification email + # for a registered, not-yet-verified user. Requires the server to have an + # email adapter and `verifyUserEmails` enabled; otherwise Parse Server + # responds with an error. 
Rate-limited per email like password reset. + # + # @param email [String] the Parse user email. + # @param opts [Hash] additional options to pass to the {Parse::Client} request. + # @param headers [Hash] additional HTTP headers to send with the request. + # @return [Parse::Response] + def request_email_verification(email, headers: {}, **opts) + rate_key = "emailverify:#{email}" + check_login_rate_limit!(rate_key) + body = { email: email } + response = request :post, VERIFICATION_EMAIL_REQUEST, body: body, opts: opts, headers: headers + # Indistinguishable found/not-found response, like password reset — count + # every attempt toward backoff so probing can't reset the counter. + track_login_attempt(rate_key, false) + response + end + # Login a user. Implements client-side rate limiting with exponential # backoff after repeated failures to mitigate brute force attacks. # @param username [String] the Parse user username. @@ -180,6 +204,37 @@ def login_with_mfa(username, password, mfa_token, headers: {}, **opts) response end + # Verify a user's credentials against Parse Server without minting a session. + # This is the canonical step-up / re-authentication primitive: it confirms + # that the username + password combination is correct without producing a + # new session token on success. + # + # Uses the +POST /parse/verifyPassword+ endpoint (credentials in the request + # BODY, mirroring +login+) rather than the +GET+ form. Parse Server accepts + # both (same handler, neither master-key gated; the POST variant landed in + # 7.1.0), but POST keeps the plaintext password out of the URL — and + # therefore out of server access logs, reverse-proxy logs, the +Referer+ + # header, and the SDK's URL-keyed response cache. + # + # On success Parse Server returns the user object (HTTP 200) with the same + # shape as a login response (minus +sessionToken+). 
On failure it returns a + # 4xx with an error body, most commonly: + # - code 101 (+ERROR_OBJECT_NOT_FOUND+) for an unknown username or wrong password. + # - code 205 (+ERROR_EMAIL_NOT_FOUND+) when +preventLoginWithUnverifiedEmail+ + # is enabled and the account's email has not been verified. + # + # @param username [String] the Parse user username. + # @param password [String] the Parse user's associated password. + # @param headers [Hash] additional HTTP headers to send with the request. + # @param opts [Hash] additional options to pass to the {Parse::Client} request. + # @return [Parse::Response] + def verify_password(username, password, headers: {}, **opts) + body = { username: username, password: password } + response = request :post, VERIFY_PASSWORD_PATH, body: body, headers: headers, opts: opts + response.parse_class = Parse::Model::CLASS_USER + response + end + # Logout a user by deleting the associated session. # @param session_token [String] the Parse user session token to delete. # @param headers [Hash] additional HTTP headers to send with the request. @@ -223,7 +278,7 @@ def login_rate_limits LOGIN_RATE_LIMIT_TTL = 600 # Checks if a login attempt is allowed for the given username. - # @raise [RuntimeError] if the account is temporarily locked out. + # @raise [Parse::Error::AccountLockoutError] if the account is temporarily locked out. def check_login_rate_limit!(username) @login_rate_limit_mutex ||= Mutex.new @login_rate_limit_mutex.synchronize do @@ -231,7 +286,8 @@ def check_login_rate_limit!(username) return unless entry if entry[:locked_until] && Time.now < entry[:locked_until] wait = (entry[:locked_until] - Time.now).ceil - raise "Login rate limited for '#{username}'. Try again in #{wait} seconds." + raise Parse::Error::AccountLockoutError, + "Login rate limited for '#{username}'. Try again in #{wait} seconds." 
end end end diff --git a/lib/parse/atlas_search.rb b/lib/parse/atlas_search.rb index 2a1ae4f..1c2c443 100644 --- a/lib/parse/atlas_search.rb +++ b/lib/parse/atlas_search.rb @@ -105,10 +105,10 @@ class << self # @!attribute [rw] role_cache_ttl # TTL (seconds) for {Session}'s user-id → role-name cache. - # Default: 120 (2 minutes). Short on purpose: stale role - # data yields incorrect ACL decisions, so the cache is sized - # to amortize within a single request/turn but expire well - # inside the response time the operator notices a role grant. + # Default: 30. Short on purpose: stale role data yields + # incorrect ACL decisions, so the cache is sized to amortize + # within a single request/turn but expire well inside the + # response time the operator notices a role grant or revoke. # @return [Integer] attr_accessor :role_cache_ttl @@ -141,7 +141,7 @@ class << self # @param session_cache_ttl [Integer] session-token cache TTL # (seconds). Default: 3600. # @param role_cache_ttl [Integer] role-name cache TTL (seconds). - # Default: 120. + # Default: 30. # @example # Parse::AtlasSearch.configure(enabled: true, default_index: "default") def configure(enabled: true, @@ -195,7 +195,7 @@ def reset! @allow_raw = default_allow_raw @require_session_token = false @session_cache_ttl = 3600 - @role_cache_ttl = 120 + @role_cache_ttl = 30 @session_cache = Session::MemoryCache.new @role_cache = Session::MemoryCache.new @master_warned = false @@ -1015,7 +1015,7 @@ def sanitize_raw_results(docs) @allow_raw = nil @require_session_token = false @session_cache_ttl = 3600 - @role_cache_ttl = 120 + @role_cache_ttl = 30 @session_cache = Session::MemoryCache.new @role_cache = Session::MemoryCache.new @master_warned = false diff --git a/lib/parse/client.rb b/lib/parse/client.rb index d8cd9d2..3c74cf8 100644 --- a/lib/parse/client.rb +++ b/lib/parse/client.rb @@ -1414,9 +1414,53 @@ def self.setup(opts = {}, &block) # Unwrap the `{ "result" => ... 
}` envelope from a successful cloud-code response. # Guards against unusual server payloads (non-Hash bodies) by returning the raw # result rather than raising TypeError on `String#[]`/`Integer#[]`. + # + # Parse Server 8.0 flipped `encodeParseObjectInCloudFunction` to true and 9.0 + # removed the opt-out, so a cloud function that returns a Parse object now + # yields a `__type`-encoded dictionary (`{"__type":"Object","className":...}`) + # where a pre-8.x caller received a plain attribute Hash. We decode those + # self-describing envelopes back into `Parse::Object` / `Parse::Pointer` so the + # value a caller sees is consistent across server versions and matches what + # every other Parse SDK returns. Decoding is conservative: only a fully-shaped + # Object/Pointer envelope is converted, and an Object of an UNregistered class + # is left as a raw Hash (building it would degrade to a field-less Pointer). + # Plain Hashes and arbitrary `__type` app data pass through untouched. def self._extract_cloud_result(response) r = response.result - r.is_a?(Hash) ? r["result"] : r + value = r.is_a?(Hash) ? r["result"] : r + _decode_cloud_value(value) + end + + # @!visibility private + # Recursively decode Parse-encoded values in a cloud-code result. See + # {._extract_cloud_result} for the rationale and the conservatism rules. + def self._decode_cloud_value(value) + case value + when Array + value.map { |v| _decode_cloud_value(v) } + when Hash + type = value["__type"] || value[:__type] + class_name = value["className"] || value[:className] + object_id = value["objectId"] || value[:objectId] + if type == Parse::Model::TYPE_POINTER && class_name && object_id + # Pointers carry no attributes, so building one is lossless even for + # an unregistered class (yields a Parse::Pointer). 
+ Parse::Object.build(value) + elsif type == Parse::Model::TYPE_OBJECT && class_name && object_id && + Parse::Model.find_class(class_name) + # Only build a full object when the class is registered; otherwise + # Parse::Object.build collapses to a field-less Pointer and we'd lose + # the attributes — better to hand back the raw Hash. + Parse::Object.build(value) + else + # Plain Hash, partial envelope, or non-object `__type` (Date/GeoPoint/ + # File/Bytes, or literal app data): leave the node shape intact and + # only recurse into nested values so embedded objects still decode. + value.transform_values { |v| _decode_cloud_value(v) } + end + else + value + end end # Helper method to trigger cloud jobs and get results. @@ -1491,6 +1535,9 @@ def self.trigger_job_with_session!(name, body = {}, session_token, **opts) # @option opts [Symbol] :client The client connection to use. # @option opts [Boolean] :raw Whether to return the raw response object. # @option opts [Boolean] :master_key Whether to use the master key for this request. + # @option opts [Hash, nil] :context An optional caller context forwarded as the + # +X-Parse-Cloud-Context+ header. Parse Server maps it to +req.info.context+ + # in the function handler and flows it through beforeSave/afterSave triggers. # @return [Object] the result data of the response. nil if there was an error. def self.call_function(name, body = {}, **opts) conn = opts[:session] || opts[:client] || :default @@ -1500,7 +1547,13 @@ def self.call_function(name, body = {}, **opts) request_opts[:session_token] = opts[:session_token] if opts[:session_token] request_opts[:master_key] = opts[:master_key] if opts[:master_key] - response = Parse::Client.client(conn).call_function(name, body, opts: request_opts) + # Build call kwargs; only forward context: when explicitly supplied so + # call sites that do not use context produce the same opts hash that + # existing mock expectations match against. 
+ call_kwargs = { opts: request_opts } + call_kwargs[:context] = opts[:context] unless opts[:context].nil? + + response = Parse::Client.client(conn).call_function(name, body, **call_kwargs) return response if opts[:raw].present? if response.error? Parse::Client._safe_warn("CloudCodeError", response, name: name) diff --git a/lib/parse/client/body_builder.rb b/lib/parse/client/body_builder.rb index 48c4f6b..d09f5fc 100644 --- a/lib/parse/client/body_builder.rb +++ b/lib/parse/client/body_builder.rb @@ -74,6 +74,11 @@ class BodyBuilder < Faraday::Middleware Parse::Protocol::MASTER_KEY, Parse::Protocol::API_KEY, Parse::Protocol::SESSION_TOKEN, + # Caller-supplied Cloud Code context (X-Parse-Cloud-Context) carries + # `context.to_json`, which may hold PII / request metadata. Redact it in + # the header log; the body/as_json log path scrubs sensitive sub-values + # of context separately. + Parse::Protocol::CLOUD_CONTEXT, "X-Parse-JavaScript-Key", "Authorization", "Cookie", diff --git a/lib/parse/client/protocol.rb b/lib/parse/client/protocol.rb index 4cd1658..bd4fa79 100644 --- a/lib/parse/client/protocol.rb +++ b/lib/parse/client/protocol.rb @@ -31,6 +31,10 @@ module Protocol # The request header field for MongoDB read preference. # Supported values: PRIMARY, PRIMARY_PREFERRED, SECONDARY, SECONDARY_PREFERRED, NEAREST READ_PREFERENCE = "X-Parse-Read-Preference" + # The request header field for threading a caller-supplied context object + # through a write or cloud-function call. Parse Server maps this header to + # +req.info.context+ and flows it through beforeSave/afterSave triggers. 
+ CLOUD_CONTEXT = "X-Parse-Cloud-Context" # Valid read preference values for MongoDB READ_PREFERENCES = %w[PRIMARY PRIMARY_PREFERRED SECONDARY SECONDARY_PREFERRED NEAREST].freeze diff --git a/lib/parse/embeddings.rb b/lib/parse/embeddings.rb index 61401d1..a2ddb6a 100644 --- a/lib/parse/embeddings.rb +++ b/lib/parse/embeddings.rb @@ -554,3 +554,4 @@ def ip_shaped_but_not_canonical?(host) require_relative "embeddings/jina" require_relative "embeddings/qwen" require_relative "embeddings/local_http" +require_relative "embeddings/spend_cap" diff --git a/lib/parse/embeddings/spend_cap.rb b/lib/parse/embeddings/spend_cap.rb new file mode 100644 index 0000000..eb374ee --- /dev/null +++ b/lib/parse/embeddings/spend_cap.rb @@ -0,0 +1,255 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require "thread" + +module Parse + module Embeddings + # Per-tenant cumulative embedding spend cap. + # + # The agent `semantic_search` tool embeds attacker-controlled text + # (chat queries) on every call. Without a cap, a tenant — or an + # adversary driving an agent — can run up unbounded embedding-provider + # cost. {SpendCap} tracks the cumulative number of *tokens* embedded + # per tenant inside a sliding time window and HARD-REFUSES (raises + # {Exceeded}) once a tenant would exceed its limit. This is distinct + # from {Parse::Agent::RateLimiter}, which bounds request *count* per + # window; the spend cap bounds embedding *volume* (a proxy for cost). + # + # == Disabled by default + # + # With no configured limit the cap is a no-op — {.charge!} records + # nothing and never raises. Operators opt in: + # + # Parse::Embeddings::SpendCap.configure(limit_tokens: 1_000_000, window: 3600) + # Parse::Embeddings::SpendCap.configure(:acme_tenant, limit_tokens: 50_000) + # + # A per-tenant limit (second form) overrides the default for that + # tenant. The reserved key {DEFAULT_KEY} sets the fallback applied to + # every tenant without an explicit limit. 
+ # + # == Token estimation + # + # Callers pass an explicit token count, or use {.estimate_tokens} (a + # chars/4 heuristic — the same approximation the agent layer uses for + # its context-token budgets). The cap is intentionally an estimate: it + # exists to bound runaway cost, not to bill precisely. + # + # Thread-safe: all state lives behind a single mutex. + module SpendCap + # Raised when a tenant would exceed its token cap. Carries the + # limit, the already-used count (within the window), and a + # `retry_after` hint (seconds until enough of the window rolls off + # to admit the rejected charge — `nil` if the charge can never fit). + class Exceeded < StandardError + attr_reader :tenant_id, :limit, :used, :requested, :window, :retry_after + + def initialize(tenant_id:, limit:, used:, requested:, window:, retry_after:) + @tenant_id = tenant_id + @limit = limit + @used = used + @requested = requested + @window = window + @retry_after = retry_after + super( + "Embedding spend cap exceeded for tenant #{tenant_id.inspect}: " \ + "#{used}+#{requested} tokens would exceed #{limit}/#{window}s." \ + "#{retry_after ? " Retry after #{retry_after.round(1)}s." : " Request exceeds the cap outright."}" + ) + end + end + + # Fallback bucket key for charges with no tenant identity, and the + # key under which {.configure} (with no explicit tenant) sets the + # default limit applied to every tenant lacking an override. + DEFAULT_KEY = :__default__ + + # Default sliding window (seconds) when none is configured. + DEFAULT_WINDOW = 3600 + + class << self + # Configure the cap. Two forms: + # + # configure(limit_tokens:, window:) # default for all tenants + # configure(tenant_id, limit_tokens:, window:) # override one tenant + # + # `limit_tokens: nil` disables the cap for that scope (the default + # scope when no tenant is given). + # + # @param tenant_id [Object, nil] tenant to override, or nil for + # the global default. 
+ # @param limit_tokens [Integer, nil] token ceiling per window. + # @param window [Integer] sliding window length in seconds. + # @return [void] + def configure(tenant_id = nil, limit_tokens:, window: DEFAULT_WINDOW) + key = tenant_id.nil? ? DEFAULT_KEY : tenant_id + unless limit_tokens.nil? + li = Integer(limit_tokens) + raise ArgumentError, "SpendCap: limit_tokens must be positive (got #{li})." if li <= 0 + end + w = Integer(window) + raise ArgumentError, "SpendCap: window must be positive (got #{w})." if w <= 0 + mutex.synchronize do + limits[key] = limit_tokens.nil? ? nil : { limit: Integer(limit_tokens), window: w } + end + nil + end + + # Charge `tokens` against `tenant_id`'s budget. HARD-REFUSES by + # raising {Exceeded} when the charge would push the tenant over + # its limit within the window; otherwise records the charge and + # returns the new in-window total. + # + # No-op (returns nil) when no limit applies to the tenant. + # + # @param tenant_id [Object, nil] tenant identity (nil → {DEFAULT_KEY}). + # @param tokens [Integer] tokens to charge (>= 0). + # @return [Integer, nil] new in-window total, or nil if uncapped. + # @raise [Exceeded] + def charge!(tenant_id:, tokens:) + t = Integer(tokens) + raise ArgumentError, "SpendCap: tokens must be >= 0 (got #{t})." if t.negative? + key = tenant_id.nil? ? DEFAULT_KEY : tenant_id + + mutex.synchronize do + cfg = limit_for(key) + return nil if cfg.nil? # uncapped + + window = cfg[:window] + limit = cfg[:limit] + now = monotonic + entries = prune(key, now, window) + used = entries.sum { |e| e[1] } + + if used + t > limit + raise Exceeded.new( + tenant_id: key, limit: limit, used: used, requested: t, + window: window, retry_after: retry_after_for(entries, t, limit, window, now), + ) + end + entries << [now, t] if t.positive? + used + t + end + end + + # Current in-window token usage for a tenant (0 when uncapped or + # idle). Does not mutate. 
+ # + # @param tenant_id [Object, nil] + # @return [Integer] + def usage(tenant_id: nil) + key = tenant_id.nil? ? DEFAULT_KEY : tenant_id + mutex.synchronize do + cfg = limit_for(key) + return 0 if cfg.nil? + prune(key, monotonic, cfg[:window]).sum { |e| e[1] } + end + end + + # Estimate token count from a String. + # + # The familiar "~4 characters per token" ratio only holds for + # ASCII. CJK, emoji, and other multibyte text run closer to one + # token per codepoint in a real tokenizer, so a pure + # `chars / 4` estimate undercounts such input by up to ~4x — and + # since this estimate is the sole basis for the hard-refuse + # decision, that lets a caller feeding multibyte text reach ~4x + # the real embedding volume before the cap trips. Take the larger + # of the char-based and byte-based estimates so multibyte input + # bills at least as much as its UTF-8 byte length implies. + # + # @param text [String] + # @return [Integer] + def estimate_tokens(text) + str = text.to_s + chars = (str.length / 4.0).ceil + bytes = (str.bytesize / 4.0).ceil + [chars, bytes].max + end + + # Clear recorded usage (all tenants, or one). Limits are retained. + # + # @param tenant_id [Object, nil] + def reset!(tenant_id = nil) + mutex.synchronize do + if tenant_id.nil? + @buckets = {} + else + buckets.delete(tenant_id) + end + end + nil + end + + # Remove all configured limits AND recorded usage. Mainly for + # tests — returns the cap to its disabled-by-default state. + def reset_all! + mutex.synchronize do + @limits = {} + @buckets = {} + end + nil + end + + private + + MUTEX_INIT = Mutex.new + private_constant :MUTEX_INIT + + def mutex + @mutex ||= MUTEX_INIT.synchronize { @mutex ||= Mutex.new } + end + + def limits + @limits ||= {} + end + + def buckets + @buckets ||= {} + end + + # Resolve the effective limit config for a key: an explicit + # per-tenant entry wins; otherwise the DEFAULT_KEY entry; nil when + # neither is set (uncapped). 
A key explicitly set to nil disables + # the cap for that tenant even if a default exists. + def limit_for(key) + if limits.key?(key) + limits[key] + else + limits[DEFAULT_KEY] + end + end + + # Drop entries older than the window; returns the (mutated) live + # entry list for the key. + def prune(key, now, window) + entries = (buckets[key] ||= []) + cutoff = now - window + entries.reject! { |e| e[0] <= cutoff } + entries + end + + # Seconds until enough in-window tokens roll off to admit a charge + # of `requested` tokens. nil when the request alone exceeds the + # limit (it can never fit). + def retry_after_for(entries, requested, limit, window, now) + return nil if requested > limit + need_to_free = (entries.sum { |e| e[1] } + requested) - limit + return 0.0 if need_to_free <= 0 + freed = 0 + entries.sort_by { |e| e[0] }.each do |ts, tok| + freed += tok + if freed >= need_to_free + return [(ts + window) - now, 0.0].max + end + end + nil + end + + def monotonic + Process.clock_gettime(Process::CLOCK_MONOTONIC) + end + end + end + end +end diff --git a/lib/parse/live_query/client.rb b/lib/parse/live_query/client.rb index 58bfe9b..8e21ecb 100644 --- a/lib/parse/live_query/client.rb +++ b/lib/parse/live_query/client.rb @@ -363,7 +363,7 @@ def health_info # events can arrive. Optional — callers may still capture the # returned subscription and register callbacks later. 
# @return [Subscription] - def subscribe(class_name, where: {}, fields: nil, session_token: nil, + def subscribe(class_name, where: {}, fields: nil, keys: nil, watch: nil, session_token: nil, use_master_key: false, &block) # Handle Parse::Object subclass if class_name.is_a?(Class) && class_name < Parse::Object @@ -396,6 +396,8 @@ def subscribe(class_name, where: {}, fields: nil, session_token: nil, class_name: class_name, query: where, fields: fields, + keys: keys, + watch: watch, session_token: session_token, use_master_key: use_master_key, ) diff --git a/lib/parse/live_query/subscription.rb b/lib/parse/live_query/subscription.rb index 08b2ab4..32f5762 100644 --- a/lib/parse/live_query/subscription.rb +++ b/lib/parse/live_query/subscription.rb @@ -62,12 +62,22 @@ def next_request_id # @return [Parse::LiveQuery::Client] the LiveQuery client attr_reader :client - # @return [Array] fields to watch for changes (nil = all fields) + # @return [Array] field projection for returned events + # (nil = all fields). Parse Server 7.0 renamed this subscription + # option from `fields` to `keys`; {#keys} is the canonical alias. attr_reader :fields + # @return [Array] alias for {#fields} under Parse Server's + # post-7.0 `keys` name. + alias_method :keys, :fields # @return [String, nil] session token for ACL-aware subscriptions attr_reader :session_token + # @return [Array, nil] field names that trigger update events when + # changed (PS 7.0+ `watch` option). +nil+ means all field changes trigger + # update events. + attr_reader :watch + # Create a new subscription # @param client [Parse::LiveQuery::Client] the LiveQuery client # @param class_name [String] Parse class name @@ -85,13 +95,16 @@ def next_request_id # elevated; on a non-admin connection the client warns and the # subscription stays ACL-scoped. For mixed scoped + admin needs, # use two separate clients. Defaults to false. 
- def initialize(client:, class_name:, query: {}, fields: nil, - session_token: nil, use_master_key: false) + def initialize(client:, class_name:, query: {}, fields: nil, keys: nil, + session_token: nil, use_master_key: false, watch: nil) @monitor = Monitor.new @client = client @class_name = class_name @query = query - @fields = fields + # `keys` is the post-7.0 name; accept either and prefer the explicit + # `keys:` when both are supplied. + @fields = keys.nil? ? fields : keys + @watch = watch @session_token = session_token @use_master_key = use_master_key == true @request_id = generate_request_id @@ -239,7 +252,21 @@ def to_subscribe_message }, } - msg[:query][:fields] = fields if fields&.any? + if fields&.any? + # Parse Server 7.0 (DEPPS9 / #8852) renamed the subscription field- + # projection option from `fields` to `keys`. PS 7+ reads `keys` and + # ignores `fields`; PS < 7 reads `fields`. Emit BOTH so projection is + # honored on every supported server — sending an extra key the server + # ignores is harmless, while sending only `fields` silently disables + # projection (events return all columns) on PS 7+. + msg[:query][:keys] = fields + msg[:query][:fields] = fields + end + # PS 7.0 (#8028) `watch`: fire update events only when the named fields + # change. Distinct from field projection (`keys`/`fields`): `watch` + # controls which field mutations generate an update event; `keys` controls + # which fields are returned in the event payload. + msg[:query][:watch] = watch if watch&.any? msg[:sessionToken] = session_token if session_token # The subscribe frame deliberately NEVER carries `masterKey`. # Parse Server's `_handleSubscribe` does not read it — master-key diff --git a/lib/parse/model/acl.rb b/lib/parse/model/acl.rb index 0505766..32acda5 100644 --- a/lib/parse/model/acl.rb +++ b/lib/parse/model/acl.rb @@ -91,8 +91,10 @@ def as_json(*args) # artist.save # # You may also set default ACLs for your subclasses by using {Parse::Object.set_default_acl}. 
- # These will be get applied for newly created instances. All subclasses have - # public read and write enabled by default. + # These will get applied for newly created instances. Unless overridden, subclasses + # inherit the shipped `:owner_else_private` policy — records are private + # (master-only) until an owner is resolved at save time. Use {Parse::Object.set_default_acl} + # or {Parse::Object.acl_policy} to grant broader access. # # class AdminData < Parse::Object # diff --git a/lib/parse/model/classes/audience.rb b/lib/parse/model/classes/audience.rb index 7443d17..9b9c36f 100644 --- a/lib/parse/model/classes/audience.rb +++ b/lib/parse/model/classes/audience.rb @@ -155,12 +155,60 @@ def find_by_name_uncached(name) property :name # @!attribute query - # The query constraints that define which installations belong to this audience. - # This is stored as a hash matching the Installation query format. - # @return [Hash] The query constraint hash. + # The query constraints that define which installations belong to this + # audience, as a Hash matching the Installation query format. + # + # On the wire this is persisted as a JSON **string**, matching Parse + # Server's built-in `_Audience.query` column (typed `String`, not Object). + # Assigning a Hash and reading a Hash back is handled transparently. The + # previous `property :query, :object` emitted a JSON object, so every save + # of a hash query was rejected by the server with a schema mismatch + # ("expected String but got Object"). + # @return [ActiveSupport::HashWithIndifferentAccess, nil] the query hash. # @example # audience.query = { "deviceType" => "ios", "appVersion" => { "$gte" => "2.0" } } - property :query, :object + property :query_json, :string, field: :query + + # JSON-encode a Hash/Array assigned to the query field before the `:string` + # property coercion runs. 
Every assignment path — `query=`, + # `query_constraint=`, mass-assignment via `new(query:)`, and server + # hydration — funnels through `format_value`, so intercepting here (rather + # than the public setter, which mass-assignment bypasses) is the single + # reliable place to keep the wire value valid JSON (`{"k":"v"}`) instead of + # Ruby's `Hash#to_s` (`{"k"=>"v"}`). A String passed in (e.g. a row loaded + # from the server) falls through to the normal string coercion untouched. + def format_value(key, val, data_type = nil) + if key == :query_json && (val.is_a?(Hash) || val.is_a?(Array)) + return val.to_json + end + super + end + + # The query constraints as a Hash (decoded from the stored JSON string). + # @return [ActiveSupport::HashWithIndifferentAccess, nil] + def query + raw = query_json + case raw + when nil, "" then nil + when Hash then raw.with_indifferent_access + when String + decoded = begin + JSON.parse(raw) + rescue JSON::ParserError + nil + end + decoded.is_a?(Hash) ? decoded.with_indifferent_access : decoded + else + raw + end + end + + # Assign the audience query. Accepts a Hash (preferred) or a pre-encoded + # JSON String; either form is persisted as a JSON string. + # @param value [Hash, String, nil] + def query=(value) + self.query_json = value + end # Alias for query to match Parse Server naming conventions. # @return [Hash] The query constraint hash. diff --git a/lib/parse/model/classes/user.rb b/lib/parse/model/classes/user.rb index c1be9b5..514e82a 100644 --- a/lib/parse/model/classes/user.rb +++ b/lib/parse/model/classes/user.rb @@ -28,6 +28,59 @@ class EmailNotFound < Error; end # 125 Error code indicating that the email address was invalid. 
class InvalidEmailAddress < Error; end + + # Error code 205 (Parse::Response::ERROR_EMAIL_NOT_FOUND) raised by + # {Parse::User.login!} and {Parse::User#verify_password} when Parse Server + # returns code 205 because +preventLoginWithUnverifiedEmail+ is enabled and + # the account's email address has not been verified. + # + # It is a SUBCLASS of {AuthenticationError} on purpose: before this typed + # error existed, the unverified-email rejection raised a plain + # +AuthenticationError+, so existing callers wrapping {Parse::User.login!} + # in +rescue AuthenticationError+ must keep catching it (subclassing keeps + # that contract — making it a sibling would be a silent breaking change). + # Callers who want to special-case the unverified-email path just rescue + # this narrower subclass FIRST. + # + # @example + # begin + # Parse::User.login!(username, password) + # rescue Parse::Error::EmailNotVerifiedError + # # Prompt user to check their inbox and verify their email + # rescue Parse::Error::AuthenticationError + # # Wrong credentials or other login failure (still catches the above + # # too, if no narrower rescue precedes it) + # end + class EmailNotVerifiedError < AuthenticationError; end + + # Raised by {Parse::Client} when the SDK's client-side login rate-limit + # guard fires — i.e. the same username has failed + # {Parse::API::Users::LOGIN_MAX_FAILURES} or more times and the exponential + # back-off window has not yet elapsed. + # + # The class is a subclass of {AuthenticationError} so that a single + # rescue Parse::Error::AuthenticationError handler covers both + # wrong-credential failures and lockout situations. Callers that need to + # distinguish the lockout case just rescue this narrower subclass first. + # Because the previous implementation raised a plain +RuntimeError+, there + # is no prior +AuthenticationError+ rescue contract to preserve — this is + # a new typed entry in the login-failure taxonomy. 
+ # + # Note that +Parse::Error < StandardError+, so a bare +rescue+ or + # +rescue StandardError+ still catches this error. + # + # @example + # begin + # Parse::User.login!(username, password) + # rescue Parse::Error::AccountLockoutError => e + # # Too many failed attempts — tell the user how long to wait + # retry_in = e.message[/\d+/] + # render_lockout_page(retry_in: retry_in) + # rescue Parse::Error::AuthenticationError + # # Wrong credentials (or other login failure — also catches lockout + # # if no narrower rescue precedes it) + # end + class AccountLockoutError < AuthenticationError; end end # The main class representing the _User table in Parse. A user can either be signed up or anonymous. @@ -504,15 +557,55 @@ def apply_attributes!(hash, dirty_track: false, filter_protected: nil, protected if hash.key?(:authData) || hash.key?("authData") || hash.key?(:auth_data) || hash.key?("auth_data") hash = hash.dup + raw_auth = hash[:authData] || hash["authData"] || + hash[:auth_data] || hash["auth_data"] hash.delete(:authData) hash.delete("authData") hash.delete(:auth_data) hash.delete("auth_data") + # Preserve ONLY a non-sensitive MFA status derived from the stripped + # authData, so #mfa_enabled? / #mfa_status (and the #disable_mfa! + # guard) work after an ordinary fetch without retaining the TOTP + # secret, recovery codes, mobile number, or any OAuth provider token. + # Non-MFA authData still strips to nil exactly as before. + safe = sanitized_mfa_authdata(raw_auth) + hash["authData"] = safe if safe end end super(hash, dirty_track: dirty_track, filter_protected: filter_protected, protected_set: protected_set) end + # @!visibility private + # Reduce a server-returned +authData+ hash to a leak-safe MFA status. + # Parse Server returns +authData.mfa+ as +{ "secret" => ..., "recovery" => + # [...] }+ (the raw TOTP secret and one-time recovery codes) even on a + # user's own session-token read, so the value itself must never be retained. 
+ # This keeps only +{ "mfa" => { "status" => "enabled" } }+ when MFA is + # configured, and returns +nil+ otherwise (preserving the prior + # strip-to-nil behavior for OAuth-only / non-MFA authData). + # @return [Hash, nil] + def sanitized_mfa_authdata(raw) + return nil unless raw.is_a?(Hash) + mfa = raw["mfa"] || raw[:mfa] + return nil unless mfa.is_a?(Hash) + + status = mfa["status"] || mfa[:status] + # An EXPLICIT non-"enabled" status is authoritative: treat the user + # as disabled even if a stale `secret`/`recovery` lingers in the + # blob. Without this, a residual credential would override an + # explicit `status: "disabled"` and make `mfa_enabled?` report true + # for a user who has turned MFA off. + return nil if status.is_a?(String) && status != "enabled" + + recovery = mfa["recovery"] || mfa[:recovery] + enabled = status == "enabled" || + (mfa["secret"] || mfa[:secret]).present? || + (recovery.is_a?(Array) ? recovery.any? : recovery.present?) || + (mfa["mobile"] || mfa[:mobile]).present? + + enabled ? { "mfa" => { "status" => "enabled" } } : nil + end + # @return [Boolean] true if this user is anonymous (i.e. created # via the +authData.anonymous+ provider rather than via signup # with a username/password or a real OAuth provider). @@ -664,6 +757,17 @@ def request_password_reset Parse::User.request_password_reset(email) end + # Request that Parse Server (re)send this user's email-address verification + # email. The server must have an email adapter and `verifyUserEmails` enabled. + # @return [Boolean] true if the request was accepted, false otherwise. + # @raise [Parse::Error::ServiceUnavailableError] if Parse Server returns a + # 500/503 (e.g. no emailAdapter / `verifyUserEmails` disabled). + # @see Parse::User.request_email_verification + def request_email_verification + return false if email.nil? + Parse::User.request_email_verification(email) + end + # You may set a password for this user when you are creating them. 
Parse never returns a # @param passwd The user's password to be used for signing up. # @raise [Parse::Error::UsernameMissingError] If username is missing. @@ -1069,9 +1173,21 @@ def self.login!(username, password) # Self-fetch trust: see {.login}. with_authdata_trust { Parse::User.build(response.result) } else - raise Parse::Error::AuthenticationError, - "Parse::User.login! failed for #{username.inspect}: " \ - "#{response.error || "HTTP #{response.http_status}"} (code=#{response.code.inspect})" + case response.code + when Parse::Response::ERROR_EMAIL_NOT_FOUND + # Parse Server throws code 205 (EMAIL_NOT_FOUND) when + # +preventLoginWithUnverifiedEmail+ is set and the account's email + # address has not yet been verified. Raise the typed error so callers + # can direct the user to verify their inbox without catching every + # AuthenticationError. + raise Parse::Error::EmailNotVerifiedError, + "Parse::User.login! failed for #{username.inspect}: " \ + "email address is not verified (code=205)" + else + raise Parse::Error::AuthenticationError, + "Parse::User.login! failed for #{username.inspect}: " \ + "#{response.error || "HTTP #{response.http_status}"} (code=#{response.code.inspect})" + end end end @@ -1096,6 +1212,26 @@ def self.request_password_reset(email) response.success? end + # Request that Parse Server (re)send the email-address verification email for + # a registered, not-yet-verified user. The server must have an email adapter + # and `verifyUserEmails` enabled. + # @example + # # pass a user object + # Parse::User.request_email_verification(user) + # # or an email + # Parse::User.request_email_verification("user@example.com") + # @param email [String] The user's email address (or a {Parse::User}). + # @return [Boolean] True/false if the request was accepted. + # @raise [Parse::Error::ServiceUnavailableError] if Parse Server returns a + # 500/503 (e.g. no emailAdapter / `verifyUserEmails` disabled). 
Callers that + # branch on the Boolean should rescue this. + def self.request_email_verification(email) + email = email.email if email.is_a?(Parse::User) + return false if email.blank? + response = client.request_email_verification(email) + response.success? + end + # Same as `session!` but returns nil if a user was not found or sesion token was invalid. # @return [User] the user matching this active token, otherwise nil. # @see #session! @@ -1260,6 +1396,47 @@ def multi_session? active_session_count > 1 end + # Verify this user's password without minting a session token. + # + # Delegates to the +GET /parse/verifyPassword+ endpoint (Parse Server + # 7.1.0+) using this user's +username+ and the supplied +password+. The + # check is purely credential validation — no session is created on + # success, and the user's existing sessions are unaffected. + # + # Use this as a step-up authentication gate: before allowing a sensitive + # action (e.g. changing an email address or deleting an account), call + # +verify_password+ to confirm the caller still knows the password. + # + # @param password [String] the password to verify. + # @return [Boolean] +true+ if the credentials are valid. + # @raise [Parse::Error::EmailNotVerifiedError] when the account exists but + # +preventLoginWithUnverifiedEmail+ is enabled and the email has not been + # verified (Parse Server error code 205). The caller may want to prompt + # the user to check their inbox rather than treating this as a wrong- + # password failure. + # @raise [Parse::Error::AuthenticationError] when the username does not + # exist or the password is wrong (code 101, +OBJECT_NOT_FOUND+). + # + # @example + # # Step-up check before a destructive action + # if user.verify_password(params[:current_password]) + # user.destroy + # end + def verify_password(password) + response = client.verify_password(username.to_s, password.to_s) + return true if response.success?
+ + case response.code + when Parse::Response::ERROR_EMAIL_NOT_FOUND + raise Parse::Error::EmailNotVerifiedError, + "verify_password failed: email address is not verified (code=205)" + else + raise Parse::Error::AuthenticationError, + "verify_password failed: " \ + "#{response.error || "HTTP #{response.http_status}"} (code=#{response.code.inspect})" + end + end + # Return the transitive upward closure of role names this user # inherits permissions from. # diff --git a/lib/parse/model/core/embed_managed.rb b/lib/parse/model/core/embed_managed.rb index 3680c75..a6f56bd 100644 --- a/lib/parse/model/core/embed_managed.rb +++ b/lib/parse/model/core/embed_managed.rb @@ -128,6 +128,39 @@ def self.included(base) base.extend(ClassMethods) end + # Recompute this record's managed embedding(s) in-place, NOW, + # without a save. Runs the same digest-tracked recompute the + # `before_save` callback runs: a provider call happens only when the + # source text/URL changed since the last embed (digest miss). Useful + # to populate the vector before inspecting it, or to force a refresh + # in a console. + # + # @param field [Symbol, nil] limit to one embed target; nil + # recomputes every declared directive. + # @return [self] + # @raise [ArgumentError] when `field:` names no embed target, or the + # class declares no `embed` directives. + def compute_embedding!(field: nil) + directives = self.class.embed_directives + if directives.empty? + raise ArgumentError, "#{self.class}#compute_embedding!: no `embed` directives declared." + end + selected = + if field + d = directives[field.to_sym] + unless d + raise ArgumentError, + "#{self.class}#compute_embedding!: :#{field} is not an embed target " \ + "(have #{directives.keys.inspect})." 
+ end + [d] + else + directives.values + end + selected.each { |directive| Parse::Core::EmbedManaged.recompute_embedding!(self, directive) } + self + end + module ClassMethods # Per-class registry of {EmbedDirective}s keyed by target vector # property symbol. Read by tests and tooling; written only by @@ -300,6 +333,86 @@ def embed_image(source_field, into:, input_type: :search_document, into end + # Backfill embeddings for records whose managed vector field is + # still null — the bulk counterpart to the per-save embed path. + # Walks the class with objectId-cursor pagination (robust to the + # result set shrinking as records are embedded; terminates even + # when a record has no source text and stays null), saving each + # pending record so its `before_save` embed callback runs. + # + # Intended as an admin / maintenance operation: it reads and + # writes through the default client, so run it with a master-key + # client (or pass `save_opts:` carrying a `session_token:` that can + # write every row). + # + # @param field [Symbol, nil] limit the backfill to one embed + # target; nil processes every declared directive. + # @param batch_size [Integer] rows fetched per round (default 100). + # @param limit [Integer, nil] stop after embedding at most this + # many records across all directives; nil = no cap. + # @param where [Hash, nil] extra query constraints AND-ed with the + # null-target filter (e.g. `{ published: true }`). + # @param save_opts [Hash] options forwarded to each `record.save` + # (e.g. `session_token:`). + # @return [Integer] number of records saved (embedded). + # @raise [ArgumentError] when `field:` names no embed target, or + # the class declares no `embed` directives. + def embed_pending!(field: nil, batch_size: 100, limit: nil, where: nil, save_opts: {}) + bs = Integer(batch_size) + raise ArgumentError, "#{self}.embed_pending!: batch_size must be positive." 
if bs <= 0 + directives = resolve_embed_directives_for_backfill(field) + + processed = 0 + directives.each do |directive| + remaining = limit ? (limit - processed) : nil + break if remaining && remaining <= 0 + processed += backfill_embed_directive!(directive, bs, where, remaining, save_opts) + end + processed + end + + # @!visibility private + def resolve_embed_directives_for_backfill(field) + if field + d = embed_directives[field.to_sym] + unless d + raise ArgumentError, + "#{self}.embed_pending!: :#{field} is not an embed target " \ + "(have #{embed_directives.keys.inspect})." + end + [d] + else + ds = embed_directives.values + raise ArgumentError, "#{self}.embed_pending!: no `embed` directives declared." if ds.empty? + ds + end + end + + # @!visibility private + # objectId-cursor walk over rows where `directive.into` is null. + def backfill_embed_directive!(directive, batch_size, where, remaining, save_opts) + count = 0 + cursor = nil + loop do + q = query(directive.into.null => true) + q = q.where(where) if where.is_a?(Hash) && !where.empty? + q = q.where(:objectId.gt => cursor) if cursor + q.order(:objectId.asc) + q.limit(batch_size) + batch = q.results + break if batch.nil? || batch.empty? 
+ + batch.each do |record| + cursor = record.id + record.save(**save_opts) + count += 1 + return count if remaining && count >= remaining + end + break if batch.length < batch_size + end + count + end + # @!visibility private # Prepend a module that intercepts the public `=` setter # and raises {ProtectedFieldError} unless the current thread has diff --git a/lib/parse/model/core/querying.rb b/lib/parse/model/core/querying.rb index a97ccc4..f5d777b 100644 --- a/lib/parse/model/core/querying.rb +++ b/lib/parse/model/core/querying.rb @@ -554,7 +554,7 @@ def cursor(constraints = {}, limit: 100, order: nil) # @return [Parse::LiveQuery::Subscription] the subscription object # @see Parse::LiveQuery::Subscription # @see Parse::Query#subscribe - def subscribe(where: {}, fields: nil, session_token: nil, client: nil, + def subscribe(where: {}, fields: nil, keys: nil, watch: nil, session_token: nil, client: nil, use_master_key: false, &block) # Fall through to the ambient set by `Parse.with_session` / `Parse.login` # so a caller wrapping a region with `with_session(user) { Klass.subscribe ... }` @@ -565,6 +565,8 @@ def subscribe(where: {}, fields: nil, session_token: nil, client: nil, end query(where).subscribe( fields: fields, + keys: keys, + watch: watch, session_token: session_token, client: client, use_master_key: use_master_key, diff --git a/lib/parse/model/core/vector_searchable.rb b/lib/parse/model/core/vector_searchable.rb index 4a4a9a7..d199085 100644 --- a/lib/parse/model/core/vector_searchable.rb +++ b/lib/parse/model/core/vector_searchable.rb @@ -63,6 +63,54 @@ class IndexNotResolved < ArgumentError; end # binding it to a provider at the property level. class EmbedderNotConfigured < ArgumentError; end + # Accepted {#vector_visibility} modes. + VECTOR_VISIBILITY_MODES = %i[owner_only public].freeze + + # Class-level default for whether this class's `:vector` properties + # are included in `as_json` serialization. 
+ # + # * `:owner_only` (default) — vectors are OMITTED from `as_json` + # unless the caller passes `include_vectors: true`. Embeddings are + # large and leak ML signal; the safe default keeps them off the + # wire and out of API responses. Row-level read access is still + # governed by ACL as usual — this controls serialization exposure, + # not row authorization. + # * `:public` — vectors are INCLUDED in `as_json` by default (a + # caller can still suppress per-call with `include_vectors: false`). + # + # class Article < Parse::Object + # vector_visibility :public # expose embeddings in as_json + # property :embedding, :vector, dimensions: 1536, provider: :openai + # end + # + # Read the effective mode by calling with no argument; it inherits + # from the superclass when unset on the subclass. + # + # @param mode [Symbol, nil] one of {VECTOR_VISIBILITY_MODES}, or nil + # to read the current effective mode. + # @return [Symbol] the effective mode (when reading) or the mode set. + # @raise [ArgumentError] on an unknown mode. + def vector_visibility(mode = nil) + if mode.nil? + return @vector_visibility if defined?(@vector_visibility) && @vector_visibility + return superclass.vector_visibility if superclass.respond_to?(:vector_visibility) + return :owner_only + end + m = mode.to_sym + unless VECTOR_VISIBILITY_MODES.include?(m) + raise ArgumentError, + "#{self}.vector_visibility: mode must be one of " \ + "#{VECTOR_VISIBILITY_MODES.inspect} (got #{mode.inspect})." + end + @vector_visibility = m + end + + # @return [Boolean] whether `:vector` fields are serialized into + # `as_json` by default for this class (true only for `:public`). + def vectors_public_by_default? + vector_visibility == :public + end + # Find documents whose declared `:vector` property is closest to # `vector:` under the Atlas vectorSearch index's similarity # function. 
@@ -169,6 +217,97 @@ def find_similar(vector: nil, text: nil, k: 10, field: nil, filter: nil, build_vector_hits(raw_hits) end + # Hybrid (lexical + vector) search with reciprocal-rank fusion. + # + # Runs a lexical Atlas Search branch and a `$vectorSearch` branch + # independently, then fuses their ranked results client-side via RRF + # (or, on Atlas 8.0+, server-side via native `$rankFusion` when + # detected). Both branches enforce ACL/CLP/protectedFields before + # fusion — see {Parse::VectorSearch::Hybrid}. + # + # @example + # Song.hybrid_search( + # text: "love songs about rain", + # lexical: { index: "song_search", query: "rain love" }, + # vector: { num_candidates: 200 }, + # k: 20, + # fusion: { k_constant: 60, weights: { lexical: 0.4, vector: 0.6 } }, + # ) + # + # @param text [String, nil] natural-language query. Embedded (via + # the resolved `:vector` property's `provider:`) for the vector + # branch, and used as the lexical query unless `lexical[:query]` + # overrides it. + # @param query_vector [Array, Parse::Vector, nil] pre-computed + # query embedding (alternative to `text:` for the vector branch). + # @param lexical [Hash] lexical branch config (`:query`, `:index`, + # `:fields`, `:filter`, `:fuzzy`). `:query` defaults to `text:`. + # @param vector [Hash] vector branch config (`:field`, `:index`, + # `:num_candidates`, `:filter`, `:vector_filter`). `:field` + # defaults to the class's sole `:vector` property; `:index` is + # auto-discovered when omitted. + # @param k [Integer] number of fused hits to return. + # @param fusion [Hash, nil] `:method` (`:rrf` / `:rrf_client`), + # `:k_constant`, `:weights` (`{ lexical:, vector: }`). + # @param raw [Boolean] return fused raw rows instead of built + # Parse::Object instances. + # @param scope_opts [Hash] ACL/CLP scope kwargs forwarded to both + # branches (`session_token:` / `master:` / `acl_user:` / + # `acl_role:`). 
+ # @return [Array] fused, RRF-ordered; each carries + # `#hybrid_score` and `#hybrid_ranks` (and `#vector_score` / + # `#search_score` when the branch contributed). `raw: true` + # returns the fused Hashes. + def hybrid_search(text: nil, query_vector: nil, lexical: {}, vector: {}, + k: 20, fusion: nil, raw: false, **scope_opts) + require_relative "../../vector_search/hybrid" + lex = (lexical || {}).transform_keys(&:to_sym) + vec = (vector || {}).transform_keys(&:to_sym) + + field_sym = resolve_vector_field!(vec[:field]) + declared_dims = vector_properties.dig(field_sym, :dimensions) + + qv = query_vector || vec[:query_vector] + qv = + if qv.nil? + unless text.is_a?(String) && !text.strip.empty? + raise ArgumentError, + "#{self}.hybrid_search: pass `text:` (to embed) or a `query_vector:`." + end + embed_query_text!(text, field_sym) + else + coerce_query_vector(qv) + end + Parse::VectorSearch.validate_query_vector!(qv, dimensions: declared_dims) + + lexical_query = lex[:query] || text + unless lexical_query.is_a?(String) && !lexical_query.strip.empty? + raise ArgumentError, + "#{self}.hybrid_search: needs a lexical query — pass `text:` or `lexical: { query: }`." 
+ end + + vector_index = vec[:index] || resolve_vector_index!(field_sym, nil) + + fused = Parse::VectorSearch::Hybrid.search( + parse_class, + lexical: { + query: lexical_query, index: lex[:index], fields: lex[:fields], + filter: lex[:filter], fuzzy: lex[:fuzzy], + }, + vector: { + query_vector: qv, field: field_sym, index: vector_index, + num_candidates: vec[:num_candidates], filter: vec[:filter], + vector_filter: vec[:vector_filter], + }, + k: k, + fusion: fusion, + **scope_opts, + ) + + return fused if raw + build_hybrid_hits(fused) + end + private def resolve_vector_field!(field) @@ -280,6 +419,28 @@ def build_vector_hits(raw_hits) obj end.compact end + + # Build Parse::Object instances from fused hybrid rows, attaching + # the fused score / per-branch ranks plus whatever per-branch scores + # survived the merge (`_vscore`, `_score`). + def build_hybrid_hits(rows) + return [] if rows.nil? || rows.empty? + converted = Parse::MongoDB.convert_documents_to_parse(rows, parse_class) + converted.each_with_index.map do |doc, idx| + obj = Parse::Object.build(doc, parse_class) + next nil unless obj + src = rows[idx] + hscore = src["_hybrid_score"] || src[:_hybrid_score] + hranks = src["_hybrid_ranks"] || src[:_hybrid_ranks] + vscore = src["_vscore"] || src[:_vscore] + sscore = src["_score"] || src[:_score] + obj.instance_variable_set(:@_hybrid_score, hscore) unless hscore.nil? + obj.instance_variable_set(:@_hybrid_ranks, hranks) unless hranks.nil? + obj.instance_variable_set(:@_vector_score, vscore) unless vscore.nil? + obj.instance_variable_set(:@_search_score, sscore) unless sscore.nil? + obj + end.compact + end end end end diff --git a/lib/parse/model/object.rb b/lib/parse/model/object.rb index 0fc23e3..4a31540 100644 --- a/lib/parse/model/object.rb +++ b/lib/parse/model/object.rb @@ -182,6 +182,14 @@ def search_score; @_search_score; end # @return [Hash, nil] Atlas Search highlights blob. 
def search_highlights; @_search_highlights; end + # @return [Float, nil] fused reciprocal-rank-fusion score from + # `Class.hybrid_search`. + def hybrid_score; @_hybrid_score; end + + # @return [Hash, nil] per-branch 1-based ranks from + # `Class.hybrid_search` (`{ lexical:, vector: }`). + def hybrid_ranks; @_hybrid_ranks; end + # @return [Model::TYPE_OBJECT] def __type; Parse::Model::TYPE_OBJECT; end @@ -384,7 +392,12 @@ def parse_class(remoteName = nil) end # The set of default ACLs to be applied on newly created instances of this class. - # By default, public read and write are enabled unless {default_acl_private} is true. + # The result follows the class's {acl_policy_setting}: the shipped default + # policy is `:owner_else_private`, whose fallback half is {Parse::ACL.private} + # (an empty ACL — readable only by the master key until an owner is resolved + # at save time). Classes that opt into a `:public*` policy, or set + # {default_acl_private} / {set_default_acl}, get the corresponding permissions + # instead. # @see Parse::ACL.everyone # @see Parse::ACL.private # @return [Parse::ACL] the current default ACLs for this class. @@ -398,8 +411,10 @@ def default_acls end # A method to set default ACLs to be applied for newly created - # instances of this class. All subclasses have public read and write enabled - # by default. + # instances of this class. Unless overridden, subclasses inherit the + # shipped `:owner_else_private` policy (records are private/master-only + # until an owner is resolved at save time); use this method (or + # {acl_policy}) to grant broader access. # @example # class AdminData < Parse::Object # @@ -1223,8 +1238,16 @@ def as_json(opts = nil) # signal to clients, and they round-trip through the dedicated # embed/find_similar pipelines rather than the standard REST # save/find. Pass `include_vectors: true` to opt back in (e.g., - # for tests or internal mongo-direct bulk writes). 
- unless opts[:include_vectors] == true + # for tests or internal mongo-direct bulk writes). A class may flip + # the per-class default with `vector_visibility :public`; an explicit + # `include_vectors:` in the call always wins over the class default. + include_vectors = + if opts.key?(:include_vectors) + opts[:include_vectors] == true + else + self.class.respond_to?(:vectors_public_by_default?) && self.class.vectors_public_by_default? + end + unless include_vectors vector_fields = self.class.respond_to?(:fields) ? self.class.fields(:vector).keys.map(&:to_s) : [] if vector_fields.any? except = Array(opts[:except]).map(&:to_s) | vector_fields diff --git a/lib/parse/mongodb.rb b/lib/parse/mongodb.rb index 5876973..bf286fe 100644 --- a/lib/parse/mongodb.rb +++ b/lib/parse/mongodb.rb @@ -1499,7 +1499,7 @@ def assert_role_subtree_users_pipeline_shape!(pipeline, role_id, graph_depth) # @raise [Parse::ACLScope::ACLRequired] when neither # `session_token:` nor `master: true` is supplied and # {Parse::ACLScope.require_session_token} is enabled. - def aggregate(collection_name, pipeline, max_time_ms: nil, rewrite_lookups: nil, allow_internal_fields: false, session_token: nil, master: nil, acl_user: nil, acl_role: nil, read_preference: nil) + def aggregate(collection_name, pipeline, max_time_ms: nil, rewrite_lookups: nil, allow_internal_fields: false, session_token: nil, master: nil, acl_user: nil, acl_role: nil, read_preference: nil, hint: nil) # AS::N envelope. Payload is intentionally metadata-only — # `stage_count`, `stage_types`, `collection`, `scope`, # `result_count`, `max_time_ms`, `read_preference`. Pipeline @@ -1620,6 +1620,11 @@ def aggregate(collection_name, pipeline, max_time_ms: nil, rewrite_lookups: nil, agg_opts = {} agg_opts[:max_time_ms] = max_time_ms if max_time_ms + # Forced index hint (Query#hint). Mirrors Parse Server's REST `hint` + # on the mongo-direct path so a bad plan diagnosed with `explain` can + # be corrected here too. 
Accepts an index name (String) or a key + # pattern (Hash). + agg_opts[:hint] = hint unless hint.nil? coll = collection(collection_name) if (mode = normalize_read_preference(read_preference)) coll = coll.with(read: { mode: mode }) @@ -1838,6 +1843,7 @@ def find(collection_name, filter = {}, **options) cursor = cursor.skip(options[:skip]) if options[:skip] cursor = cursor.sort(options[:sort]) if options[:sort] cursor = cursor.projection(options[:projection]) if options[:projection] + cursor = cursor.hint(options[:hint]) unless options[:hint].nil? cursor = cursor.max_time_ms(max_time_ms) if max_time_ms results = cursor.to_a diff --git a/lib/parse/pipeline_security.rb b/lib/parse/pipeline_security.rb index 86bd001..729b89d 100644 --- a/lib/parse/pipeline_security.rb +++ b/lib/parse/pipeline_security.rb @@ -182,13 +182,15 @@ def initialize(message, stage: nil, operator: nil, reason: nil) # that `Parse::AtlasSearch` calls do not break. `$vectorSearch` is # included for `Parse::VectorSearch` — like `$search`, it is a # read-only Atlas index stage and must be the FIRST stage of the - # pipeline (Atlas refuses it otherwise). + # pipeline (Atlas refuses it otherwise). `$rankFusion` (Atlas 8.0+) + # is the native server-side reciprocal-rank-fusion stage used by + # `Parse::VectorSearch::Hybrid` — also a read-only stage-0 operator. ALLOWED_STAGES = %w[ $match $group $sort $project $limit $skip $unwind $lookup $count $addFields $set $unset $bucket $bucketAuto $facet $sample $sortByCount $replaceRoot $replaceWith $redact $graphLookup $unionWith - $search $searchMeta $listSearchIndexes $vectorSearch + $search $searchMeta $listSearchIndexes $vectorSearch $rankFusion ].freeze # Atlas operators that are valid only as the FIRST stage of a @@ -202,7 +204,7 @@ def initialize(message, stage: nil, operator: nil, reason: nil) # for full-text and vector search is the dedicated # `atlas_search` / `semantic_search` tools, not raw aggregate. 
STAGE0_ONLY_ATLAS_STAGES = %w[ - $search $searchMeta $vectorSearch $listSearchIndexes + $search $searchMeta $vectorSearch $listSearchIndexes $rankFusion ].freeze # Cap on the length of a caller-supplied `$regex` (or the `regex:` diff --git a/lib/parse/query.rb b/lib/parse/query.rb index fcfac2d..dea13da 100644 --- a/lib/parse/query.rb +++ b/lib/parse/query.rb @@ -370,6 +370,19 @@ def all(table, constraints = { limit: :max }) # @param where [Array] an array of {Parse::Constraint} objects. # @return [Hash] a hash representing the compiled query, with # internal routing markers stripped. + # One-shot process latch so {#warn_if_public_explain_restricted!} emits + # the allowPublicExplain guidance at most once per process rather than on + # every explain call. + # @!visibility private + def public_explain_warned? + @public_explain_warned == true + end + + # @!visibility private + def public_explain_warned! + @public_explain_warned = true + end + def compile_where(where) constraint_reduce(where).reject { |k, _| k.is_a?(String) && k.start_with?("__") } end @@ -480,6 +493,7 @@ def initialize(table, constraints = {}) @where = [] @order = [] @keys = [] + @exclude_keys = [] @includes = [] @limit = nil @skip = 0 @@ -495,6 +509,7 @@ def initialize(table, constraints = {}) # unless the caller said otherwise. @use_master_key = nil @verbose_aggregate = false + @hint = nil conditions constraints end # initialize @@ -616,6 +631,50 @@ def keys(*fields) alias_method :select_fields, :keys + # Set a server-side field denylist for this query. + # When set, Parse Server excludes the named fields from each returned + # object, complementing the {#keys} allowlist. The two options can be + # combined: Parse Server first applies the {#keys} allowlist, then + # strips any field names listed in +excludeKeys+. + # + # @note On the REST query path (+encode: true+ in {#compile}) this maps to + # Parse Server's path-scoped +excludeKeys+. 
On the mongo-direct path + # (explicit +.results_direct+, an auto-route, or an aggregation that + # auto-promotes — e.g. an +$inQuery+ pointer constraint that rewrites to + # a +$lookup+) the pipeline can only project the {#keys} allowlist, so + # the SDK honors the denylist as a post-fetch sanitize over the returned + # results instead. That mongo-direct sanitize is recursive by name: it + # strips EVERY key with a matching name at any depth, so excluding a + # field also removes a same-named field inside included/nested objects — + # broader than the REST path's top-level/dotted scoping. Reserved + # envelope fields (+objectId+, +className+, +__type+, +createdAt+, + # +updatedAt+, +ACL+ and their Mongo storage-form names) are never + # stripped, so object reconstruction is unaffected. The raw aggregation + # accessor (`aggregate(...).raw`) returns unredacted documents — the + # sanitize applies to the object/decoded result paths. +excludeKeys+ is a + # projection convenience, not an ACL/CLP boundary, so it does not affect + # access control. + # + # @example Omit a single sensitive field + # Post.query.exclude_keys(:secret_token).results + # + # @example Omit multiple fields + # Post.query.exclude_keys(:secret_token, :internal_notes).results + # + # @param fields [Array] the field names to exclude. + # @return [self] + def exclude_keys(*fields) + @exclude_keys ||= [] + fields.flatten.each do |field| + if field.nil? == false && field.respond_to?(:to_s) + @exclude_keys.push Query.format_field(field).to_sym + end + end + @exclude_keys.uniq! + @results = nil if fields.count > 0 + self # chaining + end + # Extract values for a specific field from all matching objects. # This is similar to keys() but returns an array of the actual field values # instead of objects with only those fields selected. @@ -792,6 +851,31 @@ def read_pref(preference) self end + # Set a MongoDB index hint for this query. 
+ # Forces Parse Server (and the underlying MongoDB driver) to use the + # named index instead of the query planner's choice. Useful for + # benchmarking or for working around sub-optimal plan selection. + # The hint is emitted in the compiled REST query body as the +hint+ + # parameter (supported by Parse Server 7.4.0+) AND forwarded to the + # mongo-direct path — +results_direct+ / +count_direct+ / +distinct_direct+ + # pass it to {Parse::MongoDB.aggregate}/{Parse::MongoDB.find} as the Mongo + # +hint+ option, so a plan diagnosed with {#explain} can be corrected on + # either path. + # + # @example Force a specific index + # Post.query(:status => "published").hint("status_1_created_at_-1").results + # + # @param index_name [String, nil, :_read_] the index name or key pattern to use, + # or +nil+ to clear a previously set hint. Called with no arguments acts as a + # reader and returns the current hint value. + # @return [String, nil, self] + HINT_UNSET = :_hint_unset_ # @!visibility private + def hint(index_name = HINT_UNSET) + return @hint if index_name.equal?(HINT_UNSET) + @hint = index_name + self + end + def related_to(field, pointer) raise ArgumentError, "Object value must be a Parse::Pointer type" unless pointer.is_a?(Parse::Pointer) add_constraint field.to_sym.related_to, pointer @@ -1462,24 +1546,125 @@ def _opts # Build headers for the query request def _headers headers = {} - if read_preference.present? - pref = read_preference.to_s.upcase.gsub("_", " ").split.join("_") - # Normalize common formats - pref = case pref - when "PRIMARY" then "PRIMARY" - when "PRIMARY_PREFERRED", "PRIMARYPREFERRED" then "PRIMARY_PREFERRED" - when "SECONDARY" then "SECONDARY" - when "SECONDARY_PREFERRED", "SECONDARYPREFERRED" then "SECONDARY_PREFERRED" - when "NEAREST" then "NEAREST" - else pref - end - if Parse::Protocol::READ_PREFERENCES.include?(pref) - headers[Parse::Protocol::READ_PREFERENCE] = pref - else - warn "[ParseQuery] Invalid read preference: #{read_preference}. 
Valid values: #{Parse::Protocol::READ_PREFERENCES.join(", ")}" + pref = normalized_read_preference + headers[Parse::Protocol::READ_PREFERENCE] = pref if pref + headers + end + + # Normalize the query's `read_pref` value to the canonical Parse Server + # token (`PRIMARY`, `PRIMARY_PREFERRED`, `SECONDARY`, `SECONDARY_PREFERRED`, + # `NEAREST`). Parse Server's `_parseReadPreference` upcases the incoming + # string and matches exactly these forms, so the SDK emits them verbatim. + # @return [String, nil] the canonical token, or nil when no preference is + # set. Warns and returns nil on an unrecognized value. + # @!visibility private + def normalized_read_preference + return nil unless read_preference.present? + pref = read_preference.to_s.upcase.gsub("_", " ").split.join("_") + pref = case pref + when "PRIMARY" then "PRIMARY" + when "PRIMARY_PREFERRED", "PRIMARYPREFERRED" then "PRIMARY_PREFERRED" + when "SECONDARY" then "SECONDARY" + when "SECONDARY_PREFERRED", "SECONDARYPREFERRED" then "SECONDARY_PREFERRED" + when "NEAREST" then "NEAREST" + else pref end + return pref if Parse::Protocol::READ_PREFERENCES.include?(pref) + warn "[ParseQuery] Invalid read preference: #{read_preference}. Valid values: #{Parse::Protocol::READ_PREFERENCES.join(", ")}" + nil + end + + # Proactive guidance for {#explain} on Parse Server 9.0+. PS 9.0 defaults + # `allowPublicExplain` to false, so a NON-master explain is rejected unless + # the operator re-enabled it server-side. That flag is not surfaced in + # `/serverInfo`, so we cannot know for certain whether the call will be + # allowed — we therefore WARN (one-shot) and still run the call: + # `allowPublicExplain: true` servers return the plan; restricted servers + # fail and {#explain}'s reactive enrichment explains why. 
+ # + # We warn only when the query is clearly non-master (explicit + # `use_master_key: false`, or a session-token scope) AND the server version + # is known to restrict it — so a master-default explain (the common case) + # and unknown-version servers don't get spurious noise. + # @!visibility private + def warn_if_public_explain_restricted! + non_master = use_master_key == false || + (session_token.present? && use_master_key != true) + return unless non_master + return unless client.respond_to?(:server_supports?) && client.respond_to?(:server_version) + return if client.server_version.to_s.empty? # known version only + return if client.server_supports?(:public_explain) + return if Parse::Query.public_explain_warned? + Parse::Query.public_explain_warned! + message = "[ParseQuery:Explain] Parse Server #{client.server_version} defaults " \ + "`allowPublicExplain` to false; a non-master explain will be rejected " \ + "unless the server enables it. Run explain with use_master_key: true, or " \ + "set `allowPublicExplain: true` in the server's databaseOptions." + if defined?(Parse) && Parse.respond_to?(:logger) && Parse.logger + Parse.logger.warn(message) + else + warn message end - headers + end + + # Honor the `exclude_keys` denylist on the mongo-direct path by redacting + # the matching fields from the fetched results in Ruby — the mongo-direct + # pipeline projects only the `keys` allowlist (Parse Server's REST + # `excludeKeys` has no mongo-direct equivalent), so without this the + # denylist would silently have no effect. This is a pure post-fetch + # sanitize over the Parse-format result hashes; it does NOT change the + # MongoDB query or pipeline. + # + # Semantics differ from the REST path: `excludeKeys` on REST is + # path-scoped (top-level / dotted), whereas this drops EVERY key with a + # matching name at ANY depth — so excluding `:name` also strips `name` + # from included/nested objects. 
This matches the "recursively drop all + # keys with that name" contract for the mongo-direct path. + # + # `exclude_keys` is a projection convenience, NOT an ACL/CLP boundary, so + # this redaction is about returned-object shape, not access control. + # + # Decode-critical structural keys are never stripped, so a query can ask + # to exclude e.g. `:objectId` without breaking object reconstruction. + # @param results [Array] Parse-format result hashes (mutated in place) + # @return [Array] the same array, with excluded keys removed + # @!visibility private + def redact_excluded_keys!(results) + return results unless @exclude_keys&.any? + names = @exclude_keys.map(&:to_s) - RESERVED_EXCLUDE_KEYS + return results if names.empty? + drop = names.to_set + results.each { |row| recursively_drop_keys!(row, drop) } + results + end + + # Reserved fields that {#redact_excluded_keys!} never strips: dropping these + # would break {#decode} (objectId / className / __type) or remove the + # required Parse envelope. Both the Parse-format names (objectId, createdAt, + # updatedAt, ACL) and their Mongo storage-form counterparts (_id, + # _created_at, _updated_at, _acl) are guarded, so the redaction is safe even + # if it is ever pointed at a raw Mongo document, and a caller can't break + # reconstruction by excluding e.g. `:_id`. This is an SDK safety choice, not + # an assertion about which fields Parse Server's REST `excludeKeys` strips. + RESERVED_EXCLUDE_KEYS = %w[ + objectId className __type createdAt updatedAt ACL + _id _created_at _updated_at _acl + ].freeze + + # Recursively delete every key named in +names+ from a nested + # Hash/Array structure, in place. Symbol and string keys both match. + # @param value [Object] a Hash, Array, or scalar + # @param names [Set] the key names to drop + # @!visibility private + def recursively_drop_keys!(value, names) + case value + when Hash + value.reject! 
{ |k, _| names.include?(k.to_s) } + value.each_value { |v| recursively_drop_keys!(v, names) } + when Array + value.each { |v| recursively_drop_keys!(v, names) } + end + value end # Performs the fetch request for the query. @@ -1921,11 +2106,18 @@ def results_direct(raw: false, max_time_ms: nil, session_token: nil, master: nil master: master, acl_user: acl_user, acl_role: acl_role, - read_preference: @read_preference) + read_preference: @read_preference, + hint: @hint) # Convert MongoDB documents to Parse format parse_results = Parse::MongoDB.convert_documents_to_parse(raw_results, @table) + # Honor exclude_keys on the mongo-direct path: the pipeline can only + # project the keys allowlist, so apply the denylist here as a post-fetch + # sanitize over the Parse-format hashes (before the raw/decode fork so + # both shapes are redacted). Does not alter the MongoDB query. + redact_excluded_keys!(parse_results) + if raw return parse_results.each(&block) if block_given? return parse_results @@ -2042,7 +2234,8 @@ def count_direct(session_token: nil, master: nil, acl_user: nil, acl_role: nil) master: master, acl_user: acl_user, acl_role: acl_role, - read_preference: @read_preference) + read_preference: @read_preference, + hint: @hint) # Extract count from result return 0 if raw_results.empty? @@ -2121,6 +2314,7 @@ def distinct_direct(field, return_pointers: false, order: nil, raw_results = Parse::MongoDB.aggregate(@table, pipeline, allow_internal_fields: true, read_preference: @read_preference, + hint: @hint, session_token: session_token, master: master, acl_user: acl_user, @@ -2239,7 +2433,8 @@ def atlas_search(query = nil, **options, &block) # SDK-built pipeline only — see results_direct for rationale. 
raw_results = Parse::MongoDB.aggregate(@table, pipeline, allow_internal_fields: true, - read_preference: @read_preference) + read_preference: @read_preference, + hint: @hint) # Convert results if options[:raw] @@ -3032,7 +3227,7 @@ def cursor(limit: 100, order: nil) # events can arrive. Optional. # @return [Parse::LiveQuery::Subscription] the subscription object # @see Parse::LiveQuery::Subscription - def subscribe(fields: nil, session_token: nil, client: nil, use_master_key: false, &block) + def subscribe(fields: nil, keys: nil, watch: nil, session_token: nil, client: nil, use_master_key: false, &block) require_relative "live_query" lq_client = client || Parse::LiveQuery.client @@ -3040,6 +3235,8 @@ def subscribe(fields: nil, session_token: nil, client: nil, use_master_key: fals @table, where: compile_where, fields: fields, + keys: keys, + watch: watch, session_token: session_token || @session_token, use_master_key: use_master_key, &block @@ -3063,11 +3260,22 @@ def subscribe(fields: nil, session_token: nil, client: nil, use_master_key: fals # @note This feature requires MongoDB explain support in Parse Server. # The format of the returned plan depends on the MongoDB version. def explain + warn_if_public_explain_restricted! compiled_query = compile compiled_query[:explain] = true - response = client.find_objects(@table, compiled_query.as_json, **_opts) + response = client.find_objects(@table, compiled_query.as_json, headers: _headers, **_opts) if response.error? - puts "[ParseQuery:Explain] #{response.error}" + # Parse Server 9.0+ defaults `allowPublicExplain` to false, so a + # non-master explain that worked on 8.x now returns a permission + # error. Surface that as actionable guidance instead of a bare 403. + if response.respond_to?(:permission_denied?) && response.permission_denied? 
+ puts "[ParseQuery:Explain] #{response.error} — Parse Server 9.0+ defaults " \ + "`allowPublicExplain` to false; query explain now requires the master key " \ + "(use_master_key: true) or `allowPublicExplain: true` in the server's " \ + "databaseOptions." + else + puts "[ParseQuery:Explain] #{response.error}" + end return {} end response.result @@ -3131,7 +3339,7 @@ def deduplicate_consecutive_match_stages(pipeline) # at the top-level stage. BLOCKED_PIPELINE_STAGES = Parse::PipelineSecurity::DENIED_OPERATORS - def aggregate(pipeline, verbose: nil, mongo_direct: nil, rewrite_lookups: nil) + def aggregate(pipeline, verbose: nil, mongo_direct: nil, rewrite_lookups: nil, raw_values: false, raw_field_names: false) validate_pipeline!(pipeline) # Auto-rewrite LLM-style $lookup stages against logical Parse class @@ -3275,7 +3483,8 @@ def aggregate(pipeline, verbose: nil, mongo_direct: nil, rewrite_lookups: nil) complete_pipeline = translate_pipeline_for_direct_mongodb(complete_pipeline) end - Aggregation.new(self, complete_pipeline, verbose: verbose, mongo_direct: use_mongo_direct || false) + Aggregation.new(self, complete_pipeline, verbose: verbose, mongo_direct: use_mongo_direct || false, + raw_values: raw_values, raw_field_names: raw_field_names) end # Apply the direct-MongoDB stage converter to every stage in a pipeline. @@ -3938,6 +4147,7 @@ def compile(encode: true, includeClassName: false) q[:include] = @includes.join(",") unless @includes.empty? q[:keys] = @keys.join(",") unless @keys.empty? + q[:excludeKeys] = @exclude_keys.join(",") if encode && @exclude_keys&.any? q[:order] = @order.join(",") unless @order.empty? unless @where.empty? 
q[:where] = Parse::Query.compile_where(@where) @@ -3949,6 +4159,17 @@ def compile(encode: true, includeClassName: false) q[:limit] = 0 q[:count] = 1 end + # Read preference must ride the REST query body (restOptions), NOT a + # header: Parse Server's middleware does not map any + # `X-Parse-Read-Preference` header into request options, so the + # header alone is silently ignored and the read always hits the + # primary. `RestQuery` reads `readPreference` from restOptions, so + # emitting it here is what actually routes the read. (The header is + # still sent for any intermediary that honors it; it is harmless.) + if encode && (pref = normalized_read_preference) + q[:readPreference] = pref + end + q[:hint] = @hint if @hint if includeClassName q[:className] = @table end @@ -5207,7 +5428,7 @@ def clone cloned_query = Parse::Query.new(self.instance_variable_get(:@table)) # Note: :client is intentionally excluded - it contains non-serializable objects # (Redis connections, Faraday connections) and should be obtained lazily - [:count, :where, :order, :keys, :includes, :limit, :skip, :cache, :use_master_key].each do |param| + [:count, :where, :order, :keys, :exclude_keys, :includes, :limit, :skip, :cache, :use_master_key, :hint].each do |param| if instance_variable_defined?(:"@#{param}") value = instance_variable_get(:"@#{param}") if value.is_a?(Array) || value.is_a?(Hash) @@ -5503,12 +5724,19 @@ class Aggregation # @param mongo_direct [Boolean] if true, uses MongoDB directly bypassing Parse Server (required for $literal) # @param max_time_ms [Integer, nil] optional server-side time limit in milliseconds passed to # {Parse::MongoDB.aggregate} when mongo_direct is true. Pass +nil+ (the default) for no cap. - def initialize(query, pipeline, verbose: nil, mongo_direct: false, max_time_ms: nil) + # @param raw_values [Boolean] when true, passes +rawValues: true+ to the Parse Server REST + # aggregate endpoint (PS 9.9.0+). Has no effect on the mongo-direct path. 
+ # @param raw_field_names [Boolean] when true, passes +rawFieldNames: true+ to the Parse Server + # REST aggregate endpoint (PS 9.9.0+). Has no effect on the mongo-direct path. + def initialize(query, pipeline, verbose: nil, mongo_direct: false, max_time_ms: nil, + raw_values: false, raw_field_names: false) @query = query @pipeline = pipeline @cached_response = nil @mongo_direct = mongo_direct @max_time_ms = max_time_ms + @raw_values = raw_values + @raw_field_names = raw_field_names # Use provided verbose setting, or fall back to query's verbose_aggregate setting @verbose = verbose.nil? ? @query.instance_variable_get(:@verbose_aggregate) : verbose end @@ -5532,6 +5760,8 @@ def execute! @query.instance_variable_get(:@table), @pipeline, headers: {}, + raw_values: @raw_values, + raw_field_names: @raw_field_names, **@query.send(:_opts), ) end @@ -5555,7 +5785,11 @@ def execute! def execute_direct!(max_time_ms: @max_time_ms) table = @query.instance_variable_get(:@table) auth_kwargs = @query.send(:mongo_direct_auth_kwargs) - Parse::MongoDB.aggregate(table, @pipeline, max_time_ms: max_time_ms, **auth_kwargs) + # Forward the parent query's index hint so `query.hint(...).aggregate(...)` + # honors it on the mongo-direct path too (parity with results_direct / + # count_direct / distinct_direct). + hint = @query.instance_variable_get(:@hint) + Parse::MongoDB.aggregate(table, @pipeline, max_time_ms: max_time_ms, hint: hint, **auth_kwargs) end # Returns processed results from the aggregation. @@ -5607,6 +5841,10 @@ def convert_aggregation_item(item) def convert_direct_aggregation_item(raw, table) if raw_is_parse_document?(raw) parse_doc = Parse::MongoDB.convert_document_to_parse(raw, table) + # Honor exclude_keys on this mongo-direct aggregation path (e.g. the + # $inQuery -> $lookup rewrite) by redacting the denylisted fields from + # the converted document before decode. Mirrors results_direct. 
+ @query.send(:redact_excluded_keys!, [parse_doc]) @query.send(:decode, [parse_doc]).first else AggregationResult.new(Parse::MongoDB.convert_aggregation_document(raw)) diff --git a/lib/parse/query/constraints.rb b/lib/parse/query/constraints.rb index 86087f3..194034d 100644 --- a/lib/parse/query/constraints.rb +++ b/lib/parse/query/constraints.rb @@ -499,6 +499,35 @@ def build end end + # Equivalent to the $containedBy Parse query operation. Matches documents + # where the array field's values are all within the supplied set (the + # inverse of {ContainsAllConstraint}: the field must be a *subset* of the + # provided array). The field column should be of type {Array} in your + # Parse class. + # + # q.where :field.contained_by => [1, 2, 3] + # q.where :tags.contained_by => ["ruby", "rails", "parse"] + # + # @see ContainsAllConstraint + # @see ContainedInConstraint + class ContainedByConstraint < Constraint + # @!method contained_by + # A registered method on a symbol to create the constraint. + # Maps to Parse operator "$containedBy". + # @example + # q.where :tags.contained_by => ["ruby", "rails"] + # @return [ContainedByConstraint] + constraint_keyword :$containedBy + register :contained_by + + # @return [Hash] the compiled constraint. + def build + val = formatted_value + val = [val].compact unless val.is_a?(Array) + { @operation.operand => { key => val } } + end + end + # Array size constraint using MongoDB aggregation. # Parse Server does not natively support $size query constraint, so we use # MongoDB aggregation pipeline with $expr and $size to check array length. diff --git a/lib/parse/retrieval/agent_tool.rb b/lib/parse/retrieval/agent_tool.rb index b4f8fc6..59caf9d 100644 --- a/lib/parse/retrieval/agent_tool.rb +++ b/lib/parse/retrieval/agent_tool.rb @@ -94,6 +94,14 @@ def semantic_search(agent, class_name: nil, query: nil, k: DEFAULT_K, # AccessDenied for an un-bound agent on a scoped class). 
scope = Parse::Agent::Tools.resolve_tenant_scope!(agent, cname) + # Per-tenant embedding spend cap (§16.10 — agent-tool exposure + # mitigation). semantic_search embeds attacker-controlled query + # text on every call; charge the estimated query tokens against + # the tenant's budget BEFORE embedding. HARD-REFUSES once the + # tenant is over cap. No-op when no limit is configured or for + # trusted admin agents. + charge_spend_cap!(agent, scope, query) + # Non-admin agents get quantized scores (membership-inference # defense); admin agents get full precision. Keyed on the # permission tier, not master-key posture. @@ -144,6 +152,47 @@ def semantic_search(agent, class_name: nil, query: nil, k: DEFAULT_K, envelope end + # @!visibility private + # Charge the estimated query-embedding token cost against the + # tenant's spend cap. The tenant key is the resolved tenant-scope + # value (so each tenant has its own budget); unscoped non-admin + # calls charge the shared default bucket. Admin agents are trusted + # and skip the cap entirely (mirrors the score-quantize tier check). + # + # A cap hit is surfaced as a structured error rather than the raw + # {Parse::Embeddings::SpendCap::Exceeded} — otherwise the agent's + # generic-error rescue would collapse it to an opaque "internal + # error" and the model couldn't self-correct. Two distinct cases: + # + # * Transient (`retry_after` non-nil): the window will roll off + # enough tokens to admit this charge. Surface as + # {Parse::Agent::RateLimitExceeded} (wire `error_code: + # :rate_limited`) carrying the real backoff hint so the model + # waits and retries. + # * Permanent (`retry_after` nil): the request alone exceeds the cap + # (`requested > limit`) and can NEVER fit, no matter how long the + # caller waits. Mapping that to a RateLimitExceeded would tell the + # model to back off and retry an unsatisfiable request — and it + # would also crash, since RateLimitExceeded#initialize calls + # `retry_after.round`. 
Surface as {Parse::Agent::ValidationError} + # so the model shrinks the query (or the operator raises the cap). + def charge_spend_cap!(agent, scope, query) + return if agent.permissions == :admin + tenant_id = scope && (scope[:value] || scope["value"]) + tokens = Parse::Embeddings::SpendCap.estimate_tokens(query) + Parse::Embeddings::SpendCap.charge!(tenant_id: tenant_id, tokens: tokens) + rescue Parse::Embeddings::SpendCap::Exceeded => e + if e.retry_after.nil? + raise Parse::Agent::ValidationError, + "semantic_search: query too large for the embedding spend cap " \ + "(#{e.requested} tokens requested, limit #{e.limit}/#{e.window}s). " \ + "Shorten the query or raise the cap." + end + raise Parse::Agent::RateLimitExceeded.new( + retry_after: e.retry_after, limit: e.limit, window: e.window, + ) + end + # @!visibility private # nil -> DEFAULT_MAX_TOTAL_TOKENS; <=0 -> nil (unlimited); else the int. def resolve_token_budget(max_total_tokens) diff --git a/lib/parse/retrieval/reranker.rb b/lib/parse/retrieval/reranker.rb new file mode 100644 index 0000000..fbe5b01 --- /dev/null +++ b/lib/parse/retrieval/reranker.rb @@ -0,0 +1,157 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +module Parse + module Retrieval + # Cross-encoder reranking for retrieved documents. + # + # A reranker takes a query and a list of candidate document texts and + # returns a relevance-ordered scoring. It runs AFTER the (vector, + # lexical, or hybrid) retrieval step and BEFORE chunking, reordering + # the retrieved documents by a more expensive cross-encoder relevance + # model than the first-stage similarity score. + # + # == Protocol + # + # A reranker is any object that responds to: + # + # #rerank(query:, documents:, top_n: nil) -> Array + # + # where `documents` is an Array and the return is an Array of + # {Result} (`index` into `documents`, plus `relevance_score`), + # descending by relevance. 
Implementations MUST: + # + # * Return at most `documents.length` results (and at most `top_n` + # when given). + # * Use 0-based `index` values that are valid positions in the input. + # * Never raise for an empty `documents` list — return `[]`. + # + # {Base} provides input validation and result normalization so + # adapters only implement the network call ({Base#rerank_scores}). + # + # @example wiring into retrieve + # reranker = Parse::Retrieval::Reranker::Cohere.new(api_key: ENV.fetch("COHERE_API_KEY")) + # chunks = Parse::Retrieval.retrieve(query: q, klass: Article, k: 30, + # rerank: reranker, rerank_top_n: 5) + module Reranker + # The Cohere `/v2/rerank` adapter is loaded lazily — it requires + # Faraday, which the core retrieval path does not. + autoload :Cohere, ::File.expand_path("reranker/cohere", __dir__) + + # Base error for the reranker layer. Adapters raise subclasses. + class Error < StandardError; end + + # Raised when a reranker returns a response that doesn't satisfy the + # protocol (bad index, non-numeric score, over-length result set). + class InvalidResponseError < Error; end + + # A single rerank result: the 0-based position of a document in the + # input list, plus its cross-encoder relevance score (higher is more + # relevant; range is provider-defined). + Result = Struct.new(:index, :relevance_score, keyword_init: true) + + # Common superclass: validates inputs, bounds `top_n`, and + # normalizes raw `(index, score)` pairs into sorted {Result}s. + # Concrete adapters implement {#rerank_scores}. + class Base + # Hard cap on the number of documents a single rerank call may + # carry, to bound provider cost / payload size. Providers + # typically cap around 1000; we stay conservative. + MAX_DOCUMENTS = 1000 + + # Rerank `documents` against `query`. + # + # @param query [String] the natural-language query. + # @param documents [Array] candidate document texts. + # @param top_n [Integer, nil] return at most this many results. 
+ # @return [Array] descending by `relevance_score`. + def rerank(query:, documents:, top_n: nil) + unless query.is_a?(String) && !query.strip.empty? + raise ArgumentError, "#{self.class}#rerank: query must be a non-empty String." + end + docs = Array(documents).map(&:to_s) + return [] if docs.empty? + if docs.length > MAX_DOCUMENTS + raise ArgumentError, + "#{self.class}#rerank: #{docs.length} documents exceeds MAX_DOCUMENTS=#{MAX_DOCUMENTS}." + end + n = top_n.nil? ? docs.length : [Integer(top_n), docs.length].min + n = docs.length if n <= 0 + + pairs = rerank_scores(query, docs, n) + normalize_results(pairs, docs.length, n) + end + + protected + + # Adapter hook: return an Array of `[index, score]` pairs (or + # {Result}s) for `documents`. `top_n` is a hint; the base class + # re-bounds and re-sorts regardless. + # + # @param query [String] + # @param documents [Array] + # @param top_n [Integer] + # @return [Array, Array] + def rerank_scores(query, documents, top_n) + raise NotImplementedError, "#{self.class}#rerank_scores must be implemented." + end + + private + + def normalize_results(pairs, doc_count, top_n) + results = Array(pairs).map do |p| + idx, score = + case p + when Result then [p.index, p.relevance_score] + when Array then [p[0], p[1]] + when Hash then [p[:index] || p["index"], p[:relevance_score] || p["relevance_score"]] + else + raise InvalidResponseError, "#{self.class}: unexpected rerank result element #{p.inspect}." + end + i = Integer(idx) + unless i >= 0 && i < doc_count + raise InvalidResponseError, + "#{self.class}: rerank index #{i} out of range 0...#{doc_count}." + end + unless score.is_a?(Numeric) && score.to_f.finite? + raise InvalidResponseError, + "#{self.class}: rerank relevance_score #{score.inspect} is not a finite number." + end + Result.new(index: i, relevance_score: score.to_f) + end + # Defensive: drop duplicate indices (keep the first / highest), + # then sort descending and bound to top_n. 
+ seen = {} + results.each { |r| seen[r.index] ||= r } + seen.values.sort_by { |r| [-r.relevance_score, r.index] }.first(top_n) + end + end + + # Deterministic, zero-network reranker for tests and offline use. + # Scores each document by lexical token overlap with the query + # (Jaccard-ish: shared unique lowercased word count, tie-broken by + # input order). No external dependency, fully reproducible. + class Fixture < Base + protected + + def rerank_scores(query, documents, _top_n) + q_tokens = tokenize(query) + documents.each_with_index.map do |doc, i| + d_tokens = tokenize(doc) + overlap = (q_tokens & d_tokens).length + # Normalize into a 0..1-ish score so output looks like a real + # relevance score; longer-overlap docs rank higher. + denom = [q_tokens.length, 1].max + [i, overlap.to_f / denom] + end + end + + private + + def tokenize(text) + text.to_s.downcase.scan(/[a-z0-9]+/).uniq + end + end + end + end +end diff --git a/lib/parse/retrieval/reranker/cohere.rb b/lib/parse/retrieval/reranker/cohere.rb new file mode 100644 index 0000000..bad50b6 --- /dev/null +++ b/lib/parse/retrieval/reranker/cohere.rb @@ -0,0 +1,218 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require "json" +require "uri" +require_relative "../reranker" + +module Parse + module Retrieval + module Reranker + # Cohere cross-encoder reranker. Wraps `POST /v2/rerank`. + # + # Cohere's rerank API takes a query plus a list of document strings + # and returns a relevance-ordered list of `{ index, relevance_score }` + # objects. It is a distinct endpoint from `/v1/embed` / + # `/v2/embed` — do NOT confuse it with + # {Parse::Embeddings::Cohere} (the embeddings provider). + # + # The HTTP stack mirrors the embeddings provider's hardening: + # explicit `proxy: nil` unless opted in, bounded timeouts, capped + # retries with backoff on 429/5xx, response-size cap, and a redacted + # `#inspect`. 
+ # + # @example + # reranker = Parse::Retrieval::Reranker::Cohere.new( + # api_key: ENV.fetch("COHERE_API_KEY"), + # model: "rerank-v3.5", + # ) + # reranker.rerank(query: "rain songs", documents: lyrics, top_n: 5) + class Cohere < Base + class AuthenticationError < Error; end + class RateLimitError < Error; end + class TransientError < Error; end + class BadRequestError < Error; end + + DEFAULT_BASE_URL = "https://api.cohere.com/v2" + DEFAULT_MODEL = "rerank-v3.5" + DEFAULT_TIMEOUT = 30 + DEFAULT_OPEN_TIMEOUT = 5 + DEFAULT_MAX_RETRIES = 2 + + # Cohere documents a cap of 1000 documents per rerank call; the + # {Base::MAX_DOCUMENTS} cap (1000) already enforces this. + MAX_RESPONSE_BYTES = 5 * 1024 * 1024 + + # @param api_key [String] Cohere API key. + # @param model [String] rerank model (default {DEFAULT_MODEL}). + # @param base_url [String] API base (default {DEFAULT_BASE_URL}). + # @param timeout [Integer] read timeout (seconds). + # @param open_timeout [Integer] connect timeout (seconds). + # @param max_retries [Integer] retry budget for 429 / 5xx / + # transient connection errors. + # @param allow_faraday_proxy [Boolean] permit Faraday to honor + # `*_proxy` env vars (default false — explicit `proxy: nil`). + def initialize(api_key:, model: DEFAULT_MODEL, base_url: DEFAULT_BASE_URL, + timeout: DEFAULT_TIMEOUT, open_timeout: DEFAULT_OPEN_TIMEOUT, + max_retries: DEFAULT_MAX_RETRIES, allow_faraday_proxy: false) + validate_api_key!(api_key) + @api_key = api_key + @model = model.to_s + raise ArgumentError, "Reranker::Cohere: model must be non-empty." if @model.empty? + @base_url = base_url.to_s + validate_base_url!(@base_url) + @timeout = Integer(timeout) + @open_timeout = Integer(open_timeout) + @max_retries = Integer(max_retries) + raise ArgumentError, "Reranker::Cohere: max_retries must be >= 0." if @max_retries.negative? + @allow_faraday_proxy = allow_faraday_proxy ? 
true : false + @connection = build_connection + end + + # @return [String] the rerank model name. + attr_reader :model + + def inspect + "#<#{self.class} model=#{@model.inspect} base=#{safe_base_host.inspect} " \ + "retries=#{@max_retries} api_key=[REDACTED]>" + end + + protected + + def rerank_scores(query, documents, top_n) + require_faraday! + body = { + "model" => @model, + "query" => query, + "documents" => documents, + "top_n" => top_n, + } + payload = post_rerank(body) + extract_results!(payload, documents.length) + end + + private + + def post_rerank(body) + attempts = 0 + loop do + attempts += 1 + begin + response = @connection.post("rerank") { |req| req.body = body.to_json } + rescue Faraday::TimeoutError, Faraday::ConnectionFailed => e + raise TransientError, "Reranker::Cohere: #{e.class} after #{attempts} attempt(s)." if attempts > @max_retries + sleep(backoff_seconds(attempts)) + next + end + + status = response.status + return parse_json_body!(response.body) if status >= 200 && status < 300 + + case status + when 401 + raise AuthenticationError, "Reranker::Cohere: 401 Unauthorized — check api_key." + when 429 + raise RateLimitError, "Reranker::Cohere: 429 rate limited after #{attempts} attempt(s)." if attempts > @max_retries + sleep(retry_after_seconds(response) || backoff_seconds(attempts)) + when 500..599 + raise TransientError, "Reranker::Cohere: #{status} after #{attempts} attempt(s)." if attempts > @max_retries + sleep(backoff_seconds(attempts)) + else + raise BadRequestError, "Reranker::Cohere: #{status} from POST /rerank." + end + end + end + + # Cohere v2 /rerank response shape: + # { "id": "...", "results": [ { "index": 0, "relevance_score": 0.98 }, ... ], + # "meta": { "billed_units": { "search_units": 1 } } } + def extract_results!(payload, doc_count) + unless payload.is_a?(Hash) + raise InvalidResponseError, "Reranker::Cohere: response body is not a JSON object." 
+ end + results = payload["results"] + unless results.is_a?(Array) + raise InvalidResponseError, "Reranker::Cohere: response.results is not an Array." + end + results.map do |r| + unless r.is_a?(Hash) + raise InvalidResponseError, "Reranker::Cohere: rerank result is not an object (#{r.inspect})." + end + Result.new(index: r["index"], relevance_score: r["relevance_score"]) + end + end + + def parse_json_body!(body) + s = body.to_s + if s.bytesize > MAX_RESPONSE_BYTES + raise InvalidResponseError, + "Reranker::Cohere: response body exceeds #{MAX_RESPONSE_BYTES} bytes (#{s.bytesize})." + end + JSON.parse(s, max_nesting: 32) + rescue JSON::ParserError => e + raise InvalidResponseError, "Reranker::Cohere: response is not valid JSON (#{e.message})." + end + + def build_connection + require_faraday! + headers = { + "Authorization" => "Bearer #{@api_key}", + "Content-Type" => "application/json", + "Accept" => "application/json", + "User-Agent" => "parse-stack-reranker/#{Parse::Stack::VERSION rescue "0"}", + } + # base_url must end with a trailing slash so Faraday resolves the + # relative "rerank" path under /v2/ rather than replacing it. + base = @base_url.end_with?("/") ? @base_url : "#{@base_url}/" + faraday_opts = { url: base, headers: headers } + faraday_opts[:proxy] = nil unless @allow_faraday_proxy + conn = Faraday.new(**faraday_opts) do |f| + f.options.timeout = @timeout + f.options.open_timeout = @open_timeout + f.adapter Faraday.default_adapter + end + conn.proxy = nil if !@allow_faraday_proxy && conn.respond_to?(:proxy=) + conn + end + + def backoff_seconds(attempt) + [0.5 * (2**(attempt - 1)), 30.0].min + end + + def retry_after_seconds(response) + ra = response.respond_to?(:headers) ? response.headers["retry-after"] || response.headers["Retry-After"] : nil + return nil unless ra + v = ra.to_f + v.positive? ? [v, 60.0].min : nil + end + + def validate_api_key!(api_key) + unless api_key.is_a?(String) && !api_key.empty? 
+ raise ArgumentError, "Reranker::Cohere: api_key must be a non-empty String." + end + end + + def validate_base_url!(base_url) + uri = URI.parse(base_url) + unless uri.is_a?(URI::HTTPS) || uri.is_a?(URI::HTTP) + raise ArgumentError, "Reranker::Cohere: base_url must be http(s) (got #{base_url.inspect})." + end + rescue URI::InvalidURIError => e + raise ArgumentError, "Reranker::Cohere: invalid base_url #{base_url.inspect} (#{e.message})." + end + + def safe_base_host + URI.parse(@base_url).host + rescue StandardError + "?" + end + + def require_faraday! + require "faraday" unless defined?(Faraday) + rescue LoadError + raise Error, "Reranker::Cohere requires the `faraday` gem." + end + end + end + end +end diff --git a/lib/parse/retrieval/retriever.rb b/lib/parse/retrieval/retriever.rb index 95cb43b..f5178b0 100644 --- a/lib/parse/retrieval/retriever.rb +++ b/lib/parse/retrieval/retriever.rb @@ -3,6 +3,7 @@ require_relative "chunker" require_relative "chunk" +require_relative "reranker" module Parse # Retrieval-augmented-generation (RAG) helpers. `Parse::RAG` is a @@ -98,25 +99,33 @@ def assert_no_underscore_keys!(obj, path = []) # propagates and aborts the whole call (fail-closed). Kept as an # injection point so this model-layer method stays free of any # agent-layer dependency. - # @param hybrid [Object, nil] reserved — raises {NotImplementedError} - # if truthy. Hybrid (vector + lexical) retrieval lands in a later - # release; the kwarg locks the API shape now. - # @param rerank [Object, nil] reserved — raises {NotImplementedError} - # if non-nil. Cross-encoder rerank lands in a later release. + # @param hybrid [Boolean, Hash, nil] when truthy, fuse a lexical + # Atlas Search branch with the `$vectorSearch` branch via + # reciprocal-rank fusion (see {Parse::Core::VectorSearchable#hybrid_search}). + # `true` uses defaults (lexical query = `query`); a Hash may carry + # `:lexical`, `:vector`, and `:fusion` sub-configs. 
+ # @param rerank [#rerank, nil] a {Parse::Retrieval::Reranker::Base} + # (or any object answering `#rerank(query:, documents:, top_n:)`). + # When present, retrieved documents are reordered by the + # cross-encoder relevance score BEFORE chunking, and the chunk score + # becomes the rerank relevance score. + # @param rerank_top_n [Integer, nil] keep only the top-N documents + # after reranking (defaults to all retrieved documents). # @param scope_opts [Hash] ACL/CLP scope kwargs forwarded verbatim to - # `find_similar`: `session_token:` / `acl_user:` / `acl_role:` / - # `master:`. + # `find_similar` / `hybrid_search`: `session_token:` / `acl_user:` / + # `acl_role:` / `master:`. # @return [Array] descending by score; chunk # order within a document is positional. def retrieve(query:, klass: nil, field: nil, text_field: nil, k: 10, filter: nil, vector_filter: nil, chunker: nil, tenant_scope: nil, score_quantize: false, source_transform: nil, hybrid: nil, rerank: nil, - **scope_opts) - raise NotImplementedError, - "Parse::Retrieval.retrieve: `hybrid:` is reserved for a future release." if hybrid - raise NotImplementedError, - "Parse::Retrieval.retrieve: `rerank:` is reserved for a future release." if rerank + rerank_top_n: nil, **scope_opts) + if rerank && !rerank.respond_to?(:rerank) + raise ArgumentError, + "Parse::Retrieval.retrieve: `rerank:` must respond to #rerank " \ + "(a Parse::Retrieval::Reranker::Base); got #{rerank.class}." + end # `class:` alias (reserved word — arrives via **scope_opts). 
klass ||= scope_opts.delete(:class) @@ -129,25 +138,60 @@ def retrieve(query:, klass: nil, field: nil, text_field: nil, k: 10, resolved_text_field = (text_field || infer_text_field!(klass)).to_sym merged_vector_filter = fold_tenant_scope(klass, vector_filter, tenant_scope) chunker ||= default_chunker + text_wire = wire_name(klass, resolved_text_field) - raw_hits = klass.find_similar( - text: query, - k: k, - field: field, - filter: filter, - vector_filter: merged_vector_filter, - raw: true, - **scope_opts, - ) + raw_hits = + if hybrid + fetch_hybrid_hits(klass, query, k, field, filter, merged_vector_filter, + tenant_scope, hybrid, scope_opts) + else + klass.find_similar( + text: query, k: k, field: field, filter: filter, + vector_filter: merged_vector_filter, raw: true, **scope_opts, + ) + end return [] if raw_hits.nil? || raw_hits.empty? - text_wire = wire_name(klass, resolved_text_field) + raw_hits = apply_rerank(rerank, query, raw_hits, text_wire, rerank_top_n) if rerank raw_hits.flat_map do |doc| build_chunks_for(doc, klass, text_wire, score_quantize, source_transform, chunker) end end + # @!visibility private + # Run the hybrid (lexical + vector) branch and return fused raw rows. + # Tenant scope is folded into BOTH branches: the vector branch via the + # Atlas pre-filter (`merged_vector_filter`) and the lexical branch via + # a post-`$search` `$match` (so neither branch leaks cross-tenant + # document existence). + def fetch_hybrid_hits(klass, query, k, field, filter, merged_vector_filter, + tenant_scope, hybrid, scope_opts) + cfg = hybrid.is_a?(Hash) ? hybrid : {} + lexical = (cfg[:lexical] || cfg["lexical"] || {}).dup + vector = (cfg[:vector] || cfg["vector"] || {}).dup + fusion = cfg[:fusion] || cfg["fusion"] + + lexical[:query] ||= query + # Tenant scope must be AUTHORITATIVE in BOTH branches. 
The previous + # `||=` form let a caller-supplied `vector[:vector_filter]` (or a + # colliding `lexical[:filter]`) REPLACE the tenant-folded filter + # rather than narrow within it — silently dropping tenant isolation + # and contradicting this method's "folded into BOTH branches" + # contract. `merge_filters` is last-wins, so ordering the tenant + # constraint LAST guarantees its key survives any caller collision: + # callers can narrow the result set but never escape their tenant. + lexical[:filter] = merge_filters(filter, lexical[:filter], tenant_filter_hash(klass, tenant_scope)) + vector[:field] ||= field unless field.nil? + vector[:filter] = merge_filters(vector[:filter], filter) + vector[:vector_filter] = merge_filters(vector[:vector_filter], merged_vector_filter) + + klass.hybrid_search( + text: query, lexical: lexical, vector: vector, + k: k, fusion: fusion, raw: true, **scope_opts, + ) + end + # @!visibility private def resolve_class!(klass) resolved = @@ -227,10 +271,53 @@ def fetch_field(doc, wire, sym) doc[sym] end + # @!visibility private + # Reorder retrieved documents by a cross-encoder reranker and stamp + # each surviving hit with its `_rerank_score`. The reranker scores the + # document's presentation text (the same `text_field` used for + # chunking). Index alignment between `documents` and `raw_hits` is + # preserved so the returned `index` maps back to the right hit. + def apply_rerank(reranker, query, raw_hits, text_wire, top_n) + documents = raw_hits.map { |doc| fetch_field(doc, text_wire, text_wire).to_s } + results = reranker.rerank(query: query, documents: documents, top_n: top_n) + results.map do |r| + hit = raw_hits[r.index] + next nil if hit.nil? + hit = hit.dup + hit["_rerank_score"] = r.relevance_score + hit + end.compact + end + + # @!visibility private + # Convert a `{ field:, value: }` tenant scope into a `{ wire => value }` + # filter hash (the lexical branch's post-`$search` `$match`), or nil. 
+ def tenant_filter_hash(klass, tenant_scope) + return nil if tenant_scope.nil? + field = tenant_scope[:field] || tenant_scope["field"] + return nil if field.nil? + value = tenant_scope.key?(:value) ? tenant_scope[:value] : tenant_scope["value"] + { wire_name(klass, field) => value } + end + + # @!visibility private + # Shallow-merge non-empty filter hashes (left-to-right; later keys + # win). Returns nil when nothing is left to apply. + def merge_filters(*filters) + merged = {} + filters.each do |f| + next if f.nil? || (f.respond_to?(:empty?) && f.empty?) + merged.merge!(f) + end + merged.empty? ? nil : merged + end + # @!visibility private def build_chunks_for(doc, klass, text_wire, score_quantize, source_transform, chunker) object_id = (doc["_id"] || doc[:_id] || doc["objectId"] || doc[:objectId]).to_s - raw_score = doc["_vscore"] || doc[:_vscore] + raw_score = doc["_rerank_score"] || doc[:_rerank_score] || + doc["_hybrid_score"] || doc[:_hybrid_score] || + doc["_vscore"] || doc[:_vscore] score = quantize_score(raw_score, score_quantize) text = fetch_field(doc, text_wire, text_wire) diff --git a/lib/parse/stack.rb b/lib/parse/stack.rb index c353813..1794bc6 100644 --- a/lib/parse/stack.rb +++ b/lib/parse/stack.rb @@ -940,6 +940,23 @@ def track_event(name, dimensions: {}, **opts) end Parse.client.send_analytics(event_name, dimensions, **opts) end + + # Capability probe against the connected Parse Server, delegated to the + # default client. Builds on the memoized `serverInfo` fetch — see + # {Parse::API::Server#server_supports?} for the capability table and the + # fail-open-to-modern semantics. + # @param feature [Symbol] a capability key. + # @return [Boolean] whether the connected server supports the feature. + def server_supports?(feature) + Parse.client.server_supports?(feature) + end + + # The coarse `features` block advertised by `GET /serverInfo`, delegated + # to the default client. 
@see Parse::API::Server#server_features + # @return [Hash] the advertised features block, or `{}` if unavailable. + def server_features + Parse.client.server_features + end end # Error raised when {Parse::CreateLock#synchronize} cannot acquire the diff --git a/lib/parse/stack/version.rb b/lib/parse/stack/version.rb index 667f0b8..dca8b8c 100644 --- a/lib/parse/stack/version.rb +++ b/lib/parse/stack/version.rb @@ -6,6 +6,6 @@ module Parse # The Parse Server SDK for Ruby module Stack # The current version. - VERSION = "5.3.0" + VERSION = "5.4.0" end end diff --git a/lib/parse/two_factor_auth/user_extension.rb b/lib/parse/two_factor_auth/user_extension.rb index e758302..ed5830c 100644 --- a/lib/parse/two_factor_auth/user_extension.rb +++ b/lib/parse/two_factor_auth/user_extension.rb @@ -155,7 +155,7 @@ def setup_mfa!(secret:, token:) }, } - response = client.update_user(id, { authData: auth_data_payload }, opts: { session_token: session_token }) + response = client.update_user(id, { authData: auth_data_payload }, session_token: session_token) if response.error? if response.result.to_s.include?("Invalid MFA") @@ -208,7 +208,7 @@ def setup_sms_mfa!(mobile:) }, } - response = client.update_user(id, { authData: auth_data_payload }, opts: { session_token: session_token }) + response = client.update_user(id, { authData: auth_data_payload }, session_token: session_token) if response.error? raise Parse::Client::ResponseError, response @@ -245,7 +245,7 @@ def confirm_sms_mfa!(mobile:, token:) }, } - response = client.update_user(id, { authData: auth_data_payload }, opts: { session_token: session_token }) + response = client.update_user(id, { authData: auth_data_payload }, session_token: session_token) if response.error? if response.result.to_s.include?("Invalid MFA token") @@ -276,27 +276,60 @@ def disable_mfa!(current_token:) raise MFA::NotEnabledError, "MFA is not enabled for this user" unless mfa_enabled? 
raise ArgumentError, "Current token is required" if current_token.blank? - # To disable, we need to update authData.mfa with the old token for validation - # and then set it to null - auth_data_payload = { - mfa: { - old: current_token, - secret: nil, # Setting to nil disables TOTP - }, - } - - response = client.update_user(id, { authData: auth_data_payload }, opts: { session_token: session_token }) - - if response.error? - if response.result.to_s.include?("Invalid MFA token") - raise MFA::VerificationError, response.result.to_s - end - raise Parse::Client::ResponseError, response + # Parse Server's TOTP adapter exposes no first-class "disable via authData + # update" path — its validateUpdate always re-runs setup, so a partial + # mfa payload is rejected outright. Disabling is therefore a two-step: + # + # 1. Prove possession of the current code by submitting it as + # `{ mfa: { old: } }`. In the *update* context (unlike a + # fresh login) the adapter validates that code against the stored + # secret. A WRONG code fails at validateLogin ("Invalid MFA token"); + # a CORRECT code passes validateLogin and is then blocked by the + # re-setup requirement ("Invalid MFA data") — which is precisely the + # signal that the code was accepted. (This re-entry of the current + # code is the deliberate confirmation gate for turning MFA off.) + # 2. Disable MFA by unlinking the provider with `{ mfa: nil }`. + # + # This keeps self-disable gated on a valid current code even though the + # server offers no dedicated TOTP self-disable endpoint. + verify = client.update_user(id, { authData: { mfa: { old: current_token } } }, + session_token: session_token) + # Classify the two-step response POSITIVELY instead of treating + # "anything that isn't success-or-one-magic-string" as a bad + # token. 
The current code is ACCEPTED iff the server either + # succeeds or rejects only the follow-on re-setup ("Invalid MFA + # data") — that block fires AFTER validateLogin has already + # accepted the code. A WRONG code fails earlier at validateLogin + # ("Invalid MFA token"). Any OTHER error (transport, session, 5xx) + # is a real fault surfaced as-is, not mislabeled a verification + # failure. + err = verify.error.to_s + code_rejected = err.match?(/Invalid MFA token/i) + code_accepted = verify.success? || err.match?(/Invalid MFA data/i) + if code_rejected + raise MFA::VerificationError, "Invalid MFA token" + elsif !code_accepted + raise Parse::Client::ResponseError, verify end - # Refresh auth_data - fetch + response = client.update_user(id, { authData: { mfa: nil } }, session_token: session_token) + raise Parse::Client::ResponseError, response if response.error? + + # CONFIRM the disable took effect from the SERVER's own view — a + # positive post-condition rather than trusting the unlink response + # alone. We must read the server directly here, NOT lean on the + # in-memory #mfa_enabled? projection: Parse Server omits +authData+ + # entirely for a user with no providers, so once MFA is unlinked an + # ordinary fetch carries no +authData+ key at all and therefore can + # never clear the +{ mfa: { status: "enabled" } }+ value pinned at + # enrollment. An enabled account's own (session-token) read returns + # +authData.mfa+; a disabled one omits it — so an absent/mfa-less + # authData on this trusted self-read is the authoritative signal. + if mfa_enabled_on_server? + raise MFA::VerificationError, "MFA disable did not take effect (still enabled after unlink)" + end + clear_local_mfa_projection! true end @@ -315,20 +348,26 @@ def disable_mfa!(current_token:) # # @param authorized_by [Parse::User, Parse::Pointer] the operator # performing the override. Required. 
- # @param admin_role [Parse::Role, String, nil] optional role (or role - # name) that +authorized_by+ must belong to. + # @param admin_role [Parse::Role, String, nil] role (or role name) + # that +authorized_by+ must belong to. Library-enforced. Either + # this or +allow_unverified: true+ is REQUIRED (fail-closed). + # @param allow_unverified [Boolean] explicitly accept caller-side + # authorization without a library role check. Defaults to +false+; + # must be set deliberately to bypass MFA without an +admin_role+. # @return [Boolean] True if disabled successfully. # @raise [ArgumentError] when +authorized_by:+ is missing or not a User. - # @raise [Parse::MFA::ForbiddenError] when +admin_role+ is supplied + # @raise [Parse::MFA::ForbiddenError] when neither +admin_role+ nor + # +allow_unverified:+ is supplied, or when +admin_role+ is supplied # and the operator is not a member. # - # @example Caller-verified authorization - # user.disable_mfa_master_key!(authorized_by: current_admin) - # - # @example Library-enforced role check + # @example Library-enforced role check (preferred) # user.disable_mfa_master_key!(authorized_by: current_admin, # admin_role: "Admin") - def disable_mfa_master_key!(authorized_by:, admin_role: nil) + # + # @example Caller-verified authorization (explicit opt-out) + # user.disable_mfa_master_key!(authorized_by: current_admin, + # allow_unverified: true) + def disable_mfa_master_key!(authorized_by:, admin_role: nil, allow_unverified: false) operator = authorized_by unless operator.is_a?(Parse::User) || (operator.is_a?(Parse::Pointer) && operator.parse_class == Parse::User.parse_class) @@ -340,6 +379,18 @@ def disable_mfa_master_key!(authorized_by:, admin_role: nil) raise ArgumentError, "authorized_by: User must be persisted (have an objectId)" end + # FAIL CLOSED: this method bypasses MFA verification entirely via + # the master key, so it refuses to run without SOME authorization + # signal. 
Either supply an `admin_role:` for the library to verify, + # or pass `allow_unverified: true` to deliberately assert that the + # caller has already authorized the operator out-of-band. + if admin_role.nil? && !allow_unverified + raise MFA::ForbiddenError, + "disable_mfa_master_key! refuses to bypass MFA without an authorization " \ + "check: pass admin_role: to enforce role membership, or " \ + "allow_unverified: true to explicitly accept caller-side authorization." + end + if admin_role role = admin_role.is_a?(Parse::Role) ? admin_role : Parse::Role.find_by_name(admin_role.to_s) if role.nil? @@ -357,14 +408,19 @@ def disable_mfa_master_key!(authorized_by:, admin_role: nil) end auth_data_payload = { mfa: nil } - response = client.update_user(id, { authData: auth_data_payload }, opts: { use_master_key: true }) + response = client.update_user(id, { authData: auth_data_payload }, use_master_key: true) if response.error? raise Parse::Client::ResponseError, response end - # Refresh auth_data + # Refresh auth_data, then drop the in-memory MFA projection. As in + # #disable_mfa!, a disabled user's read omits +authData+, so the + # +{ mfa: { status: "enabled" } }+ value pinned at enrollment won't + # self-clear on fetch — clear it explicitly so #mfa_enabled? reports + # the truth after a master-key disable. fetch + clear_local_mfa_projection! true end @@ -435,6 +491,42 @@ def mfa_qr_code(secret, issuer: nil, format: :svg) account_name = email.presence || username.presence || id MFA.qr_code(secret, account_name, issuer: issuer, format: format) end + + private + + # @!visibility private + # Authoritative server-side MFA check via a trusted self-read. + # Reads +authData.mfa+ straight from a fresh session-token fetch + # rather than the (possibly stale) in-memory projection. An enabled + # account returns +authData.mfa+ with a +status+/+secret+; a disabled + # one omits +authData+ — so absence (or an mfa-less authData) means + # disabled. 
+ # @return [Boolean] + def mfa_enabled_on_server? + result = client.fetch_object(self.class.parse_class, id, + session_token: session_token).result + mfa = result.is_a?(Hash) ? result["authData"] : nil + mfa = mfa["mfa"] if mfa.is_a?(Hash) + mfa.is_a?(Hash) && (mfa["status"] == "enabled" || mfa["secret"].present?) + end + + # @!visibility private + # Drop the in-memory MFA projection after a disable. A disabled user's + # server read omits +authData+ entirely, so an ordinary fetch can + # never clear the +{ mfa: { status: "enabled" } }+ value pinned at + # enrollment; do it explicitly here. Only the +mfa+ subkey is removed + # (any anonymous/OAuth authData is preserved), and the assignment runs + # through the non-dirtying hydration path inside a +with_authdata_trust+ + # scope so it is neither stripped nor marked dirty — a later #save will + # not resend +authData+. + def clear_local_mfa_projection! + cleared = auth_data.is_a?(Hash) ? auth_data.dup : {} + cleared.delete("mfa") + cleared.delete(:mfa) + self.class.with_authdata_trust do + apply_attributes!({ "authData" => cleared }, dirty_track: false) + end + end end # Not enabled error diff --git a/lib/parse/vector_search/hybrid.rb b/lib/parse/vector_search/hybrid.rb new file mode 100644 index 0000000..549791c --- /dev/null +++ b/lib/parse/vector_search/hybrid.rb @@ -0,0 +1,578 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../vector_search" + +module Parse + module VectorSearch + # Hybrid (lexical + vector) search with reciprocal-rank fusion. + # + # Lexical search (`Parse::AtlasSearch`, BM25/`$search`) nails + # exact-token matches — proper nouns, SKU codes, "OAuth 2.0". Vector + # search (`Parse::VectorSearch`, `$vectorSearch`) nails paraphrase — + # "login token spec". Fusing the two beats either alone on most real + # workloads. 
+ # + # == Why two aggregations (and not one `$facet`) + # + # `$vectorSearch` is explicitly prohibited inside `$facet`, + # `$lookup`, `$unionWith`, or any compound stage on every Atlas + # version, and it must be the FIRST stage of its pipeline. So on + # pre-Atlas-8.0 clusters the only correct shape is two independent + # aggregations followed by client-side reciprocal-rank fusion (RRF). + # On Atlas 8.0+ the native `$rankFusion` stage performs the same + # fusion server-side in a single round-trip; {.rank_fusion_supported?} + # detects it (probe-and-cache, not version-string parsing). + # + # == ACL / CLP enforcement + # + # The client-side path delegates each branch to an entry point that + # already enforces the full SDK-side chain — {Parse::AtlasSearch.search} + # (lexical) and {Parse::VectorSearch.search} (vector). Both apply the + # CLP `find` boundary, the post-stage `_rperm` `$match`, pointerFields + # filtering, `protectedFields` redaction, and the internal-fields + # denylist BEFORE returning rows. Fusion therefore operates only on + # rows the caller is already allowed to read; there is no separate + # hydration fetch to re-secure. The native `$rankFusion` path + # reproduces the same enforcement inline (CLP `find`, post-stage ACL + # `$match`, post-fetch redaction), mirroring {Parse::VectorSearch.search}. + # + # == Scores + # + # The vector branch projects `_vscore` (Atlas `vectorSearchScore`), + # the lexical branch `_score` (Atlas `searchScore`). The fused row + # carries `_hybrid_score` (the summed RRF weight) and `_hybrid_ranks` + # (`{ lexical: , vector: }`, 1-based, absent for a branch + # the row did not appear in). The raw branch scores are preserved on + # the row for callers that want them. + module Hybrid + # Raised on malformed fusion input (bad weights, non-positive + # `k_constant`, empty branch set). Inherits {ArgumentError} so it + # joins the other bad-input raises in a single rescue boundary. 
+ class FusionError < ArgumentError; end + + # Standard RRF rank constant. Larger values flatten the + # contribution curve (later ranks matter more); 60 is the value + # from the original Cormack et al. RRF paper and the Atlas + # `$rankFusion` default. + DEFAULT_K_CONSTANT = 60 + + # Default number of fused hits returned. + DEFAULT_K = 20 + + # Per-branch oversample multiplier. Each branch fetches + # `k * this` candidates so a row ranked low in one branch but high + # in the other still has a rank to fuse. Atlas's own `$rankFusion` + # uses a comparable internal oversample. + DEFAULT_OVERSAMPLE_MULTIPLIER = 5 + + # Hard ceiling on the fused result count, matching + # {Parse::VectorSearch::MAX_K}. + MAX_K = Parse::VectorSearch::MAX_K + + # TTL (seconds) for the {.rank_fusion_supported?} probe cache. A + # cluster gaining or losing `$rankFusion` support is a rare, + # operator-driven event (an Atlas major-version upgrade), so a + # 1-hour cache keeps the extra probe round-trip off the hot path. + PROBE_CACHE_TTL = 3600 + + class << self + # Pure reciprocal-rank fusion. Operates on already-fetched, + # already-ranked branch result lists — no I/O, no ACL concerns + # (the rows were enforced upstream). + # + # `fused_score(d) = Σ_b weight_b / (k_constant + rank_b(d))` + # + # @param branches [Hash{Symbol=>Array}] each value is a + # branch's result rows in descending relevance order (best + # first). Keys name the branch (`:lexical`, `:vector`). + # @param k_constant [Integer] RRF rank constant (> 0). + # @param weights [Hash{Symbol=>Numeric}, nil] per-branch weight. + # Missing branches default to weight 1.0; nil weights the whole + # set at 1.0. + # @return [Array] fused rows, descending by `_hybrid_score`, + # each carrying `_hybrid_score` and `_hybrid_ranks`. Ties are broken + # deterministically by objectId (stable for snapshots). + def rrf(branches, k_constant: DEFAULT_K_CONSTANT, weights: nil) + unless branches.is_a?(Hash) && !branches.empty? 
+ raise FusionError, "rrf: branches must be a non-empty Hash of ranked result lists." + end + kc = Integer(k_constant) + raise FusionError, "rrf: k_constant must be a positive integer (got #{kc})." if kc <= 0 + validate_weights!(weights) + + acc = {} + order = 0 + branches.each do |branch_name, rows| + weight = weight_for(weights, branch_name) + next if weight.zero? + Array(rows).each_with_index do |row, i| + id = row_id(row) + next if id.nil? + rank = i + 1 + entry = (acc[id] ||= { doc: row, score: 0.0, ranks: {}, seq: (order += 1) }) + entry[:doc] = merge_rows(entry[:doc], row) + entry[:score] += weight.to_f / (kc + rank) + entry[:ranks][branch_name] = rank + end + end + + acc.values + .sort_by { |e| [-e[:score], row_id(e[:doc]).to_s, e[:seq]] } + .map do |e| + row = e[:doc].dup + row["_hybrid_score"] = e[:score] + row["_hybrid_ranks"] = e[:ranks] + row + end + end + + # Detect whether the cluster backing `collection` supports the + # native `$rankFusion` aggregation stage (Atlas 8.0+). + # + # Probe-and-cache, NOT version-string parsing: Atlas upgrades + # cluster versions silently and the exact version where + # `$rankFusion` reached general availability has moved. We send a + # zero-cost behavioural probe (`[{$rankFusion: {input: {}}}, + # {$limit: 0}]`) and classify the response: success or any error + # OTHER than "unknown stage" means supported; an "Unknown + # aggregation stage" failure means unsupported. The result is + # cached per collection for {PROBE_CACHE_TTL}. + # + # @param collection [String] Parse class / Mongo collection name. + # @return [Boolean] + def rank_fusion_supported?(collection) + key = collection.to_s + now = monotonic + cached = probe_cache_get(key, now) + return cached unless cached.nil? + + supported = run_probe(key) + probe_cache_put(key, supported, now) + supported + end + + # Clear the {.rank_fusion_supported?} probe cache (all + # collections, or one). Mainly for tests that toggle cluster + # behaviour between cases. 
+ # + # @param collection [String, nil] + def clear_probe_cache(collection = nil) + probe_mutex.synchronize do + if collection + probe_cache.delete(collection.to_s) + else + @probe_cache = {} + end + end + end + + # Run a hybrid search and return the fused raw rows. + # + # @param collection_name [String] Parse class / collection. + # @param lexical [Hash] lexical branch config: + # * `:query` [String] (required) the `$search` query text. + # * `:index` [String, nil] Atlas Search (lexical) index name. + # * `:fields` [Array, String, nil] fields to search; + # defaults to a wildcard path. + # * `:filter` [Hash, nil] post-`$search` `$match`. + # * `:fuzzy` [Hash, nil] forwarded to the text operator. + # @param vector [Hash] vector branch config: + # * `:query_vector` [Array] (required) the query embedding. + # * `:field` [String, Symbol] (required) vector field path. + # * `:index` [String, nil] vectorSearch index name. + # * `:num_candidates` [Integer, nil] Atlas HNSW search width. + # * `:filter` [Hash, nil] post-`$vectorSearch` `$match`. + # * `:vector_filter` [Hash, nil] Atlas-native pre-search filter. + # @param k [Integer] number of fused hits to return (≤ {MAX_K}). + # @param fusion [Hash, nil] fusion config: + # * `:method` [Symbol] `:rrf` (default) and `:rrf_client` both + # fuse CLIENT-SIDE (deterministic across Atlas versions). + # `:rrf_native` opts into the single-roundtrip server-side + # `$rankFusion` stage (Atlas 8.0+ only) and falls back to the + # client path when unsupported or on any execution error. + # * `:k_constant` [Integer] RRF rank constant. + # * `:weights` [Hash] `{ lexical:, vector: }` branch weights. + # @param scope_opts [Hash] ACL/CLP scope kwargs forwarded to BOTH + # branch entry points: `session_token:` / `master:` / + # `acl_user:` / `acl_role:`. + # @return [Array] fused rows (see {.rrf}). + def search(collection_name, lexical:, vector:, k: DEFAULT_K, fusion: nil, **scope_opts) + require_available! 
+ fusion = symbolize(fusion || {}) + lex = symbolize(lexical || {}) + vec = symbolize(vector || {}) + + k_int = Integer(k) + raise ArgumentError, "k must be in 1..#{MAX_K} (got #{k_int})." if k_int <= 0 || k_int > MAX_K + + unless lex[:query].is_a?(String) && !lex[:query].strip.empty? + raise ArgumentError, "hybrid search: lexical[:query] must be a non-empty String." + end + if vec[:query_vector].nil? || vec[:field].nil? + raise ArgumentError, "hybrid search: vector[:query_vector] and vector[:field] are required." + end + + method = (fusion[:method] || :rrf).to_sym + unless %i[rrf rrf_client rrf_native].include?(method) + raise ArgumentError, + "hybrid search: fusion[:method] must be :rrf, :rrf_client, or :rrf_native (got #{method.inspect})." + end + k_constant = fusion[:k_constant] || DEFAULT_K_CONSTANT + weights = fusion[:weights] + oversample = [k_int * DEFAULT_OVERSAMPLE_MULTIPLIER, k_int].max + + # NOTE (deviation from plan §8.3): the default fuses CLIENT-SIDE. + # The native single-roundtrip `$rankFusion` path is OPT-IN + # (`fusion: { method: :rrf_native }`) rather than the default, + # because its server-side execution (and its ACL `$match` + # placement) cannot be validated without an Atlas 8.0+ cluster + # in CI. `rank_fusion_supported?` detection ships and is unit- + # tested; the native pipeline shape is snapshot-tested; but live + # results route through the always-correct, fully-enforced + # two-aggregate client path unless a caller explicitly opts into + # native AND the cluster supports it. Native still falls back to + # the client path on any execution error. 
+ if method == :rrf_native && rank_fusion_supported?(collection_name) + fused = run_native(collection_name, lex, vec, oversample, + k_constant: k_constant, weights: weights, scope_opts: scope_opts) + return fused.first(k_int) if fused + end + + lexical_rows = run_lexical(collection_name, lex, oversample, scope_opts) + vector_rows = run_vector(collection_name, vec, oversample, scope_opts) + rrf({ lexical: lexical_rows, vector: vector_rows }, + k_constant: k_constant, weights: weights).first(k_int) + end + + private + + # -- client-side branch execution -------------------------------- + + def run_lexical(collection_name, lex, oversample, scope_opts) + require_relative "../atlas_search" + Parse::AtlasSearch.search( + collection_name, lex[:query], + index: lex[:index], + fields: lex[:fields], + filter: lex[:filter], + fuzzy: lex[:fuzzy], + limit: oversample, + raw: true, + **scope_opts.dup, + ) + end + + def run_vector(collection_name, vec, oversample, scope_opts) + Parse::VectorSearch.search( + collection_name, + field: vec[:field], + query_vector: vec[:query_vector], + k: oversample, + num_candidates: vec[:num_candidates], + filter: vec[:filter], + vector_filter: vec[:vector_filter], + index: vec[:index], + **scope_opts.dup, + ) + end + + # -- native $rankFusion path ------------------------------------- + + # Build the native `$rankFusion` pipeline (without ACL/CLP + # stages). Public-ish via {.native_pipeline} for snapshot tests; + # the live path appends ACL enforcement in {#run_native}. + def build_rank_fusion_stage(lex, vec, oversample, k_constant:, weights:) + vsel = vector_search_stage(vec, oversample) + lsel = lexical_search_stage(lex, oversample) + stage = { + "input" => { + "pipelines" => { "vector" => vsel, "lexical" => lsel }, + }, + # `$rankFusion` performs reciprocal-rank fusion implicitly; the + # only tunable in `combination` is per-input `weights`. 
+ "scoreDetails" => false, + } + if weights + w = symbolize(weights) + stage["combination"] = { + "weights" => { "vector" => weight_for(w, :vector), "lexical" => weight_for(w, :lexical) }, + } + end + { "$rankFusion" => stage } + end + + # Assemble (but do not execute) the full native pipeline, + # including the ACL `$match` for a non-master resolution. Exposed + # for snapshot tests so the security-relevant shape is pinned even + # without an Atlas 8.0 cluster to execute against. + # + # @return [Array] the aggregation pipeline. + def native_pipeline(collection_name, lexical:, vector:, k: DEFAULT_K, fusion: nil, **scope_opts) + fusion = symbolize(fusion || {}) + lex = symbolize(lexical || {}) + vec = symbolize(vector || {}) + oversample = [Integer(k) * DEFAULT_OVERSAMPLE_MULTIPLIER, Integer(k)].max + resolution = Parse::ACLScope.resolve!(scope_opts.dup, method_name: :"VectorSearch::Hybrid.search") + native_pipeline_for(lex, vec, oversample, resolution, + k_constant: fusion[:k_constant] || DEFAULT_K_CONSTANT, + weights: fusion[:weights], limit: Integer(k)) + end + + def native_pipeline_for(lex, vec, oversample, resolution, k_constant:, weights:, limit:) + pipeline = [build_rank_fusion_stage(lex, vec, oversample, k_constant: k_constant, weights: weights)] + # The fused RRF score is surfaced via `{ $meta: "score" }` + # (a numeric), not "scoreDetails" (a breakdown document). + pipeline << { "$addFields" => { "_hybrid_score" => { "$meta" => "score" } } } + unless resolution.nil? || resolution.master? 
+ acl_match = Parse::ACLScope.match_stage_for(resolution) + pipeline << acl_match if acl_match + end + pipeline << { "$sort" => { "_hybrid_score" => -1 } } + pipeline << { "$limit" => limit } + pipeline + end + + def run_native(collection_name, lex, vec, oversample, k_constant:, weights:, scope_opts:) + resolution = Parse::ACLScope.resolve!(scope_opts.dup, method_name: :"VectorSearch::Hybrid.search") + assert_clp_find!(collection_name, resolution) + pointer_fields = resolve_pointer_fields!(collection_name, resolution) + protected_fields = Parse::CLPScope.protected_fields_for( + collection_name, resolution.permission_strings, + ) + Parse::VectorSearch.validate_query_vector!(vec[:query_vector]) + Parse::PipelineSecurity.validate_filter!(vec[:vector_filter]) if vec[:vector_filter] + Parse::PipelineSecurity.validate_filter!(vec[:filter]) if vec[:filter] + Parse::PipelineSecurity.validate_filter!(lex[:filter]) if lex[:filter] + + pipeline = native_pipeline_for(lex, vec, oversample, resolution, + k_constant: k_constant, weights: weights, limit: oversample) + rows = run_pipeline!(collection_name, pipeline) + + unless resolution.master? + # Defense-in-depth top-level row gate. The in-pipeline ACL + # `$match` is the primary filter, but it sits AFTER + # `$rankFusion` and treats a missing `_rperm` as public + # (`{$exists: false}`). If the fusion stage fails to carry + # `_rperm` through to its output documents — a behaviour we + # cannot validate without an Atlas 8.x cluster, and one this + # method would otherwise silently swallow via the StandardError + # fallback below — every row would fail OPEN as public. So + # re-verify each row here and FAIL CLOSED: a non-master row + # must carry an `_rperm` array that explicitly satisfies the + # scope. `redact_results!` does NOT cover this case — it skips + # top-level rows by design (see Parse::ACLScope). 
The tradeoff + # is that genuinely ACL-less rows (no `_rperm` at all) are + # dropped on this opt-in path; public-readable rows store + # `_rperm: ["*"]` and are kept (non-strict scopes carry `"*"`). + perms_set = Array(resolution.permission_strings).to_set + rows.select! { |doc| native_row_visible?(doc, perms_set) } + Parse::ACLScope.redact_results!(rows, resolution) + Parse::CLPScope.redact_protected_fields!(rows, protected_fields) if protected_fields.any? + if pointer_fields + rows = Parse::CLPScope.filter_by_pointer_fields(rows, pointer_fields, resolution.user_id) + end + end + rows.map! { |doc| Parse::PipelineSecurity.strip_internal_fields(doc) } + rows + rescue Parse::CLPScope::Denied + raise + rescue StandardError + # Native execution failed (e.g. a cluster that probed as + # supported but rejects this exact shape, or a transient error). + # Fall back to the client-side path rather than failing the + # whole search — the client path is the always-correct baseline. + nil + end + + def vector_search_stage(vec, oversample) + # Parity with Parse::VectorSearch: Atlas requires + # `numCandidates >= limit` and caps it at 10_000. The default + # (`oversample * MULTIPLIER`) can blow past 10_000 for a large + # `k`, so clamp into `[limit, 10_000]` rather than emit a value + # Atlas will reject. `oversample` (the per-branch limit) is + # bounded by `MAX_K * OVERSAMPLE_MULTIPLIER` and stays below the + # cap, so the clamp range is always valid. + num_candidates = (vec[:num_candidates] || oversample * Parse::VectorSearch::DEFAULT_NUM_CANDIDATES_MULTIPLIER).to_i + num_candidates = [[num_candidates, oversample].max, 10_000].min + stage = { + "index" => vec[:index].to_s, + "path" => vec[:field].to_s, + "queryVector" => vec[:query_vector], + "numCandidates" => num_candidates, + "limit" => oversample, + } + stage["filter"] = vec[:vector_filter] if vec[:vector_filter] && !vec[:vector_filter].empty? 
+ inner = [{ "$vectorSearch" => stage }] + inner << { "$match" => vec[:filter] } if vec[:filter] + inner + end + + def lexical_search_stage(lex, oversample) + require_relative "../atlas_search" if defined?(Parse::AtlasSearch::SearchBuilder).nil? + builder = Parse::AtlasSearch::SearchBuilder.new(index_name: lex[:index]) + fields = lex[:fields] + if fields.nil? || (fields.respond_to?(:empty?) && fields.empty?) + builder.text(query: lex[:query], path: { "wildcard" => "*" }, fuzzy: lex[:fuzzy]) + else + Array(fields).each { |f| builder.text(query: lex[:query], path: f.to_s, fuzzy: lex[:fuzzy]) } + end + inner = [builder.build, { "$limit" => oversample }] + inner << { "$match" => lex[:filter] } if lex[:filter] + inner + end + + # -- the $rankFusion support probe ------------------------------- + + def run_probe(collection_name) + coll = Parse::MongoDB.collection(collection_name) + coll.aggregate([{ "$rankFusion" => { "input" => {} } }, { "$limit" => 0 }]).to_a + true + rescue StandardError => e + # "Unknown aggregation stage $rankFusion" (or an unrecognized- + # operator variant) means the cluster predates native support. + # Any OTHER failure (a malformed-but-recognized stage, an auth + # error, etc.) means the stage IS recognized — treat as supported + # and let the real query surface the real error. + unsupported_stage_error?(e) ? false : true + end + + # Message fragments Mongo emits for an UNRECOGNIZED pipeline stage. + # We only treat the probe failure as "unsupported" when BOTH the + # stage name AND an unrecognized-stage phrase appear, so a + # recognized-but-misused `$rankFusion` (or an unrelated auth/parse + # error) is treated as supported and surfaces its real error on the + # actual query rather than silently disabling native fusion. 
+ UNSUPPORTED_STAGE_FRAGMENTS = [ + "unrecognized pipeline stage name", + "unknown aggregation stage", + "is not allowed", + ].freeze + private_constant :UNSUPPORTED_STAGE_FRAGMENTS + + def unsupported_stage_error?(err) + msg = err.message.to_s.downcase + msg.include?("rankfusion") && UNSUPPORTED_STAGE_FRAGMENTS.any? { |f| msg.include?(f) } + end + + # -- probe cache ------------------------------------------------- + + PROBE_MUTEX_INIT = Mutex.new + private_constant :PROBE_MUTEX_INIT + + def probe_mutex + @probe_mutex ||= PROBE_MUTEX_INIT.synchronize { @probe_mutex ||= Mutex.new } + end + + def probe_cache + @probe_cache ||= {} + end + + def probe_cache_get(key, now) + probe_mutex.synchronize do + entry = probe_cache[key] + next nil if entry.nil? + next nil if (now - entry[:at]) >= PROBE_CACHE_TTL + entry[:supported] + end + end + + def probe_cache_put(key, supported, now) + probe_mutex.synchronize { probe_cache[key] = { supported: supported, at: now } } + end + + # Monotonic clock so the TTL is immune to wall-clock jumps. + def monotonic + Process.clock_gettime(Process::CLOCK_MONOTONIC) + end + + # -- shared helpers ---------------------------------------------- + + def require_available! + Parse::MongoDB.require_gem! + unless Parse::MongoDB.available? + raise Parse::VectorSearch::NotAvailable, + "Parse::VectorSearch::Hybrid requires Parse::MongoDB.configure(enabled: true)." + end + end + + def run_pipeline!(collection_name, pipeline) + Parse::MongoDB.collection(collection_name).aggregate(pipeline).to_a + end + + def assert_clp_find!(collection_name, resolution) + return if resolution.nil? || resolution.master? + unless Parse::CLPScope.permits?(collection_name, :find, resolution.permission_strings) + raise Parse::CLPScope::Denied.new( + collection_name, :find, + "CLP refuses find on '#{collection_name}' for the current hybrid-search scope.", + ) + end + end + + def resolve_pointer_fields!(collection_name, resolution) + return nil if resolution.nil? 
|| resolution.master? + pointer_fields = Parse::CLPScope.pointer_fields_for(collection_name, :find) + return nil if pointer_fields.nil? + if resolution.user_id.nil? + raise Parse::CLPScope::Denied.new( + collection_name, :find, + "CLP requires user identity (pointerFields=#{pointer_fields.inspect}) " \ + "but the current hybrid-search scope has no user_id.", + ) + end + pointer_fields + end + + def validate_weights!(weights) + return if weights.nil? + unless weights.is_a?(Hash) + raise FusionError, "rrf: weights must be a Hash of branch => weight (got #{weights.class})." + end + weights.each_value do |w| + unless w.is_a?(Numeric) && w >= 0 + raise FusionError, "rrf: weights must be non-negative numbers (got #{w.inspect})." + end + end + end + + def weight_for(weights, branch_name) + return 1.0 if weights.nil? + w = weights[branch_name] || weights[branch_name.to_s] || weights[branch_name.to_sym] + w.nil? ? 1.0 : w.to_f + end + + def row_id(row) + id = row["_id"] || row[:_id] || row["objectId"] || row[:objectId] + id.nil? ? nil : id.to_s + end + + # Fail-closed top-level row gate for the native fusion path. + # Unlike {Parse::ACLScope}'s subdoc matcher (which treats a + # missing `_rperm` as public), this REQUIRES an explicit, + # satisfied `_rperm` array: a row with no, empty, or non-Array + # `_rperm` is dropped, because on the native path a missing + # `_rperm` may mean `$rankFusion` stripped it rather than the row + # being genuinely public. + def native_row_visible?(doc, perms_set) + rperm = doc["_rperm"] || doc[:_rperm] + rperm.is_a?(Array) && rperm.any? { |entry| perms_set.include?(entry) } + end + + # Merge two rows for the same objectId across branches: keep all + # fields, preferring non-nil values, so the fused row carries both + # branch scores (`_score` and `_vscore`). + def merge_rows(a, b) + return b if a.nil? + return a if b.nil? + a.merge(b) { |_k, va, vb| vb.nil? ? va : vb } + end + + def symbolize(hash) + return {} if hash.nil? 
+ hash.each_with_object({}) { |(k, v), out| out[k.to_sym] = v } + end + end + end + end +end diff --git a/lib/parse/webhooks.rb b/lib/parse/webhooks.rb index 0284450..4362e9c 100644 --- a/lib/parse/webhooks.rb +++ b/lib/parse/webhooks.rb @@ -17,6 +17,7 @@ require_relative "webhooks/payload" require_relative "webhooks/registration" require_relative "webhooks/replay_protection" +require_relative "webhooks/trigger_audit" module Parse class Object @@ -83,6 +84,36 @@ class Webhooks # will trigger the Parse::Webhooks application to return the proper error response. class ResponseError < StandardError; end + # The authentication-side triggers (local underscore form). These carry a + # +_User+ / +_Session+ as the payload object but are NOT object save/delete + # triggers: the router runs no ActiveModel save/create/destroy callbacks for + # them, and Parse Server ignores their response body. + AUTH_TRIGGERS = %i[ + before_login after_login after_logout before_password_reset_request + ].freeze + + # The LiveQuery triggers (local underscore form). Connection-global or + # event-scoped; Parse Server ignores their response body. Delivered over an + # HTTP webhook only in a co-located single-process LiveQuery setup. + LIVE_QUERY_TRIGGERS = %i[before_connect before_subscribe after_event].freeze + + # Every trigger whose payload is not an object save/delete/find shape. + # Parse Server's webhook response handler resolves +{}+ for all of these + # (the body is ignored), so the router normalizes their handler result to a + # success no-op rather than serializing a returned object into the response. + NON_OBJECT_TRIGGERS = (AUTH_TRIGGERS + LIVE_QUERY_TRIGGERS).freeze + + # The +before*+ subset of {NON_OBJECT_TRIGGERS} for which a handler can DENY + # the operation. Parse Server only treats an +{error}+ response as a + # rejection -- a +{success:false}+ body resolves and lets the login / + # connect / subscribe / reset proceed. 
So, mirroring the +before_save+ + # convention, the router converts a +false+ return from one of these into a + # {ResponseError} (which serializes to +{error}+). +error!+ works for any + # trigger; the +after*+ variants fire after the fact and cannot undo it. + REJECTABLE_NON_OBJECT_TRIGGERS = %i[ + before_login before_password_reset_request before_connect before_subscribe + ].freeze + include Client::Connectable extend Parse::Webhooks::Registration # The name of the incoming env containing the webhook key. @@ -135,6 +166,18 @@ def route(type, className, &block) className = className.parse_class end className = className.to_s + # Parse Server has no beforeCreate/afterCreate webhook trigger; the + # create variants are ActiveModel callbacks that run inside the + # beforeSave/afterSave handler for new objects. Point callers there + # rather than registering a route that can never fire. + if type == :before_create || type == :after_create + save = type == :before_create ? :before_save : :after_save + raise ArgumentError, + "There is no #{type} webhook. Register `webhook :#{save}` instead — " \ + "your #{type} ActiveModel callbacks run inside the #{save} handler " \ + "for new objects (registering #{save} enables BOTH the #{save} and " \ + "#{type} callbacks)." + end if routes[type].nil? || block.respond_to?(:call) == false raise ArgumentError, "Invalid Webhook registration trigger #{type} #{className}" end @@ -159,6 +202,104 @@ def run_function(name, params) call_route(:function, name, payload) end + # Evaluate a single registered handler block in the scope of the payload. + # + # The block runs with `self` bound to the {Parse::Webhooks::Payload}, so a + # handler can call `parse_object`, `params`, `error!`, etc. directly -- + # exactly as it could under the historical `payload.instance_exec(payload, + # &block)` invocation. 
The difference is the return semantics: + # + # - `return value` returns `value` as the handler result (instead of the + # `LocalJumpError: unexpected return` that bare `instance_exec` raised + # when the block was defined inside a method). + # - The legacy idioms still work unchanged: the last expression's value, + # `next value`, and `break value` all return `value`, and `raise` + # propagates untouched (so `error!` / before_save rejections behave the + # same). + # + # This is achieved by attaching the block as a singleton method on the + # per-request payload (so `return` gets method semantics) and removing it + # afterward. The payload is a per-request instance, so this neither leaks + # nor mutates shared state across threads. + # + # Arity is matched to the old `instance_exec(payload, ...)` contract: a + # zero-arity block (`do ... end` / `proc { }`) is called with no args; a + # block that declares a parameter (`do |payload| ... end`) or a splat + # receives the payload. + # + # @param payload [Parse::Webhooks::Payload] the request payload (becomes `self`). + # @param block [Proc] the registered handler block. + # @return [Object] the handler's result value. + def invoke_handler(payload, block) + name = :"__parse_webhook_handler_#{block.object_id}__" + payload.define_singleton_method(name, &block) + handler = payload.method(name) + begin + # Match the old `payload.instance_exec(payload, &block)` arity + # leniency: a zero-arity block is called bare; otherwise it receives + # the payload, plus a nil for each additional REQUIRED positional so a + # block declaring `|payload, extra|` (or more) does not raise — under + # instance_exec those surplus params were silently nil. `arity` is + # negative for optional/splat params (e.g. -1 for `|*a|`, -2 for + # `|a, *b|`); `~arity` gives the required count in that case. + if handler.arity == 0 + handler.call + else + required = handler.arity.negative? ? 
~handler.arity : handler.arity + handler.call(payload, *Array.new([required - 1, 0].max)) + end + ensure + singleton = payload.singleton_class + if singleton.method_defined?(name) || singleton.private_method_defined?(name) + singleton.send(:remove_method, name) + end + end + end + + # Run any {Parse::Webhooks::Payload#after_response} callbacks a handler + # registered, AFTER the response has been produced. Prefers the server's + # `rack.after_reply` hook (Puma / Unicorn), which fires once the response + # is flushed to the socket on the same worker thread; falls back to a + # detached thread when the server does not provide it (e.g. WEBrick). Each + # callback is isolated so one raising neither aborts the others nor reaches + # the client. No-op when nothing was deferred. + # + # @param env [Hash] the Rack environment (for `rack.after_reply`). + # @param payload [Parse::Webhooks::Payload, nil] the request payload. + # @return [void] + def dispatch_deferred(env, payload) + return if payload.nil? || !payload.respond_to?(:deferred_callbacks) + callbacks = payload.deferred_callbacks + return if callbacks.blank? + + runner = proc do + callbacks.each do |cb| + begin + cb.call + rescue => e + warn "[Webhooks::after_response] deferred callback raised: #{e.class}: #{e.message}" + end + end + end + + # Enqueueing must never break an otherwise-successful response: this runs + # just before `response.finish`, so a raise here (a frozen after_reply + # array, thread exhaustion) would discard the buffered reply and surface + # as a 500. Failing to schedule deferred work degrades to "not run", + # never to a failed response. + begin + after_reply = env.is_a?(Hash) ? 
env["rack.after_reply"] : nil + if after_reply.respond_to?(:<<) + after_reply << runner + else + Thread.new(&runner) + end + rescue => e + warn "[Webhooks::after_response] could not schedule deferred work: #{e.class}: #{e.message}" + end + nil + end + # Calls the set of registered webhook trigger blocks or the specific function block. # This method is usually called when an incoming request from Parse Server is received. # @param type (see route) @@ -219,9 +360,9 @@ def call_route(type, className, payload = nil) end if registry.is_a?(Array) - result = registry.map { |hook| payload.instance_exec(payload, &hook) }.last + result = registry.map { |hook| invoke_handler(payload, hook) }.last else - result = payload.instance_exec(payload, &registry) + result = invoke_handler(payload, registry) end if result.is_a?(Parse::Object) @@ -265,6 +406,30 @@ def call_route(type, className, payload = nil) result = {} end + # Auth- and LiveQuery-trigger dispatch (beforeLogin/afterLogin/ + # afterLogout/beforePasswordResetRequest, beforeConnect/beforeSubscribe/ + # afterEvent). Parse Server IGNORES the response body for all of these -- + # its webhook response handler resolves {} regardless -- so the ONLY way + # a handler can affect the operation is the error path, and only for the + # "before" variants (a login/connect/subscribe/reset can be denied; an + # after_* fires after the fact and cannot be undone). + # + # Crucially, Parse Server treats only an {error} response as a rejection: + # a {success:false} body RESOLVES and lets the operation proceed. So a + # handler that returns `false` to "deny login" would silently allow it. + # We mirror the before_save convention and convert that false into a + # ResponseError (=> {error} => Parse Server denies). `error!` works for + # any of them (the call! rescue converts it). Every other return value -- + # including a Parse::Object a handler happened to return (e.g.
the _User + # from beforeLogin) -- is normalized to a success no-op so we never + # serialize an object into the response or the redacted request log. + if NON_OBJECT_TRIGGERS.include?(type) + if result == false && REJECTABLE_NON_OBJECT_TRIGGERS.include?(type) + raise Parse::Webhooks::ResponseError, "#{type} rejected by webhook handler" + end + result = true + end + # Guard-injection: when a handler returns a Hash (or true/nil normalized # to {}) for a class with field_guards, Parse Server would otherwise # merge the response with the client's original payload and persist @@ -380,6 +545,40 @@ def call(env) dup.call!(env) end + # Extract the Parse class name from a webhook request path. Parse Server + # registers each trigger at `//` + # (functions at `/`), so for a trigger the class + # is the last segment and the second-to-last is a known trigger name. + # Returns nil for a function path, a path with no recognizable trigger + # segment, or a className that fails the conservative charset check + # (Parse class names are `[A-Za-z0-9_]`, built-ins prefixed with `_`). + # The charset gate keeps an attacker-supplied path (reachable when + # `allow_unauthenticated` is set) from injecting an arbitrary routing / + # scrub key. + # + # @param path [String] the request PATH_INFO. + # @return [String, nil] the sanitized class name, or nil. + def trigger_class_from_path(path) + segments = path.to_s.split("/").reject(&:empty?) + return nil if segments.size < 2 + trigger, klass = segments[-2], segments[-1] + # register_triggers! builds the URL with the LOCAL snake_case trigger + # name (`after_find`), while Parse Server sends the camelCase form in the + # body — accept both so the path segment is recognized either way. 
+ known = (Parse::API::Hooks::TRIGGER_NAMES + Parse::API::Hooks::TRIGGER_NAMES_LOCAL).map(&:to_s) + return nil unless known.include?(trigger) + # Allow a leading `@` for the Parse pseudo-classes (`@Connect` for the + # connection-global LiveQuery trigger, `@File` for file triggers): the + # SDK encodes the className in the per-trigger URL, so beforeConnect + # would not route without it. Mirrors the trigger-className validator + # (Parse::API::PathSegment.trigger_class_name!). Still anchored and + # charset-limited -- this gate keeps an attacker-supplied path (reachable + # only under allow_unauthenticated) from injecting an arbitrary routing + # / scrub key. + return nil unless /\A@?_?[A-Za-z][A-Za-z0-9_]*\z/.match?(klass) + klass + end + # @!visibility private def call!(env) request = Rack::Request.new env @@ -440,8 +639,17 @@ def call!(env) return response.finish end + # Parse Server registers each trigger at + # `//`. For beforeFind/afterFind the + # payload body carries NO className anywhere, so the request PATH is the + # only authoritative source of the class — without it, find triggers + # don't route (parse_class is nil) and afterFind `objects` can't have + # their :vector columns stripped. Thread it into the payload here, before + # construction, so it is available for both routing and the scrub. Nil + # for function requests and for malformed paths. + webhook_class = Parse::Webhooks.trigger_class_from_path(request.path) begin - payload = Parse::Webhooks::Payload.new body_str + payload = Parse::Webhooks::Payload.new(body_str, webhook_class) rescue => e warn "Invalid webhook payload format: #{e}" response.write error("Invalid payload format. Should be valid JSON.") @@ -486,6 +694,10 @@ def call!(env) puts "----------------------------------------------------\n" end response.write success(result) + # Schedule any after_response work to run once this reply is flushed, + # off the client's critical path. 
Registered on the success path so the + # deferred work overlaps a response Parse Server will act on. + dispatch_deferred(env, payload) return response.finish rescue Parse::Webhooks::ResponseError, ActiveModel::ValidationError => e if payload.trigger? diff --git a/lib/parse/webhooks/payload.rb b/lib/parse/webhooks/payload.rb index ac92e4c..b0132fc 100644 --- a/lib/parse/webhooks/payload.rb +++ b/lib/parse/webhooks/payload.rb @@ -25,7 +25,9 @@ class Payload original: nil, update: nil, query: nil, log: nil, objects: nil, - triggerName: nil }.freeze + triggerName: nil, + event: nil, clients: nil, subscriptions: nil, + context: nil }.freeze include ::ActiveModel::Serializers::JSON # @!attribute [rw] master # @return [Boolean] whether the master key was used for this request. @@ -64,6 +66,31 @@ class Payload attr_accessor :master, :user, :installation_id, :params, :function_name, :object, :trigger_name attr_accessor :query, :log, :objects attr_accessor :original, :update, :raw + # @!attribute [rw] event + # The LiveQuery event type for an +afterEvent+ trigger -- one of + # +"create"+, +"enter"+, +"update"+, +"leave"+, or +"delete"+ -- or + # +"connect"+ for a +beforeConnect+ trigger. +nil+ for every non- + # LiveQuery trigger. See {#after_event?} / {#before_connect?}. + # @return [String, nil] + # @!attribute [rw] clients + # Connection-global metadata sent on the LiveQuery +beforeConnect+ / + # +afterEvent+ triggers: the number of currently-connected LiveQuery + # clients. +nil+ for non-LiveQuery triggers. + # @return [Integer, nil] + # @!attribute [rw] subscriptions + # Connection-global metadata sent on the LiveQuery +beforeConnect+ / + # +afterEvent+ triggers: the number of active subscriptions. +nil+ for + # non-LiveQuery triggers. 
+ # @return [Integer, nil] + attr_accessor :event, :clients, :subscriptions + # @!attribute [rw] context + # The caller-supplied context object threaded from the originating REST + # write or cloud-function call via the +X-Parse-Cloud-Context+ header. + # Parse Server includes this as a top-level +context+ key in trigger + # payloads (beforeSave/afterSave/etc.). Returns a Hash when present, or + # +nil+ when the originating request carried no context. + # @return [Hash, nil] + attr_accessor :context # @!attribute [r] session_token # The caller's live Parse session token, captured from the incoming # webhook payload (`user.sessionToken`) before credentials are scrubbed @@ -85,10 +112,24 @@ class Payload # You would normally never create a {Parse::Webhooks::Payload} object since it is automatically # provided to you when using Parse::Webhooks. # @see Parse::Webhooks - def initialize(hash = {}) + # @param hash [String, Hash] the raw webhook body (JSON string or Hash). + # @param webhook_class [String, nil] the Parse class name derived from the + # webhook URL path (`//`). This is the + # ONLY authoritative source of the class for beforeFind/afterFind + # triggers — Parse Server omits `className` from the find payload body + # entirely (the matched `objects` carry no `className` and there is no + # top-level one). Threading it here lets `parse_class` resolve (so find + # triggers route) and lets `:vector` columns be stripped from afterFind + # `objects`. For save/delete triggers the path className equals the + # body's, so it is consistent; for functions it is nil. + def initialize(hash = {}, webhook_class = nil) hash = JSON.parse(hash, max_nesting: 20) if hash.is_a?(String) hash = Hash[hash.map { |k, v| [k.to_s.underscore.to_sym, v] }] @raw = hash + # Set BEFORE the vector scrub below so the route-derived class is + # available to strip :vector columns from afterFind objects (whose + # body carries no className of its own). 
+ @webhook_class = webhook_class.to_s if webhook_class && !webhook_class.to_s.empty? @master = hash[:master] # Capture the caller's session token from the *unscrubbed* user hash # before scrub_credentials strips it below. Parse Server includes @@ -99,6 +140,16 @@ def initialize(hash = {}) # still letting a handler opt in to acting as the calling user via # #session_token / #user_client / #user_agent. @session_token = self.class.extract_session_token(hash[:user]) + # LiveQuery beforeConnect/beforeSubscribe carry the caller's session + # token at the TOP LEVEL (not nested under `user`), because no user is + # resolved yet when the trigger fires. Capture it here -- with the same + # "set it aside, keep it out of as_json / the log" treatment as the + # nested form -- so #user_client / #user_agent can act as the caller. + # It is intentionally NOT one of ATTRIBUTES. + if @session_token.nil? + top_token = hash[:session_token].to_s.strip + @session_token = top_token unless top_token.empty? + end # Webhook trigger payloads (beforeSave/afterSave/etc.) are delivered by # Parse Server and, when a webhook key is configured (the default; see # Parse::Webhooks.allow_unauthenticated for the opt-out used in tests / @@ -125,14 +176,42 @@ def initialize(hash = {}) @params = hash[:params] @params = @params.with_indifferent_access if @params.is_a?(Hash) @function_name = hash[:function_name] - @object = self.class.scrub_credentials(hash[:object]) @trigger_name = hash[:trigger_name] - @original = self.class.scrub_credentials(hash[:original]) - @update = self.class.scrub_credentials(hash[:update]) || {} - # Added for beforeFind and afterFind triggers + # Resolve the model class once so :vector columns can be stripped from + # every object-shaped payload (see scrub_vector_columns). Credentials + # are scrubbed first, then vectors. 
The route-derived @webhook_class is + # authoritative and preferred — it is the only class source for + # afterFind (whose body carries no className anywhere); for save/delete + # it equals the body's className. Falls back to the object/original + # className for older callers that don't supply a route class. + vec_klass = self.class.resolve_klass_by_name(@webhook_class) || + self.class.resolve_vector_klass(hash[:object], hash[:original]) + @object = self.class.scrub_vector_columns(self.class.scrub_credentials(hash[:object]), vec_klass) + @original = self.class.scrub_vector_columns(self.class.scrub_credentials(hash[:original]), vec_klass) + @update = self.class.scrub_vector_columns(self.class.scrub_credentials(hash[:update]), vec_klass) || {} + # Added for beforeFind and afterFind triggers. afterFind objects are all + # of one class but carry no className of their own, so the route-derived + # vec_klass is the only way to strip their :vector columns. @query = hash[:query] - @objects = hash[:objects] || [] + # LiveQuery connection metadata. `event` is the afterEvent event type + # (create/enter/update/leave/delete) or "connect" for beforeConnect; + # `clients`/`subscriptions` are connection-global counts. All nil for + # the object / auth triggers. These are plain scalars (no credential + # material), so they pass through unscrubbed. + @event = hash[:event] + @clients = hash[:clients] + @subscriptions = hash[:subscriptions] + @objects = Array(hash[:objects]).map do |o| + self.class.scrub_vector_columns(self.class.scrub_credentials(o), vec_klass) + end @log = hash[:log] + # Caller-supplied context object threaded via X-Parse-Cloud-Context. + # This is caller metadata (not a credential), so it passes through + # without scrubbing — mirroring the treatment of @query and @log. + @context = hash[:context] + # Blocks registered by a handler via #after_response / #defer, to run + # after the webhook response has been sent (drained by the Rack app). 
+ @deferred_callbacks = [] end # @!visibility private @@ -166,6 +245,62 @@ def self.scrub_credentials(obj) end end + # @!visibility private + # Resolve the Parse::Object subclass for a webhook payload from the + # `className` of the first object-shaped hash given. Returns nil when + # no class name is present or no matching model is registered (the + # caller then skips vector stripping — fail-open is acceptable here: + # an unregistered class has no declared `:vector` columns to strip). + def self.resolve_vector_klass(*candidates) + candidates.each do |obj| + next unless obj.is_a?(Hash) + name = obj["className"] || obj[:className] + next if name.nil? || name.to_s.empty? + klass = resolve_klass_by_name(name) + return klass if klass + end + nil + end + + # @!visibility private + # Resolve a registered Parse::Object subclass from a bare class-name + # string (e.g. the route-derived @webhook_class). Returns nil for a blank + # name or an unregistered class (the caller then skips vector stripping — + # fail-open, as an unregistered class has no declared :vector columns). + def self.resolve_klass_by_name(name) + return nil if name.nil? || name.to_s.empty? + klass = (Parse::Object.find_class(name.to_s) rescue nil) + klass.respond_to?(:fields) ? klass : nil + end + + # @!visibility private + # Returns a copy of +obj+ with the model's declared `:vector` + # columns removed. Embeddings are large dense float arrays that leak + # ML signal; a webhook handler has no reason to receive them, and + # leaving them in bloats logs and any object a handler re-persists. + # Mirrors the `as_json` default (vectors omitted) — a class that opts + # into `vector_visibility :public` keeps its vectors here too. + # + # `klass` may be passed explicitly (so changed-only payloads like + # `update`, which carry no `className`, are still scrubbed using the + # class resolved from the sibling `object`/`original` hash); when nil + # it is resolved from the hash's own `className`. 
+ # Pass-through for non-Hash input (and nil). + def self.scrub_vector_columns(obj, klass = nil) + return obj unless obj.is_a?(Hash) + klass ||= resolve_vector_klass(obj) + return obj if klass.nil? + if klass.respond_to?(:vectors_public_by_default?) && klass.vectors_public_by_default? + return obj + end + vector_fields = klass.fields(:vector).keys.map(&:to_s) + return obj if vector_fields.empty? + field_map = klass.respond_to?(:field_map) ? klass.field_map : {} + wire = vector_fields.map { |f| (field_map[f.to_sym] || f).to_s } + denied = (vector_fields + wire) + obj.reject { |k, _| denied.include?(k.to_s) } + end + # @!visibility private # Pulls the caller's session token out of the (unscrubbed) +user+ hash. # Parse Server sends it as the camelCase string key +sessionToken+; this @@ -325,6 +460,73 @@ def after_find? trigger? && @trigger_name.to_sym == :afterFind end + # true if this is a beforeLogin webhook trigger request. + # + # NOTE: a +beforeLogin+ payload carries the user being authenticated as + # {#object} / {#parse_object} (a +_User+), NOT as {#user} -- the caller is + # not yet authenticated when the trigger fires, so {#user} is +nil+. (By + # +afterLogin+ both are populated and equal.) Reach for {#parse_object} to + # inspect the logging-in user during +beforeLogin+. + def before_login? + trigger? && @trigger_name.to_sym == :beforeLogin + end + + # true if this is an afterLogin webhook trigger request. + def after_login? + trigger? && @trigger_name.to_sym == :afterLogin + end + + # true if this is an afterLogout webhook trigger request. The logged-out + # session is carried as {#object} / {#parse_object} (a +_Session+). + def after_logout? + trigger? && @trigger_name.to_sym == :afterLogout + end + + # true if this is a beforePasswordResetRequest webhook trigger request. + # The target user is carried as {#object} / {#parse_object} (a +_User+). + def before_password_reset_request? + trigger?
&& @trigger_name.to_sym == :beforePasswordResetRequest + end + + # true if this is a LiveQuery beforeConnect webhook trigger request. + # Connection-global: carries no {#object}; the className is the +@Connect+ + # sentinel and the caller's token (if any) is in {#session_token}. + def before_connect? + trigger? && @trigger_name.to_sym == :beforeConnect + end + + # true if this is a LiveQuery beforeSubscribe webhook trigger request. + # Shaped like beforeFind: carries a {#query} (see {#parse_query}) and the + # className comes from the request path, not the body. + def before_subscribe? + trigger? && @trigger_name.to_sym == :beforeSubscribe + end + + # true if this is a LiveQuery afterEvent webhook trigger request. The + # event type (create/enter/update/leave/delete) is in {#event}. + def after_event? + trigger? && @trigger_name.to_sym == :afterEvent + end + + # true if this is one of the authentication-side triggers + # (beforeLogin / afterLogin / afterLogout / beforePasswordResetRequest). + # These carry a +_User+ / +_Session+ as {#object} but are NOT object + # save/delete triggers: no ActiveModel save/create/destroy callbacks run + # for them, and Parse Server ignores the response body (the only way to + # affect a +before*+ one is to deny it -- see the webhook router). + def auth_trigger? + before_login? || after_login? || after_logout? || before_password_reset_request? + end + + # true if this is one of the LiveQuery triggers (beforeConnect / + # beforeSubscribe / afterEvent). Parse Server delivers these over an HTTP + # webhook only in a co-located single-process LiveQuery setup; + # +beforeConnect+ in particular carries a live client and is effectively + # in-process-only. See the webhooks guide. + def live_query_trigger? + before_connect? || before_subscribe? || after_event? + end + # true if this request is a trigger that contains an object. def object? trigger? && @object.present? 
@@ -483,6 +685,49 @@ def error!(msg = "") raise Parse::Webhooks::ResponseError, msg end + # Register a block to run **after** this webhook's response has been sent + # to Parse Server, off the client's critical path. Use it to do work that + # should not add latency to the save/function the client is waiting on — + # search indexing, cache warming, fan-out notifications. + # + # The handler still returns its value synchronously (the response Parse + # Server acts on); the deferred block runs afterward. When the SDK is + # mounted under a server that supports `rack.after_reply` (Puma, Unicorn) + # the block runs once the response is flushed to the socket, on the same + # worker thread; otherwise it runs in a detached thread. Each block is + # isolated, so one raising neither affects the response nor the others. + # + # Parse::Webhooks.route :after_save, :Post do + # post = parse_object + # after_response { SearchIndex.reindex(post.id) } + # post + # end + # + # `self` inside the block is this payload (it closes over the handler's + # scope), so `parse_object`, `params`, etc. remain available. Note the + # block runs in-process and does not survive a worker restart — for work + # that *must* happen, hand it to a durable job queue instead. Deferred + # callbacks fire only when the payload is processed through the mounted + # `Parse::Webhooks` Rack app. + # + # @yield the work to run after the response is sent. + # @return [Boolean] true if a block was registered. + def after_response(&block) + return false unless block_given? + @deferred_callbacks ||= [] + @deferred_callbacks << block + true + end + alias_method :defer, :after_response + + # @!visibility private + # The blocks registered via {#after_response}; drained by the Rack app + # ({Parse::Webhooks.dispatch_deferred}) after the response is finished. + # @return [Array] + def deferred_callbacks + @deferred_callbacks ||= [] + end + # @return [Parse::Query] the Parse query for a beforeFind trigger. 
def parse_query return nil unless parse_class.present? && @query.is_a?(Hash) diff --git a/lib/parse/webhooks/trigger_audit.rb b/lib/parse/webhooks/trigger_audit.rb new file mode 100644 index 0000000..3c46d9b --- /dev/null +++ b/lib/parse/webhooks/trigger_audit.rb @@ -0,0 +1,502 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require "active_support/inflector" + +module Parse + class Webhooks + # Operator-facing audit that cross-references three sources of truth about a + # Parse application's trigger logic and reports where they disagree: + # + # 1. **Model callbacks** — the ActiveModel `before_save` / `after_save` / + # `after_create` / ... callbacks declared on each {Parse::Object} + # subclass (app-defined ones, with framework-internal callbacks filtered + # out by source location). + # 2. **Local webhook routes** — the blocks registered via + # `webhook :before_save { ... }` / `Parse::Webhooks.route(...)`, held in + # {Parse::Webhooks.routes}. + # 3. **Server triggers** — what is actually registered with Parse Server + # (`GET hooks/triggers`), so a matching client POST reaches your Rack app. + # + # The non-obvious relationship the audit exists to surface (see + # {file:docs/webhooks_guide.md}): a model's ActiveModel callbacks only run + # server-side for **non-Ruby clients** when BOTH a local route is registered + # (so the webhook router has a handler) AND the trigger is registered on Parse + # Server (so it POSTs at all). Declaring `after_save :send_email` alone does + # nothing for a JS/Swift/REST/Dashboard write — that write never touches your + # Ruby process, and the callback is silently skipped. + # + # SECURITY POSTURE — mirrors {Parse::Core::Describe}. This is operator-side + # observability, NOT data exposed to an LLM. The server fetch hits the + # master-key-only `hooks/triggers` endpoint, so a `network: true` audit + # requires a master-key client; `network: false` audits callbacks vs. local + # routes only and needs no credentials. 
Output is never included in tool + # responses or any `parse.agent.*` notification payload. + # + # @example + # audit = Parse::Webhooks.trigger_audit # Hash report (network) + # puts Parse::Webhooks.trigger_audit(pretty: true) # human-readable summary + # Parse::Webhooks.trigger_audit(network: false) # local-only, no master key + class TriggerAudit + # The object-shaped triggers an ActiveModel callback or a webhook block can + # map to. (Auth / LiveQuery triggers carry no object and have no ActiveModel + # callback equivalent, so they are surfaced only as server/local routes, not + # cross-referenced against model callbacks.) + OBJECT_TRIGGERS = %i[ + before_save after_save before_delete after_delete before_find after_find + ].freeze + + # Maps an ActiveModel callback chain + phase to the local trigger name whose + # webhook handler runs it server-side. `before_create` / `after_create` ride + # inside the save handler (Parse Server has no create trigger); the webhook + # router runs the destroy chain inside the beforeDelete handler. + CALLBACK_TRIGGER_MAP = { + [:save, :before] => :before_save, + [:create, :before] => :before_save, + [:save, :after] => :after_save, + [:create, :after] => :after_save, + [:destroy, :before] => :before_delete, + [:destroy, :after] => :after_delete, + }.freeze + + # ActiveModel callback chains + phases with NO server trigger that can run + # them. The webhook router only runs the `:save` and `:create` chains (plus + # the destroy chain on beforeDelete) — it never runs `:update` or + # `:validation`. So these callbacks are LOCAL-ONLY: they fire for + # Ruby-initiated saves but can never fire for a non-Ruby client, and no + # trigger registration changes that. Surfaced as an informational note, not + # a fixable gap. 
+ LOCAL_ONLY_MAP = { + [:update, :before] => :before_update, + [:update, :after] => :after_update, + [:validation, :before] => :before_validation, + [:validation, :after] => :after_validation, + }.freeze + + # The ActiveModel callback chains we introspect. + CALLBACK_CHAINS = %i[validation create update save destroy].freeze + + # Directory under which a callback's source file marks it as + # framework-internal (defined by the gem) rather than app-defined. Computed + # from this file's own location: `__dir__` is `/lib/parse/webhooks`, so + # its parent is `/lib/parse`. + GEM_PARSE_DIR = ::File.expand_path("..", __dir__) + + # Per-class audit row. + class ClassAudit + # @return [String] the Parse class name (e.g. "Post", "_User", "*"). + attr_reader :parse_class + # @return [Hash{Symbol=>Array}] app-defined callbacks keyed by local + # trigger-ish name (`:before_save`, `:after_create`, ...). Each value is + # an array of `{ name:, source: }`. + attr_reader :callbacks + # @return [Array] local trigger names that have a registered + # webhook block/route for this class (or via the `*` wildcard route). + attr_reader :local_routes + # @return [Hash{Symbol=>String}] server-registered triggers for this class, + # mapped trigger-name => url. Empty when `network: false`. + attr_reader :server_triggers + # @return [Array] findings for this class. See {TriggerAudit} for the + # finding kinds. + attr_reader :findings + # @return [Boolean] whether a loaded Parse::Object subclass models this class. + attr_reader :modeled + + def initialize(parse_class:, callbacks:, local_routes:, server_triggers:, + findings:, modeled:) + @parse_class = parse_class + @callbacks = callbacks + @local_routes = local_routes + @server_triggers = server_triggers + @findings = findings + @modeled = modeled + end + + # @return [Boolean] true when the class has at least one finding. + def issues? + @findings.any? + end + + # @return [Hash] a JSON-safe representation of this row. 
+ def to_h + { + parse_class: parse_class, + modeled: modeled, + callbacks: callbacks, + local_routes: local_routes, + server_triggers: server_triggers, + findings: findings, + } + end + end + + # @return [Array] one row per audited class, sorted by name. + attr_reader :classes + # @return [Boolean] whether the server was queried for registered triggers. + attr_reader :networked + + # @param network [Boolean] when true, query Parse Server for registered + # triggers (requires a master-key client). When false, audit model + # callbacks against local routes only. + # @param client [Parse::Client, nil] optional client override for the server + # fetch. + # @param include_framework [Boolean] when true, also report gem-internal + # callbacks (e.g. the `_User` default-ACL callback). Off by default to keep + # the report focused on app-defined logic. + def initialize(network: true, client: nil, include_framework: false) + @networked = network + @include_framework = include_framework + @client = client + @server_lookup = network ? fetch_server_triggers : {} + @classes = build_classes + end + + # @return [Array] every finding across all classes, flattened, with the + # class name folded into each entry. Convenient for programmatic checks + # (CI fails the build if `gaps.any? { |g| g[:kind] == :callbacks_inert }`). + def gaps + @classes.flat_map do |ca| + ca.findings.map { |f| f.merge(parse_class: ca.parse_class) } + end + end + + # @return [Hash] the full JSON-safe report. + def to_h + { + networked: networked, + classes: @classes.map(&:to_h), + summary: summary, + } + end + alias as_json to_h + + # @return [Hash] finding counts keyed by kind, plus class totals. + def summary + counts = Hash.new(0) + gaps.each { |g| counts[g[:kind]] += 1 } + { + classes_audited: @classes.size, + classes_with_issues: @classes.count(&:issues?), + findings: counts, + } + end + + # @return [String] a human-readable, `puts`-friendly summary in the style of + # `Model.describe(pretty: true)`. 
+ def pretty + lines = ["Parse trigger audit (#{networked ? "server-compared" : "local-only"}):"] + @classes.each do |ca| + header = " #{ca.parse_class}" + header += " [server-only]" unless ca.modeled + lines << header + + ca.callbacks.each do |trigger, cbs| + names = cbs.map { |c| c[:name] }.join(", ") + lines << " callback #{trigger}: #{names}" + end + lines << " routes: #{ca.local_routes.map(&:to_s).sort.join(", ")}" if ca.local_routes.any? + if networked && ca.server_triggers.any? + lines << " server: #{ca.server_triggers.keys.map(&:to_s).sort.join(", ")}" + end + + if ca.findings.empty? + lines << " ok" + else + ca.findings.each { |f| lines << " #{finding_glyph(f[:kind])} #{f[:message]}" } + end + end + s = summary + lines << "" + lines << "Summary: #{s[:classes_audited]} class(es), " \ + "#{s[:classes_with_issues]} with issues." + s[:findings].sort.each { |kind, n| lines << " #{kind}: #{n}" } + lines.join("\n") + end + alias to_s pretty + + private + + # Pull `hooks/triggers` and build server[className][local_trigger] => url. + # Raises a clear error (rather than letting the bare REST 403 surface) when + # no master key is configured — the endpoint is master-key-only. + def fetch_server_triggers + client = @client || Parse::Webhooks.client + if client.respond_to?(:master_key) && client.master_key.blank? + raise ArgumentError, + "Parse::Webhooks.trigger_audit requires a master-key client to " \ + "read server triggers (the hooks/triggers endpoint is " \ + "master-key-only). Configure a master key, or pass network: false " \ + "to audit model callbacks against local routes only." + end + lookup = Hash.new { |h, k| h[k] = {} } + client.triggers.results.each do |t| + next unless t["url"].present? + name = t["triggerName"] + klass = t[Parse::Model::KEY_CLASS_NAME] || t["className"] + next if name.blank? || klass.blank? 
+ lookup[klass.to_s][name.to_s.underscore.to_sym] = t["url"] + end + lookup + end + + # The union of every class that any of the three sources knows about, so a + # server-only trigger (no local model) or a `*` wildcard route still appears. + def build_classes + names = Set.new + Parse.registered_classes.each { |c| names << c.to_s } + @server_lookup.each_key { |c| names << c.to_s } + OBJECT_TRIGGERS.each do |trigger| + route_map = Parse::Webhooks.routes[trigger] + next if route_map.nil? + route_map.each_key { |c| names << c.to_s } + end + # All trigger chains, in case a non-object trigger (function aside) is + # registered against a class name. + Parse::API::Hooks::TRIGGER_NAMES_LOCAL.each do |trigger| + route_map = Parse::Webhooks.routes[trigger] + next if route_map.nil? + route_map.each_key { |c| names << c.to_s } + end + + names.to_a.sort.map { |name| audit_class(name) } + end + + def audit_class(name) + klass = name == "*" ? nil : (Parse::Model.find_class(name) rescue nil) + callbacks = klass ? collect_callbacks(klass) : {} + routes = collect_local_routes(name) + server = @server_lookup[name] || {} + findings = analyze(name, callbacks, routes, server) + ClassAudit.new( + parse_class: name, + callbacks: callbacks, + local_routes: routes, + server_triggers: server, + findings: findings, + modeled: !klass.nil?, + ) + end + + # App-defined ActiveModel callbacks keyed by a trigger-ish name + # (`:before_save`, `:after_create`, `:before_update`, ...). Framework + # callbacks are filtered by source location unless include_framework is set. 
+ def collect_callbacks(klass) + out = {} + CALLBACK_CHAINS.each do |chain| + callback_chain = klass.send("_#{chain}_callbacks") + callback_chain.each do |cb| + next unless cb.kind == :before || cb.kind == :after + entry = describe_callback(klass, cb) + next if entry[:framework] && !@include_framework + key = :"#{cb.kind}_#{chain}" + (out[key] ||= []) << entry.slice(:name, :source).merge(framework: entry[:framework]) + end + end + # Drop the framework flag from values when not requested (keeps output lean). + unless @include_framework + out.each_value { |arr| arr.each { |h| h.delete(:framework) } } + end + out + end + + # Resolve a callback's display name, source location, and whether it is + # framework-internal (defined under the gem's lib/parse). + def describe_callback(klass, cb) + filter = cb.filter + case filter + when Symbol + loc = begin + klass.instance_method(filter).source_location + rescue NameError + nil + end + { name: filter.to_s, source: format_source(loc), framework: framework_source?(loc) } + when Proc + loc = filter.source_location + { name: "(block)", source: format_source(loc), framework: framework_source?(loc) } + else # String (eval'd) — uncommon + { name: "(string)", source: nil, framework: false } + end + end + + def framework_source?(loc) + return false if loc.nil? + ::File.expand_path(loc.first.to_s).start_with?(GEM_PARSE_DIR + ::File::SEPARATOR) + end + + def format_source(loc) + return nil if loc.nil? + "#{loc.first}:#{loc.last}" + end + + # Local trigger names with a registered block for this class. The `*` + # wildcard route applies to every class, so a class inherits any wildcard + # routes in addition to its own. + def collect_local_routes(name) + triggers = [] + Parse::API::Hooks::TRIGGER_NAMES_LOCAL.each do |trigger| + route_map = Parse::Webhooks.routes[trigger] + next if route_map.nil? + if route_map[name].present? + triggers << trigger + elsif name != "*" && route_map["*"].present? 
+ triggers << trigger + end + end + triggers + end + + # Cross-reference the three axes and emit findings. Finding kinds: + # + # - `:callbacks_inert` — app callbacks exist that map to an object trigger, + # but the local route and/or the server trigger is missing, so they never + # run for non-Ruby clients. The headline gap. `missing:` lists which + # piece(s) are absent (`:route`, `:server`). + # - `:route_not_registered` — a local webhook block exists but the server + # trigger is not registered, so Parse Server never POSTs to it. + # - `:orphan_server_trigger` — a server trigger is registered but there is no + # local route to handle it; the round-trip is wasted (the router returns a + # success no-op). + # - `:local_only_callbacks` — informational: `before_update` / `after_update` + # / `*_validation` callbacks that no server trigger can ever run. + def analyze(name, callbacks, routes, server) + findings = [] + server_known = @networked + + # Which object triggers do the app callbacks require? + required = Hash.new { |h, k| h[k] = [] } # trigger => [callback keys] + callbacks.each_key do |cb_key| + kind, chain = split_callback_key(cb_key) + if (trigger = CALLBACK_TRIGGER_MAP[[chain, kind]]) + required[trigger] << cb_key + end + end + + required.each do |trigger, cb_keys| + has_route = routes.include?(trigger) + has_server = server.key?(trigger) + missing = [] + missing << :route unless has_route + missing << :server if server_known && !has_server + next if missing.empty? + + findings << { + kind: :callbacks_inert, + trigger: trigger, + missing: missing, + callbacks: cb_keys.sort, + message: inert_message(name, trigger, missing, cb_keys), + } + end + + # Local block registered but no server trigger. + if server_known + routes.each do |trigger| + next unless OBJECT_TRIGGERS.include?(trigger) || + Parse::API::Hooks::TRIGGER_NAMES_LOCAL.include?(trigger) + next if server.key?(trigger) + # Wildcard-only coverage is reported on the "*" row, not here. 
+ next unless Parse::Webhooks.routes[trigger]&.key?(name) + findings << { + kind: :route_not_registered, + trigger: trigger, + message: "Local `webhook :#{trigger}` block for #{name} is not " \ + "registered as a server trigger — run register_triggers! " \ + "so Parse Server POSTs to it.", + } + end + + # Server trigger registered but nothing local handles it. + server.each_key do |trigger| + next if routes.include?(trigger) + findings << { + kind: :orphan_server_trigger, + trigger: trigger, + message: "Server trigger #{trigger} is registered for #{name} but no " \ + "local webhook block handles it — every matching operation " \ + "pays a webhook round-trip that does nothing.", + } + end + end + + # Local-only callbacks (informational). + local_only = callbacks.keys.filter_map do |cb_key| + kind, chain = split_callback_key(cb_key) + cb_key if LOCAL_ONLY_MAP.key?([chain, kind]) + end + if local_only.any? + findings << { + kind: :local_only_callbacks, + callbacks: local_only.sort, + message: "#{name} has local-only callbacks (#{local_only.sort.join(", ")}) " \ + "that no server trigger can run — they fire for Ruby-initiated " \ + "saves but never for non-Ruby clients.", + } + end + + findings + end + + # ":before_save" => [:before, :save]; "after_create" => [:after, :create]. 
+ def split_callback_key(cb_key) + s = cb_key.to_s + if s.start_with?("before_") + [:before, s.sub("before_", "").to_sym] + elsif s.start_with?("after_") + [:after, s.sub("after_", "").to_sym] + else + [nil, nil] + end + end + + def inert_message(name, trigger, missing, cb_keys) + callbacks = cb_keys.sort.join(", ") + reason = + if missing == [:route, :server] + "neither a local `webhook :#{trigger}` block nor a server trigger is " \ + "registered" + elsif missing == [:route] + "no local `webhook :#{trigger}` block is registered to handle it" + else # [:server] + "a local block exists but the #{trigger} server trigger is not registered" + end + "#{name} callbacks (#{callbacks}) will NOT run for non-Ruby clients: " \ + "#{reason}. Register `webhook :#{trigger}` and run register_triggers!." + end + + def finding_glyph(kind) + case kind + when :callbacks_inert then "GAP " + when :route_not_registered then "GAP " + when :orphan_server_trigger then "WARN" + when :local_only_callbacks then "note" + else " " + end + end + end + + class << self + # Audit trigger logic across all registered classes, cross-referencing model + # ActiveModel callbacks, locally registered webhook blocks, and the triggers + # registered on Parse Server. See {Parse::Webhooks::TriggerAudit}. + # + # The server comparison reads the master-key-only `hooks/triggers` endpoint, + # so `network: true` (the default) requires a master-key client. Pass + # `network: false` for a credential-free audit of callbacks vs. local routes. + # + # @param pretty [Boolean] when true, return the human-readable String summary + # instead of the Hash report. + # @param network [Boolean] query Parse Server for registered triggers. + # @param client [Parse::Client, nil] optional client override. + # @param include_framework [Boolean] include gem-internal callbacks. + # @return [Hash, String] the report Hash, or the pretty String when + # `pretty: true`. 
+ def trigger_audit(pretty: false, network: true, client: nil, include_framework: false) + audit = TriggerAudit.new( + network: network, client: client, include_framework: include_framework + ) + pretty ? audit.pretty : audit.to_h + end + end + end +end diff --git a/scripts/docker/Dockerfile.parse b/scripts/docker/Dockerfile.parse index 6e9c872..4ad1adf 100644 --- a/scripts/docker/Dockerfile.parse +++ b/scripts/docker/Dockerfile.parse @@ -1,4 +1,8 @@ -FROM parseplatform/parse-server:9 +# Pinned to a specific patch release (was the floating `:9` tag, which had +# cached to a pre-patch 8.4.0 build). 9.9.0 includes the MFA / authData security +# fixes — GHSA-pfj7-wv7c-22pr (auth-provider validation bypass on login) and +# GHSA-37mj-c2wf-cx96 / CVE-2026-33627 (TOTP secret leak via /users/me). +FROM parseplatform/parse-server:9.9.0 # Switch to root to copy and set permissions USER root diff --git a/scripts/docker/docker-compose.test.yml b/scripts/docker/docker-compose.test.yml index dbd1ba3..c78065b 100644 --- a/scripts/docker/docker-compose.test.yml +++ b/scripts/docker/docker-compose.test.yml @@ -5,9 +5,37 @@ version: '3.8' name: ${PSNEXT_PREFIX:-psnext-it} services: + # Security preflight — TEST STACK ONLY. Runs to completion before any + # real service starts (each gates on it via + # `service_completed_successfully`). Fails the stack closed when a + # `*_BIND` override would expose the stack on the LAN while privileged + # credentials are still the committed defaults. Invisible to the normal + # loopback run. See scripts/docker/preflight.sh for the rationale and + # escape hatches (ALLOW_INSECURE_BIND=1, or set real credentials). + preflight: + image: busybox:1.36 + container_name: ${PSNEXT_PREFIX:-psnext-it}-preflight + environment: + # Resolved binds (compose applies the 127.0.0.1 default here). 
+ PARSE_BIND: ${PARSE_BIND:-127.0.0.1} + MONGO_BIND: ${MONGO_BIND:-127.0.0.1} + REDIS_BIND: ${REDIS_BIND:-127.0.0.1} + DASHBOARD_BIND: ${DASHBOARD_BIND:-127.0.0.1} + # "1" only when the operator supplied a non-empty override; empty + # means the committed default credential is still in force. + PARSE_MASTER_KEY_SET: ${PARSE_MASTER_KEY:+1} + MONGO_ROOT_PASSWORD_SET: ${MONGO_ROOT_PASSWORD:+1} + ALLOW_INSECURE_BIND: ${ALLOW_INSECURE_BIND:-} + volumes: + - ./preflight.sh:/preflight.sh:ro + command: ["sh", "/preflight.sh"] + mongo: image: mongo:8 container_name: ${PSNEXT_PREFIX:-psnext-it}-mongo + depends_on: + preflight: + condition: service_completed_successfully # Bind to loopback so the test database isn't reachable from the LAN # when a developer runs `docker-compose up`. Override with # `MONGO_BIND=0.0.0.0` if you really want it exposed. @@ -83,6 +111,9 @@ services: redis: image: redis:7-alpine container_name: ${PSNEXT_PREFIX:-psnext-it}-redis + depends_on: + preflight: + condition: service_completed_successfully # Loopback-only by default. Used by the cache integration test # (cache_redis_integration_test.rb) and the synchronize-create lock # tests. Override with `REDIS_BIND=0.0.0.0` if you need to point a diff --git a/scripts/docker/docker-compose.verifyemail.yml b/scripts/docker/docker-compose.verifyemail.yml new file mode 100644 index 0000000..2de0acc --- /dev/null +++ b/scripts/docker/docker-compose.verifyemail.yml @@ -0,0 +1,4 @@ +services: + parse: + environment: + PARSE_SERVER_VERIFY_USER_EMAILS: "true" diff --git a/scripts/docker/preflight.sh b/scripts/docker/preflight.sh new file mode 100755 index 0000000..08df4f4 --- /dev/null +++ b/scripts/docker/preflight.sh @@ -0,0 +1,76 @@ +#!/bin/sh +# Preflight guard for the integration stack — TEST STACK ONLY. 
+# +# The Compose defaults bind every service to loopback (127.0.0.1) and fall +# back to KNOWN, COMMITTED test credentials (master key psnextItMasterKey, +# Mongo root admin:password, Dashboard admin:admin). That combination is +# safe on loopback — nothing on the LAN can reach it. +# +# It is NOT safe the moment a developer overrides a `*_BIND` to a +# non-loopback address (e.g. MONGO_BIND=0.0.0.0 to attach a remote client): +# the stack is then reachable from the LAN while protected only by +# credentials that are published in this very repository. An admin- +# credentialed Mongo / Parse master key exposed to a shared network is a +# real footgun, and the failure is silent — `docker compose up` just works +# and the developer never sees it. +# +# This guard runs first (every other service gates on it via +# `service_completed_successfully`) and FAILS THE STACK CLOSED whenever a +# non-loopback bind is combined with still-default privileged credentials. +# It is invisible to the normal loopback run. +# +# Escape hatches (pick one): +# 1. Keep it loopback — unset the *_BIND override (the default). +# 2. Use real secrets — set PARSE_MASTER_KEY and MONGO_ROOT_PASSWORD +# (inject them with `op run` / `doppler run` — +# see the README "Secret injection" recipe). +# 3. Acknowledge the risk — ALLOW_INSECURE_BIND=1 on a trusted/isolated +# network where exposure is intentional. + +set -eu + +# Treat an empty value as loopback: the compose interpolation passes the +# resolved bind, and an unset override resolves to the 127.0.0.1 default. +is_loopback() { + case "$1" in + "" | 127.0.0.1 | ::1 | localhost) return 0 ;; + *) return 1 ;; + esac +} + +exposed="" +for pair in "PARSE_BIND=${PARSE_BIND:-}" "MONGO_BIND=${MONGO_BIND:-}" \ + "REDIS_BIND=${REDIS_BIND:-}" "DASHBOARD_BIND=${DASHBOARD_BIND:-}"; do + val=${pair#*=} + is_loopback "$val" || exposed="$exposed ${pair%%=*}=$val" +done + +# Privileged credentials are "default" unless BOTH have been overridden. 
+# *_SET is "1" only when Compose interpolated a non-empty override +# (`${VAR:+1}`); empty means the committed default is in force. +if [ -n "${PARSE_MASTER_KEY_SET:-}" ] && [ -n "${MONGO_ROOT_PASSWORD_SET:-}" ]; then + default_creds=0 +else + default_creds=1 +fi + +if [ -n "$exposed" ] && [ "$default_creds" = "1" ] && [ "${ALLOW_INSECURE_BIND:-}" != "1" ]; then + echo "[preflight] REFUSING TO START — non-loopback bind with default credentials." >&2 + echo "[preflight]" >&2 + echo "[preflight] Exposed (non-loopback) bind(s):$exposed" >&2 + echo "[preflight] ...while still using the committed test credentials" >&2 + echo "[preflight] (master key psnextItMasterKey / Mongo admin:password)." >&2 + echo "[preflight] This publishes an admin-credentialed stack onto your LAN." >&2 + echo "[preflight]" >&2 + echo "[preflight] Resolve ONE of:" >&2 + echo "[preflight] 1. Keep it loopback — unset the *_BIND override." >&2 + echo "[preflight] 2. Set real secrets — PARSE_MASTER_KEY=... MONGO_ROOT_PASSWORD=..." >&2 + echo "[preflight] 3. Acknowledge intent — ALLOW_INSECURE_BIND=1 (trusted network only)." >&2 + exit 1 +fi + +if [ -n "$exposed" ]; then + echo "[preflight] OK — non-loopback bind(s)$exposed permitted (credentials overridden or ALLOW_INSECURE_BIND=1)." +else + echo "[preflight] OK — all services bound to loopback." +fi diff --git a/scripts/start-parse.sh b/scripts/start-parse.sh index c3ae148..a1ee4da 100755 --- a/scripts/start-parse.sh +++ b/scripts/start-parse.sh @@ -64,6 +64,52 @@ export PARSE_SERVER_ALLOW_CUSTOM_OBJECT_ID="${PARSE_SERVER_ALLOW_CUSTOM_OBJECT_I export PARSE_SERVER_LIVE_QUERY="${PARSE_SERVER_LIVE_QUERY:-{\"classNames\":[\"Song\",\"Album\",\"User\",\"_User\",\"TestLiveQuery\"]}}" export PARSE_SERVER_START_LIVE_QUERY_SERVER="${PARSE_SERVER_START_LIVE_QUERY_SERVER:-true}" +# Push configuration — test-stack only. Points at a no-op adapter bind-mounted +# from test/cloud (see test/cloud/dummy-push-adapter.js). 
It does NOT deliver to +# any real device gateway; it lets Parse Server accept `POST /parse/push` and +# create/complete a real `_PushStatus` so the push send+status lifecycle is +# integration-testable offline. Without this, `POST /push` returns code 115 +# "Missing push configuration". DO NOT use a no-op adapter in a deployed +# environment — it silently drops every notification. +export PARSE_SERVER_PUSH="${PARSE_SERVER_PUSH:-{\"adapter\":\"/parse-server/cloud/dummy-push-adapter.js\"}}" + +# MFA / 2FA configuration — test-stack only. Enables Parse Server's built-in +# TOTP MFA adapter so the Parse::MFA / two_factor_auth integration tests can +# enroll a user (authData.mfa.{secret,token}) and log in with a time-based code. +# Params match rotp's defaults (SHA1 / 6 digits / 30s period) so codes generated +# client-side validate server-side. +# +# NOTE: the `auth` option is the one Parse Server option that CANNOT be passed +# as a JSON env var — its Definitions entry has no objectParser, so +# PARSE_SERVER_AUTH_PROVIDERS is taken as a raw string and never JSON-parsed +# (the MFA adapter then receives `undefined` options and 500s). It must come +# from a config file, which parse-server JSON-parses natively. We write a +# minimal config file holding only the `auth` block and pass it to parse-server +# below; env vars still provide — and take precedence for — everything else +# (parse-server applies env first, then fills gaps from the file). +PARSE_AUTH_CONFIG_FILE="${PARSE_AUTH_CONFIG_FILE:-/tmp/psnext-parse-auth-config.json}" +cat > "$PARSE_AUTH_CONFIG_FILE" <<'AUTHCFG' +{ "auth": { "mfa": { "options": ["TOTP"], "digits": 6, "period": 30, "algorithm": "SHA1" } } } +AUTHCFG + +# Email — test-stack only. Captures outgoing mail into an `EmailCapture` class +# (see test/cloud/capturing-email-adapter.js) instead of sending it, so the +# client-side password-reset / verification integration tests can assert +# delivery and read back the reset link. 
`PARSE_PUBLIC_SERVER_URL` is required +# for Parse Server to build those links. Email verification is NOT enabled, so +# ordinary signups still work without a verification round-trip. DO NOT use a +# capturing adapter in a deployed environment — it drops every email. +export PARSE_SERVER_EMAIL_ADAPTER="${PARSE_SERVER_EMAIL_ADAPTER:-/parse-server/cloud/capturing-email-adapter.js}" +export PARSE_PUBLIC_SERVER_URL="${PARSE_PUBLIC_SERVER_URL:-http://localhost:${PARSE_HOST_PORT:-29337}/parse}" +export PARSE_SERVER_APP_NAME="${PARSE_SERVER_APP_NAME:-parse-stack-next-it}" +# Keep email verification OFF. Configuring an email adapter otherwise flips +# Parse Server into requiring verification, which makes signup return a user +# with NO session token until the address is verified — breaking the +# signup-on-save suite. Password reset does not need verification, only the +# adapter + public URL above. +export PARSE_SERVER_VERIFY_USER_EMAILS="${PARSE_SERVER_VERIFY_USER_EMAILS:-false}" +export PARSE_SERVER_PREVENT_LOGIN_WITH_UNVERIFIED_EMAIL="${PARSE_SERVER_PREVENT_LOGIN_WITH_UNVERIFIED_EMAIL:-false}" + # File upload — test-stack only. Authenticated session-token uploads are # permitted; public/anonymous uploads are NOT (mirrors a typical hardened # Parse Server config). The client_rest_files integration tests assert @@ -92,15 +138,17 @@ echo "Looking for parse-server..." which node ls -la /parse-server/ -# Try different ways to start parse-server +# Try different ways to start parse-server. The config file argument supplies +# the `auth` (MFA) block; every other option still comes from the environment. 
+echo " Auth config file: $PARSE_AUTH_CONFIG_FILE" if [ -f "/parse-server/bin/parse-server" ]; then echo "Using /parse-server/bin/parse-server" - exec /parse-server/bin/parse-server + exec /parse-server/bin/parse-server "$PARSE_AUTH_CONFIG_FILE" elif [ -f "/usr/src/app/bin/parse-server" ]; then echo "Using /usr/src/app/bin/parse-server" - exec /usr/src/app/bin/parse-server + exec /usr/src/app/bin/parse-server "$PARSE_AUTH_CONFIG_FILE" else echo "Trying with node and index.js" cd /parse-server - exec node ./bin/parse-server + exec node ./bin/parse-server "$PARSE_AUTH_CONFIG_FILE" fi diff --git a/test/cloud/capturing-email-adapter.js b/test/cloud/capturing-email-adapter.js new file mode 100644 index 0000000..19083a4 --- /dev/null +++ b/test/cloud/capturing-email-adapter.js @@ -0,0 +1,69 @@ +'use strict'; + +// Test-only email adapter for the parse-stack-next integration stack. +// +// Parse Server requires an email adapter (and a public server URL) before +// `POST /requestPasswordReset` / `requestVerificationEmail` will do anything. +// This adapter does NOT send real email. Instead it captures each outgoing +// message into an `EmailCapture` class via the local REST API (master key), so +// integration tests can assert that an email was generated and read back the +// reset / verification link. DO NOT use this in a deployed environment — it +// silently swallows every email and records reset links in plaintext. +// +// Wired via PARSE_SERVER_EMAIL_ADAPTER in scripts/start-parse.sh. 
+ +const APP_ID = process.env.PARSE_SERVER_APPLICATION_ID; +const MASTER_KEY = process.env.PARSE_SERVER_MASTER_KEY; +const MOUNT = process.env.PARSE_SERVER_MOUNT_PATH || '/parse'; +const BASE = `http://127.0.0.1:1337${MOUNT}`; + +async function capture(doc) { + try { + await fetch(`${BASE}/classes/EmailCapture`, { + method: 'POST', + headers: { + 'X-Parse-Application-Id': APP_ID, + 'X-Parse-Master-Key': MASTER_KEY, + 'Content-Type': 'application/json', + }, + body: JSON.stringify(doc), + }); + } catch (e) { + // Test-only: a capture failure must never break the request under test. + // eslint-disable-next-line no-console + console.warn('[capturing-email-adapter] capture failed:', e && e.message); + } +} + +class CapturingEmailAdapter { + constructor(options = {}) { + this.options = options; + } + + sendPasswordResetEmail({ link, appName, user }) { + return capture({ + kind: 'passwordReset', + email: user && user.get('email'), + username: user && user.get('username'), + link, + appName, + }); + } + + sendVerificationEmail({ link, appName, user }) { + return capture({ + kind: 'verification', + email: user && user.get('email'), + username: user && user.get('username'), + link, + appName, + }); + } + + sendMail({ to, subject, text }) { + return capture({ kind: 'mail', email: to, subject, text }); + } +} + +module.exports = CapturingEmailAdapter; +module.exports.default = CapturingEmailAdapter; diff --git a/test/cloud/dummy-push-adapter.js b/test/cloud/dummy-push-adapter.js new file mode 100644 index 0000000..20f2f5a --- /dev/null +++ b/test/cloud/dummy-push-adapter.js @@ -0,0 +1,57 @@ +'use strict'; + +// Test-only push adapter for the parse-stack-next integration stack. +// +// Parse Server's `POST /parse/push` endpoint requires a push adapter to be +// configured; without one it returns code 115 "Missing push configuration". +// This adapter does NOT deliver to any real device gateway (no FCM/APNS +// credentials, no network). 
It accepts every installation and reports a +// successful transmission, which lets Parse Server create and complete a real +// `_PushStatus` row. That makes the push *send + status lifecycle* testable +// deterministically and offline. +// +// Wired via PARSE_SERVER_PUSH in scripts/start-parse.sh. Do NOT ship this in a +// deployed environment — it silently drops every notification. +class DummyPushAdapter { + constructor(options = {}) { + this.options = options; + // Device types this adapter claims to handle. Installations with other + // types are skipped by Parse Server before send() is called. + this.validPushTypes = ['ios', 'android']; + } + + // Parse Server calls this with the push body and the matched installations. + // Return one result per installation. `transmitted: true` is tallied under + // numSent / sentPerType for that deviceType; `transmitted: false` under + // numFailed / failedPerType. + // + // Failure simulation (test hook): any installation whose deviceToken begins + // with "fail-" is reported as a failed transmission. This lets tests exercise + // the failure half of the _PushStatus lifecycle (numFailed, failedPerType, + // and mixed sent+failed pushes) deterministically and offline. 
+ send(body, installations) { + const results = installations.map((installation) => { + const token = installation.deviceToken || ''; + const failed = token.indexOf('fail-') === 0; + const result = { + transmitted: !failed, + device: { + deviceToken: installation.deviceToken, + deviceType: installation.deviceType, + }, + }; + if (failed) { + result.response = { error: 'simulated-failure' }; + } + return result; + }); + return Promise.resolve(results); + } + + getValidPushTypes() { + return this.validPushTypes; + } +} + +module.exports = DummyPushAdapter; +module.exports.default = DummyPushAdapter; diff --git a/test/cloud/main.js b/test/cloud/main.js index 8d356bc..9a1c659 100644 --- a/test/cloud/main.js +++ b/test/cloud/main.js @@ -108,4 +108,26 @@ Parse.Cloud.beforeSave('ValidatedThing', (request) => { if (typeof amount !== 'number' || amount <= 0) { throw new Parse.Error(Parse.Error.VALIDATION_ERROR, 'amount must be a positive number'); } +}); + +// Returns a saved Parse.Object. Parse Server 8.0 began encoding returned +// Parse objects as `{ "__type": "Object", ... }` dictionaries and 9.0 made it +// unconditional — this fixture lets the SDK assert it decodes that wire shape +// back into a Parse::Object (see cloud_object_decode_integration_test.rb). +Parse.Cloud.define('echoObject', async (request) => { + const obj = new Parse.Object('EchoObjectThing'); + obj.set('title', request.params.title || 'echoed'); + await obj.save(null, { useMasterKey: true }); + return obj; +}); + +// Returns an array of two saved Parse.Objects, to exercise element-wise +// decoding of an array result. 
+Parse.Cloud.define('echoObjects', async (request) => { + const a = new Parse.Object('EchoObjectThing'); + a.set('title', 'a'); + const b = new Parse.Object('EchoObjectThing'); + b.set('title', 'b'); + await Parse.Object.saveAll([a, b], { useMasterKey: true }); + return [a, b]; }); \ No newline at end of file diff --git a/test/lib/parse/account_lockout_error_test.rb b/test/lib/parse/account_lockout_error_test.rb new file mode 100644 index 0000000..ed32c76 --- /dev/null +++ b/test/lib/parse/account_lockout_error_test.rb @@ -0,0 +1,183 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" + +# Tests for Parse::Error::AccountLockoutError — the typed error raised by the +# SDK's client-side login rate-limit guard in Parse::API::Users#check_login_rate_limit!. +# +# The rate-limit state is stored in a plain in-memory Hash (no Redis/Docker +# required), so these are pure unit tests driven by a minimal object that +# includes the module. +class AccountLockoutErrorTest < Minitest::Test + + # A minimal host class that includes the Users module so we can exercise + # check_login_rate_limit! and track_login_attempt in isolation. 
+ def make_limiter + Class.new { include Parse::API::Users }.new + end + + # ========================================================================= + # Class existence and ancestry + # ========================================================================= + + def test_account_lockout_error_class_exists + assert defined?(Parse::Error::AccountLockoutError), + "Parse::Error::AccountLockoutError must be defined" + end + + def test_account_lockout_error_is_parse_error_subclass + assert Parse::Error::AccountLockoutError.ancestors.include?(Parse::Error), + "AccountLockoutError must descend from Parse::Error" + end + + def test_account_lockout_error_is_standard_error_subclass + assert Parse::Error::AccountLockoutError.ancestors.include?(StandardError), + "AccountLockoutError must descend from StandardError (bare rescue must still catch it)" + end + + def test_account_lockout_error_subclasses_authentication_error + # AccountLockoutError is in the login-failure taxonomy alongside + # EmailNotVerifiedError. A single `rescue AuthenticationError` handler + # should cover all login failures including lockout. 
+ assert Parse::Error::AccountLockoutError.ancestors.include?(Parse::Error::AuthenticationError), + "AccountLockoutError must inherit from AuthenticationError" + end + + # ========================================================================= + # AccountLockoutError caught by AuthenticationError rescue + # ========================================================================= + + def test_account_lockout_caught_by_authentication_error_rescue + raised = + begin + raise Parse::Error::AccountLockoutError, "locked" + rescue Parse::Error::AuthenticationError => e + e + end + assert_kind_of Parse::Error::AccountLockoutError, raised, + "rescue AuthenticationError must also catch AccountLockoutError" + end + + def test_account_lockout_caught_by_bare_rescue + raised = + begin + raise Parse::Error::AccountLockoutError, "locked" + rescue => e + e + end + assert_kind_of Parse::Error::AccountLockoutError, raised, + "a bare rescue must still catch AccountLockoutError" + end + + # ========================================================================= + # check_login_rate_limit! raises AccountLockoutError when locked + # ========================================================================= + + def test_check_login_rate_limit_raises_account_lockout_error_when_locked + limiter = make_limiter + # Seed the rate-limit table directly so the test has no timing dependency. 
+ limiter.send(:login_rate_limits)["alice"] = { + failures: 5, + locked_until: Time.now + 300 + } + + assert_raises(Parse::Error::AccountLockoutError) do + limiter.send(:check_login_rate_limit!, "alice") + end + end + + def test_check_login_rate_limit_error_message_contains_username + limiter = make_limiter + limiter.send(:login_rate_limits)["bob"] = { + failures: 5, + locked_until: Time.now + 300 + } + + error = assert_raises(Parse::Error::AccountLockoutError) do + limiter.send(:check_login_rate_limit!, "bob") + end + assert_match(/bob/, error.message) + end + + def test_check_login_rate_limit_error_message_contains_wait_seconds + limiter = make_limiter + limiter.send(:login_rate_limits)["carol"] = { + failures: 5, + locked_until: Time.now + 300 + } + + error = assert_raises(Parse::Error::AccountLockoutError) do + limiter.send(:check_login_rate_limit!, "carol") + end + assert_match(/\d+\s+seconds?/i, error.message, + "error message must include a wait duration in seconds") + end + + def test_check_login_rate_limit_does_not_raise_when_no_entry + limiter = make_limiter + # No entry for this username — should return cleanly. + assert_nil limiter.send(:check_login_rate_limit!, "unknown_user") + end + + def test_check_login_rate_limit_does_not_raise_when_lockout_expired + limiter = make_limiter + limiter.send(:login_rate_limits)["dave"] = { + failures: 5, + locked_until: Time.now - 1 # already expired + } + + # Must not raise; should return nil. + assert_nil limiter.send(:check_login_rate_limit!, "dave") + end + + # ========================================================================= + # End-to-end: triggering lockout via track_login_attempt + # ========================================================================= + + def test_lockout_raised_after_max_failures_via_track_login_attempt + limiter = make_limiter + # Each failure increments the counter; lockout kicks in at LOGIN_MAX_FAILURES. 
+ failures = Parse::API::Users::LOGIN_MAX_FAILURES + failures.times { limiter.send(:track_login_attempt, "eve", false) } + + assert_raises(Parse::Error::AccountLockoutError) do + limiter.send(:check_login_rate_limit!, "eve") + end + end + + def test_lockout_error_is_parse_error_subclass_in_end_to_end_raise + limiter = make_limiter + failures = Parse::API::Users::LOGIN_MAX_FAILURES + failures.times { limiter.send(:track_login_attempt, "frank", false) } + + error = assert_raises(Parse::Error::AccountLockoutError) do + limiter.send(:check_login_rate_limit!, "frank") + end + assert_kind_of Parse::Error, error + assert_kind_of Parse::Error::AuthenticationError, error + end + + def test_no_lockout_before_max_failures + limiter = make_limiter + failures = Parse::API::Users::LOGIN_MAX_FAILURES - 1 + failures.times { limiter.send(:track_login_attempt, "grace", false) } + + # One below the threshold — must not raise. + assert_nil limiter.send(:check_login_rate_limit!, "grace") + end + + def test_successful_login_clears_lockout_state + limiter = make_limiter + failures = Parse::API::Users::LOGIN_MAX_FAILURES + failures.times { limiter.send(:track_login_attempt, "henry", false) } + # Verify it IS locked first. + assert_raises(Parse::Error::AccountLockoutError) do + limiter.send(:check_login_rate_limit!, "henry") + end + + # A successful login clears the entry. 
+ limiter.send(:track_login_attempt, "henry", true) + assert_nil limiter.send(:check_login_rate_limit!, "henry") + end +end diff --git a/test/lib/parse/agent/mcp_rack_app_test.rb b/test/lib/parse/agent/mcp_rack_app_test.rb index 420a522..00bdc87 100644 --- a/test/lib/parse/agent/mcp_rack_app_test.rb +++ b/test/lib/parse/agent/mcp_rack_app_test.rb @@ -122,6 +122,144 @@ def test_block_form_constructor_works assert_instance_of Array, body end + # --------------------------------------------------------------------------- + # transport: :streamable_http consolidation switch + # --------------------------------------------------------------------------- + + # A GET env requesting the server→client listening stream. + def listening_stream_env(session_id: "sess-abc123") + { + "REQUEST_METHOD" => "GET", + "HTTP_ACCEPT" => "text/event-stream", + "HTTP_MCP_SESSION_ID" => session_id, + } + end + + def test_transport_streamable_http_enables_post_sse + app = build_app(transport: :streamable_http) + env = rack_env.merge("HTTP_ACCEPT" => "text/event-stream") + status, headers, _body = app.call(env) + assert_equal 200, status + assert_equal "text/event-stream", headers["Content-Type"] + end + + def test_transport_streamable_http_enables_get_listening_stream + app = build_app(transport: :streamable_http) + status, headers, _body = app.call(listening_stream_env) + assert_equal 200, status + assert_equal "text/event-stream", headers["Content-Type"] + end + + def test_transport_streamable_http_equals_streaming_plus_notifications + app = build_app(transport: :streamable_http) + assert_equal true, app.instance_variable_get(:@streaming) + refute_nil app.subscription_manager, + "transport: :streamable_http should build the server→client notification manager" + end + + def test_transport_default_leaves_post_non_streaming + app = build_app + env = rack_env.merge("HTTP_ACCEPT" => "text/event-stream") + _status, headers, _body = app.call(env) + assert_equal "application/json", 
headers["Content-Type"], + "default transport must keep the historical buffered-JSON behavior" + end + + def test_transport_default_rejects_get_with_405 + app = build_app + status, _headers, _body = app.call(listening_stream_env) + assert_equal 405, status, + "default transport opens no listening stream — GET falls through to 405" + end + + def test_transport_legacy_is_equivalent_to_default + app = build_app(transport: :legacy) + assert_equal false, app.instance_variable_get(:@streaming) + assert_nil app.subscription_manager + end + + def test_transport_streamable_http_conflicts_with_explicit_streaming + err = assert_raises(ArgumentError) do + build_app(transport: :streamable_http, streaming: true) + end + assert_match(/already enables streaming/, err.message) + end + + def test_transport_streamable_http_conflicts_with_explicit_notifications + assert_raises(ArgumentError) do + build_app(transport: :streamable_http, notifications: true) + end + end + + # Discriminating case: an EXPLICIT `false` must still conflict. This is what + # separates the nil-sentinel "omitted vs explicit" check from a plain + # truthiness check — `streaming: false` is falsey but not omitted, so the + # switch must still refuse it. A regression to `unless !streaming` would let + # these through. + def test_transport_streamable_http_conflicts_with_explicit_streaming_false + assert_raises(ArgumentError) do + build_app(transport: :streamable_http, streaming: false) + end + end + + def test_transport_streamable_http_conflicts_with_explicit_notifications_false + assert_raises(ArgumentError) do + build_app(transport: :streamable_http, notifications: false) + end + end + + def test_transport_streamable_http_combines_with_resource_subscriptions + # resource_subscriptions: upgrades the bus to the LiveQuery posture and is + # NOT subsumed by the switch, so the combination must be allowed. 
+ app = build_app(transport: :streamable_http, resource_subscriptions: true) + assert_equal true, app.instance_variable_get(:@streaming) + # resource_subscriptions: builds the manager with no supported-override, so + # it advertises resources.subscribe live when LiveQuery is up. notifications: + # alone forces supported: false. Proving the override is nil here shows the + # switch did not subsume / downgrade resource_subscriptions. + assert_nil app.subscription_manager.instance_variable_get(:@supported_override), + "resource_subscriptions: should keep the LiveQuery-backed posture" + end + + def test_unknown_transport_raises + err = assert_raises(ArgumentError) do + build_app(transport: :websocket) + end + assert_match(/transport:/, err.message) + end + + # --------------------------------------------------------------------------- + # max_concurrent_dispatchers — finite default cap + # --------------------------------------------------------------------------- + + def test_dispatcher_cap_defaults_to_finite + app = build_app + assert_equal Parse::Agent::MCPRackApp::DEFAULT_MAX_CONCURRENT_DISPATCHERS, + app.instance_variable_get(:@max_concurrent_dispatchers) + end + + def test_explicit_dispatcher_cap_is_honored + app = build_app(max_concurrent_dispatchers: 7) + assert_equal 7, app.instance_variable_get(:@max_concurrent_dispatchers) + end + + def test_explicit_nil_dispatcher_cap_is_unbounded + app = build_app(max_concurrent_dispatchers: nil) + assert_nil app.instance_variable_get(:@max_concurrent_dispatchers) + end + + def test_zero_dispatcher_cap_raises + assert_raises(ArgumentError) { build_app(max_concurrent_dispatchers: 0) } + end + + def test_negative_dispatcher_cap_raises + assert_raises(ArgumentError) { build_app(max_concurrent_dispatchers: -5) } + end + + def test_non_integer_dispatcher_cap_raises + assert_raises(ArgumentError) { build_app(max_concurrent_dispatchers: "100") } + end + # --------------------------------------------------------------------------- 
# Happy path # --------------------------------------------------------------------------- diff --git a/test/lib/parse/agent/mcp_streaming_test.rb b/test/lib/parse/agent/mcp_streaming_test.rb index f63296e..d98e575 100644 --- a/test/lib/parse/agent/mcp_streaming_test.rb +++ b/test/lib/parse/agent/mcp_streaming_test.rb @@ -508,7 +508,21 @@ def test_constructor_accepts_streaming_keyword end end - def test_constructor_warns_when_streaming_without_concurrency_cap + def test_constructor_warns_when_streaming_with_explicitly_unbounded_cap + # The cap now defaults to a finite value, so the orphan-DoS warning only + # fires when the operator EXPLICITLY opts into the unbounded surface. + warns = capture_warns do + Parse::Agent::MCPRackApp.new( + agent_factory: permissive_factory, + streaming: true, + heartbeat_interval: 1, + max_concurrent_dispatchers: nil, + ) + end + assert_match(/unbounded dispatcher cap/, warns) + end + + def test_constructor_does_not_warn_when_streaming_with_default_finite_cap warns = capture_warns do Parse::Agent::MCPRackApp.new( agent_factory: permissive_factory, @@ -516,7 +530,13 @@ def test_constructor_warns_when_streaming_without_concurrency_cap heartbeat_interval: 1, ) end - assert_match(/max_concurrent_dispatchers: nil \(unlimited\)/, warns) + assert_equal "", warns, "finite default cap must not trip the orphan-DoS warning" + end + + def test_streaming_default_finite_cap_value + app = Parse::Agent::MCPRackApp.new(agent_factory: permissive_factory, streaming: true) + assert_equal Parse::Agent::MCPRackApp::DEFAULT_MAX_CONCURRENT_DISPATCHERS, + app.instance_variable_get(:@max_concurrent_dispatchers) end def test_constructor_streaming_defaults_to_false @@ -748,11 +768,13 @@ def test_client_disconnect_mid_stream_no_leaked_threads drain_thread.join(2) - # After close, the worker should be killed. The dispatcher_thread is - # orphaned (cancellation is a separate deferred item) but will run to - # completion naturally. 
Poll with a generous deadline — Thread#kill - # propagation timing varies across Ruby versions and CI runners; the - # contract is "eventually," not a tight wall-clock bound. + # After close, the worker should be killed. The dispatcher thread is + # cooperatively cancelled (its token is tripped) and bounded by the + # per-tool Timeout + clean I/O deadlines; it is intentionally not + # force-killed (would risk connection-pool corruption). Poll with a + # generous deadline — Thread#kill propagation timing varies across Ruby + # versions and CI runners; the contract is "eventually," not a tight + # wall-clock bound. deadline = Time.now + 10.0 sleep 0.01 while Thread.list.any? { |t| t[:parse_mcp_sse_worker] && t.alive? } && Time.now < deadline @@ -761,6 +783,81 @@ def test_client_disconnect_mid_stream_no_leaked_threads "Worker thread still alive 10s after client disconnect" end + # --------------------------------------------------------------------------- + # 15b. Client disconnect is recorded as an abandonment (counter + notification) + # --------------------------------------------------------------------------- + + def test_disconnect_increments_abandoned_dispatcher_counter_and_emits_event + StreamingDispatcherStub.delay = 1.5 + app = streaming_app(heartbeat_interval: 0.1) + _status, _headers, body = app.call(rack_env(accept: "text/event-stream")) + + before = Parse::Agent::MCPRackApp.abandoned_dispatcher_count + events = [] + sub = ActiveSupport::Notifications.subscribe("parse.agent.mcp_dispatcher_abandoned") do |*args| + events << ActiveSupport::Notifications::Event.new(*args) + end + + # Receive one event then disconnect — drives each's ensure → close with + # completed_normally == false. 
+ drain_thread = Thread.new { body.each { |_chunk| break } } + drain_thread.join(2) + + assert_equal before + 1, Parse::Agent::MCPRackApp.abandoned_dispatcher_count + refute_empty events, "expected a parse.agent.mcp_dispatcher_abandoned event" + assert_equal :client_disconnect, events.last.payload[:reason] + assert_equal true, events.last.payload[:dispatcher_alive] + ensure + ActiveSupport::Notifications.unsubscribe(sub) if sub + end + + def test_normal_completion_does_not_record_abandonment + StreamingDispatcherStub.delay = 0 + # Long heartbeat so the stream produces just the response + DONE. + app = streaming_app(heartbeat_interval: 5) + _status, _headers, body = app.call(rack_env(accept: "text/event-stream")) + + before = Parse::Agent::MCPRackApp.abandoned_dispatcher_count + drain_body(body) # consume through DONE → completed_normally == true + + assert_equal before, Parse::Agent::MCPRackApp.abandoned_dispatcher_count, + "normal completion must not be recorded as an abandonment" + end + + # The discriminating branch: a premature close where the dispatcher has + # ALREADY finished (dispatcher_alive == false) but the client dropped before + # consuming DONE. This is a delivery miss, not an orphan: the counter is gated + # on dispatcher_alive so it must NOT move, while the notification still fires + # (carrying dispatcher_alive: false). Pins the gating contract that the two + # tests above leave vacuous (both would pass even if the gate were dropped). + def test_delivery_miss_emits_event_with_dispatcher_alive_false_but_does_not_count + StreamingDispatcherStub.delay = 0 # dispatcher finishes immediately + # Long heartbeat so no heartbeat is emitted; the worker pushes the response + # event ONLY after its `while dispatcher_thread.alive?` loop exits — i.e. + # after the dispatcher is deterministically dead. 
+ app = streaming_app(heartbeat_interval: 5) + _status, _headers, body = app.call(rack_env(accept: "text/event-stream")) + + before = Parse::Agent::MCPRackApp.abandoned_dispatcher_count + events = [] + sub = ActiveSupport::Notifications.subscribe("parse.agent.mcp_dispatcher_abandoned") do |*args| + events << ActiveSupport::Notifications::Event.new(*args) + end + + # Consume the response event (dispatcher already dead) but break before the + # DONE sentinel → completed_normally == false AND dispatcher_alive == false. + drain_thread = Thread.new { body.each { |_chunk| break } } + drain_thread.join(2) + + assert_equal before, Parse::Agent::MCPRackApp.abandoned_dispatcher_count, + "a delivery miss (dispatcher already finished) must NOT bump the genuine-orphan counter" + refute_empty events, "the notification must fire on every premature close" + assert_equal false, events.last.payload[:dispatcher_alive] + assert_equal :client_disconnect, events.last.payload[:reason] + ensure + ActiveSupport::Notifications.unsubscribe(sub) if sub + end + # --------------------------------------------------------------------------- # 16. 
max_concurrent_dispatchers cap: second concurrent request returns 503 # --------------------------------------------------------------------------- diff --git a/test/lib/parse/agent/tools_registration_test.rb b/test/lib/parse/agent/tools_registration_test.rb index 6a0efae..21c8bb9 100644 --- a/test/lib/parse/agent/tools_registration_test.rb +++ b/test/lib/parse/agent/tools_registration_test.rb @@ -317,6 +317,61 @@ def test_invoke_string_name_dispatches_to_registered_handler assert_equal({ ok: true }, result) end + # ------------------------------------------------------------------------- + # invoke — registered-handler timeout enforcement + # ------------------------------------------------------------------------- + + # A custom handler that runs past its declared timeout is interrupted with + # ToolTimeoutError — the bound is enforced by Tools.invoke's with_timeout + # wrap, not left to the handler. Proves the orphan-bounding contract holds + # for custom tools (previously the handler ran unbounded). + def test_invoke_enforces_registered_handler_timeout + T.register( + name: :slow_custom, + description: "Sleeps past its timeout", + parameters: { type: "object", properties: {}, required: [] }, + permission: :readonly, + timeout: 1, + handler: ->(_agent, **_args) { sleep 2; { ok: true } }, + ) + agent = Parse::Agent.new + assert_raises(Parse::Agent::ToolTimeoutError) do + T.invoke(agent, :slow_custom) + end + end + + # A fast handler well under its timeout must NOT be falsely interrupted. 
+ def test_invoke_does_not_time_out_fast_registered_handler + T.register( + name: :fast_custom, + description: "Returns immediately", + parameters: { type: "object", properties: {}, required: [] }, + permission: :readonly, + timeout: 5, + handler: ->(_agent, **_args) { { ok: true } }, + ) + agent = Parse::Agent.new + assert_equal({ ok: true }, T.invoke(agent, :fast_custom)) + end + + # register refuses a non-positive timeout: Timeout.timeout(0) would silently + # disable the bound, so the registration must fail loudly at boot. + def test_register_rejects_non_positive_timeout + %i[zero fractional negative].zip([0, 0.5, -3]).each do |label, value| + err = assert_raises(ArgumentError, "timeout: #{value.inspect} (#{label}) should raise") do + T.register( + name: :bad_timeout, + description: "x", + parameters: { type: "object", properties: {}, required: [] }, + permission: :readonly, + timeout: value, + handler: ->(_a, **) { {} }, + ) + end + assert_match(/timeout must be a positive integer/, err.message) + end + end + # ------------------------------------------------------------------------- # permission_for # ------------------------------------------------------------------------- diff --git a/test/lib/parse/aggregate_raw_values_test.rb b/test/lib/parse/aggregate_raw_values_test.rb new file mode 100644 index 0000000..d924557 --- /dev/null +++ b/test/lib/parse/aggregate_raw_values_test.rb @@ -0,0 +1,135 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" + +# Tests that Query#aggregate forwards raw_values: and raw_field_names: through +# the Aggregation object to the REST aggregate endpoint (Parse Server 9.9.0+). +class TestAggregateRawValues < Minitest::Test + + # A tiny stub that records the call arguments to aggregate_pipeline. 
+ class SpyClient + attr_reader :last_class_name, :last_pipeline, :last_raw_values, :last_raw_field_names + + def aggregate_pipeline(class_name, pipeline, raw_values: false, raw_field_names: false, **opts) + @last_class_name = class_name + @last_pipeline = pipeline + @last_raw_values = raw_values + @last_raw_field_names = raw_field_names + stub_response + end + + private + + def stub_response + resp = Minitest::Mock.new + resp.expect :present?, false + resp.expect :error?, true + resp.expect :result, [] + resp + end + end + + def setup + @query = Parse::Query.new("Post") + @spy = SpyClient.new + @query.instance_variable_set(:@client, @spy) + end + + # --- Query#aggregate interface ----------------------------------------- + + def test_aggregate_accepts_raw_values_kwarg + assert_silent do + @query.aggregate([{ "$match" => {} }], raw_values: true).execute! + end + end + + def test_aggregate_accepts_raw_field_names_kwarg + assert_silent do + @query.aggregate([{ "$match" => {} }], raw_field_names: true).execute! + end + end + + def test_aggregate_defaults_both_flags_to_false + @query.aggregate([{ "$match" => {} }]).execute! + assert_equal false, @spy.last_raw_values + assert_equal false, @spy.last_raw_field_names + end + + def test_aggregate_forwards_raw_values_true + @query.aggregate([{ "$match" => {} }], raw_values: true).execute! + assert_equal true, @spy.last_raw_values + end + + def test_aggregate_forwards_raw_field_names_true + @query.aggregate([{ "$match" => {} }], raw_field_names: true).execute! + assert_equal true, @spy.last_raw_field_names + end + + def test_aggregate_forwards_both_flags_together + @query.aggregate([{ "$match" => {} }], raw_values: true, raw_field_names: true).execute! 
+ assert_equal true, @spy.last_raw_values + assert_equal true, @spy.last_raw_field_names + end + + # --- API module: aggregate_pipeline body shape -------------------------- + # Exercises Parse::API::Aggregate#aggregate_pipeline directly by including + # the module into a lightweight stub that captures the query hash. + + class APIStub + include Parse::API::Aggregate + + attr_reader :last_query + + def request(method, path, query: {}, headers: {}, opts: {}) + @last_query = query + stub_response + end + + private + + def stub_response + resp = Minitest::Mock.new + resp.expect :present?, false + resp + end + end + + def api_stub + @api_stub ||= APIStub.new + end + + def test_api_aggregate_pipeline_omits_raw_values_by_default + api_stub.aggregate_pipeline("Post", []) + refute api_stub.last_query.key?(:rawValues), + "rawValues should not appear in the query when not set" + end + + def test_api_aggregate_pipeline_omits_raw_field_names_by_default + api_stub.aggregate_pipeline("Post", []) + refute api_stub.last_query.key?(:rawFieldNames), + "rawFieldNames should not appear in the query when not set" + end + + def test_api_aggregate_pipeline_adds_raw_values_true + api_stub.aggregate_pipeline("Post", [], raw_values: true) + assert_equal true, api_stub.last_query[:rawValues] + end + + def test_api_aggregate_pipeline_adds_raw_field_names_true + api_stub.aggregate_pipeline("Post", [], raw_field_names: true) + assert_equal true, api_stub.last_query[:rawFieldNames] + end + + def test_api_aggregate_pipeline_adds_both_flags + api_stub.aggregate_pipeline("Post", [], raw_values: true, raw_field_names: true) + assert_equal true, api_stub.last_query[:rawValues] + assert_equal true, api_stub.last_query[:rawFieldNames] + end + + def test_api_aggregate_pipeline_pipeline_json_still_present + api_stub.aggregate_pipeline("Post", [{ "$match" => { "status" => "published" } }], raw_values: true) + assert api_stub.last_query.key?(:pipeline), + "pipeline: key must still be present alongside 
rawValues" + end +end diff --git a/test/lib/parse/api_users_password_reset_rate_limit_test.rb b/test/lib/parse/api_users_password_reset_rate_limit_test.rb index b6055e0..8c9ca75 100644 --- a/test/lib/parse/api_users_password_reset_rate_limit_test.rb +++ b/test/lib/parse/api_users_password_reset_rate_limit_test.rb @@ -50,7 +50,7 @@ def test_first_attempts_are_allowed def test_sixth_attempt_locks_email_out 5.times { @client.request_password_reset("locked@example.com") } - err = assert_raises(RuntimeError) do + err = assert_raises(Parse::Error::AccountLockoutError) do @client.request_password_reset("locked@example.com") end assert_match(/Login rate limited/, err.message, @@ -68,7 +68,7 @@ def test_different_emails_have_independent_counters assert_equal 6, @client.requests.size # And a@ is still locked. - assert_raises(RuntimeError) do + assert_raises(Parse::Error::AccountLockoutError) do @client.request_password_reset("a@example.com") end end diff --git a/test/lib/parse/client_rest_cloud_job_integration_test.rb b/test/lib/parse/client_rest_cloud_job_integration_test.rb index a99f1b8..bd071b1 100644 --- a/test/lib/parse/client_rest_cloud_job_integration_test.rb +++ b/test/lib/parse/client_rest_cloud_job_integration_test.rb @@ -34,9 +34,12 @@ def setup # -------------------------------------------------------------------- # Client-mode: trigger_job must NOT silently succeed. The SDK's - # response middleware translates the Parse Server 403 "master key is - # required" into +Parse::Error::AuthenticationError+, so the - # rejection surfaces as a raise — pin that exact translation. + # response middleware translates the Parse Server 403 into + # +Parse::Error::AuthenticationError+, so the rejection surfaces as a + # raise — pin that translation. We do not pin the server's exact prose: + # older Parse Server said "master key is required", 9.x returns a + # generic "Permission denied (403)". The invariant is the authorization + # refusal, not its wording. 
# -------------------------------------------------------------------- def test_trigger_job_under_client_mode_does_not_silently_succeed err = nil @@ -47,8 +50,8 @@ def test_trigger_job_under_client_mode_does_not_silently_succeed end end - assert_match(/master key/i, err.message, - "rejection must cite the missing master key (got: #{err.message})") + assert_match(/master key|permission denied|forbidden|\b403\b/i, err.message, + "rejection must surface an authorization failure (got: #{err.message})") end # -------------------------------------------------------------------- diff --git a/test/lib/parse/client_rest_password_reset_integration_test.rb b/test/lib/parse/client_rest_password_reset_integration_test.rb new file mode 100644 index 0000000..1caf828 --- /dev/null +++ b/test/lib/parse/client_rest_password_reset_integration_test.rb @@ -0,0 +1,93 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper_integration" +require_relative "../../support/client_mode_helper" +require "securerandom" + +# Client-side coverage for +Parse::User.request_password_reset+ against a live +# Parse Server. Password reset requires a server email adapter + public server +# URL; the test stack configures a capturing adapter (see +# test/cloud/capturing-email-adapter.js, wired via scripts/start-parse.sh) that +# records each outgoing message into an +EmailCapture+ class instead of sending +# it, so the test can assert that an email was generated and read back the reset +# link. Requests run in CLIENT MODE (no master key) — `requestPasswordReset` is +# a public endpoint, which is how a real application initiates a reset. +class ClientRestPasswordResetIntegrationTest < Minitest::Test + include ParseStackIntegrationTest + include Parse::Test::ClientModeHelper + + # Mirrors the documents written by the capturing email adapter. 
+ class EmailCapture < Parse::Object + parse_class "EmailCapture" + property :kind + property :email + property :username + property :link + end + + def setup + skip "Docker integration tests require PARSE_TEST_USE_DOCKER=true" unless ENV["PARSE_TEST_USE_DOCKER"] == "true" + super + end + + # Poll (with master key) for a captured email. The adapter writes + # asynchronously, so allow a brief window. + def captured_email(email, kind: "passwordReset", timeout: 8) + deadline = Time.now + timeout + loop do + row = EmailCapture.query(email: email, kind: kind, order: :createdAt.desc).first + return row if row + break if Time.now > deadline + sleep 0.25 + end + nil + end + + # A client-mode reset request for an existing user reports success and the + # server generates a reset email carrying a tokenized link. + def test_request_password_reset_for_existing_user_sends_email + user, _password = seed_client_user("pwreset") + email = user.email + + ok = as_client do + assert_client_mode! + Parse::User.request_password_reset(email) + end + assert ok, "request_password_reset should report success for an existing user" + + captured = captured_email(email) + refute_nil captured, "a password-reset email should have been generated" + assert_equal "passwordReset", captured.kind + assert_match %r{\Ahttps?://}, captured.link.to_s, "the email should contain a reset link" + assert_includes captured.link.to_s, "token=", "the reset link should carry a reset token" + end + + # Parse Server returns success for an unknown email (so attackers cannot + # enumerate which addresses are registered), and no email is generated. 
+ def test_request_password_reset_for_unknown_email_does_not_enumerate + unknown = "nobody_#{SecureRandom.hex(6)}@test.com" + + ok = as_client { Parse::User.request_password_reset(unknown) } + assert ok, "an unknown email must be indistinguishable from a known one (no enumeration)" + + assert_nil captured_email(unknown, timeout: 3), + "no reset email should be generated for an unregistered address" + end + + # The instance helper resolves the user's own email. + def test_instance_request_password_reset_uses_user_email + user, _password = seed_client_user("pwreset_inst") + + ok = as_client { user.request_password_reset } + assert ok + + refute_nil captured_email(user.email), "the instance helper should trigger a reset email" + end + + # A blank email short-circuits in the SDK without a server round-trip. + def test_request_password_reset_blank_email_returns_false + refute Parse::User.request_password_reset(""), + "a blank email should return false without contacting the server" + end +end diff --git a/test/lib/parse/client_rest_server_info_integration_test.rb b/test/lib/parse/client_rest_server_info_integration_test.rb index c08744f..77d061a 100644 --- a/test/lib/parse/client_rest_server_info_integration_test.rb +++ b/test/lib/parse/client_rest_server_info_integration_test.rb @@ -48,8 +48,11 @@ def test_server_info_requires_master_key_under_client_mode Parse.client.server_info! end end - assert_match(/master key/i, err.message, - "rejection must cite the missing master key (got: #{err.message})") + # Don't pin the server's exact prose: older Parse Server said "master + # key is required", 9.x returns a generic "Permission denied (403)". + # The invariant is the authorization refusal, not its wording. 
+ assert_match(/master key|permission denied|forbidden|\b403\b/i, err.message, + "rejection must surface an authorization failure (got: #{err.message})") end # -------------------------------------------------------------------- diff --git a/test/lib/parse/cloud_object_decode_integration_test.rb b/test/lib/parse/cloud_object_decode_integration_test.rb new file mode 100644 index 0000000..ea4dd74 --- /dev/null +++ b/test/lib/parse/cloud_object_decode_integration_test.rb @@ -0,0 +1,48 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper_integration" + +# End-to-end proof that the SDK decodes the `__type`-encoded Parse objects that +# Parse Server 8.0+/9.x return from cloud functions back into Parse::Object. +# Backed by the `echoObject` / `echoObjects` cloud fixtures in test/cloud/main.js. +class CloudObjectDecodeIntegrationTest < Minitest::Test + include ParseStackIntegrationTest + + # Registered so decode resolves the className to a concrete class. parse_class + # is set explicitly because the nested constant's model name would otherwise + # carry the enclosing namespace and not match the wire className. + class EchoObjectThing < Parse::Object + parse_class "EchoObjectThing" + property :title, :string + end + + def teardown + EchoObjectThing.query.results.each { |o| o.destroy rescue nil } rescue nil + super + end + + def test_raw_envelope_is_type_object_encoded + # Pins the assumption the decoder is built on: PS 9.x wraps a returned + # object as { "__type": "Object", "className": ..., "objectId": ..., ... }. + raw = Parse.client.call_function("echoObject", { title: "x" }).result["result"] + assert_equal "Object", raw["__type"] + assert_equal "EchoObjectThing", raw["className"] + refute raw["objectId"].to_s.empty? 
+ end + + def test_cloud_function_returning_object_decodes_to_parse_object + obj = Parse.call_function("echoObject", { title: "hi" }) + assert_kind_of EchoObjectThing, obj + assert_equal "hi", obj.title + refute obj.id.to_s.empty?, "decoded object should carry its objectId" + end + + def test_cloud_function_returning_array_decodes_elementwise + arr = Parse.call_function("echoObjects", {}) + assert_kind_of Array, arr + assert_equal 2, arr.size + assert(arr.all? { |o| o.is_a?(EchoObjectThing) }, "each array element should decode") + assert_equal %w[a b].sort, arr.map(&:title).sort + end +end diff --git a/test/lib/parse/cloud_result_decode_test.rb b/test/lib/parse/cloud_result_decode_test.rb new file mode 100644 index 0000000..2bccf4a --- /dev/null +++ b/test/lib/parse/cloud_result_decode_test.rb @@ -0,0 +1,104 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" + +# Parse Server 8.0 flipped `encodeParseObjectInCloudFunction` to true and 9.0 +# removed the opt-out, so a cloud function returning a Parse object now yields a +# `__type`-encoded dictionary. These pin that the SDK decodes those envelopes +# back into Parse::Object / Parse::Pointer (matching every other Parse SDK) +# while leaving plain data and unregistered-class objects untouched. +class TestCloudResultDecode < Minitest::Test + Resp = Struct.new(:result) + + # A registered class so a full Object envelope decodes losslessly. 
+ class DecodePost < Parse::Object + parse_class "DecodePostCRD" + property :title, :string + end + + def decode(value) + Parse._decode_cloud_value(value) + end + + def test_registered_object_envelope_decodes_to_object + enc = { "__type" => "Object", "className" => "DecodePostCRD", + "objectId" => "abc123", "title" => "Hello" } + obj = decode(enc) + assert_kind_of DecodePost, obj + assert_equal "abc123", obj.id + assert_equal "Hello", obj.title + end + + def test_pointer_envelope_decodes_to_pointer + enc = { "__type" => "Pointer", "className" => "GhostClassCRD", "objectId" => "g1" } + ptr = decode(enc) + assert_kind_of Parse::Pointer, ptr + assert_equal "g1", ptr.id + assert_equal "GhostClassCRD", ptr.parse_class + end + + def test_unregistered_object_envelope_left_as_hash_no_loss + # Building an unregistered-class Object would degrade to a field-less + # Pointer; we must hand back the raw Hash to avoid losing attributes. + enc = { "__type" => "Object", "className" => "TotallyUnregisteredCRD", + "objectId" => "x", "foo" => "bar" } + out = decode(enc) + assert_kind_of Hash, out + assert_equal "bar", out["foo"] + end + + def test_literal_app_data_with_type_key_untouched + # App data that happens to carry a `__type` value we don't recognize must + # pass through unchanged. + data = { "__type" => "invoice", "amount" => 5 } + assert_equal data, decode(data) + end + + def test_scalars_unchanged + assert_equal "hello", decode("hello") + assert_equal 42, decode(42) + assert_nil decode(nil) + assert_equal true, decode(true) + end + + def test_array_of_objects_decodes_elementwise + arr = [ + { "__type" => "Object", "className" => "DecodePostCRD", "objectId" => "1", "title" => "A" }, + { "__type" => "Object", "className" => "DecodePostCRD", "objectId" => "2", "title" => "B" }, + ] + out = decode(arr) + assert_equal 2, out.size + assert(out.all? 
{ |o| o.is_a?(DecodePost) }) + assert_equal %w[A B], out.map(&:title) + end + + def test_nested_object_inside_plain_hash_decodes + payload = { "count" => 1, + "post" => { "__type" => "Object", "className" => "DecodePostCRD", + "objectId" => "z", "title" => "Nested" } } + out = decode(payload) + assert_equal 1, out["count"] + assert_kind_of DecodePost, out["post"] + assert_equal "Nested", out["post"].title + end + + def test_extract_cloud_result_unwraps_and_decodes + enc = { "__type" => "Object", "className" => "DecodePostCRD", + "objectId" => "abc123", "title" => "Hello" } + resp = Resp.new({ "result" => enc }) + obj = Parse._extract_cloud_result(resp) + assert_kind_of DecodePost, obj + assert_equal "Hello", obj.title + end + + def test_extract_cloud_result_passes_scalar_through + resp = Resp.new({ "result" => "Hello world!" }) + assert_equal "Hello world!", Parse._extract_cloud_result(resp) + end + + def test_extract_cloud_result_tolerates_non_hash_body + resp = Resp.new("raw-string-body") + assert_equal "raw-string-body", Parse._extract_cloud_result(resp) + end +end diff --git a/test/lib/parse/context_propagation_test.rb b/test/lib/parse/context_propagation_test.rb new file mode 100644 index 0000000..56a14c6 --- /dev/null +++ b/test/lib/parse/context_propagation_test.rb @@ -0,0 +1,248 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" + +# Tests for Parse Server context propagation (X-Parse-Cloud-Context header). +# +# Parse Server threads a caller-supplied `context` hash from a REST write or +# cloud-function call through to beforeSave/afterSave cloud triggers via +# req.info.context. The SDK participates on two sides: +# +# SEND — create_object / update_object / call_function accept `context:` and +# serialize it as the X-Parse-Cloud-Context header when present. +# RECEIVE — Parse::Webhooks::Payload exposes a `#context` accessor populated +# from the `context` key in the incoming trigger payload hash. 
+class TestContextPropagation < Minitest::Test + + # --------------------------------------------------------------------------- + # SEND side — header constant + # --------------------------------------------------------------------------- + + def test_cloud_context_header_constant_exists + assert_equal "X-Parse-Cloud-Context", Parse::Protocol::CLOUD_CONTEXT + end + + # --------------------------------------------------------------------------- + # SEND side — create_object header-building (exercises real API module logic) + # --------------------------------------------------------------------------- + + # A minimal class that includes Parse::API::Objects and captures the headers + # that the method would pass to `request`, without making a network call. + class FakeObjectsClient + include Parse::API::Objects + + attr_reader :captured_headers + + # Matches the signature that create_object/update_object call: + # request(method, uri, body:, headers:, opts:) + def request(_method, _uri, body: nil, headers: {}, opts: {}) + @captured_headers = headers + Parse::Response.new + end + + # required by uri_path (delegated to self.class via the module) + def self.uri_path(class_name, id = nil) + id ? 
"classes/#{class_name}/#{id}" : "classes/#{class_name}/" + end + + def uri_path(class_name, id = nil) + self.class.uri_path(class_name, id) + end + end + + def test_create_object_sets_cloud_context_header + ctx = { "requestId" => "abc-123", "source" => "test" } + fake = FakeObjectsClient.new + fake.create_object("Post", { title: "Hello" }, context: ctx) + + assert_equal ctx.to_json, fake.captured_headers[Parse::Protocol::CLOUD_CONTEXT] + end + + def test_create_object_omits_cloud_context_header_when_nil + fake = FakeObjectsClient.new + fake.create_object("Post", { title: "Hello" }) + + refute fake.captured_headers.key?(Parse::Protocol::CLOUD_CONTEXT), + "X-Parse-Cloud-Context header must be absent when context: is not supplied" + end + + def test_update_object_sets_cloud_context_header + ctx = { "userId" => "u1", "action" => "publish" } + fake = FakeObjectsClient.new + fake.update_object("Post", "abc123", { status: "published" }, context: ctx) + + assert_equal ctx.to_json, fake.captured_headers[Parse::Protocol::CLOUD_CONTEXT] + end + + def test_update_object_omits_cloud_context_header_when_nil + fake = FakeObjectsClient.new + fake.update_object("Post", "abc123", { status: "draft" }) + + refute fake.captured_headers.key?(Parse::Protocol::CLOUD_CONTEXT), + "X-Parse-Cloud-Context header must be absent when context: is not supplied" + end + + # Verify that a caller-owned headers hash is NOT mutated in place — the + # method must merge into a new hash, not modify the argument. 
+ def test_create_object_does_not_mutate_caller_headers + ctx = { "source" => "test" } + caller_hdrs = { "X-Custom" => "yes" }.freeze # frozen guards mutation + fake = FakeObjectsClient.new + + assert_silent { fake.create_object("Post", {}, headers: caller_hdrs, context: ctx) } + refute caller_hdrs.key?(Parse::Protocol::CLOUD_CONTEXT) + end + + # --------------------------------------------------------------------------- + # SEND side — call_function header-building (exercises real API module logic) + # --------------------------------------------------------------------------- + + class FakeCloudClient + include Parse::API::CloudFunctions + + attr_reader :captured_headers + + def request(_method, _uri, body: nil, headers: {}, opts: {}) + @captured_headers = headers + Parse::Response.new + end + end + + def test_call_function_sets_cloud_context_header + ctx = { "traceId" => "xyz-789" } + fake = FakeCloudClient.new + fake.call_function("myFunc", { arg: 1 }, context: ctx) + + assert_equal ctx.to_json, fake.captured_headers[Parse::Protocol::CLOUD_CONTEXT] + end + + def test_call_function_omits_cloud_context_header_when_nil + fake = FakeCloudClient.new + fake.call_function("myFunc", { arg: 1 }) + + refute fake.captured_headers.key?(Parse::Protocol::CLOUD_CONTEXT), + "X-Parse-Cloud-Context header must be absent when context: is not supplied" + end + + def test_call_function_with_session_sets_cloud_context_header + ctx = { "locale" => "en-US" } + fake = FakeCloudClient.new + fake.call_function_with_session("myFunc", { arg: 2 }, "sess-token-abc", context: ctx) + + assert_equal ctx.to_json, fake.captured_headers[Parse::Protocol::CLOUD_CONTEXT] + end + + # --------------------------------------------------------------------------- + # SEND side — module-level Parse.call_function threads context: to client + # --------------------------------------------------------------------------- + + def test_parse_call_function_with_context_threads_context_kwarg + ctx = { "requestId" 
=> "mod-test-01" } + + mock_client = Minitest::Mock.new + mock_response = Minitest::Mock.new + mock_response.expect :error?, false + mock_response.expect :result, { "result" => "ok" } + + # The module-level Parse.call_function must forward context: to the client + # method when it is non-nil. + mock_client.expect :call_function, mock_response, + ["ctxFunc", { param: "v" }], + opts: {}, context: ctx + + Parse::Client.stub :client, mock_client do + result = Parse.call_function("ctxFunc", { param: "v" }, context: ctx) + assert_equal "ok", result + end + + mock_client.verify + mock_response.verify + end + + def test_parse_call_function_without_context_does_not_pass_context_kwarg + # When context: is absent the call to the client method must carry no + # context: kwarg — this preserves exact compatibility with the existing + # mock expectations in cloud_functions_module_test.rb. + mock_client = Minitest::Mock.new + mock_response = Minitest::Mock.new + mock_response.expect :error?, false + mock_response.expect :result, { "result" => "plain" } + + mock_client.expect :call_function, mock_response, + ["plainFunc", {}], + opts: {} + + Parse::Client.stub :client, mock_client do + result = Parse.call_function("plainFunc", {}) + assert_equal "plain", result + end + + mock_client.verify + mock_response.verify + end + + # --------------------------------------------------------------------------- + # RECEIVE side — Parse::Webhooks::Payload context accessor + # --------------------------------------------------------------------------- + + def test_payload_exposes_context_when_present + ctx = { "requestId" => "r-42", "locale" => "fr-FR" } + payload = Parse::Webhooks::Payload.new( + "triggerName" => "beforeSave", + "object" => { "className" => "Post", "objectId" => "abc1" }, + "master" => true, + "context" => ctx, + ) + + assert_equal ctx, payload.context + end + + def test_payload_context_is_nil_when_absent + payload = Parse::Webhooks::Payload.new( + "triggerName" => "afterSave", 
+ "object" => { "className" => "Post", "objectId" => "def2" }, + "master" => true, + ) + + assert_nil payload.context + end + + def test_payload_context_is_not_in_credentials_scrub + # Verify context is NOT treated as a credential — its keys must survive + # intact even if they happen to match other scrubbed key names. + ctx = { "note" => "session notes are caller metadata, not credentials" } + payload = Parse::Webhooks::Payload.new( + "functionName" => "doWork", + "master" => false, + "context" => ctx, + ) + + assert_equal ctx, payload.context, + "context must pass through without credential scrubbing" + end + + def test_payload_context_appears_in_attributes + assert Parse::Webhooks::Payload::ATTRIBUTES.key?(:context), + "context must be listed in ATTRIBUTES so it appears in #as_json" + end + + def test_payload_context_responds_to_accessor + payload = Parse::Webhooks::Payload.new({}) + assert_respond_to payload, :context + assert_respond_to payload, :context= + end + + def test_payload_function_request_with_context + ctx = { "source" => "ios", "version" => "2.1" } + payload = Parse::Webhooks::Payload.new( + "functionName" => "processPost", + "params" => { "postId" => "xyz" }, + "master" => false, + "context" => ctx, + ) + + assert payload.function? + assert_equal ctx, payload.context + end +end diff --git a/test/lib/parse/email_verification_disruptive_test.rb b/test/lib/parse/email_verification_disruptive_test.rb new file mode 100644 index 0000000..d29db3e --- /dev/null +++ b/test/lib/parse/email_verification_disruptive_test.rb @@ -0,0 +1,97 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper_integration" +require "securerandom" + +# Email-address verification flow against a real Parse Server. +# +# Verification fundamentally changes signup (an unverified user gets no session +# token), so it cannot be enabled for the main integration suite. 
This test is +# therefore DISRUPTIVE: it recreates the Parse Server container with +# `verifyUserEmails=true` (via scripts/docker/docker-compose.verifyemail.yml), +# exercises the flow, and restores the default config in teardown. It runs only +# under `rake test:integration:disruptive` (excluded from `test` / +# `test:integration` by the `*disruptive*` filename), so it never reconfigures +# the shared server out from under other tests. +# +# The capturing email adapter (test/cloud/capturing-email-adapter.js) records +# the verification email into an `EmailCapture` class so the test can assert it +# was sent and read back the verification link. +class EmailVerificationDisruptiveTest < Minitest::Test + include ParseStackIntegrationTest + + COMPOSE = "scripts/docker/docker-compose.test.yml" + OVERRIDE = "scripts/docker/docker-compose.verifyemail.yml" + HEALTH_URL = "http://localhost:#{ENV['PARSE_HOST_PORT'] || 29337}/parse/health" + + class EmailCapture < Parse::Object + parse_class "EmailCapture" + property :kind + property :email + property :link + end + + def setup + skip "Docker integration tests require PARSE_TEST_USE_DOCKER=true" unless ENV["PARSE_TEST_USE_DOCKER"] == "true" + recreate_parse!(verify_emails: true) + super + end + + def teardown + super + ensure + # Always restore the default (no-verification) config for the shared server. 
+ recreate_parse!(verify_emails: false) + end + + def test_signup_sends_verification_email_and_request_resends + username = "ev_#{SecureRandom.hex(4)}" + email = "#{username}@test.com" + + user = Parse::User.new(username: username, password: "p4ssw0rd!", email: email) + assert user.save, "signup should succeed even when email verification is required" + + sent = captured_email(email) + refute_nil sent, "signup with verifyUserEmails should generate a verification email" + assert_equal "verification", sent.kind + assert_includes sent.link.to_s, "token=", "the verification link should carry a token" + + # The SDK can (re)request the verification email for the same address. + assert Parse::User.request_email_verification(email), + "request_email_verification should be accepted by the server" + + # And the instance helper resolves the user's own email. + assert user.request_email_verification, + "the instance helper should also request a verification email" + end + + private + + def recreate_parse!(verify_emails:) + files = verify_emails ? "-f #{COMPOSE} -f #{OVERRIDE}" : "-f #{COMPOSE}" + system("docker-compose #{files} up -d --force-recreate --no-deps parse", + out: IO::NULL, err: IO::NULL) + wait_for_health! 
+ end + + def wait_for_health!(timeout: 60) + deadline = Time.now + timeout + until Time.now > deadline + return if system("curl -sf #{HEALTH_URL} -o /dev/null 2>/dev/null") + sleep 1 + end + flunk "Parse Server did not become healthy within #{timeout}s after recreate" + end + + def captured_email(email, kind: "verification", timeout: 8) + deadline = Time.now + timeout + loop do + row = EmailCapture.query(email: email, kind: kind, order: :createdAt.desc).first + return row if row + break if Time.now > deadline + sleep 0.25 + end + nil + end +end diff --git a/test/lib/parse/embed_pending_test.rb b/test/lib/parse/embed_pending_test.rb new file mode 100644 index 0000000..e18f209 --- /dev/null +++ b/test/lib/parse/embed_pending_test.rb @@ -0,0 +1,97 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" + +# Unit tests for the v5.0 bulk/backfill surface: +# - Parse::Object#compute_embedding! (force in-place recompute) +# - Class.embed_pending! (objectId-cursor backfill) +# The provider is the deterministic Fixture; the query chain is stubbed +# so the backfill runs without a server. +class EmbedPendingTest < Minitest::Test + def self.register + Parse::Embeddings.register(:fx_ep, Parse::Embeddings::Fixture.new(dimensions: 4)) + end + register + + class EPItem < Parse::Object + parse_class "EPItem" + property :title, :string + property :embedding, :vector, dimensions: 4, provider: :fx_ep + embed :title, into: :embedding + end + + # ----- compute_embedding! ----- + + def test_compute_embedding_populates_vector_and_digest + r = EPItem.new(title: "hello world") + assert_nil r.embedding + assert_same r, r.compute_embedding! + assert_equal 4, r.embedding.dimensions + refute_nil r.embedding_digest + end + + def test_compute_embedding_unknown_field_raises + assert_raises(ArgumentError) { EPItem.new(title: "x").compute_embedding!(field: :nope) } + end + + # ----- embed_pending! 
(stubbed query chain) ----- + + # A fake record: records save calls; carries an id. + class FakeRecord + attr_reader :id, :saves + def initialize(id) = (@id = id; @saves = 0) + def save(**_opts) = (@saves += 1) + end + + # A fake query that returns successive canned batches and records the + # `:objectId.gt` cursor it was asked to filter on. + class FakeQuery + attr_reader :cursors + def initialize(batches) = (@batches = batches; @i = -1; @cursors = []) + def where(constraints = {}) + # The only .where call in the backfill carries the objectId cursor, + # so capture every where value (the key is a Parse operation object). + constraints.each_value { |v| @cursors << v } + self + end + def order(*) = self + def limit(*) = self + def results + @i += 1 + @batches[@i] || [] + end + end + + def test_embed_pending_saves_each_pending_record_until_drained + b1 = [FakeRecord.new("a"), FakeRecord.new("b")] + b2 = [FakeRecord.new("c")] # short batch (< batch_size) ends the loop + fq = FakeQuery.new([b1, b2]) + EPItem.stub(:query, ->(*_a) { fq }) do + n = EPItem.embed_pending!(batch_size: 2) + assert_equal 3, n + end + (b1 + b2).each { |r| assert_equal 1, r.saves } + # cursor advanced to the last id of the first full batch. 
+ assert_includes fq.cursors, "b" + end + + def test_embed_pending_respects_limit + b1 = [FakeRecord.new("a"), FakeRecord.new("b"), FakeRecord.new("c")] + fq = FakeQuery.new([b1, b1, b1]) + EPItem.stub(:query, ->(*_a) { fq }) do + n = EPItem.embed_pending!(batch_size: 3, limit: 2) + assert_equal 2, n + end + assert_equal 1, b1[0].saves + assert_equal 1, b1[1].saves + assert_equal 0, b1[2].saves, "limit should stop before the 3rd record" + end + + def test_embed_pending_empty_first_batch_is_zero + fq = FakeQuery.new([[]]) + EPItem.stub(:query, ->(*_a) { fq }) do + assert_equal 0, EPItem.embed_pending!(batch_size: 50) + end + end +end diff --git a/test/lib/parse/embeddings_spend_cap_test.rb b/test/lib/parse/embeddings_spend_cap_test.rb new file mode 100644 index 0000000..19eacbf --- /dev/null +++ b/test/lib/parse/embeddings_spend_cap_test.rb @@ -0,0 +1,88 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" + +# Unit tests for Parse::Embeddings::SpendCap — the per-tenant cumulative +# token cap with hard-refuse semantics. +class EmbeddingsSpendCapTest < Minitest::Test + SC = Parse::Embeddings::SpendCap + + def setup + SC.reset_all! + end + + def teardown + SC.reset_all! + end + + def test_disabled_by_default_is_noop + assert_nil SC.charge!(tenant_id: "t", tokens: 10_000) + assert_equal 0, SC.usage(tenant_id: "t") + end + + def test_charges_accumulate_and_hard_refuse + SC.configure(limit_tokens: 100, window: 3600) + assert_equal 60, SC.charge!(tenant_id: "t", tokens: 60) + assert_equal 60, SC.usage(tenant_id: "t") + err = assert_raises(SC::Exceeded) { SC.charge!(tenant_id: "t", tokens: 50) } + assert_equal 60, err.used + assert_equal 50, err.requested + assert_equal 100, err.limit + # The refused charge is NOT recorded. 
+ assert_equal 60, SC.usage(tenant_id: "t") + end + + def test_separate_tenants_have_separate_buckets + SC.configure(limit_tokens: 100) + SC.charge!(tenant_id: "a", tokens: 90) + assert_equal 90, SC.charge!(tenant_id: "b", tokens: 90) + assert_equal 90, SC.usage(tenant_id: "a") + assert_equal 90, SC.usage(tenant_id: "b") + end + + def test_per_tenant_override_wins_over_default + SC.configure(limit_tokens: 1000) + SC.configure("small", limit_tokens: 10) + assert_raises(SC::Exceeded) { SC.charge!(tenant_id: "small", tokens: 20) } + # default tenant still has the large cap. + assert_equal 500, SC.charge!(tenant_id: "big", tokens: 500) + end + + def test_per_tenant_disable_overrides_default + SC.configure(limit_tokens: 10) + SC.configure("vip", limit_tokens: nil) # uncapped for vip + assert_nil SC.charge!(tenant_id: "vip", tokens: 1_000_000) + end + + def test_request_larger_than_limit_has_nil_retry_after + SC.configure(limit_tokens: 10) + err = assert_raises(SC::Exceeded) { SC.charge!(tenant_id: "t", tokens: 20) } + assert_nil err.retry_after, "a charge that can never fit reports no retry_after" + end + + def test_nil_tenant_uses_shared_default_bucket + SC.configure(limit_tokens: 100) + SC.charge!(tenant_id: nil, tokens: 70) + assert_equal 70, SC.usage(tenant_id: nil) + end + + def test_estimate_tokens_is_chars_over_four + assert_equal 3, SC.estimate_tokens("abcdefghij") # 10/4 -> 3 (ceil) + assert_equal 0, SC.estimate_tokens("") + end + + def test_configure_rejects_bad_values + assert_raises(ArgumentError) { SC.configure(limit_tokens: 0) } + assert_raises(ArgumentError) { SC.configure(limit_tokens: 100, window: 0) } + end + + def test_reset_clears_usage_but_keeps_limits + SC.configure(limit_tokens: 100) + SC.charge!(tenant_id: "t", tokens: 50) + SC.reset!("t") + assert_equal 0, SC.usage(tenant_id: "t") + # limit still applies after a usage reset. 
+ assert_equal 100, SC.charge!(tenant_id: "t", tokens: 100) + end +end diff --git a/test/lib/parse/hooks_trigger_names_test.rb b/test/lib/parse/hooks_trigger_names_test.rb new file mode 100644 index 0000000..31c3cec --- /dev/null +++ b/test/lib/parse/hooks_trigger_names_test.rb @@ -0,0 +1,106 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" + +# The webhook trigger allowlist must mirror Parse Server's `triggers.Types` so +# the SDK no longer pre-rejects registration of the auth / LiveQuery / password- +# reset hooks. (This gates registration only; payload routing for the non-object +# shapes is a separate follow-up.) +class TestHooksTriggerNames < Minitest::Test + class HookHost + include Parse::API::Hooks + end + + NEW_TRIGGERS = %i[beforeLogin afterLogin afterLogout beforePasswordResetRequest + beforeConnect beforeSubscribe afterEvent].freeze + + def test_new_trigger_types_in_allowlist + NEW_TRIGGERS.each do |t| + assert_includes Parse::API::Hooks::TRIGGER_NAMES, t, "#{t} should be allowlisted" + end + end + + def test_original_object_triggers_preserved + %i[beforeSave afterSave beforeDelete afterDelete beforeFind afterFind].each do |t| + assert_includes Parse::API::Hooks::TRIGGER_NAMES, t + end + end + + def test_create_is_not_a_registerable_trigger + # beforeCreate/afterCreate are NOT Parse Server trigger types — they are + # ActiveModel callbacks dispatched inside beforeSave/afterSave. 
+ refute_includes Parse::API::Hooks::TRIGGER_NAMES, :afterCreate + refute_includes Parse::API::Hooks::TRIGGER_NAMES, :beforeCreate + refute_includes Parse::API::Hooks::TRIGGER_NAMES_LOCAL, :after_create + refute_includes Parse::API::Hooks::TRIGGER_NAMES_LOCAL, :before_create + end + + def test_verify_trigger_create_raises_helpful_guidance + host = HookHost.new + %i[before_create beforeCreate].each do |t| + err = assert_raises(ArgumentError) { host.send(:_verify_trigger, t) } + assert_match(/no beforeCreate webhook trigger/, err.message) + assert_match(/beforeSave/, err.message) + end + %i[after_create afterCreate].each do |t| + err = assert_raises(ArgumentError) { host.send(:_verify_trigger, t) } + assert_match(/no afterCreate webhook trigger/, err.message) + assert_match(/afterSave/, err.message) + end + end + + def test_local_snake_case_list_in_sync + assert_equal Parse::API::Hooks::TRIGGER_NAMES.length, + Parse::API::Hooks::TRIGGER_NAMES_LOCAL.length + assert_includes Parse::API::Hooks::TRIGGER_NAMES_LOCAL, :before_password_reset_request + assert_includes Parse::API::Hooks::TRIGGER_NAMES_LOCAL, :after_event + end + + def test_verify_trigger_accepts_snake_case_new_names + host = HookHost.new + assert_equal :beforeLogin, host.send(:_verify_trigger, :before_login) + assert_equal :beforePasswordResetRequest, + host.send(:_verify_trigger, :before_password_reset_request) + assert_equal :afterEvent, host.send(:_verify_trigger, "after_event") + end + + def test_verify_trigger_accepts_camel_case_new_names + host = HookHost.new + assert_equal :beforeSubscribe, host.send(:_verify_trigger, "beforeSubscribe") + end + + def test_verify_trigger_still_rejects_bogus + host = HookHost.new + assert_raises(ArgumentError) { host.send(:_verify_trigger, :totallyNotATrigger) } + end + + # --- trigger className validation (@File / @Connect pseudo-classes) ------- + + def test_trigger_class_name_accepts_pseudo_classes + ps = Parse::API::PathSegment + assert_equal "@File", 
ps.trigger_class_name!("@File") + assert_equal "@Connect", ps.trigger_class_name!("@Connect") + assert_equal "_User", ps.trigger_class_name!("_User") + assert_equal "Post", ps.trigger_class_name!("Post") + end + + def test_trigger_class_name_rejects_path_traversal + ps = Parse::API::PathSegment + %w[../_User a/b @@x @ a.b].each do |bad| + assert_raises(ArgumentError, "#{bad.inspect} should be rejected") do + ps.trigger_class_name!(bad) + end + end + end + + # --- webhook route DSL rejects create with guidance ---------------------- + + def test_route_dsl_rejects_create_with_guidance + err = assert_raises(ArgumentError) do + Parse::Webhooks.route(:after_create, "Post") { parse_object } + end + assert_match(/no after_create webhook/, err.message) + assert_match(/after_save/, err.message) + end +end diff --git a/test/lib/parse/hooks_trigger_registration_integration_test.rb b/test/lib/parse/hooks_trigger_registration_integration_test.rb new file mode 100644 index 0000000..6ae8f30 --- /dev/null +++ b/test/lib/parse/hooks_trigger_registration_integration_test.rb @@ -0,0 +1,92 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper_integration" + +# Integration coverage proving that each webhook trigger type added to the +# allowlist in 5.4.0 can actually be registered against a live Parse Server +# (9.x) — register -> fetch -> delete round-trips cleanly. This is the surface +# the allowlist expansion enabled; payload routing for the non-object trigger +# shapes (login / connect / subscribe carry no `object`) remains a follow-up. +# +# NOTE on `beforeConnect`: it is a connection-global trigger whose documented +# className is the `@Connect` sentinel. 
Parse Server accepts `@Connect` on +# create, but the SDK's `PathSegment.identifier!` guard rejects the leading `@` +# on fetch/delete (same as `@File` for file triggers) — so this test exercises +# `beforeConnect` under a concrete className (`_User`) where the full lifecycle +# is SDK-manageable. First-class `@Connect` / `@File` path handling is a +# separate follow-up. +class HooksTriggerRegistrationIntegrationTest < Minitest::Test + include ParseStackIntegrationTest + + WEBHOOK_URL = "https://hooks.example.com/parse-stack-trigger-it" + + # [triggerName, className] + NEW_TRIGGERS = [ + [:beforeLogin, "_User"], + [:afterLogin, "_User"], + [:afterLogout, "_Session"], + [:beforePasswordResetRequest, "_User"], + [:beforeSubscribe, "HookRegITClass"], + [:afterEvent, "HookRegITClass"], + [:beforeConnect, "_User"], + ].freeze + + def teardown + # Best-effort: remove any registration a failed assertion left behind so + # the next run (and the rest of the suite) starts clean. + if Parse::Client.client&.master_key.present? + NEW_TRIGGERS.each { |t, k| Parse.client.delete_trigger(t, k) rescue nil } + %i[beforeSave afterSave beforeDelete].each { |t| Parse.client.delete_trigger(t, "@File") rescue nil } + Parse.client.delete_trigger(:beforeSave, "HookRegITClass") rescue nil + end + super + end + + def test_new_trigger_types_register_fetch_delete + skip "hook registration requires a master key" unless Parse::Client.client&.master_key.present? 
+ + NEW_TRIGGERS.each do |trigger, klass| + created = Parse.client.create_trigger(trigger, klass, WEBHOOK_URL) + refute created.error?, "register #{trigger}/#{klass} failed: #{created.error}" + assert_equal klass, created.result["className"], + "#{trigger}/#{klass}: server echoed an unexpected className" + assert_equal WEBHOOK_URL, created.result["url"] + + fetched = Parse.client.fetch_trigger(trigger, klass) + refute fetched.error?, "fetch #{trigger}/#{klass} failed: #{fetched.error}" + assert_equal WEBHOOK_URL, fetched.result["url"] + + deleted = Parse.client.delete_trigger(trigger, klass) + refute deleted.error?, "delete #{trigger}/#{klass} failed: #{deleted.error}" + end + end + + # A previously-allowed object trigger must still register cleanly after the + # allowlist grew (guards against an accidental regression in the expansion). + def test_object_trigger_still_registers + skip "hook registration requires a master key" unless Parse::Client.client&.master_key.present? + + created = Parse.client.create_trigger(:beforeSave, "HookRegITClass", WEBHOOK_URL) + refute created.error?, "register beforeSave failed: #{created.error}" + Parse.client.delete_trigger(:beforeSave, "HookRegITClass") + end + + # File triggers use the `@File` pseudo-class. Before the trigger-className + # validator relaxation, create succeeded but fetch/delete raised on the `@`. + # Now the full lifecycle works through the SDK. + def test_file_trigger_at_pseudo_class_lifecycle + skip "hook registration requires a master key" unless Parse::Client.client&.master_key.present? 
+ + %i[beforeSave afterSave beforeDelete].each do |trigger| + created = Parse.client.create_trigger(trigger, "@File", WEBHOOK_URL) + refute created.error?, "register #{trigger}/@File failed: #{created.error}" + + fetched = Parse.client.fetch_trigger(trigger, "@File") + refute fetched.error?, "fetch #{trigger}/@File failed: #{fetched.error}" + + deleted = Parse.client.delete_trigger(trigger, "@File") + refute deleted.error?, "delete #{trigger}/@File failed: #{deleted.error}" + end + end +end diff --git a/test/lib/parse/live_query/subscription_test.rb b/test/lib/parse/live_query/subscription_test.rb index 7db1aa2..cdaf52c 100644 --- a/test/lib/parse/live_query/subscription_test.rb +++ b/test/lib/parse/live_query/subscription_test.rb @@ -80,6 +80,42 @@ def test_to_subscribe_message_without_optional_fields refute message.key?(:sessionToken) refute message[:query].key?(:fields) + refute message[:query].key?(:keys) + end + + # Parse Server 7.0 renamed the subscription field-projection option from + # `fields` to `keys`. On PS 7+ a frame carrying only `fields` is ignored + # and events return ALL columns — a silent projection break. The subscribe + # frame must emit `keys` (and keep `fields` for pre-7 servers). 
+ def test_to_subscribe_message_emits_keys_for_projection + message = @subscription.to_subscribe_message + assert_equal ["title", "plays"], message[:query][:keys], + "subscribe frame must emit the PS 7+ `keys` projection option" + assert_equal ["title", "plays"], message[:query][:fields], + "subscribe frame keeps `fields` for pre-7.0 server compatibility" + end + + def test_keys_constructor_alias + subscription = Parse::LiveQuery::Subscription.new( + client: @mock_client, + class_name: "Song", + keys: ["title"], + ) + assert_equal ["title"], subscription.fields + assert_equal ["title"], subscription.keys + message = subscription.to_subscribe_message + assert_equal ["title"], message[:query][:keys] + assert_equal ["title"], message[:query][:fields] + end + + def test_keys_takes_precedence_over_fields_when_both_given + subscription = Parse::LiveQuery::Subscription.new( + client: @mock_client, + class_name: "Song", + fields: ["legacy"], + keys: ["canonical"], + ) + assert_equal ["canonical"], subscription.fields end def test_to_unsubscribe_message diff --git a/test/lib/parse/live_query/upstream_fixes_test.rb b/test/lib/parse/live_query/upstream_fixes_test.rb index 629351c..6bf994d 100644 --- a/test/lib/parse/live_query/upstream_fixes_test.rb +++ b/test/lib/parse/live_query/upstream_fixes_test.rb @@ -130,13 +130,15 @@ def initialize(master_key: nil) end # Mirror Client#subscribe's signature so the kwarg propagation is exercised. 
- def subscribe(class_name, where: {}, fields: nil, session_token: nil, + def subscribe(class_name, where: {}, fields: nil, keys: nil, watch: nil, session_token: nil, use_master_key: false, &block) sub = Parse::LiveQuery::Subscription.new( client: self, class_name: class_name.to_s, query: where, fields: fields, + keys: keys, + watch: watch, session_token: session_token, use_master_key: use_master_key, ) @@ -187,13 +189,15 @@ def initialize @master_key = nil end - def subscribe(class_name, where: {}, fields: nil, session_token: nil, + def subscribe(class_name, where: {}, fields: nil, keys: nil, watch: nil, session_token: nil, use_master_key: false, &block) sub = Parse::LiveQuery::Subscription.new( client: self, class_name: class_name.to_s, query: where, fields: fields, + keys: keys, + watch: watch, session_token: session_token, use_master_key: use_master_key, ) diff --git a/test/lib/parse/live_query/watch_test.rb b/test/lib/parse/live_query/watch_test.rb new file mode 100644 index 0000000..a305f71 --- /dev/null +++ b/test/lib/parse/live_query/watch_test.rb @@ -0,0 +1,137 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../../test_helper" +require_relative "../../../../lib/parse/live_query" + +# Tests for the LiveQuery `watch:` option (PS 7.0+, PR #8028). +# `watch` filters which field mutations trigger an update event, independently +# of field projection (`keys`/`fields` which control what the event payload +# contains). 
+class TestLiveQueryWatch < Minitest::Test + def mock_client + @mock_client ||= Minitest::Mock.new + end + + def subscription_with(opts = {}) + Parse::LiveQuery::Subscription.new( + client: mock_client, + class_name: "Post", + query: {}, + **opts, + ) + end + + # --- Subscription constructor ------------------------------------------ + + def test_watch_attr_reader_exists + sub = subscription_with + assert_respond_to sub, :watch + end + + def test_watch_defaults_to_nil + sub = subscription_with + assert_nil sub.watch + end + + def test_watch_stores_array_of_strings + sub = subscription_with(watch: ["title", "status"]) + assert_equal ["title", "status"], sub.watch + end + + def test_watch_stores_array_of_symbols + sub = subscription_with(watch: [:title, :status]) + assert_equal [:title, :status], sub.watch + end + + def test_watch_stores_mixed_array + sub = subscription_with(watch: ["title", :status]) + assert_equal ["title", :status], sub.watch + end + + # --- to_subscribe_message --------------------------------------------- + + def test_subscribe_message_omits_watch_when_nil + sub = subscription_with + msg = sub.to_subscribe_message + refute msg[:query].key?(:watch), + "watch must not appear in the message when nil" + end + + def test_subscribe_message_omits_watch_when_empty_array + sub = subscription_with(watch: []) + msg = sub.to_subscribe_message + refute msg[:query].key?(:watch), + "watch must not appear in the message when the array is empty" + end + + def test_subscribe_message_includes_watch_when_set + sub = subscription_with(watch: ["title", "status"]) + msg = sub.to_subscribe_message + assert msg[:query].key?(:watch), + "watch must appear in the subscribe message when set" + assert_equal ["title", "status"], msg[:query][:watch] + end + + def test_subscribe_message_watch_and_keys_are_independent + sub = subscription_with( + keys: ["title", "author"], + watch: ["status"], + ) + msg = sub.to_subscribe_message + assert_equal ["title", "author"], 
msg[:query][:keys] + assert_equal ["status"], msg[:query][:watch] + end + + def test_subscribe_message_base_structure_intact_with_watch + sub = subscription_with(watch: ["title"]) + msg = sub.to_subscribe_message + assert_equal "subscribe", msg[:op] + assert_equal "Post", msg[:query][:className] + assert_equal({}, msg[:query][:where]) + assert_equal ["title"], msg[:query][:watch] + end + + # --- Client#subscribe forwarding --------------------------------------- + + def test_lq_client_subscribe_accepts_watch_kwarg + # Build a minimal LiveQuery::Client stub + stub_client = Object.new + captured = {} + + stub_client.define_singleton_method(:subscribe) do |class_name, where: {}, fields: nil, keys: nil, watch: nil, session_token: nil, use_master_key: false, &block| + captured[:watch] = watch + Parse::LiveQuery::Subscription.new( + client: stub_client, + class_name: class_name, + query: where, + watch: watch, + ) + end + + q = Parse::Query.new("Post") + sub = q.subscribe(watch: ["title"], client: stub_client) + assert_equal ["title"], sub.watch + assert_equal ["title"], captured[:watch] + end + + # --- Query#subscribe forwarding --------------------------------------- + + def test_query_subscribe_accepts_and_forwards_watch + captured_watch = nil + spy_lq = Object.new + spy_lq.define_singleton_method(:subscribe) do |class_name, where: {}, fields: nil, keys: nil, watch: nil, session_token: nil, use_master_key: false, &block| + captured_watch = watch + Parse::LiveQuery::Subscription.new( + client: spy_lq, + class_name: class_name, + query: where, + watch: watch, + ) + end + + q = Parse::Query.new("Post") + q.subscribe(watch: ["title", "body"], client: spy_lq) + assert_equal ["title", "body"], captured_watch + end +end diff --git a/test/lib/parse/live_query_integration_test.rb b/test/lib/parse/live_query_integration_test.rb index 00f1fb5..ad18dbd 100644 --- a/test/lib/parse/live_query_integration_test.rb +++ b/test/lib/parse/live_query_integration_test.rb @@ -95,22 +95,16 
@@ def test_subscribe_receives_create_event # Wait for subscription to be confirmed wait_for_subscription(subscription) - # Create an object - should trigger callback - new_obj = TestLiveQueryModel.new - new_obj.name = "Test Object" - new_obj.value = 42 - new_obj.status = "active" + # Create a publicly-readable object - should trigger callback + new_obj = build_public(name: "Test Object", value: 42, status: "active") new_obj.save # Wait for callback (with timeout) - callback_called.wait(5) + callback_called.wait(EVENT_TIMEOUT) - if callback_called.set? - assert_equal "Test Object", created_object.name - assert_equal 42, created_object.value - else - skip "LiveQuery create event not received (may be server configuration issue)" - end + assert callback_called.set?, "LiveQuery create event not received within #{EVENT_TIMEOUT}s" + assert_equal "Test Object", created_object.name + assert_equal 42, created_object.value subscription.unsubscribe new_obj.destroy @@ -119,10 +113,8 @@ def test_subscribe_receives_create_event def test_subscribe_receives_update_event skip_unless_livequery_available - # Create initial object - obj = TestLiveQueryModel.new - obj.name = "Original" - obj.value = 1 + # Create initial publicly-readable object + obj = build_public(name: "Original", value: 1) obj.save updated_object = nil @@ -143,14 +135,11 @@ def test_subscribe_receives_update_event obj.value = 2 obj.save - callback_called.wait(5) + callback_called.wait(EVENT_TIMEOUT) - if callback_called.set? 
- assert_equal "Updated", updated_object.name - assert_equal 2, updated_object.value - else - skip "LiveQuery update event not received (may be server configuration issue)" - end + assert callback_called.set?, "LiveQuery update event not received within #{EVENT_TIMEOUT}s" + assert_equal "Updated", updated_object.name + assert_equal 2, updated_object.value subscription.unsubscribe obj.destroy @@ -159,10 +148,8 @@ def test_subscribe_receives_update_event def test_subscribe_receives_delete_event skip_unless_livequery_available - # Create object to delete - obj = TestLiveQueryModel.new - obj.name = "ToDelete" - obj.value = 99 + # Create publicly-readable object to delete + obj = build_public(name: "ToDelete", value: 99) obj.save object_id = obj.id @@ -180,13 +167,10 @@ def test_subscribe_receives_delete_event # Delete the object obj.destroy - callback_called.wait(5) + callback_called.wait(EVENT_TIMEOUT) - if callback_called.set? - assert_equal object_id, deleted_object.id - else - skip "LiveQuery delete event not received (may be server configuration issue)" - end + assert callback_called.set?, "LiveQuery delete event not received within #{EVENT_TIMEOUT}s" + assert_equal object_id, deleted_object.id subscription.unsubscribe end @@ -207,26 +191,19 @@ def test_subscribe_with_query_filter wait_for_subscription(subscription) # Create object that doesn't match filter - obj1 = TestLiveQueryModel.new - obj1.name = "Low Value" - obj1.value = 10 + obj1 = build_public(name: "Low Value", value: 10) obj1.save # Create object that matches filter - obj2 = TestLiveQueryModel.new - obj2.name = "High Value" - obj2.value = 100 + obj2 = build_public(name: "High Value", value: 100) obj2.save - callback_called.wait(5) + callback_called.wait(EVENT_TIMEOUT) - if callback_called.set? 
- # Should only receive the high value object - assert_equal 1, received_objects.length - assert_equal "High Value", received_objects.first.name - else - skip "LiveQuery filtered create event not received" - end + assert callback_called.set?, "LiveQuery filtered create event not received within #{EVENT_TIMEOUT}s" + # Should only receive the high value object + assert_equal 1, received_objects.length + assert_equal "High Value", received_objects.first.name subscription.unsubscribe obj1.destroy @@ -247,9 +224,10 @@ def test_unsubscribe_stops_events subscription.unsubscribe assert subscription.unsubscribed? - # Create object after unsubscribe - obj = TestLiveQueryModel.new - obj.name = "After Unsubscribe" + # Create a publicly-readable object after unsubscribe. Public ACL means a + # leaked subscription WOULD receive it, so a passing assertion genuinely + # proves unsubscribe stopped delivery (rather than ACL filtering masking it). + obj = build_public(name: "After Unsubscribe") obj.save sleep 2 # Wait to ensure no callback is triggered @@ -264,28 +242,27 @@ def test_multiple_subscriptions sub1_received = [] sub2_received = [] + sub1_got = Concurrent::Event.new + sub2_got = Concurrent::Event.new sub1 = TestLiveQueryModel.subscribe(where: { status: "active" }) - sub1.on(:create) { |obj| sub1_received << obj } + sub1.on(:create) { |obj| sub1_received << obj; sub1_got.set } sub2 = TestLiveQueryModel.subscribe(where: { status: "inactive" }) - sub2.on(:create) { |obj| sub2_received << obj } + sub2.on(:create) { |obj| sub2_received << obj; sub2_got.set } wait_for_subscription(sub1) wait_for_subscription(sub2) - # Create objects with different statuses - active_obj = TestLiveQueryModel.new - active_obj.name = "Active" - active_obj.status = "active" + # Create publicly-readable objects with different statuses + active_obj = build_public(name: "Active", status: "active") active_obj.save - inactive_obj = TestLiveQueryModel.new - inactive_obj.name = "Inactive" - 
inactive_obj.status = "inactive" + inactive_obj = build_public(name: "Inactive", status: "inactive") inactive_obj.save - sleep 3 # Wait for events + sub1_got.wait(EVENT_TIMEOUT) + sub2_got.wait(EVENT_TIMEOUT) # Clean up sub1.unsubscribe @@ -293,10 +270,9 @@ def test_multiple_subscriptions active_obj.destroy inactive_obj.destroy - # Skip assertion if events weren't received - if sub1_received.empty? && sub2_received.empty? - skip "LiveQuery events not received for multiple subscriptions" - end + # Each subscription should receive only the object matching its filter. + assert_equal ["Active"], sub1_received.map(&:name), "sub1 (status=active) should receive only the active object" + assert_equal ["Inactive"], sub2_received.map(&:name), "sub2 (status=inactive) should receive only the inactive object" end def test_subscription_callback_chaining @@ -319,6 +295,26 @@ def test_subscription_callback_chaining private + # Build a TestLiveQueryModel instance that an anonymous (client-key only) + # LiveQuery subscription can actually read. The SDK's default ACL policy + # is `:owner_else_private`, which stamps an empty (master-key-only) ACL on + # owner-less objects; Parse Server's LiveQuery server then refuses to push + # events for such objects to a subscription that connected without a + # session token. A public-read ACL makes the delivery path deterministic, + # so these tests assert real event delivery instead of skipping. (The + # ACL-scoped delivery path — subscribe-as-user, private rows — is covered + # separately by client_livequery_integration_test.rb.) + def build_public(attrs = {}) + obj = TestLiveQueryModel.new(attrs) + obj.acl = Parse::ACL.everyone(true, true) + obj + end + + # Generous wait for an event to land. Delivery is normally sub-second once + # the subscription is confirmed; the headroom absorbs cold-Parse-Server-boot + # latency without reintroducing the old silent skips. 
+ EVENT_TIMEOUT = 10 + def skip_unless_livequery_available unless livequery_server_available? skip "LiveQuery server not available at #{LIVE_QUERY_URL}" diff --git a/test/lib/parse/login_error_taxonomy_test.rb b/test/lib/parse/login_error_taxonomy_test.rb new file mode 100644 index 0000000..631bc80 --- /dev/null +++ b/test/lib/parse/login_error_taxonomy_test.rb @@ -0,0 +1,226 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" + +# Tests for the typed login-failure taxonomy introduced in response.rb and +# user.rb. Specifically: +# - Parse::Response::ERROR_EMAIL_NOT_FOUND has the correct numeric value (205). +# - Parse::User.login! raises EmailNotVerifiedError for code-205 responses. +# - Parse::User.login! raises the generic AuthenticationError for other codes. +# - Parse::User.login! returns a user object on success (no raise). +# +# The mock is registered on Parse::User.stub(:client, ...) because +# Parse::User.client is memoised in @client — stubbing Parse::Client.client +# alone would race with the memoised value from a prior test run in the +# same process. +class LoginErrorTaxonomyTest < Minitest::Test + + # ========================================================================= + # Constant values + # ========================================================================= + + def test_error_email_not_found_constant_is_205 + # Parse Server throws code 205 (EMAIL_NOT_FOUND) when + # preventLoginWithUnverifiedEmail is enabled and the email is unverified. 
+ assert_equal 205, Parse::Response::ERROR_EMAIL_NOT_FOUND + end + + def test_error_email_not_verified_class_exists + assert defined?(Parse::Error::EmailNotVerifiedError), + "Parse::Error::EmailNotVerifiedError must be defined" + end + + def test_email_not_verified_error_is_parse_error_subclass + assert Parse::Error::EmailNotVerifiedError.ancestors.include?(Parse::Error), + "EmailNotVerifiedError must descend from Parse::Error" + end + + def test_email_not_verified_error_subclasses_authentication_error + # EmailNotVerifiedError MUST descend from AuthenticationError: before the + # typed error existed, a 205 login rejection raised a plain + # AuthenticationError, so existing `rescue AuthenticationError` handlers + # must keep catching it (subclassing preserves that contract; a sibling + # would be a silent breaking change). Callers that want the unverified case + # specifically just rescue the narrower subclass first. + assert Parse::Error::EmailNotVerifiedError.ancestors.include?(Parse::Error::AuthenticationError), + "EmailNotVerifiedError must inherit from AuthenticationError (back-compat)" + end + + def test_email_not_verified_caught_by_authentication_error_rescue + raised = + begin + raise Parse::Error::EmailNotVerifiedError, "unverified" + rescue Parse::Error::AuthenticationError => e + e + end + assert_kind_of Parse::Error::EmailNotVerifiedError, raised, + "a generic `rescue AuthenticationError` must still catch the unverified case" + end + + # ========================================================================= + # Parse::User.login! — typed error on code 205 + # ========================================================================= + + def test_login_bang_raises_email_not_verified_error_on_code_205 + err_body = { "code" => 205, "error" => "User email is not verified." 
} + err_response = Parse::Response.new(err_body) + err_response.http_status = 400 + + mock_client = Minitest::Mock.new + mock_client.expect(:login, err_response, ["alice", "correct"]) + + Parse::User.stub(:client, mock_client) do + assert_raises(Parse::Error::EmailNotVerifiedError) do + Parse::User.login!("alice", "correct") + end + end + + mock_client.verify + end + + def test_login_bang_error_message_includes_username_and_code_on_205 + err_body = { "code" => 205, "error" => "User email is not verified." } + err_response = Parse::Response.new(err_body) + err_response.http_status = 400 + + mock_client = Minitest::Mock.new + mock_client.expect(:login, err_response, ["bob", "pass"]) + + Parse::User.stub(:client, mock_client) do + error = assert_raises(Parse::Error::EmailNotVerifiedError) do + Parse::User.login!("bob", "pass") + end + assert_match(/bob/, error.message) + assert_match(/205/, error.message) + end + + mock_client.verify + end + + # ========================================================================= + # Parse::User.login! — generic AuthenticationError for other error codes + # ========================================================================= + + def test_login_bang_raises_authentication_error_on_code_101 + err_body = { "code" => 101, "error" => "Invalid username/password." } + err_response = Parse::Response.new(err_body) + err_response.http_status = 404 + + mock_client = Minitest::Mock.new + mock_client.expect(:login, err_response, ["alice", "wrong"]) + + Parse::User.stub(:client, mock_client) do + assert_raises(Parse::Error::AuthenticationError) do + Parse::User.login!("alice", "wrong") + end + end + + mock_client.verify + end + + def test_login_bang_raises_authentication_error_on_code_200 + err_body = { "code" => 200, "error" => "Username is required." 
} + err_response = Parse::Response.new(err_body) + err_response.http_status = 400 + + mock_client = Minitest::Mock.new + mock_client.expect(:login, err_response, ["", ""]) + + Parse::User.stub(:client, mock_client) do + assert_raises(Parse::Error::AuthenticationError) do + Parse::User.login!("", "") + end + end + + mock_client.verify + end + + def test_login_bang_raises_authentication_error_without_json_code + # When Parse Server returns an HTTP error with no JSON body / error code, + # the generic AuthenticationError must still be raised (not EmailNotVerifiedError). + err_response = Parse::Response.new({}) + err_response.http_status = 503 + # Simulate missing code / error (service-level failure) + def err_response.success?; false; end + def err_response.error?; true; end + + mock_client = Minitest::Mock.new + mock_client.expect(:login, err_response, ["carol", "pass"]) + + Parse::User.stub(:client, mock_client) do + assert_raises(Parse::Error::AuthenticationError) do + Parse::User.login!("carol", "pass") + end + end + + mock_client.verify + end + + # ========================================================================= + # Parse::User.login! 
— success path is unchanged + # ========================================================================= + + def test_login_bang_returns_user_on_success + ok_result = { "objectId" => "xyz789", "username" => "dave", "sessionToken" => "r:tok123" } + ok_response = Parse::Response.new(ok_result) + + mock_client = Minitest::Mock.new + mock_client.expect(:login, ok_response, ["dave", "correct"]) + + user = nil + Parse::User.stub(:client, mock_client) do + user = Parse::User.login!("dave", "correct") + end + + assert_instance_of Parse::User, user + assert_equal "xyz789", user.id + mock_client.verify + end + + def test_login_bang_does_not_raise_on_success + ok_result = { "objectId" => "abc001", "username" => "eve", "sessionToken" => "r:sessabc" } + ok_response = Parse::Response.new(ok_result) + + mock_client = Minitest::Mock.new + mock_client.expect(:login, ok_response, ["eve", "s3cret"]) + + Parse::User.stub(:client, mock_client) do + refute_raises(Parse::Error) do + Parse::User.login!("eve", "s3cret") + end + end + + mock_client.verify + end + + # ========================================================================= + # Backward-compat: a code-205 login rejection IS caught by an existing + # `rescue AuthenticationError` (it was a plain AuthenticationError before the + # typed subclass existed) — AND it is specifically an EmailNotVerifiedError + # for callers that rescue the narrower subclass first. + # ========================================================================= + + def test_email_not_verified_error_is_caught_by_authentication_error_rescue + err_body = { "code" => 205, "error" => "User email is not verified." 
} + err_response = Parse::Response.new(err_body) + err_response.http_status = 400 + + mock_client = Minitest::Mock.new + mock_client.expect(:login, err_response, ["frank", "pass"]) + + caught = nil + Parse::User.stub(:client, mock_client) do + begin + Parse::User.login!("frank", "pass") + rescue Parse::Error::AuthenticationError => e + caught = e + end + end + + refute_nil caught, "a generic `rescue AuthenticationError` must still catch the 205 case" + assert_kind_of Parse::Error::EmailNotVerifiedError, caught, + "the caught error is specifically the typed EmailNotVerifiedError" + mock_client.verify + end +end diff --git a/test/lib/parse/mfa_test.rb b/test/lib/parse/mfa_test.rb index 3be76fc..57ec320 100644 --- a/test/lib/parse/mfa_test.rb +++ b/test/lib/parse/mfa_test.rb @@ -2,6 +2,7 @@ # frozen_string_literal: true require_relative "../../test_helper" +require "cgi" # CGI.unescape in the provisioning-URI assertions; not guaranteed transitively class MFATest < Minitest::Test def setup @@ -122,7 +123,9 @@ def test_provisioning_uri assert_kind_of String, uri assert uri.start_with?("otpauth://totp/"), "Should be otpauth URI" assert uri.include?("secret="), "Should include secret" - assert uri.include?("test@example.com"), "Should include account name" + # The account label is URL-encoded in a valid otpauth URI ("@" -> "%40"), + # so decode before asserting the address is present. 
+ assert CGI.unescape(uri).include?("test@example.com"), "Should include account name" end def test_provisioning_uri_with_issuer diff --git a/test/lib/parse/mfa_totp_flow_integration_test.rb b/test/lib/parse/mfa_totp_flow_integration_test.rb new file mode 100644 index 0000000..ca1550b --- /dev/null +++ b/test/lib/parse/mfa_totp_flow_integration_test.rb @@ -0,0 +1,143 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper_integration" +require_relative "../../support/client_mode_helper" +require "securerandom" + +# End-to-end coverage for the TOTP MFA flow against a real MFA-enabled Parse +# Server. The test stack configures Parse Server's built-in TOTP adapter (see +# scripts/start-parse.sh) and the suite depends on the `rotp` gem to generate +# valid time-based codes, so this exercises the actual enroll / login / status / +# disable path rather than just the SDK boundary. +# +# #mfa_enabled? / #mfa_status read back authData.mfa. The SDK never retains the +# raw value (the server exposes the TOTP secret + recovery codes there), but it +# preserves a non-sensitive `{ "status" => "enabled" }` projection, so the +# status methods work after an ordinary fetch — asserted below alongside the +# secret-is-not-leaked guarantee. +class MfaTotpFlowIntegrationTest < Minitest::Test + include ParseStackIntegrationTest + include Parse::Test::ClientModeHelper + + def setup + skip "Docker integration tests require PARSE_TEST_USE_DOCKER=true" unless ENV["PARSE_TEST_USE_DOCKER"] == "true" + super + skip "rotp gem not available (add to the Gemfile :test group)" unless Parse::MFA.rotp_available? + require "rotp" + end + + # Enroll a freshly-seeded user in TOTP MFA. Returns the logged-in user, its + # password, the TOTP secret, and the recovery codes. 
+ def enroll_user(prefix) + user, password = seed_client_user(prefix) + logged = Parse::User.login(user.username, password) + secret = Parse::MFA.generate_secret + recovery = logged.setup_mfa!(secret: secret, token: ROTP::TOTP.new(secret).now) + [logged, password, secret, recovery] + end + + def logged_in?(result) + result.is_a?(Parse::User) && !result.session_token.to_s.empty? + end + + # Enrollment returns recovery codes, and a password-only login is afterwards + # rejected by the server as requiring an additional MFA factor. + def test_totp_enrollment_returns_recovery_and_enforces_mfa + user, password, _secret, recovery = enroll_user("mfa_enroll") + + refute_empty Array(recovery), "enrollment should return one-time recovery codes" + + err = assert_raises(Parse::Error) { Parse.client.login(user.username, password) } + assert_match(/additional authData|mfa/i, err.message, + "password-only login on an enrolled account must be rejected as MFA-required") + end + + # #mfa_enabled? / #mfa_status report enabled after an ordinary (untrusted) + # fetch, and the fetched record must NOT carry the raw TOTP secret or recovery + # codes — only the leak-safe status projection. + def test_mfa_status_readable_after_fetch_without_leaking_secret + user, _password, secret, _ = enroll_user("mfa_status") + + fetched = Parse::User.query(objectId: user.id).first + refute_nil fetched + + assert fetched.mfa_enabled?, "mfa_enabled? should be true after enrollment" + assert_equal :enabled, fetched.mfa_status + + blob = fetched.auth_data.to_s + refute_includes blob, secret, "the TOTP secret must never survive into a fetched user" + refute_match(/recovery/i, blob, "recovery codes must never survive into a fetched user") + end + + # Self-disable with a valid current code clears MFA (password-only login works + # again); a wrong code is rejected and leaves MFA enabled. 
+ def test_self_disable_with_valid_code_clears_mfa + user, password, secret, _ = enroll_user("mfa_selfdisable") + + user.fetch + user.disable_mfa!(current_token: ROTP::TOTP.new(secret).now) + + relog = Parse::User.login(user.username, password) + assert logged_in?(relog), "after self-disable, password-only login should succeed" + end + + def test_self_disable_with_wrong_code_is_rejected + user, password, _secret, _ = enroll_user("mfa_selfdisable_bad") + + user.fetch + assert_raises(Parse::MFA::VerificationError) do + user.disable_mfa!(current_token: "000000") + end + + # MFA must still be enforced — password-only login is still rejected. + assert_raises(Parse::Error) { Parse.client.login(user.username, password) } + end + + # A valid time-based code completes the second factor and yields a session. + # + # These login assertions run in CLIENT MODE (non-master). The master key + # bypasses MFA by design — Parse Server's MFA `validateLogin` short-circuits + # on `req.master` — so MFA enforcement is only meaningful for a normal client, + # which is how real applications authenticate users. + def test_login_with_valid_totp_succeeds + user, password, secret, _ = enroll_user("mfa_login") + + result = as_client do + Parse::User.login_with_mfa(user.username, password, ROTP::TOTP.new(secret).now) + end + + assert logged_in?(result), "a valid TOTP code should produce a logged-in session" + end + + # A wrong (or empty) code must NOT authenticate a non-master client. + def test_login_with_wrong_totp_does_not_authenticate + user, password, _secret, _ = enroll_user("mfa_wrong") + + as_client do + %w[000000 123456].each do |bad| + result = + begin + Parse::User.login_with_mfa(user.username, password, bad) + rescue Parse::Error + nil + end + refute logged_in?(result), "a wrong MFA code (#{bad}) must not authenticate" + end + end + end + + # The master-key disable path (authData.mfa = nil, no mfa_enabled? guard) + # clears MFA so a password-only login works again. 
+ def test_master_key_disable_restores_password_login + user, password, _secret, _ = enroll_user("mfa_disable") + admin, _admin_pw = seed_client_user("mfa_admin") + + # `allow_unverified: true` opts into caller-side authorization (this + # test vouches for `admin`); without it the method now fails closed. + user.disable_mfa_master_key!(authorized_by: admin, allow_unverified: true) + + relog = Parse::User.login(user.username, password) + assert logged_in?(relog), "after master-key disable, password-only login should succeed again" + end +end diff --git a/test/lib/parse/push_integration_test.rb b/test/lib/parse/push_integration_test.rb index 45e0cda..b8b66f4 100644 --- a/test/lib/parse/push_integration_test.rb +++ b/test/lib/parse/push_integration_test.rb @@ -1,23 +1,90 @@ # encoding: UTF-8 # frozen_string_literal: true -require_relative "../../test_helper" +require_relative "../../test_helper_integration" -# Integration tests for Parse::Push functionality -# These tests require Parse Server to be running +# Integration tests for Parse::Push functionality. +# +# These exercise the SDK-side push surface (payload construction, badge ops, +# localization, and the Push/PushStatus/Audience/Installation model APIs). +# They need a configured Parse client but do not push to a real device +# gateway, so a running Parse Server is sufficient. 
# # Run with: PARSE_TEST_USE_DOCKER=true ruby -Itest test/lib/parse/push_integration_test.rb class PushIntegrationTest < Minitest::Test def setup - skip "Integration tests require PARSE_TEST_USE_DOCKER=true" unless ENV["PARSE_TEST_USE_DOCKER"] - - # Ensure we have a valid connection - begin - response = Parse.client.request(:get, "health") - skip "Parse Server not responding" unless response - rescue StandardError => e - skip "Parse Server not available: #{e.message}" + skip "Integration tests require PARSE_TEST_USE_DOCKER=true" unless ENV["PARSE_TEST_USE_DOCKER"] == "true" + + # Configure the default Parse client (server URL, app id, keys). The + # previous health-check ran before any client was set up, so `Parse.client` + # raised and silently skipped every test in this file. + Parse::Test::ServerHelper.setup + @created = [] + end + + # The server-backed tests below create real _Installation / _PushStatus rows + # on the shared integration database. Destroy them so they don't skew counts + # or queries other files run (e.g. Installation.subscribers_count, + # PushStatus.recent). + def teardown + Array(@created).reverse_each { |obj| obj.destroy rescue nil } + end + + # Register a saved object for teardown cleanup. + def track(obj) + (@created ||= []) << obj + obj + end + + # Poll for the _PushStatus belonging to THIS test to reach a terminal + # state. Parse Server processes `POST /push` asynchronously, so the row's + # counts are not populated the instant the request returns. + # + # `channel:` scopes the lookup to the push under test (every test uses a + # unique channel). Without it, a concurrent or prior test's terminal row + # is the globally-newest one and can both false-fail (wrong counts) and + # false-pass (assertions check the wrong row). For non-channel targeting + # (e.g. `to_query`), pass `since:` to accept only rows newer than a + # snapshot taken before `.send`. 
+ def latest_push_status_time + Parse::PushStatus.query(order: :createdAt.desc).first&.created_at + end + + def wait_for_terminal_push_status(timeout: 10, channel: nil, since: nil) + matches = lambda do |status| + return false unless status + return false unless %w[succeeded failed].include?(status.status) + return false if since && !(status.created_at && status.created_at > since) + return false if channel && !push_status_targets_channel?(status, channel) + true + end + + deadline = Time.now + timeout + candidate = nil + loop do + candidate = recent_push_status(channel) + return candidate if matches.call(candidate) + break if Time.now > deadline + sleep 0.25 end + candidate + end + + # Newest _PushStatus, narrowed server-side to the channel when possible + # (the `query` column stores the target as a JSON string, so an exact + # server-side filter isn't reliable — fall back to a client-side scan of + # recent rows). + def recent_push_status(channel) + rows = Parse::PushStatus.query(order: :createdAt.desc, limit: 25).results + return rows.first if channel.nil? + rows.find { |s| push_status_targets_channel?(s, channel) } || rows.first + end + + # True if the _PushStatus row's stored target query references `channel`. + def push_status_targets_channel?(status, channel) + q = status.query + q = (JSON.parse(q) rescue nil) if q.is_a?(String) + q.to_s.include?(channel) end # ========================================================================== @@ -529,4 +596,259 @@ def test_full_push_with_new_features puts "Full push with all new features works correctly!" 
end + + # ========================================================================== + # Server-backed integration: real _Installation round-trips + # ========================================================================== + + # Test 24: Installation saves and is queryable by channel + def test_installation_save_and_query_by_channel + channel = "news_#{SecureRandom.hex(4)}" + token = SecureRandom.hex(32) + + installation = Parse::Installation.new( + device_type: "ios", + device_token: token, + installation_id: SecureRandom.uuid, + channels: [channel], + ) + assert installation.save, "installation should save to the server" + refute_nil installation.id + track(installation) + + # Round-trip: query the channel back from the server. + found = Parse::Installation.query(:channels.in => [channel]).first + refute_nil found, "installation should be findable by its channel" + assert_equal installation.id, found.id + assert_includes found.channels.to_a, channel + end + + # Test 25: subscribe / unsubscribe persist to the server + def test_installation_subscribe_unsubscribe_round_trip + installation = Parse::Installation.new( + device_type: "android", + device_token: SecureRandom.hex(32), + installation_id: SecureRandom.uuid, + channels: [], + ) + installation.save + track(installation) + + a = "alpha_#{SecureRandom.hex(3)}" + b = "beta_#{SecureRandom.hex(3)}" + + installation.subscribe(a, b) + reloaded = Parse::Installation.find(installation.id) + assert_includes reloaded.channels.to_a, a + assert_includes reloaded.channels.to_a, b + + installation.unsubscribe(a) + reloaded = Parse::Installation.find(installation.id) + refute_includes reloaded.channels.to_a, a + assert_includes reloaded.channels.to_a, b + end + + # ========================================================================== + # Server-backed integration: push send + _PushStatus lifecycle + # + # The test stack configures a no-op push adapter (test/cloud/ + # dummy-push-adapter.js, wired via 
PARSE_SERVER_PUSH). It reports a + # successful transmission without contacting any device gateway, so Parse + # Server creates and completes a real _PushStatus we can assert against. + # ========================================================================== + + # Test 26: sending to a channel creates a succeeded _PushStatus + def test_push_send_to_channel_creates_succeeded_status + channel = "push_#{SecureRandom.hex(4)}" + + # A subscriber must exist for the push to have a recipient. + installation = Parse::Installation.new( + device_type: "ios", + device_token: SecureRandom.hex(32), + installation_id: SecureRandom.uuid, + channels: [channel], + ) + installation.save + track(installation) + + response = Parse::Push.new + .to_channel(channel) + .with_alert("Integration ping") + .send + + assert response.success?, "POST /push should succeed with a push adapter configured" + assert_equal true, response.result["result"] + + status = wait_for_terminal_push_status(channel: channel) + refute_nil status, "a _PushStatus row should be created" + track(status) + assert_equal "succeeded", status.status + assert_operator status.num_sent.to_i, :>=, 1, "at least the one subscriber should be counted as sent" + assert_equal 1, status.sent_per_type["ios"], "the iOS subscriber should be tallied under sent_per_type" + end + + # Test 27: sending without a push adapter would fail closed — here we assert + # the SDK's master-key guard on the send path (no master key => raise, never + # an unauthenticated POST). 
+ def test_push_send_requires_master_key + no_master = Parse::Client.new( + server_url: ENV["PARSE_TEST_SERVER_URL"] || "http://localhost:29337/parse", + application_id: ENV["PARSE_TEST_APP_ID"] || "psnextItAppId", + api_key: ENV["PARSE_TEST_API_KEY"] || "psnext-it-rest-key", + ) + + assert_raises(Parse::Error::AuthenticationError) do + no_master.push(channels: ["anything"], data: { alert: "nope" }) + end + end + + # ========================================================================== + # Server-backed integration: real _Audience round-trip + # + # Parse Server's `_Audience.query` column is typed String (JSON), so a hash + # query must be persisted as a JSON string. This exercises that the SDK + # serializes/deserializes correctly and that the audience drives a real + # Installation count. + # ========================================================================== + + # Test 28: audience saves a hash query, is findable, and counts installations + def test_audience_save_find_and_installation_count + channel = "aud_#{SecureRandom.hex(4)}" + name = "Audience #{SecureRandom.hex(4)}" + + # An installation that matches the audience's query. 
+ installation = Parse::Installation.new( + device_type: "ios", + device_token: SecureRandom.hex(32), + installation_id: SecureRandom.uuid, + channels: [channel], + ) + installation.save + track(installation) + + audience = Parse::Audience.new(name: name, query: { "channels" => channel }) + assert audience.save, + "audience with a hash query should save (query persists as a JSON string)" + track(audience) + + found = Parse::Audience.find_by_name(name, cache: false) + refute_nil found, "audience should be findable by name" + assert_equal channel, found.query["channels"], "query should round-trip back to a hash" + + assert_equal 1, Parse::Audience.installation_count(name), + "the one matching installation should be counted" + end + + # ========================================================================== + # Server-backed integration: _PushStatus failure + multi-type lifecycle + # + # The no-op adapter simulates a failed transmission for any installation whose + # device token begins with "fail-", which exercises numFailed / failedPerType + # without a real device gateway. + # ========================================================================== + + # Helper: save an installation subscribed to a channel. 
+ def save_installation(device_type:, token:, channel:) + inst = Parse::Installation.new( + device_type: device_type, + device_token: token, + installation_id: SecureRandom.uuid, + channels: [channel], + ) + inst.save + track(inst) + end + + # Test 29: a failed transmission is tracked under num_failed / failed_per_type + def test_push_failed_delivery_is_tracked + channel = "fail_#{SecureRandom.hex(4)}" + save_installation(device_type: "ios", token: "fail-#{SecureRandom.hex(8)}", channel: channel) + + Parse::Push.new.to_channel(channel).with_alert("will fail").send + + status = wait_for_terminal_push_status(channel: channel) + refute_nil status + track(status) + assert_equal 0, status.num_sent.to_i + assert_operator status.num_failed.to_i, :>=, 1 + assert_equal 1, status.failed_per_type["ios"] + end + + # Test 30: a single push with one good and one failing recipient tallies both + def test_push_mixed_sent_and_failed + channel = "mix_#{SecureRandom.hex(4)}" + save_installation(device_type: "ios", token: SecureRandom.hex(16), channel: channel) + save_installation(device_type: "ios", token: "fail-#{SecureRandom.hex(8)}", channel: channel) + + Parse::Push.new.to_channel(channel).with_alert("mixed").send + + status = wait_for_terminal_push_status(channel: channel) + refute_nil status + track(status) + assert_operator status.num_sent.to_i, :>=, 1, "the good recipient should be counted as sent" + assert_operator status.num_failed.to_i, :>=, 1, "the fail- recipient should be counted as failed" + end + + # Test 31: sent_per_type tallies each device type + def test_push_sent_per_type_for_multiple_device_types + channel = "multi_#{SecureRandom.hex(4)}" + save_installation(device_type: "ios", token: SecureRandom.hex(16), channel: channel) + save_installation(device_type: "android", token: SecureRandom.hex(16), channel: channel) + + Parse::Push.new.to_channel(channel).with_alert("multi-type").send + + status = wait_for_terminal_push_status(channel: channel) + refute_nil status 
+ track(status) + assert_equal "succeeded", status.status + assert_equal 1, status.sent_per_type["ios"], "iOS recipient should be tallied" + assert_equal 1, status.sent_per_type["android"], "Android recipient should be tallied" + end + + # ========================================================================== + # Server-backed integration: alternate targeting paths + # ========================================================================== + + # Test 32: query-based targeting (to_query) sends to matching installations + def test_push_to_query_creates_succeeded_status + token = SecureRandom.hex(16) + save_installation(device_type: "ios", token: token, channel: "q_#{SecureRandom.hex(4)}") + + # Query-based targeting has no channel to scope on; snapshot the newest + # terminal row before sending and accept only a row created after it. + since = latest_push_status_time + + Parse::Push.new + .to_query { |q| q.where(device_token: token) } + .with_alert("query target") + .send + + status = wait_for_terminal_push_status(since: since) + refute_nil status + track(status) + assert_equal "succeeded", status.status + assert_operator status.num_sent.to_i, :>=, 1 + end + + # Test 33: audience-based targeting (to_audience) resolves the saved query + def test_push_to_audience_sends_to_matching_installations + channel = "ta_#{SecureRandom.hex(4)}" + name = "PushAudience #{SecureRandom.hex(4)}" + save_installation(device_type: "ios", token: SecureRandom.hex(16), channel: channel) + + audience = Parse::Audience.new(name: name, query: { "channels" => channel }) + audience.save + track(audience) + + Parse::Push.new + .to_audience(name) + .with_alert("audience target") + .send + + status = wait_for_terminal_push_status(channel: channel) + refute_nil status + track(status) + assert_equal "succeeded", status.status + assert_operator status.num_sent.to_i, :>=, 1 + end end diff --git a/test/lib/parse/query/contained_by_test.rb b/test/lib/parse/query/contained_by_test.rb new file mode 
100644 index 0000000..fcc3627 --- /dev/null +++ b/test/lib/parse/query/contained_by_test.rb @@ -0,0 +1,56 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../../test_helper" + +class TestContainedByConstraint < Minitest::Test + extend Minitest::Spec::DSL + include ConstraintTests + + def setup + @klass = Parse::Constraint::ContainedByConstraint + @key = :$containedBy + @operand = :contained_by + @keys = [:contained_by] + end + + def build(value) + { "field" => { @key.to_s => [Parse::Constraint.formatted_value(value)].flatten.compact } } + end + + def test_contained_by_operator_registered_on_symbol + assert_respond_to :tags, :contained_by + end + + def test_constraint_keyword_is_dollar_contained_by + assert_equal :$containedBy, Parse::Constraint::ContainedByConstraint.key + end + + def test_constraint_operand_is_contained_by + assert_equal :contained_by, Parse::Constraint::ContainedByConstraint.operand + end + + def test_compile_produces_correct_hash + q = Parse::Query.new("Post") + q.where :tags.contained_by => ["ruby", "rails", "parse"] + compiled = q.compile(encode: false) + where = compiled[:where] + assert where.key?("tags"), "compiled where should have 'tags' key" + # key is emitted as a symbol by constraint#build + assert_equal({ :"$containedBy" => ["ruby", "rails", "parse"] }, where["tags"]) + end + + def test_compile_wraps_scalar_in_array + q = Parse::Query.new("Post") + q.where :tags.contained_by => "ruby" + compiled = q.compile(encode: false) + assert_equal({ :"$containedBy" => ["ruby"] }, compiled[:where]["tags"]) + end + + def test_constraint_instance_has_correct_key + op = :tags.contained_by + assert_instance_of Parse::Operation, op + assert_kind_of Parse::Constraint::ContainedByConstraint, op.constraint + assert_equal :$containedBy, op.constraint.key + end +end diff --git a/test/lib/parse/query/exclude_keys_mongo_direct_redact_test.rb b/test/lib/parse/query/exclude_keys_mongo_direct_redact_test.rb new file mode 100644 index 
0000000..6c49a02 --- /dev/null +++ b/test/lib/parse/query/exclude_keys_mongo_direct_redact_test.rb @@ -0,0 +1,132 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../../test_helper" + +# Parse Server's REST `excludeKeys` has no mongo-direct equivalent (the direct +# pipeline can only project the `keys` allowlist), so the SDK honors the +# denylist on the mongo-direct path as a post-fetch sanitize: it recursively +# drops every key with a matching name from the returned Parse-format hashes +# without touching the MongoDB query. These tests drive the redaction helper +# directly over hashes, so they need no live MongoDB. +class TestExcludeKeysMongoDirectRedact < Minitest::Test + def redact(table, fields, results) + query = Parse::Query.new(table) + query.exclude_keys(*fields) + query.send(:redact_excluded_keys!, results) + end + + def test_drops_matching_top_level_key + rows = [{ "objectId" => "a1", "title" => "Hi", "secretToken" => "xyz" }] + redact("Post", [:secret_token], rows) + refute rows.first.key?("secretToken") + assert_equal "Hi", rows.first["title"] + end + + def test_noop_when_no_exclude_keys + rows = [{ "objectId" => "a1", "secretToken" => "xyz" }] + redact("Post", [], rows) + assert_equal "xyz", rows.first["secretToken"] + end + + def test_recurses_into_nested_included_objects + # exclude_keys(:name) must also strip a same-named field inside an + # included/nested object — the recursive-by-name contract. 
+ rows = [{ + "objectId" => "a1", + "name" => "outer", + "author" => { "objectId" => "u1", "name" => "inner", "email" => "x@y" }, + }] + redact("Post", [:name], rows) + refute rows.first.key?("name") + refute rows.first["author"].key?("name") + assert_equal "x@y", rows.first["author"]["email"] + end + + def test_recurses_through_arrays + rows = [{ + "objectId" => "a1", + "comments" => [ + { "body" => "one", "secretToken" => "s1" }, + { "body" => "two", "secretToken" => "s2" }, + ], + }] + redact("Post", [:secret_token], rows) + rows.first["comments"].each do |c| + refute c.key?("secretToken") + refute_nil c["body"] + end + end + + def test_field_name_camelized_like_rest_path + # exclude_keys runs field names through format_field (snake -> camel), + # matching the camelCase keys mongo/Parse produce. + rows = [{ "objectId" => "a1", "internalNotes" => "secret" }] + redact("Post", [:internal_notes], rows) + refute rows.first.key?("internalNotes") + end + + # --- Structural-key protection: dropping these would break decode or + # diverge from Parse Server's reserved envelope. 
--- + + def test_objectId_is_never_stripped + rows = [{ "objectId" => "a1", "title" => "Hi" }] + redact("Post", [:objectId], rows) + assert_equal "a1", rows.first["objectId"], + "objectId must survive exclude_keys so decode can reconstruct the object" + end + + def test_className_and_type_never_stripped + rows = [{ "objectId" => "a1", "className" => "Post", "__type" => "Object" }] + redact("Post", [:className, :__type], rows) + assert_equal "Post", rows.first["className"] + assert_equal "Object", rows.first["__type"] + end + + def test_reserved_timestamp_and_acl_fields_protected + rows = [{ + "objectId" => "a1", + "createdAt" => "2026-01-01T00:00:00.000Z", + "updatedAt" => "2026-01-02T00:00:00.000Z", + "ACL" => { "*" => { "read" => true } }, + }] + redact("Post", [:createdAt, :updatedAt, :ACL], rows) + assert rows.first.key?("createdAt") + assert rows.first.key?("updatedAt") + assert rows.first.key?("ACL") + end + + def test_mongo_storage_form_reserved_keys_protected + # Defensive: even on a raw Mongo-form document, the storage-form reserved + # keys survive so reconstruction can't be broken by excluding them. + rows = [{ + "_id" => "a1", + "_created_at" => "t1", + "_updated_at" => "t2", + "_acl" => { "*" => { "r" => true } }, + "secretToken" => "xyz", + }] + redact("Post", [:_id, :_created_at, :_updated_at, :_acl, :secret_token], rows) + assert rows.first.key?("_id") + assert rows.first.key?("_created_at") + assert rows.first.key?("_updated_at") + assert rows.first.key?("_acl") + refute rows.first.key?("secretToken") + end + + def test_mixed_reserved_and_user_field + # Excluding both a reserved and a user field keeps the reserved one, + # drops only the user field. 
+ rows = [{ "objectId" => "a1", "createdAt" => "t", "secretToken" => "xyz" }] + redact("Post", [:objectId, :createdAt, :secret_token], rows) + assert_equal "a1", rows.first["objectId"] + assert rows.first.key?("createdAt") + refute rows.first.key?("secretToken") + end + + def test_returns_the_same_array_reference + rows = [{ "objectId" => "a1", "secretToken" => "xyz" }] + result = redact("Post", [:secret_token], rows) + assert_same rows, result + end +end diff --git a/test/lib/parse/query/exclude_keys_test.rb b/test/lib/parse/query/exclude_keys_test.rb new file mode 100644 index 0000000..995a7d2 --- /dev/null +++ b/test/lib/parse/query/exclude_keys_test.rb @@ -0,0 +1,102 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../../test_helper" + +# Fields passed to exclude_keys go through Query.format_field, which converts +# snake_case to camelCase (e.g. :secret_token => "secretToken"). Tests use +# camelCase field names for clarity, but a snake_case round-trip is verified +# explicitly. 
+class TestQueryExcludeKeys < Minitest::Test + def setup + @query = Parse::Query.new("Post") + end + + def test_exclude_keys_method_exists + assert_respond_to @query, :exclude_keys + end + + def test_exclude_keys_returns_self_for_chaining + result = @query.exclude_keys(:token) + assert_same @query, result + end + + def test_exclude_keys_single_field + @query.exclude_keys(:token) + compiled = @query.compile + assert_equal "token", compiled[:excludeKeys] + end + + def test_exclude_keys_snake_case_converted_to_camel_case + @query.exclude_keys(:secret_token) + compiled = @query.compile + # format_field converts snake_case -> camelCase + assert_equal "secretToken", compiled[:excludeKeys] + end + + def test_exclude_keys_multiple_fields_variadic + @query.exclude_keys(:token, :notes) + compiled = @query.compile + fields = compiled[:excludeKeys].split(",") + assert_includes fields, "token" + assert_includes fields, "notes" + assert_equal 2, fields.size + end + + def test_exclude_keys_multiple_calls_accumulate + @query.exclude_keys(:token) + @query.exclude_keys(:notes) + compiled = @query.compile + fields = compiled[:excludeKeys].split(",") + assert_includes fields, "token" + assert_includes fields, "notes" + end + + def test_exclude_keys_deduplicates + @query.exclude_keys(:token, :token) + compiled = @query.compile + fields = compiled[:excludeKeys].split(",") + assert_equal 1, fields.count("token") + end + + def test_exclude_keys_omitted_when_unset + compiled = @query.compile + refute compiled.key?(:excludeKeys) + end + + def test_exclude_keys_omitted_under_encode_false + @query.exclude_keys(:token) + compiled = @query.compile(encode: false) + refute compiled.key?(:excludeKeys), + "excludeKeys must not appear in the structural (encode: false) form" + end + + def test_exclude_keys_present_under_encode_true_by_default + @query.exclude_keys(:token) + compiled = @query.compile(encode: true) + assert compiled.key?(:excludeKeys) + end + + def 
test_exclude_keys_chainable_with_other_methods + result = @query.exclude_keys(:token).limit(10).where(status: "active") + assert_same @query, result + compiled = @query.compile + assert_equal "token", compiled[:excludeKeys] + assert_equal 10, compiled[:limit] + end + + def test_exclude_keys_with_array_argument + @query.exclude_keys([:token, :notes]) + compiled = @query.compile + fields = compiled[:excludeKeys].split(",") + assert_includes fields, "token" + assert_includes fields, "notes" + end + + def test_exclude_keys_survives_clone + @query.exclude_keys(:token) + cloned = @query.clone + compiled = cloned.compile + assert_equal "token", compiled[:excludeKeys] + end +end diff --git a/test/lib/parse/query/explain_public_warn_test.rb b/test/lib/parse/query/explain_public_warn_test.rb new file mode 100644 index 0000000..91b120f --- /dev/null +++ b/test/lib/parse/query/explain_public_warn_test.rb @@ -0,0 +1,75 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../../test_helper" + +# Parse Server 9.0 defaults `allowPublicExplain` to false. Query#explain warns +# proactively (one-shot) when the call is clearly non-master AND the server +# version is known to restrict it — but NOT for master-default calls or +# unknown-version servers (avoid spurious noise on a flag /serverInfo can't +# surface). The warn decision is observable via the one-shot latch. +class TestExplainPublicWarn < Minitest::Test + FakeClient = Struct.new(:version, :supports) do + def server_version; version; end + def server_supports?(_feature); supports; end + end + + def setup + Parse::Query.instance_variable_set(:@public_explain_warned, false) + end + + def teardown + Parse::Query.instance_variable_set(:@public_explain_warned, false) + end + + def query_with(client:, use_master_key: nil, session_token: nil) + q = Parse::Query.new("Post") + q.use_master_key = use_master_key unless use_master_key.nil? 
+ q.session_token = session_token if session_token + q.define_singleton_method(:client) { client } + q + end + + def warned?(q) + q.send(:warn_if_public_explain_restricted!) + Parse::Query.public_explain_warned? + end + + def test_warns_for_explicit_non_master_on_restricted_server + q = query_with(client: FakeClient.new("9.9.0", false), use_master_key: false) + assert warned?(q), "should warn for non-master explain on PS 9.x" + end + + def test_warns_for_session_token_scope + q = query_with(client: FakeClient.new("9.0.0", false), session_token: "r:abc") + assert warned?(q) + end + + def test_no_warn_for_explicit_master + q = query_with(client: FakeClient.new("9.9.0", false), use_master_key: true) + refute warned?(q), "master explain should not warn" + end + + def test_no_warn_for_master_default_unspecified + # use_master_key nil (the common master-default case) → no spurious warn. + q = query_with(client: FakeClient.new("9.9.0", false)) + refute warned?(q) + end + + def test_no_warn_when_server_supports_public_explain + q = query_with(client: FakeClient.new("8.5.0", true), use_master_key: false) + refute warned?(q) + end + + def test_no_warn_on_unknown_version + q = query_with(client: FakeClient.new("", false), use_master_key: false) + refute warned?(q), "unknown version should not trigger the warning" + end + + def test_one_shot_latch + q = query_with(client: FakeClient.new("9.9.0", false), use_master_key: false) + assert warned?(q) + # Second call is a no-op; latch stays set and nothing raises. + assert warned?(q) + end +end diff --git a/test/lib/parse/query/hint_mongo_direct_test.rb b/test/lib/parse/query/hint_mongo_direct_test.rb new file mode 100644 index 0000000..480b6a3 --- /dev/null +++ b/test/lib/parse/query/hint_mongo_direct_test.rb @@ -0,0 +1,44 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../../test_helper" + +# Query#hint forwards to the mongo-direct path (Parse::MongoDB.aggregate `hint:`), +# not just the REST body. 
Stubs Parse::MongoDB so the test is deterministic and +# needs no live Mongo. +class TestHintMongoDirect < Minitest::Test + def capture_aggregate_opts + captured = nil + Parse::MongoDB.stub(:require_gem!, nil) do + Parse::MongoDB.stub(:available?, true) do + agg = ->(_table, _pipeline, **opts) { captured = opts; [] } + Parse::MongoDB.stub(:aggregate, agg) do + yield + end + end + end + captured + end + + def test_results_direct_forwards_hint + opts = capture_aggregate_opts do + Parse::Query.new("HintMongoDirectThing").hint("status_1_createdAt_-1").results_direct + end + refute_nil opts, "Parse::MongoDB.aggregate should have been called" + assert_equal "status_1_createdAt_-1", opts[:hint] + end + + def test_results_direct_without_hint_forwards_nil + opts = capture_aggregate_opts do + Parse::Query.new("HintMongoDirectThing").results_direct + end + assert_nil opts[:hint] + end + + def test_hint_accepts_key_pattern_hash + opts = capture_aggregate_opts do + Parse::Query.new("HintMongoDirectThing").hint({ "status" => 1 }).results_direct + end + assert_equal({ "status" => 1 }, opts[:hint]) + end +end diff --git a/test/lib/parse/query/query_hint_test.rb b/test/lib/parse/query/query_hint_test.rb new file mode 100644 index 0000000..4ef9183 --- /dev/null +++ b/test/lib/parse/query/query_hint_test.rb @@ -0,0 +1,72 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../../test_helper" + +class TestQueryHint < Minitest::Test + def setup + @query = Parse::Query.new("Post") + end + + def test_hint_method_exists + assert_respond_to @query, :hint + end + + def test_hint_returns_nil_when_unset + assert_nil @query.hint + end + + def test_hint_setter_returns_self_for_chaining + result = @query.hint("status_1_created_at_-1") + assert_same @query, result + end + + def test_hint_stored_as_string + @query.hint("status_1_created_at_-1") + assert_equal "status_1_created_at_-1", @query.hint + end + + def test_hint_reader_returns_current_value_after_set + 
@query.hint("my_index") + assert_equal "my_index", @query.hint + end + + def test_compile_includes_hint_when_set + @query.hint("status_1_created_at_-1") + compiled = @query.compile + assert_equal "status_1_created_at_-1", compiled[:hint] + end + + def test_compile_omits_hint_when_unset + compiled = @query.compile + refute compiled.key?(:hint) + end + + def test_hint_present_in_both_encoded_and_unencoded_compile + @query.hint("my_index") + assert_equal "my_index", @query.compile(encode: true)[:hint] + assert_equal "my_index", @query.compile(encode: false)[:hint] + end + + def test_hint_can_be_cleared_with_nil + @query.hint("my_index") + @query.hint(nil) + compiled = @query.compile + refute compiled.key?(:hint) + end + + def test_hint_chainable_with_limit + result = @query.hint("my_index").limit(5) + assert_same @query, result + assert_equal "my_index", @query.hint + assert_equal 5, @query.compile[:limit] + end + + def test_hint_survives_clone + @query.hint("my_index") + cloned = @query.clone + assert_equal "my_index", cloned.hint + compiled = cloned.compile + assert_equal "my_index", compiled[:hint] + end +end diff --git a/test/lib/parse/query/read_preference_test.rb b/test/lib/parse/query/read_preference_test.rb index 22a535c..b06add6 100644 --- a/test/lib/parse/query/read_preference_test.rb +++ b/test/lib/parse/query/read_preference_test.rb @@ -79,6 +79,50 @@ def test_headers_empty_when_no_read_preference assert_empty headers end + # --- REST BODY (the part Parse Server actually reads) ------------------- + # Parse Server's middleware maps no `X-Parse-Read-Preference` header into + # request options; `RestQuery` reads `readPreference` only from restOptions + # (the compiled query body). Asserting the header alone is the blind spot + # that let the no-op ship — these pin the body. 
+ + def test_compiled_body_includes_read_preference + query = Parse::Query.new("TestClass") + query.read_pref(:secondary) + compiled = query.compile + assert_equal "SECONDARY", compiled[:readPreference] + end + + def test_compiled_body_normalizes_secondary_preferred + query = Parse::Query.new("TestClass") + query.read_pref("secondary_preferred") + compiled = query.compile + assert_equal "SECONDARY_PREFERRED", compiled[:readPreference] + end + + def test_compiled_body_omits_read_preference_when_unset + query = Parse::Query.new("TestClass") + compiled = query.compile + refute compiled.key?(:readPreference) + end + + def test_compiled_body_omits_invalid_read_preference + query = Parse::Query.new("TestClass") + query.read_preference = :invalid_value + assert_output(nil, /Invalid read preference/) do + compiled = query.compile + refute compiled.key?(:readPreference) + end + end + + def test_unencoded_compile_omits_read_preference + # `readPreference` is a REST-wire concern; the structural (encode: false) + # form used by mongo-direct / snapshot tooling should not carry it. + query = Parse::Query.new("TestClass") + query.read_pref(:secondary) + compiled = query.compile(encode: false) + refute compiled.key?(:readPreference) + end + def test_invalid_read_preference_not_added_to_headers query = Parse::Query.new("TestClass") query.read_preference = :invalid_value diff --git a/test/lib/parse/retrieval_reranker_test.rb b/test/lib/parse/retrieval_reranker_test.rb new file mode 100644 index 0000000..a495f30 --- /dev/null +++ b/test/lib/parse/retrieval_reranker_test.rb @@ -0,0 +1,125 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" +require "parse/retrieval/reranker" + +# Unit tests for Parse::Retrieval::Reranker — the Base protocol +# (validation + normalization), the deterministic Fixture, and the +# Cohere adapter's response parsing (HTTP stubbed). 
+class RetrievalRerankerTest < Minitest::Test + R = Parse::Retrieval::Reranker + + # ----- Base protocol ----- + + def test_base_rerank_scores_is_abstract + assert_raises(NotImplementedError) { R::Base.new.rerank(query: "q", documents: %w[a]) } + end + + def test_base_validates_query + rr = Class.new(R::Base) { def rerank_scores(*) = [] }.new + assert_raises(ArgumentError) { rr.rerank(query: "", documents: %w[a]) } + assert_raises(ArgumentError) { rr.rerank(query: nil, documents: %w[a]) } + end + + def test_base_empty_documents_returns_empty + rr = Class.new(R::Base) { def rerank_scores(*) = raise("should not be called") }.new + assert_equal [], rr.rerank(query: "q", documents: []) + end + + def test_base_sorts_descending_and_bounds_top_n + rr = Class.new(R::Base) { def rerank_scores(_q, _d, _n) = [[0, 0.1], [1, 0.9], [2, 0.5]] }.new + out = rr.rerank(query: "q", documents: %w[a b c], top_n: 2) + assert_equal [1, 2], out.map(&:index) + assert_in_delta 0.9, out.first.relevance_score, 1e-9 + end + + def test_base_rejects_out_of_range_index + rr = Class.new(R::Base) { def rerank_scores(*) = [[7, 0.5]] }.new + assert_raises(R::InvalidResponseError) { rr.rerank(query: "q", documents: %w[a]) } + end + + def test_base_rejects_non_finite_score + rr = Class.new(R::Base) { def rerank_scores(*) = [[0, Float::INFINITY]] }.new + assert_raises(R::InvalidResponseError) { rr.rerank(query: "q", documents: %w[a]) } + end + + def test_base_dedupes_duplicate_indices + rr = Class.new(R::Base) { def rerank_scores(*) = [[0, 0.9], [0, 0.1]] }.new + out = rr.rerank(query: "q", documents: %w[a]) + assert_equal 1, out.length + assert_in_delta 0.9, out.first.relevance_score, 1e-9 + end + + def test_base_rejects_oversized_document_list + rr = Class.new(R::Base) { def rerank_scores(*) = [] }.new + big = Array.new(R::Base::MAX_DOCUMENTS + 1, "x") + assert_raises(ArgumentError) { rr.rerank(query: "q", documents: big) } + end + + # ----- Fixture reranker (deterministic) ----- + + def 
test_fixture_is_deterministic_and_overlap_ranked + fx = R::Fixture.new + docs = ["a song about rain and love", "unrelated cooking recipe", "love", ""] + a = fx.rerank(query: "rain love", documents: docs) + b = fx.rerank(query: "rain love", documents: docs) + assert_equal a.map(&:index), b.map(&:index), "Fixture must be deterministic" + assert_equal 0, a.first.index, "highest token overlap ranks first" + end + + # ----- Cohere adapter (HTTP stubbed) ----- + + # Minimal Faraday response/connection doubles. + FakeResp = Struct.new(:status, :body) do + def headers = {} + end + + class FakeConn + def initialize(resp) = (@resp = resp) + def post(_path) = @resp + end + + def build_cohere_with_response(status:, body:) + rr = R::Cohere.allocate + rr.instance_variable_set(:@api_key, "k") + rr.instance_variable_set(:@model, "rerank-v3.5") + rr.instance_variable_set(:@base_url, "https://api.cohere.com/v2") + rr.instance_variable_set(:@timeout, 30) + rr.instance_variable_set(:@open_timeout, 5) + rr.instance_variable_set(:@max_retries, 0) + rr.instance_variable_set(:@allow_faraday_proxy, false) + rr.instance_variable_set(:@connection, FakeConn.new(FakeResp.new(status, body))) + rr + end + + def test_cohere_parses_results + body = { "results" => [{ "index" => 2, "relevance_score" => 0.91 }, + { "index" => 0, "relevance_score" => 0.42 }] }.to_json + rr = build_cohere_with_response(status: 200, body: body) + out = rr.rerank(query: "q", documents: %w[a b c]) + assert_equal [2, 0], out.map(&:index) + assert_in_delta 0.91, out.first.relevance_score, 1e-9 + end + + def test_cohere_401_raises_auth_error + rr = build_cohere_with_response(status: 401, body: "{}") + assert_raises(R::Cohere::AuthenticationError) { rr.rerank(query: "q", documents: %w[a]) } + end + + def test_cohere_bad_json_raises_invalid_response + rr = build_cohere_with_response(status: 200, body: "not json") + assert_raises(R::InvalidResponseError) { rr.rerank(query: "q", documents: %w[a]) } + end + + def 
test_cohere_inspect_redacts_api_key + rr = build_cohere_with_response(status: 200, body: "{}") + refute_match(/\bk\b/, rr.inspect) + assert_match(/REDACTED/, rr.inspect) + end + + def test_cohere_constructor_validates + assert_raises(ArgumentError) { R::Cohere.new(api_key: "") } + assert_raises(ArgumentError) { R::Cohere.new(api_key: "k", base_url: "ftp://x") } + end +end diff --git a/test/lib/parse/retrieval_retrieve_test.rb b/test/lib/parse/retrieval_retrieve_test.rb index e8b36da..b7b85fe 100644 --- a/test/lib/parse/retrieval_retrieve_test.rb +++ b/test/lib/parse/retrieval_retrieve_test.rb @@ -27,7 +27,8 @@ def image? # fold and scope-kwarg pass-through. class FakeModel class << self - attr_accessor :last_find_similar_kwargs, :canned_hits + attr_accessor :last_find_similar_kwargs, :canned_hits, + :last_hybrid_kwargs, :canned_hybrid_hits def parse_class "FakeDoc" @@ -45,30 +46,96 @@ def find_similar(**kwargs) self.last_find_similar_kwargs = kwargs canned_hits || [] end + + def hybrid_search(**kwargs) + self.last_hybrid_kwargs = kwargs + canned_hybrid_hits || [] + end end end def setup FakeModel.last_find_similar_kwargs = nil FakeModel.canned_hits = nil + FakeModel.last_hybrid_kwargs = nil + FakeModel.canned_hybrid_hits = nil end def hit(id:, body:, score:, **extra) { "_id" => id, "body" => body, "_vscore" => score }.merge(extra) end - # ----- reserved kwargs ----- + # ----- hybrid + rerank wiring ----- - def test_hybrid_reserved - assert_raises(NotImplementedError) do - Parse::Retrieval.retrieve(query: "q", klass: FakeModel, hybrid: true) - end + def test_hybrid_routes_to_hybrid_search + FakeModel.canned_hybrid_hits = [ + { "_id" => "h1", "body" => "alpha beta", "_hybrid_score" => 0.5 }, + ] + chunks = Parse::Retrieval.retrieve( + query: "q", klass: FakeModel, hybrid: true, k: 7, + session_token: "tok", + ) + kw = FakeModel.last_hybrid_kwargs + refute_nil kw, "expected hybrid_search to be called" + assert_equal "q", kw[:text] + assert_equal "q", 
kw[:lexical][:query] + assert_equal 7, kw[:k] + assert_equal true, kw[:raw] + assert_equal "tok", kw[:session_token] + assert_equal 1, chunks.length + # chunk score derives from _hybrid_score when present. + assert_in_delta 0.5, chunks.first.score, 1e-9 + end + + def test_hybrid_config_hash_threads_lexical_vector_fusion + FakeModel.canned_hybrid_hits = [] + Parse::Retrieval.retrieve( + query: "q", klass: FakeModel, + hybrid: { lexical: { index: "lex_idx" }, vector: { num_candidates: 200 }, + fusion: { k_constant: 40 } }, + ) + kw = FakeModel.last_hybrid_kwargs + assert_equal "lex_idx", kw[:lexical][:index] + assert_equal 200, kw[:vector][:num_candidates] + assert_equal({ k_constant: 40 }, kw[:fusion]) end - def test_rerank_reserved - assert_raises(NotImplementedError) do + def test_rerank_invalid_object_raises_argument_error + err = assert_raises(ArgumentError) do Parse::Retrieval.retrieve(query: "q", klass: FakeModel, rerank: Object.new) end + assert_match(/must respond to #rerank/, err.message) + end + + def test_rerank_reorders_documents_and_overrides_score + # Two hits; the vector order puts "h_low" first, but the reranker + # (lexical-overlap Fixture) should surface "h_high" (matches query). + FakeModel.canned_hits = [ + hit(id: "h_low", body: "completely unrelated text", score: 0.99), + hit(id: "h_high", body: "rain and love song", score: 0.10), + ] + reranker = Parse::Retrieval::Reranker::Fixture.new + chunks = Parse::Retrieval.retrieve( + query: "rain love", klass: FakeModel, rerank: reranker, + ) + # First chunk should come from the reranked-best document. + assert_equal "h_high", chunks.first.metadata[:object_id] + # The chunk score is the rerank relevance score (non-nil, from Fixture). 
+ refute_nil chunks.first.score + end + + def test_rerank_top_n_limits_documents + FakeModel.canned_hits = [ + hit(id: "a", body: "rain love", score: 0.5), + hit(id: "b", body: "rain", score: 0.4), + hit(id: "c", body: "nothing", score: 0.3), + ] + chunks = Parse::Retrieval.retrieve( + query: "rain love", klass: FakeModel, + rerank: Parse::Retrieval::Reranker::Fixture.new, rerank_top_n: 1, + ) + ids = chunks.map { |c| c.metadata[:object_id] }.uniq + assert_equal 1, ids.length, "rerank_top_n: 1 should keep a single document" end # ----- input validation ----- diff --git a/test/lib/parse/semantic_search_tool_test.rb b/test/lib/parse/semantic_search_tool_test.rb index 29f993c..356157b 100644 --- a/test/lib/parse/semantic_search_tool_test.rb +++ b/test/lib/parse/semantic_search_tool_test.rb @@ -341,4 +341,48 @@ def test_class_alias_selects_class assert_equal :embedding, captured[:field] end end + + # --- spend cap → structured error mapping (§16.10) --- + + # A transient cap hit (the window will eventually admit the charge: + # `requested <= limit`) maps to RateLimitExceeded carrying the REAL + # backoff hint — never the window fallback — so the model waits and + # retries. + def test_spend_cap_transient_hit_maps_to_rate_limited + sc = Parse::Embeddings::SpendCap + sc.configure(limit_tokens: 100, window: 3600) + sc.charge!(tenant_id: nil, tokens: 90) # near the cap; window now has entries + # query of 80 ASCII chars → est 20 tokens: 90 + 20 > 100, but 20 <= 100, + # so the charge CAN fit once the window rolls off → non-nil retry_after. 
+ query = "a" * 80 + with_retrieve_returning([]) do + err = assert_raises(Parse::Agent::RateLimitExceeded) do + call(fake_agent, class_name: "SemanticSearchDoc", query: query) + end + refute_nil err.retry_after, "transient cap hit must carry a real backoff hint" + assert err.retry_after > 0 + assert_equal 100, err.limit + assert_equal 3600, err.window + end + end + + # A permanent cap hit (the request alone exceeds the cap: + # `requested > limit`, so SpendCap reports retry_after=nil) maps to + # ValidationError — retrying can never help, and RateLimitExceeded would + # both mislead the model and crash on `nil.round`. + def test_spend_cap_oversized_request_maps_to_validation_error + sc = Parse::Embeddings::SpendCap + sc.configure(limit_tokens: 10, window: 3600) + query = "a" * 80 # est 20 tokens > 10-token cap → can never fit + with_retrieve_returning([]) do + err = assert_raises(Parse::Agent::ValidationError) do + call(fake_agent, class_name: "SemanticSearchDoc", query: query) + end + assert_match(/too large/, err.message) + end + end + + def teardown + Parse::Embeddings::SpendCap.reset_all! if defined?(Parse::Embeddings::SpendCap) + end end diff --git a/test/lib/parse/server_capabilities_test.rb b/test/lib/parse/server_capabilities_test.rb new file mode 100644 index 0000000..5ebb67c --- /dev/null +++ b/test/lib/parse/server_capabilities_test.rb @@ -0,0 +1,72 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" + +# Unit coverage for the `/serverInfo`-backed capability layer +# (Parse::API::Server#server_supports? / #server_features). The capability +# table is version-inferred (Parse Server's `features` block is too coarse to +# carry these behavior flags) and fails OPEN to the modern server line. +class TestServerCapabilities < Minitest::Test + # Minimal host exposing the Server API mixin with a pre-seeded server_info, + # so `server_info` returns without a wire request. 
+ class FakeServerClient + include Parse::API::Server + def initialize(info) + @server_info = info + end + end + + def client_for(version: nil, features: {}) + info = { "features" => features } + info["parseServerVersion"] = version if version + FakeServerClient.new(info.with_indifferent_access) + end + + def test_server_features_returns_advertised_block + c = client_for(version: "9.9.0", features: { "hooks" => { "create" => true } }) + assert_equal({ "hooks" => { "create" => true } }, c.server_features) + end + + def test_server_features_empty_when_absent + c = FakeServerClient.new({ "parseServerVersion" => "9.9.0" }.with_indifferent_access) + assert_equal({}, c.server_features) + end + + def test_capabilities_on_current_server_9_9 + c = client_for(version: "9.9.0") + assert c.server_supports?(:livequery_keys_option), "keys option since 7.0" + assert c.server_supports?(:cloud_object_encoding), "object encoding since 8.0" + assert c.server_supports?(:aggregate_raw_values), "rawValues since 9.9" + refute c.server_supports?(:public_explain), "public explain restricted at 9.0" + end + + def test_capabilities_on_8_5 + c = client_for(version: "8.5.0") + assert c.server_supports?(:cloud_object_encoding) + assert c.server_supports?(:public_explain), "below 9.0 → public explain still allowed" + refute c.server_supports?(:aggregate_raw_values), "rawValues not until 9.9" + assert c.server_supports?(:livequery_keys_option) + end + + def test_capabilities_on_old_6_x + c = client_for(version: "6.0.0") + refute c.server_supports?(:cloud_object_encoding), "encoding not until 8.0" + refute c.server_supports?(:livequery_keys_option), "keys rename not until 7.0" + assert c.server_supports?(:public_explain), "old server allowed public explain" + end + + def test_fail_open_to_modern_on_unknown_version + c = client_for(version: nil) # features present, version absent + # `since:` capabilities assume the modern server line → true + assert c.server_supports?(:cloud_object_encoding) + 
assert c.server_supports?(:aggregate_raw_values) + # `until:` capabilities assume the modern (restricted) server → false + refute c.server_supports?(:public_explain) + end + + def test_unknown_capability_raises + c = client_for(version: "9.9.0") + assert_raises(ArgumentError) { c.server_supports?(:no_such_capability) } + end +end diff --git a/test/lib/parse/trigger_audit_test.rb b/test/lib/parse/trigger_audit_test.rb new file mode 100644 index 0000000..eaaa24a --- /dev/null +++ b/test/lib/parse/trigger_audit_test.rb @@ -0,0 +1,227 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" + +# Exercises Parse::Webhooks.trigger_audit / Parse::Webhooks::TriggerAudit: the +# operator audit that cross-references model ActiveModel callbacks, locally +# registered webhook blocks, and the triggers registered with Parse Server. +class TestTriggerAudit < Minitest::Test + # --- fixtures ------------------------------------------------------------- + + # Callbacks declared, but NO webhook block and (in the live case) no server + # trigger: the headline "inert" case. + class AuditPostFixture < Parse::Object + parse_class "AuditPostFixture" + property :title, :string + before_save :normalize + after_save :reindex + after_create :seed + before_update :touch # local-only (no server trigger can run it) + after_validation :stamp # local-only + def normalize; end + def reindex; end + def seed; end + def touch; end + def stamp; end + end + + # Callback AND a matching local webhook block — wired for non-Ruby clients + # once the trigger is also on the server. + class AuditReportFixture < Parse::Object + parse_class "AuditReportFixture" + property :name, :string + after_save :notify + webhook :after_save do + parse_object + end + def notify; end + end + + # No user callbacks at all. + class AuditPlainFixture < Parse::Object + parse_class "AuditPlainFixture" + property :value, :string + end + + # A fake Parse::Client stand-in for the network path. 
`triggers.results` + # returns the hashes the audit reads; `master_key` gates the guard. + FakeResponse = Struct.new(:results) + class FakeClient + attr_reader :master_key + def initialize(master_key:, triggers:) + @master_key = master_key + @triggers = triggers + end + + def triggers + FakeResponse.new(@triggers) + end + end + + def setup + @saved_routes = Parse::Webhooks.instance_variable_get(:@routes) + # Re-declaring the webhook block here (the class body already did, but a + # prior test may have reset @routes) keeps the AuditReportFixture route + # present regardless of suite ordering. + Parse::Webhooks.route(:after_save, "AuditReportFixture") { parse_object } + end + + def teardown + Parse::Webhooks.instance_variable_set(:@routes, @saved_routes) + end + + def row(report_or_audit, name) + classes = report_or_audit.is_a?(Parse::Webhooks::TriggerAudit) ? + report_or_audit.classes : nil + classes&.find { |c| c.parse_class == name } + end + + def kinds_for(audit, name) + r = row(audit, name) + r ? r.findings.map { |f| f[:kind] } : [] + end + + # --- local-only audit (no server, no master key needed) ------------------- + + def test_local_audit_flags_inert_callbacks + audit = Parse::Webhooks::TriggerAudit.new(network: false) + post = row(audit, "AuditPostFixture") + refute_nil post + + inert = post.findings.select { |f| f[:kind] == :callbacks_inert } + triggers = inert.map { |f| f[:trigger] } + # before_save (from before_save callback) and after_save (after_save + + # after_create) both lack a local webhook block. + assert_includes triggers, :before_save + assert_includes triggers, :after_save + + after_save_finding = inert.find { |f| f[:trigger] == :after_save } + assert_includes after_save_finding[:callbacks], :after_create + assert_includes after_save_finding[:callbacks], :after_save + # Local-only audit can't see the server, so it never claims :server missing. 
+ assert_equal [:route], after_save_finding[:missing] + end + + def test_local_audit_notes_local_only_callbacks + audit = Parse::Webhooks::TriggerAudit.new(network: false) + post = row(audit, "AuditPostFixture") + note = post.findings.find { |f| f[:kind] == :local_only_callbacks } + refute_nil note + assert_includes note[:callbacks], :before_update + assert_includes note[:callbacks], :after_validation + # save/create callbacks are NOT local-only. + refute_includes note[:callbacks], :before_save + end + + def test_wired_callback_with_block_has_no_inert_finding_locally + audit = Parse::Webhooks::TriggerAudit.new(network: false) + report = row(audit, "AuditReportFixture") + refute_nil report + assert_includes report.local_routes, :after_save + refute_includes kinds_for(audit, "AuditReportFixture"), :callbacks_inert + end + + def test_plain_class_has_no_findings + audit = Parse::Webhooks::TriggerAudit.new(network: false) + plain = row(audit, "AuditPlainFixture") + refute_nil plain + assert_empty plain.findings + end + + # The advisor's acceptance test: framework-internal callbacks must not leak + # into the per-class callback report. + def test_framework_callbacks_filtered_for_user + audit = Parse::Webhooks::TriggerAudit.new(network: false) + user = row(audit, "_User") + refute_nil user + all_names = user.callbacks.values.flatten.map { |c| c[:name] } + refute_includes all_names, "_resolve_default_acl", + "gem-internal callback leaked into the audit" + end + + def test_include_framework_surfaces_gem_callbacks + audit = Parse::Webhooks::TriggerAudit.new(network: false, include_framework: true) + user = row(audit, "_User") + all_names = user.callbacks.values.flatten.map { |c| c[:name] } + assert_includes all_names, "_resolve_default_acl" + end + + # --- network audit (stubbed client) --------------------------------------- + + def server_triggers_list + [ + # Orphan: registered on the server, no local block handles it. 
+ { "triggerName" => "beforeSave", "className" => "AuditPostFixture", + "url" => "https://hooks.example.com/before_save/AuditPostFixture" }, + # Entries with no url are cloud-code, not webhooks — must be ignored. + { "triggerName" => "afterSave", "className" => "AuditPlainFixture" }, + ] + end + + def networked_audit + client = FakeClient.new(master_key: "m", triggers: server_triggers_list) + Parse::Webhooks::TriggerAudit.new(network: true, client: client) + end + + def test_network_audit_flags_orphan_server_trigger + audit = networked_audit + post = row(audit, "AuditPostFixture") + orphan = post.findings.find { |f| f[:kind] == :orphan_server_trigger } + refute_nil orphan + assert_equal :before_save, orphan[:trigger] + assert_equal({ before_save: "https://hooks.example.com/before_save/AuditPostFixture" }, + post.server_triggers) + end + + def test_network_audit_flags_block_without_server_trigger + audit = networked_audit + kinds = kinds_for(audit, "AuditReportFixture") + # Local block exists, server trigger does not. + assert_includes kinds, :route_not_registered + # And the callback is inert with :server missing. + inert = row(audit, "AuditReportFixture").findings + .find { |f| f[:kind] == :callbacks_inert } + assert_equal [:server], inert[:missing] + end + + def test_network_audit_ignores_urlless_cloudcode_triggers + audit = networked_audit + plain = row(audit, "AuditPlainFixture") + # The afterSave entry has no url, so it must not register as a server trigger. 
+ assert_empty plain.server_triggers + refute_includes plain.findings.map { |f| f[:kind] }, :orphan_server_trigger + end + + def test_network_requires_master_key + client = FakeClient.new(master_key: "", triggers: []) + err = assert_raises(ArgumentError) do + Parse::Webhooks::TriggerAudit.new(network: true, client: client) + end + assert_match(/master-key/, err.message) + end + + # --- shape / convenience -------------------------------------------------- + + def test_trigger_audit_returns_hash_by_default + report = Parse::Webhooks.trigger_audit(network: false) + assert_kind_of Hash, report + assert report.key?(:classes) + assert report.key?(:summary) + assert_kind_of Integer, report[:summary][:classes_audited] + end + + def test_trigger_audit_pretty_returns_string + out = Parse::Webhooks.trigger_audit(network: false, pretty: true) + assert_kind_of String, out + assert_match(/Parse trigger audit/, out) + end + + def test_gaps_fold_class_name_into_entries + audit = networked_audit + gap = audit.gaps.find { |g| g[:parse_class] == "AuditPostFixture" && + g[:kind] == :callbacks_inert } + refute_nil gap + assert_equal "AuditPostFixture", gap[:parse_class] + end +end diff --git a/test/lib/parse/user_authdata_strip_test.rb b/test/lib/parse/user_authdata_strip_test.rb index d4e7ed0..95d3ece 100644 --- a/test/lib/parse/user_authdata_strip_test.rb +++ b/test/lib/parse/user_authdata_strip_test.rb @@ -48,6 +48,27 @@ def test_build_strips_symbol_keyed_authdata assert_nil user.auth_data, "symbol-keyed authData must also be stripped" end + # MFA is special: the untrusted strip keeps a non-sensitive + # `{ "mfa" => { "status" => "enabled" } }` projection so #mfa_enabled? / + # #mfa_status work after a fetch, but the raw TOTP secret and recovery codes + # must NOT survive. 
+ def test_build_preserves_safe_mfa_status_and_strips_secret + row = row_with_authdata.merge( + "authData" => { + "mfa" => { "secret" => "JBSWY3DPEHPK3PXP", "recovery" => %w[rec-abc rec-def] }, + }, + ) + user = Parse::User.build(row) + + assert_equal({ "mfa" => { "status" => "enabled" } }, user.auth_data, + "only the leak-safe MFA status should survive an untrusted hydration") + assert user.mfa_enabled?, "#mfa_enabled? should read the preserved status" + + blob = user.auth_data.to_s + refute_includes blob, "JBSWY3DPEHPK3PXP", "the TOTP secret must not survive the strip" + refute_match(/rec-abc|rec-def|recovery/i, blob, "recovery codes must not survive the strip") + end + # -------------------------------------------------------------------- # Trusted-self path: login!/session!/create/MFA wrap their build calls # in with_authdata_trust so authData survives. diff --git a/test/lib/parse/user_save_signup_integration_test.rb b/test/lib/parse/user_save_signup_integration_test.rb index fcb9630..dbd0c6c 100644 --- a/test/lib/parse/user_save_signup_integration_test.rb +++ b/test/lib/parse/user_save_signup_integration_test.rb @@ -11,6 +11,11 @@ class UserSaveSignupIntegrationTest < Minitest::Test include ParseStackIntegrationTest + # Parse Server 9.x does not issue a session token for a master-key signup (it + # treats master-key user creation as admin provisioning), so tests that need a + # live user session log in right after signup via the shared + # {ParseStackIntegrationTest#login_after_signup!} helper. 
+ def with_timeout(seconds, description) Timeout.timeout(seconds) { yield } rescue Timeout::Error @@ -39,9 +44,10 @@ def test_save_on_new_user_issues_real_session_token user = Parse::User.new(username: username, password: "p4ssw0rd!", email: "#{username}@test.com") assert user.save, "Parse::User.new(...).save should succeed against real server" @test_context.track(user) + login_after_signup!(user, "p4ssw0rd!") refute_nil user.id, "user must have a server-assigned objectId" - refute_nil user.session_token, "signup-on-save must populate session_token" + refute_nil user.session_token, "signup-on-save + login must populate session_token" assert user.logged_in?, "user should be logged_in? after signup-via-save" assert session_token_valid?(user.session_token), "session token from signup-via-save must be live on the server" @@ -62,6 +68,7 @@ def test_existing_user_save_does_not_invalidate_session user = Parse::User.new(username: username, password: "p4ssw0rd!", email: "#{username}@test.com") assert user.save, "initial signup" @test_context.track(user) + login_after_signup!(user, "p4ssw0rd!") original_token = user.session_token refute_nil original_token @@ -156,6 +163,7 @@ def test_save_after_signup_bang_does_not_invalidate_session user = Parse::User.new(username: username, password: "p4ssw0rd!", email: "#{username}@test.com") assert user.signup! @test_context.track(user) + login_after_signup!(user, "p4ssw0rd!") original_token = user.session_token refute_nil original_token @@ -214,6 +222,7 @@ def test_password_change_does_invalidate_session user = Parse::User.new(username: username, password: "old_p4ss!", email: "#{username}@test.com") assert user.save @test_context.track(user) + login_after_signup!(user, "old_p4ss!") original_token = user.session_token refute_nil original_token @@ -286,9 +295,10 @@ def test_signup_on_save_with_parse_reference_subclass_succeeds # update! sends only parseReference and the save succeeds. assert user.save!, "save! 
on parse_reference User subclass must succeed" @test_context.track(user) + login_after_signup!(user, "p4ssw0rd!") refute_nil user.id, "user must have an objectId after signup" - refute_nil user.session_token, "signup-on-save must populate session_token" + refute_nil user.session_token, "signup-on-save + login must populate session_token" assert session_token_valid?(user.session_token), "session token must remain live (no bcrypt rehash path triggered)" diff --git a/test/lib/parse/vector_search_hybrid_test.rb b/test/lib/parse/vector_search_hybrid_test.rb new file mode 100644 index 0000000..5274021 --- /dev/null +++ b/test/lib/parse/vector_search_hybrid_test.rb @@ -0,0 +1,186 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" +require "parse/vector_search/hybrid" +require "parse/atlas_search" + +# Unit tests for Parse::VectorSearch::Hybrid — the reciprocal-rank-fusion +# math, the $rankFusion probe-and-cache, the native pipeline SHAPE, and +# the client-side orchestration. No Atlas / Docker: the branch entry +# points and the Mongo probe are stubbed. +class VectorSearchHybridTest < Minitest::Test + H = Parse::VectorSearch::Hybrid + + def setup + H.clear_probe_cache + end + + def teardown + H.clear_probe_cache + end + + # ----- pure RRF fusion ----- + + def test_rrf_fuses_on_object_id_and_orders_by_score + lex = [{ "_id" => "a", "_score" => 9.0 }, { "_id" => "b", "_score" => 8.0 }, { "_id" => "c", "_score" => 7.0 }] + vec = [{ "_id" => "b", "_vscore" => 0.9 }, { "_id" => "d", "_vscore" => 0.8 }, { "_id" => "a", "_vscore" => 0.7 }] + fused = H.rrf({ lexical: lex, vector: vec }, k_constant: 60) + # b: lexical#2 + vector#1 (best combined), a: lexical#1 + vector#3. + assert_equal %w[b a d c], fused.map { |r| r["_id"] } + b = fused.first + assert_equal({ lexical: 2, vector: 1 }, b["_hybrid_ranks"]) + # merged row carries BOTH branch scores. 
+ assert b.key?("_score") + assert b.key?("_vscore") + assert_operator b["_hybrid_score"], :>, fused[1]["_hybrid_score"] + end + + def test_rrf_weights_shift_order + lex = [{ "_id" => "x", "_score" => 1.0 }] + vec = [{ "_id" => "y", "_vscore" => 1.0 }] + # Heavily weight vector -> y outranks x even though both are rank 1. + fused = H.rrf({ lexical: lex, vector: vec }, weights: { lexical: 0.1, vector: 0.9 }) + assert_equal "y", fused.first["_id"] + end + + def test_rrf_zero_weight_branch_excluded + lex = [{ "_id" => "x", "_score" => 1.0 }] + vec = [{ "_id" => "y", "_vscore" => 1.0 }] + fused = H.rrf({ lexical: lex, vector: vec }, weights: { lexical: 0, vector: 1 }) + assert_equal ["y"], fused.map { |r| r["_id"] } + end + + def test_rrf_deterministic_tie_break_by_object_id + # Both single-branch rank-1 -> equal scores -> ordered by id. + lex = [{ "_id" => "zzz", "_score" => 1.0 }] + vec = [{ "_id" => "aaa", "_vscore" => 1.0 }] + fused = H.rrf({ lexical: lex, vector: vec }, weights: { lexical: 1, vector: 1 }) + assert_equal %w[aaa zzz], fused.map { |r| r["_id"] } + end + + def test_rrf_rejects_bad_input + assert_raises(H::FusionError) { H.rrf({}, k_constant: 60) } + assert_raises(H::FusionError) { H.rrf({ a: [] }, k_constant: 0) } + assert_raises(H::FusionError) { H.rrf({ a: [] }, weights: { a: -1 }) } + assert_raises(H::FusionError) { H.rrf({ a: [] }, weights: "nope") } + end + + # ----- $rankFusion probe-and-cache ----- + + # Fake Mongo collection whose #aggregate runs a supplied proc. + class FakeColl + def initialize(behavior) = (@behavior = behavior) + def aggregate(_pipeline) = self + def to_a = @behavior.call + end + + # Run `blk` with Parse::MongoDB.collection stubbed to a FakeColl whose + # aggregate runs `behavior`. 
+ def with_probe_collection(behavior) + Parse::MongoDB.stub(:collection, ->(_name) { FakeColl.new(behavior) }) { yield } + end + + def test_probe_returns_true_when_stage_recognized + with_probe_collection(-> { [] }) do + assert_equal true, H.rank_fusion_supported?("Song") + end + end + + def test_probe_returns_false_on_unknown_stage_error + with_probe_collection(-> { raise StandardError, "Unknown aggregation stage $rankFusion" }) do + assert_equal false, H.rank_fusion_supported?("Song") + end + end + + def test_probe_treats_other_errors_as_supported + # A recognized-but-misused stage (or auth error) is NOT "unsupported". + with_probe_collection(-> { raise StandardError, "BSONObj exceeded maximum nested depth" }) do + assert_equal true, H.rank_fusion_supported?("Song") + end + end + + def test_probe_result_is_cached_per_collection + calls = 0 + with_probe_collection(-> { calls += 1; [] }) do + H.rank_fusion_supported?("Song") + H.rank_fusion_supported?("Song") + end + assert_equal 1, calls, "second probe should hit the cache" + end + + # ----- native pipeline shape (security-relevant) ----- + + def test_native_pipeline_is_stage0_rankfusion_with_subpipelines + pipe = H.send(:native_pipeline, "Song", + lexical: { query: "rain", index: "song_search" }, + vector: { query_vector: [0.1, 0.2], field: "embedding", index: "song_idx", num_candidates: 40 }, + k: 5, fusion: { weights: { lexical: 0.4, vector: 0.6 } }, master: true) + assert_equal "$rankFusion", pipe.first.keys.first + inputs = pipe.first["$rankFusion"]["input"]["pipelines"] + assert_equal "$vectorSearch", inputs["vector"].first.keys.first + assert_equal "$search", inputs["lexical"].first.keys.first + assert_equal({ "vector" => 0.6, "lexical" => 0.4 }, pipe.first["$rankFusion"]["combination"]["weights"]) + # Fused score projected as a NUMBER via $meta:score, then sorted. + assert_equal({ "$meta" => "score" }, pipe[1]["$addFields"]["_hybrid_score"]) + assert(pipe.any? 
{ |s| s["$sort"] == { "_hybrid_score" => -1 } }) + end + + def test_native_pipeline_injects_acl_match_for_scoped_caller + # A session-token scope (non-master) MUST get an ACL $match stage so + # the fused candidate set is narrowed to _rperm-readable rows. + fake_resolution = Object.new + def fake_resolution.master? = false + Parse::ACLScope.stub(:resolve!, ->(*) { fake_resolution }) do + Parse::ACLScope.stub(:match_stage_for, ->(_r) { { "$match" => { "_rperm" => { "$in" => %w[u1] } } } }) do + pipe = H.send(:native_pipeline, "Song", + lexical: { query: "x", index: "i" }, + vector: { query_vector: [0.1], field: "e", index: "vi" }, + k: 3, session_token: "tok") + assert(pipe.any? { |s| s["$match"] && s["$match"].key?("_rperm") }, + "scoped native pipeline must contain an ACL _rperm $match") + end + end + end + + # ----- client-side orchestration (the default path) ----- + + def test_search_default_fuses_client_side_without_probing + lexical_rows = [{ "_id" => "a", "_score" => 5.0 }, { "_id" => "b", "_score" => 4.0 }] + vector_rows = [{ "_id" => "b", "_vscore" => 0.9 }, { "_id" => "c", "_vscore" => 0.8 }] + probed = false + Parse::MongoDB.stub(:require_gem!, nil) do + Parse::MongoDB.stub(:available?, true) do + H.stub(:rank_fusion_supported?, ->(_c) { probed = true; true }) do + Parse::AtlasSearch.stub(:search, ->(*_a, **_k) { lexical_rows }) do + Parse::VectorSearch.stub(:search, ->(*_a, **_k) { vector_rows }) do + out = H.search("Song", + lexical: { query: "rain" }, + vector: { query_vector: [0.1, 0.2], field: "embedding", index: "idx" }, + k: 10) + assert_equal %w[b a c], out.map { |r| r["_id"] } + end + end + end + end + end + refute probed, "default :rrf method must NOT probe for native $rankFusion" + end + + def test_search_validates_inputs + Parse::MongoDB.stub(:require_gem!, nil) do + Parse::MongoDB.stub(:available?, true) do + assert_raises(ArgumentError) do + H.search("Song", lexical: { query: "" }, vector: { query_vector: [0.1], field: "e" }, k: 5) + end + 
assert_raises(ArgumentError) do + H.search("Song", lexical: { query: "x" }, vector: { field: "e" }, k: 5) + end + assert_raises(ArgumentError) do + H.search("Song", lexical: { query: "x" }, vector: { query_vector: [0.1], field: "e" }, + k: 5, fusion: { method: :bogus }) + end + end + end + end +end diff --git a/test/lib/parse/vector_searchable_hybrid_test.rb b/test/lib/parse/vector_searchable_hybrid_test.rb new file mode 100644 index 0000000..14f9a59 --- /dev/null +++ b/test/lib/parse/vector_searchable_hybrid_test.rb @@ -0,0 +1,87 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" +require "parse/vector_search/hybrid" + +# Unit tests for Parse::Core::VectorSearchable#hybrid_search — the +# class-level wrapper — and #build_hybrid_hits. Parse::VectorSearch::Hybrid.search +# is stubbed so these run without Atlas; they pin (a) the kwargs the +# wrapper threads into Hybrid.search, and (b) that fused raw rows become +# Parse::Object instances carrying #hybrid_score / #hybrid_ranks / +# #vector_score / #search_score. 
+class VectorSearchableHybridTest < Minitest::Test + def self.register_fixture + Parse::Embeddings.register(:fixture_hyb, Parse::Embeddings::Fixture.new(dimensions: 4)) + end + register_fixture + + class HybDoc < Parse::Object + parse_class "HybDoc" + property :title, :string + property :embedding, :vector, dimensions: 4, provider: :fixture_hyb + embed :title, into: :embedding + end + + def test_hybrid_search_threads_kwargs_into_hybrid_module + captured = nil + fake = lambda do |collection, **kw| + captured = { collection: collection }.merge(kw) + [] + end + Parse::VectorSearch::Hybrid.stub(:search, fake) do + HybDoc.hybrid_search( + text: "love and rain", + lexical: { index: "hyb_lex" }, + vector: { index: "hyb_vec", num_candidates: 150 }, + k: 12, + fusion: { k_constant: 40, weights: { lexical: 0.3, vector: 0.7 } }, + session_token: "tok", + ) + end + refute_nil captured + assert_equal "HybDoc", captured[:collection] + assert_equal "love and rain", captured[:lexical][:query] # defaults to text + assert_equal "hyb_lex", captured[:lexical][:index] + assert_equal :embedding, captured[:vector][:field] # sole vector field + assert_equal 150, captured[:vector][:num_candidates] + assert_kind_of Array, captured[:vector][:query_vector] # text embedded + assert_equal 12, captured[:k] + assert_equal 40, captured[:fusion][:k_constant] + assert_equal "tok", captured[:session_token] + end + + def test_hybrid_search_requires_a_query + assert_raises(ArgumentError) do + HybDoc.hybrid_search(lexical: {}, vector: {}) + end + end + + def test_build_hybrid_hits_attaches_scores_and_ranks + fused_rows = [ + { + "_id" => "abc123", "title" => "rain song", + "_hybrid_score" => 0.0321, "_hybrid_ranks" => { lexical: 2, vector: 1 }, + "_vscore" => 0.9, "_score" => 7.5, + }, + ] + Parse::VectorSearch::Hybrid.stub(:search, ->(*_a, **_k) { fused_rows }) do + hits = HybDoc.hybrid_search(text: "rain", vector: { index: "hyb_vec" }, raw: false) + assert_equal 1, hits.length + obj = hits.first + 
assert_kind_of Parse::Object, obj + assert_in_delta 0.0321, obj.hybrid_score, 1e-9 + assert_equal({ lexical: 2, vector: 1 }, obj.hybrid_ranks) + assert_in_delta 0.9, obj.vector_score, 1e-9 + assert_in_delta 7.5, obj.search_score, 1e-9 + end + end + + def test_hybrid_search_raw_returns_rows + fused_rows = [{ "_id" => "x", "_hybrid_score" => 0.1 }] + Parse::VectorSearch::Hybrid.stub(:search, ->(*_a, **_k) { fused_rows }) do + out = HybDoc.hybrid_search(text: "rain", vector: { index: "hyb_vec" }, raw: true) + assert_equal fused_rows, out + end + end +end diff --git a/test/lib/parse/vector_visibility_test.rb b/test/lib/parse/vector_visibility_test.rb new file mode 100644 index 0000000..ca371ed --- /dev/null +++ b/test/lib/parse/vector_visibility_test.rb @@ -0,0 +1,145 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" + +# Unit tests for the v5.0 vector_visibility DSL and the webhook +# :vector-column redaction. vector_visibility governs whether a class's +# :vector properties are included in as_json by default; the webhook +# payload strips vector columns unless the class is :public. +class VectorVisibilityTest < Minitest::Test + class VisDefault < Parse::Object + parse_class "VisDefault" + property :title, :string + property :embedding, :vector, dimensions: 3 + end + + class VisPublic < Parse::Object + parse_class "VisPublic" + vector_visibility :public + property :title, :string + property :embedding, :vector, dimensions: 3 + end + + def vec3 = Parse::Vector.new([1.0, 2.0, 3.0]) + + # ----- DSL ----- + + def test_default_is_owner_only + assert_equal :owner_only, VisDefault.vector_visibility + refute VisDefault.vectors_public_by_default? + end + + def test_public_mode + assert_equal :public, VisPublic.vector_visibility + assert VisPublic.vectors_public_by_default? 
+ end + + def test_invalid_mode_raises + assert_raises(ArgumentError) { VisDefault.vector_visibility(:bogus) } + end + + # ----- as_json default ----- + + def test_owner_only_omits_vector_from_as_json + obj = VisDefault.new(title: "t") + obj.embedding = vec3 + refute obj.as_json.key?("embedding") + end + + def test_public_includes_vector_in_as_json + obj = VisPublic.new(title: "t") + obj.embedding = vec3 + assert obj.as_json.key?("embedding") + end + + def test_explicit_include_vectors_overrides_class_default_both_ways + d = VisDefault.new(title: "t"); d.embedding = vec3 + p = VisPublic.new(title: "t"); p.embedding = vec3 + assert d.as_json(include_vectors: true).key?("embedding") + refute p.as_json(include_vectors: false).key?("embedding") + end + + # ----- webhook redaction ----- + + P = Parse::Webhooks::Payload + + def test_webhook_strips_vector_for_owner_only_class + out = P.scrub_vector_columns({ "className" => "VisDefault", "title" => "x", "embedding" => [1.0, 2.0, 3.0] }) + refute out.key?("embedding") + assert_equal "x", out["title"] + end + + def test_webhook_keeps_vector_for_public_class + out = P.scrub_vector_columns({ "className" => "VisPublic", "title" => "x", "embedding" => [1.0, 2.0, 3.0] }) + assert out.key?("embedding") + end + + def test_webhook_unknown_class_passes_through + out = P.scrub_vector_columns({ "className" => "TotallyUnregistered", "embedding" => [1.0] }) + assert out.key?("embedding") + end + + def test_webhook_non_hash_passes_through + assert_nil P.scrub_vector_columns(nil) + assert_equal "x", P.scrub_vector_columns("x") + end + + def test_webhook_strips_update_payload_via_explicit_klass + # An update/changes payload carries no className; the resolved class is + # passed explicitly so its vector columns are still stripped. 
+ out = P.scrub_vector_columns({ "embedding" => [1.0, 2.0, 3.0], "title" => "x" }, VisDefault) + refute out.key?("embedding") + end + + # ----- afterFind objects redaction (Payload constructor path) ----- + # + # Parse Server's afterFind payload carries NO className anywhere — the matched + # objects omit it and there is no top-level className (verified against Parse + # Server 9.9.0). The class is known only from the webhook URL path, threaded + # in as `webhook_class:`. These fixtures therefore use objects WITHOUT + # className and supply the class via webhook_class (the real shape), NOT via + # per-element className. + + def test_webhook_strips_vectors_from_afterfind_objects_via_route_class + payload = P.new( + { trigger_name: "afterFind", + objects: [ + { "title" => "a", "embedding" => [1.0, 2.0, 3.0] }, + { "title" => "b", "embedding" => [4.0, 5.0, 6.0] }, + ] }, + "VisDefault", + ) + payload.objects.each do |o| + refute o.key?("embedding"), "afterFind object must have its :vector stripped via the route class" + assert o.key?("title") + end + end + + def test_webhook_afterfind_keeps_vectors_for_public_route_class + payload = P.new( + { trigger_name: "afterFind", + objects: [{ "title" => "a", "embedding" => [1.0, 2.0, 3.0] }] }, + "VisPublic", + ) + assert payload.objects[0].key?("embedding") + end + + def test_webhook_afterfind_without_route_class_cannot_scrub + # Honest negative: with no webhook_class and no per-element className there + # is no way to resolve the class, so vectors are NOT stripped (fail-open). + # This pins WHY threading the route class is required. 
+ payload = P.new( + trigger_name: "afterFind", + objects: [{ "title" => "a", "embedding" => [1.0, 2.0, 3.0] }], + ) + assert payload.objects[0].key?("embedding"), + "without a resolvable class the vector cannot be stripped (documents the gap the route class closes)" + end + + def test_webhook_class_sets_parse_class_for_find_trigger + payload = P.new({ trigger_name: "afterFind", objects: [] }, "VisDefault") + assert_equal "VisDefault", payload.parse_class, + "find triggers resolve parse_class from the route-derived webhook_class" + end +end diff --git a/test/lib/parse/verify_password_test.rb b/test/lib/parse/verify_password_test.rb new file mode 100644 index 0000000..99954df --- /dev/null +++ b/test/lib/parse/verify_password_test.rb @@ -0,0 +1,216 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" + +# Unit tests for the verifyPassword API method (Parse::API::Users#verify_password) +# and the Parse::User#verify_password instance method. +# +# API-layer tests include the module directly and stub +request+ — the same +# pattern used in test/lib/parse/api/cloud_functions_test.rb. User-model tests +# use Minitest::Mock to mock the client at the instance level. +class VerifyPasswordAPITest < Minitest::Test + include Parse::API::Users + + # Captures the most recent call to #request so tests can assert on method, + # path, and keyword arguments. + def setup + @last_request = nil + @stub_response = nil + end + + # Intercept the internal #request call that all API methods delegate to. + def request(method, path, **kwargs) + @last_request = { method: method, path: path, kwargs: kwargs } + @stub_response || Parse::Response.new({}) + end + + # No-op stand-ins for check_login_rate_limit! and track_login_attempt; no further stubbing is needed since #request is already overridden.
+ def check_login_rate_limit!(*); end + def track_login_attempt(*); end + + # ========================================================================= + # VERIFY_PASSWORD_PATH constant + # ========================================================================= + + def test_verify_password_path_constant_value + assert_equal "verifyPassword", Parse::API::Users::VERIFY_PASSWORD_PATH + end + + # ========================================================================= + # #verify_password issues a POST to the correct path with credentials in the + # BODY (not the URL) so the plaintext password never reaches access logs, + # proxy logs, the Referer header, or the URL-keyed response cache. + # ========================================================================= + + def test_verify_password_uses_post_method + verify_password("alice", "s3cret") + assert_equal :post, @last_request[:method] + end + + def test_verify_password_uses_correct_path + verify_password("alice", "s3cret") + assert_equal "verifyPassword", @last_request[:path] + end + + def test_verify_password_passes_credentials_in_body_not_query + verify_password("alice", "s3cret") + body = @last_request.dig(:kwargs, :body) + assert_equal "alice", body[:username] + assert_equal "s3cret", body[:password] + assert_nil @last_request.dig(:kwargs, :query), + "credentials must not ride the URL query string" + end + + def test_verify_password_passes_headers + verify_password("alice", "s3cret", headers: { "X-Custom" => "val" }) + assert_equal({ "X-Custom" => "val" }, @last_request.dig(:kwargs, :headers)) + end + + def test_verify_password_sets_parse_class_on_response + ok_result = { "objectId" => "abc123", "username" => "alice" } + @stub_response = Parse::Response.new(ok_result) + response = verify_password("alice", "s3cret") + assert_equal Parse::Model::CLASS_USER, response.parse_class + end + + def test_verify_password_returns_response_object + ok_result = { "objectId" => "abc123", "username" => "alice" } + 
@stub_response = Parse::Response.new(ok_result) + response = verify_password("alice", "s3cret") + assert_instance_of Parse::Response, response + assert response.success? + end + + def test_verify_password_returns_error_response_on_failure + err_body = { "code" => 101, "error" => "Invalid username/password." } + @stub_response = Parse::Response.new(err_body) + response = verify_password("alice", "wrong") + assert response.error? + assert_equal 101, response.code + end +end + +# Model-layer tests: Parse::User#verify_password instance method +class VerifyPasswordUserTest < Minitest::Test + + # ========================================================================= + # Happy path + # ========================================================================= + + def test_verify_password_returns_true_on_success + user = Parse::User.new + user.username = "alice" + + ok_result = { "objectId" => "abc123", "username" => "alice" } + ok_response = Parse::Response.new(ok_result) + + mock_client = Minitest::Mock.new + mock_client.expect(:verify_password, ok_response, ["alice", "correct"]) + + user.stub(:client, mock_client) do + assert_equal true, user.verify_password("correct") + end + + mock_client.verify + end + + # ========================================================================= + # Wrong password / unknown user (code 101) + # ========================================================================= + + def test_verify_password_raises_authentication_error_on_wrong_password + user = Parse::User.new + user.username = "alice" + + err_body = { "code" => 101, "error" => "Invalid username/password." 
} + err_response = Parse::Response.new(err_body) + err_response.http_status = 404 + + mock_client = Minitest::Mock.new + mock_client.expect(:verify_password, err_response, ["alice", "wrong"]) + + user.stub(:client, mock_client) do + assert_raises(Parse::Error::AuthenticationError) do + user.verify_password("wrong") + end + end + + mock_client.verify + end + + def test_verify_password_auth_error_message_contains_code + user = Parse::User.new + user.username = "carol" + + err_body = { "code" => 101, "error" => "Invalid username/password." } + err_response = Parse::Response.new(err_body) + err_response.http_status = 404 + + mock_client = Minitest::Mock.new + mock_client.expect(:verify_password, err_response, ["carol", "badpass"]) + + user.stub(:client, mock_client) do + error = assert_raises(Parse::Error::AuthenticationError) do + user.verify_password("badpass") + end + assert_match(/101/, error.message) + end + + mock_client.verify + end + + # ========================================================================= + # Unverified email (code 205) + # ========================================================================= + + def test_verify_password_raises_email_not_verified_error_on_code_205 + user = Parse::User.new + user.username = "bob" + + err_body = { "code" => 205, "error" => "User email is not verified." } + err_response = Parse::Response.new(err_body) + err_response.http_status = 400 + + mock_client = Minitest::Mock.new + mock_client.expect(:verify_password, err_response, ["bob", "correct"]) + + user.stub(:client, mock_client) do + assert_raises(Parse::Error::EmailNotVerifiedError) do + user.verify_password("correct") + end + end + + mock_client.verify + end + + def test_verify_password_unverified_error_message_contains_code + user = Parse::User.new + user.username = "dana" + + err_body = { "code" => 205, "error" => "User email is not verified." 
} + err_response = Parse::Response.new(err_body) + err_response.http_status = 400 + + mock_client = Minitest::Mock.new + mock_client.expect(:verify_password, err_response, ["dana", "rightpass"]) + + user.stub(:client, mock_client) do + error = assert_raises(Parse::Error::EmailNotVerifiedError) do + user.verify_password("rightpass") + end + assert_match(/205/, error.message) + end + + mock_client.verify + end + + # ========================================================================= + # Error class ancestry + # ========================================================================= + + def test_email_not_verified_error_is_parse_error_subclass + assert Parse::Error::EmailNotVerifiedError.ancestors.include?(Parse::Error), + "EmailNotVerifiedError must descend from Parse::Error" + end +end diff --git a/test/lib/parse/webhook_after_response_test.rb b/test/lib/parse/webhook_after_response_test.rb new file mode 100644 index 0000000..99e8607 --- /dev/null +++ b/test/lib/parse/webhook_after_response_test.rb @@ -0,0 +1,173 @@ +require_relative "../../test_helper" +require "minitest/autorun" +require "stringio" + +# Verifies Parse::Webhooks::Payload#after_response (alias #defer): work +# registered by a handler runs AFTER the response is produced, off the +# client's critical path. Under a server exposing `rack.after_reply` the runner +# is enqueued there (Puma/Unicorn); otherwise it falls back to a thread. 
+class WebhookAfterResponseTest < Minitest::Test + WEBHOOK_HEADER = "HTTP_X_PARSE_WEBHOOK_KEY" + + class AfterRespProbe < Parse::Object + parse_class "AfterRespProbe" + property :title, :string + end + + def setup + @saved_allow = Parse::Webhooks.instance_variable_get(:@allow_unauthenticated) + @saved_logging = Parse::Webhooks.logging + Parse::Webhooks.instance_variable_set(:@key, nil) + Parse::Webhooks.instance_variable_set(:@allow_unauthenticated, true) + Parse::Webhooks.logging = false + Parse::Webhooks.instance_variable_set(:@routes, nil) + Parse::Webhooks::ReplayProtection.reset! + Parse.setup(server_url: "https://test.parse.com", application_id: "test", api_key: "test") + end + + def teardown + Parse::Webhooks.instance_variable_set(:@allow_unauthenticated, @saved_allow) + Parse::Webhooks.logging = @saved_logging + Parse::Webhooks.instance_variable_set(:@routes, nil) + end + + # env with a Puma/Unicorn-style rack.after_reply array unless after_reply: nil + def build_env(body:, path: nil, after_reply: []) + env = { + "REQUEST_METHOD" => "POST", + "CONTENT_TYPE" => "application/json", + "rack.input" => StringIO.new(body), + "CONTENT_LENGTH" => body.bytesize.to_s, + } + env["PATH_INFO"] = path if path + env["rack.after_reply"] = after_reply unless after_reply.nil? + env + end + + def fn_body(name, params = {}) + { "functionName" => name, "params" => params }.to_json + end + + def drain(after_reply) + after_reply.each(&:call) + end + + def test_after_response_runs_via_rack_after_reply + ran = [] + Parse::Webhooks.route(:function, "deferFn") do + after_response { ran << "did-work" } + "ok" + end + + after_reply = [] + status, _h, body = Parse::Webhooks.call(build_env(body: fn_body("deferFn"), after_reply: after_reply)) + + assert_equal 200, status + assert_equal({ "success" => "ok" }, JSON.parse(body.join)) + # Not yet run — only enqueued onto rack.after_reply. 
+ assert_empty ran, "deferred work must not run before the reply is flushed" + assert_equal 1, after_reply.size + + drain(after_reply) + assert_equal ["did-work"], ran + end + + def test_defer_alias_works + ran = [] + Parse::Webhooks.route(:function, "deferAlias") do + defer { ran << "via-defer" } + "ok" + end + after_reply = [] + Parse::Webhooks.call(build_env(body: fn_body("deferAlias"), after_reply: after_reply)) + drain(after_reply) + assert_equal ["via-defer"], ran + end + + def test_self_inside_deferred_block_is_payload + seen = [] + Parse::Webhooks.route(:function, "deferSelf") do + after_response { seen << params["echo"] } + "ok" + end + after_reply = [] + Parse::Webhooks.call(build_env(body: fn_body("deferSelf", "echo" => "hi"), after_reply: after_reply)) + drain(after_reply) + assert_equal ["hi"], seen + end + + def test_multiple_deferred_blocks_run_in_order_and_are_isolated + ran = [] + Parse::Webhooks.route(:function, "deferMany") do + after_response { ran << 1 } + after_response { raise "boom" } # must not abort the others + after_response { ran << 3 } + "ok" + end + after_reply = [] + Parse::Webhooks.call(build_env(body: fn_body("deferMany"), after_reply: after_reply)) + # A single runner is enqueued; draining it must not raise. + assert_equal 1, after_reply.size + drain(after_reply) + assert_equal [1, 3], ran + end + + def test_falls_back_to_thread_without_rack_after_reply + q = Queue.new + Parse::Webhooks.route(:function, "deferThread") do + after_response { q << "threaded" } + "ok" + end + # after_reply: nil => no rack.after_reply key => Thread fallback path + status, _h, _b = Parse::Webhooks.call(build_env(body: fn_body("deferThread"), after_reply: nil)) + assert_equal 200, status + assert_equal "threaded", q.pop # blocks until the detached thread runs it + end + + def test_not_dispatched_when_handler_rejects + ran = [] + Parse::Webhooks.route(:function, "deferReject") do + after_response { ran << "should-not-run" } + error! 
"nope" + end + after_reply = [] + status, _h, body = Parse::Webhooks.call(build_env(body: fn_body("deferReject"), after_reply: after_reply)) + assert_equal 200, status + assert_equal "nope", JSON.parse(body.join)["error"] + assert_empty after_reply, "rejected handler must not enqueue deferred work" + assert_empty ran + end + + def test_after_response_on_after_save_trigger_path + # The advertised use case: defer reindex-style work from an after_save + # trigger. Exercises call!'s trigger branch (which calls call_route twice + # — the specific class and the "*" route — on the SAME payload), so this + # also guards that dispatch happens once, not once per call_route. + ran = [] + Parse::Webhooks.route(:after_save, "AfterRespProbe") do + post = parse_object + after_response { ran << post.id } + post + end + + body = JSON.generate("triggerName" => "afterSave", + "object" => { "className" => "AfterRespProbe", "objectId" => "p1" }) + after_reply = [] + status, _h, _b = Parse::Webhooks.call( + build_env(body: body, path: "/after_save/AfterRespProbe", after_reply: after_reply) + ) + + assert_equal 200, status + assert_equal 1, after_reply.size, "exactly one runner enqueued despite specific + '*' routing" + assert_empty ran + drain(after_reply) + assert_equal ["p1"], ran + end + + def test_no_after_response_means_no_runner_enqueued + Parse::Webhooks.route(:function, "plain") { "ok" } + after_reply = [] + Parse::Webhooks.call(build_env(body: fn_body("plain"), after_reply: after_reply)) + assert_empty after_reply + end +end diff --git a/test/lib/parse/webhook_afterfind_integration_test.rb b/test/lib/parse/webhook_afterfind_integration_test.rb new file mode 100644 index 0000000..bf45e56 --- /dev/null +++ b/test/lib/parse/webhook_afterfind_integration_test.rb @@ -0,0 +1,107 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper_integration" +require_relative "../../support/webhook_test_server" + +# End-to-end proof that beforeFind/afterFind 
webhooks route through the real +# HTTP dispatch pipeline: +# +# Parse Server (Docker) -> HTTP POST afterFind webhook -> in-process WEBrick -> +# Parse::Webhooks Rack app -> webhook block +# +# This guards the v5.4.0 fix that threads the class name from the webhook URL +# path (`/afterFind/`) into the Payload. Parse Server's find payload body +# carries NO className anywhere (the matched objects omit it and there is no +# top-level className — verified against Parse Server 9.9.0), so without the +# path-derived class, parse_class was nil and the dispatch never invoked the +# registered find handler, and afterFind `objects` could not have their :vector +# columns stripped. +# +# Requires Docker (PARSE_TEST_USE_DOCKER=true) and a container whose +# host.docker.internal resolves back to the test host. +class WebhookAfterFindPost < Parse::Object + parse_class "WebhookAfterFindPost" + acl_policy :public + property :title, :string + property :embedding, :vector, dimensions: 3 +end + +class WebhookAfterFindIntegrationTest < Minitest::Test + include ParseStackIntegrationTest + + def setup + super + Parse::Webhooks.instance_variable_set(:@routes, nil) + Parse::Webhooks.allow_unauthenticated = true + @prior_private = Parse::Webhooks.instance_variable_get(:@allow_private_webhook_urls) + Parse::Webhooks.allow_private_webhook_urls = true + @server = Parse::Test::WebhookTestServer.new.start! + unless docker_can_reach_host? + @server.stop! + skip "container cannot reach host at #{@server.url}" + end + end + + def teardown + begin + Parse::Webhooks.remove_all_triggers! if @server + rescue StandardError + end + # Don't let rows accumulate across re-runs / other suites sharing the DB. + begin + WebhookAfterFindPost.query.results.each(&:destroy) + rescue StandardError + end + @server&.stop! + Parse::Webhooks.allow_unauthenticated = false + Parse::Webhooks.instance_variable_set(:@allow_private_webhook_urls, @prior_private) + super + end + + def docker_can_reach_host? 
+ result = `docker exec #{ENV["PSNEXT_PREFIX"] || "psnext-it"}-server sh -c 'getent hosts host.docker.internal' 2>&1` + !result.empty? && $?.success? + end + + def test_after_find_routes_and_scrubs_vectors_end_to_end + captured = {} + Parse::Webhooks.route(:after_find, "WebhookAfterFindPost") do + captured[:fired] = true + captured[:parse_class] = parse_class + captured[:count] = objects.size + captured[:first_keys] = objects.first.is_a?(Hash) ? objects.first.keys.map(&:to_s).sort : nil + captured[:any_has_embedding] = objects.any? { |o| o.is_a?(Hash) && o.key?("embedding") } + objects # return the (vector-scrubbed) matched objects + end + Parse::Webhooks.register_triggers!(@server.url) + + 3.times do |i| + o = WebhookAfterFindPost.new(title: "p#{i}") + o.embedding = Parse::Vector.new([1.0 + i, 2.0, 3.0]) + o.save + end + + query_error = nil + results = nil + begin + results = WebhookAfterFindPost.query.results + rescue StandardError => e + query_error = "#{e.class}: #{e.message}" + end + + assert captured[:fired], "afterFind handler must fire via the real HTTP dispatch (path-derived class)" + assert_equal "WebhookAfterFindPost", captured[:parse_class], + "parse_class must resolve from the webhook URL path for find triggers" + assert captured[:count].to_i >= 3, "handler must see the matched objects (saw #{captured[:count]})" + refute captured[:any_has_embedding], + "afterFind objects must have their :vector column stripped (class resolved from the route)" + refute_includes (captured[:first_keys] || []), "embedding" + # Critically: a registered afterFind that fails to route returns + # `{"success": true}` (not an objects array), which Parse Server rejects and + # the query EOFs. So a passing query here also proves the routing fix — + # before it, this raised Faraday::ConnectionFailed. 
+ assert_nil query_error, "afterFind must not break the query (got #{query_error})" + assert results.size >= 3, "afterFind must not drop the query results" + end +end diff --git a/test/lib/parse/webhook_handler_return_test.rb b/test/lib/parse/webhook_handler_return_test.rb new file mode 100644 index 0000000..588ea66 --- /dev/null +++ b/test/lib/parse/webhook_handler_return_test.rb @@ -0,0 +1,153 @@ +require_relative "../../test_helper" +require "minitest/autorun" + +# Verifies the value-returning semantics of registered webhook handler blocks. +# +# Handlers run with `self` bound to the Payload, so a block can use an explicit +# `return value` (the natural Ruby idiom) AND the historical proc idioms -- +# last-expression value, `next value`, `break value` -- and they all produce the +# handler result. `raise` must still propagate untouched so before_save +# rejections / `error!` keep working. +class WebhookHandlerReturnTest < Minitest::Test + class HandlerReturnObject < Parse::Object + property :name + def autofetch!(*args); end + end + + def setup + Parse::Webhooks.instance_variable_set(:@routes, nil) + Parse.setup(server_url: "https://test.parse.com", application_id: "test", api_key: "test") + end + + def teardown + Parse::Webhooks.instance_variable_set(:@routes, nil) + end + + def function_payload(name, params = {}) + Parse::Webhooks::Payload.new( + "functionName" => name, + "params" => params, + ) + end + + def call_fn(name, params = {}) + Parse::Webhooks.call_route(:function, name, function_payload(name, params)) + end + + def test_explicit_return_value_is_used + Parse::Webhooks.route(:function, "withReturn") do + return "early-#{params["who"]}" if params["who"] + "late" + end + + assert_equal "early-bob", call_fn("withReturn", "who" => "bob") + assert_equal "late", call_fn("withReturn") + end + + def test_return_can_short_circuit_before_later_work + Parse::Webhooks.route(:function, "guard") do + return { error: "denied" } unless params["allowed"] + { ok: true 
} + end + + assert_equal({ error: "denied" }, call_fn("guard")) + assert_equal({ ok: true }, call_fn("guard", "allowed" => true)) + end + + def test_legacy_last_expression_value_still_works + Parse::Webhooks.route(:function, "lastExpr") { "the-result" } + assert_equal "the-result", call_fn("lastExpr") + end + + def test_legacy_next_value_still_works + Parse::Webhooks.route(:function, "nextVal") do + next "via-next" if params["short"] + "via-last" + end + + assert_equal "via-next", call_fn("nextVal", "short" => true) + assert_equal "via-last", call_fn("nextVal") + end + + def test_self_is_payload_inside_handler + Parse::Webhooks.route(:function, "selfCheck") do + return params["echo"] + end + + assert_equal "hi", call_fn("selfCheck", "echo" => "hi") + end + + def test_block_with_explicit_payload_param_still_receives_payload + Parse::Webhooks.route(:function, "argBlock") do |payload| + return payload.params["v"] + end + + assert_equal 42, call_fn("argBlock", "v" => 42) + end + + def test_raise_propagates_unchanged + Parse::Webhooks.route(:function, "boom") do + raise Parse::Webhooks::ResponseError, "nope" + end + + err = assert_raises(Parse::Webhooks::ResponseError) { call_fn("boom") } + assert_equal "nope", err.message + end + + def test_error_bang_helper_still_throws + Parse::Webhooks.route(:function, "errBang") do + error! 
"rejected" unless params["ok"] + "passed" + end + + err = assert_raises(Parse::Webhooks::ResponseError) { call_fn("errBang") } + assert_equal "rejected", err.message + assert_equal "passed", call_fn("errBang", "ok" => true) + end + + def test_handler_leaves_no_singleton_method_on_payload + Parse::Webhooks.route(:function, "leakCheck") { return "ok" } + payload = function_payload("leakCheck") + Parse::Webhooks.call_route(:function, "leakCheck", payload) + + leaked = payload.singleton_class.instance_methods(false).grep(/parse_webhook_handler/) + assert_empty leaked, "handler singleton method should be removed after invocation" + end + + def test_singleton_method_removed_even_when_handler_raises + Parse::Webhooks.route(:function, "raiseClean") { error! "x" } + payload = function_payload("raiseClean") + assert_raises(Parse::Webhooks::ResponseError) do + Parse::Webhooks.call_route(:function, "raiseClean", payload) + end + + leaked = payload.singleton_class.instance_methods(false).grep(/parse_webhook_handler/) + assert_empty leaked, "singleton method must be removed even when the handler raises" + end + + def test_block_with_extra_required_param_does_not_raise + # instance_exec(payload, &block) used to leave surplus params nil; the + # singleton-method invocation must preserve that leniency rather than raise + # ArgumentError for an arity >= 2 block. 
+ Parse::Webhooks.route(:function, "twoArg") do |payload, extra| + return "p=#{payload.params["v"]} extra=#{extra.inspect}" + end + + assert_equal 'p=1 extra=nil', call_fn("twoArg", "v" => 1) + end + + def test_before_save_return_false_halts_save + Parse::Webhooks.route(:before_save, "HandlerReturnObject") do + return false + end + + payload = Parse::Webhooks::Payload.new( + "triggerName" => "beforeSave", + "object" => { "className" => "HandlerReturnObject", "objectId" => "x1", "name" => "n" }, + ) + + assert_raises(Parse::Webhooks::ResponseError) do + Parse::Webhooks.call_route(:before_save, "HandlerReturnObject", payload) + end + end +end diff --git a/test/lib/parse/webhook_non_object_triggers_test.rb b/test/lib/parse/webhook_non_object_triggers_test.rb new file mode 100644 index 0000000..1d9e2e4 --- /dev/null +++ b/test/lib/parse/webhook_non_object_triggers_test.rb @@ -0,0 +1,407 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" +require "minitest/autorun" + +# Tests for first-class routing of the NON-OBJECT webhook trigger shapes: +# the authentication triggers (beforeLogin / afterLogin / afterLogout / +# beforePasswordResetRequest) and the LiveQuery triggers (beforeConnect / +# beforeSubscribe / afterEvent). +# +# The behavioral contract under test mirrors Parse Server 9.x: +# * Parse Server's webhook response handler IGNORES the body for all of these +# (it resolves {}). The ONLY signal that affects the operation is the error +# path, and only for the "before" variants -- a {success:false} body +# RESOLVES and lets the login/connect/subscribe proceed. So a handler +# returning `false` from before_login/before_connect/before_subscribe/ +# before_password_reset_request must be converted to a ResponseError, and a +# returned Parse::Object must be normalized to a success no-op (never +# serialized back). 
+# * None of these run ActiveModel save/create/destroy callbacks even though +# the auth triggers carry a _User / _Session object. +class WebhookNonObjectTriggersTest < Minitest::Test + def setup + Parse::Webhooks.instance_variable_set(:@routes, nil) + end + + def teardown + Parse::Webhooks.instance_variable_set(:@routes, nil) + end + + # ========================================================================== + # Trigger-type predicates + # ========================================================================== + + TRIGGER_PREDICATES = { + "beforeLogin" => :before_login?, + "afterLogin" => :after_login?, + "afterLogout" => :after_logout?, + "beforePasswordResetRequest" => :before_password_reset_request?, + "beforeConnect" => :before_connect?, + "beforeSubscribe" => :before_subscribe?, + "afterEvent" => :after_event?, + }.freeze + + def test_each_trigger_predicate_is_exclusive + TRIGGER_PREDICATES.each do |name, predicate| + payload = Parse::Webhooks::Payload.new("triggerName" => name) + assert payload.send(predicate), "#{predicate} should be true for #{name}" + TRIGGER_PREDICATES.each do |other_name, other_pred| + next if other_pred == predicate + refute payload.send(other_pred), + "#{other_pred} should be false for #{name}" + end + end + end + + def test_auth_trigger_classification + %w[beforeLogin afterLogin afterLogout beforePasswordResetRequest].each do |name| + p = Parse::Webhooks::Payload.new("triggerName" => name) + assert p.auth_trigger?, "#{name} should be an auth_trigger?" + refute p.live_query_trigger?, "#{name} should not be a live_query_trigger?" + end + end + + def test_live_query_trigger_classification + %w[beforeConnect beforeSubscribe afterEvent].each do |name| + p = Parse::Webhooks::Payload.new("triggerName" => name) + assert p.live_query_trigger?, "#{name} should be a live_query_trigger?" + refute p.auth_trigger?, "#{name} should not be an auth_trigger?" 
+ end + end + + def test_object_triggers_are_not_auth_or_live_query + %w[beforeSave afterSave beforeDelete afterFind].each do |name| + p = Parse::Webhooks::Payload.new("triggerName" => name) + refute p.auth_trigger?, "#{name} must not classify as auth" + refute p.live_query_trigger?, "#{name} must not classify as live_query" + end + end + + # ========================================================================== + # Reject-on-false: a `false` return from a before_* auth/LQ trigger denies. + # (Parse Server only treats {error} as a rejection.) + # ========================================================================== + + def test_before_login_false_raises_response_error + Parse::Webhooks.route(:before_login, "_User") { |_p| false } + payload = Parse::Webhooks::Payload.new( + "triggerName" => "beforeLogin", + "object" => { "className" => "_User", "username" => "alice" }, + ) + assert_raises(Parse::Webhooks::ResponseError) do + Parse::Webhooks.call_route(:before_login, "_User", payload) + end + end + + def test_before_password_reset_request_false_raises + Parse::Webhooks.route(:before_password_reset_request, "_User") { |_p| false } + payload = Parse::Webhooks::Payload.new( + "triggerName" => "beforePasswordResetRequest", + "object" => { "className" => "_User", "email" => "a@example.com" }, + ) + assert_raises(Parse::Webhooks::ResponseError) do + Parse::Webhooks.call_route(:before_password_reset_request, "_User", payload) + end + end + + def test_before_connect_false_raises + Parse::Webhooks.route(:before_connect, "@Connect") { |_p| false } + payload = Parse::Webhooks::Payload.new( + "triggerName" => "beforeConnect", "event" => "connect", + ) + payload.instance_variable_set(:@webhook_class, "@Connect") + assert_raises(Parse::Webhooks::ResponseError) do + Parse::Webhooks.call_route(:before_connect, "@Connect", payload) + end + end + + def test_before_subscribe_false_raises + Parse::Webhooks.route(:before_subscribe, "Post") { |_p| false } + payload = 
Parse::Webhooks::Payload.new( + "triggerName" => "beforeSubscribe", + "query" => { "where" => { "archived" => false } }, + ) + payload.instance_variable_set(:@webhook_class, "Post") + assert_raises(Parse::Webhooks::ResponseError) do + Parse::Webhooks.call_route(:before_subscribe, "Post", payload) + end + end + + def test_before_login_error_bang_raises_with_message + Parse::Webhooks.route(:before_login, "_User") { |_p| error!("banned user") } + payload = Parse::Webhooks::Payload.new( + "triggerName" => "beforeLogin", + "object" => { "className" => "_User", "username" => "mallory" }, + ) + err = assert_raises(Parse::Webhooks::ResponseError) do + Parse::Webhooks.call_route(:before_login, "_User", payload) + end + assert_equal "banned user", err.message + end + + # ========================================================================== + # after_* are observe-only: false does NOT raise, result normalizes to true, + # and a returned object is never serialized back. + # ========================================================================== + + def test_after_login_false_does_not_raise_and_normalizes_true + Parse::Webhooks.route(:after_login, "_User") { |_p| false } + payload = Parse::Webhooks::Payload.new( + "triggerName" => "afterLogin", + "object" => { "className" => "_User", "username" => "alice" }, + ) + result = Parse::Webhooks.call_route(:after_login, "_User", payload) + assert_equal true, result, "after_login response is ignored; normalize to success" + end + + def test_after_logout_normalizes_true + Parse::Webhooks.route(:after_logout, "_Session") { |_p| nil } + payload = Parse::Webhooks::Payload.new( + "triggerName" => "afterLogout", + "object" => { "className" => "_Session", "objectId" => "s1" }, + ) + assert_equal true, Parse::Webhooks.call_route(:after_logout, "_Session", payload) + end + + def test_after_event_returned_object_is_not_leaked_into_response + # A handler that returns the parse_object (a Parse::Object) must NOT have + # that object 
serialized back -- Parse Server ignores the body and we must + # not leak it into the response/log. The result must normalize to `true`. + Parse::Webhooks.route(:after_event, "Post") { |p| p.parse_object } + payload = Parse::Webhooks::Payload.new( + "triggerName" => "afterEvent", "event" => "create", + "object" => { "className" => "Post", "objectId" => "p1", "title" => "Hi" }, + ) + payload.instance_variable_set(:@webhook_class, "Post") + result = Parse::Webhooks.call_route(:after_event, "Post", payload) + assert_equal true, result + refute_kind_of Parse::Object, result + end + + def test_before_login_returned_object_is_normalized_true + # Even a "before" auth trigger that returns the user object (not false) + # must succeed with a no-op -- the object is never serialized back. + Parse::Webhooks.route(:before_login, "_User") { |p| p.parse_object } + payload = Parse::Webhooks::Payload.new( + "triggerName" => "beforeLogin", + "object" => { "className" => "_User", "username" => "alice" }, + ) + result = Parse::Webhooks.call_route(:before_login, "_User", payload) + assert_equal true, result + refute_kind_of Parse::Object, result + end + + # ========================================================================== + # No ActiveModel save/create/destroy callbacks fire for auth triggers, even + # though they carry a _User / _Session object. + # ========================================================================== + + def test_before_login_does_not_run_save_or_create_callbacks + fired = [] + spy = Object.new + spy.define_singleton_method(:is_a?) 
{ |k| k == Parse::Object } + spy.define_singleton_method(:run_before_save_callbacks) { fired << :before_save; true } + spy.define_singleton_method(:run_before_create_callbacks) { fired << :before_create; true } + spy.define_singleton_method(:changes_payload) { { "x" => 1 } } + + Parse::Webhooks.route(:before_login, "_User") { |_p| spy } + payload = Parse::Webhooks::Payload.new( + "triggerName" => "beforeLogin", + "object" => { "className" => "_User", "username" => "alice" }, + ) + payload.define_singleton_method(:parse_object) { spy } + + Parse::Webhooks.call_route(:before_login, "_User", payload) + assert_empty fired, "beforeLogin must not run save/create ActiveModel callbacks" + end + + def test_after_login_does_not_run_after_save_or_create_callbacks + fired = [] + spy = Object.new + spy.define_singleton_method(:is_a?) { |k| k == Parse::Object } + spy.define_singleton_method(:run_after_save_callbacks) { fired << :after_save; true } + spy.define_singleton_method(:run_after_create_callbacks) { fired << :after_create; true } + + Parse::Webhooks.route(:after_login, "_User") { |_p| true } + payload = Parse::Webhooks::Payload.new( + "triggerName" => "afterLogin", + "object" => { "className" => "_User", "username" => "alice" }, + ) + payload.define_singleton_method(:parse_object) { spy } + + Parse::Webhooks.call_route(:after_login, "_User", payload) + assert_empty fired, "afterLogin must not run after_save/after_create callbacks" + end + + # ========================================================================== + # Accessors: event / clients / subscriptions, the beforeLogin user footgun, + # beforeSubscribe query, and top-level session-token capture. 
+ # ========================================================================== + + def test_after_event_event_accessor + payload = Parse::Webhooks::Payload.new( + "triggerName" => "afterEvent", "event" => "update", + "clients" => 3, "subscriptions" => 7, + "object" => { "className" => "Post", "objectId" => "p1" }, + ) + assert_equal "update", payload.event + assert_equal 3, payload.clients + assert_equal 7, payload.subscriptions + end + + def test_non_live_query_triggers_have_nil_event + p = Parse::Webhooks::Payload.new( + "triggerName" => "beforeSave", + "object" => { "className" => "Post", "objectId" => "p1" }, + ) + assert_nil p.event + assert_nil p.clients + assert_nil p.subscriptions + end + + def test_before_login_user_is_parse_object_not_user + # The login footgun: the user being authenticated is the OBJECT, and + # #user is nil (auth not complete yet). + payload = Parse::Webhooks::Payload.new( + "triggerName" => "beforeLogin", + "object" => { "className" => "_User", "objectId" => "u1", "username" => "alice" }, + ) + assert_nil payload.user, "beforeLogin carries no resolved #user" + assert_equal "_User", payload.parse_class + obj = payload.parse_object + assert_kind_of Parse::User, obj + assert_equal "alice", obj.username + end + + def test_before_subscribe_parse_query + payload = Parse::Webhooks::Payload.new( + "triggerName" => "beforeSubscribe", + "query" => { "where" => { "archived" => false } }, + ) + payload.instance_variable_set(:@webhook_class, "Post") + assert_equal "Post", payload.parse_class + q = payload.parse_query + assert_kind_of Parse::Query, q + end + + def test_before_connect_captures_top_level_session_token + payload = Parse::Webhooks::Payload.new( + "triggerName" => "beforeConnect", "event" => "connect", + "sessionToken" => "r:lq-token-123", + ) + assert_equal "r:lq-token-123", payload.session_token + assert payload.session_token? 
+ end + + def test_top_level_session_token_not_in_as_json + payload = Parse::Webhooks::Payload.new( + "triggerName" => "beforeSubscribe", + "sessionToken" => "r:secret-lq-token", + "query" => { "where" => {} }, + ) + refute_includes payload.as_json.to_json, "secret-lq-token", + "top-level session token must never appear in #as_json" + end + + def test_blank_top_level_session_token_is_nil + payload = Parse::Webhooks::Payload.new( + "triggerName" => "beforeConnect", "sessionToken" => " ", + ) + assert_nil payload.session_token + refute payload.session_token? + end + + # ========================================================================== + # Path routing: trigger_class_from_path accepts the @-prefixed pseudo-classes. + # ========================================================================== + + def test_trigger_class_from_path_accepts_connect_pseudo_class + assert_equal "@Connect", + Parse::Webhooks.trigger_class_from_path("/webhooks/beforeConnect/@Connect") + end + + def test_trigger_class_from_path_accepts_file_pseudo_class + assert_equal "@File", + Parse::Webhooks.trigger_class_from_path("/webhooks/beforeSave/@File") + end + + def test_trigger_class_from_path_still_rejects_garbage + assert_nil Parse::Webhooks.trigger_class_from_path("/webhooks/beforeConnect/..%2fetc") + assert_nil Parse::Webhooks.trigger_class_from_path("/webhooks/beforeConnect/@@bad") + end + + # ========================================================================== + # End-to-end through the Rack entry point (#call!): the seam where the regex + # fix, the Payload `webhook_class:` constructor handling, and routing combine + # on a RAW body — the actual production path. 
+ # ========================================================================== + + WEBHOOK_HEADER = "HTTP_X_PARSE_WEBHOOK_KEY" + + def with_rack_webhook_env + saved_key = Parse::Webhooks.instance_variable_get(:@key) + saved_allow = Parse::Webhooks.instance_variable_get(:@allow_unauthenticated) + saved_logging = Parse::Webhooks.logging + Parse::Webhooks.instance_variable_set(:@key, nil) + Parse::Webhooks.instance_variable_set(:@allow_unauthenticated, true) + Parse::Webhooks.logging = false + Parse::Webhooks::ReplayProtection.reset! + capture_io { yield } + ensure + Parse::Webhooks.instance_variable_set(:@key, saved_key) + Parse::Webhooks.instance_variable_set(:@allow_unauthenticated, saved_allow) + Parse::Webhooks.logging = saved_logging + end + + def rack_env(body:, path:) + { + "REQUEST_METHOD" => "POST", + "CONTENT_TYPE" => "application/json", + "PATH_INFO" => path, + "rack.input" => StringIO.new(body), + "CONTENT_LENGTH" => body.bytesize.to_s, + } + end + + def test_call_routes_a_raw_before_login_body + fired = [] + Parse::Webhooks.route(:before_login, "_User") do |p| + fired << p.parse_object.username + true + end + body = JSON.generate( + "triggerName" => "beforeLogin", + "object" => { "className" => "_User", "username" => "alice" }, + ) + with_rack_webhook_env do + status, _h, resp = Parse::Webhooks.call( + rack_env(body: body, path: "/webhooks/beforeLogin/_User") + ) + assert_equal 200, status + assert_equal({ "success" => true }, JSON.parse(resp.join)) + end + assert_equal ["alice"], fired, "the beforeLogin handler must fire via call!" + end + + def test_call_routes_a_raw_before_connect_body_with_at_connect_path + # The full seam: a raw beforeConnect body carries NO className; the class + # is resolved from the @Connect path via the constructor's webhook_class:. + # error! must surface as an {error} response (deny the connection). 
+ Parse::Webhooks.route(:before_connect, "@Connect") do |_p| + error!("connection refused") + end + body = JSON.generate( + "triggerName" => "beforeConnect", "event" => "connect", + "sessionToken" => "r:lq-tok", + ) + with_rack_webhook_env do + status, _h, resp = Parse::Webhooks.call( + rack_env(body: body, path: "/webhooks/beforeConnect/@Connect") + ) + assert_equal 200, status + assert_equal "connection refused", JSON.parse(resp.join)["error"] + end + end +end diff --git a/test/lib/parse/webhook_rack_call_test.rb b/test/lib/parse/webhook_rack_call_test.rb index 6ad8882..b7a633b 100644 --- a/test/lib/parse/webhook_rack_call_test.rb +++ b/test/lib/parse/webhook_rack_call_test.rb @@ -5,6 +5,12 @@ class WebhookRackCallTest < Minitest::Test WEBHOOK_HEADER = "HTTP_X_PARSE_WEBHOOK_KEY" + # Real model so an after_save route's call_route can build the parse object. + class SaveProbe < Parse::Object + parse_class "SaveProbe" + property :title, :string + end + def setup @saved_key = Parse::Webhooks.instance_variable_get(:@key) @saved_allow = Parse::Webhooks.instance_variable_get(:@allow_unauthenticated) @@ -37,13 +43,14 @@ def teardown Parse::Webhooks.instance_variable_set(:@routes, nil) end - def build_env(body: '{"functionName":"test"}', key_header: nil) + def build_env(body: '{"functionName":"test"}', key_header: nil, path: nil) env = { "REQUEST_METHOD" => "POST", "CONTENT_TYPE" => "application/json", "rack.input" => StringIO.new(body), "CONTENT_LENGTH" => body.bytesize.to_s, } + env["PATH_INFO"] = path if path env[WEBHOOK_HEADER] = key_header if key_header env end @@ -71,6 +78,77 @@ def test_emits_missing_key_warning_only_once_across_requests assert_equal 1, occurrences, "expected the missing-key warning to fire once across multiple requests" end + # ----- find-trigger routing via the real call! 
dispatch ----- + # + # afterFind/beforeFind payloads carry NO className in the body, so the class + # is derived from the request PATH (`/afterFind/`) and threaded into + # the Payload. Before this was wired, parse_class was nil for find triggers + # and the dispatch at call! never invoked the registered handler. These tests + # exercise the full call! path (not call_route directly) to guard the fix. + + def test_after_find_handler_routes_via_path_className + Parse::Webhooks.allow_unauthenticated = true + fired = false + seen_class = nil + Parse::Webhooks.route(:after_find, "RoutingProbe") do + fired = true + seen_class = parse_class + objects + end + body = JSON.generate("triggerName" => "afterFind", + "objects" => [{ "objectId" => "a", "title" => "x" }]) + capture_io do + status, _h, _b = Parse::Webhooks.call(build_env(body: body, path: "/after_find/RoutingProbe")) + assert_equal 200, status + end + assert fired, "afterFind handler must fire when the class comes from the request path" + assert_equal "RoutingProbe", seen_class + end + + def test_before_find_handler_routes_via_path_className + Parse::Webhooks.allow_unauthenticated = true + fired = false + Parse::Webhooks.route(:before_find, "RoutingProbe") { fired = true; true } + body = JSON.generate("triggerName" => "beforeFind", "query" => { "where" => {} }) + capture_io do + Parse::Webhooks.call(build_env(body: body, path: "/before_find/RoutingProbe")) + end + assert fired, "beforeFind handler must fire when the class comes from the request path" + end + + def test_save_trigger_still_routes_with_path_className + # Guard the precedence change: setting webhook_class from the path must not + # break save triggers, whose body DOES carry className (path == body class). 
+ Parse::Webhooks.allow_unauthenticated = true + seen_class = nil + Parse::Webhooks.route(:after_save, "SaveProbe") { seen_class = parse_class; true } + body = JSON.generate("triggerName" => "afterSave", + "object" => { "className" => "SaveProbe", "objectId" => "a" }) + capture_io do + Parse::Webhooks.call(build_env(body: body, path: "/after_save/SaveProbe")) + end + assert_equal "SaveProbe", seen_class + end + + def test_trigger_class_from_path + # camelCase (Parse Server body form) and snake_case (the form register_triggers! + # actually builds the URL with) must both be recognized. + assert_equal "Post", Parse::Webhooks.trigger_class_from_path("/afterFind/Post") + assert_equal "Post", Parse::Webhooks.trigger_class_from_path("/after_find/Post") + assert_equal "_User", Parse::Webhooks.trigger_class_from_path("/before_save/_User") + assert_equal "Post", Parse::Webhooks.trigger_class_from_path("/mcp/hooks/after_save/Post") + # Parse pseudo-classes (file / connection triggers) are allowed. 
+ assert_equal "@File", Parse::Webhooks.trigger_class_from_path("/after_save/@File") + assert_equal "@Connect", Parse::Webhooks.trigger_class_from_path("/before_connect/@Connect") + # Function path (single trailing segment, no trigger) -> nil + assert_nil Parse::Webhooks.trigger_class_from_path("/myFunction") + # Unknown trigger segment -> nil + assert_nil Parse::Webhooks.trigger_class_from_path("/bogusTrigger/Post") + # Malicious / malformed class segment -> nil (charset gate) + assert_nil Parse::Webhooks.trigger_class_from_path("/afterFind/..%2Fetc") + assert_nil Parse::Webhooks.trigger_class_from_path("/afterFind/has space") + end + def test_permissive_mode_via_setter_allows_request_without_key Parse::Webhooks.allow_unauthenticated = true capture_io do diff --git a/test/support/test_server.rb b/test/support/test_server.rb index 08233b0..10b58c8 100644 --- a/test/support/test_server.rb +++ b/test/support/test_server.rb @@ -112,6 +112,12 @@ def create_test_user(username: nil, password: nil, email: nil) email: email, ) user.save + + # Parse Server 9.x issues NO session token for a master-key signup + # (admin-provisioning semantics). Callers of this helper expect a + # usable session token (e.g. to call cloud functions as the user), + # so log in after signup to obtain one when the signup didn't. + user.login!(password) if user.session_token.to_s.empty? user end diff --git a/test/test_helper_integration.rb b/test/test_helper_integration.rb index 239cc8b..39e9714 100644 --- a/test/test_helper_integration.rb +++ b/test/test_helper_integration.rb @@ -83,6 +83,26 @@ def create_test_user(attributes = {}) def reset_database! Parse::Test::ServerHelper.reset_database! end + + # Obtain a live session token for a freshly-created user by logging in. + # + # Parse Server 9.x does NOT return a session token from a master-key signup + # (`Parse::User.new(...).save` / `signup!` on the default master client) — it + # treats master-key user creation as admin provisioning. 
Tests that need an + # authenticated user session therefore log in right after signup. Pass the + # just-saved user and the plaintext password; the user's `#session_token` is + # populated on return. No-ops if a token is already present (e.g. a client-mode + # signup that already issued one). + # + # @param user [Parse::User] a user that has already been saved/signed up. + # @param password [String] the plaintext password used at signup. + # @return [Parse::User] the same user, now carrying a live `#session_token`. + def login_after_signup!(user, password) + return user if user.session_token.present? + assert user.login!(password), + "login after signup must succeed to obtain a session token for #{user.username.inspect}" + user + end end # Example usage in tests:
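The no-op-or-login contract of the `login_after_signup!` helper added in this hunk can be exercised without a Parse Server. The following is a minimal sketch under stated assumptions, not the project's test code: `FakeUser` is a hypothetical stand-in for `Parse::User`, the token format is invented for illustration, `raise` replaces Minitest's `assert`, and `to_s.empty?` replaces ActiveSupport's `present?` so the block runs on plain Ruby.

```ruby
# Hypothetical stand-in for Parse::User (assumption, not the real class):
# a master-key signup leaves session_token nil, and login! populates it
# when the supplied password matches.
class FakeUser
  attr_reader :username, :session_token

  def initialize(username, password, session_token: nil)
    @username = username
    @password = password
    @session_token = session_token
  end

  # Returns true and sets a token on a correct password, false otherwise.
  def login!(password)
    return false unless password == @password
    @session_token = "r:fake-token-for-#{@username}"
    true
  end
end

# Same shape as the helper in the diff: return immediately when a token is
# already present, otherwise log in to obtain one and fail loudly if the
# login does not succeed.
def login_after_signup!(user, password)
  return user unless user.session_token.to_s.empty?
  raise "login after signup failed for #{user.username.inspect}" unless user.login!(password)
  user
end

# Master-key style signup: no token until the explicit login.
provisioned = FakeUser.new("alice", "s3cret")
# Client-mode style signup: a token was already issued at signup.
client_mode = FakeUser.new("bob", "pw", session_token: "r:existing")

login_after_signup!(provisioned, "s3cret")
# The password is never checked here because the early return fires first.
login_after_signup!(client_mode, "not-the-password")
```

Returning early when a token exists keeps the helper safe to call unconditionally after any signup, which is why the diff's version can be used by tests that run against both master-key and client-mode clients.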