diff --git a/CHANGELOG.md b/CHANGELOG.md index 41ce196..5b189a8 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,150 @@ ## parse-stack-next Changelog +### 5.5.1 + +#### Mongo-direct reads inside `Parse.with_session` are now scoped, not master + +- **FIXED**: A query that auto-routes to the mongo-direct path because of a + direct-only constraint (for example a geo `$near` / `$geoIntersects` query) + now honors the ambient session token set by `Parse.with_session(token)`. + Previously the mongo-direct auth resolver consulted only the query's own + `session_token=` / `scope_to_user` / `scope_to_role` and ignored the + fiber-local ambient session, so in server mode it fell through to a + master-key read with no ACL/CLP enforcement — returning rows the session was + not permitted to see, even though every REST query in the same + `with_session` block was correctly scoped. The resolver now mirrors + `Parse::Client#request` precedence: an explicit per-query token wins, then + the ambient session, then the master-key fallback; an explicit + `use_master_key: true` is a deliberate admin call and still skips the + ambient. Routing also accepts the ambient on non-master clients + (`Parse.client_mode` or a user-scoped client), so such a query runs scoped + rather than raising. + +#### Boolean property coercion no longer treats the string "false" as true + +- **FIXED**: A `:boolean` property assigned a string now coerces via + ActiveModel's boolean caster instead of raw Ruby truthiness. Previously the + coercion was `val ? true : false`, so the strings `"false"`, `"0"`, and + `"off"` — exactly what arrives on a Rails-form or query-string ingestion + path — all coerced to `true`, silently flipping a boolean the wrong way (for + example an `archived` flag or an application-defined access gate). String + forms now map correctly (`"false"`/`"0"`/`"off"` to `false`), a blank string + is treated as unset (`nil`), and native booleans from Parse wire JSON pass + through unchanged. + +#### Deprecation warning for setting ACL via mass-assignment + +- **DEPRECATED**: Setting `acl`/`ACL` through mass-assignment + (`Parse::Object#attributes=`) now emits a one-time security warning. Mass- + assigning an ACL from a caller-supplied hash — for example a controller doing + `record.attributes = params` without StrongParameters — lets an attacker + grant unintended access by sending an `ACL` key + (`{"ACL" => {"*" => {"write" => true}}}`). The behavior is unchanged this + release (the ACL is still applied), but the supported path is the explicit + `record.acl = ...` setter, and a future release may block ACL mass-assignment. + The constructor form `Klass.new(acl: ...)` is unaffected and does not warn. + +#### Redis cache values serialized as JSON instead of Marshal + +- **FIXED**: `Parse::Cache::Redis` now serializes cached HTTP responses as + JSON rather than Marshal. The Moneta-Redis store Marshals values by default, + so every cache hit ran `Marshal.load` on the bytes returned by Redis. Against + a shared, unauthenticated, or plaintext-`redis://` cache, an attacker able to + write the cache could plant a crafted Marshal payload that executed code on + deserialization. The wrapper now disables Moneta's value serializer + (`value_serializer: nil`) and JSON-encodes/decodes values itself; an + undecodable value (including any legacy Marshal entry) is treated as a cache + miss rather than deserialized. Cache keys are unchanged. No application code + changes are required; existing cached entries are transparently refetched and + re-stored in the new format on first access. +- **FIXED**: The `cache: "redis://..."` shorthand on `Parse::Client.new` / + `Parse.setup` now builds a `Parse::Cache::Redis` store instead of a bare + `Moneta.new(:Redis, ...)`, so it gets the same JSON value serialization and + is not subject to the Marshal deserialization issue above. +- **CHANGED**: The caching middleware stores response entries with string keys + so they round-trip losslessly through the JSON serialization. Reads accept + both string and legacy symbol keys. +- **FIXED**: `Parse::Embeddings::Cache::MonetaStore` now JSON-encodes cached + embedding vectors instead of relying on the Moneta store's default Marshal + value serializer, closing the same `Marshal.load`-on-read deserialization + vector for the embedding cache (whose key is derived from often-user-supplied + text). It also emits a one-time warning when handed a Marshal-serializing + store and recommends `value_serializer: nil`. +- **CHANGED**: Documentation for Redis-backed caches, the embedding cache, and + the synchronize-create lock store (`Parse.synchronize_create_store`) now + builds the Redis store via `Parse::Cache::Redis` or `value_serializer: nil` + so a raw `Moneta.new(:Redis, ...)` no longer leaves Marshal on the read path. + +#### Internal columns stripped from joined documents on mongo-direct reads + +- **FIXED**: `Parse::MongoDB.aggregate` now recursively strips Parse-internal + credential columns (`_hashed_password`, `_session_token`, `_auth_data_*`, + `_rperm`/`_wperm`, ...) from every result row **and every embedded + sub-document** for scoped (non-master) callers. Previously a scoped caller + could embed a foreign class (e.g. `_User` or `_Session`) into an arbitrary + alias via `$lookup` / `$graphLookup` / `$unionWith` and read back password + hashes, OAuth tokens, and session tokens: the per-class `protectedFields` + strip is keyed on the outer class, and the ACL sub-document walk only drops + ACL-failing sub-documents, so neither covered the aliased foreign document. + A new `Parse::PipelineSecurity.redact_internal_fields_deep!` runs as the final + redaction step. Structural columns (`_id`, `_p_*`, `_acl`, timestamps) are + preserved, so object and ACL reconstruction are unaffected; master-key reads + are unchanged. + +#### Hardened developer-facing mongo-direct aggregation terminals + +- **FIXED**: Credential columns (`_hashed_password`, `_session_token`, + `_auth_data_*`, `_email_verify_token`, `_perishable_token`, ...) used as a + `$match` field name are now refused **unconditionally** on the mongo-direct + path — even on a pipeline running with `allow_internal_fields: true` (the flag + that lets SDK-emitted `_rperm`/`_wperm` references through for + `readable_by_role` / `publicly_readable`). Previously the `*_direct` terminals + (`count_direct`, `results_direct`, `distinct_direct`, the direct group-by + helpers) passed `allow_internal_fields: true` unconditionally, so a query + whose `where` referenced a credential column compiled into a `$match` key that + bypassed the internal-field screen — a count/match oracle that could bisect a + bcrypt hash or session token. The ACL columns (`_rperm`/`_wperm`/`_tombstone`) + remain gated by `allow_internal_fields`, so `readable_by_role` still works. +- **FIXED**: `Parse::Query#aggregate` and `#aggregate_from_query` now treat a + scoped query (`session_token` / `scope_to_user` / `scope_to_role`) as + authoritative over an explicit `mongo_direct: false`. Previously passing + `mongo_direct: false` on a scoped aggregation skipped the fail-closed guard + and routed to Parse Server's master-key-only REST `/aggregate` endpoint, + running the aggregation unscoped (no ACL, CLP, or `protectedFields`). A scoped + aggregation now promotes to mongo-direct, or fails closed with + `Parse::Query::MongoDirectRequired` when direct Mongo is unavailable; unscoped + callers can still opt out to REST with `mongo_direct: false`. + +#### Additional hardening + +- **FIXED**: Request/response body logging now redacts credentials. At `:debug` + level the logging middleware emitted login/signup request bodies (cleartext + `password`) and auth response bodies (`sessionToken`, `authData`, MFA + secrets); the body path now runs through the same `BodyBuilder.redact` + scrubber the header path already used, before truncation. +- **FIXED**: The `_User` REST endpoints (`fetch_user` / `update_user` / + `delete_user`) now validate the `objectId` against + `Parse::API::PathSegment.object_id!` before interpolating it into the path, + matching the object endpoints. A crafted objectId (e.g. from a compromised + server response) can no longer traverse to a different endpoint on a + subsequent request. +- **CHANGED**: `$sessionToken` / `$session_token` (the camelCase forms of the + session-token column) are now in `DENIED_FIELD_REFS`, so they cannot be + laundered through a `$`-field reference in a pipeline. +- **IMPROVED**: The internal-collection floor (`_SCHEMA` / `_Hooks` / + `_GlobalConfig` / `_Audit` / ...) is now enforced unconditionally on every + `$lookup` / `$graphLookup` / `$unionWith` join target in + `Parse::ACLScope`, not only when lookup-rewriting runs. This closes a + defense-in-depth gap where an internal class whose CLP lookup returned no + policy could otherwise have been joinable on the direct path. +- **IMPROVED**: When the MCP agent server is started on an unauthenticated + loopback bind with no Origin/custom-header gate configured, it now defaults + to a loopback-only Origin policy. A browser DNS-rebinding attack against + `127.0.0.1` carries a non-loopback `Origin` and is refused; native clients + (which send no `Origin`) and local browser UIs are unaffected. A one-time + warning points operators at `MCP_API_KEY` / `allowed_origins:` / + `require_custom_header:` for routable deployments. + ### 5.5.0 #### Multimodal bytes-fetch path with magic-byte MIME verification diff --git a/Gemfile.lock b/Gemfile.lock index abc7365..f9d4813 100644 --- a/Gemfile.lock +++ b/Gemfile.lock @@ -1,7 +1,7 @@ PATH remote: . specs: - parse-stack-next (5.5.0) + parse-stack-next (5.5.1) activemodel (>= 6.1, < 9) activesupport (>= 6.1, < 9) connection_pool (>= 2.2, < 4) diff --git a/README.md b/README.md index 7013e9f..a2ddfc3 100644 --- a/README.md +++ b/README.md @@ -628,12 +628,21 @@ If `faraday-net_http_persistent` is not available, Parse Stack automatically fal A caching adapter of type `Moneta::Transformer`. Caching queries and object fetches can help improve the performance of your application, even if it is for a few seconds. Only successful `GET` object fetches and queries (non-empty) will be cached. You may set the default expiration time with the `expires` option. See related: [Moneta](https://github.com/minad/moneta). At any point in time you may clear the cache by calling the `clear_cache!` method on the client connection. ```ruby - store = Moneta.new :Redis, url: 'redis://localhost:6379' + # Use the bundled Parse::Cache::Redis wrapper for a Redis-backed cache. It + # serializes cached responses as JSON (never Marshal): a raw + # `Moneta.new(:Redis, ...)` store Marshals values by default, so a cache + # read would `Marshal.load` bytes from Redis — an RCE vector if that Redis + # is shared, unauthenticated, or reachable over a plaintext `redis://` MITM. + store = Parse::Cache::Redis.new(url: 'redis://localhost:6379') # use a Redis cache store with an automatic expire of 10 seconds. Parse.setup(cache: store, expires: 10, ...) ``` -As a shortcut, if you are planning on using REDIS and have configured the use of `redis` in your `Gemfile`, you can just pass the REDIS connection string directly to the cache option. +If you supply your own raw `Moneta.new(:Redis, ...)` store instead of the +wrapper, build it with `value_serializer: nil` to keep Marshal off the cache +read path. + +As a shortcut, if you are planning on using REDIS and have configured the use of `redis` in your `Gemfile`, you can just pass the REDIS connection string directly to the cache option. The string form builds a `Parse::Cache::Redis` wrapper for you, so it is JSON-serialized and safe by default. ```ruby Parse.setup(cache: 'redis://localhost:6379', ...) @@ -5342,7 +5351,11 @@ If you are already have setup a client that is being used by your defined models For high traffic applications that may be performing several server tasks on similar objects, you may utilize request caching. Caching is provided by a the `Parse::Middleware::Caching` class which utilizes a [Moneta store](https://github.com/minad/moneta) object to cache GET url requests that have allowable status codes (ex. HTTP 200, etc). The cache entry for the url will be removed when it is either considered expired (based on the `expires` option) or if a non-GET request is made with the same url. Using this feature appropriately can dramatically reduce your API request usage. ```ruby -store = Moneta.new :Redis, url: 'redis://localhost:6379' +# Parse::Cache::Redis serializes cached responses as JSON, not Marshal — a raw +# Moneta.new(:Redis) store Marshals values by default and a cache read would +# Marshal.load Redis bytes (RCE if the cache is shared/untrusted). Prefer the +# wrapper; if you supply a raw Moneta-Redis store, pass value_serializer: nil. +store = Parse::Cache::Redis.new(url: 'redis://localhost:6379') # use a Redis cache store with an automatic expire of 10 seconds. Parse.setup(cache: store, expires: 10, ...) diff --git a/docs/atlas_vector_search_guide.md b/docs/atlas_vector_search_guide.md index a9a3b92..2045b00 100644 --- a/docs/atlas_vector_search_guide.md +++ b/docs/atlas_vector_search_guide.md @@ -353,7 +353,11 @@ layer shared across processes, wrap any Moneta-compatible backend in the bundled adapter: ```ruby -moneta = Moneta.new(:Redis, url: ENV["REDIS_URL"]) +# Build the Moneta store with value_serializer: nil. MonetaStore JSON-encodes +# vectors itself; without value_serializer: nil, Moneta would additionally +# Marshal the values, and a cache read would Marshal.load bytes from a shared +# Redis — an RCE vector if that Redis is untrusted or MITM'd over redis://. +moneta = Moneta.new(:Redis, url: ENV["REDIS_URL"], value_serializer: nil) Parse::Embeddings::Cache.enable!( store: Parse::Embeddings::Cache::MonetaStore.new(moneta, ttl: 30 * 24 * 3600), ) diff --git a/lib/parse/acl_scope.rb b/lib/parse/acl_scope.rb index bd001fe..a9695f3 100644 --- a/lib/parse/acl_scope.rb +++ b/lib/parse/acl_scope.rb @@ -336,6 +336,17 @@ def assert_join_target_permitted!(target, perms) return if target.nil? target_str = target.to_s return if target_str.empty? + # RT-7 / NEW-4: hard internal-collection floor FIRST, independent of + # CLP. This must run on EVERY join target on the direct + # Parse::MongoDB.aggregate path. LookupRewriter.auto_rewrite (the other + # caller of assert_collection_allowed!) is skipped when rewrite_lookups + # is off or the root class can't be resolved, so relying on it alone + # leaves a gap: an internal collection (`_SCHEMA`/`_Hooks`/`_Audit`/ + # `_GlobalConfig`/...) whose CLP fetch returns :no_clp would pass the + # permits? check below. The floor refuses those outright while still + # admitting the SDK data classes (`_User`/`_Role`/`_Installation`/ + # `_Session`), which then face the per-scope CLP `find` gate. + Parse::PipelineSecurity.assert_collection_allowed!(target_str) return if Parse::CLPScope.permits?(target_str, :find, perms) raise Parse::CLPScope::Denied.new( target_str, :find, diff --git a/lib/parse/agent/mcp_rack_app.rb b/lib/parse/agent/mcp_rack_app.rb index 6d3e384..880fc96 100644 --- a/lib/parse/agent/mcp_rack_app.rb +++ b/lib/parse/agent/mcp_rack_app.rb @@ -4,6 +4,7 @@ require "json" require "securerandom" require "digest" +require "uri" require_relative "errors" require_relative "mcp_dispatcher" require_relative "mcp_subscriptions" @@ -320,6 +321,7 @@ def initialize(agent_factory: nil, max_body_size: DEFAULT_MAX_BODY_SIZE, pre_auth_rate_limiter: nil, allowed_origins: nil, require_custom_header: nil, + loopback_csrf_default: false, resource_subscriptions: false, subscription_manager: nil, notifications: nil, @@ -376,6 +378,16 @@ def initialize(agent_factory: nil, max_body_size: DEFAULT_MAX_BODY_SIZE, @pre_auth_rate_limiter = pre_auth_rate_limiter @allowed_origins = normalize_allowed_origins(allowed_origins) @required_custom_header = normalize_required_custom_header(require_custom_header) + # NEW-9: when no explicit allowed_origins / require_custom_header CSRF + # gate is configured but the server was started on an unauthenticated + # loopback bind, default to a loopback-only Origin policy. A browser + # DNS-rebinding attack against 127.0.0.1 always carries an `Origin` + # header (the attacker page's origin), so refusing any present + # non-loopback Origin closes that vector — while native clients (curl, + # SDK-to-SDK) send NO Origin and stay allowed, and a legitimate local + # browser UI sends a loopback Origin and is allowed. Ignored when an + # explicit allowlist is configured (operator owns the policy then). + @loopback_csrf_default = loopback_csrf_default && @allowed_origins.nil? @health_path = health_path.is_a?(String) && !health_path.empty? ? health_path : nil # Per-app registry of in-flight cancellable requests. Keyed by # [correlation_id, request_id]. A `notifications/cancelled` POST @@ -660,12 +672,9 @@ def call(env) # Missing/empty `Origin` is allowed regardless — native # clients (curl, SDK-to-SDK) shouldn't be broken by a # CSRF defense aimed at browsers. - if @allowed_origins - origin = env["HTTP_ORIGIN"].to_s.strip - unless origin.empty? || origin_allowed?(origin) - @logger&.warn("[Parse::Agent::MCPRackApp] Origin refused: #{origin.inspect}") - return [403, json_headers, [json_rpc_error(-32_700, "Origin not allowed")]] - end + if origin_refused?(env) + @logger&.warn("[Parse::Agent::MCPRackApp] Origin refused: #{env["HTTP_ORIGIN"].to_s.strip.inspect}") + return [403, json_headers, [json_rpc_error(-32_700, "Origin not allowed")]] end # 2c. Required custom header (CSRF defense-in-depth). A header @@ -1051,14 +1060,11 @@ def serve_listening_stream(env) return [400, json_headers, [json_rpc_error(-32_600, "Missing or invalid Mcp-Session-Id")]] end - # The origin allowlist (when configured) guards the listening stream - # the same way it guards POST — a browser-driven cross-origin GET to - # an SSE endpoint is the analogous CSRF surface. - if @allowed_origins - origin = env["HTTP_ORIGIN"].to_s.strip - unless origin.empty? || origin_allowed?(origin) - return [403, json_headers, [json_rpc_error(-32_700, "Origin not allowed")]] - end + # The origin policy (when configured, or the loopback default) guards + # the listening stream the same way it guards POST — a browser-driven + # cross-origin GET to an SSE endpoint is the analogous CSRF surface. + if origin_refused?(env) + return [403, json_headers, [json_rpc_error(-32_700, "Origin not allowed")]] end # Owner-binding: only the principal that established this session (or, @@ -2119,6 +2125,39 @@ def header_env_key(name) # `@allowed_origins`. Comparison is case-insensitive on host and # scheme. Wildcard via leading `.` matches subdomains: # `.example.com` matches `app.example.com` and `example.com`. + # Single chokepoint for the Origin CSRF gate, shared by the POST and + # listening-stream paths. A missing/empty Origin (native clients: curl, + # SDK-to-SDK) is always allowed — the CSRF surface is browser-only, and + # browsers always send an Origin on cross-origin requests. When an + # explicit allowlist is configured it wins; otherwise the loopback + # default (NEW-9) refuses any present non-loopback Origin. + def origin_refused?(env) + origin = env["HTTP_ORIGIN"].to_s.strip + return false if origin.empty? + if @allowed_origins + !origin_allowed?(origin) + elsif @loopback_csrf_default + !origin_is_loopback?(origin) + else + false + end + end + + # True when `origin`'s host is a loopback address (any scheme/port). + # Closes browser DNS-rebinding on an unauthenticated loopback bind: the + # attacker page's Origin (e.g. http://evil.example) is not loopback and + # is refused, while a real local UI on http://localhost: passes. + def origin_is_loopback?(origin) + host = begin + URI.parse(origin).host + rescue URI::InvalidURIError, StandardError + nil + end + return false if host.nil? + host = host.downcase.delete_prefix("[").delete_suffix("]") # unwrap IPv6 brackets + host == "localhost" || host == "127.0.0.1" || host == "::1" + end + def origin_allowed?(origin) return false unless @allowed_origins normalized = origin.downcase diff --git a/lib/parse/agent/mcp_server.rb b/lib/parse/agent/mcp_server.rb index 813bbb2..a0486b4 100644 --- a/lib/parse/agent/mcp_server.rb +++ b/lib/parse/agent/mcp_server.rb @@ -162,11 +162,30 @@ def initialize(port: 3001, host: "127.0.0.1", permissions: :readonly, # pre_auth_rate_limiter: closes NEW-MCP-6 — runs before the factory # is invoked so an empty or malformed body can't amplify into a # Parse Server round-trip. + # NEW-9: on an unauthenticated loopback dev bind with no explicit CSRF + # gate configured, enable a loopback-only Origin policy by default to + # mitigate browser DNS-rebinding (a malicious page resolving a hostname + # to 127.0.0.1 and POSTing to the agent). The attacker page always + # carries a non-loopback Origin and is refused; native (no-Origin) + # clients and real local browser UIs are unaffected. Skipped when an + # API key is set (auth already gates) or the operator configured the + # Origin/custom-header gates themselves. + loopback_csrf_default = + LOOPBACK_HOSTS.include?(host.to_s) && @api_key.to_s.empty? && + allowed_origins.nil? && require_custom_header.nil? + if loopback_csrf_default + warn "[Parse::Agent::MCPServer] Binding #{host}:#{port} without an API key. " \ + "Enabling a loopback-only Origin policy to mitigate browser DNS-rebinding. " \ + "For anything beyond local single-user dev set MCP_API_KEY (or pass api_key:), " \ + "and/or configure allowed_origins:/require_custom_header:." + end + @rack_app = MCPRackApp.new( agent_factory: method(:agent_factory), pre_auth_rate_limiter: pre_auth_rate_limiter, allowed_origins: allowed_origins, require_custom_header: require_custom_header, + loopback_csrf_default: loopback_csrf_default, ) end diff --git a/lib/parse/api/path_segment.rb b/lib/parse/api/path_segment.rb index 33adb97..1b6ec96 100644 --- a/lib/parse/api/path_segment.rb +++ b/lib/parse/api/path_segment.rb @@ -45,6 +45,37 @@ def identifier!(value, kind: "name") s end + # Parse objectId pattern: 1–40 alphanumerics. Parse Server generates + # 10-char alphanumeric ids; the cap is generous to allow custom ids + # while still refusing path-traversal (`/`, `.`, `..`) and query + # injection (`?`, `&`, `=`). Mirrors Parse::API::Objects::OBJECT_ID_PATTERN. + OBJECT_ID_PATTERN = /\A[A-Za-z0-9]{1,40}\z/.freeze + + # Validate a Parse objectId used in a REST path (`users/`, + # `classes//`) and return it unchanged. Refuses anything that + # could traverse to a different endpoint or smuggle a query string when + # interpolated raw — e.g. a hostile/compromised Parse Server returning a + # crafted `objectId` like `../classes/_User?where=...` on a prior + # response that then rides the next fetch/update/delete with whatever + # credentials the call is authorized to send. + # + # @param value the objectId to validate (anything responding to `to_s`). + # @param kind [String] human-readable name for error messages. + # @return [String] the validated objectId. + # @raise [ArgumentError] if blank or it fails the pattern. + def object_id!(value, kind: "objectId") + s = value.to_s + if s.empty? + raise ArgumentError, "#{kind} must not be empty" + end + unless OBJECT_ID_PATTERN.match?(s) + raise ArgumentError, + "#{kind} #{s.inspect} contains characters not allowed in a Parse " \ + "objectId. Must match /\\A[A-Za-z0-9]{1,40}\\z/." + end + s + end + # Parse trigger className pattern: a normal identifier, OR one of Parse # Server's `@`-prefixed pseudo-classes (`@File` for file triggers, # `@Connect` for the connection-global LiveQuery trigger). The optional diff --git a/lib/parse/api/users.rb b/lib/parse/api/users.rb index e2355b0..4af9339 100644 --- a/lib/parse/api/users.rb +++ b/lib/parse/api/users.rb @@ -26,6 +26,7 @@ module Users # @param headers [Hash] additional HTTP headers to send with the request. # @return [Parse::Response] def fetch_user(id, headers: {}, **opts) + id = Parse::API::PathSegment.object_id!(id) request :get, "#{USER_PATH_PREFIX}/#{id}", headers: headers, opts: opts end @@ -74,6 +75,7 @@ def create_user(body, headers: {}, **opts) # @param headers [Hash] additional HTTP headers to send with the request. # @return [Parse::Response] def update_user(id, body = {}, headers: {}, **opts) + id = Parse::API::PathSegment.object_id!(id) response = request :put, "#{USER_PATH_PREFIX}/#{id}", body: body, headers: headers, opts: opts response.parse_class = Parse::Model::CLASS_USER response @@ -98,6 +100,7 @@ def set_service_auth_data(id, service_name, auth_data, headers: {}, **opts) # @param headers [Hash] additional HTTP headers to send with the request. # @return [Parse::Response] def delete_user(id, headers: {}, **opts) + id = Parse::API::PathSegment.object_id!(id) request :delete, "#{USER_PATH_PREFIX}/#{id}", headers: headers, opts: opts end diff --git a/lib/parse/cache/redis.rb b/lib/parse/cache/redis.rb index 961e885..ecfb313 100644 --- a/lib/parse/cache/redis.rb +++ b/lib/parse/cache/redis.rb @@ -2,6 +2,7 @@ # frozen_string_literal: true require "moneta" +require "json" require_relative "pool" module Parse @@ -82,6 +83,20 @@ def initialize(url:, namespace: nil, pool_size: 5, pool_timeout: 5, **moneta_opt # session-scoped REST responses outlive their token's # validity. Callers can still pass `expires: false` to opt out. merged_options = { expires: true }.merge(moneta_options) + # SECURITY: disable Moneta's value serializer so cached values are NOT + # Marshal-encoded. We JSON-(de)serialize values ourselves in #store / + # #[] (see #encode_value / #decode_value). The default Moneta-Redis + # value serializer is Marshal, which would `Marshal.load` whatever + # bytes come back from Redis on every cache hit — an arbitrary-code- + # execution primitive if the Redis cache is shared, unauthenticated, + # or reachable through a plaintext `redis://` MITM. Forcing nil here + # (overriding any caller-supplied `value_serializer:`/`serializer:`) + # keeps that gadget-deserialization vector closed regardless of how + # the wrapper is configured. Keys keep the default (:marshal) encoding: + # they are only ever written and SCAN/DEL-compared as opaque strings, + # never `Marshal.load`ed from Redis content, so they are not a + # deserialization vector. + merged_options = merged_options.merge(value_serializer: nil) @moneta_options = merged_options @closed = false @pool = Pool.new(size: pool_size, timeout: pool_timeout) do @@ -90,7 +105,7 @@ def initialize(url:, namespace: nil, pool_size: 5, pool_timeout: 5, **moneta_opt end def [](key) - @pool[key] + decode_value(@pool[key]) end def key?(key) @@ -102,15 +117,18 @@ def delete(key) end def store(key, value, options = {}) - @pool.store(key, value, options) + @pool.store(key, encode_value(value), options) end # Atomic SETNX. Required so `Parse::CreateLock` can acquire # cross-process locks when this wrapper is the configured cache / # `synchronize_create_store`. Returns `true` only when the key did - # not already exist. + # not already exist. The value goes through the same JSON encoding + # as {#store} so a later {#[]} read round-trips instead of decoding + # to nil. (Parse::LockBackend never hits this path on this wrapper — + # it prefers the raw-Redis {#lock_acquire}/{#lock_release} pair.) def create(key, value, options = {}) - @pool.create(key, value, options) + @pool.create(key, encode_value(value), options) end # Atomic counter increment. Forwarded for Moneta surface parity. @@ -135,14 +153,14 @@ def increment(key, amount = 1, options = {}) # Atomically acquire a lock: SET key=owner only if absent, with a # native expiry. Used by {Parse::LockBackend} for {Parse::Lock} and # {Parse::CreateLock}. Deliberately bypasses Moneta's `create` — - # `Moneta.new(:Redis)` marshals BOTH keys and values, so a raw-Redis - # compare-and-delete on the marshaled blob would be fragile and - # coupled to Moneta's serializer config. Routing acquire AND release - # through plain-string raw Redis here keeps one consistent encoding - # across both ends of the lock and makes the keys human-inspectable - # in Redis (`parse-stack:lock:v1:`). Lock keys are + # `Moneta.new(:Redis)` marshals keys (and, by default, values), so a + # raw-Redis compare-and-delete on a Moneta-encoded blob would be + # fragile and coupled to Moneta's serializer config. Routing acquire + # AND release through plain-string raw Redis here keeps one consistent + # encoding across both ends of the lock and makes the keys human- + # inspectable in Redis (`parse-stack:lock:v1:`). Lock keys are # short-lived (TTL ≤ 30s) so there is no migration concern when a - # deploy flips between the Moneta-encoded and raw-encoded paths. + # deploy flips encodings. # # @param key [String] plain-string lock key. # @param owner [String] unique-per-acquisition owner token. @@ -222,6 +240,32 @@ def close private + # Serialize a cache value to a JSON String before handing it to Moneta + # (which stores it raw, since the value serializer is disabled — see the + # constructor). JSON is used instead of Marshal so the read side never + # `Marshal.load`s attacker-influenced Redis bytes. Cache values written + # by the caching middleware are `{ "headers" => ..., "body" => ... }` + # hashes of strings, which round-trip losslessly through JSON. + def encode_value(value) + JSON.generate(value) + end + + # Decode a JSON String read back from Moneta/Redis. Returns nil on a + # miss or on any value that is not valid JSON — most importantly, legacy + # Marshal-encoded entries written before this wrapper switched to JSON. + # Treating an undecodable value as a miss makes the caller refetch and + # re-store it in the JSON format, and ensures a hostile non-JSON blob can + # at worst yield a cache miss, never a deserialized Ruby object graph. + def decode_value(raw) + return nil if raw.nil? + JSON.parse(raw) + rescue JSON::ParserError, EncodingError, TypeError + # ParserError covers malformed and hostile-depth JSON + # (JSON::NestingError subclasses it); TypeError covers a + # non-String blob from a misconfigured store. All are misses. + nil + end + def delete_keys_matching!(pattern) @pool.pool.with do |store| redis = backend_client(store) diff --git a/lib/parse/client.rb b/lib/parse/client.rb index 2c0d0c1..6a1d7cf 100644 --- a/lib/parse/client.rb +++ b/lib/parse/client.rb @@ -716,10 +716,26 @@ def initialize(opts = {}) warn "[Parse::Client] Cache store provided but :expires is not set or is 0. " \ "Caching will be disabled. Set :expires to enable caching (e.g., expires: 10)." else - # advanced: provide a REDIS url, we'll configure a Moneta Redis store. + # advanced: provide a REDIS url, we'll configure a Redis store. if opts[:cache].is_a?(String) && opts[:cache].starts_with?("redis://") begin - opts[:cache] = Moneta.new(:Redis, url: opts[:cache]) + # Eagerly load the redis adapter so a missing `redis` gem + # fails fast here (at setup) with the friendly hint below, + # rather than deferring to the first cache access — the + # Parse::Cache::Redis pool builds its Moneta-Redis backends + # lazily, so without this the LoadError would surface later. + require "moneta/adapters/redis" + # Route through Parse::Cache::Redis rather than a bare + # `Moneta.new(:Redis, ...)`. SECURITY: the Moneta-Redis store + # Marshals values by default, so every cache hit would + # `Marshal.load` whatever bytes come back from Redis — an + # arbitrary-code-execution primitive if the cache is shared, + # unauthenticated, or reachable over a plaintext `redis://` + # MITM. The wrapper forces `value_serializer: nil` and + # JSON-(de)serializes cached values itself, closing that + # deserialization vector on this shorthand the same way an + # explicitly-constructed wrapper does. + opts[:cache] = Parse::Cache::Redis.new(url: opts[:cache]) rescue LoadError puts "[Parse::Middleware::Caching] Did you forget to load the redis gem (Gemfile)?" raise diff --git a/lib/parse/client/caching.rb b/lib/parse/client/caching.rb index 5c2aa95..0f98a45 100644 --- a/lib/parse/client/caching.rb +++ b/lib/parse/client/caching.rb @@ -190,8 +190,13 @@ def call!(env) body = cache_data.respond_to?(:body) ? cache_data.body : nil response_headers = cache_data.response_headers || {} elsif cache_data.is_a?(Hash) - body = cache_data[:body] - response_headers = cache_data[:headers] || {} + # New entries are stored with string keys so they survive a + # JSON round-trip (the Redis cache wrapper serializes values as + # JSON, not Marshal — see Parse::Cache::Redis). Fall back to + # symbol keys for legacy in-memory / Marshal-backed entries + # written before that switch. + body = cache_data["body"] || cache_data[:body] + response_headers = cache_data["headers"] || cache_data[:headers] || {} end if cache_data.present? && body.present? @@ -244,8 +249,12 @@ def call!(env) response_env.body.present? && response_env.response_headers[CONTENT_LENGTH_KEY].to_i.between?(20, 1_250_000) store_start = Process.clock_gettime(Process::CLOCK_MONOTONIC) begin + # Store with string keys (and a plain Hash of headers) so the + # value round-trips losslessly through the Redis cache wrapper's + # JSON serialization. The read path above reads string keys first + # with a symbol-key fallback for legacy entries. @store.store(@cache_key, - { headers: response_env.response_headers, body: response_env.body }, + { "headers" => response_env.response_headers.to_h, "body" => response_env.body }, expires: @expires) duration_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - store_start) * 1000.0).round(3) instrument_cache(:store, method: method, url_path: url_path, duration_ms: duration_ms) diff --git a/lib/parse/client/logging.rb b/lib/parse/client/logging.rb index 8a50511..3c6822c 100644 --- a/lib/parse/client/logging.rb +++ b/lib/parse/client/logging.rb @@ -186,6 +186,15 @@ def log_body(body, prefix) end end + # Scrub credentials before logging. At :debug level this method emits + # both the request body (login/signup carries a cleartext `password`) + # and the response body (auth responses carry a fresh `sessionToken`, + # `authData`, and MFA secrets). `log_headers` already redacts headers; + # the body path must use the same canonical scrubber or it leaks live + # credentials to anyone with log access. Redact BEFORE the length cap + # so truncation can't split a token across the boundary and slip past. + content = Parse::Middleware::BodyBuilder.redact(content) + if content.length > max_length logger.debug " [#{prefix} Body] #{content[0...max_length]}... (truncated, #{content.length} total)" elsif content.length > 0 diff --git a/lib/parse/embeddings/cache.rb b/lib/parse/embeddings/cache.rb index 1bc893b..486462d 100644 --- a/lib/parse/embeddings/cache.rb +++ b/lib/parse/embeddings/cache.rb @@ -3,6 +3,7 @@ require "digest" require "monitor" +require "json" module Parse module Embeddings @@ -89,14 +90,25 @@ def clear # shared across processes: # # require "moneta" - # moneta = Moneta.new(:Redis, url: ENV["REDIS_URL"]) + # moneta = Moneta.new(:Redis, url: ENV["REDIS_URL"], value_serializer: nil) # Parse::Embeddings::Cache.enable!( # store: Parse::Embeddings::Cache::MonetaStore.new(moneta, ttl: 30 * 24 * 3600), # ) # # Keys are namespaced (`emb:` by default) so the entries are - # recognizable next to other application keys; values are the - # raw vector Arrays (Moneta's own serializer handles encoding). + # recognizable next to other application keys; values are + # JSON-encoded vector Arrays (see {#get}/{#set}). + # + # SECURITY — build the Moneta store with `value_serializer: nil` + # (as above). Moneta's default value serializer is Marshal, so a + # cache read would `Marshal.load` whatever bytes are in the backing + # store — an arbitrary-code-execution primitive if that store is + # shared, unauthenticated, or reachable over a plaintext `redis://` + # MITM, and the cache key is derived from (often user-supplied) + # embedded text. `MonetaStore` JSON-(de)serializes values itself, but + # that only closes the vector IF Moneta is not also Marshaling on top; + # `value_serializer: nil` ensures it is not. `MonetaStore` emits a + # one-time warning if it is handed a Marshal-serializing store. # TTL is forwarded via Moneta's `expires:` option when the # backend supports it, ignored otherwise. # @@ -121,6 +133,13 @@ def initialize(moneta, ttl: nil, namespace: "emb:") "Parse::Embeddings::Cache::MonetaStore expects a Moneta-compatible " \ "store responding to #[] and #[]= (got #{moneta.class})." end + if marshaling_value_store?(moneta) + warn "[Parse::Embeddings::Cache::MonetaStore] SECURITY: the supplied Moneta " \ + "store deserializes values with Marshal. A cache read Marshal.loads bytes " \ + "from the backing store, which is a remote-code-execution vector when the " \ + "store is shared/untrusted. Rebuild it with value_serializer: nil, e.g. " \ + "Moneta.new(:Redis, url: ..., value_serializer: nil)." + end @moneta = moneta @ttl = ttl && Float(ttl) @namespace = namespace.to_s @@ -128,8 +147,7 @@ def initialize(moneta, ttl: nil, namespace: "emb:") # @return [Array, nil] def get(key) - value = @moneta[@namespace + key] - value.is_a?(Array) ? value : nil + decode_vector(@moneta[@namespace + key]) rescue StandardError nil end @@ -137,23 +155,57 @@ def get(key) # @return [Array] the vector, unchanged. def set(key, vector) k = @namespace + key + encoded = encode_vector(vector) if @ttl && @moneta.respond_to?(:store) begin - @moneta.store(k, vector, expires: @ttl) + @moneta.store(k, encoded, expires: @ttl) rescue ArgumentError # Hash-like backends define #store(key, value) with no # options arg, so the expires: form raises ArgumentError. # Fall back to a plain write (no expiry) rather than letting # the fail-open rescue below silently drop every vector. - @moneta[k] = vector + @moneta[k] = encoded end else - @moneta[k] = vector + @moneta[k] = encoded end vector rescue StandardError vector end + + private + + # Vectors are JSON-encoded here rather than left to the Moneta + # store's own (Marshal-by-default) value serializer. Combined with a + # store built with `value_serializer: nil`, this keeps Marshal off + # the read path entirely: a JSON parse of attacker-influenced backing- + # store bytes can at worst yield inert data or raise — never a + # deserialized Ruby gadget object graph (RCE-if-cache-compromised). + # Embedding vectors are Array, which round-trips losslessly + # through JSON. + def encode_vector(vector) + JSON.generate(vector) + end + + def decode_vector(raw) + return raw if raw.is_a?(Array) # legacy/non-serializing store entry + return nil if raw.nil? + parsed = JSON.parse(raw) + parsed.is_a?(Array) ? parsed : nil + rescue JSON::ParserError, TypeError, EncodingError + nil + end + + # Best-effort detection of a Moneta store that serializes VALUES with + # Marshal. Moneta names its transformer proxy after the active + # serializers (e.g. "...MarshalValue"); a store built with + # value_serializer: nil has no "...Value" segment. Used only to warn. + def marshaling_value_store?(moneta) + moneta.class.name.to_s.include?("MarshalValue") + rescue StandardError + false + end end MONITOR = Monitor.new diff --git a/lib/parse/model/core/properties.rb b/lib/parse/model/core/properties.rb index 889c333..0dfdbe1 100644 --- a/lib/parse/model/core/properties.rb +++ b/lib/parse/model/core/properties.rb @@ -79,11 +79,33 @@ module Properties CORE_FIELDS = { id: :string, created_at: :date, updated_at: :date, acl: :acl }.freeze # The delete operation hash. DELETE_OP = { "__op" => "Delete" }.freeze + # Shared stateless boolean caster used by {#format_value}. One instance + # for the process lifetime — `cast` only consults a frozen FALSE_VALUES + # set, so reuse is thread-safe. + BOOLEAN_CASTER = ActiveModel::Type::Boolean.new.freeze # @!visibility private def self.included(base) base.extend(ClassMethods) end + # Process-once deprecation warning emitted when an ACL is set through + # mass-assignment (`Parse::Object#attributes=`). Setting ACL this way is + # still permitted in this release for backward compatibility, but is a + # mass-assignment foot-gun (a caller-supplied params hash bearing an + # `ACL` key can grant public write). A future release may block it; the + # supported path is the explicit `obj.acl =` setter. One-time so loops + # over many records do not spam the log. + # @!visibility private + def self.warn_acl_mass_assignment_once! + return if @acl_mass_assignment_warned + @acl_mass_assignment_warned = true + warn "[Parse::Stack:SECURITY] Setting `acl`/`ACL` via mass-assignment " \ + "(Parse::Object#attributes=) is deprecated and may be blocked in a " \ + "future release. A caller-supplied params hash bearing an ACL key can " \ + "grant unintended access — filter input with StrongParameters and set " \ + "ACL via the explicit `obj.acl = ...` setter instead." + end + # The class methods added to Parse::Objects module ClassMethods @@ -723,6 +745,17 @@ def apply_attributes!(hash, dirty_track: false, filter_protected: nil, protected # @return (see #apply_attributes!) def attributes=(hash) return unless hash.is_a?(Hash) + # `acl`/`ACL` is still accepted here (a user-facing property), but + # mass-assigning an ACL from a caller-supplied hash — e.g. a Rails + # controller doing `record.attributes = params` without + # StrongParameters — lets an attacker grant themselves write by + # sending `{"ACL" => {"*" => {"write" => true}}}`. Warn (once) so the + # foot-gun is visible; callers should set ACL via the explicit + # `obj.acl =` setter. The constructor path (`Klass.new(acl:)`) calls + # apply_attributes! directly and is intentionally not warned. + if hash.key?("ACL") || hash.key?("acl") || hash.key?(:ACL) || hash.key?(:acl) + Parse::Properties.warn_acl_mass_assignment_once! + end # - [:id, :objectId] # only overwrite @id if it hasn't been set. apply_attributes!(hash, dirty_track: true) @@ -838,11 +871,15 @@ def format_value(key, val, data_type = nil) val = val.to_i end when :boolean - if val.nil? - val = nil - else - val = val ? true : false - end + # Coerce via ActiveModel's boolean caster rather than Ruby + # truthiness. Plain `val ? true : false` treats every non-nil, + # non-false object as true, so the strings "false", "0", and "off" + # — exactly what arrives on the Rails-form / query-string ingestion + # path — would coerce to `true` and silently flip a boolean the + # wrong way (e.g. an `archived` or admin gate). ActiveModel maps the + # string forms ("false"/"0"/"f"/"off"/"") to false/nil. Parse wire + # JSON already sends real booleans, which pass through unchanged. + val = val.nil? ? nil : BOOLEAN_CASTER.cast(val) when :string val = val.to_s unless val.blank? when :float diff --git a/lib/parse/mongodb.rb b/lib/parse/mongodb.rb index bf286fe..45ee6dc 100644 --- a/lib/parse/mongodb.rb +++ b/lib/parse/mongodb.rb @@ -1651,6 +1651,18 @@ def aggregate(collection_name, pipeline, max_time_ms: nil, rewrite_lookups: nil, collection_name, perms_for_clp, ) Parse::CLPScope.redact_protected_fields!(results, strip_set) if strip_set.any? + + # Process-level floor: recursively strip Parse-internal credential + # columns (_hashed_password, _session_token, _auth_data_*, _rperm, + # ...) from every row AND every embedded sub-document. The + # protectedFields strip above is keyed on the OUTER class, and the + # ACL sub-doc walk only DROPS ACL-failing sub-docs — neither covers + # a foreign class (e.g. _User / _Session) pulled in via $lookup / + # $graphLookup / $unionWith under an arbitrary alias. Runs last, for + # scoped (non-master) callers only; master is unredacted by design. + results.each do |row| + Parse::PipelineSecurity.redact_internal_fields_deep!(row) + end end payload[:result_count] = results.size diff --git a/lib/parse/pipeline_security.rb b/lib/parse/pipeline_security.rb index 729b89d..f45207d 100644 --- a/lib/parse/pipeline_security.rb +++ b/lib/parse/pipeline_security.rb @@ -105,6 +105,7 @@ def initialize(message, stage: nil, operator: nil, reason: nil) DENIED_FIELD_REFS = %w[ $_hashed_password $_password_history $_session_token $_sessionToken + $sessionToken $session_token $_email_verify_token $_perishable_token $_failed_login_count $_account_lockout_expires_at $_rperm $_wperm @@ -161,6 +162,19 @@ def initialize(message, stage: nil, operator: nil, reason: nil) # walk_for_denied! field-name screen. INTERNAL_FIELDS_PREFIX_DENYLIST = %w[_auth_data_].freeze + # The credential / sensitive subset of {INTERNAL_FIELDS_DENYLIST}. These + # columns must NEVER appear as a user-influenced `$match` field name — + # even on a pipeline that runs with `allow_internal_fields: true` (which + # exists to permit SDK-emitted `_rperm`/`_wperm` references from + # `readable_by_role` / `publicly_readable`). A `$match`/`$count` on a + # password hash, session/reset token, or auth-data column is a credential- + # exfiltration oracle (bisect the value char-by-char), and these columns + # have NO legitimate SDK query use — so the `allow_internal_fields` escape + # hatch must not relax them. Derived from {INTERNAL_FIELDS_DENYLIST} minus + # the ACL/bookkeeping columns (`_rperm`/`_wperm`/`_tombstone`) the ACL DSL + # legitimately emits, so the two lists never drift. + CREDENTIAL_FIELDS_DENYLIST = (INTERNAL_FIELDS_DENYLIST - %w[_rperm _wperm _tombstone]).freeze + # Forensic string-introspection operators. When any of these # appears INSIDE `$expr` with a field-reference input string, the # query becomes a per-character oracle even though the operator @@ -336,6 +350,48 @@ def strip_internal_fields(doc) end end + # Depth bound for {redact_internal_fields_deep!}. `$lookup`/`$graphLookup`/ + # `$unionWith` embed foreign documents at shallow alias depth, so this is + # generous; the bound exists only to fail safe on cyclic/pathological docs. + INTERNAL_REDACT_MAX_DEPTH = 32 + + # Recursively delete {INTERNAL_FIELDS_DENYLIST} / {INTERNAL_FIELDS_PREFIX_DENYLIST} + # keys from `node` AND every embedded sub-document/array element, in place. + # + # This is the process-level floor that stops Parse-Server-internal + # credential columns (`_hashed_password`, `_session_token`, `_auth_data_*`, + # `_rperm`/`_wperm`, ...) from reaching a scoped caller through ANY result + # shape — most importantly a foreign-class document pulled in via + # `$lookup`/`$graphLookup`/`$unionWith` under an arbitrary alias. Neither + # the per-class protectedFields strip (keyed on the OUTER class) nor the + # ACL sub-document walk (which only DROPS ACL-failing sub-docs, never + # strips field names) covers that alias. Unlike {strip_internal_fields} + # (one level, non-mutating), this walks the whole tree and mutates in + # place so it can run as the last step over a result set. + # + # Structural columns (`_id`, `_p_*`, `_created_at`, `_updated_at`, `_acl`) + # are intentionally NOT in the denylist, so object/ACL reconstruction is + # unaffected. + # + # @param node [Object] a result row (Hash), array, or scalar. + # @return [Object] the same node, mutated. + def redact_internal_fields_deep!(node, depth: INTERNAL_REDACT_MAX_DEPTH) + case node + when Hash + # Always clean the current level (even at the depth floor) so an + # embedded document sitting exactly at the bound is still scrubbed. + node.delete_if do |key, _value| + ks = key.to_s + INTERNAL_FIELDS_DENYLIST.include?(ks) || + INTERNAL_FIELDS_PREFIX_DENYLIST.any? { |prefix| ks.start_with?(prefix) } + end + node.each_value { |v| redact_internal_fields_deep!(v, depth: depth - 1) } if depth > 0 + when Array + node.each { |el| redact_internal_fields_deep!(el, depth: depth - 1) } if depth > 0 + end + node + end + # Wave-3 TRACK-CLP-4: refuse caller-supplied pipelines that # reference a protected field via `$` on the RHS of a # `$project` / `$addFields` / `$set` / `$group` / `$bucket` / @@ -510,21 +566,31 @@ def walk_for_denied!(node, depth:, stage_idx: nil, inside_expr: false, allow_int # oracle as the where:-constraint path in ConstraintTranslator. # Operators ($-prefixed) are excluded because they are validated # separately by DENIED_OPERATORS. - if !allow_internal_fields && - !key_str.start_with?("$") && - (INTERNAL_FIELDS_DENYLIST.include?(key_str) || - INTERNAL_FIELDS_PREFIX_DENYLIST.any? { |prefix| key_str.start_with?(prefix) }) - raise Error.new( - "SECURITY: Pipeline references internal Parse Server field " \ - "'#{key_str}' at nesting depth #{depth}" \ - "#{stage_idx ? " inside stage #{stage_idx}" : ""}. " \ - "This column (password hash, session token, auth data, or ACL " \ - "pointer) must not appear in a user-influenced pipeline — " \ - "it enables credential exfiltration via count/match oracles.", - stage: stage_idx, - operator: key_str, - reason: :denied_internal_field, - ) + # + # CREDENTIAL columns (password hash, session/reset token, auth data) + # are refused UNCONDITIONALLY — `allow_internal_fields` (which exists + # so SDK-emitted `_rperm`/`_wperm` references survive on the mongo- + # direct path) must NOT relax them, or a `*_direct` terminal becomes + # a credential-bisection oracle. The remaining internal columns + # (`_rperm`/`_wperm`/`_tombstone`) stay gated by allow_internal_fields. + if !key_str.start_with?("$") + is_credential = CREDENTIAL_FIELDS_DENYLIST.include?(key_str) || + INTERNAL_FIELDS_PREFIX_DENYLIST.any? { |prefix| key_str.start_with?(prefix) } + is_internal = INTERNAL_FIELDS_DENYLIST.include?(key_str) || + INTERNAL_FIELDS_PREFIX_DENYLIST.any? { |prefix| key_str.start_with?(prefix) } + if is_credential || (is_internal && !allow_internal_fields) + raise Error.new( + "SECURITY: Pipeline references internal Parse Server field " \ + "'#{key_str}' at nesting depth #{depth}" \ + "#{stage_idx ? " inside stage #{stage_idx}" : ""}. " \ + "This column (password hash, session token, auth data, or ACL " \ + "pointer) must not appear in a user-influenced pipeline — " \ + "it enables credential exfiltration via count/match oracles.", + stage: stage_idx, + operator: key_str, + reason: :denied_internal_field, + ) + end end # Cap caller-supplied regex pattern length. Catches the two # shapes Mongo accepts: the find-form `{ field: { $regex: "..." } }` diff --git a/lib/parse/query.rb b/lib/parse/query.rb index 6929377..e199b2b 100644 --- a/lib/parse/query.rb +++ b/lib/parse/query.rb @@ -1965,11 +1965,21 @@ def acl_permission_set # roles, injects the three-layer ACL simulation # (top-level `$match`, `$lookup` rewriter, post-fetch # redactor) via {Parse::MongoDB.aggregate}. + # * an active `Parse.with_session` block — the fiber-local ambient + # session token scopes the read the same way an explicit + # `session_token=` would (see {#mongo_direct_auth_kwargs}). # # Raises a clear {MongoDirectRequired} otherwise. # @!visibility private def assert_mongo_direct_routable! has_session = @session_token.is_a?(String) && !@session_token.empty? + # An active `Parse.with_session` block scopes the read even on a + # non-master client (client_mode, or a user-scoped client with no + # master key), where `server_mode_master` is false. Without this the + # query would raise instead of running scoped — and on a master + # client the ambient is what `mongo_direct_auth_kwargs` forwards so + # the read is scoped rather than silently master. + has_ambient_session = !ambient_session_token.nil? # Mirror the request-layer auth resolution in Parse::Client#request: # when the process is in "server mode" — Parse.client_mode == false # AND the resolved Parse::Client has a master_key — and the caller @@ -1985,7 +1995,7 @@ def assert_mongo_direct_routable! false end server_mode_master = (use_master_key != false) && !Parse.client_mode && client_has_master_key - unless use_master_key || server_mode_master || @acl_user || @acl_role || has_session + unless use_master_key || server_mode_master || @acl_user || @acl_role || has_session || has_ambient_session raise MongoDirectRequired, "[Parse::Query] This query uses a constraint that can only run " \ "via mongo-direct. Mongo-direct bypasses Parse Server's enforcement, " \ @@ -2019,6 +2029,12 @@ def assert_mongo_direct_routable! # double-inject). # * `session_token` is set → forward `session_token:` so # Parse::ACLScope runs the full three-layer simulation. + # * Otherwise, the fiber-local ambient session set by + # `Parse.with_session` is forwarded as `session_token:` (unless + # the caller explicitly requested `use_master_key: true`), so a + # query that auto-routes to mongo-direct inside a `with_session` + # block is scoped to that user — matching what the REST path does + # in {Parse::Client#request}. # * Otherwise (master-key path) → forward `master: true`. # @!visibility private def mongo_direct_auth_kwargs @@ -2036,11 +2052,32 @@ def mongo_direct_auth_kwargs { acl_role: @acl_role } elsif @session_token.is_a?(String) && !@session_token.empty? { session_token: @session_token } + elsif use_master_key != true && (ambient = ambient_session_token) + # No explicit per-query scope, but a `Parse.with_session` block is + # active. Mirror Parse::Client#request's precedence (ambient + # session wins over the server-mode master default) so the read is + # scoped to that user instead of silently running as master with + # no ACL/CLP enforcement. An explicit `use_master_key: true` is a + # deliberate admin call and skips the ambient, exactly as the REST + # path does. + { session_token: ambient } else { master: true } end end + # The fiber-local ambient session token set by `Parse.with_session`, + # or nil. A whitespace-only ambient is treated as absent so it cannot + # block the master fallback and then fail a later presence check — + # the same guard {Parse::Client#request} applies. + # @return [String, nil] + # @!visibility private + def ambient_session_token + return nil unless Parse.respond_to?(:current_session_token) + ambient = Parse.current_session_token + ambient if ambient.is_a?(String) && !ambient.strip.empty? + end + # Check if this query contains constraints that require aggregation pipeline processing # @return [Boolean] true if aggregation pipeline is required def requires_aggregation_pipeline? @@ -3527,22 +3564,31 @@ def aggregate(pipeline, verbose: nil, mongo_direct: nil, rewrite_lookups: nil, r # the merged pipeline is provably SDK-injected, never user input. uses_internal_fields = pipeline_uses_internal_fields?(complete_pipeline) scoped = distinct_query_is_scoped? + mongo_ready = defined?(Parse::MongoDB) && Parse::MongoDB.enabled? use_mongo_direct = mongo_direct - if use_mongo_direct.nil? - mongo_ready = defined?(Parse::MongoDB) && Parse::MongoDB.enabled? - if lookup_stages && lookup_stages.any? + + if scoped + # A scoped aggregation (session_token / scope_to_user / scope_to_role) + # must NEVER reach Parse Server's REST /aggregate endpoint — it is + # master-key-only and enforces NEITHER ACL NOR CLP, so it would run + # unscoped as the master key. This holds even when the caller + # explicitly passes `mongo_direct: false`: an explicit false cannot + # opt a scoped query out of ACL/CLP enforcement. Promote to mongo- + # direct, or fail closed when direct Mongo is unavailable (refuse + # rather than leak unscoped rows). + if mongo_ready + use_mongo_direct = true + else + raise_scoped_aggregation_requires_mongo_direct! + end + elsif use_mongo_direct.nil? + # Unscoped auto-routing: $inQuery/$notInQuery → $lookup pipelines and + # SDK-injected internal-field ($rperm/_wperm) pipelines can't be served + # by REST /aggregate, so prefer mongo-direct when available. An unscoped + # internal-field pipeline keeps the REST fallback (a master-key + # correctness edge, not an enforcement bypass). + if (lookup_stages && lookup_stages.any?) || uses_internal_fields use_mongo_direct = true if mongo_ready - elsif scoped || uses_internal_fields - if mongo_ready - use_mongo_direct = true - elsif scoped - # Fail closed: a scoped aggregation cannot fall back to REST - # /aggregate without silently bypassing ACL/CLP (master-key-only - # endpoint). Refuse rather than leak unscoped results. Unscoped - # internal-field pipelines keep the REST fallback (a master-key - # correctness edge, not an enforcement bypass). - raise_scoped_aggregation_requires_mongo_direct! - end end end @@ -3641,17 +3687,21 @@ def aggregate_from_query(additional_stages = [], verbose: nil, mongo_direct: nil # unenforced). A scoped query fails closed when mongo-direct is # unavailable rather than silently running unscoped as master. scoped = distinct_query_is_scoped? + mongo_ready = defined?(Parse::MongoDB) && Parse::MongoDB.enabled? use_mongo_direct = mongo_direct - if use_mongo_direct.nil? - mongo_ready = defined?(Parse::MongoDB) && Parse::MongoDB.enabled? - if has_lookup_stages + + if scoped + # A scoped aggregation must never reach REST /aggregate (master-key- + # only, unenforced) — not even when the caller explicitly passes + # mongo_direct: false. Promote to mongo-direct, or fail closed. + if mongo_ready + use_mongo_direct = true + else + raise_scoped_aggregation_requires_mongo_direct! + end + elsif use_mongo_direct.nil? + if has_lookup_stages || uses_internal_fields use_mongo_direct = true if mongo_ready - elsif scoped || uses_internal_fields - if mongo_ready - use_mongo_direct = true - elsif scoped - raise_scoped_aggregation_requires_mongo_direct! - end end end diff --git a/lib/parse/stack.rb b/lib/parse/stack.rb index 1794bc6..6ec6765 100644 --- a/lib/parse/stack.rb +++ b/lib/parse/stack.rb @@ -582,8 +582,19 @@ def self.client_mode? # Optional dedicated Moneta store for the synchronize-create lock. When # nil, falls back to {Parse.cache}. + # + # SECURITY: if you pass a raw Moneta-Redis store, build it with + # +value_serializer: nil+. The lock release path reads the stored owner + # token back (+store[key]+) to compare-and-delete; with Moneta's default + # Marshal value serializer that read +Marshal.load+s bytes from Redis — an + # RCE vector on a shared/untrusted/MITM'd lock store. With + # +value_serializer: nil+ the owner token is a plain string and is never + # deserialized. Alternatively pass a {Parse::Cache::Redis} instance, which + # uses a raw-string acquire/release path and avoids Marshal entirely. # @example - # Parse.synchronize_create_store = Moneta.new(:Redis, url: "redis://locks:6379/1") + # Parse.synchronize_create_store = Moneta.new(:Redis, url: "redis://locks:6379/1", value_serializer: nil) + # # or, preferred: + # Parse.synchronize_create_store = Parse::Cache::Redis.new(url: "redis://locks:6379/1") @synchronize_create_store = nil # Optional allowlist of {Parse::Object} subclasses that may use the diff --git a/lib/parse/stack/version.rb b/lib/parse/stack/version.rb index f3c2cd1..0e1b598 100644 --- a/lib/parse/stack/version.rb +++ b/lib/parse/stack/version.rb @@ -6,6 +6,6 @@ module Parse # The Parse Server SDK for Ruby module Stack # The current version. - VERSION = "5.5.0" + VERSION = "5.5.1" end end diff --git a/test/lib/parse/agent/mcp_origin_allowlist_test.rb b/test/lib/parse/agent/mcp_origin_allowlist_test.rb index 4d6ed00..e2d925e 100644 --- a/test/lib/parse/agent/mcp_origin_allowlist_test.rb +++ b/test/lib/parse/agent/mcp_origin_allowlist_test.rb @@ -117,6 +117,47 @@ def test_empty_string_origin_is_allowed_when_allowlist_configured assert_equal 200, status end + # ---- NEW-9: loopback-only Origin default (DNS-rebinding mitigation) ------ + + def test_loopback_default_refuses_non_loopback_origin + # A browser DNS-rebinding attack against 127.0.0.1 carries the attacker + # page's Origin; the loopback default refuses it. + app = build_app(loopback_csrf_default: true) + status, _h, body = app.call(rack_env(origin: "http://evil.example")) + assert_equal 403, status + assert_includes body.first, "Origin not allowed" + end + + def test_loopback_default_allows_loopback_origin + app = build_app(loopback_csrf_default: true) + %w[http://localhost:5173 http://127.0.0.1:9999 https://[::1]:8443].each do |origin| + status, _h, _b = app.call(rack_env(origin: origin)) + assert_equal 200, status, "should accept loopback origin #{origin}" + end + end + + def test_loopback_default_allows_missing_origin + # Native clients (curl, SDK-to-SDK) send no Origin and must not be broken. + app = build_app(loopback_csrf_default: true) + status, _h, _b = app.call(rack_env) + assert_equal 200, status + end + + def test_explicit_allowlist_overrides_loopback_default + # When an operator configures allowed_origins, that policy governs and the + # loopback default is suppressed (loopback origin not in the list -> 403). + app = build_app(allowed_origins: ["https://app.example.com"], loopback_csrf_default: true) + status, _h, _b = app.call(rack_env(origin: "http://localhost:5173")) + assert_equal 403, status + end + + def test_no_loopback_default_means_no_check + # Default off: behaves like before (no Origin gate without an allowlist). + app = build_app + status, _h, _b = app.call(rack_env(origin: "http://evil.example")) + assert_equal 200, status + end + def test_empty_allowlist_array_is_treated_as_no_check app = build_app(allowed_origins: []) status, _h, _b = app.call(rack_env(origin: "https://attacker.example.com")) diff --git a/test/lib/parse/aggregation_auto_promotion_test.rb b/test/lib/parse/aggregation_auto_promotion_test.rb index a5e4085..696d543 100644 --- a/test/lib/parse/aggregation_auto_promotion_test.rb +++ b/test/lib/parse/aggregation_auto_promotion_test.rb @@ -171,6 +171,44 @@ def test_results_pipeline_fails_closed_when_scoped_and_mongodb_disabled assert_raises(Parse::Query::MongoDirectRequired) { @query.results } end + # ---- RT-3: Query#aggregate must not let an explicit mongo_direct: false + # opt a scoped query out of ACL/CLP enforcement (REST /aggregate) ---- + + PIPE = [{ "$group" => { "_id" => "$artist", "n" => { "$sum" => 1 } } }].freeze + + def test_aggregate_scoped_explicit_mongo_direct_false_fails_closed + # The crux: an explicit mongo_direct: false on a SCOPED query must NOT + # route to REST /aggregate (master-key-only, unenforced). With mongo-direct + # unavailable it fails closed instead of silently leaking unscoped rows. + stub_mongodb_enabled!(false) + @query.scope_to_role("Admin") + assert_raises(Parse::Query::MongoDirectRequired) do + @query.aggregate(PIPE, mongo_direct: false) + end + end + + def test_aggregate_scoped_explicit_mongo_direct_false_promotes_when_ready + stub_mongodb_enabled!(true) + @query.scope_to_role("Admin") + agg = @query.aggregate(PIPE, mongo_direct: false) + assert agg.mongo_direct, "scoped aggregate must be promoted to mongo-direct despite mongo_direct: false" + end + + def test_aggregate_scoped_session_token_explicit_false_fails_closed + stub_mongodb_enabled!(false) + @query.session_token = "r:test-session" + assert_raises(Parse::Query::MongoDirectRequired) do + @query.aggregate(PIPE, mongo_direct: false) + end + end + + def test_aggregate_unscoped_explicit_mongo_direct_false_stays_on_rest + # Unscoped callers can still opt out to REST with an explicit false. + stub_mongodb_enabled!(true) + agg = @query.aggregate(PIPE, mongo_direct: false) + refute agg.mongo_direct, "unscoped aggregate must honor explicit mongo_direct: false" + end + private # Stub Parse::MongoDB.enabled? for the duration of one test. We don't diff --git a/test/lib/parse/boolean_property_coercion_test.rb b/test/lib/parse/boolean_property_coercion_test.rb new file mode 100644 index 0000000..0f943dc --- /dev/null +++ b/test/lib/parse/boolean_property_coercion_test.rb @@ -0,0 +1,94 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" + +# Unit tests for :boolean property coercion. +# +# Regression guard for the "false"->true mass-assignment foot-gun: the +# coercion used to be `val ? true : false`, which treats every non-nil, +# non-false object as truthy — so the string "false" (and "0", "off"), +# exactly what arrives from Rails-form / query-string input, coerced to +# `true` and could silently flip an access-control boolean the wrong way. +# Coercion now routes through ActiveModel::Type::Boolean. +class BooleanPropertyCoercionTest < Minitest::Test + class BoolDoc < Parse::Object + parse_class "BoolDoc" + property :title, :string + property :archived, :boolean + end + + # ---- the bug being fixed ---------------------------------------------- + + def test_string_false_coerces_to_false + doc = BoolDoc.new + doc.archived = "false" + assert_equal false, doc.archived, + 'string "false" must coerce to false, not Ruby-truthy true' + end + + def test_string_zero_coerces_to_false + doc = BoolDoc.new + doc.archived = "0" + assert_equal false, doc.archived + end + + def test_string_off_coerces_to_false + doc = BoolDoc.new + doc.archived = "off" + assert_equal false, doc.archived + end + + # Weaponized form: a hostile params hash that flips an access-control + # boolean by sending the *string* "false" must not end up storing true. + def test_mass_assignment_string_false_does_not_flip_true + doc = BoolDoc.new + doc.attributes = { "title" => "x", "archived" => "false" } + refute_equal true, doc.archived, + 'mass-assigned "false" must never become true' + assert_equal false, doc.archived + end + + # ---- true-ish values still coerce to true ------------------------------ + + def test_string_true_coerces_to_true + doc = BoolDoc.new + doc.archived = "true" + assert_equal true, doc.archived + end + + def test_string_one_coerces_to_true + doc = BoolDoc.new + doc.archived = "1" + assert_equal true, doc.archived + end + + # ---- native booleans pass through (Parse wire JSON path) --------------- + + def test_real_true_stays_true + doc = BoolDoc.new + doc.archived = true + assert_equal true, doc.archived + end + + def test_real_false_stays_false + doc = BoolDoc.new + doc.archived = false + assert_equal false, doc.archived + end + + # ---- nil / blank ------------------------------------------------------- + + def test_nil_stays_nil + doc = BoolDoc.new + doc.archived = nil + assert_nil doc.archived, "nil must remain unset, not coerce to a boolean" + end + + def test_empty_string_is_unset + doc = BoolDoc.new + doc.archived = "" + assert_nil doc.archived, + 'blank "" should be treated as unset (nil), not true' + end +end diff --git a/test/lib/parse/cache_redis_wrapper_test.rb b/test/lib/parse/cache_redis_wrapper_test.rb index d9824ac..7ceb862 100644 --- a/test/lib/parse/cache_redis_wrapper_test.rb +++ b/test/lib/parse/cache_redis_wrapper_test.rb @@ -95,4 +95,37 @@ def del(*_keys); 0; end w.instance_variable_set(:@pool, Parse::Cache::Pool.new(size: 1) { backend }) assert_same w, w.clear, "Parse::Cache::Redis#clear should return self for chaining" end + + # SECURITY: cached values must be serialized as JSON, never Marshal, so a + # cache hit can never `Marshal.load` attacker-influenced Redis bytes into a + # gadget object graph (RCE-if-cache-compromised). The wrapper disables + # Moneta's value serializer and does its own JSON (de)serialization. + def test_values_are_json_encoded_not_marshal + w = Parse::Cache::Redis.new(url: "redis://localhost:6379/0") + payload = { "headers" => { "Content-Type" => "application/json" }, + "body" => '{"results":[1,2,3]}' } + encoded = w.send(:encode_value, payload) + assert_kind_of String, encoded + refute encoded.b.start_with?("\x04\b".b), + "value must not be a Marshal stream (\\x04\\x08 magic)" + assert_equal payload, JSON.parse(encoded), "value must be JSON-decodable" + assert_equal payload, w.send(:decode_value, encoded), "JSON value must round-trip" + end + + def test_forces_value_serializer_nil_even_when_caller_passes_one + # A caller-supplied serializer must not be able to re-enable Marshal on the + # value path. The wrapper forces value_serializer: nil. + w = Parse::Cache::Redis.new(url: "redis://localhost:6379/0", serializer: :marshal) + opts = w.instance_variable_get(:@moneta_options) + assert_nil opts[:value_serializer], + "wrapper must force value_serializer: nil to keep Marshal off the value path" + end + + def test_decode_value_treats_hostile_marshal_bytes_as_miss + w = Parse::Cache::Redis.new(url: "redis://localhost:6379/0") + hostile = Marshal.dump([1, 2, 3]) # non-JSON bytes an attacker could plant + assert_nil w.send(:decode_value, hostile), + "undecodable (e.g. Marshal) bytes must decode to nil, never be Marshal.load-ed" + assert_nil w.send(:decode_value, nil), "a cache miss must decode to nil" + end end diff --git a/test/lib/parse/cache_write_only_test.rb b/test/lib/parse/cache_write_only_test.rb index e64f70e..2b85c6e 100644 --- a/test/lib/parse/cache_write_only_test.rb +++ b/test/lib/parse/cache_write_only_test.rb @@ -556,7 +556,11 @@ def test_write_only_mode_updates_cache # Cache should now have the fresh data assert @store.key?(@cache_key), "Cache should be updated after write_only request" cached = @store[@cache_key] - assert_equal '{"objectId":"abc123","title":"Fresh"}', cached[:body], + # The caching middleware now stores values with string keys so they + # survive the Redis wrapper's JSON serialization (no Marshal). Read the + # string key, with a symbol fallback for legacy entries. + cached_body = cached["body"] || cached[:body] + assert_equal '{"objectId":"abc123","title":"Fresh"}', cached_body, "Cache should contain the fresh response body" end diff --git a/test/lib/parse/embeddings_cache_test.rb b/test/lib/parse/embeddings_cache_test.rb index 919012c..c58457f 100644 --- a/test/lib/parse/embeddings_cache_test.rb +++ b/test/lib/parse/embeddings_cache_test.rb @@ -225,7 +225,33 @@ def test_moneta_store_with_ttl_and_hash_backend_falls_back_to_plain_write store = CACHE::MonetaStore.new(moneta, ttl: 60) store.set("k", [1.0, 2.0]) assert_equal [1.0, 2.0], store.get("k") - assert_equal [1.0, 2.0], moneta["emb:k"] + # The value is written under emb:k via the plain []= fallback. MonetaStore + # JSON-encodes the vector (not Marshal), so the raw stored value is a JSON + # string, not the Array itself. + assert_equal JSON.generate([1.0, 2.0]), moneta["emb:k"] + end + + # SECURITY: vectors must be JSON-encoded, never left to Moneta's default + # Marshal value serializer — a cache read must never Marshal.load backing- + # store bytes (RCE-if-cache-compromised). See lib/parse/embeddings/cache.rb. + def test_moneta_store_serializes_values_as_json_not_marshal + moneta = FakeMoneta.new + store = CACHE::MonetaStore.new(moneta) + store.set("k", [0.5, -1.5, 2.0]) + raw = moneta.h["emb:k"] + assert_kind_of String, raw + refute raw.b.start_with?("\x04\b".b), "value must not be a Marshal stream" + assert_equal [0.5, -1.5, 2.0], JSON.parse(raw) + assert_equal [0.5, -1.5, 2.0], store.get("k"), "JSON value must round-trip" + end + + def test_moneta_store_warns_on_marshal_serializing_store + require "moneta" + # Moneta's default value serializer is Marshal (class name "...MarshalValue"). + marshaling = Moneta.new(:Memory) + assert_output(nil, /SECURITY.*Marshal|value_serializer: nil/m) do + CACHE::MonetaStore.new(marshaling) + end end def test_moneta_store_fails_open_on_backend_errors diff --git a/test/lib/parse/logging_middleware_test.rb b/test/lib/parse/logging_middleware_test.rb index d3dd63c..dd48ef7 100644 --- a/test/lib/parse/logging_middleware_test.rb +++ b/test/lib/parse/logging_middleware_test.rb @@ -96,6 +96,29 @@ def test_default_logger_format puts "✅ Default logger format works correctly!" end + # NEW-2: at :debug level, log_body emits request bodies (login carries a + # cleartext password) and response bodies (sessionToken / authData). These + # must be scrubbed with the same canonical redactor used for headers. + def test_log_body_redacts_credentials + output = StringIO.new + logger = Logger.new(output) + logger.level = Logger::DEBUG + logger.formatter = proc { |_s, _d, _p, msg| "#{msg}\n" } + Parse::Middleware::Logging.logger = logger + + mw = Parse::Middleware::Logging.new(->(env) { env }) + body = '{"username":"alice","password":"hunter2secret","sessionToken":"r:LIVE-TOKEN-XYZ"}' + mw.send(:log_body, body, "Request") + + output.rewind + log = output.read + refute_includes log, "hunter2secret", "password must be redacted from the body log" + refute_includes log, "r:LIVE-TOKEN-XYZ", "sessionToken must be redacted from the body log" + assert_includes log, "[FILTERED]", "redacted values should be marked [FILTERED]" + ensure + Parse::Middleware::Logging.logger = nil + end + # ========================================================================== # Test 4: Log level filtering # ========================================================================== diff --git a/test/lib/parse/mass_assignment_protection_test.rb b/test/lib/parse/mass_assignment_protection_test.rb index e2b6346..e9d2453 100644 --- a/test/lib/parse/mass_assignment_protection_test.rb +++ b/test/lib/parse/mass_assignment_protection_test.rb @@ -18,7 +18,8 @@ class TestDocument < Parse::Object def test_mass_assignment_allows_acl # ACL is a user-facing property; setting it via the constructor or - # `attributes=` must work for legitimate developer code paths. + # `attributes=` still works (behavior unchanged in 5.5.1 — it emits a + # deprecation warning but is not blocked; see the warning tests below). doc = TestDocument.new acl = Parse::ACL.new acl.apply("role:Admin", read: true, write: true) @@ -28,6 +29,58 @@ def test_mass_assignment_allows_acl assert acl_json["role:Admin"], "developer-set ACL must be applied" end + # F3 (5.5.1): mass-assigning an ACL is a foot-gun — a caller-supplied + # params hash bearing an `ACL` key can grant unintended write. We warn + # (once, non-breaking) rather than block; the supported path is the + # explicit `obj.acl =` setter. + def reset_acl_warn_flag! + Parse::Properties.instance_variable_set(:@acl_mass_assignment_warned, false) + end + + def test_mass_assignment_acl_emits_deprecation_warning + reset_acl_warn_flag! + doc = TestDocument.new + _out, err = capture_io do + doc.attributes = { "ACL" => { "*" => { "read" => true, "write" => true } } } + end + assert_match(/mass-assignment/i, err) + assert_match(/SECURITY/, err) + ensure + reset_acl_warn_flag! + end + + def test_mass_assignment_acl_warning_fires_only_once + reset_acl_warn_flag! + doc = TestDocument.new + _o1, err1 = capture_io { doc.attributes = { "acl" => Parse::ACL.new } } + _o2, err2 = capture_io { doc.attributes = { "acl" => Parse::ACL.new } } + refute_empty err1, "first ACL mass-assignment must warn" + assert_empty err2, "subsequent ACL mass-assignments must not re-warn (process-once)" + ensure + reset_acl_warn_flag! + end + + def test_mass_assignment_without_acl_does_not_warn + reset_acl_warn_flag! + doc = TestDocument.new + _out, err = capture_io { doc.attributes = { "title" => "Hello", "body" => "x" } } + assert_empty err, "non-ACL mass-assignment must not emit the ACL warning" + ensure + reset_acl_warn_flag! + end + + def test_constructor_acl_does_not_warn + # `Klass.new(acl:)` routes through apply_attributes! directly, not + # attributes=, so the legitimate construction idiom stays silent. + reset_acl_warn_flag! + _out, err = capture_io do + TestDocument.new(acl: Parse::ACL.new, title: "Hello") + end + assert_empty err, "constructor ACL must not emit the mass-assignment warning" + ensure + reset_acl_warn_flag! + end + def test_mass_assignment_skips_session_token user = Parse::User.new user.attributes = { "username" => "alice", "sessionToken" => "r:stolen" } diff --git a/test/lib/parse/path_segment_object_id_test.rb b/test/lib/parse/path_segment_object_id_test.rb new file mode 100644 index 0000000..5992cda --- /dev/null +++ b/test/lib/parse/path_segment_object_id_test.rb @@ -0,0 +1,63 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" + +# NEW-10: Parse::API::PathSegment.object_id! validates an objectId before it is +# interpolated raw into a REST path. The _User endpoints (fetch/update/delete) +# previously interpolated `id` without validation, so a hostile/compromised +# Parse Server returning a crafted objectId could traverse to a different +# endpoint on the next call with whatever credentials it was authorized to send. +class PathSegmentObjectIdTest < Minitest::Test + PS = Parse::API::PathSegment + + def test_accepts_valid_object_ids + %w[abc123 Ab0Z aaaaaaaaaa 0123456789 A].each do |id| + assert_equal id, PS.object_id!(id), "#{id.inspect} should be accepted" + end + # Up to 40 chars. + long = "a" * 40 + assert_equal long, PS.object_id!(long) + end + + def test_rejects_path_traversal_and_query_injection + [ + "../classes/_User", + "../classes/_User?where=%7B%7D", + "abc/def", + "abc?where=1", + "abc&x=1", + "abc=1", + "abc.def", + "..", + ".", + "a b", + "abc\n", + "a" * 41, # over the length cap + ].each do |id| + assert_raises(ArgumentError, "#{id.inspect} must be rejected") { PS.object_id!(id) } + end + end + + def test_rejects_empty + assert_raises(ArgumentError) { PS.object_id!("") } + assert_raises(ArgumentError) { PS.object_id!(nil) } + end + + def test_user_endpoints_reject_hostile_object_id_before_request + # The validation must fire before any network call. We use a client whose + # request path would raise loudly if reached; object_id! should raise first. + client = Parse::Client.new( + server_url: "http://localhost:1/parse", + application_id: "app", api_key: "rest", + ) + %i[fetch_user delete_user].each do |m| + assert_raises(ArgumentError, "#{m} must reject a traversal objectId") do + client.public_send(m, "../classes/_User?where=%7B%7D") + end + end + assert_raises(ArgumentError, "update_user must reject a traversal objectId") do + client.update_user("../classes/_User", { foo: 1 }) + end + end +end diff --git a/test/lib/parse/pipeline_security_test.rb b/test/lib/parse/pipeline_security_test.rb index 9e95a87..2a5c498 100644 --- a/test/lib/parse/pipeline_security_test.rb +++ b/test/lib/parse/pipeline_security_test.rb @@ -334,6 +334,147 @@ def test_mongodb_aggregate_raises_denied_operator_on_function_in_pipeline end end + # --- redact_internal_fields_deep! (RT-1: $lookup foreign-doc credential leak) --- + + def test_redact_internal_fields_deep_strips_embedded_credentials + # A Post row with a _User document embedded via $lookup as: "leak". + row = { + "_id" => "post1", "_p_author" => "_User$u1", "_acl" => { "u1" => { "r" => true } }, + "title" => "hello", "_rperm" => ["*"], + "leak" => [ + { "_id" => "u1", "username" => "alice", + "_hashed_password" => "$2b$10$secret", + "_auth_data_google" => { "access_token" => "ya29.SECRET" }, + "_session_token" => "r:TOKEN", "_email_verify_token" => "v", + "_rperm" => ["*"], "_p_org" => "Tenant$t1" }, + ], + } + PS.redact_internal_fields_deep!(row) + + leak = row["leak"].first + # Credentials stripped from the embedded foreign document at depth. + %w[_hashed_password _auth_data_google _session_token _email_verify_token _rperm].each do |k| + refute leak.key?(k), "embedded #{k} must be stripped" + end + # Structural columns preserved for object reconstruction. + %w[_id username _p_org].each { |k| assert leak.key?(k), "embedded #{k} must survive" } + # Top-level _rperm stripped; ACL / pointer / id reconstruction columns kept. + refute row.key?("_rperm") + %w[_id _p_author _acl title].each { |k| assert row.key?(k), "top-level #{k} must survive" } + end + + def test_redact_internal_fields_deep_handles_arrays_and_scalars + node = { "list" => [{ "_session_token" => "x", "ok" => 1 }, 42, "str"], "n" => 5 } + PS.redact_internal_fields_deep!(node) + refute node["list"].first.key?("_session_token") + assert_equal 1, node["list"].first["ok"] + assert_equal [{ "ok" => 1 }, 42, "str"], node["list"] + assert_equal 5, node["n"] + end + + def test_redact_internal_fields_deep_strips_auth_data_prefix_and_symbol_keys + node = { "_auth_data_facebook" => { "id" => 1 }, :_hashed_password => "h", "keep" => "v" } + PS.redact_internal_fields_deep!(node) + refute node.key?("_auth_data_facebook") + refute node.key?(:_hashed_password) + assert_equal "v", node["keep"] + end + + def test_redact_internal_fields_deep_is_depth_bounded + # Build a chain deeper than the bound; the call must terminate (no stack + # overflow / infinite recursion) and clean the top level within the bound. + leaf = { "_session_token" => "x" } + (PS::INTERNAL_REDACT_MAX_DEPTH + 5).times do + leaf = { "child" => leaf, "_session_token" => "y" } + end + PS.redact_internal_fields_deep!(leaf) + refute leaf.key?("_session_token"), "top level always cleaned" + # Reaching here (no exception) confirms the recursion terminates. + end + + # --- RT-2: credential match-keys refused even under allow_internal_fields --- + + def test_credential_field_match_key_refused_even_with_allow_internal_fields + # allow_internal_fields: true is what the `*_direct` terminals pass so + # SDK-emitted _rperm/_wperm references survive. It must NOT relax a + # user-supplied credential column used as a $match key (a count/match + # oracle that bisects a bcrypt hash / session token char-by-char). + %w[_hashed_password _session_token _email_verify_token _perishable_token + _password_history _auth_data_google].each do |field| + err = assert_raises(PS::Error, "#{field} must be refused") do + PS.validate_filter!({ field => { "$regex" => "^x" } }, allow_internal_fields: true) + end + assert_equal field, err.operator + assert_equal :denied_internal_field, err.reason + end + end + + def test_credential_field_match_key_refused_when_nested + err = assert_raises(PS::Error) do + PS.validate_filter!({ "$or" => [{ "x" => 1 }, { "_session_token" => "r:t" }] }, + allow_internal_fields: true) + end + assert_equal "_session_token", err.operator + end + + def test_acl_columns_allowed_with_allow_internal_fields_but_gated_without + # _rperm/_wperm are emitted by readable_by_role / publicly_readable, so they + # must pass when allow_internal_fields: true ... + PS.validate_filter!({ "_rperm" => { "$in" => ["*", "role:Admin"] } }, allow_internal_fields: true) + PS.validate_filter!({ "_wperm" => { "$in" => ["*"] } }, allow_internal_fields: true) + # ... and stay refused when allow_internal_fields: false (unchanged). + assert_raises(PS::Error) { PS.validate_filter!({ "_rperm" => { "$in" => ["*"] } }) } + end + + def test_credential_denylist_excludes_acl_columns + assert_includes PS::CREDENTIAL_FIELDS_DENYLIST, "_hashed_password" + assert_includes PS::CREDENTIAL_FIELDS_DENYLIST, "_session_token" + refute_includes PS::CREDENTIAL_FIELDS_DENYLIST, "_rperm" + refute_includes PS::CREDENTIAL_FIELDS_DENYLIST, "_wperm" + end + + # --- RT-6: camelCase $sessionToken / $session_token field refs denied --- + + def test_camelcase_session_token_field_refs_are_denied + assert_includes PS::DENIED_FIELD_REFS, "$sessionToken" + assert_includes PS::DENIED_FIELD_REFS, "$session_token" + # And the validator refuses them as a $-field reference (e.g. laundering + # via a $project/$expr rename). + assert_raises(PS::Error) do + PS.validate_pipeline!([{ "$project" => { "leak" => "$sessionToken" } }]) + end + assert_raises(PS::Error) do + PS.validate_pipeline!([{ "$project" => { "leak" => "$session_token" } }]) + end + end + + # --- RT-7 / NEW-4: ACLScope join gate enforces the internal-collection floor --- + + def test_join_gate_refuses_internal_collections_unconditionally + # _SCHEMA / _Hooks / _GlobalConfig etc. must be refused as a $lookup target + # even if a CLP fetch would (wrongly) permit them — the hard floor runs + # first, independent of rewrite_lookups having run. + %w[_SCHEMA _Hooks _GlobalConfig _Audit].each do |coll| + assert_raises(PS::Error, "#{coll} join must be refused by the floor") do + Parse::ACLScope.send(:assert_join_target_permitted!, coll, ["*"]) + end + end + end + + def test_join_gate_admits_sdk_data_classes_through_floor + # The four allowed underscore collections pass the floor; they then face + # the per-scope CLP gate (which may still deny, but NOT via the floor). + %w[_User _Role _Installation _Session].each do |coll| + begin + Parse::ACLScope.send(:assert_join_target_permitted!, coll, ["*"]) + rescue Parse::CLPScope::Denied + # CLP may deny for this scope — acceptable; the floor did not refuse it. + rescue PS::Error => e + flunk "#{coll} must pass the internal-collection floor, got #{e.message}" + end + end + end + def test_mongodb_aggregate_passes_safe_pipeline_to_collection # Confirm the validator does NOT trip on a legitimate pipeline. We stub # collection() to capture the call and return a fake aggregator. diff --git a/test/lib/parse/query_with_session_mongo_direct_routing_test.rb b/test/lib/parse/query_with_session_mongo_direct_routing_test.rb new file mode 100644 index 0000000..e6125dc --- /dev/null +++ b/test/lib/parse/query_with_session_mongo_direct_routing_test.rb @@ -0,0 +1,88 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" + +# Regression for F1: a query that auto-routes to mongo-direct inside a +# `Parse.with_session(token)` block must be SCOPED to that ambient session, +# not silently run as master with no ACL/CLP enforcement. +# +# The bug: `Parse::Query#mongo_direct_auth_kwargs` consulted only the query +# instance's own `@session_token` / `@acl_user` / `@acl_role` and ignored +# `Parse.current_session_token` (the fiber-local ambient set by +# `with_session`). In server mode it fell through to `{ master: true }`, so a +# geo / `$near` query inside a `with_session` block returned every row of +# every tenant — even though every REST query in the same block was correctly +# scoped to that user. +# +# This drives the routing decision directly (the security-relevant logic); +# it needs no live Mongo connection. +class QueryWithSessionMongoDirectRoutingTest < Minitest::Test + AMBIENT = "r:ambient-user-token" + + def build_query + Parse::Query.new("GeoThing") + end + + def auth_kwargs(query) + query.send(:mongo_direct_auth_kwargs) + end + + # ---- the bug being fixed ---------------------------------------------- + + def test_ambient_session_scopes_direct_route_instead_of_master + query = build_query + kwargs = Parse.with_session(AMBIENT) { auth_kwargs(query) } + assert_equal({ session_token: AMBIENT }, kwargs, + "an active with_session block must scope the mongo-direct read, not run as master") + refute kwargs.key?(:master), + "the ambient session must suppress the master-key fallback" + end + + # ---- precedence parity with the REST path ------------------------------ + + def test_explicit_master_key_skips_ambient + # `use_master_key: true` is a deliberate admin call and must win over the + # ambient — exactly as Parse::Client#request behaves. + query = build_query + query.use_master_key = true + kwargs = Parse.with_session(AMBIENT) { auth_kwargs(query) } + assert_equal({ master: true }, kwargs) + end + + def test_explicit_query_session_token_wins_over_ambient + query = build_query + query.session_token = "r:explicit-query-token" + kwargs = Parse.with_session(AMBIENT) { auth_kwargs(query) } + assert_equal({ session_token: "r:explicit-query-token" }, kwargs) + end + + def test_scope_to_role_wins_over_ambient + query = build_query + query.scope_to_role("admin") + kwargs = Parse.with_session(AMBIENT) { auth_kwargs(query) } + assert_equal({ acl_role: "admin" }, kwargs) + end + + # ---- baselines that must stay unchanged -------------------------------- + + def test_no_ambient_no_scope_still_master + query = build_query + assert_equal({ master: true }, auth_kwargs(query), + "with no scope and no ambient, the server-mode master fallback is unchanged") + end + + def test_whitespace_only_ambient_is_treated_as_absent + query = build_query + kwargs = Parse.with_session(" ") { auth_kwargs(query) } + assert_equal({ master: true }, kwargs, + "a whitespace-only ambient must not be forwarded as a session token") + end + + def test_ambient_does_not_leak_outside_the_block + query = build_query + Parse.with_session(AMBIENT) { auth_kwargs(query) } + assert_equal({ master: true }, auth_kwargs(query), + "the ambient must not affect routing once the with_session block exits") + end +end