From a645f3cd5b7b938ddd9c13a322dd4e34cddcfb4e Mon Sep 17 00:00:00 2001
From: Adrian Curtin <48138055+AdrianCurtin@users.noreply.github.com>
Date: Thu, 11 Jun 2026 09:53:36 -0400
Subject: [PATCH] Fix aggregation scoping and long-URL overrides

Send JSON bodies for long-URL aggregate overrides and scope aggregations
to ambient sessions. BodyBuilder now detects /aggregate/ requests and
builds a JSON POST body (preserving pipeline as an Array and boolean
params) to avoid Parse Server rejecting urlencoded pipeline strings.
Query/aggregation logic was updated to consult the ambient
Parse.with_session session token (in distinct_query_is_scoped? and
query_is_scoped?), to auto-promote scoped aggregations to mongo-direct
when available and to raise MongoDirectRequired when mongo-direct is
unavailable. REST /aggregate calls force use_master_key: true unless
explicitly set false to avoid ambient sessions suppressing the master
key and causing 401/403s. Tests for ambient session scoping and the
long-URL aggregate override were added/updated, CHANGELOG updated, and
the stack version bumped to 5.5.2.
---
 CHANGELOG.md                                  |  45 ++++
 Gemfile.lock                                  |   2 +-
 lib/parse/client/body_builder.rb              |  79 ++++++-
 lib/parse/query.rb                            | 143 ++++++++++---
 lib/parse/stack/version.rb                    |   2 +-
 .../parse/aggregation_auto_promotion_test.rb  | 195 ++++++++++++++++++
 .../body_builder_method_override_test.rb      | 176 ++++++++++++++++
 .../parse/query_aggregate_integration_test.rb |  65 ++++++
 8 files changed, 663 insertions(+), 44 deletions(-)
 create mode 100644 test/lib/parse/body_builder_method_override_test.rb

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 5b189a8..fd3e9bf 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,50 @@
 ## parse-stack-next Changelog
 
+### 5.5.2
+
+#### Large aggregation pipelines no longer fail with "Invalid aggregate stage '0'"
+
+- **FIXED**: An aggregation whose request URL exceeds ~2KB (for example a
+  `group_by`, `group_by_date`, `distinct`, or custom `aggregate` pipeline with
+  a large `$in` / `$match`) is rewritten from a GET to a POST carrying
+  `_method=GET`, with the query moved into the request body. The pipeline was
+  sent in the body as a URL-encoded string, but Parse Server's aggregate
+  endpoint only JSON-decodes query-string params, not body params — so the
+  pipeline arrived as a raw string and was rejected with
+  `Invalid aggregate stage '0'`, causing the aggregation to return an empty
+  result. The long-URL override now sends a JSON body for the aggregate
+  endpoint so the pipeline is delivered as a real array (boolean params such as
+  `rawValues` are preserved as booleans). The historical URL-encoded override is
+  unchanged for `find` and other endpoints, which Parse Server already decodes
+  correctly.
+
+#### Aggregations inside `Parse.with_session` blocks are now scoped
+
+- **FIXED**: `group_by_date`, `group_by`, `distinct`, and `count` (aggregation
+  branch) now detect the ambient session token set by `Parse.with_session` and
+  treat the query as scoped — consistent with how `Parse::Client#request`
+  already scopes REST find/get/count calls in the same block. Previously the
+  `query_is_scoped?` / `distinct_query_is_scoped?` checks consulted only the
+  query instance's own `session_token=` / `scope_to_user` / `scope_to_role`
+  and ignored `Parse.current_session_token`, so an aggregation inside a
+  `with_session` block ran unscoped as the master key and returned all rows
+  regardless of ACL. The checks now include the ambient: when scoped and
+  mongo-direct is available the aggregation auto-promotes (ACL/CLP enforced);
+  when scoped and mongo-direct is unavailable it fails closed with
+  `MongoDirectRequired` rather than silently leaking rows.
+- **FIXED**: `group_by_date` now also fails closed (`MongoDirectRequired`) when
+  the query is scoped but mongo-direct is unavailable — matching the existing
+  behavior of `group_by`, `distinct`, and `count`. Previously `group_by_date`
+  silently fell back to the REST `/aggregate` endpoint in that case.
+- **FIXED**: A regression introduced in 5.5.1 where `group_by_date`,
+  `group_by`, and pipeline-based aggregations called inside a
+  `Parse.with_session` block returned empty results `{}`. The ambient session
+  token was forwarded as an HTTP session-token header (suppressing the master
+  key), causing Parse Server's REST `/aggregate` endpoint — which is
+  master-key-only — to return a 401/403. The REST aggregate call sites now
+  force `use_master_key: true` so the ambient cannot suppress it, unless the
+  caller explicitly set `use_master_key: false`.
+ ### 5.5.1 #### Mongo-direct reads inside `Parse.with_session` are now scoped, not master diff --git a/Gemfile.lock b/Gemfile.lock index f9d4813..f98c8d9 100644 --- a/Gemfile.lock +++ b/Gemfile.lock @@ -1,7 +1,7 @@ PATH remote: . specs: - parse-stack-next (5.5.1) + parse-stack-next (5.5.2) activemodel (>= 6.1, < 9) activesupport (>= 6.1, < 9) connection_pool (>= 2.2, < 4) diff --git a/lib/parse/client/body_builder.rb b/lib/parse/client/body_builder.rb index d09f5fc..2e3cdf9 100644 --- a/lib/parse/client/body_builder.rb +++ b/lib/parse/client/body_builder.rb @@ -285,14 +285,32 @@ def call!(env) # to be POST instead of GET and send the query parameters in the body of the POST request. # The standard maximum POST request (which is a server setting), is usually set to 20MBs if env[:method] == :get && env[:url].to_s.length >= MAX_URL_LENGTH - env[:request_headers][HTTP_METHOD_OVERRIDE] = "GET" - env[:request_headers][CONTENT_TYPE] = "application/x-www-form-urlencoded" - # parse-sever looks for method overrides in the body under the `_method` param. - # so we will add it to the query string, which will now go into the body. - env[:body] = "_method=GET&" + env[:url].query - env[:url].query = nil - #override - env[:method] = :post + if aggregate_request?(env[:url]) + # Parse Server's AggregateRouter only JSON-decodes query-string + # params (via JSONFromQuery); it does NOT decode a `pipeline` param + # that arrives in the request body. The urlencoded override below + # would therefore deliver `pipeline` as a raw JSON *string*, which + # AggregateRouter.getPipeline mis-reads character-by-character and + # rejects with "Invalid aggregate stage '0'". Send a JSON body + # instead so the pipeline survives as a real Array. `_method=GET` + # still routes Parse Server to its GET-only aggregate handler. 
+ env[:request_headers][HTTP_METHOD_OVERRIDE] = "GET" + env[:request_headers][CONTENT_TYPE] = CONTENT_TYPE_FORMAT + env[:body] = aggregate_override_body(env[:url].query) + env[:url].query = nil + env[:method] = :post + else + env[:request_headers][HTTP_METHOD_OVERRIDE] = "GET" + env[:request_headers][CONTENT_TYPE] = "application/x-www-form-urlencoded" + # parse-server looks for method overrides in the body under the `_method` param. + # so we will add it to the query string, which will now go into the body. + # `.to_s` guards the (contrived but possible) case of a >=2KB URL whose + # length is all path and no query — nil + String would raise TypeError. + env[:body] = "_method=GET&" + env[:url].query.to_s + env[:url].query = nil + #override + env[:method] = :post + end # else if not a get, always make sure the request is JSON encoded if the content type matches elsif env[:request_headers][CONTENT_TYPE] == CONTENT_TYPE_FORMAT && (env[:body].is_a?(Hash) || env[:body].is_a?(Array)) @@ -334,6 +352,51 @@ def call!(env) response_env[:body] = r end end + + private + + # Whether the request targets Parse Server's `/aggregate/<className>` + # endpoint. Used by {#call!} to pick the JSON-body form of the + # long-URL GET→POST override (the aggregate endpoint does not + # JSON-decode a body `pipeline` param, unlike `where`). + # + # Anchored to the final two path segments: `.../aggregate/<className>` + # where <className> is the last segment (no further slashes). The + # className is mandatory and slash-free — see + # {Parse::API::Aggregate#aggregate_uri_path}, which validates it via + # PathSegment.identifier! — so a real aggregate URL always ends this way. + # A `find` request is `.../classes/<className>` (no match), a class + # merely *named* with "aggregate" (e.g. `MyAggregateData`) does not match, + # and an `/aggregate/` segment appearing earlier in a custom mount prefix + # (e.g. `/aggregate/api/classes/Foo`) does not match either. + # @param url [URI] the request URL. 
+ # @return [Boolean] + def aggregate_request?(url) + url.path.to_s.match?(%r{/aggregate/[^/]+/?\z}) + end + + # Build the JSON request body for a long-URL aggregate GET→POST + # override. Reconstructs the params from the encoded query string and + # JSON-decodes each value so the `pipeline` Array (and boolean + # `rawValues`/`rawFieldNames`) reach Parse Server as real types rather + # than strings. A value that is not itself JSON is passed through + # unchanged. `_method=GET` is injected so Parse Server routes the POST + # to its GET-only aggregate handler. + # @param query_string [String, nil] the encoded query string. + # @return [String] the JSON body to send. + def aggregate_override_body(query_string) + params = Faraday::Utils.parse_query(query_string.to_s) || {} + body = { "_method" => "GET" } + params.each do |key, value| + body[key] = + begin + JSON.parse(value) + rescue JSON::ParserError, TypeError + value + end + end + body.to_json + end end end #Middleware end diff --git a/lib/parse/query.rb b/lib/parse/query.rb index e199b2b..de93251 100644 --- a/lib/parse/query.rb +++ b/lib/parse/query.rb @@ -1811,13 +1811,25 @@ def requires_mongo_direct? # Whether this query carries a non-master-key auth scope. Used by # `#distinct` (and group_by aggregations) to decide whether to # auto-promote the REST aggregate path to mongo-direct so the SDK's - # ACLScope / CLPScope enforcement actually runs. + # ACLScope / CLPScope enforcement actually runs. Also detects the + # fiber-local ambient session set by Parse.with_session so that + # aggregations inside a with_session block are treated as scoped — + # consistent with how Parse::Client#request already scopes REST calls. # @return [Boolean] # @api private def distinct_query_is_scoped? return true if @session_token.is_a?(String) && !@session_token.empty? 
return true if @acl_user return true if @acl_role + # An ambient Parse.with_session counts as scope ONLY when the query did + # not explicitly request master-key mode — mirroring Parse::Client#request, + # where an explicit use_master_key: true is a deliberate admin call that + # skips the ambient session. Otherwise an admin aggregation inside a + # with_session block would be wrongly forced to mongo-direct / fail-closed. + unless use_master_key == true + ambient = ambient_session_token + return true if ambient.is_a?(String) && !ambient.empty? + end false end @@ -1832,12 +1844,13 @@ def distinct_query_is_scoped? def raise_scoped_aggregation_requires_mongo_direct! raise MongoDirectRequired, "[Parse::Query] This scoped aggregation (session_token / " \ - "scope_to_user / scope_to_role) requires mongo-direct so the " \ - "SDK can enforce ACL/CLP. Parse Server's REST /aggregate " \ - "endpoint is master-key-only and enforces neither, so routing " \ - "it there would silently run unscoped as the master key. " \ - "Enable mongo-direct via Parse::MongoDB.configure(...), or " \ - "rewrite without the aggregation terminal." + "scope_to_user / scope_to_role, or an active Parse.with_session " \ + "block) requires mongo-direct so the SDK can enforce ACL/CLP. " \ + "Parse Server's REST /aggregate endpoint is master-key-only and " \ + "enforces neither, so routing it there would silently run unscoped " \ + "as the master key. Enable mongo-direct via " \ + "Parse::MongoDB.configure(...), or rewrite without the " \ + "aggregation terminal." end # Scope a query to a specific user's row-level ACL when it auto-routes @@ -5996,13 +6009,21 @@ def execute! if @mongo_direct && defined?(Parse::MongoDB) && Parse::MongoDB.enabled? @cached_response = execute_direct! else + # REST /aggregate is master-key-only. An ambient Parse.with_session + # block would suppress the master key via session_token, causing a + # 401/403. 
Force use_master_key unless the caller explicitly disabled + # it (use_master_key: false is a deliberate client-mode decision). + # `.dup` keeps the master-key flip local to this call even if `_opts` + # ever returns a shared/memoized hash. + rest_opts = @query.send(:_opts).dup + rest_opts[:use_master_key] = true unless rest_opts[:use_master_key] == false @cached_response = @query.client.aggregate_pipeline( @query.instance_variable_get(:@table), @pipeline, headers: {}, raw_values: @raw_values, raw_field_names: @raw_field_names, - **@query.send(:_opts), + **rest_opts, ) end @@ -6388,16 +6409,24 @@ def pipeline # @return [Array] raw aggregation results def raw(operation, aggregation_expr) formatted_group_field = @query.send(:format_aggregation_field, @group_field) - pipeline = build_pipeline(formatted_group_field, aggregation_expr) - response = @query.client.aggregate_pipeline( - @query.instance_variable_get(:@table), - pipeline, - headers: {}, - **@query.send(:_opts), - ) + # Build the same pipeline the count/sum/etc. terminals use, then delegate + # to Query#aggregate. That central path handles scoped-query routing + # (session_token / acl_user / acl_role / ambient Parse.with_session → + # auto-promote to mongo-direct, or fail closed when unavailable) so a + # scoped `raw` is never sent to the master-key-only REST /aggregate + # endpoint, and it returns the raw Array rows this method documents. + # `$match` from the query's where constraints is added by Query#aggregate. 
+ pipeline = [] + pipeline << { "$unwind" => "$#{formatted_group_field}" } if @flatten_arrays + pipeline << { "$group" => { "_id" => "$#{formatted_group_field}", "count" => aggregation_expr } } + add_fields = size_addfields_stage + pipeline << add_fields if add_fields + sort = sort_stage + pipeline << sort if sort + pipeline << { "$project" => { "_id" => 0, "objectId" => "$_id", "count" => 1 } } - response.result || [] + @query.aggregate(pipeline, verbose: @query.instance_variable_get(:@verbose_aggregate)).raw || [] end # Count the number of items in each group. @@ -6541,8 +6570,12 @@ def execute_group_aggregation(operation, aggregation_expr, &value_transformer) # already does this auto-promotion (lib/parse/agent/tools.rb), this # is the equivalent at the Query layer for direct SDK callers. use_mongo_direct = @mongo_direct - if !use_mongo_direct && query_is_scoped? && parse_mongodb_available? - use_mongo_direct = true + if !use_mongo_direct && query_is_scoped? + if parse_mongodb_available? + use_mongo_direct = true + else + @query.send(:raise_scoped_aggregation_requires_mongo_direct!) + end end if use_mongo_direct @@ -6779,15 +6812,22 @@ def validate_sort_target_for_operation!(operation) end # Whether the parent query carries any non-master-key auth scope. A - # session_token, acl_user, or acl_role means the caller expects the - # results to be filtered by ACL — which only happens on the SDK's - # mongo-direct path. Used to decide whether to auto-promote the REST - # aggregation path to mongo-direct. + # session_token, acl_user, acl_role, or an active Parse.with_session + # ambient means the caller expects ACL-filtered results — which only + # the SDK's mongo-direct path provides. Used to decide whether to + # auto-promote the REST aggregation path to mongo-direct. def query_is_scoped? st = @query.session_token return true if st.is_a?(String) && !st.empty? 
return true if @query.instance_variable_get(:@acl_user) return true if @query.instance_variable_get(:@acl_role) + # Ambient Parse.with_session counts as scope only when the query did not + # explicitly set use_master_key: true (matches Parse::Client#request + # precedence — an explicit master-key call skips the ambient session). + unless @query.use_master_key == true + ambient = @query.send(:ambient_session_token) + return true if ambient.is_a?(String) && !ambient.empty? + end false end @@ -7228,14 +7268,19 @@ def execute_date_aggregation(operation, aggregation_expr) # Format the date field name formatted_date_field = @query.send(:format_aggregation_field, @date_field) - # Auto-promote scoped queries to mongo-direct. See the matching - # block in `GroupBy#execute_group_aggregation` for the full - # rationale — REST `/aggregate` is master-key-only and unscoped, so - # session_token / acl_user / acl_role queries need the SDK's - # mongo-direct enforcement layers to actually filter results. + # Auto-promote scoped queries to mongo-direct. REST `/aggregate` is + # master-key-only and enforces neither ACL nor CLP — a scoped query + # (session_token / acl_user / acl_role, or an active + # Parse.with_session block) must use the SDK's enforcement layers. + # Fail closed if mongo-direct is unavailable rather than silently + # returning unscoped rows. Mirrors the scoped-query gate in Query#aggregate. use_mongo_direct = @mongo_direct - if !use_mongo_direct && query_is_scoped? && parse_mongodb_available? - use_mongo_direct = true + if !use_mongo_direct && query_is_scoped? + if parse_mongodb_available? + use_mongo_direct = true + else + @query.send(:raise_scoped_aggregation_requires_mongo_direct!) + end end if use_mongo_direct @@ -7281,11 +7326,21 @@ def execute_date_aggregation(operation, aggregation_expr) puts "[VERBOSE AGGREGATE] Sending to: #{@query.instance_variable_get(:@table)}" end + # Parse Server's REST /aggregate endpoint is master-key-only. 
An active + # Parse.with_session block sets a fiber-local ambient session token that + # Parse::Client#request picks up and uses in place of the master key, + # causing a 401/403 on this endpoint. Force use_master_key: true so the + # ambient session cannot suppress it — unless the caller explicitly set + # use_master_key: false (deliberate client-mode / session-token intent). + # `.dup` keeps the master-key flip local to this call (see Aggregation#execute!). + rest_opts = @query.send(:_opts).dup + rest_opts[:use_master_key] = true unless rest_opts[:use_master_key] == false + response = @query.client.aggregate_pipeline( @query.instance_variable_get(:@table), pipeline, headers: {}, - **@query.send(:_opts), + **rest_opts, ) if @query.instance_variable_get(:@verbose_aggregate) @@ -7312,6 +7367,18 @@ def execute_date_aggregation(operation, aggregation_expr) end result_hash else + unless response.success? + # Surface the failure (the result would otherwise be a silent `{}`) + # through the configured logger rather than unconditional stderr. + # Log the error code + message, not a full `inspect`, to avoid + # echoing an unbounded server payload into logs. + logger = Parse.respond_to?(:logger) ? Parse.logger : nil + logger&.warn( + "[Parse::GroupByDate] aggregate failed " \ + "(#{@query.instance_variable_get(:@table)} :#{@date_field} :#{@interval}): " \ + "code=#{response.code} #{response.error}" + ) + end {} end end @@ -7492,15 +7559,23 @@ def sort_stage { "$sort" => { field => dir } } end - # Mirror of {GroupBy#query_is_scoped?}. A session_token, acl_user, or - # acl_role on the parent query means the caller expects ACL filtering, - # which only the mongo-direct path provides — Parse Server REST - # `/aggregate` is master-key-only and unscoped. + # Mirror of {GroupBy#query_is_scoped?}. A session_token, acl_user, + # acl_role, or an active Parse.with_session ambient means the caller + # expects ACL-filtered results — which only the mongo-direct path + # provides. 
Parse Server REST `/aggregate` is master-key-only and + # unscoped. def query_is_scoped? st = @query.session_token return true if st.is_a?(String) && !st.empty? return true if @query.instance_variable_get(:@acl_user) return true if @query.instance_variable_get(:@acl_role) + # Ambient Parse.with_session counts as scope only when the query did not + # explicitly set use_master_key: true (matches Parse::Client#request + # precedence — an explicit master-key call skips the ambient session). + unless @query.use_master_key == true + ambient = @query.send(:ambient_session_token) + return true if ambient.is_a?(String) && !ambient.empty? + end false end diff --git a/lib/parse/stack/version.rb b/lib/parse/stack/version.rb index 0e1b598..51e15c1 100644 --- a/lib/parse/stack/version.rb +++ b/lib/parse/stack/version.rb @@ -6,6 +6,6 @@ module Parse # The Parse Server SDK for Ruby module Stack # The current version. - VERSION = "5.5.1" + VERSION = "5.5.2" end end diff --git a/test/lib/parse/aggregation_auto_promotion_test.rb b/test/lib/parse/aggregation_auto_promotion_test.rb index 696d543..d8787cb 100644 --- a/test/lib/parse/aggregation_auto_promotion_test.rb +++ b/test/lib/parse/aggregation_auto_promotion_test.rb @@ -15,10 +15,28 @@ def setup @query = Parse::Query.new("Song") @mock_client = Minitest::Mock.new @query.client = @mock_client + # `Parse::MongoDB.aggregate` is a real singleton method. Several tests + # below stub it via define_singleton_method; capture the original here so + # teardown can restore it, ensuring a stub (or a stub's `remove_method` + # cleanup) never leaks the method's absence into a later test in the run. 
+ require_relative "../../../lib/parse/mongodb" + @mongodb_aggregate_original = + (Parse::MongoDB.method(:aggregate) if Parse::MongoDB.singleton_class.method_defined?(:aggregate)) end def teardown restore_mongodb_stub + restore_mongodb_aggregate + end + + # Restore the real Parse::MongoDB.aggregate captured in setup (or remove a + # leftover stub if it was never a real method). Idempotent. + def restore_mongodb_aggregate + if @mongodb_aggregate_original + Parse::MongoDB.define_singleton_method(:aggregate, @mongodb_aggregate_original) + elsif Parse::MongoDB.singleton_class.method_defined?(:aggregate) + Parse::MongoDB.singleton_class.remove_method(:aggregate) + end end # ---- GroupBy auto-promotion ------------------------------------------- @@ -209,6 +227,182 @@ def test_aggregate_unscoped_explicit_mongo_direct_false_stays_on_rest refute agg.mongo_direct, "unscoped aggregate must honor explicit mongo_direct: false" end + # ---- Ambient Parse.with_session scopes aggregations ------------------------- + # An active Parse.with_session(token) block sets a fiber-local session that + # should scope aggregations just as it scopes REST find/get/count calls. + # The query_is_scoped? / distinct_query_is_scoped? checks must consult + # Parse.current_session_token so that GroupByDate / GroupBy / distinct / + # count (aggregation branch) all auto-promote to mongo-direct (or fail closed + # when mongo-direct is unavailable) rather than silently running as master + # and returning unscoped rows. 
+ + def test_group_by_date_ambient_session_promotes_to_direct + stub_mongodb_enabled!(true) + gbd = @query.group_by_date(:created_at, :day) + Parse.with_session("r:ambient-tok") { assert_date_promotes_to_direct(gbd, :count) } + end + + def test_group_by_date_ambient_session_fails_closed_when_mongodb_disabled + stub_mongodb_enabled!(false) + gbd = @query.group_by_date(:created_at, :day) + assert_raises(Parse::Query::MongoDirectRequired) do + Parse.with_session("r:ambient-tok") { gbd.count } + end + end + + def test_group_by_ambient_session_promotes_to_direct + stub_mongodb_enabled!(true) + gb = @query.group_by(:created_at) + Parse.with_session("r:ambient-tok") { assert_promotes_to_direct(gb, :count) } + end + + def test_group_by_ambient_session_fails_closed_when_mongodb_disabled + stub_mongodb_enabled!(false) + gb = @query.group_by(:created_at) + assert_raises(Parse::Query::MongoDirectRequired) do + Parse.with_session("r:ambient-tok") { gb.count } + end + end + + def test_distinct_ambient_session_promotes_to_direct + stub_mongodb_enabled!(true) + Parse.with_session("r:ambient-tok") { assert_distinct_promotes_to_direct } + end + + def test_distinct_ambient_session_fails_closed_when_mongodb_disabled + stub_mongodb_enabled!(false) + assert_raises(Parse::Query::MongoDirectRequired) do + Parse.with_session("r:ambient-tok") { @query.distinct(:artist) } + end + end + + def test_count_aggregation_ambient_session_promotes_to_direct + stub_mongodb_enabled!(true) + @query.where :tags.size => 2 + direct_called = false + Parse::MongoDB.define_singleton_method(:aggregate) do |_class_name, _pipeline, **_kw| + direct_called = true + [] + end + Parse.with_session("r:ambient-tok") { @query.count } + assert direct_called, "expected scoped #count (ambient session) to route through mongo-direct" + ensure + Parse::MongoDB.singleton_class.remove_method(:aggregate) if Parse::MongoDB.singleton_class.method_defined?(:aggregate) + end + + def 
test_count_aggregation_ambient_session_fails_closed_when_mongodb_disabled + stub_mongodb_enabled!(false) + @query.where :tags.size => 2 + assert_raises(Parse::Query::MongoDirectRequired) do + Parse.with_session("r:ambient-tok") { @query.count } + end + end + + # ---- precedence: explicit use_master_key: true beats the ambient ---------- + # Parse::Client#request treats an explicit use_master_key: true as a + # deliberate admin call that skips the ambient Parse.with_session token. The + # scoping checks must mirror that: an admin aggregation inside a with_session + # block must NOT be treated as scoped (no forced mongo-direct / fail-closed). + + def test_group_by_date_ambient_ignored_when_use_master_key_true + # mongo-direct disabled: if the ambient were (wrongly) treated as scope this + # would raise MongoDirectRequired. With use_master_key: true it must stay on + # REST instead. + stub_mongodb_enabled!(false) + @query.use_master_key = true + gbd = @query.group_by_date(:created_at, :day) + response = stub_response([]) + @mock_client.expect(:aggregate_pipeline, response) { |_t, _p, **_kw| true } + Parse.with_session("r:ambient-tok") { gbd.count } + @mock_client.verify + end + + def test_group_by_ambient_ignored_when_use_master_key_true + stub_mongodb_enabled!(false) + @query.use_master_key = true + gb = @query.group_by(:created_at) + response = stub_response([]) + @mock_client.expect(:aggregate_pipeline, response) { |_t, _p, **_kw| true } + Parse.with_session("r:ambient-tok") { gb.count } + @mock_client.verify + end + + def test_distinct_query_is_scoped_ignores_ambient_when_use_master_key_true + # Drive the predicate directly: with use_master_key: true an ambient session + # must not register as scope. + @query.use_master_key = true + scoped = Parse.with_session("r:ambient-tok") { @query.send(:distinct_query_is_scoped?) 
} + refute scoped, "explicit use_master_key: true must suppress the ambient session as scope" + end + + def test_distinct_query_is_scoped_honors_ambient_without_master_key + # Control: without use_master_key: true the ambient DOES count as scope. + scoped = Parse.with_session("r:ambient-tok") { @query.send(:distinct_query_is_scoped?) } + assert scoped, "ambient session must count as scope when master key is not forced" + end + + def test_aggregate_ambient_session_promotes_to_direct + # Query#aggregate returns an Aggregation object; the routing decision + # (mongo_direct flag) is made at construction time by #aggregate itself. + stub_mongodb_enabled!(true) + agg = Parse.with_session("r:ambient-tok") { @query.aggregate(PIPE) } + assert agg.mongo_direct, + "Query#aggregate inside Parse.with_session must set mongo_direct: true on " \ + "the returned Aggregation so subsequent .results/.raw run through mongo-direct" + end + + def test_aggregate_ambient_session_fails_closed_when_mongodb_disabled + stub_mongodb_enabled!(false) + assert_raises(Parse::Query::MongoDirectRequired) do + Parse.with_session("r:ambient-tok") { @query.aggregate(PIPE) } + end + end + + def test_group_by_raw_ambient_session_promotes_to_direct + stub_mongodb_enabled!(true) + gb = @query.group_by(:artist) + direct_called = false + Parse::MongoDB.define_singleton_method(:aggregate) do |_class_name, _pipeline, **_kw| + direct_called = true + [] + end + Parse.with_session("r:ambient-tok") { gb.raw("count", { "$sum" => 1 }) } + assert direct_called, + "GroupBy#raw must route through mongo-direct when ambient session is active " \ + "(not REST /aggregate as master, which would return unscoped rows)" + ensure + Parse::MongoDB.singleton_class.remove_method(:aggregate) if Parse::MongoDB.singleton_class.method_defined?(:aggregate) + end + + def test_group_by_raw_ambient_session_fails_closed_when_mongodb_disabled + stub_mongodb_enabled!(false) + gb = @query.group_by(:artist) + 
assert_raises(Parse::Query::MongoDirectRequired) do + Parse.with_session("r:ambient-tok") { gb.raw("count", { "$sum" => 1 }) } + end + end + + def test_group_by_date_rest_respects_explicit_use_master_key_false + # An unscoped query with explicit use_master_key: false (no ambient, no + # scope) must still route to REST and forward the false flag. + stub_mongodb_enabled!(false) + @query.use_master_key = false + gbd = @query.group_by_date(:created_at, :day) + + opts_received = nil + response = stub_response([]) + @mock_client.expect(:aggregate_pipeline, response) do |_table, _pipeline, **kw| + opts_received = kw + true + end + + gbd.count + @mock_client.verify + + assert_equal false, opts_received[:use_master_key], + "explicit use_master_key: false must be respected on REST aggregate path" + end + private # Stub Parse::MongoDB.enabled? for the duration of one test. We don't @@ -314,4 +508,5 @@ def stub_response(rows) define_method(:result) { rows } end.new end + end diff --git a/test/lib/parse/body_builder_method_override_test.rb b/test/lib/parse/body_builder_method_override_test.rb new file mode 100644 index 0000000..6a73541 --- /dev/null +++ b/test/lib/parse/body_builder_method_override_test.rb @@ -0,0 +1,176 @@ +# encoding: UTF-8 +# frozen_string_literal: true + +require_relative "../../test_helper" +require "faraday" +require "json" + +# Unit tests for Parse::Middleware::BodyBuilder's long-URL GET→POST override. +# +# Parse Server's REST surface caps a GET URL at ~2KB, so for a query whose +# encoded URL exceeds MAX_URL_LENGTH the SDK rewrites the request to a POST +# carrying `_method=GET` and moves the query into the request body. +# +# The two endpoints need DIFFERENT body encodings: +# +# * find / classes: Parse Server JSON-decodes a string `where` body param, +# so the historical `application/x-www-form-urlencoded` body +# (`_method=GET&where=&...`) works. 
+# * aggregate: Parse Server's AggregateRouter.getPipeline does NOT JSON-decode +# a body `pipeline` param. A urlencoded string `pipeline` is mis-read +# character-by-character ("Invalid aggregate stage '0'"). The override must +# send a JSON body so the pipeline survives as a real Array. +# +# These tests drive BodyBuilder with a Faraday test adapter and assert the +# converted method / headers / body without needing a live Parse Server. +class BodyBuilderMethodOverrideTest < Minitest::Test + MAX = Parse::Middleware::BodyBuilder::MAX_URL_LENGTH + + # Build a connection whose adapter snapshots the REQUEST as BodyBuilder + # hands it to the adapter. We snapshot (not just hold the env reference) + # because Faraday mutates env.body to the *response* body on completion. + def capture_env + captured = nil + conn = Faraday.new(url: "http://example.test/parse") do |c| + c.use Parse::Middleware::BodyBuilder + c.adapter :test do |stub| + handler = lambda do |env| + captured = { + method: env.method, + body: (env.body && env.body.dup), + request_headers: env.request_headers.dup, + query: env.url.query, + } + [200, { "Content-Type" => "application/json" }, '{"results":[]}'] + end + stub.get(/.*/, &handler) + stub.post(/.*/, &handler) + end + end + yield conn + captured + end + + # A pipeline whose JSON is long enough to push the encoded URL past MAX. + def long_pipeline + big_in = (0...400).map { |i| "Project$#{format('%010d', i)}" } + [ + { "$match" => { "_p_project" => { "$in" => big_in } } }, + { "$group" => { "_id" => { "year" => { "$year" => "$createdAt" } }, "count" => { "$sum" => 1 } } }, + { "$sort" => { "_id" => 1 } }, + { "$project" => { "_id" => 0, "objectId" => "$_id", "count" => 1 } }, + ] + end + + # A `where` whose encoded URL definitely exceeds MAX_URL_LENGTH. 
+ def long_where + { "tags" => { "$in" => (0...1200).to_a } } + end + + # ---- aggregate endpoint: JSON body, pipeline preserved as an Array -------- + + def test_long_aggregate_get_becomes_post_with_json_body + pipeline = long_pipeline + env = capture_env do |conn| + conn.get("aggregate/Note", { pipeline: pipeline.to_json }) + end + + assert_equal :post, env[:method], "long aggregate GET must be rewritten to POST" + assert_equal "GET", env[:request_headers]["X-Http-Method-Override"] + assert_equal Parse::Protocol::CONTENT_TYPE_FORMAT, + env[:request_headers]["Content-Type"], + "aggregate override must send a JSON body, not urlencoded" + assert_nil env[:query], "query must be moved out of the URL" + + body = JSON.parse(env[:body]) + assert_equal "GET", body["_method"], "Parse Server routes the POST to its GET-only handler via _method" + assert_kind_of Array, body["pipeline"], + "pipeline must arrive as a real Array (string form triggers 'Invalid aggregate stage 0')" + assert_equal pipeline, body["pipeline"], "pipeline content must round-trip unchanged" + end + + def test_long_aggregate_override_decodes_boolean_params + # rawValues / rawFieldNames are sent as booleans; Parse Server ignores them + # unless typeof === 'boolean', so the JSON body must preserve the boolean. 
+ env = capture_env do |conn| + conn.get("aggregate/Note", { pipeline: long_pipeline.to_json, rawValues: true }) + end + body = JSON.parse(env[:body]) + assert_equal true, body["rawValues"], "boolean query params must round-trip as booleans, not strings" + end + + # ---- non-aggregate endpoint: unchanged urlencoded override ---------------- + + def test_long_find_get_keeps_urlencoded_override + env = capture_env do |conn| + conn.get("classes/Note", { where: long_where.to_json }) + end + + assert_equal :post, env[:method], "long find GET must still be rewritten to POST" + assert_equal "GET", env[:request_headers]["X-Http-Method-Override"] + assert_equal "application/x-www-form-urlencoded", env[:request_headers]["Content-Type"], + "find override must keep its historical urlencoded body" + assert env[:body].start_with?("_method=GET&"), "find override body must start with _method=GET&" + assert_includes env[:body], "where=", "where must remain in the urlencoded body" + end + + # ---- short URLs are untouched (stay GET) ---------------------------------- + + def test_short_aggregate_get_stays_get + env = capture_env do |conn| + conn.get("aggregate/Note", { pipeline: [{ "$group" => { "_id" => "$x" } }].to_json }) + end + assert_equal :get, env[:method], "a short aggregate URL must stay a GET" + refute_nil env[:query], "short GET keeps its query string" + assert_includes env[:query], "pipeline=" + end + + def test_a_class_named_with_aggregate_substring_is_not_treated_as_aggregate + # "/classes/aggregateThings" contains "aggregate" but not "/aggregate/", + # so it must take the urlencoded (find) path, not the JSON path. 
+ env = capture_env do |conn| + conn.get("classes/aggregateThings", { where: long_where.to_json }) + end + assert_equal :post, env[:method] + assert_equal "application/x-www-form-urlencoded", env[:request_headers]["Content-Type"] + assert env[:body].start_with?("_method=GET&") + end + + # ---- robustness: long URL with no query string ---------------------------- + + # A >=2KB URL whose length is all path and no query is contrived but + # reachable; the non-aggregate branch must not crash on `"..." + nil`. + def test_long_non_aggregate_url_with_no_query_does_not_crash + long_path = "classes/" + ("A" * (Parse::Middleware::BodyBuilder::MAX_URL_LENGTH + 50)) + env = capture_env do |conn| + conn.get(long_path) # no params -> nil query + end + assert_equal :post, env[:method], "a >=2KB URL must still convert to POST" + assert_equal "_method=GET&", env[:body], + "nil query must coerce to an empty string, not raise TypeError" + end + + def test_long_aggregate_url_with_no_query_is_safe + long_path = "aggregate/" + ("A" * (Parse::Middleware::BodyBuilder::MAX_URL_LENGTH + 50)) + env = capture_env do |conn| + conn.get(long_path) + end + assert_equal :post, env[:method] + assert_equal({ "_method" => "GET" }, JSON.parse(env[:body]), + "aggregate branch with no query yields just the _method marker, no crash") + end + + # ---- a non-JSON query value passes through unchanged ----------------------- + + def test_aggregate_override_passes_non_json_value_through_unchanged + # A bare opaque token is not valid JSON; aggregate_override_body must keep + # it as the exact string (rescue branch), not nil and not an error. 
+ env = capture_env do |conn| + conn.get("aggregate/Note", { pipeline: long_pipeline.to_json, opaque: "r:not-json-token" }) + end + body = JSON.parse(env[:body]) + assert_equal "r:not-json-token", body["opaque"], + "a non-JSON query value must survive verbatim through the rescue path" + assert_kind_of Array, body["pipeline"], "pipeline still decodes to an Array alongside it" + end +end diff --git a/test/lib/parse/query_aggregate_integration_test.rb b/test/lib/parse/query_aggregate_integration_test.rb index 85d7a32..eaa71c3 100644 --- a/test/lib/parse/query_aggregate_integration_test.rb +++ b/test/lib/parse/query_aggregate_integration_test.rb @@ -1,5 +1,6 @@ require_relative "../../test_helper_integration" require "minitest/autorun" +require "cgi" # CGI.escape in the >2KB URL-length assertion; not guaranteed loaded transitively # Test models for aggregate pipeline testing class AggregateTestUser < Parse::Object @@ -1918,4 +1919,68 @@ def test_pointer_constraint_aggregation end end end + + # Regression: an aggregate whose encoded URL exceeds ~2KB is rewritten from + # GET to POST (_method=GET) with the query moved into the body. Parse Server's + # AggregateRouter only JSON-decodes query-string params, not body params, so + # the pipeline must be sent as a JSON body (not a URL-encoded string) or it is + # rejected with "Invalid aggregate stage '0'". This test drives a real Parse + # Server with a >2KB pipeline through all three surfaces (group_by, + # group_by_date, distinct) and asserts correct results instead of an empty/ + # errored result. See Parse::Middleware::BodyBuilder#aggregate_override_body. 
+ def test_large_aggregate_pipeline_exceeding_url_limit + skip "Docker integration tests require PARSE_TEST_USE_DOCKER=true" unless ENV["PARSE_TEST_USE_DOCKER"] == "true" + + with_parse_server do + with_timeout(30, "large aggregate pipeline (>2KB URL) test") do + puts "\n=== Testing Aggregate Pipeline Exceeding 2KB URL Limit ===" + + author = AggregateTestUser.new(name: "Big-IN Author", age: 40, city: "Austin", active: true) + assert author.save, "Author should save" + + # Five posts by this author, all created "now" so group_by_date buckets + # them into a single day. + 5.times do |i| + p = AggregateTestPost.new(title: "Big-IN Post #{i}", author: author, category: "tech", likes: i) + assert p.save, "Post #{i} should save" + end + + # A large $in of pointers: the real author plus enough fake ids to push + # the encoded aggregate URL well past MAX_URL_LENGTH (2000). + fake_ids = (0...400).map { |i| AggregateTestUser.pointer(format("Fake%010d", i)) } + in_set = fake_ids + [author.pointer] + + # Confirm the request actually crosses the GET→POST override threshold, + # otherwise this test would silently pass without exercising the fix. + long_pipeline = AggregateTestPost.where(:author.in => in_set).group_by(:category).pipeline + encoded_len = { pipeline: long_pipeline.to_json }.to_a + .map { |k, v| "#{k}=#{CGI.escape(v.to_s)}" }.join("&").length + assert_operator encoded_len, :>, Parse::Middleware::BodyBuilder::MAX_URL_LENGTH, + "test pipeline must exceed MAX_URL_LENGTH to exercise the POST override" + + # 1) group_by over the large $in constraint (the pipeline measured above). + result = AggregateTestPost.where(:author.in => in_set).group_by(:category).count + assert_kind_of Hash, result + assert_equal 5, result.values.sum, + "group_by with a >2KB $in must return real counts, not an empty/errored result" + puts "✅ group_by with large $in: #{result.inspect}" + + # 2) group_by_date: the exact user-facing surface from the bug report. 
+ by_date = AggregateTestPost.where(:author.in => in_set) + .group_by_date(:created_at, :day).count + assert_kind_of Hash, by_date + assert_equal 5, by_date.values.sum, + "group_by_date with a >2KB $in must bucket all 5 posts, not return {} from 'Invalid aggregate stage 0'" + puts "✅ group_by_date with large $in: #{by_date.inspect}" + + # 3) distinct over the same large constraint. + cats = AggregateTestPost.where(:author.in => in_set).distinct(:category) + assert_includes cats, "tech", + "distinct with a >2KB $in must return values, not an empty/errored result" + puts "✅ distinct with large $in: #{cats.inspect}" + + puts "\n✅ Large aggregate pipeline (>2KB URL) test passed" + end + end + end end