45 changes: 45 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,50 @@
## parse-stack-next Changelog

### 5.5.2

#### Large aggregation pipelines no longer fail with "Invalid aggregate stage '0'"

- **FIXED**: An aggregation whose request URL exceeds ~2KB (for example a
`group_by`, `group_by_date`, `distinct`, or custom `aggregate` pipeline with
a large `$in` / `$match`) is rewritten from a GET to a POST carrying
`_method=GET`, with the query moved into the request body. The pipeline was
sent in the body as a URL-encoded string, but Parse Server's aggregate
endpoint only JSON-decodes query-string params, not body params — so the
pipeline arrived as a raw string and was rejected with
`Invalid aggregate stage '0'`, causing the aggregation to return an empty
result. The long-URL override now sends a JSON body for the aggregate
endpoint so the pipeline is delivered as a real array (boolean params such as
`rawValues` are preserved as booleans). The historical URL-encoded override is
unchanged for `find` and other endpoints, which Parse Server already decodes
correctly.
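
A minimal sketch of the JSON-body idea described above, using only the Ruby standard library. The helper name `json_override_body` and the sample parameters are hypothetical, and `URI.decode_www_form` stands in for whatever query parser the gem uses internally; this is an illustration, not the gem's implementation.

```ruby
require "json"
require "uri"

# Hypothetical helper (not part of the gem's API): rebuild the params from an
# encoded query string and JSON-decode each value where possible, so arrays
# and booleans keep their types in the JSON body.
def json_override_body(query_string)
  body = { "_method" => "GET" }
  URI.decode_www_form(query_string.to_s).each do |key, value|
    body[key] =
      begin
        JSON.parse(value)
      rescue JSON::ParserError
        value # not JSON itself: pass the string through unchanged
      end
  end
  JSON.generate(body)
end

# Sample data, assumed for illustration only.
sample_pipeline = [{ "$match" => { "status" => { "$in" => %w[a b] } } }]
sample_query = URI.encode_www_form(
  "pipeline" => JSON.generate(sample_pipeline),
  "rawValues" => "true"
)
sample_body = JSON.parse(json_override_body(sample_query))
# sample_body["pipeline"] is an Array and sample_body["rawValues"] is true,
# whereas a URL-encoded body would carry both as plain strings.
```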

#### Aggregations inside `Parse.with_session` blocks are now scoped

- **FIXED**: `group_by_date`, `group_by`, `distinct`, and `count` (aggregation
branch) now detect the ambient session token set by `Parse.with_session` and
treat the query as scoped — consistent with how `Parse::Client#request`
already scopes REST find/get/count calls in the same block. Previously the
`query_is_scoped?` / `distinct_query_is_scoped?` checks consulted only the
query instance's own `session_token=` / `scope_to_user` / `scope_to_role`
and ignored `Parse.current_session_token`, so an aggregation inside a
`with_session` block ran unscoped as the master key and returned all rows
regardless of ACL. The checks now include the ambient session token: when
the query is scoped and mongo-direct is available, the aggregation
auto-promotes (ACL/CLP enforced); when it is scoped and mongo-direct is
unavailable, it fails closed with `MongoDirectRequired` rather than silently
leaking rows.
- **FIXED**: `group_by_date` now also fails closed (`MongoDirectRequired`) when
the query is scoped but mongo-direct is unavailable — matching the existing
behavior of `group_by`, `distinct`, and `count`. Previously `group_by_date`
silently fell back to the REST `/aggregate` endpoint in that case.
- **FIXED**: A regression introduced in 5.5.1 where `group_by_date`,
`group_by`, and pipeline-based aggregations called inside a
`Parse.with_session` block returned empty results `{}`. The ambient session
token was forwarded as an HTTP session-token header (suppressing the master
key), causing Parse Server's REST `/aggregate` endpoint — which is
master-key-only — to return a 401/403. The REST aggregate call sites now
force `use_master_key: true` so the ambient session token cannot suppress
it, unless the caller explicitly set `use_master_key: false`.
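
The routing rules in the entries above can be sketched as a small standalone decision function. Everything here is an assumption made for illustration: `scoped?`, `aggregation_route`, and `MongoDirectRequiredSketch` are hypothetical stand-ins, not the gem's methods or error class.

```ruby
# Local stand-in for the gem's MongoDirectRequired error, defined here only
# so the sketch runs on its own.
class MongoDirectRequiredSketch < StandardError; end

# A query counts as scoped when it carries its own session token, ACL user,
# or ACL role, or when an ambient session token is present and the query did
# not explicitly ask for master-key mode.
def scoped?(session_token: nil, acl_user: nil, acl_role: nil,
            ambient_token: nil, use_master_key: nil)
  return true if session_token.is_a?(String) && !session_token.empty?
  return true if acl_user
  return true if acl_role
  unless use_master_key == true
    return true if ambient_token.is_a?(String) && !ambient_token.empty?
  end
  false
end

# Unscoped queries may use REST; scoped queries are promoted to mongo-direct
# when it is available and fail closed otherwise.
def aggregation_route(scoped:, mongo_direct_available:)
  return :rest unless scoped
  return :mongo_direct if mongo_direct_available
  raise MongoDirectRequiredSketch, "scoped aggregation requires mongo-direct"
end
```

Raising when mongo-direct is unavailable mirrors the fail-closed choice described above: an error is visible to the caller, whereas an unscoped fallback would return rows the session should not see.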

### 5.5.1

#### Mongo-direct reads inside `Parse.with_session` are now scoped, not master
2 changes: 1 addition & 1 deletion Gemfile.lock
@@ -1,7 +1,7 @@
PATH
remote: .
specs:
parse-stack-next (5.5.1)
parse-stack-next (5.5.2)
activemodel (>= 6.1, < 9)
activesupport (>= 6.1, < 9)
connection_pool (>= 2.2, < 4)
79 changes: 71 additions & 8 deletions lib/parse/client/body_builder.rb
@@ -285,14 +285,32 @@ def call!(env)
# to be POST instead of GET and send the query parameters in the body of the POST request.
# The standard maximum POST request (which is a server setting), is usually set to 20MBs
if env[:method] == :get && env[:url].to_s.length >= MAX_URL_LENGTH
env[:request_headers][HTTP_METHOD_OVERRIDE] = "GET"
env[:request_headers][CONTENT_TYPE] = "application/x-www-form-urlencoded"
# parse-sever looks for method overrides in the body under the `_method` param.
# so we will add it to the query string, which will now go into the body.
env[:body] = "_method=GET&" + env[:url].query
env[:url].query = nil
#override
env[:method] = :post
if aggregate_request?(env[:url])
# Parse Server's AggregateRouter only JSON-decodes query-string
# params (via JSONFromQuery); it does NOT decode a `pipeline` param
# that arrives in the request body. The urlencoded override below
# would therefore deliver `pipeline` as a raw JSON *string*, which
# AggregateRouter.getPipeline mis-reads character-by-character and
# rejects with "Invalid aggregate stage '0'". Send a JSON body
# instead so the pipeline survives as a real Array. `_method=GET`
# still routes Parse Server to its GET-only aggregate handler.
env[:request_headers][HTTP_METHOD_OVERRIDE] = "GET"
env[:request_headers][CONTENT_TYPE] = CONTENT_TYPE_FORMAT
env[:body] = aggregate_override_body(env[:url].query)
env[:url].query = nil
env[:method] = :post
else
env[:request_headers][HTTP_METHOD_OVERRIDE] = "GET"
env[:request_headers][CONTENT_TYPE] = "application/x-www-form-urlencoded"
# parse-server looks for method overrides in the body under the `_method` param.
# so we will add it to the query string, which will now go into the body.
# `.to_s` guards the (contrived but possible) case of a >=2KB URL whose
# length is all path and no query — nil + String would raise TypeError.
env[:body] = "_method=GET&" + env[:url].query.to_s
env[:url].query = nil
#override
env[:method] = :post
end
# else if not a get, always make sure the request is JSON encoded if the content type matches
elsif env[:request_headers][CONTENT_TYPE] == CONTENT_TYPE_FORMAT &&
(env[:body].is_a?(Hash) || env[:body].is_a?(Array))
@@ -334,6 +352,51 @@ def call!(env)
response_env[:body] = r
end
end

private

# Whether the request targets Parse Server's `/aggregate/<Class>`
# endpoint. Used by {#call!} to pick the JSON-body form of the
# long-URL GET→POST override (the aggregate endpoint does not
# JSON-decode a body `pipeline` param, unlike `where`).
#
# Anchored to the final two path segments: `.../aggregate/<ClassName>`
# where <ClassName> is the last segment (no further slashes). The
# className is mandatory and slash-free — see
# {Parse::API::Aggregate#aggregate_uri_path}, which validates it via
# PathSegment.identifier! — so a real aggregate URL always ends this way.
# A `find` request is `.../classes/<ClassName>` (no match), a class
# merely *named* with "aggregate" (e.g. `MyAggregateData`) does not match,
# and an `/aggregate/` segment appearing earlier in a custom mount prefix
# (e.g. `/aggregate/api/classes/Foo`) does not match either.
# @param url [URI] the request URL.
# @return [Boolean]
def aggregate_request?(url)
url.path.to_s.match?(%r{/aggregate/[^/]+/?\z})
end

# Build the JSON request body for a long-URL aggregate GET→POST
# override. Reconstructs the params from the encoded query string and
# JSON-decodes each value so the `pipeline` Array (and boolean
# `rawValues`/`rawFieldNames`) reach Parse Server as real types rather
# than strings. A value that is not itself JSON is passed through
# unchanged. `_method=GET` is injected so Parse Server routes the POST
# to its GET-only aggregate handler.
# @param query_string [String, nil] the encoded query string.
# @return [String] the JSON body to send.
def aggregate_override_body(query_string)
params = Faraday::Utils.parse_query(query_string.to_s) || {}
body = { "_method" => "GET" }
params.each do |key, value|
body[key] =
begin
JSON.parse(value)
rescue JSON::ParserError, TypeError
value
end
end
body.to_json
end
end
end #Middleware
end
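
The path anchoring documented for `aggregate_request?` can be checked in isolation. The sketch below reuses the same regular expression; the constant and method names and the sample paths are hypothetical, chosen to mirror the cases listed in the method's comment.

```ruby
# Same pattern as in the diff: `/aggregate/<ClassName>` must be the final two
# path segments, with an optional trailing slash.
AGGREGATE_PATH_SKETCH = %r{/aggregate/[^/]+/?\z}

# Hypothetical standalone version that takes a path string instead of a URI.
def aggregate_path?(path)
  path.to_s.match?(AGGREGATE_PATH_SKETCH)
end

aggregate_path?("/parse/aggregate/Song")           # aggregate endpoint
aggregate_path?("/parse/classes/Song")             # find endpoint, no match
aggregate_path?("/parse/classes/MyAggregateData")  # class name only, no match
aggregate_path?("/aggregate/api/classes/Foo")      # mount prefix only, no match
```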
143 changes: 109 additions & 34 deletions lib/parse/query.rb
@@ -1811,13 +1811,25 @@ def requires_mongo_direct?
# Whether this query carries a non-master-key auth scope. Used by
# `#distinct` (and group_by aggregations) to decide whether to
# auto-promote the REST aggregate path to mongo-direct so the SDK's
# ACLScope / CLPScope enforcement actually runs.
# ACLScope / CLPScope enforcement actually runs. Also detects the
# fiber-local ambient session set by Parse.with_session so that
# aggregations inside a with_session block are treated as scoped —
# consistent with how Parse::Client#request already scopes REST calls.
# @return [Boolean]
# @api private
def distinct_query_is_scoped?
return true if @session_token.is_a?(String) && !@session_token.empty?
return true if @acl_user
return true if @acl_role
# An ambient Parse.with_session counts as scope ONLY when the query did
# not explicitly request master-key mode — mirroring Parse::Client#request,
# where an explicit use_master_key: true is a deliberate admin call that
# skips the ambient session. Otherwise an admin aggregation inside a
# with_session block would be wrongly forced to mongo-direct / fail-closed.
unless use_master_key == true
ambient = ambient_session_token
return true if ambient.is_a?(String) && !ambient.empty?
end
false
end

@@ -1832,12 +1844,13 @@ def distinct_query_is_scoped?
def raise_scoped_aggregation_requires_mongo_direct!
raise MongoDirectRequired,
"[Parse::Query] This scoped aggregation (session_token / " \
"scope_to_user / scope_to_role) requires mongo-direct so the " \
"SDK can enforce ACL/CLP. Parse Server's REST /aggregate " \
"endpoint is master-key-only and enforces neither, so routing " \
"it there would silently run unscoped as the master key. " \
"Enable mongo-direct via Parse::MongoDB.configure(...), or " \
"rewrite without the aggregation terminal."
"scope_to_user / scope_to_role, or an active Parse.with_session " \
"block) requires mongo-direct so the SDK can enforce ACL/CLP. " \
"Parse Server's REST /aggregate endpoint is master-key-only and " \
"enforces neither, so routing it there would silently run unscoped " \
"as the master key. Enable mongo-direct via " \
"Parse::MongoDB.configure(...), or rewrite without the " \
"aggregation terminal."
end

# Scope a query to a specific user's row-level ACL when it auto-routes
@@ -5996,13 +6009,21 @@ def execute!
if @mongo_direct && defined?(Parse::MongoDB) && Parse::MongoDB.enabled?
@cached_response = execute_direct!
else
# REST /aggregate is master-key-only. An ambient Parse.with_session
# block would suppress the master key via session_token, causing a
# 401/403. Force use_master_key unless the caller explicitly disabled
# it (use_master_key: false is a deliberate client-mode decision).
# `.dup` keeps the master-key flip local to this call even if `_opts`
# ever returns a shared/memoized hash.
rest_opts = @query.send(:_opts).dup
rest_opts[:use_master_key] = true unless rest_opts[:use_master_key] == false
@cached_response = @query.client.aggregate_pipeline(
@query.instance_variable_get(:@table),
@pipeline,
headers: {},
raw_values: @raw_values,
raw_field_names: @raw_field_names,
**@query.send(:_opts),
**rest_opts,
)
end

@@ -6388,16 +6409,24 @@ def pipeline
# @return [Array<Hash>] raw aggregation results
def raw(operation, aggregation_expr)
formatted_group_field = @query.send(:format_aggregation_field, @group_field)
pipeline = build_pipeline(formatted_group_field, aggregation_expr)

response = @query.client.aggregate_pipeline(
@query.instance_variable_get(:@table),
pipeline,
headers: {},
**@query.send(:_opts),
)
# Build the same pipeline the count/sum/etc. terminals use, then delegate
# to Query#aggregate. That central path handles scoped-query routing
# (session_token / acl_user / acl_role / ambient Parse.with_session →
# auto-promote to mongo-direct, or fail closed when unavailable) so a
# scoped `raw` is never sent to the master-key-only REST /aggregate
# endpoint, and it returns the raw Array<Hash> rows this method documents.
# `$match` from the query's where constraints is added by Query#aggregate.
pipeline = []
pipeline << { "$unwind" => "$#{formatted_group_field}" } if @flatten_arrays
pipeline << { "$group" => { "_id" => "$#{formatted_group_field}", "count" => aggregation_expr } }
add_fields = size_addfields_stage
pipeline << add_fields if add_fields
sort = sort_stage
pipeline << sort if sort
pipeline << { "$project" => { "_id" => 0, "objectId" => "$_id", "count" => 1 } }

response.result || []
@query.aggregate(pipeline, verbose: @query.instance_variable_get(:@verbose_aggregate)).raw || []
end

# Count the number of items in each group.
@@ -6541,8 +6570,12 @@ def execute_group_aggregation(operation, aggregation_expr, &value_transformer)
# already does this auto-promotion (lib/parse/agent/tools.rb), this
# is the equivalent at the Query layer for direct SDK callers.
use_mongo_direct = @mongo_direct
if !use_mongo_direct && query_is_scoped? && parse_mongodb_available?
use_mongo_direct = true
if !use_mongo_direct && query_is_scoped?
if parse_mongodb_available?
use_mongo_direct = true
else
@query.send(:raise_scoped_aggregation_requires_mongo_direct!)
end
end

if use_mongo_direct
@@ -6779,15 +6812,22 @@ def validate_sort_target_for_operation!(operation)
end

# Whether the parent query carries any non-master-key auth scope. A
# session_token, acl_user, or acl_role means the caller expects the
# results to be filtered by ACL — which only happens on the SDK's
# mongo-direct path. Used to decide whether to auto-promote the REST
# aggregation path to mongo-direct.
# session_token, acl_user, acl_role, or an active Parse.with_session
# ambient means the caller expects ACL-filtered results — which only
# the SDK's mongo-direct path provides. Used to decide whether to
# auto-promote the REST aggregation path to mongo-direct.
def query_is_scoped?
st = @query.session_token
return true if st.is_a?(String) && !st.empty?
return true if @query.instance_variable_get(:@acl_user)
return true if @query.instance_variable_get(:@acl_role)
# Ambient Parse.with_session counts as scope only when the query did not
# explicitly set use_master_key: true (matches Parse::Client#request
# precedence — an explicit master-key call skips the ambient session).
unless @query.use_master_key == true
ambient = @query.send(:ambient_session_token)
return true if ambient.is_a?(String) && !ambient.empty?
end
false
end

@@ -7228,14 +7268,19 @@ def execute_date_aggregation(operation, aggregation_expr)
# Format the date field name
formatted_date_field = @query.send(:format_aggregation_field, @date_field)

# Auto-promote scoped queries to mongo-direct. See the matching
# block in `GroupBy#execute_group_aggregation` for the full
# rationale — REST `/aggregate` is master-key-only and unscoped, so
# session_token / acl_user / acl_role queries need the SDK's
# mongo-direct enforcement layers to actually filter results.
# Auto-promote scoped queries to mongo-direct. REST `/aggregate` is
# master-key-only and enforces neither ACL nor CLP — a scoped query
# (session_token / acl_user / acl_role, or an active
# Parse.with_session block) must use the SDK's enforcement layers.
# Fail closed if mongo-direct is unavailable rather than silently
# returning unscoped rows. Mirrors the scoped-query gate in Query#aggregate.
use_mongo_direct = @mongo_direct
if !use_mongo_direct && query_is_scoped? && parse_mongodb_available?
use_mongo_direct = true
if !use_mongo_direct && query_is_scoped?
if parse_mongodb_available?
use_mongo_direct = true
else
@query.send(:raise_scoped_aggregation_requires_mongo_direct!)
end
end

if use_mongo_direct
@@ -7281,11 +7326,21 @@ def execute_date_aggregation(operation, aggregation_expr)
puts "[VERBOSE AGGREGATE] Sending to: #{@query.instance_variable_get(:@table)}"
end

# Parse Server's REST /aggregate endpoint is master-key-only. An active
# Parse.with_session block sets a fiber-local ambient session token that
# Parse::Client#request picks up and uses in place of the master key,
# causing a 401/403 on this endpoint. Force use_master_key: true so the
# ambient session cannot suppress it — unless the caller explicitly set
# use_master_key: false (deliberate client-mode / session-token intent).
# `.dup` keeps the master-key flip local to this call (see Aggregation#execute!).
rest_opts = @query.send(:_opts).dup
rest_opts[:use_master_key] = true unless rest_opts[:use_master_key] == false

response = @query.client.aggregate_pipeline(
@query.instance_variable_get(:@table),
pipeline,
headers: {},
**@query.send(:_opts),
**rest_opts,
)

if @query.instance_variable_get(:@verbose_aggregate)
@@ -7312,6 +7367,18 @@ def execute_date_aggregation(operation, aggregation_expr)
end
result_hash
else
unless response.success?
# Surface the failure (the result would otherwise be a silent `{}`)
# through the configured logger rather than unconditional stderr.
# Log the error code + message, not a full `inspect`, to avoid
# echoing an unbounded server payload into logs.
logger = Parse.respond_to?(:logger) ? Parse.logger : nil
logger&.warn(
"[Parse::GroupByDate] aggregate failed " \
"(#{@query.instance_variable_get(:@table)} :#{@date_field} :#{@interval}): " \
"code=#{response.code} #{response.error}"
)
end
{}
end
end
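
The `rest_opts` handling used at the REST aggregate call sites above can be shown on its own. This is a sketch under assumptions: `rest_aggregate_opts` is a hypothetical name, and a plain Hash stands in for whatever `_opts` returns.

```ruby
# Copy the options before forcing the master key, so a shared or memoized
# hash is never mutated. An explicit `use_master_key: false` is respected.
def rest_aggregate_opts(opts)
  rest_opts = opts.dup
  rest_opts[:use_master_key] = true unless rest_opts[:use_master_key] == false
  rest_opts
end

# A frozen hash models a shared options object that must stay untouched.
shared_opts = { cache: false }.freeze
forced_opts = rest_aggregate_opts(shared_opts)
# forced_opts gains use_master_key: true, while shared_opts has no such key.
```

Comparing against `false` rather than testing truthiness is what lets an unset (`nil`) flag default to the master key while a deliberate `false` survives.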
@@ -7492,15 +7559,23 @@ def sort_stage
{ "$sort" => { field => dir } }
end

# Mirror of {GroupBy#query_is_scoped?}. A session_token, acl_user, or
# acl_role on the parent query means the caller expects ACL filtering,
# which only the mongo-direct path provides — Parse Server REST
# `/aggregate` is master-key-only and unscoped.
# Mirror of {GroupBy#query_is_scoped?}. A session_token, acl_user,
# acl_role, or an active Parse.with_session ambient means the caller
# expects ACL-filtered results — which only the mongo-direct path
# provides. Parse Server REST `/aggregate` is master-key-only and
# unscoped.
def query_is_scoped?
st = @query.session_token
return true if st.is_a?(String) && !st.empty?
return true if @query.instance_variable_get(:@acl_user)
return true if @query.instance_variable_get(:@acl_role)
# Ambient Parse.with_session counts as scope only when the query did not
# explicitly set use_master_key: true (matches Parse::Client#request
# precedence — an explicit master-key call skips the ambient session).
unless @query.use_master_key == true
ambient = @query.send(:ambient_session_token)
return true if ambient.is_a?(String) && !ambient.empty?
end
false
end
