
fix: Limit length of response body read to 4mb#2080

Open
kaylareopelle wants to merge 11 commits into open-telemetry:main from kaylareopelle:read-body-limit

Conversation


@kaylareopelle (Contributor) commented Apr 8, 2026

Our exporters read HTTP response bodies without any limit. A misconfigured or misbehaving server could send an arbitrarily large response, causing the exporter to read it all into memory.

This implements a 4 MB response body limit, also implemented by:

Based on https://cwe.mitre.org/data/definitions/789.html

Fixes #2079

Combined with kaylareopelle#15 by @robbkidd

Limiting the read size may help prevent memory exhaustion exploits when
the configured collector endpoint is attacker-controlled.
```ruby
truncated = false

response.read_body do |chunk|
  if body.bytesize + chunk.bytesize <= RESPONSE_BODY_LIMIT
```
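For context, here is a fuller sketch of the capped reader the hunk above excerpts. This is a reconstruction from the visible lines, not the PR's exact code; `RESPONSE_BODY_LIMIT` and the `truncated` flag follow the diff's naming, and `collect_response_chunks` is a method name that appears later in the review.

```ruby
# Reconstruction of the capped chunk reader, based on the excerpt above.
RESPONSE_BODY_LIMIT = 4 * 1024 * 1024 # 4 MB

def collect_response_chunks(response)
  body = +''
  truncated = false
  response.read_body do |chunk|
    if body.bytesize + chunk.bytesize <= RESPONSE_BODY_LIMIT
      body << chunk
    else
      truncated = true
      break # note: the review below shows break alone doesn't stop Net::HTTP reading more
    end
  end
  [body, truncated]
end
```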
Contributor

Before starting to read the response body, should we not first read the Content-Length or Transfer-Encoding header and short-circuit if the body exceeds the limit?

IIUC the Go collector will switch to chunked responses when the body exceeds the 2 KiB default buffer size. In cases where the response is less than 2 KiB, or the backend supports larger buffer sizes, we may get a Content-Length response and can exit early.

https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Content-Length

https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Transfer-Encoding

Contributor Author

I like the idea of using the headers to make this more performant!

Just to make sure I'm on the same page, are you proposing:

  • Use the transfer-encoding / content-length headers to determine body size
  • Skip chunking when the body is less than the limit and just return the body
  • Truncate the response using chunking for bodies that exceed the limit

Or are you suggesting something else?

Contributor

My understanding is that it is either/or.

The response will have either a Content-Length header or Transfer-Encoding: chunked.

If Content-Length is present and exceeds the limit, discard the response body and exit immediately.

If it's a chunked response, use the code you've introduced here.
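The either/or flow described above could be sketched roughly like this. The method name is hypothetical and headers are treated as a simple hash-like lookup; the 4 MB limit follows the PR.

```ruby
RESPONSE_BODY_LIMIT = 4 * 1024 * 1024 # 4 MB, as in the PR

# Decide from the headers alone how to handle the body:
#   :discard -> declared Content-Length exceeds the limit; drop the body unread
#   :read    -> declared Content-Length fits; safe to read it whole
#   :stream  -> no Content-Length (e.g. Transfer-Encoding: chunked); fall
#               back to the capped chunk-by-chunk reader from this PR
def body_strategy(response)
  length = response['Content-Length']&.to_i
  return :stream if length.nil?

  length > RESPONSE_BODY_LIMIT ? :discard : :read
end
```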

@robbkidd (Member) left a comment

In a hacky reproduction of the problem (some malicious or misbehaving OTLP receiver responds with a large body), I still saw the whole large response body getting loaded into memory. The exporter's @http.request(request) call reads the full body before read_response_body runs, so the 4 MB cap never activates against a real HTTP server. The existing tests pass because WebMock stubs don't have real socket behavior. 😭
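The eager read described here comes from calling Net::HTTP#request without a block. A minimal demonstration of the block form, against a throwaway local server rather than the exporter's actual code: inside the block, headers have been parsed but no body bytes have been consumed, so a size check could bail out there.

```ruby
require 'net/http'
require 'socket'

# Throwaway single-request server that sends a small fixed body.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]

Thread.new do
  client = server.accept
  nil until client.gets == "\r\n" # consume the request head
  client.print "HTTP/1.1 200 OK\r\nContent-Length: 10\r\n\r\n0123456789"
  client.close
end

streamed = +''
declared_length = nil
Net::HTTP.start('127.0.0.1', port) do |http|
  http.request(Net::HTTP::Get.new('/')) do |response|
    # Headers are available here before the body has been read,
    # so a Content-Length check could short-circuit at this point.
    declared_length = response['Content-Length'].to_i
    response.read_body { |chunk| streamed << chunk }
  end
end
```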

Comment thread exporter/otlp/lib/opentelemetry/exporter/otlp/exporter.rb Outdated
Comment thread exporter/otlp/lib/opentelemetry/exporter/otlp/exporter.rb Outdated
Comment thread exporter/otlp/lib/opentelemetry/exporter/otlp/exporter.rb
Comment thread exporter/otlp/lib/opentelemetry/exporter/otlp/exporter.rb Outdated
robbkidd added a commit to robbkidd/opentelemetry-ruby that referenced this pull request Apr 15, 2026
Adds a failing test that exercises the OTLP exporter's response body
limit against a real TCPServer instead of WebMock stubs.

WebMock patches Net::HTTP's read_body internals, so the existing
stub-based tests pass even though the chunked reader doesn't work
against real HTTP. With WebMock's adapter fully disabled, we see that
Net::HTTP#request eagerly reads the full body before read_response_body
runs, causing IOError: "read_body called twice". The IOError is the
symptom — the actual problem is that the full oversized response body
is already in memory by that point, defeating the 4 MB cap.

Reproduces the problem described in review comments on PR open-telemetry#2080.
robbkidd added a commit to robbkidd/opentelemetry-ruby that referenced this pull request Apr 22, 2026
robbkidd and others added 9 commits April 22, 2026 14:49

Adds a failing test that exercises the OTLP exporter's response body
limit against a real TCPServer instead of WebMock stubs.

WebMock patches Net::HTTP's read_body internals, so the existing
stub-based tests pass even though the chunked reader doesn't work
against real HTTP. With WebMock's adapter fully disabled,
Net::HTTP#request eagerly reads the full body before read_response_body
runs, causing IOError: "read_body called twice". The full response body
is already in memory by that point, defeating the 4 MB cap.

Net::HTTP#request without a block eagerly reads the entire response
body into memory via reading_body's self.body call.

Refactor send_bytes to use @http.request(request) { |response| }
so the case statement runs before the body is read. The error path
uses streaming access for the chunked reader. Non-error paths
drain-and-discard with read_body { |_| } to preserve keep-alive
connections (read_body(nil) leaves the socket undrained).

redo doesn't work from within a nested block, so add a should_redo
flag to trigger it.

Got past the IOError from trying to read the body twice; verify that
the chunked reader hits the 4 MB cap against a real HTTP response.

After read_response_body breaks out of the chunked read at 4 MB,
Net::HTTP's reading_body still calls self.body, which reads more data
from the socket into memory. Our String is capped, but response.body
held 5.8 MB against a 10 MB response.

Spies on the exporter to observe response.body size after
read_response_body returns.

When read_response_body truncates at the 4 MB limit, Net::HTTP's
reading_body calls self.body after our block returns. This
re-reads the entire remaining response from the socket into memory.
Our break doesn't prevent this because reading_body has its
"ensure to read body" safety net that fires before the socket
closes.

Calling @http.finish before break closes the socket. reading_body's
self.body call hits a closed stream instead of re-reading. The
resulting IOError is rescued only when truncation caused it.
Unexpected IOErrors still propagate.

send_bytes's sad path needs to know whether the chunked reader truncated
the response body so it can log that appropriately.

Instead of managing a shared-state instance variable, return that state
from the reader and pass it into the logger.

Decode will raise an error on a truncated body, and the exporter will
then log a confusing "unexpected error decoding rpc.Status".
A truncated protobuf is invalid binary. Passing it to
Google::Rpc::Status.decode raises "unexpected error decoding rpc.Status",
which is rescued by our own error handling. The resulting log output
about the decode error mentions nothing about a known-truncated body.

When we know the body was truncated, log a direct oversized-body message
and return early.

* Metrics/CyclomaticComplexity
* Metrics/MethodLength
* Metrics/PerceivedComplexity
```ruby
return ['', false] if response.nil?

body, truncated = collect_response_chunks(response)
body.force_encoding('UTF-8')
```
Contributor

If the response is truncated, the code points may be incomplete, which could cause an invalid byte sequence when translating it from binary to a UTF-8 String.
e.g. if the last character is multibyte but was cut off, it will fail. Take the number 1 as a Chinese character, U+4E00 (一), encoded in UTF-8 as:

 11100100 10111000 10000000

The first byte in that sequence expects to be followed by two bytes, but when truncated it is

 11100100 10111000

IIUC the force_encoding method only changes the encoding attribute on the String and does not actually attempt to validate it. It will fail once someone tries to read the string ... effectively false advertising. The user of the body will then have to guard that the encoding is valid or handle the invalid byte sequences.

```ruby
# U+4E00 (一, yī) is encoded in UTF-8 as 3 bytes: E4 B8 80
# Here we only have the first 2 bytes — the sequence is incomplete.
truncated = "\xE4\xB8".b
truncated.force_encoding("UTF-8")

puts truncated.encoding         #=> UTF-8
puts truncated.valid_encoding?  #=> false

begin
  # Note: encode("UTF-8") on a string already tagged UTF-8 is a documented
  # no-op and raises nothing; transcoding to a different encoding is what
  # forces validation of the bytes.
  puts truncated.encode("UTF-16")
rescue Encoding::InvalidByteSequenceError => e
  puts "Error: #{e.message}"
end
```

What should we do?

I am not sure, but I am not a fan of flow control by exception. If the body is truncated, perhaps we should return the string with placeholders for invalid byte sequences, or a blank string?
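For the placeholder idea, Ruby's String#scrub (available since 2.1) does exactly that: it replaces invalid byte sequences with the Unicode replacement character, or any replacement string you pass, so later readers never hit an encoding error. A small illustration with a truncated multibyte tail:

```ruby
# A body whose last multibyte character was cut off mid-sequence.
truncated = "status: \xE4\xB8".dup.force_encoding('UTF-8')
truncated.valid_encoding?  # => false

# scrub swaps each invalid sequence for "\uFFFD" by default;
# scrub('') would drop the invalid bytes instead.
scrubbed = truncated.scrub
```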

```ruby
  [body, truncated]
rescue StandardError => e
  OpenTelemetry.handle_error(exception: e, message: 'error reading response body')
  ['', false]
```
Contributor

This does not make sense to me. Does returning an empty string effectively mean we are truncating the string in its entirety?

Won't the protobuf parser be unhappy with an empty string?

```ruby
client.read(content_length) if content_length > 0

client.print "HTTP/1.1 #{status} Bad Request\r\n"
client.print "Content-Type: application/octet-stream\r\n"
```
Contributor

IIUC, this is incorrect per the contract of the OTel collector. It will respond with the same content type it received, which in our case is application/x-protobuf.

The exporter should probably check the response's Content-Type header to ensure it is getting back what it requested.

If for some reason an intermediary responds with, say, text/plain, then we know something went wrong and should not attempt to parse the body as protobuf anyway.
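A sketch of that guard. The constant and method names here are hypothetical, not the exporter's actual code; the assumption is that the exporter sends Content-Type: application/x-protobuf and expects the same back.

```ruby
EXPECTED_CONTENT_TYPE = 'application/x-protobuf'

# Only treat the body as protobuf when the server echoed back the media
# type we sent; media-type parameters like "; charset=..." are ignored.
def protobuf_response?(response)
  media_type = response['Content-Type'].to_s.split(';').first.to_s.strip
  media_type.casecmp?(EXPECTED_CONTENT_TYPE)
end
```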


```ruby
client.print "HTTP/1.1 #{status} Bad Request\r\n"
client.print "Content-Type: application/octet-stream\r\n"
client.print "Content-Length: #{body.bytesize}\r\n"
```
Contributor

Can you add a test case for Transfer-Encoding: chunked without the Content-Length header, which will likely be the most common response from the collector?
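Such a test could be sketched against a throwaway local server like the one used elsewhere in this PR's tests. This is an illustration of the response shape only, not the PR's actual test code; Net::HTTP strips the chunked framing, so the client sees only payload bytes.

```ruby
require 'net/http'
require 'socket'

# Server that answers with Transfer-Encoding: chunked and no
# Content-Length, the shape a collector error response commonly takes.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]

Thread.new do
  client = server.accept
  nil until client.gets == "\r\n" # consume the request head
  client.print "HTTP/1.1 400 Bad Request\r\n"
  client.print "Content-Type: application/x-protobuf\r\n"
  client.print "Transfer-Encoding: chunked\r\n\r\n"
  # Chunked framing: <size in hex>\r\n<payload>\r\n, ended by a 0 chunk.
  %w[hello world].each do |payload|
    client.print "#{payload.bytesize.to_s(16)}\r\n#{payload}\r\n"
  end
  client.print "0\r\n\r\n"
  client.close
end

received = +''
status = nil
Net::HTTP.start('127.0.0.1', port) do |http|
  http.request(Net::HTTP::Get.new('/')) do |response|
    status = response.code
    response.read_body { |chunk| received << chunk }
  end
end
```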



Development

Successfully merging this pull request may close these issues.

Exporters reading response body have no size limit

3 participants