fix: Limit length of response body read to 4mb #2080
kaylareopelle wants to merge 11 commits into open-telemetry:main
Conversation
Limiting the read size may help prevent memory exhaustion exploits when the configured collector endpoint is attacker-controlled.
truncated = false

response.read_body do |chunk|
  if body.bytesize + chunk.bytesize <= RESPONSE_BODY_LIMIT
Before starting to read the response body, should we not first read the content-length or transfer-encoding and short circuit if the body exceeds the limit?
IIUC the Go collector will switch to chunked responses when the body exceeds its 2 KiB default buffer size. In cases where the response is less than 2 KiB, or the backend supports larger buffer sizes, we may get a Content-Length response and can exit early.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Content-Length
https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Transfer-Encoding
I like the idea of using the headers to make this more performant!
Just to make sure I'm on the same page, are you proposing:
- Use the transfer-encoding / content-length headers to determine body size
- Skip chunking when the body is less than the limit and just return the body
- Truncate the response using chunking for bodies that exceed the limit
Or are you suggesting something else?
My understanding is that it's either/or: the response will have either a Content-Length header or Transfer-Encoding: chunked.
If Content-Length is present and exceeds the limit, discard the response body and exit immediately.
If it's a chunked response, use the code you've introduced here.
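Roughly the dispatch I have in mind, sketched with stand-in names (read_response_body, RESPONSE_BODY_LIMIT, and collect_response_chunks are illustrative, not necessarily the exporter's real API):

def read_response_body(response)
  content_length = response.content_length # nil when Transfer-Encoding: chunked

  if content_length && content_length > RESPONSE_BODY_LIMIT
    # Content-Length tells us up front that the body is oversized:
    # drain it without buffering and report truncation immediately.
    response.read_body { |_chunk| } # drain so the connection stays reusable
    return ['', true]
  end

  # No Content-Length, so a chunked response: fall back to the
  # incremental reader that stops accumulating at the limit.
  collect_response_chunks(response)
end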
In a hacky reproduction of the problem (some malicious or misbehaving OTLP receiver responds with a large body), I still saw the whole large response body getting loaded into memory. The exporter's @http.request(request) call reads the full body before read_response_body runs, so the 4 MB cap never activates against a real HTTP server. The existing tests pass because WebMock stubs don't have real socket behavior. 😭
Adds a failing test that exercises the OTLP exporter's response body limit against a real TCPServer instead of WebMock stubs. WebMock patches Net::HTTP's read_body internals, so the existing stub-based tests pass even though the chunked reader doesn't work against real HTTP. With WebMock's adapter fully disabled, Net::HTTP#request eagerly reads the full body before read_response_body runs, causing IOError: "read_body called twice". The IOError is the symptom; the actual problem is that the full oversized response body is already in memory by that point, defeating the 4 MB cap. Reproduces the problem described in review comments on PR open-telemetry#2080.
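The setup looks roughly like this; read_request, respond_with_large_body, and span_data are invented helper/fixture names for illustration:

require 'socket'

WebMock.disable! # fully disable the adapter so Net::HTTP hits a real socket

server = TCPServer.new('127.0.0.1', 0) # port 0: let the OS pick a free port
port = server.addr[1]

server_thread = Thread.new do
  client = server.accept
  read_request(client)                              # drain the exporter's request
  respond_with_large_body(client, 10 * 1024 * 1024) # write a 10 MB response
  client.close
end

exporter = OpenTelemetry::Exporter::OTLP::Exporter.new(
  endpoint: "http://127.0.0.1:#{port}/v1/traces"
)
result = exporter.export([span_data])
server_thread.join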
Net::HTTP#request without a block eagerly reads the entire response
body into memory via reading_body's self.body call.
Refactor send_bytes to use @http.request(request) { |response| }
so the case statement runs before the body is read. The error path
uses streaming access for the chunked reader. Non-error paths
drain-and-discard with read_body { |_| } to preserve keep-alive
connections (read_body(nil) leaves the socket undrained).
redo doesn't work from within a nested block, so add a should_redo
flag to trigger it.
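A condensed sketch of that flow; SUCCESS, FAILURE, MAX_RETRIES, and the helpers here are placeholders rather than the exporter's exact names:

def send_bytes(bytes)
  request = build_request(bytes) # assumed helper
  retry_count = 0

  loop do
    should_redo = false

    @http.request(request) do |response| # block form: the case runs before the body is read
      case response
      when Net::HTTPOK
        response.read_body { |_chunk| } # drain-and-discard preserves keep-alive;
                                        # read_body(nil) would leave the socket undrained
        return SUCCESS
      when Net::HTTPServiceUnavailable
        response.read_body { |_chunk| }
        should_redo = true # redo can't fire from inside this nested block
      else
        body, truncated = read_response_body(response) # streaming error path
        log_status(body, truncated)
        return FAILURE
      end
    end

    next if should_redo && (retry_count += 1) <= MAX_RETRIES

    return FAILURE
  end
end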
Got past the IOError from trying to read the body twice; now verify that the chunked reader hits the 4 MB cap against a real HTTP response.
After read_response_body breaks out of the chunked read at 4 MB, Net::HTTP's reading_body still calls self.body which reads more data from the socket into memory. Our String is capped but response.body held 5.8 MB against a 10 MB response. Spies on the exporter to observe response.body size after read_response_body returns.
When read_response_body truncates at the 4MB limit, Net::HTTP's reading_body calls self.body after our block returns. This re-reads the entire remaining response from the socket into memory. Our break doesn't prevent this because reading_body has its "ensure to read body" safety net that fires before the socket closes. Calling @http.finish before break closes the socket. reading_body's self.body call hits a closed stream instead of re-reading. The resulting IOError is rescued only when truncation caused it. Unexpected IOErrors still propagate.
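Sketched out, with illustrative names, the reader and its call site look something like:

def collect_response_chunks(response)
  body = +'' # mutable buffer
  truncated = false

  response.read_body do |chunk|
    if body.bytesize + chunk.bytesize <= RESPONSE_BODY_LIMIT
      body << chunk
    else
      truncated = true
      # Close the connection before bailing out; otherwise reading_body's
      # "ensure to read body" safety net calls self.body after this block
      # returns and re-reads the rest of the response into memory.
      @http.finish
      break
    end
  end

  [body, truncated]
end

# The closed stream surfaces as an IOError raised out of Net::HTTP#request
# once reading_body's ensure runs, so rescue it only when we caused it:
body = ''
truncated = false
begin
  @http.request(request) do |response|
    body, truncated = collect_response_chunks(response)
  end
rescue IOError
  raise unless truncated # unexpected IOErrors still propagate
end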
send_bytes's sad path needs to know whether the chunked reader truncated the response body so it can log that appropriately. Instead of managing a shared-state instance variable, return that state from the reader and pass it into the logger.
Decode will raise an error on a truncated body and then the exporter will log a confusing "unexpected error decoding rpc.Status".
A truncated protobuf is invalid binary. Passing it to Google::Rpc::Status.decode raises "unexpected error decoding rpc.Status", which is rescued by our own error handling. The resulting log output about the decode error mentions nothing about the known-truncated body. When we know the body was truncated, log a direct oversized-body message and return early.
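The guard looks something like this sketch (the log_status name and message text are placeholders; Google::Rpc::Status.decode is the decoder named above):

def log_status(body, truncated)
  if truncated
    # We already know why decoding would fail, so say so directly.
    OpenTelemetry.handle_error(
      message: "OTLP exporter received a response body exceeding #{RESPONSE_BODY_LIMIT} bytes; body discarded"
    )
    return
  end

  status = Google::Rpc::Status.decode(body)
  OpenTelemetry.handle_error(message: "OTLP exporter received rpc.Status: #{status.message}")
rescue StandardError => e
  OpenTelemetry.handle_error(exception: e, message: 'unexpected error decoding rpc.Status')
end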
fix: refactor response reader
return ['', false] if response.nil?

body, truncated = collect_response_chunks(response)
body.force_encoding('UTF-8')
If the response is truncated, the code points may be incomplete, which could produce an invalid byte sequence when translating the binary into a UTF-8 String.
e.g. if the last character is multibyte but was cut off, it will fail. Take the Chinese character for the number one, 一 (U+4E00), which is encoded in UTF-8 as three bytes:
11100100 10111000 10000000
The first byte in that sequence declares that two more bytes follow, but when truncated we are left with:
11100100 10111000
IIUC the force_encoding method only changes the encoding attribute on the String and does not actually attempt to validate it. It will fail once someone tries to read the string... effectively false advertising. The consumer of the body then has to guard against an invalid encoding or handle the invalid byte sequences itself.
# U+4E00 (一, yī) is encoded in UTF-8 as 3 bytes: E4 B8 80
# Here we only have the first 2 bytes; the sequence is incomplete.
truncated = "\xE4\xB8".b
truncated.force_encoding("UTF-8")
puts truncated.encoding        #=> UTF-8
puts truncated.valid_encoding? #=> false
begin
  # Note: encode("UTF-8") on a string already tagged UTF-8 is a no-op and
  # would not raise; transcoding to another encoding forces validation.
  puts truncated.encode("UTF-16")
rescue Encoding::InvalidByteSequenceError => e
  puts "Error: #{e.message}"
  #=> Error: incomplete "\xE4\xB8" on UTF-8
end
What should we do?
I am not sure, but I am not a fan of flow control by exception. If the body is truncated, perhaps we should return the string with placeholders for invalid byte sequences, or a blank string?
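For example, String#scrub would swap invalid byte sequences for a placeholder up front instead of leaving the problem for the body's consumer (a sketch, not the actual diff):

body.force_encoding('UTF-8')
# scrub with no argument inserts U+FFFD (�); scrub('') drops the bad bytes
body = body.scrub unless body.valid_encoding?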
  [body, truncated]
rescue StandardError => e
  OpenTelemetry.handle_error(exception: e, message: 'error reading response body')
  ['', false]
This does not make sense to me. Does returning an empty string effectively mean we are truncating the string in its entirety?
Won't the protobuf parser be unhappy with an empty string?
client.read(content_length) if content_length > 0

client.print "HTTP/1.1 #{status} Bad Request\r\n"
client.print "Content-Type: application/octet-stream\r\n"
IIUC, this is incorrect per the contract of the otel collector. It will respond with the same content type it received, which in our case is application/x-protobuf.
The exporter should probably check the response's Content-Type header to ensure it is getting back what it requested.
If for some reason an intermediary responds with, say, text/plain, then we know something went wrong and should not attempt to parse the body as protobuf anyway.
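Something along these lines, as a sketch; the exact failure handling and result constant are placeholders:

content_type = response['Content-Type']
unless content_type&.start_with?('application/x-protobuf')
  OpenTelemetry.handle_error(
    message: "OTLP exporter received unexpected Content-Type #{content_type.inspect}; not parsing body as protobuf"
  )
  return FAILURE # placeholder result
end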
client.print "HTTP/1.1 #{status} Bad Request\r\n"
client.print "Content-Type: application/octet-stream\r\n"
client.print "Content-Length: #{body.bytesize}\r\n"
Can you add a test case for Transfer-Encoding: chunked without the Content-Length header, which will likely be the most common response from the collector?
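For the test server, such a response could be written roughly like this (the 64 KiB chunk size is arbitrary):

client.print "HTTP/1.1 #{status} Bad Request\r\n"
client.print "Content-Type: application/x-protobuf\r\n"
client.print "Transfer-Encoding: chunked\r\n" # note: no Content-Length
client.print "\r\n"
body.scan(/.{1,65536}/m) do |chunk|
  client.print "#{chunk.bytesize.to_s(16)}\r\n" # chunk size in hex
  client.print "#{chunk}\r\n"
end
client.print "0\r\n\r\n" # terminating zero-length chunk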
Our exporters read HTTP response bodies without any limit. A misconfigured or misbehaving server could send an arbitrarily large response, causing the exporter to read it all into memory.
This implements a 4 MB response body limit also implemented by:
Based on https://cwe.mitre.org/data/definitions/789.html
Fixes #2079
Combined with kaylareopelle#15 by @robbkidd