Skip to content

Fix Hash/Array fingerprint instability across JRuby versions#79

Merged
donoghuc merged 3 commits intologstash-plugins:mainfrom
donoghuc:jruby-10
Mar 31, 2026
Merged

Fix Hash/Array fingerprint instability across JRuby versions#79
donoghuc merged 3 commits intologstash-plugins:mainfrom
donoghuc:jruby-10

Conversation

@donoghuc
Copy link
Copy Markdown
Contributor

@donoghuc donoghuc commented Mar 26, 2026

JRuby 10 changed Hash#inspect spacing, producing different fingerprint bytes for nested Hash/Array fields. Use a pipe-delimited format instead of relying on Hash#inspect.

Thanks for contributing to Logstash! If you haven't already signed our CLA, here's a handy link: https://www.elastic.co/contributor-agreement/

Closes: #78

JRuby 10 changed Hash#inspect spacing, producing different fingerprint
bytes for nested Hash/Array fields. Use a  pipe-delimited format
instead of relying on Hash#inspect.
@donoghuc
Copy link
Copy Markdown
Contributor Author

Note: This is a behavioral change for users — existing fingerprints in production will change on upgrade regardless of fix approach. Needs consideration.

In practice this would only appply to complex ruby objects. I would imagine in practice most of the actual use case are for string values. While this change will change the sha for some complex objects, that "duplication" risk is one off. Trying to keep full behavior compatability would inevitably produce a messy and brittle solution that is likely not worth it.

Copy link
Copy Markdown
Contributor

@andsel andsel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general idea sounds good, I've left a doubt.

when Array
object.map { |e| to_canonical_string(e) }.join("|")
else
object
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't return a string in case of Object?

Suggested change
object
object.to_s

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My fault for the confusing name. We actually dont want to always produce a string (for example the hash of an integer 1 would be different than the string "1". I updated the helper name to avoid the inconsistency, but we dont want the .to_s here.

This commit renames the function to normalize objects. We dont actually need "string" values generally as the old name suggests.
@donoghuc donoghuc requested a review from andsel March 31, 2026 18:42
Copy link
Copy Markdown
Contributor

@andsel andsel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@donoghuc donoghuc merged commit 7152692 into logstash-plugins:main Mar 31, 2026
2 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fix Hash#inspect format dependency in fingerprint input

2 participants