Add a built-in RE2 implementation based on re2js#290
jonbodner-buf wants to merge 11 commits into main
…h broken imports.
srikrsna
left a comment
Left some comments regarding the integration. I'll review the RE2 implementation in a second pass.
| /\\c[A-Z]/, // control character eg: /\cM\cJ/ | ||
| /\\u[0-9a-fA-F]{4}/, // UTF-16 code-unit | ||
| /\\0(?!\d)/, // NUL | ||
| /\[\\b.*\]/, // Backspace eg: [\b] |
| // can probably delete this since the RE2 engine will already reject them, but keep for now | ||
| for (const invalidPattern of invalidPatterns) { | ||
| if (invalidPattern.test(pattern)) { | ||
| throw new Error( | ||
| `Error evaluating pattern ${pattern}, invalid RE2 syntax`, | ||
| ); | ||
| } |
| } | ||
| const re = new RegExp(pattern, flags); | ||
| return re.test(this); | ||
| const re: RE2JS = RE2JS.compile(pattern, flagVal); |
My understanding is that flags are part of the syntax in RE2? Can we add support for them in the library instead of trying to identify them here? Or am I missing something?
cel-go also passes them directly to the regex engine without any preprocessing: https://github.com/google/cel-go/blob/646cdc1728643aec9499e3a00236ef1007a5d3fa/common/types/string.go#L156
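For context, here is a minimal sketch of the kind of preprocessing this comment is questioning. RE2 syntax allows flags inline, for example a leading `(?i)`, while the built-in `RegExp` constructor takes flags as a separate argument, so a caller targeting `RegExp` has to peel them off first. The helper name `splitLeadingFlags` is hypothetical and not taken from the PR, and the sketch only handles a single leading group of `i`, `m`, and `s`.

```typescript
// Hypothetical helper for illustration only; not the PR's code.
// Splits a leading RE2-style flag group such as "(?i)" or "(?ms)"
// from the rest of the pattern.
function splitLeadingFlags(pattern: string): { flags: string; rest: string } {
  const m = /^\(\?([ims]+)\)/.exec(pattern);
  if (m === null) {
    // No leading flag group: pass the pattern through unchanged.
    return { flags: "", rest: pattern };
  }
  return { flags: m[1] ?? "", rest: pattern.slice(m[0].length) };
}

// Usage: translate "(?i)hello" into a built-in RegExp with the "i" flag.
const { flags, rest } = splitLeadingFlags("(?i)hello");
const re = new RegExp(rest, flags);
console.log(flags, rest, re.test("HELLO"));
```

This approach misses flags that appear mid-pattern, scoped groups such as `(?i:...)`, and negated flags such as `(?-i)`, which is one reason to let an RE2 engine parse the flags itself.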
| "@bufbuild/re2": "0.4.0" | ||
| }, | ||
| "devDependencies": { | ||
| "@unicode/unicode-16.0.0": "^1.6.16", |
Is this a peer dependency that is required?
| "peggy": "^5.0.6", | ||
| "peggy-ts": "github:hudlow/peggy-ts#v0.0.9", | ||
| "expect-type": "^1.3.0" | ||
| "unicode-property-value-aliases": "^3.9.0" |
Is this a peer dependency that is required?
| "@unicode/unicode-16.0.0": "^1.6.16", | ||
| "unicode-property-value-aliases": "^3.9.0" |
I see that these are added as dev dependencies in the cel package as well. Will users end up needing them?
| }, | ||
| "dependencies": { | ||
| "@bufbuild/re2": "^0.1.0" |
The CEL spec says that its regular expressions follow RE2 syntax, but the existing ES implementation defaults to the regex engine built into ES runtimes. That engine is a backtracking implementation, and it has pathological cases that are susceptible to ReDoS attacks.
This PR adds a stripped-down version of RE2JS (https://github.com/le0pard/re2js) as a package, and integrates it into CEL as the default regex engine.
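To illustrate the pathological case mentioned above, here is a small sketch using only the built-in `RegExp`. The pattern `^(a+)+$` is a textbook nested-quantifier example and is not taken from this PR. On a typical backtracking engine, the time to reject an input of `n` letters followed by `!` grows roughly exponentially in `n`, so the sizes below are kept deliberately small. The helper name `timeMatch` is hypothetical.

```typescript
// Textbook ReDoS pattern (an assumption for illustration, not from the PR):
// nested quantifiers give a backtracking engine exponentially many ways
// to split a run of "a" characters before it can report a failed match.
const pattern = /^(a+)+$/;

// Hypothetical helper: times one failed match against "aaaa...a!".
function timeMatch(n: number): { matched: boolean; ms: number } {
  const input = "a".repeat(n) + "!"; // the trailing "!" forces a mismatch
  const start = Date.now();
  const matched = pattern.test(input);
  return { matched, ms: Date.now() - start };
}

// Keep n small: each extra "a" roughly doubles the work on a backtracking engine.
for (const n of [10, 14, 18]) {
  const { matched, ms } = timeMatch(n);
  console.log(`n=${n} matched=${matched} elapsed=${ms}ms`);
}
```

An RE2-style engine avoids this by matching in time linear in the input length, at the cost of not supporting features such as backreferences and lookaround.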