fix: admin-adjustable max columns with a clear overflow message#4669
fix: admin-adjustable max columns with a clear overflow message#4669aglinxinyuan merged 23 commits intoapache:mainfrom
Conversation
|
#3825 is already closed? |
We want this parameter to be adjustable by the admins. I am OK with fixing it. |
|
@kunwp1 @SarahAsad23 Feel free to chime in based on a related use case with a collaborator. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4669 +/- ##
============================================
+ Coverage 42.69% 42.71% +0.02%
+ Complexity 2186 2183 -3
============================================
Files 1031 1032 +1
Lines 38111 38161 +50
Branches 4004 4006 +2
============================================
+ Hits 16270 16301 +31
- Misses 20825 20841 +16
- Partials 1016 1019 +3
*This pull request uses carry forward flags. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Let's make it a large fixed number, unless there is a disadvantage for that. |
The disadvantage is that univocity allocates memory per parser based on that number (10,000 is negligible, but something like 1,000,000 is not when you want to scale the platform). |
That PR was closed after we decided in a meeting to keep it static, as @aglinxinyuan suggested (it only fixes the display of CSVs in the result panel with more columns). The problem is that users are hitting the 512 limit and need it increased. |
This makes sense to me now. Why do we also need batch size adjustable for result table? Can we just pick a fixed number? |
That setting I added because the result panel settings menu seems a little empty, and I thought it might be a nice feature for users. I can remove it if it should be in a different PR or if it's not needed. |
Let's don't over complicate the settings. |
|
Agreed. Let's take a simple solution. |
|
@aglinxinyuan, please review again. I removed the other column-per-result-panel setting. |
|
@aglinxinyuan please review |
…nivocity # Conflicts: # frontend/src/app/workspace/component/result-panel/result-table-frame/result-table-frame.component.spec.ts
… into fix/adjustable_univocity
94e52f4 to
869a85b
Compare
aglinxinyuan
left a comment
There was a problem hiding this comment.
All four items addressed, thanks. Two small nits inline, plus PR description still says "the default lives in default.conf" (no longer true after the csv block was removed) and lists ConfigResource in the refactor scope (no longer touched). Please update.
|
Please fix the frontend format. |
Head branch was pushed to by a user without write access
What changes were proposed in this PR?
csv_parser_max_columnsto Admin → Settings → Result Panel. The CSV scan operator reads this fromsite_settingsat runtime.RuntimeException("Max columns of N exceeded.")Instead of letting Univocity's raw stack trace through. Detection looks at the cause(
ArrayIndexOutOfBoundsException) and parses the index from both the Java 8 and Java 9+ message formats — needed because Univocity's own hint string is silently dropped on Java 9+.site_settingswith fallback" pattern intocommon/dao/SiteSettings, replacing inline jOOQ inCSVScanSourceOpExecandDatasetResource.Any related issues, documentation, discussions?
closes #4589
Related #3825
How was this PR tested?
CSVScanSourceOpExecSpec: happy path, end-of-input, overflow translation against a real parser, andisColumnOverflowcases for both AIOOBE message formats.admin-settings.component.spec.tsfor load/save/reset of the max columns setting.sbt Test/compileclean across DAO, ConfigService, WorkflowOperator, FileService. Frontend specs need Node ≥ 20.19 to run.Max columns of N exceeded.Was this PR authored or co-authored using generative AI tooling?
Co-authored with: Claude Opus 4.7