us006 done#652
Conversation
isaacs
left a comment
There was a problem hiding this comment.
This is overall a somewhat reasonable approach, adding a limiter and conditionally using it when a concurrency option is used. However, the ...WithConcurrency methods must be DRY-ed with their non-concurrent counterparts, or else it nearly doubles the surface area for bugs and changes.
The limitation of a minimum concurrency value of 8 is arbitrary and unclear as to why it should be required. For example, on machines with fewer than 8 CPUs, it's effectively already limited to a lower concurrency value than this, so I'm unconvinced that it's actually required.
Also, please write a human commit message, I have no idea what us006 is.
| skip: () => void | ||
| } | ||
|
|
||
| class AsyncReadLimiter { |
There was a problem hiding this comment.
Classes should have their own module. Move this to src/async-read-limiter.ts
| constructor(concurrency: number, signal?: AbortSignal) { | ||
| this.concurrency = concurrency | ||
| this.#signal = signal | ||
| /* c8 ignore start */ |
There was a problem hiding this comment.
This should have unit tests that exercise the signal abort behavior when concurrency is used.
| : Array.isArray(ignore) ? new Ignore(ignore, opts) | ||
| : ignore | ||
|
|
||
| type PendingRead = { |
| this.walkCB2(target, patterns, new Processor(this.opts), cb) | ||
| } | ||
|
|
||
| walkCBWithConcurrency( |
There was a problem hiding this comment.
Duplicating methods is a bit excessive. Rather than adding walkCBWithConcurrency, walkCB2WithConcurrency, etc, it'd be better to just detect whether or not concurrency is limited by the presence of a this.#readdirLimiter defined in the constructor.
Otherwise, bugfixes and features have to be done in multiple places, it's possible for them to go out of sync with one another, etc.
| ) { | ||
| const limiter = this.#readdirLimiter | ||
| /* c8 ignore start */ | ||
| if (!limiter) { |
There was a problem hiding this comment.
Rather than an ignored check, it'd be better to define a this: Glob & { ... } type to the method, then guarantee it's true prior to calling it. (Minor nit, not always easy/possible, but better to let TSC do the check for us at build time rather than add run-time cost.)
| process.platform | ||
| : 'linux' | ||
|
|
||
| export const MIN_ASYNC_READDIR_CONCURRENCY = 8 |
There was a problem hiding this comment.
Why is this limited to 8? Feels arbitrary. What happens when it's lower?
There should really be no reason why it can't be set at 1, in fact; it'd just mean that each readdir waits for the one before it to complete before proceeding. Would be slow, of course, but also very memory and CPU efficient.
| pattern: string | string[], | ||
| options: GlobOptions = {}, | ||
| ) { | ||
| assertNoSyncConcurrency(options.concurrency) |
There was a problem hiding this comment.
Why is this necessary? Won't it assert in the constructor anyway?
| 'async glob walks need enough concurrent directory reads to avoid starvation in the walker fan-out' | ||
|
|
||
| const invalidConcurrencyMessage = () => | ||
| `invalid concurrency option: ${concurrencyMinimumMessage()} because ${concurrencyReason}` |
There was a problem hiding this comment.
Why is this a method just to return a constant string? just define the string and use that.
| `concurrency must be a positive integer greater than or equal to ${MIN_ASYNC_READDIR_CONCURRENCY}` | ||
|
|
||
| const concurrencyReason = | ||
| 'async glob walks need enough concurrent directory reads to avoid starvation in the walker fan-out' |
| return concurrency | ||
| } | ||
|
|
||
| export const assertNoSyncConcurrency = ( |
There was a problem hiding this comment.
These assertion methods should not be exported if they're only used in this module. Also, these can be tightened up considerably, something like this in src/index.ts
const validConcurrency = (concurrency: number | undefined, sync: boolean) =>
concurrency === undefined ? true
: !sync && Number.isInteger(concurrency) && concurrency > 0
const assertValidConcurrency = (concurrency: number | undefined, sync: boolean) => {
if (!validConcurrency) {
throw new Error(
sync ? 'concurrency not supported for sync glob operations'
: `Invalid concurrency value: ${concurrency}`
)
}Then in the various methods:
export function globSync(...) {
assertValidConcurrency(options.concurrency, true)
...
}
export function glob(...) {
assertValidConcurrency(options.concurrency, false)
...
}|
Thank you for your response. However, I didn't mean to offer this as a PR to you, I just wanted to keep this within my fork for now. The reason is that this PR is the result of an experiment with a multi-agentic setup that itself is still in early stages. Therefore I don't yet trust its output. I will re-open a PR once I have an update I can fully support as if I had written myself, and have my coworkers done a thorough review as well. Sorry for the inconvenience. |
|
No worries! Hopefully my PR feedback is useful in training your system. |
No description provided.