Skip to content

bug: spammer partition_exponential panics on shift overflow instead of rejecting too many generators #137

@Jr-kenny

Description

@Jr-kenny

The exponential account partitioner in the spammer has a guard that is meant to reject the case where you ask for more generators than the account space can support. That guard shifts by num_generators - 1, and once that exponent reaches the platform word size the shift itself overflows, so the guard never gets a chance to return its friendly error.

Where it lives:

if round_div(num_accounts, 1 << (num_generators - 1)) == 0 {
eyre::bail!("too many generators: it would result in a bucket with size 0");
}

if round_div(num_accounts, 1 << (num_generators - 1)) == 0 {
    eyre::bail!("too many generators: it would result in a bucket with size 0");
}

What happens, step by step:

  1. partition_accounts first checks num_generators > 0 and num_accounts >= num_generators, then dispatches to partition_exponential.
  2. With, say, num_accounts = 130 and num_generators = 65, both of those checks pass, so we reach the guard.
  3. The guard evaluates 1 << (num_generators - 1), which is 1 << 64 on a 64-bit target.
  4. A shift by an amount >= the word width is an overflow. In debug builds this panics with "attempt to shift left with overflow". In release builds the shift distance wraps (so 1 << 64 becomes 1 << 0 = 1), the guard sees a non-zero bucket size and lets the input through, and the boundary loop right below (accounts.rs#L112-L116) does the same 1 << j for j up to num_generators - 1 with wrapped denominators. The result is garbage boundaries (non-monotonic, some with start > end), so generators end up with overlapping or inverted account ranges and collide on nonces.

So the same input that should produce a clean "too many generators" error (the way (100, 9) already does) either crashes the process or silently corrupts the partition, depending on the build profile.

Repro as a unit test in the existing style (sits right next to partition_accounts_exponential in accounts.rs):

#[tokio::test]
async fn partition_accounts_exponential_rejects_too_many_generators() -> Result<()> {
    let result = PartitionMode::Exponential.partition_accounts(130, 65);
    assert!(result.is_err(), "expected graceful error, got {result:?}");
    Ok(())
}

On current main this panics at accounts.rs:103 with "attempt to shift left with overflow" instead of returning an error.

Proposed fix:

Short-circuit the guard so the shift only runs when the exponent is in range. Once num_generators - 1 reaches the word size, 2^(num_generators - 1) already exceeds any possible num_accounts, so the smallest bucket is empty and we want to bail anyway:

let shift = num_generators - 1;
if shift >= usize::BITS as usize || round_div(num_accounts, 1 << shift) == 0 {
    eyre::bail!("too many generators: it would result in a bucket with size 0");
}

That keeps the boundary loop below safe as well, since we never reach it with an out-of-range exponent anymore. With this change the repro returns the expected error and the existing partition tests still pass.

Happy to open a PR with the fix and the regression test if that is useful.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions