Tunable GPU block sizes #735

@JPRichings

I noticed that

const int NUM_THREADS_PER_BLOCK = 128;

is fixed for all target hardware, and is somewhat larger than commonly recommended tuning values.

The plan is to replace this with a compile-time default plus a setter/getter interface, so that the block size can be varied at run time in performance-tuning tests.
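As a rough illustration of what this could look like, here is a minimal host-side sketch in C. All names here (`THREADS_PER_BLOCK_DEFAULT`, `THREADS_PER_BLOCK_MAX`, `threads_per_block_get`, `threads_per_block_set`, `blocks_for`) are hypothetical and not taken from the existing code, and the upper bound of 1024 is an assumption that would need checking against each target device.

```c
#include <assert.h>

/* Compile-time default (hypothetical macro name). It can be overridden
 * at build time, e.g. with -DTHREADS_PER_BLOCK_DEFAULT=64 */
#ifndef THREADS_PER_BLOCK_DEFAULT
#define THREADS_PER_BLOCK_DEFAULT 128
#endif

/* Assumed upper bound on threads per block; the real limit is
 * device-dependent and should be queried or configured per target. */
#define THREADS_PER_BLOCK_MAX 1024

/* Run-time value, initialised from the compile-time default */
static int threads_per_block_ = THREADS_PER_BLOCK_DEFAULT;

/* Getter: current number of threads per block */
int threads_per_block_get(void) {
  return threads_per_block_;
}

/* Setter: returns 0 on success, or -1 (leaving the current value
 * unchanged) if the request is outside [1, THREADS_PER_BLOCK_MAX] */
int threads_per_block_set(int nthreads) {
  if (nthreads < 1 || nthreads > THREADS_PER_BLOCK_MAX) return -1;
  threads_per_block_ = nthreads;
  return 0;
}

/* Number of blocks needed to cover n work items at the current
 * block size (rounded up) */
int blocks_for(int n) {
  int tpb = threads_per_block_get();
  assert(tpb > 0);
  return (n + tpb - 1) / tpb;
}
```

A tuning test could then loop over candidate sizes (say 32, 64, 128, 256), call `threads_per_block_set()` before each timed kernel launch, and use `blocks_for()` to size the grid. The sketch does not enforce that the value is a multiple of the warp or wavefront size; whether the setter should reject such values is an open design choice.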
