[6034518] Downgrade TRT support for remote autotuning in Autotune from 10.16 to 10.15#1259

Merged
gcunhase merged 9 commits into NVIDIA:main from gcunhase:dev/gcunhasergio/6034518_autotune_trt10.15
Apr 15, 2026

Conversation

@gcunhase
Contributor

@gcunhase gcunhase commented Apr 14, 2026

What does this PR do?

Type of change: Bug fix

Remote autotuning is supported in TensorRT from version 10.15, but fails with Autotune as it's checking for 10.16+. This PR fixes that check and updates documentation accordingly.

Usage

# Add a code snippet demonstrating how to use this
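
A sketch of the expected invocation, assembled from the example CLI summarized in the review notes for `examples/onnx_ptq/autotune/README.md` (the entry point and the model-path flag are assumptions, not copied from the merged docs):

```shell
# Hypothetical sketch: --use_trtexec enables the trtexec workflow, and
# remote autotuning requires --safe and --skipInference in the trtexec args.
python -m modelopt.onnx.quantization \
    --onnx_path=model.onnx \
    --use_trtexec \
    --trtexec_benchmark_args='--remoteAutoTuningConfig=<config> --safe --skipInference'
```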

Testing

See bug 6034518.

Before your PR is "Ready for review"

  • Is this change backward compatible?: ✅
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: N/A
  • Did you write any new necessary tests?: N/A
  • Did you update Changelog?: ✅

Summary by CodeRabbit

  • Documentation

    • Added a Remote Autotuning guide for TensorRT 10.15+ with CLI examples; updated examples to require --safe --skipInference.
  • Updates

    • Lowered TensorRT minimum requirement for remote autotuning from 10.16 to 10.15.
    • Clarified CLI help text for trtexec/autotune arguments.
  • Bug Fixes

    • trtexec-based autotuning now verifies the trtexec executable version when checking compatibility.

Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
@gcunhase gcunhase requested a review from a team as a code owner April 14, 2026 16:49
@coderabbitai
Contributor

coderabbitai bot commented Apr 14, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Added a trtexec detection helper, lowered the TensorRT version requirement for remote autotuning to 10.15, updated benchmark logic to use trtexec (appending --safe and --skipInference as needed), clarified the CLI help, and added documentation/examples for trtexec-based remote autotuning.

Changes

Cohort / File(s) Summary
Changelog & docs
CHANGELOG.rst, docs/source/guides/9_autotune.rst, examples/onnx_ptq/autotune/README.md
Lowered TensorRT requirement from 10.16 → 10.15; added/updated remote-autotuning docs and example CLI showing --use_trtexec with --trtexec_benchmark_args='--remoteAutoTuningConfig=... --safe --skipInference'.
CLI help & benchmark behavior
modelopt/onnx/quantization/__main__.py, modelopt/onnx/quantization/autotune/benchmark.py
Clarified --autotune_trtexec_args help text. Benchmark now uses _check_for_trtexec, updates version checks/messages to 10.15, and conditionally appends --safe and --skipInference to trtexec args; fallback warning updated.
trtexec detection util
modelopt/onnx/quantization/ort_utils.py
Added _check_for_trtexec(min_version: str = "10.0") -> str to locate trtexec, run it, parse TensorRT version from output, validate against min_version, handle timeouts/errors, log results, and raise ImportError on failure.

Sequence Diagram(s)

sequenceDiagram
    participant User as User/CLI
    participant Bench as AutotuneBenchmark
    participant Util as _check_for_trtexec
    participant TrtExec as trtexec
    participant Remote as Remote AutoTuner

    User->>Bench: invoke autotune (--use_trtexec + args)
    Bench->>Util: probe `trtexec` binary and parse version
    Util-->>Bench: returns detected version or raises ImportError
    Bench->>TrtExec: launch `trtexec` with --remoteAutoTuningConfig + --safe + --skipInference
    TrtExec->>Remote: perform remote autotuning
    Remote-->>TrtExec: tuning results
    TrtExec-->>Bench: benchmark/tuning output

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 1 warning)

  • Security Anti-Patterns — ❌ Error: the PR introduces five # nosec directives without following the required exception process per SECURITY.md. Resolution: remove the # nosec comments and add inline explanations, or add a Security Exception section to the PR description and request @NVIDIA/modelopt-setup-codeowners review.
  • Docstring Coverage — ⚠️ Warning: docstring coverage is 66.67%, below the required 80.00% threshold. Write docstrings for the functions missing them.
✅ Passed checks (2 passed)
  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title check — ✅ Passed: the title accurately summarizes the main change: downgrading TensorRT version support for remote autotuning from 10.16 to 10.15, which is the primary bug fix across all modified files.


@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
modelopt/onnx/quantization/autotune/benchmark.py (1)

209-228: ⚠️ Potential issue | 🔴 Critical

Remove early return so remote autotuning args are actually applied.

Line 218 returns from __init__ before Line 227 extends self._base_cmd, so the --remoteAutoTuningConfig/--safe args never make it into the executed command when TRT >= 10.15. This breaks remote autotuning despite passing the version gate.

💡 Proposed fix
-        trtexec_args = self.trtexec_args or []
+        trtexec_args = list(self.trtexec_args or [])
         has_remote_config = any("--remoteAutoTuningConfig" in arg for arg in trtexec_args)

         if has_remote_config:
             try:
                 _check_for_tensorrt(min_version="10.15")
                 self.logger.debug("TensorRT Python API version >= 10.15 detected")
                 if "--safe" not in trtexec_args:
                     self.logger.warning(
                         "Remote autotuning requires '--safe' to be set. Adding it to trtexec arguments."
                     )
-                    self.trtexec_args.append("--safe")
-                return
+                    trtexec_args.append("--safe")
             except ImportError:
                 self.logger.warning(
                     "Remote autotuning is not supported with TensorRT version < 10.15. "
                     "Removing --remoteAutoTuningConfig from trtexec arguments"
                 )
                 trtexec_args = [
                     arg for arg in trtexec_args if "--remoteAutoTuningConfig" not in arg
                 ]
         self._base_cmd.extend(trtexec_args)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/onnx/quantization/autotune/benchmark.py` around lines 209 - 228, The
constructor's remote-autotuning branch returns early so trtexec args like
"--safe" or "--remoteAutoTuningConfig" never get appended to the final command;
remove the early return in the block that calls
_check_for_tensorrt(min_version="10.15") and instead ensure any modifications to
self.trtexec_args (and filtered local trtexec_args) are followed by extending
self._base_cmd with those trtexec_args so the arguments are applied; locate the
logic around _check_for_tensorrt, self.trtexec_args, trtexec_args and
self._base_cmd in the __init__ of the Benchmark class and adjust flow so the
successful TensorRT path does not exit before
self._base_cmd.extend(trtexec_args) executes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: aca1a5f5-c803-49d9-8889-37357600cffb

📥 Commits

Reviewing files that changed from the base of the PR and between b6c6ec3 and 2097254.

📒 Files selected for processing (5)
  • CHANGELOG.rst
  • docs/source/guides/9_autotune.rst
  • examples/onnx_ptq/autotune/README.md
  • modelopt/onnx/quantization/__main__.py
  • modelopt/onnx/quantization/autotune/benchmark.py

@ajrasane
Contributor

PR #1259 Review: Downgrade TRT support for remote autotuning from 10.16 to 10.15

Verdict: Request Changes — core fix is correct, but docs have formatting issues and there's a code/doc inconsistency.


  1. RST formatting errors — docs/source/guides/9_autotune.rst
  • Missing blank line after .. code-block:: bash — RST requires a blank line between the directive and content. All other code blocks in this file follow this pattern.
  • Wrong indentation — content uses 2-space indent but every other code block in the file uses 3-space indent. This will prevent proper rendering.
  • Stray Markdown fence — there's a ``` closing fence after the code block that doesn't belong in RST. Will render as literal text.
  • Leading space on " Other TensorRT benchmark options..." line — creates an unintended blockquote.
  2. --skipInference inconsistency between docs and code

Docs (both RST and README) now state --safe --skipInference must be enabled, but benchmark.py:211-215 only auto-adds --safe — no check or auto-add for --skipInference:

  if "--safe" not in trtexec_args:
      self.logger.warning(...)
      self.trtexec_args.append("--safe")
  # No equivalent for --skipInference

Either:

  • Add a similar auto-add for --skipInference in the code (preferred for consistency), or
  • Clarify in docs that --skipInference is recommended but not enforced
  3. Minor: __main__.py help text

The updated help says "Only relevant with the 'trtexec' workflow enabled" — would be clearer as "Only relevant when --use_trtexec is set" since that's the actual flag name.

What looks good

  • Version downgrade 10.16 → 10.15 in benchmark.py — correct and consistent across all 3 occurrences
  • CHANGELOG.rst entry — properly placed
  • README.md updates — version numbers and --skipInference addition are consistent

@codecov

codecov bot commented Apr 14, 2026

Codecov Report

❌ Patch coverage is 14.63415% with 35 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.72%. Comparing base (b6c6ec3) to head (14d425c).
⚠️ Report is 7 commits behind head on main.

Files with missing lines:
  • modelopt/onnx/quantization/ort_utils.py — 14.28% patch coverage, 30 lines missing ⚠️
  • modelopt/onnx/quantization/autotune/benchmark.py — 16.66% patch coverage, 5 lines missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1259      +/-   ##
==========================================
- Coverage   76.90%   76.72%   -0.19%     
==========================================
  Files         350      355       +5     
  Lines       40524    41453     +929     
==========================================
+ Hits        31166    31804     +638     
- Misses       9358     9649     +291     
Flag coverage Δ:
  • examples — 43.78% <12.19%> (+1.15%) ⬆️
  • gpu — 57.38% <14.63%> (-0.15%) ⬇️
  • unit — 55.68% <14.63%> (+0.07%) ⬆️

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

…d for qdq tuning

Signed-off-by: dmoodie <dmoodie@nvidia.com>
@copy-pr-bot

copy-pr-bot bot commented Apr 15, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
@gcunhase gcunhase force-pushed the dev/gcunhasergio/6034518_autotune_trt10.15 branch from cf42607 to c957365 on April 15, 2026 14:21
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modelopt/onnx/quantization/ort_utils.py`:
- Line 58: Remove the "# nosec" suppressions on the subprocess import and calls
in ort_utils.py; replace any unsafe subprocess usage with secure patterns (use
subprocess.run with a list of args, check=True, capture stdout/stderr, avoid
shell=True, and validate inputs or use shlex.split when necessary) and add
proper error handling/logging around the call; if you cannot convert the call to
a safe pattern, remove the nosec and add an explicit note in the PR requesting
`@NVIDIA/modelopt-setup-codeowners` approval with justification. Reference: the
subprocess import line and the subprocess call sites in ort_utils.py.
- Around line 63-70: The version extraction currently matches any dotted number
first (regex (\d+(\.\d+)+)) and so returns CUDA/driver versions from
version_str; update the logic in the TensorRT parsing block (where version_str
is inspected and the first regex run) to try a TensorRT-specific pattern first
(e.g., look for "TensorRT" or "[TensorRT v...]" and extract its dotted version
or convert vNNNNNN to x.y format) before falling back to generic dotted-number
matching; ensure references to the existing variables and match handling in this
function (the match variable, the first regex branch and the "[TensorRT v...]"
fallback branch) are updated so the TensorRT-specific match is tested and
returned prior to the generic (\d+(\.\d+)+) match.
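
The ordering the comment asks for — TensorRT-specific patterns first, generic dotted-number match last — might look like this (a sketch; the six-digit "vNNNNNN" form is taken from the comment and not verified against real trtexec output):

```python
import re


def parse_trt_version(version_str):
    """Try TensorRT-specific patterns before the generic dotted-number
    fallback, so CUDA/driver versions in the same output are not matched."""
    # 1) "TensorRT v10.15.0" style dotted version
    m = re.search(r"TensorRT\s+v?(\d+(?:\.\d+)+)", version_str)
    if m:
        return m.group(1)
    # 2) "[TensorRT v101500]" style: vNNNNNN -> "major.minor"
    m = re.search(r"\[TensorRT v(\d{6})\]", version_str)
    if m:
        raw = m.group(1)
        return f"{int(raw[:2])}.{int(raw[2:4])}"
    # 3) Generic fallback; may match non-TensorRT versions, so it runs last.
    m = re.search(r"(\d+(?:\.\d+)+)", version_str)
    return m.group(1) if m else None
```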

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 3972ea4d-99f9-463a-940c-61ade3610b89

📥 Commits

Reviewing files that changed from the base of the PR and between 2097254 and cf42607.

📒 Files selected for processing (2)
  • modelopt/onnx/quantization/autotune/benchmark.py
  • modelopt/onnx/quantization/ort_utils.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • modelopt/onnx/quantization/autotune/benchmark.py

Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
@gcunhase
Contributor Author

gcunhase commented Apr 15, 2026


Addressed all concerns, PTAL again @ajrasane. Thanks!

Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
@kevalmorabia97
Collaborator

I think there are 3 or 4 places in modelopt.onnx where we use subprocess to run trtexec. As a separate PR, can you unify all into a single function run_trtexec(args: list[str]) or something like this and call that function everywhere so we only have 1 place where # nosec is needed?
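
A unified helper along the lines suggested might look like this (hypothetical; the `trtexec_bin` parameter is added here only so the sketch can be exercised with another binary, and is not part of the suggestion):

```python
import shutil
import subprocess


def run_trtexec(args, trtexec_bin="trtexec", timeout=600):
    """Single choke point for launching trtexec, so the security
    exception and its justification live in exactly one place."""
    path = shutil.which(trtexec_bin)
    if path is None:
        raise FileNotFoundError(f"{trtexec_bin} not found in PATH")
    # List-of-args form with no shell=True: arguments cannot be shell-injected.
    return subprocess.run(  # nosec B603 - fixed executable, vetted argument list
        [path, *args], capture_output=True, text=True, timeout=timeout, check=True
    )
```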

Contributor

@ajrasane ajrasane left a comment


LGTM. Please merge after the subprocess module import issue is resolved.

@gcunhase
Contributor Author

I think there are 3 or 4 places in modelopt.onnx where we use subprocess to run trtexec. As a separate PR, can you unify all into a single function run_trtexec(args: list[str]) or something like this and call that function everywhere so we only have 1 place where # nosec is needed?

Will do, thank you!

@kevalmorabia97
Collaborator

/ok to test 14d425c

@gcunhase gcunhase merged commit dec2952 into NVIDIA:main Apr 15, 2026
43 of 44 checks passed
@gcunhase
Contributor Author

@kevalmorabia97 PTAL at #1268. Thanks.

