Add LAQ (Learnable Amax Quantization) algorithm by realAsma · Pull Request #1247 · NVIDIA/Model-Optimizer

realAsma · 2026-04-13T16:40:12Z

Summary

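Add LAQ (Learnable Amax Quantization), a QAT (quantization-aware training) algorithm that learns separate pre-quantization and post-dequantization amax values during training. Forward pass: w_q = Q_STE(w / s_pre) * s_post, where each scale is s = amax / Q_max (s_pre is derived from the pre-quantization amax, s_post from the post-dequantization amax) and Q_STE denotes quantization with a straight-through estimator.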
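As a rough illustration of the forward formula above, here is a minimal, forward-only sketch in plain Python. The function name `laq_fake_quantize`, the symmetric clamp to [-Q_max, Q_max], and the default `q_max=7` (a signed 4-bit range) are assumptions made for this example; they are not taken from the PR's code. The straight-through estimator only affects the backward pass (rounding is treated as identity for gradients), so it does not appear here.

```python
def laq_fake_quantize(weights, amax_pre, amax_post, q_max=7):
    """Forward-only sketch of w_q = Q(w / s_pre) * s_post, with s = amax / q_max.

    Hypothetical helper for illustration; not the ModelOpt implementation.
    """
    s_pre = amax_pre / q_max    # scale applied before quantization
    s_post = amax_post / q_max  # scale applied after dequantization
    out = []
    for w in weights:
        q = round(w / s_pre)            # quantize to the integer grid
        q = max(-q_max, min(q_max, q))  # clamp to the representable range (assumed symmetric)
        out.append(q * s_post)          # dequantize with the separate post scale
    return out


# With amax_pre == amax_post this reduces to ordinary fake quantization.
same_scale = laq_fake_quantize([0.4, -2.6, 9.0], amax_pre=7.0, amax_post=7.0)
# A different post amax rescales the output without changing the integer codes.
rescaled = laq_fake_quantize([0.4, -2.6, 9.0], amax_pre=7.0, amax_post=14.0)
```

Keeping the two scales separate means the rounding grid (set by s_pre) and the output magnitude (set by s_post) can move independently during training.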

Key options:

learnable_amax: controls which amax parameters are trainable — ["pre", "post"] (both), "post" (post-only, default), "pre" (pre-only), or [] (frozen)
tied_amax: when True, pre and post share a single tensor (requires both to have the same learnable state)
scale_algorithm: optional initial scale calibration (mse, local_hessian, or max) before learning begins
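The interaction between learnable_amax and tied_amax described above can be sketched as a small validation helper. This is only an illustration under stated assumptions: `resolve_learnable_amax` is a hypothetical name, and the PR's actual config handling may differ in structure and error messages.

```python
def resolve_learnable_amax(learnable_amax="post", tied_amax=False):
    """Return (pre_trainable, post_trainable) for the documented option values.

    Hypothetical helper for illustration; not the ModelOpt API.
    """
    # Accept either a single string ("pre" / "post") or a list such as ["pre", "post"] or [].
    if isinstance(learnable_amax, str):
        selected = {learnable_amax}
    else:
        selected = set(learnable_amax)

    unknown = selected - {"pre", "post"}
    if unknown:
        raise ValueError(f"unknown learnable_amax entries: {sorted(unknown)}")

    pre_trainable = "pre" in selected
    post_trainable = "post" in selected

    # A tied tensor is shared, so it cannot be trainable on one side and frozen on the other.
    if tied_amax and pre_trainable != post_trainable:
        raise ValueError("tied_amax=True requires pre and post to have the same learnable state")

    return pre_trainable, post_trainable


default_flags = resolve_learnable_amax()                                # post-only (default)
both_tied = resolve_learnable_amax(["pre", "post"], tied_amax=True)     # both trainable, shared tensor
frozen = resolve_learnable_amax([])                                     # nothing trainable
```

Under this reading, the default "post" setting combined with tied_amax=True would be rejected, since only one of the two sides is trainable.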

Test plan

Run unit tests: pytest tests/unit/torch/quantization/test_laq.py
Run recipe tests: pytest tests/unit/recipe/test_laq_recipes.py
Run GPU tests: pytest tests/gpu/torch/quantization/test_laq_cuda.py
Verify LAQ with llm_qat example end-to-end

🤖 Generated with Claude Code

coderabbitai · 2026-04-13T16:40:20Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

🗂️ Base branches to auto review (3)

main
release/.*
feature/.*

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 88b7c0f6-a5df-40aa-aaa2-1869d912bd0e


codecov · 2026-04-13T16:57:44Z

Codecov Report

❌ Patch coverage is 75.54585% with 56 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.30%. Comparing base (1f1c250) to head (8866b80).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| modelopt/torch/quantization/model_calib.py | 76.11% | 16 Missing ⚠️ |
| modelopt/torch/quantization/tensor_quant.py | 48.38% | 16 Missing ⚠️ |
| .../torch/quantization/nn/modules/tensor_quantizer.py | 82.55% | 15 Missing ⚠️ |
| modelopt/torch/quantization/triton/fp4_kernel.py | 65.21% | 8 Missing ⚠️ |
| modelopt/torch/quantization/config.py | 92.85% | 1 Missing ⚠️ |

Additional details and impacted files

@@                Coverage Diff                 @@
##           asma/new-qat-2    #1247      +/-   ##
==================================================
- Coverage           76.94%   72.30%   -4.65%     
==================================================
  Files                 355      355              
  Lines               41402    41622     +220     
==================================================
- Hits                31858    30096    -1762     
- Misses               9544    11526    +1982

| Flag | Coverage Δ | |
|---|---|---|
| examples | `42.34% <29.25%> (-0.07%)` | ⬇️ |
| gpu | `48.25% <75.54%> (-9.62%)` | ⬇️ |

Flags with carried forward coverage won't be shown.


realAsma requested review from a team as code owners April 13, 2026 16:40

realAsma requested review from AAnoosheh, cjluo-nv, h-guo18, meenchen, mxinO and shengliangxu and removed request for a team April 13, 2026 16:40

realAsma force-pushed the asma/new-qat-2 branch from 7718d64 to 4c5a889 Compare April 16, 2026 13:52

Add LAQ (Learnable Amax Quantization) algorithm

8866b80

Also clean up llm_qat example configs and fix pad_token_id handling.

Signed-off-by: realAsma <akuriparambi@nvidia.com>

realAsma force-pushed the asma/laq-algorithm branch from 16832fb to 8866b80 Compare April 16, 2026 19:32

realAsma wants to merge 1 commit into asma/new-qat-2 from asma/laq-algorithm
