
Add LAQ (Learnable Amax Quantization) algorithm#1247

Open
realAsma wants to merge 1 commit into asma/new-qat-2 from asma/laq-algorithm

Conversation

@realAsma (Contributor)

Summary

Add LAQ (Learnable Amax Quantization), a QAT algorithm that learns separate pre-quantization and post-dequantization amax values during training. Forward pass: w_q = Q_STE(w / s_pre) * s_post, where s = amax / Q_max.

Key options:

  • learnable_amax: controls which amax parameters are trainable — ["pre", "post"] (both), "post" (post-only, default), "pre" (pre-only), or [] (frozen)
  • tied_amax: when True, pre and post share a single tensor (requires both to have the same learnable state)
  • scale_algorithm: optional initial scale calibration (mse, local_hessian, or max) before learning begins
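The forward pass and the learnable-amax options above can be sketched as follows. This is a minimal illustration, not the actual modelopt implementation: `laq_forward` is a hypothetical helper, and the quantizer internals (rounding mode, clamping, parameter layout) are assumptions.

```python
import torch

def laq_forward(w, amax_pre, amax_post, num_bits=8):
    """Hypothetical sketch of the LAQ forward pass:
    w_q = Q_STE(w / s_pre) * s_post, with s = amax / Q_max.
    Uses a straight-through estimator (STE) so gradients reach
    both amax parameters despite the non-differentiable rounding."""
    q_max = 2 ** (num_bits - 1) - 1   # e.g. 127 for int8 (assumed symmetric)
    s_pre = amax_pre / q_max          # pre-quantization scale
    s_post = amax_post / q_max        # post-dequantization scale
    w_scaled = w / s_pre
    w_rounded = torch.clamp(torch.round(w_scaled), -q_max, q_max)
    # STE: forward uses the rounded value, backward treats rounding as identity
    w_q = w_scaled + (w_rounded - w_scaled).detach()
    return w_q * s_post

# learnable_amax=["pre", "post"]: both amax values are trainable parameters.
# With tied_amax=True they would instead share a single tensor.
w = torch.randn(4, 4)
amax_pre = torch.nn.Parameter(w.abs().max().clone())
amax_post = torch.nn.Parameter(w.abs().max().clone())
w_q = laq_forward(w, amax_pre, amax_post)
w_q.sum().backward()
# Gradients flow to both scales: to amax_pre through the STE path,
# and to amax_post through the final multiplication.
```

Freezing a side (e.g. learnable_amax="post") would correspond to leaving the other amax as a plain buffer rather than a Parameter.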

Test plan

  • Run unit tests: pytest tests/unit/torch/quantization/test_laq.py
  • Run recipe tests: pytest tests/unit/recipe/test_laq_recipes.py
  • Run GPU tests: pytest tests/gpu/torch/quantization/test_laq_cuda.py
  • Verify LAQ with llm_qat example end-to-end

🤖 Generated with Claude Code

@realAsma realAsma requested review from a team as code owners April 13, 2026 16:40
@realAsma realAsma requested review from AAnoosheh, cjluo-nv, h-guo18, meenchen, mxinO and shengliangxu and removed request for a team April 13, 2026 16:40

coderabbitai bot commented Apr 13, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

🗂️ Base branches to auto review (3)
  • main
  • release/.*
  • feature/.*

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 88b7c0f6-a5df-40aa-aaa2-1869d912bd0e



codecov bot commented Apr 13, 2026

Codecov Report

❌ Patch coverage is 75.54585% with 56 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.30%. Comparing base (1f1c250) to head (8866b80).

Files with missing lines                                 Patch %   Missing lines
modelopt/torch/quantization/model_calib.py               76.11%    16 ⚠️
modelopt/torch/quantization/tensor_quant.py              48.38%    16 ⚠️
.../torch/quantization/nn/modules/tensor_quantizer.py   82.55%    15 ⚠️
modelopt/torch/quantization/triton/fp4_kernel.py         65.21%    8 ⚠️
modelopt/torch/quantization/config.py                    92.85%    1 ⚠️
Additional details and impacted files
@@                Coverage Diff                 @@
##           asma/new-qat-2    #1247      +/-   ##
==================================================
- Coverage           76.94%   72.30%   -4.65%     
==================================================
  Files                 355      355              
  Lines               41402    41622     +220     
==================================================
- Hits                31858    30096    -1762     
- Misses               9544    11526    +1982     
Flag       Coverage           Δ
examples   42.34% <29.25%>    (-0.07%) ⬇️
gpu        48.25% <75.54%>    (-9.62%) ⬇️

Also clean up llm_qat example configs and fix pad_token_id handling.

Signed-off-by: realAsma <akuriparambi@nvidia.com>
@realAsma realAsma force-pushed the asma/laq-algorithm branch from 16832fb to 8866b80 Compare April 16, 2026 19:32