fix(launcher): use afterany dependency for allow_to_fail pipelines by yeyu-nvidia · Pull Request #1248 · NVIDIA/Model-Optimizer

yeyu-nvidia · 2026-04-13T17:03:13Z

Summary

- nemo-run's `SlurmExecutor` defaults to `dependency_type="afterok"`, which cancels all downstream Slurm tasks when a predecessor times out (`TIMEOUT`) or fails.
- For pipelines with `allow_to_fail=True`, this changes the dependency type to `"afterany"` so subsequent tasks run regardless of predecessor exit status.
- This unblocks EAGLE3 multi-step pipelines where task_0 (data generation) may time out but task_1+ should still run on whatever data was produced.
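
To make the difference between the two dependency types concrete, here is a minimal sketch in Python. It is a simplified model of the documented Slurm semantics, not Slurm's or nemo-run's implementation; the function name `dependency_satisfied` and the reduced set of job states are assumptions made for illustration.

```python
# Simplified model of two Slurm dependency types (illustrative only).
# Only a handful of terminal job states are modelled here.
TERMINAL_STATES = {"COMPLETED", "FAILED", "TIMEOUT", "CANCELLED"}


def dependency_satisfied(dependency_type: str, predecessor_state: str) -> bool:
    """Return True if a dependent job may start, given the predecessor's state."""
    if predecessor_state not in TERMINAL_STATES:
        # The predecessor is still pending or running, so the dependent waits.
        return False
    if dependency_type == "afterok":
        # afterok: the predecessor must have finished successfully.
        return predecessor_state == "COMPLETED"
    if dependency_type == "afterany":
        # afterany: any terminal state releases the dependent job.
        return True
    raise ValueError(f"unsupported dependency type: {dependency_type}")


if __name__ == "__main__":
    for state in ("COMPLETED", "FAILED", "TIMEOUT"):
        print(state,
              dependency_satisfied("afterok", state),
              dependency_satisfied("afterany", state))
```

In this model a `TIMEOUT` on task_0 never satisfies an `afterok` dependency, while `afterany` releases task_1 as soon as task_0 reaches any terminal state.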

Test plan

- Verify existing launcher unit tests pass (`uv run python3 -m pytest tests/ -v` in `tools/launcher/`).
- Submit an EAGLE3 pipeline with `allow_to_fail: true` and confirm task_1 runs after task_0 times out.
- Verify pipelines without `allow_to_fail` still use the default `afterok` behavior.

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Experiments can now continue executing downstream tasks even when upstream tasks fail or time out, improving workflow resilience and enabling more robust experiment pipelines.

nemo-run's SlurmExecutor defaults to dependency_type="afterok", which cancels all downstream tasks when a predecessor times out or fails. For pipelines with allow_to_fail=True, use "afterany" so subsequent tasks run regardless of predecessor exit status.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ye Yu <yeyu@nvidia.com>

coderabbitai · 2026-04-13T17:03:31Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 26dba2c4-106d-43c4-aa06-b0c2bf60569a

📥 Commits

Reviewing files that changed from the base of the PR and between 0b42c14 and 4edc5a6.

📒 Files selected for processing (1)

tools/launcher/core.py

📝 Walkthrough

Added conditional logic to the `run_jobs` function: when `job.allow_to_fail` is set and the executor has a `dependency_type` attribute, `executor.dependency_type` is set to `"afterany"` so that downstream tasks proceed regardless of predecessor failures.
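
The check described above can be sketched as follows. This is not the code from `tools/launcher/core.py`, which is not shown in this conversation; `FakeSlurmExecutor`, `FakeLocalExecutor`, `FakeJob`, and `apply_dependency_policy` are hypothetical stand-ins used only to illustrate the `allow_to_fail` and `hasattr` guard.

```python
from dataclasses import dataclass


@dataclass
class FakeSlurmExecutor:
    """Hypothetical stand-in for an executor that exposes dependency_type."""
    dependency_type: str = "afterok"


class FakeLocalExecutor:
    """Hypothetical executor with no dependency_type attribute."""


@dataclass
class FakeJob:
    """Hypothetical job config carrying only the allow_to_fail flag."""
    allow_to_fail: bool = False


def apply_dependency_policy(job, executor):
    """Switch to "afterany" only when the job tolerates upstream failure
    and the executor actually supports a dependency type."""
    if job.allow_to_fail and hasattr(executor, "dependency_type"):
        executor.dependency_type = "afterany"
    return executor


if __name__ == "__main__":
    print(apply_dependency_policy(FakeJob(True), FakeSlurmExecutor()).dependency_type)
    print(apply_dependency_policy(FakeJob(False), FakeSlurmExecutor()).dependency_type)
```

The `hasattr` guard means executors without a `dependency_type` attribute are left untouched, and jobs that do not set `allow_to_fail` keep the default `"afterok"`.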

Changes

| Cohort / File(s) | Summary |
|---|---|
| Task Dependency Configuration `tools/launcher/core.py` | Added conditional logic to set executor `dependency_type` to `"afterany"` when a job allows failure, enabling downstream tasks to continue regardless of predecessor timeout or failure states. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: using afterany dependency type for pipelines with allow_to_fail enabled. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Security Anti-Patterns | ✅ Passed | The pull request adds conditional logic to set executor.dependency_type without introducing security anti-patterns like unsafe deserialization, code execution, or dangerous configurations. |

github-actions · 2026-04-13T17:07:10Z

PR Preview Action v1.8.1
🚀 View preview at https://NVIDIA.github.io/Model-Optimizer/pr-preview/pr-1248/
Built to branch `gh-pages` at 2026-04-13 17:06 UTC. Preview will be ready when the GitHub Pages deployment is complete.

codecov · 2026-04-13T17:16:41Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.91%. Comparing base (5ff1d7b) to head (4edc5a6).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #1248   +/-   ##
=======================================
  Coverage   76.91%   76.91%           
=======================================
  Files         350      350           
  Lines       40481    40481           
=======================================
  Hits        31137    31137           
  Misses       9344     9344

| Flag | Coverage Δ |
|---|---|
| unit | `55.53% <ø> (+0.01%)` ⬆️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

yeyu-nvidia wants to merge 1 commit into `main` from `yeyu/afterany-pipeline-fix`
