ovphysx 0.4.3 + ovrtx coexistence: clone migration, init-order hoist, pose forwarding#1
Open
EricArnoldNVIDIA wants to merge 4 commits intopbarejko/ovphysx-ovrtxfrom
Open
Conversation
OVRTX shares a Carbonite framework with ovphysx; per the kit/sdk and ovstage working samples, the renderer-side native objects must release before the physics-side teardown runs, otherwise the second teardown crashes on already-freed plugin state. SimulationContext.clear_instance released physics first and let cameras GC last, which inverted the required order. Add RenderContext.cleanup() that walks registered backends and invokes BaseRenderer.cleanup(None) on each, then call it from clear_instance before physics_manager.close(). Per-camera Camera.__del__ cleanup stays in place as a safety net but becomes a no-op once the central cleanup has run. No behavior change for non-ovrtx backends. Companion design notes for the deferred items (renderer hoist, _setup_object_bindings on ovphysx, physx.clone migration) at source/isaaclab_ov/docs/ovphysx_coexist_DESIGN.md. Tracking: OMPE-88037
…nit hoist Two stacked fixes that, together with the earlier render-then-physics teardown change, get presets=ovphysx,ovrtx_renderer,rgb running end-to-end against ovphysx 0.4.3 + ovrtx 0.3.0.307110. Fix 1 — physx.clone() removed in ovphysx 0.4.3. InteractiveScene routed the ovphysx backend through clone_usd=False so that physx.clone() in OvPhysxManager._warmup_and_load could populate env_1..N at the physics layer. The 0.4.3 wheel removed that public API in favor of ovstage_clone_subtree on an attached ovstage.Stage. Until the ovstage bridge is wired in (see source/isaaclab_ov/docs/ ovphysx_coexist_DESIGN.md), switch ovphysx onto USD-side cloning: clone_usd=True and clone_physics=False. _warmup_and_load now raises a clear error if a pre-0.4.3 wheel and the legacy code path coincide, rather than letting AttributeError bubble through Camera init. Fix 2 — OVRTX must construct its native Renderer before ovphysx. ovrtx and ovphysx share a Carbonite framework; ovrtx does no coexistence checks at construction time and SIGSEGVs inside createRTXRenderer if ovphysx (or any other Carbonite owner) has already loaded. IsaacLab today constructs OVRTX lazily in Camera._initialize_impl (which fires on PHYSICS_READY, after OvPhysxManager._warmup_and_load), inverting the required order. Add BaseRenderer.early_init() (default no-op) and have OVRTXRenderer override it to construct the C++ Renderer up front; OVRTXRenderer. initialize() now skips renderer construction if early_init already ran. RenderContext.early_init_all() walks registered backends, and InteractiveScene.early_init_renderers() pre-walks Camera sensors to register their renderer cfg with the context so early_init has something to act on. DirectRLEnv calls scene.early_init_renderers() between _setup_scene (which adds tiled cameras) and sim.reset() so the OVRTX Renderer is constructed before OvPhysxManager constructs its native instance. Smoke run on the OMPE-88037 cartpole repro (32 envs, 2 epochs, 100x100 RGB tiled camera): 16.34s end-to-end, training ran at ~700 fps, exit 0. Object transforms in the rendered frame are still static — Item B in the DESIGN doc (OVRTXRenderer.update_transforms wiring to ovphysx pose bindings) is the next piece. Tracking: OMPE-88037
OVRTXRenderer's object-binding path was Newton-only: it asked SceneDataProvider.get_newton_model() and bailed when None (which is always the case under an OvPhysx physics manager). Result: with presets=ovphysx,ovrtx_renderer the rendered frames had moving cameras but static objects. Add an OvPhysx fallback that: 1. Walks the USD stage for prims with PhysicsRigidBodyAPI under /World/envs/env_*, filters out the camera and any ground plane. 2. Creates an ovphysx RIGID_BODY_POSE tensor binding over those paths (CUDA-resident, [N, 7] = px,py,pz,qx,qy,qz,qw). 3. Binds the same flat list of prim paths to ovrtx omni:xform with PrimMode.EXISTING_ONLY and writes omni:resetXformStack=True so the matrix is interpreted as the world transform regardless of parent xforms — same pattern as the camera binding in OVRTXRenderer.initialize. Per-frame update_transforms now reads the ovphysx pose tensor on GPU, runs sync_ovphysx_pose_to_mat44d_kernel to construct ovrtx-format mat44d rows (column-major rotation, translation in row 3, ``1.0`` at ``[3,3]`` — matches create_camera_transforms_kernel), then wp.copy's into the ovrtx attribute mapping. Zero host roundtrip, single ovrtx write per frame. Smoke run on the OMPE-88037 cartpole repro (32 envs, 2 epochs, 100x100 RGB tiled camera, ovphysx 0.4.3 + ovrtx 0.3.0.307110): 16.59s, training ran at ~670 fps, exit 0; renderer reports "OvPhysx: created RIGID_BODY_POSE binding for 96 ovrtx prims" for 32 envs × 3 articulation links each. Recipe credit: kit/sdk team (notes saved to /tmp/ovstage-answer.md). The Newton path is unchanged; the fallback only fires when get_newton_model() returns None. Tracking: OMPE-88037
Rewrite source/isaaclab_ov/docs/ovphysx_coexist_DESIGN.md so that PR reviewers can read it without external references: - Drop ~/tmp/OMPE-88037-ovphysx-ovrtx-isaaclab.md and /tmp/ovstage-answer.md links (those are scratch files on the patch author's machine, not visible to reviewers). - Inline the answers to the three open questions to the kit/sdk and ovstage teams that informed the design (init order, ovrtx pin, pose-forwarding recipe). - Tag each item with the commit that landed it (a7ecbd0, f018386, 9039f0a). - Note known caveats / future work per item, and the cross-cutting bits (Fabric IStageReaderWriter v0.16/v0.15 ABI warnings, the surviving os._exit(0) HACK in OvPhysxManager). Tracking: OMPE-88037
kellyguo11
reviewed
May 7, 2026
| # Carbonite before ovphysx constructs its native instance (sim.reset → | ||
| # OvPhysxManager._warmup_and_load), otherwise createRTXRenderer SIGSEGVs. | ||
| # No-op for backends without an early_init override. | ||
| self.scene.early_init_renderers() |
There was a problem hiding this comment.
probably also need to add this to direct_marl_env.py
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Gets
presets=ovphysx,ovrtx_renderer,rgbrunning end-to-end onIsaac-Cartpole-Camera-Presets-Direct-v0against ovphysx 0.4.3 +ovrtx 0.3.0.307110. Three independent fixes plus a companion design
doc:
a7ecbd0de18—RenderContext.cleanup()runs beforephysics_manager.close()so OVRTX releases its native objects beforeovphysx tears down their shared Carbonite (the inverted order was
crashing the second teardown).
f018386acda— Two changes that travel together:physx.clone()was removed in ovphysx 0.4.3.InteractiveScenenow routes ovphysx through USD-side cloning(
clone_usd=True,clone_physics=False);OvPhysxManager._warmup_and_loadraises a clear error if a pre-0.4.3 wheel and the legacy code path coincide.
BaseRenderer.early_init().OVRTXRendereroverrides it toconstruct the C++ Renderer up front so OVRTX claims Carbonite before
OvPhysxManager._warmup_and_load.RenderContext.early_init_all()walks registered backends;
InteractiveScene.early_init_renderers()pre-walks Camera sensors so
early_inithas something to act on;DirectRLEnv._init_simcalls it between_setup_scene(where thecartpole task adds its tiled camera) and
sim.reset(). Without thishoist,
createRTXRendererSIGSEGVs because ovrtx does nocoexistence checks at construction time.
9039f0adb1a— Object pose forwarding for ovphysx.OVRTXRenderer._setup_object_bindingsfalls through to an OvPhysxpath when the Newton scene-data provider returns no model: walks the
USD for
PhysicsRigidBodyAPIprims under/World/envs/env_*,creates a single
RIGID_BODY_POSEtensor binding, binds the sameflat list to ovrtx
omni:xform.update_transformsreads the posetensor on GPU into a pre-allocated
[N, 7]warp array, runssync_ovphysx_pose_to_mat44d_kernelto construct ovrtx-formatmat44drows, thenwp.copys into the ovrtx attribute mapping.Zero host roundtrip per frame.
560ad29e2a1—source/isaaclab_ov/docs/ovphysx_coexist_DESIGN.md,per-item rationale + caveats + future work.
This stacks on top of
pbarejko/ovphysx-ovrtxand assumes ovphysx 0.4.3(specifically
dev/erarnold/ovstage-attachcommitc728f7e740, whichadds cross-device staging in
ovphysx_read_tensor_binding— that'swhat kills the original
body_com_posedevice-mismatch with noIsaacLab-side workaround needed). With ovphysx 0.4.2 the legacy
physx.clone()path is rejected by the new defensive check; thedefense is intentional.
Tracking: OMPE-88037
Test plan
./isaaclab.sh -p scripts/reinforcement_learning/rl_games/train.py --task=Isaac-Cartpole-Camera-Presets-Direct-v0 --num_envs=32 --max_iterations=2 --headless --enable_cameras presets=ovphysx,ovrtx_renderer,rgbagainst ovphysx0.4.3+dev.erarnold.ovstage.attach.c728f7e740+ ovrtx0.3.0.307110. Result: 2 epochs, ~670 fps step, 16.59s end-to-end, exit 0. Renderer logsOvPhysx: created RIGID_BODY_POSE binding for 96 ovrtx prims (shape=(96, 7))and noFailed to update object transformswarnings across the run.--videocapture is the easy way).presets=physx,...andpresets=newton_mjwarp,...) to confirm no regression. The cleanup-order change and the renderer-hoist machinery default to no-op for backends that don't overrideBaseRenderer.early_init(); the ovphysx-only branches inInteractiveSceneandOVRTXRendererare gated onphysics_backend.startswith("ovphysx")and the Newton-model-None check respectively.early_init_renderers()call lives inDirectRLEnv._init_simonly — manager envs that create cameras afterInteractiveScene.__init__will need the same hook). Captured as a caveat in the DESIGN doc.Notes for review
RIGID_BODY_POSEfor allbodies (including articulation links). Worked for cartpole's layout
(32 envs × 3 bodies = 96 prims). If a future task hits a layout
where ovphysx rejects an articulation link in a
RIGID_BODY_POSEbinding, the fallback should switch to per-articulation
ARTICULATION_LINK_POSEbindings. Not implemented because notneeded yet; called out in the DESIGN doc.
Warning: Possible version incompatibility. Attempting to load omni::fabric::IStageReaderWriter with version v0.16 against v0.15. Non-fatal but worth flagging upstream — thekit/sdk integration sample's
CMakeLists.txtcalls out the sameIStageReaderWriter v0.16+ requirement.