feat(profiling): Integrate Tracy profiler #2202

Open
stephanmeesters wants to merge 6 commits into TheSuperHackers:main from stephanmeesters:tracy-profiler

@stephanmeesters stephanmeesters commented Jan 27, 2026

Merge by rebase

This PR adds Tracy profiler to the CMake presets win32-profile and win32-vcpkg-profile.

A variable RTS_BUILD_OPTION_PROFILE_TRACY is added to enable Tracy. Enabling this will disable the old profiler.

It is not added to the vc6-profile preset because the Tracy library cannot be compiled with VC6. The preset mingw-w64-i686-profile is also skipped, as it does not build yet (see #2163).

Library include

Tracy is added as a VCPKG package (version 0.11.1) and as a CMake FetchContent_Declare (version 0.13.1). These are the latest versions available at the time of writing. Unfortunately the VCPKG package is old, but there are open PRs on the VCPKG GitHub repository to update Tracy to the latest version. Version 0.13.1 improves coloring immensely, which is important for seeing at a glance whether we are render-bound or update-bound.

The Tracy developers recommend adding the library as a CMake include because the version of the library included in the game must exactly match the version of their main executable, tracy-profiler.exe. This could be guaranteed by building tracy-profiler.exe and copying it to the build dir; however, we cannot do this right now because it requires a 64-bit compile.

What are we tracing?

  • Frame markers (start-end of each frame)
  • Zones
  • Logic frame number
  • Pathfinding count of paths and cells
  • Messages ("started game with map Alpine Assault" ..)

A number of zones were picked that capture the majority of processing time, and will hopefully be useful for profiling sessions.
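
As a rough illustration of what these trace points look like at a call site, here is a minimal sketch using the Tracy macro names mentioned in this PR (`FrameMark`, `ZoneScopedN`, `TracyPlot`). The `FrameStats` struct, the `runOneFrame` function, and the plot names are hypothetical and not taken from the PR's code. The no-op stand-ins are there only so the snippet builds without the Tracy library.

```cpp
#include <cassert>
#include <cstdint>

#ifdef TRACY_ENABLE
#include <tracy/Tracy.hpp>   // the real macros come from Tracy
#else
// No-op stand-ins so this sketch builds without the Tracy library.
#define FrameMark ((void)0)
#define ZoneScopedN(name) ((void)0)
#define TracyPlot(name, value) ((void)0)
#endif

// Hypothetical per-frame counters, for illustration only.
struct FrameStats {
    int logicFrame = 0;
    int pathsProcessed = 0;
};

// Hypothetical frame function showing where each trace point could sit.
inline void runOneFrame(FrameStats &stats, int pathsThisFrame)
{
    assert(pathsThisFrame >= 0);
    {
        ZoneScopedN("GameLogic::update");   // zone lasts until the end of this scope
        ++stats.logicFrame;
        stats.pathsProcessed += pathsThisFrame;
        // Plot values are cast to a type Tracy accepts without ambiguity.
        TracyPlot("LogicFrame", static_cast<std::int64_t>(stats.logicFrame));
        TracyPlot("PathfindPaths", static_cast<std::int64_t>(pathsThisFrame));
    }
    FrameMark;                               // marks the end of the frame
}
```

Calling `runOneFrame` twice with 3 and 2 paths leaves `logicFrame` at 2 and `pathsProcessed` at 5, whether or not Tracy is compiled in.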

Performance impact

Tracy itself has a negligible impact, however the frame capturing does have a minor impact (0.5ms average per frame). I have done my best to optimize the frame capturing by downscaling the backbuffer on the GPU side before copying to the CPU. It will be possible to simplify some of this code by moving from DX8 to DX9 and perhaps increase performance too.
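
To make the downscaling idea concrete, here is a CPU-side reference sketch of the two steps the capture performs: reduce the image and swap BGRA to RGBA. This is not the PR's implementation, which does the work on the GPU. The function name `downscaleBgraToRgba` and the nearest-neighbour filter are assumptions chosen for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Nearest-neighbour downscale of a tightly packed BGRA image to dstW x dstH,
// swapping the B and R channels so the result is RGBA.
inline std::vector<std::uint8_t> downscaleBgraToRgba(
    const std::vector<std::uint8_t> &src, int srcW, int srcH, int dstW, int dstH)
{
    assert(srcW > 0 && srcH > 0 && dstW > 0 && dstH > 0);
    assert(src.size() == static_cast<std::size_t>(srcW) * srcH * 4);

    std::vector<std::uint8_t> dst(static_cast<std::size_t>(dstW) * dstH * 4);
    for (int y = 0; y < dstH; ++y) {
        const int sy = y * srcH / dstH;             // source row for this output row
        for (int x = 0; x < dstW; ++x) {
            const int sx = x * srcW / dstW;         // source column for this output column
            const std::size_t s = (static_cast<std::size_t>(sy) * srcW + sx) * 4;
            const std::size_t d = (static_cast<std::size_t>(y) * dstW + x) * 4;
            dst[d + 0] = src[s + 2];                // R comes from BGRA byte 2
            dst[d + 1] = src[s + 1];                // G is unchanged
            dst[d + 2] = src[s + 0];                // B comes from BGRA byte 0
            dst[d + 3] = src[s + 3];                // A is unchanged
        }
    }
    return dst;
}
```

The reason for shrinking before the GPU-to-CPU copy is the amount of data moved: a 1920x1080 image at 4 bytes per pixel is 8,294,400 bytes, while a 256x256 image is 262,144 bytes, roughly 32 times less.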

Future work

It should be determined whether this will be sufficient to replace the old profiling code. At that point, and when all other profiles build, we can activate Tracy using only RTS_BUILD_OPTION_PROFILE. Another interesting thing to trace would be a plot of the memory usage.

Todo

  • Replicate to Generals

Example of a capture: render-bound frame (top), update-bound frame (bottom).
[Image: tracy-capture]

stephanmeesters added a commit to stephanmeesters/GeneralsGameCode that referenced this pull request Jan 27, 2026
@stephanmeesters stephanmeesters marked this pull request as ready for review January 27, 2026 22:29

greptile-apps bot commented Jan 27, 2026

Greptile Overview

Greptile Summary

This PR integrates Tracy profiler (v0.13.1 via FetchContent, v0.11.1 via VCPKG) into the profiling build presets win32-profile and win32-vcpkg-profile. Tracy is enabled via the new RTS_BUILD_OPTION_PROFILE_TRACY CMake option, which mutually excludes the legacy profiler.

Key Changes

  • CMake Integration: Added cmake/tracy.cmake for FetchContent and modified cmake/config-build.cmake to conditionally link Tracy::TracyClient
  • Profile Header: Modified rts/profile.h to include Tracy headers when TRACY_ENABLE is defined, with no-op macro fallbacks for disabled builds
  • Instrumentation: Added profiling zones (ZoneScopedN, ZoneScopedNC) throughout the codebase covering:
    • Game logic update loop and subsystems (ScriptEngine, AI, TerrainLogic, etc.)
    • Render pipeline stages (scene rendering, UI, particles, shadows)
    • Pathfinding with plots for cells and paths processed
  • Frame Capture: Implemented GPU-accelerated frame image capture in W3DDisplay with BGRA-to-RGBA pixel shader conversion and downscaling to 256x256
  • Frame Markers: Added FrameMark in GameClient::update() to delimit frame boundaries
  • Telemetry: Added TracyMessage calls on game start/end and TracyPlot for logic frame number and pathfinding metrics

Issues Found

  • Uninitialized pointers: Tracy member variables in W3DDisplay.h (m_tracyRenderTarget, m_tracySurfaceClass, m_tracyIntermediateTexture, m_tracySwizzleShader) are not initialized in the constructor, which could cause crashes if ResetTracyCaptureImage() is called before InitTracyCaptureImage()
  • Missing error handling: InitTracyCaptureImage() assumes all allocations succeed (render targets, surfaces, textures, shaders) without checking for failures, which could lead to null pointer dereferences in TracyCaptureImage()

The implementation is comprehensive and well-structured, but requires defensive null checks and member initialization to be production-ready.
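
The two issues above follow a common pattern. Here is a minimal sketch of one defensive shape, using hypothetical stand-in types (`FakeResource`, `CaptureState`) rather than the real W3D or DX8 classes. It assumes C++14 or later for `std::make_unique`. The members start out null, `init` reports failure instead of assuming success, and `reset` is safe to call before `init`.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for a GPU resource; not a real W3D or DX8 type.
struct FakeResource {};

// Hypothetical allocator that can be told to fail, to exercise the error path.
inline std::unique_ptr<FakeResource> createResource(bool succeed)
{
    return succeed ? std::make_unique<FakeResource>() : nullptr;
}

class CaptureState {
public:
    // Returns false and leaves the object empty if any allocation fails.
    bool init(bool targetOk, bool textureOk)
    {
        reset();
        std::unique_ptr<FakeResource> target = createResource(targetOk);
        std::unique_ptr<FakeResource> texture = createResource(textureOk);
        if (!target || !texture)
            return false;                   // any partial allocation is freed on return
        m_renderTarget = std::move(target);
        m_intermediateTexture = std::move(texture);
        assert(isReady());
        return true;
    }

    // Safe to call at any time, including before init().
    void reset()
    {
        m_renderTarget.reset();
        m_intermediateTexture.reset();
    }

    bool isReady() const { return m_renderTarget && m_intermediateTexture; }

    // The capture is skipped unless every resource exists.
    bool capture() const { return isReady(); }

private:
    std::unique_ptr<FakeResource> m_renderTarget;        // starts out null
    std::unique_ptr<FakeResource> m_intermediateTexture; // starts out null
};
```

With this shape, a failed `init(true, false)` leaves `isReady()` false, so a later `capture()` becomes a no-op rather than a null dereference.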

Confidence Score: 3/5

  • This PR has critical uninitialized pointer issues that could cause crashes in Tracy-enabled builds
  • Score reflects well-designed Tracy integration with comprehensive instrumentation, but uninitialized member variables and missing error handling in frame capture code pose crash risks that must be addressed before merge
  • Pay close attention to W3DDisplay.h (uninitialized pointers) and W3DDisplay.cpp (missing error handling in InitTracyCaptureImage())

Important Files Changed

| Filename | Overview |
|---|---|
| cmake/config-build.cmake | Adds RTS_BUILD_OPTION_PROFILE_TRACY option and conditionally links Tracy when enabled |
| Core/Libraries/Include/rts/profile.h | Adds Tracy header include and macro definitions with no-op fallbacks for disabled builds |
| GeneralsMD/Code/GameEngineDevice/Include/W3DDevice/GameClient/W3DDisplay.h | Declares Tracy frame capture methods and member variables for texture/surface management |
| GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplay.cpp | Implements Tracy frame capture with GPU-side downscaling and BGRA-to-RGBA pixel shader conversion |
| GeneralsMD/Code/GameEngine/Source/GameLogic/System/GameLogic.cpp | Adds Tracy zones throughout game logic update loop and plots for logic frame number |

Sequence Diagram

sequenceDiagram
    participant GC as GameClient::update()
    participant GL as GameLogic::update()
    participant D as Display::draw()
    participant W3DD as W3DDisplay::draw()
    participant Tracy as Tracy Profiler
    
    Note over GC: Frame Start
    GC->>Tracy: FrameMark
    
    Note over GC,GL: Logic Update Phase
    GC->>GL: update()
    GL->>Tracy: ZoneScopedNC("GameLogic::update", green)
    GL->>Tracy: TracyPlot("LogicFrame", frameNumber)
    GL->>GL: ScriptEngine, TerrainLogic, AI, etc.
    Note over GL: Each subsystem wrapped in ZoneScopedN
    GL->>Tracy: TracyPlot("PathfindCells", cellCount)
    GL->>Tracy: TracyPlot("PathfindPaths", pathCount)
    
    Note over GC,W3DD: Render Phase
    GC->>Tracy: ZoneScopedNC("Render", blue)
    GC->>D: draw()
    D->>W3DD: draw()
    W3DD->>Tracy: ZoneScopedN zones for rendering stages
    W3DD->>W3DD: Render 3D scene, UI, particles
    W3DD->>W3DD: TracyCaptureImage()
    W3DD->>Tracy: FrameImage(pixels, width, height)
    
    Note over GC: Frame End

@greptile-apps greptile-apps bot left a comment

2 files reviewed, 2 comments


@Caball009

Would you mind adding a few images or a short video to the first post?

stephanmeesters commented Jan 29, 2026

> Would you mind adding a few images or a short video to the first post?

Done. Added examples.

I've noticed that it's important to use -win; otherwise the frame image capture will use your screen resolution rather than the native game resolution, so it's much slower. I tested at 1080p.

Never mind: I accidentally had DXVK DLLs in the binary dir, and that changed the profile. The Render::VSync block was gone and the TracyFrameImage zone took longer to complete, while overall FPS was way up. I had attributed whatever happened after rendering completed to VSync, but I'm not sure now.

stephanmeesters added a commit to stephanmeesters/GeneralsGameCode that referenced this pull request Feb 4, 2026

@greptile-apps greptile-apps bot left a comment

5 files reviewed, 4 comments


@xezon added the Enhancement (Is new feature or request), Major (Severity: Minor < Major < Critical < Blocker), and Gen (Relates to Generals) labels Feb 23, 2026
@xezon added the ZH (Relates to Zero Hour) and Debug (Is mostly debug functionality) labels Feb 23, 2026

@xezon xezon left a comment

Looks promising, but additional code polishing is required.

message(FATAL_ERROR "Tracy is enabled but Tracy::TracyClient was not found.")
endif()
target_compile_definitions(core_config INTERFACE
TRACY_ENABLE

Can merge with line above.

#define FrameMarkNamed(name) ((void)0)
#define FrameImage(image, width, height, offset, flip) ((void)0)
#define TracyMessage(txt, size) ((void)0)
#define TracyIsConnected false

Use MACRO_CASE for macros

@stephanmeesters stephanmeesters Feb 23, 2026

FrameMarkNamed, FrameImage, TracyMessage, etc. are macros defined by Tracy, and the code here provides stubs for those macros when we don't include the Tracy library.

We could do MACRO_CASE like this (but you will still get the ones defined by Tracy too):

// when enabled
#define TRACY_MESSAGE(text, size) TracyMessage(text, size)
// when disabled
#define TRACY_MESSAGE(text, size) ((void)0)

#define TRACY_FRAMEIMAGE_SIZE 256
#else
#include "../../Source/profile/profile.h"
#define ZoneScopedN(name) ((void)0)

Please use abstracted names for profiler functions, so that we would be able to plug in different kinds of profilers behind the same macros.

Find example here:
https://github.com/TheAssemblyArmada/Thyme/pull/801/changes

Author

Ah OK will check that out
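
For reference, a minimal sketch of what profiler-neutral macros could look like. The names `PROFILER_ZONE`, `PROFILER_FRAME_MARK`, `PROFILER_BACKEND_TRACY`, and the `ToyZone` recorder are all hypothetical. They are taken neither from this PR nor from the linked Thyme change. The non-Tracy branch here is a toy recorder rather than a no-op, purely so the sketch does something observable.

```cpp
#include <cassert>
#include <string>
#include <vector>

#if defined(PROFILER_BACKEND_TRACY)
// Tracy backend: forward the neutral names to the macros Tracy provides.
#include <tracy/Tracy.hpp>
#define PROFILER_ZONE(name) ZoneScopedN(name)
#define PROFILER_FRAME_MARK() FrameMark
#else
// Toy backend (hypothetical): records zone and frame events in order.
inline std::vector<std::string> &profilerEvents()
{
    static std::vector<std::string> events;
    return events;
}

struct ToyZone {
    explicit ToyZone(const char *name) : m_name(name)
    {
        assert(name != nullptr);
        profilerEvents().push_back("enter " + m_name);
    }
    ~ToyZone() { profilerEvents().push_back("leave " + m_name); }
    std::string m_name;
};

#define PROFILER_CONCAT2(a, b) a##b
#define PROFILER_CONCAT(a, b) PROFILER_CONCAT2(a, b)
#define PROFILER_ZONE(name) ToyZone PROFILER_CONCAT(profilerZone_, __LINE__)(name)
#define PROFILER_FRAME_MARK() profilerEvents().push_back("frame")
#endif

// Call sites only ever see the neutral names.
inline int updateLogic(int frame)
{
    PROFILER_ZONE("GameLogic::update");
    return frame + 1;
}
```

Swapping in another profiler would then mean adding one more `#elif` branch, with no changes at the call sites.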


if(RTS_BUILD_OPTION_PROFILE AND RTS_BUILD_OPTION_PROFILE_TRACY)
    target_link_libraries(corei_libraries_include INTERFACE Tracy::TracyClient)
endif()

if(RTS_BUILD_OPTION_PROFILE)
    if(RTS_BUILD_OPTION_PROFILE_TRACY)
        target_link_libraries(corei_libraries_include INTERFACE Tracy::TracyClient)
    endif()
endif()

Makes it easier to plug in new ones later.

@@ -8,6 +8,7 @@ option(RTS_BUILD_OPTION_DEBUG "Build code with the \"Debug\" configuration." OFF
option(RTS_BUILD_OPTION_ASAN "Build code with Address Sanitizer." OFF)
option(RTS_BUILD_OPTION_VC6_FULL_DEBUG "Build VC6 with full debug info." OFF)
option(RTS_BUILD_OPTION_FFMPEG "Enable FFmpeg support" OFF)
option(RTS_BUILD_OPTION_PROFILE_TRACY "Enable Tracy profiling for profile builds." OFF)

When we add another profiler here, how would that compile option work?

E.g.

RTS_BUILD_OPTION_PROFILE_TRACY ON
RTS_BUILD_OPTION_PROFILE_OPTICK ON

What is expected to happen?
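
One possible answer, sketched here as an assumption rather than as this PR's behavior, is to treat two enabled profilers as a configuration error and refuse to build. The compile definitions `RTS_PROFILE_BACKEND_TRACY` and `RTS_PROFILE_BACKEND_OPTICK` and the `activeProfiler` helper are hypothetical names for illustration.

```cpp
// Hypothetical compile definitions; a build system would pass at most one.
// #define RTS_PROFILE_BACKEND_TRACY
// #define RTS_PROFILE_BACKEND_OPTICK

enum class ProfilerBackend { None, Tracy, Optick };

#ifdef RTS_PROFILE_BACKEND_TRACY
constexpr int kTracySelected = 1;
#else
constexpr int kTracySelected = 0;
#endif

#ifdef RTS_PROFILE_BACKEND_OPTICK
constexpr int kOptickSelected = 1;
#else
constexpr int kOptickSelected = 0;
#endif

// Refuse to build when two backends are requested at once.
static_assert(kTracySelected + kOptickSelected <= 1,
              "Select at most one profiler backend");

// Reports which backend, if any, this translation unit was built with.
constexpr ProfilerBackend activeProfiler()
{
    return kTracySelected ? ProfilerBackend::Tracy
         : kOptickSelected ? ProfilerBackend::Optick
                           : ProfilerBackend::None;
}
```

The same rule could equally be enforced earlier, at configure time, which would give a clearer error message than a failed compile.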

return false;

// allocate intermediate texture
SurfaceClass *backBuffer = DX8Wrapper::_Get_DX8_Back_Buffer();

backBuffer is never released.
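
A scope guard is one way to avoid this kind of leak. Here is a minimal sketch with a hypothetical stand-in `FakeSurface` instead of the real `SurfaceClass`. It assumes, as the comment implies, that the getter hands back a reference the caller must release. `SurfaceRef`, `getBackBuffer`, and `initCapture` are illustrative names only.

```cpp
#include <cassert>

// Hypothetical stand-in for a reference-counted surface; not the real SurfaceClass.
class FakeSurface {
public:
    void Add_Ref() { ++m_refs; }
    void Release_Ref() { assert(m_refs > 0); --m_refs; }
    int Num_Refs() const { return m_refs; }
private:
    int m_refs = 1; // the owning device holds one reference
};

// Hypothetical getter, assumed to return a reference the caller must release.
inline FakeSurface *getBackBuffer(FakeSurface &owned)
{
    owned.Add_Ref();
    return &owned;
}

// Releases the held reference when the guard leaves scope.
class SurfaceRef {
public:
    explicit SurfaceRef(FakeSurface *surface) : m_surface(surface) {}
    ~SurfaceRef() { if (m_surface) m_surface->Release_Ref(); }
    SurfaceRef(const SurfaceRef &) = delete;
    SurfaceRef &operator=(const SurfaceRef &) = delete;
    FakeSurface *get() const { return m_surface; }
private:
    FakeSurface *m_surface;
};

// Both the early return and the normal return release the reference.
inline bool initCapture(FakeSurface &device, bool failEarly)
{
    SurfaceRef backBuffer(getBackBuffer(device));
    if (failEarly)
        return false;
    return backBuffer.get() != nullptr;
}
```

After either call path, the reference count on the device surface is back to its starting value of 1.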

// swizzle shader
// TheSuperHackers @todo In DX9 with ps2.0 this shader will be much simpler
ID3DXBuffer *compiledShader = nullptr;
static const char *shader =

const char *const, no static

void TracyCaptureImage();
TextureClass *m_tracyRenderTarget = nullptr;
SurfaceClass *m_tracySurfaceClass = nullptr;
IDirect3DTexture8 *m_tracyIntermediateTexture = nullptr;

Does it need all these members in the class? If possible, use the texture data in one go, passing it from one function to another.

IDirect3DSurface8 *tracyBackbufferTextureSurface;
m_tracyIntermediateTexture->GetSurfaceLevel(0, &tracyBackbufferTextureSurface);
DX8Wrapper::_Copy_DX8_Rects(backBufferSurface, nullptr, 0,
tracyBackbufferTextureSurface, nullptr);

Style: can be one line

@@ -1935,6 +2156,7 @@ void W3DDisplay::draw( void )
TheGraphDraw->render();
TheGraphDraw->clear();
#endif
TracyCaptureImage();

Is this meant to capture every single frame? Does that create very large captures? What is the recommended interval for this by Tracy? I looked at Tracy Docs and did not find any.

Can we get some numbers? Consider being more conservative with the Frame Grab frequency.

Capturing images on spikes would also be an option.
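
A more conservative capture policy could combine both ideas: grab an image every N frames, and additionally whenever the previous frame was slow. Here is a minimal sketch with a hypothetical `FrameCapturePolicy` class. The interval and spike threshold in the usage note are placeholder values, not measured recommendations.

```cpp
#include <cassert>

// Decides whether a frame image should be captured on this frame.
class FrameCapturePolicy {
public:
    FrameCapturePolicy(int interval, double spikeMs)
        : m_interval(interval), m_spikeMs(spikeMs)
    {
        assert(interval > 0);
    }

    // Call once per frame with the previous frame's duration in milliseconds.
    bool shouldCapture(double lastFrameMs)
    {
        const bool periodic = (m_frame % m_interval) == 0; // every Nth frame
        const bool spike = lastFrameMs > m_spikeMs;        // slow frame
        ++m_frame;
        return periodic || spike;
    }

private:
    int m_interval;
    double m_spikeMs;
    long m_frame = 0;
};
```

For example, `FrameCapturePolicy policy(4, 33.0)` captures frames 0 and 4 on the periodic rule and any frame in between whose predecessor took longer than 33 ms. For scale, a raw 256x256 RGBA image is 262,144 bytes, so capturing every frame at an assumed 60 FPS would be about 15.7 MB per second before any compression.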
