feat(profiling): Integrate Tracy profiler #2202
stephanmeesters wants to merge 6 commits into TheSuperHackers:main
…ells/paths to Tracy profiling (TheSuperHackers#2202)
Greptile Overview
| Filename | Overview |
|---|---|
| cmake/config-build.cmake | Adds RTS_BUILD_OPTION_PROFILE_TRACY option and conditionally links Tracy when enabled |
| Core/Libraries/Include/rts/profile.h | Adds Tracy header include and macro definitions with no-op fallbacks for disabled builds |
| GeneralsMD/Code/GameEngineDevice/Include/W3DDevice/GameClient/W3DDisplay.h | Declares Tracy frame capture methods and member variables for texture/surface management |
| GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplay.cpp | Implements Tracy frame capture with GPU-side downscaling and BGRA-to-RGBA pixel shader conversion |
| GeneralsMD/Code/GameEngine/Source/GameLogic/System/GameLogic.cpp | Adds Tracy zones throughout game logic update loop and plots for logic frame number |
Sequence Diagram
sequenceDiagram
participant GC as GameClient::update()
participant GL as GameLogic::update()
participant D as Display::draw()
participant W3DD as W3DDisplay::draw()
participant Tracy as Tracy Profiler
Note over GC: Frame Start
GC->>Tracy: FrameMark
Note over GC,GL: Logic Update Phase
GC->>GL: update()
GL->>Tracy: ZoneScopedNC("GameLogic::update", green)
GL->>Tracy: TracyPlot("LogicFrame", frameNumber)
GL->>GL: ScriptEngine, TerrainLogic, AI, etc.
Note over GL: Each subsystem wrapped in ZoneScopedN
GL->>Tracy: TracyPlot("PathfindCells", cellCount)
GL->>Tracy: TracyPlot("PathfindPaths", pathCount)
Note over GC,W3DD: Render Phase
GC->>Tracy: ZoneScopedNC("Render", blue)
GC->>D: draw()
D->>W3DD: draw()
W3DD->>Tracy: ZoneScopedN zones for rendering stages
W3DD->>W3DD: Render 3D scene, UI, particles
W3DD->>W3DD: TracyCaptureImage()
W3DD->>Tracy: FrameImage(pixels, width, height)
Note over GC: Frame End
Would you mind adding a few images or a short video to the first post?
Done. Added examples.
Never mind, I accidentally had DXVK DLLs in the binary dir and that changed the profile: the …
xezon left a comment:
Looks promising, but additional code polishing is required.
        message(FATAL_ERROR "Tracy is enabled but Tracy::TracyClient was not found.")
    endif()
    target_compile_definitions(core_config INTERFACE
        TRACY_ENABLE
    #define FrameMarkNamed(name) ((void)0)
    #define FrameImage(image, width, height, offset, flip) ((void)0)
    #define TracyMessage(txt, size) ((void)0)
    #define TracyIsConnected false
FrameMarkNamed, FrameImage, TracyMessage etc. are macros defined by Tracy; the definitions here are stubs for those macros when the Tracy library is not included.
We could use MACRO_CASE names like this (but the ones emitted by Tracy would still be visible too):
    // when enabled
    #define TRACY_MESSAGE(text, size) TracyMessage(text, size)
    // when disabled
    #define TRACY_MESSAGE(text, size) ((void)0)
    #define TRACY_FRAMEIMAGE_SIZE 256
    #else
    #include "../../Source/profile/profile.h"
    #define ZoneScopedN(name) ((void)0)
Please use abstracted names for the profiler functions, so that we can plug in different kinds of profilers behind the same macros.
Find example here:
https://github.com/TheAssemblyArmada/Thyme/pull/801/changes
Ah OK will check that out
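For illustration, the abstraction requested above could look roughly like the sketch below. The PROFILER_* names and the updateLogicFrame function are hypothetical and are not taken from this PR or from the linked Thyme change; only TRACY_ENABLE and the Tracy macros come from the code under review.

```cpp
#include <cstdint>

// Hypothetical neutral macro layer. The PROFILER_* names are illustrative only.
#if defined(TRACY_ENABLE)
    #include <tracy/Tracy.hpp>
    #define PROFILER_ZONE(name)          ZoneScopedN(name)
    #define PROFILER_FRAME_MARK()        FrameMark
    #define PROFILER_PLOT(name, value)   TracyPlot(name, value)
    #define PROFILER_MESSAGE(text, size) TracyMessage(text, size)
#else
    // No backend selected: every macro compiles away to a no-op.
    #define PROFILER_ZONE(name)          ((void)0)
    #define PROFILER_FRAME_MARK()        ((void)0)
    #define PROFILER_PLOT(name, value)   ((void)0)
    #define PROFILER_MESSAGE(text, size) ((void)0)
#endif

// Hypothetical call site: it only ever sees the neutral names.
inline int updateLogicFrame(int frame)
{
    PROFILER_ZONE("GameLogic::update");
    PROFILER_PLOT("LogicFrame", static_cast<std::int64_t>(frame));
    return frame + 1;
}
```

With this shape, adding another profiler would mean adding one more `#elif` branch, and the call sites would stay untouched.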
    if(RTS_BUILD_OPTION_PROFILE AND RTS_BUILD_OPTION_PROFILE_TRACY)
        target_link_libraries(corei_libraries_include INTERFACE Tracy::TracyClient)
    endif()
    if(RTS_BUILD_OPTION_PROFILE)
        if(RTS_BUILD_OPTION_PROFILE_TRACY)
            target_link_libraries(corei_libraries_include INTERFACE Tracy::TracyClient)
        endif()
    endif()
Makes it easier to plug in new ones later.
    @@ -8,6 +8,7 @@ option(RTS_BUILD_OPTION_DEBUG "Build code with the \"Debug\" configuration." OFF
    option(RTS_BUILD_OPTION_ASAN "Build code with Address Sanitizer." OFF)
    option(RTS_BUILD_OPTION_VC6_FULL_DEBUG "Build VC6 with full debug info." OFF)
    option(RTS_BUILD_OPTION_FFMPEG "Enable FFmpeg support" OFF)
    option(RTS_BUILD_OPTION_PROFILE_TRACY "Enable Tracy profiling for profile builds." OFF)
When we add another profiler here, how would that compile option work?
E.g.
RTS_BUILD_OPTION_PROFILE_TRACY ON
RTS_BUILD_OPTION_PROFILE_OPTICK ON
What is expected to happen?
    return false;

    // allocate intermediate texture
    SurfaceClass *backBuffer = DX8Wrapper::_Get_DX8_Back_Buffer();

    // swizzle shader
    // TheSuperHackers @todo In DX9 with ps2.0 this shader will be much simpler
    ID3DXBuffer *compiledShader = nullptr;
    static const char *shader =
    void TracyCaptureImage();
    TextureClass *m_tracyRenderTarget = nullptr;
    SurfaceClass *m_tracySurfaceClass = nullptr;
    IDirect3DTexture8 *m_tracyIntermediateTexture = nullptr;
Does it need all these members in the class? If possible, pass the texture data in one go from one function to another.
    IDirect3DSurface8 *tracyBackbufferTextureSurface;
    m_tracyIntermediateTexture->GetSurfaceLevel(0, &tracyBackbufferTextureSurface);
    DX8Wrapper::_Copy_DX8_Rects(backBufferSurface, nullptr, 0,
        tracyBackbufferTextureSurface, nullptr);
    @@ -1935,6 +2156,7 @@ void W3DDisplay::draw( void )
    TheGraphDraw->render();
    TheGraphDraw->clear();
    #endif
    TracyCaptureImage();
Is this meant to capture every single frame? Does that create very large captures? What is the recommended interval for this by Tracy? I looked at Tracy Docs and did not find any.
Can we get some numbers? Consider being more conservative with the Frame Grab frequency.
Capturing images on spikes would also be an option.
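A more conservative capture policy, as suggested above, could be sketched like this. FrameCaptureThrottle, its interval and its spike threshold are hypothetical and not part of the PR; the numbers would need to be chosen from real measurements.

```cpp
#include <cstdint>

// Hypothetical helper: decides whether the current frame should be captured.
class FrameCaptureThrottle
{
public:
    // interval: capture every Nth frame (0 disables periodic captures).
    // spikeThresholdMs: also capture any frame at least this slow.
    FrameCaptureThrottle(std::uint32_t interval, double spikeThresholdMs)
        : m_interval(interval), m_spikeThresholdMs(spikeThresholdMs) {}

    bool shouldCapture(double frameTimeMs)
    {
        const bool periodic = m_interval != 0 && (m_frameCounter % m_interval) == 0;
        const bool spike = frameTimeMs >= m_spikeThresholdMs;
        ++m_frameCounter;
        return periodic || spike;
    }

private:
    std::uint32_t m_interval;
    double m_spikeThresholdMs;
    std::uint32_t m_frameCounter = 0;
};
```

The call in W3DDisplay::draw could then be guarded by such a check, possibly combined with TracyIsConnected so that nothing is captured while no profiler is attached.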
Merge by rebase
This PR adds the Tracy profiler to the CMake presets win32-profile and win32-vcpkg-profile.
A variable RTS_BUILD_OPTION_PROFILE_TRACY is added to enable Tracy. Enabling it disables the old profiler.
It is not added to vc6-profile because the Tracy library cannot be compiled against it. The preset mingw-w64-i686-profile is also skipped because it does not build yet (see #2163).
Library include
Tracy is added as a package to VCPKG (version 0.11.1) and as a CMake FetchContent_Declare (version 0.13.1). These were the latest available versions at the time. Unfortunately the VCPKG package is old, but the VCPKG GitHub has PRs in progress to update Tracy to the latest version. Version 0.13.1 improves coloring immensely, which is important for seeing at a glance whether we are render-bound or update-bound.
Tracy themselves recommend adding the library as a CMake include, because the version of the library included in the game must exactly match the version of their main executable tracy-profiler.exe. This can be guaranteed by building tracy-profiler.exe and copying it to the build dir; however, we cannot do this right now because it requires a 64-bit compile.
What are we tracing?
A number of zones were picked that capture the majority of processing time, and will hopefully be useful for profiling sessions.
Performance impact
Tracy itself has a negligible impact, but the frame capturing has a minor one (0.5 ms per frame on average). I have done my best to optimize the frame capturing by downscaling the backbuffer on the GPU side before copying it to the CPU. Moving from DX8 to DX9 should make it possible to simplify some of this code and perhaps improve performance too.
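To make the capture path easier to reason about, here is a CPU-side reference sketch of the two steps that the PR performs on the GPU: picking a downscaled size and swizzling BGRA to RGBA. It is not the PR's implementation. CaptureSize, downscaledSize and bgraToRgba are hypothetical names, and the rounding to multiples of 4 assumes Tracy's documented requirement that frame image dimensions be divisible by 4.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct CaptureSize { std::uint32_t width; std::uint32_t height; };

// Scales the back buffer to a target width, keeps the aspect ratio, and rounds
// both dimensions down to a multiple of 4 (with a minimum of 4).
inline CaptureSize downscaledSize(std::uint32_t srcW, std::uint32_t srcH, std::uint32_t targetW)
{
    std::uint32_t w = targetW & ~3u;
    if (w < 4) w = 4;
    std::uint32_t h = 4;
    if (srcW != 0)
        h = static_cast<std::uint32_t>((static_cast<std::uint64_t>(srcH) * w) / srcW);
    h &= ~3u;
    if (h < 4) h = 4;
    return {w, h};
}

// CPU reference for the channel swizzle: BGRA bytes in, RGBA bytes out.
// Trailing bytes that do not form a full pixel are left as zero.
inline std::vector<std::uint8_t> bgraToRgba(const std::vector<std::uint8_t> &bgra)
{
    std::vector<std::uint8_t> rgba(bgra.size());
    for (std::size_t i = 0; i + 3 < bgra.size(); i += 4) {
        rgba[i + 0] = bgra[i + 2]; // R
        rgba[i + 1] = bgra[i + 1]; // G
        rgba[i + 2] = bgra[i + 0]; // B
        rgba[i + 3] = bgra[i + 3]; // A
    }
    return rgba;
}
```

For a 1920x1080 back buffer and a target width of 256 (the value of TRACY_FRAMEIMAGE_SIZE), this gives 256x144, which is 147,456 bytes of raw RGBA data per captured frame.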
Future work
It should be determined whether this will be sufficient to replace the old profiling code. At that point, and when all other profiles build, we can activate Tracy using only RTS_BUILD_OPTION_PROFILE. Another interesting thing to trace would be a plot of the memory usage.
Todo
Example of a capture. Render bound frame (top), update bound frame (bottom)
