Note: There is no guarantee that version mismatched client and server will
be able to talk with each other. Network protocol breakages won't be listed
here.
vx.xx.x (xxxx-xx-xx)
--------------------
- Enabled workaround for MSVC runtime library SNAFU, which manifested with
the profiler executables crashing at startup inside mutex code.
- CPU topology data now includes CPU die information.
- Clients running under Wine will now report that in the trace info.
- Added flame graph.
- The Git ref information for the build is now included in the about dialog.
- Added support for clipboard copy and paste on Wayland.
- The welcome dialog client address entry field will now trim the entered
address, so that stray spaces at the start and the end are removed. This
should reduce the amount of user precision required when copy pasting the
address from somewhere else.
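The trimming described in the entry above can be pictured with a short sketch. The helper name `TrimAddressSketch` is hypothetical and this is not the profiler's actual code; it only shows stray whitespace being removed from both ends of an entered address.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper (not Tracy's code): strips leading and trailing
// whitespace from a user-entered address such as "  127.0.0.1 ".
inline std::string TrimAddressSketch( std::string_view input )
{
    constexpr std::string_view whitespace = " \t\r\n";
    const auto first = input.find_first_not_of( whitespace );
    if( first == std::string_view::npos ) return {};   // nothing but whitespace
    const auto last = input.find_last_not_of( whitespace );
    assert( last >= first );
    return std::string( input.substr( first, last - first + 1 ) );
}
```

Whitespace inside the address is left untouched, since the entry only mentions spaces at the start and the end.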
- Metal GPU profiling is now available.
- Profiling zones can now optionally inherit their parent color.
- It is no longer necessary to have an up-to-date copy of wayland-protocols
installed. CMake will download the required version from GitHub.
- Added option to show the top inline in symbol statistics list instead of
the symbol name.
v0.11.1 (2024-08-22)
--------------------
- Utilities import-chrome and import-fuchsia now live together in the import
directory.
- Added TRACY_VERBOSE to available CMake options.
- It is now possible to set TRACY_SAMPLING_HZ via an environment variable.
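As a rough illustration of reading such a setting from the environment, the sketch below parses a `TRACY_SAMPLING_HZ` value and falls back to a caller-supplied default. The function names, the accepted range, and the fallback behavior are assumptions made for this example, not a description of how the client actually validates the value.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical parser: returns fallbackHz unless the text is a positive
// integer. The upper bound of 1000000 is an arbitrary choice for the sketch.
inline int ParseSamplingHzSketch( const char* text, int fallbackHz )
{
    if( !text || *text == '\0' ) return fallbackHz;
    char* end = nullptr;
    const long value = std::strtol( text, &end, 10 );
    if( *end != '\0' || value <= 0 || value > 1000000 ) return fallbackHz;
    return static_cast<int>( value );
}

// Reads the environment variable named in the entry above.
inline int SamplingHzFromEnvSketch( int fallbackHz )
{
    return ParseSamplingHzSketch( std::getenv( "TRACY_SAMPLING_HZ" ), fallbackHz );
}
```

Keeping the parsing separate from the `getenv` call makes the validation easy to exercise without touching the process environment.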
- Thread group hints can be now used to group threads together in the
profiler UI.
- Limit Lua file names to 255 characters, as the source string can contain
the whole script, if loaded with loadstring().
v0.11.0 (2024-07-16)
--------------------
- Support for pre-0.9 traces has been dropped.
- The old server-side build system has been replaced by CMake. The client
integration is not affected. Refer to the manual for details.
- Most importantly, a known version of the capstone library is now
downloaded from GitHub. You will need to have git installed for this
to work (there is a CMake option to use the capstone installed on the
system, as was done previously).
- Various Meson fixes.
- Proper way of loading Vulkan calibrated timestamps extension.
- Fixed C API support for GPU tracing when on demand mode is enabled.
- Added a way to resynchronize CPU and GPU timestamps.
- Using calibrated contexts should always be preferred.
- Each synchronization event requires a sync of CPU and GPU, which is
something you always want to avoid.
- This is not exposed as an easy-to-use API available through the GPU
wrappers.
- Added TracyIsStarted macro to check if the profiler has been started.
Using this functionality only makes sense in the manual lifetime mode,
and will always return true in any other mode of operation.
- Added basic QNX support.
- Zmmword is now recognized as an assembly size directive.
- Libunwind can be used for call stack capture on Linux if you build with
the TRACY_LIBUNWIND_BACKTRACE define.
- Preloading symbols for all modules on Windows, which is always performed
on program init, and which can be quite slow, may now be omitted through
the TRACY_NO_DBGHELP_INIT_LOAD define. In this mode, symbols will be
loaded as needed.
- Validation of discontinuous frames has been disabled in on-demand mode.
It's quite likely to connect in the middle of a discontinuous frame,
which resulted in frame end event for a frame that hasn't been started.
- Symbols can be now resolved offline on Windows and Linux.
- Enabled with the TRACY_SYMBOL_OFFLINE_RESOLVE define or env variable.
- The update utility has two additional options:
- -r, which enables resolving symbols and patching stack frames in the
trace.
- -p, which you can use to modify the paths used for symbol resolution.
- Some functionality will be missing if this mode is used. For example,
symbol statistics are unavailable.
- Resolving symbol names on Linux will now use image cache to reduce the
number of dladdr() calls.
- Compiling with the TRACY_LIBBACKTRACE_ELF_DYNLOAD_SUPPORT define will
enable support for run-time updating of known elf ranges in libbacktrace
on Linux. Previously, shared objects dlopened() after libbacktrace init
would not be visible during symbol resolution.
- Zone group count in the Find zone window is now explicitly displayed.
- Instrumentation statistics now display how many threads each source
location has appeared in.
- Added import tool for fuchsia traces.
- https://fuchsia.dev/fuchsia-src/reference/tracing/trace-format
- Added checks for overflow of source locations.
- As a reminder, Tracy only allows 64K unique source locations,
split in half between static and dynamic locations.
- Runtime checks are active during capture and will stop a trace that
goes beyond the limit.
- Load-time checks will stop any broken trace file from loading.
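The limit described above can be sketched as a pair of counters. The exact boundary used here (32768 entries per kind, taking "64K split in half" literally) and all names are assumptions for illustration; the real checks live inside the profiler and may differ in detail.

```cpp
#include <cassert>
#include <cstdint>

// Assumed limit: 64K locations split evenly between static and dynamic.
constexpr uint32_t kMaxLocationsPerKindSketch = 32768;

// Hypothetical bookkeeping, not Tracy's data structure.
struct SourceLocationCounterSketch
{
    uint32_t staticCount = 0;
    uint32_t dynamicCount = 0;

    // Returns false when one more location of the given kind would go past
    // the limit, which is the point where a capture would be stopped.
    bool TryAdd( bool isDynamic )
    {
        uint32_t& count = isDynamic ? dynamicCount : staticCount;
        if( count >= kMaxLocationsPerKindSketch ) return false;
        ++count;
        return true;
    }
};
```

The two kinds are counted independently, so exhausting the static half does not prevent new dynamic locations from being registered.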
- Opening the source code view that has no associated address in code
(i.e., from the list of instrumented zones, or from the find zone
window) will now search the list of symbols for a function name match.
- In many cases this will result in displaying the full disassembly view
where previously you would only see the source code.
- Matching is performed by string comparisons, which in rare cases may
result in showing false data.
- Press ctrl key while opening source view to keep the old behavior.
- If more than one matching symbol is found (e.g., if two classes have
methods with the same name, or if a template is instantiated in multiple
places in code), it is not possible to tell which of the code locations
the source location corresponds to and only the source code will be
displayed.
- Added TracyNoop macro, which inserts a reference to Tracy's object file
into your application. Use it if you want to use Tracy in sampling mode,
without any manual instrumentation (so no references of your own exist)
and link Tracy as a static library. Linkers will only include library code
if code references it, and this doesn't work as intended with Tracy, as it
ignores global constructors that have side effects.
- ZoneText and ZoneName macros now have a printf-like variant, denoted with
a 'F' postfix.
- The 'tracy_shared_libs' Meson option was removed. Use interface provided
by Meson to set the library type instead.
- Dropped the 'tracy_' prefix from Meson options. The `tracy_enable` option
remains as it was, as it can be inherited from parent projects.
- Fixed display of active / inactive allocations in memory call tree.
- Instrumentation statistics can be now sorted by source location.
- Added option to hide external code frames in call stack view.
- There's now a copy to clipboard button in the statistics view. It copies
the visible rows of either the instrumentation or GPU statistics view to
a CSV string matching a subset of the csvexport format.
- Source file contents can be copied to the clipboard.
- Added key binding for immediate reconnect: Ctrl+Shift+Alt+R.
- Lock markup is now available through the C API.
- Symbol statistics window now allows aggregation of inlined functions in
symbols.
- Cost measurements of inlined functions in the symbol statistics window
can be now relative to the base symbol instead of total program run time.
- ScopedZone and AllocSourceLocation now accept color parameter. Impact on
existing code should be minimal.
- AllocSourceLocation has a new parameter with a default value.
- __tracy_alloc_srcloc and __tracy_alloc_srcloc_name break the existing
API. This can be easily fixed by setting the last parameter to zero.
- To build the profiler GUI with Wayland you now need wayland-scanner and
wayland-protocols to be installed. A reasonably recent release of the
protocols is required, which, as always, is not available on Ubuntu.
Seriously, stop trying to build modern software with that broken distro.
- Fractional DPI scaling is now properly supported on Wayland.
- Added Python bindings.
- The per-line sampling statistics are now also displayed as a percentage
of total program run time.
- The out-of-focus render frame rate reduction can be now disabled in
global settings.
- It is now possible to load source files that are newer than the trace.
The default setting is still to reject such files.
- Memory limit for a capture can be now set, both in the GUI profiler and
in the capture utility.
- Thread list can be now sorted alphabetically.
- It is now possible to adjust plot height.
- Trace comparison statistics were expanded and made more clear.
- Implemented retrieval of kernel symbol code on Linux.
- Added support for multiple compression streams in trace files. This
effectively parallelizes both load and save operations.
- The default save setup is now set to Zstd level 3 with 4 compression
streams. This gives both faster compression time and smaller file size.
- New users will be now eased into the profiler with a set of tutorial
achievements.
- You can now set the timeline options default values in global settings.
- Added a check for program memory being available before symbol retrieval
on Windows.
v0.10.0 (2023-10-16)
--------------------
- Missed frames region of on-demand captures will be now ignored when
calculating trace time span, zone percentages, etc.
- Due to technicalities, information about locks, frame statistics in the
trace information window, and the csvexport utility still include the
missed frames time.
- When source location dynamic zone coloring mode is enabled, collapsed
zones will be now gray-colored. Previously such regions fell back to
showing thread colors, which may have been confusing to users.
- Vulkan contexts can now use VK_EXT_host_query_reset extension.
- System power usage is now reported on x86 Linux.
- Program name displayed in broadcast messages can be now changed with the
TracySetProgramName() macro.
- Zone error markers (red regions and error bars) have been removed for
consistency with how all other profiling events are displayed.
- It is now possible to export messages in the csvexport utility.
- Major overhaul of how timeline items are processed in GUI.
- The process of figuring out what needs to be drawn on the timeline has
been heavily parallelized.
- The impact is especially visible with traces containing large amounts
of data. The framerate improvement in such cases can be ~30x.
- Consequently, the profiler GUI will now produce multi-core spikes when
rendering frames. This may have impact on the profiled application's
performance, if both the application and the profiler GUI are running
on the same machine. If this is a problem, you may consider the capture
utility instead, which is not affected by these changes. Alternatively,
you may disable parallelization in the options menu.
- Most of the timeline item logic has been written from scratch, which
may have taken care of some elusive bugs.
- Added global configuration settings dialog. You can find it in the
profiler's about menu (the wrench icon in the welcome dialog).
- List of found zones in the Find zone menu can be filtered by user text.
- Fixed div-by-zero in csvexport utility when there was only one zone of
a kind.
- Fixed compatibility problems with FreeBSD.
- Added support for dynamically loaded Vulkan symbols.
- Trace description or filename is now displayed on the window title bar.
- The csvexport utility will now export thread id data.
- Improved compatibility with MSVC projects not defining NOMINMAX.
- Improved compatibility with Linux setups targeting musl as libc.
- Thread safety of Vulkan instrumentation has been reviewed.
- D3D11 and D3D12 instrumentation was rewritten.
- Added support for efficient profiling when running under rr, the record-
replaying debugger. This is enabled with TRACY_PATCHABLE_NOPSLEDS define.
- History of viewed symbols is now preserved and you can go back to
previously displayed entries.
v0.9.1 (2023-02-26)
-------------------
- Support for pre-0.8 traces has been dropped.
- Profiled programs will ignore dlclose() calls.
- Added warning when the profiler interface is run with privilege elevation.
Advice is given to instead run the client with admin rights.
- Switched to official ZEN4 uarch data.
- Handle cases when thread name is set, but not through Tracy facilities.
- Allow customization of source location data through the following macros:
- TracyFunction - defaults to __FUNCTION__
- TracyFile - defaults to __FILE__
- TracyLine - defaults to __LINE__
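The override-with-default pattern behind these three macros can be shown without the Tracy headers. In the sketch below only the macro names and their defaults come from the entry above; the struct, the helper macro, and the custom file path are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstring>

// A project that wants a custom value defines the macro first...
#define TracyFile "virtual/path/widget.cpp"

// ...and anything left undefined falls back to the compiler built-ins.
#ifndef TracyFunction
#  define TracyFunction __FUNCTION__
#endif
#ifndef TracyFile
#  define TracyFile __FILE__
#endif
#ifndef TracyLine
#  define TracyLine __LINE__
#endif

// Hypothetical stand-in for a source location record.
struct SourceLocationSketch
{
    const char* function;
    const char* file;
    int line;
};

// Hypothetical helper that captures the location of the call site.
#define SRCLOC_SKETCH() SourceLocationSketch{ TracyFunction, TracyFile, TracyLine }

// Checks that the custom file name was picked up while the function name
// and line number still come from the defaults.
inline bool UsesCustomFileSketch()
{
    const SourceLocationSketch loc = SRCLOC_SKETCH();
    return std::strcmp( loc.file, "virtual/path/widget.cpp" ) == 0
        && std::strcmp( loc.function, "UsesCustomFileSketch" ) == 0
        && loc.line > 0;
}
```

With the real library the custom definitions would have to appear before the Tracy header is included, so that the header's defaults are skipped.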
- Tracy on Linux now targets and requires Wayland by default.
- Please don't ask about window decorations on Gnome. Current behavior is
the intended behavior. Gnome does not want windows to have decorations,
and Tracy respects this choice. If you find this problematic, use a
desktop environment that actually listens to its users.
- Pass LEGACY=1 parameter to make, if you want to instead rely on the GLFW
library, like before.
- Other platforms still use GLFW.
- Compare traces menu can now display source code differences between two
traces.
- Assembly listings saved to files have been improved.
- Listings are now annotated with source line information.
- To improve compatibility with external tools comments are now prefixed
with '#' instead of ';'.
- Histogram tooltip will now also show left/right counts.
- Tracy now actively manages timeline vertical scroll offset in order to keep
the thread under the mouse cursor in the same place on screen.
- Removed support for AT&T assembly syntax.
- Tracy will now display a notification if the file selector can't be used.
Possible reasons for failure include lack of xdg-desktop-portal.
- Using the TRACY_NO_CRASH_HANDLER define will disable handling of
application crashes by the profiler.
- Tracy will now query jump and call target addresses. This enables discovery
of target function names, even if such function has no samples and is not
present in any call stack.
v0.9.0 (2022-10-26)
-------------------
- Attention! All the header and source files used for integrating Tracy with
applications were moved to the public/ directory. This will break your
integration!
- To fix this, update the source and include directories lists to point to
the new location.
- Tracy include files directly referenced by the client were moved to
tracy/ subdirectory, to facilitate setups which previously had Tracy
checkout parent directory in the include paths list (i.e. when you
included "tracy/Tracy.hpp").
- Previously, if you had included the Tracy checkout directory in your
project include directories list (i.e. you could include "Tracy.hpp"),
this could result in third-party library conflicts, e.g. with ImGui.
Such conflicts can no longer occur.
- Tracy macros must now be terminated with a semicolon.
- The undocumented ___tracy_demangle() function API has been changed. Please
refer to the source code for further instructions.
- The parameter callback and its registration macro have been extended to
include user data pointer. You will need to update your code accordingly.
- Plots visualization has been improved.
- Each plot now has its own color, which can also be defined by the user.
- The area below the plot is now optionally filled with a color.
- Plots can now also be configured to be staircase instead of smooth. This
new setting is appropriate for many inputs where only discrete values
make sense, e.g. the memory allocation plot.
- The API for TracyPlotConfig() macro has been changed. Please refer to
the manual to see how you can fix this.
- Some text labels in the user interface are now easier to read.
- The profiler will now instruct the user in the UI on what can be done, if
the send queue is slow to process (typically due to symbol resolution).
- If a client with an incompatible protocol is discovered, Tracy will now
try to show which versions can be used to handle the connection.
- Messages list in zone info window can now show messages exclusive to the
zone, filtering out the messages emitted from child zones.
- Added capture of vertical synchronization timings on Linux.
- The range of frame bar colors in the frames overview on top of the screen
can be now controlled with the "Target FPS" entry box in the options menu.
- The "Draw frame targets" option does not need to be selected.
- Previously the hardcoded FPS target thresholds were: 30, 60, 144 FPS.
- Currently the FPS target threshold is: half of target, target, twice the
target.
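The three thresholds derived from the target can be sketched as a simple classification. The enum, the function name, and the treatment of values exactly on a threshold are assumptions for illustration; the entry does not say which color each band maps to, so none are assigned here.

```cpp
#include <cassert>

// Hypothetical bands delimited by: half of target, target, twice the target.
enum class FrameBandSketch { BelowHalf, BelowTarget, BelowDouble, AtLeastDouble };

inline FrameBandSketch ClassifyFrameSketch( double frameTimeSec, double targetFps )
{
    assert( frameTimeSec > 0 && targetFps > 0 );
    const double fps = 1.0 / frameTimeSec;
    if( fps < targetFps * 0.5 ) return FrameBandSketch::BelowHalf;
    if( fps < targetFps ) return FrameBandSketch::BelowTarget;
    if( fps < targetFps * 2.0 ) return FrameBandSketch::BelowDouble;
    return FrameBandSketch::AtLeastDouble;
}
```

With a target of 60 FPS the thresholds fall at 30, 60 and 120 FPS, replacing the old fixed set of 30, 60 and 144.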
- Reworked the way zone names are shortened.
- Previously shortening supported only namespace removal, in a way that
didn't consider function parameters or template arguments.
- Shortening to one-letter namespace chains is no longer available.
- The new shortening rules first perform normalization of the function name.
- The function const qualifier is removed.
- Common return types are removed.
- All function parameters and all template arguments are removed.
- The next steps consist of repeated removal of namespaces, starting with
the outermost one.
- While the old process was all or nothing, the new implementation by
default will dynamically adjust to the space available, trying to show
the most context possible.
- It is also possible to completely disable shortening, or require that it
is always performed in full.
- Function name normalization is enabled by default, even if there is space
to show full function name. This can be changed in options.
- Previously shortening was only applied to the zone names displayed on the
timeline. Currently this process will also apply to all other places in
the UI where function names are displayed. However, in these cases the
function names will only be normalized.
- Full function names are still available as tooltips, or in fine print if
the normalized name is already displayed in a tooltip.
- This functionality is disabled if zone name shortening is disabled.
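A much simplified sketch of the shortening rules described above follows. It is not the profiler's implementation: the function names are hypothetical, return type removal is left out, operators such as `operator<` or `operator()` are not handled, and the available space is measured in characters rather than in rendered text width.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Normalization step: drops every (...) and <...> group, then a trailing
// " const" qualifier.
inline std::string NormalizeNameSketch( std::string_view name )
{
    std::string out;
    int depth = 0;
    for( char c : name )
    {
        if( c == '(' || c == '<' ) { ++depth; continue; }
        if( c == ')' || c == '>' ) { if( depth > 0 ) --depth; continue; }
        if( depth == 0 ) out.push_back( c );
    }
    constexpr std::string_view suffix = " const";
    if( out.size() >= suffix.size() &&
        out.compare( out.size() - suffix.size(), suffix.size(), suffix ) == 0 )
    {
        out.erase( out.size() - suffix.size() );
    }
    return out;
}

// Shortening step: removes the outermost namespace repeatedly until the
// name fits into maxLen characters or no namespace is left.
inline std::string ShortenNameSketch( std::string_view name, std::size_t maxLen )
{
    std::string s = NormalizeNameSketch( name );
    while( s.size() > maxLen )
    {
        const auto pos = s.find( "::" );
        if( pos == std::string::npos ) break;
        s.erase( 0, pos + 2 );
    }
    return s;
}
```

For `ns::Foo<int>::bar(int, float) const` the normalized form is `ns::Foo::bar`, and shrinking the available space drops `ns::` first and `Foo::` next, which mirrors the "show the most context possible" behavior.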
- Added context menu for timeline labels. Currently the only option is to hide
the selected thread, plot, etc.
- You can now provide custom source file contents through a profiler callback.
- Exposed Tracy version to client applications (available through the
common/TracyVersion.hpp header file).
- D3D12 instrumentation is now thread-safe.
- Timeline can be now navigated with WASD keys.
- Symbol file paths are now normalized on libbacktrace systems. For example,
instead of "/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/12.2.0/../../../../
include/c++/12.2.0/bits/std_mutex.h" Tracy will now report such file as
"/usr/include/c++/12.2.0/bits/std_mutex.h".
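The effect of this normalization can be reproduced with the C++17 standard library, as in the sketch below. Using `std::filesystem` here is a choice made for the example and the function name is hypothetical; the entry does not say how the client performs the normalization.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Collapses "." and ".." components purely lexically, without touching the
// file system, and returns the result with forward slashes.
inline std::string NormalizePathSketch( const std::string& path )
{
    return std::filesystem::path( path ).lexically_normal().generic_string();
}
```

Because the operation is lexical, symbolic links are not resolved, so the result may differ from what a call such as `realpath()` would return on a system where some component is a symlink.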
- The import-chrome utility interprets Instant (`i`/`I`) events where the
`name` field contains the word `frame` as a frame event. The `name` is the
frame set name.
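The rule in the entry above amounts to a small predicate over the event phase and name. The struct and function names below are hypothetical, and treating the match as case-sensitive is an assumption; the utility's actual parsing code is not shown here.

```cpp
#include <cassert>
#include <string>

// Hypothetical, reduced view of a Chrome trace event: only the phase ("ph")
// and "name" fields matter for this rule.
struct ChromeEventSketch
{
    std::string ph;
    std::string name;
};

// An Instant event ("i" or "I") whose name contains "frame" is treated as a
// frame event, with the name used as the frame set name.
inline bool IsFrameEventSketch( const ChromeEventSketch& ev )
{
    const bool instant = ev.ph == "i" || ev.ph == "I";
    return instant && ev.name.find( "frame" ) != std::string::npos;
}
```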
- Frame data won't be displayed if there was no frame instrumentation in the
profiling session.
- Note that some automated functionality (e.g. vertical synchronization
capture) may automatically generate frame data, which will force frames to
be displayed.
- Tracy threads will now be collapsed by default on the timeline.
- Clicking on a local thread in the CPU data view will make the thread visible
and uncollapsed on the timeline.
- Assembly view is now in color.
- The profiler UI will no longer unnecessarily redraw the screen if nothing
was changed. This should have a profound impact on power usage.
- Added microarchitecture data for Zen 4.
- Implemented optional propagation of inline cost down the local call stack.
- This feature may be useful when trying to get a general outlook of the
cost at the top-level function in the symbol.
- It is possible to get nonsense data when this is enabled, for example
total cost exceeding 100%. This is by design.
- Assembly line costs are not affected.
- Available clients now also broadcast their PID.
- Reversed mouse button assignments for jumping to source / assembly line in
symbol view. The left mouse button will now focus the target line.
- Assembly lines tooltip will now display local call stack of inline functions
(within the symbol).
- Right-clicking the source location entry in assembly line will show the
local call stack, along with source code preview of each entry and ability
to navigate to any selected inline function.
- The profiler UI will now indicate that it needs attention if the window is
not focused and something interesting happens. For example when a connection
is established, or when a saved trace finishes loading, etc. How the
attention request is indicated depends on the operating system.
- Clicking on the red microarchitecture icon in the symbol view assembly pane
will switch the selected microarchitecture to one the profiled application
was running on.
- Removed option to display instruction latencies in a graphical form. Latency
data is still available in instruction tooltip.
v0.8.2 (2022-06-28)
-------------------
- Added support for debuginfod debug information services. Note that
since this depends on proper system configuration, vendors providing
the debug information, and network retrieval, it is disabled by
default. To enable, compile the profiled application with the
TRACY_DEBUGINFOD define and link with libdebuginfod.
- When Tracy server-side utilities are built with MSVC, the required
libraries will be now automatically retrieved and built with vcpkg.
- Added microarchitecture data for: Bonnell, Airmont, Goldmont, Goldmont
Plus, Tremont.
- Recognize additional CPUIDs of Zen 3, Alder Lake, Ice Lake
microarchitectures.
- Assembly line width will be now extended, if needed. Previously the line
width was calculated for the initial layout and changing amount of
displayed data (especially listing the read/written registers) didn't
affect this, which may have made some lines partially unreadable.
- Added ability to filter call stacks in memory tab by inactive allocations.
Filtering by inactive allocations helps to pinpoint wasteful allocations
in the program.
- Plot graph will no longer display min/max values interpolated for
animation, but rather true values.
- The CPU topology tree structure was replaced by a CPU schematic showing
the same thing in a more concise way.
v0.8.1 (2022-04-21)
-------------------
- Support for pre-0.7 traces has been dropped.
- Update utility can now scan for source files missing in the trace cache,
if the '-c' parameter is given. Found files will be added to the cache.
- Added high-priority queue for fast queries to bypass slow symbol queries.
- Fixed Android documentation to show how to enable context switch tracing.
- Worked around MSVC 2015 stupidity which prevented compilation as C++11.
- Added support for showing branch cost data for CPUs that don't report
branch retirement events (but do report branch misses).
- The right-click context menu available for jump arrows in the symbol view
window will now additionally display jump context, i.e. jump sources and
jump target source code fragments.
- Added freedesktop.org compliant desktop entry and MIME type definition.
- The call stack column in list of messages will now be only displayed when
at least one message on the list has call stack data.
- File dialogs on Unix will be now native to the desktop environment you are
using. Note that this relies on xdg-desktop-portal and dbus.
v0.8.0 (2022-03-28)
-------------------
- Support for Cygwin has been dropped. It was not working for a very long
time and nobody had complained about it.
- Mingw is deprecated due to lack of interest.
- Added TRACY_NO_CALLSTACK_INLINES macro to disable inline functions
resolution in call stacks on Windows.
- Improved function matching algorithm in compare traces view.
- Added CMake integration.
- Reworked rpmalloc initialization.
- Fixed display of messages with newlines on messages list.
- Excluded some uninteresting wrapper functions from call stacks (for
example SIMD pass-through intrinsics to the compiler built-ins).
- Adjusted coloring of instruction hotness in symbol view.
- Properly handle rare cases when sampling on Linux is momentarily not able to
resolve time stamps.
- Added Rocket Lake microarchitectural data.
- Updated CPU identifier lists.
- Implemented GPU timer overflow handling heuristics.
- Assembly instructions are now assigned to inline symbols.
- You can now see not only the assembly source file and line, but also the
originating function.
- If symbol view is restricted to a single inline function, all assembly
instructions not in this context will be dimmed out.
- Likewise, the navigation in assembly code will be limited just to the
inline context, if a single function is selected.
- Kernel call stacks will be now properly captured and displayed in the
profiler. Kernel functions are marked with the red color.
- The CPU hardware performance counters can be now sampled on Linux.
- Three inferred statistics are displayed for lines in both source and
assembly code in the symbol view window:
- Instructions executed per cycle.
- Branch miss rate.
- Cache miss rate.
- Instruction cost estimation method is no longer tied to software call
stack sampling.
- The image name filter entry field is now providing a list of available
images.
- Reentrant function calls may be now excluded from calculations in the
statistics view.
- Crash handler is now properly removed during profiler destruction.
- Repeatedly right-clicking on the same source line in the symbol view
window will now cycle through assembly blocks associated with this source
line.
- Vulkan headers must be now explicitly included before including
TracyVulkan.hpp.
- The capture utility may now limit capture time to a specified number of
seconds.
- Fixed message thread assignment in the import-chrome utility.
- Sampling data can be now also found in the find zone menu.
- Instrumentation failures may now display their context, e.g. the zone text
that was to be set.
- A warning is now displayed when sampling data is out-of-order.
- Average value for plots can be now viewed.
- Moved symbol resolution to a separate thread. Profiling will no longer be
stuck when there is a large number of symbols to resolve. This not only
improves user experience, but also prevents buildup of data (and memory
consumption) on the client side.
- Android device name will be now reported.
- Added support for capturing fibers.
- Fibers require additional processing, which has to be enabled by adding
the TRACY_FIBERS define on the client side.
- Client code requires additional instrumentation using the new macros
TracyFiberEnter and TracyFiberLeave (or the corresponding C API
variants).
- Fibers are represented in traces as separate threads, and are
distinguished by green color. Faux context switch regions are used to
indicate when a fiber is being run by the worker thread.
- Continuous frame marks no longer need to be issued from a single thread.
- Context switch call stacks are now captured on Windows and Linux.
- Hovering the context switch wait region will now display wait stack,
which may provide additional insight into why the switch happened.
- Wait stacks inspection can be performed in a new view.
- Stacks can be limited to certain threads and to a selected time range.
- Stacks are presented either as a sorted list, or as bottom-up and
top-down trees.
- Entry call stacks can be now also viewed as bottom-up and top-down
trees.
- Updated project build files to MSVC 2022.
- Call stack tooltips now also show the executable image name.
- Playback frames can be now changed by interacting with the frame image
slider using the mouse wheel.
- Signal used to handle crashes on Linux can be now redefined.
- Various DPI scaling improvements.
- User interface can be now scaled in run time.
- Symbol code retrieval now also supports kernel on Windows.
- Added low-level C API interface for GPU zones.
- Symbol child calls can be now listed.
- Replaced "restrict time" in memory window with a proper time range limit.
- Added Alder Lake microarchitectural data.
- Added GPU zone statistics.
- Universal Windows Platform support.
- All call stack related functionality can be now disabled with the
TRACY_NO_CALLSTACK macro.
- Added ability to add full-view annotations from the annotations list
window.
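The fiber instrumentation listed above (the TRACY_FIBERS define together with the TracyFiberEnter and TracyFiberLeave macros) might be used roughly as sketched below. This is an illustration under stated assumptions, not code from the Tracy repository: the stand-in macro definitions and the g_fiberLog vector only record calls so that the snippet builds without Tracy, and RunFiberSlice and the fiber name are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-ins so the sketch builds without Tracy. In a real client the macros
// come from the Tracy header, with TRACY_FIBERS defined on the client side.
#ifndef TracyFiberEnter
static std::vector<std::string> g_fiberLog;   // hypothetical, for illustration only
#define TracyFiberEnter( name ) g_fiberLog.push_back( std::string( "enter " ) + ( name ) )
#define TracyFiberLeave g_fiberLog.push_back( "leave" )
#endif

// Hypothetical scheduler step: a worker thread resumes a fiber, lets it run
// for a while, then switches away from it again.
void RunFiberSlice( const char* fiberName )
{
    assert( fiberName != nullptr );
    TracyFiberEnter( fiberName );   // events from here on are attributed to the fiber
    // ... resume the fiber and run its work here ...
    TracyFiberLeave;                // the worker thread is no longer running the fiber
}
```

Calling RunFiberSlice( "job fiber" ) with the stand-in macros records one enter and one leave event, which mirrors the enter/leave pairing the real macros expect.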
v0.7.8 (2021-05-19)
-------------------
- Updated Zen 3 and added Tiger Lake microarchitectural data.
- Manually disconnecting from the server will no longer display an erroneous
warning message.
- Added ability to display sample time spent in child function calls.
- Fixed issue which may have prevented sampling on ARM64.
- Added TRACY_NO_FRAME_IMAGE macro to disable frame image compression
thread.
- Ctrl and shift keys will now modify mouse wheel zoom speed.
- Improved user experience in the symbol view window.
- Added support for Direct3D 11 instrumentation.
- Vulkan contexts can be now calibrated on Linux.
- Support loading zstd-compressed chrome traces.
- Chrome traces with multiple PID entries (and possibly conflicting TIDs)
can be now imported.
- Added support for custom source location tag ("loc") in chrome traces.
- Sampling frequency can be now controlled using TRACY_SAMPLING_HZ macro.
- Trace compression can be now selected when saving a trace.
- If a trace cannot be saved, a failure dialog will be displayed.
- Run-time memory usage of frame images can be reduced by calculating
a compression dictionary. This can be only performed when a trace is saved
or through the update utility.
v0.7.7 (2021-04-01)
-------------------
- Linux crash handler will now also catch SIGABRT.
- Fixed invalid name assignment to source files discovered client-side.
- Added ability to check if a zone is active (which may be used to avoid
preparing zone text, etc., as it wouldn't be used anyway).
- Improved sorting behavior of internal vectors.
- Some data will now be always properly displayed during live capture.
This was not particularly visible before, as it mainly concerns edge
cases.
- Sorting is performed only as needed.
- In case of plots the performance during live capture may be decreased,
as these were sorted with at least 0.25 second intervals before. Now
the sorting is performed every frame.
- Some other data, which previously was not sorted, is sorted now.
- In headless capture mode sorting will be only performed when the trace
is saved to disk.
- Fixed some typos in macros.
- Fixed handling of non-ANSI file names on Windows. You can now name your
traces 'ęśąćż.tracy' and it should work as intended. This is supported on
Windows 10 release 1903 and newer.
- Fixed sending GPU context name in on-demand mode.
- Fixed color channel order in ZoneColor() macro.
- Handle failure state when a memory pointer allocation is reported twice,
without an intermediate free.
- Renamed "call stack parents" to "entry call stacks".
- Display number of entry call stacks in assembly line sample count tooltip.
- Added tooltips with preview of source code in various places in the UI.
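The "sorting is performed only as needed" behavior from the v0.7.7 notes above can be illustrated with a small sketch. It is not Tracy's internal vector implementation; the LazySortedVector class and its members are hypothetical names chosen for this example. Appends stay cheap, and a sort runs only when a reader asks for ordered data after an out-of-order append.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical container: sorting is deferred until ordered data is needed.
class LazySortedVector
{
public:
    void Push( int64_t v )
    {
        // Mark the data dirty only if the new element breaks the existing order.
        if( !m_data.empty() && v < m_data.back() ) m_dirty = true;
        m_data.push_back( v );
    }

    // Returns the data in ascending order, sorting only if it is dirty.
    const std::vector<int64_t>& Sorted()
    {
        if( m_dirty )
        {
            std::sort( m_data.begin(), m_data.end() );
            m_dirty = false;
            ++m_sortCount;
        }
        return m_data;
    }

    // Number of sorts actually performed, exposed for illustration.
    int SortCount() const { return m_sortCount; }

private:
    std::vector<int64_t> m_data;
    bool m_dirty = false;
    int m_sortCount = 0;
};
```

In a headless capture, a structure like this would only need Sorted() to be called once, when the trace is written out.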
v0.7.6 (2021-02-06)
-------------------
- Various fixes in build scripts.
- Fixed a faulty rpmalloc initialization path when the first thing the
thread did was to send a message with a call stack.
- Added fallback timer define for various virtualized environments, which
may not be able to access the hardware timer registers. This will result
in use of the timer provided by the standard library, with reduced
resolution.
- Further OpenCL improvements.
- Updated libbacktrace.
- Adds Mach-O 64-bit FAT support.
- Fixes memory corruption when processing Mach-O data.
- Fixes missing matching entries during binary search.
- Adds support for MiniDebugInfo.
- Adds fallback to ELF symbol table if no debug info is available.
- Various other fixes.
- Store build time of profiled program in captures.
- GPU contexts can be now named.
- Implemented client -> server source code transfer.
v0.7.5 (2021-01-23)
-------------------
- More robust handling of system tracing on Android.
- Added warning dialog when the connection is lost before all needed data
can be retrieved.
- Fixed handling of NaN plot entries (by skipping them).
- Dynamic zone colors are now supported through the ZoneColor() macro.
- Fixed Arm machine code printout to match the one printed by objdump.
- Fixed client memory corruption when using colored messages.
- Switched to the next-gen ImGui table UI.
- Table columns can have their order rearranged, can be hidden, can be
sorted both in ascending and descending order (where appropriate).
- Table columns state is now preserved between runs.
- Various fixes related to restricting listening to localhost.
- Improved compatibility of ETW tracing with non-MSVC compilers.
- Fixed Vulkan call stack transfer.
- Added support for transient GPU zones (OpenGL, Vulkan, Direct3D 12).
- OpenCL fixes for assert-less builds and non-active zones.
- Added support for thread names and title bar description in traces
imported from chrome tracing format.
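The NaN handling mentioned in the v0.7.5 notes above (NaN plot entries are skipped) can be sketched as follows. This is a minimal illustration, not the profiler's actual code; PlotItem and AddPlotItem are hypothetical names.

```cpp
#include <cmath>
#include <vector>

// Hypothetical plot sample: a timestamp and a value.
struct PlotItem
{
    long long time;
    double val;
};

// Appends a sample unless its value is NaN. Returns true if it was stored.
bool AddPlotItem( std::vector<PlotItem>& plot, long long time, double val )
{
    // NaN compares false against everything, so it would break min/max
    // tracking and any sorting by value; drop it instead.
    if( std::isnan( val ) ) return false;
    plot.push_back( PlotItem { time, val } );
    return true;
}
```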
v0.7.4 (2020-11-15)
-------------------
- Added support for user-provided locks to keep dbghelp calls thread-safe.
- Call stacks can be now copied to clipboard.
- Allow more control over which automated captures are performed.
- Added textual descriptions for some assembly instructions.
- Profiler memory usage is now also displayed as a percentage of available
physical memory.
- Microarchitecture mismatch is now clearly displayed in the source view
window.
- Added Zen 3 and Cascade Lake microarchitectural data.
- Ghost zones now support all zone coloring modes and namespace
shortening.
- Extend C API to support memory pools.
- Frame rate targets can be now visually represented on the timeline view.
v0.7.3 (2020-10-06)
-------------------
- Properly support DPI scaling on Linux (requires GLFW 3.3).
- Added early checks for output file validity in the capture utility.
- Improvements to presence broadcast handling.
- Custom zone colors can be optionally ignored.
- Added support for tracking multiple memory pools.
- Memory free failure dialog can now show call stack pointing to the failure
location.
- Added support for Wayland on Linux.
- If during the first 5 seconds of the trace there are no frames being
reported, the profiler will switch to following the last 5 seconds of the
trace, instead of displaying the last three frames.
v0.7.2 (2020-09-14)
-------------------
- Note: the bitbucket repository is obsolete and will soon stop receiving
updates. Migrate to https://github.com/wolfpld/tracy, if you haven't
already.
- The "waiting for connection" dialog no longer has a "cancel" button. To
abort a connection attempt, just use the "close window" button.
- Added update notification.
- The most recent traced events can be now viewed regardless of timeline
zoom level.
- Fixed going-to-line in source view (again).
- Crash handling on the client is no longer performed if there is no active
connection.
- Added ability to listen only on IPv4 interfaces.
v0.7.1 (2020-08-24)
-------------------
- Dropped support for pre-v0.6 traces.
- Fixed regression on non-AVX2 CPUs.
- Fixed incorrect calculation of some ghost zones.
- Added list of cached source files.
- Added import of plot data.
- Secure versions of alloc/free macros.
- Automated tracing of vertical synchronization on Windows.
- Fixed attachment of postponed frame images.
- Source location data can be now copied to clipboard from zone info window.
- Zones in find zones menu can be now grouped by zone name.
- Vulkan and D3D12 GPU contexts can be now calibrated.
- Added CSV export utility.
- "Go to frame" popup no longer has a dedicated button. To show it, click on
the frame counter.
- Added macro for checking if profiler is connected.
- Implemented optional data removal from traces in the update utility.
- Allow manual management of profiler lifetime.
- Adjusted priority of ETW threads to time critical.
- Annotations can be now freely adjusted on the timeline.
- Limiting time range for find zone functionality has been significantly
improved.
- Added time range limits for statistics and symbol view.
- Implemented call stack sampling on Linux (including Android).
- Exact time from start of profiling session can be now viewed by hovering
the mouse over the time scale.
- Code transfer can be now compiled-out.
- Added support for zone markup in unloadable modules.
- Added image name filter to sampling statistics results window.
v0.7 (2020-06-11)
-----------------
This is the last release which will be able to load pre-v0.6 traces. Use the
update utility to convert your old traces now!
- chrome:tracing importer now imports zone metadata from "args" key.
- Added display of statistical mode to find zone menu.
- Automatic stack sampling is now available on Windows.
- Properly handle tracing on long-running systems.
- Message list entries can now show associated frame image.
- Call stack window will now display module names.
- Symbol location in call stack window may now also display symbol address.
- Statistics menu can now be used to display call stack sampling data or
list available symbols.
- All call paths leading to the sampled instruction in a call stack can be
now displayed.
- Frame image compression ratio (lossless in-memory compression, not taking
into account DXT compression) is displayed in playback window.
- Allow reconnection straight from the discard data dialog.
- Added ability to set custom names for locks.
- Improved handling of network ports.
- Added time percentage display to instrumentation statistics.
- Display of ghost zones (generated from automated call stack sampling).
- Notify when empty labels display is enabled.
- Small fragments of executable code will be now sent from client to server.
- Added notification about query backlog.
- Fixed performance problem with query backlog.
- Display number of in-flight queries, in addition to query backlog.
- Improved failure reports.
- The capture utility will connect to localhost by default.
- Added optional support for QPC timer on Windows.
- Complete rewrite of source file viewer. It is now 100% reliable when going
to a source location.
- Symbol source view was added.
- Extension of source file viewer.
- Can display source file, assembly view, or both at the same time.
- May include display of statistical profiling data.
- Ability to switch between source files which were used to build the
symbol.
- Ability to switch between inlined functions which are incorporated into
the symbol.
- Graphical representation of control flow in program.
- Display of micro-architectural data for each assembly instruction.
- Tracking register dependencies between assembly instructions.
- Disassembly may be saved to a file, in order to be processed by external
tools.
- If the default listening port is occupied, profiler will now try listening
on other ports.
- Added possibility to perform source file names substitution.
- Profiler windows can be now docked.
- CPU usage tooltip now displays a list of running threads.
- Added possibility to filter discovered clients list.
- Source files are now cached during capture.
- Profiler will now display a popup when application crashes.
- Added ability to send simple integral values as extra payload for zones.
- Per-frame zone times on the frames plot can now display self time.
- Ability to bind only on localhost interface.
- OpenCL profiling.
- Direct3D 12 profiling.
v0.6.3 (2020-02-13)
-------------------
- Fixed performance issues with loading saved traces on Ryzen CPUs.
- Profiler window contents are now properly updated during window resize.
- Improved tid to pid mapping on Windows.
- Zero length and unfinished zones are no longer taken into account for
statistics.
- Build files for shared library are now available (experimental).
- GPU zones now also have "active" parameter.
- Further reduction of memory usage and on-disk trace size.
- Replaced ska::flat_hash_map with robin-hood-hashing.
- Speed-up rendering of long lists of items.
- Exact event time is displayed in some places in the UI.
- Memory allocation lists can now be sorted.
- Added display of trace file compression ratio.
- Optional Zstd compression of trace files.
- Frame images are now internally compressed using Zstd (instead of LZ4).
- Fix display of continuous frame set tooltips.
v0.6.2 (2019-12-30)
-------------------
- Improved call stack decoding on OSX.
- Collection of CPU topology data.
- C API now supports allocated source locations.
- Added chrome:tracing importer.
- Allow merging of ZoneText() strings.
- Time distribution can now show both exclusive and inclusive times.
- Display proper value of selection time in find zone menu.
- Implemented limiting find zone search to a specified time range.
- Highlight hovered zone from find zone menu zone list on the histogram.
- Allow copying user data directory location to the clipboard.
v0.6.1 (2019-11-28)
-------------------
- Dropped support for pre-v0.5 traces.
- Improve BSD support.
- GPU zone CPU thread highlight will now highlight whole thread, not only
the thread name.
- Added CPU thread highlight for CPU data items.
- Client parameters may be now set from the server.
- Minor UI fixes.
v0.6 (2019-11-17)
-----------------
This is the last release which will be able to load pre-v0.5 traces. Use the
update utility to convert your old traces now!
- Dropped support for pre-v0.4 traces.
- Major memory usage decrease.
- Significant network bandwidth decrease.
- Implemented context switch capture on selected platforms.
- Zone timings in various UI places can now take into account only the
time when the thread was executing.
- Zone information window can now display regions in which thread was
suspended by the operating system.
- CPUs on which the zone was running are enumerated.
- Thread activity regions can be graphed on the timeline.
- API breakage: SetThreadName() now only works on current thread.
- Fixed thread name retrieval after thread is destroyed.
- Added number of CPU cores to host info.
- Limited number of possible source locations to 64K.
- Limited supported capture length to 1.6 days.
- CPU cores are now displayed on the timeline.
- Thread execution workload is displayed, including threads from external
programs.
- Thread migrations across CPU cores can be graphed.
- System-wide workload distribution is now plotted on the timeline.
- Added "CPU data" window showing programs competing for CPU during the
capture.
- Switched to using native thread identifiers (relatively small numbers), as
opposed to pthreads identifiers, which in reality were pointers.
- Improved thread name discovery if context switch capture is enabled.
- Per-trace state is now preserved between profiling sessions:
- Timeline view position.
- Item categories draw/hide settings.
- Timeline zones will be highlighted using a different color, when a
matching time range is selected on histogram.
- Per-frame zone times are now displayed on the frames plot when a zone is
selected in the find zone menu.
- Zone color is now displayed in zone information window.
- Zone colors can now be determined based on depth and thread or source
location.
- Thread colors are displayed across the profiler application.
- Frame times can be now compared.
- Expose more lock handling functionality.
- Network port can be now specified by the user.
- Proper handling of multithreaded Vulkan code.
- Added extreme compression level in update utility.
- Added time distribution data in the zone information window.
- Trace file name is now displayed in trace information window.
- Annotations can be now added to the timeline.
- Server now performs network data retrieval and decompression on a dedicated
thread.
- Added examples of Tracy integration.
- Allow grouping of zones in the find zone menu by zone parent or with no
grouping.
- Zone list in the statistics window can be now filtered.
- Implemented configuration of plots.
- Messages can now collect call stacks.
v0.5 (2019-08-10)
-----------------
This is the last release which will be able to load pre-v0.4 traces. Use the
update utility to convert your old traces now!
- Major decrease of trace dump file size.
- Major optimizations across the board.
- Vcpkg is now used for library management on Windows.
- Display dump file size change in the update utility.
- Added notification area.
- Display trace loading time.
- Display background processing tasks after trace is loaded.
- Display trace save notification.
- Show crash icon, if there was a crash.
- Added C API.
- Profiling session may now gracefully terminate, due to incorrect
instrumentation. A popup with termination reason will be displayed.
- Call stack improvements.
- Call stack frames now have a proper source file and file line
information on Linux.
- Single call stack frame may now have multiple entries, representing
inlined function calls.
- Call stack grouping in the find zone menu now has a special display
mode.
- Call stack memory allocations tree improvements:
- Add top-down variant to complement the previously available bottom-up
one.
- Add ability to group tree nodes by function name.
- Allow restricting tree to display only active allocations.
- Added support for Lua call stack capture.
- Self time of zones may be now displayed in the find zone menu.
- Added ability to disconnect from a client.
- Find zone groups can now be sorted by mean time per call.
- Zones displayed in the find zone menu can be now grouped by order of
appearance, execution time or name.
- Time is now displayed without trailing fractional zeros (e.g. "2.5 ms"
instead of "2.50 ms").
- Child zones displayed in zone info window can be now grouped by source
location.
- Selected or hovered lock is now highlighted on the timeline.
- Locks are now grouped into single and multithreaded (contended and
uncontended) in the options menu locks list.
- On broken platforms the profiler can now be initialized as needed (and
possible), taking a performance and functionality hit.
- User experience improvements in the graphical profiler.
- Thread position and height are now animated, to eliminate flickering that
was happening when depth of displayed zones was changing.
- Zooming in/out using the mouse wheel is now animated.
- Plot range adjustment is now animated.
- Various other UI improvements.
- System CPU usage is now being monitored.
- Threads that have nothing to display in the current view are now hidden by
default.
- Dimmed-out the timeline outside the profiling area.
- Source file view can now be opened also from statistics menu.
- Display standard deviation in find zone and compare traces menus.
- Display zone messages in zone information window.
- Display order of threads can be changed in the options menu.
- Prevent deadlocks by querying socket send buffer size.
- Frame set statistics can be now limited to frames visible on the screen.
- Messages can be now colored.
- Zone selection in compare traces menu can be now linked to the other
trace.
- Added support for frame image (screen shot) storage.
- Implemented ability to cut off outliers on histograms.
- Zone or frame that is currently hovered by the mouse cursor will be
highlighted on the histogram.
- Server now displays available clients in the local network.
- Source code whitespace visibility can now be enabled or disabled.
- Profiler will now check if proper timer readings can be performed on
x86/x64.
- Application can now log app-specific information, similarly to how the
host info reports system information.
- Message list will automatically scroll down to the most recent message.
- This feature is disabled when the list is scrolled by the user.
- To re-enable, scroll to the bottom of the list.
- Message list can be now filtered.
- A notification popup will be displayed during trace cleanup.
- Source file view won't be available if a source file is newer than the
capture.
- Added ability to set custom trace descriptions.
- Added frame time target lines.
- FPS counts are now displayed next to frame times.
- GPU drift value can be now automatically measured.
- Connection window is now a popup hidden under a dedicated button.
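The time display change from the v0.5 notes above ("2.5 ms" instead of "2.50 ms") can be illustrated with a short sketch. It is not the profiler's own formatting routine; FormatMs is a hypothetical helper, and the fixed two-decimal precision and millisecond unit are assumptions made for the example.

```cpp
#include <cstdio>
#include <string>

// Formats a millisecond value with two decimals, then strips trailing
// fractional zeros and a dangling decimal point.
std::string FormatMs( double ms )
{
    char buf[64];
    std::snprintf( buf, sizeof( buf ), "%.2f", ms );
    std::string s( buf );
    // "%.2f" always emits a decimal point, so only fractional zeros are removed.
    s.erase( s.find_last_not_of( '0' ) + 1 );            // "2.50" -> "2.5", "3.00" -> "3."
    if( !s.empty() && s.back() == '.' ) s.pop_back();    // "3." -> "3"
    return s + " ms";
}
```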
v0.4.1 (2018-12-30)
-------------------
- Active frame set can be now switched by clicking on a frame set on the
timeline.
- Add ability to go to a specified frame.
- Most commonly used addresses can be now selected from the drop-down menu.
- Fixed corner case problem with profiler initialization on Windows.
- Added third state (stopped) to the pause/resume button. It will be used
after the connection to the client is terminated.
- Active trace can be discarded.
- Call stack capture may be forced through TRACY_CALLSTACK define.
- Lock info window has been added.
- Time of lock creation and termination is now being tracked.
- Menu bar buttons are now toggles that can also close their corresponding
windows.
- Find zone and compare menu improvements.
- Ability to ignore case during search.
- Pressing enter key will now start search, just like pressing the "find"
button.
- Using the ^F keyboard shortcut will open the find zone menu and focus
the input box.
- Added ability to automatically connect to an IP address in the graphical
profiler application (use "-a address" argument to enable).
- Pressing enter key after entering client address in the welcome dialog
will now automatically begin connection process.
v0.4 (2018-10-09)
-----------------