Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trace2 V3 (squash/fixup of V1+V2) #100

Merged
merged 25 commits into from
Jan 4, 2019

Conversation

jeffhostetler
Copy link

@jeffhostetler jeffhostetler commented Dec 21, 2018

This patch series contains V3 of the Trace2 feature. This version includes the original "v1" parts
(see #28 and #34
which were shipped in GVFS 2.19.*)
and the extensions previously described as "v2" (see #99).

This version first reverts the original V1 parts. The V1 and V2 parts are squashed and lots of fix-ups applied.

The last 2 commits are gvfs-specific. We should be able to upstream the rest as is.

Before sending upstream (for RFC/WIP purposes):
[] I still need to add some unit tests (covering threading and the nested region data).
[] I need to write a design doc.

Much later:
[] Resolve the Trace1 and Trace2 cleanup. (Both original GIT_TRACE messages and custom
GIT_TRACE_key messages.)

jeffhostetler and others added 24 commits December 21, 2018 11:51
Create a new unified tracing facility for git.  The eventual intent is to
replace the current trace_printf* and trace_performance* routines with a
unified set of git_trace2* routines.

In addition to the usual printf-style API, trace2 provides higer-level
event verbs with fixed-fields allowing structured data to be written.
This makes post-processing and analysis easier for external tools.

Trace2 defines 3 output targets.  These are set using the environment
variables "GIT_TR2", "GIT_TR2_PERF", and "GIT_TR2_EVENT".  These may be
set to "1" or to an absolute pathname (just like the current GIT_TRACE).

* GIT_TR2 is intended to be a replacement for GIT_TRACE and logs command
  summary data.

* GIT_TR2_PERF is intended as a replacement for GIT_TRACE_PERFORMANCE.
  It extends the output with columns for the command process, thread,
  repo, absolute and relative elapsed times.  It reports events for
  child process start/stop, thread start/stop, and per-thread function
  nesting.

* GIT_TR2_EVENT is a new structured format. It writes event data as a
  series of JSON records.

Calls to trace2 functions log to any of the 3 output targets enabled
without the need to call different trace_printf* or trace_performance*
routines.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Add optional platform-specific code to log information about
the current process.

On Windows, this includes whether git.exe is running under a
debugger and information about the ancestors of the process.

The purpose of this information is to help indicate if the
process was launched interactively or in the background by
an IDE, for example.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Add trace2_region_enter() and trace2_region_leave() calls around the
various phases of a status scan.  This gives elapsed time for each
phase in the GIT_TR2_PERF and GIT_TR2_EVENT trace target.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Add trace2 process classification for editor and pager
child processes.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Add trace2 classification for long-running processes
started in sub-process.c

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Add trace2 child classification for transport processes.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Classify certain child processes as hooks.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Add trace2 events to measure reading and writing the index.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
When studying the performance of 'git push' we would like to know
how much time is spent at various parts of the command. One area
that could cause performance trouble is 'git pack-objects'.

Add trace2 regions around the three main actions taken in this
command:

1. Enumerate objects.
2. Prepare pack.
3. Write pack-file.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Add trace2 region around read_object_process to collect
time spent waiting for missing objects to be dynamically
fetched.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Copy link
Collaborator

@derrickstolee derrickstolee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't see anything out of the ordinary. I'm running VFS builds in this PR and plan to run some Git tests with coverage on my linux machine soon.

builtin/commit.c Show resolved Hide resolved
@jeffhostetler jeffhostetler changed the title [WIP] [do not merge] Trace2 V3 (squash/fixup of V1+V2) Trace2 V3 (squash/fixup of V1+V2) Jan 4, 2019
Add trace2 region and data events describing attempts to deserialize
status data using a status cache.

A category:status, label:deserialize region is pushed around the
deserialize code.

Deserialization results when reading from a file are:
    category:status, path   = <path>
    category:status, polled = <number_of_attempts>
    category:status, result = "ok" | "reject"

When reading from STDIN are:
    category:status, path   = "STDIN"
    category:status, result = "ok" | "reject"

Status will fallback and run a normal status scan when a "reject"
is reported (unless "--deserialize-wait=fail").

If "ok" is reported, status was able to use the status cache and
avoid scanning the workdir.

Additionally, a sub-verb is emitted for each step: collection,
deserialization, and serialization.  For example, if deserialization
is attempted and fails and status falls back to actually computing
the status, a sub-verb message containing "deserialize" is issued
and then a sub-verb for "collect" is issued.

Also, if deserialization fails, a data message containing the
rejection reason is emitted.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
@jeffhostetler jeffhostetler merged commit 534dbe1 into microsoft:vfs-2.20.1 Jan 4, 2019
derrickstolee added a commit to microsoft/VFSForGit that referenced this pull request Jan 7, 2019
… fix and trace2 v3

* [PR 107 in microsoft/git](microsoft/git#107) includes a fix for the way we permute the set of packs during a `git multi-pack-index expire` command. Specifically, this fixes an issue when we expire packs and add packs to the multi-pack-index at the same time (and the order of those packs).

* [PR 100 in microsoft/git](microsoft/git#100) includes updates to trace2.
@jeffhostetler jeffhostetler deleted the gvfs-trace2-v3 branch July 1, 2019 18:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants