Sparse-checkout builtin: upstreamable version #180

Commits on Aug 19, 2019

  1. sparse-checkout: create builtin with 'list' subcommand

    The sparse-checkout feature is mostly hidden from users, as its
    only documentation is supplementary information in the docs for
    'git read-tree'. In addition, users need to know how to edit the
    .git/info/sparse-checkout file with the right patterns, then run
    the appropriate 'git read-tree -mu HEAD' command. Keeping the
    working directory in sync with the sparse-checkout file requires
    care.
    
    Begin an effort to make the sparse-checkout feature a porcelain
    feature by creating a new 'git sparse-checkout' builtin. This
    builtin will be the preferred mechanism for manipulating the
    sparse-checkout file and syncing the working directory.
    
    For now, create the basics of the builtin. Includes a single
    subcommand, "git sparse-checkout list", that lists the patterns
    currently in the sparse-checkout file. Test that these patterns
    are parsed and written correctly to the output.
    
    The documentation provided is adapted from the "git read-tree"
    documentation with a few edits for clarity in the new context.
    Extra sections are added to hint toward a future change to
    a more restricted pattern set.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Aug 19, 2019
    ab2e063
  2. sparse-checkout: create 'init' subcommand

    Getting started with a sparse-checkout file can be daunting. Help
    users start their sparse enlistment using 'git sparse-checkout init'.
    This will set 'core.sparseCheckout=true' in their config, write
    an initial set of patterns to the sparse-checkout file, and update
    their working directory.
    
    Using 'git read-tree' to clear directories does not work cleanly
    on Windows, so manually delete directories that are tracked by Git
    before running read-tree.
    
    Running 'git read-tree' as a separate process is likely
    suboptimal, but that can be improved in a later change, if valuable.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Aug 19, 2019
    35700b3
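The three steps that the 'init' message lists (set the config, write initial patterns, update the working directory) can be sketched as below. This is only an illustration, not the builtin's code: the function name `sparse_init`, the injectable `run` parameter, and the choice of initial patterns are assumptions (the patterns are borrowed from the later 'clone --sparse' commit, since this message does not list them). It also assumes the repository keeps its metadata in `<repo>/.git` and omits the Windows directory cleanup mentioned above.

```python
import subprocess
from pathlib import Path

# Assumed starting patterns: the root-files-only pair described by the
# later 'clone --sparse' commit. The 'init' message does not list them.
INITIAL_PATTERNS = ["/*", "!/*/*"]


def sparse_init(repo, run=subprocess.check_call):
    """Hypothetical helper mirroring the three steps of 'init'."""
    repo = Path(repo)

    # 1. Turn the feature on in the repository config.
    run(["git", "-C", str(repo), "config", "core.sparseCheckout", "true"])

    # 2. Write an initial pattern set to the sparse-checkout file.
    info_dir = repo / ".git" / "info"
    info_dir.mkdir(parents=True, exist_ok=True)
    patterns_file = info_dir / "sparse-checkout"
    patterns_file.write_text("\n".join(INITIAL_PATTERNS) + "\n")

    # 3. Update the working directory to match the patterns.
    run(["git", "-C", str(repo), "read-tree", "-mu", "HEAD"])
```

Passing a different `run` callable (for example, one that only records the command lists) makes it possible to inspect the steps without touching a real repository.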
  3. clone: add --sparse mode

    When someone wants to clone a large repository, but plans to work
    using a sparse-checkout file, they either need to do a full
    checkout first and then reduce the patterns they included, or
    clone with --no-checkout, set up their patterns, and then run
    a checkout manually. This requires knowing a lot about the repo
    shape and how sparse-checkout works.
    
    Add a new '--sparse' option to 'git clone' that initializes the
    sparse-checkout file to include the following patterns:
    
    	/*
    	!/*/*
    
    These patterns include every file in the root directory, but
    no directories. This allows a repo to include files like a
    README or a bootstrapping script to grow enlistments from that
    point.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Aug 19, 2019
    4e8e6c3
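To see why the pair `/*` and `!/*/*` keeps root files but no directories, here is a deliberately simplified model of pattern evaluation. It is not Git's matcher: it only handles anchored patterns, compares path components with `fnmatch`, and assumes that the last matching pattern decides the outcome. The function name `is_included` is made up for this sketch.

```python
from fnmatch import fnmatchcase


def is_included(path, patterns):
    """Simplified model: the last pattern matching a leading part of the path wins."""
    parts = path.split("/")
    included = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        # Only anchored patterns such as "/a/b" are modelled here.
        pat_parts = pattern.lstrip("!").lstrip("/").split("/")
        if len(pat_parts) > len(parts):
            continue  # the pattern is deeper than the path, so it cannot match
        if all(fnmatchcase(name, pat) for name, pat in zip(parts, pat_parts)):
            included = not negated
    return included


ROOT_ONLY = ["/*", "!/*/*"]

for path in ["README", "build.sh", "src/main.c", "docs/api/index.md"]:
    print(path, is_included(path, ROOT_ONLY))
```

In this model `README` and `build.sh` match only `/*` and stay included, while anything with at least two path components also matches `!/*/*` and is excluded.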
  4. sparse-checkout: 'add' subcommand

    The 'git sparse-checkout add' subcommand takes a list of patterns
    over stdin and writes them to the sparse-checkout file. Then, it
    updates the working directory using 'git read-tree -mu HEAD'.
    
    Note: if a user adds a negative pattern that would lead to the
    removal of a non-empty directory, then Git may not delete that
    directory (on Windows).
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Aug 19, 2019
    8913c25

Commits on Aug 20, 2019

  1. sparse-checkout: create 'disable' subcommand

    The instructions for disabling a sparse-checkout to a full
    working directory are complicated and non-intuitive. Add a
    subcommand, 'git sparse-checkout disable', to perform those
    steps for the user.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Aug 20, 2019
    c08d037
  2. trace2:experiment: clear_ce_flags_1

    The clear_ce_flags_1 method is used by many types of calls to
    unpack_trees(). Add trace2 regions around the method, including
    some flag information, so we can get granular performance data
    during experiments.
    
    Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    jeffhostetler authored and derrickstolee committed Aug 20, 2019
    4aee073
  3. sparse-checkout: add 'cone' mode

    The sparse-checkout feature can have quadratic performance as
    the number of patterns and number of entries in the index grow.
    If there are 1,000 patterns and 1,000,000 entries, this time can
    be very significant.
    
    Create a new 'cone' mode for the core.sparseCheckout config
    option, and adjust the parser to set an appropriate enum value.
    
    While adjusting the type of this variable, rename it from
    core_apply_sparse_checkout to core_sparse_checkout. This will
    help avoid parallel changes from hitting type issues, and we
    can guarantee that all uses now consider the enum values instead
    of the int value.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Aug 20, 2019
    6c1b7fb
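The config change in the 'cone' commit amounts to a three-way value instead of a boolean. A minimal sketch of such a parser follows. The enum and function names are hypothetical, the accepted boolean spellings follow Git's usual true/yes/on/1 and false/no/off/0 convention, and treating "cone" case-insensitively is an assumption. The second helper restates the cost concern from the message under the assumption that every index entry may be compared against every pattern.

```python
from enum import Enum


class SparseCheckoutMode(Enum):
    # Hypothetical names; the commit only says an enum replaces the int.
    NONE = 0
    FULL = 1
    CONE = 2


_TRUE = {"true", "yes", "on", "1"}
_FALSE = {"false", "no", "off", "0", ""}


def parse_sparse_checkout(value):
    """Map a core.sparseCheckout config string to a mode."""
    text = value.strip().lower()
    if text == "cone":
        return SparseCheckoutMode.CONE
    if text in _TRUE:
        return SparseCheckoutMode.FULL
    if text in _FALSE:
        return SparseCheckoutMode.NONE
    raise ValueError(f"unrecognized core.sparseCheckout value: {value!r}")


def worst_case_checks(num_patterns, num_entries):
    """Upper bound on comparisons if every entry is tested against every pattern."""
    return num_patterns * num_entries


# 1,000 patterns against 1,000,000 entries: up to 10**9 comparisons.
print(worst_case_checks(1_000, 1_000_000))
```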
  4. sparse-checkout: use hashmaps for cone patterns

    The parent and recursive patterns allowed by the "cone mode"
    option in sparse-checkout are restrictive enough that we
    can avoid using the regex parsing. Everything is based on
    prefix matches, so we can use hashsets to store the prefixes
    from the sparse-checkout file. When checking a path, we can
    strip path entries from the path and check the hashset for
    an exact match.
    
    As a test, I created a cone-mode sparse-checkout file for the
    Linux repository that actually includes every file. This was
    constructed by taking every folder in the Linux repo and creating
    the pattern pairs here:
    
    	/$folder/*
    	!/$folder/*/*
    
    This resulted in a sparse-checkout file with 8,296 patterns.
    Running 'git read-tree -mu HEAD' on this file had the following
    performance:
    
    	core.sparseCheckout=false: 0.21 s (0.00 s)
    	 core.sparseCheckout=true: 3.75 s (3.50 s)
    	 core.sparseCheckout=cone: 0.23 s (0.01 s)
    
    The times in parentheses above correspond to the time spent
    in the first clear_ce_flags() call, according to the trace2
    performance traces.
    
    While this example is contrived, it demonstrates how these
    patterns can slow the sparse-checkout feature.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Aug 20, 2019
    79a042c
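The prefix lookup described in the hashmap commit can be illustrated with two Python sets standing in for the hashsets. This is a sketch under stated assumptions rather than the C implementation: `in_cone`, `recursive_dirs`, and `parent_dirs` are names invented here, directories are stored without leading or trailing slashes, and the empty string represents the repository root.

```python
import posixpath


def in_cone(path, recursive_dirs, parent_dirs):
    """Return True if a file path is selected by the two directory sets."""
    parent = posixpath.dirname(path)  # "" for files in the root

    # Parent match: files that sit directly inside a parent directory.
    if parent in parent_dirs:
        return True

    # Recursive match: strip one path entry at a time and look for an
    # exact hit in the recursive set.
    directory = parent
    while directory:
        if directory in recursive_dirs:
            return True
        directory = posixpath.dirname(directory)
    return False


recursive = {"drivers/gpu"}
parents = {"", "drivers"}

for path in ["README", "drivers/Makefile", "drivers/gpu/drm/drm_file.c",
             "drivers/net/dummy.c", "kernel/fork.c"]:
    print(path, in_cone(path, recursive, parents))
```

Each lookup costs at most one set probe per path component, so the work per index entry depends on path depth rather than on the number of patterns.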
  5. sparse-checkout: init and add in cone mode

    To make the cone pattern set easy to use, update the behavior of
    'git sparse-checkout [init|add]'.
    
    Add '--cone' flag to 'git sparse-checkout init' to set the config
    option 'core.sparseCheckout=cone'.
    
    When running 'git sparse-checkout add' in cone mode, a user only
    needs to supply a list of recursive folder matches. Git will
    automatically add the necessary parent matches for the leading
    directories.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Aug 20, 2019
    19ad979
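The automatic addition of parent matches can be pictured as follows: each folder the user supplies becomes a recursive match, and every leading directory of it becomes a parent match. The helper name `expand_cone` and the set-based representation are assumptions for illustration, and the sketch does not attempt to reproduce the exact pattern text that Git writes to the file.

```python
import posixpath


def expand_cone(folders):
    """Split user-supplied folders into (recursive, parent) directory sets."""
    recursive = {folder.strip("/") for folder in folders}
    parents = {""}  # the repository root is always a parent
    for folder in recursive:
        directory = posixpath.dirname(folder)
        while directory:
            parents.add(directory)
            directory = posixpath.dirname(directory)
    # A directory that is matched recursively needs no separate parent entry.
    return recursive, parents - recursive


recursive, parents = expand_cone(["deep/deeper1/deepest", "folder1"])
print(sorted(recursive))
print(sorted(parents))
```

With the two folders above, the leading directories `deep` and `deep/deeper1` are added as parents alongside the root, so the user never has to list them.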
  6. Merge branch 'sparse-checkout/v1'

    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Aug 20, 2019
    181fb3b