Skip to content

Commit

Permalink
Merge branch 'ns/core-fsyncmethod' into jch
Browse files Browse the repository at this point in the history
Replace core.fsyncObjectFiles with two new configuration variables,
core.fsync and core.fsyncMethod.

* ns/core-fsyncmethod:
  core.fsync: documentation and user-friendly aggregate options
  core.fsync: new option to harden the index
  core.fsync: add configuration parsing
  core.fsync: introduce granular fsync control infrastructure
  core.fsyncmethod: add writeout-only mode
  wrapper: make inclusion of Windows csprng header tightly scoped
  • Loading branch information
gitster committed Mar 24, 2022
2 parents a68dfad + b9f5d03 commit 4f5f388
Show file tree
Hide file tree
Showing 26 changed files with 444 additions and 60 deletions.
58 changes: 54 additions & 4 deletions Documentation/config/core.txt
Original file line number Diff line number Diff line change
Expand Up @@ -547,13 +547,63 @@ core.whitespace::
is relevant for `indent-with-non-tab` and when Git fixes `tab-in-indent`
errors. The default tab width is 8. Allowed values are 1 to 63.

core.fsync::
A comma-separated list of components of the repository that
should be hardened via the core.fsyncMethod when created or
modified. You can disable hardening of any component by
prefixing it with a '-'. Items that are not hardened may be
lost in the event of an unclean system shutdown. Unless you
have special requirements, it is recommended that you leave
this option empty or pick one of `committed`, `added`,
or `all`.
+
When this configuration is encountered, the set of components starts with
the platform default value, disabled components are removed, and additional
components are added. `none` resets the state so that the platform default
is ignored.
+
The empty string resets the fsync configuration to the platform
default. The default on most platforms is equivalent to
`core.fsync=committed,-loose-object`, which has good performance,
but risks losing recent work in the event of an unclean system shutdown.
+
* `none` clears the set of fsynced components.
* `loose-object` hardens objects added to the repo in loose-object form.
* `pack` hardens objects added to the repo in packfile form.
* `pack-metadata` hardens packfile bitmaps and indexes.
* `commit-graph` hardens the commit graph file.
* `index` hardens the index when it is modified.
* `objects` is an aggregate option that is equivalent to
`loose-object,pack`.
* `derived-metadata` is an aggregate option that is equivalent to
`pack-metadata,commit-graph`.
* `committed` is an aggregate option that is currently equivalent to
`objects`. This mode sacrifices some performance to ensure that work
that is committed to the repository with `git commit` or similar commands
is hardened.
* `added` is an aggregate option that is currently equivalent to
`committed,index`. This mode sacrifices additional performance to
ensure that the results of commands like `git add` and similar operations
are hardened.
* `all` is an aggregate option that syncs all individual components above.

core.fsyncMethod::
A value indicating the strategy Git will use to harden repository data
using fsync and related primitives.
+
* `fsync` uses the fsync() system call or platform equivalents.
* `writeout-only` issues pagecache writeback requests, but depending on the
filesystem and storage hardware, data added to the repository may not be
durable in the event of a system crash. This is the default mode on macOS.

core.fsyncObjectFiles::
This boolean will enable 'fsync()' when writing object files.
This setting is deprecated. Use core.fsync instead.
+
This is a total waste of time and effort on a filesystem that orders
data writes properly, but can be useful for filesystems that do not use
journalling (traditional UNIX filesystems) or that only journal metadata
and not file contents (OS X's HFS+, or Linux ext3 with "data=writeback").
This setting affects data added to the Git repository in loose-object
form. When set to true, Git will issue an fsync or similar system call
to flush caches so that loose-objects remain consistent in the face
of a unclean system shutdown.

core.preloadIndex::
Enable parallel index preload for operations like 'git diff'
Expand Down
6 changes: 6 additions & 0 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -414,6 +414,8 @@ include shared.mak
#
# Define HAVE_CLOCK_MONOTONIC if your platform has CLOCK_MONOTONIC.
#
# Define HAVE_SYNC_FILE_RANGE if your platform has sync_file_range.
#
# Define NEEDS_LIBRT if your platform requires linking with librt (glibc version
# before 2.17) for clock_gettime and CLOCK_MONOTONIC.
#
Expand Down Expand Up @@ -1918,6 +1920,10 @@ ifdef HAVE_CLOCK_MONOTONIC
BASIC_CFLAGS += -DHAVE_CLOCK_MONOTONIC
endif

ifdef HAVE_SYNC_FILE_RANGE
BASIC_CFLAGS += -DHAVE_SYNC_FILE_RANGE
endif

ifdef NEEDS_LIBRT
EXTLIBS += -lrt
endif
Expand Down
2 changes: 1 addition & 1 deletion builtin/fast-import.c
Original file line number Diff line number Diff line change
Expand Up @@ -865,7 +865,7 @@ static void end_packfile(void)
struct tag *t;

close_pack_windows(pack_data);
finalize_hashfile(pack_file, cur_pack_oid.hash, 0);
finalize_hashfile(pack_file, cur_pack_oid.hash, FSYNC_COMPONENT_PACK, 0);
fixup_pack_header_footer(pack_data->pack_fd, pack_data->hash,
pack_data->pack_name, object_count,
cur_pack_oid.hash, pack_size);
Expand Down
4 changes: 2 additions & 2 deletions builtin/index-pack.c
Original file line number Diff line number Diff line change
Expand Up @@ -1291,7 +1291,7 @@ static void conclude_pack(int fix_thin_pack, const char *curr_pack, unsigned cha
nr_objects - nr_objects_initial);
stop_progress_msg(&progress, msg.buf);
strbuf_release(&msg);
finalize_hashfile(f, tail_hash, 0);
finalize_hashfile(f, tail_hash, FSYNC_COMPONENT_PACK, 0);
hashcpy(read_hash, pack_hash);
fixup_pack_header_footer(output_fd, pack_hash,
curr_pack, nr_objects,
Expand Down Expand Up @@ -1513,7 +1513,7 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
if (!from_stdin) {
close(input_fd);
} else {
fsync_or_die(output_fd, curr_pack_name);
fsync_component_or_die(FSYNC_COMPONENT_PACK, output_fd, curr_pack_name);
err = close(output_fd);
if (err)
die_errno(_("error while closing pack file"));
Expand Down
24 changes: 17 additions & 7 deletions builtin/pack-objects.c
Original file line number Diff line number Diff line change
Expand Up @@ -1199,16 +1199,26 @@ static void write_pack_file(void)
display_progress(progress_state, written);
}

/*
* Did we write the wrong # entries in the header?
* If so, rewrite it like in fast-import
*/
if (pack_to_stdout) {
finalize_hashfile(f, hash, CSUM_HASH_IN_STREAM | CSUM_CLOSE);
/*
* We never fsync when writing to stdout since we may
* not be writing to an actual pack file. For instance,
* the upload-pack code passes a pipe here. Calling
* fsync on a pipe results in unnecessary
* synchronization with the reader on some platforms.
*/
finalize_hashfile(f, hash, FSYNC_COMPONENT_NONE,
CSUM_HASH_IN_STREAM | CSUM_CLOSE);
} else if (nr_written == nr_remaining) {
finalize_hashfile(f, hash, CSUM_HASH_IN_STREAM | CSUM_FSYNC | CSUM_CLOSE);
finalize_hashfile(f, hash, FSYNC_COMPONENT_PACK,
CSUM_HASH_IN_STREAM | CSUM_FSYNC | CSUM_CLOSE);
} else {
int fd = finalize_hashfile(f, hash, 0);
/*
* If we wrote the wrong number of entries in the
* header, rewrite it like in fast-import.
*/

int fd = finalize_hashfile(f, hash, FSYNC_COMPONENT_PACK, 0);
fixup_pack_header_footer(fd, hash, pack_tmp_name,
nr_written, hash, offset);
close(fd);
Expand Down
5 changes: 3 additions & 2 deletions bulk-checkin.c
Original file line number Diff line number Diff line change
Expand Up @@ -53,9 +53,10 @@ static void finish_bulk_checkin(struct bulk_checkin_state *state)
unlink(state->pack_tmp_name);
goto clear_exit;
} else if (state->nr_written == 1) {
finalize_hashfile(state->f, hash, CSUM_HASH_IN_STREAM | CSUM_FSYNC | CSUM_CLOSE);
finalize_hashfile(state->f, hash, FSYNC_COMPONENT_PACK,
CSUM_HASH_IN_STREAM | CSUM_FSYNC | CSUM_CLOSE);
} else {
int fd = finalize_hashfile(state->f, hash, 0);
int fd = finalize_hashfile(state->f, hash, FSYNC_COMPONENT_PACK, 0);
fixup_pack_header_footer(fd, hash, state->pack_tmp_name,
state->nr_written, hash,
state->offset);
Expand Down
48 changes: 48 additions & 0 deletions cache.h
Original file line number Diff line number Diff line change
Expand Up @@ -993,8 +993,54 @@ void reset_shared_repository(void);
extern int read_replace_refs;
extern char *git_replace_ref_base;

/*
* These values are used to help identify parts of a repository to fsync.
* FSYNC_COMPONENT_NONE identifies data that will not be a persistent part of the
* repository and so shouldn't be fsynced.
*/
enum fsync_component {
FSYNC_COMPONENT_NONE,
FSYNC_COMPONENT_LOOSE_OBJECT = 1 << 0,
FSYNC_COMPONENT_PACK = 1 << 1,
FSYNC_COMPONENT_PACK_METADATA = 1 << 2,
FSYNC_COMPONENT_COMMIT_GRAPH = 1 << 3,
FSYNC_COMPONENT_INDEX = 1 << 4,
};

#define FSYNC_COMPONENTS_OBJECTS (FSYNC_COMPONENT_LOOSE_OBJECT | \
FSYNC_COMPONENT_PACK)

#define FSYNC_COMPONENTS_DERIVED_METADATA (FSYNC_COMPONENT_PACK_METADATA | \
FSYNC_COMPONENT_COMMIT_GRAPH)

#define FSYNC_COMPONENTS_DEFAULT (FSYNC_COMPONENTS_OBJECTS | \
FSYNC_COMPONENTS_DERIVED_METADATA | \
~FSYNC_COMPONENT_LOOSE_OBJECT)

#define FSYNC_COMPONENTS_COMMITTED (FSYNC_COMPONENTS_OBJECTS)

#define FSYNC_COMPONENTS_ADDED (FSYNC_COMPONENTS_COMMITTED | \
FSYNC_COMPONENT_INDEX)

#define FSYNC_COMPONENTS_ALL (FSYNC_COMPONENT_LOOSE_OBJECT | \
FSYNC_COMPONENT_PACK | \
FSYNC_COMPONENT_PACK_METADATA | \
FSYNC_COMPONENT_COMMIT_GRAPH | \
FSYNC_COMPONENT_INDEX)

/*
* A bitmask indicating which components of the repo should be fsynced.
*/
extern enum fsync_component fsync_components;
extern int fsync_object_files;
extern int use_fsync;

enum fsync_method {
FSYNC_METHOD_FSYNC,
FSYNC_METHOD_WRITEOUT_ONLY
};

extern enum fsync_method fsync_method;
extern int core_preload_index;
extern int precomposed_unicode;
extern int protect_hfs;
Expand Down Expand Up @@ -1715,6 +1761,8 @@ int copy_file_with_time(const char *dst, const char *src, int mode);

void write_or_die(int fd, const void *buf, size_t count);
void fsync_or_die(int fd, const char *);
int fsync_component(enum fsync_component component, int fd);
void fsync_component_or_die(enum fsync_component component, int fd, const char *msg);

ssize_t read_in_full(int fd, void *buf, size_t count);
ssize_t write_in_full(int fd, const void *buf, size_t count);
Expand Down
3 changes: 2 additions & 1 deletion commit-graph.c
Original file line number Diff line number Diff line change
Expand Up @@ -1952,7 +1952,8 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
}

close_commit_graph(ctx->r->objects);
finalize_hashfile(f, file_hash, CSUM_HASH_IN_STREAM | CSUM_FSYNC);
finalize_hashfile(f, file_hash, FSYNC_COMPONENT_COMMIT_GRAPH,
CSUM_HASH_IN_STREAM | CSUM_FSYNC);
free_chunkfile(cf);

if (ctx->split) {
Expand Down
3 changes: 3 additions & 0 deletions compat/mingw.h
Original file line number Diff line number Diff line change
Expand Up @@ -329,6 +329,9 @@ int mingw_getpagesize(void);
#define getpagesize mingw_getpagesize
#endif

int win32_fsync_no_flush(int fd);
#define fsync_no_flush win32_fsync_no_flush

struct rlimit {
unsigned int rlim_cur;
};
Expand Down
28 changes: 28 additions & 0 deletions compat/win32/flush.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
#include "git-compat-util.h"
#include <winternl.h>
#include "lazyload.h"

int win32_fsync_no_flush(int fd)
{
IO_STATUS_BLOCK io_status;

#define FLUSH_FLAGS_FILE_DATA_ONLY 1

DECLARE_PROC_ADDR(ntdll.dll, NTSTATUS, NTAPI, NtFlushBuffersFileEx,
HANDLE FileHandle, ULONG Flags, PVOID Parameters, ULONG ParameterSize,
PIO_STATUS_BLOCK IoStatusBlock);

if (!INIT_PROC_ADDR(NtFlushBuffersFileEx)) {
errno = ENOSYS;
return -1;
}

memset(&io_status, 0, sizeof(io_status));
if (NtFlushBuffersFileEx((HANDLE)_get_osfhandle(fd), FLUSH_FLAGS_FILE_DATA_ONLY,
NULL, 0, &io_status)) {
errno = EINVAL;
return -1;
}

return 0;
}
5 changes: 0 additions & 5 deletions compat/winansi.c
Original file line number Diff line number Diff line change
Expand Up @@ -4,11 +4,6 @@

#undef NOGDI

/*
* Including the appropriate header file for RtlGenRandom causes MSVC to see a
* redefinition of types in an incompatible way when including headers below.
*/
#undef HAVE_RTLGENRANDOM
#include "../git-compat-util.h"
#include <wingdi.h>
#include <winreg.h>
Expand Down
94 changes: 94 additions & 0 deletions config.c
Original file line number Diff line number Diff line change
Expand Up @@ -1323,6 +1323,79 @@ static int git_parse_maybe_bool_text(const char *value)
return -1;
}

static const struct fsync_component_name {
const char *name;
enum fsync_component component_bits;
} fsync_component_names[] = {
{ "loose-object", FSYNC_COMPONENT_LOOSE_OBJECT },
{ "pack", FSYNC_COMPONENT_PACK },
{ "pack-metadata", FSYNC_COMPONENT_PACK_METADATA },
{ "commit-graph", FSYNC_COMPONENT_COMMIT_GRAPH },
{ "index", FSYNC_COMPONENT_INDEX },
{ "objects", FSYNC_COMPONENTS_OBJECTS },
{ "derived-metadata", FSYNC_COMPONENTS_DERIVED_METADATA },
{ "committed", FSYNC_COMPONENTS_COMMITTED },
{ "added", FSYNC_COMPONENTS_ADDED },
{ "all", FSYNC_COMPONENTS_ALL },
};

static enum fsync_component parse_fsync_components(const char *var, const char *string)
{
enum fsync_component current = FSYNC_COMPONENTS_DEFAULT;
enum fsync_component positive = 0, negative = 0;

while (string) {
int i;
size_t len;
const char *ep;
int negated = 0;
int found = 0;

string = string + strspn(string, ", \t\n\r");
ep = strchrnul(string, ',');
len = ep - string;
if (!strcmp(string, "none")) {
current = FSYNC_COMPONENT_NONE;
goto next_name;
}

if (*string == '-') {
negated = 1;
string++;
len--;
if (!len)
warning(_("invalid value for variable %s"), var);
}

if (!len)
break;

for (i = 0; i < ARRAY_SIZE(fsync_component_names); ++i) {
const struct fsync_component_name *n = &fsync_component_names[i];

if (strncmp(n->name, string, len))
continue;

found = 1;
if (negated)
negative |= n->component_bits;
else
positive |= n->component_bits;
}

if (!found) {
char *component = xstrndup(string, len);
warning(_("ignoring unknown core.fsync component '%s'"), component);
free(component);
}

next_name:
string = ep;
}

return (current & ~negative) | positive;
}

int git_parse_maybe_bool(const char *value)
{
int v = git_parse_maybe_bool_text(value);
Expand Down Expand Up @@ -1600,7 +1673,28 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
return 0;
}

if (!strcmp(var, "core.fsync")) {
if (!value)
return config_error_nonbool(var);
fsync_components = parse_fsync_components(var, value);
return 0;
}

if (!strcmp(var, "core.fsyncmethod")) {
if (!value)
return config_error_nonbool(var);
if (!strcmp(value, "fsync"))
fsync_method = FSYNC_METHOD_FSYNC;
else if (!strcmp(value, "writeout-only"))
fsync_method = FSYNC_METHOD_WRITEOUT_ONLY;
else
warning(_("ignoring unknown core.fsyncMethod value '%s'"), value);

}

if (!strcmp(var, "core.fsyncobjectfiles")) {
if (fsync_object_files < 0)
warning(_("core.fsyncobjectfiles is deprecated; use core.fsync instead"));
fsync_object_files = git_config_bool(var, value);
return 0;
}
Expand Down
Loading

0 comments on commit 4f5f388

Please sign in to comment.