core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects #1134
Changes from all commits
3b20234
66cba40
fcb5afb
64e6340
be7ccf9
72c69ff
288a484
1b03440
16a3aa1
127d35c
cea4df4
e52f1da
44c5297
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -596,6 +596,14 @@ core.fsyncMethod:: | |
* `writeout-only` issues pagecache writeback requests, but depending on the | ||
|
||
filesystem and storage hardware, data added to the repository may not be | ||
durable in the event of a system crash. This is the default mode on macOS. | ||
* `batch` enables a mode that uses writeout-only flushes to stage multiple | ||
updates in the disk writeback cache and then does a single full fsync of | ||
a dummy file to trigger the disk cache flush at the end of the operation. | ||
+ | ||
Currently `batch` mode only applies to loose-object files. Other repository | ||
data is made durable as if `fsync` was specified. This mode is expected to | ||
be as safe as `fsync` on macOS for repos stored on HFS+ or APFS filesystems | ||
and on Windows for repos stored on NTFS or ReFS filesystems. | ||
|
||
core.fsyncObjectFiles:: | ||
This boolean will enable 'fsync()' when writing object files. | ||
|
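Review note: to make the `batch` description above concrete, here is a minimal, hypothetical C sketch of the pattern (it is not code from this series). The names `batch_write`, `writeout_fn`, `noop_writeout`, and `full_flushes` are illustrative assumptions. The writeout step is a caller-supplied callback because the primitive is platform-specific (on Linux it could be a wrapper around `sync_file_range()`, for example), and the plain `fsync()` on the dummy file stands in for whatever full flush the platform provides.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical callback type: start writeback for fd without a device flush. */
typedef int (*writeout_fn)(int fd);

static int full_flushes; /* counts the full fsync() calls made by the sketch */

/* Stand-in writeout primitive for the demo: does nothing, reports success. */
static int noop_writeout(int fd)
{
	(void)fd;
	return 0;
}

/*
 * Write `count` small files into `dir`, issuing only a writeout request
 * for each one, then fsync a single dummy file as the one full flush.
 * Returns 0 on success, -1 on error.
 */
static int batch_write(const char *dir, int count, writeout_fn writeout)
{
	char path[4096];
	int i, fd;

	for (i = 0; i < count; i++) {
		snprintf(path, sizeof(path), "%s/obj-%d", dir, i);
		fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
		if (fd < 0)
			return -1;
		if (write(fd, "data\n", 5) != 5 || writeout(fd) < 0) {
			close(fd);
			return -1;
		}
		close(fd);
	}

	/* One full flush for the whole batch, issued against a dummy file. */
	snprintf(path, sizeof(path), "%s/dummy-XXXXXX", dir);
	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	if (fsync(fd) < 0) {
		close(fd);
		unlink(path);
		return -1;
	}
	full_flushes++;
	close(fd);
	unlink(path);
	return 0;
}
```

With per-file `fsync` the number of full flushes grows with the number of files; in this sketch it stays at one regardless of `count`, which is the saving the `batch` mode is after.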
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -141,7 +141,16 @@ int add_files_to_cache(const char *prefix, | |
rev.diffopt.format_callback_data = &data; | ||
|
||
rev.diffopt.flags.override_submodule_config = 1; | ||
rev.max_count = 0; /* do not compare unmerged paths with stage #2 */ | ||
|
||
/* | ||
* Use an ODB transaction to optimize adding multiple objects. | ||
* This function is invoked from commands other than 'add', which | ||
* may not have their own transaction active. | ||
*/ | ||
begin_odb_transaction(); | ||
run_diff_files(&rev, DIFF_RACY_IS_MODIFIED); | ||
end_odb_transaction(); | ||
|
||
clear_pathspec(&rev.prune_data); | ||
return !!data.add_errors; | ||
} | ||
|
@@ -670,7 +679,7 @@ int cmd_add(int argc, const char **argv, const char *prefix) | |
string_list_clear(&only_match_skip_worktree, 0); | ||
|
||
} | ||
|
||
-plug_bulk_checkin(); | ||
+begin_odb_transaction(); | ||
|
||
if (add_renormalize) | ||
exit_status |= renormalize_tracked_files(&pathspec, flags); | ||
|
@@ -682,7 +691,7 @@ int cmd_add(int argc, const char **argv, const char *prefix) | |
|
||
if (chmod_arg && pathspec.nr) | ||
exit_status |= chmod_pathspec(&pathspec, chmod_arg[0], show_only); | ||
-unplug_bulk_checkin(); | ||
+end_odb_transaction(); | ||
|
||
finish: | ||
if (write_locked_index(&the_index, &lock_file, | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,6 @@ | ||
#include "builtin.h" | ||
|
||
#include "cache.h" | ||
#include "bulk-checkin.h" | ||
#include "config.h" | ||
#include "object-store.h" | ||
#include "object.h" | ||
|
@@ -503,10 +504,12 @@ static void unpack_all(void) | |
if (!quiet) | ||
progress = start_progress(_("Unpacking objects"), nr_objects); | ||
CALLOC_ARRAY(obj_list, nr_objects); | ||
begin_odb_transaction(); | ||
for (i = 0; i < nr_objects; i++) { | ||
unpack_one(i); | ||
display_progress(progress, i + 1); | ||
} | ||
end_odb_transaction(); | ||
stop_progress(&progress); | ||
|
||
if (delta_list) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,6 +5,7 @@ | |
*/ | ||
|
||
#define USE_THE_INDEX_COMPATIBILITY_MACROS | ||
#include "cache.h" | ||
#include "bulk-checkin.h" | ||
#include "config.h" | ||
#include "lockfile.h" | ||
#include "quote.h" | ||
|
@@ -57,6 +58,14 @@ static void report(const char *fmt, ...) | |
if (!verbose) | ||
return; | ||
|
||
/* | ||
* It is possible, though unlikely, that a caller could use the verbose | ||
* output to synchronize with addition of objects to the object | ||
* database. The current implementation of ODB transactions leaves | ||
* objects invisible while a transaction is active, so flush the | ||
* transaction here before reporting a change made by update-index. | ||
*/ | ||
flush_odb_transaction(); | ||
va_start(vp, fmt); | ||
vprintf(fmt, vp); | ||
putchar('\n'); | ||
|
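Review note: the reasoning in the comment above (objects stay invisible during a transaction, so flush before reporting) can be illustrated with a toy model. This is a hypothetical sketch, not the update-index code; `demo_add_object`, `demo_flush`, `demo_report`, and the two counters are made-up names.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Toy model: objects written during a transaction stay pending (invisible). */
static int demo_pending;
static int demo_visible;

static void demo_add_object(void)
{
	demo_pending++;
}

/* Make everything written so far visible, as a flush would. */
static void demo_flush(void)
{
	demo_visible += demo_pending;
	demo_pending = 0;
}

/* Flush before printing, so a reader of the output can rely on the objects. */
static void demo_report(const char *fmt, ...)
{
	va_list vp;

	demo_flush();
	va_start(vp, fmt);
	vprintf(fmt, vp);
	va_end(vp);
	putchar('\n');
}
```

In this model a caller that waits for a line of verbose output before reading the object store never observes a pending object, at the cost of one flush per reported change.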
@@ -1116,6 +1125,12 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) | |
*/ | ||
parse_options_start(&ctx, argc, argv, prefix, | ||
options, PARSE_OPT_STOP_AT_NON_OPTION); | ||
|
||
/* | ||
* Allow the object layer to optimize adding multiple objects in | ||
* a batch. | ||
*/ | ||
begin_odb_transaction(); | ||
while (ctx.argc) { | ||
if (parseopt_state != PARSE_OPT_DONE) | ||
parseopt_state = parse_options_step(&ctx, options, | ||
|
@@ -1190,6 +1205,11 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) | |
strbuf_release(&buf); | ||
} | ||
|
||
/* | ||
* By now we have added all of the new objects | ||
*/ | ||
end_odb_transaction(); | ||
|
||
if (split_index > 0) { | ||
if (git_config_get_split_index() == 0) | ||
warning(_("core.splitIndex is set to false; " | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,16 +3,21 @@ | |
*/ | ||
#include "cache.h" | ||
#include "bulk-checkin.h" | ||
#include "lockfile.h" | ||
#include "repository.h" | ||
#include "csum-file.h" | ||
#include "pack.h" | ||
#include "strbuf.h" | ||
#include "string-list.h" | ||
#include "tmp-objdir.h" | ||
#include "packfile.h" | ||
|
||
#include "object-store.h" | ||
|
||
-static struct bulk_checkin_state { | ||
-unsigned plugged:1; | ||
+static int odb_transaction_nesting; | ||
|
||
+static struct tmp_objdir *bulk_fsync_objdir; | ||
|
||
+static struct bulk_checkin_packfile { | ||
char *pack_tmp_name; | ||
struct hashfile *f; | ||
off_t offset; | ||
|
@@ -21,7 +26,7 @@ static struct bulk_checkin_state { | |
struct pack_idx_entry **written; | ||
uint32_t alloc_written; | ||
uint32_t nr_written; | ||
-} state; | ||
+} bulk_checkin_packfile; | ||
|
||
static void finish_tmp_packfile(struct strbuf *basename, | ||
const char *pack_tmp_name, | ||
|
@@ -39,7 +44,7 @@ static void finish_tmp_packfile(struct strbuf *basename, | |
free(idx_tmp_name); | ||
} | ||
|
||
-static void finish_bulk_checkin(struct bulk_checkin_state *state) | ||
+static void flush_bulk_checkin_packfile(struct bulk_checkin_packfile *state) | ||
{ | ||
unsigned char hash[GIT_MAX_RAWSZ]; | ||
struct strbuf packname = STRBUF_INIT; | ||
|
@@ -80,7 +85,41 @@ static void finish_bulk_checkin(struct bulk_checkin_state *state) | |
reprepare_packed_git(the_repository); | ||
} | ||
|
||
-static int already_written(struct bulk_checkin_state *state, struct object_id *oid) | ||
/* | ||
* Cleanup after batch-mode fsync_object_files. | ||
*/ | ||
static void flush_batch_fsync(void) | ||
{ | ||
struct strbuf temp_path = STRBUF_INIT; | ||
struct tempfile *temp; | ||
|
||
if (!bulk_fsync_objdir) | ||
return; | ||
|
||
/* | ||
* Issue a full hardware flush against a temporary file to ensure | ||
* that all objects are durable before any renames occur. The code in | ||
* fsync_loose_object_bulk_checkin has already issued a writeout | ||
* request, but it has not flushed any writeback cache in the storage | ||
* hardware or any filesystem logs. This fsync call acts as a barrier | ||
* to ensure that the data in each new object file is durable before | ||
* the final name is visible. | ||
*/ | ||
strbuf_addf(&temp_path, "%s/bulk_fsync_XXXXXX", get_object_directory()); | ||
temp = xmks_tempfile(temp_path.buf); | ||
fsync_or_die(get_tempfile_fd(temp), get_tempfile_path(temp)); | ||
delete_tempfile(&temp); | ||
strbuf_release(&temp_path); | ||
|
||
/* | ||
* Make the object files visible in the primary ODB after their data is | ||
* fully durable. | ||
*/ | ||
tmp_objdir_migrate(bulk_fsync_objdir); | ||
bulk_fsync_objdir = NULL; | ||
} | ||
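Review note: the ordering that `flush_batch_fsync` relies on (full flush first, rename into place second) can be sketched with plain POSIX calls. This is a hypothetical illustration, not the tmp-objdir implementation; `barrier_fsync` and `publish_after_barrier` are made-up helpers, and a real implementation would migrate a whole directory rather than a single file.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Create a temp file in `dir`, fsync it, then delete it. 0 on success. */
static int barrier_fsync(const char *dir)
{
	char path[4096];
	int fd, ret;

	snprintf(path, sizeof(path), "%s/bulk_fsync_XXXXXX", dir);
	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	ret = fsync(fd);
	close(fd);
	unlink(path);
	return ret;
}

/*
 * Move `name` from the staging directory into the destination directory,
 * but only after the barrier flush has succeeded, so the final name never
 * becomes visible before the flush has been issued.
 */
static int publish_after_barrier(const char *staging, const char *dest,
				 const char *name)
{
	char from[4096], to[4096];

	if (barrier_fsync(dest) < 0)
		return -1;
	snprintf(from, sizeof(from), "%s/%s", staging, name);
	snprintf(to, sizeof(to), "%s/%s", dest, name);
	return rename(from, to);
}
```

Whether a single `fsync()` of an unrelated file really acts as a barrier for previously written files depends on the filesystem and hardware, which is why the documentation in this series only claims safety for specific platforms.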
|
||
+static int already_written(struct bulk_checkin_packfile *state, struct object_id *oid) | ||
{ | ||
int i; | ||
|
||
|
@@ -112,7 +151,7 @@ static int already_written(struct bulk_checkin_state *state, struct object_id *o | |
* status before calling us just in case we ask it to call us again | ||
* with a new pack. | ||
*/ | ||
-static int stream_to_pack(struct bulk_checkin_state *state, | ||
+static int stream_to_pack(struct bulk_checkin_packfile *state, | ||
git_hash_ctx *ctx, off_t *already_hashed_to, | ||
int fd, size_t size, enum object_type type, | ||
const char *path, unsigned flags) | ||
|
@@ -189,7 +228,7 @@ static int stream_to_pack(struct bulk_checkin_state *state, | |
} | ||
|
||
/* Lazily create backing packfile for the state */ | ||
-static void prepare_to_stream(struct bulk_checkin_state *state, | ||
+static void prepare_to_stream(struct bulk_checkin_packfile *state, | ||
unsigned flags) | ||
{ | ||
if (!(flags & HASH_WRITE_OBJECT) || state->f) | ||
|
@@ -204,7 +243,7 @@ static void prepare_to_stream(struct bulk_checkin_state *state, | |
die_errno("unable to write pack header"); | ||
} | ||
|
||
-static int deflate_to_pack(struct bulk_checkin_state *state, | ||
+static int deflate_to_pack(struct bulk_checkin_packfile *state, | ||
struct object_id *result_oid, | ||
int fd, size_t size, | ||
enum object_type type, const char *path, | ||
|
@@ -251,7 +290,7 @@ static int deflate_to_pack(struct bulk_checkin_state *state, | |
BUG("should not happen"); | ||
hashfile_truncate(state->f, &checkpoint); | ||
state->offset = checkpoint.offset; | ||
-finish_bulk_checkin(state); | ||
+flush_bulk_checkin_packfile(state); | ||
if (lseek(fd, seekback, SEEK_SET) == (off_t) -1) | ||
return error("cannot seek back"); | ||
} | ||
|
@@ -274,25 +313,66 @@ static int deflate_to_pack(struct bulk_checkin_state *state, | |
return 0; | ||
} | ||
|
||
void prepare_loose_object_bulk_checkin(void) | ||
{ | ||
/* | ||
* We lazily create the temporary object directory | ||
* the first time an object might be added, since | ||
* callers may not know whether any objects will be | ||
* added at the time they call begin_odb_transaction. | ||
*/ | ||
if (!odb_transaction_nesting || bulk_fsync_objdir) | ||
return; | ||
|
||
bulk_fsync_objdir = tmp_objdir_create("bulk-fsync"); | ||
if (bulk_fsync_objdir) | ||
tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0); | ||
} | ||
|
||
void fsync_loose_object_bulk_checkin(int fd, const char *filename) | ||
{ | ||
/* | ||
* If we have an active ODB transaction, we issue a call that | ||
* cleans the filesystem page cache but avoids a hardware flush | ||
* command. Later on we will issue a single hardware flush | ||
* as part of flush_batch_fsync, before the objects are renamed | ||
* to their final names. | ||
*/ | ||
if (!bulk_fsync_objdir || | ||
git_fsync(fd, FSYNC_WRITEOUT_ONLY) < 0) { | ||
fsync_or_die(fd, filename); | ||
} | ||
} | ||
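Review note: the condition in `fsync_loose_object_bulk_checkin` reads as "use the cheap writeout when a batch is active and the writeout succeeds; otherwise do a full fsync". A minimal sketch of that control flow, with hypothetical names (`struct flush_ops`, `flush_object`, and the stubs are assumptions made for illustration). Unlike `fsync_or_die`, the sketch returns an error code instead of dying.

```c
#include <assert.h>

/* Hypothetical flush primitives, supplied by the caller. */
struct flush_ops {
	int (*writeout_only)(int fd); /* returns <0 if unsupported or failed */
	int (*full_fsync)(int fd);
};

/*
 * Returns 1 if the cheap writeout path was used, 0 if the full fsync
 * fallback ran, and -1 if the fallback failed as well.
 */
static int flush_object(const struct flush_ops *ops, int batch_active, int fd)
{
	if (batch_active && ops->writeout_only(fd) >= 0)
		return 1;
	return ops->full_fsync(fd) < 0 ? -1 : 0;
}

/* Stubs used only to exercise the control flow. */
static int stub_ok(int fd) { (void)fd; return 0; }
static int stub_fail(int fd) { (void)fd; return -1; }
```

The fallback matters on platforms where no writeout-only primitive exists: every object then gets a full fsync, so durability is preserved even though the batching benefit is lost.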
|
||
int index_bulk_checkin(struct object_id *oid, | ||
int fd, size_t size, enum object_type type, | ||
const char *path, unsigned flags) | ||
{ | ||
-int status = deflate_to_pack(&state, oid, fd, size, type, | ||
+int status = deflate_to_pack(&bulk_checkin_packfile, oid, fd, size, type, | ||
path, flags); | ||
-if (!state.plugged) | ||
-finish_bulk_checkin(&state); | ||
+if (!odb_transaction_nesting) | ||
+flush_bulk_checkin_packfile(&bulk_checkin_packfile); | ||
return status; | ||
} | ||
|
||
-void plug_bulk_checkin(void) | ||
+void begin_odb_transaction(void) | ||
{ | ||
-state.plugged = 1; | ||
+odb_transaction_nesting += 1; | ||
} | ||
|
||
-void unplug_bulk_checkin(void) | ||
+void flush_odb_transaction(void) | ||
{ | ||
-state.plugged = 0; | ||
-if (state.f) | ||
-finish_bulk_checkin(&state); | ||
+flush_batch_fsync(); | ||
+flush_bulk_checkin_packfile(&bulk_checkin_packfile); | ||
} | ||
|
||
void end_odb_transaction(void) | ||
{ | ||
odb_transaction_nesting -= 1; | ||
if (odb_transaction_nesting < 0) | ||
BUG("Unbalanced ODB transaction nesting"); | ||
|
||
if (odb_transaction_nesting) | ||
return; | ||
|
||
flush_odb_transaction(); | ||
} |
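Review note: the nesting counter means that only the outermost `end_odb_transaction` flushes, which is what lets `add_files_to_cache` and `cache_tree_update` open their own transaction inside a caller that already has one. A minimal sketch of that behavior with hypothetical names (`txn_begin`, `txn_end`, `txn_nesting`, `txn_flushes`); unlike the patch, which calls `BUG()` on unbalanced nesting, the sketch returns -1.

```c
#include <assert.h>

static int txn_nesting; /* depth of currently open transactions */
static int txn_flushes; /* how many times an outermost end has flushed */

static void txn_begin(void)
{
	txn_nesting++;
}

/* Returns 0 on success, -1 if called without a matching txn_begin(). */
static int txn_end(void)
{
	if (txn_nesting == 0)
		return -1;
	/* Inner ends only decrement; the outermost end performs the flush. */
	if (--txn_nesting == 0)
		txn_flushes++;
	return 0;
}
```

Two nested begin/end pairs therefore produce exactly one flush, and a stray extra `txn_end()` is reported as an error rather than silently driving the counter negative.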
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,6 +3,7 @@ | |
#include "tree.h" | ||
|
||
#include "tree-walk.h" | ||
#include "cache-tree.h" | ||
#include "bulk-checkin.h" | ||
#include "object-store.h" | ||
#include "replace-object.h" | ||
#include "promisor-remote.h" | ||
|
@@ -474,8 +475,10 @@ int cache_tree_update(struct index_state *istate, int flags) | |
|
||
trace_performance_enter(); | ||
trace2_region_enter("cache_tree", "update", the_repository); | ||
begin_odb_transaction(); | ||
i = update_one(istate->cache_tree, istate->cache, istate->cache_nr, | ||
"", 0, &skip, flags); | ||
end_odb_transaction(); | ||
trace2_region_leave("cache_tree", "update", the_repository); | ||
trace_performance_leave("cache_tree_update"); | ||
if (i < 0) | ||
|