refs: fix corruption by not correctly syncing packed-refs to disk

At GitLab we have recently received a report where a repository was left
with a corrupted `packed-refs` file after the node hard-crashed even
though `core.fsync=reference` was set. This is something that in theory
should not happen if we correctly did the atomic-rename dance to:

    1. Write the data into a temporary file.

    2. Synchronize the temporary file to disk.

    3. Rename the temporary file into place.

So if we crash in the middle of writing the `packed-refs` file we should
only ever see either the old or the new state of the file.

And while we do the dance when writing the `packed-refs` file, there is
indeed one gotcha: we use a `FILE *` stream to write the temporary file,
but don't flush it before synchronizing it to disk. As a consequence any
data that is still buffered will not get synchronized and a crash of the
machine may cause corruption.

Fix this bug by flushing the file stream before we fsync.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is contained in:
Patrick Steinhardt 2022-12-20 15:52:14 +01:00 committed by Junio C Hamano
parent bc22d845c4
commit ce54672f9b
1 changed files with 2 additions and 1 deletions

View File

@ -1262,7 +1262,8 @@ static int write_with_updates(struct packed_ref_store *refs,
goto error;
}
if (fsync_component(FSYNC_COMPONENT_REFERENCE, get_tempfile_fd(refs->tempfile)) ||
if (fflush(out) ||
fsync_component(FSYNC_COMPONENT_REFERENCE, get_tempfile_fd(refs->tempfile)) ||
close_tempfile_gently(refs->tempfile)) {
strbuf_addf(err, "error closing file %s: %s",
get_tempfile_path(refs->tempfile),