Taylor Blau 2f4ba2a867 packfile: prepare for the existence of '*.rev' files
Specify the format of the on-disk reverse index 'pack-*.rev' file, as
well as prepare the code for the existence of such files.

The reverse index maps from pack relative positions (i.e., an index into
the array of objects, sorted by their offsets within the packfile) to
their position within the 'pack-*.idx' file. Today, this is done by
building up a list of (off_t, uint32_t) tuples for each object (the
off_t corresponding to that object's offset, and the uint32_t
corresponding to its position in the index). To convert between pack and
index position quickly, this array of tuples is radix sorted by
offset.
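
As a rough illustration of the in-memory scheme just described, here is
a minimal sketch. The names 'rev_entry_sketch' and
'build_revindex_sketch()' are hypothetical, int64_t stands in for off_t,
and qsort() is used only for brevity where the real code uses a radix
sort:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for 'struct revindex_entry'. */
struct rev_entry_sketch {
	int64_t offset; /* stand-in for off_t: offset within the pack */
	uint32_t nr;    /* position within the pack-*.idx file */
};

static int cmp_by_offset(const void *a, const void *b)
{
	const struct rev_entry_sketch *x = a, *y = b;
	return (x->offset > y->offset) - (x->offset < y->offset);
}

/*
 * offsets[i] is the pack offset of the object at index position i.
 * Returns a malloc'd array ordered by pack offset (caller frees), or
 * NULL on allocation failure.
 */
struct rev_entry_sketch *build_revindex_sketch(const int64_t *offsets,
					       uint32_t nr_objects)
{
	struct rev_entry_sketch *rev;
	uint32_t i;

	assert(offsets || !nr_objects);
	rev = malloc((nr_objects ? nr_objects : 1) * sizeof(*rev));
	if (!rev)
		return NULL;
	for (i = 0; i < nr_objects; i++) {
		rev[i].offset = offsets[i];
		rev[i].nr = i;
	}
	/* After sorting, rev[pos].nr is the index position at pack position pos. */
	qsort(rev, nr_objects, sizeof(*rev), cmp_by_offset);
	return rev;
}
```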

This has two major drawbacks:

First, the in-memory cost scales linearly with the number of objects in
a pack.  Each 'struct revindex_entry' is sizeof(off_t) +
sizeof(uint32_t) + padding bytes for a total of 16.

To observe this, force Git to load the reverse index by, e.g.,
running 'git cat-file --batch-check="%(objectsize:disk)"'. When asking
for a single object in a fresh clone of the kernel, Git needs to
allocate 120+ MB of memory in order to hold the reverse index in memory.
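
The arithmetic behind that figure can be sketched as below. The struct
is a hypothetical stand-in (int64_t in place of off_t), and its size is
16 bytes only on ABIs that pad it to 8-byte alignment. At 16 bytes per
entry, 7,500,000 objects come to 120,000,000 bytes; that object count
is a round number chosen for illustration, not a measurement:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for 'struct revindex_entry'. */
struct rev_entry_sketch {
	int64_t offset; /* stand-in for off_t */
	uint32_t nr;    /* index position */
	/* the compiler typically adds 4 bytes of padding here */
};

/* The two fields alone need 12 bytes; padding can only add to that. */
static_assert(sizeof(struct rev_entry_sketch) >=
	      sizeof(int64_t) + sizeof(uint32_t),
	      "entry must hold both fields");

/* In-memory cost of the reverse index: linear in the object count. */
uint64_t revindex_bytes_sketch(uint64_t nr_objects)
{
	return nr_objects * (uint64_t)sizeof(struct rev_entry_sketch);
}
```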

Second, the cost to sort also scales with the size of the pack.
Luckily, this is a linear function since 'load_pack_revindex()' uses a
radix sort, but this cost still must be paid once per pack per process.

As an example, it takes ~60x longer to print the _size_ of an object
than it does to print that entire object's _contents_:

  Benchmark #1: git.compile cat-file --batch <obj
    Time (mean ± σ):       3.4 ms ±   0.1 ms    [User: 3.3 ms, System: 2.1 ms]
    Range (min … max):     3.2 ms …   3.7 ms    726 runs

  Benchmark #2: git.compile cat-file --batch-check="%(objectsize:disk)" <obj
    Time (mean ± σ):     210.3 ms ±   8.9 ms    [User: 188.2 ms, System: 23.2 ms]
    Range (min … max):   193.7 ms … 224.4 ms    13 runs

Instead, avoid computing and sorting the revindex once per process by
writing it to a file when the pack itself is generated.

The format is relatively straightforward. It contains an array of
uint32_t's, the length of which is equal to the number of objects in the
pack.  The ith entry in this table contains the index position of the
ith object in the pack, where "ith object in the pack" is determined by
pack offset.
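
A minimal sketch of such a table follows. It covers only the array of
uint32_t's; any header or trailer around the table is left out, and
storing the entries in network byte order is an assumption made here to
match Git's other on-disk formats. All function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v as four bytes, most significant byte first. */
static void put_be32_sketch(unsigned char *p, uint32_t v)
{
	p[0] = (v >> 24) & 0xff;
	p[1] = (v >> 16) & 0xff;
	p[2] = (v >> 8) & 0xff;
	p[3] = v & 0xff;
}

static uint32_t get_be32_sketch(const unsigned char *p)
{
	return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
	       (uint32_t)p[2] << 8 | (uint32_t)p[3];
}

/*
 * pack_to_idx[i] is the index position of the ith object in the pack,
 * ordered by pack offset. 'out' must have room for 4 * nr_objects bytes.
 */
void write_rev_table_sketch(const uint32_t *pack_to_idx,
			    uint32_t nr_objects, unsigned char *out)
{
	uint32_t i;

	for (i = 0; i < nr_objects; i++)
		put_be32_sketch(out + 4 * (size_t)i, pack_to_idx[i]);
}

/* Constant-time lookup: index straight into the table. */
uint32_t rev_table_lookup_sketch(const unsigned char *table,
				 uint32_t nr_objects, uint32_t pack_pos)
{
	assert(pack_pos < nr_objects);
	return get_be32_sketch(table + 4 * (size_t)pack_pos);
}
```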

One thing that the on-disk format does _not_ contain is the full (up to)
eight-byte offset corresponding to each object. This is something that
the in-memory revindex contains (it stores an off_t in 'struct
revindex_entry' along with the same uint32_t that the on-disk format
has). Omit it in the on-disk format, since knowing the index position
for some object is sufficient to get a constant-time lookup in the
pack-*.idx file to ask for an object's offset within the pack.

This trades a smaller on-disk 'pack-*.rev' file for extra runtime to
chase down the offset for some object. Even though the lookup is
constant time, the constant is heavier, since it can potentially
involve two pointer walks in v2 indexes (one to access the 4-byte offset
table, and potentially a second to access the double wide offset table).
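
The lookup chain can be sketched as follows. This is a simplified model
and not the code in packfile.c: the tables are plain host-order arrays
rather than mmap'd bytes, all names are hypothetical, and it assumes the
v2 convention that a set high bit in a 4-byte entry marks the remaining
31 bits as a position in the double wide table. The binary search at
the end shows why the heavier constant matters: each probe repeats the
whole chain.

```c
#include <assert.h>
#include <stdint.h>

/* First walk: 4-byte table. Second walk, if flagged: 8-byte table. */
uint64_t idx_offset_sketch(const uint32_t *off32, const uint64_t *off64,
			   uint32_t idx_pos)
{
	uint32_t off = off32[idx_pos];

	if (!(off & 0x80000000u))
		return off;
	return off64[off & 0x7fffffffu];
}

/* rev[pack_pos] is the index position stored in the reverse index. */
uint64_t pack_pos_to_offset_sketch(const uint32_t *rev,
				   const uint32_t *off32,
				   const uint64_t *off64,
				   uint32_t pack_pos)
{
	return idx_offset_sketch(off32, off64, rev[pack_pos]);
}

/*
 * Binary search over pack positions for the object starting at 'want'.
 * Returns 0 and sets *pack_pos on success, -1 if no object starts there.
 */
int offset_to_pack_pos_sketch(const uint32_t *rev, const uint32_t *off32,
			      const uint64_t *off64, uint32_t nr_objects,
			      uint64_t want, uint32_t *pack_pos)
{
	uint32_t lo = 0, hi = nr_objects;

	assert(pack_pos);
	while (lo < hi) {
		uint32_t mid = lo + (hi - lo) / 2;
		uint64_t got = pack_pos_to_offset_sketch(rev, off32,
							 off64, mid);
		if (got == want) {
			*pack_pos = mid;
			return 0;
		}
		if (got < want)
			lo = mid + 1;
		else
			hi = mid;
	}
	return -1;
}
```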

Consider trying to map an object's pack offset to a relative position
within that pack. In a cold-cache scenario, more page faults occur while
switching between binary searching through the reverse index and
searching through the *.idx file for an object's offset. Sure enough,
with a cold cache (writing '3' into '/proc/sys/vm/drop_caches' after
'sync'ing), printing out the entire object's contents is still
marginally faster than printing its size:

  Benchmark #1: git.compile cat-file --batch-check="%(objectsize:disk)" <obj >/dev/null
    Time (mean ± σ):      22.6 ms ±   0.5 ms    [User: 2.4 ms, System: 7.9 ms]
    Range (min … max):    21.4 ms …  23.5 ms    41 runs

  Benchmark #2: git.compile cat-file --batch <obj >/dev/null
    Time (mean ± σ):      17.2 ms ±   0.7 ms    [User: 2.8 ms, System: 5.5 ms]
    Range (min … max):    15.6 ms …  18.2 ms    45 runs

(Numbers taken in the kernel after cheating and using the next patch to
generate a reverse index.) There are a couple of approaches to improve
cold cache performance that are not pursued here:

  - We could include the object offsets in the reverse index format.
    Predictably, this does result in fewer page faults, but it triples
    the size of the file, while simultaneously duplicating a ton of data
    already available in the .idx file. (This was the original way I
    implemented the format, and it did show
    `--batch-check='%(objectsize:disk)'` winning out against `--batch`.)

    On the other hand, this increase in size also results in a larger
    block-cache footprint, which could potentially hurt other workloads.

  - We could store the mapping from pack to index position in a more
    cache-friendly way, like constructing a binary search tree from the
    table and writing the values in breadth-first order. This would
    result in much better locality, but the price you pay is trading
    O(1) lookup in 'pack_pos_to_index()' for an O(log n) one (since you
    can no longer directly index the table).
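
For the second idea, a breadth-first layout could look roughly like the
sketch below. This is not something this patch implements; the names
are hypothetical and the keys are generic sorted 64-bit values. The
sorted array is rewritten so that the children of slot k live in slots
2k and 2k+1 (slot 0 is unused), which keeps the first few levels of a
search close together, at the cost of walking the tree on every lookup:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In-order walk of the implicit tree, consuming 'sorted' left to right. */
static size_t bfs_fill_sketch(const uint64_t *sorted, uint64_t *bfs,
			      size_t n, size_t next, size_t slot)
{
	if (slot <= n) {
		next = bfs_fill_sketch(sorted, bfs, n, next, 2 * slot);
		bfs[slot] = sorted[next++];
		next = bfs_fill_sketch(sorted, bfs, n, next, 2 * slot + 1);
	}
	return next;
}

/* 'bfs' must have room for n + 1 entries; bfs[0] is left untouched. */
void bfs_build_sketch(const uint64_t *sorted, uint64_t *bfs, size_t n)
{
	size_t used = bfs_fill_sketch(sorted, bfs, n, 0, 1);

	assert(used == n);
	(void)used;
}

/* O(log n) walk from the root; returns the slot of 'key', or 0 if absent. */
size_t bfs_find_sketch(const uint64_t *bfs, size_t n, uint64_t key)
{
	size_t slot = 1;

	while (slot <= n) {
		if (bfs[slot] == key)
			return slot;
		slot = 2 * slot + (bfs[slot] < key);
	}
	return 0;
}
```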

So, neither of these approaches is taken here. (Thankfully, the format
is versioned, so we are free to pursue these in the future.) But, cold
cache performance likely isn't interesting outside of one-off cases like
asking for the size of an object directly. In real-world usage, Git is
often performing many operations in the revindex (i.e., asking about
many objects rather than a single one).

The trade-off is worth it, since we will avoid so much of the cost of
generating the revindex that the extra pointer chase will look like
noise in the following patch's benchmarks.

This patch describes the format and prepares callers (like in
pack-revindex.c) to be able to read *.rev files once they exist. An
implementation of the writer will appear in the next patch, and callers
will gradually begin using the writer in the patches that follow after
that.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-25 18:32:43 -08:00
.github Merge branch 'da/vs-build-iconv-fix' 2020-12-14 10:21:38 -08:00
Documentation packfile: prepare for the existence of '*.rev' files 2021-01-25 18:32:43 -08:00
block-sha1 block-sha1: take a size_t length parameter 2020-11-16 13:41:35 -08:00
builtin packfile: prepare for the existence of '*.rev' files 2021-01-25 18:32:43 -08:00
ci Merge branch 'js/default-branch-name-tests-final-stretch' 2021-01-25 14:19:18 -08:00
compat Merge branch 'da/vs-build-iconv-fix' 2020-12-14 10:21:38 -08:00
contrib completion: add proper public __git_complete 2021-01-04 15:25:56 -08:00
ewah bitmap: implement bitmap_is_subset() 2020-12-08 14:48:16 -08:00
git-gui Merge https://github.com/prati0100/git-gui 2020-12-18 15:07:10 -08:00
gitk-git Merge remote-tracking branch 'paulus/master' into pm/gitk-update 2020-10-03 10:06:27 -07:00
gitweb gitweb/Makefile: conditionally include ../GIT-VERSION-FILE 2020-12-08 16:56:56 -08:00
mergetools Merge branch 'pd/mergetool-nvimdiff' 2020-11-21 15:14:39 -08:00
negotiator negotiator/noop: add noop fetch negotiator 2020-08-18 13:25:05 -07:00
perl Merge branch 'jk/perl-warning' 2020-11-09 14:06:25 -08:00
po l10n: zh_CN: for git v2.30.0 l10n round 1 and 2 2020-12-27 19:23:27 +08:00
ppc
refs refs/files-backend: don't peek into `struct lock_file` 2021-01-06 13:53:32 -08:00
sha1collisiondetection@855827c583
sha1dc Merge branch 'jk/lore-is-the-archive' 2019-12-06 15:09:23 -08:00
sha256 hash: implement and use a context cloning function 2020-02-24 09:33:21 -08:00
t Merge branch 'ab/mailmap-fixup' 2021-01-25 14:19:20 -08:00
templates hook: add sample template for push-to-checkout 2020-10-16 08:47:02 -07:00
trace2 trace2: teach Git to log environment variables 2020-03-23 13:14:53 -07:00
vcs-svn drop vcs-svn experiment 2020-08-13 11:02:15 -07:00
xdiff diff: add -I<regex> that ignores matching changes 2020-10-20 12:53:26 -07:00
.cirrus.yml CI: add FreeBSD CI support via Cirrus-CI 2019-12-20 12:09:12 -08:00
.clang-format
.editorconfig editorconfig: indent text files with tabs 2020-01-06 08:46:32 -08:00
.gitattributes CoC: explicitly take any whitespace breakage 2021-01-04 09:44:49 -08:00
.gitignore Merge branch 'fc/random-cleanup' 2020-12-08 15:11:21 -08:00
.gitmodules
.mailmap Merge branch 'bc/wildcard-credential' 2020-03-05 10:43:02 -08:00
.travis.yml ci: fix the `jobname` of the `GETTEXT_POISON` job 2020-04-07 22:17:10 -07:00
.tsan-suppressions replace-object: make replace operations thread-safe 2020-01-17 13:52:14 -08:00
CODE_OF_CONDUCT.md CoC: update to version 2.0 + local changes 2021-01-13 17:45:04 -08:00
COPYING
GIT-VERSION-GEN The first batch in 2.31 cycle 2021-01-06 23:33:44 -08:00
INSTALL doc: mention Python 3.x supports 2020-12-14 15:01:03 -08:00
LGPL-2.1
Makefile Merge branch 'ab/gettext-charset-comment-fix' 2021-01-15 21:48:46 -08:00
README.md ci: retire the Azure Pipelines definition 2020-04-10 10:30:40 -07:00
RelNotes The first batch in 2.31 cycle 2021-01-06 23:33:44 -08:00
abspath.c abspath: add a function to resolve paths with missing components 2020-12-12 23:35:47 -08:00
aclocal.m4
add-interactive.c Merge branch 'js/add-i-color-fix' 2020-12-08 15:11:17 -08:00
add-interactive.h built-in add -p: respect the `interactive.singlekey` config setting 2020-01-15 12:06:17 -08:00
add-patch.c Merge branch 'js/add-i-color-fix' 2020-12-08 15:11:17 -08:00
advice.c push: parse and set flag for "--force-if-includes" 2020-10-03 09:59:19 -07:00
advice.h push: parse and set flag for "--force-if-includes" 2020-10-03 09:59:19 -07:00
alias.c
alias.h
alloc.c commit: move members graph_pos, generation to a slab 2020-06-17 14:37:30 -07:00
alloc.h object: drop parsed_object_pool->commit_count 2020-06-17 14:37:14 -07:00
apply.c Merge branch 'ab/unreachable-break' 2020-12-18 15:15:18 -08:00
apply.h apply.h: include missing header 2019-09-28 14:04:16 +09:00
archive-tar.c archive: support compression levels beyond 9 2020-11-09 11:25:45 -08:00
archive-zip.c archive: read short blobs in archive.c::write_archive_entry() 2020-09-19 15:56:05 -07:00
archive.c Merge branch 'rs/archive-plug-leak-refname' 2020-11-25 15:24:53 -08:00
archive.h Merge branch 'rs/archive-plug-leak-refname' 2020-11-25 15:24:53 -08:00
attr.c Use new HASHMAP_INIT macro to simplify hashmap initialization 2020-11-11 12:55:27 -08:00
attr.h attr: move doc to attr.h 2019-11-18 15:21:28 +09:00
banned.h banned.h: mark ctime_r() and asctime_r() as banned 2020-12-02 14:30:39 -08:00
base85.c
bisect.c hash-lookup: rename from sha1-lookup 2021-01-04 13:01:55 -08:00
bisect.h bisect: combine args passed to find_bisection() 2020-08-07 15:13:03 -07:00
blame.c Merge branch 'en/strmap' 2020-11-21 15:14:38 -08:00
blame.h blame: simplify 'setup_blame_bloom_data' interface 2020-11-01 15:54:15 -08:00
blob.c object: drop parsed_object_pool->commit_count 2020-06-17 14:37:14 -07:00
blob.h
bloom.c Use new HASHMAP_INIT macro to simplify hashmap initialization 2020-11-11 12:55:27 -08:00
bloom.h bloom: encode out-of-bounds filters as non-empty 2020-09-17 21:55:50 -07:00
branch.c wt-status: tolerate dangling marks 2020-09-02 14:39:25 -07:00
branch.h
builtin.h Merge branch 'ds/maintenance-part-3' 2020-11-18 13:32:53 -08:00
bulk-checkin.c bulk-checkin: zero-initialize hashfile_checkpoint 2019-09-06 11:03:39 -07:00
bulk-checkin.h
bundle.c bundle: arguments can be read from stdin 2021-01-11 21:50:41 -08:00
bundle.h Merge branch 'bc/sha-256-part-3' 2020-08-11 18:04:11 -07:00
cache-tree.c sha1-file: pass git_hash_algo to hash_object_file() 2020-01-31 10:45:39 -08:00
cache-tree.h cache-tree: share code between functions writing an index as a tree 2019-08-19 10:08:03 -07:00
cache.h Merge branch 'ps/config-env-pairs' 2021-01-25 14:19:19 -08:00
chdir-notify.c
chdir-notify.h
check-builtins.sh
check_bindir
checkout.c config: drop git_config_get_string_const() 2020-08-17 15:35:47 -07:00
checkout.h
color.c color.c: alias RGB colors 8-15 to aixterm colors 2020-02-11 11:19:00 -08:00
color.h
column.c Merge branch 'jk/strvec' 2020-08-10 10:23:57 -07:00
column.h
combine-diff.c Merge branch 'jk/diff-cc-oidfind-fix' 2020-10-05 14:01:55 -07:00
command-list.txt mailmap doc: create a new "gitmailmap(5)" man page 2021-01-12 14:04:39 -08:00
commit-graph.c Merge branch 'ma/more-opaque-lock-file' 2021-01-25 14:19:17 -08:00
commit-graph.h Merge branch 'tb/bloom-improvements' 2020-09-29 14:01:20 -07:00
commit-reach.c commit-reach: fix in_merge_bases_many bug 2020-10-02 10:26:31 -07:00
commit-reach.h commit-reach: avoid is_descendant_of() shim 2020-06-23 16:36:53 -07:00
commit-slab-decl.h Merge branch 'sg/commit-graph-cleanups' into master 2020-07-30 13:20:30 -07:00
commit-slab-impl.h commit-slab: add a function to deep free entries on the slab 2020-06-08 12:28:49 -07:00
commit-slab.h commit-slab: add a function to deep free entries on the slab 2020-06-08 12:28:49 -07:00
commit.c Merge branch 'ma/sha1-is-a-hash' 2021-01-15 15:20:29 -08:00
commit.h Merge branch 'en/merge-ort-recursive' 2021-01-06 23:33:44 -08:00
common-main.c
config.c Merge branch 'ps/config-env-pairs' 2021-01-25 14:19:19 -08:00
config.h Merge branch 'ps/config-env-pairs' 2021-01-25 14:19:19 -08:00
config.mak.dev Merge branch 'jc/sparse-error-for-developer-build' 2020-11-18 13:32:54 -08:00
config.mak.in
config.mak.uname Merge branch 'rb/nonstop-config-mak-uname-update' 2020-12-18 15:15:18 -08:00
configure.ac Merge branch 'dd/sequencer-utf8' 2019-12-01 09:04:36 -08:00
connect.c Merge branch 'jk/forbid-lf-in-git-url' 2021-01-25 14:19:17 -08:00
connect.h Merge branch 'bc/sha-256-part-2' 2020-07-06 22:09:13 -07:00
connected.c Merge branch 'rs/more-buffered-io' 2020-08-24 14:54:31 -07:00
connected.h connected: always use partial clone optimization 2020-03-29 10:37:44 -07:00
convert.c convert: drop unused crlf_action from check_global_conv_flags_eol() 2020-09-30 12:53:47 -07:00
convert.h convert: provide additional metadata to filters 2020-03-16 11:37:02 -07:00
copy.c
credential.c credential: treat CR/LF as line endings in the credential protocol 2020-10-03 10:41:03 -07:00
credential.h credential: correct order of parameters for credential_match 2020-05-04 22:56:33 -07:00
csum-file.c hash: implement and use a context cloning function 2020-02-24 09:33:21 -08:00
csum-file.h csum-file: add hashwrite_be64() 2020-11-12 09:40:06 -08:00
ctype.c
daemon.c strvec: rename struct fields 2020-07-30 19:18:06 -07:00
date.c date.c: allow compact version of ISO-8601 datetime 2020-04-24 14:06:09 -07:00
decorate.c
decorate.h
delta-islands.c oid_array: rename source file from sha1-array 2020-03-30 10:59:08 -07:00
delta-islands.h
delta.h
detect-compiler
diff-delta.c diff-delta: set size out-parameter to 0 for NULL delta 2019-09-06 11:03:39 -07:00
diff-lib.c Merge branch 'rs/plug-diff-cache-leak' 2020-11-25 15:24:53 -08:00
diff-no-index.c
diff.c Merge branch 'sj/untracked-files-in-submodule-directory-is-not-dirty' 2021-01-25 14:19:18 -08:00
diff.h Merge branch 'sj/untracked-files-in-submodule-directory-is-not-dirty' 2021-01-25 14:19:18 -08:00
diffcore-break.c diff: restrict when prefetching occurs 2020-04-07 16:09:29 -07:00
diffcore-delta.c
diffcore-order.c
diffcore-pickaxe.c
diffcore-rename.c diffcore-rename: remove unnecessary duplicate entry checks 2021-01-04 12:59:34 -08:00
diffcore.h diff: restrict when prefetching occurs 2020-04-07 16:09:29 -07:00
dir-iterator.c
dir-iterator.h
dir.c Merge branch 'en/strmap' 2020-11-21 15:14:38 -08:00
dir.h dir: fix problematic API to avoid memory leaks 2020-08-18 17:17:31 -07:00
editor.c config: fix leaks from git_config_get_string_const() 2020-08-14 10:52:04 -07:00
entry.c checkout_entry(): remove unreachable error() call 2020-08-18 13:26:10 -07:00
environment.c config: allow specifying config entries via envvar pairs 2021-01-15 13:03:45 -08:00
environment.h environment: make `getenv_safe()` a public function 2021-01-15 13:03:45 -08:00
exec-cmd.c strvec: rename struct fields 2020-07-30 19:18:06 -07:00
exec-cmd.h argv-array: rename to strvec 2020-07-28 15:02:17 -07:00
fetch-negotiator.c negotiator/noop: add noop fetch negotiator 2020-08-18 13:25:05 -07:00
fetch-negotiator.h repo-settings: create feature.experimental setting 2019-08-13 13:33:55 -07:00
fetch-pack.c fetch-pack: refactor writing promisor file 2021-01-12 16:01:07 -08:00
fetch-pack.h Merge branch 'jt/lazy-fetch' 2020-09-03 12:37:04 -07:00
fmt-merge-msg.c fmt-merge-msg: also suppress "into main" by default 2020-10-23 08:57:39 -07:00
fmt-merge-msg.h Lib-ify fmt-merge-msg 2020-03-24 15:04:43 -07:00
fsck.c Merge branch 'jk/forbid-lf-in-git-url' 2021-01-25 14:19:17 -08:00
fsck.h fsck: make fsck_config() re-usable 2021-01-05 14:58:29 -08:00
fsmonitor.c Merge branch 'jk/strvec' 2020-08-10 10:23:57 -07:00
fsmonitor.h
fuzz-commit-graph.c commit-graph: pass a 'struct repository *' in more places 2020-09-09 12:51:48 -07:00
fuzz-pack-headers.c
fuzz-pack-idx.c
generate-cmdlist.sh Fit to Plan 9's ANSI/POSIX compatibility layer 2020-09-09 22:31:31 -07:00
generate-configlist.sh help: move list_config_help to builtin/help 2020-04-16 15:22:16 -07:00
gettext.c gettext.c: remove/reword a mostly-useless comment 2021-01-11 13:07:33 -08:00
gettext.h
git-add--interactive.perl Merge branch 'js/add-i-color-fix' 2020-12-08 15:11:17 -08:00
git-archimport.perl
git-bisect.sh Merge branch 'mr/bisect-in-c-3' 2020-11-09 14:06:25 -08:00
git-compat-util.h Merge branch 'jc/compat-util-setitimer-fix' 2020-12-18 15:15:17 -08:00
git-cvsexportcommit.perl cvsexportcommit: do not run git programs in dashed form 2020-08-26 14:49:52 -07:00
git-cvsimport.perl git-cvsimport: port to SHA-256 2020-06-22 11:21:07 -07:00
git-cvsserver.perl git-cvsserver: port to SHA-256 2020-06-22 11:21:07 -07:00
git-difftool--helper.sh
git-filter-branch.sh Recommend git-filter-repo instead of git-filter-branch 2019-09-05 13:01:48 -07:00
git-instaweb.sh
git-merge-octopus.sh
git-merge-one-file.sh
git-merge-resolve.sh
git-mergetool--lib.sh Merge branch 'pb/mergetool-tool-help-fix' 2021-01-15 21:48:46 -08:00
git-mergetool.sh
git-p4.py Merge branch 'dl/p4-encode-after-kw-expansion' 2021-01-15 21:48:47 -08:00
git-quiltimport.sh
git-rebase--preserve-merges.sh rebase: remove unused function reschedule_last_action 2020-08-12 12:25:42 -07:00
git-request-pull.sh
git-send-email.perl git-send-email: die if sendmail.* config is set 2020-07-23 18:00:34 -07:00
git-sh-i18n.sh
git-sh-setup.sh
git-submodule.sh submodule: fix fetch_in_submodule logic 2020-11-24 13:14:09 -08:00
git-svn.perl perl: check for perl warnings while running tests 2020-10-21 23:11:48 -07:00
git-web--browse.sh
git.c Merge branch 'ps/config-env-pairs' 2021-01-25 14:19:19 -08:00
git.rc
gpg-interface.c strvec: fix indentation in renamed calls 2020-07-28 15:02:18 -07:00
gpg-interface.h gpg-interface: prefer check_signature() for GPG verification 2020-03-15 09:46:28 -07:00
graph.c strvec: rename struct fields 2020-07-30 19:18:06 -07:00
graph.h graph: move doc to graph.h and graph.c 2019-11-18 15:21:28 +09:00
grep.c grep: copy struct in one fell swoop 2020-11-30 13:55:54 -08:00
grep.h grep: use designated initializers for `grep_defaults` 2020-11-21 14:50:33 -08:00
hash-lookup.c hash-lookup: rename from sha1-lookup 2021-01-04 13:01:55 -08:00
hash-lookup.h hash-lookup: rename from sha1-lookup 2021-01-04 13:01:55 -08:00
hash.h cache.h: move hash/oid functions to hash.h 2020-12-04 13:55:14 -08:00
hashmap.c hashmap: provide deallocation function names 2020-11-02 12:15:50 -08:00
hashmap.h hashmap: provide deallocation function names 2020-11-02 12:15:50 -08:00
help.c help.c: help.autocorrect=never means "do not compute suggestions" 2020-11-25 13:02:15 -08:00
help.h help: do not expect built-in commands to be hardlinked 2020-10-07 15:25:10 -07:00
hex.c hex: add functions to parse hex object IDs in any algorithm 2020-02-24 09:33:21 -08:00
http-backend.c strvec: fix indentation in renamed calls 2020-07-28 15:02:18 -07:00
http-fetch.c http-fetch: set up git directory before parsing pack hashes 2020-07-30 09:16:48 -07:00
http-push.c strvec: rename struct fields 2020-07-30 19:18:06 -07:00
http-walker.c http: refactor finish_http_pack_request() 2020-06-10 18:06:34 -07:00
http.c strvec: fix indentation in renamed calls 2020-07-28 15:02:18 -07:00
http.h Merge branch 'jt/cdn-offload' 2020-06-25 12:27:47 -07:00
ident.c Merge branch 'pw/rebase-i-more-options' 2020-09-03 12:37:01 -07:00
imap-send.c imap-send: parse default git config 2020-12-01 11:10:59 -08:00
iterator.h
json-writer.c
json-writer.h
khash.h
kwset.c
kwset.h kset.h, tar.h: add missing header guard to prevent multiple inclusion 2019-11-07 20:12:04 +09:00
levenshtein.c
levenshtein.h
line-log.c line-log: handle deref_tag() returning NULL 2020-10-12 12:25:14 -07:00
line-log.h line-log: more responsive, incremental 'git log -L' 2020-05-11 09:33:56 -07:00
line-range.c
line-range.h
linear-assignment.c
linear-assignment.h
list-objects-filter-options.c list-objects-filter-options: fix function name in BUG 2020-11-16 14:28:25 -08:00
list-objects-filter-options.h list_objects_filter_options: introduce 'list_object_filter_config_name' 2020-08-03 18:03:24 -07:00
list-objects-filter.c object-name.c: rename from sha1-name.c 2021-01-04 13:01:55 -08:00
list-objects-filter.h
list-objects.c Merge branch 'jk/list-objects-optim-wo-trees' 2019-10-07 11:32:56 +09:00
list-objects.h
list.h
ll-merge.c parse_config_key(): return subsection len as size_t 2020-04-10 14:44:29 -07:00
ll-merge.h merge: move doc to ll-merge.h 2019-11-18 15:21:28 +09:00
lockfile.c lockfile.c: introduce 'hold_lock_file_for_update_mode' 2020-04-27 11:27:36 -07:00
lockfile.h lockfile.c: introduce 'hold_lock_file_for_update_mode' 2020-04-27 11:27:36 -07:00
log-tree.c format-patch: make output filename configurable 2020-11-09 17:44:41 -08:00
log-tree.h format-patch: make output filename configurable 2020-11-09 17:44:41 -08:00
ls-refs.c strvec: rename struct fields 2020-07-30 19:18:06 -07:00
ls-refs.h argv-array: rename to strvec 2020-07-28 15:02:17 -07:00
mailinfo.c mailinfo: disallow NUL character in mail's header 2020-04-22 14:01:03 -07:00
mailinfo.h
mailmap.c shortlog: remove unused(?) "repo-abbrev" feature 2021-01-12 14:04:42 -08:00
mailmap.h shortlog: remove unused(?) "repo-abbrev" feature 2021-01-12 14:04:42 -08:00
match-trees.c
mem-pool.c mem-pool: use consistent pool variable name 2020-08-18 12:16:08 -07:00
mem-pool.h mem-pool: use consistent pool variable name 2020-08-18 12:16:08 -07:00
merge-blobs.c
merge-blobs.h
merge-ort-wrappers.c merge-ort-wrappers: new convience wrappers to mimic the old merge API 2020-10-26 22:36:14 -07:00
merge-ort-wrappers.h merge-ort-wrappers: new convience wrappers to mimic the old merge API 2020-10-26 22:36:14 -07:00
merge-ort.c Merge branch 'en/merge-ort-3' 2021-01-25 14:19:17 -08:00
merge-ort.h merge-ort: implement merge_incore_recursive() 2020-12-16 21:56:39 -08:00
merge-recursive.c commit: move reverse_commit_list() from merge-recursive 2020-12-16 21:56:39 -08:00
merge-recursive.h merge-recursive: fix unclear and outright wrong comments 2020-08-02 11:03:57 -07:00
merge.c dir: fix problematic API to avoid memory leaks 2020-08-18 17:17:31 -07:00
mergesort.c
mergesort.h
midx.c Merge branch 'ma/more-opaque-lock-file' 2021-01-25 14:19:17 -08:00
midx.h Merge branch 'ds/multi-pack-index' 2020-05-01 13:39:55 -07:00
name-hash.c hashmap: provide deallocation function names 2020-11-02 12:15:50 -08:00
notes-cache.c
notes-cache.h
notes-merge.c
notes-merge.h
notes-utils.c strbuf: add and use strbuf_insertstr() 2020-02-10 09:04:45 -08:00
notes-utils.h
notes.c Merge branch 'na/notes-displayref-is-not-boolean' 2020-11-30 14:49:44 -08:00
notes.h Merge branch 'dl/format-patch-notes-config-fixup' 2019-12-25 11:21:58 -08:00
object-file.c hash-lookup: rename from sha1-lookup 2021-01-04 13:01:55 -08:00
object-name.c object-name.c: rename from sha1-name.c 2021-01-04 13:01:55 -08:00
object-store.h packfile: prepare for the existence of '*.rev' files 2021-01-25 18:32:43 -08:00
object.c bundle: lost objects when removing duplicate pendings 2021-01-11 21:50:41 -08:00
object.h object: allow clear_commit_marks_all to handle any repo 2020-10-31 10:46:34 -07:00
oid-array.c hash-lookup: rename from sha1-lookup 2021-01-04 13:01:55 -08:00
oid-array.h oid-array: provide a for-loop iterator 2020-12-07 12:32:04 -08:00
oidmap.c hashmap: provide deallocation function names 2020-11-02 12:15:50 -08:00
oidmap.h hashmap: use *_entry APIs for iteration 2019-10-07 10:20:11 +09:00
oidset.c blame: silently ignore invalid ignore file objects 2020-11-10 13:05:06 -08:00
oidset.h blame: validate and peel the object names on the ignore list 2020-09-24 22:20:58 -07:00
pack-bitmap-write.c Merge branch 'ma/sha1-is-a-hash' 2021-01-15 15:20:29 -08:00
pack-bitmap.c rebuild_existing_bitmaps(): convert to new revindex API 2021-01-13 21:53:46 -08:00
pack-bitmap.h pack-bitmap: factor out 'bitmap_for_commit()' 2020-12-08 14:49:04 -08:00
pack-check.c fsck: correctly compute checksums on idx files larger than 4GB 2020-11-16 13:41:35 -08:00
pack-objects.c pack-objects: convert oe_set_delta_ext() to use object_id 2020-02-24 12:55:52 -08:00
pack-objects.h pack-objects: convert oe_set_delta_ext() to use object_id 2020-02-24 12:55:52 -08:00
pack-revindex.c packfile: prepare for the existence of '*.rev' files 2021-01-25 18:32:43 -08:00
pack-revindex.h packfile: prepare for the existence of '*.rev' files 2021-01-25 18:32:43 -08:00
pack-write.c pack-write: die on error in write_promisor_file() 2021-01-14 17:02:22 -08:00
pack.h fetch-pack: refactor writing promisor file 2021-01-12 16:01:07 -08:00
packfile.c packfile: prepare for the existence of '*.rev' files 2021-01-25 18:32:43 -08:00
packfile.h packfile: prepare for the existence of '*.rev' files 2021-01-25 18:32:43 -08:00
pager.c strvec: convert remaining callers away from argv_array name 2020-07-28 15:02:18 -07:00
parse-options-cb.c assert PARSE_OPT_NONEG in parse-options callbacks 2020-09-30 12:53:47 -07:00
parse-options.c parse-options: add --git-completion-helper-all 2020-08-19 17:46:17 -07:00
parse-options.h parse-options: format argh like error messages 2021-01-06 15:10:27 -08:00
patch-delta.c
patch-ids.c Merge branch 'jk/log-cherry-pick-duplicate-patches' 2021-01-25 14:19:19 -08:00
patch-ids.h patch-ids: handle duplicate hashmap entries 2021-01-12 11:13:32 -08:00
path.c sequencer: treat REVERT_HEAD as a pseudo ref 2020-08-21 11:20:11 -07:00
path.h sequencer: treat REVERT_HEAD as a pseudo ref 2020-08-21 11:20:11 -07:00
pathspec.c strvec: rename struct fields 2020-07-30 19:18:06 -07:00
pathspec.h Merge branch 'hw/doc-in-header' 2019-12-16 13:08:39 -08:00
pkt-line.c sideband: diagnose more sideband anomalies 2020-10-29 09:23:29 -07:00
pkt-line.h Merge branch 'bc/sha-256-part-2' 2020-07-06 22:09:13 -07:00
preload-index.c
pretty.c shortlog: remove unused(?) "repo-abbrev" feature 2021-01-12 14:04:42 -08:00
pretty.h pretty: refactor `format_sanitized_subject()` 2020-08-28 13:52:51 -07:00
prio-queue.c
prio-queue.h
progress.c Merge branch 'ma/stop-progress-null-fix' 2020-08-17 17:02:48 -07:00
progress.h progress.c: silence cgcc suggestion about internal linkage 2020-04-27 11:21:28 -07:00
promisor-remote.c promisor-remote: remove unused variable 2020-09-21 22:32:49 -07:00
promisor-remote.h promisor-remote: remove unused variable 2020-09-21 22:32:49 -07:00
prompt.c interactive: explicitly `fflush` stdout before expecting input 2020-04-10 10:27:16 -07:00
prompt.h interactive: refactor code asking the user for interactive input 2020-04-10 10:26:31 -07:00
protocol.c protocol: re-enable v2 protocol by default 2020-09-25 11:40:42 -07:00
protocol.h
prune-packed.c Lib-ify prune-packed 2020-03-24 15:04:44 -07:00
prune-packed.h Lib-ify prune-packed 2020-03-24 15:04:44 -07:00
quote.c quote: make sq_dequote_step() a public function 2021-01-12 12:03:18 -08:00
quote.h quote: make sq_dequote_step() a public function 2021-01-12 12:03:18 -08:00
range-diff.c Use new HASHMAP_INIT macro to simplify hashmap initialization 2020-11-11 12:55:27 -08:00
range-diff.h strvec: convert remaining callers away from argv_array name 2020-07-28 15:02:18 -07:00
reachable.c pack-bitmap: basic noop bitmap filter infrastructure 2020-02-14 10:46:22 -08:00
reachable.h
read-cache.c read-cache: try not to peek into `struct {lock_,temp}file` 2021-01-06 13:53:32 -08:00
rebase-interactive.c Merge branch 'rt/format-zero-length-fix' 2020-03-09 11:21:21 -07:00
rebase-interactive.h Merge branch 'en/rebase-backend' 2020-03-02 15:07:19 -08:00
rebase.c pull --rebase/remote rename: document and honor single-letter abbreviations rebase types 2020-02-10 10:52:10 -08:00
rebase.h pull --rebase/remote rename: document and honor single-letter abbreviations rebase types 2020-02-10 10:52:10 -08:00
ref-filter.c branch: show "HEAD detached" first under reverse sort 2021-01-07 15:13:21 -08:00
ref-filter.h branch: sort detached HEAD based on a flag 2021-01-07 15:13:21 -08:00
reflog-walk.c
reflog-walk.h
refs.c refs: allow @{n} to work with n-sized reflog 2021-01-11 14:13:50 -08:00
refs.h get_default_branch_name(): prepare for showing some advice 2020-12-13 15:53:50 -08:00
refspec.c Merge branch 'fc/atmark-in-refspec' 2020-12-14 10:21:36 -08:00
refspec.h Merge branch 'sb/clone-origin' 2020-10-27 15:09:50 -07:00
remote-curl.c push: parse and set flag for "--force-if-includes" 2020-10-03 09:59:19 -07:00
remote.c Merge branch 'nk/refspecs-negative-fix' 2020-12-23 13:59:46 -08:00
remote.h fetch: extract writing to FETCH_HEAD 2021-01-12 12:06:14 -08:00
replace-object.c replace-object: make replace operations thread-safe 2020-01-17 13:52:14 -08:00
replace-object.h replace-object: make replace operations thread-safe 2020-01-17 13:52:14 -08:00
repo-settings.c Merge branch 'ds/maintenance-part-2' 2020-10-27 15:09:47 -07:00
repository.c repository: enable SHA-256 support by default 2020-07-30 09:16:49 -07:00
repository.h Merge branch 'ds/maintenance-part-2' 2020-10-27 15:09:47 -07:00
rerere.c hash-lookup: rename from sha1-lookup 2021-01-04 13:01:55 -08:00
rerere.h
reset.c Merge branch 'dl/merge-autostash' 2020-04-29 16:15:27 -07:00
reset.h reset: extract reset_head() from rebase 2020-04-10 09:28:02 -07:00
resolve-undo.c
resolve-undo.h
revision.c Merge branch 'jk/log-cherry-pick-duplicate-patches' 2021-01-25 14:19:19 -08:00
revision.h format-patch: make output filename configurable 2020-11-09 17:44:41 -08:00
run-command.c maintenance: optionally skip --auto process 2020-09-25 10:59:44 -07:00
run-command.h maintenance: replace run_auto_gc() 2020-09-17 11:30:05 -07:00
send-pack.c Merge branch 'js/trace2-session-id' 2020-12-08 15:11:20 -08:00
send-pack.h
sequencer.c Merge branch 'en/strmap' 2020-11-21 15:14:38 -08:00
sequencer.h Merge branch 'en/merge-ort-api-null-impl' 2020-11-18 13:32:53 -08:00
serve.c upload-pack, serve: log received client session ID 2020-11-11 18:26:53 -08:00
serve.h argv-array: rename to strvec 2020-07-28 15:02:17 -07:00
server-info.c Fix spelling errors in code comments 2019-11-10 16:00:54 +09:00
setup.c Merge branch 'bc/sha-256-part-3' 2020-08-11 18:04:11 -07:00
sh-i18n--envsubst.c
sha1dc_git.c hex: drop sha1_to_hex() 2019-11-13 10:09:10 +09:00
sha1dc_git.h
shallow.c Merge branch 'sg/commit-graph-cleanups' into master 2020-07-30 13:20:30 -07:00
shallow.h shallow: use struct 'shallow_lock' for additional safety 2020-04-30 14:19:13 -07:00
shell.c interactive: refactor code asking the user for interactive input 2020-04-10 10:26:31 -07:00
shortlog.h shortlog: remove unused(?) "repo-abbrev" feature 2021-01-12 14:04:42 -08:00
sideband.c Merge branch 'jk/sideband-more-error-checking' 2020-11-09 14:06:29 -08:00
sideband.h sideband: diagnose more sideband anomalies 2020-10-29 09:23:29 -07:00
sigchain.c
sigchain.h sigchain: move doc to sigchain.h 2019-11-18 15:21:29 +09:00
split-index.c mem-pool: use more standard initialization and finalization 2020-08-18 12:16:06 -07:00
split-index.h
stable-qsort.c Move git_sort(), a stable sort, into into libgit.a 2019-10-02 14:44:51 +09:00
strbuf.c Merge branch 'rs/retire-strbuf-write-fd' 2020-06-29 14:17:26 -07:00
strbuf.h Merge branch 'rs/retire-strbuf-write-fd' 2020-06-29 14:17:26 -07:00
streaming.c streaming: allow open_istream() to handle any repo 2020-01-31 10:45:39 -08:00
streaming.h streaming: allow open_istream() to handle any repo 2020-01-31 10:45:39 -08:00
string-list.c
string-list.h Merge branch 'en/string-list-can-be-custom-sorted' into maint 2020-02-14 12:42:27 -08:00
strmap.c strmap: take advantage of FLEXPTR_ALLOC_STR when relevant 2020-11-11 12:55:27 -08:00
strmap.h strmap: make callers of strmap_remove() to call it in void context 2020-12-15 15:30:44 -08:00
strvec.c strvec: rename struct fields 2020-07-30 19:18:06 -07:00
strvec.h strvec: rename struct fields 2020-07-30 19:18:06 -07:00
sub-process.c strvec: convert remaining callers away from argv_array name 2020-07-28 15:02:18 -07:00
sub-process.h hashmap_entry: remove first member requirement from docs 2019-10-07 10:20:12 +09:00
submodule-config.c hashmap: provide deallocation function names 2020-11-02 12:15:50 -08:00
submodule-config.h submodule-config: add skip_if_read option to repo_read_gitmodules() 2020-01-17 13:52:14 -08:00
submodule.c Merge branch 'sj/untracked-files-in-submodule-directory-is-not-dirty' 2021-01-25 14:19:18 -08:00
submodule.h submodule: rename helper functions to avoid ambiguity 2020-08-12 14:12:58 -07:00
symlinks.c
tag.c object: drop parsed_object_pool->commit_count 2020-06-17 14:37:14 -07:00
tag.h tag: factor out get_tagged_oid() 2019-09-05 14:10:18 -07:00
tar.h kset.h, tar.h: add missing header guard to prevent multiple inclusion 2019-11-07 20:12:04 +09:00
tempfile.c tempfile.c: introduce 'create_tempfile_mode' 2020-04-27 11:27:35 -07:00
tempfile.h tempfile.c: introduce 'create_tempfile_mode' 2020-04-27 11:27:35 -07:00
thread-utils.c
thread-utils.h
tmp-objdir.c packfile: prepare for the existence of '*.rev' files 2021-01-25 18:32:43 -08:00
tmp-objdir.h
trace.c http, imap-send: stop using CURLOPT_VERBOSE 2020-05-11 11:18:01 -07:00
trace.h http, imap-send: stop using CURLOPT_VERBOSE 2020-05-11 11:18:01 -07:00
trace2.c trace2: add a public function for getting the SID 2020-11-11 18:26:52 -08:00
trace2.h trace2: add a public function for getting the SID 2020-11-11 18:26:52 -08:00
trailer.c pretty format %(trailers): add a "key_value_separator" 2020-12-09 14:16:42 -08:00
trailer.h pretty format %(trailers): add a "key_value_separator" 2020-12-09 14:16:42 -08:00
transport-helper.c push: parse and set flag for "--force-if-includes" 2020-10-03 09:59:19 -07:00
transport-internal.h strvec: convert remaining callers away from argv_array name 2020-07-28 15:02:18 -07:00
transport.c transport: log received server session ID 2020-11-11 18:26:53 -08:00
transport.h push: parse and set flag for "--force-if-includes" 2020-10-03 09:59:19 -07:00
tree-diff.c bloom/diff: properly short-circuit on max_changes 2020-09-17 09:31:25 -07:00
tree-walk.c tree-walk.c: don't match submodule entries for 'submod/anything' 2020-06-08 12:28:48 -07:00
tree-walk.h tree-walk.c: break circular dependency with unpack-trees 2020-02-04 10:32:15 -08:00
tree.c tree: enable cmp_cache_name_compare() to be used elsewhere 2020-12-13 14:18:20 -08:00
tree.h tree: enable cmp_cache_name_compare() to be used elsewhere 2020-12-13 14:18:20 -08:00
unicode-width.h unicode: update the width tables to Unicode 13.0 2020-03-17 15:06:37 -07:00
unimplemented.sh
unix-socket.c
unix-socket.h
unpack-trees.c strvec: convert remaining callers away from argv_array name 2020-07-28 15:02:18 -07:00
unpack-trees.h strvec: convert remaining callers away from argv_array name 2020-07-28 15:02:18 -07:00
upload-pack.c Merge branch 'tb/partial-clone-filters-fix' 2020-12-17 15:06:40 -08:00
upload-pack.h argv-array: rename to strvec 2020-07-28 15:02:17 -07:00
url.c Fix spelling errors in code comments 2019-11-10 16:00:54 +09:00
url.h
urlmatch.c credential: handle `credential.<partial-URL>.<key>` again 2020-04-24 15:53:46 -07:00
urlmatch.h credential: handle `credential.<partial-URL>.<key>` again 2020-04-24 15:53:46 -07:00
usage.c Merge branch 'jt/trace-error-on-warning' 2020-12-08 15:11:17 -08:00
userdiff.c Merge branch 've/userdiff-bash' 2020-11-02 13:17:46 -08:00
userdiff.h
utf8.c utf8: use skip_iprefix() in same_utf_encoding() 2019-11-10 16:04:36 +09:00
utf8.h
varint.c
varint.h
version.c
version.h
versioncmp.c
walker.c Merge branch 'rs/show-progress-in-dumb-http-fetch' 2020-03-09 11:21:21 -07:00
walker.h remote-curl: show progress for fetches over dumb HTTP 2020-03-03 13:15:40 -08:00
wildmatch.c
wildmatch.h
worktree.c worktree: teach `repair` to fix multi-directional breakage 2020-12-21 13:44:28 -08:00
worktree.h Merge branch 'ma/worktree-cleanups' 2020-10-05 14:01:52 -07:00
wrap-for-bin.sh
wrapper.c xrealloc: do not reuse pointer freed by zero-length realloc() 2020-09-02 12:18:14 -07:00
write-or-die.c
ws.c
wt-status.c Merge branch 'sj/untracked-files-in-submodule-directory-is-not-dirty' 2021-01-25 14:19:18 -08:00
wt-status.h branch: sort detached HEAD based on a flag 2021-01-07 15:13:21 -08:00
xdiff-interface.c xdiff: avoid computing non-zero offset from NULL pointer 2020-01-28 23:13:25 -08:00
xdiff-interface.h Fix spelling errors in code comments 2019-11-10 16:00:54 +09:00
zlib.c

README.md

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with the help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from https://git-scm.com/, including full documentation and Git-related tools.

See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-<commandname>.txt for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission). To subscribe to the list, send an email with just "subscribe git" in the body to majordomo@vger.kernel.org. The mailing list archives are available at https://lore.kernel.org/git/, http://marc.info/?l=git and other archival sites.

Issues which are security relevant should be disclosed privately to the Git Security mailing list git-security@googlegroups.com.

The maintainer frequently sends the "What's cooking" reports that list the current status of various development topics to the mailing list. The discussion following them gives a good reference for project status, development direction and remaining tasks.

The name "git" was given by Linus Torvalds when he wrote the very first version. He described the tool as "the stupid content tracker" and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of "get" may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • "global information tracker": you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • "goddamn idiotic truckload of sh*t": when it breaks