Merge branch 'ds/sparse-index-protections'

Builds on top of the sparse-index infrastructure to mark operations
that are not ready to work with the sparse index, causing them to fall
back on the fully-populated index that they have always worked with.

* ds/sparse-index-protections: (47 commits)
  name-hash: use expand_to_path()
  sparse-index: expand_to_path()
  name-hash: don't add directories to name_hash
  revision: ensure full index
  resolve-undo: ensure full index
  read-cache: ensure full index
  pathspec: ensure full index
  merge-recursive: ensure full index
  entry: ensure full index
  dir: ensure full index
  update-index: ensure full index
  stash: ensure full index
  rm: ensure full index
  merge-index: ensure full index
  ls-files: ensure full index
  grep: ensure full index
  fsck: ensure full index
  difftool: ensure full index
  commit: ensure full index
  checkout: ensure full index
  ...
Junio C Hamano 2021-04-30 13:50:26 +09:00
commit 8e97852919
48 changed files with 1257 additions and 109 deletions

@ -14,6 +14,11 @@ index.recordOffsetTable::
Defaults to 'true' if index.threads has been explicitly enabled,
'false' otherwise.
index.sparse::
When enabled, write the index using sparse-directory entries. This
has no effect unless `core.sparseCheckout` and
`core.sparseCheckoutCone` are both enabled. Defaults to 'false'.
index.threads::
Specifies the number of threads to spawn when loading the index.
This is meant to reduce index load time on multiprocessor machines.

@ -45,6 +45,20 @@ To avoid interfering with other worktrees, it first enables the
When `--cone` is provided, the `core.sparseCheckoutCone` setting is
also set, allowing for better performance with a limited set of
patterns (see 'CONE PATTERN SET' below).
+
Use the `--[no-]sparse-index` option to toggle the use of the sparse
index format. This reduces the size of the index to be more closely
aligned with your sparse-checkout definition. This can have significant
performance advantages for commands such as `git status` or `git add`.
This feature is still experimental. Some commands might be slower with
a sparse index until they are properly integrated with the feature.
+
**WARNING:** Using a sparse index requires modifying the index in a way
that is not completely understood by external tools. If you have trouble
with this compatibility, then run `git sparse-checkout init --no-sparse-index`
to rewrite your index to not be sparse. Older versions of Git will not
understand the sparse directory entries index extension and may fail to
interact with your repository until it is disabled.
'set'::
Write a set of patterns to the sparse-checkout file, as given as

@ -44,6 +44,13 @@ Git index format
localization, no special casing of directory separator '/'). Entries
with the same name are sorted by their stage field.
An index entry typically represents a file. However, if sparse-checkout
is enabled in cone mode (`core.sparseCheckoutCone` is enabled) and the
`extensions.sparseIndex` extension is enabled, then the index may
contain entries for directories outside of the sparse-checkout definition.
These entries have mode `040000`, include the `SKIP_WORKTREE` bit, and
the path ends in a directory separator.
32-bit ctime seconds, the last time a file's metadata changed
this is stat(2) data
@ -385,3 +392,15 @@ The remaining data of each directory block is grouped by type:
in this block of entries.
- 32-bit count of cache entries in this block

== Sparse Directory Entries

When using sparse-checkout in cone mode, some entire directories within
the index can be summarized by pointing to a tree object instead of the
entire expanded list of paths within that tree. An index containing such
entries is a "sparse index". Index format versions 4 and lower were not
implemented with such entries in mind. Thus, for these versions, an
index containing sparse directory entries will include this extension
with signature { 's', 'd', 'i', 'r' }. Like the split-index extension,
tools should avoid interacting with a sparse index unless they understand
this extension.
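
As a rough illustration of the entry properties this file describes (mode `040000`, the `SKIP_WORKTREE` bit, and a path ending in a directory separator), the check could be sketched as below. `struct toy_entry`, `TOY_SKIP_WORKTREE`, and `toy_is_sparse_dir()` are hypothetical names invented for this sketch; they are not Git's `struct cache_entry` or its flag layout.

```c
#include <assert.h>
#include <string.h>

#define TOY_MODE_DIR      0040000u  /* mode recorded for a sparse-directory entry */
#define TOY_SKIP_WORKTREE (1u << 0) /* hypothetical flag bit, for this sketch only */

/* Hypothetical, simplified index entry. */
struct toy_entry {
	unsigned int mode;
	unsigned int flags;
	const char *name;
};

/* Returns 1 when the entry has all three properties listed in the format text. */
static int toy_is_sparse_dir(const struct toy_entry *e)
{
	size_t len;

	assert(e && e->name);
	len = strlen(e->name);
	return e->mode == TOY_MODE_DIR &&
	       (e->flags & TOY_SKIP_WORKTREE) &&
	       len > 0 && e->name[len - 1] == '/';
}
```

A tool that does not perform a check of this kind would presumably treat such an entry as a file, which is why the extension is marked as one that tools must understand.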

@ -0,0 +1,208 @@
Git Sparse-Index Design Document
================================

The sparse-checkout feature allows users to focus a working directory on
a subset of the files at HEAD. The cone mode patterns, enabled by
`core.sparseCheckoutCone`, allow for very fast pattern matching to
discover which files at HEAD belong in the sparse-checkout cone.

Three important scale dimensions for a Git working directory are:

* `HEAD`: How many files are present at `HEAD`?
* Populated: How many files are within the sparse-checkout cone?
* Modified: How many files has the user modified in the working directory?

We will use big-O notation -- O(X) -- to denote how expensive certain
operations are in terms of these dimensions.

These dimensions are ordered by their magnitude: users (typically) modify
fewer files than are populated, and we can only populate files at `HEAD`.

Problems occur if there is an extreme imbalance in these dimensions. For
example, if `HEAD` contains millions of paths but the populated set has
only tens of thousands, then commands like `git status` and `git add` can
be dominated by operations that require O(`HEAD`) work instead of
O(Populated). Primarily, the cost is in parsing and rewriting the index,
which is filled primarily with files at `HEAD` that are marked with the
`SKIP_WORKTREE` bit.

The sparse-index intends to take these commands that read and modify the
index from O(`HEAD`) to O(Populated). To do this, we need to modify the
index format in a significant way: add "sparse directory" entries.

With cone mode patterns, it is possible to detect when an entire
directory will have its contents outside of the sparse-checkout definition.
Instead of listing all of the files it contains as individual entries, a
sparse-index contains an entry with the directory name, referencing the
object ID of the tree at `HEAD` and marked with the `SKIP_WORKTREE` bit.
If we need to discover the details for paths within that directory, we
can parse trees to find that list.
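
To make the O(`HEAD`) versus O(Populated) difference concrete, here is a minimal sketch that counts how many entries a sparse index would hold for a sorted path list. It is deliberately simplified: it assumes a cone made only of top-level directories and collapses only at the top level, whereas the design above collapses any directory that lies entirely outside the cone. All `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Length of the first path component including its '/', or 0 for a root file. */
static size_t toy_top_dir_len(const char *path)
{
	const char *slash = strchr(path, '/');

	return slash ? (size_t)(slash - path) + 1 : 0;
}

/* In this sketch, root files and files under a listed directory are "populated". */
static int toy_in_cone(const char *path, const char **cone, size_t cone_nr)
{
	size_t i, len = toy_top_dir_len(path);

	if (!len)
		return 1;
	for (i = 0; i < cone_nr; i++)
		if (strlen(cone[i]) == len && !strncmp(path, cone[i], len))
			return 1;
	return 0;
}

/*
 * Count the entries of a sparse index: one per populated file, plus one
 * per collapsed top-level directory. `paths` must be sorted so that all
 * paths of a directory are adjacent.
 */
static size_t toy_sparse_entry_count(const char **paths, size_t nr,
				     const char **cone, size_t cone_nr)
{
	const char *last_dir = NULL;
	size_t i, last_len = 0, count = 0;

	assert(paths);
	for (i = 0; i < nr; i++) {
		size_t len = toy_top_dir_len(paths[i]);

		if (toy_in_cone(paths[i], cone, cone_nr)) {
			count++;
			continue;
		}
		if (last_dir && len == last_len && !strncmp(paths[i], last_dir, len))
			continue; /* directory already represented by one entry */
		last_dir = paths[i];
		last_len = len;
		count++;
	}
	return count;
}
```

With seven paths at `HEAD` and only `src/` in the cone, this toy index would hold five entries; the gap widens as the directories outside the cone grow.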

At time of writing, sparse-directory entries violate expectations about the
index format and its in-memory data structure. There are many consumers in
the codebase that expect to iterate through all of the index entries and
see only files. In fact, these loops expect to see a reference to every
staged file. One way to handle this is to parse trees to replace a
sparse-directory entry with all of the files within that tree as the index
is loaded. However, parsing trees is slower than parsing the index format,
so that is a slower operation than if we left the index alone. The plan is
to make all of these integrations "sparse aware" so this expansion through
tree parsing is unnecessary and they use fewer resources than when using a
full index.

The implementation plan below follows four phases to slowly integrate with
the sparse-index. The intention is to incrementally update Git commands to
interact safely with the sparse-index without significant slowdowns. This
may not always be possible, but the hope is that the primary commands that
users need in their daily work are dramatically improved.

Phase I: Format and initial speedups
------------------------------------

During this phase, Git learns to enable the sparse-index and safely parse
one. Protections are put in place so that every consumer of the in-memory
data structure can operate with its current assumption of every file at
`HEAD`.

At first, every index parse will call a helper method,
`ensure_full_index()`, which scans the index for sparse-directory entries
(pointing to trees) and replaces them with the full list of paths (with
blob contents) by parsing tree objects. This will be slower in all cases.
The only noticeable change in behavior will be that the serialized index
file contains sparse-directory entries.
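
The expansion step can be pictured with the following sketch. It is not the `ensure_full_index()` implementation from this series: it assumes a flat, hypothetical "tree" table in place of real tree objects, fixed-size buffers, and no nested directories. Every `toy_*` name is invented for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_ENTRIES 16
#define TOY_MAX_NAME 64

/* Hypothetical flat "tree": a directory name and its NULL-terminated file list. */
struct toy_tree {
	const char *dir;
	const char *files[4];
};

/* Hypothetical in-memory index: sorted names plus a "sparse" marker. */
struct toy_index {
	char names[TOY_MAX_ENTRIES][TOY_MAX_NAME];
	size_t nr;
	int sparse;
};

static const struct toy_tree *toy_find_tree(const struct toy_tree *trees,
					    size_t trees_nr, const char *dir)
{
	size_t i;

	for (i = 0; i < trees_nr; i++)
		if (!strcmp(trees[i].dir, dir))
			return &trees[i];
	return NULL;
}

static void toy_append(struct toy_index *idx, const char *dir, const char *file)
{
	assert(idx->nr < TOY_MAX_ENTRIES);
	snprintf(idx->names[idx->nr], TOY_MAX_NAME, "%s%s", dir, file);
	idx->nr++;
}

/* Replace every directory entry (name ending in '/') with the files in its tree. */
static void toy_ensure_full_index(struct toy_index *idx,
				  const struct toy_tree *trees, size_t trees_nr)
{
	struct toy_index full = { .nr = 0, .sparse = 0 };
	size_t i, j;

	if (!idx->sparse)
		return; /* already full, nothing to expand */

	for (i = 0; i < idx->nr; i++) {
		const char *name = idx->names[i];
		size_t len = strlen(name);
		const struct toy_tree *tree;

		if (!len || name[len - 1] != '/') {
			toy_append(&full, "", name);
			continue;
		}
		tree = toy_find_tree(trees, trees_nr, name);
		assert(tree); /* the sketch assumes every directory has a tree */
		for (j = 0; j < 4 && tree->files[j]; j++)
			toy_append(&full, name, tree->files[j]);
	}
	*idx = full;
}
```

Because the expanded paths are emitted in place of the directory entry, the result stays sorted as long as each tree lists its files in order.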

To start, we use a new required index extension, `sdir`, to allow
inserting sparse-directory entries into indexes with file format
versions 2, 3, and 4. This prevents Git versions that do not understand
the sparse-index from operating on one, while allowing tools that do not
understand the sparse-index to operate on repositories as long as they do
not interact with the index. A new format, index v5, will be introduced
that includes sparse-directory entries by default. It might also
introduce other features that have been considered for improving the
index, as well.

Next, consumers of the index will be guarded against operating on a
sparse-index by inserting calls to `ensure_full_index()` or
`expand_index_to_path()`. If a specific path is requested, then those will
be protected from within the `index_file_exists()` and `index_name_pos()`
API calls: they will call `ensure_full_index()` if necessary. The
intention here is to preserve existing behavior when interacting with a
sparse-checkout. We don't want a change to happen by accident, without
tests. Many of these locations may not need any change before removing the
guards, but we should not do so without tests to ensure the expected
behavior happens.
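
One way to picture a path-based guard is sketched below. It assumes a sorted array of names and the "negative insert position" convention that `index_name_pos()` documents, then checks whether the requested path would sort directly under a directory entry. The `toy_*` functions are hypothetical; the sketch only detects that an expansion would be needed and does not perform one.

```c
#include <assert.h>
#include <string.h>

/* Binary search: returns the position, or -(insert position) - 1 if absent. */
static int toy_name_pos(const char **names, int nr, const char *name)
{
	int lo = 0, hi = nr;

	assert(names && name);
	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;
		int cmp = strcmp(name, names[mid]);

		if (!cmp)
			return mid;
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return -lo - 1;
}

/*
 * Returns 1 if `path` is absent but would live inside the directory entry
 * (name ending in '/') that sorts immediately before it.
 */
static int toy_path_needs_expansion(const char **names, int nr, const char *path)
{
	int pos = toy_name_pos(names, nr, path);
	const char *prev;
	size_t len;

	if (pos >= 0)
		return 0; /* found directly, no expansion needed */
	pos = -pos - 1;
	if (!pos)
		return 0; /* nothing sorts before it */
	prev = names[pos - 1];
	len = strlen(prev);
	return len > 0 && prev[len - 1] == '/' && !strncmp(path, prev, len);
}
```

A lookup that may end up expanding the index has to modify it, which is consistent with the many hunks in this merge that drop `const` from `struct index_state *` parameters.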

It may be desirable to _change_ the behavior of some commands in the
presence of a sparse index or more generally in any sparse-checkout
scenario. In such cases, these should be carefully communicated and
tested. No such behavior changes are intended during this phase.

During a scan of the codebase, not every iteration of the cache entries
needs an `ensure_full_index()` check. The basic reasons include:

1. The loop is scanning for entries with non-zero stage. These entries
   are not collapsed into a sparse-directory entry.

2. The loop is scanning for submodules. These entries are not collapsed
   into a sparse-directory entry.

3. The loop is part of the index API, especially around reading or
   writing the format.

4. The loop is checking for correct order of cache entries and that is
   correct if and only if the sparse-directory entries are in the correct
   location.

5. The loop ignores entries with the `SKIP_WORKTREE` bit set, or is
   otherwise already aware of sparse directory entries.

6. The sparse-index is disabled at this point when using the split-index
   feature, so no effort is made to protect the split-index API.

Even after inserting these guards, we will keep expanding sparse-indexes
for most Git commands using the `command_requires_full_index` repository
setting. This setting will be on by default and disabled one builtin at a
time until we have sufficient confidence that all of the index operations
are properly guarded.
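
The opt-out described for `command_requires_full_index` could look roughly like the sketch below. Only the setting's name comes from the text above; `struct toy_repo`, its other fields, and both functions are hypothetical stand-ins rather than Git's repository settings API.

```c
#include <assert.h>

/* Hypothetical stand-ins for repository settings and index state. */
struct toy_repo {
	int command_requires_full_index; /* on by default during this phase */
	int index_is_sparse;
	int expansions;                  /* how often we paid for an expansion */
};

static void toy_repo_init(struct toy_repo *r, int index_is_sparse)
{
	assert(r);
	r->command_requires_full_index = 1;
	r->index_is_sparse = index_is_sparse;
	r->expansions = 0;
}

/* Called after the index is read: expand eagerly unless the builtin opted out. */
static void toy_after_read_index(struct toy_repo *r)
{
	if (r->command_requires_full_index && r->index_is_sparse) {
		r->index_is_sparse = 0;
		r->expansions++;
	}
}
```

A builtin that has been audited would clear the flag before reading the index and then rely on the finer-grained API guards instead.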

To complete this phase, the commands `git status` and `git add` will be
integrated with the sparse-index so that they operate with O(Populated)
performance. They will be carefully tested for operations within and
outside the sparse-checkout definition.

Phase II: Careful integrations
------------------------------

This phase focuses on ensuring that all index extensions and APIs work
well with a sparse-index. This requires significant increases to our test
coverage, especially for operations that interact with the working
directory outside of the sparse-checkout definition. Some of these
behaviors may not be the desirable ones, such as some tests already
marked for failure in `t1092-sparse-checkout-compatibility.sh`.

The index extensions that may require special integrations are:

* FS Monitor
* Untracked cache

While integrating with these features, we should look for patterns that
might lead to better APIs for interacting with the index. Coalescing
common usage patterns into an API call can reduce the number of places
where sparse-directories need to be handled carefully.

Phase III: Important command speedups
-------------------------------------

At this point, the patterns for testing and implementing sparse-directory
logic should be relatively stable. This phase focuses on updating some of
the most common builtins that use the index to operate as O(Populated).
Here is a potential list of commands that could be valuable to integrate
at this point:

* `git commit`
* `git checkout`
* `git merge`
* `git rebase`

Hopefully, commands such as `git merge` and `git rebase` can benefit
instead from merge algorithms that do not use the index as a data
structure, such as the merge-ORT strategy. As these topics mature, we
may enable the ORT strategy by default for repositories using the
sparse-index feature.

Along with `git status` and `git add`, these commands cover the majority
of users' interactions with the working directory. In addition, we can
integrate with these commands:

* `git grep`
* `git rm`

These have been proposed as some whose behavior could change when in a
repo with a sparse-checkout definition. It would be good to include this
behavior automatically when using a sparse-index. Some clarity is needed
to make the behavior switch clear to the user.

This phase is the first where parallel work might be possible without too
many conflicts between topics.

Phase IV: The long tail
-----------------------

This last phase is less a "phase" and more "the new normal" after all of
the previous work.

To start, the `command_requires_full_index` option could be removed in
favor of expanding only when hitting an API guard.

There are many Git commands that could use special attention to operate as
O(Populated), while some might be so rare that it is acceptable to leave
them with additional overhead when a sparse-index is present.

Here are some commands that might be useful to update:

* `git sparse-checkout set`
* `git am`
* `git clean`
* `git stash`

@ -995,6 +995,7 @@ LIB_OBJS += setup.o
LIB_OBJS += shallow.o
LIB_OBJS += sideband.o
LIB_OBJS += sigchain.o
LIB_OBJS += sparse-index.o
LIB_OBJS += split-index.o
LIB_OBJS += stable-qsort.o
LIB_OBJS += strbuf.o

14
attr.c

@ -733,7 +733,7 @@ static struct attr_stack *read_attr_from_file(const char *path, unsigned flags)
return res;
}
static struct attr_stack *read_attr_from_index(const struct index_state *istate,
static struct attr_stack *read_attr_from_index(struct index_state *istate,
const char *path,
unsigned flags)
{
@ -763,7 +763,7 @@ static struct attr_stack *read_attr_from_index(const struct index_state *istate,
return res;
}
static struct attr_stack *read_attr(const struct index_state *istate,
static struct attr_stack *read_attr(struct index_state *istate,
const char *path, unsigned flags)
{
struct attr_stack *res = NULL;
@ -855,7 +855,7 @@ static void push_stack(struct attr_stack **attr_stack_p,
}
}
static void bootstrap_attr_stack(const struct index_state *istate,
static void bootstrap_attr_stack(struct index_state *istate,
struct attr_stack **stack)
{
struct attr_stack *e;
@ -894,7 +894,7 @@ static void bootstrap_attr_stack(const struct index_state *istate,
push_stack(stack, e, NULL, 0);
}
static void prepare_attr_stack(const struct index_state *istate,
static void prepare_attr_stack(struct index_state *istate,
const char *path, int dirlen,
struct attr_stack **stack)
{
@ -1094,7 +1094,7 @@ static void determine_macros(struct all_attrs_item *all_attrs,
* If check->check_nr is non-zero, only attributes in check[] are collected.
* Otherwise all attributes are collected.
*/
static void collect_some_attrs(const struct index_state *istate,
static void collect_some_attrs(struct index_state *istate,
const char *path,
struct attr_check *check)
{
@ -1123,7 +1123,7 @@ static void collect_some_attrs(const struct index_state *istate,
fill(path, pathlen, basename_offset, check->stack, check->all_attrs, rem);
}
void git_check_attr(const struct index_state *istate,
void git_check_attr(struct index_state *istate,
const char *path,
struct attr_check *check)
{
@ -1140,7 +1140,7 @@ void git_check_attr(const struct index_state *istate,
}
}
void git_all_attrs(const struct index_state *istate,
void git_all_attrs(struct index_state *istate,
const char *path, struct attr_check *check)
{
int i;

4
attr.h

@ -190,14 +190,14 @@ void attr_check_free(struct attr_check *check);
*/
const char *git_attr_name(const struct git_attr *);
void git_check_attr(const struct index_state *istate,
void git_check_attr(struct index_state *istate,
const char *path, struct attr_check *check);
/*
* Retrieve all attributes that apply to the specified path.
* check holds the attributes and their values.
*/
void git_all_attrs(const struct index_state *istate,
void git_all_attrs(struct index_state *istate,
const char *path, struct attr_check *check);
enum git_attr_direction {

@ -141,6 +141,8 @@ static int renormalize_tracked_files(const struct pathspec *pathspec, int flags)
{
int i, retval = 0;
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(&the_index);
for (i = 0; i < active_nr; i++) {
struct cache_entry *ce = active_cache[i];

@ -120,6 +120,8 @@ static void checkout_all(const char *prefix, int prefix_length)
int i, errs = 0;
struct cache_entry *last_ce = NULL;
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(&the_index);
for (i = 0; i < active_nr ; i++) {
struct cache_entry *ce = active_cache[i];
if (ce_stage(ce) != checkout_stage

@ -369,6 +369,9 @@ static int checkout_worktree(const struct checkout_opts *opts,
NULL);
enable_delayed_checkout(&state);
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(&the_index);
for (pos = 0; pos < active_nr; pos++) {
struct cache_entry *ce = active_cache[pos];
if (ce->ce_flags & CE_MATCHED) {
@ -513,6 +516,8 @@ static int checkout_paths(const struct checkout_opts *opts,
* Make sure all pathspecs participated in locating the paths
* to be checked out.
*/
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(&the_index);
for (pos = 0; pos < active_nr; pos++)
if (opts->overlay_mode)
mark_ce_for_checkout_overlay(active_cache[pos],

@ -261,6 +261,8 @@ static int list_paths(struct string_list *list, const char *with_tree,
free(max_prefix);
}
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(&the_index);
for (i = 0; i < active_nr; i++) {
const struct cache_entry *ce = active_cache[i];
struct string_list_item *item;
@ -976,6 +978,8 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
if (get_oid(parent, &oid)) {
int i, ita_nr = 0;
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(&the_index);
for (i = 0; i < active_nr; i++)
if (ce_intent_to_add(active_cache[i]))
ita_nr++;

@ -585,6 +585,9 @@ static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix,
setenv("GIT_DIFFTOOL_DIRDIFF", "true", 1);
rc = run_command_v_opt(helper_argv, flags);
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(&wtindex);
/*
* If the diff includes working copy files and those
* files were modified during the diff, then the changes

@ -881,6 +881,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
verify_index_checksum = 1;
verify_ce_order = 1;
read_cache();
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(&the_index);
for (i = 0; i < active_nr; i++) {
unsigned int mode;
struct blob *blob;

@ -504,6 +504,8 @@ static int grep_cache(struct grep_opt *opt,
if (repo_read_index(repo) < 0)
die(_("index file corrupt"));
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(repo->index);
for (nr = 0; nr < repo->index->cache_nr; nr++) {
const struct cache_entry *ce = repo->index->cache[nr];

@ -57,7 +57,7 @@ static const char *tag_modified = "";
static const char *tag_skip_worktree = "";
static const char *tag_resolve_undo = "";
static void write_eolinfo(const struct index_state *istate,
static void write_eolinfo(struct index_state *istate,
const struct cache_entry *ce, const char *path)
{
if (show_eol) {
@ -122,7 +122,7 @@ static void print_debug(const struct cache_entry *ce)
}
}
static void show_dir_entry(const struct index_state *istate,
static void show_dir_entry(struct index_state *istate,
const char *tag, struct dir_entry *ent)
{
int len = max_prefix_len;
@ -139,7 +139,7 @@ static void show_dir_entry(const struct index_state *istate,
write_name(ent->name);
}
static void show_other_files(const struct index_state *istate,
static void show_other_files(struct index_state *istate,
const struct dir_struct *dir)
{
int i;
@ -152,7 +152,7 @@ static void show_other_files(const struct index_state *istate,
}
}
static void show_killed_files(const struct index_state *istate,
static void show_killed_files(struct index_state *istate,
const struct dir_struct *dir)
{
int i;
@ -254,7 +254,7 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
}
}
static void show_ru_info(const struct index_state *istate)
static void show_ru_info(struct index_state *istate)
{
struct string_list_item *item;
@ -317,6 +317,8 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
if (!(show_cached || show_stage || show_deleted || show_modified))
return;
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(repo->index);
for (i = 0; i < repo->index->cache_nr; i++) {
const struct cache_entry *ce = repo->index->cache[i];
struct stat st;
@ -494,6 +496,8 @@ void overlay_tree_on_index(struct index_state *istate,
die("bad tree-ish %s", tree_name);
/* Hoist the unmerged entries up to stage #3 to make room */
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(istate);
for (i = 0; i < istate->cache_nr; i++) {
struct cache_entry *ce = istate->cache[i];
if (!ce_stage(ce))

@ -58,6 +58,8 @@ static void merge_one_path(const char *path)
static void merge_all(void)
{
int i;
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(&the_index);
for (i = 0; i < active_nr; i++) {
const struct cache_entry *ce = active_cache[i];
if (!ce_stage(ce))
@ -80,6 +82,9 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
read_cache();
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(&the_index);
i = 1;
if (!strcmp(argv[i], "-o")) {
one_shot = 1;

@ -293,6 +293,8 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
seen = xcalloc(pathspec.nr, 1);
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(&the_index);
for (i = 0; i < active_nr; i++) {
const struct cache_entry *ce = active_cache[i];
if (!ce_path_match(&the_index, ce, &pathspec, seen))

@ -14,6 +14,7 @@
#include "unpack-trees.h"
#include "wt-status.h"
#include "quote.h"
#include "sparse-index.h"
static const char *empty_base = "";
@ -110,6 +111,8 @@ static int update_working_directory(struct pattern_list *pl)
if (is_index_unborn(r->index))
return UPDATE_SPARSITY_SUCCESS;
r->index->sparse_checkout_patterns = pl;
memset(&o, 0, sizeof(o));
o.verbose_update = isatty(2);
o.update = 1;
@ -138,6 +141,7 @@ static int update_working_directory(struct pattern_list *pl)
else
rollback_lock_file(&lock_file);
r->index->sparse_checkout_patterns = NULL;
return result;
}
@ -276,16 +280,20 @@ static int set_config(enum sparse_checkout_mode mode)
"core.sparseCheckoutCone",
mode == MODE_CONE_PATTERNS ? "true" : NULL);
if (mode == MODE_NO_PATTERNS)
set_sparse_index_config(the_repository, 0);
return 0;
}
static char const * const builtin_sparse_checkout_init_usage[] = {
N_("git sparse-checkout init [--cone]"),
N_("git sparse-checkout init [--cone] [--[no-]sparse-index]"),
NULL
};
static struct sparse_checkout_init_opts {
int cone_mode;
int sparse_index;
} init_opts;
static int sparse_checkout_init(int argc, const char **argv)
@ -300,11 +308,15 @@ static int sparse_checkout_init(int argc, const char **argv)
static struct option builtin_sparse_checkout_init_options[] = {
OPT_BOOL(0, "cone", &init_opts.cone_mode,
N_("initialize the sparse-checkout in cone mode")),
OPT_BOOL(0, "sparse-index", &init_opts.sparse_index,
N_("toggle the use of a sparse index")),
OPT_END(),
};
repo_read_index(the_repository);
init_opts.sparse_index = -1;
argc = parse_options(argc, argv, NULL,
builtin_sparse_checkout_init_options,
builtin_sparse_checkout_init_usage, 0);
@ -323,10 +335,20 @@ static int sparse_checkout_init(int argc, const char **argv)
sparse_filename = get_sparse_checkout_filename();
res = add_patterns_from_file_to_list(sparse_filename, "", 0, &pl, NULL, 0);
if (init_opts.sparse_index >= 0) {
if (set_sparse_index_config(the_repository, init_opts.sparse_index) < 0)
die(_("failed to modify sparse-index config"));
/* force an index rewrite */
repo_read_index(the_repository);
the_repository->index->updated_workdir = 1;
}
core_apply_sparse_checkout = 1;
/* If we already have a sparse-checkout file, use it. */
if (res >= 0) {
free(sparse_filename);
core_apply_sparse_checkout = 1;
return update_working_directory(NULL);
}
@ -348,6 +370,7 @@ static int sparse_checkout_init(int argc, const char **argv)
add_pattern(strbuf_detach(&pattern, NULL), empty_base, 0, &pl, 0);
strbuf_addstr(&pattern, "!/*/");
add_pattern(strbuf_detach(&pattern, NULL), empty_base, 0, &pl, 0);
pl.use_cone_patterns = init_opts.cone_mode;
return write_patterns_and_update(&pl);
}
@ -517,19 +540,18 @@ static int modify_pattern_list(int argc, const char **argv, enum modify_type m)
{
int result;
int changed_config = 0;
struct pattern_list pl;
memset(&pl, 0, sizeof(pl));
struct pattern_list *pl = xcalloc(1, sizeof(*pl));
switch (m) {
case ADD:
if (core_sparse_checkout_cone)
add_patterns_cone_mode(argc, argv, &pl);
add_patterns_cone_mode(argc, argv, pl);
else
add_patterns_literal(argc, argv, &pl);
add_patterns_literal(argc, argv, pl);
break;
case REPLACE:
add_patterns_from_input(&pl, argc, argv);
add_patterns_from_input(pl, argc, argv);
break;
}
@ -539,12 +561,13 @@ static int modify_pattern_list(int argc, const char **argv, enum modify_type m)
changed_config = 1;
}
result = write_patterns_and_update(&pl);
result = write_patterns_and_update(pl);
if (result && changed_config)
set_config(MODE_NO_PATTERNS);
clear_pattern_list(&pl);
clear_pattern_list(pl);
free(pl);
return result;
}
@ -614,6 +637,9 @@ static int sparse_checkout_disable(int argc, const char **argv)
strbuf_addstr(&match_all, "/*");
add_pattern(strbuf_detach(&match_all, NULL), empty_base, 0, &pl, 0);
prepare_repo_settings(the_repository);
the_repository->settings.sparse_index = 0;
if (update_working_directory(&pl))
die(_("error while refreshing working directory"));
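
The `init_opts.sparse_index = -1` initialization in the hunks above suggests a tri-state: the option was not given, was negated, or was set. A minimal sketch of that idea follows, using a hand-rolled argument scan instead of Git's `parse_options()`; both `toy_*` functions are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* -1: option not given; 0: --no-sparse-index; 1: --sparse-index. Last one wins. */
static int toy_parse_sparse_index_flag(int argc, const char **argv)
{
	int value = -1, i;

	for (i = 0; i < argc; i++) {
		if (!strcmp(argv[i], "--sparse-index"))
			value = 1;
		else if (!strcmp(argv[i], "--no-sparse-index"))
			value = 0;
	}
	return value;
}

/*
 * Returns 1 and stores the requested value in *out when the user asked for
 * a change; returns 0 and leaves *out alone when the option was absent.
 */
static int toy_should_update_config(int flag, int *out)
{
	assert(out);
	if (flag < 0)
		return 0;
	*out = flag;
	return 1;
}
```

Starting from -1 lets a plain `init --cone` leave an existing sparse-index setting untouched, while an explicit `--no-sparse-index` still forces a rewrite.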

@ -1412,6 +1412,8 @@ static int do_push_stash(const struct pathspec *ps, const char *stash_msg, int q
int i;
char *ps_matched = xcalloc(ps->nr, 1);
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(&the_index);
for (i = 0; i < active_nr; i++)
ce_path_match(&the_index, active_cache[i], ps,
ps_matched);

@ -745,6 +745,8 @@ static int do_reupdate(int ac, const char **av,
*/
has_head = 0;
redo:
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(&the_index);
for (pos = 0; pos < active_nr; pos++) {
const struct cache_entry *ce = active_cache[pos];
struct cache_entry *old = NULL;

@ -6,6 +6,7 @@
#include "object-store.h"
#include "replace-object.h"
#include "promisor-remote.h"
#include "sparse-index.h"
#ifndef DEBUG_CACHE_TREE
#define DEBUG_CACHE_TREE 0
@ -255,6 +256,24 @@ static int update_one(struct cache_tree *it,
*skip_count = 0;
/*
* If the first entry of this region is a sparse directory
* entry corresponding exactly to 'base', then this cache_tree
* struct is a "leaf" in the data structure, pointing to the
* tree OID specified in the entry.
*/
if (entries > 0) {
const struct cache_entry *ce = cache[0];
if (S_ISSPARSEDIR(ce->ce_mode) &&
ce->ce_namelen == baselen &&
!strncmp(ce->name, base, baselen)) {
it->entry_count = 1;
oidcpy(&it->oid, &ce->oid);
return 1;
}
}
if (0 <= it->entry_count && has_object_file(&it->oid))
return it->entry_count;
@ -442,6 +461,8 @@ int cache_tree_update(struct index_state *istate, int flags)
if (i)
return i;
ensure_full_index(istate);
if (!istate->cache_tree)
istate->cache_tree = cache_tree();
@ -787,6 +808,19 @@ int cache_tree_matches_traversal(struct cache_tree *root,
return 0;
}
static void verify_one_sparse(struct repository *r,
struct index_state *istate,
struct cache_tree *it,
struct strbuf *path,
int pos)
{
struct cache_entry *ce = istate->cache[pos];
if (!S_ISSPARSEDIR(ce->ce_mode))
BUG("directory '%s' is present in index, but not sparse",
path->buf);
}
static void verify_one(struct repository *r,
struct index_state *istate,
struct cache_tree *it,
@ -809,6 +843,12 @@ static void verify_one(struct repository *r,
if (path->len) {
pos = index_name_pos(istate, path->buf, path->len);
if (pos >= 0) {
verify_one_sparse(r, istate, it, path, pos);
return;
}
pos = -pos - 1;
} else {
pos = 0;

25
cache.h

@ -204,6 +204,8 @@ struct cache_entry {
#error "CE_EXTENDED_FLAGS out of range"
#endif
#define S_ISSPARSEDIR(m) ((m) == S_IFDIR)
/* Forward structure decls */
struct pathspec;
struct child_process;
@ -249,6 +251,8 @@ static inline unsigned int create_ce_mode(unsigned int mode)
{
if (S_ISLNK(mode))
return S_IFLNK;
if (S_ISSPARSEDIR(mode))
return S_IFDIR;
if (S_ISDIR(mode) || S_ISGITLINK(mode))
return S_IFGITLINK;
return S_IFREG | ce_permissions(mode);
@ -305,6 +309,7 @@ static inline unsigned int canon_mode(unsigned int mode)
struct split_index;
struct untracked_cache;
struct progress;
struct pattern_list;
struct index_state {
struct cache_entry **cache;
@ -319,7 +324,14 @@ struct index_state {
drop_cache_tree : 1,
updated_workdir : 1,
updated_skipworktree : 1,
fsmonitor_has_run_once : 1;
fsmonitor_has_run_once : 1,
/*
* sparse_index == 1 when sparse-directory
* entries exist. Requires sparse-checkout
* in cone mode.
*/
sparse_index : 1;
struct hashmap name_hash;
struct hashmap dir_hash;
struct object_id oid;
@ -329,6 +341,7 @@ struct index_state {
struct mem_pool *ce_mem_pool;
struct progress *progress;
struct repository *repo;
struct pattern_list *sparse_checkout_patterns;
};
/* Name hashing */
@ -337,6 +350,7 @@ void add_name_hash(struct index_state *istate, struct cache_entry *ce);
void remove_name_hash(struct index_state *istate, struct cache_entry *ce);
void free_name_hash(struct index_state *istate);
void ensure_full_index(struct index_state *istate);
/* Cache entry creation and cleanup */
@ -722,6 +736,8 @@ int read_index_from(struct index_state *, const char *path,
const char *gitdir);
int is_index_unborn(struct index_state *);
void ensure_full_index(struct index_state *istate);
/* For use with `write_locked_index()`. */
#define COMMIT_LOCK (1 << 0)
#define SKIP_IF_UNCHANGED (1 << 1)
@ -785,7 +801,7 @@ struct cache_entry *index_file_exists(struct index_state *istate, const char *na
* index_name_pos(&index, "f", 1) -> -3
* index_name_pos(&index, "g", 1) -> -5
*/
int index_name_pos(const struct index_state *, const char *name, int namelen);
int index_name_pos(struct index_state *, const char *name, int namelen);
/*
* Some functions return the negative complement of an insert position when a
@ -835,8 +851,8 @@ int add_file_to_index(struct index_state *, const char *path, int flags);
int chmod_index_entry(struct index_state *, struct cache_entry *ce, char flip);
int ce_same_name(const struct cache_entry *a, const struct cache_entry *b);
void set_object_name_for_intent_to_add_entry(struct cache_entry *ce);
int index_name_is_other(const struct index_state *, const char *, int);
void *read_blob_data_from_index(const struct index_state *, const char *, unsigned long *);
int index_name_is_other(struct index_state *, const char *, int);
void *read_blob_data_from_index(struct index_state *, const char *, unsigned long *);
/* do stat comparison even if CE_VALID is true */
#define CE_MATCH_IGNORE_VALID 01
@ -1044,6 +1060,7 @@ struct repository_format {
int worktree_config;
int is_bare;
int hash_algo;
int sparse_index;
char *work_tree;
struct string_list unknown_extensions;
struct string_list v1_only_extensions;
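
The `S_ISSPARSEDIR()` macro and the `create_ce_mode()` change in this file rely on an exact comparison: a bare directory mode with no permission bits is kept as a sparse directory, while any other directory mode still becomes a gitlink. The sketch below restates that ordering with locally defined constants so it does not depend on `<sys/stat.h>`. The `TOY_*` names are hypothetical, and the 0755/0644 permission rule is an assumption about `ce_permissions()`, which this diff does not show.

```c
#include <assert.h>

#define TOY_S_IFMT      0170000u
#define TOY_S_IFREG     0100000u
#define TOY_S_IFLNK     0120000u
#define TOY_S_IFDIR     0040000u
#define TOY_S_IFGITLINK 0160000u

/* Exact match: any permission bits make this false. */
#define TOY_S_ISSPARSEDIR(m) ((m) == TOY_S_IFDIR)

static unsigned int toy_create_ce_mode(unsigned int mode)
{
	assert((mode & ~07777777u) == 0); /* sanity bound for this sketch */
	if ((mode & TOY_S_IFMT) == TOY_S_IFLNK)
		return TOY_S_IFLNK;
	if (TOY_S_ISSPARSEDIR(mode))
		return TOY_S_IFDIR;
	if ((mode & TOY_S_IFMT) == TOY_S_IFDIR ||
	    (mode & TOY_S_IFMT) == TOY_S_IFGITLINK)
		return TOY_S_IFGITLINK;
	/* Assumed permission rule: executable files become 0755, others 0644. */
	return TOY_S_IFREG | ((mode & 0100u) ? 0755u : 0644u);
}
```

The sparse-directory test has to come before the generic directory test; otherwise a sparse-directory mode would be rewritten into a gitlink.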

@ -127,7 +127,7 @@ static const char *gather_convert_stats_ascii(const char *data, unsigned long si
}
}
const char *get_cached_convert_stats_ascii(const struct index_state *istate,
const char *get_cached_convert_stats_ascii(struct index_state *istate,
const char *path)
{
const char *ret;
@ -211,7 +211,7 @@ static void check_global_conv_flags_eol(const char *path,
}
}
static int has_crlf_in_index(const struct index_state *istate, const char *path)
static int has_crlf_in_index(struct index_state *istate, const char *path)
{
unsigned long sz;
void *data;
@ -485,7 +485,7 @@ static int encode_to_worktree(const char *path, const char *src, size_t src_len,
return 1;
}
static int crlf_to_git(const struct index_state *istate,
static int crlf_to_git(struct index_state *istate,
const char *path, const char *src, size_t len,
struct strbuf *buf,
enum convert_crlf_action crlf_action, int conv_flags)
@ -1293,7 +1293,7 @@ static int git_path_check_ident(struct attr_check_item *check)
static struct attr_check *check;
void convert_attrs(const struct index_state *istate,
void convert_attrs(struct index_state *istate,
struct conv_attrs *ca, const char *path)
{
struct attr_check_item *ccheck = NULL;
@ -1355,7 +1355,7 @@ void reset_parsed_attributes(void)
user_convert_tail = NULL;
}
int would_convert_to_git_filter_fd(const struct index_state *istate, const char *path)
int would_convert_to_git_filter_fd(struct index_state *istate, const char *path)
{
struct conv_attrs ca;
@ -1374,7 +1374,7 @@ int would_convert_to_git_filter_fd(const struct index_state *istate, const char
return apply_filter(path, NULL, 0, -1, NULL, ca.drv, CAP_CLEAN, NULL, NULL);
}
const char *get_convert_attr_ascii(const struct index_state *istate, const char *path)
const char *get_convert_attr_ascii(struct index_state *istate, const char *path)
{
struct conv_attrs ca;
@ -1400,7 +1400,7 @@ const char *get_convert_attr_ascii(const struct index_state *istate, const char
return "";
}
int convert_to_git(const struct index_state *istate,
int convert_to_git(struct index_state *istate,
const char *path, const char *src, size_t len,
struct strbuf *dst, int conv_flags)
{
@ -1434,7 +1434,7 @@ int convert_to_git(const struct index_state *istate,
return ret | ident_to_git(src, len, dst, ca.ident);
}
void convert_to_git_filter_fd(const struct index_state *istate,
void convert_to_git_filter_fd(struct index_state *istate,
const char *path, int fd, struct strbuf *dst,
int conv_flags)
{
@ -1511,7 +1511,7 @@ int convert_to_working_tree_ca(const struct conv_attrs *ca,
meta, NULL);
}
int renormalize_buffer(const struct index_state *istate, const char *path,
int renormalize_buffer(struct index_state *istate, const char *path,
const char *src, size_t len, struct strbuf *dst)
{
struct conv_attrs ca;
@ -1972,7 +1972,7 @@ struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca,
return filter;
}
struct stream_filter *get_stream_filter(const struct index_state *istate,
struct stream_filter *get_stream_filter(struct index_state *istate,
const char *path,
const struct object_id *oid)
{

@ -84,19 +84,19 @@ struct conv_attrs {
const char *working_tree_encoding; /* Supported encoding or default encoding if NULL */
};
void convert_attrs(const struct index_state *istate,
void convert_attrs(struct index_state *istate,
struct conv_attrs *ca, const char *path);
extern enum eol core_eol;
extern char *check_roundtrip_encoding;
const char *get_cached_convert_stats_ascii(const struct index_state *istate,
const char *get_cached_convert_stats_ascii(struct index_state *istate,
const char *path);
const char *get_wt_convert_stats_ascii(const char *path);
const char *get_convert_attr_ascii(const struct index_state *istate,
const char *get_convert_attr_ascii(struct index_state *istate,
const char *path);
/* returns 1 if *dst was used */
int convert_to_git(const struct index_state *istate,
int convert_to_git(struct index_state *istate,
const char *path, const char *src, size_t len,
struct strbuf *dst, int conv_flags);
int convert_to_working_tree_ca(const struct conv_attrs *ca,
@ -108,7 +108,7 @@ int async_convert_to_working_tree_ca(const struct conv_attrs *ca,
size_t len, struct strbuf *dst,
const struct checkout_metadata *meta,
void *dco);
static inline int convert_to_working_tree(const struct index_state *istate,
static inline int convert_to_working_tree(struct index_state *istate,
const char *path, const char *src,
size_t len, struct strbuf *dst,
const struct checkout_metadata *meta)
@ -117,7 +117,7 @@ static inline int convert_to_working_tree(const struct index_state *istate,
convert_attrs(istate, &ca, path);
return convert_to_working_tree_ca(&ca, path, src, len, dst, meta);
}
static inline int async_convert_to_working_tree(const struct index_state *istate,
static inline int async_convert_to_working_tree(struct index_state *istate,
const char *path, const char *src,
size_t len, struct strbuf *dst,
const struct checkout_metadata *meta,
@ -129,20 +129,20 @@ static inline int async_convert_to_working_tree(const struct index_state *istate
}
int async_query_available_blobs(const char *cmd,
struct string_list *available_paths);
int renormalize_buffer(const struct index_state *istate,
int renormalize_buffer(struct index_state *istate,
const char *path, const char *src, size_t len,
struct strbuf *dst);
static inline int would_convert_to_git(const struct index_state *istate,
static inline int would_convert_to_git(struct index_state *istate,
const char *path)
{
return convert_to_git(istate, path, NULL, 0, NULL, 0);
}
/* Precondition: would_convert_to_git_filter_fd(path) == true */
void convert_to_git_filter_fd(const struct index_state *istate,
void convert_to_git_filter_fd(struct index_state *istate,
const char *path, int fd,
struct strbuf *dst,
int conv_flags);
int would_convert_to_git_filter_fd(const struct index_state *istate,
int would_convert_to_git_filter_fd(struct index_state *istate,
const char *path);
/*
@ -176,7 +176,7 @@ void reset_parsed_attributes(void);
struct stream_filter; /* opaque */
struct stream_filter *get_stream_filter(const struct index_state *istate,
struct stream_filter *get_stream_filter(struct index_state *istate,
const char *path,
const struct object_id *);
struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca,

14
dir.c

@ -306,7 +306,7 @@ static int do_read_blob(const struct object_id *oid, struct oid_stat *oid_stat,
* [1] Only if DO_MATCH_DIRECTORY is passed; otherwise, this is NOT a match.
* [2] Only if DO_MATCH_LEADING_PATHSPEC is passed; otherwise, not a match.
*/
static int match_pathspec_item(const struct index_state *istate,
static int match_pathspec_item(struct index_state *istate,
const struct pathspec_item *item, int prefix,
const char *name, int namelen, unsigned flags)
{
@ -429,7 +429,7 @@ static int match_pathspec_item(const struct index_state *istate,
* pathspec did not match any names, which could indicate that the
* user mistyped the nth pathspec.
*/
static int do_match_pathspec(const struct index_state *istate,
static int do_match_pathspec(struct index_state *istate,
const struct pathspec *ps,
const char *name, int namelen,
int prefix, char *seen,
@ -500,7 +500,7 @@ static int do_match_pathspec(const struct index_state *istate,
return retval;
}
static int match_pathspec_with_flags(const struct index_state *istate,
static int match_pathspec_with_flags(struct index_state *istate,
const struct pathspec *ps,
const char *name, int namelen,
int prefix, char *seen, unsigned flags)
@ -516,7 +516,7 @@ static int match_pathspec_with_flags(const struct index_state *istate,
return negative ? 0 : positive;
}
int match_pathspec(const struct index_state *istate,
int match_pathspec(struct index_state *istate,
const struct pathspec *ps,
const char *name, int namelen,
int prefix, char *seen, int is_dir)
@ -529,7 +529,7 @@ int match_pathspec(const struct index_state *istate,
/**
* Check if a submodule is a superset of the pathspec
*/
int submodule_path_match(const struct index_state *istate,
int submodule_path_match(struct index_state *istate,
const struct pathspec *ps,
const char *submodule_name,
char *seen)
@ -892,7 +892,7 @@ void add_pattern(const char *string, const char *base,
add_pattern_to_hashsets(pl, pattern);
}
static int read_skip_worktree_file_from_index(const struct index_state *istate,
static int read_skip_worktree_file_from_index(struct index_state *istate,
const char *path,
size_t *size_out, char **data_out,
struct oid_stat *oid_stat)
@ -3542,6 +3542,8 @@ static void connect_wt_gitdir_in_nested(const char *sub_worktree,
if (repo_read_index(&subrepo) < 0)
die(_("index file corrupt in repo %s"), subrepo.gitdir);
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(subrepo.index);
for (i = 0; i < subrepo.index->cache_nr; i++) {
const struct cache_entry *ce = subrepo.index->cache[i];

8
dir.h

@ -354,7 +354,7 @@ int count_slashes(const char *s);
int simple_length(const char *match);
int no_wildcard(const char *string);
char *common_prefix(const struct pathspec *pathspec);
int match_pathspec(const struct index_state *istate,
int match_pathspec(struct index_state *istate,
const struct pathspec *pathspec,
const char *name, int namelen,
int prefix, char *seen, int is_dir);
@ -493,12 +493,12 @@ int git_fnmatch(const struct pathspec_item *item,
const char *pattern, const char *string,
int prefix);
int submodule_path_match(const struct index_state *istate,
int submodule_path_match(struct index_state *istate,
const struct pathspec *ps,
const char *submodule_name,
char *seen);
static inline int ce_path_match(const struct index_state *istate,
static inline int ce_path_match(struct index_state *istate,
const struct cache_entry *ce,
const struct pathspec *pathspec,
char *seen)
@ -507,7 +507,7 @@ static inline int ce_path_match(const struct index_state *istate,
S_ISDIR(ce->ce_mode) || S_ISGITLINK(ce->ce_mode));
}
static inline int dir_path_match(const struct index_state *istate,
static inline int dir_path_match(struct index_state *istate,
const struct dir_entry *ent,
const struct pathspec *pathspec,
int prefix, char *seen)

@ -423,6 +423,8 @@ static void mark_colliding_entries(const struct checkout *state,
ce->ce_flags |= CE_MATCHED;
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(state->istate);
for (i = 0; i < state->istate->cache_nr; i++) {
struct cache_entry *dup = state->istate->cache[i];

@ -2564,7 +2564,7 @@ static int blob_unchanged(struct merge_options *opt,
struct strbuf basebuf = STRBUF_INIT;
struct strbuf sidebuf = STRBUF_INIT;
int ret = 0; /* assume changed for safety */
const struct index_state *idx = &opt->priv->attr_index;
struct index_state *idx = &opt->priv->attr_index;
if (!idx->initialized)
initialize_attr_index(opt);

@ -522,6 +522,8 @@ static struct string_list *get_unmerged(struct index_state *istate)
unmerged->strdup_strings = 1;
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(istate);
for (i = 0; i < istate->cache_nr; i++) {
struct string_list_item *item;
struct stage_data *e;
@ -3014,7 +3016,7 @@ static int blob_unchanged(struct merge_options *opt,
struct strbuf obuf = STRBUF_INIT;
struct strbuf abuf = STRBUF_INIT;
int ret = 0; /* assume changed for safety */
const struct index_state *idx = opt->repo->index;
struct index_state *idx = opt->repo->index;
if (a->mode != o->mode)
return 0;

@ -8,6 +8,7 @@
#include "cache.h"
#include "thread-utils.h"
#include "trace2.h"
#include "sparse-index.h"
struct dir_entry {
struct hashmap_entry ent;
@ -109,8 +110,11 @@ static void hash_index_entry(struct index_state *istate, struct cache_entry *ce)
if (ce->ce_flags & CE_HASHED)
return;
ce->ce_flags |= CE_HASHED;
hashmap_entry_init(&ce->ent, memihash(ce->name, ce_namelen(ce)));
hashmap_add(&istate->name_hash, &ce->ent);
if (!S_ISSPARSEDIR(ce->ce_mode)) {
hashmap_entry_init(&ce->ent, memihash(ce->name, ce_namelen(ce)));
hashmap_add(&istate->name_hash, &ce->ent);
}
if (ignore_case)
add_dir_entry(istate, ce);
@ -680,6 +684,7 @@ int index_dir_exists(struct index_state *istate, const char *name, int namelen)
struct dir_entry *dir;
lazy_init_name_hash(istate);
expand_to_path(istate, name, namelen, 0);
dir = find_dir_entry(istate, name, namelen);
return dir && dir->nr;
}
@ -690,6 +695,7 @@ void adjust_dirname_case(struct index_state *istate, char *name)
const char *ptr = startPtr;
lazy_init_name_hash(istate);
expand_to_path(istate, name, strlen(name), 0);
while (*ptr) {
while (*ptr && *ptr != '/')
ptr++;
@ -713,6 +719,7 @@ struct cache_entry *index_file_exists(struct index_state *istate, const char *na
unsigned int hash = memihash(name, namelen);
lazy_init_name_hash(istate);
expand_to_path(istate, name, namelen, icase);
ce = hashmap_get_entry_from_hash(&istate->name_hash, hash, NULL,
struct cache_entry, ent);

@ -20,7 +20,7 @@
* to use find_pathspecs_matching_against_index() instead.
*/
void add_pathspec_matches_against_index(const struct pathspec *pathspec,
const struct index_state *istate,
struct index_state *istate,
char *seen)
{
int num_unmatched = 0, i;
@ -36,6 +36,8 @@ void add_pathspec_matches_against_index(const struct pathspec *pathspec,
num_unmatched++;
if (!num_unmatched)
return;
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(istate);
for (i = 0; i < istate->cache_nr; i++) {
const struct cache_entry *ce = istate->cache[i];
ce_path_match(istate, ce, pathspec, seen);
@ -51,7 +53,7 @@ void add_pathspec_matches_against_index(const struct pathspec *pathspec,
* given pathspecs achieves against all items in the index.
*/
char *find_pathspecs_matching_against_index(const struct pathspec *pathspec,
const struct index_state *istate)
struct index_state *istate)
{
char *seen = xcalloc(pathspec->nr, 1);
add_pathspec_matches_against_index(pathspec, istate, seen);
@ -702,7 +704,7 @@ void clear_pathspec(struct pathspec *pathspec)
pathspec->nr = 0;
}
int match_pathspec_attrs(const struct index_state *istate,
int match_pathspec_attrs(struct index_state *istate,
const char *name, int namelen,
const struct pathspec_item *item)
{

@ -150,11 +150,11 @@ static inline int ps_strcmp(const struct pathspec_item *item,
}
void add_pathspec_matches_against_index(const struct pathspec *pathspec,
const struct index_state *istate,
struct index_state *istate,
char *seen);
char *find_pathspecs_matching_against_index(const struct pathspec *pathspec,
const struct index_state *istate);
int match_pathspec_attrs(const struct index_state *istate,
struct index_state *istate);
int match_pathspec_attrs(struct index_state *istate,
const char *name, int namelen,
const struct pathspec_item *item);

@ -25,6 +25,7 @@
#include "fsmonitor.h"
#include "thread-utils.h"
#include "progress.h"
#include "sparse-index.h"
/* Mask for the name length in ce_flags in the on-disk index */
@ -47,6 +48,7 @@
#define CACHE_EXT_FSMONITOR 0x46534D4E /* "FSMN" */
#define CACHE_EXT_ENDOFINDEXENTRIES 0x454F4945 /* "EOIE" */
#define CACHE_EXT_INDEXENTRYOFFSETTABLE 0x49454F54 /* "IEOT" */
#define CACHE_EXT_SPARSE_DIRECTORIES 0x73646972 /* "sdir" */
/* changes that can be kept in $GIT_DIR/index (basically all extensions) */
#define EXTMASK (RESOLVE_UNDO_CHANGED | CACHE_TREE_CHANGED | \
@ -101,6 +103,9 @@ static const char *alternate_index_output;
static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce)
{
if (S_ISSPARSEDIR(ce->ce_mode))
istate->sparse_index = 1;
istate->cache[nr] = ce;
add_name_hash(istate, ce);
}
@ -544,7 +549,7 @@ int cache_name_stage_compare(const char *name1, int len1, int stage1, const char
return 0;
}
static int index_name_stage_pos(const struct index_state *istate, const char *name, int namelen, int stage)
static int index_name_stage_pos(struct index_state *istate, const char *name, int namelen, int stage)
{
int first, last;
@ -562,10 +567,31 @@ static int index_name_stage_pos(const struct index_state *istate, const char *na
}
first = next+1;
}
if (istate->sparse_index &&
first > 0) {
/* Note: first <= istate->cache_nr */
struct cache_entry *ce = istate->cache[first - 1];
/*
* If we are in a sparse-index _and_ the entry before the
* insertion position is a sparse-directory entry that is
* an ancestor of 'name', then we need to expand the index
* and search again. This will only trigger once, because
* thereafter the index is fully expanded.
*/
if (S_ISSPARSEDIR(ce->ce_mode) &&
ce_namelen(ce) < namelen &&
!strncmp(name, ce->name, ce_namelen(ce))) {
ensure_full_index(istate);
return index_name_stage_pos(istate, name, namelen, stage);
}
}
return -first-1;
}
int index_name_pos(const struct index_state *istate, const char *name, int namelen)
int index_name_pos(struct index_state *istate, const char *name, int namelen)
{
return index_name_stage_pos(istate, name, namelen, 0);
}
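
Review note on the hunk above: the new fallback in index_name_stage_pos() fires only when the entry just before the insert position is a sparse-directory entry whose name is a proper prefix of the name being searched for. The following is a minimal standalone sketch of that test and of the negative return convention (-first-1). The names is_ancestor_dir and insert_position are hypothetical, chosen for illustration, and do not exist in Git.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): returns 1 when the
 * directory name 'dir' (which ends in '/') is a proper prefix of 'name'.
 * This mirrors the ce_namelen(ce) < namelen && !strncmp(...) test.
 */
static int is_ancestor_dir(const char *dir, size_t dirlen,
			   const char *name, size_t namelen)
{
	assert(dir && name);
	return dirlen < namelen && !strncmp(name, dir, dirlen);
}

/*
 * Hypothetical helper: decode a negative "not found" result of the
 * form -first-1 back into the insert position 'first'.
 */
static int insert_position(int pos)
{
	assert(pos < 0);
	return -pos - 1;
}
```

Because sparse-directory names carry a trailing '/', a plain prefix comparison is enough here: "a/b/" is treated as an ancestor of "a/b/c.txt" but not of "a/bc.txt".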
@ -999,8 +1025,14 @@ inside:
c = *path++;
if ((c == '.' && !verify_dotfile(path, mode)) ||
is_dir_sep(c) || c == '\0')
is_dir_sep(c))
return 0;
/*
* allow terminating directory separators for
* sparse directory entries.
*/
if (c == '\0')
return S_ISDIR(mode);
} else if (c == '\\' && protect_ntfs) {
if (is_ntfs_dotgit(path))
return 0;
@ -1545,6 +1577,8 @@ int refresh_index(struct index_state *istate, unsigned int flags,
*/
preload_index(istate, pathspec, 0);
trace2_region_enter("index", "refresh", NULL);
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(istate);
for (i = 0; i < istate->cache_nr; i++) {
struct cache_entry *ce, *new_entry;
int cache_errno = 0;
@ -1760,6 +1794,10 @@ static int read_index_extension(struct index_state *istate,
case CACHE_EXT_INDEXENTRYOFFSETTABLE:
/* already handled in do_read_index() */
break;
case CACHE_EXT_SPARSE_DIRECTORIES:
/* no content, only an indicator */
istate->sparse_index = 1;
break;
default:
if (*ext < 'A' || 'Z' < *ext)
return error(_("index uses %.4s extension, which we do not understand"),
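
Review note on the hunk above: CACHE_EXT_SPARSE_DIRECTORIES is defined as 0x73646972, which spells "sdir" in lowercase. The default: branch shown here rejects any unknown extension whose first byte is outside 'A'..'Z', so a reader that does not know "sdir" would refuse the index instead of silently ignoring the marker. The sketch below decodes a signature and applies the same range test. Both function names are hypothetical, and the sketch assumes the signature is held as a host integer whose most significant byte is the first character.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: spell a 32-bit extension signature as 4 chars. */
static void ext_signature_name(uint32_t sig, char out[5])
{
	assert(out);
	out[0] = (char)((sig >> 24) & 0xff);
	out[1] = (char)((sig >> 16) & 0xff);
	out[2] = (char)((sig >> 8) & 0xff);
	out[3] = (char)(sig & 0xff);
	out[4] = '\0';
}

/*
 * Hypothetical helper: same range test as the default: branch, i.e.
 * an unknown extension is ignorable only if it starts with 'A'..'Z'.
 */
static int ext_is_optional(uint32_t sig)
{
	unsigned char first = (unsigned char)((sig >> 24) & 0xff);

	return 'A' <= first && first <= 'Z';
}
```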
@ -2273,6 +2311,12 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
trace2_data_intmax("index", the_repository, "read/cache_nr",
istate->cache_nr);
if (!istate->repo)
istate->repo = the_repository;
prepare_repo_settings(istate->repo);
if (istate->repo->settings.command_requires_full_index)
ensure_full_index(istate);
return istate->cache_nr;
unmap:
@ -2457,6 +2501,8 @@ int repo_index_has_changes(struct repository *repo,
diff_flush(&opt);
return opt.flags.has_changes != 0;
} else {
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(istate);
for (i = 0; sb && i < istate->cache_nr; i++) {
if (i)
strbuf_addch(sb, ' ');
@ -3012,6 +3058,10 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
if (err)
return -1;
}
if (istate->sparse_index) {
if (write_index_ext_header(&c, &eoie_c, newfd, CACHE_EXT_SPARSE_DIRECTORIES, 0) < 0)
return -1;
}
/*
* CACHE_EXT_ENDOFINDEXENTRIES must be written as the last entry before the SHA1
@ -3071,6 +3121,14 @@ static int do_write_locked_index(struct index_state *istate, struct lock_file *l
unsigned flags)
{
int ret;
int was_full = !istate->sparse_index;
ret = convert_to_sparse(istate);
if (ret) {
warning(_("failed to convert to a sparse-index"));
return ret;
}
/*
* TODO trace2: replace "the_repository" with the actual repo instance
@ -3082,6 +3140,9 @@ static int do_write_locked_index(struct index_state *istate, struct lock_file *l
trace2_region_leave_printf("index", "do_write_index", the_repository,
"%s", get_lock_file_path(lock));
if (was_full)
ensure_full_index(istate);
if (ret)
return ret;
if (flags & COMMIT_LOCK)
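
Review note on the two hunks above: do_write_locked_index() records whether the in-memory index was full, converts it to sparse form for writing, and expands it again afterwards so that callers holding a full index keep seeing one. The sketch below shows that save, convert, write, restore pattern with a toy structure. Every toy_* name is hypothetical, and unlike the real convert_to_sparse() the toy conversion always succeeds and always collapses.

```c
#include <assert.h>

/* Toy stand-in for struct index_state; only the flags matter here. */
struct toy_index {
	int sparse;        /* 1 while the index is in sparse form */
	int wrote_sparse;  /* form that the last write observed */
};

static void toy_convert_to_sparse(struct toy_index *idx)
{
	idx->sparse = 1;
}

static void toy_ensure_full(struct toy_index *idx)
{
	idx->sparse = 0;
}

static int toy_write(struct toy_index *idx)
{
	idx->wrote_sparse = idx->sparse;
	return 0;
}

/* Write in sparse form, then restore the caller's in-memory form. */
static int toy_write_locked(struct toy_index *idx)
{
	int ret, was_full;

	assert(idx);
	was_full = !idx->sparse;
	toy_convert_to_sparse(idx);
	ret = toy_write(idx);
	if (was_full)
		toy_ensure_full(idx);
	return ret;
}
```

An index that was already sparse before the write stays sparse; only a previously full index is re-expanded.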
@ -3172,9 +3233,10 @@ static int write_shared_index(struct index_state *istate,
struct tempfile **temp)
{
struct split_index *si = istate->split_index;
int ret;
int ret, was_full = !istate->sparse_index;
move_cache_to_base_index(istate);
convert_to_sparse(istate);
trace2_region_enter_printf("index", "shared/do_write_index",
the_repository, "%s", get_tempfile_path(*temp));
@ -3182,6 +3244,9 @@ static int write_shared_index(struct index_state *istate,
trace2_region_leave_printf("index", "shared/do_write_index",
the_repository, "%s", get_tempfile_path(*temp));
if (was_full)
ensure_full_index(istate);
if (ret)
return ret;
ret = adjust_shared_perm(get_tempfile_path(*temp));
@ -3350,8 +3415,8 @@ int repo_read_index_unmerged(struct repository *repo)
* We helpfully remove a trailing "/" from directories so that
* the output of read_directory can be used as-is.
*/
int index_name_is_other(const struct index_state *istate, const char *name,
int namelen)
int index_name_is_other(struct index_state *istate, const char *name,
int namelen)
{
int pos;
if (namelen && name[namelen - 1] == '/')
@ -3369,7 +3434,7 @@ int index_name_is_other(const struct index_state *istate, const char *name,
return 1;
}
void *read_blob_data_from_index(const struct index_state *istate,
void *read_blob_data_from_index(struct index_state *istate,
const char *path, unsigned long *size)
{
int pos, len;

@ -77,4 +77,19 @@ void prepare_repo_settings(struct repository *r)
UPDATE_DEFAULT_BOOL(r->settings.core_untracked_cache, UNTRACKED_CACHE_KEEP);
UPDATE_DEFAULT_BOOL(r->settings.fetch_negotiation_algorithm, FETCH_NEGOTIATION_DEFAULT);
/*
* This setting guards all index reads to require a full index
* over a sparse index. After suitable guards are placed in the
* codebase around uses of the index, this setting will be
* removed.
*/
r->settings.command_requires_full_index = 1;
/*
* Initialize this as off.
*/
r->settings.sparse_index = 0;
if (!repo_config_get_bool(r, "index.sparse", &value) && value)
r->settings.sparse_index = 1;
}

@ -10,6 +10,7 @@
#include "object.h"
#include "lockfile.h"
#include "submodule-config.h"
#include "sparse-index.h"
/* The main repository */
static struct repository the_repo;
@ -261,6 +262,8 @@ void repo_clear(struct repository *repo)
int repo_read_index(struct repository *repo)
{
int res;
if (!repo->index)
CALLOC_ARRAY(repo->index, 1);
@ -270,7 +273,13 @@ int repo_read_index(struct repository *repo)
else if (repo->index->repo != repo)
BUG("repo's index should point back at itself");
return read_index_from(repo->index, repo->index_file, repo->gitdir);
res = read_index_from(repo->index, repo->index_file, repo->gitdir);
prepare_repo_settings(repo);
if (repo->settings.command_requires_full_index)
ensure_full_index(repo->index);
return res;
}
int repo_hold_locked_index(struct repository *repo,

@ -41,6 +41,9 @@ struct repo_settings {
enum fetch_negotiation_setting fetch_negotiation_algorithm;
int core_multi_pack_index;
unsigned command_requires_full_index:1,
sparse_index:1;
};
struct repository {

@ -172,6 +172,8 @@ void unmerge_marked_index(struct index_state *istate)
if (!istate->resolve_undo)
return;
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(istate);
for (i = 0; i < istate->cache_nr; i++) {
const struct cache_entry *ce = istate->cache[i];
if (ce->ce_flags & CE_MATCHED)
@ -186,6 +188,8 @@ void unmerge_index(struct index_state *istate, const struct pathspec *pathspec)
if (!istate->resolve_undo)
return;
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(istate);
for (i = 0; i < istate->cache_nr; i++) {
const struct cache_entry *ce = istate->cache[i];
if (!ce_path_match(istate, ce, pathspec, NULL))

@ -1680,6 +1680,8 @@ static void do_add_index_objects_to_pending(struct rev_info *revs,
{
int i;
/* TODO: audit for interaction with sparse-index. */
ensure_full_index(istate);
for (i = 0; i < istate->cache_nr; i++) {
struct cache_entry *ce = istate->cache[i];
struct blob *blob;

358
sparse-index.c Normal file

@ -0,0 +1,358 @@
#include "cache.h"
#include "repository.h"
#include "sparse-index.h"
#include "tree.h"
#include "pathspec.h"
#include "trace2.h"
#include "cache-tree.h"
#include "config.h"
#include "dir.h"
#include "fsmonitor.h"
static struct cache_entry *construct_sparse_dir_entry(
struct index_state *istate,
const char *sparse_dir,
struct cache_tree *tree)
{
struct cache_entry *de;
de = make_cache_entry(istate, S_IFDIR, &tree->oid, sparse_dir, 0, 0);
de->ce_flags |= CE_SKIP_WORKTREE;
return de;
}
/*
* Returns the number of entries "inserted" into the index.
*/
static int convert_to_sparse_rec(struct index_state *istate,
int num_converted,
int start, int end,
const char *ct_path, size_t ct_pathlen,
struct cache_tree *ct)
{
int i, can_convert = 1;
int start_converted = num_converted;
enum pattern_match_result match;
int dtype;
struct strbuf child_path = STRBUF_INIT;
struct pattern_list *pl = istate->sparse_checkout_patterns;
/*
* Is the current path outside of the sparse cone?
* Then check if the region can be replaced by a sparse
* directory entry (everything is sparse and merged).
*/
match = path_matches_pattern_list(ct_path, ct_pathlen,
NULL, &dtype, pl, istate);
if (match != NOT_MATCHED)
can_convert = 0;
for (i = start; can_convert && i < end; i++) {
struct cache_entry *ce = istate->cache[i];
if (ce_stage(ce) ||
S_ISGITLINK(ce->ce_mode) ||
!(ce->ce_flags & CE_SKIP_WORKTREE))
can_convert = 0;
}
if (can_convert) {
struct cache_entry *se;
se = construct_sparse_dir_entry(istate, ct_path, ct);
istate->cache[num_converted++] = se;
return 1;
}
for (i = start; i < end; ) {
int count, span, pos = -1;
const char *base, *slash;
struct cache_entry *ce = istate->cache[i];
/*
* Detect if this is a normal entry outside of any subtree
* entry.
*/
base = ce->name + ct_pathlen;
slash = strchr(base, '/');
if (slash)
pos = cache_tree_subtree_pos(ct, base, slash - base);
if (pos < 0) {
istate->cache[num_converted++] = ce;
i++;
continue;
}
strbuf_setlen(&child_path, 0);
strbuf_add(&child_path, ce->name, slash - ce->name + 1);
span = ct->down[pos]->cache_tree->entry_count;
count = convert_to_sparse_rec(istate,
num_converted, i, i + span,
child_path.buf, child_path.len,
ct->down[pos]->cache_tree);
num_converted += count;
i += span;
}
strbuf_release(&child_path);
return num_converted - start_converted;
}
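
Review note on convert_to_sparse_rec() above: a range of entries is collapsed into one sparse-directory entry only if every entry in it is merged (stage 0), is not a gitlink, and has CE_SKIP_WORKTREE set. The sketch below isolates that per-entry scan with a toy entry type. The names toy_entry and can_collapse are hypothetical, and the sketch deliberately omits the other precondition in the real function, namely that the directory path does not match the sparse-checkout pattern list.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct cache_entry, reduced to the fields tested. */
struct toy_entry {
	const char *name;
	int stage;          /* non-zero means unmerged */
	int is_gitlink;     /* submodule entry */
	int skip_worktree;  /* analogue of CE_SKIP_WORKTREE */
};

/*
 * Returns 1 if every entry in [start, end) is merged, is not a gitlink,
 * and is marked skip-worktree; returns 0 otherwise.
 */
static int can_collapse(const struct toy_entry *entries,
			size_t start, size_t end)
{
	size_t i;

	assert(entries && start <= end);
	for (i = start; i < end; i++) {
		const struct toy_entry *e = &entries[i];

		if (e->stage || e->is_gitlink || !e->skip_worktree)
			return 0;
	}
	return 1;
}
```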
static int set_index_sparse_config(struct repository *repo, int enable)
{
int res;
char *config_path = repo_git_path(repo, "config.worktree");
res = git_config_set_in_file_gently(config_path,
"index.sparse",
enable ? "true" : NULL);
free(config_path);
prepare_repo_settings(repo);
repo->settings.sparse_index = 1;
return res;
}
int set_sparse_index_config(struct repository *repo, int enable)
{
int res = set_index_sparse_config(repo, enable);
prepare_repo_settings(repo);
repo->settings.sparse_index = enable;
return res;
}
int convert_to_sparse(struct index_state *istate)
{
int test_env;
if (istate->split_index || istate->sparse_index ||
!core_apply_sparse_checkout || !core_sparse_checkout_cone)
return 0;
if (!istate->repo)
istate->repo = the_repository;
/*
* The GIT_TEST_SPARSE_INDEX environment variable triggers the
* index.sparse config variable to be on.
*/
test_env = git_env_bool("GIT_TEST_SPARSE_INDEX", -1);
if (test_env >= 0)
set_sparse_index_config(istate->repo, test_env);
/*
* Only convert to sparse if index.sparse is set.
*/
prepare_repo_settings(istate->repo);
if (!istate->repo->settings.sparse_index)
return 0;
if (!istate->sparse_checkout_patterns) {
istate->sparse_checkout_patterns = xcalloc(1, sizeof(struct pattern_list));
if (get_sparse_checkout_patterns(istate->sparse_checkout_patterns) < 0)
return 0;
}
if (!istate->sparse_checkout_patterns->use_cone_patterns) {
warning(_("attempting to use sparse-index without cone mode"));
return -1;
}
if (cache_tree_update(istate, 0)) {
warning(_("unable to update cache-tree, staying full"));
return -1;
}
remove_fsmonitor(istate);
trace2_region_enter("index", "convert_to_sparse", istate->repo);
istate->cache_nr = convert_to_sparse_rec(istate,
0, 0, istate->cache_nr,
"", 0, istate->cache_tree);
/* Clear and recompute the cache-tree */
cache_tree_free(&istate->cache_tree);
cache_tree_update(istate, 0);
istate->sparse_index = 1;
trace2_region_leave("index", "convert_to_sparse", istate->repo);
return 0;
}
static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce)
{
ALLOC_GROW(istate->cache, nr + 1, istate->cache_alloc);
istate->cache[nr] = ce;
add_name_hash(istate, ce);
}
static int add_path_to_index(const struct object_id *oid,
struct strbuf *base, const char *path,
unsigned int mode, void *context)
{
struct index_state *istate = (struct index_state *)context;
struct cache_entry *ce;
size_t len = base->len;
if (S_ISDIR(mode))
return READ_TREE_RECURSIVE;
strbuf_addstr(base, path);
ce = make_cache_entry(istate, mode, oid, base->buf, 0, 0);
ce->ce_flags |= CE_SKIP_WORKTREE;
set_index_entry(istate, istate->cache_nr++, ce);
strbuf_setlen(base, len);
return 0;
}
void ensure_full_index(struct index_state *istate)
{
int i;
struct index_state *full;
struct strbuf base = STRBUF_INIT;
if (!istate || !istate->sparse_index)
return;
if (!istate->repo)
istate->repo = the_repository;
trace2_region_enter("index", "ensure_full_index", istate->repo);
/* initialize basics of new index */
full = xcalloc(1, sizeof(struct index_state));
memcpy(full, istate, sizeof(struct index_state));
/* then change the necessary things */
full->sparse_index = 0;
full->cache_alloc = (3 * istate->cache_alloc) / 2;
full->cache_nr = 0;
ALLOC_ARRAY(full->cache, full->cache_alloc);
for (i = 0; i < istate->cache_nr; i++) {
struct cache_entry *ce = istate->cache[i];
struct tree *tree;
struct pathspec ps;
if (!S_ISSPARSEDIR(ce->ce_mode)) {
set_index_entry(full, full->cache_nr++, ce);
continue;
}
if (!(ce->ce_flags & CE_SKIP_WORKTREE))
warning(_("index entry is a directory, but not sparse (%08x)"),
ce->ce_flags);
/* recursively walk into ce->name */
tree = lookup_tree(istate->repo, &ce->oid);
memset(&ps, 0, sizeof(ps));
ps.recursive = 1;
ps.has_wildcard = 1;
ps.max_depth = -1;
strbuf_setlen(&base, 0);
strbuf_add(&base, ce->name, strlen(ce->name));
read_tree_at(istate->repo, tree, &base, &ps,
add_path_to_index, full);
/* free directory entries. full entries are re-used */
discard_cache_entry(ce);
}
/* Copy back into original index. */
memcpy(&istate->name_hash, &full->name_hash, sizeof(full->name_hash));
istate->sparse_index = 0;
free(istate->cache);
istate->cache = full->cache;
istate->cache_nr = full->cache_nr;
istate->cache_alloc = full->cache_alloc;
strbuf_release(&base);
free(full);
/* Clear and recompute the cache-tree */
cache_tree_free(&istate->cache_tree);
cache_tree_update(istate, 0);
trace2_region_leave("index", "ensure_full_index", istate->repo);
}
/*
* This static global helps avoid infinite recursion between
* expand_to_path() and index_file_exists().
*/
static int in_expand_to_path = 0;
void expand_to_path(struct index_state *istate,
const char *path, size_t pathlen, int icase)
{
struct strbuf path_mutable = STRBUF_INIT;
size_t substr_len;
/* prevent extra recursion */
if (in_expand_to_path)
return;
if (!istate || !istate->sparse_index)
return;
if (!istate->repo)
istate->repo = the_repository;
in_expand_to_path = 1;
/*
* We only need to actually expand a region if the
* following are both true:
*
* 1. 'path' is not already in the index.
* 2. Some parent directory of 'path' is a sparse directory.
*/
if (index_file_exists(istate, path, pathlen, icase))
goto cleanup;
strbuf_add(&path_mutable, path, pathlen);
strbuf_addch(&path_mutable, '/');
/* Check the name hash for all parent directories */
substr_len = 0;
while (substr_len < pathlen) {
char temp;
char *replace = strchr(path_mutable.buf + substr_len, '/');
if (!replace)
break;
/* replace the character _after_ the slash */
replace++;
temp = *replace;
*replace = '\0';
if (index_file_exists(istate, path_mutable.buf,
path_mutable.len, icase)) {
/*
* We found a parent directory in the name-hash
* hashtable, because only sparse directory entries
* have a trailing '/' character. Since "path" wasn't
* in the index, perhaps it exists within this
* sparse-directory. Expand accordingly.
*/
ensure_full_index(istate);
break;
}
*replace = temp;
substr_len = replace - path_mutable.buf;
}
cleanup:
strbuf_release(&path_mutable);
in_expand_to_path = 0;
}
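
Review note on expand_to_path() above: the loop visits each leading directory of the path, always including the trailing '/', because only sparse-directory entries are stored under names that end in a slash. The sketch below shows just the prefix enumeration. The name next_parent_dir_len is hypothetical, and unlike the patch the sketch does not append a '/' to the path first, so the full path itself is never reported as a directory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: starting the search at offset 'from', return the
 * length of the next prefix of 'path' that ends in '/' (slash included),
 * or 0 if there is no further directory separator.
 */
static size_t next_parent_dir_len(const char *path, size_t from)
{
	const char *slash;

	assert(path && from <= strlen(path));
	slash = strchr(path + from, '/');
	return slash ? (size_t)(slash - path) + 1 : 0;
}
```

For "a/b/c", feeding each result back in as the next offset yields lengths 2 ("a/") and 4 ("a/b/"), then 0.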

23
sparse-index.h Normal file

@ -0,0 +1,23 @@
#ifndef SPARSE_INDEX_H__
#define SPARSE_INDEX_H__
struct index_state;
int convert_to_sparse(struct index_state *istate);
/*
* Some places in the codebase expect to search for a specific path.
* This path might be outside of the sparse-checkout definition, in
* which case a sparse-index may not contain an entry for that path.
*
* Given an index and a path, check to see if a leading directory for
* 'path' exists in the index as a sparse directory. In that case,
* expand that sparse directory to a full range of cache entries and
* populate the index accordingly.
*/
void expand_to_path(struct index_state *istate,
const char *path, size_t pathlen, int icase);
struct repository;
int set_sparse_index_config(struct repository *repo, int enable);
#endif


@ -33,7 +33,7 @@ static struct oid_array ref_tips_after_fetch;
* will be disabled because we can't guess what might be configured in
* .gitmodules unless the user resolves the conflict.
*/
int is_gitmodules_unmerged(const struct index_state *istate)
int is_gitmodules_unmerged(struct index_state *istate)
{
int pos = index_name_pos(istate, GITMODULES_FILE, strlen(GITMODULES_FILE));
if (pos < 0) { /* .gitmodules not found or isn't merged */
@ -301,7 +301,7 @@ int is_submodule_populated_gently(const char *path, int *return_error_code)
/*
* Dies if the provided 'prefix' corresponds to an unpopulated submodule
*/
void die_in_unpopulated_submodule(const struct index_state *istate,
void die_in_unpopulated_submodule(struct index_state *istate,
const char *prefix)
{
int i, prefixlen;
@ -331,7 +331,7 @@ void die_in_unpopulated_submodule(const struct index_state *istate,
/*
* Dies if any paths in the provided pathspec descends into a submodule
*/
void die_path_inside_submodule(const struct index_state *istate,
void die_path_inside_submodule(struct index_state *istate,
const struct pathspec *ps)
{
int i, j;


@ -39,7 +39,7 @@ struct submodule_update_strategy {
};
#define SUBMODULE_UPDATE_STRATEGY_INIT {SM_UPDATE_UNSPECIFIED, NULL}
int is_gitmodules_unmerged(const struct index_state *istate);
int is_gitmodules_unmerged(struct index_state *istate);
int is_writing_gitmodules_ok(void);
int is_staging_gitmodules_ok(struct index_state *istate);
int update_path_in_gitmodules(const char *oldpath, const char *newpath);
@ -60,9 +60,9 @@ int is_submodule_active(struct repository *repo, const char *path);
* Otherwise the return error code is the same as of resolve_gitdir_gently.
*/
int is_submodule_populated_gently(const char *path, int *return_error_code);
void die_in_unpopulated_submodule(const struct index_state *istate,
void die_in_unpopulated_submodule(struct index_state *istate,
const char *prefix);
void die_path_inside_submodule(const struct index_state *istate,
void die_path_inside_submodule(struct index_state *istate,
const struct pathspec *ps);
enum submodule_update_type parse_submodule_update_type(const char *value);
int parse_submodule_update_strategy(const char *value,


@ -436,6 +436,9 @@ and "sha256".
GIT_TEST_WRITE_REV_INDEX=<boolean>, when true enables the
'pack.writeReverseIndex' setting.
GIT_TEST_SPARSE_INDEX=<boolean>, when true enables index writes to use the
sparse-index format by default.
Naming Tests
------------


@ -1,36 +1,82 @@
#include "test-tool.h"
#include "cache.h"
#include "config.h"
#include "blob.h"
#include "commit.h"
#include "tree.h"
#include "sparse-index.h"
static void print_cache_entry(struct cache_entry *ce)
{
const char *type;
printf("%06o ", ce->ce_mode & 0177777);
if (S_ISSPARSEDIR(ce->ce_mode))
type = tree_type;
else if (S_ISGITLINK(ce->ce_mode))
type = commit_type;
else
type = blob_type;
printf("%s %s\t%s\n",
type,
oid_to_hex(&ce->oid),
ce->name);
}
static void print_cache(struct index_state *istate)
{
int i;
for (i = 0; i < istate->cache_nr; i++)
print_cache_entry(istate->cache[i]);
}
int cmd__read_cache(int argc, const char **argv)
{
struct repository *r = the_repository;
int i, cnt = 1;
const char *name = NULL;
int table = 0, expand = 0;
if (argc > 1 && skip_prefix(argv[1], "--print-and-refresh=", &name)) {
argc--;
argv++;
initialize_the_repository();
prepare_repo_settings(r);
r->settings.command_requires_full_index = 0;
for (++argv, --argc; *argv && starts_with(*argv, "--"); ++argv, --argc) {
if (skip_prefix(*argv, "--print-and-refresh=", &name))
continue;
if (!strcmp(*argv, "--table"))
table = 1;
else if (!strcmp(*argv, "--expand"))
expand = 1;
}
if (argc == 2)
cnt = strtol(argv[1], NULL, 0);
if (argc == 1)
cnt = strtol(argv[0], NULL, 0);
setup_git_directory();
git_config(git_default_config, NULL);
for (i = 0; i < cnt; i++) {
read_cache();
repo_read_index(r);
if (expand)
ensure_full_index(r->index);
if (name) {
int pos;
refresh_index(&the_index, REFRESH_QUIET,
refresh_index(r->index, REFRESH_QUIET,
NULL, NULL, NULL);
pos = index_name_pos(&the_index, name, strlen(name));
pos = index_name_pos(r->index, name, strlen(name));
if (pos < 0)
die("%s not in index", name);
printf("%s is%s up to date\n", name,
ce_uptodate(the_index.cache[pos]) ? "" : " not");
ce_uptodate(r->index->cache[pos]) ? "" : " not");
write_file(name, "%d\n", i);
}
discard_cache();
if (table)
print_cache(r->index);
discard_index(r->index);
}
return 0;
}
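
The --table output produced by print_cache_entry() pairs each mode with an object type, and the tests in this series grep for lines such as "040000 tree" and "160000 commit". The following is a minimal standalone sketch of that mapping, not the test-tool code itself: the SKETCH_* constants and both functions are hypothetical stand-ins defined locally, with the values assumed from the modes that appear in those tests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed values, defined locally for illustration only. */
#define SKETCH_IFMT      0170000
#define SKETCH_IFDIR     0040000	/* sparse-directory entry */
#define SKETCH_IFGITLINK 0160000	/* submodule (gitlink) entry */

/* Map an index entry mode to the type name shown in the table. */
static const char *sketch_entry_type(unsigned int mode)
{
	if (mode == SKETCH_IFDIR)
		return "tree";
	if ((mode & SKETCH_IFMT) == SKETCH_IFGITLINK)
		return "commit";
	return "blob";
}

/*
 * Format one table line as "<mode> <type> <oid>\t<name>" into 'out'.
 * Returns the value of snprintf(), i.e. the untruncated length.
 */
static int sketch_format_entry(char *out, size_t outsz, unsigned int mode,
			       const char *oid_hex, const char *name)
{
	assert(out && oid_hex && name);
	return snprintf(out, outsz, "%06o %s %s\t%s",
			mode & 0177777, sketch_entry_type(mode),
			oid_hex, name);
}
```

With a placeholder object name, sketch_format_entry(buf, sizeof(buf), 040000, "abc", "folder1/") fills buf with "040000 tree abc" followed by a tab and "folder1/", the shape the 'sparse-index contents' test greps for.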

t/perf/p2000-sparse-operations.sh Executable file

@ -0,0 +1,101 @@
#!/bin/sh
test_description="test performance of Git operations using the index"
. ./perf-lib.sh
test_perf_default_repo
SPARSE_CONE=f2/f4/f1
test_expect_success 'setup repo and indexes' '
git reset --hard HEAD &&
# Remove submodules from the example repo, because our
# duplication of the entire repo creates an unlikely data shape.
if git config --file .gitmodules --get-regexp "submodule.*.path" >modules
then
git rm $(awk "{print \$2}" modules) &&
git commit -m "remove submodules" || return 1
fi &&
echo bogus >a &&
cp a b &&
git add a b &&
git commit -m "level 0" &&
BLOB=$(git rev-parse HEAD:a) &&
OLD_COMMIT=$(git rev-parse HEAD) &&
OLD_TREE=$(git rev-parse HEAD^{tree}) &&
for i in $(test_seq 1 4)
do
cat >in <<-EOF &&
100755 blob $BLOB a
040000 tree $OLD_TREE f1
040000 tree $OLD_TREE f2
040000 tree $OLD_TREE f3
040000 tree $OLD_TREE f4
EOF
NEW_TREE=$(git mktree <in) &&
NEW_COMMIT=$(git commit-tree $NEW_TREE -p $OLD_COMMIT -m "level $i") &&
OLD_TREE=$NEW_TREE &&
OLD_COMMIT=$NEW_COMMIT || return 1
done &&
git sparse-checkout init --cone &&
git branch -f wide $OLD_COMMIT &&
git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-index-v3 &&
(
cd full-index-v3 &&
git sparse-checkout init --cone &&
git sparse-checkout set $SPARSE_CONE &&
git config index.version 3 &&
git update-index --index-version=3
) &&
git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-index-v4 &&
(
cd full-index-v4 &&
git sparse-checkout init --cone &&
git sparse-checkout set $SPARSE_CONE &&
git config index.version 4 &&
git update-index --index-version=4
) &&
git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v3 &&
(
cd sparse-index-v3 &&
git sparse-checkout init --cone --sparse-index &&
git sparse-checkout set $SPARSE_CONE &&
git config index.version 3 &&
git update-index --index-version=3
) &&
git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v4 &&
(
cd sparse-index-v4 &&
git sparse-checkout init --cone --sparse-index &&
git sparse-checkout set $SPARSE_CONE &&
git config index.version 4 &&
git update-index --index-version=4
)
'
test_perf_on_all () {
command="$@"
for repo in full-index-v3 full-index-v4 \
sparse-index-v3 sparse-index-v4
do
test_perf "$command ($repo)" "
(
cd $repo &&
echo >>$SPARSE_CONE/a &&
$command
)
"
done
}
test_perf_on_all git status
test_perf_on_all git add -A
test_perf_on_all git add .
test_perf_on_all git commit -a -m A
test_done


@ -205,6 +205,19 @@ test_expect_success 'sparse-checkout disable' '
check_files repo a deep folder1 folder2
'
test_expect_success 'sparse-index enabled and disabled' '
git -C repo sparse-checkout init --cone --sparse-index &&
test_cmp_config -C repo true index.sparse &&
test-tool -C repo read-cache --table >cache &&
grep " tree " cache &&
git -C repo sparse-checkout disable &&
test-tool -C repo read-cache --table >cache &&
! grep " tree " cache &&
git -C repo config --list >config &&
! grep index.sparse config
'
test_expect_success 'cone mode: init and set' '
git -C repo sparse-checkout init --cone &&
git -C repo config --list >config &&


@ -2,11 +2,15 @@
test_description='compare full workdir to sparse workdir'
GIT_TEST_SPLIT_INDEX=0
GIT_TEST_SPARSE_INDEX=
. ./test-lib.sh
test_expect_success 'setup' '
git init initial-repo &&
(
GIT_TEST_SPARSE_INDEX=0 &&
cd initial-repo &&
echo a >a &&
echo "after deep" >e &&
@ -87,39 +91,102 @@ init_repos () {
cp -r initial-repo sparse-checkout &&
git -C sparse-checkout reset --hard &&
git -C sparse-checkout sparse-checkout init --cone &&
cp -r initial-repo sparse-index &&
git -C sparse-index reset --hard &&
# initialize sparse-checkout definitions
git -C sparse-checkout sparse-checkout set deep
git -C sparse-checkout sparse-checkout init --cone &&
git -C sparse-checkout sparse-checkout set deep &&
git -C sparse-index sparse-checkout init --cone --sparse-index &&
test_cmp_config -C sparse-index true index.sparse &&
git -C sparse-index sparse-checkout set deep
}
run_on_sparse () {
(
cd sparse-checkout &&
$* >../sparse-checkout-out 2>../sparse-checkout-err
"$@" >../sparse-checkout-out 2>../sparse-checkout-err
) &&
(
cd sparse-index &&
"$@" >../sparse-index-out 2>../sparse-index-err
)
}
run_on_all () {
(
cd full-checkout &&
$* >../full-checkout-out 2>../full-checkout-err
"$@" >../full-checkout-out 2>../full-checkout-err
) &&
run_on_sparse $*
run_on_sparse "$@"
}
test_all_match () {
run_on_all $* &&
run_on_all "$@" &&
test_cmp full-checkout-out sparse-checkout-out &&
test_cmp full-checkout-err sparse-checkout-err
test_cmp full-checkout-out sparse-index-out &&
test_cmp full-checkout-err sparse-checkout-err &&
test_cmp full-checkout-err sparse-index-err
}
test_sparse_match () {
run_on_sparse "$@" &&
test_cmp sparse-checkout-out sparse-index-out &&
test_cmp sparse-checkout-err sparse-index-err
}
test_expect_success 'sparse-index contents' '
init_repos &&
test-tool -C sparse-index read-cache --table >cache &&
for dir in folder1 folder2 x
do
TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
grep "040000 tree $TREE $dir/" cache \
|| return 1
done &&
git -C sparse-index sparse-checkout set folder1 &&
test-tool -C sparse-index read-cache --table >cache &&
for dir in deep folder2 x
do
TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
grep "040000 tree $TREE $dir/" cache \
|| return 1
done &&
git -C sparse-index sparse-checkout set deep/deeper1 &&
test-tool -C sparse-index read-cache --table >cache &&
for dir in deep/deeper2 folder1 folder2 x
do
TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
grep "040000 tree $TREE $dir/" cache \
|| return 1
done &&
# Disabling the sparse-index replaces tree entries with full ones
git -C sparse-index sparse-checkout init --no-sparse-index &&
test-tool -C sparse-index read-cache --table >cache &&
! grep "040000 tree" cache &&
test_sparse_match test-tool read-cache --table
'
test_expect_success 'expanded in-memory index matches full index' '
init_repos &&
test_sparse_match test-tool read-cache --expand --table
'
test_expect_success 'status with options' '
init_repos &&
test_sparse_match ls &&
test_all_match git status --porcelain=v2 &&
test_all_match git status --porcelain=v2 -z -u &&
test_all_match git status --porcelain=v2 -uno &&
run_on_all "touch README.md" &&
run_on_all touch README.md &&
test_all_match git status --porcelain=v2 &&
test_all_match git status --porcelain=v2 -z -u &&
test_all_match git status --porcelain=v2 -uno &&
@ -135,7 +202,7 @@ test_expect_success 'add, commit, checkout' '
write_script edit-contents <<-\EOF &&
echo text >>$1
EOF
run_on_all "../edit-contents README.md" &&
run_on_all ../edit-contents README.md &&
test_all_match git add README.md &&
test_all_match git status --porcelain=v2 &&
@ -144,7 +211,7 @@ test_expect_success 'add, commit, checkout' '
test_all_match git checkout HEAD~1 &&
test_all_match git checkout - &&
run_on_all "../edit-contents README.md" &&
run_on_all ../edit-contents README.md &&
test_all_match git add -A &&
test_all_match git status --porcelain=v2 &&
@ -153,7 +220,7 @@ test_expect_success 'add, commit, checkout' '
test_all_match git checkout HEAD~1 &&
test_all_match git checkout - &&
run_on_all "../edit-contents deep/newfile" &&
run_on_all ../edit-contents deep/newfile &&
test_all_match git status --porcelain=v2 -uno &&
test_all_match git status --porcelain=v2 &&
@ -186,7 +253,7 @@ test_expect_success 'diff --staged' '
write_script edit-contents <<-\EOF &&
echo text >>README.md
EOF
run_on_all "../edit-contents" &&
run_on_all ../edit-contents &&
test_all_match git diff &&
test_all_match git diff --staged &&
@ -252,6 +319,17 @@ test_expect_failure 'checkout and reset (mixed)' '
test_all_match git reset update-folder2
'
# Ensure that sparse-index behaves identically to
# sparse-checkout with a full index.
test_expect_success 'checkout and reset (mixed) [sparse]' '
init_repos &&
test_sparse_match git checkout -b reset-test update-deep &&
test_sparse_match git reset deepest &&
test_sparse_match git reset update-folder1 &&
test_sparse_match git reset update-folder2
'
test_expect_success 'merge' '
init_repos &&
@ -280,7 +358,7 @@ test_expect_success 'clean' '
echo bogus >>.gitignore &&
run_on_all cp ../.gitignore . &&
test_all_match git add .gitignore &&
test_all_match git commit -m ignore-bogus-files &&
test_all_match git commit -m "ignore bogus files" &&
run_on_sparse mkdir folder1 &&
run_on_all touch folder1/bogus &&
@ -288,14 +366,51 @@ test_expect_success 'clean' '
test_all_match git status --porcelain=v2 &&
test_all_match git clean -f &&
test_all_match git status --porcelain=v2 &&
test_sparse_match ls &&
test_sparse_match ls folder1 &&
test_all_match git clean -xf &&
test_all_match git status --porcelain=v2 &&
test_sparse_match ls &&
test_sparse_match ls folder1 &&
test_all_match git clean -xdf &&
test_all_match git status --porcelain=v2 &&
test_sparse_match ls &&
test_sparse_match ls folder1 &&
test_path_is_dir sparse-checkout/folder1
test_sparse_match test_path_is_dir folder1
'
test_expect_success 'submodule handling' '
init_repos &&
test_all_match mkdir modules &&
test_all_match touch modules/a &&
test_all_match git add modules &&
test_all_match git commit -m "add modules directory" &&
run_on_all git submodule add "$(pwd)/initial-repo" modules/sub &&
test_all_match git commit -m "add submodule" &&
# having a submodule prevents "modules" from collapsing
test-tool -C sparse-index read-cache --table >cache &&
grep "100644 blob .* modules/a" cache &&
grep "160000 commit $(git -C initial-repo rev-parse HEAD) modules/sub" cache
'
test_expect_success 'sparse-index is expanded and converted back' '
init_repos &&
GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
git -C sparse-index -c core.fsmonitor="" reset --hard &&
test_region index convert_to_sparse trace2.txt &&
test_region index ensure_full_index trace2.txt &&
rm trace2.txt &&
GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
git -C sparse-index -c core.fsmonitor="" status -uno &&
test_region index ensure_full_index trace2.txt
'
test_done


@ -750,9 +750,13 @@ static int index_pos_by_traverse_info(struct name_entry *names,
strbuf_make_traverse_path(&name, info, names->path, names->pathlen);
strbuf_addch(&name, '/');
pos = index_name_pos(o->src_index, name.buf, name.len);
if (pos >= 0)
BUG("This is a directory and should not exist in index");
pos = -pos - 1;
if (pos >= 0) {
if (!o->src_index->sparse_index ||
!(o->src_index->cache[pos]->ce_flags & CE_SKIP_WORKTREE))
BUG("This is a directory and should not exist in index");
} else {
pos = -pos - 1;
}
if (pos >= o->src_index->cache_nr ||
!starts_with(o->src_index->cache[pos]->name, name.buf) ||
(pos > 0 && starts_with(o->src_index->cache[pos-1]->name, name.buf)))
@ -1571,6 +1575,7 @@ static int verify_absent(const struct cache_entry *,
*/
int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options *o)
{
struct repository *repo = the_repository;
int i, ret;
static struct cache_entry *dfc;
struct pattern_list pl;
@ -1582,6 +1587,12 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
trace_performance_enter();
trace2_region_enter("unpack_trees", "unpack_trees", the_repository);
prepare_repo_settings(repo);
if (repo->settings.command_requires_full_index) {
ensure_full_index(o->src_index);
ensure_full_index(o->dst_index);
}
if (!core_apply_sparse_checkout || !o->update)
o->skip_sparse_checkout = 1;
if (!o->skip_sparse_checkout && !o->pl) {