Commit Graph

1021 Commits

Author SHA1 Message Date
Junio C Hamano 1002f28a52 Merge branch 'eb/hash-transition'
Work to support a repository that work with both SHA-1 and SHA-256
hash algorithms has started.

* eb/hash-transition: (30 commits)
  t1016-compatObjectFormat: add tests to verify the conversion between objects
  t1006: test oid compatibility with cat-file
  t1006: rename sha1 to oid
  test-lib: compute the compatibility hash so tests may use it
  builtin/ls-tree: let the oid determine the output algorithm
  object-file: handle compat objects in check_object_signature
  tree-walk: init_tree_desc take an oid to get the hash algorithm
  builtin/cat-file: let the oid determine the output algorithm
  rev-parse: add an --output-object-format parameter
  repository: implement extensions.compatObjectFormat
  object-file: update object_info_extended to reencode objects
  object-file-convert: convert commits that embed signed tags
  object-file-convert: convert commit objects when writing
  object-file-convert: don't leak when converting tag objects
  object-file-convert: convert tag objects when writing
  object-file-convert: add a function to convert trees between algorithms
  object: factor out parse_mode out of fast-import and tree-walk into in object.h
  cache: add a function to read an OID of a specific algorithm
  tag: sign both hashes
  commit: export add_header_signature to support handling signatures on tags
  ...
2024-03-28 14:13:50 -07:00
Junio C Hamano 8e663afb95 Merge branch 'as/option-names-in-messages'
Error message updates.

* as/option-names-in-messages:
  revision.c: trivial fix to message
  builtin/clone.c: trivial fix of message
  builtin/remote.c: trivial fix of error message
  transport-helper.c: trivial fix of error message
2024-03-15 16:05:59 -07:00
Junio C Hamano 7745f92507 Merge branch 'js/merge-base-with-missing-commit'
Make sure failure return from merge_bases_many() is properly caught.

* js/merge-base-with-missing-commit:
  merge-ort/merge-recursive: do report errors in `merge_submodule()`
  merge-recursive: prepare for `merge_submodule()` to report errors
  commit-reach(repo_get_merge_bases_many_dirty): pass on errors
  commit-reach(repo_get_merge_bases_many): pass on "missing commits" errors
  commit-reach(get_octopus_merge_bases): pass on "missing commits" errors
  commit-reach(repo_get_merge_bases): pass on "missing commits" errors
  commit-reach(get_merge_bases_many_0): pass on "missing commits" errors
  commit-reach(merge_bases_many): pass on "missing commits" errors
  commit-reach(paint_down_to_common): start reporting errors
  commit-reach(paint_down_to_common): prepare for handling shallow commits
  commit-reach(repo_in_merge_bases_many): report missing commits
  commit-reach(repo_in_merge_bases_many): optionally expect missing commits
  commit-reach(paint_down_to_common): plug two memory leaks
2024-03-11 14:12:30 -07:00
Junio C Hamano 56d6084560 Merge branch 'jk/upload-pack-bounded-resources'
Various parts of upload-pack has been updated to bound the resource
consumption relative to the size of the repository to protect from
abusive clients.

* jk/upload-pack-bounded-resources:
  upload-pack: free tree buffers after parsing
  upload-pack: use PARSE_OBJECT_SKIP_HASH_CHECK in more places
  upload-pack: always turn off save_commit_buffer
  upload-pack: disallow object-info capability by default
  upload-pack: accept only a single packfile-uri line
  upload-pack: use a strmap for want-ref lines
  upload-pack: use oidset for deepen_not list
  upload-pack: switch deepen-not list to an oid_array
  upload-pack: drop separate v2 "haves" array
2024-03-07 15:59:42 -08:00
Junio C Hamano 6a887bdd92 Merge branch 'ml/log-merge-with-cherry-pick-and-other-pseudo-heads'
"git log --merge" learned to pay attention to CHERRY_PICK_HEAD and
other kinds of *_HEAD pseudorefs.

* ml/log-merge-with-cherry-pick-and-other-pseudo-heads:
  revision: implement `git log --merge` also for rebase/cherry-pick/revert
  revision: ensure MERGE_HEAD is a ref in prepare_show_merge
2024-03-07 15:59:41 -08:00
Junio C Hamano 76d1cd8e5e Merge branch 'cc/rev-list-allow-missing-tips'
"git rev-list --missing=print" has learned to optionally take
"--allow-missing-tips", which allows the objects at the starting
points to be missing.

* cc/rev-list-allow-missing-tips:
  revision: fix --missing=[print|allow*] for annotated tags
  rev-list: allow missing tips with --missing=[print|allow*]
  t6022: fix 'test' style and 'even though' typo
  oidset: refactor oidset_insert_from_set()
  revision: clarify a 'return NULL' in get_reference()
2024-03-07 15:59:40 -08:00
Alexander Shopov 781fb7b4c2 revision.c: trivial fix to message
ancestry-path is an option, not a command - mark it as such.
This brings it in sync with the rest of usages in the file

Signed-off-by: Alexander Shopov <ash@kambanaria.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 14:11:56 -08:00
Johannes Schindelin 76e2a09999 commit-reach(repo_get_merge_bases): pass on "missing commits" errors
The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases()` function (which is also
surfaced via the `repo_get_merge_bases()` macro) is aware of that, too.

Naturally, there are a lot of callers that need to be adjusted now, too.

Next step: adjust the callers of `get_octopus_merge_bases()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-29 08:06:01 -08:00
Jeff King 6cd05e768b upload-pack: free tree buffers after parsing
When a client sends us a "want" or "have" line, we call parse_object()
to get an object struct. If the object is a tree, then the parsed state
means that tree->buffer points to the uncompressed contents of the tree.
But we don't really care about it. We only really need to parse commits
and tags; for trees and blobs, the important output is just a "struct
object" with the correct type.

But much worse, we do not ever free that tree buffer. It's not leaked in
the traditional sense, in that we still have a pointer to it from the
global object hash. But if the client requests many trees, we'll hold
all of their contents in memory at the same time.

Nobody really noticed because it's rare for clients to directly request
a tree. It might happen for a lightweight tag pointing straight at a
tree, or it might happen for a "tree:depth" partial clone filling in
missing trees.

But it's also possible for a malicious client to request a lot of trees,
causing upload-pack's memory to balloon. For example, without this
patch, requesting every tree in git.git like:

  pktline() {
    local msg="$*"
    printf "%04x%s\n" $((1+4+${#msg})) "$msg"
  }

  want_trees() {
    pktline command=fetch
    printf 0001
    git cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype)' |
      while read oid type; do
        test "$type" = "tree" || continue
        pktline want $oid
      done
      pktline done
      printf 0000
  }

  want_trees | GIT_PROTOCOL=version=2 valgrind --tool=massif ./git upload-pack . >/dev/null

shows a peak heap usage of ~3.7GB. Which is just about the sum of the
sizes of all of the uncompressed trees. For linux.git, it's closer to
17GB.

So the obvious thing to do is to call free_tree_buffer() after we
realize that we've parsed a tree. We know that upload-pack won't need it
later. But let's push the logic into parse_object_with_flags(), telling
it to discard the tree buffer immediately. There are two reasons for
this. One, all of the relevant call-sites already call the with_options
variant to pass the SKIP_HASH flag. So it actually ends up as less code
than manually free-ing in each spot. And two, it enables an extra
optimization that I'll discuss below.

I've touched all of the sites that currently use SKIP_HASH in
upload-pack. That drops the peak heap of the upload-pack invocation
above from 3.7GB to ~24MB.

I've also modified the caller in get_reference(); a partial clone
benefits from its use in pack-objects for the reasons given in
0bc2557951 (upload-pack: skip parse-object re-hashing of "want" objects,
2022-09-06), where we were measuring blob requests. But note that the
results of get_reference() are used for traversing, as well; so we
really would _eventually_ use the tree contents. That makes this at
first glance a space/time tradeoff: we won't hold all of the trees in
memory at once, but we'll have to reload them each when it comes time to
traverse.

And here's where our extra optimization comes in. If the caller is not
going to immediately look at the tree contents, and it doesn't care
about checking the hash, then parse_object() can simply skip loading the
tree entirely, just like we do for blobs! And now it's not a space/time
tradeoff in get_reference() anymore. It's just a lazy-load: we're
delaying reading the tree contents until it's time to actually traverse
them one by one.

And of course for upload-pack, this optimization means we never load the
trees at all, saving lots of CPU time. Timing the "every tree from
git.git" request above shows upload-pack dropping from 32 seconds of CPU
to 19 (the remainder is mostly due to pack-objects actually sending the
pack; timing just the upload-pack portion shows we go from 13s to
~0.28s).

These are all highly gamed numbers, of course. For real-world
partial-clone requests we're saving only a small bit of time in
practice. But it does help harden upload-pack against malicious
denial-of-service attacks.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 14:42:01 -08:00
Michael Lohmann f3fc5d9c91 revision: implement `git log --merge` also for rebase/cherry-pick/revert
'git log' learned in ae3e5e1ef2 (git log -p --merge [[--] paths...],
2006-07-03) to show commits touching conflicted files in the range
HEAD...MERGE_HEAD, an addition documented in d249b45547 (Document
rev-list's option --merge, 2006-08-04).

It can be useful to look at the commit history to understand what lead
to merge conflicts also for other mergy operations besides merges, like
cherry-pick, revert and rebase.

For rebases and cherry-picks, an interesting range to look at is
HEAD...{REBASE_HEAD,CHERRY_PICK_HEAD}, since even if all the commits
included in that range are not directly part of the 3-way merge,
conflicts encountered during these operations can indeed be caused by
changes introduced in preceding commits on both sides of the history.

For revert, as we are (most likely) reversing changes from a previous
commit, an appropriate range is REVERT_HEAD..HEAD, which is equivalent
to REVERT_HEAD...HEAD and to HEAD...REVERT_HEAD, if we keep HEAD and its
parents on the left side of the range.

As such, adjust the code in prepare_show_merge so it constructs the
range HEAD...$OTHER for OTHER={MERGE_HEAD, CHERRY_PICK_HEAD, REVERT_HEAD
or REBASE_HEAD}. Note that we try these pseudorefs in order, so keep
REBASE_HEAD last since the three other operations can be performed
during a rebase. Note also that in the uncommon case where $OTHER and
HEAD do not share a common ancestor, this will show the complete
histories of both sides since their root commits, which is the same
behaviour as currently happens in that case for HEAD and MERGE_HEAD.

Adjust the documentation of this option accordingly.

Co-authored-by: Johannes Sixt <j6t@kdbg.org>
Co-authored-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Michael Lohmann <mi.al.lohmann@gmail.com>
[jc: tweaked in j6t's precedence fix that tries REBASE_HEAD last]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 10:04:39 -08:00
Michael Lohmann f476143ee6 revision: ensure MERGE_HEAD is a ref in prepare_show_merge
This is done to

 (1) ensure MERGE_HEAD is a ref,
 (2) obtain the oid without any prefixing by refs.c:repo_dwim_ref()
 (3) error out when MERGE_HEAD is a symref.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Lohmann <mi.al.lohmann@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 10:02:46 -08:00
Christian Couder a4324babe6 revision: fix --missing=[print|allow*] for annotated tags
In 9830926c7d (rev-list: add commit object support in `--missing`
option, 2023-10-27) we fixed the `--missing` option in `git rev-list`
so that it works with missing commits, not just blobs/trees.

Unfortunately, such a command was still failing with a "fatal: bad
object <oid>" if it was passed a missing commit, blob or tree as an
argument (before the rev walking even begins). This was fixed in a
recent commit.

That fix still doesn't work when an argument passed to the command is
an annotated tag pointing to a missing commit though. In that case
`git rev-list --missing=...` still errors out with a "fatal: bad
object <oid>" error where <oid> is the object ID of the missing
commit.

Let's fix this issue, and also, while at it, let's add tests not just
for annotated tags but also for regular tags and branches.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 09:28:18 -08:00
Patrick Steinhardt 31f898397b refs: drop unused params from the reflog iterator callback
The ref and reflog iterators share much of the same underlying code to
iterate over the corresponding entries. This results in some weird code
because the reflog iterator also exposes an object ID as well as a flag
to the callback function. Neither of these fields do refer to the reflog
though -- they refer to the corresponding ref with the same name. This
is quite misleading. In practice at least the object ID cannot really be
implemented in any other way as a reflog does not have a specific object
ID in the first place. This is further stressed by the fact that none of
the callbacks except for our test helper make use of these fields.

Split up the infrastucture so that ref and reflog iterators use separate
callback signatures. This allows us to drop the nonsensical fields from
the reflog iterator.

Note that internally, the backends still use the same shared infra to
iterate over both types. As the backends should never end up being
called directly anyway, this is not much of a problem and thus kept
as-is for simplicity's sake.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-21 09:58:06 -08:00
Christian Couder 7b644c8c5a rev-list: allow missing tips with --missing=[print|allow*]
In 9830926c7d (rev-list: add commit object support in `--missing`
option, 2023-10-27) we fixed the `--missing` option in `git rev-list`
so that it works with with missing commits, not just blobs/trees.

Unfortunately, such a command would still fail with a "fatal: bad
object <oid>" if it is passed a missing commit, blob or tree as an
argument (before the rev walking even begins).

When such a command is used to find the dependencies of some objects,
for example the dependencies of quarantined objects (see the
"QUARANTINE ENVIRONMENT" section in the git-receive-pack(1)
documentation), it would be better if the command would instead
consider such missing objects, especially commits, in the same way as
other missing objects.

If, for example `--missing=print` is used, it would be nice for some
use cases if the missing tips passed as arguments were reported in
the same way as other missing objects instead of the command just
failing.

We could introduce a new option to make it work like this, but most
users are likely to prefer the command to have this behavior as the
default one. Introducing a new option would require another dumb loop
to look for that option early, which isn't nice.

Also we made `git rev-list` work with missing commits very recently
and the command is most often passed commits as arguments. So let's
consider this as a bug fix related to these recent changes.

While at it let's add a NEEDSWORK comment to say that we should get
rid of the existing ugly dumb loops that parse the
`--exclude-promisor-objects` and `--missing=...` options early.

Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-14 09:39:14 -08:00
Christian Couder 3ff56af99b revision: clarify a 'return NULL' in get_reference()
When we know a pointer variable is NULL, it's clearer to
explicitly return NULL than to return that variable.

In get_reference(), when 'object' is NULL, we already return NULL
when 'revs->exclude_promisor_objects && is_promisor_object(oid)' is
true, but we return 'object' when 'revs->ignore_missing' is true.

Let's make the code clearer and more uniform by also explicitly
returning NULL when 'revs->ignore_missing' is true.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-14 09:38:22 -08:00
Junio C Hamano 0f7a10a3aa Merge branch 'en/header-cleanup' into maint-2.43
Remove unused header "#include".

* en/header-cleanup:
  treewide: remove unnecessary includes in source files
  treewide: add direct includes currently only pulled in transitively
  trace2/tr2_tls.h: remove unnecessary include
  submodule-config.h: remove unnecessary include
  pkt-line.h: remove unnecessary include
  line-log.h: remove unnecessary include
  http.h: remove unnecessary include
  fsmonitor--daemon.h: remove unnecessary includes
  blame.h: remove unnecessary includes
  archive.h: remove unnecessary include
  treewide: remove unnecessary includes in source files
  treewide: remove unnecessary includes from header files
2024-02-08 16:22:10 -08:00
Junio C Hamano 6479e121c2 Merge branch 'rs/incompatible-options-messages' into maint-2.43
Clean-up code that handles combinations of incompatible options.

* rs/incompatible-options-messages:
  worktree: simplify incompatibility message for --orphan and commit-ish
  worktree: standardize incompatibility messages
  clean: factorize incompatibility message
  revision, rev-parse: factorize incompatibility messages about - -exclude-hidden
  revision: use die_for_incompatible_opt3() for - -graph/--reverse/--walk-reflogs
  repack: use die_for_incompatible_opt3() for -A/-k/--cruft
  push: use die_for_incompatible_opt4() for - -delete/--tags/--all/--mirror
2024-02-08 16:22:06 -08:00
Junio C Hamano 25e2039cf6 Merge branch 'rs/i18n-cannot-be-used-together' into maint-2.43
Clean-up code that handles combinations of incompatible options.

* rs/i18n-cannot-be-used-together:
  i18n: factorize even more 'incompatible options' messages
2024-02-08 16:22:05 -08:00
Junio C Hamano 4b50f86141 Merge branch 'jc/revision-parse-int' into maint-2.43
The command line parser for the "log" family of commands was too
loose when parsing certain numbers, e.g., silently ignoring the
extra 'q' in "git log -n 1q" without complaining, which has been
tightened up.

* jc/revision-parse-int:
  revision: parse integer arguments to --max-count, --skip, etc., more carefully
2024-02-08 16:22:02 -08:00
Junio C Hamano 492ee03f60 Merge branch 'en/header-cleanup'
Remove unused header "#include".

* en/header-cleanup:
  treewide: remove unnecessary includes in source files
  treewide: add direct includes currently only pulled in transitively
  trace2/tr2_tls.h: remove unnecessary include
  submodule-config.h: remove unnecessary include
  pkt-line.h: remove unnecessary include
  line-log.h: remove unnecessary include
  http.h: remove unnecessary include
  fsmonitor--daemon.h: remove unnecessary includes
  blame.h: remove unnecessary includes
  archive.h: remove unnecessary include
  treewide: remove unnecessary includes in source files
  treewide: remove unnecessary includes from header files
2024-01-08 14:05:15 -08:00
Elijah Newren eea0e59ffb treewide: remove unnecessary includes in source files
Each of these were checked with
   gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).

...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file.  These cases were:
  * builtin/credential-cache.c
  * builtin/pull.c
  * builtin/send-pack.c

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:31 -08:00
Junio C Hamano 3c8f932d35 Merge branch 'rs/incompatible-options-messages'
Clean-up code that handles combinations of incompatible options.

* rs/incompatible-options-messages:
  worktree: simplify incompatibility message for --orphan and commit-ish
  worktree: standardize incompatibility messages
  clean: factorize incompatibility message
  revision, rev-parse: factorize incompatibility messages about - -exclude-hidden
  revision: use die_for_incompatible_opt3() for - -graph/--reverse/--walk-reflogs
  repack: use die_for_incompatible_opt3() for -A/-k/--cruft
  push: use die_for_incompatible_opt4() for - -delete/--tags/--all/--mirror
2023-12-20 10:14:53 -08:00
Junio C Hamano c4bf868bee Merge branch 'jc/revision-parse-int'
The command line parser for the "log" family of commands was too
loose when parsing certain numbers, e.g., silently ignoring the
extra 'q' in "git log -n 1q" without complaining, which has been
tightened up.

* jc/revision-parse-int:
  revision: parse integer arguments to --max-count, --skip, etc., more carefully
2023-12-20 10:14:53 -08:00
Junio C Hamano 71a1e94821 revision: parse integer arguments to --max-count, --skip, etc., more carefully
The "rev-list" and other commands in the "log" family, being the
oldest part of the system, use their own custom argument parsers,
and integer values of some options are parsed with atoi(), which
allows a non-digit after the number (e.g., "1q") to be silently
ignored.  As a natural consequence, an argument that does not begin
with a digit (e.g., "q") silently becomes zero, too.

Switch to use strtol_i() and parse_timestamp() appropriately to
catch bogus input.

Note that one may naïvely expect that --max-count, --skip, etc., to
only take non-negative values, but we must allow them to also take
negative values, as an escape hatch to countermand a limit set by an
earlier option on the command line; the underlying variables are
initialized to (-1) and "--max-count=-1", for example, is a
legitimate way to reinitialize the limit.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:57:31 +09:00
René Scharfe 81fb70f55e revision, rev-parse: factorize incompatibility messages about - -exclude-hidden
Use the standard parameterized message for reporting incompatible
options to report options that are not accepted in combination with
--exclude-hidden.  This reduces the number of strings to translate and
makes the UI a bit more consistent.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 07:41:03 +09:00
René Scharfe fa518aef56 revision: use die_for_incompatible_opt3() for - -graph/--reverse/--walk-reflogs
The revision option --reverse is incompatible with --walk-reflogs and
--graph is incompatible with both --reverse and --walk-reflogs.  So they
are all incompatible with each other.

Use the function for checking three mutually incompatible options,
die_for_incompatible_opt3(), to perform this check in one place and
without repetition.  This is shorter and clearer.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 07:40:44 +09:00
René Scharfe 7854bf4960 i18n: factorize even more 'incompatible options' messages
Continue the work of 12909b6b8a (i18n: turn "options are incompatible"
into "cannot be used together", 2022-01-05) and a699367bb8 (i18n:
factorize more 'incompatible options' messages, 2022-01-31) to use the
same parameterized error message for reporting incompatible command line
options.  This reduces the number of strings to translate and makes the
UI slightly more consistent.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-27 10:01:45 +09:00
Karthik Nayak 9830926c7d rev-list: add commit object support in `--missing` option
The `--missing` object option in rev-list currently works only with
missing blobs/trees. For missing commits the revision walker fails with
a fatal error.

Let's extend the functionality of `--missing` option to also support
commit objects. This is done by adding a `missing_objects` field to
`rev_info`. This field is an `oidset` to which we'll add the missing
commits as we encounter them. The revision walker will now continue the
traversal and call `show_commit()` even for missing commits. In rev-list
we can then check if the commit is a missing commit and call the
existing code for parsing `--missing` objects.

A scenario where this option would be used is to find the boundary
objects between different object directories. Consider a repository with
a main object directory (GIT_OBJECT_DIRECTORY) and one or more alternate
object directories (GIT_ALTERNATE_OBJECT_DIRECTORIES). In such a
repository, using the `--missing=print` option while disabling the
alternate object directory allows us to find the boundary objects
between the main and alternate object directory.

Helped-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 12:07:18 +09:00
Junio C Hamano 2920971a7f Merge branch 'jk/decoration-and-other-leak-fixes'
Leakfix.

* jk/decoration-and-other-leak-fixes:
  daemon: free listen_addr before returning
  revision: clear decoration structs during release_revisions()
  decorate: add clear_decoration() function
2023-10-13 14:18:28 -07:00
Jeff King 8ef8da4842 revision: clear decoration structs during release_revisions()
The point of release_revisions() is to free memory associated with the
rev_info struct, but we have several "struct decoration" members that
are left untouched. Since the previous commit introduced a function to
do that, we can just call it.

We do have to provide some specialized callbacks to map the void
pointers onto real ones (the alternative would be casting the existing
function pointers; this generally works because "void *" is usually
interchangeable with a struct pointer, but it is technically forbidden
by the standard).

Since the line-log code does not expose the type it stores in the
decoration (nor of course the function to free it), I put this behind a
generic line_log_free() entry point. It's possible we may need to add
more line-log specific bits anyway (running t4211 shows a number of
other leaks in the line-log code).

While this doubtless cleans up many leaks triggered by the test suite,
the only script which becomes leak-free is t4217, as it does very little
beyond a simple traversal (its existing leak was from the use of
--children, which is now fixed).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-05 14:54:57 -07:00
Junio C Hamano 3029189186 Merge branch 'ps/revision-cmdline-stdin-not'
"git rev-list --stdin" learned to take non-revisions (like "--not")
recently from the standard input, but the way such a "--not" was
handled was quite confusing, which has been rethought.  This is
potentially a change that breaks backward compatibility.

* ps/revision-cmdline-stdin-not:
  revision: make pseudo-opt flags read via stdin behave consistently
2023-10-04 13:28:52 -07:00
Eric W. Biederman efed687edc tree-walk: init_tree_desc take an oid to get the hash algorithm
To make it possible for git ls-tree to display the tree encoded
in the hash algorithm of the oid specified to git ls-tree, update
init_tree_desc to take as a parameter the oid of the tree object.

Update all callers of init_tree_desc and init_tree_desc_gently
to pass the oid of the tree object.

Use the oid of the tree object to discover the hash algorithm
of the oid and store that hash algorithm in struct tree_desc.

Use the hash algorithm in decode_tree_entry and
update_tree_entry_internal to handle reading a tree object encoded in
a hash algorithm that differs from the repositories hash algorithm.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:40 -07:00
Patrick Steinhardt f97c8b1e00 revision: make pseudo-opt flags read via stdin behave consistently
When reading revisions from stdin via git-rev-list(1)'s `--stdin` option
then these revisions never honor flags like `--not` which have been
passed on the command line. Thus, an invocation like e.g. `git rev-list
--all --not --stdin` will not treat all revisions read from stdin as
uninteresting. While this behaviour may be surprising to a user, it's
been this way ever since it has been introduced via 42cabc341c (Teach
rev-list an option to read revs from the standard input., 2006-09-05).

With that said, in c40f0b7877 (revision: handle pseudo-opts in `--stdin`
mode, 2023-06-15) we have introduced a new mode to read pseudo opts from
standard input where this behaviour is a lot more confusing. If you pass
`--not` via stdin, it will:

    - Influence subsequent revisions or pseudo-options passed on the
      command line.

    - Influence pseudo-options passed via standard input.

    - _Not_ influence normal revisions passed via standard input.

This behaviour is extremely inconsistent and bound to cause confusion.

While it would be nice to retroactively change the behaviour for how
`--not` and `--stdin` behave together, chances are quite high that this
would break existing scripts that expect the current behaviour that has
been around for many years by now. This is thus not really a viable
option to explore to fix the inconsistency.

Instead, we change the behaviour of how pseudo-opts read via standard
input influence the flags such that the effect is fully localized. With
this change, when reading `--not` via standard input, it will:

    - _Not_ influence subsequent revisions or pseudo-options passed on
      the command line, which is a change in behaviour.

    - Influence pseudo-options passed via standard input.

    - Influence normal revisions passed via standard input, which is a
      change in behaviour.

Thus, all flags read via standard input are fully self-contained to that
standard input, only.

While this is a breaking change as well, the behaviour has only been
recently introduced with Git v2.42.0. Furthermore, the current behaviour
can be regarded as a simple bug. With that in mind it feels like the
right thing to retroactively change it and make the behaviour sane.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reported-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-25 09:59:04 -07:00
Kristoffer Haugsbakk 2e0d30d928 range-diff: treat notes like `log`
Currently, `range-diff` shows the default notes if no notes-related
arguments are given. This is also how `log` behaves. But unlike
`range-diff`, `log` does *not* show the default notes if
`--notes=<custom>` are given. In other words, this:

    git log --notes=custom

is equivalent to this:

    git log --no-notes --notes=custom

While:

    git range-diff --notes=custom

acts like this:

    git log --notes --notes-custom

This can’t be how the user expects `range-diff` to behave given that the
man page for `range-diff` under `--[no-]notes[=<ref>]` says:

> This flag is passed to the `git log` program (see git-log(1)) that
> generates the patches.

This behavior also affects `format-patch` since it uses `range-diff` for
the cover letter. Unlike `log`, though, `format-patch` is not supposed
to show the default notes if no notes-related arguments are given.[1]
But this promise is broken when the range-diff happens to have something
to say about the changes to the default notes, since that will be shown
in the cover letter.

Remedy this by introducing `--show-notes-by-default` that `range-diff` can
use to tell the `log` subprocess what to do.

§ Authors

• Fix by Johannes
• Tests by Kristoffer

† 1: See e.g. 66b2ed09c2 (Fix "log" family not to be too agressive about
    showing notes, 2010-01-20).

Co-authored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-19 14:40:19 -07:00
Junio C Hamano c5fcd34e1b Merge branch 'jk/unused-parameter'
Mark-up unused parameters in the code so that we can eventually
enable -Wunused-parameter by default.

* jk/unused-parameter:
  t/helper: mark unused callback void data parameters
  tag: mark unused parameters in each_tag_name_fn callbacks
  rev-parse: mark unused parameter in for_each_abbrev callback
  replace: mark unused parameter in each_mergetag_fn callback
  replace: mark unused parameter in ref callback
  merge-tree: mark unused parameter in traverse callback
  fsck: mark unused parameters in various fsck callbacks
  revisions: drop unused "opt" parameter in "tweak" callbacks
  count-objects: mark unused parameter in alternates callback
  am: mark unused keep_cr parameters
  http-push: mark unused parameter in xml callback
  http: mark unused parameters in curl callbacks
  do_for_each_ref_helper(): mark unused repository parameter
  test-ref-store: drop unimplemented reflog-expire command
2023-07-25 12:05:24 -07:00
Junio C Hamano 39fe402d67 Merge branch 'tb/refs-exclusion-and-packed-refs'
Enumerating refs in the packed-refs file, while excluding refs that
match certain patterns, has been optimized.

* tb/refs-exclusion-and-packed-refs:
  ls-refs.c: avoid enumerating hidden refs where possible
  upload-pack.c: avoid enumerating hidden refs where possible
  builtin/receive-pack.c: avoid enumerating hidden references
  refs.h: implement `hidden_refs_to_excludes()`
  refs.h: let `for_each_namespaced_ref()` take excluded patterns
  revision.h: store hidden refs in a `strvec`
  refs/packed-backend.c: add trace2 counters for jump list
  refs/packed-backend.c: implement jump lists to avoid excluded pattern(s)
  refs/packed-backend.c: refactor `find_reference_location()`
  refs: plumb `exclude_patterns` argument throughout
  builtin/for-each-ref.c: add `--exclude` option
  ref-filter.c: parameterize match functions over patterns
  ref-filter: add `ref_filter_clear()`
  ref-filter: clear reachable list pointers after freeing
  ref-filter.h: provide `REF_FILTER_INIT`
  refs.c: rename `ref_filter`
2023-07-21 13:47:26 -07:00
Junio C Hamano ce481ac8b3 Merge branch 'cw/compat-util-header-cleanup'
Further shuffling of declarations across header files to streamline
file dependencies.

* cw/compat-util-header-cleanup:
  git-compat-util: move alloc macros to git-compat-util.h
  treewide: remove unnecessary includes for wrapper.h
  kwset: move translation table from ctype
  sane-ctype.h: create header for sane-ctype macros
  git-compat-util: move wrapper.c funcs to its header
  git-compat-util: move strbuf.c funcs to its header
2023-07-17 11:30:42 -07:00
Jeff King cc88afad62 revisions: drop unused "opt" parameter in "tweak" callbacks
The setup_revision_opt struct has a "tweak" function pointer, which can
be used to adjust parameters after setup_revisions() parses arguments,
but before it finalizes setup. In addition to the rev_info struct, the
callback receives a pointer to the setup_revision_opt, as well.

But none of the existing callbacks looks at the extra "opt" parameter,
leading to -Wunused-parameter warnings.

We could mark it as UNUSED, but instead let's remove it entirely. It's
conceivable that it could be useful for a callback to have access to the
"opt" struct. But in the 13 years that this mechanism has existed,
nobody has used it. So let's just drop it in the name of simplifying.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:24:00 -07:00
Taylor Blau c45841fff8 revision.h: store hidden refs in a `strvec`
In subsequent commits, it will be convenient to have a 'const char **'
of hidden refs (matching `transfer.hiderefs`, `uploadpack.hideRefs`,
etc.), instead of a `string_list`.

Convert spots throughout the tree that store the list of hidden refs
from a `string_list` to a `strvec`.

Note that in `parse_hide_refs_config()` there is an ugly const-cast used
to avoid an extra copy of each value before trimming any trailing slash
characters. This could instead be written as:

    ref = xstrdup(value);
    len = strlen(ref);
    while (len && ref[len - 1] == '/')
            ref[--len] = '\0';
    strvec_push(hide_refs, ref);
    free(ref);

but the double-copy (once when calling `xstrdup()`, and another via
`strvec_push()`) is wasteful.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:56 -07:00
Taylor Blau b269ac53c0 refs: plumb `exclude_patterns` argument throughout
The subsequent patch will want to access an optional `excluded_patterns`
array within `refs/packed-backend.c` that will cull out certain
references matching any of the given patterns on a best-effort basis.

To do so, the refs subsystem needs to be updated to pass this value
across a number of different locations.

Prepare for a future patch by introducing this plumbing now, passing
NULLs at top-level APIs in order to make that patch less noisy and more
easily readable.

Signed-off-by: Taylor Blau <me@ttaylorr.co>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:55 -07:00
Junio C Hamano b3d1c85d48 Merge branch 'gc/config-context'
Reduce reliance on a global state in the config reading API.

* gc/config-context:
  config: pass source to config_parser_event_fn_t
  config: add kvi.path, use it to evaluate includes
  config.c: remove config_reader from configsets
  config: pass kvi to die_bad_number()
  trace2: plumb config kvi
  config.c: pass ctx with CLI config
  config: pass ctx with config files
  config.c: pass ctx in configsets
  config: add ctx arg to config_fn_t
  urlmatch.h: use config_fn_t type
  config: inline git_color_default_config
2023-07-06 11:54:48 -07:00
Calvin Wan 91c080dff5 git-compat-util: move alloc macros to git-compat-util.h
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for
dynamic array allocation. Moving these macros to git-compat-util.h with
the other alloc macros focuses alloc.[ch] to allocation for Git objects
and additionally allows us to remove inclusions to alloc.h from files
that solely used the above macros.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:42:31 -07:00
Junio C Hamano 812907d16f Merge branch 'ps/revision-stdin-with-options'
The set-up code for the get_revision() API now allows feeding
options like --all and --not in the --stdin mode.

* ps/revision-stdin-with-options:
  revision: handle pseudo-opts in `--stdin` mode
  revision: small readability improvement for reading from stdin
  revision: reorder `read_revisions_from_stdin()`
2023-07-04 16:08:18 -07:00
Glen Choo a4e7e317f8 config: add ctx arg to config_fn_t
Add a new "const struct config_context *ctx" arg to config_fn_t to hold
additional information about the config iteration operation.
config_context has a "struct key_value_info kvi" member that holds
metadata about the config source being read (e.g. what kind of config
source it is, the filename, etc). In this series, we're only interested
in .kvi, so we could have just used "struct key_value_info" as an arg,
but config_context makes it possible to add/adjust members in the future
without changing the config_fn_t signature. We could also consider other
ways of organizing the args (e.g. moving the config name and value into
config_context or key_value_info), but in my experiments, the
incremental benefit doesn't justify the added complexity (e.g. a
config_fn_t will sometimes invoke another config_fn_t but with a
different config value).

In subsequent commits, the .kvi member will replace the global "struct
config_reader" in config.c, making config iteration a global-free
operation. It requires much more work for the machinery to provide
meaningful values of .kvi, so for now, merely change the signature and
call sites, pass NULL as a placeholder value, and don't rely on the arg
in any meaningful way.

Most of the changes are performed by
contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every
config_fn_t:

- Modifies the signature to accept "const struct config_context *ctx"
- Passes "ctx" to any inner config_fn_t, if needed
- Adds UNUSED attributes to "ctx", if needed

Most config_fn_t instances are easily identified by seeing if they are
called by the various config functions. Most of the remaining ones are
manually named in the .cocci patch. Manual cleanups are still needed,
but the majority of it is trivial; it's either adjusting config_fn_t
that the .cocci patch didn't catch, or adding forward declarations of
"struct config_context ctx" to make the signatures make sense.

The non-trivial changes are in cases where we are invoking a config_fn_t
outside of config machinery, and we now need to decide what value of
"ctx" to pass. These cases are:

- trace2/tr2_cfg.c:tr2_cfg_set_fl()

  This is indirectly called by git_config_set() so that the trace2
  machinery can notice the new config values and update its settings
  using the tr2 config parsing function, i.e. tr2_cfg_cb().

- builtin/checkout.c:checkout_main()

  This calls git_xmerge_config() as a shorthand for parsing a CLI arg.
  This might be worth refactoring away in the future, since
  git_xmerge_config() can call git_default_config(), which can do much
  more than just parsing.

Handle them by creating a KVI_INIT macro that initializes "struct
key_value_info" to a reasonable default, and use that to construct the
"ctx" arg.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:39 -07:00
Elijah Newren a034e9106f object-store-ll.h: split this header out of object-store.h
The vast majority of files including object-store.h did not need dir.h
nor khash.h.  Split the header into two files, and let most just depend
upon object-store-ll.h, while letting the two callers that need it
depend on the full object-store.h.

After this patch:
    $ git grep -h include..object-store | sort | uniq -c
          2 #include "object-store.h"
        129 #include "object-store-ll.h"

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
Elijah Newren dd77d58795 git-compat-util.h: remove unneccessary include of wildmatch.h
The include of wildmatch.h in git-compat-util.h was added in cebcab189a
(Makefile: add USE_WILDMATCH to use wildmatch as fnmatch, 2013-01-01) as
a way to be able to compile-time force any calls to fnmatch() to instead
invoke wildmatch().  The defines and inline function were removed in
70a8fc999d (stop using fnmatch (either native or compat), 2014-02-15),
and this include in git-compat-util.h has been unnecessary ever since.

Remove the include from git-compat-util.h, but add it to the .c files
that had omitted the direct #include they needed.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
Elijah Newren bc5c5ec044 cache.h: remove this no-longer-used header
Since this header showed up in some places besides just #include
statements, update/clean-up/remove those other places as well.

Note that compat/fsmonitor/fsm-path-utils-darwin.c previously got
away with violating the rule that all files must start with an include
of git-compat-util.h (or a short-list of alternate headers that happen
to include it first).  This change exposed the violation and caused it
to stop building correctly; fix it by having it include
git-compat-util.h first, as per policy.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren 08c46a499a read-cache*.h: move declarations for read-cache.c functions from cache.h
For the functions defined in read-cache.c, move their declarations from
cache.h to a new header, read-cache-ll.h.  Also move some related inline
functions from cache.h to read-cache.h.  The purpose of the
read-cache-ll.h/read-cache.h split is that about 70% of the sites don't
need the inline functions and the extra headers they include.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren baf889c2cd sparse-index.h: move declarations for sparse-index.c from cache.h
Note in particular that this reverses the decision made in 118a2e8bde
("cache: move ensure_full_index() to cache.h", 2021-04-01).

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Patrick Steinhardt c40f0b7877 revision: handle pseudo-opts in `--stdin` mode
While both git-rev-list(1) and git-log(1) support `--stdin`, it only
accepts commits and files. Most notably, it is impossible to pass any of
the pseudo-opts like `--all`, `--glob=` or others via stdin.

This makes it hard to use this function in certain scripted scenarios,
like when one wants to support queries against specific revisions, but
also against reference patterns. While this is theoretically possible by
using arguments, this may run into issues once we hit platform limits
with sufficiently large queries. And because `--stdin` cannot handle
pseudo-opts, the only alternative would be to use a mixture of arguments
and standard input, which is cumbersome.

Implement support for handling pseudo-opts in both commands to support
this usecase better. One notable restriction here is that `--stdin` only
supports "stuck" arguments in the form of `--glob=foo`. This is because
"unstuck" arguments would also require us to read the next line, which
would add quite some complexity to the code. This restriction should be
fine for scripted usage though.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-15 12:09:31 -07:00