Merge branch 'ds/bundle-uri-4'

Bundle URIs part 4.

* ds/bundle-uri-4:
  clone: unbundle the advertised bundles
  bundle-uri: download bundles from an advertised list
  bundle-uri: allow relative URLs in bundle lists
  strbuf: introduce strbuf_strip_file_from_path()
  bundle-uri: serve bundle.* keys from config
  bundle-uri client: add helper for testing server
  transport: rename got_remote_heads
  bundle-uri client: add boolean transfer.bundleURI setting
  clone: request the 'bundle-uri' command when available
  t: create test harness for 'bundle-uri' command
  protocol v2: add server-side "bundle-uri" skeleton
Junio C Hamano 2023-01-02 21:37:18 +09:00
commit 0903d8bbde
24 changed files with 1041 additions and 12 deletions

View File

@ -115,3 +115,9 @@ transfer.unpackLimit::
transfer.advertiseSID::
Boolean. When true, client and server processes will advertise their
unique session IDs to their remote counterpart. Defaults to false.
transfer.bundleURI::
When `true`, local `git clone` commands will request bundle
information from the remote server (if advertised) and download
bundles before continuing the clone through the Git protocol.
Defaults to `false`.

View File

@ -578,6 +578,207 @@ and associated requested information, each separated by a single space.
obj-info = obj-id SP obj-size
bundle-uri
~~~~~~~~~~
If the 'bundle-uri' capability is advertised, the server supports the
`bundle-uri` command.
The capability is currently advertised with no value (i.e. not
"bundle-uri=somevalue"); a value may be added in the future for
supporting command-wide extensions. Clients MUST ignore any unknown
capability values and proceed with the `bundle-uri` dialog they
support.
The 'bundle-uri' command is intended to be issued before `fetch` to
get URIs to bundle files (see linkgit:git-bundle[1]) to "seed" and
inform the subsequent `fetch` command.
The client CAN issue `bundle-uri` before or after any other valid
command. To be useful to clients it's expected that it'll be issued
after an `ls-refs` and before `fetch`, but CAN be issued at any time
in the dialog.
DISCUSSION of bundle-uri
^^^^^^^^^^^^^^^^^^^^^^^^
The intent of the feature is to optimize for server resource consumption
in the common case by changing the common case of fetching a very
large PACK during linkgit:git-clone[1] into a smaller incremental
fetch.
It also allows servers to achieve better caching in combination with
an `uploadpack.packObjectsHook` (see linkgit:git-config[1]), by having
new clones or fetches be a more predictable and common negotiation
against the tips of recently produced *.bundle file(s).
Servers might even pre-generate the results of such negotiations for
the `uploadpack.packObjectsHook` as new pushes come in.
One way that servers could take advantage of these bundles is that the
server would anticipate that fresh clones will download a known bundle,
followed by catching up to the current state of the repository using ref
tips found in that bundle (or bundles).
PROTOCOL for bundle-uri
^^^^^^^^^^^^^^^^^^^^^^^
A `bundle-uri` request takes no arguments, and as noted above does not
currently advertise a capability value. Both may be added in the
future.
When the client issues a `command=bundle-uri` request, the response is a
list of key-value pairs provided as packet lines with value
`<key>=<value>`. Each `<key>` should be interpreted as a config key from
the `bundle.*` namespace to construct a list of bundles. These keys are
grouped by a `bundle.<id>.` subsection, where each key corresponding to a
given `<id>` contributes attributes to the bundle defined by that `<id>`.
See linkgit:git-config[1] for the specific details of these keys and how
the Git client will interpret their values.
Clients MUST parse the line according to the above format; lines that do
not conform to the format SHOULD be discarded. The user MAY be warned in
such a case.
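The parsing rule above can be sketched in C as follows. This is a minimal illustration, not Git's own `bundle_uri_parse_line()`: the helper names `split_bundle_uri_line()` and `is_bundle_key()` are hypothetical, and treating an empty key or an empty value as non-conforming is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split one response line of the form
 * "<key>=<value>" in place.
 *
 * Returns 0 and sets *key and *value on success. Returns -1 if the
 * line has no '=', an empty key, or an empty value (an assumption of
 * this sketch); the caller is then expected to discard the line and
 * may warn the user.
 */
int split_bundle_uri_line(char *line, const char **key, const char **value)
{
	char *eq;

	assert(line && key && value);

	eq = strchr(line, '=');
	if (!eq || eq == line || eq[1] == '\0')
		return -1;

	*eq = '\0';      /* terminate the key at the first '=' */
	*key = line;
	*value = eq + 1; /* the value may itself contain '=' */
	return 0;
}

/* Returns 1 if the key belongs to the "bundle." namespace, else 0. */
int is_bundle_key(const char *key)
{
	assert(key);
	return !strncmp(key, "bundle.", 7);
}
```

Splitting at the first `=` keeps URIs with query strings intact, since everything after that first separator is treated as the value.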
bundle-uri CLIENT AND SERVER EXPECTATIONS
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
URI CONTENTS::
The content at the advertised URIs MUST be one of two types.
+
The advertised URI may contain a bundle file that `git bundle verify`
would accept. I.e. they MUST contain one or more reference tips for
use by the client, MUST indicate prerequisites (if any) with standard
"-" prefixes, and MUST indicate their "object-format", if
applicable.
+
The advertised URI may alternatively contain a plaintext file that `git
config --list` would accept (with the `--file` option). The key-value
pairs in this list are in the `bundle.*` namespace (see
linkgit:git-config[1]).
bundle-uri CLIENT ERROR RECOVERY::
A client MUST above all gracefully degrade on errors, whether that
error is because of bad/missing data in the bundle URI(s), because
that client is too dumb to e.g. understand and fully parse out bundle
headers and their prerequisite relationships, or something else.
+
Server operators should feel confident in turning on "bundle-uri" and
not worry if e.g. their CDN goes down that clones or fetches will run
into hard failures. Even if the server bundle(s) are
incomplete, or bad in some way, the client should still end up with a
functioning repository, just as if it had chosen not to use this
protocol extension.
+
All subsequent discussion on client and server interaction MUST keep
this in mind.
bundle-uri SERVER TO CLIENT::
The ordering of the returned bundle URIs is not significant. Clients
MUST parse their headers to discover their contained OIDs and
prerequisites. A client MUST consider the content of the bundle(s)
themselves and their header as the ultimate source of truth.
+
A server MAY even return bundle(s) that don't have any direct
relationship to the repository being cloned (either through accident,
or intentional "clever" configuration), and expect a client to sort
out what data they'd like from the bundle(s), if any.
bundle-uri CLIENT TO SERVER::
The client SHOULD provide reference tips found in the bundle header(s)
as 'have' lines in any subsequent `fetch` request. A client MAY also
ignore the bundle(s) entirely if doing so is deemed worse for some
reason, e.g. if the bundles can't be downloaded, it doesn't like the
tips it finds etc.
WHEN ADVERTISED BUNDLE(S) REQUIRE NO FURTHER NEGOTIATION::
If after issuing `bundle-uri` and `ls-refs`, and getting the header(s)
of the bundle(s) the client finds that the ref tips it wants can be
retrieved entirely from advertised bundle(s), the client MAY disconnect
from the Git server. The results of such a 'clone' or 'fetch' should be
indistinguishable from the state attained without using bundle-uri.
EARLY CLIENT DISCONNECTIONS AND ERROR RECOVERY::
A client MAY perform an early disconnect while still downloading the
bundle(s) (having streamed and parsed their headers). In such a case
the client MUST gracefully recover from any errors related to
finishing the download and validation of the bundle(s).
+
I.e. a client might need to re-connect and issue a 'fetch' command,
and possibly fall back to not making use of 'bundle-uri' at all.
+
This "MAY" behavior is specified as such (and not a "SHOULD") on the
assumption that a server advertising bundle uris is more likely than
not to be serving up a relatively large repository, and to be pointing
to URIs that have a good chance of being in working order. A client
MAY e.g. look at the payload size of the bundles as a heuristic to see
if an early disconnect is worth it, should falling back on a full
"fetch" dialog be necessary.
WHEN ADVERTISED BUNDLE(S) REQUIRE FURTHER NEGOTIATION::
A client SHOULD commence a negotiation of a PACK from the server via
the "fetch" command using the OID tips found in advertised bundles,
even if it's still in the process of downloading those bundle(s).
+
This allows for aggressive early disconnects from any interactive
server dialog. The client blindly trusts that the advertised OID tips
are relevant, and issues them as 'have' lines; it then requests any
tips it would like (usually from the "ls-refs" advertisement) via
'want' lines. The server will then compute a (hopefully small) PACK
with the expected difference between the tips from the bundle(s) and
the data requested.
+
The only connection the client then needs to keep active is to the
concurrently downloading static bundle(s). When those and the
incremental PACK are retrieved, they should be inflated and
validated. Any errors at this point should be gracefully recovered
from, see above.
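As an illustration of the 'have' and 'want' lines mentioned above, the following C sketch formats a single pkt-line: a four-digit hexadecimal length (which counts the four length bytes themselves) followed by the payload. The function names `format_pkt_line()` and `format_oid_line()` are hypothetical and are not part of Git's pkt-line API; the sketch only builds the bytes and does not perform any network I/O.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write one pkt-line for 'payload' into 'out'.
 * Returns the number of bytes written (excluding the NUL), or -1 if
 * the line exceeds the pkt-line maximum of 65520 bytes or does not
 * fit into 'out'.
 */
int format_pkt_line(char *out, size_t outsz, const char *payload)
{
	size_t len;

	assert(out && payload);

	len = strlen(payload) + 4; /* length prefix counts itself */
	if (len > 65520 || len + 1 > outsz)
		return -1;
	return snprintf(out, outsz, "%04zx%s", len, payload);
}

/*
 * Hypothetical helper: format "<verb> <oid>\n" (for example "have" or
 * "want") as a pkt-line. 'oid_hex' is assumed to be a hex object ID.
 */
int format_oid_line(char *out, size_t outsz, const char *verb,
		    const char *oid_hex)
{
	char payload[128];
	int n;

	assert(verb && oid_hex);

	n = snprintf(payload, sizeof(payload), "%s %s\n", verb, oid_hex);
	if (n < 0 || (size_t)n >= sizeof(payload))
		return -1;
	return format_pkt_line(out, outsz, payload);
}
```

A client following the text above would call `format_oid_line()` with "have" for each tip found in the bundle header(s) and with "want" for each tip it selects from the `ls-refs` advertisement.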
bundle-uri PROTOCOL FEATURES
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The client constructs a bundle list from the `<key>=<value>` pairs
provided by the server. These pairs are part of the `bundle.*` namespace
as documented in linkgit:git-config[1]. In this section, we discuss some
of these keys and describe the actions the client will do in response to
this information.
In particular, the `bundle.version` key specifies an integer value. The
only accepted value at the moment is `1`, but if the client sees an
unexpected value here then the client MUST ignore the bundle list.
As long as `bundle.version` is understood, all other unknown keys MAY be
ignored by the client. The server will guarantee compatibility with older
clients, though newer clients may be better able to use the extra keys to
minimize downloads.
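The version rule can be sketched as a small C check. This is an illustration only: the enum and the function `check_bundle_version()` are hypothetical names, and treating a non-integer value the same way as an unexpected integer is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

enum list_action {
	LIST_USE,    /* version understood, keep processing the list */
	LIST_IGNORE  /* unexpected version, ignore the whole list */
};

/*
 * Hypothetical helper: decide what to do with a bundle list given the
 * raw string value of "bundle.version". Only the value 1 is accepted.
 */
enum list_action check_bundle_version(const char *value)
{
	char *end;
	long v;

	assert(value);

	errno = 0;
	v = strtol(value, &end, 10);
	if (errno || end == value || *end != '\0')
		return LIST_IGNORE; /* not a clean integer */
	return v == 1 ? LIST_USE : LIST_IGNORE;
}
```

Unknown keys other than `bundle.version` would simply be skipped by the caller rather than routed through a check like this one.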
Any backwards-incompatible addition of per-URI key-value pairs will be
guarded by a new `bundle.version` value or values in the 'bundle-uri'
capability advertisement itself, and/or by new future `bundle-uri`
request arguments.
Some example key-value pairs that are not currently implemented but could
be implemented in the future include:
* Add a "hash=<val>" or "size=<bytes>" key to advertise the expected hash or
size of the bundle file.
* Advertise that one or more bundle files are the same (to e.g. have
clients round-robin or otherwise choose one of N possible files).
* A "oid=<OID>" shortcut and "prerequisite=<OID>" shortcut. For
expressing the common case of a bundle with one tip and no
prerequisites, or one tip and one prerequisite.
+
This would allow for optimizing the common case of servers who'd like
to provide one "big bundle" containing only their "main" branch,
and/or incremental updates thereof.
+
A client receiving such a response MAY assume that they can skip
retrieving the header from a bundle at the indicated URI, and thus
save themselves and the server(s) the request(s) needed to inspect the
headers of that bundle or bundles.
GIT
---
Part of the linkgit:git[1] suite

View File

@ -1271,6 +1271,27 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
if (refs)
mapped_refs = wanted_peer_refs(refs, &remote->fetch);
if (!bundle_uri) {
/*
* Populate transport->got_remote_bundle_uri and
* transport->bundles. We might get nothing.
*/
transport_get_remote_bundle_uri(transport);
if (transport->bundles &&
hashmap_get_size(&transport->bundles->bundles)) {
/* At this point, we need the_repository to match the cloned repo. */
if (repo_init(the_repository, git_dir, work_tree))
warning(_("failed to initialize the repo, skipping bundle URI"));
else if (fetch_bundle_list(the_repository,
transport->bundles))
warning(_("failed to fetch advertised bundles"));
} else {
clear_bundle_list(transport->bundles);
FREE_AND_NULL(transport->bundles);
}
}
if (mapped_refs) {
int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));

View File

@ -7,6 +7,7 @@
#include "hashmap.h"
#include "pkt-line.h"
#include "config.h"
#include "remote.h"
static int compare_bundles(const void *hashmap_cmp_fn_data,
const struct hashmap_entry *he1,
@ -49,6 +50,7 @@ void clear_bundle_list(struct bundle_list *list)
for_all_bundles_in_list(list, clear_remote_bundle_info, NULL);
hashmap_clear_and_free(&list->bundles, struct remote_bundle_info, ent);
free(list->baseURI);
}
int for_all_bundles_in_list(struct bundle_list *list,
@ -163,7 +165,7 @@ static int bundle_list_update(const char *key, const char *value,
if (!strcmp(subkey, "uri")) {
if (bundle->uri)
return -1;
bundle->uri = xstrdup(value);
bundle->uri = relative_url(list->baseURI, value, NULL);
return 0;
}
@ -190,6 +192,18 @@ int bundle_uri_parse_config_format(const char *uri,
.error_action = CONFIG_ERROR_ERROR,
};
if (!list->baseURI) {
struct strbuf baseURI = STRBUF_INIT;
strbuf_addstr(&baseURI, uri);
/*
* If the URI does not end with a trailing slash, then
* remove the filename portion of the path. This is
* important for relative URIs.
*/
strbuf_strip_file_from_path(&baseURI);
list->baseURI = strbuf_detach(&baseURI, NULL);
}
result = git_config_from_file_with_options(config_to_bundle_list,
filename, list,
&opts);
@ -563,6 +577,77 @@ cleanup:
return result;
}
int fetch_bundle_list(struct repository *r, struct bundle_list *list)
{
int result;
struct bundle_list global_list;
init_bundle_list(&global_list);
/* If a bundle is added to this global list, then it is required. */
global_list.mode = BUNDLE_MODE_ALL;
if ((result = download_bundle_list(r, list, &global_list, 0)))
goto cleanup;
result = unbundle_all_bundles(r, &global_list);
cleanup:
for_all_bundles_in_list(&global_list, unlink_bundle, NULL);
clear_bundle_list(&global_list);
return result;
}
/**
* API for serve.c.
*/
int bundle_uri_advertise(struct repository *r, struct strbuf *value UNUSED)
{
static int advertise_bundle_uri = -1;
if (advertise_bundle_uri != -1)
goto cached;
advertise_bundle_uri = 0;
repo_config_get_maybe_bool(r, "uploadpack.advertisebundleuris", &advertise_bundle_uri);
cached:
return advertise_bundle_uri;
}
static int config_to_packet_line(const char *key, const char *value, void *data)
{
/* 'data' is the struct packet_writer passed in by bundle_uri_command(). */
struct packet_writer *writer = data;
if (!strncmp(key, "bundle.", 7))
packet_write_fmt(writer->dest_fd, "%s=%s", key, value);
return 0;
}
int bundle_uri_command(struct repository *r,
struct packet_reader *request)
{
struct packet_writer writer;
packet_writer_init(&writer, 1);
while (packet_reader_read(request) == PACKET_READ_NORMAL)
die(_("bundle-uri: unexpected argument: '%s'"), request->line);
if (request->status != PACKET_READ_FLUSH)
die(_("bundle-uri: expected flush after arguments"));
/*
* Write all "bundle.*" config lines to the client as key=value
* packet lines.
*/
git_config(config_to_packet_line, &writer);
packet_writer_flush(&writer);
return 0;
}
/**
* General API for {transport,connect}.c etc.
*/

View File

@ -4,6 +4,7 @@
#include "hashmap.h"
#include "strbuf.h"
struct packet_reader;
struct repository;
struct string_list;
@ -60,6 +61,20 @@ struct bundle_list {
int version;
enum bundle_list_mode mode;
struct hashmap bundles;
/**
* The baseURI of a bundle_list is the URI that provided the list.
*
* In the case of the 'bundle-uri' protocol v2 command, the base
* URI is the URI of the Git remote.
*
* Otherwise, the bundle list was downloaded over HTTP from some
* known URI. 'baseURI' is set to that value.
*
* The baseURI is used as the base for any relative URIs
* advertised by the bundle list at that location.
*/
char *baseURI;
};
void init_bundle_list(struct bundle_list *list);
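The role of `baseURI` described in the comment above can be illustrated with a simplified resolver. This is a sketch under stated assumptions and is not Git's `relative_url()`: the name `resolve_relative_uri()` is hypothetical, only `/` is treated as a separator, URIs containing `://` or starting with `/` are passed through unchanged, and an unresolvable `../` is reported with -1 instead of calling die(). It does not reproduce the `<uri>` fixture results in the tests, and it does not guard against `../` climbing past the host part of a URL.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: resolve 'rel' against 'base' into 'out'.
 * Returns 0 on success, -1 if a "../" cannot be applied, if 'base' is
 * empty or too long, or if the result does not fit into 'out'.
 */
int resolve_relative_uri(const char *base, const char *rel,
			 char *out, size_t outsz)
{
	char buf[1024];
	size_t len;
	int n;

	assert(base && rel && out);

	/* Absolute URIs and absolute paths are used as-is. */
	if (strstr(rel, "://") || rel[0] == '/') {
		n = snprintf(out, outsz, "%s", rel);
		return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
	}

	len = strlen(base);
	if (!len || len >= sizeof(buf))
		return -1;
	memcpy(buf, base, len + 1);
	if (buf[len - 1] == '/')
		buf[--len] = '\0'; /* drop one trailing slash */

	for (;;) {
		if (!strncmp(rel, "../", 3)) {
			char *sep = strrchr(buf, '/');
			if (!sep)
				return -1; /* nothing left to strip */
			*sep = '\0';
			rel += 3;
		} else if (!strncmp(rel, "./", 2)) {
			rel += 2;
		} else {
			break;
		}
	}

	n = snprintf(out, outsz, "%s/%s", buf, rel);
	return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

With a base of `https://example.com/bundles/`, a list entry of `bundle.bdl` resolves next to the list, while `../bundle.bdl` resolves one directory level up.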
@ -92,6 +107,26 @@ int bundle_uri_parse_config_format(const char *uri,
*/
int fetch_bundle_uri(struct repository *r, const char *uri);
/**
* Given a bundle list that was already advertised (likely by the
* bundle-uri protocol v2 verb) at the given uri, fetch and unbundle the
* bundles according to the bundle strategy of that list.
*
* It is expected that the given 'list' is initialized, including its
* 'baseURI' value.
*
* Returns non-zero if there was an error trying to download the list
* or any of its advertised bundles.
*/
int fetch_bundle_list(struct repository *r,
struct bundle_list *list);
/**
* API for serve.c.
*/
int bundle_uri_advertise(struct repository *r, struct strbuf *value);
int bundle_uri_command(struct repository *r, struct packet_reader *request);
/**
* General API for {transport,connect}.c etc.
*/

View File

@ -15,6 +15,7 @@
#include "version.h"
#include "protocol.h"
#include "alias.h"
#include "bundle-uri.h"
static char *server_capabilities_v1;
static struct strvec server_capabilities_v2 = STRVEC_INIT;
@ -493,6 +494,49 @@ static void send_capabilities(int fd_out, struct packet_reader *reader)
}
}
int get_remote_bundle_uri(int fd_out, struct packet_reader *reader,
struct bundle_list *bundles, int stateless_rpc)
{
int line_nr = 1;
/* Assert bundle-uri support */
ensure_server_supports_v2("bundle-uri");
/* (Re-)send capabilities */
send_capabilities(fd_out, reader);
/* Send command */
packet_write_fmt(fd_out, "command=bundle-uri\n");
packet_delim(fd_out);
packet_flush(fd_out);
/* Process response from server */
while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
const char *line = reader->line;
line_nr++;
if (!bundle_uri_parse_line(bundles, line))
continue;
return error(_("error on bundle-uri response line %d: %s"),
line_nr, line);
}
if (reader->status != PACKET_READ_FLUSH)
return error(_("expected flush after bundle-uri listing"));
/*
* Might die(), but obscure enough that that's OK, e.g. in
* serve.c we'll call BUG() on its equivalent (the
* PACKET_READ_RESPONSE_END check).
*/
check_stateless_delimiter(stateless_rpc, reader,
_("expected response end packet after ref listing"));
return 0;
}
struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
struct ref **list, int for_push,
struct transport_ls_refs_options *transport_options,

View File

@ -234,6 +234,11 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
const struct string_list *server_options,
int stateless_rpc);
/* Used for protocol v2 in order to retrieve the bundle URI list from a remote */
struct bundle_list;
int get_remote_bundle_uri(int fd_out, struct packet_reader *reader,
struct bundle_list *bundles, int stateless_rpc);
int resolve_remote_symref(struct ref *ref, struct ref *list);
/*

View File

@ -7,6 +7,7 @@
#include "protocol-caps.h"
#include "serve.h"
#include "upload-pack.h"
#include "bundle-uri.h"
static int advertise_sid = -1;
static int client_hash_algo = GIT_HASH_SHA1;
@ -135,6 +136,11 @@ static struct protocol_capability capabilities[] = {
.advertise = always_advertise,
.command = cap_object_info,
},
{
.name = "bundle-uri",
.advertise = bundle_uri_advertise,
.command = bundle_uri_command,
},
};
void protocol_v2_advertise_capabilities(void)

View File

@ -1200,3 +1200,9 @@ int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
free(path2);
return res;
}
void strbuf_strip_file_from_path(struct strbuf *sb)
{
char *path_sep = find_last_dir_sep(sb->buf);
strbuf_setlen(sb, path_sep ? path_sep - sb->buf + 1 : 0);
}
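For readers without the strbuf API at hand, the behavior of `strbuf_strip_file_from_path()` can be approximated on a plain NUL-terminated buffer. This is a stand-alone sketch, not the function from the patch: `strip_file_from_path()` is a hypothetical name, and it treats only `/` as a separator, whereas the real code relies on `find_last_dir_sep()`.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-alone analogue of strbuf_strip_file_from_path():
 * truncate 'path' in place just after its last '/'. A path that
 * already ends in '/' is left unchanged; a path with no '/' at all
 * becomes the empty string.
 */
void strip_file_from_path(char *path)
{
	char *sep;

	assert(path);

	sep = strrchr(path, '/');
	if (sep)
		sep[1] = '\0';  /* keep the separator itself */
	else
		path[0] = '\0'; /* no directory part at all */
}
```

Keeping the trailing separator matters for the bundle list case, because the result is later used as the base against which relative bundle URIs are resolved.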

View File

@ -664,6 +664,17 @@ int launch_sequence_editor(const char *path, struct strbuf *buffer,
int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
const char *const *env);
/*
* Remove the filename from the provided path string. If the path
* contains a trailing separator, then the path is considered a directory
* and nothing is modified.
*
* Examples:
* - "/path/to/file" -> "/path/to/"
* - "/path/to/dir/" -> "/path/to/dir/"
*/
void strbuf_strip_file_from_path(struct strbuf *sb);
void strbuf_add_lines(struct strbuf *sb,
const char *prefix,
const char *buf,

View File

@ -3,6 +3,10 @@
#include "bundle-uri.h"
#include "strbuf.h"
#include "string-list.h"
#include "transport.h"
#include "ref-filter.h"
#include "remote.h"
#include "refs.h"
enum input_mode {
KEY_VALUE_PAIRS,
@ -36,6 +40,8 @@ static int cmd__bundle_uri_parse(int argc, const char **argv, enum input_mode mo
init_bundle_list(&list);
list.baseURI = xstrdup("<uri>");
switch (mode) {
case KEY_VALUE_PAIRS:
if (argc != 1)
@ -68,6 +74,46 @@ usage:
usage_with_options(usage, options);
}
static int cmd_ls_remote(int argc, const char **argv)
{
const char *uploadpack = NULL;
struct string_list server_options = STRING_LIST_INIT_DUP;
const char *dest;
struct remote *remote;
struct transport *transport;
int status = 0;
dest = argc > 1 ? argv[1] : NULL;
remote = remote_get(dest);
if (!remote) {
if (dest)
die(_("bad repository '%s'"), dest);
die(_("no remote configured to get bundle URIs from"));
}
if (!remote->url_nr)
die(_("remote '%s' has no configured URL"), dest);
transport = transport_get(remote, NULL);
if (uploadpack)
transport_set_option(transport, TRANS_OPT_UPLOADPACK, uploadpack);
if (server_options.nr)
transport->server_options = &server_options;
if (transport_get_remote_bundle_uri(transport) < 0) {
error(_("could not get the bundle-uri list"));
status = 1;
goto cleanup;
}
print_bundle_list(stdout, transport->bundles);
cleanup:
if (transport_disconnect(transport))
return 1;
return status;
}
int cmd__bundle_uri(int argc, const char **argv)
{
const char *usage[] = {
@ -88,6 +134,8 @@ int cmd__bundle_uri(int argc, const char **argv)
return cmd__bundle_uri_parse(argc - 1, argv + 1, KEY_VALUE_PAIRS);
if (!strcmp(argv[1], "parse-config"))
return cmd__bundle_uri_parse(argc - 1, argv + 1, CONFIG_FILE);
if (!strcmp(argv[1], "ls-remote"))
return cmd_ls_remote(argc - 1, argv + 1);
error("there is no test-tool bundle-uri tool '%s'", argv[1]);
usage:

View File

@ -0,0 +1,216 @@
# Set up and run tests of the 'bundle-uri' command in protocol v2
#
# The test that includes this script should set BUNDLE_URI_PROTOCOL
# to one of "file", "git", or "http".
BUNDLE_URI_PARENT=
BUNDLE_URI_REPO_URI=
BUNDLE_URI_BUNDLE_URI=
case "$BUNDLE_URI_PROTOCOL" in
file)
BUNDLE_URI_PARENT=file_parent
BUNDLE_URI_REPO_URI="file://$PWD/file_parent"
BUNDLE_URI_BUNDLE_URI="$BUNDLE_URI_REPO_URI/fake.bdl"
test_set_prereq BUNDLE_URI_FILE
;;
git)
. "$TEST_DIRECTORY"/lib-git-daemon.sh
start_git_daemon --export-all --enable=receive-pack
BUNDLE_URI_PARENT="$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent"
BUNDLE_URI_REPO_URI="$GIT_DAEMON_URL/parent"
BUNDLE_URI_BUNDLE_URI="https://example.com/fake.bdl"
test_set_prereq BUNDLE_URI_GIT
;;
http)
. "$TEST_DIRECTORY"/lib-httpd.sh
start_httpd
BUNDLE_URI_PARENT="$HTTPD_DOCUMENT_ROOT_PATH/http_parent"
BUNDLE_URI_REPO_URI="$HTTPD_URL/smart/http_parent"
BUNDLE_URI_BUNDLE_URI="https://example.com/fake.bdl"
test_set_prereq BUNDLE_URI_HTTP
;;
*)
BUG "Need to pass valid BUNDLE_URI_PROTOCOL (was \"$BUNDLE_URI_PROTOCOL\")"
;;
esac
test_expect_success "setup protocol v2 $BUNDLE_URI_PROTOCOL:// tests" '
git init "$BUNDLE_URI_PARENT" &&
test_commit -C "$BUNDLE_URI_PARENT" one &&
git -C "$BUNDLE_URI_PARENT" config uploadpack.advertiseBundleURIs true
'
case "$BUNDLE_URI_PROTOCOL" in
http)
test_expect_success "setup config for $BUNDLE_URI_PROTOCOL:// tests" '
git -C "$BUNDLE_URI_PARENT" config http.receivepack true
'
;;
*)
;;
esac
BUNDLE_URI_BUNDLE_URI_ESCAPED=$(echo "$BUNDLE_URI_BUNDLE_URI" | test_uri_escape)
test_expect_success "connect with $BUNDLE_URI_PROTOCOL:// using protocol v2: no bundle-uri" '
test_when_finished "rm -f log" &&
test_when_finished "git -C \"$BUNDLE_URI_PARENT\" config uploadpack.advertiseBundleURIs true" &&
git -C "$BUNDLE_URI_PARENT" config uploadpack.advertiseBundleURIs false &&
GIT_TRACE_PACKET="$PWD/log" \
git \
-c protocol.version=2 \
ls-remote --symref "$BUNDLE_URI_REPO_URI" \
>actual 2>err &&
# Server responded using protocol v2
grep "< version 2" log &&
! grep bundle-uri log
'
test_expect_success "connect with $BUNDLE_URI_PROTOCOL:// using protocol v2: have bundle-uri" '
test_when_finished "rm -f log" &&
GIT_TRACE_PACKET="$PWD/log" \
git \
-c protocol.version=2 \
ls-remote --symref "$BUNDLE_URI_REPO_URI" \
>actual 2>err &&
# Server responded using protocol v2
grep "< version 2" log &&
# Server advertised bundle-uri capability
grep "< bundle-uri" log
'
test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: request bundle-uris" '
test_when_finished "rm -rf log* cloned*" &&
GIT_TRACE_PACKET="$PWD/log" \
git \
-c transfer.bundleURI=false \
-c protocol.version=2 \
clone "$BUNDLE_URI_REPO_URI" cloned \
>actual 2>err &&
# Server responded using protocol v2
grep "< version 2" log &&
# Server advertised bundle-uri capability
grep "< bundle-uri" log &&
# Client did not issue bundle-uri command
! grep "> command=bundle-uri" log &&
GIT_TRACE_PACKET="$PWD/log" \
git \
-c transfer.bundleURI=true \
-c protocol.version=2 \
clone "$BUNDLE_URI_REPO_URI" cloned2 \
>actual 2>err &&
# Server responded using protocol v2
grep "< version 2" log &&
# Server advertised bundle-uri capability
grep "< bundle-uri" log &&
# Client issued bundle-uri command
grep "> command=bundle-uri" log &&
GIT_TRACE_PACKET="$PWD/log3" \
git \
-c transfer.bundleURI=true \
-c protocol.version=2 \
clone --bundle-uri="$BUNDLE_URI_BUNDLE_URI" \
"$BUNDLE_URI_REPO_URI" cloned3 \
>actual 2>err &&
# Server responded using protocol v2
grep "< version 2" log3 &&
# Server advertised bundle-uri capability
grep "< bundle-uri" log3 &&
# Client did not issue bundle-uri command (--bundle-uri override)
! grep "> command=bundle-uri" log3
'
# The remaining tests will all assume transfer.bundleURI=true
#
# This test can be removed when transfer.bundleURI is enabled by default.
test_expect_success 'enable transfer.bundleURI for remaining tests' '
git config --global transfer.bundleURI true
'
test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol v2" '
test_config -C "$BUNDLE_URI_PARENT" \
bundle.only.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED" &&
# All data about bundle URIs
cat >expect <<-EOF &&
[bundle]
version = 1
mode = all
[bundle "only"]
uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED
EOF
test-tool bundle-uri \
ls-remote \
"$BUNDLE_URI_REPO_URI" \
>actual &&
test_cmp_config_output expect actual
'
test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol v2 and extra data" '
test_config -C "$BUNDLE_URI_PARENT" \
bundle.only.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED" &&
# Extra data should be ignored
test_config -C "$BUNDLE_URI_PARENT" bundle.only.extra bogus &&
# All data about bundle URIs
cat >expect <<-EOF &&
[bundle]
version = 1
mode = all
[bundle "only"]
uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED
EOF
test-tool bundle-uri \
ls-remote \
"$BUNDLE_URI_REPO_URI" \
>actual &&
test_cmp_config_output expect actual
'
test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol v2 with list" '
test_config -C "$BUNDLE_URI_PARENT" \
bundle.bundle1.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED-1.bdl" &&
test_config -C "$BUNDLE_URI_PARENT" \
bundle.bundle2.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED-2.bdl" &&
test_config -C "$BUNDLE_URI_PARENT" \
bundle.bundle3.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED-3.bdl" &&
# All data about bundle URIs
cat >expect <<-EOF &&
[bundle]
version = 1
mode = all
[bundle "bundle1"]
uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED-1.bdl
[bundle "bundle2"]
uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED-2.bdl
[bundle "bundle3"]
uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED-3.bdl
EOF
test-tool bundle-uri \
ls-remote \
"$BUNDLE_URI_REPO_URI" \
>actual &&
test_cmp_config_output expect actual
'

View File

@ -772,6 +772,65 @@ test_expect_success 'reject cloning shallow repository using HTTP' '
git clone --no-reject-shallow $HTTPD_URL/smart/repo.git repo
'
test_expect_success 'auto-discover bundle URI from HTTP clone' '
test_when_finished rm -rf trace.txt repo2 "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" &&
git -C src bundle create "$HTTPD_DOCUMENT_ROOT_PATH/everything.bundle" --all &&
git clone --bare --no-local src "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" &&
git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
uploadpack.advertiseBundleURIs true &&
git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
bundle.version 1 &&
git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
bundle.mode all &&
git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
bundle.everything.uri "$HTTPD_URL/everything.bundle" &&
GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
git -c protocol.version=2 \
-c transfer.bundleURI=true clone \
$HTTPD_URL/smart/repo2.git repo2 &&
cat >pattern <<-EOF &&
"event":"child_start".*"argv":\["git-remote-https","$HTTPD_URL/everything.bundle"\]
EOF
grep -f pattern trace.txt
'
test_expect_success 'auto-discover multiple bundles from HTTP clone' '
test_when_finished rm -rf trace.txt repo3 "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" &&
test_commit -C src new &&
git -C src bundle create "$HTTPD_DOCUMENT_ROOT_PATH/new.bundle" HEAD~1..HEAD &&
git clone --bare --no-local src "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" &&
git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
uploadpack.advertiseBundleURIs true &&
git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
bundle.version 1 &&
git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
bundle.mode all &&
git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
bundle.everything.uri "$HTTPD_URL/everything.bundle" &&
git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
bundle.new.uri "$HTTPD_URL/new.bundle" &&
GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
git -c protocol.version=2 \
-c transfer.bundleURI=true clone \
$HTTPD_URL/smart/repo3.git repo3 &&
# We should fetch _both_ bundles
cat >pattern <<-EOF &&
"event":"child_start".*"argv":\["git-remote-https","$HTTPD_URL/everything.bundle"\]
EOF
grep -f pattern trace.txt &&
cat >pattern <<-EOF &&
"event":"child_start".*"argv":\["git-remote-https","$HTTPD_URL/new.bundle"\]
EOF
grep -f pattern trace.txt
'
# DO NOT add non-httpd-specific tests here, because the last part of this
# test script is only executed when httpd is available and enabled.

View File

@ -13,7 +13,7 @@ test_expect_success 'test capability advertisement' '
wrong_algo sha1:sha256
wrong_algo sha256:sha1
EOF
cat >expect <<-EOF &&
cat >expect.base <<-EOF &&
version 2
agent=git/$(git version | cut -d" " -f3)
ls-refs=unborn
@ -21,8 +21,11 @@ test_expect_success 'test capability advertisement' '
server-option
object-format=$(test_oid algo)
object-info
EOF
cat >expect.trailer <<-EOF &&
0000
EOF
cat expect.base expect.trailer >expect &&
GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
--advertise-capabilities >out &&
@ -342,4 +345,39 @@ test_expect_success 'basics of object-info' '
test_cmp expect actual
'
test_expect_success 'test capability advertisement with uploadpack.advertiseBundleURIs' '
test_config uploadpack.advertiseBundleURIs true &&
cat >expect.extra <<-EOF &&
bundle-uri
EOF
cat expect.base \
expect.extra \
expect.trailer >expect &&
GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
--advertise-capabilities >out &&
test-tool pkt-line unpack <out >actual &&
test_cmp expect actual
'
test_expect_success 'basics of bundle-uri: dies if not enabled' '
test-tool pkt-line pack >in <<-EOF &&
command=bundle-uri
0000
EOF
cat >err.expect <<-\EOF &&
fatal: invalid command '"'"'bundle-uri'"'"'
EOF
cat >expect <<-\EOF &&
ERR serve: invalid command '"'"'bundle-uri'"'"'
EOF
test_must_fail test-tool serve-v2 --stateless-rpc <in >out 2>err.actual &&
test_cmp err.expect err.actual &&
test_must_be_empty out
'
test_done

View File

@ -0,0 +1,17 @@
#!/bin/sh
test_description="Test bundle-uri with protocol v2 and 'file://' transport"
TEST_NO_CREATE_REPO=1
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
. ./test-lib.sh
# Test protocol v2 with 'file://' transport
#
BUNDLE_URI_PROTOCOL=file
. "$TEST_DIRECTORY"/lib-bundle-uri-protocol.sh
test_done

View File

@ -0,0 +1,17 @@
#!/bin/sh
test_description="Test bundle-uri with protocol v2 and 'git://' transport"
TEST_NO_CREATE_REPO=1
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
. ./test-lib.sh
# Test protocol v2 with 'git://' transport
#
BUNDLE_URI_PROTOCOL=git
. "$TEST_DIRECTORY"/lib-bundle-uri-protocol.sh
test_done

View File

@ -0,0 +1,17 @@
#!/bin/sh
test_description="Test bundle-uri with protocol v2 and 'http://' transport"
TEST_NO_CREATE_REPO=1
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
. ./test-lib.sh
# Test protocol v2 with 'http://' transport
#
BUNDLE_URI_PROTOCOL=http
. "$TEST_DIRECTORY"/lib-bundle-uri-protocol.sh
test_done

View File

@ -30,6 +30,58 @@ test_expect_success 'bundle_uri_parse_line() just URIs' '
test_cmp_config_output expect actual
'
test_expect_success 'bundle_uri_parse_line(): relative URIs' '
cat >in <<-\EOF &&
bundle.one.uri=bundle.bdl
bundle.two.uri=../bundle.bdl
bundle.three.uri=sub/dir/bundle.bdl
EOF
cat >expect <<-\EOF &&
[bundle]
version = 1
mode = all
[bundle "one"]
uri = <uri>/bundle.bdl
[bundle "two"]
uri = bundle.bdl
[bundle "three"]
uri = <uri>/sub/dir/bundle.bdl
EOF
test-tool bundle-uri parse-key-values in >actual 2>err &&
test_must_be_empty err &&
test_cmp_config_output expect actual
'
test_expect_success 'bundle_uri_parse_line(): relative URIs and parent paths' '
cat >in <<-\EOF &&
bundle.one.uri=bundle.bdl
bundle.two.uri=../bundle.bdl
bundle.three.uri=../../bundle.bdl
EOF
cat >expect <<-\EOF &&
[bundle]
version = 1
mode = all
[bundle "one"]
uri = <uri>/bundle.bdl
[bundle "two"]
uri = bundle.bdl
[bundle "three"]
uri = <uri>/../bundle.bdl
EOF
# TODO: We would prefer if parsing a bundle list would not cause
# a die() and instead would give a warning and allow the rest of
# a Git command to continue. This test_must_fail is necessary for
# now until the interface for relative_url() allows for reporting
# an error instead of die()ing.
test_must_fail test-tool bundle-uri parse-key-values in >actual 2>err &&
grep "fatal: cannot strip one component off url" err
'
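
The expectations in the two tests above follow from how a relative bundle URI is joined to the list's base URI (the literal `<uri>` in these tests): a plain relative path is appended to the base, each leading `../` strips one component from the base, and stripping past the last component is fatal. A simplified model of that rule, assuming a relative base with no scheme or `host:path` form, could look like the sketch below. `resolve_relative` is a hypothetical name; Git's own logic is in C and covers more cases.

```shell
# Simplified model of the resolution rule exercised by the tests.
# resolve_relative is hypothetical; it only handles a relative base.
resolve_relative () {
	base=./$1 rel=$2
	while :
	do
		case "$rel" in
		../*)
			# Each leading "../" removes one component from the base.
			rel=${rel#../}
			case "$base" in
			*/*) base=${base%/*} ;;
			*)
				echo "fatal: cannot strip one component off url" >&2
				return 1 ;;
			esac ;;
		./*) rel=${rel#./} ;;
		*) break ;;
		esac
	done
	out=$base/$rel
	# Drop the "./" that was prepended to the base.
	echo "${out#./}"
}

resolve_relative '<uri>' bundle.bdl            # <uri>/bundle.bdl
resolve_relative '<uri>' ../bundle.bdl         # bundle.bdl
resolve_relative '<uri>' sub/dir/bundle.bdl    # <uri>/sub/dir/bundle.bdl
resolve_relative '<uri>' ../../bundle.bdl ||
	echo "second ../ has nothing left to strip"
```

This is why `bundle.two.uri=../bundle.bdl` resolves to a bare `bundle.bdl`, and why `../../bundle.bdl` makes the command fail.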
test_expect_success 'bundle_uri_parse_line() parsing edge cases: empty key or value' '
cat >in <<-\EOF &&
=bogus-value
@ -136,6 +188,36 @@ test_expect_success 'parse config format: just URIs' '
test_cmp_config_output expect actual
'
test_expect_success 'parse config format: relative URIs' '
cat >in <<-\EOF &&
[bundle]
version = 1
mode = all
[bundle "one"]
uri = bundle.bdl
[bundle "two"]
uri = ../bundle.bdl
[bundle "three"]
uri = sub/dir/bundle.bdl
EOF
cat >expect <<-\EOF &&
[bundle]
version = 1
mode = all
[bundle "one"]
uri = <uri>/bundle.bdl
[bundle "two"]
uri = bundle.bdl
[bundle "three"]
uri = <uri>/sub/dir/bundle.bdl
EOF
test-tool bundle-uri parse-config in >actual 2>err &&
test_must_be_empty err &&
test_cmp_config_output expect actual
'
test_expect_success 'parse config format edge cases: empty key or value' '
cat >in1 <<-\EOF &&
= bogus-value

View File

@ -28,7 +28,7 @@ test_cmp_info () {
rm -f tmp.expect tmp.actual
}
-quoted_svnrepo="$(echo $svnrepo | sed 's/ /%20/')"
+quoted_svnrepo="$(echo $svnrepo | test_uri_escape)"
test_expect_success 'setup repository and import' '
mkdir info &&

View File

@ -1751,6 +1751,13 @@ test_path_is_hidden () {
return 1
}
# Poor man's URI escaping. Good enough for the test suite whose trash
# directory has a space in it. See 93c3fcbe4d4 (git-svn: attempt to
# mimic SVN 1.7 URL canonicalization, 2012-07-28) for prior art.
test_uri_escape() {
sed 's/ /%20/g'
}
# Check that the given command was invoked as part of the
# trace2-format trace on stdin.
#
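
The practical difference between the new `test_uri_escape` helper and the `sed` expression it replaces in the git-svn test is the `g` flag. A small illustration, using a made-up path with two spaces:

```shell
# Same definition as the helper added to the test library.
test_uri_escape () {
	sed 's/ /%20/g'
}

# Hypothetical path with two spaces, for illustration only.
path='trash directory/svn repo'

echo "$path" | sed 's/ /%20/'     # old form: only the first space is escaped
echo "$path" | test_uri_escape    # new helper: every space is escaped
```

The first command leaves the second space untouched, while the helper escapes both.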

View File

@ -1267,9 +1267,22 @@ static struct ref *get_refs_list_using_list(struct transport *transport,
return ret;
}
static int get_bundle_uri(struct transport *transport)
{
get_helper(transport);
if (process_connect(transport, 0)) {
do_take_over(transport);
return transport->vtable->get_bundle_uri(transport);
}
return -1;
}
static struct transport_vtable vtable = {
.set_option = set_helper_option,
.get_refs_list = get_refs_list,
.get_bundle_uri = get_bundle_uri,
.fetch_refs = fetch_refs,
.push_refs = push_refs,
.connect = connect_helper,

View File

@ -26,6 +26,13 @@ struct transport_vtable {
struct ref *(*get_refs_list)(struct transport *transport, int for_push,
struct transport_ls_refs_options *transport_options);
/**
* Populates the remote side's bundle-uri under protocol v2,
* if the "bundle-uri" capability was advertised. Returns 0 if
* OK, negative values on error.
*/
int (*get_bundle_uri)(struct transport *transport);
/**
* Fetch the objects for the given refs. Note that this gets
* an array, and should ignore the list structure.

View File

@ -22,6 +22,7 @@
#include "protocol.h"
#include "object-store.h"
#include "color.h"
#include "bundle-uri.h"
static int transport_use_color = -1;
static char transport_colors[][COLOR_MAXLEN] = {
@ -197,7 +198,7 @@ struct git_transport_data {
struct git_transport_options options;
struct child_process *conn;
int fd[2];
-unsigned got_remote_heads : 1;
+unsigned finished_handshake : 1;
enum protocol_version version;
struct oid_array extra_have;
struct oid_array shallow;
@ -344,7 +345,7 @@ static struct ref *handshake(struct transport *transport, int for_push,
case protocol_unknown_version:
BUG("unknown protocol version");
}
-data->got_remote_heads = 1;
+data->finished_handshake = 1;
transport->hash_algo = reader.hash_algo;
if (reader.line_peeked)
@ -359,6 +360,39 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
return handshake(transport, for_push, options, 1);
}
static int get_bundle_uri(struct transport *transport)
{
struct git_transport_data *data = transport->data;
struct packet_reader reader;
int stateless_rpc = transport->stateless_rpc;
if (!transport->bundles) {
CALLOC_ARRAY(transport->bundles, 1);
init_bundle_list(transport->bundles);
}
if (!data->finished_handshake) {
struct ref *refs = handshake(transport, 0, NULL, 0);
if (refs)
free_refs(refs);
}
/*
* "Support" protocol v0 and v2 without bundle-uri support by
* silently degrading to a NOOP.
*/
if (!server_supports_v2("bundle-uri"))
return 0;
packet_reader_init(&reader, data->fd[0], NULL, 0,
PACKET_READ_CHOMP_NEWLINE |
PACKET_READ_GENTLE_ON_EOF);
return get_remote_bundle_uri(data->fd[1], &reader,
transport->bundles, stateless_rpc);
}
static int fetch_refs_via_pack(struct transport *transport,
int nr_heads, struct ref **to_fetch)
{
@ -394,7 +428,7 @@ static int fetch_refs_via_pack(struct transport *transport,
args.negotiation_tips = data->options.negotiation_tips;
args.reject_shallow_remote = transport->smart_options->reject_shallow;
-if (!data->got_remote_heads) {
+if (!data->finished_handshake) {
int i;
int must_list_refs = 0;
for (i = 0; i < nr_heads; i++) {
@ -434,7 +468,7 @@ static int fetch_refs_via_pack(struct transport *transport,
to_fetch, nr_heads, &data->shallow,
&transport->pack_lockfiles, data->version);
-data->got_remote_heads = 0;
+data->finished_handshake = 0;
data->options.self_contained_and_connected =
args.self_contained_and_connected;
data->options.connectivity_checked = args.connectivity_checked;
@ -819,7 +853,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
if (transport_color_config() < 0)
return -1;
-if (!data->got_remote_heads)
+if (!data->finished_handshake)
get_refs_via_connect(transport, 1, NULL);
memset(&args, 0, sizeof(args));
@ -867,7 +901,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
else
ret = finish_connect(data->conn);
data->conn = NULL;
-data->got_remote_heads = 0;
+data->finished_handshake = 0;
return ret;
}
@ -887,7 +921,7 @@ static int disconnect_git(struct transport *transport)
{
struct git_transport_data *data = transport->data;
if (data->conn) {
-if (data->got_remote_heads && !transport->stateless_rpc)
+if (data->finished_handshake && !transport->stateless_rpc)
packet_flush(data->fd[1]);
close(data->fd[0]);
if (data->fd[1] >= 0)
@ -902,6 +936,7 @@ static int disconnect_git(struct transport *transport)
static struct transport_vtable taken_over_vtable = {
.get_refs_list = get_refs_via_connect,
.get_bundle_uri = get_bundle_uri,
.fetch_refs = fetch_refs_via_pack,
.push_refs = git_transport_push,
.disconnect = disconnect_git
@ -921,7 +956,7 @@ void transport_take_over(struct transport *transport,
data->conn = child;
data->fd[0] = data->conn->out;
data->fd[1] = data->conn->in;
-data->got_remote_heads = 0;
+data->finished_handshake = 0;
transport->data = data;
transport->vtable = &taken_over_vtable;
@ -1054,6 +1089,7 @@ static struct transport_vtable bundle_vtable = {
static struct transport_vtable builtin_smart_vtable = {
.get_refs_list = get_refs_via_connect,
.get_bundle_uri = get_bundle_uri,
.fetch_refs = fetch_refs_via_pack,
.push_refs = git_transport_push,
.connect = connect_git,
@ -1068,6 +1104,9 @@ struct transport *transport_get(struct remote *remote, const char *url)
ret->progress = isatty(2);
string_list_init_dup(&ret->pack_lockfiles);
CALLOC_ARRAY(ret->bundles, 1);
init_bundle_list(ret->bundles);
if (!remote)
BUG("No remote provided to transport_get()");
@ -1118,7 +1157,7 @@ struct transport *transport_get(struct remote *remote, const char *url)
ret->smart_options = &(data->options);
data->conn = NULL;
-data->got_remote_heads = 0;
+data->finished_handshake = 0;
} else {
/* Unknown protocol in URL. Pass to external handler. */
int len = external_specification_len(url);
@ -1482,6 +1521,34 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
return rc;
}
int transport_get_remote_bundle_uri(struct transport *transport)
{
int value = 0;
const struct transport_vtable *vtable = transport->vtable;
/* Check config only once. */
if (transport->got_remote_bundle_uri)
return 0;
transport->got_remote_bundle_uri = 1;
/*
* Don't request bundle-uri from the server unless configured to
* do so by the transfer.bundleURI=true config option.
*/
if (git_config_get_bool("transfer.bundleuri", &value) || !value)
return 0;
if (!transport->bundles->baseURI)
transport->bundles->baseURI = xstrdup(transport->url);
if (!vtable->get_bundle_uri)
return error(_("bundle-uri operation not supported by protocol"));
if (vtable->get_bundle_uri(transport) < 0)
return error(_("could not retrieve server-advertised bundle-uri list"));
return 0;
}
void transport_unlock_pack(struct transport *transport, unsigned int flags)
{
int in_signal_handler = !!(flags & TRANSPORT_UNLOCK_PACK_IN_SIGNAL_HANDLER);
@ -1512,6 +1579,8 @@ int transport_disconnect(struct transport *transport)
ret = transport->vtable->disconnect(transport);
if (transport->got_remote_refs)
free_refs((void *)transport->remote_refs);
clear_bundle_list(transport->bundles);
free(transport->bundles);
free(transport);
return ret;
}

View File

@ -62,6 +62,7 @@ enum transport_family {
TRANSPORT_FAMILY_IPV6
};
struct bundle_list;
struct transport {
const struct transport_vtable *vtable;
@ -76,6 +77,18 @@ struct transport {
*/
unsigned got_remote_refs : 1;
/**
* Indicates whether we already tried to retrieve the bundle URI list;
* set by transport.c::transport_get_remote_bundle_uri().
*/
unsigned got_remote_bundle_uri : 1;
/*
* The results of "command=bundle-uri", if both sides support
* the "bundle-uri" capability.
*/
struct bundle_list *bundles;
/*
* Transports that call take-over destroys the data specific to
* the transport type while doing so, and cannot be reused.
@ -281,6 +294,12 @@ void transport_ls_refs_options_release(struct transport_ls_refs_options *opts);
const struct ref *transport_get_remote_refs(struct transport *transport,
struct transport_ls_refs_options *transport_options);
/**
* Retrieve bundle URI(s) from a remote. Populates "struct
* transport"'s "bundles" and "got_remote_bundle_uri".
*/
int transport_get_remote_bundle_uri(struct transport *transport);
/*
* Fetch the hash algorithm used by a remote.
*