From 8b5636a57ff078ac368f246f813b156552a0726c Mon Sep 17 00:00:00 2001
From: Johannes Sixt <j6t@kdbg.org>
Date: Fri, 17 Oct 2025 18:37:52 +0200
Subject: [PATCH 001/553] Revert "gitk: Only restore window size from ~/.gitk,
 not position"

This reverts commit b9bee11526ec (gitk: Only restore window size from
~/.gitk, not position, 2008-03-10).

The earlier commit e9937d2a03a4 (Make gitk work reasonably well on
Cygwin, 2007-02-01) reworked the window layout considerably. Much of
this became irrelevant around 2011 after Cygwin gained an X11 server
and switched to a supportable port of the Unix/X11 Tcl/Tk (it is now
on the current 8.6 code base).

Part of the necessary change was to restore the window size across
sessions, but the position was also restored. This raised complaints
on the mailing list[*], because Gitk was opened on the wrong monitor.
b9bee11526ec was the compromise, because it was only the size that
mattered for the Cygwin layout engine to work.

Personally, I find it annoying when Gitk pops up at a random location
on the screen, in particular since many other applications restore
their window positions across sessions, so why not Gitk as well? (I do
not operate multi-monitor setups, so I cannot test that case.)

[*] https://lore.kernel.org/git/47AAA254.2020008@thorn.ws/

Helped-by: Mark Levedahl <mlevedahl@gmail.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
 gitk | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/gitk b/gitk
index 6e4d71d5852533..275f3538117f31 100755
--- a/gitk
+++ b/gitk
@@ -2764,17 +2764,9 @@ proc makewindow {} {
     .pwbottom add .bright
     .ctop add .pwbottom
 
-    # restore window width & height if known
+    # restore window position if known
     if {[info exists geometry(main)]} {
-        if {[scan $geometry(main) "%dx%d" w h] >= 2} {
-            if {$w > [winfo screenwidth .]} {
-                set w [winfo screenwidth .]
-            }
-            if {$h > [winfo screenheight .]} {
-                set h [winfo screenheight .]
-            }
-            wm geometry . "${w}x$h"
-        }
+        wm geometry . "$geometry(main)"
     }
 
     if {[info exists geometry(state)] && $geometry(state) eq "zoomed"} {

From bf5a55ac5eaef91e87470d704613e6942500a810 Mon Sep 17 00:00:00 2001
From: Johannes Sixt <j6t@kdbg.org>
Date: Fri, 17 Oct 2025 18:38:11 +0200
Subject: [PATCH 002/553] gitk: persist position and size of the Tags and Heads
 window

The Tags and Heads window always opens at a default position and size,
requiring users to reposition it each time. Remember its geometry
between sessions in the config file as `geometry(showrefs)`.

Note that the existing configuration is sourced in proc savestuff
right before new settings are written. This makes the old settings
available as local variables(!) and does not overwrite the current
settings. Since we need access to the global geometry(showrefs), it
is necessary to unset the local variable.

Helped-by: Michael Rappazzo <rappazzo@gmail.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
 gitk | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/gitk b/gitk
index 275f3538117f31..ed616613ae3c91 100755
--- a/gitk
+++ b/gitk
@@ -2131,12 +2131,14 @@ proc ttk_toplevel {w args} {
     return $w
 }
 
-proc make_transient {window origin} {
+proc make_transient {window origin {geometry ""}} {
     wm transient $window $origin
 
-    # Windows fails to place transient windows normally, so
-    # schedule a callback to center them on the parent.
-    if {[tk windowingsystem] eq {win32}} {
+    if {$geometry ne ""} {
+        after idle [list wm geometry $window $geometry]
+    } elseif {[tk windowingsystem] eq {win32}} {
+        # Windows fails to place transient windows normally, so
+        # schedule a callback to center them on the parent.
         after idle [list tk::PlaceWindow $window widget $origin]
     }
 }
@@ -3106,6 +3108,11 @@ proc savestuff {w} {
         puts $f "set geometry(pwsash1) \"[.tf.histframe.pwclist sashpos 1] 1\""
         puts $f "set geometry(botwidth) [winfo width .bleft]"
         puts $f "set geometry(botheight) [winfo height .bleft]"
+        unset -nocomplain geometry
+        global geometry
+        if {[info exists geometry(showrefs)]} {
+            puts $f "set geometry(showrefs) $geometry(showrefs)"
+        }
 
         array set view_save {}
         array set views {}
@@ -10193,6 +10200,7 @@ proc rmbranch {} {
 proc showrefs {} {
     global showrefstop bgcolor fgcolor selectbgcolor
     global bglist fglist reflistfilter reflist maincursor
+    global geometry
 
     set top .showrefs
     set showrefstop $top
@@ -10203,7 +10211,11 @@ proc showrefs {} {
     }
     ttk_toplevel $top
     wm title $top [mc "Tags and heads: %s" [file tail [pwd]]]
-    make_transient $top .
+    if {[info exists geometry(showrefs)]} {
+        make_transient $top . $geometry(showrefs)
+    } else {
+        make_transient $top .
+    }
     text $top.list -background $bgcolor -foreground $fgcolor \
         -selectbackground $selectbgcolor -font mainfont \
         -xscrollcommand "$top.xsb set" -yscrollcommand "$top.ysb set" \
@@ -10239,6 +10251,9 @@ proc showrefs {} {
     bind $top.list <ButtonRelease-1> {sel_reflist %W %x %y; break}
     set reflist {}
     refill_reflist
+    # avoid <Configure> being bound to child windows
+    bindtags $top [linsert [bindtags $top] 1 bind$top]
+    bind bind$top <Configure> {set geometry(showrefs) [wm geometry %W]}
 }
 
 proc sel_reflist {w x y} {

From e78ab370545d81a950fa3b2701dd7c72015ee802 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 30 Oct 2025 11:38:38 +0100
Subject: [PATCH 003/553] packfile: use a `strmap` to store packs by name

To allow fast lookups of a packfile by name we use a hashmap that has
the packfile name as key and the pack itself as value. But while this is
the perfect use case for a `strmap`, we instead use `struct hashmap` and
store the hashmap entry in the packfile itself.

Simplify the code by using a `strmap` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 packfile.c | 24 ++++--------------------
 packfile.h |  4 ++--
 2 files changed, 6 insertions(+), 22 deletions(-)

diff --git a/packfile.c b/packfile.c
index 1ae2b2fe1eda77..04649e52920a1f 100644
--- a/packfile.c
+++ b/packfile.c
@@ -788,8 +788,7 @@ void packfile_store_add_pack(struct packfile_store *store,
 	pack->next = store->packs;
 	store->packs = pack;
 
-	hashmap_entry_init(&pack->packmap_ent, strhash(pack->pack_name));
-	hashmap_add(&store->map, &pack->packmap_ent);
+	strmap_put(&store->packs_by_path, pack->pack_name, pack);
 }
 
 struct packed_git *packfile_store_load_pack(struct packfile_store *store,
@@ -806,8 +805,7 @@ struct packed_git *packfile_store_load_pack(struct packfile_store *store,
 	strbuf_strip_suffix(&key, ".idx");
 	strbuf_addstr(&key, ".pack");
 
-	p = hashmap_get_entry_from_hash(&store->map, strhash(key.buf), key.buf,
-					struct packed_git, packmap_ent);
+	p = strmap_get(&store->packs_by_path, key.buf);
 	if (!p) {
 		p = add_packed_git(store->odb->repo, idx_path,
 				   strlen(idx_path), local);
@@ -2311,27 +2309,13 @@ int parse_pack_header_option(const char *in, unsigned char *out, unsigned int *l
 	return 0;
 }
 
-static int pack_map_entry_cmp(const void *cmp_data UNUSED,
-			      const struct hashmap_entry *entry,
-			      const struct hashmap_entry *entry2,
-			      const void *keydata)
-{
-	const char *key = keydata;
-	const struct packed_git *pg1, *pg2;
-
-	pg1 = container_of(entry, const struct packed_git, packmap_ent);
-	pg2 = container_of(entry2, const struct packed_git, packmap_ent);
-
-	return strcmp(pg1->pack_name, key ? key : pg2->pack_name);
-}
-
 struct packfile_store *packfile_store_new(struct object_database *odb)
 {
 	struct packfile_store *store;
 	CALLOC_ARRAY(store, 1);
 	store->odb = odb;
 	INIT_LIST_HEAD(&store->mru);
-	hashmap_init(&store->map, pack_map_entry_cmp, NULL, 0);
+	strmap_init(&store->packs_by_path);
 	return store;
 }
 
@@ -2341,7 +2325,7 @@ void packfile_store_free(struct packfile_store *store)
 		next = p->next;
 		free(p);
 	}
-	hashmap_clear(&store->map);
+	strmap_clear(&store->packs_by_path, 0);
 	free(store);
 }
 
diff --git a/packfile.h b/packfile.h
index c9d0b93446b5f5..9da7f14317b02c 100644
--- a/packfile.h
+++ b/packfile.h
@@ -5,12 +5,12 @@
 #include "object.h"
 #include "odb.h"
 #include "oidset.h"
+#include "strmap.h"
 
 /* in odb.h */
 struct object_info;
 
 struct packed_git {
-	struct hashmap_entry packmap_ent;
 	struct packed_git *next;
 	struct list_head mru;
 	struct pack_window *windows;
@@ -85,7 +85,7 @@ struct packfile_store {
 	 * A map of packfile names to packed_git structs for tracking which
 	 * packs have been loaded already.
 	 */
-	struct hashmap map;
+	struct strmap packs_by_path;
 
 	/*
 	 * Whether packfiles have already been populated with this store's

From f905a855b1d1e172ba1e51e03fc5ec531445575e Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 30 Oct 2025 11:38:39 +0100
Subject: [PATCH 004/553] packfile: move the MRU list into the packfile store

Packfiles have two lists associated with them:

  - A list that keeps track of packfiles in the order that they were
    added to a packfile store.

  - A list that keeps track of packfiles in most-recently-used order so
    that packfiles that are more likely to contain a specific object are
    ordered towards the front.

Both of these lists are hosted by `struct packed_git` itself, so to
identify all packfiles in a repository you simply need to grab the first
packfile and then iterate the `->next` pointers or the MRU list. This
pattern has the problem that all packfiles are part of the same list,
regardless of whether or not they belong to the same object source.

With the upcoming pluggable object database effort this needs to change:
packfiles should be contained by a single object source, and reading an
object from any such packfile should use that source to look up the
object. Consequently, we need to break up the global lists of packfiles
into per-object-source lists.

A first step towards this goal is to move those lists out of `struct
packed_git` and into the packfile store. While the packfile store is
currently sitting on the `struct object_database` level, the intent is
to push it down one level into the `struct odb_source` in a subsequent
patch series.

Introduce a new `struct packfile_list` that is used to manage lists of
packfiles and use it to store the list of most-recently-used packfiles
in `struct packfile_store`. For now, the new list type is only used in a
single spot, but we'll expand its usage in subsequent patches.
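
As a rough illustration of the list semantics described above, here is a
minimal standalone sketch of a head/tail singly-linked list with
move-to-front behaviour. The names `struct pack`, `pack_list` and
`pack_list_prepend()` are hypothetical stand-ins chosen for this
example; the actual implementation is the one in the diff below.

```c
#include <stdlib.h>

/* Hypothetical stand-in for `struct packed_git`. */
struct pack { int id; };

struct pack_list_entry {
	struct pack_list_entry *next;
	struct pack *pack;
};

struct pack_list {
	struct pack_list_entry *head, *tail;
};

/* Unlink the entry for "pack", if any, and hand it back to the caller. */
static struct pack_list_entry *pack_list_unlink(struct pack_list *list,
						struct pack *pack)
{
	struct pack_list_entry *e, *prev = NULL;

	for (e = list->head; e; prev = e, e = e->next) {
		if (e->pack != pack)
			continue;
		if (prev)
			prev->next = e->next;
		else
			list->head = e->next;
		if (list->tail == e)
			list->tail = prev;
		return e;
	}
	return NULL;
}

/* Move "pack" to the front, allocating an entry if it is not yet listed. */
static void pack_list_prepend(struct pack_list *list, struct pack *pack)
{
	struct pack_list_entry *e = pack_list_unlink(list, pack);

	if (!e) {
		e = malloc(sizeof(*e));
		if (!e)
			abort();
		e->pack = pack;
	}
	e->next = list->head;
	list->head = e;
	if (!list->tail)
		list->tail = e;
}
```

Because an existing entry is unlinked before it is re-inserted, a pack
can appear at most once in the list, which is what makes "prepend"
usable as a move-to-front operation for MRU ordering.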

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/pack-objects.c |  9 ++---
 midx.c                 |  2 +-
 packfile.c             | 92 +++++++++++++++++++++++++++++++++++++-----
 packfile.h             | 19 +++++++--
 4 files changed, 104 insertions(+), 18 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index b5454e5df137b4..5348aebbe9f190 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1706,8 +1706,8 @@ static int want_object_in_pack_mtime(const struct object_id *oid,
 				     uint32_t found_mtime)
 {
 	int want;
+	struct packfile_list_entry *e;
 	struct odb_source *source;
-	struct list_head *pos;
 
 	if (!exclude && local) {
 		/*
@@ -1748,12 +1748,11 @@ static int want_object_in_pack_mtime(const struct object_id *oid,
 		}
 	}
 
-	list_for_each(pos, packfile_store_get_packs_mru(the_repository->objects->packfiles)) {
-		struct packed_git *p = list_entry(pos, struct packed_git, mru);
+	for (e = the_repository->objects->packfiles->mru.head; e; e = e->next) {
+		struct packed_git *p = e->pack;
 		want = want_object_in_pack_one(p, oid, exclude, found_pack, found_offset, found_mtime);
 		if (!exclude && want > 0)
-			list_move(&p->mru,
-				  packfile_store_get_packs_mru(the_repository->objects->packfiles));
+			packfile_list_prepend(&the_repository->objects->packfiles->mru, p);
 		if (want != -1)
 			return want;
 	}
diff --git a/midx.c b/midx.c
index 1d6269f957e781..8022be9a45ecb9 100644
--- a/midx.c
+++ b/midx.c
@@ -463,7 +463,7 @@ int prepare_midx_pack(struct multi_pack_index *m,
 	p = packfile_store_load_pack(r->objects->packfiles,
 				     pack_name.buf, m->source->local);
 	if (p)
-		list_add_tail(&p->mru, &r->objects->packfiles->mru);
+		packfile_list_append(&m->source->odb->packfiles->mru, p);
 	strbuf_release(&pack_name);
 
 	if (!p) {
diff --git a/packfile.c b/packfile.c
index 04649e52920a1f..4d2d3b674f3fb0 100644
--- a/packfile.c
+++ b/packfile.c
@@ -47,6 +47,80 @@ static size_t pack_mapped;
 #define SZ_FMT PRIuMAX
 static inline uintmax_t sz_fmt(size_t s) { return s; }
 
+void packfile_list_clear(struct packfile_list *list)
+{
+	struct packfile_list_entry *e, *next;
+
+	for (e = list->head; e; e = next) {
+		next = e->next;
+		free(e);
+	}
+
+	list->head = list->tail = NULL;
+}
+
+static struct packfile_list_entry *packfile_list_remove_internal(struct packfile_list *list,
+								 struct packed_git *pack)
+{
+	struct packfile_list_entry *e, *prev;
+
+	for (e = list->head, prev = NULL; e; prev = e, e = e->next) {
+		if (e->pack != pack)
+			continue;
+
+		if (prev)
+			prev->next = e->next;
+		if (list->head == e)
+			list->head = e->next;
+		if (list->tail == e)
+			list->tail = prev;
+
+		return e;
+	}
+
+	return NULL;
+}
+
+void packfile_list_remove(struct packfile_list *list, struct packed_git *pack)
+{
+	free(packfile_list_remove_internal(list, pack));
+}
+
+void packfile_list_prepend(struct packfile_list *list, struct packed_git *pack)
+{
+	struct packfile_list_entry *entry;
+
+	entry = packfile_list_remove_internal(list, pack);
+	if (!entry) {
+		entry = xmalloc(sizeof(*entry));
+		entry->pack = pack;
+	}
+	entry->next = list->head;
+
+	list->head = entry;
+	if (!list->tail)
+		list->tail = entry;
+}
+
+void packfile_list_append(struct packfile_list *list, struct packed_git *pack)
+{
+	struct packfile_list_entry *entry;
+
+	entry = packfile_list_remove_internal(list, pack);
+	if (!entry) {
+		entry = xmalloc(sizeof(*entry));
+		entry->pack = pack;
+	}
+	entry->next = NULL;
+
+	if (list->tail) {
+		list->tail->next = entry;
+		list->tail = entry;
+	} else {
+		list->head = list->tail = entry;
+	}
+}
+
 void pack_report(struct repository *repo)
 {
 	fprintf(stderr,
@@ -995,10 +1069,10 @@ static void packfile_store_prepare_mru(struct packfile_store *store)
 {
 	struct packed_git *p;
 
-	INIT_LIST_HEAD(&store->mru);
+	packfile_list_clear(&store->mru);
 
 	for (p = store->packs; p; p = p->next)
-		list_add_tail(&p->mru, &store->mru);
+		packfile_list_append(&store->mru, p);
 }
 
 void packfile_store_prepare(struct packfile_store *store)
@@ -1040,10 +1114,10 @@ struct packed_git *packfile_store_get_packs(struct packfile_store *store)
 	return store->packs;
 }
 
-struct list_head *packfile_store_get_packs_mru(struct packfile_store *store)
+struct packfile_list_entry *packfile_store_get_packs_mru(struct packfile_store *store)
 {
 	packfile_store_prepare(store);
-	return &store->mru;
+	return store->mru.head;
 }
 
 /*
@@ -2048,7 +2122,7 @@ static int fill_pack_entry(const struct object_id *oid,
 
 int find_pack_entry(struct repository *r, const struct object_id *oid, struct pack_entry *e)
 {
-	struct list_head *pos;
+	struct packfile_list_entry *l;
 
 	packfile_store_prepare(r->objects->packfiles);
 
@@ -2059,10 +2133,11 @@ int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
 	if (!r->objects->packfiles->packs)
 		return 0;
 
-	list_for_each(pos, &r->objects->packfiles->mru) {
-		struct packed_git *p = list_entry(pos, struct packed_git, mru);
+	for (l = r->objects->packfiles->mru.head; l; l = l->next) {
+		struct packed_git *p = l->pack;
+
 		if (!p->multi_pack_index && fill_pack_entry(oid, e, p)) {
-			list_move(&p->mru, &r->objects->packfiles->mru);
+			packfile_list_prepend(&r->objects->packfiles->mru, p);
 			return 1;
 		}
 	}
@@ -2314,7 +2389,6 @@ struct packfile_store *packfile_store_new(struct object_database *odb)
 	struct packfile_store *store;
 	CALLOC_ARRAY(store, 1);
 	store->odb = odb;
-	INIT_LIST_HEAD(&store->mru);
 	strmap_init(&store->packs_by_path);
 	return store;
 }
diff --git a/packfile.h b/packfile.h
index 9da7f14317b02c..39ed1073e4ad79 100644
--- a/packfile.h
+++ b/packfile.h
@@ -12,7 +12,6 @@ struct object_info;
 
 struct packed_git {
 	struct packed_git *next;
-	struct list_head mru;
 	struct pack_window *windows;
 	off_t pack_size;
 	const void *index_data;
@@ -52,6 +51,20 @@ struct packed_git {
 	char pack_name[FLEX_ARRAY]; /* more */
 };
 
+struct packfile_list {
+	struct packfile_list_entry *head, *tail;
+};
+
+struct packfile_list_entry {
+	struct packfile_list_entry *next;
+	struct packed_git *pack;
+};
+
+void packfile_list_clear(struct packfile_list *list);
+void packfile_list_remove(struct packfile_list *list, struct packed_git *pack);
+void packfile_list_prepend(struct packfile_list *list, struct packed_git *pack);
+void packfile_list_append(struct packfile_list *list, struct packed_git *pack);
+
 /*
  * A store that manages packfiles for a given object database.
  */
@@ -79,7 +92,7 @@ struct packfile_store {
 	} kept_cache;
 
 	/* A most-recently-used ordered version of the packs list. */
-	struct list_head mru;
+	struct packfile_list mru;
 
 	/*
 	 * A map of packfile names to packed_git structs for tracking which
@@ -153,7 +166,7 @@ struct packed_git *packfile_store_get_packs(struct packfile_store *store);
 /*
  * Get all packs in most-recently-used order.
  */
-struct list_head *packfile_store_get_packs_mru(struct packfile_store *store);
+struct packfile_list_entry *packfile_store_get_packs_mru(struct packfile_store *store);
 
 /*
  * Open the packfile and add it to the store if it isn't yet known. Returns

From 89219bc0cd09ada8a204e0ace0bd15decaea7d31 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 30 Oct 2025 11:38:40 +0100
Subject: [PATCH 005/553] http: refactor subsystem to use `packfile_list`s

The dumb HTTP protocol directly fetches packfiles from the remote server
and temporarily stores them in a list of packfiles. Those packfiles are
not yet added to the repository's packfile store until we finalize the
whole fetch.

Refactor the code to instead use a `struct packfile_list` to store those
packs. This prepares us for a subsequent change where the `->next`
pointer of `struct packed_git` will go away.

Note that this refactoring creates some temporary duplication of code,
as we now have both `packfile_list_find_oid()` and `find_oid_pack()`.
The latter function will be removed in a subsequent commit though.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 http-push.c   |  6 +++---
 http-walker.c | 26 +++++++++-----------------
 http.c        | 21 ++++++++-------------
 http.h        |  5 +++--
 packfile.c    |  9 +++++++++
 packfile.h    |  8 ++++++++
 6 files changed, 40 insertions(+), 35 deletions(-)

diff --git a/http-push.c b/http-push.c
index a1c01e3b9b93a3..d86ce771198206 100644
--- a/http-push.c
+++ b/http-push.c
@@ -104,7 +104,7 @@ struct repo {
 	int has_info_refs;
 	int can_update_info_refs;
 	int has_info_packs;
-	struct packed_git *packs;
+	struct packfile_list packs;
 	struct remote_lock *locks;
 };
 
@@ -311,7 +311,7 @@ static void start_fetch_packed(struct transfer_request *request)
 	struct transfer_request *check_request = request_queue_head;
 	struct http_pack_request *preq;
 
-	target = find_oid_pack(&request->obj->oid, repo->packs);
+	target = packfile_list_find_oid(repo->packs.head, &request->obj->oid);
 	if (!target) {
 		fprintf(stderr, "Unable to fetch %s, will not be able to update server info refs\n", oid_to_hex(&request->obj->oid));
 		repo->can_update_info_refs = 0;
@@ -683,7 +683,7 @@ static int add_send_request(struct object *obj, struct remote_lock *lock)
 		get_remote_object_list(obj->oid.hash[0]);
 	if (obj->flags & (REMOTE | PUSHING))
 		return 0;
-	target = find_oid_pack(&obj->oid, repo->packs);
+	target = packfile_list_find_oid(repo->packs.head, &obj->oid);
 	if (target) {
 		obj->flags |= REMOTE;
 		return 0;
diff --git a/http-walker.c b/http-walker.c
index 0f7ae46d7f12c0..e886e6486646d1 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -15,7 +15,7 @@
 struct alt_base {
 	char *base;
 	int got_indices;
-	struct packed_git *packs;
+	struct packfile_list packs;
 	struct alt_base *next;
 };
 
@@ -324,11 +324,8 @@ static void process_alternates_response(void *callback_data)
 				} else if (is_alternate_allowed(target.buf)) {
 					warning("adding alternate object store: %s",
 						target.buf);
-					newalt = xmalloc(sizeof(*newalt));
-					newalt->next = NULL;
+					CALLOC_ARRAY(newalt, 1);
 					newalt->base = strbuf_detach(&target, NULL);
-					newalt->got_indices = 0;
-					newalt->packs = NULL;
 
 					while (tail->next != NULL)
 						tail = tail->next;
@@ -435,7 +432,7 @@ static int http_fetch_pack(struct walker *walker, struct alt_base *repo,
 
 	if (fetch_indices(walker, repo))
 		return -1;
-	target = find_oid_pack(oid, repo->packs);
+	target = packfile_list_find_oid(repo->packs.head, oid);
 	if (!target)
 		return -1;
 	close_pack_index(target);
@@ -584,17 +581,15 @@ static void cleanup(struct walker *walker)
 	if (data) {
 		alt = data->alt;
 		while (alt) {
-			struct packed_git *pack;
+			struct packfile_list_entry *e;
 
 			alt_next = alt->next;
 
-			pack = alt->packs;
-			while (pack) {
-				struct packed_git *pack_next = pack->next;
-				close_pack(pack);
-				free(pack);
-				pack = pack_next;
+			for (e = alt->packs.head; e; e = e->next) {
+				close_pack(e->pack);
+				free(e->pack);
 			}
+			packfile_list_clear(&alt->packs);
 
 			free(alt->base);
 			free(alt);
@@ -612,14 +607,11 @@ struct walker *get_http_walker(const char *url)
 	struct walker_data *data = xmalloc(sizeof(struct walker_data));
 	struct walker *walker = xmalloc(sizeof(struct walker));
 
-	data->alt = xmalloc(sizeof(*data->alt));
+	CALLOC_ARRAY(data->alt, 1);
 	data->alt->base = xstrdup(url);
 	for (s = data->alt->base + strlen(data->alt->base) - 1; *s == '/'; --s)
 		*s = 0;
 
-	data->alt->got_indices = 0;
-	data->alt->packs = NULL;
-	data->alt->next = NULL;
 	data->got_alternates = -1;
 
 	walker->corrupt_object_found = 0;
diff --git a/http.c b/http.c
index 17130823f006f2..41f850db16d19f 100644
--- a/http.c
+++ b/http.c
@@ -2413,8 +2413,9 @@ static char *fetch_pack_index(unsigned char *hash, const char *base_url)
 	return tmp;
 }
 
-static int fetch_and_setup_pack_index(struct packed_git **packs_head,
-	unsigned char *sha1, const char *base_url)
+static int fetch_and_setup_pack_index(struct packfile_list *packs,
+				      unsigned char *sha1,
+				      const char *base_url)
 {
 	struct packed_git *new_pack, *p;
 	char *tmp_idx = NULL;
@@ -2448,12 +2449,11 @@ static int fetch_and_setup_pack_index(struct packed_git **packs_head,
 	if (ret)
 		return -1;
 
-	new_pack->next = *packs_head;
-	*packs_head = new_pack;
+	packfile_list_prepend(packs, new_pack);
 	return 0;
 }
 
-int http_get_info_packs(const char *base_url, struct packed_git **packs_head)
+int http_get_info_packs(const char *base_url, struct packfile_list *packs)
 {
 	struct http_get_options options = {0};
 	int ret = 0;
@@ -2477,7 +2477,7 @@ int http_get_info_packs(const char *base_url, struct packed_git **packs_head)
 		    !parse_oid_hex(data, &oid, &data) &&
 		    skip_prefix(data, ".pack", &data) &&
 		    (*data == '\n' || *data == '\0')) {
-			fetch_and_setup_pack_index(packs_head, oid.hash, base_url);
+			fetch_and_setup_pack_index(packs, oid.hash, base_url);
 		} else {
 			data = strchrnul(data, '\n');
 		}
@@ -2541,14 +2541,9 @@ int finish_http_pack_request(struct http_pack_request *preq)
 }
 
 void http_install_packfile(struct packed_git *p,
-			   struct packed_git **list_to_remove_from)
+			   struct packfile_list *list_to_remove_from)
 {
-	struct packed_git **lst = list_to_remove_from;
-
-	while (*lst != p)
-		lst = &((*lst)->next);
-	*lst = (*lst)->next;
-
+	packfile_list_remove(list_to_remove_from, p);
 	packfile_store_add_pack(the_repository->objects->packfiles, p);
 }
 
diff --git a/http.h b/http.h
index 553e16205ce201..f9d459340476e4 100644
--- a/http.h
+++ b/http.h
@@ -2,6 +2,7 @@
 #define HTTP_H
 
 struct packed_git;
+struct packfile_list;
 
 #include "git-zlib.h"
 
@@ -190,7 +191,7 @@ struct curl_slist *http_append_auth_header(const struct credential *c,
 
 /* Helpers for fetching packs */
 int http_get_info_packs(const char *base_url,
-			struct packed_git **packs_head);
+			struct packfile_list *packs);
 
 /* Helper for getting Accept-Language header */
 const char *http_get_accept_language_header(void);
@@ -226,7 +227,7 @@ void release_http_pack_request(struct http_pack_request *preq);
  * from http_get_info_packs() and have chosen a specific pack to fetch.
  */
 void http_install_packfile(struct packed_git *p,
-			   struct packed_git **list_to_remove_from);
+			   struct packfile_list *list_to_remove_from);
 
 /* Helpers for fetching object */
 struct http_object_request {
diff --git a/packfile.c b/packfile.c
index 4d2d3b674f3fb0..6aa2ca8ac9ee43 100644
--- a/packfile.c
+++ b/packfile.c
@@ -121,6 +121,15 @@ void packfile_list_append(struct packfile_list *list, struct packed_git *pack)
 	}
 }
 
+struct packed_git *packfile_list_find_oid(struct packfile_list_entry *packs,
+					  const struct object_id *oid)
+{
+	for (; packs; packs = packs->next)
+		if (find_pack_entry_one(oid, packs->pack))
+			return packs->pack;
+	return NULL;
+}
+
 void pack_report(struct repository *repo)
 {
 	fprintf(stderr,
diff --git a/packfile.h b/packfile.h
index 39ed1073e4ad79..a53336d722a42b 100644
--- a/packfile.h
+++ b/packfile.h
@@ -65,6 +65,14 @@ void packfile_list_remove(struct packfile_list *list, struct packed_git *pack);
 void packfile_list_prepend(struct packfile_list *list, struct packed_git *pack);
 void packfile_list_append(struct packfile_list *list, struct packed_git *pack);
 
+/*
+ * Find the pack within the "packs" list whose index contains the object
+ * "oid". For general object lookups, you probably don't want this; use
+ * find_pack_entry() instead.
+ */
+struct packed_git *packfile_list_find_oid(struct packfile_list_entry *packs,
+					  const struct object_id *oid);
+
 /*
  * A store that manages packfiles for a given object database.
  */

From 02a7f6ffab9ec7641f88032f30998976bca07820 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 30 Oct 2025 11:38:41 +0100
Subject: [PATCH 006/553] packfile: fix approximation of object counts

When approximating the number of objects in a repository we only take
into account two data sources, the multi-pack index and the packfile
indices, as both of these data structures allow us to easily figure out
how many objects they contain.

But the way we currently approximate the number of objects is broken
in the presence of a multi-pack index. This is due to two separate
reasons:

  - We have recently introduced initial infrastructure for incremental
    multi-pack indices. Starting with that series, `num_objects` only
    counts the number of objects of a specific layer of the MIDX chain,
    so we do not take into account objects from parent layers.

    This issue is fixed by adding `num_objects_in_base`, which contains
    the sum of all objects in previous layers.

  - When using the multi-pack index we may count objects contained in
    packfiles twice: once via the multi-pack index, but then we again
    count them via the packfile itself.

    This issue is fixed by skipping any packfiles that have an MIDX.

Overall, given that we _always_ count the packs, we can only end up
overestimating the number of objects, and the overestimation is limited
to a factor of two at most.
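
The two fixes can be sketched in isolation as follows. This is a
simplified model and not the code from the diff: `struct midx_layer`,
`struct pack_info`, `covered_by_midx` and `approximate_count()` are
hypothetical names, and a single MIDX is assumed where the real code
walks all object sources.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the MIDX and packfile structs. */
struct midx_layer {
	unsigned long num_objects;         /* objects in this layer only */
	unsigned long num_objects_in_base; /* objects in all parent layers */
};

struct pack_info {
	unsigned long num_objects;
	int covered_by_midx; /* non-zero if a MIDX already indexes this pack */
};

static unsigned long approximate_count(const struct midx_layer *midx,
				       const struct pack_info *packs,
				       size_t nr_packs)
{
	unsigned long count = 0;
	size_t i;

	/* Fix 1: include the objects of all parent layers of the chain. */
	if (midx)
		count += midx->num_objects + midx->num_objects_in_base;

	/* Fix 2: skip packs that were already counted via the MIDX. */
	for (i = 0; i < nr_packs; i++) {
		if (packs[i].covered_by_midx)
			continue;
		count += packs[i].num_objects;
	}
	return count;
}
```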

The consequences of those issues are very limited though, as we only
approximate object counts in a small number of cases:

  - When writing a commit-graph we use the approximate object count to
    display the upper limit of a progress display.

  - In `repo_find_unique_abbrev_r()` we use it to specify a lower limit
    of how many hex digits we want to abbreviate to. Given that we use
    power-of-two here to derive the lower limit we may end up with an
    abbreviated hash that is one digit longer than required.

  - In `estimate_repack_memory()` we may end up overestimating how much
    memory a repack needs to pack objects. Consequently, we may end up
    dropping some packfiles from a repack.

None of these are really game-changing. But it's nice to fix those
issues regardless.

While at it, convert the code to use `repo_for_each_pack()`.
Furthermore, use `odb_prepare_alternates()` instead of explicitly
preparing the packfile store. We really only want to prepare the object
database sources, and `get_multi_pack_index()` already knows to prepare
the packfile store for us.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 packfile.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/packfile.c b/packfile.c
index 6aa2ca8ac9ee43..b07509b69bd7cc 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1143,16 +1143,16 @@ unsigned long repo_approximate_object_count(struct repository *r)
 		unsigned long count = 0;
 		struct packed_git *p;
 
-		packfile_store_prepare(r->objects->packfiles);
+		odb_prepare_alternates(r->objects);
 
 		for (source = r->objects->sources; source; source = source->next) {
 			struct multi_pack_index *m = get_multi_pack_index(source);
 			if (m)
-				count += m->num_objects;
+				count += m->num_objects + m->num_objects_in_base;
 		}
 
-		for (p = r->objects->packfiles->packs; p; p = p->next) {
-			if (open_pack_index(p))
+		repo_for_each_pack(r, p) {
+			if (p->multi_pack_index || open_pack_index(p))
 				continue;
 			count += p->num_objects;
 		}

From 0d0e4b5954e97fcdfd0a4120e17e2c570497981a Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 30 Oct 2025 11:38:42 +0100
Subject: [PATCH 007/553] builtin/pack-objects: simplify logic to find kept or
 nonlocal objects

The function `has_sha1_pack_kept_or_nonlocal()` takes an object ID and
then searches through packed objects to figure out whether the object
exists in a kept or non-local pack. As a performance optimization we
remember the packfile that contains a given object ID so that the next
call to the function first checks that same packfile again.

The way this is written is rather hard to follow though, as the caching
mechanism is intertwined with the loop that iterates through the packs.
Consequently, we need to do some gymnastics to re-start the iteration if
the cached pack does not contain the object.

Refactor this so that we check the cached packfile at the beginning. We
don't have to re-verify whether the packfile meets the properties as we
have already verified those when storing the pack in `last_found` in the
first place. So all we need to do is to use `find_pack_entry_one()` to
check whether the pack contains the object ID, and to skip the cached
pack in the loop so that we don't search it twice.

Furthermore, stop using the `(void *)1` sentinel value and instead use a
simple `NULL` pointer to indicate that we don't have a last-found pack
yet.

This refactoring significantly simplifies the logic and makes it much
easier to follow.
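
The resulting control flow can be sketched independently of Git's data
structures. In the example below, `struct pack`, `pack_contains()` and
`in_kept_pack()` are hypothetical names, objects are modelled as plain
integers, and a single `kept` flag stands in for the "kept or
non-local" check.

```c
#include <stddef.h>

/* Hypothetical stand-in for a packfile: a set of integer object IDs. */
struct pack {
	const int *objects;
	size_t nr;
	int kept;
};

static int pack_contains(const struct pack *p, int oid)
{
	size_t i;

	for (i = 0; i < p->nr; i++)
		if (p->objects[i] == oid)
			return 1;
	return 0;
}

static int in_kept_pack(const struct pack *packs, size_t nr, int oid)
{
	/* NULL means "no pack cached yet"; no sentinel value is needed. */
	static const struct pack *last_found = NULL;
	size_t i;

	/* The cached pack passed the property check when it was stored. */
	if (last_found && pack_contains(last_found, oid))
		return 1;

	for (i = 0; i < nr; i++) {
		const struct pack *p = &packs[i];

		/* Already checked above, so do not search it twice. */
		if (p == last_found)
			continue;

		if (p->kept && pack_contains(p, oid)) {
			last_found = p;
			return 1;
		}
	}
	return 0;
}
```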

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/pack-objects.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5348aebbe9f190..b83eb8ead14139 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4388,27 +4388,27 @@ static void add_unreachable_loose_objects(struct rev_info *revs)
 
 static int has_sha1_pack_kept_or_nonlocal(const struct object_id *oid)
 {
-	struct packfile_store *packs = the_repository->objects->packfiles;
-	static struct packed_git *last_found = (void *)1;
+	static struct packed_git *last_found = NULL;
 	struct packed_git *p;
 
-	p = (last_found != (void *)1) ? last_found :
-					packfile_store_get_packs(packs);
+	if (last_found && find_pack_entry_one(oid, last_found))
+		return 1;
 
-	while (p) {
-		if ((!p->pack_local || p->pack_keep ||
-				p->pack_keep_in_core) &&
-			find_pack_entry_one(oid, p)) {
+	repo_for_each_pack(the_repository, p) {
+		/*
+		 * We have already checked `last_found`, so there is no need to
+		 * re-check here.
+		 */
+		if (p == last_found)
+			continue;
+
+		if ((!p->pack_local || p->pack_keep || p->pack_keep_in_core) &&
+		    find_pack_entry_one(oid, p)) {
 			last_found = p;
 			return 1;
 		}
-		if (p == last_found)
-			p = packfile_store_get_packs(packs);
-		else
-			p = p->next;
-		if (p == last_found)
-			p = p->next;
 	}
+
 	return 0;
 }
 

From 589127caa73090040200989ff4d24c3d54f473f2 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 30 Oct 2025 11:38:43 +0100
Subject: [PATCH 008/553] packfile: move list of packs into the packfile store

Move the list of packs into the packfile store. This follows the same
logic as in a previous commit, where we moved the most-recently-used
list of packs, as well.
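
For readers unfamiliar with the pattern, here is a minimal standalone
sketch of a non-intrusive list with head and tail pointers. The `toy_*`
names are hypothetical and the real `struct packfile_list` may differ in
its details; the point is only that the `next` pointer lives in the list
entry rather than in the pack, so one pack can be a member of several
lists at once:

```c
#include <assert.h>
#include <stdlib.h>

struct toy_pack {
	int id; /* note: no `next` member */
};

struct toy_list_entry {
	struct toy_list_entry *next;
	struct toy_pack *pack;
};

struct toy_list {
	struct toy_list_entry *head, *tail;
};

static struct toy_list_entry *toy_entry_new(struct toy_pack *pack)
{
	struct toy_list_entry *e = malloc(sizeof(*e));

	if (!e)
		abort();
	e->pack = pack;
	e->next = NULL;
	return e;
}

static void toy_list_prepend(struct toy_list *list, struct toy_pack *pack)
{
	struct toy_list_entry *e;

	assert(list && pack);
	e = toy_entry_new(pack);
	e->next = list->head;
	list->head = e;
	if (!list->tail)
		list->tail = e;
}

static void toy_list_append(struct toy_list *list, struct toy_pack *pack)
{
	struct toy_list_entry *e;

	assert(list && pack);
	e = toy_entry_new(pack);
	if (list->tail)
		list->tail->next = e;
	else
		list->head = e;
	list->tail = e;
}

/* Free the entries only; the packs are owned elsewhere. */
static void toy_list_clear(struct toy_list *list)
{
	struct toy_list_entry *next;

	for (struct toy_list_entry *e = list->head; e; e = next) {
		next = e->next;
		free(e);
	}
	list->head = list->tail = NULL;
}
```

Code that reorders such a list by other means, for example by sorting
via the head pointer, has to fix up the tail pointer afterwards.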

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/fast-import.c |  4 +--
 packfile.c            | 83 +++++++++++++++++++------------------------
 packfile.h            | 16 +++------
 3 files changed, 43 insertions(+), 60 deletions(-)

diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 215295c1561f57..6fe6e9bc61d81b 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -978,7 +978,7 @@ static int store_object(
 	if (e->idx.offset) {
 		duplicate_count_by_type[type]++;
 		return 1;
-	} else if (find_oid_pack(&oid, packfile_store_get_packs(packs))) {
+	} else if (packfile_list_find_oid(packfile_store_get_packs(packs), &oid)) {
 		e->type = type;
 		e->pack_id = MAX_PACK_ID;
 		e->idx.offset = 1; /* just not zero! */
@@ -1179,7 +1179,7 @@ static void stream_blob(uintmax_t len, struct object_id *oidout, uintmax_t mark)
 		duplicate_count_by_type[OBJ_BLOB]++;
 		truncate_pack(&checkpoint);
 
-	} else if (find_oid_pack(&oid, packfile_store_get_packs(packs))) {
+	} else if (packfile_list_find_oid(packfile_store_get_packs(packs), &oid)) {
 		e->type = OBJ_BLOB;
 		e->pack_id = MAX_PACK_ID;
 		e->idx.offset = 1; /* just not zero! */
diff --git a/packfile.c b/packfile.c
index b07509b69bd7cc..71e95ae11c56d2 100644
--- a/packfile.c
+++ b/packfile.c
@@ -356,13 +356,14 @@ static void scan_windows(struct packed_git *p,
 
 static int unuse_one_window(struct packed_git *current)
 {
-	struct packed_git *p, *lru_p = NULL;
+	struct packfile_list_entry *e;
+	struct packed_git *lru_p = NULL;
 	struct pack_window *lru_w = NULL, *lru_l = NULL;
 
 	if (current)
 		scan_windows(current, &lru_p, &lru_w, &lru_l);
-	for (p = current->repo->objects->packfiles->packs; p; p = p->next)
-		scan_windows(p, &lru_p, &lru_w, &lru_l);
+	for (e = current->repo->objects->packfiles->packs.head; e; e = e->next)
+		scan_windows(e->pack, &lru_p, &lru_w, &lru_l);
 	if (lru_p) {
 		munmap(lru_w->base, lru_w->len);
 		pack_mapped -= lru_w->len;
@@ -542,14 +543,15 @@ static void find_lru_pack(struct packed_git *p, struct packed_git **lru_p, struc
 
 static int close_one_pack(struct repository *r)
 {
-	struct packed_git *p, *lru_p = NULL;
+	struct packfile_list_entry *e;
+	struct packed_git *lru_p = NULL;
 	struct pack_window *mru_w = NULL;
 	int accept_windows_inuse = 1;
 
-	for (p = r->objects->packfiles->packs; p; p = p->next) {
-		if (p->pack_fd == -1)
+	for (e = r->objects->packfiles->packs.head; e; e = e->next) {
+		if (e->pack->pack_fd == -1)
 			continue;
-		find_lru_pack(p, &lru_p, &mru_w, &accept_windows_inuse);
+		find_lru_pack(e->pack, &lru_p, &mru_w, &accept_windows_inuse);
 	}
 
 	if (lru_p)
@@ -868,8 +870,7 @@ void packfile_store_add_pack(struct packfile_store *store,
 	if (pack->pack_fd != -1)
 		pack_open_fds++;
 
-	pack->next = store->packs;
-	store->packs = pack;
+	packfile_list_prepend(&store->packs, pack);
 
 	strmap_put(&store->packs_by_path, pack->pack_name, pack);
 }
@@ -1046,9 +1047,10 @@ static void prepare_packed_git_one(struct odb_source *source)
 	string_list_clear(data.garbage, 0);
 }
 
-DEFINE_LIST_SORT(static, sort_packs, struct packed_git, next);
+DEFINE_LIST_SORT(static, sort_packs, struct packfile_list_entry, next);
 
-static int sort_pack(const struct packed_git *a, const struct packed_git *b)
+static int sort_pack(const struct packfile_list_entry *a,
+		     const struct packfile_list_entry *b)
 {
 	int st;
 
@@ -1058,7 +1060,7 @@ static int sort_pack(const struct packed_git *a, const struct packed_git *b)
 	 * remote ones could be on a network mounted filesystem.
 	 * Favor local ones for these reasons.
 	 */
-	st = a->pack_local - b->pack_local;
+	st = a->pack->pack_local - b->pack->pack_local;
 	if (st)
 		return -st;
 
@@ -1067,21 +1069,19 @@ static int sort_pack(const struct packed_git *a, const struct packed_git *b)
 	 * and more recent objects tend to get accessed more
 	 * often.
 	 */
-	if (a->mtime < b->mtime)
+	if (a->pack->mtime < b->pack->mtime)
 		return 1;
-	else if (a->mtime == b->mtime)
+	else if (a->pack->mtime == b->pack->mtime)
 		return 0;
 	return -1;
 }
 
 static void packfile_store_prepare_mru(struct packfile_store *store)
 {
-	struct packed_git *p;
-
 	packfile_list_clear(&store->mru);
 
-	for (p = store->packs; p; p = p->next)
-		packfile_list_append(&store->mru, p);
+	for (struct packfile_list_entry *e = store->packs.head; e; e = e->next)
+		packfile_list_append(&store->mru, e->pack);
 }
 
 void packfile_store_prepare(struct packfile_store *store)
@@ -1096,7 +1096,11 @@ void packfile_store_prepare(struct packfile_store *store)
 		prepare_multi_pack_index_one(source);
 		prepare_packed_git_one(source);
 	}
-	sort_packs(&store->packs, sort_pack);
+
+	sort_packs(&store->packs.head, sort_pack);
+	for (struct packfile_list_entry *e = store->packs.head; e; e = e->next)
+		if (!e->next)
+			store->packs.tail = e;
 
 	packfile_store_prepare_mru(store);
 	store->initialized = true;
@@ -1108,7 +1112,7 @@ void packfile_store_reprepare(struct packfile_store *store)
 	packfile_store_prepare(store);
 }
 
-struct packed_git *packfile_store_get_packs(struct packfile_store *store)
+struct packfile_list_entry *packfile_store_get_packs(struct packfile_store *store)
 {
 	packfile_store_prepare(store);
 
@@ -1120,7 +1124,7 @@ struct packed_git *packfile_store_get_packs(struct packfile_store *store)
 			prepare_midx_pack(m, i);
 	}
 
-	return store->packs;
+	return store->packs.head;
 }
 
 struct packfile_list_entry *packfile_store_get_packs_mru(struct packfile_store *store)
@@ -1276,11 +1280,11 @@ void mark_bad_packed_object(struct packed_git *p, const struct object_id *oid)
 const struct packed_git *has_packed_and_bad(struct repository *r,
 					    const struct object_id *oid)
 {
-	struct packed_git *p;
+	struct packfile_list_entry *e;
 
-	for (p = r->objects->packfiles->packs; p; p = p->next)
-		if (oidset_contains(&p->bad_objects, oid))
-			return p;
+	for (e = r->objects->packfiles->packs.head; e; e = e->next)
+		if (oidset_contains(&e->pack->bad_objects, oid))
+			return e->pack;
 	return NULL;
 }
 
@@ -2088,19 +2092,6 @@ int is_pack_valid(struct packed_git *p)
 	return !open_packed_git(p);
 }
 
-struct packed_git *find_oid_pack(const struct object_id *oid,
-				 struct packed_git *packs)
-{
-	struct packed_git *p;
-
-	for (p = packs; p; p = p->next) {
-		if (find_pack_entry_one(oid, p))
-			return p;
-	}
-	return NULL;
-
-}
-
 static int fill_pack_entry(const struct object_id *oid,
 			   struct pack_entry *e,
 			   struct packed_git *p)
@@ -2139,7 +2130,7 @@ int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
 		if (source->midx && fill_midx_entry(source->midx, oid, e))
 			return 1;
 
-	if (!r->objects->packfiles->packs)
+	if (!r->objects->packfiles->packs.head)
 		return 0;
 
 	for (l = r->objects->packfiles->mru.head; l; l = l->next) {
@@ -2404,19 +2395,19 @@ struct packfile_store *packfile_store_new(struct object_database *odb)
 
 void packfile_store_free(struct packfile_store *store)
 {
-	for (struct packed_git *p = store->packs, *next; p; p = next) {
-		next = p->next;
-		free(p);
-	}
+	for (struct packfile_list_entry *e = store->packs.head; e; e = e->next)
+		free(e->pack);
+	packfile_list_clear(&store->packs);
+
 	strmap_clear(&store->packs_by_path, 0);
 	free(store);
 }
 
 void packfile_store_close(struct packfile_store *store)
 {
-	for (struct packed_git *p = store->packs; p; p = p->next) {
-		if (p->do_not_close)
+	for (struct packfile_list_entry *e = store->packs.head; e; e = e->next) {
+		if (e->pack->do_not_close)
 			BUG("want to close pack marked 'do-not-close'");
-		close_pack(p);
+		close_pack(e->pack);
 	}
 }
diff --git a/packfile.h b/packfile.h
index a53336d722a42b..d95275e666c95e 100644
--- a/packfile.h
+++ b/packfile.h
@@ -11,7 +11,6 @@
 struct object_info;
 
 struct packed_git {
-	struct packed_git *next;
 	struct pack_window *windows;
 	off_t pack_size;
 	const void *index_data;
@@ -83,7 +82,7 @@ struct packfile_store {
 	 * The list of packfiles in the order in which they are being added to
 	 * the store.
 	 */
-	struct packed_git *packs;
+	struct packfile_list packs;
 
 	/*
 	 * Cache of packfiles which are marked as "kept", either because there
@@ -163,13 +162,14 @@ void packfile_store_add_pack(struct packfile_store *store,
  * repository.
  */
 #define repo_for_each_pack(repo, p) \
-	for (p = packfile_store_get_packs(repo->objects->packfiles); p; p = p->next)
+	for (struct packfile_list_entry *e = packfile_store_get_packs(repo->objects->packfiles); \
+	     ((p) = (e ? e->pack : NULL)); e = e->next)
 
 /*
  * Get all packs managed by the given store, including packfiles that are
  * referenced by multi-pack indices.
  */
-struct packed_git *packfile_store_get_packs(struct packfile_store *store);
+struct packfile_list_entry *packfile_store_get_packs(struct packfile_store *store);
 
 /*
  * Get all packs in most-recently-used order.
@@ -266,14 +266,6 @@ extern void (*report_garbage)(unsigned seen_bits, const char *path);
  */
 unsigned long repo_approximate_object_count(struct repository *r);
 
-/*
- * Find the pack within the "packs" list whose index contains the object "oid".
- * For general object lookups, you probably don't want this; use
- * find_pack_entry() instead.
- */
-struct packed_git *find_oid_pack(const struct object_id *oid,
-				 struct packed_git *packs);
-
 void pack_report(struct repository *repo);
 
 /*

From 6aff1f25a046f3dcd8a78b0c61414fa2d1c9a93c Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 30 Oct 2025 11:38:44 +0100
Subject: [PATCH 009/553] packfile: always add packfiles to MRU when adding a
 pack

When preparing the packfile store we know to also prepare the MRU list
of packfiles with all packs that are currently loaded in the store via
`packfile_store_prepare_mru()`. So we know that the list of packs in the
MRU list should match the list of packs in the non-MRU list.

But there are some direct or indirect callsites that add a packfile to
the store via `packfile_store_add_pack()` without adding the pack to the
MRU. And while functions that access the MRU (e.g. `find_pack_entry()`)
know to call `packfile_store_prepare()`, which knows to prepare the MRU
via `packfile_store_prepare_mru()`, that operation will be turned into a
no-op because the packfile store is already prepared. So this will not
cause us to add the packfile to the MRU, and consequently we won't be
able to find the packfile in our MRU list.

There are only a handful of callers outside of "packfile.c" that add a
packfile to the store:

  - "builtin/fast-import.c" adds multiple packs of imported objects, but
    it knows to look up objects via `packfile_store_get_packs()`. This
    function does not use the MRU, so we're good.

  - "builtin/index-pack.c" adds the indexed pack to the store in case it
    needs to perform consistency checks on its objects.

  - "http.c" adds the fetched pack to the store so that we can access
    its objects.

In all of these cases we actually want to access the contained objects.
And luckily, reading these objects works as expected:

  1. We eventually end up in `do_oid_object_info_extended()`.

  2. Calling `find_pack_entry()` fails because the MRU list doesn't
     contain the newly added packfile.

  3. The callers don't pass `OBJECT_INFO_QUICK`, so we end up
     repreparing the object database. This will also cause us to
     reprepare the MRU list.

  4. We now retry reading the object via `find_pack_entry()`, and now we
     succeed because the MRU list got populated.

This logic feels quite fragile: we intentionally add the packfile to the
store, but we then ultimately rely on repreparing the entire store only
to make the packfile accessible. While we do the correct thing in
`do_oid_object_info_extended()`, other sites that access the MRU may not
know to reprepare.

But besides being fragile it's also a waste of resources: repreparing
the object database requires us to re-read the alternates file and
discard any caches.

Refactor the code so that we unconditionally add packfiles to the MRU
when adding them to a packfile store. This makes the logic less fragile
and ensures that we don't have to reprepare the store to make the pack
accessible.

Note that this does not allow us to drop `packfile_store_prepare_mru()`
just yet: while the MRU list is already populated with all packs now,
the order in which we add these packs is nondeterministic for the most
part. So by first calling `sort_pack()` on the other packfile list and
then re-preparing the MRU list we inherit its sorting.
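
The failure mode described above can be reproduced with a small
standalone model. This is a sketch under simplifying assumptions, not
Git code: packs are plain integers, both lists are fixed-size arrays,
and all `toy_*` names are hypothetical. The `also_mru` parameter selects
between the behavior before (false) and after (true) this change:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PACKS 8

struct toy_store {
	int packs[TOY_MAX_PACKS];
	size_t nr_packs;
	int mru[TOY_MAX_PACKS];
	size_t nr_mru;
	bool initialized;
};

/* Populate the MRU from `packs`; a no-op once the store is initialized. */
static void toy_store_prepare(struct toy_store *s)
{
	if (s->initialized)
		return;
	for (s->nr_mru = 0; s->nr_mru < s->nr_packs; s->nr_mru++)
		s->mru[s->nr_mru] = s->packs[s->nr_mru];
	s->initialized = true;
}

/* Add a pack to the store, and optionally to the MRU as well. */
static void toy_store_add(struct toy_store *s, int pack, bool also_mru)
{
	assert(s->nr_packs < TOY_MAX_PACKS);
	s->packs[s->nr_packs++] = pack;
	if (also_mru)
		s->mru[s->nr_mru++] = pack;
}

/* Look a pack up via the MRU only, as MRU-based readers would. */
static bool toy_store_find(struct toy_store *s, int pack)
{
	toy_store_prepare(s);
	for (size_t i = 0; i < s->nr_mru; i++)
		if (s->mru[i] == pack)
			return true;
	return false;
}
```

In this model, a pack added after the first lookup is invisible to
`toy_store_find()` unless it was also added to the MRU, because the
prepare step no longer does anything once `initialized` is set.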

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 midx.c     | 2 --
 packfile.c | 1 +
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/midx.c b/midx.c
index 8022be9a45ecb9..24e1e721754d0c 100644
--- a/midx.c
+++ b/midx.c
@@ -462,8 +462,6 @@ int prepare_midx_pack(struct multi_pack_index *m,
 		    m->pack_names[pack_int_id]);
 	p = packfile_store_load_pack(r->objects->packfiles,
 				     pack_name.buf, m->source->local);
-	if (p)
-		packfile_list_append(&m->source->odb->packfiles->mru, p);
 	strbuf_release(&pack_name);
 
 	if (!p) {
diff --git a/packfile.c b/packfile.c
index 71e95ae11c56d2..60f2e42876a823 100644
--- a/packfile.c
+++ b/packfile.c
@@ -871,6 +871,7 @@ void packfile_store_add_pack(struct packfile_store *store,
 		pack_open_fds++;
 
 	packfile_list_prepend(&store->packs, pack);
+	packfile_list_append(&store->mru, pack);
 
 	strmap_put(&store->packs_by_path, pack->pack_name, pack);
 }

From c31bad4f7dcf3e04ae22e7d4a1059fd628acf1a2 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 30 Oct 2025 11:38:45 +0100
Subject: [PATCH 010/553] packfile: track packs via the MRU list exclusively

We track packfiles via two different lists:

  - `struct packfile_store::packs` is a list that sorts local packs
    first. In addition, these packs are sorted so that younger packs are
    sorted towards the front.

  - `struct packfile_store::mru` is a list that sorts packs so that
    most-recently used packs are at the front.

The reasoning behind the ordering in the `packs` list is that younger
objects stored in the local object store tend to be accessed more
frequently, and that is certainly true for some cases. But there are
going to be lots of cases where that isn't true. Especially when
traversing history it is likely that one needs to access many older
objects, and due to our housekeeping it is very likely that almost all
of those older objects will be contained in one large pack that is
oldest.

So whether or not the ordering makes sense really depends on the use
case at hand. A flexible approach like our MRU list addresses that need,
as it will sort packs towards the front that are accessed all the time.
Intuitively, this approach is thus able to satisfy more use cases more
efficiently.

This reasoning casts some doubt on whether or not it really makes sense
to track packs via two different lists. It causes confusion, and it is
not clear whether there are use cases where the `packs` list really is
such an obvious choice.

Merge these two lists into one most-recently-used list.

Note that there is one important edge case: `for_each_packed_object()`
uses the MRU list to iterate through packs, and then it lists each
object in those packs. This would have the effect that we now sort the
current pack towards the front, thus modifying the list of packfiles we
are iterating over, with the consequence that we'll see an infinite
loop. This edge case is worked around by introducing a new field that
allows us to skip updating the MRU.
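
To see why reordering during iteration is a problem, consider the
following standalone sketch. It is a toy model with hypothetical names,
not the code touched by this patch: every visited entry triggers a
lookup that hits the entry itself, and a hit moves the entry to the
front unless `skip_mru_updates` is set. The walk takes a `limit` so that
the reordering case terminates:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_entry {
	struct toy_entry *next;
	const char *name;
};

struct toy_list {
	struct toy_entry *head;
	bool skip_mru_updates; /* mirrors the new field in spirit only */
};

/* Unlink `e` and re-insert it at the head (move-to-front). */
static void toy_move_to_front(struct toy_list *list, struct toy_entry *e)
{
	struct toy_entry **pp = &list->head;

	while (*pp && *pp != e)
		pp = &(*pp)->next;
	assert(*pp == e); /* `e` must be a member of `list` */
	*pp = e->next;
	e->next = list->head;
	list->head = e;
}

/* A lookup that hits `e` and records the hit unless told not to. */
static void toy_lookup_hit(struct toy_list *list, struct toy_entry *e)
{
	if (!list->skip_mru_updates)
		toy_move_to_front(list, e);
}

/*
 * Walk the list, performing one lookup per visited entry. Returns the
 * number of visits, giving up once `limit` visits have been made.
 */
static size_t toy_walk(struct toy_list *list, size_t limit)
{
	size_t visits = 0;

	for (struct toy_entry *e = list->head; e && visits < limit; e = e->next) {
		visits++;
		toy_lookup_hit(list, e);
	}
	return visits;
}
```

With three entries and reordering enabled, the walk keeps bouncing
between the first two entries and only stops at the limit; with
`skip_mru_updates` set it visits each entry exactly once.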

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/pack-objects.c |  4 ++--
 packfile.c             | 27 +++++++--------------------
 packfile.h             | 27 +++++++++++++++++----------
 3 files changed, 26 insertions(+), 32 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index b83eb8ead14139..0e4e9f80682fd6 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1748,11 +1748,11 @@ static int want_object_in_pack_mtime(const struct object_id *oid,
 		}
 	}
 
-	for (e = the_repository->objects->packfiles->mru.head; e; e = e->next) {
+	for (e = the_repository->objects->packfiles->packs.head; e; e = e->next) {
 		struct packed_git *p = e->pack;
 		want = want_object_in_pack_one(p, oid, exclude, found_pack, found_offset, found_mtime);
 		if (!exclude && want > 0)
-			packfile_list_prepend(&the_repository->objects->packfiles->mru, p);
+			packfile_list_prepend(&the_repository->objects->packfiles->packs, p);
 		if (want != -1)
 			return want;
 	}
diff --git a/packfile.c b/packfile.c
index 60f2e42876a823..378b0b1920d493 100644
--- a/packfile.c
+++ b/packfile.c
@@ -870,9 +870,7 @@ void packfile_store_add_pack(struct packfile_store *store,
 	if (pack->pack_fd != -1)
 		pack_open_fds++;
 
-	packfile_list_prepend(&store->packs, pack);
-	packfile_list_append(&store->mru, pack);
-
+	packfile_list_append(&store->packs, pack);
 	strmap_put(&store->packs_by_path, pack->pack_name, pack);
 }
 
@@ -1077,14 +1075,6 @@ static int sort_pack(const struct packfile_list_entry *a,
 	return -1;
 }
 
-static void packfile_store_prepare_mru(struct packfile_store *store)
-{
-	packfile_list_clear(&store->mru);
-
-	for (struct packfile_list_entry *e = store->packs.head; e; e = e->next)
-		packfile_list_append(&store->mru, e->pack);
-}
-
 void packfile_store_prepare(struct packfile_store *store)
 {
 	struct odb_source *source;
@@ -1103,7 +1093,6 @@ void packfile_store_prepare(struct packfile_store *store)
 		if (!e->next)
 			store->packs.tail = e;
 
-	packfile_store_prepare_mru(store);
 	store->initialized = true;
 }
 
@@ -1128,12 +1117,6 @@ struct packfile_list_entry *packfile_store_get_packs(struct packfile_store *stor
 	return store->packs.head;
 }
 
-struct packfile_list_entry *packfile_store_get_packs_mru(struct packfile_store *store)
-{
-	packfile_store_prepare(store);
-	return store->mru.head;
-}
-
 /*
  * Give a fast, rough count of the number of objects in the repository. This
  * ignores loose objects completely. If you have a lot of them, then either
@@ -2134,11 +2117,12 @@ int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
 	if (!r->objects->packfiles->packs.head)
 		return 0;
 
-	for (l = r->objects->packfiles->mru.head; l; l = l->next) {
+	for (l = r->objects->packfiles->packs.head; l; l = l->next) {
 		struct packed_git *p = l->pack;
 
 		if (!p->multi_pack_index && fill_pack_entry(oid, e, p)) {
-			packfile_list_prepend(&r->objects->packfiles->mru, p);
+			if (!r->objects->packfiles->skip_mru_updates)
+				packfile_list_prepend(&r->objects->packfiles->packs, p);
 			return 1;
 		}
 	}
@@ -2270,6 +2254,7 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
 	int r = 0;
 	int pack_errors = 0;
 
+	repo->objects->packfiles->skip_mru_updates = true;
 	repo_for_each_pack(repo, p) {
 		if ((flags & FOR_EACH_OBJECT_LOCAL_ONLY) && !p->pack_local)
 			continue;
@@ -2290,6 +2275,8 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb,
 		if (r)
 			break;
 	}
+	repo->objects->packfiles->skip_mru_updates = false;
+
 	return r ? r : pack_errors;
 }
 
diff --git a/packfile.h b/packfile.h
index d95275e666c95e..27ba607e7c5eae 100644
--- a/packfile.h
+++ b/packfile.h
@@ -79,8 +79,8 @@ struct packfile_store {
 	struct object_database *odb;
 
 	/*
-	 * The list of packfiles in the order in which they are being added to
-	 * the store.
+	 * The list of packfiles in the order in which they have been most
+	 * recently used.
 	 */
 	struct packfile_list packs;
 
@@ -98,9 +98,6 @@ struct packfile_store {
 		unsigned flags;
 	} kept_cache;
 
-	/* A most-recently-used ordered version of the packs list. */
-	struct packfile_list mru;
-
 	/*
 	 * A map of packfile names to packed_git structs for tracking which
 	 * packs have been loaded already.
@@ -112,6 +109,21 @@ struct packfile_store {
 	 * packs.
 	 */
 	bool initialized;
+
+	/*
+	 * Usually, packfiles will be reordered to the front of the `packs`
+	 * list whenever an object is looked up via them. This has the effect
+	 * that packs that contain a lot of accessed objects will be located
+	 * towards the front.
+	 *
+	 * This is usually desirable, but there are exceptions. One exception
+	 * is when looking up multiple objects in a loop for each packfile.
+	 * In that case, we may easily end up with an infinite loop as the
+	 * packfiles get reordered to the front repeatedly.
+	 *
+	 * Setting this field to `true` thus disables these reorderings.
+	 */
+	bool skip_mru_updates;
 };
 
 /*
@@ -171,11 +183,6 @@ void packfile_store_add_pack(struct packfile_store *store,
  */
 struct packfile_list_entry *packfile_store_get_packs(struct packfile_store *store);
 
-/*
- * Get all packs in most-recently-used order.
- */
-struct packfile_list_entry *packfile_store_get_packs_mru(struct packfile_store *store);
-
 /*
  * Open the packfile and add it to the store if it isn't yet known. Returns
  * either the newly opened packfile or the preexisting packfile. Returns a

From f82e430b4e46120e0ef67959e0ef9d8ab9282c56 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Mon, 3 Nov 2025 08:41:56 +0100
Subject: [PATCH 011/553] odb: fix subtle logic to check whether an alternate
 is usable

When adding an alternate to the object database we first check whether
or not the path is usable. A path is usable if:

  - It actually exists.

  - We don't have it in our object sources yet.

While the former check is trivial enough, the latter part is somewhat
subtle and prone to bugs. This is because the function doesn't only
check whether or not the given path is usable: if it _is_ usable, it
also stores that path in the map of object sources immediately.

The tricky part here is that the path that gets stored in the map is
_not_ copied. Instead, we rely on the fact that subsequent code uses
`strbuf_detach()` to store the exact same allocated memory in the
created object source. Consequently, the memory is owned by the source
but _also_ stored in the map. This subtlety is easy to miss, so if one
decides to refactor this code one can easily end up breaking this
mechanism.

Make the relationship more explicit by not storing the path as part of
`alt_odb_usable()`. Instead, store the path after we have created the
source so that we can use the source's path pointer directly.
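
The ownership relationship can be sketched in standalone C as follows.
This is an illustration with hypothetical `toy_*` names and a linear
array in place of the real hash map: the usability check only reads the
map, the source takes ownership of its path, and the map then borrows
the source's own pointer as its key:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAP_MAX 8

struct toy_source {
	char *path; /* owned by the source */
};

struct toy_map {
	const char *keys[TOY_MAP_MAX]; /* borrowed from a source's `path` */
	struct toy_source *values[TOY_MAP_MAX];
	size_t nr;
};

static struct toy_source *toy_map_get(const struct toy_map *map,
				      const char *path)
{
	for (size_t i = 0; i < map->nr; i++)
		if (!strcmp(map->keys[i], path))
			return map->values[i];
	return NULL;
}

/*
 * Create a source for `path` and register it. Returns NULL if the path
 * is already known, the map is full, or an allocation fails.
 */
static struct toy_source *toy_add_source(struct toy_map *map,
					 const char *path)
{
	struct toy_source *source;
	size_t len;

	assert(map && path);

	/* Step 1: a pure check that does not modify the map. */
	if (map->nr == TOY_MAP_MAX || toy_map_get(map, path))
		return NULL;

	/* Step 2: create the owner, which holds its own copy of the path. */
	source = malloc(sizeof(*source));
	if (!source)
		return NULL;
	len = strlen(path) + 1;
	source->path = malloc(len);
	if (!source->path) {
		free(source);
		return NULL;
	}
	memcpy(source->path, path, len);

	/* Step 3: register via the owner's pointer; the borrow is explicit. */
	map->keys[map->nr] = source->path;
	map->values[map->nr] = source;
	map->nr++;

	return source;
}
```

Because the key is taken from the source itself, the map entry stays
valid for as long as the source lives, regardless of what the caller
does with the buffer it passed in.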

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 odb.c | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/odb.c b/odb.c
index 00a6e71568b598..57d85ed9505e3f 100644
--- a/odb.c
+++ b/odb.c
@@ -86,17 +86,16 @@ int odb_mkstemp(struct object_database *odb,
 /*
  * Return non-zero iff the path is usable as an alternate object database.
  */
-static int alt_odb_usable(struct object_database *o,
-			  struct strbuf *path,
-			  const char *normalized_objdir, khiter_t *pos)
+static int alt_odb_usable(struct object_database *o, const char *path,
+			  const char *normalized_objdir)
 {
 	int r;
 
 	/* Detect cases where alternate disappeared */
-	if (!is_directory(path->buf)) {
+	if (!is_directory(path)) {
 		error(_("object directory %s does not exist; "
 			"check .git/objects/info/alternates"),
-		      path->buf);
+		      path);
 		return 0;
 	}
 
@@ -113,11 +112,14 @@ static int alt_odb_usable(struct object_database *o,
 		assert(r == 1); /* never used */
 		kh_value(o->source_by_path, p) = o->sources;
 	}
-	if (fspatheq(path->buf, normalized_objdir))
+
+	if (fspatheq(path, normalized_objdir))
+		return 0;
+
+	if (kh_get_odb_path_map(o->source_by_path, path) < kh_end(o->source_by_path))
 		return 0;
-	*pos = kh_put_odb_path_map(o->source_by_path, path->buf, &r);
-	/* r: 0 = exists, 1 = never used, 2 = deleted */
-	return r == 0 ? 0 : 1;
+
+	return 1;
 }
 
 /*
@@ -148,6 +150,7 @@ static struct odb_source *link_alt_odb_entry(struct object_database *odb,
 	struct strbuf pathbuf = STRBUF_INIT;
 	struct strbuf tmp = STRBUF_INIT;
 	khiter_t pos;
+	int ret;
 
 	if (!is_absolute_path(dir) && relative_base) {
 		strbuf_realpath(&pathbuf, relative_base, 1);
@@ -172,20 +175,21 @@ static struct odb_source *link_alt_odb_entry(struct object_database *odb,
 	strbuf_reset(&tmp);
 	strbuf_realpath(&tmp, odb->sources->path, 1);
 
-	if (!alt_odb_usable(odb, &pathbuf, tmp.buf, &pos))
+	if (!alt_odb_usable(odb, pathbuf.buf, tmp.buf))
 		goto error;
 
 	CALLOC_ARRAY(alternate, 1);
 	alternate->odb = odb;
 	alternate->local = false;
-	/* pathbuf.buf is already in r->objects->source_by_path */
 	alternate->path = strbuf_detach(&pathbuf, NULL);
 
 	/* add the alternate entry */
 	*odb->sources_tail = alternate;
 	odb->sources_tail = &(alternate->next);
-	alternate->next = NULL;
-	assert(odb->source_by_path);
+
+	pos = kh_put_odb_path_map(odb->source_by_path, alternate->path, &ret);
+	if (!ret)
+		BUG("source must not yet exist");
 	kh_value(odb->source_by_path, pos) = alternate;
 
 	/* recursively add alternates */

From 0820a4b120f310d87ac8817ade63896a901c9267 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Mon, 3 Nov 2025 08:41:57 +0100
Subject: [PATCH 012/553] odb: introduce `odb_source_new()`

We have three different locations where we create a new ODB source.
Deduplicate the logic via a new `odb_source_new()` function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 odb.c        | 23 ++++++++++++++++-------
 odb.h        |  4 ++++
 repository.c | 14 +++++++++-----
 3 files changed, 29 insertions(+), 12 deletions(-)

diff --git a/odb.c b/odb.c
index 57d85ed9505e3f..d2d4c514ae5c09 100644
--- a/odb.c
+++ b/odb.c
@@ -141,6 +141,20 @@ static void read_info_alternates(struct object_database *odb,
 				 const char *relative_base,
 				 int depth);
 
+struct odb_source *odb_source_new(struct object_database *odb,
+				  const char *path,
+				  bool local)
+{
+	struct odb_source *source;
+
+	CALLOC_ARRAY(source, 1);
+	source->odb = odb;
+	source->local = local;
+	source->path = xstrdup(path);
+
+	return source;
+}
+
 static struct odb_source *link_alt_odb_entry(struct object_database *odb,
 					     const char *dir,
 					     const char *relative_base,
@@ -178,10 +192,7 @@ static struct odb_source *link_alt_odb_entry(struct object_database *odb,
 	if (!alt_odb_usable(odb, pathbuf.buf, tmp.buf))
 		goto error;
 
-	CALLOC_ARRAY(alternate, 1);
-	alternate->odb = odb;
-	alternate->local = false;
-	alternate->path = strbuf_detach(&pathbuf, NULL);
+	alternate = odb_source_new(odb, pathbuf.buf, false);
 
 	/* add the alternate entry */
 	*odb->sources_tail = alternate;
@@ -341,9 +352,7 @@ struct odb_source *odb_set_temporary_primary_source(struct object_database *odb,
 	 * Make a new primary odb and link the old primary ODB in as an
 	 * alternate
 	 */
-	source = xcalloc(1, sizeof(*source));
-	source->odb = odb;
-	source->path = xstrdup(dir);
+	source = odb_source_new(odb, dir, false);
 
 	/*
 	 * Disable ref updates while a temporary odb is active, since
diff --git a/odb.h b/odb.h
index e6602dd90c833c..2bec895d1352f4 100644
--- a/odb.h
+++ b/odb.h
@@ -89,6 +89,10 @@ struct odb_source {
 	char *path;
 };
 
+struct odb_source *odb_source_new(struct object_database *odb,
+				  const char *path,
+				  bool local);
+
 struct packed_git;
 struct packfile_store;
 struct cached_object_entry;
diff --git a/repository.c b/repository.c
index 6faf5c73981ebf..6aaa7ba00869bf 100644
--- a/repository.c
+++ b/repository.c
@@ -160,20 +160,24 @@ void repo_set_gitdir(struct repository *repo,
 	 * until after xstrdup(root). Then we can free it.
 	 */
 	char *old_gitdir = repo->gitdir;
+	char *objects_path = NULL;
 
 	repo->gitdir = xstrdup(gitfile ? gitfile : root);
 	free(old_gitdir);
 
 	repo_set_commondir(repo, o->commondir);
+	expand_base_dir(&objects_path, o->object_dir,
+			repo->commondir, "objects");
 
 	if (!repo->objects->sources) {
-		CALLOC_ARRAY(repo->objects->sources, 1);
-		repo->objects->sources->odb = repo->objects;
-		repo->objects->sources->local = true;
+		repo->objects->sources = odb_source_new(repo->objects,
+							objects_path, true);
 		repo->objects->sources_tail = &repo->objects->sources->next;
+		free(objects_path);
+	} else {
+		free(repo->objects->sources->path);
+		repo->objects->sources->path = objects_path;
 	}
-	expand_base_dir(&repo->objects->sources->path, o->object_dir,
-			repo->commondir, "objects");
 
 	repo->objects->sources->disable_ref_updates = o->disable_ref_updates;
 

From c2da110411919387fef0f869181fb510d06295d8 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Mon, 3 Nov 2025 08:41:58 +0100
Subject: [PATCH 013/553] odb: adjust naming to free object sources

The functions `free_object_directory()` and `free_object_directories()`
are responsible for freeing a single object source or all object sources
connected to an object database, respectively. The associated structure
has been renamed from `struct object_directory` to `struct odb_source`
in a1e2581a1e (object-store: rename `object_directory` to `odb_source`,
2025-07-01) though, so the names are somewhat stale nowadays.

Rename them to mention the new struct name instead. Furthermore, while
at it, adapt them to our modern naming schema where we first have the
subject followed by a verb.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 odb.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/odb.c b/odb.c
index d2d4c514ae5c09..77490d7fdbeb4b 100644
--- a/odb.c
+++ b/odb.c
@@ -365,7 +365,7 @@ struct odb_source *odb_set_temporary_primary_source(struct object_database *odb,
 	return source->next;
 }
 
-static void free_object_directory(struct odb_source *source)
+static void odb_source_free(struct odb_source *source)
 {
 	free(source->path);
 	odb_clear_loose_cache(source);
@@ -387,7 +387,7 @@ void odb_restore_primary_source(struct object_database *odb,
 		BUG("we expect the old primary object store to be the first alternate");
 
 	odb->sources = restore_source;
-	free_object_directory(cur_source);
+	odb_source_free(cur_source);
 }
 
 char *compute_alternate_path(const char *path, struct strbuf *err)
@@ -1015,13 +1015,13 @@ struct object_database *odb_new(struct repository *repo)
 	return o;
 }
 
-static void free_object_directories(struct object_database *o)
+static void odb_free_sources(struct object_database *o)
 {
 	while (o->sources) {
 		struct odb_source *next;
 
 		next = o->sources->next;
-		free_object_directory(o->sources);
+		odb_source_free(o->sources);
 		o->sources = next;
 	}
 	kh_destroy_odb_path_map(o->source_by_path);
@@ -1039,7 +1039,7 @@ void odb_clear(struct object_database *o)
 	o->commit_graph = NULL;
 	o->commit_graph_attempted = 0;
 
-	free_object_directories(o);
+	odb_free_sources(o);
 	o->sources_tail = NULL;
 	o->loaded_alternates = 0;
 

From 0cc12dedef2885dba8cf2635697767d394baf91f Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Mon, 3 Nov 2025 08:41:59 +0100
Subject: [PATCH 014/553] object-file: move `fetch_if_missing`

The `fetch_if_missing` global variable is declared in "object-file.h"
but defined in "odb.c". The variable relates to the whole object
database instead of only loose objects, so move the declaration into
"odb.h" accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 object-file.h | 8 --------
 odb.h         | 8 ++++++++
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/object-file.h b/object-file.h
index 3fd48dcafbf1dc..097e9764be169e 100644
--- a/object-file.h
+++ b/object-file.h
@@ -7,14 +7,6 @@
 
 struct index_state;
 
-/*
- * Set this to 0 to prevent odb_read_object_info_extended() from fetching missing
- * blobs. This has a difference only if extensions.partialClone is set.
- *
- * Its default value is 1.
- */
-extern int fetch_if_missing;
-
 enum {
 	INDEX_WRITE_OBJECT = (1 << 0),
 	INDEX_FORMAT_CHECK = (1 << 1),
diff --git a/odb.h b/odb.h
index 2bec895d1352f4..2346ffeca85961 100644
--- a/odb.h
+++ b/odb.h
@@ -14,6 +14,14 @@ struct strbuf;
 struct repository;
 struct multi_pack_index;
 
+/*
+ * Set this to 0 to prevent odb_read_object_info_extended() from fetching missing
+ * blobs. This has a difference only if extensions.partialClone is set.
+ *
+ * Its default value is 1.
+ */
+extern int fetch_if_missing;
+
 /*
  * Compute the exact path an alternate is at and returns it. In case of
  * error NULL is returned and the human readable error is added to `err`

From ece43d9dc70b1717484ee78b66aef4f9390c2b2b Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Mon, 3 Nov 2025 08:42:00 +0100
Subject: [PATCH 015/553] object-file: introduce `struct odb_source_loose`

Currently, all state that relates to loose objects is held directly by
the `struct odb_source`. Introduce a new `struct odb_source_loose` to
hold the state instead so that it is entirely self-contained.

This structure will eventually morph into the backend for accessing
loose objects. As such, this is part of the refactorings to introduce
pluggable object databases.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 object-file.c | 13 +++++++++++++
 object-file.h |  7 +++++++
 odb.c         |  2 ++
 odb.h         |  3 +++
 4 files changed, 25 insertions(+)

diff --git a/object-file.c b/object-file.c
index 4675c8ed6b67eb..cd6aa561fa7db2 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1995,3 +1995,16 @@ void object_file_transaction_commit(struct odb_transaction *transaction)
 	transaction->odb->transaction = NULL;
 	free(transaction);
 }
+
+struct odb_source_loose *odb_source_loose_new(struct odb_source *source)
+{
+	struct odb_source_loose *loose;
+	CALLOC_ARRAY(loose, 1);
+	loose->source = source;
+	return loose;
+}
+
+void odb_source_loose_free(struct odb_source_loose *loose)
+{
+	free(loose);
+}
diff --git a/object-file.h b/object-file.h
index 097e9764be169e..695a7e8e7c4b0b 100644
--- a/object-file.h
+++ b/object-file.h
@@ -18,6 +18,13 @@ int index_path(struct index_state *istate, struct object_id *oid, const char *pa
 
 struct odb_source;
 
+struct odb_source_loose {
+	struct odb_source *source;
+};
+
+struct odb_source_loose *odb_source_loose_new(struct odb_source *source);
+void odb_source_loose_free(struct odb_source_loose *loose);
+
 /*
  * Populate and return the loose object cache array corresponding to the
  * given object ID.
diff --git a/odb.c b/odb.c
index 77490d7fdbeb4b..2d06ab0bb85c27 100644
--- a/odb.c
+++ b/odb.c
@@ -151,6 +151,7 @@ struct odb_source *odb_source_new(struct object_database *odb,
 	source->odb = odb;
 	source->local = local;
 	source->path = xstrdup(path);
+	source->loose = odb_source_loose_new(source);
 
 	return source;
 }
@@ -368,6 +369,7 @@ struct odb_source *odb_set_temporary_primary_source(struct object_database *odb,
 static void odb_source_free(struct odb_source *source)
 {
 	free(source->path);
+	odb_source_loose_free(source->loose);
 	odb_clear_loose_cache(source);
 	loose_object_map_clear(&source->loose_map);
 	free(source);
diff --git a/odb.h b/odb.h
index 2346ffeca85961..49b398bedae640 100644
--- a/odb.h
+++ b/odb.h
@@ -48,6 +48,9 @@ struct odb_source {
 	/* Object database that owns this object source. */
 	struct object_database *odb;
 
+	/* Private state for loose objects. */
+	struct odb_source_loose *loose;
+
 	/*
 	 * Used to store the results of readdir(3) calls when we are OK
 	 * sacrificing accuracy due to races for speed. That includes

From 90a93f9dea88532623ef7422dbc21d8dc70a58dd Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Mon, 3 Nov 2025 08:42:01 +0100
Subject: [PATCH 016/553] object-file: move loose object cache into loose
 source

Our loose objects use a cache that (optionally) stores all objects for
each of the opened sharding directories. This cache is located in the
`struct odb_source`, but now that we have `struct odb_source_loose` it
makes sense to move it into the latter structure so that all state that
relates to loose objects is entirely self-contained.

Do so. While at it, rename corresponding functions to have a prefix that
relates to `struct odb_source_loose`.

Note that despite this prefix, the functions still accept a `struct
odb_source` as input. This is done intentionally: once we introduce
pluggable object databases, we will continue to accept this struct but
then do a cast inside these functions to `struct odb_source_loose`. This
design is similar to how we do it for our ref backends.
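
As a rough illustration of that future cast, here is a minimal sketch.
It assumes the loose backend embeds the generic source as its first
member, which is what makes the downcast valid. All `sketch_*` names are
hypothetical stand-ins and not part of this series:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the generic object source. */
struct sketch_source {
	const char *path;
};

/* Hypothetical loose backend that embeds the generic source. */
struct sketch_source_loose {
	struct sketch_source base; /* must stay the first member */
	int cache_entries;
};

/* Downcast from the generic source to the loose backend. */
static struct sketch_source_loose *sketch_loose_downcast(struct sketch_source *source)
{
	assert(source);
	/* Valid only because `base` is the first member of the struct. */
	return (struct sketch_source_loose *)source;
}

/* The entry point keeps accepting the generic source as input. */
static int sketch_source_loose_cache_size(struct sketch_source *source)
{
	struct sketch_source_loose *loose = sketch_loose_downcast(source);
	return loose->cache_entries;
}
```

Callers would then pass `&loose.base` and never need to know which
backend sits behind the generic source.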

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 loose.c       |  9 +++++----
 object-file.c | 35 +++++++++++++++++++----------------
 object-file.h | 16 ++++++++++++++--
 object-name.c |  2 +-
 odb.c         |  1 -
 odb.h         | 12 ------------
 6 files changed, 39 insertions(+), 36 deletions(-)

diff --git a/loose.c b/loose.c
index e8ea6e7e24ba31..8cc7573ff2b2d9 100644
--- a/loose.c
+++ b/loose.c
@@ -1,6 +1,7 @@
 #include "git-compat-util.h"
 #include "hash.h"
 #include "path.h"
+#include "object-file.h"
 #include "odb.h"
 #include "hex.h"
 #include "repository.h"
@@ -54,7 +55,7 @@ static int insert_loose_map(struct odb_source *source,
 	inserted |= insert_oid_pair(map->to_compat, oid, compat_oid);
 	inserted |= insert_oid_pair(map->to_storage, compat_oid, oid);
 	if (inserted)
-		oidtree_insert(source->loose_objects_cache, compat_oid);
+		oidtree_insert(source->loose->cache, compat_oid);
 
 	return inserted;
 }
@@ -66,9 +67,9 @@ static int load_one_loose_object_map(struct repository *repo, struct odb_source
 
 	if (!source->loose_map)
 		loose_object_map_init(&source->loose_map);
-	if (!source->loose_objects_cache) {
-		ALLOC_ARRAY(source->loose_objects_cache, 1);
-		oidtree_init(source->loose_objects_cache);
+	if (!source->loose->cache) {
+		ALLOC_ARRAY(source->loose->cache, 1);
+		oidtree_init(source->loose->cache);
 	}
 
 	insert_loose_map(source, repo->hash_algo->empty_tree, repo->compat_hash_algo->empty_tree);
diff --git a/object-file.c b/object-file.c
index cd6aa561fa7db2..fef00d6d3d082a 100644
--- a/object-file.c
+++ b/object-file.c
@@ -223,7 +223,7 @@ static int quick_has_loose(struct repository *r,
 
 	odb_prepare_alternates(r->objects);
 	for (source = r->objects->sources; source; source = source->next) {
-		if (oidtree_contains(odb_loose_cache(source, oid), oid))
+		if (oidtree_contains(odb_source_loose_cache(source, oid), oid))
 			return 1;
 	}
 	return 0;
@@ -1802,44 +1802,44 @@ static int append_loose_object(const struct object_id *oid,
 	return 0;
 }
 
-struct oidtree *odb_loose_cache(struct odb_source *source,
-				const struct object_id *oid)
+struct oidtree *odb_source_loose_cache(struct odb_source *source,
+				       const struct object_id *oid)
 {
 	int subdir_nr = oid->hash[0];
 	struct strbuf buf = STRBUF_INIT;
-	size_t word_bits = bitsizeof(source->loose_objects_subdir_seen[0]);
+	size_t word_bits = bitsizeof(source->loose->subdir_seen[0]);
 	size_t word_index = subdir_nr / word_bits;
 	size_t mask = (size_t)1u << (subdir_nr % word_bits);
 	uint32_t *bitmap;
 
 	if (subdir_nr < 0 ||
-	    (size_t) subdir_nr >= bitsizeof(source->loose_objects_subdir_seen))
+	    (size_t) subdir_nr >= bitsizeof(source->loose->subdir_seen))
 		BUG("subdir_nr out of range");
 
-	bitmap = &source->loose_objects_subdir_seen[word_index];
+	bitmap = &source->loose->subdir_seen[word_index];
 	if (*bitmap & mask)
-		return source->loose_objects_cache;
-	if (!source->loose_objects_cache) {
-		ALLOC_ARRAY(source->loose_objects_cache, 1);
-		oidtree_init(source->loose_objects_cache);
+		return source->loose->cache;
+	if (!source->loose->cache) {
+		ALLOC_ARRAY(source->loose->cache, 1);
+		oidtree_init(source->loose->cache);
 	}
 	strbuf_addstr(&buf, source->path);
 	for_each_file_in_obj_subdir(subdir_nr, &buf,
 				    source->odb->repo->hash_algo,
 				    append_loose_object,
 				    NULL, NULL,
-				    source->loose_objects_cache);
+				    source->loose->cache);
 	*bitmap |= mask;
 	strbuf_release(&buf);
-	return source->loose_objects_cache;
+	return source->loose->cache;
 }
 
 void odb_clear_loose_cache(struct odb_source *source)
 {
-	oidtree_clear(source->loose_objects_cache);
-	FREE_AND_NULL(source->loose_objects_cache);
-	memset(&source->loose_objects_subdir_seen, 0,
-	       sizeof(source->loose_objects_subdir_seen));
+	oidtree_clear(source->loose->cache);
+	FREE_AND_NULL(source->loose->cache);
+	memset(&source->loose->subdir_seen, 0,
+	       sizeof(source->loose->subdir_seen));
 }
 
 static int check_stream_oid(git_zstream *stream,
@@ -2006,5 +2006,8 @@ struct odb_source_loose *odb_source_loose_new(struct odb_source *source)
 
 void odb_source_loose_free(struct odb_source_loose *loose)
 {
+	if (!loose)
+		return;
+	odb_clear_loose_cache(loose->source);
 	free(loose);
 }
diff --git a/object-file.h b/object-file.h
index 695a7e8e7c4b0b..90da69cf5f7d59 100644
--- a/object-file.h
+++ b/object-file.h
@@ -20,6 +20,18 @@ struct odb_source;
 
 struct odb_source_loose {
 	struct odb_source *source;
+
+	/*
+	 * Used to store the results of readdir(3) calls when we are OK
+	 * sacrificing accuracy due to races for speed. That includes
+	 * object existence with OBJECT_INFO_QUICK, as well as
+	 * our search for unique abbreviated hashes. Don't use it for tasks
+	 * requiring greater accuracy!
+	 *
+	 * Be sure to call odb_load_loose_cache() before using.
+	 */
+	uint32_t subdir_seen[8]; /* 256 bits */
+	struct oidtree *cache;
 };
 
 struct odb_source_loose *odb_source_loose_new(struct odb_source *source);
@@ -29,8 +41,8 @@ void odb_source_loose_free(struct odb_source_loose *loose);
  * Populate and return the loose object cache array corresponding to the
  * given object ID.
  */
-struct oidtree *odb_loose_cache(struct odb_source *source,
-				const struct object_id *oid);
+struct oidtree *odb_source_loose_cache(struct odb_source *source,
+				       const struct object_id *oid);
 
 /* Empty the loose object cache for the specified object directory. */
 void odb_clear_loose_cache(struct odb_source *source);
diff --git a/object-name.c b/object-name.c
index 766c757042a389..8ce0ef7c23a6bd 100644
--- a/object-name.c
+++ b/object-name.c
@@ -116,7 +116,7 @@ static void find_short_object_filename(struct disambiguate_state *ds)
 	struct odb_source *source;
 
 	for (source = ds->repo->objects->sources; source && !ds->ambiguous; source = source->next)
-		oidtree_each(odb_loose_cache(source, &ds->bin_pfx),
+		oidtree_each(odb_source_loose_cache(source, &ds->bin_pfx),
 				&ds->bin_pfx, ds->len, match_prefix, ds);
 }
 
diff --git a/odb.c b/odb.c
index 2d06ab0bb85c27..87d84688c638cc 100644
--- a/odb.c
+++ b/odb.c
@@ -370,7 +370,6 @@ static void odb_source_free(struct odb_source *source)
 {
 	free(source->path);
 	odb_source_loose_free(source->loose);
-	odb_clear_loose_cache(source);
 	loose_object_map_clear(&source->loose_map);
 	free(source);
 }
diff --git a/odb.h b/odb.h
index 49b398bedae640..77104396afe4aa 100644
--- a/odb.h
+++ b/odb.h
@@ -51,18 +51,6 @@ struct odb_source {
 	/* Private state for loose objects. */
 	struct odb_source_loose *loose;
 
-	/*
-	 * Used to store the results of readdir(3) calls when we are OK
-	 * sacrificing accuracy due to races for speed. That includes
-	 * object existence with OBJECT_INFO_QUICK, as well as
-	 * our search for unique abbreviated hashes. Don't use it for tasks
-	 * requiring greater accuracy!
-	 *
-	 * Be sure to call odb_load_loose_cache() before using.
-	 */
-	uint32_t loose_objects_subdir_seen[8]; /* 256 bits */
-	struct oidtree *loose_objects_cache;
-
 	/* Map between object IDs for loose objects. */
 	struct loose_object_map *loose_map;
 

From be659c97eae3b68e38b71f0a67067dede23903b5 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Mon, 3 Nov 2025 08:42:02 +0100
Subject: [PATCH 017/553] object-file: hide internals when we need to reprepare
 loose sources

There are two different situations where we have to clear the cache of
loose objects:

  - When freeing the loose object source itself to avoid memory leaks.

  - When repreparing the loose object source so that any potentially
    stale data is evicted from the cache.

The former is already handled by `odb_source_loose_free()`. But the
latter case is still done manually in `odb_reprepare()`, so we are
leaking internals into that code.

Introduce a new `odb_source_loose_reprepare()` function as an equivalent
to `packfile_store_prepare()` to hide these implementation details.
Furthermore, while at it, rename the function `odb_clear_loose_cache()`
to `odb_source_loose_clear_cache()` and make it static.
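
The effect of repreparing can be sketched as follows. This is a
simplified model, assuming the cache only tracks which sharding
directories have been scanned. The `sketch_*` names and the `loads`
counter are hypothetical and exist only for illustration:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the loose cache state; not Git's real struct. */
struct sketch_loose {
	uint32_t subdir_seen[8]; /* 256 bits, one per first hash byte */
	unsigned int loads;      /* counts simulated directory scans */
};

/* Scan a sharding directory only the first time it is requested. */
static int sketch_loose_load_subdir(struct sketch_loose *loose,
				    unsigned char first_byte)
{
	size_t word_bits = sizeof(loose->subdir_seen[0]) * CHAR_BIT;
	size_t word_index = first_byte / word_bits;
	uint32_t mask = (uint32_t)1 << (first_byte % word_bits);

	assert(word_index < sizeof(loose->subdir_seen) / sizeof(loose->subdir_seen[0]));
	if (loose->subdir_seen[word_index] & mask)
		return 0; /* already cached, nothing to do */
	loose->loads++; /* a real implementation would read the directory here */
	loose->subdir_seen[word_index] |= mask;
	return 1;
}

/* Internal helper shared by the "free" and "reprepare" paths. */
static void sketch_loose_clear_cache(struct sketch_loose *loose)
{
	memset(loose->subdir_seen, 0, sizeof(loose->subdir_seen));
}

/* Public entry point: callers do not need to know what gets cleared. */
static void sketch_loose_reprepare(struct sketch_loose *loose)
{
	sketch_loose_clear_cache(loose);
}
```

After a reprepare, the next lookup for the same sharding directory scans
it again, so stale entries cannot survive.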

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 object-file.c | 17 +++++++++++------
 object-file.h |  6 +++---
 odb.c         |  2 +-
 3 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/object-file.c b/object-file.c
index fef00d6d3d082a..20daa629a1dda9 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1834,12 +1834,17 @@ struct oidtree *odb_source_loose_cache(struct odb_source *source,
 	return source->loose->cache;
 }
 
-void odb_clear_loose_cache(struct odb_source *source)
+static void odb_source_loose_clear_cache(struct odb_source_loose *loose)
 {
-	oidtree_clear(source->loose->cache);
-	FREE_AND_NULL(source->loose->cache);
-	memset(&source->loose->subdir_seen, 0,
-	       sizeof(source->loose->subdir_seen));
+	oidtree_clear(loose->cache);
+	FREE_AND_NULL(loose->cache);
+	memset(&loose->subdir_seen, 0,
+	       sizeof(loose->subdir_seen));
+}
+
+void odb_source_loose_reprepare(struct odb_source *source)
+{
+	odb_source_loose_clear_cache(source->loose);
 }
 
 static int check_stream_oid(git_zstream *stream,
@@ -2008,6 +2013,6 @@ void odb_source_loose_free(struct odb_source_loose *loose)
 {
 	if (!loose)
 		return;
-	odb_clear_loose_cache(loose->source);
+	odb_source_loose_clear_cache(loose);
 	free(loose);
 }
diff --git a/object-file.h b/object-file.h
index 90da69cf5f7d59..bec855e8e53f95 100644
--- a/object-file.h
+++ b/object-file.h
@@ -37,6 +37,9 @@ struct odb_source_loose {
 struct odb_source_loose *odb_source_loose_new(struct odb_source *source);
 void odb_source_loose_free(struct odb_source_loose *loose);
 
+/* Reprepare the loose source by emptying the loose object cache. */
+void odb_source_loose_reprepare(struct odb_source *source);
+
 /*
  * Populate and return the loose object cache array corresponding to the
  * given object ID.
@@ -44,9 +47,6 @@ void odb_source_loose_free(struct odb_source_loose *loose);
 struct oidtree *odb_source_loose_cache(struct odb_source *source,
 				       const struct object_id *oid);
 
-/* Empty the loose object cache for the specified object directory. */
-void odb_clear_loose_cache(struct odb_source *source);
-
 /*
  * Put in `buf` the name of the file in the local object database that
  * would be used to store a loose object with the specified oid.
diff --git a/odb.c b/odb.c
index 87d84688c638cc..b3e8d4a49cb07e 100644
--- a/odb.c
+++ b/odb.c
@@ -1071,7 +1071,7 @@ void odb_reprepare(struct object_database *o)
 	odb_prepare_alternates(o);
 
 	for (source = o->sources; source; source = source->next)
-		odb_clear_loose_cache(source);
+		odb_source_loose_reprepare(source);
 
 	o->approximate_object_count_valid = 0;
 

From 376016ec71c3a6c883f2ca77a3f1c0245fd60dc2 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Mon, 3 Nov 2025 08:42:03 +0100
Subject: [PATCH 018/553] object-file: move loose object map into loose source

The loose object map is used to map from the repository's canonical
object hash to the compatibility hash. As the name indicates, this map
is only used for loose objects, and as such it is tied to a specific
loose object source.

Same as with preceding commits, move this map into the loose object
source accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 loose.c       | 10 +++++-----
 object-file.c |  1 +
 object-file.h |  3 +++
 odb.c         |  1 -
 odb.h         |  3 ---
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/loose.c b/loose.c
index 8cc7573ff2b2d9..56cf64b648bf80 100644
--- a/loose.c
+++ b/loose.c
@@ -49,7 +49,7 @@ static int insert_loose_map(struct odb_source *source,
 			    const struct object_id *oid,
 			    const struct object_id *compat_oid)
 {
-	struct loose_object_map *map = source->loose_map;
+	struct loose_object_map *map = source->loose->map;
 	int inserted = 0;
 
 	inserted |= insert_oid_pair(map->to_compat, oid, compat_oid);
@@ -65,8 +65,8 @@ static int load_one_loose_object_map(struct repository *repo, struct odb_source
 	struct strbuf buf = STRBUF_INIT, path = STRBUF_INIT;
 	FILE *fp;
 
-	if (!source->loose_map)
-		loose_object_map_init(&source->loose_map);
+	if (!source->loose->map)
+		loose_object_map_init(&source->loose->map);
 	if (!source->loose->cache) {
 		ALLOC_ARRAY(source->loose->cache, 1);
 		oidtree_init(source->loose->cache);
@@ -125,7 +125,7 @@ int repo_read_loose_object_map(struct repository *repo)
 
 int repo_write_loose_object_map(struct repository *repo)
 {
-	kh_oid_map_t *map = repo->objects->sources->loose_map->to_compat;
+	kh_oid_map_t *map = repo->objects->sources->loose->map->to_compat;
 	struct lock_file lock;
 	int fd;
 	khiter_t iter;
@@ -231,7 +231,7 @@ int repo_loose_object_map_oid(struct repository *repo,
 	khiter_t pos;
 
 	for (source = repo->objects->sources; source; source = source->next) {
-		struct loose_object_map *loose_map = source->loose_map;
+		struct loose_object_map *loose_map = source->loose->map;
 		if (!loose_map)
 			continue;
 		map = (to == repo->compat_hash_algo) ?
diff --git a/object-file.c b/object-file.c
index 20daa629a1dda9..ccc67713fad38f 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2014,5 +2014,6 @@ void odb_source_loose_free(struct odb_source_loose *loose)
 	if (!loose)
 		return;
 	odb_source_loose_clear_cache(loose);
+	loose_object_map_clear(&loose->map);
 	free(loose);
 }
diff --git a/object-file.h b/object-file.h
index bec855e8e53f95..f8a96a45f57703 100644
--- a/object-file.h
+++ b/object-file.h
@@ -32,6 +32,9 @@ struct odb_source_loose {
 	 */
 	uint32_t subdir_seen[8]; /* 256 bits */
 	struct oidtree *cache;
+
+	/* Map between object IDs for loose objects. */
+	struct loose_object_map *map;
 };
 
 struct odb_source_loose *odb_source_loose_new(struct odb_source *source);
diff --git a/odb.c b/odb.c
index b3e8d4a49cb07e..d1df9609e21d1e 100644
--- a/odb.c
+++ b/odb.c
@@ -370,7 +370,6 @@ static void odb_source_free(struct odb_source *source)
 {
 	free(source->path);
 	odb_source_loose_free(source->loose);
-	loose_object_map_clear(&source->loose_map);
 	free(source);
 }
 
diff --git a/odb.h b/odb.h
index 77104396afe4aa..f9a3137a34aa3b 100644
--- a/odb.h
+++ b/odb.h
@@ -51,9 +51,6 @@ struct odb_source {
 	/* Private state for loose objects. */
 	struct odb_source_loose *loose;
 
-	/* Map between object IDs for loose objects. */
-	struct loose_object_map *loose_map;
-
 	/*
 	 * private data
 	 *

From ff7ad5cb3936514ec0be32531ff6274b53dbe091 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Mon, 3 Nov 2025 08:42:04 +0100
Subject: [PATCH 019/553] object-file: read objects via the loose object source

When reading an object via `loose_object_info()` or `map_loose_object()`
we hand in the whole repository. We then iterate through each of the
object sources to figure out whether that source has the object in
question.

This logic reverses responsibilities though: a specific backend should
only care about one specific source, while the object sources themselves
are managed by the object database.

Refactor the code accordingly by passing an object source to both of
these functions instead. The different sources are then handled by
either `do_oid_object_info_extended()`, which sits on the object
database level, or by `open_istream_loose()`. The latter function
arguably is still at the wrong level, but this will be cleaned up at a
later point in time.
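
The resulting split of responsibilities can be sketched roughly like
this. The types and `sketch_*` functions are hypothetical and model each
source as a plain list of object names rather than anything Git does:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one object source in a linked list. */
struct sketch_source {
	struct sketch_source *next;
	const char *const *objects; /* NULL-terminated list of object names */
};

/* Backend level: inspects exactly one source. Returns 0 when found. */
static int sketch_source_has_object(const struct sketch_source *source,
				    const char *name)
{
	assert(source && name);
	for (size_t i = 0; source->objects[i]; i++)
		if (!strcmp(source->objects[i], name))
			return 0;
	return -1;
}

/* Database level: owns the iteration over all sources. */
static int sketch_odb_has_object(const struct sketch_source *sources,
				 const char *name)
{
	for (const struct sketch_source *s = sources; s; s = s->next)
		if (!sketch_source_has_object(s, name))
			return 0;
	return -1;
}
```

The per-source function no longer knows about alternates at all, and
only the database-level caller decides in which order sources are
consulted.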

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 object-file.c | 68 +++++++++++++++++++--------------------------------
 object-file.h | 15 ++++++------
 odb.c         |  9 +++++--
 streaming.c   | 11 ++++++++-
 4 files changed, 50 insertions(+), 53 deletions(-)

diff --git a/object-file.c b/object-file.c
index ccc67713fad38f..6d6e9a5a2ad3c8 100644
--- a/object-file.c
+++ b/object-file.c
@@ -167,25 +167,22 @@ int stream_object_signature(struct repository *r, const struct object_id *oid)
 }
 
 /*
- * Find "oid" as a loose object in the local repository or in an alternate.
+ * Find "oid" as a loose object in given source.
  * Returns 0 on success, negative on failure.
  *
  * The "path" out-parameter will give the path of the object we found (if any).
  * Note that it may point to static storage and is only valid until another
  * call to stat_loose_object().
  */
-static int stat_loose_object(struct repository *r, const struct object_id *oid,
+static int stat_loose_object(struct odb_source_loose *loose,
+			     const struct object_id *oid,
 			     struct stat *st, const char **path)
 {
-	struct odb_source *source;
 	static struct strbuf buf = STRBUF_INIT;
 
-	odb_prepare_alternates(r->objects);
-	for (source = r->objects->sources; source; source = source->next) {
-		*path = odb_loose_path(source, &buf, oid);
-		if (!lstat(*path, st))
-			return 0;
-	}
+	*path = odb_loose_path(loose->source, &buf, oid);
+	if (!lstat(*path, st))
+		return 0;
 
 	return -1;
 }
@@ -194,39 +191,24 @@ static int stat_loose_object(struct repository *r, const struct object_id *oid,
  * Like stat_loose_object(), but actually open the object and return the
  * descriptor. See the caveats on the "path" parameter above.
  */
-static int open_loose_object(struct repository *r,
+static int open_loose_object(struct odb_source_loose *loose,
 			     const struct object_id *oid, const char **path)
 {
-	int fd;
-	struct odb_source *source;
-	int most_interesting_errno = ENOENT;
 	static struct strbuf buf = STRBUF_INIT;
+	int fd;
 
-	odb_prepare_alternates(r->objects);
-	for (source = r->objects->sources; source; source = source->next) {
-		*path = odb_loose_path(source, &buf, oid);
-		fd = git_open(*path);
-		if (fd >= 0)
-			return fd;
+	*path = odb_loose_path(loose->source, &buf, oid);
+	fd = git_open(*path);
+	if (fd >= 0)
+		return fd;
 
-		if (most_interesting_errno == ENOENT)
-			most_interesting_errno = errno;
-	}
-	errno = most_interesting_errno;
 	return -1;
 }
 
-static int quick_has_loose(struct repository *r,
+static int quick_has_loose(struct odb_source_loose *loose,
 			   const struct object_id *oid)
 {
-	struct odb_source *source;
-
-	odb_prepare_alternates(r->objects);
-	for (source = r->objects->sources; source; source = source->next) {
-		if (oidtree_contains(odb_source_loose_cache(source, oid), oid))
-			return 1;
-	}
-	return 0;
+	return !!oidtree_contains(odb_source_loose_cache(loose->source, oid), oid);
 }
 
 /*
@@ -252,12 +234,12 @@ static void *map_fd(int fd, const char *path, unsigned long *size)
 	return map;
 }
 
-void *map_loose_object(struct repository *r,
-		       const struct object_id *oid,
-		       unsigned long *size)
+void *odb_source_loose_map_object(struct odb_source *source,
+				  const struct object_id *oid,
+				  unsigned long *size)
 {
 	const char *p;
-	int fd = open_loose_object(r, oid, &p);
+	int fd = open_loose_object(source->loose, oid, &p);
 
 	if (fd < 0)
 		return NULL;
@@ -407,9 +389,9 @@ int parse_loose_header(const char *hdr, struct object_info *oi)
 	return 0;
 }
 
-int loose_object_info(struct repository *r,
-		      const struct object_id *oid,
-		      struct object_info *oi, int flags)
+int odb_source_loose_read_object_info(struct odb_source *source,
+				      const struct object_id *oid,
+				      struct object_info *oi, int flags)
 {
 	int status = 0;
 	int fd;
@@ -422,7 +404,7 @@ int loose_object_info(struct repository *r,
 	enum object_type type_scratch;
 
 	if (oi->delta_base_oid)
-		oidclr(oi->delta_base_oid, r->hash_algo);
+		oidclr(oi->delta_base_oid, source->odb->repo->hash_algo);
 
 	/*
 	 * If we don't care about type or size, then we don't
@@ -435,15 +417,15 @@ int loose_object_info(struct repository *r,
 	if (!oi->typep && !oi->sizep && !oi->contentp) {
 		struct stat st;
 		if (!oi->disk_sizep && (flags & OBJECT_INFO_QUICK))
-			return quick_has_loose(r, oid) ? 0 : -1;
-		if (stat_loose_object(r, oid, &st, &path) < 0)
+			return quick_has_loose(source->loose, oid) ? 0 : -1;
+		if (stat_loose_object(source->loose, oid, &st, &path) < 0)
 			return -1;
 		if (oi->disk_sizep)
 			*oi->disk_sizep = st.st_size;
 		return 0;
 	}
 
-	fd = open_loose_object(r, oid, &path);
+	fd = open_loose_object(source->loose, oid, &path);
 	if (fd < 0) {
 		if (errno != ENOENT)
 			error_errno(_("unable to open loose object %s"), oid_to_hex(oid));
diff --git a/object-file.h b/object-file.h
index f8a96a45f57703..ca13d3d64e722b 100644
--- a/object-file.h
+++ b/object-file.h
@@ -43,6 +43,14 @@ void odb_source_loose_free(struct odb_source_loose *loose);
 /* Reprepare the loose source by emptying the loose object cache. */
 void odb_source_loose_reprepare(struct odb_source *source);
 
+int odb_source_loose_read_object_info(struct odb_source *source,
+				      const struct object_id *oid,
+				      struct object_info *oi, int flags);
+
+void *odb_source_loose_map_object(struct odb_source *source,
+				  const struct object_id *oid,
+				  unsigned long *size);
+
 /*
  * Populate and return the loose object cache array corresponding to the
  * given object ID.
@@ -66,9 +74,6 @@ const char *odb_loose_path(struct odb_source *source,
 int has_loose_object(struct odb_source *source,
 		     const struct object_id *oid);
 
-void *map_loose_object(struct repository *r, const struct object_id *oid,
-		       unsigned long *size);
-
 /*
  * Iterate over the files in the loose-object parts of the object
  * directory "path", triggering the following callbacks:
@@ -196,10 +201,6 @@ int check_object_signature(struct repository *r, const struct object_id *oid,
  */
 int stream_object_signature(struct repository *r, const struct object_id *oid);
 
-int loose_object_info(struct repository *r,
-		      const struct object_id *oid,
-		      struct object_info *oi, int flags);
-
 enum finalize_object_file_flags {
 	FOF_SKIP_COLLISION_CHECK = 1,
 };
diff --git a/odb.c b/odb.c
index d1df9609e21d1e..4c0b4fdcd54ce1 100644
--- a/odb.c
+++ b/odb.c
@@ -697,13 +697,18 @@ static int do_oid_object_info_extended(struct object_database *odb,
 		return 0;
 	}
 
+	odb_prepare_alternates(odb);
+
 	while (1) {
+		struct odb_source *source;
+
 		if (find_pack_entry(odb->repo, real, &e))
 			break;
 
 		/* Most likely it's a loose object. */
-		if (!loose_object_info(odb->repo, real, oi, flags))
-			return 0;
+		for (source = odb->sources; source; source = source->next)
+			if (!odb_source_loose_read_object_info(source, real, oi, flags))
+				return 0;
 
 		/* Not a loose object; someone else may have just packed it. */
 		if (!(flags & OBJECT_INFO_QUICK)) {
diff --git a/streaming.c b/streaming.c
index 4b13827668e67a..00ad649ae397f3 100644
--- a/streaming.c
+++ b/streaming.c
@@ -230,12 +230,21 @@ static int open_istream_loose(struct git_istream *st, struct repository *r,
 			      enum object_type *type)
 {
 	struct object_info oi = OBJECT_INFO_INIT;
+	struct odb_source *source;
+
 	oi.sizep = &st->size;
 	oi.typep = type;
 
-	st->u.loose.mapped = map_loose_object(r, oid, &st->u.loose.mapsize);
+	odb_prepare_alternates(r->objects);
+	for (source = r->objects->sources; source; source = source->next) {
+		st->u.loose.mapped = odb_source_loose_map_object(source, oid,
+								 &st->u.loose.mapsize);
+		if (st->u.loose.mapped)
+			break;
+	}
 	if (!st->u.loose.mapped)
 		return -1;
+
 	switch (unpack_loose_header(&st->z, st->u.loose.mapped,
 				    st->u.loose.mapsize, st->u.loose.hdr,
 				    sizeof(st->u.loose.hdr))) {

From 05130c6c9eed9ff7450e9067d7215032eb914c10 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Mon, 3 Nov 2025 08:42:05 +0100
Subject: [PATCH 020/553] object-file: rename `has_loose_object()`

Rename `has_loose_object()` to `odb_source_loose_has_object()` so that
it becomes clear that this is tied to a specific loose object source.
This matches our modern naming schema for functions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/pack-objects.c |  4 ++--
 object-file.c          |  6 +++---
 object-file.h          | 16 ++++++++--------
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index b5454e5df137b4..69e80b1443a9b7 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1716,7 +1716,7 @@ static int want_object_in_pack_mtime(const struct object_id *oid,
 		 */
 		struct odb_source *source = the_repository->objects->sources->next;
 		for (; source; source = source->next)
-			if (has_loose_object(source, oid))
+			if (odb_source_loose_has_object(source, oid))
 				return 0;
 	}
 
@@ -3978,7 +3978,7 @@ static void add_cruft_object_entry(const struct object_id *oid, enum object_type
 			int found = 0;
 
 			for (; !found && source; source = source->next)
-				if (has_loose_object(source, oid))
+				if (odb_source_loose_has_object(source, oid))
 					found = 1;
 
 			/*
diff --git a/object-file.c b/object-file.c
index 6d6e9a5a2ad3c8..79e7ab8d2e3d0e 100644
--- a/object-file.c
+++ b/object-file.c
@@ -99,8 +99,8 @@ static int check_and_freshen_source(struct odb_source *source,
 	return check_and_freshen_file(path.buf, freshen);
 }
 
-int has_loose_object(struct odb_source *source,
-		     const struct object_id *oid)
+int odb_source_loose_has_object(struct odb_source *source,
+				const struct object_id *oid)
 {
 	return check_and_freshen_source(source, oid, 0);
 }
@@ -1161,7 +1161,7 @@ int force_object_loose(struct odb_source *source,
 	int ret;
 
 	for (struct odb_source *s = source->odb->sources; s; s = s->next)
-		if (has_loose_object(s, oid))
+		if (odb_source_loose_has_object(s, oid))
 			return 0;
 
 	oi.typep = &type;
diff --git a/object-file.h b/object-file.h
index ca13d3d64e722b..065a44bb8a019e 100644
--- a/object-file.h
+++ b/object-file.h
@@ -51,6 +51,14 @@ void *odb_source_loose_map_object(struct odb_source *source,
 				  const struct object_id *oid,
 				  unsigned long *size);
 
+/*
+ * Return true iff an object database source has a loose object
+ * with the specified name.  This function does not respect replace
+ * references.
+ */
+int odb_source_loose_has_object(struct odb_source *source,
+				const struct object_id *oid);
+
 /*
  * Populate and return the loose object cache array corresponding to the
  * given object ID.
@@ -66,14 +74,6 @@ const char *odb_loose_path(struct odb_source *source,
 			   struct strbuf *buf,
 			   const struct object_id *oid);
 
-/*
- * Return true iff an object database source has a loose object
- * with the specified name.  This function does not respect replace
- * references.
- */
-int has_loose_object(struct odb_source *source,
-		     const struct object_id *oid);
-
 /*
  * Iterate over the files in the loose-object parts of the object
  * directory "path", triggering the following callbacks:

From f2bd88a308a2754e727cb462e03102307cdfe004 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Mon, 3 Nov 2025 08:42:06 +0100
Subject: [PATCH 021/553] object-file: refactor freshening of objects

When writing an object that already exists in our object database we
skip the write and instead only update mtimes of the object, either in
its packed or loose object format. This logic is wholly contained in
"object-file.c", but that file is really only concerned with loose
objects. So it does not really make sense that it also contains the
logic to freshen a packed object.

Introduce a new `odb_freshen_object()` function that sits on the object
database level and two functions `packfile_store_freshen_object()` and
`odb_source_loose_freshen_object()`. This way, the format-specific
functions can be part of their respective subsystems, while the backend
agnostic function to freshen an object sits at the object database
layer.

Note that this change also moves the logic that iterates through object
sources from the object source layer into the object database layer.
This change is intentional: object sources should ideally only have to
worry about themselves, and coordination of different sources should be
handled on the object database level.
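
The layering described above can be sketched roughly as follows. This is
a toy model and not Git's code: every `toy_*` name is hypothetical, and
object IDs are plain strings to keep the example short. It only shows
that the format-specific helpers know about a single backend, while the
database-level function coordinates the packed store and all sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; these are not Git's real structures. */
struct toy_source {
	const char *loose_ids[4];   /* IDs of loose objects in this source */
	int freshened;              /* counts simulated mtime updates */
	struct toy_source *next;
};

struct toy_odb {
	const char *packed_ids[4];  /* IDs of objects stored in packfiles */
	int pack_freshened;         /* counts simulated mtime updates */
	struct toy_source *sources;
};

static int has_id(const char *const *ids, size_t n, const char *id)
{
	for (size_t i = 0; i < n && ids[i]; i++)
		if (!strcmp(ids[i], id))
			return 1;
	return 0;
}

/* Format-specific: only knows about packed objects. */
static int toy_pack_freshen(struct toy_odb *odb, const char *id)
{
	if (!has_id(odb->packed_ids, 4, id))
		return 0;
	odb->pack_freshened++;
	return 1;
}

/* Format-specific: only knows about one loose object source. */
static int toy_loose_freshen(struct toy_source *source, const char *id)
{
	if (!has_id(source->loose_ids, 4, id))
		return 0;
	source->freshened++;
	return 1;
}

/* Backend-agnostic: tries the packed store first, then every source. */
static int toy_odb_freshen(struct toy_odb *odb, const char *id)
{
	if (toy_pack_freshen(odb, id))
		return 1;
	for (struct toy_source *s = odb->sources; s; s = s->next)
		if (toy_loose_freshen(s, id))
			return 1;
	return 0;
}
```

A caller only ever invokes `toy_odb_freshen()`; which backend ends up
holding the object stays an implementation detail of the lower layers.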

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 object-file.c | 33 +++++----------------------------
 object-file.h |  3 +++
 odb.c         | 16 ++++++++++++++++
 odb.h         |  3 +++
 packfile.c    | 16 ++++++++++++++++
 packfile.h    |  3 +++
 6 files changed, 46 insertions(+), 28 deletions(-)

diff --git a/object-file.c b/object-file.c
index 79e7ab8d2e3d0e..893c32adcddbbd 100644
--- a/object-file.c
+++ b/object-file.c
@@ -968,30 +968,10 @@ static int write_loose_object(struct odb_source *source,
 					  FOF_SKIP_COLLISION_CHECK);
 }
 
-static int freshen_loose_object(struct object_database *odb,
-				const struct object_id *oid)
+int odb_source_loose_freshen_object(struct odb_source *source,
+				    const struct object_id *oid)
 {
-	odb_prepare_alternates(odb);
-	for (struct odb_source *source = odb->sources; source; source = source->next)
-		if (check_and_freshen_source(source, oid, 1))
-			return 1;
-	return 0;
-}
-
-static int freshen_packed_object(struct object_database *odb,
-				 const struct object_id *oid)
-{
-	struct pack_entry e;
-	if (!find_pack_entry(odb->repo, oid, &e))
-		return 0;
-	if (e.p->is_cruft)
-		return 0;
-	if (e.p->freshened)
-		return 1;
-	if (!freshen_file(e.p->pack_name))
-		return 0;
-	e.p->freshened = 1;
-	return 1;
+	return !!check_and_freshen_source(source, oid, 1);
 }
 
 int stream_loose_object(struct odb_source *source,
@@ -1073,12 +1053,10 @@ int stream_loose_object(struct odb_source *source,
 		die(_("deflateEnd on stream object failed (%d)"), ret);
 	close_loose_object(source, fd, tmp_file.buf);
 
-	if (freshen_packed_object(source->odb, oid) ||
-	    freshen_loose_object(source->odb, oid)) {
+	if (odb_freshen_object(source->odb, oid)) {
 		unlink_or_warn(tmp_file.buf);
 		goto cleanup;
 	}
-
 	odb_loose_path(source, &filename, oid);
 
 	/* We finally know the object path, and create the missing dir. */
@@ -1137,8 +1115,7 @@ int write_object_file(struct odb_source *source,
 	 * it out into .git/objects/??/?{38} file.
 	 */
 	write_object_file_prepare(algo, buf, len, type, oid, hdr, &hdrlen);
-	if (freshen_packed_object(source->odb, oid) ||
-	    freshen_loose_object(source->odb, oid))
+	if (odb_freshen_object(source->odb, oid))
 		return 0;
 	if (write_loose_object(source, oid, hdr, hdrlen, buf, len, 0, flags))
 		return -1;
diff --git a/object-file.h b/object-file.h
index 065a44bb8a019e..ee5b24cec66c34 100644
--- a/object-file.h
+++ b/object-file.h
@@ -59,6 +59,9 @@ void *odb_source_loose_map_object(struct odb_source *source,
 int odb_source_loose_has_object(struct odb_source *source,
 				const struct object_id *oid);
 
+int odb_source_loose_freshen_object(struct odb_source *source,
+				    const struct object_id *oid);
+
 /*
  * Populate and return the loose object cache array corresponding to the
  * given object ID.
diff --git a/odb.c b/odb.c
index 4c0b4fdcd54ce1..17734bdaffe8e6 100644
--- a/odb.c
+++ b/odb.c
@@ -987,6 +987,22 @@ int odb_has_object(struct object_database *odb, const struct object_id *oid,
 	return odb_read_object_info_extended(odb, oid, NULL, object_info_flags) >= 0;
 }
 
+int odb_freshen_object(struct object_database *odb,
+		       const struct object_id *oid)
+{
+	struct odb_source *source;
+
+	if (packfile_store_freshen_object(odb->packfiles, oid))
+		return 1;
+
+	odb_prepare_alternates(odb);
+	for (source = odb->sources; source; source = source->next)
+		if (odb_source_loose_freshen_object(source, oid))
+			return 1;
+
+	return 0;
+}
+
 void odb_assert_oid_type(struct object_database *odb,
 			 const struct object_id *oid, enum object_type expect)
 {
diff --git a/odb.h b/odb.h
index f9a3137a34aa3b..2653247e0cc871 100644
--- a/odb.h
+++ b/odb.h
@@ -396,6 +396,9 @@ int odb_has_object(struct object_database *odb,
 		   const struct object_id *oid,
 		   unsigned flags);
 
+int odb_freshen_object(struct object_database *odb,
+		       const struct object_id *oid);
+
 void odb_assert_oid_type(struct object_database *odb,
 			 const struct object_id *oid, enum object_type expect);
 
diff --git a/packfile.c b/packfile.c
index 1ae2b2fe1eda77..40f733dd234900 100644
--- a/packfile.c
+++ b/packfile.c
@@ -819,6 +819,22 @@ struct packed_git *packfile_store_load_pack(struct packfile_store *store,
 	return p;
 }
 
+int packfile_store_freshen_object(struct packfile_store *store,
+				  const struct object_id *oid)
+{
+	struct pack_entry e;
+	if (!find_pack_entry(store->odb->repo, oid, &e))
+		return 0;
+	if (e.p->is_cruft)
+		return 0;
+	if (e.p->freshened)
+		return 1;
+	if (utime(e.p->pack_name, NULL))
+		return 0;
+	e.p->freshened = 1;
+	return 1;
+}
+
 void (*report_garbage)(unsigned seen_bits, const char *path);
 
 static void report_helper(const struct string_list *list,
diff --git a/packfile.h b/packfile.h
index c9d0b93446b5f5..58fcc88e20224b 100644
--- a/packfile.h
+++ b/packfile.h
@@ -163,6 +163,9 @@ struct list_head *packfile_store_get_packs_mru(struct packfile_store *store);
 struct packed_git *packfile_store_load_pack(struct packfile_store *store,
 					    const char *idx_path, int local);
 
+int packfile_store_freshen_object(struct packfile_store *store,
+				  const struct object_id *oid);
+
 struct pack_window {
 	struct pack_window *next;
 	unsigned char *base;

From bfb1b2b4ac5cfa99f7d2503b404d282714d84bdf Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Mon, 3 Nov 2025 08:42:07 +0100
Subject: [PATCH 022/553] object-file: rename `write_object_file()`

Rename `write_object_file()` to `odb_source_loose_write_object()` so
that it becomes clear that this is tied to a specific loose object
source. This matches our modern naming scheme for functions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 object-file.c |  8 ++++----
 object-file.h | 10 +++++-----
 odb.c         |  3 ++-
 3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/object-file.c b/object-file.c
index 893c32adcddbbd..fdc644a4275373 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1084,10 +1084,10 @@ int stream_loose_object(struct odb_source *source,
 	return err;
 }
 
-int write_object_file(struct odb_source *source,
-		      const void *buf, unsigned long len,
-		      enum object_type type, struct object_id *oid,
-		      struct object_id *compat_oid_in, unsigned flags)
+int odb_source_loose_write_object(struct odb_source *source,
+				  const void *buf, unsigned long len,
+				  enum object_type type, struct object_id *oid,
+				  struct object_id *compat_oid_in, unsigned flags)
 {
 	const struct git_hash_algo *algo = source->odb->repo->hash_algo;
 	const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo;
diff --git a/object-file.h b/object-file.h
index ee5b24cec66c34..36a60e15c40547 100644
--- a/object-file.h
+++ b/object-file.h
@@ -62,6 +62,11 @@ int odb_source_loose_has_object(struct odb_source *source,
 int odb_source_loose_freshen_object(struct odb_source *source,
 				    const struct object_id *oid);
 
+int odb_source_loose_write_object(struct odb_source *source,
+				  const void *buf, unsigned long len,
+				  enum object_type type, struct object_id *oid,
+				  struct object_id *compat_oid_in, unsigned flags);
+
 /*
  * Populate and return the loose object cache array corresponding to the
  * given object ID.
@@ -168,11 +173,6 @@ enum unpack_loose_header_result unpack_loose_header(git_zstream *stream,
 struct object_info;
 int parse_loose_header(const char *hdr, struct object_info *oi);
 
-int write_object_file(struct odb_source *source,
-		      const void *buf, unsigned long len,
-		      enum object_type type, struct object_id *oid,
-		      struct object_id *compat_oid_in, unsigned flags);
-
 struct input_stream {
 	const void *(*read)(struct input_stream *, unsigned long *len);
 	void *data;
diff --git a/odb.c b/odb.c
index 17734bdaffe8e6..da44f1d63b43b2 100644
--- a/odb.c
+++ b/odb.c
@@ -1021,7 +1021,8 @@ int odb_write_object_ext(struct object_database *odb,
 			 struct object_id *compat_oid,
 			 unsigned flags)
 {
-	return write_object_file(odb->sources, buf, len, type, oid, compat_oid, flags);
+	return odb_source_loose_write_object(odb->sources, buf, len, type,
+					     oid, compat_oid, flags);
 }
 
 struct object_database *odb_new(struct repository *repo)

From 3e5e360888316ed1a44da69bf134bb6ec70aee1b Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Mon, 3 Nov 2025 08:42:08 +0100
Subject: [PATCH 023/553] object-file: refactor writing objects via a stream

We have two different ways to write an object into the database:

  - We either provide the full buffer and write the object all at once.

  - Or we provide an input stream that has a `read()` function so that
    we can chunk the object.

The latter is especially used for large objects, where it may be too
expensive to hold the complete object in memory all at once.

While we already have `odb_write_object()` at the ODB-layer, we don't
have an equivalent for streaming an object. Introduce a new function
`odb_write_object_stream()` to address this gap so that callers don't
have to be aware of the inner workings of how to stream an object to
disk with a specific object source.

Rename `stream_loose_object()` to `odb_source_loose_write_stream()` to
clarify its scope. This matches our modern best practices around how to
name functions.
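
To illustrate the streaming interface, here is a minimal sketch of a
`read()` callback that hands out an in-memory buffer in fixed-size
chunks. The struct mirrors the shape of the stream structure in this
patch, but all `toy_*` names, the chunking state and the draining loop
are assumptions made for the example and not part of Git.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the stream struct in this patch; the name is a stand-in. */
struct toy_write_stream {
	const void *(*read)(struct toy_write_stream *, unsigned long *len);
	void *data;
	int is_finished;
};

/* Hypothetical reader state; `chunk` is assumed to be greater than zero. */
struct toy_chunk_state {
	const char *buf;
	size_t len;    /* total number of bytes in buf */
	size_t pos;    /* number of bytes already handed out */
	size_t chunk;  /* maximum number of bytes per read() call */
};

/* Return the next chunk and flag the stream once the buffer is exhausted. */
static const void *toy_read_chunk(struct toy_write_stream *st,
				  unsigned long *len)
{
	struct toy_chunk_state *s = st->data;
	const char *out = s->buf + s->pos;
	size_t n = s->len - s->pos;

	if (n > s->chunk)
		n = s->chunk;
	s->pos += n;
	if (s->pos == s->len)
		st->is_finished = 1;
	*len = n;
	return out;
}

/* Toy consumer: drains the stream and returns the total bytes seen. */
static size_t toy_drain(struct toy_write_stream *st)
{
	size_t total = 0;

	while (!st->is_finished) {
		unsigned long n = 0;

		(void)st->read(st, &n);
		total += n;
	}
	return total;
}
```

The consumer never sees the whole object at once, which is the property
that makes this interface suitable for large objects.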

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/unpack-objects.c |  7 +++----
 object-file.c            |  6 +++---
 object-file.h            | 14 ++++----------
 odb.c                    |  7 +++++++
 odb.h                    | 10 ++++++++++
 5 files changed, 27 insertions(+), 17 deletions(-)

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index ef79e43715d362..6fc64e9e4b8d5a 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -363,7 +363,7 @@ struct input_zstream_data {
 	int status;
 };
 
-static const void *feed_input_zstream(struct input_stream *in_stream,
+static const void *feed_input_zstream(struct odb_write_stream *in_stream,
 				      unsigned long *readlen)
 {
 	struct input_zstream_data *data = in_stream->data;
@@ -393,7 +393,7 @@ static void stream_blob(unsigned long size, unsigned nr)
 {
 	git_zstream zstream = { 0 };
 	struct input_zstream_data data = { 0 };
-	struct input_stream in_stream = {
+	struct odb_write_stream in_stream = {
 		.read = feed_input_zstream,
 		.data = &data,
 	};
@@ -402,8 +402,7 @@ static void stream_blob(unsigned long size, unsigned nr)
 	data.zstream = &zstream;
 	git_inflate_init(&zstream);
 
-	if (stream_loose_object(the_repository->objects->sources,
-				&in_stream, size, &info->oid))
+	if (odb_write_object_stream(the_repository->objects, &in_stream, size, &info->oid))
 		die(_("failed to write object in stream"));
 
 	if (data.status != Z_STREAM_END)
diff --git a/object-file.c b/object-file.c
index fdc644a4275373..811c569ed36aa4 100644
--- a/object-file.c
+++ b/object-file.c
@@ -974,9 +974,9 @@ int odb_source_loose_freshen_object(struct odb_source *source,
 	return !!check_and_freshen_source(source, oid, 1);
 }
 
-int stream_loose_object(struct odb_source *source,
-			struct input_stream *in_stream, size_t len,
-			struct object_id *oid)
+int odb_source_loose_write_stream(struct odb_source *source,
+				  struct odb_write_stream *in_stream, size_t len,
+				  struct object_id *oid)
 {
 	const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo;
 	struct object_id compat_oid;
diff --git a/object-file.h b/object-file.h
index 36a60e15c40547..eeffa67bbda631 100644
--- a/object-file.h
+++ b/object-file.h
@@ -67,6 +67,10 @@ int odb_source_loose_write_object(struct odb_source *source,
 				  enum object_type type, struct object_id *oid,
 				  struct object_id *compat_oid_in, unsigned flags);
 
+int odb_source_loose_write_stream(struct odb_source *source,
+				  struct odb_write_stream *stream, size_t len,
+				  struct object_id *oid);
+
 /*
  * Populate and return the loose object cache array corresponding to the
  * given object ID.
@@ -173,16 +177,6 @@ enum unpack_loose_header_result unpack_loose_header(git_zstream *stream,
 struct object_info;
 int parse_loose_header(const char *hdr, struct object_info *oi);
 
-struct input_stream {
-	const void *(*read)(struct input_stream *, unsigned long *len);
-	void *data;
-	int is_finished;
-};
-
-int stream_loose_object(struct odb_source *source,
-			struct input_stream *in_stream, size_t len,
-			struct object_id *oid);
-
 int force_object_loose(struct odb_source *source,
 		       const struct object_id *oid, time_t mtime);
 
diff --git a/odb.c b/odb.c
index da44f1d63b43b2..3ec21ef24e16bb 100644
--- a/odb.c
+++ b/odb.c
@@ -1025,6 +1025,13 @@ int odb_write_object_ext(struct object_database *odb,
 					     oid, compat_oid, flags);
 }
 
+int odb_write_object_stream(struct object_database *odb,
+			    struct odb_write_stream *stream, size_t len,
+			    struct object_id *oid)
+{
+	return odb_source_loose_write_stream(odb->sources, stream, len, oid);
+}
+
 struct object_database *odb_new(struct repository *repo)
 {
 	struct object_database *o = xmalloc(sizeof(*o));
diff --git a/odb.h b/odb.h
index 2653247e0cc871..9bb28008b1d953 100644
--- a/odb.h
+++ b/odb.h
@@ -492,4 +492,14 @@ static inline int odb_write_object(struct object_database *odb,
 	return odb_write_object_ext(odb, buf, len, type, oid, NULL, 0);
 }
 
+struct odb_write_stream {
+	const void *(*read)(struct odb_write_stream *, unsigned long *len);
+	void *data;
+	int is_finished;
+};
+
+int odb_write_object_stream(struct object_database *odb,
+			    struct odb_write_stream *stream, size_t len,
+			    struct object_id *oid);
+
 #endif /* ODB_H */

From bdbebe5714b25dc9d215b48efbb80f410925d7dd Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 23 Oct 2025 09:16:10 +0200
Subject: [PATCH 024/553] refs: introduce wrapper struct for `each_ref_fn`

The `each_ref_fn` callback function type is used across our code base
for several different functions that iterate through references. There's
a bunch of callbacks implementing this type, which makes any changes to
the callback signature extremely noisy. An example of the required churn
is e8207717f1 (refs: add referent to each_ref_fn, 2024-08-09): adding a
single argument required us to change 48 files.

It was already proposed back then [1] that we might want to introduce a
wrapper structure to alleviate the pain going forward. While this of
course requires the same kind of global refactoring as just introducing
a new parameter, it at least allows us to change the callback type more
easily afterwards by just extending the wrapper structure.

One counterargument to this refactoring is that it makes the structure
more opaque. While it is obvious which callsites need to be fixed up
when we change the function type, it's not obvious anymore once we use
a structure. That being said, we only have a handful of sites that
actually need to populate this wrapper structure: our ref backends,
"refs/iterator.c" as well as very few sites that invoke the iterator
callback functions directly.

Introduce this wrapper structure so that we can adapt the iterator
interfaces more readily.
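
The benefit of the wrapper can be sketched with a toy iterator. The
`toy_*` names below are hypothetical, and the structure only carries the
two fields that the converted callers in this patch visibly use, with
the object ID simplified to a string. Adding another field to the
structure later would not require touching `toy_count_tags()` at all.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the wrapper structure. */
struct toy_reference {
	const char *name;
	const char *oid;  /* Git uses struct object_id; a string keeps this short */
};

/* The callback signature stays stable even if the struct grows. */
typedef int (*toy_each_ref_fn)(const struct toy_reference *ref, void *cb_data);

/* Invoke fn for every reference; stop early on a non-zero return value. */
static int toy_for_each_ref(const struct toy_reference *refs, size_t nr,
			    toy_each_ref_fn fn, void *cb_data)
{
	for (size_t i = 0; i < nr; i++) {
		int ret = fn(&refs[i], cb_data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Example callback: count references below "refs/tags/". */
static int toy_count_tags(const struct toy_reference *ref, void *cb_data)
{
	size_t *nr = cb_data;

	if (!strncmp(ref->name, "refs/tags/", 10))
		(*nr)++;
	return 0;
}
```

Callbacks that ignore a field simply do not read it, so the long lists
of UNUSED parameters seen in the old signatures disappear.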

[1]: <ZmarVcF5JjsZx0dl@tanuki>

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 bisect.c                    | 24 ++++++-------
 builtin/bisect.c            | 17 +++-------
 builtin/checkout.c          |  6 ++--
 builtin/describe.c          | 18 +++++-----
 builtin/fetch.c             | 13 +++----
 builtin/fsck.c              | 33 ++++++++++--------
 builtin/gc.c                | 15 ++++-----
 builtin/name-rev.c          | 17 +++++-----
 builtin/pack-objects.c      | 27 ++++++---------
 builtin/receive-pack.c      | 13 ++++---
 builtin/remote.c            | 44 +++++++++++-------------
 builtin/replace.c           | 21 +++++-------
 builtin/repo.c              |  9 ++---
 builtin/rev-parse.c         | 12 +++----
 builtin/show-branch.c       | 35 +++++++++----------
 builtin/show-ref.c          | 20 +++++------
 builtin/submodule--helper.c | 10 ++----
 builtin/worktree.c          |  6 +---
 commit-graph.c              | 14 ++++----
 delta-islands.c             |  9 +++--
 fetch-pack.c                | 16 +++------
 help.c                      | 10 +++---
 http-backend.c              | 20 +++++------
 log-tree.c                  | 24 ++++++-------
 ls-refs.c                   | 36 ++++++++++++--------
 midx-write.c                | 17 +++++-----
 negotiator/default.c        |  7 ++--
 negotiator/skipping.c       |  7 ++--
 notes.c                     |  8 ++---
 object-name.c               | 10 +++---
 pseudo-merge.c              | 21 +++++-------
 reachable.c                 |  9 +++--
 ref-filter.c                | 24 +++++++------
 reflog.c                    |  9 ++---
 refs.c                      | 67 +++++++++++++++++++++----------------
 refs.h                      | 26 +++++++++++---
 refs/files-backend.c        |  7 ++--
 refs/iterator.c             |  9 ++++-
 remote.c                    | 27 +++++++--------
 repack-midx.c               | 16 ++++-----
 replace-object.c            | 16 ++++-----
 revision.c                  | 12 +++----
 server-info.c               | 12 +++----
 shallow.c                   | 16 +++------
 submodule.c                 | 12 ++-----
 t/helper/test-ref-store.c   |  5 ++-
 upload-pack.c               | 29 +++++++---------
 walker.c                    |  8 ++---
 worktree.c                  | 11 ++++--
 49 files changed, 392 insertions(+), 462 deletions(-)

diff --git a/bisect.c b/bisect.c
index a6dc76b15c910b..326b59c0dc70e7 100644
--- a/bisect.c
+++ b/bisect.c
@@ -450,21 +450,20 @@ void find_bisection(struct commit_list **commit_list, int *reaches,
 	clear_commit_weight(&commit_weight);
 }
 
-static int register_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-			int flags UNUSED, void *cb_data UNUSED)
+static int register_ref(const struct reference *ref, void *cb_data UNUSED)
 {
 	struct strbuf good_prefix = STRBUF_INIT;
 	strbuf_addstr(&good_prefix, term_good);
 	strbuf_addstr(&good_prefix, "-");
 
-	if (!strcmp(refname, term_bad)) {
+	if (!strcmp(ref->name, term_bad)) {
 		free(current_bad_oid);
 		current_bad_oid = xmalloc(sizeof(*current_bad_oid));
-		oidcpy(current_bad_oid, oid);
-	} else if (starts_with(refname, good_prefix.buf)) {
-		oid_array_append(&good_revs, oid);
-	} else if (starts_with(refname, "skip-")) {
-		oid_array_append(&skipped_revs, oid);
+		oidcpy(current_bad_oid, ref->oid);
+	} else if (starts_with(ref->name, good_prefix.buf)) {
+		oid_array_append(&good_revs, ref->oid);
+	} else if (starts_with(ref->name, "skip-")) {
+		oid_array_append(&skipped_revs, ref->oid);
 	}
 
 	strbuf_release(&good_prefix);
@@ -1178,14 +1177,11 @@ int estimate_bisect_steps(int all)
 	return (e < 3 * x) ? n : n - 1;
 }
 
-static int mark_for_removal(const char *refname,
-			    const char *referent UNUSED,
-			    const struct object_id *oid UNUSED,
-			    int flag UNUSED, void *cb_data)
+static int mark_for_removal(const struct reference *ref, void *cb_data)
 {
 	struct string_list *refs = cb_data;
-	char *ref = xstrfmt("refs/bisect%s", refname);
-	string_list_append(refs, ref);
+	char *bisect_ref = xstrfmt("refs/bisect%s", ref->name);
+	string_list_append(refs, bisect_ref);
 	return 0;
 }
 
diff --git a/builtin/bisect.c b/builtin/bisect.c
index 8b8d870cd1ef08..5b2024be62dacd 100644
--- a/builtin/bisect.c
+++ b/builtin/bisect.c
@@ -358,10 +358,7 @@ static int check_and_set_terms(struct bisect_terms *terms, const char *cmd)
 	return 0;
 }
 
-static int inc_nr(const char *refname UNUSED,
-		  const char *referent UNUSED,
-		  const struct object_id *oid UNUSED,
-		  int flag UNUSED, void *cb_data)
+static int inc_nr(const struct reference *ref UNUSED, void *cb_data)
 {
 	unsigned int *nr = (unsigned int *)cb_data;
 	(*nr)++;
@@ -549,12 +546,11 @@ static int bisect_append_log_quoted(const char **argv)
 	return res;
 }
 
-static int add_bisect_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-			  int flags UNUSED, void *cb)
+static int add_bisect_ref(const struct reference *ref, void *cb)
 {
 	struct add_bisect_ref_data *data = cb;
 
-	add_pending_oid(data->revs, refname, oid, data->object_flags);
+	add_pending_oid(data->revs, ref->name, ref->oid, data->object_flags);
 
 	return 0;
 }
@@ -1165,12 +1161,9 @@ static int bisect_visualize(struct bisect_terms *terms, int argc,
 	return run_command(&cmd);
 }
 
-static int get_first_good(const char *refname UNUSED,
-			  const char *referent UNUSED,
-			  const struct object_id *oid,
-			  int flag UNUSED, void *cb_data)
+static int get_first_good(const struct reference *ref, void *cb_data)
 {
-	oidcpy(cb_data, oid);
+	oidcpy(cb_data, ref->oid);
 	return 1;
 }
 
diff --git a/builtin/checkout.c b/builtin/checkout.c
index f9453473fe2a20..66b69df6e67234 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -1063,11 +1063,9 @@ static void update_refs_for_switch(const struct checkout_opts *opts,
 		report_tracking(new_branch_info);
 }
 
-static int add_pending_uninteresting_ref(const char *refname, const char *referent UNUSED,
-					 const struct object_id *oid,
-					 int flags UNUSED, void *cb_data)
+static int add_pending_uninteresting_ref(const struct reference *ref, void *cb_data)
 {
-	add_pending_oid(cb_data, refname, oid, UNINTERESTING);
+	add_pending_oid(cb_data, ref->name, ref->oid, UNINTERESTING);
 	return 0;
 }
 
diff --git a/builtin/describe.c b/builtin/describe.c
index ffaf8d9f0aa6ea..79545350443c6c 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -154,20 +154,19 @@ static void add_to_known_names(const char *path,
 	}
 }
 
-static int get_name(const char *path, const char *referent UNUSED, const struct object_id *oid,
-		    int flag UNUSED, void *cb_data UNUSED)
+static int get_name(const struct reference *ref, void *cb_data UNUSED)
 {
 	int is_tag = 0;
 	struct object_id peeled;
 	int is_annotated, prio;
 	const char *path_to_match = NULL;
 
-	if (skip_prefix(path, "refs/tags/", &path_to_match)) {
+	if (skip_prefix(ref->name, "refs/tags/", &path_to_match)) {
 		is_tag = 1;
 	} else if (all) {
 		if ((exclude_patterns.nr || patterns.nr) &&
-		    !skip_prefix(path, "refs/heads/", &path_to_match) &&
-		    !skip_prefix(path, "refs/remotes/", &path_to_match)) {
+		    !skip_prefix(ref->name, "refs/heads/", &path_to_match) &&
+		    !skip_prefix(ref->name, "refs/remotes/", &path_to_match)) {
 			/* Only accept reference of known type if there are match/exclude patterns */
 			return 0;
 		}
@@ -209,10 +208,10 @@ static int get_name(const char *path, const char *referent UNUSED, const struct
 	}
 
 	/* Is it annotated? */
-	if (!peel_iterated_oid(the_repository, oid, &peeled)) {
-		is_annotated = !oideq(oid, &peeled);
+	if (!peel_iterated_oid(the_repository, ref->oid, &peeled)) {
+		is_annotated = !oideq(ref->oid, &peeled);
 	} else {
-		oidcpy(&peeled, oid);
+		oidcpy(&peeled, ref->oid);
 		is_annotated = 0;
 	}
 
@@ -229,7 +228,8 @@ static int get_name(const char *path, const char *referent UNUSED, const struct
 	else
 		prio = 0;
 
-	add_to_known_names(all ? path + 5 : path + 10, &peeled, prio, oid);
+	add_to_known_names(all ? ref->name + 5 : ref->name + 10,
+			   &peeled, prio, ref->oid);
 	return 0;
 }
 
diff --git a/builtin/fetch.c b/builtin/fetch.c
index c7ff3480fb1827..7052e6ff215ead 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -289,13 +289,11 @@ static struct refname_hash_entry *refname_hash_add(struct hashmap *map,
 	return ent;
 }
 
-static int add_one_refname(const char *refname, const char *referent UNUSED,
-			   const struct object_id *oid,
-			   int flag UNUSED, void *cbdata)
+static int add_one_refname(const struct reference *ref, void *cbdata)
 {
 	struct hashmap *refname_map = cbdata;
 
-	(void) refname_hash_add(refname_map, refname, oid);
+	(void) refname_hash_add(refname_map, ref->name, ref->oid);
 	return 0;
 }
 
@@ -1416,14 +1414,11 @@ static void set_option(struct transport *transport, const char *name, const char
 }
 
 
-static int add_oid(const char *refname UNUSED,
-		   const char *referent UNUSED,
-		   const struct object_id *oid,
-		   int flags UNUSED, void *cb_data)
+static int add_oid(const struct reference *ref, void *cb_data)
 {
 	struct oid_array *oids = cb_data;
 
-	oid_array_append(oids, oid);
+	oid_array_append(oids, ref->oid);
 	return 0;
 }
 
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 8ee95e0d67cf37..ed4eea16803349 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -530,14 +530,13 @@ static int fsck_handle_reflog(const char *logname, void *cb_data)
 	return 0;
 }
 
-static int fsck_handle_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-			   int flag UNUSED, void *cb_data UNUSED)
+static int fsck_handle_ref(const struct reference *ref, void *cb_data UNUSED)
 {
 	struct object *obj;
 
-	obj = parse_object(the_repository, oid);
+	obj = parse_object(the_repository, ref->oid);
 	if (!obj) {
-		if (is_promisor_object(the_repository, oid)) {
+		if (is_promisor_object(the_repository, ref->oid)) {
 			/*
 			 * Increment default_refs anyway, because this is a
 			 * valid ref.
@@ -546,19 +545,19 @@ static int fsck_handle_ref(const char *refname, const char *referent UNUSED, con
 			 return 0;
 		}
 		error(_("%s: invalid sha1 pointer %s"),
-		      refname, oid_to_hex(oid));
+		      ref->name, oid_to_hex(ref->oid));
 		errors_found |= ERROR_REACHABLE;
 		/* We'll continue with the rest despite the error.. */
 		return 0;
 	}
-	if (obj->type != OBJ_COMMIT && is_branch(refname)) {
-		error(_("%s: not a commit"), refname);
+	if (obj->type != OBJ_COMMIT && is_branch(ref->name)) {
+		error(_("%s: not a commit"), ref->name);
 		errors_found |= ERROR_REFS;
 	}
 	default_refs++;
 	obj->flags |= USED;
 	fsck_put_object_name(&fsck_walk_options,
-			     oid, "%s", refname);
+			     ref->oid, "%s", ref->name);
 	mark_object_reachable(obj);
 
 	return 0;
@@ -580,13 +579,19 @@ static void get_default_heads(void)
 	worktrees = get_worktrees();
 	for (p = worktrees; *p; p++) {
 		struct worktree *wt = *p;
-		struct strbuf ref = STRBUF_INIT;
+		struct strbuf refname = STRBUF_INIT;
 
-		strbuf_worktree_ref(wt, &ref, "HEAD");
-		fsck_head_link(ref.buf, &head_points_at, &head_oid);
-		if (head_points_at && !is_null_oid(&head_oid))
-			fsck_handle_ref(ref.buf, NULL, &head_oid, 0, NULL);
-		strbuf_release(&ref);
+		strbuf_worktree_ref(wt, &refname, "HEAD");
+		fsck_head_link(refname.buf, &head_points_at, &head_oid);
+		if (head_points_at && !is_null_oid(&head_oid)) {
+			struct reference ref = {
+				.name = refname.buf,
+				.oid = &head_oid,
+			};
+
+			fsck_handle_ref(&ref, NULL);
+		}
+		strbuf_release(&refname);
 
 		if (include_reflogs)
 			refs_for_each_reflog(get_worktree_ref_store(wt),
diff --git a/builtin/gc.c b/builtin/gc.c
index e19e13d9788076..9de5de175f6a40 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1100,24 +1100,21 @@ struct cg_auto_data {
 	int limit;
 };
 
-static int dfs_on_ref(const char *refname UNUSED,
-		      const char *referent UNUSED,
-		      const struct object_id *oid,
-		      int flags UNUSED,
-		      void *cb_data)
+static int dfs_on_ref(const struct reference *ref, void *cb_data)
 {
 	struct cg_auto_data *data = (struct cg_auto_data *)cb_data;
 	int result = 0;
+	const struct object_id *maybe_peeled = ref->oid;
 	struct object_id peeled;
 	struct commit_list *stack = NULL;
 	struct commit *commit;
 
-	if (!peel_iterated_oid(the_repository, oid, &peeled))
-		oid = &peeled;
-	if (odb_read_object_info(the_repository->objects, oid, NULL) != OBJ_COMMIT)
+	if (!peel_iterated_oid(the_repository, ref->oid, &peeled))
+		maybe_peeled = &peeled;
+	if (odb_read_object_info(the_repository->objects, maybe_peeled, NULL) != OBJ_COMMIT)
 		return 0;
 
-	commit = lookup_commit(the_repository, oid);
+	commit = lookup_commit(the_repository, maybe_peeled);
 	if (!commit)
 		return 0;
 	if (repo_parse_commit(the_repository, commit) ||
diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index 74512e54a38c45..615f7d1aae4987 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -339,10 +339,9 @@ static int cmp_by_tag_and_age(const void *a_, const void *b_)
 	return a->taggerdate != b->taggerdate;
 }
 
-static int name_ref(const char *path, const char *referent UNUSED, const struct object_id *oid,
-		    int flags UNUSED, void *cb_data)
+static int name_ref(const struct reference *ref, void *cb_data)
 {
-	struct object *o = parse_object(the_repository, oid);
+	struct object *o = parse_object(the_repository, ref->oid);
 	struct name_ref_data *data = cb_data;
 	int can_abbreviate_output = data->tags_only && data->name_only;
 	int deref = 0;
@@ -350,14 +349,14 @@ static int name_ref(const char *path, const char *referent UNUSED, const struct
 	struct commit *commit = NULL;
 	timestamp_t taggerdate = TIME_MAX;
 
-	if (data->tags_only && !starts_with(path, "refs/tags/"))
+	if (data->tags_only && !starts_with(ref->name, "refs/tags/"))
 		return 0;
 
 	if (data->exclude_filters.nr) {
 		struct string_list_item *item;
 
 		for_each_string_list_item(item, &data->exclude_filters) {
-			if (subpath_matches(path, item->string) >= 0)
+			if (subpath_matches(ref->name, item->string) >= 0)
 				return 0;
 		}
 	}
@@ -378,7 +377,7 @@ static int name_ref(const char *path, const char *referent UNUSED, const struct
 			 * shouldn't stop when seeing 'refs/tags/v1.4' matches
 			 * 'refs/tags/v*'.  We should show it as 'v1.4'.
 			 */
-			switch (subpath_matches(path, item->string)) {
+			switch (subpath_matches(ref->name, item->string)) {
 			case -1: /* did not match */
 				break;
 			case 0: /* matched fully */
@@ -406,13 +405,13 @@ static int name_ref(const char *path, const char *referent UNUSED, const struct
 	}
 	if (o && o->type == OBJ_COMMIT) {
 		commit = (struct commit *)o;
-		from_tag = starts_with(path, "refs/tags/");
+		from_tag = starts_with(ref->name, "refs/tags/");
 		if (taggerdate == TIME_MAX)
 			taggerdate = commit->date;
 	}
 
-	add_to_tip_table(oid, path, can_abbreviate_output, commit, taggerdate,
-			 from_tag, deref);
+	add_to_tip_table(ref->oid, ref->name, can_abbreviate_output,
+			 commit, taggerdate, from_tag, deref);
 	return 0;
 }
 
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5bdc44fb2de1fa..39633a0158e095 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -831,15 +831,14 @@ static enum write_one_status write_one(struct hashfile *f,
 	return WRITE_ONE_WRITTEN;
 }
 
-static int mark_tagged(const char *path UNUSED, const char *referent UNUSED, const struct object_id *oid,
-		       int flag UNUSED, void *cb_data UNUSED)
+static int mark_tagged(const struct reference *ref, void *cb_data UNUSED)
 {
 	struct object_id peeled;
-	struct object_entry *entry = packlist_find(&to_pack, oid);
+	struct object_entry *entry = packlist_find(&to_pack, ref->oid);
 
 	if (entry)
 		entry->tagged = 1;
-	if (!peel_iterated_oid(the_repository, oid, &peeled)) {
+	if (!peel_iterated_oid(the_repository, ref->oid, &peeled)) {
 		entry = packlist_find(&to_pack, &peeled);
 		if (entry)
 			entry->tagged = 1;
@@ -3306,13 +3305,12 @@ static void add_tag_chain(const struct object_id *oid)
 	}
 }
 
-static int add_ref_tag(const char *tag UNUSED, const char *referent UNUSED, const struct object_id *oid,
-		       int flag UNUSED, void *cb_data UNUSED)
+static int add_ref_tag(const struct reference *ref, void *cb_data UNUSED)
 {
 	struct object_id peeled;
 
-	if (!peel_iterated_oid(the_repository, oid, &peeled) && obj_is_packed(&peeled))
-		add_tag_chain(oid);
+	if (!peel_iterated_oid(the_repository, ref->oid, &peeled) && obj_is_packed(&peeled))
+		add_tag_chain(ref->oid);
 	return 0;
 }
 
@@ -4533,19 +4531,16 @@ static void record_recent_commit(struct commit *commit, void *data UNUSED)
 	oid_array_append(&recent_objects, &commit->object.oid);
 }
 
-static int mark_bitmap_preferred_tip(const char *refname,
-				     const char *referent UNUSED,
-				     const struct object_id *oid,
-				     int flags UNUSED,
-				     void *data UNUSED)
+static int mark_bitmap_preferred_tip(const struct reference *ref, void *data UNUSED)
 {
+	const struct object_id *maybe_peeled = ref->oid;
 	struct object_id peeled;
 	struct object *object;
 
-	if (!peel_iterated_oid(the_repository, oid, &peeled))
-		oid = &peeled;
+	if (!peel_iterated_oid(the_repository, ref->oid, &peeled))
+		maybe_peeled = &peeled;
 
-	object = parse_object_or_die(the_repository, oid, refname);
+	object = parse_object_or_die(the_repository, maybe_peeled, ref->name);
 	if (object->type == OBJ_COMMIT)
 		object->flags |= NEEDS_BITMAP;
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index c9288a9c7e382b..e8ee0e73217aae 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -305,13 +305,12 @@ static void show_ref(const char *path, const struct object_id *oid)
 	}
 }
 
-static int show_ref_cb(const char *path_full, const char *referent UNUSED, const struct object_id *oid,
-		       int flag UNUSED, void *data)
+static int show_ref_cb(const struct reference *ref, void *data)
 {
 	struct oidset *seen = data;
-	const char *path = strip_namespace(path_full);
+	const char *path = strip_namespace(ref->name);
 
-	if (ref_is_hidden(path, path_full, &hidden_refs))
+	if (ref_is_hidden(path, ref->name, &hidden_refs))
 		return 0;
 
 	/*
@@ -320,13 +319,13 @@ static int show_ref_cb(const char *path_full, const char *referent UNUSED, const
 	 * transfer but will otherwise ignore them.
 	 */
 	if (!path) {
-		if (oidset_insert(seen, oid))
+		if (oidset_insert(seen, ref->oid))
 			return 0;
 		path = ".have";
 	} else {
-		oidset_insert(seen, oid);
+		oidset_insert(seen, ref->oid);
 	}
-	show_ref(path, oid);
+	show_ref(path, ref->oid);
 	return 0;
 }
 
diff --git a/builtin/remote.c b/builtin/remote.c
index 8a7ed4299a4b51..7ffc14ba15743a 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -570,17 +570,14 @@ struct branches_for_remote {
 	struct known_remotes *keep;
 };
 
-static int add_branch_for_removal(const char *refname,
-				  const char *referent UNUSED,
-				  const struct object_id *oid UNUSED,
-				  int flags UNUSED, void *cb_data)
+static int add_branch_for_removal(const struct reference *ref, void *cb_data)
 {
 	struct branches_for_remote *branches = cb_data;
 	struct refspec_item refspec;
 	struct known_remote *kr;
 
 	memset(&refspec, 0, sizeof(refspec));
-	refspec.dst = (char *)refname;
+	refspec.dst = (char *)ref->name;
 	if (remote_find_tracking(branches->remote, &refspec))
 		return 0;
 	free(refspec.src);
@@ -588,7 +585,7 @@ static int add_branch_for_removal(const char *refname,
 	/* don't delete a branch if another remote also uses it */
 	for (kr = branches->keep->list; kr; kr = kr->next) {
 		memset(&refspec, 0, sizeof(refspec));
-		refspec.dst = (char *)refname;
+		refspec.dst = (char *)ref->name;
 		if (!remote_find_tracking(kr->remote, &refspec)) {
 			free(refspec.src);
 			return 0;
@@ -596,16 +593,16 @@ static int add_branch_for_removal(const char *refname,
 	}
 
 	/* don't delete non-remote-tracking refs */
-	if (!starts_with(refname, "refs/remotes/")) {
+	if (!starts_with(ref->name, "refs/remotes/")) {
 		/* advise user how to delete local branches */
-		if (starts_with(refname, "refs/heads/"))
+		if (starts_with(ref->name, "refs/heads/"))
 			string_list_append(branches->skipped,
-					   abbrev_branch(refname));
+					   abbrev_branch(ref->name));
 		/* silently skip over other non-remote refs */
 		return 0;
 	}
 
-	string_list_append(branches->branches, refname);
+	string_list_append(branches->branches, ref->name);
 
 	return 0;
 }
@@ -713,18 +710,18 @@ static int rename_one_reflog(const char *old_refname,
 	return error;
 }
 
-static int rename_one_ref(const char *old_refname, const char *referent,
-			  const struct object_id *oid,
-			  int flags, void *cb_data)
+static int rename_one_ref(const struct reference *ref, void *cb_data)
 {
 	struct strbuf new_referent = STRBUF_INIT;
 	struct strbuf new_refname = STRBUF_INIT;
 	struct rename_info *rename = cb_data;
+	const struct object_id *oid = ref->oid;
+	const char *referent = ref->target;
 	int error;
 
-	compute_renamed_ref(rename, old_refname, &new_refname);
+	compute_renamed_ref(rename, ref->name, &new_refname);
 
-	if (flags & REF_ISSYMREF) {
+	if (ref->flags & REF_ISSYMREF) {
 		/*
 		 * Stupidly enough `referent` is not pointing to the immediate
 		 * target of a symref, but it's the recursively resolved value.
@@ -732,25 +729,25 @@ static int rename_one_ref(const char *old_refname, const char *referent,
 		 * unborn symrefs don't have any value for the `referent` at all.
 		 */
 		referent = refs_resolve_ref_unsafe(get_main_ref_store(the_repository),
-						   old_refname, RESOLVE_REF_NO_RECURSE,
+						   ref->name, RESOLVE_REF_NO_RECURSE,
 						   NULL, NULL);
 		compute_renamed_ref(rename, referent, &new_referent);
 		oid = NULL;
 	}
 
-	error = ref_transaction_delete(rename->transaction, old_refname,
+	error = ref_transaction_delete(rename->transaction, ref->name,
 				       oid, referent, REF_NO_DEREF, NULL, rename->err);
 	if (error < 0)
 		goto out;
 
 	error = ref_transaction_update(rename->transaction, new_refname.buf, oid, null_oid(the_hash_algo),
-				       (flags & REF_ISSYMREF) ? new_referent.buf : NULL, NULL,
+				       (ref->flags & REF_ISSYMREF) ? new_referent.buf : NULL, NULL,
 				       REF_SKIP_CREATE_REFLOG | REF_NO_DEREF | REF_SKIP_OID_VERIFICATION,
 				       NULL, rename->err);
 	if (error < 0)
 		goto out;
 
-	error = rename_one_reflog(old_refname, oid, rename);
+	error = rename_one_reflog(ref->name, oid, rename);
 	if (error < 0)
 		goto out;
 
@@ -1125,19 +1122,16 @@ static void free_remote_ref_states(struct ref_states *states)
 	string_list_clear_func(&states->push, clear_push_info);
 }
 
-static int append_ref_to_tracked_list(const char *refname,
-				      const char *referent UNUSED,
-				      const struct object_id *oid UNUSED,
-				      int flags, void *cb_data)
+static int append_ref_to_tracked_list(const struct reference *ref, void *cb_data)
 {
 	struct ref_states *states = cb_data;
 	struct refspec_item refspec;
 
-	if (flags & REF_ISSYMREF)
+	if (ref->flags & REF_ISSYMREF)
 		return 0;
 
 	memset(&refspec, 0, sizeof(refspec));
-	refspec.dst = (char *)refname;
+	refspec.dst = (char *)ref->name;
 	if (!remote_find_tracking(states->remote, &refspec)) {
 		string_list_append(&states->tracked, abbrev_branch(refspec.src));
 		free(refspec.src);
diff --git a/builtin/replace.c b/builtin/replace.c
index 900b560a77d9d7..4c62c5ab58bd0a 100644
--- a/builtin/replace.c
+++ b/builtin/replace.c
@@ -47,30 +47,27 @@ struct show_data {
 	enum replace_format format;
 };
 
-static int show_reference(const char *refname,
-			  const char *referent UNUSED,
-			  const struct object_id *oid,
-			  int flag UNUSED, void *cb_data)
+static int show_reference(const struct reference *ref, void *cb_data)
 {
 	struct show_data *data = cb_data;
 
-	if (!wildmatch(data->pattern, refname, 0)) {
+	if (!wildmatch(data->pattern, ref->name, 0)) {
 		if (data->format == REPLACE_FORMAT_SHORT)
-			printf("%s\n", refname);
+			printf("%s\n", ref->name);
 		else if (data->format == REPLACE_FORMAT_MEDIUM)
-			printf("%s -> %s\n", refname, oid_to_hex(oid));
+			printf("%s -> %s\n", ref->name, oid_to_hex(ref->oid));
 		else { /* data->format == REPLACE_FORMAT_LONG */
 			struct object_id object;
 			enum object_type obj_type, repl_type;
 
-			if (repo_get_oid(data->repo, refname, &object))
-				return error(_("failed to resolve '%s' as a valid ref"), refname);
+			if (repo_get_oid(data->repo, ref->name, &object))
+				return error(_("failed to resolve '%s' as a valid ref"), ref->name);
 
 			obj_type = odb_read_object_info(data->repo->objects, &object, NULL);
-			repl_type = odb_read_object_info(data->repo->objects, oid, NULL);
+			repl_type = odb_read_object_info(data->repo->objects, ref->oid, NULL);
 
-			printf("%s (%s) -> %s (%s)\n", refname, type_name(obj_type),
-			       oid_to_hex(oid), type_name(repl_type));
+			printf("%s (%s) -> %s (%s)\n", ref->name, type_name(obj_type),
+			       oid_to_hex(ref->oid), type_name(repl_type));
 		}
 	}
 
diff --git a/builtin/repo.c b/builtin/repo.c
index 9d4749f79befa8..f26640bd6ea1e7 100644
--- a/builtin/repo.c
+++ b/builtin/repo.c
@@ -366,16 +366,13 @@ struct count_references_data {
 	struct progress *progress;
 };
 
-static int count_references(const char *refname,
-			    const char *referent UNUSED,
-			    const struct object_id *oid,
-			    int flags UNUSED, void *cb_data)
+static int count_references(const struct reference *ref, void *cb_data)
 {
 	struct count_references_data *data = cb_data;
 	struct ref_stats *stats = data->stats;
 	size_t ref_count;
 
-	switch (ref_kind_from_refname(refname)) {
+	switch (ref_kind_from_refname(ref->name)) {
 	case FILTER_REFS_BRANCHES:
 		stats->branches++;
 		break;
@@ -396,7 +393,7 @@ static int count_references(const char *refname,
 	 * While iterating through references for counting, also add OIDs in
 	 * preparation for the path walk.
 	 */
-	add_pending_oid(data->revs, NULL, oid, 0);
+	add_pending_oid(data->revs, NULL, ref->oid, 0);
 
 	ref_count = get_total_reference_count(stats);
 	display_progress(data->progress, ref_count);
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 9da92b990d074b..3578591b4f2a38 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -217,19 +217,17 @@ static int show_default(void)
 	return 0;
 }
 
-static int show_reference(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-			  int flag UNUSED, void *cb_data UNUSED)
+static int show_reference(const struct reference *ref, void *cb_data UNUSED)
 {
-	if (ref_excluded(&ref_excludes, refname))
+	if (ref_excluded(&ref_excludes, ref->name))
 		return 0;
-	show_rev(NORMAL, oid, refname);
+	show_rev(NORMAL, ref->oid, ref->name);
 	return 0;
 }
 
-static int anti_reference(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-			  int flag UNUSED, void *cb_data UNUSED)
+static int anti_reference(const struct reference *ref, void *cb_data UNUSED)
 {
-	show_rev(REVERSED, oid, refname);
+	show_rev(REVERSED, ref->oid, ref->name);
 	return 0;
 }
 
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 441babf2e350f9..10475a6b5edb2b 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -413,34 +413,32 @@ static int append_ref(const char *refname, const struct object_id *oid,
 	return 0;
 }
 
-static int append_head_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-			   int flag UNUSED, void *cb_data UNUSED)
+static int append_head_ref(const struct reference *ref, void *cb_data UNUSED)
 {
 	struct object_id tmp;
 	int ofs = 11;
-	if (!starts_with(refname, "refs/heads/"))
+	if (!starts_with(ref->name, "refs/heads/"))
 		return 0;
 	/* If both heads/foo and tags/foo exists, get_sha1 would
 	 * get confused.
 	 */
-	if (repo_get_oid(the_repository, refname + ofs, &tmp) || !oideq(&tmp, oid))
+	if (repo_get_oid(the_repository, ref->name + ofs, &tmp) || !oideq(&tmp, ref->oid))
 		ofs = 5;
-	return append_ref(refname + ofs, oid, 0);
+	return append_ref(ref->name + ofs, ref->oid, 0);
 }
 
-static int append_remote_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-			     int flag UNUSED, void *cb_data UNUSED)
+static int append_remote_ref(const struct reference *ref, void *cb_data UNUSED)
 {
 	struct object_id tmp;
 	int ofs = 13;
-	if (!starts_with(refname, "refs/remotes/"))
+	if (!starts_with(ref->name, "refs/remotes/"))
 		return 0;
 	/* If both heads/foo and tags/foo exists, get_sha1 would
 	 * get confused.
 	 */
-	if (repo_get_oid(the_repository, refname + ofs, &tmp) || !oideq(&tmp, oid))
+	if (repo_get_oid(the_repository, ref->name + ofs, &tmp) || !oideq(&tmp, ref->oid))
 		ofs = 5;
-	return append_ref(refname + ofs, oid, 0);
+	return append_ref(ref->name + ofs, ref->oid, 0);
 }
 
 static int append_tag_ref(const char *refname, const struct object_id *oid,
@@ -454,27 +452,26 @@ static int append_tag_ref(const char *refname, const struct object_id *oid,
 static const char *match_ref_pattern = NULL;
 static int match_ref_slash = 0;
 
-static int append_matching_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-			       int flag, void *cb_data)
+static int append_matching_ref(const struct reference *ref, void *cb_data)
 {
 	/* we want to allow pattern hold/<asterisk> to show all
 	 * branches under refs/heads/hold/, and v0.99.9? to show
 	 * refs/tags/v0.99.9a and friends.
 	 */
 	const char *tail;
-	int slash = count_slashes(refname);
-	for (tail = refname; *tail && match_ref_slash < slash; )
+	int slash = count_slashes(ref->name);
+	for (tail = ref->name; *tail && match_ref_slash < slash; )
 		if (*tail++ == '/')
 			slash--;
 	if (!*tail)
 		return 0;
 	if (wildmatch(match_ref_pattern, tail, 0))
 		return 0;
-	if (starts_with(refname, "refs/heads/"))
-		return append_head_ref(refname, NULL, oid, flag, cb_data);
-	if (starts_with(refname, "refs/tags/"))
-		return append_tag_ref(refname, oid, flag, cb_data);
-	return append_ref(refname, oid, 0);
+	if (starts_with(ref->name, "refs/heads/"))
+		return append_head_ref(ref, cb_data);
+	if (starts_with(ref->name, "refs/tags/"))
+		return append_tag_ref(ref->name, ref->oid, ref->flags, cb_data);
+	return append_ref(ref->name, ref->oid, 0);
 }
 
 static void snarf_refs(int head, int remotes)
diff --git a/builtin/show-ref.c b/builtin/show-ref.c
index 0b6f9edf86c24f..4803b5e59865f6 100644
--- a/builtin/show-ref.c
+++ b/builtin/show-ref.c
@@ -66,26 +66,25 @@ struct show_ref_data {
 	int show_head;
 };
 
-static int show_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-		    int flag UNUSED, void *cbdata)
+static int show_ref(const struct reference *ref, void *cbdata)
 {
 	struct show_ref_data *data = cbdata;
 
-	if (data->show_head && !strcmp(refname, "HEAD"))
+	if (data->show_head && !strcmp(ref->name, "HEAD"))
 		goto match;
 
 	if (data->patterns) {
-		int reflen = strlen(refname);
+		int reflen = strlen(ref->name);
 		const char **p = data->patterns, *m;
 		while ((m = *p++) != NULL) {
 			int len = strlen(m);
 			if (len > reflen)
 				continue;
-			if (memcmp(m, refname + reflen - len, len))
+			if (memcmp(m, ref->name + reflen - len, len))
 				continue;
 			if (len == reflen)
 				goto match;
-			if (refname[reflen - len - 1] == '/')
+			if (ref->name[reflen - len - 1] == '/')
 				goto match;
 		}
 		return 0;
@@ -94,18 +93,15 @@ static int show_ref(const char *refname, const char *referent UNUSED, const stru
 match:
 	data->found_match++;
 
-	show_one(data->show_one_opts, refname, oid);
+	show_one(data->show_one_opts, ref->name, ref->oid);
 
 	return 0;
 }
 
-static int add_existing(const char *refname,
-			const char *referent UNUSED,
-			const struct object_id *oid UNUSED,
-			int flag UNUSED, void *cbdata)
+static int add_existing(const struct reference *ref, void *cbdata)
 {
 	struct string_list *list = (struct string_list *)cbdata;
-	string_list_insert(list, refname);
+	string_list_insert(list, ref->name);
 	return 0;
 }
 
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index fcd73abe5336a9..35f6cf735e51cd 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -593,16 +593,12 @@ static void print_status(unsigned int flags, char state, const char *path,
 	printf("\n");
 }
 
-static int handle_submodule_head_ref(const char *refname UNUSED,
-				     const char *referent UNUSED,
-				     const struct object_id *oid,
-				     int flags UNUSED,
-				     void *cb_data)
+static int handle_submodule_head_ref(const struct reference *ref, void *cb_data)
 {
 	struct object_id *output = cb_data;
 
-	if (oid)
-		oidcpy(output, oid);
+	if (ref->oid)
+		oidcpy(output, ref->oid);
 
 	return 0;
 }
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 812774a5ca992c..b7f323b5e4d73e 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -635,11 +635,7 @@ static void print_preparing_worktree_line(int detach,
  *
  * Returns 0 on failure and non-zero on success.
  */
-static int first_valid_ref(const char *refname UNUSED,
-			   const char *referent UNUSED,
-			   const struct object_id *oid UNUSED,
-			   int flags UNUSED,
-			   void *cb_data UNUSED)
+static int first_valid_ref(const struct reference *ref UNUSED, void *cb_data UNUSED)
 {
 	return 1;
 }
diff --git a/commit-graph.c b/commit-graph.c
index 474454db73d094..f91af416259c84 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1851,18 +1851,16 @@ struct refs_cb_data {
 	struct progress *progress;
 };
 
-static int add_ref_to_set(const char *refname UNUSED,
-			  const char *referent UNUSED,
-			  const struct object_id *oid,
-			  int flags UNUSED, void *cb_data)
+static int add_ref_to_set(const struct reference *ref, void *cb_data)
 {
+	const struct object_id *maybe_peeled = ref->oid;
 	struct object_id peeled;
 	struct refs_cb_data *data = (struct refs_cb_data *)cb_data;
 
-	if (!peel_iterated_oid(data->repo, oid, &peeled))
-		oid = &peeled;
-	if (odb_read_object_info(data->repo->objects, oid, NULL) == OBJ_COMMIT)
-		oidset_insert(data->commits, oid);
+	if (!peel_iterated_oid(data->repo, ref->oid, &peeled))
+		maybe_peeled = &peeled;
+	if (odb_read_object_info(data->repo->objects, maybe_peeled, NULL) == OBJ_COMMIT)
+		oidset_insert(data->commits, maybe_peeled);
 
 	display_progress(data->progress, oidset_size(data->commits));
 
diff --git a/delta-islands.c b/delta-islands.c
index 36c94799d69d7a..7cfebc4162b0e0 100644
--- a/delta-islands.c
+++ b/delta-islands.c
@@ -390,8 +390,7 @@ static void add_ref_to_island(kh_str_t *remote_islands, const char *island_name,
 	rl->hash += sha_core;
 }
 
-static int find_island_for_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-			       int flags UNUSED, void *cb)
+static int find_island_for_ref(const struct reference *ref, void *cb)
 {
 	struct island_load_data *ild = cb;
 
@@ -406,7 +405,7 @@ static int find_island_for_ref(const char *refname, const char *referent UNUSED,
 
 	/* walk backwards to get last-one-wins ordering */
 	for (i = ild->nr - 1; i >= 0; i--) {
-		if (!regexec(&ild->rx[i], refname,
+		if (!regexec(&ild->rx[i], ref->name,
 			     ARRAY_SIZE(matches), matches, 0))
 			break;
 	}
@@ -428,10 +427,10 @@ static int find_island_for_ref(const char *refname, const char *referent UNUSED,
 		if (island_name.len)
 			strbuf_addch(&island_name, '-');
 
-		strbuf_add(&island_name, refname + match->rm_so, match->rm_eo - match->rm_so);
+		strbuf_add(&island_name, ref->name + match->rm_so, match->rm_eo - match->rm_so);
 	}
 
-	add_ref_to_island(ild->remote_islands, island_name.buf, oid);
+	add_ref_to_island(ild->remote_islands, island_name.buf, ref->oid);
 	strbuf_release(&island_name);
 	return 0;
 }
diff --git a/fetch-pack.c b/fetch-pack.c
index fe7a84bf2f97fa..78c45d4a155c89 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -188,13 +188,9 @@ static int rev_list_insert_ref(struct fetch_negotiator *negotiator,
 	return 0;
 }
 
-static int rev_list_insert_ref_oid(const char *refname UNUSED,
-				   const char *referent UNUSED,
-				   const struct object_id *oid,
-				   int flag UNUSED,
-				   void *cb_data)
+static int rev_list_insert_ref_oid(const struct reference *ref, void *cb_data)
 {
-	return rev_list_insert_ref(cb_data, oid);
+	return rev_list_insert_ref(cb_data, ref->oid);
 }
 
 enum ack_type {
@@ -616,13 +612,9 @@ static int mark_complete(const struct object_id *oid)
 	return 0;
 }
 
-static int mark_complete_oid(const char *refname UNUSED,
-			     const char *referent UNUSED,
-			     const struct object_id *oid,
-			     int flag UNUSED,
-			     void *cb_data UNUSED)
+static int mark_complete_oid(const struct reference *ref, void *cb_data UNUSED)
 {
-	return mark_complete(oid);
+	return mark_complete(ref->oid);
 }
 
 static void mark_recent_complete_commits(struct fetch_pack_args *args,
diff --git a/help.c b/help.c
index 5854dd4a7e468b..20e114432d7f85 100644
--- a/help.c
+++ b/help.c
@@ -851,18 +851,16 @@ struct similar_ref_cb {
 	struct string_list *similar_refs;
 };
 
-static int append_similar_ref(const char *refname, const char *referent UNUSED,
-			      const struct object_id *oid UNUSED,
-			      int flags UNUSED, void *cb_data)
+static int append_similar_ref(const struct reference *ref, void *cb_data)
 {
 	struct similar_ref_cb *cb = (struct similar_ref_cb *)(cb_data);
-	char *branch = strrchr(refname, '/') + 1;
+	char *branch = strrchr(ref->name, '/') + 1;
 
 	/* A remote branch of the same name is deemed similar */
-	if (starts_with(refname, "refs/remotes/") &&
+	if (starts_with(ref->name, "refs/remotes/") &&
 	    !strcmp(branch, cb->base_ref))
 		string_list_append_nodup(cb->similar_refs,
-					 refs_shorten_unambiguous_ref(get_main_ref_store(the_repository), refname, 1));
+					 refs_shorten_unambiguous_ref(get_main_ref_store(the_repository), ref->name, 1));
 	return 0;
 }
 
diff --git a/http-backend.c b/http-backend.c
index 9084058f1e9f13..92e1733f14042c 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -513,18 +513,17 @@ static void run_service(const char **argv, int buffer_input)
 		exit(1);
 }
 
-static int show_text_ref(const char *name, const char *referent UNUSED, const struct object_id *oid,
-			 int flag UNUSED, void *cb_data)
+static int show_text_ref(const struct reference *ref, void *cb_data)
 {
-	const char *name_nons = strip_namespace(name);
+	const char *name_nons = strip_namespace(ref->name);
 	struct strbuf *buf = cb_data;
-	struct object *o = parse_object(the_repository, oid);
+	struct object *o = parse_object(the_repository, ref->oid);
 	if (!o)
 		return 0;
 
-	strbuf_addf(buf, "%s\t%s\n", oid_to_hex(oid), name_nons);
+	strbuf_addf(buf, "%s\t%s\n", oid_to_hex(ref->oid), name_nons);
 	if (o->type == OBJ_TAG) {
-		o = deref_tag(the_repository, o, name, 0);
+		o = deref_tag(the_repository, o, ref->name, 0);
 		if (!o)
 			return 0;
 		strbuf_addf(buf, "%s\t%s^{}\n", oid_to_hex(&o->oid),
@@ -569,21 +568,20 @@ static void get_info_refs(struct strbuf *hdr, char *arg UNUSED)
 	strbuf_release(&buf);
 }
 
-static int show_head_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-			 int flag, void *cb_data)
+static int show_head_ref(const struct reference *ref, void *cb_data)
 {
 	struct strbuf *buf = cb_data;
 
-	if (flag & REF_ISSYMREF) {
+	if (ref->flags & REF_ISSYMREF) {
 		const char *target = refs_resolve_ref_unsafe(get_main_ref_store(the_repository),
-							     refname,
+							     ref->name,
 							     RESOLVE_REF_READING,
 							     NULL, NULL);
 
 		if (target)
 			strbuf_addf(buf, "ref: %s\n", strip_namespace(target));
 	} else {
-		strbuf_addf(buf, "%s\n", oid_to_hex(oid));
+		strbuf_addf(buf, "%s\n", oid_to_hex(ref->oid));
 	}
 
 	return 0;
diff --git a/log-tree.c b/log-tree.c
index 7d917f2a83d5ae..1729b0c201271b 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -147,9 +147,7 @@ static int ref_filter_match(const char *refname,
 	return 1;
 }
 
-static int add_ref_decoration(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-			      int flags UNUSED,
-			      void *cb_data)
+static int add_ref_decoration(const struct reference *ref, void *cb_data)
 {
 	int i;
 	struct object *obj;
@@ -158,16 +156,16 @@ static int add_ref_decoration(const char *refname, const char *referent UNUSED,
 	struct decoration_filter *filter = (struct decoration_filter *)cb_data;
 	const char *git_replace_ref_base = ref_namespace[NAMESPACE_REPLACE].ref;
 
-	if (filter && !ref_filter_match(refname, filter))
+	if (filter && !ref_filter_match(ref->name, filter))
 		return 0;
 
-	if (starts_with(refname, git_replace_ref_base)) {
+	if (starts_with(ref->name, git_replace_ref_base)) {
 		struct object_id original_oid;
 		if (!replace_refs_enabled(the_repository))
 			return 0;
-		if (get_oid_hex(refname + strlen(git_replace_ref_base),
+		if (get_oid_hex(ref->name + strlen(git_replace_ref_base),
 				&original_oid)) {
-			warning("invalid replace ref %s", refname);
+			warning("invalid replace ref %s", ref->name);
 			return 0;
 		}
 		obj = parse_object(the_repository, &original_oid);
@@ -176,10 +174,10 @@ static int add_ref_decoration(const char *refname, const char *referent UNUSED,
 		return 0;
 	}
 
-	objtype = odb_read_object_info(the_repository->objects, oid, NULL);
+	objtype = odb_read_object_info(the_repository->objects, ref->oid, NULL);
 	if (objtype < 0)
 		return 0;
-	obj = lookup_object_by_type(the_repository, oid, objtype);
+	obj = lookup_object_by_type(the_repository, ref->oid, objtype);
 
 	for (i = 0; i < ARRAY_SIZE(ref_namespace); i++) {
 		struct ref_namespace_info *info = &ref_namespace[i];
@@ -187,24 +185,24 @@ static int add_ref_decoration(const char *refname, const char *referent UNUSED,
 		if (!info->decoration)
 			continue;
 		if (info->exact) {
-			if (!strcmp(refname, info->ref)) {
+			if (!strcmp(ref->name, info->ref)) {
 				deco_type = info->decoration;
 				break;
 			}
-		} else if (starts_with(refname, info->ref)) {
+		} else if (starts_with(ref->name, info->ref)) {
 			deco_type = info->decoration;
 			break;
 		}
 	}
 
-	add_name_decoration(deco_type, refname, obj);
+	add_name_decoration(deco_type, ref->name, obj);
 	while (obj->type == OBJ_TAG) {
 		if (!obj->parsed)
 			parse_object(the_repository, &obj->oid);
 		obj = ((struct tag *)obj)->tagged;
 		if (!obj)
 			break;
-		add_name_decoration(DECORATION_REF_TAG, refname, obj);
+		add_name_decoration(DECORATION_REF_TAG, ref->name, obj);
 	}
 	return 0;
 }
diff --git a/ls-refs.c b/ls-refs.c
index c47acde07f335b..64d02723691466 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -75,42 +75,42 @@ struct ls_refs_data {
 	unsigned unborn : 1;
 };
 
-static int send_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-		    int flag, void *cb_data)
+static int send_ref(const struct reference *ref, void *cb_data)
 {
 	struct ls_refs_data *data = cb_data;
-	const char *refname_nons = strip_namespace(refname);
+	const char *refname_nons = strip_namespace(ref->name);
 
 	strbuf_reset(&data->buf);
 
-	if (ref_is_hidden(refname_nons, refname, &data->hidden_refs))
+	if (ref_is_hidden(refname_nons, ref->name, &data->hidden_refs))
 		return 0;
 
 	if (!ref_match(&data->prefixes, refname_nons))
 		return 0;
 
-	if (oid)
-		strbuf_addf(&data->buf, "%s %s", oid_to_hex(oid), refname_nons);
+	if (ref->oid)
+		strbuf_addf(&data->buf, "%s %s", oid_to_hex(ref->oid), refname_nons);
 	else
 		strbuf_addf(&data->buf, "unborn %s", refname_nons);
-	if (data->symrefs && flag & REF_ISSYMREF) {
+	if (data->symrefs && ref->flags & REF_ISSYMREF) {
+		int unused_flag;
 		struct object_id unused;
 		const char *symref_target = refs_resolve_ref_unsafe(get_main_ref_store(the_repository),
-								    refname,
+								    ref->name,
 								    0,
 								    &unused,
-								    &flag);
+								    &unused_flag);
 
 		if (!symref_target)
-			die("'%s' is a symref but it is not?", refname);
+			die("'%s' is a symref but it is not?", ref->name);
 
 		strbuf_addf(&data->buf, " symref-target:%s",
 			    strip_namespace(symref_target));
 	}
 
-	if (data->peel && oid) {
+	if (data->peel && ref->oid) {
 		struct object_id peeled;
-		if (!peel_iterated_oid(the_repository, oid, &peeled))
+		if (!peel_iterated_oid(the_repository, ref->oid, &peeled))
 			strbuf_addf(&data->buf, " peeled:%s", oid_to_hex(&peeled));
 	}
 
@@ -131,9 +131,17 @@ static void send_possibly_unborn_head(struct ls_refs_data *data)
 	if (!refs_resolve_ref_unsafe(get_main_ref_store(the_repository), namespaced.buf, 0, &oid, &flag))
 		return; /* bad ref */
 	oid_is_null = is_null_oid(&oid);
+
 	if (!oid_is_null ||
-	    (data->unborn && data->symrefs && (flag & REF_ISSYMREF)))
-		send_ref(namespaced.buf, NULL, oid_is_null ? NULL : &oid, flag, data);
+	    (data->unborn && data->symrefs && (flag & REF_ISSYMREF))) {
+		struct reference ref = {
+			.name = namespaced.buf,
+			.oid = oid_is_null ? NULL : &oid,
+			.flags = flag,
+		};
+
+		send_ref(&ref, data);
+	}
 	strbuf_release(&namespaced);
 }
 
diff --git a/midx-write.c b/midx-write.c
index c73010df6d3a4f..f4dd875747a4b6 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -697,28 +697,27 @@ static void prepare_midx_packing_data(struct packing_data *pdata,
 	trace2_region_leave("midx", "prepare_midx_packing_data", ctx->repo);
 }
 
-static int add_ref_to_pending(const char *refname, const char *referent UNUSED,
-			      const struct object_id *oid,
-			      int flag, void *cb_data)
+static int add_ref_to_pending(const struct reference *ref, void *cb_data)
 {
 	struct rev_info *revs = (struct rev_info*)cb_data;
+	const struct object_id *maybe_peeled = ref->oid;
 	struct object_id peeled;
 	struct object *object;
 
-	if ((flag & REF_ISSYMREF) && (flag & REF_ISBROKEN)) {
-		warning("symbolic ref is dangling: %s", refname);
+	if ((ref->flags & REF_ISSYMREF) && (ref->flags & REF_ISBROKEN)) {
+		warning("symbolic ref is dangling: %s", ref->name);
 		return 0;
 	}
 
-	if (!peel_iterated_oid(revs->repo, oid, &peeled))
-		oid = &peeled;
+	if (!peel_iterated_oid(revs->repo, ref->oid, &peeled))
+		maybe_peeled = &peeled;
 
-	object = parse_object_or_die(revs->repo, oid, refname);
+	object = parse_object_or_die(revs->repo, maybe_peeled, ref->name);
 	if (object->type != OBJ_COMMIT)
 		return 0;
 
 	add_pending_object(revs, object, "");
-	if (bitmap_is_preferred_refname(revs->repo, refname))
+	if (bitmap_is_preferred_refname(revs->repo, ref->name))
 		object->flags |= NEEDS_BITMAP;
 	return 0;
 }
diff --git a/negotiator/default.c b/negotiator/default.c
index c479da9b091570..116dedcf83035d 100644
--- a/negotiator/default.c
+++ b/negotiator/default.c
@@ -38,11 +38,10 @@ static void rev_list_push(struct negotiation_state *ns,
 	}
 }
 
-static int clear_marks(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-		       int flag UNUSED,
-		       void *cb_data UNUSED)
+static int clear_marks(const struct reference *ref, void *cb_data UNUSED)
 {
-	struct object *o = deref_tag(the_repository, parse_object(the_repository, oid), refname, 0);
+	struct object *o = deref_tag(the_repository, parse_object(the_repository, ref->oid),
+				     ref->name, 0);
 
 	if (o && o->type == OBJ_COMMIT)
 		clear_commit_marks((struct commit *)o,
diff --git a/negotiator/skipping.c b/negotiator/skipping.c
index 616df6bf3af51c..0a272130fb1b6d 100644
--- a/negotiator/skipping.c
+++ b/negotiator/skipping.c
@@ -75,11 +75,10 @@ static struct entry *rev_list_push(struct data *data, struct commit *commit, int
 	return entry;
 }
 
-static int clear_marks(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-		       int flag UNUSED,
-		       void *cb_data UNUSED)
+static int clear_marks(const struct reference *ref, void *cb_data UNUSED)
 {
-	struct object *o = deref_tag(the_repository, parse_object(the_repository, oid), refname, 0);
+	struct object *o = deref_tag(the_repository, parse_object(the_repository, ref->oid),
+				     ref->name, 0);
 
 	if (o && o->type == OBJ_COMMIT)
 		clear_commit_marks((struct commit *)o,
diff --git a/notes.c b/notes.c
index 9a2e9181fe67d8..8e00fd8c470dd2 100644
--- a/notes.c
+++ b/notes.c
@@ -938,13 +938,11 @@ int combine_notes_cat_sort_uniq(struct object_id *cur_oid,
 	return ret;
 }
 
-static int string_list_add_one_ref(const char *refname, const char *referent UNUSED,
-				   const struct object_id *oid UNUSED,
-				   int flag UNUSED, void *cb)
+static int string_list_add_one_ref(const struct reference *ref, void *cb)
 {
 	struct string_list *refs = cb;
-	if (!unsorted_string_list_has_string(refs, refname))
-		string_list_append(refs, refname);
+	if (!unsorted_string_list_has_string(refs, ref->name))
+		string_list_append(refs, ref->name);
 	return 0;
 }
 
diff --git a/object-name.c b/object-name.c
index f6902e140dd43e..7e8109f25fb839 100644
--- a/object-name.c
+++ b/object-name.c
@@ -1444,18 +1444,16 @@ struct handle_one_ref_cb {
 	struct commit_list **list;
 };
 
-static int handle_one_ref(const char *path, const char *referent UNUSED, const struct object_id *oid,
-			  int flag UNUSED,
-			  void *cb_data)
+static int handle_one_ref(const struct reference *ref, void *cb_data)
 {
 	struct handle_one_ref_cb *cb = cb_data;
 	struct commit_list **list = cb->list;
-	struct object *object = parse_object(cb->repo, oid);
+	struct object *object = parse_object(cb->repo, ref->oid);
 	if (!object)
 		return 0;
 	if (object->type == OBJ_TAG) {
-		object = deref_tag(cb->repo, object, path,
-				   strlen(path));
+		object = deref_tag(cb->repo, object, ref->name,
+				   strlen(ref->name));
 		if (!object)
 			return 0;
 	}
diff --git a/pseudo-merge.c b/pseudo-merge.c
index 893b763fe45490..0abd51b42c185a 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -221,28 +221,25 @@ void load_pseudo_merges_from_config(struct repository *r,
 	}
 }
 
-static int find_pseudo_merge_group_for_ref(const char *refname,
-					   const char *referent UNUSED,
-					   const struct object_id *oid,
-					   int flags UNUSED,
-					   void *_data)
+static int find_pseudo_merge_group_for_ref(const struct reference *ref, void *_data)
 {
 	struct bitmap_writer *writer = _data;
+	const struct object_id *maybe_peeled = ref->oid;
 	struct object_id peeled;
 	struct commit *c;
 	uint32_t i;
 	int has_bitmap;
 
-	if (!peel_iterated_oid(the_repository, oid, &peeled))
-		oid = &peeled;
+	if (!peel_iterated_oid(the_repository, ref->oid, &peeled))
+		maybe_peeled = &peeled;
 
-	c = lookup_commit(the_repository, oid);
+	c = lookup_commit(the_repository, maybe_peeled);
 	if (!c)
 		return 0;
-	if (!packlist_find(writer->to_pack, oid))
+	if (!packlist_find(writer->to_pack, maybe_peeled))
 		return 0;
 
-	has_bitmap = bitmap_writer_has_bitmapped_object_id(writer, oid);
+	has_bitmap = bitmap_writer_has_bitmapped_object_id(writer, maybe_peeled);
 
 	for (i = 0; i < writer->pseudo_merge_groups.nr; i++) {
 		struct pseudo_merge_group *group;
@@ -252,7 +249,7 @@ static int find_pseudo_merge_group_for_ref(const char *refname,
 		size_t j;
 
 		group = writer->pseudo_merge_groups.items[i].util;
-		if (regexec(group->pattern, refname, ARRAY_SIZE(captures),
+		if (regexec(group->pattern, ref->name, ARRAY_SIZE(captures),
 			    captures, 0))
 			continue;
 
@@ -269,7 +266,7 @@ static int find_pseudo_merge_group_for_ref(const char *refname,
 			if (group_name.len)
 				strbuf_addch(&group_name, '-');
 
-			strbuf_add(&group_name, refname + match->rm_so,
+			strbuf_add(&group_name, ref->name + match->rm_so,
 				   match->rm_eo - match->rm_so);
 		}
 
diff --git a/reachable.c b/reachable.c
index 22266db5233df7..b753c395530b6d 100644
--- a/reachable.c
+++ b/reachable.c
@@ -83,18 +83,17 @@ static void add_rebase_files(struct rev_info *revs)
 	free_worktrees(worktrees);
 }
 
-static int add_one_ref(const char *path, const char *referent UNUSED, const struct object_id *oid,
-		       int flag, void *cb_data)
+static int add_one_ref(const struct reference *ref, void *cb_data)
 {
 	struct rev_info *revs = (struct rev_info *)cb_data;
 	struct object *object;
 
-	if ((flag & REF_ISSYMREF) && (flag & REF_ISBROKEN)) {
-		warning("symbolic ref is dangling: %s", path);
+	if ((ref->flags & REF_ISSYMREF) && (ref->flags & REF_ISBROKEN)) {
+		warning("symbolic ref is dangling: %s", ref->name);
 		return 0;
 	}
 
-	object = parse_object_or_die(the_repository, oid, path);
+	object = parse_object_or_die(the_repository, ref->oid, ref->name);
 	add_pending_object(revs, object, "");
 
 	return 0;
diff --git a/ref-filter.c b/ref-filter.c
index 30cc488d8ab659..6837fa60a9b2b5 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2954,14 +2954,15 @@ struct ref_filter_cbdata {
  * A call-back given to for_each_ref().  Filter refs and keep them for
  * later object processing.
  */
-static int filter_one(const char *refname, const char *referent, const struct object_id *oid, int flag, void *cb_data)
+static int filter_one(const struct reference *ref, void *cb_data)
 {
 	struct ref_filter_cbdata *ref_cbdata = cb_data;
-	struct ref_array_item *ref;
+	struct ref_array_item *item;
 
-	ref = apply_ref_filter(refname, referent, oid, flag, ref_cbdata->filter);
-	if (ref)
-		ref_array_append(ref_cbdata->array, ref);
+	item = apply_ref_filter(ref->name, ref->target, ref->oid,
+				ref->flags, ref_cbdata->filter);
+	if (item)
+		ref_array_append(ref_cbdata->array, item);
 
 	return 0;
 }
@@ -2990,17 +2991,18 @@ struct ref_filter_and_format_cbdata {
 	} internal;
 };
 
-static int filter_and_format_one(const char *refname, const char *referent, const struct object_id *oid, int flag, void *cb_data)
+static int filter_and_format_one(const struct reference *ref, void *cb_data)
 {
 	struct ref_filter_and_format_cbdata *ref_cbdata = cb_data;
-	struct ref_array_item *ref;
+	struct ref_array_item *item;
 	struct strbuf output = STRBUF_INIT, err = STRBUF_INIT;
 
-	ref = apply_ref_filter(refname, referent, oid, flag, ref_cbdata->filter);
-	if (!ref)
+	item = apply_ref_filter(ref->name, ref->target, ref->oid,
+				ref->flags, ref_cbdata->filter);
+	if (!item)
 		return 0;
 
-	if (format_ref_array_item(ref, ref_cbdata->format, &output, &err))
+	if (format_ref_array_item(item, ref_cbdata->format, &output, &err))
 		die("%s", err.buf);
 
 	if (output.len || !ref_cbdata->format->array_opts.omit_empty) {
@@ -3010,7 +3012,7 @@ static int filter_and_format_one(const char *refname, const char *referent, cons
 
 	strbuf_release(&output);
 	strbuf_release(&err);
-	free_array_item(ref);
+	free_array_item(item);
 
 	/*
 	 * Increment the running count of refs that match the filter. If
diff --git a/reflog.c b/reflog.c
index 65ef259b4f5e1e..ac87e20c4f97ff 100644
--- a/reflog.c
+++ b/reflog.c
@@ -423,16 +423,13 @@ int should_expire_reflog_ent_verbose(struct object_id *ooid,
 	return expire;
 }
 
-static int push_tip_to_list(const char *refname UNUSED,
-			    const char *referent UNUSED,
-			    const struct object_id *oid,
-			    int flags, void *cb_data)
+static int push_tip_to_list(const struct reference *ref, void *cb_data)
 {
 	struct commit_list **list = cb_data;
 	struct commit *tip_commit;
-	if (flags & REF_ISSYMREF)
+	if (ref->flags & REF_ISSYMREF)
 		return 0;
-	tip_commit = lookup_commit_reference_gently(the_repository, oid, 1);
+	tip_commit = lookup_commit_reference_gently(the_repository, ref->oid, 1);
 	if (!tip_commit)
 		return 0;
 	commit_list_insert(tip_commit, list);
diff --git a/refs.c b/refs.c
index 965381367e0e53..25f0579d610ddc 100644
--- a/refs.c
+++ b/refs.c
@@ -426,17 +426,19 @@ int refs_ref_exists(struct ref_store *refs, const char *refname)
 					 NULL, NULL);
 }
 
-static int for_each_filter_refs(const char *refname, const char *referent,
-				const struct object_id *oid,
-				int flags, void *data)
+static int for_each_filter_refs(const struct reference *ref, void *data)
 {
 	struct for_each_ref_filter *filter = data;
 
-	if (wildmatch(filter->pattern, refname, 0))
+	if (wildmatch(filter->pattern, ref->name, 0))
 		return 0;
-	if (filter->prefix)
-		skip_prefix(refname, filter->prefix, &refname);
-	return filter->fn(refname, referent, oid, flags, filter->cb_data);
+	if (filter->prefix) {
+		struct reference skipped = *ref;
+		skip_prefix(skipped.name, filter->prefix, &skipped.name);
+		return filter->fn(&skipped, filter->cb_data);
+	} else {
+		return filter->fn(ref, filter->cb_data);
+	}
 }
 
 struct warn_if_dangling_data {
@@ -447,17 +449,15 @@ struct warn_if_dangling_data {
 	int dry_run;
 };
 
-static int warn_if_dangling_symref(const char *refname, const char *referent UNUSED,
-				   const struct object_id *oid UNUSED,
-				   int flags, void *cb_data)
+static int warn_if_dangling_symref(const struct reference *ref, void *cb_data)
 {
 	struct warn_if_dangling_data *d = cb_data;
 	const char *resolves_to, *msg;
 
-	if (!(flags & REF_ISSYMREF))
+	if (!(ref->flags & REF_ISSYMREF))
 		return 0;
 
-	resolves_to = refs_resolve_ref_unsafe(d->refs, refname, 0, NULL, NULL);
+	resolves_to = refs_resolve_ref_unsafe(d->refs, ref->name, 0, NULL, NULL);
 	if (!resolves_to
 	    || !string_list_has_string(d->refnames, resolves_to)) {
 		return 0;
@@ -466,7 +466,7 @@ static int warn_if_dangling_symref(const char *refname, const char *referent UNU
 	msg = d->dry_run
 		? _("%s%s will become dangling after %s is deleted\n")
 		: _("%s%s has become dangling after %s was deleted\n");
-	fprintf(d->fp, msg, d->indent, refname, resolves_to);
+	fprintf(d->fp, msg, d->indent, ref->name, resolves_to);
 	return 0;
 }
 
@@ -507,8 +507,15 @@ int refs_head_ref_namespaced(struct ref_store *refs, each_ref_fn fn, void *cb_da
 	int flag;
 
 	strbuf_addf(&buf, "%sHEAD", get_git_namespace());
-	if (!refs_read_ref_full(refs, buf.buf, RESOLVE_REF_READING, &oid, &flag))
-		ret = fn(buf.buf, NULL, &oid, flag, cb_data);
+	if (!refs_read_ref_full(refs, buf.buf, RESOLVE_REF_READING, &oid, &flag)) {
+		struct reference ref = {
+			.name = buf.buf,
+			.oid = &oid,
+			.flags = flag,
+		};
+
+		ret = fn(&ref, cb_data);
+	}
 	strbuf_release(&buf);
 
 	return ret;
@@ -1741,8 +1748,15 @@ int refs_head_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
 	int flag;
 
 	if (refs_resolve_ref_unsafe(refs, "HEAD", RESOLVE_REF_READING,
-				    &oid, &flag))
-		return fn("HEAD", NULL, &oid, flag, cb_data);
+				    &oid, &flag)) {
+		struct reference ref = {
+			.name = "HEAD",
+			.oid = &oid,
+			.flags = flag,
+		};
+
+		return fn(&ref, cb_data);
+	}
 
 	return 0;
 }
@@ -2753,14 +2767,10 @@ struct do_for_each_reflog_help {
 	void *cb_data;
 };
 
-static int do_for_each_reflog_helper(const char *refname,
-				     const char *referent UNUSED,
-				     const struct object_id *oid UNUSED,
-				     int flags UNUSED,
-				     void *cb_data)
+static int do_for_each_reflog_helper(const struct reference *ref, void *cb_data)
 {
 	struct do_for_each_reflog_help *hp = cb_data;
-	return hp->fn(refname, hp->cb_data);
+	return hp->fn(ref->name, hp->cb_data);
 }
 
 int refs_for_each_reflog(struct ref_store *refs, each_reflog_fn fn, void *cb_data)
@@ -2976,25 +2986,24 @@ struct migration_data {
 	uint64_t index;
 };
 
-static int migrate_one_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-			   int flags, void *cb_data)
+static int migrate_one_ref(const struct reference *ref, void *cb_data)
 {
 	struct migration_data *data = cb_data;
 	struct strbuf symref_target = STRBUF_INIT;
 	int ret;
 
-	if (flags & REF_ISSYMREF) {
-		ret = refs_read_symbolic_ref(data->old_refs, refname, &symref_target);
+	if (ref->flags & REF_ISSYMREF) {
+		ret = refs_read_symbolic_ref(data->old_refs, ref->name, &symref_target);
 		if (ret < 0)
 			goto done;
 
-		ret = ref_transaction_update(data->transaction, refname, NULL, null_oid(the_hash_algo),
+		ret = ref_transaction_update(data->transaction, ref->name, NULL, null_oid(the_hash_algo),
 					     symref_target.buf, NULL,
 					     REF_SKIP_CREATE_REFLOG | REF_NO_DEREF, NULL, data->errbuf);
 		if (ret < 0)
 			goto done;
 	} else {
-		ret = ref_transaction_create(data->transaction, refname, oid, NULL,
+		ret = ref_transaction_create(data->transaction, ref->name, ref->oid, NULL,
 					     REF_SKIP_CREATE_REFLOG | REF_SKIP_OID_VERIFICATION,
 					     NULL, data->errbuf);
 		if (ret < 0)
diff --git a/refs.h b/refs.h
index 4e6bd63aa86c54..68d235438c2b32 100644
--- a/refs.h
+++ b/refs.h
@@ -355,14 +355,32 @@ struct ref_transaction;
  */
 #define REF_BAD_NAME 0x08
 
+/* A reference passed to `for_each_ref()`-style callbacks. */
+struct reference {
+	/* The fully-qualified name of the reference. */
+	const char *name;
+
+	/* The target of a symbolic ref. `NULL` for direct references. */
+	const char *target;
+
+	/*
+	 * The object ID of a reference. Either the direct object ID or the
+	 * resolved object ID in the case of a symbolic ref. May be the zero
+	 * object ID in case the symbolic ref cannot be resolved.
+	 */
+	const struct object_id *oid;
+
+	/* A bitfield of `REF_` flags. */
+	int flags;
+};
+
 /*
  * The signature for the callback function for the for_each_*()
- * functions below.  The memory pointed to by the refname and oid
- * arguments is only guaranteed to be valid for the duration of a
+ * functions below.  The memory pointed to by the `struct reference`
+ * argument is only guaranteed to be valid for the duration of a
  * single callback invocation.
  */
-typedef int each_ref_fn(const char *refname, const char *referent,
-			const struct object_id *oid, int flags, void *cb_data);
+typedef int each_ref_fn(const struct reference *ref, void *cb_data);
 
 /*
  * The following functions invoke the specified callback function for
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 8d7007f4aaa9da..eb3142f8f2dd32 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3150,14 +3150,11 @@ static int parse_and_write_reflog(struct files_ref_store *refs,
 	return 0;
 }
 
-static int ref_present(const char *refname, const char *referent UNUSED,
-		       const struct object_id *oid UNUSED,
-		       int flags UNUSED,
-		       void *cb_data)
+static int ref_present(const struct reference *ref, void *cb_data)
 {
 	struct string_list *affected_refnames = cb_data;
 
-	return string_list_has_string(affected_refnames, refname);
+	return string_list_has_string(affected_refnames, ref->name);
 }
 
 static int files_transaction_finish_initial(struct files_ref_store *refs,
diff --git a/refs/iterator.c b/refs/iterator.c
index 17ef841d8a3013..7f2e718f1c9107 100644
--- a/refs/iterator.c
+++ b/refs/iterator.c
@@ -476,7 +476,14 @@ int do_for_each_ref_iterator(struct ref_iterator *iter,
 
 	current_ref_iter = iter;
 	while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
-		retval = fn(iter->refname, iter->referent, iter->oid, iter->flags, cb_data);
+		struct reference ref = {
+			.name = iter->refname,
+			.target = iter->referent,
+			.oid = iter->oid,
+			.flags = iter->flags,
+		};
+
+		retval = fn(&ref, cb_data);
 		if (retval)
 			goto out;
 	}
diff --git a/remote.c b/remote.c
index df9675cd330ed1..59b371512084eb 100644
--- a/remote.c
+++ b/remote.c
@@ -2315,21 +2315,19 @@ int format_tracking_info(struct branch *branch, struct strbuf *sb,
 	return 1;
 }
 
-static int one_local_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-			 int flag UNUSED,
-			 void *cb_data)
+static int one_local_ref(const struct reference *ref, void *cb_data)
 {
 	struct ref ***local_tail = cb_data;
-	struct ref *ref;
+	struct ref *local_ref;
 
 	/* we already know it starts with refs/ to get here */
-	if (check_refname_format(refname + 5, 0))
+	if (check_refname_format(ref->name + 5, 0))
 		return 0;
 
-	ref = alloc_ref(refname);
-	oidcpy(&ref->new_oid, oid);
-	**local_tail = ref;
-	*local_tail = &ref->next;
+	local_ref = alloc_ref(ref->name);
+	oidcpy(&local_ref->new_oid, ref->oid);
+	**local_tail = local_ref;
+	*local_tail = &local_ref->next;
 	return 0;
 }
 
@@ -2402,15 +2400,14 @@ struct stale_heads_info {
 	struct refspec *rs;
 };
 
-static int get_stale_heads_cb(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-			      int flags, void *cb_data)
+static int get_stale_heads_cb(const struct reference *ref, void *cb_data)
 {
 	struct stale_heads_info *info = cb_data;
 	struct string_list matches = STRING_LIST_INIT_DUP;
 	struct refspec_item query;
 	int i, stale = 1;
 	memset(&query, 0, sizeof(struct refspec_item));
-	query.dst = (char *)refname;
+	query.dst = (char *)ref->name;
 
 	refspec_find_all_matches(info->rs, &query, &matches);
 	if (matches.nr == 0)
@@ -2423,7 +2420,7 @@ static int get_stale_heads_cb(const char *refname, const char *referent UNUSED,
 	 * overlapping refspecs, we need to go over all of the
 	 * matching refs.
 	 */
-	if (flags & REF_ISSYMREF)
+	if (ref->flags & REF_ISSYMREF)
 		goto clean_exit;
 
 	for (i = 0; stale && i < matches.nr; i++)
@@ -2431,8 +2428,8 @@ static int get_stale_heads_cb(const char *refname, const char *referent UNUSED,
 			stale = 0;
 
 	if (stale) {
-		struct ref *ref = make_linked_ref(refname, &info->stale_refs_tail);
-		oidcpy(&ref->new_oid, oid);
+		struct ref *linked_ref = make_linked_ref(ref->name, &info->stale_refs_tail);
+		oidcpy(&linked_ref->new_oid, ref->oid);
 	}
 
 clean_exit:
diff --git a/repack-midx.c b/repack-midx.c
index 6f6202c5bccd89..349f7e20b53f25 100644
--- a/repack-midx.c
+++ b/repack-midx.c
@@ -16,25 +16,23 @@ struct midx_snapshot_ref_data {
 	int preferred;
 };
 
-static int midx_snapshot_ref_one(const char *refname UNUSED,
-				 const char *referent UNUSED,
-				 const struct object_id *oid,
-				 int flag UNUSED, void *_data)
+static int midx_snapshot_ref_one(const struct reference *ref, void *_data)
 {
 	struct midx_snapshot_ref_data *data = _data;
+	const struct object_id *maybe_peeled = ref->oid;
 	struct object_id peeled;
 
-	if (!peel_iterated_oid(data->repo, oid, &peeled))
-		oid = &peeled;
+	if (!peel_iterated_oid(data->repo, ref->oid, &peeled))
+		maybe_peeled = &peeled;
 
-	if (oidset_insert(&data->seen, oid))
+	if (oidset_insert(&data->seen, maybe_peeled))
 		return 0; /* already seen */
 
-	if (odb_read_object_info(data->repo->objects, oid, NULL) != OBJ_COMMIT)
+	if (odb_read_object_info(data->repo->objects, maybe_peeled, NULL) != OBJ_COMMIT)
 		return 0;
 
 	fprintf(data->f->fp, "%s%s\n", data->preferred ? "+" : "",
-		oid_to_hex(oid));
+		oid_to_hex(maybe_peeled));
 
 	return 0;
 }
diff --git a/replace-object.c b/replace-object.c
index 3eae0510745eae..03d0f1f083bed9 100644
--- a/replace-object.c
+++ b/replace-object.c
@@ -8,31 +8,27 @@
 #include "repository.h"
 #include "commit.h"
 
-static int register_replace_ref(const char *refname,
-				const char *referent UNUSED,
-				const struct object_id *oid,
-				int flag UNUSED,
-				void *cb_data)
+static int register_replace_ref(const struct reference *ref, void *cb_data)
 {
 	struct repository *r = cb_data;
 
 	/* Get sha1 from refname */
-	const char *slash = strrchr(refname, '/');
-	const char *hash = slash ? slash + 1 : refname;
+	const char *slash = strrchr(ref->name, '/');
+	const char *hash = slash ? slash + 1 : ref->name;
 	struct replace_object *repl_obj = xmalloc(sizeof(*repl_obj));
 
 	if (get_oid_hex_algop(hash, &repl_obj->original.oid, r->hash_algo)) {
 		free(repl_obj);
-		warning(_("bad replace ref name: %s"), refname);
+		warning(_("bad replace ref name: %s"), ref->name);
 		return 0;
 	}
 
 	/* Copy sha1 from the read ref */
-	oidcpy(&repl_obj->replacement, oid);
+	oidcpy(&repl_obj->replacement, ref->oid);
 
 	/* Register new object */
 	if (oidmap_put(&r->objects->replace_map, repl_obj))
-		die(_("duplicate replace ref: %s"), refname);
+		die(_("duplicate replace ref: %s"), ref->name);
 
 	return 0;
 }
diff --git a/revision.c b/revision.c
index cf5e6c1ec9e4e1..5f0850ae5c9c1a 100644
--- a/revision.c
+++ b/revision.c
@@ -1644,19 +1644,17 @@ struct all_refs_cb {
 	struct worktree *wt;
 };
 
-static int handle_one_ref(const char *path, const char *referent UNUSED, const struct object_id *oid,
-			  int flag UNUSED,
-			  void *cb_data)
+static int handle_one_ref(const struct reference *ref, void *cb_data)
 {
 	struct all_refs_cb *cb = cb_data;
 	struct object *object;
 
-	if (ref_excluded(&cb->all_revs->ref_excludes, path))
+	if (ref_excluded(&cb->all_revs->ref_excludes, ref->name))
 	    return 0;
 
-	object = get_reference(cb->all_revs, path, oid, cb->all_flags);
-	add_rev_cmdline(cb->all_revs, object, path, REV_CMD_REF, cb->all_flags);
-	add_pending_object(cb->all_revs, object, path);
+	object = get_reference(cb->all_revs, ref->name, ref->oid, cb->all_flags);
+	add_rev_cmdline(cb->all_revs, object, ref->name, REV_CMD_REF, cb->all_flags);
+	add_pending_object(cb->all_revs, object, ref->name);
 	return 0;
 }
 
diff --git a/server-info.c b/server-info.c
index 1d33de821e9f5e..0a07c722e8bfe5 100644
--- a/server-info.c
+++ b/server-info.c
@@ -148,23 +148,21 @@ static int update_info_file(struct repository *r, char *path,
 	return ret;
 }
 
-static int add_info_ref(const char *path, const char *referent UNUSED, const struct object_id *oid,
-			int flag UNUSED,
-			void *cb_data)
+static int add_info_ref(const struct reference *ref, void *cb_data)
 {
 	struct update_info_ctx *uic = cb_data;
-	struct object *o = parse_object(uic->repo, oid);
+	struct object *o = parse_object(uic->repo, ref->oid);
 	if (!o)
 		return -1;
 
-	if (uic_printf(uic, "%s	%s\n", oid_to_hex(oid), path) < 0)
+	if (uic_printf(uic, "%s	%s\n", oid_to_hex(ref->oid), ref->name) < 0)
 		return -1;
 
 	if (o->type == OBJ_TAG) {
-		o = deref_tag(uic->repo, o, path, 0);
+		o = deref_tag(uic->repo, o, ref->name, 0);
 		if (o)
 			if (uic_printf(uic, "%s	%s^{}\n",
-				oid_to_hex(&o->oid), path) < 0)
+				oid_to_hex(&o->oid), ref->name) < 0)
 				return -1;
 	}
 	return 0;
diff --git a/shallow.c b/shallow.c
index d9cd4e219cb07d..55b9cd9d3f29f1 100644
--- a/shallow.c
+++ b/shallow.c
@@ -626,14 +626,10 @@ static void paint_down(struct paint_info *info, const struct object_id *oid,
 	free(tmp);
 }
 
-static int mark_uninteresting(const char *refname UNUSED,
-			      const char *referent UNUSED,
-			      const struct object_id *oid,
-			      int flags UNUSED,
-			      void *cb_data UNUSED)
+static int mark_uninteresting(const struct reference *ref, void *cb_data UNUSED)
 {
 	struct commit *commit = lookup_commit_reference_gently(the_repository,
-							       oid, 1);
+							       ref->oid, 1);
 	if (!commit)
 		return 0;
 	commit->object.flags |= UNINTERESTING;
@@ -742,16 +738,12 @@ struct commit_array {
 	size_t nr, alloc;
 };
 
-static int add_ref(const char *refname UNUSED,
-		  const char *referent UNUSED,
-		   const struct object_id *oid,
-		   int flags UNUSED,
-		   void *cb_data)
+static int add_ref(const struct reference *ref, void *cb_data)
 {
 	struct commit_array *ca = cb_data;
 	ALLOC_GROW(ca->commits, ca->nr + 1, ca->alloc);
 	ca->commits[ca->nr] = lookup_commit_reference_gently(the_repository,
-							     oid, 1);
+							     ref->oid, 1);
 	if (ca->commits[ca->nr])
 		ca->nr++;
 	return 0;
diff --git a/submodule.c b/submodule.c
index 35c55155f7bf83..40a5c6fb9d1545 100644
--- a/submodule.c
+++ b/submodule.c
@@ -934,10 +934,7 @@ static void free_submodules_data(struct string_list *submodules)
 	string_list_clear(submodules, 1);
 }
 
-static int has_remote(const char *refname UNUSED,
-		      const char *referent UNUSED,
-		      const struct object_id *oid UNUSED,
-		      int flags UNUSED, void *cb_data UNUSED)
+static int has_remote(const struct reference *ref UNUSED, void *cb_data UNUSED)
 {
 	return 1;
 }
@@ -1255,13 +1252,10 @@ int push_unpushed_submodules(struct repository *r,
 	return ret;
 }
 
-static int append_oid_to_array(const char *ref UNUSED,
-			       const char *referent UNUSED,
-			       const struct object_id *oid,
-			       int flags UNUSED, void *data)
+static int append_oid_to_array(const struct reference *ref, void *data)
 {
 	struct oid_array *array = data;
-	oid_array_append(array, oid);
+	oid_array_append(array, ref->oid);
 	return 0;
 }
 
diff --git a/t/helper/test-ref-store.c b/t/helper/test-ref-store.c
index 83b06d39a36235..b1215947c5e67a 100644
--- a/t/helper/test-ref-store.c
+++ b/t/helper/test-ref-store.c
@@ -154,10 +154,9 @@ static int cmd_rename_ref(struct ref_store *refs, const char **argv)
 	return refs_rename_ref(refs, oldref, newref, logmsg);
 }
 
-static int each_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-		    int flags, void *cb_data UNUSED)
+static int each_ref(const struct reference *ref, void *cb_data UNUSED)
 {
-	printf("%s %s 0x%x\n", oid_to_hex(oid), refname, flags);
+	printf("%s %s 0x%x\n", oid_to_hex(ref->oid), ref->name, ref->flags);
 	return 0;
 }
 
diff --git a/upload-pack.c b/upload-pack.c
index 1e87ae95593063..0d563ae74e92be 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -870,8 +870,8 @@ static void send_unshallow(struct upload_pack_data *data)
 	}
 }
 
-static int check_ref(const char *refname_full, const char *referent UNUSED, const struct object_id *oid,
-		     int flag, void *cb_data);
+static int check_ref(const struct reference *ref, void *cb_data);
+
 static void deepen(struct upload_pack_data *data, int depth)
 {
 	if (depth == INFINITE_DEPTH && !is_repository_shallow(the_repository)) {
@@ -1224,13 +1224,12 @@ static int mark_our_ref(const char *refname, const char *refname_full,
 	return 0;
 }
 
-static int check_ref(const char *refname_full, const char *referent UNUSED,const struct object_id *oid,
-		     int flag UNUSED, void *cb_data)
+static int check_ref(const struct reference *ref, void *cb_data)
 {
-	const char *refname = strip_namespace(refname_full);
+	const char *refname = strip_namespace(ref->name);
 	struct upload_pack_data *data = cb_data;
 
-	mark_our_ref(refname, refname_full, oid, &data->hidden_refs);
+	mark_our_ref(refname, ref->name, ref->oid, &data->hidden_refs);
 	return 0;
 }
 
@@ -1292,27 +1291,25 @@ static void write_v0_ref(struct upload_pack_data *data,
 	return;
 }
 
-static int send_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
-		    int flag UNUSED, void *cb_data)
+static int send_ref(const struct reference *ref, void *cb_data)
 {
-	write_v0_ref(cb_data, refname, strip_namespace(refname), oid);
+	write_v0_ref(cb_data, ref->name, strip_namespace(ref->name), ref->oid);
 	return 0;
 }
 
-static int find_symref(const char *refname, const char *referent UNUSED,
-		       const struct object_id *oid UNUSED,
-		       int flag, void *cb_data)
+static int find_symref(const struct reference *ref, void *cb_data)
 {
 	const char *symref_target;
 	struct string_list_item *item;
+	int flag;
 
-	if ((flag & REF_ISSYMREF) == 0)
+	if ((ref->flags & REF_ISSYMREF) == 0)
 		return 0;
 	symref_target = refs_resolve_ref_unsafe(get_main_ref_store(the_repository),
-						refname, 0, NULL, &flag);
+						ref->name, 0, NULL, &flag);
 	if (!symref_target || (flag & REF_ISSYMREF) == 0)
-		die("'%s' is a symref but it is not?", refname);
-	item = string_list_append(cb_data, strip_namespace(refname));
+		die("'%s' is a symref but it is not?", ref->name);
+	item = string_list_append(cb_data, strip_namespace(ref->name));
 	item->util = xstrdup(strip_namespace(symref_target));
 	return 0;
 }
diff --git a/walker.c b/walker.c
index 80737545172bbd..409b646578a3d4 100644
--- a/walker.c
+++ b/walker.c
@@ -226,14 +226,10 @@ static int interpret_target(struct walker *walker, char *target, struct object_i
 	return -1;
 }
 
-static int mark_complete(const char *path UNUSED,
-			const char *referent UNUSED,
-			 const struct object_id *oid,
-			 int flag UNUSED,
-			 void *cb_data UNUSED)
+static int mark_complete(const struct reference *ref, void *cb_data UNUSED)
 {
 	struct commit *commit = lookup_commit_reference_gently(the_repository,
-							       oid, 1);
+							       ref->oid, 1);
 
 	if (commit) {
 		commit->object.flags |= COMPLETE;
diff --git a/worktree.c b/worktree.c
index a2a5f51f29fca2..9308389cb6f029 100644
--- a/worktree.c
+++ b/worktree.c
@@ -595,8 +595,15 @@ int other_head_refs(each_ref_fn fn, void *cb_data)
 		if (refs_resolve_ref_unsafe(get_main_ref_store(the_repository),
 					    refname.buf,
 					    RESOLVE_REF_READING,
-					    &oid, &flag))
-			ret = fn(refname.buf, NULL, &oid, flag, cb_data);
+					    &oid, &flag)) {
+			struct reference ref = {
+				.name = refname.buf,
+				.oid = &oid,
+				.flags = flag,
+			};
+
+			ret = fn(&ref, cb_data);
+		}
 		if (ret)
 			break;
 	}

From 89baa52da612dde6da031acfa2cb957d4297d544 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 23 Oct 2025 09:16:11 +0200
Subject: [PATCH 025/553] refs: introduce `.ref` field for the base iterator

The base iterator has a couple of fields that track the name, target,
object ID and flags for the current reference. Due to this design we
have to create a new `struct reference` whenever we want to hand over
that reference to the callback function, which is tedious and not very
efficient.

Convert the structure to instead contain a `struct reference` as member.
This member is expected to be populated by the implementations of the
iterator and is handed over to the callback directly.
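
As a rough illustration of the idea, and not the actual code in
refs/iterator.c, the sketch below embeds the reference in a toy iterator
and passes a pointer to that member to the callback. The names
`toy_iterator`, `toy_refs`, `toy_advance()`, `toy_for_each()` and
`count_refs()` are hypothetical, and the object ID is simplified to a
hex string for the sake of a self-contained example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified stand-in for the struct in refs.h. Assumption: `oid` is a
 * plain hex string here rather than a `struct object_id`.
 */
struct reference {
	const char *name;
	const char *target;
	const char *oid;
	int flags;
};

typedef int each_ref_fn(const struct reference *ref, void *cb_data);

/* Hypothetical iterator: the current reference is a member. */
struct toy_iterator {
	struct reference ref;
	size_t pos;
};

/* Hypothetical backing store with made-up values. */
static const struct reference toy_refs[] = {
	{ .name = "HEAD", .target = "refs/heads/main", .oid = "1234", .flags = 1 },
	{ .name = "refs/heads/main", .oid = "1234" },
};

/* Backend side: populate iter->ref in place; returns 0 when exhausted. */
static int toy_advance(struct toy_iterator *iter)
{
	if (iter->pos >= sizeof(toy_refs) / sizeof(toy_refs[0]))
		return 0;
	iter->ref = toy_refs[iter->pos++];
	return 1;
}

/* Driver side: hand the embedded member to the callback directly. */
static int toy_for_each(struct toy_iterator *iter, each_ref_fn *fn, void *cb_data)
{
	assert(iter && fn);
	while (toy_advance(iter)) {
		int ret = fn(&iter->ref, cb_data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Example callback: counts the references it is shown. */
static int count_refs(const struct reference *ref, void *cb_data)
{
	(void)ref;
	*(int *)cb_data += 1;
	return 0;
}
```

In this sketch the driver no longer builds a temporary structure per
iteration. A wrapping iterator can forward the current reference with a
single struct assignment.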

While at it, simplify `should_pack_ref()` to take a `struct reference`
directly instead of passing its respective fields.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 refs.c                  |  8 +++----
 refs/debug.c            |  8 +++----
 refs/files-backend.c    | 47 ++++++++++++++++++-----------------------
 refs/iterator.c         | 39 +++++++++++-----------------------
 refs/packed-backend.c   | 46 ++++++++++++++++++++--------------------
 refs/ref-cache.c        | 10 ++++-----
 refs/refs-internal.h    |  5 +----
 refs/reftable-backend.c | 12 +++++------
 8 files changed, 75 insertions(+), 100 deletions(-)

diff --git a/refs.c b/refs.c
index 25f0579d610ddc..f96cf43b128a27 100644
--- a/refs.c
+++ b/refs.c
@@ -2327,8 +2327,8 @@ int refs_optimize(struct ref_store *refs, struct pack_refs_opts *opts)
 int peel_iterated_oid(struct repository *r, const struct object_id *base, struct object_id *peeled)
 {
 	if (current_ref_iter &&
-	    (current_ref_iter->oid == base ||
-	     oideq(current_ref_iter->oid, base)))
+	    (current_ref_iter->ref.oid == base ||
+	     oideq(current_ref_iter->ref.oid, base)))
 		return ref_iterator_peel(current_ref_iter, peeled);
 
 	return peel_object(r, base, peeled) ? -1 : 0;
@@ -2703,7 +2703,7 @@ enum ref_transaction_error refs_verify_refnames_available(struct ref_store *refs
 
 			while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
 				if (skip &&
-				    string_list_has_string(skip, iter->refname))
+				    string_list_has_string(skip, iter->ref.name))
 					continue;
 
 				if (transaction && ref_transaction_maybe_set_rejected(
@@ -2712,7 +2712,7 @@ enum ref_transaction_error refs_verify_refnames_available(struct ref_store *refs
 					continue;
 
 				strbuf_addf(err, _("'%s' exists; cannot create '%s'"),
-					    iter->refname, refname);
+					    iter->ref.name, refname);
 				goto cleanup;
 			}
 
diff --git a/refs/debug.c b/refs/debug.c
index 697adbd0dc3f65..67718bd1f49f1f 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -160,11 +160,9 @@ static int debug_ref_iterator_advance(struct ref_iterator *ref_iterator)
 		trace_printf_key(&trace_refs, "iterator_advance: (%d)\n", res);
 	else
 		trace_printf_key(&trace_refs, "iterator_advance: %s (0)\n",
-			diter->iter->refname);
+			diter->iter->ref.name);
 
-	diter->base.refname = diter->iter->refname;
-	diter->base.oid = diter->iter->oid;
-	diter->base.flags = diter->iter->flags;
+	diter->base.ref = diter->iter->ref;
 	return res;
 }
 
@@ -185,7 +183,7 @@ static int debug_ref_iterator_peel(struct ref_iterator *ref_iterator,
 	struct debug_ref_iterator *diter =
 		(struct debug_ref_iterator *)ref_iterator;
 	int res = diter->iter->vtable->peel(diter->iter, peeled);
-	trace_printf_key(&trace_refs, "iterator_peel: %s: %d\n", diter->iter->refname, res);
+	trace_printf_key(&trace_refs, "iterator_peel: %s: %d\n", diter->iter->ref.name, res);
 	return res;
 }
 
diff --git a/refs/files-backend.c b/refs/files-backend.c
index eb3142f8f2dd32..fac53fa052dd22 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -961,26 +961,23 @@ static int files_ref_iterator_advance(struct ref_iterator *ref_iterator)
 
 	while ((ok = ref_iterator_advance(iter->iter0)) == ITER_OK) {
 		if (iter->flags & DO_FOR_EACH_PER_WORKTREE_ONLY &&
-		    parse_worktree_ref(iter->iter0->refname, NULL, NULL,
+		    parse_worktree_ref(iter->iter0->ref.name, NULL, NULL,
 				       NULL) != REF_WORKTREE_CURRENT)
 			continue;
 
 		if ((iter->flags & DO_FOR_EACH_OMIT_DANGLING_SYMREFS) &&
-		    (iter->iter0->flags & REF_ISSYMREF) &&
-		    (iter->iter0->flags & REF_ISBROKEN))
+		    (iter->iter0->ref.flags & REF_ISSYMREF) &&
+		    (iter->iter0->ref.flags & REF_ISBROKEN))
 			continue;
 
 		if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
-		    !ref_resolves_to_object(iter->iter0->refname,
+		    !ref_resolves_to_object(iter->iter0->ref.name,
 					    iter->repo,
-					    iter->iter0->oid,
-					    iter->iter0->flags))
+					    iter->iter0->ref.oid,
+					    iter->iter0->ref.flags))
 			continue;
 
-		iter->base.refname = iter->iter0->refname;
-		iter->base.oid = iter->iter0->oid;
-		iter->base.flags = iter->iter0->flags;
-		iter->base.referent = iter->iter0->referent;
+		iter->base.ref = iter->iter0->ref;
 
 		return ITER_OK;
 	}
@@ -1367,30 +1364,29 @@ static void prune_refs(struct files_ref_store *refs, struct ref_to_prune **refs_
  * Return true if the specified reference should be packed.
  */
 static int should_pack_ref(struct files_ref_store *refs,
-			   const char *refname,
-			   const struct object_id *oid, unsigned int ref_flags,
+			   const struct reference *ref,
 			   struct pack_refs_opts *opts)
 {
 	struct string_list_item *item;
 
 	/* Do not pack per-worktree refs: */
-	if (parse_worktree_ref(refname, NULL, NULL, NULL) !=
+	if (parse_worktree_ref(ref->name, NULL, NULL, NULL) !=
 	    REF_WORKTREE_SHARED)
 		return 0;
 
 	/* Do not pack symbolic refs: */
-	if (ref_flags & REF_ISSYMREF)
+	if (ref->flags & REF_ISSYMREF)
 		return 0;
 
 	/* Do not pack broken refs: */
-	if (!ref_resolves_to_object(refname, refs->base.repo, oid, ref_flags))
+	if (!ref_resolves_to_object(ref->name, refs->base.repo, ref->oid, ref->flags))
 		return 0;
 
-	if (ref_excluded(opts->exclusions, refname))
+	if (ref_excluded(opts->exclusions, ref->name))
 		return 0;
 
 	for_each_string_list_item(item, opts->includes)
-		if (!wildmatch(item->string, refname, 0))
+		if (!wildmatch(item->string, ref->name, 0))
 			return 1;
 
 	return 0;
@@ -1443,8 +1439,7 @@ static int should_pack_refs(struct files_ref_store *refs,
 	iter = cache_ref_iterator_begin(get_loose_ref_cache(refs, 0), NULL,
 					refs->base.repo, 0);
 	while ((ret = ref_iterator_advance(iter)) == ITER_OK) {
-		if (should_pack_ref(refs, iter->refname, iter->oid,
-				    iter->flags, opts))
+		if (should_pack_ref(refs, &iter->ref, opts))
 			refcount++;
 		if (refcount >= limit) {
 			ref_iterator_free(iter);
@@ -1489,24 +1484,24 @@ static int files_pack_refs(struct ref_store *ref_store,
 		 * in the packed ref cache. If the reference should be
 		 * pruned, also add it to refs_to_prune.
 		 */
-		if (!should_pack_ref(refs, iter->refname, iter->oid, iter->flags, opts))
+		if (!should_pack_ref(refs, &iter->ref, opts))
 			continue;
 
 		/*
 		 * Add a reference creation for this reference to the
 		 * packed-refs transaction:
 		 */
-		if (ref_transaction_update(transaction, iter->refname,
-					   iter->oid, NULL, NULL, NULL,
+		if (ref_transaction_update(transaction, iter->ref.name,
+					   iter->ref.oid, NULL, NULL, NULL,
 					   REF_NO_DEREF, NULL, &err))
 			die("failure preparing to create packed reference %s: %s",
-			    iter->refname, err.buf);
+			    iter->ref.name, err.buf);
 
 		/* Schedule the loose reference for pruning if requested. */
 		if ((opts->flags & PACK_REFS_PRUNE)) {
 			struct ref_to_prune *n;
-			FLEX_ALLOC_STR(n, name, iter->refname);
-			oidcpy(&n->oid, iter->oid);
+			FLEX_ALLOC_STR(n, name, iter->ref.name);
+			oidcpy(&n->oid, iter->ref.oid);
 			n->next = refs_to_prune;
 			refs_to_prune = n;
 		}
@@ -2379,7 +2374,7 @@ static int files_reflog_iterator_advance(struct ref_iterator *ref_iterator)
 					 REFNAME_ALLOW_ONELEVEL))
 			continue;
 
-		iter->base.refname = diter->relative_path;
+		iter->base.ref.name = diter->relative_path;
 		return ITER_OK;
 	}
 
diff --git a/refs/iterator.c b/refs/iterator.c
index 7f2e718f1c9107..fe5980e1b6c96b 100644
--- a/refs/iterator.c
+++ b/refs/iterator.c
@@ -41,10 +41,7 @@ void base_ref_iterator_init(struct ref_iterator *iter,
 			    struct ref_iterator_vtable *vtable)
 {
 	iter->vtable = vtable;
-	iter->refname = NULL;
-	iter->referent = NULL;
-	iter->oid = NULL;
-	iter->flags = 0;
+	memset(&iter->ref, 0, sizeof(iter->ref));
 }
 
 struct empty_ref_iterator {
@@ -127,8 +124,8 @@ enum iterator_selection ref_iterator_select(struct ref_iterator *iter_worktree,
 		 * latter.
 		 */
 		if (iter_worktree) {
-			int cmp = strcmp(iter_worktree->refname,
-					 iter_common->refname);
+			int cmp = strcmp(iter_worktree->ref.name,
+					 iter_common->ref.name);
 			if (cmp < 0)
 				return ITER_SELECT_0;
 			else if (!cmp)
@@ -139,7 +136,7 @@ enum iterator_selection ref_iterator_select(struct ref_iterator *iter_worktree,
 		  * We now know that the lexicographically-next ref is a common
 		  * ref. When the common ref is a shared one we return it.
 		  */
-		if (parse_worktree_ref(iter_common->refname, NULL, NULL,
+		if (parse_worktree_ref(iter_common->ref.name, NULL, NULL,
 				       NULL) == REF_WORKTREE_SHARED)
 			return ITER_SELECT_1;
 
@@ -212,10 +209,7 @@ static int merge_ref_iterator_advance(struct ref_iterator *ref_iterator)
 		}
 
 		if (selection & ITER_YIELD_CURRENT) {
-			iter->base.referent = (*iter->current)->referent;
-			iter->base.refname = (*iter->current)->refname;
-			iter->base.oid = (*iter->current)->oid;
-			iter->base.flags = (*iter->current)->flags;
+			iter->base.ref = (*iter->current)->ref;
 			return ITER_OK;
 		}
 	}
@@ -313,7 +307,7 @@ static enum iterator_selection overlay_iterator_select(
 	else if (!front)
 		return ITER_SELECT_1;
 
-	cmp = strcmp(front->refname, back->refname);
+	cmp = strcmp(front->ref.name, back->ref.name);
 
 	if (cmp < 0)
 		return ITER_SELECT_0;
@@ -371,7 +365,7 @@ static int prefix_ref_iterator_advance(struct ref_iterator *ref_iterator)
 	int ok;
 
 	while ((ok = ref_iterator_advance(iter->iter0)) == ITER_OK) {
-		int cmp = compare_prefix(iter->iter0->refname, iter->prefix);
+		int cmp = compare_prefix(iter->iter0->ref.name, iter->prefix);
 		if (cmp < 0)
 			continue;
 		/*
@@ -382,6 +376,8 @@ static int prefix_ref_iterator_advance(struct ref_iterator *ref_iterator)
 		if (cmp > 0)
 			return ITER_DONE;
 
+		iter->base.ref = iter->iter0->ref;
+
 		if (iter->trim) {
 			/*
 			 * It is nonsense to trim off characters that
@@ -392,15 +388,11 @@ static int prefix_ref_iterator_advance(struct ref_iterator *ref_iterator)
 			 * one character left in the refname after
 			 * trimming, report it as a bug:
 			 */
-			if (strlen(iter->iter0->refname) <= iter->trim)
+			if (strlen(iter->base.ref.name) <= iter->trim)
 				BUG("attempt to trim too many characters");
-			iter->base.refname = iter->iter0->refname + iter->trim;
-		} else {
-			iter->base.refname = iter->iter0->refname;
+			iter->base.ref.name += iter->trim;
 		}
 
-		iter->base.oid = iter->iter0->oid;
-		iter->base.flags = iter->iter0->flags;
 		return ITER_OK;
 	}
 
@@ -476,14 +468,7 @@ int do_for_each_ref_iterator(struct ref_iterator *iter,
 
 	current_ref_iter = iter;
 	while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
-		struct reference ref = {
-			.name = iter->refname,
-			.target = iter->referent,
-			.oid = iter->oid,
-			.flags = iter->flags,
-		};
-
-		retval = fn(&ref, cb_data);
+		retval = fn(&iter->ref, cb_data);
 		if (retval)
 			goto out;
 	}
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index a8c22a0a7ff8cd..7987acdc96a14b 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -908,7 +908,7 @@ static int next_record(struct packed_ref_iterator *iter)
 	if (iter->pos == iter->eof)
 		return ITER_DONE;
 
-	iter->base.flags = REF_ISPACKED;
+	iter->base.ref.flags = REF_ISPACKED;
 	p = iter->pos;
 
 	if (iter->eof - p < snapshot_hexsz(iter->snapshot) + 2 ||
@@ -923,22 +923,22 @@ static int next_record(struct packed_ref_iterator *iter)
 				      iter->pos, iter->eof - iter->pos);
 
 	strbuf_add(&iter->refname_buf, p, eol - p);
-	iter->base.refname = iter->refname_buf.buf;
+	iter->base.ref.name = iter->refname_buf.buf;
 
 	if (refname_contains_nul(&iter->refname_buf))
-		die("packed refname contains embedded NULL: %s", iter->base.refname);
+		die("packed refname contains embedded NULL: %s", iter->base.ref.name);
 
-	if (check_refname_format(iter->base.refname, REFNAME_ALLOW_ONELEVEL)) {
-		if (!refname_is_safe(iter->base.refname))
+	if (check_refname_format(iter->base.ref.name, REFNAME_ALLOW_ONELEVEL)) {
+		if (!refname_is_safe(iter->base.ref.name))
 			die("packed refname is dangerous: %s",
-			    iter->base.refname);
+			    iter->base.ref.name);
 		oidclr(&iter->oid, iter->repo->hash_algo);
-		iter->base.flags |= REF_BAD_NAME | REF_ISBROKEN;
+		iter->base.ref.flags |= REF_BAD_NAME | REF_ISBROKEN;
 	}
 	if (iter->snapshot->peeled == PEELED_FULLY ||
 	    (iter->snapshot->peeled == PEELED_TAGS &&
-	     starts_with(iter->base.refname, "refs/tags/")))
-		iter->base.flags |= REF_KNOWS_PEELED;
+	     starts_with(iter->base.ref.name, "refs/tags/")))
+		iter->base.ref.flags |= REF_KNOWS_PEELED;
 
 	iter->pos = eol + 1;
 
@@ -956,11 +956,11 @@ static int next_record(struct packed_ref_iterator *iter)
 		 * definitely know the value of *this* reference. But
 		 * we suppress it if the reference is broken:
 		 */
-		if ((iter->base.flags & REF_ISBROKEN)) {
+		if ((iter->base.ref.flags & REF_ISBROKEN)) {
 			oidclr(&iter->peeled, iter->repo->hash_algo);
-			iter->base.flags &= ~REF_KNOWS_PEELED;
+			iter->base.ref.flags &= ~REF_KNOWS_PEELED;
 		} else {
-			iter->base.flags |= REF_KNOWS_PEELED;
+			iter->base.ref.flags |= REF_KNOWS_PEELED;
 		}
 	} else {
 		oidclr(&iter->peeled, iter->repo->hash_algo);
@@ -976,15 +976,15 @@ static int packed_ref_iterator_advance(struct ref_iterator *ref_iterator)
 	int ok;
 
 	while ((ok = next_record(iter)) == ITER_OK) {
-		const char *refname = iter->base.refname;
+		const char *refname = iter->base.ref.name;
 		const char *prefix = iter->prefix;
 
 		if (iter->flags & DO_FOR_EACH_PER_WORKTREE_ONLY &&
-		    !is_per_worktree_ref(iter->base.refname))
+		    !is_per_worktree_ref(iter->base.ref.name))
 			continue;
 
 		if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
-		    !ref_resolves_to_object(iter->base.refname, iter->repo,
+		    !ref_resolves_to_object(iter->base.ref.name, iter->repo,
 					    &iter->oid, iter->flags))
 			continue;
 
@@ -1033,10 +1033,10 @@ static int packed_ref_iterator_peel(struct ref_iterator *ref_iterator,
 	struct packed_ref_iterator *iter =
 		(struct packed_ref_iterator *)ref_iterator;
 
-	if ((iter->base.flags & REF_KNOWS_PEELED)) {
+	if ((iter->base.ref.flags & REF_KNOWS_PEELED)) {
 		oidcpy(peeled, &iter->peeled);
 		return is_null_oid(&iter->peeled) ? -1 : 0;
-	} else if ((iter->base.flags & (REF_ISBROKEN | REF_ISSYMREF))) {
+	} else if ((iter->base.ref.flags & (REF_ISBROKEN | REF_ISSYMREF))) {
 		return -1;
 	} else {
 		return peel_object(iter->repo, &iter->oid, peeled) ? -1 : 0;
@@ -1194,7 +1194,7 @@ static struct ref_iterator *packed_ref_iterator_begin(
 	iter->snapshot = snapshot;
 	acquire_snapshot(snapshot);
 	strbuf_init(&iter->refname_buf, 0);
-	iter->base.oid = &iter->oid;
+	iter->base.ref.oid = &iter->oid;
 	iter->repo = ref_store->repo;
 	iter->flags = flags;
 
@@ -1436,7 +1436,7 @@ static enum ref_transaction_error write_with_updates(struct packed_ref_store *re
 			if (!iter)
 				cmp = +1;
 			else
-				cmp = strcmp(iter->refname, update->refname);
+				cmp = strcmp(iter->ref.name, update->refname);
 		}
 
 		if (!cmp) {
@@ -1459,11 +1459,11 @@ static enum ref_transaction_error write_with_updates(struct packed_ref_store *re
 					}
 
 					goto error;
-				} else if (!oideq(&update->old_oid, iter->oid)) {
+				} else if (!oideq(&update->old_oid, iter->ref.oid)) {
 					strbuf_addf(err, "cannot update ref '%s': "
 						    "is at %s but expected %s",
 						    update->refname,
-						    oid_to_hex(iter->oid),
+						    oid_to_hex(iter->ref.oid),
 						    oid_to_hex(&update->old_oid));
 					ret = REF_TRANSACTION_ERROR_INCORRECT_OLD_VALUE;
 
@@ -1527,8 +1527,8 @@ static enum ref_transaction_error write_with_updates(struct packed_ref_store *re
 			struct object_id peeled;
 			int peel_error = ref_iterator_peel(iter, &peeled);
 
-			if (write_packed_entry(out, iter->refname,
-					       iter->oid,
+			if (write_packed_entry(out, iter->ref.name,
+					       iter->ref.oid,
 					       peel_error ? NULL : &peeled))
 				goto write_error;
 
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index e5e5df16d85e40..f1abc39624166e 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -425,10 +425,10 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
 			level->prefix_state = entry_prefix_state;
 			level->index = -1;
 		} else {
-			iter->base.refname = entry->name;
-			iter->base.referent = entry->u.value.referent;
-			iter->base.oid = &entry->u.value.oid;
-			iter->base.flags = entry->flag;
+			iter->base.ref.name = entry->name;
+			iter->base.ref.target = entry->u.value.referent;
+			iter->base.ref.oid = &entry->u.value.oid;
+			iter->base.ref.flags = entry->flag;
 			return ITER_OK;
 		}
 	}
@@ -550,7 +550,7 @@ static int cache_ref_iterator_peel(struct ref_iterator *ref_iterator,
 {
 	struct cache_ref_iterator *iter =
 		(struct cache_ref_iterator *)ref_iterator;
-	return peel_object(iter->repo, ref_iterator->oid, peeled) ? -1 : 0;
+	return peel_object(iter->repo, ref_iterator->ref.oid, peeled) ? -1 : 0;
 }
 
 static void cache_ref_iterator_release(struct ref_iterator *ref_iterator)
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 4ef3bd75c6ae55..ed749d16572dac 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -249,10 +249,7 @@ const char *find_descendant_ref(const char *dirname,
  */
 struct ref_iterator {
 	struct ref_iterator_vtable *vtable;
-	const char *refname;
-	const char *referent;
-	const struct object_id *oid;
-	unsigned int flags;
+	struct reference ref;
 };
 
 /*
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index d4b792862024fc..0e47986cb5b699 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -704,10 +704,10 @@ static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
 					    &iter->oid, flags))
 				continue;
 
-		iter->base.refname = iter->ref.refname;
-		iter->base.referent = referent;
-		iter->base.oid = &iter->oid;
-		iter->base.flags = flags;
+		iter->base.ref.name = iter->ref.refname;
+		iter->base.ref.target = referent;
+		iter->base.ref.oid = &iter->oid;
+		iter->base.ref.flags = flags;
 
 		break;
 	}
@@ -828,7 +828,7 @@ static struct reftable_ref_iterator *ref_iterator_for_stack(struct reftable_ref_
 
 	iter = xcalloc(1, sizeof(*iter));
 	base_ref_iterator_init(&iter->base, &reftable_ref_iterator_vtable);
-	iter->base.oid = &iter->oid;
+	iter->base.ref.oid = &iter->oid;
 	iter->flags = flags;
 	iter->refs = refs;
 	iter->exclude_patterns = filter_exclude_patterns(exclude_patterns);
@@ -2072,7 +2072,7 @@ static int reftable_reflog_iterator_advance(struct ref_iterator *ref_iterator)
 
 		strbuf_reset(&iter->last_name);
 		strbuf_addstr(&iter->last_name, iter->log.refname);
-		iter->base.refname = iter->log.refname;
+		iter->base.ref.name = iter->log.refname;
 
 		break;
 	}

From 4cea0422879f6a64c0f7ad0ddac6d43897a53e94 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 23 Oct 2025 09:16:12 +0200
Subject: [PATCH 026/553] refs: fully reset `struct ref_iterator::ref` on
 iteration

With the introduction of the `struct ref_iterator::ref` field it now is
a whole lot easier to introduce new fields that become accessible to the
caller without having to adapt every single callsite. But there's a
downside: when a new field is introduced we always have to adapt all
backends to set that field.

This isn't something we can avoid in the general case: when the new
field is expected to be populated by all backends we of course cannot
avoid doing so. But new fields may be entirely optional, in which case
we'd still have such churn. And furthermore, it is very easy right now
to leak state from a previous iteration into the next iteration.

Address this issue by ensuring that the reference backends all fully
reset the field on every single iteration. This ensures that no state
from previous iterations can leak into the next one. And it ensures that
any newly introduced fields will be zeroed out by default.

Note that we don't have to explicitly adapt the "files" backend, as it
uses the `cache_ref_iterator` internally. Furthermore, other "wrapping"
iterators like for example the `prefix_ref_iterator` copy around the
whole reference, so these don't need to be adapted either.
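
A minimal sketch of the pattern, using simplified stand-in types rather
than the real backend code (`struct record` and `yield_record()` are
hypothetical names, and the flag value is a placeholder for
`REF_ISSYMREF`):

```c
#include <string.h>

/* Simplified stand-in for the reference handed to callbacks. */
struct reference {
	const char *name;
	const char *target;	/* only set for symbolic references */
	unsigned flags;
};

/* Hypothetical backend record. */
struct record {
	const char *name;
	const char *symref_target;	/* NULL for direct references */
};

static void yield_record(struct reference *ref, const struct record *rec)
{
	/* Zero everything first so unset fields cannot carry over. */
	memset(ref, 0, sizeof(*ref));
	ref->name = rec->name;
	if (rec->symref_target) {
		ref->target = rec->symref_target;
		ref->flags |= 1;	/* placeholder for REF_ISSYMREF */
	}
}
```

Without the `memset()`, yielding a direct reference right after a
symbolic one would leave the previous `target` and flags in place.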

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 refs/packed-backend.c   | 3 ++-
 refs/ref-cache.c        | 1 +
 refs/reftable-backend.c | 1 +
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 7987acdc96a14b..711e07f8326c6b 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -882,6 +882,7 @@ static int next_record(struct packed_ref_iterator *iter)
 {
 	const char *p, *eol;
 
+	memset(&iter->base.ref, 0, sizeof(iter->base.ref));
 	strbuf_reset(&iter->refname_buf);
 
 	/*
@@ -916,6 +917,7 @@ static int next_record(struct packed_ref_iterator *iter)
 	    !isspace(*p++))
 		die_invalid_line(iter->snapshot->refs->path,
 				 iter->pos, iter->eof - iter->pos);
+	iter->base.ref.oid = &iter->oid;
 
 	eol = memchr(p, '\n', iter->eof - p);
 	if (!eol)
@@ -1194,7 +1196,6 @@ static struct ref_iterator *packed_ref_iterator_begin(
 	iter->snapshot = snapshot;
 	acquire_snapshot(snapshot);
 	strbuf_init(&iter->refname_buf, 0);
-	iter->base.ref.oid = &iter->oid;
 	iter->repo = ref_store->repo;
 	iter->flags = flags;
 
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index f1abc39624166e..e427848879d61b 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -425,6 +425,7 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
 			level->prefix_state = entry_prefix_state;
 			level->index = -1;
 		} else {
+			memset(&iter->base.ref, 0, sizeof(iter->base.ref));
 			iter->base.ref.name = entry->name;
 			iter->base.ref.target = entry->u.value.referent;
 			iter->base.ref.oid = &entry->u.value.oid;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 0e47986cb5b699..728886eafd33bd 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -704,6 +704,7 @@ static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
 					    &iter->oid, flags))
 				continue;
 
+		memset(&iter->base.ref, 0, sizeof(iter->base.ref));
 		iter->base.ref.name = iter->ref.refname;
 		iter->base.ref.target = referent;
 		iter->base.ref.oid = &iter->oid;

From eb2934d94b43388642ce4840994800310a6d3456 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 23 Oct 2025 09:16:13 +0200
Subject: [PATCH 027/553] refs: refactor reference status flags

The reference flags encode information like whether or not a reference
is a symbolic reference or whether it may be broken. This information is
stored in a `int flags` bitfield, which is in conflict with our modern
best practices; we tend to use an unsigned integer to store flags.

Change the type of the field to be `unsigned`. While at it, refactor the
individual flags to be part of an `enum` instead of using preprocessor
defines.
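
For illustration only, this is roughly how the enum values combine with
an unsigned flags word. The enum mirrors the one added below, while
`is_dangling_symref()` is a hypothetical helper that does not exist in
the tree:

```c
enum reference_status {
	REF_ISSYMREF = (1 << 0),
	REF_ISPACKED = (1 << 1),
	REF_ISBROKEN = (1 << 2),
	REF_BAD_NAME = (1 << 3),
};

/* Hypothetical helper: true for a broken symbolic reference. */
static int is_dangling_symref(unsigned flags)
{
	return (flags & REF_ISSYMREF) && (flags & REF_ISBROKEN);
}
```

The flags are still combined with `|` and tested with `&` exactly as
before; only the declared types change.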

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 refs.h | 41 +++++++++++++++++++++--------------------
 1 file changed, 21 insertions(+), 20 deletions(-)

diff --git a/refs.h b/refs.h
index 68d235438c2b32..4f0a685714fa3a 100644
--- a/refs.h
+++ b/refs.h
@@ -333,27 +333,28 @@ struct ref_transaction;
  * stored in ref_iterator::flags. Other bits are for internal use
  * only:
  */
+enum reference_status {
+	/* Reference is a symbolic reference. */
+	REF_ISSYMREF = (1 << 0),
 
-/* Reference is a symbolic reference. */
-#define REF_ISSYMREF 0x01
+	/* Reference is a packed reference. */
+	REF_ISPACKED = (1 << 1),
 
-/* Reference is a packed reference. */
-#define REF_ISPACKED 0x02
-
-/*
- * Reference cannot be resolved to an object name: dangling symbolic
- * reference (directly or indirectly), corrupt reference file,
- * reference exists but name is bad, or symbolic reference refers to
- * ill-formatted reference name.
- */
-#define REF_ISBROKEN 0x04
+	/*
+	 * Reference cannot be resolved to an object name: dangling symbolic
+	 * reference (directly or indirectly), corrupt reference file,
+	 * reference exists but name is bad, or symbolic reference refers to
+	 * ill-formatted reference name.
+	 */
+	REF_ISBROKEN = (1 << 2),
 
-/*
- * Reference name is not well formed.
- *
- * See git-check-ref-format(1) for the definition of well formed ref names.
- */
-#define REF_BAD_NAME 0x08
+	/*
+	 * Reference name is not well formed.
+	 *
+	 * See git-check-ref-format(1) for the definition of well formed ref names.
+	 */
+	REF_BAD_NAME = (1 << 3),
+};
 
 /* A reference passed to `for_each_ref()`-style callbacks. */
 struct reference {
@@ -370,8 +371,8 @@ struct reference {
 	 */
 	const struct object_id *oid;
 
-	/* A bitfield of `REF_` flags. */
-	int flags;
+	/* A bitfield of `enum reference_status` flags. */
+	unsigned flags;
 };
 
 /*

From f89866163704528f1a6570e134853dbb99120e7c Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 23 Oct 2025 09:16:14 +0200
Subject: [PATCH 028/553] refs: expose peeled object ID via the iterator

Both the "files" and "reftable" backends are able to store peeled values
for tags in the respective formats. This allows for a more efficient
lookup of the target object of such a tag without having to manually
peel via the object database.

The infrastructure to access these peeled object IDs is somewhat funky
though. When iterating through references, we store a pointer to
the current iterator in a global variable. The callbacks invoked by that
iterator are then expected to call `peel_iterated_oid()`, which checks
whether the globally-stored iterator's current reference refers to the
one handed into that function. If so, we ask the iterator to peel the
object, otherwise we manually peel the object via the object database.
Depending on global state like this is somewhat weird and also quite
fragile.

Introduce a new `struct reference::peeled_oid` field that can be
populated by the reference backends. This field can be accessed via a
new function `reference_get_peeled_oid()` that either uses that value,
if set, or alternatively peels via the ODB. With this change we don't
have to rely on global state anymore, but make the peeled object ID
available to the callback functions directly.

Adjust trivial callers that already have a `struct reference` available.
Remaining callers will be adjusted in subsequent commits.
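
The lookup order can be sketched as follows. This is a simplified model
with stand-in types, not the code added by this patch: `slow_peel()` is
a hypothetical placeholder for peeling via the object database and
merely copies the ID, as if the object were not a tag:

```c
#include <stddef.h>

/* Stand-in for the real object ID type. */
struct object_id {
	unsigned char hash[32];
};

struct reference {
	const struct object_id *oid;
	const struct object_id *peeled_oid;	/* optional, may be NULL */
};

/* Hypothetical placeholder for the slower lookup via the ODB. */
static int slow_peel(const struct object_id *oid, struct object_id *out)
{
	*out = *oid;
	return 0;
}

/* Prefer the value provided by the backend, fall back otherwise. */
static int get_peeled(const struct reference *ref, struct object_id *out)
{
	if (ref->peeled_oid) {
		*out = *ref->peeled_oid;
		return 0;
	}
	return slow_peel(ref->oid, out);
}
```

The callback only needs the reference it was handed; no global iterator
state is consulted.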

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/describe.c      |  2 +-
 builtin/gc.c            |  2 +-
 builtin/pack-objects.c  |  7 ++++---
 commit-graph.c          |  2 +-
 ls-refs.c               |  2 +-
 midx-write.c            |  2 +-
 pseudo-merge.c          |  2 +-
 refs.c                  | 12 ++++++++++++
 refs.h                  | 19 +++++++++++++++++++
 refs/packed-backend.c   |  1 +
 refs/reftable-backend.c |  5 +++++
 repack-midx.c           |  2 +-
 12 files changed, 48 insertions(+), 10 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 79545350443c6c..443546aaac96f0 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -208,7 +208,7 @@ static int get_name(const struct reference *ref, void *cb_data UNUSED)
 	}
 
 	/* Is it annotated? */
-	if (!peel_iterated_oid(the_repository, ref->oid, &peeled)) {
+	if (!reference_get_peeled_oid(the_repository, ref, &peeled)) {
 		is_annotated = !oideq(ref->oid, &peeled);
 	} else {
 		oidcpy(&peeled, ref->oid);
diff --git a/builtin/gc.c b/builtin/gc.c
index 9de5de175f6a40..f0cf20d42389fe 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1109,7 +1109,7 @@ static int dfs_on_ref(const struct reference *ref, void *cb_data)
 	struct commit_list *stack = NULL;
 	struct commit *commit;
 
-	if (!peel_iterated_oid(the_repository, ref->oid, &peeled))
+	if (!reference_get_peeled_oid(the_repository, ref, &peeled))
 		maybe_peeled = &peeled;
 	if (odb_read_object_info(the_repository->objects, maybe_peeled, NULL) != OBJ_COMMIT)
 		return 0;
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 39633a0158e095..1613fecb669830 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -838,7 +838,7 @@ static int mark_tagged(const struct reference *ref, void *cb_data UNUSED)
 
 	if (entry)
 		entry->tagged = 1;
-	if (!peel_iterated_oid(the_repository, ref->oid, &peeled)) {
+	if (!reference_get_peeled_oid(the_repository, ref, &peeled)) {
 		entry = packlist_find(&to_pack, &peeled);
 		if (entry)
 			entry->tagged = 1;
@@ -3309,7 +3309,8 @@ static int add_ref_tag(const struct reference *ref, void *cb_data UNUSED)
 {
 	struct object_id peeled;
 
-	if (!peel_iterated_oid(the_repository, ref->oid, &peeled) && obj_is_packed(&peeled))
+	if (!reference_get_peeled_oid(the_repository, ref, &peeled) &&
+	    obj_is_packed(&peeled))
 		add_tag_chain(ref->oid);
 	return 0;
 }
@@ -4537,7 +4538,7 @@ static int mark_bitmap_preferred_tip(const struct reference *ref, void *data UNU
 	struct object_id peeled;
 	struct object *object;
 
-	if (!peel_iterated_oid(the_repository, ref->oid, &peeled))
+	if (!reference_get_peeled_oid(the_repository, ref, &peeled))
 		maybe_peeled = &peeled;
 
 	object = parse_object_or_die(the_repository, maybe_peeled, ref->name);
diff --git a/commit-graph.c b/commit-graph.c
index f91af416259c84..80be2ff2c39842 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1857,7 +1857,7 @@ static int add_ref_to_set(const struct reference *ref, void *cb_data)
 	struct object_id peeled;
 	struct refs_cb_data *data = (struct refs_cb_data *)cb_data;
 
-	if (!peel_iterated_oid(data->repo, ref->oid, &peeled))
+	if (!reference_get_peeled_oid(data->repo, ref, &peeled))
 		maybe_peeled = &peeled;
 	if (odb_read_object_info(data->repo->objects, maybe_peeled, NULL) == OBJ_COMMIT)
 		oidset_insert(data->commits, maybe_peeled);
diff --git a/ls-refs.c b/ls-refs.c
index 64d02723691466..8641281b86c55a 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -110,7 +110,7 @@ static int send_ref(const struct reference *ref, void *cb_data)
 
 	if (data->peel && ref->oid) {
 		struct object_id peeled;
-		if (!peel_iterated_oid(the_repository, ref->oid, &peeled))
+		if (!reference_get_peeled_oid(the_repository, ref, &peeled))
 			strbuf_addf(&data->buf, " peeled:%s", oid_to_hex(&peeled));
 	}
 
diff --git a/midx-write.c b/midx-write.c
index f4dd875747a4b6..23e61cb0001428 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -709,7 +709,7 @@ static int add_ref_to_pending(const struct reference *ref, void *cb_data)
 		return 0;
 	}
 
-	if (!peel_iterated_oid(revs->repo, ref->oid, &peeled))
+	if (!reference_get_peeled_oid(revs->repo, ref, &peeled))
 		maybe_peeled = &peeled;
 
 	object = parse_object_or_die(revs->repo, maybe_peeled, ref->name);
diff --git a/pseudo-merge.c b/pseudo-merge.c
index 0abd51b42c185a..a2d5bd85f95ebf 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -230,7 +230,7 @@ static int find_pseudo_merge_group_for_ref(const struct reference *ref, void *_d
 	uint32_t i;
 	int has_bitmap;
 
-	if (!peel_iterated_oid(the_repository, ref->oid, &peeled))
+	if (!reference_get_peeled_oid(the_repository, ref, &peeled))
 		maybe_peeled = &peeled;
 
 	c = lookup_commit(the_repository, maybe_peeled);
diff --git a/refs.c b/refs.c
index f96cf43b128a27..1b1551f9814394 100644
--- a/refs.c
+++ b/refs.c
@@ -2334,6 +2334,18 @@ int peel_iterated_oid(struct repository *r, const struct object_id *base, struct
 	return peel_object(r, base, peeled) ? -1 : 0;
 }
 
+int reference_get_peeled_oid(struct repository *repo,
+			     const struct reference *ref,
+			     struct object_id *peeled_oid)
+{
+	if (ref->peeled_oid) {
+		oidcpy(peeled_oid, ref->peeled_oid);
+		return 0;
+	}
+
+	return peel_object(repo, ref->oid, peeled_oid) ? -1 : 0;
+}
+
 int refs_update_symref(struct ref_store *refs, const char *ref,
 		       const char *target, const char *logmsg)
 {
diff --git a/refs.h b/refs.h
index 4f0a685714fa3a..886ed2c0f43b04 100644
--- a/refs.h
+++ b/refs.h
@@ -371,10 +371,29 @@ struct reference {
 	 */
 	const struct object_id *oid;
 
+	/*
+	 * An optional peeled object ID. This field _may_ be set for tags in
+	 * case the peeled value is present in the backend. Please refer to
+	 * `reference_get_peeled_oid()`.
+	 */
+	const struct object_id *peeled_oid;
+
 	/* A bitfield of `enum reference_status` flags. */
 	unsigned flags;
 };
 
+/*
+ * Peel the tag to a non-tag commit. If present, this uses the peeled object ID
+ * exposed by the reference backend. Otherwise, the object is peeled via the
+ * object database, which is less efficient.
+ *
+ * Return `0` if the reference could be peeled, a negative error code
+ * otherwise.
+ */
+int reference_get_peeled_oid(struct repository *repo,
+			     const struct reference *ref,
+			     struct object_id *peeled_oid);
+
 /*
  * The signature for the callback function for the for_each_*()
  * functions below.  The memory pointed to by the `struct reference`
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 711e07f8326c6b..1fefefd54ed0e7 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -963,6 +963,7 @@ static int next_record(struct packed_ref_iterator *iter)
 			iter->base.ref.flags &= ~REF_KNOWS_PEELED;
 		} else {
 			iter->base.ref.flags |= REF_KNOWS_PEELED;
+			iter->base.ref.peeled_oid = &iter->peeled;
 		}
 	} else {
 		oidclr(&iter->peeled, iter->repo->hash_algo);
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 728886eafd33bd..e214e120d77a5c 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -547,6 +547,7 @@ struct reftable_ref_iterator {
 	struct reftable_iterator iter;
 	struct reftable_ref_record ref;
 	struct object_id oid;
+	struct object_id peeled_oid;
 
 	char *prefix;
 	size_t prefix_len;
@@ -671,6 +672,8 @@ static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
 		case REFTABLE_REF_VAL2:
 			oidread(&iter->oid, iter->ref.value.val2.value,
 				refs->base.repo->hash_algo);
+			oidread(&iter->peeled_oid, iter->ref.value.val2.target_value,
+				refs->base.repo->hash_algo);
 			break;
 		case REFTABLE_REF_SYMREF:
 			referent = refs_resolve_ref_unsafe(&iter->refs->base,
@@ -708,6 +711,8 @@ static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
 		iter->base.ref.name = iter->ref.refname;
 		iter->base.ref.target = referent;
 		iter->base.ref.oid = &iter->oid;
+		if (iter->ref.value_type == REFTABLE_REF_VAL2)
+			iter->base.ref.peeled_oid = &iter->peeled_oid;
 		iter->base.ref.flags = flags;
 
 		break;
diff --git a/repack-midx.c b/repack-midx.c
index 349f7e20b53f25..74bdfa3a6e913f 100644
--- a/repack-midx.c
+++ b/repack-midx.c
@@ -22,7 +22,7 @@ static int midx_snapshot_ref_one(const struct reference *ref, void *_data)
 	const struct object_id *maybe_peeled = ref->oid;
 	struct object_id peeled;
 
-	if (!peel_iterated_oid(data->repo, ref->oid, &peeled))
+	if (!reference_get_peeled_oid(data->repo, ref, &peeled))
 		maybe_peeled = &peeled;
 
 	if (oidset_insert(&data->seen, maybe_peeled))

From adecd5f0b6fdd40219d5503fdaf46aa8d36a4ff7 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 23 Oct 2025 09:16:15 +0200
Subject: [PATCH 029/553] upload-pack: convert to use
 `reference_get_peeled_oid()`

The `write_v0_ref()` callback is invoked from two callsites:

  - Once via `send_ref()` which is a callback passed to
    `for_each_namespaced_ref_1()` and `refs_head_ref_namespaced()`.

  - Once manually to announce capabilities.

When sending references to the client we also send the peeled value of
tags. As we don't have a `struct reference` available in the second
case, we cannot easily peel by calling `reference_get_peeled_oid()`, but
we instead have to depend on global state via `peel_iterated_oid()`.

We do have a reference available in the first case though; it is only
the second case that keeps us from using `reference_get_peeled_oid()`.
But that second case only announces capabilities anyway, so we're not
really handling a reference at all here.

Adapt that case to construct a reference manually and pass that to
`write_v0_ref()`. Start to use `reference_get_peeled_oid()` now that we
always have a `struct reference` available.
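
As a rough illustration that is not part of this patch, the sketch below
uses simplified stand-in types to show why a manually constructed
reference is safe to pass along: with a designated initializer every
member that is not named is zero-initialized, so `peeled_oid` is NULL
and peeling falls back to the slow path. The `toy_*` names and the
one-byte object ID are assumptions made for the example, not Git's
actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; Git's real structs carry more members. */
struct toy_oid { unsigned char id; };

struct toy_reference {
	const char *name;
	const struct toy_oid *oid;
	const struct toy_oid *peeled_oid; /* NULL: no cached peeled value */
};

/* Stand-in for the object lookup: only ID 1 is a "tag", peeling to 2. */
static int toy_peel_object(const struct toy_oid *oid, struct toy_oid *peeled)
{
	if (oid->id != 1)
		return -1;
	peeled->id = 2;
	return 0;
}

/* Same shape as reference_get_peeled_oid(): cached value first. */
static int toy_get_peeled_oid(const struct toy_reference *ref,
			      struct toy_oid *peeled)
{
	assert(ref && ref->oid);
	if (ref->peeled_oid) {
		*peeled = *ref->peeled_oid;
		return 0;
	}
	return toy_peel_object(ref->oid, peeled) ? -1 : 0;
}

/* Mirrors the "capabilities^{}" case: null OID, no cached peeled value. */
static int toy_peel_synthetic_ref(void)
{
	static const struct toy_oid null_oid = { 0 };
	struct toy_reference ref = {
		.name = "capabilities^{}",
		.oid = &null_oid,
	};
	struct toy_oid peeled;

	return toy_get_peeled_oid(&ref, &peeled);
}
```

In this sketch the synthetic reference fails to peel, so a caller
shaped like `write_v0_ref()` would emit only the capability line and no
`^{}` line.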

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 upload-pack.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 0d563ae74e92be..2d2b70cbf2dd0b 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1249,15 +1249,15 @@ static void format_session_id(struct strbuf *buf, struct upload_pack_data *d) {
 }
 
 static void write_v0_ref(struct upload_pack_data *data,
-			const char *refname, const char *refname_nons,
-			const struct object_id *oid)
+			 const struct reference *ref,
+			 const char *refname_nons)
 {
 	static const char *capabilities = "multi_ack thin-pack side-band"
 		" side-band-64k ofs-delta shallow deepen-since deepen-not"
 		" deepen-relative no-progress include-tag multi_ack_detailed";
 	struct object_id peeled;
 
-	if (mark_our_ref(refname_nons, refname, oid, &data->hidden_refs))
+	if (mark_our_ref(refname_nons, ref->name, ref->oid, &data->hidden_refs))
 		return;
 
 	if (capabilities) {
@@ -1267,7 +1267,7 @@ static void write_v0_ref(struct upload_pack_data *data,
 		format_symref_info(&symref_info, &data->symref);
 		format_session_id(&session_id, data);
 		packet_fwrite_fmt(stdout, "%s %s%c%s%s%s%s%s%s%s object-format=%s agent=%s\n",
-			     oid_to_hex(oid), refname_nons,
+			     oid_to_hex(ref->oid), refname_nons,
 			     0, capabilities,
 			     (data->allow_uor & ALLOW_TIP_SHA1) ?
 				     " allow-tip-sha1-in-want" : "",
@@ -1283,17 +1283,17 @@ static void write_v0_ref(struct upload_pack_data *data,
 		strbuf_release(&session_id);
 		data->sent_capabilities = 1;
 	} else {
-		packet_fwrite_fmt(stdout, "%s %s\n", oid_to_hex(oid), refname_nons);
+		packet_fwrite_fmt(stdout, "%s %s\n", oid_to_hex(ref->oid), refname_nons);
 	}
 	capabilities = NULL;
-	if (!peel_iterated_oid(the_repository, oid, &peeled))
+	if (!reference_get_peeled_oid(the_repository, ref, &peeled))
 		packet_fwrite_fmt(stdout, "%s %s^{}\n", oid_to_hex(&peeled), refname_nons);
 	return;
 }
 
 static int send_ref(const struct reference *ref, void *cb_data)
 {
-	write_v0_ref(cb_data, ref->name, strip_namespace(ref->name), ref->oid);
+	write_v0_ref(cb_data, ref, strip_namespace(ref->name));
 	return 0;
 }
 
@@ -1442,8 +1442,12 @@ void upload_pack(const int advertise_refs, const int stateless_rpc,
 					 send_ref, &data);
 		for_each_namespaced_ref_1(send_ref, &data);
 		if (!data.sent_capabilities) {
-			const char *refname = "capabilities^{}";
-			write_v0_ref(&data, refname, refname, null_oid(the_hash_algo));
+			struct reference ref = {
+				.name = "capabilities^{}",
+				.oid = null_oid(the_hash_algo),
+			};
+
+			write_v0_ref(&data, &ref, ref.name);
 		}
 		/*
 		 * fflush stdout before calling advertise_shallow_grafts because send_ref

From 70b783c3a194746d8b747677615f33b94454146f Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 23 Oct 2025 09:16:16 +0200
Subject: [PATCH 030/553] ref-filter: propagate peeled object ID

When queueing a reference in the "ref-filter" subsystem we end up
creating a new ref array item that contains the reference's info. One
bit of info that we always discard though is the peeled object ID, and
because of that we are forced to use `peel_iterated_oid()`.

Refactor the code to propagate the peeled object ID via the ref array,
if available. This allows us to manually peel tags without having to go
through the object database.
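
The propagation can be pictured with the following sketch, which is not
taken from this patch. It assumes that the item starts out zeroed, that
the peeled object ID is stored by value, and that an all-zero ID marks
the value as unavailable. The `toy_*` names and the four-byte hash are
hypothetical stand-ins for Git's real types.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_RAWSZ 4 /* hypothetical; real object IDs are much longer */

struct toy_oid { unsigned char hash[TOY_RAWSZ]; };

/* Simplified stand-in for a ref array item; stores IDs by value. */
struct toy_item {
	struct toy_oid objectname;
	struct toy_oid peeled_oid; /* all-zero means "not propagated" */
};

static int toy_is_null_oid(const struct toy_oid *oid)
{
	static const struct toy_oid null_oid; /* zero-initialized */
	return !memcmp(oid, &null_oid, sizeof(*oid));
}

/* Copy the peeled ID only when the caller has one; else leave it zeroed. */
static void toy_item_init(struct toy_item *item,
			  const struct toy_oid *oid,
			  const struct toy_oid *peeled_oid)
{
	assert(item && oid);
	memset(item, 0, sizeof(*item));
	item->objectname = *oid;
	if (peeled_oid)
		item->peeled_oid = *peeled_oid;
}

/* Returns 1 if the stored value is usable, 0 if peeling is still needed. */
static int toy_item_has_peeled(const struct toy_item *item)
{
	return !toy_is_null_oid(&item->peeled_oid);
}
```

Callers without a peeled value simply pass NULL, which is what the
updated callers in git-ls-remote(1), git-tag(1) and git-verify-tag(1)
do.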

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/ls-remote.c  |  2 +-
 builtin/tag.c        |  2 +-
 builtin/verify-tag.c |  2 +-
 ref-filter.c         | 66 +++++++++++++++++++++++++-------------------
 ref-filter.h         |  5 +++-
 5 files changed, 45 insertions(+), 32 deletions(-)

diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index df09000b30de50..fe77829557f252 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -156,7 +156,7 @@ int cmd_ls_remote(int argc,
 			continue;
 		if (!tail_match(&pattern, ref->name))
 			continue;
-		item = ref_array_push(&ref_array, ref->name, &ref->old_oid);
+		item = ref_array_push(&ref_array, ref->name, &ref->old_oid, NULL);
 		item->symref = xstrdup_or_null(ref->symref);
 	}
 
diff --git a/builtin/tag.c b/builtin/tag.c
index f0665af3acdf6c..01eba90c5c7bb2 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -153,7 +153,7 @@ static int verify_tag(const char *name, const char *ref UNUSED,
 		return -1;
 
 	if (format->format)
-		pretty_print_ref(name, oid, format);
+		pretty_print_ref(name, oid, NULL, format);
 
 	return 0;
 }
diff --git a/builtin/verify-tag.c b/builtin/verify-tag.c
index cd6bc11095d01a..558121eaa1688e 100644
--- a/builtin/verify-tag.c
+++ b/builtin/verify-tag.c
@@ -67,7 +67,7 @@ int cmd_verify_tag(int argc,
 		}
 
 		if (format.format)
-			pretty_print_ref(name, &oid, &format);
+			pretty_print_ref(name, &oid, NULL, &format);
 	}
 	return had_error;
 }
diff --git a/ref-filter.c b/ref-filter.c
index 6837fa60a9b2b5..7fd8babec8f5bd 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2578,8 +2578,15 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 	 * If it is a tag object, see if we use the peeled value. If we do,
 	 * grab the peeled OID.
 	 */
-	if (need_tagged && peel_iterated_oid(the_repository, &obj->oid, &oi_deref.oid))
-		die("bad tag");
+	if (need_tagged) {
+		if (!is_null_oid(&ref->peeled_oid)) {
+			oidcpy(&oi_deref.oid, &ref->peeled_oid);
+		} else if (!peel_object(the_repository, &obj->oid, &oi_deref.oid)) {
+			/* We managed to peel the object ourselves. */
+		} else {
+			die("bad tag");
+		}
+	}
 
 	return get_object(ref, 1, &obj, &oi_deref, err);
 }
@@ -2807,12 +2814,15 @@ static int match_points_at(struct oid_array *points_at,
  * Callers can then fill in other struct members at their leisure.
  */
 static struct ref_array_item *new_ref_array_item(const char *refname,
-						 const struct object_id *oid)
+						 const struct object_id *oid,
+						 const struct object_id *peeled_oid)
 {
 	struct ref_array_item *ref;
 
 	FLEX_ALLOC_STR(ref, refname, refname);
 	oidcpy(&ref->objectname, oid);
+	if (peeled_oid)
+		oidcpy(&ref->peeled_oid, peeled_oid);
 	ref->rest = NULL;
 
 	return ref;
@@ -2826,9 +2836,10 @@ static void ref_array_append(struct ref_array *array, struct ref_array_item *ref
 
 struct ref_array_item *ref_array_push(struct ref_array *array,
 				      const char *refname,
-				      const struct object_id *oid)
+				      const struct object_id *oid,
+				      const struct object_id *peeled_oid)
 {
-	struct ref_array_item *ref = new_ref_array_item(refname, oid);
+	struct ref_array_item *ref = new_ref_array_item(refname, oid, peeled_oid);
 	ref_array_append(array, ref);
 	return ref;
 }
@@ -2871,25 +2882,25 @@ static int filter_ref_kind(struct ref_filter *filter, const char *refname)
 	return ref_kind_from_refname(refname);
 }
 
-static struct ref_array_item *apply_ref_filter(const char *refname, const char *referent, const struct object_id *oid,
-			    int flag, struct ref_filter *filter)
+static struct ref_array_item *apply_ref_filter(const struct reference *ref,
+					       struct ref_filter *filter)
 {
-	struct ref_array_item *ref;
+	struct ref_array_item *item;
 	struct commit *commit = NULL;
 	unsigned int kind;
 
-	if (flag & REF_BAD_NAME) {
-		warning(_("ignoring ref with broken name %s"), refname);
+	if (ref->flags & REF_BAD_NAME) {
+		warning(_("ignoring ref with broken name %s"), ref->name);
 		return NULL;
 	}
 
-	if (flag & REF_ISBROKEN) {
-		warning(_("ignoring broken ref %s"), refname);
+	if (ref->flags & REF_ISBROKEN) {
+		warning(_("ignoring broken ref %s"), ref->name);
 		return NULL;
 	}
 
 	/* Obtain the current ref kind from filter_ref_kind() and ignore unwanted refs. */
-	kind = filter_ref_kind(filter, refname);
+	kind = filter_ref_kind(filter, ref->name);
 
 	/*
 	 * Generally HEAD refs are printed with special description denoting a rebase,
@@ -2902,13 +2913,13 @@ static struct ref_array_item *apply_ref_filter(const char *refname, const char *
 	else if (!(kind & filter->kind))
 		return NULL;
 
-	if (!filter_pattern_match(filter, refname))
+	if (!filter_pattern_match(filter, ref->name))
 		return NULL;
 
-	if (filter_exclude_match(filter, refname))
+	if (filter_exclude_match(filter, ref->name))
 		return NULL;
 
-	if (filter->points_at.nr && !match_points_at(&filter->points_at, oid, refname))
+	if (filter->points_at.nr && !match_points_at(&filter->points_at, ref->oid, ref->name))
 		return NULL;
 
 	/*
@@ -2918,7 +2929,7 @@ static struct ref_array_item *apply_ref_filter(const char *refname, const char *
 	 */
 	if (filter->reachable_from || filter->unreachable_from ||
 	    filter->with_commit || filter->no_commit || filter->verbose) {
-		commit = lookup_commit_reference_gently(the_repository, oid, 1);
+		commit = lookup_commit_reference_gently(the_repository, ref->oid, 1);
 		if (!commit)
 			return NULL;
 		/* We perform the filtering for the '--contains' option... */
@@ -2936,13 +2947,13 @@ static struct ref_array_item *apply_ref_filter(const char *refname, const char *
 	 * to do its job and the resulting list may yet to be pruned
 	 * by maxcount logic.
 	 */
-	ref = new_ref_array_item(refname, oid);
-	ref->commit = commit;
-	ref->flag = flag;
-	ref->kind = kind;
-	ref->symref = xstrdup_or_null(referent);
+	item = new_ref_array_item(ref->name, ref->oid, ref->peeled_oid);
+	item->commit = commit;
+	item->flag = ref->flags;
+	item->kind = kind;
+	item->symref = xstrdup_or_null(ref->target);
 
-	return ref;
+	return item;
 }
 
 struct ref_filter_cbdata {
@@ -2959,8 +2970,7 @@ static int filter_one(const struct reference *ref, void *cb_data)
 	struct ref_filter_cbdata *ref_cbdata = cb_data;
 	struct ref_array_item *item;
 
-	item = apply_ref_filter(ref->name, ref->target, ref->oid,
-				ref->flags, ref_cbdata->filter);
+	item = apply_ref_filter(ref, ref_cbdata->filter);
 	if (item)
 		ref_array_append(ref_cbdata->array, item);
 
@@ -2997,8 +3007,7 @@ static int filter_and_format_one(const struct reference *ref, void *cb_data)
 	struct ref_array_item *item;
 	struct strbuf output = STRBUF_INIT, err = STRBUF_INIT;
 
-	item = apply_ref_filter(ref->name, ref->target, ref->oid,
-				ref->flags, ref_cbdata->filter);
+	item = apply_ref_filter(ref, ref_cbdata->filter);
 	if (!item)
 		return 0;
 
@@ -3585,13 +3594,14 @@ void print_formatted_ref_array(struct ref_array *array, struct ref_format *forma
 }
 
 void pretty_print_ref(const char *name, const struct object_id *oid,
+		      const struct object_id *peeled_oid,
 		      struct ref_format *format)
 {
 	struct ref_array_item *ref_item;
 	struct strbuf output = STRBUF_INIT;
 	struct strbuf err = STRBUF_INIT;
 
-	ref_item = new_ref_array_item(name, oid);
+	ref_item = new_ref_array_item(name, oid, peeled_oid);
 	ref_item->kind = ref_kind_from_refname(name);
 	if (format_ref_array_item(ref_item, format, &output, &err))
 		die("%s", err.buf);
diff --git a/ref-filter.h b/ref-filter.h
index 235c60f79c9a0f..120221b47fa30d 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -41,6 +41,7 @@ enum ref_sorting_order {
 
 struct ref_array_item {
 	struct object_id objectname;
+	struct object_id peeled_oid;
 	const char *rest;
 	int flag;
 	unsigned int kind;
@@ -187,6 +188,7 @@ void print_formatted_ref_array(struct ref_array *array, struct ref_format *forma
  * name must be a fully qualified refname.
  */
 void pretty_print_ref(const char *name, const struct object_id *oid,
+		      const struct object_id *peeled_oid,
 		      struct ref_format *format);
 
 /*
@@ -195,7 +197,8 @@ void pretty_print_ref(const char *name, const struct object_id *oid,
  */
 struct ref_array_item *ref_array_push(struct ref_array *array,
 				      const char *refname,
-				      const struct object_id *oid);
+				      const struct object_id *oid,
+				      const struct object_id *peeled_oid);
 
 /*
  * If the provided format includes ahead-behind atoms, then compute the

From feaaea4c123e6b94ebbdc2135278946ee9cc8eed Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 23 Oct 2025 09:16:17 +0200
Subject: [PATCH 031/553] builtin/show-ref: convert to use
 `reference_get_peeled_oid()`

The git-show-ref(1) command has multiple different modes:

  - It knows to show all references matching a pattern.

  - It knows to list all references that are an exact match to whatever
    the user has provided.

  - It knows to check for reference existence.

The first two modes use mostly the same infrastructure to print the
references via `show_one()`. But while the former mode uses a proper
iterator and thus has a `struct reference` available in its context, the
latter calls `refs_read_ref()` and thus doesn't. Consequently, we cannot
easily use `reference_get_peeled_oid()` to print the peeled value.

Adapt the code so that we manually construct a `struct reference` when
verifying refs. We wouldn't ever have the peeled value available anyway
as we're not using an iterator here, so we can simply plug in the values
we _do_ have.

With this change we now have a `struct reference` available at both
callsites of `show_one()` and can thus pass it, which allows us to use
`reference_get_peeled_oid()` instead of `peel_iterated_oid()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/show-ref.c | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/builtin/show-ref.c b/builtin/show-ref.c
index 4803b5e59865f6..4d4984e4e0c244 100644
--- a/builtin/show-ref.c
+++ b/builtin/show-ref.c
@@ -31,31 +31,31 @@ struct show_one_options {
 };
 
 static void show_one(const struct show_one_options *opts,
-		     const char *refname, const struct object_id *oid)
+		     const struct reference *ref)
 {
 	const char *hex;
 	struct object_id peeled;
 
-	if (!odb_has_object(the_repository->objects, oid,
+	if (!odb_has_object(the_repository->objects, ref->oid,
 			    HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR))
-		die("git show-ref: bad ref %s (%s)", refname,
-		    oid_to_hex(oid));
+		die("git show-ref: bad ref %s (%s)", ref->name,
+		    oid_to_hex(ref->oid));
 
 	if (opts->quiet)
 		return;
 
-	hex = repo_find_unique_abbrev(the_repository, oid, opts->abbrev);
+	hex = repo_find_unique_abbrev(the_repository, ref->oid, opts->abbrev);
 	if (opts->hash_only)
 		printf("%s\n", hex);
 	else
-		printf("%s %s\n", hex, refname);
+		printf("%s %s\n", hex, ref->name);
 
 	if (!opts->deref_tags)
 		return;
 
-	if (!peel_iterated_oid(the_repository, oid, &peeled)) {
+	if (!reference_get_peeled_oid(the_repository, ref, &peeled)) {
 		hex = repo_find_unique_abbrev(the_repository, &peeled, opts->abbrev);
-		printf("%s %s^{}\n", hex, refname);
+		printf("%s %s^{}\n", hex, ref->name);
 	}
 }
 
@@ -93,7 +93,7 @@ static int show_ref(const struct reference *ref, void *cbdata)
 match:
 	data->found_match++;
 
-	show_one(data->show_one_opts, ref->name, ref->oid);
+	show_one(data->show_one_opts, ref);
 
 	return 0;
 }
@@ -175,12 +175,18 @@ static int cmd_show_ref__verify(const struct show_one_options *show_one_opts,
 
 		if ((starts_with(*refs, "refs/") || refname_is_safe(*refs)) &&
 		    !refs_read_ref(get_main_ref_store(the_repository), *refs, &oid)) {
-			show_one(show_one_opts, *refs, &oid);
-		}
-		else if (!show_one_opts->quiet)
+			struct reference ref = {
+				.name = *refs,
+				.oid = &oid,
+			};
+
+			show_one(show_one_opts, &ref);
+		} else if (!show_one_opts->quiet) {
 			die("'%s' - not a valid ref", *refs);
-		else
+		} else {
 			return 1;
+		}
+
 		refs++;
 	}
 

From 5a5c7359f77ecd1bc4b0e172563161d602f131d3 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 23 Oct 2025 09:16:18 +0200
Subject: [PATCH 032/553] refs: drop `current_ref_iter` hack

In preceding commits we have refactored all callers of
`peel_iterated_oid()` to instead use `reference_get_peeled_oid()`. This
allows us to get rid of the former function.

Getting rid of that function is nice, but even nicer is that this also
allows us to get rid of the `current_ref_iter` hack. This global
variable tracked the currently-active ref iterator so that we can use it
to peel an object ID. Now that the peeled object ID is propagated via
`struct reference` though we don't have to depend on this hack anymore,
which makes for a more robust and easier-to-understand infrastructure.
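
To illustrate the idea outside of Git's code base, here is a minimal
sketch of an iteration loop that needs no global variable: everything
the callback may want, including an optional peeled value, travels via
the reference handed to it. The `toy_*` names and the integer object IDs
are assumptions made for the example only.

```c
#include <assert.h>
#include <stddef.h>

struct toy_reference {
	const char *name;
	int oid;               /* hypothetical one-int object ID */
	const int *peeled_oid; /* NULL unless the peeled value is known */
};

typedef int toy_each_ref_fn(const struct toy_reference *ref, void *cb_data);

/* Iterate without touching global state; all data travels via `ref`. */
static int toy_for_each_ref(const struct toy_reference *refs, size_t nr,
			    toy_each_ref_fn *fn, void *cb_data)
{
	for (size_t i = 0; i < nr; i++) {
		int ret = fn(&refs[i], cb_data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Example callback: count references carrying a cached peeled value. */
static int toy_count_peeled(const struct toy_reference *ref, void *cb_data)
{
	size_t *count = cb_data;

	assert(ref);
	if (ref->peeled_oid)
		(*count)++;
	return 0;
}
```

Because the loop keeps no state outside its arguments, a callback could
start another iteration of its own without anything having to be saved
and restored around it.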

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 refs.c               | 10 ----------
 refs/iterator.c      |  5 -----
 refs/refs-internal.h | 13 -------------
 3 files changed, 28 deletions(-)

diff --git a/refs.c b/refs.c
index 1b1551f9814394..9d8f0a9ca4a3a6 100644
--- a/refs.c
+++ b/refs.c
@@ -2324,16 +2324,6 @@ int refs_optimize(struct ref_store *refs, struct pack_refs_opts *opts)
 	return refs->be->optimize(refs, opts);
 }
 
-int peel_iterated_oid(struct repository *r, const struct object_id *base, struct object_id *peeled)
-{
-	if (current_ref_iter &&
-	    (current_ref_iter->ref.oid == base ||
-	     oideq(current_ref_iter->ref.oid, base)))
-		return ref_iterator_peel(current_ref_iter, peeled);
-
-	return peel_object(r, base, peeled) ? -1 : 0;
-}
-
 int reference_get_peeled_oid(struct repository *repo,
 			     const struct reference *ref,
 			     struct object_id *peeled_oid)
diff --git a/refs/iterator.c b/refs/iterator.c
index fe5980e1b6c96b..072c6aacdb0341 100644
--- a/refs/iterator.c
+++ b/refs/iterator.c
@@ -458,15 +458,11 @@ struct ref_iterator *prefix_ref_iterator_begin(struct ref_iterator *iter0,
 	return ref_iterator;
 }
 
-struct ref_iterator *current_ref_iter = NULL;
-
 int do_for_each_ref_iterator(struct ref_iterator *iter,
 			     each_ref_fn fn, void *cb_data)
 {
 	int retval = 0, ok;
-	struct ref_iterator *old_ref_iter = current_ref_iter;
 
-	current_ref_iter = iter;
 	while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
 		retval = fn(&iter->ref, cb_data);
 		if (retval)
@@ -474,7 +470,6 @@ int do_for_each_ref_iterator(struct ref_iterator *iter,
 	}
 
 out:
-	current_ref_iter = old_ref_iter;
 	if (ok == ITER_ERROR)
 		retval = -1;
 	ref_iterator_free(iter);
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index ed749d16572dac..f4f845bbeaf673 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -376,19 +376,6 @@ struct ref_iterator_vtable {
 	ref_iterator_release_fn *release;
 };
 
-/*
- * current_ref_iter is a performance hack: when iterating over
- * references using the for_each_ref*() functions, current_ref_iter is
- * set to the reference iterator before calling the callback function.
- * If the callback function calls peel_ref(), then peel_ref() first
- * checks whether the reference to be peeled is the one referred to by
- * the iterator (it usually is) and if so, asks the iterator for the
- * peeled version of the reference if it is available. This avoids a
- * refname lookup in a common case. current_ref_iter is set to NULL
- * when the iteration is over.
- */
-extern struct ref_iterator *current_ref_iter;
-
 struct ref_store;
 
 /* refs backends */

From 705114772e0a0741c3288329bd9ac4e11e38db9a Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 23 Oct 2025 09:16:19 +0200
Subject: [PATCH 033/553] refs: drop infrastructure to peel via iterators

Now that the peeled object ID gets propagated via the `struct reference`
there is no need anymore to call into the reference iterator itself to
dereference an object. Remove this infrastructure.

Most of the changes are straightforward deletions of code. There is one
exception though in `refs/packed-backend.c::write_with_updates()`. Here
we stop peeling the iterator and instead just pass the peeled object ID
of that iterator directly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 refs.h                  | 14 --------------
 refs/debug.c            | 11 -----------
 refs/files-backend.c    | 17 -----------------
 refs/iterator.c         | 36 ------------------------------------
 refs/packed-backend.c   | 24 +-----------------------
 refs/ref-cache.c        |  9 ---------
 refs/refs-internal.h    |  7 -------
 refs/reftable-backend.c | 24 ------------------------
 8 files changed, 1 insertion(+), 141 deletions(-)

diff --git a/refs.h b/refs.h
index 886ed2c0f43b04..2dd7ac1a16aee9 100644
--- a/refs.h
+++ b/refs.h
@@ -1289,10 +1289,6 @@ int repo_migrate_ref_storage_format(struct repository *repo,
  * to the next entry, ref_iterator_advance() aborts the iteration,
  * frees the ref_iterator, and returns ITER_ERROR.
  *
- * The reference currently being looked at can be peeled by calling
- * ref_iterator_peel(). This function is often faster than peel_ref(),
- * so it should be preferred when iterating over references.
- *
  * Putting it all together, a typical iteration looks like this:
  *
  *     int ok;
@@ -1307,9 +1303,6 @@ int repo_migrate_ref_storage_format(struct repository *repo,
  *             // Access information about the current reference:
  *             if (!(iter->flags & REF_ISSYMREF))
  *                     printf("%s is %s\n", iter->refname, oid_to_hex(iter->oid));
- *
- *             // If you need to peel the reference:
- *             ref_iterator_peel(iter, &oid);
  *     }
  *
  *     if (ok != ITER_DONE)
@@ -1400,13 +1393,6 @@ enum ref_iterator_seek_flag {
 int ref_iterator_seek(struct ref_iterator *ref_iterator, const char *refname,
 		      unsigned int flags);
 
-/*
- * If possible, peel the reference currently being viewed by the
- * iterator. Return 0 on success.
- */
-int ref_iterator_peel(struct ref_iterator *ref_iterator,
-		      struct object_id *peeled);
-
 /* Free the reference iterator and any associated resources. */
 void ref_iterator_free(struct ref_iterator *ref_iterator);
 
diff --git a/refs/debug.c b/refs/debug.c
index 67718bd1f49f1f..01499b9033ca3c 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -177,16 +177,6 @@ static int debug_ref_iterator_seek(struct ref_iterator *ref_iterator,
 	return res;
 }
 
-static int debug_ref_iterator_peel(struct ref_iterator *ref_iterator,
-				   struct object_id *peeled)
-{
-	struct debug_ref_iterator *diter =
-		(struct debug_ref_iterator *)ref_iterator;
-	int res = diter->iter->vtable->peel(diter->iter, peeled);
-	trace_printf_key(&trace_refs, "iterator_peel: %s: %d\n", diter->iter->ref.name, res);
-	return res;
-}
-
 static void debug_ref_iterator_release(struct ref_iterator *ref_iterator)
 {
 	struct debug_ref_iterator *diter =
@@ -198,7 +188,6 @@ static void debug_ref_iterator_release(struct ref_iterator *ref_iterator)
 static struct ref_iterator_vtable debug_ref_iterator_vtable = {
 	.advance = debug_ref_iterator_advance,
 	.seek = debug_ref_iterator_seek,
-	.peel = debug_ref_iterator_peel,
 	.release = debug_ref_iterator_release,
 };
 
diff --git a/refs/files-backend.c b/refs/files-backend.c
index fac53fa052dd22..5aeb454fb47684 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -993,15 +993,6 @@ static int files_ref_iterator_seek(struct ref_iterator *ref_iterator,
 	return ref_iterator_seek(iter->iter0, refname, flags);
 }
 
-static int files_ref_iterator_peel(struct ref_iterator *ref_iterator,
-				   struct object_id *peeled)
-{
-	struct files_ref_iterator *iter =
-		(struct files_ref_iterator *)ref_iterator;
-
-	return ref_iterator_peel(iter->iter0, peeled);
-}
-
 static void files_ref_iterator_release(struct ref_iterator *ref_iterator)
 {
 	struct files_ref_iterator *iter =
@@ -1012,7 +1003,6 @@ static void files_ref_iterator_release(struct ref_iterator *ref_iterator)
 static struct ref_iterator_vtable files_ref_iterator_vtable = {
 	.advance = files_ref_iterator_advance,
 	.seek = files_ref_iterator_seek,
-	.peel = files_ref_iterator_peel,
 	.release = files_ref_iterator_release,
 };
 
@@ -2388,12 +2378,6 @@ static int files_reflog_iterator_seek(struct ref_iterator *ref_iterator UNUSED,
 	BUG("ref_iterator_seek() called for reflog_iterator");
 }
 
-static int files_reflog_iterator_peel(struct ref_iterator *ref_iterator UNUSED,
-				      struct object_id *peeled UNUSED)
-{
-	BUG("ref_iterator_peel() called for reflog_iterator");
-}
-
 static void files_reflog_iterator_release(struct ref_iterator *ref_iterator)
 {
 	struct files_reflog_iterator *iter =
@@ -2404,7 +2388,6 @@ static void files_reflog_iterator_release(struct ref_iterator *ref_iterator)
 static struct ref_iterator_vtable files_reflog_iterator_vtable = {
 	.advance = files_reflog_iterator_advance,
 	.seek = files_reflog_iterator_seek,
-	.peel = files_reflog_iterator_peel,
 	.release = files_reflog_iterator_release,
 };
 
diff --git a/refs/iterator.c b/refs/iterator.c
index 072c6aacdb0341..d79aa5ec82dc6f 100644
--- a/refs/iterator.c
+++ b/refs/iterator.c
@@ -21,12 +21,6 @@ int ref_iterator_seek(struct ref_iterator *ref_iterator, const char *refname,
 	return ref_iterator->vtable->seek(ref_iterator, refname, flags);
 }
 
-int ref_iterator_peel(struct ref_iterator *ref_iterator,
-		      struct object_id *peeled)
-{
-	return ref_iterator->vtable->peel(ref_iterator, peeled);
-}
-
 void ref_iterator_free(struct ref_iterator *ref_iterator)
 {
 	if (ref_iterator) {
@@ -60,12 +54,6 @@ static int empty_ref_iterator_seek(struct ref_iterator *ref_iterator UNUSED,
 	return 0;
 }
 
-static int empty_ref_iterator_peel(struct ref_iterator *ref_iterator UNUSED,
-				   struct object_id *peeled UNUSED)
-{
-	BUG("peel called for empty iterator");
-}
-
 static void empty_ref_iterator_release(struct ref_iterator *ref_iterator UNUSED)
 {
 }
@@ -73,7 +61,6 @@ static void empty_ref_iterator_release(struct ref_iterator *ref_iterator UNUSED)
 static struct ref_iterator_vtable empty_ref_iterator_vtable = {
 	.advance = empty_ref_iterator_advance,
 	.seek = empty_ref_iterator_seek,
-	.peel = empty_ref_iterator_peel,
 	.release = empty_ref_iterator_release,
 };
 
@@ -240,18 +227,6 @@ static int merge_ref_iterator_seek(struct ref_iterator *ref_iterator,
 	return 0;
 }
 
-static int merge_ref_iterator_peel(struct ref_iterator *ref_iterator,
-				   struct object_id *peeled)
-{
-	struct merge_ref_iterator *iter =
-		(struct merge_ref_iterator *)ref_iterator;
-
-	if (!iter->current) {
-		BUG("peel called before advance for merge iterator");
-	}
-	return ref_iterator_peel(*iter->current, peeled);
-}
-
 static void merge_ref_iterator_release(struct ref_iterator *ref_iterator)
 {
 	struct merge_ref_iterator *iter =
@@ -263,7 +238,6 @@ static void merge_ref_iterator_release(struct ref_iterator *ref_iterator)
 static struct ref_iterator_vtable merge_ref_iterator_vtable = {
 	.advance = merge_ref_iterator_advance,
 	.seek = merge_ref_iterator_seek,
-	.peel = merge_ref_iterator_peel,
 	.release = merge_ref_iterator_release,
 };
 
@@ -412,15 +386,6 @@ static int prefix_ref_iterator_seek(struct ref_iterator *ref_iterator,
 	return ref_iterator_seek(iter->iter0, refname, flags);
 }
 
-static int prefix_ref_iterator_peel(struct ref_iterator *ref_iterator,
-				    struct object_id *peeled)
-{
-	struct prefix_ref_iterator *iter =
-		(struct prefix_ref_iterator *)ref_iterator;
-
-	return ref_iterator_peel(iter->iter0, peeled);
-}
-
 static void prefix_ref_iterator_release(struct ref_iterator *ref_iterator)
 {
 	struct prefix_ref_iterator *iter =
@@ -432,7 +397,6 @@ static void prefix_ref_iterator_release(struct ref_iterator *ref_iterator)
 static struct ref_iterator_vtable prefix_ref_iterator_vtable = {
 	.advance = prefix_ref_iterator_advance,
 	.seek = prefix_ref_iterator_seek,
-	.peel = prefix_ref_iterator_peel,
 	.release = prefix_ref_iterator_release,
 };
 
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 1fefefd54ed0e7..6fa229edd0ffad 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1030,22 +1030,6 @@ static int packed_ref_iterator_seek(struct ref_iterator *ref_iterator,
 	return 0;
 }
 
-static int packed_ref_iterator_peel(struct ref_iterator *ref_iterator,
-				   struct object_id *peeled)
-{
-	struct packed_ref_iterator *iter =
-		(struct packed_ref_iterator *)ref_iterator;
-
-	if ((iter->base.ref.flags & REF_KNOWS_PEELED)) {
-		oidcpy(peeled, &iter->peeled);
-		return is_null_oid(&iter->peeled) ? -1 : 0;
-	} else if ((iter->base.ref.flags & (REF_ISBROKEN | REF_ISSYMREF))) {
-		return -1;
-	} else {
-		return peel_object(iter->repo, &iter->oid, peeled) ? -1 : 0;
-	}
-}
-
 static void packed_ref_iterator_release(struct ref_iterator *ref_iterator)
 {
 	struct packed_ref_iterator *iter =
@@ -1059,7 +1043,6 @@ static void packed_ref_iterator_release(struct ref_iterator *ref_iterator)
 static struct ref_iterator_vtable packed_ref_iterator_vtable = {
 	.advance = packed_ref_iterator_advance,
 	.seek = packed_ref_iterator_seek,
-	.peel = packed_ref_iterator_peel,
 	.release = packed_ref_iterator_release,
 };
 
@@ -1525,13 +1508,8 @@ static enum ref_transaction_error write_with_updates(struct packed_ref_store *re
 
 		if (cmp < 0) {
 			/* Pass the old reference through. */
-
-			struct object_id peeled;
-			int peel_error = ref_iterator_peel(iter, &peeled);
-
 			if (write_packed_entry(out, iter->ref.name,
-					       iter->ref.oid,
-					       peel_error ? NULL : &peeled))
+					       iter->ref.oid, iter->ref.peeled_oid))
 				goto write_error;
 
 			if ((ok = ref_iterator_advance(iter)) != ITER_OK) {
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index e427848879d61b..ffef01a597579e 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -546,14 +546,6 @@ static int cache_ref_iterator_seek(struct ref_iterator *ref_iterator,
 	return 0;
 }
 
-static int cache_ref_iterator_peel(struct ref_iterator *ref_iterator,
-				   struct object_id *peeled)
-{
-	struct cache_ref_iterator *iter =
-		(struct cache_ref_iterator *)ref_iterator;
-	return peel_object(iter->repo, ref_iterator->ref.oid, peeled) ? -1 : 0;
-}
-
 static void cache_ref_iterator_release(struct ref_iterator *ref_iterator)
 {
 	struct cache_ref_iterator *iter =
@@ -565,7 +557,6 @@ static void cache_ref_iterator_release(struct ref_iterator *ref_iterator)
 static struct ref_iterator_vtable cache_ref_iterator_vtable = {
 	.advance = cache_ref_iterator_advance,
 	.seek = cache_ref_iterator_seek,
-	.peel = cache_ref_iterator_peel,
 	.release = cache_ref_iterator_release,
 };
 
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index f4f845bbeaf673..4671517dade968 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -357,12 +357,6 @@ typedef int ref_iterator_advance_fn(struct ref_iterator *ref_iterator);
 typedef int ref_iterator_seek_fn(struct ref_iterator *ref_iterator,
 				 const char *refname, unsigned int flags);
 
-/*
- * Peels the current ref, returning 0 for success or -1 for failure.
- */
-typedef int ref_iterator_peel_fn(struct ref_iterator *ref_iterator,
-				 struct object_id *peeled);
-
 /*
  * Implementations of this function should free any resources specific
  * to the derived class.
@@ -372,7 +366,6 @@ typedef void ref_iterator_release_fn(struct ref_iterator *ref_iterator);
 struct ref_iterator_vtable {
 	ref_iterator_advance_fn *advance;
 	ref_iterator_seek_fn *seek;
-	ref_iterator_peel_fn *peel;
 	ref_iterator_release_fn *release;
 };
 
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index e214e120d77a5c..e329d4a423abdb 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -744,21 +744,6 @@ static int reftable_ref_iterator_seek(struct ref_iterator *ref_iterator,
 	return iter->err;
 }
 
-static int reftable_ref_iterator_peel(struct ref_iterator *ref_iterator,
-				      struct object_id *peeled)
-{
-	struct reftable_ref_iterator *iter =
-		(struct reftable_ref_iterator *)ref_iterator;
-
-	if (iter->ref.value_type == REFTABLE_REF_VAL2) {
-		oidread(peeled, iter->ref.value.val2.target_value,
-			iter->refs->base.repo->hash_algo);
-		return 0;
-	}
-
-	return -1;
-}
-
 static void reftable_ref_iterator_release(struct ref_iterator *ref_iterator)
 {
 	struct reftable_ref_iterator *iter =
@@ -776,7 +761,6 @@ static void reftable_ref_iterator_release(struct ref_iterator *ref_iterator)
 static struct ref_iterator_vtable reftable_ref_iterator_vtable = {
 	.advance = reftable_ref_iterator_advance,
 	.seek = reftable_ref_iterator_seek,
-	.peel = reftable_ref_iterator_peel,
 	.release = reftable_ref_iterator_release,
 };
 
@@ -2098,13 +2082,6 @@ static int reftable_reflog_iterator_seek(struct ref_iterator *ref_iterator UNUSE
 	return -1;
 }
 
-static int reftable_reflog_iterator_peel(struct ref_iterator *ref_iterator UNUSED,
-					 struct object_id *peeled UNUSED)
-{
-	BUG("reftable reflog iterator cannot be peeled");
-	return -1;
-}
-
 static void reftable_reflog_iterator_release(struct ref_iterator *ref_iterator)
 {
 	struct reftable_reflog_iterator *iter =
@@ -2117,7 +2094,6 @@ static void reftable_reflog_iterator_release(struct ref_iterator *ref_iterator)
 static struct ref_iterator_vtable reftable_reflog_iterator_vtable = {
 	.advance = reftable_reflog_iterator_advance,
 	.seek = reftable_reflog_iterator_seek,
-	.peel = reftable_reflog_iterator_peel,
 	.release = reftable_reflog_iterator_release,
 };
 

From 7ec85185b197ce1cd28721a6f4415fb9db5cd42f Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 23 Oct 2025 09:16:20 +0200
Subject: [PATCH 034/553] object: add flag to `peel_object()` to verify object
 type

When peeling a tag to a non-tag object we repeatedly call
`parse_object()` on the tagged object until we find the first object
that isn't a tag. While this feels sensible at first, there is a big
catch here: `parse_object()` doesn't actually verify the type of the
tagged object.

The relevant code path here eventually ends up in `parse_tag_buffer()`.
Here, we parse the various fields of the tag, including the "type". Once
we've figured out the type and the tagged object ID, we call one of the
`lookup_${type}()` functions for whatever type we have found. There are
two possible outcomes in the successful case:

  1. The object is already part of our cached objects. In that case we
     double-check whether the type we're trying to look up matches the
     type that was cached.

  2. The object is _not_ part of our cached objects. In that case, we
     simply create a new object with the expected type, but we don't
     parse that object.

In the first case we might notice type mismatches, but only in the case
where our cache has the object with the correct type. In the second
case, we'll blindly assume that the type is correct and then go with it.
We'll only notice that the type might be wrong when we try to parse the
object at a later point.
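
The two outcomes can be illustrated with a toy model of the object
cache. This is only a sketch: `toy_lookup()`, `toy_cache` and the
`TOY_*` types are hypothetical names that do not exist in Git, and the
real lookup functions are considerably more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model; none of these names exist in Git. */
enum toy_type { TOY_BLOB = 1, TOY_COMMIT, TOY_TAG };

struct toy_object {
	char id[8];
	enum toy_type type; /* type the cache believes, never verified */
};

#define TOY_CACHE_MAX 16
static struct toy_object toy_cache[TOY_CACHE_MAX];
static size_t toy_cache_nr;

/*
 * Look up `id` with the type a tag claims for it. A cached entry is only
 * returned if its recorded type matches (case 1). An unknown ID gets a new,
 * unparsed entry that adopts the claimed type as-is (case 2).
 */
static struct toy_object *toy_lookup(const char *id, enum toy_type claimed)
{
	size_t i;

	assert(id && strlen(id) < sizeof(toy_cache[0].id));

	for (i = 0; i < toy_cache_nr; i++)
		if (!strcmp(toy_cache[i].id, id))
			return toy_cache[i].type == claimed ? &toy_cache[i] : NULL;

	if (toy_cache_nr == TOY_CACHE_MAX)
		return NULL;
	strcpy(toy_cache[toy_cache_nr].id, id);
	toy_cache[toy_cache_nr].type = claimed;
	return &toy_cache[toy_cache_nr++];
}
```

In this model, the first lookup of an ID with a wrong claimed type
succeeds, and a mismatch is only reported once a second lookup claims a
different type for the same ID.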

Now arguably, we could change `parse_tag_buffer()` to verify the tagged
object's type for us. But that would have the effect that such a tag
cannot be parsed at all anymore, and we have a handful of tests for
exactly this case that assert we can still open such tags. So this does
not feel like something we can retroactively tighten, even though one
shouldn't ever hit such corrupted tags.

Instead, add a new `flags` field to `peel_object()` that allows the
caller to opt in to strict object verification. This will be wired up at
a subset of callsites over the next few commits.
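
The opt-in behaviour can be sketched on a toy object store, assuming a
flat array of objects where each tag records the index of the tagged
object and the type it declares for it. All names here (`toy_peel()`,
`TOY_PEEL_VERIFY`, `struct toy_object`) are hypothetical and only mirror
the shape of the change, not its implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in Git. */
enum toy_type { TOY_BLOB = 1, TOY_COMMIT, TOY_TAG };

struct toy_object {
	enum toy_type real_type;     /* what the object store says it is */
	int tagged;                  /* tags only: index of the tagged object */
	enum toy_type declared_type; /* tags only: the "type" header */
};

#define TOY_PEEL_VERIFY (1u << 0)

/*
 * Follow tags until the first non-tag object and return its index, or -1
 * if the chain is broken. Without TOY_PEEL_VERIFY the declared type is
 * trusted; with it, every step is checked against the real type.
 */
static int toy_peel(const struct toy_object *odb, size_t nr, int idx,
		    unsigned flags)
{
	enum toy_type believed;
	size_t steps;

	assert(odb != NULL);
	if (idx < 0 || (size_t)idx >= nr)
		return -1;
	believed = odb[idx].real_type;

	for (steps = 0; believed == TOY_TAG; steps++) {
		const struct toy_object *tag = &odb[idx];

		/* Bail out on cycles, non-tags and dangling tags. */
		if (steps >= nr || tag->real_type != TOY_TAG ||
		    tag->tagged < 0 || (size_t)tag->tagged >= nr)
			return -1;

		idx = tag->tagged;
		believed = tag->declared_type;
		if ((flags & TOY_PEEL_VERIFY) && believed != odb[idx].real_type)
			return -1;
	}

	return idx;
}
```

With `flags == 0` a tag that claims "commit" but points to a blob still
peels successfully in this model. With `TOY_PEEL_VERIFY` the same tag is
rejected, while well-formed tag chains peel as before.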

Note that this change also inlines `deref_tag_noverify()`. There have
only been two callsites of that function: the one we're changing and one
in our test helpers. The latter callsite can trivially use `deref_tag()`
instead, so by inlining the function we avoid having to pass down the
flag.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 object.c                | 20 +++++++++++++++++---
 object.h                | 15 ++++++++++++++-
 ref-filter.c            |  2 +-
 refs.c                  |  2 +-
 refs/packed-backend.c   |  5 ++---
 refs/reftable-backend.c |  4 ++--
 t/helper/test-reach.c   |  2 +-
 tag.c                   | 12 ------------
 tag.h                   |  1 -
 9 files changed, 38 insertions(+), 25 deletions(-)

diff --git a/object.c b/object.c
index 986114a6dba843..e72b0ed4360e67 100644
--- a/object.c
+++ b/object.c
@@ -209,11 +209,12 @@ struct object *lookup_object_by_type(struct repository *r,
 
 enum peel_status peel_object(struct repository *r,
 			     const struct object_id *name,
-			     struct object_id *oid)
+			     struct object_id *oid,
+			     unsigned flags)
 {
 	struct object *o = lookup_unknown_object(r, name);
 
-	if (o->type == OBJ_NONE) {
+	if (o->type == OBJ_NONE || flags & PEEL_OBJECT_VERIFY_OBJECT_TYPE) {
 		int type = odb_read_object_info(r->objects, name, NULL);
 		if (type < 0 || !object_as_type(o, type, 0))
 			return PEEL_INVALID;
@@ -222,7 +223,20 @@ enum peel_status peel_object(struct repository *r,
 	if (o->type != OBJ_TAG)
 		return PEEL_NON_TAG;
 
-	o = deref_tag_noverify(r, o);
+	while (o && o->type == OBJ_TAG) {
+		o = parse_object(r, &o->oid);
+		if (o && o->type == OBJ_TAG && ((struct tag *)o)->tagged) {
+			o = ((struct tag *)o)->tagged;
+
+			if (flags & PEEL_OBJECT_VERIFY_OBJECT_TYPE) {
+				int type = odb_read_object_info(r->objects, &o->oid, NULL);
+				if (type < 0 || !object_as_type(o, type, 0))
+					return PEEL_INVALID;
+			}
+		} else {
+			o = NULL;
+		}
+	}
 	if (!o)
 		return PEEL_INVALID;
 
diff --git a/object.h b/object.h
index 8c3c1c46e1bf04..1499f63d507c32 100644
--- a/object.h
+++ b/object.h
@@ -287,6 +287,17 @@ enum peel_status {
 	PEEL_BROKEN = -4
 };
 
+enum peel_object_flags {
+	/*
+	 * Always verify the object type, even in the case where the looked-up
+	 * object already has an object type. This can be useful when the
+	 * stored object type may be invalid. One such case is when looking up
+	 * objects via tags, where we blindly trust the object type declared by
+	 * the tag.
+	 */
+	PEEL_OBJECT_VERIFY_OBJECT_TYPE = (1 << 0),
+};
+
 /*
  * Peel the named object; i.e., if the object is a tag, resolve the
  * tag recursively until a non-tag is found.  If successful, store the
@@ -295,7 +306,9 @@ enum peel_status {
  * and leave oid unchanged.
  */
 enum peel_status peel_object(struct repository *r,
-			     const struct object_id *name, struct object_id *oid);
+			     const struct object_id *name,
+			     struct object_id *oid,
+			     unsigned flags);
 
 struct object_list *object_list_insert(struct object *item,
 				       struct object_list **list_p);
diff --git a/ref-filter.c b/ref-filter.c
index 7fd8babec8f5bd..9a8ed8c8fc1f3b 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2581,7 +2581,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 	if (need_tagged) {
 		if (!is_null_oid(&ref->peeled_oid)) {
 			oidcpy(&oi_deref.oid, &ref->peeled_oid);
-		} else if (!peel_object(the_repository, &obj->oid, &oi_deref.oid)) {
+		} else if (!peel_object(the_repository, &oi.oid, &oi_deref.oid, 0)) {
 			/* We managed to peel the object ourselves. */
 		} else {
 			die("bad tag");
diff --git a/refs.c b/refs.c
index 9d8f0a9ca4a3a6..a41a94ae55bb43 100644
--- a/refs.c
+++ b/refs.c
@@ -2333,7 +2333,7 @@ int reference_get_peeled_oid(struct repository *repo,
 		return 0;
 	}
 
-	return peel_object(repo, ref->oid, peeled_oid) ? -1 : 0;
+	return peel_object(repo, ref->oid, peeled_oid, 0) ? -1 : 0;
 }
 
 int refs_update_symref(struct ref_store *refs, const char *ref,
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 6fa229edd0ffad..4752d3f3981fe3 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1527,9 +1527,8 @@ static enum ref_transaction_error write_with_updates(struct packed_ref_store *re
 			i++;
 		} else {
 			struct object_id peeled;
-			int peel_error = peel_object(refs->base.repo,
-						     &update->new_oid,
-						     &peeled);
+			int peel_error = peel_object(refs->base.repo, &update->new_oid,
+						     &peeled, 0);
 
 			if (write_packed_entry(out, update->refname,
 					       &update->new_oid,
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index e329d4a423abdb..9febb2322c3b24 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1632,7 +1632,7 @@ static int write_transaction_table(struct reftable_writer *writer, void *cb_data
 			ref.refname = (char *)u->refname;
 			ref.update_index = ts;
 
-			peel_error = peel_object(arg->refs->base.repo, &u->new_oid, &peeled);
+			peel_error = peel_object(arg->refs->base.repo, &u->new_oid, &peeled, 0);
 			if (!peel_error) {
 				ref.value_type = REFTABLE_REF_VAL2;
 				memcpy(ref.value.val2.target_value, peeled.hash, GIT_MAX_RAWSZ);
@@ -2497,7 +2497,7 @@ static int write_reflog_expiry_table(struct reftable_writer *writer, void *cb_da
 		ref.refname = (char *)arg->refname;
 		ref.update_index = ts;
 
-		if (!peel_object(arg->refs->base.repo, &arg->update_oid, &peeled)) {
+		if (!peel_object(arg->refs->base.repo, &arg->update_oid, &peeled, 0)) {
 			ref.value_type = REFTABLE_REF_VAL2;
 			memcpy(ref.value.val2.target_value, peeled.hash, GIT_MAX_RAWSZ);
 			memcpy(ref.value.val2.value, arg->update_oid.hash, GIT_MAX_RAWSZ);
diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
index 028ec003067828..c58c93800f3232 100644
--- a/t/helper/test-reach.c
+++ b/t/helper/test-reach.c
@@ -63,7 +63,7 @@ int cmd__reach(int ac, const char **av)
 			die("failed to resolve %s", buf.buf + 2);
 
 		orig = parse_object(r, &oid);
-		peeled = deref_tag_noverify(the_repository, orig);
+		peeled = deref_tag(the_repository, orig, NULL, 0);
 
 		if (!peeled)
 			die("failed to load commit for input %s resulting in oid %s",
diff --git a/tag.c b/tag.c
index 1d52686ee105f2..f5c232d2f1f36c 100644
--- a/tag.c
+++ b/tag.c
@@ -94,18 +94,6 @@ struct object *deref_tag(struct repository *r, struct object *o, const char *war
 	return o;
 }
 
-struct object *deref_tag_noverify(struct repository *r, struct object *o)
-{
-	while (o && o->type == OBJ_TAG) {
-		o = parse_object(r, &o->oid);
-		if (o && o->type == OBJ_TAG && ((struct tag *)o)->tagged)
-			o = ((struct tag *)o)->tagged;
-		else
-			o = NULL;
-	}
-	return o;
-}
-
 struct tag *lookup_tag(struct repository *r, const struct object_id *oid)
 {
 	struct object *obj = lookup_object(r, oid);
diff --git a/tag.h b/tag.h
index c49d7c19ad3c90..ef12a610372063 100644
--- a/tag.h
+++ b/tag.h
@@ -16,7 +16,6 @@ int parse_tag_buffer(struct repository *r, struct tag *item, const void *data, u
 int parse_tag(struct tag *item);
 void release_tag_memory(struct tag *t);
 struct object *deref_tag(struct repository *r, struct object *, const char *, int);
-struct object *deref_tag_noverify(struct repository *r, struct object *);
 int gpg_verify_tag(const struct object_id *oid,
 		   const char *name_to_report, unsigned flags);
 struct object_id *get_tagged_oid(struct tag *tag);

From 6ec4c0b45ba916770c6fbc03df94018ca06fb77e Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 23 Oct 2025 09:16:21 +0200
Subject: [PATCH 035/553] refs: don't store peeled object IDs for invalid tags

Both the "files" and "reftable" backends store peeled object IDs for
references that point to tags:

  - The "files" backend stores the value when packing refs, where each
    peeled object ID is prefixed with "^".

  - The "reftable" backend stores the value whenever writing a new
    reference that points to a tag via a special ref record type.

Both of these backends use `peel_object()` to find the peeled object ID.
But as explained in the preceding commit, that function does not detect
the case where the tag's tagged object and its claimed type mismatch.

The consequence of storing these bogus peeled object IDs is that we're
less likely to detect such corruption in other parts of Git.
git-for-each-ref(1) for example does not notice anymore that the tag is
broken when using "--format=%(*objectname)" to dereference tags.

One could claim that this is good, because it still allows us to mostly
use the tag as intended. But the biggest problem here is that we now
have different behaviour for such a broken tag depending on whether or
not we have its peeled value in the refdb.

Fix the issue by verifying the object type when peeling the object. If
that verification fails we simply skip storing the peeled value in
either of the reference formats.
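
For the "files" backend this boils down to omitting the "^" line of a
packed-refs entry. A minimal sketch, assuming a hypothetical
`toy_format_packed_entry()` helper that formats one entry into a buffer
(the real code writes to a file and takes object IDs, not hex strings):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Git: format one packed-refs entry. The
 * peeled line, prefixed with "^", is only emitted when the caller passes a
 * peeled value, i.e. when peeling succeeded and was verified.
 * Returns 0 on success and -1 if the buffer is too small.
 */
static int toy_format_packed_entry(char *buf, size_t len, const char *refname,
				   const char *oid_hex, const char *peeled_hex)
{
	int n;

	assert(buf && refname && oid_hex);

	if (peeled_hex)
		n = snprintf(buf, len, "%s %s\n^%s\n", oid_hex, refname, peeled_hex);
	else
		n = snprintf(buf, len, "%s %s\n", oid_hex, refname);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would pass NULL for `peeled_hex` whenever the verifying peel
reports an error, so a broken tag ends up as a plain entry without a
peeled value.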

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 refs/packed-backend.c      |  2 +-
 refs/reftable-backend.c    |  3 ++-
 t/pack-refs-tests.sh       | 32 ++++++++++++++++++++++++++++++++
 t/t0610-reftable-basics.sh | 28 ++++++++++++++++++++++++++++
 4 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 4752d3f3981fe3..1ab0c503930164 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1528,7 +1528,7 @@ static enum ref_transaction_error write_with_updates(struct packed_ref_store *re
 		} else {
 			struct object_id peeled;
 			int peel_error = peel_object(refs->base.repo, &update->new_oid,
-						     &peeled, 0);
+						     &peeled, PEEL_OBJECT_VERIFY_OBJECT_TYPE);
 
 			if (write_packed_entry(out, update->refname,
 					       &update->new_oid,
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 9febb2322c3b24..6bbfd5618dac16 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1632,7 +1632,8 @@ static int write_transaction_table(struct reftable_writer *writer, void *cb_data
 			ref.refname = (char *)u->refname;
 			ref.update_index = ts;
 
-			peel_error = peel_object(arg->refs->base.repo, &u->new_oid, &peeled, 0);
+			peel_error = peel_object(arg->refs->base.repo, &u->new_oid, &peeled,
+						 PEEL_OBJECT_VERIFY_OBJECT_TYPE);
 			if (!peel_error) {
 				ref.value_type = REFTABLE_REF_VAL2;
 				memcpy(ref.value.val2.target_value, peeled.hash, GIT_MAX_RAWSZ);
diff --git a/t/pack-refs-tests.sh b/t/pack-refs-tests.sh
index 3dbcc01718e157..095823d915fd63 100644
--- a/t/pack-refs-tests.sh
+++ b/t/pack-refs-tests.sh
@@ -428,4 +428,36 @@ do
 	'
 done
 
+test_expect_success 'pack-refs does not store invalid peeled tag value' '
+	test_when_finished rm -rf repo &&
+	git init repo &&
+	(
+		cd repo &&
+		git commit --allow-empty --message initial &&
+
+		echo garbage >blob-content &&
+		blob_id=$(git hash-object -w -t blob blob-content) &&
+
+		# Write an invalid tag into the object database. The tag itself
+		# is well-formed, but the tagged object is a blob while we
+		# claim that it is a commit.
+		cat >tag-content <<-EOF &&
+		object $blob_id
+		type commit
+		tag bad-tag
+		tagger C O Mitter <committer@example.com> 1112354055 +0200
+
+		annotated
+		EOF
+		tag_id=$(git hash-object -w -t tag tag-content) &&
+		git update-ref refs/tags/bad-tag "$tag_id" &&
+
+		# The packed-refs file should not contain the peeled object ID.
+		# If it did this would cause commands that use the peeled value
+		# to not notice this corrupted tag.
+		git pack-refs --all &&
+		test_grep ! "^\^" .git/packed-refs
+	)
+'
+
 test_done
diff --git a/t/t0610-reftable-basics.sh b/t/t0610-reftable-basics.sh
index 3ea5d51532a8b8..6575528f212716 100755
--- a/t/t0610-reftable-basics.sh
+++ b/t/t0610-reftable-basics.sh
@@ -1135,4 +1135,32 @@ test_expect_success 'fetch: accessing FETCH_HEAD special ref works' '
 	test_cmp expect actual
 '
 
+test_expect_success 'writes do not persist peeled value for invalid tags' '
+	test_when_finished rm -rf repo &&
+	git init repo &&
+	(
+		cd repo &&
+		git commit --allow-empty --message initial &&
+
+		# We cannot easily verify that the peeled value is not stored
+		# in the tables. Instead, we test this indirectly: we create
+		# two tags that both point to the same object, but they claim
+		# different object types. If we parse both tags we notice that
+		# the parsed tagged object has a mismatch between the two tags
+		# and bail out.
+		#
+		# If we instead use the persisted peeled value we would not
+		# even parse the tags. As such, we would not notice the
+		# discrepancy either and thus listing these tags would succeed.
+		git tag tag-1 -m "tag 1" &&
+		git cat-file tag tag-1 >raw-tag &&
+		sed "s/^type commit$/type blob/" <raw-tag >broken-tag &&
+		broken_tag_id=$(git hash-object -w -t tag broken-tag) &&
+		git update-ref refs/tags/tag-2 $broken_tag_id &&
+
+		test_must_fail git for-each-ref --format="%(*objectname)" refs/tags/ 2>err &&
+		test_grep "bad tag pointer" err
+	)
+'
+
 test_done

From e66077ae45813a5ca269a7d676310a243bdcc1c2 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 23 Oct 2025 09:16:22 +0200
Subject: [PATCH 036/553] ref-filter: detect broken tags when dereferencing
 them

Users can ask git-for-each-ref(1) to peel tags and return information
about the tagged object by adding an asterisk to the format, for example
"%(*objectname)". If so, git-for-each-ref(1) peels that object to the
first non-tag object and then returns its values.

As mentioned in preceding commits, it can happen that the tagged object
type and the claimed object type differ, effectively resulting in a
corrupt tag. git-for-each-ref(1) would notice this mismatch, print an
error and then bail out when trying to peel the tag.

But we only notice this corruption in some very specific edge cases!
While we have a test in "t/for-each-ref-tests.sh" that verifies the
above scenario, this test is specifically crafted to detect the issue at
hand. Namely, we create two tags:

  - One tag points to a specific object with the correct type.

  - The other tag points to the *same* object with a different type.

The fact that both tags point to the same object is important here:
`peel_object()` wouldn't notice the corruption if the tagged objects
were different.

The root cause is that `peel_object()` calls `lookup_${type}()`
eventually, where the type is the same type declared in the tag object.
Consequently, when we have two tags pointing to the same object but with
different declared types we'll call two different lookup functions. The
first lookup will store the object with an unverified type A, whereas
the second lookup will try to look up the object with a different
unverified type B. And it is only now that we notice the discrepancy in
object types, even though type A could've already been the wrong type.

Fix the issue by verifying the object type in `populate_value()`. With
this change we'll also notice type mismatches when only dereferencing a
tag once.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 ref-filter.c            | 3 ++-
 t/for-each-ref-tests.sh | 4 +++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 9a8ed8c8fc1f3b..c54025d6b4c4cd 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2581,7 +2581,8 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 	if (need_tagged) {
 		if (!is_null_oid(&ref->peeled_oid)) {
 			oidcpy(&oi_deref.oid, &ref->peeled_oid);
-		} else if (!peel_object(the_repository, &oi.oid, &oi_deref.oid, 0)) {
+		} else if (!peel_object(the_repository, &oi.oid, &oi_deref.oid,
+					PEEL_OBJECT_VERIFY_OBJECT_TYPE)) {
 			/* We managed to peel the object ourselves. */
 		} else {
 			die("bad tag");
diff --git a/t/for-each-ref-tests.sh b/t/for-each-ref-tests.sh
index e3ad19298accde..4593be5fd544a8 100644
--- a/t/for-each-ref-tests.sh
+++ b/t/for-each-ref-tests.sh
@@ -1809,7 +1809,9 @@ test_expect_success "${git_for_each_ref} reports broken tags" '
 	bad=$(git hash-object -w -t tag bad) &&
 	git update-ref refs/tags/broken-tag-bad $bad &&
 	test_must_fail ${git_for_each_ref} --format="%(*objectname)" \
-		refs/tags/broken-tag-*
+		refs/tags/broken-tag-* &&
+	test_must_fail ${git_for_each_ref} --format="%(*objectname)" \
+		refs/tags/broken-tag-bad
 '
 
 test_expect_success 'set up tag with signature and no blank lines' '

From a29e2e8fe7e3935e23d2a03dc429cc9c2e68bfbe Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 23 Oct 2025 09:16:23 +0200
Subject: [PATCH 037/553] ref-filter: parse objects on demand
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

When formatting an arbitrary object we parse that object regardless of
whether or not we actually need any parsed data. In fact, many of the
atoms we have don't require any.

Refactor the code so that we parse the data on demand when we see an
atom that wants to access the objects. This leads to a small speedup,
for example in the Chromium repository with around 40000 refs:

    Benchmark 1: for-each-ref --format='%(raw)' (HEAD~)
      Time (mean ± σ):     388.7 ms ±   1.1 ms    [User: 322.2 ms, System: 65.0 ms]
      Range (min … max):   387.3 ms … 390.8 ms    10 runs

    Benchmark 2: for-each-ref --format='%(raw)' (HEAD)
      Time (mean ± σ):     344.7 ms ±   0.7 ms    [User: 287.8 ms, System: 55.1 ms]
      Range (min … max):   343.9 ms … 345.7 ms    10 runs

    Summary
      for-each-ref --format='%(raw)' (HEAD) ran
        1.13 ± 0.00 times faster than for-each-ref --format='%(raw)' (HEAD~)

With this change, we now spend ~90% of the time decompressing objects,
which is almost as good as it gets regarding git-for-each-ref(1)'s own
infrastructure.
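
The on-demand pattern itself is plain memoization. A minimal sketch,
assuming a hypothetical `struct toy_data` in which computing a string
length stands in for the real parsing work:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model; none of these names exist in Git. */
struct toy_data {
	const char *content; /* raw object content, already loaded */
	size_t parsed_len;   /* stand-in for the parsed representation */
	int is_parsed;
	int parse_calls;     /* bookkeeping for the demonstration only */
};

/* Parse on first use and hand out the cached result afterwards. */
static const size_t *toy_get_or_parse(struct toy_data *d)
{
	assert(d && d->content);

	if (!d->is_parsed) {
		d->parsed_len = strlen(d->content); /* pretend this is costly */
		d->is_parsed = 1;
		d->parse_calls++;
	}
	return &d->parsed_len;
}

/* An atom that only needs the raw content never triggers a parse. */
static const char *toy_atom_raw(struct toy_data *d)
{
	return d->content;
}

/* An atom that needs parsed data triggers the parse at most once. */
static size_t toy_atom_len(struct toy_data *d)
{
	return *toy_get_or_parse(d);
}
```

Formats that only use atoms like `toy_atom_raw()` skip the parse
entirely, and formats with several parsing atoms pay for it once.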

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 ref-filter.c | 142 ++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 106 insertions(+), 36 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index c54025d6b4c4cd..7cfcd5c3554930 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -91,6 +91,7 @@ static struct expand_data {
 	struct object_id delta_base_oid;
 	void *content;
 
+	struct object *maybe_object;
 	struct object_info info;
 } oi, oi_deref;
 
@@ -1475,11 +1476,29 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
 	}
 }
 
+static struct object *get_or_parse_object(struct expand_data *data, const char *refname,
+					  struct strbuf *err, int *eaten)
+{
+	if (!data->maybe_object) {
+		data->maybe_object = parse_object_buffer(the_repository, &data->oid, data->type,
+							 data->size, data->content, eaten);
+		if (!data->maybe_object) {
+			strbuf_addf(err, _("parse_object_buffer failed on %s for %s"),
+				    oid_to_hex(&data->oid), refname);
+			return NULL;
+		}
+	}
+
+	return data->maybe_object;
+}
+
 /* See grab_values */
-static void grab_tag_values(struct atom_value *val, int deref, struct object *obj)
+static int grab_tag_values(struct atom_value *val, int deref,
+			   struct expand_data *data, const char *refname,
+			   struct strbuf *err, int *eaten)
 {
+	struct tag *tag = NULL;
 	int i;
-	struct tag *tag = (struct tag *) obj;
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
@@ -1487,6 +1506,14 @@ static void grab_tag_values(struct atom_value *val, int deref, struct object *ob
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
+
+		if (!tag) {
+			tag = (struct tag *) get_or_parse_object(data, refname,
+								 err, eaten);
+			if (!tag)
+				return -1;
+		}
+
 		if (deref)
 			name++;
 		if (atom_type == ATOM_TAG)
@@ -1496,22 +1523,35 @@ static void grab_tag_values(struct atom_value *val, int deref, struct object *ob
 		else if (atom_type == ATOM_OBJECT && tag->tagged)
 			v->s = xstrdup(oid_to_hex(&tag->tagged->oid));
 	}
+
+	return 0;
 }
 
 /* See grab_values */
-static void grab_commit_values(struct atom_value *val, int deref, struct object *obj)
+static int grab_commit_values(struct atom_value *val, int deref,
+			      struct expand_data *data, const char *refname,
+			      struct strbuf *err, int *eaten)
 {
 	int i;
-	struct commit *commit = (struct commit *) obj;
+	struct commit *commit = NULL;
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
 		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
+
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
+
+		if (!commit) {
+			commit = (struct commit *) get_or_parse_object(data, refname,
+								       err, eaten);
+			if (!commit)
+				return -1;
+		}
+
 		if (atom_type == ATOM_TREE &&
 		    grab_oid(name, "tree", get_commit_tree_oid(commit), v, &used_atom[i]))
 			continue;
@@ -1531,6 +1571,8 @@ static void grab_commit_values(struct atom_value *val, int deref, struct object
 			v->s = strbuf_detach(&s, NULL);
 		}
 	}
+
+	return 0;
 }
 
 static const char *find_wholine(const char *who, int wholen, const char *buf)
@@ -1759,10 +1801,12 @@ static void grab_person(const char *who, struct atom_value *val, int deref, void
 	}
 }
 
-static void grab_signature(struct atom_value *val, int deref, struct object *obj)
+static int grab_signature(struct atom_value *val, int deref,
+			  struct expand_data *data, const char *refname,
+			  struct strbuf *err, int *eaten)
 {
 	int i;
-	struct commit *commit = (struct commit *) obj;
+	struct commit *commit = NULL;
 	struct signature_check sigc = { 0 };
 	int signature_checked = 0;
 
@@ -1790,6 +1834,13 @@ static void grab_signature(struct atom_value *val, int deref, struct object *obj
 			continue;
 
 		if (!signature_checked) {
+			if (!commit) {
+				commit = (struct commit *) get_or_parse_object(data, refname,
+									       err, eaten);
+				if (!commit)
+					return -1;
+			}
+
 			check_commit_signature(commit, &sigc);
 			signature_checked = 1;
 		}
@@ -1843,6 +1894,8 @@ static void grab_signature(struct atom_value *val, int deref, struct object *obj
 
 	if (signature_checked)
 		signature_check_clear(&sigc);
+
+	return 0;
 }
 
 static void find_subpos(const char *buf,
@@ -1920,9 +1973,8 @@ static void append_lines(struct strbuf *out, const char *buf, unsigned long size
 }
 
 static void grab_describe_values(struct atom_value *val, int deref,
-				 struct object *obj)
+				 struct expand_data *data)
 {
-	struct commit *commit = (struct commit *)obj;
 	int i;
 
 	for (i = 0; i < used_atom_cnt; i++) {
@@ -1944,7 +1996,7 @@ static void grab_describe_values(struct atom_value *val, int deref,
 		cmd.git_cmd = 1;
 		strvec_push(&cmd.args, "describe");
 		strvec_pushv(&cmd.args, atom->u.describe_args.v);
-		strvec_push(&cmd.args, oid_to_hex(&commit->object.oid));
+		strvec_push(&cmd.args, oid_to_hex(&data->oid));
 		if (pipe_command(&cmd, NULL, 0, &out, 0, &err, 0) < 0) {
 			error(_("failed to run 'describe'"));
 			v->s = xstrdup("");
@@ -2066,24 +2118,36 @@ static void fill_missing_values(struct atom_value *val)
  * pointed at by the ref itself; otherwise it is the object the
  * ref (which is a tag) refers to.
  */
-static void grab_values(struct atom_value *val, int deref, struct object *obj, struct expand_data *data)
+static int grab_values(struct atom_value *val, int deref, struct expand_data *data,
+		       const char *refname, struct strbuf *err, int *eaten)
 {
 	void *buf = data->content;
+	int ret;
 
-	switch (obj->type) {
+	switch (data->type) {
 	case OBJ_TAG:
-		grab_tag_values(val, deref, obj);
+		ret = grab_tag_values(val, deref, data, refname, err, eaten);
+		if (ret < 0)
+			goto out;
+
 		grab_sub_body_contents(val, deref, data);
 		grab_person("tagger", val, deref, buf);
-		grab_describe_values(val, deref, obj);
+		grab_describe_values(val, deref, data);
 		break;
 	case OBJ_COMMIT:
-		grab_commit_values(val, deref, obj);
+		ret = grab_commit_values(val, deref, data, refname, err, eaten);
+		if (ret < 0)
+			goto out;
+
 		grab_sub_body_contents(val, deref, data);
 		grab_person("author", val, deref, buf);
 		grab_person("committer", val, deref, buf);
-		grab_signature(val, deref, obj);
-		grab_describe_values(val, deref, obj);
+
+		ret = grab_signature(val, deref, data, refname, err, eaten);
+		if (ret < 0)
+			goto out;
+
+		grab_describe_values(val, deref, data);
 		break;
 	case OBJ_TREE:
 		/* grab_tree_values(val, deref, obj, buf, sz); */
@@ -2094,8 +2158,12 @@ static void grab_values(struct atom_value *val, int deref, struct object *obj, s
 		grab_sub_body_contents(val, deref, data);
 		break;
 	default:
-		die("Eh?  Object of type %d?", obj->type);
+		die("Eh?  Object of type %d?", data->type);
 	}
+
+	ret = 0;
+out:
+	return ret;
 }
 
 static inline char *copy_advance(char *dst, const char *src)
@@ -2292,38 +2360,41 @@ static const char *get_refname(struct used_atom *atom, struct ref_array_item *re
 	return show_ref(&atom->u.refname, ref->refname);
 }
 
-static int get_object(struct ref_array_item *ref, int deref, struct object **obj,
+static int get_object(struct ref_array_item *ref, int deref,
 		      struct expand_data *oi, struct strbuf *err)
 {
-	/* parse_object_buffer() will set eaten to 0 if free() will be needed */
-	int eaten = 1;
+	/* parse_object_buffer() sets eaten to 1 if free() must not be called */
+	int eaten = 0;
+	int ret;
+
 	if (oi->info.contentp) {
 		/* We need to know that to use parse_object_buffer properly */
 		oi->info.sizep = &oi->size;
 		oi->info.typep = &oi->type;
 	}
+
 	if (odb_read_object_info_extended(the_repository->objects, &oi->oid, &oi->info,
-					  OBJECT_INFO_LOOKUP_REPLACE))
-		return strbuf_addf_ret(err, -1, _("missing object %s for %s"),
-				       oid_to_hex(&oi->oid), ref->refname);
+					  OBJECT_INFO_LOOKUP_REPLACE)) {
+		ret = strbuf_addf_ret(err, -1, _("missing object %s for %s"),
+				      oid_to_hex(&oi->oid), ref->refname);
+		goto out;
+	}
 	if (oi->info.disk_sizep && oi->disk_size < 0)
 		BUG("Object size is less than zero.");
 
 	if (oi->info.contentp) {
-		*obj = parse_object_buffer(the_repository, &oi->oid, oi->type, oi->size, oi->content, &eaten);
-		if (!*obj) {
-			if (!eaten)
-				free(oi->content);
-			return strbuf_addf_ret(err, -1, _("parse_object_buffer failed on %s for %s"),
-					       oid_to_hex(&oi->oid), ref->refname);
-		}
-		grab_values(ref->value, deref, *obj, oi);
+		ret = grab_values(ref->value, deref, oi, ref->refname, err, &eaten);
+		if (ret < 0)
+			goto out;
 	}
 
 	grab_common_values(ref->value, deref, oi);
+	ret = 0;
+
+out:
 	if (!eaten)
 		free(oi->content);
-	return 0;
+	return ret;
 }
 
 static void populate_worktree_map(struct hashmap *map, struct worktree **worktrees)
@@ -2376,7 +2447,6 @@ static char *get_worktree_path(const struct ref_array_item *ref)
  */
 static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 {
-	struct object *obj;
 	int i;
 	struct object_info empty = OBJECT_INFO_INIT;
 	int ahead_behind_atoms = 0;
@@ -2564,14 +2634,14 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 
 
 	oi.oid = ref->objectname;
-	if (get_object(ref, 0, &obj, &oi, err))
+	if (get_object(ref, 0, &oi, err))
 		return -1;
 
 	/*
 	 * If there is no atom that wants to know about tagged
 	 * object, we are done.
 	 */
-	if (!need_tagged || (obj->type != OBJ_TAG))
+	if (!need_tagged || (oi.type != OBJ_TAG))
 		return 0;
 
 	/*
@@ -2589,7 +2659,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 		}
 	}
 
-	return get_object(ref, 1, &obj, &oi_deref, err);
+	return get_object(ref, 1, &oi_deref, err);
 }
 
 /*

From bea37f1d647c6b17896eb3f0c210ac8dfc27b6d7 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Tue, 4 Nov 2025 15:36:13 +0100
Subject: [PATCH 038/553] ref-filter: fix stale parsed objects

In 054f5f457e (ref-filter: parse objects on demand, 2025-10-23) we
started to skip parsing objects whose values we don't need to access in
the first place. This was done by introducing a new member
`struct expand_data::maybe_object` that gets populated on demand via
`get_or_parse_object()`.

This has led to a regression though where the object now gets reused
because we don't reset it properly. The `oi` structure is declared in
global scope, and there is no single place where we reset it before
invoking `get_object()`. The consequence is that the `maybe_object`
member doesn't get reset across calls, so subsequent calls will end up
reusing the same object.

This is only an issue for a subset of retrieved values, as not all of
the infrastructure ends up calling `get_or_parse_object()`. So the
effect is limited, which is probably why the issue wasn't detected
earlier.

Fix the issue by resetting `maybe_object` in `get_object()`.
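
The failure mode can be sketched in isolation. The following is only a
minimal illustration, not the code in ref-filter.c: every type and
function name with a `_sketch` suffix, as well as `object_table`, is a
hypothetical stand-in, and the caller is assumed to pass an `oid`
between 0 and 2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real ref-filter.c types. */
struct parsed_object { int id; };

struct expand_data_sketch {
	int oid;                            /* stand-in for the object ID */
	struct parsed_object *maybe_object; /* lazily populated cache */
};

static struct parsed_object object_table[] = { {0}, {1}, {2} };

/* Parse on demand, reusing the cached object when one is present. */
static struct parsed_object *get_or_parse_sketch(struct expand_data_sketch *oi)
{
	assert(oi != NULL);
	if (!oi->maybe_object)
		oi->maybe_object = &object_table[oi->oid];
	return oi->maybe_object;
}

/* Entry point for each lookup on a long-lived `oi`. */
static struct parsed_object *get_object_sketch(struct expand_data_sketch *oi,
					       int oid)
{
	assert(oi != NULL);
	oi->oid = oid;
	/* Without this reset, the object of the previous call is reused. */
	oi->maybe_object = NULL;
	return get_or_parse_sketch(oi);
}
```

Because the structure outlives a single call, the reset has to happen at
the entry point rather than where the structure is declared.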

Reported-by: Junio C Hamano <gitster@pobox.com>
Based-on-patch-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 ref-filter.c   |  2 ++
 t/t7004-tag.sh | 20 ++++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/ref-filter.c b/ref-filter.c
index 7cfcd5c3554930..d8667c569a18f1 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2367,6 +2367,8 @@ static int get_object(struct ref_array_item *ref, int deref,
 	int eaten = 0;
 	int ret;
 
+	oi->maybe_object = NULL;
+
 	if (oi->info.contentp) {
 		/* We need to know that to use parse_object_buffer properly */
 		oi->info.sizep = &oi->size;
diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index 10835631ca7644..d1388cfdf45b02 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -2332,4 +2332,24 @@ test_expect_success 'If tag cannot be created then tag message file is not unlin
 	test_path_exists .git/TAG_EDITMSG
 '
 
+test_expect_success 'annotated tag version sort' '
+	git tag -a -m "sample 1.0" vsample-1.0 &&
+	git tag -a -m "sample 2.0" vsample-2.0 &&
+	git tag -a -m "sample 10.0" vsample-10.0 &&
+	cat >expect <<-EOF &&
+	vsample-1.0
+	vsample-2.0
+	vsample-10.0
+	EOF
+
+	git tag --list --sort=version:tag vsample-\* >actual &&
+	test_cmp expect actual &&
+
+	# Ensure that we also handle this correctly when the peeled values
+	# are cached, e.g. via the packed-refs file.
+	git pack-refs --all &&
+	git tag --list --sort=version:tag vsample-\* >actual &&
+	test_cmp expect actual
+'
+
 test_done

From 61ac8ba0f034b61d7353a799fecae9fe45137a72 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Tue, 4 Nov 2025 07:28:59 -0800
Subject: [PATCH 039/553] t7004: do not chdir around in the main process

Move down to the no-contains subdirectory inside a subshell, just like
the previous step that created and used it does.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t7004-tag.sh | 38 ++++++++++++++++++++------------------
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index d1388cfdf45b02..ce2ff2a28abaa3 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -2293,24 +2293,26 @@ test_expect_success '--contains combined with --no-contains' '
 # don't recurse down to tags for trees or blobs pointed to by *those*
 # commits.
 test_expect_success 'Does --[no-]contains stop at commits? Yes!' '
-	cd no-contains &&
-	blob=$(git rev-parse v0.3:v0.3.t) &&
-	tree=$(git rev-parse v0.3^{tree}) &&
-	git tag tag-blob $blob &&
-	git tag tag-tree $tree &&
-	git tag --contains v0.3 >actual &&
-	cat >expected <<-\EOF &&
-	v0.3
-	v0.4
-	v0.5
-	EOF
-	test_cmp expected actual &&
-	git tag --no-contains v0.3 >actual &&
-	cat >expected <<-\EOF &&
-	v0.1
-	v0.2
-	EOF
-	test_cmp expected actual
+	(
+		cd no-contains &&
+		blob=$(git rev-parse v0.3:v0.3.t) &&
+		tree=$(git rev-parse v0.3^{tree}) &&
+		git tag tag-blob $blob &&
+		git tag tag-tree $tree &&
+		git tag --contains v0.3 >actual &&
+		cat >expected <<-\EOF &&
+		v0.3
+		v0.4
+		v0.5
+		EOF
+		test_cmp expected actual &&
+		git tag --no-contains v0.3 >actual &&
+		cat >expected <<-\EOF &&
+		v0.1
+		v0.2
+		EOF
+		test_cmp expected actual
+	)
 '
 
 test_expect_success 'If tag is created then tag message file is unlinked' '

From 9b93ab8a9c61c53b3b9b2b3ba60c3e5d66b8ff56 Mon Sep 17 00:00:00 2001
From: Karthik Nayak <karthik.188@gmail.com>
Date: Mon, 20 Oct 2025 10:18:29 +0200
Subject: [PATCH 040/553] refs: move to using the '.optimize' functions

The ref backend interface, `struct ref_storage_be`, exposes two
callbacks to optimize a ref backend:

  1. pack_refs
  2. optimize

The former was specific to the 'files' + 'packed' refs backend. The
latter is more generic and covers all backends. While the naming is
different, both of these functions do the same thing.

Consolidate this code to only maintain the 'optimize' functions. Do this
by modifying the backends so that they implement only the `optimize`
callback. All users of the refs subsystem already use the 'optimize'
function, so no changes are needed on the caller side. Finally, clean up
all references to the 'pack_refs' field of the structure and the code
around it.
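
The resulting shape, one callback per backend reached through one
generic entry point, can be sketched as follows. This is a simplified
illustration only: all names ending in `_sketch` are hypothetical and
the structures carry far fewer members than the real ones.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the refs structures. */
struct store_sketch;
struct opts_sketch { unsigned int flags; };

typedef int optimize_fn_sketch(struct store_sketch *store,
			       struct opts_sketch *opts);

struct backend_sketch {
	const char *name;
	optimize_fn_sketch *optimize; /* the only optimization callback */
};

struct store_sketch {
	const struct backend_sketch *be;
	int optimized; /* records that the callback ran */
};

/* A backend implements `optimize` directly, without a wrapper. */
static int files_optimize_sketch(struct store_sketch *store,
				 struct opts_sketch *opts)
{
	(void)opts;
	store->optimized = 1;
	return 0;
}

static const struct backend_sketch files_backend_sketch = {
	.name = "files",
	.optimize = files_optimize_sketch,
};

/* Generic entry point: dispatch to whatever the backend provides. */
static int store_optimize_sketch(struct store_sketch *store,
				 struct opts_sketch *opts)
{
	assert(store != NULL && store->be != NULL);
	assert(store->be->optimize != NULL);
	return store->be->optimize(store, opts);
}
```

With a single callback there is no forwarding function left to keep in
sync between the two names.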

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 refs.c                  |  6 ------
 refs.h                  |  6 ------
 refs/debug.c            |  8 ++++----
 refs/files-backend.c    | 14 ++------------
 refs/packed-backend.c   |  6 +++---
 refs/refs-internal.h    |  3 ---
 refs/reftable-backend.c | 13 +++----------
 7 files changed, 12 insertions(+), 44 deletions(-)

diff --git a/refs.c b/refs.c
index a41a94ae55bb43..b9a4a606462d71 100644
--- a/refs.c
+++ b/refs.c
@@ -2313,12 +2313,6 @@ void base_ref_store_init(struct ref_store *refs, struct repository *repo,
 	refs->gitdir = xstrdup(path);
 }
 
-/* backend functions */
-int refs_pack_refs(struct ref_store *refs, struct pack_refs_opts *opts)
-{
-	return refs->be->pack_refs(refs, opts);
-}
-
 int refs_optimize(struct ref_store *refs, struct pack_refs_opts *opts)
 {
 	return refs->be->optimize(refs, opts);
diff --git a/refs.h b/refs.h
index 2dd7ac1a16aee9..8ff591ea95c7d9 100644
--- a/refs.h
+++ b/refs.h
@@ -514,12 +514,6 @@ struct pack_refs_opts {
 	struct string_list *includes;
 };
 
-/*
- * Write a packed-refs file for the current repository.
- * flags: Combination of the above PACK_REFS_* flags.
- */
-int refs_pack_refs(struct ref_store *refs, struct pack_refs_opts *opts);
-
 /*
  * Optimize the ref store. The exact behavior is up to the backend.
  * For the files backend, this is equivalent to packing refs.
diff --git a/refs/debug.c b/refs/debug.c
index 01499b9033ca3c..40cd1d9c1540c6 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -116,11 +116,11 @@ static int debug_transaction_abort(struct ref_store *refs,
 	return res;
 }
 
-static int debug_pack_refs(struct ref_store *ref_store, struct pack_refs_opts *opts)
+static int debug_optimize(struct ref_store *ref_store, struct pack_refs_opts *opts)
 {
 	struct debug_ref_store *drefs = (struct debug_ref_store *)ref_store;
-	int res = drefs->refs->be->pack_refs(drefs->refs, opts);
-	trace_printf_key(&trace_refs, "pack_refs: %d\n", res);
+	int res = drefs->refs->be->optimize(drefs->refs, opts);
+	trace_printf_key(&trace_refs, "optimize: %d\n", res);
 	return res;
 }
 
@@ -430,7 +430,7 @@ struct ref_storage_be refs_be_debug = {
 	.transaction_finish = debug_transaction_finish,
 	.transaction_abort = debug_transaction_abort,
 
-	.pack_refs = debug_pack_refs,
+	.optimize = debug_optimize,
 	.rename_ref = debug_rename_ref,
 	.copy_ref = debug_copy_ref,
 
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 5aeb454fb47684..d13b87e056cf5e 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1444,8 +1444,8 @@ static int should_pack_refs(struct files_ref_store *refs,
 	return 0;
 }
 
-static int files_pack_refs(struct ref_store *ref_store,
-			   struct pack_refs_opts *opts)
+static int files_optimize(struct ref_store *ref_store,
+			  struct pack_refs_opts *opts)
 {
 	struct files_ref_store *refs =
 		files_downcast(ref_store, REF_STORE_WRITE | REF_STORE_ODB,
@@ -1512,15 +1512,6 @@ static int files_pack_refs(struct ref_store *ref_store,
 	return 0;
 }
 
-static int files_optimize(struct ref_store *ref_store, struct pack_refs_opts *opts)
-{
-	/*
-	 * For the "files" backend, "optimizing" is the same as "packing".
-	 * So, we just call the existing worker function for packing.
-	 */
-	return files_pack_refs(ref_store, opts);
-}
-
 /*
  * People using contrib's git-new-workdir have .git/logs/refs ->
  * /some/other/path/.git/logs/refs, and that may live on another device.
@@ -3975,7 +3966,6 @@ struct ref_storage_be refs_be_files = {
 	.transaction_finish = files_transaction_finish,
 	.transaction_abort = files_transaction_abort,
 
-	.pack_refs = files_pack_refs,
 	.optimize = files_optimize,
 	.rename_ref = files_rename_ref,
 	.copy_ref = files_copy_ref,
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 1ab0c503930164..20cf9fab18e2e9 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1773,8 +1773,8 @@ static int packed_transaction_finish(struct ref_store *ref_store,
 	return ret;
 }
 
-static int packed_pack_refs(struct ref_store *ref_store UNUSED,
-			    struct pack_refs_opts *pack_opts UNUSED)
+static int packed_optimize(struct ref_store *ref_store UNUSED,
+			   struct pack_refs_opts *pack_opts UNUSED)
 {
 	/*
 	 * Packed refs are already packed. It might be that loose refs
@@ -2129,7 +2129,7 @@ struct ref_storage_be refs_be_packed = {
 	.transaction_finish = packed_transaction_finish,
 	.transaction_abort = packed_transaction_abort,
 
-	.pack_refs = packed_pack_refs,
+	.optimize = packed_optimize,
 	.rename_ref = NULL,
 	.copy_ref = NULL,
 
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 4671517dade968..fc5149df5b3c5c 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -422,8 +422,6 @@ typedef int ref_transaction_commit_fn(struct ref_store *refs,
 				      struct ref_transaction *transaction,
 				      struct strbuf *err);
 
-typedef int pack_refs_fn(struct ref_store *ref_store,
-			 struct pack_refs_opts *opts);
 typedef int optimize_fn(struct ref_store *ref_store,
 			struct pack_refs_opts *opts);
 typedef int rename_ref_fn(struct ref_store *ref_store,
@@ -550,7 +548,6 @@ struct ref_storage_be {
 	ref_transaction_finish_fn *transaction_finish;
 	ref_transaction_abort_fn *transaction_abort;
 
-	pack_refs_fn *pack_refs;
 	optimize_fn *optimize;
 	rename_ref_fn *rename_ref;
 	copy_ref_fn *copy_ref;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 6bbfd5618dac16..43cc66a48e9143 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1700,11 +1700,11 @@ static int reftable_be_transaction_finish(struct ref_store *ref_store UNUSED,
 	return ret;
 }
 
-static int reftable_be_pack_refs(struct ref_store *ref_store,
-				 struct pack_refs_opts *opts)
+static int reftable_be_optimize(struct ref_store *ref_store,
+				struct pack_refs_opts *opts)
 {
 	struct reftable_ref_store *refs =
-		reftable_be_downcast(ref_store, REF_STORE_WRITE | REF_STORE_ODB, "pack_refs");
+		reftable_be_downcast(ref_store, REF_STORE_WRITE | REF_STORE_ODB, "optimize_refs");
 	struct reftable_stack *stack;
 	int ret;
 
@@ -1733,12 +1733,6 @@ static int reftable_be_pack_refs(struct ref_store *ref_store,
 	return ret;
 }
 
-static int reftable_be_optimize(struct ref_store *ref_store,
-				struct pack_refs_opts *opts)
-{
-	return reftable_be_pack_refs(ref_store, opts);
-}
-
 struct write_create_symref_arg {
 	struct reftable_ref_store *refs;
 	struct reftable_stack *stack;
@@ -2761,7 +2755,6 @@ struct ref_storage_be refs_be_reftable = {
 	.transaction_finish = reftable_be_transaction_finish,
 	.transaction_abort = reftable_be_transaction_abort,
 
-	.pack_refs = reftable_be_pack_refs,
 	.optimize = reftable_be_optimize,
 	.rename_ref = reftable_be_rename_ref,
 	.copy_ref = reftable_be_copy_ref,

From 2cd99d984122f7f1cd7c3b153ee0a0d566831b30 Mon Sep 17 00:00:00 2001
From: Karthik Nayak <karthik.188@gmail.com>
Date: Mon, 20 Oct 2025 10:18:30 +0200
Subject: [PATCH 041/553] refs: rename 'pack_refs_opts' to 'refs_optimize_opts'

The previous commit removed all references to 'pack_refs()' within
the refs subsystem. Continue this cleanup by also renaming
'pack_refs_opts' to 'refs_optimize_opts' and the respective flags
accordingly. Keeping the naming consistent will make the code easier to
maintain.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 pack-refs.c             | 20 ++++++++++----------
 refs.c                  |  2 +-
 refs.h                  | 18 +++++++++---------
 refs/debug.c            |  2 +-
 refs/files-backend.c    | 10 +++++-----
 refs/packed-backend.c   |  2 +-
 refs/refs-internal.h    |  2 +-
 refs/reftable-backend.c |  4 ++--
 8 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/pack-refs.c b/pack-refs.c
index 1a5e07d8b888ab..eb6b2ba2c2c2b5 100644
--- a/pack-refs.c
+++ b/pack-refs.c
@@ -14,10 +14,10 @@ int pack_refs_core(int argc,
 {
 	struct ref_exclusions excludes = REF_EXCLUSIONS_INIT;
 	struct string_list included_refs = STRING_LIST_INIT_NODUP;
-	struct pack_refs_opts pack_refs_opts = {
+	struct refs_optimize_opts optimize_opts = {
 		.exclusions = &excludes,
 		.includes = &included_refs,
-		.flags = PACK_REFS_PRUNE,
+		.flags = REFS_OPTIMIZE_PRUNE,
 	};
 	struct string_list option_excluded_refs = STRING_LIST_INIT_NODUP;
 	struct string_list_item *item;
@@ -26,9 +26,9 @@ int pack_refs_core(int argc,
 
 	struct option opts[] = {
 		OPT_BOOL(0, "all",   &pack_all, N_("pack everything")),
-		OPT_BIT(0, "prune", &pack_refs_opts.flags, N_("prune loose refs (default)"), PACK_REFS_PRUNE),
-		OPT_BIT(0, "auto", &pack_refs_opts.flags, N_("auto-pack refs as needed"), PACK_REFS_AUTO),
-		OPT_STRING_LIST(0, "include", pack_refs_opts.includes, N_("pattern"),
+		OPT_BIT(0, "prune", &optimize_opts.flags, N_("prune loose refs (default)"), REFS_OPTIMIZE_PRUNE),
+		OPT_BIT(0, "auto", &optimize_opts.flags, N_("auto-pack refs as needed"), REFS_OPTIMIZE_AUTO),
+		OPT_STRING_LIST(0, "include", optimize_opts.includes, N_("pattern"),
 			N_("references to include")),
 		OPT_STRING_LIST(0, "exclude", &option_excluded_refs, N_("pattern"),
 			N_("references to exclude")),
@@ -39,15 +39,15 @@ int pack_refs_core(int argc,
 		usage_with_options(usage_opts, opts);
 
 	for_each_string_list_item(item, &option_excluded_refs)
-		add_ref_exclusion(pack_refs_opts.exclusions, item->string);
+		add_ref_exclusion(optimize_opts.exclusions, item->string);
 
 	if (pack_all)
-		string_list_append(pack_refs_opts.includes, "*");
+		string_list_append(optimize_opts.includes, "*");
 
-	if (!pack_refs_opts.includes->nr)
-		string_list_append(pack_refs_opts.includes, "refs/tags/*");
+	if (!optimize_opts.includes->nr)
+		string_list_append(optimize_opts.includes, "refs/tags/*");
 
-	ret = refs_optimize(get_main_ref_store(repo), &pack_refs_opts);
+	ret = refs_optimize(get_main_ref_store(repo), &optimize_opts);
 
 	clear_ref_exclusions(&excludes);
 	string_list_clear(&included_refs, 0);
diff --git a/refs.c b/refs.c
index b9a4a606462d71..0d0831f29ba25b 100644
--- a/refs.c
+++ b/refs.c
@@ -2313,7 +2313,7 @@ void base_ref_store_init(struct ref_store *refs, struct repository *repo,
 	refs->gitdir = xstrdup(path);
 }
 
-int refs_optimize(struct ref_store *refs, struct pack_refs_opts *opts)
+int refs_optimize(struct ref_store *refs, struct refs_optimize_opts *opts)
 {
 	return refs->be->optimize(refs, opts);
 }
diff --git a/refs.h b/refs.h
index 8ff591ea95c7d9..6b05bba527ffca 100644
--- a/refs.h
+++ b/refs.h
@@ -499,16 +499,16 @@ void refs_warn_dangling_symrefs(struct ref_store *refs, FILE *fp,
 				const struct string_list *refnames);
 
 /*
- * Flags for controlling behaviour of pack_refs()
- * PACK_REFS_PRUNE: Prune loose refs after packing
- * PACK_REFS_AUTO: Pack refs on a best effort basis. The heuristics and end
- *                 result are decided by the ref backend. Backends may ignore
- *                 this flag and fall back to a normal repack.
+ * Flags for controlling behaviour of refs_optimize()
+ * REFS_OPTIMIZE_PRUNE: Prune loose refs after packing
+ * REFS_OPTIMIZE_AUTO: Pack refs on a best effort basis. The heuristics and end
+ *                     result are decided by the ref backend. Backends may ignore
+ *                     this flag and fall back to a normal repack.
  */
-#define PACK_REFS_PRUNE (1 << 0)
-#define PACK_REFS_AUTO  (1 << 1)
+#define REFS_OPTIMIZE_PRUNE (1 << 0)
+#define REFS_OPTIMIZE_AUTO  (1 << 1)
 
-struct pack_refs_opts {
+struct refs_optimize_opts {
 	unsigned int flags;
 	struct ref_exclusions *exclusions;
 	struct string_list *includes;
@@ -518,7 +518,7 @@ struct pack_refs_opts {
  * Optimize the ref store. The exact behavior is up to the backend.
  * For the files backend, this is equivalent to packing refs.
  */
-int refs_optimize(struct ref_store *refs, struct pack_refs_opts *opts);
+int refs_optimize(struct ref_store *refs, struct refs_optimize_opts *opts);
 
 /*
  * Setup reflog before using. Fill in err and return -1 on failure.
diff --git a/refs/debug.c b/refs/debug.c
index 40cd1d9c1540c6..2defd2d465712e 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -116,7 +116,7 @@ static int debug_transaction_abort(struct ref_store *refs,
 	return res;
 }
 
-static int debug_optimize(struct ref_store *ref_store, struct pack_refs_opts *opts)
+static int debug_optimize(struct ref_store *ref_store, struct refs_optimize_opts *opts)
 {
 	struct debug_ref_store *drefs = (struct debug_ref_store *)ref_store;
 	int res = drefs->refs->be->optimize(drefs->refs, opts);
diff --git a/refs/files-backend.c b/refs/files-backend.c
index d13b87e056cf5e..23bb641f2c84ee 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1355,7 +1355,7 @@ static void prune_refs(struct files_ref_store *refs, struct ref_to_prune **refs_
  */
 static int should_pack_ref(struct files_ref_store *refs,
 			   const struct reference *ref,
-			   struct pack_refs_opts *opts)
+			   struct refs_optimize_opts *opts)
 {
 	struct string_list_item *item;
 
@@ -1383,7 +1383,7 @@ static int should_pack_ref(struct files_ref_store *refs,
 }
 
 static int should_pack_refs(struct files_ref_store *refs,
-			    struct pack_refs_opts *opts)
+			    struct refs_optimize_opts *opts)
 {
 	struct ref_iterator *iter;
 	size_t packed_size;
@@ -1391,7 +1391,7 @@ static int should_pack_refs(struct files_ref_store *refs,
 	size_t limit;
 	int ret;
 
-	if (!(opts->flags & PACK_REFS_AUTO))
+	if (!(opts->flags & REFS_OPTIMIZE_AUTO))
 		return 1;
 
 	ret = packed_refs_size(refs->packed_ref_store, &packed_size);
@@ -1445,7 +1445,7 @@ static int should_pack_refs(struct files_ref_store *refs,
 }
 
 static int files_optimize(struct ref_store *ref_store,
-			  struct pack_refs_opts *opts)
+			  struct refs_optimize_opts *opts)
 {
 	struct files_ref_store *refs =
 		files_downcast(ref_store, REF_STORE_WRITE | REF_STORE_ODB,
@@ -1488,7 +1488,7 @@ static int files_optimize(struct ref_store *ref_store,
 			    iter->ref.name, err.buf);
 
 		/* Schedule the loose reference for pruning if requested. */
-		if ((opts->flags & PACK_REFS_PRUNE)) {
+		if ((opts->flags & REFS_OPTIMIZE_PRUNE)) {
 			struct ref_to_prune *n;
 			FLEX_ALLOC_STR(n, name, iter->ref.name);
 			oidcpy(&n->oid, iter->ref.oid);
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 20cf9fab18e2e9..10062fd8b63063 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1774,7 +1774,7 @@ static int packed_transaction_finish(struct ref_store *ref_store,
 }
 
 static int packed_optimize(struct ref_store *ref_store UNUSED,
-			   struct pack_refs_opts *pack_opts UNUSED)
+			   struct refs_optimize_opts *opts UNUSED)
 {
 	/*
 	 * Packed refs are already packed. It might be that loose refs
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index fc5149df5b3c5c..dee42f231dbd0a 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -423,7 +423,7 @@ typedef int ref_transaction_commit_fn(struct ref_store *refs,
 				      struct strbuf *err);
 
 typedef int optimize_fn(struct ref_store *ref_store,
-			struct pack_refs_opts *opts);
+			struct refs_optimize_opts *opts);
 typedef int rename_ref_fn(struct ref_store *ref_store,
 			  const char *oldref, const char *newref,
 			  const char *logmsg);
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 43cc66a48e9143..c23c45f3bf4639 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1701,7 +1701,7 @@ static int reftable_be_transaction_finish(struct ref_store *ref_store UNUSED,
 }
 
 static int reftable_be_optimize(struct ref_store *ref_store,
-				struct pack_refs_opts *opts)
+				struct refs_optimize_opts *opts)
 {
 	struct reftable_ref_store *refs =
 		reftable_be_downcast(ref_store, REF_STORE_WRITE | REF_STORE_ODB, "optimize_refs");
@@ -1715,7 +1715,7 @@ static int reftable_be_optimize(struct ref_store *ref_store,
 	if (!stack)
 		stack = refs->main_backend.stack;
 
-	if (opts->flags & PACK_REFS_AUTO)
+	if (opts->flags & REFS_OPTIMIZE_AUTO)
 		ret = reftable_stack_auto_compact(stack);
 	else
 		ret = reftable_stack_compact_all(stack, NULL);

From c113f4ca4dbba5529af645f7d7837c7dae12b403 Mon Sep 17 00:00:00 2001
From: Karthik Nayak <karthik.188@gmail.com>
Date: Mon, 20 Oct 2025 10:18:31 +0200
Subject: [PATCH 042/553] t/pack-refs-tests: move the 'test_done' to callees

In ac0bad0af4 (t0601: refactor tests to be shareable, 2025-09-19), we
refactored 't/t0601-reffiles-pack-refs.sh' to move all of the tests to
't/pack-refs-tests.sh', which became a common test suite that is also
used by 't/t1463-refs-optimize.sh'.

This also moved the 'test_done' directive to 't/pack-refs-tests.sh',
which prevents additional tests from being added to either of the test
scripts. Let's move the directive out to both scripts, so that we can
add additional specific tests to them. Besides, test flow logic
shouldn't be part of a test file that can be embedded in other test
scripts.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/pack-refs-tests.sh          | 2 --
 t/t0601-reffiles-pack-refs.sh | 2 ++
 t/t1463-refs-optimize.sh      | 2 ++
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/t/pack-refs-tests.sh b/t/pack-refs-tests.sh
index 095823d915fd63..81086c369089b9 100644
--- a/t/pack-refs-tests.sh
+++ b/t/pack-refs-tests.sh
@@ -459,5 +459,3 @@ test_expect_success 'pack-refs does not store invalid peeled tag value' '
 		test_grep ! "^\^" .git/packed-refs
 	)
 '
-
-test_done
diff --git a/t/t0601-reffiles-pack-refs.sh b/t/t0601-reffiles-pack-refs.sh
index 12cf5d1dcba814..3c706978efc219 100755
--- a/t/t0601-reffiles-pack-refs.sh
+++ b/t/t0601-reffiles-pack-refs.sh
@@ -18,3 +18,5 @@ export GIT_TEST_DEFAULT_REF_FORMAT
 . ./test-lib.sh
 
 . "$TEST_DIRECTORY"/pack-refs-tests.sh
+
+test_done
diff --git a/t/t1463-refs-optimize.sh b/t/t1463-refs-optimize.sh
index c11c905d795d26..9afe3c1ed7e33a 100755
--- a/t/t1463-refs-optimize.sh
+++ b/t/t1463-refs-optimize.sh
@@ -15,3 +15,5 @@ export GIT_TEST_DEFAULT_REF_FORMAT
 
 pack_refs='refs optimize'
 . "$TEST_DIRECTORY"/pack-refs-tests.sh
+
+test_done

From e031fa100603af74def6bf2a646c731e4fcd12fc Mon Sep 17 00:00:00 2001
From: Siddharth Asthana <siddharthasthana31@gmail.com>
Date: Thu, 6 Nov 2025 00:45:59 +0530
Subject: [PATCH 043/553] replay: use die_for_incompatible_opt2() for option
 validation

In preparation for adding the --ref-action option, convert option
validation to use die_for_incompatible_opt2(). This helper provides
standardized error messages for mutually exclusive options.

The following commit introduces --ref-action which will be incompatible
with certain other options. Using die_for_incompatible_opt2() now means
that commit can cleanly add its validation using the same pattern,
keeping the validation logic consistent and maintainable.

This also aligns git-replay's option handling with how other Git commands
manage option conflicts, using the established die_for_incompatible_opt*()
helper family.
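
The check performed for a pair of options can be sketched as below. This
is a hypothetical stand-in, not the helper itself: the function name and
the `err` buffer parameters are assumptions made for illustration, and
unlike the real helper, which dies, the sketch reports the conflict
through its return value so that it can be exercised in isolation.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical stand-in: return -1 and fill `err` when both options
 * are set, return 0 and clear `err` otherwise.
 */
static int check_incompatible_opt2_sketch(int opt1, const char *opt1_name,
					  int opt2, const char *opt2_name,
					  char *err, size_t errlen)
{
	assert(opt1_name != NULL && opt2_name != NULL);
	assert(err != NULL && errlen > 0);

	if (opt1 && opt2) {
		snprintf(err, errlen,
			 "options '%s' and '%s' cannot be used together",
			 opt1_name, opt2_name);
		return -1;
	}
	err[0] = '\0';
	return 0;
}
```

Passing `!!advance_name_opt` for a pointer-valued option, as the diff
below does, normalizes it to 0 or 1 before the check.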

Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/replay.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/replay.c b/builtin/replay.c
index 6172c8aacc9873..b64fc72063ee8e 100644
--- a/builtin/replay.c
+++ b/builtin/replay.c
@@ -330,9 +330,9 @@ int cmd_replay(int argc,
 		usage_with_options(replay_usage, replay_options);
 	}
 
-	if (advance_name_opt && contained)
-		die(_("options '%s' and '%s' cannot be used together"),
-		    "--advance", "--contained");
+	die_for_incompatible_opt2(!!advance_name_opt, "--advance",
+				  contained, "--contained");
+
 	advance_name = xstrdup_or_null(advance_name_opt);
 
 	repo_init_revisions(repo, &revs, prefix);

From 15cd4ef1f495e51f7db39583b7f562e7170da3d2 Mon Sep 17 00:00:00 2001
From: Siddharth Asthana <siddharthasthana31@gmail.com>
Date: Thu, 6 Nov 2025 00:46:00 +0530
Subject: [PATCH 044/553] replay: make atomic ref updates the default behavior

The git replay command currently outputs update commands that can be
piped to update-ref to achieve a rebase, e.g.

  git replay --onto main topic1..topic2 | git update-ref --stdin

This separation had advantages for three special cases:
  * it made testing easy (when state isn't modified from one step to
    the next, you don't need to make temporary branches or have undo
    commands, or try to track the changes)
  * it provided a natural can-it-rebase-cleanly (and what would it
    rebase to) capability without automatically updating refs, similar
    to a --dry-run
  * it provided a natural low-level tool for the suite of hash-object,
    mktree, commit-tree, mktag, merge-tree, and update-ref, allowing
    users to have another building block for experimentation and making
    new tools

However, it should be noted that all three of these are somewhat
special cases; users, whether on the client or server side, would
almost certainly find it more ergonomic to simply have the updating
of refs be the default.

For server-side operations in particular, the pipeline architecture
creates process coordination overhead. Server implementations that need
to perform rebases atomically must maintain additional code to:

  1. Spawn and manage a pipeline between git-replay and git-update-ref
  2. Coordinate stdout/stderr streams across the pipe boundary
  3. Handle partial failure states if the pipeline breaks mid-execution
  4. Parse and validate the update-ref command output

Change the default behavior to update refs directly, and atomically (at
least to the extent supported by the refs backend in use). This
eliminates the process coordination overhead for the common case.

For users needing the traditional pipeline workflow, add a new
--ref-action=<mode> option that preserves the original behavior:

  git replay --ref-action=print --onto main topic1..topic2 | git update-ref --stdin

The mode can be:
  * update (default): Update refs directly using an atomic transaction
  * print: Output update-ref commands for pipeline use
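
Mapping the mode string to an action could look roughly like the sketch
below. The enum and function names are hypothetical and not taken from
builtin/replay.c. Treating a missing value as `update` mirrors the
default described above, while the handling of unknown values is an
assumption.

```c
#include <string.h>

/* Hypothetical names; the real implementation may differ. */
enum ref_action_sketch {
	REF_ACTION_SKETCH_UPDATE,  /* update refs in one transaction */
	REF_ACTION_SKETCH_PRINT,   /* print update-ref commands */
	REF_ACTION_SKETCH_INVALID, /* unrecognized mode string */
};

static enum ref_action_sketch parse_ref_action_sketch(const char *mode)
{
	/* No mode given: fall back to the default, "update". */
	if (!mode || !strcmp(mode, "update"))
		return REF_ACTION_SKETCH_UPDATE;
	if (!strcmp(mode, "print"))
		return REF_ACTION_SKETCH_PRINT;
	return REF_ACTION_SKETCH_INVALID;
}
```

A caller would be expected to reject REF_ACTION_SKETCH_INVALID with an
error message rather than silently picking one of the two modes.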

Test suite changes:

All existing tests that expected command output now use
--ref-action=print to preserve their original behavior. This keeps
the tests valid while allowing them to verify that the pipeline workflow
still works correctly.

New tests were added to verify:
  - Default atomic behavior (no output, refs updated directly)
  - Bare repository support (server-side use case)
  - Equivalence between traditional pipeline and atomic updates
  - Real atomicity using a lock file to verify all-or-nothing guarantee
  - Test isolation using test_when_finished to clean up state
  - Reflog messages include replay mode and target

A following commit will add a replay.refAction configuration
option for users who prefer the traditional pipeline output as their
default behavior.

Helped-by: Elijah Newren <newren@gmail.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Christian Couder <christian.couder@gmail.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-replay.adoc |  61 ++++++++++++-------
 builtin/replay.c              | 111 +++++++++++++++++++++++++++++++---
 t/t3650-replay-basics.sh      |  67 +++++++++++++++++---
 3 files changed, 199 insertions(+), 40 deletions(-)

diff --git a/Documentation/git-replay.adoc b/Documentation/git-replay.adoc
index 0b12bf8aa4df42..2ef74ddb127b0c 100644
--- a/Documentation/git-replay.adoc
+++ b/Documentation/git-replay.adoc
@@ -9,15 +9,16 @@ git-replay - EXPERIMENTAL: Replay commits on a new base, works with bare repos t
 SYNOPSIS
 --------
 [verse]
-(EXPERIMENTAL!) 'git replay' ([--contained] --onto <newbase> | --advance <branch>) <revision-range>...
+(EXPERIMENTAL!) 'git replay' ([--contained] --onto <newbase> | --advance <branch>) [--ref-action[=<mode>]] <revision-range>...
 
 DESCRIPTION
 -----------
 
 Takes ranges of commits and replays them onto a new location. Leaves
-the working tree and the index untouched, and updates no references.
-The output of this command is meant to be used as input to
-`git update-ref --stdin`, which would update the relevant branches
+the working tree and the index untouched. By default, updates the
+relevant references using an atomic transaction (all refs update or
+none). Use `--ref-action=print` to avoid automatic ref updates and
+instead get update commands that can be piped to `git update-ref --stdin`
 (see the OUTPUT section below).
 
 THIS COMMAND IS EXPERIMENTAL. THE BEHAVIOR MAY CHANGE.
@@ -29,18 +30,27 @@ OPTIONS
 	Starting point at which to create the new commits.  May be any
 	valid commit, and not just an existing branch name.
 +
-When `--onto` is specified, the update-ref command(s) in the output will
-update the branch(es) in the revision range to point at the new
-commits, similar to the way how `git rebase --update-refs` updates
-multiple branches in the affected range.
+When `--onto` is specified, the branch(es) in the revision range will be
+updated to point at the new commits, similar to the way `git rebase --update-refs`
+updates multiple branches in the affected range.
 
 --advance <branch>::
 	Starting point at which to create the new commits; must be a
 	branch name.
 +
-When `--advance` is specified, the update-ref command(s) in the output
-will update the branch passed as an argument to `--advance` to point at
-the new commits (in other words, this mimics a cherry-pick operation).
+The history is replayed on top of the <branch> and <branch> is updated to
+point at the tip of the resulting history. This is different from `--onto`,
+which uses the target only as a starting point without updating it.
+
+--ref-action[=<mode>]::
+	Control how references are updated. The mode can be:
++
+--
+	* `update` (default): Update refs directly using an atomic transaction.
+	  All refs are updated or none are (all-or-nothing behavior).
+	* `print`: Output update-ref commands for pipeline use. This is the
+	  traditional behavior where output can be piped to `git update-ref --stdin`.
+--
 
 <revision-range>::
 	Range of commits to replay. More than one <revision-range> can
@@ -54,8 +64,11 @@ include::rev-list-options.adoc[]
 OUTPUT
 ------
 
-When there are no conflicts, the output of this command is usable as
-input to `git update-ref --stdin`.  It is of the form:
+By default, or with `--ref-action=update`, this command produces no output on
+success, as refs are updated directly using an atomic transaction.
+
+When using `--ref-action=print`, the output is usable as input to
+`git update-ref --stdin`. It is of the form:
 
 	update refs/heads/branch1 ${NEW_branch1_HASH} ${OLD_branch1_HASH}
 	update refs/heads/branch2 ${NEW_branch2_HASH} ${OLD_branch2_HASH}
@@ -81,6 +94,14 @@ To simply rebase `mybranch` onto `target`:
 
 ------------
 $ git replay --onto target origin/main..mybranch
+------------
+
+The refs are updated atomically and no output is produced on success.
+
+To see what would be updated without actually updating:
+
+------------
+$ git replay --ref-action=print --onto target origin/main..mybranch
 update refs/heads/mybranch ${NEW_mybranch_HASH} ${OLD_mybranch_HASH}
 ------------
 
@@ -88,33 +109,29 @@ To cherry-pick the commits from mybranch onto target:
 
 ------------
 $ git replay --advance target origin/main..mybranch
-update refs/heads/target ${NEW_target_HASH} ${OLD_target_HASH}
 ------------
 
 Note that the first two examples replay the exact same commits and on
 top of the exact same new base, they only differ in that the first
-provides instructions to make mybranch point at the new commits and
-the second provides instructions to make target point at them.
+updates mybranch to point at the new commits and the second updates
+target to point at them.
 
 What if you have a stack of branches, one depending upon another, and
 you'd really like to rebase the whole set?
 
 ------------
 $ git replay --contained --onto origin/main origin/main..tipbranch
-update refs/heads/branch1 ${NEW_branch1_HASH} ${OLD_branch1_HASH}
-update refs/heads/branch2 ${NEW_branch2_HASH} ${OLD_branch2_HASH}
-update refs/heads/tipbranch ${NEW_tipbranch_HASH} ${OLD_tipbranch_HASH}
 ------------
 
+All three branches (`branch1`, `branch2`, and `tipbranch`) are updated
+atomically.
+
 When calling `git replay`, one does not need to specify a range of
 commits to replay using the syntax `A..B`; any range expression will
 do:
 
 ------------
 $ git replay --onto origin/main ^base branch1 branch2 branch3
-update refs/heads/branch1 ${NEW_branch1_HASH} ${OLD_branch1_HASH}
-update refs/heads/branch2 ${NEW_branch2_HASH} ${OLD_branch2_HASH}
-update refs/heads/branch3 ${NEW_branch3_HASH} ${OLD_branch3_HASH}
 ------------
 
 This will simultaneously rebase `branch1`, `branch2`, and `branch3`,
diff --git a/builtin/replay.c b/builtin/replay.c
index b64fc72063ee8e..94e60b5b107516 100644
--- a/builtin/replay.c
+++ b/builtin/replay.c
@@ -20,6 +20,11 @@
 #include <oidset.h>
 #include <tree.h>
 
+enum ref_action_mode {
+	REF_ACTION_UPDATE,
+	REF_ACTION_PRINT,
+};
+
 static const char *short_commit_name(struct repository *repo,
 				     struct commit *commit)
 {
@@ -284,6 +289,38 @@ static struct commit *pick_regular_commit(struct repository *repo,
 	return create_commit(repo, result->tree, pickme, replayed_base);
 }
 
+static enum ref_action_mode parse_ref_action_mode(const char *ref_action, const char *source)
+{
+	if (!ref_action || !strcmp(ref_action, "update"))
+		return REF_ACTION_UPDATE;
+	if (!strcmp(ref_action, "print"))
+		return REF_ACTION_PRINT;
+	die(_("invalid %s value: '%s'"), source, ref_action);
+}
+
+static int handle_ref_update(enum ref_action_mode mode,
+			     struct ref_transaction *transaction,
+			     const char *refname,
+			     const struct object_id *new_oid,
+			     const struct object_id *old_oid,
+			     const char *reflog_msg,
+			     struct strbuf *err)
+{
+	switch (mode) {
+	case REF_ACTION_PRINT:
+		printf("update %s %s %s\n",
+		       refname,
+		       oid_to_hex(new_oid),
+		       oid_to_hex(old_oid));
+		return 0;
+	case REF_ACTION_UPDATE:
+		return ref_transaction_update(transaction, refname, new_oid, old_oid,
+					      NULL, NULL, 0, reflog_msg, err);
+	default:
+		BUG("unknown ref_action_mode %d", mode);
+	}
+}
+
 int cmd_replay(int argc,
 	       const char **argv,
 	       const char *prefix,
@@ -294,6 +331,8 @@ int cmd_replay(int argc,
 	struct commit *onto = NULL;
 	const char *onto_name = NULL;
 	int contained = 0;
+	const char *ref_action = NULL;
+	enum ref_action_mode ref_mode = REF_ACTION_UPDATE;
 
 	struct rev_info revs;
 	struct commit *last_commit = NULL;
@@ -302,12 +341,15 @@ int cmd_replay(int argc,
 	struct merge_result result;
 	struct strset *update_refs = NULL;
 	kh_oid_map_t *replayed_commits;
+	struct ref_transaction *transaction = NULL;
+	struct strbuf transaction_err = STRBUF_INIT;
+	struct strbuf reflog_msg = STRBUF_INIT;
 	int ret = 0;
 
-	const char * const replay_usage[] = {
+	const char *const replay_usage[] = {
 		N_("(EXPERIMENTAL!) git replay "
 		   "([--contained] --onto <newbase> | --advance <branch>) "
-		   "<revision-range>..."),
+		   "[--ref-action[=<mode>]] <revision-range>..."),
 		NULL
 	};
 	struct option replay_options[] = {
@@ -319,6 +361,9 @@ int cmd_replay(int argc,
 			   N_("replay onto given commit")),
 		OPT_BOOL(0, "contained", &contained,
 			 N_("advance all branches contained in revision-range")),
+		OPT_STRING(0, "ref-action", &ref_action,
+			   N_("mode"),
+			   N_("control ref update behavior (update|print)")),
 		OPT_END()
 	};
 
@@ -333,6 +378,10 @@ int cmd_replay(int argc,
 	die_for_incompatible_opt2(!!advance_name_opt, "--advance",
 				  contained, "--contained");
 
+	/* Parse ref action mode */
+	if (ref_action)
+		ref_mode = parse_ref_action_mode(ref_action, "--ref-action");
+
 	advance_name = xstrdup_or_null(advance_name_opt);
 
 	repo_init_revisions(repo, &revs, prefix);
@@ -389,6 +438,24 @@ int cmd_replay(int argc,
 	determine_replay_mode(repo, &revs.cmdline, onto_name, &advance_name,
 			      &onto, &update_refs);
 
+	/* Build reflog message */
+	if (advance_name_opt)
+		strbuf_addf(&reflog_msg, "replay --advance %s", advance_name_opt);
+	else
+		strbuf_addf(&reflog_msg, "replay --onto %s",
+			    oid_to_hex(&onto->object.oid));
+
+	/* Initialize ref transaction if using update mode */
+	if (ref_mode == REF_ACTION_UPDATE) {
+		transaction = ref_store_transaction_begin(get_main_ref_store(repo),
+							  0, &transaction_err);
+		if (!transaction) {
+			ret = error(_("failed to begin ref transaction: %s"),
+				    transaction_err.buf);
+			goto cleanup;
+		}
+	}
+
 	if (!onto) /* FIXME: Should handle replaying down to root commit */
 		die("Replaying down to root commit is not supported yet!");
 
@@ -434,10 +501,16 @@ int cmd_replay(int argc,
 			if (decoration->type == DECORATION_REF_LOCAL &&
 			    (contained || strset_contains(update_refs,
 							  decoration->name))) {
-				printf("update %s %s %s\n",
-				       decoration->name,
-				       oid_to_hex(&last_commit->object.oid),
-				       oid_to_hex(&commit->object.oid));
+				if (handle_ref_update(ref_mode, transaction,
+						      decoration->name,
+						      &last_commit->object.oid,
+						      &commit->object.oid,
+						      reflog_msg.buf,
+						      &transaction_err) < 0) {
+					ret = error(_("failed to update ref '%s': %s"),
+						    decoration->name, transaction_err.buf);
+					goto cleanup;
+				}
 			}
 			decoration = decoration->next;
 		}
@@ -445,10 +518,24 @@ int cmd_replay(int argc,
 
 	/* In --advance mode, advance the target ref */
 	if (result.clean == 1 && advance_name) {
-		printf("update %s %s %s\n",
-		       advance_name,
-		       oid_to_hex(&last_commit->object.oid),
-		       oid_to_hex(&onto->object.oid));
+		if (handle_ref_update(ref_mode, transaction, advance_name,
+				      &last_commit->object.oid,
+				      &onto->object.oid,
+				      reflog_msg.buf,
+				      &transaction_err) < 0) {
+			ret = error(_("failed to update ref '%s': %s"),
+				    advance_name, transaction_err.buf);
+			goto cleanup;
+		}
+	}
+
+	/* Commit the ref transaction if we have one */
+	if (transaction && result.clean == 1) {
+		if (ref_transaction_commit(transaction, &transaction_err)) {
+			ret = error(_("failed to commit ref transaction: %s"),
+				    transaction_err.buf);
+			goto cleanup;
+		}
 	}
 
 	merge_finalize(&merge_opt, &result);
@@ -460,6 +547,10 @@ int cmd_replay(int argc,
 	ret = result.clean;
 
 cleanup:
+	if (transaction)
+		ref_transaction_free(transaction);
+	strbuf_release(&transaction_err);
+	strbuf_release(&reflog_msg);
 	release_revisions(&revs);
 	free(advance_name);
 
diff --git a/t/t3650-replay-basics.sh b/t/t3650-replay-basics.sh
index 58b37599357827..ec79234c8044c7 100755
--- a/t/t3650-replay-basics.sh
+++ b/t/t3650-replay-basics.sh
@@ -52,7 +52,7 @@ test_expect_success 'setup bare' '
 '
 
 test_expect_success 'using replay to rebase two branches, one on top of other' '
-	git replay --onto main topic1..topic2 >result &&
+	git replay --ref-action=print --onto main topic1..topic2 >result &&
 
 	test_line_count = 1 result &&
 
@@ -68,7 +68,7 @@ test_expect_success 'using replay to rebase two branches, one on top of other' '
 '
 
 test_expect_success 'using replay on bare repo to rebase two branches, one on top of other' '
-	git -C bare replay --onto main topic1..topic2 >result-bare &&
+	git -C bare replay --ref-action=print --onto main topic1..topic2 >result-bare &&
 	test_cmp expect result-bare
 '
 
@@ -86,7 +86,7 @@ test_expect_success 'using replay to perform basic cherry-pick' '
 	# 2nd field of result is refs/heads/main vs. refs/heads/topic2
 	# 4th field of result is hash for main instead of hash for topic2
 
-	git replay --advance main topic1..topic2 >result &&
+	git replay --ref-action=print --advance main topic1..topic2 >result &&
 
 	test_line_count = 1 result &&
 
@@ -102,7 +102,7 @@ test_expect_success 'using replay to perform basic cherry-pick' '
 '
 
 test_expect_success 'using replay on bare repo to perform basic cherry-pick' '
-	git -C bare replay --advance main topic1..topic2 >result-bare &&
+	git -C bare replay --ref-action=print --advance main topic1..topic2 >result-bare &&
 	test_cmp expect result-bare
 '
 
@@ -115,7 +115,7 @@ test_expect_success 'replay fails when both --advance and --onto are omitted' '
 '
 
 test_expect_success 'using replay to also rebase a contained branch' '
-	git replay --contained --onto main main..topic3 >result &&
+	git replay --ref-action=print --contained --onto main main..topic3 >result &&
 
 	test_line_count = 2 result &&
 	cut -f 3 -d " " result >new-branch-tips &&
@@ -139,12 +139,12 @@ test_expect_success 'using replay to also rebase a contained branch' '
 '
 
 test_expect_success 'using replay on bare repo to also rebase a contained branch' '
-	git -C bare replay --contained --onto main main..topic3 >result-bare &&
+	git -C bare replay --ref-action=print --contained --onto main main..topic3 >result-bare &&
 	test_cmp expect result-bare
 '
 
 test_expect_success 'using replay to rebase multiple divergent branches' '
-	git replay --onto main ^topic1 topic2 topic4 >result &&
+	git replay --ref-action=print --onto main ^topic1 topic2 topic4 >result &&
 
 	test_line_count = 2 result &&
 	cut -f 3 -d " " result >new-branch-tips &&
@@ -168,7 +168,7 @@ test_expect_success 'using replay to rebase multiple divergent branches' '
 '
 
 test_expect_success 'using replay on bare repo to rebase multiple divergent branches, including contained ones' '
-	git -C bare replay --contained --onto main ^main topic2 topic3 topic4 >result &&
+	git -C bare replay --ref-action=print --contained --onto main ^main topic2 topic3 topic4 >result &&
 
 	test_line_count = 4 result &&
 	cut -f 3 -d " " result >new-branch-tips &&
@@ -217,4 +217,55 @@ test_expect_success 'merge.directoryRenames=false' '
 		--onto rename-onto rename-onto..rename-from
 '
 
+test_expect_success 'default atomic behavior updates refs directly' '
+	# Use a separate branch to avoid contaminating topic2 for later tests
+	git branch test-atomic topic2 &&
+	test_when_finished "git branch -D test-atomic" &&
+
+	# Test default atomic behavior (no output, refs updated)
+	git replay --onto main topic1..test-atomic >output &&
+	test_must_be_empty output &&
+
+	# Verify ref was updated
+	git log --format=%s test-atomic >actual &&
+	test_write_lines E D M L B A >expect &&
+	test_cmp expect actual &&
+
+	# Verify reflog message includes SHA of onto commit
+	git reflog test-atomic -1 --format=%gs >reflog-msg &&
+	ONTO_SHA=$(git rev-parse main) &&
+	echo "replay --onto $ONTO_SHA" >expect-reflog &&
+	test_cmp expect-reflog reflog-msg
+'
+
+test_expect_success 'atomic behavior in bare repository' '
+	# Store original state for cleanup
+	START=$(git -C bare rev-parse topic2) &&
+	test_when_finished "git -C bare update-ref refs/heads/topic2 $START" &&
+
+	# Test atomic updates work in bare repo
+	git -C bare replay --onto main topic1..topic2 >output &&
+	test_must_be_empty output &&
+
+	# Verify ref was updated in bare repo
+	git -C bare log --format=%s topic2 >actual &&
+	test_write_lines E D M L B A >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'reflog message for --advance mode' '
+	# Store original state
+	START=$(git rev-parse main) &&
+	test_when_finished "git update-ref refs/heads/main $START" &&
+
+	# Test --advance mode reflog message
+	git replay --advance main topic1..topic2 >output &&
+	test_must_be_empty output &&
+
+	# Verify reflog message includes --advance and branch name
+	git reflog main -1 --format=%gs >reflog-msg &&
+	echo "replay --advance main" >expect-reflog &&
+	test_cmp expect-reflog reflog-msg
+'
+
 test_done

From 336ac90c06ec757f613faae4ffc6c32578a99cd1 Mon Sep 17 00:00:00 2001
From: Siddharth Asthana <siddharthasthana31@gmail.com>
Date: Thu, 6 Nov 2025 00:46:01 +0530
Subject: [PATCH 045/553] replay: add replay.refAction config option

Add a configuration variable to control the default behavior of git replay
for updating references. This allows users who prefer the traditional
pipeline output to set it once in their config instead of passing
--ref-action=print with every command.

The config variable uses string values that mirror the behavior modes:
  * replay.refAction = update (default): atomic ref updates
  * replay.refAction = print: output commands for pipeline
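
The lookup order (command line first, then configuration, then the
built-in default) can be sketched as a small shell function. This is an
illustration only: `resolve_ref_action` is a hypothetical helper, and
unlike the C code it treats an empty argument as "not given":

```shell
# Hypothetical helper, not part of git: resolve the effective mode.
#   $1: value of --ref-action (empty if not given; assumption of this sketch)
#   $2: value of replay.refAction (empty if not configured)
resolve_ref_action () {
	if [ -n "$1" ]; then
		# command line takes precedence
		value=$1
		origin=--ref-action
	elif [ -n "$2" ]; then
		value=$2
		origin=replay.refAction
	else
		# neither given: fall back to the default mode
		echo update
		return 0
	fi
	case $value in
	update|print)
		echo "$value"
		;;
	*)
		echo "invalid $origin value: '$value'" >&2
		return 1
		;;
	esac
}
```

For example, `resolve_ref_action print update` yields `print`, which
mirrors the rule that the command line overrides the configuration.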

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Elijah Newren <newren@gmail.com>
Helped-by: Christian Couder <christian.couder@gmail.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config/replay.adoc | 11 ++++++++
 Documentation/git-replay.adoc    |  2 ++
 builtin/replay.c                 | 24 ++++++++++++++---
 t/t3650-replay-basics.sh         | 46 ++++++++++++++++++++++++++++++++
 4 files changed, 79 insertions(+), 4 deletions(-)
 create mode 100644 Documentation/config/replay.adoc

diff --git a/Documentation/config/replay.adoc b/Documentation/config/replay.adoc
new file mode 100644
index 00000000000000..7d549d2f0e5195
--- /dev/null
+++ b/Documentation/config/replay.adoc
@@ -0,0 +1,11 @@
+replay.refAction::
+	Specifies the default mode for handling reference updates in
+	`git replay`. The value can be:
++
+--
+	* `update`: Update refs directly using an atomic transaction (default behavior).
+	* `print`: Output update-ref commands for pipeline use.
+--
++
+This setting can be overridden with the `--ref-action` command-line option.
+When not configured, `git replay` defaults to `update` mode.
diff --git a/Documentation/git-replay.adoc b/Documentation/git-replay.adoc
index 2ef74ddb127b0c..dcb26e8a8e88ca 100644
--- a/Documentation/git-replay.adoc
+++ b/Documentation/git-replay.adoc
@@ -51,6 +51,8 @@ which uses the target only as a starting point without updating it.
 	* `print`: Output update-ref commands for pipeline use. This is the
 	  traditional behavior where output can be piped to `git update-ref --stdin`.
 --
++
+The default mode can be configured via the `replay.refAction` configuration variable.
 
 <revision-range>::
 	Range of commits to replay. More than one <revision-range> can
diff --git a/builtin/replay.c b/builtin/replay.c
index 94e60b5b107516..6606a2c94bc671 100644
--- a/builtin/replay.c
+++ b/builtin/replay.c
@@ -8,6 +8,7 @@
 #include "git-compat-util.h"
 
 #include "builtin.h"
+#include "config.h"
 #include "environment.h"
 #include "hex.h"
 #include "lockfile.h"
@@ -298,6 +299,22 @@ static enum ref_action_mode parse_ref_action_mode(const char *ref_action, const
 	die(_("invalid %s value: '%s'"), source, ref_action);
 }
 
+static enum ref_action_mode get_ref_action_mode(struct repository *repo, const char *ref_action)
+{
+	const char *config_value = NULL;
+
+	/* Command line option takes precedence */
+	if (ref_action)
+		return parse_ref_action_mode(ref_action, "--ref-action");
+
+	/* Check config value */
+	if (!repo_config_get_string_tmp(repo, "replay.refAction", &config_value))
+		return parse_ref_action_mode(config_value, "replay.refAction");
+
+	/* Default to update mode */
+	return REF_ACTION_UPDATE;
+}
+
 static int handle_ref_update(enum ref_action_mode mode,
 			     struct ref_transaction *transaction,
 			     const char *refname,
@@ -332,7 +349,7 @@ int cmd_replay(int argc,
 	const char *onto_name = NULL;
 	int contained = 0;
 	const char *ref_action = NULL;
-	enum ref_action_mode ref_mode = REF_ACTION_UPDATE;
+	enum ref_action_mode ref_mode;
 
 	struct rev_info revs;
 	struct commit *last_commit = NULL;
@@ -378,9 +395,8 @@ int cmd_replay(int argc,
 	die_for_incompatible_opt2(!!advance_name_opt, "--advance",
 				  contained, "--contained");
 
-	/* Parse ref action mode */
-	if (ref_action)
-		ref_mode = parse_ref_action_mode(ref_action, "--ref-action");
+	/* Parse ref action mode from command line or config */
+	ref_mode = get_ref_action_mode(repo, ref_action);
 
 	advance_name = xstrdup_or_null(advance_name_opt);
 
diff --git a/t/t3650-replay-basics.sh b/t/t3650-replay-basics.sh
index ec79234c8044c7..cf3aacf3551f8e 100755
--- a/t/t3650-replay-basics.sh
+++ b/t/t3650-replay-basics.sh
@@ -268,4 +268,50 @@ test_expect_success 'reflog message for --advance mode' '
 	test_cmp expect-reflog reflog-msg
 '
 
+test_expect_success 'replay.refAction=print config option' '
+	# Store original state
+	START=$(git rev-parse topic2) &&
+	test_when_finished "git branch -f topic2 $START" &&
+
+	# Test with config set to print
+	test_config replay.refAction print &&
+	git replay --onto main topic1..topic2 >output &&
+	test_line_count = 1 output &&
+	test_grep "^update refs/heads/topic2 " output
+'
+
+test_expect_success 'replay.refAction=update config option' '
+	# Store original state
+	START=$(git rev-parse topic2) &&
+	test_when_finished "git branch -f topic2 $START" &&
+
+	# Test with config set to update
+	test_config replay.refAction update &&
+	git replay --onto main topic1..topic2 >output &&
+	test_must_be_empty output &&
+
+	# Verify ref was updated
+	git log --format=%s topic2 >actual &&
+	test_write_lines E D M L B A >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'command-line --ref-action overrides config' '
+	# Store original state
+	START=$(git rev-parse topic2) &&
+	test_when_finished "git branch -f topic2 $START" &&
+
+	# Set config to update but use --ref-action=print
+	test_config replay.refAction update &&
+	git replay --ref-action=print --onto main topic1..topic2 >output &&
+	test_line_count = 1 output &&
+	test_grep "^update refs/heads/topic2 " output
+'
+
+test_expect_success 'invalid replay.refAction value' '
+	test_config replay.refAction invalid &&
+	test_must_fail git replay --onto main topic1..topic2 2>error &&
+	test_grep "invalid.*replay.refAction.*value" error
+'
+
 test_done

From 77e7aab693daee402ef37323715207c5a2daec9f Mon Sep 17 00:00:00 2001
From: Johannes Sixt <j6t@kdbg.org>
Date: Thu, 6 Nov 2025 09:20:41 +0100
Subject: [PATCH 046/553] gitk: fix a 'continue' statement outside a loop to
 'return'

When 5de460a2cfdd (gitk: Refactor per-line part of getblobdiffline and
its support) moved the body of a loop into a separate function, several
'continue' statements were changed to 'return'. But one instance was
missed. Fix it now.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
 gitk | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gitk b/gitk
index c02db0194d5317..eb657aa1e491b4 100755
--- a/gitk
+++ b/gitk
@@ -8296,7 +8296,7 @@ proc parseblobdiffline {ids line} {
         if {![regexp {^diff (--cc|--git) } $line m type]} {
             set line [convertfrom utf-8 $line]
             $ctext insert end "$line\n" hunksep
-            continue
+            return
         }
         # start of a new file
         set diffinhdr 1

From d445a78873423c55b79f97361a082272acd17f7b Mon Sep 17 00:00:00 2001
From: Johannes Sixt <j6t@kdbg.org>
Date: Thu, 6 Nov 2025 10:42:37 +0100
Subject: [PATCH 047/553] gitk: show unescaped file names on 'rename' and
 'copy' lines

When a file is selected in the file list, the diff window scrolls to the
corresponding section. The administrative data needed for this purpose
is extracted from the 'rename from', 'rename to', and 'copy to' lines.
Escaped file names are unescaped for this purpose. However, the lines
shown in the diff window are left in the escaped form. This is not very
pleasing. Replace the escaped form by the unescaped form.

Add a section to treat the 'copy from' case.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
 gitk | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/gitk b/gitk
index eb657aa1e491b4..9e4f113c9bc142 100755
--- a/gitk
+++ b/gitk
@@ -8401,6 +8401,7 @@ proc parseblobdiffline {ids line} {
             if {$i >= 0} {
                 setinlist difffilestart $i $curdiffstart
             }
+            set line "rename from $fname"
         } elseif {![string compare -length 10 $line "rename to "] ||
                   ![string compare -length 8 $line "copy to "]} {
             set fname [string range $line [expr 4 + [string first " to " $line] ] end]
@@ -8408,6 +8409,13 @@ proc parseblobdiffline {ids line} {
                 set fname [lindex $fname 0]
             }
             makediffhdr $fname $ids
+            set line "[lindex $line 0] to $fname"
+        } elseif {![string compare -length 10 $line "copy from "]} {
+            set fname [string range $line 10 end]
+            if {[string index $fname 0] eq "\""} {
+                set fname [lindex $fname 0]
+            }
+            set line "copy from $fname"
         } elseif {[string compare -length 3 $line "---"] == 0} {
             # do nothing
             return

From 46207a54cca1402532f6e658503a9f6b7ad36fb8 Mon Sep 17 00:00:00 2001
From: Queen Ediri Jessa <qjessa662@gmail.com>
Date: Wed, 5 Nov 2025 15:38:49 +0100
Subject: [PATCH 048/553] doc: clarify server behavior for invalid 'want' lines
 in HTTP protocol

Update the documentation to clearly describe how the server responds when a
client sends an invalid or malformed `want` line during the HTTP protocol
exchange. The server includes the offending object name in its error message.

Signed-off-by: Queen Ediri Jessa <qjessa662@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/gitprotocol-http.adoc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/gitprotocol-http.adoc b/Documentation/gitprotocol-http.adoc
index d024010414aa6d..e2ef7f045953d8 100644
--- a/Documentation/gitprotocol-http.adoc
+++ b/Documentation/gitprotocol-http.adoc
@@ -443,7 +443,8 @@ If no "want" objects are received, send an error:
 TODO: Define error if no "want" lines are requested.
 
 If any "want" object is not reachable, send an error:
-TODO: Define error if an invalid "want" is requested.
+When a Git server receives an invalid or malformed `want` line, it
+responds with an error message that includes the offending object name.
 
 Create an empty list, `s_common`.
 

From bdb1cf831251b16d174f742178caac181add87f4 Mon Sep 17 00:00:00 2001
From: Tobias Boesch <tobias.boesch@miele.com>
Date: Thu, 6 Nov 2025 14:42:11 +0000
Subject: [PATCH 049/553] gitk: add external diff file rename detection

If a file is renamed between commits and an external diff is started
through gitk on either the original or the renamed file name, gitk is
unable to open the renamed file in the external diff editor. It fails
to fetch the renamed file from git because it requests it under its
original path instead of the renamed path.

Detect the rename and open the external diff with both the original
and the renamed file, fetching each under its own path, no matter
whether the original or the renamed file is selected in gitk.
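
The detection can be pictured outside gitk as a scan of the diff text
for a `rename from`/`rename to` pair that mentions the selected file. A
rough shell sketch, assuming unquoted paths; `find_rename` is a
hypothetical name, and gitk itself reads these lines from its diff text
widget rather than from a pipe:

```shell
# Hypothetical helper: read a diff on stdin and print the old and new
# path (one per line) of the first rename that mentions file name $1.
find_rename () {
	awk -v name="$1" '
	/^rename from / { from = substr($0, 13); next }
	/^rename to / {
		to = substr($0, 11)
		# substring match, as a loose stand-in for the lookup in gitk
		if (index(from, name) || index(to, name)) {
			print from
			print to
			exit
		}
	}'
}
```

An empty result means no matching rename was found, in which case the
selected path would be used for both sides as before.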

Signed-off-by: Tobias Boesch <tobias.boesch@miele.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
 gitk | 40 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/gitk b/gitk
index bc9efa18566fb8..9659466052e91a 100755
--- a/gitk
+++ b/gitk
@@ -3806,6 +3806,34 @@ proc external_diff_get_one_file {diffid filename diffdir} {
                "revision $diffid"]
 }
 
+proc check_for_renames_in_diff {filepath} { # renames
+    global difffilestart ctext
+
+    set filename [file tail $filepath]
+    set renames {}
+
+    foreach loc $difffilestart {
+        set loclineend [string map {.0 .end} $loc]
+        set fromlineloc "$loc + 2 lines"
+        set tolineloc "$loc + 3 lines"
+        set renfromline [$ctext get $fromlineloc [string map {.0 .end} $fromlineloc]]
+        set rentoline [$ctext get $tolineloc [string map {.0 .end} $tolineloc]]
+        if {[string equal -length 12 "rename from " $renfromline]
+            && [string equal -length 10 "rename to " $rentoline]} {
+            set renfrom [string range $renfromline 12 end]
+            set rento [string range $rentoline 10 end]
+            if {[string first $filename $renfrom] != -1
+                || [string first $filename $rento] != -1} {
+                lappend renames $renfrom
+                lappend renames $rento
+                break
+            }
+        }
+    }
+
+    return $renames
+}
+
 proc external_diff {} {
     global nullid nullid2
     global flist_menu_file
@@ -3836,8 +3864,16 @@ proc external_diff {} {
     if {$diffdir eq {}} return
 
     # gather files to diff
-    set difffromfile [external_diff_get_one_file $diffidfrom $flist_menu_file $diffdir]
-    set difftofile [external_diff_get_one_file $diffidto $flist_menu_file $diffdir]
+    set renames [check_for_renames_in_diff $flist_menu_file]
+    set renamefrom [lindex $renames 0]
+    set renameto [lindex $renames 1]
+    if {$renamefrom ne {} && $renameto ne {}} {
+        set difffromfile [external_diff_get_one_file $diffidfrom $renamefrom $diffdir]
+        set difftofile [external_diff_get_one_file $diffidto $renameto $diffdir]
+    } else {
+        set difffromfile [external_diff_get_one_file $diffidfrom $flist_menu_file $diffdir]
+        set difftofile [external_diff_get_one_file $diffidto $flist_menu_file $diffdir]
+    }
 
     if {$difffromfile ne {} && $difftofile ne {}} {
         set cmd [list [shellsplit $extdifftool] $difffromfile $difftofile]

From 7048e74609fbef2c91bfa3a80e3a9c4fc0ac04c9 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 6 Nov 2025 09:52:54 +0100
Subject: [PATCH 050/553] object: fix performance regression when peeling tags
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Our Bencher dashboards [1] have recently alerted us about a bunch of
performance regressions when writing references, specifically with the
reftable backend. There is a 3x regression when writing many refs with
preexisting refs in the reftable format, and a 10x regression when
migrating refs between backends in either of the formats.

Bisecting the issue lands us at 6ec4c0b45b (refs: don't store peeled
object IDs for invalid tags, 2025-10-23). The gist of the commit is that
we may end up storing peeled objects in both reftables and packed-refs
for corrupted tags, where the claimed tagged object type is different
than the actual tagged object type. This will then cause us to create
the `struct object *` with a wrong type, as well, and obviously nothing
good comes out of that.

The fix for this issue was to introduce a new flag to `peel_object()`
that causes us to verify the tagged object's type before writing it into
the refdb -- if the tag is corrupt, we skip writing the peeled value.
To verify whether the peeled value is correct we have to look up the
object type via the ODB and compare the actual type with the claimed
type, and that additional object lookup is costly.

This also explains why we see the regression only when writing refs with
the reftable backend, but we see the regression with both backends when
migrating refs:

  - The reftable backend knows to store peeled values in the new table
    immediately, so it has to try and peel each ref it's about to write
    to the transaction. So the performance regression is visible for all
    writes.

  - The files backend only stores peeled values when writing the
    packed-refs file, so it wouldn't hit the performance regression for
    normal writes. But on ref migrations we know to write all new values
    into the packed-refs file immediately, and that's why we see the
    regression for both backends there.

Taking a step back though reveals an oddity in the new verification
logic: we not only verify the _tagged_ object's type, but we also verify
the type of the tag itself. But this isn't really needed, as we wouldn't
hit the bug in such a case anyway, as we only hit the issue with corrupt
tags claiming an invalid type for the tagged object.

The consequence is that we now look up the target object of every
single reference we're about to write, regardless of whether it is a
tag at all. That is of course quite costly.

Fix the issue by only verifying the type of the tagged objects. This
means that we of course still have a performance hit for actual tags.
But this only happens for writes anyway, and I'd claim it's preferable
to not store corrupted data in the refdb than to be fast here. Rename
the flag accordingly to clarify that we only verify the tagged object's
type.
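
To illustrate the difference, here is a minimal, self-contained sketch
of the idea. It is a toy model, not the code in object.c: all `toy_*`
names are made up for this example, and the "ODB" is just a counter
that records how many type lookups happen. It assumes that parsing a
tag has already filled in the claimed type of the tagged object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are Git's real types. */
enum toy_type { TOY_NONE, TOY_COMMIT, TOY_TAG };

struct toy_object {
	enum toy_type cached_type; /* type we believe the object has */
	enum toy_type actual_type; /* type stored in the toy "ODB" */
	struct toy_object *tagged; /* target, if this is a tag */
};

/* Number of simulated (costly) ODB lookups performed so far. */
static size_t toy_odb_lookups;

static enum toy_type toy_odb_type(const struct toy_object *o)
{
	toy_odb_lookups++;
	return o->actual_type;
}

/*
 * Peel `o` by one level. Returns 0 on success and -1 if the tag
 * claimed a wrong type for its target. With `verify_tagged` set, only
 * the tagged object is checked against the ODB; an object whose type
 * is already known is never looked up again.
 */
static int toy_peel(struct toy_object *o, int verify_tagged,
		    struct toy_object **out)
{
	if (o->cached_type == TOY_NONE)
		o->cached_type = toy_odb_type(o);

	if (o->cached_type == TOY_TAG && o->tagged) {
		struct toy_object *t = o->tagged;

		if (verify_tagged && t->cached_type != toy_odb_type(t))
			return -1;
		o = t;
	}

	*out = o;
	return 0;
}
```

In this model, peeling a ref that points at a commit with a known type
costs no lookup at all, while peeling a tag costs exactly one.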

This fix brings performance back to previous levels:

    Benchmark 1: baseline
      Time (mean ± σ):      46.0 ms ±   0.4 ms    [User: 40.0 ms, System: 5.7 ms]
      Range (min … max):    45.0 ms …  47.1 ms    54 runs

    Benchmark 2: regression
      Time (mean ± σ):     140.2 ms ±   1.3 ms    [User: 77.5 ms, System: 60.5 ms]
      Range (min … max):   138.0 ms … 142.7 ms    20 runs

    Benchmark 3: fix
      Time (mean ± σ):      46.2 ms ±   0.4 ms    [User: 40.2 ms, System: 5.7 ms]
      Range (min … max):    45.0 ms …  47.3 ms    55 runs

    Summary
      update-ref: baseline
        1.00 ± 0.01 times faster than fix
        3.05 ± 0.04 times faster than regression

[1]: https://bencher.dev/perf/git/plots

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 object.c                |  4 ++--
 object.h                | 12 ++++++------
 ref-filter.c            |  2 +-
 refs/packed-backend.c   |  2 +-
 refs/reftable-backend.c |  2 +-
 5 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/object.c b/object.c
index e72b0ed4360e67..b08fc7a163ae69 100644
--- a/object.c
+++ b/object.c
@@ -214,7 +214,7 @@ enum peel_status peel_object(struct repository *r,
 {
 	struct object *o = lookup_unknown_object(r, name);
 
-	if (o->type == OBJ_NONE || flags & PEEL_OBJECT_VERIFY_OBJECT_TYPE) {
+	if (o->type == OBJ_NONE) {
 		int type = odb_read_object_info(r->objects, name, NULL);
 		if (type < 0 || !object_as_type(o, type, 0))
 			return PEEL_INVALID;
@@ -228,7 +228,7 @@ enum peel_status peel_object(struct repository *r,
 		if (o && o->type == OBJ_TAG && ((struct tag *)o)->tagged) {
 			o = ((struct tag *)o)->tagged;
 
-			if (flags & PEEL_OBJECT_VERIFY_OBJECT_TYPE) {
+			if (flags & PEEL_OBJECT_VERIFY_TAGGED_OBJECT_TYPE) {
 				int type = odb_read_object_info(r->objects, &o->oid, NULL);
 				if (type < 0 || !object_as_type(o, type, 0))
 					return PEEL_INVALID;
diff --git a/object.h b/object.h
index 1499f63d507c32..e9baade1e0ecab 100644
--- a/object.h
+++ b/object.h
@@ -289,13 +289,13 @@ enum peel_status {
 
 enum peel_object_flags {
 	/*
-	 * Always verify the object type, even in the case where the looked-up
-	 * object already has an object type. This can be useful when the
-	 * stored object type may be invalid. One such case is when looking up
-	 * objects via tags, where we blindly trust the object type declared by
-	 * the tag.
+	 * Always verify the object type of the tagged object, even in the case
+	 * where the looked-up object already has an object type. This can be
+	 * useful when the tagged object type may be invalid. One such case is
+	 * when looking up objects via tags, where we blindly trust the object
+	 * type declared by the tag.
 	 */
-	PEEL_OBJECT_VERIFY_OBJECT_TYPE = (1 << 0),
+	PEEL_OBJECT_VERIFY_TAGGED_OBJECT_TYPE = (1 << 0),
 };
 
 /*
diff --git a/ref-filter.c b/ref-filter.c
index d8667c569a18f1..d7454269e87cd3 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2654,7 +2654,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 		if (!is_null_oid(&ref->peeled_oid)) {
 			oidcpy(&oi_deref.oid, &ref->peeled_oid);
 		} else if (!peel_object(the_repository, &oi.oid, &oi_deref.oid,
-					PEEL_OBJECT_VERIFY_OBJECT_TYPE)) {
+					PEEL_OBJECT_VERIFY_TAGGED_OBJECT_TYPE)) {
 			/* We managed to peel the object ourselves. */
 		} else {
 			die("bad tag");
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 1ab0c503930164..5aa615011a6db3 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1528,7 +1528,7 @@ static enum ref_transaction_error write_with_updates(struct packed_ref_store *re
 		} else {
 			struct object_id peeled;
 			int peel_error = peel_object(refs->base.repo, &update->new_oid,
-						     &peeled, PEEL_OBJECT_VERIFY_OBJECT_TYPE);
+						     &peeled, PEEL_OBJECT_VERIFY_TAGGED_OBJECT_TYPE);
 
 			if (write_packed_entry(out, update->refname,
 					       &update->new_oid,
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 6bbfd5618dac16..1ac1f6156f256d 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1633,7 +1633,7 @@ static int write_transaction_table(struct reftable_writer *writer, void *cb_data
 			ref.update_index = ts;
 
 			peel_error = peel_object(arg->refs->base.repo, &u->new_oid, &peeled,
-						 PEEL_OBJECT_VERIFY_OBJECT_TYPE);
+						 PEEL_OBJECT_VERIFY_TAGGED_OBJECT_TYPE);
 			if (!peel_error) {
 				ref.value_type = REFTABLE_REF_VAL2;
 				memcpy(ref.value.val2.target_value, peeled.hash, GIT_MAX_RAWSZ);

From 135f491f83d4763bdc61642eb0126ce2e6ada286 Mon Sep 17 00:00:00 2001
From: Karthik Nayak <karthik.188@gmail.com>
Date: Sat, 8 Nov 2025 22:51:53 +0100
Subject: [PATCH 051/553] reftable/stack: return stack segments directly

The `stack_table_sizes_for_compaction()` function returns individual
sizes of each reftable table. This function is only called by
`reftable_stack_auto_compact()` to decide which tables need to be
compacted, if any.

Modify the function to directly return the segments, which avoids the
extra step of receiving the sizes only to pass it to
`suggest_compaction_segment()`.

A future commit will also add functionality for checking whether
auto-compaction is necessary without performing it. This change allows
code re-usability in that context.
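
The shape of the refactoring can be sketched in isolation as follows.
This is only an illustration: the `toy_*` names are hypothetical, and
`toy_suggest()` uses a placeholder policy (a fixed factor of 2), not
the heuristic implemented by `suggest_compaction_segment()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_OUT_OF_MEMORY (-1)

/* Half-open range [start, end) of table indices. */
struct toy_segment {
	size_t start, end;
};

/*
 * Placeholder policy: starting at the first table that is smaller
 * than twice its successor, suggest everything up to the end.
 */
static struct toy_segment toy_suggest(const uint64_t *sizes, size_t n)
{
	struct toy_segment seg = { 0, 0 };

	for (size_t i = 0; i + 1 < n; i++) {
		if (sizes[i] < 2 * sizes[i + 1]) {
			seg.start = i;
			seg.end = n;
			break;
		}
	}
	return seg;
}

/*
 * Compute the sizes and turn them into a segment in one step. The
 * temporary array never leaves this function; callers only see an
 * error code and the resulting segment.
 */
static int toy_segment_for_compaction(const uint64_t *table_sizes,
				      size_t n, uint64_t overhead,
				      struct toy_segment *seg)
{
	uint64_t *sizes;

	seg->start = seg->end = 0;
	if (!n)
		return 0;

	sizes = calloc(n, sizeof(*sizes));
	if (!sizes)
		return TOY_OUT_OF_MEMORY;

	for (size_t i = 0; i < n; i++) {
		assert(table_sizes[i] >= overhead);
		sizes[i] = table_sizes[i] - overhead;
	}

	*seg = toy_suggest(sizes, n);
	free(sizes);
	return 0;
}
```

The caller no longer has to own or free the sizes array, which is what
makes the helper easy to reuse from a second call site.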

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 reftable/stack.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/reftable/stack.c b/reftable/stack.c
index 65d89820bd0748..49387f93446398 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -1626,7 +1626,8 @@ struct segment suggest_compaction_segment(uint64_t *sizes, size_t n,
 	return seg;
 }
 
-static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st)
+static int stack_segments_for_compaction(struct reftable_stack *st,
+					 struct segment *seg)
 {
 	int version = (st->opts.hash_id == REFTABLE_HASH_SHA1) ? 1 : 2;
 	int overhead = header_size(version) - 1;
@@ -1634,29 +1635,29 @@ static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st)
 
 	REFTABLE_CALLOC_ARRAY(sizes, st->merged->tables_len);
 	if (!sizes)
-		return NULL;
+		return REFTABLE_OUT_OF_MEMORY_ERROR;
 
 	for (size_t i = 0; i < st->merged->tables_len; i++)
 		sizes[i] = st->tables[i]->size - overhead;
 
-	return sizes;
+	*seg = suggest_compaction_segment(sizes, st->merged->tables_len,
+					  st->opts.auto_compaction_factor);
+	reftable_free(sizes);
+
+	return 0;
 }
 
 int reftable_stack_auto_compact(struct reftable_stack *st)
 {
 	struct segment seg;
-	uint64_t *sizes;
+	int err;
 
 	if (st->merged->tables_len < 2)
 		return 0;
 
-	sizes = stack_table_sizes_for_compaction(st);
-	if (!sizes)
-		return REFTABLE_OUT_OF_MEMORY_ERROR;
-
-	seg = suggest_compaction_segment(sizes, st->merged->tables_len,
-					 st->opts.auto_compaction_factor);
-	reftable_free(sizes);
+	err = stack_segments_for_compaction(st, &seg);
+	if (err)
+		return err;
 
 	if (segment_size(&seg) > 0)
 		return stack_compact_range(st, seg.start, seg.end - 1,

From e35155588aa9f0355eb7e116ea418c189479f62d Mon Sep 17 00:00:00 2001
From: Karthik Nayak <karthik.188@gmail.com>
Date: Sat, 8 Nov 2025 22:51:54 +0100
Subject: [PATCH 052/553] reftable/stack: add function to check if optimization
 is required

The reftable backend performs auto-compaction as part of its regular
flow, which is required to keep the number of tables in a stack in
check. This allows it to stay optimized.

Compaction can also be triggered manually by the user via the 'git
pack-refs' or the 'git refs optimize' command. However, currently there
is no way for the user to check if optimization is required without
actually performing it.

Extract out the heuristics logic from 'reftable_stack_auto_compact()'
into an internal function 'update_segment_if_compaction_required()'.
Then use this to add and expose `reftable_stack_compaction_required()`
which will allow users to check if the reftable backend can be
optimized.
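
The decision this function makes can be summarized by the following
stand-alone sketch. It is a simplified model rather than the reftable
code: `toy_compaction_required()` and its parameters are hypothetical,
and the result of the heuristics is passed in as a plain number
instead of being computed from the stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Decide whether compaction would do any work.
 *
 *  - With fewer than two tables there is nothing to compact.
 *  - Without heuristics, any stack of two or more tables can be
 *    compacted into a single table.
 *  - With heuristics, compaction is only required when the heuristics
 *    suggested a non-empty segment of tables.
 */
static int toy_compaction_required(size_t tables_len,
				   bool use_heuristics,
				   size_t suggested_segment_size,
				   bool *required)
{
	if (tables_len < 2) {
		*required = false;
		return 0;
	}

	if (!use_heuristics) {
		*required = true;
		return 0;
	}

	*required = suggested_segment_size > 0;
	return 0;
}
```

Reporting the answer via an out-parameter keeps the return value free
for error codes, such as an allocation failure while computing the
segment.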

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 reftable/reftable-stack.h       | 11 +++++++++
 reftable/stack.c                | 42 +++++++++++++++++++++++++++++----
 t/unit-tests/u-reftable-stack.c | 12 ++++++++--
 3 files changed, 58 insertions(+), 7 deletions(-)

diff --git a/reftable/reftable-stack.h b/reftable/reftable-stack.h
index d70fcb705dcffe..c2415cbc6e46a6 100644
--- a/reftable/reftable-stack.h
+++ b/reftable/reftable-stack.h
@@ -123,6 +123,17 @@ struct reftable_log_expiry_config {
 int reftable_stack_compact_all(struct reftable_stack *st,
 			       struct reftable_log_expiry_config *config);
 
+/*
+ * Check if compaction is required.
+ *
+ * When `use_heuristics` is false, check if all tables can be compacted to a
+ * single table. If true, use heuristics to determine if the tables need to be
+ * compacted to maintain geometric progression.
+ */
+int reftable_stack_compaction_required(struct reftable_stack *st,
+				       bool use_heuristics,
+				       bool *required);
+
 /* heuristically compact unbalanced table stack. */
 int reftable_stack_auto_compact(struct reftable_stack *st);
 
diff --git a/reftable/stack.c b/reftable/stack.c
index 49387f93446398..1c9f21dfe1eb45 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -1647,19 +1647,51 @@ static int stack_segments_for_compaction(struct reftable_stack *st,
 	return 0;
 }
 
-int reftable_stack_auto_compact(struct reftable_stack *st)
+static int update_segment_if_compaction_required(struct reftable_stack *st,
+						 struct segment *seg,
+						 bool use_geometric,
+						 bool *required)
 {
-	struct segment seg;
 	int err;
 
-	if (st->merged->tables_len < 2)
+	if (st->merged->tables_len < 2) {
+		*required = false;
+		return 0;
+	}
+
+	if (!use_geometric) {
+		*required = true;
 		return 0;
+	}
+
+	err = stack_segments_for_compaction(st, seg);
+	if (err)
+		return err;
+
+	*required = segment_size(seg) > 0;
+	return 0;
+}
+
+int reftable_stack_compaction_required(struct reftable_stack *st,
+				       bool use_heuristics,
+				       bool *required)
+{
+	struct segment seg;
+	return update_segment_if_compaction_required(st, &seg, use_heuristics,
+						     required);
+}
+
+int reftable_stack_auto_compact(struct reftable_stack *st)
+{
+	struct segment seg;
+	bool required;
+	int err;
 
-	err = stack_segments_for_compaction(st, &seg);
+	err = update_segment_if_compaction_required(st, &seg, true, &required);
 	if (err)
 		return err;
 
-	if (segment_size(&seg) > 0)
+	if (required)
 		return stack_compact_range(st, seg.start, seg.end - 1,
 					   NULL, STACK_COMPACT_RANGE_BEST_EFFORT);
 
diff --git a/t/unit-tests/u-reftable-stack.c b/t/unit-tests/u-reftable-stack.c
index a8b91812e89ab0..b8110cdeee664b 100644
--- a/t/unit-tests/u-reftable-stack.c
+++ b/t/unit-tests/u-reftable-stack.c
@@ -1067,6 +1067,7 @@ void test_reftable_stack__add_performs_auto_compaction(void)
 			.value_type = REFTABLE_REF_SYMREF,
 			.value.symref = (char *) "master",
 		};
+		bool required = false;
 		char buf[128];
 
 		/*
@@ -1087,10 +1088,17 @@ void test_reftable_stack__add_performs_auto_compaction(void)
 		 * auto compaction is disabled. When enabled, we should merge
 		 * all tables in the stack.
 		 */
-		if (i != n)
+		cl_assert_equal_i(reftable_stack_compaction_required(st, true, &required), 0);
+		if (i != n) {
 			cl_assert_equal_i(st->merged->tables_len, i + 1);
-		else
+			if (i < 1)
+				cl_assert_equal_b(required, false);
+			else
+				cl_assert_equal_b(required, true);
+		} else {
 			cl_assert_equal_i(st->merged->tables_len, 1);
+			cl_assert_equal_b(required, false);
+		}
 	}
 
 	reftable_stack_destroy(st);

From f6c5ca387a7693b16158826d157178be0ba439dc Mon Sep 17 00:00:00 2001
From: Karthik Nayak <karthik.188@gmail.com>
Date: Sat, 8 Nov 2025 22:51:55 +0100
Subject: [PATCH 053/553] refs: add a `optimize_required` field to `struct
 ref_storage_be`

To allow users of the refs namespace to check if the reference backend
requires optimization, add a new `optimize_required` field to
`struct ref_storage_be`. This field is of type `optimize_required_fn`
which is also introduced in this commit.

Modify the debug, files, packed and reftable backend to implement this
field. A following commit will expose this via 'git pack-refs' and 'git
refs optimize'.
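
The dispatch pattern used here can be sketched on its own as shown
below. This is a toy model under stated assumptions: every `toy_*`
name is hypothetical, the options argument is left out, and the
"files" policy (any loose ref means packing is worthwhile) is invented
for the example and is not what `should_pack_refs()` does.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_store;

/* Callback type, analogous in spirit to `optimize_required_fn`. */
typedef int toy_optimize_required_fn(struct toy_store *store,
				     bool *required);

/* Table of backend callbacks. */
struct toy_backend {
	const char *name;
	toy_optimize_required_fn *optimize_required;
};

struct toy_store {
	const struct toy_backend *be;
	unsigned loose_refs;
};

/* A backend that is always considered optimized. */
static int toy_packed_required(struct toy_store *store, bool *required)
{
	(void)store;
	*required = false;
	return 0;
}

/* Invented policy: optimization is useful if loose refs exist. */
static int toy_files_required(struct toy_store *store, bool *required)
{
	*required = store->loose_refs > 0;
	return 0;
}

static const struct toy_backend toy_be_packed = {
	"packed", toy_packed_required,
};

static const struct toy_backend toy_be_files = {
	"files", toy_files_required,
};

/* Generic entry point: forward to whichever backend the store uses. */
static int toy_optimize_required(struct toy_store *store, bool *required)
{
	return store->be->optimize_required(store, required);
}
```

Callers only ever use the generic entry point, so adding the check to
another backend does not require touching them.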

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 refs.c                  |  7 +++++++
 refs.h                  |  7 +++++++
 refs/debug.c            | 13 +++++++++++++
 refs/files-backend.c    | 11 +++++++++++
 refs/packed-backend.c   | 13 +++++++++++++
 refs/refs-internal.h    |  6 ++++++
 refs/reftable-backend.c | 25 +++++++++++++++++++++++++
 7 files changed, 82 insertions(+)

diff --git a/refs.c b/refs.c
index 0d0831f29ba25b..5583f6e09d7c76 100644
--- a/refs.c
+++ b/refs.c
@@ -2318,6 +2318,13 @@ int refs_optimize(struct ref_store *refs, struct refs_optimize_opts *opts)
 	return refs->be->optimize(refs, opts);
 }
 
+int refs_optimize_required(struct ref_store *refs,
+			   struct refs_optimize_opts *opts,
+			   bool *required)
+{
+	return refs->be->optimize_required(refs, opts, required);
+}
+
 int reference_get_peeled_oid(struct repository *repo,
 			     const struct reference *ref,
 			     struct object_id *peeled_oid)
diff --git a/refs.h b/refs.h
index 6b05bba527ffca..d9051bbb0414c2 100644
--- a/refs.h
+++ b/refs.h
@@ -520,6 +520,13 @@ struct refs_optimize_opts {
  */
 int refs_optimize(struct ref_store *refs, struct refs_optimize_opts *opts);
 
+/*
+ * Check if refs backend can be optimized by calling 'refs_optimize'.
+ */
+int refs_optimize_required(struct ref_store *ref_store,
+			   struct refs_optimize_opts *opts,
+			   bool *required);
+
 /*
  * Setup reflog before using. Fill in err and return -1 on failure.
  */
diff --git a/refs/debug.c b/refs/debug.c
index 2defd2d465712e..36f8c58b6c781f 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -124,6 +124,17 @@ static int debug_optimize(struct ref_store *ref_store, struct refs_optimize_opts
 	return res;
 }
 
+static int debug_optimize_required(struct ref_store *ref_store,
+				   struct refs_optimize_opts *opts,
+				   bool *required)
+{
+	struct debug_ref_store *drefs = (struct debug_ref_store *)ref_store;
+	int res = drefs->refs->be->optimize_required(drefs->refs, opts, required);
+	trace_printf_key(&trace_refs, "optimize_required: %s, res: %d\n",
+			 (!res && *required) ? "yes" : "no", res);
+	return res;
+}
+
 static int debug_rename_ref(struct ref_store *ref_store, const char *oldref,
 			    const char *newref, const char *logmsg)
 {
@@ -431,6 +442,8 @@ struct ref_storage_be refs_be_debug = {
 	.transaction_abort = debug_transaction_abort,
 
 	.optimize = debug_optimize,
+	.optimize_required = debug_optimize_required,
+
 	.rename_ref = debug_rename_ref,
 	.copy_ref = debug_copy_ref,
 
diff --git a/refs/files-backend.c b/refs/files-backend.c
index a1e70b1c10dbb8..6e0c9b340a0f6f 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1512,6 +1512,16 @@ static int files_optimize(struct ref_store *ref_store,
 	return 0;
 }
 
+static int files_optimize_required(struct ref_store *ref_store,
+				   struct refs_optimize_opts *opts,
+				   bool *required)
+{
+	struct files_ref_store *refs = files_downcast(ref_store, REF_STORE_READ,
+						      "optimize_required");
+	*required = should_pack_refs(refs, opts);
+	return 0;
+}
+
 /*
  * People using contrib's git-new-workdir have .git/logs/refs ->
  * /some/other/path/.git/logs/refs, and that may live on another device.
@@ -3982,6 +3992,7 @@ struct ref_storage_be refs_be_files = {
 	.transaction_abort = files_transaction_abort,
 
 	.optimize = files_optimize,
+	.optimize_required = files_optimize_required,
 	.rename_ref = files_rename_ref,
 	.copy_ref = files_copy_ref,
 
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 10062fd8b63063..19ce4d58728e8c 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1784,6 +1784,17 @@ static int packed_optimize(struct ref_store *ref_store UNUSED,
 	return 0;
 }
 
+static int packed_optimize_required(struct ref_store *ref_store UNUSED,
+				    struct refs_optimize_opts *opts UNUSED,
+				    bool *required)
+{
+	/*
+	 * Packed refs are already optimized.
+	 */
+	*required = false;
+	return 0;
+}
+
 static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_store UNUSED)
 {
 	return empty_ref_iterator_begin();
@@ -2130,6 +2141,8 @@ struct ref_storage_be refs_be_packed = {
 	.transaction_abort = packed_transaction_abort,
 
 	.optimize = packed_optimize,
+	.optimize_required = packed_optimize_required,
+
 	.rename_ref = NULL,
 	.copy_ref = NULL,
 
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index dee42f231dbd0a..c7d2a6e50b7696 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -424,6 +424,11 @@ typedef int ref_transaction_commit_fn(struct ref_store *refs,
 
 typedef int optimize_fn(struct ref_store *ref_store,
 			struct refs_optimize_opts *opts);
+
+typedef int optimize_required_fn(struct ref_store *ref_store,
+				 struct refs_optimize_opts *opts,
+				 bool *required);
+
 typedef int rename_ref_fn(struct ref_store *ref_store,
 			  const char *oldref, const char *newref,
 			  const char *logmsg);
@@ -549,6 +554,7 @@ struct ref_storage_be {
 	ref_transaction_abort_fn *transaction_abort;
 
 	optimize_fn *optimize;
+	optimize_required_fn *optimize_required;
 	rename_ref_fn *rename_ref;
 	copy_ref_fn *copy_ref;
 
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index c23c45f3bf4639..a3ae0cf74a0588 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1733,6 +1733,29 @@ static int reftable_be_optimize(struct ref_store *ref_store,
 	return ret;
 }
 
+static int reftable_be_optimize_required(struct ref_store *ref_store,
+					 struct refs_optimize_opts *opts,
+					 bool *required)
+{
+	struct reftable_ref_store *refs = reftable_be_downcast(ref_store, REF_STORE_READ,
+							       "optimize_refs_required");
+	struct reftable_stack *stack;
+	bool use_heuristics = false;
+
+	if (refs->err)
+		return refs->err;
+
+	stack = refs->worktree_backend.stack;
+	if (!stack)
+		stack = refs->main_backend.stack;
+
+	if (opts->flags & REFS_OPTIMIZE_AUTO)
+		use_heuristics = true;
+
+	return reftable_stack_compaction_required(stack, use_heuristics,
+						  required);
+}
+
 struct write_create_symref_arg {
 	struct reftable_ref_store *refs;
 	struct reftable_stack *stack;
@@ -2756,6 +2779,8 @@ struct ref_storage_be refs_be_reftable = {
 	.transaction_abort = reftable_be_transaction_abort,
 
 	.optimize = reftable_be_optimize,
+	.optimize_required = reftable_be_optimize_required,
+
 	.rename_ref = reftable_be_rename_ref,
 	.copy_ref = reftable_be_copy_ref,
 

From 8c1ce2204cc755bdafec85aaa4ac9c5a686a8bf4 Mon Sep 17 00:00:00 2001
From: Karthik Nayak <karthik.188@gmail.com>
Date: Sat, 8 Nov 2025 22:51:56 +0100
Subject: [PATCH 054/553] maintenance: add checking logic in
 `pack_refs_condition()`

The 'git-maintenance(1)' command supports an '--auto' flag. With this
flag, maintenance tasks are run only if certain thresholds are met. The
heuristic is defined on a task level, wherein each task defines an
'auto_condition', which states if the task should be run.

The 'pack-refs' task is hard-coded to return 1 as:
1. There was never a way to check if the reference backend needs to be
optimized without actually performing the optimization.
2. We can pass in the '--auto' flag to 'git-pack-refs(1)' which would
optimize based on heuristics.

The previous commit added a `refs_optimize_required()` function, which
can be used to check if a reference backend requires optimization. Use
this within `pack_refs_condition()`.

This allows us to add a 'git maintenance is-needed' subcommand which can
notify the user if maintenance is needed without actually performing the
optimization. Without this change, the reference backend would always
state that optimization is needed.

Since we now include 'revision.h', we need to remove the definition of
'SEEN', which would otherwise duplicate the one in the included header.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/gc.c | 30 +++++++++++++++++++++---------
 object.h     |  1 -
 2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index c6d62c74a7169f..85e9a38d103522 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -35,6 +35,7 @@
 #include "path.h"
 #include "reflog.h"
 #include "rerere.h"
+#include "revision.h"
 #include "blob.h"
 #include "tree.h"
 #include "promisor-remote.h"
@@ -285,12 +286,26 @@ static void maintenance_run_opts_release(struct maintenance_run_opts *opts)
 
 static int pack_refs_condition(UNUSED struct gc_config *cfg)
 {
-	/*
-	 * The auto-repacking logic for refs is handled by the ref backends and
-	 * exposed via `git pack-refs --auto`. We thus always return truish
-	 * here and let the backend decide for us.
-	 */
-	return 1;
+	struct string_list included_refs = STRING_LIST_INIT_NODUP;
+	struct ref_exclusions excludes = REF_EXCLUSIONS_INIT;
+	struct refs_optimize_opts optimize_opts = {
+		.exclusions = &excludes,
+		.includes = &included_refs,
+		.flags = REFS_OPTIMIZE_PRUNE | REFS_OPTIMIZE_AUTO,
+	};
+	bool required;
+
+	/* Check for all refs, similar to 'git refs optimize --all'. */
+	string_list_append(optimize_opts.includes, "*");
+
+	if (refs_optimize_required(get_main_ref_store(the_repository),
+				   &optimize_opts, &required))
+		required = false;
+
+	clear_ref_exclusions(&excludes);
+	string_list_clear(&included_refs, 0);
+
+	return required;
 }
 
 static int maintenance_task_pack_refs(struct maintenance_run_opts *opts,
@@ -1090,9 +1105,6 @@ static int maintenance_opt_schedule(const struct option *opt, const char *arg,
 	return 0;
 }
 
-/* Remember to update object flag allocation in object.h */
-#define SEEN		(1u<<0)
-
 struct cg_auto_data {
 	int num_not_in_graph;
 	int limit;
diff --git a/object.h b/object.h
index 1499f63d507c32..832299e763876c 100644
--- a/object.h
+++ b/object.h
@@ -79,7 +79,6 @@ void object_array_init(struct object_array *array);
  * list-objects-filter.c:                                      21
  * bloom.c:                                                    2122
  * builtin/fsck.c:           0--3
- * builtin/gc.c:             0
  * builtin/index-pack.c:                                     2021
  * reflog.c:                           10--12
  * builtin/show-branch.c:    0-------------------------------------------26

From 28b83e6f08ae022d54d79e518e72933ae0930091 Mon Sep 17 00:00:00 2001
From: Karthik Nayak <karthik.188@gmail.com>
Date: Sat, 8 Nov 2025 22:51:57 +0100
Subject: [PATCH 055/553] maintenance: add 'is-needed' subcommand

The 'git-maintenance(1)' command provides tooling to run maintenance
tasks over Git repositories. The 'run' subcommand, as the name suggests,
runs the maintenance tasks. When used with the '--auto' flag, it uses
heuristics to determine if the required thresholds are met for running
said maintenance tasks.

There is, however, little insight into these heuristics, because the
checks are tied to the execution of the tasks.

Add a new 'is-needed' subcommand to 'git-maintenance(1)' which allows
users to check whether maintenance needs to be run without actually
running it.

Ideally the subcommand should be used with the '--auto' flag, which
allows users to check if the required thresholds are met. The
subcommand also supports the '--task' flag, which can be used to check
specific maintenance tasks.
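
The check itself boils down to the loop sketched below. This is a
simplified, self-contained model: the `toy_*` names are hypothetical,
and the conditions take no configuration argument, unlike the real
'auto_condition' callbacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef int toy_auto_condition_fn(void);

struct toy_task {
	const char *name;
	toy_auto_condition_fn *auto_condition; /* may be NULL */
};

static int toy_never(void)
{
	return 0;
}

static int toy_always(void)
{
	return 1;
}

/*
 * Return the exit code of the check: 0 if maintenance is needed and 1
 * otherwise. Without `auto_flag` maintenance is always reported as
 * needed. With it, the first task whose condition holds decides, and
 * tasks without a condition are skipped.
 */
static int toy_is_needed(const struct toy_task *tasks, size_t nr,
			 bool auto_flag)
{
	bool needed = !auto_flag;

	for (size_t i = 0; !needed && i < nr; i++) {
		if (tasks[i].auto_condition &&
		    tasks[i].auto_condition())
			needed = true;
	}

	return needed ? 0 : 1;
}
```

Mapping "needed" to exit code 0 lets scripts chain the check with `&&`,
which is how the tests added to 't/t7900-maintenance.sh' use it.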

While adding the respective tests in 't/t7900-maintenance.sh', remove a
duplicate of the test: 'worktree-prune task with --auto honors
maintenance.worktree-prune.auto'.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-maintenance.adoc | 13 ++++++
 builtin/gc.c                       | 63 +++++++++++++++++++++++++++++-
 t/t7900-maintenance.sh             | 54 +++++++++++++++++--------
 3 files changed, 113 insertions(+), 17 deletions(-)

diff --git a/Documentation/git-maintenance.adoc b/Documentation/git-maintenance.adoc
index 540b5cf68b0a1c..bda616f14c45d9 100644
--- a/Documentation/git-maintenance.adoc
+++ b/Documentation/git-maintenance.adoc
@@ -12,6 +12,7 @@ SYNOPSIS
 'git maintenance' run [<options>]
 'git maintenance' start [--scheduler=<scheduler>]
 'git maintenance' (stop|register|unregister) [<options>]
+'git maintenance' is-needed [<options>]
 
 
 DESCRIPTION
@@ -84,6 +85,16 @@ The `unregister` subcommand will report an error if the current repository
 is not already registered. Use the `--force` option to return success even
 when the current repository is not registered.
 
+is-needed::
+    Check whether maintenance needs to be run without actually running it.
+    Exits with a 0 status code if maintenance needs to be run, 1 otherwise.
+    Ideally used with the '--auto' flag.
++
+If one or more `--task` options are specified, then those tasks are checked
+in that order. Otherwise, the tasks are determined by which
+`maintenance.<task>.enabled` config options are true. By default, only
+`maintenance.gc.enabled` is true.
+
 TASKS
 -----
 
@@ -183,6 +194,8 @@ OPTIONS
 	in the `gc.auto` config setting, or when the number of pack-files
 	exceeds the `gc.autoPackLimit` config setting. Not compatible with
 	the `--schedule` option.
+	When combined with the `is-needed` subcommand, check if the required
+	thresholds are met without actually running maintenance.
 
 --schedule::
 	When combined with the `run` subcommand, run maintenance tasks
diff --git a/builtin/gc.c b/builtin/gc.c
index 85e9a38d103522..928c805f02b493 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -3253,7 +3253,67 @@ static int maintenance_stop(int argc, const char **argv, const char *prefix,
 	return update_background_schedule(NULL, 0);
 }
 
-static const char * const builtin_maintenance_usage[] = {
+static const char *const builtin_maintenance_is_needed_usage[] = {
+	"git maintenance is-needed [--task=<task>] [--auto]",
+	NULL
+};
+
+static int maintenance_is_needed(int argc, const char **argv, const char *prefix,
+				 struct repository *repo UNUSED)
+{
+	struct maintenance_run_opts opts = MAINTENANCE_RUN_OPTS_INIT;
+	struct string_list selected_tasks = STRING_LIST_INIT_DUP;
+	struct gc_config cfg = GC_CONFIG_INIT;
+	struct option options[] = {
+		OPT_BOOL(0, "auto", &opts.auto_flag,
+			 N_("run tasks based on the state of the repository")),
+		OPT_CALLBACK_F(0, "task", &selected_tasks, N_("task"),
+			       N_("check a specific task"),
+			       PARSE_OPT_NONEG, task_option_parse),
+		OPT_END()
+	};
+	bool is_needed = false;
+
+	argc = parse_options(argc, argv, prefix, options,
+			     builtin_maintenance_is_needed_usage,
+			     PARSE_OPT_STOP_AT_NON_OPTION);
+	if (argc)
+		usage_with_options(builtin_maintenance_is_needed_usage, options);
+
+	gc_config(&cfg);
+	initialize_task_config(&opts, &selected_tasks);
+
+	if (opts.auto_flag) {
+		for (size_t i = 0; i < opts.tasks_nr; i++) {
+			if (tasks[opts.tasks[i]].auto_condition &&
+			    tasks[opts.tasks[i]].auto_condition(&cfg)) {
+				is_needed = true;
+				break;
+			}
+		}
+	} else {
+		/*
+		 * When not using --auto we always require maintenance right now.
+		 *
+		 * TODO: this certainly is too eager, as some maintenance tasks may
+		 * decide to not do anything because the data structures are already
+		 * fully optimized. We may eventually want to extend the auto
+		 * condition to also cover non-auto runs so that we can detect such
+		 * cases.
+		 */
+		is_needed = true;
+	}
+
+	string_list_clear(&selected_tasks, 0);
+	maintenance_run_opts_release(&opts);
+	gc_config_release(&cfg);
+
+	if (is_needed)
+		return 0;
+	return 1;
+}
+
+static const char *const builtin_maintenance_usage[] = {
 	N_("git maintenance <subcommand> [<options>]"),
 	NULL,
 };
@@ -3270,6 +3330,7 @@ int cmd_maintenance(int argc,
 		OPT_SUBCOMMAND("stop", &fn, maintenance_stop),
 		OPT_SUBCOMMAND("register", &fn, maintenance_register),
 		OPT_SUBCOMMAND("unregister", &fn, maintenance_unregister),
+		OPT_SUBCOMMAND("is-needed", &fn, maintenance_is_needed),
 		OPT_END(),
 	};
 
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index ddd273d8dc24fb..a17e2091c2e647 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -49,7 +49,9 @@ test_expect_success 'run [--auto|--quiet]' '
 		git maintenance run --auto 2>/dev/null &&
 	GIT_TRACE2_EVENT="$(pwd)/run-no-quiet.txt" \
 		git maintenance run --no-quiet 2>/dev/null &&
+	git maintenance is-needed &&
 	test_subcommand git gc --quiet --no-detach --skip-foreground-tasks <run-no-auto.txt &&
+	! git maintenance is-needed --auto &&
 	test_subcommand ! git gc --auto --quiet --no-detach --skip-foreground-tasks <run-auto.txt &&
 	test_subcommand git gc --no-quiet --no-detach --skip-foreground-tasks <run-no-quiet.txt
 '
@@ -180,6 +182,11 @@ test_expect_success 'commit-graph auto condition' '
 
 	test_commit first &&
 
+	! git -c maintenance.commit-graph.auto=0 \
+		maintenance is-needed --auto --task=commit-graph &&
+	git -c maintenance.commit-graph.auto=1 \
+		maintenance is-needed --auto --task=commit-graph &&
+
 	GIT_TRACE2_EVENT="$(pwd)/cg-zero-means-no.txt" \
 		git -c maintenance.commit-graph.auto=0 $COMMAND &&
 	GIT_TRACE2_EVENT="$(pwd)/cg-one-satisfied.txt" \
@@ -290,16 +297,23 @@ test_expect_success 'maintenance.loose-objects.auto' '
 		git -c maintenance.loose-objects.auto=1 maintenance \
 		run --auto --task=loose-objects 2>/dev/null &&
 	test_subcommand ! git prune-packed --quiet <trace-lo1.txt &&
+
 	printf data-A | git hash-object -t blob --stdin -w &&
+	! git -c maintenance.loose-objects.auto=2 \
+		maintenance is-needed --auto --task=loose-objects &&
 	GIT_TRACE2_EVENT="$(pwd)/trace-loA" \
 		git -c maintenance.loose-objects.auto=2 \
 		maintenance run --auto --task=loose-objects 2>/dev/null &&
 	test_subcommand ! git prune-packed --quiet <trace-loA &&
+
 	printf data-B | git hash-object -t blob --stdin -w &&
+	git -c maintenance.loose-objects.auto=2 \
+		maintenance is-needed --auto --task=loose-objects &&
 	GIT_TRACE2_EVENT="$(pwd)/trace-loB" \
 		git -c maintenance.loose-objects.auto=2 \
 		maintenance run --auto --task=loose-objects 2>/dev/null &&
 	test_subcommand git prune-packed --quiet <trace-loB &&
+
 	GIT_TRACE2_EVENT="$(pwd)/trace-loC" \
 		git -c maintenance.loose-objects.auto=2 \
 		maintenance run --auto --task=loose-objects 2>/dev/null &&
@@ -421,10 +435,13 @@ run_incremental_repack_and_verify () {
 	test_commit A &&
 	git repack -adk &&
 	git multi-pack-index write &&
+	! git -c maintenance.incremental-repack.auto=1 \
+		maintenance is-needed --auto --task=incremental-repack &&
 	GIT_TRACE2_EVENT="$(pwd)/midx-init.txt" git \
 		-c maintenance.incremental-repack.auto=1 \
 		maintenance run --auto --task=incremental-repack 2>/dev/null &&
 	test_subcommand ! git multi-pack-index write --no-progress <midx-init.txt &&
+
 	test_commit B &&
 	git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
 	HEAD
@@ -434,11 +451,14 @@ run_incremental_repack_and_verify () {
 		-c maintenance.incremental-repack.auto=2 \
 		maintenance run --auto --task=incremental-repack 2>/dev/null &&
 	test_subcommand ! git multi-pack-index write --no-progress <trace-A &&
+
 	test_commit C &&
 	git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
 	HEAD
 	^HEAD~1
 	EOF
+	git -c maintenance.incremental-repack.auto=2 \
+		maintenance is-needed --auto --task=incremental-repack &&
 	GIT_TRACE2_EVENT=$(pwd)/trace-B git \
 		-c maintenance.incremental-repack.auto=2 \
 		maintenance run --auto --task=incremental-repack 2>/dev/null &&
@@ -485,9 +505,15 @@ test_expect_success 'reflog-expire task --auto only packs when exceeding limits'
 	git reflog expire --all --expire=now &&
 	test_commit reflog-one &&
 	test_commit reflog-two &&
+
+	! git -c maintenance.reflog-expire.auto=3 \
+		maintenance is-needed --auto --task=reflog-expire &&
 	GIT_TRACE2_EVENT="$(pwd)/reflog-expire-auto.txt" \
 		git -c maintenance.reflog-expire.auto=3 maintenance run --auto --task=reflog-expire &&
 	test_subcommand ! git reflog expire --all <reflog-expire-auto.txt &&
+
+	git -c maintenance.reflog-expire.auto=2 \
+		maintenance is-needed --auto --task=reflog-expire &&
 	GIT_TRACE2_EVENT="$(pwd)/reflog-expire-auto.txt" \
 		git -c maintenance.reflog-expire.auto=2 maintenance run --auto --task=reflog-expire &&
 	test_subcommand git reflog expire --all <reflog-expire-auto.txt
@@ -514,6 +540,7 @@ test_expect_success 'worktree-prune task --auto only prunes with prunable worktr
 	test_expect_worktree_prune ! git maintenance run --auto --task=worktree-prune &&
 	mkdir .git/worktrees &&
 	: >.git/worktrees/abc &&
+	git maintenance is-needed --auto --task=worktree-prune &&
 	test_expect_worktree_prune git maintenance run --auto --task=worktree-prune
 '
 
@@ -530,22 +557,7 @@ test_expect_success 'worktree-prune task with --auto honors maintenance.worktree
 	test_expect_worktree_prune ! git -c maintenance.worktree-prune.auto=0 maintenance run --auto --task=worktree-prune &&
 	# A positive value should require at least this many prunable worktrees.
 	test_expect_worktree_prune ! git -c maintenance.worktree-prune.auto=4 maintenance run --auto --task=worktree-prune &&
-	test_expect_worktree_prune git -c maintenance.worktree-prune.auto=3 maintenance run --auto --task=worktree-prune
-'
-
-test_expect_success 'worktree-prune task with --auto honors maintenance.worktree-prune.auto' '
-	# A negative value should always prune.
-	test_expect_worktree_prune git -c maintenance.worktree-prune.auto=-1 maintenance run --auto --task=worktree-prune &&
-
-	mkdir .git/worktrees &&
-	: >.git/worktrees/first &&
-	: >.git/worktrees/second &&
-	: >.git/worktrees/third &&
-
-	# Zero should never prune.
-	test_expect_worktree_prune ! git -c maintenance.worktree-prune.auto=0 maintenance run --auto --task=worktree-prune &&
-	# A positive value should require at least this many prunable worktrees.
-	test_expect_worktree_prune ! git -c maintenance.worktree-prune.auto=4 maintenance run --auto --task=worktree-prune &&
+	git -c maintenance.worktree-prune.auto=3 maintenance is-needed --auto --task=worktree-prune &&
 	test_expect_worktree_prune git -c maintenance.worktree-prune.auto=3 maintenance run --auto --task=worktree-prune
 '
 
@@ -554,11 +566,13 @@ test_expect_success 'worktree-prune task honors gc.worktreePruneExpire' '
 	rm -rf worktree &&
 
 	rm -f worktree-prune.txt &&
+	! git -c gc.worktreePruneExpire=1.week.ago maintenance is-needed --auto --task=worktree-prune &&
 	GIT_TRACE2_EVENT="$(pwd)/worktree-prune.txt" git -c gc.worktreePruneExpire=1.week.ago maintenance run --auto --task=worktree-prune &&
 	test_subcommand ! git worktree prune --expire 1.week.ago <worktree-prune.txt &&
 	test_path_is_dir .git/worktrees/worktree &&
 
 	rm -f worktree-prune.txt &&
+	git -c gc.worktreePruneExpire=now maintenance is-needed --auto --task=worktree-prune &&
 	GIT_TRACE2_EVENT="$(pwd)/worktree-prune.txt" git -c gc.worktreePruneExpire=now maintenance run --auto --task=worktree-prune &&
 	test_subcommand git worktree prune --expire now <worktree-prune.txt &&
 	test_path_is_missing .git/worktrees/worktree
@@ -583,10 +597,13 @@ test_expect_success 'rerere-gc task without --auto always collects garbage' '
 
 test_expect_success 'rerere-gc task with --auto only prunes with prunable entries' '
 	test_when_finished "rm -rf .git/rr-cache" &&
+	! git maintenance is-needed --auto --task=rerere-gc &&
 	test_expect_rerere_gc ! git maintenance run --auto --task=rerere-gc &&
 	mkdir .git/rr-cache &&
+	! git maintenance is-needed --auto --task=rerere-gc &&
 	test_expect_rerere_gc ! git maintenance run --auto --task=rerere-gc &&
 	: >.git/rr-cache/entry &&
+	git maintenance is-needed --auto --task=rerere-gc &&
 	test_expect_rerere_gc git maintenance run --auto --task=rerere-gc
 '
 
@@ -594,17 +611,22 @@ test_expect_success 'rerere-gc task with --auto honors maintenance.rerere-gc.aut
 	test_when_finished "rm -rf .git/rr-cache" &&
 
 	# A negative value should always prune.
+	git -c maintenance.rerere-gc.auto=-1 maintenance is-needed --auto --task=rerere-gc &&
 	test_expect_rerere_gc git -c maintenance.rerere-gc.auto=-1 maintenance run --auto --task=rerere-gc &&
 
 	# A positive value prunes when there is at least one entry.
+	! git -c maintenance.rerere-gc.auto=9000 maintenance is-needed --auto --task=rerere-gc &&
 	test_expect_rerere_gc ! git -c maintenance.rerere-gc.auto=9000 maintenance run --auto --task=rerere-gc &&
 	mkdir .git/rr-cache &&
+	! git -c maintenance.rerere-gc.auto=9000 maintenance is-needed --auto --task=rerere-gc &&
 	test_expect_rerere_gc ! git -c maintenance.rerere-gc.auto=9000 maintenance run --auto --task=rerere-gc &&
 	: >.git/rr-cache/entry-1 &&
+	git -c maintenance.rerere-gc.auto=9000 maintenance is-needed --auto --task=rerere-gc &&
 	test_expect_rerere_gc git -c maintenance.rerere-gc.auto=9000 maintenance run --auto --task=rerere-gc &&
 
 	# Zero should never prune.
 	: >.git/rr-cache/entry-1 &&
+	! git -c maintenance.rerere-gc.auto=0 maintenance is-needed --auto --task=rerere-gc &&
 	test_expect_rerere_gc ! git -c maintenance.rerere-gc.auto=0 maintenance run --auto --task=rerere-gc
 '
 

From fa052367ef8f7829996ff15368d63edfff0e40c3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Sun, 9 Nov 2025 17:43:36 +0100
Subject: [PATCH 056/553] diff: disable rename detection with --quiet
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Detecting renames and copies improves diff's output.  This effort is
wasted if we don't show any.  Disable detection in that case.

This actually fixes the exit code when using the options --cached,
--find-copies-harder, --no-ext-diff and --quiet together:
run_diff_index() indirectly calls diff-lib.c::show_modified(), which
queues even non-modified entries using diff_change() because we need
them for copy detection.  diff_change() sets flags.has_changes, though,
which causes diff_can_quit_early() to declare we're done after seeing
only the very first entry -- way too soon.

Using --cached, --find-copies-harder and --quiet together without
--no-ext-diff was not affected even before, as it causes the flag
flags.diff_from_contents to be set, which disables the optimization
in a different way.

Reported-by: D. Ben Knoble <ben.knoble@gmail.com>
Suggested-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c              |  2 ++
 t/t4007-rename-3.sh | 10 ++++++++++
 2 files changed, 12 insertions(+)

diff --git a/diff.c b/diff.c
index 90e8003dd11e4d..e4f8c0dc6c691b 100644
--- a/diff.c
+++ b/diff.c
@@ -4965,6 +4965,8 @@ void diff_setup_done(struct diff_options *options)
 	if (options->flags.quick) {
 		options->output_format = DIFF_FORMAT_NO_OUTPUT;
 		options->flags.exit_with_status = 1;
+		options->detect_rename = 0;
+		options->flags.find_copies_harder = 0;
 	}
 
 	/*
diff --git a/t/t4007-rename-3.sh b/t/t4007-rename-3.sh
index e8faf0dd2ef1c5..3fc81bcd760081 100755
--- a/t/t4007-rename-3.sh
+++ b/t/t4007-rename-3.sh
@@ -41,6 +41,16 @@ test_expect_success 'copy detection, cached' '
 	compare_diff_raw current expected
 '
 
+test_expect_success 'exit code of quiet copy detection' '
+	test_expect_code 1 \
+	git diff --quiet --cached --find-copies-harder $tree
+'
+
+test_expect_success 'exit code of quiet copy detection with --no-ext-diff' '
+	test_expect_code 1 \
+	git diff --quiet --cached --find-copies-harder --no-ext-diff $tree
+'
+
 # In the tree, there is only path0/COPYING.  In the cache, path0 and
 # path1 both have COPYING and the latter is a copy of path0/COPYING.
 # However when we say we care only about path1, we should just see

From 358e94dc7059500af09435112ef1d4e5f7692e52 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Tue, 11 Nov 2025 10:41:20 -0800
Subject: [PATCH 057/553] .gitattributes: remove misspelled no-op whitespace
 attribute

Ever since 14f9e128 (Define the project whitespace policy,
2008-02-10) added the whitespace rules to .gitattributes, we spelled
the most general rule like so:

    * whitespace=!indent,trail,space

in the top-level .gitattributes file.  The intent of this line was
described in the commit log message:

     - Unless otherwise specified, indent with SP that could be
       replaced with HT are not "bad".  But SP before HT in the
       indent is "bad", and trailing whitespaces are "bad".

It clearly wanted to disable indent-with-non-tab, so !indent is most
likely a misspelt form of '-indent'.  Because indent-with-non-tab
has never been enabled by default, by luck this was not causing any
ill effect.

We could either remove "!indent", or spell it "-indent".  The
immediate effect would be the same.  It would only start to make a
difference when/if we enable indent-with-non-tab by default in
future versions of Git.

Let's take the former option to remove "!indent" from the list.  We
would feel the effect first-hand ourselves before anybody else if we
ever decide to change the built-in default whitespace rules, which
would be hidden from us if we decide to rewrite it to "-indent"
instead.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 .gitattributes | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.gitattributes b/.gitattributes
index 158c3d45c4c10c..2a50ebaf2ee149 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -1,4 +1,4 @@
-* whitespace=!indent,trail,space
+* whitespace=trail,space
 *.[ch] whitespace=indent,trail,space diff=cpp
 *.sh whitespace=indent,trail,space text eol=lf
 *.perl text eol=lf diff=perl

From 42ed0468663dd493c0a0e00edc83b668369157d6 Mon Sep 17 00:00:00 2001
From: Jeff King <peff@peff.net>
Date: Tue, 11 Nov 2025 17:36:47 -0500
Subject: [PATCH 058/553] attr: avoid recursion when expanding attribute macros

Given a set of attribute macros like:

   [attr]a1 a2
   [attr]a2 a3
   ...
   [attr]a300000 -text
   file a1

expanding the attributes for "file" requires expanding "a1" to "a2",
"a2" to "a3", and so on until hitting a non-macro expansion ("-text", in
this case). We implement this via recursion: fill_one() calls
macroexpand_one(), which then recurses back to fill_one(). As a result,
very deep macro chains like the one above can run out of stack space and
cause us to segfault.

The required stack space is fairly small; I needed on the order of
200,000 entries to get a segfault on Linux. So it's unlikely anybody
would hit this accidentally, leaving only malicious inputs. There you
can easily construct a repo which will segfault on clone (we look at
attributes during the checkout step, but you'd see the same trying to do
other operations, like diff in a bare repo). It's mostly harmless, since
anybody constructing such a repo is only preventing victims from cloning
their evil garbage, but it could be a nuisance for hosting sites.

One option to prevent this is to limit the depth of recursion we'll
allow. This is conceptually easy to implement, but it raises other
questions: what should the limit be, and do we need a configuration knob
for it?

The recursion here is simple enough that we can avoid those questions by
just converting it to iteration instead. Rather than iterate over the
states of a match_attr in fill_one(), we'll put them all in a queue, and
the expansion of each can add to the queue rather than recursing. Note
that this is a LIFO queue in order to keep the same depth-first order we
did with the recursive implementation. I've avoided using the word
"stack" in the code because the term is already heavily used to refer to
the stack of .gitattributes files that matches the tree structure of the
repository.
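
As a rough illustration of the recursion-to-iteration idea (this is a
toy sketch, not the code in this patch), the pending work can be kept
in a heap-allocated LIFO worklist instead of on the C stack. All
`toy_*` names below are hypothetical, and the sketch assumes each macro
expands to at most one other macro and that the chain has no cycles:

```c
#include <stdlib.h>

/* Hypothetical toy model: each macro expands to at most one other macro. */
struct toy_macro {
	int next;	/* index of the macro this one expands to, or -1 */
};

/* Growable LIFO worklist of macro indexes still to be expanded. */
struct toy_queue {
	int *items;
	size_t nr, alloc;
};

static void toy_queue_push(struct toy_queue *q, int item)
{
	if (q->nr == q->alloc) {
		q->alloc = q->alloc ? 2 * q->alloc : 16;
		q->items = realloc(q->items, q->alloc * sizeof(*q->items));
		if (!q->items)
			abort();
	}
	q->items[q->nr++] = item;
}

/* Store the most recently pushed item in *item; return 0 when empty. */
static int toy_queue_pop(struct toy_queue *q, int *item)
{
	if (!q->nr)
		return 0;
	*item = q->items[--q->nr];
	return 1;
}

/*
 * Count how many macros get expanded starting from "start".  The
 * worklist lives on the heap, so the depth of the chain does not
 * translate into C stack frames.  Assumes the chain is acyclic.
 */
static size_t toy_expand(const struct toy_macro *macros, int start)
{
	struct toy_queue todo = { 0 };
	size_t expanded = 0;
	int cur;

	toy_queue_push(&todo, start);
	while (toy_queue_pop(&todo, &cur)) {
		expanded++;
		if (macros[cur].next >= 0)
			toy_queue_push(&todo, macros[cur].next);
	}
	free(todo.items);
	return expanded;
}
```

Popping from the end of the array is what makes the worklist LIFO, so
newly pushed expansions are handled before older pending items.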

The test uses a limited stack size so we can trigger the problem with a
much smaller input than the one shown above. The value here (3000) is
enough to trigger the issue on my x86_64 Linux machine.

Reported-by: Ben Stav <benstav@miggo.io>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 attr.c                | 50 +++++++++++++++++++++++++++++--------------
 t/t0003-attributes.sh | 20 +++++++++++++++++
 2 files changed, 54 insertions(+), 16 deletions(-)

diff --git a/attr.c b/attr.c
index d1daeb0b4d90a6..4999b7e09da930 100644
--- a/attr.c
+++ b/attr.c
@@ -1064,24 +1064,52 @@ static int path_matches(const char *pathname, int pathlen,
 			      pattern, prefix, pat->patternlen);
 }
 
-static int macroexpand_one(struct all_attrs_item *all_attrs, int nr, int rem);
+struct attr_state_queue {
+	const struct attr_state **items;
+	size_t alloc, nr;
+};
+
+static void attr_state_queue_push(struct attr_state_queue *t,
+				 const struct match_attr *a)
+{
+	for (size_t i = 0; i < a->num_attr; i++) {
+		ALLOC_GROW(t->items, t->nr + 1, t->alloc);
+		t->items[t->nr++] = &a->state[i];
+	}
+}
+
+static const struct attr_state *attr_state_queue_pop(struct attr_state_queue *t)
+{
+	return t->nr ? t->items[--t->nr] : NULL;
+}
+
+static void attr_state_queue_release(struct attr_state_queue *t)
+{
+	free(t->items);
+}
 
 static int fill_one(struct all_attrs_item *all_attrs,
 		    const struct match_attr *a, int rem)
 {
-	size_t i;
+	struct attr_state_queue todo = { 0 };
+	const struct attr_state *state;
 
-	for (i = a->num_attr; rem > 0 && i > 0; i--) {
-		const struct git_attr *attr = a->state[i - 1].attr;
+	attr_state_queue_push(&todo, a);
+	while (rem > 0 && (state = attr_state_queue_pop(&todo))) {
+		const struct git_attr *attr = state->attr;
 		const char **n = &(all_attrs[attr->attr_nr].value);
-		const char *v = a->state[i - 1].setto;
+		const char *v = state->setto;
 
 		if (*n == ATTR__UNKNOWN) {
+			const struct all_attrs_item *item =
+				&all_attrs[attr->attr_nr];
 			*n = v;
 			rem--;
-			rem = macroexpand_one(all_attrs, attr->attr_nr, rem);
+			if (item->macro && item->value == ATTR__TRUE)
+				attr_state_queue_push(&todo, item->macro);
 		}
 	}
+	attr_state_queue_release(&todo);
 	return rem;
 }
 
@@ -1106,16 +1134,6 @@ static int fill(const char *path, int pathlen, int basename_offset,
 	return rem;
 }
 
-static int macroexpand_one(struct all_attrs_item *all_attrs, int nr, int rem)
-{
-	const struct all_attrs_item *item = &all_attrs[nr];
-
-	if (item->macro && item->value == ATTR__TRUE)
-		return fill_one(all_attrs, item->macro, rem);
-	else
-		return rem;
-}
-
 /*
  * Marks the attributes which are macros based on the attribute stack.
  * This prevents having to search through the attribute stack each time
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index 3c98b622f25b76..582e207aa12eb1 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -664,4 +664,24 @@ test_expect_success 'user defined builtin_objectmode values are ignored' '
 	test_cmp expect err
 '
 
+test_expect_success ULIMIT_STACK_SIZE 'deep macro recursion' '
+	n=3000 &&
+	{
+		i=0 &&
+		while test $i -lt $n; do
+			echo "[attr]a$i a$((i+1))" &&
+			i=$((i+1)) ||
+			return 1
+		done &&
+		echo "[attr]a$n -text" &&
+		echo "file a0"
+	} >.gitattributes &&
+	{
+		echo "file: text: unset" &&
+		test_seq -f "file: a%d: set" 0 $n
+	} >expect &&
+	run_with_limited_stack git check-attr -a file >actual &&
+	test_cmp expect actual
+'
+
 test_done

From dee80940b123ad7006e0497391d8c160ae15ba1b Mon Sep 17 00:00:00 2001
From: Julia Evans <julia@jvns.ca>
Date: Wed, 12 Nov 2025 19:53:20 +0000
Subject: [PATCH 059/553] doc: add an explanation of Git's data model

Git very often uses the terms "object", "reference", or "index" in its
documentation.

However, it's hard to find a clear explanation of these terms and how
they relate to each other in the documentation. The closest candidates
currently are:

1. `gitglossary`. This makes a good effort, but it's an alphabetically
   ordered dictionary and a dictionary is not a good way to learn
   concepts. You have to jump around too much and it's not possible to
   present the concepts in the order that they should be explained.
2. `gitcore-tutorial`. This explains how to use the "core" Git commands.
   This is a nice document to have, but it's not necessary to learn how
   `update-index` works to understand Git's data model, and we should
   not be requiring users to learn how to use the "plumbing" commands
   if they want to learn what the term "index" or "object" means.
3. `gitrepository-layout`. This is a great resource, but it includes a
   lot of information about configuration and internal implementation
   details which are not related to the data model. It also does
   not explain how commits work.

The result of this is that Git users (even users who have been using
Git for 15+ years) struggle to read the documentation because they don't
know what the core terms mean, and it's not possible to add links
to help them learn more.

Add an explanation of Git's data model. Some choices I've made in
deciding what "core data model" means:

1. Omit pseudorefs like `FETCH_HEAD`, because it's not clear to me
   if those are intended to be user facing or if they're more like
   internal implementation details.
2. Don't talk about submodules other than by mentioning how they
   relate to trees. This is because Git has a lot of special features,
   and explaining how they all work exhaustively could quickly go
   down a rabbit hole which would make this document less useful for
   understanding Git's core behaviour.
3. Don't discuss the structure of a commit message
   (first line, trailers etc).
4. Don't mention configuration.
5. Don't mention the `.git` directory, to avoid getting too much into
   implementation details.

Signed-off-by: Julia Evans <julia@jvns.ca>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/Makefile              |   1 +
 Documentation/gitdatamodel.adoc     | 307 ++++++++++++++++++++++++++++
 Documentation/glossary-content.adoc |   4 +-
 Documentation/meson.build           |   1 +
 4 files changed, 311 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/gitdatamodel.adoc

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 6fb83d0c6ebf22..5f4acfacbdb6f0 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -52,6 +52,7 @@ MAN7_TXT += gitcli.adoc
 MAN7_TXT += gitcore-tutorial.adoc
 MAN7_TXT += gitcredentials.adoc
 MAN7_TXT += gitcvs-migration.adoc
+MAN7_TXT += gitdatamodel.adoc
 MAN7_TXT += gitdiffcore.adoc
 MAN7_TXT += giteveryday.adoc
 MAN7_TXT += gitfaq.adoc
diff --git a/Documentation/gitdatamodel.adoc b/Documentation/gitdatamodel.adoc
new file mode 100644
index 00000000000000..3614f5960ea143
--- /dev/null
+++ b/Documentation/gitdatamodel.adoc
@@ -0,0 +1,307 @@
+gitdatamodel(7)
+===============
+
+NAME
+----
+gitdatamodel - Git's core data model
+
+SYNOPSIS
+--------
+gitdatamodel
+
+DESCRIPTION
+-----------
+
+It's not necessary to understand Git's data model to use Git, but it's
+very helpful when reading Git's documentation so that you know what it
+means when the documentation says "object", "reference" or "index".
+
+Git's core operations use 4 kinds of data:
+
+1. <<objects,Objects>>: commits, trees, blobs, and tag objects
+2. <<references,References>>: branches, tags,
+   remote-tracking branches, etc
+3. <<index,The index>>, also known as the staging area
+4. <<reflogs,Reflogs>>: logs of changes to references ("ref log")
+
+[[objects]]
+OBJECTS
+-------
+
+All of the commits and files in a Git repository are stored as "Git objects".
+Git objects never change after they're created, and every object has an ID,
+like `1b61de420a21a2f1aaef93e38ecd0e45e8bc9f0a`.
+
+This means that if you have an object's ID, you can always recover its
+exact contents as long as the object hasn't been deleted.
+
+Every object has:
+
+[[object-id]]
+1. an *ID* (aka "object name"), which is a cryptographic hash of its
+  type and contents.
+  It's fast to look up a Git object using its ID.
+  This is usually represented in hexadecimal, like
+  `1b61de420a21a2f1aaef93e38ecd0e45e8bc9f0a`.
+2. a *type*. There are 4 types of objects:
+   <<commit,commits>>, <<tree,trees>>, <<blob,blobs>>,
+   and <<tag-object,tag objects>>.
+3. *contents*. The structure of the contents depends on the type.
+
+Here's how each type of object is structured:
+
+[[commit]]
+commit::
+    A commit contains these required fields
+    (though there are other optional fields):
++
+1. The full directory structure of all the files in that version of the
+   repository and each file's contents, stored as the *<<tree,tree>>* ID
+   of the commit's top-level directory
+2. Its *parent commit ID(s)*. The first commit in a repository has 0 parents,
+  regular commits have 1 parent, merge commits have 2 or more parents
+3. An *author* and the time the commit was authored
+4. A *committer* and the time the commit was committed
+5. A *commit message*
++
+Here's how an example commit is stored:
++
+----
+tree 1b61de420a21a2f1aaef93e38ecd0e45e8bc9f0a
+parent 4ccb6d7b8869a86aae2e84c56523f8705b50c647
+author Maya <maya@example.com> 1759173425 -0400
+committer Maya <maya@example.com> 1759173425 -0400
+
+Add README
+----
++
+Like all other objects, commits can never be changed after they're created.
+For example, "amending" a commit with `git commit --amend` creates a new
+commit with the same parent.
++
+Git does not store the diff for a commit: when you ask Git to show
+the commit with linkgit:git-show[1], it calculates the diff from its
+parent on the fly.
+
+[[tree]]
+tree::
+    A tree is how Git represents a directory.
+    It can contain files or other trees (which are subdirectories).
+    It lists, for each item in the tree:
++
+1. The *filename*, for example `hello.py`
+2. The *file type*, which must be one of these five types:
+  - *regular file*
+  - *executable file*
+  - *symbolic link*
+  - *directory*
+  - *gitlink* (for use with submodules)
+3. The <<object-id,*object ID*>> with the contents of the file, directory,
+   or gitlink.
++
+For example, this is how a tree containing one directory (`src`) and one file
+(`README.md`) is stored:
++
+----
+100644 blob 8728a858d9d21a8c78488c8b4e70e531b659141f README.md
+040000 tree 89b1d2e0495f66d6929f4ff76ff1bb07fc41947d src
+----
+
+NOTE: In the output above, Git displays the file type of each tree entry
+using a format that's loosely modelled on Unix file modes (`100644` is
+"regular file", `100755` is "executable file", `120000` is "symbolic
+link", `040000` is "directory", and `160000` is "gitlink"). It also
+displays the object's type: `blob` for files and symlinks, `tree` for
+directories, and `commit` for gitlinks.
+
+[[blob]]
+blob::
+    A blob object contains a file's contents.
++
+When you make a commit, Git stores the full contents of each file that
+you changed as a blob.
+For example, if you have a commit that changes 2 files in a repository
+with 1000 files, that commit will create 2 new blobs, and use the
+previous blob ID for the other 998 files.
+This means that commits can use relatively little disk space even in a
+very large repository.
+
+[[tag-object]]
+tag object::
+    Tag objects contain these required fields
+    (though there are other optional fields):
++
+1. The *ID* of the object it references
+2. The *type* of the object it references
+3. The *tagger* and tag date
+4. A *tag message*, similar to a commit message
++
+Here's how an example tag object is stored:
++
+----
+object 750b4ead9c87ceb3ddb7a390e6c7074521797fb3
+type commit
+tag v1.0.0
+tagger Maya <maya@example.com> 1759927359 -0400
+
+Release version 1.0.0
+----
+
+NOTE: All of the examples in this section were generated with
+`git cat-file -p <object-id>`.
+
+[[references]]
+REFERENCES
+----------
+
+References are a way to give a name to a commit.
+It's easier to remember "the changes I'm working on are on the `turtle`
+branch" than "the changes are in commit bb69721404348e".
+Git often uses "ref" as shorthand for "reference".
+
+References can either refer to:
+
+1. An object ID, usually a <<commit,commit>> ID
+2. Another reference. This is called a "symbolic reference"
+
+References are stored in a hierarchy, and Git handles references
+differently based on where they are in the hierarchy.
+Most references are under `refs/`. Here are the main types:
+
+[[branch]]
+branches: `refs/heads/<name>`::
+    A branch refers to a commit ID.
+    That commit is the latest commit on the branch.
++
+To get the history of commits on a branch, Git will start at the commit
+ID the branch references, and then look at the commit's parent(s),
+the parent's parent, etc.
+
+[[tag]]
+tags: `refs/tags/<name>`::
+    A tag refers to a commit ID, tag object ID, or other object ID.
+    There are two types of tags:
+    1. "Annotated tags", which reference a <<tag-object,tag object>> ID
+       which contains a tag message
+    2. "Lightweight tags", which reference a commit, blob, or tree ID
+       directly
++
+Even though branches and tags both refer to a commit ID, Git
+treats them very differently.
+Branches are expected to change over time: when you make a commit, Git
+will update your <<HEAD,current branch>> to point to the new commit.
+Tags are usually not changed after they're created.
+
+[[HEAD]]
+HEAD: `HEAD`::
+    `HEAD` is where Git stores your current <<branch,branch>>,
+    if there is a current branch. `HEAD` can either be:
++
+1. A symbolic reference to your current branch, for example `ref:
+   refs/heads/main` if your current branch is `main`.
+2. A direct reference to a commit ID. In this case there is no current branch.
+   This is called "detached HEAD state", see the DETACHED HEAD section
+   of linkgit:git-checkout[1] for more.
+
+[[remote-tracking-branch]]
+remote-tracking branches: `refs/remotes/<remote>/<branch>`::
+    A remote-tracking branch refers to a commit ID.
+    It's how Git stores the last-known state of a branch in a remote
+    repository. `git fetch` updates remote-tracking branches. When
+    `git status` says "you're up to date with origin/main", it's looking at
+    this.
++
+`refs/remotes/<remote>/HEAD` is a symbolic reference to the remote's
+default branch. This is the branch that `git clone` checks out by default.
+
+[[other-refs]]
+Other references::
+    Git tools may create references anywhere under `refs/`.
+    For example, linkgit:git-stash[1], linkgit:git-bisect[1],
+    and linkgit:git-notes[1] all create their own references
+    in `refs/stash`, `refs/bisect`, etc.
+    Third-party Git tools may also create their own references.
++
+Git may also create references other than `HEAD` at the base of the
+hierarchy, like `ORIG_HEAD`.
+
+NOTE: Git may delete objects that aren't "reachable" from any reference
+or <<reflogs,reflog>>.
+An object is "reachable" if we can find it by following tags to whatever
+they tag, commits to their parents or trees, and trees to the trees or
+blobs that they contain.
+For example, if you amend a commit with `git commit --amend`,
+there will no longer be a branch that points at the old commit.
+The old commit is recorded in the current branch's <<reflogs,reflog>>,
+so it is still "reachable", but when the reflog entry expires it may
+become unreachable and get deleted.
+
+Objects that are still reachable from a reference or reflog will never
+be deleted.
+
+[[index]]
+THE INDEX
+---------
+The index, also known as the "staging area", is a list of files and
+the contents of each file, stored as a <<blob,blob>>.
+You can add files to the index or update the contents of a file in the
+index with linkgit:git-add[1]. This is called "staging" the file for commit.
+
+Unlike a <<tree,tree>>, the index is a flat list of files.
+When you commit, Git converts the list of files in the index to a
+directory <<tree,tree>> and uses that tree in the new <<commit,commit>>.
+
+Each index entry has 4 fields:
+
+1. The *file type*, which must be one of:
+  - *regular file*
+  - *executable file*
+  - *symbolic link*
+  - *gitlink* (for use with submodules)
+2. The *<<blob,blob>>* ID of the file,
+   or (rarely) the *<<commit,commit>>* ID of the submodule
+3. The *stage number*, either 0, 1, 2, or 3. This is normally 0, but if
+   there's a merge conflict there can be multiple versions of the same
+   filename in the index.
+4. The *file path*, for example `src/hello.py`
+
+It's extremely uncommon to look at the index directly: normally you'd
+run `git status` to see a list of changes between the index and <<HEAD,HEAD>>.
+But you can use `git ls-files --stage` to see the index.
+Here's the output of `git ls-files --stage` in a repository with 2 files:
+
+----
+100644 8728a858d9d21a8c78488c8b4e70e531b659141f 0 README.md
+100644 665c637a360874ce43bf74018768a96d2d4d219a 0 src/hello.py
+----
+
+[[reflogs]]
+REFLOGS
+-------
+
+Every time a branch, remote-tracking branch, or HEAD is updated, Git
+updates a log called a "reflog" for that <<references,reference>>.
+This means that if you make a mistake and "lose" a commit, you can
+generally recover the commit ID by running `git reflog <reference>`.
+
+A reflog is a list of log entries. Each entry has:
+
+1. The *commit ID*
+2. *Timestamp* when the change was made
+3. *Log message*, for example `pull: Fast-forward`
+
+Reflogs only log changes made in your local repository.
+They are not shared with remotes.
+
+You can view a reflog with `git reflog <reference>`.
+For example, here's the reflog for a `main` branch which has changed twice:
+
+----
+$ git reflog main --date=iso --no-decorate
+750b4ea main@{2025-09-29 15:17:05 -0400}: commit: Add README
+4ccb6d7 main@{2025-09-29 15:16:48 -0400}: commit (initial): Initial commit
+----
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Documentation/glossary-content.adoc b/Documentation/glossary-content.adoc
index e423e4765b71b0..20ba121314b9a4 100644
--- a/Documentation/glossary-content.adoc
+++ b/Documentation/glossary-content.adoc
@@ -297,8 +297,8 @@ This commit is referred to as a "merge commit", or sometimes just a
 	identified by its <<def_object_name,object name>>. The objects usually
 	live in `$GIT_DIR/objects/`.
 
-[[def_object_identifier]]object identifier (oid)::
-	Synonym for <<def_object_name,object name>>.
+[[def_object_identifier]]object identifier, object ID, oid::
+	Synonyms for <<def_object_name,object name>>.
 
 [[def_object_name]]object name::
 	The unique identifier of an <<def_object,object>>.  The
diff --git a/Documentation/meson.build b/Documentation/meson.build
index 44f94cdb7ba672..f3fcf4bc9196aa 100644
--- a/Documentation/meson.build
+++ b/Documentation/meson.build
@@ -192,6 +192,7 @@ manpages = {
   'gitcore-tutorial.adoc' : 7,
   'gitcredentials.adoc' : 7,
   'gitcvs-migration.adoc' : 7,
+  'gitdatamodel.adoc' : 7,
   'gitdiffcore.adoc' : 7,
   'giteveryday.adoc' : 7,
   'gitfaq.adoc' : 7,

From 8d4725e48ef29bd857e21e689411878b6eb4df92 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 12 Nov 2025 14:02:47 -0800
Subject: [PATCH 060/553] whitespace: correct bit assignment comments

A comment in diff.c claimed that bits up to the 12th (counting from the
0th) are whitespace rules, and the 13th through 15th are for
new/old/context, but it was off by one.  Correct the counts, and clarify
in the comment where the whitespace rule bits come from.  Extend bit
assignment comments to cover bits used for color-moved, which
weren't described.

Also update the way these bit constants are defined to use (1 << N)
notation, instead of octal constants, as it tends to make it easier
to notice a breakage like this.
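
As a sanity check of the renumbering, the sketch below spells a few of
the ws.h constants both ways and asserts that they agree.  The OLD_*,
NEW_* and SKETCH_* names and the highest_bit() helper are made up for
this illustration and are not part of the patch.

```c
#include <assert.h>

/*
 * Illustration only: the same ws.h bits spelled in octal (before this
 * patch) and with shifts (after it).  All names here are hypothetical.
 */
enum {
	OLD_WS_BLANK_AT_EOL   = 0100,  NEW_WS_BLANK_AT_EOL   = 1 << 6,
	OLD_WS_TAB_IN_INDENT  = 04000, NEW_WS_TAB_IN_INDENT  = 1 << 11,
	OLD_WS_TAB_WIDTH_MASK = 077,   NEW_WS_TAB_WIDTH_MASK = (1 << 6) - 1,
	OLD_WS_RULE_MASK      = 07777, NEW_WS_RULE_MASK      = (1 << 12) - 1,
	SKETCH_WSEH_NEW       = 1 << 12 /* first bit above the rule mask */
};

static_assert(OLD_WS_BLANK_AT_EOL == NEW_WS_BLANK_AT_EOL, "bit 6");
static_assert(OLD_WS_TAB_IN_INDENT == NEW_WS_TAB_IN_INDENT, "bit 11");
static_assert(OLD_WS_TAB_WIDTH_MASK == NEW_WS_TAB_WIDTH_MASK, "bits 0..5");
static_assert(OLD_WS_RULE_MASK == NEW_WS_RULE_MASK, "bits 0..11");
static_assert((NEW_WS_RULE_MASK & SKETCH_WSEH_NEW) == 0, "no overlap");

/* Return the index of the highest set bit, or -1 if no bit is set. */
static int highest_bit(unsigned v)
{
	int n = -1;

	while (v) {
		v >>= 1;
		n++;
	}
	return n;
}
```

With the shift notation the top bit of each mask can be read off
directly, whereas the octal form needs a helper like highest_bit() (or
counting digits by hand) to find it.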

Sprinkle a few blank lines between logically distinct groups of CPP
macro definitions to make them easier to read.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c |  7 +++++--
 diff.h |  6 +++---
 ws.h   | 25 ++++++++++++++-----------
 3 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/diff.c b/diff.c
index a74e701806be52..74261b332af16c 100644
--- a/diff.c
+++ b/diff.c
@@ -801,16 +801,19 @@ enum diff_symbol {
 	DIFF_SYMBOL_CONTEXT_MARKER,
 	DIFF_SYMBOL_SEPARATOR
 };
+
 /*
  * Flags for content lines:
- * 0..12 are whitespace rules
- * 13-15 are WSEH_NEW | WSEH_OLD | WSEH_CONTEXT
+ * 0..11 are whitespace rules (see ws.h)
+ * 12..14 are WSEH_NEW | WSEH_CONTEXT | WSEH_OLD
  * 16 is marking if the line is blank at EOF
+ * 17..19 are used for color-moved.
  */
 #define DIFF_SYMBOL_CONTENT_BLANK_LINE_EOF	(1<<16)
 #define DIFF_SYMBOL_MOVED_LINE			(1<<17)
 #define DIFF_SYMBOL_MOVED_LINE_ALT		(1<<18)
 #define DIFF_SYMBOL_MOVED_LINE_UNINTERESTING	(1<<19)
+
 #define DIFF_SYMBOL_CONTENT_WS_MASK (WSEH_NEW | WSEH_OLD | WSEH_CONTEXT | WS_RULE_MASK)
 
 /*
diff --git a/diff.h b/diff.h
index 2fa256c3ef0079..cbd355cf50f68e 100644
--- a/diff.h
+++ b/diff.h
@@ -331,9 +331,9 @@ struct diff_options {
 
 	int ita_invisible_in_index;
 /* white-space error highlighting */
-#define WSEH_NEW (1<<12)
-#define WSEH_CONTEXT (1<<13)
-#define WSEH_OLD (1<<14)
+#define WSEH_NEW        (1<<12)
+#define WSEH_CONTEXT    (1<<13)
+#define WSEH_OLD        (1<<14)
 	unsigned ws_error_highlight;
 	const char *prefix;
 	int prefix_length;
diff --git a/ws.h b/ws.h
index 5ba676c5595db5..23708efb7322ed 100644
--- a/ws.h
+++ b/ws.h
@@ -7,19 +7,22 @@ struct strbuf;
 /*
  * whitespace rules.
  * used by both diff and apply
- * last two digits are tab width
+ * last two octal-digits are tab width (we support only up to 63).
  */
-#define WS_BLANK_AT_EOL         0100
-#define WS_SPACE_BEFORE_TAB     0200
-#define WS_INDENT_WITH_NON_TAB  0400
-#define WS_CR_AT_EOL           01000
-#define WS_BLANK_AT_EOF        02000
-#define WS_TAB_IN_INDENT       04000
-#define WS_TRAILING_SPACE      (WS_BLANK_AT_EOL|WS_BLANK_AT_EOF)
+#define WS_BLANK_AT_EOL         (1<<6)
+#define WS_SPACE_BEFORE_TAB     (1<<7)
+#define WS_INDENT_WITH_NON_TAB  (1<<8)
+#define WS_CR_AT_EOL            (1<<9)
+#define WS_BLANK_AT_EOF         (1<<10)
+#define WS_TAB_IN_INDENT        (1<<11)
+
+#define WS_TRAILING_SPACE       (WS_BLANK_AT_EOL|WS_BLANK_AT_EOF)
 #define WS_DEFAULT_RULE (WS_TRAILING_SPACE|WS_SPACE_BEFORE_TAB|8)
-#define WS_TAB_WIDTH_MASK        077
-/* All WS_* -- when extended, adapt diff.c emit_symbol */
-#define WS_RULE_MASK           07777
+#define WS_TAB_WIDTH_MASK       ((1<<6)-1)
+
+/* All WS_* -- when extended, adapt constants defined after diff.c:diff_symbol */
+#define WS_RULE_MASK            ((1<<12)-1)
+
 extern unsigned whitespace_rule_cfg;
 unsigned whitespace_rule(struct index_state *, const char *);
 unsigned parse_whitespace_rule(const char *);

From f83d1afafb8b772397aa3854184c42f7810fa0df Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 12 Nov 2025 14:02:48 -0800
Subject: [PATCH 061/553] diff: emit_line_ws_markup() if/else style fix

Apply the simple rule: if you need {} in one arm of the if/else
if/else... cascade, have {} in all of them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/diff.c b/diff.c
index 74261b332af16c..9a24a0791ceaca 100644
--- a/diff.c
+++ b/diff.c
@@ -1327,14 +1327,14 @@ static void emit_line_ws_markup(struct diff_options *o,
 			ws = NULL;
 	}
 
-	if (!ws && !set_sign)
+	if (!ws && !set_sign) {
 		emit_line_0(o, set, NULL, 0, reset, sign, line, len);
-	else if (!ws) {
+	} else if (!ws) {
 		emit_line_0(o, set_sign, set, !!set_sign, reset, sign, line, len);
-	} else if (blank_at_eof)
+	} else if (blank_at_eof) {
 		/* Blank line at EOF - paint '+' as well */
 		emit_line_0(o, ws, NULL, 0, reset, sign, line, len);
-	else {
+	} else {
 		/* Emit just the prefix, then the rest. */
 		emit_line_0(o, set_sign ? set_sign : set, NULL, !!set_sign, reset,
 			    sign, "", 0);

From fc7abcd9d5460b381701ef43b7f6dafa73962950 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 12 Nov 2025 14:02:49 -0800
Subject: [PATCH 062/553] diff: correct suppress_blank_empty hack

The suppress-blank-empty feature abused the CONTEXT_INCOMPLETE
symbol that was meant to be used only for "\ No newline at the end
of file" code path.

The intent of the feature was to turn a context line we receive from
xdiff machinery (which always uses ' ' for context lines, even an
empty one) and spit it out as a truly empty line.

Perform such a conversion locally, at the place where a line from
xdiff that begins with ' ' is handled for output.  Many checks run
before control reaches that place, each looking at the first byte
of the diff output line to see if it is a context line, and having
to treat '\n' as a special case in all of them is error prone.

In order to catch similar hacks in the future, make sure the code
path that is meant for "\ No newline" case checks the first byte is
indeed a backslash.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c | 27 +++++++++++----------------
 1 file changed, 11 insertions(+), 16 deletions(-)

diff --git a/diff.c b/diff.c
index 9a24a0791ceaca..b9ef8550cc859a 100644
--- a/diff.c
+++ b/diff.c
@@ -1321,6 +1321,11 @@ static void emit_line_ws_markup(struct diff_options *o,
 	const char *ws = NULL;
 	int sign = o->output_indicators[sign_index];
 
+	if (diff_suppress_blank_empty &&
+	    sign_index == OUTPUT_INDICATOR_CONTEXT &&
+	    len == 1 && line[0] == '\n')
+		sign = 0;
+
 	if (o->ws_error_highlight & ws_rule) {
 		ws = diff_get_color_opt(o, DIFF_WHITESPACE);
 		if (!*ws)
@@ -1498,15 +1503,9 @@ static void emit_diff_symbol_from_struct(struct diff_options *o,
 	case DIFF_SYMBOL_WORDS:
 		context = diff_get_color_opt(o, DIFF_CONTEXT);
 		reset = diff_get_color_opt(o, DIFF_RESET);
-		/*
-		 * Skip the prefix character, if any.  With
-		 * diff_suppress_blank_empty, there may be
-		 * none.
-		 */
-		if (line[0] != '\n') {
-			line++;
-			len--;
-		}
+
+		/* Skip the prefix character */
+		line++; len--;
 		emit_line(o, context, reset, line, len);
 		break;
 	case DIFF_SYMBOL_FILEPAIR_PLUS:
@@ -2375,12 +2374,6 @@ static int fn_out_consume(void *priv, char *line, unsigned long len)
 		ecbdata->label_path[0] = ecbdata->label_path[1] = NULL;
 	}
 
-	if (diff_suppress_blank_empty
-	    && len == 2 && line[0] == ' ' && line[1] == '\n') {
-		line[0] = '\n';
-		len = 1;
-	}
-
 	if (line[0] == '@') {
 		if (ecbdata->diff_words)
 			diff_words_flush(ecbdata);
@@ -2431,12 +2424,14 @@ static int fn_out_consume(void *priv, char *line, unsigned long len)
 		ecbdata->lno_in_preimage++;
 		emit_context_line(ecbdata, line + 1, len - 1);
 		break;
-	default:
+	case '\\':
 		/* incomplete line at the end */
 		ecbdata->lno_in_preimage++;
 		emit_diff_symbol(o, DIFF_SYMBOL_CONTEXT_INCOMPLETE,
 				 line, len, 0);
 		break;
+	default:
+		BUG("fn_out_consume: unknown line '%s'", line);
 	}
 	return 0;
 }

From ced0561828271e8fc3fa2699754c5925969111b5 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 12 Nov 2025 14:02:50 -0800
Subject: [PATCH 063/553] diff: keep track of the type of the last line seen

The "\ No newline at the end of the file" can come after any of the
"-" (deleted preimage line), " " (unchanged line), or "+" (added
postimage line).  In later steps in this series, we will start
treating a change that makes a file end in an incomplete line
as a whitespace error, and we would need to know what the previous
line was when we react to "\ No newline" in the diff output.  If
the previous line was a context (i.e., unchanged) line, the file
lacked the final newline before the change, and the change did not
touch that line and left it still incomplete, so we do not want to
warn in such a case.
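
The decision described above can be sketched in isolation as follows.
This is only an illustration under simplifying assumptions: the hunk
body is given as an array of strings, and incomplete_line_added() is a
hypothetical name that does not exist in diff.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: walk the body lines of one hunk, remembering the
 * first byte of the previous line.  Return 1 if a "\ No newline" marker
 * directly follows an added ('+') line, i.e. the change left the
 * postimage ending in an incomplete line it introduced or touched.
 * A marker after ' ' (untouched) or '-' (removed) returns 0.
 */
static int incomplete_line_added(const char *const *lines, size_t nr)
{
	int last_kind = 0;
	size_t i;

	assert(lines != NULL || nr == 0);
	for (i = 0; i < nr; i++) {
		int kind = lines[i][0];

		if (kind == '\\' && last_kind == '+')
			return 1;
		last_kind = kind;
	}
	return 0;
}
```

The point of remembering only the first byte is that the marker line
itself carries no information about which side it applies to; the line
before it does.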

Teach fn_out_consume() function to keep track of what the previous
line was, and prepare an otherwise empty switch statement to let us
react differently to "\ No newline" based on that.

Note that there is an existing curiosity (read: likely to be a bug)
in the code that increments line number in the preimage file every
time it sees a line with "\ No newline" on it, regardless of what
the previous line was.  I left it as-is, because it does not affect
the main theme of this series, and more importantly, I do not think
it matters, as these numbers are used only to compare them with
blank_at_eof_in_{pre,post}image to issue a warning when we see more
empty lines added at the end, but by definition, after we see
"\ No newline at the end of the file" for an added line, we will not
see an added line for the file.

An independent audit to ensure that this curious increment can be
safely removed would make a good #leftoverbits clean-up (we may even
find some code that decrements this counter or over-increments the
other quantity this counter is compared with that compensates the
effect of this curious increment that hides a bug, in which case we
may also need to remove them).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/diff.c b/diff.c
index b9ef8550cc859a..ff8fc91f88d30e 100644
--- a/diff.c
+++ b/diff.c
@@ -601,6 +601,7 @@ struct emit_callback {
 	int blank_at_eof_in_postimage;
 	int lno_in_preimage;
 	int lno_in_postimage;
+	int last_line_kind;
 	const char **label_path;
 	struct diff_words_data *diff_words;
 	struct diff_options *opt;
@@ -2426,6 +2427,15 @@ static int fn_out_consume(void *priv, char *line, unsigned long len)
 		break;
 	case '\\':
 		/* incomplete line at the end */
+		switch (ecbdata->last_line_kind) {
+		case '+':
+		case '-':
+		case ' ':
+			break;
+		default:
+			BUG("fn_out_consume: '\\No newline' after unknown line (%c)",
+			    ecbdata->last_line_kind);
+		}
 		ecbdata->lno_in_preimage++;
 		emit_diff_symbol(o, DIFF_SYMBOL_CONTEXT_INCOMPLETE,
 				 line, len, 0);
@@ -2433,6 +2443,7 @@ static int fn_out_consume(void *priv, char *line, unsigned long len)
 	default:
 		BUG("fn_out_consume: unknown line '%s'", line);
 	}
+	ecbdata->last_line_kind = line[0];
 	return 0;
 }
 

From 29228cbdc5f8a80b1f61c7cc209ba8e3714cc38e Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 12 Nov 2025 14:02:51 -0800
Subject: [PATCH 064/553] diff: refactor output of incomplete line

Create a helper function that reacts to "\ No newline at the end of
file" in preparation for unifying the incomplete line handling in
the code path that handles xdiff output and the code path that
bypasses xdiff and produces a complete-rewrite patch.

Currently the output from the DIFF_SYMBOL_CONTEXT_INCOMPLETE case
still (ab)uses the same code as what is used for context lines, but
that would change in a later step where we introduce support to treat
an incomplete line as a whitespace error.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/diff.c b/diff.c
index ff8fc91f88d30e..7ee86204291161 100644
--- a/diff.c
+++ b/diff.c
@@ -1379,6 +1379,10 @@ static void emit_diff_symbol_from_struct(struct diff_options *o,
 		emit_line(o, "", "", line, len);
 		break;
 	case DIFF_SYMBOL_CONTEXT_INCOMPLETE:
+		set = diff_get_color_opt(o, DIFF_CONTEXT);
+		reset = diff_get_color_opt(o, DIFF_RESET);
+		emit_line(o, set, reset, line, len);
+		break;
 	case DIFF_SYMBOL_CONTEXT_MARKER:
 		context = diff_get_color_opt(o, DIFF_CONTEXT);
 		reset = diff_get_color_opt(o, DIFF_RESET);
@@ -1668,6 +1672,13 @@ static void emit_context_line(struct emit_callback *ecbdata,
 	emit_diff_symbol(ecbdata->opt, DIFF_SYMBOL_CONTEXT, line, len, flags);
 }
 
+static void emit_incomplete_line_marker(struct emit_callback *ecbdata,
+					const char *line, int len)
+{
+	emit_diff_symbol(ecbdata->opt, DIFF_SYMBOL_CONTEXT_INCOMPLETE,
+			 line, len, 0);
+}
+
 static void emit_hunk_header(struct emit_callback *ecbdata,
 			     const char *line, int len)
 {
@@ -2437,8 +2448,7 @@ static int fn_out_consume(void *priv, char *line, unsigned long len)
 			    ecbdata->last_line_kind);
 		}
 		ecbdata->lno_in_preimage++;
-		emit_diff_symbol(o, DIFF_SYMBOL_CONTEXT_INCOMPLETE,
-				 line, len, 0);
+		emit_incomplete_line_marker(ecbdata, line, len);
 		break;
 	default:
 		BUG("fn_out_consume: unknown line '%s'", line);

From 35925f1832ac00a4926623dff061a3b52123470b Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 12 Nov 2025 14:02:52 -0800
Subject: [PATCH 065/553] diff: call emit_callback ecbdata everywhere

Everybody else, except for emit_rewrite_lines(), calls the
emit_callback data ecbdata.  Make sure we call the same thing by
the same name for consistency.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/diff.c b/diff.c
index 7ee86204291161..44b86544b75f97 100644
--- a/diff.c
+++ b/diff.c
@@ -1780,7 +1780,7 @@ static void add_line_count(struct strbuf *out, int count)
 	}
 }
 
-static void emit_rewrite_lines(struct emit_callback *ecb,
+static void emit_rewrite_lines(struct emit_callback *ecbdata,
 			       int prefix, const char *data, int size)
 {
 	const char *endp = NULL;
@@ -1791,17 +1791,17 @@ static void emit_rewrite_lines(struct emit_callback *ecb,
 		endp = memchr(data, '\n', size);
 		len = endp ? (endp - data + 1) : size;
 		if (prefix != '+') {
-			ecb->lno_in_preimage++;
-			emit_del_line(ecb, data, len);
+			ecbdata->lno_in_preimage++;
+			emit_del_line(ecbdata, data, len);
 		} else {
-			ecb->lno_in_postimage++;
-			emit_add_line(ecb, data, len);
+			ecbdata->lno_in_postimage++;
+			emit_add_line(ecbdata, data, len);
 		}
 		size -= len;
 		data += len;
 	}
 	if (!endp)
-		emit_diff_symbol(ecb->opt, DIFF_SYMBOL_NO_LF_EOF, NULL, 0, 0);
+		emit_diff_symbol(ecbdata->opt, DIFF_SYMBOL_NO_LF_EOF, NULL, 0, 0);
 }
 
 static void emit_rewrite_diff(const char *name_a,

From 8d8e3c61874bbcf50d64aff34fb6c533458adf5e Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 12 Nov 2025 14:02:53 -0800
Subject: [PATCH 066/553] diff: update the way rewrite diff handles incomplete
 lines

The diff_symbol based output framework uses one DIFF_SYMBOL_* enum
value per the kind of output lines of "git diff", which corresponds
to one output line from the xdiff machinery used internally.  Most
notably, DIFF_SYMBOL_PLUS and DIFF_SYMBOL_MINUS that correspond to
"+" and "-" lines are designed to always take a complete line, even
if the output from xdiff machinery may produce "\ No newline at the
end of file" immediately after them.

But this is not true in the rewrite-diff codepath, which completely
bypasses the xdiff machinery.  Since the code path feeds the bytes
directly from the payload to the output routines, the output layer
has to deal with an incomplete line with DIFF_SYMBOL_PLUS and
DIFF_SYMBOL_MINUS, which never would see an incomplete line in the
normal code paths.  This lack of final newline is compensated by an
ugly hack for a fabricated DIFF_SYMBOL_NO_LF_EOF token to inject an
extra newline to the output to simulate output coming from the xdiff
machinery.

Revamp the way the complete-rewrite code path feeds the lines to the
output layer by treating the last line of the pre/post image when it
is an incomplete line specially.
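
The line splitting involved can be sketched on its own as below.  It
only counts lines and reports whether the last one is incomplete; the
split_lines() name and its interface are assumptions made for this
illustration and do not appear in diff.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: count the lines in data[0..size) and set
 * *incomplete to 1 when the final line lacks its terminating '\n'.
 * Only the last line can be incomplete, because every earlier line
 * was found by locating its '\n'.
 */
static size_t split_lines(const char *data, size_t size, int *incomplete)
{
	size_t nr = 0;

	assert(incomplete != NULL);
	*incomplete = 0;
	while (size) {
		const char *endp = memchr(data, '\n', size);
		size_t len = endp ? (size_t)(endp - data) + 1 : size;

		if (!endp)
			*incomplete = 1;
		nr++;
		data += len;
		size -= len;
	}
	return nr;
}
```

A caller that learns the last line is incomplete can then hand the
output layer a newline-terminated copy of that line, followed by the
"\ No newline at end of file" marker, which is the shape xdiff output
already has.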

This lets us remove the DIFF_SYMBOL_NO_LF_EOF hack and use the usual
DIFF_SYMBOL_CONTEXT_INCOMPLETE code path, which will later learn how
to handle whitespace errors.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c | 37 ++++++++++++++++++++++---------------
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/diff.c b/diff.c
index 44b86544b75f97..5c606409bb9cbe 100644
--- a/diff.c
+++ b/diff.c
@@ -797,7 +797,6 @@ enum diff_symbol {
 	DIFF_SYMBOL_CONTEXT_INCOMPLETE,
 	DIFF_SYMBOL_PLUS,
 	DIFF_SYMBOL_MINUS,
-	DIFF_SYMBOL_NO_LF_EOF,
 	DIFF_SYMBOL_CONTEXT_FRAGINFO,
 	DIFF_SYMBOL_CONTEXT_MARKER,
 	DIFF_SYMBOL_SEPARATOR
@@ -1352,7 +1351,6 @@ static void emit_line_ws_markup(struct diff_options *o,
 static void emit_diff_symbol_from_struct(struct diff_options *o,
 					 struct emitted_diff_symbol *eds)
 {
-	static const char *nneof = " No newline at end of file\n";
 	const char *context, *reset, *set, *set_sign, *meta, *fraginfo;
 
 	enum diff_symbol s = eds->s;
@@ -1361,13 +1359,6 @@ static void emit_diff_symbol_from_struct(struct diff_options *o,
 	unsigned flags = eds->flags;
 
 	switch (s) {
-	case DIFF_SYMBOL_NO_LF_EOF:
-		context = diff_get_color_opt(o, DIFF_CONTEXT);
-		reset = diff_get_color_opt(o, DIFF_RESET);
-		putc('\n', o->file);
-		emit_line_0(o, context, NULL, 0, reset, '\\',
-			    nneof, strlen(nneof));
-		break;
 	case DIFF_SYMBOL_SUBMODULE_HEADER:
 	case DIFF_SYMBOL_SUBMODULE_ERROR:
 	case DIFF_SYMBOL_SUBMODULE_PIPETHROUGH:
@@ -1786,22 +1777,38 @@ static void emit_rewrite_lines(struct emit_callback *ecbdata,
 	const char *endp = NULL;
 
 	while (0 < size) {
-		int len;
+		int len, plen;
+		char *pdata = NULL;
 
 		endp = memchr(data, '\n', size);
-		len = endp ? (endp - data + 1) : size;
+
+		if (endp) {
+			len = endp - data + 1;
+			plen = len;
+		} else {
+			len = size;
+			plen = len + 1;
+			pdata = xmalloc(plen + 2);
+			memcpy(pdata, data, len);
+			pdata[len] = '\n';
+			pdata[len + 1] = '\0';
+		}
 		if (prefix != '+') {
 			ecbdata->lno_in_preimage++;
-			emit_del_line(ecbdata, data, len);
+			emit_del_line(ecbdata, pdata ? pdata : data, plen);
 		} else {
 			ecbdata->lno_in_postimage++;
-			emit_add_line(ecbdata, data, len);
+			emit_add_line(ecbdata, pdata ? pdata : data, plen);
 		}
+		free(pdata);
 		size -= len;
 		data += len;
 	}
-	if (!endp)
-		emit_diff_symbol(ecbdata->opt, DIFF_SYMBOL_NO_LF_EOF, NULL, 0, 0);
+	if (!endp) {
+		static const char nneof[] = "\\ No newline at end of file\n";
+		ecbdata->last_line_kind = prefix;
+		emit_incomplete_line_marker(ecbdata, nneof, sizeof(nneof) - 1);
+	}
 }
 
 static void emit_rewrite_diff(const char *name_a,

From 3a4eb5ad2e9166255d5921196470710523f24ec4 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 12 Nov 2025 14:02:54 -0800
Subject: [PATCH 067/553] apply: revamp the parsing of incomplete lines

A patch file represents the incomplete line at the end of the file
with two lines, one that is the usual "context" with " " as the
first letter, "added" with "+" as the first letter, or "removed"
with "-" as the first letter that shows the content of the line,
plus an extra "\ No newline at the end of file" line that comes
immediately after it.

Ever since it was written, the "git apply" machinery has parsed
the "\ No newline at the end of file" line
independently, without even knowing what line the incomplete-ness
applies to, simply because it does not even remember what the
previous line was.

This poses a problem if we want to check and warn on an incomplete
line.  Revamp the code that parses a fragment, to actually drop the
'\n' at the end of the incoming patch file that terminates a line,
so that check_whitespace() calls made from the code path actually
see an incomplete line as incomplete.

Note that the result of this parsing is not directly used by the
code path that applies the patch.  apply_one_fragment() function
already checks if each of the patch text it handles is followed by a
line that begins with a backslash to drop the newline at the end of
the current line it is looking at.  In a sense, this patch harmonizes
the behaviour of the parsing side to what is already done in the
application side.
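
The look-ahead this needs can be sketched as a standalone function, as
below.  It mirrors the checks described here (a "\ " prefix and a
minimum length of 12) but is not the code in apply.c: linelen_sketch()
is a simplified stand-in for the linelen() helper there, and
marker_after_line() is a hypothetical name.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for linelen(): bytes up to and including the first '\n'. */
static unsigned long linelen_sketch(const char *buf, unsigned long size)
{
	const char *nl = memchr(buf, '\n', size);

	return nl ? (unsigned long)(nl - buf) + 1 : size;
}

/*
 * Hypothetical helper: buf[0..size) starts at a hunk body line.  Return
 * the length of the "\ ..." marker line that immediately follows it, or
 * 0 if the first line is not a context/added/removed line or no
 * plausible marker follows.
 */
static unsigned long marker_after_line(const char *buf, unsigned long size)
{
	unsigned long len, rest, nextlen;

	assert(buf != NULL && size > 0);
	if (*buf != '\n' && *buf != ' ' && *buf != '+' && *buf != '-')
		return 0;
	len = linelen_sketch(buf, size);
	rest = size - len;
	if (rest < 12 || memcmp(buf + len, "\\ ", 2))
		return 0;
	nextlen = linelen_sketch(buf + len, rest);
	return nextlen < 12 ? 0 : nextlen;
}
```

A non-zero return tells the caller both that the current line is
incomplete and how many extra bytes to skip to get past the marker.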

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 apply.c | 70 ++++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 49 insertions(+), 21 deletions(-)

diff --git a/apply.c b/apply.c
index a2ceb3fb40d3b5..2b0f8bdab55463 100644
--- a/apply.c
+++ b/apply.c
@@ -1670,6 +1670,35 @@ static void check_old_for_crlf(struct patch *patch, const char *line, int len)
 }
 
 
+/*
+ * Just saw a single line in a fragment.  If it is a part of this hunk
+ * that is a context " ", an added "+", or a removed "-" line, it may
+ * be followed by "\\ No newline..." to signal that the last "\n" on
+ * this line needs to be dropped.  Depending on locale settings when
+ * the patch was produced we don't know what this line would exactly
+ * say. The only thing we do know is that it begins with "\ ".
+ * Checking for 12 is just for sanity check; "\ No newline..." would
+ * be at least that long in any l10n.
+ *
+ * Return 0 if the line we saw is not followed by "\ No newline...",
+ * or length of that line.  The caller will use it to skip over the
+ * "\ No newline..." line.
+ */
+static int adjust_incomplete(const char *line, int len,
+			     unsigned long size)
+{
+	int nextlen;
+
+	if (*line != '\n' && *line != ' ' && *line != '+' && *line != '-')
+		return 0;
+	if (size - len < 12 || memcmp(line + len, "\\ ", 2))
+		return 0;
+	nextlen = linelen(line + len, size - len);
+	if (nextlen < 12)
+		return 0;
+	return nextlen;
+}
+
 /*
  * Parse a unified diff. Note that this really needs to parse each
  * fragment separately, since the only way to know the difference
@@ -1684,6 +1713,7 @@ static int parse_fragment(struct apply_state *state,
 {
 	int added, deleted;
 	int len = linelen(line, size), offset;
+	int skip_len = 0;
 	unsigned long oldlines, newlines;
 	unsigned long leading, trailing;
 
@@ -1710,6 +1740,22 @@ static int parse_fragment(struct apply_state *state,
 		len = linelen(line, size);
 		if (!len || line[len-1] != '\n')
 			return -1;
+
+		/*
+		 * For an incomplete line, skip_len counts the bytes
+		 * on "\\ No newline..." marker line that comes next
+		 * to the current line.
+		 *
+		 * Reduce "len" to drop the newline at the end of
+		 * line[], but add one to "skip_len", which will be
+		 * added back to "len" for the next iteration, to
+		 * compensate.
+		 */
+		skip_len = adjust_incomplete(line, len, size);
+		if (skip_len) {
+			len--;
+			skip_len++;
+		}
 		switch (*line) {
 		default:
 			return -1;
@@ -1745,20 +1791,10 @@ static int parse_fragment(struct apply_state *state,
 			newlines--;
 			trailing = 0;
 			break;
-
-		/*
-		 * We allow "\ No newline at end of file". Depending
-		 * on locale settings when the patch was produced we
-		 * don't know what this line looks like. The only
-		 * thing we do know is that it begins with "\ ".
-		 * Checking for 12 is just for sanity check -- any
-		 * l10n of "\ No newline..." is at least that long.
-		 */
-		case '\\':
-			if (len < 12 || memcmp(line, "\\ ", 2))
-				return -1;
-			break;
 		}
+
+		/* eat the "\\ No newline..." as well, if exists */
+		len += skip_len;
 	}
 	if (oldlines || newlines)
 		return -1;
@@ -1768,14 +1804,6 @@ static int parse_fragment(struct apply_state *state,
 	fragment->leading = leading;
 	fragment->trailing = trailing;
 
-	/*
-	 * If a fragment ends with an incomplete line, we failed to include
-	 * it in the above loop because we hit oldlines == newlines == 0
-	 * before seeing it.
-	 */
-	if (12 < size && !memcmp(line, "\\ ", 2))
-		offset += linelen(line, size);
-
 	patch->lines_added += added;
 	patch->lines_deleted += deleted;
 

From a675104c399d242dd3ff5a0823fcd770563cf60f Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 12 Nov 2025 14:02:55 -0800
Subject: [PATCH 068/553] whitespace: allocate a few more bits and define
 WS_INCOMPLETE_LINE

Reserve a few more bits in the diff flags word to be used for future
whitespace rules.  Add WS_INCOMPLETE_LINE without implementing the
behaviour (yet).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config/core.adoc |  2 ++
 diff.c                         | 16 ++++++++--------
 diff.h                         |  6 +++---
 ws.c                           |  6 ++++++
 ws.h                           |  3 ++-
 5 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/Documentation/config/core.adoc b/Documentation/config/core.adoc
index e2de270c869c77..682fb595fb096c 100644
--- a/Documentation/config/core.adoc
+++ b/Documentation/config/core.adoc
@@ -626,6 +626,8 @@ core.whitespace::
   part of the line terminator, i.e. with it, `trailing-space`
   does not trigger if the character before such a carriage-return
   is not a whitespace (not enabled by default).
+* `incomplete-line` treats the last line of a file that is missing the
+  newline at the end as an error (not enabled by default).
 * `tabwidth=<n>` tells how many character positions a tab occupies; this
   is relevant for `indent-with-non-tab` and when Git fixes `tab-in-indent`
   errors. The default tab width is 8. Allowed values are 1 to 63.
diff --git a/diff.c b/diff.c
index 5c606409bb9cbe..1b27b15f846333 100644
--- a/diff.c
+++ b/diff.c
@@ -804,15 +804,15 @@ enum diff_symbol {
 
 /*
  * Flags for content lines:
- * 0..11 are whitespace rules (see ws.h)
- * 12..14 are WSEH_NEW | WSEH_CONTEXT | WSEH_OLD
- * 16 is marking if the line is blank at EOF
- * 17..19 are used for color-moved.
+ * 0..15 are whitespace rules (see ws.h)
+ * 16..18 are WSEH_NEW | WSEH_CONTEXT | WSEH_OLD
+ * 19 is marking if the line is blank at EOF
+ * 20..22 are used for color-moved.
  */
-#define DIFF_SYMBOL_CONTENT_BLANK_LINE_EOF	(1<<16)
-#define DIFF_SYMBOL_MOVED_LINE			(1<<17)
-#define DIFF_SYMBOL_MOVED_LINE_ALT		(1<<18)
-#define DIFF_SYMBOL_MOVED_LINE_UNINTERESTING	(1<<19)
+#define DIFF_SYMBOL_CONTENT_BLANK_LINE_EOF	(1<<19)
+#define DIFF_SYMBOL_MOVED_LINE			(1<<20)
+#define DIFF_SYMBOL_MOVED_LINE_ALT		(1<<21)
+#define DIFF_SYMBOL_MOVED_LINE_UNINTERESTING	(1<<22)
 
 #define DIFF_SYMBOL_CONTENT_WS_MASK (WSEH_NEW | WSEH_OLD | WSEH_CONTEXT | WS_RULE_MASK)
 
diff --git a/diff.h b/diff.h
index cbd355cf50f68e..422658407d4e86 100644
--- a/diff.h
+++ b/diff.h
@@ -331,9 +331,9 @@ struct diff_options {
 
 	int ita_invisible_in_index;
 /* white-space error highlighting */
-#define WSEH_NEW        (1<<12)
-#define WSEH_CONTEXT    (1<<13)
-#define WSEH_OLD        (1<<14)
+#define WSEH_NEW        (1<<16)
+#define WSEH_CONTEXT    (1<<17)
+#define WSEH_OLD        (1<<18)
 	unsigned ws_error_highlight;
 	const char *prefix;
 	int prefix_length;
diff --git a/ws.c b/ws.c
index 70acee3337f241..34a7b4fad2840a 100644
--- a/ws.c
+++ b/ws.c
@@ -26,6 +26,7 @@ static struct whitespace_rule {
 	{ "blank-at-eol", WS_BLANK_AT_EOL, 0 },
 	{ "blank-at-eof", WS_BLANK_AT_EOF, 0 },
 	{ "tab-in-indent", WS_TAB_IN_INDENT, 0, 1 },
+	{ "incomplete-line", WS_INCOMPLETE_LINE, 0, 0 },
 };
 
 unsigned parse_whitespace_rule(const char *string)
@@ -139,6 +140,11 @@ char *whitespace_error_string(unsigned ws)
 			strbuf_addstr(&err, ", ");
 		strbuf_addstr(&err, "tab in indent");
 	}
+	if (ws & WS_INCOMPLETE_LINE) {
+		if (err.len)
+			strbuf_addstr(&err, ", ");
+		strbuf_addstr(&err, "no newline at the end of file");
+	}
 	return strbuf_detach(&err, NULL);
 }
 
diff --git a/ws.h b/ws.h
index 23708efb7322ed..06d5cb73f8f88f 100644
--- a/ws.h
+++ b/ws.h
@@ -15,13 +15,14 @@ struct strbuf;
 #define WS_CR_AT_EOL            (1<<9)
 #define WS_BLANK_AT_EOF         (1<<10)
 #define WS_TAB_IN_INDENT        (1<<11)
+#define WS_INCOMPLETE_LINE      (1<<12)
 
 #define WS_TRAILING_SPACE       (WS_BLANK_AT_EOL|WS_BLANK_AT_EOF)
 #define WS_DEFAULT_RULE (WS_TRAILING_SPACE|WS_SPACE_BEFORE_TAB|8)
 #define WS_TAB_WIDTH_MASK       ((1<<6)-1)
 
 /* All WS_* -- when extended, adapt constants defined after diff.c:diff_symbol */
-#define WS_RULE_MASK            ((1<<12)-1)
+#define WS_RULE_MASK            ((1<<16)-1)
 
 extern unsigned whitespace_rule_cfg;
 unsigned whitespace_rule(struct index_state *, const char *);

From 9fb15a8e1430b77e2cc771e425ce4f0954ce4777 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 12 Nov 2025 14:02:56 -0800
Subject: [PATCH 069/553] apply: check and fix incomplete lines

The final line of a file that lacks the terminating newline at its
end is called an incomplete line.  In general they are frowned upon
for many reasons (imagine concatenating two files with "cat A B" and
what happens when A ends in an incomplete line, for example), and
text-oriented tools often mishandle such a line.

Implement checks in "git apply" for incomplete lines, which are off
by default for backward compatibility's sake, so that "git apply
--whitespace={fix,warn,error}" can notice, warn against, and fix
them.

As one of the new tests shows, if you modify contents on an
incomplete line in the original and leave the resulting line
incomplete, it is still considered a whitespace error, the reasoning
being that "you'd better fix it while at it if you are making a
change on an incomplete line anyway", which may be controversial.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 apply.c                  |  13 ++-
 t/t4124-apply-ws-rule.sh | 187 +++++++++++++++++++++++++++++++++++++++
 ws.c                     |  14 +++
 3 files changed, 213 insertions(+), 1 deletion(-)

diff --git a/apply.c b/apply.c
index 2b0f8bdab55463..c9fb45247d8cd2 100644
--- a/apply.c
+++ b/apply.c
@@ -1640,6 +1640,14 @@ static void record_ws_error(struct apply_state *state,
 	    state->squelch_whitespace_errors < state->whitespace_error)
 		return;
 
+	/*
+	 * line[len] for an incomplete line points at the "\n" at the end
+	 * of patch input line, so "%.*s" would drop the last letter on line;
+	 * compensate for it.
+	 */
+	if (result & WS_INCOMPLETE_LINE)
+		len++;
+
 	err = whitespace_error_string(result);
 	if (state->apply_verbosity > verbosity_silent)
 		fprintf(stderr, "%s:%d: %s.\n%.*s\n",
@@ -1794,7 +1802,10 @@ static int parse_fragment(struct apply_state *state,
 		}
 
 		/* eat the "\\ No newline..." as well, if exists */
-		len += skip_len;
+		if (skip_len) {
+			len += skip_len;
+			state->linenr++;
+		}
 	}
 	if (oldlines || newlines)
 		return -1;
diff --git a/t/t4124-apply-ws-rule.sh b/t/t4124-apply-ws-rule.sh
index 485c7d2d124ade..115a0f857906d4 100755
--- a/t/t4124-apply-ws-rule.sh
+++ b/t/t4124-apply-ws-rule.sh
@@ -556,4 +556,191 @@ test_expect_success 'whitespace check skipped for excluded paths' '
 	git apply --include=used --stat --whitespace=error <patch
 '
 
+test_expect_success 'check incomplete lines (setup)' '
+	rm -f .gitattributes &&
+	git config core.whitespace incomplete-line
+'
+
+test_expect_success 'incomplete context line (not an error)' '
+	(test_write_lines 1 2 3 4 5 && printf 6) >sample-i &&
+	(test_write_lines 1 2 3 0 5 && printf 6) >sample2-i &&
+	cat sample-i >target &&
+	git add target &&
+	cat sample2-i >target &&
+	git diff-files -p target >patch &&
+
+	cat sample-i >target &&
+	git apply --whitespace=error <patch &&
+	test_cmp sample2-i target &&
+
+	cat sample-i >target &&
+	git apply --whitespace=error --check <patch 2>error &&
+	test_cmp sample-i target &&
+	test_must_be_empty error &&
+
+	cat sample2-i >target &&
+	git apply --whitespace=error -R <patch &&
+	test_cmp sample-i target &&
+
+	cat sample2-i >target &&
+	git apply -R --whitespace=error --check <patch 2>error &&
+	test_cmp sample2-i target &&
+	test_must_be_empty error
+'
+
+test_expect_success 'last line made incomplete (error)' '
+	test_write_lines 1 2 3 4 5 6 >sample &&
+	(test_write_lines 1 2 3 4 5 && printf 6) >sample-i &&
+	cat sample >target &&
+	git add target &&
+	cat sample-i >target &&
+	git diff-files -p target >patch &&
+
+	cat sample >target &&
+	test_must_fail git apply --whitespace=error <patch 2>error &&
+	test_grep "no newline" error &&
+
+	cat sample >target &&
+	test_must_fail git apply --whitespace=error --check <patch 2>actual &&
+	test_cmp sample target &&
+	cat >expect <<-\EOF &&
+	<stdin>:10: no newline at the end of file.
+	6
+	error: 1 line adds whitespace errors.
+	EOF
+	test_cmp expect actual &&
+
+	cat sample-i >target &&
+	git apply --whitespace=error -R <patch &&
+	test_cmp sample target &&
+
+	cat sample-i >target &&
+	git apply --whitespace=error --check -R <patch 2>error &&
+	test_cmp sample-i target &&
+	test_must_be_empty error &&
+
+	cat sample >target &&
+	git apply --whitespace=fix <patch &&
+	test_cmp sample target
+'
+
+test_expect_success 'incomplete line removed at the end (not an error)' '
+	(test_write_lines 1 2 3 4 5 && printf 6) >sample-i &&
+	test_write_lines 1 2 3 4 5 6 >sample &&
+	cat sample-i >target &&
+	git add target &&
+	cat sample >target &&
+	git diff-files -p target >patch &&
+
+	cat sample-i >target &&
+	git apply --whitespace=error <patch &&
+	test_cmp sample target &&
+
+	cat sample-i >target &&
+	git apply --whitespace=error --check <patch 2>error &&
+	test_cmp sample-i target &&
+	test_must_be_empty error &&
+
+	cat sample >target &&
+	test_must_fail git apply --whitespace=error -R <patch 2>error &&
+	test_grep "no newline" error &&
+
+	cat sample >target &&
+	test_must_fail git apply --whitespace=error --check -R <patch 2>actual &&
+	test_cmp sample target &&
+	cat >expect <<-\EOF &&
+	<stdin>:9: no newline at the end of file.
+	6
+	error: 1 line adds whitespace errors.
+	EOF
+	test_cmp expect actual &&
+
+	cat sample >target &&
+	git apply --whitespace=fix -R <patch &&
+	test_cmp sample target
+'
+
+test_expect_success 'incomplete line corrected at the end (not an error)' '
+	(test_write_lines 1 2 3 4 5 && printf 6) >sample-i &&
+	test_write_lines 1 2 3 4 5 7 >sample3 &&
+	cat sample-i >target &&
+	git add target &&
+	cat sample3 >target &&
+	git diff-files -p target >patch &&
+
+	cat sample-i >target &&
+	git apply --whitespace=error <patch &&
+	test_cmp sample3 target &&
+
+	cat sample-i >target &&
+	git apply --whitespace=error --check <patch 2>error &&
+	test_cmp sample-i target &&
+	test_must_be_empty error &&
+
+	cat sample3 >target &&
+	test_must_fail git apply --whitespace=error -R <patch 2>error &&
+	test_grep "no newline" error &&
+
+	cat sample3 >target &&
+	test_must_fail git apply --whitespace=error -R --check <patch 2>actual &&
+	test_cmp sample3 target &&
+	cat >expect <<-\EOF &&
+	<stdin>:9: no newline at the end of file.
+	6
+	error: 1 line adds whitespace errors.
+	EOF
+	test_cmp expect actual &&
+
+	cat sample3 >target &&
+	git apply --whitespace=fix -R <patch &&
+	test_cmp sample target
+'
+
+test_expect_success 'incomplete line modified at the end (error)' '
+	(test_write_lines 1 2 3 4 5 && printf 6) >sample-i &&
+	(test_write_lines 1 2 3 4 5 && printf 7) >sample3-i &&
+	test_write_lines 1 2 3 4 5 6 >sample &&
+	test_write_lines 1 2 3 4 5 7 >sample3 &&
+	cat sample-i >target &&
+	git add target &&
+	cat sample3-i >target &&
+	git diff-files -p target >patch &&
+
+	cat sample-i >target &&
+	test_must_fail git apply --whitespace=error <patch 2>error &&
+	test_grep "no newline" error &&
+
+	cat sample-i >target &&
+	test_must_fail git apply --whitespace=error --check <patch 2>actual &&
+	test_cmp sample-i target &&
+	cat >expect <<-\EOF &&
+	<stdin>:11: no newline at the end of file.
+	7
+	error: 1 line adds whitespace errors.
+	EOF
+	test_cmp expect actual &&
+
+	cat sample3-i >target &&
+	test_must_fail git apply --whitespace=error -R <patch 2>error &&
+	test_grep "no newline" error &&
+
+	cat sample3-i >target &&
+	test_must_fail git apply --whitespace=error --check -R <patch 2>actual &&
+	test_cmp sample3-i target &&
+	cat >expect <<-\EOF &&
+	<stdin>:9: no newline at the end of file.
+	6
+	error: 1 line adds whitespace errors.
+	EOF
+	test_cmp expect actual &&
+
+	cat sample-i >target &&
+	git apply --whitespace=fix <patch &&
+	test_cmp sample3 target &&
+
+	cat sample3-i >target &&
+	git apply --whitespace=fix -R <patch &&
+	test_cmp sample target
+'
+
 test_done
diff --git a/ws.c b/ws.c
index 34a7b4fad2840a..6cc2466c0c2c07 100644
--- a/ws.c
+++ b/ws.c
@@ -186,6 +186,9 @@ static unsigned ws_check_emit_1(const char *line, int len, unsigned ws_rule,
 	if (trailing_whitespace == -1)
 		trailing_whitespace = len;
 
+	if (!trailing_newline && (ws_rule & WS_INCOMPLETE_LINE))
+		result |= WS_INCOMPLETE_LINE;
+
 	/* Check indentation */
 	for (i = 0; i < trailing_whitespace; i++) {
 		if (line[i] == ' ')
@@ -297,6 +300,17 @@ void ws_fix_copy(struct strbuf *dst, const char *src, int len, unsigned ws_rule,
 	int last_space_in_indent = -1;
 	int need_fix_leading_space = 0;
 
+	/*
+	 * Remembering that we need to add '\n' at the end
+	 * is sufficient to fix an incomplete line.
+	 */
+	if (ws_rule & WS_INCOMPLETE_LINE) {
+		if (0 < len && src[len - 1] != '\n') {
+			fixed = 1;
+			add_nl_to_tail = 1;
+		}
+	}
+
 	/*
 	 * Strip trailing whitespace
 	 */

From ab2693cb52a40f597d3cc2b6938626d14655f7c0 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 12 Nov 2025 14:02:57 -0800
Subject: [PATCH 070/553] diff: highlight and error out on incomplete lines

Teach "git diff" to highlight the "\ No newline at end of file"
message as a whitespace error when the incomplete-line whitespace
error class is in effect.  Thanks to the previous refactoring of the
complete rewrite code path, we can do this at a single place.

Whitespace errors in the payload have to be annotated in line,
possibly using colors, on the very line that has the problem.  For
an incomplete line, we already have a dedicated line that can serve
as the error message, so paint it as a whitespace error.

Also teach "git diff --check" to notice incomplete lines as
whitespace errors and report them when the incomplete-line
whitespace error class is in effect.
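
The "--check" rule described above can be illustrated with a minimal,
self-contained sketch.  It is not the code in diff.c: the function
name and the array-of-lines input are assumptions made for the
example.  It only shows why remembering the kind of the previous line
is enough to decide whether a "\ No newline" marker is a new error.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not Git's API: count the incomplete lines that
 * a "--check"-like pass would report for one hunk body.  Each entry
 * in lines[] is one diff line, starting with ' ', '-', '+' or '\\'.
 */
static int count_incomplete_additions(const char *const *lines, size_t nr)
{
	int last_line_kind = 0;	/* first byte of the previous line */
	int errors = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		int kind = lines[i][0];

		/*
		 * The "\ No newline at end of file" marker describes
		 * the line just before it, so only a marker that
		 * follows a '+' line is a newly introduced problem.
		 */
		if (kind == '\\' && last_line_kind == '+')
			errors++;
		last_line_kind = kind;
	}
	return errors;
}
```

With this rule, a marker after a '-' or a context line is not
counted, which matches the cases exercised by the new tests in
t4015.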

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c                     | 29 +++++++++++++++--
 t/t4015-diff-whitespace.sh | 67 +++++++++++++++++++++++++++++++++++---
 2 files changed, 90 insertions(+), 6 deletions(-)

diff --git a/diff.c b/diff.c
index 1b27b15f846333..7b7cd50dc24351 100644
--- a/diff.c
+++ b/diff.c
@@ -1370,7 +1370,11 @@ static void emit_diff_symbol_from_struct(struct diff_options *o,
 		emit_line(o, "", "", line, len);
 		break;
 	case DIFF_SYMBOL_CONTEXT_INCOMPLETE:
-		set = diff_get_color_opt(o, DIFF_CONTEXT);
+		if ((flags & WS_INCOMPLETE_LINE) &&
+		    (flags & o->ws_error_highlight))
+			set = diff_get_color_opt(o, DIFF_WHITESPACE);
+		else
+			set = diff_get_color_opt(o, DIFF_CONTEXT);
 		reset = diff_get_color_opt(o, DIFF_RESET);
 		emit_line(o, set, reset, line, len);
 		break;
@@ -1666,8 +1670,14 @@ static void emit_context_line(struct emit_callback *ecbdata,
 static void emit_incomplete_line_marker(struct emit_callback *ecbdata,
 					const char *line, int len)
 {
+	int last_line_kind = ecbdata->last_line_kind;
+	unsigned flags = (last_line_kind == '+'
+			  ? WSEH_NEW
+			  : last_line_kind == '-'
+			  ? WSEH_OLD
+			  : WSEH_CONTEXT) | ecbdata->ws_rule;
 	emit_diff_symbol(ecbdata->opt, DIFF_SYMBOL_CONTEXT_INCOMPLETE,
-			 line, len, 0);
+			 line, len, flags);
 }
 
 static void emit_hunk_header(struct emit_callback *ecbdata,
@@ -3254,6 +3264,7 @@ struct checkdiff_t {
 	struct diff_options *o;
 	unsigned ws_rule;
 	unsigned status;
+	int last_line_kind;
 };
 
 static int is_conflict_marker(const char *line, int marker_size, unsigned long len)
@@ -3292,6 +3303,7 @@ static void checkdiff_consume_hunk(void *priv,
 static int checkdiff_consume(void *priv, char *line, unsigned long len)
 {
 	struct checkdiff_t *data = priv;
+	int last_line_kind;
 	int marker_size = data->conflict_marker_size;
 	const char *ws = diff_get_color(data->o->use_color, DIFF_WHITESPACE);
 	const char *reset = diff_get_color(data->o->use_color, DIFF_RESET);
@@ -3302,6 +3314,8 @@ static int checkdiff_consume(void *priv, char *line, unsigned long len)
 	assert(data->o);
 	line_prefix = diff_line_prefix(data->o);
 
+	last_line_kind = data->last_line_kind;
+	data->last_line_kind = line[0];
 	if (line[0] == '+') {
 		unsigned bad;
 		data->lineno++;
@@ -3324,6 +3338,17 @@ static int checkdiff_consume(void *priv, char *line, unsigned long len)
 			      data->o->file, set, reset, ws);
 	} else if (line[0] == ' ') {
 		data->lineno++;
+	} else if (line[0] == '\\') {
+		/* no newline at the end of the line */
+		if ((data->ws_rule & WS_INCOMPLETE_LINE) &&
+		    (last_line_kind == '+')) {
+			unsigned bad = WS_INCOMPLETE_LINE;
+			data->status |= bad;
+			err = whitespace_error_string(bad);
+			fprintf(data->o->file, "%s%s:%d: %s.\n",
+				line_prefix, data->filename, data->lineno, err);
+			free(err);
+		}
 	}
 	return 0;
 }
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 9de7f73f42b534..3c8eb02e4f3e64 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -43,6 +43,53 @@ do
 	'
 done
 
+test_expect_success "incomplete line in both pre- and post-image context" '
+	(echo foo && echo baz | tr -d "\012") >x &&
+	git add x &&
+	(echo bar && echo baz | tr -d "\012") >x &&
+	git diff x &&
+	git -c core.whitespace=incomplete diff --check x &&
+	git diff -R x &&
+	git -c core.whitespace=incomplete diff -R --check x
+'
+
+test_expect_success "incomplete lines on both pre- and post-image" '
+	# The interpretation taken here is "since you are touching
+	# the line anyway, you would better fix the incomplete line
+	# while you are at it."  but this is debatable.
+	echo foo | tr -d "\012" >x &&
+	git add x &&
+	echo bar | tr -d "\012" >x &&
+	git diff x &&
+	test_must_fail git -c core.whitespace=incomplete diff --check x >error &&
+	test_grep "no newline at the end of file" error &&
+	git diff -R x &&
+	test_must_fail git -c core.whitespace=incomplete diff -R --check x >error &&
+	test_grep "no newline at the end of file" error
+'
+
+test_expect_success "fix incomplete line in pre-image" '
+	echo foo | tr -d "\012" >x &&
+	git add x &&
+	echo bar >x &&
+	git diff x &&
+	git -c core.whitespace=incomplete diff --check x &&
+	git diff -R x &&
+	test_must_fail git -c core.whitespace=incomplete diff -R --check x >error &&
+	test_grep "no newline at the end of file" error
+'
+
+test_expect_success "new incomplete line in post-image" '
+	echo foo >x &&
+	git add x &&
+	echo bar | tr -d "\012" >x &&
+	git diff x &&
+	test_must_fail git -c core.whitespace=incomplete diff --check x >error &&
+	test_grep "no newline at the end of file" error &&
+	git diff -R x &&
+	git -c core.whitespace=incomplete diff -R --check x
+'
+
 test_expect_success "Ray Lehtiniemi's example" '
 	cat <<-\EOF >x &&
 	do {
@@ -1040,7 +1087,8 @@ test_expect_success 'ws-error-highlight test setup' '
 	{
 		echo "0. blank-at-eol " &&
 		echo "1. still-blank-at-eol " &&
-		echo "2. and a new line "
+		echo "2. and a new line " &&
+		printf "3. and more"
 	} >x &&
 	new_hash_x=$(git hash-object x) &&
 	after=$(git rev-parse --short "$new_hash_x") &&
@@ -1050,11 +1098,13 @@ test_expect_success 'ws-error-highlight test setup' '
 	<BOLD>index $before..$after 100644<RESET>
 	<BOLD>--- a/x<RESET>
 	<BOLD>+++ b/x<RESET>
-	<CYAN>@@ -1,2 +1,3 @@<RESET>
+	<CYAN>@@ -1,2 +1,4 @@<RESET>
 	 0. blank-at-eol <RESET>
 	<RED>-<RESET><RED>1. blank-at-eol<RESET><BLUE> <RESET>
 	<GREEN>+<RESET><GREEN>1. still-blank-at-eol<RESET><BLUE> <RESET>
 	<GREEN>+<RESET><GREEN>2. and a new line<RESET><BLUE> <RESET>
+	<GREEN>+<RESET><GREEN>3. and more<RESET>
+	<BLUE>\ No newline at end of file<RESET>
 	EOF
 
 	cat >expect.all <<-EOF &&
@@ -1062,11 +1112,13 @@ test_expect_success 'ws-error-highlight test setup' '
 	<BOLD>index $before..$after 100644<RESET>
 	<BOLD>--- a/x<RESET>
 	<BOLD>+++ b/x<RESET>
-	<CYAN>@@ -1,2 +1,3 @@<RESET>
+	<CYAN>@@ -1,2 +1,4 @@<RESET>
 	 <RESET>0. blank-at-eol<RESET><BLUE> <RESET>
 	<RED>-<RESET><RED>1. blank-at-eol<RESET><BLUE> <RESET>
 	<GREEN>+<RESET><GREEN>1. still-blank-at-eol<RESET><BLUE> <RESET>
 	<GREEN>+<RESET><GREEN>2. and a new line<RESET><BLUE> <RESET>
+	<GREEN>+<RESET><GREEN>3. and more<RESET>
+	<BLUE>\ No newline at end of file<RESET>
 	EOF
 
 	cat >expect.none <<-EOF
@@ -1074,16 +1126,19 @@ test_expect_success 'ws-error-highlight test setup' '
 	<BOLD>index $before..$after 100644<RESET>
 	<BOLD>--- a/x<RESET>
 	<BOLD>+++ b/x<RESET>
-	<CYAN>@@ -1,2 +1,3 @@<RESET>
+	<CYAN>@@ -1,2 +1,4 @@<RESET>
 	 0. blank-at-eol <RESET>
 	<RED>-1. blank-at-eol <RESET>
 	<GREEN>+1. still-blank-at-eol <RESET>
 	<GREEN>+2. and a new line <RESET>
+	<GREEN>+3. and more<RESET>
+	\ No newline at end of file<RESET>
 	EOF
 
 '
 
 test_expect_success 'test --ws-error-highlight option' '
+	git config core.whitespace blank-at-eol,incomplete-line &&
 
 	git diff --color --ws-error-highlight=default,old >current.raw &&
 	test_decode_color <current.raw >current &&
@@ -1100,6 +1155,7 @@ test_expect_success 'test --ws-error-highlight option' '
 '
 
 test_expect_success 'test diff.wsErrorHighlight config' '
+	git config core.whitespace blank-at-eol,incomplete-line &&
 
 	git -c diff.wsErrorHighlight=default,old diff --color >current.raw &&
 	test_decode_color <current.raw >current &&
@@ -1116,6 +1172,7 @@ test_expect_success 'test diff.wsErrorHighlight config' '
 '
 
 test_expect_success 'option overrides diff.wsErrorHighlight' '
+	git config core.whitespace blank-at-eol,incomplete-line &&
 
 	git -c diff.wsErrorHighlight=none \
 		diff --color --ws-error-highlight=default,old >current.raw &&
@@ -1135,6 +1192,8 @@ test_expect_success 'option overrides diff.wsErrorHighlight' '
 '
 
 test_expect_success 'detect moved code, complete file' '
+	git config core.whitespace blank-at-eol &&
+
 	git reset --hard &&
 	cat <<-\EOF >test.c &&
 	#include<stdio.h>

From 51358a1ede7f4b6b50e4e5a86558af5204691fe0 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 12 Nov 2025 14:02:58 -0800
Subject: [PATCH 071/553] attr: enable incomplete-line whitespace error for
 this project

Now that "git diff --check" and "git apply --whitespace=warn/fix"
have learned that an incomplete line is a whitespace error, enable
the check for this project to prevent patches from adding new
incomplete lines to both our code and documentation files.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 .gitattributes | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/.gitattributes b/.gitattributes
index 32583149c2f927..a8e2950a735677 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -1,13 +1,13 @@
 * whitespace=!indent,trail,space
-*.[ch] whitespace=indent,trail,space diff=cpp
-*.sh whitespace=indent,trail,space text eol=lf
+*.[ch] whitespace=indent,trail,space,incomplete diff=cpp
+*.sh whitespace=indent,trail,space,incomplete text eol=lf
 *.perl text eol=lf diff=perl
 *.pl text eof=lf diff=perl
 *.pm text eol=lf diff=perl
 *.py text eol=lf diff=python
 *.bat text eol=crlf
 CODE_OF_CONDUCT.md -whitespace
-/Documentation/**/*.adoc text eol=lf
+/Documentation/**/*.adoc text eol=lf whitespace=trail,space,incomplete
 /command-list.txt text eol=lf
 /GIT-VERSION-GEN text eol=lf
 /mergetools/* text eol=lf

From 4580bcd2354aab9369164d936f7ccaa21fc98c98 Mon Sep 17 00:00:00 2001
From: Koji Nakamaru <koji.nakamaru@gree.net>
Date: Fri, 14 Nov 2025 06:04:30 +0000
Subject: [PATCH 072/553] osxkeychain: avoid incorrectly skipping store
 operation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

git-credential-osxkeychain skips storing a credential if its "get"
action sets "state[]=osxkeychain:seen=1". This behavior was introduced
in e1ab45b2 (osxkeychain: state to skip unnecessary store operations,
2024-05-15), which appeared in v2.46.

However, this state[] persists even if a credential returned by
"git-credential-osxkeychain get" is invalid and a subsequent helper's
"get" operation returns a valid credential. Another subsequent helper
(such as [1]) may expect git-credential-osxkeychain to store the valid
credential, but the "store" operation is incorrectly skipped because it
only checks "state[]=osxkeychain:seen=1".

To solve this issue, "state[]=osxkeychain:seen" needs to contain enough
information to identify whether the current "store" input matches the
output from the previous "get" operation (and not a credential from
another helper).

Set "state[]=osxkeychain:seen" to a value encoding the credential output
by "get", and compare it with a value encoding the credential input by
"store".

[1]: https://github.com/hickford/git-credential-oauth
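
As a rough illustration of the idea (not the helper's actual code),
the sketch below encodes a credential into the state value and skips
"store" only when the echoed state matches the credential about to be
stored.  The struct, the function names and the fixed-size buffer are
assumptions for the example; it uses plain C strings instead of
CoreFoundation types and leaves out the port and path fields.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the credential fields the helper tracks. */
struct cred {
	const char *host;
	const char *username;
	const char *password;
};

/*
 * Encode a credential as "osxkeychain:seen=__host=...__username=...",
 * skipping unset fields.  Returns 0 on success, -1 if buf is too small.
 */
static int encode_seen(const struct cred *c, char *buf, size_t size)
{
	int n = snprintf(buf, size, "osxkeychain:seen=%s%s%s%s%s%s",
			 c->host ? "__host=" : "", c->host ? c->host : "",
			 c->username ? "__username=" : "",
			 c->username ? c->username : "",
			 c->password ? "__password=" : "",
			 c->password ? c->password : "");

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/*
 * Return 1 when "store" may be skipped: the state echoed back by Git
 * must describe exactly the credential that is about to be stored.
 */
static int can_skip_store(const char *state_seen, const struct cred *c)
{
	char buf[1024];

	if (!state_seen || encode_seen(c, buf, sizeof(buf)) < 0)
		return 0;
	return !strcmp(state_seen, buf);
}
```

A credential supplied by another helper encodes to a different
string, so the comparison fails and the "store" operation goes ahead.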

Reported-by: Petter Sælen <petter@saelen.eu>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Koji Nakamaru <koji.nakamaru@gree.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 contrib/credential/osxkeychain/Makefile       |  41 +++++-
 .../osxkeychain/git-credential-osxkeychain.c  | 120 ++++++++++++++----
 contrib/credential/osxkeychain/meson.build    |   1 +
 3 files changed, 132 insertions(+), 30 deletions(-)

diff --git a/contrib/credential/osxkeychain/Makefile b/contrib/credential/osxkeychain/Makefile
index 9680717abe44c6..c68445b82dc3e5 100644
--- a/contrib/credential/osxkeychain/Makefile
+++ b/contrib/credential/osxkeychain/Makefile
@@ -1,21 +1,55 @@
 # The default target of this Makefile is...
 all:: git-credential-osxkeychain
 
+include ../../../config.mak.uname
 -include ../../../config.mak.autogen
 -include ../../../config.mak
 
+ifdef ZLIB_NG
+	BASIC_CFLAGS += -DHAVE_ZLIB_NG
+        ifdef ZLIB_NG_PATH
+		BASIC_CFLAGS += -I$(ZLIB_NG_PATH)/include
+		EXTLIBS += $(call libpath_template,$(ZLIB_NG_PATH)/$(lib))
+        endif
+	EXTLIBS += -lz-ng
+else
+        ifdef ZLIB_PATH
+		BASIC_CFLAGS += -I$(ZLIB_PATH)/include
+		EXTLIBS += $(call libpath_template,$(ZLIB_PATH)/$(lib))
+        endif
+	EXTLIBS += -lz
+endif
+ifndef NO_ICONV
+        ifdef NEEDS_LIBICONV
+                ifdef ICONVDIR
+			BASIC_CFLAGS += -I$(ICONVDIR)/include
+			ICONV_LINK = $(call libpath_template,$(ICONVDIR)/$(lib))
+                else
+			ICONV_LINK =
+                endif
+                ifdef NEEDS_LIBINTL_BEFORE_LIBICONV
+			ICONV_LINK += -lintl
+                endif
+		EXTLIBS += $(ICONV_LINK) -liconv
+        endif
+endif
+ifndef LIBC_CONTAINS_LIBINTL
+	EXTLIBS += -lintl
+endif
+
 prefix ?= /usr/local
 gitexecdir ?= $(prefix)/libexec/git-core
 
 CC ?= gcc
-CFLAGS ?= -g -O2 -Wall
+CFLAGS ?= -g -O2 -Wall -I../../.. $(BASIC_CFLAGS)
+LDFLAGS ?= $(BASIC_LDFLAGS) $(EXTLIBS)
 INSTALL ?= install
 RM ?= rm -f
 
 %.o: %.c
 	$(CC) $(CFLAGS) $(CPPFLAGS) -o $@ -c $<
 
-git-credential-osxkeychain: git-credential-osxkeychain.o
+git-credential-osxkeychain: git-credential-osxkeychain.o ../../../libgit.a
 	$(CC) $(CFLAGS) -o $@ $^ $(LDFLAGS) \
 		-framework Security -framework CoreFoundation
 
@@ -23,6 +57,9 @@ install: git-credential-osxkeychain
 	$(INSTALL) -d -m 755 $(DESTDIR)$(gitexecdir)
 	$(INSTALL) -m 755 $< $(DESTDIR)$(gitexecdir)
 
+../../../libgit.a:
+	cd ../../..; make libgit.a
+
 clean:
 	$(RM) git-credential-osxkeychain git-credential-osxkeychain.o
 
diff --git a/contrib/credential/osxkeychain/git-credential-osxkeychain.c b/contrib/credential/osxkeychain/git-credential-osxkeychain.c
index 611c9798b3ae5c..b18026703418a9 100644
--- a/contrib/credential/osxkeychain/git-credential-osxkeychain.c
+++ b/contrib/credential/osxkeychain/git-credential-osxkeychain.c
@@ -2,6 +2,9 @@
 #include <string.h>
 #include <stdlib.h>
 #include <Security/Security.h>
+#include "git-compat-util.h"
+#include "strbuf.h"
+#include "wrapper.h"
 
 #define ENCODING kCFStringEncodingUTF8
 static CFStringRef protocol; /* Stores constant strings - not memory managed */
@@ -12,7 +15,7 @@ static CFStringRef username;
 static CFDataRef password;
 static CFDataRef password_expiry_utc;
 static CFDataRef oauth_refresh_token;
-static int state_seen;
+static char *state_seen;
 
 static void clear_credential(void)
 {
@@ -48,27 +51,6 @@ static void clear_credential(void)
 
 #define STRING_WITH_LENGTH(s) s, sizeof(s) - 1
 
-__attribute__((format (printf, 1, 2), __noreturn__))
-static void die(const char *err, ...)
-{
-	char msg[4096];
-	va_list params;
-	va_start(params, err);
-	vsnprintf(msg, sizeof(msg), err, params);
-	fprintf(stderr, "%s\n", msg);
-	va_end(params);
-	clear_credential();
-	exit(1);
-}
-
-static void *xmalloc(size_t len)
-{
-	void *ret = malloc(len);
-	if (!ret)
-		die("Out of memory");
-	return ret;
-}
-
 static CFDictionaryRef create_dictionary(CFAllocatorRef allocator, ...)
 {
 	va_list args;
@@ -112,6 +94,66 @@ static void write_item(const char *what, const char *buf, size_t len)
 	putchar('\n');
 }
 
+static void write_item_strbuf(struct strbuf *sb, const char *what, const char *buf, int n)
+{
+	char s[32];
+
+	xsnprintf(s, sizeof(s), "__%s=", what);
+	strbuf_add(sb, s, strlen(s));
+	strbuf_add(sb, buf, n);
+}
+
+static void write_item_strbuf_cfstring(struct strbuf *sb, const char *what, CFStringRef ref)
+{
+	char *buf;
+	int len;
+
+	if (!ref)
+		return;
+	len = CFStringGetMaximumSizeForEncoding(CFStringGetLength(ref), ENCODING) + 1;
+	buf = xmalloc(len);
+	if (CFStringGetCString(ref, buf, len, ENCODING))
+		write_item_strbuf(sb, what, buf, strlen(buf));
+	free(buf);
+}
+
+static void write_item_strbuf_cfnumber(struct strbuf *sb, const char *what, CFNumberRef ref)
+{
+	short n;
+	char buf[32];
+
+	if (!ref)
+		return;
+	if (!CFNumberGetValue(ref, kCFNumberShortType, &n))
+		return;
+	xsnprintf(buf, sizeof(buf), "%d", n);
+	write_item_strbuf(sb, what, buf, strlen(buf));
+}
+
+static void write_item_strbuf_cfdata(struct strbuf *sb, const char *what, CFDataRef ref)
+{
+	char *buf;
+	int len;
+
+	if (!ref)
+		return;
+	buf = (char *)CFDataGetBytePtr(ref);
+	if (!buf || strlen(buf) == 0)
+		return;
+	len = CFDataGetLength(ref);
+	write_item_strbuf(sb, what, buf, len);
+}
+
+static void encode_state_seen(struct strbuf *sb)
+{
+	strbuf_add(sb, "osxkeychain:seen=", strlen("osxkeychain:seen="));
+	write_item_strbuf_cfstring(sb, "host", host);
+	write_item_strbuf_cfnumber(sb, "port", port);
+	write_item_strbuf_cfstring(sb, "path", path);
+	write_item_strbuf_cfstring(sb, "username", username);
+	write_item_strbuf_cfdata(sb, "password", password);
+}
+
 static void find_username_in_item(CFDictionaryRef item)
 {
 	CFStringRef account_ref;
@@ -124,6 +166,7 @@ static void find_username_in_item(CFDictionaryRef item)
 		write_item("username", "", 0);
 		return;
 	}
+	username = CFStringCreateCopy(kCFAllocatorDefault, account_ref);
 
 	username_buf = (char *)CFStringGetCStringPtr(account_ref, ENCODING);
 	if (username_buf)
@@ -163,6 +206,7 @@ static OSStatus find_internet_password(void)
 	}
 
 	data = CFDictionaryGetValue(item, kSecValueData);
+	password = CFDataCreateCopy(kCFAllocatorDefault, data);
 
 	write_item("password",
 		   (const char *)CFDataGetBytePtr(data),
@@ -173,7 +217,14 @@ static OSStatus find_internet_password(void)
 	CFRelease(item);
 
 	write_item("capability[]", "state", strlen("state"));
-	write_item("state[]", "osxkeychain:seen=1", strlen("osxkeychain:seen=1"));
+	{
+		struct strbuf sb;
+
+		strbuf_init(&sb, 1024);
+		encode_state_seen(&sb);
+		write_item("state[]", sb.buf, strlen(sb.buf));
+		strbuf_release(&sb);
+	}
 
 out:
 	CFRelease(attrs);
@@ -288,13 +339,22 @@ static OSStatus add_internet_password(void)
 	CFDictionaryRef attrs;
 	OSStatus result;
 
-	if (state_seen)
-		return errSecSuccess;
-
 	/* Only store complete credentials */
 	if (!protocol || !host || !username || !password)
 		return -1;
 
+	if (state_seen) {
+		struct strbuf sb;
+
+		strbuf_init(&sb, 1024);
+		encode_state_seen(&sb);
+		if (!strcmp(state_seen, sb.buf)) {
+			strbuf_release(&sb);
+			return errSecSuccess;
+		}
+		strbuf_release(&sb);
+	}
+
 	data = CFDataCreateMutableCopy(kCFAllocatorDefault, 0, password);
 	if (password_expiry_utc) {
 		CFDataAppendBytes(data,
@@ -403,8 +463,9 @@ static void read_credential(void)
 							   (UInt8 *)v,
 							   strlen(v));
 		else if (!strcmp(buf, "state[]")) {
-			if (!strcmp(v, "osxkeychain:seen=1"))
-				state_seen = 1;
+			int len = strlen("osxkeychain:seen=");
+			if (!strncmp(v, "osxkeychain:seen=", len))
+				state_seen = xstrdup(v);
 		}
 		/*
 		 * Ignore other lines; we don't know what they mean, but
@@ -443,5 +504,8 @@ int main(int argc, const char **argv)
 
 	clear_credential();
 
+	if (state_seen)
+		free(state_seen);
+
 	return 0;
 }
diff --git a/contrib/credential/osxkeychain/meson.build b/contrib/credential/osxkeychain/meson.build
index 3c7677f736c684..ec91d0c14b0eb1 100644
--- a/contrib/credential/osxkeychain/meson.build
+++ b/contrib/credential/osxkeychain/meson.build
@@ -1,6 +1,7 @@
 executable('git-credential-osxkeychain',
   sources: 'git-credential-osxkeychain.c',
   dependencies: [
+    libgit,
     dependency('CoreFoundation'),
     dependency('Security'),
   ],

From df90eccd931dfd8e6ecbc0c18c5037c85cc115dc Mon Sep 17 00:00:00 2001
From: Kristoffer Haugsbakk <code@khaugsbakk.name>
Date: Fri, 14 Nov 2025 15:04:47 +0100
Subject: [PATCH 073/553] doc: commit: link to git-status(1) on all format
 options
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

`--branch` and `--long` refer to git-status(1) options but they don’t tell us
what `short-format` and `long-format` are, respectively. And `--null`
mentions “status” but does not link to the command.

Refer to git-status(1) on `--branch`, like `--short` does.

`long-format` is the git-status(1) output. So we can just say that
directly.

Replace “status” with a `linkgit` on `--null`.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-commit.adoc | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-commit.adoc b/Documentation/git-commit.adoc
index 54c207ad45eaa2..8329c1034b9b30 100644
--- a/Documentation/git-commit.adoc
+++ b/Documentation/git-commit.adoc
@@ -146,7 +146,8 @@ See linkgit:git-rebase[1] for details.
 	linkgit:git-status[1] for details. Implies `--dry-run`.
 
 `--branch`::
-	Show the branch and tracking info even in short-format.
+	Show the branch and tracking info even in short-format. See
+	linkgit:git-status[1] for details.
 
 `--porcelain`::
 	When doing a dry-run, give the output in a porcelain-ready
@@ -154,12 +155,13 @@ See linkgit:git-rebase[1] for details.
 	`--dry-run`.
 
 `--long`::
-	When doing a dry-run, give the output in the long-format.
-	Implies `--dry-run`.
+	When doing a dry-run, give the output in the long-format. This
+	is the default output of linkgit:git-status[1]. Implies
+	`--dry-run`.
 
 `-z`::
 `--null`::
-	When showing `short` or `porcelain` status output, print the
+	When showing `short` or `porcelain` linkgit:git-status[1] output, print the
 	filename verbatim and terminate the entries with _NUL_, instead of _LF_.
 	If no format is given, implies the `--porcelain` output format.
 	Without the `-z` option, filenames with "unusual" characters are

From 66c78e0653a4e60c625b8400da31da0ba5bd1286 Mon Sep 17 00:00:00 2001
From: "brian m. carlson" <sandals@crustytoothpaste.net>
Date: Sat, 15 Nov 2025 00:58:17 +0000
Subject: [PATCH 074/553] object-file: disallow adding submodules of different
 hash algo

The design of the hash algorithm transition plan is that objects stored
must be entirely in one algorithm since we lack any way to indicate a
mix of algorithms.  This also includes submodules, but we have
traditionally not enforced this, which leads to various problems when
trying to clone or check out the submodule from the remote.

Since this cannot work in the general case, refuse to add a submodule
of a different algorithm to the index.  Add tests to verify that git add
and git submodule add reject such a submodule.

Note that we cannot check this in git fsck because the malformed
submodule is stored in the tree as an object ID which is either
truncated (when a SHA-256 submodule is added to a SHA-1 repository) or
padded with zeros (when a SHA-1 submodule is added to a SHA-256
repository).  We cannot detect even the latter case because someone
could have an actual submodule whose object ID really ends in 24
zeros, which would be a false positive.
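
To make the arithmetic concrete, here is a small sketch of the kind
of heuristic an fsck check would have to rely on.  The macro and
function names are made up for the example and are not part of Git.
A SHA-1 value is 20 bytes and a SHA-256 value is 32 bytes, so the
padding is 12 bytes, or 24 hexadecimal digits.

```c
#include <stddef.h>

#define SHA1_RAWSZ 20
#define SHA256_RAWSZ 32

/*
 * Hypothetical fsck-style heuristic: does a 32-byte object ID look
 * like a 20-byte SHA-1 value padded with zeros?  The tail it inspects
 * is 32 - 20 = 12 bytes, i.e. 24 hexadecimal digits.
 */
static int looks_like_padded_sha1(const unsigned char *oid)
{
	size_t i;

	for (i = SHA1_RAWSZ; i < SHA256_RAWSZ; i++)
		if (oid[i])
			return 0;
	return 1;
}
```

Such a check returns 1 for any genuine SHA-256 object ID that happens
to end in 12 zero bytes, which is why it cannot serve as a reliable
test.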

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 object-file.c              |  6 +++++-
 t/t3700-add.sh             | 25 +++++++++++++++++++++++++
 t/t7400-submodule-basic.sh | 25 +++++++++++++++++++++++++
 3 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/object-file.c b/object-file.c
index 2bc36ab3ee8cbf..6ff6cf75499d30 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1296,7 +1296,11 @@ int index_path(struct index_state *istate, struct object_id *oid,
 		strbuf_release(&sb);
 		break;
 	case S_IFDIR:
-		return repo_resolve_gitlink_ref(istate->repo, path, "HEAD", oid);
+		if (repo_resolve_gitlink_ref(istate->repo, path, "HEAD", oid))
+			return -1;
+		if (&hash_algos[oid->algo] != istate->repo->hash_algo)
+			return error(_("cannot add a submodule of a different hash algorithm"));
+		break;
 	default:
 		return error(_("%s: unsupported file type"), path);
 	}
diff --git a/t/t3700-add.sh b/t/t3700-add.sh
index df580a5806b4f1..9a2c8dbcc23d91 100755
--- a/t/t3700-add.sh
+++ b/t/t3700-add.sh
@@ -541,6 +541,31 @@ test_expect_success 'all statuses changed in folder if . is given' '
 	)
 '
 
+test_expect_success 'cannot add a submodule of a different algorithm' '
+	git init --object-format=sha256 sha256 &&
+	(
+		cd sha256 &&
+		test_commit abc &&
+		git init --object-format=sha1 submodule &&
+		test_commit -C submodule def &&
+		test_must_fail git add submodule 2>err &&
+		test_grep "cannot add a submodule of a different hash algorithm" err &&
+		git ls-files --stage >entries &&
+		test_grep ! ^160000 entries
+	) &&
+	git init --object-format=sha1 sha1 &&
+	(
+		cd sha1 &&
+		test_commit abc &&
+		git init --object-format=sha256 submodule &&
+		test_commit -C submodule def &&
+		test_must_fail git add submodule 2>err &&
+		test_grep "cannot add a submodule of a different hash algorithm" err &&
+		git ls-files --stage >entries &&
+		test_grep ! ^160000 entries
+	)
+'
+
 test_expect_success CASE_INSENSITIVE_FS 'path is case-insensitive' '
 	path="$(pwd)/BLUB" &&
 	touch "$path" &&
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index fd3e7e355e4ffc..e6b551daad7e58 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -407,6 +407,31 @@ test_expect_success 'submodule add in subdirectory with relative path should fai
 	test_grep toplevel output.err
 '
 
+test_expect_success 'submodule add of a different algorithm fails' '
+	git init --object-format=sha256 sha256 &&
+	(
+		cd sha256 &&
+		test_commit abc &&
+		git init --object-format=sha1 submodule &&
+		test_commit -C submodule def &&
+		test_must_fail git submodule add "$submodurl" submodule 2>err &&
+		test_grep "cannot add a submodule of a different hash algorithm" err &&
+		git ls-files --stage >entries &&
+		test_grep ! ^160000 entries
+	) &&
+	git init --object-format=sha1 sha1 &&
+	(
+		cd sha1 &&
+		test_commit abc &&
+		git init --object-format=sha256 submodule &&
+		test_commit -C submodule def &&
+		test_must_fail git submodule add "$submodurl" submodule 2>err &&
+		test_grep "cannot add a submodule of a different hash algorithm" err &&
+		git ls-files --stage >entries &&
+		test_grep ! ^160000 entries
+	)
+'
+
 test_expect_success 'setup - add an example entry to .gitmodules' '
 	git config --file=.gitmodules submodule.example.url git://example.com/init.git
 '

From 6fe288bfbcbbabc3d399dd71f876dccf71affff0 Mon Sep 17 00:00:00 2001
From: Jeff King <peff@peff.net>
Date: Sat, 15 Nov 2025 00:58:18 +0000
Subject: [PATCH 075/553] read-cache: drop submodule check from add_to_cache()

In add_to_cache(), we treat any directories as submodules, and complain
if we can't resolve their HEAD. This call to resolve_gitlink_ref() was
added by f937bc2f86 (add: error appropriately on repository with no
commits, 2019-04-09), with the goal of improving the error message for
empty repositories.

But we already resolve the submodule HEAD in index_path(), which is
where we find the actual oid we're going to use. Resolving it again here
introduces some downsides:

  1. It's more work, since we have to open up the submodule repository's
     files twice.

  2. There are call paths that get to index_path() without going through
     add_to_cache(). For instance, we'd want a similar informative
     message if "git diff empty" finds that it can't resolve the
     submodule's HEAD. (In theory we can also get there through
     update-index, but AFAICT it refuses to consider directories as
     submodules at all, and just complains about them).

  3. The resolution in index_path() catches more errors that we don't
     handle here. In particular, it will validate that the object format
     for the submodule matches that of the superproject. This isn't a
     bug, since our call in add_to_cache() throws away the oid it gets
     without looking at it. But it certainly caused confusion for me
     when looking at where the object-format check should go.

So instead of resolving the submodule HEAD in add_to_cache(), let's just
teach the call in index_path() to actually produce an error message
(which it already does for other cases). That's probably what f937bc2f86
should have done in the first place, and it gives us a single point of
resolution when adding a submodule to the index.

The resulting output is slightly more verbose, as we propagate the error
up the call stack, but I think that's OK (and again, matches many other
errors we get when indexing fails).

I've left the text of the error message as-is, though it is perhaps
overly specific.  There are many reasons that resolving the submodule
HEAD might fail, though outside of corruption or system errors it is
probably most likely that the submodule HEAD is simply on an unborn
branch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 object-file.c  | 2 +-
 read-cache.c   | 3 ---
 t/t3700-add.sh | 1 +
 3 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/object-file.c b/object-file.c
index 6ff6cf75499d30..bb0c77b45d89a5 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1297,7 +1297,7 @@ int index_path(struct index_state *istate, struct object_id *oid,
 		break;
 	case S_IFDIR:
 		if (repo_resolve_gitlink_ref(istate->repo, path, "HEAD", oid))
-			return -1;
+			return error(_("'%s' does not have a commit checked out"), path);
 		if (&hash_algos[oid->algo] != istate->repo->hash_algo)
 			return error(_("cannot add a submodule of a different hash algorithm"));
 		break;
diff --git a/read-cache.c b/read-cache.c
index 06ad74db2286ae..e34c5c56c637fd 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -707,7 +707,6 @@ int add_to_index(struct index_state *istate, const char *path, struct stat *st,
 	int add_option = (ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE|
 			  (intent_only ? ADD_CACHE_NEW_ONLY : 0));
 	unsigned hash_flags = pretend ? 0 : INDEX_WRITE_OBJECT;
-	struct object_id oid;
 
 	if (flags & ADD_CACHE_RENORMALIZE)
 		hash_flags |= INDEX_RENORMALIZE;
@@ -717,8 +716,6 @@ int add_to_index(struct index_state *istate, const char *path, struct stat *st,
 
 	namelen = strlen(path);
 	if (S_ISDIR(st_mode)) {
-		if (repo_resolve_gitlink_ref(the_repository, path, "HEAD", &oid) < 0)
-			return error(_("'%s' does not have a commit checked out"), path);
 		while (namelen && path[namelen-1] == '/')
 			namelen--;
 	}
diff --git a/t/t3700-add.sh b/t/t3700-add.sh
index 9a2c8dbcc23d91..af93e53c12cfe3 100755
--- a/t/t3700-add.sh
+++ b/t/t3700-add.sh
@@ -388,6 +388,7 @@ test_expect_success 'error on a repository with no commits' '
 	test_must_fail git add empty >actual 2>&1 &&
 	cat >expect <<-EOF &&
 	error: '"'empty/'"' does not have a commit checked out
+	error: unable to index file '"'empty/'"'
 	fatal: adding files failed
 	EOF
 	test_cmp expect actual

From 878fef8ebf6cf513842de14284ee58f4d92fcef3 Mon Sep 17 00:00:00 2001
From: Jiang Xin <worldhello.net@gmail.com>
Date: Sat, 15 Nov 2025 08:36:10 -0500
Subject: [PATCH 076/553] t/unit-tests: add UTF-8 width tests for CJK chars

The file "builtin/repo.c" uses utf8_strwidth() to calculate the display
width of UTF-8 characters in a table, but the resulting output is still
misaligned. Add test cases for both utf8_strwidth and utf8_strnwidth to
verify that they correctly compute the display width for UTF-8
characters.

Also update the build configuration in Makefile and meson.build to
include the new test suite in the build process.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Makefile                    |  1 +
 t/meson.build               |  1 +
 t/unit-tests/u-utf8-width.c | 97 +++++++++++++++++++++++++++++++++++++
 3 files changed, 99 insertions(+)
 create mode 100644 t/unit-tests/u-utf8-width.c

diff --git a/Makefile b/Makefile
index 7e0f77e2988e3b..2a675461540c26 100644
--- a/Makefile
+++ b/Makefile
@@ -1525,6 +1525,7 @@ CLAR_TEST_SUITES += u-string-list
 CLAR_TEST_SUITES += u-strvec
 CLAR_TEST_SUITES += u-trailer
 CLAR_TEST_SUITES += u-urlmatch-normalization
+CLAR_TEST_SUITES += u-utf8-width
 CLAR_TEST_PROG = $(UNIT_TEST_BIN)/unit-tests$(X)
 CLAR_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(CLAR_TEST_SUITES))
 CLAR_TEST_OBJS += $(UNIT_TEST_DIR)/clar/clar.o
diff --git a/t/meson.build b/t/meson.build
index a5531df415ffe2..dc43d69636d15c 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -24,6 +24,7 @@ clar_test_suites = [
   'unit-tests/u-strvec.c',
   'unit-tests/u-trailer.c',
   'unit-tests/u-urlmatch-normalization.c',
+  'unit-tests/u-utf8-width.c',
 ]
 
 clar_sources = [
diff --git a/t/unit-tests/u-utf8-width.c b/t/unit-tests/u-utf8-width.c
new file mode 100644
index 00000000000000..3766f19726ed9d
--- /dev/null
+++ b/t/unit-tests/u-utf8-width.c
@@ -0,0 +1,97 @@
+#include "unit-test.h"
+#include "utf8.h"
+#include "strbuf.h"
+
+/*
+ * Test utf8_strnwidth with various Chinese strings
+ * Chinese characters typically have a width of 2 columns when displayed
+ */
+void test_utf8_width__strnwidth_chinese(void)
+{
+	const char *str;
+
+	/* Test basic ASCII - each character should have width 1 */
+	cl_assert_equal_i(5, utf8_strnwidth("Hello", 5, 0));
+	/* skip_ansi = 1 */
+	cl_assert_equal_i(5, utf8_strnwidth("Hello", 5, 1));
+
+	/* Test simple Chinese characters - each should have width 2 */
+	/* "你好" is 6 bytes (3 bytes per char in UTF-8), 4 display columns */
+	cl_assert_equal_i(4, utf8_strnwidth("你好", 6, 0));
+
+	/* Test mixed ASCII and Chinese - ASCII = 1 column, Chinese = 2 columns */
+	/* "H"(1) + "i"(1) + "你"(2) + "好"(2) = 6 */
+	cl_assert_equal_i(6, utf8_strnwidth("Hi你好", 8, 0));
+
+	/* Test longer Chinese string */
+	/* 5 Chinese chars = 10 display columns */
+	cl_assert_equal_i(10, utf8_strnwidth("你好世界！", 15, 0));
+
+	/* Test individual Chinese character width */
+	cl_assert_equal_i(2, utf8_strnwidth("中", 3, 0));
+
+	/* Test empty string */
+	cl_assert_equal_i(0, utf8_strnwidth("", 0, 0));
+
+	/* Test length limiting */
+	str = "你好世界";
+	/* Only first char "你"(2 columns) within 3 bytes */
+	cl_assert_equal_i(2, utf8_strnwidth(str, 3, 0));
+	/* First two chars "你好"(4 columns) in 6 bytes */
+	cl_assert_equal_i(4, utf8_strnwidth(str, 6, 0));
+}
+
+/*
+ * Tests for utf8_strwidth (simpler version without length limit)
+ */
+void test_utf8_width__strwidth_chinese(void)
+{
+	/* Test basic ASCII */
+	cl_assert_equal_i(5, utf8_strwidth("Hello"));
+
+	/* Test Chinese characters */
+	/* 2 Chinese chars = 4 display columns */
+	cl_assert_equal_i(4, utf8_strwidth("你好"));
+
+	/* Test longer Chinese string */
+	/* 5 Chinese chars = 10 display columns */
+	cl_assert_equal_i(10, utf8_strwidth("你好世界！"));
+
+	/* Test mixed ASCII and Chinese */
+	/* 5 ASCII (5 cols) + 2 Chinese (4 cols) = 9 */
+	cl_assert_equal_i(9, utf8_strwidth("Hello世界"));
+	/* 2 ASCII (2 cols) + 2 Chinese (4 cols) + 1 ASCII (1 col) = 7 */
+	cl_assert_equal_i(7, utf8_strwidth("Hi世界!"));
+}
+
+/*
+ * Additional tests with other East Asian characters
+ */
+void test_utf8_width__strnwidth_japanese_korean(void)
+{
+	/* Japanese characters (should also be 2 columns each) */
+	/* 5 Japanese chars x 2 cols each = 10 display columns */
+	cl_assert_equal_i(10, utf8_strnwidth("こんにちは", 15, 0));
+
+	/* Korean characters (should also be 2 columns each) */
+	/* 5 Korean chars x 2 cols each = 10 display columns */
+	cl_assert_equal_i(10, utf8_strnwidth("안녕하세요", 15, 0));
+}
+
+/*
+ * Test utf8_strnwidth with CJK strings and ANSI sequences
+ */
+void test_utf8_width__strnwidth_cjk_with_ansi(void)
+{
+	/* Test CJK with ANSI sequences */
+	const char *ansi_test = "\033[1m你好\033[0m";
+	int width = utf8_strnwidth(ansi_test, strlen(ansi_test), 1);
+	/* Should skip ANSI sequences and count "你好" as 4 columns */
+	cl_assert_equal_i(4, width);
+
+	/* Test mixed ASCII, CJK, and ANSI */
+	ansi_test = "Hello\033[32m世界\033[0m!";
+	width = utf8_strnwidth(ansi_test, strlen(ansi_test), 1);
+	/* "Hello"(5) + "世界"(4) + "!"(1) = 10 */
+	cl_assert_equal_i(10, width);
+}

From 7a03a10a3a746dd8565a3a0e6126f60523b41738 Mon Sep 17 00:00:00 2001
From: Jiang Xin <worldhello.net@gmail.com>
Date: Sat, 15 Nov 2025 08:36:11 -0500
Subject: [PATCH 077/553] builtin/repo: fix table alignment for UTF-8
 characters
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The output table from "git repo structure" is misaligned when displaying
multi-byte UTF-8 characters (i.e., non-ASCII glyphs). For example:

    | 仓库结构   | 值  |
    | -------------- | ---- |
    | * 引用       |      |
    |   * 计数     |   67 |

The previous implementation used simple width formatting with printf()
which didn't properly handle multi-byte UTF-8 characters, causing
misaligned table columns when displaying repository structure
information.

This change modifies the stats_table_print_structure function to use
strbuf_utf8_align() instead of basic printf width specifiers. This
ensures proper column alignment regardless of the character encoding of
the content being displayed.
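
To illustrate the difference between byte-based and column-based
padding, here is a minimal standalone sketch. It is not the code in
this patch: toy_display_width(), toy_align_left() and
printf_align_left() are hypothetical helpers, and the width rule
(every 3-byte UTF-8 sequence is treated as a double-width glyph) is a
simplifying assumption that is much cruder than utf8_strwidth().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Toy display-width estimate (NOT Git's utf8_strwidth()).
 * Assumption: a 3-byte UTF-8 sequence is a double-width CJK glyph;
 * any other lead byte takes one column; continuation bytes take none.
 */
static int toy_display_width(const char *s)
{
	int width = 0;

	assert(s);
	for (; *s; s++) {
		unsigned char c = (unsigned char)*s;

		if ((c & 0xc0) == 0x80)
			continue;	/* continuation byte, no width */
		if ((c & 0xf0) == 0xe0)
			width += 2;	/* assumed wide glyph */
		else
			width += 1;
	}
	return width;
}

/* Pad by display columns, as a width-aware aligner would. */
static int toy_align_left(char *out, size_t outsz, int columns,
			  const char *s)
{
	int pad = columns - toy_display_width(s);

	if (pad < 0)
		pad = 0;	/* never truncate, only pad */
	return snprintf(out, outsz, "%s%*s", s, pad, "");
}

/* Pad with "%-*s", whose field width counts bytes, not columns. */
static int printf_align_left(char *out, size_t outsz, int columns,
			     const char *s)
{
	return snprintf(out, outsz, "%-*s", columns, s);
}
```

With a 10-column field and a string of two such glyphs (6 bytes, 4
columns), printf_align_left() appends only 4 spaces because it sees 6
bytes already, so the cell ends up 8 columns wide, whereas
toy_align_left() appends 6 spaces and fills all 10 columns.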

Also add test cases for strbuf_utf8_align(), a function newly introduced
in "builtin/repo.c".

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/repo.c              | 21 +++++++++++++++++----
 t/unit-tests/u-utf8-width.c | 37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+), 4 deletions(-)

diff --git a/builtin/repo.c b/builtin/repo.c
index 9d4749f79befa8..e3adb353a24ae8 100644
--- a/builtin/repo.c
+++ b/builtin/repo.c
@@ -292,14 +292,20 @@ static void stats_table_print_structure(const struct stats_table *table)
 	int name_col_width = utf8_strwidth(name_col_title);
 	int value_col_width = utf8_strwidth(value_col_title);
 	struct string_list_item *item;
+	struct strbuf buf = STRBUF_INIT;
 
 	if (table->name_col_width > name_col_width)
 		name_col_width = table->name_col_width;
 	if (table->value_col_width > value_col_width)
 		value_col_width = table->value_col_width;
 
-	printf("| %-*s | %-*s |\n", name_col_width, name_col_title,
-	       value_col_width, value_col_title);
+	strbuf_addstr(&buf, "| ");
+	strbuf_utf8_align(&buf, ALIGN_LEFT, name_col_width, name_col_title);
+	strbuf_addstr(&buf, " | ");
+	strbuf_utf8_align(&buf, ALIGN_LEFT, value_col_width, value_col_title);
+	strbuf_addstr(&buf, " |");
+	printf("%s\n", buf.buf);
+
 	printf("| ");
 	for (int i = 0; i < name_col_width; i++)
 		putchar('-');
@@ -317,9 +323,16 @@ static void stats_table_print_structure(const struct stats_table *table)
 			value = entry->value;
 		}
 
-		printf("| %-*s | %*s |\n", name_col_width, item->string,
-		       value_col_width, value);
+		strbuf_reset(&buf);
+		strbuf_addstr(&buf, "| ");
+		strbuf_utf8_align(&buf, ALIGN_LEFT, name_col_width, item->string);
+		strbuf_addstr(&buf, " | ");
+		strbuf_utf8_align(&buf, ALIGN_RIGHT, value_col_width, value);
+		strbuf_addstr(&buf, " |");
+		printf("%s\n", buf.buf);
 	}
+
+	strbuf_release(&buf);
 }
 
 static void stats_table_clear(struct stats_table *table)
diff --git a/t/unit-tests/u-utf8-width.c b/t/unit-tests/u-utf8-width.c
index 3766f19726ed9d..86e09c3574a331 100644
--- a/t/unit-tests/u-utf8-width.c
+++ b/t/unit-tests/u-utf8-width.c
@@ -95,3 +95,40 @@ void test_utf8_width__strnwidth_cjk_with_ansi(void)
 	/* "Hello"(5) + "世界"(4) + "!"(1) = 10 */
 	cl_assert_equal_i(10, width);
 }
+
+/*
+ * Test the strbuf_utf8_align function with CJK characters
+ */
+void test_utf8_width__strbuf_utf8_align(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+
+	/* Test left alignment with CJK */
+	strbuf_utf8_align(&buf, ALIGN_LEFT, 10, "你好");
+	/* Since "你好" is 4 display columns, we need 6 more spaces to reach 10 */
+	cl_assert_equal_s("你好      ", buf.buf);
+	strbuf_reset(&buf);
+
+	/* Test right alignment with CJK */
+	strbuf_utf8_align(&buf, ALIGN_RIGHT, 8, "世界");
+	/* "世界" is 4 display columns, so we need 4 leading spaces */
+	cl_assert_equal_s("    世界", buf.buf);
+	strbuf_reset(&buf);
+
+	/* Test center alignment with CJK */
+	strbuf_utf8_align(&buf, ALIGN_MIDDLE, 10, "中");
+	/* "中" is 2 display columns, so (10-2)/2 = 4 spaces on left, 4 on right */
+	cl_assert_equal_s("    中    ", buf.buf);
+	strbuf_reset(&buf);
+
+	strbuf_utf8_align(&buf, ALIGN_MIDDLE, 5, "中");
+	/* "中" is 2 display columns, so (5-2)/2 = 1 space on left, 2 on right */
+	cl_assert_equal_s(" 中  ", buf.buf);
+	strbuf_reset(&buf);
+
+	/* Test alignment that is smaller than string width */
+	strbuf_utf8_align(&buf, ALIGN_LEFT, 2, "你好");
+	/* Since "你好" is 4 display columns, it should not be truncated */
+	cl_assert_equal_s("你好", buf.buf);
+	strbuf_release(&buf);
+}

From 388517c14ce62e1c52b091af862bbaf28dbabb7a Mon Sep 17 00:00:00 2001
From: Christian Couder <christian.couder@gmail.com>
Date: Mon, 17 Nov 2025 05:34:48 +0100
Subject: [PATCH 078/553] fast-import: refactor finalize_commit_buffer()

In a following commit we are going to finalize commit buffers with or
without signatures in order to check the signatures and possibly drop
them.

To do so easily and without duplication, let's refactor the current
code that finalizes commit buffers into a new finalize_commit_buffer()
function.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/fast-import.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 54d3e592c6e460..493de57ef67bfb 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -2815,6 +2815,18 @@ static void import_one_signature(struct signature_data *sig_sha1,
 		die(_("parse_one_signature() returned unknown hash algo"));
 }
 
+static void finalize_commit_buffer(struct strbuf *new_data,
+				   struct signature_data *sig_sha1,
+				   struct signature_data *sig_sha256,
+				   struct strbuf *msg)
+{
+	add_gpgsig_to_commit(new_data, "gpgsig ", sig_sha1);
+	add_gpgsig_to_commit(new_data, "gpgsig-sha256 ", sig_sha256);
+
+	strbuf_addch(new_data, '\n');
+	strbuf_addbuf(new_data, msg);
+}
+
 static void parse_new_commit(const char *arg)
 {
 	static struct strbuf msg = STRBUF_INIT;
@@ -2950,11 +2962,8 @@ static void parse_new_commit(const char *arg)
 			"encoding %s\n",
 			encoding);
 
-	add_gpgsig_to_commit(&new_data, "gpgsig ", &sig_sha1);
-	add_gpgsig_to_commit(&new_data, "gpgsig-sha256 ", &sig_sha256);
+	finalize_commit_buffer(&new_data, &sig_sha1, &sig_sha256, &msg);
 
-	strbuf_addch(&new_data, '\n');
-	strbuf_addbuf(&new_data, &msg);
 	free(author);
 	free(committer);
 	free(encoding);

From cb034c020aba54360e7c19faf82021399bf131e7 Mon Sep 17 00:00:00 2001
From: Christian Couder <christian.couder@gmail.com>
Date: Mon, 17 Nov 2025 05:34:49 +0100
Subject: [PATCH 079/553] commit: refactor verify_commit_buffer()

In a following commit, we are going to check commit signatures, but we
won't have a commit yet, only a commit buffer, and we are going to
discard this commit buffer if the signature is invalid. So it would be
wasteful to create a commit that we might discard, just to be able to
check a commit signature.

It would be simpler to be able to check commit signatures using only a
commit buffer instead of a commit.

To be able to do that, let's extract some code from the
check_commit_signature() function into a new verify_commit_buffer()
function, and then let's make check_commit_signature() call
verify_commit_buffer().

Note that this doesn't fundamentally change how
check_commit_signature() works. It used to call parse_signed_commit()
which calls repo_get_commit_buffer(), parse_buffer_signed_by_header()
and repo_unuse_commit_buffer(). Now these 3 functions are called
directly by verify_commit_buffer().
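
The shape of this refactoring can be sketched in isolation as follows.
This is only an illustration under stated assumptions: struct
toy_commit, toy_verify_buffer() and toy_check_commit() are hypothetical
names, and the "verification" merely looks for a "gpgsig " header line
instead of checking a real signature.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an object that owns a commit buffer. */
struct toy_commit {
	const char *buffer;
	size_t size;
	int buffer_users;	/* models get/unuse bookkeeping */
};

/*
 * Core check that needs only the raw bytes. Returns 0 if a header
 * line starts with "gpgsig ", -1 otherwise. A blank line ends the
 * header, so the message body is never inspected.
 */
static int toy_verify_buffer(const char *buffer, size_t size)
{
	static const char key[] = "gpgsig ";
	const size_t keylen = sizeof(key) - 1;
	size_t pos = 0;

	while (pos < size) {
		const char *line = buffer + pos;
		const char *eol = memchr(line, '\n', size - pos);
		size_t linelen = eol ? (size_t)(eol - line) : size - pos;

		if (!linelen)
			break;
		if (linelen >= keylen && !memcmp(line, key, keylen))
			return 0;
		pos += linelen + 1;
	}
	return -1;
}

/* Thin wrapper: acquire the buffer, delegate, release the buffer. */
static int toy_check_commit(struct toy_commit *c)
{
	int ret;

	assert(c && c->buffer);
	c->buffer_users++;
	ret = toy_verify_buffer(c->buffer, c->size);
	c->buffer_users--;
	return ret;
}
```

A caller that only has bytes in hand can use toy_verify_buffer()
directly, while existing callers keep going through the object-based
wrapper.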

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 commit.c | 17 +++++++++++++++--
 commit.h |  7 +++++++
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/commit.c b/commit.c
index 16d91b2bfcf291..709c9eed58a790 100644
--- a/commit.c
+++ b/commit.c
@@ -1315,7 +1315,8 @@ static void handle_signed_tag(const struct commit *parent, struct commit_extra_h
 	free(buf);
 }
 
-int check_commit_signature(const struct commit *commit, struct signature_check *sigc)
+int verify_commit_buffer(const char *buffer, size_t size,
+			 struct signature_check *sigc)
 {
 	struct strbuf payload = STRBUF_INIT;
 	struct strbuf signature = STRBUF_INIT;
@@ -1323,7 +1324,8 @@ int check_commit_signature(const struct commit *commit, struct signature_check *
 
 	sigc->result = 'N';
 
-	if (parse_signed_commit(commit, &payload, &signature, the_hash_algo) <= 0)
+	if (parse_buffer_signed_by_header(buffer, size, &payload,
+					  &signature, the_hash_algo) <= 0)
 		goto out;
 
 	sigc->payload_type = SIGNATURE_PAYLOAD_COMMIT;
@@ -1337,6 +1339,17 @@ int check_commit_signature(const struct commit *commit, struct signature_check *
 	return ret;
 }
 
+int check_commit_signature(const struct commit *commit, struct signature_check *sigc)
+{
+	unsigned long size;
+	const char *buffer = repo_get_commit_buffer(the_repository, commit, &size);
+	int ret = verify_commit_buffer(buffer, size, sigc);
+
+	repo_unuse_commit_buffer(the_repository, commit, buffer);
+
+	return ret;
+}
+
 void verify_merge_signature(struct commit *commit, int verbosity,
 			    int check_trust)
 {
diff --git a/commit.h b/commit.h
index 1d6e0c7518b3bb..5406dd266327d4 100644
--- a/commit.h
+++ b/commit.h
@@ -333,6 +333,13 @@ int remove_signature(struct strbuf *buf);
  */
 int check_commit_signature(const struct commit *commit, struct signature_check *sigc);
 
+/*
+ * Same as check_commit_signature() but accepts a commit buffer and
+ * its size, instead of a `struct commit *`.
+ */
+int verify_commit_buffer(const char *buffer, size_t size,
+			 struct signature_check *sigc);
+
 /* record author-date for each commit object */
 struct author_date_slab;
 void record_author_date(struct author_date_slab *author_date,

From 881793c4f71c84e70af256c5721475c7c088b3f7 Mon Sep 17 00:00:00 2001
From: Antonin Delpeuch <antonin@delpeuch.eu>
Date: Mon, 17 Nov 2025 08:04:31 +0000
Subject: [PATCH 080/553] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK

The XDF_DIFF_ALGORITHM_MASK bit mask only includes bits for the patience
and histogram diffs, not for the minimal one. This means that when
resetting the diff algorithm to the default one, one needs to separately
clear the bit for the minimal diff. There are places in the code that fail
to do that: merge-ort.c and builtin/merge-file.c.

Add the XDF_NEED_MINIMAL bit to the bit mask, and remove the separate
clearing of this bit in the places where it hasn't been forgotten.
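
As a standalone sketch of the effect: the patience and histogram bit
values below are the ones visible in xdiff/xdiff.h in this patch, while
XDF_NEED_MINIMAL is assumed to be bit 0 because its definition is not
part of the hunk. OLD_ALGORITHM_MASK, NEW_ALGORITHM_MASK and
select_algorithm() are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Assumed value; the definition is not shown in this patch. */
#define XDF_NEED_MINIMAL (1 << 0)
/* Values as shown in xdiff/xdiff.h. */
#define XDF_PATIENCE_DIFF (1 << 14)
#define XDF_HISTOGRAM_DIFF (1 << 15)

/* The mask before and after this change. */
#define OLD_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF)
#define NEW_ALGORITHM_MASK (OLD_ALGORITHM_MASK | XDF_NEED_MINIMAL)

/*
 * Clear the bits covered by "mask", then set the requested algorithm.
 * This is the "clear out previous settings" idiom used by callers.
 */
static long select_algorithm(long xdl_opts, long mask, long value)
{
	assert((value & ~NEW_ALGORITHM_MASK) == 0);
	xdl_opts &= ~mask;
	return xdl_opts | value;
}
```

Starting from XDF_NEED_MINIMAL and selecting histogram, the old mask
leaves the minimal bit set alongside the histogram bit, while the new
mask yields the histogram bit alone.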

Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c        | 1 -
 merge-ort.c   | 1 -
 xdiff/xdiff.h | 2 +-
 3 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/diff.c b/diff.c
index a74e701806be52..0eec4a2fe8ceee 100644
--- a/diff.c
+++ b/diff.c
@@ -3527,7 +3527,6 @@ static int set_diff_algorithm(struct diff_options *opts,
 		return -1;
 
 	/* clear out previous settings */
-	DIFF_XDL_CLR(opts, NEED_MINIMAL);
 	opts->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
 	opts->xdl_opts |= value;
 
diff --git a/merge-ort.c b/merge-ort.c
index 29858074f9d8bf..23e2b64c79bf27 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -5496,7 +5496,6 @@ int parse_merge_opt(struct merge_options *opt, const char *s)
 		if (value < 0)
 			return -1;
 		/* clear out previous settings */
-		DIFF_XDL_CLR(opt, NEED_MINIMAL);
 		opt->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
 		opt->xdl_opts |= value;
 	}
diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
index 2cecde5afe5da1..dc370712e92860 100644
--- a/xdiff/xdiff.h
+++ b/xdiff/xdiff.h
@@ -43,7 +43,7 @@ extern "C" {
 
 #define XDF_PATIENCE_DIFF (1 << 14)
 #define XDF_HISTOGRAM_DIFF (1 << 15)
-#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF)
+#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF | XDF_NEED_MINIMAL)
 #define XDF_DIFF_ALG(x) ((x) & XDF_DIFF_ALGORITHM_MASK)
 
 #define XDF_INDENT_HEURISTIC (1 << 23)

From ffffb987fcd3b3d6b88aceed87000ef4a5b6114e Mon Sep 17 00:00:00 2001
From: Antonin Delpeuch <antonin@delpeuch.eu>
Date: Mon, 17 Nov 2025 08:04:32 +0000
Subject: [PATCH 081/553] blame: make diff algorithm configurable

The diff algorithm used in 'git-blame(1)' is set to 'myers',
without the possibility to change it aside from the `--minimal` option.

There has been long-standing interest in changing the default diff
algorithm to "histogram", and Git 3.0 was floated as a possible occasion
for taking some steps towards that:

https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/

As a preparation for this move, it is worth making sure that the diff
algorithm is configurable where useful.

Make it configurable in the `git-blame(1)` command by introducing the
`--diff-algorithm` option and make it honor the `diff.algorithm` config
variable. Keep Myers diff as the default.
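
The intended precedence (config first, then command-line options in the
order given, with the last one winning) can be modelled roughly as
below. This is a sketch, not the code in builtin/blame.c:
toy_parse_algorithm() and toy_resolve() are hypothetical, the value of
XDF_NEED_MINIMAL is assumed to be bit 0, and the sketch only tracks the
algorithm bits.

```c
#include <assert.h>
#include <string.h>

#define XDF_NEED_MINIMAL (1 << 0)	/* assumed value */
#define XDF_PATIENCE_DIFF (1 << 14)
#define XDF_HISTOGRAM_DIFF (1 << 15)
#define XDF_DIFF_ALGORITHM_MASK \
	(XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF | XDF_NEED_MINIMAL)

/* Map an algorithm name to its bits; -1 for an unknown name. */
static long toy_parse_algorithm(const char *name)
{
	assert(name);
	if (!strcmp(name, "myers") || !strcmp(name, "default"))
		return 0;
	if (!strcmp(name, "minimal"))
		return XDF_NEED_MINIMAL;
	if (!strcmp(name, "patience"))
		return XDF_PATIENCE_DIFF;
	if (!strcmp(name, "histogram"))
		return XDF_HISTOGRAM_DIFF;
	return -1;
}

/*
 * Apply the settings in order (config value first, then each option).
 * Every step clears all algorithm bits before setting the new ones,
 * so the last setting wins. Returns -1 on an unknown name.
 */
static long toy_resolve(const char **names, int nr)
{
	long xdl_opts = 0;

	for (int i = 0; i < nr; i++) {
		long value = toy_parse_algorithm(names[i]);

		if (value < 0)
			return -1;
		xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
		xdl_opts |= value;
	}
	return xdl_opts;
}
```

In this model, "myers" followed by "histogram" resolves to histogram,
and "minimal" followed by "myers" resolves to plain Myers with the
minimal bit cleared.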

Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/diff-algorithm-option.adoc |  20 +++
 Documentation/diff-options.adoc          |  21 +--
 Documentation/git-blame.adoc             |   2 +
 builtin/blame.c                          |  52 +++++-
 t/meson.build                            |   1 +
 t/t8015-blame-diff-algorithm.sh          | 203 +++++++++++++++++++++++
 6 files changed, 278 insertions(+), 21 deletions(-)
 create mode 100644 Documentation/diff-algorithm-option.adoc
 create mode 100755 t/t8015-blame-diff-algorithm.sh

diff --git a/Documentation/diff-algorithm-option.adoc b/Documentation/diff-algorithm-option.adoc
new file mode 100644
index 00000000000000..8e3a0b63d784d8
--- /dev/null
+++ b/Documentation/diff-algorithm-option.adoc
@@ -0,0 +1,20 @@
+`--diff-algorithm=(patience|minimal|histogram|myers)`::
+	Choose a diff algorithm. The variants are as follows:
++
+--
+   `default`;;
+   `myers`;;
+	The basic greedy diff algorithm. Currently, this is the default.
+   `minimal`;;
+	Spend extra time to make sure the smallest possible diff is
+	produced.
+   `patience`;;
+	Use "patience diff" algorithm when generating patches.
+   `histogram`;;
+	This algorithm extends the patience algorithm to "support
+	low-occurrence common elements".
+--
++
+For instance, if you configured the `diff.algorithm` variable to a
+non-default value and want to use the default one, then you
+have to use `--diff-algorithm=default` option.
diff --git a/Documentation/diff-options.adoc b/Documentation/diff-options.adoc
index ae31520f7f1d13..9cdad6f72a0c7d 100644
--- a/Documentation/diff-options.adoc
+++ b/Documentation/diff-options.adoc
@@ -197,26 +197,7 @@ and starts with _<text>_, this algorithm attempts to prevent it from
 appearing as a deletion or addition in the output. It uses the "patience
 diff" algorithm internally.
 
-`--diff-algorithm=(patience|minimal|histogram|myers)`::
-	Choose a diff algorithm. The variants are as follows:
-+
---
-   `default`;;
-   `myers`;;
-	The basic greedy diff algorithm. Currently, this is the default.
-   `minimal`;;
-	Spend extra time to make sure the smallest possible diff is
-	produced.
-   `patience`;;
-	Use "patience diff" algorithm when generating patches.
-   `histogram`;;
-	This algorithm extends the patience algorithm to "support
-	low-occurrence common elements".
---
-+
-For instance, if you configured the `diff.algorithm` variable to a
-non-default value and want to use the default one, then you
-have to use `--diff-algorithm=default` option.
+include::diff-algorithm-option.adoc[]
 
 `--stat[=<width>[,<name-width>[,<count>]]]`::
 	Generate a diffstat. By default, as much space as necessary
diff --git a/Documentation/git-blame.adoc b/Documentation/git-blame.adoc
index e438d286258826..adcbb6f5dc97a3 100644
--- a/Documentation/git-blame.adoc
+++ b/Documentation/git-blame.adoc
@@ -85,6 +85,8 @@ include::blame-options.adoc[]
 	Ignore whitespace when comparing the parent's version and
 	the child's to find where the lines came from.
 
+include::diff-algorithm-option.adoc[]
+
 --abbrev=<n>::
 	Instead of using the default 7+1 hexadecimal digits as the
 	abbreviated object name, use <m>+1 digits, where <m> is at
diff --git a/builtin/blame.c b/builtin/blame.c
index 2703820258d5f9..27b513d27faddd 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value,
 		}
 	}
 
+	if (!strcmp(var, "diff.algorithm")) {
+		long diff_algorithm;
+		if (!value)
+			return config_error_nonbool(var);
+		diff_algorithm = parse_algorithm_value(value);
+		if (diff_algorithm < 0)
+			return error(_("unknown value for config '%s': %s"),
+				     var, value);
+		xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
+		xdl_opts |= diff_algorithm;
+		return 0;
+	}
+
 	if (git_diff_heuristic_config(var, value, cb) < 0)
 		return -1;
 	if (userdiff_config(var, value) < 0)
@@ -824,6 +837,38 @@ static int blame_move_callback(const struct option *option, const char *arg, int
 	return 0;
 }
 
+static int blame_diff_algorithm_minimal(const struct option *option,
+					const char *arg, int unset)
+{
+	int *opt = option->value;
+
+	BUG_ON_OPT_ARG(arg);
+
+	*opt &= ~XDF_DIFF_ALGORITHM_MASK;
+	if (!unset)
+		*opt |= XDF_NEED_MINIMAL;
+
+	return 0;
+}
+
+static int blame_diff_algorithm_callback(const struct option *option,
+					 const char *arg, int unset)
+{
+	int *opt = option->value;
+	long value = parse_algorithm_value(arg);
+
+	BUG_ON_OPT_NEG(unset);
+
+	if (value < 0)
+		return error(_("option diff-algorithm accepts \"myers\", "
+			       "\"minimal\", \"patience\" and \"histogram\""));
+
+	*opt &= ~XDF_DIFF_ALGORITHM_MASK;
+	*opt |= value;
+
+	return 0;
+}
+
 static int is_a_rev(const char *name)
 {
 	struct object_id oid;
@@ -915,11 +960,16 @@ int cmd_blame(int argc,
 		OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR),
 		OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL),
 		OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE),
+		OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"),
+			       N_("choose a diff algorithm"),
+			       PARSE_OPT_NONEG, blame_diff_algorithm_callback),
 		OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")),
 		OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")),
 		OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE),
 		OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR),
-		OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL),
+		OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL,
+			       N_("spend extra cycles to find a better match"),
+			       PARSE_OPT_NOARG | PARSE_OPT_HIDDEN, blame_diff_algorithm_minimal),
 		OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")),
 		OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")),
 		OPT_CALLBACK_F('C', NULL, &opt, N_("score"), N_("find line copies within and across files"), PARSE_OPT_OPTARG, blame_copy_callback),
diff --git a/t/meson.build b/t/meson.build
index 401b24e50e0499..9f2fe7af8ba4c9 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -955,6 +955,7 @@ integration_tests = [
   't8012-blame-colors.sh',
   't8013-blame-ignore-revs.sh',
   't8014-blame-ignore-fuzzy.sh',
+  't8015-blame-diff-algorithm.sh',
   't8020-last-modified.sh',
   't9001-send-email.sh',
   't9002-column.sh',
diff --git a/t/t8015-blame-diff-algorithm.sh b/t/t8015-blame-diff-algorithm.sh
new file mode 100755
index 00000000000000..cd709536c6ecb1
--- /dev/null
+++ b/t/t8015-blame-diff-algorithm.sh
@@ -0,0 +1,203 @@
+#!/bin/sh
+
+test_description='git blame with specific diff algorithm'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+	cat >file.c <<-\EOF &&
+	int f(int x, int y)
+	{
+	  if (x == 0)
+	  {
+	    return y;
+	  }
+	  return x;
+	}
+
+	int g(size_t u)
+	{
+	  while (u < 30)
+	  {
+	    u++;
+	  }
+	  return u;
+	}
+	EOF
+	test_write_lines x x x x >file.txt &&
+	git add file.c file.txt &&
+	GIT_AUTHOR_NAME=Commit_1 git commit -m Commit_1 &&
+
+	cat >file.c <<-\EOF &&
+	int g(size_t u)
+	{
+	  while (u < 30)
+	  {
+	    u++;
+	  }
+	  return u;
+	}
+
+	int h(int x, int y, int z)
+	{
+	  if (z == 0)
+	  {
+	    return x;
+	  }
+	  return y;
+	}
+	EOF
+	test_write_lines x x x A B C D x E F G >file.txt &&
+	git add file.c file.txt &&
+	GIT_AUTHOR_NAME=Commit_2 git commit -m Commit_2
+'
+
+test_expect_success 'blame uses Myers diff algorithm by default' '
+	cat >expected <<-\EOF &&
+	Commit_2 int g(size_t u)
+	Commit_1 {
+	Commit_2   while (u < 30)
+	Commit_1   {
+	Commit_2     u++;
+	Commit_1   }
+	Commit_2   return u;
+	Commit_1 }
+	Commit_1
+	Commit_2 int h(int x, int y, int z)
+	Commit_1 {
+	Commit_2   if (z == 0)
+	Commit_1   {
+	Commit_2     return x;
+	Commit_1   }
+	Commit_2   return y;
+	Commit_1 }
+	EOF
+
+	git blame file.c >output &&
+	sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >without_varying_parts &&
+	sed -e "s/ *$//g" without_varying_parts >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'blame honors --diff-algorithm option' '
+	cat >expected <<-\EOF &&
+	Commit_1 int g(size_t u)
+	Commit_1 {
+	Commit_1   while (u < 30)
+	Commit_1   {
+	Commit_1     u++;
+	Commit_1   }
+	Commit_1   return u;
+	Commit_1 }
+	Commit_2
+	Commit_2 int h(int x, int y, int z)
+	Commit_2 {
+	Commit_2   if (z == 0)
+	Commit_2   {
+	Commit_2     return x;
+	Commit_2   }
+	Commit_2   return y;
+	Commit_2 }
+	EOF
+
+	git blame file.c --diff-algorithm histogram >output &&
+	sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >without_varying_parts &&
+	sed -e "s/ *$//g" without_varying_parts >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'blame honors diff.algorithm config variable' '
+	cat >expected <<-\EOF &&
+	Commit_1 int g(size_t u)
+	Commit_1 {
+	Commit_1   while (u < 30)
+	Commit_1   {
+	Commit_1     u++;
+	Commit_1   }
+	Commit_1   return u;
+	Commit_1 }
+	Commit_2
+	Commit_2 int h(int x, int y, int z)
+	Commit_2 {
+	Commit_2   if (z == 0)
+	Commit_2   {
+	Commit_2     return x;
+	Commit_2   }
+	Commit_2   return y;
+	Commit_2 }
+	EOF
+
+	git -c diff.algorithm=histogram blame file.c >output &&
+	sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \
+	    -e "s/ *$//g" output >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'blame gives priority to --diff-algorithm over diff.algorithm' '
+	cat >expected <<-\EOF &&
+	Commit_1 int g(size_t u)
+	Commit_1 {
+	Commit_1   while (u < 30)
+	Commit_1   {
+	Commit_1     u++;
+	Commit_1   }
+	Commit_1   return u;
+	Commit_1 }
+	Commit_2
+	Commit_2 int h(int x, int y, int z)
+	Commit_2 {
+	Commit_2   if (z == 0)
+	Commit_2   {
+	Commit_2     return x;
+	Commit_2   }
+	Commit_2   return y;
+	Commit_2 }
+	EOF
+
+	git -c diff.algorithm=myers blame file.c --diff-algorithm histogram >output &&
+	sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \
+	    -e "s/ *$//g" output >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'blame honors --minimal option' '
+	cat >expected <<-\EOF &&
+	Commit_1 x
+	Commit_1 x
+	Commit_1 x
+	Commit_2 A
+	Commit_2 B
+	Commit_2 C
+	Commit_2 D
+	Commit_1 x
+	Commit_2 E
+	Commit_2 F
+	Commit_2 G
+	EOF
+
+	git blame file.txt --minimal >output &&
+	sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'blame respects the order of diff options' '
+	cat >expected <<-\EOF &&
+	Commit_1 x
+	Commit_1 x
+	Commit_1 x
+	Commit_2 A
+	Commit_2 B
+	Commit_2 C
+	Commit_2 D
+	Commit_2 x
+	Commit_2 E
+	Commit_2 F
+	Commit_2 G
+	EOF
+
+	git blame file.txt --minimal --diff-algorithm myers >output &&
+	sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >actual &&
+	test_cmp expected actual
+'
+
+test_done

From f18aa68861538e93421699aa366d6691a85258b6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Mon, 17 Nov 2025 20:42:55 +0100
Subject: [PATCH 082/553] wrapper: simplify xmkstemp()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Call xmkstemp_mode() instead of duplicating its error handling code.
This switches the implementation from the system's mkstemp(3) to our own
git_mkstemp_mode(), which works just as well.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 wrapper.c | 19 +------------------
 1 file changed, 1 insertion(+), 18 deletions(-)

diff --git a/wrapper.c b/wrapper.c
index 2f00d2ac876c16..bfe7e30f0c80e8 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -421,24 +421,7 @@ FILE *fopen_or_warn(const char *path, const char *mode)
 
 int xmkstemp(char *filename_template)
 {
-	int fd;
-	char origtemplate[PATH_MAX];
-	strlcpy(origtemplate, filename_template, sizeof(origtemplate));
-
-	fd = mkstemp(filename_template);
-	if (fd < 0) {
-		int saved_errno = errno;
-		const char *nonrelative_template;
-
-		if (strlen(filename_template) != strlen(origtemplate))
-			filename_template = origtemplate;
-
-		nonrelative_template = absolute_path(filename_template);
-		errno = saved_errno;
-		die_errno("Unable to create temporary file '%s'",
-			nonrelative_template);
-	}
-	return fd;
+	return xmkstemp_mode(filename_template, 0600);
 }
 
 /* Adapted from libiberty's mkstemp.c. */

From c64eb849b14e5a78864f4260ccc12f46052020ec Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 17 Nov 2025 19:51:26 +0000
Subject: [PATCH 083/553] make strip: include `scalar`

When Scalar was made a canonical part of Git in 7b5c93c6c68 (scalar:
include in standard Git build & installation, 2022-09-02), it was added
to all relevant Makefile targets except for the `strip` target.

Let's correct that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 70d1543b6b8688..fffb0d66e39842 100644
--- a/Makefile
+++ b/Makefile
@@ -2499,7 +2499,7 @@ please_set_SHELL_PATH_to_a_more_modern_shell:
 
 shell_compatibility_test: please_set_SHELL_PATH_to_a_more_modern_shell
 
-strip: $(PROGRAMS) git$X
+strip: $(PROGRAMS) git$X scalar$X
 	$(STRIP) $(STRIP_OPTS) $^
 
 ### Target-specific flags and dependencies

From ffe702b3edf85aa924d685a8603205d47e94e851 Mon Sep 17 00:00:00 2001
From: Elijah Newren <newren@gmail.com>
Date: Mon, 3 Nov 2025 18:01:46 +0000
Subject: [PATCH 084/553] t6429: update comment to mention correct tool

A comment at the top of t6429 mentions why the test doesn't exercise git
rebase or git cherry-pick.  However, it claims that it uses `test-tool
fast-rebase`.  That was true when the comment was written, but commit
f920b0289ba3 (replay: introduce new builtin, 2023-11-24) changed it to
use git replay without updating this comment.

We could potentially just strike this second comment, since git replay
is a bona fide built-in, but perhaps the explanation about why it focuses
on git replay is still useful.  Update the comment to make it accurate
again.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t6429-merge-sequence-rename-caching.sh | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/t/t6429-merge-sequence-rename-caching.sh b/t/t6429-merge-sequence-rename-caching.sh
index 0f39ed0d08a342..dcb734b10b3e2b 100755
--- a/t/t6429-merge-sequence-rename-caching.sh
+++ b/t/t6429-merge-sequence-rename-caching.sh
@@ -11,14 +11,13 @@ test_description="remember regular & dir renames in sequence of merges"
 #         sure that we are triggering rename caching rather than rename
 #         bypassing.
 #
-# NOTE 2: this testfile uses 'test-tool fast-rebase' instead of either
-#         cherry-pick or rebase.  sequencer.c is only superficially
-#         integrated with merge-ort; it calls merge_switch_to_result()
-#         after EACH merge, which updates the index and working copy AND
-#         throws away the cached results (because merge_switch_to_result()
-#         is only supposed to be called at the end of the sequence).
-#         Integrating them more deeply is a big task, so for now the tests
-#         use 'test-tool fast-rebase'.
+# NOTE 2: this testfile uses replay instead of either cherry-pick or rebase.
+#         sequencer.c is only superficially integrated with merge-ort; it
+#         calls merge_switch_to_result() after EACH merge, which updates the
+#         index and working copy AND throws away the cached results (because
+#         merge_switch_to_result() is only supposed to be called at the end
+#         of the sequence).  Integrating them more deeply is a big task, so
+#         for now the tests use 'git replay'.
 #
 
 

From d5663a4b05640a44aa52a0cc32ba6a601d7c9149 Mon Sep 17 00:00:00 2001
From: Elijah Newren <newren@gmail.com>
Date: Mon, 3 Nov 2025 18:01:47 +0000
Subject: [PATCH 085/553] merge-ort: remove debugging crud

While developing commit a16e8efe5c2b (merge-ort: fix
merge.directoryRenames=false, 2025-03-13), I was testing things out and
had an extra condition on one of the if-blocks that I occasionally
swapped between '&& 0' and '&& 1' to see the effects of the changes.  I
forgot to remove it before submitting and it wasn't caught in review.
Remove it now.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 merge-ort.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/merge-ort.c b/merge-ort.c
index 29858074f9d8bf..23b55c5b929b22 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -3438,7 +3438,7 @@ static int collect_renames(struct merge_options *opt,
 			continue;
 		}
 		if (opt->detect_directory_renames == MERGE_DIRECTORY_RENAMES_NONE &&
-		    p->status == 'R' && 1) {
+		    p->status == 'R') {
 			possibly_cache_new_pair(renames, p, side_index, NULL);
 			goto skip_directory_renames;
 		}

From a562d90a350dcbddc8794809aef9608467022e34 Mon Sep 17 00:00:00 2001
From: Elijah Newren <newren@gmail.com>
Date: Mon, 3 Nov 2025 18:01:48 +0000
Subject: [PATCH 086/553] merge-ort: fix failing merges in special corner case

At GitHub, we had a repository that was triggering
  git: merge-ort.c:3032: process_renames: Assertion `newinfo && !newinfo->merged.clean` failed.
during git replay.

This sounds similar to the somewhat recent f6ecb603ff8a (merge-ort: fix
directory rename on top of source of other rename/delete, 2025-08-06),
but the cause is different.  Unlike that case, no rename-to-self
situations arise here.  Also new here: the bug can only be triggered
during a replay operation on the 2nd or later commit being replayed,
never on the first merge in the sequence.

To trigger, the repository needs:
  * an upstream which:
    * renames a file to a different directory, e.g.
        old/file -> new/file
    * leaves other files remaining in the original directory (so that
      e.g. "old/" still exists upstream even though file has been
      removed from it and placed elsewhere)
  * a topic branch being rebased where:
    * a commit in the sequence:
      * modifies old/file
    * a subsequent commit in the sequence being replayed:
      * does NOT touch *anything* under new/
      * does NOT touch old/file
      * DOES modify other paths under old/
      * does NOT have any relevant renames that we need to detect
        _anywhere_ elsewhere in the tree (meaning this interacts
        interestingly with both directory renames and cached renames)

In such a case, the assertion will trigger.  The fix turns out to be
surprisingly simple.  I have a very vague recollection that I actually
considered whether to add such an if-check years ago when I added the
very similar one for oldinfo in 1b6b902d95a5 (merge-ort:
process_renames() now needs more defensiveness, 2021-01-19), but I think
I couldn't figure out a possible way to trigger it and was worried at
the time that if I didn't know how to trigger it then I wasn't so sure
that simply skipping it was correct.  Waiting did give me a chance to
put more thorough tests and checks into place for the rename-to-self
cases a few months back, which I might not have found as easily
otherwise.  Anyway, put the check in place now and add a test that
demonstrates the fix.

Note that this bug, as demonstrated by the conditions listed above,
runs at the intersection of relevant renames, trivial directory
resolutions, and cached renames.  All three of those optimizations are
ones that unfortunately make the code (and testcases!) a bit more
complex, and threading all three makes it a bit more so.  However, the
testcase isn't crazy enough that I'd expect no one to ever hit it in
practice, and I was confused why we didn't see it before.  After some
digging, I discovered that merge.directoryRenames=false is a workaround
to this bug, and GitHub used that setting until recently (it was a
"temporary" match-what-libgit2-does piece of code that lasted years
longer than intended).  Since the conditions I gave above for triggering
this bug rule out the possibility of there being directory renames, one
might assume that it shouldn't matter whether you try to detect such
renames if there aren't any.  However, due to commit a16e8efe5c2b
(merge-ort: fix merge.directoryRenames=false, 2025-03-13), the heavy
hammer used there means that merge.directoryRenames=false ALSO turns off
rename caching, which is critical to triggering the bug.  This becomes
a bit more than an aside since...

Re-reading that old commit, a16e8efe5c2b (merge-ort: fix
merge.directoryRenames=false, 2025-03-13), it appears that the solution
to this latest bug might have been at least a partial alternative
solution to that old commit.  And it may have been an improved
alternative (or at least help implement one), since it may be able to
avoid the heavy-handed disabling of rename cache.  That might be an
interesting future thing to investigate, but is not critical for the
current fix.  However, since I spent time digging it all up, at least
leave a small comment tweak breadcrumb to help some future reader
(myself or others) who wants to dig further to connect the dots a little
quicker.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 merge-ort.c                              | 29 ++++++++-
 t/t6429-merge-sequence-rename-caching.sh | 78 ++++++++++++++++++++++++
 2 files changed, 106 insertions(+), 1 deletion(-)

diff --git a/merge-ort.c b/merge-ort.c
index 23b55c5b929b22..a1f3333e44a1ef 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -2912,6 +2912,32 @@ static int process_renames(struct merge_options *opt,
 		if (!oldinfo || oldinfo->merged.clean)
 			continue;
 
+		/*
+		 * Rename caching from a previous commit might give us an
+		 * irrelevant rename for the current commit.
+		 *
+		 * Imagine:
+		 *     foo/A -> bar/A
+		 * was a cached rename for the upstream side from the
+		 * previous commit (without the directories being renamed),
+		 * but the next commit being replayed
+		 *     * does NOT add or delete files
+		 *     * does NOT have directory renames
+		 *     * does NOT modify any files under bar/
+		 *     * does NOT modify foo/A
+		 *     * DOES modify other files under foo/ (otherwise the
+		 *       !oldinfo check above would have already exited for
+		 *       us)
+		 * In such a case, our trivial directory resolution will
+		 * have already merged bar/, and our attempt to process
+		 * the cached
+		 *     foo/A -> bar/A
+		 * would be counterproductive, and lack the necessary
+		 * information anyway.  Skip such renames.
+		 */
+		if (!newinfo)
+			continue;
+
 		/*
 		 * diff_filepairs have copies of pathnames, thus we have to
 		 * use standard 'strcmp()' (negated) instead of '=='.
@@ -5118,7 +5144,8 @@ static void merge_check_renames_reusable(struct merge_options *opt,
 	 * optimization" comment near that case).
 	 *
 	 * This could be revisited in the future; see the commit message
-	 * where this comment was added for some possible pointers.
+	 * where this comment was added for some possible pointers, or the
+	 * later commit where this comment was added.
 	 */
 	if (opt->detect_directory_renames == MERGE_DIRECTORY_RENAMES_NONE) {
 		renames->cached_pairs_valid_side = 0; /* neither side valid */
diff --git a/t/t6429-merge-sequence-rename-caching.sh b/t/t6429-merge-sequence-rename-caching.sh
index dcb734b10b3e2b..15dd2d94b75f0a 100755
--- a/t/t6429-merge-sequence-rename-caching.sh
+++ b/t/t6429-merge-sequence-rename-caching.sh
@@ -768,4 +768,82 @@ test_expect_success 'avoid assuming we detected renames' '
 	)
 '
 
+#
+# In the following testcase:
+#   Base:     olddir/{valuesX_1, valuesY_1, valuesZ_1}
+#             other/content
+#   Upstream: rename olddir/valuesX_1 -> newdir/valuesX_2
+#   Topic_1:  modify olddir/valuesX_1 -> olddir/valuesX_3
+#   Topic_2:  modify olddir/valuesY,
+#             modify other/content
+#   Expected Pick1: olddir/{valuesY, valuesZ}, newdir/valuesX, other/content
+#   Expected Pick2: olddir/{valuesY, valuesZ}, newdir/valuesX, other/content
+#
+# This testcase presents no problems for git traditionally, but the fact that
+#    olddir/valuesX -> newdir/valuesX
+# gets cached after the first pick presents a problem for the second commit to
+# be replayed, because it appears to be an irrelevant rename, so the trivial
+# directory resolution will resolve newdir/ without recursing into it, giving
+# us no way to apply the cached rename to anything.
+#
+test_expect_success 'rename a file, use it on first pick, but irrelevant on second' '
+	git init rename_a_file_use_it_once_irrelevant_on_second &&
+	(
+		cd rename_a_file_use_it_once_irrelevant_on_second &&
+
+		mkdir olddir/ other/ &&
+		test_seq 3 8 >olddir/valuesX &&
+		test_seq 3 8 >olddir/valuesY &&
+		test_seq 3 8 >olddir/valuesZ &&
+		printf "%s\n" A B C D E F G >other/content &&
+		git add olddir other &&
+		git commit -m orig &&
+
+		git branch upstream &&
+		git branch topic &&
+
+		git switch upstream &&
+		test_seq 1 8 >olddir/valuesX &&
+		git add olddir &&
+		mkdir newdir &&
+		git mv olddir/valuesX newdir &&
+		git commit -m "Renamed (and modified) olddir/valuesX into newdir/" &&
+
+		git switch topic &&
+
+		test_seq 3 10 >olddir/valuesX &&
+		git add olddir &&
+		git commit -m A &&
+
+		test_seq 1 8 >olddir/valuesY &&
+		printf "%s\n" A B C D E F G H I >other/content &&
+		git add olddir/valuesY other &&
+		git commit -m B &&
+
+		#
+		# Actual testing; mostly we want to verify that we do not hit
+		#     git: merge-ort.c:3032: process_renames: Assertion `newinfo && !newinfo->merged.clean` failed.
+		#
+
+		git switch upstream &&
+		git config merge.directoryRenames true &&
+
+		git replay --onto HEAD upstream~1..topic >out &&
+
+		#
+		# ...but we may as well check that the replay gave us a reasonable result
+		#
+
+		git update-ref --stdin <out &&
+		git checkout topic &&
+
+		git ls-files >tracked &&
+		test_line_count = 4 tracked &&
+		test_path_is_file newdir/valuesX &&
+		test_path_is_file olddir/valuesY &&
+		test_path_is_file olddir/valuesZ &&
+		test_path_is_file other/content
+	)
+'
+
 test_done

From d22a488482092da64ad19fda82edde199bed2466 Mon Sep 17 00:00:00 2001
From: David Macek <david.macek.0@gmail.com>
Date: Mon, 17 Nov 2025 20:39:44 +0000
Subject: [PATCH 087/553] wincred: avoid memory corruption

`wcsncpy_s()` wants to write the terminating null character so we need
to allocate one more space for it in the target memory block.

This should fix crashes when trying to read passwords.  When this
happened, the password/token wouldn't print out and Git would therefore
ask for a new password every time.
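
The off-by-one can be illustrated with a portable sketch.  It does not
call `wcsncpy_s()` (which is not available everywhere); the hypothetical
helper copy_blob_as_wide_string() below copies with memcpy() and writes
the terminator itself, which makes the extra element visible:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper, not part of the credential helper: turn a blob of
 * `blob_size` bytes holding unterminated wide characters into a
 * NUL-terminated wide string.  Returns NULL on allocation failure.
 */
static wchar_t *copy_blob_as_wide_string(const void *blob, size_t blob_size)
{
	size_t nr = blob_size / sizeof(wchar_t);
	wchar_t *out;

	assert(blob);
	/* one extra element so that the terminator has somewhere to go */
	out = malloc((nr + 1) * sizeof(wchar_t));
	if (!out)
		return NULL;
	memcpy(out, blob, nr * sizeof(wchar_t));
	out[nr] = L'\0';
	return out;
}
```

Allocating only `blob_size` bytes here would make the final store land
one element past the end of the block, which is the kind of overrun
this patch addresses.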

Signed-off-by: David Macek <david.macek.0@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 contrib/credential/wincred/git-credential-wincred.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/credential/wincred/git-credential-wincred.c b/contrib/credential/wincred/git-credential-wincred.c
index 5683846b4b4d1f..73c2b9b72ab53e 100644
--- a/contrib/credential/wincred/git-credential-wincred.c
+++ b/contrib/credential/wincred/git-credential-wincred.c
@@ -165,7 +165,7 @@ static void get_credential(void)
 			write_item("username", creds[i]->UserName,
 				creds[i]->UserName ? wcslen(creds[i]->UserName) : 0);
 			if (creds[i]->CredentialBlobSize > 0) {
-				secret = xmalloc(creds[i]->CredentialBlobSize);
+				secret = xmalloc(creds[i]->CredentialBlobSize + sizeof(WCHAR));
 				wcsncpy_s(secret, creds[i]->CredentialBlobSize, (LPCWSTR)creds[i]->CredentialBlob, creds[i]->CredentialBlobSize / sizeof(WCHAR));
 				line = wcstok_s(secret, L"\r\n", &remaining_lines);
 				write_item("password", line, line ? wcslen(line) : 0);

From b0d5c88cca3d67fd020d1e71fb04380cd15f5e55 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 17 Nov 2025 20:40:08 +0000
Subject: [PATCH 088/553] cmake: stop trying to build the reftable and xdiff
 libraries

In the `en/make-libgit-a` topic branch, more precisely in the commits
f3b4c89d59f1 (make: delete REFTABLE_LIB, add reftable to LIB_OBJS,
2025-10-02) and cf680cdb9543 (make: delete XDIFF_LIB, add xdiff to
LIB_OBJS, 2025-10-02), the strategy to build three static libraries was
rethought, and instead only one static library is now built.

This is good.

However, the CMake definition was not changed accordingly, and now
CMake-based builds fail thusly:

  [...]
  Generating hook-list.h
  CMake Error at CMakeLists.txt:122 (string):
    string sub-command REPLACE requires at least four arguments.
  Call Stack (most recent call first):
    CMakeLists.txt:711 (parse_makefile_for_sources)

  CMake Error at CMakeLists.txt:122 (string):
    string sub-command REPLACE requires at least four arguments.
  Call Stack (most recent call first):
    CMakeLists.txt:717 (parse_makefile_for_sources)

  -- Configuring incomplete, errors occurred!

Fix that by removing the parts that still expect the reftable and xdiff
objects to be defined separately in the Makefile.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 contrib/buildsystems/CMakeLists.txt | 14 +-------------
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index edb0fc04ad7649..479163ab5cd3b5 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -679,18 +679,6 @@ list(APPEND libgit_SOURCES "${CMAKE_BINARY_DIR}/version-def.h")
 
 add_library(libgit ${libgit_SOURCES} ${compat_SOURCES})
 
-#libxdiff
-parse_makefile_for_sources(libxdiff_SOURCES ${CMAKE_SOURCE_DIR}/Makefile "XDIFF_OBJS")
-
-list(TRANSFORM libxdiff_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
-add_library(xdiff STATIC ${libxdiff_SOURCES})
-
-#reftable
-parse_makefile_for_sources(reftable_SOURCES ${CMAKE_SOURCE_DIR}/Makefile "REFTABLE_OBJS")
-
-list(TRANSFORM reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
-add_library(reftable STATIC ${reftable_SOURCES})
-
 if(WIN32)
 	add_custom_command(OUTPUT ${CMAKE_BINARY_DIR}/git.rc
 			COMMAND "${SH_EXE}" "${CMAKE_SOURCE_DIR}/GIT-VERSION-GEN"
@@ -720,7 +708,7 @@ endif()
 #link all required libraries to common-main
 add_library(common-main OBJECT ${CMAKE_SOURCE_DIR}/common-main.c)
 
-target_link_libraries(common-main libgit xdiff reftable ${ZLIB_LIBRARIES})
+target_link_libraries(common-main libgit ${ZLIB_LIBRARIES})
 if(Intl_FOUND)
 	target_link_libraries(common-main ${Intl_LIBRARIES})
 endif()

From af3919816f20c7c54e6d377945b2f6344b3588fe Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 17 Nov 2025 20:46:14 +0000
Subject: [PATCH 089/553] mingw: avoid the comma operator

The pattern `return errno = ..., -1;` is observed several times in
`compat/mingw.c`. It has served us well over the years, but clang now
complains about it:

  compat/mingw.c:723:24: error: possible misuse of comma operator here [-Werror,-Wcomma]
    723 |                 return errno = ENOSYS, -1;
        |                                      ^

See for example this failing workflow run:
https://github.com/git-for-windows/git-sdk-arm64/actions/runs/15457893907/job/43513458823#step:8:201

Let's appease clang (and also reduce the use of the no longer common
comma operator).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 compat/mingw.c | 48 ++++++++++++++++++++++++++++--------------------
 1 file changed, 28 insertions(+), 20 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 8538e3d1729d25..9b894d9639b642 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -491,8 +491,10 @@ static int mingw_open_append(wchar_t const *wfilename, int oflags, ...)
 	DWORD create = (oflags & O_CREAT) ? OPEN_ALWAYS : OPEN_EXISTING;
 
 	/* only these flags are supported */
-	if ((oflags & ~O_CREAT) != (O_WRONLY | O_APPEND))
-		return errno = ENOSYS, -1;
+	if ((oflags & ~O_CREAT) != (O_WRONLY | O_APPEND)) {
+		errno = ENOSYS;
+		return -1;
+	}
 
 	/*
 	 * FILE_SHARE_WRITE is required to permit child processes
@@ -2450,12 +2452,14 @@ static int start_timer_thread(void)
 	timer_event = CreateEvent(NULL, FALSE, FALSE, NULL);
 	if (timer_event) {
 		timer_thread = (HANDLE) _beginthreadex(NULL, 0, ticktack, NULL, 0, NULL);
-		if (!timer_thread )
-			return errno = ENOMEM,
-				error("cannot start timer thread");
-	} else
-		return errno = ENOMEM,
-			error("cannot allocate resources for timer");
+		if (!timer_thread ) {
+			errno = ENOMEM;
+			return error("cannot start timer thread");
+		}
+	} else {
+		errno = ENOMEM;
+		return error("cannot allocate resources for timer");
+	}
 	return 0;
 }
 
@@ -2488,13 +2492,15 @@ int setitimer(int type UNUSED, struct itimerval *in, struct itimerval *out)
 	static const struct timeval zero;
 	static int atexit_done;
 
-	if (out)
-		return errno = EINVAL,
-			error("setitimer param 3 != NULL not implemented");
+	if (out) {
+		errno = EINVAL;
+		return error("setitimer param 3 != NULL not implemented");
+	}
 	if (!is_timeval_eq(&in->it_interval, &zero) &&
-	    !is_timeval_eq(&in->it_interval, &in->it_value))
-		return errno = EINVAL,
-			error("setitimer: it_interval must be zero or eq it_value");
+	    !is_timeval_eq(&in->it_interval, &in->it_value)) {
+		errno = EINVAL;
+		return error("setitimer: it_interval must be zero or eq it_value");
+	}
 
 	if (timer_thread)
 		stop_timer_thread();
@@ -2516,12 +2522,14 @@ int sigaction(int sig, struct sigaction *in, struct sigaction *out)
 {
 	if (sig == SIGCHLD)
 		return -1;
-	else if (sig != SIGALRM)
-		return errno = EINVAL,
-			error("sigaction only implemented for SIGALRM");
-	if (out)
-		return errno = EINVAL,
-			error("sigaction: param 3 != NULL not implemented");
+	else if (sig != SIGALRM) {
+		errno = EINVAL;
+		return error("sigaction only implemented for SIGALRM");
+	}
+	if (out) {
+		errno = EINVAL;
+		return error("sigaction: param 3 != NULL not implemented");
+	}
 
 	timer_fn = in->sa_handler;
 	return 0;

From cd99203f86ea32e0b84d1ef18f5148b74972f617 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Date: Tue, 18 Nov 2025 02:26:45 -0500
Subject: [PATCH 090/553] ci: bump actions/setup-go from 5 to 6

Bumps actions/setup-go from 5 to 6. This upgrade includes dependency
updates that incorporate a fix for a critical vulnerability.
[Originally opened at https://github.com/git-for-windows/git/pull/5811]

- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](https://github.com/actions/setup-go/compare/v5...v6)

Originally-authored-by: dependabot[bot] <support@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 .github/workflows/l10n.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/l10n.yml b/.github/workflows/l10n.yml
index e2c3dbdcb50f0c..95e55134bdbed4 100644
--- a/.github/workflows/l10n.yml
+++ b/.github/workflows/l10n.yml
@@ -63,7 +63,7 @@ jobs:
             origin \
             ${{ github.ref }} \
             $args
-      - uses: actions/setup-go@v5
+      - uses: actions/setup-go@v6
         with:
           go-version: '>=1.16'
           cache: false

From 65e8141f051401a101567b95860424d94f42ef39 Mon Sep 17 00:00:00 2001
From: Jeff King <peff@peff.net>
Date: Tue, 18 Nov 2025 04:11:56 -0500
Subject: [PATCH 091/553] compat/mmap: mark unused argument in git_munmap()

Our mmap compat code emulates mapping by using malloc/free. Our
git_munmap() must take a "length" parameter to match the interface of
munmap(), but we don't use it (it is up to the allocator to know how big
the block is in free()).

Let's mark it as UNUSED to avoid complaints from -Wunused-parameter.
Otherwise you cannot build with "make DEVELOPER=1 NO_MMAP=1".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 compat/mmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/compat/mmap.c b/compat/mmap.c
index 2fe1c7732eea94..1a118711f7244a 100644
--- a/compat/mmap.c
+++ b/compat/mmap.c
@@ -38,7 +38,7 @@ void *git_mmap(void *start, size_t length, int prot, int flags, int fd, off_t of
 	return start;
 }
 
-int git_munmap(void *start, size_t length)
+int git_munmap(void *start, size_t length UNUSED)
 {
 	free(start);
 	return 0;

From 4deb882e54152c31bef23f8b33ad38b7bc26d398 Mon Sep 17 00:00:00 2001
From: Jeff King <peff@peff.net>
Date: Tue, 18 Nov 2025 04:12:06 -0500
Subject: [PATCH 092/553] pack-bitmap: handle name-hash lookups in incremental
 bitmaps

If a bitmap has a name-hash cache, it is an array of 32-bit integers,
one per entry in the bitmap, which we've mmap'd from the .bitmap file.
We access it directly like this:

    if (bitmap_git->hashes)
            hash = get_be32(bitmap_git->hashes + index_pos);

That works for both regular pack bitmaps and for non-incremental midx
bitmaps. There is one bitmap_index with one "hashes" array, and
index_pos is within its bounds (we do the bounds-checking when we load
the bitmap).

But for an incremental midx bitmap, we have a linked list of
bitmap_index structs, and each one has only its own small slice of the
name-hash array. If index_pos refers to an object that is not in the
first bitmap_git of the chain, then we'll access memory outside of the
bounds of its "hashes" array, and often outside of the mmap.

Instead, we should walk through the list until we find the bitmap_index
which serves our index_pos, and use its hash (after adjusting index_pos
to make it relative to the slice we found). This is exactly what we do
elsewhere for incremental midx lookups (like the pack_pos_to_midx() call
a few lines above). But we can't use existing helpers like
midx_for_object() here, because we're walking through the chain of
bitmap_index structs (each of which refers to a midx), not the chain of
incremental multi_pack_index structs themselves.
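
The walk described above can be sketched in isolation.  This is only an
illustration under simplifying assumptions: `struct layer` and
layer_name_hash() are hypothetical stand-ins for the bitmap_index chain,
and the hashes are plain host-order integers rather than big-endian
values read from an mmap:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified stand-in for one link of an incremental bitmap chain.
 * Each layer covers the global positions
 * [num_in_base, num_in_base + num_objects).
 */
struct layer {
	struct layer *base;      /* next older layer, or NULL */
	uint32_t num_in_base;    /* objects in all older layers combined */
	uint32_t num_objects;    /* objects in this layer only */
	const uint32_t *hashes;  /* this layer's slice, or NULL if absent */
};

/* Look up the name-hash for global position `pos`, starting at the tip. */
static uint32_t layer_name_hash(const struct layer *l, uint32_t pos)
{
	/* Walk towards the base until we reach the layer that owns `pos`. */
	while (l && pos < l->num_in_base)
		l = l->base;
	assert(l);

	pos -= l->num_in_base;   /* make it relative to this slice */
	assert(pos < l->num_objects);

	return l->hashes ? l->hashes[pos] : 0;
}
```

With a base layer holding 2 objects and a tip holding 3, global
position 4 resolves to index 2 of the tip's slice, while position 1 is
served by the base layer.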

The problem is triggered in the test suite, but we don't get a segfault
because the out-of-bounds index is too small. The OS typically rounds
our mmap up to the nearest page size, so we just end up accessing some
extra zero'd memory. Nor do we catch it with ASan, since it doesn't seem
to instrument mmaps at all. But if we build with NO_MMAP, then our maps
are replaced with heap allocations, which ASan does check. And so:

  make NO_MMAP=1 SANITIZE=address
  cd t
  ./t5334-incremental-multi-pack-index.sh

does show the problem (and this patch makes it go away).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 pack-bitmap.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index ac6d62b980c5a8..9c9aa7b67c3197 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -212,6 +212,28 @@ static uint32_t bitmap_num_objects(struct bitmap_index *index)
 	return index->pack->num_objects;
 }
 
+static uint32_t bitmap_name_hash(struct bitmap_index *index, uint32_t pos)
+{
+	if (bitmap_is_midx(index)) {
+		while (index && pos < index->midx->num_objects_in_base) {
+			ASSERT(bitmap_is_midx(index));
+			index = index->base;
+		}
+
+		if (!index)
+			BUG("NULL base bitmap for object position: %"PRIu32, pos);
+
+		pos -= index->midx->num_objects_in_base;
+		if (pos >= index->midx->num_objects)
+			BUG("out-of-bounds midx bitmap object at %"PRIu32, pos);
+	}
+
+	if (!index->hashes)
+		return 0;
+
+	return get_be32(index->hashes + pos);
+}
+
 static struct repository *bitmap_repo(struct bitmap_index *bitmap_git)
 {
 	if (bitmap_is_midx(bitmap_git))
@@ -1726,8 +1748,7 @@ static void show_objects_for_type(
 				pack = bitmap_git->pack;
 			}
 
-			if (bitmap_git->hashes)
-				hash = get_be32(bitmap_git->hashes + index_pos);
+			hash = bitmap_name_hash(bitmap_git, index_pos);
 
 			show_reach(&oid, object_type, 0, hash, pack, ofs, payload);
 		}
@@ -3083,8 +3104,8 @@ uint32_t *create_bitmap_mapping(struct bitmap_index *bitmap_git,
 
 		if (oe) {
 			reposition[i] = oe_in_pack_pos(mapping, oe) + 1;
-			if (bitmap_git->hashes && !oe->hash)
-				oe->hash = get_be32(bitmap_git->hashes + index_pos);
+			if (!oe->hash)
+				oe->hash = bitmap_name_hash(bitmap_git, index_pos);
 		}
 	}
 

From a9990f8ec0e750a68802f7d79d2aa2293c4811b4 Mon Sep 17 00:00:00 2001
From: Jeff King <peff@peff.net>
Date: Tue, 18 Nov 2025 04:12:13 -0500
Subject: [PATCH 093/553] Makefile: turn on NO_MMAP when building with ASan

Git often uses mmap() to access on-disk files. This leaves a blind spot
in our SANITIZE=address builds, since ASan does not seem to handle mmap
at all. Nor does the OS notice most out-of-bounds access, since it tends
to round up to the nearest page size (so depending on how big the map
is, you might have to overrun it by up to 4095 bytes to trigger a
segfault).
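
To put a number on that slack, here is a small sketch.  The helper
slack_after_map() is hypothetical and assumes the page size is a power
of two (4096 in the figure quoted above):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes between the end of a `len`-byte mapping and the end of its last
 * page, i.e. how far an overrun can go before it leaves mapped memory.
 * `page_size` must be a non-zero power of two.
 */
static size_t slack_after_map(size_t len, size_t page_size)
{
	assert(page_size && !(page_size & (page_size - 1)));
	return (page_size - (len & (page_size - 1))) & (page_size - 1);
}
```

A 1-byte map leaves 4095 bytes of slack on a 4096-byte page, a
5000-byte map leaves 3192, and only a map whose length is an exact
multiple of the page size leaves none.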

The previous commit demonstrates a memory bug that we missed. We could
have made a new test where the out-of-bounds access was much larger, or
where the mapped file ended closer to a page boundary. But the point of
running the test suite with sanitizers is to catch these problems
without having to construct specific tests.

Let's enable NO_MMAP for our ASan builds by default, which should give
us better coverage. This does increase the memory usage of Git, since
we're copying from the filesystem into heap. But the repositories in the
test suite tend to be small, so the overhead isn't really noticeable
(and ASan already has quite a performance penalty).

There are a few other known bugs that this patch will help flush out.
However, they aren't directly triggered in the test suite (yet). So
it's safe to turn this on now without breaking the test suite, which
will help us add new tests to demonstrate those other bugs as we fix
them.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Makefile    | 1 +
 meson.build | 8 +++++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 70d1543b6b8688..c2d327838a64cb 100644
--- a/Makefile
+++ b/Makefile
@@ -1518,6 +1518,7 @@ SANITIZE_LEAK = YesCompiledWithIt
 endif
 ifneq ($(filter address,$(SANITIZERS)),)
 NO_REGEX = NeededForASAN
+NO_MMAP = NeededForASAN
 SANITIZE_ADDRESS = YesCompiledWithIt
 endif
 endif
diff --git a/meson.build b/meson.build
index 596f5ac7110ebf..269769b166a8c7 100644
--- a/meson.build
+++ b/meson.build
@@ -1369,12 +1369,18 @@ if host_machine.system() == 'windows'
   libgit_c_args += '-DUSE_WIN32_MMAP'
 else
   checkfuncs += {
-    'mmap' : ['mmap.c'],
     # provided by compat/mingw.c.
     'unsetenv' : ['unsetenv.c'],
     # provided by compat/mingw.c.
     'getpagesize' : [],
   }
+
+  if get_option('b_sanitize').contains('address')
+    libgit_c_args += '-DNO_MMAP'
+    libgit_sources += 'compat/mmap.c'
+  else
+    checkfuncs += { 'mmap': ['mmap.c'] }
+  endif
 endif
 
 foreach func, impls : checkfuncs

From c4c9089584d0ed04978e8d0945b2ba2985e67bd3 Mon Sep 17 00:00:00 2001
From: Jeff King <peff@peff.net>
Date: Tue, 18 Nov 2025 04:12:18 -0500
Subject: [PATCH 094/553] cache-tree: avoid strtol() on non-string buffer

A cache-tree extension entry in the index looks like this:

  <name> NUL <entry_nr> SPACE <subtree_nr> NEWLINE <binary_oid>

where the "_nr" items are human-readable base-10 ASCII. We parse them
with strtol(), even though we do not have a NUL-terminated string (we'd
generally have an mmap() of the on-disk index file). For a well-formed
entry, this is not a problem; strtol() will stop when it sees the
newline. But there are two problems:

  1. A corrupted entry could omit the newline, causing us to read
     further. You'd mostly get stopped by seeing non-digits in the oid
     field (and if it is likewise truncated, there will still be 20 or
     more bytes of the index checksum). So it's possible, though
     unlikely, to read off the end of the mmap'd buffer. Of course a
     malicious index file can fake the oid and the index checksum to all
     (ASCII) 0's.

     This is further complicated by the fact that mmap'd buffers tend to
     be zero-padded up to the page boundary. So to run off the end, the
     index size also has to be a multiple of the page size. This is also
     unlikely, though you can construct a malicious index file that
     matches this.

     The security implications aren't too interesting. The index file is
     a local file anyway (so you can't attack somebody by cloning, but
     only by convincing them to operate in a .git directory you made, at
     which point attacking .git/config is much easier). And it's just
     a read overflow via strtol(), which is unlikely to buy you much
     beyond a crash.

  2. ASan has a strict_string_checks option, which tells it to make sure
     that arguments to string functions (like strtol) have some eventual
     NUL, without regard to what the function would actually do (like
     stopping at a newline here). This option sometimes has false
     positives, but it can point to sketchy areas (like this one) where
     the input we use doesn't exhibit a problem, but different input
     _could_ cause us to misbehave.

Let's fix it by just parsing the values ourselves with a helper function
that is careful not to go past the end of the buffer. There are a few
behavior changes here that should not matter:

  - We do not consider overflow, as strtol() would. But nor did the
    original code. However, we don't trust the value we get from the
    on-disk file, and if it says to read 2^30 entries, we would notice
    that we do not have that many and bail before reading off the end of
    the buffer.

  - Our helper does not skip past extra leading whitespace as strtol()
    would, but according to gitformat-index(5) there should not be any.

  - The original quit parsing at a newline or a NUL byte, but now we
    insist on a newline (which is what the documentation says, and what
    Git has always produced).

Since we are providing our own helper function, we can tweak the
interface a bit to make our lives easier. The original code does not use
strtol's "end" pointer to find the end of the parsed data, but rather
uses a separate loop to advance our "buf" pointer to the trailing
newline. We can instead provide a helper that advances "buf" as it
parses, letting us read strictly left-to-right through the buffer.

I didn't add a new test here. It's surprisingly difficult to construct
an index of exactly the right size due to the way we pad entries. But it
is easy to trigger the problem in existing tests when using ASan's
strict string checking, coupled with a recent change to use NO_MMAP with
ASan builds. So:

  make SANITIZE=address
  cd t
  ASAN_OPTIONS=strict_string_checks=1 ./t0090-cache-tree.sh

triggers it reliably. Technically it is not deterministic because there
is a ~8% chance (it's 1-(255/256)^20, or ^32 for sha256) that the trailing
checksum hash has a NUL byte in it. But we compute enough cache-trees in
the course of that script that we are very likely to hit the problem in
one of them.

We can look at making strict_string_checks the default for ASan builds,
but there are some other cases we'd want to fix first.

Reported-by: correctmost <cmlists@sent.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 cache-tree.c | 50 +++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 37 insertions(+), 13 deletions(-)

diff --git a/cache-tree.c b/cache-tree.c
index fa3858e2829aa8..2309911dfa0309 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -548,12 +548,41 @@ void cache_tree_write(struct strbuf *sb, struct cache_tree *root)
 	trace2_region_leave("cache_tree", "write", the_repository);
 }
 
+static int parse_int(const char **ptr, unsigned long *len_p, int *out)
+{
+	const char *s = *ptr;
+	unsigned long len = *len_p;
+	int ret = 0;
+	int sign = 1;
+
+	while (len && *s == '-') {
+		sign *= -1;
+		s++;
+		len--;
+	}
+
+	while (len) {
+		if (!isdigit(*s))
+			break;
+		ret *= 10;
+		ret += *s - '0';
+		s++;
+		len--;
+	}
+
+	if (s == *ptr)
+		return -1;
+
+	*ptr = s;
+	*len_p = len;
+	*out = sign * ret;
+	return 0;
+}
+
 static struct cache_tree *read_one(const char **buffer, unsigned long *size_p)
 {
 	const char *buf = *buffer;
 	unsigned long size = *size_p;
-	const char *cp;
-	char *ep;
 	struct cache_tree *it;
 	int i, subtree_nr;
 	const unsigned rawsz = the_hash_algo->rawsz;
@@ -569,19 +598,14 @@ static struct cache_tree *read_one(const char **buffer, unsigned long *size_p)
 	buf++; size--;
 	it = cache_tree();
 
-	cp = buf;
-	it->entry_count = strtol(cp, &ep, 10);
-	if (cp == ep)
+	if (parse_int(&buf, &size, &it->entry_count) < 0)
 		goto free_return;
-	cp = ep;
-	subtree_nr = strtol(cp, &ep, 10);
-	if (cp == ep)
+	if (!size || *buf != ' ')
 		goto free_return;
-	while (size && *buf && *buf != '\n') {
-		size--;
-		buf++;
-	}
-	if (!size)
+	buf++; size--;
+	if (parse_int(&buf, &size, &subtree_nr) < 0)
+		goto free_return;
+	if (!size || *buf != '\n')
 		goto free_return;
 	buf++; size--;
 	if (0 <= it->entry_count) {

From 0b6ec075df2ac77a4792b8b1a7290a36b636012b Mon Sep 17 00:00:00 2001
From: Jeff King <peff@peff.net>
Date: Tue, 18 Nov 2025 04:12:20 -0500
Subject: [PATCH 095/553] fsck: assert newline presence in fsck_ident()

The fsck code purports to handle buffers that are not NUL-terminated,
but fsck_ident() uses some string functions. This works OK in practice,
as explained in 8e4309038f (fsck: do not assume NUL-termination of
buffers, 2023-01-19). Before calling fsck_ident() we'll have called
verify_headers(), which makes sure we have at least a trailing newline.
And none of our string-like functions will walk past that newline.

However, that makes this code at the top of fsck_ident() very confusing:

    *ident = strchrnul(*ident, '\n');
    if (**ident == '\n')
            (*ident)++;

We should always see that newline, or our memory safety assumptions have
been violated! Further, using strchrnul() is weird, since the whole
point is that if the newline is not there, we don't necessarily have a
NUL at all, and might read off the end of the buffer.

So let's have callers pass in the boundary of our buffer, which lets us
safely find the newline with memchr(). And if it is not there, this is a
BUG(), because it means our caller did not validate the input with
verify_headers() as it was supposed to (and we are better off bailing
rather than having memory-safety problems).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 fsck.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/fsck.c b/fsck.c
index 8dc8472ceb3781..3d1027dc877424 100644
--- a/fsck.c
+++ b/fsck.c
@@ -859,16 +859,18 @@ static int verify_headers(const void *data, unsigned long size,
 		FSCK_MSG_UNTERMINATED_HEADER, "unterminated header");
 }
 
-static int fsck_ident(const char **ident,
+static int fsck_ident(const char **ident, const char *ident_end,
 		      const struct object_id *oid, enum object_type type,
 		      struct fsck_options *options)
 {
 	const char *p = *ident;
+	const char *nl;
 	char *end;
 
-	*ident = strchrnul(*ident, '\n');
-	if (**ident == '\n')
-		(*ident)++;
+	nl = memchr(p, '\n', ident_end - p);
+	if (!nl)
+		BUG("verify_headers() should have made sure we have a newline");
+	*ident = nl + 1;
 
 	if (*p == '<')
 		return report(options, oid, type, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
@@ -957,7 +959,7 @@ static int fsck_commit(const struct object_id *oid,
 	author_count = 0;
 	while (buffer < buffer_end && skip_prefix(buffer, "author ", &buffer)) {
 		author_count++;
-		err = fsck_ident(&buffer, oid, OBJ_COMMIT, options);
+		err = fsck_ident(&buffer, buffer_end, oid, OBJ_COMMIT, options);
 		if (err)
 			return err;
 	}
@@ -969,7 +971,7 @@ static int fsck_commit(const struct object_id *oid,
 		return err;
 	if (buffer >= buffer_end || !skip_prefix(buffer, "committer ", &buffer))
 		return report(options, oid, OBJ_COMMIT, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
-	err = fsck_ident(&buffer, oid, OBJ_COMMIT, options);
+	err = fsck_ident(&buffer, buffer_end, oid, OBJ_COMMIT, options);
 	if (err)
 		return err;
 	if (memchr(buffer_begin, '\0', size)) {
@@ -1064,7 +1066,7 @@ int fsck_tag_standalone(const struct object_id *oid, const char *buffer,
 			goto done;
 	}
 	else
-		ret = fsck_ident(&buffer, oid, OBJ_TAG, options);
+		ret = fsck_ident(&buffer, buffer_end, oid, OBJ_TAG, options);
 
 	if (buffer < buffer_end && !starts_with(buffer, "\n")) {
 		/*

From 830424def4896dbf041d41dad873f5b86fdf9bfa Mon Sep 17 00:00:00 2001
From: Jeff King <peff@peff.net>
Date: Tue, 18 Nov 2025 04:12:23 -0500
Subject: [PATCH 096/553] fsck: avoid strcspn() in fsck_ident()

We may be operating on a buffer that is not NUL-terminated, but we use
strcspn() to parse it. This is OK in practice, as discussed in
8e4309038f (fsck: do not assume NUL-termination of buffers, 2023-01-19),
because we know there is at least a trailing newline in our buffer, and
we always pass "\n" to strcspn(). So we know it will stop before running
off the end of the buffer.

But this is a subtle point to hang our memory safety hat on. And it
confuses ASan's strict_string_checks mode, even though it is technically
a false positive (that mode complains that we have no NUL, which is
true, but it does not know that we have verified the presence of the
newline already).

Let's instead open-code the loop. As a bonus, this makes the logic more
obvious (to my mind, anyway). The current code skips forward with
strcspn until it hits "<", ">", or "\n". But then it must check which it
saw to decide if that was what we expected or not, duplicating some
logic between what's in the strcspn() and what's in the domain logic.
Instead, we can just check each character as we loop and act on it
immediately.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 fsck.c | 32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/fsck.c b/fsck.c
index 3d1027dc877424..3696f1b849ad9f 100644
--- a/fsck.c
+++ b/fsck.c
@@ -874,18 +874,30 @@ static int fsck_ident(const char **ident, const char *ident_end,
 
 	if (*p == '<')
 		return report(options, oid, type, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
-	p += strcspn(p, "<>\n");
-	if (*p == '>')
-		return report(options, oid, type, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name");
-	if (*p != '<')
-		return report(options, oid, type, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email");
+	for (;;) {
+		if (p >= ident_end || *p == '\n')
+			return report(options, oid, type, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email");
+		if (*p == '>')
+			return report(options, oid, type, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name");
+		if (*p == '<')
+			break; /* end of name, beginning of email */
+
+		/* otherwise, skip past arbitrary name char */
+		p++;
+	}
 	if (p[-1] != ' ')
 		return report(options, oid, type, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
-	p++;
-	p += strcspn(p, "<>\n");
-	if (*p != '>')
-		return report(options, oid, type, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email");
-	p++;
+	p++; /* skip past '<' we found */
+	for (;;) {
+		if (p >= ident_end || *p == '<' || *p == '\n')
+			return report(options, oid, type, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email");
+		if (*p == '>')
+			break; /* end of email */
+
+		/* otherwise, skip past arbitrary email char */
+		p++;
+	}
+	p++; /* skip past '>' we found */
 	if (*p != ' ')
 		return report(options, oid, type, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date");
 	p++;

From f05df7ffca492b37d604ad6beed788055eb56ebd Mon Sep 17 00:00:00 2001
From: Jeff King <peff@peff.net>
Date: Tue, 18 Nov 2025 04:12:25 -0500
Subject: [PATCH 097/553] fsck: remove redundant date timestamp check

After calling "parse_timestamp(p, &end, 10)", we complain if "p == end",
which would imply that we did not see any digits at all. But we know
this cannot be the case, since we would have bailed already if we did
not see any digits, courtesy of extra checks added by 8e4309038f (fsck:
do not assume NUL-termination of buffers, 2023-01-19). Since then,
checking "p == end" is redundant and we can drop it.

This will make our lives a little easier as we refactor further.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 fsck.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index 3696f1b849ad9f..98b16a9e584060 100644
--- a/fsck.c
+++ b/fsck.c
@@ -919,7 +919,7 @@ static int fsck_ident(const char **ident, const char *ident_end,
 		return report(options, oid, type, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
 	if (date_overflows(parse_timestamp(p, &end, 10)))
 		return report(options, oid, type, FSCK_MSG_BAD_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
-	if ((end == p || *end != ' '))
+	if (*end != ' ')
 		return report(options, oid, type, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date");
 	p = end + 1;
 	if ((*p != '+' && *p != '-') ||

From 5a993593b24df699f60841296795f9a6ca60d399 Mon Sep 17 00:00:00 2001
From: Jeff King <peff@peff.net>
Date: Tue, 18 Nov 2025 04:12:28 -0500
Subject: [PATCH 098/553] fsck: avoid parse_timestamp() on buffer that isn't
 NUL-terminated

In fsck_ident(), we parse the timestamp with parse_timestamp(), which is
really an alias for strtoumax(). But since our buffer may not be
NUL-terminated, this can trigger a complaint from ASan's
strict_string_checks mode. This is a false positive, since we know that
the buffer contains a trailing newline (which we checked earlier in the
function), and that strtoumax() would stop there.

But it is worth working around ASan's complaint, for two reasons. One
is that it will let us turn on strict_string_checks by default, which
has helped catch other real problems. Two is that the safety of the
current code is very hard to reason about (it subtly depends on distant
code which could change).

One option here is to just parse the number left-to-right ourselves. But
we care about the size of a timestamp_t and detecting overflow, since
that's part of the point of these checks. And doing that correctly is
tricky. So we'll instead just pull the digits into a separate,
NUL-terminated buffer, and use that to call parse_timestamp().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 fsck.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/fsck.c b/fsck.c
index 98b16a9e584060..ec45f786d6ed74 100644
--- a/fsck.c
+++ b/fsck.c
@@ -859,13 +859,28 @@ static int verify_headers(const void *data, unsigned long size,
 		FSCK_MSG_UNTERMINATED_HEADER, "unterminated header");
 }
 
+static timestamp_t parse_timestamp_from_buf(const char **start, const char *end)
+{
+	const char *p = *start;
+	char buf[24]; /* big enough for 2^64 */
+	size_t i = 0;
+
+	while (p < end && isdigit(*p)) {
+		if (i >= ARRAY_SIZE(buf) - 1)
+			return TIME_MAX;
+		buf[i++] = *p++;
+	}
+	buf[i] = '\0';
+	*start = p;
+	return parse_timestamp(buf, NULL, 10);
+}
+
 static int fsck_ident(const char **ident, const char *ident_end,
 		      const struct object_id *oid, enum object_type type,
 		      struct fsck_options *options)
 {
 	const char *p = *ident;
 	const char *nl;
-	char *end;
 
 	nl = memchr(p, '\n', ident_end - p);
 	if (!nl)
@@ -917,11 +932,11 @@ static int fsck_ident(const char **ident, const char *ident_end,
 			      "invalid author/committer line - bad date");
 	if (*p == '0' && p[1] != ' ')
 		return report(options, oid, type, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
-	if (date_overflows(parse_timestamp(p, &end, 10)))
+	if (date_overflows(parse_timestamp_from_buf(&p, ident_end)))
 		return report(options, oid, type, FSCK_MSG_BAD_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
-	if (*end != ' ')
+	if (*p != ' ')
 		return report(options, oid, type, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date");
-	p = end + 1;
+	p++;
 	if ((*p != '+' && *p != '-') ||
 	    !isdigit(p[1]) ||
 	    !isdigit(p[2]) ||

From a031b6181a1e1ee6768d19d6a03b031b6e9004e9 Mon Sep 17 00:00:00 2001
From: Jeff King <peff@peff.net>
Date: Tue, 18 Nov 2025 04:12:30 -0500
Subject: [PATCH 099/553] t: enable ASan's strict_string_checks option

ASan has an option to enable strict string checking, where any pointer
passed to a function that expects a NUL-terminated string will be
checked for that NUL termination. This can sometimes produce false
positives. E.g., it is not wrong to pass a buffer with { '1', '2', '\n' }
into strtoul(). Even though it is not NUL-terminated, it will stop at
the newline.

But in trying it out, it identified two problematic spots in our test
suite (which have now been adjusted):

  1. The strtol() parsing in cache-tree.c was a real potential problem,
     which would have been very hard to find otherwise (since it
     required constructing a very specific broken index file).

  2. The uses of string functions in fsck_ident() were false positives,
     because we knew that there was always a trailing newline which
     would stop the functions from reading off the end of the buffer.
     But the reasoning behind that is somewhat fragile, and silencing
     those complaints made the code easier to reason about.

So even though this did not find any earth-shattering bugs, and even had
a few false positives, I'm sufficiently convinced that its complaints
are more helpful than hurtful. Let's turn it on by default (since the
test suite now runs cleanly with it) and see if it ever turns up any
other instances.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/test-lib.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 92d0db13d7429d..bbda7abb16c1c4 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -77,6 +77,7 @@ prepend_var GIT_SAN_OPTIONS : strip_path_prefix="$GIT_BUILD_DIR/"
 # want that one to complain to stderr).
 prepend_var ASAN_OPTIONS : $GIT_SAN_OPTIONS
 prepend_var ASAN_OPTIONS : detect_leaks=0
+prepend_var ASAN_OPTIONS : strict_string_checks=1
 export ASAN_OPTIONS
 
 prepend_var LSAN_OPTIONS : $GIT_SAN_OPTIONS

From e96105aa1741ebb61c17a59aee962058a3743e09 Mon Sep 17 00:00:00 2001
From: Jeff King <peff@peff.net>
Date: Tue, 18 Nov 2025 04:32:43 -0500
Subject: [PATCH 100/553] unit-test: ignore --no-chain-lint

In the same spirit as 9faf3963b6 (t: introduce compatibility options to
clar-based tests, 2024-12-13), we should ignore --no-chain-lint passed
to our clar tests, since it may appear in GIT_TEST_OPTS to be used with
other tests.

This is particularly important on Windows CI, where --no-chain-lint is
added to the test options by default, and the meson build will pass all
options to the unit tests. The only reason our meson Windows CI job does
not run into this currently is that it is not respecting GIT_TEST_OPTS
at all! So ignoring this option is a prerequisite to fixing that
situation.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/unit-tests/unit-test.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/unit-tests/unit-test.c b/t/unit-tests/unit-test.c
index 5af645048adf4e..752fb38fb324a1 100644
--- a/t/unit-tests/unit-test.c
+++ b/t/unit-tests/unit-test.c
@@ -29,6 +29,7 @@ int cmd_main(int argc, const char **argv)
 		OPT_NOOP_NOARG('d', "debug"),
 		OPT_NOOP_NOARG(0, "github-workflow-markup"),
 		OPT_NOOP_NOARG(0, "no-bin-wrappers"),
+		OPT_NOOP_ARG(0, "no-chain-lint"),
 		OPT_NOOP_ARG(0, "root"),
 		OPT_NOOP_ARG(0, "stress"),
 		OPT_NOOP_NOARG(0, "tee"),

From 17bd1108eac98d8977a24233a498143ffd577c31 Mon Sep 17 00:00:00 2001
From: Jeff King <peff@peff.net>
Date: Tue, 18 Nov 2025 04:35:19 -0500
Subject: [PATCH 101/553] ci(windows-meson-test): handle options and output
 like other test jobs

The GitHub windows-meson-test jobs directly run "meson test" with the
--slice option. This means they skip all of the ci/lib.sh
infrastructure, and in particular:

  1. They do not actually set any GIT_TEST_OPTS like --verbose-log or
     -x.

  2. They do not do the usual handle_failed_tests() magic to print test
     failures or tar up failed directories.

As a result, you get almost no feedback at all when a test fails in this
job, making debugging rather tricky.

Let's try to make this behave more like the other CI jobs. Because we're
on Windows, we can't just use the normal run-build-and-tests.sh script.
Our build runs as a separate job (like the non-meson Windows job), and
then we parallelize the tests across several job slices. So we need
something like the run-test-slice.sh script that the "windows-test" job
uses.

In theory we could just swap out the "make" invocation there for
"meson". But it doesn't quite work, because "make" knows how to pull
GIT_TEST_OPTS out of GIT-BUILD-OPTIONS automatically. But for meson, we
have to extract them into the --test-args option ourselves. I tried
making the logic in run-test-slice.sh conditional, but there ended up
being hardly any common code at all (and there are some tricky ordering
constraints). So I ended up with a new meson-specific test-slice runner.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 .github/workflows/main.yml | 12 +++++++++++-
 ci/run-test-slice-meson.sh | 13 +++++++++++++
 2 files changed, 24 insertions(+), 1 deletion(-)
 create mode 100755 ci/run-test-slice-meson.sh

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index aa6bce673b4e99..e8c824406f17f3 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -298,7 +298,17 @@ jobs:
         path: build
     - name: Test
       shell: pwsh
-      run: meson test -C build --no-rebuild --print-errorlogs --slice "$(1+${{ matrix.nr }})/10"
+      run: ci/run-test-slice-meson.sh build ${{matrix.nr}} 10
+    - name: print test failures
+      if: failure() && env.FAILED_TEST_ARTIFACTS != ''
+      shell: bash
+      run: ci/print-test-failures.sh
+    - name: Upload failed tests' directories
+      if: failure() && env.FAILED_TEST_ARTIFACTS != ''
+      uses: actions/upload-artifact@v4
+      with:
+        name: failed-tests-windows-meson-${{ matrix.nr }}
+        path: ${{env.FAILED_TEST_ARTIFACTS}}
 
   regular:
     name: ${{matrix.vector.jobname}} (${{matrix.vector.pool}})
diff --git a/ci/run-test-slice-meson.sh b/ci/run-test-slice-meson.sh
new file mode 100755
index 00000000000000..961c94fba0b2ee
--- /dev/null
+++ b/ci/run-test-slice-meson.sh
@@ -0,0 +1,13 @@
+#!/bin/sh
+
+# We must load the build options so we know where to find
+# things like TEST_OUTPUT_DIRECTORY. This has to come before
+# loading lib.sh, though, because it may clobber some CI lib
+# variables like our custom GIT_TEST_OPTS.
+. "$1"/GIT-BUILD-OPTIONS
+. ${0%/*}/lib.sh
+
+group "Run tests" \
+	meson test -C "$1" --no-rebuild --print-errorlogs \
+		--test-args="$GIT_TEST_OPTS" --slice "$((1+$2))/$3" ||
+handle_failed_tests

From 14b561e7685ee91d6a3d39684f9089c902641083 Mon Sep 17 00:00:00 2001
From: Jeff King <peff@peff.net>
Date: Tue, 18 Nov 2025 07:21:24 -0500
Subject: [PATCH 102/553] test-mktemp: plug memory and descriptor leaks

We test xmkstemp() in our helper by just calling:

  xmkstemp(xstrdup(argv[1]));

This leaks both the copied string as well as the descriptor returned by
the function. In practice this isn't a big deal, since we immediately
exit the program, but:

  1. LSan will complain about the memory leak. The only reason we did
     not notice this in our leak-checking builds is that both of the
     callers in the test suite (both in t0070) pass a broken template
     (and expect failure). So the function calls die() before we can
     actually leak.

     But it's an accident waiting to happen if anybody adds a call which
     succeeds.

  2. Coverity complains about the descriptor leak. There's a long list
     of uninteresting or false positives in Coverity's results, but
     since we're here we might as well fix it, too.

I didn't bother adding a new test that triggers the leak. It's not even
in real production code, but just in the test-helper itself.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/helper/test-mktemp.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/t/helper/test-mktemp.c b/t/helper/test-mktemp.c
index 22906889402933..da195640a9dcc8 100644
--- a/t/helper/test-mktemp.c
+++ b/t/helper/test-mktemp.c
@@ -6,10 +6,16 @@
 
 int cmd__mktemp(int argc, const char **argv)
 {
+	char *template;
+	int fd;
+
 	if (argc != 2)
 		usage("Expected 1 parameter defining the temporary file template");
+	template = xstrdup(argv[1]);
 
-	xmkstemp(xstrdup(argv[1]));
+	fd = xmkstemp(template);
 
+	close(fd);
+	free(template);
 	return 0;
 }

From a6238ee16371247517e39da36782614a229184ff Mon Sep 17 00:00:00 2001
From: Phillip Wood <phillip.wood@dunelm.org.uk>
Date: Tue, 18 Nov 2025 16:07:32 +0000
Subject: [PATCH 103/553] worktree list: fix column spacing

The output of "git worktree list" displays a table containing the
worktree path, HEAD OID and branch name for each worktree. The code
aligns the columns by measuring the visual width of the worktree path
when it is printed. Unfortunately it fails to use the visual width
when calculating the width of the column so, if any of the paths
contain a multibyte character, we can end up with excess padding
between columns. The simplest fix would be to replace strlen() with
utf8_strwidth() in measure_widths(). However that leaves us measuring
the visual width twice and the byte length once. By caching the visual
width and printing the padding separately to the worktree path, we only
need to calculate the visual width once and do not need the byte length
at all. The visual widths are stored in an array of structs rather
than an array of ints as the next commit will add more struct members.

Even if there are no multibyte characters in any of the paths we still
print an extra space between the path and the object id as the field
width is calculated as one plus the length of the path and we print an
explicit space as well. This is fixed by not printing the extra space.

The tests are updated to include multibyte characters in one of the
worktree paths and to check the spacing of the columns.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/worktree.c       | 35 +++++++++++++++++++++++------------
 t/t2402-worktree-list.sh | 22 ++++++++++------------
 2 files changed, 33 insertions(+), 24 deletions(-)

diff --git a/builtin/worktree.c b/builtin/worktree.c
index 812774a5ca992c..0643a22ee58b89 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -979,14 +979,17 @@ static void show_worktree_porcelain(struct worktree *wt, int line_terminator)
 	fputc(line_terminator, stdout);
 }
 
-static void show_worktree(struct worktree *wt, int path_maxlen, int abbrev_len)
+struct worktree_display {
+	int width;
+};
+
+static void show_worktree(struct worktree *wt, struct worktree_display *display,
+			  int path_maxwidth, int abbrev_len)
 {
 	struct strbuf sb = STRBUF_INIT;
-	int cur_path_len = strlen(wt->path);
-	int path_adj = cur_path_len - utf8_strwidth(wt->path);
 	const char *reason;
 
-	strbuf_addf(&sb, "%-*s ", 1 + path_maxlen + path_adj, wt->path);
+	strbuf_addf(&sb, "%s%*s", wt->path, 1 + path_maxwidth - display->width, "");
 	if (wt->is_bare)
 		strbuf_addstr(&sb, "(bare)");
 	else {
@@ -1020,20 +1023,24 @@ static void show_worktree(struct worktree *wt, int path_maxlen, int abbrev_len)
 	strbuf_release(&sb);
 }
 
-static void measure_widths(struct worktree **wt, int *abbrev, int *maxlen)
+static void measure_widths(struct worktree **wt, int *abbrev,
+			   struct worktree_display **d, int *maxwidth)
 {
-	int i;
+	int i, display_alloc = 0;
+	struct worktree_display *display = NULL;
 
 	for (i = 0; wt[i]; i++) {
 		int sha1_len;
-		int path_len = strlen(wt[i]->path);
+		ALLOC_GROW(display, i + 1, display_alloc);
+		display[i].width = utf8_strwidth(wt[i]->path);
 
-		if (path_len > *maxlen)
-			*maxlen = path_len;
+		if (display[i].width > *maxwidth)
+			*maxwidth = display[i].width;
 		sha1_len = strlen(repo_find_unique_abbrev(the_repository, &wt[i]->head_oid, *abbrev));
 		if (sha1_len > *abbrev)
 			*abbrev = sha1_len;
 	}
+	*d = display;
 }
 
 static int pathcmp(const void *a_, const void *b_)
@@ -1079,21 +1086,25 @@ static int list(int ac, const char **av, const char *prefix,
 		die(_("the option '%s' requires '%s'"), "-z", "--porcelain");
 	else {
 		struct worktree **worktrees = get_worktrees();
-		int path_maxlen = 0, abbrev = DEFAULT_ABBREV, i;
+		int path_maxwidth = 0, abbrev = DEFAULT_ABBREV, i;
+		struct worktree_display *display = NULL;
 
 		/* sort worktrees by path but keep main worktree at top */
 		pathsort(worktrees + 1);
 
 		if (!porcelain)
-			measure_widths(worktrees, &abbrev, &path_maxlen);
+			measure_widths(worktrees, &abbrev,
+				       &display, &path_maxwidth);
 
 		for (i = 0; worktrees[i]; i++) {
 			if (porcelain)
 				show_worktree_porcelain(worktrees[i],
 							line_terminator);
 			else
-				show_worktree(worktrees[i], path_maxlen, abbrev);
+				show_worktree(worktrees[i],
+					      &display[i], path_maxwidth, abbrev);
 		}
+		free(display);
 		free_worktrees(worktrees);
 	}
 	return 0;
diff --git a/t/t2402-worktree-list.sh b/t/t2402-worktree-list.sh
index 8ef1cad7f29d7d..a494df6d612668 100755
--- a/t/t2402-worktree-list.sh
+++ b/t/t2402-worktree-list.sh
@@ -30,22 +30,20 @@ test_expect_success 'rev-parse --git-path objects linked worktree' '
 '
 
 test_expect_success '"list" all worktrees from main' '
-	echo "$(git rev-parse --show-toplevel) $(git rev-parse --short HEAD) [$(git symbolic-ref --short HEAD)]" >expect &&
-	test_when_finished "rm -rf here out actual expect && git worktree prune" &&
-	git worktree add --detach here main &&
-	echo "$(git -C here rev-parse --show-toplevel) $(git rev-parse --short HEAD) (detached HEAD)" >>expect &&
-	git worktree list >out &&
-	sed "s/  */ /g" <out >actual &&
+	echo "$(git rev-parse --show-toplevel)      $(git rev-parse --short HEAD) [$(git symbolic-ref --short HEAD)]" >expect &&
+	test_when_finished "rm -rf áááá out actual expect && git worktree prune" &&
+	git worktree add --detach áááá main &&
+	echo "$(git -C áááá rev-parse --show-toplevel) $(git rev-parse --short HEAD) (detached HEAD)" >>expect &&
+	git worktree list >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success '"list" all worktrees from linked' '
-	echo "$(git rev-parse --show-toplevel) $(git rev-parse --short HEAD) [$(git symbolic-ref --short HEAD)]" >expect &&
-	test_when_finished "rm -rf here out actual expect && git worktree prune" &&
-	git worktree add --detach here main &&
-	echo "$(git -C here rev-parse --show-toplevel) $(git rev-parse --short HEAD) (detached HEAD)" >>expect &&
-	git -C here worktree list >out &&
-	sed "s/  */ /g" <out >actual &&
+	echo "$(git rev-parse --show-toplevel)      $(git rev-parse --short HEAD) [$(git symbolic-ref --short HEAD)]" >expect &&
+	test_when_finished "rm -rf áááá out actual expect && git worktree prune" &&
+	git worktree add --detach áááá main &&
+	echo "$(git -C áááá rev-parse --show-toplevel) $(git rev-parse --short HEAD) (detached HEAD)" >>expect &&
+	git -C áááá worktree list >actual &&
 	test_cmp expect actual
 '
 

From 08dfa5983572645ae7cc51b49cadfdf216ecfec6 Mon Sep 17 00:00:00 2001
From: Phillip Wood <phillip.wood@dunelm.org.uk>
Date: Tue, 18 Nov 2025 16:07:33 +0000
Subject: [PATCH 104/553] worktree list: quote paths

If a worktree path contains newlines or other control characters,
it messes up the output of "git worktree list". Fix this by using
quote_path() to display the worktree path. The output of "git worktree
list" is designed for human consumption; scripts should be using the
"--porcelain" option, so this change should not break them.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
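
A minimal sketch, not part of this patch, of why the column width has to
be measured on the quoted string rather than on the raw path. The helper
sketch_quote() is hypothetical and much simpler than git's quote_path();
its `quote_high` flag only loosely imitates core.quotepath=true by also
escaping bytes >= 0x80.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Nonzero if byte `c` has to be escaped in this simplified scheme. */
static int sketch_needs_escape(unsigned char c, int quote_high)
{
	return c < 0x20 || c == 0x7f || c == '"' || c == '\\' ||
	       (quote_high && c >= 0x80);
}

/*
 * Hypothetical helper, not git's quote_path(): copy `in` to `out`,
 * adding double quotes and escapes only when some byte needs them.
 * `out` must hold at least 4 * strlen(in) + 3 bytes.
 * Returns the length of the result.
 */
static size_t sketch_quote(const char *in, char *out, int quote_high)
{
	char *o = out;
	int quote = 0;
	size_t i;

	assert(in && out);
	for (i = 0; in[i]; i++)
		quote |= sketch_needs_escape((unsigned char)in[i], quote_high);
	if (!quote) {
		strcpy(out, in);
		return strlen(out);
	}

	*o++ = '"';
	for (i = 0; in[i]; i++) {
		unsigned char c = (unsigned char)in[i];

		if (!sketch_needs_escape(c, quote_high))
			*o++ = in[i];
		else if (c == '\n')
			o += sprintf(o, "\\n");
		else if (c == '"' || c == '\\')
			o += sprintf(o, "\\%c", c);
		else
			o += sprintf(o, "\\%03o", (unsigned int)c);
	}
	*o++ = '"';
	*o = '\0';
	return (size_t)(o - out);
}
```

With `quote_high` set, the two-byte, one-column name "á" turns into ten
ASCII columns ("\303\241" plus the surrounding quotes), so padding that
was computed from the unquoted path would no longer line up.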
 builtin/worktree.c       | 10 ++++++++--
 t/t2402-worktree-list.sh | 15 ++++++++++++++-
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/builtin/worktree.c b/builtin/worktree.c
index 0643a22ee58b89..303cc3b2d64b64 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -980,6 +980,7 @@ static void show_worktree_porcelain(struct worktree *wt, int line_terminator)
 }
 
 struct worktree_display {
+	char *path;
 	int width;
 };
 
@@ -989,7 +990,7 @@ static void show_worktree(struct worktree *wt, struct worktree_display *display,
 	struct strbuf sb = STRBUF_INIT;
 	const char *reason;
 
-	strbuf_addf(&sb, "%s%*s", wt->path, 1 + path_maxwidth - display->width, "");
+	strbuf_addf(&sb, "%s%*s", display->path, 1 + path_maxwidth - display->width, "");
 	if (wt->is_bare)
 		strbuf_addstr(&sb, "(bare)");
 	else {
@@ -1028,11 +1029,14 @@ static void measure_widths(struct worktree **wt, int *abbrev,
 {
 	int i, display_alloc = 0;
 	struct worktree_display *display = NULL;
+	struct strbuf buf = STRBUF_INIT;
 
 	for (i = 0; wt[i]; i++) {
 		int sha1_len;
 		ALLOC_GROW(display, i + 1, display_alloc);
-		display[i].width = utf8_strwidth(wt[i]->path);
+		quote_path(wt[i]->path, NULL, &buf, 0);
+		display[i].width = utf8_strwidth(buf.buf);
+		display[i].path = strbuf_detach(&buf, NULL);
 
 		if (display[i].width > *maxwidth)
 			*maxwidth = display[i].width;
@@ -1104,6 +1108,8 @@ static int list(int ac, const char **av, const char *prefix,
 				show_worktree(worktrees[i],
 					      &display[i], path_maxwidth, abbrev);
 		}
+		for (i = 0; display && worktrees[i]; i++)
+			free(display[i].path);
 		free(display);
 		free_worktrees(worktrees);
 	}
diff --git a/t/t2402-worktree-list.sh b/t/t2402-worktree-list.sh
index a494df6d612668..e0c6abd2f58e20 100755
--- a/t/t2402-worktree-list.sh
+++ b/t/t2402-worktree-list.sh
@@ -29,7 +29,8 @@ test_expect_success 'rev-parse --git-path objects linked worktree' '
 	test_cmp expect actual
 '
 
-test_expect_success '"list" all worktrees from main' '
+test_expect_success '"list" all worktrees from main core.quotepath=false' '
+	test_config core.quotepath false &&
 	echo "$(git rev-parse --show-toplevel)      $(git rev-parse --short HEAD) [$(git symbolic-ref --short HEAD)]" >expect &&
 	test_when_finished "rm -rf áááá out actual expect && git worktree prune" &&
 	git worktree add --detach áááá main &&
@@ -38,7 +39,19 @@ test_expect_success '"list" all worktrees from main' '
 	test_cmp expect actual
 '
 
+test_expect_success '"list" all worktrees from main core.quotepath=true' '
+	test_config core.quotepath true &&
+	echo "$(git rev-parse --show-toplevel)            $(git rev-parse --short HEAD) [$(git symbolic-ref --short HEAD)]" >expect &&
+	test_when_finished "rm -rf á out actual expect && git worktree prune" &&
+	git worktree add --detach á main &&
+	echo "\"$(git -C á rev-parse --show-toplevel)\" $(git rev-parse --short HEAD) (detached HEAD)" |
+		sed s/á/\\\\303\\\\241/g >>expect &&
+	git worktree list >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success '"list" all worktrees from linked' '
+	test_config core.quotepath false &&
 	echo "$(git rev-parse --show-toplevel)      $(git rev-parse --short HEAD) [$(git symbolic-ref --short HEAD)]" >expect &&
 	test_when_finished "rm -rf áááá out actual expect && git worktree prune" &&
 	git worktree add --detach áááá main &&

From fd7d79d068dd14a4d7a4a93f7bfd31cf24020aec Mon Sep 17 00:00:00 2001
From: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Date: Tue, 18 Nov 2025 17:37:03 -0300
Subject: [PATCH 105/553] repo: factor out field printing to dedicated function

Move the field printing in git-repo-info to a new function called
`print_field`, allowing it to be called by functions other than
`print_fields`.

Also change its use of the quote_c_style() helper to write directly to
the standard output stream, instead of collecting the result in a strbuf
and then printing it ourselves.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
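
A minimal sketch, not part of this patch, of the two record layouts that
print_field() emits: "key=value" terminated by a newline, and "key",
newline, "value" terminated by a NUL byte. The names below are
hypothetical, the sketch writes into a caller-supplied buffer instead of
stdout, and it leaves out the quoting that the keyvalue format applies.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum sketch_format {
	SKETCH_KEYVALUE,
	SKETCH_NUL_TERMINATED
};

/*
 * Format one record into `out`. Returns the record length in bytes
 * (including the terminating NUL for SKETCH_NUL_TERMINATED), or -1
 * if the format is unknown or the buffer is too small.
 */
static int sketch_format_field(char *out, size_t outsz,
			       enum sketch_format format,
			       const char *key, const char *value)
{
	int n;

	assert(out && key && value);
	switch (format) {
	case SKETCH_KEYVALUE:
		n = snprintf(out, outsz, "%s=%s\n", key, value);
		break;
	case SKETCH_NUL_TERMINATED:
		n = snprintf(out, outsz, "%s\n%s", key, value);
		break;
	default:
		return -1;
	}
	if (n < 0 || (size_t)n >= outsz)
		return -1;
	/* The NUL written by snprintf() doubles as the record terminator. */
	return format == SKETCH_NUL_TERMINATED ? n + 1 : n;
}
```

Keeping the format switch in one function is what lets a second caller
reuse it without repeating the switch.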
 builtin/repo.c | 34 ++++++++++++++++++----------------
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/builtin/repo.c b/builtin/repo.c
index 9d4749f79befa8..f9fb4184940e2e 100644
--- a/builtin/repo.c
+++ b/builtin/repo.c
@@ -85,13 +85,29 @@ static get_value_fn *get_value_fn_for_key(const char *key)
 	return found ? found->get_value : NULL;
 }
 
+static void print_field(enum output_format format, const char *key,
+			const char *value)
+{
+	switch (format) {
+	case FORMAT_KEYVALUE:
+		printf("%s=", key);
+		quote_c_style(value, NULL, stdout, 0);
+		putchar('\n');
+		break;
+	case FORMAT_NUL_TERMINATED:
+		printf("%s\n%s%c", key, value, '\0');
+		break;
+	default:
+		BUG("not a valid output format: %d", format);
+	}
+}
+
 static int print_fields(int argc, const char **argv,
 			struct repository *repo,
 			enum output_format format)
 {
 	int ret = 0;
 	struct strbuf valbuf = STRBUF_INIT;
-	struct strbuf quotbuf = STRBUF_INIT;
 
 	for (int i = 0; i < argc; i++) {
 		get_value_fn *get_value;
@@ -105,25 +121,11 @@ static int print_fields(int argc, const char **argv,
 		}
 
 		strbuf_reset(&valbuf);
-		strbuf_reset(&quotbuf);
-
 		get_value(repo, &valbuf);
-
-		switch (format) {
-		case FORMAT_KEYVALUE:
-			quote_c_style(valbuf.buf, &quotbuf, NULL, 0);
-			printf("%s=%s\n", key, quotbuf.buf);
-			break;
-		case FORMAT_NUL_TERMINATED:
-			printf("%s\n%s%c", key, valbuf.buf, '\0');
-			break;
-		default:
-			BUG("not a valid output format: %d", format);
-		}
+		print_field(format, key, valbuf.buf);
 	}
 
 	strbuf_release(&valbuf);
-	strbuf_release(&quotbuf);
 	return ret;
 }
 

From 155caac7d1fa981b21192c598cf9bbffdb5aea12 Mon Sep 17 00:00:00 2001
From: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Date: Tue, 18 Nov 2025 17:37:04 -0300
Subject: [PATCH 106/553] repo: add --all to git-repo-info

Add a new flag `--all` to git-repo-info for requesting values for all
the available keys. With this flag, the user can retrieve every value
at once instead of first having to find out which keys they want.

Helped-by: Karthik Nayak <karthik.188@gmail.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
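
A minimal sketch, not part of this patch, of the table-driven iteration
that `--all` relies on, together with the check that rejects combining
`--all` with explicit keys. All names and the returned values below are
hypothetical; the real table is repo_info_fields in builtin/repo.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct sketch_field {
	const char *key;
	const char *(*get_value)(void);
};

/* Hypothetical getters returning made-up values. */
static const char *sketch_get_bare(void) { return "false"; }
static const char *sketch_get_object_format(void) { return "sha1"; }

/* Kept sorted by key, so dumping the table yields sorted output. */
static const struct sketch_field sketch_fields[] = {
	{ "layout.bare", sketch_get_bare },
	{ "object.format", sketch_get_object_format },
};

/* Mirrors the "--all and <key> cannot be used together" check. */
static int sketch_args_ok(int all_keys, int nr_keys)
{
	return !(all_keys && nr_keys);
}

/*
 * Append "key=value\n" for every table entry to `out`.
 * Returns the number of bytes written, or -1 if `out` is too small.
 */
static int sketch_dump_all(char *out, size_t outsz)
{
	size_t used = 0;

	assert(out);
	for (size_t i = 0; i < SKETCH_ARRAY_SIZE(sketch_fields); i++) {
		const struct sketch_field *f = &sketch_fields[i];
		int n = snprintf(out + used, outsz - used, "%s=%s\n",
				 f->key, f->get_value());

		if (n < 0 || (size_t)n >= outsz - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```

Because the table is walked in its stored order, the output order of
`--all` follows the order of the table, which is why the new test lists
its keys in the same lexicographical order.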
 Documentation/git-repo.adoc |  6 +++---
 builtin/repo.c              | 29 +++++++++++++++++++++++++++--
 t/t1900-repo.sh             | 21 +++++++++++++++++++++
 3 files changed, 51 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-repo.adoc b/Documentation/git-repo.adoc
index ce43cb19c8b03c..70f0a6d2e47291 100644
--- a/Documentation/git-repo.adoc
+++ b/Documentation/git-repo.adoc
@@ -8,7 +8,7 @@ git-repo - Retrieve information about the repository
 SYNOPSIS
 --------
 [synopsis]
-git repo info [--format=(keyvalue|nul)] [-z] [<key>...]
+git repo info [--format=(keyvalue|nul)] [-z] [--all | <key>...]
 git repo structure [--format=(table|keyvalue|nul)]
 
 DESCRIPTION
@@ -19,13 +19,13 @@ THIS COMMAND IS EXPERIMENTAL. THE BEHAVIOR MAY CHANGE.
 
 COMMANDS
 --------
-`info [--format=(keyvalue|nul)] [-z] [<key>...]`::
+`info [--format=(keyvalue|nul)] [-z] [--all | <key>...]`::
 	Retrieve metadata-related information about the current repository. Only
 	the requested data will be returned based on their keys (see "INFO KEYS"
 	section below).
 +
 The values are returned in the same order in which their respective keys were
-requested.
+requested. The `--all` flag requests the values for all the available keys.
 +
 The output format can be chosen through the flag `--format`. Two formats are
 supported:
diff --git a/builtin/repo.c b/builtin/repo.c
index f9fb4184940e2e..e30e2416d4f59d 100644
--- a/builtin/repo.c
+++ b/builtin/repo.c
@@ -15,7 +15,7 @@
 #include "utf8.h"
 
 static const char *const repo_usage[] = {
-	"git repo info [--format=(keyvalue|nul)] [-z] [<key>...]",
+	"git repo info [--format=(keyvalue|nul)] [-z] [--all | <key>...]",
 	"git repo structure [--format=(table|keyvalue|nul)]",
 	NULL
 };
@@ -129,6 +129,23 @@ static int print_fields(int argc, const char **argv,
 	return ret;
 }
 
+static int print_all_fields(struct repository *repo,
+			    enum output_format format)
+{
+	struct strbuf valbuf = STRBUF_INIT;
+
+	for (size_t i = 0; i < ARRAY_SIZE(repo_info_fields); i++) {
+		const struct field *field = &repo_info_fields[i];
+
+		strbuf_reset(&valbuf);
+		field->get_value(repo, &valbuf);
+		print_field(format, field->key, valbuf.buf);
+	}
+
+	strbuf_release(&valbuf);
+	return 0;
+}
+
 static int parse_format_cb(const struct option *opt,
 			   const char *arg, int unset UNUSED)
 {
@@ -152,6 +169,7 @@ static int cmd_repo_info(int argc, const char **argv, const char *prefix,
 			 struct repository *repo)
 {
 	enum output_format format = FORMAT_KEYVALUE;
+	int all_keys = 0;
 	struct option options[] = {
 		OPT_CALLBACK_F(0, "format", &format, N_("format"),
 			       N_("output format"),
@@ -160,6 +178,7 @@ static int cmd_repo_info(int argc, const char **argv, const char *prefix,
 			       N_("synonym for --format=nul"),
 			       PARSE_OPT_NONEG | PARSE_OPT_NOARG,
 			       parse_format_cb),
+		OPT_BOOL(0, "all", &all_keys, N_("print all keys/values")),
 		OPT_END()
 	};
 
@@ -167,7 +186,13 @@ static int cmd_repo_info(int argc, const char **argv, const char *prefix,
 	if (format != FORMAT_KEYVALUE && format != FORMAT_NUL_TERMINATED)
 		die(_("unsupported output format"));
 
-	return print_fields(argc, argv, repo, format);
+	if (all_keys && argc)
+		die(_("--all and <key> cannot be used together"));
+
+	if (all_keys)
+		return print_all_fields(repo, format);
+	else
+		return print_fields(argc, argv, repo, format);
 }
 
 struct ref_stats {
diff --git a/t/t1900-repo.sh b/t/t1900-repo.sh
index 2beba67889af25..51d55f11a5ed66 100755
--- a/t/t1900-repo.sh
+++ b/t/t1900-repo.sh
@@ -4,6 +4,15 @@ test_description='test git repo-info'
 
 . ./test-lib.sh
 
+# git-repo-info keys. It must contain the same keys listed in the const
+# repo_info_fields, in lexicographical order.
+REPO_INFO_KEYS='
+	layout.bare
+	layout.shallow
+	object.format
+	references.format
+'
+
 # Test whether a key-value pair is correctly returned
 #
 # Usage: test_repo_info <label> <init command> <repo_name> <key> <expected value>
@@ -110,4 +119,16 @@ test_expect_success 'git repo info uses the last requested format' '
 	test_cmp expected actual
 '
 
+test_expect_success 'git repo info --all returns all key-value pairs' '
+	git repo info $REPO_INFO_KEYS >expect &&
+	git repo info --all >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git repo info --all <key> aborts' '
+	echo "fatal: --all and <key> cannot be used together" >expect &&
+	test_must_fail git repo info --all object.format 2>actual &&
+	test_cmp expect actual
+'
+
 test_done

From 6971934d9bb4b8176b48658482862169a4582913 Mon Sep 17 00:00:00 2001
From: Ezekiel Newren <ezekielnewren@gmail.com>
Date: Tue, 18 Nov 2025 22:34:13 +0000
Subject: [PATCH 107/553] doc: define unambiguous type mappings across C and
 Rust

Document other nuances when crossing the FFI boundary. Other language
mappings may be added in the future.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
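
A minimal sketch, not part of this patch, of how the C side could check
at compile time the assumptions that the mapping tables rely on. It
assumes a C11 compiler (for static_assert from <assert.h>); the helper
names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Refuse to build where the documented equivalences would not hold. */
static_assert(CHAR_BIT == 8, "uint8_t <-> u8 needs 8-bit bytes");
static_assert(sizeof(float) == 4, "float <-> f32 needs a 32-bit float");
static_assert(sizeof(double) == 8, "double <-> f64 needs a 64-bit double");

/* The signedness of plain `char` is implementation defined. */
static int sketch_char_is_signed(void)
{
	return CHAR_MIN < 0;
}

/*
 * Converting to uint8_t yields the same value in 0..255 whether
 * plain `char` is signed or unsigned on this platform.
 */
static uint8_t sketch_byte_value(char c)
{
	return (uint8_t)c;
}
```

The size checks say nothing about IEEE-754 conformance; they only catch
a platform where the widths themselves differ.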
 Documentation/Makefile                        |   1 +
 Documentation/technical/meson.build           |   1 +
 .../technical/unambiguous-types.adoc          | 224 ++++++++++++++++++
 3 files changed, 226 insertions(+)
 create mode 100644 Documentation/technical/unambiguous-types.adoc

diff --git a/Documentation/Makefile b/Documentation/Makefile
index a3fbd29744bd39..b580bdc98b100e 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -140,6 +140,7 @@ TECH_DOCS += technical/shallow
 TECH_DOCS += technical/sparse-checkout
 TECH_DOCS += technical/sparse-index
 TECH_DOCS += technical/trivial-merge
+TECH_DOCS += technical/unambiguous-types
 TECH_DOCS += technical/unit-tests
 SP_ARTICLES += $(TECH_DOCS)
 SP_ARTICLES += technical/api-index
diff --git a/Documentation/technical/meson.build b/Documentation/technical/meson.build
index 858af811a7bcc1..7df9b8acf60ed4 100644
--- a/Documentation/technical/meson.build
+++ b/Documentation/technical/meson.build
@@ -31,6 +31,7 @@ articles = [
   'sparse-checkout.adoc',
   'sparse-index.adoc',
   'trivial-merge.adoc',
+  'unambiguous-types.adoc',
   'unit-tests.adoc',
 ]
 
diff --git a/Documentation/technical/unambiguous-types.adoc b/Documentation/technical/unambiguous-types.adoc
new file mode 100644
index 00000000000000..9a4990847c0e0b
--- /dev/null
+++ b/Documentation/technical/unambiguous-types.adoc
@@ -0,0 +1,224 @@
+= Unambiguous types
+
+Most of these mappings are obvious, but there are some nuances and gotchas with
+Rust FFI (Foreign Function Interface).
+
+This document defines clear, one-to-one mappings between primitive types in C,
+Rust (and possibly other languages in the future). Its purpose is to eliminate
+ambiguity in type widths, signedness, and binary representation across
+platforms and languages.
+
+For Git, the only header required to use these unambiguous types in C is
+`git-compat-util.h`.
+
+== Boolean types
+[cols="1,1", options="header"]
+|===
+| C Type | Rust Type
+| bool^1^       | bool
+|===
+
+== Integer types
+
+In C, `<stdint.h>` (or an equivalent) must be included.
+
+[cols="1,1", options="header"]
+|===
+| C Type | Rust Type
+| uint8_t    | u8
+| uint16_t   | u16
+| uint32_t   | u32
+| uint64_t   | u64
+
+| int8_t     | i8
+| int16_t    | i16
+| int32_t    | i32
+| int64_t    | i64
+|===
+
+== Floating-point types
+
+Rust requires IEEE-754 semantics.
+In C, that is typically true, but not guaranteed by the standard.
+
+[cols="1,1", options="header"]
+|===
+| C Type | Rust Type
+| float^2^      | f32
+| double^2^     | f64
+|===
+
+== Size types
+
+These types represent pointer-sized integers and are typically defined in
+`<stddef.h>` or an equivalent header.
+
+Size types should be used any time pointer arithmetic is performed, e.g. when
+indexing an array or describing the number of elements in memory.
+
+[cols="1,1", options="header"]
+|===
+| C Type | Rust Type
+| size_t^3^     | usize
+| ptrdiff_t^3^  | isize
+|===
+
+== Character types
+
+This is where C and Rust don't have a clean one-to-one mapping.
+
+A C `char` and a Rust `u8` share the same bit width, so any C struct containing
+a `char` will have the same size as the corresponding Rust struct using `u8`.
+In that sense, such structs are safe to pass over the FFI boundary, because
+their fields will be laid out identically. However, beyond bit width, C `char`
+has additional semantics and platform-dependent behavior that can cause
+problems, as discussed below.
+
+The C language leaves the signedness of `char` implementation defined. Because
+our developer build enables -Wsign-compare, comparison of a value of `char`
+type with either signed or unsigned integers may trigger warnings from the
+compiler.
+
+Note: Rust's `char` type is an unsigned 32-bit integer that is used to describe
+Unicode code points.
+
+=== Notes
+^1^ This is only true if stdbool.h (or equivalent) is used. +
+^2^ C does not enforce IEEE-754 compatibility, but Rust expects it. If the
+platform/arch for C does not follow IEEE-754 then this equivalence does not
+hold. Also, it's assumed that `float` is 32 bits and `double` is 64, but
+there may be a strange platform/arch where even this isn't true. +
+^3^ C also defines uintptr_t, ssize_t and intptr_t, but these types are
+discouraged for FFI purposes. For functions like `read()` and `write()` ssize_t
+should be cast to a different, and unambiguous, type before being passed over
+the FFI boundary. +
+
+== Problems with std::ffi::c_* types in Rust
+TL;DR: In practice, Rust's `c_*` types aren't guaranteed to match C types for
+all possible C compilers, platforms, or architectures, because Rust only
+ensures correctness of C types on officially supported targets. These
+definitions have changed over time to match more targets, which means that the
+c_* definitions will differ based on which Rust version Git chooses to use.
+
+Current list of safe Rust-side FFI types in Git: +
+
+* `c_void`
+* `CStr`
+* `CString`
+
+Even then, they should be used sparingly, and only where the semantics match
+exactly.
+
+The std::os::raw::c_* types directly inherit the problems of core::ffi, whose
+definitions change over time and seem to make a best guess at the correct
+definition for a given platform/target. This probably isn't a problem for the
+platforms that Rust currently supports, but can anyone say that Rust got it
+right for all C compilers of all platforms/targets?
+
+To give an example, c_long is defined differently across Rust versions:
+footnote:[https://doc.rust-lang.org/1.63.0/src/core/ffi/mod.rs.html#175-189[c_long in 1.63.0]]
+footnote:[https://doc.rust-lang.org/1.89.0/src/core/ffi/primitives.rs.html#135-151[c_long in 1.89.0]]
+
+=== Rust version 1.63.0
+
+```
+mod c_long_definition {
+    cfg_if! {
+        if #[cfg(all(target_pointer_width = "64", not(windows)))] {
+            pub type c_long = i64;
+            pub type NonZero_c_long = crate::num::NonZeroI64;
+            pub type c_ulong = u64;
+            pub type NonZero_c_ulong = crate::num::NonZeroU64;
+        } else {
+            // The minimal size of `long` in the C standard is 32 bits
+            pub type c_long = i32;
+            pub type NonZero_c_long = crate::num::NonZeroI32;
+            pub type c_ulong = u32;
+            pub type NonZero_c_ulong = crate::num::NonZeroU32;
+        }
+    }
+}
+```
+
+=== Rust version 1.89.0
+
+```
+mod c_long_definition {
+    crate::cfg_select! {
+        any(
+            all(target_pointer_width = "64", not(windows)),
+            // wasm32 Linux ABI uses 64-bit long
+            all(target_arch = "wasm32", target_os = "linux")
+        ) => {
+            pub(super) type c_long = i64;
+            pub(super) type c_ulong = u64;
+        }
+        _ => {
+            // The minimal size of `long` in the C standard is 32 bits
+            pub(super) type c_long = i32;
+            pub(super) type c_ulong = u32;
+        }
+    }
+}
+```
+
+Even for the cases where C types are correctly mapped to Rust types via
+std::ffi::c_* there are still problems. Let's take c_char for example. On some
+platforms it's u8, on others it's i8.
+
+=== Subtraction underflow in debug mode
+
+The following code will panic in debug on platforms that define c_char as u8,
+but won't if it's an i8.
+
+```
+let mut x: std::ffi::c_char = 0;
+x -= 1;
+```
+
+=== Inconsistent shift behavior
+
+`x` will be 0xC0 for platforms that use i8, but will be 0x40 where it's u8.
+
+```
+let mut x: std::ffi::c_char = 0x80;
+x >>= 1;
+```
+
+=== Equality fails to compile on some platforms
+
+The following will not compile on platforms that define c_char as i8, but will
+if it's u8. You can cast x e.g. `assert_eq!(x as u8, b'a');`, but then you get
+a warning on platforms that use u8 and a clean compilation where i8 is used.
+
+```
+let mut x: std::ffi::c_char = 0x61;
+assert_eq!(x, b'a');
+```
+
+== Enum types
+Rust enum types should not be used as FFI types. Rust enum types are more like
+C union types than C enums. For something like:
+
+```
+#[repr(C, u8)]
+enum Fruit {
+    Apple,
+    Banana,
+    Cherry,
+}
+```
+
+It's easy enough to make sure the Rust enum matches what C would expect, but
+that is not the case for a more complex type like:
+
+```
+enum HashResult {
+    SHA1([u8; 20]),
+    SHA256([u8; 32]),
+}
+```
+
+The Rust compiler has to add a discriminant to the enum to distinguish between
+the variants. The width, location, and values of that discriminant are up to
+the Rust compiler and are not ABI stable.

From f007f4f4b473565fb2e94780028399030926bacb Mon Sep 17 00:00:00 2001
From: Ezekiel Newren <ezekielnewren@gmail.com>
Date: Tue, 18 Nov 2025 22:34:14 +0000
Subject: [PATCH 108/553] xdiff: use ptrdiff_t for dstart/dend

ptrdiff_t is appropriate for dstart and dend because they both describe
positive or negative offsets relative to a pointer.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 xdiff/xtypes.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index f145abba3ea8a3..7a2d429ec5e7ea 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -47,7 +47,7 @@ typedef struct s_xrecord {
 typedef struct s_xdfile {
 	xrecord_t *recs;
 	long nrec;
-	long dstart, dend;
+	ptrdiff_t dstart, dend;
 	bool *changed;
 	long *rindex;
 	long nreff;

From 10f97d6affcb59bdcb74a6878d3d9da0eec81296 Mon Sep 17 00:00:00 2001
From: Ezekiel Newren <ezekielnewren@gmail.com>
Date: Tue, 18 Nov 2025 22:34:15 +0000
Subject: [PATCH 109/553] xdiff: make xrecord_t.ptr a uint8_t instead of char

Make xrecord_t.ptr uint8_t because it's referring to bytes in memory.

In order to avoid a refactor avalanche, many uses of this field were
cast to char* or similar.

Places where casting was unnecessary:
xemit.c:156
xmerge.c:124
xmerge.c:127
xmerge.c:164
xmerge.c:169
xmerge.c:172
xmerge.c:178

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
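
A minimal sketch, not part of this patch, of the pointer arithmetic that
needs a cast once the record pointer is a byte pointer: computing how
many bytes lie between the start of one record and the end of another.
The struct and function names are hypothetical, and the sketch assumes
that all records are slices of one contiguous buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_record {
	uint8_t const *ptr;
	long size;
};

/*
 * Bytes from the start of recs[first] to the end of recs[last].
 * Both pointers are cast to `const char *` first so that the
 * subtraction is done between pointers of the same type.
 */
static ptrdiff_t sketch_span(const struct sketch_record *recs,
			     size_t first, size_t last)
{
	const char *start = (const char *)recs[first].ptr;
	const char *end = (const char *)recs[last].ptr + recs[last].size;

	assert(first <= last);
	return end - start;
}
```

Subtracting a `char *` from a `uint8_t const *` directly would mix
pointer types, which is presumably why the size computations in the
diff gained a cast on the left-hand operand.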
 xdiff/xdiffi.c    |  8 ++++----
 xdiff/xemit.c     |  6 +++---
 xdiff/xmerge.c    | 14 +++++++-------
 xdiff/xpatience.c |  2 +-
 xdiff/xprepare.c  |  6 +++---
 xdiff/xtypes.h    |  2 +-
 xdiff/xutils.c    |  4 ++--
 7 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 6f3998ee54c01e..95989b6af1d070 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -407,7 +407,7 @@ static int get_indent(xrecord_t *rec)
 	int ret = 0;
 
 	for (i = 0; i < rec->size; i++) {
-		char c = rec->ptr[i];
+		char c = (char) rec->ptr[i];
 
 		if (!XDL_ISSPACE(c))
 			return ret;
@@ -993,11 +993,11 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
 
 		rec = &xe->xdf1.recs[xch->i1];
 		for (i = 0; i < xch->chg1 && ignore; i++)
-			ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
+			ignore = xdl_blankline((const char *)rec[i].ptr, rec[i].size, flags);
 
 		rec = &xe->xdf2.recs[xch->i2];
 		for (i = 0; i < xch->chg2 && ignore; i++)
-			ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
+			ignore = xdl_blankline((const char *)rec[i].ptr, rec[i].size, flags);
 
 		xch->ignore = ignore;
 	}
@@ -1008,7 +1008,7 @@ static int record_matches_regex(xrecord_t *rec, xpparam_t const *xpp) {
 	size_t i;
 
 	for (i = 0; i < xpp->ignore_regex_nr; i++)
-		if (!regexec_buf(xpp->ignore_regex[i], rec->ptr, rec->size, 1,
+		if (!regexec_buf(xpp->ignore_regex[i], (const char *)rec->ptr, rec->size, 1,
 				 &regmatch, 0))
 			return 1;
 
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index b2f1f30cd36eef..ead930088a5601 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -27,7 +27,7 @@ static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *
 {
 	xrecord_t *rec = &xdf->recs[ri];
 
-	if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
+	if (xdl_emit_diffrec((char const *)rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
 		return -1;
 
 	return 0;
@@ -113,8 +113,8 @@ static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
 	xrecord_t *rec = &xdf->recs[ri];
 
 	if (!xecfg->find_func)
-		return def_ff(rec->ptr, rec->size, buf, sz);
-	return xecfg->find_func(rec->ptr, rec->size, buf, sz, xecfg->find_func_priv);
+		return def_ff((const char *)rec->ptr, rec->size, buf, sz);
+	return xecfg->find_func((const char *)rec->ptr, rec->size, buf, sz, xecfg->find_func_priv);
 }
 
 static int is_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri)
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index fd600cbb5d58a2..75cb3e76a2c8e4 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -101,8 +101,8 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
 	xrecord_t *rec2 = xe2->xdf2.recs + i2;
 
 	for (i = 0; i < line_count; i++) {
-		int result = xdl_recmatch(rec1[i].ptr, rec1[i].size,
-			rec2[i].ptr, rec2[i].size, flags);
+		int result = xdl_recmatch((const char *)rec1[i].ptr, rec1[i].size,
+			(const char *)rec2[i].ptr, rec2[i].size, flags);
 		if (!result)
 			return -1;
 	}
@@ -324,8 +324,8 @@ static int xdl_fill_merge_buffer(xdfenv_t *xe1, const char *name1,
 
 static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
 {
-	return xdl_recmatch(rec1->ptr, rec1->size,
-			    rec2->ptr, rec2->size, flags);
+	return xdl_recmatch((const char *)rec1->ptr, rec1->size,
+			    (const char *)rec2->ptr, rec2->size, flags);
 }
 
 /*
@@ -382,10 +382,10 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
 		 * we have a very simple mmfile structure.
 		 */
 		t1.ptr = (char *)xe1->xdf2.recs[m->i1].ptr;
-		t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1].ptr
+		t1.size = (char *)xe1->xdf2.recs[m->i1 + m->chg1 - 1].ptr
 			+ xe1->xdf2.recs[m->i1 + m->chg1 - 1].size - t1.ptr;
 		t2.ptr = (char *)xe2->xdf2.recs[m->i2].ptr;
-		t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1].ptr
+		t2.size = (char *)xe2->xdf2.recs[m->i2 + m->chg2 - 1].ptr
 			+ xe2->xdf2.recs[m->i2 + m->chg2 - 1].size - t2.ptr;
 		if (xdl_do_diff(&t1, &t2, xpp, &xe) < 0)
 			return -1;
@@ -440,7 +440,7 @@ static int line_contains_alnum(const char *ptr, long size)
 static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
 {
 	for (; chg; chg--, i++)
-		if (line_contains_alnum(xe->xdf2.recs[i].ptr,
+		if (line_contains_alnum((const char *)xe->xdf2.recs[i].ptr,
 				xe->xdf2.recs[i].size))
 			return 1;
 	return 0;
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index 669b653580efe6..bb61354f22a177 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -121,7 +121,7 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
 		return;
 	map->entries[index].line1 = line;
 	map->entries[index].hash = record->ha;
-	map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1].ptr);
+	map->entries[index].anchor = is_anchor(xpp, (const char *)map->env->xdf1.recs[line - 1].ptr);
 	if (!map->first)
 		map->first = map->entries + index;
 	if (map->last) {
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 192334f1b72e63..4c564670764e53 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -99,8 +99,8 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
 	hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
 	for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
 		if (rcrec->rec.ha == rec->ha &&
-				xdl_recmatch(rcrec->rec.ptr, rcrec->rec.size,
-					rec->ptr, rec->size, cf->flags))
+				xdl_recmatch((const char *)rcrec->rec.ptr, rcrec->rec.size,
+					(const char *)rec->ptr, rec->size, cf->flags))
 			break;
 
 	if (!rcrec) {
@@ -156,7 +156,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 			if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
 				goto abort;
 			crec = &xdf->recs[xdf->nrec++];
-			crec->ptr = prev;
+			crec->ptr = (uint8_t const *)prev;
 			crec->size = (long) (cur - prev);
 			crec->ha = hav;
 			if (xdl_classify_record(pass, cf, crec) < 0)
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 7a2d429ec5e7ea..69727fb29999ef 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -39,7 +39,7 @@ typedef struct s_chastore {
 } chastore_t;
 
 typedef struct s_xrecord {
-	char const *ptr;
+	uint8_t const *ptr;
 	long size;
 	unsigned long ha;
 } xrecord_t;
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 447e66c7198b08..7be063bfb61d72 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -465,10 +465,10 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
 	xdfenv_t env;
 
 	subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1].ptr;
-	subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2].ptr +
+	subfile1.size = (char *)diff_env->xdf1.recs[line1 + count1 - 2].ptr +
 		diff_env->xdf1.recs[line1 + count1 - 2].size - subfile1.ptr;
 	subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1].ptr;
-	subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2].ptr +
+	subfile2.size = (char *)diff_env->xdf2.recs[line2 + count2 - 2].ptr +
 		diff_env->xdf2.recs[line2 + count2 - 2].size - subfile2.ptr;
 	if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
 		return -1;

From 9bd193253c5e590203fc566ad7cff8f891ec0493 Mon Sep 17 00:00:00 2001
From: Ezekiel Newren <ezekielnewren@gmail.com>
Date: Tue, 18 Nov 2025 22:34:16 +0000
Subject: [PATCH 110/553] xdiff: use size_t for xrecord_t.size

size_t is the appropriate type because size is describing the number of
elements, bytes in this case, in memory.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
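
A minimal sketch of the loop pattern this change enables, assuming a
simplified stand-in for xrecord_t. The names `struct record` and
`leading_blanks` are hypothetical and only loosely modelled on
get_indent(); the tab width of 8 is likewise an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for xrecord_t; not the real struct. */
struct record {
	uint8_t const *ptr;
	size_t size;	/* number of bytes, hence size_t */
};

/*
 * Measure the indentation of a line. Returns -1 if the line holds
 * nothing but blanks and line terminators.
 */
static int leading_blanks(struct record const *rec)
{
	int ret = 0;

	assert(rec->ptr || !rec->size);
	/* The index has the same type as the bound: no mixed-sign compare. */
	for (size_t i = 0; i < rec->size; i++) {
		uint8_t c = rec->ptr[i];

		if (c == ' ')
			ret++;
		else if (c == '\t')
			ret += 8 - ret % 8;	/* advance to next tab stop */
		else if (c != '\n' && c != '\r')
			return ret;
	}
	return -1;
}
```

With the index declared as size_t, the comparison against `size`
needs no cast; casts are only required where a value is handed to an
interface that still takes long.
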
 xdiff/xdiffi.c   |  7 +++----
 xdiff/xemit.c    |  8 ++++----
 xdiff/xmerge.c   | 16 ++++++++--------
 xdiff/xprepare.c |  6 +++---
 xdiff/xtypes.h   |  2 +-
 5 files changed, 19 insertions(+), 20 deletions(-)

diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 95989b6af1d070..cb8e412c7b9db6 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -403,10 +403,9 @@ static int recs_match(xrecord_t *rec1, xrecord_t *rec2)
  */
 static int get_indent(xrecord_t *rec)
 {
-	long i;
 	int ret = 0;
 
-	for (i = 0; i < rec->size; i++) {
+	for (size_t i = 0; i < rec->size; i++) {
 		char c = (char) rec->ptr[i];
 
 		if (!XDL_ISSPACE(c))
@@ -993,11 +992,11 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
 
 		rec = &xe->xdf1.recs[xch->i1];
 		for (i = 0; i < xch->chg1 && ignore; i++)
-			ignore = xdl_blankline((const char *)rec[i].ptr, rec[i].size, flags);
+			ignore = xdl_blankline((const char *)rec[i].ptr, (long)rec[i].size, flags);
 
 		rec = &xe->xdf2.recs[xch->i2];
 		for (i = 0; i < xch->chg2 && ignore; i++)
-			ignore = xdl_blankline((const char *)rec[i].ptr, rec[i].size, flags);
+			ignore = xdl_blankline((const char *)rec[i].ptr, (long)rec[i].size, flags);
 
 		xch->ignore = ignore;
 	}
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index ead930088a5601..2f8007753c30f2 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -27,7 +27,7 @@ static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *
 {
 	xrecord_t *rec = &xdf->recs[ri];
 
-	if (xdl_emit_diffrec((char const *)rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
+	if (xdl_emit_diffrec((char const *)rec->ptr, (long)rec->size, pre, strlen(pre), ecb) < 0)
 		return -1;
 
 	return 0;
@@ -113,8 +113,8 @@ static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
 	xrecord_t *rec = &xdf->recs[ri];
 
 	if (!xecfg->find_func)
-		return def_ff((const char *)rec->ptr, rec->size, buf, sz);
-	return xecfg->find_func((const char *)rec->ptr, rec->size, buf, sz, xecfg->find_func_priv);
+		return def_ff((const char *)rec->ptr, (long)rec->size, buf, sz);
+	return xecfg->find_func((const char *)rec->ptr, (long)rec->size, buf, sz, xecfg->find_func_priv);
 }
 
 static int is_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri)
@@ -151,7 +151,7 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
 static int is_empty_rec(xdfile_t *xdf, long ri)
 {
 	xrecord_t *rec = &xdf->recs[ri];
-	long i = 0;
+	size_t i = 0;
 
 	for (; i < rec->size && XDL_ISSPACE(rec->ptr[i]); i++);
 
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index 75cb3e76a2c8e4..0dd4558a3280dc 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -101,8 +101,8 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
 	xrecord_t *rec2 = xe2->xdf2.recs + i2;
 
 	for (i = 0; i < line_count; i++) {
-		int result = xdl_recmatch((const char *)rec1[i].ptr, rec1[i].size,
-			(const char *)rec2[i].ptr, rec2[i].size, flags);
+		int result = xdl_recmatch((const char *)rec1[i].ptr, (long)rec1[i].size,
+			(const char *)rec2[i].ptr, (long)rec2[i].size, flags);
 		if (!result)
 			return -1;
 	}
@@ -119,11 +119,11 @@ static int xdl_recs_copy_0(int use_orig, xdfenv_t *xe, int i, int count, int nee
 	if (count < 1)
 		return 0;
 
-	for (i = 0; i < count; size += recs[i++].size)
+	for (i = 0; i < count; size += (int)recs[i++].size)
 		if (dest)
 			memcpy(dest + size, recs[i].ptr, recs[i].size);
 	if (add_nl) {
-		i = recs[count - 1].size;
+		i = (int)recs[count - 1].size;
 		if (i == 0 || recs[count - 1].ptr[i - 1] != '\n') {
 			if (needs_cr) {
 				if (dest)
@@ -156,7 +156,7 @@ static int xdl_orig_copy(xdfenv_t *xe, int i, int count, int needs_cr, int add_n
  */
 static int is_eol_crlf(xdfile_t *file, int i)
 {
-	long size;
+	size_t size;
 
 	if (i < file->nrec - 1)
 		/* All lines before the last *must* end in LF */
@@ -324,8 +324,8 @@ static int xdl_fill_merge_buffer(xdfenv_t *xe1, const char *name1,
 
 static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
 {
-	return xdl_recmatch((const char *)rec1->ptr, rec1->size,
-			    (const char *)rec2->ptr, rec2->size, flags);
+	return xdl_recmatch((const char *)rec1->ptr, (long)rec1->size,
+			    (const char *)rec2->ptr, (long)rec2->size, flags);
 }
 
 /*
@@ -441,7 +441,7 @@ static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
 {
 	for (; chg; chg--, i++)
 		if (line_contains_alnum((const char *)xe->xdf2.recs[i].ptr,
-				xe->xdf2.recs[i].size))
+				(long)xe->xdf2.recs[i].size))
 			return 1;
 	return 0;
 }
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 4c564670764e53..b3219aed3e8795 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -99,8 +99,8 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
 	hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
 	for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
 		if (rcrec->rec.ha == rec->ha &&
-				xdl_recmatch((const char *)rcrec->rec.ptr, rcrec->rec.size,
-					(const char *)rec->ptr, rec->size, cf->flags))
+				xdl_recmatch((const char *)rcrec->rec.ptr, (long)rcrec->rec.size,
+					(const char *)rec->ptr, (long)rec->size, cf->flags))
 			break;
 
 	if (!rcrec) {
@@ -157,7 +157,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 				goto abort;
 			crec = &xdf->recs[xdf->nrec++];
 			crec->ptr = (uint8_t const *)prev;
-			crec->size = (long) (cur - prev);
+			crec->size = cur - prev;
 			crec->ha = hav;
 			if (xdl_classify_record(pass, cf, crec) < 0)
 				goto abort;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 69727fb29999ef..354349b523fbac 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -40,7 +40,7 @@ typedef struct s_chastore {
 
 typedef struct s_xrecord {
 	uint8_t const *ptr;
-	long size;
+	size_t size;
 	unsigned long ha;
 } xrecord_t;
 

From b0d4ae30f5a23fa9da87e9396b78e6442b351ddc Mon Sep 17 00:00:00 2001
From: Ezekiel Newren <ezekielnewren@gmail.com>
Date: Tue, 18 Nov 2025 22:34:17 +0000
Subject: [PATCH 111/553] xdiff: use unambiguous types in xdl_hash_record()

Convert the function signature and body to use unambiguous types:
char is changed to uint8_t because this function processes bytes in
memory, and unsigned long is changed to uint64_t so that the hash
output is consistent across platforms. `flags` is changed from long
to uint64_t to ensure the high order bits are not dropped on
platforms that treat long as 32 bits.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
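
For illustration, a sketch of the baseline djb2-style loop written
with the fixed-width types described above. `hash_line` is a
hypothetical name; it mirrors only the simple form of the verbatim
hash, not the optimized loop or the whitespace handling.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hash one line starting at *data and stopping at '\n' or top.
 * On return, *data points past the consumed line.
 */
static uint64_t hash_line(uint8_t const **data, uint8_t const *top)
{
	uint64_t ha = 5381;
	uint8_t const *ptr = *data;

	for (; ptr < top && *ptr != '\n'; ptr++) {
		ha += ha << 5;	/* ha *= 33 */
		ha += *ptr;	/* a uint8_t is always in 0..255 */
	}
	*data = ptr < top ? ptr + 1 : ptr;

	return ha;
}
```

Because both the accumulator and the input bytes have a fixed width
and signedness, the result does not depend on whether the platform's
char is signed or how wide its long is.
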
 xdiff-interface.c |  2 +-
 xdiff/xprepare.c  |  6 +++---
 xdiff/xutils.c    | 28 ++++++++++++++--------------
 xdiff/xutils.h    |  6 +++---
 4 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/xdiff-interface.c b/xdiff-interface.c
index 4971f722b3e5f4..1a35556380451a 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -300,7 +300,7 @@ void xdiff_clear_find_func(xdemitconf_t *xecfg)
 
 unsigned long xdiff_hash_string(const char *s, size_t len, long flags)
 {
-	return xdl_hash_record(&s, s + len, flags);
+	return xdl_hash_record((uint8_t const**)&s, (uint8_t const*)s + len, flags);
 }
 
 int xdiff_compare_lines(const char *l1, long s1,
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index b3219aed3e8795..85e56021daf9e2 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -137,8 +137,8 @@ static void xdl_free_ctx(xdfile_t *xdf)
 static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp,
 			   xdlclassifier_t *cf, xdfile_t *xdf) {
 	long bsize;
-	unsigned long hav;
-	char const *blk, *cur, *top, *prev;
+	uint64_t hav;
+	uint8_t const *blk, *cur, *top, *prev;
 	xrecord_t *crec;
 
 	xdf->rindex = NULL;
@@ -156,7 +156,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 			if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
 				goto abort;
 			crec = &xdf->recs[xdf->nrec++];
-			crec->ptr = (uint8_t const *)prev;
+			crec->ptr = prev;
 			crec->size = cur - prev;
 			crec->ha = hav;
 			if (xdl_classify_record(pass, cf, crec) < 0)
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 7be063bfb61d72..77ee1ad9c86875 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -249,11 +249,11 @@ int xdl_recmatch(const char *l1, long s1, const char *l2, long s2, long flags)
 	return 1;
 }
 
-unsigned long xdl_hash_record_with_whitespace(char const **data,
-		char const *top, long flags) {
-	unsigned long ha = 5381;
-	char const *ptr = *data;
-	int cr_at_eol_only = (flags & XDF_WHITESPACE_FLAGS) == XDF_IGNORE_CR_AT_EOL;
+uint64_t xdl_hash_record_with_whitespace(uint8_t const **data,
+		uint8_t const *top, uint64_t flags) {
+	uint64_t ha = 5381;
+	uint8_t const *ptr = *data;
+	bool cr_at_eol_only = (flags & XDF_WHITESPACE_FLAGS) == XDF_IGNORE_CR_AT_EOL;
 
 	for (; ptr < top && *ptr != '\n'; ptr++) {
 		if (cr_at_eol_only) {
@@ -263,8 +263,8 @@ unsigned long xdl_hash_record_with_whitespace(char const **data,
 				continue;
 		}
 		else if (XDL_ISSPACE(*ptr)) {
-			const char *ptr2 = ptr;
-			int at_eol;
+			const uint8_t *ptr2 = ptr;
+			bool at_eol;
 			while (ptr + 1 < top && XDL_ISSPACE(ptr[1])
 					&& ptr[1] != '\n')
 				ptr++;
@@ -274,20 +274,20 @@ unsigned long xdl_hash_record_with_whitespace(char const **data,
 			else if (flags & XDF_IGNORE_WHITESPACE_CHANGE
 				 && !at_eol) {
 				ha += (ha << 5);
-				ha ^= (unsigned long) ' ';
+				ha ^= (uint64_t) ' ';
 			}
 			else if (flags & XDF_IGNORE_WHITESPACE_AT_EOL
 				 && !at_eol) {
 				while (ptr2 != ptr + 1) {
 					ha += (ha << 5);
-					ha ^= (unsigned long) *ptr2;
+					ha ^= (uint64_t) *ptr2;
 					ptr2++;
 				}
 			}
 			continue;
 		}
 		ha += (ha << 5);
-		ha ^= (unsigned long) *ptr;
+		ha ^= (uint64_t) *ptr;
 	}
 	*data = ptr < top ? ptr + 1: ptr;
 
@@ -304,9 +304,9 @@ unsigned long xdl_hash_record_with_whitespace(char const **data,
 #define REASSOC_FENCE(x, y)
 #endif
 
-unsigned long xdl_hash_record_verbatim(char const **data, char const *top) {
-	unsigned long ha = 5381, c0, c1;
-	char const *ptr = *data;
+uint64_t xdl_hash_record_verbatim(uint8_t const **data, uint8_t const *top) {
+	uint64_t ha = 5381, c0, c1;
+	uint8_t const *ptr = *data;
 #if 0
 	/*
 	 * The baseline form of the optimized loop below. This is the djb2
@@ -314,7 +314,7 @@ unsigned long xdl_hash_record_verbatim(char const **data, char const *top) {
 	 */
 	for (; ptr < top && *ptr != '\n'; ptr++) {
 		ha += (ha << 5);
-		ha += (unsigned long) *ptr;
+		ha += (uint64_t) *ptr;
 	}
 	*data = ptr < top ? ptr + 1: ptr;
 #else
diff --git a/xdiff/xutils.h b/xdiff/xutils.h
index 13f68310472a69..615b4a9d355433 100644
--- a/xdiff/xutils.h
+++ b/xdiff/xutils.h
@@ -34,9 +34,9 @@ void *xdl_cha_alloc(chastore_t *cha);
 long xdl_guess_lines(mmfile_t *mf, long sample);
 int xdl_blankline(const char *line, long size, long flags);
 int xdl_recmatch(const char *l1, long s1, const char *l2, long s2, long flags);
-unsigned long xdl_hash_record_verbatim(char const **data, char const *top);
-unsigned long xdl_hash_record_with_whitespace(char const **data, char const *top, long flags);
-static inline unsigned long xdl_hash_record(char const **data, char const *top, long flags)
+uint64_t xdl_hash_record_verbatim(uint8_t const **data, uint8_t const *top);
+uint64_t xdl_hash_record_with_whitespace(uint8_t const **data, uint8_t const *top, uint64_t flags);
+static inline uint64_t xdl_hash_record(uint8_t const **data, uint8_t const *top, uint64_t flags)
 {
 	if (flags & XDF_WHITESPACE_FLAGS)
 		return xdl_hash_record_with_whitespace(data, top, flags);

From 6a26019c81faa07ba811541b4cf35be9e8ee1ead Mon Sep 17 00:00:00 2001
From: Ezekiel Newren <ezekielnewren@gmail.com>
Date: Tue, 18 Nov 2025 22:34:18 +0000
Subject: [PATCH 112/553] xdiff: split xrecord_t.ha into line_hash and
 minimal_perfect_hash

The ha field is serving two different purposes, which makes the code
harder to read. At first glance, it looks like many places assume
there could never be hash collisions between lines of the two input
files. In reality, line_hash is used together with xdl_recmatch() to
ensure correct comparisons of lines, even when collisions occur.

To make this clearer, the old ha field has been split:
  * line_hash: a straightforward hash of a line, independent of any
    external context. Its type is uint64_t, as it comes from a fixed
    width hash function.
  * minimal_perfect_hash: Not a new concept, but now a separate
    field. It comes from the classifier's general-purpose hash table,
    which assigns each line a unique and minimal hash across the two
    files. A size_t is used here because it's meant to be used to
    index an array. This also avoids ` as usize` casts on the Rust
    side when using it to index a slice.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
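
A rough sketch of how the two values relate, assuming a toy
classifier with a fixed-size table. All names here (`struct
classifier`, `classify`, NBUCKETS, MAXCLASS) are hypothetical and the
real xdl_classify_record() differs in detail: the point is only that
line_hash selects a bucket, a byte comparison settles collisions, and
the dense class index plays the role of minimal_perfect_hash.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NBUCKETS 8	/* deliberately tiny so that buckets are shared */
#define MAXCLASS 64

struct line_class {
	uint64_t line_hash;
	char const *text;
	size_t len;
	int next;	/* next class in the same bucket, or -1 */
};

struct classifier {
	struct line_class classes[MAXCLASS];
	int buckets[NBUCKETS];
	size_t count;
};

static void classifier_init(struct classifier *cf)
{
	cf->count = 0;
	for (size_t i = 0; i < NBUCKETS; i++)
		cf->buckets[i] = -1;
}

static uint64_t line_hash(char const *s, size_t len)
{
	uint64_t ha = 5381;

	for (size_t i = 0; i < len; i++)
		ha = ha * 33 + (uint8_t)s[i];
	return ha;
}

/* Returns the dense id of the line, or (size_t)-1 if the table is full. */
static size_t classify(struct classifier *cf, char const *s, size_t len)
{
	uint64_t ha = line_hash(s, len);
	size_t b = ha % NBUCKETS;
	int i;

	for (i = cf->buckets[b]; i >= 0; i = cf->classes[i].next) {
		struct line_class *c = &cf->classes[i];

		/* Equal hashes are not enough: compare the bytes too. */
		if (c->line_hash == ha && c->len == len &&
		    !memcmp(c->text, s, len))
			return (size_t)i;
	}
	if (cf->count == MAXCLASS)
		return (size_t)-1;
	i = (int)cf->count++;
	cf->classes[i] = (struct line_class){ ha, s, len, cf->buckets[b] };
	cf->buckets[b] = i;
	return (size_t)i;
}
```

Once every line has such an id, comparing two lines reduces to
comparing two integers, and the id can index an array directly.
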
 xdiff/xdiffi.c     |  6 +++---
 xdiff/xhistogram.c |  4 ++--
 xdiff/xpatience.c  | 10 +++++-----
 xdiff/xprepare.c   | 18 +++++++++---------
 xdiff/xtypes.h     |  3 ++-
 5 files changed, 21 insertions(+), 20 deletions(-)

diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index cb8e412c7b9db6..8d96074414f2e9 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -22,9 +22,9 @@
 
 #include "xinclude.h"
 
-static unsigned long get_hash(xdfile_t *xdf, long index)
+static size_t get_hash(xdfile_t *xdf, long index)
 {
-	return xdf->recs[xdf->rindex[index]].ha;
+	return xdf->recs[xdf->rindex[index]].minimal_perfect_hash;
 }
 
 #define XDL_MAX_COST_MIN 256
@@ -385,7 +385,7 @@ static xdchange_t *xdl_add_change(xdchange_t *xscr, long i1, long i2, long chg1,
 
 static int recs_match(xrecord_t *rec1, xrecord_t *rec2)
 {
-	return (rec1->ha == rec2->ha);
+	return rec1->minimal_perfect_hash == rec2->minimal_perfect_hash;
 }
 
 /*
diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
index 6dc450b1fe1dfc..5ae1282c27568c 100644
--- a/xdiff/xhistogram.c
+++ b/xdiff/xhistogram.c
@@ -90,7 +90,7 @@ struct region {
 
 static int cmp_recs(xrecord_t *r1, xrecord_t *r2)
 {
-	return r1->ha == r2->ha;
+	return r1->minimal_perfect_hash == r2->minimal_perfect_hash;
 
 }
 
@@ -98,7 +98,7 @@ static int cmp_recs(xrecord_t *r1, xrecord_t *r2)
 	(cmp_recs(REC(i->env, s1, l1), REC(i->env, s2, l2)))
 
 #define TABLE_HASH(index, side, line) \
-	XDL_HASHLONG((REC(index->env, side, line))->ha, index->table_bits)
+	XDL_HASHLONG((REC(index->env, side, line))->minimal_perfect_hash, index->table_bits)
 
 static int scanA(struct histindex *index, int line1, int count1)
 {
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index bb61354f22a177..cc53266f3b8302 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -48,7 +48,7 @@
 struct hashmap {
 	int nr, alloc;
 	struct entry {
-		unsigned long hash;
+		size_t minimal_perfect_hash;
 		/*
 		 * 0 = unused entry, 1 = first line, 2 = second, etc.
 		 * line2 is NON_UNIQUE if the line is not unique
@@ -101,10 +101,10 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
 	 * So we multiply ha by 2 in the hope that the hashing was
 	 * "unique enough".
 	 */
-	int index = (int)((record->ha << 1) % map->alloc);
+	int index = (int)((record->minimal_perfect_hash << 1) % map->alloc);
 
 	while (map->entries[index].line1) {
-		if (map->entries[index].hash != record->ha) {
+		if (map->entries[index].minimal_perfect_hash != record->minimal_perfect_hash) {
 			if (++index >= map->alloc)
 				index = 0;
 			continue;
@@ -120,7 +120,7 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
 	if (pass == 2)
 		return;
 	map->entries[index].line1 = line;
-	map->entries[index].hash = record->ha;
+	map->entries[index].minimal_perfect_hash = record->minimal_perfect_hash;
 	map->entries[index].anchor = is_anchor(xpp, (const char *)map->env->xdf1.recs[line - 1].ptr);
 	if (!map->first)
 		map->first = map->entries + index;
@@ -248,7 +248,7 @@ static int match(struct hashmap *map, int line1, int line2)
 {
 	xrecord_t *record1 = &map->env->xdf1.recs[line1 - 1];
 	xrecord_t *record2 = &map->env->xdf2.recs[line2 - 1];
-	return record1->ha == record2->ha;
+	return record1->minimal_perfect_hash == record2->minimal_perfect_hash;
 }
 
 static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 85e56021daf9e2..bea0992b5e4a33 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -93,12 +93,12 @@ static void xdl_free_classifier(xdlclassifier_t *cf) {
 
 
 static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t *rec) {
-	long hi;
+	size_t hi;
 	xdlclass_t *rcrec;
 
-	hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
+	hi = XDL_HASHLONG(rec->line_hash, cf->hbits);
 	for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
-		if (rcrec->rec.ha == rec->ha &&
+		if (rcrec->rec.line_hash == rec->line_hash &&
 				xdl_recmatch((const char *)rcrec->rec.ptr, (long)rcrec->rec.size,
 					(const char *)rec->ptr, (long)rec->size, cf->flags))
 			break;
@@ -120,7 +120,7 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
 
 	(pass == 1) ? rcrec->len1++ : rcrec->len2++;
 
-	rec->ha = (unsigned long) rcrec->idx;
+	rec->minimal_perfect_hash = (size_t)rcrec->idx;
 
 	return 0;
 }
@@ -158,7 +158,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 			crec = &xdf->recs[xdf->nrec++];
 			crec->ptr = prev;
 			crec->size = cur - prev;
-			crec->ha = hav;
+			crec->line_hash = hav;
 			if (xdl_classify_record(pass, cf, crec) < 0)
 				goto abort;
 		}
@@ -290,7 +290,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
 	if ((mlim = xdl_bogosqrt(xdf1->nrec)) > XDL_MAX_EQLIMIT)
 		mlim = XDL_MAX_EQLIMIT;
 	for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
-		rcrec = cf->rcrecs[recs->ha];
+		rcrec = cf->rcrecs[recs->minimal_perfect_hash];
 		nm = rcrec ? rcrec->len2 : 0;
 		action1[i] = (nm == 0) ? DISCARD: (nm >= mlim && !need_min) ? INVESTIGATE: KEEP;
 	}
@@ -298,7 +298,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
 	if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
 		mlim = XDL_MAX_EQLIMIT;
 	for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
-		rcrec = cf->rcrecs[recs->ha];
+		rcrec = cf->rcrecs[recs->minimal_perfect_hash];
 		nm = rcrec ? rcrec->len1 : 0;
 		action2[i] = (nm == 0) ? DISCARD: (nm >= mlim && !need_min) ? INVESTIGATE: KEEP;
 	}
@@ -350,7 +350,7 @@ static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
 	recs2 = xdf2->recs;
 	for (i = 0, lim = XDL_MIN(xdf1->nrec, xdf2->nrec); i < lim;
 	     i++, recs1++, recs2++)
-		if (recs1->ha != recs2->ha)
+		if (recs1->minimal_perfect_hash != recs2->minimal_perfect_hash)
 			break;
 
 	xdf1->dstart = xdf2->dstart = i;
@@ -358,7 +358,7 @@ static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
 	recs1 = xdf1->recs + xdf1->nrec - 1;
 	recs2 = xdf2->recs + xdf2->nrec - 1;
 	for (lim -= i, i = 0; i < lim; i++, recs1--, recs2--)
-		if (recs1->ha != recs2->ha)
+		if (recs1->minimal_perfect_hash != recs2->minimal_perfect_hash)
 			break;
 
 	xdf1->dend = xdf1->nrec - i - 1;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 354349b523fbac..d4e9cd2e763616 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -41,7 +41,8 @@ typedef struct s_chastore {
 typedef struct s_xrecord {
 	uint8_t const *ptr;
 	size_t size;
-	unsigned long ha;
+	uint64_t line_hash;
+	size_t minimal_perfect_hash;
 } xrecord_t;
 
 typedef struct s_xdfile {

From 016538780e9f6e83a1d9c7b0ec771fb6c5583c0f Mon Sep 17 00:00:00 2001
From: Ezekiel Newren <ezekielnewren@gmail.com>
Date: Tue, 18 Nov 2025 22:34:19 +0000
Subject: [PATCH 113/553] xdiff: make xdfile_t.nrec a size_t instead of long

size_t is used because nrec describes the number of elements in recs
and, with 2 added, the number of elements in 'changed'.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
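
One plausible reading of the (long) casts added below, shown as a
sketch: subtracting from an unsigned count wraps around when the
count is zero, so a bound such as `nrec - 1` is only safe once nrec
is converted to a signed type. The helper names are hypothetical, and
the signed variant assumes nrec fits in a long.

```c
#include <stddef.h>

/* Unsigned arithmetic: for nrec == 0, nrec - 1 wraps to SIZE_MAX. */
static int before_last_unsigned(size_t i, size_t nrec)
{
	return i < nrec - 1;
}

/* Signed arithmetic: for nrec == 0, the bound is -1 as intended. */
static int before_last_signed(long i, size_t nrec)
{
	return i < (long)nrec - 1;
}
```

For an empty file the unsigned variant reports that index 0 lies
before the last record, while the signed variant does not.
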
 xdiff/xdiffi.c    |  8 ++++----
 xdiff/xemit.c     | 20 ++++++++++----------
 xdiff/xmerge.c    |  8 ++++----
 xdiff/xpatience.c |  2 +-
 xdiff/xprepare.c  | 12 ++++++------
 xdiff/xtypes.h    |  2 +-
 6 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 8d96074414f2e9..21d06bce969765 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -483,7 +483,7 @@ static void measure_split(const xdfile_t *xdf, long split,
 {
 	long i;
 
-	if (split >= xdf->nrec) {
+	if (split >= (long)xdf->nrec) {
 		m->end_of_file = 1;
 		m->indent = -1;
 	} else {
@@ -506,7 +506,7 @@ static void measure_split(const xdfile_t *xdf, long split,
 
 	m->post_blank = 0;
 	m->post_indent = -1;
-	for (i = split + 1; i < xdf->nrec; i++) {
+	for (i = split + 1; i < (long)xdf->nrec; i++) {
 		m->post_indent = get_indent(&xdf->recs[i]);
 		if (m->post_indent != -1)
 			break;
@@ -717,7 +717,7 @@ static void group_init(xdfile_t *xdf, struct xdlgroup *g)
  */
 static inline int group_next(xdfile_t *xdf, struct xdlgroup *g)
 {
-	if (g->end == xdf->nrec)
+	if (g->end == (long)xdf->nrec)
 		return -1;
 
 	g->start = g->end + 1;
@@ -750,7 +750,7 @@ static inline int group_previous(xdfile_t *xdf, struct xdlgroup *g)
  */
 static int group_slide_down(xdfile_t *xdf, struct xdlgroup *g)
 {
-	if (g->end < xdf->nrec &&
+	if (g->end < (long)xdf->nrec &&
 	    recs_match(&xdf->recs[g->start], &xdf->recs[g->end])) {
 		xdf->changed[g->start++] = false;
 		xdf->changed[g->end++] = true;
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 2f8007753c30f2..04f7e9193b61f0 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -137,7 +137,7 @@ static long get_func_line(xdfenv_t *xe, xdemitconf_t const *xecfg,
 	buf = func_line ? func_line->buf : dummy;
 	size = func_line ? sizeof(func_line->buf) : sizeof(dummy);
 
-	for (l = start; l != limit && 0 <= l && l < xe->xdf1.nrec; l += step) {
+	for (l = start; l != limit && 0 <= l && l < (long)xe->xdf1.nrec; l += step) {
 		long len = match_func_rec(&xe->xdf1, xecfg, l, buf, size);
 		if (len >= 0) {
 			if (func_line)
@@ -179,14 +179,14 @@ int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
 			long fs1, i1 = xch->i1;
 
 			/* Appended chunk? */
-			if (i1 >= xe->xdf1.nrec) {
+			if (i1 >= (long)xe->xdf1.nrec) {
 				long i2 = xch->i2;
 
 				/*
 				 * We don't need additional context if
 				 * a whole function was added.
 				 */
-				while (i2 < xe->xdf2.nrec) {
+				while (i2 < (long)xe->xdf2.nrec) {
 					if (is_func_rec(&xe->xdf2, xecfg, i2))
 						goto post_context_calculation;
 					i2++;
@@ -196,7 +196,7 @@ int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
 				 * Otherwise get more context from the
 				 * pre-image.
 				 */
-				i1 = xe->xdf1.nrec - 1;
+				i1 = (long)xe->xdf1.nrec - 1;
 			}
 
 			fs1 = get_func_line(xe, xecfg, NULL, i1, -1);
@@ -228,8 +228,8 @@ int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
 
  post_context_calculation:
 		lctx = xecfg->ctxlen;
-		lctx = XDL_MIN(lctx, xe->xdf1.nrec - (xche->i1 + xche->chg1));
-		lctx = XDL_MIN(lctx, xe->xdf2.nrec - (xche->i2 + xche->chg2));
+		lctx = XDL_MIN(lctx, (long)xe->xdf1.nrec - (xche->i1 + xche->chg1));
+		lctx = XDL_MIN(lctx, (long)xe->xdf2.nrec - (xche->i2 + xche->chg2));
 
 		e1 = xche->i1 + xche->chg1 + lctx;
 		e2 = xche->i2 + xche->chg2 + lctx;
@@ -237,13 +237,13 @@ int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
 		if (xecfg->flags & XDL_EMIT_FUNCCONTEXT) {
 			long fe1 = get_func_line(xe, xecfg, NULL,
 						 xche->i1 + xche->chg1,
-						 xe->xdf1.nrec);
+						 (long)xe->xdf1.nrec);
 			while (fe1 > 0 && is_empty_rec(&xe->xdf1, fe1 - 1))
 				fe1--;
 			if (fe1 < 0)
-				fe1 = xe->xdf1.nrec;
+				fe1 = (long)xe->xdf1.nrec;
 			if (fe1 > e1) {
-				e2 = XDL_MIN(e2 + (fe1 - e1), xe->xdf2.nrec);
+				e2 = XDL_MIN(e2 + (fe1 - e1), (long)xe->xdf2.nrec);
 				e1 = fe1;
 			}
 
@@ -254,7 +254,7 @@ int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
 			 */
 			if (xche->next) {
 				long l = XDL_MIN(xche->next->i1,
-						 xe->xdf1.nrec - 1);
+						 (long)xe->xdf1.nrec - 1);
 				if (l - xecfg->ctxlen <= e1 ||
 				    get_func_line(xe, xecfg, NULL, l, e1) < 0) {
 					xche = xche->next;
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index 0dd4558a3280dc..29dad98c496b07 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -158,7 +158,7 @@ static int is_eol_crlf(xdfile_t *file, int i)
 {
 	size_t size;
 
-	if (i < file->nrec - 1)
+	if (i < (long)file->nrec - 1)
 		/* All lines before the last *must* end in LF */
 		return (size = file->recs[i].size) > 1 &&
 			file->recs[i].ptr[size - 2] == '\r';
@@ -317,7 +317,7 @@ static int xdl_fill_merge_buffer(xdfenv_t *xe1, const char *name1,
 			continue;
 		i = m->i1 + m->chg1;
 	}
-	size += xdl_recs_copy(xe1, i, xe1->xdf2.nrec - i, 0, 0,
+	size += xdl_recs_copy(xe1, i, (int)xe1->xdf2.nrec - i, 0, 0,
 			      dest ? dest + size : NULL);
 	return size;
 }
@@ -622,7 +622,7 @@ static int xdl_do_merge(xdfenv_t *xe1, xdchange_t *xscr1,
 			changes = c;
 		i0 = xscr1->i1;
 		i1 = xscr1->i2;
-		i2 = xscr1->i1 + xe2->xdf2.nrec - xe2->xdf1.nrec;
+		i2 = xscr1->i1 + (long)xe2->xdf2.nrec - (long)xe2->xdf1.nrec;
 		chg0 = xscr1->chg1;
 		chg1 = xscr1->chg2;
 		chg2 = xscr1->chg1;
@@ -637,7 +637,7 @@ static int xdl_do_merge(xdfenv_t *xe1, xdchange_t *xscr1,
 		if (!changes)
 			changes = c;
 		i0 = xscr2->i1;
-		i1 = xscr2->i1 + xe1->xdf2.nrec - xe1->xdf1.nrec;
+		i1 = xscr2->i1 + (long)xe1->xdf2.nrec - (long)xe1->xdf1.nrec;
 		i2 = xscr2->i2;
 		chg0 = xscr2->chg1;
 		chg1 = xscr2->chg1;
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index cc53266f3b8302..a0b31eb5d8c1c0 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -370,5 +370,5 @@ static int patience_diff(xpparam_t const *xpp, xdfenv_t *env,
 
 int xdl_do_patience_diff(xpparam_t const *xpp, xdfenv_t *env)
 {
-	return patience_diff(xpp, env, 1, env->xdf1.nrec, 1, env->xdf2.nrec);
+	return patience_diff(xpp, env, 1, (int)env->xdf1.nrec, 1, (int)env->xdf2.nrec);
 }
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index bea0992b5e4a33..705ddd1ae00a36 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -153,7 +153,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 		for (top = blk + bsize; cur < top; ) {
 			prev = cur;
 			hav = xdl_hash_record(&cur, top, xpp->flags);
-			if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
+			if (XDL_ALLOC_GROW(xdf->recs, (long)xdf->nrec + 1, narec))
 				goto abort;
 			crec = &xdf->recs[xdf->nrec++];
 			crec->ptr = prev;
@@ -287,7 +287,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
 	/*
 	 * Initialize temporary arrays with DISCARD, KEEP, or INVESTIGATE.
 	 */
-	if ((mlim = xdl_bogosqrt(xdf1->nrec)) > XDL_MAX_EQLIMIT)
+	if ((mlim = xdl_bogosqrt((long)xdf1->nrec)) > XDL_MAX_EQLIMIT)
 		mlim = XDL_MAX_EQLIMIT;
 	for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
 		rcrec = cf->rcrecs[recs->minimal_perfect_hash];
@@ -295,7 +295,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
 		action1[i] = (nm == 0) ? DISCARD: (nm >= mlim && !need_min) ? INVESTIGATE: KEEP;
 	}
 
-	if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
+	if ((mlim = xdl_bogosqrt((long)xdf2->nrec)) > XDL_MAX_EQLIMIT)
 		mlim = XDL_MAX_EQLIMIT;
 	for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
 		rcrec = cf->rcrecs[recs->minimal_perfect_hash];
@@ -348,7 +348,7 @@ static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
 
 	recs1 = xdf1->recs;
 	recs2 = xdf2->recs;
-	for (i = 0, lim = XDL_MIN(xdf1->nrec, xdf2->nrec); i < lim;
+	for (i = 0, lim = (long)XDL_MIN(xdf1->nrec, xdf2->nrec); i < lim;
 	     i++, recs1++, recs2++)
 		if (recs1->minimal_perfect_hash != recs2->minimal_perfect_hash)
 			break;
@@ -361,8 +361,8 @@ static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
 		if (recs1->minimal_perfect_hash != recs2->minimal_perfect_hash)
 			break;
 
-	xdf1->dend = xdf1->nrec - i - 1;
-	xdf2->dend = xdf2->nrec - i - 1;
+	xdf1->dend = (long)xdf1->nrec - i - 1;
+	xdf2->dend = (long)xdf2->nrec - i - 1;
 
 	return 0;
 }
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index d4e9cd2e763616..4c4d9bd147ebe8 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -47,7 +47,7 @@ typedef struct s_xrecord {
 
 typedef struct s_xdfile {
 	xrecord_t *recs;
-	long nrec;
+	size_t nrec;
 	ptrdiff_t dstart, dend;
 	bool *changed;
 	long *rindex;

From e35877eadbd9bee473936577c82abca9c8333abd Mon Sep 17 00:00:00 2001
From: Ezekiel Newren <ezekielnewren@gmail.com>
Date: Tue, 18 Nov 2025 22:34:20 +0000
Subject: [PATCH 114/553] xdiff: make xdfile_t.nreff a size_t instead of long

size_t is used because nreff describes the number of elements in memory
for rindex.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 xdiff/xprepare.c | 14 +++++++-------
 xdiff/xtypes.h   |  2 +-
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 705ddd1ae00a36..39fd79d9d46331 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -264,7 +264,7 @@ static bool xdl_clean_mmatch(uint8_t const *action, long i, long s, long e) {
  * might be potentially discarded if they appear in a run of discardable.
  */
 static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2) {
-	long i, nm, nreff, mlim;
+	long i, nm, mlim;
 	xrecord_t *recs;
 	xdlclass_t *rcrec;
 	uint8_t *action1 = NULL, *action2 = NULL;
@@ -307,29 +307,29 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
 	 * Use temporary arrays to decide if changed[i] should remain
 	 * false, or become true.
 	 */
-	for (nreff = 0, i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
+	xdf1->nreff = 0;
+	for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
 	     i <= xdf1->dend; i++, recs++) {
 		if (action1[i] == KEEP ||
 		    (action1[i] == INVESTIGATE && !xdl_clean_mmatch(action1, i, xdf1->dstart, xdf1->dend))) {
-			xdf1->rindex[nreff++] = i;
+			xdf1->rindex[xdf1->nreff++] = i;
 			/* changed[i] remains false, i.e. keep */
 		} else
 			xdf1->changed[i] = true;
 			/* i.e. discard */
 	}
-	xdf1->nreff = nreff;
 
-	for (nreff = 0, i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart];
+	xdf2->nreff = 0;
+	for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart];
 	     i <= xdf2->dend; i++, recs++) {
 		if (action2[i] == KEEP ||
 		    (action2[i] == INVESTIGATE && !xdl_clean_mmatch(action2, i, xdf2->dstart, xdf2->dend))) {
-			xdf2->rindex[nreff++] = i;
+			xdf2->rindex[xdf2->nreff++] = i;
 			/* changed[i] remains false, i.e. keep */
 		} else
 			xdf2->changed[i] = true;
 			/* i.e. discard */
 	}
-	xdf2->nreff = nreff;
 
 cleanup:
 	xdl_free(action1);
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 4c4d9bd147ebe8..1f495f987f861b 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -51,7 +51,7 @@ typedef struct s_xdfile {
 	ptrdiff_t dstart, dend;
 	bool *changed;
 	long *rindex;
-	long nreff;
+	size_t nreff;
 } xdfile_t;
 
 typedef struct s_xdfenv {

From 5004a8da14e2aa80b5697b0a3a60e594af1c8292 Mon Sep 17 00:00:00 2001
From: Ezekiel Newren <ezekielnewren@gmail.com>
Date: Tue, 18 Nov 2025 22:34:21 +0000
Subject: [PATCH 115/553] xdiff: change rindex from long to size_t in xdfile_t

The field rindex describes an index offset for other arrays. Change it
to size_t.

Changing the type of rindex from long to size_t has no cascading
refactor impact because it is only ever used to directly index other
arrays.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 xdiff/xtypes.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 1f495f987f861b..9074cdadd1118c 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -50,7 +50,7 @@ typedef struct s_xdfile {
 	size_t nrec;
 	ptrdiff_t dstart, dend;
 	bool *changed;
-	long *rindex;
+	size_t *rindex;
 	size_t nreff;
 } xdfile_t;
 

From 22ce0cb6397d3d15c21c217696f262c4b8eb44b3 Mon Sep 17 00:00:00 2001
From: Ezekiel Newren <ezekielnewren@gmail.com>
Date: Tue, 18 Nov 2025 22:34:22 +0000
Subject: [PATCH 116/553] xdiff: rename rindex -> reference_index

The classic diff considers only a subset of the lines and collects
them in a compacted array. rindex maps each entry of that compacted
array to the line of the file that it references.
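
As an illustration of the mapping only, here is a minimal sketch. The
helpers compact_lines() and original_line() are hypothetical and not
part of xdiff, and keep[] merely stands in for the KEEP/INVESTIGATE
decisions made in xdl_cleanup_records():

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of xdiff: collect the indices of the
 * lines flagged in keep[] into reference_index[] and return how many
 * were collected (the role nreff plays in xdfile_t).
 */
static size_t compact_lines(const bool *keep, size_t nrec,
			    size_t *reference_index)
{
	size_t nreff = 0;

	for (size_t i = 0; i < nrec; i++)
		if (keep[i])
			reference_index[nreff++] = i;
	return nreff;
}

/* Map a position in the compacted array back to a line of the file. */
static size_t original_line(const size_t *reference_index, size_t nreff,
			    size_t pos)
{
	assert(pos < nreff);
	return reference_index[pos];
}
```

With five lines of which the first, fourth and fifth are kept, the
compacted array has three entries and position 1 refers back to line
index 3 of the file.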

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 xdiff/xdiffi.c   |  6 +++---
 xdiff/xprepare.c | 10 +++++-----
 xdiff/xtypes.h   |  2 +-
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 21d06bce969765..4376f943dba539 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -24,7 +24,7 @@
 
 static size_t get_hash(xdfile_t *xdf, long index)
 {
-	return xdf->recs[xdf->rindex[index]].minimal_perfect_hash;
+	return xdf->recs[xdf->reference_index[index]].minimal_perfect_hash;
 }
 
 #define XDL_MAX_COST_MIN 256
@@ -278,10 +278,10 @@ int xdl_recs_cmp(xdfile_t *xdf1, long off1, long lim1,
 	 */
 	if (off1 == lim1) {
 		for (; off2 < lim2; off2++)
-			xdf2->changed[xdf2->rindex[off2]] = true;
+			xdf2->changed[xdf2->reference_index[off2]] = true;
 	} else if (off2 == lim2) {
 		for (; off1 < lim1; off1++)
-			xdf1->changed[xdf1->rindex[off1]] = true;
+			xdf1->changed[xdf1->reference_index[off1]] = true;
 	} else {
 		xdpsplit_t spl;
 		spl.i1 = spl.i2 = 0;
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 39fd79d9d46331..34c82e4f8e1626 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -128,7 +128,7 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
 
 static void xdl_free_ctx(xdfile_t *xdf)
 {
-	xdl_free(xdf->rindex);
+	xdl_free(xdf->reference_index);
 	xdl_free(xdf->changed - 1);
 	xdl_free(xdf->recs);
 }
@@ -141,7 +141,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 	uint8_t const *blk, *cur, *top, *prev;
 	xrecord_t *crec;
 
-	xdf->rindex = NULL;
+	xdf->reference_index = NULL;
 	xdf->changed = NULL;
 	xdf->recs = NULL;
 
@@ -169,7 +169,7 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 
 	if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
 	    (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
-		if (!XDL_ALLOC_ARRAY(xdf->rindex, xdf->nrec + 1))
+		if (!XDL_ALLOC_ARRAY(xdf->reference_index, xdf->nrec + 1))
 			goto abort;
 	}
 
@@ -312,7 +312,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
 	     i <= xdf1->dend; i++, recs++) {
 		if (action1[i] == KEEP ||
 		    (action1[i] == INVESTIGATE && !xdl_clean_mmatch(action1, i, xdf1->dstart, xdf1->dend))) {
-			xdf1->rindex[xdf1->nreff++] = i;
+			xdf1->reference_index[xdf1->nreff++] = i;
 			/* changed[i] remains false, i.e. keep */
 		} else
 			xdf1->changed[i] = true;
@@ -324,7 +324,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
 	     i <= xdf2->dend; i++, recs++) {
 		if (action2[i] == KEEP ||
 		    (action2[i] == INVESTIGATE && !xdl_clean_mmatch(action2, i, xdf2->dstart, xdf2->dend))) {
-			xdf2->rindex[xdf2->nreff++] = i;
+			xdf2->reference_index[xdf2->nreff++] = i;
 			/* changed[i] remains false, i.e. keep */
 		} else
 			xdf2->changed[i] = true;
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 9074cdadd1118c..979586f20a6028 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -50,7 +50,7 @@ typedef struct s_xdfile {
 	size_t nrec;
 	ptrdiff_t dstart, dend;
 	bool *changed;
-	size_t *rindex;
+	size_t *reference_index;
 	size_t nreff;
 } xdfile_t;
 

From 5e6e4854e086ba0025bc7dc11e6b475c92a2f556 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 19 Nov 2025 10:55:15 -0800
Subject: [PATCH 117/553] Start 2.53 cycle

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/RelNotes/2.53.0.adoc | 12 ++++++++++++
 GIT-VERSION-GEN                    |  2 +-
 RelNotes                           |  2 +-
 3 files changed, 14 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/RelNotes/2.53.0.adoc

diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc
new file mode 100644
index 00000000000000..b0b3dc9b3dcc26
--- /dev/null
+++ b/Documentation/RelNotes/2.53.0.adoc
@@ -0,0 +1,12 @@
+Git v2.53 Release Notes
+=======================
+
+Performance, Internal Implementation, Development Support etc.
+--------------------------------------------------------------
+
+ * The list of packfiles used in a running Git process is moved from
+   the packed_git structure into the packfile store.
+
+ * Some ref backend storage can hold not just the object name of an
+   annotated tag, but the object name of the object the tag points at.
+   The code to handle this information has been streamlined.
diff --git a/GIT-VERSION-GEN b/GIT-VERSION-GEN
index 8d5bbf7b6d3efd..1f7af0328a0461 100755
--- a/GIT-VERSION-GEN
+++ b/GIT-VERSION-GEN
@@ -1,6 +1,6 @@
 #!/bin/sh
 
-DEF_VER=v2.52.0
+DEF_VER=v2.52.GIT
 
 LF='
 '
diff --git a/RelNotes b/RelNotes
index 6d16c0077a11cb..6dfd3d1bcf2f33 120000
--- a/RelNotes
+++ b/RelNotes
@@ -1 +1 @@
-Documentation/RelNotes/2.52.0.adoc
\ No newline at end of file
+Documentation/RelNotes/2.53.0.adoc
\ No newline at end of file

From 903b04a3e721f4afb337bd48890b69e16c04c5d6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jean-No=C3=ABl=20Avila?= <jn.avila@free.fr>
Date: Wed, 19 Nov 2025 21:40:02 +0000
Subject: [PATCH 118/553] doc: convert git fetch to synopsis style
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Switch the synopsis to a synopsis block which will automatically
  format placeholders in italics and keywords in monospace
- Use _<placeholder>_ instead of <placeholder> in the description
- Use `backticks` for keywords and more complex option
  descriptions. The new rendering engine will apply synopsis rules to
  these spans.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config/fetch.adoc     |  60 ++++-----
 Documentation/fetch-options.adoc    | 201 ++++++++++++++--------------
 Documentation/git-fetch.adoc        |  48 +++----
 Documentation/pull-fetch-param.adoc |  52 +++----
 Documentation/urls-remotes.adoc     |  16 +--
 builtin/fetch.c                     |   2 +-
 6 files changed, 190 insertions(+), 189 deletions(-)

diff --git a/Documentation/config/fetch.adoc b/Documentation/config/fetch.adoc
index d7dc461bd16ad7..cd40db0cad1c36 100644
--- a/Documentation/config/fetch.adoc
+++ b/Documentation/config/fetch.adoc
@@ -1,32 +1,32 @@
-fetch.recurseSubmodules::
+`fetch.recurseSubmodules`::
 	This option controls whether `git fetch` (and the underlying fetch
 	in `git pull`) will recursively fetch into populated submodules.
-	This option can be set either to a boolean value or to 'on-demand'.
+	This option can be set either to a boolean value or to `on-demand`.
 	Setting it to a boolean changes the behavior of fetch and pull to
 	recurse unconditionally into submodules when set to true or to not
-	recurse at all when set to false. When set to 'on-demand', fetch and
+	recurse at all when set to false. When set to `on-demand`, fetch and
 	pull will only recurse into a populated submodule when its
 	superproject retrieves a commit that updates the submodule's
 	reference.
-	Defaults to 'on-demand', or to the value of 'submodule.recurse' if set.
+	Defaults to `on-demand`, or to the value of `submodule.recurse` if set.
 
-fetch.fsckObjects::
+`fetch.fsckObjects`::
 	If it is set to true, git-fetch-pack will check all fetched
 	objects. See `transfer.fsckObjects` for what's
-	checked. Defaults to false. If not set, the value of
+	checked. Defaults to `false`. If not set, the value of
 	`transfer.fsckObjects` is used instead.
 
-fetch.fsck.<msg-id>::
+`fetch.fsck.<msg-id>`::
 	Acts like `fsck.<msg-id>`, but is used by
 	linkgit:git-fetch-pack[1] instead of linkgit:git-fsck[1]. See
 	the `fsck.<msg-id>` documentation for details.
 
-fetch.fsck.skipList::
+`fetch.fsck.skipList`::
 	Acts like `fsck.skipList`, but is used by
 	linkgit:git-fetch-pack[1] instead of linkgit:git-fsck[1]. See
 	the `fsck.skipList` documentation for details.
 
-fetch.unpackLimit::
+`fetch.unpackLimit`::
 	If the number of objects fetched over the Git native
 	transfer is below this
 	limit, then the objects will be unpacked into loose object
@@ -37,12 +37,12 @@ fetch.unpackLimit::
 	especially on slow filesystems.  If not set, the value of
 	`transfer.unpackLimit` is used instead.
 
-fetch.prune::
+`fetch.prune`::
 	If true, fetch will automatically behave as if the `--prune`
 	option was given on the command line.  See also `remote.<name>.prune`
 	and the PRUNING section of linkgit:git-fetch[1].
 
-fetch.pruneTags::
+`fetch.pruneTags`::
 	If true, fetch will automatically behave as if the
 	`refs/tags/*:refs/tags/*` refspec was provided when pruning,
 	if not set already. This allows for setting both this option
@@ -50,41 +50,41 @@ fetch.pruneTags::
 	refs. See also `remote.<name>.pruneTags` and the PRUNING
 	section of linkgit:git-fetch[1].
 
-fetch.all::
+`fetch.all`::
 	If true, fetch will attempt to update all available remotes.
 	This behavior can be overridden by passing `--no-all` or by
 	explicitly specifying one or more remote(s) to fetch from.
-	Defaults to false.
+	Defaults to `false`.
 
-fetch.output::
+`fetch.output`::
 	Control how ref update status is printed. Valid values are
 	`full` and `compact`. Default value is `full`. See the
 	OUTPUT section in linkgit:git-fetch[1] for details.
 
-fetch.negotiationAlgorithm::
+`fetch.negotiationAlgorithm`::
 	Control how information about the commits in the local repository
 	is sent when negotiating the contents of the packfile to be sent by
-	the server.  Set to "consecutive" to use an algorithm that walks
-	over consecutive commits checking each one.  Set to "skipping" to
+	the server.  Set to `consecutive` to use an algorithm that walks
+	over consecutive commits checking each one.  Set to `skipping` to
 	use an algorithm that skips commits in an effort to converge
 	faster, but may result in a larger-than-necessary packfile; or set
-	to "noop" to not send any information at all, which will almost
+	to `noop` to not send any information at all, which will almost
 	certainly result in a larger-than-necessary packfile, but will skip
-	the negotiation step.  Set to "default" to override settings made
+	the negotiation step.  Set to `default` to override settings made
 	previously and use the default behaviour.  The default is normally
-	"consecutive", but if `feature.experimental` is true, then the
-	default is "skipping".  Unknown values will cause 'git fetch' to
+	`consecutive`, but if `feature.experimental` is `true`, then the
+	default is `skipping`.  Unknown values will cause `git fetch` to
 	error out.
 +
 See also the `--negotiate-only` and `--negotiation-tip` options to
 linkgit:git-fetch[1].
 
-fetch.showForcedUpdates::
-	Set to false to enable `--no-show-forced-updates` in
+`fetch.showForcedUpdates`::
+	Set to `false` to enable `--no-show-forced-updates` in
 	linkgit:git-fetch[1] and linkgit:git-pull[1] commands.
-	Defaults to true.
+	Defaults to `true`.
 
-fetch.parallel::
+`fetch.parallel`::
 	Specifies the maximal number of fetch operations to be run in parallel
 	at a time (submodules, or remotes when the `--multiple` option of
 	linkgit:git-fetch[1] is in effect).
@@ -94,16 +94,16 @@ A value of 0 will give some reasonable default. If unset, it defaults to 1.
 For submodules, this setting can be overridden using the `submodule.fetchJobs`
 config setting.
 
-fetch.writeCommitGraph::
+`fetch.writeCommitGraph`::
 	Set to true to write a commit-graph after every `git fetch` command
 	that downloads a pack-file from a remote. Using the `--split` option,
 	most executions will create a very small commit-graph file on top of
 	the existing commit-graph file(s). Occasionally, these files will
 	merge and the write may take longer. Having an updated commit-graph
 	file helps performance of many Git commands, including `git merge-base`,
-	`git push -f`, and `git log --graph`. Defaults to false.
+	`git push -f`, and `git log --graph`. Defaults to `false`.
 
-fetch.bundleURI::
+`fetch.bundleURI`::
 	This value stores a URI for downloading Git object data from a bundle
 	URI before performing an incremental fetch from the origin Git server.
 	This is similar to how the `--bundle-uri` option behaves in
@@ -115,9 +115,9 @@ If you modify this value and your repository has a `fetch.bundleCreationToken`
 value, then remove that `fetch.bundleCreationToken` value before fetching from
 the new bundle URI.
 
-fetch.bundleCreationToken::
+`fetch.bundleCreationToken`::
 	When using `fetch.bundleURI` to fetch incrementally from a bundle
-	list that uses the "creationToken" heuristic, this config value
+	list that uses the "`creationToken`" heuristic, this config value
 	stores the maximum `creationToken` value of the downloaded bundles.
 	This value is used to prevent downloading bundles in the future
 	if the advertised `creationToken` is not strictly larger than this
diff --git a/Documentation/fetch-options.adoc b/Documentation/fetch-options.adoc
index ad1e1f49be181d..35a84a1ef27672 100644
--- a/Documentation/fetch-options.adoc
+++ b/Documentation/fetch-options.adoc
@@ -1,41 +1,41 @@
---all::
---no-all::
+`--all`::
+`--no-all`::
 	Fetch all remotes, except for the ones that has the
 	`remote.<name>.skipFetchAll` configuration variable set.
 	This overrides the configuration variable `fetch.all`.
 
--a::
---append::
+`-a`::
+`--append`::
 	Append ref names and object names of fetched refs to the
 	existing contents of `.git/FETCH_HEAD`.  Without this
 	option old data in `.git/FETCH_HEAD` will be overwritten.
 
---atomic::
+`--atomic`::
 	Use an atomic transaction to update local refs. Either all refs are
 	updated, or on error, no refs are updated.
 
---depth=<depth>::
+`--depth=<depth>`::
 	Limit fetching to the specified number of commits from the tip of
 	each remote branch history. If fetching to a 'shallow' repository
 	created by `git clone` with `--depth=<depth>` option (see
 	linkgit:git-clone[1]), deepen or shorten the history to the specified
 	number of commits. Tags for the deepened commits are not fetched.
 
---deepen=<depth>::
-	Similar to --depth, except it specifies the number of commits
+`--deepen=<depth>`::
+	Similar to `--depth`, except it specifies the number of commits
 	from the current shallow boundary instead of from the tip of
 	each remote branch history.
 
---shallow-since=<date>::
+`--shallow-since=<date>`::
 	Deepen or shorten the history of a shallow repository to
-	include all reachable commits after <date>.
+	include all reachable commits after _<date>_.
 
---shallow-exclude=<ref>::
+`--shallow-exclude=<ref>`::
 	Deepen or shorten the history of a shallow repository to
 	exclude commits reachable from a specified remote branch or tag.
 	This option can be specified multiple times.
 
---unshallow::
+`--unshallow`::
 	If the source repository is complete, convert a shallow
 	repository to a complete one, removing all the limitations
 	imposed by shallow repositories.
@@ -43,13 +43,13 @@
 If the source repository is shallow, fetch as much as possible so that
 the current repository has the same history as the source repository.
 
---update-shallow::
+`--update-shallow`::
 	By default when fetching from a shallow repository,
 	`git fetch` refuses refs that require updating
-	.git/shallow. This option updates .git/shallow and accepts such
+	`.git/shallow`. This option updates `.git/shallow` and accepts such
 	refs.
 
---negotiation-tip=<commit|glob>::
+`--negotiation-tip=(<commit>|<glob>)`::
 	By default, Git will report, to the server, commits reachable
 	from all local refs to find common commits in an attempt to
 	reduce the size of the to-be-received packfile. If specified,
@@ -69,28 +69,28 @@ See also the `fetch.negotiationAlgorithm` and `push.negotiate`
 configuration variables documented in linkgit:git-config[1], and the
 `--negotiate-only` option below.
 
---negotiate-only::
+`--negotiate-only`::
 	Do not fetch anything from the server, and instead print the
 	ancestors of the provided `--negotiation-tip=*` arguments,
 	which we have in common with the server.
 +
-This is incompatible with `--recurse-submodules=[yes|on-demand]`.
+This is incompatible with `--recurse-submodules=(yes|on-demand)`.
 Internally this is used to implement the `push.negotiate` option, see
 linkgit:git-config[1].
 
---dry-run::
+`--dry-run`::
 	Show what would be done, without making any changes.
 
---porcelain::
+`--porcelain`::
 	Print the output to standard output in an easy-to-parse format for
 	scripts. See section OUTPUT in linkgit:git-fetch[1] for details.
 +
-This is incompatible with `--recurse-submodules=[yes|on-demand]` and takes
+This is incompatible with `--recurse-submodules=(yes|on-demand)` and takes
 precedence over the `fetch.output` config option.
 
 ifndef::git-pull[]
---write-fetch-head::
---no-write-fetch-head::
+`--write-fetch-head`::
+`--no-write-fetch-head`::
 	Write the list of remote refs fetched in the `FETCH_HEAD`
 	file directly under `$GIT_DIR`.  This is the default.
 	Passing `--no-write-fetch-head` from the command line tells
@@ -98,64 +98,65 @@ ifndef::git-pull[]
 	file is never written.
 endif::git-pull[]
 
--f::
---force::
-	When 'git fetch' is used with `<src>:<dst>` refspec, it may
-	refuse to update the local branch as discussed
+`-f`::
+`--force`::
 ifdef::git-pull[]
-	in the `<refspec>` part of the linkgit:git-fetch[1]
-	documentation.
+When `git fetch` is used with `<src>:<dst>` refspec, it may
+refuse to update the local branch as discussed
+in the _<refspec>_ part of the linkgit:git-fetch[1]
+documentation.
 endif::git-pull[]
 ifndef::git-pull[]
-	in the `<refspec>` part below.
+When `git fetch` is used with `<src>:<dst>` refspec, it may
+refuse to update the local branch as discussed in the _<refspec>_ part below.
 endif::git-pull[]
-	This option overrides that check.
+This option overrides that check.
 
--k::
---keep::
+`-k`::
+`--keep`::
 	Keep downloaded pack.
 
 ifndef::git-pull[]
---multiple::
-	Allow several <repository> and <group> arguments to be
-	specified. No <refspec>s may be specified.
-
---auto-maintenance::
---no-auto-maintenance::
---auto-gc::
---no-auto-gc::
+`--multiple`::
+	Allow several _<repository>_ and _<group>_ arguments to be
+	specified. No __<refspec>__s may be specified.
+
+`--auto-maintenance`::
+`--no-auto-maintenance`::
+`--auto-gc`::
+`--no-auto-gc`::
 	Run `git maintenance run --auto` at the end to perform automatic
 	repository maintenance if needed. (`--[no-]auto-gc` is a synonym.)
 	This is enabled by default.
 
---write-commit-graph::
---no-write-commit-graph::
+`--write-commit-graph`::
+`--no-write-commit-graph`::
 	Write a commit-graph after fetching. This overrides the config
 	setting `fetch.writeCommitGraph`.
 endif::git-pull[]
 
---prefetch::
+`--prefetch`::
 	Modify the configured refspec to place all refs into the
 	`refs/prefetch/` namespace. See the `prefetch` task in
 	linkgit:git-maintenance[1].
 
--p::
---prune::
+`-p`::
+`--prune`::
 	Before fetching, remove any remote-tracking references that no
 	longer exist on the remote.  Tags are not subject to pruning
 	if they are fetched only because of the default tag
-	auto-following or due to a --tags option.  However, if tags
+	auto-following or due to a `--tags` option.  However, if tags
 	are fetched due to an explicit refspec (either on the command
 	line or in the remote configuration, for example if the remote
-	was cloned with the --mirror option), then they are also
+	was cloned with the `--mirror` option), then they are also
 	subject to pruning. Supplying `--prune-tags` is a shorthand for
 	providing the tag refspec.
 ifndef::git-pull[]
 +
 See the PRUNING section below for more details.
 
--P::
---prune-tags::
+`-P`::
+`--prune-tags`::
 	Before fetching, remove any local tags that no longer exist on
 	the remote if `--prune` is enabled. This option should be used
 	more carefully, unlike `--prune` it will remove any local
@@ -168,17 +169,17 @@ See the PRUNING section below for more details.
 endif::git-pull[]
 
 ifndef::git-pull[]
--n::
+`-n`::
 endif::git-pull[]
---no-tags::
+`--no-tags`::
 	By default, tags that point at objects that are downloaded
 	from the remote repository are fetched and stored locally.
 	This option disables this automatic tag following. The default
-	behavior for a remote may be specified with the remote.<name>.tagOpt
+	behavior for a remote may be specified with the `remote.<name>.tagOpt`
 	setting. See linkgit:git-config[1].
 
 ifndef::git-pull[]
---refetch::
+`--refetch`::
 	Instead of negotiating with the server to avoid transferring commits and
 	associated objects that are already present locally, this option fetches
 	all objects as a fresh clone would. Use this to reapply a partial clone
@@ -187,19 +188,19 @@ ifndef::git-pull[]
 	object database pack consolidation to remove any duplicate objects.
 endif::git-pull[]
 
---refmap=<refspec>::
+`--refmap=<refspec>`::
 	When fetching refs listed on the command line, use the
 	specified refspec (can be given more than once) to map the
 	refs to remote-tracking branches, instead of the values of
-	`remote.*.fetch` configuration variables for the remote
+	`remote.<name>.fetch` configuration variables for the remote
 	repository.  Providing an empty `<refspec>` to the
 	`--refmap` option causes Git to ignore the configured
 	refspecs and rely entirely on the refspecs supplied as
 	command-line arguments. See section on "Configured Remote-tracking
 	Branches" for details.
 
--t::
---tags::
+`-t`::
+`--tags`::
 	Fetch all tags from the remote (i.e., fetch remote tags
 	`refs/tags/*` into local tags with the same name), in addition
 	to whatever else would otherwise be fetched.  Using this
@@ -208,8 +209,8 @@ endif::git-pull[]
 	destination of an explicit refspec; see `--prune`).
 
 ifndef::git-pull[]
---recurse-submodules[=(yes|on-demand|no)]::
-	This option controls if and under what conditions new commits of
+`--recurse-submodules[=(yes|on-demand|no)]`::
+	Control if and under what conditions new commits of
 	submodules should be fetched too. When recursing through submodules,
 	`git fetch` always attempts to fetch "changed" submodules, that is, a
 	submodule that has commits that are referenced by a newly fetched
@@ -219,19 +220,19 @@ ifndef::git-pull[]
 	adds a new submodule, that submodule cannot be fetched until it is
 	cloned e.g. by `git submodule update`.
 +
-When set to 'on-demand', only changed submodules are fetched. When set
-to 'yes', all populated submodules are fetched and submodules that are
-both unpopulated and changed are fetched. When set to 'no', submodules
+When set to `on-demand`, only changed submodules are fetched. When set
+to `yes`, all populated submodules are fetched and submodules that are
+both unpopulated and changed are fetched. When set to `no`, submodules
 are never fetched.
 +
 When unspecified, this uses the value of `fetch.recurseSubmodules` if it
-is set (see linkgit:git-config[1]), defaulting to 'on-demand' if unset.
-When this option is used without any value, it defaults to 'yes'.
+is set (see linkgit:git-config[1]), defaulting to `on-demand` if unset.
+When this option is used without any value, it defaults to `yes`.
 endif::git-pull[]
 
--j::
---jobs=<n>::
-	Number of parallel children to be used for all forms of fetching.
+`-j <n>`::
+`--jobs=<n>`::
+	Parallelize all forms of fetching up to _<n>_ jobs at a time.
 +
 If the `--multiple` option was specified, the different remotes will be fetched
 in parallel. If multiple submodules are fetched, they will be fetched in
@@ -242,12 +243,12 @@ Typically, parallel recursive and multi-remote fetches will be faster. By
 default fetches are performed sequentially, not in parallel.
 
 ifndef::git-pull[]
---no-recurse-submodules::
+`--no-recurse-submodules`::
 	Disable recursive fetching of submodules (this has the same effect as
 	using the `--recurse-submodules=no` option).
 endif::git-pull[]
 
---set-upstream::
+`--set-upstream`::
 	If the remote is fetched successfully, add upstream
 	(tracking) reference, used by argument-less
 	linkgit:git-pull[1] and other commands. For more information,
@@ -255,55 +256,55 @@ endif::git-pull[]
 	linkgit:git-config[1].
 
 ifndef::git-pull[]
---submodule-prefix=<path>::
-	Prepend <path> to paths printed in informative messages
+`--submodule-prefix=<path>`::
+	Prepend _<path>_ to paths printed in informative messages
 	such as "Fetching submodule foo".  This option is used
 	internally when recursing over submodules.
 
---recurse-submodules-default=[yes|on-demand]::
+`--recurse-submodules-default=(yes|on-demand)`::
 	This option is used internally to temporarily provide a
-	non-negative default value for the --recurse-submodules
+	non-negative default value for the `--recurse-submodules`
 	option.  All other methods of configuring fetch's submodule
 	recursion (such as settings in linkgit:gitmodules[5] and
 	linkgit:git-config[1]) override this option, as does
-	specifying --[no-]recurse-submodules directly.
+	specifying `--[no-]recurse-submodules` directly.
 
--u::
---update-head-ok::
-	By default 'git fetch' refuses to update the head which
+`-u`::
+`--update-head-ok`::
+	By default `git fetch` refuses to update the head which
 	corresponds to the current branch.  This flag disables the
-	check.  This is purely for the internal use for 'git pull'
-	to communicate with 'git fetch', and unless you are
+	check.  This is purely for the internal use for `git pull`
+	to communicate with `git fetch`, and unless you are
 	implementing your own Porcelain you are not supposed to
 	use it.
 endif::git-pull[]
 
---upload-pack <upload-pack>::
+`--upload-pack <upload-pack>`::
 	When given, and the repository to fetch from is handled
-	by 'git fetch-pack', `--exec=<upload-pack>` is passed to
+	by `git fetch-pack`, `--exec=<upload-pack>` is passed to
 	the command to specify non-default path for the command
 	run on the other end.
 
 ifndef::git-pull[]
--q::
---quiet::
-	Pass --quiet to git-fetch-pack and silence any other internally
+`-q`::
+`--quiet`::
+	Pass `--quiet` to `git-fetch-pack` and silence any other internally
 	used git commands. Progress is not reported to the standard error
 	stream.
 
--v::
---verbose::
+`-v`::
+`--verbose`::
 	Be verbose.
 endif::git-pull[]
 
---progress::
+`--progress`::
 	Progress status is reported on the standard error stream
-	by default when it is attached to a terminal, unless -q
+	by default when it is attached to a terminal, unless `-q`
 	is specified. This flag forces progress status even if the
 	standard error stream is not directed to a terminal.
 
--o <option>::
---server-option=<option>::
+`-o <option>`::
+`--server-option=<option>`::
 	Transmit the given string to the server when communicating using
 	protocol version 2.  The given string must not contain a NUL or LF
 	character.  The server's handling of server options, including
@@ -314,23 +315,23 @@ endif::git-pull[]
 	the values of configuration variable `remote.<name>.serverOption`
 	are used instead.
 
---show-forced-updates::
+`--show-forced-updates`::
 	By default, git checks if a branch is force-updated during
-	fetch. This can be disabled through fetch.showForcedUpdates, but
-	the --show-forced-updates option guarantees this check occurs.
+	fetch. This can be disabled through `fetch.showForcedUpdates`, but
+	the `--show-forced-updates` option guarantees this check occurs.
 	See linkgit:git-config[1].
 
---no-show-forced-updates::
+`--no-show-forced-updates`::
 	By default, git checks if a branch is force-updated during
-	fetch. Pass --no-show-forced-updates or set fetch.showForcedUpdates
+	fetch. Pass `--no-show-forced-updates` or set `fetch.showForcedUpdates`
 	to false to skip this check for performance reasons. If used during
-	'git-pull' the --ff-only option will still check for forced updates
+	`git-pull` the `--ff-only` option will still check for forced updates
 	before attempting a fast-forward update. See linkgit:git-config[1].
 
--4::
---ipv4::
+`-4`::
+`--ipv4`::
 	Use IPv4 addresses only, ignoring IPv6 addresses.
 
--6::
---ipv6::
+`-6`::
+`--ipv6`::
 	Use IPv6 addresses only, ignoring IPv4 addresses.
diff --git a/Documentation/git-fetch.adoc b/Documentation/git-fetch.adoc
index 16f5d9d69af78e..db035419154514 100644
--- a/Documentation/git-fetch.adoc
+++ b/Documentation/git-fetch.adoc
@@ -8,11 +8,11 @@ git-fetch - Download objects and refs from another repository
 
 SYNOPSIS
 --------
-[verse]
-'git fetch' [<options>] [<repository> [<refspec>...]]
-'git fetch' [<options>] <group>
-'git fetch' --multiple [<options>] [(<repository> | <group>)...]
-'git fetch' --all [<options>]
+[synopsis]
+git fetch [<options>] [<repository> [<refspec>...]]
+git fetch [<options>] <group>
+git fetch --multiple [<options>] [(<repository>|<group>)...]
+git fetch --all [<options>]
 
 
 DESCRIPTION
@@ -20,19 +20,19 @@ DESCRIPTION
 Fetch branches and/or tags (collectively, "refs") from one or more
 other repositories, along with the objects necessary to complete their
 histories.  Remote-tracking branches are updated (see the description
-of <refspec> below for ways to control this behavior).
+of _<refspec>_ below for ways to control this behavior).
 
 By default, any tag that points into the histories being fetched is
 also fetched; the effect is to fetch tags that
 point at branches that you are interested in.  This default behavior
-can be changed by using the --tags or --no-tags options or by
-configuring remote.<name>.tagOpt.  By using a refspec that fetches tags
+can be changed by using the `--tags` or `--no-tags` options or by
+configuring `remote.<name>.tagOpt`.  By using a refspec that fetches tags
 explicitly, you can fetch tags that do not point into branches you
 are interested in as well.
 
-'git fetch' can fetch from either a single named repository or URL,
-or from several repositories at once if <group> is given and
-there is a remotes.<group> entry in the configuration file.
+`git fetch` can fetch from either a single named repository or URL,
+or from several repositories at once if _<group>_ is given and
+there is a `remotes.<group>` entry in the configuration file.
 (See linkgit:git-config[1]).
 
 When no remote is specified, by default the `origin` remote will be used,
@@ -48,15 +48,15 @@ include::fetch-options.adoc[]
 
 include::pull-fetch-param.adoc[]
 
---stdin::
+`--stdin`::
 	Read refspecs, one per line, from stdin in addition to those provided
-	as arguments. The "tag <name>" format is not supported.
+	as arguments. The "tag _<name>_" format is not supported.
 
 include::urls-remotes.adoc[]
 
-
-CONFIGURED REMOTE-TRACKING BRANCHES[[CRTB]]
--------------------------------------------
+[[CRTB]]
+CONFIGURED REMOTE-TRACKING BRANCHES
+-----------------------------------
 
 You often interact with the same remote repository by
 regularly and repeatedly fetching from it.  In order to keep track
@@ -84,13 +84,13 @@ This configuration is used in two ways:
 
 * When `git fetch` is run with explicit branches and/or tags
   to fetch on the command line, e.g. `git fetch origin master`, the
-  <refspec>s given on the command line determine what are to be
+  _<refspec>s_ given on the command line determine what are to be
   fetched (e.g. `master` in the example,
   which is a short-hand for `master:`, which in turn means
-  "fetch the 'master' branch but I do not explicitly say what
+  "fetch the `master` branch but I do not explicitly say what
   remote-tracking branch to update with it from the command line"),
   and the example command will
-  fetch _only_ the 'master' branch.  The `remote.<repository>.fetch`
+  fetch _only_ the `master` branch.  The `remote.<repository>.fetch`
   values determine which
   remote-tracking branch, if any, is updated.  When used in this
   way, the `remote.<repository>.fetch` values do not have any
@@ -144,9 +144,9 @@ tracking branches that are deleted, but any local tag that doesn't
 exist on the remote.
 
 This might not be what you expect, i.e. you want to prune remote
-`<name>`, but also explicitly fetch tags from it, so when you fetch
+_<name>_, but also explicitly fetch tags from it, so when you fetch
 from it you delete all your local tags, most of which may not have
-come from the `<name>` remote in the first place.
+come from the _<name>_ remote in the first place.
 
 So be careful when using this with a refspec like
 `refs/tags/*:refs/tags/*`, or any other refspec which might map
@@ -213,11 +213,11 @@ of the form:
 <flag> <old-object-id> <new-object-id> <local-reference>
 -------------------------------
 
-The status of up-to-date refs is shown only if the --verbose option is
+The status of up-to-date refs is shown only if the `--verbose` option is
 used.
 
 In compact output mode, specified with configuration variable
-fetch.output, if either entire `<from>` or `<to>` is found in the
+fetch.output, if either entire _<from>_ or _<to>_ is found in the
 other string, it will be substituted with `*` in the other string. For
 example, `master -> origin/master` becomes `master -> origin/*`.
 
@@ -303,7 +303,7 @@ include::config/fetch.adoc[]
 
 BUGS
 ----
-Using --recurse-submodules can only fetch new commits in submodules that are
+Using `--recurse-submodules` can only fetch new commits in submodules that are
 present locally e.g. in `$GIT_DIR/modules/`. If the upstream adds a new
 submodule, that submodule cannot be fetched until it is cloned e.g. by `git
 submodule update`. This is expected to be fixed in a future Git version.
diff --git a/Documentation/pull-fetch-param.adoc b/Documentation/pull-fetch-param.adoc
index bb2cf6a4629e92..2a67641761b2b4 100644
--- a/Documentation/pull-fetch-param.adoc
+++ b/Documentation/pull-fetch-param.adoc
@@ -1,20 +1,20 @@
-<repository>::
+_<repository>_::
 	The "remote" repository that is the source of a fetch
 	or pull operation.  This parameter can be either a URL
 	(see the section <<URLS,GIT URLS>> below) or the name
 	of a remote (see the section <<REMOTES,REMOTES>> below).
 
 ifndef::git-pull[]
-<group>::
+_<group>_::
 	A name referring to a list of repositories as the value
-	of remotes.<group> in the configuration file.
+	of `remotes.<group>` in the configuration file.
 	(See linkgit:git-config[1]).
 endif::git-pull[]
 
 [[fetch-refspec]]
-<refspec>::
+_<refspec>_::
 	Specifies which refs to fetch and which local refs to update.
-	When no <refspec>s appear on the command line, the refs to fetch
+	When no __<refspec>__s appear on the command line, the refs to fetch
 	are read from `remote.<repository>.fetch` variables instead
 ifndef::git-pull[]
 	(see <<CRTB,CONFIGURED REMOTE-TRACKING BRANCHES>> below).
@@ -24,18 +24,18 @@ ifdef::git-pull[]
 	in linkgit:git-fetch[1]).
 endif::git-pull[]
 +
-The format of a <refspec> parameter is an optional plus
-`+`, followed by the source <src>, followed
-by a colon `:`, followed by the destination <dst>.
-The colon can be omitted when <dst> is empty.  <src> is
+The format of a _<refspec>_ parameter is an optional plus
+`+`, followed by the source _<src>_, followed
+by a colon `:`, followed by the destination _<dst>_.
+The colon can be omitted when _<dst>_ is empty.  _<src>_ is
 typically a ref, or a glob pattern with a single `*` that is used
 to match a set of refs, but it can also be a fully spelled hex object
 name.
 +
-A <refspec> may contain a `*` in its <src> to indicate a simple pattern
+A _<refspec>_ may contain a `*` in its _<src>_ to indicate a simple pattern
 match. Such a refspec functions like a glob that matches any ref with the
-pattern. A pattern <refspec> must have one and only one `*` in both the <src> and
-<dst>. It will map refs to the destination by replacing the `*` with the
+pattern. A pattern _<refspec>_ must have one and only one `*` in both the _<src>_ and
+_<dst>_. It will map refs to the destination by replacing the `*` with the
 contents matched from the source.
 +
 If a refspec is prefixed by `^`, it will be interpreted as a negative
@@ -45,14 +45,14 @@ considered to match if it matches at least one positive refspec, and does
 not match any negative refspec. Negative refspecs can be useful to restrict
 the scope of a pattern refspec so that it will not include specific refs.
 Negative refspecs can themselves be pattern refspecs. However, they may only
-contain a <src> and do not specify a <dst>. Fully spelled out hex object
+contain a _<src>_ and do not specify a _<dst>_. Fully spelled out hex object
 names are also not supported.
 +
 `tag <tag>` means the same as `refs/tags/<tag>:refs/tags/<tag>`;
 it requests fetching everything up to the given tag.
 +
-The remote ref that matches <src>
-is fetched, and if <dst> is not an empty string, an attempt
+The remote ref that matches _<src>_
+is fetched, and if _<dst>_ is not an empty string, an attempt
 is made to update the local ref that matches it.
 +
 Whether that update is allowed without `--force` depends on the ref
@@ -60,7 +60,7 @@ namespace it's being fetched to, the type of object being fetched, and
 whether the update is considered to be a fast-forward. Generally, the
 same rules apply for fetching as when pushing, see the `<refspec>...`
 section of linkgit:git-push[1] for what those are. Exceptions to those
-rules particular to 'git fetch' are noted below.
+rules particular to `git fetch` are noted below.
 +
 Until Git version 2.20, and unlike when pushing with
 linkgit:git-push[1], any updates to `refs/tags/*` would be accepted
@@ -91,7 +91,7 @@ object.
 When the remote branch you want to fetch is known to
 be rewound and rebased regularly, it is expected that
 its new tip will not be a descendant of its previous tip
-(as stored in your remote-tracking branch the last time
+(as stored in your remote-tracking branch the last time
 you fetched).  You would want
 to use the `+` sign to indicate non-fast-forward updates
 will be needed for such branches.  There is no way to
@@ -101,19 +101,19 @@ must know this is the expected usage pattern for a branch.
 ifdef::git-pull[]
 +
 [NOTE]
-There is a difference between listing multiple <refspec>
-directly on 'git pull' command line and having multiple
+There is a difference between listing multiple _<refspec>_
+directly on `git pull` command line and having multiple
 `remote.<repository>.fetch` entries in your configuration
-for a <repository> and running a
-'git pull' command without any explicit <refspec> parameters.
-<refspec>s listed explicitly on the command line are always
+for a _<repository>_ and running a
+`git pull` command without any explicit _<refspec>_ parameters.
+__<refspec>__s listed explicitly on the command line are always
 merged into the current branch after fetching.  In other words,
-if you list more than one remote ref, 'git pull' will create
+if you list more than one remote ref, `git pull` will create
 an Octopus merge.  On the other hand, if you do not list any
-explicit <refspec> parameter on the command line, 'git pull'
-will fetch all the <refspec>s it finds in the
+explicit _<refspec>_ parameter on the command line, `git pull`
+will fetch all the __<refspec>__s it finds in the
 `remote.<repository>.fetch` configuration and merge
-only the first <refspec> found into the current branch.
+only the first _<refspec>_ found into the current branch.
 This is because making an
 Octopus from remote refs is rarely done, while keeping track
 of multiple remote heads in one-go by fetching more than one
diff --git a/Documentation/urls-remotes.adoc b/Documentation/urls-remotes.adoc
index 57b1646d3e2a4e..068b3ee4a69b61 100644
--- a/Documentation/urls-remotes.adoc
+++ b/Documentation/urls-remotes.adoc
@@ -4,7 +4,7 @@ REMOTES[[REMOTES]]
 ------------------
 
 The name of one of the following can be used instead
-of a URL as `<repository>` argument:
+of a URL as _<repository>_ argument:
 
 * a remote in the Git configuration file: `$GIT_DIR/config`,
 * a file in the `$GIT_DIR/remotes` directory, or
@@ -32,8 +32,8 @@ config file would appear like this:
 		fetch = <refspec>
 ------------
 
-The `<pushurl>` is used for pushes only. It is optional and defaults
-to `<URL>`. Pushing to a remote affects all defined pushurls or all
+The _<pushurl>_ is used for pushes only. It is optional and defaults
+to _<URL>_. Pushing to a remote affects all defined pushurls or all
 defined urls if no pushurls are defined. Fetch, however, will only
 fetch from the first defined url if multiple urls are defined.
 
@@ -54,8 +54,8 @@ following format:
 
 ------------
 
-`Push:` lines are used by 'git push' and
-`Pull:` lines are used by 'git pull' and 'git fetch'.
+`Push:` lines are used by `git push` and
+`Pull:` lines are used by `git pull` and `git fetch`.
 Multiple `Push:` and `Pull:` lines may
 be specified for additional branch mappings.
 
@@ -72,12 +72,12 @@ This file should have the following format:
 	<URL>#<head>
 ------------
 
-`<URL>` is required; `#<head>` is optional.
+_<URL>_ is required; `#<head>` is optional.
 
 Depending on the operation, git will use one of the following
 refspecs, if you don't provide one on the command line.
-`<branch>` is the name of this file in `$GIT_DIR/branches` and
-`<head>` defaults to `master`.
+_<branch> is the name of this file in `$GIT_DIR/branches` and
+_<head>_ defaults to `master`.
 
 git fetch uses:
 
diff --git a/builtin/fetch.c b/builtin/fetch.c
index c7ff3480fb1827..74b62b13158827 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -47,7 +47,7 @@
 static const char * const builtin_fetch_usage[] = {
 	N_("git fetch [<options>] [<repository> [<refspec>...]]"),
 	N_("git fetch [<options>] <group>"),
-	N_("git fetch --multiple [<options>] [(<repository> | <group>)...]"),
+	N_("git fetch --multiple [<options>] [(<repository>|<group>)...]"),
 	N_("git fetch --all [<options>]"),
 	NULL
 };

From c80a5ebce0e6afe3f9d3f5047f3de524386c40bb Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jean-No=C3=ABl=20Avila?= <jn.avila@free.fr>
Date: Wed, 19 Nov 2025 21:40:03 +0000
Subject: [PATCH 119/553] doc: convert git pull to synopsis style
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Switch the synopsis to a synopsis block which will automatically
  format placeholders in italics and keywords in monospace
- Use _<placeholder>_ instead of <placeholder> in the description
- Use `backticks` for keywords and more complex option
  descriptions. The new rendering engine will apply synopsis rules to
  these spans.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/fetch-options.adoc | 10 +++---
 Documentation/git-pull.adoc      | 61 ++++++++++++++++----------------
 Documentation/merge-options.adoc |  2 +-
 Documentation/urls-remotes.adoc  |  4 +--
 4 files changed, 38 insertions(+), 39 deletions(-)

diff --git a/Documentation/fetch-options.adoc b/Documentation/fetch-options.adoc
index 35a84a1ef27672..fcba46ee9e5d61 100644
--- a/Documentation/fetch-options.adoc
+++ b/Documentation/fetch-options.adoc
@@ -71,7 +71,7 @@ configuration variables documented in linkgit:git-config[1], and the
 
 `--negotiate-only`::
 	Do not fetch anything from the server, and instead print the
-	ancestors of the provided `--negotiation-tip=*` arguments,
+	ancestors of the provided `--negotiation-tip=` arguments,
 	which we have in common with the server.
 +
 This is incompatible with `--recurse-submodules=(yes|on-demand)`.
@@ -126,7 +126,7 @@ ifndef::git-pull[]
 `--auto-gc`::
 `--no-auto-gc`::
 	Run `git maintenance run --auto` at the end to perform automatic
-	repository maintenance if needed. (`--[no-]auto-gc` is a synonym.)
+	repository maintenance if needed.
 	This is enabled by default.
 
 `--write-commit-graph`::
@@ -193,7 +193,7 @@ endif::git-pull[]
 	specified refspec (can be given more than once) to map the
 	refs to remote-tracking branches, instead of the values of
 	`remote.<name>.fetch` configuration variables for the remote
-	repository.  Providing an empty `<refspec>` to the
+	repository.  Providing an empty _<refspec>_ to the
 	`--refmap` option causes Git to ignore the configured
 	refspecs and rely entirely on the refspecs supplied as
 	command-line arguments. See section on "Configured Remote-tracking
@@ -204,7 +204,7 @@ endif::git-pull[]
 	Fetch all tags from the remote (i.e., fetch remote tags
 	`refs/tags/*` into local tags with the same name), in addition
 	to whatever else would otherwise be fetched.  Using this
-	option alone does not subject tags to pruning, even if --prune
+	option alone does not subject tags to pruning, even if `--prune`
 	is used (though tags may be pruned anyway if they are also the
 	destination of an explicit refspec; see `--prune`).
 
@@ -306,7 +306,7 @@ endif::git-pull[]
 `-o <option>`::
 `--server-option=<option>`::
 	Transmit the given string to the server when communicating using
-	protocol version 2.  The given string must not contain a NUL or LF
+	protocol version 2.  The given string must not contain a _NUL_ or _LF_
 	character.  The server's handling of server options, including
 	unknown ones, is server-specific.
 	When multiple `--server-option=<option>` are given, they are all
diff --git a/Documentation/git-pull.adoc b/Documentation/git-pull.adoc
index cd3bbc90e3008d..248f6c3f39ac74 100644
--- a/Documentation/git-pull.adoc
+++ b/Documentation/git-pull.adoc
@@ -8,8 +8,8 @@ git-pull - Fetch from and integrate with another repository or a local branch
 
 SYNOPSIS
 --------
-[verse]
-'git pull' [<options>] [<repository> [<refspec>...]]
+[synopsis]
+git pull [<options>] [<repository> [<refspec>...]]
 
 
 DESCRIPTION
@@ -43,7 +43,7 @@ want to handle, you can safely abort it with `git merge --abort` or `git
 OPTIONS
 -------
 
-<repository>::
+_<repository>_::
 	The "remote" repository to pull from.  This can be either
 	a URL (see the section <<URLS,GIT URLS>> below) or the name
 	of a remote (see the section <<REMOTES,REMOTES>> below).
@@ -52,29 +52,29 @@ Defaults to the configured upstream for the current branch, or `origin`.
 See <<UPSTREAM-BRANCHES,UPSTREAM BRANCHES>> below for more on how to
 configure upstreams.
 
-<refspec>::
+_<refspec>_::
 	Which branch or other reference(s) to fetch and integrate into the
 	current branch, for example `main` in `git pull origin main`.
 	Defaults to the configured upstream for the current branch.
 +
 This can be a branch, tag, or other collection of reference(s).
-See <<fetch-refspec,<refspec>>> below under "Options related to fetching"
+See <<fetch-refspec,_<refspec>_>> below under "Options related to fetching"
 for the full syntax, and <<DEFAULT-BEHAVIOUR,DEFAULT BEHAVIOUR>> below
 for how `git pull` uses this argument to determine which remote branch
 to integrate.
 
--q::
---quiet::
+`-q`::
+`--quiet`::
 	This is passed to both underlying git-fetch to squelch reporting of
 	during transfer, and underlying git-merge to squelch output during
 	merging.
 
--v::
---verbose::
-	Pass --verbose to git-fetch and git-merge.
+`-v`::
+`--verbose`::
+	Pass `--verbose` to git-fetch and git-merge.
 
---recurse-submodules[=(yes|on-demand|no)]::
---no-recurse-submodules::
+`--recurse-submodules[=(yes|on-demand|no)]`::
+`--no-recurse-submodules`::
 	This option controls if new commits of populated submodules should
 	be fetched, and if the working trees of active submodules should be
 	updated, too (see linkgit:git-fetch[1], linkgit:git-config[1] and
@@ -91,21 +91,20 @@ Options related to merging
 
 include::merge-options.adoc[]
 
--r::
---rebase[=(false|true|merges|interactive)]::
-	When true, rebase the current branch on top of the upstream
+`-r`::
+`--rebase[=(true|merges|false|interactive)]`::
+`true`;; rebase the current branch on top of the upstream
 	branch after fetching. If there is a remote-tracking branch
 	corresponding to the upstream branch and the upstream branch
 	was rebased since last fetched, the rebase uses that information
-	to avoid rebasing non-local changes.
-+
-When set to `merges`, rebase using `git rebase --rebase-merges` so that
+	to avoid rebasing non-local changes. This is the default.
+
+`merges`;; rebase using `git rebase --rebase-merges` so that
 the local merge commits are included in the rebase (see
 linkgit:git-rebase[1] for details).
-+
-When false, merge the upstream branch into the current branch.
-+
-When `interactive`, enable the interactive mode of rebase.
+`false`;; merge the upstream branch into the current branch.
+`interactive`;; enable the interactive mode of rebase.
+
 +
 See `pull.rebase`, `branch.<name>.rebase` and `branch.autoSetupRebase` in
 linkgit:git-config[1] if you want to make `git pull` always use
@@ -117,8 +116,8 @@ It rewrites history, which does not bode well when you
 published that history already.  Do *not* use this option
 unless you have read linkgit:git-rebase[1] carefully.
 
---no-rebase::
-	This is shorthand for --rebase=false.
+`--no-rebase`::
+	This is shorthand for `--rebase=false`.
 
 Options related to fetching
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -178,7 +177,7 @@ uses the refspec from the configuration or
 rules apply:
 
 . If `branch.<name>.merge` configuration for the current
-  branch `<name>` exists, that is the name of the branch at the
+  branch _<name>_ exists, that is the name of the branch at the
   remote site that is merged.
 
 . If the refspec is a globbing one, nothing is merged.
@@ -198,9 +197,9 @@ $ git pull
 $ git pull origin
 ------------------------------------------------
 +
-Normally the branch merged in is the HEAD of the remote repository,
-but the choice is determined by the branch.<name>.remote and
-branch.<name>.merge options; see linkgit:git-config[1] for details.
+Normally the branch merged in is the `HEAD` of the remote repository,
+but the choice is determined by the `branch.<name>.remote` and
+`branch.<name>.merge` options; see linkgit:git-config[1] for details.
 
 * Merge into the current branch the remote branch `next`:
 +
@@ -208,7 +207,7 @@ branch.<name>.merge options; see linkgit:git-config[1] for details.
 $ git pull origin next
 ------------------------------------------------
 +
-This leaves a copy of `next` temporarily in FETCH_HEAD, and
+This leaves a copy of `next` temporarily in `FETCH_HEAD`, and
 updates the remote-tracking branch `origin/next`.
 The same can be done by invoking fetch and merge:
 +
@@ -219,14 +218,14 @@ $ git merge origin/next
 
 
 If you tried a pull which resulted in complex conflicts and
-would want to start over, you can recover with 'git reset'.
+would want to start over, you can recover with `git reset`.
 
 
 include::transfer-data-leaks.adoc[]
 
 BUGS
 ----
-Using --recurse-submodules can only fetch new commits in already checked
+Using `--recurse-submodules` can only fetch new commits in already checked
 out submodules right now. When e.g. upstream added a new submodule in the
 just fetched commits of the superproject the submodule itself cannot be
 fetched, making it impossible to check out that submodule later without
diff --git a/Documentation/merge-options.adoc b/Documentation/merge-options.adoc
index 9d433265b2984b..952cb85e9a4af9 100644
--- a/Documentation/merge-options.adoc
+++ b/Documentation/merge-options.adoc
@@ -56,7 +56,7 @@ ifdef::git-pull[]
 `--ff-only`::
 	Only update to the new history if there is no divergent local
 	history.  This is the default when no method for reconciling
-	divergent histories is provided (via the --rebase=* flags).
+	divergent histories is provided (via the `--rebase` flags).
 
 `--ff`::
 `--no-ff`::
diff --git a/Documentation/urls-remotes.adoc b/Documentation/urls-remotes.adoc
index 068b3ee4a69b61..6878bbe0939c7e 100644
--- a/Documentation/urls-remotes.adoc
+++ b/Documentation/urls-remotes.adoc
@@ -76,7 +76,7 @@ _<URL>_ is required; `#<head>` is optional.
 
 Depending on the operation, git will use one of the following
 refspecs, if you don't provide one on the command line.
-_<branch> is the name of this file in `$GIT_DIR/branches` and
+_<branch>_ is the name of this file in `$GIT_DIR/branches` and
 _<head>_ defaults to `master`.
 
 git fetch uses:
@@ -111,7 +111,7 @@ Git defaults to using the upstream branch for remote operations, for example:
   'origin/main' have diverged, and have 2 and 3 different commits each
   respectively".
 
-The upstream is stored in `.git/config`, in the "remote" and "merge"
+The upstream is stored in `.git/config`, in the "`remote`" and "`merge`"
 fields. For example, if `main`'s upstream is `origin/main`:
 
 ------------

From f7316a66d36f39ed9e5be7a3ce0ecd7b71430ff5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jean-No=C3=ABl=20Avila?= <jn.avila@free.fr>
Date: Wed, 19 Nov 2025 21:40:04 +0000
Subject: [PATCH 120/553] doc: convert git push to synopsis style
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Switch the synopsis to a synopsis block which will automatically
  format placeholders in italics and keywords in monospace
- Use _<placeholder>_ instead of <placeholder> in the description
- Use `backticks` for keywords and more complex option
  descriptions. The new rendering engine will apply synopsis rules to
  these spans.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config/push.adoc | 113 +++++++-------
 Documentation/git-push.adoc    | 267 ++++++++++++++++++---------------
 2 files changed, 201 insertions(+), 179 deletions(-)
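
A minimal sketch, in Python, of how the `[+]<src>[:<dst>]` refspec form
documented in git-push.adoc splits into its parts. The names `Refspec`
and `parse_refspec` are hypothetical and exist only for illustration;
this is not Git's own refspec parser, and it deliberately ignores
negative (`^`) refspecs, the `tag <tag>` shorthand, and the expansion
of short names such as `main` into `refs/heads/main`.

```python
from typing import NamedTuple, Optional


class Refspec(NamedTuple):
    """Hypothetical container for the parts of "[+]<src>[:<dst>]"."""
    force: bool          # True when the optional leading "+" is present
    src: str             # source side; may be the empty string
    dst: Optional[str]   # destination side; None when ":<dst>" is omitted


def parse_refspec(spec: str) -> Refspec:
    """Split a refspec string into force flag, <src>, and <dst>.

    Illustrative only: no validation and no ref-name expansion.
    """
    force = spec.startswith("+")
    if force:
        spec = spec[1:]
    # Split on the first colon; sep is "" when there is no ":<dst>" part.
    src, sep, dst = spec.partition(":")
    return Refspec(force, src, dst if sep else None)
```

With this sketch, `parse_refspec("+HEAD^:refs/heads/main")` gives
`force=True`, `src="HEAD^"`, `dst="refs/heads/main"`, while
`parse_refspec("main")` leaves `dst` as `None`, matching the documented
case where the colon and destination are omitted.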

diff --git a/Documentation/config/push.adoc b/Documentation/config/push.adoc
index 0acbbea18a320f..d9112b22609b51 100644
--- a/Documentation/config/push.adoc
+++ b/Documentation/config/push.adoc
@@ -1,15 +1,15 @@
-push.autoSetupRemote::
-	If set to "true" assume `--set-upstream` on default push when no
+`push.autoSetupRemote`::
+	If set to `true` assume `--set-upstream` on default push when no
 	upstream tracking exists for the current branch; this option
-	takes effect with push.default options 'simple', 'upstream',
-	and 'current'. It is useful if by default you want new branches
+	takes effect with `push.default` options `simple`, `upstream`,
+	and `current`. It is useful if by default you want new branches
 	to be pushed to the default remote (like the behavior of
-	'push.default=current') and you also want the upstream tracking
+	`push.default=current`) and you also want the upstream tracking
 	to be set. Workflows most likely to benefit from this option are
-	'simple' central workflows where all branches are expected to
+	`simple` central workflows where all branches are expected to
 	have the same name on the remote.
 
-push.default::
+`push.default`::
 	Defines the action `git push` should take if no refspec is
 	given (whether from the command-line, config, or elsewhere).
 	Different values are well-suited for
@@ -18,24 +18,28 @@ push.default::
 	`upstream` is probably what you want.  Possible values are:
 +
 --
-
-* `nothing` - do not push anything (error out) unless a refspec is
-  given. This is primarily meant for people who want to
-  avoid mistakes by always being explicit.
-
-* `current` - push the current branch to update a branch with the same
-  name on the receiving end.  Works in both central and non-central
-  workflows.
-
-* `upstream` - push the current branch back to the branch whose
-  changes are usually integrated into the current branch (which is
-  called `@{upstream}`).  This mode only makes sense if you are
-  pushing to the same repository you would normally pull from
-  (i.e. central workflow).
-
-* `tracking` - This is a deprecated synonym for `upstream`.
-
-* `simple` - push the current branch with the same name on the remote.
+`nothing`;;
+do not push anything (error out) unless a refspec is
+given. This is primarily meant for people who want to
+avoid mistakes by always being explicit.
+
+`current`;;
+push the current branch to update a branch with the same
+name on the receiving end.  Works in both central and non-central
+workflows.
+
+`upstream`;;
+push the current branch back to the branch whose
+changes are usually integrated into the current branch (which is
+called `@{upstream}`).  This mode only makes sense if you are
+pushing to the same repository you would normally pull from
+(i.e. central workflow).
+
+`tracking`;;
+this is a deprecated synonym for `upstream`.
+
+`simple`;;
+push the current branch with the same name on the remote.
 +
 If you are working on a centralized workflow (pushing to the same repository you
 pull from, which is typically `origin`), then you need to configure an upstream
@@ -44,16 +48,17 @@ branch with the same name.
 This mode is the default since Git 2.0, and is the safest option suited for
 beginners.
 
-* `matching` - push all branches having the same name on both ends.
-  This makes the repository you are pushing to remember the set of
-  branches that will be pushed out (e.g. if you always push 'maint'
-  and 'master' there and no other branches, the repository you push
-  to will have these two branches, and your local 'maint' and
-  'master' will be pushed there).
+`matching`;;
+push all branches having the same name on both ends.
+This makes the repository you are pushing to remember the set of
+branches that will be pushed out (e.g. if you always push `maint`
+and `master` there and no other branches, the repository you push
+to will have these two branches, and your local `maint` and
+`master` will be pushed there).
 +
 To use this mode effectively, you have to make sure _all_ the
 branches you would push out are ready to be pushed out before
-running 'git push', as the whole point of this mode is to allow you
+running `git push`, as the whole point of this mode is to allow you
 to push all of the branches in one go.  If you usually finish work
 on only one branch and push out the result, while other branches are
 unfinished, this mode is not for you.  Also this mode is not
@@ -66,24 +71,24 @@ new default).
 
 --
 
-push.followTags::
+`push.followTags`::
 	If set to true, enable `--follow-tags` option by default.  You
 	may override this configuration at time of push by specifying
 	`--no-follow-tags`.
 
-push.gpgSign::
-	May be set to a boolean value, or the string 'if-asked'. A true
+`push.gpgSign`::
+	May be set to a boolean value, or the string `if-asked`. A true
 	value causes all pushes to be GPG signed, as if `--signed` is
-	passed to linkgit:git-push[1]. The string 'if-asked' causes
+	passed to linkgit:git-push[1]. The string `if-asked` causes
 	pushes to be signed if the server supports it, as if
-	`--signed=if-asked` is passed to 'git push'. A false value may
+	`--signed=if-asked` is passed to `git push`. A false value may
 	override a value from a lower-priority config file. An explicit
 	command-line flag always overrides this config option.
 
-push.pushOption::
+`push.pushOption`::
 	When no `--push-option=<option>` argument is given from the
-	command line, `git push` behaves as if each <value> of
-	this variable is given as `--push-option=<value>`.
+	command line, `git push` behaves as if each _<option>_ of
+	this variable is given as `--push-option=<option>`.
 +
 This is a multi-valued variable, and an empty value can be used in a
 higher priority configuration file (e.g. `.git/config` in a
@@ -109,26 +114,26 @@ This will result in only b (a and c are cleared).
 
 ----
 
-push.recurseSubmodules::
-	May be "check", "on-demand", "only", or "no", with the same behavior
-	as that of "push --recurse-submodules".
-	If not set, 'no' is used by default, unless 'submodule.recurse' is
-	set (in which case a 'true' value means 'on-demand').
+`push.recurseSubmodules`::
+	May be `check`, `on-demand`, `only`, or `no`, with the same behavior
+	as that of `push --recurse-submodules`.
+	If not set, `no` is used by default, unless `submodule.recurse` is
+	set (in which case a `true` value means `on-demand`).
 
-push.useForceIfIncludes::
-	If set to "true", it is equivalent to specifying
+`push.useForceIfIncludes`::
+	If set to `true`, it is equivalent to specifying
 	`--force-if-includes` as an option to linkgit:git-push[1]
 	in the command line. Adding `--no-force-if-includes` at the
 	time of push overrides this configuration setting.
 
-push.negotiate::
-	If set to "true", attempt to reduce the size of the packfile
+`push.negotiate`::
+	If set to `true`, attempt to reduce the size of the packfile
 	sent by rounds of negotiation in which the client and the
-	server attempt to find commits in common. If "false", Git will
+	server attempt to find commits in common. If `false`, Git will
 	rely solely on the server's ref advertisement to find commits
 	in common.
 
-push.useBitmaps::
-	If set to "false", disable use of bitmaps for "git push" even if
-	`pack.useBitmaps` is "true", without preventing other git operations
-	from using bitmaps. Default is true.
+`push.useBitmaps`::
+	If set to `false`, disable use of bitmaps for `git push` even if
+	`pack.useBitmaps` is `true`, without preventing other git operations
+	from using bitmaps. Default is `true`.
diff --git a/Documentation/git-push.adoc b/Documentation/git-push.adoc
index 864b0d0467579e..e5ba3a67421edc 100644
--- a/Documentation/git-push.adoc
+++ b/Documentation/git-push.adoc
@@ -8,13 +8,13 @@ git-push - Update remote refs along with associated objects
 
 SYNOPSIS
 --------
-[verse]
-'git push' [--all | --branches | --mirror | --tags] [--follow-tags] [--atomic] [-n | --dry-run] [--receive-pack=<git-receive-pack>]
-	   [--repo=<repository>] [-f | --force] [-d | --delete] [--prune] [-q | --quiet] [-v | --verbose]
-	   [-u | --set-upstream] [-o <string> | --push-option=<string>]
-	   [--[no-]signed|--signed=(true|false|if-asked)]
-	   [--force-with-lease[=<refname>[:<expect>]] [--force-if-includes]]
-	   [--no-verify] [<repository> [<refspec>...]]
+[synopsis]
+git push [--all | --branches | --mirror | --tags] [--follow-tags] [--atomic] [-n | --dry-run] [--receive-pack=<git-receive-pack>]
+	 [--repo=<repository>] [-f | --force] [-d | --delete] [--prune] [-q | --quiet] [-v | --verbose]
+	 [-u | --set-upstream] [-o <string> | --push-option=<string>]
+	 [--[no-]signed | --signed=(true|false|if-asked)]
+	 [--force-with-lease[=<refname>[:<expect>]] [--force-if-includes]]
+	 [--no-verify] [<repository> [<refspec>...]]
 
 DESCRIPTION
 -----------
@@ -35,7 +35,7 @@ To decide which branches, tags, or other refs to push, Git uses
 
 1. The `<refspec>` argument(s) (for example `main` in `git push origin main`)
    or the `--all`, `--mirror`, or `--tags` options
-2. The `remote.*.push` configuration for the repository being pushed to
+2. The `remote.<name>.push` configuration for the repository being pushed to
 3. The `push.default` configuration. The default is `push.default=simple`,
    which will push to a branch with the same name as the current branch.
    See the <<CONFIGURATION,CONFIGURATION>> section below for more on `push.default`.
@@ -49,25 +49,25 @@ You can make interesting things happen to a repository
 every time you push into it, by setting up 'hooks' there.  See
 documentation for linkgit:git-receive-pack[1].
 
-
-OPTIONS[[OPTIONS]]
-------------------
-<repository>::
+[[OPTIONS]]
+OPTIONS
+-------
+_<repository>_::
 	The "remote" repository that is the destination of a push
 	operation.  This parameter can be either a URL
 	(see the section <<URLS,GIT URLS>> below) or the name
 	of a remote (see the section <<REMOTES,REMOTES>> below).
 
-<refspec>...::
+`<refspec>...`::
 	Specify what destination ref to update with what source object.
 +
-The format for a refspec is [+]<src>[:<dst>], for example `main`,
+The format for a refspec is `[+]<src>[:<dst>]`, for example `main`,
 `main:other`, or `HEAD^:refs/heads/main`.
 +
-The `<src>` is often the name of the local branch to push, but it can be
+The _<src>_ is often the name of the local branch to push, but it can be
 any arbitrary "SHA-1 expression" (see linkgit:gitrevisions[7]).
 +
-The `<dst>` determines what ref to update on the remote side. It must be the
+The _<dst>_ determines what ref to update on the remote side. It must be the
 name of a branch, tag, or other ref, not an arbitrary expression.
 +
 The `+` is optional and does the same thing as `--force`.
@@ -78,23 +78,23 @@ and destination, or with a shorter form (for example `main` or
 `main:other`). Here are the rules for how refspecs are expanded,
 as well as various other special refspec forms:
 +
- *  `<src>` without a `:<dst>` means to update the same ref as the
-    `<src>`, unless the `remote.<repository>.push` configuration specifies a
-    different <dst>. For example, if `main` is a branch, then the refspec
+ *  _<src>_ without a `:<dst>` means to update the same ref as the
+    _<src>_, unless the `remote.<repository>.push` configuration specifies a
+    different _<dst>_. For example, if `main` is a branch, then the refspec
     `main` expands to `main:refs/heads/main`.
- *  If `<dst>` unambiguously refers to a ref on the <repository> remote,
+ *  If _<dst>_ unambiguously refers to a ref on the <repository> remote,
     then expand it to that ref. For example, if `v1.0` is a tag on the
     remote, then `HEAD:v1.0` expands to `HEAD:refs/tags/v1.0`.
- *  If `<src>` resolves to a ref starting with `refs/heads/` or `refs/tags/`,
+ *  If _<src>_ resolves to a ref starting with `refs/heads/` or `refs/tags/`,
     then prepend that to <dst>. For example, if `main` is a branch, then
     `main:other` expands to `main:refs/heads/other`
  *  The special refspec `:` (or `+:` to allow non-fast-forward updates)
     directs Git to push "matching" branches: for every branch that exists on
     the local side, the remote side is updated if a branch of the same name
     already exists on the remote side.
- *  <src> may contain a * to indicate a simple pattern match.
+ *  _<src>_ may contain a `*` to indicate a simple pattern match.
     This works like a glob that matches any ref matching the pattern.
-    There must be only one * in both the `<src>` and `<dst>`.
+    There must be only one `*` in both the `<src>` and `<dst>`.
     It will map refs to the destination by replacing the * with the
     contents matched from the source. For example, `refs/heads/*:refs/heads/*`
     will push all branches.
@@ -102,11 +102,11 @@ as well as various other special refspec forms:
     This specifies refs to exclude. A ref will be considered to
     match if it matches at least one positive refspec, and does not
     match any negative refspec. Negative refspecs can be pattern refspecs.
-    They must only contain a `<src>`.
+    They must only contain a _<src>_.
     Fully spelled out hex object names are also not supported.
     For example, `git push origin 'refs/heads/*' '^refs/heads/dev-*'`
     will push all branches except for those starting with `dev-`
- *  If `<src>` is empty, it deletes the `<dst>` ref from the remote
+ *  If _<src>_ is empty, it deletes the _<dst>_ ref from the remote
     repository. For example, `git push origin :dev` will
     delete the `dev` branch.
  *  `tag <tag>` expands to `refs/tags/<tag>:refs/tags/<tag>`.
@@ -121,12 +121,12 @@ as well as various other special refspec forms:
 
 Not all updates are allowed: see PUSH RULES below for the details.
 
---all::
---branches::
+`--all`::
+`--branches`::
 	Push all branches (i.e. refs under `refs/heads/`); cannot be
 	used with other <refspec>.
 
---prune::
+`--prune`::
 	Remove remote branches that don't have a local counterpart. For example
 	a remote branch `tmp` will be removed if a local branch with the same
 	name doesn't exist any more. This also respects refspecs, e.g.
@@ -134,7 +134,7 @@ Not all updates are allowed: see PUSH RULES below for the details.
 	make sure that remote `refs/tmp/foo` will be removed if `refs/heads/foo`
 	doesn't exist.
 
---mirror::
+`--mirror`::
 	Instead of naming each ref to push, specifies that all
 	refs under `refs/` (which includes but is not
 	limited to `refs/heads/`, `refs/remotes/`, and `refs/tags/`)
@@ -145,26 +145,26 @@ Not all updates are allowed: see PUSH RULES below for the details.
 	if the configuration option `remote.<remote>.mirror` is
 	set.
 
--n::
---dry-run::
+`-n`::
+`--dry-run`::
 	Do everything except actually send the updates.
 
---porcelain::
+`--porcelain`::
 	Produce machine-readable output.  The output status line for each ref
 	will be tab-separated and sent to stdout instead of stderr.  The full
 	symbolic names of the refs will be given.
 
--d::
---delete::
+`-d`::
+`--delete`::
 	All listed refs are deleted from the remote repository. This is
 	the same as prefixing all refs with a colon.
 
---tags::
+`--tags`::
 	All refs under `refs/tags` are pushed, in
 	addition to refspecs explicitly listed on the command
 	line.
 
---follow-tags::
+`--follow-tags`::
 	Push all the refs that would be pushed without this option,
 	and also push annotated tags in `refs/tags` that are missing
 	from the remote but are pointing at commit-ish that are
@@ -172,29 +172,34 @@ Not all updates are allowed: see PUSH RULES below for the details.
 	with configuration variable `push.followTags`.  For more
 	information, see `push.followTags` in linkgit:git-config[1].
 
---signed::
---no-signed::
---signed=(true|false|if-asked)::
+`--signed`::
+`--no-signed`::
+`--signed=(true|false|if-asked)`::
 	GPG-sign the push request to update refs on the receiving
 	side, to allow it to be checked by the hooks and/or be
-	logged.  If `false` or `--no-signed`, no signing will be
-	attempted.  If `true` or `--signed`, the push will fail if the
-	server does not support signed pushes.  If set to `if-asked`,
-	sign if and only if the server supports signed pushes.  The push
-	will also fail if the actual call to `gpg --sign` fails.  See
-	linkgit:git-receive-pack[1] for the details on the receiving end.
-
---atomic::
---no-atomic::
+	logged. Possible values are:
+`false`;;
+`--no-signed`;;
+no signing will be attempted.
+`true`;;
+`--signed`;;
+the push will fail if the server does not support signed pushes.
+`if-asked`;;
+sign if and only if the server supports signed pushes.  The push
+will also fail if the actual call to `gpg --sign` fails.  See
+linkgit:git-receive-pack[1] for the details on the receiving end.
+
+`--atomic`::
+`--no-atomic`::
 	Use an atomic transaction on the remote side if available.
 	Either all refs are updated, or on error, no refs are updated.
 	If the server does not support atomic pushes the push will fail.
 
--o <option>::
---push-option=<option>::
+`-o <option>`::
+`--push-option=<option>`::
 	Transmit the given string to the server, which passes them to
 	the pre-receive as well as the post-receive hook. The given string
-	must not contain a NUL or LF character.
+	must not contain a _NUL_ or _LF_ character.
 	When multiple `--push-option=<option>` are given, they are
 	all sent to the other side in the order listed on the
 	command line.
@@ -202,22 +207,22 @@ Not all updates are allowed: see PUSH RULES below for the details.
 	line, the values of configuration variable `push.pushOption`
 	are used instead.
 
---receive-pack=<git-receive-pack>::
---exec=<git-receive-pack>::
+`--receive-pack=<git-receive-pack>`::
+`--exec=<git-receive-pack>`::
 	Path to the 'git-receive-pack' program on the remote
 	end.  Sometimes useful when pushing to a remote
 	repository over ssh, and you do not have the program in
-	a directory on the default $PATH.
+	a directory on the default `$PATH`.
 
---force-with-lease::
---no-force-with-lease::
---force-with-lease=<refname>::
---force-with-lease=<refname>:<expect>::
-	Usually, "git push" refuses to update a remote ref that is
+`--force-with-lease`::
+`--no-force-with-lease`::
+`--force-with-lease=<refname>`::
+`--force-with-lease=<refname>:<expect>`::
+	Usually, `git push` refuses to update a remote ref that is
 	not an ancestor of the local ref used to overwrite it.
 +
 This option overrides this restriction if the current value of the
-remote ref is the expected value.  "git push" fails otherwise.
+remote ref is the expected value.  `git push` fails otherwise.
 +
 Imagine that you have to rebase what you have already published.
 You will have to bypass the "must fast-forward" rule in order to
@@ -239,16 +244,16 @@ current value to be the same as the remote-tracking branch we have
 for them.
 +
 `--force-with-lease=<refname>`, without specifying the expected value, will
-protect the named ref (alone), if it is going to be updated, by
+protect _<refname>_ (alone), if it is going to be updated, by
 requiring its current value to be the same as the remote-tracking
 branch we have for it.
 +
-`--force-with-lease=<refname>:<expect>` will protect the named ref (alone),
+`--force-with-lease=<refname>:<expect>` will protect _<refname>_ (alone),
 if it is going to be updated, by requiring its current value to be
-the same as the specified value `<expect>` (which is allowed to be
+the same as the specified value _<expect>_ (which is allowed to be
 different from the remote-tracking branch we have for the refname,
 or we do not even have to have such a remote-tracking branch when
-this form is used).  If `<expect>` is the empty string, then the named ref
+this form is used).  If _<expect>_ is the empty string, then the named ref
 must not already exist.
 +
 Note that all forms other than `--force-with-lease=<refname>:<expect>`
@@ -256,7 +261,7 @@ that specifies the expected current value of the ref explicitly are
 still experimental and their semantics may change as we gain experience
 with this feature.
 +
-"--no-force-with-lease" will cancel all the previous --force-with-lease on the
+`--no-force-with-lease` will cancel all the previous `--force-with-lease` on the
 command line.
 +
 A general note on safety: supplying this option without an expected
@@ -276,23 +281,29 @@ If your editor or some other system is running `git fetch` in the
 background for you a way to mitigate this is to simply set up another
 remote:
 +
-	git remote add origin-push $(git config remote.origin.url)
-	git fetch origin-push
+----
+git remote add origin-push $(git config remote.origin.url)
+git fetch origin-push
+----
 +
 Now when the background process runs `git fetch origin` the references
 on `origin-push` won't be updated, and thus commands like:
 +
-	git push --force-with-lease origin-push
+----
+git push --force-with-lease origin-push
+----
 +
 Will fail unless you manually run `git fetch origin-push`. This method
 is of course entirely defeated by something that runs `git fetch
 --all`, in that case you'd need to either disable it or do something
 more tedious like:
 +
-	git fetch              # update 'master' from remote
-	git tag base master    # mark our base point
-	git rebase -i master   # rewrite some commits
-	git push --force-with-lease=master:base master:master
+----
+git fetch              # update 'master' from remote
+git tag base master    # mark our base point
+git rebase -i master   # rewrite some commits
+git push --force-with-lease=master:base master:master
+----
 +
 I.e. create a `base` tag for versions of the upstream code that you've
 seen and are willing to overwrite, then rewrite history, and finally
@@ -308,26 +319,26 @@ verify if updates from the remote-tracking refs that may have been
 implicitly updated in the background are integrated locally before
 allowing a forced update.
 
--f::
---force::
+`-f`::
+`--force`::
 	Usually, `git push` will refuse to update a branch that is not an
 	ancestor of the commit being pushed.
 +
 This flag disables that check, the other safety checks in PUSH RULES
-below, and the checks in --force-with-lease. It can cause the remote
+below, and the checks in `--force-with-lease`. It can cause the remote
 repository to lose commits; use it with care.
 +
 Note that `--force` applies to all the refs that are pushed, hence
 using it with `push.default` set to `matching` or with multiple push
-destinations configured with `remote.*.push` may overwrite refs
+destinations configured with `remote.<name>.push` may overwrite refs
 other than the current branch (including local refs that are
 strictly behind their remote counterpart).  To force a push to only
 one branch, use a `+` in front of the refspec to push (e.g `git push
 origin +master` to force a push to the `master` branch). See the
 `<refspec>...` section above for details.
 
---force-if-includes::
---no-force-if-includes::
+`--force-if-includes`::
+`--no-force-if-includes`::
 	Force an update only if the tip of the remote-tracking ref
 	has been integrated locally.
 +
@@ -343,72 +354,78 @@ a "no-op".
 +
 Specifying `--no-force-if-includes` disables this behavior.
 
---repo=<repository>::
-	This option is equivalent to the <repository> argument. If both
+`--repo=<repository>`::
+	This option is equivalent to the _<repository>_ argument. If both
 	are specified, the command-line argument takes precedence.
 
--u::
---set-upstream::
+`-u`::
+`--set-upstream`::
 	For every branch that is up to date or successfully pushed, add
 	upstream (tracking) reference, used by argument-less
 	linkgit:git-pull[1] and other commands. For more information,
 	see `branch.<name>.merge` in linkgit:git-config[1].
 
---thin::
---no-thin::
+`--thin`::
+`--no-thin`::
 	These options are passed to linkgit:git-send-pack[1]. A thin transfer
 	significantly reduces the amount of sent data when the sender and
 	receiver share many of the same objects in common. The default is
 	`--thin`.
 
--q::
---quiet::
+`-q`::
+`--quiet`::
 	Suppress all output, including the listing of updated refs,
 	unless an error occurs. Progress is not reported to the standard
 	error stream.
 
--v::
---verbose::
+`-v`::
+`--verbose`::
 	Run verbosely.
 
---progress::
+`--progress`::
 	Progress status is reported on the standard error stream
-	by default when it is attached to a terminal, unless -q
+	by default when it is attached to a terminal, unless `-q`
 	is specified. This flag forces progress status even if the
 	standard error stream is not directed to a terminal.
 
---no-recurse-submodules::
---recurse-submodules=check|on-demand|only|no::
+`--no-recurse-submodules`::
+`--recurse-submodules=(check|on-demand|only|no)`::
 	May be used to make sure all submodule commits used by the
 	revisions to be pushed are available on a remote-tracking branch.
-	If 'check' is used Git will verify that all submodule commits that
+	Possible values are:
+`check`;;
+	Git will verify that all submodule commits that
 	changed in the revisions to be pushed are available on at least one
 	remote of the submodule. If any commits are missing the push will
-	be aborted and exit with non-zero status. If 'on-demand' is used
+	be aborted and exit with non-zero status.
+`on-demand`;;
 	all submodules that changed in the revisions to be pushed will be
-	pushed. If on-demand was not able to push all necessary revisions it will
-	also be aborted and exit with non-zero status. If 'only' is used all
-	submodules will be pushed while the superproject is left
-	unpushed. A value of 'no' or using `--no-recurse-submodules` can be used
-	to override the push.recurseSubmodules configuration variable when no
-	submodule recursion is required.
-+
-When using 'on-demand' or 'only', if a submodule has a
-"push.recurseSubmodules={on-demand,only}" or "submodule.recurse" configuration,
-further recursion will occur. In this case, "only" is treated as "on-demand".
-
---verify::
---no-verify::
+	pushed. If `on-demand` was not able to push all necessary revisions it will
+	also be aborted and exit with non-zero status.
+`only`;;
+	all submodules will be pushed while the superproject is left
+	unpushed.
+`no`;;
+	override the `push.recurseSubmodules` configuration variable when no
+	submodule recursion is required. Similar to using `--no-recurse-submodules`.
+
++
+When using `on-demand` or `only`, if a submodule has a
+`push.recurseSubmodules=(on-demand|only)` or `submodule.recurse` configuration,
+further recursion will occur. In this case, `only` is treated as `on-demand`.
+
+`--verify`::
+`--no-verify`::
 	Toggle the pre-push hook (see linkgit:githooks[5]).  The
-	default is --verify, giving the hook a chance to prevent the
-	push.  With --no-verify, the hook is bypassed completely.
+	default is `--verify`, giving the hook a chance to prevent the
+	push.  With `--no-verify`, the hook is bypassed completely.
 
--4::
---ipv4::
+`-4`::
+`--ipv4`::
 	Use IPv4 addresses only, ignoring IPv6 addresses.
 
--6::
---ipv6::
+`-6`::
+`--ipv6`::
 	Use IPv6 addresses only, ignoring IPv4 addresses.
 
 include::urls-remotes.adoc[]
@@ -427,16 +444,16 @@ representing the status of a single ref. Each line is of the form:
  <flag> <summary> <from> -> <to> (<reason>)
 -------------------------------
 
-If --porcelain is used, then each line of the output is of the form:
+If `--porcelain` is used, then each line of the output is of the form:
 
 -------------------------------
  <flag> \t <from>:<to> \t <summary> (<reason>)
 -------------------------------
 
-The status of up-to-date refs is shown only if --porcelain or --verbose
+The status of up-to-date refs is shown only if `--porcelain` or `--verbose`
 option is used.
 
-flag::
+_<flag>_::
 	A single character indicating the status of the ref:
 (space);; for a successfully pushed fast-forward;
 `+`;; for a successful forced update;
@@ -445,7 +462,7 @@ flag::
 `!`;; for a ref that was rejected or failed to push; and
 `=`;; for a ref that was up to date and did not need pushing.
 
-summary::
+_<summary>_::
 	For a successfully pushed ref, the summary shows the old and new
 	values of the ref in a form suitable for using as an argument to
 	`git log` (this is `<old>..<new>` in most cases, and
@@ -586,7 +603,7 @@ Updating A with the resulting merge commit will fast-forward and your
 push will be accepted.
 
 Alternatively, you can rebase your change between X and B on top of A,
-with "git pull --rebase", and push the result back.  The rebase will
+with `git pull --rebase`, and push the result back.  The rebase will
 create a new commit D that builds the change between X and B on top of
 A.
 
@@ -604,12 +621,12 @@ accepted.
 There is another common situation where you may encounter non-fast-forward
 rejection when you try to push, and it is possible even when you are
 pushing into a repository nobody else pushes into. After you push commit
-A yourself (in the first picture in this section), replace it with "git
-commit --amend" to produce commit B, and you try to push it out, because
+A yourself (in the first picture in this section), replace it with `git
+commit --amend` to produce commit B, and you try to push it out, because you
 forgot that you have pushed A out already. In such a case, and only if
 you are certain that nobody in the meantime fetched your earlier commit A
-(and started building on top of it), you can run "git push --force" to
-overwrite it. In other words, "git push --force" is a method reserved for
+(and started building on top of it), you can run `git push --force` to
+overwrite it. In other words, `git push --force` is a method reserved for
 a case where you do mean to lose history.
 
 
@@ -627,18 +644,18 @@ EXAMPLES
 	variable) if it has the same name as the current branch, and
 	errors out without pushing otherwise.
 +
-The default behavior of this command when no <refspec> is given can be
+The default behavior of this command when no _<refspec>_ is given can be
 configured by setting the `push` option of the remote, or the `push.default`
 configuration variable.
 +
 For example, to default to pushing only the current branch to `origin`
-use `git config remote.origin.push HEAD`.  Any valid <refspec> (like
+use `git config remote.origin.push HEAD`.  Any valid _<refspec>_ (like
 the ones in the examples below) can be configured as the default for
 `git push origin`.
 
 `git push origin :`::
 	Push "matching" branches to `origin`. See
-	<refspec> in the <<OPTIONS,OPTIONS>> section above for a
+	_<refspec>_ in the <<OPTIONS,OPTIONS>> section above for a
 	description of "matching" branches.
 
 `git push origin master`::

From 831e02340b9de46c9ea0a1bbce3894f390f5a45e Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Wed, 19 Nov 2025 08:50:49 +0100
Subject: [PATCH 121/553] path: move `enter_repo()` into "setup.c"

The function `enter_repo()` is used to enter a repository at a given
path. As such, it is much more closely related to setting up a
repository than to handling paths, yet it lives in "path.c" instead of
"setup.c".

Move the function into "setup.c".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
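
Reviewer note, not part of the commit: the comment that moves to
"setup.h" describes a suffix search ("%s/.git", "%s", "%s.git/.git",
"%s.git", in that order) for the non-strict case. The standalone sketch
below illustrates only that probing order. The names
`probe_repo_suffix()` and `dwim_suffix` are hypothetical and do not
exist in Git. It is also deliberately simplified: it accepts any
regular file or directory found via stat(), whereas the real function
requires a directory to pass is_git_directory(), and it does no "~"
expansion, ownership check, gitfile resolution, or chdir().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Suffixes in the order documented for enter_repo(); NULL-terminated. */
static const char *const dwim_suffix[] = {
	"/.git", "", ".git/.git", ".git", NULL,
};

/*
 * Hypothetical helper, for illustration only.
 *
 * Tries each suffix in order and writes the first candidate that exists
 * as a regular file or directory into `out`. Returns the index of the
 * suffix that matched, or -1 if no candidate exists or fits in `out`.
 */
static int probe_repo_suffix(const char *path, char *out, size_t outlen)
{
	size_t len;

	assert(path && out);

	/* Strip trailing slashes, but keep a lone "/" intact. */
	len = strlen(path);
	while (len > 1 && path[len - 1] == '/')
		len--;

	for (int i = 0; dwim_suffix[i]; i++) {
		struct stat st;
		int n = snprintf(out, outlen, "%.*s%s",
				 (int)len, path, dwim_suffix[i]);

		/* Skip candidates that do not fit into the caller's buffer. */
		if (n < 0 || (size_t)n >= outlen)
			continue;

		/* Simplification: existence check only, no repository validation. */
		if (!stat(out, &st) &&
		    (S_ISREG(st.st_mode) || S_ISDIR(st.st_mode)))
			return i;
	}

	return -1;
}
```

A caller would pass the user-supplied path and a buffer of at least
PATH_MAX bytes. On a match, `out` holds the candidate that was found
and the return value tells which suffix was appended.
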
 builtin/receive-pack.c   |   2 +-
 builtin/upload-archive.c |   2 +-
 builtin/upload-pack.c    |   2 +-
 http-backend.c           |   1 +
 path.c                   | 100 ---------------------------------------
 path.h                   |  15 ------
 setup.c                  |  81 +++++++++++++++++++++++++++++++
 setup.h                  |  38 +++++++++++++++
 8 files changed, 123 insertions(+), 118 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index c9288a9c7e382b..79a0fd4756665d 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -34,7 +34,6 @@
 #include "object-file.h"
 #include "object-name.h"
 #include "odb.h"
-#include "path.h"
 #include "protocol.h"
 #include "commit-reach.h"
 #include "server-info.h"
@@ -42,6 +41,7 @@
 #include "trace2.h"
 #include "worktree.h"
 #include "shallow.h"
+#include "setup.h"
 #include "parse-options.h"
 
 static const char * const receive_pack_usage[] = {
diff --git a/builtin/upload-archive.c b/builtin/upload-archive.c
index 97d7c9522f9868..25312bb2a52887 100644
--- a/builtin/upload-archive.c
+++ b/builtin/upload-archive.c
@@ -4,8 +4,8 @@
 #define USE_THE_REPOSITORY_VARIABLE
 #include "builtin.h"
 #include "archive.h"
-#include "path.h"
 #include "pkt-line.h"
+#include "setup.h"
 #include "sideband.h"
 #include "run-command.h"
 #include "strvec.h"
diff --git a/builtin/upload-pack.c b/builtin/upload-pack.c
index c2bbc035ab0c91..30498fafea3a8b 100644
--- a/builtin/upload-pack.c
+++ b/builtin/upload-pack.c
@@ -5,11 +5,11 @@
 #include "gettext.h"
 #include "pkt-line.h"
 #include "parse-options.h"
-#include "path.h"
 #include "protocol.h"
 #include "replace-object.h"
 #include "upload-pack.h"
 #include "serve.h"
+#include "setup.h"
 #include "commit.h"
 #include "environment.h"
 
diff --git a/http-backend.c b/http-backend.c
index 52f0483dd309d7..e9d1ef92bd8dc1 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -16,6 +16,7 @@
 #include "run-command.h"
 #include "string-list.h"
 #include "url.h"
+#include "setup.h"
 #include "strvec.h"
 #include "packfile.h"
 #include "odb.h"
diff --git a/path.c b/path.c
index 7f56eaf9930374..d726537622cda6 100644
--- a/path.c
+++ b/path.c
@@ -738,106 +738,6 @@ char *interpolate_path(const char *path, int real_home)
 	return NULL;
 }
 
-/*
- * First, one directory to try is determined by the following algorithm.
- *
- * (0) If "strict" is given, the path is used as given and no DWIM is
- *     done. Otherwise:
- * (1) "~/path" to mean path under the running user's home directory;
- * (2) "~user/path" to mean path under named user's home directory;
- * (3) "relative/path" to mean cwd relative directory; or
- * (4) "/absolute/path" to mean absolute directory.
- *
- * Unless "strict" is given, we check "%s/.git", "%s", "%s.git/.git", "%s.git"
- * in this order. We select the first one that is a valid git repository, and
- * chdir() to it. If none match, or we fail to chdir, we return NULL.
- *
- * If all goes well, we return the directory we used to chdir() (but
- * before ~user is expanded), avoiding getcwd() resolving symbolic
- * links.  User relative paths are also returned as they are given,
- * except DWIM suffixing.
- */
-const char *enter_repo(const char *path, unsigned flags)
-{
-	static struct strbuf validated_path = STRBUF_INIT;
-	static struct strbuf used_path = STRBUF_INIT;
-
-	if (!path)
-		return NULL;
-
-	if (!(flags & ENTER_REPO_STRICT)) {
-		static const char *suffix[] = {
-			"/.git", "", ".git/.git", ".git", NULL,
-		};
-		const char *gitfile;
-		int len = strlen(path);
-		int i;
-		while ((1 < len) && (path[len-1] == '/'))
-			len--;
-
-		/*
-		 * We can handle arbitrary-sized buffers, but this remains as a
-		 * sanity check on untrusted input.
-		 */
-		if (PATH_MAX <= len)
-			return NULL;
-
-		strbuf_reset(&used_path);
-		strbuf_reset(&validated_path);
-		strbuf_add(&used_path, path, len);
-		strbuf_add(&validated_path, path, len);
-
-		if (used_path.buf[0] == '~') {
-			char *newpath = interpolate_path(used_path.buf, 0);
-			if (!newpath)
-				return NULL;
-			strbuf_attach(&used_path, newpath, strlen(newpath),
-				      strlen(newpath));
-		}
-		for (i = 0; suffix[i]; i++) {
-			struct stat st;
-			size_t baselen = used_path.len;
-			strbuf_addstr(&used_path, suffix[i]);
-			if (!stat(used_path.buf, &st) &&
-			    (S_ISREG(st.st_mode) ||
-			    (S_ISDIR(st.st_mode) && is_git_directory(used_path.buf)))) {
-				strbuf_addstr(&validated_path, suffix[i]);
-				break;
-			}
-			strbuf_setlen(&used_path, baselen);
-		}
-		if (!suffix[i])
-			return NULL;
-		gitfile = read_gitfile(used_path.buf);
-		if (!(flags & ENTER_REPO_ANY_OWNER_OK))
-			die_upon_dubious_ownership(gitfile, NULL, used_path.buf);
-		if (gitfile) {
-			strbuf_reset(&used_path);
-			strbuf_addstr(&used_path, gitfile);
-		}
-		if (chdir(used_path.buf))
-			return NULL;
-		path = validated_path.buf;
-	}
-	else {
-		const char *gitfile = read_gitfile(path);
-		if (!(flags & ENTER_REPO_ANY_OWNER_OK))
-			die_upon_dubious_ownership(gitfile, NULL, path);
-		if (gitfile)
-			path = gitfile;
-		if (chdir(path))
-			return NULL;
-	}
-
-	if (is_git_directory(".")) {
-		set_git_dir(".", 0);
-		check_repository_format(NULL);
-		return path;
-	}
-
-	return NULL;
-}
-
 int calc_shared_perm(struct repository *repo,
 		     int mode)
 {
diff --git a/path.h b/path.h
index e67348f25397cc..0ec95a0b079c90 100644
--- a/path.h
+++ b/path.h
@@ -146,21 +146,6 @@ int adjust_shared_perm(struct repository *repo, const char *path);
 
 char *interpolate_path(const char *path, int real_home);
 
-/* The bits are as follows:
- *
- * - ENTER_REPO_STRICT: callers that require exact paths (as opposed
- *   to allowing known suffixes like ".git", ".git/.git" to be
- *   omitted) can set this bit.
- *
- * - ENTER_REPO_ANY_OWNER_OK: callers that are willing to run without
- *   ownership check can set this bit.
- */
-enum {
-	ENTER_REPO_STRICT = (1<<0),
-	ENTER_REPO_ANY_OWNER_OK = (1<<1),
-};
-
-const char *enter_repo(const char *path, unsigned flags);
 const char *remove_leading_path(const char *in, const char *prefix);
 const char *relative_path(const char *in, const char *prefix, struct strbuf *sb);
 int normalize_path_copy_len(char *dst, const char *src, int *prefix_len);
diff --git a/setup.c b/setup.c
index 7086741e6c2d1f..98c6fd8ee4c02d 100644
--- a/setup.c
+++ b/setup.c
@@ -1703,6 +1703,87 @@ void set_git_dir(const char *path, int make_realpath)
 	strbuf_release(&realpath);
 }
 
+const char *enter_repo(const char *path, unsigned flags)
+{
+	static struct strbuf validated_path = STRBUF_INIT;
+	static struct strbuf used_path = STRBUF_INIT;
+
+	if (!path)
+		return NULL;
+
+	if (!(flags & ENTER_REPO_STRICT)) {
+		static const char *suffix[] = {
+			"/.git", "", ".git/.git", ".git", NULL,
+		};
+		const char *gitfile;
+		int len = strlen(path);
+		int i;
+		while ((1 < len) && (path[len-1] == '/'))
+			len--;
+
+		/*
+		 * We can handle arbitrary-sized buffers, but this remains as a
+		 * sanity check on untrusted input.
+		 */
+		if (PATH_MAX <= len)
+			return NULL;
+
+		strbuf_reset(&used_path);
+		strbuf_reset(&validated_path);
+		strbuf_add(&used_path, path, len);
+		strbuf_add(&validated_path, path, len);
+
+		if (used_path.buf[0] == '~') {
+			char *newpath = interpolate_path(used_path.buf, 0);
+			if (!newpath)
+				return NULL;
+			strbuf_attach(&used_path, newpath, strlen(newpath),
+				      strlen(newpath));
+		}
+		for (i = 0; suffix[i]; i++) {
+			struct stat st;
+			size_t baselen = used_path.len;
+			strbuf_addstr(&used_path, suffix[i]);
+			if (!stat(used_path.buf, &st) &&
+			    (S_ISREG(st.st_mode) ||
+			    (S_ISDIR(st.st_mode) && is_git_directory(used_path.buf)))) {
+				strbuf_addstr(&validated_path, suffix[i]);
+				break;
+			}
+			strbuf_setlen(&used_path, baselen);
+		}
+		if (!suffix[i])
+			return NULL;
+		gitfile = read_gitfile(used_path.buf);
+		if (!(flags & ENTER_REPO_ANY_OWNER_OK))
+			die_upon_dubious_ownership(gitfile, NULL, used_path.buf);
+		if (gitfile) {
+			strbuf_reset(&used_path);
+			strbuf_addstr(&used_path, gitfile);
+		}
+		if (chdir(used_path.buf))
+			return NULL;
+		path = validated_path.buf;
+	}
+	else {
+		const char *gitfile = read_gitfile(path);
+		if (!(flags & ENTER_REPO_ANY_OWNER_OK))
+			die_upon_dubious_ownership(gitfile, NULL, path);
+		if (gitfile)
+			path = gitfile;
+		if (chdir(path))
+			return NULL;
+	}
+
+	if (is_git_directory(".")) {
+		set_git_dir(".", 0);
+		check_repository_format(NULL);
+		return path;
+	}
+
+	return NULL;
+}
+
 static int git_work_tree_initialized;
 
 /*
diff --git a/setup.h b/setup.h
index 8522fa8575da71..bfea199bcd8769 100644
--- a/setup.h
+++ b/setup.h
@@ -97,6 +97,44 @@ static inline int discover_git_directory(struct strbuf *commondir,
 void set_git_dir(const char *path, int make_realpath);
 void set_git_work_tree(const char *tree);
 
+/* Flags that can be passed to `enter_repo()`. */
+enum {
+	/*
+	 * Callers that require exact paths (as opposed to allowing known
+	 * suffixes like ".git", ".git/.git" to be omitted) can set this bit.
+	 */
+	ENTER_REPO_STRICT = (1<<0),
+
+	/*
+	 * Callers that are willing to run without ownership check can set this
+	 * bit.
+	 */
+	ENTER_REPO_ANY_OWNER_OK = (1<<1),
+};
+
+/*
+ * Discover and enter a repository.
+ *
+ * First, one directory to try is determined by the following algorithm.
+ *
+ * (0) If "strict" is given, the path is used as given and no DWIM is
+ *     done. Otherwise:
+ * (1) "~/path" to mean path under the running user's home directory;
+ * (2) "~user/path" to mean path under named user's home directory;
+ * (3) "relative/path" to mean cwd relative directory; or
+ * (4) "/absolute/path" to mean absolute directory.
+ *
+ * Unless "strict" is given, we check "%s/.git", "%s", "%s.git/.git", "%s.git"
+ * in this order. We select the first one that is a valid git repository, and
+ * chdir() to it. If none match, or we fail to chdir, we return NULL.
+ *
+ * If all goes well, we return the directory we used to chdir() (but
+ * before ~user is expanded), avoiding getcwd() resolving symbolic
+ * links.  User relative paths are also returned as they are given,
+ * except DWIM suffixing.
+ */
+const char *enter_repo(const char *path, unsigned flags);
+
 const char *setup_git_directory_gently(int *);
 const char *setup_git_directory(void);
 char *prefix_path(const char *prefix, int len, const char *path);

From 7c188a9e45405ff911b81a5dd9029f4e91fb338e Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Wed, 19 Nov 2025 08:50:50 +0100
Subject: [PATCH 122/553] setup: convert `set_git_dir()` to have file scope

We don't have any external callers of `set_git_dir()` anymore now that
`enter_repo()` has been moved into "setup.c". Remove the declaration and
mark the function as static.

Note that this change requires us to move the implementation around so
that we can avoid adding any new forward declarations.
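
As a generic illustration of the ordering constraint (a sketch with
hypothetical names, not the actual "setup.c" code): a file-scope
function has to be declared before its first use, so defining it above
its callers makes a separate forward declaration unnecessary.

```c
/* Hypothetical helper; defined first so callers below need no prototype. */
static int normalize_len(int len)
{
	return len < 0 ? 0 : len;
}

/*
 * The caller comes after the helper's definition. If the order were
 * reversed, a forward declaration "static int normalize_len(int);"
 * would be required above this function.
 */
int clamp_and_double(int len)
{
	return 2 * normalize_len(len);
}
```

Calling `clamp_and_double()` with a negative length yields 0, because
the helper clamps the value before it is doubled.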

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 setup.c | 80 ++++++++++++++++++++++++++++-----------------------------
 setup.h |  1 -
 2 files changed, 40 insertions(+), 41 deletions(-)

diff --git a/setup.c b/setup.c
index 98c6fd8ee4c02d..8bf52df71663a3 100644
--- a/setup.c
+++ b/setup.c
@@ -1002,6 +1002,46 @@ const char *read_gitfile_gently(const char *path, int *return_error_code)
 	return error_code ? NULL : path;
 }
 
+static void set_git_dir_1(const char *path)
+{
+	xsetenv(GIT_DIR_ENVIRONMENT, path, 1);
+	setup_git_env(path);
+}
+
+static void update_relative_gitdir(const char *name UNUSED,
+				   const char *old_cwd,
+				   const char *new_cwd,
+				   void *data UNUSED)
+{
+	char *path = reparent_relative_path(old_cwd, new_cwd,
+					    repo_get_git_dir(the_repository));
+	struct tmp_objdir *tmp_objdir = tmp_objdir_unapply_primary_odb();
+
+	trace_printf_key(&trace_setup_key,
+			 "setup: move $GIT_DIR to '%s'",
+			 path);
+	set_git_dir_1(path);
+	if (tmp_objdir)
+		tmp_objdir_reapply_primary_odb(tmp_objdir, old_cwd, new_cwd);
+	free(path);
+}
+
+static void set_git_dir(const char *path, int make_realpath)
+{
+	struct strbuf realpath = STRBUF_INIT;
+
+	if (make_realpath) {
+		strbuf_realpath(&realpath, path, 1);
+		path = realpath.buf;
+	}
+
+	set_git_dir_1(path);
+	if (!is_absolute_path(path))
+		chdir_notify_register(NULL, update_relative_gitdir, NULL);
+
+	strbuf_release(&realpath);
+}
+
 static const char *setup_explicit_git_dir(const char *gitdirenv,
 					  struct strbuf *cwd,
 					  struct repository_format *repo_fmt,
@@ -1663,46 +1703,6 @@ void setup_git_env(const char *git_dir)
 		fetch_if_missing = 0;
 }
 
-static void set_git_dir_1(const char *path)
-{
-	xsetenv(GIT_DIR_ENVIRONMENT, path, 1);
-	setup_git_env(path);
-}
-
-static void update_relative_gitdir(const char *name UNUSED,
-				   const char *old_cwd,
-				   const char *new_cwd,
-				   void *data UNUSED)
-{
-	char *path = reparent_relative_path(old_cwd, new_cwd,
-					    repo_get_git_dir(the_repository));
-	struct tmp_objdir *tmp_objdir = tmp_objdir_unapply_primary_odb();
-
-	trace_printf_key(&trace_setup_key,
-			 "setup: move $GIT_DIR to '%s'",
-			 path);
-	set_git_dir_1(path);
-	if (tmp_objdir)
-		tmp_objdir_reapply_primary_odb(tmp_objdir, old_cwd, new_cwd);
-	free(path);
-}
-
-void set_git_dir(const char *path, int make_realpath)
-{
-	struct strbuf realpath = STRBUF_INIT;
-
-	if (make_realpath) {
-		strbuf_realpath(&realpath, path, 1);
-		path = realpath.buf;
-	}
-
-	set_git_dir_1(path);
-	if (!is_absolute_path(path))
-		chdir_notify_register(NULL, update_relative_gitdir, NULL);
-
-	strbuf_release(&realpath);
-}
-
 const char *enter_repo(const char *path, unsigned flags)
 {
 	static struct strbuf validated_path = STRBUF_INIT;
diff --git a/setup.h b/setup.h
index bfea199bcd8769..d55dcc66086308 100644
--- a/setup.h
+++ b/setup.h
@@ -94,7 +94,6 @@ static inline int discover_git_directory(struct strbuf *commondir,
 	return 0;
 }
 
-void set_git_dir(const char *path, int make_realpath);
 void set_git_work_tree(const char *tree);
 
 /* Flags that can be passed to `enter_repo()`. */

From 9aaba579932781c74f67d6cecddaad59f0daaaef Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Wed, 19 Nov 2025 08:50:51 +0100
Subject: [PATCH 123/553] odb: adopt logic to close object databases

The logic to close an object database is currently contained in the
packfile subsystem. That choice is somewhat understandable, as most of
the logic really is to close resources associated with the packfile
store itself. But we also end up handling object sources and commit
graphs, which certainly are not related to packfiles.

Move the function into the object database subsystem and rename it to
`odb_close()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/clone.c  |  2 +-
 builtin/gc.c     |  2 +-
 builtin/repack.c |  2 +-
 midx-write.c     |  2 +-
 odb.c            | 18 +++++++++++++++++-
 odb.h            |  7 +++++++
 packfile.c       | 15 ---------------
 packfile.h       |  1 -
 run-command.c    |  2 +-
 scalar.c         |  2 +-
 10 files changed, 30 insertions(+), 23 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index c990f398ef6f37..b19b302b065467 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1617,7 +1617,7 @@ int cmd_clone(int argc,
 	transport_disconnect(transport);
 
 	if (option_dissociate) {
-		close_object_store(the_repository->objects);
+		odb_close(the_repository->objects);
 		dissociate_from_references();
 	}
 
diff --git a/builtin/gc.c b/builtin/gc.c
index d212cbb9b84781..961fa343c4b180 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1048,7 +1048,7 @@ int cmd_gc(int argc,
 	report_garbage = report_pack_garbage;
 	odb_reprepare(the_repository->objects);
 	if (pack_garbage.nr > 0) {
-		close_object_store(the_repository->objects);
+		odb_close(the_repository->objects);
 		clean_pack_garbage();
 	}
 
diff --git a/builtin/repack.c b/builtin/repack.c
index cfdb4c0920b191..d9012141f699c9 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -488,7 +488,7 @@ int cmd_repack(int argc,
 
 	string_list_sort(&names);
 
-	close_object_store(repo->objects);
+	odb_close(repo->objects);
 
 	/*
 	 * Ok we have prepared all new packfiles.
diff --git a/midx-write.c b/midx-write.c
index c73010df6d3a4f..60497586fdf2f4 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -1459,7 +1459,7 @@ static int write_midx_internal(struct odb_source *source,
 	}
 
 	if (ctx.m || ctx.base_midx)
-		close_object_store(ctx.repo->objects);
+		odb_close(ctx.repo->objects);
 
 	if (commit_lock_file(&lk) < 0)
 		die_errno(_("could not write multi-pack-index"));
diff --git a/odb.c b/odb.c
index 3ec21ef24e16bb..bcefa5cede60b5 100644
--- a/odb.c
+++ b/odb.c
@@ -9,6 +9,7 @@
 #include "khash.h"
 #include "lockfile.h"
 #include "loose.h"
+#include "midx.h"
 #include "object-file-convert.h"
 #include "object-file.h"
 #include "odb.h"
@@ -1044,6 +1045,21 @@ struct object_database *odb_new(struct repository *repo)
 	return o;
 }
 
+void odb_close(struct object_database *o)
+{
+	struct odb_source *source;
+
+	packfile_store_close(o->packfiles);
+
+	for (source = o->sources; source; source = source->next) {
+		if (source->midx)
+			close_midx(source->midx);
+		source->midx = NULL;
+	}
+
+	close_commit_graph(o);
+}
+
 static void odb_free_sources(struct object_database *o)
 {
 	while (o->sources) {
@@ -1076,7 +1092,7 @@ void odb_clear(struct object_database *o)
 		free((char *) o->cached_objects[i].value.buf);
 	FREE_AND_NULL(o->cached_objects);
 
-	close_object_store(o);
+	odb_close(o);
 	packfile_store_free(o->packfiles);
 	o->packfiles = NULL;
 
diff --git a/odb.h b/odb.h
index 9bb28008b1d953..71b4897c82f3a8 100644
--- a/odb.h
+++ b/odb.h
@@ -169,6 +169,13 @@ struct object_database {
 struct object_database *odb_new(struct repository *repo);
 void odb_clear(struct object_database *o);
 
+/*
+ * Close the object database and all of its sources so that any held resources
+ * will be released. The database can still be used after closing it, in which
+ * case these resources may be reallocated.
+ */
+void odb_close(struct object_database *o);
+
 /*
  * Clear caches, reload alternates and then reload object sources so that new
  * objects may become accessible.
diff --git a/packfile.c b/packfile.c
index 40f733dd234900..af71eaf7e34461 100644
--- a/packfile.c
+++ b/packfile.c
@@ -359,21 +359,6 @@ void close_pack(struct packed_git *p)
 	oidset_clear(&p->bad_objects);
 }
 
-void close_object_store(struct object_database *o)
-{
-	struct odb_source *source;
-
-	packfile_store_close(o->packfiles);
-
-	for (source = o->sources; source; source = source->next) {
-		if (source->midx)
-			close_midx(source->midx);
-		source->midx = NULL;
-	}
-
-	close_commit_graph(o);
-}
-
 void unlink_pack_path(const char *pack_name, int force_delete)
 {
 	static const char *exts[] = {".idx", ".pack", ".rev", ".keep", ".bitmap", ".promisor", ".mtimes"};
diff --git a/packfile.h b/packfile.h
index 58fcc88e20224b..d9226a072ac96d 100644
--- a/packfile.h
+++ b/packfile.h
@@ -279,7 +279,6 @@ struct object_database;
 unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
 void close_pack_windows(struct packed_git *);
 void close_pack(struct packed_git *);
-void close_object_store(struct object_database *o);
 void unuse_pack(struct pack_window **);
 void clear_delta_base_cache(void);
 struct packed_git *add_packed_git(struct repository *r, const char *path,
diff --git a/run-command.c b/run-command.c
index ed9575bd6a8cbb..e3e02475ccec50 100644
--- a/run-command.c
+++ b/run-command.c
@@ -743,7 +743,7 @@ int start_command(struct child_process *cmd)
 	fflush(NULL);
 
 	if (cmd->close_object_store)
-		close_object_store(the_repository->objects);
+		odb_close(the_repository->objects);
 
 #ifndef GIT_WINDOWS_NATIVE
 {
diff --git a/scalar.c b/scalar.c
index f7543116272b77..2aeb191cc89b72 100644
--- a/scalar.c
+++ b/scalar.c
@@ -931,7 +931,7 @@ static int cmd_delete(int argc, const char **argv)
 	if (dir_inside_of(cwd, enlistment.buf) >= 0)
 		res = error(_("refusing to delete current working directory"));
 	else {
-		close_object_store(the_repository->objects);
+		odb_close(the_repository->objects);
 		res = delete_enlistment(&enlistment);
 	}
 	strbuf_release(&enlistment);

From f8bdf3127ab7df8a8f3039f41889b35eefe029a3 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Wed, 19 Nov 2025 08:50:52 +0100
Subject: [PATCH 124/553] odb: refactor `odb_clear()` to `odb_free()`

The function `odb_clear()` releases all resources allocated to an object
database and ensures that all fields become zero'd out. Despite its
naming though it doesn't really clear the object database so that it
becomes ready for reuse afterwards again -- the caller would first have
to reinitialize it, and that contradicts the terminology of "clearing"
as we have defined it in our coding guidelines.

There isn't really any reason to have "clearing" semantics, either.
There's only a single caller of `odb_clear()`, and that caller also ends
up freeing the object database structure itself.

Refactor the function to have "freeing" semantics instead, so that the
structure itself is also freed, which allows us to drop some useless
boilerplate to zero out the structure's members.

This refactoring reveals that we're trying to close the commit graph
multiple times: once directly via `free_commit_graph()`, and once via
`odb_close()`. Drop the former call.
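
The difference between the two conventions can be sketched as follows.
This is a minimal illustration that assumes a hypothetical
`struct demo_db`; it is not the real `struct object_database`, and the
function names are made up for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure, standing in for a real database handle. */
struct demo_db {
	char *name;
	int nr;
};

/* "Clear": release owned resources and leave the struct ready for reuse. */
void demo_db_clear(struct demo_db *db)
{
	free(db->name);
	memset(db, 0, sizeof(*db));
}

/* "Free": release owned resources and the struct itself; NULL is a no-op. */
void demo_db_free(struct demo_db *db)
{
	if (!db)
		return;
	free(db->name);
	free(db);
}
```

With "freeing" semantics there is no need to zero out individual
members, because the memory is gone afterwards; the caller only has to
reset its own pointer.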

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 odb.c        | 19 ++++++++-----------
 odb.h        |  4 +++-
 repository.c |  4 ++--
 3 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/odb.c b/odb.c
index bcefa5cede60b5..29cf6496c5e50a 100644
--- a/odb.c
+++ b/odb.c
@@ -1073,30 +1073,27 @@ static void odb_free_sources(struct object_database *o)
 	o->source_by_path = NULL;
 }
 
-void odb_clear(struct object_database *o)
+void odb_free(struct object_database *o)
 {
-	FREE_AND_NULL(o->alternate_db);
+	if (!o)
+		return;
+
+	free(o->alternate_db);
 
 	oidmap_clear(&o->replace_map, 1);
 	pthread_mutex_destroy(&o->replace_mutex);
 
-	free_commit_graph(o->commit_graph);
-	o->commit_graph = NULL;
-	o->commit_graph_attempted = 0;
-
 	odb_free_sources(o);
-	o->sources_tail = NULL;
-	o->loaded_alternates = 0;
 
 	for (size_t i = 0; i < o->cached_object_nr; i++)
 		free((char *) o->cached_objects[i].value.buf);
-	FREE_AND_NULL(o->cached_objects);
+	free(o->cached_objects);
 
 	odb_close(o);
 	packfile_store_free(o->packfiles);
-	o->packfiles = NULL;
-
 	string_list_clear(&o->submodule_source_paths, 0);
+
+	free(o);
 }
 
 void odb_reprepare(struct object_database *o)
diff --git a/odb.h b/odb.h
index 71b4897c82f3a8..77b313b784cad3 100644
--- a/odb.h
+++ b/odb.h
@@ -167,7 +167,9 @@ struct object_database {
 };
 
 struct object_database *odb_new(struct repository *repo);
-void odb_clear(struct object_database *o);
+
+/* Free the object database and release all resources. */
+void odb_free(struct object_database *o);
 
 /*
  * Close the object database and all of its sources so that any held resources
diff --git a/repository.c b/repository.c
index 6aaa7ba00869bf..3c8b3813b00af0 100644
--- a/repository.c
+++ b/repository.c
@@ -382,8 +382,8 @@ void repo_clear(struct repository *repo)
 	FREE_AND_NULL(repo->worktree);
 	FREE_AND_NULL(repo->submodule_prefix);
 
-	odb_clear(repo->objects);
-	FREE_AND_NULL(repo->objects);
+	odb_free(repo->objects);
+	repo->objects = NULL;
 
 	parsed_object_pool_clear(repo->parsed_objects);
 	FREE_AND_NULL(repo->parsed_objects);

From fbf3d0669f830b4492070aa33f57dbf2c43fa4c8 Mon Sep 17 00:00:00 2001
From: Kristoffer Haugsbakk <code@khaugsbakk.name>
Date: Thu, 20 Nov 2025 17:26:49 +0100
Subject: [PATCH 125/553] doc: warn against --committer-date-is-author-date
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This option could create a commit history which violates the assumption
that commits have non-decreasing commit timestamps. Warn against that in
both git-am(1) and git-rebase(1).

This option originated in git-am(1), where it was added in
3f01ad66 (am: Add --committer-date-is-author-date option,
2009-01-22). The commit message doesn’t give us an example
of a use case, but the thread starter does:[1]

    I've a big set of patches in a mbox file: there's sufficient info
    inside for git-am to work.

    Yet, each time I do import these, my sha1sums are changing because of
    different commit dates.

    I'd like to force the commit date to match the info/date from the time
    I received the email (and therefore always get back the right
    sha1sums).

[1]: https://lore.kernel.org/git/46d6db660901221441q60eb90bdge601a7a250c3a247@mail.gmail.com/

So the motivation was to treat git-am(1) as an import command that
creates the same commit IDs.

Putting aside the question of whether you should be using git-am(1) for
importing commits, this approach is problematic:

• you still need to apply the commits to the same base if you want the
  same hashes; and
• you need the same committer.

And if you expect the same committer, why is this person applying the
same patches multiple times with the goal of making *identical* commits?

That was all for git-am(1).

It was added to git-rebase(1) in 570ccad3 (rebase: add options passed to
git-am, 2009-03-18)[2] in order to plug options that could not be sent
on to git-am(1). At this point the utility of the option graduated to
making no sense; a use case for
`git rebase --committer-date-is-author-date` has yet to be found.

Just warn against using this option on both commands and remind the user
to consider whether they really need it.

[2]: See also 7573cec5 (rebase -i: support
     --committer-date-is-author-date, 2020-08-17) for the commit for the
     merge backend

Suggested-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-am.adoc     | 7 +++++++
 Documentation/git-rebase.adoc | 7 +++++++
 2 files changed, 14 insertions(+)

diff --git a/Documentation/git-am.adoc b/Documentation/git-am.adoc
index 221070de481227..264d21a7de7801 100644
--- a/Documentation/git-am.adoc
+++ b/Documentation/git-am.adoc
@@ -161,6 +161,13 @@ Valid <action> for the `--whitespace` option are:
 	commit creation as the committer date. This allows the
 	user to lie about the committer date by using the same
 	value as the author date.
++
+WARNING: The history walking machinery assumes that commits have
+non-decreasing commit timestamps. You should consider if you really need
+to use this option. Then you should only use this option to override the
+committer date when applying commits on top of a base which commit is
+older (in terms of the commit date) than the oldest patch you are
+applying.
 
 --ignore-date::
 	By default the command records the date from the e-mail
diff --git a/Documentation/git-rebase.adoc b/Documentation/git-rebase.adoc
index 956d3048f5a618..0f808c82b2877b 100644
--- a/Documentation/git-rebase.adoc
+++ b/Documentation/git-rebase.adoc
@@ -507,6 +507,13 @@ See also INCOMPATIBLE OPTIONS below.
 	Instead of using the current time as the committer date, use
 	the author date of the commit being rebased as the committer
 	date. This option implies `--force-rebase`.
++
+WARNING: The history walking machinery assumes that commits have
+non-decreasing commit timestamps. You should consider if you really need
+to use this option. Then you should only use this option to override the
+committer date when rebasing commits on top of a base which commit is
+older (in terms of the commit date) than the oldest commit you are
+applying (in terms of the author date).
 
 --ignore-date::
 --reset-author-date::

From 2367c6bcd600882d0ea70d4f654c8cfa5c1f53ac Mon Sep 17 00:00:00 2001
From: Greg Funni <gfunni234@gmail.com>
Date: Tue, 18 Nov 2025 15:41:54 +0000
Subject: [PATCH 126/553] win32: return error if SleepConditionVariableCS fails

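If SleepConditionVariableCS() fails, translate the Windows error code
into a POSIX error and return it instead of unconditionally returning
0. To do so, turn pthread_cond_wait() from a macro into a function.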

Signed-off-by: Greg Funni <gfunni234@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 compat/win32/pthread.c | 7 +++++++
 compat/win32/pthread.h | 3 ++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/compat/win32/pthread.c b/compat/win32/pthread.c
index 58980a529c3eb9..7e93146963ec56 100644
--- a/compat/win32/pthread.c
+++ b/compat/win32/pthread.c
@@ -59,3 +59,10 @@ pthread_t pthread_self(void)
 	t.tid = GetCurrentThreadId();
 	return t;
 }
+
+int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex)
+{
+	if (SleepConditionVariableCS(cond, mutex, INFINITE) == 0)
+		return err_win_to_posix(GetLastError());
+	return 0;
+}
diff --git a/compat/win32/pthread.h b/compat/win32/pthread.h
index e2b5c4f64c9b91..859e1d9021c97c 100644
--- a/compat/win32/pthread.h
+++ b/compat/win32/pthread.h
@@ -36,7 +36,6 @@ typedef int pthread_mutexattr_t;
 
 #define pthread_cond_init(a,b) InitializeConditionVariable((a))
 #define pthread_cond_destroy(a) do {} while (0)
-#define pthread_cond_wait(a,b) return_0(SleepConditionVariableCS((a), (b), INFINITE))
 #define pthread_cond_signal WakeConditionVariable
 #define pthread_cond_broadcast WakeAllConditionVariable
 
@@ -64,6 +63,8 @@ int win32_pthread_join(pthread_t *thread, void **value_ptr);
 #define pthread_equal(t1, t2) ((t1).tid == (t2).tid)
 pthread_t pthread_self(void);
 
+int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);
+
 static inline void NORETURN pthread_exit(void *ret)
 {
 	_endthreadex((unsigned)(uintptr_t)ret);

From 42aa7603aa752850c8ad89cca61e280dab520faf Mon Sep 17 00:00:00 2001
From: Greg Funni <gfunni234@gmail.com>
Date: Thu, 20 Nov 2025 21:43:36 +0000
Subject: [PATCH 127/553] win32: pthread_cond_init should return a value

This value is not checked, but pthread_cond_init() must return a value
to match POSIX.
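
The underlying trick, giving a value to an expression that wraps a
void call, can be sketched with the comma operator. The names below are
hypothetical stand-ins, and the sketch leaves out the `return_0()`
wrapper used in the actual macro.

```c
/* Hypothetical stand-in for an initializer that returns void. */
static int demo_init_calls;
static void demo_void_init(int *slot)
{
	*slot = 42;
	demo_init_calls++;
}

/*
 * The comma operator evaluates the void call for its side effect and
 * then yields 0, so the macro can be used where an int is expected.
 */
#define demo_init(slot) ((demo_void_init((slot))), 0)
```

A caller can then write `int rc = demo_init(&value);` even though the
wrapped function itself returns nothing.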

Signed-off-by: Greg Funni <gfunni234@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 compat/win32/pthread.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/compat/win32/pthread.h b/compat/win32/pthread.h
index e2b5c4f64c9b91..000604cdf69ffc 100644
--- a/compat/win32/pthread.h
+++ b/compat/win32/pthread.h
@@ -34,7 +34,7 @@ typedef int pthread_mutexattr_t;
 
 #define pthread_cond_t CONDITION_VARIABLE
 
-#define pthread_cond_init(a,b) InitializeConditionVariable((a))
+#define pthread_cond_init(a,b) return_0((InitializeConditionVariable((a)), 0))
 #define pthread_cond_destroy(a) do {} while (0)
 #define pthread_cond_wait(a,b) return_0(SleepConditionVariableCS((a), (b), INFINITE))
 #define pthread_cond_signal WakeConditionVariable

From 770afe443784b3ec2c72d68aa509e48064942348 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Thu, 20 Nov 2025 11:32:45 -0800
Subject: [PATCH 128/553] config: mark otherwise unused function as file-scope
 static

git_configset_get_pathname() is only used once inside config.c; we do
not have to expose it as a public function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 config.c | 2 +-
 config.h | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/config.c b/config.c
index 73fc74c8fa1a35..6552e5b0b80202 100644
--- a/config.c
+++ b/config.c
@@ -1954,7 +1954,7 @@ int git_configset_get_maybe_bool(struct config_set *set, const char *key, int *d
 		return 1;
 }
 
-int git_configset_get_pathname(struct config_set *set, const char *key, char **dest)
+static int git_configset_get_pathname(struct config_set *set, const char *key, char **dest)
 {
 	const char *value;
 	if (!git_configset_get_value(set, key, &value, NULL))
diff --git a/config.h b/config.h
index 19c87fc0bc1a2a..ba426a960af9f4 100644
--- a/config.h
+++ b/config.h
@@ -564,7 +564,6 @@ int git_configset_get_ulong(struct config_set *cs, const char *key, unsigned lon
 int git_configset_get_bool(struct config_set *cs, const char *key, int *dest);
 int git_configset_get_bool_or_int(struct config_set *cs, const char *key, int *is_bool, int *dest);
 int git_configset_get_maybe_bool(struct config_set *cs, const char *key, int *dest);
-int git_configset_get_pathname(struct config_set *cs, const char *key, char **dest);
 
 /**
  * Run only the discover part of the repo_config_get_*() functions

From c3cf8e5907adb55380801007ff14f0e3b7cf7152 Mon Sep 17 00:00:00 2001
From: Karthik Nayak <karthik.188@gmail.com>
Date: Fri, 21 Nov 2025 12:13:45 +0100
Subject: [PATCH 129/553] fetch: extract out reference committing logic

The `do_fetch()` function contains the core of the `git-fetch(1)` logic.
Part of this is to fetch and store references. This is done by

  1. Creating a reference transaction (non-atomic mode uses batched
     updates).
  2. Adding individual reference updates to the transaction.
  3. Committing the transaction.
  4. When using batched updates, handling the rejected updates.

The following commit will fix a bug wherein fetching tags with
conflicts was causing other reference updates to fail. Fixing this
requires utilizing this logic in different regions of the function.

In preparation for the follow-up commit, extract the committing and
rejection handling logic into a separate function called
`commit_ref_transaction()`.
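
The shape of such a helper can be sketched in isolation. The example
below uses a hypothetical `struct demo_txn` rather than Git's reference
transaction API; it only illustrates the idea of a helper that commits,
checks for rejected updates in non-atomic mode, and always releases the
transaction while resetting the caller's pointer.

```c
#include <stdlib.h>

/* Hypothetical transaction type; not Git's `struct ref_transaction`. */
struct demo_txn {
	int fail_commit; /* non-zero makes the commit step fail */
	int rejected;    /* number of rejected updates in batched mode */
};

/*
 * Commit the transaction, inspect rejected updates when not atomic,
 * and always free the transaction and reset the caller's pointer so
 * that later cleanup code cannot touch it a second time.
 */
int demo_commit(struct demo_txn **txn, int is_atomic)
{
	int ret = (*txn)->fail_commit ? -1 : 0;

	if (!ret && !is_atomic && (*txn)->rejected > 0)
		ret = 1;

	free(*txn);
	*txn = NULL;
	return ret;
}
```

Passing the transaction by address lets every exit path leave the
caller with a NULL pointer, so the caller's shared cleanup label does
not need to know whether the commit already happened.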

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/fetch.c | 59 +++++++++++++++++++++++++++----------------------
 1 file changed, 33 insertions(+), 26 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index c7ff3480fb1827..f90179040ba34c 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1686,6 +1686,36 @@ static void ref_transaction_rejection_handler(const char *refname,
 	*data->retcode = 1;
 }
 
+/*
+ * Commit the reference transaction. If it isn't an atomic transaction, handle
+ * rejected updates as part of using batched updates.
+ */
+static int commit_ref_transaction(struct ref_transaction **transaction,
+				  bool is_atomic, const char *remote_name,
+				  struct strbuf *err)
+{
+	int retcode = ref_transaction_commit(*transaction, err);
+	if (retcode)
+		goto out;
+
+	if (!is_atomic) {
+		struct ref_rejection_data data = {
+			.conflict_msg_shown = 0,
+			.remote_name = remote_name,
+			.retcode = &retcode,
+		};
+
+		ref_transaction_for_each_rejected_update(*transaction,
+							 ref_transaction_rejection_handler,
+							 &data);
+	}
+
+out:
+	ref_transaction_free(*transaction);
+	*transaction = NULL;
+	return retcode;
+}
+
 static int do_fetch(struct transport *transport,
 		    struct refspec *rs,
 		    const struct fetch_config *config)
@@ -1858,33 +1888,10 @@ static int do_fetch(struct transport *transport,
 	if (retcode)
 		goto cleanup;
 
-	retcode = ref_transaction_commit(transaction, &err);
-	if (retcode) {
-		/*
-		 * Explicitly handle transaction cleanup to avoid
-		 * aborting an already closed transaction.
-		 */
-		ref_transaction_free(transaction);
-		transaction = NULL;
+	retcode = commit_ref_transaction(&transaction, atomic_fetch,
+					 transport->remote->name, &err);
+	if (retcode)
 		goto cleanup;
-	}
-
-	if (!atomic_fetch) {
-		struct ref_rejection_data data = {
-			.retcode = &retcode,
-			.conflict_msg_shown = 0,
-			.remote_name = transport->remote->name,
-		};
-
-		ref_transaction_for_each_rejected_update(transaction,
-							 ref_transaction_rejection_handler,
-							 &data);
-		if (retcode) {
-			ref_transaction_free(transaction);
-			transaction = NULL;
-			goto cleanup;
-		}
-	}
 
 	commit_fetch_head(&fetch_head);
 

From debbc87557487aa9a8ed8a35367d17f8b4081c76 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Fri, 21 Nov 2025 09:13:56 -0800
Subject: [PATCH 130/553] The second batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/RelNotes/2.53.0.adoc | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc
index b0b3dc9b3dcc26..997ae7476c2942 100644
--- a/Documentation/RelNotes/2.53.0.adoc
+++ b/Documentation/RelNotes/2.53.0.adoc
@@ -1,6 +1,13 @@
 Git v2.53 Release Notes
 =======================
 
+UI, Workflows & Features
+------------------------
+
+ * "git maintenance" command learned "is-needed" subcommand to tell if
+   it is necessary to perform various maintenance tasks.
+
+
 Performance, Internal Implementation, Development Support etc.
 --------------------------------------------------------------
 
@@ -10,3 +17,15 @@ Performance, Internal Implementation, Development Support etc.
  * Some ref backend storage can hold not just the object name of an
    annotated tag, but the object name of the object the tag points at.
    The code to handle this information has been streamlined.
+
+ * As "git diff --quiet" only cares about the existence of any
+   changes, disable rename/copy detection to skip more expensive
+   processing whose result will be discarded anyway.
+
+
+Fixes since v2.51
+-----------------
+
+ * Ever since we added whitespace rules for this project, we misspelt
+   an entry, which has been corrected.
+   (merge 358e94dc70 jc/gitattributes-whitespace-no-indent-fix later to maint).

From 6bdda3a3b00fff9a1d64d1bb4732f0c446d7012c Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sun, 23 Nov 2025 19:59:26 +0100
Subject: [PATCH 131/553] streaming: rename `git_istream` into
 `odb_read_stream`

In the following patches we are about to make the `git_istream` more
generic so that it becomes fully controlled by the specific object
source that wants to create it. As part of these refactorings we'll
fully move the structure into the object database subsystem.

Prepare for this change by renaming the structure from `git_istream`
to `odb_read_stream`. This mirrors the `odb_write_stream` structure that
we already have.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 archive-tar.c          |  2 +-
 archive-zip.c          |  2 +-
 builtin/index-pack.c   |  2 +-
 builtin/pack-objects.c |  4 +--
 object-file.c          |  2 +-
 streaming.c            | 62 +++++++++++++++++++++---------------------
 streaming.h            | 12 ++++----
 7 files changed, 43 insertions(+), 43 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index 73b63ddc41bad6..dc1eda09e01e2b 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -129,7 +129,7 @@ static void write_trailer(void)
  */
 static int stream_blocked(struct repository *r, const struct object_id *oid)
 {
-	struct git_istream *st;
+	struct odb_read_stream *st;
 	enum object_type type;
 	unsigned long sz;
 	char buf[BLOCKSIZE];
diff --git a/archive-zip.c b/archive-zip.c
index bea5bdd43dc43e..40a9c93ff95233 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -309,7 +309,7 @@ static int write_zip_entry(struct archiver_args *args,
 	enum zip_method method;
 	unsigned char *out;
 	void *deflated = NULL;
-	struct git_istream *stream = NULL;
+	struct odb_read_stream *stream = NULL;
 	unsigned long flags = 0;
 	int is_binary = -1;
 	const char *path_without_prefix = path + args->baselen;
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 2b78ba7fe4d14a..5f90f12f92d9c4 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -762,7 +762,7 @@ static void find_ref_delta_children(const struct object_id *oid,
 
 struct compare_data {
 	struct object_entry *entry;
-	struct git_istream *st;
+	struct odb_read_stream *st;
 	unsigned char *buf;
 	unsigned long buf_size;
 };
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 69e80b1443a9b7..c693d948e193ed 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -404,7 +404,7 @@ static unsigned long do_compress(void **pptr, unsigned long size)
 	return stream.total_out;
 }
 
-static unsigned long write_large_blob_data(struct git_istream *st, struct hashfile *f,
+static unsigned long write_large_blob_data(struct odb_read_stream *st, struct hashfile *f,
 					   const struct object_id *oid)
 {
 	git_zstream stream;
@@ -513,7 +513,7 @@ static unsigned long write_no_reuse_object(struct hashfile *f, struct object_ent
 	unsigned hdrlen;
 	enum object_type type;
 	void *buf;
-	struct git_istream *st = NULL;
+	struct odb_read_stream *st = NULL;
 	const unsigned hashsz = the_hash_algo->rawsz;
 
 	if (!usable_delta) {
diff --git a/object-file.c b/object-file.c
index 811c569ed36aa4..b62b21a45289fc 100644
--- a/object-file.c
+++ b/object-file.c
@@ -134,7 +134,7 @@ int stream_object_signature(struct repository *r, const struct object_id *oid)
 	struct object_id real_oid;
 	unsigned long size;
 	enum object_type obj_type;
-	struct git_istream *st;
+	struct odb_read_stream *st;
 	struct git_hash_ctx c;
 	char hdr[MAX_HEADER_LEN];
 	int hdrlen;
diff --git a/streaming.c b/streaming.c
index 00ad649ae397f3..1fb4b7c1c002e8 100644
--- a/streaming.c
+++ b/streaming.c
@@ -14,17 +14,17 @@
 #include "replace-object.h"
 #include "packfile.h"
 
-typedef int (*open_istream_fn)(struct git_istream *,
+typedef int (*open_istream_fn)(struct odb_read_stream *,
 			       struct repository *,
 			       const struct object_id *,
 			       enum object_type *);
-typedef int (*close_istream_fn)(struct git_istream *);
-typedef ssize_t (*read_istream_fn)(struct git_istream *, char *, size_t);
+typedef int (*close_istream_fn)(struct odb_read_stream *);
+typedef ssize_t (*read_istream_fn)(struct odb_read_stream *, char *, size_t);
 
 #define FILTER_BUFFER (1024*16)
 
 struct filtered_istream {
-	struct git_istream *upstream;
+	struct odb_read_stream *upstream;
 	struct stream_filter *filter;
 	char ibuf[FILTER_BUFFER];
 	char obuf[FILTER_BUFFER];
@@ -33,7 +33,7 @@ struct filtered_istream {
 	int input_finished;
 };
 
-struct git_istream {
+struct odb_read_stream {
 	open_istream_fn open;
 	close_istream_fn close;
 	read_istream_fn read;
@@ -71,7 +71,7 @@ struct git_istream {
  *
  *****************************************************************/
 
-static void close_deflated_stream(struct git_istream *st)
+static void close_deflated_stream(struct odb_read_stream *st)
 {
 	if (st->z_state == z_used)
 		git_inflate_end(&st->z);
@@ -84,13 +84,13 @@ static void close_deflated_stream(struct git_istream *st)
  *
  *****************************************************************/
 
-static int close_istream_filtered(struct git_istream *st)
+static int close_istream_filtered(struct odb_read_stream *st)
 {
 	free_stream_filter(st->u.filtered.filter);
 	return close_istream(st->u.filtered.upstream);
 }
 
-static ssize_t read_istream_filtered(struct git_istream *st, char *buf,
+static ssize_t read_istream_filtered(struct odb_read_stream *st, char *buf,
 				     size_t sz)
 {
 	struct filtered_istream *fs = &(st->u.filtered);
@@ -150,10 +150,10 @@ static ssize_t read_istream_filtered(struct git_istream *st, char *buf,
 	return filled;
 }
 
-static struct git_istream *attach_stream_filter(struct git_istream *st,
-						struct stream_filter *filter)
+static struct odb_read_stream *attach_stream_filter(struct odb_read_stream *st,
+						    struct stream_filter *filter)
 {
-	struct git_istream *ifs = xmalloc(sizeof(*ifs));
+	struct odb_read_stream *ifs = xmalloc(sizeof(*ifs));
 	struct filtered_istream *fs = &(ifs->u.filtered);
 
 	ifs->close = close_istream_filtered;
@@ -173,7 +173,7 @@ static struct git_istream *attach_stream_filter(struct git_istream *st,
  *
  *****************************************************************/
 
-static ssize_t read_istream_loose(struct git_istream *st, char *buf, size_t sz)
+static ssize_t read_istream_loose(struct odb_read_stream *st, char *buf, size_t sz)
 {
 	size_t total_read = 0;
 
@@ -218,14 +218,14 @@ static ssize_t read_istream_loose(struct git_istream *st, char *buf, size_t sz)
 	return total_read;
 }
 
-static int close_istream_loose(struct git_istream *st)
+static int close_istream_loose(struct odb_read_stream *st)
 {
 	close_deflated_stream(st);
 	munmap(st->u.loose.mapped, st->u.loose.mapsize);
 	return 0;
 }
 
-static int open_istream_loose(struct git_istream *st, struct repository *r,
+static int open_istream_loose(struct odb_read_stream *st, struct repository *r,
 			      const struct object_id *oid,
 			      enum object_type *type)
 {
@@ -277,7 +277,7 @@ static int open_istream_loose(struct git_istream *st, struct repository *r,
  *
  *****************************************************************/
 
-static ssize_t read_istream_pack_non_delta(struct git_istream *st, char *buf,
+static ssize_t read_istream_pack_non_delta(struct odb_read_stream *st, char *buf,
 					   size_t sz)
 {
 	size_t total_read = 0;
@@ -336,13 +336,13 @@ static ssize_t read_istream_pack_non_delta(struct git_istream *st, char *buf,
 	return total_read;
 }
 
-static int close_istream_pack_non_delta(struct git_istream *st)
+static int close_istream_pack_non_delta(struct odb_read_stream *st)
 {
 	close_deflated_stream(st);
 	return 0;
 }
 
-static int open_istream_pack_non_delta(struct git_istream *st,
+static int open_istream_pack_non_delta(struct odb_read_stream *st,
 				       struct repository *r UNUSED,
 				       const struct object_id *oid UNUSED,
 				       enum object_type *type UNUSED)
@@ -380,13 +380,13 @@ static int open_istream_pack_non_delta(struct git_istream *st,
  *
  *****************************************************************/
 
-static int close_istream_incore(struct git_istream *st)
+static int close_istream_incore(struct odb_read_stream *st)
 {
 	free(st->u.incore.buf);
 	return 0;
 }
 
-static ssize_t read_istream_incore(struct git_istream *st, char *buf, size_t sz)
+static ssize_t read_istream_incore(struct odb_read_stream *st, char *buf, size_t sz)
 {
 	size_t read_size = sz;
 	size_t remainder = st->size - st->u.incore.read_ptr;
@@ -400,7 +400,7 @@ static ssize_t read_istream_incore(struct git_istream *st, char *buf, size_t sz)
 	return read_size;
 }
 
-static int open_istream_incore(struct git_istream *st, struct repository *r,
+static int open_istream_incore(struct odb_read_stream *st, struct repository *r,
 			       const struct object_id *oid, enum object_type *type)
 {
 	struct object_info oi = OBJECT_INFO_INIT;
@@ -420,7 +420,7 @@ static int open_istream_incore(struct git_istream *st, struct repository *r,
  * static helpers variables and functions for users of streaming interface
  *****************************************************************************/
 
-static int istream_source(struct git_istream *st,
+static int istream_source(struct odb_read_stream *st,
 			  struct repository *r,
 			  const struct object_id *oid,
 			  enum object_type *type)
@@ -458,25 +458,25 @@ static int istream_source(struct git_istream *st,
  * Users of streaming interface
  ****************************************************************/
 
-int close_istream(struct git_istream *st)
+int close_istream(struct odb_read_stream *st)
 {
 	int r = st->close(st);
 	free(st);
 	return r;
 }
 
-ssize_t read_istream(struct git_istream *st, void *buf, size_t sz)
+ssize_t read_istream(struct odb_read_stream *st, void *buf, size_t sz)
 {
 	return st->read(st, buf, sz);
 }
 
-struct git_istream *open_istream(struct repository *r,
-				 const struct object_id *oid,
-				 enum object_type *type,
-				 unsigned long *size,
-				 struct stream_filter *filter)
+struct odb_read_stream *open_istream(struct repository *r,
+				     const struct object_id *oid,
+				     enum object_type *type,
+				     unsigned long *size,
+				     struct stream_filter *filter)
 {
-	struct git_istream *st = xmalloc(sizeof(*st));
+	struct odb_read_stream *st = xmalloc(sizeof(*st));
 	const struct object_id *real = lookup_replace_object(r, oid);
 	int ret = istream_source(st, r, real, type);
 
@@ -493,7 +493,7 @@ struct git_istream *open_istream(struct repository *r,
 	}
 	if (filter) {
 		/* Add "&& !is_null_stream_filter(filter)" for performance */
-		struct git_istream *nst = attach_stream_filter(st, filter);
+		struct odb_read_stream *nst = attach_stream_filter(st, filter);
 		if (!nst) {
 			close_istream(st);
 			return NULL;
@@ -508,7 +508,7 @@ struct git_istream *open_istream(struct repository *r,
 int stream_blob_to_fd(int fd, const struct object_id *oid, struct stream_filter *filter,
 		      int can_seek)
 {
-	struct git_istream *st;
+	struct odb_read_stream *st;
 	enum object_type type;
 	unsigned long sz;
 	ssize_t kept = 0;
diff --git a/streaming.h b/streaming.h
index bd27f59e5764ae..f5ff5d7ac9a573 100644
--- a/streaming.h
+++ b/streaming.h
@@ -7,14 +7,14 @@
 #include "object.h"
 
 /* opaque */
-struct git_istream;
+struct odb_read_stream;
 struct stream_filter;
 
-struct git_istream *open_istream(struct repository *, const struct object_id *,
-				 enum object_type *, unsigned long *,
-				 struct stream_filter *);
-int close_istream(struct git_istream *);
-ssize_t read_istream(struct git_istream *, void *, size_t);
+struct odb_read_stream *open_istream(struct repository *, const struct object_id *,
+				     enum object_type *, unsigned long *,
+				     struct stream_filter *);
+int close_istream(struct odb_read_stream *);
+ssize_t read_istream(struct odb_read_stream *, void *, size_t);
 
 int stream_blob_to_fd(int fd, const struct object_id *, struct stream_filter *, int can_seek);
 

From 70c8b5f5453b9f128a72fad4398acfb9e7d869c4 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sun, 23 Nov 2025 19:59:27 +0100
Subject: [PATCH 132/553] streaming: drop the `open()` callback function

When creating a read stream we first populate the structure with the
open callback function and then subsequently call the function. This
layout is somewhat weird though:

  - The structure needs to be allocated and partially populated with the
    open function before we can properly initialize it.

  - We only ever call the `open()` callback function right after having
    populated the `struct odb_read_stream::open` member, and never
    again afterwards. So it is somewhat pointless to store the callback
    in the first place.

The first point in particular creates a problem for us. In subsequent
commits we'll want to fully move construction of the read stream into
the respective object sources. E.g., the loose object source will be the
one that is responsible for creating the structure. But if we first need
to create the structure so that we can call the source-specific
callback, we cannot fully handle creation of the structure in the source
itself.

We could of course work around that and have the loose object source
create the structure and populate only its `open()` callback. But this
doesn't really buy us anything due to the second bullet point above.

Instead, drop the callback entirely and refactor `istream_source()` so
that we open the streams immediately. This unblocks a subsequent step,
where we'll also start to allocate the structure in the source-specific
logic.
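
The resulting control flow can be sketched in isolation as follows.
This is a minimal illustration and not the code in "streaming.c": all
`demo_*` names are hypothetical, and the `fail` parameter only exists
to simulate a backend that cannot open the object. Each case opens its
stream right away, and any failure falls through to the in-core opener.

```c
enum demo_whence { DEMO_LOOSE, DEMO_PACKED, DEMO_OTHER };

enum demo_backend {
	DEMO_BACKEND_NONE,
	DEMO_BACKEND_LOOSE,
	DEMO_BACKEND_PACK,
	DEMO_BACKEND_INCORE,
};

/* Hypothetical stream that only records which opener initialized it. */
struct demo_stream {
	enum demo_backend backend;
};

static int demo_open_loose(struct demo_stream *st, int fail)
{
	if (fail)
		return -1;
	st->backend = DEMO_BACKEND_LOOSE;
	return 0;
}

static int demo_open_pack(struct demo_stream *st, int fail)
{
	if (fail)
		return -1;
	st->backend = DEMO_BACKEND_PACK;
	return 0;
}

static int demo_open_incore(struct demo_stream *st)
{
	st->backend = DEMO_BACKEND_INCORE;
	return 0;
}

/*
 * No callback is stored: the specialized opener is called directly, and
 * a failure breaks out of the switch to reach the generic fallback.
 */
static int demo_source(struct demo_stream *st, enum demo_whence whence,
		       int fail)
{
	switch (whence) {
	case DEMO_LOOSE:
		if (demo_open_loose(st, fail) < 0)
			break;
		return 0;
	case DEMO_PACKED:
		if (demo_open_pack(st, fail) < 0)
			break;
		return 0;
	default:
		break;
	}

	return demo_open_incore(st);
}
```

With this shape the caller no longer needs a second "open" step, and
the fallback to the in-core stream lives in exactly one place.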

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 streaming.c | 37 +++++++++++++++----------------------
 1 file changed, 15 insertions(+), 22 deletions(-)

diff --git a/streaming.c b/streaming.c
index 1fb4b7c1c002e8..1bb3f393b87519 100644
--- a/streaming.c
+++ b/streaming.c
@@ -14,10 +14,6 @@
 #include "replace-object.h"
 #include "packfile.h"
 
-typedef int (*open_istream_fn)(struct odb_read_stream *,
-			       struct repository *,
-			       const struct object_id *,
-			       enum object_type *);
 typedef int (*close_istream_fn)(struct odb_read_stream *);
 typedef ssize_t (*read_istream_fn)(struct odb_read_stream *, char *, size_t);
 
@@ -34,7 +30,6 @@ struct filtered_istream {
 };
 
 struct odb_read_stream {
-	open_istream_fn open;
 	close_istream_fn close;
 	read_istream_fn read;
 
@@ -437,21 +432,25 @@ static int istream_source(struct odb_read_stream *st,
 
 	switch (oi.whence) {
 	case OI_LOOSE:
-		st->open = open_istream_loose;
+		if (open_istream_loose(st, r, oid, type) < 0)
+			break;
 		return 0;
 	case OI_PACKED:
-		if (!oi.u.packed.is_delta &&
-		    repo_settings_get_big_file_threshold(the_repository) < size) {
-			st->u.in_pack.pack = oi.u.packed.pack;
-			st->u.in_pack.pos = oi.u.packed.offset;
-			st->open = open_istream_pack_non_delta;
-			return 0;
-		}
-		/* fallthru */
-	default:
-		st->open = open_istream_incore;
+		if (oi.u.packed.is_delta ||
+		    repo_settings_get_big_file_threshold(the_repository) >= size)
+			break;
+
+		st->u.in_pack.pack = oi.u.packed.pack;
+		st->u.in_pack.pos = oi.u.packed.offset;
+		if (open_istream_pack_non_delta(st, r, oid, type) < 0)
+			break;
+
 		return 0;
+	default:
+		break;
 	}
+
+	return open_istream_incore(st, r, oid, type);
 }
 
 /****************************************************************
@@ -485,12 +484,6 @@ struct odb_read_stream *open_istream(struct repository *r,
 		return NULL;
 	}
 
-	if (st->open(st, r, real, type)) {
-		if (open_istream_incore(st, r, real, type)) {
-			free(st);
-			return NULL;
-		}
-	}
 	if (filter) {
 		/* Add "&& !is_null_stream_filter(filter)" for performance */
 		struct odb_read_stream *nst = attach_stream_filter(st, filter);

From 3f64deabdf0a2a9664acec61698affc449e07496 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sun, 23 Nov 2025 19:59:28 +0100
Subject: [PATCH 133/553] streaming: propagate final object type via the stream

When opening the read stream for a specific object the caller is also
expected to pass in a pointer to the object type. This type is passed
down via multiple levels and will eventually be populated with the type
of the looked-up object.

The way we propagate the pointer down is somewhat non-obvious, though.
`istream_source()` expects the pointer and populates it via
`odb_read_object_info_extended()`, but we also pass it down even further
into the format-specific callbacks, which perform another lookup. This
is quite confusing overall.

Refactor the code so that the responsibility to populate the object type
rests solely with the format-specific callbacks. This will allow us to
drop the call to `odb_read_object_info_extended()` in `istream_source()`
entirely in a subsequent patch.

Furthermore, instead of propagating the type via an out-pointer, we now
propagate the type via a new field in the object stream. It already has
a `size` field, so it's only natural to have a second field that
contains the object type.
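
A minimal sketch of that arrangement, under the assumption of a single
backend and with purely hypothetical `demo_*` names and values: the
backend-specific opener records type and size in the stream, and only
the public entry point copies them out to the caller.

```c
enum demo_type { DEMO_TYPE_BAD = -1, DEMO_TYPE_BLOB = 3 };

struct demo_stream {
	enum demo_type type;
	unsigned long size;
};

/* Backend-specific opener: the only place that determines the type. */
static int demo_backend_open(struct demo_stream *st)
{
	st->type = DEMO_TYPE_BLOB; /* illustrative constant */
	st->size = 42;             /* illustrative constant */
	return 0;
}

/* Public entry point: copies the results out of the stream once. */
static int demo_open(struct demo_stream *st, enum demo_type *type,
		     unsigned long *size)
{
	if (demo_backend_open(st) < 0)
		return -1;
	*size = st->size;
	*type = st->type;
	return 0;
}
```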

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 streaming.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/streaming.c b/streaming.c
index 1bb3f393b87519..665624ddc0494e 100644
--- a/streaming.c
+++ b/streaming.c
@@ -33,6 +33,7 @@ struct odb_read_stream {
 	close_istream_fn close;
 	read_istream_fn read;
 
+	enum object_type type;
 	unsigned long size; /* inflated size of full object */
 	git_zstream z;
 	enum { z_unused, z_used, z_done, z_error } z_state;
@@ -159,6 +160,7 @@ static struct odb_read_stream *attach_stream_filter(struct odb_read_stream *st,
 	fs->o_end = fs->o_ptr = 0;
 	fs->input_finished = 0;
 	ifs->size = -1; /* unknown */
+	ifs->type = st->type;
 	return ifs;
 }
 
@@ -221,14 +223,13 @@ static int close_istream_loose(struct odb_read_stream *st)
 }
 
 static int open_istream_loose(struct odb_read_stream *st, struct repository *r,
-			      const struct object_id *oid,
-			      enum object_type *type)
+			      const struct object_id *oid)
 {
 	struct object_info oi = OBJECT_INFO_INIT;
 	struct odb_source *source;
 
 	oi.sizep = &st->size;
-	oi.typep = type;
+	oi.typep = &st->type;
 
 	odb_prepare_alternates(r->objects);
 	for (source = r->objects->sources; source; source = source->next) {
@@ -249,7 +250,7 @@ static int open_istream_loose(struct odb_read_stream *st, struct repository *r,
 	case ULHR_TOO_LONG:
 		goto error;
 	}
-	if (parse_loose_header(st->u.loose.hdr, &oi) < 0 || *type < 0)
+	if (parse_loose_header(st->u.loose.hdr, &oi) < 0 || st->type < 0)
 		goto error;
 
 	st->u.loose.hdr_used = strlen(st->u.loose.hdr) + 1;
@@ -339,8 +340,7 @@ static int close_istream_pack_non_delta(struct odb_read_stream *st)
 
 static int open_istream_pack_non_delta(struct odb_read_stream *st,
 				       struct repository *r UNUSED,
-				       const struct object_id *oid UNUSED,
-				       enum object_type *type UNUSED)
+				       const struct object_id *oid UNUSED)
 {
 	struct pack_window *window;
 	enum object_type in_pack_type;
@@ -361,6 +361,7 @@ static int open_istream_pack_non_delta(struct odb_read_stream *st,
 	case OBJ_TAG:
 		break;
 	}
+	st->type = in_pack_type;
 	st->z_state = z_unused;
 	st->close = close_istream_pack_non_delta;
 	st->read = read_istream_pack_non_delta;
@@ -396,7 +397,7 @@ static ssize_t read_istream_incore(struct odb_read_stream *st, char *buf, size_t
 }
 
 static int open_istream_incore(struct odb_read_stream *st, struct repository *r,
-			       const struct object_id *oid, enum object_type *type)
+			       const struct object_id *oid)
 {
 	struct object_info oi = OBJECT_INFO_INIT;
 
@@ -404,7 +405,7 @@ static int open_istream_incore(struct odb_read_stream *st, struct repository *r,
 	st->close = close_istream_incore;
 	st->read = read_istream_incore;
 
-	oi.typep = type;
+	oi.typep = &st->type;
 	oi.sizep = &st->size;
 	oi.contentp = (void **)&st->u.incore.buf;
 	return odb_read_object_info_extended(r->objects, oid, &oi,
@@ -417,14 +418,12 @@ static int open_istream_incore(struct odb_read_stream *st, struct repository *r,
 
 static int istream_source(struct odb_read_stream *st,
 			  struct repository *r,
-			  const struct object_id *oid,
-			  enum object_type *type)
+			  const struct object_id *oid)
 {
 	unsigned long size;
 	int status;
 	struct object_info oi = OBJECT_INFO_INIT;
 
-	oi.typep = type;
 	oi.sizep = &size;
 	status = odb_read_object_info_extended(r->objects, oid, &oi, 0);
 	if (status < 0)
@@ -432,7 +431,7 @@ static int istream_source(struct odb_read_stream *st,
 
 	switch (oi.whence) {
 	case OI_LOOSE:
-		if (open_istream_loose(st, r, oid, type) < 0)
+		if (open_istream_loose(st, r, oid) < 0)
 			break;
 		return 0;
 	case OI_PACKED:
@@ -442,7 +441,7 @@ static int istream_source(struct odb_read_stream *st,
 
 		st->u.in_pack.pack = oi.u.packed.pack;
 		st->u.in_pack.pos = oi.u.packed.offset;
-		if (open_istream_pack_non_delta(st, r, oid, type) < 0)
+		if (open_istream_pack_non_delta(st, r, oid) < 0)
 			break;
 
 		return 0;
@@ -450,7 +449,7 @@ static int istream_source(struct odb_read_stream *st,
 		break;
 	}
 
-	return open_istream_incore(st, r, oid, type);
+	return open_istream_incore(st, r, oid);
 }
 
 /****************************************************************
@@ -477,7 +476,7 @@ struct odb_read_stream *open_istream(struct repository *r,
 {
 	struct odb_read_stream *st = xmalloc(sizeof(*st));
 	const struct object_id *real = lookup_replace_object(r, oid);
-	int ret = istream_source(st, r, real, type);
+	int ret = istream_source(st, r, real);
 
 	if (ret) {
 		free(st);
@@ -495,6 +494,7 @@ struct odb_read_stream *open_istream(struct repository *r,
 	}
 
 	*size = st->size;
+	*type = st->type;
 	return st;
 }
 

From 3c7722dd4d376e0fce4c48f723fe8b69af785998 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sun, 23 Nov 2025 19:59:29 +0100
Subject: [PATCH 134/553] streaming: explicitly pass packfile info when
 streaming a packed object

When streaming a packed object we first populate the stream with
information about the pack that contains the object before calling
`open_istream_pack_non_delta()`. This is done because we have already
looked up both the pack and the object's offset, so it would be a waste
of time to look up this information again.

But the way this is done makes for a somewhat awkward calling interface,
as the caller needs to know which fields of the stream the function
expects to be populated before it is called.

Refactor the code so that we instead explicitly pass the packfile info
into `open_istream_pack_non_delta()`. This makes the calling convention
explicit, but more importantly it allows us to refactor the function in
a subsequent patch so that it becomes responsible for allocating the
stream itself.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 streaming.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/streaming.c b/streaming.c
index 665624ddc0494e..bf277daadd48c2 100644
--- a/streaming.c
+++ b/streaming.c
@@ -340,16 +340,18 @@ static int close_istream_pack_non_delta(struct odb_read_stream *st)
 
 static int open_istream_pack_non_delta(struct odb_read_stream *st,
 				       struct repository *r UNUSED,
-				       const struct object_id *oid UNUSED)
+				       const struct object_id *oid UNUSED,
+				       struct packed_git *pack,
+				       off_t offset)
 {
 	struct pack_window *window;
 	enum object_type in_pack_type;
 
 	window = NULL;
 
-	in_pack_type = unpack_object_header(st->u.in_pack.pack,
+	in_pack_type = unpack_object_header(pack,
 					    &window,
-					    &st->u.in_pack.pos,
+					    &offset,
 					    &st->size);
 	unuse_pack(&window);
 	switch (in_pack_type) {
@@ -365,6 +367,8 @@ static int open_istream_pack_non_delta(struct odb_read_stream *st,
 	st->z_state = z_unused;
 	st->close = close_istream_pack_non_delta;
 	st->read = read_istream_pack_non_delta;
+	st->u.in_pack.pack = pack;
+	st->u.in_pack.pos = offset;
 
 	return 0;
 }
@@ -436,14 +440,10 @@ static int istream_source(struct odb_read_stream *st,
 		return 0;
 	case OI_PACKED:
 		if (oi.u.packed.is_delta ||
-		    repo_settings_get_big_file_threshold(the_repository) >= size)
+		    repo_settings_get_big_file_threshold(the_repository) >= size ||
+		    open_istream_pack_non_delta(st, r, oid, oi.u.packed.pack,
+						oi.u.packed.offset) < 0)
 			break;
-
-		st->u.in_pack.pack = oi.u.packed.pack;
-		st->u.in_pack.pos = oi.u.packed.offset;
-		if (open_istream_pack_non_delta(st, r, oid) < 0)
-			break;
-
 		return 0;
 	default:
 		break;

From 595296e124f5e8a67c4669fcaeb1b28e71c2d751 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sun, 23 Nov 2025 19:59:30 +0100
Subject: [PATCH 135/553] streaming: allocate stream inside the
 backend-specific logic

When creating a new stream we first allocate it and then call into
backend-specific logic to populate the stream. This design requires that
the stream itself contains a `union` with backend-specific members that
then ultimately get populated by the backend-specific logic.

This works, but it's awkward in the context of pluggable object
databases. Each backend will need its own member in that union, and as
the structure itself is completely opaque (it's only defined in
"streaming.c") it also has the consequence that we must have the logic
that is specific to backends in "streaming.c".

Ideally though, the infrastructure would be reversed: we have a generic
`struct odb_read_stream` and some helper functions in "streaming.c",
whereas the backend-specific logic sits in the backend's subsystem
itself.

This can be realized by using a design that is similar to how we handle
reference databases: instead of having a union of members, we have
backend-specific structures with a `struct odb_read_stream base` as
their first member. The backends would thus hand out the pointer to the
base, but internally they know to cast back to the backend-specific
type.

This means though that we need to allocate different structures
depending on the backend. To prepare for this, move allocation of the
structure into the backend-specific functions that open a new stream.
Subsequent commits will then create those new backend-specific structs.
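
The target design can be sketched as follows. This is an illustration
only: the `demo_*` names and the in-memory backend are hypothetical
and do not exist in Git. The backend embeds the generic stream as its
first member, allocates its own structure, and hands out a pointer to
the base; its callbacks cast back to the backend-specific type.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Generic stream: only what every backend shares. */
struct demo_stream {
	int (*close)(struct demo_stream *);
	size_t (*read)(struct demo_stream *, char *, size_t);
	size_t size;
};

/* Hypothetical backend stream; `base` must remain the first member. */
struct demo_mem_stream {
	struct demo_stream base;
	const char *buf; /* borrowed, not owned */
	size_t pos;
};

static size_t demo_mem_read(struct demo_stream *_st, char *buf, size_t sz)
{
	/* Valid because `base` is the first member of the backend struct. */
	struct demo_mem_stream *st = (struct demo_mem_stream *)_st;
	size_t remainder = st->base.size - st->pos;

	if (sz > remainder)
		sz = remainder;
	memcpy(buf, st->buf + st->pos, sz);
	st->pos += sz;
	return sz;
}

static int demo_mem_close(struct demo_stream *st)
{
	(void)st; /* nothing backend-specific to release */
	return 0;
}

/* The backend allocates its own type and returns the embedded base. */
static int demo_mem_open(struct demo_stream **out, const char *data)
{
	struct demo_mem_stream *st = calloc(1, sizeof(*st));

	if (!st)
		return -1;
	st->base.close = demo_mem_close;
	st->base.read = demo_mem_read;
	st->base.size = strlen(data);
	st->buf = data;
	*out = &st->base;
	return 0;
}

/* Generic helper: runs the backend's close() and frees the allocation. */
static int demo_close(struct demo_stream *st)
{
	int r = st->close(st);
	free(st);
	return r;
}
```

Because the base sits at offset zero, the pointer handed out is also
the pointer returned by the allocator, so the generic close helper can
free it without knowing which backend created the stream.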

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 streaming.c | 103 +++++++++++++++++++++++++++++++++-------------------
 1 file changed, 65 insertions(+), 38 deletions(-)

diff --git a/streaming.c b/streaming.c
index bf277daadd48c2..a2c2d887387c57 100644
--- a/streaming.c
+++ b/streaming.c
@@ -222,27 +222,34 @@ static int close_istream_loose(struct odb_read_stream *st)
 	return 0;
 }
 
-static int open_istream_loose(struct odb_read_stream *st, struct repository *r,
+static int open_istream_loose(struct odb_read_stream **out,
+			      struct repository *r,
 			      const struct object_id *oid)
 {
 	struct object_info oi = OBJECT_INFO_INIT;
+	struct odb_read_stream *st;
 	struct odb_source *source;
-
-	oi.sizep = &st->size;
-	oi.typep = &st->type;
+	unsigned long mapsize;
+	void *mapped;
 
 	odb_prepare_alternates(r->objects);
 	for (source = r->objects->sources; source; source = source->next) {
-		st->u.loose.mapped = odb_source_loose_map_object(source, oid,
-								 &st->u.loose.mapsize);
-		if (st->u.loose.mapped)
+		mapped = odb_source_loose_map_object(source, oid, &mapsize);
+		if (mapped)
 			break;
 	}
-	if (!st->u.loose.mapped)
+	if (!mapped)
 		return -1;
 
-	switch (unpack_loose_header(&st->z, st->u.loose.mapped,
-				    st->u.loose.mapsize, st->u.loose.hdr,
+	/*
+	 * Note: we must allocate this structure early even though we may still
+	 * fail. This is because we need to initialize the zlib stream, and it
+	 * is not possible to copy the stream around after the fact because it
+	 * has self-referencing pointers.
+	 */
+	CALLOC_ARRAY(st, 1);
+
+	switch (unpack_loose_header(&st->z, mapped, mapsize, st->u.loose.hdr,
 				    sizeof(st->u.loose.hdr))) {
 	case ULHR_OK:
 		break;
@@ -250,19 +257,28 @@ static int open_istream_loose(struct odb_read_stream *st, struct repository *r,
 	case ULHR_TOO_LONG:
 		goto error;
 	}
+
+	oi.sizep = &st->size;
+	oi.typep = &st->type;
+
 	if (parse_loose_header(st->u.loose.hdr, &oi) < 0 || st->type < 0)
 		goto error;
 
+	st->u.loose.mapped = mapped;
+	st->u.loose.mapsize = mapsize;
 	st->u.loose.hdr_used = strlen(st->u.loose.hdr) + 1;
 	st->u.loose.hdr_avail = st->z.total_out;
 	st->z_state = z_used;
 	st->close = close_istream_loose;
 	st->read = read_istream_loose;
 
+	*out = st;
+
 	return 0;
 error:
 	git_inflate_end(&st->z);
 	munmap(st->u.loose.mapped, st->u.loose.mapsize);
+	free(st);
 	return -1;
 }
 
@@ -338,12 +354,16 @@ static int close_istream_pack_non_delta(struct odb_read_stream *st)
 	return 0;
 }
 
-static int open_istream_pack_non_delta(struct odb_read_stream *st,
+static int open_istream_pack_non_delta(struct odb_read_stream **out,
 				       struct repository *r UNUSED,
 				       const struct object_id *oid UNUSED,
 				       struct packed_git *pack,
 				       off_t offset)
 {
+	struct odb_read_stream stream = {
+		.close = close_istream_pack_non_delta,
+		.read = read_istream_pack_non_delta,
+	};
 	struct pack_window *window;
 	enum object_type in_pack_type;
 
@@ -352,7 +372,7 @@ static int open_istream_pack_non_delta(struct odb_read_stream *st,
 	in_pack_type = unpack_object_header(pack,
 					    &window,
 					    &offset,
-					    &st->size);
+					    &stream.size);
 	unuse_pack(&window);
 	switch (in_pack_type) {
 	default:
@@ -363,12 +383,13 @@ static int open_istream_pack_non_delta(struct odb_read_stream *st,
 	case OBJ_TAG:
 		break;
 	}
-	st->type = in_pack_type;
-	st->z_state = z_unused;
-	st->close = close_istream_pack_non_delta;
-	st->read = read_istream_pack_non_delta;
-	st->u.in_pack.pack = pack;
-	st->u.in_pack.pos = offset;
+	stream.type = in_pack_type;
+	stream.z_state = z_unused;
+	stream.u.in_pack.pack = pack;
+	stream.u.in_pack.pos = offset;
+
+	CALLOC_ARRAY(*out, 1);
+	**out = stream;
 
 	return 0;
 }
@@ -400,27 +421,35 @@ static ssize_t read_istream_incore(struct odb_read_stream *st, char *buf, size_t
 	return read_size;
 }
 
-static int open_istream_incore(struct odb_read_stream *st, struct repository *r,
+static int open_istream_incore(struct odb_read_stream **out,
+			       struct repository *r,
 			       const struct object_id *oid)
 {
 	struct object_info oi = OBJECT_INFO_INIT;
-
-	st->u.incore.read_ptr = 0;
-	st->close = close_istream_incore;
-	st->read = read_istream_incore;
-
-	oi.typep = &st->type;
-	oi.sizep = &st->size;
-	oi.contentp = (void **)&st->u.incore.buf;
-	return odb_read_object_info_extended(r->objects, oid, &oi,
-					     OBJECT_INFO_DIE_IF_CORRUPT);
+	struct odb_read_stream stream = {
+		.close = close_istream_incore,
+		.read = read_istream_incore,
+	};
+	int ret;
+
+	oi.typep = &stream.type;
+	oi.sizep = &stream.size;
+	oi.contentp = (void **)&stream.u.incore.buf;
+	ret = odb_read_object_info_extended(r->objects, oid, &oi,
+					    OBJECT_INFO_DIE_IF_CORRUPT);
+	if (ret)
+		return ret;
+
+	CALLOC_ARRAY(*out, 1);
+	**out = stream;
+	return 0;
 }
 
 /*****************************************************************************
  * static helpers variables and functions for users of streaming interface
  *****************************************************************************/
 
-static int istream_source(struct odb_read_stream *st,
+static int istream_source(struct odb_read_stream **out,
 			  struct repository *r,
 			  const struct object_id *oid)
 {
@@ -435,13 +464,13 @@ static int istream_source(struct odb_read_stream *st,
 
 	switch (oi.whence) {
 	case OI_LOOSE:
-		if (open_istream_loose(st, r, oid) < 0)
+		if (open_istream_loose(out, r, oid) < 0)
 			break;
 		return 0;
 	case OI_PACKED:
 		if (oi.u.packed.is_delta ||
 		    repo_settings_get_big_file_threshold(the_repository) >= size ||
-		    open_istream_pack_non_delta(st, r, oid, oi.u.packed.pack,
+		    open_istream_pack_non_delta(out, r, oid, oi.u.packed.pack,
 						oi.u.packed.offset) < 0)
 			break;
 		return 0;
@@ -449,7 +478,7 @@ static int istream_source(struct odb_read_stream *st,
 		break;
 	}
 
-	return open_istream_incore(st, r, oid);
+	return open_istream_incore(out, r, oid);
 }
 
 /****************************************************************
@@ -474,14 +503,12 @@ struct odb_read_stream *open_istream(struct repository *r,
 				     unsigned long *size,
 				     struct stream_filter *filter)
 {
-	struct odb_read_stream *st = xmalloc(sizeof(*st));
+	struct odb_read_stream *st;
 	const struct object_id *real = lookup_replace_object(r, oid);
-	int ret = istream_source(st, r, real);
+	int ret = istream_source(&st, r, real);
 
-	if (ret) {
-		free(st);
+	if (ret)
 		return NULL;
-	}
 
 	if (filter) {
 		/* Add "&& !is_null_stream_filter(filter)" for performance */

From e030d0aeb5ebf79cdc4910e79d59e33998de78cd Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sun, 23 Nov 2025 19:59:31 +0100
Subject: [PATCH 136/553] streaming: create structure for in-core object
 streams

As explained in a preceding commit, we want to get rid of the union of
stream-type specific data in `struct odb_read_stream`. Create a new
structure for in-core object streams to move towards this design.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 streaming.c | 44 +++++++++++++++++++++++++-------------------
 1 file changed, 25 insertions(+), 19 deletions(-)

diff --git a/streaming.c b/streaming.c
index a2c2d887387c57..35307d72295988 100644
--- a/streaming.c
+++ b/streaming.c
@@ -39,11 +39,6 @@ struct odb_read_stream {
 	enum { z_unused, z_used, z_done, z_error } z_state;
 
 	union {
-		struct {
-			char *buf; /* from odb_read_object_info_extended() */
-			unsigned long read_ptr;
-		} incore;
-
 		struct {
 			void *mapped;
 			unsigned long mapsize;
@@ -401,22 +396,30 @@ static int open_istream_pack_non_delta(struct odb_read_stream **out,
  *
  *****************************************************************/
 
-static int close_istream_incore(struct odb_read_stream *st)
+struct odb_incore_read_stream {
+	struct odb_read_stream base;
+	char *buf; /* from odb_read_object_info_extended() */
+	unsigned long read_ptr;
+};
+
+static int close_istream_incore(struct odb_read_stream *_st)
 {
-	free(st->u.incore.buf);
+	struct odb_incore_read_stream *st = (struct odb_incore_read_stream *)_st;
+	free(st->buf);
 	return 0;
 }
 
-static ssize_t read_istream_incore(struct odb_read_stream *st, char *buf, size_t sz)
+static ssize_t read_istream_incore(struct odb_read_stream *_st, char *buf, size_t sz)
 {
+	struct odb_incore_read_stream *st = (struct odb_incore_read_stream *)_st;
 	size_t read_size = sz;
-	size_t remainder = st->size - st->u.incore.read_ptr;
+	size_t remainder = st->base.size - st->read_ptr;
 
 	if (remainder <= read_size)
 		read_size = remainder;
 	if (read_size) {
-		memcpy(buf, st->u.incore.buf + st->u.incore.read_ptr, read_size);
-		st->u.incore.read_ptr += read_size;
+		memcpy(buf, st->buf + st->read_ptr, read_size);
+		st->read_ptr += read_size;
 	}
 	return read_size;
 }
@@ -426,22 +429,25 @@ static int open_istream_incore(struct odb_read_stream **out,
 			       const struct object_id *oid)
 {
 	struct object_info oi = OBJECT_INFO_INIT;
-	struct odb_read_stream stream = {
-		.close = close_istream_incore,
-		.read = read_istream_incore,
+	struct odb_incore_read_stream stream = {
+		.base.close = close_istream_incore,
+		.base.read = read_istream_incore,
 	};
+	struct odb_incore_read_stream *st;
 	int ret;
 
-	oi.typep = &stream.type;
-	oi.sizep = &stream.size;
-	oi.contentp = (void **)&stream.u.incore.buf;
+	oi.typep = &stream.base.type;
+	oi.sizep = &stream.base.size;
+	oi.contentp = (void **)&stream.buf;
 	ret = odb_read_object_info_extended(r->objects, oid, &oi,
 					    OBJECT_INFO_DIE_IF_CORRUPT);
 	if (ret)
 		return ret;
 
-	CALLOC_ARRAY(*out, 1);
-	**out = stream;
+	CALLOC_ARRAY(st, 1);
+	*st = stream;
+	*out = &st->base;
+
 	return 0;
 }
 

From b7774c0f0de43379c40984b4ede265a512c1a4f0 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sun, 23 Nov 2025 19:59:32 +0100
Subject: [PATCH 137/553] streaming: create structure for loose object streams

As explained in a preceding commit, we want to get rid of the union of
stream-type specific data in `struct odb_read_stream`. Create a new
structure for loose object streams to move towards this design.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 streaming.c | 85 +++++++++++++++++++++++++++--------------------------
 1 file changed, 44 insertions(+), 41 deletions(-)

diff --git a/streaming.c b/streaming.c
index 35307d72295988..ac7b3026f5a604 100644
--- a/streaming.c
+++ b/streaming.c
@@ -39,14 +39,6 @@ struct odb_read_stream {
 	enum { z_unused, z_used, z_done, z_error } z_state;
 
 	union {
-		struct {
-			void *mapped;
-			unsigned long mapsize;
-			char hdr[32];
-			int hdr_avail;
-			int hdr_used;
-		} loose;
-
 		struct {
 			struct packed_git *pack;
 			off_t pos;
@@ -165,11 +157,21 @@ static struct odb_read_stream *attach_stream_filter(struct odb_read_stream *st,
  *
  *****************************************************************/
 
-static ssize_t read_istream_loose(struct odb_read_stream *st, char *buf, size_t sz)
+struct odb_loose_read_stream {
+	struct odb_read_stream base;
+	void *mapped;
+	unsigned long mapsize;
+	char hdr[32];
+	int hdr_avail;
+	int hdr_used;
+};
+
+static ssize_t read_istream_loose(struct odb_read_stream *_st, char *buf, size_t sz)
 {
+	struct odb_loose_read_stream *st = (struct odb_loose_read_stream *)_st;
 	size_t total_read = 0;
 
-	switch (st->z_state) {
+	switch (st->base.z_state) {
 	case z_done:
 		return 0;
 	case z_error:
@@ -178,42 +180,43 @@ static ssize_t read_istream_loose(struct odb_read_stream *st, char *buf, size_t
 		break;
 	}
 
-	if (st->u.loose.hdr_used < st->u.loose.hdr_avail) {
-		size_t to_copy = st->u.loose.hdr_avail - st->u.loose.hdr_used;
+	if (st->hdr_used < st->hdr_avail) {
+		size_t to_copy = st->hdr_avail - st->hdr_used;
 		if (sz < to_copy)
 			to_copy = sz;
-		memcpy(buf, st->u.loose.hdr + st->u.loose.hdr_used, to_copy);
-		st->u.loose.hdr_used += to_copy;
+		memcpy(buf, st->hdr + st->hdr_used, to_copy);
+		st->hdr_used += to_copy;
 		total_read += to_copy;
 	}
 
 	while (total_read < sz) {
 		int status;
 
-		st->z.next_out = (unsigned char *)buf + total_read;
-		st->z.avail_out = sz - total_read;
-		status = git_inflate(&st->z, Z_FINISH);
+		st->base.z.next_out = (unsigned char *)buf + total_read;
+		st->base.z.avail_out = sz - total_read;
+		status = git_inflate(&st->base.z, Z_FINISH);
 
-		total_read = st->z.next_out - (unsigned char *)buf;
+		total_read = st->base.z.next_out - (unsigned char *)buf;
 
 		if (status == Z_STREAM_END) {
-			git_inflate_end(&st->z);
-			st->z_state = z_done;
+			git_inflate_end(&st->base.z);
+			st->base.z_state = z_done;
 			break;
 		}
 		if (status != Z_OK && (status != Z_BUF_ERROR || total_read < sz)) {
-			git_inflate_end(&st->z);
-			st->z_state = z_error;
+			git_inflate_end(&st->base.z);
+			st->base.z_state = z_error;
 			return -1;
 		}
 	}
 	return total_read;
 }
 
-static int close_istream_loose(struct odb_read_stream *st)
+static int close_istream_loose(struct odb_read_stream *_st)
 {
-	close_deflated_stream(st);
-	munmap(st->u.loose.mapped, st->u.loose.mapsize);
+	struct odb_loose_read_stream *st = (struct odb_loose_read_stream *)_st;
+	close_deflated_stream(&st->base);
+	munmap(st->mapped, st->mapsize);
 	return 0;
 }
 
@@ -222,7 +225,7 @@ static int open_istream_loose(struct odb_read_stream **out,
 			      const struct object_id *oid)
 {
 	struct object_info oi = OBJECT_INFO_INIT;
-	struct odb_read_stream *st;
+	struct odb_loose_read_stream *st;
 	struct odb_source *source;
 	unsigned long mapsize;
 	void *mapped;
@@ -244,8 +247,8 @@ static int open_istream_loose(struct odb_read_stream **out,
 	 */
 	CALLOC_ARRAY(st, 1);
 
-	switch (unpack_loose_header(&st->z, mapped, mapsize, st->u.loose.hdr,
-				    sizeof(st->u.loose.hdr))) {
+	switch (unpack_loose_header(&st->base.z, mapped, mapsize, st->hdr,
+				    sizeof(st->hdr))) {
 	case ULHR_OK:
 		break;
 	case ULHR_BAD:
@@ -253,26 +256,26 @@ static int open_istream_loose(struct odb_read_stream **out,
 		goto error;
 	}
 
-	oi.sizep = &st->size;
-	oi.typep = &st->type;
+	oi.sizep = &st->base.size;
+	oi.typep = &st->base.type;
 
-	if (parse_loose_header(st->u.loose.hdr, &oi) < 0 || st->type < 0)
+	if (parse_loose_header(st->hdr, &oi) < 0 || st->base.type < 0)
 		goto error;
 
-	st->u.loose.mapped = mapped;
-	st->u.loose.mapsize = mapsize;
-	st->u.loose.hdr_used = strlen(st->u.loose.hdr) + 1;
-	st->u.loose.hdr_avail = st->z.total_out;
-	st->z_state = z_used;
-	st->close = close_istream_loose;
-	st->read = read_istream_loose;
+	st->mapped = mapped;
+	st->mapsize = mapsize;
+	st->hdr_used = strlen(st->hdr) + 1;
+	st->hdr_avail = st->base.z.total_out;
+	st->base.z_state = z_used;
+	st->base.close = close_istream_loose;
+	st->base.read = read_istream_loose;
 
-	*out = st;
+	*out = &st->base;
 
 	return 0;
 error:
-	git_inflate_end(&st->z);
-	munmap(st->u.loose.mapped, st->u.loose.mapsize);
+	git_inflate_end(&st->base.z);
+	munmap(st->mapped, st->mapsize);
 	free(st);
 	return -1;
 }

From 5f0d8d2e8d3f992f58af247b6d21509c3c7595ca Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sun, 23 Nov 2025 19:59:33 +0100
Subject: [PATCH 138/553] streaming: create structure for packed object streams

As explained in a preceding commit, we want to get rid of the union of
stream-type specific data in `struct odb_read_stream`. Create a new
structure for packed object streams to move towards this design.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 streaming.c | 75 ++++++++++++++++++++++++++++-------------------------
 1 file changed, 40 insertions(+), 35 deletions(-)

diff --git a/streaming.c b/streaming.c
index ac7b3026f5a604..788f04e83ef6c8 100644
--- a/streaming.c
+++ b/streaming.c
@@ -39,11 +39,6 @@ struct odb_read_stream {
 	enum { z_unused, z_used, z_done, z_error } z_state;
 
 	union {
-		struct {
-			struct packed_git *pack;
-			off_t pos;
-		} in_pack;
-
 		struct filtered_istream filtered;
 	} u;
 };
@@ -287,16 +282,23 @@ static int open_istream_loose(struct odb_read_stream **out,
  *
  *****************************************************************/
 
-static ssize_t read_istream_pack_non_delta(struct odb_read_stream *st, char *buf,
+struct odb_packed_read_stream {
+	struct odb_read_stream base;
+	struct packed_git *pack;
+	off_t pos;
+};
+
+static ssize_t read_istream_pack_non_delta(struct odb_read_stream *_st, char *buf,
 					   size_t sz)
 {
+	struct odb_packed_read_stream *st = (struct odb_packed_read_stream *)_st;
 	size_t total_read = 0;
 
-	switch (st->z_state) {
+	switch (st->base.z_state) {
 	case z_unused:
-		memset(&st->z, 0, sizeof(st->z));
-		git_inflate_init(&st->z);
-		st->z_state = z_used;
+		memset(&st->base.z, 0, sizeof(st->base.z));
+		git_inflate_init(&st->base.z);
+		st->base.z_state = z_used;
 		break;
 	case z_done:
 		return 0;
@@ -311,21 +313,21 @@ static ssize_t read_istream_pack_non_delta(struct odb_read_stream *st, char *buf
 		struct pack_window *window = NULL;
 		unsigned char *mapped;
 
-		mapped = use_pack(st->u.in_pack.pack, &window,
-				  st->u.in_pack.pos, &st->z.avail_in);
+		mapped = use_pack(st->pack, &window,
+				  st->pos, &st->base.z.avail_in);
 
-		st->z.next_out = (unsigned char *)buf + total_read;
-		st->z.avail_out = sz - total_read;
-		st->z.next_in = mapped;
-		status = git_inflate(&st->z, Z_FINISH);
+		st->base.z.next_out = (unsigned char *)buf + total_read;
+		st->base.z.avail_out = sz - total_read;
+		st->base.z.next_in = mapped;
+		status = git_inflate(&st->base.z, Z_FINISH);
 
-		st->u.in_pack.pos += st->z.next_in - mapped;
-		total_read = st->z.next_out - (unsigned char *)buf;
+		st->pos += st->base.z.next_in - mapped;
+		total_read = st->base.z.next_out - (unsigned char *)buf;
 		unuse_pack(&window);
 
 		if (status == Z_STREAM_END) {
-			git_inflate_end(&st->z);
-			st->z_state = z_done;
+			git_inflate_end(&st->base.z);
+			st->base.z_state = z_done;
 			break;
 		}
 
@@ -338,17 +340,18 @@ static ssize_t read_istream_pack_non_delta(struct odb_read_stream *st, char *buf
 		 * or truncated), then use_pack() catches that and will die().
 		 */
 		if (status != Z_OK && status != Z_BUF_ERROR) {
-			git_inflate_end(&st->z);
-			st->z_state = z_error;
+			git_inflate_end(&st->base.z);
+			st->base.z_state = z_error;
 			return -1;
 		}
 	}
 	return total_read;
 }
 
-static int close_istream_pack_non_delta(struct odb_read_stream *st)
+static int close_istream_pack_non_delta(struct odb_read_stream *_st)
 {
-	close_deflated_stream(st);
+	struct odb_packed_read_stream *st = (struct odb_packed_read_stream *)_st;
+	close_deflated_stream(&st->base);
 	return 0;
 }
 
@@ -358,19 +361,17 @@ static int open_istream_pack_non_delta(struct odb_read_stream **out,
 				       struct packed_git *pack,
 				       off_t offset)
 {
-	struct odb_read_stream stream = {
-		.close = close_istream_pack_non_delta,
-		.read = read_istream_pack_non_delta,
-	};
+	struct odb_packed_read_stream *stream;
 	struct pack_window *window;
 	enum object_type in_pack_type;
+	size_t size;
 
 	window = NULL;
 
 	in_pack_type = unpack_object_header(pack,
 					    &window,
 					    &offset,
-					    &stream.size);
+					    &size);
 	unuse_pack(&window);
 	switch (in_pack_type) {
 	default:
@@ -381,13 +382,17 @@ static int open_istream_pack_non_delta(struct odb_read_stream **out,
 	case OBJ_TAG:
 		break;
 	}
-	stream.type = in_pack_type;
-	stream.z_state = z_unused;
-	stream.u.in_pack.pack = pack;
-	stream.u.in_pack.pos = offset;
 
-	CALLOC_ARRAY(*out, 1);
-	**out = stream;
+	CALLOC_ARRAY(stream, 1);
+	stream->base.close = close_istream_pack_non_delta;
+	stream->base.read = read_istream_pack_non_delta;
+	stream->base.type = in_pack_type;
+	stream->base.size = size;
+	stream->base.z_state = z_unused;
+	stream->pack = pack;
+	stream->pos = offset;
+
+	*out = &stream->base;
 
 	return 0;
 }

From 1154b2d2e511113e9b7d567788b72acb05713915 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sun, 23 Nov 2025 19:59:34 +0100
Subject: [PATCH 139/553] streaming: create structure for filtered object
 streams

As explained in a preceding commit, we want to get rid of the union of
stream-type specific data in `struct odb_read_stream`. Create a new
structure for filtered object streams to move towards this design.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 streaming.c | 54 +++++++++++++++++++++++++----------------------------
 1 file changed, 25 insertions(+), 29 deletions(-)

diff --git a/streaming.c b/streaming.c
index 788f04e83ef6c8..199cca5abb0eaa 100644
--- a/streaming.c
+++ b/streaming.c
@@ -19,16 +19,6 @@ typedef ssize_t (*read_istream_fn)(struct odb_read_stream *, char *, size_t);
 
 #define FILTER_BUFFER (1024*16)
 
-struct filtered_istream {
-	struct odb_read_stream *upstream;
-	struct stream_filter *filter;
-	char ibuf[FILTER_BUFFER];
-	char obuf[FILTER_BUFFER];
-	int i_end, i_ptr;
-	int o_end, o_ptr;
-	int input_finished;
-};
-
 struct odb_read_stream {
 	close_istream_fn close;
 	read_istream_fn read;
@@ -37,10 +27,6 @@ struct odb_read_stream {
 	unsigned long size; /* inflated size of full object */
 	git_zstream z;
 	enum { z_unused, z_used, z_done, z_error } z_state;
-
-	union {
-		struct filtered_istream filtered;
-	} u;
 };
 
 /*****************************************************************
@@ -62,16 +48,28 @@ static void close_deflated_stream(struct odb_read_stream *st)
  *
  *****************************************************************/
 
-static int close_istream_filtered(struct odb_read_stream *st)
+struct odb_filtered_read_stream {
+	struct odb_read_stream base;
+	struct odb_read_stream *upstream;
+	struct stream_filter *filter;
+	char ibuf[FILTER_BUFFER];
+	char obuf[FILTER_BUFFER];
+	int i_end, i_ptr;
+	int o_end, o_ptr;
+	int input_finished;
+};
+
+static int close_istream_filtered(struct odb_read_stream *_fs)
 {
-	free_stream_filter(st->u.filtered.filter);
-	return close_istream(st->u.filtered.upstream);
+	struct odb_filtered_read_stream *fs = (struct odb_filtered_read_stream *)_fs;
+	free_stream_filter(fs->filter);
+	return close_istream(fs->upstream);
 }
 
-static ssize_t read_istream_filtered(struct odb_read_stream *st, char *buf,
+static ssize_t read_istream_filtered(struct odb_read_stream *_fs, char *buf,
 				     size_t sz)
 {
-	struct filtered_istream *fs = &(st->u.filtered);
+	struct odb_filtered_read_stream *fs = (struct odb_filtered_read_stream *)_fs;
 	size_t filled = 0;
 
 	while (sz) {
@@ -131,19 +129,17 @@ static ssize_t read_istream_filtered(struct odb_read_stream *st, char *buf,
 static struct odb_read_stream *attach_stream_filter(struct odb_read_stream *st,
 						    struct stream_filter *filter)
 {
-	struct odb_read_stream *ifs = xmalloc(sizeof(*ifs));
-	struct filtered_istream *fs = &(ifs->u.filtered);
+	struct odb_filtered_read_stream *fs;
 
-	ifs->close = close_istream_filtered;
-	ifs->read = read_istream_filtered;
+	CALLOC_ARRAY(fs, 1);
+	fs->base.close = close_istream_filtered;
+	fs->base.read = read_istream_filtered;
 	fs->upstream = st;
 	fs->filter = filter;
-	fs->i_end = fs->i_ptr = 0;
-	fs->o_end = fs->o_ptr = 0;
-	fs->input_finished = 0;
-	ifs->size = -1; /* unknown */
-	ifs->type = st->type;
-	return ifs;
+	fs->base.size = -1; /* unknown */
+	fs->base.type = st->type;
+
+	return &fs->base;
 }
 
 /*****************************************************************

From eb5abbb4e6a8c06f5c6275bbb541bf7d736171c5 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sun, 23 Nov 2025 19:59:35 +0100
Subject: [PATCH 140/553] streaming: move zlib stream into backends

While all backend-specific data is now contained in a backend-specific
structure, we still share the zlib stream across the loose and packed
objects.

Refactor the code and move it into the specific structures so that we
fully detangle the different backends from one another.
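
As a rough illustration of the resulting state handling, the sketch
below models a backend that owns its own state enum, initializes its
decompressor lazily on the first read, and only tears it down on close
when it is still in use. The `demo_*` names are hypothetical, and the
two counting helpers merely stand in for `git_inflate_init()` and
`git_inflate_end()`; no real zlib stream is involved.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the inflate setup/teardown calls; they only count. */
static int demo_init_calls, demo_end_calls;
static void demo_inflate_init(void) { demo_init_calls++; }
static void demo_inflate_end(void) { demo_end_calls++; }

struct demo_packed_stream {
	enum {
		DEMO_UNINITIALIZED, /* zero, so calloc'ed streams start here */
		DEMO_INUSE,
		DEMO_DONE,
		DEMO_ERROR,
	} z_state;
	size_t remaining; /* bytes left to hand out */
};

/* Returns the number of bytes "read", 0 at end of stream, -1 on error. */
static long demo_packed_read(struct demo_packed_stream *st, size_t sz)
{
	switch (st->z_state) {
	case DEMO_UNINITIALIZED:
		demo_inflate_init();
		st->z_state = DEMO_INUSE;
		break;
	case DEMO_DONE:
		return 0;
	case DEMO_ERROR:
		return -1;
	case DEMO_INUSE:
		break;
	}

	if (sz >= st->remaining) {
		/* End of stream: release the decompressor right away. */
		sz = st->remaining;
		st->remaining = 0;
		demo_inflate_end();
		st->z_state = DEMO_DONE;
	} else {
		st->remaining -= sz;
	}
	return (long)sz;
}

static void demo_packed_close(struct demo_packed_stream *st)
{
	/* Only an initialized, unfinished stream still needs teardown. */
	if (st->z_state == DEMO_INUSE)
		demo_inflate_end();
}
```

With this shape every init call is matched by exactly one end call,
whether the stream is never read, closed mid-stream, or read to the end.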

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 streaming.c | 104 ++++++++++++++++++++++++++--------------------------
 1 file changed, 52 insertions(+), 52 deletions(-)

diff --git a/streaming.c b/streaming.c
index 199cca5abb0eaa..46fddaf2cad0ba 100644
--- a/streaming.c
+++ b/streaming.c
@@ -25,23 +25,8 @@ struct odb_read_stream {
 
 	enum object_type type;
 	unsigned long size; /* inflated size of full object */
-	git_zstream z;
-	enum { z_unused, z_used, z_done, z_error } z_state;
 };
 
-/*****************************************************************
- *
- * Common helpers
- *
- *****************************************************************/
-
-static void close_deflated_stream(struct odb_read_stream *st)
-{
-	if (st->z_state == z_used)
-		git_inflate_end(&st->z);
-}
-
-
 /*****************************************************************
  *
  * Filtered stream
@@ -150,6 +135,12 @@ static struct odb_read_stream *attach_stream_filter(struct odb_read_stream *st,
 
 struct odb_loose_read_stream {
 	struct odb_read_stream base;
+	git_zstream z;
+	enum {
+		ODB_LOOSE_READ_STREAM_INUSE,
+		ODB_LOOSE_READ_STREAM_DONE,
+		ODB_LOOSE_READ_STREAM_ERROR,
+	} z_state;
 	void *mapped;
 	unsigned long mapsize;
 	char hdr[32];
@@ -162,10 +153,10 @@ static ssize_t read_istream_loose(struct odb_read_stream *_st, char *buf, size_t
 	struct odb_loose_read_stream *st = (struct odb_loose_read_stream *)_st;
 	size_t total_read = 0;
 
-	switch (st->base.z_state) {
-	case z_done:
+	switch (st->z_state) {
+	case ODB_LOOSE_READ_STREAM_DONE:
 		return 0;
-	case z_error:
+	case ODB_LOOSE_READ_STREAM_ERROR:
 		return -1;
 	default:
 		break;
@@ -183,20 +174,20 @@ static ssize_t read_istream_loose(struct odb_read_stream *_st, char *buf, size_t
 	while (total_read < sz) {
 		int status;
 
-		st->base.z.next_out = (unsigned char *)buf + total_read;
-		st->base.z.avail_out = sz - total_read;
-		status = git_inflate(&st->base.z, Z_FINISH);
+		st->z.next_out = (unsigned char *)buf + total_read;
+		st->z.avail_out = sz - total_read;
+		status = git_inflate(&st->z, Z_FINISH);
 
-		total_read = st->base.z.next_out - (unsigned char *)buf;
+		total_read = st->z.next_out - (unsigned char *)buf;
 
 		if (status == Z_STREAM_END) {
-			git_inflate_end(&st->base.z);
-			st->base.z_state = z_done;
+			git_inflate_end(&st->z);
+			st->z_state = ODB_LOOSE_READ_STREAM_DONE;
 			break;
 		}
 		if (status != Z_OK && (status != Z_BUF_ERROR || total_read < sz)) {
-			git_inflate_end(&st->base.z);
-			st->base.z_state = z_error;
+			git_inflate_end(&st->z);
+			st->z_state = ODB_LOOSE_READ_STREAM_ERROR;
 			return -1;
 		}
 	}
@@ -206,7 +197,8 @@ static ssize_t read_istream_loose(struct odb_read_stream *_st, char *buf, size_t
 static int close_istream_loose(struct odb_read_stream *_st)
 {
 	struct odb_loose_read_stream *st = (struct odb_loose_read_stream *)_st;
-	close_deflated_stream(&st->base);
+	if (st->z_state == ODB_LOOSE_READ_STREAM_INUSE)
+		git_inflate_end(&st->z);
 	munmap(st->mapped, st->mapsize);
 	return 0;
 }
@@ -238,7 +230,7 @@ static int open_istream_loose(struct odb_read_stream **out,
 	 */
 	CALLOC_ARRAY(st, 1);
 
-	switch (unpack_loose_header(&st->base.z, mapped, mapsize, st->hdr,
+	switch (unpack_loose_header(&st->z, mapped, mapsize, st->hdr,
 				    sizeof(st->hdr))) {
 	case ULHR_OK:
 		break;
@@ -256,8 +248,8 @@ static int open_istream_loose(struct odb_read_stream **out,
 	st->mapped = mapped;
 	st->mapsize = mapsize;
 	st->hdr_used = strlen(st->hdr) + 1;
-	st->hdr_avail = st->base.z.total_out;
-	st->base.z_state = z_used;
+	st->hdr_avail = st->z.total_out;
+	st->z_state = ODB_LOOSE_READ_STREAM_INUSE;
 	st->base.close = close_istream_loose;
 	st->base.read = read_istream_loose;
 
@@ -265,7 +257,7 @@ static int open_istream_loose(struct odb_read_stream **out,
 
 	return 0;
 error:
-	git_inflate_end(&st->base.z);
+	git_inflate_end(&st->z);
 	munmap(st->mapped, st->mapsize);
 	free(st);
 	return -1;
@@ -281,6 +273,13 @@ static int open_istream_loose(struct odb_read_stream **out,
 struct odb_packed_read_stream {
 	struct odb_read_stream base;
 	struct packed_git *pack;
+	git_zstream z;
+	enum {
+		ODB_PACKED_READ_STREAM_UNINITIALIZED,
+		ODB_PACKED_READ_STREAM_INUSE,
+		ODB_PACKED_READ_STREAM_DONE,
+		ODB_PACKED_READ_STREAM_ERROR,
+	} z_state;
 	off_t pos;
 };
 
@@ -290,17 +289,17 @@ static ssize_t read_istream_pack_non_delta(struct odb_read_stream *_st, char *bu
 	struct odb_packed_read_stream *st = (struct odb_packed_read_stream *)_st;
 	size_t total_read = 0;
 
-	switch (st->base.z_state) {
-	case z_unused:
-		memset(&st->base.z, 0, sizeof(st->base.z));
-		git_inflate_init(&st->base.z);
-		st->base.z_state = z_used;
+	switch (st->z_state) {
+	case ODB_PACKED_READ_STREAM_UNINITIALIZED:
+		memset(&st->z, 0, sizeof(st->z));
+		git_inflate_init(&st->z);
+		st->z_state = ODB_PACKED_READ_STREAM_INUSE;
 		break;
-	case z_done:
+	case ODB_PACKED_READ_STREAM_DONE:
 		return 0;
-	case z_error:
+	case ODB_PACKED_READ_STREAM_ERROR:
 		return -1;
-	case z_used:
+	case ODB_PACKED_READ_STREAM_INUSE:
 		break;
 	}
 
@@ -310,20 +309,20 @@ static ssize_t read_istream_pack_non_delta(struct odb_read_stream *_st, char *bu
 		unsigned char *mapped;
 
 		mapped = use_pack(st->pack, &window,
-				  st->pos, &st->base.z.avail_in);
+				  st->pos, &st->z.avail_in);
 
-		st->base.z.next_out = (unsigned char *)buf + total_read;
-		st->base.z.avail_out = sz - total_read;
-		st->base.z.next_in = mapped;
-		status = git_inflate(&st->base.z, Z_FINISH);
+		st->z.next_out = (unsigned char *)buf + total_read;
+		st->z.avail_out = sz - total_read;
+		st->z.next_in = mapped;
+		status = git_inflate(&st->z, Z_FINISH);
 
-		st->pos += st->base.z.next_in - mapped;
-		total_read = st->base.z.next_out - (unsigned char *)buf;
+		st->pos += st->z.next_in - mapped;
+		total_read = st->z.next_out - (unsigned char *)buf;
 		unuse_pack(&window);
 
 		if (status == Z_STREAM_END) {
-			git_inflate_end(&st->base.z);
-			st->base.z_state = z_done;
+			git_inflate_end(&st->z);
+			st->z_state = ODB_PACKED_READ_STREAM_DONE;
 			break;
 		}
 
@@ -336,8 +335,8 @@ static ssize_t read_istream_pack_non_delta(struct odb_read_stream *_st, char *bu
 		 * or truncated), then use_pack() catches that and will die().
 		 */
 		if (status != Z_OK && status != Z_BUF_ERROR) {
-			git_inflate_end(&st->base.z);
-			st->base.z_state = z_error;
+			git_inflate_end(&st->z);
+			st->z_state = ODB_PACKED_READ_STREAM_ERROR;
 			return -1;
 		}
 	}
@@ -347,7 +346,8 @@ static ssize_t read_istream_pack_non_delta(struct odb_read_stream *_st, char *bu
 static int close_istream_pack_non_delta(struct odb_read_stream *_st)
 {
 	struct odb_packed_read_stream *st = (struct odb_packed_read_stream *)_st;
-	close_deflated_stream(&st->base);
+	if (st->z_state == ODB_PACKED_READ_STREAM_INUSE)
+		git_inflate_end(&st->z);
 	return 0;
 }
 
@@ -384,7 +384,7 @@ static int open_istream_pack_non_delta(struct odb_read_stream **out,
 	stream->base.read = read_istream_pack_non_delta;
 	stream->base.type = in_pack_type;
 	stream->base.size = size;
-	stream->base.z_state = z_unused;
+	stream->z_state = ODB_PACKED_READ_STREAM_UNINITIALIZED;
 	stream->pack = pack;
 	stream->pos = offset;
 

From 385e18810f10ec0ce0a266d25da4e1878c8ce15a Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sun, 23 Nov 2025 19:59:36 +0100
Subject: [PATCH 141/553] packfile: introduce function to read object info from
 a store

Extract the logic to read object info for a packed object from
`do_oid_object_info_extended()` into a standalone function that operates
on the packfile store. This function will be used in a subsequent
commit.

Note that this change allows us to make `find_pack_entry()` an internal
implementation detail. As a consequence though we have to move around
`packfile_store_freshen_object()` so that it is defined after that
function.
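
The new function distinguishes three outcomes: 1 when the object is not
found, 0 when it was found and read, and a negative value when it is
corrupt. The caller only short-circuits on 0 and otherwise falls
through to its other lookups. A minimal sketch of that convention,
using a hypothetical in-memory table and `demo_*` names rather than
the real packfile API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature "pack": a negative size marks corruption. */
struct demo_entry {
	int id;
	long size;
};

static const struct demo_entry demo_pack[] = {
	{ 1, 120 },
	{ 2, -1 }, /* pretend this entry is corrupt */
};

/* 1: not found, 0: found and read, -1: found but corrupt. */
static int demo_store_read_info(int id, long *size)
{
	size_t i;

	for (i = 0; i < sizeof(demo_pack) / sizeof(demo_pack[0]); i++) {
		if (demo_pack[i].id != id)
			continue;
		if (demo_pack[i].size < 0)
			return -1;
		if (size) /* callers may not care about the size */
			*size = demo_pack[i].size;
		return 0;
	}
	return 1;
}

/* Only a zero return short-circuits; anything else falls through. */
static int demo_lookup(int id, long *size)
{
	if (!demo_store_read_info(id, size))
		return 0;
	/* ... a real caller would try other object sources here ... */
	return -1;
}
```

Treating "not found" and "corrupt" alike at the call site keeps the
fallback path simple while still letting the store report the
difference to callers that care.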

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 odb.c      | 29 +++-------------------
 packfile.c | 71 +++++++++++++++++++++++++++++++++++++++++-------------
 packfile.h | 12 ++++++++-
 3 files changed, 69 insertions(+), 43 deletions(-)

diff --git a/odb.c b/odb.c
index 3ec21ef24e16bb..f4cbee4b042d83 100644
--- a/odb.c
+++ b/odb.c
@@ -666,8 +666,6 @@ static int do_oid_object_info_extended(struct object_database *odb,
 {
 	static struct object_info blank_oi = OBJECT_INFO_INIT;
 	const struct cached_object *co;
-	struct pack_entry e;
-	int rtype;
 	const struct object_id *real = oid;
 	int already_retried = 0;
 
@@ -702,8 +700,8 @@ static int do_oid_object_info_extended(struct object_database *odb,
 	while (1) {
 		struct odb_source *source;
 
-		if (find_pack_entry(odb->repo, real, &e))
-			break;
+		if (!packfile_store_read_object_info(odb->packfiles, real, oi, flags))
+			return 0;
 
 		/* Most likely it's a loose object. */
 		for (source = odb->sources; source; source = source->next)
@@ -713,8 +711,8 @@ static int do_oid_object_info_extended(struct object_database *odb,
 		/* Not a loose object; someone else may have just packed it. */
 		if (!(flags & OBJECT_INFO_QUICK)) {
 			odb_reprepare(odb->repo->objects);
-			if (find_pack_entry(odb->repo, real, &e))
-				break;
+			if (!packfile_store_read_object_info(odb->packfiles, real, oi, flags))
+				return 0;
 		}
 
 		/*
@@ -747,25 +745,6 @@ static int do_oid_object_info_extended(struct object_database *odb,
 		}
 		return -1;
 	}
-
-	if (oi == &blank_oi)
-		/*
-		 * We know that the caller doesn't actually need the
-		 * information below, so return early.
-		 */
-		return 0;
-	rtype = packed_object_info(odb->repo, e.p, e.offset, oi);
-	if (rtype < 0) {
-		mark_bad_packed_object(e.p, real);
-		return do_oid_object_info_extended(odb, real, oi, 0);
-	} else if (oi->whence == OI_PACKED) {
-		oi->u.packed.offset = e.offset;
-		oi->u.packed.pack = e.p;
-		oi->u.packed.is_delta = (rtype == OBJ_REF_DELTA ||
-					 rtype == OBJ_OFS_DELTA);
-	}
-
-	return 0;
 }
 
 static int oid_object_info_convert(struct repository *r,
diff --git a/packfile.c b/packfile.c
index 40f733dd234900..b4bc40d895c8da 100644
--- a/packfile.c
+++ b/packfile.c
@@ -819,22 +819,6 @@ struct packed_git *packfile_store_load_pack(struct packfile_store *store,
 	return p;
 }
 
-int packfile_store_freshen_object(struct packfile_store *store,
-				  const struct object_id *oid)
-{
-	struct pack_entry e;
-	if (!find_pack_entry(store->odb->repo, oid, &e))
-		return 0;
-	if (e.p->is_cruft)
-		return 0;
-	if (e.p->freshened)
-		return 1;
-	if (utime(e.p->pack_name, NULL))
-		return 0;
-	e.p->freshened = 1;
-	return 1;
-}
-
 void (*report_garbage)(unsigned seen_bits, const char *path);
 
 static void report_helper(const struct string_list *list,
@@ -2064,7 +2048,9 @@ static int fill_pack_entry(const struct object_id *oid,
 	return 1;
 }
 
-int find_pack_entry(struct repository *r, const struct object_id *oid, struct pack_entry *e)
+static int find_pack_entry(struct repository *r,
+			   const struct object_id *oid,
+			   struct pack_entry *e)
 {
 	struct list_head *pos;
 
@@ -2087,6 +2073,57 @@ int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
 	return 0;
 }
 
+int packfile_store_freshen_object(struct packfile_store *store,
+				  const struct object_id *oid)
+{
+	struct pack_entry e;
+	if (!find_pack_entry(store->odb->repo, oid, &e))
+		return 0;
+	if (e.p->is_cruft)
+		return 0;
+	if (e.p->freshened)
+		return 1;
+	if (utime(e.p->pack_name, NULL))
+		return 0;
+	e.p->freshened = 1;
+	return 1;
+}
+
+int packfile_store_read_object_info(struct packfile_store *store,
+				    const struct object_id *oid,
+				    struct object_info *oi,
+				    unsigned flags UNUSED)
+{
+	static struct object_info blank_oi = OBJECT_INFO_INIT;
+	struct pack_entry e;
+	int rtype;
+
+	if (!find_pack_entry(store->odb->repo, oid, &e))
+		return 1;
+
+	/*
+	 * We know that the caller doesn't actually need the
+	 * information below, so return early.
+	 */
+	if (oi == &blank_oi)
+		return 0;
+
+	rtype = packed_object_info(store->odb->repo, e.p, e.offset, oi);
+	if (rtype < 0) {
+		mark_bad_packed_object(e.p, oid);
+		return -1;
+	}
+
+	if (oi->whence == OI_PACKED) {
+		oi->u.packed.offset = e.offset;
+		oi->u.packed.pack = e.p;
+		oi->u.packed.is_delta = (rtype == OBJ_REF_DELTA ||
+					 rtype == OBJ_OFS_DELTA);
+	}
+
+	return 0;
+}
+
 static void maybe_invalidate_kept_pack_cache(struct repository *r,
 					     unsigned flags)
 {
diff --git a/packfile.h b/packfile.h
index 58fcc88e20224b..0a98bddd811921 100644
--- a/packfile.h
+++ b/packfile.h
@@ -144,6 +144,17 @@ void packfile_store_add_pack(struct packfile_store *store,
 #define repo_for_each_pack(repo, p) \
 	for (p = packfile_store_get_packs(repo->objects->packfiles); p; p = p->next)
 
+/*
+ * Try to read the object identified by its ID from the object store and
+ * populate the object info with its data. Returns 1 in case the object was
+ * not found, 0 if it was and read successfully, and a negative error code in
+ * case the object was corrupted.
+ */
+int packfile_store_read_object_info(struct packfile_store *store,
+				    const struct object_id *oid,
+				    struct object_info *oi,
+				    unsigned flags);
+
 /*
  * Get all packs managed by the given store, including packfiles that are
  * referenced by multi-pack indices.
@@ -357,7 +368,6 @@ const struct packed_git *has_packed_and_bad(struct repository *, const struct ob
  * Iff a pack file in the given repository contains the object named by sha1,
  * return true and store its location to e.
  */
-int find_pack_entry(struct repository *r, const struct object_id *oid, struct pack_entry *e);
 int find_kept_pack_entry(struct repository *r, const struct object_id *oid, unsigned flags, struct pack_entry *e);
 
 int has_object_pack(struct repository *r, const struct object_id *oid);

From 4c89d31494bff4bde6079a0e0821f1437e37d07b Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sun, 23 Nov 2025 19:59:37 +0100
Subject: [PATCH 142/553] streaming: rely on object sources to create object
 stream

When creating an object stream we first look up the object info and, if
it's present, we call into the respective backend that contains the
object to create a new stream for it.

This has the consequence that, for loose object sources, we basically
iterate through the object sources twice: we first discover that the
object exists as a loose object by iterating through all sources, and
once we have discovered it, we again walk through all sources to try
and map the object. The same issue will eventually also surface once
the packfile store becomes per-object-source.

Furthermore, it feels rather pointless to first look up the object only
to then try and read it.

Refactor the logic to be centered around sources instead. Instead of
first reading the object, we immediately ask the source to create the
object stream for us. If the object exists we get a stream; otherwise
we try the next source.

This way we only have to iterate through sources once. But even more
importantly, this change also helps us to make the whole logic
pluggable. The object read stream subsystem does not need to be aware of
the different source backends anymore; eventually it will only have to
call the source's callback function.

Note that at the current point in time we aren't fully there yet:

  - The packfile store still sits on the object database level and is
    thus agnostic of the sources.

  - We still have to call into both the packfile store and the loose
    object source.

But both of these issues will soon be addressed.

This refactoring results in a slight change to semantics: previously, it
was `odb_read_object_info_extended()` that picked the source for us, and
it would have favored packed (non-deltified) objects over loose objects.
And while we still favor packed over loose objects for a single source
with the new logic, we'll now favor a loose object from an earlier
source over a packed object from a later source.

Ultimately this shouldn't matter though: the stream doesn't indicate to
the caller which source it is from and whether it was created from a
packed or loose object, so such details are opaque to the caller. And
other than that we should be able to assume that two objects with the
same object ID should refer to the same content, so the streamed data
would be the same, too.
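
The new lookup order can be sketched as follows: ask the shared
packfile store first, then walk the sources once and take the first
one that can open the object. Everything named `demo_*` is
hypothetical, objects are identified by plain strings, and the "stream"
is reduced to a label saying where the object was found.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical source: a name plus the one object id it holds loose. */
struct demo_source {
	const char *name;
	const char *loose_oid;
	struct demo_source *next;
};

/* Return 0 and report the source if it has the object, -1 otherwise. */
static int demo_open_loose(const char **out,
			   const struct demo_source *source,
			   const char *oid)
{
	if (!source->loose_oid || strcmp(source->loose_oid, oid))
		return -1;
	*out = source->name;
	return 0;
}

/* Stand-in for the packfile store, still shared by all sources. */
static int demo_open_packed(const char **out, const char *packed_oid,
			    const char *oid)
{
	if (!packed_oid || strcmp(packed_oid, oid))
		return -1;
	*out = "packfile store";
	return 0;
}

static int demo_open_stream(const char **out, const char *packed_oid,
			    const struct demo_source *sources,
			    const char *oid)
{
	const struct demo_source *source;

	if (!demo_open_packed(out, packed_oid, oid))
		return 0;

	/* A single pass: the first source that can open the object wins. */
	for (source = sources; source; source = source->next)
		if (!demo_open_loose(out, source, oid))
			return 0;

	return -1; /* the patch falls back to an in-core stream here */
}
```

In this sketch an object that is loose in both a primary and an
alternate source is always served from the primary one, because the
walk stops at the first match.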

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 streaming.c | 65 ++++++++++++++++++++---------------------------------
 1 file changed, 24 insertions(+), 41 deletions(-)

diff --git a/streaming.c b/streaming.c
index 46fddaf2cad0ba..f0f7d31956f59b 100644
--- a/streaming.c
+++ b/streaming.c
@@ -204,21 +204,15 @@ static int close_istream_loose(struct odb_read_stream *_st)
 }
 
 static int open_istream_loose(struct odb_read_stream **out,
-			      struct repository *r,
+			      struct odb_source *source,
 			      const struct object_id *oid)
 {
 	struct object_info oi = OBJECT_INFO_INIT;
 	struct odb_loose_read_stream *st;
-	struct odb_source *source;
 	unsigned long mapsize;
 	void *mapped;
 
-	odb_prepare_alternates(r->objects);
-	for (source = r->objects->sources; source; source = source->next) {
-		mapped = odb_source_loose_map_object(source, oid, &mapsize);
-		if (mapped)
-			break;
-	}
+	mapped = odb_source_loose_map_object(source, oid, &mapsize);
 	if (!mapped)
 		return -1;
 
@@ -352,21 +346,25 @@ static int close_istream_pack_non_delta(struct odb_read_stream *_st)
 }
 
 static int open_istream_pack_non_delta(struct odb_read_stream **out,
-				       struct repository *r UNUSED,
-				       const struct object_id *oid UNUSED,
-				       struct packed_git *pack,
-				       off_t offset)
+				       struct object_database *odb,
+				       const struct object_id *oid)
 {
 	struct odb_packed_read_stream *stream;
-	struct pack_window *window;
+	struct pack_window *window = NULL;
+	struct object_info oi = OBJECT_INFO_INIT;
 	enum object_type in_pack_type;
-	size_t size;
+	unsigned long size;
 
-	window = NULL;
+	oi.sizep = &size;
+
+	if (packfile_store_read_object_info(odb->packfiles, oid, &oi, 0) ||
+	    oi.u.packed.is_delta ||
+	    repo_settings_get_big_file_threshold(the_repository) >= size)
+		return -1;
 
-	in_pack_type = unpack_object_header(pack,
+	in_pack_type = unpack_object_header(oi.u.packed.pack,
 					    &window,
-					    &offset,
+					    &oi.u.packed.offset,
 					    &size);
 	unuse_pack(&window);
 	switch (in_pack_type) {
@@ -385,8 +383,8 @@ static int open_istream_pack_non_delta(struct odb_read_stream **out,
 	stream->base.type = in_pack_type;
 	stream->base.size = size;
 	stream->z_state = ODB_PACKED_READ_STREAM_UNINITIALIZED;
-	stream->pack = pack;
-	stream->pos = offset;
+	stream->pack = oi.u.packed.pack;
+	stream->pos = oi.u.packed.offset;
 
 	*out = &stream->base;
 
@@ -463,30 +461,15 @@ static int istream_source(struct odb_read_stream **out,
 			  struct repository *r,
 			  const struct object_id *oid)
 {
-	unsigned long size;
-	int status;
-	struct object_info oi = OBJECT_INFO_INIT;
-
-	oi.sizep = &size;
-	status = odb_read_object_info_extended(r->objects, oid, &oi, 0);
-	if (status < 0)
-		return status;
+	struct odb_source *source;
 
-	switch (oi.whence) {
-	case OI_LOOSE:
-		if (open_istream_loose(out, r, oid) < 0)
-			break;
-		return 0;
-	case OI_PACKED:
-		if (oi.u.packed.is_delta ||
-		    repo_settings_get_big_file_threshold(the_repository) >= size ||
-		    open_istream_pack_non_delta(out, r, oid, oi.u.packed.pack,
-						oi.u.packed.offset) < 0)
-			break;
+	if (!open_istream_pack_non_delta(out, r->objects, oid))
 		return 0;
-	default:
-		break;
-	}
+
+	odb_prepare_alternates(r->objects);
+	for (source = r->objects->sources; source; source = source->next)
+		if (!open_istream_loose(out, source, oid))
+			return 0;
 
 	return open_istream_incore(out, r, oid);
 }

From c26da3446e98ad4aa98ec9154c70c6fd35cb9ad6 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sun, 23 Nov 2025 19:59:38 +0100
Subject: [PATCH 143/553] streaming: get rid of `the_repository`

Subsequent commits will move the backend-specific logic of object
streaming into their respective subsystems. These subsystems have gotten
rid of `the_repository` already, but we still use it in two locations in
the streaming subsystem.

Prepare for the move by fixing those two cases. Converting the logic in
`open_istream_pack_non_delta()` is trivial as we already got the object
database as input.

For `stream_blob_to_fd()`, however, we have to add a new parameter that
passes in the object database. As this requires adjusting all callers
anyway, also rename the function to `odb_stream_blob_to_fd()` to
indicate that it is part of the object database subsystem.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/cat-file.c  |  2 +-
 builtin/fsck.c      |  3 ++-
 builtin/log.c       |  4 ++--
 entry.c             |  2 +-
 parallel-checkout.c |  3 ++-
 streaming.c         | 13 +++++++------
 streaming.h         | 18 +++++++++++++++++-
 7 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 983ecec837b03b..120d626d66e140 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -95,7 +95,7 @@ static int filter_object(const char *path, unsigned mode,
 
 static int stream_blob(const struct object_id *oid)
 {
-	if (stream_blob_to_fd(1, oid, NULL, 0))
+	if (odb_stream_blob_to_fd(the_repository->objects, 1, oid, NULL, 0))
 		die("unable to stream %s to stdout", oid_to_hex(oid));
 	return 0;
 }
diff --git a/builtin/fsck.c b/builtin/fsck.c
index b1a650c6731d32..1a348d43c26020 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -340,7 +340,8 @@ static void check_unreachable_object(struct object *obj)
 			}
 			f = xfopen(filename, "w");
 			if (obj->type == OBJ_BLOB) {
-				if (stream_blob_to_fd(fileno(f), &obj->oid, NULL, 1))
+				if (odb_stream_blob_to_fd(the_repository->objects, fileno(f),
+							  &obj->oid, NULL, 1))
 					die_errno(_("could not write '%s'"), filename);
 			} else
 				fprintf(f, "%s\n", describe_object(&obj->oid));
diff --git a/builtin/log.c b/builtin/log.c
index c8319b8af38c8c..e7b83a6e00a708 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -584,7 +584,7 @@ static int show_blob_object(const struct object_id *oid, struct rev_info *rev, c
 	fflush(rev->diffopt.file);
 	if (!rev->diffopt.flags.textconv_set_via_cmdline ||
 	    !rev->diffopt.flags.allow_textconv)
-		return stream_blob_to_fd(1, oid, NULL, 0);
+		return odb_stream_blob_to_fd(the_repository->objects, 1, oid, NULL, 0);
 
 	if (get_oid_with_context(the_repository, obj_name,
 				 GET_OID_RECORD_PATH,
@@ -594,7 +594,7 @@ static int show_blob_object(const struct object_id *oid, struct rev_info *rev, c
 	    !textconv_object(the_repository, obj_context.path,
 			     obj_context.mode, &oidc, 1, &buf, &size)) {
 		object_context_release(&obj_context);
-		return stream_blob_to_fd(1, oid, NULL, 0);
+		return odb_stream_blob_to_fd(the_repository->objects, 1, oid, NULL, 0);
 	}
 
 	if (!buf)
diff --git a/entry.c b/entry.c
index cae02eb50398d7..38dfe670f79920 100644
--- a/entry.c
+++ b/entry.c
@@ -139,7 +139,7 @@ static int streaming_write_entry(const struct cache_entry *ce, char *path,
 	if (fd < 0)
 		return -1;
 
-	result |= stream_blob_to_fd(fd, &ce->oid, filter, 1);
+	result |= odb_stream_blob_to_fd(the_repository->objects, fd, &ce->oid, filter, 1);
 	*fstat_done = fstat_checkout_output(fd, state, statbuf);
 	result |= close(fd);
 
diff --git a/parallel-checkout.c b/parallel-checkout.c
index fba6aa65a6e852..1cb6701b926dcf 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -281,7 +281,8 @@ static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd,
 
 	filter = get_stream_filter_ca(&pc_item->ca, &pc_item->ce->oid);
 	if (filter) {
-		if (stream_blob_to_fd(fd, &pc_item->ce->oid, filter, 1)) {
+		if (odb_stream_blob_to_fd(the_repository->objects, fd,
+					  &pc_item->ce->oid, filter, 1)) {
 			/* On error, reset fd to try writing without streaming */
 			if (reset_fd(fd, path))
 				return -1;
diff --git a/streaming.c b/streaming.c
index f0f7d31956f59b..807a6e03a85b49 100644
--- a/streaming.c
+++ b/streaming.c
@@ -2,8 +2,6 @@
  * Copyright (c) 2011, Google Inc.
  */
 
-#define USE_THE_REPOSITORY_VARIABLE
-
 #include "git-compat-util.h"
 #include "convert.h"
 #include "environment.h"
@@ -359,7 +357,7 @@ static int open_istream_pack_non_delta(struct odb_read_stream **out,
 
 	if (packfile_store_read_object_info(odb->packfiles, oid, &oi, 0) ||
 	    oi.u.packed.is_delta ||
-	    repo_settings_get_big_file_threshold(the_repository) >= size)
+	    repo_settings_get_big_file_threshold(odb->repo) >= size)
 		return -1;
 
 	in_pack_type = unpack_object_header(oi.u.packed.pack,
@@ -518,8 +516,11 @@ struct odb_read_stream *open_istream(struct repository *r,
 	return st;
 }
 
-int stream_blob_to_fd(int fd, const struct object_id *oid, struct stream_filter *filter,
-		      int can_seek)
+int odb_stream_blob_to_fd(struct object_database *odb,
+			  int fd,
+			  const struct object_id *oid,
+			  struct stream_filter *filter,
+			  int can_seek)
 {
 	struct odb_read_stream *st;
 	enum object_type type;
@@ -527,7 +528,7 @@ int stream_blob_to_fd(int fd, const struct object_id *oid, struct stream_filter
 	ssize_t kept = 0;
 	int result = -1;
 
-	st = open_istream(the_repository, oid, &type, &sz, filter);
+	st = open_istream(odb->repo, oid, &type, &sz, filter);
 	if (!st) {
 		if (filter)
 			free_stream_filter(filter);
diff --git a/streaming.h b/streaming.h
index f5ff5d7ac9a573..148f6b30697ab7 100644
--- a/streaming.h
+++ b/streaming.h
@@ -6,6 +6,7 @@
 
 #include "object.h"
 
+struct object_database;
 /* opaque */
 struct odb_read_stream;
 struct stream_filter;
@@ -16,6 +17,21 @@ struct odb_read_stream *open_istream(struct repository *, const struct object_id
 int close_istream(struct odb_read_stream *);
 ssize_t read_istream(struct odb_read_stream *, void *, size_t);
 
-int stream_blob_to_fd(int fd, const struct object_id *, struct stream_filter *, int can_seek);
+/*
+ * Look up the object by its ID and write the full contents to the file
+ * descriptor. The object must be a blob, or the function will fail. When
+ * provided, the filter is used to transform the blob contents.
+ *
+ * `can_seek` should be set to 1 in case the given file descriptor can be
+ * seek(3p)'d on. This is used to support files with holes in case a
+ * significant portion of the blob contains NUL bytes.
+ *
+ * Returns a negative error code on failure, 0 on success.
+ */
+int odb_stream_blob_to_fd(struct object_database *odb,
+			  int fd,
+			  const struct object_id *oid,
+			  struct stream_filter *filter,
+			  int can_seek);
 
 #endif /* STREAMING_H */

From ffc9a3448500caa50766876ef2169e0f26ad3b3c Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sun, 23 Nov 2025 19:59:39 +0100
Subject: [PATCH 144/553] streaming: make the `odb_read_stream` definition
 public

Subsequent commits will move the backend-specific logic of setting up an
object read stream into the respective subsystems. As the backends will
then be responsible for allocating the stream, they need to have the
stream definition available to them.

Make the stream definition public to prepare for this.
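
As a minimal sketch of why a backend needs the full definition: a
backend embeds `struct odb_read_stream` as the first member of its own
stream type and fills in the `close` and `read` callbacks. The `mem_*`
names and `drain()` below are hypothetical and exist only for
illustration. The struct is trimmed to what the sketch needs (no
`enum object_type`), and freeing the stream in the caller is an
assumption of this sketch, not something this patch shows.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Trimmed stand-in for the public definition (assumption: no type field). */
struct odb_read_stream;
typedef int (*odb_read_stream_close_fn)(struct odb_read_stream *);
typedef ssize_t (*odb_read_stream_read_fn)(struct odb_read_stream *, char *, size_t);

struct odb_read_stream {
	odb_read_stream_close_fn close;
	odb_read_stream_read_fn read;
	unsigned long size; /* inflated size of full object */
};

/* Hypothetical backend that serves bytes from a memory buffer. */
struct mem_read_stream {
	struct odb_read_stream base; /* first member, so the casts below are valid */
	const char *buf;
	size_t pos;
};

static ssize_t mem_read(struct odb_read_stream *_st, char *out, size_t sz)
{
	struct mem_read_stream *st = (struct mem_read_stream *)_st;
	size_t left = st->base.size - st->pos;

	if (sz > left)
		sz = left;
	memcpy(out, st->buf + st->pos, sz);
	st->pos += sz;
	return (ssize_t)sz; /* 0 signals end of stream */
}

static int mem_close(struct odb_read_stream *_st)
{
	(void)_st; /* nothing backend-specific to release */
	return 0;
}

/* The backend allocates the stream itself and hands out the base pointer. */
static int mem_read_stream_open(struct odb_read_stream **out, const char *data)
{
	struct mem_read_stream *st;

	assert(out && data);
	st = calloc(1, sizeof(*st));
	if (!st)
		return -1;
	st->base.close = mem_close;
	st->base.read = mem_read;
	st->base.size = strlen(data);
	st->buf = data;
	*out = &st->base;
	return 0;
}

/* Generic caller: only knows the base struct, never the backend type. */
static size_t drain(struct odb_read_stream *st, char *dst, size_t dstsz)
{
	char chunk[4];
	size_t total = 0;
	ssize_t n;

	while ((n = st->read(st, chunk, sizeof(chunk))) > 0) {
		if (total + (size_t)n > dstsz)
			break;
		memcpy(dst + total, chunk, (size_t)n);
		total += (size_t)n;
	}
	st->close(st);
	free(st);
	return total;
}
```

With the definition kept private to streaming.c, a backend in another
file could neither declare the embedded `base` member nor assign the
callbacks.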

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 streaming.c | 11 -----------
 streaming.h | 15 ++++++++++++++-
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/streaming.c b/streaming.c
index 807a6e03a85b49..0635b7c12e2233 100644
--- a/streaming.c
+++ b/streaming.c
@@ -12,19 +12,8 @@
 #include "replace-object.h"
 #include "packfile.h"
 
-typedef int (*close_istream_fn)(struct odb_read_stream *);
-typedef ssize_t (*read_istream_fn)(struct odb_read_stream *, char *, size_t);
-
 #define FILTER_BUFFER (1024*16)
 
-struct odb_read_stream {
-	close_istream_fn close;
-	read_istream_fn read;
-
-	enum object_type type;
-	unsigned long size; /* inflated size of full object */
-};
-
 /*****************************************************************
  *
  * Filtered stream
diff --git a/streaming.h b/streaming.h
index 148f6b30697ab7..acfdef1598db52 100644
--- a/streaming.h
+++ b/streaming.h
@@ -7,10 +7,23 @@
 #include "object.h"
 
 struct object_database;
-/* opaque */
 struct odb_read_stream;
 struct stream_filter;
 
+typedef int (*odb_read_stream_close_fn)(struct odb_read_stream *);
+typedef ssize_t (*odb_read_stream_read_fn)(struct odb_read_stream *, char *, size_t);
+
+/*
+ * A stream that can be used to read an object from the object database without
+ * loading all of it into memory.
+ */
+struct odb_read_stream {
+	odb_read_stream_close_fn close;
+	odb_read_stream_read_fn read;
+	enum object_type type;
+	unsigned long size; /* inflated size of full object */
+};
+
 struct odb_read_stream *open_istream(struct repository *, const struct object_id *,
 				     enum object_type *, unsigned long *,
 				     struct stream_filter *);

From bc30a2f5dff6dd39966819ca3771ab5e9e072123 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sun, 23 Nov 2025 19:59:40 +0100
Subject: [PATCH 145/553] streaming: move logic to read loose objects streams
 into backend

Move the logic to read loose object streams into the respective
subsystem. This allows us to make a couple of function declarations
private.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 object-file.c | 167 +++++++++++++++++++++++++++++++++++++++++++++++---
 object-file.h |  42 ++-----------
 streaming.c   | 133 +---------------------------------------
 3 files changed, 164 insertions(+), 178 deletions(-)

diff --git a/object-file.c b/object-file.c
index b62b21a45289fc..8c67847feaceb6 100644
--- a/object-file.c
+++ b/object-file.c
@@ -234,9 +234,9 @@ static void *map_fd(int fd, const char *path, unsigned long *size)
 	return map;
 }
 
-void *odb_source_loose_map_object(struct odb_source *source,
-				  const struct object_id *oid,
-				  unsigned long *size)
+static void *odb_source_loose_map_object(struct odb_source *source,
+					 const struct object_id *oid,
+					 unsigned long *size)
 {
 	const char *p;
 	int fd = open_loose_object(source->loose, oid, &p);
@@ -246,11 +246,29 @@ void *odb_source_loose_map_object(struct odb_source *source,
 	return map_fd(fd, p, size);
 }
 
-enum unpack_loose_header_result unpack_loose_header(git_zstream *stream,
-						    unsigned char *map,
-						    unsigned long mapsize,
-						    void *buffer,
-						    unsigned long bufsiz)
+enum unpack_loose_header_result {
+	ULHR_OK,
+	ULHR_BAD,
+	ULHR_TOO_LONG,
+};
+
+/**
+ * unpack_loose_header() initializes the data stream needed to unpack
+ * a loose object header.
+ *
+ * Returns:
+ *
+ * - ULHR_OK on success
+ * - ULHR_BAD on error
+ * - ULHR_TOO_LONG if the header was too long
+ *
+ * It will only parse up to MAX_HEADER_LEN bytes.
+ */
+static enum unpack_loose_header_result unpack_loose_header(git_zstream *stream,
+							   unsigned char *map,
+							   unsigned long mapsize,
+							   void *buffer,
+							   unsigned long bufsiz)
 {
 	int status;
 
@@ -329,11 +347,18 @@ static void *unpack_loose_rest(git_zstream *stream,
 }
 
 /*
+ * parse_loose_header() parses the starting "<type> <len>\0" of an
+ * object. If it doesn't follow that format -1 is returned. To check
+ * the validity of the <type> populate the "typep" in the "struct
+ * object_info". It will be OBJ_BAD if the object type is unknown. The
+ * parsed <len> can be retrieved via "oi->sizep", and from there
+ * passed to unpack_loose_rest().
+ *
  * We used to just use "sscanf()", but that's actually way
  * too permissive for what we want to check. So do an anal
  * object header parse by hand.
  */
-int parse_loose_header(const char *hdr, struct object_info *oi)
+static int parse_loose_header(const char *hdr, struct object_info *oi)
 {
 	const char *type_buf = hdr;
 	size_t size;
@@ -1976,3 +2001,127 @@ void odb_source_loose_free(struct odb_source_loose *loose)
 	loose_object_map_clear(&loose->map);
 	free(loose);
 }
+
+struct odb_loose_read_stream {
+	struct odb_read_stream base;
+	git_zstream z;
+	enum {
+		ODB_LOOSE_READ_STREAM_INUSE,
+		ODB_LOOSE_READ_STREAM_DONE,
+		ODB_LOOSE_READ_STREAM_ERROR,
+	} z_state;
+	void *mapped;
+	unsigned long mapsize;
+	char hdr[32];
+	int hdr_avail;
+	int hdr_used;
+};
+
+static ssize_t read_istream_loose(struct odb_read_stream *_st, char *buf, size_t sz)
+{
+	struct odb_loose_read_stream *st = (struct odb_loose_read_stream *)_st;
+	size_t total_read = 0;
+
+	switch (st->z_state) {
+	case ODB_LOOSE_READ_STREAM_DONE:
+		return 0;
+	case ODB_LOOSE_READ_STREAM_ERROR:
+		return -1;
+	default:
+		break;
+	}
+
+	if (st->hdr_used < st->hdr_avail) {
+		size_t to_copy = st->hdr_avail - st->hdr_used;
+		if (sz < to_copy)
+			to_copy = sz;
+		memcpy(buf, st->hdr + st->hdr_used, to_copy);
+		st->hdr_used += to_copy;
+		total_read += to_copy;
+	}
+
+	while (total_read < sz) {
+		int status;
+
+		st->z.next_out = (unsigned char *)buf + total_read;
+		st->z.avail_out = sz - total_read;
+		status = git_inflate(&st->z, Z_FINISH);
+
+		total_read = st->z.next_out - (unsigned char *)buf;
+
+		if (status == Z_STREAM_END) {
+			git_inflate_end(&st->z);
+			st->z_state = ODB_LOOSE_READ_STREAM_DONE;
+			break;
+		}
+		if (status != Z_OK && (status != Z_BUF_ERROR || total_read < sz)) {
+			git_inflate_end(&st->z);
+			st->z_state = ODB_LOOSE_READ_STREAM_ERROR;
+			return -1;
+		}
+	}
+	return total_read;
+}
+
+static int close_istream_loose(struct odb_read_stream *_st)
+{
+	struct odb_loose_read_stream *st = (struct odb_loose_read_stream *)_st;
+	if (st->z_state == ODB_LOOSE_READ_STREAM_INUSE)
+		git_inflate_end(&st->z);
+	munmap(st->mapped, st->mapsize);
+	return 0;
+}
+
+int odb_source_loose_read_object_stream(struct odb_read_stream **out,
+					struct odb_source *source,
+					const struct object_id *oid)
+{
+	struct object_info oi = OBJECT_INFO_INIT;
+	struct odb_loose_read_stream *st;
+	unsigned long mapsize;
+	void *mapped;
+
+	mapped = odb_source_loose_map_object(source, oid, &mapsize);
+	if (!mapped)
+		return -1;
+
+	/*
+	 * Note: we must allocate this structure early even though we may still
+	 * fail. This is because we need to initialize the zlib stream, and it
+	 * is not possible to copy the stream around after the fact because it
+	 * has self-referencing pointers.
+	 */
+	CALLOC_ARRAY(st, 1);
+
+	switch (unpack_loose_header(&st->z, mapped, mapsize, st->hdr,
+				    sizeof(st->hdr))) {
+	case ULHR_OK:
+		break;
+	case ULHR_BAD:
+	case ULHR_TOO_LONG:
+		goto error;
+	}
+
+	oi.sizep = &st->base.size;
+	oi.typep = &st->base.type;
+
+	if (parse_loose_header(st->hdr, &oi) < 0 || st->base.type < 0)
+		goto error;
+
+	st->mapped = mapped;
+	st->mapsize = mapsize;
+	st->hdr_used = strlen(st->hdr) + 1;
+	st->hdr_avail = st->z.total_out;
+	st->z_state = ODB_LOOSE_READ_STREAM_INUSE;
+	st->base.close = close_istream_loose;
+	st->base.read = read_istream_loose;
+
+	*out = &st->base;
+
+	return 0;
+error:
+	git_inflate_end(&st->z);
+	munmap(st->mapped, st->mapsize);
+	free(st);
+	return -1;
+}
diff --git a/object-file.h b/object-file.h
index eeffa67bbda631..1229d5f675b44a 100644
--- a/object-file.h
+++ b/object-file.h
@@ -16,6 +16,8 @@ enum {
 int index_fd(struct index_state *istate, struct object_id *oid, int fd, struct stat *st, enum object_type type, const char *path, unsigned flags);
 int index_path(struct index_state *istate, struct object_id *oid, const char *path, struct stat *st, unsigned flags);
 
+struct object_info;
+struct odb_read_stream;
 struct odb_source;
 
 struct odb_source_loose {
@@ -47,9 +49,9 @@ int odb_source_loose_read_object_info(struct odb_source *source,
 				      const struct object_id *oid,
 				      struct object_info *oi, int flags);
 
-void *odb_source_loose_map_object(struct odb_source *source,
-				  const struct object_id *oid,
-				  unsigned long *size);
+int odb_source_loose_read_object_stream(struct odb_read_stream **out,
+					struct odb_source *source,
+					const struct object_id *oid);
 
 /*
  * Return true iff an object database source has a loose object
@@ -143,40 +145,6 @@ int for_each_loose_object(struct object_database *odb,
 int format_object_header(char *str, size_t size, enum object_type type,
 			 size_t objsize);
 
-/**
- * unpack_loose_header() initializes the data stream needed to unpack
- * a loose object header.
- *
- * Returns:
- *
- * - ULHR_OK on success
- * - ULHR_BAD on error
- * - ULHR_TOO_LONG if the header was too long
- *
- * It will only parse up to MAX_HEADER_LEN bytes.
- */
-enum unpack_loose_header_result {
-	ULHR_OK,
-	ULHR_BAD,
-	ULHR_TOO_LONG,
-};
-enum unpack_loose_header_result unpack_loose_header(git_zstream *stream,
-						    unsigned char *map,
-						    unsigned long mapsize,
-						    void *buffer,
-						    unsigned long bufsiz);
-
-/**
- * parse_loose_header() parses the starting "<type> <len>\0" of an
- * object. If it doesn't follow that format -1 is returned. To check
- * the validity of the <type> populate the "typep" in the "struct
- * object_info". It will be OBJ_BAD if the object type is unknown. The
- * parsed <len> can be retrieved via "oi->sizep", and from there
- * passed to unpack_loose_rest().
- */
-struct object_info;
-int parse_loose_header(const char *hdr, struct object_info *oi);
-
 int force_object_loose(struct odb_source *source,
 		       const struct object_id *oid, time_t mtime);
 
diff --git a/streaming.c b/streaming.c
index 0635b7c12e2233..d5acc1c39650e4 100644
--- a/streaming.c
+++ b/streaming.c
@@ -114,137 +114,6 @@ static struct odb_read_stream *attach_stream_filter(struct odb_read_stream *st,
 	return &fs->base;
 }
 
-/*****************************************************************
- *
- * Loose object stream
- *
- *****************************************************************/
-
-struct odb_loose_read_stream {
-	struct odb_read_stream base;
-	git_zstream z;
-	enum {
-		ODB_LOOSE_READ_STREAM_INUSE,
-		ODB_LOOSE_READ_STREAM_DONE,
-		ODB_LOOSE_READ_STREAM_ERROR,
-	} z_state;
-	void *mapped;
-	unsigned long mapsize;
-	char hdr[32];
-	int hdr_avail;
-	int hdr_used;
-};
-
-static ssize_t read_istream_loose(struct odb_read_stream *_st, char *buf, size_t sz)
-{
-	struct odb_loose_read_stream *st = (struct odb_loose_read_stream *)_st;
-	size_t total_read = 0;
-
-	switch (st->z_state) {
-	case ODB_LOOSE_READ_STREAM_DONE:
-		return 0;
-	case ODB_LOOSE_READ_STREAM_ERROR:
-		return -1;
-	default:
-		break;
-	}
-
-	if (st->hdr_used < st->hdr_avail) {
-		size_t to_copy = st->hdr_avail - st->hdr_used;
-		if (sz < to_copy)
-			to_copy = sz;
-		memcpy(buf, st->hdr + st->hdr_used, to_copy);
-		st->hdr_used += to_copy;
-		total_read += to_copy;
-	}
-
-	while (total_read < sz) {
-		int status;
-
-		st->z.next_out = (unsigned char *)buf + total_read;
-		st->z.avail_out = sz - total_read;
-		status = git_inflate(&st->z, Z_FINISH);
-
-		total_read = st->z.next_out - (unsigned char *)buf;
-
-		if (status == Z_STREAM_END) {
-			git_inflate_end(&st->z);
-			st->z_state = ODB_LOOSE_READ_STREAM_DONE;
-			break;
-		}
-		if (status != Z_OK && (status != Z_BUF_ERROR || total_read < sz)) {
-			git_inflate_end(&st->z);
-			st->z_state = ODB_LOOSE_READ_STREAM_ERROR;
-			return -1;
-		}
-	}
-	return total_read;
-}
-
-static int close_istream_loose(struct odb_read_stream *_st)
-{
-	struct odb_loose_read_stream *st = (struct odb_loose_read_stream *)_st;
-	if (st->z_state == ODB_LOOSE_READ_STREAM_INUSE)
-		git_inflate_end(&st->z);
-	munmap(st->mapped, st->mapsize);
-	return 0;
-}
-
-static int open_istream_loose(struct odb_read_stream **out,
-			      struct odb_source *source,
-			      const struct object_id *oid)
-{
-	struct object_info oi = OBJECT_INFO_INIT;
-	struct odb_loose_read_stream *st;
-	unsigned long mapsize;
-	void *mapped;
-
-	mapped = odb_source_loose_map_object(source, oid, &mapsize);
-	if (!mapped)
-		return -1;
-
-	/*
-	 * Note: we must allocate this structure early even though we may still
-	 * fail. This is because we need to initialize the zlib stream, and it
-	 * is not possible to copy the stream around after the fact because it
-	 * has self-referencing pointers.
-	 */
-	CALLOC_ARRAY(st, 1);
-
-	switch (unpack_loose_header(&st->z, mapped, mapsize, st->hdr,
-				    sizeof(st->hdr))) {
-	case ULHR_OK:
-		break;
-	case ULHR_BAD:
-	case ULHR_TOO_LONG:
-		goto error;
-	}
-
-	oi.sizep = &st->base.size;
-	oi.typep = &st->base.type;
-
-	if (parse_loose_header(st->hdr, &oi) < 0 || st->base.type < 0)
-		goto error;
-
-	st->mapped = mapped;
-	st->mapsize = mapsize;
-	st->hdr_used = strlen(st->hdr) + 1;
-	st->hdr_avail = st->z.total_out;
-	st->z_state = ODB_LOOSE_READ_STREAM_INUSE;
-	st->base.close = close_istream_loose;
-	st->base.read = read_istream_loose;
-
-	*out = &st->base;
-
-	return 0;
-error:
-	git_inflate_end(&st->z);
-	munmap(st->mapped, st->mapsize);
-	free(st);
-	return -1;
-}
-
-
 /*****************************************************************
  *
  * Non-delta packed object stream
@@ -455,7 +324,7 @@ static int istream_source(struct odb_read_stream **out,
 
 	odb_prepare_alternates(r->objects);
 	for (source = r->objects->sources; source; source = source->next)
-		if (!open_istream_loose(out, source, oid))
+		if (!odb_source_loose_read_object_stream(out, source, oid))
 			return 0;
 
 	return open_istream_incore(out, r, oid);

From 8c1b84bc977bf1e4515efe0386de87257ec28689 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sun, 23 Nov 2025 19:59:41 +0100
Subject: [PATCH 146/553] streaming: move logic to read packed objects streams
 into backend

Move the logic to read packed object streams into the respective
subsystem.
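
With both backends moved, `istream_source()` in streaming.c is left as
a chain of attempts: the packfile store first, then each loose source
in turn, then the in-core fallback. The following is a rough sketch of
that control flow only. All `toy_*` names are hypothetical stand-ins
rather than Git APIs, and the real packed path additionally declines
deltified objects and objects that do not exceed the big file
threshold.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical source: a name plus a NULL-terminated list of object IDs. */
struct toy_source {
	const char *name;
	const char *const *ids;
	struct toy_source *next;
};

static int toy_source_has(const struct toy_source *src, const char *id)
{
	for (size_t i = 0; src->ids[i]; i++)
		if (!strcmp(src->ids[i], id))
			return 1;
	return 0;
}

/* Like the backends in the series: 0 on success, -1 if it cannot serve. */
static int toy_open_from(const char **out, const struct toy_source *src,
			 const char *id)
{
	if (!toy_source_has(src, id))
		return -1;
	*out = src->name;
	return 0;
}

/* Last resort that always succeeds in this sketch. */
static int toy_open_incore(const char **out)
{
	*out = "incore";
	return 0;
}

/* Mirrors the shape of istream_source(): packed, loose sources, in-core. */
static int toy_istream_source(const char **out,
			      const struct toy_source *packs,
			      const struct toy_source *loose_sources,
			      const char *id)
{
	const struct toy_source *source;

	if (!toy_open_from(out, packs, id))
		return 0;

	for (source = loose_sources; source; source = source->next)
		if (!toy_open_from(out, source, id))
			return 0;

	return toy_open_incore(out);
}
```

Because every opener shares the same return convention, the dispatcher
does not need to know which backend ends up serving the object.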

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 packfile.c  | 128 +++++++++++++++++++++++++++++++++++++++++++++++++
 packfile.h  |   5 ++
 streaming.c | 136 +---------------------------------------------------
 3 files changed, 134 insertions(+), 135 deletions(-)

diff --git a/packfile.c b/packfile.c
index b4bc40d895c8da..ad56ce0b905c0d 100644
--- a/packfile.c
+++ b/packfile.c
@@ -20,6 +20,7 @@
 #include "tree.h"
 #include "object-file.h"
 #include "odb.h"
+#include "streaming.h"
 #include "midx.h"
 #include "commit-graph.h"
 #include "pack-revindex.h"
@@ -2406,3 +2407,130 @@ void packfile_store_close(struct packfile_store *store)
 		close_pack(p);
 	}
 }
+
+struct odb_packed_read_stream {
+	struct odb_read_stream base;
+	struct packed_git *pack;
+	git_zstream z;
+	enum {
+		ODB_PACKED_READ_STREAM_UNINITIALIZED,
+		ODB_PACKED_READ_STREAM_INUSE,
+		ODB_PACKED_READ_STREAM_DONE,
+		ODB_PACKED_READ_STREAM_ERROR,
+	} z_state;
+	off_t pos;
+};
+
+static ssize_t read_istream_pack_non_delta(struct odb_read_stream *_st, char *buf,
+					   size_t sz)
+{
+	struct odb_packed_read_stream *st = (struct odb_packed_read_stream *)_st;
+	size_t total_read = 0;
+
+	switch (st->z_state) {
+	case ODB_PACKED_READ_STREAM_UNINITIALIZED:
+		memset(&st->z, 0, sizeof(st->z));
+		git_inflate_init(&st->z);
+		st->z_state = ODB_PACKED_READ_STREAM_INUSE;
+		break;
+	case ODB_PACKED_READ_STREAM_DONE:
+		return 0;
+	case ODB_PACKED_READ_STREAM_ERROR:
+		return -1;
+	case ODB_PACKED_READ_STREAM_INUSE:
+		break;
+	}
+
+	while (total_read < sz) {
+		int status;
+		struct pack_window *window = NULL;
+		unsigned char *mapped;
+
+		mapped = use_pack(st->pack, &window,
+				  st->pos, &st->z.avail_in);
+
+		st->z.next_out = (unsigned char *)buf + total_read;
+		st->z.avail_out = sz - total_read;
+		st->z.next_in = mapped;
+		status = git_inflate(&st->z, Z_FINISH);
+
+		st->pos += st->z.next_in - mapped;
+		total_read = st->z.next_out - (unsigned char *)buf;
+		unuse_pack(&window);
+
+		if (status == Z_STREAM_END) {
+			git_inflate_end(&st->z);
+			st->z_state = ODB_PACKED_READ_STREAM_DONE;
+			break;
+		}
+
+		/*
+		 * Unlike the loose object case, we do not have to worry here
+		 * about running out of input bytes and spinning infinitely. If
+		 * we get Z_BUF_ERROR due to too few input bytes, then we'll
+		 * replenish them in the next use_pack() call when we loop. If
+		 * we truly hit the end of the pack (i.e., because it's corrupt
+		 * or truncated), then use_pack() catches that and will die().
+		 */
+		if (status != Z_OK && status != Z_BUF_ERROR) {
+			git_inflate_end(&st->z);
+			st->z_state = ODB_PACKED_READ_STREAM_ERROR;
+			return -1;
+		}
+	}
+	return total_read;
+}
+
+static int close_istream_pack_non_delta(struct odb_read_stream *_st)
+{
+	struct odb_packed_read_stream *st = (struct odb_packed_read_stream *)_st;
+	if (st->z_state == ODB_PACKED_READ_STREAM_INUSE)
+		git_inflate_end(&st->z);
+	return 0;
+}
+
+int packfile_store_read_object_stream(struct odb_read_stream **out,
+				      struct packfile_store *store,
+				      const struct object_id *oid)
+{
+	struct odb_packed_read_stream *stream;
+	struct pack_window *window = NULL;
+	struct object_info oi = OBJECT_INFO_INIT;
+	enum object_type in_pack_type;
+	unsigned long size;
+
+	oi.sizep = &size;
+
+	if (packfile_store_read_object_info(store, oid, &oi, 0) ||
+	    oi.u.packed.is_delta ||
+	    repo_settings_get_big_file_threshold(store->odb->repo) >= size)
+		return -1;
+
+	in_pack_type = unpack_object_header(oi.u.packed.pack,
+					    &window,
+					    &oi.u.packed.offset,
+					    &size);
+	unuse_pack(&window);
+	switch (in_pack_type) {
+	default:
+		return -1; /* we do not do deltas for now */
+	case OBJ_COMMIT:
+	case OBJ_TREE:
+	case OBJ_BLOB:
+	case OBJ_TAG:
+		break;
+	}
+
+	CALLOC_ARRAY(stream, 1);
+	stream->base.close = close_istream_pack_non_delta;
+	stream->base.read = read_istream_pack_non_delta;
+	stream->base.type = in_pack_type;
+	stream->base.size = size;
+	stream->z_state = ODB_PACKED_READ_STREAM_UNINITIALIZED;
+	stream->pack = oi.u.packed.pack;
+	stream->pos = oi.u.packed.offset;
+
+	*out = &stream->base;
+
+	return 0;
+}
diff --git a/packfile.h b/packfile.h
index 0a98bddd811921..3fcc5ae6e08c4b 100644
--- a/packfile.h
+++ b/packfile.h
@@ -8,6 +8,7 @@
 
 /* in odb.h */
 struct object_info;
+struct odb_read_stream;
 
 struct packed_git {
 	struct hashmap_entry packmap_ent;
@@ -144,6 +145,10 @@ void packfile_store_add_pack(struct packfile_store *store,
 #define repo_for_each_pack(repo, p) \
 	for (p = packfile_store_get_packs(repo->objects->packfiles); p; p = p->next)
 
+int packfile_store_read_object_stream(struct odb_read_stream **out,
+				      struct packfile_store *store,
+				      const struct object_id *oid);
+
 /*
  * Try to read the object identified by its ID from the object store and
  * populate the object info with its data. Returns 1 in case the object was
diff --git a/streaming.c b/streaming.c
index d5acc1c39650e4..3140728a70bde7 100644
--- a/streaming.c
+++ b/streaming.c
@@ -114,140 +114,6 @@ static struct odb_read_stream *attach_stream_filter(struct odb_read_stream *st,
 	return &fs->base;
 }
 
-/*****************************************************************
- *
- * Non-delta packed object stream
- *
- *****************************************************************/
-
-struct odb_packed_read_stream {
-	struct odb_read_stream base;
-	struct packed_git *pack;
-	git_zstream z;
-	enum {
-		ODB_PACKED_READ_STREAM_UNINITIALIZED,
-		ODB_PACKED_READ_STREAM_INUSE,
-		ODB_PACKED_READ_STREAM_DONE,
-		ODB_PACKED_READ_STREAM_ERROR,
-	} z_state;
-	off_t pos;
-};
-
-static ssize_t read_istream_pack_non_delta(struct odb_read_stream *_st, char *buf,
-					   size_t sz)
-{
-	struct odb_packed_read_stream *st = (struct odb_packed_read_stream *)_st;
-	size_t total_read = 0;
-
-	switch (st->z_state) {
-	case ODB_PACKED_READ_STREAM_UNINITIALIZED:
-		memset(&st->z, 0, sizeof(st->z));
-		git_inflate_init(&st->z);
-		st->z_state = ODB_PACKED_READ_STREAM_INUSE;
-		break;
-	case ODB_PACKED_READ_STREAM_DONE:
-		return 0;
-	case ODB_PACKED_READ_STREAM_ERROR:
-		return -1;
-	case ODB_PACKED_READ_STREAM_INUSE:
-		break;
-	}
-
-	while (total_read < sz) {
-		int status;
-		struct pack_window *window = NULL;
-		unsigned char *mapped;
-
-		mapped = use_pack(st->pack, &window,
-				  st->pos, &st->z.avail_in);
-
-		st->z.next_out = (unsigned char *)buf + total_read;
-		st->z.avail_out = sz - total_read;
-		st->z.next_in = mapped;
-		status = git_inflate(&st->z, Z_FINISH);
-
-		st->pos += st->z.next_in - mapped;
-		total_read = st->z.next_out - (unsigned char *)buf;
-		unuse_pack(&window);
-
-		if (status == Z_STREAM_END) {
-			git_inflate_end(&st->z);
-			st->z_state = ODB_PACKED_READ_STREAM_DONE;
-			break;
-		}
-
-		/*
-		 * Unlike the loose object case, we do not have to worry here
-		 * about running out of input bytes and spinning infinitely. If
-		 * we get Z_BUF_ERROR due to too few input bytes, then we'll
-		 * replenish them in the next use_pack() call when we loop. If
-		 * we truly hit the end of the pack (i.e., because it's corrupt
-		 * or truncated), then use_pack() catches that and will die().
-		 */
-		if (status != Z_OK && status != Z_BUF_ERROR) {
-			git_inflate_end(&st->z);
-			st->z_state = ODB_PACKED_READ_STREAM_ERROR;
-			return -1;
-		}
-	}
-	return total_read;
-}
-
-static int close_istream_pack_non_delta(struct odb_read_stream *_st)
-{
-	struct odb_packed_read_stream *st = (struct odb_packed_read_stream *)_st;
-	if (st->z_state == ODB_PACKED_READ_STREAM_INUSE)
-		git_inflate_end(&st->z);
-	return 0;
-}
-
-static int open_istream_pack_non_delta(struct odb_read_stream **out,
-				       struct object_database *odb,
-				       const struct object_id *oid)
-{
-	struct odb_packed_read_stream *stream;
-	struct pack_window *window = NULL;
-	struct object_info oi = OBJECT_INFO_INIT;
-	enum object_type in_pack_type;
-	unsigned long size;
-
-	oi.sizep = &size;
-
-	if (packfile_store_read_object_info(odb->packfiles, oid, &oi, 0) ||
-	    oi.u.packed.is_delta ||
-	    repo_settings_get_big_file_threshold(odb->repo) >= size)
-		return -1;
-
-	in_pack_type = unpack_object_header(oi.u.packed.pack,
-					    &window,
-					    &oi.u.packed.offset,
-					    &size);
-	unuse_pack(&window);
-	switch (in_pack_type) {
-	default:
-		return -1; /* we do not do deltas for now */
-	case OBJ_COMMIT:
-	case OBJ_TREE:
-	case OBJ_BLOB:
-	case OBJ_TAG:
-		break;
-	}
-
-	CALLOC_ARRAY(stream, 1);
-	stream->base.close = close_istream_pack_non_delta;
-	stream->base.read = read_istream_pack_non_delta;
-	stream->base.type = in_pack_type;
-	stream->base.size = size;
-	stream->z_state = ODB_PACKED_READ_STREAM_UNINITIALIZED;
-	stream->pack = oi.u.packed.pack;
-	stream->pos = oi.u.packed.offset;
-
-	*out = &stream->base;
-
-	return 0;
-}
-
-
 /*****************************************************************
  *
  * In-core stream
@@ -319,7 +185,7 @@ static int istream_source(struct odb_read_stream **out,
 {
 	struct odb_source *source;
 
-	if (!open_istream_pack_non_delta(out, r->objects, oid))
+	if (!packfile_store_read_object_stream(out, r->objects->packfiles, oid))
 		return 0;
 
 	odb_prepare_alternates(r->objects);

From 378ec56beba161abbef6e2c87d9bc2ac43c355f3 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sun, 23 Nov 2025 19:59:42 +0100
Subject: [PATCH 147/553] streaming: refactor interface to be
 object-database-centric

Refactor the streaming interface to be centered around the object
database instead of the repository. Rename the functions accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 archive-tar.c          |  6 +++---
 archive-zip.c          | 12 ++++++------
 builtin/index-pack.c   |  8 ++++----
 builtin/pack-objects.c | 14 +++++++-------
 object-file.c          |  8 ++++----
 streaming.c            | 44 +++++++++++++++++++++---------------------
 streaming.h            | 30 +++++++++++++++++++++++-----
 7 files changed, 71 insertions(+), 51 deletions(-)
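
Reviewer note: the renamed functions keep the usual open/read/close
calling pattern, where read returns 0 on EOF, a positive byte count on
success and a negative value on error. The sketch below illustrates that
pattern in isolation. It is not Git code: every `toy_*` name is a
hypothetical stand-in, and the in-memory backend only loosely mirrors
the function-pointer dispatch visible in streaming.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h> /* ssize_t */

/* Hypothetical stand-in for `struct odb_read_stream`; not Git's definition. */
struct toy_stream {
	int (*close)(struct toy_stream *);
	ssize_t (*read)(struct toy_stream *, char *, size_t);
	unsigned long size; /* full size of the object */
};

/* Hypothetical backend that serves bytes from a caller-owned buffer. */
struct toy_incore_stream {
	struct toy_stream base;
	const char *buf;
	size_t pos;
};

static ssize_t toy_incore_read(struct toy_stream *_st, char *buf, size_t sz)
{
	struct toy_incore_stream *st = (struct toy_incore_stream *)_st;
	size_t remainder = st->base.size - st->pos;
	size_t n = sz < remainder ? sz : remainder;

	memcpy(buf, st->buf + st->pos, n);
	st->pos += n;
	return (ssize_t)n; /* 0 once everything has been consumed */
}

static int toy_incore_close(struct toy_stream *st)
{
	(void)st; /* nothing backend-specific to release */
	return 0;
}

struct toy_stream *toy_stream_open(const char *data, size_t len)
{
	struct toy_incore_stream *st = calloc(1, sizeof(*st));

	if (!st)
		return NULL;
	st->base.close = toy_incore_close;
	st->base.read = toy_incore_read;
	st->base.size = len;
	st->buf = data;
	return &st->base;
}

/* Dispatch through the backend, as the public wrappers in the patch do. */
ssize_t toy_stream_read(struct toy_stream *st, void *buf, size_t sz)
{
	assert(st && st->read);
	return st->read(st, buf, sz);
}

int toy_stream_close(struct toy_stream *st)
{
	int r = st->close(st);

	free(st);
	return r;
}

/* Caller-side loop: copy the whole stream into `out`, or -1 on failure. */
ssize_t toy_stream_drain(struct toy_stream *st, char *out, size_t cap)
{
	size_t total = 0;

	for (;;) {
		char buf[4]; /* deliberately tiny to force several reads */
		ssize_t readlen = toy_stream_read(st, buf, sizeof(buf));

		if (readlen < 0)
			return -1;
		if (!readlen)
			break;
		if (total + (size_t)readlen > cap)
			return -1;
		memcpy(out + total, buf, (size_t)readlen);
		total += (size_t)readlen;
	}
	return (ssize_t)total;
}
```

A caller would open a stream, drain it, and close it in that order,
checking the return value of each step, much like stream_blocked() in
archive-tar.c does with the real API.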

diff --git a/archive-tar.c b/archive-tar.c
index dc1eda09e01e2b..4d87b28504615a 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -135,16 +135,16 @@ static int stream_blocked(struct repository *r, const struct object_id *oid)
 	char buf[BLOCKSIZE];
 	ssize_t readlen;
 
-	st = open_istream(r, oid, &type, &sz, NULL);
+	st = odb_read_stream_open(r->objects, oid, &type, &sz, NULL);
 	if (!st)
 		return error(_("cannot stream blob %s"), oid_to_hex(oid));
 	for (;;) {
-		readlen = read_istream(st, buf, sizeof(buf));
+		readlen = odb_read_stream_read(st, buf, sizeof(buf));
 		if (readlen <= 0)
 			break;
 		do_write_blocked(buf, readlen);
 	}
-	close_istream(st);
+	odb_read_stream_close(st);
 	if (!readlen)
 		finish_record();
 	return readlen;
diff --git a/archive-zip.c b/archive-zip.c
index 40a9c93ff95233..c44684aebcf18d 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -348,8 +348,8 @@ static int write_zip_entry(struct archiver_args *args,
 
 		if (!buffer) {
 			enum object_type type;
-			stream = open_istream(args->repo, oid, &type, &size,
-					      NULL);
+			stream = odb_read_stream_open(args->repo->objects, oid,
+						      &type, &size, NULL);
 			if (!stream)
 				return error(_("cannot stream blob %s"),
 					     oid_to_hex(oid));
@@ -429,7 +429,7 @@ static int write_zip_entry(struct archiver_args *args,
 		ssize_t readlen;
 
 		for (;;) {
-			readlen = read_istream(stream, buf, sizeof(buf));
+			readlen = odb_read_stream_read(stream, buf, sizeof(buf));
 			if (readlen <= 0)
 				break;
 			crc = crc32(crc, buf, readlen);
@@ -439,7 +439,7 @@ static int write_zip_entry(struct archiver_args *args,
 							    buf, readlen);
 			write_or_die(1, buf, readlen);
 		}
-		close_istream(stream);
+		odb_read_stream_close(stream);
 		if (readlen)
 			return readlen;
 
@@ -462,7 +462,7 @@ static int write_zip_entry(struct archiver_args *args,
 		zstream.avail_out = sizeof(compressed);
 
 		for (;;) {
-			readlen = read_istream(stream, buf, sizeof(buf));
+			readlen = odb_read_stream_read(stream, buf, sizeof(buf));
 			if (readlen <= 0)
 				break;
 			crc = crc32(crc, buf, readlen);
@@ -486,7 +486,7 @@ static int write_zip_entry(struct archiver_args *args,
 			}
 
 		}
-		close_istream(stream);
+		odb_read_stream_close(stream);
 		if (readlen)
 			return readlen;
 
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 5f90f12f92d9c4..fb76ef0f4c17c3 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -779,7 +779,7 @@ static int compare_objects(const unsigned char *buf, unsigned long size,
 	}
 
 	while (size) {
-		ssize_t len = read_istream(data->st, data->buf, size);
+		ssize_t len = odb_read_stream_read(data->st, data->buf, size);
 		if (len == 0)
 			die(_("SHA1 COLLISION FOUND WITH %s !"),
 			    oid_to_hex(&data->entry->idx.oid));
@@ -807,15 +807,15 @@ static int check_collison(struct object_entry *entry)
 
 	memset(&data, 0, sizeof(data));
 	data.entry = entry;
-	data.st = open_istream(the_repository, &entry->idx.oid, &type, &size,
-			       NULL);
+	data.st = odb_read_stream_open(the_repository->objects, &entry->idx.oid,
+				       &type, &size, NULL);
 	if (!data.st)
 		return -1;
 	if (size != entry->size || type != entry->type)
 		die(_("SHA1 COLLISION FOUND WITH %s !"),
 		    oid_to_hex(&entry->idx.oid));
 	unpack_data(entry, compare_objects, &data);
-	close_istream(data.st);
+	odb_read_stream_close(data.st);
 	free(data.buf);
 	return 0;
 }
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index c693d948e193ed..1353c2384c336e 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -417,7 +417,7 @@ static unsigned long write_large_blob_data(struct odb_read_stream *st, struct ha
 	for (;;) {
 		ssize_t readlen;
 		int zret = Z_OK;
-		readlen = read_istream(st, ibuf, sizeof(ibuf));
+		readlen = odb_read_stream_read(st, ibuf, sizeof(ibuf));
 		if (readlen == -1)
 			die(_("unable to read %s"), oid_to_hex(oid));
 
@@ -520,8 +520,8 @@ static unsigned long write_no_reuse_object(struct hashfile *f, struct object_ent
 		if (oe_type(entry) == OBJ_BLOB &&
 		    oe_size_greater_than(&to_pack, entry,
 					 repo_settings_get_big_file_threshold(the_repository)) &&
-		    (st = open_istream(the_repository, &entry->idx.oid, &type,
-				       &size, NULL)) != NULL)
+		    (st = odb_read_stream_open(the_repository->objects, &entry->idx.oid,
+					       &type, &size, NULL)) != NULL)
 			buf = NULL;
 		else {
 			buf = odb_read_object(the_repository->objects,
@@ -577,7 +577,7 @@ static unsigned long write_no_reuse_object(struct hashfile *f, struct object_ent
 			dheader[--pos] = 128 | (--ofs & 127);
 		if (limit && hdrlen + sizeof(dheader) - pos + datalen + hashsz >= limit) {
 			if (st)
-				close_istream(st);
+				odb_read_stream_close(st);
 			free(buf);
 			return 0;
 		}
@@ -591,7 +591,7 @@ static unsigned long write_no_reuse_object(struct hashfile *f, struct object_ent
 		 */
 		if (limit && hdrlen + hashsz + datalen + hashsz >= limit) {
 			if (st)
-				close_istream(st);
+				odb_read_stream_close(st);
 			free(buf);
 			return 0;
 		}
@@ -601,7 +601,7 @@ static unsigned long write_no_reuse_object(struct hashfile *f, struct object_ent
 	} else {
 		if (limit && hdrlen + datalen + hashsz >= limit) {
 			if (st)
-				close_istream(st);
+				odb_read_stream_close(st);
 			free(buf);
 			return 0;
 		}
@@ -609,7 +609,7 @@ static unsigned long write_no_reuse_object(struct hashfile *f, struct object_ent
 	}
 	if (st) {
 		datalen = write_large_blob_data(st, f, &entry->idx.oid);
-		close_istream(st);
+		odb_read_stream_close(st);
 	} else {
 		hashwrite(f, buf, datalen);
 		free(buf);
diff --git a/object-file.c b/object-file.c
index 8c67847feaceb6..9ba40a848c034a 100644
--- a/object-file.c
+++ b/object-file.c
@@ -139,7 +139,7 @@ int stream_object_signature(struct repository *r, const struct object_id *oid)
 	char hdr[MAX_HEADER_LEN];
 	int hdrlen;
 
-	st = open_istream(r, oid, &obj_type, &size, NULL);
+	st = odb_read_stream_open(r->objects, oid, &obj_type, &size, NULL);
 	if (!st)
 		return -1;
 
@@ -151,10 +151,10 @@ int stream_object_signature(struct repository *r, const struct object_id *oid)
 	git_hash_update(&c, hdr, hdrlen);
 	for (;;) {
 		char buf[1024 * 16];
-		ssize_t readlen = read_istream(st, buf, sizeof(buf));
+		ssize_t readlen = odb_read_stream_read(st, buf, sizeof(buf));
 
 		if (readlen < 0) {
-			close_istream(st);
+			odb_read_stream_close(st);
 			return -1;
 		}
 		if (!readlen)
@@ -162,7 +162,7 @@ int stream_object_signature(struct repository *r, const struct object_id *oid)
 		git_hash_update(&c, buf, readlen);
 	}
 	git_hash_final_oid(&real_oid, &c);
-	close_istream(st);
+	odb_read_stream_close(st);
 	return !oideq(oid, &real_oid) ? -1 : 0;
 }
 
diff --git a/streaming.c b/streaming.c
index 3140728a70bde7..06993a751c6194 100644
--- a/streaming.c
+++ b/streaming.c
@@ -35,7 +35,7 @@ static int close_istream_filtered(struct odb_read_stream *_fs)
 {
 	struct odb_filtered_read_stream *fs = (struct odb_filtered_read_stream *)_fs;
 	free_stream_filter(fs->filter);
-	return close_istream(fs->upstream);
+	return odb_read_stream_close(fs->upstream);
 }
 
 static ssize_t read_istream_filtered(struct odb_read_stream *_fs, char *buf,
@@ -87,7 +87,7 @@ static ssize_t read_istream_filtered(struct odb_read_stream *_fs, char *buf,
 
 		/* refill the input from the upstream */
 		if (!fs->input_finished) {
-			fs->i_end = read_istream(fs->upstream, fs->ibuf, FILTER_BUFFER);
+			fs->i_end = odb_read_stream_read(fs->upstream, fs->ibuf, FILTER_BUFFER);
 			if (fs->i_end < 0)
 				return -1;
 			if (fs->i_end)
@@ -149,7 +149,7 @@ static ssize_t read_istream_incore(struct odb_read_stream *_st, char *buf, size_
 }
 
 static int open_istream_incore(struct odb_read_stream **out,
-			       struct repository *r,
+			       struct object_database *odb,
 			       const struct object_id *oid)
 {
 	struct object_info oi = OBJECT_INFO_INIT;
@@ -163,7 +163,7 @@ static int open_istream_incore(struct odb_read_stream **out,
 	oi.typep = &stream.base.type;
 	oi.sizep = &stream.base.size;
 	oi.contentp = (void **)&stream.buf;
-	ret = odb_read_object_info_extended(r->objects, oid, &oi,
+	ret = odb_read_object_info_extended(odb, oid, &oi,
 					    OBJECT_INFO_DIE_IF_CORRUPT);
 	if (ret)
 		return ret;
@@ -180,47 +180,47 @@ static int open_istream_incore(struct odb_read_stream **out,
  *****************************************************************************/
 
 static int istream_source(struct odb_read_stream **out,
-			  struct repository *r,
+			  struct object_database *odb,
 			  const struct object_id *oid)
 {
 	struct odb_source *source;
 
-	if (!packfile_store_read_object_stream(out, r->objects->packfiles, oid))
+	if (!packfile_store_read_object_stream(out, odb->packfiles, oid))
 		return 0;
 
-	odb_prepare_alternates(r->objects);
-	for (source = r->objects->sources; source; source = source->next)
+	odb_prepare_alternates(odb);
+	for (source = odb->sources; source; source = source->next)
 		if (!odb_source_loose_read_object_stream(out, source, oid))
 			return 0;
 
-	return open_istream_incore(out, r, oid);
+	return open_istream_incore(out, odb, oid);
 }
 
 /****************************************************************
  * Users of streaming interface
  ****************************************************************/
 
-int close_istream(struct odb_read_stream *st)
+int odb_read_stream_close(struct odb_read_stream *st)
 {
 	int r = st->close(st);
 	free(st);
 	return r;
 }
 
-ssize_t read_istream(struct odb_read_stream *st, void *buf, size_t sz)
+ssize_t odb_read_stream_read(struct odb_read_stream *st, void *buf, size_t sz)
 {
 	return st->read(st, buf, sz);
 }
 
-struct odb_read_stream *open_istream(struct repository *r,
-				     const struct object_id *oid,
-				     enum object_type *type,
-				     unsigned long *size,
-				     struct stream_filter *filter)
+struct odb_read_stream *odb_read_stream_open(struct object_database *odb,
+					     const struct object_id *oid,
+					     enum object_type *type,
+					     unsigned long *size,
+					     struct stream_filter *filter)
 {
 	struct odb_read_stream *st;
-	const struct object_id *real = lookup_replace_object(r, oid);
-	int ret = istream_source(&st, r, real);
+	const struct object_id *real = lookup_replace_object(odb->repo, oid);
+	int ret = istream_source(&st, odb, real);
 
 	if (ret)
 		return NULL;
@@ -229,7 +229,7 @@ struct odb_read_stream *open_istream(struct repository *r,
 		/* Add "&& !is_null_stream_filter(filter)" for performance */
 		struct odb_read_stream *nst = attach_stream_filter(st, filter);
 		if (!nst) {
-			close_istream(st);
+			odb_read_stream_close(st);
 			return NULL;
 		}
 		st = nst;
@@ -252,7 +252,7 @@ int odb_stream_blob_to_fd(struct object_database *odb,
 	ssize_t kept = 0;
 	int result = -1;
 
-	st = open_istream(odb->repo, oid, &type, &sz, filter);
+	st = odb_read_stream_open(odb, oid, &type, &sz, filter);
 	if (!st) {
 		if (filter)
 			free_stream_filter(filter);
@@ -263,7 +263,7 @@ int odb_stream_blob_to_fd(struct object_database *odb,
 	for (;;) {
 		char buf[1024 * 16];
 		ssize_t wrote, holeto;
-		ssize_t readlen = read_istream(st, buf, sizeof(buf));
+		ssize_t readlen = odb_read_stream_read(st, buf, sizeof(buf));
 
 		if (readlen < 0)
 			goto close_and_exit;
@@ -294,6 +294,6 @@ int odb_stream_blob_to_fd(struct object_database *odb,
 	result = 0;
 
  close_and_exit:
-	close_istream(st);
+	odb_read_stream_close(st);
 	return result;
 }
diff --git a/streaming.h b/streaming.h
index acfdef1598db52..7cb55213b780ff 100644
--- a/streaming.h
+++ b/streaming.h
@@ -24,11 +24,31 @@ struct odb_read_stream {
 	unsigned long size; /* inflated size of full object */
 };
 
-struct odb_read_stream *open_istream(struct repository *, const struct object_id *,
-				     enum object_type *, unsigned long *,
-				     struct stream_filter *);
-int close_istream(struct odb_read_stream *);
-ssize_t read_istream(struct odb_read_stream *, void *, size_t);
+/*
+ * Create a new object stream for the given object database. Populates the type
+ * and size pointers with the object's info. An optional filter can be used to
+ * transform the object's content.
+ *
+ * Returns the stream on success, a `NULL` pointer otherwise.
+ */
+struct odb_read_stream *odb_read_stream_open(struct object_database *odb,
+					     const struct object_id *oid,
+					     enum object_type *type,
+					     unsigned long *size,
+					     struct stream_filter *filter);
+
+/*
+ * Close the given read stream and release all resources associated with it.
+ * Returns 0 on success, a negative error code otherwise.
+ */
+int odb_read_stream_close(struct odb_read_stream *stream);
+
+/*
+ * Read data from the stream into the buffer. Returns 0 on EOF and the number
+ * of bytes read on success. Returns a negative error code in case reading from
+ * the stream fails.
+ */
+ssize_t odb_read_stream_read(struct odb_read_stream *stream, void *buf, size_t len);
 
 /*
  * Look up the object by its ID and write the full contents to the file

From 1599b68d5e960a12f5ac624f81c70ece317db5a6 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sun, 23 Nov 2025 19:59:43 +0100
Subject: [PATCH 148/553] streaming: move into object database subsystem

The "streaming" terminology is somewhat generic, so it may not be
immediately obvious that "streaming.{c,h}" is specific to the object
database. Rectify this by moving it into the "odb/" directory so that it
can be immediately attributed to the object subsystem.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Makefile                       | 2 +-
 archive-tar.c                  | 2 +-
 archive-zip.c                  | 2 +-
 builtin/cat-file.c             | 2 +-
 builtin/fsck.c                 | 2 +-
 builtin/index-pack.c           | 2 +-
 builtin/log.c                  | 2 +-
 builtin/pack-objects.c         | 2 +-
 entry.c                        | 2 +-
 meson.build                    | 2 +-
 object-file.c                  | 2 +-
 streaming.c => odb/streaming.c | 2 +-
 streaming.h => odb/streaming.h | 0
 packfile.c                     | 2 +-
 parallel-checkout.c            | 2 +-
 15 files changed, 14 insertions(+), 14 deletions(-)
 rename streaming.c => odb/streaming.c (99%)
 rename streaming.h => odb/streaming.h (100%)

diff --git a/Makefile b/Makefile
index 7e0f77e2988e3b..6d8dcc4622b059 100644
--- a/Makefile
+++ b/Makefile
@@ -1201,6 +1201,7 @@ LIB_OBJS += object-file.o
 LIB_OBJS += object-name.o
 LIB_OBJS += object.o
 LIB_OBJS += odb.o
+LIB_OBJS += odb/streaming.o
 LIB_OBJS += oid-array.o
 LIB_OBJS += oidmap.o
 LIB_OBJS += oidset.o
@@ -1294,7 +1295,6 @@ LIB_OBJS += split-index.o
 LIB_OBJS += stable-qsort.o
 LIB_OBJS += statinfo.o
 LIB_OBJS += strbuf.o
-LIB_OBJS += streaming.o
 LIB_OBJS += string-list.o
 LIB_OBJS += strmap.o
 LIB_OBJS += strvec.o
diff --git a/archive-tar.c b/archive-tar.c
index 4d87b28504615a..494b9f0667a523 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -12,8 +12,8 @@
 #include "tar.h"
 #include "archive.h"
 #include "odb.h"
+#include "odb/streaming.h"
 #include "strbuf.h"
-#include "streaming.h"
 #include "run-command.h"
 #include "write-or-die.h"
 
diff --git a/archive-zip.c b/archive-zip.c
index c44684aebcf18d..a0bdc2fe3b2e5e 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -10,9 +10,9 @@
 #include "gettext.h"
 #include "git-zlib.h"
 #include "hex.h"
-#include "streaming.h"
 #include "utf8.h"
 #include "odb.h"
+#include "odb/streaming.h"
 #include "strbuf.h"
 #include "userdiff.h"
 #include "write-or-die.h"
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 120d626d66e140..505ddaa12f5309 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -18,13 +18,13 @@
 #include "list-objects-filter-options.h"
 #include "parse-options.h"
 #include "userdiff.h"
-#include "streaming.h"
 #include "oid-array.h"
 #include "packfile.h"
 #include "pack-bitmap.h"
 #include "object-file.h"
 #include "object-name.h"
 #include "odb.h"
+#include "odb/streaming.h"
 #include "replace-object.h"
 #include "promisor-remote.h"
 #include "mailmap.h"
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 1a348d43c26020..c7d2eea287fe7d 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -13,11 +13,11 @@
 #include "fsck.h"
 #include "parse-options.h"
 #include "progress.h"
-#include "streaming.h"
 #include "packfile.h"
 #include "object-file.h"
 #include "object-name.h"
 #include "odb.h"
+#include "odb/streaming.h"
 #include "path.h"
 #include "read-cache-ll.h"
 #include "replace-object.h"
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index fb76ef0f4c17c3..581023495fdc9c 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -16,12 +16,12 @@
 #include "progress.h"
 #include "fsck.h"
 #include "strbuf.h"
-#include "streaming.h"
 #include "thread-utils.h"
 #include "packfile.h"
 #include "pack-revindex.h"
 #include "object-file.h"
 #include "odb.h"
+#include "odb/streaming.h"
 #include "oid-array.h"
 #include "oidset.h"
 #include "path.h"
diff --git a/builtin/log.c b/builtin/log.c
index e7b83a6e00a708..d4cf9c59c81a83 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -16,6 +16,7 @@
 #include "refs.h"
 #include "object-name.h"
 #include "odb.h"
+#include "odb/streaming.h"
 #include "pager.h"
 #include "color.h"
 #include "commit.h"
@@ -35,7 +36,6 @@
 #include "parse-options.h"
 #include "line-log.h"
 #include "branch.h"
-#include "streaming.h"
 #include "version.h"
 #include "mailmap.h"
 #include "progress.h"
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 1353c2384c336e..f109e26786e621 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -22,7 +22,6 @@
 #include "pack-objects.h"
 #include "progress.h"
 #include "refs.h"
-#include "streaming.h"
 #include "thread-utils.h"
 #include "pack-bitmap.h"
 #include "delta-islands.h"
@@ -33,6 +32,7 @@
 #include "packfile.h"
 #include "object-file.h"
 #include "odb.h"
+#include "odb/streaming.h"
 #include "replace-object.h"
 #include "dir.h"
 #include "midx.h"
diff --git a/entry.c b/entry.c
index 38dfe670f79920..7817aee362ed9e 100644
--- a/entry.c
+++ b/entry.c
@@ -2,13 +2,13 @@
 
 #include "git-compat-util.h"
 #include "odb.h"
+#include "odb/streaming.h"
 #include "dir.h"
 #include "environment.h"
 #include "gettext.h"
 #include "hex.h"
 #include "name-hash.h"
 #include "sparse-index.h"
-#include "streaming.h"
 #include "submodule.h"
 #include "symlinks.h"
 #include "progress.h"
diff --git a/meson.build b/meson.build
index 1f95a06edb7829..fc82929b379dc5 100644
--- a/meson.build
+++ b/meson.build
@@ -397,6 +397,7 @@ libgit_sources = [
   'object-name.c',
   'object.c',
   'odb.c',
+  'odb/streaming.c',
   'oid-array.c',
   'oidmap.c',
   'oidset.c',
@@ -490,7 +491,6 @@ libgit_sources = [
   'stable-qsort.c',
   'statinfo.c',
   'strbuf.c',
-  'streaming.c',
   'string-list.c',
   'strmap.c',
   'strvec.c',
diff --git a/object-file.c b/object-file.c
index 9ba40a848c034a..9601fdb12dc9a8 100644
--- a/object-file.c
+++ b/object-file.c
@@ -20,13 +20,13 @@
 #include "object-file-convert.h"
 #include "object-file.h"
 #include "odb.h"
+#include "odb/streaming.h"
 #include "oidtree.h"
 #include "pack.h"
 #include "packfile.h"
 #include "path.h"
 #include "read-cache-ll.h"
 #include "setup.h"
-#include "streaming.h"
 #include "tempfile.h"
 #include "tmp-objdir.h"
 
diff --git a/streaming.c b/odb/streaming.c
similarity index 99%
rename from streaming.c
rename to odb/streaming.c
index 06993a751c6194..7ef58adaa2a09e 100644
--- a/streaming.c
+++ b/odb/streaming.c
@@ -5,10 +5,10 @@
 #include "git-compat-util.h"
 #include "convert.h"
 #include "environment.h"
-#include "streaming.h"
 #include "repository.h"
 #include "object-file.h"
 #include "odb.h"
+#include "odb/streaming.h"
 #include "replace-object.h"
 #include "packfile.h"
 
diff --git a/streaming.h b/odb/streaming.h
similarity index 100%
rename from streaming.h
rename to odb/streaming.h
diff --git a/packfile.c b/packfile.c
index ad56ce0b905c0d..7a16aaa90d0a2f 100644
--- a/packfile.c
+++ b/packfile.c
@@ -20,7 +20,7 @@
 #include "tree.h"
 #include "object-file.h"
 #include "odb.h"
-#include "streaming.h"
+#include "odb/streaming.h"
 #include "midx.h"
 #include "commit-graph.h"
 #include "pack-revindex.h"
diff --git a/parallel-checkout.c b/parallel-checkout.c
index 1cb6701b926dcf..0bf4bd6d4abd8c 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -13,7 +13,7 @@
 #include "read-cache-ll.h"
 #include "run-command.h"
 #include "sigchain.h"
-#include "streaming.h"
+#include "odb/streaming.h"
 #include "symlinks.h"
 #include "thread-utils.h"
 #include "trace2.h"

From 7b940286527ec2175dffbb317f47e080bb37cf3e Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sun, 23 Nov 2025 19:59:44 +0100
Subject: [PATCH 149/553] streaming: drop redundant type and size pointers

In the preceding commits we have turned `struct odb_read_stream` into a
publicly visible structure. Furthermore, this structure now contains the
type and size of the object that we are about to stream. Consequently,
the out-pointers that we used before to propagate the type and size of
the streamed object are now somewhat redundant with the data contained
in the structure itself.

Drop these out-pointers and adapt callers accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 archive-tar.c          |  4 +---
 archive-zip.c          |  5 ++---
 builtin/index-pack.c   |  7 ++-----
 builtin/pack-objects.c |  6 ++++--
 object-file.c          |  6 ++----
 odb/streaming.c        | 10 ++--------
 odb/streaming.h        |  7 ++-----
 7 files changed, 15 insertions(+), 30 deletions(-)
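
Reviewer note: with the out-pointers gone, callers read the type and
size from the stream they got back. The sketch below shows the shape of
such a caller in isolation. It is not Git code: the `toy_*` names and
the enum values are hypothetical, and the open function simply records
what it is given instead of looking anything up.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object types; the values are illustrative only. */
enum toy_object_type {
	TOY_OBJ_TREE,
	TOY_OBJ_BLOB
};

/* Hypothetical stream that carries its own metadata. */
struct toy_stream {
	enum toy_object_type type;
	unsigned long size;
};

/* Open without out-pointers: the metadata lives in the returned struct. */
struct toy_stream *toy_stream_open(enum toy_object_type type,
				   unsigned long size)
{
	struct toy_stream *st = calloc(1, sizeof(*st));

	if (!st)
		return NULL;
	st->type = type;
	st->size = size;
	return st;
}

void toy_stream_close(struct toy_stream *st)
{
	assert(st); /* closing a stream that was never opened is a bug */
	free(st);
}

/*
 * Returns 0 when the opened stream is a blob of the expected size,
 * -1 otherwise. The check reads `st->type` and `st->size` directly.
 */
int toy_check_blob(unsigned long expected_size,
		   enum toy_object_type type, unsigned long size)
{
	struct toy_stream *st = toy_stream_open(type, size);
	int ret;

	if (!st)
		return -1;
	ret = (st->type == TOY_OBJ_BLOB && st->size == expected_size) ? 0 : -1;
	toy_stream_close(st);
	return ret;
}
```

One consequence worth keeping in mind when adapting callers: the fields
are only valid while the stream is open, so any value needed afterwards
has to be copied out before the close call, as the pack-objects hunk
does with `type` and `size`.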

diff --git a/archive-tar.c b/archive-tar.c
index 494b9f0667a523..0fc70d13a8807e 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -130,12 +130,10 @@ static void write_trailer(void)
 static int stream_blocked(struct repository *r, const struct object_id *oid)
 {
 	struct odb_read_stream *st;
-	enum object_type type;
-	unsigned long sz;
 	char buf[BLOCKSIZE];
 	ssize_t readlen;
 
-	st = odb_read_stream_open(r->objects, oid, &type, &sz, NULL);
+	st = odb_read_stream_open(r->objects, oid, NULL);
 	if (!st)
 		return error(_("cannot stream blob %s"), oid_to_hex(oid));
 	for (;;) {
diff --git a/archive-zip.c b/archive-zip.c
index a0bdc2fe3b2e5e..97ea8d60d6187b 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -347,12 +347,11 @@ static int write_zip_entry(struct archiver_args *args,
 			method = ZIP_METHOD_DEFLATE;
 
 		if (!buffer) {
-			enum object_type type;
-			stream = odb_read_stream_open(args->repo->objects, oid,
-						      &type, &size, NULL);
+			stream = odb_read_stream_open(args->repo->objects, oid, NULL);
 			if (!stream)
 				return error(_("cannot stream blob %s"),
 					     oid_to_hex(oid));
+			size = stream->size;
 			flags |= ZIP_STREAM;
 			out = NULL;
 		} else {
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 581023495fdc9c..b01cb77f4a8500 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -798,8 +798,6 @@ static int compare_objects(const unsigned char *buf, unsigned long size,
 static int check_collison(struct object_entry *entry)
 {
 	struct compare_data data;
-	enum object_type type;
-	unsigned long size;
 
 	if (entry->size <= repo_settings_get_big_file_threshold(the_repository) ||
 	    entry->type != OBJ_BLOB)
@@ -807,11 +805,10 @@ static int check_collison(struct object_entry *entry)
 
 	memset(&data, 0, sizeof(data));
 	data.entry = entry;
-	data.st = odb_read_stream_open(the_repository->objects, &entry->idx.oid,
-				       &type, &size, NULL);
+	data.st = odb_read_stream_open(the_repository->objects, &entry->idx.oid, NULL);
 	if (!data.st)
 		return -1;
-	if (size != entry->size || type != entry->type)
+	if (data.st->size != entry->size || data.st->type != entry->type)
 		die(_("SHA1 COLLISION FOUND WITH %s !"),
 		    oid_to_hex(&entry->idx.oid));
 	unpack_data(entry, compare_objects, &data);
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f109e26786e621..0d1d6995bfc35a 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -521,9 +521,11 @@ static unsigned long write_no_reuse_object(struct hashfile *f, struct object_ent
 		    oe_size_greater_than(&to_pack, entry,
 					 repo_settings_get_big_file_threshold(the_repository)) &&
 		    (st = odb_read_stream_open(the_repository->objects, &entry->idx.oid,
-					       &type, &size, NULL)) != NULL)
+					       NULL)) != NULL) {
 			buf = NULL;
-		else {
+			type = st->type;
+			size = st->size;
+		} else {
 			buf = odb_read_object(the_repository->objects,
 					      &entry->idx.oid, &type,
 					      &size);
diff --git a/object-file.c b/object-file.c
index 9601fdb12dc9a8..12177a7dd707a8 100644
--- a/object-file.c
+++ b/object-file.c
@@ -132,19 +132,17 @@ int check_object_signature(struct repository *r, const struct object_id *oid,
 int stream_object_signature(struct repository *r, const struct object_id *oid)
 {
 	struct object_id real_oid;
-	unsigned long size;
-	enum object_type obj_type;
 	struct odb_read_stream *st;
 	struct git_hash_ctx c;
 	char hdr[MAX_HEADER_LEN];
 	int hdrlen;
 
-	st = odb_read_stream_open(r->objects, oid, &obj_type, &size, NULL);
+	st = odb_read_stream_open(r->objects, oid, NULL);
 	if (!st)
 		return -1;
 
 	/* Generate the header */
-	hdrlen = format_object_header(hdr, sizeof(hdr), obj_type, size);
+	hdrlen = format_object_header(hdr, sizeof(hdr), st->type, st->size);
 
 	/* Sha1.. */
 	r->hash_algo->init_fn(&c);
diff --git a/odb/streaming.c b/odb/streaming.c
index 7ef58adaa2a09e..745cd486fbb33d 100644
--- a/odb/streaming.c
+++ b/odb/streaming.c
@@ -214,8 +214,6 @@ ssize_t odb_read_stream_read(struct odb_read_stream *st, void *buf, size_t sz)
 
 struct odb_read_stream *odb_read_stream_open(struct object_database *odb,
 					     const struct object_id *oid,
-					     enum object_type *type,
-					     unsigned long *size,
 					     struct stream_filter *filter)
 {
 	struct odb_read_stream *st;
@@ -235,8 +233,6 @@ struct odb_read_stream *odb_read_stream_open(struct object_database *odb,
 		st = nst;
 	}
 
-	*size = st->size;
-	*type = st->type;
 	return st;
 }
 
@@ -247,18 +243,16 @@ int odb_stream_blob_to_fd(struct object_database *odb,
 			  int can_seek)
 {
 	struct odb_read_stream *st;
-	enum object_type type;
-	unsigned long sz;
 	ssize_t kept = 0;
 	int result = -1;
 
-	st = odb_read_stream_open(odb, oid, &type, &sz, filter);
+	st = odb_read_stream_open(odb, oid, filter);
 	if (!st) {
 		if (filter)
 			free_stream_filter(filter);
 		return result;
 	}
-	if (type != OBJ_BLOB)
+	if (st->type != OBJ_BLOB)
 		goto close_and_exit;
 	for (;;) {
 		char buf[1024 * 16];
diff --git a/odb/streaming.h b/odb/streaming.h
index 7cb55213b780ff..c7861f7e13c606 100644
--- a/odb/streaming.h
+++ b/odb/streaming.h
@@ -25,16 +25,13 @@ struct odb_read_stream {
 };
 
 /*
- * Create a new object stream for the given object database. Populates the type
- * and size pointers with the object's info. An optional filter can be used to
- * transform the object's content.
+ * Create a new object stream for the given object database. An optional filter
+ * can be used to transform the object's content.
  *
  * Returns the stream on success, a `NULL` pointer otherwise.
  */
 struct odb_read_stream *odb_read_stream_open(struct object_database *odb,
 					     const struct object_id *oid,
-					     enum object_type *type,
-					     unsigned long *size,
 					     struct stream_filter *filter);
 
 /*

From fddba8f73790c196d1fd1fc8b169aefb0ed311e3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jean-No=C3=ABl=20Avila=20via=20GitGitGadget?=
 <gitgitgadget@gmail.com>
Date: Mon, 24 Nov 2025 12:48:49 +0000
Subject: [PATCH 150/553] doc: pull-fetch-param typofix
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

An earlier patch had a typo that was discovered after it had been
merged to 'next'.  Fix it.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/pull-fetch-param.adoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/pull-fetch-param.adoc b/Documentation/pull-fetch-param.adoc
index 2a67641761b2b4..d903dc89000b38 100644
--- a/Documentation/pull-fetch-param.adoc
+++ b/Documentation/pull-fetch-param.adoc
@@ -91,7 +91,7 @@ object.
 When the remote branch you want to fetch is known to
 be rewound and rebased regularly, it is expected that
 its new tip will not be a descendant of its previous tip
-(as stored in your remote-tracking branch the last time_
+(as stored in your remote-tracking branch the last time
 you fetched).  You would want
 to use the `+` sign to indicate non-fast-forward updates
 will be needed for such branches.  There is no way to

From df963f0df4756fa751bfbb39e104d004e3f7d60b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Mon, 24 Nov 2025 21:33:24 +0100
Subject: [PATCH 151/553] config: fix suggestion for failed set of multi-valued
 option
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The command "git config set <name> <value>" fails for an option that has
multiple values.  List the "git config set" flags that can be used,
instead of old-style "git config" actions.

Reported-by: Paul Wintz <pwintz@ucsc.edu>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/config.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/config.c b/builtin/config.c
index 59fb113b073926..ef1c1a9cf2c0c8 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -974,7 +974,7 @@ static int cmd_config_set(int argc, const char **argv, const char *prefix,
 						     argv[0], comment, value);
 		if (ret == CONFIG_NOTHING_SET)
 			error(_("cannot overwrite multiple values with a single value\n"
-			"       Use a regexp, --add or --replace-all to change %s."), argv[0]);
+			"       Use --value=<pattern>, --append or --all to change %s."), argv[0]);
 	}
 
 	location_options_release(&location_opts);

From 18bf67b7537f8ff0cd772847aa03f9cc319b1346 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Mon, 24 Nov 2025 22:00:05 +0100
Subject: [PATCH 152/553] config: fix short help of unset flags
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The flags --all and --value of "git config unset" don't make the command
"replace" or "show" anything, they are about selecting what to unset.
Change their help text accordingly.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/config.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/config.c b/builtin/config.c
index f70d6354772259..f1aa4e21ffd51f 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -992,8 +992,8 @@ static int cmd_config_unset(int argc, const char **argv, const char *prefix,
 	struct option opts[] = {
 		CONFIG_LOCATION_OPTIONS(location_opts),
 		OPT_GROUP(N_("Filter")),
-		OPT_BIT(0, "all", &flags, N_("replace multi-valued config option with new value"), CONFIG_FLAGS_MULTI_REPLACE),
-		OPT_STRING(0, "value", &value_pattern, N_("pattern"), N_("show config with values matching the pattern")),
+		OPT_BIT(0, "all", &flags, N_("unset all multi-valued config options"), CONFIG_FLAGS_MULTI_REPLACE),
+		OPT_STRING(0, "value", &value_pattern, N_("pattern"), N_("unset multi-valued config options with matching values")),
 		OPT_BIT(0, "fixed-value", &flags, N_("use string equality when comparing values to value pattern"), CONFIG_FLAGS_FIXED_VALUE),
 		OPT_END(),
 	};

From 6ab38b7e9cc7adafc304f3204616a4debd49c6e9 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Mon, 24 Nov 2025 15:46:25 -0800
Subject: [PATCH 153/553] The third batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/RelNotes/2.53.0.adoc | 33 +++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc
index 997ae7476c2942..7882bc59e80a92 100644
--- a/Documentation/RelNotes/2.53.0.adoc
+++ b/Documentation/RelNotes/2.53.0.adoc
@@ -7,6 +7,10 @@ UI, Workflows & Features
  * "git maintenance" command learned "is-needed" subcommand to tell if
    it is necessary to perform various maintenance tasks.
 
+ * "git replay" (experimental) learned to perform ref updates itself
+   in a transaction by default, instead of emitting where each ref
+   should point at and leaving the actual update to another command.
+
 
 Performance, Internal Implementation, Development Support etc.
 --------------------------------------------------------------
@@ -22,10 +26,37 @@ Performance, Internal Implementation, Development Support etc.
    changes, disable rename/copy detection to skip more expensive
    processing whose result will be discarded anyway.
 
+ * A part of code paths that deals with loose objects has been cleaned
+   up.
+
 
-Fixes since v2.51
+Fixes since v2.52
 -----------------
 
  * Ever since we added whitespace rules for this project, we misspelt
    an entry, which has been corrected.
    (merge 358e94dc70 jc/gitattributes-whitespace-no-indent-fix later to maint).
+
+ * The code to expand attribute macros has been rewritten to avoid
+   recursion to avoid running out of stack space in an uncontrolled
+   way.
+   (merge 42ed046866 jk/attr-macroexpand-wo-recursion later to maint).
+
+ * Adding a repository that uses a different hash function is a no-no,
+   but "git submodule add" did not prevent it, which has been corrected.
+   (merge 6fe288bfbc bc/submodule-force-same-hash later to maint).
+
+ * An earlier check added to osx keychain credential helper to avoid
+   storing the credential itself supplied was overeager and rejected
+   credential material supplied by other helper backends that it would
+   have wanted to store, which has been corrected.
+   (merge 4580bcd235 kn/osxkeychain-idempotent-store-fix later to maint).
+
+ * The "git repo structure" subcommand tried to align its output but
+   mixed up byte count and display column width, which has been
+   corrected.
+   (merge 7a03a10a3a jx/repo-struct-utf8width-fix later to maint).
+
+ * Other code cleanup, docfix, build fix, etc.
+   (merge 46207a54cc qj/doc-http-bad-want-response later to maint).
+   (merge df90eccd93 kh/doc-commit-extra-references later to maint).

From ce1a5a22a5beefac8a52da518855b5aecc562874 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Thu, 20 Nov 2025 11:34:50 -0800
Subject: [PATCH 154/553] config: really pretend missing :(optional) value is
 not there

Earlier we added support for a value spelled as ":(optional)path"
for configuration variables whose values are of type "path", with
the documented semantics "if the path is missing, behave as if such
a variable definition is not even there."

This has worked OK for code paths that read configuration files and
store the configured value as a string, where a NULL string is
treated as if the setting is not there and the default is kept.

However, there are other code paths that do not _ignore_ such NULL
values and misbehave.  "git config get --path" is one of them.

When git_config_pathname() helper function finds that the value of
the variable is an optional path *and* the path is missing, it
leaves the destination pointer intact (which usually is left to
NULL) and returns 0 to signal a success.  format_config() helper
however assumed that the destination pointer always gets a string,
which no longer is the case, and segfaulted.

Make sure that git_config_pathname() clears the destination pointer
in such a case, and teach format_config() to react to the condition
by returning 1 (which is different from 0 that is a normal success
and negative that is an error) to its callers.  Adjust the callers
to react to this new return value that tells them to pretend as if
they did not even see this particular <key, value> pair.
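
The three-way return convention described above can be sketched as
follows.  This is a standalone illustration under assumptions, not the
code in builtin/config.c: format_path_value() and collect() are
hypothetical names, and a NULL path stands in for a missing optional
file.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for format_config(): returns a negative
 * value on failure, 0 on success, and 1 when the value is a missing
 * optional path that the caller should pretend it never saw.
 */
static int format_path_value(char *buf, size_t len, const char *path)
{
	if (!buf || !len)
		return -1;
	if (!path)
		return 1; /* missing optional value */
	snprintf(buf, len, "%s", path);
	return 0;
}

/*
 * Hypothetical caller: returns the number of values actually kept,
 * or a negative value on failure.
 */
static int collect(const char **paths, int nr, char out[][64])
{
	int kept = 0;

	for (int i = 0; i < nr; i++) {
		int status = format_path_value(out[kept], 64, paths[i]);
		if (status < 0)
			return status;
		if (!status)
			kept++;
		/* on status == 1 the slot is left to be reused */
	}
	return kept;
}
```

Keeping 1 distinct from both 0 and negative values lets a caller drop
the entry without treating the situation as an error.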

Reported-by: Han Jiang <jhcarl0814@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/config.c           | 45 ++++++++++++++++++++++++++++++--------
 config.c                   |  1 +
 t/meson.build              |  1 +
 t/t1311-config-optional.sh | 38 ++++++++++++++++++++++++++++++++
 4 files changed, 76 insertions(+), 9 deletions(-)
 create mode 100755 t/t1311-config-optional.sh

diff --git a/builtin/config.c b/builtin/config.c
index 59fb113b073926..2b36eb7d1cd204 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -261,6 +261,12 @@ struct strbuf_list {
 	int alloc;
 };
 
+/*
+ * Format the configuration key-value pair (`key_`, `value_`) and
+ * append it into strbuf `buf`.  Returns a negative value on failure,
+ * 0 on success, 1 on a missing optional value (i.e., telling the
+ * caller to pretend that <key_,value_> did not exist).
+ */
 static int format_config(const struct config_display_options *opts,
 			 struct strbuf *buf, const char *key_,
 			 const char *value_, const struct key_value_info *kvi)
@@ -299,7 +305,10 @@ static int format_config(const struct config_display_options *opts,
 			char *v;
 			if (git_config_pathname(&v, key_, value_) < 0)
 				return -1;
-			strbuf_addstr(buf, v);
+			if (v)
+				strbuf_addstr(buf, v);
+			else
+				return 1; /* :(optional)no-such-file */
 			free((char *)v);
 		} else if (opts->type == TYPE_EXPIRY_DATE) {
 			timestamp_t t;
@@ -344,6 +353,7 @@ static int collect_config(const char *key_, const char *value_,
 	struct collect_config_data *data = cb;
 	struct strbuf_list *values = data->values;
 	const struct key_value_info *kvi = ctx->kvi;
+	int status;
 
 	if (!(data->get_value_flags & GET_VALUE_KEY_REGEXP) &&
 	    strcmp(key_, data->key))
@@ -361,8 +371,15 @@ static int collect_config(const char *key_, const char *value_,
 	ALLOC_GROW(values->items, values->nr + 1, values->alloc);
 	strbuf_init(&values->items[values->nr], 0);
 
-	return format_config(data->display_opts, &values->items[values->nr++],
-			     key_, value_, kvi);
+	status = format_config(data->display_opts, &values->items[values->nr++],
+			       key_, value_, kvi);
+	if (status < 0)
+		return status;
+	if (status) {
+		strbuf_release(&values->items[--values->nr]);
+		status = 0;
+	}
+	return status;
 }
 
 static int get_value(const struct config_location_options *opts,
@@ -438,15 +455,23 @@ static int get_value(const struct config_location_options *opts,
 	if (!values.nr && display_opts->default_value) {
 		struct key_value_info kvi = KVI_INIT;
 		struct strbuf *item;
+		int status;
 
 		kvi_from_param(&kvi);
 		ALLOC_GROW(values.items, values.nr + 1, values.alloc);
 		item = &values.items[values.nr++];
 		strbuf_init(item, 0);
-		if (format_config(display_opts, item, key_,
-				  display_opts->default_value, &kvi) < 0)
+
+		status = format_config(display_opts, item, key_,
+				       display_opts->default_value, &kvi);
+		if (status < 0)
 			die(_("failed to format default config value: %s"),
 			    display_opts->default_value);
+		if (status) {
+			/* default was a missing optional value */
+			values.nr--;
+			strbuf_release(item);
+		}
 	}
 
 	ret = !values.nr;
@@ -706,11 +731,13 @@ static int get_urlmatch(const struct config_location_options *opts,
 	for_each_string_list_item(item, &values) {
 		struct urlmatch_current_candidate_value *matched = item->util;
 		struct strbuf buf = STRBUF_INIT;
+		int status;
 
-		format_config(&display_opts, &buf, item->string,
-			      matched->value_is_null ? NULL : matched->value.buf,
-			      &matched->kvi);
-		fwrite(buf.buf, 1, buf.len, stdout);
+		status = format_config(&display_opts, &buf, item->string,
+				       matched->value_is_null ? NULL : matched->value.buf,
+				       &matched->kvi);
+		if (!status)
+			fwrite(buf.buf, 1, buf.len, stdout);
 		strbuf_release(&buf);
 
 		strbuf_release(&matched->value);
diff --git a/config.c b/config.c
index 6552e5b0b80202..c6ef290011b752 100644
--- a/config.c
+++ b/config.c
@@ -1292,6 +1292,7 @@ int git_config_pathname(char **dest, const char *var, const char *value)
 
 	if (is_optional && is_missing_file(path)) {
 		free(path);
+		*dest = NULL;
 		return 0;
 	}
 
diff --git a/t/meson.build b/t/meson.build
index bbeba1a8d50e1b..137c0caea0e1b5 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -182,6 +182,7 @@ integration_tests = [
   't1308-config-set.sh',
   't1309-early-config.sh',
   't1310-config-default.sh',
+  't1311-config-optional.sh',
   't1350-config-hooks-path.sh',
   't1400-update-ref.sh',
   't1401-symbolic-ref.sh',
diff --git a/t/t1311-config-optional.sh b/t/t1311-config-optional.sh
new file mode 100755
index 00000000000000..fbbacfc67b368d
--- /dev/null
+++ b/t/t1311-config-optional.sh
@@ -0,0 +1,38 @@
+#!/bin/sh
+#
+# Copyright (c) 2025 Google LLC
+#
+
+test_description=':(optional) paths'
+
+. ./test-lib.sh
+
+test_expect_success 'var=:(optional)path-exists' '
+	test_config a.path ":(optional)path-exists" &&
+	>path-exists &&
+	echo path-exists >expect &&
+
+	git config get --path a.path >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'missing optional value is ignored' '
+	test_config a.path ":(optional)no-such-path" &&
+	# Using --show-scope ensures we skip writing not only the value
+	# but also any meta-information about the ignored key.
+	test_must_fail git config get --show-scope --path a.path >actual &&
+	test_line_count = 0 actual
+'
+
+test_expect_success 'missing optional value is ignored in multi-value config' '
+	test_when_finished "git config unset --all a.path" &&
+	git config set --append a.path ":(optional)path-exists" &&
+	git config set --append a.path ":(optional)no-such-path" &&
+	>path-exists &&
+	echo path-exists >expect &&
+
+	git config --get --path a.path >actual &&
+	test_cmp expect actual
+'
+
+test_done

From 0bd16856ffb3968de73699ad0555d1fae6c45406 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Thu, 20 Nov 2025 11:45:35 -0800
Subject: [PATCH 155/553] config: really treat missing optional path as not
 configured

These callers expect that a return value of 0 from
git_config_pathname() signals that the variable they passed holds a
string they need to act on.  But with the introduction of
":(optional)path" earlier, that is no longer the case.  If the path
specified by the configuration variable is missing, their variable
will get a NULL in it, and they need to handle that (often, just
by refraining from copying it elsewhere).
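
The caller-side pattern can be sketched as below.  This is only an
illustration under assumptions: get_pathname() is a hypothetical
stand-in for git_config_pathname() that takes an explicit `missing`
flag instead of checking the filesystem, and record_path() is a
made-up caller.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for git_config_pathname(): returns 0 on
 * success, but stores NULL in *dest when the value is spelled
 * ":(optional)<path>" and `missing` says the file is not there.
 */
static int get_pathname(char **dest, const char *value, int missing)
{
	const char *prefix = ":(optional)";
	size_t plen = strlen(prefix);

	if (!value)
		return -1;
	if (!strncmp(value, prefix, plen)) {
		if (missing) {
			*dest = NULL;
			return 0;
		}
		value += plen;
	}
	*dest = malloc(strlen(value) + 1);
	if (!*dest)
		return -1;
	strcpy(*dest, value);
	return 0;
}

/*
 * Hypothetical caller: returns 1 if a path was recorded, 0 if the
 * value was skipped, and -1 on error.
 */
static int record_path(char *list, size_t len, const char *value,
		       int missing)
{
	char *path;

	if (get_pathname(&path, value, missing))
		return -1;
	if (!path)
		return 0; /* behave as if the variable was never set */
	snprintf(list, len, "%s", path);
	free(path);
	return 1;
}
```

A return value of 0 alone is not enough for the caller to use the
pointer; it has to check for NULL as well.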

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/blame.c        |  3 ++-
 builtin/receive-pack.c |  5 +++--
 fetch-pack.c           |  5 +++--
 fsck.c                 | 12 +++++++-----
 gpg-interface.c        | 10 +++++++++-
 setup.c                |  2 +-
 6 files changed, 25 insertions(+), 12 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 5b10e84b664228..10928341442e0f 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -739,7 +739,8 @@ static int git_blame_config(const char *var, const char *value,
 		ret = git_config_pathname(&str, var, value);
 		if (ret)
 			return ret;
-		string_list_insert(&ignore_revs_file_list, str);
+		if (str)
+			string_list_insert(&ignore_revs_file_list, str);
 		free(str);
 		return 0;
 	}
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 1113137a6f0b3f..471857335429f8 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -177,8 +177,9 @@ static int receive_pack_config(const char *var, const char *value,
 
 		if (git_config_pathname(&path, var, value))
 			return -1;
-		strbuf_addf(&fsck_msg_types, "%cskiplist=%s",
-			fsck_msg_types.len ? ',' : '=', path);
+		if (path)
+			strbuf_addf(&fsck_msg_types, "%cskiplist=%s",
+				    fsck_msg_types.len ? ',' : '=', path);
 		free(path);
 		return 0;
 	}
diff --git a/fetch-pack.c b/fetch-pack.c
index 46c39f85c4ca9e..33a3f20bfc4d27 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1872,8 +1872,9 @@ int fetch_pack_fsck_config(const char *var, const char *value,
 
 		if (git_config_pathname(&path, var, value))
 			return -1;
-		strbuf_addf(msg_types, "%cskiplist=%s",
-			msg_types->len ? ',' : '=', path);
+		if (path)
+			strbuf_addf(msg_types, "%cskiplist=%s",
+				    msg_types->len ? ',' : '=', path);
 		free(path);
 		return 0;
 	}
diff --git a/fsck.c b/fsck.c
index 171b424dd57de1..0c287699d28267 100644
--- a/fsck.c
+++ b/fsck.c
@@ -1351,14 +1351,16 @@ int git_fsck_config(const char *var, const char *value,
 
 	if (strcmp(var, "fsck.skiplist") == 0) {
 		char *path;
-		struct strbuf sb = STRBUF_INIT;
 
 		if (git_config_pathname(&path, var, value))
 			return -1;
-		strbuf_addf(&sb, "skiplist=%s", path);
-		free(path);
-		fsck_set_msg_types(options, sb.buf);
-		strbuf_release(&sb);
+		if (path) {
+			struct strbuf sb = STRBUF_INIT;
+			strbuf_addf(&sb, "skiplist=%s", path);
+			free(path);
+			fsck_set_msg_types(options, sb.buf);
+			strbuf_release(&sb);
+		}
 		return 0;
 	}
 
diff --git a/gpg-interface.c b/gpg-interface.c
index 06e7fb50603d22..8b91a11a430a35 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -794,8 +794,16 @@ static int git_gpg_config(const char *var, const char *value,
 		fmtname = "ssh";
 
 	if (fmtname) {
+		char *program;
+		int status;
+
 		fmt = get_format_by_name(fmtname);
-		return git_config_pathname((char **) &fmt->program, var, value);
+		status = git_config_pathname(&program, var, value);
+		if (status)
+			return status;
+		if (program)
+			fmt->program = program;
+		return status;
 	}
 
 	return 0;
diff --git a/setup.c b/setup.c
index 98ddbf377f923b..18301b5907dc20 100644
--- a/setup.c
+++ b/setup.c
@@ -1248,7 +1248,7 @@ static int safe_directory_cb(const char *key, const char *value,
 	} else {
 		char *allowed = NULL;
 
-		if (!git_config_pathname(&allowed, key, value)) {
+		if (!git_config_pathname(&allowed, key, value) && allowed) {
 			char *normalized = NULL;
 
 			/*

From dd8e8c786efdfb3ba588d807bfb0dc0d5196c343 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Sat, 15 Nov 2025 23:02:57 -0800
Subject: [PATCH 156/553] submodule add: sanity check existing .gitmodules

"git submodule add" tries to find if a submodule with the same name
already exists at a different path, by looking up an entry in the
.gitmodules file.  If the entry in the file is incomplete, e.g.,
when the submodule.<name>.something variable is defined but there is
no definition of submodule.<name>.path variable, it accesses the
missing .path member of the submodule structure and triggers a
segfault.

A brief audit was done to make sure that the code does not assume
members other than those that are absolutely certain to exist: a
submodule obtained by submodule_from_name() should have a .name
member, while a submodule obtained by submodule_from_path() should
have a .path as well as a .name member, and we cannot assume
anything else.  Luckily, the module_add() codepath was the only
problematic one.  It is fairly recent code that comes from 1fa06ced
(submodule: prevent overwriting .gitmodules on path reuse,
2025-07-24).

A helper used by update_submodule() seems to assume that its call to
submodule_from_path() always yields a submodule object without a
failure, which seems to rely on the caller making sure it is the
case.  Leave an assert() with a NEEDSWORK comment there for future
developers to make sure the assumption actually holds.
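
The guard added to module_add() can be sketched roughly as below.
This is a simplified illustration, not the actual code: `struct sub`
and name_used_elsewhere() are hypothetical names that model only the
.name and .path members discussed above.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, reduced view of a submodule entry. */
struct sub {
	const char *name;
	const char *path; /* may be NULL for an incomplete entry */
};

/*
 * True only when an entry with this name exists, has a path, and
 * that path differs from the one being added.  An entry without a
 * path is not treated as a conflict.
 */
static bool name_used_elsewhere(const struct sub *existing,
				const char *new_path)
{
	return existing && existing->path &&
	       strcmp(existing->path, new_path) != 0;
}
```

Checking the pointer before calling strcmp() is what keeps an
incomplete entry from being dereferenced.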

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/submodule--helper.c | 12 ++++++++++--
 t/t7400-submodule-basic.sh  | 19 +++++++++++++++++++
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 07a1935cbe1a69..1a1043cdab73af 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1913,6 +1913,13 @@ static int determine_submodule_update_strategy(struct repository *r,
 	const char *val;
 	int ret;
 
+	/*
+	 * NEEDSWORK: audit and ensure that update_submodule() has right
+	 * to assume that submodule_from_path() above will always succeed.
+	 */
+	if (!sub)
+		BUG("update_submodule assumes a submodule exists at path (%s)",
+		    path);
 	key = xstrfmt("submodule.%s.update", sub->name);
 
 	if (update) {
@@ -3537,14 +3544,15 @@ static int module_add(int argc, const char **argv, const char *prefix,
 		}
 	}
 
-	if(!add_data.sm_name)
+	if (!add_data.sm_name)
 		add_data.sm_name = add_data.sm_path;
 
 	existing = submodule_from_name(the_repository,
 					null_oid(the_hash_algo),
 					add_data.sm_name);
 
-	if (existing && strcmp(existing->path, add_data.sm_path)) {
+	if (existing && existing->path &&
+	    strcmp(existing->path, add_data.sm_path)) {
 		if (!force) {
 			die(_("submodule name '%s' already used for path '%s'"),
 			    add_data.sm_name, existing->path);
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index fd3e7e355e4ffc..9ade97e432162a 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -48,6 +48,25 @@ test_expect_success 'submodule deinit works on empty repository' '
 	git submodule deinit --all
 '
 
+test_expect_success 'submodule add with incomplete .gitmodules' '
+	test_when_finished "rm -f expect actual" &&
+	test_when_finished "git config remove-section submodule.one" &&
+	test_when_finished "git rm -f one .gitmodules" &&
+	git init one &&
+	git -C one commit --allow-empty -m one-initial &&
+	git config -f .gitmodules submodule.one.ignore all &&
+
+	git submodule add ./one &&
+
+	for var in ignore path url
+	do
+		git config -f .gitmodules --get "submodule.one.$var" ||
+		return 1
+	done >actual &&
+	test_write_lines all one ./one >expect &&
+	test_cmp expect actual
+'
+
 test_expect_success 'setup - initial commit' '
 	>t &&
 	git add t &&

From b67b2d9fb7ccfaa72446d76abc8c36849d2e0685 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Wed, 19 Nov 2025 08:50:53 +0100
Subject: [PATCH 157/553] odb: move logic to disable ref updates into repo

Our object database sources have a field `disable_ref_updates`. This
field can obviously be set to disable reference updates, but it is
somewhat curious that this logic is hosted by the object database.

The reason for this is that it was primarily added to keep us from
accidentally updating references while an ODB transaction is ongoing.
Any objects that are part of the transaction have not yet been committed
to disk, so new references that point to them might get corrupted in
case we never end up committing the transaction. As such, whenever we
create a new transaction we set up a new temporary ODB source and mark
it as disabling reference updates.

This has one (and only one?) upside: once we have committed the
transaction, the temporary source will be dropped and thus we clean up
the disabled reference updates automatically. But other than that, it's
somewhat misdesigned:

  - We can have multiple ODB sources, but only the currently active
    source inhibits reference updates.

  - We're mixing concerns of the refdb with the ODB.

Arguably, the decision of whether we can update references or not should
be handled by the refdb. But that wouldn't be a great fit either, as
there can be one refdb per worktree. So we'd again have the same problem
that a "global" intent becomes localized to a specific instance.

Instead, move the setting into the repository. While at it, convert it
into a boolean.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 odb.c        | 3 ++-
 odb.h        | 7 -------
 refs.c       | 2 +-
 repository.c | 2 +-
 repository.h | 9 ++++++++-
 setup.c      | 2 +-
 6 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/odb.c b/odb.c
index 29cf6496c5e50a..ccc6e999e7ae25 100644
--- a/odb.c
+++ b/odb.c
@@ -360,7 +360,7 @@ struct odb_source *odb_set_temporary_primary_source(struct object_database *odb,
 	 * Disable ref updates while a temporary odb is active, since
 	 * the objects in the database may roll back.
 	 */
-	source->disable_ref_updates = 1;
+	odb->repo->disable_ref_updates = true;
 	source->will_destroy = will_destroy;
 	source->next = odb->sources;
 	odb->sources = source;
@@ -387,6 +387,7 @@ void odb_restore_primary_source(struct object_database *odb,
 	if (cur_source->next != restore_source)
 		BUG("we expect the old primary object store to be the first alternate");
 
+	odb->repo->disable_ref_updates = false;
 	odb->sources = restore_source;
 	odb_source_free(cur_source);
 }
diff --git a/odb.h b/odb.h
index 77b313b784cad3..99c4d489729459 100644
--- a/odb.h
+++ b/odb.h
@@ -66,13 +66,6 @@ struct odb_source {
 	 */
 	bool local;
 
-	/*
-	 * This is a temporary object store created by the tmp_objdir
-	 * facility. Disable ref updates since the objects in the store
-	 * might be discarded on rollback.
-	 */
-	int disable_ref_updates;
-
 	/*
 	 * This object store is ephemeral, so there is no need to fsync.
 	 */
diff --git a/refs.c b/refs.c
index 965381367e0e53..6c7283d9eb2aa8 100644
--- a/refs.c
+++ b/refs.c
@@ -2491,7 +2491,7 @@ int ref_transaction_prepare(struct ref_transaction *transaction,
 		break;
 	}
 
-	if (refs->repo->objects->sources->disable_ref_updates) {
+	if (refs->repo->disable_ref_updates) {
 		strbuf_addstr(err,
 			      _("ref updates forbidden inside quarantine environment"));
 		return -1;
diff --git a/repository.c b/repository.c
index 3c8b3813b00af0..455c2d279fb8ab 100644
--- a/repository.c
+++ b/repository.c
@@ -179,7 +179,7 @@ void repo_set_gitdir(struct repository *repo,
 		repo->objects->sources->path = objects_path;
 	}
 
-	repo->objects->sources->disable_ref_updates = o->disable_ref_updates;
+	repo->disable_ref_updates = o->disable_ref_updates;
 
 	free(repo->objects->alternate_db);
 	repo->objects->alternate_db = xstrdup_or_null(o->alternate_db);
diff --git a/repository.h b/repository.h
index 5808a5d610846a..614649413b68bc 100644
--- a/repository.h
+++ b/repository.h
@@ -71,6 +71,13 @@ struct repository {
 	 */
 	struct ref_store *refs_private;
 
+	/*
+	 * Disable ref updates. This is especially used in contexts where
+	 * transactions may still be rolled back so that we don't start to
+	 * reference objects that may vanish.
+	 */
+	bool disable_ref_updates;
+
 	/*
 	 * A strmap of ref_stores, stored by submodule name, accessible via
 	 * `repo_get_submodule_ref_store()`.
@@ -187,7 +194,7 @@ struct set_gitdir_args {
 	const char *graft_file;
 	const char *index_file;
 	const char *alternate_db;
-	int disable_ref_updates;
+	bool disable_ref_updates;
 };
 
 void repo_set_gitdir(struct repository *repo, const char *root,
diff --git a/setup.c b/setup.c
index 8bf52df71663a3..a752e9fc8476a0 100644
--- a/setup.c
+++ b/setup.c
@@ -1682,7 +1682,7 @@ void setup_git_env(const char *git_dir)
 	args.index_file = getenv_safe(&to_free, INDEX_ENVIRONMENT);
 	args.alternate_db = getenv_safe(&to_free, ALTERNATE_DB_ENVIRONMENT);
 	if (getenv(GIT_QUARANTINE_ENVIRONMENT)) {
-		args.disable_ref_updates = 1;
+		args.disable_ref_updates = true;
 	}
 
 	repo_set_gitdir(the_repository, git_dir, &args);

From 5d795b34dcbf46039c3dda028bb8df8d75a5a9d0 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Wed, 19 Nov 2025 08:50:54 +0100
Subject: [PATCH 158/553] oidset: introduce `oidset_equal()`

Introduce a new function that allows the caller to verify whether two
oidsets contain the exact same object IDs.

Note that this change requires us to change `oidset_iter_init()` to
accept a `const struct oidset`.
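
The comparison strategy (equal sizes, then one-directional
containment) can be sketched on a toy set type.  This is an
illustration under assumptions, not the oidset code: `struct intset`
and its helpers are hypothetical, and the argument relies on neither
set holding duplicates.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy set: an array of distinct ints. */
struct intset {
	const int *items;
	size_t nr;
};

static bool intset_contains(const struct intset *set, int v)
{
	for (size_t i = 0; i < set->nr; i++)
		if (set->items[i] == v)
			return true;
	return false;
}

/*
 * Equal sizes plus "every element of a is in b" implies equality,
 * provided neither set holds duplicates.
 */
static bool intset_equal(const struct intset *a, const struct intset *b)
{
	if (a->nr != b->nr)
		return false;
	for (size_t i = 0; i < a->nr; i++)
		if (!intset_contains(b, a->items[i]))
			return false;
	return true;
}
```

The size check up front is what makes the one-directional loop
sufficient; without it, a strict subset would compare as equal.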

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 oidset.c | 16 ++++++++++++++++
 oidset.h |  9 +++++++--
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/oidset.c b/oidset.c
index 8d36aef8dca4fc..c8ff0b385c58ac 100644
--- a/oidset.c
+++ b/oidset.c
@@ -16,6 +16,22 @@ int oidset_contains(const struct oidset *set, const struct object_id *oid)
 	return pos != kh_end(&set->set);
 }
 
+bool oidset_equal(const struct oidset *a, const struct oidset *b)
+{
+	struct oidset_iter iter;
+	struct object_id *a_oid;
+
+	if (oidset_size(a) != oidset_size(b))
+		return false;
+
+	oidset_iter_init(a, &iter);
+	while ((a_oid = oidset_iter_next(&iter)))
+		if (!oidset_contains(b, a_oid))
+			return false;
+
+	return true;
+}
+
 int oidset_insert(struct oidset *set, const struct object_id *oid)
 {
 	int added;
diff --git a/oidset.h b/oidset.h
index 0106b6f2787f0e..e0f1a6ff4ff203 100644
--- a/oidset.h
+++ b/oidset.h
@@ -38,6 +38,11 @@ void oidset_init(struct oidset *set, size_t initial_size);
  */
 int oidset_contains(const struct oidset *set, const struct object_id *oid);
 
+/**
+ * Returns true iff `a` and `b` contain the exact same OIDs.
+ */
+bool oidset_equal(const struct oidset *a, const struct oidset *b);
+
 /**
  * Insert the oid into the set; a copy is made, so "oid" does not need
  * to persist after this function is called.
@@ -94,11 +99,11 @@ void oidset_parse_file_carefully(struct oidset *set, const char *path,
 				 oidset_parse_tweak_fn fn, void *cbdata);
 
 struct oidset_iter {
-	kh_oid_set_t *set;
+	const kh_oid_set_t *set;
 	khiter_t iter;
 };
 
-static inline void oidset_iter_init(struct oidset *set,
+static inline void oidset_iter_init(const struct oidset *set,
 				    struct oidset_iter *iter)
 {
 	iter->set = &set->set;

From 8dc22e87f000a092b62b4fb08e2542433f1ae192 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Wed, 19 Nov 2025 08:50:55 +0100
Subject: [PATCH 159/553] builtin/index-pack: fix deferred fsck outside repos

When asked to perform object consistency checks via the `--fsck-objects`
flag we verify that each object that is part of the pack is valid. In
general, this check can even be performed outside of a Git repository:
we don't need an initialized object database as we simply read the
object from the packfile directly.

But there's one exception: a subset of the object checks may be deferred
to a later point in time. For now, this only concerns ".gitmodules" and
".gitattributes" files: whenever we see a tree referencing these files
we queue them for a deferred check. This is done because we need to do
some extra checks for those files to ensure that they are well-formed,
and these checks need to be done regardless of whether the corresponding
blobs are part of the packfile or not.

This works inside a repository, but unfortunately the logic leads to a
segfault when running outside of one. This is because we eventually call
`odb_read_object()`, which will crash because the object database has
not been initialized.

There are multiple options here:

  - We could in theory create a purely in-memory database with only a
    packfile store that contains the single packfile. We don't really
    have the infrastructure for this yet though, and it would end up
    being quite hacky.

  - We could refuse to perform consistency checks outside of a
    repository. But most of the checks work alright, so this would be a
    regression.

  - We can skip the finalizing consistency checks when running outside
    of a repository. This is not as invasive as skipping all checks,
    but it's not great to randomly skip a subset of tests, either.

None of these options really feels perfect. The first one would be the
obvious choice if easily possible.

There's another option though: instead of skipping the final object
checks, we can die if there are any queued object checks. With this
change we now die exactly if and only if we would have previously
segfaulted. Like this we ensure that objects that _may_ fail the
consistency checks won't be silently skipped, and at the same time we
give users a much better error message.

Refactor the code accordingly and add a test that would have triggered
the segfault. Note that we also move down the logic to add the packfile
to the store. There is no point doing this any earlier than right before
we execute `fsck_finish()`, and it ensures that the logic to set up and
perform the consistency check is self-contained.
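
The resulting decision can be sketched as below.  This is a
simplified model under assumptions, not the code in
builtin/index-pack.c: `struct check_state` and decide() are
hypothetical, and plain counters stand in for the "found" and "done"
oidsets that the real check compares.

```c
#include <stdbool.h>

/* Hypothetical, simplified state: counts instead of oidsets. */
struct check_state {
	bool have_repository;
	unsigned found; /* blobs queued for a deferred check */
	unsigned done;  /* queued blobs that were already checked */
};

enum outcome { RUN_FINISH, REFUSE };

static bool has_queued_checks(const struct check_state *s)
{
	return s->found != s->done;
}

/*
 * Refuse only in the one case that used to crash: no repository and
 * at least one deferred check still pending.
 */
static enum outcome decide(const struct check_state *s)
{
	if (!s->have_repository && has_queued_checks(s))
		return REFUSE;
	return RUN_FINISH;
}
```

With nothing queued, running outside of a repository still proceeds
to the final step, so packs without such files are unaffected.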

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/index-pack.c  | 21 ++++++++++++++++++---
 fsck.c                |  6 ++++++
 fsck.h                |  7 +++++++
 t/t5302-pack-index.sh | 16 ++++++++++++++++
 4 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 2b78ba7fe4d14a..699fe678cd60b0 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1640,7 +1640,7 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
 	rename_tmp_packfile(&final_index_name, curr_index_name, &index_name,
 			    hash, "idx", 1);
 
-	if (do_fsck_object)
+	if (do_fsck_object && startup_info->have_repository)
 		packfile_store_load_pack(the_repository->objects->packfiles,
 					 final_index_name, 0);
 
@@ -2110,8 +2110,23 @@ int cmd_index_pack(int argc,
 	else
 		close(input_fd);
 
-	if (do_fsck_object && fsck_finish(&fsck_options))
-		die(_("fsck error in pack objects"));
+	if (do_fsck_object) {
+		/*
+		 * We cannot perform queued consistency checks when running
+		 * outside of a repository because those require us to read
+		 * from the object database, which is uninitialized.
+		 *
+		 * TODO: we may eventually set up an in-memory object database,
+		 * which would allow us to perform these queued checks.
+		 */
+		if (!startup_info->have_repository &&
+		    fsck_has_queued_checks(&fsck_options))
+			die(_("cannot perform queued object checks outside "
+			      "of a repository"));
+
+		if (fsck_finish(&fsck_options))
+			die(_("fsck error in pack objects"));
+	}
 
 	free(opts.anomaly);
 	free(objects);
diff --git a/fsck.c b/fsck.c
index 341e100d24ece0..8e1565fe6d0133 100644
--- a/fsck.c
+++ b/fsck.c
@@ -1350,6 +1350,12 @@ int fsck_finish(struct fsck_options *options)
 	return ret;
 }
 
+bool fsck_has_queued_checks(struct fsck_options *options)
+{
+	return !oidset_equal(&options->gitmodules_found, &options->gitmodules_done) ||
+	       !oidset_equal(&options->gitattributes_found, &options->gitattributes_done);
+}
+
 void fsck_options_clear(struct fsck_options *options)
 {
 	free(options->msg_type);
diff --git a/fsck.h b/fsck.h
index cb6ef32f4f3aaa..336917c0451aac 100644
--- a/fsck.h
+++ b/fsck.h
@@ -248,6 +248,13 @@ int fsck_tag_standalone(const struct object_id *oid, const char *buffer,
  */
 int fsck_finish(struct fsck_options *options);
 
+/*
+ * Check whether there are any checks that have been queued up and that still
+ * need to be run. Returns `false` iff `fsck_finish()` wouldn't perform any
+ * actions, `true` otherwise.
+ */
+bool fsck_has_queued_checks(struct fsck_options *options);
+
 /*
  * Clear the fsck_options struct, freeing any allocated memory.
  */
diff --git a/t/t5302-pack-index.sh b/t/t5302-pack-index.sh
index 413c99274c8f30..9697448cb27634 100755
--- a/t/t5302-pack-index.sh
+++ b/t/t5302-pack-index.sh
@@ -293,4 +293,20 @@ test_expect_success 'too-large packs report the breach' '
 	grep "maximum allowed size (20 bytes)" err
 '
 
+# git-index-pack(1) uses the default hash algorithm outside of the repository,
+# and it has no way to tell it otherwise. So we can only run this test with the
+# default hash algorithm, as it would otherwise fail to parse the tree.
+test_expect_success DEFAULT_HASH_ALGORITHM 'index-pack --fsck-objects outside of a repo' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		printf "100644 blob $(test_oid 001)\t.gitattributes\n" >tree &&
+		git mktree --missing <tree >tree-oid &&
+		git pack-objects <tree-oid pack &&
+		test_must_fail nongit git index-pack --fsck-objects "$(pwd)"/pack-*.pack 2>err &&
+		test_grep "cannot perform queued object checks outside of a repository" err
+	)
+'
+
 test_done

From eea83c010cf431325a05f06cc7c64029538c8495 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Wed, 19 Nov 2025 08:50:56 +0100
Subject: [PATCH 160/553] t/helper: stop setting up `the_repository` repeatedly

The "repository" test helper sets up `the_repository` twice. In fact,
though, we don't have to set it up even once: all we need is to set up
its hash algorithm, because we still depend on some subsystems that
aren't free of `the_repository`.

Refactor the code accordingly. This prepares for a subsequent change,
where setting up the repository repeatedly will lead to a `BUG()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/helper/test-repository.c | 16 ++--------------
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/t/helper/test-repository.c b/t/helper/test-repository.c
index 63c37de33d22f1..9ba94cdffa4c47 100644
--- a/t/helper/test-repository.c
+++ b/t/helper/test-repository.c
@@ -17,10 +17,6 @@ static void test_parse_commit_in_graph(const char *gitdir, const char *worktree,
 	struct commit *c;
 	struct commit_list *parent;
 
-	setup_git_env(gitdir);
-
-	repo_clear(the_repository);
-
 	if (repo_init(&r, gitdir, worktree))
 		die("Couldn't init repo");
 
@@ -47,10 +43,6 @@ static void test_get_commit_tree_in_graph(const char *gitdir,
 	struct commit *c;
 	struct tree *tree;
 
-	setup_git_env(gitdir);
-
-	repo_clear(the_repository);
-
 	if (repo_init(&r, gitdir, worktree))
 		die("Couldn't init repo");
 
@@ -75,24 +67,20 @@ static void test_get_commit_tree_in_graph(const char *gitdir,
 
 int cmd__repository(int argc, const char **argv)
 {
-	int nongit_ok = 0;
-
-	setup_git_directory_gently(&nongit_ok);
-
 	if (argc < 2)
 		die("must have at least 2 arguments");
 	if (!strcmp(argv[1], "parse_commit_in_graph")) {
 		struct object_id oid;
 		if (argc < 5)
 			die("not enough arguments");
-		if (parse_oid_hex(argv[4], &oid, &argv[4]))
+		if (parse_oid_hex_any(argv[4], &oid, &argv[4]) == GIT_HASH_UNKNOWN)
 			die("cannot parse oid '%s'", argv[4]);
 		test_parse_commit_in_graph(argv[2], argv[3], &oid);
 	} else if (!strcmp(argv[1], "get_commit_tree_in_graph")) {
 		struct object_id oid;
 		if (argc < 5)
 			die("not enough arguments");
-		if (parse_oid_hex(argv[4], &oid, &argv[4]))
+		if (parse_oid_hex_any(argv[4], &oid, &argv[4]) == GIT_HASH_UNKNOWN)
 			die("cannot parse oid '%s'", argv[4]);
 		test_get_commit_tree_in_graph(argv[2], argv[3], &oid);
 	} else {

From c257bd59165e2f55dfa2c97b0ca1e39131513654 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Wed, 19 Nov 2025 08:50:57 +0100
Subject: [PATCH 161/553] http-push: stop setting up `the_repository` for each
 reference

When pushing references via HTTP we call `repo_init_revisions()` in a
loop for each reference that we're about to push. As the third argument
we pass the result of `setup_git_directory()`, which causes us to
reinitialize the repository every single time.

This is an obvious waste of compute, as the repository that we're
working in will never change across any of the initializations. The only
reason that we do this is to retrieve the directory of the repository.
Furthermore, this is about to create issues in a subsequent commit,
where reinitializing the repository will cause a `BUG()`.

Address this by storing the Git directory in a variable instead so that
we don't have to call the function repeatedly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 http-push.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/http-push.c b/http-push.c
index a1c01e3b9b93a3..a48ca237996567 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1725,6 +1725,7 @@ int cmd_main(int argc, const char **argv)
 	int i;
 	int new_refs;
 	struct ref *ref, *local_refs = NULL;
+	const char *gitdir;
 
 	CALLOC_ARRAY(repo, 1);
 
@@ -1787,7 +1788,7 @@ int cmd_main(int argc, const char **argv)
 	if (delete_branch && rs.nr != 1)
 		die("You must specify only one branch name when deleting a remote branch");
 
-	setup_git_directory();
+	gitdir = setup_git_directory();
 
 	memset(remote_dir_exists, -1, 256);
 
@@ -1941,7 +1942,7 @@ int cmd_main(int argc, const char **argv)
 		if (!push_all && !is_null_oid(&ref->old_oid))
 			strvec_pushf(&commit_argv, "^%s",
 				     oid_to_hex(&ref->old_oid));
-		repo_init_revisions(the_repository, &revs, setup_git_directory());
+		repo_init_revisions(the_repository, &revs, gitdir);
 		setup_revisions_from_strvec(&commit_argv, &revs, NULL);
 		revs.edge_hint = 0; /* just in case */
 

From 35d9fc65edc0a5df9f714d02afaa2c942fb28570 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Wed, 19 Nov 2025 08:50:58 +0100
Subject: [PATCH 162/553] odb: handle initialization of sources in `odb_new()`

The logic to set up a new object database is currently distributed
across two functions in "repository.c":

  - In `initialize_repository()` we initialize an empty object database.
    This object database is not fully initialized and doesn't have any
    sources attached to it.

  - The primary object database source is then created in
    `repo_set_gitdir()`.

Ideally though, the logic should be entirely self-contained so that we
can iterate more readily on how exactly the sources themselves get set
up.

Refactor `odb_new()` to handle both allocation and setup of the object
database. This ensures that the object database is always initialized
and ready for use, and it allows us to change how the sources get set up
eventually.

Note that `repo_set_gitdir()` still reaches into the sources when the
function gets called with an already-initialized object database. This
will be fixed in the next commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 odb.c        | 14 +++++++++++++-
 odb.h        | 15 ++++++++++++++-
 repository.c | 20 ++++++++------------
 3 files changed, 35 insertions(+), 14 deletions(-)

diff --git a/odb.c b/odb.c
index ccc6e999e7ae25..88b40c81c0060c 100644
--- a/odb.c
+++ b/odb.c
@@ -1034,15 +1034,27 @@ int odb_write_object_stream(struct object_database *odb,
 	return odb_source_loose_write_stream(odb->sources, stream, len, oid);
 }
 
-struct object_database *odb_new(struct repository *repo)
+struct object_database *odb_new(struct repository *repo,
+				const char *primary_source,
+				const char *secondary_sources)
 {
 	struct object_database *o = xmalloc(sizeof(*o));
+	char *to_free = NULL;
 
 	memset(o, 0, sizeof(*o));
 	o->repo = repo;
 	o->packfiles = packfile_store_new(o);
 	pthread_mutex_init(&o->replace_mutex, NULL);
 	string_list_init_dup(&o->submodule_source_paths);
+
+	if (!primary_source)
+		primary_source = to_free = xstrfmt("%s/objects", repo->commondir);
+	o->sources = odb_source_new(o, primary_source, true);
+	o->sources_tail = &o->sources->next;
+	o->alternate_db = xstrdup_or_null(secondary_sources);
+
+	free(to_free);
+
 	return o;
 }
 
diff --git a/odb.h b/odb.h
index 99c4d489729459..41b3c03027f4d8 100644
--- a/odb.h
+++ b/odb.h
@@ -159,7 +159,20 @@ struct object_database {
 	struct string_list submodule_source_paths;
 };
 
-struct object_database *odb_new(struct repository *repo);
+/*
+ * Create a new object database for the given repository.
+ *
+ * If the primary source parameter is set it will override the usual primary
+ * object directory derived from the repository's common directory. The
+ * alternate sources are expected to be a PATH_SEP-separated list of secondary
+ * sources. Note that these alternate sources will be added in addition to, not
+ * instead of, the alternates identified by the primary source.
+ *
+ * Returns the newly created object database.
+ */
+struct object_database *odb_new(struct repository *repo,
+				const char *primary_source,
+				const char *alternate_sources);
 
 /* Free the object database and release all resources. */
 void odb_free(struct object_database *o);
diff --git a/repository.c b/repository.c
index 455c2d279fb8ab..5975c8f341c8cb 100644
--- a/repository.c
+++ b/repository.c
@@ -52,7 +52,6 @@ static void set_default_hash_algo(struct repository *repo)
 
 void initialize_repository(struct repository *repo)
 {
-	repo->objects = odb_new(repo);
 	repo->remote_state = remote_state_new();
 	repo->parsed_objects = parsed_object_pool_new(repo);
 	ALLOC_ARRAY(repo->index, 1);
@@ -160,29 +159,26 @@ void repo_set_gitdir(struct repository *repo,
 	 * until after xstrdup(root). Then we can free it.
 	 */
 	char *old_gitdir = repo->gitdir;
-	char *objects_path = NULL;
 
 	repo->gitdir = xstrdup(gitfile ? gitfile : root);
 	free(old_gitdir);
 
 	repo_set_commondir(repo, o->commondir);
-	expand_base_dir(&objects_path, o->object_dir,
-			repo->commondir, "objects");
-
-	if (!repo->objects->sources) {
-		repo->objects->sources = odb_source_new(repo->objects,
-							objects_path, true);
-		repo->objects->sources_tail = &repo->objects->sources->next;
-		free(objects_path);
+
+	if (!repo->objects) {
+		repo->objects = odb_new(repo, o->object_dir, o->alternate_db);
 	} else {
+		char *objects_path = NULL;
+		expand_base_dir(&objects_path, o->object_dir,
+				repo->commondir, "objects");
 		free(repo->objects->sources->path);
 		repo->objects->sources->path = objects_path;
+		free(repo->objects->alternate_db);
+		repo->objects->alternate_db = xstrdup_or_null(o->alternate_db);
 	}
 
 	repo->disable_ref_updates = o->disable_ref_updates;
 
-	free(repo->objects->alternate_db);
-	repo->objects->alternate_db = xstrdup_or_null(o->alternate_db);
 	expand_base_dir(&repo->graft_file, o->graft_file,
 			repo->commondir, "info/grafts");
 	expand_base_dir(&repo->index_file, o->index_file,

From 2574c617362a0c67d15fa01e01cdbd0f6bcdbc93 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Wed, 19 Nov 2025 08:50:59 +0100
Subject: [PATCH 163/553] chdir-notify: add function to unregister listeners

While we (obviously) have a way to register new listeners that get
called whenever we chdir(3p), we don't have an equivalent that can be
used to unregister such a listener again.

Add one, as it will be required in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
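
Reviewer note (text here is ignored by git-am(1)): the matching rule used
by the new function can be sketched in isolation. This is a minimal model
under stated assumptions, not the code from "chdir-notify.c": a plain
singly linked list stands in for the `list.h` helpers, and the names
`notify_register()`, `notify_unregister()`, `notify_count()` and
`dummy_cb()` are hypothetical. An entry is removed only when its name,
callback and data pointer all match.

```c
#include <stdlib.h>
#include <string.h>

typedef void (*notify_cb)(const char *name, const char *old_cwd,
			  const char *new_cwd, void *data);

/* Hypothetical listener entry; git uses a `struct list_head` instead. */
struct entry {
	const char *name;
	notify_cb cb;
	void *data;
	struct entry *next;
};

static struct entry *entries;

static void notify_register(const char *name, notify_cb cb, void *data)
{
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		abort();
	e->name = name;
	e->cb = cb;
	e->data = data;
	e->next = entries;
	entries = e;
}

/* Remove every entry whose name, callback and data all match. */
static void notify_unregister(const char *name, notify_cb cb, void *data)
{
	struct entry **p = &entries;

	while (*p) {
		struct entry *e = *p;
		/* Two NULL names match; otherwise compare the strings. */
		int same_name = (!e->name && !name) ||
				(e->name && name && !strcmp(e->name, name));

		if (e->cb == cb && e->data == data && same_name) {
			*p = e->next; /* unlink before freeing */
			free(e);
		} else {
			p = &e->next;
		}
	}
}

/* Number of currently registered listeners. */
static size_t notify_count(void)
{
	size_t n = 0;

	for (struct entry *e = entries; e; e = e->next)
		n++;
	return n;
}

/* Listener that does nothing; only its address matters in this sketch. */
static void dummy_cb(const char *name, const char *old_cwd,
		     const char *new_cwd, void *data)
{
	(void)name; (void)old_cwd; (void)new_cwd; (void)data;
}
```

Matching on the data pointer as well as the callback is what lets several
owners register the same callback and later remove only their own entry.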
 chdir-notify.c | 18 ++++++++++++++++++
 chdir-notify.h |  2 ++
 2 files changed, 20 insertions(+)

diff --git a/chdir-notify.c b/chdir-notify.c
index 0d7bc0460747b2..f8bfe3cbef9aba 100644
--- a/chdir-notify.c
+++ b/chdir-notify.c
@@ -25,6 +25,24 @@ void chdir_notify_register(const char *name,
 	list_add_tail(&e->list, &chdir_notify_entries);
 }
 
+void chdir_notify_unregister(const char *name, chdir_notify_callback cb,
+			     void *data)
+{
+	struct list_head *pos, *p;
+
+	list_for_each_safe(pos, p, &chdir_notify_entries) {
+		struct chdir_notify_entry *e =
+			list_entry(pos, struct chdir_notify_entry, list);
+
+		if (e->cb != cb || e->data != data || !e->name != !name ||
+		    (e->name && strcmp(e->name, name)))
+			continue;
+
+		list_del(pos);
+		free(e);
+	}
+}
+
 static void reparent_cb(const char *name,
 			const char *old_cwd,
 			const char *new_cwd,
diff --git a/chdir-notify.h b/chdir-notify.h
index 366e4c1ee9908c..81eb69d846e45d 100644
--- a/chdir-notify.h
+++ b/chdir-notify.h
@@ -41,6 +41,8 @@ typedef void (*chdir_notify_callback)(const char *name,
 				      const char *new_cwd,
 				      void *data);
 void chdir_notify_register(const char *name, chdir_notify_callback cb, void *data);
+void chdir_notify_unregister(const char *name, chdir_notify_callback cb,
+			     void *data);
 void chdir_notify_reparent(const char *name, char **path);
 
 /*

From 2816b748e5c300afda559a09426f8342c235c29d Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Wed, 19 Nov 2025 08:51:00 +0100
Subject: [PATCH 164/553] odb: handle changing a repository's commondir

The function `repo_set_gitdir()` is called in two situations:

  - To initialize the repository with its discovered location. As part
    of this we also set up the new object database.

  - To update the repository's discovered location in case the process
    changes its working directory so that we update relative paths. This
    means we also have to update any relative paths that are potentially
    used in the object database.

In the context of the object database we ideally wouldn't ever have to
worry about the second case: if all paths used by our object database
sources were absolute, then we wouldn't have to update them. But
unfortunately, the paths aren't only used to locate files owned by the
given source, but we also use them for reporting purposes. One such
example is `repo_get_object_directory()`, where we cannot just change
semantics to always return absolute paths, as that is likely to break
tooling out there.

One solution to this would be to have both a "display path" and an
"internal path". This would allow us to use internal paths for all
internal matters, but continue to use the potentially-relative display
paths so that we don't break compatibility. But converting the codebase
to honor this split is quite a messy endeavour, and it wouldn't even
help us with the goal to get rid of the need to update the display path
on chdir(3p).

Another solution would be to rework "setup.c" so that we never have to
update paths in the first place. In that case, we'd only initialize the
repository once we have figured out final locations for all directories.
This would be a significant simplification of that subsystem indeed, but
the current logic is so messy that it would take significant investments
to get there.

Meanwhile though, while object sources may still use relative paths, the
best thing we can do is to handle the reparenting of the object source
paths in the object database itself. This can be done by registering one
callback for each object database so that we get notified whenever the
current working directory changes, and we then perform the reparenting
ourselves.

Ideally, this wouldn't even happen on the object database level, but
would instead be handled by each object database source. But we don't
yet have proper pluggable object database sources, so this will need to
be handled at a later point in time.

The logic itself is rather simple:

  - We register the callback when creating the object database.

  - We unregister the callback when releasing it again.

  - We split up `set_git_dir_1()` so that it becomes possible to skip
    recreating the object database. This is required because the
    function is called both when the current working directory changes
    and when we set up the repository. Calling this function without
    skipping creation of the ODB will trigger a `BUG()` in case the ODB
    has already been created.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
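
Reviewer note (text here is ignored by git-am(1)): the reparenting step
described above can be sketched on its own. This is a simplified model,
not the code in "odb.c": `struct source`, `reparent()` and
`update_sources()` are hypothetical names, and `reparent()` is only an
assumed approximation of what `reparent_relative_path()` does (resolve
against the old working directory, then strip the new one if the result
lies below it). Absolute paths are skipped, as in the patch.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, pared-down object source: just a path and a link. */
struct source {
	char *path;
	struct source *next;
};

/*
 * Assumed stand-in for reparent_relative_path(): make `path` absolute
 * against `old_cwd`, then strip `new_cwd` again if the result lies
 * below it. Returns a newly allocated string.
 */
static char *reparent(const char *old_cwd, const char *new_cwd,
		      const char *path)
{
	size_t len = strlen(old_cwd) + strlen(path) + 2;
	size_t cwd_len = strlen(new_cwd);
	char *full = malloc(len), *ret;

	if (!full)
		abort();
	snprintf(full, len, "%s/%s", old_cwd, path);

	/* Not below new_cwd: keep the absolute path. */
	if (strncmp(full, new_cwd, cwd_len) || full[cwd_len] != '/')
		return full;

	ret = malloc(strlen(full + cwd_len + 1) + 1);
	if (!ret)
		abort();
	strcpy(ret, full + cwd_len + 1);
	free(full);
	return ret;
}

/* Shape of the chdir callback: only relative paths are rewritten. */
static void update_sources(struct source *sources,
			   const char *old_cwd, const char *new_cwd)
{
	for (struct source *s = sources; s; s = s->next) {
		char *path;

		if (s->path[0] == '/')
			continue;
		path = reparent(old_cwd, new_cwd, s->path);
		free(s->path);
		s->path = path;
	}
}
```

Keeping this loop next to the data it rewrites is the point of the
change: the caller that changes directory no longer needs to know how the
object database stores its paths.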
 odb.c        | 37 +++++++++++++++++++++--
 odb.h        |  4 ---
 repository.c | 13 ++------
 repository.h |  1 +
 setup.c      | 84 ++++++++++++++++++++++++++++------------------------
 5 files changed, 83 insertions(+), 56 deletions(-)

diff --git a/odb.c b/odb.c
index 88b40c81c0060c..70665fb7f48f0f 100644
--- a/odb.c
+++ b/odb.c
@@ -1,5 +1,6 @@
 #include "git-compat-util.h"
 #include "abspath.h"
+#include "chdir-notify.h"
 #include "commit-graph.h"
 #include "config.h"
 #include "dir.h"
@@ -142,9 +143,9 @@ static void read_info_alternates(struct object_database *odb,
 				 const char *relative_base,
 				 int depth);
 
-struct odb_source *odb_source_new(struct object_database *odb,
-				  const char *path,
-				  bool local)
+static struct odb_source *odb_source_new(struct object_database *odb,
+					 const char *path,
+					 bool local)
 {
 	struct odb_source *source;
 
@@ -1034,6 +1035,32 @@ int odb_write_object_stream(struct object_database *odb,
 	return odb_source_loose_write_stream(odb->sources, stream, len, oid);
 }
 
+static void odb_update_commondir(const char *name UNUSED,
+				 const char *old_cwd,
+				 const char *new_cwd,
+				 void *cb_data)
+{
+	struct object_database *odb = cb_data;
+	struct odb_source *source;
+
+	/*
+	 * In theory, we only have to do this for the primary object source, as
+	 * alternates' paths are always resolved to an absolute path.
+	 */
+	for (source = odb->sources; source; source = source->next) {
+		char *path;
+
+		if (is_absolute_path(source->path))
+			continue;
+
+		path = reparent_relative_path(old_cwd, new_cwd,
+					      source->path);
+
+		free(source->path);
+		source->path = path;
+	}
+}
+
 struct object_database *odb_new(struct repository *repo,
 				const char *primary_source,
 				const char *secondary_sources)
@@ -1055,6 +1082,8 @@ struct object_database *odb_new(struct repository *repo,
 
 	free(to_free);
 
+	chdir_notify_register(NULL, odb_update_commondir, o);
+
 	return o;
 }
 
@@ -1106,6 +1135,8 @@ void odb_free(struct object_database *o)
 	packfile_store_free(o->packfiles);
 	string_list_clear(&o->submodule_source_paths, 0);
 
+	chdir_notify_unregister(NULL, odb_update_commondir, o);
+
 	free(o);
 }
 
diff --git a/odb.h b/odb.h
index 41b3c03027f4d8..014cd9585a2f6e 100644
--- a/odb.h
+++ b/odb.h
@@ -78,10 +78,6 @@ struct odb_source {
 	char *path;
 };
 
-struct odb_source *odb_source_new(struct object_database *odb,
-				  const char *path,
-				  bool local);
-
 struct packed_git;
 struct packfile_store;
 struct cached_object_entry;
diff --git a/repository.c b/repository.c
index 5975c8f341c8cb..863f24411b7bf9 100644
--- a/repository.c
+++ b/repository.c
@@ -165,17 +165,10 @@ void repo_set_gitdir(struct repository *repo,
 
 	repo_set_commondir(repo, o->commondir);
 
-	if (!repo->objects) {
+	if (!repo->objects)
 		repo->objects = odb_new(repo, o->object_dir, o->alternate_db);
-	} else {
-		char *objects_path = NULL;
-		expand_base_dir(&objects_path, o->object_dir,
-				repo->commondir, "objects");
-		free(repo->objects->sources->path);
-		repo->objects->sources->path = objects_path;
-		free(repo->objects->alternate_db);
-		repo->objects->alternate_db = xstrdup_or_null(o->alternate_db);
-	}
+	else if (!o->skip_initializing_odb)
+		BUG("cannot reinitialize an already-initialized object directory");
 
 	repo->disable_ref_updates = o->disable_ref_updates;
 
diff --git a/repository.h b/repository.h
index 614649413b68bc..6063c4b846d031 100644
--- a/repository.h
+++ b/repository.h
@@ -195,6 +195,7 @@ struct set_gitdir_args {
 	const char *index_file;
 	const char *alternate_db;
 	bool disable_ref_updates;
+	bool skip_initializing_odb;
 };
 
 void repo_set_gitdir(struct repository *repo, const char *root,
diff --git a/setup.c b/setup.c
index a752e9fc8476a0..a625f9fbc8b1b7 100644
--- a/setup.c
+++ b/setup.c
@@ -1002,10 +1002,51 @@ const char *read_gitfile_gently(const char *path, int *return_error_code)
 	return error_code ? NULL : path;
 }
 
-static void set_git_dir_1(const char *path)
+static void setup_git_env_internal(const char *git_dir,
+				   bool skip_initializing_odb)
+{
+	char *git_replace_ref_base;
+	const char *shallow_file;
+	const char *replace_ref_base;
+	struct set_gitdir_args args = { NULL };
+	struct strvec to_free = STRVEC_INIT;
+
+	args.commondir = getenv_safe(&to_free, GIT_COMMON_DIR_ENVIRONMENT);
+	args.object_dir = getenv_safe(&to_free, DB_ENVIRONMENT);
+	args.graft_file = getenv_safe(&to_free, GRAFT_ENVIRONMENT);
+	args.index_file = getenv_safe(&to_free, INDEX_ENVIRONMENT);
+	args.alternate_db = getenv_safe(&to_free, ALTERNATE_DB_ENVIRONMENT);
+	if (getenv(GIT_QUARANTINE_ENVIRONMENT))
+		args.disable_ref_updates = true;
+	args.skip_initializing_odb = skip_initializing_odb;
+
+	repo_set_gitdir(the_repository, git_dir, &args);
+	strvec_clear(&to_free);
+
+	if (getenv(NO_REPLACE_OBJECTS_ENVIRONMENT))
+		disable_replace_refs();
+	replace_ref_base = getenv(GIT_REPLACE_REF_BASE_ENVIRONMENT);
+	git_replace_ref_base = xstrdup(replace_ref_base ? replace_ref_base
+							  : "refs/replace/");
+	update_ref_namespace(NAMESPACE_REPLACE, git_replace_ref_base);
+
+	shallow_file = getenv(GIT_SHALLOW_FILE_ENVIRONMENT);
+	if (shallow_file)
+		set_alternate_shallow_file(the_repository, shallow_file, 0);
+
+	if (git_env_bool(NO_LAZY_FETCH_ENVIRONMENT, 0))
+		fetch_if_missing = 0;
+}
+
+void setup_git_env(const char *git_dir)
+{
+	setup_git_env_internal(git_dir, false);
+}
+
+static void set_git_dir_1(const char *path, bool skip_initializing_odb)
 {
 	xsetenv(GIT_DIR_ENVIRONMENT, path, 1);
-	setup_git_env(path);
+	setup_git_env_internal(path, skip_initializing_odb);
 }
 
 static void update_relative_gitdir(const char *name UNUSED,
@@ -1020,7 +1061,7 @@ static void update_relative_gitdir(const char *name UNUSED,
 	trace_printf_key(&trace_setup_key,
 			 "setup: move $GIT_DIR to '%s'",
 			 path);
-	set_git_dir_1(path);
+	set_git_dir_1(path, true);
 	if (tmp_objdir)
 		tmp_objdir_reapply_primary_odb(tmp_objdir, old_cwd, new_cwd);
 	free(path);
@@ -1035,7 +1076,7 @@ static void set_git_dir(const char *path, int make_realpath)
 		path = realpath.buf;
 	}
 
-	set_git_dir_1(path);
+	set_git_dir_1(path, false);
 	if (!is_absolute_path(path))
 		chdir_notify_register(NULL, update_relative_gitdir, NULL);
 
@@ -1668,41 +1709,6 @@ enum discovery_result discover_git_directory_reason(struct strbuf *commondir,
 	return result;
 }
 
-void setup_git_env(const char *git_dir)
-{
-	char *git_replace_ref_base;
-	const char *shallow_file;
-	const char *replace_ref_base;
-	struct set_gitdir_args args = { NULL };
-	struct strvec to_free = STRVEC_INIT;
-
-	args.commondir = getenv_safe(&to_free, GIT_COMMON_DIR_ENVIRONMENT);
-	args.object_dir = getenv_safe(&to_free, DB_ENVIRONMENT);
-	args.graft_file = getenv_safe(&to_free, GRAFT_ENVIRONMENT);
-	args.index_file = getenv_safe(&to_free, INDEX_ENVIRONMENT);
-	args.alternate_db = getenv_safe(&to_free, ALTERNATE_DB_ENVIRONMENT);
-	if (getenv(GIT_QUARANTINE_ENVIRONMENT)) {
-		args.disable_ref_updates = true;
-	}
-
-	repo_set_gitdir(the_repository, git_dir, &args);
-	strvec_clear(&to_free);
-
-	if (getenv(NO_REPLACE_OBJECTS_ENVIRONMENT))
-		disable_replace_refs();
-	replace_ref_base = getenv(GIT_REPLACE_REF_BASE_ENVIRONMENT);
-	git_replace_ref_base = xstrdup(replace_ref_base ? replace_ref_base
-							  : "refs/replace/");
-	update_ref_namespace(NAMESPACE_REPLACE, git_replace_ref_base);
-
-	shallow_file = getenv(GIT_SHALLOW_FILE_ENVIRONMENT);
-	if (shallow_file)
-		set_alternate_shallow_file(the_repository, shallow_file, 0);
-
-	if (git_env_bool(NO_LAZY_FETCH_ENVIRONMENT, 0))
-		fetch_if_missing = 0;
-}
-
 const char *enter_repo(const char *path, unsigned flags)
 {
 	static struct strbuf validated_path = STRBUF_INIT;

From ac65c70663b092e823b0d3de1c1cfdee0a4fbc8e Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Wed, 19 Nov 2025 08:51:01 +0100
Subject: [PATCH 165/553] odb: handle recreation of quarantine directories

In the preceding commit we have moved the logic that reparents object
database sources on chdir(3p) from "setup.c" into "odb.c". Let's also do
the same for any temporary quarantine directories so that the complete
reparenting logic is self-contained in "odb.c".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 odb.c   | 7 +++++++
 setup.c | 5 -----
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/odb.c b/odb.c
index 70665fb7f48f0f..dc8f292f3d9645 100644
--- a/odb.c
+++ b/odb.c
@@ -24,6 +24,7 @@
 #include "strbuf.h"
 #include "strvec.h"
 #include "submodule.h"
+#include "tmp-objdir.h"
 #include "trace2.h"
 #include "write-or-die.h"
 
@@ -1041,8 +1042,11 @@ static void odb_update_commondir(const char *name UNUSED,
 				 void *cb_data)
 {
 	struct object_database *odb = cb_data;
+	struct tmp_objdir *tmp_objdir;
 	struct odb_source *source;
 
+	tmp_objdir = tmp_objdir_unapply_primary_odb();
+
 	/*
 	 * In theory, we only have to do this for the primary object source, as
 	 * alternates' paths are always resolved to an absolute path.
@@ -1059,6 +1063,9 @@ static void odb_update_commondir(const char *name UNUSED,
 		free(source->path);
 		source->path = path;
 	}
+
+	if (tmp_objdir)
+		tmp_objdir_reapply_primary_odb(tmp_objdir, old_cwd, new_cwd);
 }
 
 struct object_database *odb_new(struct repository *repo,
diff --git a/setup.c b/setup.c
index a625f9fbc8b1b7..ae66188af3f6f4 100644
--- a/setup.c
+++ b/setup.c
@@ -22,7 +22,6 @@
 #include "chdir-notify.h"
 #include "path.h"
 #include "quote.h"
-#include "tmp-objdir.h"
 #include "trace.h"
 #include "trace2.h"
 #include "worktree.h"
@@ -1056,14 +1055,10 @@ static void update_relative_gitdir(const char *name UNUSED,
 {
 	char *path = reparent_relative_path(old_cwd, new_cwd,
 					    repo_get_git_dir(the_repository));
-	struct tmp_objdir *tmp_objdir = tmp_objdir_unapply_primary_odb();
-
 	trace_printf_key(&trace_setup_key,
 			 "setup: move $GIT_DIR to '%s'",
 			 path);
 	set_git_dir_1(path, true);
-	if (tmp_objdir)
-		tmp_objdir_reapply_primary_odb(tmp_objdir, old_cwd, new_cwd);
 	free(path);
 }
 

From c20f112e5149d1bd0d4741c4b28a65f81318309a Mon Sep 17 00:00:00 2001
From: Christian Couder <christian.couder@gmail.com>
Date: Mon, 17 Nov 2025 05:34:50 +0100
Subject: [PATCH 166/553] fast-import: add 'strip-if-invalid' mode to
 --signed-commits=<mode>

Tools like `git filter-repo`[1] use `git fast-export` and
`git fast-import` to rewrite repository history. When rewriting
history using one such tool though, commit signatures might become
invalid because the commits they sign changed due to the changes
in the repository history made by the tool between the fast-export
and the fast-import steps.

Note that as far as signature handling goes:

  * Since fast-export doesn't know what changes filter-repo may make
to the stream, it can't know whether the signatures will still be
valid.

  * Since filter-repo doesn't know what history canonicalizations
fast-export performed (and it performs a few), it can't know whether
the signatures will still be valid.

  * Therefore, fast-import is the only process in the pipeline that
can know whether a specified signature remains valid.

Having invalid signatures in a rewritten repository could be
confusing, so users rewriting history might prefer to simply
discard signatures that are invalid at the fast-import step.

For example, a common use case is to rewrite only "recent" history.
While specifying commit ranges corresponding to "recent" commits
could work, users worry about getting it wrong and want to just
automatically rewrite everything, expecting older commit signatures
to be untouched.

To let them do that, let's add a new 'strip-if-invalid' mode to the
`--signed-commits=<mode>` option of `git fast-import`.

It would be interesting for the `--signed-tags=<mode>` option to
have this mode too, but we leave that for a future improvement.

It might also be possible for `git fast-export` to have such a mode
in its `--signed-commits=<mode>` and `--signed-tags=<mode>`
options, but the use cases for it are much less clear, so we also
leave that for possible future improvements.

For now let's just die() if 'strip-if-invalid' is passed to these
options where it hasn't been implemented yet.

[1]: https://github.com/newren/git-filter-repo

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
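
Reviewer note (text here is ignored by git-am(1)): the split between
"known mode" and "mode supported by this command" can be sketched as
follows. This is an illustration under stated assumptions, not the code
in "gpg-interface.c": the enum names follow the ones used in this patch,
but `parse_mode()` and `export_supports()` are hypothetical helpers.
Parsing accepts 'strip-if-invalid' everywhere, and the command that has
not implemented it rejects it afterwards.

```c
#include <string.h>

enum sign_mode {
	SIGN_ABORT,
	SIGN_WARN_VERBATIM,
	SIGN_VERBATIM,
	SIGN_WARN_STRIP,
	SIGN_STRIP,
	SIGN_STRIP_IF_INVALID,
};

/* Hypothetical parser: returns 0 and sets *mode, or -1 if unknown. */
static int parse_mode(const char *arg, enum sign_mode *mode)
{
	static const struct {
		const char *name;
		enum sign_mode mode;
	} modes[] = {
		{ "abort", SIGN_ABORT },
		{ "warn-verbatim", SIGN_WARN_VERBATIM },
		{ "verbatim", SIGN_VERBATIM },
		{ "warn-strip", SIGN_WARN_STRIP },
		{ "strip", SIGN_STRIP },
		{ "strip-if-invalid", SIGN_STRIP_IF_INVALID },
	};

	for (size_t i = 0; i < sizeof(modes) / sizeof(modes[0]); i++) {
		if (!strcmp(arg, modes[i].name)) {
			*mode = modes[i].mode;
			return 0;
		}
	}
	return -1;
}

/*
 * Hypothetical check mirroring the behaviour described in the commit
 * message: the export side knows the mode but does not implement it.
 */
static int export_supports(enum sign_mode mode)
{
	return mode != SIGN_STRIP_IF_INVALID;
}
```

Sharing one parser keeps the accepted spellings identical across
commands, while each command decides separately which modes it can
actually honor.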
 Documentation/git-fast-import.adoc | 29 +++++++++----
 builtin/fast-export.c              | 38 ++++++++++++----
 builtin/fast-import.c              | 59 ++++++++++++++++++++++---
 gpg-interface.c                    |  2 +
 gpg-interface.h                    |  1 +
 t/t9305-fast-import-signatures.sh  | 69 +++++++++++++++++++++++++++++-
 6 files changed, 174 insertions(+), 24 deletions(-)

diff --git a/Documentation/git-fast-import.adoc b/Documentation/git-fast-import.adoc
index b74179a6c891d5..479c4081da8f27 100644
--- a/Documentation/git-fast-import.adoc
+++ b/Documentation/git-fast-import.adoc
@@ -66,15 +66,26 @@ fast-import stream! This option is enabled automatically for
 remote-helpers that use the `import` capability, as they are
 already trusted to run their own code.
 
---signed-tags=(verbatim|warn-verbatim|warn-strip|strip|abort)::
-	Specify how to handle signed tags.  Behaves in the same way
-	as the same option in linkgit:git-fast-export[1], except that
-	default is 'verbatim' (instead of 'abort').
-
---signed-commits=(verbatim|warn-verbatim|warn-strip|strip|abort)::
-	Specify how to handle signed commits.  Behaves in the same way
-	as the same option in linkgit:git-fast-export[1], except that
-	default is 'verbatim' (instead of 'abort').
+`--signed-tags=(verbatim|warn-verbatim|warn-strip|strip|abort)`::
+	Specify how to handle signed tags. Behaves in the same way as
+	the `--signed-commits=<mode>` option below, except that the
+	`strip-if-invalid` mode is not yet supported. As for signed
+	commits, the default mode is `verbatim`.
+
+`--signed-commits=<mode>`::
+	Specify how to handle signed commits. The following <mode>s
+	are supported:
++
+* `verbatim`, which is the default, will silently import commit
+  signatures.
+* `warn-verbatim` will import them, but will display a warning.
+* `abort` will make this program die when encountering a signed
+  commit.
+* `strip` will silently make the commits unsigned.
+* `warn-strip` will make them unsigned, but will display a warning.
+* `strip-if-invalid` will check signatures and, if they are invalid,
+  will strip them and display a warning. The validation is performed
+  in the same way as linkgit:git-verify-commit[1] does it.
 
 Options for Frontends
 ~~~~~~~~~~~~~~~~~~~~~
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 7adbc55f0dccb1..a839a8f9acefde 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -797,10 +797,8 @@ static void handle_commit(struct commit *commit, struct rev_info *rev,
 	       (int)(committer_end - committer), committer);
 	if (signatures.nr) {
 		switch (signed_commit_mode) {
-		case SIGN_ABORT:
-			die("encountered signed commit %s; use "
-			    "--signed-commits=<mode> to handle it",
-			    oid_to_hex(&commit->object.oid));
+
+		/* Exporting modes */
 		case SIGN_WARN_VERBATIM:
 			warning("exporting %"PRIuMAX" signature(s) for commit %s",
 				(uintmax_t)signatures.nr, oid_to_hex(&commit->object.oid));
@@ -811,12 +809,25 @@ static void handle_commit(struct commit *commit, struct rev_info *rev,
 				print_signature(item->string, item->util);
 			}
 			break;
+
+		/* Stripping modes */
 		case SIGN_WARN_STRIP:
 			warning("stripping signature(s) from commit %s",
 				oid_to_hex(&commit->object.oid));
 			/* fallthru */
 		case SIGN_STRIP:
 			break;
+
+		/* Aborting modes */
+		case SIGN_ABORT:
+			die(_("encountered signed commit %s; use "
+			      "--signed-commits=<mode> to handle it"),
+			    oid_to_hex(&commit->object.oid));
+		case SIGN_STRIP_IF_INVALID:
+			die(_("'strip-if-invalid' is not a valid mode for "
+			      "git fast-export with --signed-commits=<mode>"));
+		default:
+			BUG("invalid signed_commit_mode value %d", signed_commit_mode);
 		}
 		string_list_clear(&signatures, 0);
 	}
@@ -934,16 +945,16 @@ static void handle_tag(const char *name, struct tag *tag)
 		size_t sig_offset = parse_signed_buffer(message, message_size);
 		if (sig_offset < message_size)
 			switch (signed_tag_mode) {
-			case SIGN_ABORT:
-				die("encountered signed tag %s; use "
-				    "--signed-tags=<mode> to handle it",
-				    oid_to_hex(&tag->object.oid));
+
+			/* Exporting modes */
 			case SIGN_WARN_VERBATIM:
 				warning("exporting signed tag %s",
 					oid_to_hex(&tag->object.oid));
 				/* fallthru */
 			case SIGN_VERBATIM:
 				break;
+
+			/* Stripping modes */
 			case SIGN_WARN_STRIP:
 				warning("stripping signature from tag %s",
 					oid_to_hex(&tag->object.oid));
@@ -951,6 +962,17 @@ static void handle_tag(const char *name, struct tag *tag)
 			case SIGN_STRIP:
 				message_size = sig_offset;
 				break;
+
+			/* Aborting modes */
+			case SIGN_ABORT:
+				die(_("encountered signed tag %s; use "
+				      "--signed-tags=<mode> to handle it"),
+				    oid_to_hex(&tag->object.oid));
+			case SIGN_STRIP_IF_INVALID:
+				die(_("'strip-if-invalid' is not a valid mode for "
+				      "git fast-export with --signed-tags=<mode>"));
+			default:
+				BUG("invalid signed_tag_mode value %d", signed_tag_mode);
 			}
 	}
 
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 493de57ef67bfb..e2c6894461044b 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -2772,7 +2772,7 @@ static void add_gpgsig_to_commit(struct strbuf *commit_data,
 {
 	struct string_list siglines = STRING_LIST_INIT_NODUP;
 
-	if (!sig->hash_algo)
+	if (!sig || !sig->hash_algo)
 		return;
 
 	strbuf_addstr(commit_data, header);
@@ -2827,6 +2827,45 @@ static void finalize_commit_buffer(struct strbuf *new_data,
 	strbuf_addbuf(new_data, msg);
 }
 
+static void handle_strip_if_invalid(struct strbuf *new_data,
+				    struct signature_data *sig_sha1,
+				    struct signature_data *sig_sha256,
+				    struct strbuf *msg)
+{
+	struct strbuf tmp_buf = STRBUF_INIT;
+	struct signature_check signature_check = { 0 };
+	int ret;
+
+	/* Check signature in a temporary commit buffer */
+	strbuf_addbuf(&tmp_buf, new_data);
+	finalize_commit_buffer(&tmp_buf, sig_sha1, sig_sha256, msg);
+	ret = verify_commit_buffer(tmp_buf.buf, tmp_buf.len, &signature_check);
+
+	if (ret) {
+		const char *signer = signature_check.signer ?
+			signature_check.signer : _("unknown");
+		const char *subject;
+		int subject_len = find_commit_subject(msg->buf, &subject);
+
+		if (subject_len > 100)
+			warning(_("stripping invalid signature for commit '%.100s...'\n"
+				  "  allegedly by %s"), subject, signer);
+		else if (subject_len > 0)
+			warning(_("stripping invalid signature for commit '%.*s'\n"
+				  "  allegedly by %s"), subject_len, subject, signer);
+		else
+			warning(_("stripping invalid signature for commit\n"
+				  "  allegedly by %s"), signer);
+
+		finalize_commit_buffer(new_data, NULL, NULL, msg);
+	} else {
+		strbuf_swap(new_data, &tmp_buf);
+	}
+
+	signature_check_clear(&signature_check);
+	strbuf_release(&tmp_buf);
+}
+
 static void parse_new_commit(const char *arg)
 {
 	static struct strbuf msg = STRBUF_INIT;
@@ -2878,6 +2917,7 @@ static void parse_new_commit(const char *arg)
 			warning(_("importing a commit signature verbatim"));
 			/* fallthru */
 		case SIGN_VERBATIM:
+		case SIGN_STRIP_IF_INVALID:
 			import_one_signature(&sig_sha1, &sig_sha256, v);
 			break;
 
@@ -2962,7 +3002,11 @@ static void parse_new_commit(const char *arg)
 			"encoding %s\n",
 			encoding);
 
-	finalize_commit_buffer(&new_data, &sig_sha1, &sig_sha256, &msg);
+	if (signed_commit_mode == SIGN_STRIP_IF_INVALID &&
+	    (sig_sha1.hash_algo || sig_sha256.hash_algo))
+		handle_strip_if_invalid(&new_data, &sig_sha1, &sig_sha256, &msg);
+	else
+		finalize_commit_buffer(&new_data, &sig_sha1, &sig_sha256, &msg);
 
 	free(author);
 	free(committer);
@@ -2984,9 +3028,6 @@ static void handle_tag_signature(struct strbuf *msg, const char *name)
 	switch (signed_tag_mode) {
 
 	/* First, modes that don't change anything */
-	case SIGN_ABORT:
-		die(_("encountered signed tag; use "
-		      "--signed-tags=<mode> to handle it"));
 	case SIGN_WARN_VERBATIM:
 		warning(_("importing a tag signature verbatim for tag '%s'"), name);
 		/* fallthru */
@@ -3003,7 +3044,13 @@ static void handle_tag_signature(struct strbuf *msg, const char *name)
 		strbuf_setlen(msg, sig_offset);
 		break;
 
-	/* Third, BUG */
+	/* Third, aborting modes */
+	case SIGN_ABORT:
+		die(_("encountered signed tag; use "
+		      "--signed-tags=<mode> to handle it"));
+	case SIGN_STRIP_IF_INVALID:
+		die(_("'strip-if-invalid' is not a valid mode for "
+		      "git fast-import with --signed-tags=<mode>"));
 	default:
 		BUG("invalid signed_tag_mode value %d from tag '%s'",
 		    signed_tag_mode, name);
diff --git a/gpg-interface.c b/gpg-interface.c
index d1e88da8c1bfde..fe653b246433b8 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -1146,6 +1146,8 @@ int parse_sign_mode(const char *arg, enum sign_mode *mode)
 		*mode = SIGN_WARN_STRIP;
 	else if (!strcmp(arg, "strip"))
 		*mode = SIGN_STRIP;
+	else if (!strcmp(arg, "strip-if-invalid"))
+		*mode = SIGN_STRIP_IF_INVALID;
 	else
 		return -1;
 	return 0;
diff --git a/gpg-interface.h b/gpg-interface.h
index 50487aa1483274..71dde8cb807437 100644
--- a/gpg-interface.h
+++ b/gpg-interface.h
@@ -111,6 +111,7 @@ enum sign_mode {
 	SIGN_VERBATIM,
 	SIGN_WARN_STRIP,
 	SIGN_STRIP,
+	SIGN_STRIP_IF_INVALID,
 };
 
 /*
diff --git a/t/t9305-fast-import-signatures.sh b/t/t9305-fast-import-signatures.sh
index c2b427165862d3..022dae02e48177 100755
--- a/t/t9305-fast-import-signatures.sh
+++ b/t/t9305-fast-import-signatures.sh
@@ -79,7 +79,7 @@ test_expect_success GPG 'setup a commit with dual OpenPGP signatures on its SHA-
 	echo B >explicit-sha256/B &&
 	git -C explicit-sha256 add B &&
 	test_tick &&
-	git -C explicit-sha256 commit -S -m "signed" B &&
+	git -C explicit-sha256 commit -S -m "signed commit" B &&
 	SHA256_B=$(git -C explicit-sha256 rev-parse dual-signed) &&
 
 	# Create the corresponding SHA-1 commit
@@ -103,4 +103,71 @@ test_expect_success GPG 'strip both OpenPGP signatures with --signed-commits=war
 	test_line_count = 2 out
 '
 
+test_expect_success GPG 'import commit with no signature with --signed-commits=strip-if-invalid' '
+	git fast-export main >output &&
+	git -C new fast-import --quiet --signed-commits=strip-if-invalid <output >log 2>&1 &&
+	test_must_be_empty log
+'
+
+test_expect_success GPG 'keep valid OpenPGP signature with --signed-commits=strip-if-invalid' '
+	rm -rf new &&
+	git init new &&
+
+	git fast-export --signed-commits=verbatim openpgp-signing >output &&
+	git -C new fast-import --quiet --signed-commits=strip-if-invalid <output >log 2>&1 &&
+	IMPORTED=$(git -C new rev-parse --verify refs/heads/openpgp-signing) &&
+	test $OPENPGP_SIGNING = $IMPORTED &&
+	git -C new cat-file commit "$IMPORTED" >actual &&
+	test_grep -E "^gpgsig(-sha256)? " actual &&
+	test_must_be_empty log
+'
+
+test_expect_success GPG 'strip signature invalidated by message change with --signed-commits=strip-if-invalid' '
+	rm -rf new &&
+	git init new &&
+
+	git fast-export --signed-commits=verbatim openpgp-signing >output &&
+
+	# Change the commit message, which invalidates the signature.
+	# The commit message length should not change though, otherwise the
+	# corresponding `data <length>` command would have to be changed too.
+	sed "s/OpenPGP signed commit/OpenPGP forged commit/" output >modified &&
+
+	git -C new fast-import --quiet --signed-commits=strip-if-invalid <modified >log 2>&1 &&
+
+	IMPORTED=$(git -C new rev-parse --verify refs/heads/openpgp-signing) &&
+	test $OPENPGP_SIGNING != $IMPORTED &&
+	git -C new cat-file commit "$IMPORTED" >actual &&
+	test_grep ! -E "^gpgsig" actual &&
+	test_grep "stripping invalid signature" log
+'
+
+test_expect_success GPGSM 'keep valid X.509 signature with --signed-commits=strip-if-invalid' '
+	rm -rf new &&
+	git init new &&
+
+	git fast-export --signed-commits=verbatim x509-signing >output &&
+	git -C new fast-import --quiet --signed-commits=strip-if-invalid <output >log 2>&1 &&
+	IMPORTED=$(git -C new rev-parse --verify refs/heads/x509-signing) &&
+	test $X509_SIGNING = $IMPORTED &&
+	git -C new cat-file commit "$IMPORTED" >actual &&
+	test_grep -E "^gpgsig(-sha256)? " actual &&
+	test_must_be_empty log
+'
+
+test_expect_success GPGSSH 'keep valid SSH signature with --signed-commits=strip-if-invalid' '
+	rm -rf new &&
+	git init new &&
+
+	test_config -C new gpg.ssh.allowedSignersFile "${GPGSSH_ALLOWED_SIGNERS}" &&
+
+	git fast-export --signed-commits=verbatim ssh-signing >output &&
+	git -C new fast-import --quiet --signed-commits=strip-if-invalid <output >log 2>&1 &&
+	IMPORTED=$(git -C new rev-parse --verify refs/heads/ssh-signing) &&
+	test $SSH_SIGNING = $IMPORTED &&
+	git -C new cat-file commit "$IMPORTED" >actual &&
+	test_grep -E "^gpgsig(-sha256)? " actual &&
+	test_must_be_empty log
+'
+
 test_done

From 9f3a11508719a244fa52d5418bfd29847a272211 Mon Sep 17 00:00:00 2001
From: Phillip Wood <phillip.wood@dunelm.org.uk>
Date: Wed, 26 Nov 2025 14:33:37 +0000
Subject: [PATCH 167/553] replay: do not copy "gpgsig-sha256" header

When "git replay" replays a commit it copies the extended headers
across from the original commit. However, if the original commit
was signed, we do not want to copy the header associated with the
signature, as it won't be valid for the new commit. The code already
knows to avoid copying the "gpgsig" header but does not know to avoid
copying the "gpgsig-sha256" header.  Add that header to the list of
exclusions to match what "git commit --amend" does.
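
The exclusion list is a NULL-terminated array of header names. As a
rough illustration only (this is not the code of
read_commit_extra_headers(); the helper name `header_is_excluded` is
hypothetical), a lookup against such a list could be sketched as:

```c
#include <string.h>

/*
 * Hypothetical helper, for illustration only: return 1 if `key`
 * appears in the NULL-terminated `exclude` array, 0 otherwise.
 */
static int header_is_excluded(const char *key, const char *const *exclude)
{
	if (!exclude)
		return 0;
	for (; *exclude; exclude++)
		if (!strcmp(key, *exclude))
			return 1;
	return 0;
}
```

With only "gpgsig" in the list, a "gpgsig-sha256" header would not
match and would be copied; adding the second name closes that gap.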

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/replay.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/replay.c b/builtin/replay.c
index 6172c8aacc9873..d12e4d548727d6 100644
--- a/builtin/replay.c
+++ b/builtin/replay.c
@@ -67,7 +67,7 @@ static struct commit *create_commit(struct repository *repo,
 	const char *message = repo_logmsg_reencode(repo, based_on,
 						   NULL, out_enc);
 	const char *orig_message = NULL;
-	const char *exclude_gpgsig[] = { "gpgsig", NULL };
+	const char *exclude_gpgsig[] = { "gpgsig", "gpgsig-sha256", NULL };
 
 	commit_list_insert(parent, &parents);
 	extra = read_commit_extra_headers(based_on, exclude_gpgsig);

From 0458e8b85440f77d4a4aec28d09112ba06cee05a Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 17 Nov 2025 17:04:24 +0000
Subject: [PATCH 168/553] ci(dockerized): do show the result of failing tests
 again

The quality of tests and test suites is most apparent not when
everything passes, but in how quickly bugs can be identified,
analyzed, and resolved after test failures occur.

As such, it is an unfortunate side effect of 2a21098b98a (github: adapt
containerized jobs to be rootless, 2025-01-10) that the output of failed
test cases, which was shown before that change directly in the build
logs, is now no longer shown at all.

The reason is a side effect of trying to run the build and the tests
with permissions other than the `root` user, but without providing the
prerequisite permissions to signal what tests failed and whose output
hence needs to be included in the logs.

The way this signaling works is for the workflow to write into
special-purpose files whose path is specific to the current workflow
step and which can be accessed via the `$GITHUB_ENV` environment
variable, which differs between workflow steps. It is this file that is
missing write permission for the `builder` user that was introduced in
the above-mentioned commit.

The solution is simple: make the file world-writable.

Technically, this write permission should be removed after the step has
completed, if proper security practices were to be upheld, but since
nothing uses that file again, it does not matter, and the fix is more
succinct this way.

This commit is best viewed with `--color-words`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
[jc: squashed Elijah's rewrite of the first paragraph of the log message]
[jc: updated chmod to match "world-writable" in the log message]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 .github/workflows/main.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 1c8260ecb68b76..a2c06f9272e532 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -417,7 +417,7 @@ jobs:
     - run: ci/install-dependencies.sh
     - run: useradd builder --create-home
     - run: chown -R builder .
-    - run: sudo --preserve-env --set-home --user=builder ci/run-build-and-tests.sh
+    - run: chmod a+w $GITHUB_ENV && sudo --preserve-env --set-home --user=builder ci/run-build-and-tests.sh
     - name: print test failures
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       run: sudo --preserve-env --set-home --user=builder ci/print-test-failures.sh

From b31ab939fe8e3cbe8be48dddd1c6ac0265991f45 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 26 Nov 2025 10:22:08 -0800
Subject: [PATCH 169/553] The fourth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/RelNotes/2.53.0.adoc | 39 ++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc
index 7882bc59e80a92..70c4338675ae52 100644
--- a/Documentation/RelNotes/2.53.0.adoc
+++ b/Documentation/RelNotes/2.53.0.adoc
@@ -11,6 +11,8 @@ UI, Workflows & Features
    in a transaction by default, instead of emitting where each refs
    should point at and leaving the actual update to another command.
 
+ * "git blame" learns "--diff-algorithm=<algo>" option.
+
 
 Performance, Internal Implementation, Development Support etc.
 --------------------------------------------------------------
@@ -57,6 +59,43 @@ Fixes since v2.52
    corrected.
    (merge 7a03a10a3a jx/repo-struct-utf8width-fix later to maint).
 
+ * Yet another corner case fix around renames in the "ort" merge
+   strategy.
+   (merge a562d90a35 en/ort-rename-another-fix later to maint).
+
+ * Test leakfix.
+   (merge 14b561e768 jk/test-mktemp-leakfix later to maint).
+
+ * Update a version of action used at the GitHub Actions CI.
+   (merge cd99203f86 js/ci-github-setup-go-update later to maint).
+
+ * The "return errno = EFOO, -1" construct, which is heavily used in
+   compat/mingw.c and triggers warnings under "-Wcomma", has been
+   rewritten to avoid the warnings.
+   (merge af3919816f js/mingw-assign-comma-fix later to maint).
+
+ * Makefile based builds have recently been updated to build a
+   libgit.a that also has reftable and xdiff objects; CMake based
+   build procedure has been updated to match.
+   (merge b0d5c88cca js/cmake-libgit-fix later to maint).
+
+ * Under-allocation fix.
+   (merge d22a488482 js/wincred-get-credential-alloc-fix later to maint).
+
+ * "git worktree list" attempts to show paths to worktrees while
+   aligning them, but miscounted display columns for the paths when
+   non-ASCII characters were involved, which has been corrected.
+   (merge 08dfa59835 pw/worktree-list-display-width-fix later to maint).
+
+ * "Windows+meson" job at the GitHub Actions CI was hard to debug, as
+   it did not show and save failed test artifacts, which has been
+   corrected.
+   (merge 17bd1108ea jk/ci-windows-meson-test-fix later to maint).
+
+ * Emulation code clean-up.
+   (merge 2367c6bcd6 gf/win32-pthread-cond-wait-err later to maint).
+
  * Other code cleanup, docfix, build fix, etc.
    (merge 46207a54cc qj/doc-http-bad-want-response later to maint).
    (merge df90eccd93 kh/doc-commit-extra-references later to maint).
+   (merge f18aa68861 rs/xmkstemp-simplify later to maint).

From bf25fca31c5a923598ce8461034a992920e3625b Mon Sep 17 00:00:00 2001
From: "brian m. carlson" <sandals@crustytoothpaste.net>
Date: Fri, 28 Nov 2025 01:21:05 +0000
Subject: [PATCH 170/553] t0614: use numerical comparison with test_line_count

In this comparison, we want to know whether the number of lines is
greater than 1.  Our test_line_count function passes the first argument
as the comparison operator to test, so what we want is a numerical
comparison, not a string comparison.  While this does not produce a
functional problem now, it could very well if we expected two or more
items, in which case the value "10" would not match when it should.

Furthermore, the "<" and ">" comparisons are new in POSIX 1003.1-2024
and we don't want to require such a new version of POSIX since many
popular and supported operating systems were released before that
version of POSIX was released.

Finally, zsh's builtin test operator does not like the greater-than sign
in "test", since it is only supported in the double-bracket extension.
This has been reported and will be addressed in a future version, but
since our code is also technically incorrect, as well as not very
compatible, let's fix it by using a numeric comparison.
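
The difference between the two kinds of comparison is not specific to
the shell. A small sketch in C (used here only for illustration; the
function names are hypothetical) shows why "10" fails a string
comparison against "2" but passes a numeric one:

```c
#include <stdlib.h>
#include <string.h>

/* String comparison: compares byte by byte, so "10" sorts before "2". */
static int greater_as_string(const char *a, const char *b)
{
	return strcmp(a, b) > 0;
}

/* Numeric comparison: converts to integers first, so 10 > 2. */
static int greater_as_number(const char *a, const char *b)
{
	return strtol(a, NULL, 10) > strtol(b, NULL, 10);
}
```

Against "1" both comparisons happen to agree for "10", which is why
the existing test did not misbehave.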

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t0614-reftable-fsck.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t0614-reftable-fsck.sh b/t/t0614-reftable-fsck.sh
index 85cc47d67e13bf..677eb9143c9de4 100755
--- a/t/t0614-reftable-fsck.sh
+++ b/t/t0614-reftable-fsck.sh
@@ -20,7 +20,7 @@ test_expect_success "no errors reported on a well formed repository" '
 		done &&
 
 		# The repository should end up with multiple tables.
-		test_line_count ">" 1 .git/reftable/tables.list &&
+		test_line_count -gt 1 .git/reftable/tables.list &&
 
 		git refs verify 2>err &&
 		test_must_be_empty err

From a92f243a94e6810394fb01d517726487252007f0 Mon Sep 17 00:00:00 2001
From: "brian m. carlson" <sandals@crustytoothpaste.net>
Date: Fri, 28 Nov 2025 01:21:06 +0000
Subject: [PATCH 171/553] t5564: fix test hang under zsh's sh mode

This test starts a SOCKS server in Perl in the background and then kills
it after the tests are done.  However, when using zsh (in sh mode) in
the tests, the start_socks function hangs until the background process
is killed.

Note that this does not reproduce in a simple shell script, so there is
likely some interaction between job handling, our heavy use of eval in
the test framework, and possibly other complexities of our test
framework.  What is clear, however, is that switching from a compound
statement to a subshell fixes the problem entirely and the test passes
with no problem, so do that.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t5564-http-proxy.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t5564-http-proxy.sh b/t/t5564-http-proxy.sh
index c3903faf2d3e6f..3bcbdef409b25f 100755
--- a/t/t5564-http-proxy.sh
+++ b/t/t5564-http-proxy.sh
@@ -40,10 +40,10 @@ test_expect_success 'clone can prompt for proxy password' '
 
 start_socks() {
 	mkfifo socks_output &&
-	{
+	(
 		"$PERL_PATH" "$TEST_DIRECTORY/socks4-proxy.pl" "$1" >socks_output &
 		echo $! > "$TRASH_DIRECTORY/socks.pid"
-	} &&
+	) &&
 	read line <socks_output &&
 	test "$line" = ready
 }

From c7e3b8085bb2f74371f5017f42c58b0acf01b915 Mon Sep 17 00:00:00 2001
From: Yee Cheng Chin <ychin.git@gmail.com>
Date: Thu, 27 Nov 2025 02:16:06 +0000
Subject: [PATCH 172/553] xdiff: optimize patience diff's LCS search

The find_longest_common_sequence() function in patience diff is
inefficient as it calls binary_search() for every unique line it
encounters when deciding where to put it in the sequence. From
instrumentation (using xctrace) on popular repositories, binary_search()
takes up 50-60% of the run time within patience_diff() when performing a
diff.

To optimize this, add a boundary condition check before binary_search()
is called to see if the encountered unique line is located after the
entire currently tracked longest subsequence. If so, skip the
unnecessary binary search and simply append the entry to the end of the
sequence. Given that most files compared in a diff are usually quite
similar to each other, this condition is very common, and should be hit
much more frequently than the binary search.

Below are some end-to-end performance results by timing `git log
--shortstat --oneline -500 --patience` on different repositories with
the old and new code. Generally speaking this seems to give at least
8-10% speed up. The "binary search hit %" column describes how often the
algorithm enters the binary search path instead of the new faster path.
Even in the WebKit case we can see that it's quite rare (1.46%).

| Repo     | Speed difference | binary search hit % |
|----------|------------------|---------------------|
| vim      | 1.27x            | 0.01%               |
| pytorch  | 1.16x            | 0.02%               |
| cpython  | 1.14x            | 0.06%               |
| ripgrep  | 1.14x            | 0.03%               |
| git      | 1.13x            | 0.12%               |
| vscode   | 1.09x            | 0.10%               |
| WebKit   | 1.08x            | 1.46%               |

The benchmarks were done using hyperfine, on an Apple M1 Max laptop,
with git compiled with `-O3 -flto`.
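
The idea can be sketched outside of xdiff on a plain array of integers.
This is a simplified illustration under assumptions, not the code in
xpatience.c: `search_tails` and `lis_length` are hypothetical names,
and the sketch tracks only the length of the longest strictly
increasing subsequence rather than building the chain of entries.

```c
/*
 * Hypothetical helper: index of the last element of the strictly
 * increasing array tails[0..len) that is smaller than value, or -1
 * if there is none.
 */
static int search_tails(const int *tails, int len, int value)
{
	int lo = -1, hi = len;

	while (lo + 1 < hi) {
		int mid = lo + (hi - lo) / 2;

		if (tails[mid] < value)
			lo = mid;
		else
			hi = mid;
	}
	return lo;
}

/*
 * Length of the longest strictly increasing subsequence of
 * values[0..n). tails must have room for n ints. If searches is
 * non-NULL, it receives the number of binary searches performed.
 */
static int lis_length(const int *values, int n, int *tails, int *searches)
{
	int longest = 0, count = 0;

	for (int k = 0; k < n; k++) {
		int i;

		if (longest == 0 || values[k] > tails[longest - 1]) {
			/* Fast path: value extends the longest sequence. */
			i = longest - 1;
		} else {
			i = search_tails(tails, longest, values[k]);
			count++;
		}
		tails[++i] = values[k];
		if (i == longest)
			longest++;
	}
	if (searches)
		*searches = count;
	return longest;
}
```

For input that is already in increasing order, which is the common
case the message describes for similar files, the binary search is
never entered.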

Signed-off-by: Yee Cheng Chin <ychin.git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 xdiff/xpatience.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index 77dc411d1937ab..d4094e6acf8810 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -211,7 +211,10 @@ static int find_longest_common_sequence(struct hashmap *map, struct entry **res)
 	for (entry = map->first; entry; entry = entry->next) {
 		if (!entry->line2 || entry->line2 == NON_UNIQUE)
 			continue;
-		i = binary_search(sequence, longest, entry);
+		if (longest == 0 || entry->line2 > sequence[longest - 1]->line2)
+			i = longest - 1;
+		else
+			i = binary_search(sequence, longest, entry);
 		entry->previous = i < 0 ? NULL : sequence[i];
 		++i;
 		if (i <= anchor_i)

From 136f86abc052ef6186d9985fc26833ffc0484888 Mon Sep 17 00:00:00 2001
From: Elijah Newren <newren@gmail.com>
Date: Sat, 29 Nov 2025 04:44:24 +0000
Subject: [PATCH 173/553] Documentation/git-replay.adoc: fix errors around
 revision range

There was significant confusion in the git-replay manual about what
constitutes a revision range.  As noted in f302c1e4aa09 (revisions(7):
clarify that most commands take a single revision range, 2021-05-18):

   Commands that are specifically designed to take two distinct ranges
   (e.g. "git range-diff R1 R2" to compare two ranges) do exist, but they
   are exceptions. Unless otherwise noted, all "git" commands that operate
   on a set of commits work on a single revision range.

`git replay` is not an exception, but a few places in the manual were
written as though it were.  These appear to have come in revisions to
the original series, between v3->v4 (see
https://lore.kernel.org/git/CAP8UFD3bpLrVW97DH7j=V9H2GsTSAkksC9L3QujQERFk_kLnZA@mail.gmail.com/
, "More than one <revision-range> can be passed") and between v6->v7
(https://lore.kernel.org/git/20231115143327.2441397-1-christian.couder@gmail.com/,
"Takes ranges of commits"), and I missed both of these revisions when
reviewing.  Fix them now.

There was also a reference to the "Commit Limiting options below", but
this page has no such section of options; strike the misleading
reference.

It is worth noting that we are documenting existing behavior, rather
than optimal behavior.  Junio has multiple times suggested introducing
alternative ways to walk revisions and use them in `git replay
--advance`, e.g. at
  * https://lore.kernel.org/git/xmqqy1mqo6kv.fsf@gitster.g/
  * https://lore.kernel.org/git/xmqq8rb3is8c.fsf@gitster.g/
  * https://lore.kernel.org/git/xmqqtsydj2zk.fsf@gitster.g/ (item (2))
If/when we introduce some new revision walking flag that implements one
of these alternate types of revision walks, we can update the --advance
option and this manual appropriately.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-replay.adoc | 13 ++++++-------
 builtin/replay.c              |  2 +-
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-replay.adoc b/Documentation/git-replay.adoc
index dcb26e8a8e88ca..d03235cca0c668 100644
--- a/Documentation/git-replay.adoc
+++ b/Documentation/git-replay.adoc
@@ -9,12 +9,12 @@ git-replay - EXPERIMENTAL: Replay commits on a new base, works with bare repos t
 SYNOPSIS
 --------
 [verse]
-(EXPERIMENTAL!) 'git replay' ([--contained] --onto <newbase> | --advance <branch>) [--ref-action[=<mode>]] <revision-range>...
+(EXPERIMENTAL!) 'git replay' ([--contained] --onto <newbase> | --advance <branch>) [--ref-action[=<mode>]] <revision-range>
 
 DESCRIPTION
 -----------
 
-Takes ranges of commits and replays them onto a new location. Leaves
+Takes a range of commits and replays them onto a new location. Leaves
 the working tree and the index untouched. By default, updates the
 relevant references using an atomic transaction (all refs update or
 none). Use `--ref-action=print` to avoid automatic ref updates and
@@ -55,11 +55,10 @@ which uses the target only as a starting point without updating it.
 The default mode can be configured via the `replay.refAction` configuration variable.
 
 <revision-range>::
-	Range of commits to replay. More than one <revision-range> can
-	be passed, but in `--advance <branch>` mode, they should have
-	a single tip, so that it's clear where <branch> should point
-	to. See "Specifying Ranges" in linkgit:git-rev-parse[1] and the
-	"Commit Limiting" options below.
+	Range of commits to replay; see "Specifying Ranges" in
+	linkgit:git-rev-parse[1]. In `--advance <branch>` mode, the
+	range should have a single tip, so that it's clear to which tip the
+	advanced <branch> should point.
 
 include::rev-list-options.adoc[]
 
diff --git a/builtin/replay.c b/builtin/replay.c
index 6606a2c94bc671..e6d6d2823969e5 100644
--- a/builtin/replay.c
+++ b/builtin/replay.c
@@ -366,7 +366,7 @@ int cmd_replay(int argc,
 	const char *const replay_usage[] = {
 		N_("(EXPERIMENTAL!) git replay "
 		   "([--contained] --onto <newbase> | --advance <branch>) "
-		   "[--ref-action[=<mode>]] <revision-range>..."),
+		   "[--ref-action[=<mode>]] <revision-range>"),
 		NULL
 	};
 	struct option replay_options[] = {

From fe4e60759bfbf4eaca17949d3bbb204bb5c908a2 Mon Sep 17 00:00:00 2001
From: Toon Claes <toon@iotcl.com>
Date: Fri, 28 Nov 2025 17:37:13 +0100
Subject: [PATCH 174/553] last-modified: fix use of uninitialized memory

git-last-modified(1) uses a scratch bitmap to keep track of paths that
have been changed between commits. To avoid reallocating a bitmap on
each call of process_parent(), the scratch bitmap is kept and reused.
However, an incorrect length is passed to memset(3).

`struct bitmap` uses `eword_t` for internal storage. This type is
typedef'd to uint64_t. To fully zero the memory used by the bitmap,
multiply the length (saved in `struct bitmap::word_alloc`) by the size
of `eword_t`.
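
A minimal sketch of the corrected clearing, using a simplified
stand-in for `struct bitmap` (the names `bitmap_sketch` and
`bitmap_sketch_clear` are hypothetical and exist only for this
illustration):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t eword_t;

/* Simplified stand-in for git's struct bitmap. */
struct bitmap_sketch {
	eword_t *words;
	size_t word_alloc;	/* number of words, not bytes */
};

/* Zero every allocated word; memset() takes a length in bytes. */
static void bitmap_sketch_clear(struct bitmap_sketch *b)
{
	memset(b->words, 0, b->word_alloc * sizeof(eword_t));
}
```

Passing `word_alloc` alone would clear only one eighth of the storage,
leaving the remaining words with stale bits.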

Reported-by: Anders Kaseorg <andersk@mit.edu>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/last-modified.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/last-modified.c b/builtin/last-modified.c
index b0ecbdc5400d13..cc5fd2e7950be7 100644
--- a/builtin/last-modified.c
+++ b/builtin/last-modified.c
@@ -327,7 +327,7 @@ static void process_parent(struct last_modified *lm,
 	if (!(parent->object.flags & PARENT1))
 		active_paths_free(lm, parent);
 
-	memset(lm->scratch->words, 0x0, lm->scratch->word_alloc);
+	memset(lm->scratch->words, 0x0, lm->scratch->word_alloc * sizeof(eword_t));
 	diff_queue_clear(&diff_queued_diff);
 }
 

From 38f88051dae6ddb2f1cdb9c7415d4ba6caef04af Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Sun, 30 Nov 2025 12:47:17 +0100
Subject: [PATCH 175/553] diff-index: don't queue unchanged filepairs with
 diff_change()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

diff_cache() queues unchanged filepairs if the flag find_copies_harder
is set, and uses diff_change() for that.  This function allocates a
filespec for each side, does a few other things that are unnecessary for
unchanged filepairs and always sets the diff_flag has_changes, which is
simply misleading in this case.

Add a new streamlined function for queuing unchanged filepairs and
use it in show_modified(), which is called by diff_cache() via
oneway_diff() and do_oneway_diff().  It allocates only a single filespec
for each filepair and uses it twice with reference counting.  This has a
measurable effect if there are a lot of them, like in the Linux repo:

Benchmark 1: ./git_v2.52.0 -C ../linux diff --cached --find-copies-harder
  Time (mean ± σ):      31.8 ms ±   0.2 ms    [User: 24.2 ms, System: 6.3 ms]
  Range (min … max):    31.5 ms …  32.3 ms    85 runs

Benchmark 2: ./git -C ../linux diff --cached --find-copies-harder
  Time (mean ± σ):      23.9 ms ±   0.2 ms    [User: 18.1 ms, System: 4.6 ms]
  Range (min … max):    23.5 ms …  24.4 ms    111 runs

Summary
  ./git -C ../linux diff --cached --find-copies-harder ran
    1.33 ± 0.01 times faster than ./git_v2.52.0 -C ../linux diff --cached --find-copies-harder
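
The sharing of one filespec between both sides of a filepair can be
illustrated with a toy reference-counted structure. This is a sketch
under assumptions, not the diff.c implementation: all names below are
hypothetical, and it assumes, as the real filespec does, that a newly
allocated object starts with a count of one and is freed when the
count drops to zero.

```c
#include <stdlib.h>

/* Toy stand-ins for diff_filespec and diff_filepair. */
struct spec_sketch {
	int count;
	const char *path;
};

struct pair_sketch {
	struct spec_sketch *one, *two;
};

static struct spec_sketch *spec_new(const char *path)
{
	struct spec_sketch *s = calloc(1, sizeof(*s));

	if (s) {
		s->count = 1;
		s->path = path;
	}
	return s;
}

/* Drop one reference; free the object when none remain. */
static void spec_free(struct spec_sketch *s)
{
	if (s && !--s->count)
		free(s);
}

/* Build an "unchanged" pair: one allocation referenced twice. */
static struct pair_sketch pair_same(const char *path)
{
	struct pair_sketch p;
	struct spec_sketch *one = spec_new(path);

	if (one)
		one->count++;
	p.one = p.two = one;
	return p;
}
```

Releasing each side of the pair once brings the count back to zero, so
the usual cleanup code needs no special case for shared filespecs.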

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff-lib.c | 13 ++++++-------
 diff.c     | 20 ++++++++++++++++++++
 diff.h     |  5 +++++
 3 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index b8f8f3bc312fbe..8e624f38c6d6f3 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -418,13 +418,12 @@ static int show_modified(struct rev_info *revs,
 	}
 
 	oldmode = old_entry->ce_mode;
-	if (mode == oldmode && oideq(oid, &old_entry->oid) && !dirty_submodule &&
-	    !revs->diffopt.flags.find_copies_harder)
-		return 0;
-
-	diff_change(&revs->diffopt, oldmode, mode,
-		    &old_entry->oid, oid, 1, !is_null_oid(oid),
-		    old_entry->name, 0, dirty_submodule);
+	if (mode != oldmode || !oideq(oid, &old_entry->oid) || dirty_submodule)
+		diff_change(&revs->diffopt, oldmode, mode,
+			    &old_entry->oid, oid, 1, !is_null_oid(oid),
+			    old_entry->name, 0, dirty_submodule);
+	else if (revs->diffopt.flags.find_copies_harder)
+		diff_same(&revs->diffopt, mode, oid, old_entry->name);
 	return 0;
 }
 
diff --git a/diff.c b/diff.c
index a1961526c0dab1..c3063d827e16d1 100644
--- a/diff.c
+++ b/diff.c
@@ -7347,6 +7347,26 @@ void diff_change(struct diff_options *options,
 			  concatpath, old_dirty_submodule, new_dirty_submodule);
 }
 
+void diff_same(struct diff_options *options,
+	       unsigned mode,
+	       const struct object_id *oid,
+	       const char *concatpath)
+{
+	struct diff_filespec *one;
+
+	if (S_ISGITLINK(mode) && is_submodule_ignored(concatpath, options))
+		return;
+
+	if (options->prefix &&
+	    strncmp(concatpath, options->prefix, options->prefix_length))
+		return;
+
+	one = alloc_filespec(concatpath);
+	fill_filespec(one, oid, 1, mode);
+	one->count++;
+	diff_queue(&diff_queued_diff, one, one);
+}
+
 struct diff_filepair *diff_unmerge(struct diff_options *options, const char *path)
 {
 	struct diff_filepair *pair;
diff --git a/diff.h b/diff.h
index 31eedd5c0c39d3..e80503aebb8d50 100644
--- a/diff.h
+++ b/diff.h
@@ -572,6 +572,11 @@ void diff_change(struct diff_options *,
 		 const char *fullpath,
 		 unsigned dirty_submodule1, unsigned dirty_submodule2);
 
+void diff_same(struct diff_options *,
+	       unsigned mode,
+	       const struct object_id *oid,
+	       const char *fullpath);
+
 struct diff_filepair *diff_unmerge(struct diff_options *, const char *path);
 
 void compute_diffstat(struct diff_options *options, struct diffstat_t *diffstat,

From f0ef5b6d9bcc258e4cbef93839d1b7465d5212b9 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Sun, 30 Nov 2025 18:31:24 -0800
Subject: [PATCH 176/553] The fifth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/RelNotes/2.53.0.adoc | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc
index 70c4338675ae52..c4dfeb1c23b406 100644
--- a/Documentation/RelNotes/2.53.0.adoc
+++ b/Documentation/RelNotes/2.53.0.adoc
@@ -13,6 +13,13 @@ UI, Workflows & Features
 
  * "git blame" learns "--diff-algorithm=<algo>" option.
 
+ * "git repo info" learned "--all" option.
+
+ * Both "git apply" and "git diff" learn a new whitespace error class,
+   "incomplete-line".
+
+ * Add a new manual that describes the data model.
+
 
 Performance, Internal Implementation, Development Support etc.
 --------------------------------------------------------------
@@ -95,7 +102,11 @@ Fixes since v2.52
  * Emulation code clean-up.
    (merge 2367c6bcd6 gf/win32-pthread-cond-wait-err later to maint).
 
+ * Various issues detected by Asan have been corrected.
+   (merge a031b6181a jk/asan-bonanza later to maint).
+
  * Other code cleanup, docfix, build fix, etc.
    (merge 46207a54cc qj/doc-http-bad-want-response later to maint).
    (merge df90eccd93 kh/doc-commit-extra-references later to maint).
    (merge f18aa68861 rs/xmkstemp-simplify later to maint).
+   (merge fddba8f737 ja/doc-synopsis-style later to maint).

From b14f1df9f26cf87856cf6767847ccb4a5b31499b Mon Sep 17 00:00:00 2001
From: Kristoffer Haugsbakk <code@khaugsbakk.name>
Date: Tue, 2 Dec 2025 16:56:51 +0100
Subject: [PATCH 177/553] branch: advise using git-help(1) instead of man(1)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

8fbd903e (branch: advise about ref syntax rules, 2024-03-05) added
advice to check git-check-ref-format(1) for the ref syntax rules. The
advice uses man(1). But git(1) is a multi-platform tool and man(1) may
not be available on some platforms. It might also be slightly jarring
to see a suggestion for running a command which is not from the Git
suite.

Let’s instead use git-help(1) in order to stay inside the land of
git(1). This also means that `help.format` (for `man`, `html` or other
formats) will be used if set.

Also change to using single quotes (') to quote the command since that
is more conventional.

While here let’s also update the test to use `{SQ}`, which is more
readable and easier to edit.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 branch.c          | 2 +-
 builtin/branch.c  | 2 +-
 t/t3200-branch.sh | 6 +++---
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/branch.c b/branch.c
index 26be35834718f2..243db7d0fc0226 100644
--- a/branch.c
+++ b/branch.c
@@ -375,7 +375,7 @@ int validate_branchname(const char *name, struct strbuf *ref)
 	if (check_branch_ref(ref, name)) {
 		int code = die_message(_("'%s' is not a valid branch name"), name);
 		advise_if_enabled(ADVICE_REF_SYNTAX,
-				  _("See `man git check-ref-format`"));
+				  _("See 'git help check-ref-format'"));
 		exit(code);
 	}
 
diff --git a/builtin/branch.c b/builtin/branch.c
index 9fcf04bebb2e72..c577b5d20f2969 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -591,7 +591,7 @@ static void copy_or_rename_branch(const char *oldname, const char *newname, int
 		else {
 			int code = die_message(_("invalid branch name: '%s'"), oldname);
 			advise_if_enabled(ADVICE_REF_SYNTAX,
-					  _("See `man git check-ref-format`"));
+					  _("See 'git help check-ref-format'"));
 			exit(code);
 		}
 	}
diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
index f3e720dc10da46..c58e505c43f9b9 100755
--- a/t/t3200-branch.sh
+++ b/t/t3200-branch.sh
@@ -1707,9 +1707,9 @@ test_expect_success '--track overrides branch.autoSetupMerge' '
 '
 
 test_expect_success 'errors if given a bad branch name' '
-	cat <<-\EOF >expect &&
-	fatal: '\''foo..bar'\'' is not a valid branch name
-	hint: See `man git check-ref-format`
+	cat <<-EOF >expect &&
+	fatal: ${SQ}foo..bar${SQ} is not a valid branch name
+	hint: See ${SQ}git help check-ref-format${SQ}
 	hint: Disable this message with "git config set advice.refSyntax false"
 	EOF
 	test_must_fail git branch foo..bar >actual 2>&1 &&

From cfdce4afcc3809fc9233bad9477f1bd8a57b7586 Mon Sep 17 00:00:00 2001
From: Julia Evans <julia@jvns.ca>
Date: Tue, 2 Dec 2025 18:11:24 +0000
Subject: [PATCH 178/553] doc: remove stray text in Git data model

I meant to delete this sentence fragment when rewriting this paragraph,
but accidentally left it in. It's repetitive (since it was meant to be
deleted) and it's causing some formatting issues with the note.

Signed-off-by: Julia Evans <julia@jvns.ca>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/gitdatamodel.adoc | 2 --
 1 file changed, 2 deletions(-)

diff --git a/Documentation/gitdatamodel.adoc b/Documentation/gitdatamodel.adoc
index 3614f5960ea143..dcfdff0346f669 100644
--- a/Documentation/gitdatamodel.adoc
+++ b/Documentation/gitdatamodel.adoc
@@ -235,8 +235,6 @@ there will no longer be a branch that points at the old commit.
 The old commit is recorded in the current branch's <<reflogs,reflog>>,
 so it is still "reachable", but when the reflog entry expires it may
 become unreachable and get deleted.
-
-the old commit will usually not be reachable, so it may be deleted eventually.
 Reachable objects will never be deleted.
 
 [[index]]

From 8ef7355a8ffb0273f5b4713a0b1502887f8825d0 Mon Sep 17 00:00:00 2001
From: Julia Evans <julia@jvns.ca>
Date: Wed, 3 Dec 2025 15:34:55 +0000
Subject: [PATCH 179/553] doc: git-pull: fix 'git --rebase abort' typo

An earlier commit e9d221b0 (doc: git-pull: clarify how to exit a
conflicted merge, 2025-10-15) misspelt `git rebase --abort` as
`git --rebase abort`.  Fix it.

Signed-off-by: Julia Evans <julia@jvns.ca>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-pull.adoc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-pull.adoc b/Documentation/git-pull.adoc
index cd3bbc90e3008d..4df4193575b345 100644
--- a/Documentation/git-pull.adoc
+++ b/Documentation/git-pull.adoc
@@ -37,8 +37,8 @@ You can also set the configuration options `pull.rebase`, `pull.squash`,
 or `pull.ff` with your preferred behaviour.
 
 If there's a merge conflict during the merge or rebase that you don't
-want to handle, you can safely abort it with `git merge --abort` or `git
---rebase abort`.
+want to handle, you can safely abort it with `git merge --abort` or
+`git rebase --abort`.
 
 OPTIONS
 -------

From 05491b90ce200e6411f9aaac0afe13af45d69824 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sat, 29 Nov 2025 13:43:46 +0000
Subject: [PATCH 180/553] last-modified: support sparse checkouts

In a sparse checkout, a user might want to run `last-modified` on a
directory outside the worktree.

And even in non-sparse checkouts, a user might need to run that command
on a directory that does not exist in the worktree.

These use cases should be supported via the `--` separator between
revision and file arguments, which is even advertised in the
documentation. This patch fixes a tiny bug that prevents that from
working.
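
As a rough illustration of why the separator matters, here is a minimal
sketch, assuming that a later parsing stage relies on seeing `--` to
tell path arguments from revisions. The helper `first_path_index()` is
hypothetical and is not part of Git's option parser:

```c
#include <string.h>

/*
 * Hypothetical helper, not Git code: return the index of the first
 * path argument, i.e. the one right after "--", or -1 when no
 * separator is present and the caller would have to guess.
 */
int first_path_index(int argc, const char **argv)
{
	int i;

	for (i = 0; i < argc; i++)
		if (!strcmp(argv[i], "--"))
			return i + 1;
	return -1;
}
```

If an earlier stage removes `--` from argv, such a helper can no longer
find the boundary, and `a` has to be disambiguated some other way, for
example by checking whether it exists in the worktree.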

This fixes https://github.com/git-for-windows/git/issues/5978

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/last-modified.c  | 3 ++-
 t/t8020-last-modified.sh | 8 ++++++++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/builtin/last-modified.c b/builtin/last-modified.c
index b0ecbdc5400d13..dc1e229f4d4f44 100644
--- a/builtin/last-modified.c
+++ b/builtin/last-modified.c
@@ -525,7 +525,8 @@ int cmd_last_modified(int argc, const char **argv, const char *prefix,
 
 	argc = parse_options(argc, argv, prefix, last_modified_options,
 			     last_modified_usage,
-			     PARSE_OPT_KEEP_ARGV0 | PARSE_OPT_KEEP_UNKNOWN_OPT);
+			     PARSE_OPT_KEEP_ARGV0 | PARSE_OPT_KEEP_UNKNOWN_OPT |
+			     PARSE_OPT_KEEP_DASHDASH);
 
 	repo_config(repo, git_default_config, NULL);
 
diff --git a/t/t8020-last-modified.sh b/t/t8020-last-modified.sh
index a4c1114ee28f7f..50f4312f715f41 100755
--- a/t/t8020-last-modified.sh
+++ b/t/t8020-last-modified.sh
@@ -78,6 +78,14 @@ test_expect_success 'last-modified subdir' '
 	EOF
 '
 
+test_expect_success 'last-modified in sparse checkout' '
+	test_when_finished "git sparse-checkout disable" &&
+	git sparse-checkout set b &&
+	check_last_modified -- a <<-\EOF
+	3 a
+	EOF
+'
+
 test_expect_success 'last-modified subdir recursive' '
 	check_last_modified -r a <<-\EOF
 	3 a/b/file

From 9ce3478410e6d9769f4203687b1f074a64c0ac8e Mon Sep 17 00:00:00 2001
From: Toon Claes <toon@iotcl.com>
Date: Tue, 2 Dec 2025 11:48:08 +0100
Subject: [PATCH 181/553] meson: ignore subprojects/.wraplock

When asking Meson to wrap subprojects, it generates a .wraplock file in
the subprojects/ directory. Ignore this file.

See also https://github.com/mesonbuild/meson/issues/14948.

Signed-off-by: Toon Claes <toon@iotcl.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 subprojects/.gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/subprojects/.gitignore b/subprojects/.gitignore
index 63ea916ef5f302..2bb68c879432d7 100644
--- a/subprojects/.gitignore
+++ b/subprojects/.gitignore
@@ -1 +1,2 @@
 /*/
+.wraplock

From 574ac610761495b7b7afcced7717188501402925 Mon Sep 17 00:00:00 2001
From: Toon Claes <toon@iotcl.com>
Date: Tue, 2 Dec 2025 11:48:09 +0100
Subject: [PATCH 182/553] meson: only detect ICONV_OMITS_BOM if possible

Our Meson setup automatically detects whether ICONV_OMITS_BOM should be
defined. To check this, a piece of code is compiled and run.

When cross-compiling, it's not possible to run this piece of code. Guard
this test with a can_run_host_binaries() check to ensure it can run.

Signed-off-by: Toon Claes <toon@iotcl.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index f1b3615659e56a..95348e69a413a9 100644
--- a/meson.build
+++ b/meson.build
@@ -1064,7 +1064,7 @@ if iconv.found()
     }
   '''
 
-  if compiler.run(iconv_omits_bom_source,
+  if meson.can_run_host_binaries() and compiler.run(iconv_omits_bom_source,
     dependencies: iconv,
     name: 'iconv omits BOM',
   ).returncode() != 0

From 4061692ba427af2085e934e0734926f93ea2c823 Mon Sep 17 00:00:00 2001
From: Toon Claes <toon@iotcl.com>
Date: Wed, 3 Dec 2025 15:53:31 +0100
Subject: [PATCH 183/553] meson: use can_run_host_binaries() where possible

The previous commit introduced the first use of
meson.can_run_host_binaries(). This is a guard around compiler.run() to
ensure it's actually possible to execute the provided code.

In other places we've been having the same issue, but here `not
meson.is_cross_build()` is used as guard. This does the trick, but it
also prevents the code from running even when an exe_wrapper is
configured.

Switch to using meson.can_run_host_binaries() here as well.

There is another place left that still uses `not
meson.is_cross_build()`, but there it's a guard around fs.exists(). That
function will always run on the build machine, so checking for
cross-compilation is still appropriate there.

Signed-off-by: Toon Claes <toon@iotcl.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index 95348e69a413a9..00ad8a5c6011e5 100644
--- a/meson.build
+++ b/meson.build
@@ -1492,7 +1492,7 @@ if not has_bsd_sysctl
   endif
 endif
 
-if not meson.is_cross_build() and compiler.run('''
+if meson.can_run_host_binaries() and compiler.run('''
   #include <stdio.h>
 
   int main(int argc, const char **argv)

From 6fd44f55a7594842e70d549853b7f1ac4e27e6ea Mon Sep 17 00:00:00 2001
From: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Date: Thu, 4 Dec 2025 17:10:10 -0300
Subject: [PATCH 184/553] repo: remove blank line from
 Documentation/git-repo.adoc

There was an extra blank line in git-repo-structure documentation, which
led to an unwanted '+' character after generating an HTML or PDF from
that page. This can be seen, for example, in Git 2.52.0 online docs [1].

Remove that extra line.

[1] https://git-scm.com/docs/git-repo/2.52.0

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-repo.adoc | 1 -
 1 file changed, 1 deletion(-)

diff --git a/Documentation/git-repo.adoc b/Documentation/git-repo.adoc
index 70f0a6d2e47291..5d9c7641c24c9e 100644
--- a/Documentation/git-repo.adoc
+++ b/Documentation/git-repo.adoc
@@ -50,7 +50,6 @@ supported:
 +
 * Reference counts categorized by type
 * Reachable object counts categorized by type
-
 +
 The output format can be chosen through the flag `--format`. Three formats are
 supported:

From 768cf991ffea54dbcaf63c45750f0e3a26ebdcc6 Mon Sep 17 00:00:00 2001
From: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Date: Thu, 4 Dec 2025 17:10:11 -0300
Subject: [PATCH 185/553] repo: use [--format=... | -z] instead of [-z] in
 git-repo-info synopsis

The flag -z is only an alias for --format=nul, and even though --format
and -z can be used together and repeated, only the last one is
considered.

Replace `[-z]` in the synopsis of git-repo-info with
`[--format=... | -z]`, making explicit that using one of those flags
replaces the other.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-repo.adoc | 4 ++--
 builtin/repo.c              | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-repo.adoc b/Documentation/git-repo.adoc
index 5d9c7641c24c9e..f24514deaa4cd7 100644
--- a/Documentation/git-repo.adoc
+++ b/Documentation/git-repo.adoc
@@ -8,7 +8,7 @@ git-repo - Retrieve information about the repository
 SYNOPSIS
 --------
 [synopsis]
-git repo info [--format=(keyvalue|nul)] [-z] [--all | <key>...]
+git repo info [--format=(keyvalue|nul) | -z] [--all | <key>...]
 git repo structure [--format=(table|keyvalue|nul)]
 
 DESCRIPTION
@@ -19,7 +19,7 @@ THIS COMMAND IS EXPERIMENTAL. THE BEHAVIOR MAY CHANGE.
 
 COMMANDS
 --------
-`info [--format=(keyvalue|nul)] [-z] [--all | <key>...]`::
+`info [--format=(keyvalue|nul) | -z] [--all | <key>...]`::
 	Retrieve metadata-related information about the current repository. Only
 	the requested data will be returned based on their keys (see "INFO KEYS"
 	section below).
diff --git a/builtin/repo.c b/builtin/repo.c
index 2a653bd3eacf20..cc97dd1836fb9f 100644
--- a/builtin/repo.c
+++ b/builtin/repo.c
@@ -15,7 +15,7 @@
 #include "utf8.h"
 
 static const char *const repo_usage[] = {
-	"git repo info [--format=(keyvalue|nul)] [-z] [--all | <key>...]",
+	"git repo info [--format=(keyvalue|nul) | -z] [--all | <key>...]",
 	"git repo structure [--format=(table|keyvalue|nul)]",
 	NULL
 };

From 76c0704bdf6ee8ac0be11830cb6ae8d08cc587a8 Mon Sep 17 00:00:00 2001
From: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Date: Thu, 4 Dec 2025 17:10:12 -0300
Subject: [PATCH 186/553] repo: add -z as an alias for --format=nul to
 git-repo-structure

Other Git commands that have NUL-terminated output, such as git-config,
git-status, git-ls-files, and git-repo-info have a flag `-z` for using
the NUL character as the record separator.

Add the `-z` flag to git-repo-structure as an alias for `--format=nul`,
making it consistent with the behavior of the other commands.
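
The "last one wins" interaction between `-z` and `--format` can be
sketched as follows. This is only an illustration under stated
assumptions: `pick_format()` is a hypothetical function that scans argv
by hand, it is not how Git's parse-options machinery is implemented,
and the "table" default is assumed for the sake of the example:

```c
#include <string.h>

/*
 * Hypothetical sketch, not Git's parse-options API: every occurrence
 * of "-z" or "--format=<fmt>" overwrites the previous choice, so the
 * last one on the command line wins.
 */
const char *pick_format(int argc, const char **argv)
{
	const char *format = "table"; /* assumed default for this sketch */
	int i;

	for (i = 1; i < argc; i++) {
		if (!strcmp(argv[i], "-z"))
			format = "nul";
		else if (!strncmp(argv[i], "--format=", 9))
			format = argv[i] + 9;
	}
	return format;
}
```

With this model, `--format=table -z` selects "nul", while
`-z --format=keyvalue` selects "keyvalue".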

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-repo.adoc | 6 ++++--
 builtin/repo.c              | 6 +++++-
 t/t1901-repo-structure.sh   | 7 +++++++
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-repo.adoc b/Documentation/git-repo.adoc
index f24514deaa4cd7..c4a78277df61c0 100644
--- a/Documentation/git-repo.adoc
+++ b/Documentation/git-repo.adoc
@@ -9,7 +9,7 @@ SYNOPSIS
 --------
 [synopsis]
 git repo info [--format=(keyvalue|nul) | -z] [--all | <key>...]
-git repo structure [--format=(table|keyvalue|nul)]
+git repo structure [--format=(table|keyvalue|nul) | -z]
 
 DESCRIPTION
 -----------
@@ -44,7 +44,7 @@ supported:
 +
 `-z` is an alias for `--format=nul`.
 
-`structure [--format=(table|keyvalue|nul)]`::
+`structure [--format=(table|keyvalue|nul) | -z]`::
 	Retrieve statistics about the current repository structure. The
 	following kinds of information are reported:
 +
@@ -71,6 +71,8 @@ supported:
 	the delimiter between the key and value instead of '='. Unlike the
 	`keyvalue` format, values containing "unusual" characters are never
 	quoted.
++
+`-z` is an alias for `--format=nul`.
 
 INFO KEYS
 ---------
diff --git a/builtin/repo.c b/builtin/repo.c
index cc97dd1836fb9f..0dd41b17783ed1 100644
--- a/builtin/repo.c
+++ b/builtin/repo.c
@@ -16,7 +16,7 @@
 
 static const char *const repo_usage[] = {
 	"git repo info [--format=(keyvalue|nul) | -z] [--all | <key>...]",
-	"git repo structure [--format=(table|keyvalue|nul)]",
+	"git repo structure [--format=(table|keyvalue|nul) | -z]",
 	NULL
 };
 
@@ -529,6 +529,10 @@ static int cmd_repo_structure(int argc, const char **argv, const char *prefix,
 		OPT_CALLBACK_F(0, "format", &format, N_("format"),
 			       N_("output format"),
 			       PARSE_OPT_NONEG, parse_format_cb),
+		OPT_CALLBACK_F('z', NULL, &format, NULL,
+			       N_("synonym for --format=nul"),
+			       PARSE_OPT_NONEG | PARSE_OPT_NOARG,
+			       parse_format_cb),
 		OPT_BOOL(0, "progress", &show_progress, N_("show progress")),
 		OPT_END()
 	};
diff --git a/t/t1901-repo-structure.sh b/t/t1901-repo-structure.sh
index 36a71a144e3f74..df7d4ea52485da 100755
--- a/t/t1901-repo-structure.sh
+++ b/t/t1901-repo-structure.sh
@@ -101,6 +101,13 @@ test_expect_success 'keyvalue and nul format' '
 		tr "\n=" "\0\n" <expect >expect_nul &&
 		git repo structure --format=nul >out 2>err &&
 
+		test_cmp expect_nul out &&
+		test_line_count = 0 err &&
+
+		# "-z", as a synonym to "--format=nul", participates in the
+		# usual "last one wins" rule.
+		git repo structure --format=table -z >out 2>err &&
+
 		test_cmp expect_nul out &&
 		test_line_count = 0 err
 	)

From bdc5341ff65278a3cc80b2e8a02a2f02aa1fac06 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Fri, 5 Dec 2025 14:49:13 +0900
Subject: [PATCH 187/553] The sixth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/RelNotes/2.53.0.adoc | 31 ++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc
index c4dfeb1c23b406..9e896bd4244389 100644
--- a/Documentation/RelNotes/2.53.0.adoc
+++ b/Documentation/RelNotes/2.53.0.adoc
@@ -20,6 +20,9 @@ UI, Workflows & Features
 
  * Add a new manual that describes the data model.
 
+ * "git fast-import" learns "--strip-if-invalid" option to drop
+   invalid cryptographic signature from objects.
+
 
 Performance, Internal Implementation, Development Support etc.
 --------------------------------------------------------------
@@ -38,6 +41,13 @@ Performance, Internal Implementation, Development Support etc.
  * A part of code paths that deals with loose objects has been cleaned
    up.
 
+ * "make strip" has been taught to strip "scalar" as well as "git".
+
+ * Dockerised jobs at the GitHub Actions CI have been taught to show
+   more details of failed tests.
+
+ * Code refactoring around object database sources.
+
 
 Fixes since v2.52
 -----------------
@@ -105,8 +115,29 @@ Fixes since v2.52
  * Various issues detected by Asan have been corrected.
    (merge a031b6181a jk/asan-bonanza later to maint).
 
+ * "git config get --path" segfaulted on an ":(optional)path" that
+   does not exist, which has been corrected.
+   (merge 0bd16856ff jc/optional-path later to maint).
+
+ * The "--committer-date-is-author-date" option of "git am/rebase" is
+   a misguided one.  The documentation is updated to discourage its
+   use.
+   (merge fbf3d0669f kh/doc-committer-date-is-author-date later to maint).
+
+ * The option help text given by "git config unset -h" described
+   the "--all" option to "replace", not "unset", multiple variables,
+   which has been corrected.
+   (merge 18bf67b753 rs/config-unset-opthelp-fix later to maint).
+
+ * The error message given by "git config set", when the variable
+   being updated has more than one values defined, used old style "git
+   config" syntax with an incorrect option in its hint, both of which
+   have been corrected.
+   (merge df963f0df4 rs/config-set-multi-error-message-fix later to maint).
+
  * Other code cleanup, docfix, build fix, etc.
    (merge 46207a54cc qj/doc-http-bad-want-response later to maint).
    (merge df90eccd93 kh/doc-commit-extra-references later to maint).
    (merge f18aa68861 rs/xmkstemp-simplify later to maint).
    (merge fddba8f737 ja/doc-synopsis-style later to maint).
+   (merge 22ce0cb639 en/xdiff-cleanup-2 later to maint).

From d5e4aef3586c07c31e3e4d76ce7fdf0f9843314f Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sat, 6 Dec 2025 12:47:32 +0100
Subject: [PATCH 188/553] t/unit-tests: update clar to 39f11fe

Update clar to commit 39f11fe (Merge pull request #131 from
pks-gitlab/pks-integer-double-evaluation, 2025-12-05). This commit
includes the following changes relevant to Git:

  - There are now typesafe integer comparison functions. Furthermore,
    the range of comparison functions has been extended to also include
    relative comparisons, like "greater than".

  - There is a new `cl_failf()` macro that allows the caller to specify
    an error message with formatting directives.

  - The TAP format has been fixed to correctly terminate YAML blocks
    with "...\n" instead of "---\n".
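
To illustrate the first two points, here is a minimal sketch of a
comparison helper that takes its operands as intmax_t instead of
casting them to int. The names `enum comparison` and `compare_i()` are
hypothetical stand-ins chosen for this example and are not clar's
actual identifiers:

```c
#include <stdint.h>

/* Hypothetical names for this sketch; not clar's own identifiers. */
enum comparison { CMP_EQ, CMP_LT, CMP_LE, CMP_GT, CMP_GE };

/*
 * Compare two signed integers after widening them to intmax_t, so
 * that values outside the range of int are not truncated first.
 * Returns 1 when the comparison holds and 0 otherwise.
 */
int compare_i(enum comparison cmp, intmax_t a, intmax_t b)
{
	switch (cmp) {
	case CMP_EQ: return a == b;
	case CMP_LT: return a < b;
	case CMP_LE: return a <= b;
	case CMP_GT: return a > b;
	case CMP_GE: return a >= b;
	}
	return 0; /* unknown comparison: treat as not fulfilled */
}
```

A value such as `INTMAX_C(1) << 32` compares as greater than 0 here,
whereas casting both sides to a 32-bit int before comparing could make
them look equal on common platforms.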

Note that we already had a `cl_failf()` function declared in our own
sources. This function is equivalent to the upstreamed function, so we
can simply drop it now.
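
For readers unfamiliar with how a failure message with formatting
directives can be built, the following is a minimal sketch of the usual
two-pass vsnprintf() technique: measure first, then allocate and
format. `format_message()` is a hypothetical helper written for this
example; it is not a clar function:

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of clar: format a message into a
 * freshly allocated buffer. Returns NULL on failure; the caller
 * frees the result.
 */
char *format_message(const char *fmt, ...)
{
	va_list args, args_copy;
	char *buf;
	int len;

	va_start(args, fmt);

	/* First pass: a NULL buffer of size 0 only computes the length. */
	va_copy(args_copy, args);
	len = vsnprintf(NULL, 0, fmt, args_copy);
	va_end(args_copy);

	if (len < 0) {
		va_end(args);
		return NULL;
	}

	/* Second pass: format into a buffer of exactly the right size. */
	buf = calloc(1, (size_t)len + 1);
	if (buf)
		vsnprintf(buf, (size_t)len + 1, fmt, args);
	va_end(args);
	return buf;
}
```

The copy made with va_copy() is needed because a va_list cannot be
reused after vsnprintf() has consumed it.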

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/unit-tests/clar/.github/workflows/ci.yml    |   2 +-
 t/unit-tests/clar/clar.c                      | 146 +++++++++++++++++-
 t/unit-tests/clar/clar.h                      |  82 +++++++++-
 t/unit-tests/clar/clar/print.h                |   2 +-
 t/unit-tests/clar/test/expected/quiet         |  40 ++++-
 .../clar/test/expected/summary_with_filename  |  42 ++++-
 .../test/expected/summary_without_filename    |  42 ++++-
 t/unit-tests/clar/test/expected/tap           |  88 +++++++++--
 .../clar/test/expected/without_arguments      |  42 ++++-
 t/unit-tests/clar/test/selftest.c             |  10 +-
 t/unit-tests/clar/test/suites/combined.c      |  65 +++++++-
 t/unit-tests/unit-test.h                      |   6 -
 12 files changed, 508 insertions(+), 59 deletions(-)

diff --git a/t/unit-tests/clar/.github/workflows/ci.yml b/t/unit-tests/clar/.github/workflows/ci.yml
index 4d4724222c3e89..14cb4ed1d4598a 100644
--- a/t/unit-tests/clar/.github/workflows/ci.yml
+++ b/t/unit-tests/clar/.github/workflows/ci.yml
@@ -53,7 +53,7 @@ jobs:
       if: matrix.platform.image == 'i386/debian:latest'
       run: apt -q update && apt -q -y install cmake gcc libc6-amd64 lib64stdc++6 make python3
     - name: Check out
-      uses: actions/checkout@v4
+      uses: actions/checkout@v6
     - name: Build
       shell: bash
       run: |
diff --git a/t/unit-tests/clar/clar.c b/t/unit-tests/clar/clar.c
index d6176e50b2214b..e959a5ae028b5a 100644
--- a/t/unit-tests/clar/clar.c
+++ b/t/unit-tests/clar/clar.c
@@ -24,6 +24,14 @@
 #include <sys/types.h>
 #include <sys/stat.h>
 
+#ifndef va_copy
+#	ifdef __va_copy
+#		define va_copy(dst, src) __va_copy(dst, src)
+#	else
+#		define va_copy(dst, src) ((dst) = (src))
+#	endif
+#endif
+
 #if defined(__UCLIBC__) && ! defined(__UCLIBC_HAS_WCHAR__)
 	/*
 	 * uClibc can optionally be built without wchar support, in which case
@@ -76,8 +84,10 @@
 #			define S_ISDIR(x) ((x & _S_IFDIR) != 0)
 #		endif
 #		define p_snprintf(buf,sz,fmt,...) _snprintf_s(buf,sz,_TRUNCATE,fmt,__VA_ARGS__)
+#		define p_vsnprintf _vsnprintf
 #	else
 #		define p_snprintf snprintf
+#		define p_vsnprintf vsnprintf
 #	endif
 
 #	define localtime_r(timer, buf) (localtime_s(buf, timer) == 0 ? buf : NULL)
@@ -86,6 +96,7 @@
 #	include <unistd.h>
 #	define _MAIN_CC
 #	define p_snprintf snprintf
+#	define p_vsnprintf vsnprintf
 	typedef struct stat STAT_T;
 #endif
 
@@ -699,13 +710,14 @@ void clar__skip(void)
 	abort_test();
 }
 
-void clar__fail(
+static void clar__failv(
 	const char *file,
 	const char *function,
 	size_t line,
+	int should_abort,
 	const char *error_msg,
 	const char *description,
-	int should_abort)
+	va_list args)
 {
 	struct clar_error *error;
 
@@ -725,9 +737,19 @@ void clar__fail(
 	error->line_number = _clar.invoke_line ? _clar.invoke_line : line;
 	error->error_msg = error_msg;
 
-	if (description != NULL &&
-	    (error->description = strdup(description)) == NULL)
-		clar_abort("Failed to allocate description.\n");
+	if (description != NULL) {
+		va_list args_copy;
+		int len;
+
+		va_copy(args_copy, args);
+		if ((len = p_vsnprintf(NULL, 0, description, args_copy)) < 0)
+			clar_abort("Failed to compute description.");
+		va_end(args_copy);
+
+		if ((error->description = calloc(1, len + 1)) == NULL)
+			clar_abort("Failed to allocate buffer.");
+		p_vsnprintf(error->description, len + 1, description, args);
+	}
 
 	_clar.total_errors++;
 	_clar.last_report->status = CL_TEST_FAILURE;
@@ -736,6 +758,34 @@ void clar__fail(
 		abort_test();
 }
 
+void clar__failf(
+	const char *file,
+	const char *function,
+	size_t line,
+	int should_abort,
+	const char *error_msg,
+	const char *description,
+	...)
+{
+	va_list args;
+	va_start(args, description);
+	clar__failv(file, function, line, should_abort, error_msg,
+		    description, args);
+	va_end(args);
+}
+
+void clar__fail(
+	const char *file,
+	const char *function,
+	size_t line,
+	const char *error_msg,
+	const char *description,
+	int should_abort)
+{
+	clar__failf(file, function, line, should_abort, error_msg,
+		    description ? "%s" : NULL, description);
+}
+
 void clar__assert(
 	int condition,
 	const char *file,
@@ -889,6 +939,92 @@ void clar__assert_equal(
 		clar__fail(file, function, line, err, buf, should_abort);
 }
 
+void clar__assert_compare_i(
+	const char *file,
+	const char *func,
+	size_t line,
+	int should_abort,
+	enum clar_comparison cmp,
+	intmax_t value1,
+	intmax_t value2,
+	const char *error,
+	const char *description,
+	...)
+{
+	int fulfilled;
+	switch (cmp) {
+	case CLAR_COMPARISON_EQ:
+		fulfilled = value1 == value2;
+		break;
+	case CLAR_COMPARISON_LT:
+		fulfilled = value1 < value2;
+		break;
+	case CLAR_COMPARISON_LE:
+		fulfilled = value1 <= value2;
+		break;
+	case CLAR_COMPARISON_GT:
+		fulfilled = value1 > value2;
+		break;
+	case CLAR_COMPARISON_GE:
+		fulfilled = value1 >= value2;
+		break;
+	default:
+		cl_assert(0);
+		return;
+	}
+
+	if (!fulfilled) {
+		va_list args;
+		va_start(args, description);
+		clar__failv(file, func, line, should_abort, error,
+			    description, args);
+		va_end(args);
+	}
+}
+
+void clar__assert_compare_u(
+	const char *file,
+	const char *func,
+	size_t line,
+	int should_abort,
+	enum clar_comparison cmp,
+	uintmax_t value1,
+	uintmax_t value2,
+	const char *error,
+	const char *description,
+	...)
+{
+	int fulfilled;
+	switch (cmp) {
+	case CLAR_COMPARISON_EQ:
+		fulfilled = value1 == value2;
+		break;
+	case CLAR_COMPARISON_LT:
+		fulfilled = value1 < value2;
+		break;
+	case CLAR_COMPARISON_LE:
+		fulfilled = value1 <= value2;
+		break;
+	case CLAR_COMPARISON_GT:
+		fulfilled = value1 > value2;
+		break;
+	case CLAR_COMPARISON_GE:
+		fulfilled = value1 >= value2;
+		break;
+	default:
+		cl_assert(0);
+		return;
+	}
+
+	if (!fulfilled) {
+		va_list args;
+		va_start(args, description);
+		clar__failv(file, func, line, should_abort, error,
+			    description, args);
+		va_end(args);
+	}
+}
+
 void cl_set_cleanup(void (*cleanup)(void *), void *opaque)
 {
 	_clar.local_cleanup = cleanup;
diff --git a/t/unit-tests/clar/clar.h b/t/unit-tests/clar/clar.h
index ca72292ae918da..f7e43630226434 100644
--- a/t/unit-tests/clar/clar.h
+++ b/t/unit-tests/clar/clar.h
@@ -7,6 +7,7 @@
 #ifndef __CLAR_TEST_H__
 #define __CLAR_TEST_H__
 
+#include <inttypes.h>
 #include <stdlib.h>
 #include <limits.h>
 
@@ -149,6 +150,7 @@ const char *cl_fixture_basename(const char *fixture_name);
  * Forced failure/warning
  */
 #define cl_fail(desc) clar__fail(CLAR_CURRENT_FILE, CLAR_CURRENT_FUNC, CLAR_CURRENT_LINE, "Test failed.", desc, 1)
+#define cl_failf(desc,...) clar__failf(CLAR_CURRENT_FILE, CLAR_CURRENT_FUNC, CLAR_CURRENT_LINE, 1, "Test failed.", desc, __VA_ARGS__)
 #define cl_warning(desc) clar__fail(CLAR_CURRENT_FILE, CLAR_CURRENT_FUNC, CLAR_CURRENT_LINE, "Warning during test execution:", desc, 0)
 
 #define cl_skip() clar__skip()
@@ -168,9 +170,42 @@ const char *cl_fixture_basename(const char *fixture_name);
 #define cl_assert_equal_wcsn(wcs1,wcs2,len) clar__assert_equal(CLAR_CURRENT_FILE,CLAR_CURRENT_FUNC,CLAR_CURRENT_LINE,"String mismatch: " #wcs1 " != " #wcs2, 1, "%.*ls", (wcs1), (wcs2), (int)(len))
 #define cl_assert_equal_wcsn_(wcs1,wcs2,len,note) clar__assert_equal(CLAR_CURRENT_FILE,CLAR_CURRENT_FUNC,CLAR_CURRENT_LINE,"String mismatch: " #wcs1 " != " #wcs2 " (" #note ")", 1, "%.*ls", (wcs1), (wcs2), (int)(len))
 
-#define cl_assert_equal_i(i1,i2) clar__assert_equal(CLAR_CURRENT_FILE,CLAR_CURRENT_FUNC,CLAR_CURRENT_LINE,#i1 " != " #i2, 1, "%d", (int)(i1), (int)(i2))
-#define cl_assert_equal_i_(i1,i2,note) clar__assert_equal(CLAR_CURRENT_FILE,CLAR_CURRENT_FUNC,CLAR_CURRENT_LINE,#i1 " != " #i2 " (" #note ")", 1, "%d", (i1), (i2))
-#define cl_assert_equal_i_fmt(i1,i2,fmt) clar__assert_equal(CLAR_CURRENT_FILE,CLAR_CURRENT_FUNC,CLAR_CURRENT_LINE,#i1 " != " #i2, 1, (fmt), (int)(i1), (int)(i2))
+#define cl_assert_compare_i_(i1, i2, cmp, error, ...) clar__assert_compare_i(CLAR_CURRENT_FILE, CLAR_CURRENT_FUNC, CLAR_CURRENT_LINE, 1, cmp, \
+									     (i1), (i2), "Expected comparison to hold: " error, __VA_ARGS__)
+#define cl_assert_compare_i(i1, i2, cmp, error, fmt) do { \
+	intmax_t v1 = (i1), v2 = (i2); \
+	clar__assert_compare_i(CLAR_CURRENT_FILE, CLAR_CURRENT_FUNC, CLAR_CURRENT_LINE, 1, cmp, \
+			       v1, v2, "Expected comparison to hold: " error, fmt, v1, v2); \
+} while (0)
+#define cl_assert_equal_i_(i1, i2, ...)    cl_assert_compare_i_(i1, i2, CLAR_COMPARISON_EQ, #i1 " == " #i2, __VA_ARGS__)
+#define cl_assert_equal_i(i1, i2)          cl_assert_compare_i (i1, i2, CLAR_COMPARISON_EQ, #i1 " == " #i2, "%"PRIdMAX " != %"PRIdMAX)
+#define cl_assert_equal_i_fmt(i1, i2, fmt) cl_assert_compare_i_(i1, i2, CLAR_COMPARISON_EQ, #i1 " == " #i2,  fmt " != " fmt, (int)(i1), (int)(i2))
+#define cl_assert_lt_i_(i1, i2, ...) cl_assert_compare_i_(i1, i2, CLAR_COMPARISON_LT, #i1 " < " #i2, __VA_ARGS__)
+#define cl_assert_lt_i(i1, i2)       cl_assert_compare_i (i1, i2, CLAR_COMPARISON_LT, #i1 " < " #i2, "%"PRIdMAX " >= %"PRIdMAX)
+#define cl_assert_le_i_(i1, i2, ...) cl_assert_compare_i_(i1, i2, CLAR_COMPARISON_LE, #i1 " <= " #i2, __VA_ARGS__)
+#define cl_assert_le_i(i1, i2)       cl_assert_compare_i (i1, i2, CLAR_COMPARISON_LE, #i1 " <= " #i2, "%"PRIdMAX " > %"PRIdMAX)
+#define cl_assert_gt_i_(i1, i2, ...) cl_assert_compare_i_(i1, i2, CLAR_COMPARISON_GT, #i1 " > " #i2, __VA_ARGS__)
+#define cl_assert_gt_i(i1, i2)       cl_assert_compare_i (i1, i2, CLAR_COMPARISON_GT, #i1 " > " #i2, "%"PRIdMAX " <= %"PRIdMAX)
+#define cl_assert_ge_i_(i1, i2, ...) cl_assert_compare_i_(i1, i2, CLAR_COMPARISON_GE, #i1 " >= " #i2, __VA_ARGS__)
+#define cl_assert_ge_i(i1, i2)       cl_assert_compare_i (i1, i2, CLAR_COMPARISON_GE, #i1 " >= " #i2, "%"PRIdMAX " < %"PRIdMAX)
+
+#define cl_assert_compare_u_(u1, u2, cmp, error, ...) clar__assert_compare_u(CLAR_CURRENT_FILE, CLAR_CURRENT_FUNC, CLAR_CURRENT_LINE, 1, cmp, \
+									     (u1), (u2), "Expected comparison to hold: " error, __VA_ARGS__)
+#define cl_assert_compare_u(u1, u2, cmp, error, fmt) do { \
+	uintmax_t v1 = (u1), v2 = (u2); \
+	clar__assert_compare_u(CLAR_CURRENT_FILE, CLAR_CURRENT_FUNC, CLAR_CURRENT_LINE, 1, cmp, \
+			       v1, v2, "Expected comparison to hold: " error, fmt, v1, v2); \
+} while (0)
+#define cl_assert_equal_u_(u1, u2, ...) cl_assert_compare_u_(u1, u2, CLAR_COMPARISON_EQ, #u1 " == " #u2, __VA_ARGS__)
+#define cl_assert_equal_u(u1, u2)       cl_assert_compare_u (u1, u2, CLAR_COMPARISON_EQ, #u1 " == " #u2, "%"PRIuMAX " != %"PRIuMAX)
+#define cl_assert_lt_u_(u1, u2, ...) cl_assert_compare_u_(u1, u2, CLAR_COMPARISON_LT, #u1 " < " #u2, __VA_ARGS__)
+#define cl_assert_lt_u(u1, u2)       cl_assert_compare_u (u1, u2, CLAR_COMPARISON_LT, #u1 " < " #u2, "%"PRIuMAX " >= %"PRIuMAX)
+#define cl_assert_le_u_(u1, u2, ...) cl_assert_compare_u_(u1, u2, CLAR_COMPARISON_LE, #u1 " <= " #u2, __VA_ARGS__)
+#define cl_assert_le_u(u1, u2)       cl_assert_compare_u (u1, u2, CLAR_COMPARISON_LE, #u1 " <= " #u2, "%"PRIuMAX " > %"PRIuMAX)
+#define cl_assert_gt_u_(u1, u2, ...) cl_assert_compare_u_(u1, u2, CLAR_COMPARISON_GT, #u1 " > " #u2, __VA_ARGS__)
+#define cl_assert_gt_u(u1, u2)       cl_assert_compare_u (u1, u2, CLAR_COMPARISON_GT, #u1 " > " #u2, "%"PRIuMAX " <= %"PRIuMAX)
+#define cl_assert_ge_u_(u1, u2, ...) cl_assert_compare_u_(u1, u2, CLAR_COMPARISON_GE, #u1 " >= " #u2, __VA_ARGS__)
+#define cl_assert_ge_u(u1, u2)       cl_assert_compare_u (u1, u2, CLAR_COMPARISON_GE, #u1 " >= " #u2, "%"PRIuMAX " < %"PRIuMAX)
 
 #define cl_assert_equal_b(b1,b2) clar__assert_equal(CLAR_CURRENT_FILE,CLAR_CURRENT_FUNC,CLAR_CURRENT_LINE,#b1 " != " #b2, 1, "%d", (int)((b1) != 0),(int)((b2) != 0))
 
@@ -186,6 +221,15 @@ void clar__fail(
 	const char *description,
 	int should_abort);
 
+void clar__failf(
+	const char *file,
+	const char *func,
+	size_t line,
+	int should_abort,
+	const char *error,
+	const char *description,
+	...);
+
 void clar__assert(
 	int condition,
 	const char *file,
@@ -204,6 +248,38 @@ void clar__assert_equal(
 	const char *fmt,
 	...);
 
+enum clar_comparison {
+	CLAR_COMPARISON_EQ,
+	CLAR_COMPARISON_LT,
+	CLAR_COMPARISON_LE,
+	CLAR_COMPARISON_GT,
+	CLAR_COMPARISON_GE,
+};
+
+void clar__assert_compare_i(
+	const char *file,
+	const char *func,
+	size_t line,
+	int should_abort,
+	enum clar_comparison cmp,
+	intmax_t value1,
+	intmax_t value2,
+	const char *error,
+	const char *description,
+	...);
+
+void clar__assert_compare_u(
+	const char *file,
+	const char *func,
+	size_t line,
+	int should_abort,
+	enum clar_comparison cmp,
+	uintmax_t value1,
+	uintmax_t value2,
+	const char *error,
+	const char *description,
+	...);
+
 void clar__set_invokepoint(
 	const char *file,
 	const char *func,
diff --git a/t/unit-tests/clar/clar/print.h b/t/unit-tests/clar/clar/print.h
index 89b66591d7556d..6a2321b399d192 100644
--- a/t/unit-tests/clar/clar/print.h
+++ b/t/unit-tests/clar/clar/print.h
@@ -164,7 +164,7 @@ static void clar_print_tap_ontest(const char *suite_name, const char *test_name,
 			printf("      file: '"); print_escaped(error->file); printf("'\n");
 			printf("      line: %" PRIuMAX "\n", error->line_number);
 			printf("      function: '%s'\n", error->function);
-			printf("    ---\n");
+			printf("    ...\n");
 		}
 
 		break;
diff --git a/t/unit-tests/clar/test/expected/quiet b/t/unit-tests/clar/test/expected/quiet
index 280c99d8ad5eba..a93273b5a23003 100644
--- a/t/unit-tests/clar/test/expected/quiet
+++ b/t/unit-tests/clar/test/expected/quiet
@@ -18,27 +18,57 @@ combined::strings_with_length [file:42]
 
   5) Failure:
 combined::int [file:42]
-  101 != value ("extra note on failing test")
+  Expected comparison to hold: 101 == value
   101 != 100
 
   6) Failure:
+combined::int_note [file:42]
+  Expected comparison to hold: 101 == value
+  extra note on failing test
+
+  7) Failure:
 combined::int_fmt [file:42]
-  022 != value
+  Expected comparison to hold: 022 == value
   0022 != 0144
 
-  7) Failure:
+  8) Failure:
 combined::bool [file:42]
   0 != value
   0 != 1
 
-  8) Failure:
+  9) Failure:
 combined::multiline_description [file:42]
   Function call failed: -1
   description line 1
   description line 2
 
-  9) Failure:
+  10) Failure:
 combined::null_string [file:42]
   String mismatch: "expected" != actual ("this one fails")
   'expected' != NULL
 
+  11) Failure:
+combined::failf [file:42]
+  Test failed.
+  some reason: foo
+
+  12) Failure:
+combined::compare_i [file:42]
+  Expected comparison to hold: two < 1
+  2 >= 1
+
+  13) Failure:
+combined::compare_i_with_format [file:42]
+  Expected comparison to hold: two < 1
+  foo: bar
+
+  14) Failure:
+combined::compare_u [file:42]
+  Expected comparison to hold: two < 1
+  2 >= 1
+
+  15) Failure:
+combined::compare_u_with_format [file:42]
+  Expected comparison to hold: two < 1
+  foo: bar
+
diff --git a/t/unit-tests/clar/test/expected/summary_with_filename b/t/unit-tests/clar/test/expected/summary_with_filename
index 460160791d14c0..a9471cc7d51762 100644
--- a/t/unit-tests/clar/test/expected/summary_with_filename
+++ b/t/unit-tests/clar/test/expected/summary_with_filename
@@ -1,6 +1,6 @@
 Loaded 1 suites:
 Started (test status codes: OK='.' FAILURE='F' SKIPPED='S')
-FFFFFFFFF
+FFFFFFFFFFFFFFF
 
   1) Failure:
 combined::1 [file:42]
@@ -22,28 +22,58 @@ combined::strings_with_length [file:42]
 
   5) Failure:
 combined::int [file:42]
-  101 != value ("extra note on failing test")
+  Expected comparison to hold: 101 == value
   101 != 100
 
   6) Failure:
+combined::int_note [file:42]
+  Expected comparison to hold: 101 == value
+  extra note on failing test
+
+  7) Failure:
 combined::int_fmt [file:42]
-  022 != value
+  Expected comparison to hold: 022 == value
   0022 != 0144
 
-  7) Failure:
+  8) Failure:
 combined::bool [file:42]
   0 != value
   0 != 1
 
-  8) Failure:
+  9) Failure:
 combined::multiline_description [file:42]
   Function call failed: -1
   description line 1
   description line 2
 
-  9) Failure:
+  10) Failure:
 combined::null_string [file:42]
   String mismatch: "expected" != actual ("this one fails")
   'expected' != NULL
 
+  11) Failure:
+combined::failf [file:42]
+  Test failed.
+  some reason: foo
+
+  12) Failure:
+combined::compare_i [file:42]
+  Expected comparison to hold: two < 1
+  2 >= 1
+
+  13) Failure:
+combined::compare_i_with_format [file:42]
+  Expected comparison to hold: two < 1
+  foo: bar
+
+  14) Failure:
+combined::compare_u [file:42]
+  Expected comparison to hold: two < 1
+  2 >= 1
+
+  15) Failure:
+combined::compare_u_with_format [file:42]
+  Expected comparison to hold: two < 1
+  foo: bar
+
 written summary file to different.xml
diff --git a/t/unit-tests/clar/test/expected/summary_without_filename b/t/unit-tests/clar/test/expected/summary_without_filename
index 7874c1d98bc01b..83ba770d006e78 100644
--- a/t/unit-tests/clar/test/expected/summary_without_filename
+++ b/t/unit-tests/clar/test/expected/summary_without_filename
@@ -1,6 +1,6 @@
 Loaded 1 suites:
 Started (test status codes: OK='.' FAILURE='F' SKIPPED='S')
-FFFFFFFFF
+FFFFFFFFFFFFFFF
 
   1) Failure:
 combined::1 [file:42]
@@ -22,28 +22,58 @@ combined::strings_with_length [file:42]
 
   5) Failure:
 combined::int [file:42]
-  101 != value ("extra note on failing test")
+  Expected comparison to hold: 101 == value
   101 != 100
 
   6) Failure:
+combined::int_note [file:42]
+  Expected comparison to hold: 101 == value
+  extra note on failing test
+
+  7) Failure:
 combined::int_fmt [file:42]
-  022 != value
+  Expected comparison to hold: 022 == value
   0022 != 0144
 
-  7) Failure:
+  8) Failure:
 combined::bool [file:42]
   0 != value
   0 != 1
 
-  8) Failure:
+  9) Failure:
 combined::multiline_description [file:42]
   Function call failed: -1
   description line 1
   description line 2
 
-  9) Failure:
+  10) Failure:
 combined::null_string [file:42]
   String mismatch: "expected" != actual ("this one fails")
   'expected' != NULL
 
+  11) Failure:
+combined::failf [file:42]
+  Test failed.
+  some reason: foo
+
+  12) Failure:
+combined::compare_i [file:42]
+  Expected comparison to hold: two < 1
+  2 >= 1
+
+  13) Failure:
+combined::compare_i_with_format [file:42]
+  Expected comparison to hold: two < 1
+  foo: bar
+
+  14) Failure:
+combined::compare_u [file:42]
+  Expected comparison to hold: two < 1
+  2 >= 1
+
+  15) Failure:
+combined::compare_u_with_format [file:42]
+  Expected comparison to hold: two < 1
+  foo: bar
+
 written summary file to summary.xml
diff --git a/t/unit-tests/clar/test/expected/tap b/t/unit-tests/clar/test/expected/tap
index bddbd5dfe98b61..e67118d3aedbf8 100644
--- a/t/unit-tests/clar/test/expected/tap
+++ b/t/unit-tests/clar/test/expected/tap
@@ -8,7 +8,7 @@ not ok 1 - combined::1
       file: 'file'
       line: 42
       function: 'func'
-    ---
+    ...
 not ok 2 - combined::2
     ---
     reason: |
@@ -17,7 +17,7 @@ not ok 2 - combined::2
       file: 'file'
       line: 42
       function: 'func'
-    ---
+    ...
 not ok 3 - combined::strings
     ---
     reason: |
@@ -27,7 +27,7 @@ not ok 3 - combined::strings
       file: 'file'
       line: 42
       function: 'func'
-    ---
+    ...
 not ok 4 - combined::strings_with_length
     ---
     reason: |
@@ -37,28 +37,38 @@ not ok 4 - combined::strings_with_length
       file: 'file'
       line: 42
       function: 'func'
-    ---
+    ...
 not ok 5 - combined::int
     ---
     reason: |
-      101 != value ("extra note on failing test")
+      Expected comparison to hold: 101 == value
       101 != 100
     at:
       file: 'file'
       line: 42
       function: 'func'
+    ...
+not ok 6 - combined::int_note
     ---
-not ok 6 - combined::int_fmt
+    reason: |
+      Expected comparison to hold: 101 == value
+      extra note on failing test
+    at:
+      file: 'file'
+      line: 42
+      function: 'func'
+    ...
+not ok 7 - combined::int_fmt
     ---
     reason: |
-      022 != value
+      Expected comparison to hold: 022 == value
       0022 != 0144
     at:
       file: 'file'
       line: 42
       function: 'func'
-    ---
-not ok 7 - combined::bool
+    ...
+not ok 8 - combined::bool
     ---
     reason: |
       0 != value
@@ -67,8 +77,8 @@ not ok 7 - combined::bool
       file: 'file'
       line: 42
       function: 'func'
-    ---
-not ok 8 - combined::multiline_description
+    ...
+not ok 9 - combined::multiline_description
     ---
     reason: |
       Function call failed: -1
@@ -78,8 +88,8 @@ not ok 8 - combined::multiline_description
       file: 'file'
       line: 42
       function: 'func'
-    ---
-not ok 9 - combined::null_string
+    ...
+not ok 10 - combined::null_string
     ---
     reason: |
       String mismatch: "expected" != actual ("this one fails")
@@ -88,5 +98,55 @@ not ok 9 - combined::null_string
       file: 'file'
       line: 42
       function: 'func'
+    ...
+not ok 11 - combined::failf
+    ---
+    reason: |
+      Test failed.
+      some reason: foo
+    at:
+      file: 'file'
+      line: 42
+      function: 'func'
+    ...
+not ok 12 - combined::compare_i
     ---
-1..9
+    reason: |
+      Expected comparison to hold: two < 1
+      2 >= 1
+    at:
+      file: 'file'
+      line: 42
+      function: 'func'
+    ...
+not ok 13 - combined::compare_i_with_format
+    ---
+    reason: |
+      Expected comparison to hold: two < 1
+      foo: bar
+    at:
+      file: 'file'
+      line: 42
+      function: 'func'
+    ...
+not ok 14 - combined::compare_u
+    ---
+    reason: |
+      Expected comparison to hold: two < 1
+      2 >= 1
+    at:
+      file: 'file'
+      line: 42
+      function: 'func'
+    ...
+not ok 15 - combined::compare_u_with_format
+    ---
+    reason: |
+      Expected comparison to hold: two < 1
+      foo: bar
+    at:
+      file: 'file'
+      line: 42
+      function: 'func'
+    ...
+1..15
diff --git a/t/unit-tests/clar/test/expected/without_arguments b/t/unit-tests/clar/test/expected/without_arguments
index 1111d418a060f7..9891f45a703984 100644
--- a/t/unit-tests/clar/test/expected/without_arguments
+++ b/t/unit-tests/clar/test/expected/without_arguments
@@ -1,6 +1,6 @@
 Loaded 1 suites:
 Started (test status codes: OK='.' FAILURE='F' SKIPPED='S')
-FFFFFFFFF
+FFFFFFFFFFFFFFF
 
   1) Failure:
 combined::1 [file:42]
@@ -22,27 +22,57 @@ combined::strings_with_length [file:42]
 
   5) Failure:
 combined::int [file:42]
-  101 != value ("extra note on failing test")
+  Expected comparison to hold: 101 == value
   101 != 100
 
   6) Failure:
+combined::int_note [file:42]
+  Expected comparison to hold: 101 == value
+  extra note on failing test
+
+  7) Failure:
 combined::int_fmt [file:42]
-  022 != value
+  Expected comparison to hold: 022 == value
   0022 != 0144
 
-  7) Failure:
+  8) Failure:
 combined::bool [file:42]
   0 != value
   0 != 1
 
-  8) Failure:
+  9) Failure:
 combined::multiline_description [file:42]
   Function call failed: -1
   description line 1
   description line 2
 
-  9) Failure:
+  10) Failure:
 combined::null_string [file:42]
   String mismatch: "expected" != actual ("this one fails")
   'expected' != NULL
 
+  11) Failure:
+combined::failf [file:42]
+  Test failed.
+  some reason: foo
+
+  12) Failure:
+combined::compare_i [file:42]
+  Expected comparison to hold: two < 1
+  2 >= 1
+
+  13) Failure:
+combined::compare_i_with_format [file:42]
+  Expected comparison to hold: two < 1
+  foo: bar
+
+  14) Failure:
+combined::compare_u [file:42]
+  Expected comparison to hold: two < 1
+  2 >= 1
+
+  15) Failure:
+combined::compare_u_with_format [file:42]
+  Expected comparison to hold: two < 1
+  foo: bar
+
diff --git a/t/unit-tests/clar/test/selftest.c b/t/unit-tests/clar/test/selftest.c
index eed83e4512006d..6eadc64c4813b8 100644
--- a/t/unit-tests/clar/test/selftest.c
+++ b/t/unit-tests/clar/test/selftest.c
@@ -298,7 +298,7 @@ void test_selftest__help(void)
 
 void test_selftest__without_arguments(void)
 {
-	cl_invoke(assert_output("combined", "without_arguments", 9, NULL));
+	cl_invoke(assert_output("combined", "without_arguments", 15, NULL));
 }
 
 void test_selftest__specific_test(void)
@@ -313,12 +313,12 @@ void test_selftest__stop_on_failure(void)
 
 void test_selftest__quiet(void)
 {
-	cl_invoke(assert_output("combined", "quiet", 9, "-q", NULL));
+	cl_invoke(assert_output("combined", "quiet", 15, "-q", NULL));
 }
 
 void test_selftest__tap(void)
 {
-	cl_invoke(assert_output("combined", "tap", 9, "-t", NULL));
+	cl_invoke(assert_output("combined", "tap", 15, "-t", NULL));
 }
 
 void test_selftest__suite_names(void)
@@ -329,7 +329,7 @@ void test_selftest__suite_names(void)
 void test_selftest__summary_without_filename(void)
 {
 	struct stat st;
-	cl_invoke(assert_output("combined", "summary_without_filename", 9, "-r", NULL));
+	cl_invoke(assert_output("combined", "summary_without_filename", 15, "-r", NULL));
 	/* The summary contains timestamps, so we cannot verify its contents. */
 	cl_must_pass(stat("summary.xml", &st));
 }
@@ -337,7 +337,7 @@ void test_selftest__summary_without_filename(void)
 void test_selftest__summary_with_filename(void)
 {
 	struct stat st;
-	cl_invoke(assert_output("combined", "summary_with_filename", 9, "-rdifferent.xml", NULL));
+	cl_invoke(assert_output("combined", "summary_with_filename", 15, "-rdifferent.xml", NULL));
 	/* The summary contains timestamps, so we cannot verify its contents. */
 	cl_must_pass(stat("different.xml", &st));
 }
diff --git a/t/unit-tests/clar/test/suites/combined.c b/t/unit-tests/clar/test/suites/combined.c
index e8b41c98c37fa2..9e9dbc2fb1f180 100644
--- a/t/unit-tests/clar/test/suites/combined.c
+++ b/t/unit-tests/clar/test/suites/combined.c
@@ -55,7 +55,12 @@ void test_combined__strings_with_length(void)
 void test_combined__int(void)
 {
 	int value = 100;
-	cl_assert_equal_i(100, value);
+	cl_assert_equal_i(101, value);
+}
+
+void test_combined__int_note(void)
+{
+	int value = 100;
 	cl_assert_equal_i_(101, value, "extra note on failing test");
 }
 
@@ -83,3 +88,61 @@ void test_combined__null_string(void)
 	cl_assert_equal_s(actual, actual);
 	cl_assert_equal_s_("expected", actual, "this one fails");
 }
+
+void test_combined__failf(void)
+{
+	cl_failf("some reason: %s", "foo");
+}
+
+void test_combined__compare_i(void)
+{
+	int one = 1, two = 2;
+
+	cl_assert_equal_i(one, 1);
+	cl_assert_equal_i(one, 1);
+	cl_assert_equal_i_(one, 1, "format");
+	cl_assert_lt_i(one, 2);
+	cl_assert_lt_i_(one, 2, "format");
+	cl_assert_le_i(one, 2);
+	cl_assert_le_i(two, 2);
+	cl_assert_le_i_(two, 2, "format");
+	cl_assert_gt_i(two, 1);
+	cl_assert_gt_i_(two, 1, "format");
+	cl_assert_ge_i(two, 2);
+	cl_assert_ge_i(3, two);
+	cl_assert_ge_i_(3, two, "format");
+
+	cl_assert_lt_i(two, 1); /* this one fails */
+}
+
+void test_combined__compare_i_with_format(void)
+{
+	int two = 2;
+	cl_assert_lt_i_(two, 1, "foo: %s", "bar");
+}
+
+void test_combined__compare_u(void)
+{
+	unsigned one = 1, two = 2;
+
+	cl_assert_equal_u(one, 1);
+	cl_assert_equal_u_(one, 1, "format");
+	cl_assert_lt_u(one, 2);
+	cl_assert_lt_u_(one, 2, "format");
+	cl_assert_le_u(one, 2);
+	cl_assert_le_u(two, 2);
+	cl_assert_le_u_(two, 2, "format");
+	cl_assert_gt_u(two, 1);
+	cl_assert_gt_u_(two, 1, "format");
+	cl_assert_ge_u(two, 2);
+	cl_assert_ge_u(3, two);
+	cl_assert_ge_u_(3, two, "format");
+
+	cl_assert_lt_u(two, 1); /* this one fails */
+}
+
+void test_combined__compare_u_with_format(void)
+{
+	unsigned two = 2;
+	cl_assert_lt_u_(two, 1, "foo: %s", "bar");
+}
diff --git a/t/unit-tests/unit-test.h b/t/unit-tests/unit-test.h
index 39a0b72a05dec3..5398b449171560 100644
--- a/t/unit-tests/unit-test.h
+++ b/t/unit-tests/unit-test.h
@@ -7,9 +7,3 @@
 #else
 # include GIT_CLAR_DECLS_H
 #endif
-
-#define cl_failf(fmt, ...) do { \
-	char desc[4096]; \
-	snprintf(desc, sizeof(desc), fmt, __VA_ARGS__); \
-	clar__fail(__FILE__, __func__, __LINE__, "Test failed.", desc, 1); \
-} while (0)

From 2e53d29f53e2a4c6bb5b9f8f3169c178e5ef62a0 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sat, 6 Dec 2025 12:47:33 +0100
Subject: [PATCH 189/553] t/unit-tests: demonstrate use of integer comparison
 assertions

The clar project has introduced a couple of new assertions that perform
relative integer comparisons, like "greater than" or "less or equal".
Adapt the reftable-record unit tests to demonstrate their usage.
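
As a rough illustration of why such assertions are nicer than a bare
cl_assert(n > 0), the self-contained sketch below mimics the idea
outside of clar: the operands are widened to intmax_t, compared
according to an enum, and on failure both values are reported next to
the stringified expression. All sketch_*/SKETCH_* names are
hypothetical and not part of clar; only the message layout is modeled
on the expected self-test output.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names; loosely modeled on an enum-driven comparison. */
enum sketch_cmp { SKETCH_EQ, SKETCH_LT, SKETCH_LE, SKETCH_GT, SKETCH_GE };

/* Evaluate "a <cmp> b" on widened signed operands. */
static int sketch_holds(enum sketch_cmp cmp, intmax_t a, intmax_t b)
{
	switch (cmp) {
	case SKETCH_EQ: return a == b;
	case SKETCH_LT: return a < b;
	case SKETCH_LE: return a <= b;
	case SKETCH_GT: return a > b;
	case SKETCH_GE: return a >= b;
	}
	return 0;
}

/*
 * Returns 1 if the comparison holds.  Otherwise returns 0 and writes a
 * two-line report into buf: the stringified expression, then the
 * operand values joined by the negated operator.
 */
static int sketch_check_i(enum sketch_cmp cmp, intmax_t a, intmax_t b,
			  const char *expr, char *buf, size_t size)
{
	/* Indexed by enum sketch_cmp: the operator that held instead. */
	static const char *negated[] = { "!=", ">=", ">", "<=", "<" };

	assert(cmp >= SKETCH_EQ && cmp <= SKETCH_GE);
	if (sketch_holds(cmp, a, b))
		return 1;
	snprintf(buf, size,
		 "Expected comparison to hold: %s\n%" PRIdMAX " %s %" PRIdMAX,
		 expr, a, negated[cmp], b);
	return 0;
}

/* buf must be a char array so that sizeof(buf) yields its capacity. */
#define SKETCH_ASSERT_LT_I(a, b, buf) \
	sketch_check_i(SKETCH_LT, (a), (b), #a " < " #b, (buf), sizeof(buf))
```

With "int two = 2", a failing SKETCH_ASSERT_LT_I(two, 1, buf) leaves
both the expression "two < 1" and the actual values in buf, which a
plain boolean assertion cannot provide.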

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/unit-tests/u-reftable-record.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/t/unit-tests/u-reftable-record.c b/t/unit-tests/u-reftable-record.c
index 6c8c0d5374a6a2..1bf2e170dc96a0 100644
--- a/t/unit-tests/u-reftable-record.c
+++ b/t/unit-tests/u-reftable-record.c
@@ -51,10 +51,10 @@ void test_reftable_record__varint_roundtrip(void)
 		int n = put_var_int(&out, in);
 		uint64_t got = 0;
 
-		cl_assert(n > 0);
+		cl_assert_gt_i(n, 0);
 		out.len = n;
 		n = get_var_int(&got, &out);
-		cl_assert(n > 0);
+		cl_assert_gt_i(n, 0);
 
 		cl_assert_equal_i(got, in);
 	}
@@ -110,7 +110,7 @@ void test_reftable_record__ref_record_comparison(void)
 	cl_assert(reftable_record_equal(&in[1], &in[2],
 					REFTABLE_HASH_SIZE_SHA1) == 0);
 	cl_assert_equal_i(reftable_record_cmp(&in[1], &in[2], &cmp), 0);
-	cl_assert(cmp > 0);
+	cl_assert_gt_i(cmp, 0);
 
 	in[1].u.ref.value_type = in[0].u.ref.value_type;
 	cl_assert(reftable_record_equal(&in[0], &in[1],
@@ -184,7 +184,7 @@ void test_reftable_record__ref_record_roundtrip(void)
 
 		reftable_record_key(&in, &key);
 		n = reftable_record_encode(&in, dest, REFTABLE_HASH_SIZE_SHA1);
-		cl_assert(n > 0);
+		cl_assert_gt_i(n, 0);
 
 		/* decode into a non-zero reftable_record to test for leaks. */
 		m = reftable_record_decode(&out, key, i, dest, REFTABLE_HASH_SIZE_SHA1, &scratch);
@@ -228,11 +228,11 @@ void test_reftable_record__log_record_comparison(void)
 	cl_assert_equal_i(reftable_record_equal(&in[1], &in[2],
 						REFTABLE_HASH_SIZE_SHA1), 0);
 	cl_assert_equal_i(reftable_record_cmp(&in[1], &in[2], &cmp), 0);
-	cl_assert(cmp > 0);
+	cl_assert_gt_i(cmp, 0);
 	/* comparison should be reversed for equal keys, because
 	 * comparison is now performed on the basis of update indices */
 	cl_assert_equal_i(reftable_record_cmp(&in[0], &in[1], &cmp), 0);
-	cl_assert(cmp < 0);
+	cl_assert_lt_i(cmp, 0);
 
 	in[1].u.log.update_index = in[0].u.log.update_index;
 	cl_assert(reftable_record_equal(&in[0], &in[1],
@@ -344,7 +344,7 @@ void test_reftable_record__log_record_roundtrip(void)
 		reftable_record_key(&rec, &key);
 
 		n = reftable_record_encode(&rec, dest, REFTABLE_HASH_SIZE_SHA1);
-		cl_assert(n >= 0);
+		cl_assert_ge_i(n, 0);
 		valtype = reftable_record_val_type(&rec);
 		m = reftable_record_decode(&out, key, valtype, dest,
 					   REFTABLE_HASH_SIZE_SHA1, &scratch);
@@ -382,7 +382,7 @@ void test_reftable_record__key_roundtrip(void)
 	extra = 6;
 	n = reftable_encode_key(&restart, dest, last_key, key, extra);
 	cl_assert(!restart);
-	cl_assert(n > 0);
+	cl_assert_gt_i(n, 0);
 
 	cl_assert_equal_i(reftable_buf_addstr(&roundtrip,
 					      "refs/heads/master"), 0);
@@ -432,7 +432,7 @@ void test_reftable_record__obj_record_comparison(void)
 	cl_assert_equal_i(reftable_record_equal(&in[1], &in[2],
 						REFTABLE_HASH_SIZE_SHA1), 0);
 	cl_assert_equal_i(reftable_record_cmp(&in[1], &in[2], &cmp), 0);
-	cl_assert(cmp > 0);
+	cl_assert_gt_i(cmp, 0);
 
 	in[1].u.obj.offset_len = in[0].u.obj.offset_len;
 	cl_assert(reftable_record_equal(&in[0], &in[1], REFTABLE_HASH_SIZE_SHA1) != 0);
@@ -485,7 +485,7 @@ void test_reftable_record__obj_record_roundtrip(void)
 		t_copy(&in);
 		reftable_record_key(&in, &key);
 		n = reftable_record_encode(&in, dest, REFTABLE_HASH_SIZE_SHA1);
-		cl_assert(n > 0);
+		cl_assert_gt_i(n, 0);
 		extra = reftable_record_val_type(&in);
 		m = reftable_record_decode(&out, key, extra, dest,
 					   REFTABLE_HASH_SIZE_SHA1, &scratch);
@@ -535,7 +535,7 @@ void test_reftable_record__index_record_comparison(void)
 	cl_assert_equal_i(reftable_record_equal(&in[1], &in[2],
 						REFTABLE_HASH_SIZE_SHA1), 0);
 	cl_assert_equal_i(reftable_record_cmp(&in[1], &in[2], &cmp), 0);
-	cl_assert(cmp > 0);
+	cl_assert_gt_i(cmp, 0);
 
 	in[1].u.idx.offset = in[0].u.idx.offset;
 	cl_assert(reftable_record_equal(&in[0], &in[1],

From 84071a6deac84fdb99d2415900d7713829de393c Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Sat, 6 Dec 2025 12:47:34 +0100
Subject: [PATCH 190/553] gitattributes: disable blank-at-eof errors for clar
 test expectations

The clar unit testing framework carries a couple of files that contain
expected output for its self-tests. Some of these files intentionally
end with a blank line, which Git would consider a whitespace error by
default.

Teach our gitattributes to ignore those errors.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 .gitattributes | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitattributes b/.gitattributes
index 32583149c2f927..e416c3720568eb 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -17,3 +17,4 @@ CODE_OF_CONDUCT.md -whitespace
 /Documentation/gitk.adoc conflict-marker-size=32
 /Documentation/user-manual.adoc conflict-marker-size=32
 /t/t????-*.sh conflict-marker-size=32
+/t/unit-tests/clar/test/expected/* whitespace=-blank-at-eof

From e1ecf0dd6897eae1594b7e9345605b8f88485b95 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Sat, 6 Dec 2025 14:27:39 +0100
Subject: [PATCH 191/553] wrapper: add git_mkdtemp()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Extend git_mkstemps_mode() to optionally call mkdir(2) instead of
open(2), then use that ability to create a mkdtemp(3) replacement,
git_mkdtemp().  We'll start using it in the next commit.
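
The general technique can be sketched as follows. This is a minimal
illustration, not the code in wrapper.c: sketch_mkdtemp(),
SKETCH_TMP_MAX and the use of rand(3) are assumptions made for the
example, and a real implementation would want a stronger source of
randomness. The point is that mkdir(2) itself fails with EEXIST when
the name is taken, so no separate existence check is needed and the
loop can simply try another name.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>	/* rmdir(2), for callers that clean up */

#define SKETCH_TMP_MAX 16384	/* upper bound on attempts */

/*
 * Hypothetical stand-in for a mkdtemp(3) replacement.  Replaces the
 * trailing "XXXXXX" of pattern with random characters and calls
 * mkdir(2) until a name is free.  Returns pattern on success, or NULL
 * with errno set on failure.
 */
static char *sketch_mkdtemp(char *pattern)
{
	static const char letters[] =
		"abcdefghijklmnopqrstuvwxyz"
		"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
		"0123456789";
	const size_t num_letters = sizeof(letters) - 1;
	size_t len;
	char *x;
	int attempt;

	assert(pattern != NULL);
	len = strlen(pattern);
	if (len < 6 || strcmp(pattern + len - 6, "XXXXXX")) {
		errno = EINVAL;
		return NULL;
	}
	x = pattern + len - 6;

	for (attempt = 0; attempt < SKETCH_TMP_MAX; attempt++) {
		int i;

		/* rand(3) is used for brevity only; it is predictable. */
		for (i = 0; i < 6; i++)
			x[i] = letters[(size_t)rand() % num_letters];
		/* mkdir(2) creates the directory or fails atomically. */
		if (!mkdir(pattern, 0700))
			return pattern;
		/* Only a name collision is worth retrying. */
		if (errno != EEXIST)
			break;
	}
	return NULL;
}
```

A caller would pass a writable buffer such as "/tmp/demo-XXXXXX" and
remove the directory with rmdir(2) once it is no longer needed.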

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 wrapper.c | 21 +++++++++++++++++++--
 wrapper.h |  2 ++
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/wrapper.c b/wrapper.c
index 3d507d42045203..89f6effe84371b 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -446,7 +446,11 @@ int xmkstemp(char *filename_template)
 #undef TMP_MAX
 #define TMP_MAX 16384
 
-int git_mkstemps_mode(char *pattern, int suffix_len, int mode)
+/*
+ * Returns -1 on error, 0 if it created a directory, or an open file
+ * descriptor to the created regular file.
+ */
+static int git_mkdstemps_mode(char *pattern, int suffix_len, int mode, bool dir)
 {
 	static const char letters[] =
 		"abcdefghijklmnopqrstuvwxyz"
@@ -488,7 +492,10 @@ int git_mkstemps_mode(char *pattern, int suffix_len, int mode)
 			v /= num_letters;
 		}
 
-		fd = open(pattern, O_CREAT | O_EXCL | O_RDWR, mode);
+		if (dir)
+			fd = mkdir(pattern, mode);
+		else
+			fd = open(pattern, O_CREAT | O_EXCL | O_RDWR, mode);
 		if (fd >= 0)
 			return fd;
 		/*
@@ -503,6 +510,16 @@ int git_mkstemps_mode(char *pattern, int suffix_len, int mode)
 	return -1;
 }
 
+char *git_mkdtemp(char *pattern)
+{
+	return git_mkdstemps_mode(pattern, 0, 0700, true) ? NULL : pattern;
+}
+
+int git_mkstemps_mode(char *pattern, int suffix_len, int mode)
+{
+	return git_mkdstemps_mode(pattern, suffix_len, mode, false);
+}
+
 int git_mkstemp_mode(char *pattern, int mode)
 {
 	/* mkstemp is just mkstemps with no suffix */
diff --git a/wrapper.h b/wrapper.h
index 44a8597ac31426..15ac3bab6e9748 100644
--- a/wrapper.h
+++ b/wrapper.h
@@ -37,6 +37,8 @@ int xsnprintf(char *dst, size_t max, const char *fmt, ...);
 
 int xgethostname(char *buf, size_t len);
 
+char *git_mkdtemp(char *pattern);
+
 /* set default permissions by passing mode arguments to open(2) */
 int git_mkstemps_mode(char *pattern, int suffix_len, int mode);
 int git_mkstemp_mode(char *pattern, int mode);

From 5ecd3590a3052820eeb3f1d6764584c537b68938 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Sat, 6 Dec 2025 14:27:47 +0100
Subject: [PATCH 192/553] compat: use git_mkdtemp()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

A file might appear at the path returned by mktemp(3) before we call
mkdir(2).  Use the more robust git_mkdtemp() instead, which retries a
number of times and doesn't need to call lstat(2).

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 compat/mkdtemp.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/compat/mkdtemp.c b/compat/mkdtemp.c
index 11361195925c67..fcdd4e01e14613 100644
--- a/compat/mkdtemp.c
+++ b/compat/mkdtemp.c
@@ -2,7 +2,5 @@
 
 char *gitmkdtemp(char *template)
 {
-	if (!*mktemp(template) || mkdir(template, 0700))
-		return NULL;
-	return template;
+	return git_mkdtemp(template);
 }

From 47bf14750eee7e43e12d20414d3698f203245a35 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Sat, 6 Dec 2025 14:28:26 +0100
Subject: [PATCH 193/553] compat: remove mingw_mktemp()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Remove the mktemp(3) compatibility function now that its last caller was
removed by the previous commit.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 compat/mingw-posix.h |  3 ---
 compat/mingw.c       | 12 ------------
 2 files changed, 15 deletions(-)

diff --git a/compat/mingw-posix.h b/compat/mingw-posix.h
index 631a20868489be..0939feff27ffec 100644
--- a/compat/mingw-posix.h
+++ b/compat/mingw-posix.h
@@ -241,9 +241,6 @@ int mingw_chdir(const char *dirname);
 int mingw_chmod(const char *filename, int mode);
 #define chmod mingw_chmod
 
-char *mingw_mktemp(char *template);
-#define mktemp mingw_mktemp
-
 char *mingw_getcwd(char *pointer, int len);
 #define getcwd mingw_getcwd
 
diff --git a/compat/mingw.c b/compat/mingw.c
index 736a07a028ab4d..abdc9684214dac 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -1162,18 +1162,6 @@ unsigned int sleep (unsigned int seconds)
 	return 0;
 }
 
-char *mingw_mktemp(char *template)
-{
-	wchar_t wtemplate[MAX_PATH];
-	if (xutftowcs_path(wtemplate, template) < 0)
-		return NULL;
-	if (!_wmktemp(wtemplate))
-		return NULL;
-	if (xwcstoutf(template, wtemplate, strlen(template) + 1) < 0)
-		return NULL;
-	return template;
-}
-
 int mkstemp(char *template)
 {
 	return git_mkstemp_mode(template, 0600);

From 7bef658135944d26acf3e1ec9316ca11f4369cf8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Sat, 6 Dec 2025 14:29:43 +0100
Subject: [PATCH 194/553] banned.h: ban mktemp(3)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Older versions of mktemp(3) generate easily guessable file names.  The
function checks if the generated name is used, which is unreliable, as
a file with that name might then be created by some other process before
we can do it ourselves.  The function was dropped from POSIX due to its
security problems.  Forbid its use.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 banned.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/banned.h b/banned.h
index 44e76bd90af769..2b934c8c4381b5 100644
--- a/banned.h
+++ b/banned.h
@@ -41,4 +41,7 @@
 #undef asctime_r
 #define asctime_r(t, buf) BANNED(asctime_r)
 
+#undef mktemp
+#define mktemp(x) BANNED(mktemp)
+
 #endif /* BANNED_H */

From 10bba537c4c23e713af05be700748c6a3c25bf68 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Sat, 6 Dec 2025 14:35:39 +0100
Subject: [PATCH 195/553] compat: remove gitmkdtemp()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

gitmkdtemp() has become a trivial wrapper around git_mkdtemp().  Remove
this now unnecessary layer of indirection.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Makefile                            | 1 -
 compat/mkdtemp.c                    | 6 ------
 compat/posix.h                      | 3 +--
 contrib/buildsystems/CMakeLists.txt | 4 ----
 meson.build                         | 2 +-
 5 files changed, 2 insertions(+), 14 deletions(-)
 delete mode 100644 compat/mkdtemp.c

diff --git a/Makefile b/Makefile
index 7e0f77e2988e3b..8f74b25fe7f9e9 100644
--- a/Makefile
+++ b/Makefile
@@ -1917,7 +1917,6 @@ ifdef NO_SETENV
 endif
 ifdef NO_MKDTEMP
 	COMPAT_CFLAGS += -DNO_MKDTEMP
-	COMPAT_OBJS += compat/mkdtemp.o
 endif
 ifdef MKDIR_WO_TRAILING_SLASH
 	COMPAT_CFLAGS += -DMKDIR_WO_TRAILING_SLASH
diff --git a/compat/mkdtemp.c b/compat/mkdtemp.c
deleted file mode 100644
index fcdd4e01e14613..00000000000000
--- a/compat/mkdtemp.c
+++ /dev/null
@@ -1,6 +0,0 @@
-#include "../git-compat-util.h"
-
-char *gitmkdtemp(char *template)
-{
-	return git_mkdtemp(template);
-}
diff --git a/compat/posix.h b/compat/posix.h
index 067a00f33b83f3..245386fa4a9f4e 100644
--- a/compat/posix.h
+++ b/compat/posix.h
@@ -329,8 +329,7 @@ int gitsetenv(const char *, const char *, int);
 #endif
 
 #ifdef NO_MKDTEMP
-#define mkdtemp gitmkdtemp
-char *gitmkdtemp(char *);
+#define mkdtemp git_mkdtemp
 #endif
 
 #ifdef NO_UNSETENV
diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index edb0fc04ad7649..b84d8a7c762f06 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -411,10 +411,6 @@ if(NOT HAVE_SETENV)
 	list(APPEND compat_SOURCES compat/setenv.c)
 endif()
 
-if(NOT HAVE_MKDTEMP)
-	list(APPEND compat_SOURCES compat/mkdtemp.c)
-endif()
-
 if(NOT HAVE_PREAD)
 	list(APPEND compat_SOURCES compat/pread.c)
 endif()
diff --git a/meson.build b/meson.build
index 1f95a06edb7829..4a42e783b1bb77 100644
--- a/meson.build
+++ b/meson.build
@@ -1401,7 +1401,7 @@ checkfuncs = {
   'strlcpy' : ['strlcpy.c'],
   'strtoull' : [],
   'setenv' : ['setenv.c'],
-  'mkdtemp' : ['mkdtemp.c'],
+  'mkdtemp' : [],
   'initgroups' : [],
   'strtoumax' : ['strtoumax.c', 'strtoimax.c'],
   'pread' : ['pread.c'],

From dc8a00fafef0608c27cdf47cd8a8de0d31dc2197 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Sun, 7 Dec 2025 10:03:56 +0900
Subject: [PATCH 196/553] completion: clarify support for short options and
 arguments

The list of supported completions in the header of the file was
mostly written a long time ago when Shawn added the initial version
of this script in 2006.  The list explicitly states that we complete
"common --long-options", which implies that we do not complete
not-so-common ones and single letter options (this text dates back
to May 2007).

Update the description to explicitly state that single-letter
options are not completed.  Also, document that arguments to options
are completed, even for single-letter options (e.g., "git -c <TAB>"
offers configuration variables).

We do not complete single-letter options because it does not seem to
help all that much to learn that the command takes -c, -d, and -e
options when "git foo -<TAB>" offers these three, unlike long options,
whose names make it easier to guess what they are about.

Because this rationale is primarily for our developers, let's leave
it out of the completion script itself, whose messages are entirely
for end-users.  Our developers can run "git blame" to find this
commit as needed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 contrib/completion/git-completion.bash | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
index 73abea31b428f3..538dff1ee5cebe 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -13,7 +13,8 @@
 #    *) git email aliases for git-send-email
 #    *) tree paths within 'ref:path/to/file' expressions
 #    *) file paths within current working directory and index
-#    *) common --long-options
+#    *) common --long-options but not single-letter options
+#    *) arguments to long and single-letter options
 #
 # To use these routines:
 #

From 8cbbdc92f77a20014d9c425c8b9e4af46e492204 Mon Sep 17 00:00:00 2001
From: Kristoffer Haugsbakk <code@khaugsbakk.name>
Date: Mon, 8 Dec 2025 08:27:11 +0100
Subject: [PATCH 197/553] doc: join default pre-commit paragraphs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Join two paragraphs that start with the standard “The default <hook>,
when enabled” into one and put it at the end of the “pre-commit”
section.

The trailing whitespace paragraph was added in the first commit for the
doc, in 6d35cc76 (Document hooks., 2005-09-02). Then 3e14dd2c (mention
use of "hooks.allownonascii" in "man githooks", 2019-02-20) updated the
“pre-commit” section to mention the non-ASCII check that was added in
d00e364d.[1] But this paragraph was added one-past the original
“default” paragraph, after the env. variable paragraph, and starts
exactly the same. That causes the flow of this section to feel
off (paragraphs in order):

1. Invoked by <cmd> and what parameters it takes
2. The default 'pre-commit' hook catches introduction of trailing
   whitespace
3. `GIT_EDITOR=:`
4. The default 'pre-commit' hook catches introduction of non-ASCII
   filenames

Let’s instead join these two paragraphs and explain the whole behavior of
the default script.

† 1: Extend sample pre-commit hook to check for non ascii filenames,
     2009-05-19

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/githooks.adoc | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/Documentation/githooks.adoc b/Documentation/githooks.adoc
index 0397dec64d7315..056553788d4f43 100644
--- a/Documentation/githooks.adoc
+++ b/Documentation/githooks.adoc
@@ -103,17 +103,14 @@ invoked before obtaining the proposed commit log message and
 making a commit.  Exiting with a non-zero status from this script
 causes the `git commit` command to abort before creating a commit.
 
-The default 'pre-commit' hook, when enabled, catches introduction
-of lines with trailing whitespaces and aborts the commit when
-such a line is found.
-
 All the `git commit` hooks are invoked with the environment
 variable `GIT_EDITOR=:` if the command will not bring up an editor
 to modify the commit message.
 
-The default 'pre-commit' hook, when enabled--and with the
-`hooks.allownonascii` config option unset or set to false--prevents
-the use of non-ASCII filenames.
+The default 'pre-commit' hook, when enabled, prevents the introduction
+of non-ASCII filenames and lines with trailing whitespace. The non-ASCII
+check can be turned off by setting the `hooks.allownonascii` config
+option to `true`.
 
 pre-merge-commit
 ~~~~~~~~~~~~~~~~

From 48176f953fe083e2f15dd4daa2e37c28950135f1 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Sun, 7 Dec 2025 13:40:46 +0900
Subject: [PATCH 198/553] connect: plug protocol capability leak

When pushing to a set of remotes using a nickname for the group, the
client initializes the connection to each remote, talks to the
remote, reads and parses the capabilities line, and holds the
capabilities in a file-scope static variable server_capabilities_v1.

There are a few other such file-scope static variables, and these
connections cannot be parallelized until they are refactored to a
structure that keeps track of active connections.

Which is *not* the theme of this patch ;-)

For a single connection, the server_capabilities_v1 variable is
initialized to NULL (at the program initialization), populated when
we talk to the other side, used to look up capabilities of the other
side possibly multiple times, and the memory is held by the variable
until program exit, without leaking.  When talking to multiple remotes,
however, the server capabilities from the second connection overwrite
those from the first connection without freeing them, which leaks.

    ==1080970==ERROR: LeakSanitizer: detected memory leaks

    Direct leak of 421 byte(s) in 2 object(s) allocated from:
	#0 0x5615305f849e in strdup (/home/gitster/g/git-jch/bin/bin/git+0x2b349e) (BuildId: 54d149994c9e85374831958f694bd0aa3b8b1e26)
	#1 0x561530e76cc4 in xstrdup /home/gitster/w/build/wrapper.c:43:14
	#2 0x5615309cd7fa in process_capabilities /home/gitster/w/build/connect.c:243:27
	#3 0x5615309cd502 in get_remote_heads /home/gitster/w/build/connect.c:366:4
	#4 0x561530e2cb0b in handshake /home/gitster/w/build/transport.c:372:3
	#5 0x561530e29ed7 in get_refs_via_connect /home/gitster/w/build/transport.c:398:9
	#6 0x561530e26464 in transport_push /home/gitster/w/build/transport.c:1421:16
	#7 0x561530800bec in push_with_options /home/gitster/w/build/builtin/push.c:387:8
	#8 0x5615307ffb99 in do_push /home/gitster/w/build/builtin/push.c:442:7
	#9 0x5615307fe926 in cmd_push /home/gitster/w/build/builtin/push.c:664:7
	#10 0x56153065673f in run_builtin /home/gitster/w/build/git.c:506:11
	#11 0x56153065342f in handle_builtin /home/gitster/w/build/git.c:779:9
	#12 0x561530655b89 in run_argv /home/gitster/w/build/git.c:862:4
	#13 0x561530652cba in cmd_main /home/gitster/w/build/git.c:984:19
	#14 0x5615308dda0a in main /home/gitster/w/build/common-main.c:9:11
	#15 0x7f051651bca7 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16

    SUMMARY: AddressSanitizer: 421 byte(s) leaked in 2 allocation(s).

Free the capabilities data for the previous server before overwriting
it with the next server to plug this leak.
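
The pattern can be sketched as follows.  This is only an illustration:
remember_capabilities() and server_capabilities are hypothetical
stand-ins, not the actual code in connect.c, and plain malloc() is used
in place of xstrdup().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the file-scope static in connect.c. */
static char *server_capabilities;

/* Keep a private copy of what the current server advertised. */
void remember_capabilities(const char *caps)
{
	size_t len;
	char *copy;

	assert(caps != NULL);
	len = strlen(caps) + 1;
	copy = malloc(len);
	if (!copy)
		abort();
	memcpy(copy, caps, len);

	/*
	 * Drop the copy left behind by the previous connection before
	 * overwriting the pointer.  free(NULL) is a no-op, so the
	 * first connection needs no special case.
	 */
	free(server_capabilities);
	server_capabilities = copy;
}
```

With this shape, at most one copy is alive at any time, no matter how
many remotes are contacted in a row.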

The added test fails without the freeing with SANITIZE=leak; I
somehow couldn't get it to fail reliably with SANITIZE=leak,address
though.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 connect.c                |  2 ++
 t/meson.build            |  1 +
 t/t5565-push-multiple.sh | 39 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 42 insertions(+)
 create mode 100755 t/t5565-push-multiple.sh

diff --git a/connect.c b/connect.c
index 8352b71faf0931..c6f76e30829ff2 100644
--- a/connect.c
+++ b/connect.c
@@ -240,6 +240,8 @@ static void process_capabilities(struct packet_reader *reader, size_t *linelen)
 	size_t nul_location = strlen(line);
 	if (nul_location == *linelen)
 		return;
+
+	free(server_capabilities_v1);
 	server_capabilities_v1 = xstrdup(line + nul_location + 1);
 	*linelen = nul_location;
 
diff --git a/t/meson.build b/t/meson.build
index a5531df415ffe2..a7c477f1fb0b91 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -688,6 +688,7 @@ integration_tests = [
   't5562-http-backend-content-length.sh',
   't5563-simple-http-auth.sh',
   't5564-http-proxy.sh',
+  't5565-push-multiple.sh',
   't5570-git-daemon.sh',
   't5571-pre-push-hook.sh',
   't5572-pull-submodule.sh',
diff --git a/t/t5565-push-multiple.sh b/t/t5565-push-multiple.sh
new file mode 100755
index 00000000000000..7e93668566987b
--- /dev/null
+++ b/t/t5565-push-multiple.sh
@@ -0,0 +1,39 @@
+#!/bin/sh
+
+test_description='push to group'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+	for i in 1 2 3
+	do
+		git init dest-$i &&
+		git -C dest-$i symbolic-ref HEAD refs/heads/not-a-branch ||
+		return 1
+	done &&
+	test_tick &&
+	git commit --allow-empty -m "initial" &&
+	git config set --append remote.them.pushurl "file://$(pwd)/dest-1" &&
+	git config set --append remote.them.pushurl "file://$(pwd)/dest-2" &&
+	git config set --append remote.them.pushurl "file://$(pwd)/dest-3" &&
+	git config set --append remote.them.push "+refs/heads/*:refs/heads/*"
+'
+
+test_expect_success 'push to group' '
+	git push them &&
+	j= &&
+	for i in 1 2 3
+	do
+		git -C dest-$i for-each-ref >actual-$i &&
+		if test -n "$j"
+		then
+			test_cmp actual-$j actual-$i
+		else
+			cat actual-$i
+		fi &&
+		j=$i ||
+		return 1
+	done
+'
+
+test_done

From 41d425008afc95e9e5d3ee5d13e76f78f392fe3a Mon Sep 17 00:00:00 2001
From: Kristoffer Haugsbakk <code@khaugsbakk.name>
Date: Mon, 8 Dec 2025 18:41:01 +0100
Subject: [PATCH 199/553] doc: send-email: fix broken list continuation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The list continuation has to be “immediately adjacent to the block
being attached”.[1]

[1]: https://web.archive.org/web/20251208172615/https://docs.asciidoctor.org/asciidoc/latest/lists/continuation/

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-send-email.adoc | 1 -
 1 file changed, 1 deletion(-)

diff --git a/Documentation/git-send-email.adoc b/Documentation/git-send-email.adoc
index 263b977353f334..688efe2786c10c 100644
--- a/Documentation/git-send-email.adoc
+++ b/Documentation/git-send-email.adoc
@@ -321,7 +321,6 @@ for instructions.
 	If disabled with `--no-use-imap-only`, the emails will be sent like usual.
 	Disabled by default, but the `sendemail.useImapOnly` configuration
 	variable can be used to enable it.
-
 +
 This feature requires setting up `git imap-send`. See linkgit:git-imap-send[1]
 for instructions.

From d4bc39a4d96d7a1f6b69a6a300d77df748eab508 Mon Sep 17 00:00:00 2001
From: Matthew Hughes <matthewhughes934@gmail.com>
Date: Mon, 8 Dec 2025 19:04:35 +0000
Subject: [PATCH 200/553] config: document 'gui.GCWarning'

While investigating the config options set by 'scalar' I noticed this
one wasn't documented.

Signed-off-by: Matthew Hughes <matthewhughes934@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config/gui.adoc | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/Documentation/config/gui.adoc b/Documentation/config/gui.adoc
index 171be774d243fd..1565c0af197b9c 100644
--- a/Documentation/config/gui.adoc
+++ b/Documentation/config/gui.adoc
@@ -55,3 +55,8 @@ gui.blamehistoryctx::
 	linkgit:gitk[1] for the selected commit, when the `Show History
 	Context` menu item is invoked from 'git gui blame'. If this
 	variable is set to zero, the whole history is shown.
+
+gui.GCWarning::
+	Determines whether linkgit:git-gui[1] should prompt for garbage
+	collection when git detects a large number of loose objects in
+	the repository. The default value is "true".

From e85ae279b0d58edc2f4c3fd5ac391b51e1223985 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Tue, 9 Dec 2025 07:53:51 +0900
Subject: [PATCH 201/553] The seventh batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/RelNotes/2.53.0.adoc | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc
index 9e896bd4244389..38cbd2186e8172 100644
--- a/Documentation/RelNotes/2.53.0.adoc
+++ b/Documentation/RelNotes/2.53.0.adoc
@@ -23,6 +23,9 @@ UI, Workflows & Features
  * "git fast-import" learns "--strip-if-invalid" option to drop
    invalid cryptographic signature from objects.
 
+ * The use of "revision" (a connected set of commits) has been
+   clarified in the "git replay" documentation.
+
 
 Performance, Internal Implementation, Development Support etc.
 --------------------------------------------------------------
@@ -135,6 +138,18 @@ Fixes since v2.52
    have been corrected.
    (merge df963f0df4 rs/config-set-multi-error-message-fix later to maint).
 
+ * "git replay" forgot to omit the "gpgsig-sha256" extended header
+   from the resulting commit the same way it omits "gpgsig", which has
+   been corrected.
+   (merge 9f3a115087 pw/replay-exclude-gpgsig-fix later to maint).
+
+ * A few tests have been updated to work under the shell compatible
+   mode of zsh.
+   (merge a92f243a94 bc/zsh-testsuite later to maint).
+
+ * The way patience diff finds LCS has been optimized.
+   (merge c7e3b8085b yc/xdiff-patience-optim later to maint).
+
  * Other code cleanup, docfix, build fix, etc.
    (merge 46207a54cc qj/doc-http-bad-want-response later to maint).
    (merge df90eccd93 kh/doc-commit-extra-references later to maint).

From 3c7c41d6b7ee4c4576490a3b6cfefe4d59d24172 Mon Sep 17 00:00:00 2001
From: Aaron Plattner <aplattner@nvidia.com>
Date: Mon, 8 Dec 2025 17:48:56 -0800
Subject: [PATCH 202/553] object: apply skip_hash and discard_tree
 optimizations to unknown blobs too

parse_object_with_flags() has an optimization to skip parsing blobs if
PARSE_OBJECT_SKIP_HASH_CHECK is set and the object hasn't been seen
before or might be a blob but hasn't been parsed yet. The latter can
happen, for example, if add_tree_entries() walks a path that references
a blob object that hasn't been seen before: lookup_blob() marks the
referenced oid as being a blob, but does not provide any additional
information about it until it is parsed.

It's possible for an object to be created without even a type, such as
when prepare_revision_walk() uses mark_uninteresting() to mark all
promisor objects as uninteresting. These objects have obj->parsed ==
false and obj->type == OBJ_NONE.

The skip_hash optimization does not consider this kind of object, so
parse_object_with_flags() proceeds to fully parse the object to
determine its type.

Improve the optimization by applying it to OBJ_NONE objects as well as
OBJ_BLOB ones. Apply a similar fix for trees.
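
The widened condition can be sketched in isolation as below.  The types
and the fast_path_candidate() helper are hypothetical and exist only
for this illustration; the real code additionally confirms the on-disk
type with odb_read_object_info() before taking the shortcut.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of the object types involved. */
enum sketch_type { SKETCH_NONE, SKETCH_COMMIT, SKETCH_TREE, SKETCH_BLOB };

struct sketch_object {
	bool parsed;
	enum sketch_type type;
};

/*
 * May the fast path for `wanted` be tried?  It may when we have never
 * seen the object (NULL), when it exists without a type yet, or when
 * it is already suspected to be of the wanted type.
 */
bool fast_path_candidate(const struct sketch_object *obj,
			 enum sketch_type wanted)
{
	assert(wanted == SKETCH_BLOB || wanted == SKETCH_TREE);
	return !obj || obj->type == SKETCH_NONE || obj->type == wanted;
}
```

An object that is already known to be of a different type, such as a
commit, still falls through to the regular parsing path.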

Fixes: 8db2dad7a045 ("parse_object(): check on-disk type of suspected blob")
Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 object.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/object.c b/object.c
index b08fc7a163ae69..4669b8d65e74ab 100644
--- a/object.c
+++ b/object.c
@@ -328,7 +328,7 @@ struct object *parse_object_with_flags(struct repository *r,
 			return &commit->object;
 	}
 
-	if ((!obj || obj->type == OBJ_BLOB) &&
+	if ((!obj || obj->type == OBJ_NONE || obj->type == OBJ_BLOB) &&
 	    odb_read_object_info(r->objects, oid, NULL) == OBJ_BLOB) {
 		if (!skip_hash && stream_object_signature(r, repl) < 0) {
 			error(_("hash mismatch %s"), oid_to_hex(oid));
@@ -344,7 +344,7 @@ struct object *parse_object_with_flags(struct repository *r,
 	 * have the on-disk object with the correct type.
 	 */
 	if (skip_hash && discard_tree &&
-	    (!obj || obj->type == OBJ_TREE) &&
+	    (!obj || obj->type == OBJ_NONE || obj->type == OBJ_TREE) &&
 	    odb_read_object_info(r->objects, oid, NULL) == OBJ_TREE) {
 		return &lookup_tree(r, oid)->object;
 	}

From 3f5d1749e7eb8ab745b348aa138564b809957d3d Mon Sep 17 00:00:00 2001
From: Aaron Plattner <aplattner@nvidia.com>
Date: Mon, 8 Dec 2025 17:48:57 -0800
Subject: [PATCH 203/553] packfile: skip hash checks in add_promisor_object()

When is_promisor_object() is called for the first time, it lazily
initializes a set of all promisor objects by iterating through all
objects in promisor packs. For each object, add_promisor_object() calls
parse_object(), which decompresses and hashes the entire object.

For repositories with large pack files, this can take an extremely long
time. For example, on a production repository with a 176 GB promisor
pack:

 $ time ~/git/git/git-rev-list --objects --all --exclude-promisor-objects --quiet
 ________________________________________________________
 Executed in   76.10 mins    fish           external
    usr time   72.10 mins    1.83 millis   72.10 mins
    sys time    3.56 mins    0.17 millis    3.56 mins

add_promisor_object() just wants to construct the set of all promisor
objects, so it doesn't really need to verify the hash of every object.
Set PARSE_OBJECT_SKIP_HASH_CHECK to skip the hash check. This has the
side effect of skipping decompression of blob objects completely, saving
a significant amount of time:

 $ time ~/git/git/git-rev-list --objects --all --exclude-promisor-objects --quiet
 ________________________________________________________
 Executed in  124.70 secs    fish           external
    usr time   46.94 secs    0.00 millis   46.94 secs
    sys time   43.11 secs    1.03 millis   43.11 secs

Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 packfile.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/packfile.c b/packfile.c
index 9cc11b6dc56225..01b992a4e12f89 100644
--- a/packfile.c
+++ b/packfile.c
@@ -2310,7 +2310,8 @@ static int add_promisor_object(const struct object_id *oid,
 		we_parsed_object = 0;
 	} else {
 		we_parsed_object = 1;
-		obj = parse_object(pack->repo, oid);
+		obj = parse_object_with_flags(pack->repo, oid,
+					      PARSE_OBJECT_SKIP_HASH_CHECK);
 	}
 
 	if (!obj)

From 8ff2eef8ada18c2d7ef61b1e8e13d53937524908 Mon Sep 17 00:00:00 2001
From: Karthik Nayak <karthik.188@gmail.com>
Date: Fri, 21 Nov 2025 12:13:46 +0100
Subject: [PATCH 204/553] fetch: fix non-conflicting tags not being committed

The commit 0e358de64a (fetch: use batched reference updates, 2025-05-19)
updated the 'git-fetch(1)' command to use batched updates. This batches
updates to gain performance improvements. When fetching references, each
update is added to the transaction. Finally, when committing, individual
updates are allowed to fail with reason, while the transaction itself
succeeds.

One scenario that was missed here is fetching tags. When fetching
conflicting tags, the `fetch_and_consume_refs()` function returns '1',
which skipped committing the transaction and directly jumped to the
cleanup section. This means that no updates were applied. This also
extends to backfilling tags, which is done when fetching specific
refspecs which contain tags in their history.

Fix this by committing the transaction when we have an error code and
not using an atomic transaction. This ensures other references are
applied even when some updates fail.

The cleanup section is reached with `retcode` set in several scenarios:

   - `truncate_fetch_head()`, `open_fetch_head()` and `prune_refs()` set
     `retcode` before the transaction is created, so no commit is
     attempted.

   - `fetch_and_consume_refs()` and `backfill_tags()` are the primary
     cases this fix targets, both setting a positive `retcode` to
     trigger the committing of the transaction.

This simplifies error handling and ensures future modifications to
`do_fetch()` don't need special handling for batched updates.

Add tests to check for this regression. While here, add a missing
cleanup from previous test.

Reported-by: David Bohman <debohman@gmail.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/fetch.c  |  8 +++++++
 t/t5510-fetch.sh | 62 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index f90179040ba34c..b19fa8e966df05 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1957,6 +1957,14 @@ static int do_fetch(struct transport *transport,
 	}
 
 cleanup:
+	/*
+	 * When using batched updates, we want to commit the non-rejected
+	 * updates and also handle the rejections.
+	 */
+	if (retcode && !atomic_fetch && transaction)
+		commit_ref_transaction(&transaction, false,
+				       transport->remote->name, &err);
+
 	if (retcode) {
 		if (err.len) {
 			error("%s", err.buf);
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index b7059cccaacce0..f500cb83cad997 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -1552,6 +1552,7 @@ test_expect_success CASE_INSENSITIVE_FS,REFFILES 'D/F conflict on case insensiti
 '
 
 test_expect_success REFFILES 'D/F conflict on case sensitive filesystem with lock' '
+	test_when_finished rm -rf base repo &&
 	(
 		git init --ref-format=reftable base &&
 		cd base &&
@@ -1577,6 +1578,67 @@ test_expect_success REFFILES 'D/F conflict on case sensitive filesystem with loc
 	)
 '
 
+test_expect_success 'fetch --tags fetches existing tags' '
+	test_when_finished rm -rf base repo &&
+
+	git init base &&
+	git -C base commit --allow-empty -m "empty-commit" &&
+
+	git clone --bare base repo &&
+
+	git -C base tag tag-1 &&
+	git -C repo for-each-ref >out &&
+	test_grep ! "tag-1" out &&
+	git -C repo fetch --tags &&
+	git -C repo for-each-ref >out &&
+	test_grep "tag-1" out
+'
+
+test_expect_success 'fetch --tags fetches non-conflicting tags' '
+	test_when_finished rm -rf base repo &&
+
+	git init base &&
+	git -C base commit --allow-empty -m "empty-commit" &&
+	git -C base tag tag-1 &&
+
+	git clone --bare base repo &&
+
+	git -C base tag tag-2 &&
+	git -C repo for-each-ref >out &&
+	test_grep ! "tag-2" out &&
+
+	git -C base commit --allow-empty -m "second empty-commit" &&
+	git -C base tag -f tag-1 &&
+
+	test_must_fail git -C repo fetch --tags 2>out &&
+	test_grep "tag-1  (would clobber existing tag)" out &&
+	git -C repo for-each-ref >out &&
+	test_grep "tag-2" out
+'
+
+test_expect_success "backfill tags when providing a refspec" '
+	test_when_finished rm -rf source target &&
+
+	git init source &&
+	git -C source commit --allow-empty --message common &&
+	git clone file://"$(pwd)"/source target &&
+	(
+	    cd source &&
+	    test_commit history &&
+	    test_commit fetch-me
+	) &&
+
+	# The "history" tag is backfilled even though we requested
+	# to only fetch HEAD
+	git -C target fetch origin HEAD:branch &&
+	git -C target tag -l >actual &&
+	cat >expect <<-\EOF &&
+	fetch-me
+	history
+	EOF
+	test_cmp expect actual
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 

From b7b17ec8a6b1cb176206ad69c194b84eb3490b99 Mon Sep 17 00:00:00 2001
From: Karthik Nayak <karthik.188@gmail.com>
Date: Fri, 21 Nov 2025 12:13:47 +0100
Subject: [PATCH 205/553] fetch: fix failed batched updates skipping operations

Fix a regression introduced with batched updates in 0e358de64a (fetch:
use batched reference updates, 2025-05-19) when fetching references. In
the `do_fetch()` function, we jump to cleanup if committing the
transaction fails, regardless of whether using batched or atomic
updates. This skips three subsequent operations:

  - Update 'FETCH_HEAD' as part of `commit_fetch_head()`.

  - Add upstream tracking information via `set_upstream()`.

  - Setting remote 'HEAD' values when `do_set_head` is true.

For atomic updates, this is expected behavior. For batched updates,
we want to continue with these operations even if some refs fail to
update.

Skipping `commit_fetch_head()` isn't actually a regression because
'FETCH_HEAD' is already updated via `append_fetch_head()` when not
using '--atomic'. However, we add a test to validate this behavior.

Skipping the other two operations (upstream tracking and remote HEAD)
is a regression. Fix this by only jumping to cleanup when using
'--atomic', allowing batched updates to continue with post-fetch
operations. Add tests to prevent future regressions.
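
The resulting control flow can be sketched roughly as below.  The flag
names and post_fetch_steps() are hypothetical and only model which
steps stay reachable; in `do_fetch()` the last two steps are further
guarded by their own options.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags naming the three post-fetch operations. */
enum {
	STEP_FETCH_HEAD = 1 << 0,	/* commit_fetch_head() */
	STEP_SET_UPSTREAM = 1 << 1,	/* set_upstream() */
	STEP_SET_HEAD = 1 << 2		/* updating the remote HEAD */
};

/*
 * Return the post-fetch steps that remain reachable after the ref
 * transaction was committed with result `retcode`.
 */
unsigned post_fetch_steps(int retcode, bool atomic_fetch)
{
	/* All-or-nothing: a failed atomic transaction skips everything. */
	if (retcode && atomic_fetch)
		return 0;
	return STEP_FETCH_HEAD | STEP_SET_UPSTREAM | STEP_SET_HEAD;
}

/* Usage: a failed commit only suppresses the steps under --atomic. */
void post_fetch_steps_example(void)
{
	unsigned all = STEP_FETCH_HEAD | STEP_SET_UPSTREAM | STEP_SET_HEAD;

	assert(post_fetch_steps(1, true) == 0);
	assert(post_fetch_steps(1, false) == all);
	assert(post_fetch_steps(0, true) == all);
}
```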

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/fetch.c  |  6 +++-
 t/t5510-fetch.sh | 88 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 93 insertions(+), 1 deletion(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index b19fa8e966df05..74bf67349d30a0 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1890,7 +1890,11 @@ static int do_fetch(struct transport *transport,
 
 	retcode = commit_ref_transaction(&transaction, atomic_fetch,
 					 transport->remote->name, &err);
-	if (retcode)
+	/*
+	 * With '--atomic', bail out if the transaction fails. Without '--atomic',
+	 * continue to fetch head and perform other post-fetch operations.
+	 */
+	if (retcode && atomic_fetch)
 		goto cleanup;
 
 	commit_fetch_head(&fetch_head);
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index f500cb83cad997..ce1c23684ece38 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -1639,6 +1639,94 @@ test_expect_success "backfill tags when providing a refspec" '
 	test_cmp expect actual
 '
 
+test_expect_success REFFILES "FETCH_HEAD is updated even if ref updates fail" '
+	test_when_finished rm -rf base repo &&
+
+	git init base &&
+	(
+		cd base &&
+		test_commit "updated" &&
+
+		git update-ref refs/heads/foo @ &&
+		git update-ref refs/heads/branch @
+	) &&
+
+	git init --bare repo &&
+	(
+		cd repo &&
+		rm -f FETCH_HEAD &&
+		git remote add origin ../base &&
+		>refs/heads/foo.lock &&
+		test_must_fail git fetch -f origin "refs/heads/*:refs/heads/*" 2>err &&
+		test_grep "error: fetching ref refs/heads/foo failed: reference already exists" err &&
+		test_grep "branch ${SQ}branch${SQ} of ../base" FETCH_HEAD &&
+		test_grep "branch ${SQ}foo${SQ} of ../base" FETCH_HEAD
+	)
+'
+
+test_expect_success "upstream tracking info is added with --set-upstream" '
+	test_when_finished rm -rf base repo &&
+
+	git init --initial-branch=main base &&
+	test_commit -C base "updated" &&
+
+	git init --bare --initial-branch=main repo &&
+	(
+		cd repo &&
+		git remote add origin ../base &&
+		git fetch origin --set-upstream main &&
+		git config get branch.main.remote >actual &&
+		echo "origin" >expect &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success REFFILES "upstream tracking info is added even with conflicts" '
+	test_when_finished rm -rf base repo &&
+
+	git init --initial-branch=main base &&
+	test_commit -C base "updated" &&
+
+	git init --bare --initial-branch=main repo &&
+	(
+		cd repo &&
+		git remote add origin ../base &&
+		test_must_fail git config get branch.main.remote &&
+
+		mkdir -p refs/remotes/origin &&
+		>refs/remotes/origin/main.lock &&
+		test_must_fail git fetch origin --set-upstream main &&
+		git config get branch.main.remote >actual &&
+		echo "origin" >expect &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success REFFILES "HEAD is updated even with conflicts" '
+	test_when_finished rm -rf base repo &&
+
+	git init base &&
+	(
+		cd base &&
+		test_commit "updated" &&
+
+		git update-ref refs/heads/foo @ &&
+		git update-ref refs/heads/branch @
+	) &&
+
+	git init --bare repo &&
+	(
+		cd repo &&
+		git remote add origin ../base &&
+
+		test_path_is_missing refs/remotes/origin/HEAD &&
+		mkdir -p refs/remotes/origin &&
+		>refs/remotes/origin/branch.lock &&
+		test_must_fail git fetch origin &&
+		test -f refs/remotes/origin/HEAD
+	)
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 

From 665d19ec7bcc2d578e2fa2701f7399a6a965b086 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Wed, 10 Dec 2025 13:52:18 +0100
Subject: [PATCH 206/553] midx: fix `BUG()` when getting preferred pack without
 a reverse index

The function `midx_preferred_pack()` returns the preferred pack for a
given multi-pack index. To compute the preferred pack we:

  1. Take the first position indexed by the MIDX in pseudo-pack order.

  2. Convert this pseudo-pack position into the MIDX position.

  3. We then look up the pack that corresponds to this MIDX position.

This reliably returns the preferred pack given that all of its contained
objects will be up front in pseudo-pack order.

The second step that turns the pseudo-pack order into MIDX order
requires the reverse index though, which may not exist for example when
the MIDX does not have a bitmap. And in that case one may easily hit a
bug:

    BUG: ../pack-revindex.c:491: pack_pos_to_midx: reverse index not yet loaded

In theory, `midx_preferred_pack()` already knows to handle the case
where no reverse index exists, as it calls `load_midx_revindex()` before
calling into `pack_pos_to_midx()`. But we only check for negative
return values there, even though the function returns a positive error
code in case the reverse index does not exist.

Fix the issue by testing for a non-zero return value instead, same as
all the other callers of this function already do. While at it, document
the return value of `load_midx_revindex()`.
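
The difference between the two checks can be sketched with a
hypothetical loader that follows the same three-way contract (zero on
success, positive when no reverse index exists, negative on error).
None of the names below exist in Git.

```c
#include <assert.h>

enum revindex_state { REVINDEX_LOADED, REVINDEX_ABSENT, REVINDEX_CORRUPT };

/* Hypothetical loader: 0 on success, >0 if absent, <0 on error. */
int load_revindex_sketch(enum revindex_state state)
{
	switch (state) {
	case REVINDEX_LOADED:
		return 0;
	case REVINDEX_ABSENT:
		return 1;
	case REVINDEX_CORRUPT:
		return -1;
	}
	assert(!"unknown revindex state");
	return -1;
}

/* Checking only for negative values lets "absent" slip through. */
int negative_check_proceeds(enum revindex_state state)
{
	return !(load_revindex_sketch(state) < 0);
}

/* Checking for any non-zero value treats "absent" as a failure too. */
int nonzero_check_proceeds(enum revindex_state state)
{
	return !load_revindex_sketch(state);
}
```

Only the second variant stops before a lookup that requires the
reverse index to be loaded.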

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 midx.c                      |  2 +-
 pack-revindex.h             |  3 ++-
 t/t5319-multi-pack-index.sh | 13 +++++++++++++
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/midx.c b/midx.c
index 1d6269f957e781..79c890cf1b948e 100644
--- a/midx.c
+++ b/midx.c
@@ -688,7 +688,7 @@ int midx_preferred_pack(struct multi_pack_index *m, uint32_t *pack_int_id)
 {
 	if (m->preferred_pack_idx == -1) {
 		uint32_t midx_pos;
-		if (load_midx_revindex(m) < 0) {
+		if (load_midx_revindex(m)) {
 			m->preferred_pack_idx = -2;
 			return -1;
 		}
diff --git a/pack-revindex.h b/pack-revindex.h
index 422c2487ae32d8..004289209191d0 100644
--- a/pack-revindex.h
+++ b/pack-revindex.h
@@ -72,7 +72,8 @@ int verify_pack_revindex(struct packed_git *p);
  * multi-pack index by mmap-ing it and assigning pointers in the
  * multi_pack_index to point at it.
  *
- * A negative number is returned on error.
+ * A negative number is returned on error. A positive number is returned in
+ * case the multi-pack-index does not have a reverse index.
  */
 int load_midx_revindex(struct multi_pack_index *m);
 
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 93f319a4b29fbb..9492a9737b5f8e 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -350,7 +350,20 @@ test_expect_success 'preferred pack from existing MIDX without bitmaps' '
 		# the new MIDX
 		git multi-pack-index write --preferred-pack=pack-$pack.pack
 	)
+'
 
+test_expect_success 'preferred pack cannot be determined without bitmap' '
+	test_when_finished "rm -fr preferred-can-be-queried" &&
+	git init preferred-can-be-queried &&
+	(
+		cd preferred-can-be-queried &&
+		test_commit initial &&
+		git repack -Adl --write-midx --no-write-bitmap-index &&
+		test_must_fail test-tool read-midx --preferred-pack .git/objects 2>err &&
+		test_grep "could not determine MIDX preferred pack" err &&
+		git repack -Adl --write-midx --write-bitmap-index &&
+		test-tool read-midx --preferred-pack .git/objects
+	)
 '
 
 test_expect_success 'verify multi-pack-index success' '

From b3bab9d2729fde1f52c407447711c34a75c5c377 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Wed, 10 Dec 2025 13:52:19 +0100
Subject: [PATCH 207/553] midx-write: extract function to test whether MIDX
 needs updating

In `write_midx_internal()` we know to skip writing the new multi-pack
index in case it would be the same as the existing one. This logic does
not handle the `--stdin-packs` option yet though, so we end up always
rewriting the MIDX if that option is passed to us.

Extract the logic to decide whether or not to rewrite the MIDX into a
separate function. This will allow us to extend that feature in the next
commit to address the above issue.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 midx-write.c | 39 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 36 insertions(+), 3 deletions(-)

diff --git a/midx-write.c b/midx-write.c
index c73010df6d3a4f..c1eed04691f0a7 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -1015,6 +1015,41 @@ static void clear_midx_files(struct odb_source *source,
 	strbuf_release(&buf);
 }
 
+static bool midx_needs_update(struct write_midx_context *ctx)
+{
+	struct multi_pack_index *midx = ctx->m;
+	bool needed = true;
+
+	/*
+	 * Ignore incremental updates for now. The assumption is that any
+	 * incremental update would be either empty (in which case we will bail
+	 * out later) or it would actually cover at least one new pack.
+	 */
+	if (ctx->incremental)
+		goto out;
+
+	/*
+	 * If there is no MIDX then either it doesn't exist, or we're doing a
+	 * geometric repack. We cannot (yet) determine whether we need to
+	 * update the multi-pack index in the second case.
+	 */
+	if (!midx)
+		goto out;
+
+	/*
+	 * Otherwise, we need to verify that the packs covered by the existing
+	 * MIDX match the packs that we already have. This test is somewhat
+	 * lenient and will be fixed.
+	 */
+	if (ctx->nr != midx->num_packs + midx->num_packs_in_base)
+		goto out;
+
+	needed = false;
+
+out:
+	return needed;
+}
+
 static int write_midx_internal(struct odb_source *source,
 			       struct string_list *packs_to_include,
 			       struct string_list *packs_to_drop,
@@ -1112,9 +1147,7 @@ static int write_midx_internal(struct odb_source *source,
 	for_each_file_in_pack_dir(source->path, add_pack_to_midx, &ctx);
 	stop_progress(&ctx.progress);
 
-	if ((ctx.m && ctx.nr == ctx.m->num_packs + ctx.m->num_packs_in_base) &&
-	    !ctx.incremental &&
-	    !(packs_to_include || packs_to_drop)) {
+	if (!packs_to_include && !packs_to_drop && !midx_needs_update(&ctx)) {
 		struct bitmap_index *bitmap_git;
 		int bitmap_exists;
 		int want_bitmap = flags & MIDX_WRITE_BITMAP;

From 6ce9d558ced275a707393d044e5b0035412f8360 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Wed, 10 Dec 2025 13:52:20 +0100
Subject: [PATCH 208/553] midx-write: skip rewriting MIDX with `--stdin-packs`
 unless needed

In `write_midx_internal()` we know to skip rewriting the multi-pack
index in case the existing one already covers all packs. This logic does
not know to handle `git multi-pack-index write --stdin-packs` though, so
we end up always rewriting the MIDX in this case even if the MIDX would
not change.

With our default maintenance strategy this isn't really much of a
problem, as git-gc(1) does not use the "--stdin-packs" option. But that
is changing with geometric repacking, where "--stdin-packs" is used to
explicitly select the packfiles that are part of the geometric
sequence.

This issue can be demonstrated trivially in the Git repository itself:
executing `git repack --geometric=2 --write-midx -d` there takes more
than 3 seconds, only to end up with the same multi-pack index as we
already had before.

The logic that decides if we need to rewrite the MIDX only checks
whether the number of packfiles covered will change. That check is of
course too lenient for "--stdin-packs", as it could happen that we want
to cover a different-but-same-size set of packfiles. But there is no
inherent reason why we cannot handle "--stdin-packs".

Improve the logic to not only check for the number of packs, but to also
verify that we are asked to generate a MIDX for the _same_ packs. This
allows us to also skip no-op rewrites for "--stdin-packs".
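
The comparison can be sketched in isolation as follows. This is a
simplified illustration, not the code in midx-write.c: it uses a
quadratic scan over plain string arrays where the patch uses a strset,
and the helper names (`len_without_suffix()`, `has_pack()`,
`same_packs()`) are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Length of `name` without `suffix`, if it ends with that suffix. */
static size_t len_without_suffix(const char *name, const char *suffix)
{
	size_t len = strlen(name), slen = strlen(suffix);

	if (len >= slen && !strcmp(name + len - slen, suffix))
		return len - slen;
	return len;
}

/* Does any ".idx" name in `idx` match the ".pack" name `pack`? */
static bool has_pack(const char **idx, size_t nr, const char *pack)
{
	size_t plen = len_without_suffix(pack, ".pack");

	for (size_t i = 0; i < nr; i++) {
		size_t ilen = len_without_suffix(idx[i], ".idx");
		if (ilen == plen && !strncmp(idx[i], pack, plen))
			return true;
	}
	return false;
}

/*
 * Return true if both lists name the same packs, in any order.
 * Assumes that neither list contains duplicates.
 */
static bool same_packs(const char **packs, size_t nr_packs,
		       const char **idx, size_t nr_idx)
{
	if (nr_packs != nr_idx)
		return false;
	for (size_t i = 0; i < nr_packs; i++)
		if (!has_pack(idx, nr_idx, packs[i]))
			return false;
	return true;
}
```

Because both lists have the same length and no duplicates, finding
every ".pack" name among the ".idx" names is enough to conclude that
the two sets are equal.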

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 midx-write.c                | 100 +++++++++++++++++++++++++-----------
 t/t5319-multi-pack-index.sh |  51 ++++++++++++++++++
 t/t7703-repack-geometric.sh |  35 +++++++++++++
 3 files changed, 156 insertions(+), 30 deletions(-)

diff --git a/midx-write.c b/midx-write.c
index c1eed04691f0a7..40abe3868c4d6f 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -1015,9 +1015,10 @@ static void clear_midx_files(struct odb_source *source,
 	strbuf_release(&buf);
 }
 
-static bool midx_needs_update(struct write_midx_context *ctx)
+static bool midx_needs_update(struct multi_pack_index *midx, struct write_midx_context *ctx)
 {
-	struct multi_pack_index *midx = ctx->m;
+	struct strset packs = STRSET_INIT;
+	struct strbuf buf = STRBUF_INIT;
 	bool needed = true;
 
 	/*
@@ -1028,25 +1029,48 @@ static bool midx_needs_update(struct write_midx_context *ctx)
 	if (ctx->incremental)
 		goto out;
 
-	/*
-	 * If there is no MIDX then either it doesn't exist, or we're doing a
-	 * geometric repack. We cannot (yet) determine whether we need to
-	 * update the multi-pack index in the second case.
-	 */
-	if (!midx)
-		goto out;
-
 	/*
 	 * Otherwise, we need to verify that the packs covered by the existing
-	 * MIDX match the packs that we already have. This test is somewhat
-	 * lenient and will be fixed.
+	 * MIDX match the packs that we already have. The logic to do so is way
+	 * more complicated than it has any right to be. This is because:
+	 *
+	 *   - We cannot assume any ordering.
+	 *
+	 *   - The MIDX packs may not be loaded at all, and loading them would
+	 *     be wasteful. So we need to use the pack names tracked by the
+	 *     MIDX itself.
+	 *
+	 *   - The MIDX pack names are tracking the ".idx" files, whereas the
+	 *     packs themselves are tracking the ".pack" files. So we need to
+	 *     strip suffixes.
 	 */
 	if (ctx->nr != midx->num_packs + midx->num_packs_in_base)
 		goto out;
 
+	for (uint32_t i = 0; i < ctx->nr; i++) {
+		strbuf_reset(&buf);
+		strbuf_addstr(&buf, pack_basename(ctx->info[i].p));
+		strbuf_strip_suffix(&buf, ".pack");
+
+		if (!strset_add(&packs, buf.buf))
+			BUG("same pack added twice?");
+	}
+
+	for (uint32_t i = 0; i < ctx->nr; i++) {
+		strbuf_reset(&buf);
+		strbuf_addstr(&buf, midx->pack_names[i]);
+		strbuf_strip_suffix(&buf, ".idx");
+
+		if (!strset_contains(&packs, buf.buf))
+			goto out;
+		strset_remove(&packs, buf.buf);
+	}
+
 	needed = false;
 
 out:
+	strbuf_release(&buf);
+	strset_clear(&packs);
 	return needed;
 }
 
@@ -1067,6 +1091,7 @@ static int write_midx_internal(struct odb_source *source,
 	struct write_midx_context ctx = {
 		.preferred_pack_idx = NO_PREFERRED_PACK,
 	 };
+	struct multi_pack_index *midx_to_free = NULL;
 	int bitmapped_packs_concat_len = 0;
 	int pack_name_concat_len = 0;
 	int dropped_packs = 0;
@@ -1147,25 +1172,39 @@ static int write_midx_internal(struct odb_source *source,
 	for_each_file_in_pack_dir(source->path, add_pack_to_midx, &ctx);
 	stop_progress(&ctx.progress);
 
-	if (!packs_to_include && !packs_to_drop && !midx_needs_update(&ctx)) {
-		struct bitmap_index *bitmap_git;
-		int bitmap_exists;
-		int want_bitmap = flags & MIDX_WRITE_BITMAP;
-
-		bitmap_git = prepare_midx_bitmap_git(ctx.m);
-		bitmap_exists = bitmap_git && bitmap_is_midx(bitmap_git);
-		free_bitmap_index(bitmap_git);
-
-		if (bitmap_exists || !want_bitmap) {
-			/*
-			 * The correct MIDX already exists, and so does a
-			 * corresponding bitmap (or one wasn't requested).
-			 */
-			if (!want_bitmap)
-				clear_midx_files_ext(source, "bitmap", NULL);
-			result = 0;
-			goto cleanup;
+	if (!packs_to_drop) {
+		/*
+		 * If there is no MIDX then either it doesn't exist, or we're
+		 * doing a geometric repack. Try to load it from the source to
+		 * tell these two cases apart.
+		 */
+		struct multi_pack_index *midx = ctx.m;
+		if (!midx)
+			midx = midx_to_free = load_multi_pack_index(ctx.source);
+
+		if (midx && !midx_needs_update(midx, &ctx)) {
+			struct bitmap_index *bitmap_git;
+			int bitmap_exists;
+			int want_bitmap = flags & MIDX_WRITE_BITMAP;
+
+			bitmap_git = prepare_midx_bitmap_git(midx);
+			bitmap_exists = bitmap_git && bitmap_is_midx(bitmap_git);
+			free_bitmap_index(bitmap_git);
+
+			if (bitmap_exists || !want_bitmap) {
+				/*
+				 * The correct MIDX already exists, and so does a
+				 * corresponding bitmap (or one wasn't requested).
+				 */
+				if (!want_bitmap)
+					clear_midx_files_ext(source, "bitmap", NULL);
+				result = 0;
+				goto cleanup;
+			}
 		}
+
+		close_midx(midx_to_free);
+		midx_to_free = NULL;
 	}
 
 	if (ctx.incremental && !ctx.nr) {
@@ -1521,6 +1560,7 @@ static int write_midx_internal(struct odb_source *source,
 		free(keep_hashes);
 	}
 	strbuf_release(&midx_name);
+	close_midx(midx_to_free);
 
 	trace2_region_leave("midx", "write_midx_internal", r);
 
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 9492a9737b5f8e..794f8b5ab4e136 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -366,6 +366,57 @@ test_expect_success 'preferred pack cannot be determined without bitmap' '
 	)
 '
 
+test_midx_is_retained () {
+	test-tool chmtime =0 .git/objects/pack/multi-pack-index &&
+	ls -l .git/objects/pack/multi-pack-index >expect &&
+	git multi-pack-index write "$@" &&
+	ls -l .git/objects/pack/multi-pack-index >actual &&
+	test_cmp expect actual
+}
+
+test_midx_is_rewritten () {
+	test-tool chmtime =0 .git/objects/pack/multi-pack-index &&
+	ls -l .git/objects/pack/multi-pack-index >expect &&
+	git multi-pack-index write "$@" &&
+	ls -l .git/objects/pack/multi-pack-index >actual &&
+	! test_cmp expect actual
+}
+
+test_expect_success 'up-to-date multi-pack-index is retained' '
+	test_when_finished "rm -fr midx-up-to-date" &&
+	git init midx-up-to-date &&
+	(
+		cd midx-up-to-date &&
+
+		# Write the initial pack that contains the most objects.
+		test_commit first &&
+		test_commit second &&
+		git repack -Ad --write-midx &&
+		test_midx_is_retained &&
+
+		# Writing a new bitmap index should cause us to regenerate the MIDX.
+		test_midx_is_rewritten --bitmap &&
+		test_midx_is_retained --bitmap &&
+
+		# Ensure that writing a new packfile causes us to rewrite the index.
+		test_commit incremental &&
+		git repack -d &&
+		test_midx_is_rewritten &&
+		test_midx_is_retained &&
+
+		for pack in .git/objects/pack/*.idx
+		do
+			basename "$pack" || exit 1
+		done >stdin &&
+		test_line_count = 2 stdin &&
+		test_midx_is_retained --stdin-packs <stdin &&
+		head -n1 stdin >stdin.trimmed &&
+		test_midx_is_rewritten --stdin-packs <stdin.trimmed
+	)
+'
+
+test_done
+
 test_expect_success 'verify multi-pack-index success' '
 	git multi-pack-index verify --object-dir=$objdir
 '
diff --git a/t/t7703-repack-geometric.sh b/t/t7703-repack-geometric.sh
index 9fc1626fbfde89..98806cdb6fe9b7 100755
--- a/t/t7703-repack-geometric.sh
+++ b/t/t7703-repack-geometric.sh
@@ -287,6 +287,41 @@ test_expect_success '--geometric with pack.packSizeLimit' '
 	)
 '
 
+test_expect_success '--geometric --write-midx retains up-to-date MIDX without bitmap index' '
+	test_when_finished "rm -fr repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+
+		test_path_is_missing .git/objects/pack/multi-pack-index &&
+		git repack --geometric=2 --write-midx --no-write-bitmap-index &&
+		test_path_is_file .git/objects/pack/multi-pack-index &&
+		test-tool chmtime =0 .git/objects/pack/multi-pack-index &&
+
+		ls -l .git/objects/pack/ >expect &&
+		git repack --geometric=2 --write-midx --no-write-bitmap-index &&
+		ls -l .git/objects/pack/ >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success '--geometric --write-midx retains up-to-date MIDX with bitmap index' '
+	test_when_finished "rm -fr repo" &&
+	git init repo &&
+	test_commit -C repo initial &&
+
+	test_path_is_missing repo/.git/objects/pack/multi-pack-index &&
+	git -C repo repack --geometric=2 --write-midx --write-bitmap-index &&
+	test_path_is_file repo/.git/objects/pack/multi-pack-index &&
+	test-tool chmtime =0 repo/.git/objects/pack/multi-pack-index &&
+
+	ls -l repo/.git/objects/pack/ >expect &&
+	git -C repo repack --geometric=2 --write-midx --write-bitmap-index &&
+	ls -l repo/.git/objects/pack/ >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success '--geometric --write-midx with packfiles in main and alternate ODB' '
 	test_when_finished "rm -fr shared member" &&
 

From a67b902c94a2f33275a3947a8bcdab03f64ae75e Mon Sep 17 00:00:00 2001
From: Toon Claes <toon@iotcl.com>
Date: Wed, 10 Dec 2025 14:13:01 +0100
Subject: [PATCH 209/553] git-compat-util: introduce MEMZERO_ARRAY() macro

Introduce a new macro MEMZERO_ARRAY() that zeroes the memory allocated
by ALLOC_ARRAY() and friends. Also add a Coccinelle rule to enforce the
use of this macro.
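
A minimal sketch of the idea, outside of Git's headers, might look
like this. `MEMZERO_ARRAY_SKETCH()` and `sum_after_clear()` are
hypothetical names used only here, and the sketch leaves out the
overflow check that the real macro gets from st_mult().

```c
#include <assert.h>
#include <string.h>

/*
 * Simplified stand-in: the real macro computes the size via st_mult(),
 * which guards against overflow. That check is omitted here.
 */
#define MEMZERO_ARRAY_SKETCH(x, nr) memset((x), 0, sizeof(*(x)) * (nr))

static unsigned sum_after_clear(void)
{
	unsigned counts[4] = { 1, 2, 3, 4 };
	unsigned sum = 0;

	/* Replaces: memset(counts, 0, 4 * sizeof(unsigned)); */
	MEMZERO_ARRAY_SKETCH(counts, 4);

	for (size_t i = 0; i < 4; i++)
		sum += counts[i];
	return sum;
}
```

The element size is derived from the pointer itself, so the call site
cannot name the wrong type.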

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/last-modified.c        |  2 +-
 compat/simple-ipc/ipc-win32.c  |  2 +-
 contrib/coccinelle/array.cocci | 20 ++++++++++++++++++++
 diff-delta.c                   |  2 +-
 ewah/bitmap.c                  |  7 +++----
 git-compat-util.h              |  1 +
 hashmap.c                      |  2 +-
 pack-revindex.c                |  2 +-
 8 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/builtin/last-modified.c b/builtin/last-modified.c
index cc5fd2e7950be7..ac5387e861fd88 100644
--- a/builtin/last-modified.c
+++ b/builtin/last-modified.c
@@ -327,7 +327,7 @@ static void process_parent(struct last_modified *lm,
 	if (!(parent->object.flags & PARENT1))
 		active_paths_free(lm, parent);
 
-	memset(lm->scratch->words, 0x0, lm->scratch->word_alloc * sizeof(eword_t));
+	MEMZERO_ARRAY(lm->scratch->words, lm->scratch->word_alloc);
 	diff_queue_clear(&diff_queued_diff);
 }
 
diff --git a/compat/simple-ipc/ipc-win32.c b/compat/simple-ipc/ipc-win32.c
index a8fc812adfcbd3..4a3e7df9c739e1 100644
--- a/compat/simple-ipc/ipc-win32.c
+++ b/compat/simple-ipc/ipc-win32.c
@@ -686,7 +686,7 @@ static LPSECURITY_ATTRIBUTES get_sa(struct my_sa_data *d)
 		goto fail;
 	}
 
-	memset(ea, 0, NR_EA * sizeof(EXPLICIT_ACCESS));
+	MEMZERO_ARRAY(ea, NR_EA);
 
 	ea[0].grfAccessPermissions = GENERIC_READ | GENERIC_WRITE;
 	ea[0].grfAccessMode = SET_ACCESS;
diff --git a/contrib/coccinelle/array.cocci b/contrib/coccinelle/array.cocci
index 27a3b479c94e5c..d306f6a21efc9e 100644
--- a/contrib/coccinelle/array.cocci
+++ b/contrib/coccinelle/array.cocci
@@ -101,3 +101,23 @@ expression dst, src, n;
 -ALLOC_ARRAY(dst, n);
 -COPY_ARRAY(dst, src, n);
 +DUP_ARRAY(dst, src, n);
+
+@@
+type T;
+T *ptr;
+expression n;
+@@
+- memset(ptr, \( 0x0 \| 0 \), n * \( sizeof(T)
+-                                 \| sizeof(*ptr)
+-                                 \) )
++ MEMZERO_ARRAY(ptr, n)
+
+@@
+type T;
+T[] ptr;
+expression n;
+@@
+- memset(ptr, \( 0x0 \| 0 \), n * \( sizeof(T)
+-                                 \| sizeof(*ptr)
+-                                 \) )
++ MEMZERO_ARRAY(ptr, n)
diff --git a/diff-delta.c b/diff-delta.c
index 71d37368d68a18..43c339f01061ca 100644
--- a/diff-delta.c
+++ b/diff-delta.c
@@ -171,7 +171,7 @@ struct delta_index * create_delta_index(const void *buf, unsigned long bufsize)
 	mem = hash + hsize;
 	entry = mem;
 
-	memset(hash, 0, hsize * sizeof(*hash));
+	MEMZERO_ARRAY(hash, hsize);
 
 	/* allocate an array to count hash entries */
 	hash_count = calloc(hsize, sizeof(*hash_count));
diff --git a/ewah/bitmap.c b/ewah/bitmap.c
index 55928dada86a37..bf878bf8768ea0 100644
--- a/ewah/bitmap.c
+++ b/ewah/bitmap.c
@@ -46,8 +46,7 @@ static void bitmap_grow(struct bitmap *self, size_t word_alloc)
 {
 	size_t old_size = self->word_alloc;
 	ALLOC_GROW(self->words, word_alloc, self->word_alloc);
-	memset(self->words + old_size, 0x0,
-	       (self->word_alloc - old_size) * sizeof(eword_t));
+	MEMZERO_ARRAY(self->words + old_size, (self->word_alloc - old_size));
 }
 
 void bitmap_set(struct bitmap *self, size_t pos)
@@ -192,8 +191,8 @@ void bitmap_or_ewah(struct bitmap *self, struct ewah_bitmap *other)
 	if (self->word_alloc < other_final) {
 		self->word_alloc = other_final;
 		REALLOC_ARRAY(self->words, self->word_alloc);
-		memset(self->words + original_size, 0x0,
-			(self->word_alloc - original_size) * sizeof(eword_t));
+		MEMZERO_ARRAY(self->words + original_size,
+		              (self->word_alloc - original_size));
 	}
 
 	ewah_iterator_init(&it, other);
diff --git a/git-compat-util.h b/git-compat-util.h
index 398e0fac4fab60..2b8192fd2e22fb 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -726,6 +726,7 @@ static inline uint64_t u64_add(uint64_t a, uint64_t b)
 #define ALLOC_ARRAY(x, alloc) (x) = xmalloc(st_mult(sizeof(*(x)), (alloc)))
 #define CALLOC_ARRAY(x, alloc) (x) = xcalloc((alloc), sizeof(*(x)))
 #define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), st_mult(sizeof(*(x)), (alloc)))
+#define MEMZERO_ARRAY(x, alloc) memset((x), 0x0, st_mult(sizeof(*(x)), (alloc)))
 
 #define COPY_ARRAY(dst, src, n) copy_array((dst), (src), (n), sizeof(*(dst)) + \
 	BARF_UNLESS_COPYABLE((dst), (src)))
diff --git a/hashmap.c b/hashmap.c
index a711377853f185..3b5d6f14bc93fb 100644
--- a/hashmap.c
+++ b/hashmap.c
@@ -194,7 +194,7 @@ void hashmap_partial_clear_(struct hashmap *map, ssize_t entry_offset)
 		return;
 	if (entry_offset >= 0)  /* called by hashmap_clear_entries */
 		free_individual_entries(map, entry_offset);
-	memset(map->table, 0, map->tablesize * sizeof(struct hashmap_entry *));
+	MEMZERO_ARRAY(map->table, map->tablesize);
 	map->shrink_at = 0;
 	map->private_size = 0;
 }
diff --git a/pack-revindex.c b/pack-revindex.c
index d0791cc4938fa2..8598b941c8c419 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -75,7 +75,7 @@ static void sort_revindex(struct revindex_entry *entries, unsigned n, off_t max)
 	for (bits = 0; max >> bits; bits += DIGIT_SIZE) {
 		unsigned i;
 
-		memset(pos, 0, BUCKETS * sizeof(*pos));
+		MEMZERO_ARRAY(pos, BUCKETS);
 
 		/*
 		 * We want pos[i] to store the index of the last element that

From 467860bc0b0447093ae97bcecf1655131732338f Mon Sep 17 00:00:00 2001
From: Toon Claes <toon@iotcl.com>
Date: Wed, 10 Dec 2025 14:13:02 +0100
Subject: [PATCH 210/553] contrib/coccinelle: pass include paths to spatch(1)

In the previous commit a new coccinelle rule was added. But neither
`make coccicheck` nor `meson compile coccicheck` detected a case in
builtin/last-modified.c.

This case involves the field `scratch` in `struct last_modified`. This
field is of type `struct bitmap` and that struct has a member
`eword_t *words`. Both are defined in `ewah/ewok.h`. Now, while
builtin/last-modified.c does include that header (with the subdir in the
#include directive), it seems coccinelle does not process it. So it's
unaware of the type of `words` in the bitmap, and the rule from the
previous commit does not match because it uses:

    type T;
    T *ptr;

Fix coccicheck by passing all possible include paths inside the Git
project so spatch(1) can find the headers and can determine the types.

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Makefile                       | 2 +-
 contrib/coccinelle/meson.build | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 7e0f77e2988e3b..7ca21188138ace 100644
--- a/Makefile
+++ b/Makefile
@@ -981,7 +981,7 @@ SANITIZE_LEAK =
 SANITIZE_ADDRESS =
 
 # For the 'coccicheck' target
-SPATCH_INCLUDE_FLAGS = --all-includes
+SPATCH_INCLUDE_FLAGS = --all-includes $(addprefix -I ,compat ewah refs sha256 trace2 win32 xdiff)
 SPATCH_FLAGS =
 SPATCH_TEST_FLAGS =
 
diff --git a/contrib/coccinelle/meson.build b/contrib/coccinelle/meson.build
index dc3f73c2e7b117..ae7f5b54602fd5 100644
--- a/contrib/coccinelle/meson.build
+++ b/contrib/coccinelle/meson.build
@@ -50,6 +50,11 @@ foreach header : headers_to_check
   coccinelle_headers += meson.project_source_root() / header
 endforeach
 
+coccinelle_includes = []
+foreach path : ['compat', 'ewah', 'refs', 'sha256', 'trace2', 'win32', 'xdiff']
+  coccinelle_includes += ['-I', meson.project_source_root() / path]
+endforeach
+
 patches = [ ]
 foreach source : coccinelle_sources
   patches += custom_target(
@@ -58,6 +63,7 @@ foreach source : coccinelle_sources
       '--all-includes',
       '--sp-file', concatenated_rules,
       '--patch', meson.project_source_root(),
+      coccinelle_includes,
       '@INPUT@',
     ],
     input: meson.project_source_root() / source,

From 1660496fc400b3956b4abe7bfc40351c9eddc168 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 11 Dec 2025 10:30:10 +0100
Subject: [PATCH 211/553] odb: refactor parsing of alternates to be
 self-contained

Parsing of the alternates file and environment variable is currently
split up across multiple different functions and is entangled with
`link_alt_odb_entries()`, which is responsible for linking the parsed
object database sources. This results in two downsides:

  - We have mutual recursion between parsing alternates and linking them
    into the object database. This is because we also parse alternates
    that the newly added sources may have.

  - We mix up the actual logic to parse the data and to link them into
    place.

Refactor the logic so that parsing of the alternates file is entirely
self-contained. Note that this doesn't yet fix the above two issues, but
it is a necessary step to get there.
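
To illustrate what a self-contained parser looks like, here is a
reduced sketch that only splits the input into entries. It is not the
function added by this patch: `split_alternates()`, `MAX_ENTRIES` and
`MAX_LEN` are hypothetical, the output is a fixed-size array instead of
a strvec, and quoted paths are not handled.

```c
#include <assert.h>
#include <string.h>

#define MAX_ENTRIES 8
#define MAX_LEN 64

/*
 * Split `string` on `sep`, skipping empty entries and entries that
 * start with '#'. Entries that do not fit into MAX_LEN are skipped as
 * well. Returns the number of entries stored in `out`.
 */
static size_t split_alternates(const char *string, int sep,
			       char out[MAX_ENTRIES][MAX_LEN])
{
	size_t nr = 0;

	while (*string && nr < MAX_ENTRIES) {
		const char *end = strchr(string, sep);
		size_t len;

		if (!end)
			end = string + strlen(string);
		len = (size_t)(end - string);

		if (len && *string != '#' && len < MAX_LEN) {
			memcpy(out[nr], string, len);
			out[nr][len] = '\0';
			nr++;
		}

		/* Step over the separator, if there is one. */
		string = *end ? end + 1 : end;
	}

	return nr;
}
```

The caller receives a plain list of entries and can decide separately
what to do with each of them.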

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 odb.c | 70 ++++++++++++++++++++++++++++++++++-------------------------
 1 file changed, 40 insertions(+), 30 deletions(-)

diff --git a/odb.c b/odb.c
index dc8f292f3d9645..9785f62cb6be5e 100644
--- a/odb.c
+++ b/odb.c
@@ -216,39 +216,50 @@ static struct odb_source *link_alt_odb_entry(struct object_database *odb,
 	return alternate;
 }
 
-static const char *parse_alt_odb_entry(const char *string,
-				       int sep,
-				       struct strbuf *out)
+static void parse_alternates(const char *string,
+			     int sep,
+			     struct strvec *out)
 {
-	const char *end;
+	struct strbuf buf = STRBUF_INIT;
 
-	strbuf_reset(out);
+	while (*string) {
+		const char *end;
+
+		strbuf_reset(&buf);
+
+		if (*string == '#') {
+			/* comment; consume up to next separator */
+			end = strchrnul(string, sep);
+		} else if (*string == '"' && !unquote_c_style(&buf, string, &end)) {
+			/*
+			 * quoted path; unquote_c_style has copied the
+			 * data for us and set "end". Broken quoting (e.g.,
+			 * an entry that doesn't end with a quote) falls
+			 * back to the unquoted case below.
+			 */
+		} else {
+			/* normal, unquoted path */
+			end = strchrnul(string, sep);
+			strbuf_add(&buf, string, end - string);
+		}
 
-	if (*string == '#') {
-		/* comment; consume up to next separator */
-		end = strchrnul(string, sep);
-	} else if (*string == '"' && !unquote_c_style(out, string, &end)) {
-		/*
-		 * quoted path; unquote_c_style has copied the
-		 * data for us and set "end". Broken quoting (e.g.,
-		 * an entry that doesn't end with a quote) falls
-		 * back to the unquoted case below.
-		 */
-	} else {
-		/* normal, unquoted path */
-		end = strchrnul(string, sep);
-		strbuf_add(out, string, end - string);
+		if (*end)
+			end++;
+		string = end;
+
+		if (!buf.len)
+			continue;
+
+		strvec_push(out, buf.buf);
 	}
 
-	if (*end)
-		end++;
-	return end;
+	strbuf_release(&buf);
 }
 
 static void link_alt_odb_entries(struct object_database *odb, const char *alt,
 				 int sep, const char *relative_base, int depth)
 {
-	struct strbuf dir = STRBUF_INIT;
+	struct strvec alternates = STRVEC_INIT;
 
 	if (!alt || !*alt)
 		return;
@@ -259,13 +270,12 @@ static void link_alt_odb_entries(struct object_database *odb, const char *alt,
 		return;
 	}
 
-	while (*alt) {
-		alt = parse_alt_odb_entry(alt, sep, &dir);
-		if (!dir.len)
-			continue;
-		link_alt_odb_entry(odb, dir.buf, relative_base, depth);
-	}
-	strbuf_release(&dir);
+	parse_alternates(alt, sep, &alternates);
+
+	for (size_t i = 0; i < alternates.nr; i++)
+		link_alt_odb_entry(odb, alternates.v[i], relative_base, depth);
+
+	strvec_clear(&alternates);
 }
 
 static void read_info_alternates(struct object_database *odb,

From 84cec5276e70bdabd651a3d0a250d006434d639f Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 11 Dec 2025 10:30:11 +0100
Subject: [PATCH 212/553] odb: resolve relative alternative paths when parsing

Parsing alternates and resolving potential relative paths is currently
handled in two separate steps. This has the effect that the logic to
retrieve alternates is not entirely self-contained. We want it to be
just that though so that we can eventually move the logic to list
alternates into the `struct odb_source`.

Move the logic to resolve relative alternate paths into
`parse_alternates()`. Besides bringing us a step closer towards the
above goal, it also neatly separates the concerns of generating the
list of alternates and linking them into the object database.
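
The resolution step can be approximated with the following sketch.
`resolve_alternate()` is a hypothetical helper for illustration: unlike
the patch it does not canonicalize the result (no symlink or ".."
resolution), it only treats a leading '/' as absolute, and it assumes
that `outlen` is greater than zero.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: prefix relative paths with `base` and strip
 * trailing slashes from the result. Assumes outlen > 0.
 */
static void resolve_alternate(const char *base, const char *path,
			      char *out, size_t outlen)
{
	size_t len;

	if (path[0] != '/' && base)
		snprintf(out, outlen, "%s/%s", base, path);
	else
		snprintf(out, outlen, "%s", path);

	/* Drop trailing slashes, but keep a lone "/". */
	len = strlen(out);
	while (len > 1 && out[len - 1] == '/')
		out[--len] = '\0';
}
```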

Note that we ignore any errors when the relative path cannot be
resolved. This isn't really a change in behaviour though: if the path
cannot be resolved to a directory then `alt_odb_usable()` still knows to
bail out.

While at it, rename `link_alt_odb_entry()` to
`odb_add_alternate_recursively()` to more clearly indicate its intent
and to align it with modern terminology.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 odb.c | 64 +++++++++++++++++++++++++++++------------------------------
 1 file changed, 32 insertions(+), 32 deletions(-)

diff --git a/odb.c b/odb.c
index 9785f62cb6be5e..699bdbffd1e7a3 100644
--- a/odb.c
+++ b/odb.c
@@ -159,44 +159,21 @@ static struct odb_source *odb_source_new(struct object_database *odb,
 	return source;
 }
 
-static struct odb_source *link_alt_odb_entry(struct object_database *odb,
-					     const char *dir,
-					     const char *relative_base,
-					     int depth)
+static struct odb_source *odb_add_alternate_recursively(struct object_database *odb,
+							const char *source,
+							int depth)
 {
 	struct odb_source *alternate = NULL;
-	struct strbuf pathbuf = STRBUF_INIT;
 	struct strbuf tmp = STRBUF_INIT;
 	khiter_t pos;
 	int ret;
 
-	if (!is_absolute_path(dir) && relative_base) {
-		strbuf_realpath(&pathbuf, relative_base, 1);
-		strbuf_addch(&pathbuf, '/');
-	}
-	strbuf_addstr(&pathbuf, dir);
-
-	if (!strbuf_realpath(&tmp, pathbuf.buf, 0)) {
-		error(_("unable to normalize alternate object path: %s"),
-		      pathbuf.buf);
-		goto error;
-	}
-	strbuf_swap(&pathbuf, &tmp);
-
-	/*
-	 * The trailing slash after the directory name is given by
-	 * this function at the end. Remove duplicates.
-	 */
-	while (pathbuf.len && pathbuf.buf[pathbuf.len - 1] == '/')
-		strbuf_setlen(&pathbuf, pathbuf.len - 1);
-
-	strbuf_reset(&tmp);
 	strbuf_realpath(&tmp, odb->sources->path, 1);
 
-	if (!alt_odb_usable(odb, pathbuf.buf, tmp.buf))
+	if (!alt_odb_usable(odb, source, tmp.buf))
 		goto error;
 
-	alternate = odb_source_new(odb, pathbuf.buf, false);
+	alternate = odb_source_new(odb, source, false);
 
 	/* add the alternate entry */
 	*odb->sources_tail = alternate;
@@ -212,20 +189,22 @@ static struct odb_source *link_alt_odb_entry(struct object_database *odb,
 
  error:
 	strbuf_release(&tmp);
-	strbuf_release(&pathbuf);
 	return alternate;
 }
 
 static void parse_alternates(const char *string,
 			     int sep,
+			     const char *relative_base,
 			     struct strvec *out)
 {
+	struct strbuf pathbuf = STRBUF_INIT;
 	struct strbuf buf = STRBUF_INIT;
 
 	while (*string) {
 		const char *end;
 
 		strbuf_reset(&buf);
+		strbuf_reset(&pathbuf);
 
 		if (*string == '#') {
 			/* comment; consume up to next separator */
@@ -250,9 +229,30 @@ static void parse_alternates(const char *string,
 		if (!buf.len)
 			continue;
 
+		if (!is_absolute_path(buf.buf) && relative_base) {
+			strbuf_realpath(&pathbuf, relative_base, 1);
+			strbuf_addch(&pathbuf, '/');
+		}
+		strbuf_addbuf(&pathbuf, &buf);
+
+		strbuf_reset(&buf);
+		if (!strbuf_realpath(&buf, pathbuf.buf, 0)) {
+			error(_("unable to normalize alternate object path: %s"),
+			      pathbuf.buf);
+			continue;
+		}
+
+		/*
+		 * The trailing slash after the directory name is given by
+		 * this function at the end. Remove duplicates.
+		 */
+		while (buf.len && buf.buf[buf.len - 1] == '/')
+			strbuf_setlen(&buf, buf.len - 1);
+
 		strvec_push(out, buf.buf);
 	}
 
+	strbuf_release(&pathbuf);
 	strbuf_release(&buf);
 }
 
@@ -270,10 +270,10 @@ static void link_alt_odb_entries(struct object_database *odb, const char *alt,
 		return;
 	}
 
-	parse_alternates(alt, sep, &alternates);
+	parse_alternates(alt, sep, relative_base, &alternates);
 
 	for (size_t i = 0; i < alternates.nr; i++)
-		link_alt_odb_entry(odb, alternates.v[i], relative_base, depth);
+		odb_add_alternate_recursively(odb, alternates.v[i], depth);
 
 	strvec_clear(&alternates);
 }
@@ -348,7 +348,7 @@ struct odb_source *odb_add_to_alternates_memory(struct object_database *odb,
 	 * overwritten when they are.
 	 */
 	odb_prepare_alternates(odb);
-	return link_alt_odb_entry(odb, dir, NULL, 0);
+	return odb_add_alternate_recursively(odb, dir, 0);
 }
 
 struct odb_source *odb_set_temporary_primary_source(struct object_database *odb,

From d17673ef4285d3d5f70909136f1ffe2745bcb71c Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 11 Dec 2025 10:30:12 +0100
Subject: [PATCH 213/553] odb: move computation of normalized objdir into
 `alt_odb_usable()`

The function `alt_odb_usable()` receives as input the object database,
the path it's supposed to determine usability for as well as the
normalized path of the main object directory of the repository. The last
part is derived by the function's caller from the object database. As we
already pass the object database to `alt_odb_usable()` it is redundant
information.

Drop the extra parameter and compute the normalized object directory in
the function itself.

While at it, rename the function to `odb_is_source_usable()` to align it
with modern terminology.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 odb.c | 27 +++++++++++++++------------
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/odb.c b/odb.c
index 699bdbffd1e7a3..e314f86c3b843d 100644
--- a/odb.c
+++ b/odb.c
@@ -89,17 +89,20 @@ int odb_mkstemp(struct object_database *odb,
 /*
  * Return non-zero iff the path is usable as an alternate object database.
  */
-static int alt_odb_usable(struct object_database *o, const char *path,
-			  const char *normalized_objdir)
+static bool odb_is_source_usable(struct object_database *o, const char *path)
 {
 	int r;
+	struct strbuf normalized_objdir = STRBUF_INIT;
+	bool usable = false;
+
+	strbuf_realpath(&normalized_objdir, o->sources->path, 1);
 
 	/* Detect cases where alternate disappeared */
 	if (!is_directory(path)) {
 		error(_("object directory %s does not exist; "
 			"check .git/objects/info/alternates"),
 		      path);
-		return 0;
+		goto out;
 	}
 
 	/*
@@ -116,13 +119,17 @@ static int alt_odb_usable(struct object_database *o, const char *path,
 		kh_value(o->source_by_path, p) = o->sources;
 	}
 
-	if (fspatheq(path, normalized_objdir))
-		return 0;
+	if (fspatheq(path, normalized_objdir.buf))
+		goto out;
 
 	if (kh_get_odb_path_map(o->source_by_path, path) < kh_end(o->source_by_path))
-		return 0;
+		goto out;
+
+	usable = true;
 
-	return 1;
+out:
+	strbuf_release(&normalized_objdir);
+	return usable;
 }
 
 /*
@@ -164,13 +171,10 @@ static struct odb_source *odb_add_alternate_recursively(struct object_database *
 							int depth)
 {
 	struct odb_source *alternate = NULL;
-	struct strbuf tmp = STRBUF_INIT;
 	khiter_t pos;
 	int ret;
 
-	strbuf_realpath(&tmp, odb->sources->path, 1);
-
-	if (!alt_odb_usable(odb, source, tmp.buf))
+	if (!odb_is_source_usable(odb, source))
 		goto error;
 
 	alternate = odb_source_new(odb, source, false);
@@ -188,7 +192,6 @@ static struct odb_source *odb_add_alternate_recursively(struct object_database *
 	read_info_alternates(odb, alternate->path, depth + 1);
 
  error:
-	strbuf_release(&tmp);
 	return alternate;
 }
 

From dccfb39cdb68e47a4c7103b3c465cde91c5f9f56 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 11 Dec 2025 10:30:13 +0100
Subject: [PATCH 214/553] odb: stop splitting alternate in
 `odb_add_to_alternates_file()`

When calling `odb_add_to_alternates_file()` we know to add the newly
added source to the object database in case we have already loaded
alternates. This is done so that we can make its objects accessible
immediately without having to fully reload all alternates.

The way we do this though is to call `link_alt_odb_entries()`, which
adds _multiple_ sources to the object database in case we have
newline-separated entries. This behaviour is not documented in the
function documentation of `odb_add_to_alternates_file()`, and all
callers only ever pass a single directory to it. It's thus entirely
surprising and a conceptual mismatch.

Fix this issue by directly calling `odb_add_alternate_recursively()`
instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 odb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/odb.c b/odb.c
index e314f86c3b843d..3112eab5d03ed1 100644
--- a/odb.c
+++ b/odb.c
@@ -338,7 +338,7 @@ void odb_add_to_alternates_file(struct object_database *odb,
 		if (commit_lock_file(&lock))
 			die_errno(_("unable to move new alternates file into place"));
 		if (odb->loaded_alternates)
-			link_alt_odb_entries(odb, dir, '\n', NULL, 0);
+			odb_add_alternate_recursively(odb, dir, 0);
 	}
 	free(alts);
 }

From 430e0e0f2e75673206321f6f4942c0bc7856c8b7 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 11 Dec 2025 10:30:14 +0100
Subject: [PATCH 215/553] odb: remove mutual recursion when parsing alternates

When adding an alternative object database source we not only have to
consider the added source itself, but we also have to add _its_ sources
to our database. We implement this via mutual recursion:

  1. We first call `link_alt_odb_entries()`.

  2. `link_alt_odb_entries()` calls `parse_alternates()`.

  3. We then add each alternate via `odb_add_alternate_recursively()`.

  4. `odb_add_alternate_recursively()` calls `link_alt_odb_entries()`
     again.

This flow is somewhat hard to follow, but more importantly it means that
parsing of alternates is somewhat tied to the recursive behaviour.

Refactor the function to remove the mutual recursion between adding
sources and parsing alternates. The parsing step thus becomes completely
oblivious to the fact that there is recursive behaviour going on at all.
The recursion is handled by `odb_add_alternate_recursively()` instead,
which now recurses into itself.

This refactoring allows us to move parsing of alternates into object
database sources in a subsequent step.
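
The split between a recursion-agnostic parsing step and a
self-recursive adding step can be sketched as below. This is a toy
model and not the code in odb.c: `struct store`, `struct registry`,
`read_alternate()` and `add_recursively()` are hypothetical, and an
in-memory table stands in for the info/alternates files. Only the depth
limit of 5 mirrors the patch.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEPTH 5
#define MAX_SOURCES 16

/* Toy stand-in for info/alternates: each store lists at most one alternate. */
struct store {
	const char *path;
	const char *alternate;	/* NULL when the store lists none */
};

struct registry {
	const char *paths[MAX_SOURCES];
	size_t nr;
};

/* Parsing step: only reports what is listed, unaware of any recursion. */
static const char *read_alternate(const struct store *stores, size_t nr,
				  const char *path)
{
	for (size_t i = 0; i < nr; i++)
		if (!strcmp(stores[i].path, path))
			return stores[i].alternate;
	return NULL;
}

static int registry_has(const struct registry *reg, const char *path)
{
	for (size_t i = 0; i < reg->nr; i++)
		if (!strcmp(reg->paths[i], path))
			return 1;
	return 0;
}

/* Adding step: registers one source, then calls itself for nested ones. */
static void add_recursively(struct registry *reg, const struct store *stores,
			    size_t nr, const char *path, int depth)
{
	const char *next;

	if (reg->nr == MAX_SOURCES || registry_has(reg, path))
		return;		/* table full or source already tracked */
	reg->paths[reg->nr++] = path;

	next = read_alternate(stores, nr, path);
	if (!next)
		return;
	if (depth + 1 > MAX_DEPTH) {
		fprintf(stderr, "%s: ignoring alternates, nesting too deep\n",
			path);
		return;
	}
	add_recursively(reg, stores, nr, next, depth + 1);
}
```

In this model a cycle terminates because already tracked sources are
skipped, and a long chain stops once the nesting limit is exceeded.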

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 odb.c | 60 +++++++++++++++++++++++++++--------------------------------
 1 file changed, 27 insertions(+), 33 deletions(-)

diff --git a/odb.c b/odb.c
index 3112eab5d03ed1..59944d46496e9c 100644
--- a/odb.c
+++ b/odb.c
@@ -147,9 +147,8 @@ static bool odb_is_source_usable(struct object_database *o, const char *path)
  * of the object ID, an extra slash for the first level indirection, and
  * the terminating NUL.
  */
-static void read_info_alternates(struct object_database *odb,
-				 const char *relative_base,
-				 int depth);
+static void read_info_alternates(const char *relative_base,
+				 struct strvec *out);
 
 static struct odb_source *odb_source_new(struct object_database *odb,
 					 const char *path,
@@ -171,6 +170,7 @@ static struct odb_source *odb_add_alternate_recursively(struct object_database *
 							int depth)
 {
 	struct odb_source *alternate = NULL;
+	struct strvec sources = STRVEC_INIT;
 	khiter_t pos;
 	int ret;
 
@@ -189,9 +189,17 @@ static struct odb_source *odb_add_alternate_recursively(struct object_database *
 	kh_value(odb->source_by_path, pos) = alternate;
 
 	/* recursively add alternates */
-	read_info_alternates(odb, alternate->path, depth + 1);
+	read_info_alternates(alternate->path, &sources);
+	if (sources.nr && depth + 1 > 5) {
+		error(_("%s: ignoring alternate object stores, nesting too deep"),
+		      source);
+	} else {
+		for (size_t i = 0; i < sources.nr; i++)
+			odb_add_alternate_recursively(odb, sources.v[i], depth + 1);
+	}
 
  error:
+	strvec_clear(&sources);
 	return alternate;
 }
 
@@ -203,6 +211,9 @@ static void parse_alternates(const char *string,
 	struct strbuf pathbuf = STRBUF_INIT;
 	struct strbuf buf = STRBUF_INIT;
 
+	if (!string || !*string)
+		return;
+
 	while (*string) {
 		const char *end;
 
@@ -259,34 +270,11 @@ static void parse_alternates(const char *string,
 	strbuf_release(&buf);
 }
 
-static void link_alt_odb_entries(struct object_database *odb, const char *alt,
-				 int sep, const char *relative_base, int depth)
+static void read_info_alternates(const char *relative_base,
+				 struct strvec *out)
 {
-	struct strvec alternates = STRVEC_INIT;
-
-	if (!alt || !*alt)
-		return;
-
-	if (depth > 5) {
-		error(_("%s: ignoring alternate object stores, nesting too deep"),
-				relative_base);
-		return;
-	}
-
-	parse_alternates(alt, sep, relative_base, &alternates);
-
-	for (size_t i = 0; i < alternates.nr; i++)
-		odb_add_alternate_recursively(odb, alternates.v[i], depth);
-
-	strvec_clear(&alternates);
-}
-
-static void read_info_alternates(struct object_database *odb,
-				 const char *relative_base,
-				 int depth)
-{
-	char *path;
 	struct strbuf buf = STRBUF_INIT;
+	char *path;
 
 	path = xstrfmt("%s/info/alternates", relative_base);
 	if (strbuf_read_file(&buf, path, 1024) < 0) {
@@ -294,8 +282,8 @@ static void read_info_alternates(struct object_database *odb,
 		free(path);
 		return;
 	}
+	parse_alternates(buf.buf, '\n', relative_base, out);
 
-	link_alt_odb_entries(odb, buf.buf, '\n', relative_base, depth);
 	strbuf_release(&buf);
 	free(path);
 }
@@ -622,13 +610,19 @@ int odb_for_each_alternate(struct object_database *odb,
 
 void odb_prepare_alternates(struct object_database *odb)
 {
+	struct strvec sources = STRVEC_INIT;
+
 	if (odb->loaded_alternates)
 		return;
 
-	link_alt_odb_entries(odb, odb->alternate_db, PATH_SEP, NULL, 0);
+	parse_alternates(odb->alternate_db, PATH_SEP, NULL, &sources);
+	read_info_alternates(odb->sources->path, &sources);
+	for (size_t i = 0; i < sources.nr; i++)
+		odb_add_alternate_recursively(odb, sources.v[i], 0);
 
-	read_info_alternates(odb, odb->sources->path, 0);
 	odb->loaded_alternates = 1;
+
+	strvec_clear(&sources);
 }
 
 int odb_has_alternates(struct object_database *odb)

From 3f42555322f86f17a2dac4f585edab1d84f3df57 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 11 Dec 2025 10:30:15 +0100
Subject: [PATCH 216/553] odb: drop forward declaration of
 `read_info_alternates()`

Now that we have removed the mutual recursion in the preceding commit
it is not necessary anymore to have a forward declaration of the
`read_info_alternates()` function. Move the function and its
dependencies further up so that we can remove the declaration.

Note that this commit also removes the function documentation of
`read_info_alternates()`. It's unclear what it's documenting, but it for
sure isn't documenting the modern behaviour of the function anymore.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 odb.c | 125 +++++++++++++++++++++++++---------------------------------
 1 file changed, 54 insertions(+), 71 deletions(-)

diff --git a/odb.c b/odb.c
index 59944d46496e9c..dcf4a62cd2eaf2 100644
--- a/odb.c
+++ b/odb.c
@@ -132,77 +132,6 @@ static bool odb_is_source_usable(struct object_database *o, const char *path)
 	return usable;
 }
 
-/*
- * Prepare alternate object database registry.
- *
- * The variable alt_odb_list points at the list of struct
- * odb_source.  The elements on this list come from
- * non-empty elements from colon separated ALTERNATE_DB_ENVIRONMENT
- * environment variable, and $GIT_OBJECT_DIRECTORY/info/alternates,
- * whose contents is similar to that environment variable but can be
- * LF separated.  Its base points at a statically allocated buffer that
- * contains "/the/directory/corresponding/to/.git/objects/...", while
- * its name points just after the slash at the end of ".git/objects/"
- * in the example above, and has enough space to hold all hex characters
- * of the object ID, an extra slash for the first level indirection, and
- * the terminating NUL.
- */
-static void read_info_alternates(const char *relative_base,
-				 struct strvec *out);
-
-static struct odb_source *odb_source_new(struct object_database *odb,
-					 const char *path,
-					 bool local)
-{
-	struct odb_source *source;
-
-	CALLOC_ARRAY(source, 1);
-	source->odb = odb;
-	source->local = local;
-	source->path = xstrdup(path);
-	source->loose = odb_source_loose_new(source);
-
-	return source;
-}
-
-static struct odb_source *odb_add_alternate_recursively(struct object_database *odb,
-							const char *source,
-							int depth)
-{
-	struct odb_source *alternate = NULL;
-	struct strvec sources = STRVEC_INIT;
-	khiter_t pos;
-	int ret;
-
-	if (!odb_is_source_usable(odb, source))
-		goto error;
-
-	alternate = odb_source_new(odb, source, false);
-
-	/* add the alternate entry */
-	*odb->sources_tail = alternate;
-	odb->sources_tail = &(alternate->next);
-
-	pos = kh_put_odb_path_map(odb->source_by_path, alternate->path, &ret);
-	if (!ret)
-		BUG("source must not yet exist");
-	kh_value(odb->source_by_path, pos) = alternate;
-
-	/* recursively add alternates */
-	read_info_alternates(alternate->path, &sources);
-	if (sources.nr && depth + 1 > 5) {
-		error(_("%s: ignoring alternate object stores, nesting too deep"),
-		      source);
-	} else {
-		for (size_t i = 0; i < sources.nr; i++)
-			odb_add_alternate_recursively(odb, sources.v[i], depth + 1);
-	}
-
- error:
-	strvec_clear(&sources);
-	return alternate;
-}
-
 static void parse_alternates(const char *string,
 			     int sep,
 			     const char *relative_base,
@@ -288,6 +217,60 @@ static void read_info_alternates(const char *relative_base,
 	free(path);
 }
 
+
+static struct odb_source *odb_source_new(struct object_database *odb,
+					 const char *path,
+					 bool local)
+{
+	struct odb_source *source;
+
+	CALLOC_ARRAY(source, 1);
+	source->odb = odb;
+	source->local = local;
+	source->path = xstrdup(path);
+	source->loose = odb_source_loose_new(source);
+
+	return source;
+}
+
+static struct odb_source *odb_add_alternate_recursively(struct object_database *odb,
+							const char *source,
+							int depth)
+{
+	struct odb_source *alternate = NULL;
+	struct strvec sources = STRVEC_INIT;
+	khiter_t pos;
+	int ret;
+
+	if (!odb_is_source_usable(odb, source))
+		goto error;
+
+	alternate = odb_source_new(odb, source, false);
+
+	/* add the alternate entry */
+	*odb->sources_tail = alternate;
+	odb->sources_tail = &(alternate->next);
+
+	pos = kh_put_odb_path_map(odb->source_by_path, alternate->path, &ret);
+	if (!ret)
+		BUG("source must not yet exist");
+	kh_value(odb->source_by_path, pos) = alternate;
+
+	/* recursively add alternates */
+	read_info_alternates(alternate->path, &sources);
+	if (sources.nr && depth + 1 > 5) {
+		error(_("%s: ignoring alternate object stores, nesting too deep"),
+		      source);
+	} else {
+		for (size_t i = 0; i < sources.nr; i++)
+			odb_add_alternate_recursively(odb, sources.v[i], depth + 1);
+	}
+
+ error:
+	strvec_clear(&sources);
+	return alternate;
+}
+
 void odb_add_to_alternates_file(struct object_database *odb,
 				const char *dir)
 {

From f7dbd9fb2ea9b14b4df0949411205f4b5d284b41 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 11 Dec 2025 10:30:16 +0100
Subject: [PATCH 217/553] odb: read alternates via sources

Adapt how we read alternates so that the interface is structured around
the object database source we're reading from. This will eventually
allow us to abstract away this behaviour with pluggable object databases
so that every format can have its own mechanism for listing alternates.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 odb.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/odb.c b/odb.c
index dcf4a62cd2eaf2..c5ba26b85f20e1 100644
--- a/odb.c
+++ b/odb.c
@@ -199,19 +199,19 @@ static void parse_alternates(const char *string,
 	strbuf_release(&buf);
 }
 
-static void read_info_alternates(const char *relative_base,
-				 struct strvec *out)
+static void odb_source_read_alternates(struct odb_source *source,
+				       struct strvec *out)
 {
 	struct strbuf buf = STRBUF_INIT;
 	char *path;
 
-	path = xstrfmt("%s/info/alternates", relative_base);
+	path = xstrfmt("%s/info/alternates", source->path);
 	if (strbuf_read_file(&buf, path, 1024) < 0) {
 		warn_on_fopen_errors(path);
 		free(path);
 		return;
 	}
-	parse_alternates(buf.buf, '\n', relative_base, out);
+	parse_alternates(buf.buf, '\n', source->path, out);
 
 	strbuf_release(&buf);
 	free(path);
@@ -257,7 +257,7 @@ static struct odb_source *odb_add_alternate_recursively(struct object_database *
 	kh_value(odb->source_by_path, pos) = alternate;
 
 	/* recursively add alternates */
-	read_info_alternates(alternate->path, &sources);
+	odb_source_read_alternates(alternate, &sources);
 	if (sources.nr && depth + 1 > 5) {
 		error(_("%s: ignoring alternate object stores, nesting too deep"),
 		      source);
@@ -599,7 +599,7 @@ void odb_prepare_alternates(struct object_database *odb)
 		return;
 
 	parse_alternates(odb->alternate_db, PATH_SEP, NULL, &sources);
-	read_info_alternates(odb->sources->path, &sources);
+	odb_source_read_alternates(odb->sources, &sources);
 	for (size_t i = 0; i < sources.nr; i++)
 		odb_add_alternate_recursively(odb, sources.v[i], 0);
 

From 221a877d4785030e07d20977418609257fd606d8 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Thu, 11 Dec 2025 10:30:17 +0100
Subject: [PATCH 218/553] odb: write alternates via sources

Refactor writing of alternates so that the actual business logic is
structured around the object database source we want to write the
alternate to. Same as with the preceding commit, this will eventually
allow us to have different logic for writing alternates depending on the
backend used.

Note that after the refactoring we start to call
`odb_add_alternate_recursively()` unconditionally. This is fine though
as we know to skip adding sources that are tracked already.
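
The division of labour (a helper that reports failure through its
return value and skips entries that are already present, and a caller
that decides whether to die) can be sketched as follows. This is a
simplified illustration, not the lockfile-based code in the patch:
`append_unique_line()` is a hypothetical helper that works on an
already opened stdio stream.

```c
#include <stdio.h>
#include <string.h>

/*
 * Append `line` to the stream unless an identical line already exists.
 * Returns 0 on success and -1 on I/O error, leaving it to the caller
 * to decide how to react. Lines longer than the buffer are not handled
 * in this sketch.
 */
static int append_unique_line(FILE *f, const char *line)
{
	char buf[4096];

	rewind(f);
	while (fgets(buf, sizeof(buf), f)) {
		buf[strcspn(buf, "\n")] = '\0';
		if (!strcmp(buf, line))
			return 0;	/* already listed, nothing to write */
	}
	if (ferror(f))
		return -1;
	if (fseek(f, 0, SEEK_END) < 0)
		return -1;
	if (fprintf(f, "%s\n", line) < 0)
		return -1;
	return fflush(f) == EOF ? -1 : 0;
}
```

Calling the helper twice with the same line leaves a single entry,
which is why an unconditional second call is harmless in this model.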

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 odb.c | 51 +++++++++++++++++++++++++++++++++++----------------
 1 file changed, 35 insertions(+), 16 deletions(-)

diff --git a/odb.c b/odb.c
index c5ba26b85f20e1..cc7f8324655e08 100644
--- a/odb.c
+++ b/odb.c
@@ -271,25 +271,28 @@ static struct odb_source *odb_add_alternate_recursively(struct object_database *
 	return alternate;
 }
 
-void odb_add_to_alternates_file(struct object_database *odb,
-				const char *dir)
+static int odb_source_write_alternate(struct odb_source *source,
+				      const char *alternate)
 {
 	struct lock_file lock = LOCK_INIT;
-	char *alts = repo_git_path(odb->repo, "objects/info/alternates");
+	char *path = xstrfmt("%s/%s", source->path, "info/alternates");
 	FILE *in, *out;
 	int found = 0;
+	int ret;
 
-	hold_lock_file_for_update(&lock, alts, LOCK_DIE_ON_ERROR);
+	hold_lock_file_for_update(&lock, path, LOCK_DIE_ON_ERROR);
 	out = fdopen_lock_file(&lock, "w");
-	if (!out)
-		die_errno(_("unable to fdopen alternates lockfile"));
+	if (!out) {
+		ret = error_errno(_("unable to fdopen alternates lockfile"));
+		goto out;
+	}
 
-	in = fopen(alts, "r");
+	in = fopen(path, "r");
 	if (in) {
 		struct strbuf line = STRBUF_INIT;
 
 		while (strbuf_getline(&line, in) != EOF) {
-			if (!strcmp(dir, line.buf)) {
+			if (!strcmp(alternate, line.buf)) {
 				found = 1;
 				break;
 			}
@@ -298,20 +301,36 @@ void odb_add_to_alternates_file(struct object_database *odb,
 
 		strbuf_release(&line);
 		fclose(in);
+	} else if (errno != ENOENT) {
+		ret = error_errno(_("unable to read alternates file"));
+		goto out;
 	}
-	else if (errno != ENOENT)
-		die_errno(_("unable to read alternates file"));
 
 	if (found) {
 		rollback_lock_file(&lock);
 	} else {
-		fprintf_or_die(out, "%s\n", dir);
-		if (commit_lock_file(&lock))
-			die_errno(_("unable to move new alternates file into place"));
-		if (odb->loaded_alternates)
-			odb_add_alternate_recursively(odb, dir, 0);
+		fprintf_or_die(out, "%s\n", alternate);
+		if (commit_lock_file(&lock)) {
+			ret = error_errno(_("unable to move new alternates file into place"));
+			goto out;
+		}
 	}
-	free(alts);
+
+	ret = 0;
+
+out:
+	free(path);
+	return ret;
+}
+
+void odb_add_to_alternates_file(struct object_database *odb,
+				const char *dir)
+{
+	int ret = odb_source_write_alternate(odb->sources, dir);
+	if (ret < 0)
+		die(NULL);
+	if (odb->loaded_alternates)
+		odb_add_alternate_recursively(odb, dir, 0);
 }
 
 struct odb_source *odb_add_to_alternates_memory(struct object_database *odb,

From d4b732899e8b7571f2822b35e5cc7f55f6ce5e3d Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Thu, 11 Dec 2025 11:53:07 +0900
Subject: [PATCH 219/553] Makefile: help macOS novices by mentioning MacPorts
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

In Aug 2006, the DarwinPorts project renamed itself to MacPorts.  To
those who are not intimately familiar with the open source ecosystem
around macOS from the olden days, the name DarwinPorts may not ring a
bell, even when they are using MacPorts.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 7e0f77e2988e3b..be027218a5d29d 100644
--- a/Makefile
+++ b/Makefile
@@ -95,7 +95,8 @@ include shared.mak
 # and LDFLAGS appropriately.
 #
 # Define NO_DARWIN_PORTS if you are building on Darwin/Mac OS X,
-# have DarwinPorts installed in /opt/local, but don't want GIT to
+# have DarwinPorts (which is an old name for MacPorts) installed
+# in /opt/local, but don't want GIT to
 # link against any libraries installed there.  If defined you may
 # specify your own (or DarwinPort's) include directories and
 # library directories by defining CFLAGS and LDFLAGS appropriately.

From a4a77e41fa0ee3d526993be47086bbfe3a115cdc Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Thu, 11 Dec 2025 18:56:54 +0100
Subject: [PATCH 220/553] replay: move onto NULL check before first use
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

cmd_replay() aborts if the pointer "onto" is NULL after argument
parsing, e.g. when specifying a non-existing commit with --onto.
15cd4ef1f4 (replay: make atomic ref updates the default behavior,
2025-11-06) added code that dereferences this pointer before the check.
Switch their places to avoid a segmentation fault.

Reported-by: Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/replay.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/replay.c b/builtin/replay.c
index 6606a2c94bc671..312b8203cb525a 100644
--- a/builtin/replay.c
+++ b/builtin/replay.c
@@ -454,6 +454,9 @@ int cmd_replay(int argc,
 	determine_replay_mode(repo, &revs.cmdline, onto_name, &advance_name,
 			      &onto, &update_refs);
 
+	if (!onto) /* FIXME: Should handle replaying down to root commit */
+		die("Replaying down to root commit is not supported yet!");
+
 	/* Build reflog message */
 	if (advance_name_opt)
 		strbuf_addf(&reflog_msg, "replay --advance %s", advance_name_opt);
@@ -472,9 +475,6 @@ int cmd_replay(int argc,
 		}
 	}
 
-	if (!onto) /* FIXME: Should handle replaying down to root commit */
-		die("Replaying down to root commit is not supported yet!");
-
 	if (prepare_revision_walk(&revs) < 0) {
 		ret = error(_("error preparing revisions"));
 		goto cleanup;

From 4d75f2aea776f434b51481ae65233d6012da1a66 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Fri, 12 Dec 2025 21:54:14 +0900
Subject: [PATCH 221/553] FLEX_ARRAY: require platforms to support the C99
 syntax

Before C99 syntax to express that the final member in a struct is an
array of unknown number of elements, i.e.,

	struct {
		...
		T flexible_array[];
	};

came along, GNU introduced their own extension to declare such a
member with 0 size, i.e.,

		T flexible_array[0];

and the compilers that did not understand even that were given a way
to emulate it by wasting one element, i.e.,

		T flexible_array[1];

As we are using more and more C99 language features, let's see if
the platforms that still need to resort to the historical forms of
flexible array member support are still there, by forcing all the
flex array definitions to use the C99 syntax and see if anybody
screams (in which case reverting the changes is rather easy).
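
For reference, a struct using the C99 form is typically allocated as
in the sketch below. `struct name_entry` and `name_entry_new()` are
hypothetical and exist only for this illustration; they are not taken
from Git's sources.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct name_entry {
	size_t len;
	char name[];	/* C99 flexible array member, must come last */
};

/* Allocate the header and the string payload in a single block. */
static struct name_entry *name_entry_new(const char *name)
{
	size_t len = strlen(name);
	/* Room for the header, len bytes of name and the terminating NUL. */
	struct name_entry *e = malloc(sizeof(*e) + len + 1);

	if (!e)
		return NULL;
	e->len = len;
	memcpy(e->name, name, len + 1);
	return e;
}
```

With the `[1]` emulation, `sizeof(*e)` would already include one array
element, which is the wasted element mentioned above.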

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 git-compat-util.h | 33 ++-------------------------------
 1 file changed, 2 insertions(+), 31 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 398e0fac4fab60..8010397bdb5b80 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -38,37 +38,8 @@ struct strbuf;
 DISABLE_WARNING(-Wsign-compare)
 #endif
 
-#ifndef FLEX_ARRAY
-/*
- * See if our compiler is known to support flexible array members.
- */
-
-/*
- * Check vendor specific quirks first, before checking the
- * __STDC_VERSION__, as vendor compilers can lie and we need to be
- * able to work them around.  Note that by not defining FLEX_ARRAY
- * here, we can fall back to use the "safer but a bit wasteful" one
- * later.
- */
-#if defined(__SUNPRO_C) && (__SUNPRO_C <= 0x580)
-#elif defined(__GNUC__)
-# if (__GNUC__ >= 3)
-#  define FLEX_ARRAY /* empty */
-# else
-#  define FLEX_ARRAY 0 /* older GNU extension */
-# endif
-#elif defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L)
-# define FLEX_ARRAY /* empty */
-#endif
-
-/*
- * Otherwise, default to safer but a bit wasteful traditional style
- */
-#ifndef FLEX_ARRAY
-# define FLEX_ARRAY 1
-#endif
-#endif
-
+#undef FLEX_ARRAY
+#define FLEX_ARRAY /* empty - weather balloon to require C99 FAM */
 
 /*
  * BUILD_ASSERT_OR_ZERO - assert a build-time dependency, as an expression.

From bab391761d1c7cff59d1c29ee546efc3e588473d Mon Sep 17 00:00:00 2001
From: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Date: Fri, 12 Dec 2025 13:14:33 +0530
Subject: [PATCH 222/553] pull: move options[] array into function scope
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Unless there are good reasons, it is customary to have the options[]
array used with the parse-options API declared in function scope rather
than at file scope.

Move builtin/pull.c:cmd_pull()’s options[] array into the function to
match that convention.

Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/pull.c | 283 ++++++++++++++++++++++++-------------------------
 1 file changed, 141 insertions(+), 142 deletions(-)

diff --git a/builtin/pull.c b/builtin/pull.c
index 5ebd5296207061..3ff748e0b3ea60 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -119,148 +119,6 @@ static int opt_show_forced_updates = -1;
 static const char *set_upstream;
 static struct strvec opt_fetch = STRVEC_INIT;
 
-static struct option pull_options[] = {
-	/* Shared options */
-	OPT__VERBOSITY(&opt_verbosity),
-	OPT_PASSTHRU(0, "progress", &opt_progress, NULL,
-		N_("force progress reporting"),
-		PARSE_OPT_NOARG),
-	OPT_CALLBACK_F(0, "recurse-submodules",
-		   &recurse_submodules_cli, N_("on-demand"),
-		   N_("control for recursive fetching of submodules"),
-		   PARSE_OPT_OPTARG, option_fetch_parse_recurse_submodules),
-
-	/* Options passed to git-merge or git-rebase */
-	OPT_GROUP(N_("Options related to merging")),
-	OPT_CALLBACK_F('r', "rebase", &opt_rebase,
-		"(false|true|merges|interactive)",
-		N_("incorporate changes by rebasing rather than merging"),
-		PARSE_OPT_OPTARG, parse_opt_rebase),
-	OPT_PASSTHRU('n', NULL, &opt_diffstat, NULL,
-		N_("do not show a diffstat at the end of the merge"),
-		PARSE_OPT_NOARG | PARSE_OPT_NONEG),
-	OPT_PASSTHRU(0, "stat", &opt_diffstat, NULL,
-		N_("show a diffstat at the end of the merge"),
-		PARSE_OPT_NOARG),
-	OPT_PASSTHRU(0, "summary", &opt_diffstat, NULL,
-		N_("(synonym to --stat)"),
-		PARSE_OPT_NOARG | PARSE_OPT_HIDDEN),
-	OPT_PASSTHRU(0, "compact-summary", &opt_diffstat, NULL,
-		N_("show a compact-summary at the end of the merge"),
-		PARSE_OPT_NOARG),
-	OPT_PASSTHRU(0, "log", &opt_log, N_("n"),
-		N_("add (at most <n>) entries from shortlog to merge commit message"),
-		PARSE_OPT_OPTARG),
-	OPT_PASSTHRU(0, "signoff", &opt_signoff, NULL,
-		N_("add a Signed-off-by trailer"),
-		PARSE_OPT_OPTARG),
-	OPT_PASSTHRU(0, "squash", &opt_squash, NULL,
-		N_("create a single commit instead of doing a merge"),
-		PARSE_OPT_NOARG),
-	OPT_PASSTHRU(0, "commit", &opt_commit, NULL,
-		N_("perform a commit if the merge succeeds (default)"),
-		PARSE_OPT_NOARG),
-	OPT_PASSTHRU(0, "edit", &opt_edit, NULL,
-		N_("edit message before committing"),
-		PARSE_OPT_NOARG),
-	OPT_CLEANUP(&cleanup_arg),
-	OPT_PASSTHRU(0, "ff", &opt_ff, NULL,
-		N_("allow fast-forward"),
-		PARSE_OPT_NOARG),
-	OPT_PASSTHRU(0, "ff-only", &opt_ff, NULL,
-		N_("abort if fast-forward is not possible"),
-		PARSE_OPT_NOARG | PARSE_OPT_NONEG),
-	OPT_PASSTHRU(0, "verify", &opt_verify, NULL,
-		N_("control use of pre-merge-commit and commit-msg hooks"),
-		PARSE_OPT_NOARG),
-	OPT_PASSTHRU(0, "verify-signatures", &opt_verify_signatures, NULL,
-		N_("verify that the named commit has a valid GPG signature"),
-		PARSE_OPT_NOARG),
-	OPT_BOOL(0, "autostash", &opt_autostash,
-		N_("automatically stash/stash pop before and after")),
-	OPT_PASSTHRU_ARGV('s', "strategy", &opt_strategies, N_("strategy"),
-		N_("merge strategy to use"),
-		0),
-	OPT_PASSTHRU_ARGV('X', "strategy-option", &opt_strategy_opts,
-		N_("option=value"),
-		N_("option for selected merge strategy"),
-		0),
-	OPT_PASSTHRU('S', "gpg-sign", &opt_gpg_sign, N_("key-id"),
-		N_("GPG sign commit"),
-		PARSE_OPT_OPTARG),
-	OPT_SET_INT(0, "allow-unrelated-histories",
-		    &opt_allow_unrelated_histories,
-		    N_("allow merging unrelated histories"), 1),
-
-	/* Options passed to git-fetch */
-	OPT_GROUP(N_("Options related to fetching")),
-	OPT_PASSTHRU(0, "all", &opt_all, NULL,
-		N_("fetch from all remotes"),
-		PARSE_OPT_NOARG),
-	OPT_PASSTHRU('a', "append", &opt_append, NULL,
-		N_("append to .git/FETCH_HEAD instead of overwriting"),
-		PARSE_OPT_NOARG),
-	OPT_PASSTHRU(0, "upload-pack", &opt_upload_pack, N_("path"),
-		N_("path to upload pack on remote end"),
-		0),
-	OPT__FORCE(&opt_force, N_("force overwrite of local branch"), 0),
-	OPT_PASSTHRU('t', "tags", &opt_tags, NULL,
-		N_("fetch all tags and associated objects"),
-		PARSE_OPT_NOARG),
-	OPT_PASSTHRU('p', "prune", &opt_prune, NULL,
-		N_("prune remote-tracking branches no longer on remote"),
-		PARSE_OPT_NOARG),
-	OPT_PASSTHRU('j', "jobs", &max_children, N_("n"),
-		N_("number of submodules pulled in parallel"),
-		PARSE_OPT_OPTARG),
-	OPT_BOOL(0, "dry-run", &opt_dry_run,
-		N_("dry run")),
-	OPT_PASSTHRU('k', "keep", &opt_keep, NULL,
-		N_("keep downloaded pack"),
-		PARSE_OPT_NOARG),
-	OPT_PASSTHRU(0, "depth", &opt_depth, N_("depth"),
-		N_("deepen history of shallow clone"),
-		0),
-	OPT_PASSTHRU_ARGV(0, "shallow-since", &opt_fetch, N_("time"),
-		N_("deepen history of shallow repository based on time"),
-		0),
-	OPT_PASSTHRU_ARGV(0, "shallow-exclude", &opt_fetch, N_("ref"),
-		N_("deepen history of shallow clone, excluding ref"),
-		0),
-	OPT_PASSTHRU_ARGV(0, "deepen", &opt_fetch, N_("n"),
-		N_("deepen history of shallow clone"),
-		0),
-	OPT_PASSTHRU(0, "unshallow", &opt_unshallow, NULL,
-		N_("convert to a complete repository"),
-		PARSE_OPT_NONEG | PARSE_OPT_NOARG),
-	OPT_PASSTHRU(0, "update-shallow", &opt_update_shallow, NULL,
-		N_("accept refs that update .git/shallow"),
-		PARSE_OPT_NOARG),
-	OPT_PASSTHRU(0, "refmap", &opt_refmap, N_("refmap"),
-		N_("specify fetch refmap"),
-		PARSE_OPT_NONEG),
-	OPT_PASSTHRU_ARGV('o', "server-option", &opt_fetch,
-		N_("server-specific"),
-		N_("option to transmit"),
-		0),
-	OPT_PASSTHRU('4',  "ipv4", &opt_ipv4, NULL,
-		N_("use IPv4 addresses only"),
-		PARSE_OPT_NOARG),
-	OPT_PASSTHRU('6',  "ipv6", &opt_ipv6, NULL,
-		N_("use IPv6 addresses only"),
-		PARSE_OPT_NOARG),
-	OPT_PASSTHRU_ARGV(0, "negotiation-tip", &opt_fetch, N_("revision"),
-		N_("report that we have only objects reachable from this object"),
-		0),
-	OPT_BOOL(0, "show-forced-updates", &opt_show_forced_updates,
-		 N_("check for forced-updates on all updated branches")),
-	OPT_PASSTHRU(0, "set-upstream", &set_upstream, NULL,
-		N_("set upstream for git pull/fetch"),
-		PARSE_OPT_NOARG),
-
-	OPT_END()
-};
-
 /**
  * Pushes "-q" or "-v" switches into arr to match the opt_verbosity level.
  */
@@ -1008,6 +866,147 @@ int cmd_pull(int argc,
 	int can_ff;
 	int divergent;
 	int ret;
+	static struct option pull_options[] = {
+		/* Shared options */
+		OPT__VERBOSITY(&opt_verbosity),
+		OPT_PASSTHRU(0, "progress", &opt_progress, NULL,
+			N_("force progress reporting"),
+			PARSE_OPT_NOARG),
+		OPT_CALLBACK_F(0, "recurse-submodules",
+			   &recurse_submodules_cli, N_("on-demand"),
+			   N_("control for recursive fetching of submodules"),
+			   PARSE_OPT_OPTARG, option_fetch_parse_recurse_submodules),
+
+		/* Options passed to git-merge or git-rebase */
+		OPT_GROUP(N_("Options related to merging")),
+		OPT_CALLBACK_F('r', "rebase", &opt_rebase,
+			"(false|true|merges|interactive)",
+			N_("incorporate changes by rebasing rather than merging"),
+			PARSE_OPT_OPTARG, parse_opt_rebase),
+		OPT_PASSTHRU('n', NULL, &opt_diffstat, NULL,
+			N_("do not show a diffstat at the end of the merge"),
+			PARSE_OPT_NOARG | PARSE_OPT_NONEG),
+		OPT_PASSTHRU(0, "stat", &opt_diffstat, NULL,
+			N_("show a diffstat at the end of the merge"),
+			PARSE_OPT_NOARG),
+		OPT_PASSTHRU(0, "summary", &opt_diffstat, NULL,
+			N_("(synonym to --stat)"),
+			PARSE_OPT_NOARG | PARSE_OPT_HIDDEN),
+		OPT_PASSTHRU(0, "compact-summary", &opt_diffstat, NULL,
+			N_("show a compact-summary at the end of the merge"),
+			PARSE_OPT_NOARG),
+		OPT_PASSTHRU(0, "log", &opt_log, N_("n"),
+			N_("add (at most <n>) entries from shortlog to merge commit message"),
+			PARSE_OPT_OPTARG),
+		OPT_PASSTHRU(0, "signoff", &opt_signoff, NULL,
+			N_("add a Signed-off-by trailer"),
+			PARSE_OPT_OPTARG),
+		OPT_PASSTHRU(0, "squash", &opt_squash, NULL,
+			N_("create a single commit instead of doing a merge"),
+			PARSE_OPT_NOARG),
+		OPT_PASSTHRU(0, "commit", &opt_commit, NULL,
+			N_("perform a commit if the merge succeeds (default)"),
+			PARSE_OPT_NOARG),
+		OPT_PASSTHRU(0, "edit", &opt_edit, NULL,
+			N_("edit message before committing"),
+			PARSE_OPT_NOARG),
+		OPT_CLEANUP(&cleanup_arg),
+		OPT_PASSTHRU(0, "ff", &opt_ff, NULL,
+			N_("allow fast-forward"),
+			PARSE_OPT_NOARG),
+		OPT_PASSTHRU(0, "ff-only", &opt_ff, NULL,
+			N_("abort if fast-forward is not possible"),
+			PARSE_OPT_NOARG | PARSE_OPT_NONEG),
+		OPT_PASSTHRU(0, "verify", &opt_verify, NULL,
+			N_("control use of pre-merge-commit and commit-msg hooks"),
+			PARSE_OPT_NOARG),
+		OPT_PASSTHRU(0, "verify-signatures", &opt_verify_signatures, NULL,
+			N_("verify that the named commit has a valid GPG signature"),
+			PARSE_OPT_NOARG),
+		OPT_BOOL(0, "autostash", &opt_autostash,
+			N_("automatically stash/stash pop before and after")),
+		OPT_PASSTHRU_ARGV('s', "strategy", &opt_strategies, N_("strategy"),
+			N_("merge strategy to use"),
+			0),
+		OPT_PASSTHRU_ARGV('X', "strategy-option", &opt_strategy_opts,
+			N_("option=value"),
+			N_("option for selected merge strategy"),
+			0),
+		OPT_PASSTHRU('S', "gpg-sign", &opt_gpg_sign, N_("key-id"),
+			N_("GPG sign commit"),
+			PARSE_OPT_OPTARG),
+		OPT_SET_INT(0, "allow-unrelated-histories",
+			    &opt_allow_unrelated_histories,
+			    N_("allow merging unrelated histories"), 1),
+
+		/* Options passed to git-fetch */
+		OPT_GROUP(N_("Options related to fetching")),
+		OPT_PASSTHRU(0, "all", &opt_all, NULL,
+			N_("fetch from all remotes"),
+			PARSE_OPT_NOARG),
+		OPT_PASSTHRU('a', "append", &opt_append, NULL,
+			N_("append to .git/FETCH_HEAD instead of overwriting"),
+			PARSE_OPT_NOARG),
+		OPT_PASSTHRU(0, "upload-pack", &opt_upload_pack, N_("path"),
+			N_("path to upload pack on remote end"),
+			0),
+		OPT__FORCE(&opt_force, N_("force overwrite of local branch"), 0),
+		OPT_PASSTHRU('t', "tags", &opt_tags, NULL,
+			N_("fetch all tags and associated objects"),
+			PARSE_OPT_NOARG),
+		OPT_PASSTHRU('p', "prune", &opt_prune, NULL,
+			N_("prune remote-tracking branches no longer on remote"),
+			PARSE_OPT_NOARG),
+		OPT_PASSTHRU('j', "jobs", &max_children, N_("n"),
+			N_("number of submodules pulled in parallel"),
+			PARSE_OPT_OPTARG),
+		OPT_BOOL(0, "dry-run", &opt_dry_run,
+			N_("dry run")),
+		OPT_PASSTHRU('k', "keep", &opt_keep, NULL,
+			N_("keep downloaded pack"),
+			PARSE_OPT_NOARG),
+		OPT_PASSTHRU(0, "depth", &opt_depth, N_("depth"),
+			N_("deepen history of shallow clone"),
+			0),
+		OPT_PASSTHRU_ARGV(0, "shallow-since", &opt_fetch, N_("time"),
+			N_("deepen history of shallow repository based on time"),
+			0),
+		OPT_PASSTHRU_ARGV(0, "shallow-exclude", &opt_fetch, N_("ref"),
+			N_("deepen history of shallow clone, excluding ref"),
+			0),
+		OPT_PASSTHRU_ARGV(0, "deepen", &opt_fetch, N_("n"),
+			N_("deepen history of shallow clone"),
+			0),
+		OPT_PASSTHRU(0, "unshallow", &opt_unshallow, NULL,
+			N_("convert to a complete repository"),
+			PARSE_OPT_NONEG | PARSE_OPT_NOARG),
+		OPT_PASSTHRU(0, "update-shallow", &opt_update_shallow, NULL,
+			N_("accept refs that update .git/shallow"),
+			PARSE_OPT_NOARG),
+		OPT_PASSTHRU(0, "refmap", &opt_refmap, N_("refmap"),
+			N_("specify fetch refmap"),
+			PARSE_OPT_NONEG),
+		OPT_PASSTHRU_ARGV('o', "server-option", &opt_fetch,
+			N_("server-specific"),
+			N_("option to transmit"),
+			0),
+		OPT_PASSTHRU('4',  "ipv4", &opt_ipv4, NULL,
+			N_("use IPv4 addresses only"),
+			PARSE_OPT_NOARG),
+		OPT_PASSTHRU('6',  "ipv6", &opt_ipv6, NULL,
+			N_("use IPv6 addresses only"),
+			PARSE_OPT_NOARG),
+		OPT_PASSTHRU_ARGV(0, "negotiation-tip", &opt_fetch, N_("revision"),
+			N_("report that we have only objects reachable from this object"),
+			0),
+		OPT_BOOL(0, "show-forced-updates", &opt_show_forced_updates,
+			 N_("check for forced-updates on all updated branches")),
+		OPT_PASSTHRU(0, "set-upstream", &set_upstream, NULL,
+			N_("set upstream for git pull/fetch"),
+			PARSE_OPT_NOARG),
+
+		OPT_END()
+	};
 
 	if (!getenv("GIT_REFLOG_ACTION"))
 		set_reflog_message(argc, argv);

From 48695fcde51e10d6d6e72653fb94b5fd339cd6e6 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <stolee@gmail.com>
Date: Fri, 12 Dec 2025 15:15:24 +0000
Subject: [PATCH 223/553] scalar: annotate config file with "set by scalar"

A repo may have config options set by 'scalar clone' or 'scalar
register' and then updated by 'scalar reconfigure'. It can be helpful to
point out which of those options were set by the latest scalar
recommendations.

Add "# set by scalar" to the end of each config option to assist users
in identifying why these config options were set in their repo. Use a new
helper method to simplify the two callsites.
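
As a rough illustration (not copied from a real repository), the
relevant parts of .git/config would then look something like this,
matching the strings that the updated test greps for:

```ini
[core]
	preloadIndex = true # set by scalar
[log]
	excludeDecoration = refs/prefetch/* # set by scalar
```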

Co-authored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 scalar.c          | 24 +++++++++++++++++-------
 t/t9210-scalar.sh |  3 +++
 2 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/scalar.c b/scalar.c
index f7543116272b77..1c7bd1a8f8b67b 100644
--- a/scalar.c
+++ b/scalar.c
@@ -19,6 +19,7 @@
 #include "help.h"
 #include "setup.h"
 #include "trace2.h"
+#include "path.h"
 
 static void setup_enlistment_directory(int argc, const char **argv,
 				       const char * const *usagestr,
@@ -95,7 +96,17 @@ struct scalar_config {
 	int overwrite_on_reconfigure;
 };
 
-static int set_scalar_config(const struct scalar_config *config, int reconfigure)
+static int set_scalar_config(const char *key, const char *value)
+{
+	char *file = repo_git_path(the_repository, "config");
+	int res = repo_config_set_multivar_in_file_gently(the_repository, file,
+							  key, value, NULL,
+							  " # set by scalar", 0);
+	free(file);
+	return res;
+}
+
+static int set_config_if_missing(const struct scalar_config *config, int reconfigure)
 {
 	char *value = NULL;
 	int res;
@@ -103,7 +114,7 @@ static int set_scalar_config(const struct scalar_config *config, int reconfigure
 	if ((reconfigure && config->overwrite_on_reconfigure) ||
 	    repo_config_get_string(the_repository, config->key, &value)) {
 		trace2_data_string("scalar", the_repository, config->key, "created");
-		res = repo_config_set_gently(the_repository, config->key, config->value);
+		res = set_scalar_config(config->key, config->value);
 	} else {
 		trace2_data_string("scalar", the_repository, config->key, "exists");
 		res = 0;
@@ -178,14 +189,14 @@ static int set_recommended_config(int reconfigure)
 	char *value;
 
 	for (i = 0; config[i].key; i++) {
-		if (set_scalar_config(config + i, reconfigure))
+		if (set_config_if_missing(config + i, reconfigure))
 			return error(_("could not configure %s=%s"),
 				     config[i].key, config[i].value);
 	}
 
 	if (have_fsmonitor_support()) {
 		struct scalar_config fsmonitor = { "core.fsmonitor", "true" };
-		if (set_scalar_config(&fsmonitor, reconfigure))
+		if (set_config_if_missing(&fsmonitor, reconfigure))
 			return error(_("could not configure %s=%s"),
 				     fsmonitor.key, fsmonitor.value);
 	}
@@ -197,9 +208,8 @@ static int set_recommended_config(int reconfigure)
 	if (repo_config_get_string(the_repository, "log.excludeDecoration", &value)) {
 		trace2_data_string("scalar", the_repository,
 				   "log.excludeDecoration", "created");
-		if (repo_config_set_multivar_gently(the_repository, "log.excludeDecoration",
-						    "refs/prefetch/*",
-						    CONFIG_REGEX_NONE, 0))
+		if (set_scalar_config("log.excludeDecoration",
+					    "refs/prefetch/*"))
 			return error(_("could not configure "
 				       "log.excludeDecoration"));
 	} else {
diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
index bd6f0c40d229b6..43c210a23d4bef 100755
--- a/t/t9210-scalar.sh
+++ b/t/t9210-scalar.sh
@@ -210,6 +210,9 @@ test_expect_success 'scalar reconfigure' '
 	GIT_TRACE2_EVENT="$(pwd)/reconfigure" scalar reconfigure -a &&
 	test_path_is_file one/src/cron.txt &&
 	test true = "$(git -C one/src config core.preloadIndex)" &&
+	test_grep "preloadIndex = true # set by scalar" one/src/.git/config &&
+	test_grep "excludeDecoration = refs/prefetch/\* # set by scalar" one/src/.git/config &&
+
 	test_subcommand git maintenance start <reconfigure &&
 	test_subcommand ! git maintenance unregister --force <reconfigure &&
 

From 05f28e4b3cc1f873e510e5692b70290c515abb98 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <stolee@gmail.com>
Date: Fri, 12 Dec 2025 15:15:25 +0000
Subject: [PATCH 224/553] scalar: use index.skipHash=true for performance

The index.skipHash config option has been set to 'false' by Scalar since
4933152cbb (scalar: enable path-walk during push via config, 2025-05-16)
but that commit message is trying to communicate the exact opposite:
that the 'true' value is what we want instead. This means that we've
been disabling this performance benefit for Scalar repos
unintentionally.

Fix this issue before we add justification for the config options set in
this list.

Oddly, enabling index.skipHash causes a test issue during 'test_commit'
in one of the Scalar tests when GIT_TEST_SPLIT_INDEX is enabled (as
caught by the linux-test-vars build). I'm fixing the test by disabling
the environment variable, but the issue should be resolved in a series
focused on the split index.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 scalar.c          | 2 +-
 t/t9210-scalar.sh | 4 ++++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/scalar.c b/scalar.c
index 1c7bd1a8f8b67b..55b8542770b780 100644
--- a/scalar.c
+++ b/scalar.c
@@ -160,7 +160,7 @@ static int set_recommended_config(int reconfigure)
 		{ "credential.validate", "false", 1 }, /* GCM4W-only */
 		{ "gc.auto", "0", 1 },
 		{ "gui.GCWarning", "false", 1 },
-		{ "index.skipHash", "false", 1 },
+		{ "index.skipHash", "true", 1 },
 		{ "index.threads", "true", 1 },
 		{ "index.version", "4", 1 },
 		{ "merge.stat", "false", 1 },
diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
index 43c210a23d4bef..923c243c133387 100755
--- a/t/t9210-scalar.sh
+++ b/t/t9210-scalar.sh
@@ -246,6 +246,10 @@ test_expect_success 'scalar reconfigure --all with includeIf.onbranch' '
 '
 
 test_expect_success 'scalar reconfigure --all with detached HEADs' '
+	# This test demonstrates an issue with index.skipHash=true and
+	# this test variable for the split index. Disable the test variable.
+	sane_unset GIT_TEST_SPLIT_INDEX &&
+
 	repos="two three four" &&
 	for num in $repos
 	do

From be667e40cbe2975aaf44748f5ee237e0d79359af Mon Sep 17 00:00:00 2001
From: Derrick Stolee <stolee@gmail.com>
Date: Fri, 12 Dec 2025 15:15:26 +0000
Subject: [PATCH 225/553] scalar: remove stale config values

These config values were added in the original Scalar contribution,
d0feac4e8c (scalar: 'register' sets recommended config and starts
maintenance, 2021-12-03), but were never fully checked for validity in
the upstream Git project. At the time, Scalar was only intended for the
contrib/ directory, so it did not receive as rigorous an investigation.

Each config option has its own justification for removal:

* core.preloadIndex: This value is now true by default. Removing it
  requires changing the tests that checked this config value; they now
  use gui.gcwarning=false instead.

* core.fscache: This config does not exist in the core Git project, but
  is instead a config option for a Git for Windows feature.

* core.multiPackIndex: This config value is now enabled by default, so
  does not need to be called out specifically. It was originally
  included to make sure the background maintenance that created
  multi-pack-indexes would result in the expected performance
  improvements.

* credential.validate: This option is not specific to Git but to an
  older version of Git Credential Manager for Windows. That software
  was replaced several years ago by the cross-platform Git Credential
  Manager, so this option is no longer needed to help users who were on
  that older software.

* pack.useSparse=true: This value is now Git's default as of de3a864114
  (config: set pack.useSparse=true by default, 2020-03-20) so we don't
  need it set by Scalar.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 scalar.c          |  5 -----
 t/t9210-scalar.sh | 20 ++++++++++----------
 2 files changed, 10 insertions(+), 15 deletions(-)

diff --git a/scalar.c b/scalar.c
index 55b8542770b780..aeebea41fa8fc2 100644
--- a/scalar.c
+++ b/scalar.c
@@ -135,9 +135,6 @@ static int set_recommended_config(int reconfigure)
 	struct scalar_config config[] = {
 		/* Required */
 		{ "am.keepCR", "true", 1 },
-		{ "core.FSCache", "true", 1 },
-		{ "core.multiPackIndex", "true", 1 },
-		{ "core.preloadIndex", "true", 1 },
 #ifndef WIN32
 		{ "core.untrackedCache", "true", 1 },
 #else
@@ -157,7 +154,6 @@ static int set_recommended_config(int reconfigure)
 #endif
 		{ "core.logAllRefUpdates", "true", 1 },
 		{ "credential.https://dev.azure.com.useHttpPath", "true", 1 },
-		{ "credential.validate", "false", 1 }, /* GCM4W-only */
 		{ "gc.auto", "0", 1 },
 		{ "gui.GCWarning", "false", 1 },
 		{ "index.skipHash", "true", 1 },
@@ -166,7 +162,6 @@ static int set_recommended_config(int reconfigure)
 		{ "merge.stat", "false", 1 },
 		{ "merge.renames", "true", 1 },
 		{ "pack.useBitmaps", "false", 1 },
-		{ "pack.useSparse", "true", 1 },
 		{ "receive.autoGC", "false", 1 },
 		{ "feature.manyFiles", "false", 1 },
 		{ "feature.experimental", "false", 1 },
diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
index 923c243c133387..009437a5f3168f 100755
--- a/t/t9210-scalar.sh
+++ b/t/t9210-scalar.sh
@@ -202,15 +202,15 @@ test_expect_success 'scalar clone --no-... opts' '
 test_expect_success 'scalar reconfigure' '
 	git init one/src &&
 	scalar register one &&
-	git -C one/src config core.preloadIndex false &&
+	git -C one/src config unset gui.gcwarning &&
 	scalar reconfigure one &&
-	test true = "$(git -C one/src config core.preloadIndex)" &&
-	git -C one/src config core.preloadIndex false &&
+	test false = "$(git -C one/src config gui.gcwarning)" &&
+	git -C one/src config unset gui.gcwarning &&
 	rm one/src/cron.txt &&
 	GIT_TRACE2_EVENT="$(pwd)/reconfigure" scalar reconfigure -a &&
 	test_path_is_file one/src/cron.txt &&
-	test true = "$(git -C one/src config core.preloadIndex)" &&
-	test_grep "preloadIndex = true # set by scalar" one/src/.git/config &&
+	test false = "$(git -C one/src config gui.gcwarning)" &&
+	test_grep "GCWarning = false # set by scalar" one/src/.git/config &&
 	test_grep "excludeDecoration = refs/prefetch/\* # set by scalar" one/src/.git/config &&
 
 	test_subcommand git maintenance start <reconfigure &&
@@ -234,14 +234,14 @@ test_expect_success 'scalar reconfigure --all with includeIf.onbranch' '
 		git init $num/src &&
 		scalar register $num/src &&
 		git -C $num/src config includeif."onbranch:foo".path something &&
-		git -C $num/src config core.preloadIndex false || return 1
+		git -C $num/src config unset gui.gcwarning || return 1
 	done &&
 
 	scalar reconfigure --all &&
 
 	for num in $repos
 	do
-		test true = "$(git -C $num/src config core.preloadIndex)" || return 1
+		test false = "$(git -C $num/src config gui.gcwarning)" || return 1
 	done
 '
 
@@ -256,7 +256,7 @@ test_expect_success 'scalar reconfigure --all with detached HEADs' '
 		rm -rf $num/src &&
 		git init $num/src &&
 		scalar register $num/src &&
-		git -C $num/src config core.preloadIndex false &&
+		git -C $num/src config unset gui.gcwarning &&
 		test_commit -C $num/src initial &&
 		git -C $num/src switch --detach HEAD || return 1
 	done &&
@@ -265,7 +265,7 @@ test_expect_success 'scalar reconfigure --all with detached HEADs' '
 
 	for num in $repos
 	do
-		test true = "$(git -C $num/src config core.preloadIndex)" || return 1
+		test false = "$(git -C $num/src config gui.gcwarning)" || return 1
 	done
 '
 
@@ -297,7 +297,7 @@ test_expect_success 'scalar supports -c/-C' '
 	git init sub &&
 	scalar -C sub -c status.aheadBehind=bogus register &&
 	test -z "$(git -C sub config --local status.aheadBehind)" &&
-	test true = "$(git -C sub config core.preloadIndex)"
+	test false = "$(git -C sub config gui.gcwarning)"
 '
 
 test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '

From e1588c270d584fff0ddf1da684515cd218a0718b Mon Sep 17 00:00:00 2001
From: Derrick Stolee <stolee@gmail.com>
Date: Fri, 12 Dec 2025 15:15:27 +0000
Subject: [PATCH 226/553] scalar: alphabetize and simplify config

The config values set by Scalar went through an audit in the previous
changes, so now reorganize the settings and simplify their purpose.

First, alphabetize the config options, except put the platform-specific
options at the end. This groups two Windows-specific settings and only
one non-Windows setting.

Also, this removes the 'overwrite_on_reconfigure' setting for many of
these options. That setting made nearly all of these options "required"
for scalar enlistments, which restricted users' choices. Now nearly all
options drop it.

However, one option keeps this setting: index.skipHash, which was
previously being set to _false_ when we actually prefer the value of
true. Keep the overwrite here to help Scalar users upgrade to the new
version. We may remove that overwrite in the future once we believe that
most of the users who have the false value have upgraded to a version
that overwrites it with 'true'.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 scalar.c | 60 ++++++++++++++++++++++++++++----------------------------
 1 file changed, 30 insertions(+), 30 deletions(-)

diff --git a/scalar.c b/scalar.c
index aeebea41fa8fc2..3b25fd3f353049 100644
--- a/scalar.c
+++ b/scalar.c
@@ -133,10 +133,33 @@ static int have_fsmonitor_support(void)
 static int set_recommended_config(int reconfigure)
 {
 	struct scalar_config config[] = {
-		/* Required */
-		{ "am.keepCR", "true", 1 },
+		{ "am.keepCR", "true" },
+		{ "commitGraph.changedPaths", "true" },
+		{ "commitGraph.generationVersion", "1" },
+		{ "core.autoCRLF", "false" },
+		{ "core.logAllRefUpdates", "true" },
+		{ "core.safeCRLF", "false" },
+		{ "credential.https://dev.azure.com.useHttpPath", "true" },
+		{ "feature.experimental", "false" },
+		{ "feature.manyFiles", "false" },
+		{ "fetch.showForcedUpdates", "false" },
+		{ "fetch.unpackLimit", "1" },
+		{ "fetch.writeCommitGraph", "false" },
+		{ "gc.auto", "0" },
+		{ "gui.GCWarning", "false" },
+		{ "index.skipHash", "true", 1 /* Fix previous setting. */ },
+		{ "index.threads", "true"},
+		{ "index.version", "4" },
+		{ "merge.renames", "true" },
+		{ "merge.stat", "false" },
+		{ "pack.useBitmaps", "false" },
+		{ "pack.usePathWalk", "true" },
+		{ "receive.autoGC", "false" },
+		{ "status.aheadBehind", "false" },
+
+		/* platform-specific */
 #ifndef WIN32
-		{ "core.untrackedCache", "true", 1 },
+		{ "core.untrackedCache", "true" },
 #else
 		/*
 		 * Unfortunately, Scalar's Functional Tests demonstrated
@@ -150,34 +173,11 @@ static int set_recommended_config(int reconfigure)
 		 * Therefore, with a sad heart, we disable this very useful
 		 * feature on Windows.
 		 */
-		{ "core.untrackedCache", "false", 1 },
-#endif
-		{ "core.logAllRefUpdates", "true", 1 },
-		{ "credential.https://dev.azure.com.useHttpPath", "true", 1 },
-		{ "gc.auto", "0", 1 },
-		{ "gui.GCWarning", "false", 1 },
-		{ "index.skipHash", "true", 1 },
-		{ "index.threads", "true", 1 },
-		{ "index.version", "4", 1 },
-		{ "merge.stat", "false", 1 },
-		{ "merge.renames", "true", 1 },
-		{ "pack.useBitmaps", "false", 1 },
-		{ "receive.autoGC", "false", 1 },
-		{ "feature.manyFiles", "false", 1 },
-		{ "feature.experimental", "false", 1 },
-		{ "fetch.unpackLimit", "1", 1 },
-		{ "fetch.writeCommitGraph", "false", 1 },
-#ifdef WIN32
-		{ "http.sslBackend", "schannel", 1 },
+		{ "core.untrackedCache", "false" },
+
+		/* Other Windows-specific required settings: */
+		{ "http.sslBackend", "schannel" },
 #endif
-		/* Optional */
-		{ "status.aheadBehind", "false" },
-		{ "commitGraph.changedPaths", "true" },
-		{ "commitGraph.generationVersion", "1" },
-		{ "core.autoCRLF", "false" },
-		{ "core.safeCRLF", "false" },
-		{ "fetch.showForcedUpdates", "false" },
-		{ "pack.usePathWalk", "true" },
 		{ NULL, NULL },
 	};
 	int i;

From d2e4099968ca1cd6b31b0516cdbafa0520674a8e Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Sat, 13 Dec 2025 10:46:27 +0900
Subject: [PATCH 227/553] coccicheck: emit the contents of cocci patch

Telling the user "you got some error messages" without showing what
the errors are is almost useless in a CI environment, as the errors
cannot be examined without downloading build artifacts.

Drop -q from the grep invocation so that the contents of the offending
patch files are shown when the check fails.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 7ca21188138ace..0117d0008c71b4 100644
--- a/Makefile
+++ b/Makefile
@@ -3521,7 +3521,7 @@ else
 COCCICHECK_PATCH_MUST_BE_EMPTY_FILES = $(COCCICHECK_PATCHES_INTREE)
 endif
 coccicheck: $(COCCICHECK_PATCH_MUST_BE_EMPTY_FILES)
-	! grep -q ^ $(COCCICHECK_PATCH_MUST_BE_EMPTY_FILES) /dev/null
+	! grep ^ $(COCCICHECK_PATCH_MUST_BE_EMPTY_FILES) /dev/null
 
 # See contrib/coccinelle/README
 coccicheck-pending: coccicheck-test

From 8ea9492cf3505c379d1c573b02db90e6b480cc75 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Sat, 13 Dec 2025 10:46:28 +0900
Subject: [PATCH 228/553] cocci: use MEMZERO_ARRAY() a bit more

Existing code in files that have been fairly stable triggers the
"make coccicheck" suggestions due to the new check.

Rewrite these callers to use MEMZERO_ARRAY().
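
As a sketch of the kind of rewrite involved (the macro below is a
simplified stand-in written for this illustration, not Git's actual
definition of MEMZERO_ARRAY(), and the helper names are hypothetical):

```c
#include <stddef.h>
#include <string.h>

/*
 * Simplified stand-in for illustration only: derive the element size
 * from the pointer itself so the caller cannot name the wrong type.
 */
#define MEMZERO_ARRAY_SKETCH(ptr, n) memset((ptr), 0, sizeof(*(ptr)) * (n))

/* Old style: the element type is spelled out by hand. */
static void clear_old_style(int *array, size_t nr)
{
	memset(array, 0, sizeof(int) * nr);
}

/* New style: only the pointer and the element count are given. */
static void clear_new_style(int *array, size_t nr)
{
	MEMZERO_ARRAY_SKETCH(array, nr);
}
```

Both helpers clear the same number of bytes; the second form merely
avoids repeating the element type at each callsite.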

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diffcore-delta.c    | 4 ++--
 linear-assignment.c | 4 ++--
 shallow.c           | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/diffcore-delta.c b/diffcore-delta.c
index ba6cbee76ba018..2de9e9ccff321a 100644
--- a/diffcore-delta.c
+++ b/diffcore-delta.c
@@ -56,7 +56,7 @@ static struct spanhash_top *spanhash_rehash(struct spanhash_top *orig)
 			     st_mult(sizeof(struct spanhash), sz)));
 	new_spanhash->alloc_log2 = orig->alloc_log2 + 1;
 	new_spanhash->free = INITIAL_FREE(new_spanhash->alloc_log2);
-	memset(new_spanhash->data, 0, sizeof(struct spanhash) * sz);
+	MEMZERO_ARRAY(new_spanhash->data, sz);
 	for (i = 0; i < osz; i++) {
 		struct spanhash *o = &(orig->data[i]);
 		int bucket;
@@ -135,7 +135,7 @@ static struct spanhash_top *hash_chars(struct repository *r,
 			      st_mult(sizeof(struct spanhash), (size_t)1 << i)));
 	hash->alloc_log2 = i;
 	hash->free = INITIAL_FREE(i);
-	memset(hash->data, 0, sizeof(struct spanhash) * ((size_t)1 << i));
+	MEMZERO_ARRAY(hash->data, ((size_t)1 << i));
 
 	n = 0;
 	accum1 = accum2 = 0;
diff --git a/linear-assignment.c b/linear-assignment.c
index 5416cbcf409d26..97b4f750586a78 100644
--- a/linear-assignment.c
+++ b/linear-assignment.c
@@ -20,8 +20,8 @@ void compute_assignment(int column_count, int row_count, int *cost,
 	int i, j, phase;
 
 	if (column_count < 2) {
-		memset(column2row, 0, sizeof(int) * column_count);
-		memset(row2column, 0, sizeof(int) * row_count);
+		MEMZERO_ARRAY(column2row, column_count);
+		MEMZERO_ARRAY(row2column, row_count);
 		return;
 	}
 
diff --git a/shallow.c b/shallow.c
index d9cd4e219cb07d..c20471cd7e450b 100644
--- a/shallow.c
+++ b/shallow.c
@@ -713,7 +713,7 @@ void assign_shallow_commits_to_refs(struct shallow_info *info,
 
 	if (used) {
 		int bitmap_size = DIV_ROUND_UP(pi.nr_bits, 32) * sizeof(uint32_t);
-		memset(used, 0, sizeof(*used) * info->shallow->nr);
+		MEMZERO_ARRAY(used, info->shallow->nr);
 		for (i = 0; i < nr_shallow; i++) {
 			const struct commit *c = lookup_commit(the_repository,
 							       &oid[shallow[i]]);
@@ -782,7 +782,7 @@ static void post_assign_shallow(struct shallow_info *info,
 
 	trace_printf_key(&trace_shallow, "shallow: post_assign_shallow\n");
 	if (ref_status)
-		memset(ref_status, 0, sizeof(*ref_status) * info->ref->nr);
+		MEMZERO_ARRAY(ref_status, info->ref->nr);
 
 	/* Remove unreachable shallow commits from "theirs" */
 	for (i = dst = 0; i < info->nr_theirs; i++) {

From 007b8994d4fc49f4a68ee414db9e814736a3fc04 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Sat, 13 Dec 2025 10:40:42 +0100
Subject: [PATCH 229/553] t4014: support Git version strings with spaces
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

git --version reports its version with the prefix "git version ".
Remove precisely this string instead of everything up to and including
the rightmost space to avoid butchering version strings that contain
spaces.  This helps Apple's release of Git, which reports its version
like this: "git version 2.50.1 (Apple Git-155)".
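
A minimal sketch of the difference between the two parameter
expansions, using the Apple version string quoted above (the "old" and
"new" variable names are only for illustration):

```shell
git_version="git version 2.50.1 (Apple Git-155)"

# Old: strip the longest prefix ending in a space, which keeps only
# the text after the rightmost space.
old=${git_version##* }

# New: strip exactly the leading "git version " prefix.
new=${git_version#git version }

echo "old: $old"
echo "new: $new"
```

With that input, the old expansion leaves "Git-155)" while the new one
leaves "2.50.1 (Apple Git-155)".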

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t4014-format-patch.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index 2782b1fc183e8f..21d6d0cd9ef679 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -980,7 +980,7 @@ test_expect_success 'format-patch --ignore-if-in-upstream HEAD' '
 
 test_expect_success 'get git version' '
 	git_version=$(git --version) &&
-	git_version=${git_version##* }
+	git_version=${git_version#git version }
 '
 
 signature() {

From 8467c95419acaa826a6c1ca0db0f36a3fd614ae4 Mon Sep 17 00:00:00 2001
From: Kristoffer Haugsbakk <code@khaugsbakk.name>
Date: Sat, 13 Dec 2025 14:46:56 +0100
Subject: [PATCH 230/553] doc: replay: mention no output on conflicts

Some commands will produce output on stderr if there are conflicts, but
git-replay(1) is completely silent. Explicitly spell that out.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-replay.adoc | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Documentation/git-replay.adoc b/Documentation/git-replay.adoc
index dcb26e8a8e88ca..6fbb527b9d87b9 100644
--- a/Documentation/git-replay.adoc
+++ b/Documentation/git-replay.adoc
@@ -81,6 +81,10 @@ the shape of the history being replayed.  When using `--advance`, the
 number of refs updated is always one, but for `--onto`, it can be one
 or more (rebasing multiple branches simultaneously is supported).
 
+There is no stderr output on conflicts; see the <<exit-status,EXIT
+STATUS>> section below.
+
+[[exit-status]]
 EXIT STATUS
 -----------
 

From 03d7c9c457ba68f28269dcd607b9026ea6c6c9c8 Mon Sep 17 00:00:00 2001
From: Kristoffer Haugsbakk <code@khaugsbakk.name>
Date: Sat, 13 Dec 2025 14:46:57 +0100
Subject: [PATCH 231/553] replay: improve --contained and add to doc
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

There is no documentation for `--contained`.

Start by copying the text from `replay_options` in
`builtin/replay.c`. But some people think that the existing text
is a bit unclear; what does it mean for a branch to be contained
in a revision range? Let’s include the implied commits here:
the branches that point at commits in the range.

Also use “update” instead of “advance”. “Update” is the verb
commonly used in this context.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-replay.adoc | 4 ++++
 builtin/replay.c              | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-replay.adoc b/Documentation/git-replay.adoc
index 6fbb527b9d87b9..1e2469b90341e2 100644
--- a/Documentation/git-replay.adoc
+++ b/Documentation/git-replay.adoc
@@ -42,6 +42,10 @@ The history is replayed on top of the <branch> and <branch> is updated to
 point at the tip of the resulting history. This is different from `--onto`,
 which uses the target only as a starting point without updating it.
 
+--contained::
+	Update all branches that point at commits in
+	<revision-range>. Requires `--onto`.
+
 --ref-action[=<mode>]::
 	Control how references are updated. The mode can be:
 +
diff --git a/builtin/replay.c b/builtin/replay.c
index 6606a2c94bc671..9e5ad64cad66a6 100644
--- a/builtin/replay.c
+++ b/builtin/replay.c
@@ -377,7 +377,7 @@ int cmd_replay(int argc,
 			   N_("revision"),
 			   N_("replay onto given commit")),
 		OPT_BOOL(0, "contained", &contained,
-			 N_("advance all branches contained in revision-range")),
+			 N_("update all branches that point at commits in <revision-range>")),
 		OPT_STRING(0, "ref-action", &ref_action,
 			   N_("mode"),
 			   N_("control ref update behavior (update|print)")),

From 9ba08b30a117e6925a9e5e87c92b37de7396d3a4 Mon Sep 17 00:00:00 2001
From: Kristoffer Haugsbakk <code@khaugsbakk.name>
Date: Sat, 13 Dec 2025 14:46:58 +0100
Subject: [PATCH 232/553] doc: replay: link section using markup

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-replay.adoc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-replay.adoc b/Documentation/git-replay.adoc
index 1e2469b90341e2..22fd1b271afa35 100644
--- a/Documentation/git-replay.adoc
+++ b/Documentation/git-replay.adoc
@@ -19,7 +19,7 @@ the working tree and the index untouched. By default, updates the
 relevant references using an atomic transaction (all refs update or
 none). Use `--ref-action=print` to avoid automatic ref updates and
 instead get update commands that can be piped to `git update-ref --stdin`
-(see the OUTPUT section below).
+(see the <<output,OUTPUT>> section below).
 
 THIS COMMAND IS EXPERIMENTAL. THE BEHAVIOR MAY CHANGE.
 
@@ -67,6 +67,7 @@ The default mode can be configured via the `replay.refAction` configuration vari
 
 include::rev-list-options.adoc[]
 
+[[output]]
 OUTPUT
 ------
 

From d8af7cadaa79d5837d73ec949e10b57dedb43e9b Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Sun, 14 Dec 2025 17:04:17 +0900
Subject: [PATCH 233/553] The eighth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/RelNotes/2.53.0.adoc | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc
index 38cbd2186e8172..41ae2a5a7a4696 100644
--- a/Documentation/RelNotes/2.53.0.adoc
+++ b/Documentation/RelNotes/2.53.0.adoc
@@ -26,6 +26,11 @@ UI, Workflows & Features
  * The use of "revision" (a connected set of commits) has been
    clarified in the "git replay" documentation.
 
+ * A help message from "git branch" now mentions "git help" instead of
+   "man" when suggesting to read some documentation.
+
+ * "git repo struct" learned to take "-z" as a synonym to "--format=nul".
+
 
 Performance, Internal Implementation, Development Support etc.
 --------------------------------------------------------------
@@ -51,6 +56,10 @@ Performance, Internal Implementation, Development Support etc.
 
  * Code refactoring around object database sources.
 
+ * Halve the memory consumed by artificial filepairs created during
+   "git diff --find-copies-harder", also making the operation run
+   faster.
+
 
 Fixes since v2.52
 -----------------
@@ -150,9 +159,18 @@ Fixes since v2.52
  * The way patience diff finds LCS has been optimized.
    (merge c7e3b8085b yc/xdiff-patience-optim later to maint).
 
+ * Recent optimization to "last-modified" command introduced use of
+   uninitialized block of memory, which has been corrected.
+   (merge fe4e60759b tc/last-modified-active-paths-optimization later to maint).
+
+ * "git last-modified" used to mishandle "--" to mark the beginning of
+   pathspec, which has been corrected.
+   (merge 05491b90ce js/last-modified-with-sparse-checkouts later to maint).
+
  * Other code cleanup, docfix, build fix, etc.
    (merge 46207a54cc qj/doc-http-bad-want-response later to maint).
    (merge df90eccd93 kh/doc-commit-extra-references later to maint).
    (merge f18aa68861 rs/xmkstemp-simplify later to maint).
    (merge fddba8f737 ja/doc-synopsis-style later to maint).
    (merge 22ce0cb639 en/xdiff-cleanup-2 later to maint).
+   (merge 8ef7355a8f je/doc-pull later to maint).

From 4ce170c522cd91e73e7d500667a4718af125bcf3 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <stolee@gmail.com>
Date: Fri, 12 Dec 2025 15:15:28 +0000
Subject: [PATCH 234/553] scalar: document config settings

Add user-facing documentation that justifies the values being set by
'scalar clone', 'scalar register', and 'scalar reconfigure'.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/scalar.adoc | 164 ++++++++++++++++++++++++++++++++++++++
 scalar.c                  |   4 +
 2 files changed, 168 insertions(+)

diff --git a/Documentation/scalar.adoc b/Documentation/scalar.adoc
index f81b2832f8dfeb..5252fb134a47ab 100644
--- a/Documentation/scalar.adoc
+++ b/Documentation/scalar.adoc
@@ -197,6 +197,170 @@ delete <enlistment>::
 	This subcommand lets you delete an existing Scalar enlistment from your
 	local file system, unregistering the repository.
 
+RECOMMENDED CONFIG VALUES
+-------------------------
+
+As part of both `scalar clone` and `scalar register`, certain Git config
+values are set to optimize for large repositories or cross-platform support.
+These options are updated in new Git versions according to the best known
+advice for large repositories, and users can get the latest recommendations
+by running `scalar reconfigure [--all]`.
+
+This section lists justifications for the config values that are set in the
+latest version.
+
+am.keepCR=true::
+	This setting is important for cross-platform development across Windows
+	and non-Windows platforms and keeping carriage return (`\r`) characters
+	in certain workflows.
+
+commitGraph.changedPaths=true::
+	This setting helps the background maintenance steps that compute the
+	serialized commit-graph to also store changed-path Bloom filters. This
+	accelerates file history commands and allows users to automatically
+	benefit without running a foreground command.
+
+commitGraph.generationVersion=1::
+	While the preferred version is 2 for performance reasons, existing users
+	that had version 1 by default will need special care in upgrading to
+	version 2. This is likely to change in the future as the upgrade story
+	solidifies.
+
+core.autoCRLF=false::
+	This removes the transformation of worktree files to add CRLF line
+	endings when only LF line endings exist. This is removed for performance
+	reasons. Repositories that use tools that care about CRLF line endings
+	should commit the necessary files with those line endings instead.
+
+core.logAllRefUpdates=true::
+	This enables the reflog on all branches. While this is a performance
+	cost for large repositories, it is frequently an important data source
+	for users to get out of bad situations or to seek support from experts.
+
+core.safeCRLF=false::
+	Similar to `core.autoCRLF=false`, this disables checks around whether
+	the CRLF conversion is reversible. This is a performance improvement,
+	but can be dangerous if `core.autoCRLF` is reenabled by the user.
+
+credential.https://dev.azure.com.useHttpPath=true::
+	This setting enables the `credential.useHttpPath` feature only for web
+	URLs for Azure DevOps. This is important for users interacting with that
+	service using multiple organizations and thus multiple credential
+	tokens.
+
+feature.experimental=false::
+	This disables the "experimental" optimizations grouped under this
+	feature config. The expectation is that all valuable optimizations are
+	also set explicitly by Scalar config, and any differences are
+	intentional. Notable differences include several bitmap-related config
+	options which are disabled for client-focused Scalar repos.
+
+feature.manyFiles=false::
+	This disables the "many files" optimizations grouped under this feature
+	config. The expectation is that all valuable optimizations are also set
+	explicitly by Scalar config, and any differences are intentional.
+
+fetch.showForcedUpdates=false::
+	This disables the check at the end of `git fetch` that notifies the user
+	if the ref update was a forced update (one where the previous position
+	is not reachable from the latest position). This check can be very
+	expensive in large repositories, so is disabled and replaced with an
+	advice message. Set `advice.fetchShowForcedUpdates=false` to disable
+	this advice message.
+
+fetch.unpackLimit=1::
+	This setting prevents Git from unpacking packfiles into loose objects
+	as they are downloaded from the server. The default limit of 100 was
+	intended as a way to prevent performance issues from too many packfiles,
+	but Scalar uses background maintenance to group packfiles and cover them
+	with a multi-pack-index, removing this issue.
+
+fetch.writeCommitGraph=false::
+	This config setting was created to help users automatically update their
+	commit-graph files as they perform fetches. However, this takes time
+	from foreground fetches and pulls and Scalar uses background maintenance
+	for this function instead.
+
+gc.auto=0::
+	This disables automatic garbage collection, since Scalar uses background
+	maintenance to keep the repository data in good shape.
+
+gui.GCWarning=false::
+	Since Scalar disables garbage collection by setting `gc.auto=0`, the
+	`git-gui` tool may start to warn about this setting. Disable this
+	warning as Scalar's background maintenance configuration makes the
+	warning irrelevant.
+
+index.skipHash=true::
+	Disable computing the hash of the index contents as it is being written.
+	This assists with performance, especially for large index files.
+
+index.threads=true::
+	This tells Git to automatically detect how many threads it should use
+	when reading the index due to the default value of `core.preloadIndex`,
+	which enables parallel index reads. This explicit setting also enables
+	`index.recordOffsetTable=true` to speed up parallel index reads.
+
+index.version=4::
+	This index version adds compression to the path names, reducing the size
+	of the index in a significant way for large repos. This is an important
+	performance boost.
+
+log.excludeDecoration=refs/prefetch/*::
+	Since Scalar enables background maintenance with the `incremental`
+	strategy, this setting avoids polluting `git log` output with refs
+	stored by the background prefetch operations.
+
+merge.renames=true::
+	When computing merges in large repos, it is particularly important to
+	detect renames to maximize the potential for a result that will validate
+	correctly. Users performing merges locally are more likely to be doing
+	so because a server-side merge (via pull request or similar) resulted in
+	conflicts. While this is the default setting, it is set specifically to
+	override a potential change to `diff.renames` which a user may set for
+	performance reasons.
+
+merge.stat=false::
+	This disables a diff output after computing a merge. This improves
+	performance of `git merge` for large repos while reducing noisy output.
+
+pack.useBitmaps=false::
+	This disables the use of `.bitmap` files attached to packfiles. Bitmap
+	files are optimized for server-side use, not client-side use. Scalar
+	disables this to avoid some performance issues that can occur if a user
+	accidentally creates `.bitmap` files.
+
+pack.usePathWalk=true::
+	This enables the `--path-walk` option to `git pack-objects` by default.
+	This can accelerate the computation and compression of packfiles created
+	by `git push` and other repack operations.
+
+receive.autoGC=false::
+	Similar to `gc.auto`, this setting is disabled in preference of
+	background maintenance.
+
+status.aheadBehind=false::
+	This disables the ahead/behind calculation that would normally happen
+	during a `git status` command. This information is frequently ignored by
+	users but can be expensive to calculate in large repos that receive
+	thousands of commits per day. The calculation is replaced with an advice
+	message that can be disabled by disabling the `advice.statusAheadBehind`
+	config.
+
+The following settings are different based on which platform is in use:
+
+core.untrackedCache=(true|false)::
+	The untracked cache feature is important for performance benefits on
+	large repositories, but has demonstrated some bugs on Windows
+	filesystems. Thus, this is set for other platforms but disabled on
+	Windows.
+
+http.sslBackend=schannel::
+	On Windows, the `openssl` backend has some issues with certain types of
+	remote providers and certificate types. Override the default setting to
+	avoid these common problems.
+
+
 SEE ALSO
 --------
 linkgit:git-clone[1], linkgit:git-maintenance[1].
diff --git a/scalar.c b/scalar.c
index 3b25fd3f353049..21ab1dba8979aa 100644
--- a/scalar.c
+++ b/scalar.c
@@ -132,6 +132,10 @@ static int have_fsmonitor_support(void)
 
 static int set_recommended_config(int reconfigure)
 {
+	/*
+	 * Be sure to update Documentation/scalar.adoc if you add, update,
+	 * or remove any of these recommended settings.
+	 */
 	struct scalar_config config[] = {
 		{ "am.keepCR", "true" },
 		{ "commitGraph.changedPaths", "true" },

From 6d8dc99478adeefc1a74f3b4db9336decadddc48 Mon Sep 17 00:00:00 2001
From: Justin Tobler <jltobler@gmail.com>
Date: Mon, 15 Dec 2025 14:05:12 -0600
Subject: [PATCH 235/553] docs: clarify git-rev-list(1) --filter behavior

When using the --filter option for git-rev-list(1), objects that are
explicitly provided ignore filters and are always printed unless the
--filter-provided-objects option is also specified. Clarify this
behavior in the documentation.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/rev-list-options.adoc | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/rev-list-options.adoc b/Documentation/rev-list-options.adoc
index d9665d82c8dfbe..453ec590571ffc 100644
--- a/Documentation/rev-list-options.adoc
+++ b/Documentation/rev-list-options.adoc
@@ -983,7 +983,9 @@ to name units in KiB, MiB, or GiB.  For example, `blob:limit=1k`
 is the same as 'blob:limit=1024'.
 +
 The form `--filter=object:type=(tag|commit|tree|blob)` omits all objects
-which are not of the requested type.
+which are not of the requested type. Note that explicitly provided objects
+ignore filters and are always printed unless `--filter-provided-objects` is
+also specified.
 +
 The form `--filter=sparse:oid=<blob-ish>` uses a sparse-checkout
 specification contained in the blob (or blob-expression) _<blob-ish>_

From f293bdcc29f91e3e56c478473a85a8e13e6fd87c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Sun, 14 Dec 2025 16:57:06 +0100
Subject: [PATCH 236/553] diff-files: fix copy detection
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Copy detection cannot work when comparing the index to the working tree
because Git ignores files that it is not explicitly told to track.  It
should work in the other direction, though, i.e. for a reverse diff of
the deletion of a copy from the index.

d1f2d7e8ca (Make run_diff_index() use unpack_trees(), not read_tree(),
2008-01-19) broke it with a seemingly stray change to run_diff_files().

We didn't notice because there's no test for that.  But even if we had
one, it might have gone unnoticed because the breakage only happens
with index preloading, which requires at least 1000 entries (more than
most test repos have) and is racy because it runs in parallel with the
actual command.

Fix copy detection by queuing up-to-date and skip-worktree entries using
diff_same().

While at it, use diff_same() also for queuing unchanged files not
flagged as up-to-date, i.e. clean submodules and entries where
preloading was not done at all or not quickly enough.  It uses less
memory than diff_change() and doesn't unnecessarily set the diff flag
has_changes.

Add two tests to cover running both without and with preloading.  The
first one passes reliably with the original code.  The second one
enables preloading and thus is racy.  It has a good chance to pass even
without the fix, but fails within seconds when running the test script
with --stress.  With the fix it runs fine for several minutes, until
my patience runs out.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff-lib.c          | 12 +++++++++---
 t/t4007-rename-3.sh | 23 ++++++++++++++++++++++-
 2 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index 8e624f38c6d6f3..5307390ff3db7b 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -226,8 +226,12 @@ void run_diff_files(struct rev_info *revs, unsigned int option)
 				continue;
 		}
 
-		if (ce_uptodate(ce) || ce_skip_worktree(ce))
+		if (ce_uptodate(ce) || ce_skip_worktree(ce)) {
+			if (revs->diffopt.flags.find_copies_harder)
+				diff_same(&revs->diffopt, ce->ce_mode,
+					  &ce->oid, ce->name);
 			continue;
+		}
 
 		/*
 		 * When CE_VALID is set (via "update-index --assume-unchanged"
@@ -272,8 +276,10 @@ void run_diff_files(struct rev_info *revs, unsigned int option)
 		if (!changed && !dirty_submodule) {
 			ce_mark_uptodate(ce);
 			mark_fsmonitor_valid(istate, ce);
-			if (!revs->diffopt.flags.find_copies_harder)
-				continue;
+			if (revs->diffopt.flags.find_copies_harder)
+				diff_same(&revs->diffopt, newmode,
+					  &ce->oid, ce->name);
+			continue;
 		}
 		oldmode = ce->ce_mode;
 		old_oid = &ce->oid;
diff --git a/t/t4007-rename-3.sh b/t/t4007-rename-3.sh
index e8faf0dd2ef1c5..34f7d276d116e0 100755
--- a/t/t4007-rename-3.sh
+++ b/t/t4007-rename-3.sh
@@ -57,7 +57,28 @@ test_expect_success 'copy, limited to a subtree' '
 '
 
 test_expect_success 'tweak work tree' '
-	rm -f path0/COPYING &&
+	rm -f path0/COPYING
+'
+
+cat >expected <<EOF
+:100644 100644 $blob $blob C100	path1/COPYING	path0/COPYING
+EOF
+
+# The cache has path0/COPYING and path1/COPYING, the working tree only
+# path1/COPYING.  This is a deletion -- we don't treat deduplication
+# specially.  In reverse it should be detected as a copy, though.
+test_expect_success 'copy detection, files to index' '
+	git diff-files -C --find-copies-harder -R >current &&
+	compare_diff_raw current expected
+'
+
+test_expect_success 'copy detection, files to preloaded index' '
+	GIT_TEST_PRELOAD_INDEX=1 \
+	git diff-files -C --find-copies-harder -R >current &&
+	compare_diff_raw current expected
+'
+
+test_expect_success 'tweak index' '
 	git update-index --remove path0/COPYING
 '
 # In the tree, there is only path0/COPYING.  In the cache, path0 does

From e7ef0ca622016d12a85836928a03959de4537c2f Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Tue, 16 Dec 2025 11:08:23 +0900
Subject: [PATCH 237/553] The ninth batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/RelNotes/2.53.0.adoc | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc
index 41ae2a5a7a4696..f28c8202919dc9 100644
--- a/Documentation/RelNotes/2.53.0.adoc
+++ b/Documentation/RelNotes/2.53.0.adoc
@@ -60,6 +60,13 @@ Performance, Internal Implementation, Development Support etc.
    "git diff --find-copies-harder", also making the operation run
    faster.
 
+ * The "git_istream" abstraction has been revamped to make it easier
+   to interface with pluggable object database design.
+
+ * Rewrite the only use of "mktemp()" that is subject to TOCTOU race
+   and stop using the insecure "mktemp()" function.
+   (merge 10bba537c4 rs/ban-mktemp later to maint).
+
 
 Fixes since v2.52
 -----------------
@@ -167,6 +174,9 @@ Fixes since v2.52
    pathspec, which has been corrected.
    (merge 05491b90ce js/last-modified-with-sparse-checkouts later to maint).
 
+ * Emulation code clean-up.
+   (merge 42aa7603aa gf/win32-pthread-cond-init later to maint).
+
  * Other code cleanup, docfix, build fix, etc.
    (merge 46207a54cc qj/doc-http-bad-want-response later to maint).
    (merge df90eccd93 kh/doc-commit-extra-references later to maint).

From c4a0c8845e2426375ad257b6c221a3a7d92ecfda Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 17 Dec 2025 14:11:28 +0900
Subject: [PATCH 238/553] The 10th batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/RelNotes/2.53.0.adoc | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc
index f28c8202919dc9..34216a59fe5fe6 100644
--- a/Documentation/RelNotes/2.53.0.adoc
+++ b/Documentation/RelNotes/2.53.0.adoc
@@ -184,3 +184,7 @@ Fixes since v2.52
    (merge fddba8f737 ja/doc-synopsis-style later to maint).
    (merge 22ce0cb639 en/xdiff-cleanup-2 later to maint).
    (merge 8ef7355a8f je/doc-pull later to maint).
+   (merge 48176f953f jc/capability-leak later to maint).
+   (merge 8cbbdc92f7 kh/doc-pre-commit-fix later to maint).
+   (merge d4bc39a4d9 mh/doc-config-gui-gcwarning later to maint).
+   (merge 41d425008a kh/doc-send-email-paragraph-fix later to maint).

From 1129780f6ab19f0a295c0b436890c510f71024f4 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 17 Dec 2025 03:54:15 +0900
Subject: [PATCH 239/553] commit: document that $command.signoff will not be
 added

Every now and then we see this coming up on the list.  Let's help
new contributors who are not aware of past discussions by clearly
documenting our past consensus.

Helped-by: brian m. carlson <sandals@crustytoothpaste.net>
Helped-by: Elijah Newren <newren@gmail.com>
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/gitfaq.adoc         | 19 +++++++++++++++++++
 Documentation/signoff-option.adoc |  4 ++++
 2 files changed, 23 insertions(+)

diff --git a/Documentation/gitfaq.adoc b/Documentation/gitfaq.adoc
index f2917d142c3630..8d3647d359423b 100644
--- a/Documentation/gitfaq.adoc
+++ b/Documentation/gitfaq.adoc
@@ -83,6 +83,25 @@ Windows would be the configuration `"C:\Program Files\Vim\gvim.exe" --nofork`,
 which quotes the filename with spaces and specifies the `--nofork` option to
 avoid backgrounding the process.
 
+[[sign-off]]
+Why not have `commit.signoff` and other configuration variables?::
+	Git intentionally does not (and will not) provide a
+	configuration variable, such as `commit.signoff`, to
+	automatically add `--signoff` by default.  The reason is to
+	protect the legal and intentional significance of a sign-off.
+	If there were more automated and widely publicized ways for
+	sign-offs to be appended, it would become easier for someone
+	to argue later that a "Signed-off-by" trailer was just added
+	out of habit or by automation, without the committer's full
+	awareness or intent to certify their agreement with the
+	Developer Certificate of Origin (DCO) or a similar statement.
+	This could undermine the sign-off’s credibility in legal or
+	contractual situations.
++
+There exists `format.signoff`, but that is a historical mistake, and
+it is not an excuse to add more mistakes of the same kind on top.
+
+
 Credentials
 -----------
 
diff --git a/Documentation/signoff-option.adoc b/Documentation/signoff-option.adoc
index cddfb225d1d62a..9a80d60f1bb1b8 100644
--- a/Documentation/signoff-option.adoc
+++ b/Documentation/signoff-option.adoc
@@ -16,3 +16,7 @@ endif::git-commit[]
 +
 The `--no-signoff` option can be used to countermand an earlier `--signoff`
 option on the command line.
++
+Git does not (and will not) have a configuration variable to enable
+the `--signoff` command line option by default; see the
+`commit.signoff` entry in the gitfaq for more details.

From 4ec7ac101b737cd2add8369d0e04eaec1a9f0735 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 17 Dec 2025 14:18:37 +0000
Subject: [PATCH 240/553] t9700: accommodate for Windows paths

Ever since fe53bbc9beb (Git.pm: Always set Repository to absolute path
if autodetecting, 2009-05-07), the t9700 test _must_ fail on Windows
because of that age-old Unix paths vs Windows paths problem.

The underlying root cause is that Git cannot run with a regular Win32
variant of Perl: the assumption that every path is a Unix path is just
too strong in Git's Perl code.

As a consequence, Git for Windows is basically stuck with using the
MSYS2 variant of Perl which uses a POSIX emulation layer (which is a
friendly fork of Cygwin) _and_ a best-effort Unix <-> Windows paths
conversion whenever crossing the boundary between MSYS2 and regular
Win32 processes. It is best effort only, though, using heuristics to
automagically convert correctly in most cases, but not in all cases.

In the context of this here patch, this means that asking `git.exe` for
the absolute path of the `.git/` directory will return a Win32 path
because `git.exe` is a regular Win32 executable that has no idea about
Unix-ish paths. But above-mentioned commit introduced a test that wants
to verify that this path is identical to the one that the Git Perl
module reports (which refuses to use Win32 paths and uses Unix-ish paths
instead). Obviously, this must fail because no heuristics can kick in at
that layer.

This test failure has not even been caught when Git introduced Windows
support in its CI definition in 2e90484eb4a (ci: add a Windows job to
the Azure Pipelines definition, 2019-01-29), as all tests relying on
Perl had to be disabled even from the start (because the CI runs would
otherwise have resulted in prohibitively long runtimes, not because
Windows is super slow per se, but because Git's test suite keeps
insisting on using technology that requires a POSIX emulation layer,
which _is_ super slow on Windows).

To work around this failure, let's use the `cygpath` utility to convert
the absolute `gitdir` path into the form that the Perl code expects.
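
For illustration only, the same conversion expressed in shell might look
like the sketch below. `to_mixed_path` is a hypothetical helper; it
assumes that `cygpath` exists only in MSYS2/Cygwin environments and
passes the path through unchanged everywhere else:

```shell
# Hypothetical helper: convert a path to the absolute "mixed" form
# (Windows drive prefix, forward slashes) when cygpath is available.
to_mixed_path () {
	if command -v cygpath >/dev/null 2>&1
	then
		# -a: absolute path, -m: mixed form; drop any stray CR.
		cygpath -am "$1" | tr -d '\r'
	else
		# Not an MSYS2/Cygwin environment: nothing to convert.
		printf '%s\n' "$1"
	fi
}

abs_git_dir=$(to_mixed_path "/tmp/example/.git")
```

The command substitution already strips the trailing newline, which is
what the Perl code does explicitly with a regular expression.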

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t9700/test.pl | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/t/t9700/test.pl b/t/t9700/test.pl
index 58a9b328d558f3..570b0c5680fc73 100755
--- a/t/t9700/test.pl
+++ b/t/t9700/test.pl
@@ -117,7 +117,12 @@ sub adjust_dirsep {
 unlink $tmpfile;
 
 # paths
-is($r->repo_path, $abs_repo_dir . "/.git", "repo_path");
+my $abs_git_dir = $abs_repo_dir . "/.git";
+if ($^O eq 'msys' or $^O eq 'cygwin') {
+  $abs_git_dir = `cygpath -am "$abs_repo_dir/.git"`;
+  $abs_git_dir =~ s/\r?\n?$//;
+}
+is($r->repo_path, $abs_git_dir, "repo_path");
 is($r->wc_path, $abs_repo_dir . "/", "wc_path");
 is($r->wc_subdir, "", "wc_subdir initial");
 $r->wc_chdir("directory1");
@@ -127,7 +132,7 @@ sub adjust_dirsep {
 # Object generation in sub directory
 chdir("directory2");
 my $r2 = Git->repository();
-is($r2->repo_path, $abs_repo_dir . "/.git", "repo_path (2)");
+is($r2->repo_path, $abs_git_dir, "repo_path (2)");
 is($r2->wc_path, $abs_repo_dir . "/", "wc_path (2)");
 is($r2->wc_subdir, "directory2/", "wc_subdir initial (2)");
 

From b90a926371bbb45b2abd27241a8ef682f1450b99 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 17 Dec 2025 14:18:38 +0000
Subject: [PATCH 241/553] apply: symbolic links lack a "trustable executable
 bit"

When 0482c32c334b (apply: ignore working tree filemode when
!core.filemode, 2023-12-26) fixed `git apply` to stop warning about
executable files, it inadvertently changed the code flow also for
symbolic links and directories.

Let's narrow the scope of the special `!trust_executable_bit` code path
to apply only to regular files.

This is needed to let t4115.5(symlink escape when creating new files)
pass on Windows when symbolic link support is enabled in the MSYS2
runtime.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 apply.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/apply.c b/apply.c
index a2ceb3fb40d3b5..de5750354ad2f8 100644
--- a/apply.c
+++ b/apply.c
@@ -3779,7 +3779,7 @@ static int check_preimage(struct apply_state *state,
 		if (*ce && !(*ce)->ce_mode)
 			BUG("ce_mode == 0 for path '%s'", old_name);
 
-		if (trust_executable_bit)
+		if (trust_executable_bit || !S_ISREG(st->st_mode))
 			st_mode = ce_mode_from_stat(*ce, st->st_mode);
 		else if (*ce)
 			st_mode = (*ce)->ce_mode;

From 6fa50cc4a1979fb8a2f77a026e307d6336a09172 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 17 Dec 2025 14:18:39 +0000
Subject: [PATCH 242/553] mingw: special-case `open(symlink, O_CREAT | O_EXCL)`

The `_wopen()` function would gladly follow a symbolic link to a
non-existent file and create it when given above-mentioned flags.

Git expects the `open()` call to fail, though. So let's add yet another
work-around to pretend that Windows behaves according to POSIX, see:
https://pubs.opengroup.org/onlinepubs/007904875/functions/open.html#:~:text=If%20O_CREAT%20and%20O_EXCL%20are,set%2C%20the%20result%20is%20undefined.

This is required to let t4115.8(--reject removes .rej symlink if it
exists) pass on Windows when enabling the MSYS2 runtime's symbolic link
support.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 compat/mingw.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/compat/mingw.c b/compat/mingw.c
index 90ba5cea9d3ace..ba1b7b6dd1e6a7 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -629,6 +629,7 @@ int mingw_open (const char *filename, int oflags, ...)
 	int fd, create = (oflags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL);
 	wchar_t wfilename[MAX_PATH];
 	open_fn_t open_fn;
+	WIN32_FILE_ATTRIBUTE_DATA fdata;
 
 	DECLARE_PROC_ADDR(ntdll.dll, NTSTATUS, NTAPI, RtlGetLastNtStatus, void);
 
@@ -653,6 +654,19 @@ int mingw_open (const char *filename, int oflags, ...)
 	else if (xutftowcs_path(wfilename, filename) < 0)
 		return -1;
 
+	/*
+	 * When `symlink` exists and is a symbolic link pointing to a
+	 * non-existing file, `_wopen(symlink, O_CREAT | O_EXCL)` would
+	 * create that file. Not what we want: Linux would say `EEXIST`
+	 * in that instance, which is therefore what Git expects.
+	 */
+	if (create &&
+	    GetFileAttributesExW(wfilename, GetFileExInfoStandard, &fdata) &&
+	    (fdata.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT)) {
+		errno = EEXIST;
+		return -1;
+	}
+
 	fd = open_fn(wfilename, oflags, mode);
 
 	/*

From 5e8e7e47e0029335bb8b51333d56077d72b862a9 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 17 Dec 2025 14:18:40 +0000
Subject: [PATCH 243/553] t0001: handle `diff --no-index` gracefully

The test case 're-init to move gitdir symlink' wants to compare the
contents of `newdir/.git`, which is a symbolic link pointing to a file.
However, `git diff --no-index`, which is used by `test_cmp` on Windows,
does not resolve symlinks; it shows the symlink _target_ instead (with a
file mode of 120000). That is totally unexpected by the test case, which
as a consequence fails, meaning that it's a bug in the test case itself.
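
The dispatch used to work around this can be sketched as follows. The
helper name `compare_following_symlinks`, the temporary directory and
the `diff -u` fallback are assumptions for the example, not part of the
test suite:

```shell
# Hypothetical helper: use cmp, which opens the file a symlink points
# to, whenever the configured comparison is `git diff --no-index`.
compare_following_symlinks () {
	case "${GIT_TEST_CMP:-}" in
	*--no-index*) cmp "$1" "$2" ;;
	*) ${GIT_TEST_CMP:-diff -u} "$1" "$2" ;;
	esac
}

tmp=$(mktemp -d)
echo "gitdir: /path/to/realgitdir" >"$tmp/expected"
echo "gitdir: /path/to/realgitdir" >"$tmp/real"
# "link" is a symbolic link to a regular file with the same contents.
ln -s real "$tmp/link"

GIT_TEST_CMP="git diff --no-index"
if compare_following_symlinks "$tmp/expected" "$tmp/link"
then
	result=same
else
	result=different
fi
```

Because `cmp` reads through the link, the comparison succeeds even
though `link` itself is not a regular file.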

Co-authored-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t0001-init.sh | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/t/t0001-init.sh b/t/t0001-init.sh
index 618da080dc9ea9..e4d32bb4d259f6 100755
--- a/t/t0001-init.sh
+++ b/t/t0001-init.sh
@@ -425,7 +425,11 @@ test_expect_success SYMLINKS 're-init to move gitdir symlink' '
 	git init --separate-git-dir ../realgitdir
 	) &&
 	echo "gitdir: $(pwd)/realgitdir" >expected &&
-	test_cmp expected newdir/.git &&
+	case "$GIT_TEST_CMP" in
+	# `git diff --no-index` does not resolve symlinks
+	*--no-index*) cmp expected newdir/.git;;
+	*) test_cmp expected newdir/.git;;
+	esac &&
 	test_cmp expected newdir/here &&
 	test_path_is_dir realgitdir/refs
 '

From 492cc31b57b2f06626c302f3470471bfe355de9b Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 17 Dec 2025 14:18:41 +0000
Subject: [PATCH 244/553] t0301: another fix for Windows compatibility

Just like 0fdcfa2f9f5 (t0301: fixes for windows compatibility,
2021-09-14) explained, we should not call `mkdir -m<mode>` in the test
suite because that would fail on Windows.

There was one forgotten instance of this which was hidden by a `SYMLINKS`
prerequisite. Currently, this prevents this test case from being
executed on Windows, but with the upcoming support for symbolic links,
it would become a problem.
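
A minimal sketch of the portable pattern, using a temporary directory
chosen only for the example:

```shell
tmp=$(mktemp -d)

# Instead of `mkdir -p -m 700 "$tmp/dir/"`, create the directory first
# and set its permissions in a separate step.
mkdir -p "$tmp/dir/" &&
chmod 700 "$tmp/dir/"

# On a POSIX filesystem the mode string now reads drwx------.
mode=$(ls -ld "$tmp/dir" | cut -c1-10)
```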

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t0301-credential-cache.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/t0301-credential-cache.sh b/t/t0301-credential-cache.sh
index dc30289f7539ee..6f7cfd9e33f633 100755
--- a/t/t0301-credential-cache.sh
+++ b/t/t0301-credential-cache.sh
@@ -123,7 +123,8 @@ test_expect_success SYMLINKS 'use user socket if user directory is a symlink to
 		rmdir \"\$HOME/dir/\" &&
 		rm \"\$HOME/.git-credential-cache\"
 	" &&
-	mkdir -p -m 700 "$HOME/dir/" &&
+	mkdir -p "$HOME/dir/" &&
+	chmod 700 "$HOME/dir/" &&
 	ln -s "$HOME/dir" "$HOME/.git-credential-cache" &&
 	check approve cache <<-\EOF &&
 	protocol=https

From bd6457cfa3f5216700da0ef6ee2ea6614c533a30 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 17 Dec 2025 14:18:42 +0000
Subject: [PATCH 245/553] t0600: fix incomplete prerequisite for a test case

The 'symref transaction supports symlinks' test case is guarded by the
`SYMLINKS` prerequisite because `core.prefersymlinkrefs = true` requires
symbolic links to be supported.

However, the `preferSymlinkRefs` feature is not supported on Windows,
therefore this test case needs the `!MINGW` prerequisite, too.

There's a couple more cases where we set this config key:

  - In a subsequent test in t0600, but there we explicitly set it to
    "false". So this would naturally be supported by Windows.

  - In t7201 we set the value to `yes`, but we never verify that the
    written reference is a symbolic link in the first place. I guess
    that we could rather remove setting the configuration value here, as
    we are about to deprecate support for symrefs via symbolic links in
    the first place. But that's certainly outside of the scope of this
    patch.

  - In t9903 we do the same, but likewise, we don't check whether the
    written file is a symbolic link.

Therefore this seems to be the only instance where the tests actually
need to be adapted.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t0600-reffiles-backend.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t0600-reffiles-backend.sh b/t/t0600-reffiles-backend.sh
index b11126ed478129..74bfa2e9ba060d 100755
--- a/t/t0600-reffiles-backend.sh
+++ b/t/t0600-reffiles-backend.sh
@@ -467,7 +467,7 @@ test_expect_success POSIXPERM 'git reflog expire honors core.sharedRepository' '
 	esac
 '
 
-test_expect_success SYMLINKS 'symref transaction supports symlinks' '
+test_expect_success SYMLINKS,!MINGW 'symref transaction supports symlinks' '
 	test_when_finished "git symbolic-ref -d TEST_SYMREF_HEAD" &&
 	git update-ref refs/heads/new @ &&
 	test_config core.prefersymlinkrefs true &&

From dd479069232d5afcceb1134c501e24cf11ddd9ed Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 17 Dec 2025 14:18:43 +0000
Subject: [PATCH 246/553] t1006: accommodate for symlink support in MSYS2

The MSYS2 runtime (which inherits this trait from the Cygwin runtime,
and which is used by Git for Windows' Bash to emulate POSIX
functionality on Windows, the same Bash that is also used to run Git's
test suite on Windows) has a mode where it can create native symbolic
links on Windows.

Naturally, this is a bit of a strange feature, given that Cygwin goes
out of its way to support Unix-like paths even if no Win32 program
understands those, and the symbolic links have to use Win32 paths
instead (which Win32 programs understand very well).

As a consequence, the symbolic link targets get normalized before the
links are created.

This results in certain quirks that Git's test suite is ill equipped to
accommodate (because Git's test suite expects to be able to use
Unix-like paths even on Windows).

The test script t1006-cat-file.sh contains two prime examples, two test
cases that need to skip a couple assertions because they are simply
wrong in the context of Git for Windows.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t1006-cat-file.sh | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index 1f61b666a7d382..0eee3bb8781b30 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -1048,18 +1048,28 @@ test_expect_success 'git cat-file --batch-check --follow-symlinks works for out-
 	echo .. >>expect &&
 	echo HEAD:dir/subdir/out-of-repo-link-dir | git cat-file --batch-check --follow-symlinks >actual &&
 	test_cmp expect actual &&
-	echo symlink 3 >expect &&
-	echo ../ >>expect &&
+	if test_have_prereq MINGW,SYMLINKS
+	then
+		test_write_lines "symlink 2" ..
+	else
+		test_write_lines "symlink 3" ../
+	fi >expect &&
 	echo HEAD:dir/subdir/out-of-repo-link-dir-trailing | git cat-file --batch-check --follow-symlinks >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'git cat-file --batch-check --follow-symlinks works for symlinks with internal ..' '
-	echo HEAD: | git cat-file --batch-check >expect &&
-	echo HEAD:up-down | git cat-file --batch-check --follow-symlinks >actual &&
-	test_cmp expect actual &&
-	echo HEAD:up-down-trailing | git cat-file --batch-check --follow-symlinks >actual &&
-	test_cmp expect actual &&
+	if test_have_prereq !MINGW
+	then
+		# The `up-down` and `up-down-trailing` symlinks are normalized
+		# in MSYS in `winsymlinks` mode and are therefore in a
+		# different shape than Git expects them.
+		echo HEAD: | git cat-file --batch-check >expect &&
+		echo HEAD:up-down | git cat-file --batch-check --follow-symlinks >actual &&
+		test_cmp expect actual &&
+		echo HEAD:up-down-trailing | git cat-file --batch-check --follow-symlinks >actual &&
+		test_cmp expect actual
+	fi &&
 	echo HEAD:up-down-file | git cat-file --batch-check --follow-symlinks >actual &&
 	test_cmp found actual &&
 	echo symlink 7 >expect &&

From be6ac3510708da0d662f97783ccaca0794a34593 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 17 Dec 2025 14:18:44 +0000
Subject: [PATCH 247/553] t1305: skip symlink tests that do not apply to
 Windows

In Git for Windows, the gitdir is canonicalized so that even when the
gitdir is specified via a symbolic link, the `gitdir:` conditional
include will only match the real directory path.

Unfortunately, t1305 codifies a different behavior in two test cases,
which are hereby skipped on Windows.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t1305-config-include.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t1305-config-include.sh b/t/t1305-config-include.sh
index 8ff2b0c232b485..6e51f892f320bb 100755
--- a/t/t1305-config-include.sh
+++ b/t/t1305-config-include.sh
@@ -286,7 +286,7 @@ test_expect_success SYMLINKS 'conditional include, relative path with symlinks'
 	)
 '
 
-test_expect_success SYMLINKS 'conditional include, gitdir matching symlink' '
+test_expect_success SYMLINKS,!MINGW 'conditional include, gitdir matching symlink' '
 	ln -s foo bar &&
 	(
 		cd bar &&
@@ -298,7 +298,7 @@ test_expect_success SYMLINKS 'conditional include, gitdir matching symlink' '
 	)
 '
 
-test_expect_success SYMLINKS 'conditional include, gitdir matching symlink, icase' '
+test_expect_success SYMLINKS,!MINGW 'conditional include, gitdir matching symlink, icase' '
 	(
 		cd bar &&
 		echo "[includeIf \"gitdir/i:BAR/\"]path=bar8" >>.git/config &&

From eae7c16c3db2e746dd720c4e9ad7c1724d372b07 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 17 Dec 2025 14:18:45 +0000
Subject: [PATCH 248/553] t6423: introduce Windows-specific handling for
 symlinking to /dev/null

The device `/dev/null` does not exist on Windows; it is called `NUL`
there. Calling `ln -s /dev/null my-symlink` in a symlink-enabled MSYS2
Bash will therefore literally link to a file or directory called `null`
that is supposed to be in the current drive's top-level `dev` directory,
which typically does not exist.

The test, however, really wants the created symbolic link to point to
the NUL device. Let's instead use the `mklink` utility on Windows to
perform that job, and keep using `ln -s /dev/null <target>` on
non-Windows platforms.

While at it, add the missing `SYMLINKS` prereq because this test _still_
would not pass on Windows before support for symbolic links is
upstreamed from Git for Windows.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t6423-merge-rename-directories.sh | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/t/t6423-merge-rename-directories.sh b/t/t6423-merge-rename-directories.sh
index 533ac85dc83409..53535a8ebfc393 100755
--- a/t/t6423-merge-rename-directories.sh
+++ b/t/t6423-merge-rename-directories.sh
@@ -5158,13 +5158,18 @@ test_setup_12m () {
 		git switch B &&
 		git rm dir/subdir/file &&
 		mkdir dir &&
-		ln -s /dev/null dir/subdir &&
+		if test_have_prereq MINGW
+		then
+			cmd //c 'mklink dir\subdir NUL'
+		else
+			ln -s /dev/null dir/subdir
+		fi &&
 		git add . &&
 		git commit -m "B"
 	)
 }
 
-test_expect_success '12m: Change parent of renamed-dir to symlink on other side' '
+test_expect_success SYMLINKS '12m: Change parent of renamed-dir to symlink on other side' '
 	test_setup_12m &&
 	(
 		cd 12m &&

From ef6dd000ad813fc34a05c4b9055578df13a2eaa6 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 17 Dec 2025 14:18:46 +0000
Subject: [PATCH 249/553] t7800: work around the MSYS path conversion on
 Windows

Git's test suite relies on Unix shell scripting, which is
understandable, of course, given Git's firm roots in (and indeed,
ongoing focus on) Linux.

This fact, combined with Unix shell scripting's natural
habitat -- which is, naturally... *drumroll*... Unix --
often has unintended side effects, where developers expect the test
suite to run in a Unix environment, which is an incorrect assumption.

One instance of this problem can be observed in the 'difftool --dir-diff
handles modified symlinks' test case in `t7800-difftool.sh`, which
assumes that all absolute paths start with a forward slash. That
assumption is incorrect in general, e.g. on Windows, where absolute
paths have many shapes and forms, none of which starts with a forward
slash.

The only saving grace is that this test case is currently not run on
Windows because of the `SYMLINKS` prerequisite. However, I am currently
working towards upstreaming symbolic link support from Git for Windows
to upstream Git, which will put a crack into that saving grace.

Let's change that test case so that it does not rely on absolute paths
(which are passed to the "external command" `ls` as parameters and are
therefore part of its output, and which the test case wants to filter
out before verifying that the output is as expected) starting with a
forward slash. Let's instead rely on the much more reliable fact that
`ls` will output the path in a line that ends in a colon, and simply
filter out those lines by matching said colon instead.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t7800-difftool.sh | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/t/t7800-difftool.sh b/t/t7800-difftool.sh
index 9b74db55634b16..bf0f67378dbb23 100755
--- a/t/t7800-difftool.sh
+++ b/t/t7800-difftool.sh
@@ -752,11 +752,11 @@ test_expect_success SYMLINKS 'difftool --dir-diff handles modified symlinks' '
 		c
 	EOF
 	git difftool --symlinks --dir-diff --extcmd ls >output &&
-	grep -v ^/ output >actual &&
+	grep -v ":\$" output >actual &&
 	test_cmp expect actual &&
 
 	git difftool --no-symlinks --dir-diff --extcmd ls >output &&
-	grep -v ^/ output >actual &&
+	grep -v ":\$" output >actual &&
 	test_cmp expect actual &&
 
 	# The left side contains symlink "c" that points to "b"
@@ -786,11 +786,11 @@ test_expect_success SYMLINKS 'difftool --dir-diff handles modified symlinks' '
 
 	EOF
 	git difftool --symlinks --dir-diff --extcmd ls >output &&
-	grep -v ^/ output >actual &&
+	grep -v ":\$" output >actual &&
 	test_cmp expect actual &&
 
 	git difftool --no-symlinks --dir-diff --extcmd ls >output &&
-	grep -v ^/ output >actual &&
+	grep -v ":\$" output >actual &&
 	test_cmp expect actual
 '
 

From 9faaf254ba061e9fc7065f4c940c9dfcc51e6bbe Mon Sep 17 00:00:00 2001
From: Justin Tobler <jltobler@gmail.com>
Date: Wed, 17 Dec 2025 11:53:58 -0600
Subject: [PATCH 250/553] builtin/repo: group per-type object values into
 struct

The `object_stats` structure stores object counts by type. In a
subsequent commit, additional per-type object measurements will also be
stored. Group per-type object values into a new struct to allow better
reuse.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
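A minimal sketch of the reuse this grouping is meant to enable, written
as a standalone program fragment rather than as code from builtin/repo.c.
The helper name `sum_object_values` is hypothetical, and the
`inflated_sizes` member is only assumed here as an example of a second
per-type measurement sharing the same struct:

```c
#include <stddef.h>

/* Per-type values; the same shape can hold counts, sizes, and so on. */
struct object_values {
	size_t tags;
	size_t commits;
	size_t trees;
	size_t blobs;
};

/* Each measurement becomes one more member of the same type. */
struct object_stats {
	struct object_values type_counts;
	struct object_values inflated_sizes; /* assumed second measurement */
};

/* Hypothetical helper: one summing function serves every measurement. */
size_t sum_object_values(const struct object_values *values)
{
	return values->tags + values->commits + values->trees + values->blobs;
}
```

With this layout, totalling counts and totalling sizes are both a call
to the same helper with a different member address, for example
`sum_object_values(&stats.type_counts)`.
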
 builtin/repo.c | 42 +++++++++++++++++++++++++-----------------
 1 file changed, 25 insertions(+), 17 deletions(-)

diff --git a/builtin/repo.c b/builtin/repo.c
index 2a653bd3eacf20..a69699857a5e03 100644
--- a/builtin/repo.c
+++ b/builtin/repo.c
@@ -202,13 +202,17 @@ struct ref_stats {
 	size_t others;
 };
 
-struct object_stats {
+struct object_values {
 	size_t tags;
 	size_t commits;
 	size_t trees;
 	size_t blobs;
 };
 
+struct object_stats {
+	struct object_values type_counts;
+};
+
 struct repo_structure {
 	struct ref_stats refs;
 	struct object_stats objects;
@@ -281,9 +285,9 @@ static inline size_t get_total_reference_count(struct ref_stats *stats)
 	return stats->branches + stats->remotes + stats->tags + stats->others;
 }
 
-static inline size_t get_total_object_count(struct object_stats *stats)
+static inline size_t get_total_object_values(struct object_values *values)
 {
-	return stats->tags + stats->commits + stats->trees + stats->blobs;
+	return values->tags + values->commits + values->trees + values->blobs;
 }
 
 static void stats_table_setup_structure(struct stats_table *table,
@@ -302,14 +306,18 @@ static void stats_table_setup_structure(struct stats_table *table,
 	stats_table_count_addf(table, refs->remotes, "    * %s", _("Remotes"));
 	stats_table_count_addf(table, refs->others, "    * %s", _("Others"));
 
-	object_total = get_total_object_count(objects);
+	object_total = get_total_object_values(&objects->type_counts);
 	stats_table_addf(table, "");
 	stats_table_addf(table, "* %s", _("Reachable objects"));
 	stats_table_count_addf(table, object_total, "  * %s", _("Count"));
-	stats_table_count_addf(table, objects->commits, "    * %s", _("Commits"));
-	stats_table_count_addf(table, objects->trees, "    * %s", _("Trees"));
-	stats_table_count_addf(table, objects->blobs, "    * %s", _("Blobs"));
-	stats_table_count_addf(table, objects->tags, "    * %s", _("Tags"));
+	stats_table_count_addf(table, objects->type_counts.commits,
+			       "    * %s", _("Commits"));
+	stats_table_count_addf(table, objects->type_counts.trees,
+			       "    * %s", _("Trees"));
+	stats_table_count_addf(table, objects->type_counts.blobs,
+			       "    * %s", _("Blobs"));
+	stats_table_count_addf(table, objects->type_counts.tags,
+			       "    * %s", _("Tags"));
 }
 
 static void stats_table_print_structure(const struct stats_table *table)
@@ -389,13 +397,13 @@ static void structure_keyvalue_print(struct repo_structure *stats,
 	       (uintmax_t)stats->refs.others, value_delim);
 
 	printf("objects.commits.count%c%" PRIuMAX "%c", key_delim,
-	       (uintmax_t)stats->objects.commits, value_delim);
+	       (uintmax_t)stats->objects.type_counts.commits, value_delim);
 	printf("objects.trees.count%c%" PRIuMAX "%c", key_delim,
-	       (uintmax_t)stats->objects.trees, value_delim);
+	       (uintmax_t)stats->objects.type_counts.trees, value_delim);
 	printf("objects.blobs.count%c%" PRIuMAX "%c", key_delim,
-	       (uintmax_t)stats->objects.blobs, value_delim);
+	       (uintmax_t)stats->objects.type_counts.blobs, value_delim);
 	printf("objects.tags.count%c%" PRIuMAX "%c", key_delim,
-	       (uintmax_t)stats->objects.tags, value_delim);
+	       (uintmax_t)stats->objects.type_counts.tags, value_delim);
 
 	fflush(stdout);
 }
@@ -473,22 +481,22 @@ static int count_objects(const char *path UNUSED, struct oid_array *oids,
 
 	switch (type) {
 	case OBJ_TAG:
-		stats->tags += oids->nr;
+		stats->type_counts.tags += oids->nr;
 		break;
 	case OBJ_COMMIT:
-		stats->commits += oids->nr;
+		stats->type_counts.commits += oids->nr;
 		break;
 	case OBJ_TREE:
-		stats->trees += oids->nr;
+		stats->type_counts.trees += oids->nr;
 		break;
 	case OBJ_BLOB:
-		stats->blobs += oids->nr;
+		stats->type_counts.blobs += oids->nr;
 		break;
 	default:
 		BUG("invalid object type");
 	}
 
-	object_count = get_total_object_count(stats);
+	object_count = get_total_object_values(&stats->type_counts);
 	display_progress(data->progress, object_count);
 
 	return 0;

From ce849b1851102d974653701564573798034492d5 Mon Sep 17 00:00:00 2001
From: Justin Tobler <jltobler@gmail.com>
Date: Wed, 17 Dec 2025 11:53:59 -0600
Subject: [PATCH 251/553] strbuf: split out logic to humanise byte values

In a subsequent commit, byte size values displayed in table output for
the git-repo(1) "structure" subcommand will be shown in a more
human-readable format with the appropriate unit prefixes. For this
use case, the downscaled values and unit strings must be handled
separately to ensure proper column alignment.

Split out logic from strbuf_humanise() to downscale byte values and
determine the corresponding unit prefix into a separate humanise_bytes()
function that provides separate value and unit strings.

Note that the "byte" string in "t/helper/test-simple-ipc.c" is unmarked
for translation here so that it doesn't conflict with the newly defined
plural "byte/bytes" translation and instead uses it.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
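To illustrate why separate value and unit strings help with column
alignment, here is a simplified standalone sketch. It is not the code
from strbuf.c: `split_bytes` and `format_row` are hypothetical names,
translation is left out, the GiB tier is omitted for brevity, and the
rounding arithmetic is assumed to mirror the KiB/MiB branches of this
patch.

```c
#include <stdio.h>

/* Hypothetical helper: scale a byte count into number and unit strings. */
void split_bytes(unsigned long long bytes, char *value, size_t len,
		 const char **unit)
{
	if (bytes > 1ULL << 20) {
		unsigned long long x = bytes + 5243; /* for rounding */
		snprintf(value, len, "%llu.%02llu", x >> 20,
			 ((x & ((1ULL << 20) - 1)) * 100) >> 20);
		*unit = "MiB";
	} else if (bytes > 1ULL << 10) {
		unsigned long long x = bytes + 5; /* for rounding */
		snprintf(value, len, "%llu.%02llu", x >> 10,
			 ((x & ((1ULL << 10) - 1)) * 100) >> 10);
		*unit = "KiB";
	} else {
		snprintf(value, len, "%llu", bytes);
		*unit = bytes == 1 ? "byte" : "bytes";
	}
}

/*
 * Hypothetical helper: right-align the number and left-align the unit,
 * which is only possible because the two parts arrive separately.
 */
void format_row(char *out, size_t len, unsigned long long bytes,
		int value_width, int unit_width)
{
	char value[32];
	const char *unit;

	split_bytes(bytes, value, sizeof(value), &unit);
	snprintf(out, len, "%*s %-*s", value_width, value, unit_width, unit);
}
```

Formatting 512, 1536 and 3670016 bytes with a value width of 6 and a
unit width of 5 lines the decimal numbers up on the right and the units
up on the left, which a single pre-joined string such as "1.50 KiB"
would not allow.
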
 strbuf.c                   | 74 ++++++++++++++++++++------------------
 strbuf.h                   | 14 ++++++++
 t/helper/test-simple-ipc.c |  7 +++-
 3 files changed, 60 insertions(+), 35 deletions(-)

diff --git a/strbuf.c b/strbuf.c
index 6c3851a7f84d72..349ee9727a1920 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -836,47 +836,53 @@ void strbuf_addstr_urlencode(struct strbuf *sb, const char *s,
 	strbuf_add_urlencode(sb, s, strlen(s), allow_unencoded_fn);
 }
 
-static void strbuf_humanise(struct strbuf *buf, off_t bytes,
-				 int humanise_rate)
+void humanise_bytes(off_t bytes, char **value, const char **unit,
+		    unsigned flags)
 {
+	int humanise_rate = flags & HUMANISE_RATE;
+
 	if (bytes > 1 << 30) {
-		strbuf_addf(buf,
-				humanise_rate == 0 ?
-					/* TRANSLATORS: IEC 80000-13:2008 gibibyte */
-					_("%u.%2.2u GiB") :
-					/* TRANSLATORS: IEC 80000-13:2008 gibibyte/second */
-					_("%u.%2.2u GiB/s"),
-			    (unsigned)(bytes >> 30),
-			    (unsigned)(bytes & ((1 << 30) - 1)) / 10737419);
+		*value = xstrfmt(_("%u.%2.2u"), (unsigned)(bytes >> 30),
+				 (unsigned)(bytes & ((1 << 30) - 1)) / 10737419);
+		/* TRANSLATORS: IEC 80000-13:2008 gibibyte/second and gibibyte */
+		*unit = humanise_rate ? _("GiB/s") : _("GiB");
 	} else if (bytes > 1 << 20) {
-		unsigned x = bytes + 5243;  /* for rounding */
-		strbuf_addf(buf,
-				humanise_rate == 0 ?
-					/* TRANSLATORS: IEC 80000-13:2008 mebibyte */
-					_("%u.%2.2u MiB") :
-					/* TRANSLATORS: IEC 80000-13:2008 mebibyte/second */
-					_("%u.%2.2u MiB/s"),
-			    x >> 20, ((x & ((1 << 20) - 1)) * 100) >> 20);
+		unsigned x = bytes + 5243; /* for rounding */
+		*value = xstrfmt(_("%u.%2.2u"), x >> 20,
+				 ((x & ((1 << 20) - 1)) * 100) >> 20);
+		/* TRANSLATORS: IEC 80000-13:2008 mebibyte/second and mebibyte */
+		*unit = humanise_rate ? _("MiB/s") : _("MiB");
 	} else if (bytes > 1 << 10) {
-		unsigned x = bytes + 5;  /* for rounding */
-		strbuf_addf(buf,
-				humanise_rate == 0 ?
-					/* TRANSLATORS: IEC 80000-13:2008 kibibyte */
-					_("%u.%2.2u KiB") :
-					/* TRANSLATORS: IEC 80000-13:2008 kibibyte/second */
-					_("%u.%2.2u KiB/s"),
-			    x >> 10, ((x & ((1 << 10) - 1)) * 100) >> 10);
+		unsigned x = bytes + 5; /* for rounding */
+		*value = xstrfmt(_("%u.%2.2u"), x >> 10,
+				 ((x & ((1 << 10) - 1)) * 100) >> 10);
+		/* TRANSLATORS: IEC 80000-13:2008 kibibyte/second and kibibyte */
+		*unit = humanise_rate ? _("KiB/s") : _("KiB");
 	} else {
-		strbuf_addf(buf,
-				humanise_rate == 0 ?
-					/* TRANSLATORS: IEC 80000-13:2008 byte */
-					Q_("%u byte", "%u bytes", bytes) :
-					/* TRANSLATORS: IEC 80000-13:2008 byte/second */
-					Q_("%u byte/s", "%u bytes/s", bytes),
-				(unsigned)bytes);
+		*value = xstrfmt("%u", (unsigned)bytes);
+		*unit = humanise_rate ?
+			       /* TRANSLATORS: IEC 80000-13:2008 byte/second */
+			       Q_("byte/s", "bytes/s", bytes) :
+			       /* TRANSLATORS: IEC 80000-13:2008 byte */
+			       Q_("byte", "bytes", bytes);
 	}
 }
 
+static void strbuf_humanise(struct strbuf *buf, off_t bytes, unsigned flags)
+{
+	char *value;
+	const char *unit;
+
+	humanise_bytes(bytes, &value, &unit, flags);
+
+	/*
+	 * TRANSLATORS: The first argument is the number string. The second
+	 * argument is the unit string (i.e. "12.34 MiB/s").
+	 */
+	strbuf_addf(buf, _("%s %s"), value, unit);
+	free(value);
+}
+
 void strbuf_humanise_bytes(struct strbuf *buf, off_t bytes)
 {
 	strbuf_humanise(buf, bytes, 0);
@@ -884,7 +890,7 @@ void strbuf_humanise_bytes(struct strbuf *buf, off_t bytes)
 
 void strbuf_humanise_rate(struct strbuf *buf, off_t bytes)
 {
-	strbuf_humanise(buf, bytes, 1);
+	strbuf_humanise(buf, bytes, HUMANISE_RATE);
 }
 
 int printf_ln(const char *fmt, ...)
diff --git a/strbuf.h b/strbuf.h
index a580ac6084b7f1..698b3cc4a51367 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -367,6 +367,20 @@ void strbuf_addbuf_percentquote(struct strbuf *dst, const struct strbuf *src);
  */
 void strbuf_add_percentencode(struct strbuf *dst, const char *src, int flags);
 
+enum humanise_flags {
+	/*
+	 * Use rate based units for humanised values.
+	 */
+	HUMANISE_RATE = (1 << 0),
+};
+
+/**
+ * Converts the given byte size into a downscaled human-readable value and
+ * corresponding unit as two separate strings.
+ */
+void humanise_bytes(off_t bytes, char **value, const char **unit,
+		    unsigned flags);
+
 /**
  * Append the given byte size as a human-readable string (i.e. 12.23 KiB,
  * 3.50 MiB).
diff --git a/t/helper/test-simple-ipc.c b/t/helper/test-simple-ipc.c
index 03cc5eea2c2944..442ad6b16f18d8 100644
--- a/t/helper/test-simple-ipc.c
+++ b/t/helper/test-simple-ipc.c
@@ -603,7 +603,12 @@ int cmd__simple_ipc(int argc, const char **argv)
 		OPT_INTEGER(0, "bytecount", &cl_args.bytecount, N_("number of bytes")),
 		OPT_INTEGER(0, "batchsize", &cl_args.batchsize, N_("number of requests per thread")),
 
-		OPT_STRING(0, "byte", &bytevalue, N_("byte"), N_("ballast character")),
+		/*
+		 * The "byte" string here is not marked for translation and
+		 * instead relies on translation in strbuf.c:humanise_bytes() to
+		 * avoid conflict with the plural form.
+		 */
+		OPT_STRING(0, "byte", &bytevalue, "byte", N_("ballast character")),
 		OPT_STRING(0, "token", &cl_args.token, N_("token"), N_("command token to send to the server")),
 
 		OPT_END()

From 54731320cc3db337f9a3e3920f707e9de3596c60 Mon Sep 17 00:00:00 2001
From: Justin Tobler <jltobler@gmail.com>
Date: Wed, 17 Dec 2025 11:54:00 -0600
Subject: [PATCH 252/553] builtin/repo: humanise count values in structure
 output

The table output format for the git-repo(1) structure subcommand is used
by default and intended to provide output to users in a human-friendly
manner. When the reference/object count values in a repository are
large, it becomes more cumbersome for users to read the values.

For larger values, update the table output format to instead produce
more human-friendly count values that are scaled down with the
appropriate unit prefix. Output for the keyvalue and nul formats remains
unchanged.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
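The scaling and rounding scheme can be sketched in isolation as below.
This is a simplified standalone illustration and not the strbuf.c
implementation: `scale_count` is a hypothetical name, it writes into a
caller-provided buffer instead of allocating, and translation of the
unit strings is left out.

```c
#include <stdio.h>

/*
 * Hypothetical helper: scale a count to two decimals with an SI prefix.
 * Half of the last displayed digit is added before the integer division
 * so that the result is rounded rather than truncated. Counts below
 * 1000 are printed as-is and get no unit (*unit is set to NULL).
 */
void scale_count(unsigned long long count, char *value, size_t len,
		 const char **unit)
{
	if (count >= 1000000000ULL) {
		unsigned long long x = count + 5000000ULL; /* for rounding */
		snprintf(value, len, "%llu.%02llu", x / 1000000000ULL,
			 x % 1000000000ULL / 10000000ULL);
		*unit = "G";
	} else if (count >= 1000000ULL) {
		unsigned long long x = count + 5000ULL; /* for rounding */
		snprintf(value, len, "%llu.%02llu", x / 1000000ULL,
			 x % 1000000ULL / 10000ULL);
		*unit = "M";
	} else if (count >= 1000ULL) {
		unsigned long long x = count + 5ULL; /* for rounding */
		snprintf(value, len, "%llu.%02llu", x / 1000ULL,
			 x % 1000ULL / 10ULL);
		*unit = "k";
	} else {
		snprintf(value, len, "%llu", count);
		*unit = NULL;
	}
}
```

For example, a count of 1005 becomes the value "1.01" with the unit
"k", while a count of 4 stays "4" with no unit, so a caller printing a
table has to treat a NULL unit as an empty column.
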
 builtin/repo.c            | 38 +++++++++++++++++-------
 strbuf.c                  | 26 ++++++++++++++++
 strbuf.h                  |  6 ++++
 t/t1901-repo-structure.sh | 62 +++++++++++++++++++--------------------
 4 files changed, 91 insertions(+), 41 deletions(-)

diff --git a/builtin/repo.c b/builtin/repo.c
index a69699857a5e03..9c61bc3e173a94 100644
--- a/builtin/repo.c
+++ b/builtin/repo.c
@@ -223,6 +223,7 @@ struct stats_table {
 
 	int name_col_width;
 	int value_col_width;
+	int unit_col_width;
 };
 
 /*
@@ -230,6 +231,7 @@ struct stats_table {
  */
 struct stats_table_entry {
 	char *value;
+	const char *unit;
 };
 
 static void stats_table_vaddf(struct stats_table *table,
@@ -250,11 +252,18 @@ static void stats_table_vaddf(struct stats_table *table,
 
 	if (name_width > table->name_col_width)
 		table->name_col_width = name_width;
-	if (entry) {
+	if (!entry)
+		return;
+	if (entry->value) {
 		int value_width = utf8_strwidth(entry->value);
 		if (value_width > table->value_col_width)
 			table->value_col_width = value_width;
 	}
+	if (entry->unit) {
+		int unit_width = utf8_strwidth(entry->unit);
+		if (unit_width > table->unit_col_width)
+			table->unit_col_width = unit_width;
+	}
 }
 
 static void stats_table_addf(struct stats_table *table, const char *format, ...)
@@ -273,7 +282,7 @@ static void stats_table_count_addf(struct stats_table *table, size_t value,
 	va_list ap;
 
 	CALLOC_ARRAY(entry, 1);
-	entry->value = xstrfmt("%" PRIuMAX, (uintmax_t)value);
+	humanise_count(value, &entry->value, &entry->unit);
 
 	va_start(ap, format);
 	stats_table_vaddf(table, entry, format, ap);
@@ -324,20 +333,24 @@ static void stats_table_print_structure(const struct stats_table *table)
 {
 	const char *name_col_title = _("Repository structure");
 	const char *value_col_title = _("Value");
-	int name_col_width = utf8_strwidth(name_col_title);
-	int value_col_width = utf8_strwidth(value_col_title);
+	int title_name_width = utf8_strwidth(name_col_title);
+	int title_value_width = utf8_strwidth(value_col_title);
+	int name_col_width = table->name_col_width;
+	int value_col_width = table->value_col_width;
+	int unit_col_width = table->unit_col_width;
 	struct string_list_item *item;
 	struct strbuf buf = STRBUF_INIT;
 
-	if (table->name_col_width > name_col_width)
-		name_col_width = table->name_col_width;
-	if (table->value_col_width > value_col_width)
-		value_col_width = table->value_col_width;
+	if (title_name_width > name_col_width)
+		name_col_width = title_name_width;
+	if (title_value_width > value_col_width + unit_col_width + 1)
+		value_col_width = title_value_width - unit_col_width;
 
 	strbuf_addstr(&buf, "| ");
 	strbuf_utf8_align(&buf, ALIGN_LEFT, name_col_width, name_col_title);
 	strbuf_addstr(&buf, " | ");
-	strbuf_utf8_align(&buf, ALIGN_LEFT, value_col_width, value_col_title);
+	strbuf_utf8_align(&buf, ALIGN_LEFT,
+			  value_col_width + unit_col_width + 1, value_col_title);
 	strbuf_addstr(&buf, " |");
 	printf("%s\n", buf.buf);
 
@@ -345,17 +358,20 @@ static void stats_table_print_structure(const struct stats_table *table)
 	for (int i = 0; i < name_col_width; i++)
 		putchar('-');
 	printf(" | ");
-	for (int i = 0; i < value_col_width; i++)
+	for (int i = 0; i < value_col_width + unit_col_width + 1; i++)
 		putchar('-');
 	printf(" |\n");
 
 	for_each_string_list_item(item, &table->rows) {
 		struct stats_table_entry *entry = item->util;
 		const char *value = "";
+		const char *unit = "";
 
 		if (entry) {
 			struct stats_table_entry *entry = item->util;
 			value = entry->value;
+			if (entry->unit)
+				unit = entry->unit;
 		}
 
 		strbuf_reset(&buf);
@@ -363,6 +379,8 @@ static void stats_table_print_structure(const struct stats_table *table)
 		strbuf_utf8_align(&buf, ALIGN_LEFT, name_col_width, item->string);
 		strbuf_addstr(&buf, " | ");
 		strbuf_utf8_align(&buf, ALIGN_RIGHT, value_col_width, value);
+		strbuf_addch(&buf, ' ');
+		strbuf_utf8_align(&buf, ALIGN_LEFT, unit_col_width, unit);
 		strbuf_addstr(&buf, " |");
 		printf("%s\n", buf.buf);
 	}
diff --git a/strbuf.c b/strbuf.c
index 349ee9727a1920..995ff15169f59e 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -836,6 +836,32 @@ void strbuf_addstr_urlencode(struct strbuf *sb, const char *s,
 	strbuf_add_urlencode(sb, s, strlen(s), allow_unencoded_fn);
 }
 
+void humanise_count(size_t count, char **value, const char **unit)
+{
+	if (count >= 1000000000) {
+		size_t x = count + 5000000; /* for rounding */
+		*value = xstrfmt(_("%u.%2.2u"), (unsigned)(x / 1000000000),
+				 (unsigned)(x % 1000000000 / 10000000));
+		/* TRANSLATORS: SI decimal prefix symbol for 10^9 */
+		*unit = _("G");
+	} else if (count >= 1000000) {
+		size_t x = count + 5000; /* for rounding */
+		*value = xstrfmt(_("%u.%2.2u"), (unsigned)(x / 1000000),
+				 (unsigned)(x % 1000000 / 10000));
+		/* TRANSLATORS: SI decimal prefix symbol for 10^6 */
+		*unit = _("M");
+	} else if (count >= 1000) {
+		size_t x = count + 5; /* for rounding */
+		*value = xstrfmt(_("%u.%2.2u"), (unsigned)(x / 1000),
+				 (unsigned)(x % 1000 / 10));
+		/* TRANSLATORS: SI decimal prefix symbol for 10^3 */
+		*unit = _("k");
+	} else {
+		*value = xstrfmt("%u", (unsigned)count);
+		*unit = NULL;
+	}
+}
+
 void humanise_bytes(off_t bytes, char **value, const char **unit,
 		    unsigned flags)
 {
diff --git a/strbuf.h b/strbuf.h
index 698b3cc4a51367..52feef4c1bb0fd 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -381,6 +381,12 @@ enum humanise_flags {
 void humanise_bytes(off_t bytes, char **value, const char **unit,
 		    unsigned flags);
 
+/**
+ * Converts the given count into a downscaled human-readable value and
+ * corresponding unit as two separate strings.
+ */
+void humanise_count(size_t count, char **value, const char **unit);
+
 /**
  * Append the given byte size as a human-readable string (i.e. 12.23 KiB,
  * 3.50 MiB).
diff --git a/t/t1901-repo-structure.sh b/t/t1901-repo-structure.sh
index 36a71a144e3f74..55fd13ad1b9c4c 100755
--- a/t/t1901-repo-structure.sh
+++ b/t/t1901-repo-structure.sh
@@ -10,21 +10,21 @@ test_expect_success 'empty repository' '
 	(
 		cd repo &&
 		cat >expect <<-\EOF &&
-		| Repository structure | Value |
-		| -------------------- | ----- |
-		| * References         |       |
-		|   * Count            |     0 |
-		|     * Branches       |     0 |
-		|     * Tags           |     0 |
-		|     * Remotes        |     0 |
-		|     * Others         |     0 |
-		|                      |       |
-		| * Reachable objects  |       |
-		|   * Count            |     0 |
-		|     * Commits        |     0 |
-		|     * Trees          |     0 |
-		|     * Blobs          |     0 |
-		|     * Tags           |     0 |
+		| Repository structure | Value  |
+		| -------------------- | ------ |
+		| * References         |        |
+		|   * Count            |     0  |
+		|     * Branches       |     0  |
+		|     * Tags           |     0  |
+		|     * Remotes        |     0  |
+		|     * Others         |     0  |
+		|                      |        |
+		| * Reachable objects  |        |
+		|   * Count            |     0  |
+		|     * Commits        |     0  |
+		|     * Trees          |     0  |
+		|     * Blobs          |     0  |
+		|     * Tags           |     0  |
 		EOF
 
 		git repo structure >out 2>err &&
@@ -39,7 +39,7 @@ test_expect_success 'repository with references and objects' '
 	git init repo &&
 	(
 		cd repo &&
-		test_commit_bulk 42 &&
+		test_commit_bulk 1005 &&
 		git tag -a foo -m bar &&
 
 		oid="$(git rev-parse HEAD)" &&
@@ -49,21 +49,21 @@ test_expect_success 'repository with references and objects' '
 		git notes add -m foo &&
 
 		cat >expect <<-\EOF &&
-		| Repository structure | Value |
-		| -------------------- | ----- |
-		| * References         |       |
-		|   * Count            |     4 |
-		|     * Branches       |     1 |
-		|     * Tags           |     1 |
-		|     * Remotes        |     1 |
-		|     * Others         |     1 |
-		|                      |       |
-		| * Reachable objects  |       |
-		|   * Count            |   130 |
-		|     * Commits        |    43 |
-		|     * Trees          |    43 |
-		|     * Blobs          |    43 |
-		|     * Tags           |     1 |
+		| Repository structure | Value  |
+		| -------------------- | ------ |
+		| * References         |        |
+		|   * Count            |    4   |
+		|     * Branches       |    1   |
+		|     * Tags           |    1   |
+		|     * Remotes        |    1   |
+		|     * Others         |    1   |
+		|                      |        |
+		| * Reachable objects  |        |
+		|   * Count            | 3.02 k |
+		|     * Commits        | 1.01 k |
+		|     * Trees          | 1.01 k |
+		|     * Blobs          | 1.01 k |
+		|     * Tags           |    1   |
 		EOF
 
 		git repo structure >out 2>err &&

From 3e114496e48e665d2bb9e0c0917e6051d60392ea Mon Sep 17 00:00:00 2001
From: Justin Tobler <jltobler@gmail.com>
Date: Wed, 17 Dec 2025 11:54:01 -0600
Subject: [PATCH 253/553] builtin/repo: add inflated object info to keyvalue
 structure output

The structure subcommand for git-repo(1) outputs basic count information
for objects and references. Extend this output to also provide
information regarding total size of inflated objects by object type.

For now, object size by object type info is only added to the keyvalue
and nul output formats. In a subsequent commit, this info is also added
to the table format.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-repo.adoc |  1 +
 builtin/repo.c              | 33 +++++++++++++++++++++++++++++++++
 t/t1901-repo-structure.sh   |  6 +++++-
 3 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-repo.adoc b/Documentation/git-repo.adoc
index 70f0a6d2e47291..287eee4b93de5a 100644
--- a/Documentation/git-repo.adoc
+++ b/Documentation/git-repo.adoc
@@ -50,6 +50,7 @@ supported:
 +
 * Reference counts categorized by type
 * Reachable object counts categorized by type
+* Total inflated size of reachable objects by type
 
 +
 The output format can be chosen through the flag `--format`. Three formats are
diff --git a/builtin/repo.c b/builtin/repo.c
index 9c61bc3e173a94..8da321a3866077 100644
--- a/builtin/repo.c
+++ b/builtin/repo.c
@@ -2,6 +2,8 @@
 
 #include "builtin.h"
 #include "environment.h"
+#include "hex.h"
+#include "odb.h"
 #include "parse-options.h"
 #include "path-walk.h"
 #include "progress.h"
@@ -211,6 +213,7 @@ struct object_values {
 
 struct object_stats {
 	struct object_values type_counts;
+	struct object_values inflated_sizes;
 };
 
 struct repo_structure {
@@ -423,6 +426,15 @@ static void structure_keyvalue_print(struct repo_structure *stats,
 	printf("objects.tags.count%c%" PRIuMAX "%c", key_delim,
 	       (uintmax_t)stats->objects.type_counts.tags, value_delim);
 
+	printf("objects.commits.inflated_size%c%" PRIuMAX "%c", key_delim,
+	       (uintmax_t)stats->objects.inflated_sizes.commits, value_delim);
+	printf("objects.trees.inflated_size%c%" PRIuMAX "%c", key_delim,
+	       (uintmax_t)stats->objects.inflated_sizes.trees, value_delim);
+	printf("objects.blobs.inflated_size%c%" PRIuMAX "%c", key_delim,
+	       (uintmax_t)stats->objects.inflated_sizes.blobs, value_delim);
+	printf("objects.tags.inflated_size%c%" PRIuMAX "%c", key_delim,
+	       (uintmax_t)stats->objects.inflated_sizes.tags, value_delim);
+
 	fflush(stdout);
 }
 
@@ -486,6 +498,7 @@ static void structure_count_references(struct ref_stats *stats,
 }
 
 struct count_objects_data {
+	struct object_database *odb;
 	struct object_stats *stats;
 	struct progress *progress;
 };
@@ -495,20 +508,39 @@ static int count_objects(const char *path UNUSED, struct oid_array *oids,
 {
 	struct count_objects_data *data = cb_data;
 	struct object_stats *stats = data->stats;
+	size_t inflated_total = 0;
 	size_t object_count;
 
+	for (size_t i = 0; i < oids->nr; i++) {
+		struct object_info oi = OBJECT_INFO_INIT;
+		unsigned long inflated;
+
+		oi.sizep = &inflated;
+
+		if (odb_read_object_info_extended(data->odb, &oids->oid[i], &oi,
+						  OBJECT_INFO_SKIP_FETCH_OBJECT |
+						  OBJECT_INFO_QUICK) < 0)
+			continue;
+
+		inflated_total += inflated;
+	}
+
 	switch (type) {
 	case OBJ_TAG:
 		stats->type_counts.tags += oids->nr;
+		stats->inflated_sizes.tags += inflated_total;
 		break;
 	case OBJ_COMMIT:
 		stats->type_counts.commits += oids->nr;
+		stats->inflated_sizes.commits += inflated_total;
 		break;
 	case OBJ_TREE:
 		stats->type_counts.trees += oids->nr;
+		stats->inflated_sizes.trees += inflated_total;
 		break;
 	case OBJ_BLOB:
 		stats->type_counts.blobs += oids->nr;
+		stats->inflated_sizes.blobs += inflated_total;
 		break;
 	default:
 		BUG("invalid object type");
@@ -526,6 +558,7 @@ static void structure_count_objects(struct object_stats *stats,
 {
 	struct path_walk_info info = PATH_WALK_INFO_INIT;
 	struct count_objects_data data = {
+		.odb = repo->objects,
 		.stats = stats,
 	};
 
diff --git a/t/t1901-repo-structure.sh b/t/t1901-repo-structure.sh
index 55fd13ad1b9c4c..33237822fd551e 100755
--- a/t/t1901-repo-structure.sh
+++ b/t/t1901-repo-structure.sh
@@ -73,7 +73,7 @@ test_expect_success 'repository with references and objects' '
 	)
 '
 
-test_expect_success 'keyvalue and nul format' '
+test_expect_success SHA1 'keyvalue and nul format' '
 	test_when_finished "rm -rf repo" &&
 	git init repo &&
 	(
@@ -90,6 +90,10 @@ test_expect_success 'keyvalue and nul format' '
 		objects.trees.count=42
 		objects.blobs.count=42
 		objects.tags.count=1
+		objects.commits.inflated_size=9225
+		objects.trees.inflated_size=28554
+		objects.blobs.inflated_size=453
+		objects.tags.inflated_size=132
 		EOF
 
 		git repo structure --format=keyvalue >out 2>err &&

From 4d279ae36b1d0f68c8a7ba9b986ff9690ddc1af9 Mon Sep 17 00:00:00 2001
From: Justin Tobler <jltobler@gmail.com>
Date: Wed, 17 Dec 2025 11:54:02 -0600
Subject: [PATCH 254/553] builtin/repo: add inflated object info to structure
 table

Update the table output format for the git-repo(1) structure command to
begin printing the total inflated object size info by object type. To be
more human-friendly, larger values are scaled down and displayed with
the appropriate unit prefix. Output for the keyvalue and nul formats
remains unchanged.
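
As a rough illustration of the scaling described above, here is a
minimal sketch. It is not Git's humanise_bytes(): the helper name
format_size() is hypothetical, and it truncates to two decimals where
Git may round differently.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper for illustration only; NOT Git's humanise_bytes().
 * Formats a byte count with a binary (IEC) unit and two truncated
 * decimals. Values below 1 KiB are printed as an integer followed by
 * the compact "B" symbol.
 */
static int format_size(uint64_t bytes, char *buf, size_t len)
{
	static const char *const units[] = { "B", "KiB", "MiB", "GiB" };
	const size_t nr_units = sizeof(units) / sizeof(units[0]);
	unsigned shift = 0;
	size_t i = 0;
	uint64_t whole, frac;

	assert(buf && len > 0);

	/* Step up one unit per factor of 1024, capped at GiB. */
	while (i + 1 < nr_units && (bytes >> shift) >= 1024) {
		shift += 10;
		i++;
	}

	if (i == 0)
		return snprintf(buf, len, "%llu B", (unsigned long long)bytes);

	whole = bytes >> shift;
	/* Hundredths of the chosen unit, truncated rather than rounded. */
	frac = ((bytes & ((UINT64_C(1) << shift) - 1)) * 100) >> shift;
	return snprintf(buf, len, "%llu.%02llu %s",
			(unsigned long long)whole,
			(unsigned long long)frac, units[i]);
}
```

With this sketch, 132 is formatted as "132 B" and 1536 as "1.50 KiB".
Because it truncates, the last digit can differ from what the table
output of the real command shows.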

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/repo.c            | 33 +++++++++++++++++++--
 strbuf.c                  | 14 +++++----
 strbuf.h                  |  5 ++++
 t/t1901-repo-structure.sh | 62 +++++++++++++++++++++++----------------
 4 files changed, 80 insertions(+), 34 deletions(-)

diff --git a/builtin/repo.c b/builtin/repo.c
index 8da321a3866077..67d7548b8864d3 100644
--- a/builtin/repo.c
+++ b/builtin/repo.c
@@ -292,6 +292,20 @@ static void stats_table_count_addf(struct stats_table *table, size_t value,
 	va_end(ap);
 }
 
+static void stats_table_size_addf(struct stats_table *table, size_t value,
+				  const char *format, ...)
+{
+	struct stats_table_entry *entry;
+	va_list ap;
+
+	CALLOC_ARRAY(entry, 1);
+	humanise_bytes(value, &entry->value, &entry->unit, HUMANISE_COMPACT);
+
+	va_start(ap, format);
+	stats_table_vaddf(table, entry, format, ap);
+	va_end(ap);
+}
+
 static inline size_t get_total_reference_count(struct ref_stats *stats)
 {
 	return stats->branches + stats->remotes + stats->tags + stats->others;
@@ -307,7 +321,8 @@ static void stats_table_setup_structure(struct stats_table *table,
 {
 	struct object_stats *objects = &stats->objects;
 	struct ref_stats *refs = &stats->refs;
-	size_t object_total;
+	size_t inflated_object_total;
+	size_t object_count_total;
 	size_t ref_total;
 
 	ref_total = get_total_reference_count(refs);
@@ -318,10 +333,10 @@ static void stats_table_setup_structure(struct stats_table *table,
 	stats_table_count_addf(table, refs->remotes, "    * %s", _("Remotes"));
 	stats_table_count_addf(table, refs->others, "    * %s", _("Others"));
 
-	object_total = get_total_object_values(&objects->type_counts);
+	object_count_total = get_total_object_values(&objects->type_counts);
 	stats_table_addf(table, "");
 	stats_table_addf(table, "* %s", _("Reachable objects"));
-	stats_table_count_addf(table, object_total, "  * %s", _("Count"));
+	stats_table_count_addf(table, object_count_total, "  * %s", _("Count"));
 	stats_table_count_addf(table, objects->type_counts.commits,
 			       "    * %s", _("Commits"));
 	stats_table_count_addf(table, objects->type_counts.trees,
@@ -330,6 +345,18 @@ static void stats_table_setup_structure(struct stats_table *table,
 			       "    * %s", _("Blobs"));
 	stats_table_count_addf(table, objects->type_counts.tags,
 			       "    * %s", _("Tags"));
+
+	inflated_object_total = get_total_object_values(&objects->inflated_sizes);
+	stats_table_size_addf(table, inflated_object_total,
+			      "  * %s", _("Inflated size"));
+	stats_table_size_addf(table, objects->inflated_sizes.commits,
+			      "    * %s", _("Commits"));
+	stats_table_size_addf(table, objects->inflated_sizes.trees,
+			      "    * %s", _("Trees"));
+	stats_table_size_addf(table, objects->inflated_sizes.blobs,
+			      "    * %s", _("Blobs"));
+	stats_table_size_addf(table, objects->inflated_sizes.tags,
+			      "    * %s", _("Tags"));
 }
 
 static void stats_table_print_structure(const struct stats_table *table)
diff --git a/strbuf.c b/strbuf.c
index 995ff15169f59e..7fb7d12ac0cb9e 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -886,11 +886,15 @@ void humanise_bytes(off_t bytes, char **value, const char **unit,
 		*unit = humanise_rate ? _("KiB/s") : _("KiB");
 	} else {
 		*value = xstrfmt("%u", (unsigned)bytes);
-		*unit = humanise_rate ?
-			       /* TRANSLATORS: IEC 80000-13:2008 byte/second */
-			       Q_("byte/s", "bytes/s", bytes) :
-			       /* TRANSLATORS: IEC 80000-13:2008 byte */
-			       Q_("byte", "bytes", bytes);
+		if (flags & HUMANISE_COMPACT)
+			/* TRANSLATORS: IEC 80000-13:2008 byte/second and byte */
+			*unit = humanise_rate ? _("B/s") : _("B");
+		else
+			*unit = humanise_rate ?
+					/* TRANSLATORS: IEC 80000-13:2008 byte/second */
+					Q_("byte/s", "bytes/s", bytes) :
+					/* TRANSLATORS: IEC 80000-13:2008 byte */
+					Q_("byte", "bytes", bytes);
 	}
 }
 
diff --git a/strbuf.h b/strbuf.h
index 52feef4c1bb0fd..06e284f9cca445 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -372,6 +372,11 @@ enum humanise_flags {
 	 * Use rate based units for humanised values.
 	 */
 	HUMANISE_RATE = (1 << 0),
+	/*
+	 * Use compact "B" unit symbol instead of "byte/bytes" for humanised
+	 * values.
+	 */
+	HUMANISE_COMPACT = (1 << 1),
 };
 
 /**
diff --git a/t/t1901-repo-structure.sh b/t/t1901-repo-structure.sh
index 33237822fd551e..b18213c660ea26 100755
--- a/t/t1901-repo-structure.sh
+++ b/t/t1901-repo-structure.sh
@@ -13,18 +13,23 @@ test_expect_success 'empty repository' '
 		| Repository structure | Value  |
 		| -------------------- | ------ |
 		| * References         |        |
-		|   * Count            |     0  |
-		|     * Branches       |     0  |
-		|     * Tags           |     0  |
-		|     * Remotes        |     0  |
-		|     * Others         |     0  |
+		|   * Count            |    0   |
+		|     * Branches       |    0   |
+		|     * Tags           |    0   |
+		|     * Remotes        |    0   |
+		|     * Others         |    0   |
 		|                      |        |
 		| * Reachable objects  |        |
-		|   * Count            |     0  |
-		|     * Commits        |     0  |
-		|     * Trees          |     0  |
-		|     * Blobs          |     0  |
-		|     * Tags           |     0  |
+		|   * Count            |    0   |
+		|     * Commits        |    0   |
+		|     * Trees          |    0   |
+		|     * Blobs          |    0   |
+		|     * Tags           |    0   |
+		|   * Inflated size    |    0 B |
+		|     * Commits        |    0 B |
+		|     * Trees          |    0 B |
+		|     * Blobs          |    0 B |
+		|     * Tags           |    0 B |
 		EOF
 
 		git repo structure >out 2>err &&
@@ -34,7 +39,7 @@ test_expect_success 'empty repository' '
 	)
 '
 
-test_expect_success 'repository with references and objects' '
+test_expect_success SHA1 'repository with references and objects' '
 	test_when_finished "rm -rf repo" &&
 	git init repo &&
 	(
@@ -49,21 +54,26 @@ test_expect_success 'repository with references and objects' '
 		git notes add -m foo &&
 
 		cat >expect <<-\EOF &&
-		| Repository structure | Value  |
-		| -------------------- | ------ |
-		| * References         |        |
-		|   * Count            |    4   |
-		|     * Branches       |    1   |
-		|     * Tags           |    1   |
-		|     * Remotes        |    1   |
-		|     * Others         |    1   |
-		|                      |        |
-		| * Reachable objects  |        |
-		|   * Count            | 3.02 k |
-		|     * Commits        | 1.01 k |
-		|     * Trees          | 1.01 k |
-		|     * Blobs          | 1.01 k |
-		|     * Tags           |    1   |
+		| Repository structure | Value      |
+		| -------------------- | ---------- |
+		| * References         |            |
+		|   * Count            |      4     |
+		|     * Branches       |      1     |
+		|     * Tags           |      1     |
+		|     * Remotes        |      1     |
+		|     * Others         |      1     |
+		|                      |            |
+		| * Reachable objects  |            |
+		|   * Count            |   3.02 k   |
+		|     * Commits        |   1.01 k   |
+		|     * Trees          |   1.01 k   |
+		|     * Blobs          |   1.01 k   |
+		|     * Tags           |      1     |
+		|   * Inflated size    |  16.03 MiB |
+		|     * Commits        | 217.92 KiB |
+		|     * Trees          |  15.81 MiB |
+		|     * Blobs          |  11.68 KiB |
+		|     * Tags           |    132 B   |
 		EOF
 
 		git repo structure >out 2>err &&

From 67cecc693f511321b9d96eead24fd42e6a5c0cdc Mon Sep 17 00:00:00 2001
From: Justin Tobler <jltobler@gmail.com>
Date: Wed, 17 Dec 2025 11:54:03 -0600
Subject: [PATCH 255/553] builtin/repo: add disk size info to keyvalue structure
 output

Similar to a prior commit, extend the keyvalue and nul output formats of
the git-repo(1) structure command to additionally provide info regarding
total object disk sizes by object type.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-repo.adoc |  1 +
 builtin/repo.c              | 18 ++++++++++++++++++
 t/t1901-repo-structure.sh   | 11 ++++++++++-
 3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-repo.adoc b/Documentation/git-repo.adoc
index 287eee4b93de5a..861073f641e0a3 100644
--- a/Documentation/git-repo.adoc
+++ b/Documentation/git-repo.adoc
@@ -51,6 +51,7 @@ supported:
 * Reference counts categorized by type
 * Reachable object counts categorized by type
 * Total inflated size of reachable objects by type
+* Total disk size of reachable objects by type
 
 +
 The output format can be chosen through the flag `--format`. Three formats are
diff --git a/builtin/repo.c b/builtin/repo.c
index 67d7548b8864d3..7ea051f3aff643 100644
--- a/builtin/repo.c
+++ b/builtin/repo.c
@@ -214,6 +214,7 @@ struct object_values {
 struct object_stats {
 	struct object_values type_counts;
 	struct object_values inflated_sizes;
+	struct object_values disk_sizes;
 };
 
 struct repo_structure {
@@ -462,6 +463,15 @@ static void structure_keyvalue_print(struct repo_structure *stats,
 	printf("objects.tags.inflated_size%c%" PRIuMAX "%c", key_delim,
 	       (uintmax_t)stats->objects.inflated_sizes.tags, value_delim);
 
+	printf("objects.commits.disk_size%c%" PRIuMAX "%c", key_delim,
+	       (uintmax_t)stats->objects.disk_sizes.commits, value_delim);
+	printf("objects.trees.disk_size%c%" PRIuMAX "%c", key_delim,
+	       (uintmax_t)stats->objects.disk_sizes.trees, value_delim);
+	printf("objects.blobs.disk_size%c%" PRIuMAX "%c", key_delim,
+	       (uintmax_t)stats->objects.disk_sizes.blobs, value_delim);
+	printf("objects.tags.disk_size%c%" PRIuMAX "%c", key_delim,
+	       (uintmax_t)stats->objects.disk_sizes.tags, value_delim);
+
 	fflush(stdout);
 }
 
@@ -536,13 +546,16 @@ static int count_objects(const char *path UNUSED, struct oid_array *oids,
 	struct count_objects_data *data = cb_data;
 	struct object_stats *stats = data->stats;
 	size_t inflated_total = 0;
+	size_t disk_total = 0;
 	size_t object_count;
 
 	for (size_t i = 0; i < oids->nr; i++) {
 		struct object_info oi = OBJECT_INFO_INIT;
 		unsigned long inflated;
+		off_t disk;
 
 		oi.sizep = &inflated;
+		oi.disk_sizep = &disk;
 
 		if (odb_read_object_info_extended(data->odb, &oids->oid[i], &oi,
 						  OBJECT_INFO_SKIP_FETCH_OBJECT |
@@ -550,24 +563,29 @@ static int count_objects(const char *path UNUSED, struct oid_array *oids,
 			continue;
 
 		inflated_total += inflated;
+		disk_total += disk;
 	}
 
 	switch (type) {
 	case OBJ_TAG:
 		stats->type_counts.tags += oids->nr;
 		stats->inflated_sizes.tags += inflated_total;
+		stats->disk_sizes.tags += disk_total;
 		break;
 	case OBJ_COMMIT:
 		stats->type_counts.commits += oids->nr;
 		stats->inflated_sizes.commits += inflated_total;
+		stats->disk_sizes.commits += disk_total;
 		break;
 	case OBJ_TREE:
 		stats->type_counts.trees += oids->nr;
 		stats->inflated_sizes.trees += inflated_total;
+		stats->disk_sizes.trees += disk_total;
 		break;
 	case OBJ_BLOB:
 		stats->type_counts.blobs += oids->nr;
 		stats->inflated_sizes.blobs += inflated_total;
+		stats->disk_sizes.blobs += disk_total;
 		break;
 	default:
 		BUG("invalid object type");
diff --git a/t/t1901-repo-structure.sh b/t/t1901-repo-structure.sh
index b18213c660ea26..dd17caad05da67 100755
--- a/t/t1901-repo-structure.sh
+++ b/t/t1901-repo-structure.sh
@@ -4,6 +4,11 @@ test_description='test git repo structure'
 
 . ./test-lib.sh
 
+object_type_disk_usage() {
+	git rev-list --all --objects --disk-usage --filter=object:type=$1 \
+		--filter-provided-objects
+}
+
 test_expect_success 'empty repository' '
 	test_when_finished "rm -rf repo" &&
 	git init repo &&
@@ -91,7 +96,7 @@ test_expect_success SHA1 'keyvalue and nul format' '
 		test_commit_bulk 42 &&
 		git tag -a foo -m bar &&
 
-		cat >expect <<-\EOF &&
+		cat >expect <<-EOF &&
 		references.branches.count=1
 		references.tags.count=1
 		references.remotes.count=0
@@ -104,6 +109,10 @@ test_expect_success SHA1 'keyvalue and nul format' '
 		objects.trees.inflated_size=28554
 		objects.blobs.inflated_size=453
 		objects.tags.inflated_size=132
+		objects.commits.disk_size=$(object_type_disk_usage commit)
+		objects.trees.disk_size=$(object_type_disk_usage tree)
+		objects.blobs.disk_size=$(object_type_disk_usage blob)
+		objects.tags.disk_size=$(object_type_disk_usage tag)
 		EOF
 
 		git repo structure --format=keyvalue >out 2>err &&

From df1b071fedfddc322fa2a5e0f71d23cb05949d6f Mon Sep 17 00:00:00 2001
From: Justin Tobler <jltobler@gmail.com>
Date: Wed, 17 Dec 2025 11:54:04 -0600
Subject: [PATCH 256/553] builtin/repo: add object disk size info to structure
 table

Similar to a prior commit, update the table output format for the
git-repo(1) structure command to display the total object disk usage by
object type.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/repo.c            | 13 +++++++++++++
 t/t1901-repo-structure.sh | 31 ++++++++++++++++++++++++++++---
 2 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/builtin/repo.c b/builtin/repo.c
index 7ea051f3aff643..09bc8fccfd15b5 100644
--- a/builtin/repo.c
+++ b/builtin/repo.c
@@ -324,6 +324,7 @@ static void stats_table_setup_structure(struct stats_table *table,
 	struct ref_stats *refs = &stats->refs;
 	size_t inflated_object_total;
 	size_t object_count_total;
+	size_t disk_object_total;
 	size_t ref_total;
 
 	ref_total = get_total_reference_count(refs);
@@ -358,6 +359,18 @@ static void stats_table_setup_structure(struct stats_table *table,
 			      "    * %s", _("Blobs"));
 	stats_table_size_addf(table, objects->inflated_sizes.tags,
 			      "    * %s", _("Tags"));
+
+	disk_object_total = get_total_object_values(&objects->disk_sizes);
+	stats_table_size_addf(table, disk_object_total,
+			      "  * %s", _("Disk size"));
+	stats_table_size_addf(table, objects->disk_sizes.commits,
+			      "    * %s", _("Commits"));
+	stats_table_size_addf(table, objects->disk_sizes.trees,
+			      "    * %s", _("Trees"));
+	stats_table_size_addf(table, objects->disk_sizes.blobs,
+			      "    * %s", _("Blobs"));
+	stats_table_size_addf(table, objects->disk_sizes.tags,
+			      "    * %s", _("Tags"));
 }
 
 static void stats_table_print_structure(const struct stats_table *table)
diff --git a/t/t1901-repo-structure.sh b/t/t1901-repo-structure.sh
index dd17caad05da67..435fd979fa93b9 100755
--- a/t/t1901-repo-structure.sh
+++ b/t/t1901-repo-structure.sh
@@ -5,8 +5,20 @@ test_description='test git repo structure'
 . ./test-lib.sh
 
 object_type_disk_usage() {
-	git rev-list --all --objects --disk-usage --filter=object:type=$1 \
-		--filter-provided-objects
+	disk_usage_opt="--disk-usage"
+
+	if test "$2" = "true"
+	then
+		disk_usage_opt="--disk-usage=human"
+	fi
+
+	if test "$1" = "all"
+	then
+		git rev-list --all --objects $disk_usage_opt
+	else
+		git rev-list --all --objects $disk_usage_opt \
+			--filter=object:type=$1 --filter-provided-objects
+	fi
 }
 
 test_expect_success 'empty repository' '
@@ -35,6 +47,11 @@ test_expect_success 'empty repository' '
 		|     * Trees          |    0 B |
 		|     * Blobs          |    0 B |
 		|     * Tags           |    0 B |
+		|   * Disk size        |    0 B |
+		|     * Commits        |    0 B |
+		|     * Trees          |    0 B |
+		|     * Blobs          |    0 B |
+		|     * Tags           |    0 B |
 		EOF
 
 		git repo structure >out 2>err &&
@@ -58,7 +75,10 @@ test_expect_success SHA1 'repository with references and objects' '
 		# Also creates a commit, tree, and blob.
 		git notes add -m foo &&
 
-		cat >expect <<-\EOF &&
+		# The tags disk size is handled specially due to the
+		# git-rev-list(1) --disk-usage=human option printing the full
+		# "byte/bytes" unit string instead of just "B".
+		cat >expect <<-EOF &&
 		| Repository structure | Value      |
 		| -------------------- | ---------- |
 		| * References         |            |
@@ -79,6 +99,11 @@ test_expect_success SHA1 'repository with references and objects' '
 		|     * Trees          |  15.81 MiB |
 		|     * Blobs          |  11.68 KiB |
 		|     * Tags           |    132 B   |
+		|   * Disk size        | $(object_type_disk_usage all true) |
+		|     * Commits        | $(object_type_disk_usage commit true) |
+		|     * Trees          | $(object_type_disk_usage tree true) |
+		|     * Blobs          |  $(object_type_disk_usage blob true) |
+		|     * Tags           |    $(object_type_disk_usage tag) B   |
 		EOF
 
 		git repo structure >out 2>err &&

From 1722c2244bc0f5663c53f5dc8fc9ff5b8bf0e523 Mon Sep 17 00:00:00 2001
From: Matthew Hughes <matthewhughes934@gmail.com>
Date: Wed, 17 Dec 2025 19:59:55 +0000
Subject: [PATCH 257/553] docs: note the type of core.attributesfile

The previous wording:

> Path expansions are made the same way as for `core.excludesFile`.

required one to check the docs for 'core.excludesFile' and from there
the definition of the pathname variable type to understand the path
expansion behaviour of this variable. Instead, just link directly to the
pathname type.

This change is basically the same rewording as was done to
'core.excludesFile' in dca83abd (config: describe 'pathname' value
type, 2016-04-29).

Signed-off-by: Matthew Hughes <matthewhughes934@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config/core.adoc | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/Documentation/config/core.adoc b/Documentation/config/core.adoc
index 11efad189e8d72..62b4f5b654dccd 100644
--- a/Documentation/config/core.adoc
+++ b/Documentation/config/core.adoc
@@ -492,10 +492,9 @@ core.askPass::
 	command-line argument and write the password on its STDOUT.
 
 core.attributesFile::
-	In addition to `.gitattributes` (per-directory) and
-	`.git/info/attributes`, Git looks into this file for attributes
-	(see linkgit:gitattributes[5]). Path expansions are made the same
-	way as for `core.excludesFile`. Its default value is
+	Specifies the pathname to the file that contains attributes (see
+	linkgit:gitattributes[5]), in addition to `.gitattributes` (per-directory)
+	and `.git/info/attributes`. Its default value is
 	`$XDG_CONFIG_HOME/git/attributes`. If `$XDG_CONFIG_HOME` is either not
 	set or empty, `$HOME/.config/git/attributes` is used instead.
 

From a650ad996db85b64643970dd7dc5920f989260a0 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Thu, 18 Dec 2025 12:35:40 +0900
Subject: [PATCH 258/553] odb: do not use "blank" substitute for NULL

When various *object_info() functions are given NULL instead of an
extended object info structure by a caller that does not want any
details, the code substitutes a function-scope static blank_oi and
passes it down to the helper functions they use, to avoid handling
NULL specifically.

The ps/object-read-stream topic, which recently graduated to
'master', had a bug: it assumed that two identically named static
variables defined in two different functions are the same object,
which of course is not the case.  This slowed "git commit" down from
0.38 seconds to 1508 seconds in some cases, as reported by Aaron
Plattner here:

  https://lore.kernel.org/git/f4ba7e89-4717-4b36-921f-56537131fd69@nvidia.com/

We _could_ move the blank_oi variable to the global scope in a common
section to fix this regression, but explicitly handling NULL is a
much safer fix.  It also removes the risk that somebody accidentally
writes into blank_oi and dirties its contents, which could make
subsequent calls into the function misbehave.  By explicitly handling
NULL input, we no longer have to worry about it.
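
The pitfall can be reproduced in isolation. The sketch below is
standalone C, not the odb code; struct info and all function names
are made up for illustration. It shows that two static variables with
the same name in two functions are distinct objects, so an address
comparison against the "blank" sentinel never matches across
functions, while a plain NULL check does what was intended.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an "object info" request structure. */
struct info {
	int *sizep;
};

/* Caller side: substitutes its own static "blank" when given NULL. */
static const struct info *substitute_blank(const struct info *oi)
{
	static struct info blank;

	return oi ? oi : &blank;
}

/* Callee side: compares against a *different* object named "blank". */
static int wants_nothing_by_address(const struct info *oi)
{
	static struct info blank;

	return oi == &blank;
}

/* The safer alternative: let NULL itself mean "no details wanted". */
static int wants_nothing_by_null(const struct info *oi)
{
	return oi == NULL;
}

/* Sanity check tying the three helpers together. */
static void check_sentinels(void)
{
	/* The sentinel from one function is never recognised by the other. */
	assert(!wants_nothing_by_address(substitute_blank(NULL)));
	/* Passing NULL through unchanged is recognised reliably. */
	assert(wants_nothing_by_null(NULL));
}
```

In the sketch, the address-based check silently falls through to the
expensive path every time, which is the same shape of failure as the
slowdown described above.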

Reported-by: Aaron Plattner <aplattner@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 object-file.c |  8 ++++----
 odb.c         | 29 +++++++++++++----------------
 packfile.c    |  3 +--
 3 files changed, 18 insertions(+), 22 deletions(-)

diff --git a/object-file.c b/object-file.c
index 12177a7dd707a8..e0cce3a62a827a 100644
--- a/object-file.c
+++ b/object-file.c
@@ -426,7 +426,7 @@ int odb_source_loose_read_object_info(struct odb_source *source,
 	unsigned long size_scratch;
 	enum object_type type_scratch;
 
-	if (oi->delta_base_oid)
+	if (oi && oi->delta_base_oid)
 		oidclr(oi->delta_base_oid, source->odb->repo->hash_algo);
 
 	/*
@@ -437,13 +437,13 @@ int odb_source_loose_read_object_info(struct odb_source *source,
 	 * return value implicitly indicates whether the
 	 * object even exists.
 	 */
-	if (!oi->typep && !oi->sizep && !oi->contentp) {
+	if (!oi || (!oi->typep && !oi->sizep && !oi->contentp)) {
 		struct stat st;
-		if (!oi->disk_sizep && (flags & OBJECT_INFO_QUICK))
+		if ((!oi || !oi->disk_sizep) && (flags & OBJECT_INFO_QUICK))
 			return quick_has_loose(source->loose, oid) ? 0 : -1;
 		if (stat_loose_object(source->loose, oid, &st, &path) < 0)
 			return -1;
-		if (oi->disk_sizep)
+		if (oi && oi->disk_sizep)
 			*oi->disk_sizep = st.st_size;
 		return 0;
 	}
diff --git a/odb.c b/odb.c
index f4cbee4b042d83..85dc21b104d183 100644
--- a/odb.c
+++ b/odb.c
@@ -664,34 +664,31 @@ static int do_oid_object_info_extended(struct object_database *odb,
 				       const struct object_id *oid,
 				       struct object_info *oi, unsigned flags)
 {
-	static struct object_info blank_oi = OBJECT_INFO_INIT;
 	const struct cached_object *co;
 	const struct object_id *real = oid;
 	int already_retried = 0;
 
-
 	if (flags & OBJECT_INFO_LOOKUP_REPLACE)
 		real = lookup_replace_object(odb->repo, oid);
 
 	if (is_null_oid(real))
 		return -1;
 
-	if (!oi)
-		oi = &blank_oi;
-
 	co = find_cached_object(odb, real);
 	if (co) {
-		if (oi->typep)
-			*(oi->typep) = co->type;
-		if (oi->sizep)
-			*(oi->sizep) = co->size;
-		if (oi->disk_sizep)
-			*(oi->disk_sizep) = 0;
-		if (oi->delta_base_oid)
-			oidclr(oi->delta_base_oid, odb->repo->hash_algo);
-		if (oi->contentp)
-			*oi->contentp = xmemdupz(co->buf, co->size);
-		oi->whence = OI_CACHED;
+		if (oi) {
+			if (oi->typep)
+				*(oi->typep) = co->type;
+			if (oi->sizep)
+				*(oi->sizep) = co->size;
+			if (oi->disk_sizep)
+				*(oi->disk_sizep) = 0;
+			if (oi->delta_base_oid)
+				oidclr(oi->delta_base_oid, odb->repo->hash_algo);
+			if (oi->contentp)
+				*oi->contentp = xmemdupz(co->buf, co->size);
+			oi->whence = OI_CACHED;
+		}
 		return 0;
 	}
 
diff --git a/packfile.c b/packfile.c
index 7a16aaa90d0a2f..2aa6135c3a1fe4 100644
--- a/packfile.c
+++ b/packfile.c
@@ -2095,7 +2095,6 @@ int packfile_store_read_object_info(struct packfile_store *store,
 				    struct object_info *oi,
 				    unsigned flags UNUSED)
 {
-	static struct object_info blank_oi = OBJECT_INFO_INIT;
 	struct pack_entry e;
 	int rtype;
 
@@ -2106,7 +2105,7 @@ int packfile_store_read_object_info(struct packfile_store *store,
 	 * We know that the caller doesn't actually need the
 	 * information below, so return early.
 	 */
-	if (oi == &blank_oi)
+	if (!oi)
 		return 0;
 
 	rtype = packed_object_info(store->odb->repo, e.p, e.offset, oi);

From 2c6fc31e04b32d5a8523cfe69e4495f188e86ec3 Mon Sep 17 00:00:00 2001
From: Jeff King <peff@peff.net>
Date: Thu, 18 Dec 2025 07:13:47 -0500
Subject: [PATCH 259/553] t5551: handle trailing slashes in expected cookies
 output

We check in t5551 that curl updates the expected list of cookies after
making a request. We do this by telling it to read and write cookies
from a particular text file, and then checking that after curl runs, the
file has the expected content.

However, in the upcoming curl 8.18.0, the output file has changed
slightly: curl will canonicalize the paths it writes, due to commit
a093c93994 (cookie: only keep and use the canonical cleaned up path,
2025-12-07). In particular, it strips trailing slashes from the paths we
see in the cookies.txt file.

This doesn't matter to Git, as the cookie handling is all internal to
curl. But our test is overly brittle and breaks as a result.

We can fix it by matching either format. We'll expect the new format
(without trailing slashes) and strip the slashes from curl's output
before comparing. That lets us pass with both old and new versions (I
tested against curl's 8_17_0 and rc-8_18_0-2 tags, which are
respectively before and after the curl change).
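
The normalization itself is tiny. The test does it with sed; purely
as an illustration, the same idea can be sketched in C, assuming
tab-separated cookie lines. The function name
strip_slash_before_tab() is hypothetical and is not part of Git or
curl.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: drop the first '/' that is immediately followed
 * by a tab, editing the line in place. Lines without such a sequence
 * (already canonical paths) are left unchanged.
 */
static void strip_slash_before_tab(char *line)
{
	char *p;

	assert(line);

	p = strstr(line, "/\t");
	if (p)
		/* Shift the tab and the rest of the line, including the NUL. */
		memmove(p, p + 1, strlen(p + 1) + 1);
}
```

Applied to a line whose path field is "/smart_cookies/", this leaves
"/smart_cookies"; a line that already lacks the trailing slash passes
through untouched, so output from either curl version compares equal.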

In theory it might be nice to try to future-proof this test more by
looking only for the bits we care about, rather than a byte-wise
comparison of the whole file. But after removing comments and blank
lines (which we already do), we care about most of what's there. So it's
not clear to me what a more liberal test would look like. Given that the
format doesn't change all that often, it's probably OK to stop here and
see if it ever breaks again.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t5551-http-fetch-smart.sh | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index b0d4ea78015a25..73cf5315800fa4 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -333,12 +333,12 @@ test_expect_success 'dumb clone via http-backend respects namespace' '
 
 test_expect_success 'cookies stored in http.cookiefile when http.savecookies set' '
 	cat >cookies.txt <<-\EOF &&
-	127.0.0.1	FALSE	/smart_cookies/	FALSE	0	othername	othervalue
+	127.0.0.1	FALSE	/smart_cookies	FALSE	0	othername	othervalue
 	EOF
 	sort >expect_cookies.txt <<-\EOF &&
-	127.0.0.1	FALSE	/smart_cookies/	FALSE	0	othername	othervalue
-	127.0.0.1	FALSE	/smart_cookies/repo.git/	FALSE	0	name	value
-	127.0.0.1	FALSE	/smart_cookies/repo.git/info/	FALSE	0	name	value
+	127.0.0.1	FALSE	/smart_cookies	FALSE	0	othername	othervalue
+	127.0.0.1	FALSE	/smart_cookies/repo.git	FALSE	0	name	value
+	127.0.0.1	FALSE	/smart_cookies/repo.git/info	FALSE	0	name	value
 	EOF
 	git config http.cookiefile cookies.txt &&
 	git config http.savecookies true &&
@@ -351,8 +351,11 @@ test_expect_success 'cookies stored in http.cookiefile when http.savecookies set
 		tag -m "foo" cookie-tag &&
 	git fetch $HTTPD_URL/smart_cookies/repo.git cookie-tag &&
 
-	grep "^[^#]" cookies.txt | sort >cookies_stripped.txt &&
-	test_cmp expect_cookies.txt cookies_stripped.txt
+	# Strip trailing slashes from cookie paths to handle output from both
+	# old curl ("/smart_cookies/") and new ("/smart_cookies").
+	HT="	" &&
+	grep "^[^#]" cookies.txt | sed "s,/$HT,$HT," | sort >cookies_clean.txt &&
+	test_cmp expect_cookies.txt cookies_clean.txt
 '
 
 test_expect_success 'transfer.hiderefs works over smart-http' '

From 17f4b01da7a4d67d6c22d37904bdbbbddd81b9ac Mon Sep 17 00:00:00 2001
From: Jeff King <peff@peff.net>
Date: Thu, 18 Dec 2025 07:18:19 -0500
Subject: [PATCH 260/553] t5563: add missing end-of-line in HTTP header

In t5563, we test how various oddly-formatted WWW-Authenticate headers
are passed through curl to git's credential subsystem (and ultimately
out to credential helpers). One test, "access using basic auth with
wwwauth header mixed line-endings" does something odd. It does not mix
line endings at all (which must be CRLF according to the RFC anyway),
but omits the line ending entirely for the final header!

This means that the server produces an incomplete response. We send our
final header, and then the newline which is meant to mark the end of
headers (and the start of the body) becomes the line ending for that
header. And there is no header/body separator in the output at all.

Looking at strace, this is what the client reads:

  recvfrom(9, "WWW-Authenticate: FooBar param1=\"value1\"\r\n \r\n\tparam2=\"value2\"\r\nWWW-Authenticate: Basic realm=\"example.com\"", 16384, 0, NULL, NULL) = 106
  recvfrom(9, "\n", 16384, 0, NULL, NULL) = 1
  recvfrom(9, "", 16384, 0, NULL, NULL) = 0

The headers themselves are produced from the custom-auth.challenge file
we write in the test (which is missing the final CRLF), and then the
header/body separator comes from our lib-httpd/nph-custom-auth.sh CGI.
(Ignore for a moment that it is producing a bare newline, which I think
is a bug; it should be a CRLF but curl is happy with either).
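
To illustrate the framing problem, here is a minimal sketch (not part
of the test suite) that checks whether a response carries a
header/body separator at all. The helper name has_header_end is made
up for this illustration; like curl, it accepts either CRLF or a bare
newline.

```shell
# Hypothetical helper: succeed if stdin contains an empty line, i.e. the
# separator that ends the header block. CRs are dropped first so that
# both CRLF and bare-LF line endings are accepted.
has_header_end () {
	tr -d '\r' | grep -q '^$'
}

# Final header without its own line ending: the newline that was meant
# to be the separator merely terminates the header line.
if printf 'WWW-Authenticate: Basic realm="example.com"\n' | has_header_end
then
	echo "truncated: separator found"
else
	echo "truncated: no separator"
fi

# Final header properly terminated, followed by the separator.
if printf 'WWW-Authenticate: Basic realm="example.com"\r\n\n' | has_header_end
then
	echo "fixed: separator found"
else
	echo "fixed: no separator"
fi
```

The first case mirrors the bytes seen in the strace output above; the
second mirrors what the server sends once the challenge file ends in
CRLF.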

Older versions of curl seemed to be OK with the truncated output, but
the upcoming 8.18.0 release seems to get confused. Specifically, since
67ae101666 (http: unfold response headers earlier, 2025-12-12) our
request to the server fails with insufficient credentials. I traced far
enough to see that curl does relay the header back to us, which we then
pass to a credential helper, which gives us the correct
username/password combination. But on our followup request, curl refuses
to send the Authorization header (and so gets an HTTP 401 again).

The change in curl's behavior is a bit unexpected, but since we are
sending it garbage, it is hard to complain too much. Let's add the
missing CRLF to the header. I _think_ this was just an oversight and not
the intent of the test, and that "mixed line-endings" really meant
"mixed continuations", since we differ from the previous test in
continuing with both space and tab. So I've likewise updated the test
title to match that assumption.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t5563-simple-http-auth.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t5563-simple-http-auth.sh b/t/t5563-simple-http-auth.sh
index 317f33af5a7e60..c1febbae9d778b 100755
--- a/t/t5563-simple-http-auth.sh
+++ b/t/t5563-simple-http-auth.sh
@@ -469,7 +469,7 @@ test_expect_success 'access using basic auth with wwwauth header empty continuat
 	EOF
 '
 
-test_expect_success 'access using basic auth with wwwauth header mixed line-endings' '
+test_expect_success 'access using basic auth with wwwauth header mixed continuations' '
 	test_when_finished "per_test_cleanup" &&
 
 	set_credential_reply get <<-EOF &&
@@ -490,7 +490,7 @@ test_expect_success 'access using basic auth with wwwauth header mixed line-endi
 	printf "id=default response=WWW-Authenticate: FooBar param1=\"value1\"\r\n" >>"$CHALLENGE" &&
 	printf "id=default response= \r\n" >>"$CHALLENGE" &&
 	printf "id=default response=\tparam2=\"value2\"\r\n" >>"$CHALLENGE" &&
-	printf "id=default response=WWW-Authenticate: Basic realm=\"example.com\"" >>"$CHALLENGE" &&
+	printf "id=default response=WWW-Authenticate: Basic realm=\"example.com\"\r\n" >>"$CHALLENGE" &&
 
 	test_config_global credential.helper test-helper &&
 	git ls-remote "$HTTPD_URL/custom_auth/repo.git" &&

From 949df6ed6b02f0d52b3509b68b8c9fe27e56cd97 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Thu, 18 Dec 2025 15:20:59 +0000
Subject: [PATCH 261/553] test_detect_ref_format: fix comment

When 58aaf59133b (t: introduce GIT_TEST_DEFAULT_REF_FORMAT envvar,
2023-12-29) copy-edited the `test_detect_hash` function, the code
comment was accidentally left unchanged. Let's adjust it.

Noticed-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/test-lib-functions.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 78e054ab503a65..10bbcea0667db4 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1702,7 +1702,7 @@ test_detect_hash () {
 	esac
 }
 
-# Detect the hash algorithm in use.
+# Detect the ref format in use.
 test_detect_ref_format () {
 	echo "${GIT_TEST_DEFAULT_REF_FORMAT:-files}"
 }

From 12f0be085701472f634a71ffe1416334c4869267 Mon Sep 17 00:00:00 2001
From: Greg Funni <gfunni234@gmail.com>
Date: Thu, 18 Dec 2025 15:49:12 +0000
Subject: [PATCH 262/553] repository: remove duplicate free of
 cache->squash_msg

FREE_AND_NULL() resets the pointer to NULL after freeing it, so the
second call is a harmless free(NULL) with no security consequences.
The duplicate line is still a mistake, so remove it.

Signed-off-by: Greg Funni <gfunni234@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 repository.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/repository.c b/repository.c
index 1a6a62bbd03a5d..faa3fc23932df0 100644
--- a/repository.c
+++ b/repository.c
@@ -352,7 +352,6 @@ int repo_submodule_init(struct repository *subrepo,
 
 static void repo_clear_path_cache(struct repo_path_cache *cache)
 {
-	FREE_AND_NULL(cache->squash_msg);
 	FREE_AND_NULL(cache->squash_msg);
 	FREE_AND_NULL(cache->merge_msg);
 	FREE_AND_NULL(cache->merge_rr);

From 46d0ee2d6996779bf33acb83e36240443e27c79e Mon Sep 17 00:00:00 2001
From: Greg Funni <gfunni234@gmail.com>
Date: Thu, 18 Dec 2025 16:10:49 +0000
Subject: [PATCH 263/553] refs: dereference the value of the required pointer

The trace message tests the `required` pointer itself rather than the
value it points to. Because the pointer is non-NULL, the trace always
prints "yes".

Dereference the pointer so that the trace reports the actual boolean.

Signed-off-by: Greg Funni <gfunni234@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 refs/debug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/refs/debug.c b/refs/debug.c
index 36f8c58b6c781f..f3d1079a2c805c 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -131,7 +131,7 @@ static int debug_optimize_required(struct ref_store *ref_store,
 	struct debug_ref_store *drefs = (struct debug_ref_store *)ref_store;
 	int res = drefs->refs->be->optimize_required(drefs->refs, opts, required);
 	trace_printf_key(&trace_refs, "optimize_required: %s, res: %d\n",
-			 required ? "yes" : "no", res);
+			 *required ? "yes" : "no", res);
 	return res;
 }
 

From c469ca26c588918cfad439636a26fbefa2049b1d Mon Sep 17 00:00:00 2001
From: "D. Ben Knoble" <ben.knoble+github@gmail.com>
Date: Thu, 18 Dec 2025 18:25:44 -0500
Subject: [PATCH 264/553] rust: build correctly without GNU sed

Since e509b5b8be (rust: support for Windows, 2025-10-15), we check
cargo's information to decide which library to build. However, that
check mistakenly used "sed -s" ("consider files as separate rather than
as a single, continuous long stream"), which is a GNU extension. The
build thus fails on macOS with "meson -Drust=enabled", which comes with
BSD-derived sed.

Instead, use the intended "sed -n" and print the matching section of the
output. This failure mode likely went unnoticed on systems with GNU sed
(common for developer machines and CI) because, in those instances, the
output being matched by case is the full cargo output (which either
contains the string "-windows-" or doesn't).
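
As a rough sketch of the difference, the following feeds made-up
"cargo -vV"-style output through the corrected sed invocation. The
sample lines and the helper names sample_cargo_output and host_triple
are assumptions for illustration only; just the "host:" line format is
taken from the script.

```shell
# Hypothetical stand-in for "cargo -vV"; the values are invented.
sample_cargo_output () {
	printf '%s\n' \
		'cargo 0.0.0 (made-up version line)' \
		'host: x86_64-pc-windows-msvc' \
		'os: made-up'
}

# With -n, sed prints nothing by default; the trailing "p" flag prints
# only the line on which the substitution succeeded.
host_triple () {
	sed -n 's/^host: \(.*\)$/\1/p'
}

case "$(sample_cargo_output | host_triple)" in
*-windows-*)
	echo "windows target" ;;
*)
	echo "other target" ;;
esac
```

Without -n and the "p" flag, every input line would be passed through,
which is why the "case" pattern still matched on systems with GNU sed.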

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 src/cargo-meson.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/cargo-meson.sh b/src/cargo-meson.sh
index 3998db04354864..38728a371137f9 100755
--- a/src/cargo-meson.sh
+++ b/src/cargo-meson.sh
@@ -26,7 +26,7 @@ then
 	exit $RET
 fi
 
-case "$(cargo -vV | sed -s 's/^host: \(.*\)$/\1/')" in
+case "$(cargo -vV | sed -n 's/^host: \(.*\)$/\1/p')" in
 	*-windows-*)
 		LIBNAME=gitcore.lib;;
 	*)

From a0c813951afc4bbf5978e67201bccd8d20e9b36b Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Fri, 19 Dec 2025 21:51:01 +0900
Subject: [PATCH 265/553] signoff-option: linkify the reference to gitfaq

The GitFAQ is a proper manual page in section 7, so refer to it
using the usual linkgit:stuff[7] syntax.

Helped-by: Kristoffer Haugsbakk
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/signoff-option.adoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/signoff-option.adoc b/Documentation/signoff-option.adoc
index 9a80d60f1bb1b8..6fc27692573688 100644
--- a/Documentation/signoff-option.adoc
+++ b/Documentation/signoff-option.adoc
@@ -19,4 +19,4 @@ option on the command line.
 +
 Git does not (and will not) have a configuration variable to enable
 the `--signoff` command line option by default; see the
-`commit.signoff` entry in the gitfaq for more details.
+`commit.signoff` entry in linkgit:gitfaq[7] for more details.

From 7796c14a1a4b73869ae6a954ec20bca561783231 Mon Sep 17 00:00:00 2001
From: Sam Bostock <sam.bostock@shopify.com>
Date: Fri, 19 Dec 2025 16:01:46 +0000
Subject: [PATCH 266/553] bundle-uri: validate that bundle entries have a uri

When a bundle list config file has a typo like 'url' instead of 'uri',
or simply omits the uri field, the bundle entry is created but
bundle->uri remains NULL. This causes a segfault when copy_uri_to_file()
passes the NULL to starts_with().

Signed-off-by: Sam Bostock <sam@sambostock.ca>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 bundle-uri.c                | 24 +++++++++++++++++++++++-
 t/t5750-bundle-uri-parse.sh | 26 ++++++++++++++++++++++++++
 2 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/bundle-uri.c b/bundle-uri.c
index 57cccfc6b8ee1f..3b2e347288c3b7 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -89,7 +89,10 @@ static int summarize_bundle(struct remote_bundle_info *info, void *data)
 {
 	FILE *fp = data;
 	fprintf(fp, "[bundle \"%s\"]\n", info->id);
-	fprintf(fp, "\turi = %s\n", info->uri);
+	if (info->uri)
+		fprintf(fp, "\turi = %s\n", info->uri);
+	else
+		fprintf(fp, "\t# uri = (missing)\n");
 
 	if (info->creationToken)
 		fprintf(fp, "\tcreationToken = %"PRIu64"\n", info->creationToken);
@@ -267,6 +270,19 @@ int bundle_uri_parse_config_format(const char *uri,
 		result = 1;
 	}
 
+	if (!result) {
+		struct hashmap_iter iter;
+		struct remote_bundle_info *bundle;
+
+		hashmap_for_each_entry(&list->bundles, &iter, bundle, ent) {
+			if (!bundle->uri) {
+				error(_("bundle list at '%s': bundle '%s' has no uri"),
+				      uri, bundle->id ? bundle->id : "<unknown>");
+				result = 1;
+			}
+		}
+	}
+
 	return result;
 }
 
@@ -751,6 +767,12 @@ static int fetch_bundle_uri_internal(struct repository *r,
 		return -1;
 	}
 
+	if (!bundle->uri) {
+		error(_("bundle '%s' has no uri"),
+		      bundle->id ? bundle->id : "<unknown>");
+		return -1;
+	}
+
 	if (!bundle->file &&
 	    !(bundle->file = find_temp_filename())) {
 		result = -1;
diff --git a/t/t5750-bundle-uri-parse.sh b/t/t5750-bundle-uri-parse.sh
index 80a3f83ffb7e60..294f9d9c6455d4 100755
--- a/t/t5750-bundle-uri-parse.sh
+++ b/t/t5750-bundle-uri-parse.sh
@@ -286,4 +286,30 @@ test_expect_success 'parse config format edge cases: creationToken heuristic' '
 	grep "could not parse bundle list key creationToken with value '\''bogus'\''" err
 '
 
+test_expect_success 'parse config format: bundle with missing uri' '
+	cat >input <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "missing-uri"]
+		creationToken = 1
+	EOF
+
+	test_must_fail test-tool bundle-uri parse-config input 2>err &&
+	grep "bundle '\''missing-uri'\'' has no uri" err
+'
+
+test_expect_success 'parse config format: bundle with url instead of uri' '
+	cat >input <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "typo"]
+		url = https://example.com/bundle.bdl
+	EOF
+
+	test_must_fail test-tool bundle-uri parse-config input 2>err &&
+	grep "bundle '\''typo'\'' has no uri" err
+'
+
 test_done

From b2ff85e12c9b8c9997ac22d2c1e49c5aa9660abd Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jean-No=C3=ABl=20Avila?= <jn.avila@free.fr>
Date: Fri, 19 Dec 2025 18:54:15 +0000
Subject: [PATCH 267/553] doc: fix asciidoc markup issues in several files
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* fix incorrect use of backticks for markup in
  git-checkout.adoc, git-worktree.adoc
* switch tabs to spaces in git-send-email.adoc list items

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-checkout.adoc   |  2 +-
 Documentation/git-send-email.adoc | 14 +++++++-------
 Documentation/git-worktree.adoc   |  2 +-
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/Documentation/git-checkout.adoc b/Documentation/git-checkout.adoc
index 6f281b298efac1..43ccf47cf6de28 100644
--- a/Documentation/git-checkout.adoc
+++ b/Documentation/git-checkout.adoc
@@ -509,7 +509,7 @@ ARGUMENT DISAMBIGUATION
 -----------------------
 
 When you run `git checkout <something>`, Git tries to guess whether
-`<something>` is intended to be a branch, a commit, or a set of file(s),
+_<something>_ is intended to be a branch, a commit, or a set of file(s),
 and then either switches to that branch or commit, or restores the
 specified files.
 
diff --git a/Documentation/git-send-email.adoc b/Documentation/git-send-email.adoc
index 263b977353f334..caf9d693a33f4a 100644
--- a/Documentation/git-send-email.adoc
+++ b/Documentation/git-send-email.adoc
@@ -277,7 +277,7 @@ must be used for each option.
 --smtp-ssl::
 	Legacy alias for `--smtp-encryption ssl`.
 
---smtp-ssl-cert-path::
+--smtp-ssl-cert-path <path>::
 	Path to a store of trusted CA certificates for SMTP SSL/TLS
 	certificate validation (either a directory that has been processed
 	by `c_rehash`, or a single file containing one or more PEM format
@@ -510,12 +510,12 @@ have been specified, in which case default to `compose`.
 	Currently, validation means the following:
 +
 --
-		*	Invoke the sendemail-validate hook if present (see linkgit:githooks[5]).
-		*	Warn of patches that contain lines longer than
-			998 characters unless a suitable transfer encoding
-			(`auto`, `base64`, or `quoted-printable`) is used;
-			this is due to SMTP limits as described by
-			https://www.ietf.org/rfc/rfc5322.txt.
+* Invoke the sendemail-validate hook if present (see linkgit:githooks[5]).
+* Warn of patches that contain lines longer than
+  998 characters unless a suitable transfer encoding
+  (`auto`, `base64`, or `quoted-printable`) is used;
+  this is due to SMTP limits as described by
+  https://www.ietf.org/rfc/rfc5322.txt.
 --
 +
 Default is the value of `sendemail.validate`; if this is not set,
diff --git a/Documentation/git-worktree.adoc b/Documentation/git-worktree.adoc
index f272f797837f45..d74ad7b0e9bd75 100644
--- a/Documentation/git-worktree.adoc
+++ b/Documentation/git-worktree.adoc
@@ -104,7 +104,7 @@ associated with a new unborn branch named _<branch>_ (after
 passed to the command. In the event the repository has a remote and
 `--guess-remote` is used, but no remote or local branches exist, then the
 command fails with a warning reminding the user to fetch from their remote
-first (or override by using `-f/--force`).
+first (or override by using `-f`/`--force`).
 
 `list`::
 

From 8ee262985a1197970cc8938a2fb5c70817d213ab Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jean-No=C3=ABl=20Avila?= <jn.avila@free.fr>
Date: Fri, 19 Dec 2025 18:54:16 +0000
Subject: [PATCH 268/553] doc: correct minor wording issues
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* use imperative mood for consistency in options descriptions
* add missing parenthesis
* reword verbose phrase in git-repack.adoc

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-rebase.adoc     |  2 +-
 Documentation/git-repack.adoc     |  6 +++---
 Documentation/git-send-email.adoc | 12 ++++++------
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/Documentation/git-rebase.adoc b/Documentation/git-rebase.adoc
index 005caf61646ec7..ecc347680c3d63 100644
--- a/Documentation/git-rebase.adoc
+++ b/Documentation/git-rebase.adoc
@@ -87,7 +87,7 @@ of the to-be-rebased branch. However, `ORIG_HEAD` is not guaranteed to still
 point to that commit at the end of the rebase if other commands that change
 `ORIG_HEAD` (like `git reset`) are used during the rebase. The previous branch
 tip, however, is accessible using the reflog of the current branch (i.e. `@{1}`,
-see linkgit:gitrevisions[7].
+see linkgit:gitrevisions[7]).
 
 TRANSPLANTING A TOPIC BRANCH WITH --ONTO
 ----------------------------------------
diff --git a/Documentation/git-repack.adoc b/Documentation/git-repack.adoc
index d12c4985f61c06..673ce91083720d 100644
--- a/Documentation/git-repack.adoc
+++ b/Documentation/git-repack.adoc
@@ -77,14 +77,14 @@ to the new separate pack will be written.
 	Only useful with `--cruft -d`.
 
 --max-cruft-size=<n>::
-	Overrides `--max-pack-size` for cruft packs. Inherits the value of
+	Override `--max-pack-size` for cruft packs. Inherits the value of
 	`--max-pack-size` (if any) by default. See the documentation for
 	`--max-pack-size` for more details.
 
 --combine-cruft-below-size=<n>::
 	When generating cruft packs without pruning, only repack
-	existing cruft packs whose size is strictly less than `<n>`,
-	where `<n>` represents a number of bytes, which can optionally
+	existing cruft packs whose size is strictly less than `<n>`
+	bytes, which can optionally
 	be suffixed with "k", "m", or "g". Cruft packs whose size is
 	greater than or equal to `<n>` are left as-is and not repacked.
 	Useful when you want to avoid repacking large cruft pack(s) in
diff --git a/Documentation/git-send-email.adoc b/Documentation/git-send-email.adoc
index caf9d693a33f4a..cdaf421cda9ca9 100644
--- a/Documentation/git-send-email.adoc
+++ b/Documentation/git-send-email.adoc
@@ -208,7 +208,7 @@ Sending
 	for your own case. Default is the value of `sendemail.smtpEncryption`.
 
 --smtp-domain=<FQDN>::
-	Specifies the Fully Qualified Domain Name (FQDN) used in the
+	Specify the Fully Qualified Domain Name (FQDN) used in the
 	HELO/EHLO command to the SMTP server.  Some servers require the
 	FQDN to match your IP address.  If not set, `git send-email` attempts
 	to determine your FQDN automatically.  Default is the value of
@@ -245,7 +245,7 @@ a password is obtained using linkgit:git-credential[1].
 	Disable SMTP authentication. Short hand for `--smtp-auth=none`.
 
 --smtp-server=<host>::
-	If set, specifies the outgoing SMTP server to use (e.g.
+	Specify the outgoing SMTP server to use (e.g.
 	`smtp.example.com` or a raw IP address).  If unspecified, and if
 	`--sendmail-cmd` is also unspecified, the default is to search
 	for `sendmail` in `/usr/sbin`, `/usr/lib` and `$PATH` if such a
@@ -258,7 +258,7 @@ command names.  For those use cases, consider using `--sendmail-cmd`
 instead.
 
 --smtp-server-port=<port>::
-	Specifies a port different from the default port (SMTP
+	Specify a port different from the default port (SMTP
 	servers typically listen to smtp port 25, but may also listen to
 	submission port 587, or the common SSL smtp port 465);
 	symbolic port names (e.g. `submission` instead of 587)
@@ -266,7 +266,7 @@ instead.
 	`sendemail.smtpServerPort` configuration variable.
 
 --smtp-server-option=<option>::
-	If set, specifies the outgoing SMTP server option to use.
+	Specify the outgoing SMTP server option to use.
 	Default value can be specified by the `sendemail.smtpServerOption`
 	configuration option.
 +
@@ -347,11 +347,11 @@ Automating
 --no-to::
 --no-cc::
 --no-bcc::
-	Clears any list of `To:`, `Cc:`, `Bcc:` addresses previously
+	Clear any list of `To:`, `Cc:`, `Bcc:` addresses previously
 	set via config.
 
 --no-identity::
-	Clears the previously read value of `sendemail.identity` set
+	Clear the previously read value of `sendemail.identity` set
 	via config, if any.
 
 --to-cmd=<command>::

From f53f133d8d456559daa5d06e1e7ddb09b1cc0ae5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jean-No=C3=ABl=20Avila?= <jn.avila@free.fr>
Date: Sat, 20 Dec 2025 19:16:23 +0000
Subject: [PATCH 269/553] doc: fix t0450-txt-doc-vs-help to select only first
 synopsis block
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

When a file contains multiple synopsis blocks (declared with the
[synopsis] or [verse] style), the previous implementation incorrectly
picked up text from every block, each up to its own first empty line.
Change the sed command to quit at the first empty line after the first
block declaration, so that only the first block is captured.
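
A reduced sketch of the idea, keeping only the range and the two
block-handling commands from the real script (the other substitutions
are left out, and the helper name first_synopsis and the sample input
are made up for illustration):

```shell
# Print the body of the first [verse]/[synopsis] block only.
# "q" stops sed at the empty line that closes the first block; because
# of -n, that empty line itself is not printed.
first_synopsis () {
	sed -n -E '/^\[(verse|synopsis)\]$/,/^$/ {
		/^$/q
		/^\[(verse|synopsis)\]$/d
		p
	}'
}

printf '%s\n' \
	'SYNOPSIS' \
	'--------' \
	'[synopsis]' \
	'git foo [<options>]' \
	'' \
	'Some text.' \
	'' \
	'[synopsis]' \
	'git bar <arg>' \
	'' |
first_synopsis
```

With "d" in place of "q", the range would start again at the second
block marker and "git bar <arg>" would be printed as well.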

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t0450-txt-doc-vs-help.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t0450-txt-doc-vs-help.sh b/t/t0450-txt-doc-vs-help.sh
index e12e18f97f02eb..822b0d55a50ae7 100755
--- a/t/t0450-txt-doc-vs-help.sh
+++ b/t/t0450-txt-doc-vs-help.sh
@@ -56,7 +56,7 @@ adoc_to_synopsis () {
 	b2t="$(builtin_to_adoc "$builtin")" &&
 	sed -n \
 		-E '/^\[(verse|synopsis)\]$/,/^$/ {
-			/^$/d;
+			/^$/q;
 			/^\[(verse|synopsis)\]$/d;
 			s/\{litdd\}/--/g;
 			s/'\''(git[ a-z-]*)'\''/\1/g;

From 20e56300d439a7d607de3e824cd98ddaa0f78f2d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jean-No=C3=ABl=20Avila?= <jn.avila@free.fr>
Date: Sat, 20 Dec 2025 19:16:24 +0000
Subject: [PATCH 270/553] doc: convert git-status to synopsis style
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Also convert unformatted lists to proper AsciiDoc lists.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-status.adoc | 182 ++++++++++++++++++----------------
 1 file changed, 95 insertions(+), 87 deletions(-)

diff --git a/Documentation/git-status.adoc b/Documentation/git-status.adoc
index 9a376886a5867a..37b0453898d369 100644
--- a/Documentation/git-status.adoc
+++ b/Documentation/git-status.adoc
@@ -8,8 +8,9 @@ git-status - Show the working tree status
 
 SYNOPSIS
 --------
-[verse]
-'git status' [<options>] [--] [<pathspec>...]
+
+[synopsis]
+git status [<options>] [--] [<pathspec>...]
 
 DESCRIPTION
 -----------
@@ -18,57 +19,57 @@ current HEAD commit, paths that have differences between the working
 tree and the index file, and paths in the working tree that are not
 tracked by Git (and are not ignored by linkgit:gitignore[5]). The first
 are what you _would_ commit by running `git commit`; the second and
-third are what you _could_ commit by running 'git add' before running
+third are what you _could_ commit by running `git add` before running
 `git commit`.
 
 OPTIONS
 -------
 
--s::
---short::
+`-s`::
+`--short`::
 	Give the output in the short-format.
 
--b::
---branch::
+`-b`::
+`--branch`::
 	Show the branch and tracking info even in short-format.
 
---show-stash::
+`--show-stash`::
 	Show the number of entries currently stashed away.
 
---porcelain[=<version>]::
+`--porcelain[=<version>]`::
 	Give the output in an easy-to-parse format for scripts.
 	This is similar to the short output, but will remain stable
 	across Git versions and regardless of user configuration. See
 	below for details.
 +
-The version parameter is used to specify the format version.
-This is optional and defaults to the original version 'v1' format.
+The _<version>_ parameter is used to specify the format version.
+This is optional and defaults to the original version `v1` format.
 
---long::
+`--long`::
 	Give the output in the long-format. This is the default.
 
--v::
---verbose::
+`-v`::
+`--verbose`::
 	In addition to the names of files that have been changed, also
 	show the textual changes that are staged to be committed
 	(i.e., like the output of `git diff --cached`). If `-v` is specified
 	twice, then also show the changes in the working tree that
 	have not yet been staged (i.e., like the output of `git diff`).
 
--u[<mode>]::
---untracked-files[=<mode>]::
+`-u[<mode>]`::
+`--untracked-files[=<mode>]`::
 	Show untracked files.
 +
 --
 The mode parameter is used to specify the handling of untracked files.
-It is optional: it defaults to 'all', and if specified, it must be
+It is optional: it defaults to `all`, and if specified, it must be
 stuck to the option (e.g. `-uno`, but not `-u no`).
 
 The possible options are:
 
-	- 'no'     - Show no untracked files.
-	- 'normal' - Shows untracked files and directories.
-	- 'all'    - Also shows individual files in untracked directories.
+`no`:: Show no untracked files.
+`normal`:: Show untracked files and directories.
+`all`:: Also show individual files in untracked directories.
 
 When `-u` option is not used, untracked files and directories are
 shown (i.e. the same as specifying `normal`), to help you avoid
@@ -82,76 +83,78 @@ return more quickly without showing untracked files.
 All usual spellings for Boolean value `true` are taken as `normal`
 and `false` as `no`.
 
-The default can be changed using the status.showUntrackedFiles
+The default can be changed using the `status.showUntrackedFiles`
 configuration variable documented in linkgit:git-config[1].
 --
 
---ignore-submodules[=<when>]::
-	Ignore changes to submodules when looking for changes. <when> can be
-	either "none", "untracked", "dirty" or "all", which is the default.
-	Using "none" will consider the submodule modified when it either contains
+`--ignore-submodules[=<when>]`::
+	Ignore changes to submodules when looking for changes. _<when>_ can be
+	either `none`, `untracked`, `dirty` or `all`, which is the default.
+`none`;; will consider the submodule modified when it either contains
 	untracked or modified files or its HEAD differs from the commit recorded
 	in the superproject and can be used to override any settings of the
-	'ignore' option in linkgit:git-config[1] or linkgit:gitmodules[5]. When
-	"untracked" is used submodules are not considered dirty when they only
+	`ignore` option in linkgit:git-config[1] or linkgit:gitmodules[5].
+`untracked`;; submodules are not considered dirty when they only
 	contain untracked content (but they are still scanned for modified
-	content). Using "dirty" ignores all changes to the work tree of submodules,
+	content).
+`dirty`;; ignore all changes to the work tree of submodules,
 	only changes to the commits stored in the superproject are shown (this was
-	the behavior before 1.7.0). Using "all" hides all changes to submodules
+	the behavior before 1.7.0).
+`all`;; hide all changes to submodules
 	(and suppresses the output of submodule summaries when the config option
 	`status.submoduleSummary` is set).
 
---ignored[=<mode>]::
+`--ignored[=<mode>]`::
 	Show ignored files as well.
 +
 --
 The mode parameter is used to specify the handling of ignored files.
-It is optional: it defaults to 'traditional'.
+It is optional: it defaults to `traditional`.
 
 The possible options are:
 
-	- 'traditional' - Shows ignored files and directories, unless
-			  --untracked-files=all is specified, in which case
-			  individual files in ignored directories are
-			  displayed.
-	- 'no'	        - Show no ignored files.
-	- 'matching'    - Shows ignored files and directories matching an
-			  ignore pattern.
-
-When 'matching' mode is specified, paths that explicitly match an
+`traditional`:: Show ignored files and directories, unless
+`--untracked-files=all` is specified, in which case
+ individual files in ignored directories are
+ displayed.
+`no`:: Show no ignored files.
+`matching`:: Show ignored files and directories matching an
+ignore pattern.
++
+Paths that explicitly match an
 ignored pattern are shown. If a directory matches an ignore pattern,
 then it is shown, but not paths contained in the ignored directory. If
 a directory does not match an ignore pattern, but all contents are
 ignored, then the directory is not shown, but all contents are shown.
 --
 
--z::
-	Terminate entries with NUL, instead of LF.  This implies
+`-z`::
+	Terminate entries with _NUL_, instead of _LF_.  This implies
 	the `--porcelain=v1` output format if no other format is given.
 
---column[=<options>]::
---no-column::
+`--column[=<options>]`::
+`--no-column`::
 	Display untracked files in columns. See configuration variable
 	`column.status` for option syntax. `--column` and `--no-column`
-	without options are equivalent to 'always' and 'never'
+	without options are equivalent to `always` and `never`
 	respectively.
 
---ahead-behind::
---no-ahead-behind::
+`--ahead-behind`::
+`--no-ahead-behind`::
 	Display or do not display detailed ahead/behind counts for the
-	branch relative to its upstream branch.  Defaults to true.
+	branch relative to its upstream branch.  Defaults to `true`.
 
---renames::
---no-renames::
+`--renames`::
+`--no-renames`::
 	Turn on/off rename detection regardless of user configuration.
 	See also linkgit:git-diff[1] `--no-renames`.
 
---find-renames[=<n>]::
+`--find-renames[=<n>]`::
 	Turn on rename detection, optionally setting the similarity
 	threshold.
 	See also linkgit:git-diff[1] `--find-renames`.
 
-<pathspec>...::
+`<pathspec>...`::
 	See the 'pathspec' entry in linkgit:gitglossary[7].
 
 OUTPUT
@@ -173,12 +176,12 @@ Short Format
 In the short-format, the status of each path is shown as one of these
 forms
 
-	XY PATH
-	XY ORIG_PATH -> PATH
+	<xy> <path>
+	<xy> <orig-path> -> <path>
 
-where `ORIG_PATH` is where the renamed/copied contents came
-from. `ORIG_PATH` is only shown when the entry is renamed or
-copied. The `XY` is a two-letter status code.
+where _<orig-path>_ is where the renamed/copied contents came
+from. _<orig-path>_ is only shown when the entry is renamed or
+copied. The _<xy>_ is a two-letter status code `XY`.
 
 The fields (including the `->`) are separated from each other by a
 single space. If a filename contains whitespace or other nonprintable
@@ -187,7 +190,7 @@ literal: surrounded by ASCII double quote (34) characters, and with
 interior special characters backslash-escaped.
 
 There are three different types of states that are shown using this format, and
-each one uses the `XY` syntax differently:
+each one uses the _<xy>_ syntax differently:
 
 * When a merge is occurring and the merge was successful, or outside of a merge
 	situation, `X` shows the status of the index and `Y` shows the status of the
@@ -207,14 +210,14 @@ In the following table, these three classes are shown in separate sections, and
 these characters are used for `X` and `Y` fields for the first two sections that
 show tracked paths:
 
-* ' ' = unmodified
-* 'M' = modified
-* 'T' = file type changed (regular file, symbolic link or submodule)
-* 'A' = added
-* 'D' = deleted
-* 'R' = renamed
-* 'C' = copied (if config option status.renames is set to "copies")
-* 'U' = updated but unmerged
+' ':: unmodified
+`M`:: modified
+`T`:: file type changed (regular file, symbolic link or submodule)
+`A`:: added
+`D`:: deleted
+`R`:: renamed
+`C`:: copied (if config option status.renames is set to "copies")
+`U`:: updated but unmerged
 
 ....
 X          Y     Meaning
@@ -248,19 +251,21 @@ U           U    unmerged, both modified
 
 Submodules have more state and instead report
 
-* 'M' = the submodule has a different HEAD than recorded in the index
-* 'm' = the submodule has modified content
-* '?' = the submodule has untracked files
+`M`:: the submodule has a different HEAD than recorded in the index
+`m`:: the submodule has modified content
+`?`:: the submodule has untracked files
 
 This is since modified content or untracked files in a submodule cannot be added
 via `git add` in the superproject to prepare a commit.
 
-'m' and '?' are applied recursively. For example if a nested submodule
-in a submodule contains an untracked file, this is reported as '?' as well.
+`m` and `?` are applied recursively. For example if a nested submodule
+in a submodule contains an untracked file, this is reported as `?` as well.
+
+If `-b` is used the short-format status is preceded by a line
 
-If -b is used the short-format status is preceded by a line
+[synopsis]
+{empty}## <branchname> <tracking-info>
 
-    ## branchname tracking info
 
 Porcelain Format Version 1
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -271,16 +276,16 @@ based on user configuration. This makes it ideal for parsing by scripts.
 The description of the short format above also describes the porcelain
 format, with a few exceptions:
 
-1. The user's color.status configuration is not respected; color will
+1. The user's `color.status` configuration is not respected; color will
    always be off.
 
-2. The user's status.relativePaths configuration is not respected; paths
+2. The user's `status.relativePaths` configuration is not respected; paths
    shown will always be relative to the repository root.
 
-There is also an alternate -z format recommended for machine parsing. In
+There is also an alternate `-z` format recommended for machine parsing. In
 that format, the status field is the same, but some other things
-change.  First, the '\->' is omitted from rename entries and the field
-order is reversed (e.g 'from \-> to' becomes 'to from'). Second, a NUL
+change.  First, the `->` is omitted from rename entries and the field
+order is reversed (e.g. `from -> to` becomes `to from`). Second, a _NUL_
 (ASCII 0) follows each filename, replacing space as a field separator
 and the terminating newline (but a space still separates the status
 field from the first filename).  Third, filenames containing special
@@ -296,7 +301,7 @@ Version 2 format adds more detailed information about the state of
 the worktree and changed items.  Version 2 also defines an extensible
 set of easy to parse optional headers.
 
-Header lines start with "#" and are added in response to specific
+Header lines start with `#` and are added in response to specific
 command line arguments.  Parsers should ignore headers they
 don't recognize.
 
@@ -336,11 +341,13 @@ line types in any order.
 
 Ordinary changed entries have the following format:
 
-    1 <XY> <sub> <mH> <mI> <mW> <hH> <hI> <path>
+[synopsis]
+1 <XY> <sub> <mH> <mI> <mW> <hH> <hI> <path>
 
 Renamed or copied entries have the following format:
 
-    2 <XY> <sub> <mH> <mI> <mW> <hH> <hI> <X><score> <path><sep><origPath>
+[synopsis]
+2 <XY> <sub> <mH> <mI> <mW> <hH> <hI> <X><score> <path><sep><origPath>
 
 ....
 Field       Meaning
@@ -377,7 +384,8 @@ Field       Meaning
 Unmerged entries have the following format; the first character is
 a "u" to distinguish from ordinary changed entries.
 
-    u <XY> <sub> <m1> <m2> <m3> <mW> <h1> <h2> <h3> <path>
+[synopsis]
+u <XY> <sub> <m1> <m2> <m3> <mW> <h1> <h2> <h3> <path>
 
 ....
 Field       Meaning
@@ -416,7 +424,7 @@ Pathname Format Notes and -z
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 When the `-z` option is given, pathnames are printed as is and
-without any quoting and lines are terminated with a NUL (ASCII 0x00)
+without any quoting and lines are terminated with a _NUL_ (ASCII 0x00)
 byte.
 
 Without the `-z` option, pathnames with "unusual" characters are
@@ -439,11 +447,11 @@ directory.
 If `status.submoduleSummary` is set to a non zero number or true (identical
 to -1 or an unlimited number), the submodule summary will be enabled for
 the long format and a summary of commits for modified submodules will be
-shown (see --summary-limit option of linkgit:git-submodule[1]). Please note
+shown (see `--summary-limit` option of linkgit:git-submodule[1]). Please note
 that the summary output from the status command will be suppressed for all
-submodules when `diff.ignoreSubmodules` is set to 'all' or only for those
+submodules when `diff.ignoreSubmodules` is set to `all` or only for those
 submodules where `submodule.<name>.ignore=all`. To also view the summary for
-ignored submodules you can either use the --ignore-submodules=dirty command
+ignored submodules you can either use the `--ignore-submodules=dirty` command
 line option or the 'git submodule summary' command, which shows a similar
 output but does not honor these settings.
 
@@ -484,7 +492,7 @@ results, so it could be faster on subsequent runs.
 	setting this variable to `false` disables the warning message
 	given when enumerating untracked files takes more than 2
 	seconds.  In a large project, it may take longer and the user
-	may have already accepted the trade off (e.g. using "-uno" may
+	may have already accepted the trade off (e.g. using `-uno` may
 	not be an acceptable option for the user), in which case, there
 	is no point issuing the warning message, and in such a case,
 	disabling the warning may be the best.

From ead7aae0e457bb0acbf20409b6cf36a89480e8b2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jean-No=C3=ABl=20Avila?= <jn.avila@free.fr>
Date: Sat, 20 Dec 2025 19:16:25 +0000
Subject: [PATCH 271/553] doc: convert git-status tables to AsciiDoc format
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Instead of plain text tables with hand formatting, take advantage of
AsciiDoc's table syntax to let the renderer do the heavy lifting and
make the tables more maintainable and translatable.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-status.adoc | 170 +++++++++++++++++-----------------
 1 file changed, 85 insertions(+), 85 deletions(-)

diff --git a/Documentation/git-status.adoc b/Documentation/git-status.adoc
index 37b0453898d369..9acca52bfb1abd 100644
--- a/Documentation/git-status.adoc
+++ b/Documentation/git-status.adoc
@@ -219,35 +219,32 @@ show tracked paths:
 `C`:: copied (if config option status.renames is set to "copies")
 `U`:: updated but unmerged
 
-....
-X          Y     Meaning
--------------------------------------------------
-	 [AMD]   not updated
-M        [ MTD]  updated in index
-T        [ MTD]  type changed in index
-A        [ MTD]  added to index
-D                deleted from index
-R        [ MTD]  renamed in index
-C        [ MTD]  copied in index
-[MTARC]          index and work tree matches
-[ MTARC]    M    work tree changed since index
-[ MTARC]    T    type changed in work tree since index
-[ MTARC]    D    deleted in work tree
-	    R    renamed in work tree
-	    C    copied in work tree
--------------------------------------------------
-D           D    unmerged, both deleted
-A           U    unmerged, added by us
-U           D    unmerged, deleted by them
-U           A    unmerged, added by them
-D           U    unmerged, deleted by us
-A           A    unmerged, both added
-U           U    unmerged, both modified
--------------------------------------------------
-?           ?    untracked
-!           !    ignored
--------------------------------------------------
-....
+[cols="^1m,^1m,<2",options="header"]
+|===
+|X        |  Y     |Meaning
+|         |[AMD]   |not updated
+|M        |[ MTD]  |updated in index
+|T        |[ MTD]  |type changed in index
+|A        |[ MTD]  |added to index
+|D        |        |deleted from index
+|R        |[ MTD]  |renamed in index
+|C        |[ MTD]  |copied in index
+|[MTARC]  |        |index and work tree matches
+|[ MTARC] |M       |work tree changed since index
+|[ MTARC] |T       |type changed in work tree since index
+|[ MTARC] |D       |deleted in work tree
+|         |R       |renamed in work tree
+|         |C       |copied in work tree
+|D        |D       |unmerged, both deleted
+|A        |U       |unmerged, added by us
+|U        |D       |unmerged, deleted by them
+|U        |A       |unmerged, added by them
+|D        |U       |unmerged, deleted by us
+|A        |A       |unmerged, both added
+|U        |U       |unmerged, both modified
+|?        |?       |untracked
+|!        |!       |ignored
+|===
 
 Submodules have more state and instead report
 
@@ -311,16 +308,15 @@ Branch Headers
 If `--branch` is given, a series of header lines are printed with
 information about the current branch.
 
-....
-Line                                     Notes
-------------------------------------------------------------
-# branch.oid <commit> | (initial)        Current commit.
-# branch.head <branch> | (detached)      Current branch.
-# branch.upstream <upstream-branch>      If upstream is set.
-# branch.ab +<ahead> -<behind>           If upstream is set and
-					 the commit is present.
-------------------------------------------------------------
-....
+[cols="<1,<1",options="header"]
+|===
+|Line                                     |Notes
+|`# branch.oid <commit> \| (initial)`     |Current commit.
+|`# branch.head <branch> \| (detached)`   |Current branch.
+|`# branch.upstream <upstream-branch>`    |If upstream is set.
+|`# branch.ab +<ahead> -<behind>`         |If upstream is set and
+					  the commit is present.
+|===
 
 Stash Information
 ^^^^^^^^^^^^^^^^^
@@ -349,37 +345,42 @@ Renamed or copied entries have the following format:
 [synopsis]
 2 <XY> <sub> <mH> <mI> <mW> <hH> <hI> <X><score> <path><sep><origPath>
 
-....
-Field       Meaning
---------------------------------------------------------
-<XY>        A 2 character field containing the staged and
-	    unstaged XY values described in the short format,
-	    with unchanged indicated by a "." rather than
-	    a space.
-<sub>       A 4 character field describing the submodule state.
-	    "N..." when the entry is not a submodule.
-	    "S<c><m><u>" when the entry is a submodule.
-	    <c> is "C" if the commit changed; otherwise ".".
-	    <m> is "M" if it has tracked changes; otherwise ".".
-	    <u> is "U" if there are untracked changes; otherwise ".".
-<mH>        The octal file mode in HEAD.
-<mI>        The octal file mode in the index.
-<mW>        The octal file mode in the worktree.
-<hH>        The object name in HEAD.
-<hI>        The object name in the index.
-<X><score>  The rename or copy score (denoting the percentage
-	    of similarity between the source and target of the
-	    move or copy). For example "R100" or "C75".
-<path>      The pathname.  In a renamed/copied entry, this
-	    is the target path.
-<sep>       When the `-z` option is used, the 2 pathnames are separated
-	    with a NUL (ASCII 0x00) byte; otherwise, a tab (ASCII 0x09)
-	    byte separates them.
-<origPath>  The pathname in the commit at HEAD or in the index.
-	    This is only present in a renamed/copied entry, and
-	    tells where the renamed/copied contents came from.
---------------------------------------------------------
-....
+[cols="<1,<1a",options="header"]
+|===
+|Field       | Meaning
+
+|_<XY>_
+|A 2 character field containing the staged and
+unstaged XY values described in the short format,
+with unchanged indicated by a "." rather than
+a space.
+|_<sub>_
+|A 4 character field describing the submodule state.
+"N..." when the entry is not a submodule.
+`S<c><m><u>` when the entry is a submodule.
+
+* _<c>_ is "C" if the commit changed; otherwise ".".
+* _<m>_ is "M" if it has tracked changes; otherwise ".".
+* _<u>_ is "U" if there are untracked changes; otherwise ".".
+|_<mH>_       |The octal file mode in HEAD.
+|_<mI>_       |The octal file mode in the index.
+|_<mW>_       |The octal file mode in the worktree.
+|_<hH>_       |The object name in HEAD.
+|_<hI>_       |The object name in the index.
+|_<X><score>_ |The rename or copy score (denoting the percentage
+of similarity between the source and target of the
+move or copy). For example "R100" or "C75".
+|_<path>_
+|The pathname.  In a renamed/copied entry, this is the target path.
+|_<sep>_
+|When the `-z` option is used, the 2 pathnames are separated
+with a _NUL_ (ASCII 0x00) byte; otherwise, a _TAB_ (ASCII 0x09)
+byte separates them.
+|_<origPath>_
+|The pathname in the commit at HEAD or in the index.
+This is only present in a renamed/copied entry, and
+tells where the renamed/copied contents came from.
+|===
 
 Unmerged entries have the following format; the first character is
 a "u" to distinguish from ordinary changed entries.
@@ -387,23 +388,22 @@ a "u" to distinguish from ordinary changed entries.
 [synopsis]
 u <XY> <sub> <m1> <m2> <m3> <mW> <h1> <h2> <h3> <path>
 
-....
-Field       Meaning
---------------------------------------------------------
-<XY>        A 2 character field describing the conflict type
+[cols="<1,<1a",options="header"]
+|===
+|Field       |Meaning
+|_<XY>_      |A 2 character field describing the conflict type
 	    as described in the short format.
-<sub>       A 4 character field describing the submodule state
+|_<sub>_     |A 4 character field describing the submodule state
 	    as described above.
-<m1>        The octal file mode in stage 1.
-<m2>        The octal file mode in stage 2.
-<m3>        The octal file mode in stage 3.
-<mW>        The octal file mode in the worktree.
-<h1>        The object name in stage 1.
-<h2>        The object name in stage 2.
-<h3>        The object name in stage 3.
-<path>      The pathname.
---------------------------------------------------------
-....
+|_<m1>_      |The octal file mode in stage 1.
+|_<m2>_      |The octal file mode in stage 2.
+|_<m3>_      |The octal file mode in stage 3.
+|_<mW>_      |The octal file mode in the worktree.
+|_<h1>_      |The object name in stage 1.
+|_<h2>_      |The object name in stage 2.
+|_<h3>_      |The object name in stage 3.
+|_<path>_    |The pathname.
+|===
 
 Other Items
 ^^^^^^^^^^^

From 5b35e736ddc5b1cad4a4e5e18a546b2f9c8f80d3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jean-No=C3=ABl=20Avila?= <jn.avila@free.fr>
Date: Sat, 20 Dec 2025 19:16:26 +0000
Subject: [PATCH 272/553] doc: convert git stage to use synopsis block
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-stage.adoc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-stage.adoc b/Documentation/git-stage.adoc
index 2f6aaa75b9a3f9..753a8176390130 100644
--- a/Documentation/git-stage.adoc
+++ b/Documentation/git-stage.adoc
@@ -8,8 +8,8 @@ git-stage - Add file contents to the staging area
 
 SYNOPSIS
 --------
-[verse]
-'git stage' <arg>...
+[synopsis]
+git stage <arg>...
 
 
 DESCRIPTION

From acffc5e9e54f5632abefe53d0914007709d409ce Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jean-No=C3=ABl=20Avila?= <jn.avila@free.fr>
Date: Sat, 20 Dec 2025 19:16:27 +0000
Subject: [PATCH 273/553] doc: convert git-remote to synopsis style
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Switch the synopsis to a synopsis block which will automatically
  format placeholders in italics and keywords in monospace
- Use _<placeholder>_ instead of <placeholder> in the description
- Use `backticks` for keywords and more complex option
  descriptions. The new rendering engine will apply synopsis rules to
  these spans.
- Also convert first sentences to imperative mood where applicable

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-remote.adoc | 106 +++++++++++++++++-----------------
 1 file changed, 53 insertions(+), 53 deletions(-)

diff --git a/Documentation/git-remote.adoc b/Documentation/git-remote.adoc
index 932a5c3ea4741c..eaae30aa88cc6e 100644
--- a/Documentation/git-remote.adoc
+++ b/Documentation/git-remote.adoc
@@ -8,20 +8,20 @@ git-remote - Manage set of tracked repositories
 
 SYNOPSIS
 --------
-[verse]
-'git remote' [-v | --verbose]
-'git remote add' [-t <branch>] [-m <master>] [-f] [--[no-]tags] [--mirror=(fetch|push)] <name> <URL>
-'git remote rename' [--[no-]progress] <old> <new>
-'git remote remove' <name>
-'git remote set-head' <name> (-a | --auto | -d | --delete | <branch>)
-'git remote set-branches' [--add] <name> <branch>...
-'git remote get-url' [--push] [--all] <name>
-'git remote set-url' [--push] <name> <newurl> [<oldurl>]
-'git remote set-url --add' [--push] <name> <newurl>
-'git remote set-url --delete' [--push] <name> <URL>
-'git remote' [-v | --verbose] 'show' [-n] <name>...
-'git remote prune' [-n | --dry-run] <name>...
-'git remote' [-v | --verbose] 'update' [-p | --prune] [(<group> | <remote>)...]
+[synopsis]
+git remote [-v | --verbose]
+git remote add [-t <branch>] [-m <master>] [-f] [--[no-]tags] [--mirror=(fetch|push)] <name> <URL>
+git remote rename [--[no-]progress] <old> <new>
+git remote remove <name>
+git remote set-head <name> (-a | --auto | -d | --delete | <branch>)
+git remote set-branches [--add] <name> <branch>...
+git remote get-url [--push] [--all] <name>
+git remote set-url [--push] <name> <newurl> [<oldurl>]
+git remote set-url --add [--push] <name> <newurl>
+git remote set-url --delete [--push] <name> <URL>
+git remote [-v | --verbose] show [-n] <name>...
+git remote prune [-n | --dry-run] <name>...
+git remote [-v | --verbose] update [-p | --prune] [(<group> | <remote>)...]
 
 DESCRIPTION
 -----------
@@ -32,8 +32,8 @@ Manage the set of repositories ("remotes") whose branches you track.
 OPTIONS
 -------
 
--v::
---verbose::
+`-v`::
+`--verbose`::
 	Be a little more verbose and show remote url after name.
 	For promisor remotes, also show which filters (`blob:none` etc.)
 	are configured.
@@ -43,14 +43,14 @@ OPTIONS
 COMMANDS
 --------
 
-With no arguments, shows a list of existing remotes.  Several
+With no arguments, show a list of existing remotes.  Several
 subcommands are available to perform operations on the remotes.
 
-'add'::
+`add`::
 
-Add a remote named <name> for the repository at
-<URL>.  The command `git fetch <name>` can then be used to create and
-update remote-tracking branches <name>/<branch>.
+Add a remote named _<name>_ for the repository at
+_<URL>_.  The command `git fetch <name>` can then be used to create and
+update remote-tracking branches `<name>/<branch>`.
 +
 With `-f` option, `git fetch <name>` is run immediately after
 the remote information is set up.
@@ -66,40 +66,40 @@ By default, only tags on fetched branches are imported
 +
 With `-t <branch>` option, instead of the default glob
 refspec for the remote to track all branches under
-the `refs/remotes/<name>/` namespace, a refspec to track only `<branch>`
+the `refs/remotes/<name>/` namespace, a refspec to track only _<branch>_
 is created.  You can give more than one `-t <branch>` to track
 multiple branches without grabbing all branches.
 +
 With `-m <master>` option, a symbolic-ref `refs/remotes/<name>/HEAD` is set
-up to point at remote's `<master>` branch. See also the set-head command.
+up to point at remote's _<master>_ branch. See also the set-head command.
 +
 When a fetch mirror is created with `--mirror=fetch`, the refs will not
-be stored in the 'refs/remotes/' namespace, but rather everything in
-'refs/' on the remote will be directly mirrored into 'refs/' in the
+be stored in the `refs/remotes/` namespace, but rather everything in
+`refs/` on the remote will be directly mirrored into `refs/` in the
 local repository. This option only makes sense in bare repositories,
 because a fetch would overwrite any local commits.
 +
 When a push mirror is created with `--mirror=push`, then `git push`
 will always behave as if `--mirror` was passed.
 
-'rename'::
+`rename`::
 
-Rename the remote named <old> to <new>. All remote-tracking branches and
+Rename the remote named _<old>_ to _<new>_. All remote-tracking branches and
 configuration settings for the remote are updated.
 +
-In case <old> and <new> are the same, and <old> is a file under
+In case _<old>_ and _<new>_ are the same, and _<old>_ is a file under
 `$GIT_DIR/remotes` or `$GIT_DIR/branches`, the remote is converted to
 the configuration file format.
 
-'remove'::
-'rm'::
+`remove`::
+`rm`::
 
-Remove the remote named <name>. All remote-tracking branches and
+Remove the remote named _<name>_. All remote-tracking branches and
 configuration settings for the remote are removed.
 
-'set-head'::
+`set-head`::
 
-Sets or deletes the default branch (i.e. the target of the
+Set or delete the default branch (i.e. the target of the
 symbolic-ref `refs/remotes/<name>/HEAD`) for
 the named remote. Having a default branch for a remote is not required,
 but allows the name of the remote to be specified in lieu of a specific
@@ -116,15 +116,15 @@ the symbolic-ref `refs/remotes/origin/HEAD` to `refs/remotes/origin/next`. This
 only work if `refs/remotes/origin/next` already exists; if not it must be
 fetched first.
 +
-Use `<branch>` to set the symbolic-ref `refs/remotes/<name>/HEAD` explicitly. e.g., `git
+Use _<branch>_ to set the symbolic-ref `refs/remotes/<name>/HEAD` explicitly. e.g., `git
 remote set-head origin master` will set the symbolic-ref `refs/remotes/origin/HEAD` to
 `refs/remotes/origin/master`. This will only work if
 `refs/remotes/origin/master` already exists; if not it must be fetched first.
 +
 
-'set-branches'::
+`set-branches`::
 
-Changes the list of branches tracked by the named remote.
+Change the list of branches tracked by the named remote.
 This can be used to track a subset of the available remote branches
 after the initial setup for a remote.
 +
@@ -134,7 +134,7 @@ The named branches will be interpreted as if specified with the
 With `--add`, instead of replacing the list of currently tracked
 branches, adds to that list.
 
-'get-url'::
+`get-url`::
 
 Retrieves the URLs for a remote. Configurations for `insteadOf` and
 `pushInsteadOf` are expanded here. By default, only the first URL is listed.
@@ -143,18 +143,18 @@ With `--push`, push URLs are queried rather than fetch URLs.
 +
 With `--all`, all URLs for the remote will be listed.
 
-'set-url'::
+`set-url`::
 
-Changes URLs for the remote. Sets first URL for remote <name> that matches
-regex <oldurl> (first URL if no <oldurl> is given) to <newurl>. If
-<oldurl> doesn't match any URL, an error occurs and nothing is changed.
+Change URLs for the remote. Set the first URL for remote _<name>_ that matches
+regex _<oldurl>_ (first URL if no _<oldurl>_ is given) to _<newurl>_. If
+_<oldurl>_ doesn't match any URL, an error occurs and nothing is changed.
 +
 With `--push`, push URLs are manipulated instead of fetch URLs.
 +
 With `--add`, instead of changing existing URLs, new URL is added.
 +
 With `--delete`, instead of changing existing URLs, all URLs matching
-regex <URL> are deleted for remote <name>.  Trying to delete all
+regex _<URL>_ are deleted for remote _<name>_.  Trying to delete all
 non-push URLs is an error.
 +
 Note that the push URL and the fetch URL, even though they can
@@ -165,17 +165,17 @@ fetch from one place (e.g. your upstream) and push to another (e.g.
 your publishing repository), use two separate remotes.
 
 
-'show'::
+`show`::
 
-Gives some information about the remote <name>.
+Give some information about the remote _<name>_.
 +
 With `-n` option, the remote heads are not queried first with
 `git ls-remote <name>`; cached information is used instead.
 
-'prune'::
+`prune`::
 
-Deletes stale references associated with <name>. By default, stale
-remote-tracking branches under <name> are deleted, but depending on
+Delete stale references associated with _<name>_. By default, stale
+remote-tracking branches under _<name>_ are deleted, but depending on
 global configuration and the configuration of the remote we might even
 prune local tags that haven't been pushed there. Equivalent to `git
 fetch --prune <name>`, except that no new references will be fetched.
@@ -186,13 +186,13 @@ depending on various configuration.
 With `--dry-run` option, report what branches would be pruned, but do not
 actually prune them.
 
-'update'::
+`update`::
 
 Fetch updates for remotes or remote groups in the repository as defined by
 `remotes.<group>`. If neither group nor remote is specified on the command line,
-the configuration parameter remotes.default will be used; if
-remotes.default is not defined, all remotes which do not have the
-configuration parameter `remote.<name>.skipDefaultUpdate` set to true will
+the configuration parameter `remotes.default` will be used; if
+`remotes.default` is not defined, all remotes which do not have the
+configuration parameter `remote.<name>.skipDefaultUpdate` set to `true` will
 be updated.  (See linkgit:git-config[1]).
 +
 With `--prune` option, run pruning against all the remotes that are updated.
@@ -210,7 +210,7 @@ EXIT STATUS
 
 On success, the exit status is `0`.
 
-When subcommands such as 'add', 'rename', and 'remove' can't find the
+When subcommands such as `add`, `rename`, and `remove` can't find the
 remote in question, the exit status is `2`. When the remote already
 exists, the exit status is `3`.
 
@@ -247,7 +247,7 @@ $ git switch -c staging staging/master
 ...
 ------------
 
-* Imitate 'git clone' but track only selected branches
+* Imitate `git clone` but track only selected branches
 +
 ------------
 $ mkdir project.git

From c8d76f7325e75c6f0549fce29ea4f3d97eb079cb Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Mon, 22 Dec 2025 13:46:36 +0900
Subject: [PATCH 274/553] The 11th batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/RelNotes/2.53.0.adoc | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc
index 34216a59fe5fe6..e94a79516c2138 100644
--- a/Documentation/RelNotes/2.53.0.adoc
+++ b/Documentation/RelNotes/2.53.0.adoc
@@ -188,3 +188,6 @@ Fixes since v2.52
    (merge 8cbbdc92f7 kh/doc-pre-commit-fix later to maint).
    (merge d4bc39a4d9 mh/doc-config-gui-gcwarning later to maint).
    (merge 41d425008a kh/doc-send-email-paragraph-fix later to maint).
+   (merge d4b732899e jc/macports-darwinports later to maint).
+   (merge bab391761d kj/pull-options-decl-cleanup later to maint).
+   (merge 007b8994d4 rs/t4014-git-version-string-fix later to maint).

From 66ce5f8e8872f0183bb137911c52b07f1f242d13 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Tue, 23 Dec 2025 10:37:41 +0900
Subject: [PATCH 275/553] The 12th batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/RelNotes/2.53.0.adoc | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc
index e94a79516c2138..88d24f6d4dde4c 100644
--- a/Documentation/RelNotes/2.53.0.adoc
+++ b/Documentation/RelNotes/2.53.0.adoc
@@ -67,6 +67,17 @@ Performance, Internal Implementation, Development Support etc.
    and Stop using the insecure "mktemp()" function.
    (merge 10bba537c4 rs/ban-mktemp later to maint).
 
+ * In-code comment update to clarify that single-letter options are
+   outside of the scope of command line completion script.
+   (merge dc8a00fafe jc/completion-no-single-letter-options later to maint).
+
+ * MEMZERO_ARRAY() helper is introduced to avoid clearing only the
+   first N bytes of an N-element array whose elements are larger than
+   a byte.
+
+ * "git diff-files -R --find-copies-harder" has been taught to use
+   the potential copy sources from the index correctly.
+
 
 Fixes since v2.52
 -----------------
@@ -177,6 +188,17 @@ Fixes since v2.52
  * Emulation code clean-up.
    (merge 42aa7603aa gf/win32-pthread-cond-init later to maint).
 
+ * "git submodule add" to add a submodule under <name> segfaulted,
+   when a submodule.<name>.something is already in .gitmodules file
+   without defining where its submodule.<name>.path is, which has been
+   corrected.
+   (merge dd8e8c786e jc/submodule-add later to maint).
+
+ * "git fetch" that involves fetching tags, when a tag being fetched
+   needs to overwrite existing one, failed to fetch other tags, which
+   has been corrected.
+   (merge b7b17ec8a6 kn/fix-fetch-backfill-tag-with-batched-ref-updates later to maint).
+
  * Other code cleanup, docfix, build fix, etc.
    (merge 46207a54cc qj/doc-http-bad-want-response later to maint).
    (merge df90eccd93 kh/doc-commit-extra-references later to maint).
@@ -191,3 +213,4 @@ Fixes since v2.52
    (merge d4b732899e jc/macports-darwinports later to maint).
    (merge bab391761d kj/pull-options-decl-cleanup later to maint).
    (merge 007b8994d4 rs/t4014-git-version-string-fix later to maint).
+   (merge 4ce170c522 ds/doc-scalar-config later to maint).

From 93f894c0012188a5d2b484ccf88a02692355d480 Mon Sep 17 00:00:00 2001
From: "brian m. carlson" <sandals@crustytoothpaste.net>
Date: Wed, 24 Dec 2025 20:32:53 +0000
Subject: [PATCH 276/553] checkout: quote invalid treeish in error message

We received a report that invoking "git restore -source my_base_branch"
resulted in the confusing error message "fatal: could not resolve
ource".  This looked like a typo in our error message, but it is
actually because "-source" is missing its second dash and is being
resolved as "-s ource".  However, due to the lack of the quoting
recommended in CodingGuidelines, this is confusing to the reader and
we can do better.

Add the necessary quoting to this message.  With this change, we now get
this less confusing message:

    fatal: could not resolve 'ource'
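
To illustrate how the misparse arises, here is a minimal sketch.  It
assumes a simplified view of short options with an attached value; the
helpers stuck_short_s_value() and format_resolve_error() are
hypothetical names made up for this illustration and are not part of
Git's parse-options API:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not Git's parse-options API: if arg is the
 * short option -s with a value attached to it, return a pointer to
 * that value; otherwise return NULL.
 */
const char *stuck_short_s_value(const char *arg)
{
	assert(arg);
	if (arg[0] == '-' && arg[1] == 's' && arg[2] != '\0')
		return arg + 2;
	return NULL;
}

/* Format the error with or without quotes around the unresolved name. */
int format_resolve_error(char *buf, size_t len, const char *name, int quote)
{
	assert(buf && name);
	if (quote)
		return snprintf(buf, len,
				"fatal: could not resolve '%s'", name);
	return snprintf(buf, len, "fatal: could not resolve %s", name);
}
```

Under these assumptions, "-source" yields the value "ource", while
"--source" is not treated as "-s" at all.  The quoted form makes it
visible that "ource" is the name being resolved rather than a typo in
the message.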

Reported-by: Zhelyo Zhelev <zhelyo@gmail.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/checkout.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 01ea9ff8b28022..afec07534ffbb7 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -1875,7 +1875,7 @@ static int checkout_main(int argc, const char **argv, const char *prefix,
 		struct object_id rev;
 
 		if (repo_get_oid_mb(the_repository, opts->from_treeish, &rev))
-			die(_("could not resolve %s"), opts->from_treeish);
+			die(_("could not resolve '%s'"), opts->from_treeish);
 
 		setup_new_branch_info_and_source_tree(&new_branch_info,
 						      opts, &rev,

From d8a17ef09b8d9fdeb7d22cbc926cbebf3d8a58c9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Wed, 24 Dec 2025 18:03:14 +0100
Subject: [PATCH 277/553] revision: export commit_stack
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Dynamic arrays of commit pointers are used in several places.  Some of
them use a custom struct to hold array, item count and capacity, others
have them as separate variables linked by a common name part.

Pick one succinct, clean implementation -- commit_stack -- and convert
the different variants to it to reduce code duplication.
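
For readers outside the tree, the exported interface can be tried in
isolation with the sketch below.  It makes two assumptions that are not
part of this patch: "struct commit" is replaced by a tiny stand-in, and
stack_grow() is a simplified substitute for ALLOC_GROW() that aborts on
allocation failure and ignores size overflow.  commit_stack_demo() is a
hypothetical helper added only for the illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for Git's struct commit; the stack only stores pointers. */
struct commit { int id; };

struct commit_stack {
	struct commit **items;
	size_t nr, alloc;
};
#define COMMIT_STACK_INIT { 0 }

/* Simplified substitute for ALLOC_GROW(): grow geometrically. */
static void stack_grow(struct commit_stack *stack, size_t need)
{
	if (need <= stack->alloc)
		return;
	stack->alloc = (stack->alloc + 16) * 3 / 2;
	if (stack->alloc < need)
		stack->alloc = need;
	stack->items = realloc(stack->items,
			       stack->alloc * sizeof(*stack->items));
	if (!stack->items)
		abort();
}

void commit_stack_push(struct commit_stack *stack, struct commit *commit)
{
	stack_grow(stack, stack->nr + 1);
	assert(stack->nr < stack->alloc);
	stack->items[stack->nr++] = commit;
}

/* Return the most recently pushed commit, or NULL if the stack is empty. */
struct commit *commit_stack_pop(struct commit_stack *stack)
{
	return stack->nr ? stack->items[--stack->nr] : NULL;
}

void commit_stack_clear(struct commit_stack *stack)
{
	free(stack->items);
	stack->items = NULL;
	stack->nr = stack->alloc = 0;
}

/* Push three commits and check that they come back in LIFO order. */
int commit_stack_demo(void)
{
	struct commit a = { 1 }, b = { 2 }, c = { 3 };
	struct commit_stack stack = COMMIT_STACK_INIT;
	int ok;

	commit_stack_push(&stack, &a);
	commit_stack_push(&stack, &b);
	commit_stack_push(&stack, &c);

	ok = commit_stack_pop(&stack) == &c &&
	     commit_stack_pop(&stack) == &b &&
	     commit_stack_pop(&stack) == &a &&
	     commit_stack_pop(&stack) == NULL;

	commit_stack_clear(&stack);
	return ok ? 0 : -1;
}
```

The stack does not own the commits it points to, so
commit_stack_clear() releases only the pointer array.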

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 commit.c   | 17 +++++++++++++++++
 commit.h   | 10 ++++++++++
 revision.c | 23 -----------------------
 3 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/commit.c b/commit.c
index 709c9eed58a790..f2edafa49cf09e 100644
--- a/commit.c
+++ b/commit.c
@@ -1981,3 +1981,20 @@ int run_commit_hook(int editor_is_used, const char *index_file,
 	opt.invoked_hook = invoked_hook;
 	return run_hooks_opt(the_repository, name, &opt);
 }
+
+void commit_stack_push(struct commit_stack *stack, struct commit *commit)
+{
+	ALLOC_GROW(stack->items, stack->nr + 1, stack->alloc);
+	stack->items[stack->nr++] = commit;
+}
+
+struct commit *commit_stack_pop(struct commit_stack *stack)
+{
+	return stack->nr ? stack->items[--stack->nr] : NULL;
+}
+
+void commit_stack_clear(struct commit_stack *stack)
+{
+	FREE_AND_NULL(stack->items);
+	stack->nr = stack->alloc = 0;
+}
diff --git a/commit.h b/commit.h
index 5406dd266327d4..81e047f820acb4 100644
--- a/commit.h
+++ b/commit.h
@@ -381,4 +381,14 @@ int parse_buffer_signed_by_header(const char *buffer,
 				  const struct git_hash_algo *algop);
 int add_header_signature(struct strbuf *buf, struct strbuf *sig, const struct git_hash_algo *algo);
 
+struct commit_stack {
+	struct commit **items;
+	size_t nr, alloc;
+};
+#define COMMIT_STACK_INIT { 0 }
+
+void commit_stack_push(struct commit_stack *, struct commit *);
+struct commit *commit_stack_pop(struct commit_stack *);
+void commit_stack_clear(struct commit_stack *);
+
 #endif /* COMMIT_H */
diff --git a/revision.c b/revision.c
index 5f0850ae5c9c1a..1858e093eeeb89 100644
--- a/revision.c
+++ b/revision.c
@@ -250,29 +250,6 @@ void mark_trees_uninteresting_sparse(struct repository *r,
 	paths_and_oids_clear(&map);
 }
 
-struct commit_stack {
-	struct commit **items;
-	size_t nr, alloc;
-};
-#define COMMIT_STACK_INIT { 0 }
-
-static void commit_stack_push(struct commit_stack *stack, struct commit *commit)
-{
-	ALLOC_GROW(stack->items, stack->nr + 1, stack->alloc);
-	stack->items[stack->nr++] = commit;
-}
-
-static struct commit *commit_stack_pop(struct commit_stack *stack)
-{
-	return stack->nr ? stack->items[--stack->nr] : NULL;
-}
-
-static void commit_stack_clear(struct commit_stack *stack)
-{
-	FREE_AND_NULL(stack->items);
-	stack->nr = stack->alloc = 0;
-}
-
 static void mark_one_parent_uninteresting(struct rev_info *revs, struct commit *commit,
 					  struct commit_stack *pending)
 {

From 052efdd60f860dc5bc50a92f10911402d9cc71b4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Wed, 24 Dec 2025 18:03:15 +0100
Subject: [PATCH 278/553] log: use commit_stack
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Calling commit_stack_push() to add commits is simpler and more efficient
than using REALLOC_ARRAY.  Calling commit_stack_pop() to consume them in
LIFO order is also a tad simpler than calculating the array index from
the end.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
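Reviewer note, not part of the commit message: a standalone sketch of
the efficiency claim above.  It assumes a geometric growth policy
similar to the one ALLOC_GROW applies; demo_stack, demo_push(),
demo_pop() and the realloc counters are hypothetical stand-ins written
for illustration, not Git code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct commit { int id; };	/* stand-in for Git's struct commit */

struct demo_stack {
	struct commit **items;
	size_t nr, alloc;
	size_t reallocs;	/* instrumentation only */
};

/* Grow geometrically (assumed policy), then append. */
static void demo_push(struct demo_stack *s, struct commit *c)
{
	if (s->nr + 1 > s->alloc) {
		s->alloc = (s->alloc + 16) * 3 / 2;
		s->items = realloc(s->items, s->alloc * sizeof(*s->items));
		if (!s->items)
			abort();
		s->reallocs++;
	}
	assert(s->nr < s->alloc);
	s->items[s->nr++] = c;
}

/* Return the most recently pushed commit, or NULL when empty. */
static struct commit *demo_pop(struct demo_stack *s)
{
	return s->nr ? s->items[--s->nr] : NULL;
}

/* Count realloc calls when the array grows by one slot per item. */
static size_t reallocs_grow_by_one(size_t n)
{
	struct commit **list = NULL;
	size_t nr = 0, calls = 0;

	for (size_t i = 0; i < n; i++) {
		nr++;
		list = realloc(list, nr * sizeof(*list));
		if (!list)
			abort();
		calls++;
		list[nr - 1] = NULL;
	}
	free(list);
	return calls;
}

/* Count realloc calls for the same number of items with demo_push(). */
static size_t reallocs_geometric(size_t n)
{
	struct demo_stack s = { 0 };
	size_t calls;

	for (size_t i = 0; i < n; i++)
		demo_push(&s, NULL);
	calls = s.reallocs;
	free(s.items);
	return calls;
}

/* Push ids 1, 2, 3 and encode the pop order as a decimal number. */
static int pop_order(void)
{
	struct commit c[3] = { { 1 }, { 2 }, { 3 } };
	struct demo_stack s = { 0 };
	struct commit *p;
	int order = 0;

	for (size_t i = 0; i < 3; i++)
		demo_push(&s, &c[i]);
	while ((p = demo_pop(&s)))
		order = order * 10 + p->id;
	free(s.items);
	return order;
}
```

Growing by one slot issues one realloc call per item, while the
geometric variant needs only a handful for the same input.  Popping
yields the items in reverse insertion order, which is what the patch
loop relies on.
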
 builtin/log.c | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index d4cf9c59c81a83..5c9a8ef3632906 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1896,11 +1896,11 @@ int cmd_format_patch(int argc,
 {
 	struct format_config cfg;
 	struct commit *commit;
-	struct commit **list = NULL;
+	struct commit_stack list = COMMIT_STACK_INIT;
 	struct rev_info rev;
 	char *to_free = NULL;
 	struct setup_revision_opt s_r_opt;
-	size_t nr = 0, total, i;
+	size_t total, i;
 	int use_stdout = 0;
 	int start_number = -1;
 	int just_numbers = 0;
@@ -2283,14 +2283,12 @@ int cmd_format_patch(int argc,
 		if (ignore_if_in_upstream && has_commit_patch_id(commit, &ids))
 			continue;
 
-		nr++;
-		REALLOC_ARRAY(list, nr);
-		list[nr - 1] = commit;
+		commit_stack_push(&list, commit);
 	}
-	if (nr == 0)
+	if (!list.nr)
 		/* nothing to do */
 		goto done;
-	total = nr;
+	total = list.nr;
 	if (cover_letter == -1) {
 		if (cfg.config_cover_letter == COVER_AUTO)
 			cover_letter = (total > 1);
@@ -2308,7 +2306,7 @@ int cmd_format_patch(int argc,
 		if (!cover_letter && total != 1)
 			die(_("--interdiff requires --cover-letter or single patch"));
 		rev.idiff_oid1 = &idiff_prev.oid[idiff_prev.nr - 1];
-		rev.idiff_oid2 = get_commit_tree_oid(list[0]);
+		rev.idiff_oid2 = get_commit_tree_oid(list.items[0]);
 		rev.idiff_title = diff_title(&idiff_title, reroll_count,
 					     _("Interdiff:"),
 					     _("Interdiff against v%d:"));
@@ -2324,7 +2322,7 @@ int cmd_format_patch(int argc,
 			die(_("--range-diff requires --cover-letter or single patch"));
 
 		infer_range_diff_ranges(&rdiff1, &rdiff2, rdiff_prev,
-					origin, list[0]);
+					origin, list.items[0]);
 		rev.rdiff1 = rdiff1.buf;
 		rev.rdiff2 = rdiff2.buf;
 		rev.creation_factor = creation_factor;
@@ -2360,11 +2358,11 @@ int cmd_format_patch(int argc,
 	}
 
 	memset(&bases, 0, sizeof(bases));
-	base = get_base_commit(&cfg, list, nr);
+	base = get_base_commit(&cfg, list.items, list.nr);
 	if (base) {
 		reset_revision_walk();
 		clear_object_flags(the_repository, UNINTERESTING);
-		prepare_bases(&bases, base, list, nr);
+		prepare_bases(&bases, base, list.items, list.nr);
 	}
 
 	if (in_reply_to || cfg.thread || cover_letter) {
@@ -2381,7 +2379,8 @@ int cmd_format_patch(int argc,
 		if (cfg.thread)
 			gen_message_id(&rev, "cover");
 		make_cover_letter(&rev, !!output_directory,
-				  origin, nr, list, description_file, branch_name, quiet, &cfg);
+				  origin, list.nr, list.items,
+				  description_file, branch_name, quiet, &cfg);
 		print_bases(&bases, rev.diffopt.file);
 		print_signature(signature, rev.diffopt.file);
 		total++;
@@ -2395,12 +2394,12 @@ int cmd_format_patch(int argc,
 	if (show_progress)
 		progress = start_delayed_progress(the_repository,
 						  _("Generating patches"), total);
-	for (i = 0; i < nr; i++) {
-		size_t idx = nr - i - 1;
+	while (list.nr) {
+		size_t idx = list.nr - 1;
 		int shown;
 
 		display_progress(progress, total - idx);
-		commit = list[idx];
+		commit = commit_stack_pop(&list);
 		rev.nr = total - idx + (start_number - 1);
 
 		/* Make the second and subsequent mails replies to the first */
@@ -2469,7 +2468,7 @@ int cmd_format_patch(int argc,
 		}
 	}
 	stop_progress(&progress);
-	free(list);
+	commit_stack_clear(&list);
 	if (ignore_if_in_upstream)
 		free_patch_ids(&ids);
 

From 041c557171f160174cc40b0583adf411cef9e316 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Wed, 24 Dec 2025 18:03:16 +0100
Subject: [PATCH 279/553] midx: use commit_stack
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Simplify collecting commits in a callback function by passing it a
commit_stack pointer all the way from the caller, instead of using
separate variables for array and item count and a bunch of intermediate
members in struct bitmap_commit_cb.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
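Reviewer note, not part of the commit message: a minimal standalone
sketch of the pattern described above, where the caller owns the stack
and the callback only receives a pointer to it.  The walk() function,
struct collect_cb, the min_id filter and stack_push() are hypothetical
stand-ins; only the overall shape mirrors this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct commit { int id; };	/* stand-in for Git's struct commit */

struct commit_stack {
	struct commit **items;
	size_t nr, alloc;
};

/* Simplified stand-in for commit_stack_push(). */
static void stack_push(struct commit_stack *s, struct commit *c)
{
	if (s->nr + 1 > s->alloc) {
		s->alloc = (s->alloc + 16) * 3 / 2;
		s->items = realloc(s->items, s->alloc * sizeof(*s->items));
		if (!s->items)
			abort();
	}
	assert(s->nr < s->alloc);
	s->items[s->nr++] = c;
}

typedef void (*show_commit_fn)(struct commit *, void *);

/* Hypothetical traversal: calls show() once per commit. */
static void walk(struct commit *commits, size_t n,
		 show_commit_fn show, void *data)
{
	for (size_t i = 0; i < n; i++)
		show(&commits[i], data);
}

struct collect_cb {
	struct commit_stack *commits;	/* owned by the caller */
	int min_id;			/* hypothetical filter context */
};

/* Callback: keep commits that pass the filter. */
static void collect_commit(struct commit *commit, void *_data)
{
	struct collect_cb *data = _data;

	if (commit->id < data->min_id)
		return;
	stack_push(data->commits, commit);
}

/* No count out-parameter and no returned array are needed. */
static size_t collect_demo(void)
{
	struct commit all[4] = { { 1 }, { 5 }, { 2 }, { 7 } };
	struct commit_stack found = { 0 };
	struct collect_cb cb = { .commits = &found, .min_id = 3 };
	size_t nr;

	walk(all, 4, collect_commit, &cb);
	nr = found.nr;
	free(found.items);
	return nr;
}
```

Because the array and its count travel together in one struct, the
caller reads both from the same variable after the walk and releases
them with a single call.
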
 midx-write.c | 35 ++++++++++++-----------------------
 1 file changed, 12 insertions(+), 23 deletions(-)

diff --git a/midx-write.c b/midx-write.c
index e3e9be6d03cd6f..b4a82d6ba512da 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -723,9 +723,7 @@ static int add_ref_to_pending(const struct reference *ref, void *cb_data)
 }
 
 struct bitmap_commit_cb {
-	struct commit **commits;
-	size_t commits_nr, commits_alloc;
-
+	struct commit_stack *commits;
 	struct write_midx_context *ctx;
 };
 
@@ -745,8 +743,7 @@ static void bitmap_show_commit(struct commit *commit, void *_data)
 	if (pos < 0)
 		return;
 
-	ALLOC_GROW(data->commits, data->commits_nr + 1, data->commits_alloc);
-	data->commits[data->commits_nr++] = commit;
+	commit_stack_push(data->commits, commit);
 }
 
 static int read_refs_snapshot(const char *refs_snapshot,
@@ -784,17 +781,15 @@ static int read_refs_snapshot(const char *refs_snapshot,
 	return 0;
 }
 
-static struct commit **find_commits_for_midx_bitmap(uint32_t *indexed_commits_nr_p,
-						    const char *refs_snapshot,
-						    struct write_midx_context *ctx)
+static void find_commits_for_midx_bitmap(struct commit_stack *commits,
+					 const char *refs_snapshot,
+					 struct write_midx_context *ctx)
 {
 	struct rev_info revs;
-	struct bitmap_commit_cb cb = {0};
+	struct bitmap_commit_cb cb = { .commits = commits, .ctx = ctx };
 
 	trace2_region_enter("midx", "find_commits_for_midx_bitmap", ctx->repo);
 
-	cb.ctx = ctx;
-
 	repo_init_revisions(ctx->repo, &revs, NULL);
 	if (refs_snapshot) {
 		read_refs_snapshot(refs_snapshot, &revs);
@@ -823,14 +818,10 @@ static struct commit **find_commits_for_midx_bitmap(uint32_t *indexed_commits_nr
 		die(_("revision walk setup failed"));
 
 	traverse_commit_list(&revs, bitmap_show_commit, NULL, &cb);
-	if (indexed_commits_nr_p)
-		*indexed_commits_nr_p = cb.commits_nr;
 
 	release_revisions(&revs);
 
 	trace2_region_leave("midx", "find_commits_for_midx_bitmap", ctx->repo);
-
-	return cb.commits;
 }
 
 static int write_midx_bitmap(struct write_midx_context *ctx,
@@ -1375,15 +1366,14 @@ static int write_midx_internal(struct odb_source *source,
 
 	if (flags & MIDX_WRITE_BITMAP) {
 		struct packing_data pdata;
-		struct commit **commits;
-		uint32_t commits_nr;
+		struct commit_stack commits = COMMIT_STACK_INIT;
 
 		if (!ctx.entries_nr)
 			BUG("cannot write a bitmap without any objects");
 
 		prepare_midx_packing_data(&pdata, &ctx);
 
-		commits = find_commits_for_midx_bitmap(&commits_nr, refs_snapshot, &ctx);
+		find_commits_for_midx_bitmap(&commits, refs_snapshot, &ctx);
 
 		/*
 		 * The previous steps translated the information from
@@ -1394,17 +1384,16 @@ static int write_midx_internal(struct odb_source *source,
 		FREE_AND_NULL(ctx.entries);
 		ctx.entries_nr = 0;
 
-		if (write_midx_bitmap(&ctx,
-				      midx_hash, &pdata, commits, commits_nr,
-				      flags) < 0) {
+		if (write_midx_bitmap(&ctx, midx_hash, &pdata,
+				      commits.items, commits.nr, flags) < 0) {
 			error(_("could not write multi-pack bitmap"));
 			clear_packing_data(&pdata);
-			free(commits);
+			commit_stack_clear(&commits);
 			goto cleanup;
 		}
 
 		clear_packing_data(&pdata);
-		free(commits);
+		commit_stack_clear(&commits);
 	}
 	/*
 	 * NOTE: Do not use ctx.entries beyond this point, since it might

From d78039cd500176c919a749deb197ed68d1847110 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Wed, 24 Dec 2025 18:03:17 +0100
Subject: [PATCH 280/553] name-rev: use commit_stack
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Simplify the code by using commit_stack instead of open-coding it.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/name-rev.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index 615f7d1aae4987..6188cf98ce0157 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -180,8 +180,7 @@ static void name_rev(struct commit *start_commit,
 {
 	struct prio_queue queue;
 	struct commit *commit;
-	struct commit **parents_to_queue = NULL;
-	size_t parents_to_queue_nr, parents_to_queue_alloc = 0;
+	struct commit_stack parents_to_queue = COMMIT_STACK_INIT;
 	struct rev_name *start_name;
 
 	repo_parse_commit(the_repository, start_commit);
@@ -206,7 +205,7 @@ static void name_rev(struct commit *start_commit,
 		struct commit_list *parents;
 		int parent_number = 1;
 
-		parents_to_queue_nr = 0;
+		parents_to_queue.nr = 0;
 
 		for (parents = commit->parents;
 				parents;
@@ -238,22 +237,18 @@ static void name_rev(struct commit *start_commit,
 								string_pool);
 				else
 					parent_name->tip_name = name->tip_name;
-				ALLOC_GROW(parents_to_queue,
-					   parents_to_queue_nr + 1,
-					   parents_to_queue_alloc);
-				parents_to_queue[parents_to_queue_nr] = parent;
-				parents_to_queue_nr++;
+				commit_stack_push(&parents_to_queue, parent);
 			}
 		}
 
 		/* The first parent must come out first from the prio_queue */
-		while (parents_to_queue_nr)
+		while (parents_to_queue.nr)
 			prio_queue_put(&queue,
-				       parents_to_queue[--parents_to_queue_nr]);
+				       commit_stack_pop(&parents_to_queue));
 	}
 
 	clear_prio_queue(&queue);
-	free(parents_to_queue);
+	commit_stack_clear(&parents_to_queue);
 }
 
 static int subpath_matches(const char *path, const char *filter)

From 06e1f6467ee3dd92ceb894fd584173271a8aa577 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Wed, 24 Dec 2025 18:03:18 +0100
Subject: [PATCH 281/553] remote: use commit_stack for local_commits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Replace a commit array implementation with commit_stack.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 remote.c | 39 ++++++---------------------------------
 1 file changed, 6 insertions(+), 33 deletions(-)

diff --git a/remote.c b/remote.c
index 59b371512084eb..af888e3f20d9ee 100644
--- a/remote.c
+++ b/remote.c
@@ -2544,36 +2544,9 @@ static int remote_tracking(struct remote *remote, const char *refname,
 	return 0;
 }
 
-/*
- * The struct "reflog_commit_array" and related helper functions
- * are used for collecting commits into an array during reflog
- * traversals in "check_and_collect_until()".
- */
-struct reflog_commit_array {
-	struct commit **item;
-	size_t nr, alloc;
-};
-
-#define REFLOG_COMMIT_ARRAY_INIT { 0 }
-
-/* Append a commit to the array. */
-static void append_commit(struct reflog_commit_array *arr,
-			  struct commit *commit)
-{
-	ALLOC_GROW(arr->item, arr->nr + 1, arr->alloc);
-	arr->item[arr->nr++] = commit;
-}
-
-/* Free and reset the array. */
-static void free_commit_array(struct reflog_commit_array *arr)
-{
-	FREE_AND_NULL(arr->item);
-	arr->nr = arr->alloc = 0;
-}
-
 struct check_and_collect_until_cb_data {
 	struct commit *remote_commit;
-	struct reflog_commit_array *local_commits;
+	struct commit_stack *local_commits;
 	timestamp_t remote_reflog_timestamp;
 };
 
@@ -2605,7 +2578,7 @@ static int check_and_collect_until(const char *refname UNUSED,
 		return 1;
 
 	if ((commit = lookup_commit_reference(the_repository, n_oid)))
-		append_commit(cb->local_commits, commit);
+		commit_stack_push(cb->local_commits, commit);
 
 	/*
 	 * If the reflog entry timestamp is older than the remote ref's
@@ -2633,7 +2606,7 @@ static int is_reachable_in_reflog(const char *local, const struct ref *remote)
 	struct commit *commit;
 	struct commit **chunk;
 	struct check_and_collect_until_cb_data cb;
-	struct reflog_commit_array arr = REFLOG_COMMIT_ARRAY_INIT;
+	struct commit_stack arr = COMMIT_STACK_INIT;
 	size_t size = 0;
 	int ret = 0;
 
@@ -2664,8 +2637,8 @@ static int is_reachable_in_reflog(const char *local, const struct ref *remote)
 	 * Check if the remote commit is reachable from any
 	 * of the commits in the collected array, in batches.
 	 */
-	for (chunk = arr.item; chunk < arr.item + arr.nr; chunk += size) {
-		size = arr.item + arr.nr - chunk;
+	for (chunk = arr.items; chunk < arr.items + arr.nr; chunk += size) {
+		size = arr.items + arr.nr - chunk;
 		if (MERGE_BASES_BATCH_SIZE < size)
 			size = MERGE_BASES_BATCH_SIZE;
 
@@ -2674,7 +2647,7 @@ static int is_reachable_in_reflog(const char *local, const struct ref *remote)
 	}
 
 cleanup_return:
-	free_commit_array(&arr);
+	commit_stack_clear(&arr);
 	return ret;
 }
 

From 4455d4a2ea6e89b3f3cc50dc1d7435afac3e1e0d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Wed, 24 Dec 2025 18:03:19 +0100
Subject: [PATCH 282/553] remote: use commit_stack for sent_tips
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Call commit_stack functions instead of effectively open-coding them.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 remote.c | 20 +++++++-------------
 1 file changed, 7 insertions(+), 13 deletions(-)

diff --git a/remote.c b/remote.c
index af888e3f20d9ee..ffea887c70fc32 100644
--- a/remote.c
+++ b/remote.c
@@ -1381,12 +1381,7 @@ static struct ref **tail_ref(struct ref **head)
 	return tail;
 }
 
-struct tips {
-	struct commit **tip;
-	size_t nr, alloc;
-};
-
-static void add_to_tips(struct tips *tips, const struct object_id *oid)
+static void add_to_tips(struct commit_stack *tips, const struct object_id *oid)
 {
 	struct commit *commit;
 
@@ -1396,8 +1391,7 @@ static void add_to_tips(struct tips *tips, const struct object_id *oid)
 	if (!commit || (commit->object.flags & TMP_MARK))
 		return;
 	commit->object.flags |= TMP_MARK;
-	ALLOC_GROW(tips->tip, tips->nr + 1, tips->alloc);
-	tips->tip[tips->nr++] = commit;
+	commit_stack_push(tips, commit);
 }
 
 static void add_missing_tags(struct ref *src, struct ref **dst, struct ref ***dst_tail)
@@ -1406,13 +1400,12 @@ static void add_missing_tags(struct ref *src, struct ref **dst, struct ref ***ds
 	struct string_list src_tag = STRING_LIST_INIT_NODUP;
 	struct string_list_item *item;
 	struct ref *ref;
-	struct tips sent_tips;
+	struct commit_stack sent_tips = COMMIT_STACK_INIT;
 
 	/*
 	 * Collect everything we know they would have at the end of
 	 * this push, and collect all tags they have.
 	 */
-	memset(&sent_tips, 0, sizeof(sent_tips));
 	for (ref = *dst; ref; ref = ref->next) {
 		if (ref->peer_ref &&
 		    !is_null_oid(&ref->peer_ref->new_oid))
@@ -1422,7 +1415,7 @@ static void add_missing_tags(struct ref *src, struct ref **dst, struct ref ***ds
 		if (starts_with(ref->name, "refs/tags/"))
 			string_list_append(&dst_tag, ref->name);
 	}
-	clear_commit_marks_many(sent_tips.nr, sent_tips.tip, TMP_MARK);
+	clear_commit_marks_many(sent_tips.nr, sent_tips.items, TMP_MARK);
 
 	string_list_sort(&dst_tag);
 
@@ -1471,7 +1464,8 @@ static void add_missing_tags(struct ref *src, struct ref **dst, struct ref ***ds
 			src_commits[nr_src_commits++] = commit;
 		}
 
-		found_commits = get_reachable_subset(sent_tips.tip, sent_tips.nr,
+		found_commits = get_reachable_subset(sent_tips.items,
+						     sent_tips.nr,
 						     src_commits, nr_src_commits,
 						     reachable_flag);
 
@@ -1508,7 +1502,7 @@ static void add_missing_tags(struct ref *src, struct ref **dst, struct ref ***ds
 	}
 
 	string_list_clear(&src_tag, 0);
-	free(sent_tips.tip);
+	commit_stack_clear(&sent_tips);
 }
 
 struct ref *find_ref_by_name(const struct ref *list, const char *name)

From bb3a1ce91f964b353e8a428be574c20571db60a6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Wed, 24 Dec 2025 18:03:20 +0100
Subject: [PATCH 283/553] remote: use commit_stack for src_commits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Use commit_stack instead of open-coding it.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 remote.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/remote.c b/remote.c
index ffea887c70fc32..b756ff6f1594d9 100644
--- a/remote.c
+++ b/remote.c
@@ -1443,9 +1443,7 @@ static void add_missing_tags(struct ref *src, struct ref **dst, struct ref ***ds
 	if (sent_tips.nr) {
 		const int reachable_flag = 1;
 		struct commit_list *found_commits;
-		struct commit **src_commits;
-		size_t nr_src_commits = 0, alloc_src_commits = 16;
-		ALLOC_ARRAY(src_commits, alloc_src_commits);
+		struct commit_stack src_commits = COMMIT_STACK_INIT;
 
 		for_each_string_list_item(item, &src_tag) {
 			struct ref *ref = item->util;
@@ -1460,13 +1458,13 @@ static void add_missing_tags(struct ref *src, struct ref **dst, struct ref ***ds
 				/* not pushing a commit, which is not an error */
 				continue;
 
-			ALLOC_GROW(src_commits, nr_src_commits + 1, alloc_src_commits);
-			src_commits[nr_src_commits++] = commit;
+			commit_stack_push(&src_commits, commit);
 		}
 
 		found_commits = get_reachable_subset(sent_tips.items,
 						     sent_tips.nr,
-						     src_commits, nr_src_commits,
+						     src_commits.items,
+						     src_commits.nr,
 						     reachable_flag);
 
 		for_each_string_list_item(item, &src_tag) {
@@ -1496,8 +1494,9 @@ static void add_missing_tags(struct ref *src, struct ref **dst, struct ref ***ds
 			dst_ref->peer_ref = copy_ref(ref);
 		}
 
-		clear_commit_marks_many(nr_src_commits, src_commits, reachable_flag);
-		free(src_commits);
+		clear_commit_marks_many(src_commits.nr, src_commits.items,
+					reachable_flag);
+		commit_stack_clear(&src_commits);
 		free_commit_list(found_commits);
 	}
 

From 64dbeefbd24e02eb05ae3250e118c9552b7690fb Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Wed, 24 Dec 2025 18:03:21 +0100
Subject: [PATCH 284/553] test-reach: use commit_stack
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Use commit_stack instead of open-coding it.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/helper/test-reach.c | 34 ++++++++++++++--------------------
 1 file changed, 14 insertions(+), 20 deletions(-)

diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
index c58c93800f3232..feabeb29c25d89 100644
--- a/t/helper/test-reach.c
+++ b/t/helper/test-reach.c
@@ -34,8 +34,8 @@ int cmd__reach(int ac, const char **av)
 	struct commit *A, *B;
 	struct commit_list *X, *Y;
 	struct object_array X_obj = OBJECT_ARRAY_INIT;
-	struct commit **X_array, **Y_array;
-	size_t X_nr, X_alloc, Y_nr, Y_alloc;
+	struct commit_stack X_stack = COMMIT_STACK_INIT;
+	struct commit_stack Y_stack = COMMIT_STACK_INIT;
 	struct strbuf buf = STRBUF_INIT;
 	struct repository *r = the_repository;
 
@@ -46,10 +46,6 @@ int cmd__reach(int ac, const char **av)
 
 	A = B = NULL;
 	X = Y = NULL;
-	X_nr = Y_nr = 0;
-	X_alloc = Y_alloc = 16;
-	ALLOC_ARRAY(X_array, X_alloc);
-	ALLOC_ARRAY(Y_array, Y_alloc);
 
 	while (strbuf_getline(&buf, stdin) != EOF) {
 		struct object_id oid;
@@ -88,15 +84,13 @@ int cmd__reach(int ac, const char **av)
 
 			case 'X':
 				commit_list_insert(c, &X);
-				ALLOC_GROW(X_array, X_nr + 1, X_alloc);
-				X_array[X_nr++] = c;
+				commit_stack_push(&X_stack, c);
 				add_object_array(orig, NULL, &X_obj);
 				break;
 
 			case 'Y':
 				commit_list_insert(c, &Y);
-				ALLOC_GROW(Y_array, Y_nr + 1, Y_alloc);
-				Y_array[Y_nr++] = c;
+				commit_stack_push(&Y_stack, c);
 				break;
 
 			default:
@@ -112,16 +106,16 @@ int cmd__reach(int ac, const char **av)
 		       repo_in_merge_bases(the_repository, A, B));
 	else if (!strcmp(av[1], "in_merge_bases_many"))
 		printf("%s(A,X):%d\n", av[1],
-		       repo_in_merge_bases_many(the_repository, A, X_nr, X_array, 0));
+		       repo_in_merge_bases_many(the_repository, A, X_stack.nr, X_stack.items, 0));
 	else if (!strcmp(av[1], "is_descendant_of"))
 		printf("%s(A,X):%d\n", av[1], repo_is_descendant_of(r, A, X));
 	else if (!strcmp(av[1], "get_branch_base_for_tip"))
-		printf("%s(A,X):%d\n", av[1], get_branch_base_for_tip(r, A, X_array, X_nr));
+		printf("%s(A,X):%d\n", av[1], get_branch_base_for_tip(r, A, X_stack.items, X_stack.nr));
 	else if (!strcmp(av[1], "get_merge_bases_many")) {
 		struct commit_list *list = NULL;
 		if (repo_get_merge_bases_many(the_repository,
-					      A, X_nr,
-					      X_array,
+					      A, X_stack.nr,
+					      X_stack.items,
 					      &list) < 0)
 			exit(128);
 		printf("%s(A,X):\n", av[1]);
@@ -159,8 +153,8 @@ int cmd__reach(int ac, const char **av)
 		const int reachable_flag = 1;
 		int count = 0;
 		struct commit_list *current;
-		struct commit_list *list = get_reachable_subset(X_array, X_nr,
-								Y_array, Y_nr,
+		struct commit_list *list = get_reachable_subset(X_stack.items, X_stack.nr,
+								Y_stack.items, Y_stack.nr,
 								reachable_flag);
 		printf("get_reachable_subset(X,Y)\n");
 		for (current = list; current; current = current->next) {
@@ -169,8 +163,8 @@ int cmd__reach(int ac, const char **av)
 				    oid_to_hex(&list->item->object.oid));
 			count++;
 		}
-		for (size_t i = 0; i < Y_nr; i++) {
-			if (Y_array[i]->object.flags & reachable_flag)
+		for (size_t i = 0; i < Y_stack.nr; i++) {
+			if (Y_stack.items[i]->object.flags & reachable_flag)
 				count--;
 		}
 
@@ -185,7 +179,7 @@ int cmd__reach(int ac, const char **av)
 	strbuf_release(&buf);
 	free_commit_list(X);
 	free_commit_list(Y);
-	free(X_array);
-	free(Y_array);
+	commit_stack_clear(&X_stack);
+	commit_stack_clear(&Y_stack);
 	return 0;
 }

From 2ebaa2b45e752088fd03d03c484fe43a653deba8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Wed, 24 Dec 2025 18:03:22 +0100
Subject: [PATCH 285/553] commit: add commit_stack_init()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add a function for initializing a struct commit_stack, for use when
static initialization is impossible or impractical.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
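Reviewer note, not part of the commit message: a standalone sketch of a
case where a static initializer does not fit, namely a stack embedded
in a heap-allocated struct.  struct walker, walker_new() and
walker_free() are hypothetical; stack_init() and stack_clear() are
simplified stand-ins that mirror the functions in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct commit { int id; };	/* stand-in for Git's struct commit */

struct commit_stack {
	struct commit **items;
	size_t nr, alloc;
};

/* Mirrors commit_stack_init(): put the stack into its empty state. */
static void stack_init(struct commit_stack *stack)
{
	stack->items = NULL;
	stack->nr = stack->alloc = 0;
}

/* Mirrors commit_stack_clear(): release storage, leave it reusable. */
static void stack_clear(struct commit_stack *stack)
{
	free(stack->items);
	stack_init(stack);
}

/* Hypothetical owner whose memory comes from malloc(). */
struct walker {
	int depth;
	struct commit_stack pending;
};

static struct walker *walker_new(int depth)
{
	struct walker *w = malloc(sizeof(*w));

	if (!w)
		abort();
	w->depth = depth;
	/* malloc() does not zero memory, so initialize explicitly. */
	stack_init(&w->pending);
	return w;
}

static void walker_free(struct walker *w)
{
	stack_clear(&w->pending);
	free(w);
}

/* Returns 1 if a freshly created walker has an empty stack. */
static int walker_starts_empty(void)
{
	struct walker *w = walker_new(3);
	int ok = !w->pending.items && !w->pending.nr && !w->pending.alloc;

	assert(w->depth == 3);
	walker_free(w);
	return ok;
}
```

An explicit init call also keeps the owner from having to memset() the
whole struct just to get one member into a valid state.
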
 commit.c | 10 ++++++++--
 commit.h |  1 +
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/commit.c b/commit.c
index f2edafa49cf09e..55b1c8d2f8d21d 100644
--- a/commit.c
+++ b/commit.c
@@ -1982,6 +1982,12 @@ int run_commit_hook(int editor_is_used, const char *index_file,
 	return run_hooks_opt(the_repository, name, &opt);
 }
 
+void commit_stack_init(struct commit_stack *stack)
+{
+	stack->items = NULL;
+	stack->nr = stack->alloc = 0;
+}
+
 void commit_stack_push(struct commit_stack *stack, struct commit *commit)
 {
 	ALLOC_GROW(stack->items, stack->nr + 1, stack->alloc);
@@ -1995,6 +2001,6 @@ struct commit *commit_stack_pop(struct commit_stack *stack)
 
 void commit_stack_clear(struct commit_stack *stack)
 {
-	FREE_AND_NULL(stack->items);
-	stack->nr = stack->alloc = 0;
+	free(stack->items);
+	commit_stack_init(stack);
 }
diff --git a/commit.h b/commit.h
index 81e047f820acb4..7c01a76425f035 100644
--- a/commit.h
+++ b/commit.h
@@ -387,6 +387,7 @@ struct commit_stack {
 };
 #define COMMIT_STACK_INIT { 0 }
 
+void commit_stack_init(struct commit_stack *);
 void commit_stack_push(struct commit_stack *, struct commit *);
 struct commit *commit_stack_pop(struct commit_stack *);
 void commit_stack_clear(struct commit_stack *);

From 065523812f574b432242fcb7d7f2c1ed8c85b4b7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Wed, 24 Dec 2025 18:03:23 +0100
Subject: [PATCH 286/553] pack-bitmap-write: use commit_stack
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Use commit_stack instead of open-coding it.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 pack-bitmap-write.c | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 4404921521ca34..bf73ce5710abcc 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -315,8 +315,7 @@ define_commit_slab(bb_data, struct bb_commit);
 
 struct bitmap_builder {
 	struct bb_data data;
-	struct commit **commits;
-	size_t commits_nr, commits_alloc;
+	struct commit_stack commits;
 };
 
 static void bitmap_builder_init(struct bitmap_builder *bb,
@@ -329,8 +328,8 @@ static void bitmap_builder_init(struct bitmap_builder *bb,
 	struct commit_list *r;
 	unsigned int i, num_maximal = 0;
 
-	memset(bb, 0, sizeof(*bb));
 	init_bb_data(&bb->data);
+	commit_stack_init(&bb->commits);
 
 	reset_revision_walk();
 	repo_init_revisions(writer->to_pack->repo, &revs, NULL);
@@ -390,8 +389,7 @@ static void bitmap_builder_init(struct bitmap_builder *bb,
 
 		if (c_ent->maximal) {
 			num_maximal++;
-			ALLOC_GROW(bb->commits, bb->commits_nr + 1, bb->commits_alloc);
-			bb->commits[bb->commits_nr++] = commit;
+			commit_stack_push(&bb->commits, commit);
 		}
 
 		if (p) {
@@ -438,8 +436,7 @@ static void bitmap_builder_init(struct bitmap_builder *bb,
 	}
 
 	for (r = reusable; r; r = r->next) {
-		ALLOC_GROW(bb->commits, bb->commits_nr + 1, bb->commits_alloc);
-		bb->commits[bb->commits_nr++] = r->item;
+		commit_stack_push(&bb->commits, r->item);
 	}
 
 	trace2_data_intmax("pack-bitmap-write", writer->repo,
@@ -454,8 +451,7 @@ static void bitmap_builder_init(struct bitmap_builder *bb,
 static void bitmap_builder_clear(struct bitmap_builder *bb)
 {
 	deep_clear_bb_data(&bb->data, clear_bb_commit);
-	free(bb->commits);
-	bb->commits_nr = bb->commits_alloc = 0;
+	commit_stack_clear(&bb->commits);
 }
 
 static int fill_bitmap_tree(struct bitmap_writer *writer,
@@ -630,8 +626,8 @@ int bitmap_writer_build(struct bitmap_writer *writer)
 		mapping = NULL;
 
 	bitmap_builder_init(&bb, writer, old_bitmap);
-	for (i = bb.commits_nr; i > 0; i--) {
-		struct commit *commit = bb.commits[i-1];
+	for (i = bb.commits.nr; i > 0; i--) {
+		struct commit *commit = bb.commits.items[i-1];
 		struct bb_commit *ent = bb_data_at(&bb.data, commit);
 		struct commit *child;
 		int reused = 0;

From 506a7b66908eb5c3898a3eadbd402308f5b43cf8 Mon Sep 17 00:00:00 2001
From: Rene Scharfe <l.s.r@web.de>
Date: Wed, 24 Dec 2025 18:03:24 +0100
Subject: [PATCH 287/553] shallow: use commit_stack
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Replace a commit array implementation with commit_stack.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 shallow.c | 44 +++++++++++++++++---------------------------
 shallow.h |  4 ++--
 2 files changed, 19 insertions(+), 29 deletions(-)

diff --git a/shallow.c b/shallow.c
index 186e9178f32c33..c870efcefcac4a 100644
--- a/shallow.c
+++ b/shallow.c
@@ -471,6 +471,7 @@ void prepare_shallow_info(struct shallow_info *info, struct oid_array *sa)
 {
 	trace_printf_key(&trace_shallow, "shallow: prepare_shallow_info\n");
 	memset(info, 0, sizeof(*info));
+	commit_stack_init(&info->commits);
 	info->shallow = sa;
 	if (!sa)
 		return;
@@ -503,6 +504,7 @@ void clear_shallow_info(struct shallow_info *info)
 	free(info->shallow_ref);
 	free(info->ours);
 	free(info->theirs);
+	commit_stack_clear(&info->commits);
 }
 
 /* Step 4, remove non-existent ones in "theirs" after getting the pack */
@@ -733,19 +735,13 @@ void assign_shallow_commits_to_refs(struct shallow_info *info,
 	free(shallow);
 }
 
-struct commit_array {
-	struct commit **commits;
-	size_t nr, alloc;
-};
-
 static int add_ref(const struct reference *ref, void *cb_data)
 {
-	struct commit_array *ca = cb_data;
-	ALLOC_GROW(ca->commits, ca->nr + 1, ca->alloc);
-	ca->commits[ca->nr] = lookup_commit_reference_gently(the_repository,
-							     ref->oid, 1);
-	if (ca->commits[ca->nr])
-		ca->nr++;
+	struct commit_stack *cs = cb_data;
+	struct commit *commit = lookup_commit_reference_gently(the_repository,
+							       ref->oid, 1);
+	if (commit)
+		commit_stack_push(cs, commit);
 	return 0;
 }
 
@@ -770,7 +766,7 @@ static void post_assign_shallow(struct shallow_info *info,
 	uint32_t **bitmap;
 	size_t dst, i, j;
 	size_t bitmap_nr = DIV_ROUND_UP(info->ref->nr, 32);
-	struct commit_array ca;
+	struct commit_stack cs = COMMIT_STACK_INIT;
 
 	trace_printf_key(&trace_shallow, "shallow: post_assign_shallow\n");
 	if (ref_status)
@@ -793,9 +789,8 @@ static void post_assign_shallow(struct shallow_info *info,
 	}
 	info->nr_theirs = dst;
 
-	memset(&ca, 0, sizeof(ca));
-	refs_head_ref(get_main_ref_store(the_repository), add_ref, &ca);
-	refs_for_each_ref(get_main_ref_store(the_repository), add_ref, &ca);
+	refs_head_ref(get_main_ref_store(the_repository), add_ref, &cs);
+	refs_for_each_ref(get_main_ref_store(the_repository), add_ref, &cs);
 
 	/* Remove unreachable shallow commits from "ours" */
 	for (i = dst = 0; i < info->nr_ours; i++) {
@@ -808,7 +803,7 @@ static void post_assign_shallow(struct shallow_info *info,
 		for (j = 0; j < bitmap_nr; j++)
 			if (bitmap[0][j]) {
 				/* Step 7, reachability test at commit level */
-				int ret = repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits, 1);
+				int ret = repo_in_merge_bases_many(the_repository, c, cs.nr, cs.items, 1);
 				if (ret < 0)
 					exit(128);
 				if (!ret) {
@@ -820,7 +815,7 @@ static void post_assign_shallow(struct shallow_info *info,
 	}
 	info->nr_ours = dst;
 
-	free(ca.commits);
+	commit_stack_clear(&cs);
 }
 
 /* (Delayed) step 7, reachability test at commit level */
@@ -830,22 +825,17 @@ int delayed_reachability_test(struct shallow_info *si, int c)
 		struct commit *commit = lookup_commit(the_repository,
 						      &si->shallow->oid[c]);
 
-		if (!si->commits) {
-			struct commit_array ca;
-
-			memset(&ca, 0, sizeof(ca));
+		if (!si->commits.nr) {
 			refs_head_ref(get_main_ref_store(the_repository),
-				      add_ref, &ca);
+				      add_ref, &si->commits);
 			refs_for_each_ref(get_main_ref_store(the_repository),
-					  add_ref, &ca);
-			si->commits = ca.commits;
-			si->nr_commits = ca.nr;
+					  add_ref, &si->commits);
 		}
 
 		si->reachable[c] = repo_in_merge_bases_many(the_repository,
 							    commit,
-							    si->nr_commits,
-							    si->commits,
+							    si->commits.nr,
+							    si->commits.items,
 							    1);
 		if (si->reachable[c] < 0)
 			exit(128);
diff --git a/shallow.h b/shallow.h
index ad591bd1396854..1c0787de1d66b9 100644
--- a/shallow.h
+++ b/shallow.h
@@ -1,6 +1,7 @@
 #ifndef SHALLOW_H
 #define SHALLOW_H
 
+#include "commit.h"
 #include "lockfile.h"
 #include "object.h"
 #include "repository.h"
@@ -69,8 +70,7 @@ struct shallow_info {
 	int *need_reachability_test;
 	int *reachable;
 	int *shallow_ref;
-	struct commit **commits;
-	size_t nr_commits;
+	struct commit_stack commits;
 };
 
 void prepare_shallow_info(struct shallow_info *, struct oid_array *);

From 958a816794b9382fe90585b674ee8b96ed6aa8bf Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Wed, 24 Dec 2025 18:03:25 +0100
Subject: [PATCH 288/553] commit: add commit_stack_grow()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add a function for increasing the capacity of a commit_stack.  It is
useful for reducing reallocations when the target size is known in
advance.
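
As an illustration only, the reserve-then-push pattern can be sketched
with a simplified stand-in.  The names ptr_stack, ptr_stack_grow(),
ptr_stack_push() and ptr_stack_clear() are hypothetical and not part of
this patch; the growth policy merely imitates what ALLOC_GROW does, and
the overflow check stands in for st_add():

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct commit_stack; items are opaque pointers. */
struct ptr_stack {
	void **items;
	size_t nr, alloc;
};

/* Make sure there is room for `extra` more items beyond the current count. */
static void ptr_stack_grow(struct ptr_stack *s, size_t extra)
{
	size_t need, alloc;
	void **p;

	/* Refuse to wrap around, similar in spirit to st_add(). */
	if (extra > SIZE_MAX - s->nr)
		abort();
	need = s->nr + extra;
	if (need <= s->alloc)
		return;

	/* Grow geometrically, but at least to the requested size. */
	alloc = (s->alloc + 16) * 3 / 2;
	if (alloc < need)
		alloc = need;
	if (alloc > SIZE_MAX / sizeof(*s->items))
		abort();
	p = realloc(s->items, alloc * sizeof(*s->items));
	if (!p)
		abort();
	s->items = p;
	s->alloc = alloc;
}

/* Append one item; reallocates only if no capacity was reserved before. */
static void ptr_stack_push(struct ptr_stack *s, void *item)
{
	ptr_stack_grow(s, 1);
	s->items[s->nr++] = item;
}

/* Release the array and return the stack to its empty state. */
static void ptr_stack_clear(struct ptr_stack *s)
{
	free(s->items);
	s->items = NULL;
	s->nr = s->alloc = 0;
}
```

A caller that knows it is about to push 100 items would call
ptr_stack_grow(&s, 100) once; the following 100 pushes then find enough
capacity and do not reallocate.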

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 commit.c | 7 ++++++-
 commit.h | 1 +
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/commit.c b/commit.c
index 55b1c8d2f8d21d..28bb5ce029f3c5 100644
--- a/commit.c
+++ b/commit.c
@@ -1988,9 +1988,14 @@ void commit_stack_init(struct commit_stack *stack)
 	stack->nr = stack->alloc = 0;
 }
 
+void commit_stack_grow(struct commit_stack *stack, size_t extra)
+{
+	ALLOC_GROW(stack->items, st_add(stack->nr, extra), stack->alloc);
+}
+
 void commit_stack_push(struct commit_stack *stack, struct commit *commit)
 {
-	ALLOC_GROW(stack->items, stack->nr + 1, stack->alloc);
+	commit_stack_grow(stack, 1);
 	stack->items[stack->nr++] = commit;
 }
 
diff --git a/commit.h b/commit.h
index 7c01a76425f035..79a761c37df023 100644
--- a/commit.h
+++ b/commit.h
@@ -388,6 +388,7 @@ struct commit_stack {
 #define COMMIT_STACK_INIT { 0 }
 
 void commit_stack_init(struct commit_stack *);
+void commit_stack_grow(struct commit_stack *, size_t);
 void commit_stack_push(struct commit_stack *, struct commit *);
 struct commit *commit_stack_pop(struct commit_stack *);
 void commit_stack_clear(struct commit_stack *);

From 3e456f1d8ac409abcf1da3867c9505f48564e874 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Wed, 24 Dec 2025 18:03:26 +0100
Subject: [PATCH 289/553] commit-graph: use commit_stack
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Replace a commit array implementation with commit_stack.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 commit-graph.c | 86 +++++++++++++++++++++++---------------------------
 1 file changed, 39 insertions(+), 47 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 80be2ff2c39842..00e8193adcab81 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1127,18 +1127,12 @@ struct tree *get_commit_tree_in_graph(struct repository *r, const struct commit
 	return get_commit_tree_in_graph_one(r->objects->commit_graph, c);
 }
 
-struct packed_commit_list {
-	struct commit **list;
-	size_t nr;
-	size_t alloc;
-};
-
 struct write_commit_graph_context {
 	struct repository *r;
 	struct odb_source *odb_source;
 	char *graph_name;
 	struct oid_array oids;
-	struct packed_commit_list commits;
+	struct commit_stack commits;
 	int num_extra_edges;
 	int num_generation_data_overflows;
 	unsigned long approx_nr_objects;
@@ -1180,7 +1174,7 @@ static int write_graph_chunk_fanout(struct hashfile *f,
 {
 	struct write_commit_graph_context *ctx = data;
 	int i, count = 0;
-	struct commit **list = ctx->commits.list;
+	struct commit **list = ctx->commits.items;
 
 	/*
 	 * Write the first-level table (the list is sorted,
@@ -1206,7 +1200,7 @@ static int write_graph_chunk_oids(struct hashfile *f,
 				  void *data)
 {
 	struct write_commit_graph_context *ctx = data;
-	struct commit **list = ctx->commits.list;
+	struct commit **list = ctx->commits.items;
 	int count;
 	for (count = 0; count < ctx->commits.nr; count++, list++) {
 		display_progress(ctx->progress, ++ctx->progress_cnt);
@@ -1226,8 +1220,8 @@ static int write_graph_chunk_data(struct hashfile *f,
 				  void *data)
 {
 	struct write_commit_graph_context *ctx = data;
-	struct commit **list = ctx->commits.list;
-	struct commit **last = ctx->commits.list + ctx->commits.nr;
+	struct commit **list = ctx->commits.items;
+	struct commit **last = ctx->commits.items + ctx->commits.nr;
 	uint32_t num_extra_edges = 0;
 
 	while (list < last) {
@@ -1249,7 +1243,7 @@ static int write_graph_chunk_data(struct hashfile *f,
 			edge_value = GRAPH_PARENT_NONE;
 		else {
 			edge_value = oid_pos(&parent->item->object.oid,
-					     ctx->commits.list,
+					     ctx->commits.items,
 					     ctx->commits.nr,
 					     commit_to_oid);
 
@@ -1280,7 +1274,7 @@ static int write_graph_chunk_data(struct hashfile *f,
 			edge_value = GRAPH_EXTRA_EDGES_NEEDED | num_extra_edges;
 		else {
 			edge_value = oid_pos(&parent->item->object.oid,
-					     ctx->commits.list,
+					     ctx->commits.items,
 					     ctx->commits.nr,
 					     commit_to_oid);
 
@@ -1332,7 +1326,7 @@ static int write_graph_chunk_generation_data(struct hashfile *f,
 	int i, num_generation_data_overflows = 0;
 
 	for (i = 0; i < ctx->commits.nr; i++) {
-		struct commit *c = ctx->commits.list[i];
+		struct commit *c = ctx->commits.items[i];
 		timestamp_t offset;
 		repo_parse_commit(ctx->r, c);
 		offset = commit_graph_data_at(c)->generation - c->date;
@@ -1355,7 +1349,7 @@ static int write_graph_chunk_generation_data_overflow(struct hashfile *f,
 	struct write_commit_graph_context *ctx = data;
 	int i;
 	for (i = 0; i < ctx->commits.nr; i++) {
-		struct commit *c = ctx->commits.list[i];
+		struct commit *c = ctx->commits.items[i];
 		timestamp_t offset = commit_graph_data_at(c)->generation - c->date;
 		display_progress(ctx->progress, ++ctx->progress_cnt);
 
@@ -1372,8 +1366,8 @@ static int write_graph_chunk_extra_edges(struct hashfile *f,
 					 void *data)
 {
 	struct write_commit_graph_context *ctx = data;
-	struct commit **list = ctx->commits.list;
-	struct commit **last = ctx->commits.list + ctx->commits.nr;
+	struct commit **list = ctx->commits.items;
+	struct commit **last = ctx->commits.items + ctx->commits.nr;
 	struct commit_list *parent;
 
 	while (list < last) {
@@ -1393,7 +1387,7 @@ static int write_graph_chunk_extra_edges(struct hashfile *f,
 		/* Since num_parents > 2, this initializer is safe. */
 		for (parent = (*list)->parents->next; parent; parent = parent->next) {
 			int edge_value = oid_pos(&parent->item->object.oid,
-						 ctx->commits.list,
+						 ctx->commits.items,
 						 ctx->commits.nr,
 						 commit_to_oid);
 
@@ -1427,8 +1421,8 @@ static int write_graph_chunk_bloom_indexes(struct hashfile *f,
 					   void *data)
 {
 	struct write_commit_graph_context *ctx = data;
-	struct commit **list = ctx->commits.list;
-	struct commit **last = ctx->commits.list + ctx->commits.nr;
+	struct commit **list = ctx->commits.items;
+	struct commit **last = ctx->commits.items + ctx->commits.nr;
 	uint32_t cur_pos = 0;
 
 	while (list < last) {
@@ -1463,8 +1457,8 @@ static int write_graph_chunk_bloom_data(struct hashfile *f,
 					void *data)
 {
 	struct write_commit_graph_context *ctx = data;
-	struct commit **list = ctx->commits.list;
-	struct commit **last = ctx->commits.list + ctx->commits.nr;
+	struct commit **list = ctx->commits.items;
+	struct commit **last = ctx->commits.items + ctx->commits.nr;
 
 	trace2_bloom_filter_settings(ctx);
 
@@ -1585,7 +1579,7 @@ static void close_reachable(struct write_commit_graph_context *ctx)
 
 struct compute_generation_info {
 	struct repository *r;
-	struct packed_commit_list *commits;
+	struct commit_stack *commits;
 	struct progress *progress;
 	int progress_cnt;
 
@@ -1622,7 +1616,7 @@ static void compute_reachable_generation_numbers(
 	struct commit_list *list = NULL;
 
 	for (i = 0; i < info->commits->nr; i++) {
-		struct commit *c = info->commits->list[i];
+		struct commit *c = info->commits->items[i];
 		timestamp_t gen;
 		repo_parse_commit(info->r, c);
 		gen = info->get_generation(c, info->data);
@@ -1729,7 +1723,7 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx)
 
 	if (!ctx->trust_generation_numbers) {
 		for (i = 0; i < ctx->commits.nr; i++) {
-			struct commit *c = ctx->commits.list[i];
+			struct commit *c = ctx->commits.items[i];
 			repo_parse_commit(ctx->r, c);
 			commit_graph_data_at(c)->generation = GENERATION_NUMBER_ZERO;
 		}
@@ -1738,7 +1732,7 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx)
 	compute_reachable_generation_numbers(&info, 2);
 
 	for (i = 0; i < ctx->commits.nr; i++) {
-		struct commit *c = ctx->commits.list[i];
+		struct commit *c = ctx->commits.items[i];
 		timestamp_t offset = commit_graph_data_at(c)->generation - c->date;
 		if (offset > GENERATION_NUMBER_V2_OFFSET_MAX)
 			ctx->num_generation_data_overflows++;
@@ -1760,8 +1754,8 @@ void ensure_generations_valid(struct repository *r,
 			      struct commit **commits, size_t nr)
 {
 	int generation_version = get_configured_generation_version(r);
-	struct packed_commit_list list = {
-		.list = commits,
+	struct commit_stack list = {
+		.items = commits,
 		.alloc = nr,
 		.nr = nr,
 	};
@@ -1804,7 +1798,7 @@ static void compute_bloom_filters(struct write_commit_graph_context *ctx)
 			_("Computing commit changed paths Bloom filters"),
 			ctx->commits.nr);
 
-	DUP_ARRAY(sorted_commits, ctx->commits.list, ctx->commits.nr);
+	DUP_ARRAY(sorted_commits, ctx->commits.items, ctx->commits.nr);
 
 	if (ctx->order_by_pack)
 		QSORT(sorted_commits, ctx->commits.nr, commit_pos_cmp);
@@ -1992,26 +1986,26 @@ static void copy_oids_to_commits(struct write_commit_graph_context *ctx)
 	oid_array_sort(&ctx->oids);
 	for (i = 0; i < ctx->oids.nr; i = oid_array_next_unique(&ctx->oids, i)) {
 		unsigned int num_parents;
+		struct commit *commit;
 
 		display_progress(ctx->progress, i + 1);
 
-		ALLOC_GROW(ctx->commits.list, ctx->commits.nr + 1, ctx->commits.alloc);
-		ctx->commits.list[ctx->commits.nr] = lookup_commit(ctx->r, &ctx->oids.oid[i]);
+		commit = lookup_commit(ctx->r, &ctx->oids.oid[i]);
 
 		if (ctx->split && flags != COMMIT_GRAPH_SPLIT_REPLACE &&
-		    commit_graph_position(ctx->commits.list[ctx->commits.nr]) != COMMIT_NOT_FROM_GRAPH)
+		    commit_graph_position(commit) != COMMIT_NOT_FROM_GRAPH)
 			continue;
 
 		if (ctx->split && flags == COMMIT_GRAPH_SPLIT_REPLACE)
-			repo_parse_commit(ctx->r, ctx->commits.list[ctx->commits.nr]);
+			repo_parse_commit(ctx->r, commit);
 		else
-			repo_parse_commit_no_graph(ctx->r, ctx->commits.list[ctx->commits.nr]);
+			repo_parse_commit_no_graph(ctx->r, commit);
 
-		num_parents = commit_list_count(ctx->commits.list[ctx->commits.nr]->parents);
+		num_parents = commit_list_count(commit->parents);
 		if (num_parents > 2)
 			ctx->num_extra_edges += num_parents - 1;
 
-		ctx->commits.nr++;
+		commit_stack_push(&ctx->commits, commit);
 	}
 	stop_progress(&ctx->progress);
 }
@@ -2330,7 +2324,7 @@ static void merge_commit_graph(struct write_commit_graph_context *ctx,
 		    oid_to_hex(&g->oid),
 		    (uintmax_t)st_add(ctx->commits.nr, g->num_commits));
 
-	ALLOC_GROW(ctx->commits.list, ctx->commits.nr + g->num_commits, ctx->commits.alloc);
+	commit_stack_grow(&ctx->commits, g->num_commits);
 
 	for (i = 0; i < g->num_commits; i++) {
 		struct object_id oid;
@@ -2343,10 +2337,8 @@ static void merge_commit_graph(struct write_commit_graph_context *ctx,
 		/* only add commits if they still exist in the repo */
 		result = lookup_commit_reference_gently(ctx->r, &oid, 1);
 
-		if (result) {
-			ctx->commits.list[ctx->commits.nr] = result;
-			ctx->commits.nr++;
-		}
+		if (result)
+			commit_stack_push(&ctx->commits, result);
 	}
 }
 
@@ -2367,14 +2359,14 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
 					_("Scanning merged commits"),
 					ctx->commits.nr);
 
-	QSORT(ctx->commits.list, ctx->commits.nr, commit_compare);
+	QSORT(ctx->commits.items, ctx->commits.nr, commit_compare);
 
 	ctx->num_extra_edges = 0;
 	for (i = 0; i < ctx->commits.nr; i++) {
 		display_progress(ctx->progress, i + 1);
 
-		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
-			  &ctx->commits.list[i]->object.oid)) {
+		if (i && oideq(&ctx->commits.items[i - 1]->object.oid,
+			  &ctx->commits.items[i]->object.oid)) {
 			/*
 			 * Silently ignore duplicates. These were likely
 			 * created due to a commit appearing in multiple
@@ -2385,10 +2377,10 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
 		} else {
 			unsigned int num_parents;
 
-			ctx->commits.list[dedup_i] = ctx->commits.list[i];
+			ctx->commits.items[dedup_i] = ctx->commits.items[i];
 			dedup_i++;
 
-			num_parents = commit_list_count(ctx->commits.list[i]->parents);
+			num_parents = commit_list_count(ctx->commits.items[i]->parents);
 			if (num_parents > 2)
 				ctx->num_extra_edges += num_parents - 1;
 		}
@@ -2666,7 +2658,7 @@ int write_commit_graph(struct odb_source *source,
 cleanup:
 	free(ctx.graph_name);
 	free(ctx.base_graph_name);
-	free(ctx.commits.list);
+	commit_stack_clear(&ctx.commits);
 	oid_array_clear(&ctx.oids);
 	clear_topo_level_slab(&topo_levels);
 

From 0e445956f4c9b6d079feb5ed831f018c857b955b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Wed, 24 Dec 2025 18:03:27 +0100
Subject: [PATCH 290/553] commit-reach: use commit_stack
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Use commit_stack instead of open-coding it.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 commit-reach.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index cc18c86d3bb315..e7d9b3208fabc4 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -283,8 +283,8 @@ static int remove_redundant_with_gen(struct repository *r,
 {
 	size_t i, count_non_stale = 0, count_still_independent = cnt;
 	timestamp_t min_generation = GENERATION_NUMBER_INFINITY;
-	struct commit **walk_start, **sorted;
-	size_t walk_start_nr = 0, walk_start_alloc = cnt;
+	struct commit **sorted;
+	struct commit_stack walk_start = COMMIT_STACK_INIT;
 	size_t min_gen_pos = 0;
 
 	/*
@@ -298,7 +298,7 @@ static int remove_redundant_with_gen(struct repository *r,
 	QSORT(sorted, cnt, compare_commits_by_gen);
 	min_generation = commit_graph_generation(sorted[0]);
 
-	ALLOC_ARRAY(walk_start, walk_start_alloc);
+	commit_stack_grow(&walk_start, cnt);
 
 	/* Mark all parents of the input as STALE */
 	for (i = 0; i < cnt; i++) {
@@ -312,18 +312,17 @@ static int remove_redundant_with_gen(struct repository *r,
 			repo_parse_commit(r, parents->item);
 			if (!(parents->item->object.flags & STALE)) {
 				parents->item->object.flags |= STALE;
-				ALLOC_GROW(walk_start, walk_start_nr + 1, walk_start_alloc);
-				walk_start[walk_start_nr++] = parents->item;
+				commit_stack_push(&walk_start, parents->item);
 			}
 			parents = parents->next;
 		}
 	}
 
-	QSORT(walk_start, walk_start_nr, compare_commits_by_gen);
+	QSORT(walk_start.items, walk_start.nr, compare_commits_by_gen);
 
 	/* remove STALE bit for now to allow walking through parents */
-	for (i = 0; i < walk_start_nr; i++)
-		walk_start[i]->object.flags &= ~STALE;
+	for (i = 0; i < walk_start.nr; i++)
+		walk_start.items[i]->object.flags &= ~STALE;
 
 	/*
 	 * Start walking from the highest generation. Hopefully, it will
@@ -331,12 +330,12 @@ static int remove_redundant_with_gen(struct repository *r,
 	 * terminate early. Otherwise, we will do the same amount of work
 	 * as before.
 	 */
-	for (i = walk_start_nr; i && count_still_independent > 1; i--) {
+	for (i = walk_start.nr; i && count_still_independent > 1; i--) {
 		/* push the STALE bits up to min generation */
 		struct commit_list *stack = NULL;
 
-		commit_list_insert(walk_start[i - 1], &stack);
-		walk_start[i - 1]->object.flags |= STALE;
+		commit_list_insert(walk_start.items[i - 1], &stack);
+		walk_start.items[i - 1]->object.flags |= STALE;
 
 		while (stack) {
 			struct commit_list *parents;
@@ -390,8 +389,8 @@ static int remove_redundant_with_gen(struct repository *r,
 	}
 
 	/* clear marks */
-	clear_commit_marks_many(walk_start_nr, walk_start, STALE);
-	free(walk_start);
+	clear_commit_marks_many(walk_start.nr, walk_start.items, STALE);
+	commit_stack_clear(&walk_start);
 
 	*dedup_cnt = count_non_stale;
 	return 0;

From 363837afe75e7d6f6efd53775887dff67fb9e5d6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Wed, 24 Dec 2025 09:02:45 +0100
Subject: [PATCH 291/553] macOS: make Homebrew use configurable
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On macOS we opportunistically use Homebrew-installed versions of
gettext(3) and msgfmt(1).  Make that behavior configurable by providing
make variables to disable Homebrew usage (NO_HOMEBREW) and to allow
using a non-default installation location (HOMEBREW_PREFIX).

Include and link only the gettext keg via the symlink opt/gettext
pointing to its installed version instead of using the Homebrew prefix.
This is simpler and prevents accidentally including other libraries.

Suggested-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Suggested-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Makefile         | 18 ++++++++++++++++++
 config.mak.uname | 26 ++++----------------------
 2 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/Makefile b/Makefile
index 7e0f77e2988e3b..e4cbe24ad594ac 100644
--- a/Makefile
+++ b/Makefile
@@ -100,6 +100,12 @@ include shared.mak
 # specify your own (or DarwinPort's) include directories and
 # library directories by defining CFLAGS and LDFLAGS appropriately.
 #
+# Define NO_HOMEBREW if you don't want to use gettext and msgfmt
+# installed by Homebrew.
+#
+# Define HOMEBREW_PREFIX if you have Homebrew installed in a non-default
+# location on macOS or on Linux and want to use it.
+#
 # Define NO_APPLE_COMMON_CRYPTO if you are building on Darwin/Mac OS X
 # and do not want to use Apple's CommonCrypto library.  This allows you
 # to provide your own OpenSSL library, for example from MacPorts.
@@ -1690,6 +1696,18 @@ ifeq ($(uname_S),Darwin)
 	PTHREAD_LIBS =
 endif
 
+ifndef NO_HOMEBREW
+ifdef HOMEBREW_PREFIX
+ifeq ($(shell test -d $(HOMEBREW_PREFIX)/opt/gettext && echo y),y)
+	BASIC_CFLAGS += -I$(HOMEBREW_PREFIX)/opt/gettext/include
+	BASIC_LDFLAGS += -L$(HOMEBREW_PREFIX)/opt/gettext/lib
+endif
+ifeq ($(shell test -x $(HOMEBREW_PREFIX)/opt/gettext/msgfmt && echo y),y)
+	MSGFMT = $(HOMEBREW_PREFIX)/opt/gettext/msgfmt
+endif
+endif
+endif
+
 ifdef NO_LIBGEN_H
 	COMPAT_CFLAGS += -DNO_LIBGEN_H
 	COMPAT_OBJS += compat/basename.o
diff --git a/config.mak.uname b/config.mak.uname
index 1691c6ae6e01e3..db2a92275168f2 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -149,28 +149,10 @@ ifeq ($(uname_S),Darwin)
 	CSPRNG_METHOD = arc4random
 	USE_ENHANCED_BASIC_REGULAR_EXPRESSIONS = YesPlease
 
-	# Workaround for `gettext` being keg-only and not even being linked via
-	# `brew link --force gettext`, should be obsolete as of
-	# https://github.com/Homebrew/homebrew-core/pull/53489
-        ifeq ($(shell test -d /usr/local/opt/gettext/ && echo y),y)
-		BASIC_CFLAGS += -I/usr/local/include -I/usr/local/opt/gettext/include
-		BASIC_LDFLAGS += -L/usr/local/lib -L/usr/local/opt/gettext/lib
-                ifeq ($(shell test -x /usr/local/opt/gettext/bin/msgfmt && echo y),y)
-			MSGFMT = /usr/local/opt/gettext/bin/msgfmt
-                endif
-	# On newer ARM-based machines the default installation path has changed to
-	# /opt/homebrew. Include it in our search paths so that the user does not
-	# have to configure this manually.
-	#
-	# Note that we do not employ the same workaround as above where we manually
-	# add gettext. The issue was fixed more than three years ago by now, and at
-	# that point there haven't been any ARM-based Macs yet.
-        else ifeq ($(shell test -d /opt/homebrew/ && echo y),y)
-		BASIC_CFLAGS += -I/opt/homebrew/include
-		BASIC_LDFLAGS += -L/opt/homebrew/lib
-                ifeq ($(shell test -x /opt/homebrew/bin/msgfmt && echo y),y)
-			MSGFMT = /opt/homebrew/bin/msgfmt
-                endif
+        ifeq ($(uname_M),arm64)
+		HOMEBREW_PREFIX = /opt/homebrew
+        else
+		HOMEBREW_PREFIX = /usr/local
         endif
 
 	# The builtin FSMonitor on MacOS builds upon Simple-IPC.  Both require

From cee341e9ddb0b57e19f16c64b17caf68683faaeb Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Wed, 24 Dec 2025 09:03:01 +0100
Subject: [PATCH 292/553] macOS: use iconv from Homebrew if needed and present
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The library function iconv(3) supplied with macOS versions 15.7.2
(Sequoia) and 26.1 (Tahoe) is unreliable when doing conversions from
ISO-2022-JP to UTF-8 in multiple steps; t3900 reports this breakage:

  not ok 17 - ISO-2022-JP should be shown in UTF-8 now
  not ok 25 - ISO-2022-JP should be shown in UTF-8 now
  not ok 38 - commit --fixup into ISO-2022-JP from UTF-8

As a workaround, use libiconv from Homebrew, if available.  Search for
it in its default locations: /opt/homebrew for Apple Silicon and
/usr/local for macOS Intel, with the former taking precedence.  Respect
ICONVDIR if already set by the user, though.

Helped-by: Koji Nakamaru <koji.nakamaru@gree.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Makefile         | 12 ++++++++++--
 config.mak.uname |  4 ++++
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index e4cbe24ad594ac..ebfaec678d3a4d 100644
--- a/Makefile
+++ b/Makefile
@@ -100,12 +100,15 @@ include shared.mak
 # specify your own (or DarwinPort's) include directories and
 # library directories by defining CFLAGS and LDFLAGS appropriately.
 #
-# Define NO_HOMEBREW if you don't want to use gettext and msgfmt
-# installed by Homebrew.
+# Define NO_HOMEBREW if you don't want to use gettext, libiconv and
+# msgfmt installed by Homebrew.
 #
 # Define HOMEBREW_PREFIX if you have Homebrew installed in a non-default
 # location on macOS or on Linux and want to use it.
 #
+# Define USE_HOMEBREW_LIBICONV to link against libiconv installed by
+# Homebrew, if present.
+#
 # Define NO_APPLE_COMMON_CRYPTO if you are building on Darwin/Mac OS X
 # and do not want to use Apple's CommonCrypto library.  This allows you
 # to provide your own OpenSSL library, for example from MacPorts.
@@ -1705,6 +1708,11 @@ endif
 ifeq ($(shell test -x $(HOMEBREW_PREFIX)/opt/gettext/msgfmt && echo y),y)
 	MSGFMT = $(HOMEBREW_PREFIX)/opt/gettext/msgfmt
 endif
+ifdef USE_HOMEBREW_LIBICONV
+ifeq ($(shell test -d $(HOMEBREW_PREFIX)/opt/libiconv && echo y),y)
+	ICONVDIR ?= $(HOMEBREW_PREFIX)/opt/libiconv
+endif
+endif
 endif
 endif
 
diff --git a/config.mak.uname b/config.mak.uname
index db2a92275168f2..38b35af366d5fd 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -124,6 +124,7 @@ ifeq ($(uname_S),Darwin)
 	# - MacOS 10.0.* and MacOS 10.1.0 = Darwin 1.*
 	# - MacOS 10.x.* = Darwin (x+4).* for (1 <= x)
 	# i.e. "begins with [15678] and a dot" means "10.4.* or older".
+	DARWIN_MAJOR_VERSION = $(shell expr "$(uname_R)" : '\([0-9]*\)\.')
         ifeq ($(shell expr "$(uname_R)" : '[15678]\.'),2)
 		OLD_ICONV = UnfortunatelyYes
 		NO_APPLE_COMMON_CRYPTO = YesPlease
@@ -154,6 +155,9 @@ ifeq ($(uname_S),Darwin)
         else
 		HOMEBREW_PREFIX = /usr/local
         endif
+        ifeq ($(shell test "$(DARWIN_MAJOR_VERSION)" -ge 24 && echo 1),1)
+		USE_HOMEBREW_LIBICONV = UnfortunatelyYes
+        endif
 
 	# The builtin FSMonitor on MacOS builds upon Simple-IPC.  Both require
 	# Unix domain sockets and PThreads.

From abf05d856f50fbd8f0390b31e7187d78930dbaf5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Fri, 26 Dec 2025 08:44:28 +0100
Subject: [PATCH 293/553] show-branch: use prio_queue
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Building a list using commit_list_insert_by_date() has quadratic
worst-case complexity.  Avoid it by using prio_queue.

Use prio_queue_peek()+prio_queue_replace() instead of prio_queue_get()+
prio_queue_put() if possible, as the former only rebalance the
prio_queue heap once instead of twice.
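
To illustrate why replacing the top is cheaper, here is a minimal sketch
of a binary max-heap of ints.  It is not the prio_queue implementation;
int_heap and the heap_*() functions are hypothetical names used only for
this example.  Removing the top and then inserting costs one sift-down
plus one sift-up, while overwriting the top needs a single sift-down:

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_MAX 64

/* Hypothetical fixed-size max-heap of ints, standing in for prio_queue. */
struct int_heap {
	int data[HEAP_MAX];
	size_t nr;
};

static void swap_at(struct int_heap *h, size_t a, size_t b)
{
	int tmp = h->data[a];
	h->data[a] = h->data[b];
	h->data[b] = tmp;
}

/* Move the item at index i towards the root until its parent is >= it. */
static void sift_up(struct int_heap *h, size_t i)
{
	while (i) {
		size_t parent = (i - 1) / 2;
		if (h->data[parent] >= h->data[i])
			break;
		swap_at(h, parent, i);
		i = parent;
	}
}

/* Move the item at index i towards the leaves until both children are <= it. */
static void sift_down(struct int_heap *h, size_t i)
{
	for (;;) {
		size_t child = 2 * i + 1;
		if (child >= h->nr)
			break;
		if (child + 1 < h->nr && h->data[child + 1] > h->data[child])
			child++;
		if (h->data[i] >= h->data[child])
			break;
		swap_at(h, i, child);
		i = child;
	}
}

/* Insert: append at the end, then one sift-up. */
static void heap_put(struct int_heap *h, int v)
{
	assert(h->nr < HEAP_MAX);
	h->data[h->nr++] = v;
	sift_up(h, h->nr - 1);
}

/* Remove the top: move the last item to the root, then one sift-down. */
static int heap_get(struct int_heap *h)
{
	int top;

	assert(h->nr);
	top = h->data[0];
	h->data[0] = h->data[--h->nr];
	sift_down(h, 0);
	return top;
}

/* Look at the top without removing it; no rebalancing at all. */
static int heap_peek(const struct int_heap *h)
{
	assert(h->nr);
	return h->data[0];
}

/* Overwrite the top and restore heap order with a single sift-down. */
static void heap_replace(struct int_heap *h, int v)
{
	assert(h->nr);
	h->data[0] = v;
	sift_down(h, 0);
}
```

Both heap_get() followed by heap_put() and heap_peek() followed by
heap_replace() leave the heap with the same contents; the second pair
just rebalances once.  In join_revs() the first parent takes the
replace path, further parents are put normally, and a commit without
new parents is removed with a plain get.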

In sane repositories this won't make much of a difference because the
number of items in the list or queue won't be very high:

Benchmark 1: ./git_v2.52.0 show-branch origin/main origin/next origin/seen origin/todo
  Time (mean ± σ):     538.2 ms ±   0.8 ms    [User: 527.6 ms, System: 9.6 ms]
  Range (min … max):   537.0 ms … 539.2 ms    10 runs

Benchmark 2: ./git show-branch origin/main origin/next origin/seen origin/todo
  Time (mean ± σ):     530.6 ms ±   0.4 ms    [User: 519.8 ms, System: 9.8 ms]
  Range (min … max):   530.1 ms … 531.3 ms    10 runs

Summary
  ./git show-branch origin/main origin/next origin/seen origin/todo ran
    1.01 ± 0.00 times faster than ./git_v2.52.0 show-branch origin/main origin/next origin/seen origin/todo

That number is not limited, though, and in pathological cases like the
one in p6010 we see a sizable improvement:

Test                      v2.52.0           HEAD
------------------------------------------------------------------
6010.4: git show-branch   2.19(2.19+0.00)   0.03(0.02+0.00) -98.6%

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/show-branch.c      | 34 +++++++++++++++++++++-------------
 t/perf/p6010-merge-base.sh |  8 ++++++--
 2 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 441babf2e350f9..9e4efbaaed7f4d 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -18,6 +18,7 @@
 #include "commit-slab.h"
 #include "date.h"
 #include "wildmatch.h"
+#include "prio-queue.h"
 
 static const char*const show_branch_usage[] = {
     N_("git show-branch [-a | --all] [-r | --remotes] [--topo-order | --date-order]\n"
@@ -59,11 +60,10 @@ static const char *get_color_reset_code(void)
 	return "";
 }
 
-static struct commit *interesting(struct commit_list *list)
+static struct commit *interesting(struct prio_queue *queue)
 {
-	while (list) {
-		struct commit *commit = list->item;
-		list = list->next;
+	for (size_t i = 0; i < queue->nr; i++) {
+		struct commit *commit = queue->array[i].data;
 		if (commit->object.flags & UNINTERESTING)
 			continue;
 		return commit;
@@ -222,17 +222,18 @@ static int mark_seen(struct commit *commit, struct commit_list **seen_p)
 	return 0;
 }
 
-static void join_revs(struct commit_list **list_p,
+static void join_revs(struct prio_queue *queue,
 		      struct commit_list **seen_p,
 		      int num_rev, int extra)
 {
 	int all_mask = ((1u << (REV_SHIFT + num_rev)) - 1);
 	int all_revs = all_mask & ~((1u << REV_SHIFT) - 1);
 
-	while (*list_p) {
+	while (queue->nr) {
 		struct commit_list *parents;
-		int still_interesting = !!interesting(*list_p);
-		struct commit *commit = pop_commit(list_p);
+		int still_interesting = !!interesting(queue);
+		struct commit *commit = prio_queue_peek(queue);
+		bool get_pending = true;
 		int flags = commit->object.flags & all_mask;
 
 		if (!still_interesting && extra <= 0)
@@ -253,8 +254,14 @@ static void join_revs(struct commit_list **list_p,
 			if (mark_seen(p, seen_p) && !still_interesting)
 				extra--;
 			p->object.flags |= flags;
-			commit_list_insert_by_date(p, list_p);
+			if (get_pending)
+				prio_queue_replace(queue, p);
+			else
+				prio_queue_put(queue, p);
+			get_pending = false;
 		}
+		if (get_pending)
+			prio_queue_get(queue);
 	}
 
 	/*
@@ -642,7 +649,8 @@ int cmd_show_branch(int ac,
 {
 	struct commit *rev[MAX_REVS], *commit;
 	char *reflog_msg[MAX_REVS] = {0};
-	struct commit_list *list = NULL, *seen = NULL;
+	struct commit_list *seen = NULL;
+	struct prio_queue queue = { compare_commits_by_commit_date };
 	unsigned int rev_mask[MAX_REVS];
 	int num_rev, i, extra = 0;
 	int all_heads = 0, all_remotes = 0;
@@ -886,14 +894,14 @@ int cmd_show_branch(int ac,
 		 */
 		commit->object.flags |= flag;
 		if (commit->object.flags == flag)
-			commit_list_insert_by_date(commit, &list);
+			prio_queue_put(&queue, commit);
 		rev[num_rev] = commit;
 	}
 	for (i = 0; i < num_rev; i++)
 		rev_mask[i] = rev[i]->object.flags;
 
 	if (0 <= extra)
-		join_revs(&list, &seen, num_rev, extra);
+		join_revs(&queue, &seen, num_rev, extra);
 
 	commit_list_sort_by_date(&seen);
 
@@ -1004,7 +1012,7 @@ int cmd_show_branch(int ac,
 	for (size_t i = 0; i < ARRAY_SIZE(reflog_msg); i++)
 		free(reflog_msg[i]);
 	free_commit_list(seen);
-	free_commit_list(list);
+	clear_prio_queue(&queue);
 	free(args_copy);
 	free(head);
 	return ret;
diff --git a/t/perf/p6010-merge-base.sh b/t/perf/p6010-merge-base.sh
index 54f52fa23ee1e7..08212dd0377db0 100755
--- a/t/perf/p6010-merge-base.sh
+++ b/t/perf/p6010-merge-base.sh
@@ -83,9 +83,9 @@ build_history2 () {
 test_expect_success 'setup' '
 	max_level=15 &&
 	build_history $max_level | git fast-import --export-marks=marks &&
-	git tag one &&
+	git branch one &&
 	build_history2 $max_level | git fast-import --import-marks=marks --force &&
-	git tag two &&
+	git branch two &&
 	git gc &&
 	git log --format=%H --no-merges >expect
 '
@@ -98,4 +98,8 @@ test_expect_success 'verify result' '
 	test_cmp expect actual
 '
 
+test_perf 'git show-branch' '
+	git show-branch one two
+'
+
 test_done

From 56cef1e504d7d111b4acb588dfa1a12e5ab550b9 Mon Sep 17 00:00:00 2001
From: Adrian Ratiu <adrian.ratiu@collabora.com>
Date: Fri, 26 Dec 2025 14:23:24 +0200
Subject: [PATCH 294/553] run-command: add first helper for pp child states

There is a recurring pattern of testing parallel process child states
and file descriptors to determine whether a child is running, is
receiving any input, or is ready for cleanup.

Name the pp_child structure and introduce a first helper to make these
checks more readable. Subsequent commits will add more helpers and
checks.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 run-command.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/run-command.c b/run-command.c
index ed9575bd6a8cbb..82eeac38bf0c10 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1478,15 +1478,22 @@ enum child_state {
 	GIT_CP_WAIT_CLEANUP,
 };
 
+struct parallel_child {
+	enum child_state state;
+	struct child_process process;
+	struct strbuf err;
+	void *data;
+};
+
+static int child_is_working(const struct parallel_child *pp_child)
+{
+	return pp_child->state == GIT_CP_WORKING;
+}
+
 struct parallel_processes {
 	size_t nr_processes;
 
-	struct {
-		enum child_state state;
-		struct child_process process;
-		struct strbuf err;
-		void *data;
-	} *children;
+	struct parallel_child *children;
 	/*
 	 * The struct pollfd is logically part of *children,
 	 * but the system call expects it as its own array.
@@ -1509,7 +1516,7 @@ static void kill_children(const struct parallel_processes *pp,
 			  int signo)
 {
 	for (size_t i = 0; i < opts->processes; i++)
-		if (pp->children[i].state == GIT_CP_WORKING)
+		if (child_is_working(&pp->children[i]))
 			kill(pp->children[i].process.pid, signo);
 }
 
@@ -1665,7 +1672,7 @@ static void pp_buffer_stderr(struct parallel_processes *pp,
 
 	/* Buffer output from all pipes. */
 	for (size_t i = 0; i < opts->processes; i++) {
-		if (pp->children[i].state == GIT_CP_WORKING &&
+		if (child_is_working(&pp->children[i]) &&
 		    pp->pfd[i].revents & (POLLIN | POLLHUP)) {
 			int n = strbuf_read_once(&pp->children[i].err,
 						 pp->children[i].process.err, 0);
@@ -1683,7 +1690,7 @@ static void pp_output(const struct parallel_processes *pp)
 {
 	size_t i = pp->output_owner;
 
-	if (pp->children[i].state == GIT_CP_WORKING &&
+	if (child_is_working(&pp->children[i]) &&
 	    pp->children[i].err.len) {
 		strbuf_write(&pp->children[i].err, stderr);
 		strbuf_reset(&pp->children[i].err);
@@ -1748,7 +1755,7 @@ static int pp_collect_finished(struct parallel_processes *pp,
 			 * running process time.
 			 */
 			for (i = 0; i < n; i++)
-				if (pp->children[(pp->output_owner + i) % n].state == GIT_CP_WORKING)
+				if (child_is_working(&pp->children[(pp->output_owner + i) % n]))
 					break;
 			pp->output_owner = (pp->output_owner + i) % n;
 		}

From 23a720e96b98cb492077a2d23107df31dbc17a96 Mon Sep 17 00:00:00 2001
From: Emily Shaffer <emilyshaffer@google.com>
Date: Fri, 26 Dec 2025 14:23:25 +0200
Subject: [PATCH 295/553] run-command: add stdin callback for parallelization
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

If a user of the run_processes_parallel() API wants to pipe a large
amount of information to the stdin of each parallel command, that
data could exceed the pipe buffer of the process's stdin and can be
too big to store in-memory via strbuf & friends or to slurp to a file.

Generally this is solved by repeatedly writing to child_process.in
between calls to start_command() and finish_command(). For a specific
pre-existing example of this, see transport.c:run_pre_push_hook().

This adds a generic callback API to run_processes_parallel() to do
exactly that in a unified manner, similar to the existing callback APIs,
which can then be used by hook.h to convert the remaining hooks to the
new, simpler parallel interface.
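
As a rough illustration of the callback contract (negative on error,
zero to be called again, positive when done), here is a standalone
sketch. It does not use the run-command API: demo_feed_fn, demo_state,
demo_feed() and demo_drive() are hypothetical names that only mirror
the shape of feed_pipe_fn, and the driver loop stands in for the
parallel loop.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in with the same shape as feed_pipe_fn. */
typedef int (*demo_feed_fn)(int child_in, void *pp_cb, void *pp_task_cb);

/* Hypothetical per-task state: lines still to be written. */
struct demo_state {
	int lines_remaining;
};

/* Writes one line per call: 0 = call again, 1 = done, -1 = error. */
static int demo_feed(int child_in, void *pp_cb, void *pp_task_cb)
{
	struct demo_state *st = pp_task_cb;
	const char line[] = "sample line\n";

	(void)pp_cb; /* unused in this sketch */
	if (st->lines_remaining <= 0)
		return 1;
	if (write(child_in, line, strlen(line)) < 0)
		return errno == EPIPE ? 1 : -1; /* reader gone counts as done */
	st->lines_remaining--;
	return st->lines_remaining == 0;
}

/*
 * Hypothetical driver: call the callback until it reports completion,
 * then close the fd so the reader sees EOF. Returns the number of
 * calls made, or -1 if the callback reported an error.
 */
static int demo_drive(demo_feed_fn feed, int fd, void *cb, void *task_cb)
{
	int calls = 0;

	for (;;) {
		int ret = feed(fd, cb, task_cb);

		calls++;
		if (ret < 0)
			return -1;
		if (ret > 0)
			break;
	}
	close(fd);
	return calls;
}
```

Keeping the progress counter in the per-task cookie is what lets the
callback return after a small write and resume on the next call.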

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 run-command.c               | 87 ++++++++++++++++++++++++++++++++++---
 run-command.h               | 21 +++++++++
 t/helper/test-run-command.c | 52 +++++++++++++++++++++-
 t/t0061-run-command.sh      | 31 +++++++++++++
 4 files changed, 182 insertions(+), 9 deletions(-)

diff --git a/run-command.c b/run-command.c
index 82eeac38bf0c10..a608d37fb22d98 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1490,6 +1490,16 @@ static int child_is_working(const struct parallel_child *pp_child)
 	return pp_child->state == GIT_CP_WORKING;
 }
 
+static int child_is_ready_for_cleanup(const struct parallel_child *pp_child)
+{
+	return child_is_working(pp_child) && !pp_child->process.in;
+}
+
+static int child_is_receiving_input(const struct parallel_child *pp_child)
+{
+	return child_is_working(pp_child) && pp_child->process.in > 0;
+}
+
 struct parallel_processes {
 	size_t nr_processes;
 
@@ -1659,6 +1669,44 @@ static int pp_start_one(struct parallel_processes *pp,
 	return 0;
 }
 
+static void pp_buffer_stdin(struct parallel_processes *pp,
+			    const struct run_process_parallel_opts *opts)
+{
+	/* Buffer stdin for each pipe. */
+	for (size_t i = 0; i < opts->processes; i++) {
+		struct child_process *proc = &pp->children[i].process;
+		int ret;
+
+		if (!child_is_receiving_input(&pp->children[i]))
+			continue;
+
+		/*
+		 * child input is provided via path_to_stdin when the feed_pipe cb is
+		 * missing, so we just signal an EOF.
+		 */
+		if (!opts->feed_pipe) {
+			close(proc->in);
+			proc->in = 0;
+			continue;
+		}
+
+		/**
+		 * Feed the pipe:
+		 *   ret < 0 means error
+		 *   ret == 0 means there is more data to be fed
+		 *   ret > 0 means feeding finished
+		 */
+		ret = opts->feed_pipe(proc->in, opts->data, pp->children[i].data);
+		if (ret < 0)
+			die_errno("feed_pipe");
+
+		if (ret) {
+			close(proc->in);
+			proc->in = 0;
+		}
+	}
+}
+
 static void pp_buffer_stderr(struct parallel_processes *pp,
 			     const struct run_process_parallel_opts *opts,
 			     int output_timeout)
@@ -1729,6 +1777,7 @@ static int pp_collect_finished(struct parallel_processes *pp,
 		pp->children[i].state = GIT_CP_FREE;
 		if (pp->pfd)
 			pp->pfd[i].fd = -1;
+		pp->children[i].process.in = 0;
 		child_process_init(&pp->children[i].process);
 
 		if (opts->ungroup) {
@@ -1763,6 +1812,27 @@ static int pp_collect_finished(struct parallel_processes *pp,
 	return result;
 }
 
+static void pp_handle_child_IO(struct parallel_processes *pp,
+				const struct run_process_parallel_opts *opts,
+				int output_timeout)
+{
+	/*
+	 * First push input, if any (it might no-op), to child tasks to avoid them blocking
+	 * after input. This also prevents deadlocks when ungrouping below, if a child blocks
+	 * while the parent also waits for them to finish.
+	 */
+	pp_buffer_stdin(pp, opts);
+
+	if (opts->ungroup) {
+		for (size_t i = 0; i < opts->processes; i++)
+			if (child_is_ready_for_cleanup(&pp->children[i]))
+				pp->children[i].state = GIT_CP_WAIT_CLEANUP;
+	} else {
+		pp_buffer_stderr(pp, opts, output_timeout);
+		pp_output(pp);
+	}
+}
+
 void run_processes_parallel(const struct run_process_parallel_opts *opts)
 {
 	int i, code;
@@ -1782,6 +1852,13 @@ void run_processes_parallel(const struct run_process_parallel_opts *opts)
 					   "max:%"PRIuMAX,
 					   (uintmax_t)opts->processes);
 
+	/*
+	 * Child tasks might receive input via stdin, terminating early (or not), so
+	 * ignore the default SIGPIPE which gets handled by each feed_pipe_fn which
+	 * actually writes the data to children stdin fds.
+	 */
+	sigchain_push(SIGPIPE, SIG_IGN);
+
 	pp_init(&pp, opts, &pp_sig);
 	while (1) {
 		for (i = 0;
@@ -1799,13 +1876,7 @@ void run_processes_parallel(const struct run_process_parallel_opts *opts)
 		}
 		if (!pp.nr_processes)
 			break;
-		if (opts->ungroup) {
-			for (size_t i = 0; i < opts->processes; i++)
-				pp.children[i].state = GIT_CP_WAIT_CLEANUP;
-		} else {
-			pp_buffer_stderr(&pp, opts, output_timeout);
-			pp_output(&pp);
-		}
+		pp_handle_child_IO(&pp, opts, output_timeout);
 		code = pp_collect_finished(&pp, opts);
 		if (code) {
 			pp.shutdown = 1;
@@ -1816,6 +1887,8 @@ void run_processes_parallel(const struct run_process_parallel_opts *opts)
 
 	pp_cleanup(&pp, opts);
 
+	sigchain_pop(SIGPIPE);
+
 	if (do_trace2)
 		trace2_region_leave(tr2_category, tr2_label, NULL);
 }
diff --git a/run-command.h b/run-command.h
index 0df25e445f001c..e1ca965b5b1988 100644
--- a/run-command.h
+++ b/run-command.h
@@ -420,6 +420,21 @@ typedef int (*start_failure_fn)(struct strbuf *out,
 				void *pp_cb,
 				void *pp_task_cb);
 
+/**
+ * This callback is repeatedly called on every child process who requests
+ * start_command() to create a pipe by setting child_process.in < 0.
+ *
+ * pp_cb is the callback cookie as passed into run_processes_parallel, and
+ * pp_task_cb is the callback cookie as passed into get_next_task_fn.
+ *
+ * Returns < 0 for error
+ * Returns == 0 when there is more data to be fed (will be called again)
+ * Returns > 0 when finished (child closed fd or no more data to be fed)
+ */
+typedef int (*feed_pipe_fn)(int child_in,
+				void *pp_cb,
+				void *pp_task_cb);
+
 /**
  * This callback is called on every child process that finished processing.
  *
@@ -473,6 +488,12 @@ struct run_process_parallel_opts
 	 */
 	start_failure_fn start_failure;
 
+	/*
+	 * feed_pipe: see feed_pipe_fn() above. This can be NULL to omit any
+	 * special handling.
+	 */
+	feed_pipe_fn feed_pipe;
+
 	/**
 	 * task_finished: See task_finished_fn() above. This can be
 	 * NULL to omit any special handling.
diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c
index 3719f23cc2d02f..4a56456894ccff 100644
--- a/t/helper/test-run-command.c
+++ b/t/helper/test-run-command.c
@@ -23,19 +23,26 @@ static int number_callbacks;
 static int parallel_next(struct child_process *cp,
 			 struct strbuf *err,
 			 void *cb,
-			 void **task_cb UNUSED)
+			 void **task_cb)
 {
 	struct child_process *d = cb;
 	if (number_callbacks >= 4)
 		return 0;
 
 	strvec_pushv(&cp->args, d->args.v);
+	cp->in = d->in;
+	cp->no_stdin = d->no_stdin;
 	if (err)
 		strbuf_addstr(err, "preloaded output of a child\n");
 	else
 		fprintf(stderr, "preloaded output of a child\n");
 
 	number_callbacks++;
+
+	/* test_stdin callback will use this to count remaining lines */
+	*task_cb = xmalloc(sizeof(int));
+	*(int*)(*task_cb) = 2;
+
 	return 1;
 }
 
@@ -54,15 +61,48 @@ static int no_job(struct child_process *cp UNUSED,
 static int task_finished(int result UNUSED,
 			 struct strbuf *err,
 			 void *pp_cb UNUSED,
-			 void *pp_task_cb UNUSED)
+			 void *pp_task_cb)
 {
 	if (err)
 		strbuf_addstr(err, "asking for a quick stop\n");
 	else
 		fprintf(stderr, "asking for a quick stop\n");
+
+	FREE_AND_NULL(pp_task_cb);
+
 	return 1;
 }
 
+static int task_finished_quiet(int result UNUSED,
+			       struct strbuf *err UNUSED,
+			       void *pp_cb UNUSED,
+			       void *pp_task_cb)
+{
+	FREE_AND_NULL(pp_task_cb);
+	return 0;
+}
+
+static int test_stdin_pipe_feed(int hook_stdin_fd, void *cb UNUSED, void *task_cb)
+{
+	int *lines_remaining = task_cb;
+
+	if (*lines_remaining) {
+		struct strbuf buf = STRBUF_INIT;
+		strbuf_addf(&buf, "sample stdin %d\n", --(*lines_remaining));
+		if (write_in_full(hook_stdin_fd, buf.buf, buf.len) < 0) {
+			if (errno == EPIPE) {
+				/* child closed stdin, nothing more to do */
+				strbuf_release(&buf);
+				return 1;
+			}
+			die_errno("write");
+		}
+		strbuf_release(&buf);
+	}
+
+	return !(*lines_remaining);
+}
+
 struct testsuite {
 	struct string_list tests, failed;
 	int next;
@@ -157,6 +197,7 @@ static int testsuite(int argc, const char **argv)
 	struct run_process_parallel_opts opts = {
 		.get_next_task = next_test,
 		.start_failure = test_failed,
+		.feed_pipe = test_stdin_pipe_feed,
 		.task_finished = test_finished,
 		.data = &suite,
 	};
@@ -460,12 +501,19 @@ int cmd__run_command(int argc, const char **argv)
 
 	if (!strcmp(argv[1], "run-command-parallel")) {
 		opts.get_next_task = parallel_next;
+		opts.task_finished = task_finished_quiet;
 	} else if (!strcmp(argv[1], "run-command-abort")) {
 		opts.get_next_task = parallel_next;
 		opts.task_finished = task_finished;
 	} else if (!strcmp(argv[1], "run-command-no-jobs")) {
 		opts.get_next_task = no_job;
 		opts.task_finished = task_finished;
+	} else if (!strcmp(argv[1], "run-command-stdin")) {
+		proc.in = -1;
+		proc.no_stdin = 0;
+		opts.get_next_task = parallel_next;
+		opts.task_finished = task_finished_quiet;
+		opts.feed_pipe = test_stdin_pipe_feed;
 	} else {
 		ret = 1;
 		fprintf(stderr, "check usage\n");
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 76d4936a879afd..2f77fde0d964c8 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -164,6 +164,37 @@ test_expect_success 'run_command runs ungrouped in parallel with more tasks than
 	test_line_count = 4 err
 '
 
+test_expect_success 'run_command listens to stdin' '
+	cat >expect <<-\EOF &&
+	preloaded output of a child
+	listening for stdin:
+	sample stdin 1
+	sample stdin 0
+	preloaded output of a child
+	listening for stdin:
+	sample stdin 1
+	sample stdin 0
+	preloaded output of a child
+	listening for stdin:
+	sample stdin 1
+	sample stdin 0
+	preloaded output of a child
+	listening for stdin:
+	sample stdin 1
+	sample stdin 0
+	EOF
+
+	write_script stdin-script <<-\EOF &&
+	echo "listening for stdin:"
+	while read line
+	do
+		echo "$line"
+	done
+	EOF
+	test-tool run-command run-command-stdin 2 ./stdin-script 2>actual &&
+	test_cmp expect actual
+'
+
 cat >expect <<-EOF
 preloaded output of a child
 asking for a quick stop

From 26238496a70f084912924d2c3af828c24bceb4aa Mon Sep 17 00:00:00 2001
From: Emily Shaffer <emilyshaffer@google.com>
Date: Fri, 26 Dec 2025 14:23:26 +0200
Subject: [PATCH 296/553] hook: provide stdin via callback
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This adds a callback mechanism for feeding stdin to hooks alongside
the existing path_to_stdin (which slurps a file's content to stdin).

The advantage of this new callback is that it can feed stdin without
going through the filesystem, which helps when feeding large amounts
of data. It builds on the run-command parallel stdin callback
introduced in the preceding commit.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 hook.c | 23 ++++++++++++++++++++++-
 hook.h | 38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/hook.c b/hook.c
index b3de1048bf44b9..5ddd7678d18f0d 100644
--- a/hook.c
+++ b/hook.c
@@ -55,7 +55,7 @@ int hook_exists(struct repository *r, const char *name)
 static int pick_next_hook(struct child_process *cp,
 			  struct strbuf *out UNUSED,
 			  void *pp_cb,
-			  void **pp_task_cb UNUSED)
+			  void **pp_task_cb)
 {
 	struct hook_cb_data *hook_cb = pp_cb;
 	const char *hook_path = hook_cb->hook_path;
@@ -65,11 +65,22 @@ static int pick_next_hook(struct child_process *cp,
 
 	cp->no_stdin = 1;
 	strvec_pushv(&cp->env, hook_cb->options->env.v);
+
+	if (hook_cb->options->path_to_stdin && hook_cb->options->feed_pipe)
+		BUG("options path_to_stdin and feed_pipe are mutually exclusive");
+
 	/* reopen the file for stdin; run_command closes it. */
 	if (hook_cb->options->path_to_stdin) {
 		cp->no_stdin = 0;
 		cp->in = xopen(hook_cb->options->path_to_stdin, O_RDONLY);
 	}
+
+	if (hook_cb->options->feed_pipe) {
+		cp->no_stdin = 0;
+		/* start_command() will allocate a pipe / stdin fd for us */
+		cp->in = -1;
+	}
+
 	cp->stdout_to_stderr = 1;
 	cp->trace2_hook_name = hook_cb->hook_name;
 	cp->dir = hook_cb->options->dir;
@@ -77,6 +88,12 @@ static int pick_next_hook(struct child_process *cp,
 	strvec_push(&cp->args, hook_path);
 	strvec_pushv(&cp->args, hook_cb->options->args.v);
 
+	/*
+	 * Provide per-hook internal state via task_cb for easy access, so
+	 * hook callbacks don't have to go through hook_cb->options.
+	 */
+	*pp_task_cb = hook_cb->options->feed_pipe_cb_data;
+
 	/*
 	 * This pick_next_hook() will be called again, we're only
 	 * running one hook, so indicate that no more work will be
@@ -140,6 +157,7 @@ int run_hooks_opt(struct repository *r, const char *hook_name,
 
 		.get_next_task = pick_next_hook,
 		.start_failure = notify_start_failure,
+		.feed_pipe = options->feed_pipe,
 		.task_finished = notify_hook_finished,
 
 		.data = &cb_data,
@@ -148,6 +166,9 @@ int run_hooks_opt(struct repository *r, const char *hook_name,
 	if (!options)
 		BUG("a struct run_hooks_opt must be provided to run_hooks");
 
+	if (options->path_to_stdin && options->feed_pipe)
+		BUG("options path_to_stdin and feed_pipe are mutually exclusive");
+
 	if (options->invoked_hook)
 		*options->invoked_hook = 0;
 
diff --git a/hook.h b/hook.h
index 11863fa7347e6f..2169d4a6bd3f2e 100644
--- a/hook.h
+++ b/hook.h
@@ -1,6 +1,7 @@
 #ifndef HOOK_H
 #define HOOK_H
 #include "strvec.h"
+#include "run-command.h"
 
 struct repository;
 
@@ -37,6 +38,43 @@ struct run_hooks_opt
 	 * Path to file which should be piped to stdin for each hook.
 	 */
 	const char *path_to_stdin;
+
+	/**
+	 * Callback used to incrementally feed a child hook stdin pipe.
+	 *
+	 * Useful especially if a hook consumes large quantities of data
+	 * (e.g. a list of all refs in a client push), so feeding it via
+	 * in-memory strings or slurping to/from files is inefficient.
+	 * While the callback allows piecemeal writing, it can also be
+	 * used for smaller inputs, where it gets called only once.
+	 *
+	 * Add hook callback initialization context to `feed_pipe_ctx`.
+	 * Add hook callback internal state to `feed_pipe_cb_data`.
+	 *
+	 */
+	feed_pipe_fn feed_pipe;
+
+	/**
+	 * Opaque data pointer used to pass context to `feed_pipe_fn`.
+	 *
+	 * It can be accessed via the second callback arg 'pp_cb':
+	 * ((struct hook_cb_data *) pp_cb)->options->feed_pipe_ctx;
+	 *
+	 * The caller is responsible for managing the memory for this data.
+	 * Only useful when using `run_hooks_opt.feed_pipe`, otherwise ignore it.
+	 */
+	void *feed_pipe_ctx;
+
+	/**
+	 * Opaque data pointer used to keep internal state across callback calls.
+	 *
+	 * It can be accessed directly via the third callback arg 'pp_task_cb':
+	 * struct ... *state = pp_task_cb;
+	 *
+	 * The caller is responsible for managing the memory for this data.
+	 * Only useful when using `run_hooks_opt.feed_pipe`, otherwise ignore it.
+	 */
+	void *feed_pipe_cb_data;
 };
 
 #define RUN_HOOKS_OPT_INIT { \

From 05eccff8c7e6c58bad777a642ab8a28c87602d36 Mon Sep 17 00:00:00 2001
From: Emily Shaffer <emilyshaffer@google.com>
Date: Fri, 26 Dec 2025 14:23:27 +0200
Subject: [PATCH 297/553] hook: convert 'post-rewrite' hook in sequencer.c to
 hook API
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Replace the custom run-command calls used by post-rewrite with
the newer and simpler run_hooks_opt(), which does not need to
create a custom 'struct child_process' or call find_hook().

Another benefit of using the hook API is that run_hooks_opt()
handles the SIGPIPE toggle logic.
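
For readers unfamiliar with the SIGPIPE toggle, the standalone sketch
below shows the effect it relies on: with SIGPIPE ignored, writing to
a pipe whose reader has gone away fails with EPIPE instead of killing
the writer. It uses plain signal() rather than git's sigchain helpers,
and write_to_closed_pipe() is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Writes to a pipe whose read end is already closed, with SIGPIPE
 * ignored for the duration of the write. Returns the errno of the
 * failed write, 0 if the write unexpectedly succeeded, or -1 if the
 * pipe could not be created.
 */
static int write_to_closed_pipe(void)
{
	int fds[2];
	int saved_errno = 0;
	void (*old_handler)(int);

	if (pipe(fds) < 0)
		return -1;
	close(fds[0]); /* the "hook" went away without reading */

	old_handler = signal(SIGPIPE, SIG_IGN);
	if (write(fds[1], "x", 1) < 0)
		saved_errno = errno;
	signal(SIGPIPE, old_handler); /* restore the previous disposition */

	close(fds[1]);
	return saved_errno;
}
```

This is why the feed callbacks can treat EPIPE as "the hook stopped
reading" rather than as a fatal error.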

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 sequencer.c | 42 +++++++++++++++++++++++++-----------------
 1 file changed, 25 insertions(+), 17 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 5476d39ba9b097..71ed31c7740688 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1292,32 +1292,40 @@ int update_head_with_reflog(const struct commit *old_head,
 	return ret;
 }
 
+static int pipe_from_strbuf(int hook_stdin_fd, void *pp_cb, void *pp_task_cb UNUSED)
+{
+	struct hook_cb_data *hook_cb = pp_cb;
+	struct strbuf *to_pipe = hook_cb->options->feed_pipe_ctx;
+	int ret;
+
+	if (!to_pipe)
+		BUG("pipe_from_strbuf called without feed_pipe_ctx");
+
+	ret = write_in_full(hook_stdin_fd, to_pipe->buf, to_pipe->len);
+	if (ret < 0 && errno != EPIPE)
+		return ret;
+
+	return 1; /* done writing */
+}
+
 static int run_rewrite_hook(const struct object_id *oldoid,
 			    const struct object_id *newoid)
 {
-	struct child_process proc = CHILD_PROCESS_INIT;
+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
 	int code;
 	struct strbuf sb = STRBUF_INIT;
-	const char *hook_path = find_hook(the_repository, "post-rewrite");
 
-	if (!hook_path)
-		return 0;
+	strbuf_addf(&sb, "%s %s\n", oid_to_hex(oldoid), oid_to_hex(newoid));
 
-	strvec_pushl(&proc.args, hook_path, "amend", NULL);
-	proc.in = -1;
-	proc.stdout_to_stderr = 1;
-	proc.trace2_hook_name = "post-rewrite";
+	opt.feed_pipe_ctx = &sb;
+	opt.feed_pipe = pipe_from_strbuf;
+
+	strvec_push(&opt.args, "amend");
+
+	code = run_hooks_opt(the_repository, "post-rewrite", &opt);
 
-	code = start_command(&proc);
-	if (code)
-		return code;
-	strbuf_addf(&sb, "%s %s\n", oid_to_hex(oldoid), oid_to_hex(newoid));
-	sigchain_push(SIGPIPE, SIG_IGN);
-	write_in_full(proc.in, sb.buf, sb.len);
-	close(proc.in);
 	strbuf_release(&sb);
-	sigchain_pop(SIGPIPE);
-	return finish_command(&proc);
+	return code;
 }
 
 void commit_post_rewrite(struct repository *r,

From 3e2836a742d8b2b2da25ca06e9d0ac3a539bd966 Mon Sep 17 00:00:00 2001
From: Emily Shaffer <emilyshaffer@google.com>
Date: Fri, 26 Dec 2025 14:23:28 +0200
Subject: [PATCH 298/553] transport: convert pre-push to hook API
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Move the pre-push hook from custom run-command invocations to
the new hook API which doesn't require a custom child_process
structure and signal toggling.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 transport.c | 89 +++++++++++++++++++++++++++--------------------------
 1 file changed, 45 insertions(+), 44 deletions(-)

diff --git a/transport.c b/transport.c
index c7f06a7382e605..6d0f02be5d7c00 100644
--- a/transport.c
+++ b/transport.c
@@ -1316,65 +1316,66 @@ static void die_with_unpushed_submodules(struct string_list *needs_pushing)
 	die(_("Aborting."));
 }
 
-static int run_pre_push_hook(struct transport *transport,
-			     struct ref *remote_refs)
-{
-	int ret = 0, x;
-	struct ref *r;
-	struct child_process proc = CHILD_PROCESS_INIT;
+struct feed_pre_push_hook_data {
 	struct strbuf buf;
-	const char *hook_path = find_hook(the_repository, "pre-push");
+	const struct ref *refs;
+};
 
-	if (!hook_path)
-		return 0;
+static int pre_push_hook_feed_stdin(int hook_stdin_fd, void *pp_cb UNUSED, void *pp_task_cb)
+{
+	struct feed_pre_push_hook_data *data = pp_task_cb;
+	const struct ref *r = data->refs;
+	int ret = 0;
 
-	strvec_push(&proc.args, hook_path);
-	strvec_push(&proc.args, transport->remote->name);
-	strvec_push(&proc.args, transport->url);
+	if (!r)
+		return 1; /* no more refs */
 
-	proc.in = -1;
-	proc.trace2_hook_name = "pre-push";
+	data->refs = r->next;
 
-	if (start_command(&proc)) {
-		finish_command(&proc);
-		return -1;
+	switch (r->status) {
+	case REF_STATUS_REJECT_NONFASTFORWARD:
+	case REF_STATUS_REJECT_REMOTE_UPDATED:
+	case REF_STATUS_REJECT_STALE:
+	case REF_STATUS_UPTODATE:
+		return 0; /* skip refs which won't be pushed */
+	default:
+		break;
 	}
 
-	sigchain_push(SIGPIPE, SIG_IGN);
+	if (!r->peer_ref)
+		return 0;
 
-	strbuf_init(&buf, 256);
+	strbuf_reset(&data->buf);
+	strbuf_addf(&data->buf, "%s %s %s %s\n",
+		    r->peer_ref->name, oid_to_hex(&r->new_oid),
+		    r->name, oid_to_hex(&r->old_oid));
 
-	for (r = remote_refs; r; r = r->next) {
-		if (!r->peer_ref) continue;
-		if (r->status == REF_STATUS_REJECT_NONFASTFORWARD) continue;
-		if (r->status == REF_STATUS_REJECT_STALE) continue;
-		if (r->status == REF_STATUS_REJECT_REMOTE_UPDATED) continue;
-		if (r->status == REF_STATUS_UPTODATE) continue;
+	ret = write_in_full(hook_stdin_fd, data->buf.buf, data->buf.len);
+	if (ret < 0 && errno != EPIPE)
+		return ret; /* We do not mind if a hook does not read all refs. */
 
-		strbuf_reset(&buf);
-		strbuf_addf( &buf, "%s %s %s %s\n",
-			 r->peer_ref->name, oid_to_hex(&r->new_oid),
-			 r->name, oid_to_hex(&r->old_oid));
+	return 0;
+}
 
-		if (write_in_full(proc.in, buf.buf, buf.len) < 0) {
-			/* We do not mind if a hook does not read all refs. */
-			if (errno != EPIPE)
-				ret = -1;
-			break;
-		}
-	}
+static int run_pre_push_hook(struct transport *transport,
+			     struct ref *remote_refs)
+{
+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
+	struct feed_pre_push_hook_data data;
+	int ret = 0;
+
+	strvec_push(&opt.args, transport->remote->name);
+	strvec_push(&opt.args, transport->url);
 
-	strbuf_release(&buf);
+	strbuf_init(&data.buf, 0);
+	data.refs = remote_refs;
 
-	x = close(proc.in);
-	if (!ret)
-		ret = x;
+	opt.feed_pipe = pre_push_hook_feed_stdin;
+	opt.feed_pipe_cb_data = &data;
 
-	sigchain_pop(SIGPIPE);
+	ret = run_hooks_opt(the_repository, "pre-push", &opt);
 
-	x = finish_command(&proc);
-	if (!ret)
-		ret = x;
+	strbuf_release(&data.buf);
 
 	return ret;
 }

From 7a7717427ea7253003d221c47b462d9334429053 Mon Sep 17 00:00:00 2001
From: Adrian Ratiu <adrian.ratiu@collabora.com>
Date: Fri, 26 Dec 2025 14:23:29 +0200
Subject: [PATCH 299/553] reference-transaction: use hook API instead of
 run-command
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Convert the reference-transaction hook to the new hook API,
so it doesn't need to set up a struct child_process, call
find_hook or toggle the pipe signals.

The stdin feed callback processes one ref update per call. I
haven't noticed any performance degradation due to this, but
we can batch as many updates as we want in each call to ensure
good pipe throughput (i.e. the child does not wait on stdin).
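
A batched variant might look roughly like the standalone sketch below.
It is not part of this patch: DEMO_BATCH, demo_batch_state and
demo_feed_batched() are hypothetical, the records are plain strings
rather than ref updates, and a real caller would use write_in_full()
and handle EPIPE as the callback in this patch does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define DEMO_BATCH 4 /* hypothetical number of records per call */

/* Hypothetical per-task state kept across callback invocations. */
struct demo_batch_state {
	const char **records; /* lines to send, without trailing newline */
	size_t nr;
	size_t index;         /* next record to send */
};

/* Sends up to DEMO_BATCH records per call: 0 = more, 1 = done, -1 = error. */
static int demo_feed_batched(int fd, void *pp_cb, void *pp_task_cb)
{
	struct demo_batch_state *st = pp_task_cb;
	char buf[1024];
	size_t len = 0;

	(void)pp_cb; /* unused in this sketch */
	for (int i = 0; i < DEMO_BATCH && st->index < st->nr; i++) {
		int n = snprintf(buf + len, sizeof(buf) - len, "%s\n",
				 st->records[st->index]);

		if (n < 0 || (size_t)n >= sizeof(buf) - len)
			break; /* does not fit: flush what we have so far */
		len += n;
		st->index++;
	}
	if (!len && st->index < st->nr)
		return -1; /* a single record is larger than buf */
	/* Sketch only: short writes are not retried here. */
	if (len && write(fd, buf, len) < 0)
		return -1;
	return st->index >= st->nr;
}
```

The trade-off is fewer callback round trips per hook at the cost of a
larger buffer held in the per-task state.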

Helped-by: Emily Shaffer <nasamuffin@google.com>
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 refs.c | 100 ++++++++++++++++++++++++++++++---------------------------
 1 file changed, 52 insertions(+), 48 deletions(-)

diff --git a/refs.c b/refs.c
index 965381367e0e53..e0b060224ab143 100644
--- a/refs.c
+++ b/refs.c
@@ -2405,68 +2405,72 @@ static int ref_update_reject_duplicates(struct string_list *refnames,
 	return 0;
 }
 
-static int run_transaction_hook(struct ref_transaction *transaction,
-				const char *state)
+struct transaction_feed_cb_data {
+	size_t index;
+	struct strbuf buf;
+};
+
+static int transaction_hook_feed_stdin(int hook_stdin_fd, void *pp_cb, void *pp_task_cb)
 {
-	struct child_process proc = CHILD_PROCESS_INIT;
-	struct strbuf buf = STRBUF_INIT;
-	const char *hook;
-	int ret = 0;
+	struct hook_cb_data *hook_cb = pp_cb;
+	struct ref_transaction *transaction = hook_cb->options->feed_pipe_ctx;
+	struct transaction_feed_cb_data *feed_cb_data = pp_task_cb;
+	struct strbuf *buf = &feed_cb_data->buf;
+	struct ref_update *update;
+	size_t i = feed_cb_data->index++;
+	int ret;
 
-	hook = find_hook(transaction->ref_store->repo, "reference-transaction");
-	if (!hook)
-		return ret;
+	if (i >= transaction->nr)
+		return 1; /* No more refs to process */
 
-	strvec_pushl(&proc.args, hook, state, NULL);
-	proc.in = -1;
-	proc.stdout_to_stderr = 1;
-	proc.trace2_hook_name = "reference-transaction";
+	update = transaction->updates[i];
 
-	ret = start_command(&proc);
-	if (ret)
-		return ret;
+	if (update->flags & REF_LOG_ONLY)
+		return 0;
 
-	sigchain_push(SIGPIPE, SIG_IGN);
+	strbuf_reset(buf);
 
-	for (size_t i = 0; i < transaction->nr; i++) {
-		struct ref_update *update = transaction->updates[i];
+	if (!(update->flags & REF_HAVE_OLD))
+		strbuf_addf(buf, "%s ", oid_to_hex(null_oid(the_hash_algo)));
+	else if (update->old_target)
+		strbuf_addf(buf, "ref:%s ", update->old_target);
+	else
+		strbuf_addf(buf, "%s ", oid_to_hex(&update->old_oid));
 
-		if (update->flags & REF_LOG_ONLY)
-			continue;
+	if (!(update->flags & REF_HAVE_NEW))
+		strbuf_addf(buf, "%s ", oid_to_hex(null_oid(the_hash_algo)));
+	else if (update->new_target)
+		strbuf_addf(buf, "ref:%s ", update->new_target);
+	else
+		strbuf_addf(buf, "%s ", oid_to_hex(&update->new_oid));
 
-		strbuf_reset(&buf);
+	strbuf_addf(buf, "%s\n", update->refname);
 
-		if (!(update->flags & REF_HAVE_OLD))
-			strbuf_addf(&buf, "%s ", oid_to_hex(null_oid(the_hash_algo)));
-		else if (update->old_target)
-			strbuf_addf(&buf, "ref:%s ", update->old_target);
-		else
-			strbuf_addf(&buf, "%s ", oid_to_hex(&update->old_oid));
+	ret = write_in_full(hook_stdin_fd, buf->buf, buf->len);
+	if (ret < 0 && errno != EPIPE)
+		return ret;
 
-		if (!(update->flags & REF_HAVE_NEW))
-			strbuf_addf(&buf, "%s ", oid_to_hex(null_oid(the_hash_algo)));
-		else if (update->new_target)
-			strbuf_addf(&buf, "ref:%s ", update->new_target);
-		else
-			strbuf_addf(&buf, "%s ", oid_to_hex(&update->new_oid));
+	return 0; /* more refs may follow; we will be called again */
+}
+
+static int run_transaction_hook(struct ref_transaction *transaction,
+				const char *state)
+{
+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
+	struct transaction_feed_cb_data feed_ctx = { 0 };
+	int ret = 0;
 
-		strbuf_addf(&buf, "%s\n", update->refname);
+	strvec_push(&opt.args, state);
 
-		if (write_in_full(proc.in, buf.buf, buf.len) < 0) {
-			if (errno != EPIPE) {
-				/* Don't leak errno outside this API */
-				errno = 0;
-				ret = -1;
-			}
-			break;
-		}
-	}
+	opt.feed_pipe = transaction_hook_feed_stdin;
+	opt.feed_pipe_ctx = transaction;
+	opt.feed_pipe_cb_data = &feed_ctx;
 
-	close(proc.in);
-	sigchain_pop(SIGPIPE);
-	strbuf_release(&buf);
+	strbuf_init(&feed_ctx.buf, 0);
+
+	ret = run_hooks_opt(transaction->ref_store->repo, "reference-transaction", &opt);
 
-	ret |= finish_command(&proc);
+	strbuf_release(&feed_ctx.buf);
 	return ret;
 }
 

From 857f047e40f796aa43c6e7c754d8a32ee64e4f4d Mon Sep 17 00:00:00 2001
From: Adrian Ratiu <adrian.ratiu@collabora.com>
Date: Fri, 26 Dec 2025 14:23:30 +0200
Subject: [PATCH 300/553] hook: allow overriding the ungroup option

When calling run_processes_parallel() in run_hooks_opt(), the
ungroup option is currently hardcoded to .ungroup = 1.

This causes problems when ungrouping should be disabled, for
example when sideband-reading collated output from child hooks,
because sideband-reading and ungrouping are mutually exclusive.

Thus a new hook.h option is added to allow overriding.

The existing ungroup=1 behavior is preserved in the run_hooks()
API and the "hook run" command. We could modify these to take
an option if necessary, so I added two code comments there.

Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/hook.c | 6 ++++++
 commit.c       | 3 +++
 hook.c         | 5 ++++-
 hook.h         | 5 +++++
 4 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/builtin/hook.c b/builtin/hook.c
index 7afec380d2e579..73e7b8c2e878eb 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -43,6 +43,12 @@ static int run(int argc, const char **argv, const char *prefix,
 	if (!argc)
 		goto usage;
 
+	/*
+	 * All current "hook run" use-cases require ungrouped child output.
+	 * If this changes, a hook run argument can be added to toggle it.
+	 */
+	opt.ungroup = 1;
+
 	/*
 	 * Having a -- for "run" when providing <hook-args> is
 	 * mandatory.
diff --git a/commit.c b/commit.c
index 16d91b2bfcf291..7da33dde86337b 100644
--- a/commit.c
+++ b/commit.c
@@ -1965,6 +1965,9 @@ int run_commit_hook(int editor_is_used, const char *index_file,
 		strvec_push(&opt.args, arg);
 	va_end(args);
 
+	/* All commit hook use-cases require ungrouping child output. */
+	opt.ungroup = 1;
+
 	opt.invoked_hook = invoked_hook;
 	return run_hooks_opt(the_repository, name, &opt);
 }
diff --git a/hook.c b/hook.c
index 5ddd7678d18f0d..00a1e2ad22a9d7 100644
--- a/hook.c
+++ b/hook.c
@@ -153,7 +153,7 @@ int run_hooks_opt(struct repository *r, const char *hook_name,
 		.tr2_label = hook_name,
 
 		.processes = 1,
-		.ungroup = 1,
+		.ungroup = options->ungroup,
 
 		.get_next_task = pick_next_hook,
 		.start_failure = notify_start_failure,
@@ -198,6 +198,9 @@ int run_hooks(struct repository *r, const char *hook_name)
 {
 	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
 
+	/* All use-cases of this API require ungrouping. */
+	opt.ungroup = 1;
+
 	return run_hooks_opt(r, hook_name, &opt);
 }
 
diff --git a/hook.h b/hook.h
index 2169d4a6bd3f2e..78a1a44690ef34 100644
--- a/hook.h
+++ b/hook.h
@@ -34,6 +34,11 @@ struct run_hooks_opt
 	 */
 	int *invoked_hook;
 
+	/**
+	 * Allow hooks to set run_processes_parallel() 'ungroup' behavior.
+	 */
+	unsigned int ungroup:1;
+
 	/**
 	 * Path to file which should be piped to stdin for each hook.
 	 */

From 5ab5872a53296b009cca43d412efd1a74ea4f149 Mon Sep 17 00:00:00 2001
From: Emily Shaffer <emilyshaffer@google.com>
Date: Fri, 26 Dec 2025 14:23:31 +0200
Subject: [PATCH 301/553] run-command: allow capturing of collated output
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Some callers, for example server-side hooks which wish to relay hook
output to clients across a transport, want to capture what would
normally print to stderr and do something else with it. Allow that via a
callback.

By calling the callback regardless of whether there's output available,
we allow clients to send e.g. a keepalive if necessary.

Because we expose a strbuf, not a fd or FILE*, there's no need to create
a temporary pipe or similar - we can just skip the print to stderr and
instead hand it to the caller.
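
The dispatch this adds can be sketched in isolation as follows. The
sketch is standalone and makes assumptions: demo_buf is a minimal
stand-in for struct strbuf, demo_consume_fn only guesses at the shape
of the real callback type (a buffer plus the caller's cookie), and
demo_sink, demo_capture() and demo_emit() are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical minimal buffer standing in for git's struct strbuf. */
struct demo_buf {
	char data[256];
	size_t len;
};

/* Assumed callback shape: the output buffer plus the caller's cookie. */
typedef void (*demo_consume_fn)(struct demo_buf *output, void *pp_cb);

/* Hypothetical caller-side sink, e.g. something relaying to a client. */
struct demo_sink {
	char captured[256];
	size_t len;
	int keepalives;
};

/* Captures output; an empty buffer is treated as a keepalive tick. */
static void demo_capture(struct demo_buf *output, void *pp_cb)
{
	struct demo_sink *sink = pp_cb;

	if (!output->len) {
		sink->keepalives++;
		return;
	}
	if (output->len > sizeof(sink->captured) - sink->len)
		return; /* sketch only: drop output that does not fit */
	memcpy(sink->captured + sink->len, output->data, output->len);
	sink->len += output->len;
}

/* Hands output to the callback if one is set, else prints to stderr. */
static void demo_emit(struct demo_buf *output, demo_consume_fn consume,
		      void *pp_cb)
{
	if (consume)
		consume(output, pp_cb);
	else
		fwrite(output->data, 1, output->len, stderr);
	output->len = 0; /* the buffer is reset either way */
}
```

Because the callback receives the buffer directly, the caller decides
what an empty buffer means; here it is counted as a keepalive.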

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 run-command.c               | 30 ++++++++++++++++++++++--------
 run-command.h               | 17 +++++++++++++++++
 t/helper/test-run-command.c | 15 +++++++++++++++
 t/t0061-run-command.sh      |  7 +++++++
 4 files changed, 61 insertions(+), 8 deletions(-)

diff --git a/run-command.c b/run-command.c
index a608d37fb22d98..6b1e4a34533315 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1595,7 +1595,10 @@ static void pp_cleanup(struct parallel_processes *pp,
 	 * When get_next_task added messages to the buffer in its last
 	 * iteration, the buffered output is non empty.
 	 */
-	strbuf_write(&pp->buffered_output, stderr);
+	if (opts->consume_output)
+		opts->consume_output(&pp->buffered_output, opts->data);
+	else
+		strbuf_write(&pp->buffered_output, stderr);
 	strbuf_release(&pp->buffered_output);
 
 	sigchain_pop_common();
@@ -1734,13 +1737,17 @@ static void pp_buffer_stderr(struct parallel_processes *pp,
 	}
 }
 
-static void pp_output(const struct parallel_processes *pp)
+static void pp_output(const struct parallel_processes *pp,
+		      const struct run_process_parallel_opts *opts)
 {
 	size_t i = pp->output_owner;
 
 	if (child_is_working(&pp->children[i]) &&
 	    pp->children[i].err.len) {
-		strbuf_write(&pp->children[i].err, stderr);
+		if (opts->consume_output)
+			opts->consume_output(&pp->children[i].err, opts->data);
+		else
+			strbuf_write(&pp->children[i].err, stderr);
 		strbuf_reset(&pp->children[i].err);
 	}
 }
@@ -1788,11 +1795,15 @@ static int pp_collect_finished(struct parallel_processes *pp,
 		} else {
 			const size_t n = opts->processes;
 
-			strbuf_write(&pp->children[i].err, stderr);
+			/* Output errors, then all other finished child processes */
+			if (opts->consume_output) {
+				opts->consume_output(&pp->children[i].err, opts->data);
+				opts->consume_output(&pp->buffered_output, opts->data);
+			} else {
+				strbuf_write(&pp->children[i].err, stderr);
+				strbuf_write(&pp->buffered_output, stderr);
+			}
 			strbuf_reset(&pp->children[i].err);
-
-			/* Output all other finished child processes */
-			strbuf_write(&pp->buffered_output, stderr);
 			strbuf_reset(&pp->buffered_output);
 
 			/*
@@ -1829,7 +1840,7 @@ static void pp_handle_child_IO(struct parallel_processes *pp,
 				pp->children[i].state = GIT_CP_WAIT_CLEANUP;
 	} else {
 		pp_buffer_stderr(pp, opts, output_timeout);
-		pp_output(pp);
+		pp_output(pp, opts);
 	}
 }
 
@@ -1852,6 +1863,9 @@ void run_processes_parallel(const struct run_process_parallel_opts *opts)
 					   "max:%"PRIuMAX,
 					   (uintmax_t)opts->processes);
 
+	if (opts->ungroup && opts->consume_output)
+		BUG("ungroup and reading output are mutually exclusive");
+
 	/*
 	 * Child tasks might receive input via stdin, terminating early (or not), so
 	 * ignore the default SIGPIPE which gets handled by each feed_pipe_fn which
diff --git a/run-command.h b/run-command.h
index e1ca965b5b1988..7093252863966f 100644
--- a/run-command.h
+++ b/run-command.h
@@ -435,6 +435,17 @@ typedef int (*feed_pipe_fn)(int child_in,
 				void *pp_cb,
 				void *pp_task_cb);
 
+/**
+ * If this callback is provided, output is collated into a new pipe instead
+ * of the process stderr. Then `consume_output_fn` will be called repeatedly
+ * with output contained in the `output` arg. It will also be called with an
+ * empty `output` to allow for keepalives or similar operations if necessary.
+ *
+ * pp_cb is the callback cookie as passed into run_processes_parallel.
+ * No task cookie is provided because the callback receives collated output.
+ */
+typedef void (*consume_output_fn)(struct strbuf *output, void *pp_cb);
+
 /**
  * This callback is called on every child process that finished processing.
  *
@@ -494,6 +505,12 @@ struct run_process_parallel_opts
 	 */
 	feed_pipe_fn feed_pipe;
 
+	/*
+	 * consume_output: see consume_output_fn() above. This can be NULL
+	 * to omit any special handling.
+	 */
+	consume_output_fn consume_output;
+
 	/**
 	 * task_finished: See task_finished_fn() above. This can be
 	 * NULL to omit any special handling.
diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c
index 4a56456894ccff..49eace8dce1961 100644
--- a/t/helper/test-run-command.c
+++ b/t/helper/test-run-command.c
@@ -58,6 +58,16 @@ static int no_job(struct child_process *cp UNUSED,
 	return 0;
 }
 
+static void test_divert_output(struct strbuf *output, void *cb UNUSED)
+{
+	FILE *output_file;
+
+	output_file = fopen("./output_file", "a");
+
+	strbuf_write(output, output_file);
+	fclose(output_file);
+}
+
 static int task_finished(int result UNUSED,
 			 struct strbuf *err,
 			 void *pp_cb UNUSED,
@@ -198,6 +208,7 @@ static int testsuite(int argc, const char **argv)
 		.get_next_task = next_test,
 		.start_failure = test_failed,
 		.feed_pipe = test_stdin_pipe_feed,
+		.consume_output = test_divert_output,
 		.task_finished = test_finished,
 		.data = &suite,
 	};
@@ -514,6 +525,10 @@ int cmd__run_command(int argc, const char **argv)
 		opts.get_next_task = parallel_next;
 		opts.task_finished = task_finished_quiet;
 		opts.feed_pipe = test_stdin_pipe_feed;
+	} else if (!strcmp(argv[1], "run-command-divert-output")) {
+		opts.get_next_task = parallel_next;
+		opts.consume_output = test_divert_output;
+		opts.task_finished = task_finished_quiet;
 	} else {
 		ret = 1;
 		fprintf(stderr, "check usage\n");
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 2f77fde0d964c8..74529e219e2aef 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -164,6 +164,13 @@ test_expect_success 'run_command runs ungrouped in parallel with more tasks than
 	test_line_count = 4 err
 '
 
+test_expect_success 'run_command can divert output' '
+	test_when_finished rm output_file &&
+	test-tool run-command run-command-divert-output 3 sh -c "printf \"%s\n%s\n\" Hello World" 2>actual &&
+	test_must_be_empty actual &&
+	test_cmp expect output_file
+'
+
 test_expect_success 'run_command listens to stdin' '
 	cat >expect <<-\EOF &&
 	preloaded output of a child

From 53254bfa1b4e91bab2c675d1a6b561026f7b573a Mon Sep 17 00:00:00 2001
From: Emily Shaffer <emilyshaffer@google.com>
Date: Fri, 26 Dec 2025 14:23:32 +0200
Subject: [PATCH 302/553] hooks: allow callers to capture output
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Some server-side hooks will require capturing output to send over
sideband instead of printing directly to stderr. Expose that capability.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 hook.c | 1 +
 hook.h | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/hook.c b/hook.c
index 00a1e2ad22a9d7..35211e5ed7c474 100644
--- a/hook.c
+++ b/hook.c
@@ -158,6 +158,7 @@ int run_hooks_opt(struct repository *r, const char *hook_name,
 		.get_next_task = pick_next_hook,
 		.start_failure = notify_start_failure,
 		.feed_pipe = options->feed_pipe,
+		.consume_output = options->consume_output,
 		.task_finished = notify_hook_finished,
 
 		.data = &cb_data,
diff --git a/hook.h b/hook.h
index 78a1a44690ef34..ae502178b9bfad 100644
--- a/hook.h
+++ b/hook.h
@@ -80,6 +80,14 @@ struct run_hooks_opt
 	 * Only useful when using `run_hooks_opt.feed_pipe`, otherwise ignore it.
 	 */
 	void *feed_pipe_cb_data;
+
+	/*
+	 * Populate this to capture output and prevent it from being printed to
+	 * stderr. This will be passed directly through to
+	 * run-command's run_processes_parallel(). See t/helper/test-run-command.c
+	 * for an example.
+	 */
+	consume_output_fn consume_output;
 };
 
 #define RUN_HOOKS_OPT_INIT { \

From 0bbaf3653f54f49ac124c623906983839c38b354 Mon Sep 17 00:00:00 2001
From: Emily Shaffer <emilyshaffer@google.com>
Date: Fri, 26 Dec 2025 14:23:33 +0200
Subject: [PATCH 303/553] receive-pack: convert update hooks to new API
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Use the new hook sideband API introduced in the previous commit.

The hook API avoids creating a custom struct child_process and other
internal hook plumbing (e.g. calling find_hook()) and prepares for
the specification of hooks via configs or running parallel hooks.

Execution is still sequential through the current hook.[ch] via the
run_process_parallel_opts.processes=1 arg.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/receive-pack.c | 64 +++++++++++++++++-------------------------
 1 file changed, 25 insertions(+), 39 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index c9288a9c7e382b..3240427a0781a3 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -918,6 +918,16 @@ static int feed_receive_hook(void *state_, const char **bufp, size_t *sizep)
 	return 0;
 }
 
+static void hook_output_to_sideband(struct strbuf *output, void *cb_data UNUSED)
+{
+	if (!output)
+		BUG("output must be non-NULL");
+
+	/* buffer might be empty for keepalives */
+	if (output->len)
+		send_sideband(1, 2, output->buf, output->len, use_sideband);
+}
+
 static int run_receive_hook(struct command *commands,
 			    const char *hook_name,
 			    int skip_broken,
@@ -941,29 +951,18 @@ static int run_receive_hook(struct command *commands,
 
 static int run_update_hook(struct command *cmd)
 {
-	struct child_process proc = CHILD_PROCESS_INIT;
-	int code;
-	const char *hook_path = find_hook(the_repository, "update");
-
-	if (!hook_path)
-		return 0;
-
-	strvec_push(&proc.args, hook_path);
-	strvec_push(&proc.args, cmd->ref_name);
-	strvec_push(&proc.args, oid_to_hex(&cmd->old_oid));
-	strvec_push(&proc.args, oid_to_hex(&cmd->new_oid));
+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
 
-	proc.no_stdin = 1;
-	proc.stdout_to_stderr = 1;
-	proc.err = use_sideband ? -1 : 0;
-	proc.trace2_hook_name = "update";
+	strvec_pushl(&opt.args,
+		     cmd->ref_name,
+		     oid_to_hex(&cmd->old_oid),
+		     oid_to_hex(&cmd->new_oid),
+		     NULL);
 
-	code = start_command(&proc);
-	if (code)
-		return code;
 	if (use_sideband)
-		copy_to_sideband(proc.err, -1, NULL);
-	return finish_command(&proc);
+		opt.consume_output = hook_output_to_sideband;
+
+	return run_hooks_opt(the_repository, "update", &opt);
 }
 
 static struct command *find_command_by_refname(struct command *list,
@@ -1640,33 +1639,20 @@ static const char *update(struct command *cmd, struct shallow_info *si)
 static void run_update_post_hook(struct command *commands)
 {
 	struct command *cmd;
-	struct child_process proc = CHILD_PROCESS_INIT;
-	const char *hook;
-
-	hook = find_hook(the_repository, "post-update");
-	if (!hook)
-		return;
+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
 
 	for (cmd = commands; cmd; cmd = cmd->next) {
 		if (cmd->error_string || cmd->did_not_exist)
 			continue;
-		if (!proc.args.nr)
-			strvec_push(&proc.args, hook);
-		strvec_push(&proc.args, cmd->ref_name);
+		strvec_push(&opt.args, cmd->ref_name);
 	}
-	if (!proc.args.nr)
+	if (!opt.args.nr)
 		return;
 
-	proc.no_stdin = 1;
-	proc.stdout_to_stderr = 1;
-	proc.err = use_sideband ? -1 : 0;
-	proc.trace2_hook_name = "post-update";
+	if (use_sideband)
+		opt.consume_output = hook_output_to_sideband;
 
-	if (!start_command(&proc)) {
-		if (use_sideband)
-			copy_to_sideband(proc.err, -1, NULL);
-		finish_command(&proc);
-	}
+	run_hooks_opt(the_repository, "post-update", &opt);
 }
 
 static void check_aliased_update_internal(struct command *cmd,

From c65f26fca46f742e8e457d859a83c4e6ef3c3953 Mon Sep 17 00:00:00 2001
From: Emily Shaffer <emilyshaffer@google.com>
Date: Fri, 26 Dec 2025 14:23:34 +0200
Subject: [PATCH 304/553] receive-pack: convert receive hooks to hook API
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This converts the last remaining hooks to the new hook API, for
the same benefits as the previous conversions (no need to toggle
signals, manage custom struct child_process, call find_hook(),
prepares for specifying hooks via configs, etc.).

I noticed a performance degradation when processing large amounts
of hook input with just 1 line per callback, caused by run-command's
poll loop. Therefore the callback now batches 500 lines per call, to
keep pipe throughput similar to before and to avoid the hook child
waiting on stdin.
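
The batching idea can be sketched on its own as follows. This is a
simplified illustration, not the code in this patch: "struct
feed_state" and "fill_batch" are hypothetical, and the line format is
made up. Each call formats up to 500 lines into one buffer so that a
single write covers the whole batch.

```c
#include <assert.h>
#include <stdio.h>

#define BATCH_LINES 500	/* same batch size the patch uses */

/* Hypothetical feed state: which line comes next, and how many exist. */
struct feed_state {
	unsigned next;
	unsigned total;
};

/*
 * Format up to BATCH_LINES lines into buf. Returns the number of bytes
 * placed in buf and sets *done once no lines remain.
 */
size_t fill_batch(struct feed_state *st, char *buf, size_t cap, int *done)
{
	size_t len = 0;

	assert(cap > 0);
	for (unsigned i = 0; i < BATCH_LINES && st->next < st->total; i++) {
		int n = snprintf(buf + len, cap - len, "line %u\n", st->next);

		if (n < 0 || (size_t)n >= cap - len)
			break; /* buffer full: send what we have so far */
		len += (size_t)n;
		st->next++;
	}
	*done = st->next >= st->total;
	return len;
}
```

With 1200 lines queued, the caller would go through its poll loop three
times instead of 1200 times.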

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/receive-pack.c | 212 ++++++++++++++++++-----------------------
 1 file changed, 93 insertions(+), 119 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 3240427a0781a3..6975f60b73bfec 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -749,7 +749,7 @@ static int check_cert_push_options(const struct string_list *push_options)
 	return retval;
 }
 
-static void prepare_push_cert_sha1(struct child_process *proc)
+static void prepare_push_cert_sha1(struct run_hooks_opt *opt)
 {
 	static int already_done;
 
@@ -775,23 +775,23 @@ static void prepare_push_cert_sha1(struct child_process *proc)
 		nonce_status = check_nonce(sigcheck.payload);
 	}
 	if (!is_null_oid(&push_cert_oid)) {
-		strvec_pushf(&proc->env, "GIT_PUSH_CERT=%s",
+		strvec_pushf(&opt->env, "GIT_PUSH_CERT=%s",
 			     oid_to_hex(&push_cert_oid));
-		strvec_pushf(&proc->env, "GIT_PUSH_CERT_SIGNER=%s",
+		strvec_pushf(&opt->env, "GIT_PUSH_CERT_SIGNER=%s",
 			     sigcheck.signer ? sigcheck.signer : "");
-		strvec_pushf(&proc->env, "GIT_PUSH_CERT_KEY=%s",
+		strvec_pushf(&opt->env, "GIT_PUSH_CERT_KEY=%s",
 			     sigcheck.key ? sigcheck.key : "");
-		strvec_pushf(&proc->env, "GIT_PUSH_CERT_STATUS=%c",
+		strvec_pushf(&opt->env, "GIT_PUSH_CERT_STATUS=%c",
 			     sigcheck.result);
 		if (push_cert_nonce) {
-			strvec_pushf(&proc->env,
+			strvec_pushf(&opt->env,
 				     "GIT_PUSH_CERT_NONCE=%s",
 				     push_cert_nonce);
-			strvec_pushf(&proc->env,
+			strvec_pushf(&opt->env,
 				     "GIT_PUSH_CERT_NONCE_STATUS=%s",
 				     nonce_status);
 			if (nonce_status == NONCE_SLOP)
-				strvec_pushf(&proc->env,
+				strvec_pushf(&opt->env,
 					     "GIT_PUSH_CERT_NONCE_SLOP=%ld",
 					     nonce_stamp_slop);
 		}
@@ -803,119 +803,64 @@ struct receive_hook_feed_state {
 	struct ref_push_report *report;
 	int skip_broken;
 	struct strbuf buf;
-	const struct string_list *push_options;
 };
 
-typedef int (*feed_fn)(void *, const char **, size_t *);
-static int run_and_feed_hook(const char *hook_name, feed_fn feed,
-			     struct receive_hook_feed_state *feed_state)
+static int feed_receive_hook_cb(int hook_stdin_fd, void *pp_cb UNUSED, void *pp_task_cb)
 {
-	struct child_process proc = CHILD_PROCESS_INIT;
-	struct async muxer;
-	int code;
-	const char *hook_path = find_hook(the_repository, hook_name);
+	struct receive_hook_feed_state *state = pp_task_cb;
+	struct command *cmd = state->cmd;
+	unsigned int lines_batch_size = 500;
 
-	if (!hook_path)
-		return 0;
+	strbuf_reset(&state->buf);
 
-	strvec_push(&proc.args, hook_path);
-	proc.in = -1;
-	proc.stdout_to_stderr = 1;
-	proc.trace2_hook_name = hook_name;
-
-	if (feed_state->push_options) {
-		size_t i;
-		for (i = 0; i < feed_state->push_options->nr; i++)
-			strvec_pushf(&proc.env,
-				     "GIT_PUSH_OPTION_%"PRIuMAX"=%s",
-				     (uintmax_t)i,
-				     feed_state->push_options->items[i].string);
-		strvec_pushf(&proc.env, "GIT_PUSH_OPTION_COUNT=%"PRIuMAX"",
-			     (uintmax_t)feed_state->push_options->nr);
-	} else
-		strvec_pushf(&proc.env, "GIT_PUSH_OPTION_COUNT");
+	/* batch lines to avoid going through run-command's poll loop for each line */
+	for (unsigned int i = 0; i < lines_batch_size; i++) {
+		while (cmd &&
+		       state->skip_broken && (cmd->error_string || cmd->did_not_exist))
+			cmd = cmd->next;
 
-	if (tmp_objdir)
-		strvec_pushv(&proc.env, tmp_objdir_env(tmp_objdir));
+		if (!cmd)
+			break;  /* no more commands left */
 
-	if (use_sideband) {
-		memset(&muxer, 0, sizeof(muxer));
-		muxer.proc = copy_to_sideband;
-		muxer.in = -1;
-		code = start_async(&muxer);
-		if (code)
-			return code;
-		proc.err = muxer.in;
-	}
+		if (!state->report)
+			state->report = cmd->report;
 
-	prepare_push_cert_sha1(&proc);
+		if (state->report) {
+			struct object_id *old_oid;
+			struct object_id *new_oid;
+			const char *ref_name;
 
-	code = start_command(&proc);
-	if (code) {
-		if (use_sideband)
-			finish_async(&muxer);
-		return code;
-	}
+			old_oid = state->report->old_oid ? state->report->old_oid : &cmd->old_oid;
+			new_oid = state->report->new_oid ? state->report->new_oid : &cmd->new_oid;
+			ref_name = state->report->ref_name ? state->report->ref_name : cmd->ref_name;
 
-	sigchain_push(SIGPIPE, SIG_IGN);
+			strbuf_addf(&state->buf, "%s %s %s\n",
+				    oid_to_hex(old_oid), oid_to_hex(new_oid),
+				    ref_name);
 
-	while (1) {
-		const char *buf;
-		size_t n;
-		if (feed(feed_state, &buf, &n))
-			break;
-		if (write_in_full(proc.in, buf, n) < 0)
-			break;
+			state->report = state->report->next;
+			if (!state->report)
+				cmd = cmd->next;
+		} else {
+			strbuf_addf(&state->buf, "%s %s %s\n",
+				    oid_to_hex(&cmd->old_oid), oid_to_hex(&cmd->new_oid),
+				    cmd->ref_name);
+			cmd = cmd->next;
+		}
 	}
-	close(proc.in);
-	if (use_sideband)
-		finish_async(&muxer);
 
-	sigchain_pop(SIGPIPE);
+	state->cmd = cmd;
 
-	return finish_command(&proc);
-}
-
-static int feed_receive_hook(void *state_, const char **bufp, size_t *sizep)
-{
-	struct receive_hook_feed_state *state = state_;
-	struct command *cmd = state->cmd;
-
-	while (cmd &&
-	       state->skip_broken && (cmd->error_string || cmd->did_not_exist))
-		cmd = cmd->next;
-	if (!cmd)
-		return -1; /* EOF */
-	if (!bufp)
-		return 0; /* OK, can feed something. */
-	strbuf_reset(&state->buf);
-	if (!state->report)
-		state->report = cmd->report;
-	if (state->report) {
-		struct object_id *old_oid;
-		struct object_id *new_oid;
-		const char *ref_name;
-
-		old_oid = state->report->old_oid ? state->report->old_oid : &cmd->old_oid;
-		new_oid = state->report->new_oid ? state->report->new_oid : &cmd->new_oid;
-		ref_name = state->report->ref_name ? state->report->ref_name : cmd->ref_name;
-		strbuf_addf(&state->buf, "%s %s %s\n",
-			    oid_to_hex(old_oid), oid_to_hex(new_oid),
-			    ref_name);
-		state->report = state->report->next;
-		if (!state->report)
-			state->cmd = cmd->next;
-	} else {
-		strbuf_addf(&state->buf, "%s %s %s\n",
-			    oid_to_hex(&cmd->old_oid), oid_to_hex(&cmd->new_oid),
-			    cmd->ref_name);
-		state->cmd = cmd->next;
-	}
-	if (bufp) {
-		*bufp = state->buf.buf;
-		*sizep = state->buf.len;
+	if (state->buf.len > 0) {
+		int ret = write_in_full(hook_stdin_fd, state->buf.buf, state->buf.len);
+		if (ret < 0) {
+			if (errno == EPIPE)
+				return 1; /* child closed pipe */
+			return ret;
+		}
 	}
-	return 0;
+
+	return state->cmd ? 0 : 1;  /* 0 = more to come, 1 = EOF */
 }
 
 static void hook_output_to_sideband(struct strbuf *output, void *cb_data UNUSED)
@@ -933,20 +878,49 @@ static int run_receive_hook(struct command *commands,
 			    int skip_broken,
 			    const struct string_list *push_options)
 {
-	struct receive_hook_feed_state state;
-	int status;
-
-	strbuf_init(&state.buf, 0);
-	state.cmd = commands;
-	state.skip_broken = skip_broken;
-	state.report = NULL;
-	if (feed_receive_hook(&state, NULL, NULL))
+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
+	struct command *iter = commands;
+	struct receive_hook_feed_state feed_state;
+	int ret;
+
+	/* if there are no valid commands, don't invoke the hook at all. */
+	while (iter && skip_broken && (iter->error_string || iter->did_not_exist))
+		iter = iter->next;
+	if (!iter)
 		return 0;
-	state.cmd = commands;
-	state.push_options = push_options;
-	status = run_and_feed_hook(hook_name, feed_receive_hook, &state);
-	strbuf_release(&state.buf);
-	return status;
+
+	if (push_options) {
+		for (int i = 0; i < push_options->nr; i++)
+			strvec_pushf(&opt.env, "GIT_PUSH_OPTION_%d=%s", i,
+				     push_options->items[i].string);
+		strvec_pushf(&opt.env, "GIT_PUSH_OPTION_COUNT=%"PRIuMAX"",
+					     (uintmax_t)push_options->nr);
+	} else {
+		strvec_push(&opt.env, "GIT_PUSH_OPTION_COUNT");
+	}
+
+	if (tmp_objdir)
+		strvec_pushv(&opt.env, tmp_objdir_env(tmp_objdir));
+
+	prepare_push_cert_sha1(&opt);
+
+	/* set up sideband printer */
+	if (use_sideband)
+		opt.consume_output = hook_output_to_sideband;
+
+	/* set up stdin callback */
+	feed_state.cmd = commands;
+	feed_state.skip_broken = skip_broken;
+	feed_state.report = NULL;
+	strbuf_init(&feed_state.buf, 0);
+	opt.feed_pipe_cb_data = &feed_state;
+	opt.feed_pipe = feed_receive_hook_cb;
+
+	ret = run_hooks_opt(the_repository, hook_name, &opt);
+
+	strbuf_release(&feed_state.buf);
+
+	return ret;
 }
 
 static int run_update_hook(struct command *cmd)

From 06188ea5f3f14040eb01aa883ac7a7a03c93e6a2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Sat, 27 Dec 2025 10:29:35 +0100
Subject: [PATCH 305/553] config: use git_parse_int() in
 git_config_get_expiry_in_days()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

git_config_get_expiry_in_days() calls git_parse_signed() with the
maximum value of int, which is equivalent to calling git_parse_int().
Do that instead, as it's shorter and clearer.

This requires demoting "days" to int to match.  Promote "scale" to
intmax_t in turn to arrive at the same result when multiplying them.
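
A minimal sketch of why the promotion matters, assuming a platform
where int is narrower than intmax_t ("days_to_seconds" is a
hypothetical helper, not part of this patch): with an intmax_t scale,
"days" is converted before the multiplication, so even INT_MAX days
does not overflow.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper mirroring the arithmetic in the patch. */
intmax_t days_to_seconds(int days)
{
	const intmax_t scale = 86400; /* seconds per day */

	/* The widening only helps if intmax_t is wider than int. */
	assert(sizeof(intmax_t) > sizeof(int));

	/* "days" is converted to intmax_t before the multiplication. */
	return days * scale;
}
```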

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 config.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/config.c b/config.c
index f1def0dcfbacba..2d3e4d441a2124 100644
--- a/config.c
+++ b/config.c
@@ -2434,14 +2434,14 @@ int repo_config_get_expiry_in_days(struct repository *r, const char *key,
 				   timestamp_t *expiry, timestamp_t now)
 {
 	const char *expiry_string;
-	intmax_t days;
+	int days;
 	timestamp_t when;
 
 	if (repo_config_get_string_tmp(r, key, &expiry_string))
 		return 1; /* no such thing */
 
-	if (git_parse_signed(expiry_string, &days, maximum_signed_value_of_type(int))) {
-		const int scale = 86400;
+	if (git_parse_int(expiry_string, &days)) {
+		const intmax_t scale = 86400;
 		*expiry = now - days * scale;
 		return 0;
 	}

From 7c7698a654a7a0031f65b0ab0c1c4e438e95df60 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Sun, 28 Dec 2025 16:02:48 +0900
Subject: [PATCH 306/553] The 13th batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/RelNotes/2.53.0.adoc | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc
index 88d24f6d4dde4c..d71948829c9ee2 100644
--- a/Documentation/RelNotes/2.53.0.adoc
+++ b/Documentation/RelNotes/2.53.0.adoc
@@ -78,6 +78,11 @@ Performance, Internal Implementation, Development Support etc.
  * "git diff-files -R --find-copies-harder" has been taught to use
    the potential copy sources from the index correctly.
 
+ * Require C99 style flexible array member support from all platforms.
+
+ * The code path that enumerates promisor objects has been optimized
+   to skip pointlessly parsing blob objects.
+
 
 Fixes since v2.52
 -----------------
@@ -214,3 +219,5 @@ Fixes since v2.52
    (merge bab391761d kj/pull-options-decl-cleanup later to maint).
    (merge 007b8994d4 rs/t4014-git-version-string-fix later to maint).
    (merge 4ce170c522 ds/doc-scalar-config later to maint).
+   (merge a0c813951a jc/doc-commit-signoff-config later to maint).
+   (merge 8ee262985a ja/doc-misc-fixes later to maint).

From e61f227d0654212412ce1835f7e432df85cfc36b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Sun, 28 Dec 2025 19:10:48 +0100
Subject: [PATCH 307/553] tag: use algo of repo parameter in parse_tag_buffer()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Stop using "the_hash_algo" explicitly and implicitly via parse_oid_hex()
and instead use the "hash_algo" member of the passed in repository,
which is more correct.
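
To illustrate why the algorithm has to come from the repository, here
is a standalone sketch (not git's implementation; "struct algo" and
"check_object_line" are hypothetical). The same "object" header line is
valid under a 40-hex-digit algorithm and too short under a 64-hex-digit
one.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-in for a hash algorithm descriptor. */
struct algo {
	size_t hexsz; /* 40 for SHA-1, 64 for SHA-256 */
};

/* Return 0 if buf starts with "object <hexsz hex digits>\n", else -1. */
int check_object_line(const char *buf, size_t size, const struct algo *algo)
{
	assert(buf && algo);
	if (size < 7 + algo->hexsz + 1)
		return -1;
	if (memcmp("object ", buf, 7))
		return -1;
	for (size_t i = 0; i < algo->hexsz; i++)
		if (!isxdigit((unsigned char)buf[7 + i]))
			return -1;
	return buf[7 + algo->hexsz] == '\n' ? 0 : -1;
}
```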

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 tag.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tag.c b/tag.c
index f5c232d2f1f36c..dec5ea8eb0ab74 100644
--- a/tag.c
+++ b/tag.c
@@ -148,9 +148,11 @@ int parse_tag_buffer(struct repository *r, struct tag *item, const void *data, u
 		FREE_AND_NULL(item->tag);
 	}
 
-	if (size < the_hash_algo->hexsz + 24)
+	if (size < r->hash_algo->hexsz + 24)
 		return -1;
-	if (memcmp("object ", bufptr, 7) || parse_oid_hex(bufptr + 7, &oid, &bufptr) || *bufptr++ != '\n')
+	if (memcmp("object ", bufptr, 7) ||
+	    parse_oid_hex_algop(bufptr + 7, &oid, &bufptr, r->hash_algo) ||
+	    *bufptr++ != '\n')
 		return -1;
 
 	if (!starts_with(bufptr, "type "))

From 154717b3b0b0631fb6700d5fc77e779106530fc3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Sun, 28 Dec 2025 19:10:49 +0100
Subject: [PATCH 308/553] tag: support arbitrary repositories in
 gpg_verify_tag()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Allow callers of gpg_verify_tag() to specify the repository to use by
providing a parameter for that.  One of the two has not been using
the_repository since 43a8391977 (builtin/verify-tag: stop using
`the_repository`, 2025-03-08); let it pass in the correct repository.
The other simply passes the_repository to get the same result as before.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/tag.c        |  2 +-
 builtin/verify-tag.c |  2 +-
 tag.c                | 12 ++++++------
 tag.h                |  2 +-
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/builtin/tag.c b/builtin/tag.c
index 01eba90c5c7bb2..aeb04c487fe95a 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -149,7 +149,7 @@ static int verify_tag(const char *name, const char *ref UNUSED,
 	if (format->format)
 		flags = GPG_VERIFY_OMIT_STATUS;
 
-	if (gpg_verify_tag(oid, name, flags))
+	if (gpg_verify_tag(the_repository, oid, name, flags))
 		return -1;
 
 	if (format->format)
diff --git a/builtin/verify-tag.c b/builtin/verify-tag.c
index 558121eaa1688e..4a261b2369729f 100644
--- a/builtin/verify-tag.c
+++ b/builtin/verify-tag.c
@@ -61,7 +61,7 @@ int cmd_verify_tag(int argc,
 			continue;
 		}
 
-		if (gpg_verify_tag(&oid, name, flags)) {
+		if (gpg_verify_tag(repo, &oid, name, flags)) {
 			had_error = 1;
 			continue;
 		}
diff --git a/tag.c b/tag.c
index dec5ea8eb0ab74..9373c49d0614b0 100644
--- a/tag.c
+++ b/tag.c
@@ -44,28 +44,28 @@ static int run_gpg_verify(const char *buf, unsigned long size, unsigned flags)
 	return ret;
 }
 
-int gpg_verify_tag(const struct object_id *oid, const char *name_to_report,
-		unsigned flags)
+int gpg_verify_tag(struct repository *r, const struct object_id *oid,
+		   const char *name_to_report, unsigned flags)
 {
 	enum object_type type;
 	char *buf;
 	unsigned long size;
 	int ret;
 
-	type = odb_read_object_info(the_repository->objects, oid, NULL);
+	type = odb_read_object_info(r->objects, oid, NULL);
 	if (type != OBJ_TAG)
 		return error("%s: cannot verify a non-tag object of type %s.",
 				name_to_report ?
 				name_to_report :
-				repo_find_unique_abbrev(the_repository, oid, DEFAULT_ABBREV),
+				repo_find_unique_abbrev(r, oid, DEFAULT_ABBREV),
 				type_name(type));
 
-	buf = odb_read_object(the_repository->objects, oid, &type, &size);
+	buf = odb_read_object(r->objects, oid, &type, &size);
 	if (!buf)
 		return error("%s: unable to read file.",
 				name_to_report ?
 				name_to_report :
-				repo_find_unique_abbrev(the_repository, oid, DEFAULT_ABBREV));
+				repo_find_unique_abbrev(r, oid, DEFAULT_ABBREV));
 
 	ret = run_gpg_verify(buf, size, flags);
 
diff --git a/tag.h b/tag.h
index ef12a610372063..55c2d0792b99cb 100644
--- a/tag.h
+++ b/tag.h
@@ -16,7 +16,7 @@ int parse_tag_buffer(struct repository *r, struct tag *item, const void *data, u
 int parse_tag(struct tag *item);
 void release_tag_memory(struct tag *t);
 struct object *deref_tag(struct repository *r, struct object *, const char *, int);
-int gpg_verify_tag(const struct object_id *oid,
+int gpg_verify_tag(struct repository *r, const struct object_id *oid,
 		   const char *name_to_report, unsigned flags);
 struct object_id *get_tagged_oid(struct tag *tag);
 

From b6e4cc8c32850315323961659e553d1d14591f7f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Sun, 28 Dec 2025 19:10:50 +0100
Subject: [PATCH 309/553] tag: support arbitrary repositories in parse_tag()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Allow callers of parse_tag() to pass in the repository to use.  Let most of
them pass in the_repository to get the same result as before.  One of
them has stopped using the_repository in ef9b0370da (sha1-name.c: store
and use repo in struct disambiguate_state, 2019-04-16); let it pass in
its stored repository.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/describe.c     | 6 +++---
 builtin/pack-objects.c | 2 +-
 fsck.c                 | 2 +-
 object-name.c          | 2 +-
 ref-filter.c           | 2 +-
 tag.c                  | 8 ++++----
 tag.h                  | 2 +-
 walker.c               | 2 +-
 8 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 443546aaac96f0..989a78d715d525 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -112,13 +112,13 @@ static int replace_name(struct commit_name *e,
 
 		if (!e->tag) {
 			t = lookup_tag(the_repository, &e->oid);
-			if (!t || parse_tag(t))
+			if (!t || parse_tag(the_repository, t))
 				return 1;
 			e->tag = t;
 		}
 
 		t = lookup_tag(the_repository, oid);
-		if (!t || parse_tag(t))
+		if (!t || parse_tag(the_repository, t))
 			return 0;
 		*tag = t;
 
@@ -335,7 +335,7 @@ static void append_name(struct commit_name *n, struct strbuf *dst)
 {
 	if (n->prio == 2 && !n->tag) {
 		n->tag = lookup_tag(the_repository, &n->oid);
-		if (!n->tag || parse_tag(n->tag))
+		if (!n->tag || parse_tag(the_repository, n->tag))
 			die(_("annotated tag %s not available"), n->path);
 	}
 	if (n->tag && !n->name_checked) {
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 1ce8d6ee215326..ca44b7894fc064 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3293,7 +3293,7 @@ static void add_tag_chain(const struct object_id *oid)
 
 	tag = lookup_tag(the_repository, oid);
 	while (1) {
-		if (!tag || parse_tag(tag) || !tag->tagged)
+		if (!tag || parse_tag(the_repository, tag) || !tag->tagged)
 			die(_("unable to pack objects reachable from tag %s"),
 			    oid_to_hex(oid));
 
diff --git a/fsck.c b/fsck.c
index 138fffded935c4..fae18d8561e067 100644
--- a/fsck.c
+++ b/fsck.c
@@ -474,7 +474,7 @@ static int fsck_walk_tag(struct tag *tag, void *data, struct fsck_options *optio
 {
 	const char *name = fsck_get_object_name(options, &tag->object.oid);
 
-	if (parse_tag(tag))
+	if (parse_tag(the_repository, tag))
 		return -1;
 	if (name)
 		fsck_put_object_name(options, &tag->tagged->oid, "%s", name);
diff --git a/object-name.c b/object-name.c
index fed5de51531fde..8b862c124e05a9 100644
--- a/object-name.c
+++ b/object-name.c
@@ -449,7 +449,7 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	} else if (type == OBJ_TAG) {
 		struct tag *tag = lookup_tag(ds->repo, oid);
 
-		if (!parse_tag(tag) && tag->tag) {
+		if (!parse_tag(ds->repo, tag) && tag->tag) {
 			/*
 			 * TRANSLATORS: This is a line of ambiguous
 			 * tag object output. E.g.:
diff --git a/ref-filter.c b/ref-filter.c
index d7454269e87cd3..c318f9ca0ec8dd 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2866,7 +2866,7 @@ static int match_points_at(struct oid_array *points_at,
 	while (obj && obj->type == OBJ_TAG) {
 		struct tag *tag = (struct tag *)obj;
 
-		if (parse_tag(tag) < 0) {
+		if (parse_tag(the_repository, tag) < 0) {
 			obj = NULL;
 			break;
 		}
diff --git a/tag.c b/tag.c
index 9373c49d0614b0..9daeaf2a78ed54 100644
--- a/tag.c
+++ b/tag.c
@@ -13,6 +13,7 @@
 #include "gpg-interface.h"
 #include "hex.h"
 #include "packfile.h"
+#include "repository.h"
 
 const char *tag_type = "tag";
 
@@ -203,7 +204,7 @@ int parse_tag_buffer(struct repository *r, struct tag *item, const void *data, u
 	return 0;
 }
 
-int parse_tag(struct tag *item)
+int parse_tag(struct repository *r, struct tag *item)
 {
 	enum object_type type;
 	void *data;
@@ -212,8 +213,7 @@ int parse_tag(struct tag *item)
 
 	if (item->object.parsed)
 		return 0;
-	data = odb_read_object(the_repository->objects, &item->object.oid,
-			       &type, &size);
+	data = odb_read_object(r->objects, &item->object.oid, &type, &size);
 	if (!data)
 		return error("Could not read %s",
 			     oid_to_hex(&item->object.oid));
@@ -222,7 +222,7 @@ int parse_tag(struct tag *item)
 		return error("Object %s not a tag",
 			     oid_to_hex(&item->object.oid));
 	}
-	ret = parse_tag_buffer(the_repository, item, data, size);
+	ret = parse_tag_buffer(r, item, data, size);
 	free(data);
 	return ret;
 }
diff --git a/tag.h b/tag.h
index 55c2d0792b99cb..534687c4caeca4 100644
--- a/tag.h
+++ b/tag.h
@@ -13,7 +13,7 @@ struct tag {
 };
 struct tag *lookup_tag(struct repository *r, const struct object_id *oid);
 int parse_tag_buffer(struct repository *r, struct tag *item, const void *data, unsigned long size);
-int parse_tag(struct tag *item);
+int parse_tag(struct repository *r, struct tag *item);
 void release_tag_memory(struct tag *t);
 struct object *deref_tag(struct repository *r, struct object *, const char *, int);
 int gpg_verify_tag(struct repository *r, const struct object_id *oid,
diff --git a/walker.c b/walker.c
index 409b646578a3d4..2891563b03620b 100644
--- a/walker.c
+++ b/walker.c
@@ -115,7 +115,7 @@ static int process_commit(struct walker *walker, struct commit *commit)
 
 static int process_tag(struct walker *walker, struct tag *tag)
 {
-	if (parse_tag(tag))
+	if (parse_tag(the_repository, tag))
 		return -1;
 	return process(walker, tag->tagged);
 }

From 009fceeda26e12e2dbacd04eef47c62d4e206403 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de>
Date: Sun, 28 Dec 2025 19:10:51 +0100
Subject: [PATCH 310/553] tag: stop using the_repository
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

gpg_verify_tag() shows the passed in object name on error.  Both callers
provide one.  It falls back to abbreviated hashes for future callers
that pass in a NULL name.  DEFAULT_ABBREV is default_abbrev, which in
turn is a global variable that's populated by git_default_config() and
only available with USE_THE_REPOSITORY_VARIABLE.

Don't let that hypothetical hold us back from getting rid of
the_repository in tag.c.  Fall back to full hashes, which are more
appropriate for error messages anyway.  This allows us to stop setting
USE_THE_REPOSITORY_VARIABLE.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 tag.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/tag.c b/tag.c
index 9daeaf2a78ed54..2f12e51024ec0b 100644
--- a/tag.c
+++ b/tag.c
@@ -1,4 +1,3 @@
-#define USE_THE_REPOSITORY_VARIABLE
 #define DISABLE_SIGN_COMPARE_WARNINGS
 
 #include "git-compat-util.h"
@@ -58,7 +57,7 @@ int gpg_verify_tag(struct repository *r, const struct object_id *oid,
 		return error("%s: cannot verify a non-tag object of type %s.",
 				name_to_report ?
 				name_to_report :
-				repo_find_unique_abbrev(r, oid, DEFAULT_ABBREV),
+				oid_to_hex(oid),
 				type_name(type));
 
 	buf = odb_read_object(r->objects, oid, &type, &size);
@@ -66,7 +65,7 @@ int gpg_verify_tag(struct repository *r, const struct object_id *oid,
 		return error("%s: unable to read file.",
 				name_to_report ?
 				name_to_report :
-				repo_find_unique_abbrev(r, oid, DEFAULT_ABBREV));
+				oid_to_hex(oid));
 
 	ret = run_gpg_verify(buf, size, flags);
 

From 979ee83e8a908f920d097c10cea5f8857e10898f Mon Sep 17 00:00:00 2001
From: Elijah Newren <newren@gmail.com>
Date: Mon, 29 Dec 2025 18:43:03 +0000
Subject: [PATCH 311/553] merge-ort: fix corner case recursive
 submodule/directory conflict handling

At GitHub, a few repositories were triggering errors of the form:

    git: merge-ort.c:3037: process_renames: Assertion `newinfo && !newinfo->merged.clean' failed.
    Aborted (core dumped)

While these may look similar to both
    a562d90a350d (merge-ort: fix failing merges in special corner case,
                  2025-11-03)
and
    f6ecb603ff8a (merge-ort: fix directory rename on top of source of other
                  rename/delete, 2025-08-06)
the cause is different and in this case the problem is not an
over-conservative assertion, but a bug before the assertion where we did
not update all relevant state appropriately.

It sadly took me a really long time to figure out how to get a simple
reproducer for this one.  It doesn't really have that many moving parts,
but there are multiple pieces of background information needed to
understand it.

First of all, when we have two files added at the same path, merge-ort
does a two-way merge of those files.  If we have two directories added
at the same path, we basically do the same thing (taking the union of
files, and two-way merging files with the same name).  But two-way
merging requires components of the same type.  We can't merge the
contents of a regular file with a directory, or with a symlink, or with
a submodule.  Nor can any of those other types be merged with each
other, e.g. merging a submodule with a directory is a bad idea.  When
two paths have the same name but their types do not match, merge-ort is
forced to move one of them to an alternate filename (using the
unique_path() function).

Second, if two commits being merged have more than one merge-base,
merge-ort will merge the merge-bases to create a virtual merge-base, and
use that as the base commit.

Third, one of the really important optimizations in merge-ort is trivial
tree-level resolution (roughly meaning merging trees without recursing
into them).  This optimization has some nuance to it that is important
to the current bug, and to understand it, it helps to first look at the
high-level overview of how merge-ort runs; there are basically three
high-level functions that the work is divided between:
    collect_merge_info() - walks the top-level trees getting individual
                           paths of interest
    detect_renames() - detect renames between paths in order to match up
                       paths for three-way merging
    process_entries() - does a few things of interest:
      * three-way merging of files,
      * other special handling (e.g. adjusting paths with conflicting
        types to avoid path collisions)
      * as it finishes handling all the files within a subdirectory,
        writes out a new tree object for that directory

If it were not for renames, we could just always do tree-level merging
whenever the tree on at least one side was unmodified.  Unfortunately,
we need to recurse into trees to determine whether there are renames.
However, we can also do tree-level merging so long as there aren't any
*relevant* renames (another merge-ort optimization), which we can
determine without recursing into trees.

We would also be able to do tree-level merging if we somehow a priori
knew what renames existed, by only recursing into the trees which we
could otherwise trivially merge if they contained files involved in
renames.  That might not seem useful, because we need to find out the
renames and we have to recurse into trees to do so, but when you find
out that the process_entries() step is more computationally expensive
than the collect_merge_info() step, it yields an interesting strategy:
   * run collect_merge_info()
   * run detect_renames()
   * cache the renames
   * restart -- rerun collect_merge_info(), using the cached renames to
     only recurse into the needed trees
   * we already have the renames cached so no need to re-detect
   * run process_entries() on the reduced list of paths
which was implemented back in 7bee6c100431 (merge-ort: avoid recursing
into directories when we don't need to, 2021-07-16).  Crucially, this
restarting only occurs if the number of paths we could skip recursing
into exceeds the number we still need to recurse into by some safety
factor (wanted_factor in handle_deferred_entries()); forgetting this
fact is a great way to repeatedly fail to create a minimal testcase for
several days and go down alternate wrong paths.

Now, I earlier summarized this optimization as "merging trees without
recursing into them", but this optimization does not require that all
three sides of history have a directory at a given path.  So long as the
tree on one side matches the tree in the base version, we can decide to
resolve in favor of whatever the other side of history has at that path
-- be it a directory, a file, a submodule, or a symlink.  Unfortunately,
the code in question didn't fully realize this, and was written assuming
the base version and both sides would have a directory at the given
path, as can be seen by the "ci->filemask == 0" comment in
resolve_trivial_directory_merge() that was added as part of 7bee6c100431
(merge-ort: avoid recursing into directories when we don't need to,
2021-07-16).  A few additional lines of code are needed to handle cases
where we have something other than a directory on the other side of
history.

But, knowing that resolve_trivial_directory_merge() doesn't have
sufficient state updating logic doesn't show us how to trigger a bug
without combining with the other bits of information we provided above.
Here's a relevant testcase:
   * branches A & B
   * commit A1: adds "folder" as a directory with files tracked under it
   * commit B1: adds "folder" as a submodule
   * commit A2: merges B1 into A1, keeping "folder" as a directory
     (and in fact, with no changes to "folder" since A1), discarding the
     submodule
   * commit B2: merges A1 into B1, keeping "folder" as a submodule
     (and in fact, with no changes to "folder" since B1), discarding the
     directory
Here, if we try to merge A2 & B2, the logic proceeds as follows:
   * we have multiple merge-bases: A1 & B1.  So we have to merge those
     to get a virtual merge base.
   * due to "folder" as a directory and "folder" as a submodule, the
     path collision logic triggers and renames "folder" as a submodule
     to "folder~Temporary merge branch 2" so we can keep it alongside
     "folder" as a directory.
   * we now have a virtual merge base (containing both "folder"
     directory and a "folder~Temporary merge branch 2" submodule) and
     can now do the outer merge
   * in the first step of the outer merge, we attempt to defer recursing
     into folder/ as a directory, but find we need to for rename
     detection.
   * in rename detection, we note that "folder~Temporary merge branch 2"
     has the same hash as "folder" as a submodule in B2, which means we
     have an exact rename.
   * after rename detection, we discover no path in folder/ is needed
     for renames, and so we can cache renames and restart.
   * after restarting, we avoid recursing into "folder/" and realize we
     can resolve it trivially since it hasn't been modified.  The
     resolution removes "folder/", leaving us only "folder" as a
     submodule from commit B2.
   * After this point, we should have a rename/delete conflict on
     "folder~Temporary merge branch 2" -> "folder", but our marking of
     the merge of "folder" as clean broke our ability to handle that and
     in fact triggers an assertion in process_renames().

When there was a df_conflict (directory/"file" conflict, where "file"
could be submodule or regular file or symlink), ensure
resolve_trivial_directory_merge() handles it properly.  In particular:
  * do not pre-emptively mark the path as cleanly merged if the
    remaining path is a file; allow it to be processed in
    process_entries() later to determine if it was clean
  * clear the parts of dirmask or filemask corresponding to the matching
    sides of history, since we are resolving those away
  * clear the df_conflict bit afterwards; since we cleared away the two
    matching sides and only have one side left, that one side can't
    have a directory/file conflict with itself.
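
As a rough illustration of the three points above, here is a
self-contained sketch.  It is not the merge-ort code itself: the struct
ci_sketch and resolve_trivial_sketch() are hypothetical, trimmed-down
stand-ins, and the bit layout (bit 0 = base, bit 1 = side 1, bit 2 =
side 2) is an assumption inferred from the match_mask assertion.

```c
#include <assert.h>

/* Hypothetical, trimmed-down stand-in for merge-ort's conflict_info. */
struct ci_sketch {
	unsigned dirmask;     /* which of base/side1/side2 have a directory */
	unsigned filemask;    /* which of base/side1/side2 have a non-directory */
	unsigned match_mask;  /* which two of the three have identical hashes */
	unsigned df_conflict; /* started out as a directory/file conflict? */
	unsigned clean;       /* may the path be treated as cleanly merged? */
};

/* Resolve in favor of "side"; the other side matched the base. */
static void resolve_trivial_sketch(struct ci_sketch *ci, int side)
{
	assert((side == 1 && ci->match_mask == 5) ||
	       (side == 2 && ci->match_mask == 3));

	/* Strip the two matching versions; they are being resolved away. */
	ci->dirmask &= ~ci->match_mask;
	ci->filemask &= ~ci->match_mask;
	assert(!ci->filemask || !ci->dirmask);
	ci->match_mask = 0;

	/*
	 * A leftover file from a directory/file conflict may still be part
	 * of a rename conflict, so do not declare it clean yet.
	 */
	ci->clean = !ci->df_conflict || ci->dirmask;

	/* One remaining version cannot conflict with itself. */
	ci->df_conflict = 0;
}
```

For the testcase above (base and side 1 have "folder" as a directory,
side 2 has it as a submodule), this would start from dirmask = 3,
filemask = 4, match_mask = 3 and df_conflict = 1, and end with only the
submodule left and the path not yet marked clean.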

Also add the above minimal testcase showcasing this bug to t6422, **with
a sufficient number of paths under the folder/ directory to actually
trigger it**.  (I wish I could have all those days back from all the
wrong paths I went down due to not having enough files under that
directory...)

I know this commit has a very high ratio of lines in the commit message
to lines of comments, and a relatively high ratio of comments to actual
code, but given how long it took me to track down, on the off chance
that we ever need to further modify this logic, I wanted it thoroughly
documented for future me and for whatever other poor soul might end up
needing to read this commit message.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 merge-ort.c                          | 35 ++++++++++-
 t/t6422-merge-rename-corner-cases.sh | 86 ++++++++++++++++++++++++++++
 2 files changed, 120 insertions(+), 1 deletion(-)

diff --git a/merge-ort.c b/merge-ort.c
index 29858074f9d8bf..738a61ce691882 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -1502,11 +1502,44 @@ static void resolve_trivial_directory_merge(struct conflict_info *ci, int side)
 	VERIFY_CI(ci);
 	assert((side == 1 && ci->match_mask == 5) ||
 	       (side == 2 && ci->match_mask == 3));
+
+	/*
+	 * Since ci->stages[0] matches ci->stages[3-side], resolve merge in
+	 * favor of ci->stages[side].
+	 */
 	oidcpy(&ci->merged.result.oid, &ci->stages[side].oid);
 	ci->merged.result.mode = ci->stages[side].mode;
 	ci->merged.is_null = is_null_oid(&ci->stages[side].oid);
+
+	/*
+	 * Because we resolved in favor of "side", we are no longer
+	 * considering the paths which matched (i.e. had the same hash) any
+	 * more.  Strip the matching paths from both dirmask & filemask.
+	 * Another consequence of merging in favor of side is that we can no
+	 * longer have a directory/file conflict either..but there's a slight
+	 * nuance we consider before clearing it.
+	 *
+	 * In most cases, resolving in favor of the other side means there's
+	 * no conflict at all, but if we had a directory/file conflict to
+	 * start, and the directory is resolved away, the remaining file could
+	 * still be part of a rename.  If the remaining file is part of a
+	 * rename, then it may also be part of a rename conflict (e.g.
+	 * rename/delete or rename/rename(1to2)), so we can't
+	 * mark it as a clean merge if we started with a directory/file
+	 * conflict and still have a file left.
+	 *
+	 * In contrast, if we started with a directory/file conflict and
+	 * still have a directory left, no file under that directory can be
+	 * part of a rename, otherwise we would have had to recurse into the
+	 * directory and would have never ended up within
+	 * resolve_trivial_directory_merge() for that directory.
+	 */
+	ci->dirmask &= (~ci->match_mask);
+	ci->filemask &= (~ci->match_mask);
+	assert(!ci->filemask || !ci->dirmask);
 	ci->match_mask = 0;
-	ci->merged.clean = 1; /* (ci->filemask == 0); */
+	ci->merged.clean = !ci->df_conflict || ci->dirmask;
+	ci->df_conflict = 0;
 }
 
 static int handle_deferred_entries(struct merge_options *opt,
diff --git a/t/t6422-merge-rename-corner-cases.sh b/t/t6422-merge-rename-corner-cases.sh
index f14c0fb30e1bf2..e18d5a227d54f7 100755
--- a/t/t6422-merge-rename-corner-cases.sh
+++ b/t/t6422-merge-rename-corner-cases.sh
@@ -1439,4 +1439,90 @@ test_expect_success 'rename/rename(1to2) with a binary file' '
 	)
 '
 
+# Testcase preliminary submodule/directory conflict and submodule rename
+#   Commit O: <empty, or additional irrelevant stuff>
+#   Commit A1: introduce "folder" (as a tree)
+#   Commit B1: introduce "folder" (as a submodule)
+#   Commit A2: merge B1 into A1, but keep folder as a tree
+#   Commit B2: merge A1 into B1, but keep folder as a submodule
+#   Merge A2 & B2
+test_setup_submodule_directory_preliminary_conflict () {
+	git init submodule_directory_preliminary_conflict &&
+	(
+		cd submodule_directory_preliminary_conflict &&
+
+		# Trying to do the A2 and B2 merges above is slightly more
+		# challenging with a local submodule (because checking out
+		# another commit has the submodule in the way).  Instead,
+		# first create the commits with the wrong parents but right
+		# trees, in the order A1, A2, B1, B2...
+		#
+		# Then go back and create new A2 & B2 with the correct
+		# parents and the same trees.
+
+		git commit --allow-empty -m orig &&
+
+		git branch A &&
+		git branch B &&
+
+		git checkout B &&
+		mkdir folder &&
+		echo A>folder/A &&
+		echo B>folder/B &&
+		echo C>folder/C &&
+		echo D>folder/D &&
+		echo E>folder/E &&
+		git add folder &&
+		git commit -m B1 &&
+
+		git commit --allow-empty -m B2 &&
+
+		git checkout A &&
+		git init folder &&
+		(
+			cd folder &&
+			>Z &&
+			>Y &&
+			git add Z Y &&
+			git commit -m "original submodule commit"
+		) &&
+		git add folder &&
+		git commit -m A1 &&
+
+		git commit --allow-empty -m A2 &&
+
+		NewA2=$(git commit-tree -p A^ -p B^ -m "Merge B into A" A^{tree}) &&
+		NewB2=$(git commit-tree -p B^ -p A^ -m "Merge A into B" B^{tree}) &&
+		git update-ref refs/heads/A $NewA2 &&
+		git update-ref refs/heads/B $NewB2
+	)
+}
+
+test_expect_success 'submodule/directory preliminary conflict' '
+	test_setup_submodule_directory_preliminary_conflict &&
+	(
+		cd submodule_directory_preliminary_conflict &&
+
+		git checkout A^0 &&
+
+		test_expect_code 1 git merge B^0 &&
+
+		# Make sure the index has the right number of entries
+		git ls-files -s >actual &&
+		test_line_count = 2 actual &&
+
+		# The "folder" as directory should have been resolved away
+		# as part of the merge.  The "folder" as submodule got
+		# renamed to "folder~Temporary merge branch 2" in the
+		# virtual merge base, resulting in a
+		#    "folder~Temporary merge branch 2" -> "folder"
+		# rename in the outermerge for the submodule, which then
+		# becomes part of a rename/delete conflict (because "folder"
+		# as a submodule was deleted in A2).
+		submod=$(git rev-parse A:folder) &&
+		printf "160000 $submod 1\tfolder\n160000 $submod 2\tfolder\n" >expect &&
+		test_cmp expect actual
+	)
+'
+
 test_done

From 861dbb1586aaac6f02c8c87dd55e5c0b9862296a Mon Sep 17 00:00:00 2001
From: Deveshi Dwivedi <deveshigurgaon@gmail.com>
Date: Mon, 29 Dec 2025 18:57:37 +0000
Subject: [PATCH 312/553] t5403: use test_path_is_file instead of test -f

Replace 'test -f' with test_path_is_file in
t5403-post-checkout-hook.sh. This helper provides better error
messages when tests fail, making it easier to debug issues.

Signed-off-by: Deveshi Dwivedi <deveshigurgaon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t5403-post-checkout-hook.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t5403-post-checkout-hook.sh b/t/t5403-post-checkout-hook.sh
index cfaae547398e0e..ade9e5087f9f30 100755
--- a/t/t5403-post-checkout-hook.sh
+++ b/t/t5403-post-checkout-hook.sh
@@ -110,7 +110,7 @@ test_expect_success 'post-checkout hook is triggered by clone' '
 	echo "$@" >"$GIT_DIR/post-checkout.args"
 	EOF
 	git clone --template=templates . clone3 &&
-	test -f clone3/.git/post-checkout.args
+	test_path_is_file clone3/.git/post-checkout.args
 '
 
 test_done

From 56d388e6ad9e819935a902b6d5ce3a6b3485b6e2 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <stolee@gmail.com>
Date: Mon, 29 Dec 2025 21:44:57 +0000
Subject: [PATCH 313/553] diff: avoid segfault with freed entries

When computing a diff in a partial clone, there is a chance that we
could trigger a prefetch of missing objects at the same time as we are
freeing entries from the global diff queue. This is difficult to
reproduce, as we need to have some objects be freed from the queue
before triggering the prefetch of missing objects. There is a new test
in t4067 that does trigger the segmentation fault that results in this
case.

The fix is to set the queue pointer to NULL after it is freed, and then
to be careful about NULL values in the prefetch.
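
The pattern can be sketched in isolation as follows.  This is only an
illustration under simplifying assumptions: pair_sketch, queue_sketch,
drop_entry() and count_live_entries() are hypothetical names, not the
structures or functions used in diff.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a queued pair and the queue holding it. */
struct pair_sketch {
	int one;
	int two;
};

struct queue_sketch {
	struct pair_sketch **queue;
	int nr;
};

/* Free entry i and leave NULL behind so later walkers can detect it. */
static void drop_entry(struct queue_sketch *q, int i)
{
	assert(i >= 0 && i < q->nr);
	free(q->queue[i]);
	q->queue[i] = NULL;
}

/* Walk the queue the way a prefetch callback might, skipping holes. */
static int count_live_entries(const struct queue_sketch *q)
{
	int i, live = 0;

	for (i = 0; i < q->nr; i++) {
		const struct pair_sketch *p = q->queue[i];

		if (!p)
			continue; /* already freed; must not be dereferenced */
		live++;
	}
	return live;
}
```

Without the NULL assignment, the walker would dereference a dangling
pointer for every entry that was freed earlier in the same loop.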

The more elaborate explanation is that within diffcore_std(), we may
skip the initial prefetch due to the output format (--name-only in the
test) and go straight to diffcore_skip_stat_unmatch(). In that method,
the index entries that have been invalidated by path changes show up as
entries but may be deleted because they are not actually content diffs
and only have newer timestamps than expected. As those entries are deleted,
later entries are checked with diff_filespec_check_stat_unmatch(), which
uses diff_queued_diff_prefetch() as the missing_object_cb in its diff
options. That can trigger downloading missing objects if the appropriate
scenario occurs to trigger a call to diff_populate_filespec(). It's
finally within that callback to diff_queued_diff_prefetch() that the
segfault occurs.

The test was hard to find because it required some real differences,
some not-different files that had a newer modified time, and the order
of those files alphabetically was important to trigger the deletion
before the prefetch was triggered.

I briefly considered a "lock" member for the diff queue, but it was a
much larger diff and introduced many more possible error scenarios.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c                        |  5 +++++
 t/t4067-diff-partial-clone.sh | 35 +++++++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+)

diff --git a/diff.c b/diff.c
index 436da250eb150d..a68ddd2168ba1c 100644
--- a/diff.c
+++ b/diff.c
@@ -7098,6 +7098,7 @@ static void diffcore_skip_stat_unmatch(struct diff_options *diffopt)
 			if (!diffopt->flags.no_index)
 				diffopt->skip_stat_unmatch++;
 			diff_free_filepair(p);
+			q->queue[i] = NULL;
 		}
 	}
 	free(q->queue);
@@ -7141,6 +7142,10 @@ void diff_queued_diff_prefetch(void *repository)
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
+
+		if (!p)
+			continue;
+
 		diff_add_if_missing(repo, &to_fetch, p->one);
 		diff_add_if_missing(repo, &to_fetch, p->two);
 	}
diff --git a/t/t4067-diff-partial-clone.sh b/t/t4067-diff-partial-clone.sh
index 581250dd2d227a..72f25de44950ae 100755
--- a/t/t4067-diff-partial-clone.sh
+++ b/t/t4067-diff-partial-clone.sh
@@ -132,6 +132,41 @@ test_expect_success 'diff with rename detection batches blobs' '
 	test_line_count = 1 done_lines
 '
 
+test_expect_success 'diff succeeds even if entries are removed from queue' '
+	test_when_finished "rm -rf server client trace" &&
+
+	test_create_repo server &&
+	for l in a c e g i p
+	do
+		echo $l >server/$l &&
+		git -C server add $l || return 1
+	done &&
+	git -C server commit -m x &&
+
+	for l in a e i
+	do
+		git -C server rm $l || return 1
+	done &&
+
+	for l in b d f i
+		do
+		echo $l$l >server/$l &&
+		git -C server add $l || return 1
+	done &&
+	git -C server commit -a -m x &&
+
+	test_config -C server uploadpack.allowfilter 1 &&
+	test_config -C server uploadpack.allowanysha1inwant 1 &&
+	git clone --filter=blob:limit=0 "file://$(pwd)/server" client &&
+
+	for file in $(ls client)
+	do
+		cat client/$file >$file &&
+		mv $file client/$file || return 1
+	done &&
+	git -C client diff --name-only --relative HEAD^
+'
+
 test_expect_success 'diff does not fetch anything if inexact rename detection is not needed' '
 	test_when_finished "rm -rf server client trace" &&
 

From 68cb7f9e92a5d8e9824f5b52ac3d0a9d8f653dbe Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Mon, 29 Dec 2025 17:43:28 +0900
Subject: [PATCH 314/553] The 14th batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/RelNotes/2.53.0.adoc | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc
index d71948829c9ee2..91cfb7adfaab8f 100644
--- a/Documentation/RelNotes/2.53.0.adoc
+++ b/Documentation/RelNotes/2.53.0.adoc
@@ -31,6 +31,9 @@ UI, Workflows & Features
 
  * "git repo struct" learned to take "-z" as a synonym to "--format=nul".
 
+ * More object database related information are shown in "git repo
+   structure" output.
+
 
 Performance, Internal Implementation, Development Support etc.
 --------------------------------------------------------------
@@ -83,6 +86,9 @@ Performance, Internal Implementation, Development Support etc.
  * The code path that enumerates promisor objects have been optimized
    to skip pointlessly parsing blob objects.
 
+ * Prepare test suite for Git for Windows that supports symbolic
+   links.
+
 
 Fixes since v2.52
 -----------------
@@ -204,6 +210,17 @@ Fixes since v2.52
    has been corrected.
    (merge b7b17ec8a6 kn/fix-fetch-backfill-tag-with-batched-ref-updates later to maint).
 
+ * Document "rev-list --filter-provided-objects" better.
+   (merge 6d8dc99478 jt/doc-rev-list-filter-provided-objects later to maint).
+
+ * Even when there is no changes in the packfile and no need to
+   recompute bitmaps, "git repack" recomputed and updated the MIDX
+   file, which has been corrected.
+   (merge 6ce9d558ce ps/repack-avoid-noop-midx-rewrite later to maint).
+
+ * Update HTTP tests to adjust for changes in curl 8.18.0
+   (merge 17f4b01da7 jk/test-curl-updates later to maint).
+
  * Other code cleanup, docfix, build fix, etc.
    (merge 46207a54cc qj/doc-http-bad-want-response later to maint).
    (merge df90eccd93 kh/doc-commit-extra-references later to maint).
@@ -221,3 +238,7 @@ Fixes since v2.52
    (merge 4ce170c522 ds/doc-scalar-config later to maint).
    (merge a0c813951a jc/doc-commit-signoff-config later to maint).
    (merge 8ee262985a ja/doc-misc-fixes later to maint).
+   (merge 1722c2244b mh/doc-core-attributesfile later to maint).
+   (merge c469ca26c5 dk/ci-rust-fix later to maint).
+   (merge 12f0be0857 gf/clear-path-cache-cleanup later to maint).
+   (merge 949df6ed6b js/test-func-comment-fix later to maint).

From 0b495cd390ec39045542f6fcae53ce64264301b7 Mon Sep 17 00:00:00 2001
From: Paul Tarjan <github@paulisageek.com>
Date: Sat, 3 Jan 2026 20:40:09 +0000
Subject: [PATCH 315/553] t7800: fix racy "difftool --dir-diff syncs worktree"
 test

The "difftool --dir-diff syncs worktree without unstaged change" test
fails intermittently on Windows CI, as seen at:

  https://github.com/git/git/actions/runs/20624095002/job/59231745784#step:5:416

The root cause is that the original file content and the replacement
content have identical sizes:

  - Original: "main\ntest\na\n" = 12 bytes
  - New:      "new content\n"   = 12 bytes

When difftool's sync-back mechanism checks for changes, it compares
stat data between the temporary index and the modified files. If the
modification happens within the same timestamp granularity window and
file size stays the same, the change goes undetected.
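
A deliberately simplified model of such a check is sketched below.  It
assumes that only the modification time (at one-second granularity) and
the size are compared; stat_sketch, stat_of() and looks_unchanged() are
hypothetical names and do not correspond to the actual difftool or
index code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical subset of stat data: coarse mtime plus size in bytes. */
struct stat_sketch {
	long mtime_sec;
	size_t size;
};

/* Build the stat data a file with this content would have. */
static struct stat_sketch stat_of(const char *content, long mtime_sec)
{
	struct stat_sketch st;

	assert(content);
	st.mtime_sec = mtime_sec;
	st.size = strlen(content);
	return st;
}

/* Nonzero when the two snapshots are indistinguishable by stat data. */
static int looks_unchanged(const struct stat_sketch *a,
			   const struct stat_sketch *b)
{
	return a->mtime_sec == b->mtime_sec && a->size == b->size;
}
```

In this model, "main\ntest\na\n" and "new content\n" written within the
same second compare as unchanged, while a replacement of a different
length does not.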

On Windows, this is more likely to manifest because Git relies on
inode changes as a fallback when other stat fields match, but Windows
filesystems lack inodes. This is a real bug that could affect users
scripting difftool similarly, as seen at:

  https://github.com/git-for-windows/git/issues/5132

Fix the test by changing the replacement content to "modified content"
(17 bytes), ensuring the size difference is detected regardless of
timestamp resolution or platform-specific stat behavior.

Note: This fixes the test flakiness but not the underlying issue in
difftool's change detection. Other tests with same-size file patterns
(t0010-racy-git.sh, t2200-add-update.sh) are not affected because they
use normal index operations with proper racy-git detection.

Signed-off-by: Paul Tarjan <github@paulisageek.com>
Reviewed-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t7800-difftool.sh | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/t/t7800-difftool.sh b/t/t7800-difftool.sh
index bf0f67378dbb23..8a91ff3603ff92 100755
--- a/t/t7800-difftool.sh
+++ b/t/t7800-difftool.sh
@@ -647,21 +647,21 @@ test_expect_success SYMLINKS 'difftool --dir-diff --symlinks without unstaged ch
 '
 
 write_script modify-right-file <<\EOF
-echo "new content" >"$2/file"
+echo "modified content" >"$2/file"
 EOF
 
 run_dir_diff_test 'difftool --dir-diff syncs worktree with unstaged change' '
 	test_when_finished git reset --hard &&
 	echo "orig content" >file &&
 	git difftool -d $symlinks --extcmd "$PWD/modify-right-file" branch &&
-	echo "new content" >expect &&
+	echo "modified content" >expect &&
 	test_cmp expect file
 '
 
 run_dir_diff_test 'difftool --dir-diff syncs worktree without unstaged change' '
 	test_when_finished git reset --hard &&
 	git difftool -d $symlinks --extcmd "$PWD/modify-right-file" branch &&
-	echo "new content" >expect &&
+	echo "modified content" >expect &&
 	test_cmp expect file
 '
 

From 404b6772297c77f48de298adf11ab94f080eba72 Mon Sep 17 00:00:00 2001
From: Pushkar Singh <pushkarkumarsingh1970@gmail.com>
Date: Sun, 4 Jan 2026 19:47:59 +0000
Subject: [PATCH 316/553] t1300: use test helpers instead of `test` command

Replace `test -f` and `test -h` checks with `test_path_is_file` and
`test_path_is_symlink`. Using the test framework helpers provides
clearer diagnostics and keeps tests consistent across the suite.

Signed-off-by: Pushkar Singh <pushkarkumarsingh1970@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t1300-config.sh             | 8 ++++----
 t/t2021-checkout-overwrite.sh | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/t/t1300-config.sh b/t/t1300-config.sh
index 358d6363796f48..9850fcd5b567a5 100755
--- a/t/t1300-config.sh
+++ b/t/t1300-config.sh
@@ -1232,12 +1232,12 @@ test_expect_success SYMLINKS 'symlinked configuration' '
 	test_when_finished "rm myconfig" &&
 	ln -s notyet myconfig &&
 	git config --file=myconfig test.frotz nitfol &&
-	test -h myconfig &&
-	test -f notyet &&
+	test_path_is_symlink myconfig &&
+	test_path_is_file notyet &&
 	test "z$(git config --file=notyet test.frotz)" = znitfol &&
 	git config --file=myconfig test.xyzzy rezrov &&
-	test -h myconfig &&
-	test -f notyet &&
+	test_path_is_symlink myconfig &&
+	test_path_is_file notyet &&
 	cat >expect <<-\EOF &&
 	nitfol
 	rezrov
diff --git a/t/t2021-checkout-overwrite.sh b/t/t2021-checkout-overwrite.sh
index a5c03d5d4a2c5a..38c41ae37321ce 100755
--- a/t/t2021-checkout-overwrite.sh
+++ b/t/t2021-checkout-overwrite.sh
@@ -27,7 +27,7 @@ test_expect_success 'checkout commit with dir must not remove untracked a/b' '
 	git rm --cached a/b &&
 	git commit -m "un-track the file" &&
 	test_must_fail git checkout start &&
-	test -f a/b
+	test_path_is_file a/b
 '
 
 test_expect_success 'create a commit where dir a/b changed to symlink' '
@@ -49,7 +49,7 @@ test_expect_success 'checkout commit with dir must not remove untracked a/b' '
 
 test_expect_success SYMLINKS 'the symlink remained' '
 
-	test -h a/b
+	test_path_is_symlink a/b
 '
 
 test_expect_success 'cleanup after previous symlink tests' '

From 76eab50f756fedfa28388213d7fea209f86dfae6 Mon Sep 17 00:00:00 2001
From: Kristoffer Haugsbakk <code@khaugsbakk.name>
Date: Mon, 5 Jan 2026 20:53:17 +0100
Subject: [PATCH 317/553] replay: remove dead code and rearrange
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

22d99f01 (replay: add --advance or 'cherry-pick' mode, 2023-11-24) both
added `--advance` and made one of `--onto` or `--advance` mandatory.
But `determine_replay_mode` claims that there is a third alternative;
neither `--onto` nor `--advance` was given:

    if (onto_name) {
    ...
    } else if (*advance_name) {
    ...
    } else {
    ...
    }

But this is false—the fallthrough else-block is dead code.

Commit 22d99f01 was iterated upon by several people.[1] The initial
author wrote code for a sort of *guess mode*, allowing for shorter
commands when that was possible. But the next person instead made one
of the aforementioned options mandatory. In turn this code was dead on
arrival in git.git.

[1]: https://lore.kernel.org/git/CABPp-BEcJqjD4ztsZo2FTZgWT5ZOADKYEyiZtda+d0mSd1quPQ@mail.gmail.com/

Let’s remove this code. We can also join the if-block with the
condition `!*advance_name` into the `*onto` block since we do not set
`*advance_name` in this function. It only looked like we might set it
since the dead code has this line:

    *advance_name = xstrdup_or_null(last_key);

Let’s also rename the function since we do not determine the
replay mode here. We just set up `*onto` and refs to update.

Note that there might be more dead code caused by this *guess mode*.
We only concern ourselves with this function for now.

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/replay.c | 70 +++++++++++-------------------------------------
 1 file changed, 16 insertions(+), 54 deletions(-)

diff --git a/builtin/replay.c b/builtin/replay.c
index 69c4c551297c03..524bf96ffd6c9d 100644
--- a/builtin/replay.c
+++ b/builtin/replay.c
@@ -162,12 +162,12 @@ static void get_ref_information(struct repository *repo,
 	}
 }
 
-static void determine_replay_mode(struct repository *repo,
-				  struct rev_cmdline_info *cmd_info,
-				  const char *onto_name,
-				  char **advance_name,
-				  struct commit **onto,
-				  struct strset **update_refs)
+static void set_up_replay_mode(struct repository *repo,
+			       struct rev_cmdline_info *cmd_info,
+			       const char *onto_name,
+			       char **advance_name,
+			       struct commit **onto,
+			       struct strset **update_refs)
 {
 	struct ref_info rinfo;
 
@@ -182,10 +182,16 @@ static void determine_replay_mode(struct repository *repo,
 		if (rinfo.positive_refexprs <
 		    strset_get_size(&rinfo.positive_refs))
 			die(_("all positive revisions given must be references"));
-	} else if (*advance_name) {
+		*update_refs = xcalloc(1, sizeof(**update_refs));
+		**update_refs = rinfo.positive_refs;
+		memset(&rinfo.positive_refs, 0, sizeof(**update_refs));
+	} else {
 		struct object_id oid;
 		char *fullname = NULL;
 
+		if (!*advance_name)
+			BUG("expected either onto_name or *advance_name in this function");
+
 		*onto = peel_committish(repo, *advance_name);
 		if (repo_dwim_ref(repo, *advance_name, strlen(*advance_name),
 			     &oid, &fullname, 0) == 1) {
@@ -196,51 +202,6 @@ static void determine_replay_mode(struct repository *repo,
 		}
 		if (rinfo.positive_refexprs > 1)
 			die(_("cannot advance target with multiple sources because ordering would be ill-defined"));
-	} else {
-		int positive_refs_complete = (
-			rinfo.positive_refexprs ==
-			strset_get_size(&rinfo.positive_refs));
-		int negative_refs_complete = (
-			rinfo.negative_refexprs ==
-			strset_get_size(&rinfo.negative_refs));
-		/*
-		 * We need either positive_refs_complete or
-		 * negative_refs_complete, but not both.
-		 */
-		if (rinfo.negative_refexprs > 0 &&
-		    positive_refs_complete == negative_refs_complete)
-			die(_("cannot implicitly determine whether this is an --advance or --onto operation"));
-		if (negative_refs_complete) {
-			struct hashmap_iter iter;
-			struct strmap_entry *entry;
-			const char *last_key = NULL;
-
-			if (rinfo.negative_refexprs == 0)
-				die(_("all positive revisions given must be references"));
-			else if (rinfo.negative_refexprs > 1)
-				die(_("cannot implicitly determine whether this is an --advance or --onto operation"));
-			else if (rinfo.positive_refexprs > 1)
-				die(_("cannot advance target with multiple source branches because ordering would be ill-defined"));
-
-			/* Only one entry, but we have to loop to get it */
-			strset_for_each_entry(&rinfo.negative_refs,
-					      &iter, entry) {
-				last_key = entry->key;
-			}
-
-			free(*advance_name);
-			*advance_name = xstrdup_or_null(last_key);
-		} else { /* positive_refs_complete */
-			if (rinfo.negative_refexprs > 1)
-				die(_("cannot implicitly determine correct base for --onto"));
-			if (rinfo.negative_refexprs == 1)
-				*onto = rinfo.onto;
-		}
-	}
-	if (!*advance_name) {
-		*update_refs = xcalloc(1, sizeof(**update_refs));
-		**update_refs = rinfo.positive_refs;
-		memset(&rinfo.positive_refs, 0, sizeof(**update_refs));
 	}
 	strset_clear(&rinfo.negative_refs);
 	strset_clear(&rinfo.positive_refs);
@@ -451,8 +412,9 @@ int cmd_replay(int argc,
 		revs.simplify_history = 0;
 	}
 
-	determine_replay_mode(repo, &revs.cmdline, onto_name, &advance_name,
-			      &onto, &update_refs);
+	set_up_replay_mode(repo, &revs.cmdline,
+			   onto_name, &advance_name,
+			   &onto, &update_refs);
 
 	if (!onto) /* FIXME: Should handle replaying down to root commit */
 		die("Replaying down to root commit is not supported yet!");

From 17b7965a03bd38215cb78ae1c4b9646d0ee73a40 Mon Sep 17 00:00:00 2001
From: Kristoffer Haugsbakk <code@khaugsbakk.name>
Date: Mon, 5 Jan 2026 20:53:18 +0100
Subject: [PATCH 318/553] replay: find *onto only after testing for ref name
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

We are about to make `peel_committish` die when it cannot find
a commit-ish instead of returning `NULL`. But that would make e.g.
`git replay --advance=refs/non-existent` die with a less descriptive
error message; the highest-level error message is that the name does
not exist as a ref, not that we cannot find a commit-ish based on
the name.

Let’s try to find the ref first and only after that try to peel it to
a commit-ish.

Also add a regression test to protect this error order from future
modifications.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/replay.c         | 2 +-
 t/t3650-replay-basics.sh | 7 +++++++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/builtin/replay.c b/builtin/replay.c
index 524bf96ffd6c9d..9265ebcd05d569 100644
--- a/builtin/replay.c
+++ b/builtin/replay.c
@@ -192,7 +192,6 @@ static void set_up_replay_mode(struct repository *repo,
 		if (!*advance_name)
 			BUG("expected either onto_name or *advance_name in this function");
 
-		*onto = peel_committish(repo, *advance_name);
 		if (repo_dwim_ref(repo, *advance_name, strlen(*advance_name),
 			     &oid, &fullname, 0) == 1) {
 			free(*advance_name);
@@ -200,6 +199,7 @@ static void set_up_replay_mode(struct repository *repo,
 		} else {
 			die(_("argument to --advance must be a reference"));
 		}
+		*onto = peel_committish(repo, *advance_name);
 		if (rinfo.positive_refexprs > 1)
 			die(_("cannot advance target with multiple sources because ordering would be ill-defined"));
 	}
diff --git a/t/t3650-replay-basics.sh b/t/t3650-replay-basics.sh
index cf3aacf3551f8e..8ef0b1984d7324 100755
--- a/t/t3650-replay-basics.sh
+++ b/t/t3650-replay-basics.sh
@@ -51,6 +51,13 @@ test_expect_success 'setup bare' '
 	git clone --bare . bare
 '
 
+test_expect_success 'argument to --advance must be a reference' '
+	echo "fatal: argument to --advance must be a reference" >expect &&
+	oid=$(git rev-parse main) &&
+	test_must_fail git replay --advance=$oid topic1..topic2 2>actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'using replay to rebase two branches, one on top of other' '
 	git replay --ref-action=print --onto main topic1..topic2 >result &&
 

From 3074d08cfa1bfb75f96fa4a240c575fad4cb8060 Mon Sep 17 00:00:00 2001
From: Kristoffer Haugsbakk <code@khaugsbakk.name>
Date: Mon, 5 Jan 2026 20:53:19 +0100
Subject: [PATCH 319/553] replay: die descriptively when invalid commit-ish is
 given
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Giving an invalid commit-ish to `--onto` makes git-replay(1) fail with:

    fatal: Replaying down to root commit is not supported yet!

Going backwards from this point:

1. `onto` is `NULL` from `set_up_replay_mode`;
2. that function in turn calls `peel_committish`; and
3. here we return `NULL` if `repo_get_oid` fails.

Let’s die immediately with a descriptive error message instead.

Doing this also provides us with a descriptive error if we
unintentionally leave out the argument to `--onto`:[1]

    $ git replay --onto ^main topic1
    fatal: '^main' is not a valid commit-ish for --onto

Note that the `--advance` case won’t be triggered in practice because
of the “argument to --advance must be a reference” check (see the
previous test, and commit).

[1]: The argument to `--onto` is mandatory and the option parser accepts
     both `--onto=<name>` (stuck form) and `--onto name`. The latter
     form makes it easy to unintentionally pass something to the option
     when you really meant to pass a positional argument.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/replay.c         | 13 +++++++------
 t/t3650-replay-basics.sh |  7 +++++++
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/builtin/replay.c b/builtin/replay.c
index 9265ebcd05d569..1899ccc7cc3ff5 100644
--- a/builtin/replay.c
+++ b/builtin/replay.c
@@ -33,13 +33,15 @@ static const char *short_commit_name(struct repository *repo,
 				       DEFAULT_ABBREV);
 }
 
-static struct commit *peel_committish(struct repository *repo, const char *name)
+static struct commit *peel_committish(struct repository *repo,
+				      const char *name,
+				      const char *mode)
 {
 	struct object *obj;
 	struct object_id oid;
 
 	if (repo_get_oid(repo, name, &oid))
-		return NULL;
+		die(_("'%s' is not a valid commit-ish for %s"), name, mode);
 	obj = parse_object(repo, &oid);
 	return (struct commit *)repo_peel_to_type(repo, name, 0, obj,
 						  OBJ_COMMIT);
@@ -178,7 +180,7 @@ static void set_up_replay_mode(struct repository *repo,
 	die_for_incompatible_opt2(!!onto_name, "--onto",
 				  !!*advance_name, "--advance");
 	if (onto_name) {
-		*onto = peel_committish(repo, onto_name);
+		*onto = peel_committish(repo, onto_name, "--onto");
 		if (rinfo.positive_refexprs <
 		    strset_get_size(&rinfo.positive_refs))
 			die(_("all positive revisions given must be references"));
@@ -199,7 +201,7 @@ static void set_up_replay_mode(struct repository *repo,
 		} else {
 			die(_("argument to --advance must be a reference"));
 		}
-		*onto = peel_committish(repo, *advance_name);
+		*onto = peel_committish(repo, *advance_name, "--advance");
 		if (rinfo.positive_refexprs > 1)
 			die(_("cannot advance target with multiple sources because ordering would be ill-defined"));
 	}
@@ -416,8 +418,7 @@ int cmd_replay(int argc,
 			   onto_name, &advance_name,
 			   &onto, &update_refs);
 
-	if (!onto) /* FIXME: Should handle replaying down to root commit */
-		die("Replaying down to root commit is not supported yet!");
+	/* FIXME: Should handle replaying down to root commit */
 
 	/* Build reflog message */
 	if (advance_name_opt)
diff --git a/t/t3650-replay-basics.sh b/t/t3650-replay-basics.sh
index 8ef0b1984d7324..8d82dad71486ee 100755
--- a/t/t3650-replay-basics.sh
+++ b/t/t3650-replay-basics.sh
@@ -58,6 +58,13 @@ test_expect_success 'argument to --advance must be a reference' '
 	test_cmp expect actual
 '
 
+test_expect_success '--onto with invalid commit-ish' '
+	printf "fatal: ${SQ}refs/not-valid${SQ} is not " >expect &&
+	printf "a valid commit-ish for --onto\n" >>expect &&
+	test_must_fail git replay --onto=refs/not-valid topic1..topic2 2>actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'using replay to rebase two branches, one on top of other' '
 	git replay --ref-action=print --onto main topic1..topic2 >result &&
 

From f67f7ddbbd03634318d54b1d1ad7ed8df4a2b292 Mon Sep 17 00:00:00 2001
From: Kristoffer Haugsbakk <code@khaugsbakk.name>
Date: Mon, 5 Jan 2026 20:53:20 +0100
Subject: [PATCH 320/553] replay: improve code comment and die message

Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/replay.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/replay.c b/builtin/replay.c
index 1899ccc7cc3ff5..402db44af2b38f 100644
--- a/builtin/replay.c
+++ b/builtin/replay.c
@@ -418,7 +418,7 @@ int cmd_replay(int argc,
 			   onto_name, &advance_name,
 			   &onto, &update_refs);
 
-	/* FIXME: Should handle replaying down to root commit */
+	/* FIXME: Should allow replaying commits with the first as a root commit */
 
 	/* Build reflog message */
 	if (advance_name_opt)
@@ -454,7 +454,7 @@ int cmd_replay(int argc,
 		int hr;
 
 		if (!commit->parents)
-			die(_("replaying down to root commit is not supported yet!"));
+			die(_("replaying down from root commit is not supported yet!"));
 		if (commit->parents->next)
 			die(_("replaying merge commits is not supported yet!"));
 

From 6f693364cc183ea5a8296c9ce2ff515f47206f92 Mon Sep 17 00:00:00 2001
From: Kristoffer Haugsbakk <code@khaugsbakk.name>
Date: Mon, 5 Jan 2026 20:53:21 +0100
Subject: [PATCH 321/553] replay: die if we cannot parse object
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

`parse_object` can return `NULL`. That will in turn make
`repo_peel_to_type` return the same.

Let’s die fast and descriptively with the `*_or_die` variant.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/replay.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/replay.c b/builtin/replay.c
index 402db44af2b38f..1960bbbee8685d 100644
--- a/builtin/replay.c
+++ b/builtin/replay.c
@@ -42,7 +42,7 @@ static struct commit *peel_committish(struct repository *repo,
 
 	if (repo_get_oid(repo, name, &oid))
 		die(_("'%s' is not a valid commit-ish for %s"), name, mode);
-	obj = parse_object(repo, &oid);
+	obj = parse_object_or_die(repo, &oid, name);
 	return (struct commit *)repo_peel_to_type(repo, name, 0, obj,
 						  OBJ_COMMIT);
 }

From 56b77a687eaf9c48482e9f59ab7077e442e85ff5 Mon Sep 17 00:00:00 2001
From: Kristoffer Haugsbakk <code@khaugsbakk.name>
Date: Mon, 5 Jan 2026 20:53:22 +0100
Subject: [PATCH 322/553] t3650: add more regression tests for failure
 conditions
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

There isn’t much test coverage for basic failure conditions. Let’s add
a few more since these are simple to write and remove if they become
obsolete.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t3650-replay-basics.sh | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/t/t3650-replay-basics.sh b/t/t3650-replay-basics.sh
index 8d82dad71486ee..307101eeb911f7 100755
--- a/t/t3650-replay-basics.sh
+++ b/t/t3650-replay-basics.sh
@@ -43,6 +43,13 @@ test_expect_success 'setup' '
 	test_commit L &&
 	test_commit M &&
 
+	git switch --detach topic4 &&
+	test_commit N &&
+	test_commit O &&
+	git switch -c topic-with-merge topic4 &&
+	test_merge P O --no-ff &&
+	git switch main &&
+
 	git switch -c conflict B &&
 	test_commit C.conflict C.t conflict
 '
@@ -65,6 +72,39 @@ test_expect_success '--onto with invalid commit-ish' '
 	test_cmp expect actual
 '
 
+test_expect_success 'option --onto or --advance is mandatory' '
+	echo "error: option --onto or --advance is mandatory" >expect &&
+	test_might_fail git replay -h >>expect &&
+	test_must_fail git replay topic1..topic2 2>actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'no base or negative ref gives no-replaying down to root error' '
+	echo "fatal: replaying down from root commit is not supported yet!" >expect &&
+	test_must_fail git replay --onto=topic1 topic2 2>actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'options --advance and --contained cannot be used together' '
+	printf "fatal: options ${SQ}--advance${SQ} " >expect &&
+	printf "and ${SQ}--contained${SQ} cannot be used together\n" >>expect &&
+	test_must_fail git replay --advance=main --contained \
+		topic1..topic2 2>actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'cannot advance target ... ordering would be ill-defined' '
+	echo "fatal: cannot advance target with multiple sources because ordering would be ill-defined" >expect &&
+	test_must_fail git replay --advance=main main topic1 topic2 2>actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'replaying merge commits is not supported yet' '
+	echo "fatal: replaying merge commits is not supported yet!" >expect &&
+	test_must_fail git replay --advance=main main..topic-with-merge 2>actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'using replay to rebase two branches, one on top of other' '
 	git replay --ref-action=print --onto main topic1..topic2 >result &&
 

From e0bfec3dfc356f7d808eb5ee546a54116b794397 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Tue, 6 Jan 2026 14:36:52 +0900
Subject: [PATCH 323/553] The 15th batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/RelNotes/2.53.0.adoc | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc
index 91cfb7adfaab8f..9e8384a4c101ac 100644
--- a/Documentation/RelNotes/2.53.0.adoc
+++ b/Documentation/RelNotes/2.53.0.adoc
@@ -89,6 +89,9 @@ Performance, Internal Implementation, Development Support etc.
  * Prepare test suite for Git for Windows that supports symbolic
    links.
 
+ * Use hook API to replace ad-hoc invocation of hook scripts with the
+   run_command() API.
+
 
 Fixes since v2.52
 -----------------
@@ -221,6 +224,10 @@ Fixes since v2.52
  * Update HTTP tests to adjust for changes in curl 8.18.0
    (merge 17f4b01da7 jk/test-curl-updates later to maint).
 
+ * Workaround the "iconv" shipped as part of macOS, which is broken
+   handling stateful ISO/IEC 2022 encoded strings.
+   (merge cee341e9dd rs/macos-iconv-workaround later to maint).
+
  * Other code cleanup, docfix, build fix, etc.
    (merge 46207a54cc qj/doc-http-bad-want-response later to maint).
    (merge df90eccd93 kh/doc-commit-extra-references later to maint).
@@ -242,3 +249,6 @@ Fixes since v2.52
    (merge c469ca26c5 dk/ci-rust-fix later to maint).
    (merge 12f0be0857 gf/clear-path-cache-cleanup later to maint).
    (merge 949df6ed6b js/test-func-comment-fix later to maint).
+   (merge 93f894c001 bc/checkout-error-message-fix later to maint).
+   (merge abf05d856f rs/show-branch-prio-queue later to maint).
+   (merge 06188ea5f3 rs/parse-config-expiry-simplify later to maint).

From b3449b151767e34a82bf996c4d63feed9ef854bd Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Tue, 6 Jan 2026 13:58:49 +0100
Subject: [PATCH 324/553] builtin/gc: fix condition for whether to write commit
 graphs

When performing auto-maintenance we check whether commit graphs need to
be generated by counting the number of commits that are reachable by any
reference, but not covered by a commit graph. This search is performed
by iterating through all references and then doing a depth-first search
until we have found enough commits that are not present in the commit
graph.

This logic has a memory leak though:

  Direct leak of 16 byte(s) in 1 object(s) allocated from:
      #0 0x55555562e433 in malloc (git+0xda433)
      #1 0x555555964322 in do_xmalloc ../wrapper.c:55:8
      #2 0x5555559642e6 in xmalloc ../wrapper.c:76:9
      #3 0x55555579bf29 in commit_list_append ../commit.c:1872:35
      #4 0x55555569f160 in dfs_on_ref ../builtin/gc.c:1165:4
      #5 0x5555558c33fd in do_for_each_ref_iterator ../refs/iterator.c:431:12
      #6 0x5555558af520 in do_for_each_ref ../refs.c:1828:9
      #7 0x5555558ac317 in refs_for_each_ref ../refs.c:1833:9
      #8 0x55555569e207 in should_write_commit_graph ../builtin/gc.c:1188:11
      #9 0x55555569c915 in maintenance_is_needed ../builtin/gc.c:3492:8
      #10 0x55555569b76a in cmd_maintenance ../builtin/gc.c:3542:9
      #11 0x55555575166a in run_builtin ../git.c:506:11
      #12 0x5555557502f0 in handle_builtin ../git.c:779:9
      #13 0x555555751127 in run_argv ../git.c:862:4
      #14 0x55555575007b in cmd_main ../git.c:984:19
      #15 0x5555557523aa in main ../common-main.c:9:11
      #16 0x7ffff7a2a4d7 in __libc_start_call_main (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x2a4d7) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
      #17 0x7ffff7a2a59a in __libc_start_main@GLIBC_2.2.5 (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x2a59a) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
      #18 0x5555555f0934 in _start (git+0x9c934)

The root cause of this memory leak is our use of `commit_list_append()`.
This function expects as parameters the item to append and the _tail_ of
the list to append to. This tail will then be overwritten with the new
tail of the list so that it can be used in subsequent calls. But we call
it with `commit_list_append(parent->item, &stack)`, so we end up losing
everything but the new item.
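
The difference between the two list helpers can be illustrated with a
minimal, standalone sketch. The `struct node`, `list_insert()` and
`list_append()` names below are hypothetical stand-ins that only mimic
the calling convention described above; they are not the code from
commit.c:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a commit list; items are plain ints. */
struct node {
	int item;
	struct node *next;
};

/* Push onto the front of the list; *list remains the head. */
static struct node *list_insert(int item, struct node **list)
{
	struct node *n = malloc(sizeof(*n));
	assert(n);
	n->item = item;
	n->next = *list;
	*list = n;
	return n;
}

/*
 * Append at the tail: *next is expected to be the tail slot. It is
 * overwritten with the new node and the new tail slot is returned.
 */
static struct node **list_append(int item, struct node **next)
{
	struct node *n = malloc(sizeof(*n));
	assert(n);
	n->item = item;
	n->next = NULL;
	*next = n;
	return &n->next;
}

static size_t list_length(const struct node *list)
{
	size_t len = 0;
	for (; list; list = list->next)
		len++;
	return len;
}

/* Misuse: the head is passed where the tail slot is expected. */
static size_t reachable_after_misused_append(void)
{
	struct node *stack = NULL;
	struct node *lost;
	size_t len;

	list_append(1, &stack);
	lost = stack;           /* kept only so this demo can free node 1 */
	list_append(2, &stack); /* overwrites the head: node 1 is unreachable */
	len = list_length(stack);
	free(lost);
	free(stack);
	return len;
}

/* Correct stack usage: every pushed node stays reachable. */
static size_t reachable_after_insert(void)
{
	struct node *stack = NULL;
	size_t len;

	list_insert(1, &stack);
	list_insert(2, &stack);
	len = list_length(stack);
	while (stack) {
		struct node *next = stack->next;
		free(stack);
		stack = next;
	}
	return len;
}
```

In this sketch only one node is reachable after the misused append,
while both nodes are reachable after the inserts.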

This issue only surfaces when counting merge commits. Besides being a
memory leak, it also shows that we're in fact miscounting as we only
respect children of the last parent. All previous parents are discarded,
so their children will be disregarded unless they are hit via another
reference.

While crafting a test case for the issue I was puzzled that I couldn't
establish the proper border at which the auto-condition would be
fulfilled. As it turns out, there's another bug: if an object is at the
tip of any reference we don't mark it as seen. Consequently, if it is
the tip of or reachable via another ref, we'd count that object multiple
times.
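
The double counting can be reproduced in a small standalone sketch. It
assumes a linear history stored in an array and two refs pointing at
the same tip; `struct commit`, `count_unseen()` and `count_two_refs()`
are hypothetical names for illustration and do not mirror the code in
builtin/gc.c:

```c
#include <assert.h>

#define SEEN 1u

/* Hypothetical commit: a flag word and the index of its parent (-1 for none). */
struct commit {
	unsigned flags;
	int parent;
};

/*
 * Count commits reachable from `tip` that were not counted before.
 * Parents are always marked as SEEN; the tip is only checked and
 * marked when `mark_tip` is non-zero.
 */
static int count_unseen(struct commit *commits, int tip, int mark_tip)
{
	int count = 0;
	int cur;

	assert(tip >= 0);
	if (mark_tip) {
		if (commits[tip].flags & SEEN)
			return 0;
		commits[tip].flags |= SEEN;
	}
	count++;

	for (cur = commits[tip].parent; cur >= 0; cur = commits[cur].parent) {
		if (commits[cur].flags & SEEN)
			break;
		commits[cur].flags |= SEEN;
		count++;
	}
	return count;
}

/* Three commits in a line, with two refs both pointing at commit 2. */
static int count_two_refs(int mark_tip)
{
	struct commit commits[3] = { { 0, -1 }, { 0, 0 }, { 0, 1 } };

	return count_unseen(commits, 2, mark_tip) +
	       count_unseen(commits, 2, mark_tip);
}
```

Without marking the tip, the three commits are counted as four; with
the tip marked, the second ref contributes nothing and the total is
three.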

Fix both of these bugs so that we properly count objects without leaking
any memory.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/gc.c           |  8 +++++---
 t/t7900-maintenance.sh | 25 +++++++++++++++++++++++++
 2 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 92c6e7b954faff..17ff68cbd91037 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1130,8 +1130,10 @@ static int dfs_on_ref(const struct reference *ref, void *cb_data)
 		return 0;
 
 	commit = lookup_commit(the_repository, maybe_peeled);
-	if (!commit)
+	if (!commit || commit->object.flags & SEEN)
 		return 0;
+	commit->object.flags |= SEEN;
+
 	if (repo_parse_commit(the_repository, commit) ||
 	    commit_graph_position(commit) != COMMIT_NOT_FROM_GRAPH)
 		return 0;
@@ -1141,7 +1143,7 @@ static int dfs_on_ref(const struct reference *ref, void *cb_data)
 	if (data->num_not_in_graph >= data->limit)
 		return 1;
 
-	commit_list_append(commit, &stack);
+	commit_list_insert(commit, &stack);
 
 	while (!result && stack) {
 		struct commit_list *parent;
@@ -1162,7 +1164,7 @@ static int dfs_on_ref(const struct reference *ref, void *cb_data)
 				break;
 			}
 
-			commit_list_append(parent->item, &stack);
+			commit_list_insert(parent->item, &stack);
 		}
 	}
 
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 6b36f52df7c95d..7cc0ce57f8f320 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -206,6 +206,31 @@ test_expect_success 'commit-graph auto condition' '
 	test_subcommand $COMMIT_GRAPH_WRITE <cg-two-satisfied.txt
 '
 
+test_expect_success 'commit-graph auto condition with merges' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		git config set maintenance.auto false &&
+		test_commit initial &&
+		git switch --create feature &&
+		test_commit feature-1 &&
+		test_commit feature-2 &&
+		git switch - &&
+		test_commit main-1 &&
+		test_commit main-2 &&
+		git merge feature &&
+
+		# We have 6 commits, none of which are covered by a commit
+		# graph. So this must be the boundary at which we start to
+		# perform maintenance.
+		test_must_fail git -c maintenance.commit-graph.auto=7 \
+			maintenance is-needed --auto --task=commit-graph &&
+		git -c maintenance.commit-graph.auto=6 \
+			maintenance is-needed --auto --task=commit-graph
+	)
+'
+
 test_expect_success 'run --task=bogus' '
 	test_must_fail git maintenance run --task=bogus 2>err &&
 	test_grep "is not a valid task" err

From 3d099686560b848fefe71b7e8edf70d1674b9c73 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt <ps@pks.im>
Date: Tue, 6 Jan 2026 13:58:50 +0100
Subject: [PATCH 325/553] odb: properly close sources before freeing them

It is possible to hit a memory leak when reading data from a submodule
via git-grep(1):

  Direct leak of 192 byte(s) in 1 object(s) allocated from:
    #0 0x55555562e726 in calloc (git+0xda726)
    #1 0x555555964734 in xcalloc ../wrapper.c:154:8
    #2 0x555555835136 in load_multi_pack_index_one ../midx.c:135:2
    #3 0x555555834fd6 in load_multi_pack_index ../midx.c:382:6
    #4 0x5555558365b6 in prepare_multi_pack_index_one ../midx.c:716:17
    #5 0x55555586c605 in packfile_store_prepare ../packfile.c:1103:3
    #6 0x55555586c90c in packfile_store_reprepare ../packfile.c:1118:2
    #7 0x5555558546b3 in odb_reprepare ../odb.c:1106:2
    #8 0x5555558539e4 in do_oid_object_info_extended ../odb.c:715:4
    #9 0x5555558533d1 in odb_read_object_info_extended ../odb.c:862:8
    #10 0x5555558540bd in odb_read_object ../odb.c:920:6
    #11 0x55555580a330 in grep_source_load_oid ../grep.c:1934:12
    #12 0x55555580a13a in grep_source_load ../grep.c:1986:10
    #13 0x555555809103 in grep_source_is_binary ../grep.c:2014:7
    #14 0x555555807574 in grep_source_1 ../grep.c:1625:8
    #15 0x555555807322 in grep_source ../grep.c:1837:10
    #16 0x5555556a5c58 in run ../builtin/grep.c:208:10
    #17 0x55555562bb42 in void* ThreadStartFunc<false>(void*) lsan_interceptors.cpp.o
    #18 0x7ffff7a9a979 in start_thread (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x9a979) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
    #19 0x7ffff7b22d2b in __GI___clone3 (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x122d2b) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)

The root cause of this leak is the way we set up and release the
submodule:

  1. We use `repo_submodule_init()` to initialize a new repository. This
     repository is stored in `repos_to_free`.

  2. We now read data from the submodule repository.

  3. We then call `repo_clear()` on the submodule repositories.

  4. `repo_clear()` calls `odb_free()`.

  5. `odb_free()` calls `odb_free_sources()` followed by `odb_close()`.

The issue here is the 5th step: we call `odb_free_sources()` _before_ we
call `odb_close()`. But `odb_free_sources()` already frees all sources,
so the logic that closes them in `odb_close()` now becomes a no-op. As a
consequence, we never explicitly close sources at all.

Fix the leak by closing the store before we free the sources.
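
Why the order matters can be shown with a minimal standalone sketch. It
assumes a store that owns a linked list of sources, each of which must
be closed before it is freed; `struct store`, `store_close()` and
`store_free_sources()` are hypothetical names and not the actual odb
API:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical source that holds a resource until it is closed. */
struct source {
	int is_open;
	struct source *next;
};

struct store {
	struct source *sources;
	int nr_closed; /* how many sources were explicitly closed */
};

static void store_add_source(struct store *s)
{
	struct source *src = malloc(sizeof(*src));
	assert(src);
	src->is_open = 1;
	src->next = s->sources;
	s->sources = src;
}

/* Close every source that is still attached to the store. */
static void store_close(struct store *s)
{
	struct source *src;

	for (src = s->sources; src; src = src->next) {
		if (src->is_open) {
			src->is_open = 0;
			s->nr_closed++;
		}
	}
}

/* Free the list itself; this does not close anything. */
static void store_free_sources(struct store *s)
{
	while (s->sources) {
		struct source *next = s->sources->next;
		free(s->sources);
		s->sources = next;
	}
}

/* Tear down a store with two sources and report how many got closed. */
static int teardown(int close_first)
{
	struct store s = { NULL, 0 };

	store_add_source(&s);
	store_add_source(&s);

	if (close_first)
		store_close(&s);
	store_free_sources(&s);
	if (!close_first)
		store_close(&s); /* no-op: the list is already empty */

	return s.nr_closed;
}
```

Closing first closes both sources; closing after the sources were freed
closes none, so whatever the sources held would never be released.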

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 odb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/odb.c b/odb.c
index dc8f292f3d9645..8e67afe185eae9 100644
--- a/odb.c
+++ b/odb.c
@@ -1132,13 +1132,13 @@ void odb_free(struct object_database *o)
 	oidmap_clear(&o->replace_map, 1);
 	pthread_mutex_destroy(&o->replace_mutex);
 
+	odb_close(o);
 	odb_free_sources(o);
 
 	for (size_t i = 0; i < o->cached_object_nr; i++)
 		free((char *) o->cached_objects[i].value.buf);
 	free(o->cached_objects);
 
-	odb_close(o);
 	packfile_store_free(o->packfiles);
 	string_list_clear(&o->submodule_source_paths, 0);
 

From d529f3a197364881746f558e5652f0236131eb86 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Thu, 8 Jan 2026 15:58:11 +0900
Subject: [PATCH 326/553] The 16th batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/RelNotes/2.53.0.adoc | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc
index 9e8384a4c101ac..32b6966c9e4dc4 100644
--- a/Documentation/RelNotes/2.53.0.adoc
+++ b/Documentation/RelNotes/2.53.0.adoc
@@ -228,6 +228,16 @@ Fixes since v2.52
    handling stateful ISO/IEC 2022 encoded strings.
    (merge cee341e9dd rs/macos-iconv-workaround later to maint).
 
+ * Running "git diff" with "--name-only" and other options that allows
+   us not to look at the blob contents, while objects that are lazily
+   fetched from a promisor remote, caused use-after-free, which has
+   been corrected.
+
+ * The ort merge machinery hit an assertion failure in a history with
+   criss-cross merges renamed a directory and a non-directory, which
+   has been corrected.
+   (merge 979ee83e8a en/ort-recursive-d-f-conflict-fix later to maint).
+
  * Other code cleanup, docfix, build fix, etc.
    (merge 46207a54cc qj/doc-http-bad-want-response later to maint).
    (merge df90eccd93 kh/doc-commit-extra-references later to maint).
@@ -252,3 +262,4 @@ Fixes since v2.52
    (merge 93f894c001 bc/checkout-error-message-fix later to maint).
    (merge abf05d856f rs/show-branch-prio-queue later to maint).
    (merge 06188ea5f3 rs/parse-config-expiry-simplify later to maint).
+   (merge 861dbb1586 dd/t5403-modernise later to maint).

From e97678c4efe315e2bdae7eb28ccff8f4203c650b Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sat, 10 Jan 2026 11:06:44 +0000
Subject: [PATCH 327/553] .mailmap: replace Karsten Blees' default address

As per a recent email by Karsten, the @dcon.de address no longer works:
https://lore.kernel.org/git/77e768b2-6693-454f-9e11-fb0acdec703c@gmail.com

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 .mailmap | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.mailmap b/.mailmap
index 7b3198171fad1e..3cf26b1add06cd 100644
--- a/.mailmap
+++ b/.mailmap
@@ -140,8 +140,8 @@ Junio C Hamano <gitster@pobox.com> <junkio@twinsun.com>
 Kaartic Sivaraam <kaartic.sivaraam@gmail.com> <kaarticsivaraam91196@gmail.com>
 Karl Wiberg <kha@treskal.com> Karl  Hasselström
 Karl Wiberg <kha@treskal.com> <kha@yoghurt.hemma.treskal.com>
-Karsten Blees <blees@dcon.de> <karsten.blees@dcon.de>
-Karsten Blees <blees@dcon.de> <karsten.blees@gmail.com>
+Karsten Blees <karsten.blees@gmail.com> <karsten.blees@dcon.de>
+Karsten Blees <karsten.blees@gmail.com> <blees@dcon.de>
 Kay Sievers <kay.sievers@vrfy.org> <kay.sievers@suse.de>
 Kay Sievers <kay.sievers@vrfy.org> <kay@mam.(none)>
 Kazuki Saitoh <ksaitoh560@gmail.com> kazuki saitoh <ksaitoh560@gmail.com>

From 8745eae506f700657882b9e32b2aa00f234a6fb6 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Sun, 11 Jan 2026 21:54:28 -0800
Subject: [PATCH 328/553] The 17th batch

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/RelNotes/2.53.0.adoc | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc
index 32b6966c9e4dc4..35a1ab91edc7fd 100644
--- a/Documentation/RelNotes/2.53.0.adoc
+++ b/Documentation/RelNotes/2.53.0.adoc
@@ -238,6 +238,13 @@ Fixes since v2.52
    has been corrected.
    (merge 979ee83e8a en/ort-recursive-d-f-conflict-fix later to maint).
 
+ * Diagnose invalid bundle-URI that lack the URI entry, instead of
+   crashing.
+   (merge 7796c14a1a sb/bundle-uri-without-uri later to maint).
+
+ * Mailmap update for Karsten
+   (merge e97678c4ef js/mailmap-karsten-blees later to maint).
+
  * Other code cleanup, docfix, build fix, etc.
    (merge 46207a54cc qj/doc-http-bad-want-response later to maint).
    (merge df90eccd93 kh/doc-commit-extra-references later to maint).
@@ -263,3 +270,4 @@ Fixes since v2.52
    (merge abf05d856f rs/show-branch-prio-queue later to maint).
    (merge 06188ea5f3 rs/parse-config-expiry-simplify later to maint).
    (merge 861dbb1586 dd/t5403-modernise later to maint).
+   (merge acffc5e9e5 ja/doc-synopsis-style-more later to maint).

From 7264e61d87e58b9d0f5e6424c47c11e9657dfb75 Mon Sep 17 00:00:00 2001
From: Junio C Hamano <gitster@pobox.com>
Date: Thu, 15 Jan 2026 05:59:37 -0800
Subject: [PATCH 329/553] Git 2.53-rc0

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/RelNotes/2.53.0.adoc | 7 +++++++
 GIT-VERSION-GEN                    | 2 +-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc
index 35a1ab91edc7fd..dcdebe8954f067 100644
--- a/Documentation/RelNotes/2.53.0.adoc
+++ b/Documentation/RelNotes/2.53.0.adoc
@@ -34,6 +34,10 @@ UI, Workflows & Features
  * More object database related information are shown in "git repo
    structure" output.
 
+ * Improve the error message when a bad argument is given to the
+   `--onto` option of "git replay".  Test coverage of "git replay" has
+   been improved.
+
 
 Performance, Internal Implementation, Development Support etc.
 --------------------------------------------------------------
@@ -92,6 +96,9 @@ Performance, Internal Implementation, Development Support etc.
  * Use hook API to replace ad-hoc invocation of hook scripts with the
    run_command() API.
 
+ * Import newer version of "clar", unit testing framework.
+   (merge 84071a6dea ps/clar-integers later to maint).
+
 
 Fixes since v2.52
 -----------------
diff --git a/GIT-VERSION-GEN b/GIT-VERSION-GEN
index 1f7af0328a0461..5adc4afd67488c 100755
--- a/GIT-VERSION-GEN
+++ b/GIT-VERSION-GEN
@@ -1,6 +1,6 @@
 #!/bin/sh
 
-DEF_VER=v2.52.GIT
+DEF_VER=v2.53.0-rc0
 
 LF='
 '

From 054d29725f1078b1068d8679cc36423e9c008234 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 6 Nov 2024 20:34:50 +0100
Subject: [PATCH 330/553] sideband: mask control characters

The output of `git clone` is a vital component for understanding what
has happened when things go wrong. However, these logs are partially
under the control of the remote server (via the "sideband", which
typically contains what the remote `git pack-objects` process sends to
`stderr`), and are currently not sanitized by Git.

This makes Git susceptible to ANSI escape sequence injection (see
CWE-150, https://cwe.mitre.org/data/definitions/150.html), which allows
attackers to corrupt terminal state, to hide information, and even to
insert characters into the input buffer (i.e. as if the user had typed
those characters).

To plug this vulnerability, disallow any control character in the
sideband, replacing them instead with the common `^<letter/symbol>`
notation (e.g. `^[` for `\x1b`, `^A` for `\x01`).

There is likely a need for more fine-grained controls instead of using a
"heavy hammer" like this, which will be introduced subsequently.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
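
For reviewers, the masking can be tried out in isolation with the
minimal sketch below. It is not the code in this patch: the name
`mask_control_chars()` and the plain `char` buffer are assumptions made
for illustration (the patch itself appends to a `struct strbuf`). The
sketch XORs with 0x40, which agrees with adding 0x40 for the bytes
0x01..0x1f and additionally maps DEL (0x7f) to `^?`.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical stand-alone helper (not Git's strbuf API): copy up to n
 * bytes from src to dst, stopping early at a NUL byte, and mask control
 * characters in caret notation. Tabs and line feeds pass through.
 * dst must have room for 2 * n + 1 bytes. Returns the number of bytes
 * written, not counting the terminating NUL.
 */
size_t mask_control_chars(char *dst, const char *src, size_t n)
{
	size_t out = 0;

	for (; n && *src; src++, n--) {
		unsigned char c = (unsigned char)*src;

		if (!iscntrl(c) || c == '\t' || c == '\n') {
			dst[out++] = (char)c;
		} else {
			dst[out++] = '^';
			/* 0x1b becomes '[', 0x01 becomes 'A', 0x7f becomes '?' */
			dst[out++] = (char)(c ^ 0x40);
		}
	}
	dst[out] = '\0';
	return out;
}
```

Sizing the destination at twice the input plus one byte covers the worst
case in which every input byte is a control character.
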
 sideband.c                          | 17 +++++++++++++++--
 t/t5409-colorize-remote-messages.sh | 12 ++++++++++++
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/sideband.c b/sideband.c
index ea7c25211ef7e1..d2e6023e60e5ed 100644
--- a/sideband.c
+++ b/sideband.c
@@ -66,6 +66,19 @@ void list_config_color_sideband_slots(struct string_list *list, const char *pref
 		list_config_item(list, prefix, keywords[i].keyword);
 }
 
+static void strbuf_add_sanitized(struct strbuf *dest, const char *src, int n)
+{
+	strbuf_grow(dest, n);
+	for (; n && *src; src++, n--) {
+		if (!iscntrl(*src) || *src == '\t' || *src == '\n')
+			strbuf_addch(dest, *src);
+		else {
+			strbuf_addch(dest, '^');
+			strbuf_addch(dest, 0x40 + *src);
+		}
+	}
+}
+
 /*
  * Optionally highlight one keyword in remote output if it appears at the start
  * of the line. This should be called for a single line only, which is
@@ -81,7 +94,7 @@ static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
 	int i;
 
 	if (!want_color_stderr(use_sideband_colors())) {
-		strbuf_add(dest, src, n);
+		strbuf_add_sanitized(dest, src, n);
 		return;
 	}
 
@@ -114,7 +127,7 @@ static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
 		}
 	}
 
-	strbuf_add(dest, src, n);
+	strbuf_add_sanitized(dest, src, n);
 }
 
 
diff --git a/t/t5409-colorize-remote-messages.sh b/t/t5409-colorize-remote-messages.sh
index fa5de4500a4f50..d0745c391b2625 100755
--- a/t/t5409-colorize-remote-messages.sh
+++ b/t/t5409-colorize-remote-messages.sh
@@ -98,4 +98,16 @@ test_expect_success 'fallback to color.ui' '
 	grep "<BOLD;RED>error<RESET>: error" decoded
 '
 
+test_expect_success 'disallow (color) control sequences in sideband' '
+	write_script .git/color-me-surprised <<-\EOF &&
+	printf "error: Have you \\033[31mread\\033[m this?\\n" >&2
+	exec "$@"
+	EOF
+	test_config_global uploadPack.packObjectshook ./color-me-surprised &&
+	test_commit need-at-least-one-commit &&
+	git clone --no-local . throw-away 2>stderr &&
+	test_decode_color <stderr >decoded &&
+	test_grep ! RED decoded
+'
+
 test_done

From 11155abae59a8d8f91a0d519d9d93b8a15a18e01 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 6 Nov 2024 21:07:51 +0100
Subject: [PATCH 331/553] sideband: introduce an "escape hatch" to allow
 control characters

The preceding commit fixed the vulnerability whereby sideband messages
(that are under the control of the remote server) could contain ANSI
escape sequences that would be sent to the terminal verbatim.

However, this fix may not be desirable under all circumstances, e.g.
when remote servers deliberately add coloring to their messages to
increase their urgency.

To help with those use cases, give users a way to opt out of the
protections: `sideband.allowControlCharacters`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.adoc           |  2 ++
 Documentation/config/sideband.adoc  |  5 +++++
 sideband.c                          | 10 ++++++++++
 t/t5409-colorize-remote-messages.sh |  8 +++++++-
 4 files changed, 24 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/config/sideband.adoc

diff --git a/Documentation/config.adoc b/Documentation/config.adoc
index 62eebe7c54501c..dcea3c0c15e2a9 100644
--- a/Documentation/config.adoc
+++ b/Documentation/config.adoc
@@ -523,6 +523,8 @@ include::config/sequencer.adoc[]
 
 include::config/showbranch.adoc[]
 
+include::config/sideband.adoc[]
+
 include::config/sparse.adoc[]
 
 include::config/splitindex.adoc[]
diff --git a/Documentation/config/sideband.adoc b/Documentation/config/sideband.adoc
new file mode 100644
index 00000000000000..3fb5045cd79581
--- /dev/null
+++ b/Documentation/config/sideband.adoc
@@ -0,0 +1,5 @@
+sideband.allowControlCharacters::
+	By default, control characters that are delivered via the sideband
+	are masked, to prevent potentially unwanted ANSI escape sequences
+	from being sent to the terminal. Use this config setting to override
+	this behavior.
diff --git a/sideband.c b/sideband.c
index d2e6023e60e5ed..ecba71e6610dc4 100644
--- a/sideband.c
+++ b/sideband.c
@@ -26,6 +26,8 @@ static struct keyword_entry keywords[] = {
 	{ "error",	GIT_COLOR_BOLD_RED },
 };
 
+static int allow_control_characters;
+
 /* Returns a color setting (GIT_COLOR_NEVER, etc). */
 static enum git_colorbool use_sideband_colors(void)
 {
@@ -39,6 +41,9 @@ static enum git_colorbool use_sideband_colors(void)
 	if (use_sideband_colors_cached != GIT_COLOR_UNKNOWN)
 		return use_sideband_colors_cached;
 
+	repo_config_get_bool(the_repository, "sideband.allowcontrolcharacters",
+			    &allow_control_characters);
+
 	if (!repo_config_get_string_tmp(the_repository, key, &value))
 		use_sideband_colors_cached = git_config_colorbool(key, value);
 	else if (!repo_config_get_string_tmp(the_repository, "color.ui", &value))
@@ -68,6 +73,11 @@ void list_config_color_sideband_slots(struct string_list *list, const char *pref
 
 static void strbuf_add_sanitized(struct strbuf *dest, const char *src, int n)
 {
+	if (allow_control_characters) {
+		strbuf_add(dest, src, n);
+		return;
+	}
+
 	strbuf_grow(dest, n);
 	for (; n && *src; src++, n--) {
 		if (!iscntrl(*src) || *src == '\t' || *src == '\n')
diff --git a/t/t5409-colorize-remote-messages.sh b/t/t5409-colorize-remote-messages.sh
index d0745c391b2625..fb31e8525418a1 100755
--- a/t/t5409-colorize-remote-messages.sh
+++ b/t/t5409-colorize-remote-messages.sh
@@ -105,9 +105,15 @@ test_expect_success 'disallow (color) control sequences in sideband' '
 	EOF
 	test_config_global uploadPack.packObjectshook ./color-me-surprised &&
 	test_commit need-at-least-one-commit &&
+
 	git clone --no-local . throw-away 2>stderr &&
 	test_decode_color <stderr >decoded &&
-	test_grep ! RED decoded
+	test_grep ! RED decoded &&
+
+	rm -rf throw-away &&
+	git -c sideband.allowControlCharacters clone --no-local . throw-away 2>stderr &&
+	test_decode_color <stderr >decoded &&
+	test_grep RED decoded
 '
 
 test_done

From 08bff143a78f65c87f0ebfe8db74e556bee4568e Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 18 Nov 2024 21:42:57 +0100
Subject: [PATCH 332/553] sideband: do allow ANSI color sequences by default

The preceding two commits introduced special handling of the sideband
channel to neutralize ANSI escape sequences before sending the payload
to the terminal, and `sideband.allowControlCharacters` to override that
behavior.

However, some `pre-receive` hooks that are actively used in practice
want to color their messages and therefore rely on the fact that Git
passes them through to the terminal.

In contrast to other ANSI escape sequences, it is highly unlikely that
coloring sequences can be essential tools in attack vectors that mislead
Git users e.g. by hiding crucial information.

Therefore we can have both: Continue to allow ANSI coloring sequences to
be passed to the terminal, and neutralize all other ANSI escape
sequences.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
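
The shape of the sequences that are let through, ESC [ [<n> [; <n>]*] m,
can be checked on its own with the minimal sketch below. The name
`ansi_color_sequence_len()` is hypothetical. Unlike the helper in this
patch, which returns the offset of the final `m` and leaves the last
byte to the caller's loop increment, the sketch returns the full length
of the sequence, or 0 if the input does not start with one.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: if src (n bytes long) starts with an ANSI color
 * sequence of the form ESC [ [<n> [; <n>]*] m, return the length of
 * that sequence in bytes. Otherwise return 0.
 */
size_t ansi_color_sequence_len(const char *src, size_t n)
{
	size_t i;

	if (n < 3 || src[0] != '\x1b' || src[1] != '[')
		return 0;

	for (i = 2; i < n; i++) {
		if (src[i] == 'm')
			return i + 1;
		/* only digits and ';' may appear before the final 'm' */
		if (!isdigit((unsigned char)src[i]) && src[i] != ';')
			break;
	}

	return 0; /* truncated, or some other escape sequence */
}
```

A sequence such as ESC [ 2 J (clear screen) is rejected because `J` is
neither a digit, a semicolon, nor the terminating `m`.
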
 Documentation/config/sideband.adoc  | 17 ++++++--
 sideband.c                          | 61 ++++++++++++++++++++++++++---
 t/t5409-colorize-remote-messages.sh | 16 +++++++-
 3 files changed, 84 insertions(+), 10 deletions(-)

diff --git a/Documentation/config/sideband.adoc b/Documentation/config/sideband.adoc
index 3fb5045cd79581..f347fd6b33004a 100644
--- a/Documentation/config/sideband.adoc
+++ b/Documentation/config/sideband.adoc
@@ -1,5 +1,16 @@
 sideband.allowControlCharacters::
 	By default, control characters that are delivered via the sideband
-	are masked, to prevent potentially unwanted ANSI escape sequences
-	from being sent to the terminal. Use this config setting to override
-	this behavior.
+	are masked, except ANSI color sequences. This prevents potentially
+	unwanted ANSI escape sequences from being sent to the terminal. Use
+	this config setting to override this behavior:
++
+--
+	color::
+		Allow ANSI color sequences, line feeds and horizontal tabs,
+		but mask all other control characters. This is the default.
+	false::
+		Mask all control characters other than line feeds and
+		horizontal tabs.
+	true::
+		Allow all control characters to be sent to the terminal.
+--
diff --git a/sideband.c b/sideband.c
index ecba71e6610dc4..17d0d5b7198332 100644
--- a/sideband.c
+++ b/sideband.c
@@ -26,7 +26,11 @@ static struct keyword_entry keywords[] = {
 	{ "error",	GIT_COLOR_BOLD_RED },
 };
 
-static int allow_control_characters;
+static enum {
+	ALLOW_NO_CONTROL_CHARACTERS = 0,
+	ALLOW_ALL_CONTROL_CHARACTERS = 1,
+	ALLOW_ANSI_COLOR_SEQUENCES = 2
+} allow_control_characters = ALLOW_ANSI_COLOR_SEQUENCES;
 
 /* Returns a color setting (GIT_COLOR_NEVER, etc). */
 static enum git_colorbool use_sideband_colors(void)
@@ -41,8 +45,24 @@ static enum git_colorbool use_sideband_colors(void)
 	if (use_sideband_colors_cached != GIT_COLOR_UNKNOWN)
 		return use_sideband_colors_cached;
 
-	repo_config_get_bool(the_repository, "sideband.allowcontrolcharacters",
-			    &allow_control_characters);
+	switch (repo_config_get_maybe_bool(the_repository, "sideband.allowcontrolcharacters", &i)) {
+	case 0: /* Boolean value */
+		allow_control_characters = i ? ALLOW_ALL_CONTROL_CHARACTERS :
+			ALLOW_NO_CONTROL_CHARACTERS;
+		break;
+	case -1: /* non-Boolean value */
+		if (repo_config_get_string_tmp(the_repository, "sideband.allowcontrolcharacters",
+					      &value))
+			; /* huh? `get_maybe_bool()` returned -1 */
+		else if (!strcmp(value, "color"))
+			allow_control_characters = ALLOW_ANSI_COLOR_SEQUENCES;
+		else
+			warning(_("unrecognized value for `sideband."
+				  "allowControlCharacters`: '%s'"), value);
+		break;
+	default:
+		break; /* not configured */
+	}
 
 	if (!repo_config_get_string_tmp(the_repository, key, &value))
 		use_sideband_colors_cached = git_config_colorbool(key, value);
@@ -71,9 +91,37 @@ void list_config_color_sideband_slots(struct string_list *list, const char *pref
 		list_config_item(list, prefix, keywords[i].keyword);
 }
 
+static int handle_ansi_color_sequence(struct strbuf *dest, const char *src, int n)
+{
+	int i;
+
+	/*
+	 * Valid ANSI color sequences are of the form
+	 *
+	 * ESC [ [<n> [; <n>]*] m
+	 */
+
+	if (allow_control_characters != ALLOW_ANSI_COLOR_SEQUENCES ||
+	    n < 3 || src[0] != '\x1b' || src[1] != '[')
+		return 0;
+
+	for (i = 2; i < n; i++) {
+		if (src[i] == 'm') {
+			strbuf_add(dest, src, i + 1);
+			return i;
+		}
+		if (!isdigit(src[i]) && src[i] != ';')
+			break;
+	}
+
+	return 0;
+}
+
 static void strbuf_add_sanitized(struct strbuf *dest, const char *src, int n)
 {
-	if (allow_control_characters) {
+	int i;
+
+	if (allow_control_characters == ALLOW_ALL_CONTROL_CHARACTERS) {
 		strbuf_add(dest, src, n);
 		return;
 	}
@@ -82,7 +130,10 @@ static void strbuf_add_sanitized(struct strbuf *dest, const char *src, int n)
 	for (; n && *src; src++, n--) {
 		if (!iscntrl(*src) || *src == '\t' || *src == '\n')
 			strbuf_addch(dest, *src);
-		else {
+		else if ((i = handle_ansi_color_sequence(dest, src, n))) {
+			src += i;
+			n -= i;
+		} else {
 			strbuf_addch(dest, '^');
 			strbuf_addch(dest, 0x40 + *src);
 		}
diff --git a/t/t5409-colorize-remote-messages.sh b/t/t5409-colorize-remote-messages.sh
index fb31e8525418a1..a755c49a74e634 100755
--- a/t/t5409-colorize-remote-messages.sh
+++ b/t/t5409-colorize-remote-messages.sh
@@ -100,7 +100,7 @@ test_expect_success 'fallback to color.ui' '
 
 test_expect_success 'disallow (color) control sequences in sideband' '
 	write_script .git/color-me-surprised <<-\EOF &&
-	printf "error: Have you \\033[31mread\\033[m this?\\n" >&2
+	printf "error: Have you \\033[31mread\\033[m this?\\a\\n" >&2
 	exec "$@"
 	EOF
 	test_config_global uploadPack.packObjectshook ./color-me-surprised &&
@@ -108,12 +108,24 @@ test_expect_success 'disallow (color) control sequences in sideband' '
 
 	git clone --no-local . throw-away 2>stderr &&
 	test_decode_color <stderr >decoded &&
+	test_grep RED decoded &&
+	test_grep "\\^G" stderr &&
+	tr -dc "\\007" <stderr >actual &&
+	test_must_be_empty actual &&
+
+	rm -rf throw-away &&
+	git -c sideband.allowControlCharacters=false \
+		clone --no-local . throw-away 2>stderr &&
+	test_decode_color <stderr >decoded &&
 	test_grep ! RED decoded &&
+	test_grep "\\^G" stderr &&
 
 	rm -rf throw-away &&
 	git -c sideband.allowControlCharacters clone --no-local . throw-away 2>stderr &&
 	test_decode_color <stderr >decoded &&
-	test_grep RED decoded
+	test_grep RED decoded &&
+	tr -dc "\\007" <stderr >actual &&
+	test_file_not_empty actual
 '
 
 test_done

From 7bc663f3a4df42a81aa5d378f0e097d6dac295ea Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 30 Oct 2024 19:48:46 +0100
Subject: [PATCH 333/553] unix-socket: avoid leak when initialization fails

When a Unix socket is initialized, the current directory's path is
stored so that the cleanup code can `chdir()` back to where it was
before exit.

If the path that needs to be stored exceeds the default size of the
`sun_path` attribute of `struct sockaddr_un` (which is defined as a
108-sized byte array on Linux), a larger buffer needs to be allocated so
that it can hold the path, and it is the responsibility of the
`unix_sockaddr_cleanup()` function to release that allocated memory.

In Git's CI, this extra allocation is not necessary because the code is
checked out to `/home/runner/work/git/git`. Concatenate the path
`t/trash directory.t0301-credential-cache/.cache/git/credential/socket`
and a terminating NUL, and you end up with 96 bytes, 12 shy of the
default `sun_path` size.

However, I use worktrees with slightly longer paths:
`/home/me/projects/git/yes/i/nest/worktrees/to/organize/them/` is more
in line with what I have. When I recently tried to locally reproduce a
failure of the `linux-leaks` CI job, this t0301 test failed (where it
had not failed in CI).

The reason: When `credential-cache` tries to reach its daemon initially
by calling `unix_sockaddr_init()`, it is expected that the daemon cannot
be reached (the idea is to spin up the daemon in that case and try
again). However, when this first call to `unix_sockaddr_init()` fails,
the code returns early from the `unix_stream_connect()` function
_without_ giving the cleanup code a chance to run, skipping the
deallocation of above-mentioned path.

The fix is easy: do not return early but instead go directly to the
cleanup code.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
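
The pattern behind the fix, routing every failure through one cleanup
label, can be illustrated with the minimal sketch below. All names in it
(`struct ctx`, `ctx_init()`, `ctx_cleanup()`, `connect_sketch()`) are
hypothetical and only loosely modeled on the description above. It
assumes, as the fix does, that the cleanup function tolerates a context
whose initialization failed part-way.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical context that remembers a directory to return to. */
struct ctx {
	char *orig_dir;
};

static int live_allocations; /* bookkeeping for this demonstration only */

/* May allocate ctx->orig_dir and still report failure afterwards. */
static int ctx_init(struct ctx *ctx, const char *dir, int fail)
{
	size_t len = strlen(dir) + 1;

	ctx->orig_dir = malloc(len);
	if (!ctx->orig_dir)
		return -1;
	memcpy(ctx->orig_dir, dir, len);
	live_allocations++;
	return fail ? -1 : 0;
}

/* Safe to call on a zeroed or partially initialized context. */
static void ctx_cleanup(struct ctx *ctx)
{
	if (ctx->orig_dir) {
		free(ctx->orig_dir);
		ctx->orig_dir = NULL;
		live_allocations--;
	}
}

/* Returns 0 on success, -1 on failure; never leaks ctx.orig_dir. */
int connect_sketch(const char *dir, int fail_init)
{
	struct ctx ctx = { NULL };
	int ret = -1;

	if (ctx_init(&ctx, dir, fail_init) < 0)
		goto fail; /* an early "return -1" here would skip the cleanup */
	ret = 0;
fail:
	ctx_cleanup(&ctx);
	return ret;
}

int outstanding_allocations(void)
{
	return live_allocations;
}
```

Replacing the `goto fail` with `return -1` leaves one allocation
outstanding whenever `ctx_init()` fails after allocating, which is the
kind of leak a leak checker reports.
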
 unix-socket.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/unix-socket.c b/unix-socket.c
index 8860203c3f46dc..1fa0cf6c15c721 100644
--- a/unix-socket.c
+++ b/unix-socket.c
@@ -84,7 +84,7 @@ int unix_stream_connect(const char *path, int disallow_chdir)
 	struct unix_sockaddr_context ctx;
 
 	if (unix_sockaddr_init(&sa, path, &ctx, disallow_chdir) < 0)
-		return -1;
+		goto fail;
 	fd = socket(AF_UNIX, SOCK_STREAM, 0);
 	if (fd < 0)
 		goto fail;

From f33c39ddcb221b011bbdb937d0ac5a3b6f8d2112 Mon Sep 17 00:00:00 2001
From: Jeff King <peff@peff.net>
Date: Mon, 13 Jan 2025 01:26:01 -0500
Subject: [PATCH 334/553] grep: prevent `^$` false match at end of file

In some implementations, `regexec_buf()` assumes that it is fed lines;
without `REG_NOTEOL` it thinks the end of the buffer is the end of a
line. Which makes sense, but trips up this case because we are not
feeding lines, but rather a whole buffer. So the final newline is not
the start of an empty line, but the true end of the buffer.

This causes an interesting bug:

  $ echo content >file.txt
  $ git grep --no-index -n '^$' file.txt
  file.txt:2:

This bug is fixed by making the end of the buffer consistently the end
of the final line.

The patch was applied from
https://lore.kernel.org/git/20250113062601.GD767856@coredump.intra.peff.net/

Reported-by: Olly Betts <olly@survex.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 grep.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/grep.c b/grep.c
index c7e1dc1e0ee4fe..4fc12251880544 100644
--- a/grep.c
+++ b/grep.c
@@ -1646,6 +1646,8 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 
 	bol = gs->buf;
 	left = gs->size;
+	if (left && gs->buf[left-1] == '\n')
+		left--;
 	while (left) {
 		const char *eol;
 		int hit;

From d4430418ce6945e81d10631fa6af269d47f8cff4 Mon Sep 17 00:00:00 2001
From: Sverre Rabbelier <srabbelier@gmail.com>
Date: Sun, 24 Jul 2011 15:54:04 +0200
Subject: [PATCH 335/553] t9350: point out that refs are not updated correctly

This happens only when the corresponding commits are not exported in
the current fast-export run. This can happen either when the relevant
commit is already marked, or when the commit is explicitly marked
as UNINTERESTING with a negative ref by another argument.

This breaks fast-export based remote helpers.

Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
---
 t/t9350-fast-export.sh | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh
index 3d153a4805bbfc..171eab4ba7f4d8 100755
--- a/t/t9350-fast-export.sh
+++ b/t/t9350-fast-export.sh
@@ -1010,4 +1010,15 @@ test_expect_success GPG 'export and import of doubly signed commit' '
 	fi
 '
 
+cat > expected << EOF
+reset refs/heads/master
+from $(git rev-parse master)
+
+EOF
+
+test_expect_failure 'refs are updated even if no commits need to be exported' '
+	git fast-export master..master > actual &&
+	test_cmp expected actual
+'
+
 test_done

From f2251177f3465e62eb01a31d4c11349de565b1ee Mon Sep 17 00:00:00 2001
From: Sverre Rabbelier <srabbelier@gmail.com>
Date: Sat, 28 Aug 2010 20:49:01 -0500
Subject: [PATCH 336/553] transport-helper: add trailing --

[PT: ensure we add an additional element to the argv array]

Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 transport-helper.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/transport-helper.c b/transport-helper.c
index 4d95d84f9e4d05..0a48a0d7200942 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -499,6 +499,8 @@ static int get_exporter(struct transport *transport,
 	for (size_t i = 0; i < revlist_args->nr; i++)
 		strvec_push(&fastexport->args, revlist_args->items[i].string);
 
+	strvec_push(&fastexport->args, "--");
+
 	fastexport->git_cmd = 1;
 	return start_command(fastexport);
 }

From aa43d63f645d92c743250a9e7649e3663a33e686 Mon Sep 17 00:00:00 2001
From: Sverre Rabbelier <srabbelier@gmail.com>
Date: Sun, 24 Jul 2011 00:06:00 +0200
Subject: [PATCH 337/553] remote-helper: check helper status after
 import/export

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
---
 t/t5801-remote-helpers.sh |  2 +-
 transport-helper.c        | 15 +++++++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/t/t5801-remote-helpers.sh b/t/t5801-remote-helpers.sh
index d21877150ed82e..3917da47276825 100755
--- a/t/t5801-remote-helpers.sh
+++ b/t/t5801-remote-helpers.sh
@@ -262,7 +262,7 @@ test_expect_success 'push update refs failure' '
 	echo "update fail" >>file &&
 	git commit -a -m "update fail" &&
 	git rev-parse --verify testgit/origin/heads/update >expect &&
-	test_expect_code 1 env GIT_REMOTE_TESTGIT_FAILURE="non-fast forward" \
+	test_must_fail env GIT_REMOTE_TESTGIT_FAILURE="non-fast forward" \
 		git push origin update &&
 	git rev-parse --verify testgit/origin/heads/update >actual &&
 	test_cmp expect actual
diff --git a/transport-helper.c b/transport-helper.c
index 0a48a0d7200942..0032a259828cad 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -505,6 +505,19 @@ static int get_exporter(struct transport *transport,
 	return start_command(fastexport);
 }
 
+static void check_helper_status(struct helper_data *data)
+{
+	int pid, status;
+
+	pid = waitpid(data->helper->pid, &status, WNOHANG);
+	if (pid < 0)
+		die("Could not retrieve status of remote helper '%s'",
+		    data->name);
+	if (pid > 0 && WIFEXITED(status))
+		die("Remote helper '%s' died with %d",
+		    data->name, WEXITSTATUS(status));
+}
+
 static int fetch_with_import(struct transport *transport,
 			     int nr_heads, struct ref **to_fetch)
 {
@@ -541,6 +554,7 @@ static int fetch_with_import(struct transport *transport,
 
 	if (finish_command(&fastimport))
 		die(_("error while running fast-import"));
+	check_helper_status(data);
 
 	/*
 	 * The fast-import stream of a remote helper that advertises
@@ -1160,6 +1174,7 @@ static int push_refs_with_export(struct transport *transport,
 
 	if (finish_command(&exporter))
 		die(_("error while running fast-export"));
+	check_helper_status(data);
 	if (push_update_refs_status(data, remote_refs, flags))
 		return 1;
 

From 2970caddd2dc2129b26d7903034c51fc42d4f815 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 9 Apr 2012 13:04:35 -0500
Subject: [PATCH 338/553] Always auto-gc after calling a fast-import transport

After importing anything with fast-import, we should always let the
garbage collector do its job, since the objects are written to disk
inefficiently.

This brings down an initial import of http://selenic.com/hg from about
230 megabytes to about 14.

In the future, we may want to make this configurable on a per-remote
basis, or maybe teach fast-import about it in the first place.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 transport-helper.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/transport-helper.c b/transport-helper.c
index 0032a259828cad..0055f04dd9825a 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -22,6 +22,8 @@
 #include "packfile.h"
 
 static int debug;
+/* TODO: put somewhere sensible, e.g. git_transport_options? */
+static int auto_gc = 1;
 
 struct helper_data {
 	char *name;
@@ -588,6 +590,13 @@ static int fetch_with_import(struct transport *transport,
 		}
 	}
 	strbuf_release(&buf);
+	if (auto_gc) {
+		struct child_process cmd = CHILD_PROCESS_INIT;
+
+		cmd.git_cmd = 1;
+		strvec_pushl(&cmd.args, "gc", "--auto", "--quiet", NULL);
+		run_command(&cmd);
+	}
 	return 0;
 }
 

From 433fbf252c0794200283e139c3a94a63aaa168d7 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Tue, 18 Apr 2017 12:09:08 +0200
Subject: [PATCH 339/553] mingw: prevent regressions with "drive-less" absolute
 paths

On Windows, there are several categories of absolute paths. One such
category starts with a backslash and is implicitly relative to the drive
associated with the current working directory. Example:

	c:
	git clone https://github.com/git-for-windows/git \G4W

should clone into C:\G4W.

Back in 2017, Juan Carlos Arevalo Baeza reported a bug in Git's handling
of those absolute paths, which was identified and fixed. Let's make sure
that it stays fixed.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5580-unc-paths.sh | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/t/t5580-unc-paths.sh b/t/t5580-unc-paths.sh
index 65ef1a3628ee94..e9df367d5777fd 100755
--- a/t/t5580-unc-paths.sh
+++ b/t/t5580-unc-paths.sh
@@ -20,14 +20,11 @@ fi
 UNCPATH="$(winpwd)"
 case "$UNCPATH" in
 [A-Z]:*)
+	WITHOUTDRIVE="${UNCPATH#?:}"
 	# Use administrative share e.g. \\localhost\C$\git-sdk-64\usr\src\git
 	# (we use forward slashes here because MSYS2 and Git accept them, and
 	# they are easier on the eyes)
-	UNCPATH="//localhost/${UNCPATH%%:*}\$/${UNCPATH#?:}"
-	test -d "$UNCPATH" || {
-		skip_all='could not access administrative share; skipping'
-		test_done
-	}
+	UNCPATH="//localhost/${UNCPATH%%:*}\$$WITHOUTDRIVE"
 	;;
 *)
 	skip_all='skipping UNC path tests, cannot determine current path as UNC'
@@ -35,6 +32,18 @@ case "$UNCPATH" in
 	;;
 esac
 
+test_expect_success 'clone into absolute path lacking a drive prefix' '
+	USINGBACKSLASHES="$(echo "$WITHOUTDRIVE"/without-drive-prefix |
+		tr / \\\\)" &&
+	git clone . "$USINGBACKSLASHES" &&
+	test -f without-drive-prefix/.git/HEAD
+'
+
+test -d "$UNCPATH" || {
+	skip_all='could not access administrative share; skipping'
+	test_done
+}
+
 test_expect_success setup '
 	test_commit initial
 '

From 29434e316184dd77bbaccb60c098078bdd02ed13 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 7 Dec 2018 13:39:30 +0100
Subject: [PATCH 340/553] clean: do not traverse mount points

It seems to be not exactly rare on Windows to install NTFS junction
points (the equivalent of "bind mounts" on Linux/Unix) in worktrees,
e.g. to map some development tools into a subdirectory.

In such a scenario, it is pretty horrible if `git clean -dfx` traverses
into the mapped directory and starts to "clean up".

Let's just not do that. Let's make sure before we traverse into a
directory that it is not a mount point (or junction).

This addresses https://github.com/git-for-windows/git/issues/607

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
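
On non-Windows platforms, the check compares the device ID of the
directory with that of its parent. The minimal sketch below shows that
idea outside of Git. The name `looks_like_mount_point()` and the fixed
4096-byte buffer are assumptions made for illustration; the patch works
on a `struct strbuf` instead.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical stand-alone variant using a fixed buffer instead of
 * Git's strbuf. Returns 1 if dir appears to be a mount point, and 0
 * otherwise (including when an lstat() call fails or the path does not
 * fit into the buffer).
 */
int looks_like_mount_point(const char *dir)
{
	char buf[4096];
	struct stat cur, parent;
	int len;

	if (!strcmp(dir, "/"))
		return 1;

	/* "<dir>/." names the directory itself */
	len = snprintf(buf, sizeof(buf), "%s/.", dir);
	if (len < 0 || (size_t)len >= sizeof(buf) - 1 || lstat(buf, &cur))
		return 0;

	/* one more '.' turns it into "<dir>/..", the parent directory */
	buf[len] = '.';
	buf[len + 1] = '\0';
	if (lstat(buf, &parent))
		return 0;

	/* differing device IDs mean a file system boundary was crossed */
	return cur.st_dev != parent.st_dev;
}
```

A bind mount of a directory onto another location of the same file
system keeps the same device ID, so a comparison like this one would not
flag it.
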
 builtin/clean.c   | 14 ++++++++++++++
 compat/mingw.c    | 22 ++++++++++++++++++++++
 compat/mingw.h    |  3 +++
 git-compat-util.h |  4 ++++
 path.c            | 39 +++++++++++++++++++++++++++++++++++++++
 path.h            |  1 +
 t/t7300-clean.sh  |  9 +++++++++
 7 files changed, 92 insertions(+)

diff --git a/builtin/clean.c b/builtin/clean.c
index 1d5e7e5366bf09..e4f2d56d3210ba 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -41,6 +41,8 @@ static const char *msg_remove = N_("Removing %s\n");
 static const char *msg_would_remove = N_("Would remove %s\n");
 static const char *msg_skip_git_dir = N_("Skipping repository %s\n");
 static const char *msg_would_skip_git_dir = N_("Would skip repository %s\n");
+static const char *msg_skip_mount_point = N_("Skipping mount point %s\n");
+static const char *msg_would_skip_mount_point = N_("Would skip mount point %s\n");
 static const char *msg_warn_remove_failed = N_("failed to remove %s");
 static const char *msg_warn_lstat_failed = N_("could not lstat %s\n");
 static const char *msg_skip_cwd = N_("Refusing to remove current working directory\n");
@@ -185,6 +187,18 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
 		goto out;
 	}
 
+	if (is_mount_point(path)) {
+		if (!quiet) {
+			quote_path(path->buf, prefix, &quoted, 0);
+			printf(dry_run ?
+			       _(msg_would_skip_mount_point) :
+			       _(msg_skip_mount_point), quoted.buf);
+		}
+		*dir_gone = 0;
+
+		goto out;
+	}
+
 	dir = opendir(path->buf);
 	if (!dir) {
 		/* an empty dir could be removed even if it is unreadble */
diff --git a/compat/mingw.c b/compat/mingw.c
index f09b49ff21ddab..f4ea2bc7d4606f 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -2676,6 +2676,28 @@ pid_t waitpid(pid_t pid, int *status, int options)
 	return -1;
 }
 
+int mingw_is_mount_point(struct strbuf *path)
+{
+	WIN32_FIND_DATAW findbuf = { 0 };
+	HANDLE handle;
+	wchar_t wfilename[MAX_PATH];
+	int wlen = xutftowcs_path(wfilename, path->buf);
+	if (wlen < 0)
+		die(_("could not get long path for '%s'"), path->buf);
+
+	/* remove trailing slash, if any */
+	if (wlen > 0 && wfilename[wlen - 1] == L'/')
+		wfilename[--wlen] = L'\0';
+
+	handle = FindFirstFileW(wfilename, &findbuf);
+	if (handle == INVALID_HANDLE_VALUE)
+		return 0;
+	FindClose(handle);
+
+	return (findbuf.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) &&
+		(findbuf.dwReserved0 == IO_REPARSE_TAG_MOUNT_POINT);
+}
+
 int xutftowcsn(wchar_t *wcs, const char *utfs, size_t wcslen, int utflen)
 {
 	int upos = 0, wpos = 0;
diff --git a/compat/mingw.h b/compat/mingw.h
index 444daedfa52469..af6fc3f12970bf 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -36,6 +36,9 @@ static inline void convert_slashes(char *path)
 		if (*path == '\\')
 			*path = '/';
 }
+struct strbuf;
+int mingw_is_mount_point(struct strbuf *path);
+#define is_mount_point mingw_is_mount_point
 #define PATH_SEP ';'
 char *mingw_query_user_email(void);
 #define query_user_email mingw_query_user_email
diff --git a/git-compat-util.h b/git-compat-util.h
index b0673d1a450db5..d461429857911f 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -354,6 +354,10 @@ static inline int git_has_dir_sep(const char *path)
 #define has_dir_sep(path) git_has_dir_sep(path)
 #endif
 
+#ifndef is_mount_point
+#define is_mount_point is_mount_point_via_stat
+#endif
+
 #ifndef query_user_email
 #define query_user_email() NULL
 #endif
diff --git a/path.c b/path.c
index d726537622cda6..962920c1bad4e7 100644
--- a/path.c
+++ b/path.c
@@ -1323,6 +1323,45 @@ char *strip_path_suffix(const char *path, const char *suffix)
 	return offset == -1 ? NULL : xstrndup(path, offset);
 }
 
+int is_mount_point_via_stat(struct strbuf *path)
+{
+	size_t len = path->len;
+	dev_t current_dev;
+	struct stat st;
+
+	if (!strcmp("/", path->buf))
+		return 1;
+
+	strbuf_addstr(path, "/.");
+	if (lstat(path->buf, &st)) {
+		/*
+		 * If we cannot access the current directory, we cannot say
+		 * that it is a bind mount.
+		 */
+		strbuf_setlen(path, len);
+		return 0;
+	}
+	current_dev = st.st_dev;
+
+	/* Now look at the parent directory */
+	strbuf_addch(path, '.');
+	if (lstat(path->buf, &st)) {
+		/*
+		 * If we cannot access the parent directory, we cannot say
+		 * that it is a bind mount.
+		 */
+		strbuf_setlen(path, len);
+		return 0;
+	}
+	strbuf_setlen(path, len);
+
+	/*
+	 * If the device ID differs between current and parent directory,
+	 * then it is a bind mount.
+	 */
+	return current_dev != st.st_dev;
+}
+
 int daemon_avoid_alias(const char *p)
 {
 	int sl, ndot;
diff --git a/path.h b/path.h
index 0ec95a0b079c90..affdb970b1fb86 100644
--- a/path.h
+++ b/path.h
@@ -157,6 +157,7 @@ int normalize_path_copy(char *dst, const char *src);
 int strbuf_normalize_path(struct strbuf *src);
 int longest_ancestor_length(const char *path, struct string_list *prefixes);
 char *strip_path_suffix(const char *path, const char *suffix);
+int is_mount_point_via_stat(struct strbuf *path);
 int daemon_avoid_alias(const char *path);
 
 /*
diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 00d4070156243b..7c3a1ca91df534 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -800,4 +800,13 @@ test_expect_success 'traverse into directories that may have ignored entries' '
 	)
 '
 
+test_expect_success MINGW 'clean does not traverse mount points' '
+	mkdir target &&
+	>target/dont-clean-me &&
+	git init with-mountpoint &&
+	cmd //c "mklink /j with-mountpoint\\mountpoint target" &&
+	git -C with-mountpoint clean -dfx &&
+	test_path_is_file target/dont-clean-me
+'
+
 test_done

From 12ecc9c9bfbdea9fef3d8b0374cdba85b43af692 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sun, 20 Oct 2019 22:08:58 +0200
Subject: [PATCH 341/553] win32/pthread: avoid name clashes with winpthread

When asking the mingw-w64 variant of GCC to compile C11 code, it seems
to link implicitly to libwinpthread, which does implement a pthread
emulation (that is more complete than Git's).

In preparation for vendoring in mimalloc (which requires C11 support),
let's keep preferring Git's own pthread emulation.

To avoid linker errors where it thinks that the `pthread_self` and the
`pthread_create` symbols are defined twice, let's give our version a
`win32_` prefix, just like we already do for `pthread_join()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/win32/pthread.c | 6 +++---
 compat/win32/pthread.h | 8 +++++---
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/compat/win32/pthread.c b/compat/win32/pthread.c
index 7e93146963ec56..dcdf537cce680b 100644
--- a/compat/win32/pthread.c
+++ b/compat/win32/pthread.c
@@ -21,8 +21,8 @@ static unsigned __stdcall win32_start_routine(void *arg)
 	return 0;
 }
 
-int pthread_create(pthread_t *thread, const void *attr UNUSED,
-		   void *(*start_routine)(void *), void *arg)
+int win32_pthread_create(pthread_t *thread, const void *attr UNUSED,
+			 void *(*start_routine)(void *), void *arg)
 {
 	thread->arg = arg;
 	thread->start_routine = start_routine;
@@ -53,7 +53,7 @@ int win32_pthread_join(pthread_t *thread, void **value_ptr)
 	}
 }
 
-pthread_t pthread_self(void)
+pthread_t win32_pthread_self(void)
 {
 	pthread_t t = { NULL };
 	t.tid = GetCurrentThreadId();
diff --git a/compat/win32/pthread.h b/compat/win32/pthread.h
index ccacc5a53ba976..51a3eefac8e9b4 100644
--- a/compat/win32/pthread.h
+++ b/compat/win32/pthread.h
@@ -49,8 +49,9 @@ typedef struct {
 	DWORD tid;
 } pthread_t;
 
-int pthread_create(pthread_t *thread, const void *unused,
-		   void *(*start_routine)(void*), void *arg);
+int win32_pthread_create(pthread_t *thread, const void *unused,
+			 void *(*start_routine)(void*), void *arg);
+#define pthread_create win32_pthread_create
 
 /*
  * To avoid the need of copying a struct, we use small macro wrapper to pass
@@ -61,7 +62,8 @@ int pthread_create(pthread_t *thread, const void *unused,
 int win32_pthread_join(pthread_t *thread, void **value_ptr);
 
 #define pthread_equal(t1, t2) ((t1).tid == (t2).tid)
-pthread_t pthread_self(void);
+pthread_t win32_pthread_self(void);
+#define pthread_self win32_pthread_self
 
 int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);
 

From 1891e11f39a572b30b14591e0ae4f784751e827f Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Tue, 11 Dec 2018 12:55:26 +0100
Subject: [PATCH 342/553] clean: remove mount points when possible

Windows' equivalent to "bind mounts", NTFS junction points, can be
unlinked without affecting the mount target. This is clearly what users
expect to happen when they call `git clean -dfx` in a worktree that
contains NTFS junction points: the junction should be removed, and the
target directory of said junction should be left alone (unless it is
inside the worktree).
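
For comparison, platforms that do not define CAN_UNLINK_MOUNT_POINTS
keep skipping mount points, which the generic fallback detects by
comparing device IDs. A minimal, standalone sketch of that check,
assuming plain C strings instead of `struct strbuf` (the name
`is_mount_point_sketch()` is made up for illustration and is not part
of this patch):

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper mirroring is_mount_point_via_stat(), but taking a
 * plain C string. Returns 1 if `path` looks like a mount point, else 0.
 */
int is_mount_point_sketch(const char *path)
{
	char buf[4096];
	struct stat cur, parent;
	int n;

	if (!strcmp(path, "/"))
		return 1;

	/* Device of the directory itself ("<path>/.") */
	n = snprintf(buf, sizeof(buf), "%s/.", path);
	if (n < 0 || (size_t)n >= sizeof(buf) || lstat(buf, &cur))
		return 0;

	/* Device of its parent ("<path>/..") */
	n = snprintf(buf, sizeof(buf), "%s/..", path);
	if (n < 0 || (size_t)n >= sizeof(buf) || lstat(buf, &parent))
		return 0;

	/* Differing device IDs mean a file system boundary was crossed */
	return cur.st_dev != parent.st_dev;
}
```

As in the fallback, a directory that cannot be accessed is reported as
"not a mount point", so callers err on the side of treating it as an
ordinary directory.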

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/clean.c  | 13 +++++++++++++
 compat/mingw.h   |  1 +
 t/t7300-clean.sh |  1 +
 3 files changed, 15 insertions(+)

diff --git a/builtin/clean.c b/builtin/clean.c
index e4f2d56d3210ba..6ed555000f9a41 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -41,8 +41,10 @@ static const char *msg_remove = N_("Removing %s\n");
 static const char *msg_would_remove = N_("Would remove %s\n");
 static const char *msg_skip_git_dir = N_("Skipping repository %s\n");
 static const char *msg_would_skip_git_dir = N_("Would skip repository %s\n");
+#ifndef CAN_UNLINK_MOUNT_POINTS
 static const char *msg_skip_mount_point = N_("Skipping mount point %s\n");
 static const char *msg_would_skip_mount_point = N_("Would skip mount point %s\n");
+#endif
 static const char *msg_warn_remove_failed = N_("failed to remove %s");
 static const char *msg_warn_lstat_failed = N_("could not lstat %s\n");
 static const char *msg_skip_cwd = N_("Refusing to remove current working directory\n");
@@ -188,6 +190,7 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
 	}
 
 	if (is_mount_point(path)) {
+#ifndef CAN_UNLINK_MOUNT_POINTS
 		if (!quiet) {
 			quote_path(path->buf, prefix, &quoted, 0);
 			printf(dry_run ?
@@ -195,6 +198,16 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
 			       _(msg_skip_mount_point), quoted.buf);
 		}
 		*dir_gone = 0;
+#else
+		if (!dry_run && unlink(path->buf)) {
+			int saved_errno = errno;
+			quote_path(path->buf, prefix, &quoted, 0);
+			errno = saved_errno;
+			warning_errno(_(msg_warn_remove_failed), quoted.buf);
+			*dir_gone = 0;
+			ret = -1;
+		}
+#endif
 
 		goto out;
 	}
diff --git a/compat/mingw.h b/compat/mingw.h
index af6fc3f12970bf..fb83cdaf4e982c 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -39,6 +39,7 @@ static inline void convert_slashes(char *path)
 struct strbuf;
 int mingw_is_mount_point(struct strbuf *path);
 #define is_mount_point mingw_is_mount_point
+#define CAN_UNLINK_MOUNT_POINTS 1
 #define PATH_SEP ';'
 char *mingw_query_user_email(void);
 #define query_user_email mingw_query_user_email
diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 7c3a1ca91df534..6f16f3893191e7 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -806,6 +806,7 @@ test_expect_success MINGW 'clean does not traverse mount points' '
 	git init with-mountpoint &&
 	cmd //c "mklink /j with-mountpoint\\mountpoint target" &&
 	git -C with-mountpoint clean -dfx &&
+	test_path_is_missing with-mountpoint/mountpoint &&
 	test_path_is_file target/dont-clean-me
 '
 

From 59823fe475a130b6987e785a549b7e5d58a54e9b Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 16 Feb 2015 14:06:59 +0100
Subject: [PATCH 343/553] mingw: include the Python parts in the build

While Git for Windows does not _ship_ Python (in order to save on
bandwidth), MSYS2 provides very fine Python interpreters that users can
easily take advantage of, by using Git for Windows within its SDK.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.mak.uname | 1 +
 1 file changed, 1 insertion(+)

diff --git a/config.mak.uname b/config.mak.uname
index 38b35af366d5fd..f00d23b57a9f4c 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -758,6 +758,7 @@ ifeq ($(uname_S),MINGW)
         ifneq (CLANGARM64,$(MSYSTEM))
 		USE_NED_ALLOCATOR = YesPlease
         endif
+	NO_PYTHON =
         ifeq (/mingw64,$(subst 32,64,$(subst clangarm,mingw,$(prefix))))
 		# Move system config into top-level /etc/
 		ETC_GITCONFIG = ../etc/gitconfig

From f4ebed71c617711d2ef39e4a6e4ef00915352579 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 12 Aug 2022 12:44:15 +0200
Subject: [PATCH 344/553] git-compat-util: avoid redeclaring _DEFAULT_SOURCE

We are about to vendor in `mimalloc`'s source code, which will want to
include `compat/posix.h` after defining that constant.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/posix.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/compat/posix.h b/compat/posix.h
index 245386fa4a9f4e..6a137480d93eef 100644
--- a/compat/posix.h
+++ b/compat/posix.h
@@ -70,7 +70,9 @@
 #define _ALL_SOURCE 1
 #define _GNU_SOURCE 1
 #define _BSD_SOURCE 1
+#ifndef _DEFAULT_SOURCE
 #define _DEFAULT_SOURCE 1
+#endif
 #define _NETBSD_SOURCE 1
 #define _SGI_SOURCE 1
 

From 5dc036e32d7f52da2c26743debde5345287e301c Mon Sep 17 00:00:00 2001
From: Thomas Braun <thomas.braun@byte-physics.de>
Date: Thu, 8 May 2014 21:43:24 +0200
Subject: [PATCH 345/553] transport: optionally disable side-band-64k

Since commit 0c499ea60fda (send-pack: demultiplex a sideband stream with
status data, 2010-02-05) the send-pack builtin uses the side-band-64k
capability if advertised by the server.

Unfortunately this breaks pushing over the dumb git protocol if used
over a network connection.

The detailed reasons for this breakage are (by courtesy of Jeff Preshing,
quoted from https://groups.google.com/d/msg/msysgit/at8D7J-h7mw/eaLujILGUWoJ):

	MinGW wraps Windows sockets in CRT file descriptors in order to
	mimic the functionality of POSIX sockets. This causes msvcrt.dll
	to treat sockets as Installable File System (IFS) handles,
	calling ReadFile, WriteFile, DuplicateHandle and CloseHandle on
	them. This approach works well in simple cases on recent
	versions of Windows, but does not support all usage patterns. In
	particular, using this approach, any attempt to read & write
	concurrently on the same socket (from one or more processes)
	will deadlock in a scenario where the read waits for a response
	from the server which is only invoked after the write. This is
	what send_pack currently attempts to do in the use_sideband
	codepath.

The new config option `sendpack.sideband` allows overriding the
side-band-64k capability of the server, and thus makes the dumb git
protocol work.

Other transportation methods like ssh and http/https still benefit from
the sideband channel, therefore the default value of `sendpack.sideband`
is still true.
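
Note that the setting can only turn the sideband off, e.g. via
`git config sendpack.sideband false`; it cannot force a capability
that the server does not advertise. A minimal sketch of the resulting
decision (the helper name is hypothetical; the patch computes this
inline in `send_pack()`):

```c
#include <stdbool.h>

/*
 * Hypothetical helper: `configured` is the value of sendpack.sideband
 * (true when unset), `advertised` tells whether the server offers the
 * side-band-64k capability.
 */
bool effective_sideband(bool configured, bool advertised)
{
	return configured && advertised;
}
```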

Signed-off-by: Thomas Braun <thomas.braun@byte-physics.de>
Signed-off-by: Oliver Schneider <oliver@assarbad.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.adoc          | 2 ++
 Documentation/config/sendpack.adoc | 5 +++++
 send-pack.c                        | 6 +++---
 3 files changed, 10 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/config/sendpack.adoc

diff --git a/Documentation/config.adoc b/Documentation/config.adoc
index dcea3c0c15e2a9..4332ce35154be0 100644
--- a/Documentation/config.adoc
+++ b/Documentation/config.adoc
@@ -519,6 +519,8 @@ include::config/safe.adoc[]
 
 include::config/sendemail.adoc[]
 
+include::config/sendpack.adoc[]
+
 include::config/sequencer.adoc[]
 
 include::config/showbranch.adoc[]
diff --git a/Documentation/config/sendpack.adoc b/Documentation/config/sendpack.adoc
new file mode 100644
index 00000000000000..e306f657fba7dd
--- /dev/null
+++ b/Documentation/config/sendpack.adoc
@@ -0,0 +1,5 @@
+sendpack.sideband::
+	Allows disabling the side-band-64k capability for send-pack even
+	when it is advertised by the server. Makes it possible to work
+	around a limitation in the Git for Windows implementation together
+	with the dumb git protocol. Defaults to true.
diff --git a/send-pack.c b/send-pack.c
index 67d6987b1ccd7e..22a1beed8d9823 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -501,7 +501,7 @@ int send_pack(struct repository *r,
 	int need_pack_data = 0;
 	int allow_deleting_refs = 0;
 	int status_report = 0;
-	int use_sideband = 0;
+	int use_sideband = 1;
 	int quiet_supported = 0;
 	int agent_supported = 0;
 	int advertise_sid = 0;
@@ -525,6 +525,7 @@ int send_pack(struct repository *r,
 		goto out;
 	}
 
+	repo_config_get_bool(r, "sendpack.sideband", &use_sideband);
 	repo_config_get_bool(r, "push.negotiate", &push_negotiate);
 	if (push_negotiate) {
 		trace2_region_enter("send_pack", "push_negotiate", r);
@@ -546,8 +547,7 @@ int send_pack(struct repository *r,
 		allow_deleting_refs = 1;
 	if (server_supports("ofs-delta"))
 		args->use_ofs_delta = 1;
-	if (server_supports("side-band-64k"))
-		use_sideband = 1;
+	use_sideband = use_sideband && server_supports("side-band-64k");
 	if (server_supports("quiet"))
 		quiet_supported = 1;
 	if (server_supports("agent"))

From dccf34b858e4429781dc016f3fcc5b5730bee812 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 1 Jan 2020 21:07:22 +0100
Subject: [PATCH 346/553] mingw: do resolve symlinks in `getcwd()`

As pointed out in https://github.com/git-for-windows/git/issues/1676,
the `git rev-parse --is-inside-work-tree` command currently fails when
the current directory's path contains symbolic links.

The underlying reason for this bug is that `getcwd()` is supposed to
resolve symbolic links, but our `mingw_getcwd()` implementation did not.

We do have all the building blocks for that, though: the
`GetFinalPathNameByHandleW()` function will resolve symbolic links. However,
we only called that function if `GetLongPathNameW()` failed, for
historical reasons: the latter function was supported for a long time,
but the former API function was introduced only with Windows Vista, and
we used to support also Windows XP. With that support having been
dropped, we are free to call the symbolic link-resolving function right
away.
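
To illustrate the POSIX behavior we are trying to match, here is a
small sketch (not part of this patch; the function name and the use of
`/tmp` are assumptions) that enters a directory through a symbolic
link and checks that `getcwd()` reports the resolved location:

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical demo: create <tmp>/target and a symlink <tmp>/link to it,
 * chdir() through the symlink, and compare getcwd() with the resolved
 * path of the target. Returns 1 if they match, 0 if not, -1 on error.
 */
int getcwd_resolves_symlinks(void)
{
	char tmpl[] = "/tmp/getcwd-demo-XXXXXX";
	char target[PATH_MAX], linkpath[PATH_MAX];
	char cwd[PATH_MAX], resolved[PATH_MAX], old[PATH_MAX];
	int result = -1;

	if (!getcwd(old, sizeof(old)) || !mkdtemp(tmpl))
		return -1;
	snprintf(target, sizeof(target), "%s/target", tmpl);
	snprintf(linkpath, sizeof(linkpath), "%s/link", tmpl);

	if (!mkdir(target, 0700) && !symlink(target, linkpath) &&
	    !chdir(linkpath) && getcwd(cwd, sizeof(cwd)) &&
	    realpath(target, resolved))
		result = !strcmp(cwd, resolved);

	/* Go back and clean up on a best-effort basis */
	if (chdir(old))
		result = -1;
	unlink(linkpath);
	rmdir(target);
	rmdir(tmpl);
	return result;
}
```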

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index f09b49ff21ddab..cf4f3c92e7a889 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -1239,18 +1239,16 @@ char *mingw_getcwd(char *pointer, int len)
 {
 	wchar_t cwd[MAX_PATH], wpointer[MAX_PATH];
 	DWORD ret = GetCurrentDirectoryW(ARRAY_SIZE(cwd), cwd);
+	HANDLE hnd;
 
 	if (!ret || ret >= ARRAY_SIZE(cwd)) {
 		errno = ret ? ENAMETOOLONG : err_win_to_posix(GetLastError());
 		return NULL;
 	}
-	ret = GetLongPathNameW(cwd, wpointer, ARRAY_SIZE(wpointer));
-	if (!ret && GetLastError() == ERROR_ACCESS_DENIED) {
-		HANDLE hnd = CreateFileW(cwd, 0,
-			FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL,
-			OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
-		if (hnd == INVALID_HANDLE_VALUE)
-			return NULL;
+	hnd = CreateFileW(cwd, 0,
+			  FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL,
+			  OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
+	if (hnd != INVALID_HANDLE_VALUE) {
 		ret = GetFinalPathNameByHandleW(hnd, wpointer, ARRAY_SIZE(wpointer), 0);
 		CloseHandle(hnd);
 		if (!ret || ret >= ARRAY_SIZE(wpointer))
@@ -1259,13 +1257,11 @@ char *mingw_getcwd(char *pointer, int len)
 			return NULL;
 		return pointer;
 	}
-	if (!ret || ret >= ARRAY_SIZE(wpointer))
-		return NULL;
-	if (GetFileAttributesW(wpointer) == INVALID_FILE_ATTRIBUTES) {
+	if (GetFileAttributesW(cwd) == INVALID_FILE_ATTRIBUTES) {
 		errno = ENOENT;
 		return NULL;
 	}
-	if (xwcstoutf(pointer, wpointer, len) < 0)
+	if (xwcstoutf(pointer, cwd, len) < 0)
 		return NULL;
 	convert_slashes(pointer);
 	return pointer;

From a4fe6c58992db0b32008d9f7e0b6e34f695393da Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 31 Jan 2020 12:02:47 +0100
Subject: [PATCH 347/553] mingw: demonstrate a `git add` issue with NTFS
 junctions

NTFS junctions are somewhat similar in spirit to Unix bind mounts: they
point to a different directory and are resolved by the filesystem
driver. As such, they appear to `lstat()` as if they are directories,
not as if they are symbolic links.

_Any_ user can create junctions, while symbolic links can only be
created by non-administrators in Developer Mode on Windows 10. Hence
NTFS junctions are much more common "in the wild" than NTFS symbolic
links.

It was reported in https://github.com/git-for-windows/git/issues/2481
that adding files via an absolute path that traverses an NTFS junction
fails: since 1e64d18 (mingw: do resolve symlinks in `getcwd()`), we
resolve not only symbolic links but also NTFS junctions when determining
the absolute path of the current directory. The same is not true for
`git add <file>`, where symbolic links are resolved in `<file>`, but not
NTFS junctions.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t3700-add.sh | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/t/t3700-add.sh b/t/t3700-add.sh
index af93e53c12cfe3..1865372af63d94 100755
--- a/t/t3700-add.sh
+++ b/t/t3700-add.sh
@@ -574,4 +574,15 @@ test_expect_success CASE_INSENSITIVE_FS 'path is case-insensitive' '
 	git add "$downcased"
 '
 
+test_expect_failure MINGW 'can add files via NTFS junctions' '
+	test_when_finished "cmd //c rmdir junction && rm -rf target" &&
+	test_create_repo target &&
+	cmd //c "mklink /j junction target" &&
+	>target/via-junction &&
+	git -C junction add "$(pwd)/junction/via-junction" &&
+	echo via-junction >expect &&
+	git -C target diff --cached --name-only >actual &&
+	test_cmp expect actual
+'
+
 test_done

From 7240cd0a7ad293932a20d36d1441fd3d5fe1a7e9 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 31 Jan 2020 11:44:31 +0100
Subject: [PATCH 348/553] strbuf_realpath(): use platform-dependent API if
 available

Some platforms (e.g. Windows) provide API functions to resolve paths
much quicker. Let's offer a way to short-cut `strbuf_realpath()` on
those platforms.
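
The hook follows the usual compat pattern: a platform header may define
a macro, and a default definition makes the call a no-op everywhere
else. A minimal standalone sketch of that pattern, assuming POSIX
`realpath()` as the generic path (the `*_sketch` names are hypothetical
and do not exist in Git):

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <stddef.h>
#include <stdlib.h>

/*
 * A platform header may define platform_realpath_sketch() as a fast,
 * native resolver. Without one, the hook expands to NULL and the
 * generic code path below is always taken.
 */
#ifndef platform_realpath_sketch
#define platform_realpath_sketch(path) NULL
#endif

/* Returns a malloc()ed resolved path, or NULL on failure. */
char *resolve_path_sketch(const char *path)
{
	char *fast = platform_realpath_sketch(path);

	if (fast)
		return fast;
	return realpath(path, NULL); /* generic fallback */
}
```

Because the default expands to a constant NULL, compilers can drop the
short-cut entirely on platforms that do not provide one.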

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 abspath.c         | 3 +++
 git-compat-util.h | 4 ++++
 2 files changed, 7 insertions(+)

diff --git a/abspath.c b/abspath.c
index 1202cde23dbc9b..0c17e98654e4b0 100644
--- a/abspath.c
+++ b/abspath.c
@@ -93,6 +93,9 @@ static char *strbuf_realpath_1(struct strbuf *resolved, const char *path,
 			goto error_out;
 	}
 
+	if (platform_strbuf_realpath(resolved, path))
+		return resolved->buf;
+
 	strbuf_addstr(&remaining, path);
 	get_root_part(resolved, &remaining);
 
diff --git a/git-compat-util.h b/git-compat-util.h
index b0673d1a450db5..650bd1e74f15bb 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -358,6 +358,10 @@ static inline int git_has_dir_sep(const char *path)
 #define query_user_email() NULL
 #endif
 
+#ifndef platform_strbuf_realpath
+#define platform_strbuf_realpath(resolved, path) NULL
+#endif
+
 #ifdef __TANDEM
 #include <floss.h(floss_execl,floss_execlp,floss_execv,floss_execvp)>
 #include <floss.h(floss_getpwuid)>

From e4a4aa54e9a3315e2e7d4846a6279e073bff5521 Mon Sep 17 00:00:00 2001
From: Bjoern Mueller <bjoernm@gmx.de>
Date: Wed, 22 Jan 2020 13:49:13 +0100
Subject: [PATCH 349/553] mingw: fix fatal error working on mapped network
 drives on Windows

In 1e64d18 (mingw: do resolve symlinks in `getcwd()`) a problem was
introduced that causes Git for Windows to stop working with certain
mapped network drives (in particular, drives that are mapped to
locations with long path names). The error message was "fatal: Unable
to read current working directory: No such file or directory". The
present change fixes this issue, as discussed in
https://github.com/git-for-windows/git/issues/2480

Signed-off-by: Bjoern Mueller <bjoernm@gmx.de>
---
 compat/mingw.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index f09b49ff21ddab..23f8f7be156272 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -1253,8 +1253,13 @@ char *mingw_getcwd(char *pointer, int len)
 			return NULL;
 		ret = GetFinalPathNameByHandleW(hnd, wpointer, ARRAY_SIZE(wpointer), 0);
 		CloseHandle(hnd);
-		if (!ret || ret >= ARRAY_SIZE(wpointer))
-			return NULL;
+		if (!ret || ret >= ARRAY_SIZE(wpointer)) {
+			ret = GetLongPathNameW(cwd, wpointer, ARRAY_SIZE(wpointer));
+			if (!ret || ret >= ARRAY_SIZE(wpointer)) {
+				errno = ret ? ENAMETOOLONG : err_win_to_posix(GetLastError());
+				return NULL;
+			}
+		}
 		if (xwcstoutf(pointer, normalize_ntpath(wpointer), len) < 0)
 			return NULL;
 		return pointer;

From 45e3edf32ff3a73b90a2832715bcf4ed0ba2e889 Mon Sep 17 00:00:00 2001
From: Jeff Hostetler <jeffhost@microsoft.com>
Date: Thu, 30 Jan 2020 14:22:27 -0500
Subject: [PATCH 350/553] clink.pl: fix MSVC compile script to handle
 libcurl-d.lib

Update clink.pl to link with either libcurl.lib or libcurl-d.lib
depending on whether DEBUG=1 is set.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/vcbuild/scripts/clink.pl | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/compat/vcbuild/scripts/clink.pl b/compat/vcbuild/scripts/clink.pl
index 3bd824154be381..c4c99d1a11f18c 100755
--- a/compat/vcbuild/scripts/clink.pl
+++ b/compat/vcbuild/scripts/clink.pl
@@ -56,7 +56,8 @@
 		# need to use that instead?
 		foreach my $flag (@lflags) {
 			if ($flag =~ /^-LIBPATH:(.*)/) {
-				foreach my $l ("libcurl_imp.lib", "libcurl.lib") {
+				my $libcurl = $is_debug ? "libcurl-d.lib" : "libcurl.lib";
+				foreach my $l ("libcurl_imp.lib", $libcurl) {
 					if (-f "$1/$l") {
 						$lib = $l;
 						last;

From cf4f6146827b885459499747628e2e33005570ff Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 31 Jan 2020 11:49:04 +0100
Subject: [PATCH 351/553] mingw: implement a platform-specific
 `strbuf_realpath()`

There is a Win32 API function to resolve symbolic links, and we can use
that instead of resolving them manually. Even better, this function also
resolves NTFS junction points (which are somewhat similar to bind
mounts).

This fixes https://github.com/git-for-windows/git/issues/2481.
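
Like `strbuf_realpath()`, the new function tolerates a missing last
path component: it resolves the parent and re-appends the component.
A rough POSIX analogue of that fallback, as a sketch only (the helper
name is hypothetical, and `realpath()` stands in for the Win32 calls):

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: resolve `path` into `out`; if only the last
 * component is missing, resolve the parent and re-append that component.
 * Returns 0 on success, -1 on failure.
 */
int realpath_allow_missing_last(const char *path, char *out, size_t outlen)
{
	char resolved[PATH_MAX], parent[PATH_MAX];
	const char *slash;
	size_t plen;
	int n;

	if (realpath(path, resolved)) {
		n = snprintf(out, outlen, "%s", resolved);
		return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
	}
	if (errno != ENOENT)
		return -1;

	slash = strrchr(path, '/');
	if (!slash || !slash[1])
		return -1; /* no parent given, or trailing slash: give up */

	/* Keep the slash when the parent is the root directory */
	plen = slash == path ? 1 : (size_t)(slash - path);
	if (plen >= sizeof(parent))
		return -1;
	memcpy(parent, path, plen);
	parent[plen] = '\0';

	if (!realpath(parent, resolved))
		return -1;
	n = snprintf(out, outlen, "%s%s%s", resolved,
		     strcmp(resolved, "/") ? "/" : "", slash + 1);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Keeping the slash for a root parent corresponds to the drive root
special case in the patch, where stripping it would leave `C:`.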

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c        | 76 +++++++++++++++++++++++++++++++++++++++++++
 compat/mingw.h        |  3 ++
 t/t0060-path-utils.sh |  8 +++++
 t/t3700-add.sh        |  2 +-
 t/t5601-clone.sh      |  7 ++++
 5 files changed, 95 insertions(+), 1 deletion(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index f09b49ff21ddab..2d11ab3e89b713 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -1235,6 +1235,82 @@ struct tm *localtime_r(const time_t *timep, struct tm *result)
 }
 #endif
 
+char *mingw_strbuf_realpath(struct strbuf *resolved, const char *path)
+{
+	wchar_t wpath[MAX_PATH];
+	HANDLE h;
+	DWORD ret;
+	int len;
+	const char *last_component = NULL;
+	char *append = NULL;
+
+	if (xutftowcs_path(wpath, path) < 0)
+		return NULL;
+
+	h = CreateFileW(wpath, 0,
+			FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL,
+			OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
+
+	/*
+	 * strbuf_realpath() allows the last path component to not exist. If
+	 * that is the case, now it's time to try without last component.
+	 */
+	if (h == INVALID_HANDLE_VALUE &&
+	    GetLastError() == ERROR_FILE_NOT_FOUND) {
+		/* cut last component off of `wpath` */
+		wchar_t *p = wpath + wcslen(wpath);
+
+		while (p != wpath)
+			if (*(--p) == L'/' || *p == L'\\')
+				break; /* found start of last component */
+
+		if (p != wpath && (last_component = find_last_dir_sep(path))) {
+			append = xstrdup(last_component + 1); /* skip directory separator */
+			/*
+			 * Do not strip the trailing slash at the drive root, otherwise
+			 * the path would be e.g. `C:` (which resolves to the
+			 * _current_ directory on that drive).
+			 */
+			if (p[-1] == L':')
+				p[1] = L'\0';
+			else
+				*p = L'\0';
+			h = CreateFileW(wpath, 0, FILE_SHARE_READ |
+					FILE_SHARE_WRITE | FILE_SHARE_DELETE,
+					NULL, OPEN_EXISTING,
+					FILE_FLAG_BACKUP_SEMANTICS, NULL);
+		}
+	}
+
+	if (h == INVALID_HANDLE_VALUE) {
+realpath_failed:
+		FREE_AND_NULL(append);
+		return NULL;
+	}
+
+	ret = GetFinalPathNameByHandleW(h, wpath, ARRAY_SIZE(wpath), 0);
+	CloseHandle(h);
+	if (!ret || ret >= ARRAY_SIZE(wpath))
+		goto realpath_failed;
+
+	len = wcslen(wpath) * 3;
+	strbuf_grow(resolved, len);
+	len = xwcstoutf(resolved->buf, normalize_ntpath(wpath), len);
+	if (len < 0)
+		goto realpath_failed;
+	resolved->len = len;
+
+	if (append) {
+		/* Use forward-slash, like `normalize_ntpath()` */
+		strbuf_complete(resolved, '/');
+		strbuf_addstr(resolved, append);
+		FREE_AND_NULL(append);
+	}
+
+	return resolved->buf;
+
+}
+
 char *mingw_getcwd(char *pointer, int len)
 {
 	wchar_t cwd[MAX_PATH], wpointer[MAX_PATH];
diff --git a/compat/mingw.h b/compat/mingw.h
index 444daedfa52469..f6daf47ee4e0a7 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -39,6 +39,9 @@ static inline void convert_slashes(char *path)
 #define PATH_SEP ';'
 char *mingw_query_user_email(void);
 #define query_user_email mingw_query_user_email
+struct strbuf;
+char *mingw_strbuf_realpath(struct strbuf *resolved, const char *path);
+#define platform_strbuf_realpath mingw_strbuf_realpath
 
 /**
  * Verifies that the specified path is owned by the user running the
diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
index 8545cdfab559b4..eb2ab9d437ea8e 100755
--- a/t/t0060-path-utils.sh
+++ b/t/t0060-path-utils.sh
@@ -281,6 +281,14 @@ test_expect_success SYMLINKS 'real path works on symlinks' '
 	test_cmp expect actual
 '
 
+test_expect_success MINGW 'real path works near drive root' '
+	# we need a non-existing path at the drive root; simply skip if C:/xyz exists
+	if test ! -e C:/xyz
+	then
+		test C:/xyz = $(test-tool path-utils real_path C:/xyz)
+	fi
+'
+
 test_expect_success SYMLINKS 'prefix_path works with absolute paths to work tree symlinks' '
 	ln -s target symlink &&
 	echo "symlink" >expect &&
diff --git a/t/t3700-add.sh b/t/t3700-add.sh
index 1865372af63d94..14ad83124082a2 100755
--- a/t/t3700-add.sh
+++ b/t/t3700-add.sh
@@ -574,7 +574,7 @@ test_expect_success CASE_INSENSITIVE_FS 'path is case-insensitive' '
 	git add "$downcased"
 '
 
-test_expect_failure MINGW 'can add files via NTFS junctions' '
+test_expect_success MINGW 'can add files via NTFS junctions' '
 	test_when_finished "cmd //c rmdir junction && rm -rf target" &&
 	test_create_repo target &&
 	cmd //c "mklink /j junction target" &&
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index d743d986c401a0..f70d99016ea2f7 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -78,6 +78,13 @@ test_expect_success 'clone respects GIT_WORK_TREE' '
 
 '
 
+test_expect_success CASE_INSENSITIVE_FS 'core.worktree is not added due to path case' '
+
+	mkdir UPPERCASE &&
+	git clone src "$(pwd)/uppercase" &&
+	test "unset" = "$(git -C UPPERCASE config --default unset core.worktree)"
+'
+
 test_expect_success 'clone from hooks' '
 
 	test_create_repo r0 &&

From e4a1ca61be747830a1d712422af80e5f82a439be Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sat, 9 May 2020 16:19:06 +0200
Subject: [PATCH 352/553] t5505/t5516: allow running without `.git/branches/`
 in the templates

When we commit the template directory as part of `make vcxproj`, the
`branches/` directory is not actually committed, as it is empty.

Two tests were not prepared for that situation.

This developer tried to get rid of the support for `.git/branches/` a
long time ago, but that effort did not bear fruit, so the best we can do
is to work around that in these here tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5505-remote.sh     | 4 ++--
 t/t5516-fetch-push.sh | 8 ++++----
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/t/t5505-remote.sh b/t/t5505-remote.sh
index e592c0bcde91e9..ed8ef69863ddd8 100755
--- a/t/t5505-remote.sh
+++ b/t/t5505-remote.sh
@@ -1155,7 +1155,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'migrate a remote from named file in
 	(
 		cd six &&
 		git remote rm origin &&
-		mkdir .git/branches &&
+		mkdir -p .git/branches &&
 		echo "$origin_url#main" >.git/branches/origin &&
 		git remote rename origin origin &&
 		test_path_is_missing .git/branches/origin &&
@@ -1170,7 +1170,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'migrate a remote from named file in
 	(
 		cd seven &&
 		git remote rm origin &&
-		mkdir .git/branches &&
+		mkdir -p .git/branches &&
 		echo "quux#foom" > .git/branches/origin &&
 		git remote rename origin origin &&
 		test_path_is_missing .git/branches/origin &&
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 46926e7bbd3a9a..c15963c3d0b229 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -933,7 +933,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'fetch with branches' '
 	mk_empty testrepo &&
 	git branch second $the_first_commit &&
 	git checkout second &&
-	mkdir testrepo/.git/branches &&
+	mkdir -p testrepo/.git/branches &&
 	echo ".." > testrepo/.git/branches/branch1 &&
 	(
 		cd testrepo &&
@@ -947,7 +947,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'fetch with branches' '
 
 test_expect_success !WITH_BREAKING_CHANGES 'fetch with branches containing #' '
 	mk_empty testrepo &&
-	mkdir testrepo/.git/branches &&
+	mkdir -p testrepo/.git/branches &&
 	echo "..#second" > testrepo/.git/branches/branch2 &&
 	(
 		cd testrepo &&
@@ -964,7 +964,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'push with branches' '
 	git checkout second &&
 
 	test_when_finished "rm -rf .git/branches" &&
-	mkdir .git/branches &&
+	mkdir -p .git/branches &&
 	echo "testrepo" > .git/branches/branch1 &&
 
 	git push branch1 &&
@@ -980,7 +980,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'push with branches containing #' '
 	mk_empty testrepo &&
 
 	test_when_finished "rm -rf .git/branches" &&
-	mkdir .git/branches &&
+	mkdir -p .git/branches &&
 	echo "testrepo#branch3" > .git/branches/branch2 &&
 
 	git push branch2 &&

From d39efee1eefa2d34afe59b48ba2af6340dccbbf5 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 4 Mar 2020 21:55:28 +0100
Subject: [PATCH 353/553] http: use new "best effort" strategy for Secure
 Channel revoke checking
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The native Windows HTTPS backend is based on Secure Channel which lets
the caller decide how to handle revocation checking problems caused by
missing information in the certificate or offline CRL distribution
points.

Unfortunately, cURL chose to handle these problems differently than
OpenSSL by default: while OpenSSL happily ignores those problems
(essentially saying "¯\_(ツ)_/¯"), the Secure Channel backend will error
out instead.

As a remedy, the "no revoke" mode was introduced, which turns off
revocation checking altogether. This is a bit heavy-handed. We support
this via the `http.schannelCheckRevoke` setting.

In https://github.com/curl/curl/pull/4981, we contributed an opt-in
"best effort" strategy that emulates what OpenSSL seems to do.

In Git for Windows, we actually want this to be the default. This patch
makes it so, introducing it as a new value for the
`http.schannelCheckRevoke` setting, which now becomes a tristate: it
accepts the values "false", "true" or "best-effort" (defaulting to the
last one).
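
As a rough illustration of the tristate, the mapping from config value
to mode could be sketched as below. This is not the code in http.c: the
enum, `parse_bool()` and `parse_revoke_mode()` are hypothetical names
used only for this sketch, and the boolean parser accepts fewer
spellings than Git's `git_config_bool()`.

```c
#include <string.h>

/* Hypothetical stand-ins for the cURL SSL option flags. */
enum revoke_mode {
	REVOKE_CHECK = 0,       /* "true": strict revocation checking */
	REVOKE_NONE = 1,        /* "false": no revocation checking */
	REVOKE_BEST_EFFORT = 2  /* "best-effort": ignore offline/missing CRLs */
};

/* Simplified boolean parser (assumption: a missing value means true). */
static int parse_bool(const char *value)
{
	if (!value)
		return 1;
	return !strcmp(value, "true") || !strcmp(value, "yes") ||
	       !strcmp(value, "on") || !strcmp(value, "1");
}

/* Check for the third state first, then fall back to a boolean. */
static enum revoke_mode parse_revoke_mode(const char *value)
{
	if (value && !strcmp(value, "best-effort"))
		return REVOKE_BEST_EFFORT;
	return parse_bool(value) ? REVOKE_CHECK : REVOKE_NONE;
}
```

Testing for "best-effort" before the boolean fallback keeps existing
"true"/"false" configurations working unchanged.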

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config/http.adoc | 12 +++++++-----
 http.c                         | 26 ++++++++++++++++++++++----
 2 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/Documentation/config/http.adoc b/Documentation/config/http.adoc
index 9da5c298cc1d5e..9122c5dc23ea1a 100644
--- a/Documentation/config/http.adoc
+++ b/Documentation/config/http.adoc
@@ -233,11 +233,13 @@ http.sslKeyType::
 
 http.schannelCheckRevoke::
 	Used to enforce or disable certificate revocation checks in cURL
-	when http.sslBackend is set to "schannel". Defaults to `true` if
-	unset. Only necessary to disable this if Git consistently errors
-	and the message is about checking the revocation status of a
-	certificate. This option is ignored if cURL lacks support for
-	setting the relevant SSL option at runtime.
+	when http.sslBackend is set to "schannel" via "true" and "false",
+	respectively. Another accepted value is "best-effort" (the default)
+	in which case revocation checks are performed, but errors due to
+	revocation list distribution points that are offline are silently
+	ignored, as well as errors due to certificates missing revocation
+	list distribution points. This option is ignored if cURL lacks
+	support for setting the relevant SSL option at runtime.
 
 http.schannelUseSSLCAInfo::
 	As of cURL v7.60.0, the Secure Channel backend can use the
diff --git a/http.c b/http.c
index 41f850db16d19f..81b10d04b61d10 100644
--- a/http.c
+++ b/http.c
@@ -148,7 +148,13 @@ static char *cached_accept_language;
 
 static char *http_ssl_backend;
 
-static int http_schannel_check_revoke = 1;
+static long http_schannel_check_revoke_mode =
+#ifdef CURLSSLOPT_REVOKE_BEST_EFFORT
+	CURLSSLOPT_REVOKE_BEST_EFFORT;
+#else
+	CURLSSLOPT_NO_REVOKE;
+#endif
+
 /*
  * With the backend being set to `schannel`, setting sslCAinfo would override
  * the Certificate Store in cURL v7.60.0 and later, which is not what we want
@@ -423,7 +429,19 @@ static int http_options(const char *var, const char *value,
 	}
 
 	if (!strcmp("http.schannelcheckrevoke", var)) {
-		http_schannel_check_revoke = git_config_bool(var, value);
+		if (value && !strcmp(value, "best-effort")) {
+			http_schannel_check_revoke_mode =
+#ifdef CURLSSLOPT_REVOKE_BEST_EFFORT
+				CURLSSLOPT_REVOKE_BEST_EFFORT;
+#else
+				CURLSSLOPT_NO_REVOKE;
+			warning(_("%s=%s unsupported by current cURL"),
+				var, value);
+#endif
+		} else
+			http_schannel_check_revoke_mode =
+				(git_config_bool(var, value) ?
+				 0 : CURLSSLOPT_NO_REVOKE);
 		return 0;
 	}
 
@@ -1057,8 +1075,8 @@ static CURL *get_curl_handle(void)
 #endif
 
 	if (http_ssl_backend && !strcmp("schannel", http_ssl_backend) &&
-	    !http_schannel_check_revoke) {
-		curl_easy_setopt(result, CURLOPT_SSL_OPTIONS, (long)CURLSSLOPT_NO_REVOKE);
+	    http_schannel_check_revoke_mode) {
+		curl_easy_setopt(result, CURLOPT_SSL_OPTIONS, http_schannel_check_revoke_mode);
 	}
 
 	if (http_proactive_auth != PROACTIVE_AUTH_NONE)

From 9dc1a8beb62a281cafd1894c441e3a2217bb3d0b Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sat, 9 May 2020 19:24:23 +0200
Subject: [PATCH 354/553] t5505/t5516: fix white-space around redirectors

The convention in the Git project's shell scripts is to have white-space
_before_, but not _after_ the `>` (or `<`).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5505-remote.sh     |  6 +++---
 t/t5516-fetch-push.sh | 10 +++++-----
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/t/t5505-remote.sh b/t/t5505-remote.sh
index ed8ef69863ddd8..187a5206e17758 100755
--- a/t/t5505-remote.sh
+++ b/t/t5505-remote.sh
@@ -951,8 +951,8 @@ test_expect_success '"remote show" does not show symbolic refs' '
 	(
 		cd three &&
 		git remote show origin >output &&
-		! grep "^ *HEAD$" < output &&
-		! grep -i stale < output
+		! grep "^ *HEAD$" <output &&
+		! grep -i stale <output
 	)
 '
 
@@ -1171,7 +1171,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'migrate a remote from named file in
 		cd seven &&
 		git remote rm origin &&
 		mkdir -p .git/branches &&
-		echo "quux#foom" > .git/branches/origin &&
+		echo "quux#foom" >.git/branches/origin &&
 		git remote rename origin origin &&
 		test_path_is_missing .git/branches/origin &&
 		test "$(git config remote.origin.url)" = "quux" &&
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index c15963c3d0b229..d37cc7f486344d 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -934,7 +934,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'fetch with branches' '
 	git branch second $the_first_commit &&
 	git checkout second &&
 	mkdir -p testrepo/.git/branches &&
-	echo ".." > testrepo/.git/branches/branch1 &&
+	echo ".." >testrepo/.git/branches/branch1 &&
 	(
 		cd testrepo &&
 		git fetch branch1 &&
@@ -948,7 +948,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'fetch with branches' '
 test_expect_success !WITH_BREAKING_CHANGES 'fetch with branches containing #' '
 	mk_empty testrepo &&
 	mkdir -p testrepo/.git/branches &&
-	echo "..#second" > testrepo/.git/branches/branch2 &&
+	echo "..#second" >testrepo/.git/branches/branch2 &&
 	(
 		cd testrepo &&
 		git fetch branch2 &&
@@ -965,7 +965,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'push with branches' '
 
 	test_when_finished "rm -rf .git/branches" &&
 	mkdir -p .git/branches &&
-	echo "testrepo" > .git/branches/branch1 &&
+	echo "testrepo" >.git/branches/branch1 &&
 
 	git push branch1 &&
 	(
@@ -981,7 +981,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'push with branches containing #' '
 
 	test_when_finished "rm -rf .git/branches" &&
 	mkdir -p .git/branches &&
-	echo "testrepo#branch3" > .git/branches/branch2 &&
+	echo "testrepo#branch3" >.git/branches/branch2 &&
 
 	git push branch2 &&
 	(
@@ -1511,7 +1511,7 @@ EOF
 	git init no-thin &&
 	git --git-dir=no-thin/.git config receive.unpacklimit 0 &&
 	git push no-thin/.git refs/heads/main:refs/heads/foo &&
-	echo modified >> path1 &&
+	echo modified >>path1 &&
 	git commit -am modified &&
 	git repack -adf &&
 	rcvpck="git receive-pack --reject-thin-pack-for-testing" &&

From ee46b588c46f4243c4201c891b8c057d0fe06c2b Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sat, 12 Sep 2015 12:25:47 +0200
Subject: [PATCH 355/553] t3701: verify that we can add *lots* of files
 interactively

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t3701-add-interactive.sh | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 4285314f35f8f2..01e42a51682755 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -1204,6 +1204,27 @@ test_expect_success 'checkout -p patch editing of added file' '
 	)
 '
 
+test_expect_success EXPENSIVE 'add -i with a lot of files' '
+	git reset --hard &&
+	x160=0123456789012345678901234567890123456789 &&
+	x160=$x160$x160$x160$x160 &&
+	y= &&
+	i=0 &&
+	while test $i -le 200
+	do
+		name=$(printf "%s%03d" $x160 $i) &&
+		echo $name >$name &&
+		git add -N $name &&
+		y="${y}y$LF" &&
+		i=$(($i+1)) ||
+		exit 1
+	done &&
+	echo "$y" | git add -p -- . &&
+	git diff --cached >staged &&
+	test_line_count = 1407 staged &&
+	git reset --hard
+'
+
 test_expect_success 'show help from add--helper' '
 	git reset --hard &&
 	cat >expect <<-EOF &&

From 0ccb82142f73815e814f05d030bca081a971bafa Mon Sep 17 00:00:00 2001
From: Luke Bonanomi <lbonanomi@gmail.com>
Date: Wed, 24 Jun 2020 07:45:52 -0400
Subject: [PATCH 356/553] commit: accept "scissors" with CR/LF line endings

This change enhances `git commit --cleanup=scissors` by detecting
scissors lines ending in either LF (UNIX-style) or CR/LF (DOS-style).

Regression tests are included to specifically test for trailing
comments after a CR/LF-terminated scissors line.
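
The idea can be sketched in isolation as follows. This is a simplified
stand-alone version, not the code in wt-status.c: `locate_end()` and
`CUT_LINE` are hypothetical names, and the comment character is assumed
to be "#" rather than taken from the configuration.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: "#" is the comment character. */
#define CUT_LINE "# ------------------------ >8 ------------------------"

static const char cut_line[] = CUT_LINE;

/* True if p points at an LF or at a CR/LF pair. */
static int starts_with_newline(const char *p)
{
	return *p == '\n' || (*p == '\r' && p[1] == '\n');
}

/*
 * Return how many bytes of msg to keep: everything before the first
 * line that consists of the scissors line followed by LF or CR/LF.
 */
static size_t locate_end(const char *msg)
{
	size_t cut_len = strlen(cut_line);
	const char *line = msg;

	while (*line) {
		if (!strncmp(line, cut_line, cut_len) &&
		    starts_with_newline(line + cut_len))
			return (size_t)(line - msg);
		line = strchr(line, '\n');
		if (!line)
			break;
		line++; /* step past the LF to the start of the next line */
	}
	return strlen(msg);
}
```

An indented scissors line is not treated as a cut point in this sketch,
which matches what the new tests expect.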

Signed-off-by: Luke Bonanomi <lbonanomi@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t7502-commit-porcelain.sh | 42 +++++++++++++++++++++++++++++++++++++
 wt-status.c                 | 13 +++++++++---
 2 files changed, 52 insertions(+), 3 deletions(-)

diff --git a/t/t7502-commit-porcelain.sh b/t/t7502-commit-porcelain.sh
index 05f6da4ad98448..8a013669a5aa95 100755
--- a/t/t7502-commit-porcelain.sh
+++ b/t/t7502-commit-porcelain.sh
@@ -623,6 +623,48 @@ test_expect_success 'cleanup commit messages (scissors option,-F,-e, scissors on
 	test_must_be_empty actual
 '
 
+test_expect_success 'helper-editor' '
+
+	write_script lf-to-crlf.sh <<-\EOF
+	sed "s/\$/Q/" <"$1" | tr Q "\\015" >"$1".new &&
+	mv -f "$1".new "$1"
+	EOF
+'
+
+test_expect_success 'cleanup commit messages (scissors option,-F,-e, CR/LF line endings)' '
+
+	test_config core.editor "\"$PWD/lf-to-crlf.sh\"" &&
+	scissors="# ------------------------ >8 ------------------------" &&
+
+	test_write_lines >text \
+	"# Keep this comment" "" " $scissors" \
+	"# Keep this comment, too" "$scissors" \
+	"# Remove this comment" "$scissors" \
+	"Remove this comment, too" &&
+
+	test_write_lines >expect \
+	"# Keep this comment" "" " $scissors" \
+	"# Keep this comment, too" &&
+
+	git commit --cleanup=scissors -e -F text --allow-empty &&
+	git cat-file -p HEAD >raw &&
+	sed -e "1,/^\$/d" raw >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'cleanup commit messages (scissors option,-F,-e, scissors on first line, CR/LF line endings)' '
+
+	scissors="# ------------------------ >8 ------------------------" &&
+	test_write_lines >text \
+	"$scissors" \
+	"# Remove this comment and any following lines" &&
+	cp text /tmp/test2-text &&
+	git commit --cleanup=scissors -e -F text --allow-empty --allow-empty-message &&
+	git cat-file -p HEAD >raw &&
+	sed -e "1,/^\$/d" raw >actual &&
+	test_must_be_empty actual
+'
+
 test_expect_success 'cleanup commit messages (strip option,-F)' '
 
 	echo >>negative &&
diff --git a/wt-status.c b/wt-status.c
index e12adb26b9f8eb..c9e14101424991 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -40,7 +40,7 @@
 #define UF_DELAY_WARNING_IN_MS (2 * 1000)
 
 static const char cut_line[] =
-"------------------------ >8 ------------------------\n";
+"------------------------ >8 ------------------------";
 
 static char default_wt_status_colors[][COLOR_MAXLEN] = {
 	GIT_COLOR_NORMAL, /* WT_STATUS_HEADER */
@@ -1097,15 +1097,22 @@ static void wt_longstatus_print_other(struct wt_status *s,
 	status_printf_ln(s, GIT_COLOR_NORMAL, "%s", "");
 }
 
+static inline int starts_with_newline(const char *p)
+{
+    return *p == '\n' || (*p == '\r' && p[1] == '\n');
+}
+
 size_t wt_status_locate_end(const char *s, size_t len)
 {
 	const char *p;
 	struct strbuf pattern = STRBUF_INIT;
 
 	strbuf_addf(&pattern, "\n%s %s", comment_line_str, cut_line);
-	if (starts_with(s, pattern.buf + 1))
+	if (starts_with(s, pattern.buf + 1) &&
+	    starts_with_newline(s + pattern.len - 1))
 		len = 0;
-	else if ((p = strstr(s, pattern.buf))) {
+	else if ((p = strstr(s, pattern.buf)) &&
+		 starts_with_newline(p + pattern.len)) {
 		size_t newlen = p - s + 1;
 		if (newlen < len)
 			len = newlen;

From 54a32d5cb4d83040cf260b0bdeff20589f83631c Mon Sep 17 00:00:00 2001
From: Jens Glathe <jens.glathe@oldschoolsolutions.biz>
Date: Tue, 2 Jun 2020 12:12:25 +0200
Subject: [PATCH 357/553] t0014: fix indentation

For some reason, this test case was indented with 4 spaces instead of 1
horizontal tab. The other test cases in the same test script are fine.

Signed-off-by: Jens Glathe <jens.glathe@oldschoolsolutions.biz>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t0014-alias.sh | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/t/t0014-alias.sh b/t/t0014-alias.sh
index 07a53e7366ef4b..62b4d81db875ca 100755
--- a/t/t0014-alias.sh
+++ b/t/t0014-alias.sh
@@ -52,10 +52,10 @@ test_expect_success 'looping aliases - deprecated builtins' '
 #'
 
 test_expect_success 'run-command formats empty args properly' '
-    test_must_fail env GIT_TRACE=1 git frotz a "" b " " c 2>actual.raw &&
-    sed -ne "/run_command:/s/.*trace: run_command: //p" actual.raw >actual &&
-    echo "git-frotz a '\'''\'' b '\'' '\'' c" >expect &&
-    test_cmp expect actual
+	test_must_fail env GIT_TRACE=1 git frotz a "" b " " c 2>actual.raw &&
+	sed -ne "/run_command:/s/.*trace: run_command: //p" actual.raw >actual &&
+	echo "git-frotz a '\'''\'' b '\'' '\'' c" >expect &&
+	test_cmp expect actual
 '
 
 test_expect_success 'tracing a shell alias with arguments shows trace of prepared command' '

From e34bacd6d919733acf67cc644a13716a2f855d53 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 12 Aug 2020 15:06:17 +0000
Subject: [PATCH 358/553] git-gui: accommodate for intent-to-add files

As of Git v2.28.0, the diff for files staged via `git add -N` marks them
as new files. Git GUI was ill-prepared for that, and this patch teaches
Git GUI about them.

Please note that this will not even fix things with v2.28.0, as the
`rp/apply-cached-with-i-t-a` patches are required on Git's side, too.

This fixes https://github.com/git-for-windows/git/issues/2779

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
---
 git-gui/git-gui.sh   |  2 ++
 git-gui/lib/diff.tcl | 12 ++++++++----
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/git-gui/git-gui.sh b/git-gui/git-gui.sh
index d3d3aa14a9b462..418cdf6c705f87 100755
--- a/git-gui/git-gui.sh
+++ b/git-gui/git-gui.sh
@@ -1934,6 +1934,7 @@ set all_icons(U$ui_index)   file_merge
 set all_icons(T$ui_index)   file_statechange
 
 set all_icons(_$ui_workdir) file_plain
+set all_icons(A$ui_workdir) file_plain
 set all_icons(M$ui_workdir) file_mod
 set all_icons(D$ui_workdir) file_question
 set all_icons(U$ui_workdir) file_merge
@@ -1960,6 +1961,7 @@ foreach i {
 		{A_ {mc "Staged for commit"}}
 		{AM {mc "Portions staged for commit"}}
 		{AD {mc "Staged for commit, missing"}}
+		{AA {mc "Intended to be added"}}
 
 		{_D {mc "Missing"}}
 		{D_ {mc "Staged for removal"}}
diff --git a/git-gui/lib/diff.tcl b/git-gui/lib/diff.tcl
index 442737ba4f260b..003e4613f3495b 100644
--- a/git-gui/lib/diff.tcl
+++ b/git-gui/lib/diff.tcl
@@ -554,7 +554,8 @@ proc apply_or_revert_hunk {x y revert} {
 	if {$current_diff_side eq $ui_index} {
 		set failed_msg [mc "Failed to unstage selected hunk."]
 		lappend apply_cmd --reverse --cached
-		if {[string index $mi 0] ne {M}} {
+		set file_state [string index $mi 0]
+		if {$file_state ne {M} && $file_state ne {A}} {
 			unlock_index
 			return
 		}
@@ -567,7 +568,8 @@ proc apply_or_revert_hunk {x y revert} {
 			lappend apply_cmd --cached
 		}
 
-		if {[string index $mi 1] ne {M}} {
+		set file_state [string index $mi 1]
+		if {$file_state ne {M} && $file_state ne {A}} {
 			unlock_index
 			return
 		}
@@ -659,7 +661,8 @@ proc apply_or_revert_range_or_line {x y revert} {
 		set failed_msg [mc "Failed to unstage selected line."]
 		set to_context {+}
 		lappend apply_cmd --reverse --cached
-		if {[string index $mi 0] ne {M}} {
+		set file_state [string index $mi 0]
+		if {$file_state ne {M} && $file_state ne {A}} {
 			unlock_index
 			return
 		}
@@ -674,7 +677,8 @@ proc apply_or_revert_range_or_line {x y revert} {
 			lappend apply_cmd --cached
 		}
 
-		if {[string index $mi 1] ne {M}} {
+		set file_state [string index $mi 1]
+		if {$file_state ne {M} && $file_state ne {A}} {
 			unlock_index
 			return
 		}

From 7836eb8c6b7f2ab6d38dfe7b844359e15da1cac2 Mon Sep 17 00:00:00 2001
From: Ian Bearman <ianb@microsoft.com>
Date: Fri, 31 Jan 2020 16:00:25 -0800
Subject: [PATCH 359/553] vcbuild: install ARM64 dependencies when building
 ARM64 binaries

Co-authored-by: Dennis Ameling <dennis@dennisameling.com>
Signed-off-by: Ian Bearman <ianb@microsoft.com>
Signed-off-by: Dennis Ameling <dennis@dennisameling.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/vcbuild/README              | 6 +++++-
 compat/vcbuild/vcpkg_copy_dlls.bat | 7 ++++++-
 compat/vcbuild/vcpkg_install.bat   | 9 +++++++--
 3 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/compat/vcbuild/README b/compat/vcbuild/README
index 29ec1d0f104b80..1df1cabb1ebbbd 100644
--- a/compat/vcbuild/README
+++ b/compat/vcbuild/README
@@ -6,7 +6,11 @@ The Steps to Build Git with VS2015 or VS2017 from the command line.
    Prompt or from an SDK bash window:
 
    $ cd <repo_root>
-   $ ./compat/vcbuild/vcpkg_install.bat
+   $ ./compat/vcbuild/vcpkg_install.bat x64-windows
+
+   or
+
+   $ ./compat/vcbuild/vcpkg_install.bat arm64-windows
 
    The vcpkg tools and all of the third-party sources will be installed
    in this folder:
diff --git a/compat/vcbuild/vcpkg_copy_dlls.bat b/compat/vcbuild/vcpkg_copy_dlls.bat
index 13661c14f8705c..8bea0cbf83b6cf 100644
--- a/compat/vcbuild/vcpkg_copy_dlls.bat
+++ b/compat/vcbuild/vcpkg_copy_dlls.bat
@@ -15,7 +15,12 @@ REM ================================================================
 	@FOR /F "delims=" %%D IN ("%~dp0") DO @SET cwd=%%~fD
 	cd %cwd%
 
-	SET arch=x64-windows
+	SET arch=%2
+	IF NOT DEFINED arch (
+		echo defaulting to 'x64-windows`. Invoke %0 with 'x86-windows', 'x64-windows', or 'arm64-windows'
+		set arch=x64-windows
+	)
+
 	SET inst=%cwd%vcpkg\installed\%arch%
 
 	IF [%1]==[release] (
diff --git a/compat/vcbuild/vcpkg_install.bat b/compat/vcbuild/vcpkg_install.bat
index 8330d8120fb511..cacef18c11dc79 100644
--- a/compat/vcbuild/vcpkg_install.bat
+++ b/compat/vcbuild/vcpkg_install.bat
@@ -31,6 +31,12 @@ REM ================================================================
 
 	SETLOCAL EnableDelayedExpansion
 
+	SET arch=%1
+	IF NOT DEFINED arch (
+		echo defaulting to 'x64-windows`. Invoke %0 with 'x86-windows', 'x64-windows', or 'arm64-windows'
+		set arch=x64-windows
+	)
+
 	@FOR /F "delims=" %%D IN ("%~dp0") DO @SET cwd=%%~fD
 	cd %cwd%
 
@@ -55,9 +61,8 @@ REM ================================================================
 	echo Successfully installed %cwd%vcpkg\vcpkg.exe
 
 :install_libraries
-	SET arch=x64-windows
 
-	echo Installing third-party libraries...
+	echo Installing third-party libraries(%arch%)...
 	FOR %%i IN (zlib expat libiconv openssl libssh2 curl) DO (
 	    cd %cwd%vcpkg
 	    IF NOT EXIST "packages\%%i_%arch%" CALL :sub__install_one %%i

From 9e97ba9d4c0f9b8c88ba4c29385596c18d58ab07 Mon Sep 17 00:00:00 2001
From: Ian Bearman <ianb@microsoft.com>
Date: Tue, 4 Feb 2020 10:34:40 -0800
Subject: [PATCH 360/553] vcbuild: add an option to install individual
 'features'

In this context, a "feature" is a dependency combined with its own
dependencies.

Signed-off-by: Ian Bearman <ianb@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/vcbuild/vcpkg_install.bat | 35 +++++++++++++++++++++++++++++++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/compat/vcbuild/vcpkg_install.bat b/compat/vcbuild/vcpkg_install.bat
index cacef18c11dc79..8da212487ae97d 100644
--- a/compat/vcbuild/vcpkg_install.bat
+++ b/compat/vcbuild/vcpkg_install.bat
@@ -85,14 +85,47 @@ REM ================================================================
 :sub__install_one
 	echo     Installing package %1...
 
+	call :%1_features
+
 	REM vcpkg may not be reliable on slow, intermittent or proxy
 	REM connections, see e.g.
 	REM https://social.msdn.microsoft.com/Forums/windowsdesktop/en-US/4a8f7be5-5e15-4213-a7bb-ddf424a954e6/winhttpsendrequest-ends-with-12002-errorhttptimeout-after-21-seconds-no-matter-what-timeout?forum=windowssdk
 	REM which explains the hidden 21 second timeout
 	REM (last post by Dave : Microsoft - Windows Networking team)
 
-	.\vcpkg.exe install %1:%arch%
+	.\vcpkg.exe install %1%features%:%arch%
 	IF ERRORLEVEL 1 ( EXIT /B 1 )
 
 	echo     Finished %1
 	goto :EOF
+
+::
+:: features for each vcpkg to install
+:: there should be an entry here for each package to install
+:: 'set features=' means use the default otherwise
+:: 'set features=[comma-delimited-feature-set]' is the syntax
+::
+
+:zlib_features
+set features=
+goto :EOF
+
+:expat_features
+set features=
+goto :EOF
+
+:libiconv_features
+set features=
+goto :EOF
+
+:openssl_features
+set features=
+goto :EOF
+
+:libssh2_features
+set features=
+goto :EOF
+
+:curl_features
+set features=[core,openssl]
+goto :EOF

From a57a900d1f695ace778578bdbd3d9f8ae43031c4 Mon Sep 17 00:00:00 2001
From: Philip Oakley <philipoakley@iee.email>
Date: Sun, 6 Oct 2019 18:40:55 +0100
Subject: [PATCH 361/553] vcpkg_install: detect lack of Git

The vcpkg_install batch file depends on the availability of a
working Git on the CMD path. This may not be present if the user
has selected the 'bash only' option during Git-for-Windows install.

Detect and tell the user about their lack of a working Git in the CMD
window.

Fixes #2348.
A separate PR https://github.com/git-for-windows/build-extra/pull/258
now highlights the recommended path setting during install.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
---
 compat/vcbuild/vcpkg_install.bat | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/compat/vcbuild/vcpkg_install.bat b/compat/vcbuild/vcpkg_install.bat
index ebd0bad242a8ca..bcbbf536af3141 100644
--- a/compat/vcbuild/vcpkg_install.bat
+++ b/compat/vcbuild/vcpkg_install.bat
@@ -36,6 +36,13 @@ REM ================================================================
 
 	dir vcpkg\vcpkg.exe >nul 2>nul && GOTO :install_libraries
 
+	git.exe version 2>nul
+	IF ERRORLEVEL 1 (
+	echo "***"
+	echo "Git not found. Please adjust your CMD path or Git install option."
+	echo "***"
+	EXIT /B 1 )
+
 	echo Fetching vcpkg in %cwd%vcpkg
 	git.exe clone https://github.com/Microsoft/vcpkg vcpkg
 	IF ERRORLEVEL 1 ( EXIT /B 1 )

From a91e61c8de4eefa9bcfc4541b08048ff4422b2d8 Mon Sep 17 00:00:00 2001
From: Dennis Ameling <dennis@dennisameling.com>
Date: Fri, 4 Dec 2020 14:11:34 +0100
Subject: [PATCH 362/553] cmake: allow building for Windows/ARM64

Signed-off-by: Dennis Ameling <dennis@dennisameling.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 contrib/buildsystems/CMakeLists.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index 28877feb9d1707..3768c60cc89c32 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -65,9 +65,9 @@ if(USE_VCPKG)
 	set(VCPKG_DIR "${CMAKE_SOURCE_DIR}/compat/vcbuild/vcpkg")
 	if(NOT EXISTS ${VCPKG_DIR})
 		message("Initializing vcpkg and building the Git's dependencies (this will take a while...)")
-		execute_process(COMMAND ${CMAKE_SOURCE_DIR}/compat/vcbuild/vcpkg_install.bat)
+		execute_process(COMMAND ${CMAKE_SOURCE_DIR}/compat/vcbuild/vcpkg_install.bat ${VCPKG_ARCH})
 	endif()
-	list(APPEND CMAKE_PREFIX_PATH "${VCPKG_DIR}/installed/x64-windows")
+	list(APPEND CMAKE_PREFIX_PATH "${VCPKG_DIR}/installed/${VCPKG_ARCH}")
 
 	# In the vcpkg edition, we need this to be able to link to libcurl
 	set(CURL_NO_CURL_CMAKE ON)
@@ -1197,7 +1197,7 @@ string(REPLACE "@USE_LIBPCRE2@" "" git_build_options "${git_build_options}")
 string(REPLACE "@WITH_BREAKING_CHANGES@" "" git_build_options "${git_build_options}")
 string(REPLACE "@X@" "${EXE_EXTENSION}" git_build_options "${git_build_options}")
 if(USE_VCPKG)
-	string(APPEND git_build_options "PATH=\"$PATH:$TEST_DIRECTORY/../compat/vcbuild/vcpkg/installed/x64-windows/bin\"\n")
+	string(APPEND git_build_options "PATH=\"$PATH:$TEST_DIRECTORY/../compat/vcbuild/vcpkg/installed/${VCPKG_ARCH}/bin\"\n")
 endif()
 file(WRITE ${CMAKE_BINARY_DIR}/GIT-BUILD-OPTIONS ${git_build_options})
 

From 332d431448f6f9d87060e576bd0c474545596531 Mon Sep 17 00:00:00 2001
From: Philip Oakley <philipoakley@iee.email>
Date: Sun, 6 Oct 2019 18:43:57 +0100
Subject: [PATCH 363/553] vcpkg_install: add comment regarding slow network
 connections

The vcpkg downloads may not succeed. Warn careful readers about the timeout.

A simple retry will usually resolve the issue.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/vcbuild/vcpkg_install.bat | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/compat/vcbuild/vcpkg_install.bat b/compat/vcbuild/vcpkg_install.bat
index bcbbf536af3141..8330d8120fb511 100644
--- a/compat/vcbuild/vcpkg_install.bat
+++ b/compat/vcbuild/vcpkg_install.bat
@@ -80,6 +80,12 @@ REM ================================================================
 :sub__install_one
 	echo     Installing package %1...
 
+	REM vcpkg may not be reliable on slow, intermittent or proxy
+	REM connections, see e.g.
+	REM https://social.msdn.microsoft.com/Forums/windowsdesktop/en-US/4a8f7be5-5e15-4213-a7bb-ddf424a954e6/winhttpsendrequest-ends-with-12002-errorhttptimeout-after-21-seconds-no-matter-what-timeout?forum=windowssdk
+	REM which explains the hidden 21 second timeout
+	REM (last post by Dave : Microsoft - Windows Networking team)
+
 	.\vcpkg.exe install %1:%arch%
 	IF ERRORLEVEL 1 ( EXIT /B 1 )
 

From 5946c3536c0796680441b6d1d0ac45fbf187894c Mon Sep 17 00:00:00 2001
From: Dennis Ameling <dennis@dennisameling.com>
Date: Sun, 29 Nov 2020 00:12:26 +0100
Subject: [PATCH 364/553] ci(vs-build) also build Windows/ARM64 artifacts

There are no Windows/ARM64 agents in GitHub Actions yet, therefore we
just skip adjusting the `vs-test` job for now.

Signed-off-by: Dennis Ameling <dennis@dennisameling.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 .github/workflows/main.yml | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index f2e93f54611b62..b1e485e5f503ca 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -169,8 +169,11 @@ jobs:
       NO_PERL: 1
       GIT_CONFIG_PARAMETERS: "'user.name=CI' 'user.email=ci@git'"
     runs-on: windows-latest
+    strategy:
+      matrix:
+        arch: [x64, arm64]
     concurrency:
-      group: vs-build-${{ github.ref }}
+      group: vs-build-${{ github.ref }}-${{ matrix.arch }}
       cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }}
     steps:
     - uses: actions/checkout@v5
@@ -189,14 +192,14 @@ jobs:
       uses: microsoft/setup-msbuild@v2
     - name: copy dlls to root
       shell: cmd
-      run: compat\vcbuild\vcpkg_copy_dlls.bat release
+      run: compat\vcbuild\vcpkg_copy_dlls.bat release ${{ matrix.arch }}-windows
     - name: generate Visual Studio solution
       shell: bash
       run: |
-        cmake `pwd`/contrib/buildsystems/ -DCMAKE_PREFIX_PATH=`pwd`/compat/vcbuild/vcpkg/installed/x64-windows \
-        -DNO_GETTEXT=YesPlease -DPERL_TESTS=OFF -DPYTHON_TESTS=OFF -DCURL_NO_CURL_CMAKE=ON
+        cmake `pwd`/contrib/buildsystems/ -DCMAKE_PREFIX_PATH=`pwd`/compat/vcbuild/vcpkg/installed/${{ matrix.arch }}-windows \
+        -DNO_GETTEXT=YesPlease -DPERL_TESTS=OFF -DPYTHON_TESTS=OFF -DCURL_NO_CURL_CMAKE=ON -DCMAKE_GENERATOR_PLATFORM=${{ matrix.arch }} -DVCPKG_ARCH=${{ matrix.arch }}-windows
     - name: MSBuild
-      run: msbuild git.sln -property:Configuration=Release -property:Platform=x64 -maxCpuCount:4 -property:PlatformToolset=v142
+      run: msbuild git.sln -property:Configuration=Release -property:Platform=${{ matrix.arch }} -maxCpuCount:4 -property:PlatformToolset=v142
     - name: bundle artifact tar
       shell: bash
       env:
@@ -210,7 +213,7 @@ jobs:
     - name: upload tracked files and build artifacts
       uses: actions/upload-artifact@v5
       with:
-        name: vs-artifacts
+        name: vs-artifacts-${{ matrix.arch }}
         path: artifacts
   vs-test:
     name: win+VS test
@@ -228,7 +231,7 @@ jobs:
     - name: download tracked files and build artifacts
       uses: actions/download-artifact@v6
       with:
-        name: vs-artifacts
+        name: vs-artifacts-x64
         path: ${{github.workspace}}
     - name: extract tracked files and build artifacts
       shell: bash

From 05894290484a501bf1a9616ac0a99f11077ad543 Mon Sep 17 00:00:00 2001
From: Dennis Ameling <dennis@dennisameling.com>
Date: Sun, 6 Dec 2020 18:39:26 +0100
Subject: [PATCH 365/553] Add schannel to curl installation

Signed-off-by: Dennis Ameling <dennis@dennisameling.com>
---
 compat/vcbuild/vcpkg_install.bat | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/compat/vcbuild/vcpkg_install.bat b/compat/vcbuild/vcpkg_install.bat
index 8da212487ae97d..575c65c20ba307 100644
--- a/compat/vcbuild/vcpkg_install.bat
+++ b/compat/vcbuild/vcpkg_install.bat
@@ -127,5 +127,5 @@ set features=
 goto :EOF
 
 :curl_features
-set features=[core,openssl]
+set features=[core,openssl,schannel]
 goto :EOF

From 397169fe77b4d5716711066eaf76572cc2367a35 Mon Sep 17 00:00:00 2001
From: Dennis Ameling <dennis@dennisameling.com>
Date: Mon, 19 Jul 2021 13:02:16 +0200
Subject: [PATCH 366/553] cmake(): allow setting HOST_CPU for cross-compilation

Git's regular Makefile mentions that HOST_CPU should be defined when
cross-compiling Git:
https://github.com/git-for-windows/git/blob/37796bca76ef4180c39ee508ca3e42c0777ba444/Makefile#L438-L439

This is then used to set the GIT_HOST_CPU variable when compiling Git:
https://github.com/git-for-windows/git/blob/37796bca76ef4180c39ee508ca3e42c0777ba444/Makefile#L1337-L1341

Then, when the user runs `git version --build-options`, it returns that
value:
https://github.com/git-for-windows/git/blob/37796bca76ef4180c39ee508ca3e42c0777ba444/help.c#L658

This commit adds the same functionality to the CMake configuration.
Users can now pass -DHOST_CPU=<cpu> to set the target architecture.

Signed-off-by: Dennis Ameling <dennis@dennisameling.com>
---
 .github/workflows/main.yml          | 2 +-
 contrib/buildsystems/CMakeLists.txt | 9 ++++++++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index b1e485e5f503ca..39e7e63c986bd1 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -197,7 +197,7 @@ jobs:
       shell: bash
       run: |
         cmake `pwd`/contrib/buildsystems/ -DCMAKE_PREFIX_PATH=`pwd`/compat/vcbuild/vcpkg/installed/${{ matrix.arch }}-windows \
-        -DNO_GETTEXT=YesPlease -DPERL_TESTS=OFF -DPYTHON_TESTS=OFF -DCURL_NO_CURL_CMAKE=ON -DCMAKE_GENERATOR_PLATFORM=${{ matrix.arch }} -DVCPKG_ARCH=${{ matrix.arch }}-windows
+        -DNO_GETTEXT=YesPlease -DPERL_TESTS=OFF -DPYTHON_TESTS=OFF -DCURL_NO_CURL_CMAKE=ON -DCMAKE_GENERATOR_PLATFORM=${{ matrix.arch }} -DVCPKG_ARCH=${{ matrix.arch }}-windows -DHOST_CPU=${{ matrix.arch }}
     - name: MSBuild
       run: msbuild git.sln -property:Configuration=Release -property:Platform=${{ matrix.arch }} -maxCpuCount:4 -property:PlatformToolset=v142
     - name: bundle artifact tar
diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index 3768c60cc89c32..0146cccb436502 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -212,7 +212,14 @@ endif()
 
 #default behaviour
 include_directories(${CMAKE_SOURCE_DIR})
-add_compile_definitions(GIT_HOST_CPU="${CMAKE_SYSTEM_PROCESSOR}")
+
+# When cross-compiling, define HOST_CPU as the canonical name of the CPU on
+# which the built Git will run (for instance "x86_64").
+if(NOT HOST_CPU)
+	add_compile_definitions(GIT_HOST_CPU="${CMAKE_SYSTEM_PROCESSOR}")
+else()
+	add_compile_definitions(GIT_HOST_CPU="${HOST_CPU}")
+endif()
 add_compile_definitions(SHA256_BLK INTERNAL_QSORT RUNTIME_PREFIX)
 add_compile_definitions(NO_OPENSSL SHA1_DC SHA1DC_NO_STANDARD_INCLUDES
 			SHA1DC_INIT_SAFE_HASH_DEFAULT=0

From c9629b7a59507fa2bf2b947e0113549f173f2e29 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 2 Apr 2021 22:50:54 +0200
Subject: [PATCH 367/553] mingw: allow for longer paths in
 `parse_interpreter()`
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

As reported in https://github.com/newren/git-filter-repo/pull/225, it
looks like 99 bytes is not really sufficient to represent e.g. the full
path to Python when installed via Windows Store (and this path is used
in the hash bang line when installing scripts via `pip`).

Let's increase it to what is probably the maximum sensible path size:
MAX_PATH. This brings `parse_interpreter()` in line with what
`lookup_prog()` handles.
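
To illustrate why the buffer size matters, here is a simplified sketch.
It is not the code in compat/mingw.c: `shebang_interpreter()` is a
hypothetical helper that takes the script contents as a string instead
of reading a file, and it assumes that a hash bang line which does not
end within the buffer is treated as unparseable.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the buffer handling discussed above.
 * Copies at most bufsize - 1 bytes of `script` into `buf`, then looks
 * for a "#!" line that ends within the copied bytes. Returns the
 * interpreter path (pointing into `buf`), or NULL.
 */
static const char *shebang_interpreter(const char *script,
				       char *buf, size_t bufsize)
{
	size_t n = strlen(script);
	size_t eol;

	if (bufsize < 4)
		return NULL;
	if (n > bufsize - 1)
		n = bufsize - 1;	/* bytes beyond the buffer are lost */
	memcpy(buf, script, n);
	buf[n] = '\0';

	if (buf[0] != '#' || buf[1] != '!')
		return NULL;

	eol = strcspn(buf, "\r\n");
	if (!buf[eol])
		return NULL;	/* no line end seen: the line did not fit */
	buf[eol] = '\0';

	return buf + 2;
}
```

With a 100-byte buffer, a hash bang line carrying a 150-character
interpreter path yields NULL in this sketch, while a 260-byte buffer
(the value of MAX_PATH in the Windows headers) returns the full path.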

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Vilius Šumskas <vilius@sumskas.eu>
---
 compat/mingw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index f09b49ff21ddab..7462f0f3de46fd 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -1367,7 +1367,7 @@ static const char *quote_arg_msys2(const char *arg)
 
 static const char *parse_interpreter(const char *cmd)
 {
-	static char buf[100];
+	static char buf[MAX_PATH];
 	char *p, *opt;
 	ssize_t n; /* read() can return negative values */
 	int fd;

From d8c2a1ded762338dab2bdc9bb6b0ef42b1d4a91f Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 17 May 2021 10:46:52 +0200
Subject: [PATCH 368/553] compat/vcbuild: document preferred way to build in
 Visual Studio

We used to have that `make vcxproj` hack, but a hack it is. In the
meantime, we have a much cleaner solution: using CMake, either
explicitly, or even more conveniently via Visual Studio's built-in CMake
support (simply open Git's top-level directory via File>Open>Folder...).

Let's let the `README` reflect this.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/vcbuild/README | 28 +++++++++-------------------
 1 file changed, 9 insertions(+), 19 deletions(-)

diff --git a/compat/vcbuild/README b/compat/vcbuild/README
index 29ec1d0f104b80..5c71ea2daa4017 100644
--- a/compat/vcbuild/README
+++ b/compat/vcbuild/README
@@ -37,27 +37,17 @@ The Steps to Build Git with VS2015 or VS2017 from the command line.
 
 ================================================================
 
-Alternatively, run `make vcxproj` and then load the generated `git.sln` in
-Visual Studio. The initial build will install the vcpkg system and build the
+Alternatively, just open Git's top-level directory in Visual Studio, via
+`File>Open>Folder...`. This will use CMake internally to generate the
+project definitions. It will also install the vcpkg system and build the
 dependencies automatically. This will take a while.
 
-Instead of generating the `git.sln` file yourself (which requires a full Git
-for Windows SDK), you may want to consider fetching the `vs/master` branch of
-https://github.com/git-for-windows/git instead (which is updated automatically
-via CI running `make vcxproj`). The `vs/master` branch does not require a Git
-for Windows to build, but you can run the test scripts in a regular Git Bash.
-
-Note that `make vcxproj` will automatically add and commit the generated `.sln`
-and `.vcxproj` files to the repo. This is necessary to allow building a
-fully-testable Git in Visual Studio, where a regular Git Bash can be used to
-run the test scripts (as opposed to a full Git for Windows SDK): a number of
-build targets, such as Git commands implemented as Unix shell scripts (where
-`@@SHELL_PATH@@` and other placeholders are interpolated) require a full-blown
-Git for Windows SDK (which is about 10x the size of a regular Git for Windows
-installation).
-
-If your plan is to open a Pull Request with Git for Windows, it is a good idea
-to drop this commit before submitting.
+You can also generate the Visual Studio solution manually by downloading
+and running CMake explicitly rather than letting Visual Studio doing
+that implicitly.
+
+Another, deprecated option is to run `make vcxproj`. This option is
+superseded by the CMake-based build, and will be removed at some point.
 
 ================================================================
 The Steps of Build Git with VS2008

From 151b4192ad78f743967731fc5f5ee3be8bfeebaf Mon Sep 17 00:00:00 2001
From: Philip Oakley <philipoakley@iee.email>
Date: Fri, 2 Jul 2021 00:30:24 +0100
Subject: [PATCH 369/553] CMake: default Visual Studio generator has changed

Correct some wording and inform users regarding the Visual Studio
changes (from V16.6) to the default generator.

Subsequent commits ensure that Git for Windows can be directly
opened in modern Visual Studio without needing special configuration
of the CMakeLists settings.

It appears that internally Visual Studio creates its own version of the
.sln file (etc.) for extension tools that expect them.

The large number of references below document the shifting of Visual Studio
default and CMake setting options.

refs: https://docs.microsoft.com/en-us/search/?scope=C%2B%2B&view=msvc-150&terms=Ninja

1. https://docs.microsoft.com/en-us/cpp/linux/cmake-linux-configure?view=msvc-160
(note the linux bit)
 "In Visual Studio 2019 version 16.6 or later ***, Ninja is the default
generator for configurations targeting a remote system or WSL. For more
information, see this post on the C++ Team Blog
[https://devblogs.microsoft.com/cppblog/linux-development-with-visual-studio-first-class-support-for-gdbserver-improved-build-times-with-ninja-and-updates-to-the-connection-manager/].

For more information about these settings, see CMakeSettings.json reference
[https://docs.microsoft.com/en-us/cpp/build/cmakesettings-reference?view=msvc-160]."

2. https://docs.microsoft.com/en-us/cpp/build/cmake-presets-vs?view=msvc-160
"CMake supports two files that allow users to specify common configure,
build, and test options and share them with others: CMakePresets.json
and CMakeUserPresets.json."

" Both files are supported in Visual Studio 2019 version 16.10 or later.
***"
3. https://devblogs.microsoft.com/cppblog/linux-development-with-visual-studio-first-class-support-for-gdbserver-improved-build-times-with-ninja-and-updates-to-the-connection-manager/
" Ninja has been the default generator (underlying build system) for
CMake configurations targeting Windows for some time***, but in Visual
Studio 2019 version 16.6 Preview 3*** we added support for Ninja on Linux."

4. https://docs.microsoft.com/en-us/cpp/build/cmakesettings-reference?view=msvc-160
" `generator`: specifies CMake generator to use for this configuration.
May be one of:

    Visual Studio 2019 only:
        Visual Studio 16 2019
        Visual Studio 16 2019 Win64
        Visual Studio 16 2019 ARM

    Visual Studio 2017 and later:
        Visual Studio 15 2017
        Visual Studio 15 2017 Win64
        Visual Studio 15 2017 ARM
        Visual Studio 14 2015
        Visual Studio 14 2015 Win64
        Visual Studio 14 2015 ARM
        Unix Makefiles
        Ninja

Because Ninja is designed for fast build speeds instead of flexibility
and function, it is set as the default. However, some CMake projects may
be unable to correctly build using Ninja. If this occurs, you can
instruct CMake to generate Visual Studio projects instead.

To specify a Visual Studio generator in Visual Studio 2017, open the
settings editor from the main menu by choosing CMake | Change CMake
Settings. Delete "Ninja" and type "V". This activates IntelliSense,
which enables you to choose the generator you want."

"To specify a Visual Studio generator in Visual Studio 2019, right-click
on the CMakeLists.txt file in Solution Explorer and choose CMake
Settings for project > Show Advanced Settings > CMake Generator.

When the active configuration specifies a Visual Studio generator, by
default MSBuild.exe is invoked with` -m -v:minimal` arguments."

5. https://docs.microsoft.com/en-us/cpp/build/cmake-presets-vs?view=msvc-160#enable-cmakepresetsjson-integration-in-visual-studio-2019
"Enable CMakePresets.json integration in Visual Studio 2019

CMakePresets.json integration isn't enabled by default in Visual Studio
2019. You can enable it for all CMake projects in Tools > Options >
CMake > General: (tick a box)" ... see more.

6. https://docs.microsoft.com/en-us/cpp/build/cmakesettings-reference?view=msvc-140
(whichever v140 is..)
"CMake projects are supported in Visual Studio 2017 and later."

7. https://docs.microsoft.com/en-us/cpp/overview/what-s-new-for-cpp-2017?view=msvc-150
"Support added for the CMake Ninja generator."

8. https://docs.microsoft.com/en-us/cpp/overview/what-s-new-for-cpp-2017?view=msvc-150#cmake-support-via-open-folder
"CMake support via Open Folder
Visual Studio 2017 introduces support for using CMake projects without
converting to MSBuild project files (.vcxproj). For more information,
see CMake projects in Visual
Studio[https://docs.microsoft.com/en-us/cpp/build/cmake-projects-in-visual-studio?view=msvc-150].
Opening CMake projects with Open Folder automatically configures the
environment for C++ editing, building, and debugging." ... +more!

9. https://docs.microsoft.com/en-us/cpp/build/cmake-presets-vs?view=msvc-160#supported-cmake-and-cmakepresetsjson-versions
"Visual Studio reads and evaluates CMakePresets.json and
CMakeUserPresets.json itself and doesn't invoke CMake directly with the
--preset option. So, CMake version 3.20 or later isn't strictly required
when you're building with CMakePresets.json inside Visual Studio. We
recommend using CMake version 3.14 or later."

10. https://docs.microsoft.com/en-us/cpp/build/cmake-presets-vs?view=msvc-160#enable-cmakepresetsjson-integration-in-visual-studio-2019
"If you don't want to enable CMakePresets.json integration for all CMake
projects, you can enable CMakePresets.json integration for a single
CMake project by adding a CMakePresets.json file to the root of the open
folder. You must close and reopen the folder in Visual Studio to
activate the integration."

11. https://docs.microsoft.com/en-us/cpp/build/cmake-presets-vs?view=msvc-160#default-configure-presets
***(doesn't actually say which version..)
"Default Configure Presets
If no CMakePresets.json or CMakeUserPresets.json file exists, or if
CMakePresets.json or CMakeUserPresets.json is invalid, Visual Studio
will fall back*** on the following default Configure Presets:

Windows example
JSON
{
  "name": "windows-default",
  "displayName": "Windows x64 Debug",
  "description": "Sets Ninja generator, compilers, x64 architecture,
build and install directory, debug build type",
  "generator": "Ninja",
  "binaryDir": "${sourceDir}/out/build/${presetName}",
  "architecture": {
    "value": "x64",
    "strategy": "external"
  },
  "cacheVariables": {
    "CMAKE_BUILD_TYPE": "Debug",
    "CMAKE_INSTALL_PREFIX": "${sourceDir}/out/install/${presetName}"
  },
  "vendor": {
    "microsoft.com/VisualStudioSettings/CMake/1.0": {
      "hostOS": [ "Windows" ]
    }
  }
},
"

Signed-off-by: Philip Oakley <philipoakley@iee.email>
---
 contrib/buildsystems/CMakeLists.txt | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index 0146cccb436502..bd7cb2659097f7 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -14,6 +14,11 @@ Note: Visual Studio also has the option of opening `CMakeLists.txt`
 directly; Using this option, Visual Studio will not find the source code,
 though, therefore the `File>Open>Folder...` option is preferred.
 
+Visual Studio does not produce a .sln solution file nor the .vcxproj files
+that may be required by VS extension tools.
+
+To generate the .sln/.vcxproj files run CMake manually, as described below.
+
 Instructions to run CMake manually:
 
     mkdir -p contrib/buildsystems/out
@@ -22,7 +27,7 @@ Instructions to run CMake manually:
 
 This will build the git binaries in contrib/buildsystems/out
 directory (our top-level .gitignore file knows to ignore contents of
-this directory).
+this directory). The project .sln and .vcxproj files are also generated.
 
 Possible build configurations(-DCMAKE_BUILD_TYPE) with corresponding
 compiler flags
@@ -35,17 +40,16 @@ empty(default) :
 NOTE: -DCMAKE_BUILD_TYPE is optional. For multi-config generators like Visual Studio
 this option is ignored
 
-This process generates a Makefile(Linux/*BSD/MacOS) , Visual Studio solution(Windows) by default.
+This process generates a Makefile(Linux/*BSD/MacOS), Visual Studio solution(Windows) by default.
 Run `make` to build Git on Linux/*BSD/MacOS.
 Open git.sln on Windows and build Git.
 
-NOTE: By default CMake uses Makefile as the build tool on Linux and Visual Studio in Windows,
-to use another tool say `ninja` add this to the command line when configuring.
-`-G Ninja`
-
 NOTE: By default CMake will install vcpkg locally to your source tree on configuration,
 to avoid this, add `-DNO_VCPKG=TRUE` to the command line when configuring.
 
+The Visual Studio default generator changed in v16.6 from its Visual Studio
+implemenation to `Ninja` This required changes to many CMake scripts.
+
 ]]
 cmake_minimum_required(VERSION 3.14)
 

From b2b5edc1d53cf5f69ff715bcd30a33d5ca86b9c5 Mon Sep 17 00:00:00 2001
From: Philip Oakley <philipoakley@iee.email>
Date: Sat, 24 Apr 2021 11:09:58 +0100
Subject: [PATCH 370/553] .gitignore: add Visual Studio CMakeSettings.json file

The CMakeSettings.json file is tool generated. Developers may track it
should they provide additional settings.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
---
 .gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitignore b/.gitignore
index 78a45cb5bec991..f9ce34708caaca 100644
--- a/.gitignore
+++ b/.gitignore
@@ -256,5 +256,6 @@ Release/
 /git.VC.db
 *.dSYM
 /contrib/buildsystems/out
+CMakeSettings.json
 /contrib/libgit-rs/target
 /contrib/libgit-sys/target

From 565378cbcfdebe5eb903e3302b3c227f17de0459 Mon Sep 17 00:00:00 2001
From: Victoria Dye <vdye@github.com>
Date: Thu, 5 Aug 2021 19:04:13 -0400
Subject: [PATCH 371/553] subtree: update `contrib/subtree` `test` target

The intention of this change is to align with how the top-level git
`Makefile` defines its own test target (which also internally calls
`$(MAKE) -C t/ all`). This change also ensures the consistency of
`make -C contrib/subtree test` with other testing in CI executions
(which rely on `$DEFAULT_TEST_TARGET` being defined as `prove`).

Signed-off-by: Victoria Dye <vdye@github.com>
---
 contrib/subtree/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/subtree/Makefile b/contrib/subtree/Makefile
index c0c9f21cb78022..dab2dfc08ee222 100644
--- a/contrib/subtree/Makefile
+++ b/contrib/subtree/Makefile
@@ -95,7 +95,7 @@ $(GIT_SUBTREE_TEST): $(GIT_SUBTREE)
 	cp $< $@
 
 test: $(GIT_SUBTREE_TEST)
-	$(MAKE) -C t/ test
+	$(MAKE) -C t/ all
 
 clean:
 	$(RM) $(GIT_SUBTREE)

From bf74d8943fc741bc68a4214ce972e5017d990233 Mon Sep 17 00:00:00 2001
From: Philip Oakley <philipoakley@iee.email>
Date: Thu, 22 Apr 2021 11:11:38 +0100
Subject: [PATCH 372/553] CMakeLists: add default "x64-windows" arch for Visual
 Studio

In Git-for-Windows, work on using ARM64 has progressed. The
commit 2d94b77b27 (cmake: allow building for Windows/ARM64, 2020-12-04)
failed to notice that /compat/vcbuild/vcpkg_install.bat will default to
using the "x64-windows" architecture for the vcpkg installation if not set,
but CMake is not told of this default. Commit 635b6d99b3 (vcbuild: install
ARM64 dependencies when building ARM64 binaries, 2020-01-31) later updated
vcpkg_install.bat to accept an arch (%1) parameter, but retained the default.

This default is necessary for the use case where the project directory is
opened directly in Visual Studio, which will find and build a CMakeLists.txt
file without any parameters, thus expecting use of the default setting.

Also, Visual Studio will generate internal .sln solution and .vcxproj project
files needed for some extension tools. Inform users of the additional
.sln/.vcxproj generation.

** How to test:
 rm -rf '.vs' # remove old visual studio settings
 rm -rf 'compat/vcbuild/vcpkg' # remove any vcpkg downloads
 rm -rf 'contrib/buildsystems/out' # remove builds & CMake artifacts
 with a fresh Visual Studio Community Edition, File>>Open>>(git *folder*)
   to load the project (which will take some time!).
 check for successful compilation.
The implicit .sln (etc.) are in the hidden .vs directory created by
Visual Studio.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
---
 contrib/buildsystems/CMakeLists.txt | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index bd7cb2659097f7..c9d4d42d7bb8c2 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -71,6 +71,10 @@ if(USE_VCPKG)
 		message("Initializing vcpkg and building the Git's dependencies (this will take a while...)")
 		execute_process(COMMAND ${CMAKE_SOURCE_DIR}/compat/vcbuild/vcpkg_install.bat ${VCPKG_ARCH})
 	endif()
+	if(NOT EXISTS ${VCPKG_ARCH})
+		message("VCPKG_ARCH: unset, using 'x64-windows'")
+		set(VCPKG_ARCH "x64-windows") # default from vcpkg_install.bat
+	endif()
 	list(APPEND CMAKE_PREFIX_PATH "${VCPKG_DIR}/installed/${VCPKG_ARCH}")
 
 	# In the vcpkg edition, we need this to be able to link to libcurl

From 77d6c57ad14916d6cb6de34ebb2f2de594b541aa Mon Sep 17 00:00:00 2001
From: Pascal Muller <pascalmuller@gmail.com>
Date: Wed, 23 Jun 2021 21:21:10 +0200
Subject: [PATCH 373/553] http: optionally send SSL client certificate

This adds support for a new http.sslAutoClientCert config value.

In cURL 7.77 or later the schannel backend does not automatically send
client certificates from the Windows Certificate Store anymore.

This config value is only used if http.sslBackend is set to "schannel",
and can be used to opt in to the old behavior and force cURL to send
client certificates.
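
The diff below combines this flag with the existing revocation mode
before making a single CURLOPT_SSL_OPTIONS call, since that option takes
one bitmask. A minimal sketch of the combination follows; the
`SKETCH_OPT_*` names and values are placeholders standing in for
libcurl's CURLSSLOPT_* constants, and `schannel_ssl_options()` is a
hypothetical helper, not a function in http.c.

```c
/*
 * Placeholder bit flags for illustration only. In Git the real values
 * come from libcurl's CURLSSLOPT_* constants.
 */
#define SKETCH_OPT_NO_REVOKE        (1L << 1)
#define SKETCH_OPT_AUTO_CLIENT_CERT (1L << 5)

/*
 * Hypothetical helper: merges the configured revocation mode and the
 * opt-in for automatic client certificates into one bitmask. A result
 * of 0 means there is nothing to pass on.
 */
static long schannel_ssl_options(long revoke_mode, int auto_client_cert)
{
	long ssl_options = 0;

	if (revoke_mode)
		ssl_options |= revoke_mode;
	if (auto_client_cert)
		ssl_options |= SKETCH_OPT_AUTO_CLIENT_CERT;

	return ssl_options;
}
```

Setting the bitmask once, rather than once per flag, avoids a later
call overwriting the flags set by an earlier one.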

This fixes https://github.com/git-for-windows/git/issues/3292

Signed-off-by: Pascal Muller <pascalmuller@gmail.com>
---
 Documentation/config/http.adoc |  5 +++++
 git-curl-compat.h              |  8 ++++++++
 http.c                         | 24 +++++++++++++++++++++---
 3 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/Documentation/config/http.adoc b/Documentation/config/http.adoc
index 9122c5dc23ea1a..7fd001206ded22 100644
--- a/Documentation/config/http.adoc
+++ b/Documentation/config/http.adoc
@@ -249,6 +249,11 @@ http.schannelUseSSLCAInfo::
 	when the `schannel` backend was configured via `http.sslBackend`,
 	unless `http.schannelUseSSLCAInfo` overrides this behavior.
 
+http.sslAutoClientCert::
+	As of cURL v7.77.0, the Secure Channel backend won't automatically
+	send client certificates from the Windows Certificate Store anymore.
+	To opt in to the old behavior, http.sslAutoClientCert can be set.
+
 http.pinnedPubkey::
 	Public key of the https service. It may either be the filename of
 	a PEM or DER encoded public key file or a string starting with
diff --git a/git-curl-compat.h b/git-curl-compat.h
index 659e5a3875e3d6..ecc2e742922313 100644
--- a/git-curl-compat.h
+++ b/git-curl-compat.h
@@ -37,6 +37,14 @@
 #define GIT_CURL_NEED_TRANSFER_ENCODING_HEADER
 #endif
 
+/**
+ * CURLSSLOPT_AUTO_CLIENT_CERT was added in 7.77.0, released in May
+ * 2021.
+ */
+#if LIBCURL_VERSION_NUM >= 0x074d00
+#define GIT_CURL_HAVE_CURLSSLOPT_AUTO_CLIENT_CERT
+#endif
+
 /**
  * CURLOPT_PROTOCOLS_STR and CURLOPT_REDIR_PROTOCOLS_STR were added in 7.85.0,
  * released in August 2022.
diff --git a/http.c b/http.c
index 81b10d04b61d10..7d2802955d47ab 100644
--- a/http.c
+++ b/http.c
@@ -162,6 +162,8 @@ static long http_schannel_check_revoke_mode =
  */
 static int http_schannel_use_ssl_cainfo;
 
+static int http_auto_client_cert;
+
 static int always_auth_proactively(void)
 {
 	return http_proactive_auth != PROACTIVE_AUTH_NONE &&
@@ -450,6 +452,11 @@ static int http_options(const char *var, const char *value,
 		return 0;
 	}
 
+	if (!strcmp("http.sslautoclientcert", var)) {
+		http_auto_client_cert = git_config_bool(var, value);
+		return 0;
+	}
+
 	if (!strcmp("http.minsessions", var)) {
 		min_curl_sessions = git_config_int(var, value, ctx->kvi);
 		if (min_curl_sessions > 1)
@@ -1074,9 +1081,20 @@ static CURL *get_curl_handle(void)
 	}
 #endif
 
-	if (http_ssl_backend && !strcmp("schannel", http_ssl_backend) &&
-	    http_schannel_check_revoke_mode) {
-		curl_easy_setopt(result, CURLOPT_SSL_OPTIONS, http_schannel_check_revoke_mode);
+	if (http_ssl_backend && !strcmp("schannel", http_ssl_backend)) {
+		long ssl_options = 0;
+		if (http_schannel_check_revoke_mode) {
+			ssl_options |= http_schannel_check_revoke_mode;
+		}
+
+		if (http_auto_client_cert) {
+#ifdef GIT_CURL_HAVE_CURLSSLOPT_AUTO_CLIENT_CERT
+			ssl_options |= CURLSSLOPT_AUTO_CLIENT_CERT;
+#endif
+		}
+
+		if (ssl_options)
+			curl_easy_setopt(result, CURLOPT_SSL_OPTIONS, ssl_options);
 	}
 
 	if (http_proactive_auth != PROACTIVE_AUTH_NONE)

From 637d76cb2a4af987d6cf3ffb2f7e4a624729f652 Mon Sep 17 00:00:00 2001
From: Victoria Dye <vdye@github.com>
Date: Thu, 5 Aug 2021 19:11:59 -0400
Subject: [PATCH 374/553] ci: run `contrib/subtree` tests in CI builds

Because `git subtree` (unlike most other `contrib` modules) is included as
part of the standard release of Git for Windows, its stability should be
verified as consistently as it is for the rest of git. By including the
`git subtree` tests in the CI workflow, these tests are as much of a gate to
merging and indicator of stability as the standard test suite.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 ci/run-build-and-tests.sh | 4 ++++
 ci/run-test-slice.sh      | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 8bda62b921920f..4fbc7a8b68ae67 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -60,5 +60,9 @@ case "$jobname" in
 	;;
 esac
 
+case " $MAKE_TARGETS " in
+*" all "*) make -C contrib/subtree test;;
+esac
+
 check_unignored_build_artifacts
 save_good_tree
diff --git a/ci/run-test-slice.sh b/ci/run-test-slice.sh
index 0444c79c023c82..6e21260e17543b 100755
--- a/ci/run-test-slice.sh
+++ b/ci/run-test-slice.sh
@@ -15,4 +15,7 @@ if [ "$1" == "0" ] ; then
 	group "Run unit tests" make --quiet -C t unit-tests-test-tool
 fi
 
+# Run the git subtree tests only if main tests succeeded
+test 0 != "$1" || make -C contrib/subtree test
+
 check_unignored_build_artifacts

From 9e4b1beafe587dced744147dfd482eac2779b519 Mon Sep 17 00:00:00 2001
From: Philip Oakley <philipoakley@iee.email>
Date: Sun, 31 Oct 2021 23:15:13 +0000
Subject: [PATCH 375/553] hash-object: demonstrate a >4GB/LLP64 problem

On LLP64 systems, such as Windows, the size of `long`, `int`, etc. is
only 32 bits (for backward compatibility). Git's use of `unsigned long`
for file memory sizes in many places, rather than size_t, limits the
handling of large files on LLP64 systems (commonly given as `>4GB`).

Provide a minimum test for handling a >4GB file. The `hash-object`
command, with the `--literally` option and without the `-w` option, avoids
writing the object, either loose or packed. This avoids the code paths
hitting the `bigFileThreshold` config test code, the zlib code, and the
pack code.

Subsequent patches will walk the test's call chain, converting types to
`size_t` (which is larger in LLP64 data models) where appropriate.
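
The truncation itself can be sketched in a few lines. This is an
illustration only: `llp64_ulong` is a hypothetical typedef that uses
`uint32_t` to stand in for a 32-bit `unsigned long`, so that the effect
is reproducible on any platform, and the helper names are made up for
this example.

```c
#include <stdint.h>

/*
 * Stand-in for `unsigned long` on an LLP64 platform, where that type
 * is 32 bits wide.
 */
typedef uint32_t llp64_ulong;

/* Size of the file generated by the test: 5 GiB. */
static uint64_t five_gib(void)
{
	return (uint64_t)5 * 1024 * 1024 * 1024;
}

/* What remains of a 64-bit size after passing through the 32-bit type. */
static uint64_t size_seen_through_llp64_ulong(uint64_t size)
{
	llp64_ulong narrowed = (llp64_ulong)size; /* keeps the low 32 bits */

	return narrowed;
}
```

In this sketch a 5 GiB size comes back as 1 GiB, because only the low
32 bits survive the conversion.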

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t1007-hash-object.sh | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index de076293b62a76..7867fd1dbf940c 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -49,6 +49,9 @@ test_expect_success 'setup' '
 
 	example sha1:ddd3f836d3e3fbb7ae289aa9ae83536f76956399
 	example sha256:b44fe1fe65589848253737db859bd490453510719d7424daab03daf0767b85ae
+
+	large5GB sha1:0be2be10a4c8764f32c4bf372a98edc731a4b204
+	large5GB sha256:dc18ca621300c8d3cfa505a275641ebab00de189859e022a975056882d313e64
 	EOF
 '
 
@@ -258,4 +261,12 @@ test_expect_success '--stdin outside of repository (uses default hash)' '
 	test_cmp expect actual
 '
 
+test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+		'files over 4GB hash literally' '
+	test-tool genzeros $((5*1024*1024*1024)) >big &&
+	test_oid large5GB >expect &&
+	git hash-object --stdin --literally <big >actual &&
+	test_cmp expect actual
+'
+
 test_done

From 6113951f333ba4c198a704219e1ba376f2901190 Mon Sep 17 00:00:00 2001
From: Philip Oakley <philipoakley@iee.email>
Date: Mon, 10 May 2021 16:47:40 +0100
Subject: [PATCH 376/553] CMake: show Win32 and Generator_platform build-option
 values

Ensure key CMake option values are part of the CMake output to
facilitate user support when tool updates impact the wider CMake
actions, particularly ongoing 'improvements' in Visual Studio.

These CMake displays perform the same function as the build-options.txt
provided in the main Git for Windows. CMake is already chatty.
The setting of CMAKE_EXPORT_COMPILE_COMMANDS is also reported.

Include the environment's CMAKE_EXPORT_COMPILE_COMMANDS value which
may have been propagated to CMake's internal value.

Testing the CMAKE_EXPORT_COMPILE_COMMANDS processing can be difficult
in the Visual Studio environment, as it may be cached in many places.
The 'environment' may include the OS, the user shell, CMake's
own environment, along with the Visual Studio presets and caches.

See previous commit for artifacts that need removing for a clean test.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
---
 contrib/buildsystems/CMakeLists.txt | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index c9d4d42d7bb8c2..fa3226e4082500 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -63,10 +63,20 @@ endif()
 
 if(NOT DEFINED CMAKE_EXPORT_COMPILE_COMMANDS)
 	set(CMAKE_EXPORT_COMPILE_COMMANDS TRUE)
+	message("settting CMAKE_EXPORT_COMPILE_COMMANDS: ${CMAKE_EXPORT_COMPILE_COMMANDS}")
 endif()
 
 if(USE_VCPKG)
 	set(VCPKG_DIR "${CMAKE_SOURCE_DIR}/compat/vcbuild/vcpkg")
+	message("WIN32: ${WIN32}") # show its underlying text values
+	message("VCPKG_DIR: ${VCPKG_DIR}")
+	message("VCPKG_ARCH: ${VCPKG_ARCH}") # maybe unset
+	message("MSVC: ${MSVC}")
+	message("CMAKE_GENERATOR: ${CMAKE_GENERATOR}")
+	message("CMAKE_CXX_COMPILER_ID: ${CMAKE_CXX_COMPILER_ID}")
+	message("CMAKE_GENERATOR_PLATFORM: ${CMAKE_GENERATOR_PLATFORM}")
+	message("CMAKE_EXPORT_COMPILE_COMMANDS: ${CMAKE_EXPORT_COMPILE_COMMANDS}")
+	message("ENV(CMAKE_EXPORT_COMPILE_COMMANDS): $ENV{CMAKE_EXPORT_COMPILE_COMMANDS}")
 	if(NOT EXISTS ${VCPKG_DIR})
 		message("Initializing vcpkg and building the Git's dependencies (this will take a while...)")
 		execute_process(COMMAND ${CMAKE_SOURCE_DIR}/compat/vcbuild/vcpkg_install.bat ${VCPKG_ARCH})

From 2299036c3fb9407083ed53cb7a0bde6f4d254630 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 8 Sep 2021 13:05:42 +0200
Subject: [PATCH 377/553] init: do parse _all_ core.* settings early

In Git for Windows, `has_symlinks` is set to 0 by default. Therefore, we
need to parse the config setting `core.symlinks` to know if it has been
set to `true`. In `git init`, we must do that before copying the
templates because they might contain symbolic links.

Even if the support for symbolic links on Windows has not made it to
upstream Git yet, we really should make sure that all the `core.*`
settings are parsed before proceeding, as they might very well change
the behavior of `git init` in a way the user intended.

This fixes https://github.com/git-for-windows/git/issues/3414

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 environment.c | 4 ++--
 environment.h | 2 ++
 setup.c       | 2 +-
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/environment.c b/environment.c
index a770b5921d9546..b65b85a01f18cf 100644
--- a/environment.c
+++ b/environment.c
@@ -324,8 +324,8 @@ static enum fsync_component parse_fsync_components(const char *var, const char *
 	return (current & ~negative) | positive;
 }
 
-static int git_default_core_config(const char *var, const char *value,
-				   const struct config_context *ctx, void *cb)
+int git_default_core_config(const char *var, const char *value,
+			    const struct config_context *ctx, void *cb)
 {
 	/* This needs a better name */
 	if (!strcmp(var, "core.filemode")) {
diff --git a/environment.h b/environment.h
index 51898c99cd1e45..e61f843fdbb637 100644
--- a/environment.h
+++ b/environment.h
@@ -106,6 +106,8 @@ const char *strip_namespace(const char *namespaced_ref);
 
 int git_default_config(const char *, const char *,
 		       const struct config_context *, void *);
+int git_default_core_config(const char *var, const char *value,
+			    const struct config_context *ctx, void *cb);
 
 /*
  * TODO: All the below state either explicitly or implicitly relies on
diff --git a/setup.c b/setup.c
index 3a6a048620dd7d..b723f8b33931bd 100644
--- a/setup.c
+++ b/setup.c
@@ -2693,7 +2693,7 @@ int init_db(const char *git_dir, const char *real_git_dir,
 	 * have set up the repository format such that we can evaluate
 	 * includeIf conditions correctly in the case of re-initialization.
 	 */
-	repo_config(the_repository, platform_core_config, NULL);
+	repo_config(the_repository, git_default_core_config, NULL);
 
 	safe_create_dir(the_repository, git_dir, 0);
 

From 00a38f6ff70334357cbbf7772e9bd2cb41b735a2 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <derrickstolee@github.com>
Date: Wed, 13 Apr 2022 14:49:17 -0400
Subject: [PATCH 378/553] setup: properly use "%(prefix)/" when in WSL

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 setup.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/setup.c b/setup.c
index 3a6a048620dd7d..d946d24eb86011 100644
--- a/setup.c
+++ b/setup.c
@@ -1868,10 +1868,19 @@ const char *setup_git_directory_gently(int *nongit_ok)
 		break;
 	case GIT_DIR_INVALID_OWNERSHIP:
 		if (!nongit_ok) {
+			struct strbuf prequoted = STRBUF_INIT;
 			struct strbuf quoted = STRBUF_INIT;
 
 			strbuf_complete(&report, '\n');
-			sq_quote_buf_pretty(&quoted, dir.buf);
+
+#ifdef __MINGW32__
+			if (dir.buf[0] == '/')
+				strbuf_addstr(&prequoted, "%(prefix)/");
+#endif
+
+			strbuf_add(&prequoted, dir.buf, dir.len);
+			sq_quote_buf_pretty(&quoted, prequoted.buf);
+
 			die(_("detected dubious ownership in repository at '%s'\n"
 			      "%s"
 			      "To add an exception for this directory, call:\n"

From 0153b8d0866ac4a1fbe0de101463d5f25f63f022 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <derrickstolee@github.com>
Date: Wed, 13 Apr 2022 14:54:43 -0400
Subject: [PATCH 379/553] compat/mingw.c: do not warn when failing to get owner

In the case of Git for Windows (say, in a Git Bash window) running in a
Windows Subsystem for Linux (WSL) directory, the GetNamedSecurityInfoW()
call in is_path_owned_by_current_sid() returns an error code other than
ERROR_SUCCESS. This is consistent behavior across this boundary.

In these cases, the owner would always be different because the WSL
owner is a different entity than the Windows user.

The change here is to suppress the error message that looks like this:

  error: failed to get owner for '//wsl.localhost/...' (1)

Before this change, this warning happens for every Git command,
regardless of whether the directory is marked with safe.directory.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 compat/mingw.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index f09b49ff21ddab..0f42dd02fd612b 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -2927,9 +2927,7 @@ int is_path_owned_by_current_sid(const char *path, struct strbuf *report)
 				    DACL_SECURITY_INFORMATION,
 				    &sid, NULL, NULL, NULL, &descriptor);
 
-	if (err != ERROR_SUCCESS)
-		error(_("failed to get owner for '%s' (%ld)"), path, err);
-	else if (sid && IsValidSid(sid)) {
+	if (err == ERROR_SUCCESS && sid && IsValidSid(sid)) {
 		/* Now, verify that the SID matches the current user's */
 		static PSID current_user_sid;
 		static HANDLE linked_token;

From e9f5a264f0c0b7d0dc47f7e3f31b8bfb1f908a12 Mon Sep 17 00:00:00 2001
From: Rafael Kitover <rkitover@gmail.com>
Date: Tue, 12 Apr 2022 19:53:33 +0000
Subject: [PATCH 380/553] mingw: $env:TERM="xterm-256color" for newer OSes

For Windows builds >= 15063 set $env:TERM to "xterm-256color" instead of
"cygwin" because they have a more capable console system that supports
this. Also set $env:COLORTERM="truecolor" if unset.

$env:TERM is initialized so that ANSI colors in color.c work, see
29a3963484 (Win32: patch Windows environment on startup, 2012-01-15).

See git-for-windows/git#3629 regarding problems caused by always setting
$env:TERM="cygwin".

This is the same heuristic used by the Cygwin runtime.
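
As a rough illustration of the check (a sketch, not the code in this
patch): GetVersion() packs the build number into the high-order word of
its return value, so shifting right by 16 yields the build. The helper
names build_from_version() and default_term_for_build() are
hypothetical, and the packed value is passed in as a plain integer so
that the sketch compiles without the Windows headers.

```c
#include <stdint.h>

/* Hypothetical helper: extract the build from a GetVersion()-style value. */
static unsigned build_from_version(uint32_t packed)
{
	return packed >> 16;
}

/* Hypothetical helper: pick the TERM default described above. */
static const char *default_term_for_build(unsigned build)
{
	return build < 15063 ? "cygwin" : "xterm-256color";
}
```

Both values are only defaults: the patch applies them solely when TERM
is not already set in the environment.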

Signed-off-by: Rafael Kitover <rkitover@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index f09b49ff21ddab..090a6d4f323ab1 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -2785,9 +2785,20 @@ static void setup_windows_environment(void)
 		convert_slashes(tmp);
 	}
 
-	/* simulate TERM to enable auto-color (see color.c) */
-	if (!getenv("TERM"))
-		setenv("TERM", "cygwin", 1);
+
+	/*
+	 * Make sure TERM is set up correctly to enable auto-color
+	 * (see color.c). Use "cygwin" for older OS releases, which
+	 * works correctly with MSYS2 utilities on older consoles.
+	 */
+	if (!getenv("TERM")) {
+		if ((GetVersion() >> 16) < 15063)
+			setenv("TERM", "cygwin", 0);
+		else {
+			setenv("TERM", "xterm-256color", 0);
+			setenv("COLORTERM", "truecolor", 0);
+		}
+	}
 
 	/* calculate HOME if not set */
 	if (!getenv("HOME")) {

From b82ee10cec604ad91f9eb022ff298659e78f4cc0 Mon Sep 17 00:00:00 2001
From: Christopher Degawa <ccom@randomderp.com>
Date: Sat, 28 May 2022 14:53:54 -0500
Subject: [PATCH 381/553] winansi: check result and Buffer before using Name

NtQueryObject under Wine can return a success but fill out no name.
In those situations, Wine will set Buffer to NULL, and set result to
the sizeof(OBJECT_NAME_INFORMATION).

Running a command such as

echo "$(git.exe --version 2>/dev/null)"

will crash due to a NULL pointer dereference when the code attempts to
null terminate the buffer, although, weirdly, removing the subshell or
redirecting stdout to a file will not trigger the crash.

Add code that also checks Buffer and Length to make the check as robust
as possible, because the current behavior is fragile at best and could
potentially change in the future.

This code is based on the behavior of NtQueryObject under Wine and
ReactOS.

Signed-off-by: Christopher Degawa <ccom@randomderp.com>
---
 compat/winansi.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/compat/winansi.c b/compat/winansi.c
index ac2ffb78691a7d..d28137a20b0bcc 100644
--- a/compat/winansi.c
+++ b/compat/winansi.c
@@ -575,6 +575,9 @@ static void detect_msys_tty(int fd)
 	if (!NT_SUCCESS(NtQueryObject(h, ObjectNameInformation,
 			buffer, sizeof(buffer) - 2, &result)))
 		return;
+	if (result < sizeof(*nameinfo) || !nameinfo->Name.Buffer ||
+		!nameinfo->Name.Length)
+		return;
 	name = nameinfo->Name.Buffer;
 	name[nameinfo->Name.Length / sizeof(*name)] = 0;
 

From 3d4b5cb69be0b222fca692d8bcadf4e42176a9ce Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E5=AD=99=E5=8D=93=E8=AF=86?= <sunzhuoshi@gmail.com>
Date: Sun, 16 Jan 2022 03:38:33 +0800
Subject: [PATCH 382/553] Add config option `windows.appendAtomically`
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Atomic append on Windows is only supported on local disk files, and it may
cause errors in other situations, e.g. on network file systems. If that is
the case, this config option should be used to turn atomic append off.

Co-Authored-By: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: 孙卓识 <sunzhuoshi@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.adoc         |  2 ++
 Documentation/config/windows.adoc |  4 ++++
 compat/mingw.c                    | 36 ++++++++++++++++++++++++++++---
 3 files changed, 39 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/config/windows.adoc

diff --git a/Documentation/config.adoc b/Documentation/config.adoc
index dcea3c0c15e2a9..40c68a1162fd3d 100644
--- a/Documentation/config.adoc
+++ b/Documentation/config.adoc
@@ -559,4 +559,6 @@ include::config/versionsort.adoc[]
 
 include::config/web.adoc[]
 
+include::config/windows.adoc[]
+
 include::config/worktree.adoc[]
diff --git a/Documentation/config/windows.adoc b/Documentation/config/windows.adoc
new file mode 100644
index 00000000000000..fdaaf1c65504f3
--- /dev/null
+++ b/Documentation/config/windows.adoc
@@ -0,0 +1,4 @@
+windows.appendAtomically::
+	By default, the atomic append API is used on Windows. However, it only
+	works with local disk files; if you are working on a network file
+	system, set this to `false` to turn it off.
diff --git a/compat/mingw.c b/compat/mingw.c
index f09b49ff21ddab..83cee78a1a2d9b 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -8,6 +8,7 @@
 #include "dir.h"
 #include "environment.h"
 #include "gettext.h"
+#include "repository.h"
 #include "run-command.h"
 #include "strbuf.h"
 #include "symlinks.h"
@@ -623,6 +624,7 @@ static int is_local_named_pipe_path(const char *filename)
 
 int mingw_open (const char *filename, int oflags, ...)
 {
+	static int append_atomically = -1;
 	typedef int (*open_fn_t)(wchar_t const *wfilename, int oflags, ...);
 	va_list args;
 	unsigned mode;
@@ -642,7 +644,16 @@ int mingw_open (const char *filename, int oflags, ...)
 		return -1;
 	}
 
-	if ((oflags & O_APPEND) && !is_local_named_pipe_path(filename))
+	/*
+	 * Only fall back to the default value (1) of append_atomically when
+	 * the repo is initialized and the config value could not be read
+	 */
+	if (append_atomically < 0 && the_repository && the_repository->commondir &&
+		repo_config_get_bool(the_repository, "windows.appendatomically", &append_atomically))
+		append_atomically = 1;
+
+	if (append_atomically && (oflags & O_APPEND) &&
+		!is_local_named_pipe_path(filename))
 		open_fn = mingw_open_append;
 	else if (!(oflags & ~(O_ACCMODE | O_NOINHERIT)))
 		open_fn = mingw_open_existing;
@@ -821,9 +832,28 @@ ssize_t mingw_write(int fd, const void *buf, size_t len)
 
 		/* check if fd is a pipe */
 		HANDLE h = (HANDLE) _get_osfhandle(fd);
-		if (GetFileType(h) != FILE_TYPE_PIPE)
+		if (GetFileType(h) != FILE_TYPE_PIPE) {
+			if (orig == EINVAL) {
+				wchar_t path[MAX_PATH];
+				DWORD ret = GetFinalPathNameByHandleW(h, path,
+								ARRAY_SIZE(path), 0);
+				UINT drive_type = ret > 0 && ret < ARRAY_SIZE(path) ?
+					GetDriveTypeW(path) : DRIVE_UNKNOWN;
+
+				/*
+				 * The default atomic append causes such an error on
+				 * network file systems; in such a case, it should be
+				 * turned off via config.
+				 *
+				 * `drive_type` of UNC path: DRIVE_NO_ROOT_DIR
+				 */
+				if (DRIVE_NO_ROOT_DIR == drive_type || DRIVE_REMOTE == drive_type)
+					warning("invalid write operation detected; you may try:\n"
+						"\n\tgit config windows.appendAtomically false");
+			}
+
 			errno = orig;
-		else if (orig == EINVAL)
+		} else if (orig == EINVAL)
 			errno = EPIPE;
 		else {
 			DWORD buf_size;

From 4f94a73803f09843ac7bd1b115ac3e68fe73ecf2 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 4 Sep 2017 11:59:45 +0200
Subject: [PATCH 383/553] mingw: change core.fsyncObjectFiles = 1 by default
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

From the documentation of said setting:

	This boolean will enable fsync() when writing object files.

	This is a total waste of time and effort on a filesystem that
	orders data writes properly, but can be useful for filesystems
	that do not use journalling (traditional UNIX filesystems) or
	that only journal metadata and not file contents (OS X’s HFS+,
	or Linux ext3 with "data=writeback").

The most common file system on Windows (NTFS) does not guarantee that
order, therefore a sudden loss of power (or any other event causing an
unclean shutdown) would cause corrupt files (i.e. files filled with
NULs). Therefore we need to change the default.

Note that the documentation makes it sound as if this causes really bad
performance. In reality, writing loose objects is something that is done
only rarely, and only a handful of files at a time.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/compat/mingw.c b/compat/mingw.c
index 83cee78a1a2d9b..2d73e917fc5453 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -16,6 +16,7 @@
 #include "win32.h"
 #include "win32/lazyload.h"
 #include "wrapper.h"
+#include "write-or-die.h"
 #include <aclapi.h>
 #include <conio.h>
 #include <sddl.h>
@@ -3290,6 +3291,7 @@ int wmain(int argc, const wchar_t **wargv)
 #endif
 
 	maybe_redirect_std_handles();
+	fsync_object_files = 1;
 
 	/* determine size of argv and environ conversion buffer */
 	maxlen = wcslen(wargv[0]);

From b9c008699b7aefef549d7b8bbce831d9c7f0f3e7 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sat, 6 May 2023 22:26:15 +0200
Subject: [PATCH 384/553] http: optionally load libcurl lazily

This compile-time option allows asking Git to load libcurl dynamically
at runtime.
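
The general pattern can be sketched as follows. This is a minimal
illustration under stated assumptions, not the code added by this patch:
it resolves strlen() from the running program (dlopen(NULL, ...)) as a
stand-in for a function in libcurl, and the names lazy_load() and
lazy_strlen() are hypothetical.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

typedef size_t (*strlen_type)(const char *s);
static strlen_type strlen_func;

/* Resolve the symbol on first use; later calls return immediately. */
static void lazy_load(void)
{
	static int initialized;
	void *handle;

	if (initialized)
		return;
	initialized = 1;

	/* NULL refers to the running program; it stands in for libcurl. */
	handle = dlopen(NULL, RTLD_LAZY);
	if (!handle) {
		fprintf(stderr, "failed to load library: %s\n", dlerror());
		exit(1);
	}
	*(void **)&strlen_func = dlsym(handle, "strlen");
	if (!strlen_func) {
		fprintf(stderr, "failed to load function 'strlen'\n");
		exit(1);
	}
}

/* Wrapper with the signature of the real function. */
static size_t lazy_strlen(const char *s)
{
	lazy_load();
	return strlen_func(s);
}
```

Callers keep using an ordinary function; only the wrapper knows that the
symbol is looked up at runtime. On older glibc versions such a sketch
needs to be linked with -ldl.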

Together with a follow-up patch that optionally overrides the file name
depending on the `http.sslBackend` setting, this kicks open the door for
installing multiple libcurl flavors side by side, and loading the one
corresponding to the (runtime-)configured SSL/TLS backend.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Makefile               |  28 +++-
 compat/lazyload-curl.c | 364 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 385 insertions(+), 7 deletions(-)
 create mode 100644 compat/lazyload-curl.c

diff --git a/Makefile b/Makefile
index b7eba509c6a0ca..77f68c0e255385 100644
--- a/Makefile
+++ b/Makefile
@@ -480,6 +480,11 @@ include shared.mak
 #
 #     CURL_LDFLAGS=-lcurl
 #
+# Define LAZYLOAD_LIBCURL to dynamically load libcurl at runtime. This can be
+# useful if multiple libcurl versions exist (with different file names) that
+# link to various SSL/TLS backends, to support the `http.sslBackend` runtime
+# switch in such a scenario.
+#
 # === Optional library: libpcre2 ===
 #
 # Define USE_LIBPCRE if you have and want to use libpcre. Various
@@ -1763,10 +1768,19 @@ else
 		CURL_LIBCURL =
         endif
 
-        ifndef CURL_LDFLAGS
-		CURL_LDFLAGS = $(eval CURL_LDFLAGS := $$(shell $$(CURL_CONFIG) --libs))$(CURL_LDFLAGS)
+        ifdef LAZYLOAD_LIBCURL
+		LAZYLOAD_LIBCURL_OBJ = compat/lazyload-curl.o
+		OBJECTS += $(LAZYLOAD_LIBCURL_OBJ)
+		# The `CURL_STATICLIB` constant must be defined to avoid seeing the functions
+		# declared as DLL imports
+		CURL_CFLAGS = -DCURL_STATICLIB
+		CURL_LIBCURL = -ldl
+        else
+                ifndef CURL_LDFLAGS
+			CURL_LDFLAGS = $(eval CURL_LDFLAGS := $$(shell $$(CURL_CONFIG) --libs))$(CURL_LDFLAGS)
+                endif
+		CURL_LIBCURL += $(CURL_LDFLAGS)
         endif
-	CURL_LIBCURL += $(CURL_LDFLAGS)
 
         ifndef CURL_CFLAGS
 		CURL_CFLAGS = $(eval CURL_CFLAGS := $$(shell $$(CURL_CONFIG) --cflags))$(CURL_CFLAGS)
@@ -1787,7 +1801,7 @@ else
         endif
         ifdef USE_CURL_FOR_IMAP_SEND
 		BASIC_CFLAGS += -DUSE_CURL_FOR_IMAP_SEND
-		IMAP_SEND_BUILDDEPS = http.o
+		IMAP_SEND_BUILDDEPS = http.o $(LAZYLOAD_LIBCURL_OBJ)
 		IMAP_SEND_LDFLAGS += $(CURL_LIBCURL)
         endif
         ifndef NO_EXPAT
@@ -2967,10 +2981,10 @@ git-imap-send$X: imap-send.o $(IMAP_SEND_BUILDDEPS) GIT-LDFLAGS $(GITLIBS)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(IMAP_SEND_LDFLAGS) $(LIBS)
 
-git-http-fetch$X: http.o http-walker.o http-fetch.o GIT-LDFLAGS $(GITLIBS)
+git-http-fetch$X: http.o http-walker.o http-fetch.o $(LAZYLOAD_LIBCURL_OBJ) GIT-LDFLAGS $(GITLIBS)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(CURL_LIBCURL) $(LIBS)
-git-http-push$X: http.o http-push.o GIT-LDFLAGS $(GITLIBS)
+git-http-push$X: http.o http-push.o $(LAZYLOAD_LIBCURL_OBJ) GIT-LDFLAGS $(GITLIBS)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(CURL_LIBCURL) $(EXPAT_LIBEXPAT) $(LIBS)
 
@@ -2980,7 +2994,7 @@ $(REMOTE_CURL_ALIASES): $(REMOTE_CURL_PRIMARY)
 	ln -s $< $@ 2>/dev/null || \
 	cp $< $@
 
-$(REMOTE_CURL_PRIMARY): remote-curl.o http.o http-walker.o GIT-LDFLAGS $(GITLIBS)
+$(REMOTE_CURL_PRIMARY): remote-curl.o http.o http-walker.o $(LAZYLOAD_LIBCURL_OBJ) GIT-LDFLAGS $(GITLIBS)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(CURL_LIBCURL) $(EXPAT_LIBEXPAT) $(LIBS)
 
diff --git a/compat/lazyload-curl.c b/compat/lazyload-curl.c
new file mode 100644
index 00000000000000..f4e08f76dfcd7f
--- /dev/null
+++ b/compat/lazyload-curl.c
@@ -0,0 +1,364 @@
+#include "../git-compat-util.h"
+#include "../git-curl-compat.h"
+#include <dlfcn.h>
+
+/*
+ * The ABI version of libcurl is encoded in its shared libraries' file names.
+ * This ABI version has not changed since October 2006 and is unlikely to be
+ * changed in the future. See https://curl.se/libcurl/abi.html for details.
+ */
+#define LIBCURL_ABI_VERSION "4"
+
+typedef void (*func_t)(void);
+
+#ifdef __APPLE__
+#define LIBCURL_FILE_NAME(base) base "." LIBCURL_ABI_VERSION ".dylib"
+#else
+#define LIBCURL_FILE_NAME(base) base ".so." LIBCURL_ABI_VERSION
+#endif
+
+static void *load_library(const char *name)
+{
+	return dlopen(name, RTLD_LAZY);
+}
+
+static func_t load_function(void *handle, const char *name)
+{
+	/*
+	 * Casting the return value of `dlsym()` to a function pointer is
+	 * explicitly allowed in recent POSIX standards, but GCC complains
+	 * about this in pedantic mode nevertheless. For more about this issue,
+	 * see https://stackoverflow.com/q/31526876/1860823 and
+	 * http://stackoverflow.com/a/36385690/1905491.
+	 */
+	func_t f;
+	*(void **)&f = dlsym(handle, name);
+	return f;
+}
+
+typedef struct curl_version_info_data *(*curl_version_info_type)(CURLversion version);
+static curl_version_info_type curl_version_info_func;
+
+typedef char *(*curl_easy_escape_type)(CURL *handle, const char *string, int length);
+static curl_easy_escape_type curl_easy_escape_func;
+
+typedef void (*curl_free_type)(void *p);
+static curl_free_type curl_free_func;
+
+typedef CURLcode (*curl_global_init_type)(long flags);
+static curl_global_init_type curl_global_init_func;
+
+typedef CURLsslset (*curl_global_sslset_type)(curl_sslbackend id, const char *name, const curl_ssl_backend ***avail);
+static curl_global_sslset_type curl_global_sslset_func;
+
+typedef void (*curl_global_cleanup_type)(void);
+static curl_global_cleanup_type curl_global_cleanup_func;
+
+typedef CURLcode (*curl_global_trace_type)(const char *config);
+static curl_global_trace_type curl_global_trace_func;
+
+typedef struct curl_slist *(*curl_slist_append_type)(struct curl_slist *list, const char *data);
+static curl_slist_append_type curl_slist_append_func;
+
+typedef void (*curl_slist_free_all_type)(struct curl_slist *list);
+static curl_slist_free_all_type curl_slist_free_all_func;
+
+typedef const char *(*curl_easy_strerror_type)(CURLcode error);
+static curl_easy_strerror_type curl_easy_strerror_func;
+
+typedef CURLM *(*curl_multi_init_type)(void);
+static curl_multi_init_type curl_multi_init_func;
+
+typedef CURLMcode (*curl_multi_add_handle_type)(CURLM *multi_handle, CURL *curl_handle);
+static curl_multi_add_handle_type curl_multi_add_handle_func;
+
+typedef CURLMcode (*curl_multi_remove_handle_type)(CURLM *multi_handle, CURL *curl_handle);
+static curl_multi_remove_handle_type curl_multi_remove_handle_func;
+
+typedef CURLMcode (*curl_multi_fdset_type)(CURLM *multi_handle, fd_set *read_fd_set, fd_set *write_fd_set, fd_set *exc_fd_set, int *max_fd);
+static curl_multi_fdset_type curl_multi_fdset_func;
+
+typedef CURLMcode (*curl_multi_perform_type)(CURLM *multi_handle, int *running_handles);
+static curl_multi_perform_type curl_multi_perform_func;
+
+typedef CURLMcode (*curl_multi_cleanup_type)(CURLM *multi_handle);
+static curl_multi_cleanup_type curl_multi_cleanup_func;
+
+typedef CURLMsg *(*curl_multi_info_read_type)(CURLM *multi_handle, int *msgs_in_queue);
+static curl_multi_info_read_type curl_multi_info_read_func;
+
+typedef const char *(*curl_multi_strerror_type)(CURLMcode error);
+static curl_multi_strerror_type curl_multi_strerror_func;
+
+typedef CURLMcode (*curl_multi_timeout_type)(CURLM *multi_handle, long *milliseconds);
+static curl_multi_timeout_type curl_multi_timeout_func;
+
+typedef CURL *(*curl_easy_init_type)(void);
+static curl_easy_init_type curl_easy_init_func;
+
+typedef CURLcode (*curl_easy_perform_type)(CURL *curl);
+static curl_easy_perform_type curl_easy_perform_func;
+
+typedef void (*curl_easy_cleanup_type)(CURL *curl);
+static curl_easy_cleanup_type curl_easy_cleanup_func;
+
+typedef CURL *(*curl_easy_duphandle_type)(CURL *curl);
+static curl_easy_duphandle_type curl_easy_duphandle_func;
+
+typedef CURLcode (*curl_easy_getinfo_long_type)(CURL *curl, CURLINFO info, long *value);
+static curl_easy_getinfo_long_type curl_easy_getinfo_long_func;
+
+typedef CURLcode (*curl_easy_getinfo_pointer_type)(CURL *curl, CURLINFO info, void **value);
+static curl_easy_getinfo_pointer_type curl_easy_getinfo_pointer_func;
+
+typedef CURLcode (*curl_easy_getinfo_off_t_type)(CURL *curl, CURLINFO info, curl_off_t *value);
+static curl_easy_getinfo_off_t_type curl_easy_getinfo_off_t_func;
+
+typedef CURLcode (*curl_easy_setopt_long_type)(CURL *curl, CURLoption opt, long value);
+static curl_easy_setopt_long_type curl_easy_setopt_long_func;
+
+typedef CURLcode (*curl_easy_setopt_pointer_type)(CURL *curl, CURLoption opt, void *value);
+static curl_easy_setopt_pointer_type curl_easy_setopt_pointer_func;
+
+typedef CURLcode (*curl_easy_setopt_off_t_type)(CURL *curl, CURLoption opt, curl_off_t value);
+static curl_easy_setopt_off_t_type curl_easy_setopt_off_t_func;
+
+static void lazy_load_curl(void)
+{
+	static int initialized;
+	void *libcurl;
+	func_t curl_easy_getinfo_func, curl_easy_setopt_func;
+
+	if (initialized)
+		return;
+
+	initialized = 1;
+	libcurl = load_library(LIBCURL_FILE_NAME("libcurl"));
+	if (!libcurl)
+		die("failed to load library '%s'", LIBCURL_FILE_NAME("libcurl"));
+
+	curl_version_info_func = (curl_version_info_type)load_function(libcurl, "curl_version_info");
+	curl_easy_escape_func = (curl_easy_escape_type)load_function(libcurl, "curl_easy_escape");
+	curl_free_func = (curl_free_type)load_function(libcurl, "curl_free");
+	curl_global_init_func = (curl_global_init_type)load_function(libcurl, "curl_global_init");
+	curl_global_sslset_func = (curl_global_sslset_type)load_function(libcurl, "curl_global_sslset");
+	curl_global_cleanup_func = (curl_global_cleanup_type)load_function(libcurl, "curl_global_cleanup");
+	curl_global_trace_func = (curl_global_trace_type)load_function(libcurl, "curl_global_trace");
+	curl_slist_append_func = (curl_slist_append_type)load_function(libcurl, "curl_slist_append");
+	curl_slist_free_all_func = (curl_slist_free_all_type)load_function(libcurl, "curl_slist_free_all");
+	curl_easy_strerror_func = (curl_easy_strerror_type)load_function(libcurl, "curl_easy_strerror");
+	curl_multi_init_func = (curl_multi_init_type)load_function(libcurl, "curl_multi_init");
+	curl_multi_add_handle_func = (curl_multi_add_handle_type)load_function(libcurl, "curl_multi_add_handle");
+	curl_multi_remove_handle_func = (curl_multi_remove_handle_type)load_function(libcurl, "curl_multi_remove_handle");
+	curl_multi_fdset_func = (curl_multi_fdset_type)load_function(libcurl, "curl_multi_fdset");
+	curl_multi_perform_func = (curl_multi_perform_type)load_function(libcurl, "curl_multi_perform");
+	curl_multi_cleanup_func = (curl_multi_cleanup_type)load_function(libcurl, "curl_multi_cleanup");
+	curl_multi_info_read_func = (curl_multi_info_read_type)load_function(libcurl, "curl_multi_info_read");
+	curl_multi_strerror_func = (curl_multi_strerror_type)load_function(libcurl, "curl_multi_strerror");
+	curl_multi_timeout_func = (curl_multi_timeout_type)load_function(libcurl, "curl_multi_timeout");
+	curl_easy_init_func = (curl_easy_init_type)load_function(libcurl, "curl_easy_init");
+	curl_easy_perform_func = (curl_easy_perform_type)load_function(libcurl, "curl_easy_perform");
+	curl_easy_cleanup_func = (curl_easy_cleanup_type)load_function(libcurl, "curl_easy_cleanup");
+	curl_easy_duphandle_func = (curl_easy_duphandle_type)load_function(libcurl, "curl_easy_duphandle");
+
+	curl_easy_getinfo_func = load_function(libcurl, "curl_easy_getinfo");
+	curl_easy_getinfo_long_func = (curl_easy_getinfo_long_type)curl_easy_getinfo_func;
+	curl_easy_getinfo_pointer_func = (curl_easy_getinfo_pointer_type)curl_easy_getinfo_func;
+	curl_easy_getinfo_off_t_func = (curl_easy_getinfo_off_t_type)curl_easy_getinfo_func;
+
+	curl_easy_setopt_func = load_function(libcurl, "curl_easy_setopt");
+	curl_easy_setopt_long_func = (curl_easy_setopt_long_type)curl_easy_setopt_func;
+	curl_easy_setopt_pointer_func = (curl_easy_setopt_pointer_type)curl_easy_setopt_func;
+	curl_easy_setopt_off_t_func = (curl_easy_setopt_off_t_type)curl_easy_setopt_func;
+}
+
+struct curl_version_info_data *curl_version_info(CURLversion version)
+{
+	lazy_load_curl();
+	return curl_version_info_func(version);
+}
+
+char *curl_easy_escape(CURL *handle, const char *string, int length)
+{
+	lazy_load_curl();
+	return curl_easy_escape_func(handle, string, length);
+}
+
+void curl_free(void *p)
+{
+	lazy_load_curl();
+	curl_free_func(p);
+}
+
+CURLcode curl_global_init(long flags)
+{
+	lazy_load_curl();
+	return curl_global_init_func(flags);
+}
+
+CURLsslset curl_global_sslset(curl_sslbackend id, const char *name, const curl_ssl_backend ***avail)
+{
+	lazy_load_curl();
+	return curl_global_sslset_func(id, name, avail);
+}
+
+void curl_global_cleanup(void)
+{
+	lazy_load_curl();
+	curl_global_cleanup_func();
+}
+
+CURLcode curl_global_trace(const char *config)
+{
+	lazy_load_curl();
+	return curl_global_trace_func(config);
+}
+
+struct curl_slist *curl_slist_append(struct curl_slist *list, const char *data)
+{
+	lazy_load_curl();
+	return curl_slist_append_func(list, data);
+}
+
+void curl_slist_free_all(struct curl_slist *list)
+{
+	lazy_load_curl();
+	curl_slist_free_all_func(list);
+}
+
+const char *curl_easy_strerror(CURLcode error)
+{
+	lazy_load_curl();
+	return curl_easy_strerror_func(error);
+}
+
+CURLM *curl_multi_init(void)
+{
+	lazy_load_curl();
+	return curl_multi_init_func();
+}
+
+CURLMcode curl_multi_add_handle(CURLM *multi_handle, CURL *curl_handle)
+{
+	lazy_load_curl();
+	return curl_multi_add_handle_func(multi_handle, curl_handle);
+}
+
+CURLMcode curl_multi_remove_handle(CURLM *multi_handle, CURL *curl_handle)
+{
+	lazy_load_curl();
+	return curl_multi_remove_handle_func(multi_handle, curl_handle);
+}
+
+CURLMcode curl_multi_fdset(CURLM *multi_handle, fd_set *read_fd_set, fd_set *write_fd_set, fd_set *exc_fd_set, int *max_fd)
+{
+	lazy_load_curl();
+	return curl_multi_fdset_func(multi_handle, read_fd_set, write_fd_set, exc_fd_set, max_fd);
+}
+
+CURLMcode curl_multi_perform(CURLM *multi_handle, int *running_handles)
+{
+	lazy_load_curl();
+	return curl_multi_perform_func(multi_handle, running_handles);
+}
+
+CURLMcode curl_multi_cleanup(CURLM *multi_handle)
+{
+	lazy_load_curl();
+	return curl_multi_cleanup_func(multi_handle);
+}
+
+CURLMsg *curl_multi_info_read(CURLM *multi_handle, int *msgs_in_queue)
+{
+	lazy_load_curl();
+	return curl_multi_info_read_func(multi_handle, msgs_in_queue);
+}
+
+const char *curl_multi_strerror(CURLMcode error)
+{
+	lazy_load_curl();
+	return curl_multi_strerror_func(error);
+}
+
+CURLMcode curl_multi_timeout(CURLM *multi_handle, long *milliseconds)
+{
+	lazy_load_curl();
+	return curl_multi_timeout_func(multi_handle, milliseconds);
+}
+
+CURL *curl_easy_init(void)
+{
+	lazy_load_curl();
+	return curl_easy_init_func();
+}
+
+CURLcode curl_easy_perform(CURL *curl)
+{
+	lazy_load_curl();
+	return curl_easy_perform_func(curl);
+}
+
+void curl_easy_cleanup(CURL *curl)
+{
+	lazy_load_curl();
+	curl_easy_cleanup_func(curl);
+}
+
+CURL *curl_easy_duphandle(CURL *curl)
+{
+	lazy_load_curl();
+	return curl_easy_duphandle_func(curl);
+}
+
+#ifndef CURL_IGNORE_DEPRECATION
+#define CURL_IGNORE_DEPRECATION(x) x
+#endif
+
+#ifndef CURLOPTTYPE_BLOB
+#define CURLOPTTYPE_BLOB 40000
+#endif
+
+#undef curl_easy_getinfo
+CURLcode curl_easy_getinfo(CURL *curl, CURLINFO info, ...)
+{
+	va_list ap;
+	CURLcode res;
+
+	va_start(ap, info);
+	lazy_load_curl();
+	CURL_IGNORE_DEPRECATION(
+		if (info >= CURLINFO_LONG && info < CURLINFO_DOUBLE)
+			res = curl_easy_getinfo_long_func(curl, info, va_arg(ap, long *));
+		else if ((info >= CURLINFO_STRING && info < CURLINFO_LONG) ||
+			 (info >= CURLINFO_SLIST && info < CURLINFO_SOCKET))
+			res = curl_easy_getinfo_pointer_func(curl, info, va_arg(ap, void **));
+		else if (info >= CURLINFO_OFF_T)
+			res = curl_easy_getinfo_off_t_func(curl, info, va_arg(ap, curl_off_t *));
+		else
+			die("%s:%d: TODO (info: %d)!", __FILE__, __LINE__, info);
+	)
+	va_end(ap);
+	return res;
+}
+
+#undef curl_easy_setopt
+CURLcode curl_easy_setopt(CURL *curl, CURLoption opt, ...)
+{
+	va_list ap;
+	CURLcode res;
+
+	va_start(ap, opt);
+	lazy_load_curl();
+	CURL_IGNORE_DEPRECATION(
+		if (opt >= CURLOPTTYPE_LONG && opt < CURLOPTTYPE_OBJECTPOINT)
+			res = curl_easy_setopt_long_func(curl, opt, va_arg(ap, long));
+		else if (opt >= CURLOPTTYPE_OBJECTPOINT && opt < CURLOPTTYPE_OFF_T)
+			res = curl_easy_setopt_pointer_func(curl, opt, va_arg(ap, void *));
+		else if (opt >= CURLOPTTYPE_OFF_T && opt < CURLOPTTYPE_BLOB)
+			res = curl_easy_setopt_off_t_func(curl, opt, va_arg(ap, curl_off_t));
+		else
+			die("%s:%d: TODO (opt: %d)!", __FILE__, __LINE__, opt);
+	)
+	va_end(ap);
+	return res;
+}

From 3fe2d2cbd358b9044517f9f5dd28e7c9a047d7f8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Matthias=20A=C3=9Fhauer?= <mha1993@live.de>
Date: Sun, 10 Jul 2022 11:27:25 +0200
Subject: [PATCH 385/553] MinGW: link as terminal server aware
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

With Windows 2000, Microsoft introduced a flag to the PE header to mark executables as
"terminal server aware". Windows terminal servers provide a redirected Windows directory and
redirected registry hives when launching legacy applications without this flag set. Since we
do not use any INI files in the Windows directory and don't write to the registry, we don't
need this additional preparation. Telling the OS that we don't need this should provide
slightly improved startup times in terminal server environments.

When building for supported Windows versions with MSVC, the /TSAWARE linker flag is
automatically set, but MinGW requires us to set the --tsaware flag manually.

This partially addresses https://github.com/git-for-windows/git/issues/3935.

Signed-off-by: Matthias Aßhauer <mha1993@live.de>
---
 config.mak.uname | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/config.mak.uname b/config.mak.uname
index 38b35af366d5fd..0d642ec5d38ef6 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -704,7 +704,7 @@ ifeq ($(uname_S),MINGW)
 	DEFAULT_HELP_FORMAT = html
 	HAVE_PLATFORM_PROCINFO = YesPlease
 	CSPRNG_METHOD = rtlgenrandom
-	BASIC_LDFLAGS += -municode
+	BASIC_LDFLAGS += -municode -Wl,--tsaware
 	COMPAT_CFLAGS += -DNOGDI -Icompat -Icompat/win32
 	COMPAT_CFLAGS += -DSTRIP_EXTENSION=\".exe\"
 	COMPAT_OBJS += compat/mingw.o compat/winansi.o \

From a7370259a36cfd6193ca0f698d013dea8d884316 Mon Sep 17 00:00:00 2001
From: Kiel Hurley <kielhurley@gmail.com>
Date: Wed, 2 Nov 2022 22:56:16 +1300
Subject: [PATCH 386/553] Fix Windows version resources

Add FileVersion, which is a required field. As not all required fields
were present, none were being included.

Fixes #4090

Signed-off-by: Kiel Hurley <kielhurley@gmail.com>
---
 git.rc.in | 1 +
 1 file changed, 1 insertion(+)

diff --git a/git.rc.in b/git.rc.in
index e69444eef3f0c5..460ea39561b87f 100644
--- a/git.rc.in
+++ b/git.rc.in
@@ -12,6 +12,7 @@ BEGIN
       VALUE "OriginalFilename", "git.exe\0"
       VALUE "ProductName", "Git\0"
       VALUE "ProductVersion", "@GIT_VERSION@\0"
+      VALUE "FileVersion", "@GIT_VERSION@\0"
     END
   END
 

From 3486fc691923e60783aeb8d024a1e9afc9f16bc3 Mon Sep 17 00:00:00 2001
From: Andrey Zabavnikov <zabavnikov@gmail.com>
Date: Fri, 28 Oct 2022 17:12:06 +0300
Subject: [PATCH 387/553] status: fix for old-style submodules with commondir

In f9b7573f6b00 (repository: free fields before overwriting them,
2017-09-05), Git was taught to release memory before overwriting it, but
357a03ebe9e0 (repository.c: move env-related setup code back to
environment.c, 2018-03-03) changed the code so that it would not
_always_ be overwritten.

As a consequence, the `commondir` attribute would point to
already-free()d memory.
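
A simplified sketch of the failure mode and of the fix follows. The
struct repo_sketch and set_commondir_sketch() names are hypothetical,
the logic is not the actual logic of repo_set_commondir(), and the macro
only mirrors the idea of Git's FREE_AND_NULL(): when the field is not
overwritten afterwards, it is left NULL instead of dangling.

```c
#include <stdlib.h>
#include <string.h>

/* Free the memory, then clear the pointer so it cannot dangle. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

/* Hypothetical stand-in for struct repository. */
struct repo_sketch {
	char *commondir;
};

/* Overwrites commondir only when a new value is given. */
static void set_commondir_sketch(struct repo_sketch *repo,
				 const char *commondir)
{
	FREE_AND_NULL(repo->commondir);

	if (commondir) {
		size_t len = strlen(commondir) + 1;

		repo->commondir = malloc(len);
		if (repo->commondir)
			memcpy(repo->commondir, commondir, len);
	}
}
```

With a plain free() in place of the macro, the second call in a sequence
such as set_commondir_sketch(&r, "/x") followed by
set_commondir_sketch(&r, NULL) would leave r.commondir pointing at
released memory.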

This seems not to cause problems in core Git, but there are add-on
patches in Git for Windows where the `commondir` attribute is
subsequently used, causing invalid memory accesses e.g. in setups
containing old-style submodules (i.e. the ones with a `.git` directory
within their worktrees) that have `commondir` configured.

This fixes https://github.com/git-for-windows/git/pull/4083.

Signed-off-by: Andrey Zabavnikov <zabavnikov@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 repository.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/repository.c b/repository.c
index c7e75215ac2ab2..6c3b9dac8a7729 100644
--- a/repository.c
+++ b/repository.c
@@ -136,7 +136,7 @@ static void repo_set_commondir(struct repository *repo,
 {
 	struct strbuf sb = STRBUF_INIT;
 
-	free(repo->commondir);
+	FREE_AND_NULL(repo->commondir);
 
 	if (commondir) {
 		repo->different_commondir = 1;

From f72f60c5fd9205d3488c2ca5e098bc8b04e77b3a Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sun, 7 May 2023 22:51:52 +0200
Subject: [PATCH 388/553] http: support lazy-loading libcurl also on Windows

This implements the Windows-specific support code, because everything is
slightly different on Windows, even loading shared libraries.

Note: I specifically do _not_ use the code from
`compat/win32/lazyload.h` here because that code is optimized for
loading individual functions from various system DLLs, while we
specifically want to load _many_ functions from _one_ DLL here, and
distinctly not a system DLL (we expect libcurl to be located outside
`C:\Windows\system32`, something `INIT_PROC_ADDR` refuses to work with).
Also, the `curl_easy_getinfo()`/`curl_easy_setopt()` functions are
declared as vararg functions, which `lazyload.h` cannot handle. Finally,
we are about to optionally override the exact file name that is to be
loaded, which is a goal contrary to `lazyload.h`'s design.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
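Note (illustration only, not part of the patch): a portable sketch of the
search loop that the Windows `load_library()` below performs, namely walking a
`;`-separated `PATH` and building one candidate file name per entry. The
helper names `sketch_next_entry()` and `sketch_join()` are hypothetical, and
the sketch only builds candidate names; it does not probe the file system or
load anything.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: returns the start of the next ';'-separated entry
 * in *path, stores its length in *len and advances *path past it.
 * Returns NULL once the list is exhausted. Empty entries are reported
 * with *len == 0 so that the caller can skip them.
 */
static const char *sketch_next_entry(const char **path, size_t *len)
{
	const char *start = *path, *sep;

	if (!start || !*start)
		return NULL;
	sep = strchr(start, ';');
	*len = sep ? (size_t)(sep - start) : strlen(start);
	*path = sep ? sep + 1 : NULL;
	return start;
}

/*
 * Hypothetical helper: writes "<dir>/<name>" into `out`. Returns 0 on
 * success, -1 if the directory is empty or the result does not fit.
 */
static int sketch_join(char *out, size_t size,
		       const char *dir, size_t dir_len, const char *name)
{
	size_t name_size = strlen(name) + 1; /* includes the trailing NUL */

	assert(out && dir && name);
	if (!dir_len || dir_len + 1 + name_size > size)
		return -1;
	memcpy(out, dir, dir_len);
	out[dir_len] = '/';
	memcpy(out + dir_len + 1, name, name_size);
	return 0;
}
```

A caller would loop over `sketch_next_entry()` and try each successfully
joined candidate in turn, stopping at the first one that can be read, which is
where the patch calls `access()` and `LoadLibraryExW()`.
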
 Makefile               |  4 ++++
 compat/lazyload-curl.c | 52 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 56 insertions(+)

diff --git a/Makefile b/Makefile
index 77f68c0e255385..77f9c76f226dda 100644
--- a/Makefile
+++ b/Makefile
@@ -1774,7 +1774,11 @@ else
 		# The `CURL_STATICLIB` constant must be defined to avoid seeing the functions
 		# declared as DLL imports
 		CURL_CFLAGS = -DCURL_STATICLIB
+ifneq ($(uname_S),MINGW)
+ifneq ($(uname_S),Windows)
 		CURL_LIBCURL = -ldl
+endif
+endif
         else
                 ifndef CURL_LDFLAGS
 			CURL_LDFLAGS = $(eval CURL_LDFLAGS := $$(shell $$(CURL_CONFIG) --libs))$(CURL_LDFLAGS)
diff --git a/compat/lazyload-curl.c b/compat/lazyload-curl.c
index f4e08f76dfcd7f..82ab11de43a0fb 100644
--- a/compat/lazyload-curl.c
+++ b/compat/lazyload-curl.c
@@ -1,6 +1,8 @@
 #include "../git-compat-util.h"
 #include "../git-curl-compat.h"
+#ifndef WIN32
 #include <dlfcn.h>
+#endif
 
 /*
  * The ABI version of libcurl is encoded in its shared libraries' file names.
@@ -11,6 +13,7 @@
 
 typedef void (*func_t)(void);
 
+#ifndef WIN32
 #ifdef __APPLE__
 #define LIBCURL_FILE_NAME(base) base "." LIBCURL_ABI_VERSION ".dylib"
 #else
@@ -35,6 +38,55 @@ static func_t load_function(void *handle, const char *name)
 	*(void **)&f = dlsym(handle, name);
 	return f;
 }
+#else
+#define LIBCURL_FILE_NAME(base) base "-" LIBCURL_ABI_VERSION ".dll"
+
+static void *load_library(const char *name)
+{
+	size_t name_size = strlen(name) + 1;
+	const char *path = getenv("PATH");
+	char dll_path[MAX_PATH];
+
+	while (path && *path) {
+		const char *sep = strchrnul(path, ';');
+		size_t len = sep - path;
+
+		if (len && len + name_size < sizeof(dll_path)) {
+			memcpy(dll_path, path, len);
+			dll_path[len] = '/';
+			memcpy(dll_path + len + 1, name, name_size);
+
+			if (!access(dll_path, R_OK)) {
+				wchar_t wpath[MAX_PATH];
+				int wlen = MultiByteToWideChar(CP_UTF8, 0, dll_path, -1, wpath, ARRAY_SIZE(wpath));
+				void *res = wlen ? (void *)LoadLibraryExW(wpath, NULL, 0) : NULL;
+				if (!res) {
+					DWORD err = GetLastError();
+					char buf[1024];
+
+					if (!FormatMessageA(FORMAT_MESSAGE_FROM_SYSTEM |
+							    FORMAT_MESSAGE_ARGUMENT_ARRAY |
+							    FORMAT_MESSAGE_IGNORE_INSERTS,
+							    NULL, err, LANG_NEUTRAL,
+							    buf, sizeof(buf) - 1, NULL))
+						xsnprintf(buf, sizeof(buf), "last error: %ld", err);
+					error("LoadLibraryExW() failed with: %s", buf);
+				}
+				return res;
+			}
+		}
+
+		path = *sep ? sep + 1 : NULL;
+	}
+
+	return NULL;
+}
+
+static func_t load_function(void *handle, const char *name)
+{
+	return (func_t)GetProcAddress((HANDLE)handle, name);
+}
+#endif
 
 typedef struct curl_version_info_data *(*curl_version_info_type)(CURLversion version);
 static curl_version_info_type curl_version_info_func;

From 23a0ac0100d6b54cda67c705d14001ee21b4efb6 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sun, 7 May 2023 22:05:33 +0200
Subject: [PATCH 389/553] http: when loading libcurl lazily, allow for multiple
 SSL backends

The previous commits introduced a compile-time option to load libcurl
lazily, but it uses the hard-coded name "libcurl-4.dll" (or equivalent
on platforms other than Windows).

To allow for installing multiple libcurl flavors side by side, where
each supports one specific SSL/TLS backend, let's first look whether
`libcurl-<backend>-4.dll` exists, and only use `libcurl-4.dll` as a
fallback.

That will allow us to ship with a libcurl by default that only supports
the Secure Channel backend for the `https://` protocol. This libcurl
won't suffer from any dependency problem when upgrading OpenSSL to a new
major version (which will change the DLL name, and hence break every
program and library that depends on it).

This is crucial because Git for Windows' own process of building and
deploying a new OpenSSL package relies on `git fetch` and `git clone`,
both of which use libcurl and therefore must keep working.

Note that this feature is by no means specific to Windows. On Ubuntu,
for example, a `git` built using `LAZY_LOAD_LIBCURL` will use
`libcurl.so.4` for `http.sslbackend=openssl` and `libcurl-gnutls.so.4`
for `http.sslbackend=gnutls`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
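Note (illustration only, not part of the patch): a minimal sketch of the
naming scheme described above, using the Windows file names from the commit
message with the ABI version hard-coded to 4. `sketch_libcurl_name()` is a
hypothetical helper; the patch itself builds the name via the
`LIBCURL_FILE_NAME()` macro.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: writes "libcurl-<backend>-4.dll" into `out` when a
 * backend is given, and the generic "libcurl-4.dll" otherwise.
 * Returns 0 on success, -1 if the name does not fit into `out`.
 */
static int sketch_libcurl_name(char *out, size_t size, const char *backend)
{
	int n;

	assert(out && size);
	if (backend && *backend)
		n = snprintf(out, size, "libcurl-%s-4.dll", backend);
	else
		n = snprintf(out, size, "libcurl-4.dll");

	/* snprintf() reports the untruncated length; reject truncation. */
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The lookup order then is: try the backend-specific name first and, if that
library cannot be loaded, ask for the generic name by passing `NULL` as the
backend.
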
 compat/lazyload-curl.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/compat/lazyload-curl.c b/compat/lazyload-curl.c
index 82ab11de43a0fb..a6a3f7e3a7aeaa 100644
--- a/compat/lazyload-curl.c
+++ b/compat/lazyload-curl.c
@@ -175,17 +175,26 @@ static curl_easy_setopt_pointer_type curl_easy_setopt_pointer_func;
 typedef CURLcode (*curl_easy_setopt_off_t_type)(CURL *curl, CURLoption opt, curl_off_t value);
 static curl_easy_setopt_off_t_type curl_easy_setopt_off_t_func;
 
+static char ssl_backend[64];
+
 static void lazy_load_curl(void)
 {
 	static int initialized;
-	void *libcurl;
+	void *libcurl = NULL;
 	func_t curl_easy_getinfo_func, curl_easy_setopt_func;
 
 	if (initialized)
 		return;
 
 	initialized = 1;
-	libcurl = load_library(LIBCURL_FILE_NAME("libcurl"));
+	if (ssl_backend[0]) {
+		char dll_name[64 + 16];
+		snprintf(dll_name, sizeof(dll_name) - 1,
+			 LIBCURL_FILE_NAME("libcurl-%s"), ssl_backend);
+		libcurl = load_library(dll_name);
+	}
+	if (!libcurl)
+		libcurl = load_library(LIBCURL_FILE_NAME("libcurl"));
 	if (!libcurl)
 		die("failed to load library '%s'", LIBCURL_FILE_NAME("libcurl"));
 
@@ -250,6 +259,9 @@ CURLcode curl_global_init(long flags)
 
 CURLsslset curl_global_sslset(curl_sslbackend id, const char *name, const curl_ssl_backend ***avail)
 {
+	if (name && strlen(name) < sizeof(ssl_backend))
+		strlcpy(ssl_backend, name, sizeof(ssl_backend));
+
 	lazy_load_curl();
 	return curl_global_sslset_func(id, name, avail);
 }

From 234a3bd6770cec0ee260f3c52c2173bfe5eff009 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sun, 7 May 2023 22:43:37 +0200
Subject: [PATCH 390/553] mingw: do load libcurl dynamically by default

This will help with Git for Windows' maintenance going forward: It
allows Git for Windows to switch its primary libcurl to a variant
without the OpenSSL backend, while still loading an alternate when
setting `http.sslBackend = openssl`.

This is necessary to avoid maintenance headaches with upgrading OpenSSL:
its major version name is encoded in the shared library's file name and
hence major version updates (temporarily) break libraries that are
linked against the OpenSSL library.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.mak.uname | 1 +
 1 file changed, 1 insertion(+)

diff --git a/config.mak.uname b/config.mak.uname
index 0d642ec5d38ef6..9f357421c97e37 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -705,6 +705,7 @@ ifeq ($(uname_S),MINGW)
 	HAVE_PLATFORM_PROCINFO = YesPlease
 	CSPRNG_METHOD = rtlgenrandom
 	BASIC_LDFLAGS += -municode -Wl,--tsaware
+	LAZYLOAD_LIBCURL = YesDoThatPlease
 	COMPAT_CFLAGS += -DNOGDI -Icompat -Icompat/win32
 	COMPAT_CFLAGS += -DSTRIP_EXTENSION=\".exe\"
 	COMPAT_OBJS += compat/mingw.o compat/winansi.o \

From 8e20dc1c9e9841ea4d7b1bed9f768dadf0d444ec Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 2 Nov 2022 16:23:58 +0100
Subject: [PATCH 391/553] Add a GitHub workflow to verify that Git/Scalar work
 in Nano Server

In Git for Windows v2.39.0, we fixed a regression where `git.exe` would
no longer work in Windows Nano Server (frequently used in Docker
containers).

This GitHub workflow can be used to verify manually that the Git/Scalar
executables work in Nano Server.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 .github/workflows/nano-server.yml | 76 +++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)
 create mode 100644 .github/workflows/nano-server.yml

diff --git a/.github/workflows/nano-server.yml b/.github/workflows/nano-server.yml
new file mode 100644
index 00000000000000..85b3ed5f52ed4d
--- /dev/null
+++ b/.github/workflows/nano-server.yml
@@ -0,0 +1,76 @@
+name: Windows Nano Server tests
+
+on:
+  workflow_dispatch:
+
+env:
+  DEVELOPER: 1
+
+jobs:
+  test-nano-server:
+    runs-on: windows-2022
+    env:
+      WINDBG_DIR: "C:/Program Files (x86)/Windows Kits/10/Debuggers/x64"
+      IMAGE: mcr.microsoft.com/powershell:nanoserver-ltsc2022
+
+    steps:
+      - uses: actions/checkout@v5
+      - uses: git-for-windows/setup-git-for-windows-sdk@v1
+      - name: build Git
+        shell: bash
+        run: make -j15
+      - name: pull nanoserver image
+        shell: bash
+        run: docker pull $IMAGE
+      - name: run nano-server test
+        shell: bash
+        run: |
+          docker run \
+            --user "ContainerAdministrator" \
+            -v "$WINDBG_DIR:C:/dbg" \
+            -v "$(cygpath -aw /mingw64/bin):C:/mingw64-bin" \
+            -v "$(cygpath -aw .):C:/test" \
+            $IMAGE pwsh.exe -Command '
+              # Extend the PATH to include the `.dll` files in /mingw64/bin/
+              $env:PATH += ";C:\mingw64-bin"
+
+              # For each executable to test pick some no-operation set of
+              # flags/subcommands or something that should quickly result in an
+              # error with known exit code that is not a negative 32-bit
+              # number, and set the expected return code appropriately.
+              #
+              # Only test executables that could be expected to run in a
+              # UI-less environment.
+              #
+              # ( Executable path, arguments, expected return code )
+              # also note space is required before close parenthesis (a
+              # powershell quirk when defining nested arrays like this)
+
+              $executables_to_test = @(
+                  ("C:\test\git.exe", "", 1 ),
+                  ("C:\test\scalar.exe", "version", 0 )
+              )
+
+              foreach ($executable in $executables_to_test)
+              {
+                  Write-Output "Now testing $($executable[0])"
+                  &$executable[0] $executable[1]
+                  if ($LASTEXITCODE -ne $executable[2]) {
+                      # if we failed, run the debugger to find out what function
+                      # or DLL could not be found and then exit the script with
+                      # failure. The missing DLL or EXE will be referenced near
+                      # the end of the output
+
+                      # Set a flag to have the debugger show loader stub
+                      # diagnostics. This requires running as administrator,
+                      # otherwise the flag will be ignored.
+                      C:\dbg\gflags -i $executable[0] +SLS
+
+                      C:\dbg\cdb.exe -c "g" -c "q" $executable[0] $executable[1]
+
+                      exit 1
+                  }
+              }
+
+              exit 0
+            '

From 413038285aa14567bab49d00b4db670711f666d8 Mon Sep 17 00:00:00 2001
From: David Lomas <dl3@pale-eds.co.uk>
Date: Fri, 28 Jul 2023 15:31:25 +0100
Subject: [PATCH 392/553] mingw: suggest `windows.appendAtomically` in more
 cases

When running Git for Windows on a remote APFS filesystem, it would
appear that the `mingw_open_append()`/`write()` combination would fail
almost exactly like on some CIFS-mounted shares as had been reported in
https://github.com/git-for-windows/git/issues/2753, albeit with a
different `errno` value.

Let's handle that `errno` value just the same, by suggesting to set
`windows.appendAtomically=false`.

Signed-off-by: David Lomas <dl3@pale-eds.co.uk>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 83cee78a1a2d9b..dd377ad71af43c 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -827,7 +827,7 @@ ssize_t mingw_write(int fd, const void *buf, size_t len)
 {
 	ssize_t result = write(fd, buf, len);
 
-	if (result < 0 && (errno == EINVAL || errno == ENOSPC) && buf) {
+	if (result < 0 && (errno == EINVAL || errno == EBADF || errno == ENOSPC) && buf) {
 		int orig = errno;
 
 		/* check if fd is a pipe */
@@ -853,7 +853,7 @@ ssize_t mingw_write(int fd, const void *buf, size_t len)
 			}
 
 			errno = orig;
-		} else if (orig == EINVAL)
+		} else if (orig == EINVAL || orig == EBADF)
 			errno = EPIPE;
 		else {
 			DWORD buf_size;

From c8f0a9233de895f1976ef3a998005d728b05faae Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 22 Nov 2023 22:57:38 +0100
Subject: [PATCH 393/553] win32: use native ANSI sequence processing, if
 possible

Windows 10 version 1511 (also known as November Update), according to
https://learn.microsoft.com/en-us/windows/console/console-virtual-terminal-sequences
introduced native support for ANSI sequence processing. This allows
using colors from the entire 24-bit color range.

All we need to do is test whether the console's "virtual processing
support" can be enabled. If it can, we do not even need to start the
`console_thread` to handle ANSI sequences.

Or, almost all we need to do: When `console_thread()` does its work, it
uses the Unicode-aware `write_console()` function to write to the Win32
Console, which supports Git for Windows' implicit convention that all
text that is written is encoded in UTF-8. The same is not necessarily
true if native ANSI sequence processing is used, as the output is then
subject to the current code page. Let's ensure that the code page is set
to `CP_UTF8` as long as Git writes to it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/winansi.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/compat/winansi.c b/compat/winansi.c
index ac2ffb78691a7d..a83a7f47ada9b2 100644
--- a/compat/winansi.c
+++ b/compat/winansi.c
@@ -593,6 +593,49 @@ static void detect_msys_tty(int fd)
 
 #endif
 
+static HANDLE std_console_handle;
+static DWORD std_console_mode = ENABLE_VIRTUAL_TERMINAL_PROCESSING;
+static UINT std_console_code_page = CP_UTF8;
+
+static void reset_std_console(void)
+{
+	if (std_console_mode != ENABLE_VIRTUAL_TERMINAL_PROCESSING)
+		SetConsoleMode(std_console_handle, std_console_mode);
+	if (std_console_code_page != CP_UTF8)
+		SetConsoleOutputCP(std_console_code_page);
+}
+
+static int enable_virtual_processing(void)
+{
+	std_console_handle = GetStdHandle(STD_OUTPUT_HANDLE);
+	if (std_console_handle == INVALID_HANDLE_VALUE ||
+	    !GetConsoleMode(std_console_handle, &std_console_mode)) {
+		std_console_handle = GetStdHandle(STD_ERROR_HANDLE);
+		if (std_console_handle == INVALID_HANDLE_VALUE ||
+		    !GetConsoleMode(std_console_handle, &std_console_mode))
+			return 0;
+	}
+
+	std_console_code_page = GetConsoleOutputCP();
+	if (std_console_code_page != CP_UTF8)
+		SetConsoleOutputCP(CP_UTF8);
+	if (!std_console_code_page)
+		std_console_code_page = CP_UTF8;
+
+	atexit(reset_std_console);
+
+	if (std_console_mode & ENABLE_VIRTUAL_TERMINAL_PROCESSING)
+		return 1;
+
+	if (!SetConsoleMode(std_console_handle,
+			    std_console_mode |
+			    ENABLE_PROCESSED_OUTPUT |
+			    ENABLE_VIRTUAL_TERMINAL_PROCESSING))
+		return 0;
+
+	return 1;
+}
+
 /*
  * Wrapper for isatty().  Most calls in the main git code
  * call isatty(1 or 2) to see if the instance is interactive
@@ -631,6 +674,9 @@ void winansi_init(void)
 		return;
 	}
 
+	if (enable_virtual_processing())
+		return;
+
 	/* create a named pipe to communicate with the console thread */
 	if (swprintf(name, ARRAY_SIZE(name) - 1, L"\\\\.\\pipe\\winansi%lu",
 		     GetCurrentProcessId()) < 0)

From d9a0f9dd2e5db61beae113a987f35a53842ee3c3 Mon Sep 17 00:00:00 2001
From: MinarKotonoha <chengzhuo5@qq.com>
Date: Mon, 8 Apr 2024 16:41:10 +0800
Subject: [PATCH 394/553] common-main.c: fflush stdout buffer upon exit

By default, the buffer type of Windows' `stdout` is unbuffered (_IONBF),
and there is no need to manually fflush `stdout`.

But some programs, such as a Windows Filtering Platform driver
provided by security software, may change the buffer type of
`stdout` to full buffering. This needs `fflush(stdout)` to be called
manually, otherwise there will be no output to `stdout`.

Signed-off-by: MinarKotonoha <chengzhuo5@qq.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 common-exit.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/common-exit.c b/common-exit.c
index 1aaa538be3ed67..609f32abed8b53 100644
--- a/common-exit.c
+++ b/common-exit.c
@@ -11,6 +11,13 @@ static void check_bug_if_BUG(void)
 /* We wrap exit() to call common_exit() in git-compat-util.h */
 int common_exit(const char *file, int line, int code)
 {
+	/*
+	 * A Windows Filtering Platform driver provided by security software
+	 * may change the buffer type of stdout from _IONBF to _IOFBF, in which
+	 * case nothing is output unless stdout is flushed manually.
+	 */
+	fflush(stdout);
+
 	/*
 	 * For non-POSIX systems: Take the lowest 8 bits of the "code"
 	 * to e.g. turn -1 into 255. On a POSIX system this is

From ad8803a6d8dc73460d851d6210e1dde46babcefd Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Tue, 9 Apr 2024 16:50:56 +0200
Subject: [PATCH 395/553] t5601/t7406(mingw): do run tests with symlink support

A long time ago, we decided to run tests in Git for Windows' SDK with
the default `winsymlinks` mode: copying instead of linking. This is
still the default mode of MSYS2 to this day.

However, this is not how most users run Git for Windows: As the majority
of Git for Windows' users seem to be on Windows 10 and newer, likely
having enabled Developer Mode (which allows creating symbolic links
without administrator privileges), they will run with symlink support
enabled.

This is the reason why it is crucial to get the fixes for CVE-2024-? to
the users, and also why it is crucial to ensure that the test suite
exercises the related test cases. This commit ensures the latter.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5601-clone.sh            | 10 ++++++++++
 t/t7406-submodule-update.sh |  9 +++++++++
 2 files changed, 19 insertions(+)

diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index d743d986c401a0..a859e09956222c 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -7,6 +7,16 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
+# This test script contains test cases that need to create symbolic links. To
+# make sure that these test cases are exercised in Git for Windows, where (for
+# historical reasons) `ln -s` creates copies by default, let's specifically ask
+# for `ln -s` to create symbolic links whenever possible.
+if test_have_prereq MINGW
+then
+	MSYS=${MSYS+$MSYS }winsymlinks:nativestrict
+	export MSYS
+fi
+
 X=
 test_have_prereq !MINGW || X=.exe
 
diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index 3adab12091a5f0..a3e0dc198ab646 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -14,6 +14,15 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
+# This test script contains test cases that need to create symbolic links. To
+# make sure that these test cases are exercised in Git for Windows, where (for
+# historical reasons) `ln -s` creates copies by default, let's specifically ask
+# for `ln -s` to create symbolic links whenever possible.
+if test_have_prereq MINGW
+then
+	MSYS=${MSYS+$MSYS }winsymlinks:nativestrict
+	export MSYS
+fi
 
 compare_head()
 {

From 2734642975a7e9dd3043357c4401fb0f64ff7c34 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Tue, 21 May 2024 13:55:26 +0200
Subject: [PATCH 396/553] win32: ensure that `localtime_r()` is declared even
 in i686 builds

The `__MINGW64__` constant is defined, surprise, surprise, only when
building for a 64-bit CPU architecture.

Therefore, using it as a guard to define `_POSIX_C_SOURCE` (so that
`localtime_r()` is declared, among other functions) is not enough; we
also need to check `__MINGW32__`.

Technically, the latter constant is defined even for 64-bit builds. But
let's make things a bit easier to understand by testing for both
constants.

Making it so fixes this compile warning (turned error in GCC v14.1):

  archive-zip.c: In function 'dos_time':
  archive-zip.c:612:9: error: implicit declaration of function 'localtime_r';
  did you mean 'localtime_s'? [-Wimplicit-function-declaration]
    612 |         localtime_r(&time, &tm);
        |         ^~~~~~~~~~~
        |         localtime_s

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/posix.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/compat/posix.h b/compat/posix.h
index 245386fa4a9f4e..9bb6f096308408 100644
--- a/compat/posix.h
+++ b/compat/posix.h
@@ -45,7 +45,7 @@
 #define UNUSED
 #endif
 
-#ifdef __MINGW64__
+#if defined(__MINGW32__) || defined(__MINGW64__)
 #define _POSIX_C_SOURCE 1
 #elif defined(__sun__)
  /*

From 9ae1397a61c6663884607e63f30302c4e7a410d5 Mon Sep 17 00:00:00 2001
From: Jeff Hostetler <git@jeffhostetler.com>
Date: Mon, 29 Apr 2024 08:55:03 -0400
Subject: [PATCH 397/553] survey: stub in new experimental 'git-survey' command

Start work on a new 'git survey' command to scan the repository
for monorepo performance and scaling problems.  The goal is to
measure the various known "dimensions of scale" and serve as a
foundation for adding additional measurements as we learn more
about Git monorepo scaling problems.

The initial goal is to complement the scanning and analysis performed
by the Go-based 'git-sizer' (https://github.com/github/git-sizer) tool.
It is hoped that by creating a builtin command, we may be able to take
advantage of internal Git data structures and code that is not
accessible from Go to gain further insight into potential scaling
problems.

Co-authored-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
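Note (illustration only, not part of the patch): the stub keeps the progress
setting as a tri-state so that "not configured" can be told apart from an
explicit "off". A minimal sketch of that resolution step follows;
`sketch_resolve_progress()` is a hypothetical helper, and the patch performs
the equivalent check inline using `isatty(2)`.

```c
#include <assert.h>

/*
 * Hypothetical helper: `configured` is -1 when neither the config nor the
 * command line set a value, 0 for "off" and any positive value for "on".
 * An unset value falls back to whether stderr is a terminal.
 */
static int sketch_resolve_progress(int configured, int stderr_is_tty)
{
	assert(configured >= -1);
	if (configured < 0)
		return stderr_is_tty ? 1 : 0;
	return configured ? 1 : 0;
}
```

A caller would pass the result of `isatty(2)` as the second argument, so an
explicit `--no-progress` or `survey.progress=false` wins even on a terminal.
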
 .gitignore                       |  1 +
 Documentation/config.adoc        |  2 +
 Documentation/config/survey.adoc | 11 +++++
 Documentation/git-survey.adoc    | 36 +++++++++++++++
 Documentation/meson.build        |  1 +
 Makefile                         |  1 +
 builtin.h                        |  1 +
 builtin/survey.c                 | 75 ++++++++++++++++++++++++++++++++
 command-list.txt                 |  1 +
 git.c                            |  1 +
 meson.build                      |  1 +
 t/meson.build                    |  1 +
 t/t1517-outside-repo.sh          |  2 +-
 t/t8100-git-survey.sh            | 18 ++++++++
 14 files changed, 151 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/config/survey.adoc
 create mode 100644 Documentation/git-survey.adoc
 create mode 100644 builtin/survey.c
 create mode 100755 t/t8100-git-survey.sh

diff --git a/.gitignore b/.gitignore
index 78a45cb5bec991..f534410859f3dd 100644
--- a/.gitignore
+++ b/.gitignore
@@ -170,6 +170,7 @@
 /git-submodule
 /git-submodule--helper
 /git-subtree
+/git-survey
 /git-svn
 /git-switch
 /git-symbolic-ref
diff --git a/Documentation/config.adoc b/Documentation/config.adoc
index dcea3c0c15e2a9..11f4a23c56ee28 100644
--- a/Documentation/config.adoc
+++ b/Documentation/config.adoc
@@ -537,6 +537,8 @@ include::config/status.adoc[]
 
 include::config/submodule.adoc[]
 
+include::config/survey.adoc[]
+
 include::config/tag.adoc[]
 
 include::config/tar.adoc[]
diff --git a/Documentation/config/survey.adoc b/Documentation/config/survey.adoc
new file mode 100644
index 00000000000000..c1b0f852a1250e
--- /dev/null
+++ b/Documentation/config/survey.adoc
@@ -0,0 +1,11 @@
+survey.*::
+	These variables adjust the default behavior of the `git survey`
+	command. The intention is that this command could be run in the
+	background with these options.
++
+--
+	verbose::
+		This boolean value implies the `--[no-]verbose` option.
+	progress::
+		This boolean value implies the `--[no-]progress` option.
+--
diff --git a/Documentation/git-survey.adoc b/Documentation/git-survey.adoc
new file mode 100644
index 00000000000000..5f8ec9bfea673b
--- /dev/null
+++ b/Documentation/git-survey.adoc
@@ -0,0 +1,36 @@
+git-survey(1)
+=============
+
+NAME
+----
+git-survey - EXPERIMENTAL: Measure various repository dimensions of scale
+
+SYNOPSIS
+--------
+[verse]
+(EXPERIMENTAL!) 'git survey' <options>
+
+DESCRIPTION
+-----------
+
+Survey the repository and measure various dimensions of scale.
+
+As repositories grow to "monorepo" size, certain data shapes can cause
+performance problems.  `git-survey` attempts to measure and report on
+known problem areas.
+
+OPTIONS
+-------
+
+--progress::
+	Show progress.  This is automatically enabled when interactive.
+
+OUTPUT
+------
+
+By default, `git survey` will print information about the repository in a
+human-readable format that includes overviews and tables.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Documentation/meson.build b/Documentation/meson.build
index f02dbc20cbcb86..b300ee06565a72 100644
--- a/Documentation/meson.build
+++ b/Documentation/meson.build
@@ -143,6 +143,7 @@ manpages = {
   'git-status.adoc' : 1,
   'git-stripspace.adoc' : 1,
   'git-submodule.adoc' : 1,
+  'git-survey.adoc' : 1,
   'git-svn.adoc' : 1,
   'git-switch.adoc' : 1,
   'git-symbolic-ref.adoc' : 1,
diff --git a/Makefile b/Makefile
index b7eba509c6a0ca..d405386889a2b5 100644
--- a/Makefile
+++ b/Makefile
@@ -1479,6 +1479,7 @@ BUILTIN_OBJS += builtin/sparse-checkout.o
 BUILTIN_OBJS += builtin/stash.o
 BUILTIN_OBJS += builtin/stripspace.o
 BUILTIN_OBJS += builtin/submodule--helper.o
+BUILTIN_OBJS += builtin/survey.o
 BUILTIN_OBJS += builtin/symbolic-ref.o
 BUILTIN_OBJS += builtin/tag.o
 BUILTIN_OBJS += builtin/unpack-file.o
diff --git a/builtin.h b/builtin.h
index 1b35565fbd9a3c..a27c03907ba71c 100644
--- a/builtin.h
+++ b/builtin.h
@@ -234,6 +234,7 @@ int cmd_sparse_checkout(int argc, const char **argv, const char *prefix, struct
 int cmd_status(int argc, const char **argv, const char *prefix, struct repository *repo);
 int cmd_stash(int argc, const char **argv, const char *prefix, struct repository *repo);
 int cmd_stripspace(int argc, const char **argv, const char *prefix, struct repository *repo);
+int cmd_survey(int argc, const char **argv, const char *prefix, struct repository *repo);
 int cmd_submodule__helper(int argc, const char **argv, const char *prefix, struct repository *repo);
 int cmd_switch(int argc, const char **argv, const char *prefix, struct repository *repo);
 int cmd_symbolic_ref(int argc, const char **argv, const char *prefix, struct repository *repo);
diff --git a/builtin/survey.c b/builtin/survey.c
new file mode 100644
index 00000000000000..7b7214a289765c
--- /dev/null
+++ b/builtin/survey.c
@@ -0,0 +1,75 @@
+#define USE_THE_REPOSITORY_VARIABLE
+
+#include "builtin.h"
+#include "config.h"
+#include "parse-options.h"
+
+static const char * const survey_usage[] = {
+	N_("(EXPERIMENTAL!) git survey <options>"),
+	NULL,
+};
+
+struct survey_opts {
+	int verbose;
+	int show_progress;
+};
+
+struct survey_context {
+	struct repository *repo;
+
+	/* Options that control what is done. */
+	struct survey_opts opts;
+};
+
+static int survey_load_config_cb(const char *var, const char *value,
+				 const struct config_context *cctx, void *pvoid)
+{
+	struct survey_context *ctx = pvoid;
+
+	if (!strcmp(var, "survey.verbose")) {
+		ctx->opts.verbose = git_config_bool(var, value);
+		return 0;
+	}
+	if (!strcmp(var, "survey.progress")) {
+		ctx->opts.show_progress = git_config_bool(var, value);
+		return 0;
+	}
+
+	return git_default_config(var, value, cctx, pvoid);
+}
+
+static void survey_load_config(struct survey_context *ctx)
+{
+	repo_config(the_repository, survey_load_config_cb, ctx);
+}
+
+int cmd_survey(int argc, const char **argv, const char *prefix, struct repository *repo)
+{
+	static struct survey_context ctx = {
+		.opts = {
+			.verbose = 0,
+			.show_progress = -1, /* defaults to isatty(2) */
+		},
+	};
+
+	static struct option survey_options[] = {
+		OPT__VERBOSE(&ctx.opts.verbose, N_("verbose output")),
+		OPT_BOOL(0, "progress", &ctx.opts.show_progress, N_("show progress")),
+		OPT_END(),
+	};
+
+	show_usage_with_options_if_asked(argc, argv,
+					 survey_usage, survey_options);
+
+	ctx.repo = repo;
+
+	prepare_repo_settings(ctx.repo);
+	survey_load_config(&ctx);
+
+	argc = parse_options(argc, argv, prefix, survey_options, survey_usage, 0);
+
+	if (ctx.opts.show_progress < 0)
+		ctx.opts.show_progress = isatty(2);
+
+	return 0;
+}
diff --git a/command-list.txt b/command-list.txt
index accd3d0c4b5524..8c9256b3931da0 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -190,6 +190,7 @@ git-stash                               mainporcelain
 git-status                              mainporcelain           info
 git-stripspace                          purehelpers
 git-submodule                           mainporcelain
+git-survey                              mainporcelain
 git-svn                                 foreignscminterface
 git-switch                              mainporcelain           history
 git-symbolic-ref                        plumbingmanipulators
diff --git a/git.c b/git.c
index c5fad56813f437..0915bc643fe1c4 100644
--- a/git.c
+++ b/git.c
@@ -658,6 +658,7 @@ static struct cmd_struct commands[] = {
 	{ "status", cmd_status, RUN_SETUP | NEED_WORK_TREE },
 	{ "stripspace", cmd_stripspace },
 	{ "submodule--helper", cmd_submodule__helper, RUN_SETUP },
+	{ "survey", cmd_survey, RUN_SETUP },
 	{ "switch", cmd_switch, RUN_SETUP | NEED_WORK_TREE },
 	{ "symbolic-ref", cmd_symbolic_ref, RUN_SETUP },
 	{ "tag", cmd_tag, RUN_SETUP | DELAY_PAGER_CONFIG },
diff --git a/meson.build b/meson.build
index dd52efd1c87574..7578d256df268d 100644
--- a/meson.build
+++ b/meson.build
@@ -668,6 +668,7 @@ builtin_sources = [
   'builtin/stash.c',
   'builtin/stripspace.c',
   'builtin/submodule--helper.c',
+  'builtin/survey.c',
   'builtin/symbolic-ref.c',
   'builtin/tag.c',
   'builtin/unpack-file.c',
diff --git a/t/meson.build b/t/meson.build
index 459c52a48972e4..5c090ad0915f6c 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -961,6 +961,7 @@ integration_tests = [
   't8014-blame-ignore-fuzzy.sh',
   't8015-blame-diff-algorithm.sh',
   't8020-last-modified.sh',
+  't8100-git-survey.sh',
   't9001-send-email.sh',
   't9002-column.sh',
   't9003-help-autocorrect.sh',
diff --git a/t/t1517-outside-repo.sh b/t/t1517-outside-repo.sh
index c824c1a25cf27e..37371e3f5e3e4c 100755
--- a/t/t1517-outside-repo.sh
+++ b/t/t1517-outside-repo.sh
@@ -120,7 +120,7 @@ do
 	merge-octopus | merge-one-file | merge-resolve | mergetool | \
 	mktag | p4 | p4.py | pickaxe | remote-ftp | remote-ftps | \
 	remote-http | remote-https | replay | send-email | \
-	sh-i18n--envsubst | shell | show | stage | submodule | svn | \
+	sh-i18n--envsubst | shell | show | stage | submodule | survey | svn | \
 	upload-archive--writer | upload-pack | web--browse | whatchanged)
 		expect_outcome=expect_failure ;;
 	*)
diff --git a/t/t8100-git-survey.sh b/t/t8100-git-survey.sh
new file mode 100755
index 00000000000000..d9816419855d1a
--- /dev/null
+++ b/t/t8100-git-survey.sh
@@ -0,0 +1,18 @@
+#!/bin/sh
+
+test_description='git survey'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+TEST_PASSES_SANITIZE_LEAK=0
+export TEST_PASSES_SANITIZE_LEAK
+
+. ./test-lib.sh
+
+test_expect_success 'git survey -h shows experimental warning' '
+	test_expect_code 129 git survey -h >usage &&
+	grep "EXPERIMENTAL!" usage
+'
+
+test_done

From 4e7453dc2b1266980917e8009c0fefde8fe5a17a Mon Sep 17 00:00:00 2001
From: Jeff Hostetler <git@jeffhostetler.com>
Date: Mon, 29 Apr 2024 09:51:34 -0400
Subject: [PATCH 398/553] survey: add command line opts to select references

By default we will scan all references in "refs/heads/", "refs/tags/"
and "refs/remotes/".

Add command line opts to let the user ask for all refs or a subset of
them and to include a detached HEAD.
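
The selection rule can be sketched as follows. This is a minimal,
standalone illustration and not the code in this patch: the struct and
function names are hypothetical, and the default set (branches, tags,
remotes) is taken from the description above. Each option starts out
as -1 ("not given") so that an explicit request can be told apart from
the default.

```c
#include <assert.h>

/* Hypothetical tri-state flags: -1 means "not given on the command line". */
struct refs_wanted {
	int all_refs;
	int branches;
	int tags;
	int remotes;
	int detached;
	int other;
};

/* Hypothetical initializer: every field starts out unspecified. */
static struct refs_wanted refs_wanted_unset(void)
{
	struct refs_wanted rw = { -1, -1, -1, -1, -1, -1 };
	return rw;
}

/*
 * Resolve the tri-state values into plain booleans.  The default set
 * (branches, tags, remotes) is an assumption of this sketch.
 */
static void resolve_refs_wanted(struct refs_wanted *rw)
{
	assert(rw);

	/* An explicit "all refs" request enables everything. */
	if (rw->all_refs == 1) {
		rw->branches = rw->tags = rw->remotes = 1;
		rw->detached = rw->other = 1;
		return;
	}
	rw->all_refs = 0;

	/* Nothing was requested: fall back to the default set. */
	if (rw->branches == -1 && rw->tags == -1 && rw->remotes == -1 &&
	    rw->detached == -1 && rw->other == -1) {
		rw->branches = rw->tags = rw->remotes = 1;
		rw->detached = rw->other = 0;
		return;
	}

	/* Something was requested: everything still unset becomes false. */
	if (rw->branches == -1)
		rw->branches = 0;
	if (rw->tags == -1)
		rw->tags = 0;
	if (rw->remotes == -1)
		rw->remotes = 0;
	if (rw->detached == -1)
		rw->detached = 0;
	if (rw->other == -1)
		rw->other = 0;
}
```

With this rule, asking only for tags leaves branches and remotes
disabled, while giving no option at all selects the default set.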

Signed-off-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
---
 Documentation/git-survey.adoc |  34 +++++
 builtin/survey.c              | 248 ++++++++++++++++++++++++++++++++++
 t/t8100-git-survey.sh         |   9 ++
 3 files changed, 291 insertions(+)

diff --git a/Documentation/git-survey.adoc b/Documentation/git-survey.adoc
index 5f8ec9bfea673b..56060d14b5cfef 100644
--- a/Documentation/git-survey.adoc
+++ b/Documentation/git-survey.adoc
@@ -19,12 +19,46 @@ As repositories grow to "monorepo" size, certain data shapes can cause
 performance problems.  `git-survey` attempts to measure and report on
 known problem areas.
 
+Ref Selection and Reachable Objects
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In this first analysis phase, `git survey` will iterate over the set of
+requested branches, tags, and other refs and treewalk over all of the
+reachable commits, trees, and blobs and generate various statistics.
+
 OPTIONS
 -------
 
 --progress::
 	Show progress.  This is automatically enabled when interactive.
 
+Ref Selection
+~~~~~~~~~~~~~
+
+The following options control the set of refs that `git survey` will examine.
+By default, `git survey` will look at tags, local branches, and remote refs.
+If any of the following options are given, the default set is cleared and
+only refs for the given options are added.
+
+--all-refs::
+	Use all refs.  This includes local branches, tags, remote refs,
+	notes, and stashes.  This option overrides all of the following.
+
+--branches::
+	Add local branches (`refs/heads/`) to the set.
+
+--tags::
+	Add tags (`refs/tags/`) to the set.
+
+--remotes::
+	Add remote branches (`refs/remotes/`) to the set.
+
+--detached::
+	Add HEAD to the set.
+
+--other::
+	Add notes (`refs/notes/`) and stashes (`refs/stash/`) to the set.
+
 OUTPUT
 ------
 
diff --git a/builtin/survey.c b/builtin/survey.c
index 7b7214a289765c..8fbc104ec7bd74 100644
--- a/builtin/survey.c
+++ b/builtin/survey.c
@@ -2,16 +2,55 @@
 
 #include "builtin.h"
 #include "config.h"
+#include "object.h"
+#include "odb.h"
 #include "parse-options.h"
+#include "progress.h"
+#include "ref-filter.h"
+#include "strvec.h"
+#include "trace2.h"
 
 static const char * const survey_usage[] = {
 	N_("(EXPERIMENTAL!) git survey <options>"),
 	NULL,
 };
 
+struct survey_refs_wanted {
+	int want_all_refs; /* special override */
+
+	int want_branches;
+	int want_tags;
+	int want_remotes;
+	int want_detached;
+	int want_other; /* see FILTER_REFS_OTHERS -- refs/notes/, refs/stash/ */
+};
+
+static struct survey_refs_wanted default_ref_options = {
+	.want_all_refs = 1,
+};
+
 struct survey_opts {
 	int verbose;
 	int show_progress;
+	struct survey_refs_wanted refs;
+};
+
+struct survey_report_ref_summary {
+	size_t refs_nr;
+	size_t branches_nr;
+	size_t remote_refs_nr;
+	size_t tags_nr;
+	size_t tags_annotated_nr;
+	size_t others_nr;
+	size_t unknown_nr;
+};
+
+/**
+ * This struct contains all of the information that needs to be printed
+ * at the end of the exploration of the repository and its references.
+ */
+struct survey_report {
+	struct survey_report_ref_summary refs;
 };
 
 struct survey_context {
@@ -19,8 +58,84 @@ struct survey_context {
 
 	/* Options that control what is done. */
 	struct survey_opts opts;
+
+	/* Info for output only. */
+	struct survey_report report;
+
+	/*
+	 * The rest of the members are about enabling the activity
+	 * of the 'git survey' command, including ref listings, object
+	 * pointers, and progress.
+	 */
+
+	struct progress *progress;
+	size_t progress_nr;
+	size_t progress_total;
+
+	struct strvec refs;
 };
 
+static void clear_survey_context(struct survey_context *ctx)
+{
+	strvec_clear(&ctx->refs);
+}
+
+/*
+ * After parsing the command line arguments, figure out which refs we
+ * should scan.
+ *
+ * If ANY were given in positive sense, then we ONLY include them and
+ * do not use the builtin values.
+ */
+static void fixup_refs_wanted(struct survey_context *ctx)
+{
+	struct survey_refs_wanted *rw = &ctx->opts.refs;
+
+	/*
+	 * `--all-refs` overrides and enables everything.
+	 */
+	if (rw->want_all_refs == 1) {
+		rw->want_branches = 1;
+		rw->want_tags = 1;
+		rw->want_remotes = 1;
+		rw->want_detached = 1;
+		rw->want_other = 1;
+		return;
+	}
+
+	/*
+	 * If none of the `--<ref-type>` were given, we assume all
+	 * of the builtin unspecified values.
+	 */
+	if (rw->want_branches == -1 &&
+	    rw->want_tags == -1 &&
+	    rw->want_remotes == -1 &&
+	    rw->want_detached == -1 &&
+	    rw->want_other == -1) {
+		*rw = default_ref_options;
+		return;
+	}
+
+	/*
+	 * Since we only allow positive boolean values on the command
+	 * line, we will only have true values where they specified
+	 * a `--<ref-type>`.
+	 *
+	 * So anything that still has an unspecified value should be
+	 * set to false.
+	 */
+	if (rw->want_branches == -1)
+		rw->want_branches = 0;
+	if (rw->want_tags == -1)
+		rw->want_tags = 0;
+	if (rw->want_remotes == -1)
+		rw->want_remotes = 0;
+	if (rw->want_detached == -1)
+		rw->want_detached = 0;
+	if (rw->want_other == -1)
+		rw->want_other = 0;
+}
+
 static int survey_load_config_cb(const char *var, const char *value,
 				 const struct config_context *cctx, void *pvoid)
 {
@@ -43,18 +158,146 @@ static void survey_load_config(struct survey_context *ctx)
 	repo_config(the_repository, survey_load_config_cb, ctx);
 }
 
+static void do_load_refs(struct survey_context *ctx,
+			 struct ref_array *ref_array)
+{
+	struct ref_filter filter = REF_FILTER_INIT;
+	struct ref_sorting *sorting;
+	struct string_list sorting_options = STRING_LIST_INIT_DUP;
+
+	string_list_append(&sorting_options, "objectname");
+	sorting = ref_sorting_options(&sorting_options);
+
+	if (ctx->opts.refs.want_detached)
+		strvec_push(&ctx->refs, "HEAD");
+
+	if (ctx->opts.refs.want_all_refs) {
+		strvec_push(&ctx->refs, "refs/");
+	} else {
+		if (ctx->opts.refs.want_branches)
+			strvec_push(&ctx->refs, "refs/heads/");
+		if (ctx->opts.refs.want_tags)
+			strvec_push(&ctx->refs, "refs/tags/");
+		if (ctx->opts.refs.want_remotes)
+			strvec_push(&ctx->refs, "refs/remotes/");
+		if (ctx->opts.refs.want_other) {
+			strvec_push(&ctx->refs, "refs/notes/");
+			strvec_push(&ctx->refs, "refs/stash/");
+		}
+	}
+
+	filter.name_patterns = ctx->refs.v;
+	filter.ignore_case = 0;
+	filter.match_as_path = 1;
+
+	if (ctx->opts.show_progress) {
+		ctx->progress_total = 0;
+		ctx->progress = start_progress(ctx->repo,
+					       _("Scanning refs..."), 0);
+	}
+
+	filter_refs(ref_array, &filter, FILTER_REFS_KIND_MASK);
+
+	if (ctx->opts.show_progress) {
+		ctx->progress_total = ref_array->nr;
+		display_progress(ctx->progress, ctx->progress_total);
+	}
+
+	ref_array_sort(sorting, ref_array);
+
+	stop_progress(&ctx->progress);
+	ref_filter_clear(&filter);
+	ref_sorting_release(sorting);
+}
+
+/*
+ * The REFS phase:
+ *
+ * Load the set of requested refs and assess them for scalability problems.
+ * Use that set to start a treewalk to all reachable objects and assess
+ * them.
+ *
+ * This data will give us insights into the repository itself (the number
+ * of refs, the size and shape of the DAG, the number and size of the
+ * objects).
+ *
+ * Theoretically, this data is independent of the on-disk representation
+ * (e.g. independent of packing concerns).
+ */
+static void survey_phase_refs(struct survey_context *ctx)
+{
+	struct ref_array ref_array = { 0 };
+
+	trace2_region_enter("survey", "phase/refs", ctx->repo);
+	do_load_refs(ctx, &ref_array);
+
+	ctx->report.refs.refs_nr = ref_array.nr;
+	for (int i = 0; i < ref_array.nr; i++) {
+		unsigned long size;
+		struct ref_array_item *item = ref_array.items[i];
+
+		switch (item->kind) {
+		case FILTER_REFS_TAGS:
+			ctx->report.refs.tags_nr++;
+			if (odb_read_object_info(ctx->repo->objects,
+						 &item->objectname,
+						 &size) == OBJ_TAG)
+				ctx->report.refs.tags_annotated_nr++;
+			break;
+
+		case FILTER_REFS_BRANCHES:
+			ctx->report.refs.branches_nr++;
+			break;
+
+		case FILTER_REFS_REMOTES:
+			ctx->report.refs.remote_refs_nr++;
+			break;
+
+		case FILTER_REFS_OTHERS:
+			ctx->report.refs.others_nr++;
+			break;
+
+		default:
+			ctx->report.refs.unknown_nr++;
+			break;
+		}
+	}
+
+	trace2_region_leave("survey", "phase/refs", ctx->repo);
+
+	ref_array_clear(&ref_array);
+}
+
 int cmd_survey(int argc, const char **argv, const char *prefix, struct repository *repo)
 {
 	static struct survey_context ctx = {
 		.opts = {
 			.verbose = 0,
 			.show_progress = -1, /* defaults to isatty(2) */
+
+			.refs.want_all_refs = -1,
+
+			.refs.want_branches = -1, /* default these to undefined */
+			.refs.want_tags = -1,
+			.refs.want_remotes = -1,
+			.refs.want_detached = -1,
+			.refs.want_other = -1,
 		},
+		.refs = STRVEC_INIT,
 	};
 
 	static struct option survey_options[] = {
 		OPT__VERBOSE(&ctx.opts.verbose, N_("verbose output")),
 		OPT_BOOL(0, "progress", &ctx.opts.show_progress, N_("show progress")),
+
+		OPT_BOOL_F(0, "all-refs", &ctx.opts.refs.want_all_refs, N_("include all refs"),          PARSE_OPT_NONEG),
+
+		OPT_BOOL_F(0, "branches", &ctx.opts.refs.want_branches, N_("include branches"),          PARSE_OPT_NONEG),
+		OPT_BOOL_F(0, "tags",     &ctx.opts.refs.want_tags,     N_("include tags"),              PARSE_OPT_NONEG),
+		OPT_BOOL_F(0, "remotes",  &ctx.opts.refs.want_remotes,  N_("include all remote refs"),   PARSE_OPT_NONEG),
+		OPT_BOOL_F(0, "detached", &ctx.opts.refs.want_detached, N_("include detached HEAD"),     PARSE_OPT_NONEG),
+		OPT_BOOL_F(0, "other",    &ctx.opts.refs.want_other,    N_("include notes and stashes"), PARSE_OPT_NONEG),
+
 		OPT_END(),
 	};
 
@@ -71,5 +314,10 @@ int cmd_survey(int argc, const char **argv, const char *prefix, struct repositor
 	if (ctx.opts.show_progress < 0)
 		ctx.opts.show_progress = isatty(2);
 
+	fixup_refs_wanted(&ctx);
+
+	survey_phase_refs(&ctx);
+
+	clear_survey_context(&ctx);
 	return 0;
 }
diff --git a/t/t8100-git-survey.sh b/t/t8100-git-survey.sh
index d9816419855d1a..9bac3c2ba47e2c 100755
--- a/t/t8100-git-survey.sh
+++ b/t/t8100-git-survey.sh
@@ -15,4 +15,13 @@ test_expect_success 'git survey -h shows experimental warning' '
 	grep "EXPERIMENTAL!" usage
 '
 
+test_expect_success 'create a semi-interesting repo' '
+	test_commit_bulk 10
+'
+
+test_expect_success 'git survey (default)' '
+	git survey >out 2>err &&
+	test_line_count = 0 err
+'
+
 test_done

From d787ed625c0a39faad193db8a5b217d4e8b6561d Mon Sep 17 00:00:00 2001
From: Derrick Stolee <stolee@gmail.com>
Date: Sun, 1 Sep 2024 15:58:32 -0400
Subject: [PATCH 399/553] survey: start pretty printing data in table form

When 'git survey' provides information to the user, this will be presented
in one of two formats: plaintext and JSON. The JSON implementation will be
delayed until the functionality is complete for the plaintext format.

The most important parts of the plaintext format are headers specifying the
different sections of the report and tables providing concrete data.

Create a custom table data structure that allows specifying a list of
strings for the row values. When printing the table, check each column for
the maximum width so we can create a table of the correct size from the
start.

The table structure is designed to be flexible to the different kinds of
output that will be implemented in future changes.
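
The width computation described above can be sketched in isolation.
This is a simplified illustration with hypothetical names and a fixed
column count, not the strvec-based code in this patch: scan the header
and every row for the widest cell in each column, then right-align
each cell to that width.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_COLS 2 /* hypothetical fixed column count */

/* Find the widest cell per column over the header and all rows. */
static void column_widths(const char *header[SKETCH_COLS],
			  const char *rows[][SKETCH_COLS], size_t rows_nr,
			  size_t widths[SKETCH_COLS])
{
	for (size_t i = 0; i < SKETCH_COLS; i++) {
		widths[i] = strlen(header[i]);
		for (size_t j = 0; j < rows_nr; j++) {
			size_t len = strlen(rows[j][i]);
			if (len > widths[i])
				widths[i] = len;
		}
	}
}

/* Format one row right-aligned, with cells separated by " | ". */
static void format_row(const char *row[SKETCH_COLS],
		       const size_t widths[SKETCH_COLS],
		       char *out, size_t out_size)
{
	size_t used = 0;

	assert(out && out_size > 0);
	out[0] = '\0';
	for (size_t i = 0; i < SKETCH_COLS && used < out_size; i++) {
		int n = snprintf(out + used, out_size - used, "%s%*s",
				 i ? " | " : "", (int)widths[i], row[i]);
		if (n < 0)
			break;
		used += (size_t)n;
	}
}
```

Computing all widths before printing anything is what lets the title
underline and the divider be sized correctly on the first pass.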

Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
 Documentation/git-survey.adoc |   7 ++
 builtin/survey.c              | 157 ++++++++++++++++++++++++++++++++++
 t/t8100-git-survey.sh         |  18 +++-
 3 files changed, 181 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-survey.adoc b/Documentation/git-survey.adoc
index 56060d14b5cfef..120ecb9a4d49f2 100644
--- a/Documentation/git-survey.adoc
+++ b/Documentation/git-survey.adoc
@@ -65,6 +65,13 @@ OUTPUT
 By default, `git survey` will print information about the repository in a
 human-readable format that includes overviews and tables.
 
+References Summary
+~~~~~~~~~~~~~~~~~~
+
+The references summary includes a count of each kind of reference,
+including branches, remote refs, and tags (split by "all" and
+"annotated").
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/survey.c b/builtin/survey.c
index 8fbc104ec7bd74..e79f97f8d75923 100644
--- a/builtin/survey.c
+++ b/builtin/survey.c
@@ -7,6 +7,7 @@
 #include "parse-options.h"
 #include "progress.h"
 #include "ref-filter.h"
+#include "strbuf.h"
 #include "strvec.h"
 #include "trace2.h"
 
@@ -80,6 +81,160 @@ static void clear_survey_context(struct survey_context *ctx)
 	strvec_clear(&ctx->refs);
 }
 
+struct survey_table {
+	const char *table_name;
+	struct strvec header;
+	struct strvec *rows;
+	size_t rows_nr;
+	size_t rows_alloc;
+};
+
+#define SURVEY_TABLE_INIT {	\
+	.header = STRVEC_INIT,	\
+}
+
+static void clear_table(struct survey_table *table)
+{
+	strvec_clear(&table->header);
+	for (size_t i = 0; i < table->rows_nr; i++)
+		strvec_clear(&table->rows[i]);
+	free(table->rows);
+}
+
+static void insert_table_rowv(struct survey_table *table, ...)
+{
+	va_list ap;
+	char *arg;
+	ALLOC_GROW(table->rows, table->rows_nr + 1, table->rows_alloc);
+
+	memset(&table->rows[table->rows_nr], 0, sizeof(struct strvec));
+
+	va_start(ap, table);
+	while ((arg = va_arg(ap, char *)))
+		strvec_push(&table->rows[table->rows_nr], arg);
+	va_end(ap);
+
+	table->rows_nr++;
+}
+
+#define SECTION_SEGMENT "========================================"
+#define SECTION_SEGMENT_LEN 40
+static const char *section_line = SECTION_SEGMENT
+				  SECTION_SEGMENT
+				  SECTION_SEGMENT
+				  SECTION_SEGMENT;
+static const size_t section_len = 4 * SECTION_SEGMENT_LEN;
+
+static void print_table_title(const char *name, size_t *widths, size_t nr)
+{
+	size_t width = 3 * (nr - 1);
+
+	for (size_t i = 0; i < nr; i++)
+		width += widths[i];
+
+	if (width > section_len)
+		width = section_len;
+
+	printf("\n%s\n%.*s\n", name, (int)width, section_line);
+}
+
+static void print_row_plaintext(struct strvec *row, size_t *widths)
+{
+	static struct strbuf line = STRBUF_INIT;
+	strbuf_setlen(&line, 0);
+
+	for (size_t i = 0; i < row->nr; i++) {
+		const char *str = row->v[i];
+		size_t len = strlen(str);
+		if (i)
+			strbuf_add(&line, " | ", 3);
+		strbuf_addchars(&line, ' ', widths[i] - len);
+		strbuf_add(&line, str, len);
+	}
+	printf("%s\n", line.buf);
+}
+
+static void print_divider_plaintext(size_t *widths, size_t nr)
+{
+	static struct strbuf line = STRBUF_INIT;
+	strbuf_setlen(&line, 0);
+
+	for (size_t i = 0; i < nr; i++) {
+		if (i)
+			strbuf_add(&line, "-+-", 3);
+		strbuf_addchars(&line, '-', widths[i]);
+	}
+	printf("%s\n", line.buf);
+}
+
+static void print_table_plaintext(struct survey_table *table)
+{
+	size_t *column_widths;
+	size_t columns_nr = table->header.nr;
+	CALLOC_ARRAY(column_widths, columns_nr);
+
+	for (size_t i = 0; i < columns_nr; i++) {
+		column_widths[i] = strlen(table->header.v[i]);
+
+		for (size_t j = 0; j < table->rows_nr; j++) {
+			size_t rowlen = strlen(table->rows[j].v[i]);
+			if (column_widths[i] < rowlen)
+				column_widths[i] = rowlen;
+		}
+	}
+
+	print_table_title(table->table_name, column_widths, columns_nr);
+	print_row_plaintext(&table->header, column_widths);
+	print_divider_plaintext(column_widths, columns_nr);
+
+	for (size_t j = 0; j < table->rows_nr; j++)
+		print_row_plaintext(&table->rows[j], column_widths);
+
+	free(column_widths);
+}
+
+static void survey_report_plaintext_refs(struct survey_context *ctx)
+{
+	struct survey_report_ref_summary *refs = &ctx->report.refs;
+	struct survey_table table = SURVEY_TABLE_INIT;
+
+	table.table_name = _("REFERENCES SUMMARY");
+
+	strvec_push(&table.header, _("Ref Type"));
+	strvec_push(&table.header, _("Count"));
+
+	if (ctx->opts.refs.want_all_refs || ctx->opts.refs.want_branches) {
+		char *fmt = xstrfmt("%"PRIuMAX"", (uintmax_t)refs->branches_nr);
+		insert_table_rowv(&table, _("Branches"), fmt, NULL);
+		free(fmt);
+	}
+
+	if (ctx->opts.refs.want_all_refs || ctx->opts.refs.want_remotes) {
+		char *fmt = xstrfmt("%"PRIuMAX"", (uintmax_t)refs->remote_refs_nr);
+		insert_table_rowv(&table, _("Remote refs"), fmt, NULL);
+		free(fmt);
+	}
+
+	if (ctx->opts.refs.want_all_refs || ctx->opts.refs.want_tags) {
+		char *fmt = xstrfmt("%"PRIuMAX"", (uintmax_t)refs->tags_nr);
+		insert_table_rowv(&table, _("Tags (all)"), fmt, NULL);
+		free(fmt);
+		fmt = xstrfmt("%"PRIuMAX"", (uintmax_t)refs->tags_annotated_nr);
+		insert_table_rowv(&table, _("Tags (annotated)"), fmt, NULL);
+		free(fmt);
+	}
+
+	print_table_plaintext(&table);
+	clear_table(&table);
+}
+
+static void survey_report_plaintext(struct survey_context *ctx)
+{
+	printf("GIT SURVEY for \"%s\"\n", ctx->repo->worktree);
+	printf("-----------------------------------------------------\n");
+	survey_report_plaintext_refs(ctx);
+}
+
 /*
  * After parsing the command line arguments, figure out which refs we
  * should scan.
@@ -318,6 +473,8 @@ int cmd_survey(int argc, const char **argv, const char *prefix, struct repositor
 
 	survey_phase_refs(&ctx);
 
+	survey_report_plaintext(&ctx);
+
 	clear_survey_context(&ctx);
 	return 0;
 }
diff --git a/t/t8100-git-survey.sh b/t/t8100-git-survey.sh
index 9bac3c2ba47e2c..e518e4844fe2d0 100755
--- a/t/t8100-git-survey.sh
+++ b/t/t8100-git-survey.sh
@@ -21,7 +21,23 @@ test_expect_success 'create a semi-interesting repo' '
 
 test_expect_success 'git survey (default)' '
 	git survey >out 2>err &&
-	test_line_count = 0 err
+	test_line_count = 0 err &&
+
+	tr , " " >expect <<-EOF &&
+	GIT SURVEY for "$(pwd)"
+	-----------------------------------------------------
+
+	REFERENCES SUMMARY
+	========================
+	,       Ref Type | Count
+	-----------------+------
+	,       Branches |     1
+	     Remote refs |     0
+	      Tags (all) |     0
+	Tags (annotated) |     0
+	EOF
+
+	test_cmp expect out
 '
 
 test_done

From 10df7ad8547426e769a4b6ef222067b201204942 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <stolee@gmail.com>
Date: Sun, 1 Sep 2024 20:33:47 -0400
Subject: [PATCH 400/553] survey: add object count summary

The reason for using the path-walk API is not yet obvious, but it will
become clearer in future iterations. For now, use the path-walk API to
sum up the counts of each kind of object.

For example, this is the reachable object summary output for my local repo:

REACHABLE OBJECT SUMMARY
========================
Object Type |  Count
------------+-------
       Tags |   1343
    Commits | 179344
      Trees | 314350
      Blobs | 184030
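
The counting itself is simple, because the walk hands over objects in
batches that all share one type. A minimal sketch of that tally, using
hypothetical stand-in names instead of Git's own types and callback
signature:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for Git's object types. */
enum sketch_type { SKETCH_COMMIT, SKETCH_TREE, SKETCH_BLOB, SKETCH_TAG };

struct sketch_counts {
	size_t commits_nr;
	size_t tags_nr;
	size_t trees_nr;
	size_t blobs_nr;
};

/* Add one batch of `nr` objects, all of the same type, to the totals. */
static void sketch_add_batch(struct sketch_counts *counts,
			     enum sketch_type type, size_t nr)
{
	assert(counts);

	switch (type) {
	case SKETCH_COMMIT:
		counts->commits_nr += nr;
		break;
	case SKETCH_TREE:
		counts->trees_nr += nr;
		break;
	case SKETCH_BLOB:
		counts->blobs_nr += nr;
		break;
	case SKETCH_TAG:
		counts->tags_nr += nr;
		break;
	}
}
```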

Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
 Documentation/git-survey.adoc |   6 ++
 builtin/survey.c              | 131 ++++++++++++++++++++++++++++++++--
 t/t8100-git-survey.sh         |  23 ++++--
 3 files changed, 149 insertions(+), 11 deletions(-)

diff --git a/Documentation/git-survey.adoc b/Documentation/git-survey.adoc
index 120ecb9a4d49f2..44f3a0568b7697 100644
--- a/Documentation/git-survey.adoc
+++ b/Documentation/git-survey.adoc
@@ -72,6 +72,12 @@ The references summary includes a count of each kind of reference,
 including branches, remote refs, and tags (split by "all" and
 "annotated").
 
+Reachable Object Summary
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+The reachable object summary shows the total number of each kind of Git
+object, including tags, commits, trees, and blobs.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/survey.c b/builtin/survey.c
index e79f97f8d75923..1e8b9c1e5492aa 100644
--- a/builtin/survey.c
+++ b/builtin/survey.c
@@ -2,13 +2,20 @@
 
 #include "builtin.h"
 #include "config.h"
+#include "environment.h"
+#include "hex.h"
 #include "object.h"
 #include "odb.h"
+#include "object-name.h"
 #include "parse-options.h"
+#include "path-walk.h"
 #include "progress.h"
 #include "ref-filter.h"
+#include "refs.h"
+#include "revision.h"
 #include "strbuf.h"
 #include "strvec.h"
+#include "tag.h"
 #include "trace2.h"
 
 static const char * const survey_usage[] = {
@@ -46,12 +53,20 @@ struct survey_report_ref_summary {
 	size_t unknown_nr;
 };
 
+struct survey_report_object_summary {
+	size_t commits_nr;
+	size_t tags_nr;
+	size_t trees_nr;
+	size_t blobs_nr;
+};
+
 /**
  * This struct contains all of the information that needs to be printed
  * at the end of the exploration of the repository and its references.
  */
 struct survey_report {
 	struct survey_report_ref_summary refs;
+	struct survey_report_object_summary reachable_objects;
 };
 
 struct survey_context {
@@ -74,10 +89,12 @@ struct survey_context {
 	size_t progress_total;
 
 	struct strvec refs;
+	struct ref_array ref_array;
 };
 
 static void clear_survey_context(struct survey_context *ctx)
 {
+	ref_array_clear(&ctx->ref_array);
 	strvec_clear(&ctx->refs);
 }
 
@@ -128,10 +145,14 @@ static const size_t section_len = 4 * SECTION_SEGMENT_LEN;
 static void print_table_title(const char *name, size_t *widths, size_t nr)
 {
 	size_t width = 3 * (nr - 1);
+	size_t min_width = strlen(name);
 
 	for (size_t i = 0; i < nr; i++)
 		width += widths[i];
 
+	if (width < min_width)
+		width = min_width;
+
 	if (width > section_len)
 		width = section_len;
 
@@ -228,11 +249,43 @@ static void survey_report_plaintext_refs(struct survey_context *ctx)
 	clear_table(&table);
 }
 
+static void survey_report_plaintext_reachable_object_summary(struct survey_context *ctx)
+{
+	struct survey_report_object_summary *objs = &ctx->report.reachable_objects;
+	struct survey_table table = SURVEY_TABLE_INIT;
+	char *fmt;
+
+	table.table_name = _("REACHABLE OBJECT SUMMARY");
+
+	strvec_push(&table.header, _("Object Type"));
+	strvec_push(&table.header, _("Count"));
+
+	fmt = xstrfmt("%"PRIuMAX"", (uintmax_t)objs->tags_nr);
+	insert_table_rowv(&table, _("Tags"), fmt, NULL);
+	free(fmt);
+
+	fmt = xstrfmt("%"PRIuMAX"", (uintmax_t)objs->commits_nr);
+	insert_table_rowv(&table, _("Commits"), fmt, NULL);
+	free(fmt);
+
+	fmt = xstrfmt("%"PRIuMAX"", (uintmax_t)objs->trees_nr);
+	insert_table_rowv(&table, _("Trees"), fmt, NULL);
+	free(fmt);
+
+	fmt = xstrfmt("%"PRIuMAX"", (uintmax_t)objs->blobs_nr);
+	insert_table_rowv(&table, _("Blobs"), fmt, NULL);
+	free(fmt);
+
+	print_table_plaintext(&table);
+	clear_table(&table);
+}
+
 static void survey_report_plaintext(struct survey_context *ctx)
 {
 	printf("GIT SURVEY for \"%s\"\n", ctx->repo->worktree);
 	printf("-----------------------------------------------------\n");
 	survey_report_plaintext_refs(ctx);
+	survey_report_plaintext_reachable_object_summary(ctx);
 }
 
 /*
@@ -381,15 +434,13 @@ static void do_load_refs(struct survey_context *ctx,
  */
 static void survey_phase_refs(struct survey_context *ctx)
 {
-	struct ref_array ref_array = { 0 };
-
 	trace2_region_enter("survey", "phase/refs", ctx->repo);
-	do_load_refs(ctx, &ref_array);
+	do_load_refs(ctx, &ctx->ref_array);
 
-	ctx->report.refs.refs_nr = ref_array.nr;
-	for (int i = 0; i < ref_array.nr; i++) {
+	ctx->report.refs.refs_nr = ctx->ref_array.nr;
+	for (int i = 0; i < ctx->ref_array.nr; i++) {
 		unsigned long size;
-		struct ref_array_item *item = ref_array.items[i];
+		struct ref_array_item *item = ctx->ref_array.items[i];
 
 		switch (item->kind) {
 		case FILTER_REFS_TAGS:
@@ -419,8 +470,72 @@ static void survey_phase_refs(struct survey_context *ctx)
 	}
 
 	trace2_region_leave("survey", "phase/refs", ctx->repo);
+}
+
+static void increment_object_counts(
+		struct survey_report_object_summary *summary,
+		enum object_type type,
+		size_t nr)
+{
+	switch (type) {
+	case OBJ_COMMIT:
+		summary->commits_nr += nr;
+		break;
 
-	ref_array_clear(&ref_array);
+	case OBJ_TREE:
+		summary->trees_nr += nr;
+		break;
+
+	case OBJ_BLOB:
+		summary->blobs_nr += nr;
+		break;
+
+	case OBJ_TAG:
+		summary->tags_nr += nr;
+		break;
+
+	default:
+		break;
+	}
+}
+
+static int survey_objects_path_walk_fn(const char *path UNUSED,
+				       struct oid_array *oids,
+				       enum object_type type,
+				       void *data)
+{
+	struct survey_context *ctx = data;
+
+	increment_object_counts(&ctx->report.reachable_objects,
+				type, oids->nr);
+
+	return 0;
+}
+
+static void survey_phase_objects(struct survey_context *ctx)
+{
+	struct rev_info revs = REV_INFO_INIT;
+	struct path_walk_info info = PATH_WALK_INFO_INIT;
+	unsigned int add_flags = 0;
+
+	trace2_region_enter("survey", "phase/objects", ctx->repo);
+
+	info.revs = &revs;
+	info.path_fn = survey_objects_path_walk_fn;
+	info.path_fn_data = ctx;
+
+	repo_init_revisions(ctx->repo, &revs, "");
+	revs.tag_objects = 1;
+
+	for (int i = 0; i < ctx->ref_array.nr; i++) {
+		struct ref_array_item *item = ctx->ref_array.items[i];
+		add_pending_oid(&revs, NULL, &item->objectname, add_flags);
+	}
+
+	walk_objects_by_path(&info);
+
+	release_revisions(&revs);
+	trace2_region_leave("survey", "phase/objects", ctx->repo);
 }
 
 int cmd_survey(int argc, const char **argv, const char *prefix, struct repository *repo)
@@ -473,6 +588,8 @@ int cmd_survey(int argc, const char **argv, const char *prefix, struct repositor
 
 	survey_phase_refs(&ctx);
 
+	survey_phase_objects(&ctx);
+
 	survey_report_plaintext(&ctx);
 
 	clear_survey_context(&ctx);
diff --git a/t/t8100-git-survey.sh b/t/t8100-git-survey.sh
index e518e4844fe2d0..d3086784090352 100755
--- a/t/t8100-git-survey.sh
+++ b/t/t8100-git-survey.sh
@@ -16,11 +16,17 @@ test_expect_success 'git survey -h shows experimental warning' '
 '
 
 test_expect_success 'create a semi-interesting repo' '
-	test_commit_bulk 10
+	test_commit_bulk 10 &&
+	git tag -a -m one one HEAD~5 &&
+	git tag -a -m two two HEAD~3 &&
+	git tag -a -m three three two &&
+	git tag -a -m four four three &&
+	git update-ref -d refs/tags/three &&
+	git update-ref -d refs/tags/two
 '
 
 test_expect_success 'git survey (default)' '
-	git survey >out 2>err &&
+	git survey --all-refs >out 2>err &&
 	test_line_count = 0 err &&
 
 	tr , " " >expect <<-EOF &&
@@ -33,8 +39,17 @@ test_expect_success 'git survey (default)' '
 	-----------------+------
 	,       Branches |     1
 	     Remote refs |     0
-	      Tags (all) |     0
-	Tags (annotated) |     0
+	      Tags (all) |     2
+	Tags (annotated) |     2
+
+	REACHABLE OBJECT SUMMARY
+	========================
+	Object Type | Count
+	------------+------
+	       Tags |     4
+	    Commits |    10
+	      Trees |    10
+	      Blobs |    10
 	EOF
 
 	test_cmp expect out

From 0fb4cffb348f4a149ba40e81a09916a0b4ca5875 Mon Sep 17 00:00:00 2001
From: Ariel Lourenco <ariellourenco@users.noreply.github.com>
Date: Tue, 2 Jul 2024 18:09:43 -0300
Subject: [PATCH 401/553] Fallback to AppData if XDG_CONFIG_HOME is unset

To be a better Windows citizen, Git should save its configuration
files in the AppData folder. This enables Git configuration files to
be replicated between machines that use the same Microsoft account
logon, which reduces the friction of setting up Git on new systems.
Therefore, if %APPDATA%\Git\config exists, we use it; otherwise
$HOME/.config/git/config is used.
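
The lookup order can be sketched as a standalone function. All names
here are hypothetical, the existence check is passed in so the sketch
can be exercised without touching the file system, and the
XDG_CONFIG_HOME case (which still takes precedence) is left out:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical existence check, standing in for a real file test. */
typedef int (*exists_fn)(const char *path);

/* Hypothetical stand-ins used only to exercise the sketch. */
static int always_exists(const char *path)
{
	return path != NULL;
}

static int never_exists(const char *path)
{
	(void)path;
	return 0;
}

/*
 * Prefer <appdata>/Git/config when that file exists; otherwise fall
 * back to <home>/.config/git/config.  Returns buf, or NULL when
 * neither location can be constructed.
 */
static const char *pick_config(const char *appdata, const char *home,
			       exists_fn exists,
			       char *buf, size_t buf_size)
{
	assert(exists && buf && buf_size > 0);

	if (appdata && *appdata) {
		snprintf(buf, buf_size, "%s/Git/config", appdata);
		if (exists(buf))
			return buf;
	}
	if (home && *home) {
		snprintf(buf, buf_size, "%s/.config/git/config", home);
		return buf;
	}
	return NULL;
}
```

Note that the AppData location only wins when the file is already
there, so existing setups under $HOME keep working unchanged.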

Signed-off-by: Ariel Lourenco <ariellourenco@users.noreply.github.com>
---
 path.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/path.c b/path.c
index d726537622cda6..60d18f4ff1b951 100644
--- a/path.c
+++ b/path.c
@@ -1540,6 +1540,7 @@ int looks_like_command_line_option(const char *str)
 char *xdg_config_home_for(const char *subdir, const char *filename)
 {
 	const char *home, *config_home;
+	char *home_config = NULL;
 
 	assert(subdir);
 	assert(filename);
@@ -1548,10 +1549,26 @@ char *xdg_config_home_for(const char *subdir, const char *filename)
 		return mkpathdup("%s/%s/%s", config_home, subdir, filename);
 
 	home = getenv("HOME");
-	if (home)
-		return mkpathdup("%s/.config/%s/%s", home, subdir, filename);
+	if (home && *home)
+		home_config = mkpathdup("%s/.config/%s/%s", home, subdir, filename);
+
+	#ifdef WIN32
+	{
+		const char *appdata = getenv("APPDATA");
+		if (appdata && *appdata) {
+			char *appdata_config = mkpathdup("%s/Git/%s", appdata, filename);
+			if (file_exists(appdata_config)) {
+				if (home_config && file_exists(home_config))
+					warning("'%s' was ignored because '%s' exists.", home_config, appdata_config);
+				free(home_config);
+				return appdata_config;
+			}
+			free(appdata_config);
+		}
+	}
+	#endif
 
-	return NULL;
+	return home_config;
 }
 
 char *xdg_config_home(const char *filename)

From d2ae7c4cba9072ce309bfcd11deb2179349074ac Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Thu, 4 Jul 2024 22:41:56 +0200
Subject: [PATCH 402/553] run-command: be helpful when Git LFS fails on Windows
 7

Git LFS is now built with Go 1.21 which no longer supports Windows 7.
However, Git for Windows still wants to support Windows 7.

Ideally, Git LFS would re-introduce Windows 7 support until Git for
Windows drops support for Windows 7, but that's not going to happen:
https://github.com/git-for-windows/git/issues/4996#issuecomment-2176152565

The next best thing we can do is to let the users know what is
happening, and how to get out of their fix, at least.

This is not quite as easy as it would first seem because programs
compiled with Go 1.21 or newer will simply throw an exception and fail
with an Access Violation on Windows 7.

The only way I found to address this is to replicate the logic from Go's
very own `version` command (which can determine the Go version with
which a given executable was built) to detect the situation, and in that
case offer a helpful error message.
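
The first step of that detection is to locate the PE header. A
minimal sketch of that step on an in-memory image, with hypothetical
helper names (the patch itself reads from a file descriptor): check
the `MZ` magic, read the little-endian offset stored at 0x3c, and
verify the `PE\0\0` signature found there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit little-endian value. */
static uint32_t le16_at(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8);
}

/* Read a 32-bit little-endian value. */
static uint32_t le32_at(const unsigned char *p)
{
	return le16_at(p) | (le16_at(p + 2) << 16);
}

/*
 * Return the offset of the "PE\0\0" signature in an in-memory image,
 * or -1 if the buffer does not look like a PE file.
 */
static long pe_header_offset(const unsigned char *image, size_t size)
{
	uint32_t offset;

	assert(image);
	if (size < 0x40 || image[0] != 'M' || image[1] != 'Z')
		return -1;

	offset = le32_at(image + 0x3c);
	if (offset > size || size - offset < 4)
		return -1;
	if (image[offset] != 'P' || image[offset + 1] != 'E' ||
	    image[offset + 2] != 0 || image[offset + 3] != 0)
		return -1;
	return (long)offset;
}
```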

This addresses https://github.com/git-for-windows/git/issues/4996.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/win32/path-utils.c | 199 ++++++++++++++++++++++++++++++++++++++
 compat/win32/path-utils.h |   3 +
 git-compat-util.h         |   7 ++
 run-command.c             |   1 +
 4 files changed, 210 insertions(+)

diff --git a/compat/win32/path-utils.c b/compat/win32/path-utils.c
index 966ef779b9ca9b..c4fea0301b5ecc 100644
--- a/compat/win32/path-utils.c
+++ b/compat/win32/path-utils.c
@@ -2,6 +2,9 @@
 
 #include "../../git-compat-util.h"
 #include "../../environment.h"
+#include "../../wrapper.h"
+#include "../../strbuf.h"
+#include "../../versioncmp.h"
 
 int win32_has_dos_drive_prefix(const char *path)
 {
@@ -89,3 +92,199 @@ int win32_fspathcmp(const char *a, const char *b)
 {
 	return win32_fspathncmp(a, b, (size_t)-1);
 }
+
+static int read_at(int fd, char *buffer, size_t offset, size_t size)
+{
+	if (lseek(fd, offset, SEEK_SET) < 0) {
+		fprintf(stderr, "could not seek to 0x%x\n", (unsigned int)offset);
+		return -1;
+	}
+
+	return read_in_full(fd, buffer, size);
+}
+
+static size_t le16(const char *buffer)
+{
+	unsigned char *u = (unsigned char *)buffer;
+	return u[0] | (u[1] << 8);
+}
+
+static size_t le32(const char *buffer)
+{
+	return le16(buffer) | (le16(buffer + 2) << 16);
+}
+
+/*
+ * Determine the Go version of a given executable, if it was built with Go.
+ *
+ * This recapitulates the logic from
+ * https://github.com/golang/go/blob/master/src/cmd/go/internal/version/version.go
+ * (without requiring the user to install `go.exe` to find out).
+ */
+static ssize_t get_go_version(const char *path, char *go_version, size_t go_version_size)
+{
+	int fd = open(path, O_RDONLY);
+	char buffer[1024];
+	off_t offset;
+	size_t num_sections, opt_header_size, i;
+	char *p = NULL, *q;
+	ssize_t res = -1;
+
+	if (fd < 0)
+		return -1;
+
+	if (read_in_full(fd, buffer, 2) < 0)
+		goto fail;
+
+	/*
+	 * Parse the PE file format, for more details, see
+	 * https://en.wikipedia.org/wiki/Portable_Executable#Layout and
+	 * https://learn.microsoft.com/en-us/windows/win32/debug/pe-format
+	 */
+	if (buffer[0] != 'M' || buffer[1] != 'Z')
+		goto fail;
+
+	if (read_at(fd, buffer, 0x3c, 4) < 0)
+		goto fail;
+
+	/* Read the `PE\0\0` signature and the COFF file header */
+	offset = le32(buffer);
+	if (read_at(fd, buffer, offset, 24) < 0)
+		goto fail;
+
+	if (buffer[0] != 'P' || buffer[1] != 'E' || buffer[2] != '\0' || buffer[3] != '\0')
+		goto fail;
+
+	num_sections = le16(buffer + 6);
+	opt_header_size = le16(buffer + 20);
+	offset += 24; /* skip file header */
+
+	/*
+	 * Validate magic number 0x10b or 0x20b, for full details see
+	 * https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#optional-header-standard-fields-image-only
+	 */
+	if (read_at(fd, buffer, offset, 2) < 0 ||
+	    ((i = le16(buffer)) != 0x10b && i != 0x20b))
+		goto fail;
+
+	offset += opt_header_size;
+
+	for (i = 0; i < num_sections; i++) {
+		if (read_at(fd, buffer, offset + i * 40, 40) < 0)
+			goto fail;
+
+		/*
+		 * For full details about the section headers, see
+		 * https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#section-table-section-headers
+		 */
+		if ((le32(buffer + 36) /* characteristics */ & ~0x600000) /* IMAGE_SCN_ALIGN_32BYTES */ ==
+		    (/* IMAGE_SCN_CNT_INITIALIZED_DATA */ 0x00000040 |
+		     /* IMAGE_SCN_MEM_READ */ 0x40000000 |
+		     /* IMAGE_SCN_MEM_WRITE */ 0x80000000)) {
+			size_t size = le32(buffer + 16); /* "SizeOfRawData " */
+			size_t pointer = le32(buffer + 20); /* "PointerToRawData " */
+
+			/*
+			 * Skip the section if either size or pointer is 0, see
+			 * https://github.com/golang/go/blob/go1.21.0/src/debug/buildinfo/buildinfo.go#L333
+			 * for full details.
+			 *
+			 * Merely seeing a non-zero size will not actually do,
+			 * though: the size must be at least `buildInfoSize`,
+			 * i.e. 32, and we expect a UVarint (at least another
+			 * byte) _and_ the bytes representing the string,
+			 * which we expect to start with the letters "go" and
+			 * continue with the Go version number.
+			 */
+			if (size < 32 + 1 + 2 + 1 || !pointer)
+				continue;
+
+			p = malloc(size);
+
+			if (!p || read_at(fd, p, pointer, size) < 0)
+				goto fail;
+
+			/*
+			 * Look for the build information embedded by Go, see
+			 * https://github.com/golang/go/blob/go1.21.0/src/debug/buildinfo/buildinfo.go#L165-L175
+			 * for full details.
+			 *
+			 * Note: Go contains code to enforce alignment along a
+			 * 16-byte boundary. In practice, no `.exe` has been
+			 * observed that required any adjustment, therefore
+			 * this here code skips that logic for simplicity.
+			 */
+			q = memmem(p, size - 18, "\xff Go buildinf:", 14);
+			if (!q)
+				goto fail;
+			/*
+			 * Decode the build blob. For full details, see
+			 * https://github.com/golang/go/blob/go1.21.0/src/debug/buildinfo/buildinfo.go#L177-L191
+			 *
+			 * Note: The `endianness` values observed in practice
+			 * were always 2, therefore the complex logic to handle
+			 * any other value is skipped for simplicity.
+			 */
+			if ((q[14] == 8 || q[14] == 4) && q[15] == 2) {
+				/*
+				 * Only handle a Go version string with fewer
+				 * than 128 characters, so the Go UVarint at
+				 * q[32] that indicates the string's length must
+				 * be only one byte (without the high bit set).
+				 */
+				if ((q[32] & 0x80) ||
+				    !q[32] ||
+				    (q + 33 + q[32] - p) > (ssize_t)size ||
+				    q[32] + 1 > (ssize_t)go_version_size)
+					goto fail;
+				res = q[32];
+				memcpy(go_version, q + 33, res);
+				go_version[res] = '\0';
+				break;
+			}
+		}
+	}
+
+fail:
+	free(p);
+	close(fd);
+	return res;
+}
+
+void win32_warn_about_git_lfs_on_windows7(int exit_code, const char *argv0)
+{
+	char buffer[128], *git_lfs = NULL;
+	const char *p;
+
+	/*
+	 * Git LFS v3.5.1 fails with an Access Violation on Windows 7; that
+	 * would usually show up as an exit code 0xc0000005. For some reason
+	 * (probably because at this point, we no longer have the _original_
+	 * HANDLE that was returned by `CreateProcess()`) we observe other
+	 * values like 0xb00 and 0x2 instead. Since the exact exit code
+	 * seems to be inconsistent, we check for a non-zero exit status.
+	 */
+	if (exit_code == 0)
+		return;
+	if (GetVersion() >> 16 > 7601)
+		return; /* Warn only on Windows 7 or older */
+	if (!istarts_with(argv0, "git-lfs ") &&
+	    strcasecmp(argv0, "git-lfs"))
+		return;
+	if (!(git_lfs = locate_in_PATH("git-lfs")))
+		return;
+	if (get_go_version(git_lfs, buffer, sizeof(buffer)) > 0 &&
+	    skip_prefix(buffer, "go", &p) &&
+	    versioncmp("1.21.0", p) <= 0)
+		warning("This program was built with Go v%s\n"
+			"i.e. without support for this Windows version:\n"
+			"\n\t%s\n"
+			"\n"
+			"To work around this, you can download and install a "
+			"working version from\n"
+			"\n"
+			"\thttps://github.com/git-lfs/git-lfs/releases/tag/"
+			"v3.4.1\n",
+			p, git_lfs);
+	free(git_lfs);
+}
diff --git a/compat/win32/path-utils.h b/compat/win32/path-utils.h
index a561c700e75713..a69483c332c1a7 100644
--- a/compat/win32/path-utils.h
+++ b/compat/win32/path-utils.h
@@ -34,4 +34,7 @@ int win32_fspathcmp(const char *a, const char *b);
 int win32_fspathncmp(const char *a, const char *b, size_t count);
 #define fspathncmp win32_fspathncmp
 
+void win32_warn_about_git_lfs_on_windows7(int exit_code, const char *argv0);
+#define warn_about_git_lfs_on_windows7 win32_warn_about_git_lfs_on_windows7
+
 #endif
diff --git a/git-compat-util.h b/git-compat-util.h
index b0673d1a450db5..76d2433ac04032 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -265,6 +265,13 @@ static inline int git_offset_1st_component(const char *path)
 #define fspathncmp git_fspathncmp
 #endif
 
+#ifndef warn_about_git_lfs_on_windows7
+static inline void warn_about_git_lfs_on_windows7(int exit_code UNUSED,
+						  const char *argv0 UNUSED)
+{
+}
+#endif
+
 #ifndef is_valid_path
 #define is_valid_path(path) 1
 #endif
diff --git a/run-command.c b/run-command.c
index 2d3c2ac55c5c02..1f60111d178063 100644
--- a/run-command.c
+++ b/run-command.c
@@ -582,6 +582,7 @@ static int wait_or_whine(pid_t pid, const char *argv0, int in_signal)
 		 */
 		code += 128;
 	} else if (WIFEXITED(status)) {
+		warn_about_git_lfs_on_windows7(status, argv0);
 		code = WEXITSTATUS(status);
 	} else {
 		if (!in_signal)

From ea0c51e000417bd579a7c8b5df0bce33d0ee991e Mon Sep 17 00:00:00 2001
From: Derrick Stolee <stolee@gmail.com>
Date: Fri, 6 Sep 2024 14:16:13 -0400
Subject: [PATCH 403/553] revision: create mark_trees_uninteresting_dense()

The sparse tree walk algorithm was created in d5d2e93577e (revision:
implement sparse algorithm, 2019-01-16) and involves using the
mark_trees_uninteresting_sparse() method. This method takes a repository
and an oidset of tree IDs, some of which have the UNINTERESTING flag and
some of which do not.

Create a method that has an equivalent set of preconditions but uses a
"dense" walk (recursively visits all reachable trees, as long as they
have not previously been marked UNINTERESTING). This is an important
difference from mark_tree_uninteresting(), which short-circuits if the
given tree has the UNINTERESTING flag.

A later change will add a caller of this method, along with a condition
that selects whether the sparse or the dense approach is used.

Signed-off-by: Derrick Stolee <stolee@gmail.com>

From ef1254b2ef5dacc960967a82b69a6c29bb8280e5 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Tue, 24 Sep 2024 08:47:39 +0200
Subject: [PATCH 404/553] ci: work around a problem with HTTP/2 vs libcurl
 v8.10.0

As reported in https://lore.kernel.org/git/ZuPKvYP9ZZ2mhb4m@pks.im/,
libcurl v8.10.0 had a regression that was picked up by Git's t5559.30
"large fetch-pack requests can be sent using chunked encoding".

This bug was fixed in libcurl v8.10.1.

Sadly, the macos-13 runner image was updated in the brief window between
these two libcurl versions, breaking each and every CI build, as
reported at https://github.com/git-for-windows/git/issues/5159.

Usually this would not matter: we would simply ignore the failing CI
builds until the macos-13 runner image is rebuilt in a couple of days,
and then the CI builds would succeed again.

However.

As has become the custom, a surprise Git version was released, and Git
for Windows wants to follow suit. Since Git for Windows tries never to
release a version with a failing CI build, we _must_ work around the
problem.

This patch implements this work-around, basically for the sake of Git
for Windows v2.46.2's CI build.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5551-http-fetch-smart.sh | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index 73cf5315800fa4..229310342d9426 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -413,7 +413,15 @@ test_expect_success CMDLINE_LIMIT \
 	)
 '
 
-test_expect_success 'large fetch-pack requests can be sent using chunked encoding' '
+# This is a temporary work-around for libcurl v8.10.0 on the macos-* runners;
+# see https://github.com/git-for-windows/git/issues/5159 for full details
+test_lazy_prereq UNBROKEN_HTTP2 '
+	test "$HTTP_PROTO" = HTTP/2 &&
+	test -z "$(brew info -q curl 2>/dev/null |
+		sed -n "/^Installed/{N;s/.*8\\.10\\.0.*/BROKEN HTTP2/p;}")"
+'
+
+test_expect_success UNBROKEN_HTTP2 'large fetch-pack requests can be sent using chunked encoding' '
 	GIT_TRACE_CURL=true git -c http.postbuffer=65536 \
 		clone --bare "$HTTPD_URL/smart/repo.git" split.git 2>err &&
 	{

From 7035055a85fc520d820be92c6380243910965ff1 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <stolee@gmail.com>
Date: Sun, 1 Sep 2024 20:58:35 -0400
Subject: [PATCH 405/553] survey: summarize total sizes by object type

Now that we have explored objects by count, we can expand that a bit more to
summarize the on-disk and inflated sizes of those objects. This information
is helpful for diagnosing both why disk space (and perhaps clone or fetch
times) is growing and why certain operations are slow because the inflated
size of the abstract objects that must be processed is so large.
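
The accumulation itself is simple and can be sketched independently of
the object database. The following is an illustration under
assumptions, not the code in this patch: every `sketch_*` name is
hypothetical, and the sizes are passed in directly instead of being
looked up per object.

```c
#include <stddef.h>

enum sketch_type {
	SKETCH_COMMIT,
	SKETCH_TREE,
	SKETCH_BLOB,
	SKETCH_TAG,
	SKETCH_TYPE_COUNT
};

struct sketch_object {
	enum sketch_type type;
	size_t disk_size;     /* bytes as stored (compressed, deltified) */
	size_t inflated_size; /* bytes after inflating */
};

struct sketch_summary {
	size_t nr;
	size_t disk_size;
	size_t inflated_size;
};

/* Add every object's sizes to the row that matches its type. */
static void sketch_summarize(const struct sketch_object *objs, size_t nr,
			     struct sketch_summary *by_type)
{
	for (size_t i = 0; i < nr; i++) {
		struct sketch_summary *row = &by_type[objs[i].type];

		row->nr++;
		row->disk_size += objs[i].disk_size;
		row->inflated_size += objs[i].inflated_size;
	}
}
```

One row per object type keeps the report table a fixed size no matter
how many objects are walked.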

Note: zlib-ng is slightly more efficient even at the small sizes used in
the tests, and even between zlib versions there are slight differences in
compression. To accommodate that, the tests validate rough approximations
rather than exact numbers (the test should validate `git survey`, after
all, not zlib).

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/survey.c      | 133 ++++++++++++++++++++++++++++++++++++++++++
 t/t8100-git-survey.sh |  37 +++++++++++-
 2 files changed, 169 insertions(+), 1 deletion(-)

diff --git a/builtin/survey.c b/builtin/survey.c
index 1e8b9c1e5492aa..1d1290553250a1 100644
--- a/builtin/survey.c
+++ b/builtin/survey.c
@@ -60,6 +60,19 @@ struct survey_report_object_summary {
 	size_t blobs_nr;
 };
 
+/**
+ * For some category given by 'label', count the number of objects
+ * that match that label along with the on-disk size and the size
+ * after decompressing (both with delta bases and zlib).
+ */
+struct survey_report_object_size_summary {
+	char *label;
+	size_t nr;
+	size_t disk_size;
+	size_t inflated_size;
+	size_t num_missing;
+};
+
 /**
  * This struct contains all of the information that needs to be printed
  * at the end of the exploration of the repository and its references.
@@ -67,8 +80,16 @@ struct survey_report_object_summary {
 struct survey_report {
 	struct survey_report_ref_summary refs;
 	struct survey_report_object_summary reachable_objects;
+
+	struct survey_report_object_size_summary *by_type;
 };
 
+#define REPORT_TYPE_COMMIT 0
+#define REPORT_TYPE_TREE 1
+#define REPORT_TYPE_BLOB 2
+#define REPORT_TYPE_TAG 3
+#define REPORT_TYPE_COUNT 4
+
 struct survey_context {
 	struct repository *repo;
 
@@ -280,12 +301,48 @@ static void survey_report_plaintext_reachable_object_summary(struct survey_conte
 	clear_table(&table);
 }
 
+static void survey_report_object_sizes(const char *title,
+				       const char *categories,
+				       struct survey_report_object_size_summary *summary,
+				       size_t summary_nr)
+{
+	struct survey_table table = SURVEY_TABLE_INIT;
+	table.table_name = title;
+
+	strvec_push(&table.header, categories);
+	strvec_push(&table.header, _("Count"));
+	strvec_push(&table.header, _("Disk Size"));
+	strvec_push(&table.header, _("Inflated Size"));
+
+	for (size_t i = 0; i < summary_nr; i++) {
+		char *label_str = xstrdup(summary[i].label);
+		char *nr_str = xstrfmt("%"PRIuMAX, (uintmax_t)summary[i].nr);
+		char *disk_str = xstrfmt("%"PRIuMAX, (uintmax_t)summary[i].disk_size);
+		char *inflate_str = xstrfmt("%"PRIuMAX, (uintmax_t)summary[i].inflated_size);
+
+		insert_table_rowv(&table, label_str, nr_str,
+				  disk_str, inflate_str, NULL);
+
+		free(label_str);
+		free(nr_str);
+		free(disk_str);
+		free(inflate_str);
+	}
+
+	print_table_plaintext(&table);
+	clear_table(&table);
+}
+
 static void survey_report_plaintext(struct survey_context *ctx)
 {
 	printf("GIT SURVEY for \"%s\"\n", ctx->repo->worktree);
 	printf("-----------------------------------------------------\n");
 	survey_report_plaintext_refs(ctx);
 	survey_report_plaintext_reachable_object_summary(ctx);
+	survey_report_object_sizes(_("TOTAL OBJECT SIZES BY TYPE"),
+				   _("Object Type"),
+				   ctx->report.by_type,
+				   REPORT_TYPE_COUNT);
 }
 
 /*
@@ -499,6 +556,69 @@ static void increment_object_counts(
 	}
 }
 
+static void increment_totals(struct survey_context *ctx,
+			     struct oid_array *oids,
+			     struct survey_report_object_size_summary *summary)
+{
+	for (size_t i = 0; i < oids->nr; i++) {
+		struct object_info oi = OBJECT_INFO_INIT;
+		unsigned oi_flags = OBJECT_INFO_FOR_PREFETCH;
+		unsigned long object_length = 0;
+		off_t disk_sizep = 0;
+		enum object_type type;
+
+		oi.typep = &type;
+		oi.sizep = &object_length;
+		oi.disk_sizep = &disk_sizep;
+
+		if (odb_read_object_info_extended(ctx->repo->objects,
+						  &oids->oid[i],
+						  &oi, oi_flags) < 0) {
+			summary->num_missing++;
+		} else {
+			summary->nr++;
+			summary->disk_size += disk_sizep;
+			summary->inflated_size += object_length;
+		}
+	}
+}
+
+static void increment_object_totals(struct survey_context *ctx,
+				    struct oid_array *oids,
+				    enum object_type type)
+{
+	struct survey_report_object_size_summary *total;
+	struct survey_report_object_size_summary summary = { 0 };
+
+	increment_totals(ctx, oids, &summary);
+
+	switch (type) {
+	case OBJ_COMMIT:
+		total = &ctx->report.by_type[REPORT_TYPE_COMMIT];
+		break;
+
+	case OBJ_TREE:
+		total = &ctx->report.by_type[REPORT_TYPE_TREE];
+		break;
+
+	case OBJ_BLOB:
+		total = &ctx->report.by_type[REPORT_TYPE_BLOB];
+		break;
+
+	case OBJ_TAG:
+		total = &ctx->report.by_type[REPORT_TYPE_TAG];
+		break;
+
+	default:
+		BUG("No other type allowed");
+	}
+
+	total->nr += summary.nr;
+	total->disk_size += summary.disk_size;
+	total->inflated_size += summary.inflated_size;
+	total->num_missing += summary.num_missing;
+}
+
 static int survey_objects_path_walk_fn(const char *path UNUSED,
 				       struct oid_array *oids,
 				       enum object_type type,
@@ -508,10 +628,20 @@ static int survey_objects_path_walk_fn(const char *path UNUSED,
 
 	increment_object_counts(&ctx->report.reachable_objects,
 				type, oids->nr);
+	increment_object_totals(ctx, oids, type);
 
 	return 0;
 }
 
+static void initialize_report(struct survey_context *ctx)
+{
+	CALLOC_ARRAY(ctx->report.by_type, REPORT_TYPE_COUNT);
+	ctx->report.by_type[REPORT_TYPE_COMMIT].label = xstrdup(_("Commits"));
+	ctx->report.by_type[REPORT_TYPE_TREE].label = xstrdup(_("Trees"));
+	ctx->report.by_type[REPORT_TYPE_BLOB].label = xstrdup(_("Blobs"));
+	ctx->report.by_type[REPORT_TYPE_TAG].label = xstrdup(_("Tags"));
+}
+
 static void survey_phase_objects(struct survey_context *ctx)
 {
 	struct rev_info revs = REV_INFO_INIT;
@@ -524,12 +654,15 @@ static void survey_phase_objects(struct survey_context *ctx)
 	info.path_fn = survey_objects_path_walk_fn;
 	info.path_fn_data = ctx;
 
+	initialize_report(ctx);
+
 	repo_init_revisions(ctx->repo, &revs, "");
 	revs.tag_objects = 1;
 
 	for (int i = 0; i < ctx->ref_array.nr; i++) {
 		struct ref_array_item *item = ctx->ref_array.items[i];
 		add_pending_oid(&revs, NULL, &item->objectname, add_flags);
+		display_progress(ctx->progress, ++(ctx->progress_nr));
 	}
 
 	walk_objects_by_path(&info);
diff --git a/t/t8100-git-survey.sh b/t/t8100-git-survey.sh
index d3086784090352..c2a6333145bac1 100755
--- a/t/t8100-git-survey.sh
+++ b/t/t8100-git-survey.sh
@@ -25,10 +25,35 @@ test_expect_success 'create a semi-interesting repo' '
 	git update-ref -d refs/tags/two
 '
 
+approximate_sizes() {
+	# very simplistic approximate rounding
+	sed -Ee "s/  *(1[0-9][0-9])( |$)/ ~0.1kB\2/g" \
+	  -e "s/  *(4[6-9][0-9]|5[0-6][0-9])( |$)/ ~0.5kB\2/g" \
+	  -e "s/  *(5[6-9][0-9]|6[0-6][0-9])( |$)/ ~0.6kB\2/g" \
+	  -e "s/  *1(4[89][0-9]|5[0-8][0-9])( |$)/ ~1.5kB\2/g" \
+	  -e "s/  *1(69[0-9]|7[0-9][0-9])( |$)/ ~1.7kB\2/g" \
+	  -e "s/  *1(79[0-9]|8[0-9][0-9])( |$)/ ~1.8kB\2/g" \
+	  -e "s/  *2(1[0-9][0-9]|20[0-1])( |$)/ ~2.1kB\2/g" \
+	  -e "s/  *2(3[0-9][0-9]|4[0-1][0-9])( |$)/ ~2.3kB\2/g" \
+	  -e "s/  *2(5[0-9][0-9]|6[0-1][0-9])( |$)/ ~2.5kB\2/g" \
+	 "$@"
+}
+
 test_expect_success 'git survey (default)' '
 	git survey --all-refs >out 2>err &&
 	test_line_count = 0 err &&
 
+	test_oid_cache <<-EOF &&
+	commits_sizes sha1:~1.5kB | ~2.1kB
+	commits_sizes sha256:~1.8kB | ~2.5kB
+	trees_sizes sha1:~0.5kB | ~1.7kB
+	trees_sizes sha256:~0.6kB | ~2.3kB
+	blobs_sizes sha1:~0.1kB | ~0.1kB
+	blobs_sizes sha256:~0.1kB | ~0.1kB
+	tags_sizes sha1:~0.5kB | ~0.5kB
+	tags_sizes sha256:~0.5kB | ~0.6kB
+	EOF
+
 	tr , " " >expect <<-EOF &&
 	GIT SURVEY for "$(pwd)"
 	-----------------------------------------------------
@@ -50,9 +75,19 @@ test_expect_success 'git survey (default)' '
 	    Commits |    10
 	      Trees |    10
 	      Blobs |    10
+
+	TOTAL OBJECT SIZES BY TYPE
+	===============================================
+	Object Type | Count | Disk Size | Inflated Size
+	------------+-------+-----------+--------------
+	    Commits |    10 | $(test_oid commits_sizes)
+	      Trees |    10 | $(test_oid trees_sizes)
+	      Blobs |    10 | $(test_oid blobs_sizes)
+	       Tags |     4 | $(test_oid tags_sizes)
 	EOF
 
-	test_cmp expect out
+	approximate_sizes out >out-edited &&
+	test_cmp expect out-edited
 '
 
 test_done

From 36f1a8c50d892affbb462dc4f01fa8d926f232a0 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <stolee@gmail.com>
Date: Sun, 1 Sep 2024 21:21:54 -0400
Subject: [PATCH 406/553] survey: show progress during object walk

Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
 builtin/survey.c      | 16 ++++++++++++++++
 t/t8100-git-survey.sh |  5 +++++
 2 files changed, 21 insertions(+)

diff --git a/builtin/survey.c b/builtin/survey.c
index 1d1290553250a1..c570a1470122f4 100644
--- a/builtin/survey.c
+++ b/builtin/survey.c
@@ -630,6 +630,9 @@ static int survey_objects_path_walk_fn(const char *path UNUSED,
 				type, oids->nr);
 	increment_object_totals(ctx, oids, type);
 
+	ctx->progress_nr += oids->nr;
+	display_progress(ctx->progress, ctx->progress_nr);
+
 	return 0;
 }
 
@@ -659,13 +662,26 @@ static void survey_phase_objects(struct survey_context *ctx)
 	repo_init_revisions(ctx->repo, &revs, "");
 	revs.tag_objects = 1;
 
+	ctx->progress_nr = 0;
+	ctx->progress_total = ctx->ref_array.nr;
+	if (ctx->opts.show_progress)
+		ctx->progress = start_progress(ctx->repo,
+					       _("Preparing object walk"),
+					       ctx->progress_total);
 	for (int i = 0; i < ctx->ref_array.nr; i++) {
 		struct ref_array_item *item = ctx->ref_array.items[i];
 		add_pending_oid(&revs, NULL, &item->objectname, add_flags);
 		display_progress(ctx->progress, ++(ctx->progress_nr));
 	}
+	stop_progress(&ctx->progress);
 
+	ctx->progress_nr = 0;
+	ctx->progress_total = 0;
+	if (ctx->opts.show_progress)
+		ctx->progress = start_progress(ctx->repo,
+					       _("Walking objects"), 0);
 	walk_objects_by_path(&info);
+	stop_progress(&ctx->progress);
 
 	release_revisions(&revs);
 	trace2_region_leave("survey", "phase/objects", ctx->repo);
diff --git a/t/t8100-git-survey.sh b/t/t8100-git-survey.sh
index c2a6333145bac1..118410be55cc2a 100755
--- a/t/t8100-git-survey.sh
+++ b/t/t8100-git-survey.sh
@@ -25,6 +25,11 @@ test_expect_success 'create a semi-interesting repo' '
 	git update-ref -d refs/tags/two
 '
 
+test_expect_success 'git survey --progress' '
+	GIT_PROGRESS_DELAY=0 git survey --all-refs --progress >out 2>err &&
+	grep "Preparing object walk" err
+'
+
 approximate_sizes() {
 	# very simplistic approximate rounding
 	sed -Ee "s/  *(1[0-9][0-9])( |$)/ ~0.1kB\2/g" \

From c5c25d85083634b931f49c031f2d27f919336423 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Thu, 14 Nov 2019 20:09:23 +0100
Subject: [PATCH 407/553] mingw: make sure `errno` is set correctly when socket
 operations fail

The winsock2 library provides functions that work on different data
types than file descriptors, therefore we wrap them.

But that is not the only difference: they also do not set `errno` but
expect the callers to enquire about errors via `WSAGetLastError()`.

Let's translate that into appropriate `errno` values whenever the socket
operations fail so that Git's code base does not have to change its
expectations.
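
The pattern can be sketched in a form that builds without
`<winsock2.h>`. This is an illustration under assumptions, not the code
in this patch: the `SKETCH_*` constants spell out the numeric values of
four Winsock codes, and `sketch_last_error` is a hypothetical stand-in
for `WSAGetLastError()`.

```c
#include <errno.h>

/* Numeric values of a few Winsock error codes. */
enum {
	SKETCH_WSAEINTR = 10004,
	SKETCH_WSAEWOULDBLOCK = 10035,
	SKETCH_WSAETIMEDOUT = 10060,
	SKETCH_WSAECONNREFUSED = 10061
};

/* Hypothetical stand-in for WSAGetLastError(). */
static int sketch_last_error;

static int sketch_winsock_error_to_errno(int err)
{
	switch (err) {
	case SKETCH_WSAEINTR: return EINTR;
	case SKETCH_WSAEWOULDBLOCK: return EWOULDBLOCK;
	case SKETCH_WSAETIMEDOUT: return ETIMEDOUT;
	case SKETCH_WSAECONNREFUSED: return ECONNREFUSED;
	/* No errno equivalent known to this sketch */
	default: return EIO;
	}
}

/* Pass the return value through, setting errno only on failure. */
static int sketch_winsock_return(int ret)
{
	if (ret < 0)
		errno = sketch_winsock_error_to_errno(sketch_last_error);
	return ret;
}
```

Wrapping every call site in one helper keeps the translation in a
single place, which also makes it easy to set a breakpoint there.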

This closes https://github.com/git-for-windows/git/issues/2404

Helped-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 157 +++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 147 insertions(+), 10 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index f09b49ff21ddab..5f1578e8361437 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -2093,18 +2093,150 @@ static void ensure_socket_initialization(void)
 	initialized = 1;
 }
 
+static int winsock_error_to_errno(DWORD err)
+{
+	switch (err) {
+	case WSAEINTR: return EINTR;
+	case WSAEBADF: return EBADF;
+	case WSAEACCES: return EACCES;
+	case WSAEFAULT: return EFAULT;
+	case WSAEINVAL: return EINVAL;
+	case WSAEMFILE: return EMFILE;
+	case WSAEWOULDBLOCK: return EWOULDBLOCK;
+	case WSAEINPROGRESS: return EINPROGRESS;
+	case WSAEALREADY: return EALREADY;
+	case WSAENOTSOCK: return ENOTSOCK;
+	case WSAEDESTADDRREQ: return EDESTADDRREQ;
+	case WSAEMSGSIZE: return EMSGSIZE;
+	case WSAEPROTOTYPE: return EPROTOTYPE;
+	case WSAENOPROTOOPT: return ENOPROTOOPT;
+	case WSAEPROTONOSUPPORT: return EPROTONOSUPPORT;
+	case WSAEOPNOTSUPP: return EOPNOTSUPP;
+	case WSAEAFNOSUPPORT: return EAFNOSUPPORT;
+	case WSAEADDRINUSE: return EADDRINUSE;
+	case WSAEADDRNOTAVAIL: return EADDRNOTAVAIL;
+	case WSAENETDOWN: return ENETDOWN;
+	case WSAENETUNREACH: return ENETUNREACH;
+	case WSAENETRESET: return ENETRESET;
+	case WSAECONNABORTED: return ECONNABORTED;
+	case WSAECONNRESET: return ECONNRESET;
+	case WSAENOBUFS: return ENOBUFS;
+	case WSAEISCONN: return EISCONN;
+	case WSAENOTCONN: return ENOTCONN;
+	case WSAETIMEDOUT: return ETIMEDOUT;
+	case WSAECONNREFUSED: return ECONNREFUSED;
+	case WSAELOOP: return ELOOP;
+	case WSAENAMETOOLONG: return ENAMETOOLONG;
+	case WSAEHOSTUNREACH: return EHOSTUNREACH;
+	case WSAENOTEMPTY: return ENOTEMPTY;
+	/* No errno equivalent; default to EIO */
+	case WSAESOCKTNOSUPPORT:
+	case WSAEPFNOSUPPORT:
+	case WSAESHUTDOWN:
+	case WSAETOOMANYREFS:
+	case WSAEHOSTDOWN:
+	case WSAEPROCLIM:
+	case WSAEUSERS:
+	case WSAEDQUOT:
+	case WSAESTALE:
+	case WSAEREMOTE:
+	case WSASYSNOTREADY:
+	case WSAVERNOTSUPPORTED:
+	case WSANOTINITIALISED:
+	case WSAEDISCON:
+	case WSAENOMORE:
+	case WSAECANCELLED:
+	case WSAEINVALIDPROCTABLE:
+	case WSAEINVALIDPROVIDER:
+	case WSAEPROVIDERFAILEDINIT:
+	case WSASYSCALLFAILURE:
+	case WSASERVICE_NOT_FOUND:
+	case WSATYPE_NOT_FOUND:
+	case WSA_E_NO_MORE:
+	case WSA_E_CANCELLED:
+	case WSAEREFUSED:
+	case WSAHOST_NOT_FOUND:
+	case WSATRY_AGAIN:
+	case WSANO_RECOVERY:
+	case WSANO_DATA:
+	case WSA_QOS_RECEIVERS:
+	case WSA_QOS_SENDERS:
+	case WSA_QOS_NO_SENDERS:
+	case WSA_QOS_NO_RECEIVERS:
+	case WSA_QOS_REQUEST_CONFIRMED:
+	case WSA_QOS_ADMISSION_FAILURE:
+	case WSA_QOS_POLICY_FAILURE:
+	case WSA_QOS_BAD_STYLE:
+	case WSA_QOS_BAD_OBJECT:
+	case WSA_QOS_TRAFFIC_CTRL_ERROR:
+	case WSA_QOS_GENERIC_ERROR:
+	case WSA_QOS_ESERVICETYPE:
+	case WSA_QOS_EFLOWSPEC:
+	case WSA_QOS_EPROVSPECBUF:
+	case WSA_QOS_EFILTERSTYLE:
+	case WSA_QOS_EFILTERTYPE:
+	case WSA_QOS_EFILTERCOUNT:
+	case WSA_QOS_EOBJLENGTH:
+	case WSA_QOS_EFLOWCOUNT:
+#ifndef _MSC_VER
+	case WSA_QOS_EUNKNOWNPSOBJ:
+#endif
+	case WSA_QOS_EPOLICYOBJ:
+	case WSA_QOS_EFLOWDESC:
+	case WSA_QOS_EPSFLOWSPEC:
+	case WSA_QOS_EPSFILTERSPEC:
+	case WSA_QOS_ESDMODEOBJ:
+	case WSA_QOS_ESHAPERATEOBJ:
+	case WSA_QOS_RESERVED_PETYPE:
+	default: return EIO;
+	}
+}
+
+/*
+ * On Windows, `errno` is a macro that expands to a function call, which
+ * makes it difficult to debug and single-step our mappings.
+ */
+static inline void set_wsa_errno(void)
+{
+	DWORD wsa = WSAGetLastError();
+	int e = winsock_error_to_errno(wsa);
+	errno = e;
+
+#ifdef DEBUG_WSA_ERRNO
+	fprintf(stderr, "winsock error: %d -> %d\n", wsa, e);
+	fflush(stderr);
+#endif
+}
+
+static inline int winsock_return(int ret)
+{
+	if (ret < 0)
+		set_wsa_errno();
+
+	return ret;
+}
+
+#define WINSOCK_RETURN(x) do { return winsock_return(x); } while (0)
+
 #undef gethostname
 int mingw_gethostname(char *name, int namelen)
 {
-    ensure_socket_initialization();
-    return gethostname(name, namelen);
+	ensure_socket_initialization();
+	WINSOCK_RETURN(gethostname(name, namelen));
 }
 
 #undef gethostbyname
 struct hostent *mingw_gethostbyname(const char *host)
 {
+	struct hostent *ret;
+
 	ensure_socket_initialization();
-	return gethostbyname(host);
+
+	ret = gethostbyname(host);
+	if (!ret)
+		set_wsa_errno();
+
+	return ret;
 }
 
 #undef getaddrinfo
@@ -2112,7 +2244,7 @@ int mingw_getaddrinfo(const char *node, const char *service,
 		      const struct addrinfo *hints, struct addrinfo **res)
 {
 	ensure_socket_initialization();
-	return getaddrinfo(node, service, hints, res);
+	WINSOCK_RETURN(getaddrinfo(node, service, hints, res));
 }
 
 int mingw_socket(int domain, int type, int protocol)
@@ -2132,7 +2264,7 @@ int mingw_socket(int domain, int type, int protocol)
 		 * in errno so that _if_ someone looks up the code somewhere,
 		 * then it is at least the number that are usually listed.
 		 */
-		errno = WSAGetLastError();
+		set_wsa_errno();
 		return -1;
 	}
 	/* convert into a file descriptor */
@@ -2148,35 +2280,35 @@ int mingw_socket(int domain, int type, int protocol)
 int mingw_connect(int sockfd, struct sockaddr *sa, size_t sz)
 {
 	SOCKET s = (SOCKET)_get_osfhandle(sockfd);
-	return connect(s, sa, sz);
+	WINSOCK_RETURN(connect(s, sa, sz));
 }
 
 #undef bind
 int mingw_bind(int sockfd, struct sockaddr *sa, size_t sz)
 {
 	SOCKET s = (SOCKET)_get_osfhandle(sockfd);
-	return bind(s, sa, sz);
+	WINSOCK_RETURN(bind(s, sa, sz));
 }
 
 #undef setsockopt
 int mingw_setsockopt(int sockfd, int lvl, int optname, void *optval, int optlen)
 {
 	SOCKET s = (SOCKET)_get_osfhandle(sockfd);
-	return setsockopt(s, lvl, optname, (const char*)optval, optlen);
+	WINSOCK_RETURN(setsockopt(s, lvl, optname, (const char*)optval, optlen));
 }
 
 #undef shutdown
 int mingw_shutdown(int sockfd, int how)
 {
 	SOCKET s = (SOCKET)_get_osfhandle(sockfd);
-	return shutdown(s, how);
+	WINSOCK_RETURN(shutdown(s, how));
 }
 
 #undef listen
 int mingw_listen(int sockfd, int backlog)
 {
 	SOCKET s = (SOCKET)_get_osfhandle(sockfd);
-	return listen(s, backlog);
+	WINSOCK_RETURN(listen(s, backlog));
 }
 
 #undef accept
@@ -2187,6 +2319,11 @@ int mingw_accept(int sockfd1, struct sockaddr *sa, socklen_t *sz)
 	SOCKET s1 = (SOCKET)_get_osfhandle(sockfd1);
 	SOCKET s2 = accept(s1, sa, sz);
 
+	if (s2 == INVALID_SOCKET) {
+		set_wsa_errno();
+		return -1;
+	}
+
 	/* convert into a file descriptor */
 	if ((sockfd2 = _open_osfhandle(s2, O_RDWR|O_BINARY)) < 0) {
 		int err = errno;

From 1fc52bcb5c7ff7ee421e02f577d977ec2a0911d9 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <stolee@gmail.com>
Date: Sun, 1 Sep 2024 22:35:06 -0400
Subject: [PATCH 408/553] survey: add ability to track prioritized lists

Add methods for tracking a prioritized list; future changes will make use
of them. The intention is to keep track of the top contributors according
to some metric. We don't want to store all of the entries and do a sort at
the end, so track a constant-size table and remove rows that get pushed out
according to the chosen comparison function.
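
The idea can be sketched with plain integers instead of size
summaries. This is an illustration under assumptions, not the code in
this patch: `sketch_top`, `sketch_top_insert()` and `SKETCH_LIMIT` are
hypothetical names, and the ordering is hard-coded as "largest first"
rather than supplied through a comparison callback.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_LIMIT 3

struct sketch_top {
	size_t nr;
	unsigned long values[SKETCH_LIMIT]; /* kept in descending order */
};

static void sketch_top_insert(struct sketch_top *top, unsigned long value)
{
	size_t pos = top->nr;

	/* Walk up from the bottom past every entry smaller than 'value'. */
	while (pos > 0 && top->values[pos - 1] < value)
		pos--;

	/* The table is full and 'value' ranks below every kept entry. */
	if (pos >= SKETCH_LIMIT)
		return;

	/* Grow unless full; when full, the last entry gets pushed out. */
	if (top->nr < SKETCH_LIMIT)
		top->nr++;

	/* Shift the tail down by one slot to make room at 'pos'. */
	memmove(&top->values[pos + 1], &top->values[pos],
		(top->nr - 1 - pos) * sizeof(top->values[0]));
	top->values[pos] = value;
}
```

Each insertion costs at most SKETCH_LIMIT comparisons and moves, so the
memory use stays constant however many candidates are offered.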

Co-authored-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
 builtin/survey.c | 113 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 113 insertions(+)

diff --git a/builtin/survey.c b/builtin/survey.c
index c570a1470122f4..5ff62fa4ab921c 100644
--- a/builtin/survey.c
+++ b/builtin/survey.c
@@ -73,6 +73,119 @@ struct survey_report_object_size_summary {
 	size_t num_missing;
 };
 
+typedef int (*survey_top_cmp)(void *v1, void *v2);
+
+MAYBE_UNUSED
+static int cmp_by_nr(void *v1, void *v2)
+{
+	struct survey_report_object_size_summary *s1 = v1;
+	struct survey_report_object_size_summary *s2 = v2;
+
+	if (s1->nr < s2->nr)
+		return -1;
+	if (s1->nr > s2->nr)
+		return 1;
+	return 0;
+}
+
+MAYBE_UNUSED
+static int cmp_by_disk_size(void *v1, void *v2)
+{
+	struct survey_report_object_size_summary *s1 = v1;
+	struct survey_report_object_size_summary *s2 = v2;
+
+	if (s1->disk_size < s2->disk_size)
+		return -1;
+	if (s1->disk_size > s2->disk_size)
+		return 1;
+	return 0;
+}
+
+MAYBE_UNUSED
+static int cmp_by_inflated_size(void *v1, void *v2)
+{
+	struct survey_report_object_size_summary *s1 = v1;
+	struct survey_report_object_size_summary *s2 = v2;
+
+	if (s1->inflated_size < s2->inflated_size)
+		return -1;
+	if (s1->inflated_size > s2->inflated_size)
+		return 1;
+	return 0;
+}
+
+/**
+ * Store a list of "top" categories by some sorting function. When
+ * inserting a new category, reorder the list and free the one that
+ * got ejected (if any).
+ */
+struct survey_report_top_table {
+	const char *name;
+	survey_top_cmp cmp_fn;
+	size_t nr;
+	size_t alloc;
+
+	/**
+	 * 'data' stores an array of structs and must be cast into
+	 * the proper array type before evaluating an index.
+	 */
+	void *data;
+};
+
+MAYBE_UNUSED
+static void init_top_sizes(struct survey_report_top_table *top,
+			   size_t limit, const char *name,
+			   survey_top_cmp cmp)
+{
+	struct survey_report_object_size_summary *sz_array;
+
+	top->name = name;
+	top->cmp_fn = cmp;
+	top->alloc = limit;
+	top->nr = 0;
+
+	CALLOC_ARRAY(sz_array, limit);
+	top->data = sz_array;
+}
+
+MAYBE_UNUSED
+static void clear_top_sizes(struct survey_report_top_table *top)
+{
+	struct survey_report_object_size_summary *sz_array = top->data;
+
+	for (size_t i = 0; i < top->nr; i++)
+		free(sz_array[i].label);
+	free(sz_array);
+}
+
+MAYBE_UNUSED
+static void maybe_insert_into_top_size(struct survey_report_top_table *top,
+				       struct survey_report_object_size_summary *summary)
+{
+	struct survey_report_object_size_summary *sz_array = top->data;
+	size_t pos = top->nr;
+
+	/* Compare against list from the bottom. */
+	while (pos > 0 && top->cmp_fn(&sz_array[pos - 1], summary) < 0)
+		pos--;
+
+	/* Not big enough! */
+	if (pos >= top->alloc)
+		return;
+
+	/* We need to shift the data. */
+	if (top->nr == top->alloc)
+		free(sz_array[top->nr - 1].label);
+	else
+		top->nr++;
+
+	for (size_t i = top->nr - 1; i > pos; i--)
+		memcpy(&sz_array[i], &sz_array[i - 1], sizeof(*sz_array));
+
+	memcpy(&sz_array[pos], summary, sizeof(*summary));
+	sz_array[pos].label = xstrdup(summary->label);
+}
+
 /**
  * This struct contains all of the information that needs to be printed
  * at the end of the exploration of the repository and its references.

From 795a4f75f1685e21838be9aed7459b1e37925099 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Matthias=20A=C3=9Fhauer?= <mha1993@live.de>
Date: Sun, 22 Dec 2024 17:15:39 +0100
Subject: [PATCH 409/553] compat/mingw: handle WSA errors in strerror
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

We map WSAGetLastError() errors to errno errors in winsock_error_to_errno(),
but the MSVC strerror() implementation only produces "Unknown error" for
most of them. Produce some more meaningful error messages in these
cases.

Our builds for ARM64 link against the newer UCRT strerror() that does know
these errors, so we won't change the strerror() used there.

The wording of the messages is copied from glibc strerror() messages.

Reported-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Makefile               |  1 +
 compat/mingw-posix.h   |  5 +++
 compat/mingw.c         | 85 ++++++++++++++++++++++++++++++++++++++++++
 t/meson.build          |  1 +
 t/unit-tests/u-mingw.c | 72 +++++++++++++++++++++++++++++++++++
 5 files changed, 164 insertions(+)
 create mode 100644 t/unit-tests/u-mingw.c
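
Note (not part of the commit message): the shim idea described above can be
sketched roughly as follows. This is a minimal illustration, assuming a
table-driven lookup instead of the switch statement used in the patch. The
names `sketch_strerror` and `socket_messages` are hypothetical, only a few
of the errno values are listed, and anything not in the table falls through
to the C library's strerror().

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical lookup table. The wording of the messages is the same
 * as in the patch (which copied it from glibc).
 */
static const struct {
	int errnum;
	const char *message;
} socket_messages[] = {
	{ EWOULDBLOCK, "Operation would block" },
	{ ENOTSOCK, "Socket operation on non-socket" },
	{ ECONNRESET, "Connection reset by peer" },
	{ ETIMEDOUT, "Connection timed out" },
	{ ECONNREFUSED, "Connection refused" },
	{ EHOSTUNREACH, "No route to host" },
};

/*
 * Return a message for the socket-related errno values listed above
 * and defer to the C library for everything else.
 */
static const char *sketch_strerror(int errnum)
{
	size_t i;
	size_t nr = sizeof(socket_messages) / sizeof(socket_messages[0]);

	for (i = 0; i < nr; i++)
		if (socket_messages[i].errnum == errnum)
			return socket_messages[i].message;
	return strerror(errnum);
}
```

Returning string literals means callers never share a static buffer; the
patch itself keeps the `char *` return type of strerror() and copies into a
buffer instead.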

diff --git a/Makefile b/Makefile
index b7eba509c6a0ca..dd5bda65538b77 100644
--- a/Makefile
+++ b/Makefile
@@ -1517,6 +1517,7 @@ CLAR_TEST_SUITES += u-example-decorate
 CLAR_TEST_SUITES += u-hash
 CLAR_TEST_SUITES += u-hashmap
 CLAR_TEST_SUITES += u-mem-pool
+CLAR_TEST_SUITES += u-mingw
 CLAR_TEST_SUITES += u-oid-array
 CLAR_TEST_SUITES += u-oidmap
 CLAR_TEST_SUITES += u-oidtree
diff --git a/compat/mingw-posix.h b/compat/mingw-posix.h
index 0939feff27ffec..0ef26f7f80c44c 100644
--- a/compat/mingw-posix.h
+++ b/compat/mingw-posix.h
@@ -290,6 +290,11 @@ int mingw_socket(int domain, int type, int protocol);
 int mingw_connect(int sockfd, struct sockaddr *sa, size_t sz);
 #define connect mingw_connect
 
+char *mingw_strerror(int errnum);
+#ifndef _UCRT
+#define strerror mingw_strerror
+#endif
+
 int mingw_bind(int sockfd, struct sockaddr *sa, size_t sz);
 #define bind mingw_bind
 
diff --git a/compat/mingw.c b/compat/mingw.c
index 5f1578e8361437..87ac187168ae91 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -2218,6 +2218,91 @@ static inline int winsock_return(int ret)
 
 #define WINSOCK_RETURN(x) do { return winsock_return(x); } while (0)
 
+#undef strerror
+char *mingw_strerror(int errnum)
+{
+	static char buf[41] ="";
+	switch (errnum) {
+		case EWOULDBLOCK:
+			xsnprintf(buf, 41, "%s", "Operation would block");
+			break;
+		case EINPROGRESS:
+			xsnprintf(buf, 41, "%s", "Operation now in progress");
+			break;
+		case EALREADY:
+			xsnprintf(buf, 41, "%s", "Operation already in progress");
+			break;
+		case ENOTSOCK:
+			xsnprintf(buf, 41, "%s", "Socket operation on non-socket");
+			break;
+		case EDESTADDRREQ:
+			xsnprintf(buf, 41, "%s", "Destination address required");
+			break;
+		case EMSGSIZE:
+			xsnprintf(buf, 41, "%s", "Message too long");
+			break;
+		case EPROTOTYPE:
+			xsnprintf(buf, 41, "%s", "Protocol wrong type for socket");
+			break;
+		case ENOPROTOOPT:
+			xsnprintf(buf, 41, "%s", "Protocol not available");
+			break;
+		case EPROTONOSUPPORT:
+			xsnprintf(buf, 41, "%s", "Protocol not supported");
+			break;
+		case EOPNOTSUPP:
+			xsnprintf(buf, 41, "%s", "Operation not supported");
+			break;
+		case EAFNOSUPPORT:
+			xsnprintf(buf, 41, "%s", "Address family not supported by protocol");
+			break;
+		case EADDRINUSE:
+			xsnprintf(buf, 41, "%s", "Address already in use");
+			break;
+		case EADDRNOTAVAIL:
+			xsnprintf(buf, 41, "%s", "Cannot assign requested address");
+			break;
+		case ENETDOWN:
+			xsnprintf(buf, 41, "%s", "Network is down");
+			break;
+		case ENETUNREACH:
+			xsnprintf(buf, 41, "%s", "Network is unreachable");
+			break;
+		case ENETRESET:
+			xsnprintf(buf, 41, "%s", "Network dropped connection on reset");
+			break;
+		case ECONNABORTED:
+			xsnprintf(buf, 41, "%s", "Software caused connection abort");
+			break;
+		case ECONNRESET:
+			xsnprintf(buf, 41, "%s", "Connection reset by peer");
+			break;
+		case ENOBUFS:
+			xsnprintf(buf, 41, "%s", "No buffer space available");
+			break;
+		case EISCONN:
+			xsnprintf(buf, 41, "%s", "Transport endpoint is already connected");
+			break;
+		case ENOTCONN:
+			xsnprintf(buf, 41, "%s", "Transport endpoint is not connected");
+			break;
+		case ETIMEDOUT:
+			xsnprintf(buf, 41, "%s", "Connection timed out");
+			break;
+		case ECONNREFUSED:
+			xsnprintf(buf, 41, "%s", "Connection refused");
+			break;
+		case ELOOP:
+			xsnprintf(buf, 41, "%s", "Too many levels of symbolic links");
+			break;
+		case EHOSTUNREACH:
+			xsnprintf(buf, 41, "%s", "No route to host");
+			break;
+		default: return strerror(errnum);
+	}
+	return buf;
+}
+
 #undef gethostname
 int mingw_gethostname(char *name, int namelen)
 {
diff --git a/t/meson.build b/t/meson.build
index 459c52a48972e4..20f311bac0897c 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -5,6 +5,7 @@ clar_test_suites = [
   'unit-tests/u-hash.c',
   'unit-tests/u-hashmap.c',
   'unit-tests/u-mem-pool.c',
+  'unit-tests/u-mingw.c',
   'unit-tests/u-oid-array.c',
   'unit-tests/u-oidmap.c',
   'unit-tests/u-oidtree.c',
diff --git a/t/unit-tests/u-mingw.c b/t/unit-tests/u-mingw.c
new file mode 100644
index 00000000000000..cb74da5e793a33
--- /dev/null
+++ b/t/unit-tests/u-mingw.c
@@ -0,0 +1,72 @@
+#include "unit-test.h"
+
+#if defined(GIT_WINDOWS_NATIVE) && !defined(_UCRT)
+#undef strerror
+int errnos_contains(int);
+static int errnos [53]={
+    /* errnos in err_win_to_posix */
+    EACCES, EBUSY, EEXIST, ERANGE, EIO, ENODEV, ENXIO, ENOEXEC, EINVAL, ENOENT,
+    EPIPE, ENAMETOOLONG, ENOSYS, ENOTEMPTY, ENOSPC, EFAULT, EBADF, EPERM, EINTR,
+    E2BIG, ESPIPE, ENOMEM, EXDEV, EAGAIN, ENFILE, EMFILE, ECHILD, EROFS,
+    /* errnos only in winsock_error_to_errno */
+    EWOULDBLOCK, EINPROGRESS, EALREADY, ENOTSOCK, EDESTADDRREQ, EMSGSIZE,
+    EPROTOTYPE, ENOPROTOOPT, EPROTONOSUPPORT, EOPNOTSUPP, EAFNOSUPPORT,
+    EADDRINUSE, EADDRNOTAVAIL, ENETDOWN, ENETUNREACH, ENETRESET, ECONNABORTED,
+    ECONNRESET, ENOBUFS, EISCONN, ENOTCONN, ETIMEDOUT, ECONNREFUSED, ELOOP,
+    EHOSTUNREACH
+    };
+
+int errnos_contains(int errnum)
+{
+    for(int i=0;i<53;i++)
+	if(errnos[i]==errnum)
+	    return 1;
+    return 0;
+}
+#endif
+
+void test_mingw__no_strerror_shim_on_ucrt(void)
+{
+#if defined(GIT_WINDOWS_NATIVE) && defined(_UCRT)
+    cl_assert_(strerror != mingw_strerror,
+	"mingw_strerror is unnecessary when building against UCRT");
+#else
+    cl_skip();
+#endif
+}
+
+void test_mingw__strerror(void)
+{
+#if defined(GIT_WINDOWS_NATIVE) && !defined(_UCRT)
+    for(int i=0;i<53;i++)
+    {
+	char *crt;
+	char *mingw;
+	mingw = mingw_strerror(errnos[i]);
+	crt = strerror(errnos[i]);
+	cl_assert_(!strcasestr(mingw, "unknown error"),
+	    "mingw_strerror should know all errno values we care about");
+	if(!strcasestr(crt, "unknown error"))
+	    cl_assert_equal_s(crt,mingw);
+    }
+#else
+    cl_skip();
+#endif
+}
+
+void test_mingw__errno_translation(void)
+{
+#if defined(GIT_WINDOWS_NATIVE) && !defined(_UCRT)
+    /* GetLastError() return values are currently defined from 0 to 15841,
+    testing up to 20000 covers some room for future expansion */
+    for (int i=0;i<20000;i++)
+    {
+	if(i!=ERROR_SUCCESS)
+	    cl_assert_(errnos_contains(err_win_to_posix(i)),
+		"all err_win_to_posix return values should be tested against mingw_strerror");
+	/* ideally we'd test the same for winsock_error_to_errno, but it's static */
+    }
+#else
+    cl_skip();
+#endif
+}

From 09177778b092990d00d2d5ed436008490301f855 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <stolee@gmail.com>
Date: Sun, 1 Sep 2024 22:35:40 -0400
Subject: [PATCH 410/553] survey: add report of "largest" paths
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Since we are already walking our reachable objects using the path-walk API,
let's now collect lists of the paths that contribute most to different
metrics. Specifically, we care about

 * Number of versions.
 * Total size on disk.
 * Total inflated size (no delta or zlib compression).

This information can be critical to discovering which parts of the
repository are causing the most growth, especially on-disk size. Different
packing strategies might help compress data more efficiently, but the total
inflated size is a representation of the raw size of all snapshots of those
paths. Even when stored efficiently on disk, that size represents how much
information must be processed to complete a command such as 'git blame'.

The exact disk size does not seem robust enough for testing, as could be
seen by the `linux-musl-meson` job failing consistently, possibly because
zlib-ng deflates differently: t8100.4 (git survey (default)) was failing
with a symptom like this:

   TOTAL OBJECT SIZES BY TYPE
   ===============================================
   Object Type | Count | Disk Size | Inflated Size
   ------------+-------+-----------+--------------
  -    Commits |    10 |      1523 |          2153
  +    Commits |    10 |      1528 |          2153
         Trees |    10 |       495 |          1706
         Blobs |    10 |       191 |           101
  -       Tags |     4 |       510 |           528
  +       Tags |     4 |       547 |           528

This means that the disk size is unlikely to be something we can verify
robustly. Since zlib-ng seems to increase the disk size of the tags from
510 to 547 (above their inflated size of 528), we cannot even assume that
the disk size is always smaller than the inflated size. We will most
likely want to either skip verifying the disk size altogether, or go for
some kind of fuzzy matching, say, by replacing
`s/ 1[45][0-9][0-9] / ~1.5k /` and `s/ [45][0-9][0-9] / ~½k /`
or something like that.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/survey.c      | 79 ++++++++++++++++++++++++++++++++++++++-----
 t/t8100-git-survey.sh | 12 ++++++-
 2 files changed, 82 insertions(+), 9 deletions(-)
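
Note (not part of the commit message): the bounded "top N" insertion that
these reports rely on can be sketched in isolation as follows. This is a
simplified illustration, assuming a single `size` key instead of the
pluggable comparison function in builtin/survey.c. The names `top_list`,
`top_list_init`, `top_list_insert` and `top_list_clear` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L /* for strdup() */
#include <stdlib.h>
#include <string.h>

struct entry {
	char *label;
	unsigned long size;
};

struct top_list {
	struct entry *items;
	size_t nr, alloc;
};

static void top_list_init(struct top_list *top, size_t limit)
{
	top->items = calloc(limit, sizeof(*top->items));
	top->alloc = top->items ? limit : 0;
	top->nr = 0;
}

static void top_list_clear(struct top_list *top)
{
	for (size_t i = 0; i < top->nr; i++)
		free(top->items[i].label);
	free(top->items);
	top->items = NULL;
	top->nr = top->alloc = 0;
}

/* Keep the list sorted by descending size, holding at most 'alloc' entries. */
static void top_list_insert(struct top_list *top, const char *label,
			    unsigned long size)
{
	size_t pos = top->nr;
	char *copy;

	/* Walk up from the bottom while the candidate beats the entry above. */
	while (pos > 0 && top->items[pos - 1].size < size)
		pos--;

	/* Not big enough to make the list. */
	if (pos >= top->alloc)
		return;

	copy = strdup(label);
	if (!copy)
		return;

	/* A full list ejects its last entry; otherwise the list grows. */
	if (top->nr == top->alloc)
		free(top->items[top->nr - 1].label);
	else
		top->nr++;

	/* Shift everything from 'pos' down by one slot. */
	memmove(&top->items[pos + 1], &top->items[pos],
		(top->nr - 1 - pos) * sizeof(*top->items));
	top->items[pos].label = copy;
	top->items[pos].size = size;
}
```

With a limit of 2, inserting sizes 10, 30, 20 and 5 in that order leaves
the entries with sizes 30 and 20; the entry with size 10 is ejected and the
one with size 5 never enters the list.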

diff --git a/builtin/survey.c b/builtin/survey.c
index 5ff62fa4ab921c..2dd1eedfda74f1 100644
--- a/builtin/survey.c
+++ b/builtin/survey.c
@@ -75,7 +75,6 @@ struct survey_report_object_size_summary {
 
 typedef int (*survey_top_cmp)(void *v1, void *v2);
 
-MAYBE_UNUSED
 static int cmp_by_nr(void *v1, void *v2)
 {
 	struct survey_report_object_size_summary *s1 = v1;
@@ -88,7 +87,6 @@ static int cmp_by_nr(void *v1, void *v2)
 	return 0;
 }
 
-MAYBE_UNUSED
 static int cmp_by_disk_size(void *v1, void *v2)
 {
 	struct survey_report_object_size_summary *s1 = v1;
@@ -101,7 +99,6 @@ static int cmp_by_disk_size(void *v1, void *v2)
 	return 0;
 }
 
-MAYBE_UNUSED
 static int cmp_by_inflated_size(void *v1, void *v2)
 {
 	struct survey_report_object_size_summary *s1 = v1;
@@ -132,7 +129,6 @@ struct survey_report_top_table {
 	void *data;
 };
 
-MAYBE_UNUSED
 static void init_top_sizes(struct survey_report_top_table *top,
 			   size_t limit, const char *name,
 			   survey_top_cmp cmp)
@@ -158,7 +154,6 @@ static void clear_top_sizes(struct survey_report_top_table *top)
 	free(sz_array);
 }
 
-MAYBE_UNUSED
 static void maybe_insert_into_top_size(struct survey_report_top_table *top,
 				       struct survey_report_object_size_summary *summary)
 {
@@ -195,6 +190,10 @@ struct survey_report {
 	struct survey_report_object_summary reachable_objects;
 
 	struct survey_report_object_size_summary *by_type;
+
+	struct survey_report_top_table *top_paths_by_count;
+	struct survey_report_top_table *top_paths_by_disk;
+	struct survey_report_top_table *top_paths_by_inflate;
 };
 
 #define REPORT_TYPE_COMMIT 0
@@ -446,6 +445,13 @@ static void survey_report_object_sizes(const char *title,
 	clear_table(&table);
 }
 
+static void survey_report_plaintext_sorted_size(
+		struct survey_report_top_table *top)
+{
+	survey_report_object_sizes(top->name,  _("Path"),
+				   top->data, top->nr);
+}
+
 static void survey_report_plaintext(struct survey_context *ctx)
 {
 	printf("GIT SURVEY for \"%s\"\n", ctx->repo->worktree);
@@ -456,6 +462,21 @@ static void survey_report_plaintext(struct survey_context *ctx)
 				   _("Object Type"),
 				   ctx->report.by_type,
 				   REPORT_TYPE_COUNT);
+
+	survey_report_plaintext_sorted_size(
+		&ctx->report.top_paths_by_count[REPORT_TYPE_TREE]);
+	survey_report_plaintext_sorted_size(
+		&ctx->report.top_paths_by_count[REPORT_TYPE_BLOB]);
+
+	survey_report_plaintext_sorted_size(
+		&ctx->report.top_paths_by_disk[REPORT_TYPE_TREE]);
+	survey_report_plaintext_sorted_size(
+		&ctx->report.top_paths_by_disk[REPORT_TYPE_BLOB]);
+
+	survey_report_plaintext_sorted_size(
+		&ctx->report.top_paths_by_inflate[REPORT_TYPE_TREE]);
+	survey_report_plaintext_sorted_size(
+		&ctx->report.top_paths_by_inflate[REPORT_TYPE_BLOB]);
 }
 
 /*
@@ -698,7 +719,8 @@ static void increment_totals(struct survey_context *ctx,
 
 static void increment_object_totals(struct survey_context *ctx,
 				    struct oid_array *oids,
-				    enum object_type type)
+				    enum object_type type,
+				    const char *path)
 {
 	struct survey_report_object_size_summary *total;
 	struct survey_report_object_size_summary summary = { 0 };
@@ -730,9 +752,30 @@ static void increment_object_totals(struct survey_context *ctx,
 	total->disk_size += summary.disk_size;
 	total->inflated_size += summary.inflated_size;
 	total->num_missing += summary.num_missing;
+
+	if (type == OBJ_TREE || type == OBJ_BLOB) {
+		int index = type == OBJ_TREE ?
+			    REPORT_TYPE_TREE : REPORT_TYPE_BLOB;
+		struct survey_report_top_table *top;
+
+		/*
+		 * Temporarily store (const char *) here, but it will
+		 * be duped if inserted and will not be freed.
+		 */
+		summary.label = (char *)path;
+
+		top = ctx->report.top_paths_by_count;
+		maybe_insert_into_top_size(&top[index], &summary);
+
+		top = ctx->report.top_paths_by_disk;
+		maybe_insert_into_top_size(&top[index], &summary);
+
+		top = ctx->report.top_paths_by_inflate;
+		maybe_insert_into_top_size(&top[index], &summary);
+	}
 }
 
-static int survey_objects_path_walk_fn(const char *path UNUSED,
+static int survey_objects_path_walk_fn(const char *path,
 				       struct oid_array *oids,
 				       enum object_type type,
 				       void *data)
@@ -741,7 +784,7 @@ static int survey_objects_path_walk_fn(const char *path UNUSED,
 
 	increment_object_counts(&ctx->report.reachable_objects,
 				type, oids->nr);
-	increment_object_totals(ctx, oids, type);
+	increment_object_totals(ctx, oids, type, path);
 
 	ctx->progress_nr += oids->nr;
 	display_progress(ctx->progress, ctx->progress_nr);
@@ -751,11 +794,31 @@ static int survey_objects_path_walk_fn(const char *path UNUSED,
 
 static void initialize_report(struct survey_context *ctx)
 {
+	const int top_limit = 100;
+
 	CALLOC_ARRAY(ctx->report.by_type, REPORT_TYPE_COUNT);
 	ctx->report.by_type[REPORT_TYPE_COMMIT].label = xstrdup(_("Commits"));
 	ctx->report.by_type[REPORT_TYPE_TREE].label = xstrdup(_("Trees"));
 	ctx->report.by_type[REPORT_TYPE_BLOB].label = xstrdup(_("Blobs"));
 	ctx->report.by_type[REPORT_TYPE_TAG].label = xstrdup(_("Tags"));
+
+	CALLOC_ARRAY(ctx->report.top_paths_by_count, REPORT_TYPE_COUNT);
+	init_top_sizes(&ctx->report.top_paths_by_count[REPORT_TYPE_TREE],
+		       top_limit, _("TOP DIRECTORIES BY COUNT"), cmp_by_nr);
+	init_top_sizes(&ctx->report.top_paths_by_count[REPORT_TYPE_BLOB],
+		       top_limit, _("TOP FILES BY COUNT"), cmp_by_nr);
+
+	CALLOC_ARRAY(ctx->report.top_paths_by_disk, REPORT_TYPE_COUNT);
+	init_top_sizes(&ctx->report.top_paths_by_disk[REPORT_TYPE_TREE],
+		       top_limit, _("TOP DIRECTORIES BY DISK SIZE"), cmp_by_disk_size);
+	init_top_sizes(&ctx->report.top_paths_by_disk[REPORT_TYPE_BLOB],
+		       top_limit, _("TOP FILES BY DISK SIZE"), cmp_by_disk_size);
+
+	CALLOC_ARRAY(ctx->report.top_paths_by_inflate, REPORT_TYPE_COUNT);
+	init_top_sizes(&ctx->report.top_paths_by_inflate[REPORT_TYPE_TREE],
+		       top_limit, _("TOP DIRECTORIES BY INFLATED SIZE"), cmp_by_inflated_size);
+	init_top_sizes(&ctx->report.top_paths_by_inflate[REPORT_TYPE_BLOB],
+		       top_limit, _("TOP FILES BY INFLATED SIZE"), cmp_by_inflated_size);
 }
 
 static void survey_phase_objects(struct survey_context *ctx)
diff --git a/t/t8100-git-survey.sh b/t/t8100-git-survey.sh
index 118410be55cc2a..1ba48cc47e1b35 100755
--- a/t/t8100-git-survey.sh
+++ b/t/t8100-git-survey.sh
@@ -92,7 +92,17 @@ test_expect_success 'git survey (default)' '
 	EOF
 
 	approximate_sizes out >out-edited &&
-	test_cmp expect out-edited
+	lines=$(wc -l <expect) &&
+	head -n "$lines" <out-edited >out-trimmed &&
+	test_cmp expect out-trimmed &&
+
+	for type in "DIRECTORIES" "FILES"
+	do
+		for metric in "COUNT" "DISK SIZE" "INFLATED SIZE"
+		do
+			grep "TOP $type BY $metric" out || return 1
+		done || return 1
+	done
 '
 
 test_done

From 11f07ee7a49ff04d2495525981334e55b5c64883 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Matthias=20A=C3=9Fhauer?= <mha1993@live.de>
Date: Sun, 22 Dec 2024 17:43:45 +0100
Subject: [PATCH 411/553] compat/mingw: drop outdated comment
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This comment has been true for the longest time; the combination of the
two preceding commits made it incorrect, so let's drop that comment.

Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 87ac187168ae91..abd477d5297ee3 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -2340,15 +2340,6 @@ int mingw_socket(int domain, int type, int protocol)
 	ensure_socket_initialization();
 	s = WSASocket(domain, type, protocol, NULL, 0, 0);
 	if (s == INVALID_SOCKET) {
-		/*
-		 * WSAGetLastError() values are regular BSD error codes
-		 * biased by WSABASEERR.
-		 * However, strerror() does not know about networking
-		 * specific errors, which are values beginning at 38 or so.
-		 * Therefore, we choose to leave the biased error code
-		 * in errno so that _if_ someone looks up the code somewhere,
-		 * then it is at least the number that are usually listed.
-		 */
 		set_wsa_errno();
 		return -1;
 	}

From 1945122e10be5e2243e29f7d3f090cfcb0488bea Mon Sep 17 00:00:00 2001
From: Derrick Stolee <stolee@gmail.com>
Date: Mon, 23 Sep 2024 15:38:25 -0400
Subject: [PATCH 412/553] survey: add --top=<N> option and config

The 'git survey' builtin provides several detail tables, such as "top
files by on-disk size". The size of these tables defaults to 10,
currently.

Allow the user to specify this number via a new --top=<N> option or the
new survey.top config key.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config/survey.adoc |  3 +++
 builtin/survey.c                 | 22 ++++++++++++++--------
 2 files changed, 17 insertions(+), 8 deletions(-)
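
Note (not part of the commit message): `survey.top` is documented as an
integer, so its value has to go through an integer parser; a boolean parser
would collapse any nonzero number to 1. The following is a minimal sketch of
such parsing, assuming plain strtol() rather than Git's own config helpers
(so unit suffixes such as "k" are rejected here). The name `parse_top_value`
is hypothetical.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a non-negative table size.
 * Returns 0 and stores the result in *out, or returns -1 on bad input
 * and leaves *out untouched.
 */
static int parse_top_value(const char *value, int *out)
{
	char *end;
	long n;

	if (!value || !*value)
		return -1;

	errno = 0;
	n = strtol(value, &end, 10);
	if (errno || *end || n < 0 || n > INT_MAX)
		return -1;

	*out = (int)n;
	return 0;
}
```

A caller would keep the built-in default (10 in this patch) whenever the
helper returns -1.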

diff --git a/Documentation/config/survey.adoc b/Documentation/config/survey.adoc
index c1b0f852a1250e..9e594a2092f225 100644
--- a/Documentation/config/survey.adoc
+++ b/Documentation/config/survey.adoc
@@ -8,4 +8,7 @@ survey.*::
 		This boolean value implies the `--[no-]verbose` option.
 	progress::
 		This boolean value implies the `--[no-]progress` option.
+	top::
+		This integer value implies `--top=<N>`, specifying the
+		number of entries in the detail tables.
 --
diff --git a/builtin/survey.c b/builtin/survey.c
index 2dd1eedfda74f1..c1d78222146628 100644
--- a/builtin/survey.c
+++ b/builtin/survey.c
@@ -40,6 +40,7 @@ static struct survey_refs_wanted default_ref_options = {
 struct survey_opts {
 	int verbose;
 	int show_progress;
+	int top_nr;
 	struct survey_refs_wanted refs;
 };
 
@@ -548,6 +549,10 @@ static int survey_load_config_cb(const char *var, const char *value,
 		ctx->opts.show_progress = git_config_bool(var, value);
 		return 0;
 	}
+	if (!strcmp(var, "survey.top")) {
+		ctx->opts.top_nr = git_config_int(var, value, cctx->kvi);
+		return 0;
+	}
 
 	return git_default_config(var, value, cctx, pvoid);
 }
@@ -794,8 +799,6 @@ static int survey_objects_path_walk_fn(const char *path,
 
 static void initialize_report(struct survey_context *ctx)
 {
-	const int top_limit = 100;
-
 	CALLOC_ARRAY(ctx->report.by_type, REPORT_TYPE_COUNT);
 	ctx->report.by_type[REPORT_TYPE_COMMIT].label = xstrdup(_("Commits"));
 	ctx->report.by_type[REPORT_TYPE_TREE].label = xstrdup(_("Trees"));
@@ -804,21 +807,21 @@ static void initialize_report(struct survey_context *ctx)
 
 	CALLOC_ARRAY(ctx->report.top_paths_by_count, REPORT_TYPE_COUNT);
 	init_top_sizes(&ctx->report.top_paths_by_count[REPORT_TYPE_TREE],
-		       top_limit, _("TOP DIRECTORIES BY COUNT"), cmp_by_nr);
+		       ctx->opts.top_nr, _("TOP DIRECTORIES BY COUNT"), cmp_by_nr);
 	init_top_sizes(&ctx->report.top_paths_by_count[REPORT_TYPE_BLOB],
-		       top_limit, _("TOP FILES BY COUNT"), cmp_by_nr);
+		       ctx->opts.top_nr, _("TOP FILES BY COUNT"), cmp_by_nr);
 
 	CALLOC_ARRAY(ctx->report.top_paths_by_disk, REPORT_TYPE_COUNT);
 	init_top_sizes(&ctx->report.top_paths_by_disk[REPORT_TYPE_TREE],
-		       top_limit, _("TOP DIRECTORIES BY DISK SIZE"), cmp_by_disk_size);
+		       ctx->opts.top_nr, _("TOP DIRECTORIES BY DISK SIZE"), cmp_by_disk_size);
 	init_top_sizes(&ctx->report.top_paths_by_disk[REPORT_TYPE_BLOB],
-		       top_limit, _("TOP FILES BY DISK SIZE"), cmp_by_disk_size);
+		       ctx->opts.top_nr, _("TOP FILES BY DISK SIZE"), cmp_by_disk_size);
 
 	CALLOC_ARRAY(ctx->report.top_paths_by_inflate, REPORT_TYPE_COUNT);
 	init_top_sizes(&ctx->report.top_paths_by_inflate[REPORT_TYPE_TREE],
-		       top_limit, _("TOP DIRECTORIES BY INFLATED SIZE"), cmp_by_inflated_size);
+		       ctx->opts.top_nr, _("TOP DIRECTORIES BY INFLATED SIZE"), cmp_by_inflated_size);
 	init_top_sizes(&ctx->report.top_paths_by_inflate[REPORT_TYPE_BLOB],
-		       top_limit, _("TOP FILES BY INFLATED SIZE"), cmp_by_inflated_size);
+		       ctx->opts.top_nr, _("TOP FILES BY INFLATED SIZE"), cmp_by_inflated_size);
 }
 
 static void survey_phase_objects(struct survey_context *ctx)
@@ -869,6 +872,7 @@ int cmd_survey(int argc, const char **argv, const char *prefix, struct repositor
 		.opts = {
 			.verbose = 0,
 			.show_progress = -1, /* defaults to isatty(2) */
+			.top_nr = 10,
 
 			.refs.want_all_refs = -1,
 
@@ -884,6 +888,8 @@ int cmd_survey(int argc, const char **argv, const char *prefix, struct repositor
 	static struct option survey_options[] = {
 		OPT__VERBOSE(&ctx.opts.verbose, N_("verbose output")),
 		OPT_BOOL(0, "progress", &ctx.opts.show_progress, N_("show progress")),
+		OPT_INTEGER('n', "top", &ctx.opts.top_nr,
+			    N_("number of entries to include in detail tables")),
 
 		OPT_BOOL_F(0, "all-refs", &ctx.opts.refs.want_all_refs, N_("include all refs"),          PARSE_OPT_NONEG),
 

From 3ee518d3bed51e5317156e6e71b13be9e594109a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Matthias=20A=C3=9Fhauer?= <mha1993@live.de>
Date: Sun, 29 Dec 2024 11:48:34 +0100
Subject: [PATCH 413/553] t0301: actually test credential-cache on Windows
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Commit 2406bf5 (Win32: detect unix socket support at runtime,
2024-04-03) introduced a runtime detection for whether the operating
system supports unix sockets for Windows, but a mistake snuck into the
tests. When building and testing Git without NO_UNIX_SOCKETS we
currently skip t0301-credential-cache on Windows if unix sockets are
supported and run the tests if they aren't.

Flip that logic to actually work the way it was intended.

Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t0301-credential-cache.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t0301-credential-cache.sh b/t/t0301-credential-cache.sh
index 6f7cfd9e33f633..a14032626192d0 100755
--- a/t/t0301-credential-cache.sh
+++ b/t/t0301-credential-cache.sh
@@ -12,7 +12,7 @@ test -z "$NO_UNIX_SOCKETS" || {
 if test_have_prereq MINGW
 then
 	service_running=$(sc query afunix | grep "4  RUNNING")
-	test -z "$service_running" || {
+	test -n "$service_running" || {
 		skip_all='skipping credential-cache tests, unix sockets not available'
 		test_done
 	}

From 851712a152efd03bb762417eebd51b86b12119ae Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 1 Jul 2024 23:28:45 +0200
Subject: [PATCH 414/553] survey: clearly note the experimental nature in the
 output

While this command is definitely something we _want_, chances are that
upstreaming this will require substantial changes.

We still want to be able to experiment with this before that, to focus
on what we need out of this command: to assist with diagnosing issues
with large repositories, as well as to help monitor the growth and
the associated pain points of such repositories.

To that end, we are about to integrate this command into
`microsoft/git`, to get the tool into the hands of users who need it
most, with the idea to iterate in close collaboration between these
users and the developers familiar with Git's internals.

However, we will definitely want to avoid letting anybody have the
impression that this command, its exact inner workings, as well as its
output format, are anywhere close to stable. To make that fact utterly
clear (and thereby protect the freedom to iterate and innovate freely
before upstreaming the command), let's mark its output as experimental
in all-caps, as the first thing we do.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/survey.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/builtin/survey.c b/builtin/survey.c
index c1d78222146628..f40905fb2fd57a 100644
--- a/builtin/survey.c
+++ b/builtin/survey.c
@@ -17,6 +17,7 @@
 #include "strvec.h"
 #include "tag.h"
 #include "trace2.h"
+#include "color.h"
 
 static const char * const survey_usage[] = {
 	N_("(EXPERIMENTAL!) git survey <options>"),
@@ -905,6 +906,11 @@ int cmd_survey(int argc, const char **argv, const char *prefix, struct repositor
 	show_usage_with_options_if_asked(argc, argv,
 					 survey_usage, survey_options);
 
+	if (isatty(2))
+		color_fprintf_ln(stderr,
+				 want_color_fd(2, GIT_COLOR_AUTO) ? GIT_COLOR_YELLOW : "",
+				 "(THIS IS EXPERIMENTAL, EXPECT THE OUTPUT FORMAT TO CHANGE!)");
+
 	ctx.repo = repo;
 
 	prepare_repo_settings(ctx.repo);

From 58a61cda613a2985e2f74628570f5e6ea895e97e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Matthias=20A=C3=9Fhauer?= <mha1993@live.de>
Date: Sun, 22 Dec 2024 17:24:24 +0100
Subject: [PATCH 415/553] credential-cache: handle ECONNREFUSED gracefully
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

In 245670c (credential-cache: check for windows specific errors, 2021-09-14)
we concluded that on Windows we would always encounter ENETDOWN where we
would expect ECONNREFUSED on POSIX systems, when connecting to unix sockets.
As reported in [1], we do encounter ECONNREFUSED on Windows if the
socket file doesn't exist but the containing directory does, and ENETDOWN if
neither exists. We should handle this case like we do on non-Windows systems.

[1] https://github.com/git-for-windows/git/pull/4762#issuecomment-2545498245

This fixes https://github.com/git-for-windows/git/issues/5314

Helped-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/credential-cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
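
Note (not part of the commit message): the classification after this change
can be sketched as a stand-alone function. This is only an illustration,
assuming that a non-fatal result means "no daemon is listening yet" and
that the caller may then start one; the name
`sketch_connection_fatally_broken` is hypothetical.

```c
#include <errno.h>

/*
 * Return 0 for errors that merely indicate that nothing is listening
 * on the socket, and 1 for every other error.
 */
static int sketch_connection_fatally_broken(int error)
{
	switch (error) {
	case ENOENT:       /* socket file does not exist */
	case ENETDOWN:     /* Windows: neither socket file nor directory exists */
	case ECONNREFUSED: /* Windows: directory exists, socket file does not */
		return 0;
	default:
		return 1;
	}
}
```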

diff --git a/builtin/credential-cache.c b/builtin/credential-cache.c
index 7f733cb756e03c..3b8130d3d64f9c 100644
--- a/builtin/credential-cache.c
+++ b/builtin/credential-cache.c
@@ -23,7 +23,7 @@ static int connection_closed(int error)
 
 static int connection_fatally_broken(int error)
 {
-	return (error != ENOENT) && (error != ENETDOWN);
+	return (error != ENOENT) && (error != ENETDOWN) && (error != ECONNREFUSED);
 }
 
 #else

From 170494ea2d4eebfe5c911942d0bfeeee47c77b06 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Thu, 6 Mar 2025 14:05:03 +0100
Subject: [PATCH 416/553] reftable: do make sure to use custom allocators

The reftable library goes out of its way to use its own set of allocator
functions that can be configured using `reftable_set_alloc()`. However,
Git does not configure this.

That is not typically a problem, except when Git uses a custom allocator
via some definitions in `git-compat-util.h`, as is the case in Git for
Windows (which switched away from the long-unmaintained nedmalloc to
mimalloc).

Then, it is quite possible that Git assigns a `strbuf` (allocated via
the custom allocator) to, say, the `refname` field of a
`reftable_log_record` in `write_transaction_table()`, and later on asks
the reftable library function `reftable_log_record_release()` to release
it, but that function was compiled without using `git-compat-util.h` and
hence calls regular `free()` (i.e. _not_ the custom allocator's own
function).

This has been a problem for a long time, and it was a matter of some sort
of "luck" that 1) reftables are not commonly used on Windows, and 2)
mimalloc can often gracefully ignore requests to release memory that it
has not allocated.

However, a recent update to `seen` brought this problem to the
forefront, letting t1460 fail in Git for Windows, with symptoms much
like those of the problem I had to address in d02c37c3e6ba
(t-reftable-basics: allow for `malloc` to be `#define`d, 2025-01-08)
where exit code 127 was also produced in lieu of
`STATUS_HEAP_CORRUPTION` (C0000374) because exit codes are only 7 bits
wide.

It was not possible to figure out what change in particular caused these
new failures within a reasonable time frame, as there are too many
changes in `seen` that conflict with Git for Windows' patches; I had to
stop the investigation after spending four fruitless hours on it.

To verify that this patch fixes the issue, I avoided using mimalloc and
temporarily patched in a "custom allocator" that would more reliably
point out problems, like this:

  diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
  index 68f38291f84c..9421d630b9f5 100644
  --- a/refs/reftable-backend.c
  +++ b/refs/reftable-backend.c
  @@ -353,6 +353,69 @@ static int reftable_be_fsync(int fd)
   	return fsync_component(FSYNC_COMPONENT_REFERENCE, fd);
   }

  +#define DEBUG_REFTABLE_ALLOC
  +#ifdef DEBUG_REFTABLE_ALLOC
  +#include "khash.h"
  +
  +static inline khint_t __ac_X31_hash_ptr(void *ptr)
  +{
  +	union {
  +		void *ptr;
  +		char s[sizeof(void *)];
  +	} u;
  +	size_t i;
  +	khint_t h;
  +
  +	u.ptr = ptr;
  +	h = (khint_t)*u.s;
  +	for (i = 0; i < sizeof(void *); i++)
  +		h = (h << 5) - h + (khint_t)u.s[i];
  +	return h;
  +}
  +
  +#define kh_ptr_hash_func(key) __ac_X31_hash_ptr(key)
  +#define kh_ptr_hash_equal(a, b) ((a) == (b))
  +
  +KHASH_INIT(ptr, void *, int, 0, kh_ptr_hash_func, kh_ptr_hash_equal)
  +
  +static kh_ptr_t *my_malloced;
  +
  +static void *my_malloc(size_t sz)
  +{
  +	int dummy;
  +	void *ptr = malloc(sz);
  +	if (ptr)
  +		kh_put_ptr(my_malloced, ptr, &dummy);
  +	return ptr;
  +}
  +
  +static void *my_realloc(void *ptr, size_t sz)
  +{
  +	int dummy;
  +	if (ptr) {
  +		khiter_t pos = kh_get_ptr(my_malloced, ptr);
  +		if (pos >= kh_end(my_malloced))
  +			die("Was not my_malloc()ed: %p", ptr);
  +		kh_del_ptr(my_malloced, pos);
  +	}
  +	ptr = realloc(ptr, sz);
  +	if (ptr)
  +		kh_put_ptr(my_malloced, ptr, &dummy);
  +	return ptr;
  +}
  +
  +static void my_free(void *ptr)
  +{
  +	if (ptr) {
  +		khiter_t pos = kh_get_ptr(my_malloced, ptr);
  +		if (pos >= kh_end(my_malloced))
  +			die("Was not my_malloc()ed: %p", ptr);
  +		kh_del_ptr(my_malloced, pos);
  +	}
  +	free(ptr);
  +}
  +#endif
  +
   static struct ref_store *reftable_be_init(struct repository *repo,
   					  const char *gitdir,
   					  unsigned int store_flags)
  @@ -362,6 +425,11 @@ static struct ref_store *reftable_be_init(struct repository *repo,
   	int is_worktree;
   	mode_t mask;

  +#ifdef DEBUG_REFTABLE_ALLOC
  +	my_malloced = kh_init_ptr();
  +	reftable_set_alloc(my_malloc, my_realloc, my_free);
  +#endif
  +
   	mask = umask(0);
   	umask(mask);

I briefly considered contributing this "custom allocator" patch, too,
but it is unwieldy (for example, it would not work at all when compiling
with mimalloc support) and it would only waste space (or even time, if a
compile flag was introduced and exercised as part of the CI builds).
Given that it is highly unlikely that Git will lose the new
`reftable_set_alloc()` call by mistake, I rejected that idea as simply
too wasteful.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 refs/reftable-backend.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 4319a4eacbafc4..31c9cd1ebf5ce6 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -366,6 +366,7 @@ static struct ref_store *reftable_be_init(struct repository *repo,
 	mask = umask(0);
 	umask(mask);
 
+	reftable_set_alloc(malloc, realloc, free);
 	base_ref_store_init(&refs->base, repo, gitdir, &refs_be_reftable);
 	strmap_init(&refs->worktree_backends);
 	refs->store_flags = store_flags;

From 10a82849e4bb11f567e597e79fb911d95dce88cc Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Tue, 3 Jun 2025 12:45:39 +0200
Subject: [PATCH 417/553] check-whitespace: avoid alerts about upstream commits

Every once in a while, whitespace errors are introduced in Git for
Windows' rebases to newer Git versions, simply by virtue of integrating
upstream commits that do not follow upstream Git's own whitespace rule.
In Git v2.50.0-rc0, for example, 03f2915541a4 (xdiff: disable
cleanup_records heuristic with --minimal, 2025-04-29) introduced a
trailing space.

Arguably, non-actionable alerts are worse than no alerts at all, so
let's suppress those alerts that we cannot do anything about, anyway.
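
As a rough illustration of the idea (a sketch only: the sample line is
made up to resemble what `git log --check --pretty=format:"---% ce% h% s"`
prints for a commit, and `verdict` is a hypothetical variable that does
not exist in the script), the committer email becomes the second field
of each commit header line and can be compared before anything is
reported:

```shell
# Made-up header line in the shape "--- <committer email> <sha> <subject>".
line='--- gitster@pobox.com 03f2915541a4 xdiff: disable cleanup_records heuristic with --minimal'

# Split the line the same way the loop in the script does.
read dash email sha etc <<EOF
$line
EOF

# Commits committed by the upstream maintainer are not actionable here.
if test gitster@pobox.com = "$email"
then
	verdict=skip
else
	verdict=report
fi
echo "$verdict $sha"
```

With a different committer email in the sample line, the same commit
would be reported as before.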

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 ci/check-whitespace.sh | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/ci/check-whitespace.sh b/ci/check-whitespace.sh
index c40804394cb079..e590ac0dfd765e 100755
--- a/ci/check-whitespace.sh
+++ b/ci/check-whitespace.sh
@@ -19,6 +19,7 @@ problems=()
 commit=
 commitText=
 commitTextmd=
+committerEmail=
 goodParent=
 
 if ! git rev-parse --quiet --verify "${baseCommit}"
@@ -27,7 +28,7 @@ then
     exit 1
 fi
 
-while read dash sha etc
+while read dash email sha etc
 do
 	case "${dash}" in
 	"---") # Line contains commit information.
@@ -40,10 +41,14 @@ do
 		commit="${sha}"
 		commitText="${sha} ${etc}"
 		commitTextmd="[${sha}](${url}/commit/${sha}) ${etc}"
+		committerEmail="${email}"
 		;;
 	"")
 		;;
 	*) # Line contains whitespace error information for current commit.
+		# Quod licet Iovi non licet bovi
+		test gitster@pobox.com != "$committerEmail" || break
+
 		if test -n "${goodParent}"
 		then
 			problems+=("1) --- ${commitTextmd}")
@@ -64,7 +69,7 @@ do
 		echo "${dash} ${sha} ${etc}"
 		;;
 	esac
-done <<< "$(git log --check --pretty=format:"---% h% s" "${baseCommit}"..)"
+done <<< "$(git log --check --pretty=format:"---% ce% h% s" "${baseCommit}"..)"
 
 if test ${#problems[*]} -gt 0
 then

From 991c5bb244bcc7d678dbdf2dd0a813661ed49da1 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Matthias=20A=C3=9Fhauer?= <mha1993@live.de>
Date: Sat, 10 Jan 2026 10:15:30 +0100
Subject: [PATCH 418/553] Import the source code of mimalloc v2.2.6
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Update to newer mimalloc versions like this:

  update_mimalloc ()
  {
      test $# = 1 || {
          echo "Need a mimalloc version" 1>&2;
          return 1
      };
      for oneline in 'mimalloc: adjust for building inside Git' 'Import the source code of mimalloc';
      do
          git revert -n HEAD^{/^"$oneline"} && git checkout HEAD -- Makefile && git commit -sm "Temporarily revert \"$oneline\"" -m 'In preparation for upgrading to a newer mimalloc version.' || return 1;
      done;
      for file in $(git show --format='%n' --name-only --diff-filter=A HEAD^{/^"Import the source code of mimalloc "}) compat/mimalloc/arena-abandon.c compat/mimalloc/free.c compat/mimalloc/libc.c compat/mimalloc/prim/prim.c compat/mimalloc/mimalloc-stats.h;
      do
          file2=${file#compat/mimalloc/};
          case "$file2" in
              segment-cache.c)
                  : no longer needed;
                  continue
              ;;
              bitmap.h | *.c)
                  file2=src/$file2
              ;;
              *.h)
                  file2=include/$file2
              ;;
          esac;
          mkdir -p "${file%/*}" && git -C /usr/src/mimalloc/ show "$1":$file2 > "$file" && git add "$file" || {
              echo "Failed: $file2 -> $file" 1>&2;
              return 1
          };
      done;
      conv_sed='sed -n "/^ *eval/d;/      /p"' && git commit -sm "Import the source code of mimalloc $1" -m "Update to newer mimalloc versions like this:" -m "$(set | sed -n '/^update_mimalloc *() *$/,/^}/{s/^./  &/;p}')" -m '  update_mimalloc $MIMALLOC_VERSION' -m 'For convenience, you can set `MIMALLOC_VERSION` and then run:' -m '  eval "$(git show -s <this-commit> | '"$conv_sed"')"' || return 1;
      git cherry-pick HEAD^{/^'mimalloc: adjust for building inside Git'} || return 1
  }

  update_mimalloc $MIMALLOC_VERSION

For convenience, you can set `MIMALLOC_VERSION` and then run:

  eval "$(git show -s <this-commit> | sed -n "/^ *eval/d;/      /p")"

Signed-off-by: Matthias Aßhauer <mha1993@live.de>
---
 compat/mimalloc/LICENSE             |   21 +
 compat/mimalloc/alloc-aligned.c     |  371 ++++++
 compat/mimalloc/alloc.c             |  735 ++++++++++++
 compat/mimalloc/arena-abandon.c     |  346 ++++++
 compat/mimalloc/arena.c             | 1045 ++++++++++++++++
 compat/mimalloc/bitmap.c            |  441 +++++++
 compat/mimalloc/bitmap.h            |  119 ++
 compat/mimalloc/free.c              |  588 +++++++++
 compat/mimalloc/heap.c              |  737 ++++++++++++
 compat/mimalloc/init.c              |  715 +++++++++++
 compat/mimalloc/libc.c              |  334 ++++++
 compat/mimalloc/mimalloc-stats.h    |  104 ++
 compat/mimalloc/mimalloc.h          |  629 ++++++++++
 compat/mimalloc/mimalloc/atomic.h   |  557 +++++++++
 compat/mimalloc/mimalloc/internal.h | 1153 ++++++++++++++++++
 compat/mimalloc/mimalloc/prim.h     |  421 +++++++
 compat/mimalloc/mimalloc/track.h    |  145 +++
 compat/mimalloc/mimalloc/types.h    |  686 +++++++++++
 compat/mimalloc/options.c           |  670 +++++++++++
 compat/mimalloc/os.c                |  770 ++++++++++++
 compat/mimalloc/page-queue.c        |  397 +++++++
 compat/mimalloc/page.c              | 1050 +++++++++++++++++
 compat/mimalloc/prim/osx/prim.c     |    9 +
 compat/mimalloc/prim/prim.c         |   76 ++
 compat/mimalloc/prim/unix/prim.c    |  962 +++++++++++++++
 compat/mimalloc/prim/windows/prim.c |  879 ++++++++++++++
 compat/mimalloc/random.c            |  258 ++++
 compat/mimalloc/segment-map.c       |  142 +++
 compat/mimalloc/segment.c           | 1702 +++++++++++++++++++++++++++
 compat/mimalloc/stats.c             |  633 ++++++++++
 30 files changed, 16695 insertions(+)
 create mode 100644 compat/mimalloc/LICENSE
 create mode 100644 compat/mimalloc/alloc-aligned.c
 create mode 100644 compat/mimalloc/alloc.c
 create mode 100644 compat/mimalloc/arena-abandon.c
 create mode 100644 compat/mimalloc/arena.c
 create mode 100644 compat/mimalloc/bitmap.c
 create mode 100644 compat/mimalloc/bitmap.h
 create mode 100644 compat/mimalloc/free.c
 create mode 100644 compat/mimalloc/heap.c
 create mode 100644 compat/mimalloc/init.c
 create mode 100644 compat/mimalloc/libc.c
 create mode 100644 compat/mimalloc/mimalloc-stats.h
 create mode 100644 compat/mimalloc/mimalloc.h
 create mode 100644 compat/mimalloc/mimalloc/atomic.h
 create mode 100644 compat/mimalloc/mimalloc/internal.h
 create mode 100644 compat/mimalloc/mimalloc/prim.h
 create mode 100644 compat/mimalloc/mimalloc/track.h
 create mode 100644 compat/mimalloc/mimalloc/types.h
 create mode 100644 compat/mimalloc/options.c
 create mode 100644 compat/mimalloc/os.c
 create mode 100644 compat/mimalloc/page-queue.c
 create mode 100644 compat/mimalloc/page.c
 create mode 100644 compat/mimalloc/prim/osx/prim.c
 create mode 100644 compat/mimalloc/prim/prim.c
 create mode 100644 compat/mimalloc/prim/unix/prim.c
 create mode 100644 compat/mimalloc/prim/windows/prim.c
 create mode 100644 compat/mimalloc/random.c
 create mode 100644 compat/mimalloc/segment-map.c
 create mode 100644 compat/mimalloc/segment.c
 create mode 100644 compat/mimalloc/stats.c

diff --git a/compat/mimalloc/LICENSE b/compat/mimalloc/LICENSE
new file mode 100644
index 00000000000000..53315ebee557ac
--- /dev/null
+++ b/compat/mimalloc/LICENSE
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2018-2025 Microsoft Corporation, Daan Leijen
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/compat/mimalloc/alloc-aligned.c b/compat/mimalloc/alloc-aligned.c
new file mode 100644
index 00000000000000..772b76c2027944
--- /dev/null
+++ b/compat/mimalloc/alloc-aligned.c
@@ -0,0 +1,371 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2021, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+
+#include "mimalloc.h"
+#include "mimalloc/internal.h"
+#include "mimalloc/prim.h"  // mi_prim_get_default_heap
+
+#include <string.h>     // memset
+
+// ------------------------------------------------------
+// Aligned Allocation
+// ------------------------------------------------------
+
+static bool mi_malloc_is_naturally_aligned( size_t size, size_t alignment ) {
+  // objects up to `MI_MAX_ALIGN_GUARANTEE` are allocated aligned to their size (see `segment.c:_mi_segment_page_start`).
+  mi_assert_internal(_mi_is_power_of_two(alignment) && (alignment > 0));
+  if (alignment > size) return false;
+  if (alignment <= MI_MAX_ALIGN_SIZE) return true;
+  const size_t bsize = mi_good_size(size);
+  return (bsize <= MI_MAX_ALIGN_GUARANTEE && (bsize & (alignment-1)) == 0);
+}
+
+#if MI_GUARDED
+static mi_decl_restrict void* mi_heap_malloc_guarded_aligned(mi_heap_t* heap, size_t size, size_t alignment, bool zero) mi_attr_noexcept {
+  // use over allocation for guarded blocksl
+  mi_assert_internal(alignment > 0 && alignment < MI_BLOCK_ALIGNMENT_MAX);
+  const size_t oversize = size + alignment - 1;
+  void* base = _mi_heap_malloc_guarded(heap, oversize, zero);
+  void* p = mi_align_up_ptr(base, alignment);
+  mi_track_align(base, p, (uint8_t*)p - (uint8_t*)base, size);
+  mi_assert_internal(mi_usable_size(p) >= size);
+  mi_assert_internal(_mi_is_aligned(p, alignment));
+  return p;
+}
+
+static void* mi_heap_malloc_zero_no_guarded(mi_heap_t* heap, size_t size, bool zero, size_t* usable) {
+  const size_t rate = heap->guarded_sample_rate;
+  // only write if `rate!=0` so we don't write to the constant `_mi_heap_empty`
+  if (rate != 0) { heap->guarded_sample_rate = 0; }
+  void* p = _mi_heap_malloc_zero_ex(heap, size, zero, 0, usable);
+  if (rate != 0) { heap->guarded_sample_rate = rate; }
+  return p;
+}
+#else
+static void* mi_heap_malloc_zero_no_guarded(mi_heap_t* heap, size_t size, bool zero, size_t* usable) {
+  return _mi_heap_malloc_zero_ex(heap, size, zero, 0, usable);
+}
+#endif
+
+// Fallback aligned allocation that over-allocates -- split out for better codegen
+static mi_decl_noinline void* mi_heap_malloc_zero_aligned_at_overalloc(mi_heap_t* const heap, const size_t size, const size_t alignment, const size_t offset, const bool zero, size_t* usable) mi_attr_noexcept
+{
+  mi_assert_internal(size <= (MI_MAX_ALLOC_SIZE - MI_PADDING_SIZE));
+  mi_assert_internal(alignment != 0 && _mi_is_power_of_two(alignment));
+
+  void* p;
+  size_t oversize;
+  if mi_unlikely(alignment > MI_BLOCK_ALIGNMENT_MAX) {
+    // use OS allocation for very large alignment and allocate inside a huge page (dedicated segment with 1 page)
+    // This can support alignments >= MI_SEGMENT_SIZE by ensuring the object can be aligned at a point in the
+    // first (and single) page such that the segment info is `MI_SEGMENT_SIZE` bytes before it (so it can be found by aligning the pointer down)
+    if mi_unlikely(offset != 0) {
+      // todo: cannot support offset alignment for very large alignments yet
+#if MI_DEBUG > 0
+      _mi_error_message(EOVERFLOW, "aligned allocation with a very large alignment cannot be used with an alignment offset (size %zu, alignment %zu, offset %zu)\n", size, alignment, offset);
+#endif
+      return NULL;
+    }
+    oversize = (size <= MI_SMALL_SIZE_MAX ? MI_SMALL_SIZE_MAX + 1 /* ensure we use generic malloc path */ : size);
+    // note: no guarded as alignment > 0
+    p = _mi_heap_malloc_zero_ex(heap, oversize, false, alignment, usable); // the page block size should be large enough to align in the single huge page block
+    // zero afterwards as only the area from the aligned_p may be committed!
+    if (p == NULL) return NULL;
+  }
+  else {
+    // otherwise over-allocate
+    oversize = (size < MI_MAX_ALIGN_SIZE ? MI_MAX_ALIGN_SIZE : size) + alignment - 1;  // adjust for size <= 16; with size 0 and aligment 64k, we would allocate a 64k block and pointing just beyond that.
+    p = mi_heap_malloc_zero_no_guarded(heap, oversize, zero, usable);
+    if (p == NULL) return NULL;
+  }
+  mi_page_t* page = _mi_ptr_page(p);
+
+  // .. and align within the allocation
+  const uintptr_t align_mask = alignment - 1;  // for any x, `(x & align_mask) == (x % alignment)`
+  const uintptr_t poffset = ((uintptr_t)p + offset) & align_mask;
+  const uintptr_t adjust  = (poffset == 0 ? 0 : alignment - poffset);
+  mi_assert_internal(adjust < alignment);
+  void* aligned_p = (void*)((uintptr_t)p + adjust);
+  if (aligned_p != p) {
+    mi_page_set_has_aligned(page, true);
+    #if MI_GUARDED
+    // set tag to aligned so mi_usable_size works with guard pages
+    if (adjust >= sizeof(mi_block_t)) {
+      mi_block_t* const block = (mi_block_t*)p;
+      block->next = MI_BLOCK_TAG_ALIGNED;
+    }
+    #endif
+    _mi_padding_shrink(page, (mi_block_t*)p, adjust + size);
+  }
+  // todo: expand padding if overallocated ?
+
+  mi_assert_internal(mi_page_usable_block_size(page) >= adjust + size);
+  mi_assert_internal(((uintptr_t)aligned_p + offset) % alignment == 0);
+  mi_assert_internal(mi_usable_size(aligned_p)>=size);
+  mi_assert_internal(mi_usable_size(p) == mi_usable_size(aligned_p)+adjust);
+  #if MI_DEBUG > 1
+  mi_page_t* const apage = _mi_ptr_page(aligned_p);
+  void* unalign_p = _mi_page_ptr_unalign(apage, aligned_p);
+  mi_assert_internal(p == unalign_p);
+  #endif
+
+  // now zero the block if needed
+  if (alignment > MI_BLOCK_ALIGNMENT_MAX) {
+    // for the tracker, on huge aligned allocations only the memory from the start of the large block is defined
+    mi_track_mem_undefined(aligned_p, size);
+    if (zero) {
+      _mi_memzero_aligned(aligned_p, mi_usable_size(aligned_p));
+    }
+  }
+
+  if (p != aligned_p) {
+    mi_track_align(p,aligned_p,adjust,mi_usable_size(aligned_p));
+    #if MI_GUARDED
+    mi_track_mem_defined(p, sizeof(mi_block_t));
+    #endif
+  }
+  return aligned_p;
+}
+
+// Generic primitive aligned allocation -- split out for better codegen
+static mi_decl_noinline void* mi_heap_malloc_zero_aligned_at_generic(mi_heap_t* const heap, const size_t size, const size_t alignment, const size_t offset, const bool zero, size_t* usable) mi_attr_noexcept
+{
+  mi_assert_internal(alignment != 0 && _mi_is_power_of_two(alignment));
+  // we don't allocate more than MI_MAX_ALLOC_SIZE (see <https://sourceware.org/ml/libc-announce/2019/msg00001.html>)
+  if mi_unlikely(size > (MI_MAX_ALLOC_SIZE - MI_PADDING_SIZE)) {
+    #if MI_DEBUG > 0
+    _mi_error_message(EOVERFLOW, "aligned allocation request is too large (size %zu, alignment %zu)\n", size, alignment);
+    #endif
+    return NULL;
+  }
+
+  // use regular allocation if it is guaranteed to fit the alignment constraints.
+  // this is important to try as the fast path in `mi_heap_malloc_zero_aligned` only works when there exist
+  // a page with the right block size, and if we always use the over-alloc fallback that would never happen.
+  if (offset == 0 && mi_malloc_is_naturally_aligned(size,alignment)) {
+    void* p = mi_heap_malloc_zero_no_guarded(heap, size, zero, usable);
+    mi_assert_internal(p == NULL || ((uintptr_t)p % alignment) == 0);
+    const bool is_aligned_or_null = (((uintptr_t)p) & (alignment-1))==0;
+    if mi_likely(is_aligned_or_null) {
+      return p;
+    }
+    else {
+      // this should never happen if the `mi_malloc_is_naturally_aligned` check is correct..
+      mi_assert(false);
+      mi_free(p);
+    }
+  }
+
+  // fall back to over-allocation
+  return mi_heap_malloc_zero_aligned_at_overalloc(heap,size,alignment,offset,zero,usable);
+}
+
+
+// Primitive aligned allocation
+static void* mi_heap_malloc_zero_aligned_at(mi_heap_t* const heap, const size_t size, 
+                                            const size_t alignment, const size_t offset, const bool zero,
+                                            size_t* usable) mi_attr_noexcept
+{
+  // note: we don't require `size > offset`, we just guarantee that the address at offset is aligned regardless of the allocated size.
+  if mi_unlikely(alignment == 0 || !_mi_is_power_of_two(alignment)) { // require power-of-two (see <https://en.cppreference.com/w/c/memory/aligned_alloc>)
+    #if MI_DEBUG > 0
+    _mi_error_message(EOVERFLOW, "aligned allocation requires the alignment to be a power-of-two (size %zu, alignment %zu)\n", size, alignment);
+    #endif
+    return NULL;
+  }
+
+  #if MI_GUARDED
+  if (offset==0 && alignment < MI_BLOCK_ALIGNMENT_MAX && mi_heap_malloc_use_guarded(heap,size)) {
+    return mi_heap_malloc_guarded_aligned(heap, size, alignment, zero);
+  }
+  #endif
+
+  // try first if there happens to be a small block available with just the right alignment
+  if mi_likely(size <= MI_SMALL_SIZE_MAX && alignment <= size) {
+    const uintptr_t align_mask = alignment-1;       // for any x, `(x & align_mask) == (x % alignment)`
+    const size_t padsize = size + MI_PADDING_SIZE;
+    mi_page_t* page = _mi_heap_get_free_small_page(heap, padsize);
+    if mi_likely(page->free != NULL) {
+      const bool is_aligned = (((uintptr_t)page->free + offset) & align_mask)==0;
+      if mi_likely(is_aligned)
+      {
+        if (usable!=NULL) { *usable = mi_page_usable_block_size(page); }
+        void* p = (zero ? _mi_page_malloc_zeroed(heap,page,padsize) : _mi_page_malloc(heap,page,padsize)); // call specific page malloc for better codegen
+        mi_assert_internal(p != NULL);
+        mi_assert_internal(((uintptr_t)p + offset) % alignment == 0);
+        mi_track_malloc(p,size,zero);
+        return p;
+      }
+    }
+  }
+
+  // fallback to generic aligned allocation
+  return mi_heap_malloc_zero_aligned_at_generic(heap, size, alignment, offset, zero, usable);
+}
+
+
+// ------------------------------------------------------
+// Optimized mi_heap_malloc_aligned / mi_malloc_aligned
+// ------------------------------------------------------
+
+mi_decl_nodiscard mi_decl_restrict void* mi_heap_malloc_aligned_at(mi_heap_t* heap, size_t size, size_t alignment, size_t offset) mi_attr_noexcept {
+  return mi_heap_malloc_zero_aligned_at(heap, size, alignment, offset, false, NULL);
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_heap_malloc_aligned(mi_heap_t* heap, size_t size, size_t alignment) mi_attr_noexcept {
+  return mi_heap_malloc_aligned_at(heap, size, alignment, 0);
+}
+
+// ensure a definition is emitted
+#if defined(__cplusplus)
+void* _mi_extern_heap_malloc_aligned = (void*)&mi_heap_malloc_aligned;
+#endif
+
+// ------------------------------------------------------
+// Aligned Allocation
+// ------------------------------------------------------
+
+mi_decl_nodiscard mi_decl_restrict void* mi_heap_zalloc_aligned_at(mi_heap_t* heap, size_t size, size_t alignment, size_t offset) mi_attr_noexcept {
+  return mi_heap_malloc_zero_aligned_at(heap, size, alignment, offset, true, NULL);
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_heap_zalloc_aligned(mi_heap_t* heap, size_t size, size_t alignment) mi_attr_noexcept {
+  return mi_heap_zalloc_aligned_at(heap, size, alignment, 0);
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_heap_calloc_aligned_at(mi_heap_t* heap, size_t count, size_t size, size_t alignment, size_t offset) mi_attr_noexcept {
+  size_t total;
+  if (mi_count_size_overflow(count, size, &total)) return NULL;
+  return mi_heap_zalloc_aligned_at(heap, total, alignment, offset);
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_heap_calloc_aligned(mi_heap_t* heap, size_t count, size_t size, size_t alignment) mi_attr_noexcept {
+  return mi_heap_calloc_aligned_at(heap,count,size,alignment,0);
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_malloc_aligned_at(size_t size, size_t alignment, size_t offset) mi_attr_noexcept {
+  return mi_heap_malloc_aligned_at(mi_prim_get_default_heap(), size, alignment, offset);
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_malloc_aligned(size_t size, size_t alignment) mi_attr_noexcept {
+  return mi_heap_malloc_aligned(mi_prim_get_default_heap(), size, alignment);
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_umalloc_aligned(size_t size, size_t alignment, size_t* block_size) mi_attr_noexcept {
+  return mi_heap_malloc_zero_aligned_at(mi_prim_get_default_heap(), size, alignment, 0, false, block_size);
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_zalloc_aligned_at(size_t size, size_t alignment, size_t offset) mi_attr_noexcept {
+  return mi_heap_zalloc_aligned_at(mi_prim_get_default_heap(), size, alignment, offset);
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_zalloc_aligned(size_t size, size_t alignment) mi_attr_noexcept {
+  return mi_heap_zalloc_aligned(mi_prim_get_default_heap(), size, alignment);
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_uzalloc_aligned(size_t size, size_t alignment, size_t* block_size) mi_attr_noexcept {
+  return mi_heap_malloc_zero_aligned_at(mi_prim_get_default_heap(), size, alignment, 0, true, block_size);
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_calloc_aligned_at(size_t count, size_t size, size_t alignment, size_t offset) mi_attr_noexcept {
+  return mi_heap_calloc_aligned_at(mi_prim_get_default_heap(), count, size, alignment, offset);
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_calloc_aligned(size_t count, size_t size, size_t alignment) mi_attr_noexcept {
+  return mi_heap_calloc_aligned(mi_prim_get_default_heap(), count, size, alignment);
+}
+
+
+// ------------------------------------------------------
+// Aligned re-allocation
+// ------------------------------------------------------
+
+static void* mi_heap_realloc_zero_aligned_at(mi_heap_t* heap, void* p, size_t newsize, size_t alignment, size_t offset, bool zero) mi_attr_noexcept {
+  mi_assert(alignment > 0);
+  if (alignment <= sizeof(uintptr_t)) return _mi_heap_realloc_zero(heap,p,newsize,zero,NULL,NULL);
+  if (p == NULL) return mi_heap_malloc_zero_aligned_at(heap,newsize,alignment,offset,zero,NULL);
+  size_t size = mi_usable_size(p);
+  if (newsize <= size && newsize >= (size - (size / 2))
+      && (((uintptr_t)p + offset) % alignment) == 0) {
+    return p;  // reallocation still fits, is aligned and not more than 50% waste
+  }
+  else {
+    // note: we don't zero allocate upfront so we only zero initialize the expanded part
+    void* newp = mi_heap_malloc_aligned_at(heap,newsize,alignment,offset);
+    if (newp != NULL) {
+      if (zero && newsize > size) {
+        // also set last word in the previous allocation to zero to ensure any padding is zero-initialized
+        size_t start = (size >= sizeof(intptr_t) ? size - sizeof(intptr_t) : 0);
+        _mi_memzero((uint8_t*)newp + start, newsize - start);
+      }
+      _mi_memcpy_aligned(newp, p, (newsize > size ? size : newsize));
+      mi_free(p); // only free if successful
+    }
+    return newp;
+  }
+}
+
+static void* mi_heap_realloc_zero_aligned(mi_heap_t* heap, void* p, size_t newsize, size_t alignment, bool zero) mi_attr_noexcept {
+  mi_assert(alignment > 0);
+  if (alignment <= sizeof(uintptr_t)) return _mi_heap_realloc_zero(heap,p,newsize,zero,NULL,NULL);
+  size_t offset = ((uintptr_t)p % alignment); // use offset of previous allocation (p can be NULL)
+  return mi_heap_realloc_zero_aligned_at(heap,p,newsize,alignment,offset,zero);
+}
+
+mi_decl_nodiscard void* mi_heap_realloc_aligned_at(mi_heap_t* heap, void* p, size_t newsize, size_t alignment, size_t offset) mi_attr_noexcept {
+  return mi_heap_realloc_zero_aligned_at(heap,p,newsize,alignment,offset,false);
+}
+
+mi_decl_nodiscard void* mi_heap_realloc_aligned(mi_heap_t* heap, void* p, size_t newsize, size_t alignment) mi_attr_noexcept {
+  return mi_heap_realloc_zero_aligned(heap,p,newsize,alignment,false);
+}
+
+mi_decl_nodiscard void* mi_heap_rezalloc_aligned_at(mi_heap_t* heap, void* p, size_t newsize, size_t alignment, size_t offset) mi_attr_noexcept {
+  return mi_heap_realloc_zero_aligned_at(heap, p, newsize, alignment, offset, true);
+}
+
+mi_decl_nodiscard void* mi_heap_rezalloc_aligned(mi_heap_t* heap, void* p, size_t newsize, size_t alignment) mi_attr_noexcept {
+  return mi_heap_realloc_zero_aligned(heap, p, newsize, alignment, true);
+}
+
+mi_decl_nodiscard void* mi_heap_recalloc_aligned_at(mi_heap_t* heap, void* p, size_t newcount, size_t size, size_t alignment, size_t offset) mi_attr_noexcept {
+  size_t total;
+  if (mi_count_size_overflow(newcount, size, &total)) return NULL;
+  return mi_heap_rezalloc_aligned_at(heap, p, total, alignment, offset);
+}
+
+mi_decl_nodiscard void* mi_heap_recalloc_aligned(mi_heap_t* heap, void* p, size_t newcount, size_t size, size_t alignment) mi_attr_noexcept {
+  size_t total;
+  if (mi_count_size_overflow(newcount, size, &total)) return NULL;
+  return mi_heap_rezalloc_aligned(heap, p, total, alignment);
+}
+
+mi_decl_nodiscard void* mi_realloc_aligned_at(void* p, size_t newsize, size_t alignment, size_t offset) mi_attr_noexcept {
+  return mi_heap_realloc_aligned_at(mi_prim_get_default_heap(), p, newsize, alignment, offset);
+}
+
+mi_decl_nodiscard void* mi_realloc_aligned(void* p, size_t newsize, size_t alignment) mi_attr_noexcept {
+  return mi_heap_realloc_aligned(mi_prim_get_default_heap(), p, newsize, alignment);
+}
+
+mi_decl_nodiscard void* mi_rezalloc_aligned_at(void* p, size_t newsize, size_t alignment, size_t offset) mi_attr_noexcept {
+  return mi_heap_rezalloc_aligned_at(mi_prim_get_default_heap(), p, newsize, alignment, offset);
+}
+
+mi_decl_nodiscard void* mi_rezalloc_aligned(void* p, size_t newsize, size_t alignment) mi_attr_noexcept {
+  return mi_heap_rezalloc_aligned(mi_prim_get_default_heap(), p, newsize, alignment);
+}
+
+mi_decl_nodiscard void* mi_recalloc_aligned_at(void* p, size_t newcount, size_t size, size_t alignment, size_t offset) mi_attr_noexcept {
+  return mi_heap_recalloc_aligned_at(mi_prim_get_default_heap(), p, newcount, size, alignment, offset);
+}
+
+mi_decl_nodiscard void* mi_recalloc_aligned(void* p, size_t newcount, size_t size, size_t alignment) mi_attr_noexcept {
+  return mi_heap_recalloc_aligned(mi_prim_get_default_heap(), p, newcount, size, alignment);
+}
+
+
diff --git a/compat/mimalloc/alloc.c b/compat/mimalloc/alloc.c
new file mode 100644
index 00000000000000..120615b2ec8732
--- /dev/null
+++ b/compat/mimalloc/alloc.c
@@ -0,0 +1,735 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2024, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+#ifndef _DEFAULT_SOURCE
+#define _DEFAULT_SOURCE   // for realpath() on Linux
+#endif
+
+#include "mimalloc.h"
+#include "mimalloc/internal.h"
+#include "mimalloc/atomic.h"
+#include "mimalloc/prim.h"   // _mi_prim_thread_id()
+
+#include <string.h>      // memset, strlen (for mi_strdup)
+#include <stdlib.h>      // malloc, abort
+
+#define MI_IN_ALLOC_C
+#include "alloc-override.c"
+#include "free.c"
+#undef MI_IN_ALLOC_C
+
+// ------------------------------------------------------
+// Allocation
+// ------------------------------------------------------
+
+// Fast allocation in a page: just pop from the free list.
+// Fall back to generic allocation only if the list is empty.
+// Note: in release mode the (inlined) routine is about 7 instructions with a single test.
+extern inline void* _mi_page_malloc_zero(mi_heap_t* heap, mi_page_t* page, size_t size, bool zero, size_t* usable) mi_attr_noexcept
+{
+  mi_assert_internal(size >= MI_PADDING_SIZE);
+  mi_assert_internal(page->block_size == 0 /* empty heap */ || mi_page_block_size(page) >= size);
+
+  // check the free list
+  mi_block_t* const block = page->free;
+  if mi_unlikely(block == NULL) {
+    return _mi_malloc_generic(heap, size, zero, 0, usable);
+  }
+  mi_assert_internal(block != NULL && _mi_ptr_page(block) == page);
+  if (usable != NULL) { *usable = mi_page_usable_block_size(page); };
+  // pop from the free list
+  page->free = mi_block_next(page, block);
+  page->used++;
+  mi_assert_internal(page->free == NULL || _mi_ptr_page(page->free) == page);
+  mi_assert_internal(page->block_size < MI_MAX_ALIGN_SIZE || _mi_is_aligned(block, MI_MAX_ALIGN_SIZE));
+
+  #if MI_DEBUG>3
+  if (page->free_is_zero && size > sizeof(*block)) {
+    mi_assert_expensive(mi_mem_is_zero(block+1,size - sizeof(*block)));
+  }
+  #endif
+
+  // allow use of the block internally
+  // note: when tracking we need to avoid ever touching the MI_PADDING since
+  // that is tracked by valgrind etc. as non-accessible (through the red-zone, see `mimalloc/track.h`)
+  mi_track_mem_undefined(block, mi_page_usable_block_size(page));
+
+  // zero the block? note: we need to zero the full block size (issue #63)
+  if mi_unlikely(zero) {
+    mi_assert_internal(page->block_size != 0); // do not call with zero'ing for huge blocks (see _mi_malloc_generic)
+    mi_assert_internal(!mi_page_is_huge(page));
+    #if MI_PADDING
+    mi_assert_internal(page->block_size >= MI_PADDING_SIZE);
+    #endif
+    if (page->free_is_zero) {
+      block->next = 0;
+      mi_track_mem_defined(block, page->block_size - MI_PADDING_SIZE);
+    }
+    else {
+      _mi_memzero_aligned(block, page->block_size - MI_PADDING_SIZE);
+    }
+  }
+
+  #if (MI_DEBUG>0) && !MI_TRACK_ENABLED && !MI_TSAN
+  if (!zero && !mi_page_is_huge(page)) {
+    memset(block, MI_DEBUG_UNINIT, mi_page_usable_block_size(page));
+  }
+  #elif (MI_SECURE!=0)
+  if (!zero) { block->next = 0; } // don't leak internal data
+  #endif
+
+  #if (MI_STAT>0)
+  const size_t bsize = mi_page_usable_block_size(page);
+  if (bsize <= MI_MEDIUM_OBJ_SIZE_MAX) {
+    mi_heap_stat_increase(heap, malloc_normal, bsize);
+    mi_heap_stat_counter_increase(heap, malloc_normal_count, 1);
+    #if (MI_STAT>1)
+    const size_t bin = _mi_bin(bsize);
+    mi_heap_stat_increase(heap, malloc_bins[bin], 1);
+    mi_heap_stat_increase(heap, malloc_requested, size - MI_PADDING_SIZE);
+    #endif
+  }
+  #endif
+
+  #if MI_PADDING // && !MI_TRACK_ENABLED
+    mi_padding_t* const padding = (mi_padding_t*)((uint8_t*)block + mi_page_usable_block_size(page));
+    ptrdiff_t delta = ((uint8_t*)padding - (uint8_t*)block - (size - MI_PADDING_SIZE));
+    #if (MI_DEBUG>=2)
+    mi_assert_internal(delta >= 0 && mi_page_usable_block_size(page) >= (size - MI_PADDING_SIZE + delta));
+    #endif
+    mi_track_mem_defined(padding,sizeof(mi_padding_t));  // note: re-enable since mi_page_usable_block_size may set noaccess
+    padding->canary = mi_ptr_encode_canary(page,block,page->keys);
+    padding->delta  = (uint32_t)(delta);
+    #if MI_PADDING_CHECK
+    if (!mi_page_is_huge(page)) {
+      uint8_t* fill = (uint8_t*)padding - delta;
+      const size_t maxpad = (delta > MI_MAX_ALIGN_SIZE ? MI_MAX_ALIGN_SIZE : delta); // set at most N initial padding bytes
+      for (size_t i = 0; i < maxpad; i++) { fill[i] = MI_DEBUG_PADDING; }
+    }
+    #endif
+  #endif
+
+  return block;
+}
+
+// extra entries for improved efficiency in `alloc-aligned.c`.
+extern void* _mi_page_malloc(mi_heap_t* heap, mi_page_t* page, size_t size) mi_attr_noexcept {
+  return _mi_page_malloc_zero(heap,page,size,false,NULL);
+}
+extern void* _mi_page_malloc_zeroed(mi_heap_t* heap, mi_page_t* page, size_t size) mi_attr_noexcept {
+  return _mi_page_malloc_zero(heap,page,size,true,NULL);
+}
+
+#if MI_GUARDED
+mi_decl_restrict void* _mi_heap_malloc_guarded(mi_heap_t* heap, size_t size, bool zero) mi_attr_noexcept;
+#endif
+
+static inline mi_decl_restrict void* mi_heap_malloc_small_zero(mi_heap_t* heap, size_t size, bool zero, size_t* usable) mi_attr_noexcept {
+  mi_assert(heap != NULL);
+  mi_assert(size <= MI_SMALL_SIZE_MAX);
+  #if MI_DEBUG
+  const uintptr_t tid = _mi_thread_id();
+  mi_assert(heap->thread_id == 0 || heap->thread_id == tid); // heaps are thread local
+  #endif
+  #if (MI_PADDING || MI_GUARDED)
+  if (size == 0) { size = sizeof(void*); }
+  #endif
+  #if MI_GUARDED
+  if (mi_heap_malloc_use_guarded(heap,size)) {
+    return _mi_heap_malloc_guarded(heap, size, zero);
+  }
+  #endif
+
+  // get page in constant time, and allocate from it
+  mi_page_t* page = _mi_heap_get_free_small_page(heap, size + MI_PADDING_SIZE);
+  void* const p = _mi_page_malloc_zero(heap, page, size + MI_PADDING_SIZE, zero, usable);
+  mi_track_malloc(p,size,zero);
+
+  #if MI_DEBUG>3
+  if (p != NULL && zero) {
+    mi_assert_expensive(mi_mem_is_zero(p, size));
+  }
+  #endif
+  return p;
+}
+
+// allocate a small block
+mi_decl_nodiscard extern inline mi_decl_restrict void* mi_heap_malloc_small(mi_heap_t* heap, size_t size) mi_attr_noexcept {
+  return mi_heap_malloc_small_zero(heap, size, false, NULL);
+}
+
+mi_decl_nodiscard extern inline mi_decl_restrict void* mi_malloc_small(size_t size) mi_attr_noexcept {
+  return mi_heap_malloc_small(mi_prim_get_default_heap(), size);
+}
+
+// The main allocation function
+extern inline void* _mi_heap_malloc_zero_ex(mi_heap_t* heap, size_t size, bool zero, size_t huge_alignment, size_t* usable) mi_attr_noexcept {
+  // fast path for small objects
+  if mi_likely(size <= MI_SMALL_SIZE_MAX) {
+    mi_assert_internal(huge_alignment == 0);
+    return mi_heap_malloc_small_zero(heap, size, zero, usable);
+  }
+  #if MI_GUARDED
+  else if (huge_alignment==0 && mi_heap_malloc_use_guarded(heap,size)) {
+    return _mi_heap_malloc_guarded(heap, size, zero);
+  }
+  #endif
+  else {
+    // regular allocation
+    mi_assert(heap!=NULL);
+    mi_assert(heap->thread_id == 0 || heap->thread_id == _mi_thread_id());   // heaps are thread local
+    void* const p = _mi_malloc_generic(heap, size + MI_PADDING_SIZE, zero, huge_alignment, usable);  // note: size can overflow but it is detected in malloc_generic
+    mi_track_malloc(p,size,zero);
+
+    #if MI_DEBUG>3
+    if (p != NULL && zero) {
+      mi_assert_expensive(mi_mem_is_zero(p, size));
+    }
+    #endif
+    return p;
+  }
+}
+
+extern inline void* _mi_heap_malloc_zero(mi_heap_t* heap, size_t size, bool zero) mi_attr_noexcept {
+  return _mi_heap_malloc_zero_ex(heap, size, zero, 0, NULL);
+}
+
+mi_decl_nodiscard extern inline mi_decl_restrict void* mi_heap_malloc(mi_heap_t* heap, size_t size) mi_attr_noexcept {
+  return _mi_heap_malloc_zero(heap, size, false);
+}
+
+mi_decl_nodiscard extern inline mi_decl_restrict void* mi_malloc(size_t size) mi_attr_noexcept {
+  return mi_heap_malloc(mi_prim_get_default_heap(), size);
+}
+
+// zero initialized small block
+mi_decl_nodiscard mi_decl_restrict void* mi_zalloc_small(size_t size) mi_attr_noexcept {
+  return mi_heap_malloc_small_zero(mi_prim_get_default_heap(), size, true, NULL);
+}
+
+mi_decl_nodiscard extern inline mi_decl_restrict void* mi_heap_zalloc(mi_heap_t* heap, size_t size) mi_attr_noexcept {
+  return _mi_heap_malloc_zero(heap, size, true);
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_zalloc(size_t size) mi_attr_noexcept {
+  return mi_heap_zalloc(mi_prim_get_default_heap(),size);
+}
+
+
+mi_decl_nodiscard extern inline mi_decl_restrict void* mi_heap_calloc(mi_heap_t* heap, size_t count, size_t size) mi_attr_noexcept {
+  size_t total;
+  if (mi_count_size_overflow(count,size,&total)) return NULL;
+  return mi_heap_zalloc(heap,total);
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_calloc(size_t count, size_t size) mi_attr_noexcept {
+  return mi_heap_calloc(mi_prim_get_default_heap(),count,size);
+}
+
+// Return usable size
+mi_decl_nodiscard mi_decl_restrict void* mi_umalloc_small(size_t size, size_t* usable) mi_attr_noexcept {
+  return mi_heap_malloc_small_zero(mi_prim_get_default_heap(), size, false, usable);
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_heap_umalloc(mi_heap_t* heap, size_t size, size_t* usable) mi_attr_noexcept {
+  return _mi_heap_malloc_zero_ex(heap, size, false, 0, usable);
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_umalloc(size_t size, size_t* usable) mi_attr_noexcept {
+  return mi_heap_umalloc(mi_prim_get_default_heap(), size, usable);
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_uzalloc(size_t size, size_t* usable) mi_attr_noexcept {
+  return _mi_heap_malloc_zero_ex(mi_prim_get_default_heap(), size, true, 0, usable);
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_ucalloc(size_t count, size_t size, size_t* usable) mi_attr_noexcept {
+  size_t total;
+  if (mi_count_size_overflow(count,size,&total)) return NULL;
+  return mi_uzalloc(total, usable);
+}
+
+// Uninitialized `calloc`
+mi_decl_nodiscard extern mi_decl_restrict void* mi_heap_mallocn(mi_heap_t* heap, size_t count, size_t size) mi_attr_noexcept {
+  size_t total;
+  if (mi_count_size_overflow(count, size, &total)) return NULL;
+  return mi_heap_malloc(heap, total);
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_mallocn(size_t count, size_t size) mi_attr_noexcept {
+  return mi_heap_mallocn(mi_prim_get_default_heap(),count,size);
+}
+
+// Expand (or shrink) in place (or fail)
+void* mi_expand(void* p, size_t newsize) mi_attr_noexcept {
+  #if MI_PADDING
+  // we do not shrink/expand with padding enabled
+  MI_UNUSED(p); MI_UNUSED(newsize);
+  return NULL;
+  #else
+  if (p == NULL) return NULL;
+  const mi_page_t* const page = mi_validate_ptr_page(p,"mi_expand");
+  const size_t size = _mi_usable_size(p,page);
+  if (newsize > size) return NULL;
+  return p; // it fits
+  #endif
+}
+
+void* _mi_heap_realloc_zero(mi_heap_t* heap, void* p, size_t newsize, bool zero, size_t* usable_pre, size_t* usable_post) mi_attr_noexcept {
+  // if p == NULL then behave as malloc.
+  // else if size == 0 then reallocate to a zero-sized block (and don't return NULL, just as mi_malloc(0)).
+  // (this means that returning NULL always indicates an error, and `p` will not have been freed in that case.)
+  const mi_page_t* page;
+  size_t size;
+  if (p==NULL) {
+    page = NULL;
+    size = 0;
+    if (usable_pre!=NULL) { *usable_pre = 0; }
+  }
+  else {
+    page = mi_validate_ptr_page(p,"mi_realloc");
+    size = _mi_usable_size(p,page);
+    if (usable_pre!=NULL) { *usable_pre = mi_page_usable_block_size(page); }
+  }
+  if mi_unlikely(newsize <= size && newsize >= (size / 2) && newsize > 0) {  // note: newsize must be > 0 or otherwise we return NULL for realloc(NULL,0)
+    mi_assert_internal(p!=NULL);
+    // todo: do not track as the usable size is still the same in the free; adjust potential padding?
+    // mi_track_resize(p,size,newsize)
+    // if (newsize < size) { mi_track_mem_noaccess((uint8_t*)p + newsize, size - newsize); }
+    if (usable_post!=NULL) { *usable_post = mi_page_usable_block_size(page); }
+    return p;  // reallocation still fits and not more than 50% waste
+  }
+  void* newp = mi_heap_umalloc(heap,newsize,usable_post);
+  if mi_likely(newp != NULL) {
+    if (zero && newsize > size) {
+      // also set last word in the previous allocation to zero to ensure any padding is zero-initialized
+      const size_t start = (size >= sizeof(intptr_t) ? size - sizeof(intptr_t) : 0);
+      _mi_memzero((uint8_t*)newp + start, newsize - start);
+    }
+    else if (newsize == 0) {
+      ((uint8_t*)newp)[0] = 0; // workaround for applications that expect zero-reallocation to be zero initialized (issue #725)
+    }
+    if mi_likely(p != NULL) {
+      const size_t copysize = (newsize > size ? size : newsize);
+      mi_track_mem_defined(p,copysize);  // _mi_usable_size may be too large for byte precise memory tracking.
+      _mi_memcpy(newp, p, copysize);
+      mi_free(p); // only free the original pointer if successful
+    }
+  }
+  return newp;
+}
+
+mi_decl_nodiscard void* mi_heap_realloc(mi_heap_t* heap, void* p, size_t newsize) mi_attr_noexcept {
+  return _mi_heap_realloc_zero(heap, p, newsize, false, NULL, NULL);
+}
+
+mi_decl_nodiscard void* mi_heap_reallocn(mi_heap_t* heap, void* p, size_t count, size_t size) mi_attr_noexcept {
+  size_t total;
+  if (mi_count_size_overflow(count, size, &total)) return NULL;
+  return mi_heap_realloc(heap, p, total);
+}
+
+
+// Reallocate but free `p` on errors
+mi_decl_nodiscard void* mi_heap_reallocf(mi_heap_t* heap, void* p, size_t newsize) mi_attr_noexcept {
+  void* newp = mi_heap_realloc(heap, p, newsize);
+  if (newp==NULL && p!=NULL) mi_free(p);
+  return newp;
+}
+
+mi_decl_nodiscard void* mi_heap_rezalloc(mi_heap_t* heap, void* p, size_t newsize) mi_attr_noexcept {
+  return _mi_heap_realloc_zero(heap, p, newsize, true, NULL, NULL);
+}
+
+mi_decl_nodiscard void* mi_heap_recalloc(mi_heap_t* heap, void* p, size_t count, size_t size) mi_attr_noexcept {
+  size_t total;
+  if (mi_count_size_overflow(count, size, &total)) return NULL;
+  return mi_heap_rezalloc(heap, p, total);
+}
+
+
+mi_decl_nodiscard void* mi_realloc(void* p, size_t newsize) mi_attr_noexcept {
+  return mi_heap_realloc(mi_prim_get_default_heap(),p,newsize);
+}
+
+mi_decl_nodiscard void* mi_reallocn(void* p, size_t count, size_t size) mi_attr_noexcept {
+  return mi_heap_reallocn(mi_prim_get_default_heap(),p,count,size);
+}
+
+mi_decl_nodiscard void* mi_urealloc(void* p, size_t newsize, size_t* usable_pre, size_t* usable_post) mi_attr_noexcept {
+  return _mi_heap_realloc_zero(mi_prim_get_default_heap(),p,newsize, false, usable_pre, usable_post);
+}
+
+// Reallocate but free `p` on errors
+mi_decl_nodiscard void* mi_reallocf(void* p, size_t newsize) mi_attr_noexcept {
+  return mi_heap_reallocf(mi_prim_get_default_heap(),p,newsize);
+}
+
+mi_decl_nodiscard void* mi_rezalloc(void* p, size_t newsize) mi_attr_noexcept {
+  return mi_heap_rezalloc(mi_prim_get_default_heap(), p, newsize);
+}
+
+mi_decl_nodiscard void* mi_recalloc(void* p, size_t count, size_t size) mi_attr_noexcept {
+  return mi_heap_recalloc(mi_prim_get_default_heap(), p, count, size);
+}
+
+
+
+// ------------------------------------------------------
+// strdup, strndup, and realpath
+// ------------------------------------------------------
+
+// `strdup` using mi_malloc
+mi_decl_nodiscard mi_decl_restrict char* mi_heap_strdup(mi_heap_t* heap, const char* s) mi_attr_noexcept {
+  if (s == NULL) return NULL;
+  size_t len = _mi_strlen(s);
+  char* t = (char*)mi_heap_malloc(heap,len+1);
+  if (t == NULL) return NULL;
+  _mi_memcpy(t, s, len);
+  t[len] = 0;
+  return t;
+}
+
+mi_decl_nodiscard mi_decl_restrict char* mi_strdup(const char* s) mi_attr_noexcept {
+  return mi_heap_strdup(mi_prim_get_default_heap(), s);
+}
+
+// `strndup` using mi_malloc
+mi_decl_nodiscard mi_decl_restrict char* mi_heap_strndup(mi_heap_t* heap, const char* s, size_t n) mi_attr_noexcept {
+  if (s == NULL) return NULL;
+  const size_t len = _mi_strnlen(s,n);  // len <= n
+  char* t = (char*)mi_heap_malloc(heap, len+1);
+  if (t == NULL) return NULL;
+  _mi_memcpy(t, s, len);
+  t[len] = 0;
+  return t;
+}
+
+mi_decl_nodiscard mi_decl_restrict char* mi_strndup(const char* s, size_t n) mi_attr_noexcept {
+  return mi_heap_strndup(mi_prim_get_default_heap(),s,n);
+}
+
+#ifndef __wasi__
+// `realpath` using mi_malloc
+#ifdef _WIN32
+#ifndef PATH_MAX
+#define PATH_MAX MAX_PATH
+#endif
+
+mi_decl_nodiscard mi_decl_restrict char* mi_heap_realpath(mi_heap_t* heap, const char* fname, char* resolved_name) mi_attr_noexcept {
+  // todo: use GetFullPathNameW to allow longer file names
+  char buf[PATH_MAX];
+  DWORD res = GetFullPathNameA(fname, PATH_MAX, (resolved_name == NULL ? buf : resolved_name), NULL);
+  if (res == 0) {
+    errno = GetLastError(); return NULL;
+  }
+  else if (res > PATH_MAX) {
+    errno = EINVAL; return NULL;
+  }
+  else if (resolved_name != NULL) {
+    return resolved_name;
+  }
+  else {
+    return mi_heap_strndup(heap, buf, PATH_MAX);
+  }
+}
+#else
+/*
+#include <unistd.h>  // pathconf
+static size_t mi_path_max(void) {
+  static size_t path_max = 0;
+  if (path_max <= 0) {
+    long m = pathconf("/",_PC_PATH_MAX);
+    if (m <= 0) path_max = 4096;      // guess
+    else if (m < 256) path_max = 256; // at least 256
+    else path_max = m;
+  }
+  return path_max;
+}
+*/
+char* mi_heap_realpath(mi_heap_t* heap, const char* fname, char* resolved_name) mi_attr_noexcept {
+  if (resolved_name != NULL) {
+    return realpath(fname,resolved_name);
+  }
+  else {
+    char* rname = realpath(fname, NULL);
+    if (rname == NULL) return NULL;
+    char* result = mi_heap_strdup(heap, rname);
+    mi_cfree(rname);  // use checked free (which may be redirected to our free but that's ok)
+    // note: with ASAN realpath is intercepted and mi_cfree may leak the returned pointer :-(
+    return result;
+  }
+  /*
+    const size_t n  = mi_path_max();
+    char* buf = (char*)mi_malloc(n+1);
+    if (buf == NULL) {
+      errno = ENOMEM;
+      return NULL;
+    }
+    char* rname  = realpath(fname,buf);
+    char* result = mi_heap_strndup(heap,rname,n); // ok if `rname==NULL`
+    mi_free(buf);
+    return result;
+  }
+  */
+}
+#endif
+
+mi_decl_nodiscard mi_decl_restrict char* mi_realpath(const char* fname, char* resolved_name) mi_attr_noexcept {
+  return mi_heap_realpath(mi_prim_get_default_heap(),fname,resolved_name);
+}
+#endif
+
+/*-------------------------------------------------------
+C++ new and new_aligned
+The standard requires calling into `get_new_handler` and
+throwing the bad_alloc exception on failure. If we compile
+with a C++ compiler we can implement this precisely. If we
+use a C compiler we cannot throw a `bad_alloc` exception
+but we call `abort` instead (i.e. not returning).
+-------------------------------------------------------*/
+
+#ifdef __cplusplus
+#include <new>
+static bool mi_try_new_handler(bool nothrow) {
+  #if defined(_MSC_VER) || (__cplusplus >= 201103L)
+    std::new_handler h = std::get_new_handler();
+  #else
+    std::new_handler h = std::set_new_handler(NULL);  // pre-C++11 has no get_new_handler: swap the handler out to read it
+    std::set_new_handler(h);
+  #endif
+  if (h==NULL) {
+    _mi_error_message(ENOMEM, "out of memory in 'new'");
+    #if defined(_CPPUNWIND) || defined(__cpp_exceptions)  // exceptions are not always enabled
+    if (!nothrow) {
+      throw std::bad_alloc();
+    }
+    #else
+    MI_UNUSED(nothrow);
+    #endif
+    return false;
+  }
+  else {
+    h();
+    return true;
+  }
+}
+#else
+typedef void (*std_new_handler_t)(void);
+
+#if (defined(__GNUC__) || (defined(__clang__) && !defined(_MSC_VER)))  // exclude clang-cl, see issue #631
+std_new_handler_t __attribute__((weak)) _ZSt15get_new_handlerv(void) {
+  return NULL;
+}
+static std_new_handler_t mi_get_new_handler(void) {
+  return _ZSt15get_new_handlerv();
+}
+#else
+// note: on windows we could dynamically link to `?get_new_handler@std@@YAP6AXXZXZ`.
+static std_new_handler_t mi_get_new_handler(void) {
+  return NULL;
+}
+#endif
+
+static bool mi_try_new_handler(bool nothrow) {
+  std_new_handler_t h = mi_get_new_handler();
+  if (h==NULL) {
+    _mi_error_message(ENOMEM, "out of memory in 'new'");
+    if (!nothrow) {
+      abort();  // cannot throw in plain C, use abort
+    }
+    return false;
+  }
+  else {
+    h();
+    return true;
+  }
+}
+#endif
+
+mi_decl_export mi_decl_noinline void* mi_heap_try_new(mi_heap_t* heap, size_t size, bool nothrow ) {
+  void* p = NULL;
+  while(p == NULL && mi_try_new_handler(nothrow)) {
+    p = mi_heap_malloc(heap,size);
+  }
+  return p;
+}
+
+static mi_decl_noinline void* mi_try_new(size_t size, bool nothrow) {
+  return mi_heap_try_new(mi_prim_get_default_heap(), size, nothrow);
+}
+
+
+mi_decl_nodiscard mi_decl_restrict void* mi_heap_alloc_new(mi_heap_t* heap, size_t size) {
+  void* p = mi_heap_malloc(heap,size);
+  if mi_unlikely(p == NULL) return mi_heap_try_new(heap, size, false);
+  return p;
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_new(size_t size) {
+  return mi_heap_alloc_new(mi_prim_get_default_heap(), size);
+}
+
+
+mi_decl_nodiscard mi_decl_restrict void* mi_heap_alloc_new_n(mi_heap_t* heap, size_t count, size_t size) {
+  size_t total;
+  if mi_unlikely(mi_count_size_overflow(count, size, &total)) {
+    mi_try_new_handler(false);  // on overflow we invoke the try_new_handler once to potentially throw std::bad_alloc
+    return NULL;
+  }
+  else {
+    return mi_heap_alloc_new(heap,total);
+  }
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_new_n(size_t count, size_t size) {
+  return mi_heap_alloc_new_n(mi_prim_get_default_heap(), count, size);
+}
+
+
+mi_decl_nodiscard mi_decl_restrict void* mi_new_nothrow(size_t size) mi_attr_noexcept {
+  void* p = mi_malloc(size);
+  if mi_unlikely(p == NULL) return mi_try_new(size, true);
+  return p;
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_new_aligned(size_t size, size_t alignment) {
+  void* p;
+  do {
+    p = mi_malloc_aligned(size, alignment);
+  }
+  while(p == NULL && mi_try_new_handler(false));
+  return p;
+}
+
+mi_decl_nodiscard mi_decl_restrict void* mi_new_aligned_nothrow(size_t size, size_t alignment) mi_attr_noexcept {
+  void* p;
+  do {
+    p = mi_malloc_aligned(size, alignment);
+  }
+  while(p == NULL && mi_try_new_handler(true));
+  return p;
+}
+
+mi_decl_nodiscard void* mi_new_realloc(void* p, size_t newsize) {
+  void* q;
+  do {
+    q = mi_realloc(p, newsize);
+  } while (q == NULL && mi_try_new_handler(false));
+  return q;
+}
+
+mi_decl_nodiscard void* mi_new_reallocn(void* p, size_t newcount, size_t size) {
+  size_t total;
+  if mi_unlikely(mi_count_size_overflow(newcount, size, &total)) {
+    mi_try_new_handler(false);  // on overflow we invoke the try_new_handler once to potentially throw std::bad_alloc
+    return NULL;
+  }
+  else {
+    return mi_new_realloc(p, total);
+  }
+}
+
+#if MI_GUARDED
+// We always allocate a guarded allocation at an offset (`mi_page_has_aligned` will be true).
+// We then set the first word of the block to `0` for regular offset aligned allocations (in `alloc-aligned.c`)
+// and the first word to `~0` for guarded allocations to have a correct `mi_usable_size`
+
+static void* mi_block_ptr_set_guarded(mi_block_t* block, size_t obj_size) {
+  // TODO: we can still make padding work by moving it out of the guard page area
+  mi_page_t* const page = _mi_ptr_page(block);
+  mi_page_set_has_aligned(page, true);
+  block->next = MI_BLOCK_TAG_GUARDED;
+
+  // set guard page at the end of the block
+  mi_segment_t* const segment = _mi_page_segment(page);
+  const size_t block_size = mi_page_block_size(page);  // must use `block_size` to match `mi_free_local`
+  const size_t os_page_size = _mi_os_page_size();
+  mi_assert_internal(block_size >= obj_size + os_page_size + sizeof(mi_block_t));
+  if (block_size < obj_size + os_page_size + sizeof(mi_block_t)) {
+    // should never happen
+    mi_free(block);
+    return NULL;
+  }
+  uint8_t* guard_page = (uint8_t*)block + block_size - os_page_size;
+  mi_assert_internal(_mi_is_aligned(guard_page, os_page_size));
+  if mi_likely(segment->allow_decommit && _mi_is_aligned(guard_page, os_page_size)) {
+    const bool ok = _mi_os_protect(guard_page, os_page_size);
+    if mi_unlikely(!ok) {
+      _mi_warning_message("failed to set a guard page behind an object (object %p of size %zu)\n", block, block_size);
+    }
+  }
+  else {
+    _mi_warning_message("unable to set a guard page behind an object due to pinned memory (large OS pages?) (object %p of size %zu)\n", block, block_size);
+  }
+
+  // align pointer just in front of the guard page
+  size_t offset = block_size - os_page_size - obj_size;
+  mi_assert_internal(offset > sizeof(mi_block_t));
+  if (offset > MI_BLOCK_ALIGNMENT_MAX) {
+    // give up on placing it right in front of the guard page if the offset is too large to unalign
+    offset = MI_BLOCK_ALIGNMENT_MAX;
+  }
+  void* p = (uint8_t*)block + offset;
+  mi_track_align(block, p, offset, obj_size);
+  mi_track_mem_defined(block, sizeof(mi_block_t));
+  return p;
+}
+
+mi_decl_restrict void* _mi_heap_malloc_guarded(mi_heap_t* heap, size_t size, bool zero) mi_attr_noexcept
+{
+  #if defined(MI_PADDING_SIZE)
+  mi_assert(MI_PADDING_SIZE==0);
+  #endif
+  // allocate multiple of page size ending in a guard page
+  // ensure minimal alignment requirement?
+  const size_t os_page_size = _mi_os_page_size();
+  const size_t obj_size = (mi_option_is_enabled(mi_option_guarded_precise) ? size : _mi_align_up(size, MI_MAX_ALIGN_SIZE));
+  const size_t bsize    = _mi_align_up(_mi_align_up(obj_size, MI_MAX_ALIGN_SIZE) + sizeof(mi_block_t), MI_MAX_ALIGN_SIZE);
+  const size_t req_size = _mi_align_up(bsize + os_page_size, os_page_size);
+  mi_block_t* const block = (mi_block_t*)_mi_malloc_generic(heap, req_size, zero, 0 /* huge_alignment */, NULL);
+  if (block==NULL) return NULL;
+  void* const p   = mi_block_ptr_set_guarded(block, obj_size);
+
+  // stats
+  mi_track_malloc(p, size, zero);
+  if (p != NULL) {
+    if (!mi_heap_is_initialized(heap)) { heap = mi_prim_get_default_heap(); }
+    #if MI_STAT>1
+    mi_heap_stat_adjust_decrease(heap, malloc_requested, req_size);
+    mi_heap_stat_increase(heap, malloc_requested, size);
+    #endif
+    _mi_stat_counter_increase(&heap->tld->stats.malloc_guarded_count, 1);
+  }
+  #if MI_DEBUG>3
+  if (p != NULL && zero) {
+    mi_assert_expensive(mi_mem_is_zero(p, size));
+  }
+  #endif
+  return p;
+}
+#endif
+
+// ------------------------------------------------------
+// ensure explicit external inline definitions are emitted!
+// ------------------------------------------------------
+
+#ifdef __cplusplus
+void* _mi_externs[] = {
+  (void*)&_mi_page_malloc,
+  (void*)&_mi_page_malloc_zero,
+  (void*)&_mi_heap_malloc_zero,
+  (void*)&_mi_heap_malloc_zero_ex,
+  (void*)&mi_malloc,
+  (void*)&mi_malloc_small,
+  (void*)&mi_zalloc_small,
+  (void*)&mi_heap_malloc,
+  (void*)&mi_heap_zalloc,
+  (void*)&mi_heap_malloc_small,
+  // (void*)&mi_heap_alloc_new,
+  // (void*)&mi_heap_alloc_new_n
+};
+#endif
diff --git a/compat/mimalloc/arena-abandon.c b/compat/mimalloc/arena-abandon.c
new file mode 100644
index 00000000000000..460c80fc22782f
--- /dev/null
+++ b/compat/mimalloc/arena-abandon.c
@@ -0,0 +1,346 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2019-2024, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+
+#if !defined(MI_IN_ARENA_C)
+#error "this file should be included from 'arena.c' (so mi_arena_t is visible)"
+// these includes only help an IDE parse this file on its own
+#include "mimalloc.h"
+#include "mimalloc/internal.h"
+#include "bitmap.h"
+#endif
+
+// Minimal exports needed by `arena-abandon.c`.
+size_t      mi_arena_id_index(mi_arena_id_t id);
+mi_arena_t* mi_arena_from_index(size_t idx);
+size_t      mi_arena_get_count(void);
+void*       mi_arena_block_start(mi_arena_t* arena, mi_bitmap_index_t bindex);
+bool        mi_arena_memid_indices(mi_memid_t memid, size_t* arena_index, mi_bitmap_index_t* bitmap_index);
+
+/* -----------------------------------------------------------
+  Abandoned blocks/segments:
+
+  _mi_arena_segment_clear_abandoned
+  _mi_arena_segment_mark_abandoned
+
+  This is used to atomically abandon/reclaim segments
+  (and crosses the arena API but it is convenient to have here).
+
+  Abandoned segments still have live blocks; they get reclaimed
+  when a thread frees a block in it, or when a thread needs a fresh
+  segment.
+
+  Abandoned segments are atomically marked in the `blocks_abandoned`
+  bitmap of arenas. Any segments allocated outside arenas are put
+  in the sub-process `abandoned_os_list`. This list is accessed
+  using locks but this should be uncommon and generally uncontended.
+  Reclaim and visiting either scan through the `blocks_abandoned`
+  bitmaps of the arenas, or visit the `abandoned_os_list`.
+
+  A potentially nicer design is to use arenas for everything
+  and perhaps have virtual arenas to map OS allocated memory
+  but this would lack the "density" of our current arenas. TBC.
+----------------------------------------------------------- */
+
+
+// reclaim a specific OS abandoned segment; `true` on success.
+// sets the thread_id.
+static bool mi_arena_segment_os_clear_abandoned(mi_segment_t* segment, bool take_lock) {
+  mi_assert(segment->memid.memkind != MI_MEM_ARENA);
+  // not in an arena, remove from list of abandoned os segments
+  mi_subproc_t* const subproc = segment->subproc;
+  if (take_lock && !mi_lock_try_acquire(&subproc->abandoned_os_lock)) {
+    return false;  // failed to acquire the lock, we just give up
+  }
+  // remove atomically from the abandoned os list (if possible!)
+  bool reclaimed = false;
+  mi_segment_t* const next = segment->abandoned_os_next;
+  mi_segment_t* const prev = segment->abandoned_os_prev;
+  if (next != NULL || prev != NULL || subproc->abandoned_os_list == segment) {
+    #if MI_DEBUG>3
+    // find ourselves in the abandoned list (and check the count)
+    bool found = false;
+    size_t count = 0;
+    for (mi_segment_t* current = subproc->abandoned_os_list; current != NULL; current = current->abandoned_os_next) {
+      if (current == segment) { found = true; }
+      count++;
+    }
+    mi_assert_internal(found);
+    mi_assert_internal(count == mi_atomic_load_relaxed(&subproc->abandoned_os_list_count));
+    #endif
+    // remove (atomically) from the list and reclaim
+    if (prev != NULL) { prev->abandoned_os_next = next; }
+    else { subproc->abandoned_os_list = next; }
+    if (next != NULL) { next->abandoned_os_prev = prev; }
+    else { subproc->abandoned_os_list_tail = prev; }
+    segment->abandoned_os_next = NULL;
+    segment->abandoned_os_prev = NULL;
+    mi_atomic_decrement_relaxed(&subproc->abandoned_count);
+    mi_atomic_decrement_relaxed(&subproc->abandoned_os_list_count);
+    if (take_lock) { // don't reset the thread_id when iterating
+      mi_atomic_store_release(&segment->thread_id, _mi_thread_id());
+    }
+    reclaimed = true;
+  }
+  if (take_lock) { mi_lock_release(&segment->subproc->abandoned_os_lock); }
+  return reclaimed;
+}
+
+// reclaim a specific abandoned segment; `true` on success.
+// sets the thread_id.
+bool _mi_arena_segment_clear_abandoned(mi_segment_t* segment) {
+  if mi_unlikely(segment->memid.memkind != MI_MEM_ARENA) {
+    return mi_arena_segment_os_clear_abandoned(segment, true /* take lock */);
+  }
+  // arena segment: use the blocks_abandoned bitmap.
+  size_t arena_idx;
+  size_t bitmap_idx;
+  mi_arena_memid_indices(segment->memid, &arena_idx, &bitmap_idx);
+  mi_arena_t* arena = mi_arena_from_index(arena_idx);
+  mi_assert_internal(arena != NULL);
+  // reclaim atomically
+  bool was_marked = _mi_bitmap_unclaim(arena->blocks_abandoned, arena->field_count, 1, bitmap_idx);
+  if (was_marked) {
+    mi_assert_internal(mi_atomic_load_acquire(&segment->thread_id) == 0);
+    mi_atomic_decrement_relaxed(&segment->subproc->abandoned_count);
+    mi_atomic_store_release(&segment->thread_id, _mi_thread_id());
+  }
+  // mi_assert_internal(was_marked);
+  mi_assert_internal(!was_marked || _mi_bitmap_is_claimed(arena->blocks_inuse, arena->field_count, 1, bitmap_idx));
+  //mi_assert_internal(arena->blocks_committed == NULL || _mi_bitmap_is_claimed(arena->blocks_committed, arena->field_count, 1, bitmap_idx));
+  return was_marked;
+}
+
+
+// mark a specific OS segment as abandoned
+static void mi_arena_segment_os_mark_abandoned(mi_segment_t* segment) {
+  mi_assert(segment->memid.memkind != MI_MEM_ARENA);
+  // not in an arena; we use a list of abandoned segments
+  mi_subproc_t* const subproc = segment->subproc;
+  mi_lock(&subproc->abandoned_os_lock) {
+    // push on the tail of the list (important for the visitor)
+    mi_segment_t* prev = subproc->abandoned_os_list_tail;
+    mi_assert_internal(prev == NULL || prev->abandoned_os_next == NULL);
+    mi_assert_internal(segment->abandoned_os_prev == NULL);
+    mi_assert_internal(segment->abandoned_os_next == NULL);
+    if (prev != NULL) { prev->abandoned_os_next = segment; }
+    else { subproc->abandoned_os_list = segment; }
+    subproc->abandoned_os_list_tail = segment;
+    segment->abandoned_os_prev = prev;
+    segment->abandoned_os_next = NULL;
+    mi_atomic_increment_relaxed(&subproc->abandoned_os_list_count);
+    mi_atomic_increment_relaxed(&subproc->abandoned_count);
+    // and release the lock
+  }
+  return;
+}
+
+// mark a specific segment as abandoned
+// clears the thread_id.
+void _mi_arena_segment_mark_abandoned(mi_segment_t* segment)
+{
+  mi_assert_internal(segment->used == segment->abandoned);
+  mi_atomic_store_release(&segment->thread_id, (uintptr_t)0);  // mark as abandoned for multi-threaded frees
+  if mi_unlikely(segment->memid.memkind != MI_MEM_ARENA) {
+    mi_arena_segment_os_mark_abandoned(segment);
+    return;
+  }
+  // segment is in an arena, mark it in the arena `blocks_abandoned` bitmap
+  size_t arena_idx;
+  size_t bitmap_idx;
+  mi_arena_memid_indices(segment->memid, &arena_idx, &bitmap_idx);
+  mi_arena_t* arena = mi_arena_from_index(arena_idx);
+  mi_assert_internal(arena != NULL);
+  // set abandonment atomically
+  mi_subproc_t* const subproc = segment->subproc; // don't access the segment after setting it abandoned
+  const bool was_unmarked = _mi_bitmap_claim(arena->blocks_abandoned, arena->field_count, 1, bitmap_idx, NULL);
+  if (was_unmarked) { mi_atomic_increment_relaxed(&subproc->abandoned_count); }
+  mi_assert_internal(was_unmarked);
+  mi_assert_internal(_mi_bitmap_is_claimed(arena->blocks_inuse, arena->field_count, 1, bitmap_idx));
+}
+
+
+/* -----------------------------------------------------------
+  Iterate through the abandoned blocks/segments using a cursor.
+  This is used for reclaiming and abandoned block visiting.
+----------------------------------------------------------- */
+
+// start a cursor at a randomized arena
+void _mi_arena_field_cursor_init(mi_heap_t* heap, mi_subproc_t* subproc, bool visit_all, mi_arena_field_cursor_t* current) {
+  mi_assert_internal(heap == NULL || heap->tld->segments.subproc == subproc);
+  current->bitmap_idx = 0;
+  current->subproc = subproc;
+  current->visit_all = visit_all;
+  current->hold_visit_lock = false;
+  const size_t abandoned_count = mi_atomic_load_relaxed(&subproc->abandoned_count);
+  const size_t abandoned_list_count = mi_atomic_load_relaxed(&subproc->abandoned_os_list_count);
+  const size_t max_arena = mi_arena_get_count();
+  if (heap != NULL && heap->arena_id != _mi_arena_id_none()) {
+    // for a heap that is bound to one arena, only visit that arena
+    current->start = mi_arena_id_index(heap->arena_id);
+    current->end = current->start + 1;
+    current->os_list_count = 0;
+  }
+  else {
+    // otherwise visit all starting at a random location
+    if (abandoned_count > abandoned_list_count && max_arena > 0) {
+      current->start = (heap == NULL || max_arena == 0 ? 0 : (mi_arena_id_t)(_mi_heap_random_next(heap) % max_arena));
+      current->end = current->start + max_arena;
+    }
+    else {
+      current->start = 0;
+      current->end = 0;
+    }
+    current->os_list_count = abandoned_list_count; // max entries to visit in the os abandoned list
+  }
+  mi_assert_internal(current->start <= max_arena);
+}
+
+void _mi_arena_field_cursor_done(mi_arena_field_cursor_t* current) {
+  if (current->hold_visit_lock) {
+    mi_lock_release(&current->subproc->abandoned_os_visit_lock);
+    current->hold_visit_lock = false;
+  }
+}
+
+static mi_segment_t* mi_arena_segment_clear_abandoned_at(mi_arena_t* arena, mi_subproc_t* subproc, mi_bitmap_index_t bitmap_idx) {
+  // try to reclaim an abandoned segment in the arena atomically
+  if (!_mi_bitmap_unclaim(arena->blocks_abandoned, arena->field_count, 1, bitmap_idx)) return NULL;
+  mi_assert_internal(_mi_bitmap_is_claimed(arena->blocks_inuse, arena->field_count, 1, bitmap_idx));
+  mi_segment_t* segment = (mi_segment_t*)mi_arena_block_start(arena, bitmap_idx);
+  mi_assert_internal(mi_atomic_load_relaxed(&segment->thread_id) == 0);
+  // check that the segment belongs to our sub-process
+  // note: this is the reason we need the `abandoned_visit` lock in the case abandoned visiting is enabled.
+  //  without the lock an abandoned visit may otherwise fail to visit all abandoned segments in the sub-process.
+  //  for regular reclaim it is fine to miss one sometimes so without abandoned visiting we don't need the `abandoned_visit` lock.
+  if (segment->subproc != subproc) {
+    // it is from another sub-process, re-mark it and continue searching
+    const bool was_zero = _mi_bitmap_claim(arena->blocks_abandoned, arena->field_count, 1, bitmap_idx, NULL);
+    mi_assert_internal(was_zero); MI_UNUSED(was_zero);
+    return NULL;
+  }
+  else {
+    // success, we unabandoned a segment in our sub-process
+    mi_atomic_decrement_relaxed(&subproc->abandoned_count);
+    return segment;
+  }
+}
+
+static mi_segment_t* mi_arena_segment_clear_abandoned_next_field(mi_arena_field_cursor_t* previous) {
+  const size_t max_arena = mi_arena_get_count();
+  size_t field_idx = mi_bitmap_index_field(previous->bitmap_idx);
+  size_t bit_idx = mi_bitmap_index_bit_in_field(previous->bitmap_idx);
+  // visit arenas (from the previous cursor)
+  for (; previous->start < previous->end; previous->start++, field_idx = 0, bit_idx = 0) {
+    // index wraps around
+    size_t arena_idx = (previous->start >= max_arena ? previous->start % max_arena : previous->start);
+    mi_arena_t* arena = mi_arena_from_index(arena_idx);
+    if (arena != NULL) {
+      bool has_lock = false;
+      // visit the abandoned fields (starting at previous_idx)
+      for (; field_idx < arena->field_count; field_idx++, bit_idx = 0) {
+        size_t field = mi_atomic_load_relaxed(&arena->blocks_abandoned[field_idx]);
+        if mi_unlikely(field != 0) { // skip zero fields quickly
+          // we only take the arena lock if there are actually abandoned segments present
+          if (!has_lock && mi_option_is_enabled(mi_option_visit_abandoned)) {
+            has_lock = (previous->visit_all ? (mi_lock_acquire(&arena->abandoned_visit_lock),true) : mi_lock_try_acquire(&arena->abandoned_visit_lock));
+            if (!has_lock) {
+              if (previous->visit_all) {
+                _mi_error_message(EFAULT, "internal error: failed to visit all abandoned segments due to failure to acquire the visitor lock");
+              }
+              // skip to next arena
+              break;
+            }
+          }
+          mi_assert_internal(has_lock || !mi_option_is_enabled(mi_option_visit_abandoned));
+          // visit each set bit in the field  (todo: maybe use `ctz` here?)
+          for (; bit_idx < MI_BITMAP_FIELD_BITS; bit_idx++) {
+            // pre-check if the bit is set
+            size_t mask = ((size_t)1 << bit_idx);
+            if mi_unlikely((field & mask) == mask) {
+              mi_bitmap_index_t bitmap_idx = mi_bitmap_index_create(field_idx, bit_idx);
+              mi_segment_t* const segment = mi_arena_segment_clear_abandoned_at(arena, previous->subproc, bitmap_idx);
+              if (segment != NULL) {
+                //mi_assert_internal(arena->blocks_committed == NULL || _mi_bitmap_is_claimed(arena->blocks_committed, arena->field_count, 1, bitmap_idx));
+                if (has_lock) { mi_lock_release(&arena->abandoned_visit_lock); }
+                previous->bitmap_idx = mi_bitmap_index_create_ex(field_idx, bit_idx + 1); // start at next one for the next iteration
+                return segment;
+              }
+            }
+          }
+        }
+      }
+      if (has_lock) { mi_lock_release(&arena->abandoned_visit_lock); }
+    }
+  }
+  return NULL;
+}
+
+static mi_segment_t* mi_arena_segment_clear_abandoned_next_list(mi_arena_field_cursor_t* previous) {
+  // go through the abandoned_os_list
+  // we only allow one thread per sub-process to do the visit, guarded by the `abandoned_os_visit_lock`.
+  // The lock is released when the cursor is released.
+  if (!previous->hold_visit_lock) {
+    previous->hold_visit_lock = (previous->visit_all ? (mi_lock_acquire(&previous->subproc->abandoned_os_visit_lock),true)
+                                                     : mi_lock_try_acquire(&previous->subproc->abandoned_os_visit_lock));
+    if (!previous->hold_visit_lock) {
+      if (previous->visit_all) {
+        _mi_error_message(EFAULT, "internal error: failed to visit all abandoned segments due to failure to acquire the OS visitor lock");
+      }
+      return NULL; // we cannot get the lock, give up
+    }
+  }
+  // One list entry at a time
+  while (previous->os_list_count > 0) {
+    previous->os_list_count--;
+    mi_lock_acquire(&previous->subproc->abandoned_os_lock); // this could contend with concurrent OS block abandonment and reclaim from `free`
+    mi_segment_t* segment = previous->subproc->abandoned_os_list;
+    // pop from head of the list, a subsequent mark will push at the end (and thus we iterate through os_list_count entries)
+    if (segment == NULL || mi_arena_segment_os_clear_abandoned(segment, false /* we already have the lock */)) {
+      mi_lock_release(&previous->subproc->abandoned_os_lock);
+      return segment;
+    }
+    // already abandoned, try again
+    mi_lock_release(&previous->subproc->abandoned_os_lock);
+  }
+  // done
+  mi_assert_internal(previous->os_list_count == 0);
+  return NULL;
+}
+
+
+// reclaim abandoned segments
+// this does not set the thread id (so it appears as still abandoned)
+mi_segment_t* _mi_arena_segment_clear_abandoned_next(mi_arena_field_cursor_t* previous) {
+  if (previous->start < previous->end) {
+    // walk the arena
+    mi_segment_t* segment = mi_arena_segment_clear_abandoned_next_field(previous);
+    if (segment != NULL) { return segment; }
+  }
+  // no entries in the arenas anymore, walk the abandoned OS list
+  mi_assert_internal(previous->start == previous->end);
+  return mi_arena_segment_clear_abandoned_next_list(previous);
+}
+
+
+bool mi_abandoned_visit_blocks(mi_subproc_id_t subproc_id, int heap_tag, bool visit_blocks, mi_block_visit_fun* visitor, void* arg) {
+  // (unfortunately) the visit_abandoned option must be enabled from the start.
+  // This is to avoid taking locks if abandoned list visiting is not required (as for most programs)
+  if (!mi_option_is_enabled(mi_option_visit_abandoned)) {
+    _mi_error_message(EFAULT, "internal error: can only visit abandoned blocks when MIMALLOC_VISIT_ABANDONED=ON");
+    return false;
+  }
+  mi_arena_field_cursor_t current;
+  _mi_arena_field_cursor_init(NULL, _mi_subproc_from_id(subproc_id), true /* visit all (blocking) */, &current);
+  mi_segment_t* segment;
+  bool ok = true;
+  while (ok && (segment = _mi_arena_segment_clear_abandoned_next(&current)) != NULL) {
+    ok = _mi_segment_visit_blocks(segment, heap_tag, visit_blocks, visitor, arg);
+    _mi_arena_segment_mark_abandoned(segment);
+  }
+  _mi_arena_field_cursor_done(&current);
+  return ok;
+}
diff --git a/compat/mimalloc/arena.c b/compat/mimalloc/arena.c
new file mode 100644
index 00000000000000..c87dd23b54107c
--- /dev/null
+++ b/compat/mimalloc/arena.c
@@ -0,0 +1,1045 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2019-2024, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+
+/* ----------------------------------------------------------------------------
+"Arenas" are fixed areas of OS memory from which we can allocate
+large blocks (>= MI_ARENA_MIN_OBJ_SIZE, half an arena block).
+In contrast to the rest of mimalloc, the arenas are shared between
+threads and need to be accessed using atomic operations.
+
+Arenas are also used for huge OS page (1GiB) reservations or for reserving
+OS memory upfront, which can improve performance or is sometimes needed
+on embedded devices. We can also employ this with WASI or `sbrk` systems
+to reserve large arenas upfront and be able to reuse the memory more effectively.
+
+The arena allocation needs to be thread safe and we use an atomic bitmap to allocate.
+-----------------------------------------------------------------------------*/
+
+#include "mimalloc.h"
+#include "mimalloc/internal.h"
+#include "mimalloc/atomic.h"
+#include "bitmap.h"
+
+
+/* -----------------------------------------------------------
+  Arena allocation
+----------------------------------------------------------- */
+
+// A memory arena descriptor
+typedef struct mi_arena_s {
+  mi_arena_id_t       id;                   // arena id; 0 for non-specific
+  mi_memid_t          memid;                // memid of the memory area
+  _Atomic(uint8_t*)   start;                // the start of the memory area
+  size_t              block_count;          // size of the area in arena blocks (of `MI_ARENA_BLOCK_SIZE`)
+  size_t              field_count;          // number of bitmap fields (where `field_count * MI_BITMAP_FIELD_BITS >= block_count`)
+  size_t              meta_size;            // size of the arena structure itself (including its bitmaps)
+  mi_memid_t          meta_memid;           // memid of the arena structure itself (OS or static allocation)
+  int                 numa_node;            // associated NUMA node
+  bool                exclusive;            // only allow allocations if specifically for this arena
+  bool                is_large;             // memory area consists of large- or huge OS pages (always committed)
+  mi_lock_t           abandoned_visit_lock; // lock is only used when abandoned segments are being visited
+  _Atomic(size_t)     search_idx;           // optimization to start the search for free blocks
+  _Atomic(mi_msecs_t) purge_expire;         // expiration time when blocks should be purged from `blocks_purge`.
+
+  mi_bitmap_field_t*  blocks_dirty;         // are the blocks potentially non-zero?
+  mi_bitmap_field_t*  blocks_committed;     // are the blocks committed? (can be NULL for memory that cannot be decommitted)
+  mi_bitmap_field_t*  blocks_purge;         // blocks that can be (reset) decommitted. (can be NULL for memory that cannot be (reset) decommitted)
+  mi_bitmap_field_t*  blocks_abandoned;     // blocks that start with an abandoned segment. (This crosses APIs but it is convenient to have here)
+  mi_bitmap_field_t   blocks_inuse[1];      // in-place bitmap of in-use blocks (of size `field_count`)
+  // do not add further fields here as the dirty, committed, purged, and abandoned bitmaps follow the inuse bitmap fields.
+} mi_arena_t;
+
+
+#define MI_ARENA_BLOCK_SIZE   (MI_SEGMENT_SIZE)        // 64MiB  (must be at least MI_SEGMENT_ALIGN)
+#define MI_ARENA_MIN_OBJ_SIZE (MI_ARENA_BLOCK_SIZE/2)  // 32MiB
+#define MI_MAX_ARENAS         (132)                    // Limited as the reservation exponentially increases (and takes up .bss)
+
+// The available arenas
+static mi_decl_cache_align _Atomic(mi_arena_t*) mi_arenas[MI_MAX_ARENAS];
+static mi_decl_cache_align _Atomic(size_t)      mi_arena_count; // = 0
+static mi_decl_cache_align _Atomic(int64_t)     mi_arenas_purge_expire; // set if there exist purgeable arenas
+
+#define MI_IN_ARENA_C
+#include "arena-abandon.c"
+#undef MI_IN_ARENA_C
+
+/* -----------------------------------------------------------
+  Arena id's
+  id = arena_index + 1
+----------------------------------------------------------- */
+
+size_t mi_arena_id_index(mi_arena_id_t id) {
+  return (size_t)(id <= 0 ? MI_MAX_ARENAS : id - 1);
+}
+
+static mi_arena_id_t mi_arena_id_create(size_t arena_index) {
+  mi_assert_internal(arena_index < MI_MAX_ARENAS);
+  return (int)arena_index + 1;
+}
+
+mi_arena_id_t _mi_arena_id_none(void) {
+  return 0;
+}
+
+static bool mi_arena_id_is_suitable(mi_arena_id_t arena_id, bool arena_is_exclusive, mi_arena_id_t req_arena_id) {
+  return ((!arena_is_exclusive && req_arena_id == _mi_arena_id_none()) ||
+          (arena_id == req_arena_id));
+}
+
+bool _mi_arena_memid_is_suitable(mi_memid_t memid, mi_arena_id_t request_arena_id) {
+  if (memid.memkind == MI_MEM_ARENA) {
+    return mi_arena_id_is_suitable(memid.mem.arena.id, memid.mem.arena.is_exclusive, request_arena_id);
+  }
+  else {
+    return mi_arena_id_is_suitable(_mi_arena_id_none(), false, request_arena_id);
+  }
+}
+
+bool _mi_arena_memid_is_os_allocated(mi_memid_t memid) {
+  return (memid.memkind == MI_MEM_OS);
+}
+
+size_t mi_arena_get_count(void) {
+  return mi_atomic_load_relaxed(&mi_arena_count);
+}
+
+mi_arena_t* mi_arena_from_index(size_t idx) {
+  mi_assert_internal(idx < mi_arena_get_count());
+  return mi_atomic_load_ptr_acquire(mi_arena_t, &mi_arenas[idx]);
+}
+
+
+/* -----------------------------------------------------------
+  Arena allocations get a (currently) 16-bit memory id where the
+  lower 8 bits are the arena id, and the upper bits the block index.
+----------------------------------------------------------- */
+
+static size_t mi_block_count_of_size(size_t size) {
+  return _mi_divide_up(size, MI_ARENA_BLOCK_SIZE);
+}
+
+static size_t mi_arena_block_size(size_t bcount) {
+  return (bcount * MI_ARENA_BLOCK_SIZE);
+}
+
+static size_t mi_arena_size(mi_arena_t* arena) {
+  return mi_arena_block_size(arena->block_count);
+}
+
+static mi_memid_t mi_memid_create_arena(mi_arena_id_t id, bool is_exclusive, mi_bitmap_index_t bitmap_index) {
+  mi_memid_t memid = _mi_memid_create(MI_MEM_ARENA);
+  memid.mem.arena.id = id;
+  memid.mem.arena.block_index = bitmap_index;
+  memid.mem.arena.is_exclusive = is_exclusive;
+  return memid;
+}
+
+bool mi_arena_memid_indices(mi_memid_t memid, size_t* arena_index, mi_bitmap_index_t* bitmap_index) {
+  mi_assert_internal(memid.memkind == MI_MEM_ARENA);
+  *arena_index = mi_arena_id_index(memid.mem.arena.id);
+  *bitmap_index = memid.mem.arena.block_index;
+  return memid.mem.arena.is_exclusive;
+}
+
+
+
+/* -----------------------------------------------------------
+  Special static area for mimalloc internal structures
+  to avoid OS calls (for example, for the arena metadata (~= 256b))
+----------------------------------------------------------- */
+
+#define MI_ARENA_STATIC_MAX  ((MI_INTPTR_SIZE/2)*MI_KiB)  // 4 KiB on 64-bit
+
+static mi_decl_cache_align uint8_t mi_arena_static[MI_ARENA_STATIC_MAX];  // must be cache aligned, see issue #895
+static mi_decl_cache_align _Atomic(size_t) mi_arena_static_top;
+
+static void* mi_arena_static_zalloc(size_t size, size_t alignment, mi_memid_t* memid) {
+  *memid = _mi_memid_none();
+  if (size == 0 || size > MI_ARENA_STATIC_MAX) return NULL;
+  const size_t toplow = mi_atomic_load_relaxed(&mi_arena_static_top);
+  if ((toplow + size) > MI_ARENA_STATIC_MAX) return NULL;
+
+  // try to claim space
+  if (alignment < MI_MAX_ALIGN_SIZE) { alignment = MI_MAX_ALIGN_SIZE; }
+  const size_t oversize = size + alignment - 1;
+  if (toplow + oversize > MI_ARENA_STATIC_MAX) return NULL;
+  const size_t oldtop = mi_atomic_add_acq_rel(&mi_arena_static_top, oversize);
+  size_t top = oldtop + oversize;
+  if (top > MI_ARENA_STATIC_MAX) {
+    // try to roll back, ok if this fails
+    mi_atomic_cas_strong_acq_rel(&mi_arena_static_top, &top, oldtop);
+    return NULL;
+  }
+
+  // success
+  *memid = _mi_memid_create(MI_MEM_STATIC);
+  memid->initially_zero = true;
+  const size_t start = _mi_align_up(oldtop, alignment);
+  uint8_t* const p = &mi_arena_static[start];
+  _mi_memzero_aligned(p, size);
+  return p;
+}
+
+void* _mi_arena_meta_zalloc(size_t size, mi_memid_t* memid) {
+  *memid = _mi_memid_none();
+
+  // try static
+  void* p = mi_arena_static_zalloc(size, MI_MAX_ALIGN_SIZE, memid);
+  if (p != NULL) return p;
+
+  // or fall back to the OS
+  p = _mi_os_zalloc(size, memid);
+  if (p == NULL) return NULL;
+
+  return p;
+}
+
+void _mi_arena_meta_free(void* p, mi_memid_t memid, size_t size) {
+  if (mi_memkind_is_os(memid.memkind)) {
+    _mi_os_free(p, size, memid);
+  }
+  else {
+    mi_assert(memid.memkind == MI_MEM_STATIC);
+  }
+}
+
+void* mi_arena_block_start(mi_arena_t* arena, mi_bitmap_index_t bindex) {
+  return (arena->start + mi_arena_block_size(mi_bitmap_index_bit(bindex)));
+}
+
+
+/* -----------------------------------------------------------
+  Thread safe allocation in an arena
+----------------------------------------------------------- */
+
+// claim the `blocks_inuse` bits
+static bool mi_arena_try_claim(mi_arena_t* arena, size_t blocks, mi_bitmap_index_t* bitmap_idx)
+{
+  size_t idx = 0; // mi_atomic_load_relaxed(&arena->search_idx);  // start from last search; ok to be relaxed as the exact start does not matter
+  if (_mi_bitmap_try_find_from_claim_across(arena->blocks_inuse, arena->field_count, idx, blocks, bitmap_idx)) {
+    mi_atomic_store_relaxed(&arena->search_idx, mi_bitmap_index_field(*bitmap_idx));  // start search from found location next time around
+    return true;
+  }
+  return false;
+}
+
+
+/* -----------------------------------------------------------
+  Arena Allocation
+----------------------------------------------------------- */
+
+static mi_decl_noinline void* mi_arena_try_alloc_at(mi_arena_t* arena, size_t arena_index, size_t needed_bcount,
+                                                    bool commit, mi_memid_t* memid)
+{
+  MI_UNUSED(arena_index);
+  mi_assert_internal(mi_arena_id_index(arena->id) == arena_index);
+
+  mi_bitmap_index_t bitmap_index;
+  if (!mi_arena_try_claim(arena, needed_bcount, &bitmap_index)) return NULL;
+
+  // claimed it!
+  void* p = mi_arena_block_start(arena, bitmap_index);
+  *memid = mi_memid_create_arena(arena->id, arena->exclusive, bitmap_index);
+  memid->is_pinned = arena->memid.is_pinned;
+
+  // none of the claimed blocks should be scheduled for a decommit
+  if (arena->blocks_purge != NULL) {
+    // this is thread safe as a potential purge only decommits parts that are not yet claimed as used (in `blocks_inuse`).
+    _mi_bitmap_unclaim_across(arena->blocks_purge, arena->field_count, needed_bcount, bitmap_index);
+  }
+
+  // set the dirty bits (todo: no need for an atomic op here?)
+  if (arena->memid.initially_zero && arena->blocks_dirty != NULL) {
+    memid->initially_zero = _mi_bitmap_claim_across(arena->blocks_dirty, arena->field_count, needed_bcount, bitmap_index, NULL, NULL);
+  }
+
+  // set commit state
+  if (arena->blocks_committed == NULL) {
+    // always committed
+    memid->initially_committed = true;
+  }
+  else if (commit) {
+    // commit requested, but the range may not be committed as a whole: ensure it is committed now
+    memid->initially_committed = true;
+    const size_t commit_size = mi_arena_block_size(needed_bcount);
+    bool any_uncommitted;
+    size_t already_committed = 0;
+    _mi_bitmap_claim_across(arena->blocks_committed, arena->field_count, needed_bcount, bitmap_index, &any_uncommitted, &already_committed);
+    if (any_uncommitted) {
+      mi_assert_internal(already_committed < needed_bcount);
+      const size_t stat_commit_size = commit_size - mi_arena_block_size(already_committed);
+      bool commit_zero = false;
+      if (!_mi_os_commit_ex(p, commit_size, &commit_zero, stat_commit_size)) {
+        memid->initially_committed = false;
+      }
+      else {
+        if (commit_zero) { memid->initially_zero = true; }
+      }
+    }
+    else {
+      // all are already committed: signal that we are reusing memory in case it was purged before
+      _mi_os_reuse( p, commit_size );
+    }
+  }
+  else {
+    // no need to commit, but check if already fully committed
+    size_t already_committed = 0;
+    memid->initially_committed = _mi_bitmap_is_claimed_across(arena->blocks_committed, arena->field_count, needed_bcount, bitmap_index, &already_committed);
+    if (!memid->initially_committed && already_committed > 0) {
+      // partially committed: as it will be committed at some time, adjust the stats and pretend the range is fully uncommitted.
+      mi_assert_internal(already_committed < needed_bcount);
+      _mi_stat_decrease(&_mi_stats_main.committed, mi_arena_block_size(already_committed));
+      _mi_bitmap_unclaim_across(arena->blocks_committed, arena->field_count, needed_bcount, bitmap_index);
+    }
+  }
+
+  return p;
+}
+
+// allocate in a specific arena
+static void* mi_arena_try_alloc_at_id(mi_arena_id_t arena_id, bool match_numa_node, int numa_node, size_t size, size_t alignment,
+                                       bool commit, bool allow_large, mi_arena_id_t req_arena_id, mi_memid_t* memid )
+{
+  MI_UNUSED_RELEASE(alignment);
+  mi_assert(alignment <= MI_SEGMENT_ALIGN);
+  const size_t bcount = mi_block_count_of_size(size);
+  const size_t arena_index = mi_arena_id_index(arena_id);
+  mi_assert_internal(arena_index < mi_atomic_load_relaxed(&mi_arena_count));
+  mi_assert_internal(size <= mi_arena_block_size(bcount));
+
+  // Check arena suitability
+  mi_arena_t* arena = mi_arena_from_index(arena_index);
+  if (arena == NULL) return NULL;
+  if (!allow_large && arena->is_large) return NULL;
+  if (!mi_arena_id_is_suitable(arena->id, arena->exclusive, req_arena_id)) return NULL;
+  if (req_arena_id == _mi_arena_id_none()) { // if not specific, check numa affinity
+    const bool numa_suitable = (numa_node < 0 || arena->numa_node < 0 || arena->numa_node == numa_node);
+    if (match_numa_node) { if (!numa_suitable) return NULL; }
+                    else { if (numa_suitable) return NULL; }
+  }
+
+  // try to allocate
+  void* p = mi_arena_try_alloc_at(arena, arena_index, bcount, commit, memid);
+  mi_assert_internal(p == NULL || _mi_is_aligned(p, alignment));
+  return p;
+}
+
+
+// allocate from an arena with fallback to the OS
+static mi_decl_noinline void* mi_arena_try_alloc(int numa_node, size_t size, size_t alignment,
+                                                  bool commit, bool allow_large,
+                                                  mi_arena_id_t req_arena_id, mi_memid_t* memid )
+{
+  MI_UNUSED(alignment);
+  mi_assert_internal(alignment <= MI_SEGMENT_ALIGN);
+  const size_t max_arena = mi_atomic_load_relaxed(&mi_arena_count);
+  if mi_likely(max_arena == 0) return NULL;
+
+  if (req_arena_id != _mi_arena_id_none()) {
+    // try a specific arena if requested
+    if (mi_arena_id_index(req_arena_id) < max_arena) {
+      void* p = mi_arena_try_alloc_at_id(req_arena_id, true, numa_node, size, alignment, commit, allow_large, req_arena_id, memid);
+      if (p != NULL) return p;
+    }
+  }
+  else {
+    // try numa affine allocation
+    for (size_t i = 0; i < max_arena; i++) {
+      void* p = mi_arena_try_alloc_at_id(mi_arena_id_create(i), true, numa_node, size, alignment, commit, allow_large, req_arena_id, memid);
+      if (p != NULL) return p;
+    }
+
+    // try from another numa node instead..
+    if (numa_node >= 0) {  // if numa_node was < 0 (no specific affinity requested), all arenas have been tried already
+      for (size_t i = 0; i < max_arena; i++) {
+        void* p = mi_arena_try_alloc_at_id(mi_arena_id_create(i), false /* only proceed if not numa local */, numa_node, size, alignment, commit, allow_large, req_arena_id, memid);
+        if (p != NULL) return p;
+      }
+    }
+  }
+  return NULL;
+}
+
+// try to reserve a fresh arena space
+static bool mi_arena_reserve(size_t req_size, bool allow_large, mi_arena_id_t *arena_id)
+{
+  if (_mi_preloading()) return false;  // use OS only while preloading
+
+  const size_t arena_count = mi_atomic_load_acquire(&mi_arena_count);
+  if (arena_count > (MI_MAX_ARENAS - 4)) return false;
+
+  size_t arena_reserve = mi_option_get_size(mi_option_arena_reserve);
+  if (arena_reserve == 0) return false;
+
+  if (!_mi_os_has_virtual_reserve()) {
+    arena_reserve = arena_reserve/4;  // be conservative if virtual reserve is not supported (for WASM for example)
+  }
+  arena_reserve = _mi_align_up(arena_reserve, MI_ARENA_BLOCK_SIZE);
+  arena_reserve = _mi_align_up(arena_reserve, MI_SEGMENT_SIZE);
+  if (arena_count >= 8 && arena_count <= 128) {
+    // scale up the arena sizes exponentially every 8 entries (128 entries get to 589TiB)
+    const size_t multiplier = (size_t)1 << _mi_clamp(arena_count/8, 0, 16 );
+    size_t reserve = 0;
+    if (!mi_mul_overflow(multiplier, arena_reserve, &reserve)) {
+      arena_reserve = reserve;
+    }
+  }
+  if (arena_reserve < req_size) return false;  // should be able to at least handle the current allocation size
+
+  // commit eagerly?
+  bool arena_commit = false;
+  if (mi_option_get(mi_option_arena_eager_commit) == 2)      { arena_commit = _mi_os_has_overcommit(); }
+  else if (mi_option_get(mi_option_arena_eager_commit) == 1) { arena_commit = true; }
+
+  return (mi_reserve_os_memory_ex(arena_reserve, arena_commit, allow_large, false /* exclusive? */, arena_id) == 0);
+}
+
+
+void* _mi_arena_alloc_aligned(size_t size, size_t alignment, size_t align_offset, bool commit, bool allow_large,
+                              mi_arena_id_t req_arena_id, mi_memid_t* memid)
+{
+  mi_assert_internal(memid != NULL);
+  mi_assert_internal(size > 0);
+  *memid = _mi_memid_none();
+
+  const int numa_node = _mi_os_numa_node(); // current numa node
+
+  // try to allocate in an arena if the alignment is small enough and the object is not too small (as for heap meta data)
+  if (!mi_option_is_enabled(mi_option_disallow_arena_alloc)) {  // is arena allocation allowed?
+    if (size >= MI_ARENA_MIN_OBJ_SIZE && alignment <= MI_SEGMENT_ALIGN && align_offset == 0)
+    {
+      void* p = mi_arena_try_alloc(numa_node, size, alignment, commit, allow_large, req_arena_id, memid);
+      if (p != NULL) return p;
+
+      // otherwise, try to first eagerly reserve a new arena
+      if (req_arena_id == _mi_arena_id_none()) {
+        mi_arena_id_t arena_id = 0;
+        if (mi_arena_reserve(size, allow_large, &arena_id)) {
+          // and try to allocate in there
+          mi_assert_internal(req_arena_id == _mi_arena_id_none());
+          p = mi_arena_try_alloc_at_id(arena_id, true, numa_node, size, alignment, commit, allow_large, req_arena_id, memid);
+          if (p != NULL) return p;
+        }
+      }
+    }
+  }
+
+  // if we cannot use OS allocation, return NULL
+  if (mi_option_is_enabled(mi_option_disallow_os_alloc) || req_arena_id != _mi_arena_id_none()) {
+    errno = ENOMEM;
+    return NULL;
+  }
+
+  // finally, fall back to the OS
+  if (align_offset > 0) {
+    return _mi_os_alloc_aligned_at_offset(size, alignment, align_offset, commit, allow_large, memid);
+  }
+  else {
+    return _mi_os_alloc_aligned(size, alignment, commit, allow_large, memid);
+  }
+}
+
+void* _mi_arena_alloc(size_t size, bool commit, bool allow_large, mi_arena_id_t req_arena_id, mi_memid_t* memid)
+{
+  return _mi_arena_alloc_aligned(size, MI_ARENA_BLOCK_SIZE, 0, commit, allow_large, req_arena_id, memid);
+}
+
+
+void* mi_arena_area(mi_arena_id_t arena_id, size_t* size) {
+  if (size != NULL) *size = 0;
+  size_t arena_index = mi_arena_id_index(arena_id);
+  if (arena_index >= MI_MAX_ARENAS) return NULL;
+  mi_arena_t* arena = mi_atomic_load_ptr_acquire(mi_arena_t, &mi_arenas[arena_index]);
+  if (arena == NULL) return NULL;
+  if (size != NULL) { *size = mi_arena_block_size(arena->block_count); }
+  return arena->start;
+}
+
+
+/* -----------------------------------------------------------
+  Arena purge
+----------------------------------------------------------- */
+
+static long mi_arena_purge_delay(void) {
+  // <0 = no purging allowed, 0=immediate purging, >0=milli-second delay
+  return (mi_option_get(mi_option_purge_delay) * mi_option_get(mi_option_arena_purge_mult));
+}
+
+// reset or decommit in an arena and update the committed/decommit bitmaps
+// assumes we own the area (i.e. `blocks_inuse` is claimed by us)
+static void mi_arena_purge(mi_arena_t* arena, size_t bitmap_idx, size_t blocks) {
+  mi_assert_internal(arena->blocks_committed != NULL);
+  mi_assert_internal(arena->blocks_purge != NULL);
+  mi_assert_internal(!arena->memid.is_pinned);
+  const size_t size = mi_arena_block_size(blocks);
+  void* const p = mi_arena_block_start(arena, bitmap_idx);
+  bool needs_recommit;
+  size_t already_committed = 0;
+  if (_mi_bitmap_is_claimed_across(arena->blocks_committed, arena->field_count, blocks, bitmap_idx, &already_committed)) {
+    // all blocks are committed, we can purge freely
+    mi_assert_internal(already_committed == blocks);
+    needs_recommit = _mi_os_purge(p, size);
+  }
+  else {
+    // some blocks are not committed -- this can happen when a partially committed block is freed
+    // in `_mi_arena_free` and it is conservatively marked as uncommitted but still scheduled for a purge
+    // we need to ensure we do not try to reset (as that may be invalid for uncommitted memory).
+    mi_assert_internal(already_committed < blocks);
+    mi_assert_internal(mi_option_is_enabled(mi_option_purge_decommits));
+    needs_recommit = _mi_os_purge_ex(p, size, false /* allow reset? */, mi_arena_block_size(already_committed));
+  }
+
+  // clear the purged blocks
+  _mi_bitmap_unclaim_across(arena->blocks_purge, arena->field_count, blocks, bitmap_idx);
+  // update committed bitmap
+  if (needs_recommit) {
+    _mi_bitmap_unclaim_across(arena->blocks_committed, arena->field_count, blocks, bitmap_idx);
+  }
+}
+
+// Schedule a purge. This is usually delayed to avoid repeated decommit/commit calls.
+// Note: assumes we (still) own the area as we may purge immediately
+static void mi_arena_schedule_purge(mi_arena_t* arena, size_t bitmap_idx, size_t blocks) {
+  mi_assert_internal(arena->blocks_purge != NULL);
+  const long delay = mi_arena_purge_delay();
+  if (delay < 0) return;  // is purging allowed at all?
+
+  if (_mi_preloading() || delay == 0) {
+    // decommit directly
+    mi_arena_purge(arena, bitmap_idx, blocks);
+  }
+  else {
+    // schedule purge
+    const mi_msecs_t expire = _mi_clock_now() + delay;
+    mi_msecs_t expire0 = 0;
+    if (mi_atomic_casi64_strong_acq_rel(&arena->purge_expire, &expire0, expire)) {
+      // expiration was not yet set
+      // maybe set the global arenas expire as well (if it wasn't set already)
+      mi_atomic_casi64_strong_acq_rel(&mi_arenas_purge_expire, &expire0, expire);
+    }
+    else {
+      // an expiration was already set
+    }
+    _mi_bitmap_claim_across(arena->blocks_purge, arena->field_count, blocks, bitmap_idx, NULL, NULL);
+  }
+}
+
+// purge a range of blocks
+// return true if the full range was purged.
+// assumes we own the area (i.e. blocks_in_use is claimed by us)
+static bool mi_arena_purge_range(mi_arena_t* arena, size_t idx, size_t startidx, size_t bitlen, size_t purge) {
+  const size_t endidx = startidx + bitlen;
+  size_t bitidx = startidx;
+  bool all_purged = false;
+  while (bitidx < endidx) {
+    // count consecutive ones in the purge mask
+    size_t count = 0;
+    while (bitidx + count < endidx && (purge & ((size_t)1 << (bitidx + count))) != 0) {
+      count++;
+    }
+    if (count > 0) {
+      // found range to be purged
+      const mi_bitmap_index_t range_idx = mi_bitmap_index_create(idx, bitidx);
+      mi_arena_purge(arena, range_idx, count);
+      if (count == bitlen) {
+        all_purged = true;
+      }
+    }
+    bitidx += (count+1); // +1 to skip the zero bit (or end)
+  }
+  return all_purged;
+}
+
+// returns true if anything was purged
+static bool mi_arena_try_purge(mi_arena_t* arena, mi_msecs_t now, bool force)
+{
+  // check pre-conditions
+  if (arena->memid.is_pinned) return false;
+
+  // expired yet?
+  mi_msecs_t expire = mi_atomic_loadi64_relaxed(&arena->purge_expire);
+  if (!force && (expire == 0 || expire > now)) return false;
+
+  // reset expire (if not already set concurrently)
+  mi_atomic_casi64_strong_acq_rel(&arena->purge_expire, &expire, (mi_msecs_t)0);
+  _mi_stat_counter_increase(&_mi_stats_main.arena_purges, 1);
+
+  // potential purges scheduled, walk through the bitmap
+  bool any_purged = false;
+  bool full_purge = true;
+  for (size_t i = 0; i < arena->field_count; i++) {
+    size_t purge = mi_atomic_load_relaxed(&arena->blocks_purge[i]);
+    if (purge != 0) {
+      size_t bitidx = 0;
+      while (bitidx < MI_BITMAP_FIELD_BITS) {
+        // find consecutive range of ones in the purge mask
+        size_t bitlen = 0;
+        while (bitidx + bitlen < MI_BITMAP_FIELD_BITS && (purge & ((size_t)1 << (bitidx + bitlen))) != 0) {
+          bitlen++;
+        }
+        // temporarily claim the purge range as "in-use" to be thread-safe with allocation
+        // try to claim the longest range of corresponding in_use bits
+        const mi_bitmap_index_t bitmap_index = mi_bitmap_index_create(i, bitidx);
+        while( bitlen > 0 ) {
+          if (_mi_bitmap_try_claim(arena->blocks_inuse, arena->field_count, bitlen, bitmap_index)) {
+            break;
+          }
+          bitlen--;
+        }
+        // actual claimed bits at `in_use`
+        if (bitlen > 0) {
+          // read purge again now that we have the in_use bits
+          purge = mi_atomic_load_acquire(&arena->blocks_purge[i]);
+          if (!mi_arena_purge_range(arena, i, bitidx, bitlen, purge)) {
+            full_purge = false;
+          }
+          any_purged = true;
+          // release the claimed `in_use` bits again
+          _mi_bitmap_unclaim(arena->blocks_inuse, arena->field_count, bitlen, bitmap_index);
+        }
+        bitidx += (bitlen+1);  // +1 to skip the zero (or end)
+      } // while bitidx
+    } // purge != 0
+  }
+  // if not fully purged, make sure to purge again in the future
+  if (!full_purge) {
+    const long delay = mi_arena_purge_delay();
+    mi_msecs_t expected = 0;
+    mi_atomic_casi64_strong_acq_rel(&arena->purge_expire,&expected,_mi_clock_now() + delay);
+  }
+  return any_purged;
+}
+
+static void mi_arenas_try_purge( bool force, bool visit_all )
+{
+  if (_mi_preloading() || mi_arena_purge_delay() <= 0) return;  // nothing will be scheduled
+
+  // check if any arena needs purging?
+  const mi_msecs_t now = _mi_clock_now();
+  mi_msecs_t arenas_expire = mi_atomic_loadi64_acquire(&mi_arenas_purge_expire);
+  if (!force && (arenas_expire == 0 || arenas_expire < now)) return;
+
+  const size_t max_arena = mi_atomic_load_acquire(&mi_arena_count);
+  if (max_arena == 0) return;
+
+  // allow only one thread to purge at a time
+  static mi_atomic_guard_t purge_guard;
+  mi_atomic_guard(&purge_guard)
+  {
+    // increase global expire: at most one purge per delay cycle
+    mi_atomic_storei64_release(&mi_arenas_purge_expire, now + mi_arena_purge_delay());
+    size_t max_purge_count = (visit_all ? max_arena : 2);
+    bool all_visited = true;
+    for (size_t i = 0; i < max_arena; i++) {
+      mi_arena_t* arena = mi_atomic_load_ptr_acquire(mi_arena_t, &mi_arenas[i]);
+      if (arena != NULL) {
+        if (mi_arena_try_purge(arena, now, force)) {
+          if (max_purge_count <= 1) {
+            all_visited = false;
+            break;
+          }
+          max_purge_count--;
+        }
+      }
+    }
+    if (all_visited) {
+      // all arenas were visited and purged: reset global expire
+      mi_atomic_storei64_release(&mi_arenas_purge_expire, 0);
+    }
+  }
+}
+
+
+/* -----------------------------------------------------------
+  Arena free
+----------------------------------------------------------- */
+
+void _mi_arena_free(void* p, size_t size, size_t committed_size, mi_memid_t memid) {
+  mi_assert_internal(size > 0);
+  mi_assert_internal(committed_size <= size);
+  if (p==NULL) return;
+  if (size==0) return;
+  const bool all_committed = (committed_size == size);
+  const size_t decommitted_size = (committed_size <= size ? size - committed_size : 0);
+
+  // need to set all memory to undefined as some parts may still be marked as no_access (like padding etc.)
+  mi_track_mem_undefined(p,size);
+
+  if (mi_memkind_is_os(memid.memkind)) {
+    // was a direct OS allocation, pass through
+    if (!all_committed && decommitted_size > 0) {
+      // if partially committed, adjust the committed stats (as `_mi_os_free` will decrease commit by the full size)
+      _mi_stat_increase(&_mi_stats_main.committed, decommitted_size);
+    }
+    _mi_os_free(p, size, memid);
+  }
+  else if (memid.memkind == MI_MEM_ARENA) {
+    // allocated in an arena
+    size_t arena_idx;
+    size_t bitmap_idx;
+    mi_arena_memid_indices(memid, &arena_idx, &bitmap_idx);
+    mi_assert_internal(arena_idx < MI_MAX_ARENAS);
+    mi_arena_t* arena = mi_atomic_load_ptr_acquire(mi_arena_t,&mi_arenas[arena_idx]);
+    mi_assert_internal(arena != NULL);
+    const size_t blocks = mi_block_count_of_size(size);
+
+    // checks
+    if (arena == NULL) {
+      _mi_error_message(EINVAL, "trying to free from an invalid arena: %p, size %zu, memid: 0x%zx\n", p, size, memid);
+      return;
+    }
+    mi_assert_internal(arena->field_count > mi_bitmap_index_field(bitmap_idx));
+    if (arena->field_count <= mi_bitmap_index_field(bitmap_idx)) {
+      _mi_error_message(EINVAL, "trying to free from an invalid arena block: %p, size %zu, memid: 0x%zx\n", p, size, memid);
+      return;
+    }
+
+    // potentially decommit
+    if (arena->memid.is_pinned || arena->blocks_committed == NULL) {
+      mi_assert_internal(all_committed);
+    }
+    else {
+      mi_assert_internal(arena->blocks_committed != NULL);
+      mi_assert_internal(arena->blocks_purge != NULL);
+
+      if (!all_committed) {
+        // mark the entire range as no longer committed (so we will recommit the full range when re-using)
+        _mi_bitmap_unclaim_across(arena->blocks_committed, arena->field_count, blocks, bitmap_idx);
+        mi_track_mem_noaccess(p,size);
+        //if (committed_size > 0) {
+          // if partially committed, adjust the committed stats (as it will be recommitted when re-using)
+          // in the delayed purge, we no longer decrease the commit if the range is not marked entirely as committed.
+          _mi_stat_decrease(&_mi_stats_main.committed, committed_size);
+        //}
+        // note: if not all committed, it may be that the purge will reset/decommit the entire range
+        // that contains already decommitted parts. Since purge consistently uses reset or decommit that
+        // works (as we should never reset decommitted parts).
+      }
+      // (delay) purge the entire range
+      mi_arena_schedule_purge(arena, bitmap_idx, blocks);
+    }
+
+    // and make it available to others again
+    bool all_inuse = _mi_bitmap_unclaim_across(arena->blocks_inuse, arena->field_count, blocks, bitmap_idx);
+    if (!all_inuse) {
+      _mi_error_message(EAGAIN, "trying to free an already freed arena block: %p, size %zu\n", p, size);
+      return;
+    };
+  }
+  else {
+    // arena was none, external, or static; nothing to do
+    mi_assert_internal(memid.memkind < MI_MEM_OS);
+  }
+
+  // purge expired decommits
+  mi_arenas_try_purge(false, false);
+}
+
+// destroy owned arenas; this is unsafe and should only be done using `mi_option_destroy_on_exit`
+// for dynamic libraries that are unloaded and need to release all their allocated memory.
+static void mi_arenas_unsafe_destroy(void) {
+  const size_t max_arena = mi_atomic_load_relaxed(&mi_arena_count);
+  size_t new_max_arena = 0;
+  for (size_t i = 0; i < max_arena; i++) {
+    mi_arena_t* arena = mi_atomic_load_ptr_acquire(mi_arena_t, &mi_arenas[i]);
+    if (arena != NULL) {
+      mi_lock_done(&arena->abandoned_visit_lock);
+      if (arena->start != NULL && mi_memkind_is_os(arena->memid.memkind)) {
+        mi_atomic_store_ptr_release(mi_arena_t, &mi_arenas[i], NULL);
+        _mi_os_free(arena->start, mi_arena_size(arena), arena->memid);
+      }
+      else {
+        new_max_arena = i;
+      }
+      _mi_arena_meta_free(arena, arena->meta_memid, arena->meta_size);
+    }
+  }
+
+  // try to lower the max arena.
+  size_t expected = max_arena;
+  mi_atomic_cas_strong_acq_rel(&mi_arena_count, &expected, new_max_arena);
+}
+
+// Purge the arenas; if `force_purge` is true, amenable parts are purged even if not yet expired
+void _mi_arenas_collect(bool force_purge) {
+  mi_arenas_try_purge(force_purge, force_purge /* visit all? */);
+}
+
+// destroy owned arenas; this is unsafe and should only be done using `mi_option_destroy_on_exit`
+// for dynamic libraries that are unloaded and need to release all their allocated memory.
+void _mi_arena_unsafe_destroy_all(void) {
+  mi_arenas_unsafe_destroy();
+  _mi_arenas_collect(true /* force purge */);  // purge non-owned arenas
+}
+
+// Is a pointer inside any of our arenas?
+bool _mi_arena_contains(const void* p) {
+  const size_t max_arena = mi_atomic_load_relaxed(&mi_arena_count);
+  for (size_t i = 0; i < max_arena; i++) {
+    mi_arena_t* arena = mi_atomic_load_ptr_relaxed(mi_arena_t, &mi_arenas[i]);
+    if (arena != NULL && arena->start <= (const uint8_t*)p && arena->start + mi_arena_block_size(arena->block_count) > (const uint8_t*)p) {
+      return true;
+    }
+  }
+  return false;
+}
+
+/* -----------------------------------------------------------
+  Add an arena.
+----------------------------------------------------------- */
+
+static bool mi_arena_add(mi_arena_t* arena, mi_arena_id_t* arena_id, mi_stats_t* stats) {
+  mi_assert_internal(arena != NULL);
+  mi_assert_internal((uintptr_t)mi_atomic_load_ptr_relaxed(uint8_t,&arena->start) % MI_SEGMENT_ALIGN == 0);
+  mi_assert_internal(arena->block_count > 0);
+  if (arena_id != NULL) { *arena_id = -1; }
+
+  size_t i = mi_atomic_load_relaxed(&mi_arena_count);
+  while (i < MI_MAX_ARENAS) {
+    if (mi_atomic_cas_strong_acq_rel(&mi_arena_count, &i, i+1)) {
+      _mi_stat_counter_increase(&stats->arena_count, 1);
+      arena->id = mi_arena_id_create(i);
+      mi_atomic_store_ptr_release(mi_arena_t, &mi_arenas[i], arena);
+      if (arena_id != NULL) { *arena_id = arena->id; }
+      return true;
+    }
+  }
+
+  return false;
+}
+
+static bool mi_manage_os_memory_ex2(void* start, size_t size, bool is_large, int numa_node, bool exclusive, mi_memid_t memid, mi_arena_id_t* arena_id) mi_attr_noexcept
+{
+  if (arena_id != NULL) *arena_id = _mi_arena_id_none();
+  if (size < MI_ARENA_BLOCK_SIZE) {
+    _mi_warning_message("the arena size is too small (memory at %p with size %zu)\n", start, size);
+    return false;
+  }
+  if (is_large) {
+    mi_assert_internal(memid.initially_committed && memid.is_pinned);
+  }
+  if (!_mi_is_aligned(start, MI_SEGMENT_ALIGN)) {
+    void* const aligned_start = mi_align_up_ptr(start, MI_SEGMENT_ALIGN);
+    const size_t diff = (uint8_t*)aligned_start - (uint8_t*)start;
+    if (diff >= size || (size - diff) < MI_ARENA_BLOCK_SIZE) {
+      _mi_warning_message("after alignment, the size of the arena becomes too small (memory at %p with size %zu)\n", start, size);
+      return false;
+    }
+    start = aligned_start;
+    size = size - diff;
+  }
+
+  const size_t bcount = size / MI_ARENA_BLOCK_SIZE;
+  const size_t fields = _mi_divide_up(bcount, MI_BITMAP_FIELD_BITS);
+  const size_t bitmaps = (memid.is_pinned ? 3 : 5);
+  const size_t asize  = sizeof(mi_arena_t) + (bitmaps*fields*sizeof(mi_bitmap_field_t));
+  mi_memid_t meta_memid;
+  mi_arena_t* arena   = (mi_arena_t*)_mi_arena_meta_zalloc(asize, &meta_memid);
+  if (arena == NULL) return false;
+
+  // already zero'd due to zalloc
+  // _mi_memzero(arena, asize);
+  arena->id = _mi_arena_id_none();
+  arena->memid = memid;
+  arena->exclusive = exclusive;
+  arena->meta_size = asize;
+  arena->meta_memid = meta_memid;
+  arena->block_count = bcount;
+  arena->field_count = fields;
+  arena->start = (uint8_t*)start;
+  arena->numa_node    = numa_node; // TODO: or get the current numa node if -1? (now it allows anyone to allocate on -1)
+  arena->is_large     = is_large;
+  arena->purge_expire = 0;
+  arena->search_idx   = 0;
+  mi_lock_init(&arena->abandoned_visit_lock);
+  // consecutive bitmaps
+  arena->blocks_dirty     = &arena->blocks_inuse[fields];     // just after inuse bitmap
+  arena->blocks_abandoned = &arena->blocks_inuse[2 * fields]; // just after dirty bitmap
+  arena->blocks_committed = (arena->memid.is_pinned ? NULL : &arena->blocks_inuse[3*fields]); // just after abandoned bitmap
+  arena->blocks_purge     = (arena->memid.is_pinned ? NULL : &arena->blocks_inuse[4*fields]); // just after committed bitmap
+  // initialize committed bitmap?
+  if (arena->blocks_committed != NULL && arena->memid.initially_committed) {
+    memset((void*)arena->blocks_committed, 0xFF, fields*sizeof(mi_bitmap_field_t)); // cast to void* to avoid atomic warning
+  }
+
+  // and claim leftover blocks if needed (so we never allocate there)
+  ptrdiff_t post = (fields * MI_BITMAP_FIELD_BITS) - bcount;
+  mi_assert_internal(post >= 0);
+  if (post > 0) {
+    // don't use leftover bits at the end
+    mi_bitmap_index_t postidx = mi_bitmap_index_create(fields - 1, MI_BITMAP_FIELD_BITS - post);
+    _mi_bitmap_claim(arena->blocks_inuse, fields, post, postidx, NULL);
+  }
+  return mi_arena_add(arena, arena_id, &_mi_stats_main);
+
+}
+
+bool mi_manage_os_memory_ex(void* start, size_t size, bool is_committed, bool is_large, bool is_zero, int numa_node, bool exclusive, mi_arena_id_t* arena_id) mi_attr_noexcept {
+  mi_memid_t memid = _mi_memid_create(MI_MEM_EXTERNAL);
+  memid.initially_committed = is_committed;
+  memid.initially_zero = is_zero;
+  memid.is_pinned = is_large;
+  return mi_manage_os_memory_ex2(start,size,is_large,numa_node,exclusive,memid, arena_id);
+}
+
+// Reserve a range of regular OS memory
+int mi_reserve_os_memory_ex(size_t size, bool commit, bool allow_large, bool exclusive, mi_arena_id_t* arena_id) mi_attr_noexcept {
+  if (arena_id != NULL) *arena_id = _mi_arena_id_none();
+  size = _mi_align_up(size, MI_ARENA_BLOCK_SIZE); // at least one block
+  mi_memid_t memid;
+  void* start = _mi_os_alloc_aligned(size, MI_SEGMENT_ALIGN, commit, allow_large, &memid);
+  if (start == NULL) return ENOMEM;
+  const bool is_large = memid.is_pinned; // todo: use separate is_large field?
+  if (!mi_manage_os_memory_ex2(start, size, is_large, -1 /* numa node */, exclusive, memid, arena_id)) {
+    _mi_os_free_ex(start, size, commit, memid);
+    _mi_verbose_message("failed to reserve %zu KiB memory\n", _mi_divide_up(size, 1024));
+    return ENOMEM;
+  }
+  _mi_verbose_message("reserved %zu KiB memory%s\n", _mi_divide_up(size, 1024), is_large ? " (in large os pages)" : "");
+  return 0;
+}
+
+
+// Manage a range of regular OS memory
+bool mi_manage_os_memory(void* start, size_t size, bool is_committed, bool is_large, bool is_zero, int numa_node) mi_attr_noexcept {
+  return mi_manage_os_memory_ex(start, size, is_committed, is_large, is_zero, numa_node, false /* exclusive? */, NULL);
+}
+
+// Reserve a range of regular OS memory
+int mi_reserve_os_memory(size_t size, bool commit, bool allow_large) mi_attr_noexcept {
+  return mi_reserve_os_memory_ex(size, commit, allow_large, false, NULL);
+}
+
+
+/* -----------------------------------------------------------
+  Debugging
+----------------------------------------------------------- */
+
+static size_t mi_debug_show_bitmap(const char* prefix, const char* header, size_t block_count, mi_bitmap_field_t* fields, size_t field_count ) {
+  _mi_message("%s%s:\n", prefix, header);
+  size_t bcount = 0;
+  size_t inuse_count = 0;
+  for (size_t i = 0; i < field_count; i++) {
+    char buf[MI_BITMAP_FIELD_BITS + 1];
+    uintptr_t field = mi_atomic_load_relaxed(&fields[i]);
+    for (size_t bit = 0; bit < MI_BITMAP_FIELD_BITS; bit++, bcount++) {
+      if (bcount < block_count) {
+        bool inuse = ((((uintptr_t)1 << bit) & field) != 0);
+        if (inuse) inuse_count++;
+        buf[bit] = (inuse ? 'x' : '.');
+      }
+      else {
+        buf[bit] = ' ';
+      }
+    }
+    buf[MI_BITMAP_FIELD_BITS] = 0;
+    _mi_message("%s  %s\n", prefix, buf);
+  }
+  _mi_message("%s  total ('x'): %zu\n", prefix, inuse_count);
+  return inuse_count;
+}
+
+void mi_debug_show_arenas(void) mi_attr_noexcept {
+  const bool show_inuse = true;
+  size_t max_arenas = mi_atomic_load_relaxed(&mi_arena_count);
+  size_t inuse_total = 0;
+  //size_t abandoned_total = 0;
+  //size_t purge_total = 0;
+  for (size_t i = 0; i < max_arenas; i++) {
+    mi_arena_t* arena = mi_atomic_load_ptr_relaxed(mi_arena_t, &mi_arenas[i]);
+    if (arena == NULL) break;
+    _mi_message("arena %zu: %zu blocks of size %zuMiB (in %zu fields) %s\n", i, arena->block_count, (size_t)(MI_ARENA_BLOCK_SIZE / MI_MiB), arena->field_count, (arena->memid.is_pinned ? ", pinned" : ""));
+    if (show_inuse) {
+      inuse_total += mi_debug_show_bitmap("  ", "inuse blocks", arena->block_count, arena->blocks_inuse, arena->field_count);
+    }
+    if (arena->blocks_committed != NULL) {
+      mi_debug_show_bitmap("  ", "committed blocks", arena->block_count, arena->blocks_committed, arena->field_count);
+    }
+    //if (show_abandoned) {
+    //  abandoned_total += mi_debug_show_bitmap("  ", "abandoned blocks", arena->block_count, arena->blocks_abandoned, arena->field_count);
+    //}
+    //if (show_purge && arena->blocks_purge != NULL) {
+    //  purge_total += mi_debug_show_bitmap("  ", "purgeable blocks", arena->block_count, arena->blocks_purge, arena->field_count);
+    //}
+  }
+  if (show_inuse)     _mi_message("total inuse blocks    : %zu\n", inuse_total);
+  //if (show_abandoned) _mi_message("total abandoned blocks: %zu\n", abandoned_total);
+  //if (show_purge)     _mi_message("total purgeable blocks: %zu\n", purge_total);
+}
+
+
+void mi_arenas_print(void) mi_attr_noexcept {
+  mi_debug_show_arenas();
+}
+
+
+/* -----------------------------------------------------------
+  Reserve a huge page arena.
+----------------------------------------------------------- */
+// reserve at a specific numa node
+int mi_reserve_huge_os_pages_at_ex(size_t pages, int numa_node, size_t timeout_msecs, bool exclusive, mi_arena_id_t* arena_id) mi_attr_noexcept {
+  if (arena_id != NULL) *arena_id = -1;
+  if (pages==0) return 0;
+  if (numa_node < -1) numa_node = -1;
+  if (numa_node >= 0) numa_node = numa_node % _mi_os_numa_node_count();
+  size_t hsize = 0;
+  size_t pages_reserved = 0;
+  mi_memid_t memid;
+  void* p = _mi_os_alloc_huge_os_pages(pages, numa_node, timeout_msecs, &pages_reserved, &hsize, &memid);
+  if (p==NULL || pages_reserved==0) {
+    _mi_warning_message("failed to reserve %zu GiB huge pages\n", pages);
+    return ENOMEM;
+  }
+  _mi_verbose_message("numa node %i: reserved %zu GiB huge pages (of the %zu GiB requested)\n", numa_node, pages_reserved, pages);
+
+  if (!mi_manage_os_memory_ex2(p, hsize, true, numa_node, exclusive, memid, arena_id)) {
+    _mi_os_free(p, hsize, memid);
+    return ENOMEM;
+  }
+  return 0;
+}
+
+int mi_reserve_huge_os_pages_at(size_t pages, int numa_node, size_t timeout_msecs) mi_attr_noexcept {
+  return mi_reserve_huge_os_pages_at_ex(pages, numa_node, timeout_msecs, false, NULL);
+}
+
+// reserve huge pages evenly among the given number of numa nodes (or use the available ones as detected)
+int mi_reserve_huge_os_pages_interleave(size_t pages, size_t numa_nodes, size_t timeout_msecs) mi_attr_noexcept {
+  if (pages == 0) return 0;
+
+  // pages per numa node
+  int numa_count = (numa_nodes > 0 && numa_nodes <= INT_MAX ? (int)numa_nodes : _mi_os_numa_node_count());
+  if (numa_count == 0) numa_count = 1;
+  const size_t pages_per = pages / numa_count;
+  const size_t pages_mod = pages % numa_count;
+  const size_t timeout_per = (timeout_msecs==0 ? 0 : (timeout_msecs / numa_count) + 50);
+
+  // reserve evenly among numa nodes
+  for (int numa_node = 0; numa_node < numa_count && pages > 0; numa_node++) {
+    size_t node_pages = pages_per;  // can be 0
+    if ((size_t)numa_node < pages_mod) node_pages++;
+    int err = mi_reserve_huge_os_pages_at(node_pages, numa_node, timeout_per);
+    if (err) return err;
+    if (pages < node_pages) {
+      pages = 0;
+    }
+    else {
+      pages -= node_pages;
+    }
+  }
+
+  return 0;
+}
+
+int mi_reserve_huge_os_pages(size_t pages, double max_secs, size_t* pages_reserved) mi_attr_noexcept {
+  MI_UNUSED(max_secs);
+  _mi_warning_message("mi_reserve_huge_os_pages is deprecated: use mi_reserve_huge_os_pages_interleave/at instead\n");
+  if (pages_reserved != NULL) *pages_reserved = 0;
+  int err = mi_reserve_huge_os_pages_interleave(pages, 0, (size_t)(max_secs * 1000.0));
+  if (err==0 && pages_reserved!=NULL) *pages_reserved = pages;
+  return err;
+}
diff --git a/compat/mimalloc/bitmap.c b/compat/mimalloc/bitmap.c
new file mode 100644
index 00000000000000..32d1e9548d3e3b
--- /dev/null
+++ b/compat/mimalloc/bitmap.c
@@ -0,0 +1,441 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2019-2023 Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+
+/* ----------------------------------------------------------------------------
+Concurrent bitmap that can set/reset sequences of bits atomically,
+represented as an array of fields where each field is a machine word (`size_t`)
+
+There are two APIs; the standard one cannot have sequences that cross
+between the bitmap fields (and a sequence must be <= MI_BITMAP_FIELD_BITS).
+
+The `_across` postfixed functions do allow sequences that can cross over
+between the fields. (This is used in arena allocation)
+---------------------------------------------------------------------------- */
+
+#include "mimalloc.h"
+#include "mimalloc/internal.h"
+#include "bitmap.h"
+
+/* -----------------------------------------------------------
+  Bitmap definition
+----------------------------------------------------------- */
+
+// The bit mask for a given number of blocks at a specified bit index.
+static inline size_t mi_bitmap_mask_(size_t count, size_t bitidx) {
+  mi_assert_internal(count + bitidx <= MI_BITMAP_FIELD_BITS);
+  mi_assert_internal(count > 0);
+  if (count >= MI_BITMAP_FIELD_BITS) return MI_BITMAP_FIELD_FULL;
+  if (count == 0) return 0;
+  return ((((size_t)1 << count) - 1) << bitidx);
+}
+
+
+/* -----------------------------------------------------------
+  Claim a bit sequence atomically
+----------------------------------------------------------- */
+
+// Try to atomically claim a sequence of `count` bits in a single
+// field at `idx` in `bitmap`. Returns `true` on success.
+inline bool _mi_bitmap_try_find_claim_field(mi_bitmap_t bitmap, size_t idx, const size_t count, mi_bitmap_index_t* bitmap_idx)
+{
+  mi_assert_internal(bitmap_idx != NULL);
+  mi_assert_internal(count <= MI_BITMAP_FIELD_BITS);
+  mi_assert_internal(count > 0);
+  mi_bitmap_field_t* field = &bitmap[idx];
+  size_t map  = mi_atomic_load_relaxed(field);
+  if (map==MI_BITMAP_FIELD_FULL) return false; // short cut
+
+  // search for 0-bit sequence of length count
+  const size_t mask = mi_bitmap_mask_(count, 0);
+  const size_t bitidx_max = MI_BITMAP_FIELD_BITS - count;
+
+#ifdef MI_HAVE_FAST_BITSCAN
+  size_t bitidx = mi_ctz(~map);    // quickly find the first zero bit if possible
+#else
+  size_t bitidx = 0;               // otherwise start at 0
+#endif
+  size_t m = (mask << bitidx);     // invariant: m == mask shifted by bitidx
+
+  // scan linearly for a free range of zero bits
+  while (bitidx <= bitidx_max) {
+    const size_t mapm = (map & m);
+    if (mapm == 0) {  // are the mask bits free at bitidx?
+      mi_assert_internal((m >> bitidx) == mask); // no overflow?
+      const size_t newmap = (map | m);
+      mi_assert_internal((newmap^map) >> bitidx == mask);
+      if (!mi_atomic_cas_strong_acq_rel(field, &map, newmap)) {  // TODO: use weak cas here?
+        // no success, another thread claimed concurrently.. keep going (with updated `map`)
+        continue;
+      }
+      else {
+        // success, we claimed the bits!
+        *bitmap_idx = mi_bitmap_index_create(idx, bitidx);
+        return true;
+      }
+    }
+    else {
+      // on to the next bit range
+#ifdef MI_HAVE_FAST_BITSCAN
+      mi_assert_internal(mapm != 0);
+      const size_t shift = (count == 1 ? 1 : (MI_SIZE_BITS - mi_clz(mapm) - bitidx));
+      mi_assert_internal(shift > 0 && shift <= count);
+#else
+      const size_t shift = 1;
+#endif
+      bitidx += shift;
+      m <<= shift;
+    }
+  }
+  // no bits found
+  return false;
+}
+
+// Find `count` bits of 0 and set them to 1 atomically; returns `true` on success.
+// Starts at idx, and wraps around to search in all `bitmap_fields` fields.
+// `count` can be at most MI_BITMAP_FIELD_BITS and will never cross fields.
+bool _mi_bitmap_try_find_from_claim(mi_bitmap_t bitmap, const size_t bitmap_fields, const size_t start_field_idx, const size_t count, mi_bitmap_index_t* bitmap_idx) {
+  size_t idx = start_field_idx;
+  for (size_t visited = 0; visited < bitmap_fields; visited++, idx++) {
+    if (idx >= bitmap_fields) { idx = 0; } // wrap
+    if (_mi_bitmap_try_find_claim_field(bitmap, idx, count, bitmap_idx)) {
+      return true;
+    }
+  }
+  return false;
+}
+
+// Like _mi_bitmap_try_find_from_claim but with an extra predicate that must be fulfilled
+bool _mi_bitmap_try_find_from_claim_pred(mi_bitmap_t bitmap, const size_t bitmap_fields,
+            const size_t start_field_idx, const size_t count,
+            mi_bitmap_pred_fun_t pred_fun, void* pred_arg,
+            mi_bitmap_index_t* bitmap_idx) {
+  size_t idx = start_field_idx;
+  for (size_t visited = 0; visited < bitmap_fields; visited++, idx++) {
+    if (idx >= bitmap_fields) idx = 0; // wrap
+    if (_mi_bitmap_try_find_claim_field(bitmap, idx, count, bitmap_idx)) {
+      if (pred_fun == NULL || pred_fun(*bitmap_idx, pred_arg)) { 
+        return true;
+      }
+      // predicate returned false, unclaim and look further
+      _mi_bitmap_unclaim(bitmap, bitmap_fields, count, *bitmap_idx);
+    }
+  }
+  return false;
+}
+
+// Set `count` bits at `bitmap_idx` to 0 atomically
+// Returns `true` if all `count` bits were 1 previously.
+bool _mi_bitmap_unclaim(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx) {
+  const size_t idx = mi_bitmap_index_field(bitmap_idx);
+  const size_t bitidx = mi_bitmap_index_bit_in_field(bitmap_idx);
+  const size_t mask = mi_bitmap_mask_(count, bitidx);
+  mi_assert_internal(bitmap_fields > idx); MI_UNUSED(bitmap_fields);
+  // mi_assert_internal((bitmap[idx] & mask) == mask);
+  const size_t prev = mi_atomic_and_acq_rel(&bitmap[idx], ~mask);
+  return ((prev & mask) == mask);
+}
+
+
+// Set `count` bits at `bitmap_idx` to 1 atomically
+// Returns `true` if all `count` bits were 0 previously. `any_zero` is `true` if there was at least one zero bit.
+bool _mi_bitmap_claim(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx, bool* any_zero) {
+  const size_t idx = mi_bitmap_index_field(bitmap_idx);
+  const size_t bitidx = mi_bitmap_index_bit_in_field(bitmap_idx);
+  const size_t mask = mi_bitmap_mask_(count, bitidx);
+  mi_assert_internal(bitmap_fields > idx); MI_UNUSED(bitmap_fields);
+  //mi_assert_internal(any_zero != NULL || (bitmap[idx] & mask) == 0);
+  size_t prev = mi_atomic_or_acq_rel(&bitmap[idx], mask);
+  if (any_zero != NULL) { *any_zero = ((prev & mask) != mask); }
+  return ((prev & mask) == 0);
+}
+
+// Returns `true` if all `count` bits were 1. `any_ones` is `true` if there was at least one bit set to one.
+static bool mi_bitmap_is_claimedx(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx, bool* any_ones) {
+  const size_t idx = mi_bitmap_index_field(bitmap_idx);
+  const size_t bitidx = mi_bitmap_index_bit_in_field(bitmap_idx);
+  const size_t mask = mi_bitmap_mask_(count, bitidx);
+  mi_assert_internal(bitmap_fields > idx); MI_UNUSED(bitmap_fields);
+  const size_t field = mi_atomic_load_relaxed(&bitmap[idx]);
+  if (any_ones != NULL) { *any_ones = ((field & mask) != 0); }
+  return ((field & mask) == mask);
+}
+
+// Try to set `count` bits at `bitmap_idx` from 0 to 1 atomically.
+// Returns `true` if successful when all previous `count` bits were 0.
+bool _mi_bitmap_try_claim(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx) {
+  const size_t idx = mi_bitmap_index_field(bitmap_idx);
+  const size_t bitidx = mi_bitmap_index_bit_in_field(bitmap_idx);
+  const size_t mask = mi_bitmap_mask_(count, bitidx);
+  mi_assert_internal(bitmap_fields > idx); MI_UNUSED(bitmap_fields);
+  size_t expected = mi_atomic_load_relaxed(&bitmap[idx]);
+  do  {
+    if ((expected & mask) != 0) return false;
+  }
+  while (!mi_atomic_cas_strong_acq_rel(&bitmap[idx], &expected, expected | mask));
+  mi_assert_internal((expected & mask) == 0);
+  return true;
+}
+
+
+bool _mi_bitmap_is_claimed(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx) {
+  return mi_bitmap_is_claimedx(bitmap, bitmap_fields, count, bitmap_idx, NULL);
+}
+
+bool _mi_bitmap_is_any_claimed(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx) {
+  bool any_ones;
+  mi_bitmap_is_claimedx(bitmap, bitmap_fields, count, bitmap_idx, &any_ones);
+  return any_ones;
+}
+
+
+//--------------------------------------------------------------------------
+// the `_across` functions work on bitmaps where sequences can cross over
+// between the fields. This is used in arena allocation
+//--------------------------------------------------------------------------
+
+// Try to atomically claim a sequence of `count` bits starting from the field
+// at `idx` in `bitmap` and crossing into subsequent fields. Returns `true` on success.
+// Only needs to consider crossing into the next fields (see `_mi_bitmap_try_find_from_claim_across`)
+static bool mi_bitmap_try_find_claim_field_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t idx, const size_t count, const size_t retries, mi_bitmap_index_t* bitmap_idx)
+{
+  mi_assert_internal(bitmap_idx != NULL);
+
+  // check initial trailing zeros
+  mi_bitmap_field_t* field = &bitmap[idx];
+  size_t map = mi_atomic_load_relaxed(field);
+  const size_t initial = mi_clz(map);  // count of initial zeros starting at idx
+  mi_assert_internal(initial <= MI_BITMAP_FIELD_BITS);
+  if (initial == 0)     return false;
+  if (initial >= count) return _mi_bitmap_try_find_claim_field(bitmap, idx, count, bitmap_idx);    // no need to cross fields (this case won't happen for us)
+  if (_mi_divide_up(count - initial, MI_BITMAP_FIELD_BITS) >= (bitmap_fields - idx)) return false; // not enough entries
+
+  // scan ahead
+  size_t found = initial;
+  size_t mask = 0;     // mask bits for the final field
+  while(found < count) {
+    field++;
+    map = mi_atomic_load_relaxed(field);
+    const size_t mask_bits = (found + MI_BITMAP_FIELD_BITS <= count ? MI_BITMAP_FIELD_BITS : (count - found));
+    mi_assert_internal(mask_bits > 0 && mask_bits <= MI_BITMAP_FIELD_BITS);
+    mask = mi_bitmap_mask_(mask_bits, 0);
+    if ((map & mask) != 0) return false;  // some part is already claimed
+    found += mask_bits;
+  }
+  mi_assert_internal(field < &bitmap[bitmap_fields]);
+
+  // we found a range of contiguous zeros up to the final field; `mask` holds the mask for the final field
+  // now try to claim the range atomically
+  mi_bitmap_field_t* const final_field = field;
+  const size_t final_mask = mask;
+  mi_bitmap_field_t* const initial_field = &bitmap[idx];
+  const size_t initial_idx = MI_BITMAP_FIELD_BITS - initial;
+  const size_t initial_mask = mi_bitmap_mask_(initial, initial_idx);
+
+  // initial field
+  size_t newmap;
+  field = initial_field;
+  map = mi_atomic_load_relaxed(field);
+  do {
+    newmap = (map | initial_mask);
+    if ((map & initial_mask) != 0) { goto rollback; };
+  } while (!mi_atomic_cas_strong_acq_rel(field, &map, newmap));
+
+  // intermediate fields
+  while (++field < final_field) {
+    newmap = MI_BITMAP_FIELD_FULL;
+    map = 0;
+    if (!mi_atomic_cas_strong_acq_rel(field, &map, newmap)) { goto rollback; }
+  }
+
+  // final field
+  mi_assert_internal(field == final_field);
+  map = mi_atomic_load_relaxed(field);
+  do {
+    newmap = (map | final_mask);
+    if ((map & final_mask) != 0) { goto rollback; }
+  } while (!mi_atomic_cas_strong_acq_rel(field, &map, newmap));
+
+  // claimed!
+  *bitmap_idx = mi_bitmap_index_create(idx, initial_idx);
+  return true;
+
+rollback:
+  // roll back intermediate fields
+  // (we just failed to claim `field` so decrement first)
+  while (--field > initial_field) {
+    newmap = 0;
+    map = MI_BITMAP_FIELD_FULL;
+    mi_assert_internal(mi_atomic_load_relaxed(field) == map);
+    mi_atomic_store_release(field, newmap);
+  }
+  if (field == initial_field) {               // (if we failed on the initial field, `field + 1 == initial_field`)
+    map = mi_atomic_load_relaxed(field);
+    do {
+      mi_assert_internal((map & initial_mask) == initial_mask);
+      newmap = (map & ~initial_mask);
+    } while (!mi_atomic_cas_strong_acq_rel(field, &map, newmap));
+  }
+  mi_stat_counter_increase(_mi_stats_main.arena_rollback_count,1);
+  // retry? (we make a recursive call instead of goto to be able to use const declarations)
+  if (retries <= 2) {
+    return mi_bitmap_try_find_claim_field_across(bitmap, bitmap_fields, idx, count, retries+1, bitmap_idx);
+  }
+  else {
+    return false;
+  }
+}
+
+
+// Find `count` bits of zeros and set them to 1 atomically; returns `true` on success.
+// Starts at idx, and wraps around to search in all `bitmap_fields` fields.
+bool _mi_bitmap_try_find_from_claim_across(mi_bitmap_t bitmap, const size_t bitmap_fields, const size_t start_field_idx, const size_t count, mi_bitmap_index_t* bitmap_idx) {
+  mi_assert_internal(count > 0);
+  if (count <= 2) {
+    // we don't bother with crossover fields for small counts
+    return _mi_bitmap_try_find_from_claim(bitmap, bitmap_fields, start_field_idx, count, bitmap_idx);
+  }
+
+  // visit the fields
+  size_t idx = start_field_idx;
+  for (size_t visited = 0; visited < bitmap_fields; visited++, idx++) {
+    if (idx >= bitmap_fields) { idx = 0; } // wrap
+    // first try to claim inside a field
+    /*
+    if (count <= MI_BITMAP_FIELD_BITS) {
+      if (_mi_bitmap_try_find_claim_field(bitmap, idx, count, bitmap_idx)) {
+        return true;
+      }
+    }
+    */
+    // if that fails, then try to claim across fields
+    if (mi_bitmap_try_find_claim_field_across(bitmap, bitmap_fields, idx, count, 0, bitmap_idx)) {
+      return true;
+    }
+  }
+  return false;
+}
+
+// Helper for masks across fields; returns the mid count, post_mask may be 0
+static size_t mi_bitmap_mask_across(mi_bitmap_index_t bitmap_idx, size_t bitmap_fields, size_t count, size_t* pre_mask, size_t* mid_mask, size_t* post_mask) {
+  MI_UNUSED(bitmap_fields);
+  const size_t bitidx = mi_bitmap_index_bit_in_field(bitmap_idx);
+  if mi_likely(bitidx + count <= MI_BITMAP_FIELD_BITS) {
+    *pre_mask = mi_bitmap_mask_(count, bitidx);
+    *mid_mask = 0;
+    *post_mask = 0;
+    mi_assert_internal(mi_bitmap_index_field(bitmap_idx) < bitmap_fields);
+    return 0;
+  }
+  else {
+    const size_t pre_bits = MI_BITMAP_FIELD_BITS - bitidx;
+    mi_assert_internal(pre_bits < count);
+    *pre_mask = mi_bitmap_mask_(pre_bits, bitidx);
+    count -= pre_bits;
+    const size_t mid_count = (count / MI_BITMAP_FIELD_BITS);
+    *mid_mask = MI_BITMAP_FIELD_FULL;
+    count %= MI_BITMAP_FIELD_BITS;
+    *post_mask = (count==0 ? 0 : mi_bitmap_mask_(count, 0));
+    mi_assert_internal(mi_bitmap_index_field(bitmap_idx) + mid_count + (count==0 ? 0 : 1) < bitmap_fields);
+    return mid_count;
+  }
+}
+
+// Set `count` bits at `bitmap_idx` to 0 atomically
+// Returns `true` if all `count` bits were 1 previously.
+bool _mi_bitmap_unclaim_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx) {
+  size_t idx = mi_bitmap_index_field(bitmap_idx);
+  size_t pre_mask;
+  size_t mid_mask;
+  size_t post_mask;
+  size_t mid_count = mi_bitmap_mask_across(bitmap_idx, bitmap_fields, count, &pre_mask, &mid_mask, &post_mask);
+  bool all_one = true;
+  mi_bitmap_field_t* field = &bitmap[idx];
+  size_t prev = mi_atomic_and_acq_rel(field++, ~pre_mask);   // clear first part
+  if ((prev & pre_mask) != pre_mask) all_one = false;
+  while(mid_count-- > 0) {
+    prev = mi_atomic_and_acq_rel(field++, ~mid_mask);        // clear mid part
+    if ((prev & mid_mask) != mid_mask) all_one = false;
+  }
+  if (post_mask!=0) {
+    prev = mi_atomic_and_acq_rel(field, ~post_mask);         // clear end part
+    if ((prev & post_mask) != post_mask) all_one = false;
+  }
+  return all_one;
+}
+
+// Set `count` bits at `bitmap_idx` to 1 atomically
+// Returns `true` if all `count` bits were 0 previously. `*pany_zero` is set to `true` if there was at least one zero bit; `*already_set` receives the number of bits that were already 1 (both only if non-NULL).
+bool _mi_bitmap_claim_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx, bool* pany_zero, size_t* already_set) {
+  size_t idx = mi_bitmap_index_field(bitmap_idx);
+  size_t pre_mask;
+  size_t mid_mask;
+  size_t post_mask;
+  size_t mid_count = mi_bitmap_mask_across(bitmap_idx, bitmap_fields, count, &pre_mask, &mid_mask, &post_mask);
+  bool all_zero = true;
+  bool any_zero = false;
+  size_t one_count = 0;
+  _Atomic(size_t)*field = &bitmap[idx];
+  size_t prev = mi_atomic_or_acq_rel(field++, pre_mask);
+  if ((prev & pre_mask) != 0) { all_zero = false; one_count += mi_popcount(prev & pre_mask); }
+  if ((prev & pre_mask) != pre_mask) any_zero = true;
+  while (mid_count-- > 0) {
+    prev = mi_atomic_or_acq_rel(field++, mid_mask);
+    if ((prev & mid_mask) != 0) { all_zero = false; one_count += mi_popcount(prev & mid_mask); }
+    if ((prev & mid_mask) != mid_mask) any_zero = true;
+  }
+  if (post_mask!=0) {
+    prev = mi_atomic_or_acq_rel(field, post_mask);
+    if ((prev & post_mask) != 0) { all_zero = false; one_count += mi_popcount(prev & post_mask); }
+    if ((prev & post_mask) != post_mask) any_zero = true;
+  }
+  if (pany_zero != NULL) { *pany_zero = any_zero; }
+  if (already_set != NULL) { *already_set = one_count; };
+  mi_assert_internal(all_zero ? one_count == 0 : one_count <= count);
+  return all_zero;
+}
+
+
+// Returns `true` if all `count` bits were 1.
+// `*pany_ones` is set to `true` if at least one bit was 1; `*already_set` receives the number of bits that were 1 (both only if non-NULL).
+static bool mi_bitmap_is_claimedx_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx, bool* pany_ones, size_t* already_set) {
+  size_t idx = mi_bitmap_index_field(bitmap_idx);
+  size_t pre_mask;
+  size_t mid_mask;
+  size_t post_mask;
+  size_t mid_count = mi_bitmap_mask_across(bitmap_idx, bitmap_fields, count, &pre_mask, &mid_mask, &post_mask);
+  bool all_ones = true;
+  bool any_ones = false;
+  size_t one_count = 0;
+  mi_bitmap_field_t* field = &bitmap[idx];
+  size_t prev = mi_atomic_load_relaxed(field++);
+  if ((prev & pre_mask) != pre_mask) all_ones = false;
+  if ((prev & pre_mask) != 0) { any_ones = true; one_count += mi_popcount(prev & pre_mask); }
+  while (mid_count-- > 0) {
+    prev = mi_atomic_load_relaxed(field++);
+    if ((prev & mid_mask) != mid_mask) all_ones = false;
+    if ((prev & mid_mask) != 0) { any_ones = true; one_count += mi_popcount(prev & mid_mask); }
+  }
+  if (post_mask!=0) {
+    prev = mi_atomic_load_relaxed(field);
+    if ((prev & post_mask) != post_mask) all_ones = false;
+    if ((prev & post_mask) != 0) { any_ones = true; one_count += mi_popcount(prev & post_mask); }
+  }
+  if (pany_ones != NULL) { *pany_ones = any_ones; }
+  if (already_set != NULL) { *already_set = one_count; }
+  mi_assert_internal(all_ones ? one_count == count : one_count < count);
+  return all_ones;
+}
+
+bool _mi_bitmap_is_claimed_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx, size_t* already_set) {
+  return mi_bitmap_is_claimedx_across(bitmap, bitmap_fields, count, bitmap_idx, NULL, already_set);
+}
+
+bool _mi_bitmap_is_any_claimed_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx) {
+  bool any_ones;
+  mi_bitmap_is_claimedx_across(bitmap, bitmap_fields, count, bitmap_idx, &any_ones, NULL);
+  return any_ones;
+}
diff --git a/compat/mimalloc/bitmap.h b/compat/mimalloc/bitmap.h
new file mode 100644
index 00000000000000..0f4744f4fc3ffd
--- /dev/null
+++ b/compat/mimalloc/bitmap.h
@@ -0,0 +1,119 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2019-2023 Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+
+/* ----------------------------------------------------------------------------
+Concurrent bitmap that can set/reset sequences of bits atomically,
+represented as an array of fields where each field is a machine word (`size_t`)
+
+There are two APIs; the standard one cannot have sequences that cross
+between the bitmap fields (and a sequence must be <= MI_BITMAP_FIELD_BITS).
+(this is used in region allocation)
+
+The `_across` postfixed functions do allow sequences that can cross over
+between the fields. (This is used in arena allocation)
+---------------------------------------------------------------------------- */
+#pragma once
+#ifndef MI_BITMAP_H
+#define MI_BITMAP_H
+
+/* -----------------------------------------------------------
+  Bitmap definition
+----------------------------------------------------------- */
+
+#define MI_BITMAP_FIELD_BITS   (8*MI_SIZE_SIZE)
+#define MI_BITMAP_FIELD_FULL   (~((size_t)0))   // all bits set
+
+// An atomic bitmap of `size_t` fields
+typedef _Atomic(size_t)  mi_bitmap_field_t;
+typedef mi_bitmap_field_t*  mi_bitmap_t;
+
+// A bitmap index is the index of the bit in a bitmap.
+typedef size_t mi_bitmap_index_t;
+
+// Create a bit index.
+static inline mi_bitmap_index_t mi_bitmap_index_create_ex(size_t idx, size_t bitidx) {
+  mi_assert_internal(bitidx <= MI_BITMAP_FIELD_BITS);
+  return (idx*MI_BITMAP_FIELD_BITS) + bitidx;
+}
+static inline mi_bitmap_index_t mi_bitmap_index_create(size_t idx, size_t bitidx) {
+  mi_assert_internal(bitidx < MI_BITMAP_FIELD_BITS);
+  return mi_bitmap_index_create_ex(idx,bitidx);
+}
+
+// Create a bit index from a full (bitmap-wide) bit position.
+static inline mi_bitmap_index_t mi_bitmap_index_create_from_bit(size_t full_bitidx) {  
+  return mi_bitmap_index_create(full_bitidx / MI_BITMAP_FIELD_BITS, full_bitidx % MI_BITMAP_FIELD_BITS);
+}
+
+// Get the field index from a bit index.
+static inline size_t mi_bitmap_index_field(mi_bitmap_index_t bitmap_idx) {
+  return (bitmap_idx / MI_BITMAP_FIELD_BITS);
+}
+
+// Get the bit index in a bitmap field
+static inline size_t mi_bitmap_index_bit_in_field(mi_bitmap_index_t bitmap_idx) {
+  return (bitmap_idx % MI_BITMAP_FIELD_BITS);
+}
+
+// Get the full bit index
+static inline size_t mi_bitmap_index_bit(mi_bitmap_index_t bitmap_idx) {
+  return bitmap_idx;
+}
+
+/* -----------------------------------------------------------
+  Claim a bit sequence atomically
+----------------------------------------------------------- */
+
+// Try to atomically claim a sequence of `count` bits in a single
+// field at `idx` in `bitmap`. Returns `true` on success.
+bool _mi_bitmap_try_find_claim_field(mi_bitmap_t bitmap, size_t idx, const size_t count, mi_bitmap_index_t* bitmap_idx);
+
+// Starts at idx, and wraps around to search in all `bitmap_fields` fields.
+// For now, `count` can be at most MI_BITMAP_FIELD_BITS and will never cross fields.
+bool _mi_bitmap_try_find_from_claim(mi_bitmap_t bitmap, const size_t bitmap_fields, const size_t start_field_idx, const size_t count, mi_bitmap_index_t* bitmap_idx);
+
+// Like _mi_bitmap_try_find_from_claim but with an extra predicate that must be fulfilled
+typedef bool (mi_cdecl *mi_bitmap_pred_fun_t)(mi_bitmap_index_t bitmap_idx, void* pred_arg);
+bool _mi_bitmap_try_find_from_claim_pred(mi_bitmap_t bitmap, const size_t bitmap_fields, const size_t start_field_idx, const size_t count, mi_bitmap_pred_fun_t pred_fun, void* pred_arg, mi_bitmap_index_t* bitmap_idx);
+
+// Set `count` bits at `bitmap_idx` to 0 atomically
+// Returns `true` if all `count` bits were 1 previously.
+bool _mi_bitmap_unclaim(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx);
+
+// Try to set `count` bits at `bitmap_idx` from 0 to 1 atomically. 
+// Returns `true` if successful when all previous `count` bits were 0.
+bool _mi_bitmap_try_claim(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx);
+
+// Set `count` bits at `bitmap_idx` to 1 atomically
+// Returns `true` if all `count` bits were 0 previously. `any_zero` is `true` if there was at least one zero bit.
+bool _mi_bitmap_claim(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx, bool* any_zero);
+
+bool _mi_bitmap_is_claimed(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx);
+bool _mi_bitmap_is_any_claimed(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx);
+
+
+//--------------------------------------------------------------------------
+// the `_across` functions work on bitmaps where sequences can cross over
+// between the fields. This is used in arena allocation
+//--------------------------------------------------------------------------
+
+// Find `count` bits of zeros and set them to 1 atomically; returns `true` on success.
+// Starts at idx, and wraps around to search in all `bitmap_fields` fields.
+bool _mi_bitmap_try_find_from_claim_across(mi_bitmap_t bitmap, const size_t bitmap_fields, const size_t start_field_idx, const size_t count, mi_bitmap_index_t* bitmap_idx);
+
+// Set `count` bits at `bitmap_idx` to 0 atomically
+// Returns `true` if all `count` bits were 1 previously.
+bool _mi_bitmap_unclaim_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx);
+
+// Set `count` bits at `bitmap_idx` to 1 atomically
+// Returns `true` if all `count` bits were 0 previously. `*pany_zero` is set to `true` if there was at least one zero bit; `*already_set` receives the number of bits that were already 1 (both only if non-NULL).
+bool _mi_bitmap_claim_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx, bool* pany_zero, size_t* already_set);
+
+bool _mi_bitmap_is_claimed_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx, size_t* already_set);
+bool _mi_bitmap_is_any_claimed_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx);
+
+#endif
diff --git a/compat/mimalloc/free.c b/compat/mimalloc/free.c
new file mode 100644
index 00000000000000..0129ce83bd6c06
--- /dev/null
+++ b/compat/mimalloc/free.c
@@ -0,0 +1,588 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2024, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+#if !defined(MI_IN_ALLOC_C)
+#error "this file should be included from 'alloc.c' (so aliases can work from alloc-override)"
+// add includes to help an IDE
+#include "mimalloc.h"
+#include "mimalloc/internal.h"
+#include "mimalloc/prim.h"   // _mi_prim_thread_id()
+#endif
+
+// forward declarations
+static void   mi_check_padding(const mi_page_t* page, const mi_block_t* block);
+static bool   mi_check_is_double_free(const mi_page_t* page, const mi_block_t* block);
+static size_t mi_page_usable_size_of(const mi_page_t* page, const mi_block_t* block);
+static void   mi_stat_free(const mi_page_t* page, const mi_block_t* block);
+
+
+// ------------------------------------------------------
+// Free
+// ------------------------------------------------------
+
+// forward declaration of multi-threaded free (`_mt`) (or free in huge block if compiled with MI_HUGE_PAGE_ABANDON)
+static mi_decl_noinline void mi_free_block_mt(mi_page_t* page, mi_segment_t* segment, mi_block_t* block);
+
+// regular free of a (thread local) block pointer
+// fast path written carefully to prevent spilling on the stack
+static inline void mi_free_block_local(mi_page_t* page, mi_block_t* block, bool track_stats, bool check_full)
+{
+  // checks
+  if mi_unlikely(mi_check_is_double_free(page, block)) return;
+  mi_check_padding(page, block);
+  if (track_stats) { mi_stat_free(page, block); }
+  #if (MI_DEBUG>0) && !MI_TRACK_ENABLED  && !MI_TSAN && !MI_GUARDED
+  if (!mi_page_is_huge(page)) {   // huge page content may be already decommitted
+    memset(block, MI_DEBUG_FREED, mi_page_block_size(page));
+  }
+  #endif
+  if (track_stats) { mi_track_free_size(block, mi_page_usable_size_of(page, block)); } // faster than mi_usable_size as we already know the page and that p is unaligned
+
+  // actual free: push on the local free list
+  mi_block_set_next(page, block, page->local_free);
+  page->local_free = block;
+  if mi_unlikely(--page->used == 0) {
+    _mi_page_retire(page);
+  }
+  else if mi_unlikely(check_full && mi_page_is_in_full(page)) {
+    _mi_page_unfull(page);
+  }
+}
+
+// Adjust a block that was allocated aligned, to the actual start of the block in the page.
+// note: this can be called from `mi_free_generic_mt` where a non-owning thread accesses the
+// `page_start` and `block_size` fields; however these are constant and the page won't be
+// deallocated (as the block we are freeing keeps it alive) and thus safe to read concurrently.
+mi_block_t* _mi_page_ptr_unalign(const mi_page_t* page, const void* p) {
+  mi_assert_internal(page!=NULL && p!=NULL);
+
+  size_t diff = (uint8_t*)p - page->page_start;
+  size_t adjust;
+  if mi_likely(page->block_size_shift != 0) {
+    adjust = diff & (((size_t)1 << page->block_size_shift) - 1);
+  }
+  else {
+    adjust = diff % mi_page_block_size(page);
+  }
+
+  return (mi_block_t*)((uintptr_t)p - adjust);
+}
+
+// forward declaration for a MI_GUARDED build
+#if MI_GUARDED
+static void mi_block_unguard(mi_page_t* page, mi_block_t* block, void* p); // forward declaration
+static inline void mi_block_check_unguard(mi_page_t* page, mi_block_t* block, void* p) {
+  if (mi_block_ptr_is_guarded(block, p)) { mi_block_unguard(page, block, p); }
+}
+#else
+static inline void mi_block_check_unguard(mi_page_t* page, mi_block_t* block, void* p) {
+  MI_UNUSED(page); MI_UNUSED(block); MI_UNUSED(p);
+}
+#endif
+
+// free a local pointer  (page parameter comes first for better codegen)
+static void mi_decl_noinline mi_free_generic_local(mi_page_t* page, mi_segment_t* segment, void* p) mi_attr_noexcept {
+  MI_UNUSED(segment);
+  mi_block_t* const block = (mi_page_has_aligned(page) ? _mi_page_ptr_unalign(page, p) : (mi_block_t*)p);
+  mi_block_check_unguard(page, block, p);
+  mi_free_block_local(page, block, true /* track stats */, true /* check for a full page */);
+}
+
+// free a pointer owned by another thread (page parameter comes first for better codegen)
+static void mi_decl_noinline mi_free_generic_mt(mi_page_t* page, mi_segment_t* segment, void* p) mi_attr_noexcept {
+  mi_block_t* const block = _mi_page_ptr_unalign(page, p); // don't check `has_aligned` flag to avoid a race (issue #865)
+  mi_block_check_unguard(page, block, p);
+  mi_free_block_mt(page, segment, block);
+}
+
+// generic free (for runtime integration)
+void mi_decl_noinline _mi_free_generic(mi_segment_t* segment, mi_page_t* page, bool is_local, void* p) mi_attr_noexcept {
+  if (is_local) mi_free_generic_local(page,segment,p);
+           else mi_free_generic_mt(page,segment,p);
+}
+
+// Get the segment data belonging to a pointer
+// This is just a single `and` in release mode but does further checks in debug mode
+// (and secure mode) to see if this was a valid pointer.
+static inline mi_segment_t* mi_checked_ptr_segment(const void* p, const char* msg)
+{
+  MI_UNUSED(msg);
+
+  #if (MI_DEBUG>0)
+  if mi_unlikely(((uintptr_t)p & (MI_INTPTR_SIZE - 1)) != 0 && !mi_option_is_enabled(mi_option_guarded_precise)) {
+    _mi_error_message(EINVAL, "%s: invalid (unaligned) pointer: %p\n", msg, p);
+    return NULL;
+  }
+  #endif
+
+  mi_segment_t* const segment = _mi_ptr_segment(p);
+  if mi_unlikely(segment==NULL) return segment;
+
+  #if (MI_DEBUG>0)
+  if mi_unlikely(!mi_is_in_heap_region(p)) {
+  #if (MI_INTPTR_SIZE == 8 && defined(__linux__))
+    if (((uintptr_t)p >> 40) != 0x7F) { // linux tends to align large blocks above 0x7F000000000 (issue #640)
+  #else
+    {
+  #endif
+      _mi_warning_message("%s: pointer might not point to a valid heap region: %p\n"
+        "(this may still be a valid very large allocation (over 64MiB))\n", msg, p);
+      if mi_likely(_mi_ptr_cookie(segment) == segment->cookie) {
+        _mi_warning_message("(yes, the previous pointer %p was valid after all)\n", p);
+      }
+    }
+  }
+  #endif
+  #if (MI_DEBUG>0 || MI_SECURE>=4)
+  if mi_unlikely(_mi_ptr_cookie(segment) != segment->cookie) {
+    _mi_error_message(EINVAL, "%s: pointer does not point to a valid heap space: %p\n", msg, p);
+    return NULL;
+  }
+  #endif
+
+  return segment;
+}
+
+// Free a block
+// Fast path written carefully to prevent register spilling on the stack
+static inline void mi_free_ex(void* p, size_t* usable) mi_attr_noexcept
+{
+  mi_segment_t* const segment = mi_checked_ptr_segment(p,"mi_free");
+  if mi_unlikely(segment==NULL) return;
+
+  const bool is_local = (_mi_prim_thread_id() == mi_atomic_load_relaxed(&segment->thread_id));
+  mi_page_t* const page = _mi_segment_page_of(segment, p);
+  if (usable!=NULL) { *usable = mi_page_usable_block_size(page); }
+  
+  if mi_likely(is_local) {                        // thread-local free?
+    if mi_likely(page->flags.full_aligned == 0) { // and it is not a full page (full pages need to move from the full bin), nor has aligned blocks (aligned blocks need to be unaligned)
+      // thread-local, not a full page, and no aligned blocks (so `p` is the block start)
+      mi_block_t* const block = (mi_block_t*)p;
+      mi_free_block_local(page, block, true /* track stats */, false /* no need to check if the page is full */);
+    }
+    else {
+      // page is full or contains (inner) aligned blocks; use generic path
+      mi_free_generic_local(page, segment, p);
+    }
+  }
+  else {
+    // not thread-local; use generic path
+    mi_free_generic_mt(page, segment, p);
+  }
+}
+
+void mi_free(void* p) mi_attr_noexcept {
+  mi_free_ex(p,NULL);
+}
+
+void mi_ufree(void* p, size_t* usable) mi_attr_noexcept {
+  mi_free_ex(p,usable);
+}
+
+// return true if successful
+bool _mi_free_delayed_block(mi_block_t* block) {
+  // get segment and page
+  mi_assert_internal(block!=NULL);
+  const mi_segment_t* const segment = _mi_ptr_segment(block);
+  mi_assert_internal(_mi_ptr_cookie(segment) == segment->cookie);
+  mi_assert_internal(_mi_thread_id() == segment->thread_id);
+  mi_page_t* const page = _mi_segment_page_of(segment, block);
+
+  // Clear the no-delayed flag so delayed freeing is used again for this page.
+  // This must be done before collecting the free lists on this page -- otherwise
+  // some blocks may end up in the page `thread_free` list with no blocks in the
+  // heap `thread_delayed_free` list which may cause the page to be never freed!
+  // (it would only be freed if we happen to scan it in `mi_page_queue_find_free_ex`)
+  if (!_mi_page_try_use_delayed_free(page, MI_USE_DELAYED_FREE, false /* don't overwrite never delayed */)) {
+    return false;
+  }
+
+  // collect all other non-local frees (move from `thread_free` to `free`) to ensure up-to-date `used` count
+  _mi_page_free_collect(page, false);
+
+  // and free the block (possibly freeing the page as well since `used` is updated)
+  mi_free_block_local(page, block, false /* stats have already been adjusted */, true /* check for a full page */);
+  return true;
+}
+
+// ------------------------------------------------------
+// Multi-threaded Free (`_mt`)
+// ------------------------------------------------------
+
+// Push a block that is owned by another thread on its page-local thread free
+// list or its heap delayed free list. Such blocks are later collected by
+// the owning thread in `_mi_free_delayed_block`.
+static void mi_decl_noinline mi_free_block_delayed_mt( mi_page_t* page, mi_block_t* block )
+{
+  // Try to put the block on either the page-local thread free list,
+  // or the heap delayed free list (if this is the first non-local free in that page)
+  mi_thread_free_t tfreex;
+  bool use_delayed;
+  mi_thread_free_t tfree = mi_atomic_load_relaxed(&page->xthread_free);
+  do {
+    use_delayed = (mi_tf_delayed(tfree) == MI_USE_DELAYED_FREE);
+    if mi_unlikely(use_delayed) {
+      // unlikely: this only happens on the first concurrent free in a page that is in the full list
+      tfreex = mi_tf_set_delayed(tfree,MI_DELAYED_FREEING);
+    }
+    else {
+      // usual: directly add to page thread_free list
+      mi_block_set_next(page, block, mi_tf_block(tfree));
+      tfreex = mi_tf_set_block(tfree,block);
+    }
+  } while (!mi_atomic_cas_weak_release(&page->xthread_free, &tfree, tfreex));
+
+  // If this was the first non-local free, we need to push it on the heap delayed free list instead
+  if mi_unlikely(use_delayed) {
+    // racy read on `heap`, but ok because MI_DELAYED_FREEING is set (see `mi_heap_delete` and `mi_heap_collect_abandon`)
+    mi_heap_t* const heap = (mi_heap_t*)(mi_atomic_load_acquire(&page->xheap)); //mi_page_heap(page);
+    mi_assert_internal(heap != NULL);
+    if (heap != NULL) {
+      // add to the delayed free list of this heap. (do this atomically as the lock only protects heap memory validity)
+      mi_block_t* dfree = mi_atomic_load_ptr_relaxed(mi_block_t, &heap->thread_delayed_free);
+      do {
+        mi_block_set_nextx(heap,block,dfree, heap->keys);
+      } while (!mi_atomic_cas_ptr_weak_release(mi_block_t,&heap->thread_delayed_free, &dfree, block));
+    }
+
+    // and reset the MI_DELAYED_FREEING flag
+    tfree = mi_atomic_load_relaxed(&page->xthread_free);
+    do {
+      tfreex = tfree;
+      mi_assert_internal(mi_tf_delayed(tfree) == MI_DELAYED_FREEING);
+      tfreex = mi_tf_set_delayed(tfree,MI_NO_DELAYED_FREE);
+    } while (!mi_atomic_cas_weak_release(&page->xthread_free, &tfree, tfreex));
+  }
+}
+
+// Multi-threaded free (`_mt`) (or free in huge block if compiled with MI_HUGE_PAGE_ABANDON)
+static void mi_decl_noinline mi_free_block_mt(mi_page_t* page, mi_segment_t* segment, mi_block_t* block)
+{
+  // first see if the segment was abandoned and if we can reclaim it into our thread
+  if (_mi_option_get_fast(mi_option_abandoned_reclaim_on_free) != 0 &&
+      #if MI_HUGE_PAGE_ABANDON
+      segment->page_kind != MI_PAGE_HUGE &&
+      #endif
+      mi_atomic_load_relaxed(&segment->thread_id) == 0 &&  // segment is abandoned?
+      mi_prim_get_default_heap() != (mi_heap_t*)&_mi_heap_empty) // and we did not already exit this thread (without this check, a fresh heap will be initialized (issue #944))
+  {
+    // the segment is abandoned, try to reclaim it into our heap
+    if (_mi_segment_attempt_reclaim(mi_heap_get_default(), segment)) {
+      mi_assert_internal(_mi_thread_id() == mi_atomic_load_relaxed(&segment->thread_id));
+      mi_assert_internal(mi_heap_get_default()->tld->segments.subproc == segment->subproc);
+      mi_free(block);  // recursively free as now it will be a local free in our heap
+      return;
+    }
+  }
+
+  // The padding check may access the non-thread-owned page for the key values.
+  // That is safe as these are constant and the page won't be freed (as the block is not freed yet).
+  mi_check_padding(page, block);
+
+  // adjust stats (after padding check and potentially recursive `mi_free` above)
+  mi_stat_free(page, block);    // stat_free may access the padding
+  mi_track_free_size(block, mi_page_usable_size_of(page,block));
+
+  // for small size, ensure we can fit the delayed thread pointers without triggering overflow detection
+  _mi_padding_shrink(page, block, sizeof(mi_block_t));
+
+  if (segment->kind == MI_SEGMENT_HUGE) {
+    #if MI_HUGE_PAGE_ABANDON
+    // huge page segments are always abandoned and can be freed immediately
+    _mi_segment_huge_page_free(segment, page, block);
+    return;
+    #else
+    // huge pages are special as they occupy the entire segment
+    // as these are large we reset the memory occupied by the page so it is available to other threads
+    // (as the owning thread needs to actually free the memory later).
+    _mi_segment_huge_page_reset(segment, page, block);
+    #endif
+  }
+  else {
+    #if (MI_DEBUG>0) && !MI_TRACK_ENABLED  && !MI_TSAN       // note: when tracking, cannot use mi_usable_size with multi-threading
+    memset(block, MI_DEBUG_FREED, mi_usable_size(block));
+    #endif
+  }
+
+  // and finally free the actual block by pushing it on the owning heap
+  // thread_delayed free list (or heap delayed free list)
+  mi_free_block_delayed_mt(page,block);
+}
+
+
+// ------------------------------------------------------
+// Usable size
+// ------------------------------------------------------
+
+// Bytes available in a block
+static size_t mi_decl_noinline mi_page_usable_aligned_size_of(const mi_page_t* page, const void* p) mi_attr_noexcept {
+  const mi_block_t* block = _mi_page_ptr_unalign(page, p);
+  const size_t size = mi_page_usable_size_of(page, block);
+  const ptrdiff_t adjust = (uint8_t*)p - (uint8_t*)block;
+  mi_assert_internal(adjust >= 0 && (size_t)adjust <= size);
+  const size_t aligned_size = (size - adjust);
+  #if MI_GUARDED
+  if (mi_block_ptr_is_guarded(block, p)) {
+    return aligned_size - _mi_os_page_size();
+  }
+  #endif
+  return aligned_size;
+}
+
+static inline mi_page_t* mi_validate_ptr_page(const void* p, const char* msg) {
+  const mi_segment_t* const segment = mi_checked_ptr_segment(p, msg);
+  if mi_unlikely(segment==NULL) return NULL;
+  mi_page_t* const page = _mi_segment_page_of(segment, p);
+  return page;
+}
+
+static inline size_t _mi_usable_size(const void* p, const mi_page_t* page) mi_attr_noexcept {
+  if mi_unlikely(page==NULL) return 0;
+  if mi_likely(!mi_page_has_aligned(page)) {
+    const mi_block_t* block = (const mi_block_t*)p;
+    return mi_page_usable_size_of(page, block);
+  }
+  else {
+    // split out to separate routine for improved code generation
+    return mi_page_usable_aligned_size_of(page, p);
+  }
+}
+
+mi_decl_nodiscard size_t mi_usable_size(const void* p) mi_attr_noexcept {
+  const mi_page_t* const page = mi_validate_ptr_page(p,"mi_usable_size");
+  return _mi_usable_size(p,page);
+}
+
+
+// ------------------------------------------------------
+// Free variants
+// ------------------------------------------------------
+
+void mi_free_size(void* p, size_t size) mi_attr_noexcept {
+  MI_UNUSED_RELEASE(size);
+  #if MI_DEBUG
+  const mi_page_t* const page = mi_validate_ptr_page(p,"mi_free_size");  
+  const size_t available = _mi_usable_size(p,page);
+  mi_assert(p == NULL || size <= available || available == 0 /* invalid pointer */ );
+  #endif
+  mi_free(p);
+}
+
+void mi_free_size_aligned(void* p, size_t size, size_t alignment) mi_attr_noexcept {
+  MI_UNUSED_RELEASE(alignment);
+  mi_assert(((uintptr_t)p % alignment) == 0);
+  mi_free_size(p,size);
+}
+
+void mi_free_aligned(void* p, size_t alignment) mi_attr_noexcept {
+  MI_UNUSED_RELEASE(alignment);
+  mi_assert(((uintptr_t)p % alignment) == 0);
+  mi_free(p);
+}
+
+
+// ------------------------------------------------------
+// Check for double free in secure and debug mode
+// This is somewhat expensive so only enabled for secure mode 4 or debug builds
+// ------------------------------------------------------
+
+#if (MI_ENCODE_FREELIST && (MI_SECURE>=4 || MI_DEBUG!=0))
+// linear check if the free list contains a specific element
+static bool mi_list_contains(const mi_page_t* page, const mi_block_t* list, const mi_block_t* elem) {
+  while (list != NULL) {
+    if (elem==list) return true;
+    list = mi_block_next(page, list);
+  }
+  return false;
+}
+
+static mi_decl_noinline bool mi_check_is_double_freex(const mi_page_t* page, const mi_block_t* block) {
+  // The decoded value is in the same page (or NULL).
+  // Walk the free lists to verify positively if it is already freed
+  if (mi_list_contains(page, page->free, block) ||
+      mi_list_contains(page, page->local_free, block) ||
+      mi_list_contains(page, mi_page_thread_free(page), block))
+  {
+    _mi_error_message(EAGAIN, "double free detected of block %p with size %zu\n", block, mi_page_block_size(page));
+    return true;
+  }
+  return false;
+}
+
+#define mi_track_page(page,access)  { size_t psize; void* pstart = _mi_page_start(_mi_page_segment(page),page,&psize); mi_track_mem_##access( pstart, psize); }
+
+static inline bool mi_check_is_double_free(const mi_page_t* page, const mi_block_t* block) {
+  bool is_double_free = false;
+  mi_block_t* n = mi_block_nextx(page, block, page->keys); // pretend it is freed, and get the decoded first field
+  if (((uintptr_t)n & (MI_INTPTR_SIZE-1))==0 &&  // quick check: aligned pointer?
+      (n==NULL || mi_is_in_same_page(block, n))) // quick check: in same page or NULL?
+  {
+    // Suspicious: the decoded value in the block is in the same page (or NULL) -- maybe a double free?
+    // (continue in separate function to improve code generation)
+    is_double_free = mi_check_is_double_freex(page, block);
+  }
+  return is_double_free;
+}
+#else
+static inline bool mi_check_is_double_free(const mi_page_t* page, const mi_block_t* block) {
+  MI_UNUSED(page);
+  MI_UNUSED(block);
+  return false;
+}
+#endif
+
+
+// ---------------------------------------------------------------------------
+// Check for heap block overflow by setting up padding at the end of the block
+// ---------------------------------------------------------------------------
+
+#if MI_PADDING // && !MI_TRACK_ENABLED
+static bool mi_page_decode_padding(const mi_page_t* page, const mi_block_t* block, size_t* delta, size_t* bsize) {
+  *bsize = mi_page_usable_block_size(page);
+  const mi_padding_t* const padding = (mi_padding_t*)((uint8_t*)block + *bsize);
+  mi_track_mem_defined(padding,sizeof(mi_padding_t));
+  *delta = padding->delta;
+  uint32_t canary = padding->canary;
+  uintptr_t keys[2];
+  keys[0] = page->keys[0];
+  keys[1] = page->keys[1];
+  bool ok = (mi_ptr_encode_canary(page,block,keys) == canary && *delta <= *bsize);
+  mi_track_mem_noaccess(padding,sizeof(mi_padding_t));
+  return ok;
+}
+
+// Return the exact usable size of a block.
+static size_t mi_page_usable_size_of(const mi_page_t* page, const mi_block_t* block) {
+  size_t bsize;
+  size_t delta;
+  bool ok = mi_page_decode_padding(page, block, &delta, &bsize);
+  mi_assert_internal(ok); mi_assert_internal(delta <= bsize);
+  return (ok ? bsize - delta : 0);
+}
+
+// When a non-thread-local block is freed, it becomes part of the thread delayed free
+// list that is freed later by the owning heap. If the exact usable size is too small to
+// contain the pointer for the delayed list, then shrink the padding (by decreasing delta)
+// so it will later not trigger an overflow error in `mi_free_block`.
+void _mi_padding_shrink(const mi_page_t* page, const mi_block_t* block, const size_t min_size) {
+  size_t bsize;
+  size_t delta;
+  bool ok = mi_page_decode_padding(page, block, &delta, &bsize);
+  mi_assert_internal(ok);
+  if (!ok || (bsize - delta) >= min_size) return;  // usually already enough space
+  mi_assert_internal(bsize >= min_size);
+  if (bsize < min_size) return;  // should never happen
+  size_t new_delta = (bsize - min_size);
+  mi_assert_internal(new_delta < bsize);
+  mi_padding_t* padding = (mi_padding_t*)((uint8_t*)block + bsize);
+  mi_track_mem_defined(padding,sizeof(mi_padding_t));
+  padding->delta = (uint32_t)new_delta;
+  mi_track_mem_noaccess(padding,sizeof(mi_padding_t));
+}
+#else
+static size_t mi_page_usable_size_of(const mi_page_t* page, const mi_block_t* block) {
+  MI_UNUSED(block);
+  return mi_page_usable_block_size(page);
+}
+
+void _mi_padding_shrink(const mi_page_t* page, const mi_block_t* block, const size_t min_size) {
+  MI_UNUSED(page);
+  MI_UNUSED(block);
+  MI_UNUSED(min_size);
+}
+#endif
+
+#if MI_PADDING && MI_PADDING_CHECK
+
+static bool mi_verify_padding(const mi_page_t* page, const mi_block_t* block, size_t* size, size_t* wrong) {
+  size_t bsize;
+  size_t delta;
+  bool ok = mi_page_decode_padding(page, block, &delta, &bsize);
+  *size = *wrong = bsize;
+  if (!ok) return false;
+  mi_assert_internal(bsize >= delta);
+  *size = bsize - delta;
+  if (!mi_page_is_huge(page)) {
+    uint8_t* fill = (uint8_t*)block + bsize - delta;
+    const size_t maxpad = (delta > MI_MAX_ALIGN_SIZE ? MI_MAX_ALIGN_SIZE : delta); // check at most the first N padding bytes
+    mi_track_mem_defined(fill, maxpad);
+    for (size_t i = 0; i < maxpad; i++) {
+      if (fill[i] != MI_DEBUG_PADDING) {
+        *wrong = bsize - delta + i;
+        ok = false;
+        break;
+      }
+    }
+    mi_track_mem_noaccess(fill, maxpad);
+  }
+  return ok;
+}
+
+static void mi_check_padding(const mi_page_t* page, const mi_block_t* block) {
+  size_t size;
+  size_t wrong;
+  if (!mi_verify_padding(page,block,&size,&wrong)) {
+    _mi_error_message(EFAULT, "buffer overflow in heap block %p of size %zu: write after %zu bytes\n", block, size, wrong );
+  }
+}
+
+#else
+
+static void mi_check_padding(const mi_page_t* page, const mi_block_t* block) {
+  MI_UNUSED(page);
+  MI_UNUSED(block);
+}
+
+#endif
+
+// only maintain stats for smaller objects if requested
+#if (MI_STAT>0)
+static void mi_stat_free(const mi_page_t* page, const mi_block_t* block) {
+  MI_UNUSED(block);
+  mi_heap_t* const heap = mi_heap_get_default();
+  const size_t bsize = mi_page_usable_block_size(page);
+  // #if (MI_STAT>1)
+  // const size_t usize = mi_page_usable_size_of(page, block);
+  // mi_heap_stat_decrease(heap, malloc_requested, usize);
+  // #endif
+  if (bsize <= MI_MEDIUM_OBJ_SIZE_MAX) {
+    mi_heap_stat_decrease(heap, malloc_normal, bsize);
+    #if (MI_STAT > 1)
+    mi_heap_stat_decrease(heap, malloc_bins[_mi_bin(bsize)], 1);
+    #endif
+  }
+  //else if (bsize <= MI_LARGE_OBJ_SIZE_MAX) {
+  //  mi_heap_stat_decrease(heap, malloc_large, bsize);
+  //}
+  else {
+    mi_heap_stat_decrease(heap, malloc_huge, bsize);
+  }
+}
+#else
+static void mi_stat_free(const mi_page_t* page, const mi_block_t* block) {
+  MI_UNUSED(page); MI_UNUSED(block);
+}
+#endif
+
+
+// Remove guard page when building with MI_GUARDED
+#if MI_GUARDED
+static void mi_block_unguard(mi_page_t* page, mi_block_t* block, void* p) {
+  MI_UNUSED(p);
+  mi_assert_internal(mi_block_ptr_is_guarded(block, p));
+  mi_assert_internal(mi_page_has_aligned(page));
+  mi_assert_internal((uint8_t*)p - (uint8_t*)block >= (ptrdiff_t)sizeof(mi_block_t));
+  mi_assert_internal(block->next == MI_BLOCK_TAG_GUARDED);
+
+  const size_t bsize = mi_page_block_size(page);
+  const size_t psize = _mi_os_page_size();
+  mi_assert_internal(bsize > psize);
+  mi_assert_internal(_mi_page_segment(page)->allow_decommit);
+  void* gpage = (uint8_t*)block + bsize - psize;
+  mi_assert_internal(_mi_is_aligned(gpage, psize));
+  _mi_os_unprotect(gpage, psize);
+}
+#endif
diff --git a/compat/mimalloc/heap.c b/compat/mimalloc/heap.c
new file mode 100644
index 00000000000000..88969311e89586
--- /dev/null
+++ b/compat/mimalloc/heap.c
@@ -0,0 +1,737 @@
+/*----------------------------------------------------------------------------
+Copyright (c) 2018-2021, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+
+#include "mimalloc.h"
+#include "mimalloc/internal.h"
+#include "mimalloc/atomic.h"
+#include "mimalloc/prim.h"  // mi_prim_get_default_heap
+
+#include <string.h>  // memset, memcpy
+
+#if defined(_MSC_VER) && (_MSC_VER < 1920)
+#pragma warning(disable:4204)  // non-constant aggregate initializer
+#endif
+
+/* -----------------------------------------------------------
+  Helpers
+----------------------------------------------------------- */
+
+// return `true` if ok, `false` to break
+typedef bool (heap_page_visitor_fun)(mi_heap_t* heap, mi_page_queue_t* pq, mi_page_t* page, void* arg1, void* arg2);
+
+// Visit all pages in a heap; returns `false` if break was called.
+static bool mi_heap_visit_pages(mi_heap_t* heap, heap_page_visitor_fun* fn, void* arg1, void* arg2)
+{
+  if (heap==NULL || heap->page_count==0) return 0;
+
+  // visit all pages
+  #if MI_DEBUG>1
+  size_t total = heap->page_count;
+  size_t count = 0;
+  #endif
+
+  for (size_t i = 0; i <= MI_BIN_FULL; i++) {
+    mi_page_queue_t* pq = &heap->pages[i];
+    mi_page_t* page = pq->first;
+    while(page != NULL) {
+      mi_page_t* next = page->next; // save next in case the page gets removed from the queue
+      mi_assert_internal(mi_page_heap(page) == heap);
+      #if MI_DEBUG>1
+      count++;
+      #endif
+      if (!fn(heap, pq, page, arg1, arg2)) return false;
+      page = next; // and continue
+    }
+  }
+  mi_assert_internal(count == total);
+  return true;
+}
+
+
+#if MI_DEBUG>=2
+static bool mi_heap_page_is_valid(mi_heap_t* heap, mi_page_queue_t* pq, mi_page_t* page, void* arg1, void* arg2) {
+  MI_UNUSED(arg1);
+  MI_UNUSED(arg2);
+  MI_UNUSED(pq);
+  mi_assert_internal(mi_page_heap(page) == heap);
+  mi_segment_t* segment = _mi_page_segment(page);
+  mi_assert_internal(mi_atomic_load_relaxed(&segment->thread_id) == heap->thread_id);
+  mi_assert_expensive(_mi_page_is_valid(page));
+  return true;
+}
+#endif
+#if MI_DEBUG>=3
+static bool mi_heap_is_valid(mi_heap_t* heap) {
+  mi_assert_internal(heap!=NULL);
+  mi_heap_visit_pages(heap, &mi_heap_page_is_valid, NULL, NULL);
+  return true;
+}
+#endif
+
+
+
+
+/* -----------------------------------------------------------
+  "Collect" pages by migrating `local_free` and `thread_free`
+  lists and freeing empty pages. This is done when a thread
+  stops (and in that case abandons pages if there are still
+  blocks alive)
+----------------------------------------------------------- */
+
+typedef enum mi_collect_e {
+  MI_NORMAL,
+  MI_FORCE,
+  MI_ABANDON
+} mi_collect_t;
+
+
+static bool mi_heap_page_collect(mi_heap_t* heap, mi_page_queue_t* pq, mi_page_t* page, void* arg_collect, void* arg2 ) {
+  MI_UNUSED(arg2);
+  MI_UNUSED(heap);
+  mi_assert_internal(mi_heap_page_is_valid(heap, pq, page, NULL, NULL));
+  mi_collect_t collect = *((mi_collect_t*)arg_collect);
+  _mi_page_free_collect(page, collect >= MI_FORCE);
+  if (collect == MI_FORCE) {
+    // note: call before a potential `_mi_page_free` as the segment may be freed if this was the last used page in that segment.
+    mi_segment_t* segment = _mi_page_segment(page);
+    _mi_segment_collect(segment, true /* force? */);
+  }
+  if (mi_page_all_free(page)) {
+    // no more used blocks, free the page.
+    // note: this will free retired pages as well.
+    _mi_page_free(page, pq, collect >= MI_FORCE);
+  }
+  else if (collect == MI_ABANDON) {
+    // still used blocks but the thread is done; abandon the page
+    _mi_page_abandon(page, pq);
+  }
+  return true; // don't break
+}
+
+static bool mi_heap_page_never_delayed_free(mi_heap_t* heap, mi_page_queue_t* pq, mi_page_t* page, void* arg1, void* arg2) {
+  MI_UNUSED(arg1);
+  MI_UNUSED(arg2);
+  MI_UNUSED(heap);
+  MI_UNUSED(pq);
+  _mi_page_use_delayed_free(page, MI_NEVER_DELAYED_FREE, false);
+  return true; // don't break
+}
+
+static void mi_heap_collect_ex(mi_heap_t* heap, mi_collect_t collect)
+{
+  if (heap==NULL || !mi_heap_is_initialized(heap)) return;
+
+  const bool force = (collect >= MI_FORCE);
+  _mi_deferred_free(heap, force);
+
+  // python/cpython#112532: we may be called from a thread that is not the owner of the heap
+  const bool is_main_thread = (_mi_is_main_thread() && heap->thread_id == _mi_thread_id());
+
+  // note: never reclaim on collect but leave it to threads that need storage to reclaim
+  const bool force_main =
+    #ifdef NDEBUG
+      collect == MI_FORCE
+    #else
+      collect >= MI_FORCE
+    #endif
+      && is_main_thread && mi_heap_is_backing(heap) && !heap->no_reclaim;
+
+  if (force_main) {
+    // the main thread is abandoned (end-of-program), try to reclaim all abandoned segments.
+    // if all memory is freed by now, all segments should be freed.
+    // note: this only collects in the current subprocess
+    _mi_abandoned_reclaim_all(heap, &heap->tld->segments);
+  }
+
+  // if abandoning, mark all pages to no longer add to delayed_free
+  if (collect == MI_ABANDON) {
+    mi_heap_visit_pages(heap, &mi_heap_page_never_delayed_free, NULL, NULL);
+  }
+
+  // free all current thread delayed blocks.
+  // (if abandoning, after this there are no more thread-delayed references into the pages.)
+  _mi_heap_delayed_free_all(heap);
+
+  // collect retired pages
+  _mi_heap_collect_retired(heap, force);
+
+  // collect all pages owned by this thread
+  mi_heap_visit_pages(heap, &mi_heap_page_collect, &collect, NULL);
+  mi_assert_internal( collect != MI_ABANDON || mi_atomic_load_ptr_acquire(mi_block_t,&heap->thread_delayed_free) == NULL );
+
+  // collect abandoned segments (in particular, purge expired parts of segments in the abandoned segment list)
+  // note: forced purge can be quite expensive if many threads are created/destroyed so we do not force on abandonment
+  _mi_abandoned_collect(heap, collect == MI_FORCE /* force? */, &heap->tld->segments);
+
+  // if forced, collect thread data cache on program-exit (or shared library unload)
+  if (force && is_main_thread && mi_heap_is_backing(heap)) {
+    _mi_thread_data_collect();  // collect thread data cache
+  }
+
+  // collect arenas (this is program wide so don't force purges on abandonment of threads)
+  _mi_arenas_collect(collect == MI_FORCE /* force purge? */);
+
+  // merge statistics
+  if (collect <= MI_FORCE) { _mi_stats_merge_thread(heap->tld); }
+}
+
+void _mi_heap_collect_abandon(mi_heap_t* heap) {
+  mi_heap_collect_ex(heap, MI_ABANDON);
+}
+
+void mi_heap_collect(mi_heap_t* heap, bool force) mi_attr_noexcept {
+  mi_heap_collect_ex(heap, (force ? MI_FORCE : MI_NORMAL));
+}
+
+void mi_collect(bool force) mi_attr_noexcept {
+  mi_heap_collect(mi_prim_get_default_heap(), force);
+}
+
+
+/* -----------------------------------------------------------
+  Heap new
+----------------------------------------------------------- */
+
+mi_heap_t* mi_heap_get_default(void) {
+  mi_thread_init();
+  return mi_prim_get_default_heap();
+}
+
+static bool mi_heap_is_default(const mi_heap_t* heap) {
+  return (heap == mi_prim_get_default_heap());
+}
+
+
+mi_heap_t* mi_heap_get_backing(void) {
+  mi_heap_t* heap = mi_heap_get_default();
+  mi_assert_internal(heap!=NULL);
+  mi_heap_t* bheap = heap->tld->heap_backing;
+  mi_assert_internal(bheap!=NULL);
+  mi_assert_internal(bheap->thread_id == _mi_thread_id());
+  return bheap;
+}
+
+void _mi_heap_init(mi_heap_t* heap, mi_tld_t* tld, mi_arena_id_t arena_id, bool noreclaim, uint8_t tag) {
+  _mi_memcpy_aligned(heap, &_mi_heap_empty, sizeof(mi_heap_t));
+  heap->tld = tld;
+  heap->thread_id  = _mi_thread_id();
+  heap->arena_id   = arena_id;
+  heap->no_reclaim = noreclaim;
+  heap->tag        = tag;
+  if (heap == tld->heap_backing) {
+    #if defined(_WIN32) && !defined(MI_SHARED_LIB)
+      _mi_random_init_weak(&heap->random);    // prevent allocation failure during bcrypt dll initialization with static linking (issue #1185)
+    #else
+      _mi_random_init(&heap->random);
+    #endif
+  }
+  else {
+    _mi_random_split(&tld->heap_backing->random, &heap->random);
+  }
+  heap->cookie  = _mi_heap_random_next(heap) | 1;
+  heap->keys[0] = _mi_heap_random_next(heap);
+  heap->keys[1] = _mi_heap_random_next(heap);
+  _mi_heap_guarded_init(heap);
+  // push on the thread local heaps list
+  heap->next = heap->tld->heaps;
+  heap->tld->heaps = heap;
+}
+
+mi_decl_nodiscard mi_heap_t* mi_heap_new_ex(int heap_tag, bool allow_destroy, mi_arena_id_t arena_id) {
+  mi_heap_t* bheap = mi_heap_get_backing();
+  mi_heap_t* heap = mi_heap_malloc_tp(bheap, mi_heap_t);  // todo: OS allocate in secure mode?
+  if (heap == NULL) return NULL;
+  mi_assert(heap_tag >= 0 && heap_tag < 256);
+  _mi_heap_init(heap, bheap->tld, arena_id, allow_destroy /* no reclaim? */, (uint8_t)heap_tag /* heap tag */);
+  return heap;
+}
+
+mi_decl_nodiscard mi_heap_t* mi_heap_new_in_arena(mi_arena_id_t arena_id) {
+  return mi_heap_new_ex(0 /* default heap tag */, false /* don't allow `mi_heap_destroy` */, arena_id);
+}
+
+mi_decl_nodiscard mi_heap_t* mi_heap_new(void) {
+  // don't reclaim abandoned memory or otherwise destroy is unsafe
+  return mi_heap_new_ex(0 /* default heap tag */, true /* no reclaim */, _mi_arena_id_none());
+}
+
+bool _mi_heap_memid_is_suitable(mi_heap_t* heap, mi_memid_t memid) {
+  return _mi_arena_memid_is_suitable(memid, heap->arena_id);
+}
+
+uintptr_t _mi_heap_random_next(mi_heap_t* heap) {
+  return _mi_random_next(&heap->random);
+}
+
+// zero out the page queues
+static void mi_heap_reset_pages(mi_heap_t* heap) {
+  mi_assert_internal(heap != NULL);
+  mi_assert_internal(mi_heap_is_initialized(heap));
+  // TODO: copy full empty heap instead?
+  memset(&heap->pages_free_direct, 0, sizeof(heap->pages_free_direct));
+  _mi_memcpy_aligned(&heap->pages, &_mi_heap_empty.pages, sizeof(heap->pages));
+  heap->thread_delayed_free = NULL;
+  heap->page_count = 0;
+}
+
+// called from `mi_heap_destroy` and `mi_heap_delete` to free the internal heap resources.
+static void mi_heap_free(mi_heap_t* heap) {
+  mi_assert(heap != NULL);
+  mi_assert_internal(mi_heap_is_initialized(heap));
+  if (heap==NULL || !mi_heap_is_initialized(heap)) return;
+  if (mi_heap_is_backing(heap)) return; // don't free the backing heap
+
+  // reset default
+  if (mi_heap_is_default(heap)) {
+    _mi_heap_set_default_direct(heap->tld->heap_backing);
+  }
+
+  // remove ourselves from the thread local heaps list
+  // linear search but we expect the number of heaps to be relatively small
+  mi_heap_t* prev = NULL;
+  mi_heap_t* curr = heap->tld->heaps;
+  while (curr != heap && curr != NULL) {
+    prev = curr;
+    curr = curr->next;
+  }
+  mi_assert_internal(curr == heap);
+  if (curr == heap) {
+    if (prev != NULL) { prev->next = heap->next; }
+                 else { heap->tld->heaps = heap->next; }
+  }
+  mi_assert_internal(heap->tld->heaps != NULL);
+
+  // and free the used memory
+  mi_free(heap);
+}
+
+// return a heap on the same thread as `heap` specialized for the specified tag (if it exists)
+mi_heap_t* _mi_heap_by_tag(mi_heap_t* heap, uint8_t tag) {
+  if (heap->tag == tag) {
+    return heap;
+  }
+  for (mi_heap_t *curr = heap->tld->heaps; curr != NULL; curr = curr->next) {
+    if (curr->tag == tag) {
+      return curr;
+    }
+  }
+  return NULL;
+}
+
+/* -----------------------------------------------------------
+  Heap destroy
+----------------------------------------------------------- */
+
+static bool _mi_heap_page_destroy(mi_heap_t* heap, mi_page_queue_t* pq, mi_page_t* page, void* arg1, void* arg2) {
+  MI_UNUSED(arg1);
+  MI_UNUSED(arg2);
+  MI_UNUSED(heap);
+  MI_UNUSED(pq);
+
+  // ensure no more thread_delayed_free will be added
+  _mi_page_use_delayed_free(page, MI_NEVER_DELAYED_FREE, false);
+
+  // stats
+  const size_t bsize = mi_page_block_size(page);
+  if (bsize > MI_MEDIUM_OBJ_SIZE_MAX) {
+    //if (bsize <= MI_LARGE_OBJ_SIZE_MAX) {
+    //  mi_heap_stat_decrease(heap, malloc_large, bsize);
+    //}
+    //else 
+    {
+      mi_heap_stat_decrease(heap, malloc_huge, bsize);
+    }
+  }
+  #if (MI_STAT>0)
+  _mi_page_free_collect(page, false);  // update used count
+  const size_t inuse = page->used;
+  if (bsize <= MI_LARGE_OBJ_SIZE_MAX) {
+    mi_heap_stat_decrease(heap, malloc_normal, bsize * inuse);
+    #if (MI_STAT>1)
+    mi_heap_stat_decrease(heap, malloc_bins[_mi_bin(bsize)], inuse);
+    #endif
+  }
+  // mi_heap_stat_decrease(heap, malloc_requested, bsize * inuse);  // todo: off for aligned blocks...
+  #endif
+
+  // pretend it is all free now
+  mi_assert_internal(mi_page_thread_free(page) == NULL);
+  page->used = 0;
+
+  // and free the page
+  // mi_page_free(page,false);
+  page->next = NULL;
+  page->prev = NULL;
+  _mi_segment_page_free(page,false /* no force? */, &heap->tld->segments);
+
+  return true; // keep going
+}
+
+void _mi_heap_destroy_pages(mi_heap_t* heap) {
+  mi_heap_visit_pages(heap, &_mi_heap_page_destroy, NULL, NULL);
+  mi_heap_reset_pages(heap);
+}
+
+#if MI_TRACK_HEAP_DESTROY
+static bool mi_cdecl mi_heap_track_block_free(const mi_heap_t* heap, const mi_heap_area_t* area, void* block, size_t block_size, void* arg) {
+  MI_UNUSED(heap); MI_UNUSED(area);  MI_UNUSED(arg); MI_UNUSED(block_size);
+  mi_track_free_size(block,mi_usable_size(block));
+  return true;
+}
+#endif
+
+void mi_heap_destroy(mi_heap_t* heap) {
+  mi_assert(heap != NULL);
+  mi_assert(mi_heap_is_initialized(heap));
+  mi_assert(heap->no_reclaim);
+  mi_assert_expensive(mi_heap_is_valid(heap));
+  if (heap==NULL || !mi_heap_is_initialized(heap)) return;
+  #if MI_GUARDED
+  // _mi_warning_message("'mi_heap_destroy' called but MI_GUARDED is enabled -- using `mi_heap_delete` instead (heap at %p)\n", heap);
+  mi_heap_delete(heap);
+  return;
+  #else
+  if (!heap->no_reclaim) {
+    _mi_warning_message("'mi_heap_destroy' called but ignored as the heap was not created with 'allow_destroy' (heap at %p)\n", heap);
+    // don't free in case it may contain reclaimed pages
+    mi_heap_delete(heap);
+  }
+  else {
+    // track all blocks as freed
+    #if MI_TRACK_HEAP_DESTROY
+    mi_heap_visit_blocks(heap, true, mi_heap_track_block_free, NULL);
+    #endif
+    // free all pages
+    _mi_heap_destroy_pages(heap);
+    mi_heap_free(heap);
+  }
+  #endif
+}
+
+// forcefully destroy all heaps in the current thread
+void _mi_heap_unsafe_destroy_all(mi_heap_t* heap) {
+  mi_assert_internal(heap != NULL);
+  if (heap == NULL) return;
+  mi_heap_t* curr = heap->tld->heaps;
+  while (curr != NULL) {
+    mi_heap_t* next = curr->next;
+    if (curr->no_reclaim) {
+      mi_heap_destroy(curr);
+    }
+    else {
+      _mi_heap_destroy_pages(curr);
+    }
+    curr = next;
+  }
+}
+
+/* -----------------------------------------------------------
+  Safe Heap delete
+----------------------------------------------------------- */
+
+// Transfer the pages from one heap to the other
+static void mi_heap_absorb(mi_heap_t* heap, mi_heap_t* from) {
+  mi_assert_internal(heap!=NULL);
+  if (from==NULL || from->page_count == 0) return;
+
+  // reduce the size of the delayed frees
+  _mi_heap_delayed_free_partial(from);
+
+  // transfer all pages by appending the queues; this will set a new heap field
+  // so threads may do delayed frees in either heap for a while.
+  // note: appending waits for each page to not be in the `MI_DELAYED_FREEING` state
+  // so after this only the new heap will get delayed frees
+  for (size_t i = 0; i <= MI_BIN_FULL; i++) {
+    mi_page_queue_t* pq = &heap->pages[i];
+    mi_page_queue_t* append = &from->pages[i];
+    size_t pcount = _mi_page_queue_append(heap, pq, append);
+    heap->page_count += pcount;
+    from->page_count -= pcount;
+  }
+  mi_assert_internal(from->page_count == 0);
+
+  // and do outstanding delayed frees in the `from` heap
+  // note: be careful here as the `heap` field in all those pages no longer points to `from`;
+  // this turns out to be ok as `_mi_heap_delayed_free_all` only visits the list and calls
+  // the regular `_mi_free_delayed_block` which is safe.
+  _mi_heap_delayed_free_all(from);
+  #if !defined(_MSC_VER) || (_MSC_VER > 1900) // somehow the following line gives an error in VS2015, issue #353
+  mi_assert_internal(mi_atomic_load_ptr_relaxed(mi_block_t,&from->thread_delayed_free) == NULL);
+  #endif
+
+  // and reset the `from` heap
+  mi_heap_reset_pages(from);
+}
+
+// are two heaps compatible with respect to heap-tag, exclusive arena etc.
+static bool mi_heaps_are_compatible(mi_heap_t* heap1, mi_heap_t* heap2) {
+  return (heap1->tag == heap2->tag &&                   // store same kind of objects
+          heap1->arena_id == heap2->arena_id);          // same arena preference
+}
+
+// Safely delete a heap without freeing any still-allocated blocks in that heap.
+void mi_heap_delete(mi_heap_t* heap)
+{
+  mi_assert(heap != NULL);
+  mi_assert(mi_heap_is_initialized(heap));
+  mi_assert_expensive(mi_heap_is_valid(heap));
+  if (heap==NULL || !mi_heap_is_initialized(heap)) return;
+
+  mi_heap_t* bheap = heap->tld->heap_backing;
+  if (bheap != heap && mi_heaps_are_compatible(bheap,heap)) {
+    // transfer still used pages to the backing heap
+    mi_heap_absorb(bheap, heap);
+  }
+  else {
+    // the heap is the backing heap (or is incompatible with it): abandon its pages
+    _mi_heap_collect_abandon(heap);
+  }
+  mi_assert_internal(heap->page_count==0);
+  mi_heap_free(heap);
+}
+
+mi_heap_t* mi_heap_set_default(mi_heap_t* heap) {
+  mi_assert(heap != NULL);
+  mi_assert(mi_heap_is_initialized(heap));
+  if (heap==NULL || !mi_heap_is_initialized(heap)) return NULL;
+  mi_assert_expensive(mi_heap_is_valid(heap));
+  mi_heap_t* old = mi_prim_get_default_heap();
+  _mi_heap_set_default_direct(heap);
+  return old;
+}
+
+
+
+
+/* -----------------------------------------------------------
+  Analysis
+----------------------------------------------------------- */
+
+// static since it is not thread safe to access heaps from other threads.
+static mi_heap_t* mi_heap_of_block(const void* p) {
+  if (p == NULL) return NULL;
+  mi_segment_t* segment = _mi_ptr_segment(p);
+  bool valid = (_mi_ptr_cookie(segment) == segment->cookie);
+  mi_assert_internal(valid);
+  if mi_unlikely(!valid) return NULL;
+  return mi_page_heap(_mi_segment_page_of(segment,p));
+}
+
+bool mi_heap_contains_block(mi_heap_t* heap, const void* p) {
+  mi_assert(heap != NULL);
+  if (heap==NULL || !mi_heap_is_initialized(heap)) return false;
+  return (heap == mi_heap_of_block(p));
+}
+
+
+static bool mi_heap_page_check_owned(mi_heap_t* heap, mi_page_queue_t* pq, mi_page_t* page, void* p, void* vfound) {
+  MI_UNUSED(heap);
+  MI_UNUSED(pq);
+  bool* found = (bool*)vfound;
+  void* start = mi_page_start(page);
+  void* end   = (uint8_t*)start + (page->capacity * mi_page_block_size(page));
+  *found = (p >= start && p < end);
+  return (!*found); // continue if not found
+}
+
+bool mi_heap_check_owned(mi_heap_t* heap, const void* p) {
+  mi_assert(heap != NULL);
+  if (heap==NULL || !mi_heap_is_initialized(heap)) return false;
+  if (((uintptr_t)p & (MI_INTPTR_SIZE - 1)) != 0) return false;  // only aligned pointers
+  bool found = false;
+  mi_heap_visit_pages(heap, &mi_heap_page_check_owned, (void*)p, &found);
+  return found;
+}
+
+bool mi_check_owned(const void* p) {
+  return mi_heap_check_owned(mi_prim_get_default_heap(), p);
+}
+
+/* -----------------------------------------------------------
+  Visit all heap blocks and areas
+  Todo: enable visiting abandoned pages, and
+        enable visiting all blocks of all heaps across threads
+----------------------------------------------------------- */
+
+void _mi_heap_area_init(mi_heap_area_t* area, mi_page_t* page) {
+  const size_t bsize = mi_page_block_size(page);
+  const size_t ubsize = mi_page_usable_block_size(page);
+  area->reserved = page->reserved * bsize;
+  area->committed = page->capacity * bsize;
+  area->blocks = mi_page_start(page);
+  area->used = page->used;   // number of blocks in use (#553)
+  area->block_size = ubsize;
+  area->full_block_size = bsize;
+  area->heap_tag = page->heap_tag;
+}
+
+
+static void mi_get_fast_divisor(size_t divisor, uint64_t* magic, size_t* shift) {
+  mi_assert_internal(divisor > 0 && divisor <= UINT32_MAX);
+  *shift = MI_SIZE_BITS - mi_clz(divisor - 1);
+  *magic = ((((uint64_t)1 << 32) * (((uint64_t)1 << *shift) - divisor)) / divisor + 1);
+}
+
+static size_t mi_fast_divide(size_t n, uint64_t magic, size_t shift) {
+  mi_assert_internal(n <= UINT32_MAX);
+  const uint64_t hi = ((uint64_t)n * magic) >> 32;
+  return (size_t)((hi + n) >> shift);
+}
+
+bool _mi_heap_area_visit_blocks(const mi_heap_area_t* area, mi_page_t* page, mi_block_visit_fun* visitor, void* arg) {
+  mi_assert(area != NULL);
+  if (area==NULL) return true;
+  mi_assert(page != NULL);
+  if (page == NULL) return true;
+
+  _mi_page_free_collect(page,true);              // collect both thread_delayed and local_free
+  mi_assert_internal(page->local_free == NULL);
+  if (page->used == 0) return true;
+
+  size_t psize;
+  uint8_t* const pstart = _mi_segment_page_start(_mi_page_segment(page), page, &psize);
+  mi_heap_t* const heap = mi_page_heap(page);
+  const size_t bsize    = mi_page_block_size(page);
+  const size_t ubsize   = mi_page_usable_block_size(page); // without padding
+
+  // optimize page with one block
+  if (page->capacity == 1) {
+    mi_assert_internal(page->used == 1 && page->free == NULL);
+    return visitor(mi_page_heap(page), area, pstart, ubsize, arg);
+  }
+  mi_assert(bsize <= UINT32_MAX);
+
+  // optimize full pages
+  if (page->used == page->capacity) {
+    uint8_t* block = pstart;
+    for (size_t i = 0; i < page->capacity; i++) {
+      if (!visitor(heap, area, block, ubsize, arg)) return false;
+      block += bsize;
+    }
+    return true;
+  }
+
+  // create a bitmap of free blocks.
+  #define MI_MAX_BLOCKS   (MI_SMALL_PAGE_SIZE / sizeof(void*))
+  uintptr_t free_map[MI_MAX_BLOCKS / MI_INTPTR_BITS];
+  const uintptr_t bmapsize = _mi_divide_up(page->capacity, MI_INTPTR_BITS);
+  memset(free_map, 0, bmapsize * sizeof(intptr_t));
+  if (page->capacity % MI_INTPTR_BITS != 0) {
+    // mark left-over bits at the end as free
+    size_t shift   = (page->capacity % MI_INTPTR_BITS);
+    uintptr_t mask = (UINTPTR_MAX << shift);
+    free_map[bmapsize - 1] = mask;
+  }
+
+  // fast repeated division by the block size
+  uint64_t magic;
+  size_t   shift;
+  mi_get_fast_divisor(bsize, &magic, &shift);
+
+  #if MI_DEBUG>1
+  size_t free_count = 0;
+  #endif
+  for (mi_block_t* block = page->free; block != NULL; block = mi_block_next(page, block)) {
+    #if MI_DEBUG>1
+    free_count++;
+    #endif
+    mi_assert_internal((uint8_t*)block >= pstart && (uint8_t*)block < (pstart + psize));
+    size_t offset = (uint8_t*)block - pstart;
+    mi_assert_internal(offset % bsize == 0);
+    mi_assert_internal(offset <= UINT32_MAX);
+    size_t blockidx = mi_fast_divide(offset, magic, shift);
+    mi_assert_internal(blockidx == offset / bsize);
+    mi_assert_internal(blockidx < MI_MAX_BLOCKS);
+    size_t bitidx = (blockidx / MI_INTPTR_BITS);
+    size_t bit = blockidx - (bitidx * MI_INTPTR_BITS);
+    free_map[bitidx] |= ((uintptr_t)1 << bit);
+  }
+  mi_assert_internal(page->capacity == (free_count + page->used));
+
+  // walk through all blocks skipping the free ones
+  #if MI_DEBUG>1
+  size_t used_count = 0;
+  #endif
+  uint8_t* block = pstart;
+  for (size_t i = 0; i < bmapsize; i++) {
+    if (free_map[i] == 0) {
+      // every block is in use
+      for (size_t j = 0; j < MI_INTPTR_BITS; j++) {
+        #if MI_DEBUG>1
+        used_count++;
+        #endif
+        if (!visitor(heap, area, block, ubsize, arg)) return false;
+        block += bsize;
+      }
+    }
+    else {
+      // visit the used blocks in the mask
+      uintptr_t m = ~free_map[i];
+      while (m != 0) {
+        #if MI_DEBUG>1
+        used_count++;
+        #endif
+        size_t bitidx = mi_ctz(m);
+        if (!visitor(heap, area, block + (bitidx * bsize), ubsize, arg)) return false;
+        m &= m - 1;  // clear least significant bit
+      }
+      block += bsize * MI_INTPTR_BITS;
+    }
+  }
+  mi_assert_internal(page->used == used_count);
+  return true;
+}
+
+
+
+// Separate struct to keep `mi_page_t` out of the public interface
+typedef struct mi_heap_area_ex_s {
+  mi_heap_area_t area;
+  mi_page_t* page;
+} mi_heap_area_ex_t;
+
+typedef bool (mi_heap_area_visit_fun)(const mi_heap_t* heap, const mi_heap_area_ex_t* area, void* arg);
+
+static bool mi_heap_visit_areas_page(mi_heap_t* heap, mi_page_queue_t* pq, mi_page_t* page, void* vfun, void* arg) {
+  MI_UNUSED(heap);
+  MI_UNUSED(pq);
+  mi_heap_area_visit_fun* fun = (mi_heap_area_visit_fun*)vfun;
+  mi_heap_area_ex_t xarea;
+  xarea.page = page;
+  _mi_heap_area_init(&xarea.area, page);
+  return fun(heap, &xarea, arg);
+}
+
+// Visit all heap pages as areas
+static bool mi_heap_visit_areas(const mi_heap_t* heap, mi_heap_area_visit_fun* visitor, void* arg) {
+  if (visitor == NULL) return false;
+  return mi_heap_visit_pages((mi_heap_t*)heap, &mi_heap_visit_areas_page, (void*)(visitor), arg); // note: function pointer to void* :-{
+}
+
+// Just to pass arguments
+typedef struct mi_visit_blocks_args_s {
+  bool  visit_blocks;
+  mi_block_visit_fun* visitor;
+  void* arg;
+} mi_visit_blocks_args_t;
+
+static bool mi_heap_area_visitor(const mi_heap_t* heap, const mi_heap_area_ex_t* xarea, void* arg) {
+  mi_visit_blocks_args_t* args = (mi_visit_blocks_args_t*)arg;
+  if (!args->visitor(heap, &xarea->area, NULL, xarea->area.block_size, args->arg)) return false;
+  if (args->visit_blocks) {
+    return _mi_heap_area_visit_blocks(&xarea->area, xarea->page, args->visitor, args->arg);
+  }
+  else {
+    return true;
+  }
+}
+
+// Visit all blocks in a heap
+bool mi_heap_visit_blocks(const mi_heap_t* heap, bool visit_blocks, mi_block_visit_fun* visitor, void* arg) {
+  mi_visit_blocks_args_t args = { visit_blocks, visitor, arg };
+  return mi_heap_visit_areas(heap, &mi_heap_area_visitor, &args);
+}
diff --git a/compat/mimalloc/init.c b/compat/mimalloc/init.c
new file mode 100644
index 00000000000000..c6cca89da9c5db
--- /dev/null
+++ b/compat/mimalloc/init.c
@@ -0,0 +1,715 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2022, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+#include "mimalloc.h"
+#include "mimalloc/internal.h"
+#include "mimalloc/prim.h"
+
+#include <string.h>  // memcpy, memset
+#include <stdlib.h>  // atexit
+
+
+// Empty page used to initialize the small free pages array
+const mi_page_t _mi_page_empty = {
+  0,
+  false, false, false, false,
+  0,       // capacity
+  0,       // reserved capacity
+  { 0 },   // flags
+  false,   // is_zero
+  0,       // retire_expire
+  NULL,    // free
+  NULL,    // local_free
+  0,       // used
+  0,       // block size shift
+  0,       // heap tag
+  0,       // block_size
+  NULL,    // page_start
+  #if (MI_PADDING || MI_ENCODE_FREELIST)
+  { 0, 0 },
+  #endif
+  MI_ATOMIC_VAR_INIT(0), // xthread_free
+  MI_ATOMIC_VAR_INIT(0), // xheap
+  NULL, NULL
+  , { 0 }  // padding
+};
+
+#define MI_PAGE_EMPTY() ((mi_page_t*)&_mi_page_empty)
+
+#if (MI_SMALL_WSIZE_MAX==128)
+#if (MI_PADDING>0) && (MI_INTPTR_SIZE >= 8)
+#define MI_SMALL_PAGES_EMPTY  { MI_INIT128(MI_PAGE_EMPTY), MI_PAGE_EMPTY(), MI_PAGE_EMPTY() }
+#elif (MI_PADDING>0)
+#define MI_SMALL_PAGES_EMPTY  { MI_INIT128(MI_PAGE_EMPTY), MI_PAGE_EMPTY(), MI_PAGE_EMPTY(), MI_PAGE_EMPTY() }
+#else
+#define MI_SMALL_PAGES_EMPTY  { MI_INIT128(MI_PAGE_EMPTY), MI_PAGE_EMPTY() }
+#endif
+#else
+#error "define right initialization sizes corresponding to MI_SMALL_WSIZE_MAX"
+#endif
+
+// Empty page queues for every bin
+#define QNULL(sz)  { NULL, NULL, (sz)*sizeof(uintptr_t) }
+#define MI_PAGE_QUEUES_EMPTY \
+  { QNULL(1), \
+    QNULL(     1), QNULL(     2), QNULL(     3), QNULL(     4), QNULL(     5), QNULL(     6), QNULL(     7), QNULL(     8), /* 8 */ \
+    QNULL(    10), QNULL(    12), QNULL(    14), QNULL(    16), QNULL(    20), QNULL(    24), QNULL(    28), QNULL(    32), /* 16 */ \
+    QNULL(    40), QNULL(    48), QNULL(    56), QNULL(    64), QNULL(    80), QNULL(    96), QNULL(   112), QNULL(   128), /* 24 */ \
+    QNULL(   160), QNULL(   192), QNULL(   224), QNULL(   256), QNULL(   320), QNULL(   384), QNULL(   448), QNULL(   512), /* 32 */ \
+    QNULL(   640), QNULL(   768), QNULL(   896), QNULL(  1024), QNULL(  1280), QNULL(  1536), QNULL(  1792), QNULL(  2048), /* 40 */ \
+    QNULL(  2560), QNULL(  3072), QNULL(  3584), QNULL(  4096), QNULL(  5120), QNULL(  6144), QNULL(  7168), QNULL(  8192), /* 48 */ \
+    QNULL( 10240), QNULL( 12288), QNULL( 14336), QNULL( 16384), QNULL( 20480), QNULL( 24576), QNULL( 28672), QNULL( 32768), /* 56 */ \
+    QNULL( 40960), QNULL( 49152), QNULL( 57344), QNULL( 65536), QNULL( 81920), QNULL( 98304), QNULL(114688), QNULL(131072), /* 64 */ \
+    QNULL(163840), QNULL(196608), QNULL(229376), QNULL(262144), QNULL(327680), QNULL(393216), QNULL(458752), QNULL(524288), /* 72 */ \
+    QNULL(MI_MEDIUM_OBJ_WSIZE_MAX + 1  /* 655360, Huge queue */), \
+    QNULL(MI_MEDIUM_OBJ_WSIZE_MAX + 2) /* Full queue */ }
+
+#define MI_STAT_COUNT_NULL()  {0,0,0}
+
+// Empty statistics
+#define MI_STATS_NULL  \
+  MI_STAT_COUNT_NULL(), MI_STAT_COUNT_NULL(), MI_STAT_COUNT_NULL(), \
+  { 0 }, { 0 }, \
+  MI_STAT_COUNT_NULL(), MI_STAT_COUNT_NULL(), MI_STAT_COUNT_NULL(), \
+  MI_STAT_COUNT_NULL(), MI_STAT_COUNT_NULL(), MI_STAT_COUNT_NULL(), \
+  { 0 }, { 0 }, { 0 }, { 0 }, \
+  { 0 }, { 0 }, { 0 }, { 0 }, \
+  \
+  { 0 }, { 0 }, { 0 }, { 0 }, { 0 }, { 0 }, \
+  MI_INIT4(MI_STAT_COUNT_NULL), \
+  { 0 }, { 0 }, { 0 }, { 0 },  \
+  \
+  { MI_INIT4(MI_STAT_COUNT_NULL) }, \
+  { { 0 }, { 0 }, { 0 }, { 0 } }, \
+  \
+  { MI_INIT74(MI_STAT_COUNT_NULL) }, \
+  { MI_INIT74(MI_STAT_COUNT_NULL) }
+
+
+// Empty slice span queues for every bin
+#define SQNULL(sz)  { NULL, NULL, sz }
+#define MI_SEGMENT_SPAN_QUEUES_EMPTY \
+  { SQNULL(1), \
+    SQNULL(     1), SQNULL(     2), SQNULL(     3), SQNULL(     4), SQNULL(     5), SQNULL(     6), SQNULL(     7), SQNULL(    10), /*  8 */ \
+    SQNULL(    12), SQNULL(    14), SQNULL(    16), SQNULL(    20), SQNULL(    24), SQNULL(    28), SQNULL(    32), SQNULL(    40), /* 16 */ \
+    SQNULL(    48), SQNULL(    56), SQNULL(    64), SQNULL(    80), SQNULL(    96), SQNULL(   112), SQNULL(   128), SQNULL(   160), /* 24 */ \
+    SQNULL(   192), SQNULL(   224), SQNULL(   256), SQNULL(   320), SQNULL(   384), SQNULL(   448), SQNULL(   512), SQNULL(   640), /* 32 */ \
+    SQNULL(   768), SQNULL(   896), SQNULL(  1024) /* 35 */ }
+
+
+// --------------------------------------------------------
+// Statically allocate an empty heap as the initial
+// thread local value for the default heap,
+// and statically allocate the backing heap for the main
+// thread so it can function without doing any allocation
+// itself (as accessing a thread local for the first time
+// may lead to allocation itself on some platforms)
+// --------------------------------------------------------
+
+mi_decl_cache_align const mi_heap_t _mi_heap_empty = {
+  NULL,
+  MI_ATOMIC_VAR_INIT(NULL),
+  0,                // tid
+  0,                // cookie
+  0,                // arena id
+  { 0, 0 },         // keys
+  { {0}, {0}, 0, true }, // random
+  0,                // page count
+  MI_BIN_FULL, 0,   // page retired min/max
+  0, 0,             // generic count
+  NULL,             // next
+  false,            // can reclaim
+  0,                // tag
+  #if MI_GUARDED
+  0, 0, 0, 1,       // count is 1 so we never write to it (see `internal.h:mi_heap_malloc_use_guarded`)
+  #endif
+  MI_SMALL_PAGES_EMPTY,
+  MI_PAGE_QUEUES_EMPTY
+};
+
+static mi_decl_cache_align mi_subproc_t mi_subproc_default;
+
+#define tld_empty_stats  ((mi_stats_t*)((uint8_t*)&tld_empty + offsetof(mi_tld_t,stats)))
+
+mi_decl_cache_align static const mi_tld_t tld_empty = {
+  0,
+  false,
+  NULL, NULL,
+  { MI_SEGMENT_SPAN_QUEUES_EMPTY, 0, 0, 0, 0, 0, &mi_subproc_default, tld_empty_stats }, // segments
+  { MI_STAT_VERSION, MI_STATS_NULL }       // stats
+};
+
+mi_threadid_t _mi_thread_id(void) mi_attr_noexcept {
+  return _mi_prim_thread_id();
+}
+
+// the thread-local default heap for allocation
+mi_decl_thread mi_heap_t* _mi_heap_default = (mi_heap_t*)&_mi_heap_empty;
+
+extern mi_decl_hidden mi_heap_t _mi_heap_main;
+
+static mi_decl_cache_align mi_tld_t tld_main = {
+  0, false,
+  &_mi_heap_main, & _mi_heap_main,
+  { MI_SEGMENT_SPAN_QUEUES_EMPTY, 0, 0, 0, 0, 0, &mi_subproc_default, &tld_main.stats }, // segments
+  { MI_STAT_VERSION, MI_STATS_NULL }       // stats
+};
+
+mi_decl_cache_align mi_heap_t _mi_heap_main = {
+  &tld_main,
+  MI_ATOMIC_VAR_INIT(NULL),
+  0,                // thread id
+  0,                // initial cookie
+  0,                // arena id
+  { 0, 0 },         // the key of the main heap can be fixed (unlike page keys that need to be secure!)
+  { {0x846ca68b}, {0}, 0, true },  // random
+  0,                // page count
+  MI_BIN_FULL, 0,   // page retired min/max
+  0, 0,             // generic count
+  NULL,             // next heap
+  false,            // can reclaim
+  0,                // tag
+  #if MI_GUARDED
+  0, 0, 0, 0,
+  #endif
+  MI_SMALL_PAGES_EMPTY,
+  MI_PAGE_QUEUES_EMPTY
+};
+
+bool _mi_process_is_initialized = false;  // set to `true` in `mi_process_init`.
+
+mi_stats_t _mi_stats_main = { MI_STAT_VERSION, MI_STATS_NULL };
+
+#if MI_GUARDED
+mi_decl_export void mi_heap_guarded_set_sample_rate(mi_heap_t* heap, size_t sample_rate, size_t seed) {
+  heap->guarded_sample_rate  = sample_rate;
+  heap->guarded_sample_count = sample_rate;  // count down samples
+  if (heap->guarded_sample_rate > 1) {
+    if (seed == 0) {
+      seed = _mi_heap_random_next(heap);
+    }
+    heap->guarded_sample_count = (seed % heap->guarded_sample_rate) + 1;  // start at random count between 1 and `sample_rate`
+  }
+}
+
+mi_decl_export void mi_heap_guarded_set_size_bound(mi_heap_t* heap, size_t min, size_t max) {
+  heap->guarded_size_min = min;
+  heap->guarded_size_max = (min > max ? min : max);
+}
+
+void _mi_heap_guarded_init(mi_heap_t* heap) {
+  mi_heap_guarded_set_sample_rate(heap,
+    (size_t)mi_option_get_clamp(mi_option_guarded_sample_rate, 0, LONG_MAX),
+    (size_t)mi_option_get(mi_option_guarded_sample_seed));
+  mi_heap_guarded_set_size_bound(heap,
+    (size_t)mi_option_get_clamp(mi_option_guarded_min, 0, LONG_MAX),
+    (size_t)mi_option_get_clamp(mi_option_guarded_max, 0, LONG_MAX) );
+}
+#else
+mi_decl_export void mi_heap_guarded_set_sample_rate(mi_heap_t* heap, size_t sample_rate, size_t seed) {
+  MI_UNUSED(heap); MI_UNUSED(sample_rate); MI_UNUSED(seed);
+}
+
+mi_decl_export void mi_heap_guarded_set_size_bound(mi_heap_t* heap, size_t min, size_t max) {
+  MI_UNUSED(heap); MI_UNUSED(min); MI_UNUSED(max);
+}
+void _mi_heap_guarded_init(mi_heap_t* heap) {
+  MI_UNUSED(heap);
+}
+#endif
+
+
+static void mi_heap_main_init(void) {
+  if (_mi_heap_main.cookie == 0) {
+    _mi_heap_main.thread_id = _mi_thread_id();
+    _mi_heap_main.cookie = 1;
+    #if defined(_WIN32) && !defined(MI_SHARED_LIB)
+      _mi_random_init_weak(&_mi_heap_main.random);    // prevent allocation failure during bcrypt dll initialization with static linking
+    #else
+      _mi_random_init(&_mi_heap_main.random);
+    #endif
+    _mi_heap_main.cookie  = _mi_heap_random_next(&_mi_heap_main);
+    _mi_heap_main.keys[0] = _mi_heap_random_next(&_mi_heap_main);
+    _mi_heap_main.keys[1] = _mi_heap_random_next(&_mi_heap_main);
+    mi_lock_init(&mi_subproc_default.abandoned_os_lock);
+    mi_lock_init(&mi_subproc_default.abandoned_os_visit_lock);
+    _mi_heap_guarded_init(&_mi_heap_main);
+  }
+}
+
+mi_heap_t* _mi_heap_main_get(void) {
+  mi_heap_main_init();
+  return &_mi_heap_main;
+}
+
+/* -----------------------------------------------------------
+  Sub process
+----------------------------------------------------------- */
+
+mi_subproc_id_t mi_subproc_main(void) {
+  return NULL;
+}
+
+mi_subproc_id_t mi_subproc_new(void) {
+  mi_memid_t memid = _mi_memid_none();
+  mi_subproc_t* subproc = (mi_subproc_t*)_mi_arena_meta_zalloc(sizeof(mi_subproc_t), &memid);
+  if (subproc == NULL) return NULL;
+  subproc->memid = memid;
+  subproc->abandoned_os_list = NULL;
+  mi_lock_init(&subproc->abandoned_os_lock);
+  mi_lock_init(&subproc->abandoned_os_visit_lock);
+  return subproc;
+}
+
+mi_subproc_t* _mi_subproc_from_id(mi_subproc_id_t subproc_id) {
+  return (subproc_id == NULL ? &mi_subproc_default : (mi_subproc_t*)subproc_id);
+}
+
+void mi_subproc_delete(mi_subproc_id_t subproc_id) {
+  if (subproc_id == NULL) return;
+  mi_subproc_t* subproc = _mi_subproc_from_id(subproc_id);
+  // check if there are no abandoned segments still..
+  bool safe_to_delete = false;
+  mi_lock(&subproc->abandoned_os_lock) {
+    if (subproc->abandoned_os_list == NULL) {
+      safe_to_delete = true;
+    }
+  }
+  if (!safe_to_delete) return;
+  // safe to release
+  // todo: should we refcount subprocesses?
+  mi_lock_done(&subproc->abandoned_os_lock);
+  mi_lock_done(&subproc->abandoned_os_visit_lock);
+  _mi_arena_meta_free(subproc, subproc->memid, sizeof(mi_subproc_t));
+}
+
+void mi_subproc_add_current_thread(mi_subproc_id_t subproc_id) {
+  mi_heap_t* heap = mi_heap_get_default();
+  if (heap == NULL) return;
+  mi_assert(heap->tld->segments.subproc == &mi_subproc_default);
+  if (heap->tld->segments.subproc != &mi_subproc_default) return;
+  heap->tld->segments.subproc = _mi_subproc_from_id(subproc_id);
+}
+
+
+
+/* -----------------------------------------------------------
+  Initialization and freeing of the thread local heaps
+----------------------------------------------------------- */
+
+// note: in x64 in release build `sizeof(mi_thread_data_t)` is under 4KiB (= OS page size).
+typedef struct mi_thread_data_s {
+  mi_heap_t  heap;   // must come first due to cast in `_mi_heap_done`
+  mi_tld_t   tld;
+  mi_memid_t memid;  // must come last due to zero'ing
+} mi_thread_data_t;
+
+
+// Thread meta-data is allocated directly from the OS. For
+// some programs that do not use thread pools and allocate and
+// destroy many OS threads, this may cause too much overhead
+// per thread so we maintain a small cache of recently freed metadata.
+
+#define TD_CACHE_SIZE (32)
+static _Atomic(mi_thread_data_t*) td_cache[TD_CACHE_SIZE];
+
+static mi_thread_data_t* mi_thread_data_zalloc(void) {
+  // try to find thread metadata in the cache
+  mi_thread_data_t* td = NULL;
+  for (int i = 0; i < TD_CACHE_SIZE; i++) {
+    td = mi_atomic_load_ptr_relaxed(mi_thread_data_t, &td_cache[i]);
+    if (td != NULL) {
+      // found cached allocation, try to use it
+      td = mi_atomic_exchange_ptr_acq_rel(mi_thread_data_t, &td_cache[i], NULL);
+      if (td != NULL) {
+        _mi_memzero(td, offsetof(mi_thread_data_t,memid));
+        return td;
+      }
+    }
+  }
+
+  // if that fails, allocate as meta data
+  mi_memid_t memid;
+  td = (mi_thread_data_t*)_mi_os_zalloc(sizeof(mi_thread_data_t), &memid);
+  if (td == NULL) {
+    // if this fails, try once more. (issue #257)
+    td = (mi_thread_data_t*)_mi_os_zalloc(sizeof(mi_thread_data_t), &memid);
+    if (td == NULL) {
+      // really out of memory
+      _mi_error_message(ENOMEM, "unable to allocate thread local heap metadata (%zu bytes)\n", sizeof(mi_thread_data_t));
+      return NULL;
+    }
+  }
+  td->memid = memid;
+  return td;
+}
+
+static void mi_thread_data_free( mi_thread_data_t* tdfree ) {
+  // try to add the thread metadata to the cache
+  for (int i = 0; i < TD_CACHE_SIZE; i++) {
+    mi_thread_data_t* td = mi_atomic_load_ptr_relaxed(mi_thread_data_t, &td_cache[i]);
+    if (td == NULL) {
+      mi_thread_data_t* expected = NULL;
+      if (mi_atomic_cas_ptr_weak_acq_rel(mi_thread_data_t, &td_cache[i], &expected, tdfree)) {
+        return;
+      }
+    }
+  }
+  // if that fails, just free it directly
+  _mi_os_free(tdfree, sizeof(mi_thread_data_t), tdfree->memid);
+}
+
+void _mi_thread_data_collect(void) {
+  // free all thread metadata from the cache
+  for (int i = 0; i < TD_CACHE_SIZE; i++) {
+    mi_thread_data_t* td = mi_atomic_load_ptr_relaxed(mi_thread_data_t, &td_cache[i]);
+    if (td != NULL) {
+      td = mi_atomic_exchange_ptr_acq_rel(mi_thread_data_t, &td_cache[i], NULL);
+      if (td != NULL) {
+        _mi_os_free(td, sizeof(mi_thread_data_t), td->memid);
+      }
+    }
+  }
+}
+
+// Initialize the thread local default heap, called from `mi_thread_init`
+static bool _mi_thread_heap_init(void) {
+  if (mi_heap_is_initialized(mi_prim_get_default_heap())) return true;
+  if (_mi_is_main_thread()) {
+    // mi_assert_internal(_mi_heap_main.thread_id != 0);  // can happen on FreeBSD where alloc is called before any initialization
+    // the main heap is statically allocated
+    mi_heap_main_init();
+    _mi_heap_set_default_direct(&_mi_heap_main);
+    //mi_assert_internal(_mi_heap_default->tld->heap_backing == mi_prim_get_default_heap());
+  }
+  else {
+    // use `_mi_os_alloc` to allocate directly from the OS
+    mi_thread_data_t* td = mi_thread_data_zalloc();
+    if (td == NULL) return false;
+
+    mi_tld_t*  tld = &td->tld;
+    mi_heap_t* heap = &td->heap;
+    _mi_tld_init(tld, heap);  // must be before `_mi_heap_init`
+    _mi_heap_init(heap, tld, _mi_arena_id_none(), false /* can reclaim */, 0 /* default tag */);
+    _mi_heap_set_default_direct(heap);
+  }
+  return false;
+}
+
+// initialize thread local data
+void _mi_tld_init(mi_tld_t* tld, mi_heap_t* bheap) {
+  _mi_memcpy_aligned(tld, &tld_empty, sizeof(mi_tld_t));
+  tld->heap_backing = bheap;
+  tld->heaps = NULL;
+  tld->segments.subproc = &mi_subproc_default;
+  tld->segments.stats = &tld->stats;
+}
+
+// Free the thread local default heap (called from `mi_thread_done`)
+static bool _mi_thread_heap_done(mi_heap_t* heap) {
+  if (!mi_heap_is_initialized(heap)) return true;
+
+  // reset default heap
+  _mi_heap_set_default_direct(_mi_is_main_thread() ? &_mi_heap_main : (mi_heap_t*)&_mi_heap_empty);
+
+  // switch to backing heap
+  heap = heap->tld->heap_backing;
+  if (!mi_heap_is_initialized(heap)) return false;
+
+  // delete all non-backing heaps in this thread
+  mi_heap_t* curr = heap->tld->heaps;
+  while (curr != NULL) {
+    mi_heap_t* next = curr->next; // save `next` as `curr` will be freed
+    if (curr != heap) {
+      mi_assert_internal(!mi_heap_is_backing(curr));
+      mi_heap_delete(curr);
+    }
+    curr = next;
+  }
+  mi_assert_internal(heap->tld->heaps == heap && heap->next == NULL);
+  mi_assert_internal(mi_heap_is_backing(heap));
+
+  // collect if not the main thread
+  if (heap != &_mi_heap_main) {
+    _mi_heap_collect_abandon(heap);
+  }
+
+  // merge stats
+  _mi_stats_done(&heap->tld->stats);
+
+  // free if not the main thread
+  if (heap != &_mi_heap_main) {
+    // the following assertion does not always hold for huge segments as those are always treated
+    // as abandoned: one may allocate it in one thread, but deallocate in another in which case
+    // the count can be too large or negative. todo: perhaps not count huge segments? see issue #363
+    // mi_assert_internal(heap->tld->segments.count == 0 || heap->thread_id != _mi_thread_id());
+    mi_thread_data_free((mi_thread_data_t*)heap);
+  }
+  else {
+    #if 0
+    // never free the main thread even in debug mode; if a dll is linked statically with mimalloc,
+    // there may still be delete/free calls after the mi_fls_done is called. Issue #207
+    _mi_heap_destroy_pages(heap);
+    mi_assert_internal(heap->tld->heap_backing == &_mi_heap_main);
+    #endif
+  }
+  return false;
+}
+
+
+
+// --------------------------------------------------------
+// Try to run `mi_thread_done()` automatically so any memory
+// owned by the thread but not yet released can be abandoned
+// and re-owned by another thread.
+//
+// 1. windows dynamic library:
+//     call from DllMain on DLL_THREAD_DETACH
+// 2. windows static library:
+//     use `FlsAlloc` to call a destructor when the thread is done
+// 3. unix, pthreads:
+//     use a pthread key to call a destructor when a pthread is done
+//
+// In the last two cases we also need to call `mi_process_init`
+// to set up the thread local keys.
+// --------------------------------------------------------
+
+// Set up handlers so `mi_thread_done` is called automatically
+static void mi_process_setup_auto_thread_done(void) {
+  static bool tls_initialized = false; // fine if it races
+  if (tls_initialized) return;
+  tls_initialized = true;
+  _mi_prim_thread_init_auto_done();
+  _mi_heap_set_default_direct(&_mi_heap_main);
+}
+
+
+bool _mi_is_main_thread(void) {
+  return (_mi_heap_main.thread_id==0 || _mi_heap_main.thread_id == _mi_thread_id());
+}
+
+static _Atomic(size_t) thread_count = MI_ATOMIC_VAR_INIT(1);
+
+size_t  _mi_current_thread_count(void) {
+  return mi_atomic_load_relaxed(&thread_count);
+}
+
+// This is called from the `mi_malloc_generic`
+void mi_thread_init(void) mi_attr_noexcept
+{
+  // ensure our process has started already
+  mi_process_init();
+
+  // initialize the thread local default heap
+  // (this will call `_mi_heap_set_default_direct` and thus set the
+  //  fiber/pthread key to a non-zero value, ensuring `_mi_thread_done` is called)
+  if (_mi_thread_heap_init()) return;  // returns true if already initialized
+
+  _mi_stat_increase(&_mi_stats_main.threads, 1);
+  mi_atomic_increment_relaxed(&thread_count);
+  //_mi_verbose_message("thread init: 0x%zx\n", _mi_thread_id());
+}
+
+void mi_thread_done(void) mi_attr_noexcept {
+  _mi_thread_done(NULL);
+}
+
+void _mi_thread_done(mi_heap_t* heap)
+{
+  // calling with NULL implies using the default heap
+  if (heap == NULL) {
+    heap = mi_prim_get_default_heap();
+    if (heap == NULL) return;
+  }
+
+  // prevent re-entrancy through heap_done/heap_set_default_direct (issue #699)
+  if (!mi_heap_is_initialized(heap)) {
+    return;
+  }
+
+  // adjust stats
+  mi_atomic_decrement_relaxed(&thread_count);
+  _mi_stat_decrease(&_mi_stats_main.threads, 1);
+
+  // check thread-id as on Windows shutdown with FLS the main (exit) thread may call this on thread-local heaps...
+  if (heap->thread_id != _mi_thread_id()) return;
+
+  // abandon the thread local heap
+  if (_mi_thread_heap_done(heap)) return;  // returns true if already ran
+}
+
+void _mi_heap_set_default_direct(mi_heap_t* heap)  {
+  mi_assert_internal(heap != NULL);
+  #if defined(MI_TLS_SLOT)
+  mi_prim_tls_slot_set(MI_TLS_SLOT,heap);
+  #elif defined(MI_TLS_PTHREAD_SLOT_OFS)
+  *mi_prim_tls_pthread_heap_slot() = heap;
+  #elif defined(MI_TLS_PTHREAD)
+  // we use _mi_heap_default_key
+  #else
+  _mi_heap_default = heap;
+  #endif
+
+  // ensure the default heap is passed to `_mi_thread_done`
+  // setting to a non-NULL value also ensures `mi_thread_done` is called.
+  _mi_prim_thread_associate_default_heap(heap);
+}
+
+void mi_thread_set_in_threadpool(void) mi_attr_noexcept {
+  // nothing
+}
+
+// --------------------------------------------------------
+// Run functions on process init/done, and thread init/done
+// --------------------------------------------------------
+static bool os_preloading = true;    // true until this module is initialized
+
+// Returns true if this module has not been initialized; Don't use C runtime routines until it returns false.
+bool mi_decl_noinline _mi_preloading(void) {
+  return os_preloading;
+}
+
+// Returns true if mimalloc was redirected
+mi_decl_nodiscard bool mi_is_redirected(void) mi_attr_noexcept {
+  return _mi_is_redirected();
+}
+
+// Called once by the process loader from `src/prim/prim.c`
+void _mi_auto_process_init(void) {
+  mi_heap_main_init();
+  #if defined(__APPLE__) || defined(MI_TLS_RECURSE_GUARD)
+  volatile mi_heap_t* dummy = _mi_heap_default; // access TLS to allocate it before setting tls_initialized to true;
+  if (dummy == NULL) return;                    // use dummy or otherwise the access may get optimized away (issue #697)
+  #endif
+  os_preloading = false;
+  mi_assert_internal(_mi_is_main_thread());
+  _mi_options_init();
+  mi_process_setup_auto_thread_done();
+  mi_process_init();
+  if (_mi_is_redirected()) _mi_verbose_message("malloc is redirected.\n");
+
+  // show message from the redirector (if present)
+  const char* msg = NULL;
+  _mi_allocator_init(&msg);
+  if (msg != NULL && (mi_option_is_enabled(mi_option_verbose) || mi_option_is_enabled(mi_option_show_errors))) {
+    _mi_fputs(NULL,NULL,NULL,msg);
+  }
+
+  // reseed random
+  _mi_random_reinit_if_weak(&_mi_heap_main.random);
+}
+
+#if defined(_WIN32) && (defined(_M_IX86) || defined(_M_X64))
+#include <intrin.h>
+mi_decl_cache_align bool _mi_cpu_has_fsrm = false;
+mi_decl_cache_align bool _mi_cpu_has_erms = false;
+
+static void mi_detect_cpu_features(void) {
+  // FSRM for fast short rep movsb/stosb support (AMD Zen3+ (~2020) or Intel Ice Lake+ (~2017))
+  // ERMS for fast enhanced rep movsb/stosb support
+  int32_t cpu_info[4];
+  __cpuid(cpu_info, 7);
+  _mi_cpu_has_fsrm = ((cpu_info[3] & (1 << 4)) != 0); // bit 4 of EDX : see <https://en.wikipedia.org/wiki/CPUID#EAX=7,_ECX=0:_Extended_Features>
+  _mi_cpu_has_erms = ((cpu_info[1] & (1 << 9)) != 0); // bit 9 of EBX : see <https://en.wikipedia.org/wiki/CPUID#EAX=7,_ECX=0:_Extended_Features>
+}
+#else
+static void mi_detect_cpu_features(void) {
+  // nothing
+}
+#endif
+
+// Initialize the process; called by thread_init or the process loader
+void mi_process_init(void) mi_attr_noexcept {
+  // ensure we are called once
+  static mi_atomic_once_t process_init;
+	#if _MSC_VER < 1920
+	mi_heap_main_init(); // vs2017 can dynamically re-initialize _mi_heap_main
+	#endif
+  if (!mi_atomic_once(&process_init)) return;
+  _mi_process_is_initialized = true;
+  _mi_verbose_message("process init: 0x%zx\n", _mi_thread_id());
+  mi_process_setup_auto_thread_done();
+
+  mi_detect_cpu_features();
+  _mi_os_init();
+  mi_heap_main_init();
+  mi_thread_init();
+
+  #if defined(_WIN32)
+  // On Windows, when building as a static lib the FLS cleanup happens too early for the main thread.
+  // To avoid this, set the FLS value for the main thread to NULL so the fls cleanup
+  // will not call _mi_thread_done on the (still executing) main thread. See issue #508.
+  _mi_prim_thread_associate_default_heap(NULL);
+  #endif
+
+  mi_stats_reset();  // only call stat reset *after* thread init (or the heap tld == NULL)
+  mi_track_init();
+
+  if (mi_option_is_enabled(mi_option_reserve_huge_os_pages)) {
+    size_t pages = mi_option_get_clamp(mi_option_reserve_huge_os_pages, 0, 128*1024);
+    int reserve_at  = (int)mi_option_get_clamp(mi_option_reserve_huge_os_pages_at, -1, INT_MAX);
+    if (reserve_at != -1) {
+      mi_reserve_huge_os_pages_at(pages, reserve_at, pages*500);
+    } else {
+      mi_reserve_huge_os_pages_interleave(pages, 0, pages*500);
+    }
+  }
+  if (mi_option_is_enabled(mi_option_reserve_os_memory)) {
+    long ksize = mi_option_get(mi_option_reserve_os_memory);
+    if (ksize > 0) {
+      mi_reserve_os_memory((size_t)ksize*MI_KiB, true /* commit? */, true /* allow large pages? */);
+    }
+  }
+}
+
+// Called when the process is done (cdecl as it is used with `atexit` on some platforms)
+void mi_cdecl mi_process_done(void) mi_attr_noexcept {
+  // only shutdown if we were initialized
+  if (!_mi_process_is_initialized) return;
+  // ensure we are called once
+  static bool process_done = false;
+  if (process_done) return;
+  process_done = true;
+
+  // get the default heap so we don't need to access thread locals anymore
+  mi_heap_t* heap = mi_prim_get_default_heap();  // use prim to not initialize any heap
+  mi_assert_internal(heap != NULL);
+
+  // release any thread specific resources and ensure _mi_thread_done is called on all but the main thread
+  _mi_prim_thread_done_auto_done();
+
+
+  #ifndef MI_SKIP_COLLECT_ON_EXIT
+    #if (MI_DEBUG || !defined(MI_SHARED_LIB))
+    // free all memory if possible on process exit. This is not needed for a stand-alone process
+    // but should be done if mimalloc is statically linked into another shared library which
+    // is repeatedly loaded/unloaded, see issue #281.
+    mi_heap_collect(heap, true /* force */ );
+    #endif
+  #endif
+
+  // Forcefully release all retained memory; this can be dangerous in general if overriding regular malloc/free
+  // since after process_done there might still be other code running that calls `free` (like at_exit routines,
+  // or C-runtime termination code).
+  if (mi_option_is_enabled(mi_option_destroy_on_exit)) {
+    mi_heap_collect(heap, true /* force */);
+    _mi_heap_unsafe_destroy_all(heap);     // forcefully release all memory held by all heaps (of this thread only!)
+    _mi_arena_unsafe_destroy_all();
+    _mi_segment_map_unsafe_destroy();
+  }
+
+  if (mi_option_is_enabled(mi_option_show_stats) || mi_option_is_enabled(mi_option_verbose)) {
+    mi_stats_print(NULL);
+  }
+  _mi_allocator_done();
+  _mi_verbose_message("process done: 0x%zx\n", _mi_heap_main.thread_id);
+  os_preloading = true; // don't call the C runtime anymore
+}
+
+void mi_cdecl _mi_auto_process_done(void) mi_attr_noexcept {
+  if (_mi_option_get_fast(mi_option_destroy_on_exit)>1) return;
+  mi_process_done();
+}
diff --git a/compat/mimalloc/libc.c b/compat/mimalloc/libc.c
new file mode 100644
index 00000000000000..52d095eb240dc1
--- /dev/null
+++ b/compat/mimalloc/libc.c
@@ -0,0 +1,334 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2023, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+
+// --------------------------------------------------------
+// This module defines various std libc functions to reduce
+// the dependency on libc, and also prevent errors caused
+// by some libc implementations when called before `main`
+// executes (due to malloc redirection)
+// --------------------------------------------------------
+
+#include "mimalloc.h"
+#include "mimalloc/internal.h"
+#include "mimalloc/prim.h"      // mi_prim_getenv
+
+char _mi_toupper(char c) {
+  if (c >= 'a' && c <= 'z') return (c - 'a' + 'A');
+                       else return c;
+}
+
+int _mi_strnicmp(const char* s, const char* t, size_t n) {
+  if (n == 0) return 0;
+  for (; *s != 0 && *t != 0 && n > 0; s++, t++, n--) {
+    if (_mi_toupper(*s) != _mi_toupper(*t)) break;
+  }
+  return (n == 0 ? 0 : *s - *t);
+}
+
+void _mi_strlcpy(char* dest, const char* src, size_t dest_size) {
+  if (dest==NULL || src==NULL || dest_size == 0) return;
+  // copy until end of src, or when dest is (almost) full
+  while (*src != 0 && dest_size > 1) {
+    *dest++ = *src++;
+    dest_size--;
+  }
+  // always zero terminate
+  *dest = 0;
+}
+
+void _mi_strlcat(char* dest, const char* src, size_t dest_size) {
+  if (dest==NULL || src==NULL || dest_size == 0) return;
+  // find end of string in the dest buffer
+  while (*dest != 0 && dest_size > 1) {
+    dest++;
+    dest_size--;
+  }
+  // and catenate
+  _mi_strlcpy(dest, src, dest_size);
+}
+
+size_t _mi_strlen(const char* s) {
+  if (s==NULL) return 0;
+  size_t len = 0;
+  while(s[len] != 0) { len++; }
+  return len;
+}
+
+size_t _mi_strnlen(const char* s, size_t max_len) {
+  if (s==NULL) return 0;
+  size_t len = 0;
+  while(s[len] != 0 && len < max_len) { len++; }
+  return len;
+}
+
+#ifdef MI_NO_GETENV
+bool _mi_getenv(const char* name, char* result, size_t result_size) {
+  MI_UNUSED(name);
+  MI_UNUSED(result);
+  MI_UNUSED(result_size);
+  return false;
+}
+#else
+bool _mi_getenv(const char* name, char* result, size_t result_size) {
+  if (name==NULL || result == NULL || result_size < 64) return false;
+  return _mi_prim_getenv(name,result,result_size);
+}
+#endif
+
+// --------------------------------------------------------
+// Define our own limited `_mi_vsnprintf` and `_mi_snprintf`
+// This is mostly to avoid calling these when libc is not yet
+// initialized (and to reduce dependencies)
+//
+// format:      d i, p x u, s
+// prec:        z t l ll L
+// width:       10
+// align-left:  -
+// fill:        0
+// plus:        +
+// --------------------------------------------------------
+
+static void mi_outc(char c, char** out, char* end) {
+  char* p = *out;
+  if (p >= end) return;
+  *p = c;
+  *out = p + 1;
+}
+
+static void mi_outs(const char* s, char** out, char* end) {
+  if (s == NULL) return;
+  char* p = *out;
+  while (*s != 0 && p < end) {
+    *p++ = *s++;
+  }
+  *out = p;
+}
+
+static void mi_out_fill(char fill, size_t len, char** out, char* end) {
+  char* p = *out;
+  for (size_t i = 0; i < len && p < end; i++) {
+    *p++ = fill;
+  }
+  *out = p;
+}
+
+static void mi_out_alignright(char fill, char* start, size_t len, size_t extra, char* end) {
+  if (len == 0 || extra == 0) return;
+  if (start + len + extra >= end) return;
+  // move `len` characters to the right (in reverse since it can overlap)
+  for (size_t i = 1; i <= len; i++) {
+    start[len + extra - i] = start[len - i];
+  }
+  // and fill the start
+  for (size_t i = 0; i < extra; i++) {
+    start[i] = fill;
+  }
+}
+
+
+static void mi_out_num(uintmax_t x, size_t base, char prefix, char** out, char* end)
+{
+  if (x == 0 || base == 0 || base > 16) {
+    if (prefix != 0) { mi_outc(prefix, out, end); }
+    mi_outc('0',out,end);
+  }
+  else {
+    // output digits in reverse
+    char* start = *out;
+    while (x > 0) {
+      char digit = (char)(x % base);
+      mi_outc((digit <= 9 ? '0' + digit : 'A' + digit - 10),out,end);
+      x = x / base;
+    }
+    if (prefix != 0) {
+      mi_outc(prefix, out, end);
+    }
+    size_t len = *out - start;
+    // and reverse in-place
+    for (size_t i = 0; i < (len / 2); i++) {
+      char c = start[len - i - 1];
+      start[len - i - 1] = start[i];
+      start[i] = c;
+    }
+  }
+}
+
+
+#define MI_NEXTC()  c = *in; if (c==0) break; in++;
+
+int _mi_vsnprintf(char* buf, size_t bufsize, const char* fmt, va_list args) {
+  if (buf == NULL || bufsize == 0 || fmt == NULL) return 0;
+  buf[bufsize - 1] = 0;
+  char* const end = buf + (bufsize - 1);
+  const char* in = fmt;
+  char* out = buf;
+  while (true) {
+    if (out >= end) break;
+    char c;
+    MI_NEXTC();
+    if (c != '%') {
+      if ((c >= ' ' && c <= '~') || c=='\n' || c=='\r' || c=='\t') { // output visible ascii or standard control only
+        mi_outc(c, &out, end);
+      }
+    }
+    else {
+      MI_NEXTC();
+      char   fill = ' ';
+      size_t width = 0;
+      char   numtype = 'd';
+      char   numplus = 0;
+      bool   alignright = true;
+      if (c == '+' || c == ' ') { numplus = c; MI_NEXTC(); }
+      if (c == '-') { alignright = false; MI_NEXTC(); }
+      if (c == '0') { fill = '0'; MI_NEXTC(); }
+      if (c >= '1' && c <= '9') {
+        width = (c - '0'); MI_NEXTC();
+        while (c >= '0' && c <= '9') {
+          width = (10 * width) + (c - '0'); MI_NEXTC();
+        }
+        if (c == 0) break;  // extra check due to while
+      }
+      if (c == 'z' || c == 't' || c == 'L') { numtype = c; MI_NEXTC(); }
+      else if (c == 'l') {
+        numtype = c; MI_NEXTC();
+        if (c == 'l') { numtype = 'L'; MI_NEXTC(); }
+      }
+
+      char* start = out;
+      if (c == 's') {
+        // string
+        const char* s = va_arg(args, const char*);
+        mi_outs(s, &out, end);
+      }
+      else if (c == 'p' || c == 'x' || c == 'u') {
+        // unsigned
+        uintmax_t x = 0;
+        if (c == 'x' || c == 'u') {
+          if (numtype == 'z')       x = va_arg(args, size_t);
+          else if (numtype == 't')  x = va_arg(args, uintptr_t); // unsigned ptrdiff_t
+          else if (numtype == 'L')  x = va_arg(args, unsigned long long);
+          else if (numtype == 'l')  x = va_arg(args, unsigned long);
+                               else x = va_arg(args, unsigned int);
+        }
+        else if (c == 'p') {
+          x = va_arg(args, uintptr_t);
+          mi_outs("0x", &out, end);
+          start = out;
+          width = (width >= 2 ? width - 2 : 0);
+        }
+        if (width == 0 && (c == 'x' || c == 'p')) {
+          if (c == 'p')   { width = 2 * (x <= UINT32_MAX ? 4 : ((x >> 16) <= UINT32_MAX ? 6 : sizeof(void*))); }
+          if (width == 0) { width = 2; }
+          fill = '0';
+        }
+        mi_out_num(x, (c == 'x' || c == 'p' ? 16 : 10), numplus, &out, end);
+      }
+      else if (c == 'i' || c == 'd') {
+        // signed
+        intmax_t x = 0;
+        if (numtype == 'z')       x = va_arg(args, intptr_t );
+        else if (numtype == 't')  x = va_arg(args, ptrdiff_t);
+        else if (numtype == 'L')  x = va_arg(args, long long);
+        else if (numtype == 'l')  x = va_arg(args, long);
+                             else x = va_arg(args, int);
+        char pre = 0;
+        if (x < 0) {
+          pre = '-';
+          if (x > INTMAX_MIN) { x = -x; }
+        }
+        else if (numplus != 0) {
+          pre = numplus;
+        }
+        mi_out_num((uintmax_t)x, 10, pre, &out, end);
+      }
+      else if (c >= ' ' && c <= '~') {
+        // unknown format
+        mi_outc('%', &out, end);
+        mi_outc(c, &out, end);
+      }
+
+      // fill & align
+      mi_assert_internal(out <= end);
+      mi_assert_internal(out >= start);
+      const size_t len = out - start;
+      if (len < width) {
+        mi_out_fill(fill, width - len, &out, end);
+        if (alignright && out <= end) {
+          mi_out_alignright(fill, start, len, width - len, end);
+        }
+      }
+    }
+  }
+  mi_assert_internal(out <= end);
+  *out = 0;
+  return (int)(out - buf);
+}
+
+int _mi_snprintf(char* buf, size_t buflen, const char* fmt, ...) {
+  va_list args;
+  va_start(args, fmt);
+  const int written = _mi_vsnprintf(buf, buflen, fmt, args);
+  va_end(args);
+  return written;
+}
+
+
+#if MI_SIZE_SIZE == 4
+#define mi_mask_even_bits32      (0x55555555)
+#define mi_mask_even_pairs32     (0x33333333)
+#define mi_mask_even_nibbles32   (0x0F0F0F0F)
+
+// sum of all the bytes in `x` if it is guaranteed that the sum < 256!
+static size_t mi_byte_sum32(uint32_t x) {
+  // perform `x * 0x01010101`: the highest byte contains the sum of all bytes.
+  x += (x << 8);
+  x += (x << 16);
+  return (size_t)(x >> 24);
+}
+
+static size_t mi_popcount_generic32(uint32_t x) {
+  // first count each 2-bit group `a`, where: a==0b00 -> 00, a==0b01 -> 01, a==0b10 -> 01, a==0b11 -> 10
+  // in other words, `a - (a>>1)`; to do this in parallel, we need to mask to prevent spilling a bit pair
+  // into the lower bit-pair:
+  x = x - ((x >> 1) & mi_mask_even_bits32);
+  // add the 2-bit pair results
+  x = (x & mi_mask_even_pairs32) + ((x >> 2) & mi_mask_even_pairs32);
+  // add the 4-bit nibble results
+  x = (x + (x >> 4)) & mi_mask_even_nibbles32;
+  // each byte now has a count of its bits, we can sum them now:
+  return mi_byte_sum32(x);
+}
+
+mi_decl_noinline size_t _mi_popcount_generic(size_t x) {
+  return mi_popcount_generic32(x);
+}
+
+#else
+#define mi_mask_even_bits64      (0x5555555555555555)
+#define mi_mask_even_pairs64     (0x3333333333333333)
+#define mi_mask_even_nibbles64   (0x0F0F0F0F0F0F0F0F)
+
+// sum of all the bytes in `x` if it is guaranteed that the sum < 256!
+static size_t mi_byte_sum64(uint64_t x) {
+  x += (x << 8);
+  x += (x << 16);
+  x += (x << 32);
+  return (size_t)(x >> 56);
+}
+
+static size_t mi_popcount_generic64(uint64_t x) {
+  x = x - ((x >> 1) & mi_mask_even_bits64);
+  x = (x & mi_mask_even_pairs64) + ((x >> 2) & mi_mask_even_pairs64);
+  x = (x + (x >> 4)) & mi_mask_even_nibbles64;
+  return mi_byte_sum64(x);
+}
+
+mi_decl_noinline size_t _mi_popcount_generic(size_t x) {
+  return mi_popcount_generic64(x);
+}
+#endif
+
diff --git a/compat/mimalloc/mimalloc-stats.h b/compat/mimalloc/mimalloc-stats.h
new file mode 100644
index 00000000000000..12c5c9a7d6ced7
--- /dev/null
+++ b/compat/mimalloc/mimalloc-stats.h
@@ -0,0 +1,104 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2025, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+#pragma once
+#ifndef MIMALLOC_STATS_H
+#define MIMALLOC_STATS_H
+
+#include <mimalloc.h>
+#include <stdint.h>
+
+#define MI_STAT_VERSION   3   // increased on every backward incompatible change
+
+// count allocation over time
+typedef struct mi_stat_count_s {
+  int64_t total;                              // total allocated
+  int64_t peak;                               // peak allocation
+  int64_t current;                            // current allocation
+} mi_stat_count_t;
+
+// counters only increase
+typedef struct mi_stat_counter_s {
+  int64_t total;                              // total count
+} mi_stat_counter_t;
+
+#define MI_STAT_FIELDS() \
+  MI_STAT_COUNT(pages)                      /* count of mimalloc pages */ \
+  MI_STAT_COUNT(reserved)                   /* reserved memory bytes */ \
+  MI_STAT_COUNT(committed)                  /* committed bytes */ \
+  MI_STAT_COUNTER(reset)                    /* reset bytes */ \
+  MI_STAT_COUNTER(purged)                   /* purged bytes */ \
+  MI_STAT_COUNT(page_committed)             /* committed memory inside pages */ \
+  MI_STAT_COUNT(pages_abandoned)            /* abandoned pages count */ \
+  MI_STAT_COUNT(threads)                    /* number of threads */ \
+  MI_STAT_COUNT(malloc_normal)              /* allocated bytes <= MI_LARGE_OBJ_SIZE_MAX */ \
+  MI_STAT_COUNT(malloc_huge)                /* allocated bytes in huge pages */ \
+  MI_STAT_COUNT(malloc_requested)           /* malloc requested bytes */ \
+  \
+  MI_STAT_COUNTER(mmap_calls) \
+  MI_STAT_COUNTER(commit_calls) \
+  MI_STAT_COUNTER(reset_calls) \
+  MI_STAT_COUNTER(purge_calls) \
+  MI_STAT_COUNTER(arena_count)              /* number of memory arenas */ \
+  MI_STAT_COUNTER(malloc_normal_count)      /* number of blocks <= MI_LARGE_OBJ_SIZE_MAX */ \
+  MI_STAT_COUNTER(malloc_huge_count)        /* number of huge blocks */ \
+  MI_STAT_COUNTER(malloc_guarded_count)     /* number of allocations with guard pages */ \
+  \
+  /* internal statistics */ \
+  MI_STAT_COUNTER(arena_rollback_count) \
+  MI_STAT_COUNTER(arena_purges) \
+  MI_STAT_COUNTER(pages_extended)           /* number of page extensions */ \
+  MI_STAT_COUNTER(pages_retire)             /* number of pages that are retired */ \
+  MI_STAT_COUNTER(page_searches)            /* total pages searched for a fresh page */ \
+  MI_STAT_COUNTER(page_searches_count)      /* searched count for a fresh page */ \
+  /* only on v1 and v2 */ \
+  MI_STAT_COUNT(segments) \
+  MI_STAT_COUNT(segments_abandoned) \
+  MI_STAT_COUNT(segments_cache) \
+  MI_STAT_COUNT(_segments_reserved) \
+  /* only on v3 */ \
+  MI_STAT_COUNTER(pages_reclaim_on_alloc) \
+  MI_STAT_COUNTER(pages_reclaim_on_free) \
+  MI_STAT_COUNTER(pages_reabandon_full) \
+  MI_STAT_COUNTER(pages_unabandon_busy_wait) \
+
+
+// Define the statistics structure
+#define MI_BIN_HUGE             (73U)   // see types.h
+#define MI_STAT_COUNT(stat)     mi_stat_count_t stat;
+#define MI_STAT_COUNTER(stat)   mi_stat_counter_t stat;
+
+typedef struct mi_stats_s
+{
+  int version;
+
+  MI_STAT_FIELDS()
+
+  // future extension
+  mi_stat_count_t   _stat_reserved[4];
+  mi_stat_counter_t _stat_counter_reserved[4];
+
+  // size segregated statistics
+  mi_stat_count_t   malloc_bins[MI_BIN_HUGE+1];   // allocation per size bin
+  mi_stat_count_t   page_bins[MI_BIN_HUGE+1];     // pages allocated per size bin
+} mi_stats_t;
+
+#undef MI_STAT_COUNT
+#undef MI_STAT_COUNTER
+
+// Exported definitions
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+mi_decl_export void  mi_stats_get( size_t stats_size, mi_stats_t* stats ) mi_attr_noexcept;
+mi_decl_export char* mi_stats_get_json( size_t buf_size, char* buf ) mi_attr_noexcept;    // use mi_free to free the result if the input buf == NULL
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif // MIMALLOC_STATS_H
diff --git a/compat/mimalloc/mimalloc.h b/compat/mimalloc/mimalloc.h
new file mode 100644
index 00000000000000..b2d5f2c8df0734
--- /dev/null
+++ b/compat/mimalloc/mimalloc.h
@@ -0,0 +1,629 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2026, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+#pragma once
+#ifndef MIMALLOC_H
+#define MIMALLOC_H
+
+#define MI_MALLOC_VERSION 226  // major + 2 digits minor
+
+// ------------------------------------------------------
+// Compiler specific attributes
+// ------------------------------------------------------
+
+#ifdef __cplusplus
+  #if (__cplusplus >= 201103L) || (_MSC_VER > 1900)  // C++11
+    #define mi_attr_noexcept   noexcept
+  #else
+    #define mi_attr_noexcept   throw()
+  #endif
+#else
+  #define mi_attr_noexcept
+#endif
+
+#if defined(__cplusplus) && (__cplusplus >= 201703)
+  #define mi_decl_nodiscard    [[nodiscard]]
+#elif (defined(__GNUC__) && (__GNUC__ >= 4)) || defined(__clang__)  // includes clang, icc, and clang-cl
+  #define mi_decl_nodiscard    __attribute__((warn_unused_result))
+#elif defined(_HAS_NODISCARD)
+  #define mi_decl_nodiscard    _NODISCARD
+#elif (_MSC_VER >= 1700)
+  #define mi_decl_nodiscard    _Check_return_
+#else
+  #define mi_decl_nodiscard
+#endif
+
+#if defined(_MSC_VER) || defined(__MINGW32__)
+  #if !defined(MI_SHARED_LIB)
+    #define mi_decl_export
+  #elif defined(MI_SHARED_LIB_EXPORT)
+    #define mi_decl_export              __declspec(dllexport)
+  #else
+    #define mi_decl_export              __declspec(dllimport)
+  #endif
+  #if defined(__MINGW32__)
+    #define mi_decl_restrict
+    #define mi_attr_malloc              __attribute__((malloc))
+  #else
+    #if (_MSC_VER >= 1900) && !defined(__EDG__)
+      #define mi_decl_restrict          __declspec(allocator) __declspec(restrict)
+    #else
+      #define mi_decl_restrict          __declspec(restrict)
+    #endif
+    #define mi_attr_malloc
+  #endif
+  #define mi_cdecl                      __cdecl
+  #define mi_attr_alloc_size(s)
+  #define mi_attr_alloc_size2(s1,s2)
+  #define mi_attr_alloc_align(p)
+#elif defined(__GNUC__)                 // includes clang and icc
+  #if defined(MI_SHARED_LIB) && defined(MI_SHARED_LIB_EXPORT)
+    #define mi_decl_export              __attribute__((visibility("default")))
+  #else
+    #define mi_decl_export
+  #endif
+  #define mi_cdecl                      // leads to warnings... __attribute__((cdecl))
+  #define mi_decl_restrict
+  #define mi_attr_malloc                __attribute__((malloc))
+  #if (defined(__clang_major__) && (__clang_major__ < 4)) || (__GNUC__ < 5)
+    #define mi_attr_alloc_size(s)
+    #define mi_attr_alloc_size2(s1,s2)
+    #define mi_attr_alloc_align(p)
+  #elif defined(__INTEL_COMPILER)
+    #define mi_attr_alloc_size(s)       __attribute__((alloc_size(s)))
+    #define mi_attr_alloc_size2(s1,s2)  __attribute__((alloc_size(s1,s2)))
+    #define mi_attr_alloc_align(p)
+  #else
+    #define mi_attr_alloc_size(s)       __attribute__((alloc_size(s)))
+    #define mi_attr_alloc_size2(s1,s2)  __attribute__((alloc_size(s1,s2)))
+    #define mi_attr_alloc_align(p)      __attribute__((alloc_align(p)))
+  #endif
+#else
+  #define mi_cdecl
+  #define mi_decl_export
+  #define mi_decl_restrict
+  #define mi_attr_malloc
+  #define mi_attr_alloc_size(s)
+  #define mi_attr_alloc_size2(s1,s2)
+  #define mi_attr_alloc_align(p)
+#endif
+
+// ------------------------------------------------------
+// Includes
+// ------------------------------------------------------
+
+#include <stddef.h>     // size_t
+#include <stdbool.h>    // bool
+#include <stdint.h>     // INTPTR_MAX
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+// ------------------------------------------------------
+// Standard malloc interface
+// ------------------------------------------------------
+
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_malloc(size_t size)  mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_calloc(size_t count, size_t size)  mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size2(1,2);
+mi_decl_nodiscard mi_decl_export void* mi_realloc(void* p, size_t newsize)      mi_attr_noexcept mi_attr_alloc_size(2);
+mi_decl_export void* mi_expand(void* p, size_t newsize)                         mi_attr_noexcept mi_attr_alloc_size(2);
+
+mi_decl_export void mi_free(void* p) mi_attr_noexcept;
+mi_decl_nodiscard mi_decl_export mi_decl_restrict char* mi_strdup(const char* s) mi_attr_noexcept mi_attr_malloc;
+mi_decl_nodiscard mi_decl_export mi_decl_restrict char* mi_strndup(const char* s, size_t n) mi_attr_noexcept mi_attr_malloc;
+mi_decl_nodiscard mi_decl_export mi_decl_restrict char* mi_realpath(const char* fname, char* resolved_name) mi_attr_noexcept mi_attr_malloc;
+
+// ------------------------------------------------------
+// Extended functionality
+// ------------------------------------------------------
+#define MI_SMALL_WSIZE_MAX  (128)
+#define MI_SMALL_SIZE_MAX   (MI_SMALL_WSIZE_MAX*sizeof(void*))
+
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_malloc_small(size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_zalloc_small(size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_zalloc(size_t size)       mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1);
+
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_mallocn(size_t count, size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size2(1,2);
+mi_decl_nodiscard mi_decl_export void* mi_reallocn(void* p, size_t count, size_t size)        mi_attr_noexcept mi_attr_alloc_size2(2,3);
+mi_decl_nodiscard mi_decl_export void* mi_reallocf(void* p, size_t newsize)                   mi_attr_noexcept mi_attr_alloc_size(2);
+
+mi_decl_nodiscard mi_decl_export size_t mi_usable_size(const void* p) mi_attr_noexcept;
+mi_decl_nodiscard mi_decl_export size_t mi_good_size(size_t size)     mi_attr_noexcept;
+
+
+// ------------------------------------------------------
+// Internals
+// ------------------------------------------------------
+
+typedef void (mi_cdecl mi_deferred_free_fun)(bool force, unsigned long long heartbeat, void* arg);
+mi_decl_export void mi_register_deferred_free(mi_deferred_free_fun* deferred_free, void* arg) mi_attr_noexcept;
+
+typedef void (mi_cdecl mi_output_fun)(const char* msg, void* arg);
+mi_decl_export void mi_register_output(mi_output_fun* out, void* arg) mi_attr_noexcept;
+
+typedef void (mi_cdecl mi_error_fun)(int err, void* arg);
+mi_decl_export void mi_register_error(mi_error_fun* fun, void* arg);
+
+mi_decl_export void mi_collect(bool force)    mi_attr_noexcept;
+mi_decl_export int  mi_version(void)          mi_attr_noexcept;
+mi_decl_export void mi_stats_reset(void)      mi_attr_noexcept;
+mi_decl_export void mi_stats_merge(void)      mi_attr_noexcept;
+mi_decl_export void mi_stats_print(void* out) mi_attr_noexcept;  // backward compatibility: `out` is ignored and should be NULL
+mi_decl_export void mi_stats_print_out(mi_output_fun* out, void* arg) mi_attr_noexcept;
+mi_decl_export void mi_thread_stats_print_out(mi_output_fun* out, void* arg) mi_attr_noexcept;
+mi_decl_export void mi_options_print(void)    mi_attr_noexcept;
+
+mi_decl_export void mi_process_info(size_t* elapsed_msecs, size_t* user_msecs, size_t* system_msecs,
+                                    size_t* current_rss, size_t* peak_rss,
+                                    size_t* current_commit, size_t* peak_commit, size_t* page_faults) mi_attr_noexcept;
+
+
+// Generally do not use the following as these are usually called automatically
+mi_decl_export void mi_process_init(void)     mi_attr_noexcept;
+mi_decl_export void mi_cdecl mi_process_done(void) mi_attr_noexcept;
+mi_decl_export void mi_thread_init(void)      mi_attr_noexcept;
+mi_decl_export void mi_thread_done(void)      mi_attr_noexcept;
+
+
+// -------------------------------------------------------------------------------------
+// Aligned allocation
+// Note that `alignment` always follows `size` for consistency with unaligned
+// allocation, but unfortunately this differs from `posix_memalign` and `aligned_alloc`.
+// -------------------------------------------------------------------------------------
+
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_malloc_aligned(size_t size, size_t alignment) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1) mi_attr_alloc_align(2);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_malloc_aligned_at(size_t size, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_zalloc_aligned(size_t size, size_t alignment) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1) mi_attr_alloc_align(2);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_zalloc_aligned_at(size_t size, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_calloc_aligned(size_t count, size_t size, size_t alignment) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size2(1,2) mi_attr_alloc_align(3);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_calloc_aligned_at(size_t count, size_t size, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size2(1,2);
+mi_decl_nodiscard mi_decl_export void* mi_realloc_aligned(void* p, size_t newsize, size_t alignment) mi_attr_noexcept mi_attr_alloc_size(2) mi_attr_alloc_align(3);
+mi_decl_nodiscard mi_decl_export void* mi_realloc_aligned_at(void* p, size_t newsize, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_alloc_size(2);
+
+
+// -----------------------------------------------------------------
+// Return allocated block size (if the return value is not NULL)
+// -----------------------------------------------------------------
+
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_umalloc(size_t size, size_t* block_size)  mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_ucalloc(size_t count, size_t size, size_t* block_size)  mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size2(1,2);
+mi_decl_nodiscard mi_decl_export void* mi_urealloc(void* p, size_t newsize, size_t* block_size_pre, size_t* block_size_post) mi_attr_noexcept mi_attr_alloc_size(2);
+mi_decl_export void mi_ufree(void* p, size_t* block_size) mi_attr_noexcept;
+
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_umalloc_aligned(size_t size, size_t alignment, size_t* block_size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1) mi_attr_alloc_align(2);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_uzalloc_aligned(size_t size, size_t alignment, size_t* block_size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1) mi_attr_alloc_align(2);
+
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_umalloc_small(size_t size, size_t* block_size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_uzalloc_small(size_t size, size_t* block_size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1);
+
+
+// -------------------------------------------------------------------------------------
+// Heaps: first-class, but a heap can only allocate from the thread that created it.
+// -------------------------------------------------------------------------------------
+
+struct mi_heap_s;
+typedef struct mi_heap_s mi_heap_t;
+
+mi_decl_nodiscard mi_decl_export mi_heap_t* mi_heap_new(void);
+mi_decl_export void       mi_heap_delete(mi_heap_t* heap);
+mi_decl_export void       mi_heap_destroy(mi_heap_t* heap);
+mi_decl_export mi_heap_t* mi_heap_set_default(mi_heap_t* heap);
+mi_decl_export mi_heap_t* mi_heap_get_default(void);
+mi_decl_export mi_heap_t* mi_heap_get_backing(void);
+mi_decl_export void       mi_heap_collect(mi_heap_t* heap, bool force) mi_attr_noexcept;
+
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_malloc(mi_heap_t* heap, size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(2);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_zalloc(mi_heap_t* heap, size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(2);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_calloc(mi_heap_t* heap, size_t count, size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size2(2, 3);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_mallocn(mi_heap_t* heap, size_t count, size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size2(2, 3);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_malloc_small(mi_heap_t* heap, size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(2);
+
+mi_decl_nodiscard mi_decl_export void* mi_heap_realloc(mi_heap_t* heap, void* p, size_t newsize)              mi_attr_noexcept mi_attr_alloc_size(3);
+mi_decl_nodiscard mi_decl_export void* mi_heap_reallocn(mi_heap_t* heap, void* p, size_t count, size_t size)  mi_attr_noexcept mi_attr_alloc_size2(3,4);
+mi_decl_nodiscard mi_decl_export void* mi_heap_reallocf(mi_heap_t* heap, void* p, size_t newsize)             mi_attr_noexcept mi_attr_alloc_size(3);
+
+mi_decl_nodiscard mi_decl_export mi_decl_restrict char* mi_heap_strdup(mi_heap_t* heap, const char* s)            mi_attr_noexcept mi_attr_malloc;
+mi_decl_nodiscard mi_decl_export mi_decl_restrict char* mi_heap_strndup(mi_heap_t* heap, const char* s, size_t n) mi_attr_noexcept mi_attr_malloc;
+mi_decl_nodiscard mi_decl_export mi_decl_restrict char* mi_heap_realpath(mi_heap_t* heap, const char* fname, char* resolved_name) mi_attr_noexcept mi_attr_malloc;
+
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_malloc_aligned(mi_heap_t* heap, size_t size, size_t alignment) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(2) mi_attr_alloc_align(3);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_malloc_aligned_at(mi_heap_t* heap, size_t size, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(2);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_zalloc_aligned(mi_heap_t* heap, size_t size, size_t alignment) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(2) mi_attr_alloc_align(3);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_zalloc_aligned_at(mi_heap_t* heap, size_t size, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(2);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_calloc_aligned(mi_heap_t* heap, size_t count, size_t size, size_t alignment) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size2(2, 3) mi_attr_alloc_align(4);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_calloc_aligned_at(mi_heap_t* heap, size_t count, size_t size, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size2(2, 3);
+mi_decl_nodiscard mi_decl_export void* mi_heap_realloc_aligned(mi_heap_t* heap, void* p, size_t newsize, size_t alignment) mi_attr_noexcept mi_attr_alloc_size(3) mi_attr_alloc_align(4);
+mi_decl_nodiscard mi_decl_export void* mi_heap_realloc_aligned_at(mi_heap_t* heap, void* p, size_t newsize, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_alloc_size(3);
+
+
+// --------------------------------------------------------------------------------
+// Zero initialized re-allocation.
+// Only valid on memory that was originally allocated with zero initialization too.
+// e.g. `mi_calloc`, `mi_zalloc`, `mi_zalloc_aligned` etc.
+// see <https://github.com/microsoft/mimalloc/issues/63#issuecomment-508272992>
+// --------------------------------------------------------------------------------
+
+mi_decl_nodiscard mi_decl_export void* mi_rezalloc(void* p, size_t newsize)                mi_attr_noexcept mi_attr_alloc_size(2);
+mi_decl_nodiscard mi_decl_export void* mi_recalloc(void* p, size_t newcount, size_t size)  mi_attr_noexcept mi_attr_alloc_size2(2,3);
+
+mi_decl_nodiscard mi_decl_export void* mi_rezalloc_aligned(void* p, size_t newsize, size_t alignment) mi_attr_noexcept mi_attr_alloc_size(2) mi_attr_alloc_align(3);
+mi_decl_nodiscard mi_decl_export void* mi_rezalloc_aligned_at(void* p, size_t newsize, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_alloc_size(2);
+mi_decl_nodiscard mi_decl_export void* mi_recalloc_aligned(void* p, size_t newcount, size_t size, size_t alignment) mi_attr_noexcept mi_attr_alloc_size2(2,3) mi_attr_alloc_align(4);
+mi_decl_nodiscard mi_decl_export void* mi_recalloc_aligned_at(void* p, size_t newcount, size_t size, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_alloc_size2(2,3);
+
+mi_decl_nodiscard mi_decl_export void* mi_heap_rezalloc(mi_heap_t* heap, void* p, size_t newsize)                mi_attr_noexcept mi_attr_alloc_size(3);
+mi_decl_nodiscard mi_decl_export void* mi_heap_recalloc(mi_heap_t* heap, void* p, size_t newcount, size_t size)  mi_attr_noexcept mi_attr_alloc_size2(3,4);
+
+mi_decl_nodiscard mi_decl_export void* mi_heap_rezalloc_aligned(mi_heap_t* heap, void* p, size_t newsize, size_t alignment) mi_attr_noexcept mi_attr_alloc_size(3) mi_attr_alloc_align(4);
+mi_decl_nodiscard mi_decl_export void* mi_heap_rezalloc_aligned_at(mi_heap_t* heap, void* p, size_t newsize, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_alloc_size(3);
+mi_decl_nodiscard mi_decl_export void* mi_heap_recalloc_aligned(mi_heap_t* heap, void* p, size_t newcount, size_t size, size_t alignment) mi_attr_noexcept mi_attr_alloc_size2(3,4) mi_attr_alloc_align(5);
+mi_decl_nodiscard mi_decl_export void* mi_heap_recalloc_aligned_at(mi_heap_t* heap, void* p, size_t newcount, size_t size, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_alloc_size2(3,4);
+
+
+// ------------------------------------------------------
+// Analysis
+// ------------------------------------------------------
+
+mi_decl_export bool mi_heap_contains_block(mi_heap_t* heap, const void* p);
+mi_decl_export bool mi_heap_check_owned(mi_heap_t* heap, const void* p);
+mi_decl_export bool mi_check_owned(const void* p);
+
+// An area of heap space contains blocks of a single size.
+typedef struct mi_heap_area_s {
+  void*  blocks;      // start of the area containing heap blocks
+  size_t reserved;    // bytes reserved for this area (virtual)
+  size_t committed;   // current available bytes for this area
+  size_t used;        // number of allocated blocks
+  size_t block_size;  // size in bytes of each block
+  size_t full_block_size; // size in bytes of a full block including padding and metadata.
+  int    heap_tag;    // heap tag associated with this area
+} mi_heap_area_t;
+
+typedef bool (mi_cdecl mi_block_visit_fun)(const mi_heap_t* heap, const mi_heap_area_t* area, void* block, size_t block_size, void* arg);
+
+mi_decl_export bool mi_heap_visit_blocks(const mi_heap_t* heap, bool visit_blocks, mi_block_visit_fun* visitor, void* arg);
+
+// Experimental
+mi_decl_nodiscard mi_decl_export bool mi_is_in_heap_region(const void* p) mi_attr_noexcept;
+mi_decl_nodiscard mi_decl_export bool mi_is_redirected(void) mi_attr_noexcept;
+
+mi_decl_export int   mi_reserve_huge_os_pages_interleave(size_t pages, size_t numa_nodes, size_t timeout_msecs) mi_attr_noexcept;
+mi_decl_export int   mi_reserve_huge_os_pages_at(size_t pages, int numa_node, size_t timeout_msecs) mi_attr_noexcept;
+
+mi_decl_export int   mi_reserve_os_memory(size_t size, bool commit, bool allow_large) mi_attr_noexcept;
+mi_decl_export bool  mi_manage_os_memory(void* start, size_t size, bool is_committed, bool is_large, bool is_zero, int numa_node) mi_attr_noexcept;
+
+mi_decl_export void  mi_debug_show_arenas(void) mi_attr_noexcept;
+mi_decl_export void  mi_arenas_print(void) mi_attr_noexcept;
+
+// Experimental: heaps associated with specific memory arenas
+typedef int mi_arena_id_t;
+mi_decl_export void* mi_arena_area(mi_arena_id_t arena_id, size_t* size);
+mi_decl_export int   mi_reserve_huge_os_pages_at_ex(size_t pages, int numa_node, size_t timeout_msecs, bool exclusive, mi_arena_id_t* arena_id) mi_attr_noexcept;
+mi_decl_export int   mi_reserve_os_memory_ex(size_t size, bool commit, bool allow_large, bool exclusive, mi_arena_id_t* arena_id) mi_attr_noexcept;
+mi_decl_export bool  mi_manage_os_memory_ex(void* start, size_t size, bool is_committed, bool is_large, bool is_zero, int numa_node, bool exclusive, mi_arena_id_t* arena_id) mi_attr_noexcept;
+
+#if MI_MALLOC_VERSION >= 182
+// Create a heap that only allocates in the specified arena
+mi_decl_nodiscard mi_decl_export mi_heap_t* mi_heap_new_in_arena(mi_arena_id_t arena_id);
+#endif
+
+
+// Experimental: allow sub-processes whose memory areas stay separated (and no reclamation between them)
+// Used for example for separate interpreters in one process.
+typedef void* mi_subproc_id_t;
+mi_decl_export mi_subproc_id_t mi_subproc_main(void);
+mi_decl_export mi_subproc_id_t mi_subproc_new(void);
+mi_decl_export void mi_subproc_delete(mi_subproc_id_t subproc);
+mi_decl_export void mi_subproc_add_current_thread(mi_subproc_id_t subproc); // this should be called right after a thread is created (and no allocation has taken place yet)
+
+// Experimental: visit abandoned heap areas (that are not owned by a specific heap)
+mi_decl_export bool mi_abandoned_visit_blocks(mi_subproc_id_t subproc_id, int heap_tag, bool visit_blocks, mi_block_visit_fun* visitor, void* arg);
+
+// Experimental: objects followed by a guard page.
+// A sample rate of 0 disables guarded objects, while 1 uses a guard page for every object.
+// A seed of 0 uses a random start point. Only objects within the size bound are eligible for guard pages.
+mi_decl_export void mi_heap_guarded_set_sample_rate(mi_heap_t* heap, size_t sample_rate, size_t seed);
+mi_decl_export void mi_heap_guarded_set_size_bound(mi_heap_t* heap, size_t min, size_t max);
+
+// Experimental: communicate that the thread is part of a threadpool
+mi_decl_export void mi_thread_set_in_threadpool(void) mi_attr_noexcept;
+
+// Experimental: create a new heap with a specified heap tag. Set `allow_destroy` to false to allow the thread
+// to reclaim abandoned memory (with a compatible heap_tag and arena_id) but in that case `mi_heap_destroy` will
+// fall back to `mi_heap_delete`.
+mi_decl_nodiscard mi_decl_export mi_heap_t* mi_heap_new_ex(int heap_tag, bool allow_destroy, mi_arena_id_t arena_id);
+
+// deprecated
+mi_decl_export int mi_reserve_huge_os_pages(size_t pages, double max_secs, size_t* pages_reserved) mi_attr_noexcept;
+mi_decl_export void mi_collect_reduce(size_t target_thread_owned) mi_attr_noexcept;
+
+
+
+// ------------------------------------------------------
+// Convenience
+// ------------------------------------------------------
+
+#define mi_malloc_tp(tp)                ((tp*)mi_malloc(sizeof(tp)))
+#define mi_zalloc_tp(tp)                ((tp*)mi_zalloc(sizeof(tp)))
+#define mi_calloc_tp(tp,n)              ((tp*)mi_calloc(n,sizeof(tp)))
+#define mi_mallocn_tp(tp,n)             ((tp*)mi_mallocn(n,sizeof(tp)))
+#define mi_reallocn_tp(p,tp,n)          ((tp*)mi_reallocn(p,n,sizeof(tp)))
+#define mi_recalloc_tp(p,tp,n)          ((tp*)mi_recalloc(p,n,sizeof(tp)))
+
+#define mi_heap_malloc_tp(hp,tp)        ((tp*)mi_heap_malloc(hp,sizeof(tp)))
+#define mi_heap_zalloc_tp(hp,tp)        ((tp*)mi_heap_zalloc(hp,sizeof(tp)))
+#define mi_heap_calloc_tp(hp,tp,n)      ((tp*)mi_heap_calloc(hp,n,sizeof(tp)))
+#define mi_heap_mallocn_tp(hp,tp,n)     ((tp*)mi_heap_mallocn(hp,n,sizeof(tp)))
+#define mi_heap_reallocn_tp(hp,p,tp,n)  ((tp*)mi_heap_reallocn(hp,p,n,sizeof(tp)))
+#define mi_heap_recalloc_tp(hp,p,tp,n)  ((tp*)mi_heap_recalloc(hp,p,n,sizeof(tp)))
+
+
+// ------------------------------------------------------
+// Options
+// ------------------------------------------------------
+
+typedef enum mi_option_e {
+  // stable options
+  mi_option_show_errors,                // print error messages
+  mi_option_show_stats,                 // print statistics on termination
+  mi_option_verbose,                    // print verbose messages
+  // advanced options
+  mi_option_eager_commit,               // eager commit segments? (after `eager_commit_delay` segments) (=1)
+  mi_option_arena_eager_commit,         // eager commit arenas? Use 2 to enable just on overcommit systems (=2)
+  mi_option_purge_decommits,            // should a memory purge decommit? (=1). Set to 0 to use memory reset on a purge (instead of decommit)
+  mi_option_allow_large_os_pages,       // allow use of large (2 or 4 MiB) OS pages, implies eager commit.
+  mi_option_reserve_huge_os_pages,      // reserve N huge OS pages (1GiB pages) at startup
+  mi_option_reserve_huge_os_pages_at,   // reserve huge OS pages at a specific NUMA node
+  mi_option_reserve_os_memory,          // reserve specified amount of OS memory in an arena at startup (internally, this value is in KiB; use `mi_option_get_size`)
+  mi_option_deprecated_segment_cache,
+  mi_option_deprecated_page_reset,
+  mi_option_abandoned_page_purge,       // immediately purge delayed purges on thread termination
+  mi_option_deprecated_segment_reset,
+  mi_option_eager_commit_delay,         // the first N segments per thread are not eagerly committed (but per page in the segment on demand)
+  mi_option_purge_delay,                // memory purging is delayed by N milliseconds; use 0 for immediate purging or -1 for no purging at all. (=10)
+  mi_option_use_numa_nodes,             // 0 = use all available numa nodes, otherwise use at most N nodes.
+  mi_option_disallow_os_alloc,          // 1 = do not use OS memory for allocation (but only programmatically reserved arenas)
+  mi_option_os_tag,                     // tag used for OS logging (macOS only for now) (=100)
+  mi_option_max_errors,                 // issue at most N error messages
+  mi_option_max_warnings,               // issue at most N warning messages
+  mi_option_max_segment_reclaim,        // max. percentage of the abandoned segments that can be reclaimed per try (=10%)
+  mi_option_destroy_on_exit,            // if set, release all memory on exit; sometimes used for dynamic unloading but can be unsafe
+  mi_option_arena_reserve,              // initial memory size for arena reservation (= 1 GiB on 64-bit) (internally, this value is in KiB; use `mi_option_get_size`)
+  mi_option_arena_purge_mult,           // multiplier for `purge_delay` for the purging delay for arenas (=10)
+  mi_option_purge_extend_delay,
+  mi_option_abandoned_reclaim_on_free,  // allow to reclaim an abandoned segment on a free (=1)
+  mi_option_disallow_arena_alloc,       // 1 = do not use arenas for allocation (except if using specific arena ids)
+  mi_option_retry_on_oom,               // retry on out-of-memory for N milliseconds (=400), set to 0 to disable retries. (only on Windows)
+  mi_option_visit_abandoned,            // allow visiting heap blocks from abandoned threads (=0)
+  mi_option_guarded_min,                // only used when building with MI_GUARDED: minimal rounded object size for guarded objects (=0)
+  mi_option_guarded_max,                // only used when building with MI_GUARDED: maximal rounded object size for guarded objects (=0)
+  mi_option_guarded_precise,            // disregard minimal alignment requirement to always place guarded blocks exactly in front of a guard page (=0)
+  mi_option_guarded_sample_rate,        // 1 out of N allocations in the min/max range will be guarded (=1000)
+  mi_option_guarded_sample_seed,        // can be set to allow for a (more) deterministic re-execution when a guard page is triggered (=0)
+  mi_option_target_segments_per_thread, // experimental (=0)
+  mi_option_generic_collect,            // collect heaps every N (=10000) generic allocation calls
+  mi_option_allow_thp,                  // allow transparent huge pages? (=1) (on Android =0 by default). Set to 0 to disable THP for the process.
+  _mi_option_last,
+  // legacy option names
+  mi_option_large_os_pages = mi_option_allow_large_os_pages,
+  mi_option_eager_region_commit = mi_option_arena_eager_commit,
+  mi_option_reset_decommits = mi_option_purge_decommits,
+  mi_option_reset_delay = mi_option_purge_delay,
+  mi_option_abandoned_page_reset = mi_option_abandoned_page_purge,
+  mi_option_limit_os_alloc = mi_option_disallow_os_alloc
+} mi_option_t;
+
+
+mi_decl_nodiscard mi_decl_export bool mi_option_is_enabled(mi_option_t option);
+mi_decl_export void mi_option_enable(mi_option_t option);
+mi_decl_export void mi_option_disable(mi_option_t option);
+mi_decl_export void mi_option_set_enabled(mi_option_t option, bool enable);
+mi_decl_export void mi_option_set_enabled_default(mi_option_t option, bool enable);
+
+mi_decl_nodiscard mi_decl_export long   mi_option_get(mi_option_t option);
+mi_decl_nodiscard mi_decl_export long   mi_option_get_clamp(mi_option_t option, long min, long max);
+mi_decl_nodiscard mi_decl_export size_t mi_option_get_size(mi_option_t option);
+mi_decl_export void mi_option_set(mi_option_t option, long value);
+mi_decl_export void mi_option_set_default(mi_option_t option, long value);
+
+
+// -------------------------------------------------------------------------------------------------------
+// "mi" prefixed implementations of various POSIX, Unix, Windows, and C++ allocation functions.
+// (This can be convenient when providing overrides of these functions as done in `mimalloc-override.h`.)
+// note: we use `mi_cfree` as "checked free" and it checks if the pointer is in our heap before free-ing.
+// -------------------------------------------------------------------------------------------------------
+
+mi_decl_export void  mi_cfree(void* p) mi_attr_noexcept;
+mi_decl_export void* mi__expand(void* p, size_t newsize) mi_attr_noexcept;
+mi_decl_nodiscard mi_decl_export size_t mi_malloc_size(const void* p)        mi_attr_noexcept;
+mi_decl_nodiscard mi_decl_export size_t mi_malloc_good_size(size_t size)     mi_attr_noexcept;
+mi_decl_nodiscard mi_decl_export size_t mi_malloc_usable_size(const void *p) mi_attr_noexcept;
+
+mi_decl_export int mi_posix_memalign(void** p, size_t alignment, size_t size)   mi_attr_noexcept;
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_memalign(size_t alignment, size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(2) mi_attr_alloc_align(1);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_valloc(size_t size)  mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_pvalloc(size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_aligned_alloc(size_t alignment, size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(2) mi_attr_alloc_align(1);
+
+mi_decl_nodiscard mi_decl_export void* mi_reallocarray(void* p, size_t count, size_t size) mi_attr_noexcept mi_attr_alloc_size2(2,3);
+mi_decl_nodiscard mi_decl_export int   mi_reallocarr(void* p, size_t count, size_t size) mi_attr_noexcept;
+mi_decl_nodiscard mi_decl_export void* mi_aligned_recalloc(void* p, size_t newcount, size_t size, size_t alignment) mi_attr_noexcept;
+mi_decl_nodiscard mi_decl_export void* mi_aligned_offset_recalloc(void* p, size_t newcount, size_t size, size_t alignment, size_t offset) mi_attr_noexcept;
+
+mi_decl_nodiscard mi_decl_export mi_decl_restrict unsigned short* mi_wcsdup(const unsigned short* s) mi_attr_noexcept mi_attr_malloc;
+mi_decl_nodiscard mi_decl_export mi_decl_restrict unsigned char*  mi_mbsdup(const unsigned char* s)  mi_attr_noexcept mi_attr_malloc;
+mi_decl_export int mi_dupenv_s(char** buf, size_t* size, const char* name)                      mi_attr_noexcept;
+mi_decl_export int mi_wdupenv_s(unsigned short** buf, size_t* size, const unsigned short* name) mi_attr_noexcept;
+
+mi_decl_export void mi_free_size(void* p, size_t size)                           mi_attr_noexcept;
+mi_decl_export void mi_free_size_aligned(void* p, size_t size, size_t alignment) mi_attr_noexcept;
+mi_decl_export void mi_free_aligned(void* p, size_t alignment)                   mi_attr_noexcept;
+
+// The `mi_new` wrappers implement C++ semantics on out-of-memory instead of directly returning `NULL`.
+// (and call `std::get_new_handler` and potentially raise a `std::bad_alloc` exception).
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_new(size_t size)                   mi_attr_malloc mi_attr_alloc_size(1);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_new_aligned(size_t size, size_t alignment) mi_attr_malloc mi_attr_alloc_size(1) mi_attr_alloc_align(2);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_new_nothrow(size_t size)           mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_new_aligned_nothrow(size_t size, size_t alignment) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1) mi_attr_alloc_align(2);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_new_n(size_t count, size_t size)   mi_attr_malloc mi_attr_alloc_size2(1, 2);
+mi_decl_nodiscard mi_decl_export void* mi_new_realloc(void* p, size_t newsize)                mi_attr_alloc_size(2);
+mi_decl_nodiscard mi_decl_export void* mi_new_reallocn(void* p, size_t newcount, size_t size) mi_attr_alloc_size2(2, 3);
+
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_alloc_new(mi_heap_t* heap, size_t size)                mi_attr_malloc mi_attr_alloc_size(2);
+mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_alloc_new_n(mi_heap_t* heap, size_t count, size_t size) mi_attr_malloc mi_attr_alloc_size2(2, 3);
+
+#ifdef __cplusplus
+}
+#endif
+
+// ---------------------------------------------------------------------------------------------
+// Implement the C++ std::allocator interface for use in STL containers.
+// (note: see `mimalloc-new-delete.h` for overriding the new/delete operators globally)
+// ---------------------------------------------------------------------------------------------
+#ifdef __cplusplus
+
+#include <cstddef>     // std::size_t
+#include <cstdint>     // PTRDIFF_MAX
+#if (__cplusplus >= 201103L) || (_MSC_VER > 1900)  // C++11
+#include <type_traits> // std::true_type
+#include <utility>     // std::forward
+#endif
+
+template<class T> struct _mi_stl_allocator_common {
+  typedef T                 value_type;
+  typedef std::size_t       size_type;
+  typedef std::ptrdiff_t    difference_type;
+  typedef value_type&       reference;
+  typedef value_type const& const_reference;
+  typedef value_type*       pointer;
+  typedef value_type const* const_pointer;
+
+  #if ((__cplusplus >= 201103L) || (_MSC_VER > 1900))  // C++11
+  using propagate_on_container_copy_assignment = std::true_type;
+  using propagate_on_container_move_assignment = std::true_type;
+  using propagate_on_container_swap            = std::true_type;
+  template <class U, class ...Args> void construct(U* p, Args&& ...args) { ::new(p) U(std::forward<Args>(args)...); }
+  template <class U> void destroy(U* p) mi_attr_noexcept { p->~U(); }
+  #else
+  void construct(pointer p, value_type const& val) { ::new(p) value_type(val); }
+  void destroy(pointer p) { p->~value_type(); }
+  #endif
+
+  size_type     max_size() const mi_attr_noexcept { return (PTRDIFF_MAX/sizeof(value_type)); }
+  pointer       address(reference x) const        { return &x; }
+  const_pointer address(const_reference x) const  { return &x; }
+};
+
+template<class T> struct mi_stl_allocator : public _mi_stl_allocator_common<T> {
+  using typename _mi_stl_allocator_common<T>::size_type;
+  using typename _mi_stl_allocator_common<T>::value_type;
+  using typename _mi_stl_allocator_common<T>::pointer;
+  template <class U> struct rebind { typedef mi_stl_allocator<U> other; };
+
+  mi_stl_allocator()                                             mi_attr_noexcept = default;
+  mi_stl_allocator(const mi_stl_allocator&)                      mi_attr_noexcept = default;
+  template<class U> mi_stl_allocator(const mi_stl_allocator<U>&) mi_attr_noexcept { }
+  mi_stl_allocator  select_on_container_copy_construction() const { return *this; }
+  void              deallocate(T* p, size_type) { mi_free(p); }
+
+  #if (__cplusplus >= 201703L)  // C++17
+  mi_decl_nodiscard T* allocate(size_type count) { return static_cast<T*>(mi_new_n(count, sizeof(T))); }
+  mi_decl_nodiscard T* allocate(size_type count, const void*) { return allocate(count); }
+  #else
+  mi_decl_nodiscard pointer allocate(size_type count, const void* = 0) { return static_cast<pointer>(mi_new_n(count, sizeof(value_type))); }
+  #endif
+
+  #if ((__cplusplus >= 201103L) || (_MSC_VER > 1900))  // C++11
+  using is_always_equal = std::true_type;
+  #endif
+};
+
+template<class T1,class T2> bool operator==(const mi_stl_allocator<T1>& , const mi_stl_allocator<T2>& ) mi_attr_noexcept { return true; }
+template<class T1,class T2> bool operator!=(const mi_stl_allocator<T1>& , const mi_stl_allocator<T2>& ) mi_attr_noexcept { return false; }
+
+
+#if (__cplusplus >= 201103L) || (_MSC_VER >= 1900)  // C++11
+#define MI_HAS_HEAP_STL_ALLOCATOR 1
+
+#include <memory>      // std::shared_ptr
+
+// Common base class for STL allocators in a specific heap
+template<class T, bool _mi_destroy> struct _mi_heap_stl_allocator_common : public _mi_stl_allocator_common<T> {
+  using typename _mi_stl_allocator_common<T>::size_type;
+  using typename _mi_stl_allocator_common<T>::value_type;
+  using typename _mi_stl_allocator_common<T>::pointer;
+
+  _mi_heap_stl_allocator_common(mi_heap_t* hp) : heap(hp, [](mi_heap_t*) {}) {}    /* will not delete nor destroy the passed in heap */
+
+  #if (__cplusplus >= 201703L)  // C++17
+  mi_decl_nodiscard T* allocate(size_type count) { return static_cast<T*>(mi_heap_alloc_new_n(this->heap.get(), count, sizeof(T))); }
+  mi_decl_nodiscard T* allocate(size_type count, const void*) { return allocate(count); }
+  #else
+  mi_decl_nodiscard pointer allocate(size_type count, const void* = 0) { return static_cast<pointer>(mi_heap_alloc_new_n(this->heap.get(), count, sizeof(value_type))); }
+  #endif
+
+  #if ((__cplusplus >= 201103L) || (_MSC_VER > 1900))  // C++11
+  using is_always_equal = std::false_type;
+  #endif
+
+  void collect(bool force) { mi_heap_collect(this->heap.get(), force); }
+  template<class U> bool is_equal(const _mi_heap_stl_allocator_common<U, _mi_destroy>& x) const { return (this->heap == x.heap); }
+
+protected:
+  std::shared_ptr<mi_heap_t> heap;
+  template<class U, bool D> friend struct _mi_heap_stl_allocator_common;
+
+  _mi_heap_stl_allocator_common() {
+    mi_heap_t* hp = mi_heap_new();
+    this->heap.reset(hp, (_mi_destroy ? &heap_destroy : &heap_delete));  /* calls heap_delete/destroy when the refcount drops to zero */
+  }
+  _mi_heap_stl_allocator_common(const _mi_heap_stl_allocator_common& x) mi_attr_noexcept : heap(x.heap) { }
+  template<class U> _mi_heap_stl_allocator_common(const _mi_heap_stl_allocator_common<U, _mi_destroy>& x) mi_attr_noexcept : heap(x.heap) { }
+
+private:
+  static void heap_delete(mi_heap_t* hp)  { if (hp != NULL) { mi_heap_delete(hp); } }
+  static void heap_destroy(mi_heap_t* hp) { if (hp != NULL) { mi_heap_destroy(hp); } }
+};
+
+// STL allocator allocation in a specific heap
+template<class T> struct mi_heap_stl_allocator : public _mi_heap_stl_allocator_common<T, false> {
+  using typename _mi_heap_stl_allocator_common<T, false>::size_type;
+  mi_heap_stl_allocator() : _mi_heap_stl_allocator_common<T, false>() { } // creates fresh heap that is deleted when the destructor is called
+  mi_heap_stl_allocator(mi_heap_t* hp) : _mi_heap_stl_allocator_common<T, false>(hp) { }  // no delete nor destroy on the passed in heap
+  template<class U> mi_heap_stl_allocator(const mi_heap_stl_allocator<U>& x) mi_attr_noexcept : _mi_heap_stl_allocator_common<T, false>(x) { }
+
+  mi_heap_stl_allocator select_on_container_copy_construction() const { return *this; }
+  void deallocate(T* p, size_type) { mi_free(p); }
+  template<class U> struct rebind { typedef mi_heap_stl_allocator<U> other; };
+};
+
+template<class T1, class T2> bool operator==(const mi_heap_stl_allocator<T1>& x, const mi_heap_stl_allocator<T2>& y) mi_attr_noexcept { return (x.is_equal(y)); }
+template<class T1, class T2> bool operator!=(const mi_heap_stl_allocator<T1>& x, const mi_heap_stl_allocator<T2>& y) mi_attr_noexcept { return (!x.is_equal(y)); }
+
+
+// STL allocator allocation in a specific heap, where `free` does nothing and
+// the heap is destroyed in one go on destruction -- use with care!
+template<class T> struct mi_heap_destroy_stl_allocator : public _mi_heap_stl_allocator_common<T, true> {
+  using typename _mi_heap_stl_allocator_common<T, true>::size_type;
+  mi_heap_destroy_stl_allocator() : _mi_heap_stl_allocator_common<T, true>() { } // creates fresh heap that is destroyed when the destructor is called
+  mi_heap_destroy_stl_allocator(mi_heap_t* hp) : _mi_heap_stl_allocator_common<T, true>(hp) { }  // no delete nor destroy on the passed in heap
+  template<class U> mi_heap_destroy_stl_allocator(const mi_heap_destroy_stl_allocator<U>& x) mi_attr_noexcept : _mi_heap_stl_allocator_common<T, true>(x) { }
+
+  mi_heap_destroy_stl_allocator select_on_container_copy_construction() const { return *this; }
+  void deallocate(T*, size_type) { /* do nothing as we destroy the heap on destruct. */ }
+  template<class U> struct rebind { typedef mi_heap_destroy_stl_allocator<U> other; };
+};
+
+template<class T1, class T2> bool operator==(const mi_heap_destroy_stl_allocator<T1>& x, const mi_heap_destroy_stl_allocator<T2>& y) mi_attr_noexcept { return (x.is_equal(y)); }
+template<class T1, class T2> bool operator!=(const mi_heap_destroy_stl_allocator<T1>& x, const mi_heap_destroy_stl_allocator<T2>& y) mi_attr_noexcept { return (!x.is_equal(y)); }
+
+#endif // C++11
+
+#endif // __cplusplus
+
+#endif
diff --git a/compat/mimalloc/mimalloc/atomic.h b/compat/mimalloc/mimalloc/atomic.h
new file mode 100644
index 00000000000000..e8bac316b3a6f3
--- /dev/null
+++ b/compat/mimalloc/mimalloc/atomic.h
@@ -0,0 +1,557 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2024 Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+#pragma once
+#ifndef MIMALLOC_ATOMIC_H
+#define MIMALLOC_ATOMIC_H
+
+// include windows.h or pthread.h
+#if defined(_WIN32)
+#ifndef WIN32_LEAN_AND_MEAN
+#define WIN32_LEAN_AND_MEAN
+#endif
+#include <windows.h>
+#elif !defined(__wasi__) && (!defined(__EMSCRIPTEN__) || defined(__EMSCRIPTEN_PTHREADS__))
+#define  MI_USE_PTHREADS
+#include <pthread.h>
+#endif
+
+// --------------------------------------------------------------------------------------------
+// Atomics
+// We need to be portable between C, C++, and MSVC.
+// We base the primitives on the C/C++ atomics and create a minimal wrapper for MSVC in C compilation mode.
+// This is why we try to use only `uintptr_t` and `<type>*` as atomic types.
+// To gain better insight into the range of used atomics, we use explicitly named memory order operations
+// instead of passing the memory order as a parameter.
+// -----------------------------------------------------------------------------------------------
+
+#if defined(__cplusplus)
+// Use C++ atomics
+#include <atomic>
+#define  _Atomic(tp)              std::atomic<tp>
+#define  mi_atomic(name)          std::atomic_##name
+#define  mi_memory_order(name)    std::memory_order_##name
+#if (__cplusplus >= 202002L)      // c++20, see issue #571
+ #define MI_ATOMIC_VAR_INIT(x)    x
+#elif !defined(ATOMIC_VAR_INIT)
+ #define MI_ATOMIC_VAR_INIT(x)    x
+#else
+ #define MI_ATOMIC_VAR_INIT(x)    ATOMIC_VAR_INIT(x)
+#endif
+#elif defined(_MSC_VER)
+// Use MSVC C wrapper for C11 atomics
+#define  _Atomic(tp)              tp
+#define  MI_ATOMIC_VAR_INIT(x)    x
+#define  mi_atomic(name)          mi_atomic_##name
+#define  mi_memory_order(name)    mi_memory_order_##name
+#else
+// Use C11 atomics
+#include <stdatomic.h>
+#define  mi_atomic(name)          atomic_##name
+#define  mi_memory_order(name)    memory_order_##name
+#if (__STDC_VERSION__ >= 201710L) // c17, see issue #735
+ #define MI_ATOMIC_VAR_INIT(x)    x
+#elif !defined(ATOMIC_VAR_INIT)
+ #define MI_ATOMIC_VAR_INIT(x)    x
+#else
+ #define MI_ATOMIC_VAR_INIT(x)    ATOMIC_VAR_INIT(x)
+#endif
+#endif
+
+// Various defines for all used memory orders in mimalloc
+#define mi_atomic_cas_weak(p,expected,desired,mem_success,mem_fail)  \
+  mi_atomic(compare_exchange_weak_explicit)(p,expected,desired,mem_success,mem_fail)
+
+#define mi_atomic_cas_strong(p,expected,desired,mem_success,mem_fail)  \
+  mi_atomic(compare_exchange_strong_explicit)(p,expected,desired,mem_success,mem_fail)
+
+#define mi_atomic_load_acquire(p)                mi_atomic(load_explicit)(p,mi_memory_order(acquire))
+#define mi_atomic_load_relaxed(p)                mi_atomic(load_explicit)(p,mi_memory_order(relaxed))
+#define mi_atomic_store_release(p,x)             mi_atomic(store_explicit)(p,x,mi_memory_order(release))
+#define mi_atomic_store_relaxed(p,x)             mi_atomic(store_explicit)(p,x,mi_memory_order(relaxed))
+#define mi_atomic_exchange_relaxed(p,x)          mi_atomic(exchange_explicit)(p,x,mi_memory_order(relaxed))
+#define mi_atomic_exchange_release(p,x)          mi_atomic(exchange_explicit)(p,x,mi_memory_order(release))
+#define mi_atomic_exchange_acq_rel(p,x)          mi_atomic(exchange_explicit)(p,x,mi_memory_order(acq_rel))
+#define mi_atomic_cas_weak_release(p,exp,des)    mi_atomic_cas_weak(p,exp,des,mi_memory_order(release),mi_memory_order(relaxed))
+#define mi_atomic_cas_weak_acq_rel(p,exp,des)    mi_atomic_cas_weak(p,exp,des,mi_memory_order(acq_rel),mi_memory_order(acquire))
+#define mi_atomic_cas_strong_release(p,exp,des)  mi_atomic_cas_strong(p,exp,des,mi_memory_order(release),mi_memory_order(relaxed))
+#define mi_atomic_cas_strong_acq_rel(p,exp,des)  mi_atomic_cas_strong(p,exp,des,mi_memory_order(acq_rel),mi_memory_order(acquire))
+
+#define mi_atomic_add_relaxed(p,x)               mi_atomic(fetch_add_explicit)(p,x,mi_memory_order(relaxed))
+#define mi_atomic_sub_relaxed(p,x)               mi_atomic(fetch_sub_explicit)(p,x,mi_memory_order(relaxed))
+#define mi_atomic_add_acq_rel(p,x)               mi_atomic(fetch_add_explicit)(p,x,mi_memory_order(acq_rel))
+#define mi_atomic_sub_acq_rel(p,x)               mi_atomic(fetch_sub_explicit)(p,x,mi_memory_order(acq_rel))
+#define mi_atomic_and_acq_rel(p,x)               mi_atomic(fetch_and_explicit)(p,x,mi_memory_order(acq_rel))
+#define mi_atomic_or_acq_rel(p,x)                mi_atomic(fetch_or_explicit)(p,x,mi_memory_order(acq_rel))
+
+#define mi_atomic_increment_relaxed(p)           mi_atomic_add_relaxed(p,(uintptr_t)1)
+#define mi_atomic_decrement_relaxed(p)           mi_atomic_sub_relaxed(p,(uintptr_t)1)
+#define mi_atomic_increment_acq_rel(p)           mi_atomic_add_acq_rel(p,(uintptr_t)1)
+#define mi_atomic_decrement_acq_rel(p)           mi_atomic_sub_acq_rel(p,(uintptr_t)1)
+
+static inline void mi_atomic_yield(void);
+static inline intptr_t mi_atomic_addi(_Atomic(intptr_t)*p, intptr_t add);
+static inline intptr_t mi_atomic_subi(_Atomic(intptr_t)*p, intptr_t sub);
+
+
+#if defined(__cplusplus) || !defined(_MSC_VER)
+
+// In C++/C11 atomics we have polymorphic atomics so can use the typed `ptr` variants (where `tp` is the type of atomic value)
+// We use these macros so we can provide a typed wrapper in MSVC in C compilation mode as well
+#define mi_atomic_load_ptr_acquire(tp,p)                mi_atomic_load_acquire(p)
+#define mi_atomic_load_ptr_relaxed(tp,p)                mi_atomic_load_relaxed(p)
+
+// In C++ we need to add casts to help resolve templates if NULL is passed
+#if defined(__cplusplus)
+#define mi_atomic_store_ptr_release(tp,p,x)             mi_atomic_store_release(p,(tp*)x)
+#define mi_atomic_store_ptr_relaxed(tp,p,x)             mi_atomic_store_relaxed(p,(tp*)x)
+#define mi_atomic_cas_ptr_weak_release(tp,p,exp,des)    mi_atomic_cas_weak_release(p,exp,(tp*)des)
+#define mi_atomic_cas_ptr_weak_acq_rel(tp,p,exp,des)    mi_atomic_cas_weak_acq_rel(p,exp,(tp*)des)
+#define mi_atomic_cas_ptr_strong_release(tp,p,exp,des)  mi_atomic_cas_strong_release(p,exp,(tp*)des)
+#define mi_atomic_cas_ptr_strong_acq_rel(tp,p,exp,des)  mi_atomic_cas_strong_acq_rel(p,exp,(tp*)des)
+#define mi_atomic_exchange_ptr_relaxed(tp,p,x)          mi_atomic_exchange_relaxed(p,(tp*)x)
+#define mi_atomic_exchange_ptr_release(tp,p,x)          mi_atomic_exchange_release(p,(tp*)x)
+#define mi_atomic_exchange_ptr_acq_rel(tp,p,x)          mi_atomic_exchange_acq_rel(p,(tp*)x)
+#else
+#define mi_atomic_store_ptr_release(tp,p,x)             mi_atomic_store_release(p,x)
+#define mi_atomic_store_ptr_relaxed(tp,p,x)             mi_atomic_store_relaxed(p,x)
+#define mi_atomic_cas_ptr_weak_release(tp,p,exp,des)    mi_atomic_cas_weak_release(p,exp,des)
+#define mi_atomic_cas_ptr_weak_acq_rel(tp,p,exp,des)    mi_atomic_cas_weak_acq_rel(p,exp,des)
+#define mi_atomic_cas_ptr_strong_release(tp,p,exp,des)  mi_atomic_cas_strong_release(p,exp,des)
+#define mi_atomic_cas_ptr_strong_acq_rel(tp,p,exp,des)  mi_atomic_cas_strong_acq_rel(p,exp,des)
+#define mi_atomic_exchange_ptr_relaxed(tp,p,x)          mi_atomic_exchange_relaxed(p,x)
+#define mi_atomic_exchange_ptr_release(tp,p,x)          mi_atomic_exchange_release(p,x)
+#define mi_atomic_exchange_ptr_acq_rel(tp,p,x)          mi_atomic_exchange_acq_rel(p,x)
+#endif
+
+// These are used by the statistics
+static inline int64_t mi_atomic_addi64_relaxed(volatile int64_t* p, int64_t add) {
+  return mi_atomic(fetch_add_explicit)((_Atomic(int64_t)*)p, add, mi_memory_order(relaxed));
+}
+static inline void mi_atomic_void_addi64_relaxed(volatile int64_t* p, const volatile int64_t* padd) {
+  const int64_t add = mi_atomic_load_relaxed((_Atomic(int64_t)*)padd);
+  if (add != 0) {
+    mi_atomic(fetch_add_explicit)((_Atomic(int64_t)*)p, add, mi_memory_order(relaxed));
+  }
+}
+static inline void mi_atomic_maxi64_relaxed(volatile int64_t* p, int64_t x) {
+  int64_t current = mi_atomic_load_relaxed((_Atomic(int64_t)*)p);
+  while (current < x && !mi_atomic_cas_weak_release((_Atomic(int64_t)*)p, &current, x)) { /* nothing */ };
+}
+
+// Used by timers
+#define mi_atomic_loadi64_acquire(p)            mi_atomic(load_explicit)(p,mi_memory_order(acquire))
+#define mi_atomic_loadi64_relaxed(p)            mi_atomic(load_explicit)(p,mi_memory_order(relaxed))
+#define mi_atomic_storei64_release(p,x)         mi_atomic(store_explicit)(p,x,mi_memory_order(release))
+#define mi_atomic_storei64_relaxed(p,x)         mi_atomic(store_explicit)(p,x,mi_memory_order(relaxed))
+
+#define mi_atomic_casi64_strong_acq_rel(p,e,d)  mi_atomic_cas_strong_acq_rel(p,e,d)
+#define mi_atomic_addi64_acq_rel(p,i)           mi_atomic_add_acq_rel(p,i)
+
+
+#elif defined(_MSC_VER)
+
+// Legacy MSVC plain C compilation wrapper that uses Interlocked operations to model C11 atomics.
+#include <intrin.h>
+#ifdef _WIN64
+typedef LONG64   msc_intptr_t;
+#define MI_64(f) f##64
+#else
+typedef LONG     msc_intptr_t;
+#define MI_64(f) f
+#endif
+
+typedef enum mi_memory_order_e {
+  mi_memory_order_relaxed,
+  mi_memory_order_consume,
+  mi_memory_order_acquire,
+  mi_memory_order_release,
+  mi_memory_order_acq_rel,
+  mi_memory_order_seq_cst
+} mi_memory_order;
+
+static inline uintptr_t mi_atomic_fetch_add_explicit(_Atomic(uintptr_t)*p, uintptr_t add, mi_memory_order mo) {
+  (void)(mo);
+  return (uintptr_t)MI_64(_InterlockedExchangeAdd)((volatile msc_intptr_t*)p, (msc_intptr_t)add);
+}
+static inline uintptr_t mi_atomic_fetch_sub_explicit(_Atomic(uintptr_t)*p, uintptr_t sub, mi_memory_order mo) {
+  (void)(mo);
+  return (uintptr_t)MI_64(_InterlockedExchangeAdd)((volatile msc_intptr_t*)p, -((msc_intptr_t)sub));
+}
+static inline uintptr_t mi_atomic_fetch_and_explicit(_Atomic(uintptr_t)*p, uintptr_t x, mi_memory_order mo) {
+  (void)(mo);
+  return (uintptr_t)MI_64(_InterlockedAnd)((volatile msc_intptr_t*)p, (msc_intptr_t)x);
+}
+static inline uintptr_t mi_atomic_fetch_or_explicit(_Atomic(uintptr_t)*p, uintptr_t x, mi_memory_order mo) {
+  (void)(mo);
+  return (uintptr_t)MI_64(_InterlockedOr)((volatile msc_intptr_t*)p, (msc_intptr_t)x);
+}
+static inline bool mi_atomic_compare_exchange_strong_explicit(_Atomic(uintptr_t)*p, uintptr_t* expected, uintptr_t desired, mi_memory_order mo1, mi_memory_order mo2) {
+  (void)(mo1); (void)(mo2);
+  uintptr_t read = (uintptr_t)MI_64(_InterlockedCompareExchange)((volatile msc_intptr_t*)p, (msc_intptr_t)desired, (msc_intptr_t)(*expected));
+  if (read == *expected) {
+    return true;
+  }
+  else {
+    *expected = read;
+    return false;
+  }
+}
+static inline bool mi_atomic_compare_exchange_weak_explicit(_Atomic(uintptr_t)*p, uintptr_t* expected, uintptr_t desired, mi_memory_order mo1, mi_memory_order mo2) {
+  return mi_atomic_compare_exchange_strong_explicit(p, expected, desired, mo1, mo2);
+}
+static inline uintptr_t mi_atomic_exchange_explicit(_Atomic(uintptr_t)*p, uintptr_t exchange, mi_memory_order mo) {
+  (void)(mo);
+  return (uintptr_t)MI_64(_InterlockedExchange)((volatile msc_intptr_t*)p, (msc_intptr_t)exchange);
+}
+static inline void mi_atomic_thread_fence(mi_memory_order mo) {
+  (void)(mo);
+  _Atomic(uintptr_t) x = 0;
+  mi_atomic_exchange_explicit(&x, 1, mo);
+}
+static inline uintptr_t mi_atomic_load_explicit(_Atomic(uintptr_t) const* p, mi_memory_order mo) {
+  (void)(mo);
+#if defined(_M_IX86) || defined(_M_X64)
+  return *p;
+#else
+  uintptr_t x = *p;
+  if (mo > mi_memory_order_relaxed) {
+    while (!mi_atomic_compare_exchange_weak_explicit((_Atomic(uintptr_t)*)p, &x, x, mo, mi_memory_order_relaxed)) { /* nothing */ };
+  }
+  return x;
+#endif
+}
+static inline void mi_atomic_store_explicit(_Atomic(uintptr_t)*p, uintptr_t x, mi_memory_order mo) {
+  (void)(mo);
+#if defined(_M_IX86) || defined(_M_X64)
+  *p = x;
+#else
+  mi_atomic_exchange_explicit(p, x, mo);
+#endif
+}
+static inline int64_t mi_atomic_loadi64_explicit(_Atomic(int64_t)*p, mi_memory_order mo) {
+  (void)(mo);
+#if defined(_M_X64)
+  return *p;
+#else
+  int64_t old = *p;
+  int64_t x = old;
+  while ((old = InterlockedCompareExchange64(p, x, old)) != x) {
+    x = old;
+  }
+  return x;
+#endif
+}
+static inline void mi_atomic_storei64_explicit(_Atomic(int64_t)*p, int64_t x, mi_memory_order mo) {
+  (void)(mo);
+#if defined(_M_X64)  // a plain 64-bit store is not atomic on 32-bit x86
+  *p = x;
+#else
+  InterlockedExchange64(p, x);
+#endif
+}
+
+// These are used by the statistics
+static inline int64_t mi_atomic_addi64_relaxed(volatile _Atomic(int64_t)*p, int64_t add) {
+#ifdef _WIN64
+  return (int64_t)mi_atomic_addi((int64_t*)p, add);
+#else
+  int64_t current;
+  int64_t sum;
+  do {
+    current = *p;
+    sum = current + add;
+  } while (_InterlockedCompareExchange64(p, sum, current) != current);
+  return current;
+#endif
+}
+static inline void mi_atomic_void_addi64_relaxed(volatile int64_t* p, const volatile int64_t* padd) {
+  const int64_t add = *padd;
+  if (add != 0) {
+    mi_atomic_addi64_relaxed((volatile _Atomic(int64_t)*)p, add);
+  }
+}
+
+static inline void mi_atomic_maxi64_relaxed(volatile _Atomic(int64_t)*p, int64_t x) {
+  int64_t current;
+  do {
+    current = *p;
+  } while (current < x && _InterlockedCompareExchange64(p, x, current) != current);
+}
+
+static inline void mi_atomic_addi64_acq_rel(volatile _Atomic(int64_t)*p, int64_t i) {
+  mi_atomic_addi64_relaxed(p, i);
+}
+
+static inline bool mi_atomic_casi64_strong_acq_rel(volatile _Atomic(int64_t)*p, int64_t* exp, int64_t des) {
+  int64_t read = _InterlockedCompareExchange64(p, des, *exp);
+  if (read == *exp) {
+    return true;
+  }
+  else {
+    *exp = read;
+    return false;
+  }
+}
+
+// The pointer macros cast to `uintptr_t`.
+#define mi_atomic_load_ptr_acquire(tp,p)                (tp*)mi_atomic_load_acquire((_Atomic(uintptr_t)*)(p))
+#define mi_atomic_load_ptr_relaxed(tp,p)                (tp*)mi_atomic_load_relaxed((_Atomic(uintptr_t)*)(p))
+#define mi_atomic_store_ptr_release(tp,p,x)             mi_atomic_store_release((_Atomic(uintptr_t)*)(p),(uintptr_t)(x))
+#define mi_atomic_store_ptr_relaxed(tp,p,x)             mi_atomic_store_relaxed((_Atomic(uintptr_t)*)(p),(uintptr_t)(x))
+#define mi_atomic_cas_ptr_weak_release(tp,p,exp,des)    mi_atomic_cas_weak_release((_Atomic(uintptr_t)*)(p),(uintptr_t*)exp,(uintptr_t)des)
+#define mi_atomic_cas_ptr_weak_acq_rel(tp,p,exp,des)    mi_atomic_cas_weak_acq_rel((_Atomic(uintptr_t)*)(p),(uintptr_t*)exp,(uintptr_t)des)
+#define mi_atomic_cas_ptr_strong_release(tp,p,exp,des)  mi_atomic_cas_strong_release((_Atomic(uintptr_t)*)(p),(uintptr_t*)exp,(uintptr_t)des)
+#define mi_atomic_cas_ptr_strong_acq_rel(tp,p,exp,des)  mi_atomic_cas_strong_acq_rel((_Atomic(uintptr_t)*)(p),(uintptr_t*)exp,(uintptr_t)des)
+#define mi_atomic_exchange_ptr_relaxed(tp,p,x)          (tp*)mi_atomic_exchange_relaxed((_Atomic(uintptr_t)*)(p),(uintptr_t)x)
+#define mi_atomic_exchange_ptr_release(tp,p,x)          (tp*)mi_atomic_exchange_release((_Atomic(uintptr_t)*)(p),(uintptr_t)x)
+#define mi_atomic_exchange_ptr_acq_rel(tp,p,x)          (tp*)mi_atomic_exchange_acq_rel((_Atomic(uintptr_t)*)(p),(uintptr_t)x)
+
+#define mi_atomic_loadi64_acquire(p)    mi_atomic(loadi64_explicit)(p,mi_memory_order(acquire))
+#define mi_atomic_loadi64_relaxed(p)    mi_atomic(loadi64_explicit)(p,mi_memory_order(relaxed))
+#define mi_atomic_storei64_release(p,x) mi_atomic(storei64_explicit)(p,x,mi_memory_order(release))
+#define mi_atomic_storei64_relaxed(p,x) mi_atomic(storei64_explicit)(p,x,mi_memory_order(relaxed))
+
+
+#endif
+
+
+// Atomically add a signed value; returns the previous value.
+static inline intptr_t mi_atomic_addi(_Atomic(intptr_t)*p, intptr_t add) {
+  return (intptr_t)mi_atomic_add_acq_rel((_Atomic(uintptr_t)*)p, (uintptr_t)add);
+}
+
+// Atomically subtract a signed value; returns the previous value.
+static inline intptr_t mi_atomic_subi(_Atomic(intptr_t)*p, intptr_t sub) {
+  return (intptr_t)mi_atomic_addi(p, -sub);
+}
+
+
+// ----------------------------------------------------------------------
+// Once and Guard
+// ----------------------------------------------------------------------
+
+typedef _Atomic(uintptr_t) mi_atomic_once_t;
+
+// Returns true only on the first invocation
+static inline bool mi_atomic_once( mi_atomic_once_t* once ) {
+  if (mi_atomic_load_relaxed(once) != 0) return false;     // quick test
+  uintptr_t expected = 0;
+  return mi_atomic_cas_strong_acq_rel(once, &expected, (uintptr_t)1); // try to set to 1
+}
+
+typedef _Atomic(uintptr_t) mi_atomic_guard_t;
+
+// Allows only one thread to execute at a time
+#define mi_atomic_guard(guard) \
+  uintptr_t _mi_guard_expected = 0; \
+  for(bool _mi_guard_once = true; \
+      _mi_guard_once && mi_atomic_cas_strong_acq_rel(guard,&_mi_guard_expected,(uintptr_t)1); \
+      (mi_atomic_store_release(guard,(uintptr_t)0), _mi_guard_once = false) )
+
+
+
+// ----------------------------------------------------------------------
+// Yield
+// ----------------------------------------------------------------------
+
+#if defined(__cplusplus)
+#include <thread>
+static inline void mi_atomic_yield(void) {
+  std::this_thread::yield();
+}
+#elif defined(_WIN32)
+static inline void mi_atomic_yield(void) {
+  YieldProcessor();
+}
+#elif defined(__SSE2__)
+#include <emmintrin.h>
+static inline void mi_atomic_yield(void) {
+  _mm_pause();
+}
+#elif (defined(__GNUC__) || defined(__clang__)) && \
+      (defined(__x86_64__) || defined(__i386__) || \
+       defined(__aarch64__) || defined(__arm__) || \
+       defined(__powerpc__) || defined(__ppc__) || defined(__PPC__) || defined(__POWERPC__))
+#if defined(__x86_64__) || defined(__i386__)
+static inline void mi_atomic_yield(void) {
+  __asm__ volatile ("pause" ::: "memory");
+}
+#elif defined(__aarch64__)
+static inline void mi_atomic_yield(void) {
+  __asm__ volatile("wfe");
+}
+#elif defined(__arm__)
+#if __ARM_ARCH >= 7
+static inline void mi_atomic_yield(void) {
+  __asm__ volatile("yield" ::: "memory");
+}
+#else
+static inline void mi_atomic_yield(void) {
+  __asm__ volatile ("nop" ::: "memory");
+}
+#endif
+#elif defined(__powerpc__) || defined(__ppc__) || defined(__PPC__) || defined(__POWERPC__)
+#ifdef __APPLE__
+static inline void mi_atomic_yield(void) {
+  __asm__ volatile ("or r27,r27,r27" ::: "memory");
+}
+#else
+static inline void mi_atomic_yield(void) {
+  __asm__ __volatile__ ("or 27,27,27" ::: "memory");
+}
+#endif
+#endif
+#elif defined(__sun)
+// Solaris: use smt_pause() from <synch.h>
+#include <synch.h>
+static inline void mi_atomic_yield(void) {
+  smt_pause();
+}
+#elif defined(__wasi__)
+#include <sched.h>
+static inline void mi_atomic_yield(void) {
+  sched_yield();
+}
+#else
+#include <unistd.h>
+static inline void mi_atomic_yield(void) {
+  sleep(0);
+}
+#endif
+
+
+// ----------------------------------------------------------------------
+// Locks
+// These do not have to be recursive and should be light-weight
+// in-process only locks. Only used for reserving arenas and to
+// maintain the abandoned list.
+// ----------------------------------------------------------------------
+#if _MSC_VER
+#pragma warning(disable:26110)  // unlock without holding lock
+#endif
+
+#define mi_lock(lock)    for(bool _go = (mi_lock_acquire(lock),true); _go; (mi_lock_release(lock), _go=false) )
+
+#if defined(_WIN32)
+
+#if 1
+#define mi_lock_t  SRWLOCK   // slim reader-writer lock
+
+static inline bool mi_lock_try_acquire(mi_lock_t* lock) {
+  return TryAcquireSRWLockExclusive(lock);
+}
+static inline void mi_lock_acquire(mi_lock_t* lock) {
+  AcquireSRWLockExclusive(lock);
+}
+static inline void mi_lock_release(mi_lock_t* lock) {
+  ReleaseSRWLockExclusive(lock);
+}
+static inline void mi_lock_init(mi_lock_t* lock) {
+  InitializeSRWLock(lock);
+}
+static inline void mi_lock_done(mi_lock_t* lock) {
+  (void)(lock);
+}
+
+#else
+#define mi_lock_t  CRITICAL_SECTION
+
+static inline bool mi_lock_try_acquire(mi_lock_t* lock) {
+  return TryEnterCriticalSection(lock);
+}
+static inline void mi_lock_acquire(mi_lock_t* lock) {
+  EnterCriticalSection(lock);
+}
+static inline void mi_lock_release(mi_lock_t* lock) {
+  LeaveCriticalSection(lock);
+}
+static inline void mi_lock_init(mi_lock_t* lock) {
+  InitializeCriticalSection(lock);
+}
+static inline void mi_lock_done(mi_lock_t* lock) {
+  DeleteCriticalSection(lock);
+}
+
+#endif
+
+#elif defined(MI_USE_PTHREADS)
+
+void _mi_error_message(int err, const char* fmt, ...);
+
+#define mi_lock_t  pthread_mutex_t
+
+static inline bool mi_lock_try_acquire(mi_lock_t* lock) {
+  return (pthread_mutex_trylock(lock) == 0);
+}
+static inline void mi_lock_acquire(mi_lock_t* lock) {
+  const int err = pthread_mutex_lock(lock);
+  if (err != 0) {
+    _mi_error_message(err, "internal error: lock cannot be acquired\n");
+  }
+}
+static inline void mi_lock_release(mi_lock_t* lock) {
+  pthread_mutex_unlock(lock);
+}
+static inline void mi_lock_init(mi_lock_t* lock) {
+  pthread_mutex_init(lock, NULL);
+}
+static inline void mi_lock_done(mi_lock_t* lock) {
+  pthread_mutex_destroy(lock);
+}
+
+#elif defined(__cplusplus)
+
+#include <mutex>
+#define mi_lock_t  std::mutex
+
+static inline bool mi_lock_try_acquire(mi_lock_t* lock) {
+  return lock->try_lock();
+}
+static inline void mi_lock_acquire(mi_lock_t* lock) {
+  lock->lock();
+}
+static inline void mi_lock_release(mi_lock_t* lock) {
+  lock->unlock();
+}
+static inline void mi_lock_init(mi_lock_t* lock) {
+  (void)(lock);
+}
+static inline void mi_lock_done(mi_lock_t* lock) {
+  (void)(lock);
+}
+
+#else
+
+// fall back to poor man's locks.
+// this should only be the case in a single-threaded environment (like __wasi__)
+
+#define mi_lock_t  _Atomic(uintptr_t)
+
+static inline bool mi_lock_try_acquire(mi_lock_t* lock) {
+  uintptr_t expected = 0;
+  return mi_atomic_cas_strong_acq_rel(lock, &expected, (uintptr_t)1);
+}
+static inline void mi_lock_acquire(mi_lock_t* lock) {
+  for (int i = 0; i < 1000; i++) {  // try at most 1000 times; returns without the lock if every try fails
+    if (mi_lock_try_acquire(lock)) return;
+    mi_atomic_yield();
+  }
+}
+static inline void mi_lock_release(mi_lock_t* lock) {
+  mi_atomic_store_release(lock, (uintptr_t)0);
+}
+static inline void mi_lock_init(mi_lock_t* lock) {
+  mi_lock_release(lock);
+}
+static inline void mi_lock_done(mi_lock_t* lock) {
+  (void)(lock);
+}
+
+#endif
+
+
+#endif // __MIMALLOC_ATOMIC_H
diff --git a/compat/mimalloc/mimalloc/internal.h b/compat/mimalloc/mimalloc/internal.h
new file mode 100644
index 00000000000000..6845c9b5df3faf
--- /dev/null
+++ b/compat/mimalloc/mimalloc/internal.h
@@ -0,0 +1,1153 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2023, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+#pragma once
+#ifndef MIMALLOC_INTERNAL_H
+#define MIMALLOC_INTERNAL_H
+
+// --------------------------------------------------------------------------
+// This file contains the internal APIs of mimalloc and various utility
+// functions and macros.
+// --------------------------------------------------------------------------
+
+#include "types.h"
+#include "track.h"
+
+
+// --------------------------------------------------------------------------
+// Compiler defines
+// --------------------------------------------------------------------------
+
+#if (MI_DEBUG>0)
+#define mi_trace_message(...)  _mi_trace_message(__VA_ARGS__)
+#else
+#define mi_trace_message(...)
+#endif
+
+#define mi_decl_cache_align     mi_decl_align(64)
+
+#if defined(_MSC_VER)
+#pragma warning(disable:4127)   // suppress constant conditional warning (due to MI_SECURE paths)
+#pragma warning(disable:26812)  // unscoped enum warning
+#define mi_decl_noinline        __declspec(noinline)
+#define mi_decl_thread          __declspec(thread)
+#define mi_decl_align(a)        __declspec(align(a))
+#define mi_decl_noreturn        __declspec(noreturn)
+#define mi_decl_weak
+#define mi_decl_hidden
+#define mi_decl_cold
+#elif (defined(__GNUC__) && (__GNUC__ >= 3)) || defined(__clang__) // includes clang and icc
+#define mi_decl_noinline        __attribute__((noinline))
+#define mi_decl_thread          __thread
+#define mi_decl_align(a)        __attribute__((aligned(a)))
+#define mi_decl_noreturn        __attribute__((noreturn))
+#define mi_decl_weak            __attribute__((weak))
+#define mi_decl_hidden          __attribute__((visibility("hidden")))
+#if (__GNUC__ >= 4) || defined(__clang__)
+#define mi_decl_cold            __attribute__((cold))
+#else
+#define mi_decl_cold
+#endif
+#elif __cplusplus >= 201103L    // c++11
+#define mi_decl_noinline
+#define mi_decl_thread          thread_local
+#define mi_decl_align(a)        alignas(a)
+#define mi_decl_noreturn        [[noreturn]]
+#define mi_decl_weak
+#define mi_decl_hidden
+#define mi_decl_cold
+#else
+#define mi_decl_noinline
+#define mi_decl_thread          __thread        // hope for the best :-)
+#define mi_decl_align(a)
+#define mi_decl_noreturn
+#define mi_decl_weak
+#define mi_decl_hidden
+#define mi_decl_cold
+#endif
+
+#if defined(__GNUC__) || defined(__clang__)
+#define mi_unlikely(x)     (__builtin_expect(!!(x),false))
+#define mi_likely(x)       (__builtin_expect(!!(x),true))
+#elif (defined(__cplusplus) && (__cplusplus >= 202002L)) || (defined(_MSVC_LANG) && _MSVC_LANG >= 202002L)
+#define mi_unlikely(x)     (x) [[unlikely]]
+#define mi_likely(x)       (x) [[likely]]
+#else
+#define mi_unlikely(x)     (x)
+#define mi_likely(x)       (x)
+#endif
+
+#ifndef __has_builtin
+#define __has_builtin(x)    0
+#endif
+
+#if defined(__cplusplus)
+#define mi_decl_externc     extern "C"
+#else
+#define mi_decl_externc
+#endif
+
+#if defined(__EMSCRIPTEN__) && !defined(__wasi__)
+#define __wasi__
+#endif
+
+
+// --------------------------------------------------------------------------
+// Internal functions
+// --------------------------------------------------------------------------
+
+// "libc.c"
+#include    <stdarg.h>
+int         _mi_vsnprintf(char* buf, size_t bufsize, const char* fmt, va_list args);
+int         _mi_snprintf(char* buf, size_t buflen, const char* fmt, ...);
+char        _mi_toupper(char c);
+int         _mi_strnicmp(const char* s, const char* t, size_t n);
+void        _mi_strlcpy(char* dest, const char* src, size_t dest_size);
+void        _mi_strlcat(char* dest, const char* src, size_t dest_size);
+size_t      _mi_strlen(const char* s);
+size_t      _mi_strnlen(const char* s, size_t max_len);
+bool        _mi_getenv(const char* name, char* result, size_t result_size);
+
+// "options.c"
+void        _mi_fputs(mi_output_fun* out, void* arg, const char* prefix, const char* message);
+void        _mi_fprintf(mi_output_fun* out, void* arg, const char* fmt, ...);
+void        _mi_message(const char* fmt, ...);
+void        _mi_warning_message(const char* fmt, ...);
+void        _mi_verbose_message(const char* fmt, ...);
+void        _mi_trace_message(const char* fmt, ...);
+void        _mi_options_init(void);
+long        _mi_option_get_fast(mi_option_t option);
+void        _mi_error_message(int err, const char* fmt, ...);
+
+// random.c
+void        _mi_random_init(mi_random_ctx_t* ctx);
+void        _mi_random_init_weak(mi_random_ctx_t* ctx);
+void        _mi_random_reinit_if_weak(mi_random_ctx_t * ctx);
+void        _mi_random_split(mi_random_ctx_t* ctx, mi_random_ctx_t* new_ctx);
+uintptr_t   _mi_random_next(mi_random_ctx_t* ctx);
+uintptr_t   _mi_heap_random_next(mi_heap_t* heap);
+uintptr_t   _mi_os_random_weak(uintptr_t extra_seed);
+static inline uintptr_t _mi_random_shuffle(uintptr_t x);
+
+// init.c
+extern mi_decl_hidden mi_decl_cache_align mi_stats_t       _mi_stats_main;
+extern mi_decl_hidden mi_decl_cache_align const mi_page_t  _mi_page_empty;
+void        _mi_auto_process_init(void);
+void mi_cdecl _mi_auto_process_done(void) mi_attr_noexcept;
+bool        _mi_is_redirected(void);
+bool        _mi_allocator_init(const char** message);
+void        _mi_allocator_done(void);
+bool        _mi_is_main_thread(void);
+size_t      _mi_current_thread_count(void);
+bool        _mi_preloading(void);           // true while the C runtime is not initialized yet
+void        _mi_thread_done(mi_heap_t* heap);
+void        _mi_thread_data_collect(void);
+void        _mi_tld_init(mi_tld_t* tld, mi_heap_t* bheap);
+mi_threadid_t _mi_thread_id(void) mi_attr_noexcept;
+mi_heap_t*    _mi_heap_main_get(void);     // statically allocated main backing heap
+mi_subproc_t* _mi_subproc_from_id(mi_subproc_id_t subproc_id);
+void        _mi_heap_guarded_init(mi_heap_t* heap);
+
+// os.c
+void        _mi_os_init(void);                                            // called from process init
+void*       _mi_os_alloc(size_t size, mi_memid_t* memid);
+void*       _mi_os_zalloc(size_t size, mi_memid_t* memid);
+void        _mi_os_free(void* p, size_t size, mi_memid_t memid);
+void        _mi_os_free_ex(void* p, size_t size, bool still_committed, mi_memid_t memid);
+
+size_t      _mi_os_page_size(void);
+size_t      _mi_os_good_alloc_size(size_t size);
+bool        _mi_os_has_overcommit(void);
+bool        _mi_os_has_virtual_reserve(void);
+
+bool        _mi_os_reset(void* addr, size_t size);
+bool        _mi_os_decommit(void* addr, size_t size);
+bool        _mi_os_unprotect(void* addr, size_t size);
+bool        _mi_os_purge(void* p, size_t size);
+bool        _mi_os_purge_ex(void* p, size_t size, bool allow_reset, size_t stat_size);
+void        _mi_os_reuse(void* p, size_t size);
+mi_decl_nodiscard bool _mi_os_commit(void* p, size_t size, bool* is_zero);
+mi_decl_nodiscard bool _mi_os_commit_ex(void* addr, size_t size, bool* is_zero, size_t stat_size);
+bool        _mi_os_protect(void* addr, size_t size);
+
+void*       _mi_os_alloc_aligned(size_t size, size_t alignment, bool commit, bool allow_large, mi_memid_t* memid);
+void*       _mi_os_alloc_aligned_at_offset(size_t size, size_t alignment, size_t align_offset, bool commit, bool allow_large, mi_memid_t* memid);
+
+void*       _mi_os_get_aligned_hint(size_t try_alignment, size_t size);
+bool        _mi_os_canuse_large_page(size_t size, size_t alignment);
+size_t      _mi_os_large_page_size(void);
+void*       _mi_os_alloc_huge_os_pages(size_t pages, int numa_node, mi_msecs_t max_secs, size_t* pages_reserved, size_t* psize, mi_memid_t* memid);
+
+int         _mi_os_numa_node_count(void);
+int         _mi_os_numa_node(void);
+
+// arena.c
+mi_arena_id_t _mi_arena_id_none(void);
+void        _mi_arena_free(void* p, size_t size, size_t still_committed_size, mi_memid_t memid);
+void*       _mi_arena_alloc(size_t size, bool commit, bool allow_large, mi_arena_id_t req_arena_id, mi_memid_t* memid);
+void*       _mi_arena_alloc_aligned(size_t size, size_t alignment, size_t align_offset, bool commit, bool allow_large, mi_arena_id_t req_arena_id, mi_memid_t* memid);
+bool        _mi_arena_memid_is_suitable(mi_memid_t memid, mi_arena_id_t request_arena_id);
+bool        _mi_arena_contains(const void* p);
+void        _mi_arenas_collect(bool force_purge);
+void        _mi_arena_unsafe_destroy_all(void);
+
+bool        _mi_arena_segment_clear_abandoned(mi_segment_t* segment);
+void        _mi_arena_segment_mark_abandoned(mi_segment_t* segment);
+
+void*       _mi_arena_meta_zalloc(size_t size, mi_memid_t* memid);
+void        _mi_arena_meta_free(void* p, mi_memid_t memid, size_t size);
+
+typedef struct mi_arena_field_cursor_s { // abstract struct
+  size_t         os_list_count;           // max entries to visit in the OS abandoned list
+  size_t         start;                   // start arena idx (may need to be wrapped)
+  size_t         end;                     // end arena idx (exclusive, may need to be wrapped)
+  size_t         bitmap_idx;              // current bit idx for an arena
+  mi_subproc_t*  subproc;                 // only visit blocks in this sub-process
+  bool           visit_all;               // ensure all abandoned blocks are seen (blocking)
+  bool           hold_visit_lock;         // if the subproc->abandoned_os_visit_lock is held
+} mi_arena_field_cursor_t;
+void          _mi_arena_field_cursor_init(mi_heap_t* heap, mi_subproc_t* subproc, bool visit_all, mi_arena_field_cursor_t* current);
+mi_segment_t* _mi_arena_segment_clear_abandoned_next(mi_arena_field_cursor_t* previous);
+void          _mi_arena_field_cursor_done(mi_arena_field_cursor_t* current);
+
+// "segment-map.c"
+void        _mi_segment_map_allocated_at(const mi_segment_t* segment);
+void        _mi_segment_map_freed_at(const mi_segment_t* segment);
+void        _mi_segment_map_unsafe_destroy(void);
+
+// "segment.c"
+mi_page_t* _mi_segment_page_alloc(mi_heap_t* heap, size_t block_size, size_t page_alignment, mi_segments_tld_t* tld);
+void       _mi_segment_page_free(mi_page_t* page, bool force, mi_segments_tld_t* tld);
+void       _mi_segment_page_abandon(mi_page_t* page, mi_segments_tld_t* tld);
+bool       _mi_segment_try_reclaim_abandoned( mi_heap_t* heap, bool try_all, mi_segments_tld_t* tld);
+void       _mi_segment_collect(mi_segment_t* segment, bool force);
+
+#if MI_HUGE_PAGE_ABANDON
+void        _mi_segment_huge_page_free(mi_segment_t* segment, mi_page_t* page, mi_block_t* block);
+#else
+void        _mi_segment_huge_page_reset(mi_segment_t* segment, mi_page_t* page, mi_block_t* block);
+#endif
+
+uint8_t*   _mi_segment_page_start(const mi_segment_t* segment, const mi_page_t* page, size_t* page_size); // page start for any page
+void       _mi_abandoned_reclaim_all(mi_heap_t* heap, mi_segments_tld_t* tld);
+void       _mi_abandoned_collect(mi_heap_t* heap, bool force, mi_segments_tld_t* tld);
+bool       _mi_segment_attempt_reclaim(mi_heap_t* heap, mi_segment_t* segment);
+bool       _mi_segment_visit_blocks(mi_segment_t* segment, int heap_tag, bool visit_blocks, mi_block_visit_fun* visitor, void* arg);
+
+// "page.c"
+void*       _mi_malloc_generic(mi_heap_t* heap, size_t size, bool zero, size_t huge_alignment, size_t* usable)  mi_attr_noexcept mi_attr_malloc;
+
+void        _mi_page_retire(mi_page_t* page) mi_attr_noexcept;                  // free the page if there are no other pages with many free blocks
+void        _mi_page_unfull(mi_page_t* page);
+void        _mi_page_free(mi_page_t* page, mi_page_queue_t* pq, bool force);   // free the page
+void        _mi_page_abandon(mi_page_t* page, mi_page_queue_t* pq);            // abandon the page, to be picked up by another thread...
+void        _mi_page_force_abandon(mi_page_t* page);
+
+void        _mi_heap_delayed_free_all(mi_heap_t* heap);
+bool        _mi_heap_delayed_free_partial(mi_heap_t* heap);
+void        _mi_heap_collect_retired(mi_heap_t* heap, bool force);
+
+void        _mi_page_use_delayed_free(mi_page_t* page, mi_delayed_t delay, bool override_never);
+bool        _mi_page_try_use_delayed_free(mi_page_t* page, mi_delayed_t delay, bool override_never);
+size_t      _mi_page_queue_append(mi_heap_t* heap, mi_page_queue_t* pq, mi_page_queue_t* append);
+void        _mi_deferred_free(mi_heap_t* heap, bool force);
+
+void        _mi_page_free_collect(mi_page_t* page,bool force);
+void        _mi_page_reclaim(mi_heap_t* heap, mi_page_t* page);   // callback from segments
+
+size_t      _mi_page_stats_bin(const mi_page_t* page); // for stats
+size_t      _mi_bin_size(size_t bin);                  // for stats
+size_t      _mi_bin(size_t size);                      // for stats
+
+// "heap.c"
+void        _mi_heap_init(mi_heap_t* heap, mi_tld_t* tld, mi_arena_id_t arena_id, bool noreclaim, uint8_t tag);
+void        _mi_heap_destroy_pages(mi_heap_t* heap);
+void        _mi_heap_collect_abandon(mi_heap_t* heap);
+void        _mi_heap_set_default_direct(mi_heap_t* heap);
+bool        _mi_heap_memid_is_suitable(mi_heap_t* heap, mi_memid_t memid);
+void        _mi_heap_unsafe_destroy_all(mi_heap_t* heap);
+mi_heap_t*  _mi_heap_by_tag(mi_heap_t* heap, uint8_t tag);
+void        _mi_heap_area_init(mi_heap_area_t* area, mi_page_t* page);
+bool        _mi_heap_area_visit_blocks(const mi_heap_area_t* area, mi_page_t* page, mi_block_visit_fun* visitor, void* arg);
+
+// "stats.c"
+void        _mi_stats_done(mi_stats_t* stats);
+void        _mi_stats_merge_thread(mi_tld_t* tld);
+mi_msecs_t  _mi_clock_now(void);
+mi_msecs_t  _mi_clock_end(mi_msecs_t start);
+mi_msecs_t  _mi_clock_start(void);
+
+// "alloc.c"
+void*       _mi_page_malloc_zero(mi_heap_t* heap, mi_page_t* page, size_t size, bool zero, size_t* usable) mi_attr_noexcept;  // called from `_mi_malloc_generic`
+void*       _mi_page_malloc(mi_heap_t* heap, mi_page_t* page, size_t size) mi_attr_noexcept;                  // called from `_mi_heap_malloc_aligned`
+void*       _mi_page_malloc_zeroed(mi_heap_t* heap, mi_page_t* page, size_t size) mi_attr_noexcept;           // called from `_mi_heap_malloc_aligned`
+void*       _mi_heap_malloc_zero(mi_heap_t* heap, size_t size, bool zero) mi_attr_noexcept;
+void*       _mi_heap_malloc_zero_ex(mi_heap_t* heap, size_t size, bool zero, size_t huge_alignment, size_t* usable) mi_attr_noexcept;     // called from `_mi_heap_malloc_aligned`
+void*       _mi_heap_realloc_zero(mi_heap_t* heap, void* p, size_t newsize, bool zero, size_t* usable_pre, size_t* usable_post) mi_attr_noexcept;
+mi_block_t* _mi_page_ptr_unalign(const mi_page_t* page, const void* p);
+bool        _mi_free_delayed_block(mi_block_t* block);
+void        _mi_free_generic(mi_segment_t* segment, mi_page_t* page, bool is_local, void* p) mi_attr_noexcept;  // for runtime integration
+void        _mi_padding_shrink(const mi_page_t* page, const mi_block_t* block, const size_t min_size);
+
+#if MI_DEBUG>1
+bool        _mi_page_is_valid(mi_page_t* page);
+#endif
+
+
+/* -----------------------------------------------------------
+  Error codes passed to `_mi_fatal_error`
+  All are recoverable but EFAULT is a serious error and aborts by default in secure mode.
+  For portability define undefined error codes using common Unix codes:
+  <https://www-numi.fnal.gov/offline_software/srt_public_context/WebDocs/Errors/unix_system_errors.html>
+----------------------------------------------------------- */
+#include <errno.h>
+#ifndef EAGAIN         // double free
+#define EAGAIN (11)
+#endif
+#ifndef ENOMEM         // out of memory
+#define ENOMEM (12)
+#endif
+#ifndef EFAULT         // corrupted free-list or meta-data
+#define EFAULT (14)
+#endif
+#ifndef EINVAL         // trying to free an invalid pointer
+#define EINVAL (22)
+#endif
+#ifndef EOVERFLOW      // count*size overflow
+#define EOVERFLOW (75)
+#endif
+
+
+// ------------------------------------------------------
+// Assertions
+// ------------------------------------------------------
+
+#if (MI_DEBUG)
+// use our own assertion to print without memory allocation
+mi_decl_noreturn mi_decl_cold void _mi_assert_fail(const char* assertion, const char* fname, unsigned int line, const char* func) mi_attr_noexcept;
+#define mi_assert(expr)     ((expr) ? (void)0 : _mi_assert_fail(#expr,__FILE__,__LINE__,__func__))
+#else
+#define mi_assert(x)
+#endif
+
+#if (MI_DEBUG>1)
+#define mi_assert_internal    mi_assert
+#else
+#define mi_assert_internal(x)
+#endif
+
+#if (MI_DEBUG>2)
+#define mi_assert_expensive   mi_assert
+#else
+#define mi_assert_expensive(x)
+#endif
+
+
+
+/* -----------------------------------------------------------
+  Inlined definitions
+----------------------------------------------------------- */
+#define MI_UNUSED(x)     (void)(x)
+#if (MI_DEBUG>0)
+#define MI_UNUSED_RELEASE(x)
+#else
+#define MI_UNUSED_RELEASE(x)  MI_UNUSED(x)
+#endif
+
+#define MI_INIT4(x)   x(),x(),x(),x()
+#define MI_INIT8(x)   MI_INIT4(x),MI_INIT4(x)
+#define MI_INIT16(x)  MI_INIT8(x),MI_INIT8(x)
+#define MI_INIT32(x)  MI_INIT16(x),MI_INIT16(x)
+#define MI_INIT64(x)  MI_INIT32(x),MI_INIT32(x)
+#define MI_INIT128(x) MI_INIT64(x),MI_INIT64(x)
+#define MI_INIT256(x) MI_INIT128(x),MI_INIT128(x)
+#define MI_INIT74(x)  MI_INIT64(x),MI_INIT8(x),x(),x()
+
+#include <string.h>
+// initialize a local variable to zero; use memset as compilers optimize constant sized memset's
+#define _mi_memzero_var(x)  memset(&x,0,sizeof(x))
+
+// Is `x` a power of two? (0 is considered a power of two)
+static inline bool _mi_is_power_of_two(uintptr_t x) {
+  return ((x & (x - 1)) == 0);
+}
+
+// Is a pointer aligned?
+static inline bool _mi_is_aligned(void* p, size_t alignment) {
+  mi_assert_internal(alignment != 0);
+  return (((uintptr_t)p % alignment) == 0);
+}
+
+// Align upwards
+static inline uintptr_t _mi_align_up(uintptr_t sz, size_t alignment) {
+  mi_assert_internal(alignment != 0);
+  uintptr_t mask = alignment - 1;
+  if ((alignment & mask) == 0) {  // power of two?
+    return ((sz + mask) & ~mask);
+  }
+  else {
+    return (((sz + mask)/alignment)*alignment);
+  }
+}
+
+// Align downwards
+static inline uintptr_t _mi_align_down(uintptr_t sz, size_t alignment) {
+  mi_assert_internal(alignment != 0);
+  uintptr_t mask = alignment - 1;
+  if ((alignment & mask) == 0) { // power of two?
+    return (sz & ~mask);
+  }
+  else {
+    return ((sz / alignment) * alignment);
+  }
+}
+
+// Align a pointer upwards
+static inline void* mi_align_up_ptr(void* p, size_t alignment) {
+  return (void*)_mi_align_up((uintptr_t)p, alignment);
+}
+
+// Align a pointer downwards
+static inline void* mi_align_down_ptr(void* p, size_t alignment) {
+  return (void*)_mi_align_down((uintptr_t)p, alignment);
+}
+
+
+// Divide upwards: `s <= _mi_divide_up(s,d)*d < s+d`.
+static inline uintptr_t _mi_divide_up(uintptr_t size, size_t divider) {
+  mi_assert_internal(divider != 0);
+  return (divider == 0 ? size : ((size + divider - 1) / divider));
+}
+
+
+// clamp an integer
+static inline size_t _mi_clamp(size_t sz, size_t min, size_t max) {
+  if (sz < min) return min;
+  else if (sz > max) return max;
+  else return sz;
+}
+
+// Is memory zero initialized?
+static inline bool mi_mem_is_zero(const void* p, size_t size) {
+  for (size_t i = 0; i < size; i++) {
+    if (((uint8_t*)p)[i] != 0) return false;
+  }
+  return true;
+}
+
+
+// Align a byte size to a size in _machine words_,
+// i.e. byte size == `wsize*sizeof(void*)`.
+static inline size_t _mi_wsize_from_size(size_t size) {
+  mi_assert_internal(size <= SIZE_MAX - sizeof(uintptr_t));
+  return (size + sizeof(uintptr_t) - 1) / sizeof(uintptr_t);
+}
+
+// Overflow detecting multiply
+#if __has_builtin(__builtin_umul_overflow) || (defined(__GNUC__) && (__GNUC__ >= 5))
+#include <limits.h>      // UINT_MAX, ULONG_MAX
+#if defined(_CLOCK_T)    // for Illumos
+#undef _CLOCK_T
+#endif
+static inline bool mi_mul_overflow(size_t count, size_t size, size_t* total) {
+  #if (SIZE_MAX == ULONG_MAX)
+    return __builtin_umull_overflow(count, size, (unsigned long *)total);
+  #elif (SIZE_MAX == UINT_MAX)
+    return __builtin_umul_overflow(count, size, (unsigned int *)total);
+  #else
+    return __builtin_umulll_overflow(count, size, (unsigned long long *)total);
+  #endif
+}
+#else /* __builtin_umul_overflow is unavailable */
+static inline bool mi_mul_overflow(size_t count, size_t size, size_t* total) {
+  #define MI_MUL_COULD_OVERFLOW ((size_t)1 << (4*sizeof(size_t)))  // sqrt(SIZE_MAX)
+  *total = count * size;
+  // note: gcc/clang optimize this to directly check the overflow flag
+  return ((size >= MI_MUL_COULD_OVERFLOW || count >= MI_MUL_COULD_OVERFLOW) && size > 0 && (SIZE_MAX / size) < count);
+}
+#endif
+
+// Safe multiply `count*size` into `total`; return `true` on overflow.
+static inline bool mi_count_size_overflow(size_t count, size_t size, size_t* total) {
+  if (count==1) {  // quick check for the case where count is one (common for C++ allocators)
+    *total = size;
+    return false;
+  }
+  else if mi_unlikely(mi_mul_overflow(count, size, total)) {
+    #if MI_DEBUG > 0
+    _mi_error_message(EOVERFLOW, "allocation request is too large (%zu * %zu bytes)\n", count, size);
+    #endif
+    *total = SIZE_MAX;
+    return true;
+  }
+  else return false;
+}
+
+
+/*----------------------------------------------------------------------------------------
+  Heap functions
+------------------------------------------------------------------------------------------- */
+
+extern mi_decl_hidden const mi_heap_t _mi_heap_empty;  // read-only empty heap, initial value of the thread local default heap
+
+static inline bool mi_heap_is_backing(const mi_heap_t* heap) {
+  return (heap->tld->heap_backing == heap);
+}
+
+static inline bool mi_heap_is_initialized(mi_heap_t* heap) {
+  mi_assert_internal(heap != NULL);
+  return (heap != NULL && heap != &_mi_heap_empty);
+}
+
+static inline uintptr_t _mi_ptr_cookie(const void* p) {
+  extern mi_decl_hidden mi_heap_t _mi_heap_main;
+  mi_assert_internal(_mi_heap_main.cookie != 0);
+  return ((uintptr_t)p ^ _mi_heap_main.cookie);
+}
+
+/* -----------------------------------------------------------
+  Pages
+----------------------------------------------------------- */
+
+static inline mi_page_t* _mi_heap_get_free_small_page(mi_heap_t* heap, size_t size) {
+  mi_assert_internal(size <= (MI_SMALL_SIZE_MAX + MI_PADDING_SIZE));
+  const size_t idx = _mi_wsize_from_size(size);
+  mi_assert_internal(idx < MI_PAGES_DIRECT);
+  return heap->pages_free_direct[idx];
+}
+
+// Segment that contains the pointer
+// Large aligned blocks may be aligned at N*MI_SEGMENT_SIZE (inside a huge segment > MI_SEGMENT_SIZE),
+// and we need to align "down" to the segment info which is `MI_SEGMENT_SIZE` bytes before it;
+// therefore we align one byte before `p`.
+// We check for NULL afterwards on 64-bit systems to improve codegen for `mi_free`.
+static inline mi_segment_t* _mi_ptr_segment(const void* p) {
+  mi_segment_t* const segment = (mi_segment_t*)(((uintptr_t)p - 1) & ~MI_SEGMENT_MASK);
+  #if MI_INTPTR_SIZE <= 4
+  return (p==NULL ? NULL : segment);
+  #else
+  return ((intptr_t)segment <= 0 ? NULL : segment);
+  #endif
+}
+
+static inline mi_page_t* mi_slice_to_page(mi_slice_t* s) {
+  mi_assert_internal(s->slice_offset== 0 && s->slice_count > 0);
+  return (mi_page_t*)(s);
+}
+
+static inline mi_slice_t* mi_page_to_slice(mi_page_t* p) {
+  mi_assert_internal(p->slice_offset== 0 && p->slice_count > 0);
+  return (mi_slice_t*)(p);
+}
+
+// Segment belonging to a page
+static inline mi_segment_t* _mi_page_segment(const mi_page_t* page) {
+  mi_assert_internal(page!=NULL);
+  mi_segment_t* segment = _mi_ptr_segment(page);
+  mi_assert_internal(segment == NULL || ((mi_slice_t*)page >= segment->slices && (mi_slice_t*)page < segment->slices + segment->slice_entries));
+  return segment;
+}
+
+static inline mi_slice_t* mi_slice_first(const mi_slice_t* slice) {
+  mi_slice_t* start = (mi_slice_t*)((uint8_t*)slice - slice->slice_offset);
+  mi_assert_internal(start >= _mi_ptr_segment(slice)->slices);
+  mi_assert_internal(start->slice_offset == 0);
+  mi_assert_internal(start + start->slice_count > slice);
+  return start;
+}
+
+// Get the page containing the pointer (performance critical as it is called in mi_free)
+static inline mi_page_t* _mi_segment_page_of(const mi_segment_t* segment, const void* p) {
+  mi_assert_internal(p > (void*)segment);
+  ptrdiff_t diff = (uint8_t*)p - (uint8_t*)segment;
+  mi_assert_internal(diff > 0 && diff <= (ptrdiff_t)MI_SEGMENT_SIZE);
+  size_t idx = (size_t)diff >> MI_SEGMENT_SLICE_SHIFT;
+  mi_assert_internal(idx <= segment->slice_entries);
+  mi_slice_t* slice0 = (mi_slice_t*)&segment->slices[idx];
+  mi_slice_t* slice = mi_slice_first(slice0);  // adjust to the block that holds the page data
+  mi_assert_internal(slice->slice_offset == 0);
+  mi_assert_internal(slice >= segment->slices && slice < segment->slices + segment->slice_entries);
+  return mi_slice_to_page(slice);
+}
+
+// Quick page start for initialized pages
+static inline uint8_t* mi_page_start(const mi_page_t* page) {
+  mi_assert_internal(page->page_start != NULL);
+  mi_assert_expensive(_mi_segment_page_start(_mi_page_segment(page),page,NULL) == page->page_start);
+  return page->page_start;
+}
+
+// Get the page containing the pointer
+static inline mi_page_t* _mi_ptr_page(void* p) {
+  mi_assert_internal(p!=NULL);
+  return _mi_segment_page_of(_mi_ptr_segment(p), p);
+}
+
+// Get the block size of a page (special case for huge objects)
+static inline size_t mi_page_block_size(const mi_page_t* page) {
+  mi_assert_internal(page->block_size > 0);
+  return page->block_size;
+}
+
+static inline bool mi_page_is_huge(const mi_page_t* page) {
+  mi_assert_internal((page->is_huge && _mi_page_segment(page)->kind == MI_SEGMENT_HUGE) ||
+                     (!page->is_huge && _mi_page_segment(page)->kind != MI_SEGMENT_HUGE));
+  return page->is_huge;
+}
+
+// Get the usable block size of a page without fixed padding.
+// This may still include internal padding due to alignment and rounding up size classes.
+static inline size_t mi_page_usable_block_size(const mi_page_t* page) {
+  return mi_page_block_size(page) - MI_PADDING_SIZE;
+}
+
+// size of a segment
+static inline size_t mi_segment_size(mi_segment_t* segment) {
+  return segment->segment_slices * MI_SEGMENT_SLICE_SIZE;
+}
+
+static inline uint8_t* mi_segment_end(mi_segment_t* segment) {
+  return (uint8_t*)segment + mi_segment_size(segment);
+}
+
+// Thread free access
+static inline mi_block_t* mi_page_thread_free(const mi_page_t* page) {
+  return (mi_block_t*)(mi_atomic_load_relaxed(&((mi_page_t*)page)->xthread_free) & ~3);
+}
+
+static inline mi_delayed_t mi_page_thread_free_flag(const mi_page_t* page) {
+  return (mi_delayed_t)(mi_atomic_load_relaxed(&((mi_page_t*)page)->xthread_free) & 3);
+}
+
+// Heap access
+static inline mi_heap_t* mi_page_heap(const mi_page_t* page) {
+  return (mi_heap_t*)(mi_atomic_load_relaxed(&((mi_page_t*)page)->xheap));
+}
+
+static inline void mi_page_set_heap(mi_page_t* page, mi_heap_t* heap) {
+  mi_assert_internal(mi_page_thread_free_flag(page) != MI_DELAYED_FREEING);
+  mi_atomic_store_release(&page->xheap,(uintptr_t)heap);
+  if (heap != NULL) { page->heap_tag = heap->tag; }
+}
+
+// Thread free flag helpers
+static inline mi_block_t* mi_tf_block(mi_thread_free_t tf) {
+  return (mi_block_t*)(tf & ~0x03);
+}
+static inline mi_delayed_t mi_tf_delayed(mi_thread_free_t tf) {
+  return (mi_delayed_t)(tf & 0x03);
+}
+static inline mi_thread_free_t mi_tf_make(mi_block_t* block, mi_delayed_t delayed) {
+  return (mi_thread_free_t)((uintptr_t)block | (uintptr_t)delayed);
+}
+static inline mi_thread_free_t mi_tf_set_delayed(mi_thread_free_t tf, mi_delayed_t delayed) {
+  return mi_tf_make(mi_tf_block(tf),delayed);
+}
+static inline mi_thread_free_t mi_tf_set_block(mi_thread_free_t tf, mi_block_t* block) {
+  return mi_tf_make(block, mi_tf_delayed(tf));
+}
+
+// are all blocks in a page freed?
+// note: needs an up-to-date used count (as the `xthread_free` list may not be empty); see `_mi_page_free_collect`.
+static inline bool mi_page_all_free(const mi_page_t* page) {
+  mi_assert_internal(page != NULL);
+  return (page->used == 0);
+}
+
+// are there any available blocks?
+static inline bool mi_page_has_any_available(const mi_page_t* page) {
+  mi_assert_internal(page != NULL && page->reserved > 0);
+  return (page->used < page->reserved || (mi_page_thread_free(page) != NULL));
+}
+
+// are there immediately available blocks, i.e. blocks available on the free list.
+static inline bool mi_page_immediate_available(const mi_page_t* page) {
+  mi_assert_internal(page != NULL);
+  return (page->free != NULL);
+}
+
+// is at least 7/8th of a page in use?
+static inline bool mi_page_is_mostly_used(const mi_page_t* page) {
+  if (page==NULL) return true;
+  uint16_t frac = page->reserved / 8U;
+  return (page->reserved - page->used <= frac);
+}
+
+static inline mi_page_queue_t* mi_page_queue(const mi_heap_t* heap, size_t size) {
+  return &((mi_heap_t*)heap)->pages[_mi_bin(size)];
+}
+
+
+
+//-----------------------------------------------------------
+// Page flags
+//-----------------------------------------------------------
+static inline bool mi_page_is_in_full(const mi_page_t* page) {
+  return page->flags.x.in_full;
+}
+
+static inline void mi_page_set_in_full(mi_page_t* page, bool in_full) {
+  page->flags.x.in_full = in_full;
+}
+
+static inline bool mi_page_has_aligned(const mi_page_t* page) {
+  return page->flags.x.has_aligned;
+}
+
+static inline void mi_page_set_has_aligned(mi_page_t* page, bool has_aligned) {
+  page->flags.x.has_aligned = has_aligned;
+}
+
+/* -------------------------------------------------------------------
+  Guarded objects
+------------------------------------------------------------------- */
+#if MI_GUARDED
+static inline bool mi_block_ptr_is_guarded(const mi_block_t* block, const void* p) {
+  const ptrdiff_t offset = (uint8_t*)p - (uint8_t*)block;
+  return (offset >= (ptrdiff_t)(sizeof(mi_block_t)) && block->next == MI_BLOCK_TAG_GUARDED);
+}
+
+static inline bool mi_heap_malloc_use_guarded(mi_heap_t* heap, size_t size) {
+  // this code is written to result in fast assembly as it is on the hot path for allocation
+  const size_t count = heap->guarded_sample_count - 1;  // if the rate was 0, this will underflow and count for a long time..
+  if mi_likely(count != 0) {
+    // no sample
+    heap->guarded_sample_count = count;
+    return false;
+  }
+  else if (size >= heap->guarded_size_min && size <= heap->guarded_size_max) {
+    // use guarded allocation
+    heap->guarded_sample_count = heap->guarded_sample_rate;  // reset
+    return (heap->guarded_sample_rate != 0);
+  }
+  else {
+    // failed size criteria, rewind count (but don't write to an empty heap)
+    if (heap->guarded_sample_rate != 0) { heap->guarded_sample_count = 1; }
+    return false;
+  }
+}
+
+mi_decl_restrict void* _mi_heap_malloc_guarded(mi_heap_t* heap, size_t size, bool zero) mi_attr_noexcept;
+
+#endif
+
+
+/* -------------------------------------------------------------------
+Encoding/Decoding the free list next pointers
+
+This is to protect against buffer overflow exploits where the
+free list is mutated. Many hardened allocators xor the next pointer `p`
+with a secret key `k1`, as `p^k1`. This prevents overwriting with known
+values but might be still too weak: if the attacker can guess
+the pointer `p` this can reveal `k1` (since `p^k1^p == k1`).
+Moreover, if multiple blocks can be read as well, the attacker can
+xor both as `(p1^k1) ^ (p2^k1) == p1^p2` which may reveal a lot
+about the pointers (and subsequently `k1`).
+
+Instead mimalloc uses an extra key `k2` and encodes as `((p^k2)<<<k1)+k1`.
+Since these operations are not associative, the above approaches do not
+work so well any more even if the `p` can be guesstimated. For example,
+for the read case we can subtract two entries to discard the `+k1` term,
+but that leads to `((p1^k2)<<<k1) - ((p2^k2)<<<k1)` at best.
+We include the left-rotation since xor and addition are otherwise linear
+in the lowest bit. Finally, both keys are unique per page which reduces
+the re-use of keys by a large factor.
+
+We also pass a separate `null` value to be used as `NULL` or otherwise
+`(k2<<<k1)+k1` would appear (too) often as a sentinel value.
+------------------------------------------------------------------- */
+
+static inline bool mi_is_in_same_segment(const void* p, const void* q) {
+  return (_mi_ptr_segment(p) == _mi_ptr_segment(q));
+}
+
+static inline bool mi_is_in_same_page(const void* p, const void* q) {
+  mi_segment_t* segment = _mi_ptr_segment(p);
+  if (_mi_ptr_segment(q) != segment) return false;
+  // assume q may be invalid // return (_mi_segment_page_of(segment, p) == _mi_segment_page_of(segment, q));
+  mi_page_t* page = _mi_segment_page_of(segment, p);
+  size_t psize;
+  uint8_t* start = _mi_segment_page_start(segment, page, &psize);
+  return (start <= (uint8_t*)q && (uint8_t*)q < start + psize);
+}
+
+static inline uintptr_t mi_rotl(uintptr_t x, uintptr_t shift) {
+  shift %= MI_INTPTR_BITS;
+  return (shift==0 ? x : ((x << shift) | (x >> (MI_INTPTR_BITS - shift))));
+}
+static inline uintptr_t mi_rotr(uintptr_t x, uintptr_t shift) {
+  shift %= MI_INTPTR_BITS;
+  return (shift==0 ? x : ((x >> shift) | (x << (MI_INTPTR_BITS - shift))));
+}
+
+static inline void* mi_ptr_decode(const void* null, const mi_encoded_t x, const uintptr_t* keys) {
+  void* p = (void*)(mi_rotr(x - keys[0], keys[0]) ^ keys[1]);
+  return (p==null ? NULL : p);
+}
+
+static inline mi_encoded_t mi_ptr_encode(const void* null, const void* p, const uintptr_t* keys) {
+  uintptr_t x = (uintptr_t)(p==NULL ? null : p);
+  return mi_rotl(x ^ keys[1], keys[0]) + keys[0];
+}
+
+static inline uint32_t mi_ptr_encode_canary(const void* null, const void* p, const uintptr_t* keys) {
+  const uint32_t x = (uint32_t)(mi_ptr_encode(null,p,keys));
+  // make the lowest byte 0 to prevent spurious read overflows which could be a security issue (issue #951)
+  #ifdef MI_BIG_ENDIAN
+  return (x & 0x00FFFFFF);
+  #else
+  return (x & 0xFFFFFF00);
+  #endif
+}
+
+static inline mi_block_t* mi_block_nextx( const void* null, const mi_block_t* block, const uintptr_t* keys ) {
+  mi_track_mem_defined(block,sizeof(mi_block_t));
+  mi_block_t* next;
+  #ifdef MI_ENCODE_FREELIST
+  next = (mi_block_t*)mi_ptr_decode(null, block->next, keys);
+  #else
+  MI_UNUSED(keys); MI_UNUSED(null);
+  next = (mi_block_t*)block->next;
+  #endif
+  mi_track_mem_noaccess(block,sizeof(mi_block_t));
+  return next;
+}
+
+static inline void mi_block_set_nextx(const void* null, mi_block_t* block, const mi_block_t* next, const uintptr_t* keys) {
+  mi_track_mem_undefined(block,sizeof(mi_block_t));
+  #ifdef MI_ENCODE_FREELIST
+  block->next = mi_ptr_encode(null, next, keys);
+  #else
+  MI_UNUSED(keys); MI_UNUSED(null);
+  block->next = (mi_encoded_t)next;
+  #endif
+  mi_track_mem_noaccess(block,sizeof(mi_block_t));
+}
+
+static inline mi_block_t* mi_block_next(const mi_page_t* page, const mi_block_t* block) {
+  #ifdef MI_ENCODE_FREELIST
+  mi_block_t* next = mi_block_nextx(page,block,page->keys);
+  // check for free list corruption: is `next` at least in the same page?
+  // TODO: check if `next` is `page->block_size` aligned?
+  if mi_unlikely(next!=NULL && !mi_is_in_same_page(block, next)) {
+    _mi_error_message(EFAULT, "corrupted free list entry of size %zub at %p: value 0x%zx\n", mi_page_block_size(page), block, (uintptr_t)next);
+    next = NULL;
+  }
+  return next;
+  #else
+  MI_UNUSED(page);
+  return mi_block_nextx(page,block,NULL);
+  #endif
+}
+
+static inline void mi_block_set_next(const mi_page_t* page, mi_block_t* block, const mi_block_t* next) {
+  #ifdef MI_ENCODE_FREELIST
+  mi_block_set_nextx(page,block,next, page->keys);
+  #else
+  MI_UNUSED(page);
+  mi_block_set_nextx(page,block,next,NULL);
+  #endif
+}
+
+
+// -------------------------------------------------------------------
+// commit mask
+// -------------------------------------------------------------------
+
+static inline void mi_commit_mask_create_empty(mi_commit_mask_t* cm) {
+  for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) {
+    cm->mask[i] = 0;
+  }
+}
+
+static inline void mi_commit_mask_create_full(mi_commit_mask_t* cm) {
+  for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) {
+    cm->mask[i] = ~((size_t)0);
+  }
+}
+
+static inline bool mi_commit_mask_is_empty(const mi_commit_mask_t* cm) {
+  for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) {
+    if (cm->mask[i] != 0) return false;
+  }
+  return true;
+}
+
+static inline bool mi_commit_mask_is_full(const mi_commit_mask_t* cm) {
+  for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) {
+    if (cm->mask[i] != ~((size_t)0)) return false;
+  }
+  return true;
+}
+
+// defined in `segment.c`:
+size_t _mi_commit_mask_committed_size(const mi_commit_mask_t* cm, size_t total);
+size_t _mi_commit_mask_next_run(const mi_commit_mask_t* cm, size_t* idx);
+
+#define mi_commit_mask_foreach(cm,idx,count) \
+  idx = 0; \
+  while ((count = _mi_commit_mask_next_run(cm,&idx)) > 0) {
+
+#define mi_commit_mask_foreach_end() \
+    idx += count; \
+  }
+
+
+
+/* -----------------------------------------------------------
+  memory id's
+----------------------------------------------------------- */
+
+static inline mi_memid_t _mi_memid_create(mi_memkind_t memkind) {
+  mi_memid_t memid;
+  _mi_memzero_var(memid);
+  memid.memkind = memkind;
+  return memid;
+}
+
+static inline mi_memid_t _mi_memid_none(void) {
+  return _mi_memid_create(MI_MEM_NONE);
+}
+
+static inline mi_memid_t _mi_memid_create_os(void* base, size_t size, bool committed, bool is_zero, bool is_large) {
+  mi_memid_t memid = _mi_memid_create(MI_MEM_OS);
+  memid.mem.os.base = base;
+  memid.mem.os.size = size;
+  memid.initially_committed = committed;
+  memid.initially_zero = is_zero;
+  memid.is_pinned = is_large;
+  return memid;
+}
+
+
+// -------------------------------------------------------------------
+// Fast "random" shuffle
+// -------------------------------------------------------------------
+
+static inline uintptr_t _mi_random_shuffle(uintptr_t x) {
+  if (x==0) { x = 17; }   // ensure we don't get stuck in generating zeros
+#if (MI_INTPTR_SIZE>=8)
+  // by Sebastiano Vigna, see: <http://xoshiro.di.unimi.it/splitmix64.c>
+  x ^= x >> 30;
+  x *= 0xbf58476d1ce4e5b9UL;
+  x ^= x >> 27;
+  x *= 0x94d049bb133111ebUL;
+  x ^= x >> 31;
+#elif (MI_INTPTR_SIZE==4)
+  // by Chris Wellons, see: <https://nullprogram.com/blog/2018/07/31/>
+  x ^= x >> 16;
+  x *= 0x7feb352dUL;
+  x ^= x >> 15;
+  x *= 0x846ca68bUL;
+  x ^= x >> 16;
+#endif
+  return x;
+}
+
+
+
+// -----------------------------------------------------------------------
+// Count bits: trailing or leading zeros (with MI_SIZE_BITS on all zero)
+// -----------------------------------------------------------------------
+
+#if defined(__GNUC__)
+
+#include <limits.h>       // LONG_MAX
+#define MI_HAVE_FAST_BITSCAN
+static inline size_t mi_clz(size_t x) {
+  if (x==0) return MI_SIZE_BITS;
+  #if (SIZE_MAX == ULONG_MAX)
+    return __builtin_clzl(x);
+  #else
+    return __builtin_clzll(x);
+  #endif
+}
+static inline size_t mi_ctz(size_t x) {
+  if (x==0) return MI_SIZE_BITS;
+  #if (SIZE_MAX == ULONG_MAX)
+    return __builtin_ctzl(x);
+  #else
+    return __builtin_ctzll(x);
+  #endif
+}
+
+#elif defined(_MSC_VER)
+
+#include <limits.h>       // LONG_MAX
+#include <intrin.h>       // BitScanReverse64
+#define MI_HAVE_FAST_BITSCAN
+static inline size_t mi_clz(size_t x) {
+  if (x==0) return MI_SIZE_BITS;
+  unsigned long idx;
+  #if (SIZE_MAX == ULONG_MAX)
+    _BitScanReverse(&idx, x);
+  #else
+    _BitScanReverse64(&idx, x);
+  #endif
+  return ((MI_SIZE_BITS - 1) - (size_t)idx);
+}
+static inline size_t mi_ctz(size_t x) {
+  if (x==0) return MI_SIZE_BITS;
+  unsigned long idx;
+  #if (SIZE_MAX == ULONG_MAX)
+    _BitScanForward(&idx, x);
+  #else
+    _BitScanForward64(&idx, x);
+  #endif
+  return (size_t)idx;
+}
+
+#else
+
+static inline size_t mi_ctz_generic32(uint32_t x) {
+  // de Bruijn multiplication, see <http://supertech.csail.mit.edu/papers/debruijn.pdf>
+  static const uint8_t debruijn[32] = {
+    0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
+    31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
+  };
+  if (x==0) return 32;
+  return debruijn[(uint32_t)((x & -(int32_t)x) * (uint32_t)(0x077CB531U)) >> 27];
+}
+
+static inline size_t mi_clz_generic32(uint32_t x) {
+  // de Bruijn multiplication, see <http://supertech.csail.mit.edu/papers/debruijn.pdf>
+  static const uint8_t debruijn[32] = {
+    31, 22, 30, 21, 18, 10, 29, 2, 20, 17, 15, 13, 9, 6, 28, 1,
+    23, 19, 11, 3, 16, 14, 7, 24, 12, 4, 8, 25, 5, 26, 27, 0
+  };
+  if (x==0) return 32;
+  x |= x >> 1;
+  x |= x >> 2;
+  x |= x >> 4;
+  x |= x >> 8;
+  x |= x >> 16;
+  return debruijn[(uint32_t)(x * (uint32_t)(0x07C4ACDDU)) >> 27];
+}
+
+static inline size_t mi_ctz(size_t x) {
+  if (x==0) return MI_SIZE_BITS;
+  #if (MI_SIZE_BITS <= 32)
+    return mi_ctz_generic32((uint32_t)x);
+  #else
+    const uint32_t lo = (uint32_t)x;
+    if (lo != 0) {
+      return mi_ctz_generic32(lo);
+    }
+    else {
+      return (32 + mi_ctz_generic32((uint32_t)(x>>32)));
+    }
+  #endif
+}
+
+static inline size_t mi_clz(size_t x) {
+  if (x==0) return MI_SIZE_BITS;
+  #if (MI_SIZE_BITS <= 32)
+    return mi_clz_generic32((uint32_t)x);
+  #else
+    const uint32_t hi = (uint32_t)(x>>32);
+    if (hi != 0) {
+      return mi_clz_generic32(hi);
+    }
+    else {
+      return 32 + mi_clz_generic32((uint32_t)x);
+    }
+  #endif
+}
+
+#endif
+
+// "bit scan reverse": Return index of the highest bit (or MI_SIZE_BITS if `x` is zero)
+static inline size_t mi_bsr(size_t x) {
+  return (x==0 ? MI_SIZE_BITS : MI_SIZE_BITS - 1 - mi_clz(x));
+}
+
+size_t _mi_popcount_generic(size_t x);
+
+static inline size_t mi_popcount(size_t x) {
+  if (x<=1) return x;
+  if (x==SIZE_MAX) return MI_SIZE_BITS;
+  #if defined(__GNUC__)
+    #if (SIZE_MAX == ULONG_MAX)
+      return __builtin_popcountl(x);
+    #else
+      return __builtin_popcountll(x);
+    #endif
+  #else
+    return _mi_popcount_generic(x);
+  #endif
+}
+
+// ---------------------------------------------------------------------------------
+// Provide our own `_mi_memcpy` for potential performance optimizations.
+//
+// For now, only on Windows with msvc/clang-cl we optimize to `rep movsb` if
+// we happen to run on x86/x64 cpu's that have "fast short rep movsb" (FSRM) support
+// (AMD Zen3+ (~2020) or Intel Ice Lake+ (~2019)). See also issue #201 and pr #253.
+// ---------------------------------------------------------------------------------
+
+#if !MI_TRACK_ENABLED && defined(_WIN32) && (defined(_M_IX86) || defined(_M_X64))
+#include <intrin.h>
+extern mi_decl_hidden bool _mi_cpu_has_fsrm;
+extern mi_decl_hidden bool _mi_cpu_has_erms;
+static inline void _mi_memcpy(void* dst, const void* src, size_t n) {
+  if ((_mi_cpu_has_fsrm && n <= 128) || (_mi_cpu_has_erms && n > 128)) {
+    __movsb((unsigned char*)dst, (const unsigned char*)src, n);
+  }
+  else {
+    memcpy(dst, src, n);
+  }
+}
+static inline void _mi_memzero(void* dst, size_t n) {
+  if ((_mi_cpu_has_fsrm && n <= 128) || (_mi_cpu_has_erms && n > 128)) {
+    __stosb((unsigned char*)dst, 0, n);
+  }
+  else {
+    memset(dst, 0, n);
+  }
+}
+#else
+static inline void _mi_memcpy(void* dst, const void* src, size_t n) {
+  memcpy(dst, src, n);
+}
+static inline void _mi_memzero(void* dst, size_t n) {
+  memset(dst, 0, n);
+}
+#endif
+
+// -------------------------------------------------------------------------------
+// The `_mi_memcpy_aligned` can be used if the pointers are machine-word aligned
+// This is used for example in `mi_realloc`.
+// -------------------------------------------------------------------------------
+
+#if (defined(__GNUC__) && (__GNUC__ >= 4)) || defined(__clang__)
+// On GCC/Clang we provide a hint that the pointers are word aligned.
+static inline void _mi_memcpy_aligned(void* dst, const void* src, size_t n) {
+  mi_assert_internal(((uintptr_t)dst % MI_INTPTR_SIZE == 0) && ((uintptr_t)src % MI_INTPTR_SIZE == 0));
+  void* adst = __builtin_assume_aligned(dst, MI_INTPTR_SIZE);
+  const void* asrc = __builtin_assume_aligned(src, MI_INTPTR_SIZE);
+  _mi_memcpy(adst, asrc, n);
+}
+
+static inline void _mi_memzero_aligned(void* dst, size_t n) {
+  mi_assert_internal((uintptr_t)dst % MI_INTPTR_SIZE == 0);
+  void* adst = __builtin_assume_aligned(dst, MI_INTPTR_SIZE);
+  _mi_memzero(adst, n);
+}
+#else
+// Default fallback on `_mi_memcpy`
+static inline void _mi_memcpy_aligned(void* dst, const void* src, size_t n) {
+  mi_assert_internal(((uintptr_t)dst % MI_INTPTR_SIZE == 0) && ((uintptr_t)src % MI_INTPTR_SIZE == 0));
+  _mi_memcpy(dst, src, n);
+}
+
+static inline void _mi_memzero_aligned(void* dst, size_t n) {
+  mi_assert_internal((uintptr_t)dst % MI_INTPTR_SIZE == 0);
+  _mi_memzero(dst, n);
+}
+#endif
+
+
+#endif
diff --git a/compat/mimalloc/mimalloc/prim.h b/compat/mimalloc/mimalloc/prim.h
new file mode 100644
index 00000000000000..f8abc8c48cea32
--- /dev/null
+++ b/compat/mimalloc/mimalloc/prim.h
@@ -0,0 +1,421 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2024, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+#pragma once
+#ifndef MIMALLOC_PRIM_H
+#define MIMALLOC_PRIM_H
+#include "internal.h"             // mi_decl_hidden
+
+// --------------------------------------------------------------------------
+// This file specifies the primitive portability API.
+// Each OS/host needs to implement these primitives, see `src/prim`
+// for implementations on Windows, macOS, WASI, and Linux/Unix.
+//
+// note: on all primitive functions, we always have result parameters != NULL, and:
+//  addr != NULL and page aligned
+//  size > 0     and page aligned
+//  the return value is an error code as an `int` where 0 is success
+// --------------------------------------------------------------------------
+
+// OS memory configuration
+typedef struct mi_os_mem_config_s {
+  size_t  page_size;              // default to 4KiB
+  size_t  large_page_size;        // 0 if not supported, usually 2MiB (4MiB on Windows)
+  size_t  alloc_granularity;      // smallest allocation size (usually 4KiB, on Windows 64KiB)
+  size_t  physical_memory_in_kib; // physical memory size in KiB
+  size_t  virtual_address_bits;   // usually 48 or 56 bits on 64-bit systems. (used to determine secure randomization)
+  bool    has_overcommit;         // can we reserve more memory than can be actually committed?
+  bool    has_partial_free;       // can allocated blocks be freed partially? (true for mmap, false for VirtualAlloc)
+  bool    has_virtual_reserve;    // supports virtual address space reservation? (if true we can reserve virtual address space without using commit or physical memory)
+} mi_os_mem_config_t;
+
+// Initialize
+void _mi_prim_mem_init( mi_os_mem_config_t* config );
+
+// Free OS memory
+int _mi_prim_free(void* addr, size_t size );
+
+// Allocate OS memory. Return NULL on error.
+// The `try_alignment` is just a hint and the returned pointer does not have to be aligned.
+// If `commit` is false, the virtual memory range only needs to be reserved (with no access)
+// which will later be committed explicitly using `_mi_prim_commit`.
+// `is_zero` is set to true if the memory was zero initialized (as on most OS's)
+// The `hint_addr` address is either `NULL` or a preferred allocation address but can be ignored.
+// pre: !commit => !allow_large
+//      try_alignment >= _mi_os_page_size() and a power of 2
+int _mi_prim_alloc(void* hint_addr, size_t size, size_t try_alignment, bool commit, bool allow_large, bool* is_large, bool* is_zero, void** addr);
+
+// Commit memory. Returns error code or 0 on success.
+// For example, on Linux this would make the memory PROT_READ|PROT_WRITE.
+// `is_zero` is set to true if the memory was zero initialized (e.g. on Windows)
+int _mi_prim_commit(void* addr, size_t size, bool* is_zero);
+
+// Decommit memory. Returns error code or 0 on success. The `needs_recommit` result is true
+// if the memory would need to be re-committed. For example, on Windows this is always true,
+// but on Linux we could use MADV_DONTNEED to decommit which does not need a recommit.
+// pre: needs_recommit != NULL
+int _mi_prim_decommit(void* addr, size_t size, bool* needs_recommit);
+
+// Reset memory. The range keeps being accessible but the content might be reset to zero at any moment.
+// Returns error code or 0 on success.
+int _mi_prim_reset(void* addr, size_t size);
+
+// Reuse memory. This is called for memory that is already committed but
+// may have been reset (`_mi_prim_reset`) or decommitted (`_mi_prim_decommit`) where `needs_recommit` was false.
+// Returns error code or 0 on success. On most platforms this is a no-op.
+int _mi_prim_reuse(void* addr, size_t size);
+
+// Protect memory. Returns error code or 0 on success.
+int _mi_prim_protect(void* addr, size_t size, bool protect);
+
+// Allocate huge (1GiB) pages possibly associated with a NUMA node.
+// `is_zero` is set to true if the memory was zero initialized (as on most OS's)
+// pre: size > 0  and a multiple of 1GiB.
+//      numa_node is either negative (don't care), or a numa node number.
+int _mi_prim_alloc_huge_os_pages(void* hint_addr, size_t size, int numa_node, bool* is_zero, void** addr);
+
+// Return the current NUMA node
+size_t _mi_prim_numa_node(void);
+
+// Return the number of logical NUMA nodes
+size_t _mi_prim_numa_node_count(void);
+
+// Clock ticks
+mi_msecs_t _mi_prim_clock_now(void);
+
+// Return process information (only for statistics)
+typedef struct mi_process_info_s {
+  mi_msecs_t  elapsed;
+  mi_msecs_t  utime;
+  mi_msecs_t  stime;
+  size_t      current_rss;
+  size_t      peak_rss;
+  size_t      current_commit;
+  size_t      peak_commit;
+  size_t      page_faults;
+} mi_process_info_t;
+
+void _mi_prim_process_info(mi_process_info_t* pinfo);
+
+// Default stderr output. (only for warnings etc. with verbose enabled)
+// msg != NULL && _mi_strlen(msg) > 0
+void _mi_prim_out_stderr( const char* msg );
+
+// Get an environment variable. (only for options)
+// name != NULL, result != NULL, result_size >= 64
+bool _mi_prim_getenv(const char* name, char* result, size_t result_size);
+
+
+// Fill a buffer with strong randomness; return `false` on error or if
+// there is no strong randomization available.
+bool _mi_prim_random_buf(void* buf, size_t buf_len);
+
+// Called on the first thread start, and should ensure `_mi_thread_done` is called on thread termination.
+void _mi_prim_thread_init_auto_done(void);
+
+// Called on process exit and may take action to clean up resources associated with the thread auto done.
+void _mi_prim_thread_done_auto_done(void);
+
+// Called when the default heap for a thread changes
+void _mi_prim_thread_associate_default_heap(mi_heap_t* heap);
+
+
+//-------------------------------------------------------------------
+// Access to TLS (thread local storage) slots.
+// We need fast access to both a unique thread id (in `free.c:mi_free`) and
+// to a thread-local heap pointer (in `alloc.c:mi_malloc`).
+// To achieve this we use specialized code for various platforms.
+//-------------------------------------------------------------------
+
+// On some libc + platform combinations we can directly access a thread-local storage (TLS) slot.
+// The TLS layout depends on both the OS and libc implementation so we use specific tests for each main platform.
+// If you test on another platform and it works please send a PR :-)
+// see also https://akkadia.org/drepper/tls.pdf for more info on the TLS register.
+//
+// Note: we would like to prefer `__builtin_thread_pointer()` nowadays instead of using assembly,
+// but unfortunately we can not detect support reliably (see issue #883)
+// We also use it on Apple OS as we use a TLS slot for the default heap there.
+#if defined(__GNUC__) && ( \
+           (defined(__GLIBC__)   && (defined(__x86_64__) || defined(__i386__) || (defined(__arm__) && __ARM_ARCH >= 7) || defined(__aarch64__))) \
+        || (defined(__APPLE__)   && (defined(__x86_64__) || defined(__aarch64__) || defined(__POWERPC__))) \
+        || (defined(__BIONIC__)  && (defined(__x86_64__) || defined(__i386__) || (defined(__arm__) && __ARM_ARCH >= 7) || defined(__aarch64__))) \
+        || (defined(__FreeBSD__) && (defined(__x86_64__) || defined(__i386__) || defined(__aarch64__))) \
+        || (defined(__OpenBSD__) && (defined(__x86_64__) || defined(__i386__) || defined(__aarch64__))) \
+      )
+
+#define MI_HAS_TLS_SLOT    1
+
+static inline void* mi_prim_tls_slot(size_t slot) mi_attr_noexcept {
+  void* res;
+  const size_t ofs = (slot*sizeof(void*));
+  #if defined(__i386__)
+    __asm__("movl %%gs:%1, %0" : "=r" (res) : "m" (*((void**)ofs)) : );  // x86 32-bit always uses GS
+  #elif defined(__APPLE__) && defined(__x86_64__)
+    __asm__("movq %%gs:%1, %0" : "=r" (res) : "m" (*((void**)ofs)) : );  // x86_64 macOSX uses GS
+  #elif defined(__x86_64__) && (MI_INTPTR_SIZE==4)
+    __asm__("movl %%fs:%1, %0" : "=r" (res) : "m" (*((void**)ofs)) : );  // x32 ABI
+  #elif defined(__x86_64__)
+    __asm__("movq %%fs:%1, %0" : "=r" (res) : "m" (*((void**)ofs)) : );  // x86_64 Linux, BSD uses FS
+  #elif defined(__arm__)
+    void** tcb; MI_UNUSED(ofs);
+    __asm__ volatile ("mrc p15, 0, %0, c13, c0, 3\nbic %0, %0, #3" : "=r" (tcb));
+    res = tcb[slot];
+  #elif defined(__aarch64__)
+    void** tcb; MI_UNUSED(ofs);
+    #if defined(__APPLE__) // M1, issue #343
+    __asm__ volatile ("mrs %0, tpidrro_el0\nbic %0, %0, #7" : "=r" (tcb));
+    #else
+    __asm__ volatile ("mrs %0, tpidr_el0" : "=r" (tcb));
+    #endif
+    res = tcb[slot];
+  #elif defined(__APPLE__) && defined(__POWERPC__) // ppc, issue #781
+    MI_UNUSED(ofs);
+    res = pthread_getspecific(slot);
+  #endif
+  return res;
+}
+
+// setting a tls slot is only used on macOS for now
+static inline void mi_prim_tls_slot_set(size_t slot, void* value) mi_attr_noexcept {
+  const size_t ofs = (slot*sizeof(void*));
+  #if defined(__i386__)
+    __asm__("movl %1,%%gs:%0" : "=m" (*((void**)ofs)) : "rn" (value) : );  // 32-bit always uses GS
+  #elif defined(__APPLE__) && defined(__x86_64__)
+    __asm__("movq %1,%%gs:%0" : "=m" (*((void**)ofs)) : "rn" (value) : );  // x86_64 macOS uses GS
+  #elif defined(__x86_64__) && (MI_INTPTR_SIZE==4)
+    __asm__("movl %1,%%fs:%0" : "=m" (*((void**)ofs)) : "rn" (value) : );  // x32 ABI
+  #elif defined(__x86_64__)
+    __asm__("movq %1,%%fs:%0" : "=m" (*((void**)ofs)) : "rn" (value) : );  // x86_64 Linux, BSD uses FS
+  #elif defined(__arm__)
+    void** tcb; MI_UNUSED(ofs);
+    __asm__ volatile ("mrc p15, 0, %0, c13, c0, 3\nbic %0, %0, #3" : "=r" (tcb));
+    tcb[slot] = value;
+  #elif defined(__aarch64__)
+    void** tcb; MI_UNUSED(ofs);
+    #if defined(__APPLE__) // M1, issue #343
+    __asm__ volatile ("mrs %0, tpidrro_el0\nbic %0, %0, #7" : "=r" (tcb));
+    #else
+    __asm__ volatile ("mrs %0, tpidr_el0" : "=r" (tcb));
+    #endif
+    tcb[slot] = value;
+  #elif defined(__APPLE__) && defined(__POWERPC__) // ppc, issue #781
+    MI_UNUSED(ofs);
+    pthread_setspecific(slot, value);
+  #endif
+}
+
+#elif _WIN32 && MI_WIN_USE_FIXED_TLS && !defined(MI_WIN_USE_FLS)
+
+// On windows we can store the thread-local heap at a fixed TLS slot to avoid
+// thread-local initialization checks in the fast path.
+// We allocate a user TLS slot at process initialization (see `windows/prim.c`)
+// and store the offset `_mi_win_tls_offset`.
+#define MI_HAS_TLS_SLOT  1              // 2 = we can reliably initialize the slot (saving a test on each malloc)
+
+extern mi_decl_hidden size_t _mi_win_tls_offset;
+
+#if MI_WIN_USE_FIXED_TLS > 1
+#define MI_TLS_SLOT     (MI_WIN_USE_FIXED_TLS)
+#elif MI_SIZE_SIZE == 4
+#define MI_TLS_SLOT     (0x0E10 + _mi_win_tls_offset)  // User TLS slots <https://en.wikipedia.org/wiki/Win32_Thread_Information_Block>
+#else
+#define MI_TLS_SLOT     (0x1480 + _mi_win_tls_offset)  // User TLS slots <https://en.wikipedia.org/wiki/Win32_Thread_Information_Block>
+#endif
+
+static inline void* mi_prim_tls_slot(size_t slot) mi_attr_noexcept {
+  #if (_M_X64 || _M_AMD64) && !defined(_M_ARM64EC)
+  return (void*)__readgsqword((unsigned long)slot);   // direct load at offset from gs
+  #elif _M_IX86 && !defined(_M_ARM64EC)
+  return (void*)__readfsdword((unsigned long)slot);   // direct load at offset from fs
+  #else
+  return ((void**)NtCurrentTeb())[slot / sizeof(void*)];
+  #endif
+}
+static inline void mi_prim_tls_slot_set(size_t slot, void* value) mi_attr_noexcept {
+  ((void**)NtCurrentTeb())[slot / sizeof(void*)] = value;
+}
+
+#endif
+
+
+
+//-------------------------------------------------------------------
+// Get a fast unique thread id.
+//
+// Getting the thread id should be performant as it is called in the
+// fast path of `_mi_free` and we specialize for various platforms as
+// inlined definitions. Regular code should call `init.c:_mi_thread_id()`.
+// We only require _mi_prim_thread_id() to return a unique id
+// for each thread (unequal to zero).
+//-------------------------------------------------------------------
+
+
+// Do we have __builtin_thread_pointer? This would be the preferred way to get a unique thread id
+// but unfortunately, it seems we cannot test for this reliably at this time (see issue #883)
+// Nevertheless, it seems needed on older graviton platforms (see issue #851).
+// For now, we only enable this for specific platforms.
+#if !defined(__APPLE__)  /* on apple (M1) the wrong register is read (tpidr_el0 instead of tpidrro_el0) so fall back to TLS slot assembly (<https://github.com/microsoft/mimalloc/issues/343#issuecomment-763272369>)*/ \
+    && !defined(__CYGWIN__) \
+    && !defined(MI_LIBC_MUSL) \
+    && (!defined(__clang_major__) || __clang_major__ >= 14)  /* older clang versions emit bad code; fall back to using the TLS slot (<https://lore.kernel.org/linux-arm-kernel/202110280952.352F66D8@keescook/T/>) */
+  #if    (defined(__GNUC__) && (__GNUC__ >= 7)  && defined(__aarch64__)) /* aarch64 for older gcc versions (issue #851) */ \
+      || (defined(__GNUC__) && (__GNUC__ >= 11) && defined(__x86_64__)) \
+      || (defined(__clang_major__) && (__clang_major__ >= 14) && (defined(__aarch64__) || defined(__x86_64__)))
+    #define MI_USE_BUILTIN_THREAD_POINTER  1
+  #endif
+#endif
+
+
+
+// defined in `init.c`; do not use these directly
+extern mi_decl_hidden mi_decl_thread mi_heap_t* _mi_heap_default;  // default heap to allocate from
+extern mi_decl_hidden bool _mi_process_is_initialized;             // has mi_process_init been called?
+
+static inline mi_threadid_t _mi_prim_thread_id(void) mi_attr_noexcept;
+
+// Get a unique id for the current thread.
+#if defined(MI_PRIM_THREAD_ID)
+
+static inline mi_threadid_t _mi_prim_thread_id(void) mi_attr_noexcept {
+  return MI_PRIM_THREAD_ID();  // used for example by CPython for a free threaded build (see python/cpython#115488)
+}
+
+#elif defined(_WIN32)
+
+static inline mi_threadid_t _mi_prim_thread_id(void) mi_attr_noexcept {
+  // Windows: works on Intel and ARM in both 32- and 64-bit
+  return (uintptr_t)NtCurrentTeb();
+}
+
+#elif MI_USE_BUILTIN_THREAD_POINTER
+
+static inline mi_threadid_t _mi_prim_thread_id(void) mi_attr_noexcept {
+  // Works on most Unix based platforms with recent compilers
+  return (uintptr_t)__builtin_thread_pointer();
+}
+
+#elif MI_HAS_TLS_SLOT
+
+static inline mi_threadid_t _mi_prim_thread_id(void) mi_attr_noexcept {
+  #if defined(__BIONIC__)
+    // issue #384, #495: on the Bionic libc (Android), slot 1 is the thread id
+    // see: https://github.com/aosp-mirror/platform_bionic/blob/c44b1d0676ded732df4b3b21c5f798eacae93228/libc/platform/bionic/tls_defines.h#L86
+    return (uintptr_t)mi_prim_tls_slot(1);
+  #else
+    // in all our other targets, slot 0 is the thread id
+    // glibc: https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/x86_64/nptl/tls.h
+    // apple: https://github.com/apple/darwin-xnu/blob/main/libsyscall/os/tsd.h#L36
+    return (uintptr_t)mi_prim_tls_slot(0);
+  #endif
+}
+
+#else
+
+// otherwise use portable C, taking the address of a thread local variable (this is still very fast on most platforms).
+static inline mi_threadid_t _mi_prim_thread_id(void) mi_attr_noexcept {
+  return (uintptr_t)&_mi_heap_default;
+}
+
+#endif
+
+
+
+/* ----------------------------------------------------------------------------------------
+Get the thread local default heap: `_mi_prim_get_default_heap()`
+
+This is inlined here as it is on the fast path for allocation functions.
+
+On most platforms (Windows, Linux, FreeBSD, NetBSD, etc), this just returns a
+__thread local variable (`_mi_heap_default`). With the initial-exec TLS model this ensures
+that the storage will always be available (allocated on the thread stacks).
+
+On some platforms though we cannot use that when overriding `malloc` since the underlying
+TLS implementation (or the loader) will itself call `malloc` on first access and recurse.
+We try to circumvent this in an efficient way:
+- macOSX : we use an unused TLS slot from the OS allocated slots (MI_TLS_SLOT). On OSX, the
+           loader itself calls `malloc` even before the modules are initialized.
+- OpenBSD: we use an unused slot from the pthread block (MI_TLS_PTHREAD_SLOT_OFS).
+- DragonFly: defaults are working but seem slow compared to FreeBSD (see PR #323)
+------------------------------------------------------------------------------------------- */
+
+static inline mi_heap_t* mi_prim_get_default_heap(void);
+
+#if defined(MI_MALLOC_OVERRIDE)
+#if defined(__APPLE__) // macOS
+  #define MI_TLS_SLOT               89  // seems unused?
+  // other possible unused ones are 9, 29, __PTK_FRAMEWORK_JAVASCRIPTCORE_KEY4 (94), __PTK_FRAMEWORK_GC_KEY9 (112) and __PTK_FRAMEWORK_OLDGC_KEY9 (89)
+  // see <https://github.com/rweichler/substrate/blob/master/include/pthread_machdep.h>
+#elif defined(__OpenBSD__)
+  // use end bytes of a name; goes wrong if anyone uses names > 23 characters (pthread specifies 16)
+  // see <https://github.com/openbsd/src/blob/master/lib/libc/include/thread_private.h#L371>
+  #define MI_TLS_PTHREAD_SLOT_OFS   (6*sizeof(int) + 4*sizeof(void*) + 24)
+  // #elif defined(__DragonFly__)
+  // #warning "mimalloc is not working correctly on DragonFly yet."
+  // #define MI_TLS_PTHREAD_SLOT_OFS   (4 + 1*sizeof(void*))  // offset `uniqueid` (also used by gdb?) <https://github.com/DragonFlyBSD/DragonFlyBSD/blob/master/lib/libthread_xu/thread/thr_private.h#L458>
+#elif defined(__ANDROID__)
+  // See issue #381
+  #define MI_TLS_PTHREAD
+#endif
+#endif
+
+
+#if MI_TLS_SLOT
+# if !defined(MI_HAS_TLS_SLOT)
+#  error "trying to use a TLS slot for the default heap, but the mi_prim_tls_slot primitives are not defined"
+# endif
+
+static inline mi_heap_t* mi_prim_get_default_heap(void) {
+  mi_heap_t* heap = (mi_heap_t*)mi_prim_tls_slot(MI_TLS_SLOT);
+  #if MI_HAS_TLS_SLOT == 1   // check if the TLS slot is initialized
+  if mi_unlikely(heap == NULL) {
+    #ifdef __GNUC__
+    __asm(""); // prevent conditional load of the address of _mi_heap_empty
+    #endif
+    heap = (mi_heap_t*)&_mi_heap_empty;
+  }
+  #endif
+  return heap;
+}
+
+#elif defined(MI_TLS_PTHREAD_SLOT_OFS)
+
+static inline mi_heap_t** mi_prim_tls_pthread_heap_slot(void) {
+  pthread_t self = pthread_self();
+  #if defined(__DragonFly__)
+  if (self==NULL) return NULL;
+  #endif
+  return (mi_heap_t**)((uint8_t*)self + MI_TLS_PTHREAD_SLOT_OFS);
+}
+
+static inline mi_heap_t* mi_prim_get_default_heap(void) {
+  mi_heap_t** pheap = mi_prim_tls_pthread_heap_slot();
+  if mi_unlikely(pheap == NULL) return _mi_heap_main_get();
+  mi_heap_t* heap = *pheap;
+  if mi_unlikely(heap == NULL) return (mi_heap_t*)&_mi_heap_empty;
+  return heap;
+}
+
+#elif defined(MI_TLS_PTHREAD)
+
+extern mi_decl_hidden pthread_key_t _mi_heap_default_key;
+static inline mi_heap_t* mi_prim_get_default_heap(void) {
+  mi_heap_t* heap = (mi_unlikely(_mi_heap_default_key == (pthread_key_t)(-1)) ? _mi_heap_main_get() : (mi_heap_t*)pthread_getspecific(_mi_heap_default_key));
+  return (mi_unlikely(heap == NULL) ? (mi_heap_t*)&_mi_heap_empty : heap);
+}
+
+#else // default using a thread local variable; used on most platforms.
+
+static inline mi_heap_t* mi_prim_get_default_heap(void) {
+  #if defined(MI_TLS_RECURSE_GUARD)
+  if (mi_unlikely(!_mi_process_is_initialized)) return _mi_heap_main_get();
+  #endif
+  return _mi_heap_default;
+}
+
+#endif  // mi_prim_get_default_heap()
+
+
+#endif  // MIMALLOC_PRIM_H
diff --git a/compat/mimalloc/mimalloc/track.h b/compat/mimalloc/mimalloc/track.h
new file mode 100644
index 00000000000000..4b5709e2b54110
--- /dev/null
+++ b/compat/mimalloc/mimalloc/track.h
@@ -0,0 +1,145 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2023, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+#pragma once
+#ifndef MIMALLOC_TRACK_H
+#define MIMALLOC_TRACK_H
+
+/* ------------------------------------------------------------------------------------------------------
+Track memory ranges with macros for tools like Valgrind, address sanitizer, or other memory checkers.
+These can be defined for tracking allocation:
+
+  #define mi_track_malloc_size(p,reqsize,size,zero)
+  #define mi_track_free_size(p,_size)
+
+The macros are set up such that the size passed to `mi_track_free_size`
+always matches the size of `mi_track_malloc_size`. (currently, `size == mi_usable_size(p)`).
+The `reqsize` is what the user requested, and `size >= reqsize`.
+The `size` is either byte precise (and `size==reqsize`) if `MI_PADDING` is enabled,
+or otherwise it is the usable block size which may be larger than the original request.
+Use `_mi_block_size_of(void* p)` to get the full block size that was allocated (including padding etc).
+The `zero` parameter is `true` if the allocated block is zero initialized.
+
+Optional:
+
+  #define mi_track_align(p,alignedp,offset,size)
+  #define mi_track_resize(p,oldsize,newsize)
+  #define mi_track_init()
+
+The `mi_track_align` is called right after a `mi_track_malloc` for aligned pointers in a block.
+The corresponding `mi_track_free` still uses the block start pointer and original size (corresponding to the `mi_track_malloc`).
+The `mi_track_resize` is currently unused but could be called on reallocations within a block.
+`mi_track_init` is called at program start.
+
+The following macros are for tools like asan and valgrind to track whether memory is
+defined, undefined, or not accessible at all:
+
+  #define mi_track_mem_defined(p,size)
+  #define mi_track_mem_undefined(p,size)
+  #define mi_track_mem_noaccess(p,size)
+
+-------------------------------------------------------------------------------------------------------*/
+
+#if MI_TRACK_VALGRIND
+// valgrind tool
+
+#define MI_TRACK_ENABLED      1
+#define MI_TRACK_HEAP_DESTROY 1           // track free of individual blocks on heap_destroy
+#define MI_TRACK_TOOL         "valgrind"
+
+#include <valgrind/valgrind.h>
+#include <valgrind/memcheck.h>
+
+#define mi_track_malloc_size(p,reqsize,size,zero) VALGRIND_MALLOCLIKE_BLOCK(p,size,MI_PADDING_SIZE /*red zone*/,zero)
+#define mi_track_free_size(p,_size)               VALGRIND_FREELIKE_BLOCK(p,MI_PADDING_SIZE /*red zone*/)
+#define mi_track_resize(p,oldsize,newsize)        VALGRIND_RESIZEINPLACE_BLOCK(p,oldsize,newsize,MI_PADDING_SIZE /*red zone*/)
+#define mi_track_mem_defined(p,size)              VALGRIND_MAKE_MEM_DEFINED(p,size)
+#define mi_track_mem_undefined(p,size)            VALGRIND_MAKE_MEM_UNDEFINED(p,size)
+#define mi_track_mem_noaccess(p,size)             VALGRIND_MAKE_MEM_NOACCESS(p,size)
+
+#elif MI_TRACK_ASAN
+// address sanitizer
+
+#define MI_TRACK_ENABLED      1
+#define MI_TRACK_HEAP_DESTROY 0
+#define MI_TRACK_TOOL         "asan"
+
+#include <sanitizer/asan_interface.h>
+
+#define mi_track_malloc_size(p,reqsize,size,zero) ASAN_UNPOISON_MEMORY_REGION(p,size)
+#define mi_track_free_size(p,size)                ASAN_POISON_MEMORY_REGION(p,size)
+#define mi_track_mem_defined(p,size)              ASAN_UNPOISON_MEMORY_REGION(p,size)
+#define mi_track_mem_undefined(p,size)            ASAN_UNPOISON_MEMORY_REGION(p,size)
+#define mi_track_mem_noaccess(p,size)             ASAN_POISON_MEMORY_REGION(p,size)
+
+#elif MI_TRACK_ETW
+// windows event tracing
+
+#define MI_TRACK_ENABLED      1
+#define MI_TRACK_HEAP_DESTROY 1
+#define MI_TRACK_TOOL         "ETW"
+
+#include "../src/prim/windows/etw.h"
+
+#define mi_track_init()                           EventRegistermicrosoft_windows_mimalloc();
+#define mi_track_malloc_size(p,reqsize,size,zero) EventWriteETW_MI_ALLOC((UINT64)(p), size)
+#define mi_track_free_size(p,size)                EventWriteETW_MI_FREE((UINT64)(p), size)
+
+#else
+// no tracking
+
+#define MI_TRACK_ENABLED      0
+#define MI_TRACK_HEAP_DESTROY 0
+#define MI_TRACK_TOOL         "none"
+
+#define mi_track_malloc_size(p,reqsize,size,zero)
+#define mi_track_free_size(p,_size)
+
+#endif
+
+// -------------------
+// Utility definitions
+
+#ifndef mi_track_resize
+#define mi_track_resize(p,oldsize,newsize)      mi_track_free_size(p,oldsize); mi_track_malloc(p,newsize,false)
+#endif
+
+#ifndef mi_track_align
+#define mi_track_align(p,alignedp,offset,size)  mi_track_mem_noaccess(p,offset)
+#endif
+
+#ifndef mi_track_init
+#define mi_track_init()
+#endif
+
+#ifndef mi_track_mem_defined
+#define mi_track_mem_defined(p,size)
+#endif
+
+#ifndef mi_track_mem_undefined
+#define mi_track_mem_undefined(p,size)
+#endif
+
+#ifndef mi_track_mem_noaccess
+#define mi_track_mem_noaccess(p,size)
+#endif
+
+
+#if MI_PADDING
+#define mi_track_malloc(p,reqsize,zero) \
+  if ((p)!=NULL) { \
+    mi_assert_internal(mi_usable_size(p)==(reqsize)); \
+    mi_track_malloc_size(p,reqsize,reqsize,zero); \
+  }
+#else
+#define mi_track_malloc(p,reqsize,zero) \
+  if ((p)!=NULL) { \
+    mi_assert_internal(mi_usable_size(p)>=(reqsize)); \
+    mi_track_malloc_size(p,reqsize,mi_usable_size(p),zero); \
+  }
+#endif
+
+#endif
diff --git a/compat/mimalloc/mimalloc/types.h b/compat/mimalloc/mimalloc/types.h
new file mode 100644
index 00000000000000..f52d37a82b19b6
--- /dev/null
+++ b/compat/mimalloc/mimalloc/types.h
@@ -0,0 +1,686 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2024, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+#pragma once
+#ifndef MIMALLOC_TYPES_H
+#define MIMALLOC_TYPES_H
+
+// --------------------------------------------------------------------------
+// This file contains the main type definitions for mimalloc:
+// mi_heap_t      : all data for a thread-local heap, contains
+//                  lists of all managed heap pages.
+// mi_segment_t   : a larger chunk of memory (32MiB) from where pages
+//                  are allocated. A segment is divided in slices (64KiB) from
+//                  which pages are allocated.
+// mi_page_t      : a "mimalloc" page (usually 64KiB or 512KiB) from
+//                  where objects are allocated.
+//                  Note: we write "OS page" for OS memory pages while
+//                  using plain "page" for mimalloc pages (`mi_page_t`).
+// --------------------------------------------------------------------------
+
+
+#include <mimalloc-stats.h>
+#include <stddef.h>   // ptrdiff_t
+#include <stdint.h>   // uintptr_t, uint16_t, etc
+#include <stdbool.h>  // bool
+#include "atomic.h"   // _Atomic
+
+#ifdef _MSC_VER
+#pragma warning(disable:4214) // bitfield is not int
+#endif
+
+// Minimal alignment necessary. On most platforms 16 bytes are needed
+// due to SSE registers for example. This must be at least `sizeof(void*)`
+#ifndef MI_MAX_ALIGN_SIZE
+#define MI_MAX_ALIGN_SIZE  16   // sizeof(max_align_t)
+#endif
+
+// ------------------------------------------------------
+// Variants
+// ------------------------------------------------------
+
+// Define NDEBUG in the release version to disable assertions.
+// #define NDEBUG
+
+// Define MI_TRACK_<tool> to enable tracking support
+// #define MI_TRACK_VALGRIND 1
+// #define MI_TRACK_ASAN     1
+// #define MI_TRACK_ETW      1
+
+// Define MI_STAT as 1 to maintain statistics; set it to 2 to have detailed statistics (but costs some performance).
+// #define MI_STAT 1
+
+// Define MI_SECURE to enable security mitigations
+// #define MI_SECURE 1  // guard page around metadata
+// #define MI_SECURE 2  // guard page around each mimalloc page
+// #define MI_SECURE 3  // encode free lists (detect corrupted free list (buffer overflow), and invalid pointer free)
+// #define MI_SECURE 4  // checks for double free. (may be more expensive)
+
+#if !defined(MI_SECURE)
+#define MI_SECURE 0
+#endif
+
+// Define MI_DEBUG for debug mode
+// #define MI_DEBUG 1  // basic assertion checks and statistics, check double free, corrupted free list, and invalid pointer free.
+// #define MI_DEBUG 2  // + internal assertion checks
+// #define MI_DEBUG 3  // + extensive internal invariant checking (cmake -DMI_DEBUG_FULL=ON)
+#if !defined(MI_DEBUG)
+#if defined(MI_BUILD_RELEASE) || defined(NDEBUG)
+#define MI_DEBUG 0
+#else
+#define MI_DEBUG 2
+#endif
+#endif
+
+// Use guard pages behind objects of a certain size (set by the MIMALLOC_DEBUG_GUARDED_MIN/MAX options)
+// Padding should be disabled when using guard pages
+// #define MI_GUARDED 1
+#if defined(MI_GUARDED)
+#define MI_PADDING  0
+#endif
+
+// Reserve extra padding at the end of each block to be more resilient against heap block overflows.
+// The padding can detect buffer overflow on free.
+#if !defined(MI_PADDING) && (MI_SECURE>=3 || MI_DEBUG>=1 || (MI_TRACK_VALGRIND || MI_TRACK_ASAN || MI_TRACK_ETW))
+#define MI_PADDING  1
+#endif
+
+// Check padding bytes; allows byte-precise buffer overflow detection
+#if !defined(MI_PADDING_CHECK) && MI_PADDING && (MI_SECURE>=3 || MI_DEBUG>=1)
+#define MI_PADDING_CHECK 1
+#endif
+
+
+// Encoded free lists allow detection of corrupted free lists
+// and can detect buffer overflows, modify after free, and double `free`s.
+#if (MI_SECURE>=3 || MI_DEBUG>=1)
+#define MI_ENCODE_FREELIST  1
+#endif
+
+
+// We used to abandon huge pages in order to eagerly deallocate them if freed from another thread.
+// Unfortunately, that makes it not possible to visit them during a heap walk or include them in a
+// `mi_heap_destroy`. We therefore instead reset/decommit the huge blocks nowadays if freed from
+// another thread so the memory becomes "virtually" available (and eventually gets properly freed by
+// the owning thread).
+// #define MI_HUGE_PAGE_ABANDON 1
+
+
+// ------------------------------------------------------
+// Platform specific values
+// ------------------------------------------------------
+
+// ------------------------------------------------------
+// Size of a pointer.
+// We assume that `sizeof(void*)==sizeof(intptr_t)`
+// and it holds for all platforms we know of.
+//
+// However, the C standard only requires that:
+//  p == (void*)((intptr_t)p)
+// but we also need:
+//  i == (intptr_t)((void*)i)
+// or otherwise one might define an intptr_t type that is larger than a pointer...
+// ------------------------------------------------------
+
+#if INTPTR_MAX > INT64_MAX
+# define MI_INTPTR_SHIFT (4)  // assume 128-bit  (as on arm CHERI for example)
+#elif INTPTR_MAX == INT64_MAX
+# define MI_INTPTR_SHIFT (3)
+#elif INTPTR_MAX == INT32_MAX
+# define MI_INTPTR_SHIFT (2)
+#else
+#error platform pointers must be 32, 64, or 128 bits
+#endif
+
+#if SIZE_MAX == UINT64_MAX
+# define MI_SIZE_SHIFT (3)
+typedef int64_t  mi_ssize_t;
+#elif SIZE_MAX == UINT32_MAX
+# define MI_SIZE_SHIFT (2)
+typedef int32_t  mi_ssize_t;
+#else
+#error platform objects must be 32 or 64 bits
+#endif
+
+#if (SIZE_MAX/2) > LONG_MAX
+# define MI_ZU(x)  x##ULL
+# define MI_ZI(x)  x##LL
+#else
+# define MI_ZU(x)  x##UL
+# define MI_ZI(x)  x##L
+#endif
+
+#define MI_INTPTR_SIZE  (1<<MI_INTPTR_SHIFT)
+#define MI_INTPTR_BITS  (MI_INTPTR_SIZE*8)
+
+#define MI_SIZE_SIZE  (1<<MI_SIZE_SHIFT)
+#define MI_SIZE_BITS  (MI_SIZE_SIZE*8)
+
+#define MI_KiB     (MI_ZU(1024))
+#define MI_MiB     (MI_KiB*MI_KiB)
+#define MI_GiB     (MI_MiB*MI_KiB)
+
+
+// ------------------------------------------------------
+// Main internal data-structures
+// ------------------------------------------------------
+
+// Main tuning parameters for segment and page sizes
+// Sizes for 64-bit (usually divide by two for 32-bit)
+#ifndef MI_SEGMENT_SLICE_SHIFT
+#define MI_SEGMENT_SLICE_SHIFT            (13 + MI_INTPTR_SHIFT)         // 64KiB  (32KiB on 32-bit)
+#endif
+
+#ifndef MI_SEGMENT_SHIFT
+#if MI_INTPTR_SIZE > 4
+#define MI_SEGMENT_SHIFT                  ( 9 + MI_SEGMENT_SLICE_SHIFT)  // 32MiB
+#else
+#define MI_SEGMENT_SHIFT                  ( 7 + MI_SEGMENT_SLICE_SHIFT)  // 4MiB on 32-bit
+#endif
+#endif
+
+#ifndef MI_SMALL_PAGE_SHIFT
+#define MI_SMALL_PAGE_SHIFT               (MI_SEGMENT_SLICE_SHIFT)       // 64KiB
+#endif
+#ifndef MI_MEDIUM_PAGE_SHIFT
+#define MI_MEDIUM_PAGE_SHIFT              ( 3 + MI_SMALL_PAGE_SHIFT)     // 512KiB
+#endif
+
+// Derived constants
+#define MI_SEGMENT_SIZE                   (MI_ZU(1)<<MI_SEGMENT_SHIFT)
+#define MI_SEGMENT_ALIGN                  MI_SEGMENT_SIZE
+#define MI_SEGMENT_MASK                   ((uintptr_t)(MI_SEGMENT_ALIGN - 1))
+#define MI_SEGMENT_SLICE_SIZE             (MI_ZU(1)<< MI_SEGMENT_SLICE_SHIFT)
+#define MI_SLICES_PER_SEGMENT             (MI_SEGMENT_SIZE / MI_SEGMENT_SLICE_SIZE) // 512 (128 on 32-bit)
+
+#define MI_SMALL_PAGE_SIZE                (MI_ZU(1)<<MI_SMALL_PAGE_SHIFT)
+#define MI_MEDIUM_PAGE_SIZE               (MI_ZU(1)<<MI_MEDIUM_PAGE_SHIFT)
+
+#define MI_SMALL_OBJ_SIZE_MAX             (MI_SMALL_PAGE_SIZE/8)   // 8 KiB on 64-bit
+#define MI_MEDIUM_OBJ_SIZE_MAX            (MI_MEDIUM_PAGE_SIZE/8)  // 64 KiB on 64-bit
+#define MI_MEDIUM_OBJ_WSIZE_MAX           (MI_MEDIUM_OBJ_SIZE_MAX/MI_INTPTR_SIZE)
+#define MI_LARGE_OBJ_SIZE_MAX             (MI_SEGMENT_SIZE/2)      // 16 MiB on 64-bit
+#define MI_LARGE_OBJ_WSIZE_MAX            (MI_LARGE_OBJ_SIZE_MAX/MI_INTPTR_SIZE)
+
+// Maximum number of size classes. (spaced exponentially in 12.5% increments)
+#if MI_BIN_HUGE != 73U
+#error "mimalloc internal: expecting 73 bins"
+#endif
+
+#if (MI_MEDIUM_OBJ_WSIZE_MAX >= 655360)
+#error "mimalloc internal: define more bins"
+#endif
+
+// Maximum block size for which blocks are guaranteed to be block size aligned. (see `segment.c:_mi_segment_page_start`)
+#define MI_MAX_ALIGN_GUARANTEE            (MI_MEDIUM_OBJ_SIZE_MAX)
+
+// Alignments over MI_BLOCK_ALIGNMENT_MAX are allocated in dedicated huge page segments
+#define MI_BLOCK_ALIGNMENT_MAX            (MI_SEGMENT_SIZE >> 1)
+
+// Maximum slice count (255) for which we can find the page for interior pointers
+#define MI_MAX_SLICE_OFFSET_COUNT         ((MI_BLOCK_ALIGNMENT_MAX / MI_SEGMENT_SLICE_SIZE) - 1)
+
+// we never allocate more than PTRDIFF_MAX (see also <https://sourceware.org/ml/libc-announce/2019/msg00001.html>)
+// on 64-bit+ systems we also limit the maximum allocation size such that the slice count fits in 32-bits. (issue #877)
+#if (PTRDIFF_MAX > INT32_MAX) && (PTRDIFF_MAX >= (MI_SEGMENT_SLICE_SIZE * UINT32_MAX))
+#define MI_MAX_ALLOC_SIZE   (MI_SEGMENT_SLICE_SIZE * (UINT32_MAX-1))
+#else
+#define MI_MAX_ALLOC_SIZE   PTRDIFF_MAX
+#endif
+
+
+// ------------------------------------------------------
+// Mimalloc pages contain allocated blocks
+// ------------------------------------------------------
+
+// The free lists use encoded next fields
+// (Only actually encodes when MI_ENCODE_FREELIST is defined.)
+typedef uintptr_t  mi_encoded_t;
+
+// thread ids
+typedef size_t     mi_threadid_t;
+
+// free lists contain blocks
+typedef struct mi_block_s {
+  mi_encoded_t next;
+} mi_block_t;
+
+#if MI_GUARDED
+// we always align guarded pointers in a block at an offset
+// the block `next` field is then used as a tag to distinguish regular offset aligned blocks from guarded ones
+#define MI_BLOCK_TAG_ALIGNED   ((mi_encoded_t)(0))
+#define MI_BLOCK_TAG_GUARDED   (~MI_BLOCK_TAG_ALIGNED)
+#endif
+
+
+// The delayed flags are used for efficient multi-threaded free-ing
+typedef enum mi_delayed_e {
+  MI_USE_DELAYED_FREE   = 0, // push on the owning heap thread delayed list
+  MI_DELAYED_FREEING    = 1, // temporary: another thread is accessing the owning heap
+  MI_NO_DELAYED_FREE    = 2, // optimize: push on page local thread free queue if another block is already in the heap thread delayed free list
+  MI_NEVER_DELAYED_FREE = 3  // sticky: used for abandoned pages without an owning heap; this only resets on page reclaim
+} mi_delayed_t;
+
+
+// The `in_full` and `has_aligned` page flags are put in a union to efficiently
+// test if both are false (`full_aligned == 0`) in the `mi_free` routine.
+#if !MI_TSAN
+typedef union mi_page_flags_s {
+  uint8_t full_aligned;
+  struct {
+    uint8_t in_full : 1;
+    uint8_t has_aligned : 1;
+  } x;
+} mi_page_flags_t;
+#else
+// under thread sanitizer, use a byte for each flag to suppress warning, issue #130
+typedef union mi_page_flags_s {
+  uint32_t full_aligned;
+  struct {
+    uint8_t in_full;
+    uint8_t has_aligned;
+  } x;
+} mi_page_flags_t;
+#endif
+
+// Thread free list.
+// We use the bottom 2 bits of the pointer for mi_delayed_t flags
+typedef uintptr_t mi_thread_free_t;
+
+// A page contains blocks of one specific size (`block_size`).
+// Each page has three lists of free blocks:
+// `free` for blocks that can be allocated,
+// `local_free` for freed blocks that are not yet available to `mi_malloc`
+// `thread_free` for freed blocks by other threads
+// The `local_free` and `thread_free` lists are migrated to the `free` list
+// when it is exhausted. The separate `local_free` list is necessary to
+// implement a monotonic heartbeat. The `thread_free` list is needed for
+// avoiding atomic operations in the common case.
+//
+// `used - |thread_free|` == actual blocks that are in use (alive)
+// `used - |thread_free| + |free| + |local_free| == capacity`
+//
+// We don't count `freed` (as |free|) but use `used` to reduce
+// the number of memory accesses in the `mi_page_all_free` function(s).
+//
+// Notes:
+// - Access is optimized for `free.c:mi_free` and `alloc.c:mi_page_alloc`
+// - Using `uint16_t` does not seem to slow things down
+// - The size is 12 words on 64-bit which helps the page index calculations
+//   (and 14 words on 32-bit, and encoded free lists add 2 words)
+// - `xthread_free` uses the bottom bits as delayed-free flags to optimize
+//   concurrent frees where only the first concurrent free adds to the owning
+//   heap `thread_delayed_free` list (see `free.c:mi_free_block_mt`).
+//   The invariant is that no-delayed-free is only set if there is
+//   at least one block that will be added, or has already been added, to
+//   the owning heap `thread_delayed_free` list. This guarantees that pages
+//   will be freed correctly even if only other threads free blocks.
+typedef struct mi_page_s {
+  // "owned" by the segment
+  uint32_t              slice_count;       // slices in this page (0 if not a page)
+  uint32_t              slice_offset;      // distance from the actual page data slice (0 if a page)
+  uint8_t               is_committed:1;    // `true` if the page virtual memory is committed
+  uint8_t               is_zero_init:1;    // `true` if the page was initially zero initialized
+  uint8_t               is_huge:1;         // `true` if the page is in a huge segment (`segment->kind == MI_SEGMENT_HUGE`)
+                                           // padding
+  // layout like this to optimize access in `mi_malloc` and `mi_free`
+  uint16_t              capacity;          // number of blocks committed, must be the first field, see `segment.c:page_clear`
+  uint16_t              reserved;          // number of blocks reserved in memory
+  mi_page_flags_t       flags;             // `in_full` and `has_aligned` flags (8 bits)
+  uint8_t               free_is_zero:1;    // `true` if the blocks in the free list are zero initialized
+  uint8_t               retire_expire:7;   // expiration count for retired blocks
+
+  mi_block_t*           free;              // list of available free blocks (`malloc` allocates from this list)
+  mi_block_t*           local_free;        // list of deferred free blocks by this thread (migrates to `free`)
+  uint16_t              used;              // number of blocks in use (including blocks in `thread_free`)
+  uint8_t               block_size_shift;  // if not zero, then `(1 << block_size_shift) == block_size` (only used for fast path in `free.c:_mi_page_ptr_unalign`)
+  uint8_t               heap_tag;          // tag of the owning heap, used to separate heaps by object type
+                                           // padding
+  size_t                block_size;        // size available in each block (always `>0`)
+  uint8_t*              page_start;        // start of the page area containing the blocks
+
+  #if (MI_ENCODE_FREELIST || MI_PADDING)
+  uintptr_t             keys[2];           // two random keys to encode the free lists (see `_mi_block_next`) or padding canary
+  #endif
+
+  _Atomic(mi_thread_free_t) xthread_free;  // list of deferred free blocks freed by other threads
+  _Atomic(uintptr_t)        xheap;
+
+  struct mi_page_s*     next;              // next page owned by this thread with the same `block_size`
+  struct mi_page_s*     prev;              // previous page owned by this thread with the same `block_size`
+
+  // 64-bit 11 words, 32-bit 13 words, (+2 for secure)
+  void* padding[1];
+} mi_page_t;
+
+
+
+// ------------------------------------------------------
+// Mimalloc segments contain mimalloc pages
+// ------------------------------------------------------
+
+typedef enum mi_page_kind_e {
+  MI_PAGE_SMALL,    // small blocks go into 64KiB pages inside a segment
+  MI_PAGE_MEDIUM,   // medium blocks go into 512KiB pages inside a segment
+  MI_PAGE_LARGE,    // larger blocks go into a single page spanning a whole segment
+  MI_PAGE_HUGE      // a huge page is a single page in a segment of variable size
+                    // used for blocks `> MI_LARGE_OBJ_SIZE_MAX` or an alignment `> MI_BLOCK_ALIGNMENT_MAX`.
+} mi_page_kind_t;
+
+typedef enum mi_segment_kind_e {
+  MI_SEGMENT_NORMAL, // MI_SEGMENT_SIZE size with pages inside.
+  MI_SEGMENT_HUGE,   // segment with just one huge page inside.
+} mi_segment_kind_t;
+
+// ------------------------------------------------------
+// A segment holds a commit mask where a bit is set if
+// the corresponding MI_COMMIT_SIZE area is committed.
+// The MI_COMMIT_SIZE must be a multiple of the slice
+// size. If it is equal we have the most fine grained
+// decommit (but setting it higher can be more efficient).
+// The MI_MINIMAL_COMMIT_SIZE is the minimal amount that will
+// be committed in one go which can be set higher than
+// MI_COMMIT_SIZE for efficiency (while the decommit mask
+// is still tracked in fine-grained MI_COMMIT_SIZE chunks)
+// ------------------------------------------------------
+
+#define MI_MINIMAL_COMMIT_SIZE      (1*MI_SEGMENT_SLICE_SIZE)
+#define MI_COMMIT_SIZE              (MI_SEGMENT_SLICE_SIZE)              // 64KiB
+#define MI_COMMIT_MASK_BITS         (MI_SEGMENT_SIZE / MI_COMMIT_SIZE)
+#define MI_COMMIT_MASK_FIELD_BITS    MI_SIZE_BITS
+#define MI_COMMIT_MASK_FIELD_COUNT  (MI_COMMIT_MASK_BITS / MI_COMMIT_MASK_FIELD_BITS)
+
+#if (MI_COMMIT_MASK_BITS != (MI_COMMIT_MASK_FIELD_COUNT * MI_COMMIT_MASK_FIELD_BITS))
+#error "the segment size must be exactly divisible by the (commit size * size_t bits)"
+#endif
+
+typedef struct mi_commit_mask_s {
+  size_t mask[MI_COMMIT_MASK_FIELD_COUNT];
+} mi_commit_mask_t;
+
+typedef mi_page_t  mi_slice_t;
+typedef int64_t    mi_msecs_t;
+
+
+// ---------------------------------------------------------------
+// a memory id tracks the provenance of arena/OS allocated memory
+// ---------------------------------------------------------------
+
+// Memory can reside in arenas, be allocated directly from the OS, or be statically allocated. The memid keeps track of this.
+typedef enum mi_memkind_e {
+  MI_MEM_NONE,      // not allocated
+  MI_MEM_EXTERNAL,  // not owned by mimalloc but provided externally (via `mi_manage_os_memory` for example)
+  MI_MEM_STATIC,    // allocated in a static area and should not be freed (for arena meta data for example)
+  MI_MEM_OS,        // allocated from the OS
+  MI_MEM_OS_HUGE,   // allocated as huge OS pages (usually 1GiB, pinned to physical memory)
+  MI_MEM_OS_REMAP,  // allocated in a remappable area (i.e. using `mremap`)
+  MI_MEM_ARENA      // allocated from an arena (the usual case)
+} mi_memkind_t;
+
+static inline bool mi_memkind_is_os(mi_memkind_t memkind) {
+  return (memkind >= MI_MEM_OS && memkind <= MI_MEM_OS_REMAP);
+}
+
+typedef struct mi_memid_os_info {
+  void*         base;               // actual base address of the block (used for offset aligned allocations)
+  size_t        size;               // full allocation size
+} mi_memid_os_info_t;
+
+typedef struct mi_memid_arena_info {
+  size_t        block_index;        // index in the arena
+  mi_arena_id_t id;                 // arena id (>= 1)
+  bool          is_exclusive;       // this arena can only be used for specific arena allocations
+} mi_memid_arena_info_t;
+
+typedef struct mi_memid_s {
+  union {
+    mi_memid_os_info_t    os;       // only used for MI_MEM_OS
+    mi_memid_arena_info_t arena;    // only used for MI_MEM_ARENA
+  } mem;
+  bool          is_pinned;          // `true` if we cannot decommit/reset/protect in this memory (e.g. when allocated using large (2MiB) or huge (1GiB) OS pages)
+  bool          initially_committed;// `true` if the memory was originally allocated as committed
+  bool          initially_zero;     // `true` if the memory was originally zero initialized
+  mi_memkind_t  memkind;
+} mi_memid_t;
+
+
+// -----------------------------------------------------------------------------------------
+// Segments are large allocated memory blocks (32MiB on 64-bit) from arenas or the OS.
+//
+// Inside segments we allocate fixed-size mimalloc pages (`mi_page_t`) that contain blocks.
+// The start of a segment is this structure with a fixed number of slice entries (`slices`)
+// usually followed by a guard OS page and the actual allocation area with pages.
+// While a page is not allocated, we view its data as a `mi_slice_t` (instead of a `mi_page_t`).
+// Of any free area, the first slice has the info and `slice_offset == 0`; for any subsequent
+// slices part of the area, the `slice_offset` is the byte offset back to the first slice
+// (so we can quickly find the page info on a free, `internal.h:_mi_segment_page_of`).
+// For slices, the `block_size` field is repurposed to signify if a slice is used (`1`) or not (`0`).
+// Small and medium pages use a fixed amount of slices to reduce slice fragmentation, while
+// large and huge pages span a variable amount of slices.
+
+typedef struct mi_subproc_s mi_subproc_t;
+
+typedef struct mi_segment_s {
+  // constant fields
+  mi_memid_t        memid;              // memory id for arena/OS allocation
+  bool              allow_decommit;     // can we decommit the memory
+  bool              allow_purge;        // can we purge the memory (reset or decommit)
+  size_t            segment_size;
+  mi_subproc_t*     subproc;            // segment belongs to sub process
+
+  // segment fields
+  mi_msecs_t        purge_expire;       // purge slices in the `purge_mask` after this time
+  mi_commit_mask_t  purge_mask;         // slices that can be purged
+  mi_commit_mask_t  commit_mask;        // slices that are currently committed
+
+  // from here is zero initialized
+  struct mi_segment_s* next;            // the list of freed segments in the cache (must be first field, see `segment.c:mi_segment_init`)
+  bool              was_reclaimed;      // true if it was reclaimed (used to limit on-free reclamation)
+  bool              dont_free;          // can be temporarily true to ensure the segment is not freed
+
+  size_t            abandoned;          // abandoned pages (i.e. the original owning thread stopped) (`abandoned <= used`)
+  size_t            abandoned_visits;   // count how often this segment is visited during abandoned reclamation (to force reclaim if it takes too long)
+  size_t            used;               // count of pages in use
+  uintptr_t         cookie;             // verify addresses in debug mode: `mi_ptr_cookie(segment) == segment->cookie`
+
+  struct mi_segment_s* abandoned_os_next; // only used for abandoned segments outside arenas, and only if `mi_option_visit_abandoned` is enabled
+  struct mi_segment_s* abandoned_os_prev;
+
+  size_t            segment_slices;      // for huge segments this may be different from `MI_SLICES_PER_SEGMENT`
+  size_t            segment_info_slices; // initial count of slices that we are using for segment info and possible guard pages.
+
+  // layout like this to optimize access in `mi_free`
+  mi_segment_kind_t kind;
+  size_t            slice_entries;       // entries in the `slices` array, at most `MI_SLICES_PER_SEGMENT`
+  _Atomic(mi_threadid_t) thread_id;      // unique id of the thread owning this segment
+
+  mi_slice_t        slices[MI_SLICES_PER_SEGMENT+1];  // one extra final entry for huge blocks with large alignment
+} mi_segment_t;
+
+
+// ------------------------------------------------------
+// Heaps
+// Provide first-class heaps to allocate from.
+// A heap just owns a set of pages for allocation and
+// can only allocate/reallocate from the thread that created it.
+// Freeing blocks can be done from any thread though.
+// Per thread, the segments are shared among its heaps.
+// Per thread, there is always a default heap that is
+// used for allocation; it is initialized to statically
+// point to an empty heap to avoid initialization checks
+// in the fast path.
+// ------------------------------------------------------
+
+// Thread local data
+typedef struct mi_tld_s mi_tld_t;
+
+// Pages of a certain block size are held in a queue.
+typedef struct mi_page_queue_s {
+  mi_page_t* first;
+  mi_page_t* last;
+  size_t     block_size;
+} mi_page_queue_t;
+
+#define MI_BIN_FULL  (MI_BIN_HUGE+1)
+
+// Random context
+typedef struct mi_random_cxt_s {
+  uint32_t input[16];
+  uint32_t output[16];
+  int      output_available;
+  bool     weak;
+} mi_random_ctx_t;
+
+
+// In debug mode there is a padding structure at the end of the blocks to check for buffer overflows
+#if (MI_PADDING)
+typedef struct mi_padding_s {
+  uint32_t canary; // encoded block value to check validity of the padding (in case of overflow)
+  uint32_t delta;  // padding bytes before the block. (mi_usable_size(p) - delta == exact allocated bytes)
+} mi_padding_t;
+#define MI_PADDING_SIZE   (sizeof(mi_padding_t))
+#define MI_PADDING_WSIZE  ((MI_PADDING_SIZE + MI_INTPTR_SIZE - 1) / MI_INTPTR_SIZE)
+#else
+#define MI_PADDING_SIZE   0
+#define MI_PADDING_WSIZE  0
+#endif
+
+#define MI_PAGES_DIRECT   (MI_SMALL_WSIZE_MAX + MI_PADDING_WSIZE + 1)
+
+
+// A heap owns a set of pages.
+struct mi_heap_s {
+  mi_tld_t*             tld;
+  _Atomic(mi_block_t*)  thread_delayed_free;
+  mi_threadid_t         thread_id;                           // thread this heap belongs to
+  mi_arena_id_t         arena_id;                            // arena id if the heap belongs to a specific arena (or 0)
+  uintptr_t             cookie;                              // random cookie to verify pointers (see `_mi_ptr_cookie`)
+  uintptr_t             keys[2];                             // two random keys used to encode the `thread_delayed_free` list
+  mi_random_ctx_t       random;                              // random number context used for secure allocation
+  size_t                page_count;                          // total number of pages in the `pages` queues.
+  size_t                page_retired_min;                    // smallest retired index (retired pages are fully free, but still in the page queues)
+  size_t                page_retired_max;                    // largest retired index into the `pages` array.
+  long                  generic_count;                       // how often is `_mi_malloc_generic` called?
+  long                  generic_collect_count;               // how often is `_mi_malloc_generic` called without collecting?
+  mi_heap_t*            next;                                // list of heaps per thread
+  bool                  no_reclaim;                          // `true` if this heap should not reclaim abandoned pages
+  uint8_t               tag;                                 // custom tag, can be used for separating heaps based on the object types
+  #if MI_GUARDED
+  size_t                guarded_size_min;                    // minimal size for guarded objects
+  size_t                guarded_size_max;                    // maximal size for guarded objects
+  size_t                guarded_sample_rate;                 // sample rate (set to 0 to disable guarded pages)
+  size_t                guarded_sample_count;                // current sample count (counting down to 0)
+  #endif
+  mi_page_t*            pages_free_direct[MI_PAGES_DIRECT];  // optimize: array where every entry points to a page with possibly free blocks in the corresponding queue for that size.
+  mi_page_queue_t       pages[MI_BIN_FULL + 1];              // queue of pages for each size class (or "bin")
+};
+
+
+// ------------------------------------------------------
+// Sub processes do not reclaim or visit segments
+// from other sub processes. These are essentially the
+// static variables of a process.
+// ------------------------------------------------------
+
+struct mi_subproc_s {
+  _Atomic(size_t)    abandoned_count;         // count of abandoned segments for this sub-process
+  _Atomic(size_t)    abandoned_os_list_count; // count of abandoned segments in the os-list
+  mi_lock_t          abandoned_os_lock;       // lock for the abandoned os segment list (outside of arenas) (this lock protects list operations)
+  mi_lock_t          abandoned_os_visit_lock; // ensure only one thread per subproc visits the abandoned os list
+  mi_segment_t*      abandoned_os_list;       // doubly-linked list of abandoned segments outside of arenas (in OS allocated memory)
+  mi_segment_t*      abandoned_os_list_tail;  // the tail-end of the list
+  mi_memid_t         memid;                   // provenance of this memory block
+};
+
+
+// ------------------------------------------------------
+// Thread Local data
+// ------------------------------------------------------
+
+// A "span" is an available range of slices. The span queues keep
+// track of slice spans of at most the given `slice_count` (but more than the previous size class).
+typedef struct mi_span_queue_s {
+  mi_slice_t* first;
+  mi_slice_t* last;
+  size_t      slice_count;
+} mi_span_queue_t;
+
+#define MI_SEGMENT_BIN_MAX (35)     // 35 == mi_segment_bin(MI_SLICES_PER_SEGMENT)
+
+// Segments thread local data
+typedef struct mi_segments_tld_s {
+  mi_span_queue_t     spans[MI_SEGMENT_BIN_MAX+1];  // free slice spans inside segments
+  size_t              count;        // current number of segments;
+  size_t              peak_count;   // peak number of segments
+  size_t              current_size; // current size of all segments
+  size_t              peak_size;    // peak size of all segments
+  size_t              reclaim_count;// number of reclaimed (abandoned) segments
+  mi_subproc_t*       subproc;      // sub-process this thread belongs to.
+  mi_stats_t*         stats;        // points to tld stats
+} mi_segments_tld_t;
+
+// Thread local data
+struct mi_tld_s {
+  unsigned long long  heartbeat;     // monotonic heartbeat count
+  bool                recurse;       // true if deferred was called; used to prevent infinite recursion.
+  mi_heap_t*          heap_backing;  // backing heap of this thread (cannot be deleted)
+  mi_heap_t*          heaps;         // list of heaps in this thread (so we can abandon all when the thread terminates)
+  mi_segments_tld_t   segments;      // segment tld
+  mi_stats_t          stats;         // statistics
+};
+
+
+// ------------------------------------------------------
+// Debug
+// ------------------------------------------------------
+
+#if !defined(MI_DEBUG_UNINIT)
+#define MI_DEBUG_UNINIT     (0xD0)
+#endif
+#if !defined(MI_DEBUG_FREED)
+#define MI_DEBUG_FREED      (0xDF)
+#endif
+#if !defined(MI_DEBUG_PADDING)
+#define MI_DEBUG_PADDING    (0xDE)
+#endif
+
+
+// ------------------------------------------------------
+// Statistics
+// ------------------------------------------------------
+#ifndef MI_STAT
+#if (MI_DEBUG>0)
+#define MI_STAT 2
+#else
+#define MI_STAT 0
+#endif
+#endif
+
+// add to stat keeping track of the peak
+void _mi_stat_increase(mi_stat_count_t* stat, size_t amount);
+void _mi_stat_decrease(mi_stat_count_t* stat, size_t amount);
+void _mi_stat_adjust_decrease(mi_stat_count_t* stat, size_t amount);
+// counters can just be increased
+void _mi_stat_counter_increase(mi_stat_counter_t* stat, size_t amount);
+
+#if (MI_STAT)
+#define mi_stat_increase(stat,amount)         _mi_stat_increase( &(stat), amount)
+#define mi_stat_decrease(stat,amount)         _mi_stat_decrease( &(stat), amount)
+#define mi_stat_adjust_decrease(stat,amount)  _mi_stat_adjust_decrease( &(stat), amount)
+#define mi_stat_counter_increase(stat,amount) _mi_stat_counter_increase( &(stat), amount)
+#else
+#define mi_stat_increase(stat,amount)         ((void)0)
+#define mi_stat_decrease(stat,amount)         ((void)0)
+#define mi_stat_adjust_decrease(stat,amount)  ((void)0)
+#define mi_stat_counter_increase(stat,amount) ((void)0)
+#endif
+
+#define mi_heap_stat_counter_increase(heap,stat,amount)  mi_stat_counter_increase( (heap)->tld->stats.stat, amount)
+#define mi_heap_stat_increase(heap,stat,amount)  mi_stat_increase( (heap)->tld->stats.stat, amount)
+#define mi_heap_stat_decrease(heap,stat,amount)  mi_stat_decrease( (heap)->tld->stats.stat, amount)
+#define mi_heap_stat_adjust_decrease(heap,stat,amount)  mi_stat_adjust_decrease( (heap)->tld->stats.stat, amount)
+
+#endif
diff --git a/compat/mimalloc/options.c b/compat/mimalloc/options.c
new file mode 100644
index 00000000000000..b07f029e65dd29
--- /dev/null
+++ b/compat/mimalloc/options.c
@@ -0,0 +1,670 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2021, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+#include "mimalloc.h"
+#include "mimalloc/internal.h"
+#include "mimalloc/atomic.h"
+#include "mimalloc/prim.h"  // mi_prim_out_stderr
+
+#include <stdio.h>      // stdin/stdout
+#include <stdlib.h>     // abort
+
+
+
+static long mi_max_error_count   = 16; // stop outputting errors after this (use < 0 for no limit)
+static long mi_max_warning_count = 16; // stop outputting warnings after this (use < 0 for no limit)
+
+static void mi_add_stderr_output(void);
+
+int mi_version(void) mi_attr_noexcept {
+  return MI_MALLOC_VERSION;
+}
+
+
+// --------------------------------------------------------
+// Options
+// These can be accessed by multiple threads and may be
+// concurrently initialized, but an initializing data race
+// is ok since they resolve to the same value.
+// --------------------------------------------------------
+typedef enum mi_init_e {
+  UNINIT,       // not yet initialized
+  DEFAULTED,    // not found in the environment, use default value
+  INITIALIZED   // found in environment or set explicitly
+} mi_init_t;
+
+typedef struct mi_option_desc_s {
+  long        value;  // the value
+  mi_init_t   init;   // is it initialized yet? (from the environment)
+  mi_option_t option; // for debugging: the option index should match the option
+  const char* name;   // option name without `mimalloc_` prefix
+  const char* legacy_name; // potential legacy option name
+} mi_option_desc_t;
+
+#define MI_OPTION(opt)                  mi_option_##opt, #opt, NULL
+#define MI_OPTION_LEGACY(opt,legacy)    mi_option_##opt, #opt, #legacy
+
+// Some options can be set at build time for statically linked libraries
+// (use `-DMI_EXTRA_CPPDEFS="opt1=val1;opt2=val2"`)
+//
+// This is useful if we cannot pass them as environment variables
+// (and setting them programmatically would be too late)
+
+#ifndef MI_DEFAULT_VERBOSE
+#define MI_DEFAULT_VERBOSE 0
+#endif
+
+#ifndef MI_DEFAULT_EAGER_COMMIT
+#define MI_DEFAULT_EAGER_COMMIT 1
+#endif
+
+#ifndef MI_DEFAULT_ARENA_EAGER_COMMIT
+#define MI_DEFAULT_ARENA_EAGER_COMMIT 2
+#endif
+
+// in KiB
+#ifndef MI_DEFAULT_ARENA_RESERVE
+ #if (MI_INTPTR_SIZE>4)
+  #define MI_DEFAULT_ARENA_RESERVE 1024L*1024L
+ #else
+  #define MI_DEFAULT_ARENA_RESERVE 128L*1024L
+ #endif
+#endif
+
+#ifndef MI_DEFAULT_DISALLOW_ARENA_ALLOC
+#define MI_DEFAULT_DISALLOW_ARENA_ALLOC 0
+#endif
+
+#ifndef MI_DEFAULT_ALLOW_LARGE_OS_PAGES
+#define MI_DEFAULT_ALLOW_LARGE_OS_PAGES 0
+#endif
+
+#ifndef MI_DEFAULT_RESERVE_HUGE_OS_PAGES
+#define MI_DEFAULT_RESERVE_HUGE_OS_PAGES 0
+#endif
+
+#ifndef MI_DEFAULT_RESERVE_OS_MEMORY
+#define MI_DEFAULT_RESERVE_OS_MEMORY 0
+#endif
+
+#ifndef MI_DEFAULT_GUARDED_SAMPLE_RATE
+#if MI_GUARDED
+#define MI_DEFAULT_GUARDED_SAMPLE_RATE 4000
+#else
+#define MI_DEFAULT_GUARDED_SAMPLE_RATE 0
+#endif
+#endif
+
+
+#ifndef MI_DEFAULT_ALLOW_THP
+#if defined(__ANDROID__)
+#define MI_DEFAULT_ALLOW_THP  0
+#else
+#define MI_DEFAULT_ALLOW_THP  1
+#endif
+#endif
+
+// Static options
+static mi_option_desc_t options[_mi_option_last] =
+{
+  // stable options
+  #if MI_DEBUG || defined(MI_SHOW_ERRORS)
+  { 1, UNINIT, MI_OPTION(show_errors) },
+  #else
+  { 0, UNINIT, MI_OPTION(show_errors) },
+  #endif
+  { 0, UNINIT, MI_OPTION(show_stats) },
+  { MI_DEFAULT_VERBOSE, UNINIT, MI_OPTION(verbose) },
+
+  // some of the following options are experimental and not all combinations are allowed.
+  { MI_DEFAULT_EAGER_COMMIT,
+       UNINIT, MI_OPTION(eager_commit) },               // commit per segment directly (4MiB)  (but see also `eager_commit_delay`)
+  { MI_DEFAULT_ARENA_EAGER_COMMIT,
+       UNINIT, MI_OPTION_LEGACY(arena_eager_commit,eager_region_commit) }, // eagerly commit arenas? 2 enables this only on an OS that has overcommit (e.g. Linux)
+  { 1, UNINIT, MI_OPTION_LEGACY(purge_decommits,reset_decommits) },        // purge decommits memory (instead of reset) (note: on linux this uses MADV_DONTNEED for decommit)
+  { MI_DEFAULT_ALLOW_LARGE_OS_PAGES,
+       UNINIT, MI_OPTION_LEGACY(allow_large_os_pages,large_os_pages) },    // use large OS pages, use only with eager commit to prevent fragmentation of VMA's
+  { MI_DEFAULT_RESERVE_HUGE_OS_PAGES,
+       UNINIT, MI_OPTION(reserve_huge_os_pages) },      // per 1GiB huge pages
+  {-1, UNINIT, MI_OPTION(reserve_huge_os_pages_at) },   // reserve huge pages at node N
+  { MI_DEFAULT_RESERVE_OS_MEMORY,
+       UNINIT, MI_OPTION(reserve_os_memory)     },      // reserve N KiB OS memory in advance (use `option_get_size`)
+  { 0, UNINIT, MI_OPTION(deprecated_segment_cache) },   // cache N segments per thread
+  { 0, UNINIT, MI_OPTION(deprecated_page_reset) },      // reset page memory on free
+  { 0, UNINIT, MI_OPTION_LEGACY(abandoned_page_purge,abandoned_page_reset) },       // reset free page memory when a thread terminates
+  { 0, UNINIT, MI_OPTION(deprecated_segment_reset) },   // reset segment memory on free (needs eager commit)
+#if defined(__NetBSD__)
+  { 0, UNINIT, MI_OPTION(eager_commit_delay) },         // the first N segments per thread are not eagerly committed
+#else
+  { 1, UNINIT, MI_OPTION(eager_commit_delay) },         // the first N segments per thread are not eagerly committed (but per page in the segment on demand)
+#endif
+  { 10,  UNINIT, MI_OPTION_LEGACY(purge_delay,reset_delay) },  // purge delay in milliseconds
+  { 0,   UNINIT, MI_OPTION(use_numa_nodes) },           // 0 = use available numa nodes, otherwise use at most N nodes.
+  { 0,   UNINIT, MI_OPTION_LEGACY(disallow_os_alloc,limit_os_alloc) },           // 1 = do not use OS memory for allocation (but only reserved arenas)
+  { 100, UNINIT, MI_OPTION(os_tag) },                   // only apple specific for now but might serve more or less related purpose
+  { 32,  UNINIT, MI_OPTION(max_errors) },               // maximum errors that are output
+  { 32,  UNINIT, MI_OPTION(max_warnings) },             // maximum warnings that are output
+  { 10,  UNINIT, MI_OPTION(max_segment_reclaim)},       // max. percentage of the abandoned segments to be reclaimed per try.
+  { 0,   UNINIT, MI_OPTION(destroy_on_exit)},           // release all OS memory on process exit; careful with dangling pointer or after-exit frees!
+  { MI_DEFAULT_ARENA_RESERVE, UNINIT, MI_OPTION(arena_reserve) }, // reserve memory N KiB at a time (=1GiB) (use `option_get_size`)
+  { 10,  UNINIT, MI_OPTION(arena_purge_mult) },         // purge delay multiplier for arenas
+  { 1,   UNINIT, MI_OPTION_LEGACY(purge_extend_delay, decommit_extend_delay) },
+  { 0,   UNINIT, MI_OPTION(abandoned_reclaim_on_free) },// reclaim an abandoned segment on a free
+  { MI_DEFAULT_DISALLOW_ARENA_ALLOC,   UNINIT, MI_OPTION(disallow_arena_alloc) }, // 1 = do not use arenas for allocation (except if using specific arena ids)
+  { 400, UNINIT, MI_OPTION(retry_on_oom) },             // Windows only: retry on out-of-memory for N milliseconds (=400); set to 0 to disable retries.
+#if defined(MI_VISIT_ABANDONED)
+  { 1,   INITIALIZED, MI_OPTION(visit_abandoned) },     // allow visiting heap blocks in abandoned segments; requires taking locks during reclaim.
+#else
+  { 0,   UNINIT, MI_OPTION(visit_abandoned) },
+#endif
+  { 0,   UNINIT, MI_OPTION(guarded_min) },              // only used when building with MI_GUARDED: minimal rounded object size for guarded objects
+  { MI_GiB, UNINIT, MI_OPTION(guarded_max) },           // only used when building with MI_GUARDED: maximal rounded object size for guarded objects
+  { 0,   UNINIT, MI_OPTION(guarded_precise) },          // disregard minimal alignment requirement to always place guarded blocks exactly in front of a guard page (=0)
+  { MI_DEFAULT_GUARDED_SAMPLE_RATE,
+         UNINIT, MI_OPTION(guarded_sample_rate)},       // 1 out of N allocations in the min/max range will be guarded (=4000)
+  { 0,   UNINIT, MI_OPTION(guarded_sample_seed)},
+  { 0,   UNINIT, MI_OPTION(target_segments_per_thread) }, // abandon segments beyond this point, or 0 to disable.
+  { 10000, UNINIT, MI_OPTION(generic_collect) },          // collect heaps every N (=10000) generic allocation calls
+  { MI_DEFAULT_ALLOW_THP,
+         UNINIT, MI_OPTION(allow_thp) }                 // allow transparent huge pages?
+};
+
+static void mi_option_init(mi_option_desc_t* desc);
+
+static bool mi_option_has_size_in_kib(mi_option_t option) {
+  return (option == mi_option_reserve_os_memory || option == mi_option_arena_reserve);
+}
+
+void _mi_options_init(void) {
+  // called on process load
+  mi_add_stderr_output(); // now it is safe to use stderr for output
+  for(int i = 0; i < _mi_option_last; i++ ) {
+    mi_option_t option = (mi_option_t)i;
+    long l = mi_option_get(option); MI_UNUSED(l); // initialize
+  }
+  mi_max_error_count = mi_option_get(mi_option_max_errors);
+  mi_max_warning_count = mi_option_get(mi_option_max_warnings);
+  #if MI_GUARDED
+  if (mi_option_get(mi_option_guarded_sample_rate) > 0) {
+    if (mi_option_is_enabled(mi_option_allow_large_os_pages)) {
+      mi_option_disable(mi_option_allow_large_os_pages);
+      _mi_warning_message("option 'allow_large_os_pages' is disabled to allow for guarded objects\n");
+    }
+  }
+  #endif
+  if (mi_option_is_enabled(mi_option_verbose)) { mi_options_print(); }
+}
+
+#define mi_stringifyx(str)  #str                // and stringify
+#define mi_stringify(str)   mi_stringifyx(str)  // expand
+
+void mi_options_print(void) mi_attr_noexcept
+{
+  // show version
+  const int vermajor = MI_MALLOC_VERSION/100;
+  const int verminor = (MI_MALLOC_VERSION%100)/10;
+  const int verpatch = (MI_MALLOC_VERSION%10);
+  _mi_message("v%i.%i.%i%s%s (built on %s, %s)\n", vermajor, verminor, verpatch,
+      #if defined(MI_CMAKE_BUILD_TYPE)
+      ", " mi_stringify(MI_CMAKE_BUILD_TYPE)
+      #else
+      ""
+      #endif
+      ,
+      #if defined(MI_GIT_DESCRIBE)
+      ", git " mi_stringify(MI_GIT_DESCRIBE)
+      #else
+      ""
+      #endif
+      , __DATE__, __TIME__);
+
+  // show options
+  for (int i = 0; i < _mi_option_last; i++) {
+    mi_option_t option = (mi_option_t)i;
+    long l = mi_option_get(option); MI_UNUSED(l); // possibly initialize
+    mi_option_desc_t* desc = &options[option];
+    _mi_message("option '%s': %ld %s\n", desc->name, desc->value, (mi_option_has_size_in_kib(option) ? "KiB" : ""));
+  }
+
+  // show build configuration
+  _mi_message("debug level : %d\n", MI_DEBUG );
+  _mi_message("secure level: %d\n", MI_SECURE );
+  _mi_message("mem tracking: %s\n", MI_TRACK_TOOL);
+  #if MI_GUARDED
+  _mi_message("guarded build: %s\n", mi_option_get(mi_option_guarded_sample_rate) != 0 ? "enabled" : "disabled");
+  #endif
+  #if MI_TSAN
+  _mi_message("thread sanitizer enabled\n");
+  #endif
+}
+
+long _mi_option_get_fast(mi_option_t option) {
+  mi_assert(option >= 0 && option < _mi_option_last);
+  mi_option_desc_t* desc = &options[option];
+  mi_assert(desc->option == option);  // index should match the option
+  //mi_assert(desc->init != UNINIT);
+  return desc->value;
+}
+
+
+mi_decl_nodiscard long mi_option_get(mi_option_t option) {
+  mi_assert(option >= 0 && option < _mi_option_last);
+  if (option < 0 || option >= _mi_option_last) return 0;
+  mi_option_desc_t* desc = &options[option];
+  mi_assert(desc->option == option);  // index should match the option
+  if mi_unlikely(desc->init == UNINIT) {
+    mi_option_init(desc);
+  }
+  return desc->value;
+}
+
+mi_decl_nodiscard long mi_option_get_clamp(mi_option_t option, long min, long max) {
+  long x = mi_option_get(option);
+  return (x < min ? min : (x > max ? max : x));
+}
+
+mi_decl_nodiscard size_t mi_option_get_size(mi_option_t option) {
+  const long x = mi_option_get(option);
+  size_t size = (x < 0 ? 0 : (size_t)x);
+  if (mi_option_has_size_in_kib(option)) {
+    size *= MI_KiB;
+  }
+  return size;
+}
+
+void mi_option_set(mi_option_t option, long value) {
+  mi_assert(option >= 0 && option < _mi_option_last);
+  if (option < 0 || option >= _mi_option_last) return;
+  mi_option_desc_t* desc = &options[option];
+  mi_assert(desc->option == option);  // index should match the option
+  desc->value = value;
+  desc->init = INITIALIZED;
+  // ensure min/max range; be careful to not recurse.
+  if (desc->option == mi_option_guarded_min && _mi_option_get_fast(mi_option_guarded_max) < value) {
+    mi_option_set(mi_option_guarded_max, value);
+  }
+  else if (desc->option == mi_option_guarded_max && _mi_option_get_fast(mi_option_guarded_min) > value) {
+    mi_option_set(mi_option_guarded_min, value);
+  }
+}
+
+void mi_option_set_default(mi_option_t option, long value) {
+  mi_assert(option >= 0 && option < _mi_option_last);
+  if (option < 0 || option >= _mi_option_last) return;
+  mi_option_desc_t* desc = &options[option];
+  if (desc->init != INITIALIZED) {
+    desc->value = value;
+  }
+}
+
+mi_decl_nodiscard bool mi_option_is_enabled(mi_option_t option) {
+  return (mi_option_get(option) != 0);
+}
+
+void mi_option_set_enabled(mi_option_t option, bool enable) {
+  mi_option_set(option, (enable ? 1 : 0));
+}
+
+void mi_option_set_enabled_default(mi_option_t option, bool enable) {
+  mi_option_set_default(option, (enable ? 1 : 0));
+}
+
+void mi_option_enable(mi_option_t option) {
+  mi_option_set_enabled(option,true);
+}
+
+void mi_option_disable(mi_option_t option) {
+  mi_option_set_enabled(option,false);
+}
+
+static void mi_cdecl mi_out_stderr(const char* msg, void* arg) {
+  MI_UNUSED(arg);
+  if (msg != NULL && msg[0] != 0) {
+    _mi_prim_out_stderr(msg);
+  }
+}
+
+// Since an output function can be registered earliest in the `main`
+// function we also buffer output that happens earlier. When
+// an output function is registered it is called immediately with
+// the output up to that point.
+#ifndef MI_MAX_DELAY_OUTPUT
+#define MI_MAX_DELAY_OUTPUT ((size_t)(16*1024))
+#endif
+static char out_buf[MI_MAX_DELAY_OUTPUT+1];
+static _Atomic(size_t) out_len;
+
+static void mi_cdecl mi_out_buf(const char* msg, void* arg) {
+  MI_UNUSED(arg);
+  if (msg==NULL) return;
+  if (mi_atomic_load_relaxed(&out_len)>=MI_MAX_DELAY_OUTPUT) return;
+  size_t n = _mi_strlen(msg);
+  if (n==0) return;
+  // claim space
+  size_t start = mi_atomic_add_acq_rel(&out_len, n);
+  if (start >= MI_MAX_DELAY_OUTPUT) return;
+  // check bound
+  if (start+n >= MI_MAX_DELAY_OUTPUT) {
+    n = MI_MAX_DELAY_OUTPUT-start-1;
+  }
+  _mi_memcpy(&out_buf[start], msg, n);
+}
+
+static void mi_out_buf_flush(mi_output_fun* out, bool no_more_buf, void* arg) {
+  if (out==NULL) return;
+  // claim (if `no_more_buf == true`, no more output will be added after this point)
+  size_t count = mi_atomic_add_acq_rel(&out_len, (no_more_buf ? MI_MAX_DELAY_OUTPUT : 1));
+  // and output the current contents
+  if (count>MI_MAX_DELAY_OUTPUT) count = MI_MAX_DELAY_OUTPUT;
+  out_buf[count] = 0;
+  out(out_buf,arg);
+  if (!no_more_buf) {
+    out_buf[count] = '\n'; // if we continue with the buffer, insert a newline
+  }
+}
+
+
+// Once this module is loaded, switch to this routine
+// which outputs to stderr and the delayed output buffer.
+static void mi_cdecl mi_out_buf_stderr(const char* msg, void* arg) {
+  mi_out_stderr(msg,arg);
+  mi_out_buf(msg,arg);
+}
+
+
+
+// --------------------------------------------------------
+// Default output handler
+// --------------------------------------------------------
+
+// Should be atomic but gives errors on many platforms as generally we cannot cast a function pointer to a uintptr_t.
+// For now, don't register output from multiple threads.
+static mi_output_fun* volatile mi_out_default; // = NULL
+static _Atomic(void*) mi_out_arg; // = NULL
+
+static mi_output_fun* mi_out_get_default(void** parg) {
+  if (parg != NULL) { *parg = mi_atomic_load_ptr_acquire(void,&mi_out_arg); }
+  mi_output_fun* out = mi_out_default;
+  return (out == NULL ? &mi_out_buf : out);
+}
+
+void mi_register_output(mi_output_fun* out, void* arg) mi_attr_noexcept {
+  mi_out_default = (out == NULL ? &mi_out_stderr : out); // stop using the delayed output buffer
+  mi_atomic_store_ptr_release(void,&mi_out_arg, arg);
+  if (out!=NULL) mi_out_buf_flush(out,true,arg);         // output all the delayed output now
+}
+
+// add stderr to the delayed output after the module is loaded
+static void mi_add_stderr_output(void) {
+  mi_assert_internal(mi_out_default == NULL);
+  mi_out_buf_flush(&mi_out_stderr, false, NULL); // flush current contents to stderr
+  mi_out_default = &mi_out_buf_stderr;           // and add stderr to the delayed output
+}
+
+// --------------------------------------------------------
+// Messages, all end up calling `_mi_fputs`.
+// --------------------------------------------------------
+static _Atomic(size_t) error_count;   // = 0;  // when >= max_error_count stop emitting errors
+static _Atomic(size_t) warning_count; // = 0;  // when >= max_warning_count stop emitting warnings
+
+// When overriding malloc, we may recurse into mi_vfprintf if an allocation
+// inside the C runtime causes another message.
+// In some cases (like on macOS) the loader already allocates which
+// calls into mimalloc; if we then access thread locals (like `recurse`)
+// this may crash as the access may call _tlv_bootstrap that tries to
+// (recursively) invoke malloc again to allocate space for the thread local
+// variables on demand. This is why we use a _mi_preloading test on such
+// platforms. However, the C code generator may move the initial thread local address
+// load before the `if` and we therefore split it out in a separate function.
+static mi_decl_thread bool recurse = false;
+
+static mi_decl_noinline bool mi_recurse_enter_prim(void) {
+  if (recurse) return false;
+  recurse = true;
+  return true;
+}
+
+static mi_decl_noinline void mi_recurse_exit_prim(void) {
+  recurse = false;
+}
+
+static bool mi_recurse_enter(void) {
+  #if defined(__APPLE__) || defined(__ANDROID__) || defined(MI_TLS_RECURSE_GUARD)
+  if (_mi_preloading()) return false;
+  #endif
+  return mi_recurse_enter_prim();
+}
+
+static void mi_recurse_exit(void) {
+  #if defined(__APPLE__) || defined(__ANDROID__) || defined(MI_TLS_RECURSE_GUARD)
+  if (_mi_preloading()) return;
+  #endif
+  mi_recurse_exit_prim();
+}
+
+void _mi_fputs(mi_output_fun* out, void* arg, const char* prefix, const char* message) {
+  if (out==NULL || (void*)out==(void*)stdout || (void*)out==(void*)stderr) { // TODO: use mi_out_stderr for stderr?
+    if (!mi_recurse_enter()) return;
+    out = mi_out_get_default(&arg);
+    if (prefix != NULL) out(prefix, arg);
+    out(message, arg);
+    mi_recurse_exit();
+  }
+  else {
+    if (prefix != NULL) out(prefix, arg);
+    out(message, arg);
+  }
+}
+
+// Define our own limited `fprintf` that avoids memory allocation.
+// We do this using `_mi_vsnprintf` with a limited buffer.
+static void mi_vfprintf( mi_output_fun* out, void* arg, const char* prefix, const char* fmt, va_list args ) {
+  char buf[512];
+  if (fmt==NULL) return;
+  if (!mi_recurse_enter()) return;
+  _mi_vsnprintf(buf, sizeof(buf)-1, fmt, args);
+  mi_recurse_exit();
+  _mi_fputs(out,arg,prefix,buf);
+}
+
+void _mi_fprintf( mi_output_fun* out, void* arg, const char* fmt, ... ) {
+  va_list args;
+  va_start(args,fmt);
+  mi_vfprintf(out,arg,NULL,fmt,args);
+  va_end(args);
+}
+
+static void mi_vfprintf_thread(mi_output_fun* out, void* arg, const char* prefix, const char* fmt, va_list args) {
+  if (prefix != NULL && _mi_strnlen(prefix,33) <= 32 && !_mi_is_main_thread()) {
+    char tprefix[64];
+    _mi_snprintf(tprefix, sizeof(tprefix), "%sthread 0x%tx: ", prefix, (uintptr_t)_mi_thread_id());
+    mi_vfprintf(out, arg, tprefix, fmt, args);
+  }
+  else {
+    mi_vfprintf(out, arg, prefix, fmt, args);
+  }
+}
+
+void _mi_message(const char* fmt, ...) {
+  va_list args;
+  va_start(args, fmt);
+  mi_vfprintf_thread(NULL, NULL, "mimalloc: ", fmt, args);
+  va_end(args);
+}
+
+void _mi_trace_message(const char* fmt, ...) {
+  if (mi_option_get(mi_option_verbose) <= 1) return;  // only with verbose level 2 or higher
+  va_list args;
+  va_start(args, fmt);
+  mi_vfprintf_thread(NULL, NULL, "mimalloc: ", fmt, args);
+  va_end(args);
+}
+
+void _mi_verbose_message(const char* fmt, ...) {
+  if (!mi_option_is_enabled(mi_option_verbose)) return;
+  va_list args;
+  va_start(args,fmt);
+  mi_vfprintf(NULL, NULL, "mimalloc: ", fmt, args);
+  va_end(args);
+}
+
+static void mi_show_error_message(const char* fmt, va_list args) {
+  if (!mi_option_is_enabled(mi_option_verbose)) {
+    if (!mi_option_is_enabled(mi_option_show_errors)) return;
+    if (mi_max_error_count >= 0 && (long)mi_atomic_increment_acq_rel(&error_count) > mi_max_error_count) return;
+  }
+  mi_vfprintf_thread(NULL, NULL, "mimalloc: error: ", fmt, args);
+}
+
+void _mi_warning_message(const char* fmt, ...) {
+  if (!mi_option_is_enabled(mi_option_verbose)) {
+    if (!mi_option_is_enabled(mi_option_show_errors)) return;
+    if (mi_max_warning_count >= 0 && (long)mi_atomic_increment_acq_rel(&warning_count) > mi_max_warning_count) return;
+  }
+  va_list args;
+  va_start(args,fmt);
+  mi_vfprintf_thread(NULL, NULL, "mimalloc: warning: ", fmt, args);
+  va_end(args);
+}
+
+
+#if MI_DEBUG
+mi_decl_noreturn mi_decl_cold void _mi_assert_fail(const char* assertion, const char* fname, unsigned line, const char* func ) mi_attr_noexcept {
+  _mi_fprintf(NULL, NULL, "mimalloc: assertion failed: at \"%s\":%u, %s\n  assertion: \"%s\"\n", fname, line, (func==NULL?"":func), assertion);
+  abort();
+}
+#endif
+
+// --------------------------------------------------------
+// Errors
+// --------------------------------------------------------
+
+static mi_error_fun* volatile  mi_error_handler; // = NULL
+static _Atomic(void*) mi_error_arg;     // = NULL
+
+static void mi_error_default(int err) {
+  MI_UNUSED(err);
+#if (MI_DEBUG>0)
+  if (err==EFAULT) {
+    #ifdef _MSC_VER
+    __debugbreak();
+    #endif
+    abort();
+  }
+#endif
+#if (MI_SECURE>0)
+  if (err==EFAULT) {  // abort on serious errors in secure mode (corrupted meta-data)
+    abort();
+  }
+#endif
+#if defined(MI_XMALLOC)
+  if (err==ENOMEM || err==EOVERFLOW) { // abort on memory allocation fails in xmalloc mode
+    abort();
+  }
+#endif
+}
+
+void mi_register_error(mi_error_fun* fun, void* arg) {
+  mi_error_handler = fun;  // can be NULL
+  mi_atomic_store_ptr_release(void,&mi_error_arg, arg);
+}
+
+void _mi_error_message(int err, const char* fmt, ...) {
+  // show detailed error message
+  va_list args;
+  va_start(args, fmt);
+  mi_show_error_message(fmt, args);
+  va_end(args);
+  // and call the error handler which may abort (or return normally)
+  if (mi_error_handler != NULL) {
+    mi_error_handler(err, mi_atomic_load_ptr_acquire(void,&mi_error_arg));
+  }
+  else {
+    mi_error_default(err);
+  }
+}
+
+// --------------------------------------------------------
+// Initialize options by checking the environment
+// --------------------------------------------------------
+
+// TODO: implement ourselves to reduce dependencies on the C runtime
+#include <stdlib.h> // strtol
+#include <string.h> // strstr
+
+
+static void mi_option_init(mi_option_desc_t* desc) {
+  // Read option value from the environment
+  char s[64 + 1];
+  char buf[64+1];
+  _mi_strlcpy(buf, "mimalloc_", sizeof(buf));
+  _mi_strlcat(buf, desc->name, sizeof(buf));
+  bool found = _mi_getenv(buf, s, sizeof(s));
+  if (!found && desc->legacy_name != NULL) {
+    _mi_strlcpy(buf, "mimalloc_", sizeof(buf));
+    _mi_strlcat(buf, desc->legacy_name, sizeof(buf));
+    found = _mi_getenv(buf, s, sizeof(s));
+    if (found) {
+      _mi_warning_message("environment option \"mimalloc_%s\" is deprecated -- use \"mimalloc_%s\" instead.\n", desc->legacy_name, desc->name);
+    }
+  }
+
+  if (found) {
+    size_t len = _mi_strnlen(s, sizeof(buf) - 1);
+    for (size_t i = 0; i < len; i++) {
+      buf[i] = _mi_toupper(s[i]);
+    }
+    buf[len] = 0;
+    if (buf[0] == 0 || strstr("1;TRUE;YES;ON", buf) != NULL) {
+      desc->value = 1;
+      desc->init = INITIALIZED;
+    }
+    else if (strstr("0;FALSE;NO;OFF", buf) != NULL) {
+      desc->value = 0;
+      desc->init = INITIALIZED;
+    }
+    else {
+      char* end = buf;
+      long value = strtol(buf, &end, 10);
+      if (mi_option_has_size_in_kib(desc->option)) {
+        // this option is interpreted in KiB to prevent overflow of `long` for large allocations
+        // (long is 32-bit on 64-bit windows, which allows for 4TiB max.)
+        size_t size = (value < 0 ? 0 : (size_t)value);
+        bool overflow = false;
+        if (*end == 'K') { end++; }
+        else if (*end == 'M') { overflow = mi_mul_overflow(size,MI_KiB,&size); end++; }
+        else if (*end == 'G') { overflow = mi_mul_overflow(size,MI_MiB,&size); end++; }
+        else if (*end == 'T') { overflow = mi_mul_overflow(size,MI_GiB,&size); end++; }
+        else { size = (size + MI_KiB - 1) / MI_KiB; }
+        if (end[0] == 'I' && end[1] == 'B') { end += 2; } // KiB, MiB, GiB, TiB
+        else if (*end == 'B') { end++; }                  // Kb, Mb, Gb, Tb
+        if (overflow || size > MI_MAX_ALLOC_SIZE) { size = (MI_MAX_ALLOC_SIZE / MI_KiB); }
+        value = (size > LONG_MAX ? LONG_MAX : (long)size);
+      }
+      if (*end == 0) {
+        mi_option_set(desc->option, value);
+      }
+      else {
+        // set `init` first to avoid recursion through _mi_warning_message on mimalloc_verbose.
+        desc->init = DEFAULTED;
+        if (desc->option == mi_option_verbose && desc->value == 0) {
+          // if the 'mimalloc_verbose' env var has a bogus value we'd never know
+          // (since the value defaults to 'off') so in that case briefly enable verbose
+          desc->value = 1;
+          _mi_warning_message("environment option mimalloc_%s has an invalid value.\n", desc->name);
+          desc->value = 0;
+        }
+        else {
+          _mi_warning_message("environment option mimalloc_%s has an invalid value.\n", desc->name);
+        }
+      }
+    }
+    mi_assert_internal(desc->init != UNINIT);
+  }
+  else if (!_mi_preloading()) {
+    desc->init = DEFAULTED;
+  }
+}
diff --git a/compat/mimalloc/os.c b/compat/mimalloc/os.c
new file mode 100644
index 00000000000000..241d6a2bee3487
--- /dev/null
+++ b/compat/mimalloc/os.c
@@ -0,0 +1,770 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2025, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+#include "mimalloc.h"
+#include "mimalloc/internal.h"
+#include "mimalloc/atomic.h"
+#include "mimalloc/prim.h"
+
+#define mi_os_stat_increase(stat,amount)      _mi_stat_increase(&_mi_stats_main.stat, amount)
+#define mi_os_stat_decrease(stat,amount)      _mi_stat_decrease(&_mi_stats_main.stat, amount)
+#define mi_os_stat_counter_increase(stat,inc) _mi_stat_counter_increase(&_mi_stats_main.stat, inc)
+
+/* -----------------------------------------------------------
+  Initialization.
+----------------------------------------------------------- */
+#ifndef MI_DEFAULT_VIRTUAL_ADDRESS_BITS
+#if MI_INTPTR_SIZE < 8
+#define MI_DEFAULT_VIRTUAL_ADDRESS_BITS     32
+#else
+#define MI_DEFAULT_VIRTUAL_ADDRESS_BITS     48
+#endif
+#endif
+
+#ifndef MI_DEFAULT_PHYSICAL_MEMORY_IN_KIB
+#if MI_INTPTR_SIZE < 8
+#define MI_DEFAULT_PHYSICAL_MEMORY_IN_KIB   4*MI_MiB    // 4 GiB
+#else
+#define MI_DEFAULT_PHYSICAL_MEMORY_IN_KIB   32*MI_MiB   // 32 GiB
+#endif
+#endif
+
+static mi_os_mem_config_t mi_os_mem_config = {
+  4096,     // page size
+  0,        // large page size (usually 2MiB)
+  4096,     // allocation granularity
+  MI_DEFAULT_PHYSICAL_MEMORY_IN_KIB,
+  MI_DEFAULT_VIRTUAL_ADDRESS_BITS,
+  true,     // has overcommit?  (if true we use MAP_NORESERVE on mmap systems)
+  false,    // can we partially free allocated blocks? (on mmap systems we can free anywhere in a mapped range, but on Windows we must free the entire span)
+  true      // has virtual reserve? (if true we can reserve virtual address space without using commit or physical memory)
+};
+
+bool _mi_os_has_overcommit(void) {
+  return mi_os_mem_config.has_overcommit;
+}
+
+bool _mi_os_has_virtual_reserve(void) {
+  return mi_os_mem_config.has_virtual_reserve;
+}
+
+
+// OS (small) page size
+size_t _mi_os_page_size(void) {
+  return mi_os_mem_config.page_size;
+}
+
+// if large OS pages are supported (2 or 4MiB), then return the size, otherwise return the small page size (4KiB)
+size_t _mi_os_large_page_size(void) {
+  return (mi_os_mem_config.large_page_size != 0 ? mi_os_mem_config.large_page_size : _mi_os_page_size());
+}
+
+bool _mi_os_canuse_large_page(size_t size, size_t alignment) {
+  // if we have access, check the size and alignment requirements
+  if (mi_os_mem_config.large_page_size == 0) return false;
+  return ((size % mi_os_mem_config.large_page_size) == 0 && (alignment % mi_os_mem_config.large_page_size) == 0);
+}
+
+// round to a good OS allocation size (bounded by max 12.5% waste)
+size_t _mi_os_good_alloc_size(size_t size) {
+  size_t align_size;
+  if (size < 512*MI_KiB) align_size = _mi_os_page_size();
+  else if (size < 2*MI_MiB) align_size = 64*MI_KiB;
+  else if (size < 8*MI_MiB) align_size = 256*MI_KiB;
+  else if (size < 32*MI_MiB) align_size = 1*MI_MiB;
+  else align_size = 4*MI_MiB;
+  if mi_unlikely(size >= (SIZE_MAX - align_size)) return size; // possible overflow?
+  return _mi_align_up(size, align_size);
+}
+
+void _mi_os_init(void) {
+  _mi_prim_mem_init(&mi_os_mem_config);
+}
+
+
+/* -----------------------------------------------------------
+  Util
+-------------------------------------------------------------- */
+bool _mi_os_decommit(void* addr, size_t size);
+bool _mi_os_commit(void* addr, size_t size, bool* is_zero);
+
+
+/* -----------------------------------------------------------
+  aligned hinting
+-------------------------------------------------------------- */
+
+// On systems with enough virtual address bits, we can do efficient aligned allocation by using
+// the 2TiB to 30TiB area to allocate those. If we have at least 46 bits of virtual address
+// space (64TiB) we use this technique. (but see issue #939)
+#if (MI_INTPTR_SIZE >= 8) && !defined(MI_NO_ALIGNED_HINT)
+static mi_decl_cache_align _Atomic(uintptr_t)aligned_base;
+
+// Return a MI_SEGMENT_SIZE aligned address that is probably available.
+// If this returns NULL, the OS will determine the address but on some OS's that may not be
+// properly aligned which can be more costly as it needs to be adjusted afterwards.
+// For a size > 1GiB this always returns NULL in order to guarantee good ASLR randomization;
+// (otherwise an initial large allocation of say 2TiB has a 50% chance to include (known) addresses
+//  in the middle of the 2TiB - 6TiB address range (see issue #372))
+
+#define MI_HINT_BASE ((uintptr_t)2 << 40)  // 2TiB start
+#define MI_HINT_AREA ((uintptr_t)4 << 40)  // up to 6TiB  (since before win8 there is "only" 8TiB available to processes)
+#define MI_HINT_MAX  ((uintptr_t)30 << 40) // wrap after 30TiB (area after 32TiB is used for huge OS pages)
+
+void* _mi_os_get_aligned_hint(size_t try_alignment, size_t size)
+{
+  if (try_alignment <= 1 || try_alignment > MI_SEGMENT_SIZE) return NULL;
+  if (mi_os_mem_config.virtual_address_bits < 46) return NULL;  // < 64TiB virtual address space
+  size = _mi_align_up(size, MI_SEGMENT_SIZE);
+  if (size > 1*MI_GiB) return NULL;  // guarantee the chance of fixed valid address is at most 1/(MI_HINT_AREA / 1<<30) = 1/4096.
+  #if (MI_SECURE>0)
+  size += MI_SEGMENT_SIZE;        // put in `MI_SEGMENT_SIZE` virtual gaps between hinted blocks; this splits VMAs but increases guarded areas.
+  #endif
+
+  uintptr_t hint = mi_atomic_add_acq_rel(&aligned_base, size);
+  if (hint == 0 || hint > MI_HINT_MAX) {   // wrap or initialize
+    uintptr_t init = MI_HINT_BASE;
+    #if (MI_SECURE>0 || MI_DEBUG==0)       // security: randomize start of aligned allocations unless in debug mode
+    uintptr_t r = _mi_heap_random_next(mi_prim_get_default_heap());
+    init = init + ((MI_SEGMENT_SIZE * ((r>>17) & 0xFFFFF)) % MI_HINT_AREA);  // (randomly 20 bits)*4MiB == 0 to 4TiB
+    #endif
+    uintptr_t expected = hint + size;
+    mi_atomic_cas_strong_acq_rel(&aligned_base, &expected, init);
+    hint = mi_atomic_add_acq_rel(&aligned_base, size); // this may still give 0 or > MI_HINT_MAX but that is ok, it is a hint after all
+  }
+  if (hint%try_alignment != 0) return NULL;
+  return (void*)hint;
+}
+#else
+void* _mi_os_get_aligned_hint(size_t try_alignment, size_t size) {
+  MI_UNUSED(try_alignment); MI_UNUSED(size);
+  return NULL;
+}
+#endif
+
+/* -----------------------------------------------------------
+  Free memory
+-------------------------------------------------------------- */
+
+static void mi_os_free_huge_os_pages(void* p, size_t size);
+
+static void mi_os_prim_free(void* addr, size_t size, size_t commit_size) {
+  mi_assert_internal((size % _mi_os_page_size()) == 0);
+  if (addr == NULL) return; // || _mi_os_is_huge_reserved(addr)
+  int err = _mi_prim_free(addr, size);  // allow size==0 (issue #1041)
+  if (err != 0) {
+    _mi_warning_message("unable to free OS memory (error: %d (0x%x), size: 0x%zx bytes, address: %p)\n", err, err, size, addr);
+  }
+  if (commit_size > 0) {
+    mi_os_stat_decrease(committed, commit_size);
+  }
+  mi_os_stat_decrease(reserved, size);
+}
+
+void _mi_os_free_ex(void* addr, size_t size, bool still_committed, mi_memid_t memid) {
+  if (mi_memkind_is_os(memid.memkind)) {
+    size_t csize = memid.mem.os.size;
+    if (csize==0) { csize = _mi_os_good_alloc_size(size); }
+    mi_assert_internal(csize >= size);
+    size_t commit_size = (still_committed ? csize : 0);
+    void* base = addr;
+    // different base? (due to alignment)
+    if (memid.mem.os.base != base) {
+      mi_assert(memid.mem.os.base <= addr);
+      base = memid.mem.os.base;
+      const size_t diff = (uint8_t*)addr - (uint8_t*)memid.mem.os.base;
+      if (memid.mem.os.size==0) {
+        csize += diff;
+      }
+      if (still_committed) {
+        commit_size -= diff;  // the (addr-base) part was already un-committed
+      }
+    }
+    // free it
+    if (memid.memkind == MI_MEM_OS_HUGE) {
+      mi_assert(memid.is_pinned);
+      mi_os_free_huge_os_pages(base, csize);
+    }
+    else {
+      mi_os_prim_free(base, csize, (still_committed ? commit_size : 0));
+    }
+  }
+  else {
+    // nothing to do
+    mi_assert(memid.memkind < MI_MEM_OS);
+  }
+}
+
+void  _mi_os_free(void* p, size_t size, mi_memid_t memid) {
+  _mi_os_free_ex(p, size, true, memid);
+}
+
+
+/* -----------------------------------------------------------
+   Primitive allocation from the OS.
+-------------------------------------------------------------- */
+
+// Note: the `try_alignment` is just a hint and the returned pointer is not guaranteed to be aligned.
+// Also `hint_addr` is a hint and may be ignored.
+static void* mi_os_prim_alloc_at(void* hint_addr, size_t size, size_t try_alignment, bool commit, bool allow_large, bool* is_large, bool* is_zero) {
+  mi_assert_internal(size > 0 && (size % _mi_os_page_size()) == 0);
+  mi_assert_internal(is_zero != NULL);
+  mi_assert_internal(is_large != NULL);
+  if (size == 0) return NULL;
+  if (!commit) { allow_large = false; }
+  if (try_alignment == 0) { try_alignment = 1; } // avoid 0 to ensure there will be no divide by zero when aligning
+  *is_zero = false;
+  void* p = NULL;
+  int err = _mi_prim_alloc(hint_addr, size, try_alignment, commit, allow_large, is_large, is_zero, &p);
+  if (err != 0) {
+    _mi_warning_message("unable to allocate OS memory (error: %d (0x%x), addr: %p, size: 0x%zx bytes, align: 0x%zx, commit: %d, allow large: %d)\n", err, err, hint_addr, size, try_alignment, commit, allow_large);
+  }
+
+
+
+  mi_os_stat_counter_increase(mmap_calls, 1);
+  if (p != NULL) {
+    mi_os_stat_increase(reserved, size);
+    if (commit) {
+      mi_os_stat_increase(committed, size);
+      // seems needed for asan (or `mimalloc-test-api` fails)
+      #ifdef MI_TRACK_ASAN
+      if (*is_zero) { mi_track_mem_defined(p,size); }
+               else { mi_track_mem_undefined(p,size); }
+      #endif
+    }
+  }
+  return p;
+}
+
+static void* mi_os_prim_alloc(size_t size, size_t try_alignment, bool commit, bool allow_large, bool* is_large, bool* is_zero) {
+  return mi_os_prim_alloc_at(NULL, size, try_alignment, commit, allow_large, is_large, is_zero);
+}
+
+
+// Primitive aligned allocation from the OS.
+// This function guarantees the allocated memory is aligned.
+static void* mi_os_prim_alloc_aligned(size_t size, size_t alignment, bool commit, bool allow_large, bool* is_large, bool* is_zero, void** base) {
+  mi_assert_internal(alignment >= _mi_os_page_size() && ((alignment & (alignment - 1)) == 0));
+  mi_assert_internal(size > 0 && (size % _mi_os_page_size()) == 0);
+  mi_assert_internal(is_large != NULL);
+  mi_assert_internal(is_zero != NULL);
+  mi_assert_internal(base != NULL);
+  if (!commit) allow_large = false;
+  if (!(alignment >= _mi_os_page_size() && ((alignment & (alignment - 1)) == 0))) return NULL;
+  size = _mi_align_up(size, _mi_os_page_size());
+
+  // try first with a requested alignment hint (this will usually be aligned directly on Win 10+ or BSD)
+  void* p = mi_os_prim_alloc(size, alignment, commit, allow_large, is_large, is_zero);
+  if (p == NULL) return NULL;
+
+  // aligned already?
+  if (((uintptr_t)p % alignment) == 0) {
+    *base = p;
+  }
+  else {
+    // if not aligned, free it, overallocate, and unmap around it
+    #if !MI_TRACK_ASAN
+    _mi_warning_message("unable to allocate aligned OS memory directly, fall back to over-allocation (size: 0x%zx bytes, address: %p, alignment: 0x%zx, commit: %d)\n", size, p, alignment, commit);
+    #endif
+    if (p != NULL) { mi_os_prim_free(p, size, (commit ? size : 0)); }
+    if (size >= (SIZE_MAX - alignment)) return NULL; // overflow
+    const size_t over_size = size + alignment;
+
+    if (!mi_os_mem_config.has_partial_free) {  // Win32 VirtualAlloc cannot free parts of an allocated block
+      // over-allocate uncommitted (virtual) memory
+      p = mi_os_prim_alloc(over_size, 1 /*alignment*/, false /* commit? */, false /* allow_large */, is_large, is_zero);
+      if (p == NULL) return NULL;
+
+      // set p to the aligned part in the full region
+      // note: this is dangerous on Windows as VirtualFree needs the actual base pointer
+      // this is handled though by having the `base` field in the memid's
+      *base = p; // remember the base
+      p = mi_align_up_ptr(p, alignment);
+
+      // explicitly commit only the aligned part
+      if (commit) {
+        if (!_mi_os_commit(p, size, NULL)) {
+          mi_os_prim_free(*base, over_size, 0);
+          return NULL;
+        }
+      }
+    }
+    else  { // mmap can free inside an allocation
+      // overallocate...
+      p = mi_os_prim_alloc(over_size, 1, commit, false, is_large, is_zero);
+      if (p == NULL) return NULL;
+
+      // and selectively unmap parts around the over-allocated area.
+      void* aligned_p = mi_align_up_ptr(p, alignment);
+      size_t pre_size = (uint8_t*)aligned_p - (uint8_t*)p;
+      size_t mid_size = _mi_align_up(size, _mi_os_page_size());
+      size_t post_size = over_size - pre_size - mid_size;
+      mi_assert_internal(pre_size < over_size && post_size < over_size && mid_size >= size);
+      if (pre_size > 0)  { mi_os_prim_free(p, pre_size, (commit ? pre_size : 0)); }
+      if (post_size > 0) { mi_os_prim_free((uint8_t*)aligned_p + mid_size, post_size, (commit ? post_size : 0)); }
+      // we can return the aligned pointer on `mmap` systems
+      p = aligned_p;
+      *base = aligned_p; // since we freed the pre part, `*base == p`.
+    }
+  }
+
+  mi_assert_internal(p == NULL || (p != NULL && *base != NULL && ((uintptr_t)p % alignment) == 0));
+  return p;
+}
+
+
+/* -----------------------------------------------------------
+  OS API: alloc and alloc_aligned
+----------------------------------------------------------- */
+
+void* _mi_os_alloc(size_t size, mi_memid_t* memid) {
+  *memid = _mi_memid_none();
+  if (size == 0) return NULL;
+  size = _mi_os_good_alloc_size(size);
+  bool os_is_large = false;
+  bool os_is_zero  = false;
+  void* p = mi_os_prim_alloc(size, 0, true, false, &os_is_large, &os_is_zero);
+  if (p == NULL) return NULL;
+
+  *memid = _mi_memid_create_os(p, size, true, os_is_zero, os_is_large);
+  mi_assert_internal(memid->mem.os.size >= size);
+  mi_assert_internal(memid->initially_committed);
+  return p;
+}
+
+void* _mi_os_alloc_aligned(size_t size, size_t alignment, bool commit, bool allow_large, mi_memid_t* memid)
+{
+  MI_UNUSED(&_mi_os_get_aligned_hint); // suppress unused warnings
+  *memid = _mi_memid_none();
+  if (size == 0) return NULL;
+  size = _mi_os_good_alloc_size(size);
+  alignment = _mi_align_up(alignment, _mi_os_page_size());
+
+  bool os_is_large = false;
+  bool os_is_zero  = false;
+  void* os_base = NULL;
+  void* p = mi_os_prim_alloc_aligned(size, alignment, commit, allow_large, &os_is_large, &os_is_zero, &os_base );
+  if (p == NULL) return NULL;
+
+  *memid = _mi_memid_create_os(p, size, commit, os_is_zero, os_is_large);
+  memid->mem.os.base = os_base;
+  memid->mem.os.size += ((uint8_t*)p - (uint8_t*)os_base);  // todo: return from prim_alloc_aligned?
+
+  mi_assert_internal(memid->mem.os.size >= size);
+  mi_assert_internal(_mi_is_aligned(p,alignment));
+  if (commit) { mi_assert_internal(memid->initially_committed); }
+  return p;
+}
+
+
+mi_decl_nodiscard static void* mi_os_ensure_zero(void* p, size_t size, mi_memid_t* memid) {
+  if (p==NULL || size==0) return p;
+  // ensure committed
+  if (!memid->initially_committed) {
+    bool is_zero = false;
+    if (!_mi_os_commit(p, size, &is_zero)) {
+      _mi_os_free(p, size, *memid);
+      return NULL;
+    }
+    memid->initially_committed = true;
+  }
+  // ensure zero'd
+  if (memid->initially_zero) return p;
+  _mi_memzero_aligned(p,size);
+  memid->initially_zero = true;
+  return p;
+}
+
+void*  _mi_os_zalloc(size_t size, mi_memid_t* memid) {
+  void* p = _mi_os_alloc(size,memid);
+  return mi_os_ensure_zero(p, size, memid);
+}
+
+/* -----------------------------------------------------------
+  OS aligned allocation with an offset. This is used
+  for large alignments > MI_BLOCK_ALIGNMENT_MAX. We use a large mimalloc
+  page where the object can be aligned at an offset from the start of the segment.
+  As we may need to overallocate, we need to free such pointers using `mi_free_aligned`
+  to use the actual start of the memory region.
+----------------------------------------------------------- */
+
+void* _mi_os_alloc_aligned_at_offset(size_t size, size_t alignment, size_t offset, bool commit, bool allow_large, mi_memid_t* memid) {
+  mi_assert(offset <= MI_SEGMENT_SIZE);
+  mi_assert(offset <= size);
+  mi_assert((alignment % _mi_os_page_size()) == 0);
+  *memid = _mi_memid_none();
+  if (offset > MI_SEGMENT_SIZE) return NULL;
+  if (offset == 0) {
+    // regular aligned allocation
+    return _mi_os_alloc_aligned(size, alignment, commit, allow_large, memid);
+  }
+  else {
+    // overallocate to align at an offset
+    const size_t extra = _mi_align_up(offset, alignment) - offset;
+    const size_t oversize = size + extra;
+    void* const start = _mi_os_alloc_aligned(oversize, alignment, commit, allow_large, memid);
+    if (start == NULL) return NULL;
+
+    void* const p = (uint8_t*)start + extra;
+    mi_assert(_mi_is_aligned((uint8_t*)p + offset, alignment));
+    // decommit the overallocation at the start
+    if (commit && extra > _mi_os_page_size()) {
+      _mi_os_decommit(start, extra);
+    }
+    return p;
+  }
+}
+
+/* -----------------------------------------------------------
+  OS memory API: reset, commit, decommit, protect, unprotect.
+----------------------------------------------------------- */
+
+// OS page align within a given area, either conservative (pages inside the area only),
+// or not (straddling pages outside the area is possible)
+static void* mi_os_page_align_areax(bool conservative, void* addr, size_t size, size_t* newsize) {
+  mi_assert(addr != NULL && size > 0);
+  if (newsize != NULL) *newsize = 0;
+  if (size == 0 || addr == NULL) return NULL;
+
+  // page align: conservatively within the range, or outward to include straddling pages
+  void* start = (conservative ? mi_align_up_ptr(addr, _mi_os_page_size())
+    : mi_align_down_ptr(addr, _mi_os_page_size()));
+  void* end = (conservative ? mi_align_down_ptr((uint8_t*)addr + size, _mi_os_page_size())
+    : mi_align_up_ptr((uint8_t*)addr + size, _mi_os_page_size()));
+  ptrdiff_t diff = (uint8_t*)end - (uint8_t*)start;
+  if (diff <= 0) return NULL;
+
+  mi_assert_internal((conservative && (size_t)diff <= size) || (!conservative && (size_t)diff >= size));
+  if (newsize != NULL) *newsize = (size_t)diff;
+  return start;
+}
+
+static void* mi_os_page_align_area_conservative(void* addr, size_t size, size_t* newsize) {
+  return mi_os_page_align_areax(true, addr, size, newsize);
+}
+
+bool _mi_os_commit_ex(void* addr, size_t size, bool* is_zero, size_t stat_size) {
+  if (is_zero != NULL) { *is_zero = false; }
+  mi_os_stat_counter_increase(commit_calls, 1);
+
+  // page align range
+  size_t csize;
+  void* start = mi_os_page_align_areax(false /* conservative? */, addr, size, &csize);
+  if (csize == 0) return true;
+
+  // commit
+  bool os_is_zero = false;
+  int err = _mi_prim_commit(start, csize, &os_is_zero);
+  if (err != 0) {
+    _mi_warning_message("cannot commit OS memory (error: %d (0x%x), address: %p, size: 0x%zx bytes)\n", err, err, start, csize);
+    return false;
+  }
+  if (os_is_zero && is_zero != NULL) {
+    *is_zero = true;
+    mi_assert_expensive(mi_mem_is_zero(start, csize));
+  }
+  // note: the following seems required for asan (otherwise `mimalloc-test-stress` fails)
+  #ifdef MI_TRACK_ASAN
+  if (os_is_zero) { mi_track_mem_defined(start,csize); }
+             else { mi_track_mem_undefined(start,csize); }
+  #endif
+  mi_os_stat_increase(committed, stat_size);  // use size for precise commit vs. decommit
+  return true;
+}
+
+bool _mi_os_commit(void* addr, size_t size, bool* is_zero) {
+  return _mi_os_commit_ex(addr, size, is_zero, size);
+}
+
+static bool mi_os_decommit_ex(void* addr, size_t size, bool* needs_recommit, size_t stat_size) {
+  mi_assert_internal(needs_recommit!=NULL);
+  mi_os_stat_decrease(committed, stat_size);
+
+  // page align
+  size_t csize;
+  void* start = mi_os_page_align_area_conservative(addr, size, &csize);
+  if (csize == 0) return true;
+
+  // decommit
+  *needs_recommit = true;
+  int err = _mi_prim_decommit(start,csize,needs_recommit);
+  if (err != 0) {
+    _mi_warning_message("cannot decommit OS memory (error: %d (0x%x), address: %p, size: 0x%zx bytes)\n", err, err, start, csize);
+  }
+  mi_assert_internal(err == 0);
+  return (err == 0);
+}
+
+bool _mi_os_decommit(void* addr, size_t size) {
+  bool needs_recommit;
+  return mi_os_decommit_ex(addr, size, &needs_recommit, size);
+}
+
+
+// Signal to the OS that the address range is no longer in use
+// but may be used later again. This will release physical memory
+// pages and reduce swapping while keeping the memory committed.
+// We page align to a conservative area inside the range to reset.
+bool _mi_os_reset(void* addr, size_t size) {
+  // page align conservatively within the range
+  size_t csize;
+  void* start = mi_os_page_align_area_conservative(addr, size, &csize);
+  if (csize == 0) return true;  // || _mi_os_is_huge_reserved(addr)
+  mi_os_stat_counter_increase(reset, csize);
+  mi_os_stat_counter_increase(reset_calls, 1);
+
+  #if (MI_DEBUG>1) && !MI_SECURE && !MI_TRACK_ENABLED // && !MI_TSAN
+  memset(start, 0, csize); // pretend it is eagerly reset
+  #endif
+
+  int err = _mi_prim_reset(start, csize);
+  if (err != 0) {
+    _mi_warning_message("cannot reset OS memory (error: %d (0x%x), address: %p, size: 0x%zx bytes)\n", err, err, start, csize);
+  }
+  return (err == 0);
+}
+
+
+void _mi_os_reuse( void* addr, size_t size ) {
+  // page align conservatively within the range
+  size_t csize = 0;
+  void* const start = mi_os_page_align_area_conservative(addr, size, &csize);
+  if (csize == 0) return;
+  const int err = _mi_prim_reuse(start, csize);
+  if (err != 0) {
+    _mi_warning_message("cannot reuse OS memory (error: %d (0x%x), address: %p, size: 0x%zx bytes)\n", err, err, start, csize);
+  }
+}
+
+// either resets or decommits memory, returns true if the memory needs
+// to be recommitted if it is to be re-used later on.
+bool _mi_os_purge_ex(void* p, size_t size, bool allow_reset, size_t stat_size)
+{
+  if (mi_option_get(mi_option_purge_delay) < 0) return false;  // is purging allowed?
+  mi_os_stat_counter_increase(purge_calls, 1);
+  mi_os_stat_counter_increase(purged, size);
+
+  if (mi_option_is_enabled(mi_option_purge_decommits) &&   // should decommit?
+      !_mi_preloading())                                   // don't decommit during preloading (unsafe)
+  {
+    bool needs_recommit = true;
+    mi_os_decommit_ex(p, size, &needs_recommit, stat_size);
+    return needs_recommit;
+  }
+  else {
+    if (allow_reset) {  // this is sometimes not allowed if the range is not fully committed
+      _mi_os_reset(p, size);
+    }
+    return false;  // needs no recommit
+  }
+}
+
+// either resets or decommits memory, returns true if the memory needs
+// to be recommitted if it is to be re-used later on.
+bool _mi_os_purge(void* p, size_t size) {
+  return _mi_os_purge_ex(p, size, true, size);
+}
+
+// Protect a region in memory to be not accessible.
+static  bool mi_os_protectx(void* addr, size_t size, bool protect) {
+  // page align conservatively within the range
+  size_t csize = 0;
+  void* start = mi_os_page_align_area_conservative(addr, size, &csize);
+  if (csize == 0) return false;
+  /*
+  if (_mi_os_is_huge_reserved(addr)) {
+	  _mi_warning_message("cannot mprotect memory allocated in huge OS pages\n");
+  }
+  */
+  int err = _mi_prim_protect(start,csize,protect);
+  if (err != 0) {
+    _mi_warning_message("cannot %s OS memory (error: %d (0x%x), address: %p, size: 0x%zx bytes)\n", (protect ? "protect" : "unprotect"), err, err, start, csize);
+  }
+  return (err == 0);
+}
+
+bool _mi_os_protect(void* addr, size_t size) {
+  return mi_os_protectx(addr, size, true);
+}
+
+bool _mi_os_unprotect(void* addr, size_t size) {
+  return mi_os_protectx(addr, size, false);
+}
+
+
+
+/* ----------------------------------------------------------------------------
+Support for allocating huge OS pages (1GiB) that are reserved up-front
+and possibly associated with a specific NUMA node. (use `numa_node>=0`)
+-----------------------------------------------------------------------------*/
+#define MI_HUGE_OS_PAGE_SIZE  (MI_GiB)
+
+
+#if (MI_INTPTR_SIZE >= 8)
+// To ensure proper alignment, use our own area for huge OS pages
+static mi_decl_cache_align _Atomic(uintptr_t)  mi_huge_start; // = 0
+
+// Claim an aligned address range for huge pages
+static uint8_t* mi_os_claim_huge_pages(size_t pages, size_t* total_size) {
+  if (total_size != NULL) *total_size = 0;
+  const size_t size = pages * MI_HUGE_OS_PAGE_SIZE;
+
+  uintptr_t start = 0;
+  uintptr_t end = 0;
+  uintptr_t huge_start = mi_atomic_load_relaxed(&mi_huge_start);
+  do {
+    start = huge_start;
+    if (start == 0) {
+      // Initialize the start address after the 32TiB area
+      start = ((uintptr_t)32 << 40);  // 32TiB virtual start address
+    #if (MI_SECURE>0 || MI_DEBUG==0)      // security: randomize start of huge pages unless in debug mode
+      uintptr_t r = _mi_heap_random_next(mi_prim_get_default_heap());
+      start = start + ((uintptr_t)MI_HUGE_OS_PAGE_SIZE * ((r>>17) & 0x0FFF));  // (randomly 12bits)*1GiB == between 0 to 4TiB
+    #endif
+    }
+    end = start + size;
+    mi_assert_internal(end % MI_SEGMENT_SIZE == 0);
+  } while (!mi_atomic_cas_strong_acq_rel(&mi_huge_start, &huge_start, end));
+
+  if (total_size != NULL) *total_size = size;
+  return (uint8_t*)start;
+}
+#else
+static uint8_t* mi_os_claim_huge_pages(size_t pages, size_t* total_size) {
+  MI_UNUSED(pages);
+  if (total_size != NULL) *total_size = 0;
+  return NULL;
+}
+#endif
+
+// Allocate MI_SEGMENT_SIZE aligned huge pages
+void* _mi_os_alloc_huge_os_pages(size_t pages, int numa_node, mi_msecs_t max_msecs, size_t* pages_reserved, size_t* psize, mi_memid_t* memid) {
+  *memid = _mi_memid_none();
+  if (psize != NULL) *psize = 0;
+  if (pages_reserved != NULL) *pages_reserved = 0;
+  size_t size = 0;
+  uint8_t* const start = mi_os_claim_huge_pages(pages, &size);
+  if (start == NULL) return NULL; // NULL on 32-bit systems (no huge OS page support)
+
+  // Allocate one page at a time but try to place them contiguously
+  // We allocate one page at a time to be able to abort if it takes too long
+  // or to at least allocate as many as available on the system.
+  mi_msecs_t start_t = _mi_clock_start();
+  size_t page = 0;
+  bool all_zero = true;
+  while (page < pages) {
+    // allocate a page
+    bool is_zero = false;
+    void* addr = start + (page * MI_HUGE_OS_PAGE_SIZE);
+    void* p = NULL;
+    int err = _mi_prim_alloc_huge_os_pages(addr, MI_HUGE_OS_PAGE_SIZE, numa_node, &is_zero, &p);
+    if (!is_zero) { all_zero = false;  }
+    if (err != 0) {
+      _mi_warning_message("unable to allocate huge OS page (error: %d (0x%x), address: %p, size: %zx bytes)\n", err, err, addr, MI_HUGE_OS_PAGE_SIZE);
+      break;
+    }
+
+    // Did we succeed at a contiguous address?
+    if (p != addr) {
+      // no success, issue a warning and break
+      if (p != NULL) {
+        _mi_warning_message("could not allocate contiguous huge OS page %zu at %p\n", page, addr);
+        mi_os_prim_free(p, MI_HUGE_OS_PAGE_SIZE, MI_HUGE_OS_PAGE_SIZE);
+      }
+      break;
+    }
+
+    // success, record it
+    page++;  // increase before timeout check (see issue #711)
+    mi_os_stat_increase(committed, MI_HUGE_OS_PAGE_SIZE);
+    mi_os_stat_increase(reserved, MI_HUGE_OS_PAGE_SIZE);
+
+    // check for timeout
+    if (max_msecs > 0) {
+      mi_msecs_t elapsed = _mi_clock_end(start_t);
+      if (page >= 1) {
+        mi_msecs_t estimate = ((elapsed / (page+1)) * pages);
+        if (estimate > 2*max_msecs) { // seems like we are going to timeout, break
+          elapsed = max_msecs + 1;
+        }
+      }
+      if (elapsed > max_msecs) {
+        _mi_warning_message("huge OS page allocation timed out (after allocating %zu page(s))\n", page);
+        break;
+      }
+    }
+  }
+  mi_assert_internal(page*MI_HUGE_OS_PAGE_SIZE <= size);
+  if (pages_reserved != NULL) { *pages_reserved = page; }
+  if (psize != NULL) { *psize = page * MI_HUGE_OS_PAGE_SIZE; }
+  if (page != 0) {
+    mi_assert(start != NULL);
+    *memid = _mi_memid_create_os(start, size, true /* is committed */, all_zero, true /* is_large */);
+    memid->memkind = MI_MEM_OS_HUGE;
+    mi_assert(memid->is_pinned);
+    #ifdef MI_TRACK_ASAN
+    if (all_zero) { mi_track_mem_defined(start,size); }
+    #endif
+  }
+  return (page == 0 ? NULL : start);
+}
+
+// free every huge page in a range individually (as we allocated per page)
+// note: needed with VirtualAlloc but could potentially be done in one go on mmap'd systems.
+static void mi_os_free_huge_os_pages(void* p, size_t size) {
+  if (p==NULL || size==0) return;
+  uint8_t* base = (uint8_t*)p;
+  while (size >= MI_HUGE_OS_PAGE_SIZE) {
+    mi_os_prim_free(base, MI_HUGE_OS_PAGE_SIZE, MI_HUGE_OS_PAGE_SIZE);
+    size -= MI_HUGE_OS_PAGE_SIZE;
+    base += MI_HUGE_OS_PAGE_SIZE;
+  }
+}
+
+
+/* ----------------------------------------------------------------------------
+Support NUMA aware allocation
+-----------------------------------------------------------------------------*/
+
+static _Atomic(size_t) mi_numa_node_count; // = 0   // cache the node count
+
+int _mi_os_numa_node_count(void) {
+  size_t count = mi_atomic_load_acquire(&mi_numa_node_count);
+  if mi_unlikely(count == 0) {
+    long ncount = mi_option_get(mi_option_use_numa_nodes); // given explicitly?
+    if (ncount > 0 && ncount < INT_MAX) {
+      count = (size_t)ncount;
+    }
+    else {
+      const size_t n = _mi_prim_numa_node_count(); // or detect dynamically
+      if (n == 0 || n > INT_MAX) { count = 1; }
+                            else { count = n; }
+    }
+    mi_atomic_store_release(&mi_numa_node_count, count); // save it
+    _mi_verbose_message("using %zd numa regions\n", count);
+  }
+  mi_assert_internal(count > 0 && count <= INT_MAX);
+  return (int)count;
+}
+
+static int mi_os_numa_node_get(void) {
+  int numa_count = _mi_os_numa_node_count();
+  if (numa_count<=1) return 0; // optimize on single numa node systems: always node 0
+  // never more than the node count and >= 0
+  const size_t n = _mi_prim_numa_node();
+  int numa_node = (n < INT_MAX ? (int)n : 0);
+  if (numa_node >= numa_count) { numa_node = numa_node % numa_count; }
+  return numa_node;
+}
+
+int _mi_os_numa_node(void) {
+  if mi_likely(mi_atomic_load_relaxed(&mi_numa_node_count) == 1) {
+    return 0;
+  }
+  else {
+    return mi_os_numa_node_get();
+  }
+}
diff --git a/compat/mimalloc/page-queue.c b/compat/mimalloc/page-queue.c
new file mode 100644
index 00000000000000..1f700c6df4c866
--- /dev/null
+++ b/compat/mimalloc/page-queue.c
@@ -0,0 +1,397 @@
+/*----------------------------------------------------------------------------
+Copyright (c) 2018-2024, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+
+/* -----------------------------------------------------------
+  Definition of page queues for each block size
+----------------------------------------------------------- */
+
+#ifndef MI_IN_PAGE_C
+#error "this file should be included from 'page.c'"
+// include to help an IDE
+#include "mimalloc.h"
+#include "mimalloc/internal.h"
+#include "mimalloc/atomic.h"
+#endif
+
+/* -----------------------------------------------------------
+  Minimal alignment in machine words (i.e. `sizeof(void*)`)
+----------------------------------------------------------- */
+
+#if (MI_MAX_ALIGN_SIZE > 4*MI_INTPTR_SIZE)
+  #error "define alignment for more than 4x word size for this platform"
+#elif (MI_MAX_ALIGN_SIZE > 2*MI_INTPTR_SIZE)
+  #define MI_ALIGN4W   // 4 machine words minimal alignment
+#elif (MI_MAX_ALIGN_SIZE > MI_INTPTR_SIZE)
+  #define MI_ALIGN2W   // 2 machine words minimal alignment
+#else
+  // ok, default alignment is 1 word
+#endif
+
+
+/* -----------------------------------------------------------
+  Queue query
+----------------------------------------------------------- */
+
+
+static inline bool mi_page_queue_is_huge(const mi_page_queue_t* pq) {
+  return (pq->block_size == (MI_MEDIUM_OBJ_SIZE_MAX+sizeof(uintptr_t)));
+}
+
+static inline bool mi_page_queue_is_full(const mi_page_queue_t* pq) {
+  return (pq->block_size == (MI_MEDIUM_OBJ_SIZE_MAX+(2*sizeof(uintptr_t))));
+}
+
+static inline bool mi_page_queue_is_special(const mi_page_queue_t* pq) {
+  return (pq->block_size > MI_MEDIUM_OBJ_SIZE_MAX);
+}
+
+/* -----------------------------------------------------------
+  Bins
+----------------------------------------------------------- */
+
+// Return the bin for a given field size.
+// Returns MI_BIN_HUGE if the size is too large.
+// We use `wsize` for the size in "machine word sizes",
+// i.e. byte size == `wsize*sizeof(void*)`.
+static inline size_t mi_bin(size_t size) {
+  size_t wsize = _mi_wsize_from_size(size);
+#if defined(MI_ALIGN4W)
+  if mi_likely(wsize <= 4) {
+    return (wsize <= 1 ? 1 : (wsize+1)&~1); // round to double word sizes
+  }
+#elif defined(MI_ALIGN2W)
+  if mi_likely(wsize <= 8) {
+    return (wsize <= 1 ? 1 : (wsize+1)&~1); // round to double word sizes
+  }
+#else
+  if mi_likely(wsize <= 8) {
+    return (wsize == 0 ? 1 : wsize);
+  }
+#endif
+  else if mi_unlikely(wsize > MI_MEDIUM_OBJ_WSIZE_MAX) {
+    return MI_BIN_HUGE;
+  }
+  else {
+    #if defined(MI_ALIGN4W)
+    if (wsize <= 16) { wsize = (wsize+3)&~3; } // round to 4x word sizes
+    #endif
+    wsize--;
+    // find the highest bit
+    const size_t b = (MI_SIZE_BITS - 1 - mi_clz(wsize));  // note: wsize != 0
+    // and use the top 3 bits to determine the bin (~12.5% worst internal fragmentation).
+    // - adjust with 3 because we do not round the first 8 sizes
+    //   which each get an exact bin
+    const size_t bin = ((b << 2) + ((wsize >> (b - 2)) & 0x03)) - 3;
+    mi_assert_internal(bin > 0 && bin < MI_BIN_HUGE);
+    return bin;
+  }
+}
+
+
+
+/* -----------------------------------------------------------
+  Queue of pages with free blocks
+----------------------------------------------------------- */
+
+size_t _mi_bin(size_t size) {
+  return mi_bin(size);
+}
+
+size_t _mi_bin_size(size_t bin) {
+  return _mi_heap_empty.pages[bin].block_size;
+}
+
+// Good size for allocation
+size_t mi_good_size(size_t size) mi_attr_noexcept {
+  if (size <= MI_MEDIUM_OBJ_SIZE_MAX) {
+    return _mi_bin_size(mi_bin(size + MI_PADDING_SIZE));
+  }
+  else {
+    return _mi_align_up(size + MI_PADDING_SIZE,_mi_os_page_size());
+  }
+}
+
+#if (MI_DEBUG>1)
+static bool mi_page_queue_contains(mi_page_queue_t* queue, const mi_page_t* page) {
+  mi_assert_internal(page != NULL);
+  mi_page_t* list = queue->first;
+  while (list != NULL) {
+    mi_assert_internal(list->next == NULL || list->next->prev == list);
+    mi_assert_internal(list->prev == NULL || list->prev->next == list);
+    if (list == page) break;
+    list = list->next;
+  }
+  return (list == page);
+}
+
+#endif
+
+#if (MI_DEBUG>1)
+static bool mi_heap_contains_queue(const mi_heap_t* heap, const mi_page_queue_t* pq) {
+  return (pq >= &heap->pages[0] && pq <= &heap->pages[MI_BIN_FULL]);
+}
+#endif
+
+static inline bool mi_page_is_large_or_huge(const mi_page_t* page) {
+  return (mi_page_block_size(page) > MI_MEDIUM_OBJ_SIZE_MAX || mi_page_is_huge(page));
+}
+
+static size_t mi_page_bin(const mi_page_t* page) {
+  const size_t bin = (mi_page_is_in_full(page) ? MI_BIN_FULL : (mi_page_is_huge(page) ? MI_BIN_HUGE : mi_bin(mi_page_block_size(page))));
+  mi_assert_internal(bin <= MI_BIN_FULL);
+  return bin;
+}
+
+// returns the page bin without using MI_BIN_FULL for statistics
+size_t _mi_page_stats_bin(const mi_page_t* page) {
+  const size_t bin = (mi_page_is_huge(page) ? MI_BIN_HUGE : mi_bin(mi_page_block_size(page)));
+  mi_assert_internal(bin <= MI_BIN_HUGE);
+  return bin;
+}
+
+static mi_page_queue_t* mi_heap_page_queue_of(mi_heap_t* heap, const mi_page_t* page) {
+  mi_assert_internal(heap!=NULL);
+  const size_t bin = mi_page_bin(page);
+  mi_page_queue_t* pq = &heap->pages[bin];
+  mi_assert_internal((mi_page_block_size(page) == pq->block_size) ||
+                       (mi_page_is_large_or_huge(page) && mi_page_queue_is_huge(pq)) ||
+                         (mi_page_is_in_full(page) && mi_page_queue_is_full(pq)));
+  return pq;
+}
+
+static mi_page_queue_t* mi_page_queue_of(const mi_page_t* page) {
+  mi_heap_t* heap = mi_page_heap(page);
+  mi_page_queue_t* pq = mi_heap_page_queue_of(heap, page);
+  mi_assert_expensive(mi_page_queue_contains(pq, page));
+  return pq;
+}
+
+// The current small page array is for efficiency and for each
+// small size (up to 256) it points directly to the page for that
+// size without having to compute the bin. This means when the
+// current free page queue is updated for a small bin, we need to update a
+// range of entries in `heap->pages_free_direct`.
+static inline void mi_heap_queue_first_update(mi_heap_t* heap, const mi_page_queue_t* pq) {
+  mi_assert_internal(mi_heap_contains_queue(heap,pq));
+  size_t size = pq->block_size;
+  if (size > MI_SMALL_SIZE_MAX) return;
+
+  mi_page_t* page = pq->first;
+  if (pq->first == NULL) page = (mi_page_t*)&_mi_page_empty;
+
+  // find index in the right direct page array
+  size_t start;
+  size_t idx = _mi_wsize_from_size(size);
+  mi_page_t** pages_free = heap->pages_free_direct;
+
+  if (pages_free[idx] == page) return;  // already set
+
+  // find start slot
+  if (idx<=1) {
+    start = 0;
+  }
+  else {
+    // find previous size; due to minimal alignment up to 3 previous bins may need to be skipped
+    size_t bin = mi_bin(size);
+    const mi_page_queue_t* prev = pq - 1;
+    while( bin == mi_bin(prev->block_size) && prev > &heap->pages[0]) {
+      prev--;
+    }
+    start = 1 + _mi_wsize_from_size(prev->block_size);
+    if (start > idx) start = idx;
+  }
+
+  // set size range to the right page
+  mi_assert(start <= idx);
+  for (size_t sz = start; sz <= idx; sz++) {
+    pages_free[sz] = page;
+  }
+}
+
+/*
+static bool mi_page_queue_is_empty(mi_page_queue_t* queue) {
+  return (queue->first == NULL);
+}
+*/
+
+static void mi_page_queue_remove(mi_page_queue_t* queue, mi_page_t* page) {
+  mi_assert_internal(page != NULL);
+  mi_assert_expensive(mi_page_queue_contains(queue, page));
+  mi_assert_internal(mi_page_block_size(page) == queue->block_size ||
+                      (mi_page_is_large_or_huge(page) && mi_page_queue_is_huge(queue)) ||
+                        (mi_page_is_in_full(page) && mi_page_queue_is_full(queue)));
+  mi_heap_t* heap = mi_page_heap(page);
+
+  if (page->prev != NULL) page->prev->next = page->next;
+  if (page->next != NULL) page->next->prev = page->prev;
+  if (page == queue->last)  queue->last = page->prev;
+  if (page == queue->first) {
+    queue->first = page->next;
+    // update first
+    mi_assert_internal(mi_heap_contains_queue(heap, queue));
+    mi_heap_queue_first_update(heap,queue);
+  }
+  heap->page_count--;
+  page->next = NULL;
+  page->prev = NULL;
+  // mi_atomic_store_ptr_release(mi_atomic_cast(void*, &page->heap), NULL);
+  mi_page_set_in_full(page,false);
+}
+
+
+static void mi_page_queue_push(mi_heap_t* heap, mi_page_queue_t* queue, mi_page_t* page) {
+  mi_assert_internal(mi_page_heap(page) == heap);
+  mi_assert_internal(!mi_page_queue_contains(queue, page));
+  #if MI_HUGE_PAGE_ABANDON
+  mi_assert_internal(_mi_page_segment(page)->kind != MI_SEGMENT_HUGE);
+  #endif
+  mi_assert_internal(mi_page_block_size(page) == queue->block_size ||
+                      (mi_page_is_large_or_huge(page) && mi_page_queue_is_huge(queue)) ||
+                        (mi_page_is_in_full(page) && mi_page_queue_is_full(queue)));
+
+  mi_page_set_in_full(page, mi_page_queue_is_full(queue));
+  // mi_atomic_store_ptr_release(mi_atomic_cast(void*, &page->heap), heap);
+  page->next = queue->first;
+  page->prev = NULL;
+  if (queue->first != NULL) {
+    mi_assert_internal(queue->first->prev == NULL);
+    queue->first->prev = page;
+    queue->first = page;
+  }
+  else {
+    queue->first = queue->last = page;
+  }
+
+  // update direct
+  mi_heap_queue_first_update(heap, queue);
+  heap->page_count++;
+}
+
+static void mi_page_queue_move_to_front(mi_heap_t* heap, mi_page_queue_t* queue, mi_page_t* page) {
+  mi_assert_internal(mi_page_heap(page) == heap);
+  mi_assert_internal(mi_page_queue_contains(queue, page));
+  if (queue->first == page) return;
+  mi_page_queue_remove(queue, page);
+  mi_page_queue_push(heap, queue, page);
+  mi_assert_internal(queue->first == page);
+}
+
+static void mi_page_queue_enqueue_from_ex(mi_page_queue_t* to, mi_page_queue_t* from, bool enqueue_at_end, mi_page_t* page) {
+  mi_assert_internal(page != NULL);
+  mi_assert_expensive(mi_page_queue_contains(from, page));
+  mi_assert_expensive(!mi_page_queue_contains(to, page));
+  const size_t bsize = mi_page_block_size(page);
+  MI_UNUSED(bsize);
+  mi_assert_internal((bsize == to->block_size && bsize == from->block_size) ||
+                     (bsize == to->block_size && mi_page_queue_is_full(from)) ||
+                     (bsize == from->block_size && mi_page_queue_is_full(to)) ||
+                     (mi_page_is_large_or_huge(page) && mi_page_queue_is_huge(to)) ||
+                     (mi_page_is_large_or_huge(page) && mi_page_queue_is_full(to)));
+
+  mi_heap_t* heap = mi_page_heap(page);
+
+  // delete from `from`
+  if (page->prev != NULL) page->prev->next = page->next;
+  if (page->next != NULL) page->next->prev = page->prev;
+  if (page == from->last)  from->last = page->prev;
+  if (page == from->first) {
+    from->first = page->next;
+    // update first
+    mi_assert_internal(mi_heap_contains_queue(heap, from));
+    mi_heap_queue_first_update(heap, from);
+  }
+
+  // insert into `to`
+  if (enqueue_at_end) {
+    // enqueue at the end
+    page->prev = to->last;
+    page->next = NULL;
+    if (to->last != NULL) {
+      mi_assert_internal(heap == mi_page_heap(to->last));
+      to->last->next = page;
+      to->last = page;
+    }
+    else {
+      to->first = page;
+      to->last = page;
+      mi_heap_queue_first_update(heap, to);
+    }
+  }
+  else {
+    if (to->first != NULL) {
+      // enqueue at 2nd place
+      mi_assert_internal(heap == mi_page_heap(to->first));
+      mi_page_t* next = to->first->next;
+      page->prev = to->first;
+      page->next = next;
+      to->first->next = page;
+      if (next != NULL) {
+        next->prev = page;
+      }
+      else {
+        to->last = page;
+      }
+    }
+    else {
+      // enqueue at the head (singleton list)
+      page->prev = NULL;
+      page->next = NULL;
+      to->first = page;
+      to->last = page;
+      mi_heap_queue_first_update(heap, to);
+    }
+  }
+
+  mi_page_set_in_full(page, mi_page_queue_is_full(to));
+}
+
+static void mi_page_queue_enqueue_from(mi_page_queue_t* to, mi_page_queue_t* from, mi_page_t* page) {
+  mi_page_queue_enqueue_from_ex(to, from, true /* enqueue at the end */, page);
+}
+
+static void mi_page_queue_enqueue_from_full(mi_page_queue_t* to, mi_page_queue_t* from, mi_page_t* page) {
+  // note: we could insert at the front to increase reuse, but it slows down certain benchmarks (like `alloc-test`)
+  mi_page_queue_enqueue_from_ex(to, from, true /* enqueue at the end of the `to` queue? */, page);
+}
+
+// Only called from `mi_heap_absorb`.
+size_t _mi_page_queue_append(mi_heap_t* heap, mi_page_queue_t* pq, mi_page_queue_t* append) {
+  mi_assert_internal(mi_heap_contains_queue(heap,pq));
+  mi_assert_internal(pq->block_size == append->block_size);
+
+  if (append->first==NULL) return 0;
+
+  // set append pages to new heap and count
+  size_t count = 0;
+  for (mi_page_t* page = append->first; page != NULL; page = page->next) {
+    // inline `mi_page_set_heap` to avoid wrong assertion during absorption;
+    // in this case it is ok to be delayed freeing since both "to" and "from" heap are still alive.
+    mi_atomic_store_release(&page->xheap, (uintptr_t)heap);
+    // set the flag to delayed free (not overriding NEVER_DELAYED_FREE) which has as a
+    // side effect that it spins until any DELAYED_FREEING is finished. This ensures
+    // that after appending only the new heap will be used for delayed free operations.
+    _mi_page_use_delayed_free(page, MI_USE_DELAYED_FREE, false);
+    count++;
+  }
+
+  if (pq->last==NULL) {
+    // take over afresh
+    mi_assert_internal(pq->first==NULL);
+    pq->first = append->first;
+    pq->last = append->last;
+    mi_heap_queue_first_update(heap, pq);
+  }
+  else {
+    // append to end
+    mi_assert_internal(pq->last!=NULL);
+    mi_assert_internal(append->first!=NULL);
+    pq->last->next = append->first;
+    append->first->prev = pq->last;
+    pq->last = append->last;
+  }
+  return count;
+}
diff --git a/compat/mimalloc/page.c b/compat/mimalloc/page.c
new file mode 100644
index 00000000000000..34dae9f5e473cb
--- /dev/null
+++ b/compat/mimalloc/page.c
@@ -0,0 +1,1050 @@
+/*----------------------------------------------------------------------------
+Copyright (c) 2018-2024, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+
+/* -----------------------------------------------------------
+  The core of the allocator. Every segment contains
+  pages of a certain block size. The main function
+  exported is `_mi_malloc_generic`.
+----------------------------------------------------------- */
+
+#include "mimalloc.h"
+#include "mimalloc/internal.h"
+#include "mimalloc/atomic.h"
+
+/* -----------------------------------------------------------
+  Definition of page queues for each block size
+----------------------------------------------------------- */
+
+#define MI_IN_PAGE_C
+#include "page-queue.c"
+#undef MI_IN_PAGE_C
+
+
+/* -----------------------------------------------------------
+  Page helpers
+----------------------------------------------------------- */
+
+// Index a block in a page
+static inline mi_block_t* mi_page_block_at(const mi_page_t* page, void* page_start, size_t block_size, size_t i) {
+  MI_UNUSED(page);
+  mi_assert_internal(page != NULL);
+  mi_assert_internal(i <= page->reserved);
+  return (mi_block_t*)((uint8_t*)page_start + (i * block_size));
+}
+
+static void mi_page_init(mi_heap_t* heap, mi_page_t* page, size_t size, mi_tld_t* tld);
+static bool mi_page_extend_free(mi_heap_t* heap, mi_page_t* page, mi_tld_t* tld);
+
+#if (MI_DEBUG>=3)
+static size_t mi_page_list_count(mi_page_t* page, mi_block_t* head) {
+  size_t count = 0;
+  while (head != NULL) {
+    mi_assert_internal(page == _mi_ptr_page(head));
+    count++;
+    head = mi_block_next(page, head);
+  }
+  return count;
+}
+
+/*
+// Start of the page available memory
+static inline uint8_t* mi_page_area(const mi_page_t* page) {
+  return _mi_page_start(_mi_page_segment(page), page, NULL);
+}
+*/
+
+static bool mi_page_list_is_valid(mi_page_t* page, mi_block_t* p) {
+  size_t psize;
+  uint8_t* page_area = _mi_segment_page_start(_mi_page_segment(page), page, &psize);
+  mi_block_t* start = (mi_block_t*)page_area;
+  mi_block_t* end   = (mi_block_t*)(page_area + psize);
+  while(p != NULL) {
+    if (p < start || p >= end) return false;
+    p = mi_block_next(page, p);
+  }
+#if MI_DEBUG>3 // generally too expensive to check this
+  if (page->free_is_zero) {
+    const size_t ubsize = mi_page_usable_block_size(page);
+    for (mi_block_t* block = page->free; block != NULL; block = mi_block_next(page, block)) {
+      mi_assert_expensive(mi_mem_is_zero(block + 1, ubsize - sizeof(mi_block_t)));
+    }
+  }
+#endif
+  return true;
+}
+
+static bool mi_page_is_valid_init(mi_page_t* page) {
+  mi_assert_internal(mi_page_block_size(page) > 0);
+  mi_assert_internal(page->used <= page->capacity);
+  mi_assert_internal(page->capacity <= page->reserved);
+
+  uint8_t* start = mi_page_start(page);
+  mi_assert_internal(start == _mi_segment_page_start(_mi_page_segment(page), page, NULL));
+  mi_assert_internal(page->is_huge == (_mi_page_segment(page)->kind == MI_SEGMENT_HUGE));
+  //mi_assert_internal(start + page->capacity*page->block_size == page->top);
+
+  mi_assert_internal(mi_page_list_is_valid(page,page->free));
+  mi_assert_internal(mi_page_list_is_valid(page,page->local_free));
+
+  #if MI_DEBUG>3 // generally too expensive to check this
+  if (page->free_is_zero) {
+    const size_t ubsize = mi_page_usable_block_size(page);
+    for(mi_block_t* block = page->free; block != NULL; block = mi_block_next(page,block)) {
+      mi_assert_expensive(mi_mem_is_zero(block + 1, ubsize - sizeof(mi_block_t)));
+    }
+  }
+  #endif
+
+  #if !MI_TRACK_ENABLED && !MI_TSAN
+  mi_block_t* tfree = mi_page_thread_free(page);
+  mi_assert_internal(mi_page_list_is_valid(page, tfree));
+  //size_t tfree_count = mi_page_list_count(page, tfree);
+  //mi_assert_internal(tfree_count <= page->thread_freed + 1);
+  #endif
+
+  size_t free_count = mi_page_list_count(page, page->free) + mi_page_list_count(page, page->local_free);
+  mi_assert_internal(page->used + free_count == page->capacity);
+
+  return true;
+}
+
+extern mi_decl_hidden bool _mi_process_is_initialized;             // has mi_process_init been called?
+
+bool _mi_page_is_valid(mi_page_t* page) {
+  mi_assert_internal(mi_page_is_valid_init(page));
+  #if MI_SECURE
+  mi_assert_internal(page->keys[0] != 0);
+  #endif
+  if (mi_page_heap(page)!=NULL) {
+    mi_segment_t* segment = _mi_page_segment(page);
+
+    mi_assert_internal(!_mi_process_is_initialized || segment->thread_id==0 || segment->thread_id == mi_page_heap(page)->thread_id);
+    #if MI_HUGE_PAGE_ABANDON
+    if (segment->kind != MI_SEGMENT_HUGE)
+    #endif
+    {
+      mi_page_queue_t* pq = mi_page_queue_of(page);
+      mi_assert_internal(mi_page_queue_contains(pq, page));
+      mi_assert_internal(pq->block_size==mi_page_block_size(page) || mi_page_block_size(page) > MI_MEDIUM_OBJ_SIZE_MAX || mi_page_is_in_full(page));
+      mi_assert_internal(mi_heap_contains_queue(mi_page_heap(page),pq));
+    }
+  }
+  return true;
+}
+#endif
+
+void _mi_page_use_delayed_free(mi_page_t* page, mi_delayed_t delay, bool override_never) {
+  while (!_mi_page_try_use_delayed_free(page, delay, override_never)) {
+    mi_atomic_yield();
+  }
+}
+
+bool _mi_page_try_use_delayed_free(mi_page_t* page, mi_delayed_t delay, bool override_never) {
+  mi_thread_free_t tfreex;
+  mi_delayed_t     old_delay;
+  mi_thread_free_t tfree;
+  size_t yield_count = 0;
+  do {
+    tfree = mi_atomic_load_acquire(&page->xthread_free); // note: must acquire as we can break/repeat this loop and not do a CAS;
+    tfreex = mi_tf_set_delayed(tfree, delay);
+    old_delay = mi_tf_delayed(tfree);
+    if mi_unlikely(old_delay == MI_DELAYED_FREEING) {
+      if (yield_count >= 4) return false;  // give up after 4 tries
+      yield_count++;
+      mi_atomic_yield(); // delay until outstanding MI_DELAYED_FREEING are done.
+      // tfree = mi_tf_set_delayed(tfree, MI_NO_DELAYED_FREE); // will cause CAS to busy fail
+    }
+    else if (delay == old_delay) {
+      break; // avoid atomic operation if already equal
+    }
+    else if (!override_never && old_delay == MI_NEVER_DELAYED_FREE) {
+      break; // leave never-delayed flag set
+    }
+  } while ((old_delay == MI_DELAYED_FREEING) ||
+           !mi_atomic_cas_weak_release(&page->xthread_free, &tfree, tfreex));
+
+  return true; // success
+}
+
+/* -----------------------------------------------------------
+  Page collect the `local_free` and `thread_free` lists
+----------------------------------------------------------- */
+
+// Collect the local `thread_free` list using an atomic exchange.
+// Note: The exchange must be done atomically as this is used right after
+// moving to the full list in `mi_page_collect_ex` and we need to
+// ensure that there was no race where the page became unfull just before the move.
+static void _mi_page_thread_free_collect(mi_page_t* page)
+{
+  mi_block_t* head;
+  mi_thread_free_t tfreex;
+  mi_thread_free_t tfree = mi_atomic_load_relaxed(&page->xthread_free);
+  do {
+    head = mi_tf_block(tfree);
+    tfreex = mi_tf_set_block(tfree,NULL);
+  } while (!mi_atomic_cas_weak_acq_rel(&page->xthread_free, &tfree, tfreex));
+
+  // return if the list is empty
+  if (head == NULL) return;
+
+  // find the tail -- also to get a proper count (without data races)
+  size_t max_count = page->capacity; // cannot collect more than capacity
+  size_t count = 1;
+  mi_block_t* tail = head;
+  mi_block_t* next;
+  while ((next = mi_block_next(page,tail)) != NULL && count <= max_count) {
+    count++;
+    tail = next;
+  }
+  // if `count > max_count` there was a memory corruption (possibly infinite list due to double multi-threaded free)
+  if (count > max_count) {
+    _mi_error_message(EFAULT, "corrupted thread-free list\n");
+    return; // the thread-free items cannot be freed
+  }
+
+  // and append the current local free list
+  mi_block_set_next(page,tail, page->local_free);
+  page->local_free = head;
+
+  // update counts now
+  page->used -= (uint16_t)count;
+}
+
+void _mi_page_free_collect(mi_page_t* page, bool force) {
+  mi_assert_internal(page!=NULL);
+
+  // collect the thread free list
+  if (force || mi_page_thread_free(page) != NULL) {  // quick test to avoid an atomic operation
+    _mi_page_thread_free_collect(page);
+  }
+
+  // and the local free list
+  if (page->local_free != NULL) {
+    if mi_likely(page->free == NULL) {
+      // usual case
+      page->free = page->local_free;
+      page->local_free = NULL;
+      page->free_is_zero = false;
+    }
+    else if (force) {
+      // append -- only on shutdown (force) as this is a linear operation
+      mi_block_t* tail = page->local_free;
+      mi_block_t* next;
+      while ((next = mi_block_next(page, tail)) != NULL) {
+        tail = next;
+      }
+      mi_block_set_next(page, tail, page->free);
+      page->free = page->local_free;
+      page->local_free = NULL;
+      page->free_is_zero = false;
+    }
+  }
+
+  mi_assert_internal(!force || page->local_free == NULL);
+}
+
+
+
+/* -----------------------------------------------------------
+  Page fresh and retire
+----------------------------------------------------------- */
+
+// called from segments when reclaiming abandoned pages
+void _mi_page_reclaim(mi_heap_t* heap, mi_page_t* page) {
+  mi_assert_expensive(mi_page_is_valid_init(page));
+
+  mi_assert_internal(mi_page_heap(page) == heap);
+  mi_assert_internal(mi_page_thread_free_flag(page) != MI_NEVER_DELAYED_FREE);
+  #if MI_HUGE_PAGE_ABANDON
+  mi_assert_internal(_mi_page_segment(page)->kind != MI_SEGMENT_HUGE);
+  #endif
+
+  // TODO: push on full queue immediately if it is full?
+  mi_page_queue_t* pq = mi_page_queue(heap, mi_page_block_size(page));
+  mi_page_queue_push(heap, pq, page);
+  mi_assert_expensive(_mi_page_is_valid(page));
+}
+
+// allocate a fresh page from a segment
+static mi_page_t* mi_page_fresh_alloc(mi_heap_t* heap, mi_page_queue_t* pq, size_t block_size, size_t page_alignment) {
+  #if !MI_HUGE_PAGE_ABANDON
+  mi_assert_internal(pq != NULL);
+  mi_assert_internal(mi_heap_contains_queue(heap, pq));
+  mi_assert_internal(page_alignment > 0 || block_size > MI_MEDIUM_OBJ_SIZE_MAX || block_size == pq->block_size);
+  #endif
+  mi_page_t* page = _mi_segment_page_alloc(heap, block_size, page_alignment, &heap->tld->segments);
+  if (page == NULL) {
+    // this may be out-of-memory, or an abandoned page was reclaimed (and in our queue)
+    return NULL;
+  }
+  #if MI_HUGE_PAGE_ABANDON
+  mi_assert_internal(pq==NULL || _mi_page_segment(page)->page_kind != MI_PAGE_HUGE);
+  #endif
+  mi_assert_internal(page_alignment >0 || block_size > MI_MEDIUM_OBJ_SIZE_MAX || _mi_page_segment(page)->kind != MI_SEGMENT_HUGE);
+  mi_assert_internal(pq!=NULL || mi_page_block_size(page) >= block_size);
+  // a fresh page was found, initialize it
+  const size_t full_block_size = (pq == NULL || mi_page_is_huge(page) ? mi_page_block_size(page) : block_size); // see also: mi_segment_huge_page_alloc
+  mi_assert_internal(full_block_size >= block_size);
+  mi_page_init(heap, page, full_block_size, heap->tld);
+  mi_heap_stat_increase(heap, pages, 1);
+  mi_heap_stat_increase(heap, page_bins[_mi_page_stats_bin(page)], 1);
+  if (pq != NULL) { mi_page_queue_push(heap, pq, page); }
+  mi_assert_expensive(_mi_page_is_valid(page));
+  return page;
+}
+
+// Get a fresh page to use
+static mi_page_t* mi_page_fresh(mi_heap_t* heap, mi_page_queue_t* pq) {
+  mi_assert_internal(mi_heap_contains_queue(heap, pq));
+  mi_page_t* page = mi_page_fresh_alloc(heap, pq, pq->block_size, 0);
+  if (page==NULL) return NULL;
+  mi_assert_internal(pq->block_size==mi_page_block_size(page));
+  mi_assert_internal(pq==mi_page_queue(heap, mi_page_block_size(page)));
+  return page;
+}
+
+/* -----------------------------------------------------------
+   Do any delayed frees
+   (put there by other threads if they deallocated in a full page)
+----------------------------------------------------------- */
+void _mi_heap_delayed_free_all(mi_heap_t* heap) {
+  while (!_mi_heap_delayed_free_partial(heap)) {
+    mi_atomic_yield();
+  }
+}
+
+// returns true if all delayed frees were processed
+bool _mi_heap_delayed_free_partial(mi_heap_t* heap) {
+  // take over the list (note: no atomic exchange since it is often NULL)
+  mi_block_t* block = mi_atomic_load_ptr_relaxed(mi_block_t, &heap->thread_delayed_free);
+  while (block != NULL && !mi_atomic_cas_ptr_weak_acq_rel(mi_block_t, &heap->thread_delayed_free, &block, NULL)) { /* nothing */ };
+  bool all_freed = true;
+
+  // and free them all
+  while(block != NULL) {
+    mi_block_t* next = mi_block_nextx(heap,block, heap->keys);
+    // use internal free instead of regular one to keep stats etc correct
+    if (!_mi_free_delayed_block(block)) {
+      // we might already start delayed freeing while another thread has not yet
+      // reset the delayed_freeing flag; in that case delay it further by reinserting the current block
+      // into the delayed free list
+      all_freed = false;
+      mi_block_t* dfree = mi_atomic_load_ptr_relaxed(mi_block_t, &heap->thread_delayed_free);
+      do {
+        mi_block_set_nextx(heap, block, dfree, heap->keys);
+      } while (!mi_atomic_cas_ptr_weak_release(mi_block_t,&heap->thread_delayed_free, &dfree, block));
+    }
+    block = next;
+  }
+  return all_freed;
+}
+
+/* -----------------------------------------------------------
+  Unfull, abandon, free and retire
+----------------------------------------------------------- */
+
+// Move a page from the full list back to a regular list
+void _mi_page_unfull(mi_page_t* page) {
+  mi_assert_internal(page != NULL);
+  mi_assert_expensive(_mi_page_is_valid(page));
+  mi_assert_internal(mi_page_is_in_full(page));
+  if (!mi_page_is_in_full(page)) return;
+
+  mi_heap_t* heap = mi_page_heap(page);
+  mi_page_queue_t* pqfull = &heap->pages[MI_BIN_FULL];
+  mi_page_set_in_full(page, false); // to get the right queue
+  mi_page_queue_t* pq = mi_heap_page_queue_of(heap, page);
+  mi_page_set_in_full(page, true);
+  mi_page_queue_enqueue_from_full(pq, pqfull, page);
+}
+
+static void mi_page_to_full(mi_page_t* page, mi_page_queue_t* pq) {
+  mi_assert_internal(pq == mi_page_queue_of(page));
+  mi_assert_internal(!mi_page_immediate_available(page));
+  mi_assert_internal(!mi_page_is_in_full(page));
+
+  if (mi_page_is_in_full(page)) return;
+  mi_page_queue_enqueue_from(&mi_page_heap(page)->pages[MI_BIN_FULL], pq, page);
+  _mi_page_free_collect(page,false);  // try to collect right away in case another thread freed just before MI_USE_DELAYED_FREE was set
+}
+
+
+// Abandon a page with used blocks at the end of a thread.
+// Note: only call if it is ensured that no references exist from
+// the `page->heap->thread_delayed_free` into this page.
+// Currently only called through `mi_heap_collect_ex` which ensures this.
+void _mi_page_abandon(mi_page_t* page, mi_page_queue_t* pq) {
+  mi_assert_internal(page != NULL);
+  mi_assert_expensive(_mi_page_is_valid(page));
+  mi_assert_internal(pq == mi_page_queue_of(page));
+  mi_assert_internal(mi_page_heap(page) != NULL);
+
+  mi_heap_t* pheap = mi_page_heap(page);
+
+  // remove from our page list
+  mi_segments_tld_t* segments_tld = &pheap->tld->segments;
+  mi_page_queue_remove(pq, page);
+
+  // page is no longer associated with our heap
+  mi_assert_internal(mi_page_thread_free_flag(page)==MI_NEVER_DELAYED_FREE);
+  mi_page_set_heap(page, NULL);
+
+#if (MI_DEBUG>1) && !MI_TRACK_ENABLED
+  // check there are no references left..
+  for (mi_block_t* block = (mi_block_t*)pheap->thread_delayed_free; block != NULL; block = mi_block_nextx(pheap, block, pheap->keys)) {
+    mi_assert_internal(_mi_ptr_page(block) != page);
+  }
+#endif
+
+  // and abandon it
+  mi_assert_internal(mi_page_heap(page) == NULL);
+  _mi_segment_page_abandon(page,segments_tld);
+}
+
+// force abandon a page
+void _mi_page_force_abandon(mi_page_t* page) {
+  mi_heap_t* heap = mi_page_heap(page);
+  // mark page as not using delayed free
+  _mi_page_use_delayed_free(page, MI_NEVER_DELAYED_FREE, false);
+
+  // ensure this page is no longer in the heap delayed free list
+  _mi_heap_delayed_free_all(heap);
+  // We can still access the page meta-info even if it is freed as we ensure
+  // in `mi_segment_force_abandon` that the segment is not freed (yet)
+  if (page->capacity == 0) return; // it may have been freed now
+
+  // and now unlink it from the page queue and abandon (or free)
+  mi_page_queue_t* pq = mi_heap_page_queue_of(heap, page);
+  if (mi_page_all_free(page)) {
+    _mi_page_free(page, pq, false);
+  }
+  else {
+    _mi_page_abandon(page, pq);
+  }
+}
+
+
+// Free a page with no more used blocks
+void _mi_page_free(mi_page_t* page, mi_page_queue_t* pq, bool force) {
+  mi_assert_internal(page != NULL);
+  mi_assert_expensive(_mi_page_is_valid(page));
+  mi_assert_internal(pq == mi_page_queue_of(page));
+  mi_assert_internal(mi_page_all_free(page));
+  mi_assert_internal(mi_page_thread_free_flag(page)!=MI_DELAYED_FREEING);
+
+  // no more aligned blocks in here
+  mi_page_set_has_aligned(page, false);
+
+  // remove from the page list
+  // (no need to do _mi_heap_delayed_free first as all blocks are already free)
+  mi_heap_t* heap = mi_page_heap(page);
+  mi_segments_tld_t* segments_tld = &heap->tld->segments;
+  mi_page_queue_remove(pq, page);
+
+  // and free it  
+  mi_page_set_heap(page,NULL);
+  _mi_segment_page_free(page, force, segments_tld);
+}
+
+#define MI_MAX_RETIRE_SIZE    MI_MEDIUM_OBJ_SIZE_MAX   // should be less than size for MI_BIN_HUGE
+#define MI_RETIRE_CYCLES      (16)
+
+// Retire a page with no more used blocks
+// Important to not retire too quickly though as new
+// allocations might be coming.
+// Note: called from `mi_free` and benchmarks often
+// trigger this due to freeing everything and then
+// allocating again, so be careful when changing this.
+void _mi_page_retire(mi_page_t* page) mi_attr_noexcept {
+  mi_assert_internal(page != NULL);
+  mi_assert_expensive(_mi_page_is_valid(page));
+  mi_assert_internal(mi_page_all_free(page));
+
+  mi_page_set_has_aligned(page, false);
+
+  // don't retire too often..
+  // (or we end up retiring and re-allocating most of the time)
+  // NOTE: refine this more: we should not retire if this
+  // is the only page left with free blocks. It is not clear
+  // how to check this efficiently though...
+  // for now, we don't retire if it is the only page left of this size class.
+  mi_page_queue_t* pq = mi_page_queue_of(page);
+  #if MI_RETIRE_CYCLES > 0
+  const size_t bsize = mi_page_block_size(page);
+  if mi_likely( /* bsize < MI_MAX_RETIRE_SIZE && */ !mi_page_queue_is_special(pq)) {  // not full or huge queue?
+    if (pq->last==page && pq->first==page) { // the only page in the queue?
+      mi_stat_counter_increase(_mi_stats_main.pages_retire,1);
+      page->retire_expire = (bsize <= MI_SMALL_OBJ_SIZE_MAX ? MI_RETIRE_CYCLES : MI_RETIRE_CYCLES/4);
+      mi_heap_t* heap = mi_page_heap(page);
+      mi_assert_internal(pq >= heap->pages);
+      const size_t index = pq - heap->pages;
+      mi_assert_internal(index < MI_BIN_FULL && index < MI_BIN_HUGE);
+      if (index < heap->page_retired_min) heap->page_retired_min = index;
+      if (index > heap->page_retired_max) heap->page_retired_max = index;
+      mi_assert_internal(mi_page_all_free(page));
+      return; // don't free after all
+    }
+  }
+  #endif
+  _mi_page_free(page, pq, false);
+}
+
+// free retired pages: we don't need to look at the entire queues
+// since we only retire pages that are at the head position in a queue.
+void _mi_heap_collect_retired(mi_heap_t* heap, bool force) {
+  size_t min = MI_BIN_FULL;
+  size_t max = 0;
+  for(size_t bin = heap->page_retired_min; bin <= heap->page_retired_max; bin++) {
+    mi_page_queue_t* pq   = &heap->pages[bin];
+    mi_page_t*       page = pq->first;
+    if (page != NULL && page->retire_expire != 0) {
+      if (mi_page_all_free(page)) {
+        page->retire_expire--;
+        if (force || page->retire_expire == 0) {
+          _mi_page_free(pq->first, pq, force);
+        }
+        else {
+          // keep retired, update min/max
+          if (bin < min) min = bin;
+          if (bin > max) max = bin;
+        }
+      }
+      else {
+        page->retire_expire = 0;
+      }
+    }
+  }
+  heap->page_retired_min = min;
+  heap->page_retired_max = max;
+}
+
+
+/* -----------------------------------------------------------
+  Initialize the initial free list in a page.
+  In secure mode we initialize a randomized list by
+  alternating between slices.
+----------------------------------------------------------- */
+
+#define MI_MAX_SLICE_SHIFT  (6)   // at most 64 slices
+#define MI_MAX_SLICES       (1UL << MI_MAX_SLICE_SHIFT)
+#define MI_MIN_SLICES       (2)
+
+static void mi_page_free_list_extend_secure(mi_heap_t* const heap, mi_page_t* const page, const size_t bsize, const size_t extend, mi_stats_t* const stats) {
+  MI_UNUSED(stats);
+  #if (MI_SECURE<=2)
+  mi_assert_internal(page->free == NULL);
+  mi_assert_internal(page->local_free == NULL);
+  #endif
+  mi_assert_internal(page->capacity + extend <= page->reserved);
+  mi_assert_internal(bsize == mi_page_block_size(page));
+  void* const page_area = mi_page_start(page);
+
+  // initialize a randomized free list
+  // set up `slice_count` slices to alternate between
+  size_t shift = MI_MAX_SLICE_SHIFT;
+  while ((extend >> shift) == 0) {
+    shift--;
+  }
+  const size_t slice_count = (size_t)1U << shift;
+  const size_t slice_extend = extend / slice_count;
+  mi_assert_internal(slice_extend >= 1);
+  mi_block_t* blocks[MI_MAX_SLICES];   // current start of the slice
+  size_t      counts[MI_MAX_SLICES];   // available objects in the slice
+  for (size_t i = 0; i < slice_count; i++) {
+    blocks[i] = mi_page_block_at(page, page_area, bsize, page->capacity + i*slice_extend);
+    counts[i] = slice_extend;
+  }
+  counts[slice_count-1] += (extend % slice_count);  // final slice holds the modulus too (todo: distribute evenly?)
+
+  // and initialize the free list by randomly threading through them
+  // set up first element
+  const uintptr_t r = _mi_heap_random_next(heap);
+  size_t current = r % slice_count;
+  counts[current]--;
+  mi_block_t* const free_start = blocks[current];
+  // and iterate through the rest; use `random_shuffle` for performance
+  uintptr_t rnd = _mi_random_shuffle(r|1); // ensure not 0
+  for (size_t i = 1; i < extend; i++) {
+    // call random_shuffle only every INTPTR_SIZE rounds
+    const size_t round = i%MI_INTPTR_SIZE;
+    if (round == 0) rnd = _mi_random_shuffle(rnd);
+    // select a random next slice index
+    size_t next = ((rnd >> 8*round) & (slice_count-1));
+    while (counts[next]==0) {                            // ensure it still has space
+      next++;
+      if (next==slice_count) next = 0;
+    }
+    // and link the current block to it
+    counts[next]--;
+    mi_block_t* const block = blocks[current];
+    blocks[current] = (mi_block_t*)((uint8_t*)block + bsize);  // bump to the following block
+    mi_block_set_next(page, block, blocks[next]);   // and set next; note: we may have `current == next`
+    current = next;
+  }
+  // prepend to the free list (usually NULL)
+  mi_block_set_next(page, blocks[current], page->free);  // end of the list
+  page->free = free_start;
+}
+
+static mi_decl_noinline void mi_page_free_list_extend( mi_page_t* const page, const size_t bsize, const size_t extend, mi_stats_t* const stats)
+{
+  MI_UNUSED(stats);
+  #if (MI_SECURE <= 2)
+  mi_assert_internal(page->free == NULL);
+  mi_assert_internal(page->local_free == NULL);
+  #endif
+  mi_assert_internal(page->capacity + extend <= page->reserved);
+  mi_assert_internal(bsize == mi_page_block_size(page));
+  void* const page_area = mi_page_start(page);
+
+  mi_block_t* const start = mi_page_block_at(page, page_area, bsize, page->capacity);
+
+  // initialize a sequential free list
+  mi_block_t* const last = mi_page_block_at(page, page_area, bsize, page->capacity + extend - 1);
+  mi_block_t* block = start;
+  while(block <= last) {
+    mi_block_t* next = (mi_block_t*)((uint8_t*)block + bsize);
+    mi_block_set_next(page,block,next);
+    block = next;
+  }
+  // prepend to free list (usually `NULL`)
+  mi_block_set_next(page, last, page->free);
+  page->free = start;
+}
+
+/* -----------------------------------------------------------
+  Page initialize and extend the capacity
+----------------------------------------------------------- */
+
+#define MI_MAX_EXTEND_SIZE    (4*1024)      // heuristic, one OS page seems to work well.
+#if (MI_SECURE>0)
+#define MI_MIN_EXTEND         (8*MI_SECURE) // extend at least by this many
+#else
+#define MI_MIN_EXTEND         (4)
+#endif
+
+// Extend the capacity (up to reserved) by initializing a free list
+// We do at most `MI_MAX_EXTEND_SIZE` to avoid touching too much memory
+// Note: we also experimented with "bump" allocation on the first
+// allocations but this did not speed up any benchmark (due to an
+// extra test in malloc? or cache effects?)
+static bool mi_page_extend_free(mi_heap_t* heap, mi_page_t* page, mi_tld_t* tld) {
+  mi_assert_expensive(mi_page_is_valid_init(page));
+  #if (MI_SECURE<=2)
+  mi_assert(page->free == NULL);
+  mi_assert(page->local_free == NULL);
+  if (page->free != NULL) return true;
+  #endif
+  if (page->capacity >= page->reserved) return true;
+
+  mi_stat_counter_increase(tld->stats.pages_extended, 1);
+
+  // calculate the extend count
+  const size_t bsize = mi_page_block_size(page);
+  size_t extend = page->reserved - page->capacity;
+  mi_assert_internal(extend > 0);
+
+  size_t max_extend = (bsize >= MI_MAX_EXTEND_SIZE ? MI_MIN_EXTEND : MI_MAX_EXTEND_SIZE/bsize);
+  if (max_extend < MI_MIN_EXTEND) { max_extend = MI_MIN_EXTEND; }
+  mi_assert_internal(max_extend > 0);
+
+  if (extend > max_extend) {
+    // ensure we don't touch memory beyond the page to reduce page commit.
+    // the `lean` benchmark tests this. Going from 1 to 8 increases rss by 50%.
+    extend = max_extend;
+  }
+
+  mi_assert_internal(extend > 0 && extend + page->capacity <= page->reserved);
+  mi_assert_internal(extend < (1UL<<16));
+
+  // and extend the free list
+  if (extend < MI_MIN_SLICES || MI_SECURE==0) { //!mi_option_is_enabled(mi_option_secure)) {
+    mi_page_free_list_extend(page, bsize, extend, &tld->stats );
+  }
+  else {
+    mi_page_free_list_extend_secure(heap, page, bsize, extend, &tld->stats);
+  }
+  // enable the new free list
+  page->capacity += (uint16_t)extend;
+  mi_stat_increase(tld->stats.page_committed, extend * bsize);
+  mi_assert_expensive(mi_page_is_valid_init(page));
+  return true;
+}
+
+// Initialize a fresh page
+static void mi_page_init(mi_heap_t* heap, mi_page_t* page, size_t block_size, mi_tld_t* tld) {
+  mi_assert(page != NULL);
+  mi_segment_t* segment = _mi_page_segment(page);
+  mi_assert(segment != NULL);
+  mi_assert_internal(block_size > 0);
+  // set fields
+  mi_page_set_heap(page, heap);
+  page->block_size = block_size;
+  size_t page_size;
+  page->page_start = _mi_segment_page_start(segment, page, &page_size);
+  mi_track_mem_noaccess(page->page_start,page_size);
+  mi_assert_internal(mi_page_block_size(page) <= page_size);
+  mi_assert_internal(page_size <= page->slice_count*MI_SEGMENT_SLICE_SIZE);
+  mi_assert_internal(page_size / block_size < (1L<<16));
+  page->reserved = (uint16_t)(page_size / block_size);
+  mi_assert_internal(page->reserved > 0);
+  #if (MI_PADDING || MI_ENCODE_FREELIST)
+  page->keys[0] = _mi_heap_random_next(heap);
+  page->keys[1] = _mi_heap_random_next(heap);
+  #endif
+  page->free_is_zero = page->is_zero_init;
+  #if MI_DEBUG>2
+  if (page->is_zero_init) {
+    mi_track_mem_defined(page->page_start, page_size);
+    mi_assert_expensive(mi_mem_is_zero(page->page_start, page_size));
+  }
+  #endif
+  mi_assert_internal(page->is_committed);
+  if (block_size > 0 && _mi_is_power_of_two(block_size)) {
+    page->block_size_shift = (uint8_t)(mi_ctz((uintptr_t)block_size));
+  }
+  else {
+    page->block_size_shift = 0;
+  }
+
+  mi_assert_internal(page->capacity == 0);
+  mi_assert_internal(page->free == NULL);
+  mi_assert_internal(page->used == 0);
+  mi_assert_internal(page->xthread_free == 0);
+  mi_assert_internal(page->next == NULL);
+  mi_assert_internal(page->prev == NULL);
+  mi_assert_internal(page->retire_expire == 0);
+  mi_assert_internal(!mi_page_has_aligned(page));
+  #if (MI_PADDING || MI_ENCODE_FREELIST)
+  mi_assert_internal(page->keys[0] != 0);
+  mi_assert_internal(page->keys[1] != 0);
+  #endif
+  mi_assert_internal(page->block_size_shift == 0 || (block_size == ((size_t)1 << page->block_size_shift)));
+  mi_assert_expensive(mi_page_is_valid_init(page));
+
+  // initialize an initial free list
+  if (mi_page_extend_free(heap,page,tld)) {
+    mi_assert(mi_page_immediate_available(page));
+  }
+  return;
+}
+
+
+/* -----------------------------------------------------------
+  Find pages with free blocks
+-------------------------------------------------------------*/
+
+// search for a best next page to use for at most N pages (often cut short if immediate blocks are available)
+#define MI_MAX_CANDIDATE_SEARCH  (4)
+
+// is the page not yet used up to its reserved space?
+static bool mi_page_is_expandable(const mi_page_t* page) {
+  mi_assert_internal(page != NULL);
+  mi_assert_internal(page->capacity <= page->reserved);
+  return (page->capacity < page->reserved);
+}
+
+
+// Find a page with free blocks of `page->block_size`.
+static mi_page_t* mi_page_queue_find_free_ex(mi_heap_t* heap, mi_page_queue_t* pq, bool first_try)
+{
+  // search through the pages in "next fit" order
+  #if MI_STAT
+  size_t count = 0;
+  #endif
+  size_t candidate_count = 0;        // we reset this on the first candidate to limit the search
+  mi_page_t* page_candidate = NULL;  // a page with free space
+  mi_page_t* page = pq->first;
+
+  while (page != NULL)
+  {
+    mi_page_t* next = page->next; // remember next
+    #if MI_STAT
+    count++;
+    #endif
+    candidate_count++;
+
+    // collect freed blocks by us and other threads
+    _mi_page_free_collect(page, false);
+
+  #if MI_MAX_CANDIDATE_SEARCH > 1
+    // search up to N pages for a best candidate
+
+    // is the local free list non-empty?
+    const bool immediate_available = mi_page_immediate_available(page);
+
+    // if the page is completely full, move it to the `mi_pages_full`
+    // queue so we don't visit long-lived pages too often.
+    if (!immediate_available && !mi_page_is_expandable(page)) {
+      mi_assert_internal(!mi_page_is_in_full(page) && !mi_page_immediate_available(page));
+      mi_page_to_full(page, pq);
+    }
+    else {
+      // the page has free space, make it a candidate
+      // we prefer non-expandable pages with high usage as candidates (to reduce commit, and increase chances of freeing up pages)
+      if (page_candidate == NULL) {
+        page_candidate = page;
+        candidate_count = 0;
+      }
+      // prefer to reuse fuller pages (in the hope the less used page gets freed)
+      else if (page->used >= page_candidate->used && !mi_page_is_mostly_used(page) && !mi_page_is_expandable(page)) {
+        page_candidate = page;
+      }
+      // if the page has immediately available blocks, or we searched N pages, stop with the best candidate
+      if (immediate_available || candidate_count > MI_MAX_CANDIDATE_SEARCH) {
+        mi_assert_internal(page_candidate!=NULL);
+        break;
+      }
+    }
+  #else
+    // first-fit algorithm
+    // If the page contains free blocks, we are done
+    if (mi_page_immediate_available(page) || mi_page_is_expandable(page)) {
+      break;  // pick this one
+    }
+
+    // If the page is completely full, move it to the `mi_pages_full`
+    // queue so we don't visit long-lived pages too often.
+    mi_assert_internal(!mi_page_is_in_full(page) && !mi_page_immediate_available(page));
+    mi_page_to_full(page, pq);
+  #endif
+
+    page = next;
+  } // for each page
+
+  mi_heap_stat_counter_increase(heap, page_searches, count);
+  mi_heap_stat_counter_increase(heap, page_searches_count, 1);
+
+  // set the page to the best candidate
+  if (page_candidate != NULL) {
+    page = page_candidate;
+  }
+  if (page != NULL) {
+    if (!mi_page_immediate_available(page)) {
+      mi_assert_internal(mi_page_is_expandable(page));
+      if (!mi_page_extend_free(heap, page, heap->tld)) {
+        page = NULL; // failed to extend
+      }
+    }
+    mi_assert_internal(page == NULL || mi_page_immediate_available(page));
+  }
+
+  if (page == NULL) {
+    _mi_heap_collect_retired(heap, false); // perhaps make a page available?
+    page = mi_page_fresh(heap, pq);
+    if (page == NULL && first_try) {
+      // out-of-memory _or_ an abandoned page with free blocks was reclaimed, try once again
+      page = mi_page_queue_find_free_ex(heap, pq, false);
+    }
+  }
+  else {
+    // move the page to the front of the queue
+    mi_page_queue_move_to_front(heap, pq, page);
+    page->retire_expire = 0;
+    // _mi_heap_collect_retired(heap, false); // update retire counts; note: increases rss on MemoryLoad bench so don't do this
+  }
+  mi_assert_internal(page == NULL || mi_page_immediate_available(page));
+
+
+  return page;
+}
+
+
+
+// Find a page with free blocks of `size`.
+static inline mi_page_t* mi_find_free_page(mi_heap_t* heap, size_t size) {
+  mi_page_queue_t* pq = mi_page_queue(heap, size);
+
+  // check the first page: we even do this with candidate search or otherwise we re-search every time
+  mi_page_t* page = pq->first;
+  if (page != NULL) {
+   #if (MI_SECURE>=3) // in secure mode, we extend half the time to increase randomness
+    if (page->capacity < page->reserved && ((_mi_heap_random_next(heap) & 1) == 1)) {
+      mi_page_extend_free(heap, page, heap->tld);
+      mi_assert_internal(mi_page_immediate_available(page));
+    }
+    else
+   #endif
+    {
+      _mi_page_free_collect(page,false);
+    }
+
+    if (mi_page_immediate_available(page)) {
+      page->retire_expire = 0;
+      return page; // fast path
+    }
+  }
+
+  return mi_page_queue_find_free_ex(heap, pq, true);
+}
+
+
+/* -----------------------------------------------------------
+  Users can register a deferred free function called
+  when the `free` list is empty. Since the `local_free`
+  is separate this is deterministically called after
+  a certain number of allocations.
+----------------------------------------------------------- */
+
+static mi_deferred_free_fun* volatile deferred_free = NULL;
+static _Atomic(void*) deferred_arg; // = NULL
+
+void _mi_deferred_free(mi_heap_t* heap, bool force) {
+  heap->tld->heartbeat++;
+  if (deferred_free != NULL && !heap->tld->recurse) {
+    heap->tld->recurse = true;
+    deferred_free(force, heap->tld->heartbeat, mi_atomic_load_ptr_relaxed(void,&deferred_arg));
+    heap->tld->recurse = false;
+  }
+}
+
+void mi_register_deferred_free(mi_deferred_free_fun* fn, void* arg) mi_attr_noexcept {
+  deferred_free = fn;
+  mi_atomic_store_ptr_release(void,&deferred_arg, arg);
+}
+
+
+/* -----------------------------------------------------------
+  General allocation
+----------------------------------------------------------- */
+
+// Large and huge page allocation.
+// Huge pages contain just one block, and the segment contains just that page (as `MI_SEGMENT_HUGE`).
+// Huge pages are also used if the requested alignment is very large (> MI_BLOCK_ALIGNMENT_MAX)
+// so their size is not always `> MI_LARGE_OBJ_SIZE_MAX`.
+static mi_page_t* mi_large_huge_page_alloc(mi_heap_t* heap, size_t size, size_t page_alignment) {
+  size_t block_size = _mi_os_good_alloc_size(size);
+  mi_assert_internal(mi_bin(block_size) == MI_BIN_HUGE || page_alignment > 0);
+  bool is_huge = (block_size > MI_LARGE_OBJ_SIZE_MAX || page_alignment > 0);
+  #if MI_HUGE_PAGE_ABANDON
+  mi_page_queue_t* pq = (is_huge ? NULL : mi_page_queue(heap, block_size));
+  #else
+  mi_page_queue_t* pq = mi_page_queue(heap, is_huge ? MI_LARGE_OBJ_SIZE_MAX+1 : block_size);
+  mi_assert_internal(!is_huge || mi_page_queue_is_huge(pq));
+  #endif
+  mi_page_t* page = mi_page_fresh_alloc(heap, pq, block_size, page_alignment);
+  if (page != NULL) {
+    mi_assert_internal(mi_page_immediate_available(page));
+
+    if (is_huge) {
+      mi_assert_internal(mi_page_is_huge(page));
+      mi_assert_internal(_mi_page_segment(page)->kind == MI_SEGMENT_HUGE);
+      mi_assert_internal(_mi_page_segment(page)->used==1);
+      #if MI_HUGE_PAGE_ABANDON
+      mi_assert_internal(_mi_page_segment(page)->thread_id==0); // abandoned, not in the huge queue
+      mi_page_set_heap(page, NULL);
+      #endif
+    }
+    else {
+      mi_assert_internal(!mi_page_is_huge(page));
+    }
+
+    const size_t bsize = mi_page_usable_block_size(page);  // note: not `mi_page_block_size` to account for padding
+    /*if (bsize <= MI_LARGE_OBJ_SIZE_MAX) {
+      mi_heap_stat_increase(heap, malloc_large, bsize);
+      mi_heap_stat_counter_increase(heap, malloc_large_count, 1);
+    }
+    else */
+    {
+      _mi_stat_increase(&heap->tld->stats.malloc_huge, bsize);
+      _mi_stat_counter_increase(&heap->tld->stats.malloc_huge_count, 1);
+    }
+  }
+  return page;
+}
+
+
+// Allocate a page
+// Note: in debug mode the size includes MI_PADDING_SIZE and might have overflowed.
+static mi_page_t* mi_find_page(mi_heap_t* heap, size_t size, size_t huge_alignment) mi_attr_noexcept {
+  // huge allocation?
+  const size_t req_size = size - MI_PADDING_SIZE;  // correct for padding_size in case of an overflow on `size`
+  if mi_unlikely(req_size > (MI_MEDIUM_OBJ_SIZE_MAX - MI_PADDING_SIZE) || huge_alignment > 0) {
+    if mi_unlikely(req_size > MI_MAX_ALLOC_SIZE) {
+      _mi_error_message(EOVERFLOW, "allocation request is too large (%zu bytes)\n", req_size);
+      return NULL;
+    }
+    else {
+      return mi_large_huge_page_alloc(heap,size,huge_alignment);
+    }
+  }
+  else {
+    // otherwise find a page with free blocks in our size segregated queues
+    #if MI_PADDING
+    mi_assert_internal(size >= MI_PADDING_SIZE);
+    #endif
+    return mi_find_free_page(heap, size);
+  }
+}
+
+// Generic allocation routine if the fast path (`alloc.c:mi_page_malloc`) does not succeed.
+// Note: in debug mode the size includes MI_PADDING_SIZE and might have overflowed.
+// The `huge_alignment` is normally 0 but is set to a multiple of MI_SLICE_SIZE for
+// very large requested alignments in which case we use a huge singleton page.
+void* _mi_malloc_generic(mi_heap_t* heap, size_t size, bool zero, size_t huge_alignment, size_t* usable) mi_attr_noexcept
+{
+  mi_assert_internal(heap != NULL);
+
+  // initialize if necessary
+  if mi_unlikely(!mi_heap_is_initialized(heap)) {
+    heap = mi_heap_get_default(); // calls mi_thread_init
+    if mi_unlikely(!mi_heap_is_initialized(heap)) { return NULL; }
+  }
+  mi_assert_internal(mi_heap_is_initialized(heap));
+
+  // do administrative tasks every N generic mallocs
+  if mi_unlikely(++heap->generic_count >= 100) {
+    heap->generic_collect_count += heap->generic_count;
+    heap->generic_count = 0;
+    // call potential deferred free routines
+    _mi_deferred_free(heap, false);
+
+    // free delayed frees from other threads (but skip contended ones)
+    _mi_heap_delayed_free_partial(heap);
+
+    // collect every once in a while (10000 by default)
+    const long generic_collect = mi_option_get_clamp(mi_option_generic_collect, 1, 1000000L);
+    if (heap->generic_collect_count >= generic_collect) {
+      heap->generic_collect_count = 0;
+      mi_heap_collect(heap, false /* force? */);
+    }
+  }
+
+  // find (or allocate) a page of the right size
+  mi_page_t* page = mi_find_page(heap, size, huge_alignment);
+  if mi_unlikely(page == NULL) { // first time out of memory, try to collect and retry the allocation once more
+    mi_heap_collect(heap, true /* force */);
+    page = mi_find_page(heap, size, huge_alignment);
+  }
+
+  if mi_unlikely(page == NULL) { // out of memory
+    const size_t req_size = size - MI_PADDING_SIZE;  // correct for padding_size in case of an overflow on `size`
+    _mi_error_message(ENOMEM, "unable to allocate memory (%zu bytes)\n", req_size);
+    return NULL;
+  }
+
+  mi_assert_internal(mi_page_immediate_available(page));
+  mi_assert_internal(mi_page_block_size(page) >= size);
+
+  // and try again, this time succeeding! (i.e. this should never recurse through _mi_page_malloc)
+  void* p;
+  if mi_unlikely(zero && mi_page_is_huge(page)) {
+    // note: we cannot call _mi_page_malloc with zeroing for huge blocks; we zero it afterwards in that case.
+    p = _mi_page_malloc_zero(heap, page, size, false, usable);
+    mi_assert_internal(p != NULL);
+    _mi_memzero_aligned(p, mi_page_usable_block_size(page));
+  }
+  else {
+    p = _mi_page_malloc_zero(heap, page, size, zero, usable);
+    mi_assert_internal(p != NULL);
+  }
+  // move singleton pages to the full queue
+  if (page->reserved == page->used) {
+    mi_page_to_full(page, mi_page_queue_of(page));
+  }
+  return p;
+}
diff --git a/compat/mimalloc/prim/osx/prim.c b/compat/mimalloc/prim/osx/prim.c
new file mode 100644
index 00000000000000..8a2f4e8aa47316
--- /dev/null
+++ b/compat/mimalloc/prim/osx/prim.c
@@ -0,0 +1,9 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2023, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+
+// We use the unix/prim.c with the mmap API on macOSX
+#include "../unix/prim.c"
diff --git a/compat/mimalloc/prim/prim.c b/compat/mimalloc/prim/prim.c
new file mode 100644
index 00000000000000..5147bae81feaaf
--- /dev/null
+++ b/compat/mimalloc/prim/prim.c
@@ -0,0 +1,76 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2023, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+
+// Select the implementation of the primitives
+// depending on the OS.
+
+#if defined(_WIN32)
+#include "windows/prim.c"  // VirtualAlloc (Windows)
+
+#elif defined(__APPLE__)
+#include "osx/prim.c"      // macOSX (actually defers to mmap in unix/prim.c)
+
+#elif defined(__wasi__)
+#define MI_USE_SBRK
+#include "wasi/prim.c"     // memory-grow or sbrk (Wasm)
+
+#elif defined(__EMSCRIPTEN__)
+#include "emscripten/prim.c" // emmalloc_*, + pthread support
+
+#else
+#include "unix/prim.c"     // mmap() (Linux, macOSX, BSD, Illumos, Haiku, DragonFly, etc.)
+
+#endif
+
+// Generic process initialization
+#ifndef MI_PRIM_HAS_PROCESS_ATTACH
+#if defined(__GNUC__) || defined(__clang__)
+  // gcc,clang: use the constructor/destructor attribute
+  // which for both seem to run before regular constructors/destructors
+  #if defined(__clang__)
+    #define mi_attr_constructor __attribute__((constructor(101)))
+    #define mi_attr_destructor  __attribute__((destructor(101)))
+  #else
+    #define mi_attr_constructor __attribute__((constructor))
+    #define mi_attr_destructor  __attribute__((destructor))
+  #endif
+  static void mi_attr_constructor mi_process_attach(void) {
+    _mi_auto_process_init();
+  }
+  static void mi_attr_destructor mi_process_detach(void) {
+    _mi_auto_process_done();
+  }
+#elif defined(__cplusplus)
+  // C++: use static initialization to detect process start/end
+  // This is not guaranteed to be first/last but the best we can generally do?
+  struct mi_init_done_t {
+    mi_init_done_t() {
+      _mi_auto_process_init();
+    }
+    ~mi_init_done_t() {
+      _mi_auto_process_done();
+    }
+  };
+  static mi_init_done_t mi_init_done;
+#else
+  #pragma message("define a way to call _mi_auto_process_init/done on your platform")
+#endif
+#endif
+
+// Generic allocator init/done callback
+#ifndef MI_PRIM_HAS_ALLOCATOR_INIT
+bool _mi_is_redirected(void) {
+  return false;
+}
+bool _mi_allocator_init(const char** message) {
+  if (message != NULL) { *message = NULL; }
+  return true;
+}
+void _mi_allocator_done(void) {
+  // nothing to do
+}
+#endif
diff --git a/compat/mimalloc/prim/unix/prim.c b/compat/mimalloc/prim/unix/prim.c
new file mode 100644
index 00000000000000..99331e3f8215c1
--- /dev/null
+++ b/compat/mimalloc/prim/unix/prim.c
@@ -0,0 +1,962 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2025, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+
+// This file is included in `src/prim/prim.c`
+
+#ifndef _DEFAULT_SOURCE
+#define _DEFAULT_SOURCE   // ensure mmap flags and syscall are defined
+#endif
+
+#if defined(__sun)
+// illumos provides the new mman.h API when any of these are defined,
+// otherwise the old API based on caddr_t, which predates the void-pointer one.
+// Stock Solaris provides only the former, so we chose to discard those
+// flags only here rather than project wide.
+#undef _XOPEN_SOURCE
+#undef _POSIX_C_SOURCE
+#endif
+
+#include "mimalloc.h"
+#include "mimalloc/internal.h"
+#include "mimalloc/prim.h"
+
+#include <sys/mman.h>  // mmap
+#include <unistd.h>    // sysconf
+#include <fcntl.h>     // open, close, read, access
+#include <stdlib.h>    // getenv, arc4random_buf
+
+#if defined(__linux__)
+  #include <features.h>
+  #include <sys/prctl.h>    // THP disable, PR_SET_VMA
+  #include <sys/sysinfo.h>  // sysinfo
+  #if defined(__GLIBC__) && !defined(PR_SET_VMA)
+  #include <linux/prctl.h>
+  #endif
+  #if defined(__GLIBC__)
+  #include <linux/mman.h>   // linux mmap flags
+  #else
+  #include <sys/mman.h>
+  #endif
+#elif defined(__APPLE__)
+  #include <AvailabilityMacros.h>
+  #include <TargetConditionals.h>
+  #if !defined(TARGET_OS_OSX) || TARGET_OS_OSX   // see issue #879, used to be (!TARGET_IOS_IPHONE && !TARGET_IOS_SIMULATOR)
+  #include <mach/vm_statistics.h>    // VM_MAKE_TAG, VM_FLAGS_SUPERPAGE_SIZE_2MB, etc.
+  #endif
+  #if !defined(MAC_OS_X_VERSION_10_7)
+  #define MAC_OS_X_VERSION_10_7   1070
+  #endif
+  #include <sys/sysctl.h>
+#elif defined(__FreeBSD__) || defined(__DragonFly__)
+  #include <sys/param.h>
+  #if __FreeBSD_version >= 1200000
+  #include <sys/cpuset.h>
+  #include <sys/domainset.h>
+  #endif
+  #include <sys/sysctl.h>
+#endif
+
+#if (defined(__linux__) && !defined(__ANDROID__)) || defined(__FreeBSD__)
+  #define MI_HAS_SYSCALL_H
+  #include <sys/syscall.h>
+#endif
+
+#if !defined(MADV_DONTNEED) && defined(POSIX_MADV_DONTNEED)  // QNX
+#define MADV_DONTNEED  POSIX_MADV_DONTNEED
+#endif
+#if !defined(MADV_FREE) && defined(POSIX_MADV_FREE)  // QNX
+#define MADV_FREE  POSIX_MADV_FREE
+#endif
+
+#define MI_UNIX_LARGE_PAGE_SIZE (2*MI_MiB) // TODO: can we query the OS for this?
+
+//------------------------------------------------------------------------------------
+// Use syscalls for some primitives to allow for libraries that override open/read/close etc.
+// and do allocation themselves; using syscalls prevents recursion when mimalloc is
+// still initializing (issue #713)
+// Declare inline to avoid unused function warnings.
+//------------------------------------------------------------------------------------
+
+#if defined(MI_HAS_SYSCALL_H) && defined(SYS_open) && defined(SYS_close) && defined(SYS_read) && defined(SYS_access)
+
+static inline int mi_prim_open(const char* fpath, int open_flags) {
+  return syscall(SYS_open,fpath,open_flags,0);
+}
+static inline ssize_t mi_prim_read(int fd, void* buf, size_t bufsize) {
+  return syscall(SYS_read,fd,buf,bufsize);
+}
+static inline int mi_prim_close(int fd) {
+  return syscall(SYS_close,fd);
+}
+static inline int mi_prim_access(const char *fpath, int mode) {
+  return syscall(SYS_access,fpath,mode);
+}
+
+#else
+
+static inline int mi_prim_open(const char* fpath, int open_flags) {
+  return open(fpath,open_flags);
+}
+static inline ssize_t mi_prim_read(int fd, void* buf, size_t bufsize) {
+  return read(fd,buf,bufsize);
+}
+static inline int mi_prim_close(int fd) {
+  return close(fd);
+}
+static inline int mi_prim_access(const char *fpath, int mode) {
+  return access(fpath,mode);
+}
+
+#endif
+
+
+
+//---------------------------------------------
+// init
+//---------------------------------------------
+
+static bool unix_detect_overcommit(void) {
+  bool os_overcommit = true;
+  #if defined(__linux__)
+    int fd = mi_prim_open("/proc/sys/vm/overcommit_memory", O_RDONLY);
+    if (fd >= 0) {
+      char buf[32];
+      ssize_t nread = mi_prim_read(fd, &buf, sizeof(buf));
+      mi_prim_close(fd);
+      // <https://www.kernel.org/doc/Documentation/vm/overcommit-accounting>
+      // 0: heuristic overcommit, 1: always overcommit, 2: never overcommit (ignore NORESERVE)
+      if (nread >= 1) {
+        os_overcommit = (buf[0] == '0' || buf[0] == '1');
+      }
+    }
+  #elif defined(__FreeBSD__)
+    int val = 0;
+    size_t olen = sizeof(val);
+    if (sysctlbyname("vm.overcommit", &val, &olen, NULL, 0) == 0) {
+      os_overcommit = (val != 0);
+    }
+  #else
+    // default: overcommit is true
+  #endif
+  return os_overcommit;
+}
+
+// try to detect the physical memory dynamically (if possible)
+static void unix_detect_physical_memory( size_t page_size, size_t* physical_memory_in_kib ) {
+  #if defined(CTL_HW) && (defined(HW_PHYSMEM64) || defined(HW_MEMSIZE))  // FreeBSD, macOS
+    MI_UNUSED(page_size);
+    int64_t physical_memory = 0;
+    size_t length = sizeof(int64_t);
+    #if defined(HW_PHYSMEM64)
+    int mib[2] = { CTL_HW, HW_PHYSMEM64 };
+    #else
+    int mib[2] = { CTL_HW, HW_MEMSIZE };
+    #endif
+    const int err = sysctl(mib, 2, &physical_memory, &length, NULL, 0);
+    if (err==0 && physical_memory > 0) {
+      const int64_t phys_in_kib = physical_memory / MI_KiB;
+      if (phys_in_kib > 0 && (uint64_t)phys_in_kib <= SIZE_MAX) {
+        *physical_memory_in_kib = (size_t)phys_in_kib;
+      }
+    }
+  #elif defined(__linux__)
+    MI_UNUSED(page_size);
+    struct sysinfo info; _mi_memzero_var(info);
+    const int err = sysinfo(&info);
+    if (err==0 && info.totalram > 0 && info.totalram <= SIZE_MAX) {
+      *physical_memory_in_kib = (size_t)info.totalram / MI_KiB;
+    }
+  #elif defined(_SC_PHYS_PAGES)  // do not use by default as it might cause allocation (by using `fopen` to parse /proc/meminfo) (issue #1100)
+    const long pphys = sysconf(_SC_PHYS_PAGES);
+    const size_t psize_in_kib = page_size / MI_KiB;
+    if (psize_in_kib > 0 && pphys > 0 && (unsigned long)pphys <= SIZE_MAX && (size_t)pphys <= (SIZE_MAX/psize_in_kib)) {
+      *physical_memory_in_kib = (size_t)pphys * psize_in_kib;
+    }
+  #endif
+}
+
+void _mi_prim_mem_init( mi_os_mem_config_t* config )
+{
+  long psize = sysconf(_SC_PAGESIZE);
+  if (psize > 0 && (unsigned long)psize < SIZE_MAX) {
+    config->page_size = (size_t)psize;
+    config->alloc_granularity = (size_t)psize;
+    unix_detect_physical_memory(config->page_size, &config->physical_memory_in_kib);
+  }
+  config->large_page_size = MI_UNIX_LARGE_PAGE_SIZE;
+  config->has_overcommit = unix_detect_overcommit();
+  config->has_partial_free = true;    // mmap can free in parts
+  config->has_virtual_reserve = true; // todo: check if this is true for NetBSD?  (for anonymous mmap with PROT_NONE)
+
+  // disable transparent huge pages for this process?
+  #if (defined(__linux__) || defined(__ANDROID__)) && defined(PR_GET_THP_DISABLE)
+  #if defined(MI_NO_THP)
+  if (true)
+  #else
+  if (!mi_option_is_enabled(mi_option_allow_thp)) // disable THP if requested through an option
+  #endif
+  {
+    int val = 0;
+    if (prctl(PR_GET_THP_DISABLE, &val, 0, 0, 0) != 0) {
+      // Most likely since distros often come with always/madvise settings.
+      val = 1;
+      // Disabling only for mimalloc process rather than touching system wide settings
+      (void)prctl(PR_SET_THP_DISABLE, &val, 0, 0, 0);
+    }
+  }
+  #endif
+}
+
+
+//---------------------------------------------
+// free
+//---------------------------------------------
+
+int _mi_prim_free(void* addr, size_t size ) {
+  if (size==0) return 0;
+  bool err = (munmap(addr, size) == -1);
+  return (err ? errno : 0);
+}
+
+
+//---------------------------------------------
+// mmap
+//---------------------------------------------
+
+static int unix_madvise(void* addr, size_t size, int advice) {
+  #if defined(__sun)
+  int res = madvise((caddr_t)addr, size, advice);  // Solaris needs cast (issue #520)
+  #elif defined(__QNX__)
+  int res = posix_madvise(addr, size, advice);
+  #else
+  int res = madvise(addr, size, advice);
+  #endif
+  return (res==0 ? 0 : errno);
+}
+
+static void* unix_mmap_prim(void* addr, size_t size, int protect_flags, int flags, int fd) {
+  void* p = mmap(addr, size, protect_flags, flags, fd, 0 /* offset */);
+  #if defined(__linux__) && defined(PR_SET_VMA)
+  if (p!=MAP_FAILED && p!=NULL) {
+    prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, p, size, "mimalloc");
+  }
+  #endif
+  return p;
+}
+
+static void* unix_mmap_prim_aligned(void* addr, size_t size, size_t try_alignment, int protect_flags, int flags, int fd) {
+  MI_UNUSED(try_alignment);
+  void* p = NULL;
+  #if defined(MAP_ALIGNED)  // BSD
+  if (addr == NULL && try_alignment > 1 && (try_alignment % _mi_os_page_size()) == 0) {
+    size_t n = mi_bsr(try_alignment);
+    if (((size_t)1 << n) == try_alignment && n >= 12 && n <= 30) {  // alignment is a power of 2 and 4096 <= alignment <= 1GiB
+      p = unix_mmap_prim(addr, size, protect_flags, flags | MAP_ALIGNED(n), fd);
+      if (p==MAP_FAILED || !_mi_is_aligned(p,try_alignment)) {
+        int err = errno;
+        _mi_trace_message("unable to directly request aligned OS memory (error: %d (0x%x), size: 0x%zx bytes, alignment: 0x%zx, hint address: %p)\n", err, err, size, try_alignment, addr);
+      }
+      if (p!=MAP_FAILED) return p;
+      // fall back to regular mmap
+    }
+  }
+  #elif defined(MAP_ALIGN)  // Solaris
+  if (addr == NULL && try_alignment > 1 && (try_alignment % _mi_os_page_size()) == 0) {
+    p = unix_mmap_prim((void*)try_alignment, size, protect_flags, flags | MAP_ALIGN, fd);  // addr parameter is the required alignment
+    if (p!=MAP_FAILED) return p;
+    // fall back to regular mmap
+  }
+  #endif
+  #if (MI_INTPTR_SIZE >= 8) && !defined(MAP_ALIGNED)
+  // on 64-bit systems, use the virtual address area after 2TiB for 4MiB aligned allocations
+  if (addr == NULL) {
+    void* hint = _mi_os_get_aligned_hint(try_alignment, size);
+    if (hint != NULL) {
+      p = unix_mmap_prim(hint, size, protect_flags, flags, fd);
+      if (p==MAP_FAILED || !_mi_is_aligned(p,try_alignment)) {
+        #if MI_TRACK_ENABLED  // asan sometimes does not instrument errno correctly?
+        int err = 0;
+        #else
+        int err = errno;
+        #endif
+        _mi_trace_message("unable to directly request hinted aligned OS memory (error: %d (0x%x), size: 0x%zx bytes, alignment: 0x%zx, hint address: %p)\n", err, err, size, try_alignment, hint);
+      }
+      if (p!=MAP_FAILED) return p;
+      // fall back to regular mmap
+    }
+  }
+  #endif
+  // regular mmap
+  p = unix_mmap_prim(addr, size, protect_flags, flags, fd);
+  if (p!=MAP_FAILED) return p;
+  // failed to allocate
+  return NULL;
+}
+
+static int unix_mmap_fd(void) {
+  #if defined(VM_MAKE_TAG)
+  // macOS: tracking anonymous page with a specific ID. (All up to 98 are taken officially but LLVM sanitizers had taken 99)
+  int os_tag = (int)mi_option_get(mi_option_os_tag);
+  if (os_tag < 100 || os_tag > 255) { os_tag = 254; }
+  return VM_MAKE_TAG(os_tag);
+  #else
+  return -1;
+  #endif
+}
+
+static void* unix_mmap(void* addr, size_t size, size_t try_alignment, int protect_flags, bool large_only, bool allow_large, bool* is_large) {
+  #if !defined(MAP_ANONYMOUS)
+  #define MAP_ANONYMOUS  MAP_ANON
+  #endif
+  #if !defined(MAP_NORESERVE)
+  #define MAP_NORESERVE  0
+  #endif
+  void* p = NULL;
+  const int fd = unix_mmap_fd();
+  int flags = MAP_PRIVATE | MAP_ANONYMOUS;
+  if (_mi_os_has_overcommit()) {
+    flags |= MAP_NORESERVE;
+  }
+  #if defined(PROT_MAX)
+  protect_flags |= PROT_MAX(PROT_READ | PROT_WRITE); // BSD
+  #endif
+  // huge page allocation
+  if (allow_large && (large_only || (_mi_os_canuse_large_page(size, try_alignment) && mi_option_is_enabled(mi_option_allow_large_os_pages)))) {
+    static _Atomic(size_t) large_page_try_ok; // = 0;
+    size_t try_ok = mi_atomic_load_acquire(&large_page_try_ok);
+    if (!large_only && try_ok > 0) {
+      // If the OS is not configured for large OS pages, or the user does not have
+      // enough permission, the `mmap` will always fail (but it might also fail for other reasons).
+      // Therefore, once a large page allocation failed, we don't try again for `large_page_try_ok` times
+      // to avoid too many failing calls to mmap.
+      mi_atomic_cas_strong_acq_rel(&large_page_try_ok, &try_ok, try_ok - 1);
+    }
+    else {
+      int lflags = flags & ~MAP_NORESERVE;  // using NORESERVE on huge pages seems to fail on Linux
+      int lfd = fd;
+      #ifdef MAP_ALIGNED_SUPER
+      lflags |= MAP_ALIGNED_SUPER;
+      #endif
+      #ifdef MAP_HUGETLB
+      lflags |= MAP_HUGETLB;
+      #endif
+      #ifdef MAP_HUGE_1GB
+      static bool mi_huge_pages_available = true;
+      if (large_only && (size % MI_GiB) == 0 && mi_huge_pages_available) {
+        lflags |= MAP_HUGE_1GB;
+      }
+      else
+      #endif
+      {
+        #ifdef MAP_HUGE_2MB
+        lflags |= MAP_HUGE_2MB;
+        #endif
+      }
+      #ifdef VM_FLAGS_SUPERPAGE_SIZE_2MB
+      lfd |= VM_FLAGS_SUPERPAGE_SIZE_2MB;
+      #endif
+      if (large_only || lflags != flags) {
+        // try large OS page allocation
+        *is_large = true;
+        p = unix_mmap_prim_aligned(addr, size, try_alignment, protect_flags, lflags, lfd);
+        #ifdef MAP_HUGE_1GB
+        if (p == NULL && (lflags & MAP_HUGE_1GB) == MAP_HUGE_1GB) {
+          mi_huge_pages_available = false; // don't try huge 1GiB pages again
+          if (large_only) {
+            _mi_warning_message("unable to allocate huge (1GiB) page, trying large (2MiB) pages instead (errno: %i)\n", errno);
+          }
+          lflags = ((lflags & ~MAP_HUGE_1GB) | MAP_HUGE_2MB);
+          p = unix_mmap_prim_aligned(addr, size, try_alignment, protect_flags, lflags, lfd);
+        }
+        #endif
+        if (large_only) return p;
+        if (p == NULL) {
+          mi_atomic_store_release(&large_page_try_ok, (size_t)8);  // on error, don't try again for the next N allocations
+        }
+      }
+    }
+  }
+  // regular allocation
+  if (p == NULL) {
+    *is_large = false;
+    p = unix_mmap_prim_aligned(addr, size, try_alignment, protect_flags, flags, fd);
+    #if !defined(MI_NO_THP)
+    if (p != NULL && allow_large && mi_option_is_enabled(mi_option_allow_thp) && _mi_os_canuse_large_page(size, try_alignment)) {
+      #if defined(MADV_HUGEPAGE)
+      // Many Linux systems don't allow MAP_HUGETLB but they support instead
+      // transparent huge pages (THP). Generally, it is not required to call `madvise` with MADV_HUGEPAGE
+      // though since properly aligned allocations will already use large pages if available
+      // in that case -- in particular for our large regions (in `memory.c`).
+      // However, some systems only allow THP if called with explicit `madvise`, so
+      // when large OS pages are enabled for mimalloc, we call `madvise` anyways.
+      if (unix_madvise(p, size, MADV_HUGEPAGE) == 0) {
+        // *is_large = true; // possibly
+      }
+      #elif defined(__sun)
+      struct memcntl_mha cmd = {0};
+      cmd.mha_pagesize = _mi_os_large_page_size();
+      cmd.mha_cmd = MHA_MAPSIZE_VA;
+      if (memcntl((caddr_t)p, size, MC_HAT_ADVISE, (caddr_t)&cmd, 0, 0) == 0) {
+        // *is_large = true; // possibly
+      }
+      #endif
+    }
+    #endif
+  }
+  return p;
+}
+
+// Note: the `try_alignment` is just a hint and the returned pointer is not guaranteed to be aligned.
+int _mi_prim_alloc(void* hint_addr, size_t size, size_t try_alignment, bool commit, bool allow_large, bool* is_large, bool* is_zero, void** addr) {
+  mi_assert_internal(size > 0 && (size % _mi_os_page_size()) == 0);
+  mi_assert_internal(commit || !allow_large);
+  mi_assert_internal(try_alignment > 0);
+  if (hint_addr == NULL && size >= 8*MI_UNIX_LARGE_PAGE_SIZE && try_alignment > 1 && _mi_is_power_of_two(try_alignment) && try_alignment < MI_UNIX_LARGE_PAGE_SIZE) {
+    try_alignment = MI_UNIX_LARGE_PAGE_SIZE; // try to align along large page size for larger allocations
+  }
+
+  *is_zero = true;
+  int protect_flags = (commit ? (PROT_WRITE | PROT_READ) : PROT_NONE);
+  *addr = unix_mmap(hint_addr, size, try_alignment, protect_flags, false, allow_large, is_large);
+  return (*addr != NULL ? 0 : errno);
+}
+
+
+//---------------------------------------------
+// Commit/Reset
+//---------------------------------------------
+
+static void unix_mprotect_hint(int err) {
+  #if defined(__linux__) && (MI_SECURE>=2) // guard page around every mimalloc page
+  if (err == ENOMEM) {
+    _mi_warning_message("The next warning may be caused by a low memory map limit.\n"
+                        "  On Linux this is controlled by the vm.max_map_count -- maybe increase it?\n"
+                        "  For example: sudo sysctl -w vm.max_map_count=262144\n");
+  }
+  #else
+  MI_UNUSED(err);
+  #endif
+}
+
+int _mi_prim_commit(void* start, size_t size, bool* is_zero) {
+  // commit: ensure we can access the area
+  // note: we may think that *is_zero can be true since the memory
+  // was either from mmap PROT_NONE, or from decommit MADV_DONTNEED, but
+  // we sometimes call commit on a range with still partially committed
+  // memory and `mprotect` does not zero the range.
+  *is_zero = false;
+  int err = mprotect(start, size, (PROT_READ | PROT_WRITE));
+  if (err != 0) {
+    err = errno;
+    unix_mprotect_hint(err);
+  }
+  return err;
+}
+
+int _mi_prim_reuse(void* start, size_t size) {
+  MI_UNUSED(start); MI_UNUSED(size);
+  #if defined(__APPLE__) && defined(MADV_FREE_REUSE)
+  return unix_madvise(start, size, MADV_FREE_REUSE);
+  #endif
+  return 0;
+}
+
+int _mi_prim_decommit(void* start, size_t size, bool* needs_recommit) {
+  int err = 0;
+  #if defined(__APPLE__) && defined(MADV_FREE_REUSABLE)
+    // decommit on macOS: use MADV_FREE_REUSABLE as it does immediate rss accounting (issue #1097)
+    err = unix_madvise(start, size, MADV_FREE_REUSABLE);
+    if (err) { err = unix_madvise(start, size, MADV_DONTNEED); }
+  #else
+    // decommit: use MADV_DONTNEED as it decreases rss immediately (unlike MADV_FREE)
+    err = unix_madvise(start, size, MADV_DONTNEED);
+  #endif
+  #if !MI_DEBUG && MI_SECURE<=2
+    *needs_recommit = false;
+  #else
+    *needs_recommit = true;
+    mprotect(start, size, PROT_NONE);
+  #endif
+  /*
+  // decommit: use mmap with MAP_FIXED and PROT_NONE to discard the existing memory (and reduce rss)
+  *needs_recommit = true;
+  const int fd = unix_mmap_fd();
+  void* p = mmap(start, size, PROT_NONE, (MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE), fd, 0);
+  if (p != start) { err = errno; }
+  */
+  return err;
+}
+
+int _mi_prim_reset(void* start, size_t size) {
+  int err = 0;
+
+  // on macOS can use MADV_FREE_REUSABLE (but we disable this for now as it seems slower)
+  #if 0 && defined(__APPLE__) && defined(MADV_FREE_REUSABLE)
+  err = unix_madvise(start, size, MADV_FREE_REUSABLE);
+  if (err==0) return 0;
+  // fall through
+  #endif
+
+  #if defined(MADV_FREE)
+  // Otherwise, we try to use `MADV_FREE` as that is the fastest. A drawback though is that it
+  // will not reduce the `rss` stats in tools like `top` even though the memory is available
+  // to other processes. With the default `MIMALLOC_PURGE_DECOMMITS=1` we ensure that by
+  // default `MADV_DONTNEED` is used though.
+  static _Atomic(size_t) advice = MI_ATOMIC_VAR_INIT(MADV_FREE);
+  int oadvice = (int)mi_atomic_load_relaxed(&advice);
+  while ((err = unix_madvise(start, size, oadvice)) != 0 && errno == EAGAIN) { errno = 0;  };
+  if (err != 0 && errno == EINVAL && oadvice == MADV_FREE) {
+    // if MADV_FREE is not supported, fall back to MADV_DONTNEED from now on
+    mi_atomic_store_release(&advice, (size_t)MADV_DONTNEED);
+    err = unix_madvise(start, size, MADV_DONTNEED);
+  }
+  #else
+  err = unix_madvise(start, size, MADV_DONTNEED);
+  #endif
+  return err;
+}
+
+int _mi_prim_protect(void* start, size_t size, bool protect) {
+  int err = mprotect(start, size, protect ? PROT_NONE : (PROT_READ | PROT_WRITE));
+  if (err != 0) { err = errno; }
+  unix_mprotect_hint(err);
+  return err;
+}
+
+
+
+//---------------------------------------------
+// Huge page allocation
+//---------------------------------------------
+
+#if (MI_INTPTR_SIZE >= 8) && !defined(__HAIKU__) && !defined(__CYGWIN__)
+
+#ifndef MPOL_PREFERRED
+#define MPOL_PREFERRED 1
+#endif
+
+#if defined(MI_HAS_SYSCALL_H) && defined(SYS_mbind)
+static long mi_prim_mbind(void* start, unsigned long len, unsigned long mode, const unsigned long* nmask, unsigned long maxnode, unsigned flags) {
+  return syscall(SYS_mbind, start, len, mode, nmask, maxnode, flags);
+}
+#else
+static long mi_prim_mbind(void* start, unsigned long len, unsigned long mode, const unsigned long* nmask, unsigned long maxnode, unsigned flags) {
+  MI_UNUSED(start); MI_UNUSED(len); MI_UNUSED(mode); MI_UNUSED(nmask); MI_UNUSED(maxnode); MI_UNUSED(flags);
+  return 0;
+}
+#endif
+
+int _mi_prim_alloc_huge_os_pages(void* hint_addr, size_t size, int numa_node, bool* is_zero, void** addr) {
+  bool is_large = true;
+  *is_zero = true;
+  *addr = unix_mmap(hint_addr, size, MI_SEGMENT_SIZE, PROT_READ | PROT_WRITE, true, true, &is_large);
+  if (*addr != NULL && numa_node >= 0 && numa_node < 8*MI_INTPTR_SIZE) { // at most 64 nodes
+    unsigned long numa_mask = (1UL << numa_node);
+    // TODO: does `mbind` work correctly for huge OS pages? should we
+    // use `set_mempolicy` before calling mmap instead?
+    // see: <https://lkml.org/lkml/2017/2/9/875>
+    long err = mi_prim_mbind(*addr, size, MPOL_PREFERRED, &numa_mask, 8*MI_INTPTR_SIZE, 0);
+    if (err != 0) {
+      err = errno;
+      _mi_warning_message("failed to bind huge (1GiB) pages to numa node %d (error: %d (0x%x))\n", numa_node, err, err);
+    }
+  }
+  return (*addr != NULL ? 0 : errno);
+}
+
+#else
+
+int _mi_prim_alloc_huge_os_pages(void* hint_addr, size_t size, int numa_node, bool* is_zero, void** addr) {
+  MI_UNUSED(hint_addr); MI_UNUSED(size); MI_UNUSED(numa_node);
+  *is_zero = false;
+  *addr = NULL;
+  return ENOMEM;
+}
+
+#endif
+
+//---------------------------------------------
+// NUMA nodes
+//---------------------------------------------
+
+#if defined(__linux__)
+
+size_t _mi_prim_numa_node(void) {
+  #if defined(MI_HAS_SYSCALL_H) && defined(SYS_getcpu)
+    unsigned long node = 0;
+    unsigned long ncpu = 0;
+    long err = syscall(SYS_getcpu, &ncpu, &node, NULL);
+    if (err != 0) return 0;
+    return node;
+  #else
+    return 0;
+  #endif
+}
+
+size_t _mi_prim_numa_node_count(void) {
+  char buf[128];
+  unsigned node = 0;
+  for(node = 0; node < 256; node++) {
+    // enumerate node entries -- todo: is there a more efficient way to do this? (but ensure there is no allocation)
+    _mi_snprintf(buf, 127, "/sys/devices/system/node/node%u", node + 1);
+    if (mi_prim_access(buf,R_OK) != 0) break;
+  }
+  return (node+1);
+}
+
+#elif defined(__FreeBSD__) && __FreeBSD_version >= 1200000
+
+size_t _mi_prim_numa_node(void) {
+  domainset_t dom;
+  size_t node;
+  int policy;
+  if (cpuset_getdomain(CPU_LEVEL_CPUSET, CPU_WHICH_PID, -1, sizeof(dom), &dom, &policy) == -1) return 0ul;
+  for (node = 0; node < MAXMEMDOM; node++) {
+    if (DOMAINSET_ISSET(node, &dom)) return node;
+  }
+  return 0ul;
+}
+
+size_t _mi_prim_numa_node_count(void) {
+  size_t ndomains = 0;
+  size_t len = sizeof(ndomains);
+  if (sysctlbyname("vm.ndomains", &ndomains, &len, NULL, 0) == -1) return 0ul;
+  return ndomains;
+}
+
+#elif defined(__DragonFly__)
+
+size_t _mi_prim_numa_node(void) {
+  // TODO: DragonFly does not seem to provide any userland means to get this information.
+  return 0ul;
+}
+
+size_t _mi_prim_numa_node_count(void) {
+  size_t ncpus = 0, nvirtcoresperphys = 0;
+  size_t len = sizeof(size_t);
+  if (sysctlbyname("hw.ncpu", &ncpus, &len, NULL, 0) == -1) return 0ul;
+  if (sysctlbyname("hw.cpu_topology_ht_ids", &nvirtcoresperphys, &len, NULL, 0) == -1) return 0ul;
+  return nvirtcoresperphys * ncpus;
+}
+
+#else
+
+size_t _mi_prim_numa_node(void) {
+  return 0;
+}
+
+size_t _mi_prim_numa_node_count(void) {
+  return 1;
+}
+
+#endif
+
+// ----------------------------------------------------------------
+// Clock
+// ----------------------------------------------------------------
+
+#include <time.h>
+
+#if defined(CLOCK_REALTIME) || defined(CLOCK_MONOTONIC)
+
+mi_msecs_t _mi_prim_clock_now(void) {
+  struct timespec t;
+  #ifdef CLOCK_MONOTONIC
+  clock_gettime(CLOCK_MONOTONIC, &t);
+  #else
+  clock_gettime(CLOCK_REALTIME, &t);
+  #endif
+  return ((mi_msecs_t)t.tv_sec * 1000) + ((mi_msecs_t)t.tv_nsec / 1000000);
+}
+
+#else
+
+// low resolution timer
+mi_msecs_t _mi_prim_clock_now(void) {
+  #if !defined(CLOCKS_PER_SEC) || (CLOCKS_PER_SEC == 1000) || (CLOCKS_PER_SEC == 0)
+  return (mi_msecs_t)clock();
+  #elif (CLOCKS_PER_SEC < 1000)
+  return (mi_msecs_t)clock() * (1000 / (mi_msecs_t)CLOCKS_PER_SEC);
+  #else
+  return (mi_msecs_t)clock() / ((mi_msecs_t)CLOCKS_PER_SEC / 1000);
+  #endif
+}
+
+#endif
+
+
+
+
+//----------------------------------------------------------------
+// Process info
+//----------------------------------------------------------------
+
+#if defined(__unix__) || defined(__unix) || defined(unix) || defined(__APPLE__) || defined(__HAIKU__)
+#include <stdio.h>
+#include <unistd.h>
+#include <sys/resource.h>
+
+#if defined(__APPLE__)
+#include <mach/mach.h>
+#endif
+
+#if defined(__HAIKU__)
+#include <kernel/OS.h>
+#endif
+
+static mi_msecs_t timeval_secs(const struct timeval* tv) {
+  return ((mi_msecs_t)tv->tv_sec * 1000L) + ((mi_msecs_t)tv->tv_usec / 1000L);
+}
+
+void _mi_prim_process_info(mi_process_info_t* pinfo)
+{
+  struct rusage rusage;
+  getrusage(RUSAGE_SELF, &rusage);
+  pinfo->utime = timeval_secs(&rusage.ru_utime);
+  pinfo->stime = timeval_secs(&rusage.ru_stime);
+#if !defined(__HAIKU__)
+  pinfo->page_faults = rusage.ru_majflt;
+#endif
+#if defined(__HAIKU__)
+  // Haiku does not have (yet?) a way to
+  // get these stats per process
+  thread_info tid;
+  area_info mem;
+  ssize_t c;
+  get_thread_info(find_thread(0), &tid);
+  while (get_next_area_info(tid.team, &c, &mem) == B_OK) {
+    pinfo->peak_rss += mem.ram_size;
+  }
+  pinfo->page_faults = 0;
+#elif defined(__APPLE__)
+  pinfo->peak_rss = rusage.ru_maxrss;         // macos reports in bytes
+  #ifdef MACH_TASK_BASIC_INFO
+  struct mach_task_basic_info info;
+  mach_msg_type_number_t infoCount = MACH_TASK_BASIC_INFO_COUNT;
+  if (task_info(mach_task_self(), MACH_TASK_BASIC_INFO, (task_info_t)&info, &infoCount) == KERN_SUCCESS) {
+    pinfo->current_rss = (size_t)info.resident_size;
+  }
+  #else
+  struct task_basic_info info;
+  mach_msg_type_number_t infoCount = TASK_BASIC_INFO_COUNT;
+  if (task_info(mach_task_self(), TASK_BASIC_INFO, (task_info_t)&info, &infoCount) == KERN_SUCCESS) {
+    pinfo->current_rss = (size_t)info.resident_size;
+  }
+  #endif
+#else
+  pinfo->peak_rss = rusage.ru_maxrss * 1024;  // Linux/BSD report in KiB
+#endif
+  // use defaults for commit
+}
+
+#else
+
+#ifndef __wasi__
+// WebAssembly instances are not processes
+#pragma message("define a way to get process info")
+#endif
+
+void _mi_prim_process_info(mi_process_info_t* pinfo)
+{
+  // use defaults
+  MI_UNUSED(pinfo);
+}
+
+#endif
+
+
+//----------------------------------------------------------------
+// Output
+//----------------------------------------------------------------
+
+void _mi_prim_out_stderr( const char* msg ) {
+  fputs(msg,stderr);
+}
+
+
+//----------------------------------------------------------------
+// Environment
+//----------------------------------------------------------------
+
+#if !defined(MI_USE_ENVIRON) || (MI_USE_ENVIRON!=0)
+// On Posix systems use `environ` to access environment variables
+// even before the C runtime is initialized.
+#if defined(__APPLE__) && defined(__has_include) && __has_include(<crt_externs.h>)
+#include <crt_externs.h>
+static char** mi_get_environ(void) {
+  return (*_NSGetEnviron());
+}
+#else
+extern char** environ;
+static char** mi_get_environ(void) {
+  return environ;
+}
+#endif
+bool _mi_prim_getenv(const char* name, char* result, size_t result_size) {
+  if (name==NULL) return false;
+  const size_t len = _mi_strlen(name);
+  if (len == 0) return false;
+  char** env = mi_get_environ();
+  if (env == NULL) return false;
+  // compare up to 10000 entries
+  for (int i = 0; i < 10000 && env[i] != NULL; i++) {
+    const char* s = env[i];
+    if (_mi_strnicmp(name, s, len) == 0 && s[len] == '=') { // case insensitive
+      // found it
+      _mi_strlcpy(result, s + len + 1, result_size);
+      return true;
+    }
+  }
+  return false;
+}
+#else
+// fallback: use standard C `getenv` but this cannot be used while initializing the C runtime
+bool _mi_prim_getenv(const char* name, char* result, size_t result_size) {
+  // cannot call getenv() when still initializing the C runtime.
+  if (_mi_preloading()) return false;
+  const char* s = getenv(name);
+  if (s == NULL) {
+    // we check the upper case name too.
+    char buf[64+1];
+    size_t len = _mi_strnlen(name,sizeof(buf)-1);
+    for (size_t i = 0; i < len; i++) {
+      buf[i] = _mi_toupper(name[i]);
+    }
+    buf[len] = 0;
+    s = getenv(buf);
+  }
+  if (s == NULL || _mi_strnlen(s,result_size) >= result_size)  return false;
+  _mi_strlcpy(result, s, result_size);
+  return true;
+}
+#endif  // !MI_USE_ENVIRON
+
+
+//----------------------------------------------------------------
+// Random
+//----------------------------------------------------------------
+
+#if defined(__APPLE__) && defined(MAC_OS_X_VERSION_10_15) && (MAC_OS_X_VERSION_MIN_REQUIRED >= MAC_OS_X_VERSION_10_15)
+#include <CommonCrypto/CommonCryptoError.h>
+#include <CommonCrypto/CommonRandom.h>
+
+bool _mi_prim_random_buf(void* buf, size_t buf_len) {
+  // We prefer CCRandomGenerateBytes as it returns an error code while arc4random_buf
+  // may fail silently on macOS. See PR #390, and <https://opensource.apple.com/source/Libc/Libc-1439.40.11/gen/FreeBSD/arc4random.c.auto.html>
+  return (CCRandomGenerateBytes(buf, buf_len) == kCCSuccess);
+}
+
+#elif defined(__ANDROID__) || defined(__DragonFly__) || \
+      defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__) || \
+      defined(__sun) || \
+      (defined(__APPLE__) && (MAC_OS_X_VERSION_MIN_REQUIRED >= MAC_OS_X_VERSION_10_7))
+
+bool _mi_prim_random_buf(void* buf, size_t buf_len) {
+  arc4random_buf(buf, buf_len);
+  return true;
+}
+
+#elif defined(__APPLE__) || defined(__linux__) || defined(__HAIKU__)   // also for old apple versions < 10.7 (issue #829)
+
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <errno.h>
+
+bool _mi_prim_random_buf(void* buf, size_t buf_len) {
+  // Modern Linux provides `getrandom` but different distributions either use `sys/random.h` or `linux/random.h`
+  // and for the latter the actual `getrandom` call is not always defined.
+  // (see <https://stackoverflow.com/questions/45237324/why-doesnt-getrandom-compile>)
+  // We therefore use a syscall directly and fall back dynamically to /dev/urandom when needed.
+  #if defined(MI_HAS_SYSCALL_H) && defined(SYS_getrandom)
+    #ifndef GRND_NONBLOCK
+    #define GRND_NONBLOCK (1)
+    #endif
+    static _Atomic(uintptr_t) no_getrandom; // = 0
+    if (mi_atomic_load_acquire(&no_getrandom)==0) {
+      ssize_t ret = syscall(SYS_getrandom, buf, buf_len, GRND_NONBLOCK);
+      if (ret >= 0) return (buf_len == (size_t)ret);
+      if (errno != ENOSYS) return false;
+      mi_atomic_store_release(&no_getrandom, (uintptr_t)1); // don't call again, and fall back to /dev/urandom
+    }
+  #endif
+  int flags = O_RDONLY;
+  #if defined(O_CLOEXEC)
+  flags |= O_CLOEXEC;
+  #endif
+  int fd = mi_prim_open("/dev/urandom", flags);
+  if (fd < 0) return false;
+  size_t count = 0;
+  while(count < buf_len) {
+    ssize_t ret = mi_prim_read(fd, (char*)buf + count, buf_len - count);
+    if (ret<=0) {
+      if (errno!=EAGAIN && errno!=EINTR) break;
+    }
+    else {
+      count += ret;
+    }
+  }
+  mi_prim_close(fd);
+  return (count==buf_len);
+}
+
+#else
+
+bool _mi_prim_random_buf(void* buf, size_t buf_len) {
+  return false;
+}
+
+#endif
+
+
+//----------------------------------------------------------------
+// Thread init/done
+//----------------------------------------------------------------
+
+#if defined(MI_USE_PTHREADS)
+
+// use pthread local storage keys to detect thread ending
+// (and used with MI_TLS_PTHREADS for the default heap)
+pthread_key_t _mi_heap_default_key = (pthread_key_t)(-1);
+
+static void mi_pthread_done(void* value) {
+  if (value!=NULL) {
+    _mi_thread_done((mi_heap_t*)value);
+  }
+}
+
+void _mi_prim_thread_init_auto_done(void) {
+  mi_assert_internal(_mi_heap_default_key == (pthread_key_t)(-1));
+  pthread_key_create(&_mi_heap_default_key, &mi_pthread_done);
+}
+
+void _mi_prim_thread_done_auto_done(void) {
+  if (_mi_heap_default_key != (pthread_key_t)(-1)) {  // do not leak the key, see issue #809
+    pthread_key_delete(_mi_heap_default_key);
+  }
+}
+
+void _mi_prim_thread_associate_default_heap(mi_heap_t* heap) {
+  if (_mi_heap_default_key != (pthread_key_t)(-1)) {  // can happen during recursive invocation on FreeBSD
+    pthread_setspecific(_mi_heap_default_key, heap);
+  }
+}
+
+#else
+
+void _mi_prim_thread_init_auto_done(void) {
+  // nothing
+}
+
+void _mi_prim_thread_done_auto_done(void) {
+  // nothing
+}
+
+void _mi_prim_thread_associate_default_heap(mi_heap_t* heap) {
+  MI_UNUSED(heap);
+}
+
+#endif
diff --git a/compat/mimalloc/prim/windows/prim.c b/compat/mimalloc/prim/windows/prim.c
new file mode 100644
index 00000000000000..75a93d2a7277de
--- /dev/null
+++ b/compat/mimalloc/prim/windows/prim.c
@@ -0,0 +1,879 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2023, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+
+// This file is included in `src/prim/prim.c`
+
+#include "mimalloc.h"
+#include "mimalloc/internal.h"
+#include "mimalloc/prim.h"
+#include <stdio.h>   // fputs, stderr
+
+// xbox has no console IO
+#if !defined(WINAPI_FAMILY_PARTITION) || WINAPI_FAMILY_PARTITION(WINAPI_PARTITION_APP | WINAPI_PARTITION_SYSTEM)
+#define MI_HAS_CONSOLE_IO
+#endif
+
+//---------------------------------------------
+// Dynamically bind Windows API points for portability
+//---------------------------------------------
+
+// We use VirtualAlloc2 for aligned allocation, but it is only supported on Windows 10 and Windows Server 2016.
+// So, we need to look it up dynamically to run on older systems. (use __stdcall for 32-bit compatibility)
+// NtAllocateVirtualMemoryEx is used for huge OS page allocation (1GiB)
+// We define a minimal MEM_EXTENDED_PARAMETER ourselves in order to be able to compile with older SDKs.
+typedef enum MI_MEM_EXTENDED_PARAMETER_TYPE_E {
+  MiMemExtendedParameterInvalidType = 0,
+  MiMemExtendedParameterAddressRequirements,
+  MiMemExtendedParameterNumaNode,
+  MiMemExtendedParameterPartitionHandle,
+  MiMemExtendedParameterUserPhysicalHandle,
+  MiMemExtendedParameterAttributeFlags,
+  MiMemExtendedParameterMax
+} MI_MEM_EXTENDED_PARAMETER_TYPE;
+
+typedef struct DECLSPEC_ALIGN(8) MI_MEM_EXTENDED_PARAMETER_S {
+  struct { DWORD64 Type : 8; DWORD64 Reserved : 56; } Type;
+  union  { DWORD64 ULong64; PVOID Pointer; SIZE_T Size; HANDLE Handle; DWORD ULong; } Arg;
+} MI_MEM_EXTENDED_PARAMETER;
+
+typedef struct MI_MEM_ADDRESS_REQUIREMENTS_S {
+  PVOID  LowestStartingAddress;
+  PVOID  HighestEndingAddress;
+  SIZE_T Alignment;
+} MI_MEM_ADDRESS_REQUIREMENTS;
+
+#define MI_MEM_EXTENDED_PARAMETER_NONPAGED_HUGE   0x00000010
+
+#include <winternl.h>
+typedef PVOID (__stdcall *PVirtualAlloc2)(HANDLE, PVOID, SIZE_T, ULONG, ULONG, MI_MEM_EXTENDED_PARAMETER*, ULONG);
+typedef LONG  (__stdcall *PNtAllocateVirtualMemoryEx)(HANDLE, PVOID*, SIZE_T*, ULONG, ULONG, MI_MEM_EXTENDED_PARAMETER*, ULONG);  // avoid NTSTATUS as it is not defined on xbox (pr #1084)
+static PVirtualAlloc2 pVirtualAlloc2 = NULL;
+static PNtAllocateVirtualMemoryEx pNtAllocateVirtualMemoryEx = NULL;
+
+// Similarly, GetNumaProcessorNodeEx is only supported since Windows 7  (and GetNumaNodeProcessorMask is not supported on xbox)
+typedef struct MI_PROCESSOR_NUMBER_S { WORD Group; BYTE Number; BYTE Reserved; } MI_PROCESSOR_NUMBER;
+
+typedef VOID (__stdcall *PGetCurrentProcessorNumberEx)(MI_PROCESSOR_NUMBER* ProcNumber);
+typedef BOOL (__stdcall *PGetNumaProcessorNodeEx)(MI_PROCESSOR_NUMBER* Processor, PUSHORT NodeNumber);
+typedef BOOL (__stdcall* PGetNumaNodeProcessorMaskEx)(USHORT Node, PGROUP_AFFINITY ProcessorMask);
+typedef BOOL (__stdcall *PGetNumaProcessorNode)(UCHAR Processor, PUCHAR NodeNumber);
+typedef BOOL (__stdcall* PGetNumaNodeProcessorMask)(UCHAR Node, PULONGLONG ProcessorMask);
+typedef BOOL (__stdcall* PGetNumaHighestNodeNumber)(PULONG Node);
+static PGetCurrentProcessorNumberEx pGetCurrentProcessorNumberEx = NULL;
+static PGetNumaProcessorNodeEx      pGetNumaProcessorNodeEx = NULL;
+static PGetNumaNodeProcessorMaskEx  pGetNumaNodeProcessorMaskEx = NULL;
+static PGetNumaProcessorNode        pGetNumaProcessorNode = NULL;
+static PGetNumaNodeProcessorMask    pGetNumaNodeProcessorMask = NULL;
+static PGetNumaHighestNodeNumber    pGetNumaHighestNodeNumber = NULL;
+
+// Not available on xbox
+typedef SIZE_T(__stdcall* PGetLargePageMinimum)(VOID);
+static PGetLargePageMinimum pGetLargePageMinimum = NULL;
+
+// Available after Windows XP
+typedef BOOL (__stdcall *PGetPhysicallyInstalledSystemMemory)( PULONGLONG TotalMemoryInKilobytes );
+
+//---------------------------------------------
+// Enable large page support dynamically (if possible)
+//---------------------------------------------
+
+static bool win_enable_large_os_pages(size_t* large_page_size)
+{
+  static bool large_initialized = false;
+  if (large_initialized) return (_mi_os_large_page_size() > 0);
+  large_initialized = true;
+  if (pGetLargePageMinimum==NULL) return false;  // no large page support (xbox etc.)
+
+  // Try to see if large OS pages are supported
+  // To use large pages on Windows, we first need access permission
+  // Set "Lock pages in memory" permission in the group policy editor
+  // <https://devblogs.microsoft.com/oldnewthing/20110128-00/?p=11643>
+  unsigned long err = 0;
+  HANDLE token = NULL;
+  BOOL ok = OpenProcessToken(GetCurrentProcess(), TOKEN_ADJUST_PRIVILEGES | TOKEN_QUERY, &token);
+  if (ok) {
+    TOKEN_PRIVILEGES tp;
+    ok = LookupPrivilegeValue(NULL, TEXT("SeLockMemoryPrivilege"), &tp.Privileges[0].Luid);
+    if (ok) {
+      tp.PrivilegeCount = 1;
+      tp.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;
+      ok = AdjustTokenPrivileges(token, FALSE, &tp, 0, (PTOKEN_PRIVILEGES)NULL, 0);
+      if (ok) {
+        err = GetLastError();
+        ok = (err == ERROR_SUCCESS);
+        if (ok && large_page_size != NULL && pGetLargePageMinimum != NULL) {
+          *large_page_size = (*pGetLargePageMinimum)();
+        }
+      }
+    }
+    CloseHandle(token);
+  }
+  if (!ok) {
+    if (err == 0) err = GetLastError();
+    _mi_warning_message("cannot enable large OS page support, error %lu\n", err);
+  }
+  return (ok!=0);
+}
+
+
+//---------------------------------------------
+// Initialize
+//---------------------------------------------
+
+void _mi_prim_mem_init( mi_os_mem_config_t* config )
+{
+  config->has_overcommit = false;
+  config->has_partial_free = false;
+  config->has_virtual_reserve = true;
+  // get the page size
+  SYSTEM_INFO si;
+  GetSystemInfo(&si);
+  if (si.dwPageSize > 0) { config->page_size = si.dwPageSize; }
+  if (si.dwAllocationGranularity > 0) { config->alloc_granularity = si.dwAllocationGranularity; }
+  // get virtual address bits
+  if ((uintptr_t)si.lpMaximumApplicationAddress > 0) {
+    const size_t vbits = MI_SIZE_BITS - mi_clz((uintptr_t)si.lpMaximumApplicationAddress);
+    config->virtual_address_bits = vbits;
+  }
+
+  // get the VirtualAlloc2 function
+  HINSTANCE  hDll;
+  hDll = LoadLibrary(TEXT("kernelbase.dll"));
+  if (hDll != NULL) {
+    // use VirtualAlloc2FromApp if possible as it is available to Windows store apps
+    pVirtualAlloc2 = (PVirtualAlloc2)(void (*)(void))GetProcAddress(hDll, "VirtualAlloc2FromApp");
+    if (pVirtualAlloc2==NULL) pVirtualAlloc2 = (PVirtualAlloc2)(void (*)(void))GetProcAddress(hDll, "VirtualAlloc2");
+    FreeLibrary(hDll);
+  }
+  // NtAllocateVirtualMemoryEx is used for huge page allocation
+  hDll = LoadLibrary(TEXT("ntdll.dll"));
+  if (hDll != NULL) {
+    pNtAllocateVirtualMemoryEx = (PNtAllocateVirtualMemoryEx)(void (*)(void))GetProcAddress(hDll, "NtAllocateVirtualMemoryEx");
+    FreeLibrary(hDll);
+  }
+  // Try to use Win7+ numa API
+  hDll = LoadLibrary(TEXT("kernel32.dll"));
+  if (hDll != NULL) {
+    pGetCurrentProcessorNumberEx = (PGetCurrentProcessorNumberEx)(void (*)(void))GetProcAddress(hDll, "GetCurrentProcessorNumberEx");
+    pGetNumaProcessorNodeEx = (PGetNumaProcessorNodeEx)(void (*)(void))GetProcAddress(hDll, "GetNumaProcessorNodeEx");
+    pGetNumaNodeProcessorMaskEx = (PGetNumaNodeProcessorMaskEx)(void (*)(void))GetProcAddress(hDll, "GetNumaNodeProcessorMaskEx");
+    pGetNumaProcessorNode = (PGetNumaProcessorNode)(void (*)(void))GetProcAddress(hDll, "GetNumaProcessorNode");
+    pGetNumaNodeProcessorMask = (PGetNumaNodeProcessorMask)(void (*)(void))GetProcAddress(hDll, "GetNumaNodeProcessorMask");
+    pGetNumaHighestNodeNumber = (PGetNumaHighestNodeNumber)(void (*)(void))GetProcAddress(hDll, "GetNumaHighestNodeNumber");
+    pGetLargePageMinimum = (PGetLargePageMinimum)(void (*)(void))GetProcAddress(hDll, "GetLargePageMinimum");
+    // Get physical memory (not available on XP, so check dynamically)
+    PGetPhysicallyInstalledSystemMemory pGetPhysicallyInstalledSystemMemory = (PGetPhysicallyInstalledSystemMemory)(void (*)(void))GetProcAddress(hDll,"GetPhysicallyInstalledSystemMemory");
+    if (pGetPhysicallyInstalledSystemMemory != NULL) {
+      ULONGLONG memInKiB = 0;
+      if ((*pGetPhysicallyInstalledSystemMemory)(&memInKiB)) {
+        if (memInKiB > 0 && memInKiB <= SIZE_MAX) {
+          config->physical_memory_in_kib = (size_t)memInKiB;
+        }
+      }
+    }
+    FreeLibrary(hDll);
+  }
+  // Enable large/huge OS page support?
+  if (mi_option_is_enabled(mi_option_allow_large_os_pages) || mi_option_is_enabled(mi_option_reserve_huge_os_pages)) {
+    win_enable_large_os_pages(&config->large_page_size);
+  }
+}
+
+
+//---------------------------------------------
+// Free
+//---------------------------------------------
+
+int _mi_prim_free(void* addr, size_t size ) {
+  MI_UNUSED(size);
+  DWORD errcode = 0;
+  bool err = (VirtualFree(addr, 0, MEM_RELEASE) == 0);
+  if (err) { errcode = GetLastError(); }
+  if (errcode == ERROR_INVALID_ADDRESS) {
+    // In mi_os_mem_alloc_aligned the fallback path may have returned a pointer inside
+    // the memory region returned by VirtualAlloc; in that case we need to free using
+    // the start of the region.
+    MEMORY_BASIC_INFORMATION info; _mi_memzero_var(info);
+    VirtualQuery(addr, &info, sizeof(info));
+    if (info.AllocationBase < addr && ((uint8_t*)addr - (uint8_t*)info.AllocationBase) < (ptrdiff_t)MI_SEGMENT_SIZE) {
+      errcode = 0;
+      err = (VirtualFree(info.AllocationBase, 0, MEM_RELEASE) == 0);
+      if (err) { errcode = GetLastError(); }
+    }
+  }
+  return (int)errcode;
+}
+
+
+//---------------------------------------------
+// VirtualAlloc
+//---------------------------------------------
+
+static void* win_virtual_alloc_prim_once(void* addr, size_t size, size_t try_alignment, DWORD flags) {
+  #if (MI_INTPTR_SIZE >= 8)
+  // on 64-bit systems, try to use the virtual address area after 2TiB for 4MiB aligned allocations
+  if (addr == NULL) {
+    void* hint = _mi_os_get_aligned_hint(try_alignment,size);
+    if (hint != NULL) {
+      void* p = VirtualAlloc(hint, size, flags, PAGE_READWRITE);
+      if (p != NULL) return p;
+      _mi_verbose_message("warning: unable to allocate hinted aligned OS memory (%zu bytes, error code: 0x%x, address: %p, alignment: %zu, flags: 0x%x)\n", size, GetLastError(), hint, try_alignment, flags);
+      // fall through on error
+    }
+  }
+  #endif
+  // on modern Windows try to use VirtualAlloc2 for aligned allocation
+  if (addr == NULL && try_alignment > 1 && (try_alignment % _mi_os_page_size()) == 0 && pVirtualAlloc2 != NULL) {
+    MI_MEM_ADDRESS_REQUIREMENTS reqs = { 0, 0, 0 };
+    reqs.Alignment = try_alignment;
+    MI_MEM_EXTENDED_PARAMETER param = { {0, 0}, {0} };
+    param.Type.Type = MiMemExtendedParameterAddressRequirements;
+    param.Arg.Pointer = &reqs;
+    void* p = (*pVirtualAlloc2)(GetCurrentProcess(), addr, size, flags, PAGE_READWRITE, &param, 1);
+    if (p != NULL) return p;
+    _mi_warning_message("unable to allocate aligned OS memory (0x%zx bytes, error code: 0x%x, address: %p, alignment: 0x%zx, flags: 0x%x)\n", size, GetLastError(), addr, try_alignment, flags);
+    // fall through on error
+  }
+  // last resort
+  return VirtualAlloc(addr, size, flags, PAGE_READWRITE);
+}
+
+static bool win_is_out_of_memory_error(DWORD err) {
+  switch (err) {
+    case ERROR_COMMITMENT_MINIMUM:
+    case ERROR_COMMITMENT_LIMIT:
+    case ERROR_PAGEFILE_QUOTA:
+    case ERROR_NOT_ENOUGH_MEMORY:
+      return true;
+    default:
+      return false;
+  }
+}
+
+static void* win_virtual_alloc_prim(void* addr, size_t size, size_t try_alignment, DWORD flags) {
+  long max_retry_msecs = mi_option_get_clamp(mi_option_retry_on_oom, 0, 2000);  // at most 2 seconds
+  if (max_retry_msecs == 1) { max_retry_msecs = 100; }  // if one sets the option to "true"
+  for (long tries = 1; tries <= 10; tries++) {          // try at most 10 times (=2200ms)
+    void* p = win_virtual_alloc_prim_once(addr, size, try_alignment, flags);
+    if (p != NULL) {
+      // success, return the address
+      return p;
+    }
+    else if (max_retry_msecs > 0 && (try_alignment <= 2*MI_SEGMENT_ALIGN) &&
+              (flags&MEM_COMMIT) != 0 && (flags&MEM_LARGE_PAGES) == 0 &&
+              win_is_out_of_memory_error(GetLastError())) {
+      // if committing regular memory and being out-of-memory,
+      // keep trying for a bit in case memory frees up after all. See issue #894
+      _mi_warning_message("out-of-memory on OS allocation, try again... (attempt %ld, 0x%zx bytes, error code: 0x%x, address: %p, alignment: 0x%zx, flags: 0x%x)\n", tries, size, GetLastError(), addr, try_alignment, flags);
+      long sleep_msecs = tries*40;  // increasing waits
+      if (sleep_msecs > max_retry_msecs) { sleep_msecs = max_retry_msecs; }
+      max_retry_msecs -= sleep_msecs;
+      Sleep(sleep_msecs);
+    }
+    else {
+      // otherwise return with an error
+      break;
+    }
+  }
+  return NULL;
+}
+
+static void* win_virtual_alloc(void* addr, size_t size, size_t try_alignment, DWORD flags, bool large_only, bool allow_large, bool* is_large) {
+  mi_assert_internal(!(large_only && !allow_large));
+  static _Atomic(size_t) large_page_try_ok; // = 0;
+  void* p = NULL;
+  // Try to allocate large OS pages (2MiB) if allowed or required.
+  if ((large_only || (_mi_os_canuse_large_page(size, try_alignment) && mi_option_is_enabled(mi_option_allow_large_os_pages)))
+      && allow_large && (flags&MEM_COMMIT)!=0 && (flags&MEM_RESERVE)!=0)
+  {
+    size_t try_ok = mi_atomic_load_acquire(&large_page_try_ok);
+    if (!large_only && try_ok > 0) {
+      // if a large page allocation fails, it seems the calls to VirtualAlloc get very expensive.
+      // therefore, once a large page allocation failed, we don't try again for `large_page_try_ok` times.
+      mi_atomic_cas_strong_acq_rel(&large_page_try_ok, &try_ok, try_ok - 1);
+    }
+    else {
+      // large OS pages must always reserve and commit.
+      *is_large = true;
+      p = win_virtual_alloc_prim(addr, size, try_alignment, flags | MEM_LARGE_PAGES);
+      if (large_only) return p;
+      // fall back to non-large page allocation on error (`p == NULL`).
+      if (p == NULL) {
+        mi_atomic_store_release(&large_page_try_ok,10UL);  // on error, don't try again for the next N allocations
+      }
+    }
+  }
+  // Fall back to regular page allocation
+  if (p == NULL) {
+    *is_large = ((flags&MEM_LARGE_PAGES) != 0);
+    p = win_virtual_alloc_prim(addr, size, try_alignment, flags);
+  }
+  //if (p == NULL) { _mi_warning_message("unable to allocate OS memory (%zu bytes, error code: 0x%x, address: %p, alignment: %zu, flags: 0x%x, large only: %d, allow large: %d)\n", size, GetLastError(), addr, try_alignment, flags, large_only, allow_large); }
+  return p;
+}
+
+int _mi_prim_alloc(void* hint_addr, size_t size, size_t try_alignment, bool commit, bool allow_large, bool* is_large, bool* is_zero, void** addr) {
+  mi_assert_internal(size > 0 && (size % _mi_os_page_size()) == 0);
+  mi_assert_internal(commit || !allow_large);
+  mi_assert_internal(try_alignment > 0);
+  *is_zero = true;
+  int flags = MEM_RESERVE;
+  if (commit) { flags |= MEM_COMMIT; }
+  *addr = win_virtual_alloc(hint_addr, size, try_alignment, flags, false, allow_large, is_large);
+  return (*addr != NULL ? 0 : (int)GetLastError());
+}
+
+
+//---------------------------------------------
+// Commit/Reset/Protect
+//---------------------------------------------
+#ifdef _MSC_VER
+#pragma warning(disable:6250)   // suppress warning calling VirtualFree without MEM_RELEASE (for decommit)
+#endif
+
+int _mi_prim_commit(void* addr, size_t size, bool* is_zero) {
+  *is_zero = false;
+  /*
+  // zero'ing only happens on an initial commit... but checking upfront seems expensive..
+  _MEMORY_BASIC_INFORMATION meminfo; _mi_memzero_var(meminfo);
+  if (VirtualQuery(addr, &meminfo, size) > 0) {
+    if ((meminfo.State & MEM_COMMIT) == 0) {
+      *is_zero = true;
+    }
+  }
+  */
+  // commit
+  void* p = VirtualAlloc(addr, size, MEM_COMMIT, PAGE_READWRITE);
+  if (p == NULL) return (int)GetLastError();
+  return 0;
+}
+
+int _mi_prim_decommit(void* addr, size_t size, bool* needs_recommit) {
+  BOOL ok = VirtualFree(addr, size, MEM_DECOMMIT);
+  *needs_recommit = true;  // for safety, assume always decommitted even in the case of an error.
+  return (ok ? 0 : (int)GetLastError());
+}
+
+int _mi_prim_reset(void* addr, size_t size) {
+  void* p = VirtualAlloc(addr, size, MEM_RESET, PAGE_READWRITE);
+  mi_assert_internal(p == addr);
+  #if 0
+  if (p != NULL) {
+    VirtualUnlock(addr,size); // VirtualUnlock after MEM_RESET removes the memory directly from the working set
+  }
+  #endif
+  return (p != NULL ? 0 : (int)GetLastError());
+}
+
+int _mi_prim_reuse(void* addr, size_t size) {
+  MI_UNUSED(addr); MI_UNUSED(size);
+  return 0;
+}
+
+int _mi_prim_protect(void* addr, size_t size, bool protect) {
+  DWORD oldprotect = 0;
+  BOOL ok = VirtualProtect(addr, size, protect ? PAGE_NOACCESS : PAGE_READWRITE, &oldprotect);
+  return (ok ? 0 : (int)GetLastError());
+}
+
+
+//---------------------------------------------
+// Huge page allocation
+//---------------------------------------------
+
+static void* _mi_prim_alloc_huge_os_pagesx(void* hint_addr, size_t size, int numa_node)
+{
+  const DWORD flags = MEM_LARGE_PAGES | MEM_COMMIT | MEM_RESERVE;
+
+  win_enable_large_os_pages(NULL);
+
+  MI_MEM_EXTENDED_PARAMETER params[3] = { {{0,0},{0}},{{0,0},{0}},{{0,0},{0}} };
+  // on modern Windows try to use NtAllocateVirtualMemoryEx for 1GiB huge pages
+  static bool mi_huge_pages_available = true;
+  if (pNtAllocateVirtualMemoryEx != NULL && mi_huge_pages_available) {
+    params[0].Type.Type = MiMemExtendedParameterAttributeFlags;
+    params[0].Arg.ULong64 = MI_MEM_EXTENDED_PARAMETER_NONPAGED_HUGE;
+    ULONG param_count = 1;
+    if (numa_node >= 0) {
+      param_count++;
+      params[1].Type.Type = MiMemExtendedParameterNumaNode;
+      params[1].Arg.ULong = (unsigned)numa_node;
+    }
+    SIZE_T psize = size;
+    void* base = hint_addr;
+    LONG err = (*pNtAllocateVirtualMemoryEx)(GetCurrentProcess(), &base, &psize, flags, PAGE_READWRITE, params, param_count);
+    if (err == 0 && base != NULL) {
+      return base;
+    }
+    else {
+      // fall back to regular large pages
+      mi_huge_pages_available = false; // don't try further huge pages
+      _mi_warning_message("unable to allocate using huge (1GiB) pages, trying large (2MiB) pages instead (status 0x%lx)\n", err);
+    }
+  }
+  // on modern Windows try to use VirtualAlloc2 for NUMA-aware large OS page allocation
+  if (pVirtualAlloc2 != NULL && numa_node >= 0) {
+    params[0].Type.Type = MiMemExtendedParameterNumaNode;
+    params[0].Arg.ULong = (unsigned)numa_node;
+    return (*pVirtualAlloc2)(GetCurrentProcess(), hint_addr, size, flags, PAGE_READWRITE, params, 1);
+  }
+
+  // otherwise use regular virtual alloc on older windows
+  return VirtualAlloc(hint_addr, size, flags, PAGE_READWRITE);
+}
+
+int _mi_prim_alloc_huge_os_pages(void* hint_addr, size_t size, int numa_node, bool* is_zero, void** addr) {
+  *is_zero = true;
+  *addr = _mi_prim_alloc_huge_os_pagesx(hint_addr,size,numa_node);
+  return (*addr != NULL ? 0 : (int)GetLastError());
+}
+
+
+//---------------------------------------------
+// Numa nodes
+//---------------------------------------------
+
+size_t _mi_prim_numa_node(void) {
+  USHORT numa_node = 0;
+  if (pGetCurrentProcessorNumberEx != NULL && pGetNumaProcessorNodeEx != NULL) {
+    // Extended API is supported
+    MI_PROCESSOR_NUMBER pnum;
+    (*pGetCurrentProcessorNumberEx)(&pnum);
+    USHORT nnode = 0;
+    BOOL ok = (*pGetNumaProcessorNodeEx)(&pnum, &nnode);
+    if (ok) { numa_node = nnode; }
+  }
+  else if (pGetNumaProcessorNode != NULL) {
+    // Vista or earlier, use older API that is limited to 64 processors. Issue #277
+    DWORD pnum = GetCurrentProcessorNumber();
+    UCHAR nnode = 0;
+    BOOL ok = pGetNumaProcessorNode((UCHAR)pnum, &nnode);
+    if (ok) { numa_node = nnode; }
+  }
+  return numa_node;
+}
+
+size_t _mi_prim_numa_node_count(void) {
+  ULONG numa_max = 0;
+  if (pGetNumaHighestNodeNumber!=NULL) {
+    (*pGetNumaHighestNodeNumber)(&numa_max);
+  }
+  // find the highest node number that has actual processors assigned to it. Issue #282
+  while (numa_max > 0) {
+    if (pGetNumaNodeProcessorMaskEx != NULL) {
+      // Extended API is supported
+      GROUP_AFFINITY affinity;
+      if ((*pGetNumaNodeProcessorMaskEx)((USHORT)numa_max, &affinity)) {
+        if (affinity.Mask != 0) break;  // found the maximum non-empty node
+      }
+    }
+    else {
+      // Vista or earlier, use older API that is limited to 64 processors.
+      ULONGLONG mask;
+      if (pGetNumaNodeProcessorMask != NULL) {
+        if ((*pGetNumaNodeProcessorMask)((UCHAR)numa_max, &mask)) {
+          if (mask != 0) break; // found the maximum non-empty node
+        }
+      }
+    }
+    // max node was invalid or had no processor assigned, try again
+    numa_max--;
+  }
+  return ((size_t)numa_max + 1);
+}
+
+
+//----------------------------------------------------------------
+// Clock
+//----------------------------------------------------------------
+
+static mi_msecs_t mi_to_msecs(LARGE_INTEGER t) {
+  static LARGE_INTEGER mfreq; // = 0
+  if (mfreq.QuadPart == 0LL) {
+    LARGE_INTEGER f;
+    QueryPerformanceFrequency(&f);
+    mfreq.QuadPart = f.QuadPart/1000LL;
+    if (mfreq.QuadPart == 0) mfreq.QuadPart = 1;
+  }
+  return (mi_msecs_t)(t.QuadPart / mfreq.QuadPart);
+}
+
+mi_msecs_t _mi_prim_clock_now(void) {
+  LARGE_INTEGER t;
+  QueryPerformanceCounter(&t);
+  return mi_to_msecs(t);
+}
+
+
+//----------------------------------------------------------------
+// Process Info
+//----------------------------------------------------------------
+
+#include <psapi.h>
+
+static mi_msecs_t filetime_msecs(const FILETIME* ftime) {
+  ULARGE_INTEGER i;
+  i.LowPart = ftime->dwLowDateTime;
+  i.HighPart = ftime->dwHighDateTime;
+  mi_msecs_t msecs = (i.QuadPart / 10000); // FILETIME counts 100-nanosecond intervals
+  return msecs;
+}
+
+typedef BOOL (WINAPI *PGetProcessMemoryInfo)(HANDLE, PPROCESS_MEMORY_COUNTERS, DWORD);
+static PGetProcessMemoryInfo pGetProcessMemoryInfo = NULL;
+
+void _mi_prim_process_info(mi_process_info_t* pinfo)
+{
+  FILETIME ct;
+  FILETIME ut;
+  FILETIME st;
+  FILETIME et;
+  GetProcessTimes(GetCurrentProcess(), &ct, &et, &st, &ut);
+  pinfo->utime = filetime_msecs(&ut);
+  pinfo->stime = filetime_msecs(&st);
+
+  // load psapi on demand
+  if (pGetProcessMemoryInfo == NULL) {
+    HINSTANCE hDll = LoadLibrary(TEXT("psapi.dll"));
+    if (hDll != NULL) {
+      pGetProcessMemoryInfo = (PGetProcessMemoryInfo)(void (*)(void))GetProcAddress(hDll, "GetProcessMemoryInfo");
+    }
+  }
+
+  // get process info
+  PROCESS_MEMORY_COUNTERS info; _mi_memzero_var(info);
+  if (pGetProcessMemoryInfo != NULL) {
+    pGetProcessMemoryInfo(GetCurrentProcess(), &info, sizeof(info));
+  }
+  pinfo->current_rss    = (size_t)info.WorkingSetSize;
+  pinfo->peak_rss       = (size_t)info.PeakWorkingSetSize;
+  pinfo->current_commit = (size_t)info.PagefileUsage;
+  pinfo->peak_commit    = (size_t)info.PeakPagefileUsage;
+  pinfo->page_faults    = (size_t)info.PageFaultCount;
+}
+
+//----------------------------------------------------------------
+// Output
+//----------------------------------------------------------------
+
+void _mi_prim_out_stderr( const char* msg )
+{
+  // on windows with redirection, the C runtime cannot handle locale dependent output
+  // after the main thread closes so we use direct console output.
+  if (!_mi_preloading()) {
+    // _cputs(msg);  // _cputs cannot be used as it aborts when failing to lock the console
+    static HANDLE hcon = INVALID_HANDLE_VALUE;
+    static bool hconIsConsole = false;
+    if (hcon == INVALID_HANDLE_VALUE) {
+      hcon = GetStdHandle(STD_ERROR_HANDLE);
+      #ifdef MI_HAS_CONSOLE_IO
+      CONSOLE_SCREEN_BUFFER_INFO sbi;
+      hconIsConsole = ((hcon != INVALID_HANDLE_VALUE) && GetConsoleScreenBufferInfo(hcon, &sbi));
+      #endif
+    }
+    const size_t len = _mi_strlen(msg);
+    if (len > 0 && len < UINT32_MAX) {
+      DWORD written = 0;
+      if (hconIsConsole) {
+        #ifdef MI_HAS_CONSOLE_IO
+        WriteConsoleA(hcon, msg, (DWORD)len, &written, NULL);
+        #endif
+      }
+      else if (hcon != INVALID_HANDLE_VALUE) {
+        // use direct write if stderr was redirected
+        WriteFile(hcon, msg, (DWORD)len, &written, NULL);
+      }
+      else {
+        // finally fall back to fputs after all
+        fputs(msg, stderr);
+      }
+    }
+  }
+}
+
+
+//----------------------------------------------------------------
+// Environment
+//----------------------------------------------------------------
+
+// On Windows use GetEnvironmentVariable instead of getenv to work
+// reliably even when this is invoked before the C runtime is initialized,
+// i.e., when `_mi_preloading() == true`.
+// Note: on windows, environment names are not case sensitive.
+bool _mi_prim_getenv(const char* name, char* result, size_t result_size) {
+  result[0] = 0;
+  size_t len = GetEnvironmentVariableA(name, result, (DWORD)result_size);
+  return (len > 0 && len < result_size);
+}
+
+
+//----------------------------------------------------------------
+// Random
+//----------------------------------------------------------------
+
+#if defined(MI_USE_RTLGENRANDOM) // || defined(__cplusplus)
+// We prefer to use BCryptGenRandom instead of (the unofficial) RtlGenRandom but when using
+// dynamic overriding, we observed it can raise an exception when compiled with C++, and
+// sometimes deadlocks when also running under the VS debugger.
+// In contrast, issue #623 implies that on Windows Server 2019 we need to use BCryptGenRandom.
+// To be continued..
+#pragma comment (lib,"advapi32.lib")
+#define RtlGenRandom  SystemFunction036
+mi_decl_externc BOOLEAN NTAPI RtlGenRandom(PVOID RandomBuffer, ULONG RandomBufferLength);
+
+bool _mi_prim_random_buf(void* buf, size_t buf_len) {
+  return (RtlGenRandom(buf, (ULONG)buf_len) != 0);
+}
+
+#else
+
+#ifndef BCRYPT_USE_SYSTEM_PREFERRED_RNG
+#define BCRYPT_USE_SYSTEM_PREFERRED_RNG 0x00000002
+#endif
+
+typedef LONG (NTAPI *PBCryptGenRandom)(HANDLE, PUCHAR, ULONG, ULONG);
+static  PBCryptGenRandom pBCryptGenRandom = NULL;
+
+bool _mi_prim_random_buf(void* buf, size_t buf_len) {
+  if (pBCryptGenRandom == NULL) {
+    HINSTANCE hDll = LoadLibrary(TEXT("bcrypt.dll"));
+    if (hDll != NULL) {
+      pBCryptGenRandom = (PBCryptGenRandom)(void (*)(void))GetProcAddress(hDll, "BCryptGenRandom");
+    }
+    if (pBCryptGenRandom == NULL) return false;
+  }
+  return (pBCryptGenRandom(NULL, (PUCHAR)buf, (ULONG)buf_len, BCRYPT_USE_SYSTEM_PREFERRED_RNG) >= 0);
+}
+
+#endif  // MI_USE_RTLGENRANDOM
+
+
+
+//----------------------------------------------------------------
+// Process & Thread Init/Done
+//----------------------------------------------------------------
+
+#if MI_WIN_USE_FIXED_TLS==1
+mi_decl_cache_align size_t _mi_win_tls_offset = 0;
+#endif
+
+//static void mi_debug_out(const char* s) {
+//  HANDLE h = GetStdHandle(STD_ERROR_HANDLE);
+//  WriteConsole(h, s, (DWORD)_mi_strlen(s), NULL, NULL);
+//}
+
+static void mi_win_tls_init(DWORD reason) {
+  if (reason==DLL_PROCESS_ATTACH || reason==DLL_THREAD_ATTACH) {
+    #if MI_WIN_USE_FIXED_TLS==1  // we must allocate a TLS slot dynamically
+    if (_mi_win_tls_offset == 0 && reason == DLL_PROCESS_ATTACH) {
+      const DWORD tls_slot = TlsAlloc();  // usually returns slot 1
+      if (tls_slot == TLS_OUT_OF_INDEXES) {
+        _mi_error_message(EFAULT, "unable to allocate a TLS slot (rebuild without MI_WIN_USE_FIXED_TLS?)\n");
+      }
+      _mi_win_tls_offset = (size_t)tls_slot * sizeof(void*);
+    }
+    #endif
+    #if MI_HAS_TLS_SLOT >= 2  // we must initialize the TLS slot before any allocation
+    if (mi_prim_get_default_heap() == NULL) {
+      _mi_heap_set_default_direct((mi_heap_t*)&_mi_heap_empty);
+      #if MI_DEBUG && MI_WIN_USE_FIXED_TLS==1
+      void* const p = TlsGetValue((DWORD)(_mi_win_tls_offset / sizeof(void*)));
+      mi_assert_internal(p == (void*)&_mi_heap_empty);
+      #endif
+    }
+    #endif
+  }
+}
+
+static void NTAPI mi_win_main(PVOID module, DWORD reason, LPVOID reserved) {
+  MI_UNUSED(reserved);
+  MI_UNUSED(module);
+  mi_win_tls_init(reason);
+  if (reason==DLL_PROCESS_ATTACH) {
+    _mi_auto_process_init();
+  }
+  else if (reason==DLL_PROCESS_DETACH) {
+    _mi_auto_process_done();
+  }
+  else if (reason==DLL_THREAD_DETACH && !_mi_is_redirected()) {
+    _mi_thread_done(NULL);
+  }
+}
+
+
+#if defined(MI_SHARED_LIB)
+  #define MI_PRIM_HAS_PROCESS_ATTACH  1
+
+  // Windows DLL: easy to hook into process_init and thread_done
+  BOOL WINAPI DllMain(HINSTANCE inst, DWORD reason, LPVOID reserved) {
+    mi_win_main((PVOID)inst,reason,reserved);
+    return TRUE;
+  }
+
+  // nothing to do since `_mi_thread_done` is handled through the DLL_THREAD_DETACH event.
+  void _mi_prim_thread_init_auto_done(void) { }
+  void _mi_prim_thread_done_auto_done(void) { }
+  void _mi_prim_thread_associate_default_heap(mi_heap_t* heap) {
+    MI_UNUSED(heap);
+  }
+
+#elif !defined(MI_WIN_USE_FLS)
+  #define MI_PRIM_HAS_PROCESS_ATTACH  1
+
+  static void NTAPI mi_win_main_attach(PVOID module, DWORD reason, LPVOID reserved) {
+    if (reason == DLL_PROCESS_ATTACH || reason == DLL_THREAD_ATTACH) {
+      mi_win_main(module, reason, reserved);
+    }
+  }
+  static void NTAPI mi_win_main_detach(PVOID module, DWORD reason, LPVOID reserved) {
+    if (reason == DLL_PROCESS_DETACH || reason == DLL_THREAD_DETACH) {
+      mi_win_main(module, reason, reserved);
+    }
+  }
+
+  // Set up TLS callbacks in a statically linked library by using special data sections.
+  // See <https://stackoverflow.com/questions/14538159/tls-callback-in-windows>
+  // We use 2 entries to ensure we call attach events before constructors
+  // are called, and detach events after destructors are called.
+  #if defined(__cplusplus)
+  extern "C" {
+  #endif
+
+  #if defined(_WIN64)
+    #pragma comment(linker, "/INCLUDE:_tls_used")
+    #pragma comment(linker, "/INCLUDE:_mi_tls_callback_pre")
+    #pragma comment(linker, "/INCLUDE:_mi_tls_callback_post")
+    #pragma const_seg(".CRT$XLB")
+    extern const PIMAGE_TLS_CALLBACK _mi_tls_callback_pre[];
+    const PIMAGE_TLS_CALLBACK _mi_tls_callback_pre[] = { &mi_win_main_attach };
+    #pragma const_seg()
+    #pragma const_seg(".CRT$XLY")
+    extern const PIMAGE_TLS_CALLBACK _mi_tls_callback_post[];
+    const PIMAGE_TLS_CALLBACK _mi_tls_callback_post[] = { &mi_win_main_detach };
+    #pragma const_seg()
+  #else
+    #pragma comment(linker, "/INCLUDE:__tls_used")
+    #pragma comment(linker, "/INCLUDE:__mi_tls_callback_pre")
+    #pragma comment(linker, "/INCLUDE:__mi_tls_callback_post")
+    #pragma data_seg(".CRT$XLB")
+    PIMAGE_TLS_CALLBACK _mi_tls_callback_pre[] = { &mi_win_main_attach };
+    #pragma data_seg()
+    #pragma data_seg(".CRT$XLY")
+    PIMAGE_TLS_CALLBACK _mi_tls_callback_post[] = { &mi_win_main_detach };
+    #pragma data_seg()
+  #endif
+
+  #if defined(__cplusplus)
+  }
+  #endif
+
+  // nothing to do since `_mi_thread_done` is handled through the DLL_THREAD_DETACH event.
+  void _mi_prim_thread_init_auto_done(void) { }
+  void _mi_prim_thread_done_auto_done(void) { }
+  void _mi_prim_thread_associate_default_heap(mi_heap_t* heap) {
+    MI_UNUSED(heap);
+  }
+
+#else // deprecated: statically linked, use fiber api
+
+  #if defined(_MSC_VER) // on clang/gcc use the constructor attribute (in `src/prim/prim.c`)
+    // MSVC: use data section magic for static libraries
+    // See <https://www.codeguru.com/cpp/misc/misc/applicationcontrol/article.php/c6945/Running-Code-Before-and-After-Main.htm>
+    #define MI_PRIM_HAS_PROCESS_ATTACH 1
+
+    static int mi_process_attach(void) {
+      mi_win_main(NULL,DLL_PROCESS_ATTACH,NULL);
+      atexit(&_mi_auto_process_done);
+      return 0;
+    }
+    typedef int(*mi_crt_callback_t)(void);
+    #if defined(_WIN64)
+      #pragma comment(linker, "/INCLUDE:_mi_tls_callback")
+      #pragma section(".CRT$XIU", long, read)
+    #else
+      #pragma comment(linker, "/INCLUDE:__mi_tls_callback")
+    #endif
+    #pragma data_seg(".CRT$XIU")
+    mi_decl_externc mi_crt_callback_t _mi_tls_callback[] = { &mi_process_attach };
+    #pragma data_seg()
+  #endif
+
+  // use the fiber api for calling `_mi_thread_done`.
+  #include <fibersapi.h>
+  #if (_WIN32_WINNT < 0x600)  // before Windows Vista
+  WINBASEAPI DWORD WINAPI FlsAlloc( _In_opt_ PFLS_CALLBACK_FUNCTION lpCallback );
+  WINBASEAPI PVOID WINAPI FlsGetValue( _In_ DWORD dwFlsIndex );
+  WINBASEAPI BOOL  WINAPI FlsSetValue( _In_ DWORD dwFlsIndex, _In_opt_ PVOID lpFlsData );
+  WINBASEAPI BOOL  WINAPI FlsFree(_In_ DWORD dwFlsIndex);
+  #endif
+
+  static DWORD mi_fls_key = (DWORD)(-1);
+
+  static void NTAPI mi_fls_done(PVOID value) {
+    mi_heap_t* heap = (mi_heap_t*)value;
+    if (heap != NULL) {
+      _mi_thread_done(heap);
+      FlsSetValue(mi_fls_key, NULL);  // prevent recursion as _mi_thread_done may set it back to the main heap, issue #672
+    }
+  }
+
+  void _mi_prim_thread_init_auto_done(void) {
+    mi_fls_key = FlsAlloc(&mi_fls_done);
+  }
+
+  void _mi_prim_thread_done_auto_done(void) {
+    // call thread-done on all threads (except the main thread) to prevent
+    // dangling callback pointer if statically linked with a DLL; Issue #208
+    FlsFree(mi_fls_key);
+  }
+
+  void _mi_prim_thread_associate_default_heap(mi_heap_t* heap) {
+    mi_assert_internal(mi_fls_key != (DWORD)(-1));
+    FlsSetValue(mi_fls_key, heap);
+  }
+#endif
+
+// ----------------------------------------------------
+// Communicate with the redirection module on Windows
+// ----------------------------------------------------
+#if defined(MI_SHARED_LIB) && !defined(MI_WIN_NOREDIRECT)
+  #define MI_PRIM_HAS_ALLOCATOR_INIT 1
+
+  static bool mi_redirected = false;   // true if malloc redirects to mi_malloc
+
+  bool _mi_is_redirected(void) {
+    return mi_redirected;
+  }
+
+  #ifdef __cplusplus
+  extern "C" {
+  #endif
+  mi_decl_export void _mi_redirect_entry(DWORD reason) {
+    // called on redirection; careful as this may be called before DllMain
+    mi_win_tls_init(reason);
+    if (reason == DLL_PROCESS_ATTACH) {
+      mi_redirected = true;
+    }
+    else if (reason == DLL_PROCESS_DETACH) {
+      mi_redirected = false;
+    }
+    else if (reason == DLL_THREAD_DETACH) {
+      _mi_thread_done(NULL);
+    }
+  }
+  __declspec(dllimport) bool mi_cdecl mi_allocator_init(const char** message);
+  __declspec(dllimport) void mi_cdecl mi_allocator_done(void);
+  #ifdef __cplusplus
+  }
+  #endif
+  bool _mi_allocator_init(const char** message) {
+    return mi_allocator_init(message);
+  }
+  void _mi_allocator_done(void) {
+    mi_allocator_done();
+  }
+#endif
diff --git a/compat/mimalloc/random.c b/compat/mimalloc/random.c
new file mode 100644
index 00000000000000..f17698ba8a6d08
--- /dev/null
+++ b/compat/mimalloc/random.c
@@ -0,0 +1,258 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2019-2021, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+#include "mimalloc.h"
+#include "mimalloc/internal.h"
+#include "mimalloc/prim.h"    // _mi_prim_random_buf
+#include <string.h>       // memset
+
+/* ----------------------------------------------------------------------------
+We use our own PRNG to keep predictable performance of random number generation
+and to avoid implementations that use a lock. We only use the OS provided
+random source to initialize the initial seeds. Since we do not need ultimate
+performance but we do rely on the security (for secret cookies in secure mode)
+we use a cryptographically secure generator (chacha20).
+-----------------------------------------------------------------------------*/
+
+#define MI_CHACHA_ROUNDS (20)   // perhaps use 12 for better performance?
+
+
+/* ----------------------------------------------------------------------------
+Chacha20 implementation as the original algorithm with a 64-bit nonce
+and counter: https://en.wikipedia.org/wiki/Salsa20
+The input matrix has sixteen 32-bit values:
+Position  0 to  3: constant key
+Position  4 to 11: the key
+Position 12 to 13: the counter.
+Position 14 to 15: the nonce.
+
+The implementation uses regular C code which compiles very well on modern compilers.
+(gcc x64 has no register spills, and clang 6+ uses SSE instructions)
+-----------------------------------------------------------------------------*/
+
+static inline uint32_t rotl(uint32_t x, uint32_t shift) {
+  return (x << shift) | (x >> (32 - shift));
+}
+
+static inline void qround(uint32_t x[16], size_t a, size_t b, size_t c, size_t d) {
+  x[a] += x[b]; x[d] = rotl(x[d] ^ x[a], 16);
+  x[c] += x[d]; x[b] = rotl(x[b] ^ x[c], 12);
+  x[a] += x[b]; x[d] = rotl(x[d] ^ x[a], 8);
+  x[c] += x[d]; x[b] = rotl(x[b] ^ x[c], 7);
+}
+
+static void chacha_block(mi_random_ctx_t* ctx)
+{
+  // scramble into `x`
+  uint32_t x[16];
+  for (size_t i = 0; i < 16; i++) {
+    x[i] = ctx->input[i];
+  }
+  for (size_t i = 0; i < MI_CHACHA_ROUNDS; i += 2) {
+    qround(x, 0, 4,  8, 12);
+    qround(x, 1, 5,  9, 13);
+    qround(x, 2, 6, 10, 14);
+    qround(x, 3, 7, 11, 15);
+    qround(x, 0, 5, 10, 15);
+    qround(x, 1, 6, 11, 12);
+    qround(x, 2, 7,  8, 13);
+    qround(x, 3, 4,  9, 14);
+  }
+
+  // add scrambled data to the initial state
+  for (size_t i = 0; i < 16; i++) {
+    ctx->output[i] = x[i] + ctx->input[i];
+  }
+  ctx->output_available = 16;
+
+  // increment the counter for the next round
+  ctx->input[12] += 1;
+  if (ctx->input[12] == 0) {
+    ctx->input[13] += 1;
+    if (ctx->input[13] == 0) {  // and keep increasing into the nonce
+      ctx->input[14] += 1;
+    }
+  }
+}
+
+static uint32_t chacha_next32(mi_random_ctx_t* ctx) {
+  if (ctx->output_available <= 0) {
+    chacha_block(ctx);
+    ctx->output_available = 16; // (assign again to suppress static analysis warning)
+  }
+  const uint32_t x = ctx->output[16 - ctx->output_available];
+  ctx->output[16 - ctx->output_available] = 0; // reset once the data is handed out
+  ctx->output_available--;
+  return x;
+}
+
+static inline uint32_t read32(const uint8_t* p, size_t idx32) {
+  const size_t i = 4*idx32;
+  return ((uint32_t)p[i+0] | (uint32_t)p[i+1] << 8 | (uint32_t)p[i+2] << 16 | (uint32_t)p[i+3] << 24);
+}
+
+static void chacha_init(mi_random_ctx_t* ctx, const uint8_t key[32], uint64_t nonce)
+{
+  // since we only use chacha for randomness (and not encryption) we
+  // do not _need_ to read 32-bit values as little endian but we do anyways
+  // just for being compatible :-)
+  memset(ctx, 0, sizeof(*ctx));
+  for (size_t i = 0; i < 4; i++) {
+    const uint8_t* sigma = (uint8_t*)"expand 32-byte k";
+    ctx->input[i] = read32(sigma,i);
+  }
+  for (size_t i = 0; i < 8; i++) {
+    ctx->input[i + 4] = read32(key,i);
+  }
+  ctx->input[12] = 0;
+  ctx->input[13] = 0;
+  ctx->input[14] = (uint32_t)nonce;
+  ctx->input[15] = (uint32_t)(nonce >> 32);
+}
+
+static void chacha_split(mi_random_ctx_t* ctx, uint64_t nonce, mi_random_ctx_t* ctx_new) {
+  memset(ctx_new, 0, sizeof(*ctx_new));
+  _mi_memcpy(ctx_new->input, ctx->input, sizeof(ctx_new->input));
+  ctx_new->input[12] = 0;
+  ctx_new->input[13] = 0;
+  ctx_new->input[14] = (uint32_t)nonce;
+  ctx_new->input[15] = (uint32_t)(nonce >> 32);
+  mi_assert_internal(ctx->input[14] != ctx_new->input[14] || ctx->input[15] != ctx_new->input[15]); // do not reuse nonces!
+  chacha_block(ctx_new);
+}
+
+
+/* ----------------------------------------------------------------------------
+Random interface
+-----------------------------------------------------------------------------*/
+
+#if MI_DEBUG>1
+static bool mi_random_is_initialized(mi_random_ctx_t* ctx) {
+  return (ctx != NULL && ctx->input[0] != 0);
+}
+#endif
+
+void _mi_random_split(mi_random_ctx_t* ctx, mi_random_ctx_t* ctx_new) {
+  mi_assert_internal(mi_random_is_initialized(ctx));
+  mi_assert_internal(ctx != ctx_new);
+  chacha_split(ctx, (uintptr_t)ctx_new /*nonce*/, ctx_new);
+}
+
+uintptr_t _mi_random_next(mi_random_ctx_t* ctx) {
+  mi_assert_internal(mi_random_is_initialized(ctx));
+  uintptr_t r;
+  do {
+    #if MI_INTPTR_SIZE <= 4
+    r = chacha_next32(ctx);
+    #elif MI_INTPTR_SIZE == 8
+    r = (((uintptr_t)chacha_next32(ctx) << 32) | chacha_next32(ctx));
+    #else
+    # error "define mi_random_next for this platform"
+    #endif
+  } while (r==0);
+  return r;
+}
+
+
+/* ----------------------------------------------------------------------------
+To initialize a fresh random context.
+If we cannot get good randomness, we fall back to weak randomness based on a timer and ASLR.
+-----------------------------------------------------------------------------*/
+
+uintptr_t _mi_os_random_weak(uintptr_t extra_seed) {
+  uintptr_t x = (uintptr_t)&_mi_os_random_weak ^ extra_seed; // ASLR makes the address random
+  x ^= _mi_prim_clock_now();
+  // and do a few randomization steps
+  uintptr_t max = ((x ^ (x >> 17)) & 0x0F) + 1;
+  for (uintptr_t i = 0; i < max || x==0; i++, x++) {
+    x = _mi_random_shuffle(x);
+  }
+  mi_assert_internal(x != 0);
+  return x;
+}
+
+static void mi_random_init_ex(mi_random_ctx_t* ctx, bool use_weak) {
+  uint8_t key[32];
+  if (use_weak || !_mi_prim_random_buf(key, sizeof(key))) {
+    // if we fail to get random data from the OS, we fall back to a
+    // weak random source based on the current time
+    #if !defined(__wasi__)
+    if (!use_weak) { _mi_warning_message("unable to use secure randomness\n"); }
+    #endif
+    uintptr_t x = _mi_os_random_weak(0);
+    for (size_t i = 0; i < 8; i++, x++) {  // key is eight 32-bit words.
+      x = _mi_random_shuffle(x);
+      ((uint32_t*)key)[i] = (uint32_t)x;
+    }
+    ctx->weak = true;
+  }
+  else {
+    ctx->weak = false;
+  }
+  chacha_init(ctx, key, (uintptr_t)ctx /*nonce*/ );
+}
+
+void _mi_random_init(mi_random_ctx_t* ctx) {
+  mi_random_init_ex(ctx, false);
+}
+
+void _mi_random_init_weak(mi_random_ctx_t * ctx) {
+  mi_random_init_ex(ctx, true);
+}
+
+void _mi_random_reinit_if_weak(mi_random_ctx_t * ctx) {
+  if (ctx->weak) {
+    _mi_random_init(ctx);
+  }
+}
+
+/* --------------------------------------------------------
+test vectors from <https://tools.ietf.org/html/rfc8439>
+----------------------------------------------------------- */
+/*
+static bool array_equals(uint32_t* x, uint32_t* y, size_t n) {
+  for (size_t i = 0; i < n; i++) {
+    if (x[i] != y[i]) return false;
+  }
+  return true;
+}
+static void chacha_test(void)
+{
+  uint32_t x[4] = { 0x11111111, 0x01020304, 0x9b8d6f43, 0x01234567 };
+  uint32_t x_out[4] = { 0xea2a92f4, 0xcb1cf8ce, 0x4581472e, 0x5881c4bb };
+  qround(x, 0, 1, 2, 3);
+  mi_assert_internal(array_equals(x, x_out, 4));
+
+  uint32_t y[16] = {
+       0x879531e0,  0xc5ecf37d,  0x516461b1,  0xc9a62f8a,
+       0x44c20ef3,  0x3390af7f,  0xd9fc690b,  0x2a5f714c,
+       0x53372767,  0xb00a5631,  0x974c541a,  0x359e9963,
+       0x5c971061,  0x3d631689,  0x2098d9d6,  0x91dbd320 };
+  uint32_t y_out[16] = {
+       0x879531e0,  0xc5ecf37d,  0xbdb886dc,  0xc9a62f8a,
+       0x44c20ef3,  0x3390af7f,  0xd9fc690b,  0xcfacafd2,
+       0xe46bea80,  0xb00a5631,  0x974c541a,  0x359e9963,
+       0x5c971061,  0xccc07c79,  0x2098d9d6,  0x91dbd320 };
+  qround(y, 2, 7, 8, 13);
+  mi_assert_internal(array_equals(y, y_out, 16));
+
+  mi_random_ctx_t r = {
+    { 0x61707865, 0x3320646e, 0x79622d32, 0x6b206574,
+      0x03020100, 0x07060504, 0x0b0a0908, 0x0f0e0d0c,
+      0x13121110, 0x17161514, 0x1b1a1918, 0x1f1e1d1c,
+      0x00000001, 0x09000000, 0x4a000000, 0x00000000 },
+    {0},
+    0
+  };
+  uint32_t r_out[16] = {
+       0xe4e7f110, 0x15593bd1, 0x1fdd0f50, 0xc47120a3,
+       0xc7f4d1c7, 0x0368c033, 0x9aaa2204, 0x4e6cd4c3,
+       0x466482d2, 0x09aa9f07, 0x05d7c214, 0xa2028bd9,
+       0xd19c12b5, 0xb94e16de, 0xe883d0cb, 0x4e3c50a2 };
+  chacha_block(&r);
+  mi_assert_internal(array_equals(r.output, r_out, 16));
+}
+*/
diff --git a/compat/mimalloc/segment-map.c b/compat/mimalloc/segment-map.c
new file mode 100644
index 00000000000000..bbcea28aabc2e1
--- /dev/null
+++ b/compat/mimalloc/segment-map.c
@@ -0,0 +1,142 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2019-2023, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+
+/* -----------------------------------------------------------
+  The following functions are to reliably find the segment or
+  block that encompasses any pointer p (or NULL if it is not
+  in any of our segments).
+  We maintain a bitmap of all memory with 1 bit per MI_SEGMENT_SIZE (64MiB)
+  set to 1 if it contains the segment meta data.
+----------------------------------------------------------- */
+#include "mimalloc.h"
+#include "mimalloc/internal.h"
+#include "mimalloc/atomic.h"
+
+// Reduce total address space to reduce .bss  (due to the `mi_segment_map`)
+#if (MI_INTPTR_SIZE > 4) && MI_TRACK_ASAN
+#define MI_SEGMENT_MAP_MAX_ADDRESS    (128*1024ULL*MI_GiB)  // 128 TiB  (see issue #881)
+#elif (MI_INTPTR_SIZE > 4)
+#define MI_SEGMENT_MAP_MAX_ADDRESS    (48*1024ULL*MI_GiB)   // 48 TiB
+#else
+#define MI_SEGMENT_MAP_MAX_ADDRESS    (UINT32_MAX)
+#endif
+
+#define MI_SEGMENT_MAP_PART_SIZE      (MI_INTPTR_SIZE*MI_KiB - 128)      // 128 > sizeof(mi_memid_t) !
+#define MI_SEGMENT_MAP_PART_BITS      (8*MI_SEGMENT_MAP_PART_SIZE)
+#define MI_SEGMENT_MAP_PART_ENTRIES   (MI_SEGMENT_MAP_PART_SIZE / MI_INTPTR_SIZE)
+#define MI_SEGMENT_MAP_PART_BIT_SPAN  (MI_SEGMENT_ALIGN)                 // memory area covered by 1 bit
+
+#if (MI_SEGMENT_MAP_PART_BITS < (MI_SEGMENT_MAP_MAX_ADDRESS / MI_SEGMENT_MAP_PART_BIT_SPAN)) // prevent overflow on 32-bit (issue #1017)
+#define MI_SEGMENT_MAP_PART_SPAN      (MI_SEGMENT_MAP_PART_BITS * MI_SEGMENT_MAP_PART_BIT_SPAN)
+#else
+#define MI_SEGMENT_MAP_PART_SPAN      MI_SEGMENT_MAP_MAX_ADDRESS
+#endif
+
+#define MI_SEGMENT_MAP_MAX_PARTS      ((MI_SEGMENT_MAP_MAX_ADDRESS / MI_SEGMENT_MAP_PART_SPAN) + 1)
+
+// A part of the segment map.
+typedef struct mi_segmap_part_s {
+  mi_memid_t memid;
+  _Atomic(uintptr_t) map[MI_SEGMENT_MAP_PART_ENTRIES];
+} mi_segmap_part_t;
+
+// Allocate parts on-demand to reduce .bss footprint
+static _Atomic(mi_segmap_part_t*) mi_segment_map[MI_SEGMENT_MAP_MAX_PARTS]; // = { NULL, .. }
+
+static mi_segmap_part_t* mi_segment_map_index_of(const mi_segment_t* segment, bool create_on_demand, size_t* idx, size_t* bitidx) {
+  // note: segment can be invalid or NULL.
+  mi_assert_internal(_mi_ptr_segment(segment + 1) == segment); // is it aligned on MI_SEGMENT_SIZE?
+  *idx = 0;
+  *bitidx = 0;
+  if ((uintptr_t)segment >= MI_SEGMENT_MAP_MAX_ADDRESS) return NULL;
+  const uintptr_t segindex = ((uintptr_t)segment) / MI_SEGMENT_MAP_PART_SPAN;
+  if (segindex >= MI_SEGMENT_MAP_MAX_PARTS) return NULL;
+  mi_segmap_part_t* part = mi_atomic_load_ptr_relaxed(mi_segmap_part_t, &mi_segment_map[segindex]);
+
+  // allocate on demand to reduce .bss footprint
+  if mi_unlikely(part == NULL) {
+    if (!create_on_demand) return NULL;
+    mi_memid_t memid;
+    part = (mi_segmap_part_t*)_mi_os_zalloc(sizeof(mi_segmap_part_t), &memid);
+    if (part == NULL) return NULL;
+    part->memid = memid;
+    mi_segmap_part_t* expected = NULL;
+    if (!mi_atomic_cas_ptr_strong_release(mi_segmap_part_t, &mi_segment_map[segindex], &expected, part)) {
+      _mi_os_free(part, sizeof(mi_segmap_part_t), memid);
+      part = expected;
+      if (part == NULL) return NULL;
+    }
+  }
+  mi_assert(part != NULL);
+  const uintptr_t offset = ((uintptr_t)segment) % MI_SEGMENT_MAP_PART_SPAN;
+  const uintptr_t bitofs = offset / MI_SEGMENT_MAP_PART_BIT_SPAN;
+  *idx = bitofs / MI_INTPTR_BITS;
+  *bitidx = bitofs % MI_INTPTR_BITS;
+  return part;
+}
+
+void _mi_segment_map_allocated_at(const mi_segment_t* segment) {
+  if (segment->memid.memkind == MI_MEM_ARENA) return; // we look up segments in the arenas first and don't need the segment map
+  size_t index;
+  size_t bitidx;
+  mi_segmap_part_t* part = mi_segment_map_index_of(segment, true /* alloc map if needed */, &index, &bitidx);
+  if (part == NULL) return; // outside our address range..
+  uintptr_t mask = mi_atomic_load_relaxed(&part->map[index]);
+  uintptr_t newmask;
+  do {
+    newmask = (mask | ((uintptr_t)1 << bitidx));
+  } while (!mi_atomic_cas_weak_release(&part->map[index], &mask, newmask));
+}
+
+void _mi_segment_map_freed_at(const mi_segment_t* segment) {
+  if (segment->memid.memkind == MI_MEM_ARENA) return;
+  size_t index;
+  size_t bitidx;
+  mi_segmap_part_t* part = mi_segment_map_index_of(segment, false /* don't alloc if not present */, &index, &bitidx);
+  if (part == NULL) return; // outside our address range..
+  uintptr_t mask = mi_atomic_load_relaxed(&part->map[index]);
+  uintptr_t newmask;
+  do {
+    newmask = (mask & ~((uintptr_t)1 << bitidx));
+  } while (!mi_atomic_cas_weak_release(&part->map[index], &mask, newmask));
+}
+
+// Determine the segment belonging to a pointer or NULL if it is not in a valid segment.
+static mi_segment_t* _mi_segment_of(const void* p) {
+  if (p == NULL) return NULL;
+  mi_segment_t* segment = _mi_ptr_segment(p);  // segment can be NULL
+  size_t index;
+  size_t bitidx;
+  mi_segmap_part_t* part = mi_segment_map_index_of(segment, false /* don't alloc if not present */, &index, &bitidx);
+  if (part == NULL) return NULL;
+  const uintptr_t mask = mi_atomic_load_relaxed(&part->map[index]);
+  if mi_likely((mask & ((uintptr_t)1 << bitidx)) != 0) {
+    bool cookie_ok = (_mi_ptr_cookie(segment) == segment->cookie);
+    mi_assert_internal(cookie_ok); MI_UNUSED(cookie_ok);
+    return segment; // yes, allocated by us
+  }
+  return NULL;
+}
+
+// Is this a valid pointer in our heap?
+static bool mi_is_valid_pointer(const void* p) {
+  // first check if it is in an arena, then check if it is OS allocated
+  return (_mi_arena_contains(p) || _mi_segment_of(p) != NULL);
+}
+
+mi_decl_nodiscard mi_decl_export bool mi_is_in_heap_region(const void* p) mi_attr_noexcept {
+  return mi_is_valid_pointer(p);
+}
+
+void _mi_segment_map_unsafe_destroy(void) {
+  for (size_t i = 0; i < MI_SEGMENT_MAP_MAX_PARTS; i++) {
+    mi_segmap_part_t* part = mi_atomic_exchange_ptr_relaxed(mi_segmap_part_t, &mi_segment_map[i], NULL);
+    if (part != NULL) {
+      _mi_os_free(part, sizeof(mi_segmap_part_t), part->memid);
+    }
+  }
+}
diff --git a/compat/mimalloc/segment.c b/compat/mimalloc/segment.c
new file mode 100644
index 00000000000000..6f398822dfb421
--- /dev/null
+++ b/compat/mimalloc/segment.c
@@ -0,0 +1,1702 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2024, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+#include "mimalloc.h"
+#include "mimalloc/internal.h"
+#include "mimalloc/atomic.h"
+
+#include <string.h>  // memset
+#include <stdio.h>
+
+// -------------------------------------------------------------------
+// Segments
+// mimalloc pages reside in segments. See `mi_segment_valid` for invariants.
+// -------------------------------------------------------------------
+
+
+static void mi_segment_try_purge(mi_segment_t* segment, bool force);
+
+
+// -------------------------------------------------------------------
+// commit mask
+// -------------------------------------------------------------------
+
+static bool mi_commit_mask_all_set(const mi_commit_mask_t* commit, const mi_commit_mask_t* cm) {
+  for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) {
+    if ((commit->mask[i] & cm->mask[i]) != cm->mask[i]) return false;
+  }
+  return true;
+}
+
+static bool mi_commit_mask_any_set(const mi_commit_mask_t* commit, const mi_commit_mask_t* cm) {
+  for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) {
+    if ((commit->mask[i] & cm->mask[i]) != 0) return true;
+  }
+  return false;
+}
+
+static void mi_commit_mask_create_intersect(const mi_commit_mask_t* commit, const mi_commit_mask_t* cm, mi_commit_mask_t* res) {
+  for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) {
+    res->mask[i] = (commit->mask[i] & cm->mask[i]);
+  }
+}
+
+static void mi_commit_mask_clear(mi_commit_mask_t* res, const mi_commit_mask_t* cm) {
+  for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) {
+    res->mask[i] &= ~(cm->mask[i]);
+  }
+}
+
+static void mi_commit_mask_set(mi_commit_mask_t* res, const mi_commit_mask_t* cm) {
+  for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) {
+    res->mask[i] |= cm->mask[i];
+  }
+}
+
+static void mi_commit_mask_create(size_t bitidx, size_t bitcount, mi_commit_mask_t* cm) {
+  mi_assert_internal(bitidx < MI_COMMIT_MASK_BITS);
+  mi_assert_internal((bitidx + bitcount) <= MI_COMMIT_MASK_BITS);
+  if (bitcount == MI_COMMIT_MASK_BITS) {
+    mi_assert_internal(bitidx==0);
+    mi_commit_mask_create_full(cm);
+  }
+  else if (bitcount == 0) {
+    mi_commit_mask_create_empty(cm);
+  }
+  else {
+    mi_commit_mask_create_empty(cm);
+    size_t i = bitidx / MI_COMMIT_MASK_FIELD_BITS;
+    size_t ofs = bitidx % MI_COMMIT_MASK_FIELD_BITS;
+    while (bitcount > 0) {
+      mi_assert_internal(i < MI_COMMIT_MASK_FIELD_COUNT);
+      size_t avail = MI_COMMIT_MASK_FIELD_BITS - ofs;
+      size_t count = (bitcount > avail ? avail : bitcount);
+      size_t mask = (count >= MI_COMMIT_MASK_FIELD_BITS ? ~((size_t)0) : (((size_t)1 << count) - 1) << ofs);
+      cm->mask[i] = mask;
+      bitcount -= count;
+      ofs = 0;
+      i++;
+    }
+  }
+}
+
+size_t _mi_commit_mask_committed_size(const mi_commit_mask_t* cm, size_t total) {
+  mi_assert_internal((total%MI_COMMIT_MASK_BITS)==0);
+  size_t count = 0;
+  for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) {
+    size_t mask = cm->mask[i];
+    if (~mask == 0) {
+      count += MI_COMMIT_MASK_FIELD_BITS;
+    }
+    else {
+      for (; mask != 0; mask >>= 1) {  // todo: use popcount
+        if ((mask&1)!=0) count++;
+      }
+    }
+  }
+  // we use total since for huge segments each commit bit may represent a larger size
+  return ((total / MI_COMMIT_MASK_BITS) * count);
+}
+
+
+size_t _mi_commit_mask_next_run(const mi_commit_mask_t* cm, size_t* idx) {
+  size_t i = (*idx) / MI_COMMIT_MASK_FIELD_BITS;
+  size_t ofs = (*idx) % MI_COMMIT_MASK_FIELD_BITS;
+  size_t mask = 0;
+  // find first ones
+  while (i < MI_COMMIT_MASK_FIELD_COUNT) {
+    mask = cm->mask[i];
+    mask >>= ofs;
+    if (mask != 0) {
+      while ((mask&1) == 0) {
+        mask >>= 1;
+        ofs++;
+      }
+      break;
+    }
+    i++;
+    ofs = 0;
+  }
+  if (i >= MI_COMMIT_MASK_FIELD_COUNT) {
+    // not found
+    *idx = MI_COMMIT_MASK_BITS;
+    return 0;
+  }
+  else {
+    // found, count ones
+    size_t count = 0;
+    *idx = (i*MI_COMMIT_MASK_FIELD_BITS) + ofs;
+    do {
+      mi_assert_internal(ofs < MI_COMMIT_MASK_FIELD_BITS && (mask&1) == 1);
+      do {
+        count++;
+        mask >>= 1;
+      } while ((mask&1) == 1);
+      if ((((*idx + count) % MI_COMMIT_MASK_FIELD_BITS) == 0)) {
+        i++;
+        if (i >= MI_COMMIT_MASK_FIELD_COUNT) break;
+        mask = cm->mask[i];
+        ofs = 0;
+      }
+    } while ((mask&1) == 1);
+    mi_assert_internal(count > 0);
+    return count;
+  }
+}
+
+
+/* --------------------------------------------------------------------------------
+  Segment allocation
+  We allocate pages inside bigger "segments" (32 MiB on 64-bit). This is to avoid
+  splitting VMAs on Linux and to reduce fragmentation on other OSes.
+  Each thread owns its own segments.
+
+  Currently we have:
+  - small pages (64KiB)
+  - medium pages (512KiB)
+  - large pages (4MiB)
+  - huge segments, which hold a single page that can be larger than `MI_SEGMENT_SIZE`;
+    these are used for blocks `> MI_LARGE_OBJ_SIZE_MAX` or with alignment `> MI_BLOCK_ALIGNMENT_MAX`.
+
+  The memory for a segment is usually committed on demand.
+  (i.e. we are careful to not touch the memory until we actually allocate a block there)
+
+  If a thread ends, it "abandons" pages that still contain live blocks.
+  The segments holding such pages are abandoned and can be reclaimed by
+  still-running threads (much like work-stealing).
+-------------------------------------------------------------------------------- */
+
+
+/* -----------------------------------------------------------
+   Slices
+----------------------------------------------------------- */
+
+
+static const mi_slice_t* mi_segment_slices_end(const mi_segment_t* segment) {
+  return &segment->slices[segment->slice_entries];
+}
+
+static uint8_t* mi_slice_start(const mi_slice_t* slice) {
+  mi_segment_t* segment = _mi_ptr_segment(slice);
+  mi_assert_internal(slice >= segment->slices && slice < mi_segment_slices_end(segment));
+  return ((uint8_t*)segment + ((slice - segment->slices)*MI_SEGMENT_SLICE_SIZE));
+}
+
+
+/* -----------------------------------------------------------
+   Bins
+----------------------------------------------------------- */
+// Use bit scan reverse (`mi_bsr`) to quickly find the highest set bit of the slice count
+
+static inline size_t mi_slice_bin8(size_t slice_count) {
+  if (slice_count<=1) return slice_count;
+  mi_assert_internal(slice_count <= MI_SLICES_PER_SEGMENT);
+  slice_count--;
+  size_t s = mi_bsr(slice_count);  // slice_count > 1
+  if (s <= 2) return slice_count + 1;
+  size_t bin = ((s << 2) | ((slice_count >> (s - 2))&0x03)) - 4;
+  return bin;
+}
+
+static inline size_t mi_slice_bin(size_t slice_count) {
+  mi_assert_internal(slice_count*MI_SEGMENT_SLICE_SIZE <= MI_SEGMENT_SIZE);
+  mi_assert_internal(mi_slice_bin8(MI_SLICES_PER_SEGMENT) <= MI_SEGMENT_BIN_MAX);
+  size_t bin = mi_slice_bin8(slice_count);
+  mi_assert_internal(bin <= MI_SEGMENT_BIN_MAX);
+  return bin;
+}
+
+static inline size_t mi_slice_index(const mi_slice_t* slice) {
+  mi_segment_t* segment = _mi_ptr_segment(slice);
+  ptrdiff_t index = slice - segment->slices;
+  mi_assert_internal(index >= 0 && index < (ptrdiff_t)segment->slice_entries);
+  return index;
+}
+
+
+/* -----------------------------------------------------------
+   Slice span queues
+----------------------------------------------------------- */
+
+static void mi_span_queue_push(mi_span_queue_t* sq, mi_slice_t* slice) {
+  // todo: or push to the end?
+  mi_assert_internal(slice->prev == NULL && slice->next==NULL);
+  slice->prev = NULL; // paranoia
+  slice->next = sq->first;
+  sq->first = slice;
+  if (slice->next != NULL) slice->next->prev = slice;
+                     else sq->last = slice;
+  slice->block_size = 0; // free
+}
+
+static mi_span_queue_t* mi_span_queue_for(size_t slice_count, mi_segments_tld_t* tld) {
+  size_t bin = mi_slice_bin(slice_count);
+  mi_span_queue_t* sq = &tld->spans[bin];
+  mi_assert_internal(sq->slice_count >= slice_count);
+  return sq;
+}
+
+static void mi_span_queue_delete(mi_span_queue_t* sq, mi_slice_t* slice) {
+  mi_assert_internal(slice->block_size==0 && slice->slice_count>0 && slice->slice_offset==0);
+  // should also work if the queue does not contain the slice (which can happen during reclaim)
+  if (slice->prev != NULL) slice->prev->next = slice->next;
+  if (slice == sq->first) sq->first = slice->next;
+  if (slice->next != NULL) slice->next->prev = slice->prev;
+  if (slice == sq->last) sq->last = slice->prev;
+  slice->prev = NULL;
+  slice->next = NULL;
+  slice->block_size = 1; // no more free
+}
+
+
+/* -----------------------------------------------------------
+ Invariant checking
+----------------------------------------------------------- */
+
+static bool mi_slice_is_used(const mi_slice_t* slice) {
+  return (slice->block_size > 0);
+}
+
+
+#if (MI_DEBUG>=3)
+static bool mi_span_queue_contains(mi_span_queue_t* sq, mi_slice_t* slice) {
+  for (mi_slice_t* s = sq->first; s != NULL; s = s->next) {
+    if (s==slice) return true;
+  }
+  return false;
+}
+
+static bool mi_segment_is_valid(mi_segment_t* segment, mi_segments_tld_t* tld) {
+  mi_assert_internal(segment != NULL);
+  mi_assert_internal(_mi_ptr_cookie(segment) == segment->cookie);
+  mi_assert_internal(segment->abandoned <= segment->used);
+  mi_assert_internal(segment->thread_id == 0 || segment->thread_id == _mi_thread_id());
+  mi_assert_internal(mi_commit_mask_all_set(&segment->commit_mask, &segment->purge_mask)); // can only decommit committed blocks
+  //mi_assert_internal(segment->segment_info_size % MI_SEGMENT_SLICE_SIZE == 0);
+  mi_slice_t* slice = &segment->slices[0];
+  const mi_slice_t* end = mi_segment_slices_end(segment);
+  size_t used_count = 0;
+  mi_span_queue_t* sq;
+  while(slice < end) {
+    mi_assert_internal(slice->slice_count > 0);
+    mi_assert_internal(slice->slice_offset == 0);
+    size_t index = mi_slice_index(slice);
+    size_t maxindex = (index + slice->slice_count >= segment->slice_entries ? segment->slice_entries : index + slice->slice_count) - 1;
+    if (mi_slice_is_used(slice)) { // a page in use, we need at least MAX_SLICE_OFFSET_COUNT valid back offsets
+      used_count++;
+      mi_assert_internal(slice->is_huge == (segment->kind == MI_SEGMENT_HUGE));
+      for (size_t i = 0; i <= MI_MAX_SLICE_OFFSET_COUNT && index + i <= maxindex; i++) {
+        mi_assert_internal(segment->slices[index + i].slice_offset == i*sizeof(mi_slice_t));
+        mi_assert_internal(i==0 || segment->slices[index + i].slice_count == 0);
+        mi_assert_internal(i==0 || segment->slices[index + i].block_size == 1);
+      }
+      // and the last entry as well (for coalescing)
+      const mi_slice_t* last = slice + slice->slice_count - 1;
+      if (last > slice && last < mi_segment_slices_end(segment)) {
+        mi_assert_internal(last->slice_offset == (slice->slice_count-1)*sizeof(mi_slice_t));
+        mi_assert_internal(last->slice_count == 0);
+        mi_assert_internal(last->block_size == 1);
+      }
+    }
+    else {  // free range of slices; only last slice needs a valid back offset
+      mi_slice_t* last = &segment->slices[maxindex];
+      if (segment->kind != MI_SEGMENT_HUGE || slice->slice_count <= (segment->slice_entries - segment->segment_info_slices)) {
+        mi_assert_internal((uint8_t*)slice == (uint8_t*)last - last->slice_offset);
+      }
+      mi_assert_internal(slice == last || last->slice_count == 0 );
+      mi_assert_internal(last->block_size == 0 || (segment->kind==MI_SEGMENT_HUGE && last->block_size==1));
+      if (segment->kind != MI_SEGMENT_HUGE && segment->thread_id != 0) { // segment is neither huge nor abandoned
+        sq = mi_span_queue_for(slice->slice_count,tld);
+        mi_assert_internal(mi_span_queue_contains(sq,slice));
+      }
+    }
+    slice = &segment->slices[maxindex+1];
+  }
+  mi_assert_internal(slice == end);
+  mi_assert_internal(used_count == segment->used + 1);
+  return true;
+}
+#endif
+
+/* -----------------------------------------------------------
+ Segment size calculations
+----------------------------------------------------------- */
+
+static size_t mi_segment_info_size(mi_segment_t* segment) {
+  return segment->segment_info_slices * MI_SEGMENT_SLICE_SIZE;
+}
+
+static uint8_t* _mi_segment_page_start_from_slice(const mi_segment_t* segment, const mi_slice_t* slice, size_t block_size, size_t* page_size)
+{
+  const ptrdiff_t idx = slice - segment->slices;
+  const size_t psize = (size_t)slice->slice_count * MI_SEGMENT_SLICE_SIZE;
+  uint8_t* const pstart = (uint8_t*)segment + (idx*MI_SEGMENT_SLICE_SIZE);
+  // make the start not OS page aligned for smaller blocks to avoid page/cache effects
+  // note: the offset must always be a block_size multiple since we assume small allocations
+  // are aligned (see `mi_heap_malloc_aligned`).
+  size_t start_offset = 0;
+  if (block_size > 0 && block_size <= MI_MAX_ALIGN_GUARANTEE) {
+    // for small objects, ensure the page start is aligned with the block size (PR#66 by kickunderscore)
+    const size_t adjust = block_size - ((uintptr_t)pstart % block_size);
+    if (adjust < block_size && psize >= block_size + adjust) {
+      start_offset += adjust;
+    }
+  }
+  if (block_size >= MI_INTPTR_SIZE) {
+    if (block_size <= 64) { start_offset += 3*block_size; }
+    else if (block_size <= 512) { start_offset += block_size; }
+  }
+  start_offset = _mi_align_up(start_offset, MI_MAX_ALIGN_SIZE);
+  mi_assert_internal(_mi_is_aligned(pstart + start_offset, MI_MAX_ALIGN_SIZE));
+  mi_assert_internal(block_size == 0 || block_size > MI_MAX_ALIGN_GUARANTEE || _mi_is_aligned(pstart + start_offset,block_size));
+  if (page_size != NULL) { *page_size = psize - start_offset; }
+  return (pstart + start_offset);
+}
+
+// Start of the page available memory; can be used on uninitialized pages
+uint8_t* _mi_segment_page_start(const mi_segment_t* segment, const mi_page_t* page, size_t* page_size)
+{
+  const mi_slice_t* slice = mi_page_to_slice((mi_page_t*)page);
+  uint8_t* p = _mi_segment_page_start_from_slice(segment, slice, mi_page_block_size(page), page_size);
+  mi_assert_internal(mi_page_block_size(page) > 0 || _mi_ptr_page(p) == page);
+  mi_assert_internal(_mi_ptr_segment(p) == segment);
+  return p;
+}
+
+
+static size_t mi_segment_calculate_slices(size_t required, size_t* info_slices) {
+  size_t page_size = _mi_os_page_size();
+  size_t isize     = _mi_align_up(sizeof(mi_segment_t), page_size);
+  size_t guardsize = 0;
+
+  if (MI_SECURE>0) {
+    // in secure mode, we set up a protected page in between the segment info
+    // and the page data (and one at the end of the segment)
+    guardsize = page_size;
+    if (required > 0) {
+      required = _mi_align_up(required, MI_SEGMENT_SLICE_SIZE) + page_size;
+    }
+  }
+
+  isize = _mi_align_up(isize + guardsize, MI_SEGMENT_SLICE_SIZE);
+  if (info_slices != NULL) *info_slices = isize / MI_SEGMENT_SLICE_SIZE;
+  size_t segment_size = (required==0 ? MI_SEGMENT_SIZE : _mi_align_up( required + isize + guardsize, MI_SEGMENT_SLICE_SIZE) );
+  mi_assert_internal(segment_size % MI_SEGMENT_SLICE_SIZE == 0);
+  return (segment_size / MI_SEGMENT_SLICE_SIZE);
+}
+
+
+/* ----------------------------------------------------------------------------
+Segment caches
+We keep a small segment cache per thread to increase local
+reuse and avoid setting/clearing guard pages in secure mode.
+------------------------------------------------------------------------------- */
+
+static void mi_segments_track_size(long segment_size, mi_segments_tld_t* tld) {
+  if (segment_size>=0) _mi_stat_increase(&tld->stats->segments,1);
+                  else _mi_stat_decrease(&tld->stats->segments,1);
+  tld->count += (segment_size >= 0 ? 1 : -1);
+  if (tld->count > tld->peak_count) tld->peak_count = tld->count;
+  tld->current_size += segment_size;
+  if (tld->current_size > tld->peak_size) tld->peak_size = tld->current_size;
+}
+
+static void mi_segment_os_free(mi_segment_t* segment, mi_segments_tld_t* tld) {
+  segment->thread_id = 0;
+  _mi_segment_map_freed_at(segment);
+  mi_segments_track_size(-((long)mi_segment_size(segment)),tld);
+  if (segment->was_reclaimed) {
+    tld->reclaim_count--;
+    segment->was_reclaimed = false;
+  }
+  if (MI_SECURE>0) {
+    // _mi_os_unprotect(segment, mi_segment_size(segment)); // ensure no more guard pages are set
+    // unprotect the guard pages; we cannot just unprotect the whole segment size as part may be decommitted
+    size_t os_pagesize = _mi_os_page_size();
+    _mi_os_unprotect((uint8_t*)segment + mi_segment_info_size(segment) - os_pagesize, os_pagesize);
+    uint8_t* end = (uint8_t*)segment + mi_segment_size(segment) - os_pagesize;
+    _mi_os_unprotect(end, os_pagesize);
+  }
+
+  // purge delayed decommits now? (no, leave it to the arena)
+  // mi_segment_try_purge(segment,true,tld->stats);
+
+  const size_t size = mi_segment_size(segment);
+  const size_t csize = _mi_commit_mask_committed_size(&segment->commit_mask, size);
+
+  _mi_arena_free(segment, mi_segment_size(segment), csize, segment->memid);
+}
+
+/* -----------------------------------------------------------
+   Commit/Decommit ranges
+----------------------------------------------------------- */
+
+static void mi_segment_commit_mask(mi_segment_t* segment, bool conservative, uint8_t* p, size_t size, uint8_t** start_p, size_t* full_size, mi_commit_mask_t* cm) {
+  mi_assert_internal(_mi_ptr_segment(p + 1) == segment);
+  mi_assert_internal(segment->kind != MI_SEGMENT_HUGE);
+  mi_commit_mask_create_empty(cm);
+  if (size == 0 || size > MI_SEGMENT_SIZE || segment->kind == MI_SEGMENT_HUGE) return;
+  const size_t segstart = mi_segment_info_size(segment);
+  const size_t segsize = mi_segment_size(segment);
+  if (p >= (uint8_t*)segment + segsize) return;
+
+  size_t pstart = (p - (uint8_t*)segment);
+  mi_assert_internal(pstart + size <= segsize);
+
+  size_t start;
+  size_t end;
+  if (conservative) {
+    // decommit conservative
+    start = _mi_align_up(pstart, MI_COMMIT_SIZE);
+    end   = _mi_align_down(pstart + size, MI_COMMIT_SIZE);
+    mi_assert_internal(start >= segstart);
+    mi_assert_internal(end <= segsize);
+  }
+  else {
+    // commit liberal
+    start = _mi_align_down(pstart, MI_MINIMAL_COMMIT_SIZE);
+    end   = _mi_align_up(pstart + size, MI_MINIMAL_COMMIT_SIZE);
+  }
+  if (pstart >= segstart && start < segstart) {  // note: the mask is also calculated for an initial commit of the info area
+    start = segstart;
+  }
+  if (end > segsize) {
+    end = segsize;
+  }
+
+  mi_assert_internal(start <= pstart && (pstart + size) <= end);
+  mi_assert_internal(start % MI_COMMIT_SIZE==0 && end % MI_COMMIT_SIZE == 0);
+  *start_p   = (uint8_t*)segment + start;
+  *full_size = (end > start ? end - start : 0);
+  if (*full_size == 0) return;
+
+  size_t bitidx = start / MI_COMMIT_SIZE;
+  mi_assert_internal(bitidx < MI_COMMIT_MASK_BITS);
+
+  size_t bitcount = *full_size / MI_COMMIT_SIZE; // can be 0
+  if (bitidx + bitcount > MI_COMMIT_MASK_BITS) {
+    _mi_warning_message("commit mask overflow: idx=%zu count=%zu start=%zx end=%zx p=0x%p size=%zu fullsize=%zu\n", bitidx, bitcount, start, end, p, size, *full_size);
+  }
+  mi_assert_internal((bitidx + bitcount) <= MI_COMMIT_MASK_BITS);
+  mi_commit_mask_create(bitidx, bitcount, cm);
+}
+
+static bool mi_segment_commit(mi_segment_t* segment, uint8_t* p, size_t size) {
+  mi_assert_internal(mi_commit_mask_all_set(&segment->commit_mask, &segment->purge_mask));
+
+  // commit liberal
+  uint8_t* start = NULL;
+  size_t   full_size = 0;
+  mi_commit_mask_t mask;
+  mi_segment_commit_mask(segment, false /* conservative? */, p, size, &start, &full_size, &mask);
+  if (mi_commit_mask_is_empty(&mask) || full_size == 0) return true;
+
+  if (!mi_commit_mask_all_set(&segment->commit_mask, &mask)) {
+    // committing
+    bool is_zero = false;
+    mi_commit_mask_t cmask;
+    mi_commit_mask_create_intersect(&segment->commit_mask, &mask, &cmask);
+    _mi_stat_decrease(&_mi_stats_main.committed, _mi_commit_mask_committed_size(&cmask, MI_SEGMENT_SIZE)); // adjust for overlap
+    if (!_mi_os_commit(start, full_size, &is_zero)) return false;
+    mi_commit_mask_set(&segment->commit_mask, &mask);
+  }
+
+  // increase purge expiration when using part of delayed purges -- we assume more allocations are coming soon.
+  if (mi_commit_mask_any_set(&segment->purge_mask, &mask)) {
+    segment->purge_expire = _mi_clock_now() + mi_option_get(mi_option_purge_delay);
+  }
+
+  // always clear any delayed purges in our range (as that range is committed now)
+  mi_commit_mask_clear(&segment->purge_mask, &mask);
+  return true;
+}
+
+static bool mi_segment_ensure_committed(mi_segment_t* segment, uint8_t* p, size_t size) {
+  mi_assert_internal(mi_commit_mask_all_set(&segment->commit_mask, &segment->purge_mask));
+  // note: assumes commit_mask is always full for huge segments as otherwise the commit mask bits can overflow
+  if (mi_commit_mask_is_full(&segment->commit_mask) && mi_commit_mask_is_empty(&segment->purge_mask)) return true; // fully committed
+  mi_assert_internal(segment->kind != MI_SEGMENT_HUGE);
+  return mi_segment_commit(segment, p, size);
+}
+
+static bool mi_segment_purge(mi_segment_t* segment, uint8_t* p, size_t size) {
+  mi_assert_internal(mi_commit_mask_all_set(&segment->commit_mask, &segment->purge_mask));
+  if (!segment->allow_purge) return true;
+
+  // purge conservative
+  uint8_t* start = NULL;
+  size_t   full_size = 0;
+  mi_commit_mask_t mask;
+  mi_segment_commit_mask(segment, true /* conservative? */, p, size, &start, &full_size, &mask);
+  if (mi_commit_mask_is_empty(&mask) || full_size==0) return true;
+
+  if (mi_commit_mask_any_set(&segment->commit_mask, &mask)) {
+    // purging
+    mi_assert_internal((void*)start != (void*)segment);
+    mi_assert_internal(segment->allow_decommit);
+    const bool decommitted = _mi_os_purge(start, full_size);  // reset or decommit
+    if (decommitted) {
+      mi_commit_mask_t cmask;
+      mi_commit_mask_create_intersect(&segment->commit_mask, &mask, &cmask);
+      _mi_stat_increase(&_mi_stats_main.committed, full_size - _mi_commit_mask_committed_size(&cmask, MI_SEGMENT_SIZE)); // adjust for double counting
+      mi_commit_mask_clear(&segment->commit_mask, &mask);
+    }
+  }
+
+  // always clear any scheduled purges in our range
+  mi_commit_mask_clear(&segment->purge_mask, &mask);
+  return true;
+}
+
+static void mi_segment_schedule_purge(mi_segment_t* segment, uint8_t* p, size_t size) {
+  if (!segment->allow_purge) return;
+
+  if (mi_option_get(mi_option_purge_delay) == 0) {
+    mi_segment_purge(segment, p, size);
+  }
+  else {
+    // register for future purge in the purge mask
+    uint8_t* start = NULL;
+    size_t   full_size = 0;
+    mi_commit_mask_t mask;
+    mi_segment_commit_mask(segment, true /*conservative*/, p, size, &start, &full_size, &mask);
+    if (mi_commit_mask_is_empty(&mask) || full_size==0) return;
+
+    // update delayed commit
+    mi_assert_internal(segment->purge_expire > 0 || mi_commit_mask_is_empty(&segment->purge_mask));
+    mi_commit_mask_t cmask;
+    mi_commit_mask_create_intersect(&segment->commit_mask, &mask, &cmask);  // only purge what is committed; span_free may try to decommit more
+    mi_commit_mask_set(&segment->purge_mask, &cmask);
+    mi_msecs_t now = _mi_clock_now();
+    if (segment->purge_expire == 0) {
+      // no previous purges, initialize now
+      segment->purge_expire = now + mi_option_get(mi_option_purge_delay);
+    }
+    else if (segment->purge_expire <= now) {
+      // previous purge mask already expired
+      if (segment->purge_expire + mi_option_get(mi_option_purge_extend_delay) <= now) {
+        mi_segment_try_purge(segment, true);
+      }
+      else {
+        segment->purge_expire = now + mi_option_get(mi_option_purge_extend_delay); // (mi_option_get(mi_option_purge_delay) / 8); // wait a tiny bit longer in case there is a series of free's
+      }
+    }
+    else {
+      // previous purge mask is not yet expired, increase the expiration by a bit.
+      segment->purge_expire += mi_option_get(mi_option_purge_extend_delay);
+    }
+  }
+}
+
+static void mi_segment_try_purge(mi_segment_t* segment, bool force) {
+  if (!segment->allow_purge || segment->purge_expire == 0 || mi_commit_mask_is_empty(&segment->purge_mask)) return;
+  mi_msecs_t now = _mi_clock_now();
+  if (!force && now < segment->purge_expire) return;
+
+  mi_commit_mask_t mask = segment->purge_mask;
+  segment->purge_expire = 0;
+  mi_commit_mask_create_empty(&segment->purge_mask);
+
+  size_t idx;
+  size_t count;
+  mi_commit_mask_foreach(&mask, idx, count) {
+    // if found, decommit that sequence
+    if (count > 0) {
+      uint8_t* p = (uint8_t*)segment + (idx*MI_COMMIT_SIZE);
+      size_t size = count * MI_COMMIT_SIZE;
+      mi_segment_purge(segment, p, size);
+    }
+  }
+  mi_commit_mask_foreach_end()
+  mi_assert_internal(mi_commit_mask_is_empty(&segment->purge_mask));
+}
+
+// called from `mi_heap_collect_ex`
+// this can be called per-page so it is important that try_purge has a fast exit path
+void _mi_segment_collect(mi_segment_t* segment, bool force) {
+  mi_segment_try_purge(segment, force);
+}
+
+/* -----------------------------------------------------------
+   Span free
+----------------------------------------------------------- */
+
+static bool mi_segment_is_abandoned(mi_segment_t* segment) {
+  return (mi_atomic_load_relaxed(&segment->thread_id) == 0);
+}
+
+// note: can be called on abandoned segments
+static void mi_segment_span_free(mi_segment_t* segment, size_t slice_index, size_t slice_count, bool allow_purge, mi_segments_tld_t* tld) {
+  mi_assert_internal(slice_index < segment->slice_entries);
+  mi_span_queue_t* sq = (segment->kind == MI_SEGMENT_HUGE || mi_segment_is_abandoned(segment)
+                          ? NULL : mi_span_queue_for(slice_count,tld));
+  if (slice_count==0) slice_count = 1;
+  mi_assert_internal(slice_index + slice_count - 1 < segment->slice_entries);
+
+  // set first and last slice (the intermediates can be undetermined)
+  mi_slice_t* slice = &segment->slices[slice_index];
+  slice->slice_count = (uint32_t)slice_count;
+  mi_assert_internal(slice->slice_count == slice_count); // no overflow?
+  slice->slice_offset = 0;
+  if (slice_count > 1) {
+    mi_slice_t* last = slice + slice_count - 1;
+    mi_slice_t* end  = (mi_slice_t*)mi_segment_slices_end(segment);
+    if (last > end) { last = end; }
+    last->slice_count = 0;
+    last->slice_offset = (uint32_t)(sizeof(mi_page_t)*(slice_count - 1));
+    last->block_size = 0;
+  }
+
+  // perhaps decommit
+  if (allow_purge) {
+    mi_segment_schedule_purge(segment, mi_slice_start(slice), slice_count * MI_SEGMENT_SLICE_SIZE);
+  }
+
+  // and push it on the free page queue (if it was not a huge page)
+  if (sq != NULL) mi_span_queue_push( sq, slice );
+             else slice->block_size = 0; // mark huge page as free anyways
+}
+
+/*
+// called from reclaim to add existing free spans
+static void mi_segment_span_add_free(mi_slice_t* slice, mi_segments_tld_t* tld) {
+  mi_segment_t* segment = _mi_ptr_segment(slice);
+  mi_assert_internal(slice->xblock_size==0 && slice->slice_count>0 && slice->slice_offset==0);
+  size_t slice_index = mi_slice_index(slice);
+  mi_segment_span_free(segment,slice_index,slice->slice_count,tld);
+}
+*/
+
+static void mi_segment_span_remove_from_queue(mi_slice_t* slice, mi_segments_tld_t* tld) {
+  mi_assert_internal(slice->slice_count > 0 && slice->slice_offset==0 && slice->block_size==0);
+  mi_assert_internal(_mi_ptr_segment(slice)->kind != MI_SEGMENT_HUGE);
+  mi_span_queue_t* sq = mi_span_queue_for(slice->slice_count, tld);
+  mi_span_queue_delete(sq, slice);
+}
+
+// note: can be called on abandoned segments
+static mi_slice_t* mi_segment_span_free_coalesce(mi_slice_t* slice, mi_segments_tld_t* tld) {
+  mi_assert_internal(slice != NULL && slice->slice_count > 0 && slice->slice_offset == 0);
+  mi_segment_t* const segment = _mi_ptr_segment(slice);
+
+  // for huge pages, just mark as free but don't add to the queues
+  if (segment->kind == MI_SEGMENT_HUGE) {
+    // issue #691: segment->used can be 0 if the huge page block was freed while abandoned (reclaim will get here in that case)
+    mi_assert_internal((segment->used==0 && slice->block_size==0) || segment->used == 1);  // decreased right after this call in `mi_segment_page_clear`
+    slice->block_size = 0;  // mark as free anyways
+    // we should mark the last slice `block_size=0` now to maintain invariants but we skip it to
+    // avoid a possible cache miss (and the segment is about to be freed)
+    return slice;
+  }
+
+  // otherwise coalesce the span and add to the free span queues
+  const bool is_abandoned = (segment->thread_id == 0); // mi_segment_is_abandoned(segment);
+  size_t slice_count = slice->slice_count;
+  mi_slice_t* next = slice + slice->slice_count;
+  mi_assert_internal(next <= mi_segment_slices_end(segment));
+  if (next < mi_segment_slices_end(segment) && next->block_size==0) {
+    // free next block -- remove it from free and merge
+    mi_assert_internal(next->slice_count > 0 && next->slice_offset==0);
+    slice_count += next->slice_count; // extend
+    if (!is_abandoned) { mi_segment_span_remove_from_queue(next, tld); }
+  }
+  if (slice > segment->slices) {
+    mi_slice_t* prev = mi_slice_first(slice - 1);
+    mi_assert_internal(prev >= segment->slices);
+    if (prev->block_size==0) {
+      // free previous slice -- remove it from free and merge
+      mi_assert_internal(prev->slice_count > 0 && prev->slice_offset==0);
+      slice_count += prev->slice_count;
+      slice->slice_count = 0;
+      slice->slice_offset = (uint32_t)((uint8_t*)slice - (uint8_t*)prev); // set the slice offset for `segment_force_abandon` (in case the previous free block is very large).
+      if (!is_abandoned) { mi_segment_span_remove_from_queue(prev, tld); }
+      slice = prev;
+    }
+  }
+
+  // and add the new free page
+  mi_segment_span_free(segment, mi_slice_index(slice), slice_count, true, tld);
+  return slice;
+}
+
+
+
+/* -----------------------------------------------------------
+   Page allocation
+----------------------------------------------------------- */
+
+// Note: may still return NULL if committing the memory failed
+static mi_page_t* mi_segment_span_allocate(mi_segment_t* segment, size_t slice_index, size_t slice_count) {
+  mi_assert_internal(slice_index < segment->slice_entries);
+  mi_slice_t* const slice = &segment->slices[slice_index];
+  mi_assert_internal(slice->block_size==0 || slice->block_size==1);
+
+  // commit before changing the slice data
+  if (!mi_segment_ensure_committed(segment, _mi_segment_page_start_from_slice(segment, slice, 0, NULL), slice_count * MI_SEGMENT_SLICE_SIZE)) {
+    return NULL;  // commit failed!
+  }
+
+  // convert the slices to a page
+  slice->slice_offset = 0;
+  slice->slice_count = (uint32_t)slice_count;
+  mi_assert_internal(slice->slice_count == slice_count);
+  const size_t bsize = slice_count * MI_SEGMENT_SLICE_SIZE;
+  slice->block_size = bsize;
+  mi_page_t*  page = mi_slice_to_page(slice);
+  mi_assert_internal(mi_page_block_size(page) == bsize);
+
+  // set slice back pointers for the first MI_MAX_SLICE_OFFSET_COUNT entries
+  size_t extra = slice_count-1;
+  if (extra > MI_MAX_SLICE_OFFSET_COUNT) extra = MI_MAX_SLICE_OFFSET_COUNT;
+  if (slice_index + extra >= segment->slice_entries) extra = segment->slice_entries - slice_index - 1;  // huge objects may have more slices than available entries in the segment->slices
+
+  mi_slice_t* slice_next = slice + 1;
+  for (size_t i = 1; i <= extra; i++, slice_next++) {
+    slice_next->slice_offset = (uint32_t)(sizeof(mi_slice_t)*i);
+    slice_next->slice_count = 0;
+    slice_next->block_size = 1;
+  }
+
+  // and also for the last one (if not set already) (the last one is needed for coalescing and for large alignments)
+  // note: the cast is needed for ubsan since the index can be larger than MI_SLICES_PER_SEGMENT for huge allocations (see #543)
+  mi_slice_t* last = slice + slice_count - 1;
+  mi_slice_t* end = (mi_slice_t*)mi_segment_slices_end(segment);
+  if (last > end) last = end;
+  if (last > slice) {
+    last->slice_offset = (uint32_t)(sizeof(mi_slice_t) * (last - slice));
+    last->slice_count = 0;
+    last->block_size = 1;
+  }
+
+  // and initialize the page
+  page->is_committed = true;
+  page->is_huge = (segment->kind == MI_SEGMENT_HUGE);
+  segment->used++;
+  return page;
+}
+
+static void mi_segment_slice_split(mi_segment_t* segment, mi_slice_t* slice, size_t slice_count, mi_segments_tld_t* tld) {
+  mi_assert_internal(_mi_ptr_segment(slice) == segment);
+  mi_assert_internal(slice->slice_count >= slice_count);
+  mi_assert_internal(slice->block_size > 0); // no more in free queue
+  if (slice->slice_count <= slice_count) return;
+  mi_assert_internal(segment->kind != MI_SEGMENT_HUGE);
+  size_t next_index = mi_slice_index(slice) + slice_count;
+  size_t next_count = slice->slice_count - slice_count;
+  mi_segment_span_free(segment, next_index, next_count, false /* don't purge left-over part */, tld);
+  slice->slice_count = (uint32_t)slice_count;
+}
+
+static mi_page_t* mi_segments_page_find_and_allocate(size_t slice_count, mi_arena_id_t req_arena_id, mi_segments_tld_t* tld) {
+  mi_assert_internal(slice_count*MI_SEGMENT_SLICE_SIZE <= MI_LARGE_OBJ_SIZE_MAX);
+  // search from best fit up
+  mi_span_queue_t* sq = mi_span_queue_for(slice_count, tld);
+  if (slice_count == 0) slice_count = 1;
+  while (sq <= &tld->spans[MI_SEGMENT_BIN_MAX]) {
+    for (mi_slice_t* slice = sq->first; slice != NULL; slice = slice->next) {
+      if (slice->slice_count >= slice_count) {
+        // found one
+        mi_segment_t* segment = _mi_ptr_segment(slice);
+        if (_mi_arena_memid_is_suitable(segment->memid, req_arena_id)) {
+          // found a suitable page span
+          mi_span_queue_delete(sq, slice);
+
+          if (slice->slice_count > slice_count) {
+            mi_segment_slice_split(segment, slice, slice_count, tld);
+          }
+          mi_assert_internal(slice != NULL && slice->slice_count == slice_count && slice->block_size > 0);
+          mi_page_t* page = mi_segment_span_allocate(segment, mi_slice_index(slice), slice->slice_count);
+          if (page == NULL) {
+            // commit failed; return NULL but first restore the slice
+            mi_segment_span_free_coalesce(slice, tld);
+            return NULL;
+          }
+          return page;
+        }
+      }
+    }
+    sq++;
+  }
+  // could not find a page..
+  return NULL;
+}
+
+
+/* -----------------------------------------------------------
+   Segment allocation
+----------------------------------------------------------- */
+
+static mi_segment_t* mi_segment_os_alloc( size_t required, size_t page_alignment, bool eager_delayed, mi_arena_id_t req_arena_id,
+                                          size_t* psegment_slices, size_t* pinfo_slices,
+                                          bool commit, mi_segments_tld_t* tld)
+
+{
+  mi_memid_t memid;
+  bool   allow_large = (!eager_delayed && (MI_SECURE == 0)); // only allow large OS pages once we are no longer lazy
+  size_t align_offset = 0;
+  size_t alignment = MI_SEGMENT_ALIGN;
+
+  if (page_alignment > 0) {
+    // mi_assert_internal(huge_page != NULL);
+    mi_assert_internal(page_alignment >= MI_SEGMENT_ALIGN);
+    alignment = page_alignment;
+    const size_t info_size = (*pinfo_slices) * MI_SEGMENT_SLICE_SIZE;
+    align_offset = _mi_align_up( info_size, MI_SEGMENT_ALIGN );
+    const size_t extra = align_offset - info_size;
+    // recalculate due to potential guard pages
+    *psegment_slices = mi_segment_calculate_slices(required + extra, pinfo_slices);
+    mi_assert_internal(*psegment_slices > 0 && *psegment_slices <= UINT32_MAX);
+  }
+
+  const size_t segment_size = (*psegment_slices) * MI_SEGMENT_SLICE_SIZE;
+  mi_segment_t* segment = (mi_segment_t*)_mi_arena_alloc_aligned(segment_size, alignment, align_offset, commit, allow_large, req_arena_id, &memid);
+  if (segment == NULL) {
+    return NULL;  // failed to allocate
+  }
+
+  // ensure metadata part of the segment is committed
+  mi_commit_mask_t commit_mask;
+  if (memid.initially_committed) {
+    mi_commit_mask_create_full(&commit_mask);
+  }
+  else {
+    // at least commit the info slices
+    const size_t commit_needed = _mi_divide_up((*pinfo_slices)*MI_SEGMENT_SLICE_SIZE, MI_COMMIT_SIZE);
+    mi_assert_internal(commit_needed>0);
+    mi_commit_mask_create(0, commit_needed, &commit_mask);
+    mi_assert_internal(commit_needed*MI_COMMIT_SIZE >= (*pinfo_slices)*MI_SEGMENT_SLICE_SIZE);
+    if (!_mi_os_commit(segment, commit_needed*MI_COMMIT_SIZE, NULL)) {
+      _mi_arena_free(segment,segment_size,0,memid);
+      return NULL;
+    }
+  }
+  mi_assert_internal(segment != NULL && (uintptr_t)segment % MI_SEGMENT_SIZE == 0);
+
+  segment->memid = memid;
+  segment->allow_decommit = !memid.is_pinned;
+  segment->allow_purge = segment->allow_decommit && (mi_option_get(mi_option_purge_delay) >= 0);
+  segment->segment_size = segment_size;
+  segment->subproc = tld->subproc;
+  segment->commit_mask = commit_mask;
+  segment->purge_expire = 0;
+  mi_commit_mask_create_empty(&segment->purge_mask);
+
+  mi_segments_track_size((long)(segment_size), tld);
+  _mi_segment_map_allocated_at(segment);
+  return segment;
+}
+
+
+// Allocate a segment from the OS aligned to `MI_SEGMENT_SIZE`.
+static mi_segment_t* mi_segment_alloc(size_t required, size_t page_alignment, mi_arena_id_t req_arena_id, mi_segments_tld_t* tld, mi_page_t** huge_page)
+{
+  mi_assert_internal((required==0 && huge_page==NULL) || (required>0 && huge_page != NULL));
+
+  // calculate needed sizes first
+  size_t info_slices;
+  size_t segment_slices = mi_segment_calculate_slices(required, &info_slices);
+  mi_assert_internal(segment_slices > 0 && segment_slices <= UINT32_MAX);
+
+  // Commit eagerly only if not the first N lazy segments (to reduce impact of many threads that allocate just a little)
+  const bool eager_delay = (// !_mi_os_has_overcommit() &&             // never delay on overcommit systems
+                            _mi_current_thread_count() > 1 &&       // do not delay for the first N threads
+                            tld->peak_count < (size_t)mi_option_get(mi_option_eager_commit_delay));
+  const bool eager = !eager_delay && mi_option_is_enabled(mi_option_eager_commit);
+  bool commit = eager || (required > 0);
+
+  // Allocate the segment from the OS
+  mi_segment_t* segment = mi_segment_os_alloc(required, page_alignment, eager_delay, req_arena_id,
+                                              &segment_slices, &info_slices, commit, tld);
+  if (segment == NULL) return NULL;
+
+  // zero the segment info? -- not always needed as it may be zero initialized from the OS
+  if (!segment->memid.initially_zero) {
+    ptrdiff_t ofs    = offsetof(mi_segment_t, next);
+    size_t    prefix = offsetof(mi_segment_t, slices) - ofs;
+    size_t    zsize  = prefix + (sizeof(mi_slice_t) * (segment_slices + 1)); // one more
+    _mi_memzero((uint8_t*)segment + ofs, zsize);
+  }
+
+  // initialize the rest of the segment info
+  const size_t slice_entries = (segment_slices > MI_SLICES_PER_SEGMENT ? MI_SLICES_PER_SEGMENT : segment_slices);
+  segment->segment_slices = segment_slices;
+  segment->segment_info_slices = info_slices;
+  segment->thread_id = _mi_thread_id();
+  segment->cookie = _mi_ptr_cookie(segment);
+  segment->slice_entries = slice_entries;
+  segment->kind = (required == 0 ? MI_SEGMENT_NORMAL : MI_SEGMENT_HUGE);
+
+  // _mi_memzero(segment->slices, sizeof(mi_slice_t)*(info_slices+1));
+  _mi_stat_increase(&tld->stats->page_committed, mi_segment_info_size(segment));
+
+  // set up guard pages
+  size_t guard_slices = 0;
+  if (MI_SECURE>0) {
+    // in secure mode, we set up a protected page in between the segment info
+    // and the page data, and at the end of the segment.
+    size_t os_pagesize = _mi_os_page_size();
+    _mi_os_protect((uint8_t*)segment + mi_segment_info_size(segment) - os_pagesize, os_pagesize);
+    uint8_t* end = (uint8_t*)segment + mi_segment_size(segment) - os_pagesize;
+    mi_segment_ensure_committed(segment, end, os_pagesize);
+    _mi_os_protect(end, os_pagesize);
+    if (slice_entries == segment_slices) segment->slice_entries--; // don't use the last slice :-(
+    guard_slices = 1;
+  }
+
+  // reserve first slices for segment info
+  mi_page_t* page0 = mi_segment_span_allocate(segment, 0, info_slices);
+  mi_assert_internal(page0!=NULL); if (page0==NULL) return NULL; // cannot fail as we always commit in advance
+  mi_assert_internal(segment->used == 1);
+  segment->used = 0; // don't count our internal slices towards usage
+
+  // initialize initial free pages
+  if (segment->kind == MI_SEGMENT_NORMAL) { // not a huge page
+    mi_assert_internal(huge_page==NULL);
+    mi_segment_span_free(segment, info_slices, segment->slice_entries - info_slices, false /* don't purge */, tld);
+  }
+  else {
+    mi_assert_internal(huge_page!=NULL);
+    mi_assert_internal(mi_commit_mask_is_empty(&segment->purge_mask));
+    mi_assert_internal(mi_commit_mask_is_full(&segment->commit_mask));
+    *huge_page = mi_segment_span_allocate(segment, info_slices, segment_slices - info_slices - guard_slices);
+    mi_assert_internal(*huge_page != NULL); // cannot fail as we commit in advance
+  }
+
+  mi_assert_expensive(mi_segment_is_valid(segment,tld));
+  return segment;
+}
+
+
+static void mi_segment_free(mi_segment_t* segment, bool force, mi_segments_tld_t* tld) {
+  MI_UNUSED(force);
+  mi_assert_internal(segment != NULL);
+  mi_assert_internal(segment->next == NULL);
+  mi_assert_internal(segment->used == 0);
+
+  // in `mi_segment_force_abandon` we set this to true to ensure the segment's memory stays valid
+  if (segment->dont_free) return;
+
+  // Remove the free pages
+  mi_slice_t* slice = &segment->slices[0];
+  const mi_slice_t* end = mi_segment_slices_end(segment);
+  #if MI_DEBUG>1
+  size_t page_count = 0;
+  #endif
+  while (slice < end) {
+    mi_assert_internal(slice->slice_count > 0);
+    mi_assert_internal(slice->slice_offset == 0);
+    mi_assert_internal(mi_slice_index(slice)==0 || slice->block_size == 0); // no more used pages ..
+    if (slice->block_size == 0 && segment->kind != MI_SEGMENT_HUGE) {
+      mi_segment_span_remove_from_queue(slice, tld);
+    }
+    #if MI_DEBUG>1
+    page_count++;
+    #endif
+    slice = slice + slice->slice_count;
+  }
+  mi_assert_internal(page_count == 2); // first page is allocated by the segment itself
+
+  // stats
+  // _mi_stat_decrease(&tld->stats->page_committed, mi_segment_info_size(segment));
+
+  // return it to the OS
+  mi_segment_os_free(segment, tld);
+}
+
+
+/* -----------------------------------------------------------
+   Page Free
+----------------------------------------------------------- */
+
+static void mi_segment_abandon(mi_segment_t* segment, mi_segments_tld_t* tld);
+
+// note: can be called on abandoned pages
+static mi_slice_t* mi_segment_page_clear(mi_page_t* page, mi_segments_tld_t* tld) {
+  mi_assert_internal(page->block_size > 0);
+  mi_assert_internal(mi_page_all_free(page));
+  mi_segment_t* segment = _mi_ptr_segment(page);
+  mi_assert_internal(segment->used > 0);
+
+  size_t inuse = page->capacity * mi_page_block_size(page);
+  _mi_stat_decrease(&tld->stats->page_committed, inuse);
+  _mi_stat_decrease(&tld->stats->pages, 1);
+  _mi_stat_decrease(&tld->stats->page_bins[_mi_page_stats_bin(page)], 1);
+
+  // reset the page memory to reduce memory pressure?
+  if (segment->allow_decommit && mi_option_is_enabled(mi_option_deprecated_page_reset)) {
+    size_t psize;
+    uint8_t* start = _mi_segment_page_start(segment, page, &psize);
+    _mi_os_reset(start, psize);
+  }
+
+  // zero the page data, but not the segment fields and heap tag
+  page->is_zero_init = false;
+  uint8_t heap_tag = page->heap_tag;
+  ptrdiff_t ofs = offsetof(mi_page_t, capacity);
+  _mi_memzero((uint8_t*)page + ofs, sizeof(*page) - ofs);
+  page->block_size = 1;
+  page->heap_tag = heap_tag;
+
+  // and free it
+  mi_slice_t* slice = mi_segment_span_free_coalesce(mi_page_to_slice(page), tld);
+  segment->used--;
+  // cannot assert segment valid as it is called during reclaim
+  // mi_assert_expensive(mi_segment_is_valid(segment, tld));
+  return slice;
+}
+
+void _mi_segment_page_free(mi_page_t* page, bool force, mi_segments_tld_t* tld)
+{
+  mi_assert(page != NULL);
+  mi_segment_t* segment = _mi_page_segment(page);
+  mi_assert_expensive(mi_segment_is_valid(segment,tld));
+
+  // mark it as free now
+  mi_segment_page_clear(page, tld);
+  mi_assert_expensive(mi_segment_is_valid(segment, tld));
+
+  if (segment->used == 0) {
+    // no more used pages; remove from the free list and free the segment
+    mi_segment_free(segment, force, tld);
+  }
+  else if (segment->used == segment->abandoned) {
+    // only abandoned pages; remove from free list and abandon
+    mi_segment_abandon(segment,tld);
+  }
+  else {
+    // perform delayed purges
+    mi_segment_try_purge(segment, false /* force? */);
+  }
+}
+
+
+/* -----------------------------------------------------------
+Abandonment
+
+When threads terminate, they can leave segments with
+live blocks (reachable through other threads). Such segments
+are "abandoned" and will be reclaimed by other threads to
+reuse their pages and/or free them eventually. The
+`thread_id` of such segments is 0.
+
+When a block is freed in an abandoned segment, the segment
+is reclaimed into that thread.
+
+Moreover, if threads are looking for a fresh segment, they
+will first consider abandoned segments -- these can be found
+by scanning the arena memory
+(segments outside arena memory are only reclaimed by a free).
+----------------------------------------------------------- */
+
+/* -----------------------------------------------------------
+   Abandon segment/page
+----------------------------------------------------------- */
+
+static void mi_segment_abandon(mi_segment_t* segment, mi_segments_tld_t* tld) {
+  mi_assert_internal(segment->used == segment->abandoned);
+  mi_assert_internal(segment->used > 0);
+  mi_assert_internal(segment->abandoned_visits == 0);
+  mi_assert_expensive(mi_segment_is_valid(segment,tld));
+
+  // remove the free pages from the free page queues
+  mi_slice_t* slice = &segment->slices[0];
+  const mi_slice_t* end = mi_segment_slices_end(segment);
+  while (slice < end) {
+    mi_assert_internal(slice->slice_count > 0);
+    mi_assert_internal(slice->slice_offset == 0);
+    if (slice->block_size == 0) { // a free page
+      mi_segment_span_remove_from_queue(slice,tld);
+      slice->block_size = 0; // but keep it free
+    }
+    slice = slice + slice->slice_count;
+  }
+
+  // perform delayed decommits (forcing is much slower on mstress)
+  // Only abandoned segments in arena memory can be reclaimed without a free
+  // so if a segment is not from an arena we force purge here to be conservative.
+  const bool force_purge = (segment->memid.memkind != MI_MEM_ARENA) || mi_option_is_enabled(mi_option_abandoned_page_purge);
+  mi_segment_try_purge(segment, force_purge);
+
+  // all pages in the segment are abandoned; add it to the abandoned list
+  _mi_stat_increase(&tld->stats->segments_abandoned, 1);
+  mi_segments_track_size(-((long)mi_segment_size(segment)), tld);
+  segment->thread_id = 0;
+  segment->abandoned_visits = 1;   // from 0 to 1 to signify it is abandoned
+  if (segment->was_reclaimed) {
+    tld->reclaim_count--;
+    segment->was_reclaimed = false;
+  }
+  _mi_arena_segment_mark_abandoned(segment);
+}
+
+void _mi_segment_page_abandon(mi_page_t* page, mi_segments_tld_t* tld) {
+  mi_assert(page != NULL);
+  mi_assert_internal(mi_page_thread_free_flag(page)==MI_NEVER_DELAYED_FREE);
+  mi_assert_internal(mi_page_heap(page) == NULL);
+  mi_segment_t* segment = _mi_page_segment(page);
+
+  mi_assert_expensive(mi_segment_is_valid(segment,tld));
+  segment->abandoned++;
+
+  _mi_stat_increase(&tld->stats->pages_abandoned, 1);
+  mi_assert_internal(segment->abandoned <= segment->used);
+  if (segment->used == segment->abandoned) {
+    // all pages are abandoned, abandon the entire segment
+    mi_segment_abandon(segment, tld);
+  }
+}
+
+/* -----------------------------------------------------------
+  Reclaim abandoned pages
+----------------------------------------------------------- */
+
+static mi_slice_t* mi_slices_start_iterate(mi_segment_t* segment, const mi_slice_t** end) {
+  mi_slice_t* slice = &segment->slices[0];
+  *end = mi_segment_slices_end(segment);
+  mi_assert_internal(slice->slice_count>0 && slice->block_size>0); // segment allocated page
+  slice = slice + slice->slice_count; // skip the first segment allocated page
+  return slice;
+}
+
+// Possibly free pages and check if free space is available
+static bool mi_segment_check_free(mi_segment_t* segment, size_t slices_needed, size_t block_size, mi_segments_tld_t* tld)
+{
+  mi_assert_internal(mi_segment_is_abandoned(segment));
+  bool has_page = false;
+
+  // for all slices
+  const mi_slice_t* end;
+  mi_slice_t* slice = mi_slices_start_iterate(segment, &end);
+  while (slice < end) {
+    mi_assert_internal(slice->slice_count > 0);
+    mi_assert_internal(slice->slice_offset == 0);
+    if (mi_slice_is_used(slice)) { // used page
+      // ensure used count is up to date and collect potential concurrent frees
+      mi_page_t* const page = mi_slice_to_page(slice);
+      _mi_page_free_collect(page, false);
+      if (mi_page_all_free(page)) {
+        // if this page is all free now, free it without adding to any queues (yet)
+        mi_assert_internal(page->next == NULL && page->prev==NULL);
+        _mi_stat_decrease(&tld->stats->pages_abandoned, 1);
+        segment->abandoned--;
+        slice = mi_segment_page_clear(page, tld); // re-assign slice due to coalesce!
+        mi_assert_internal(!mi_slice_is_used(slice));
+        if (slice->slice_count >= slices_needed) {
+          has_page = true;
+        }
+      }
+      else if (mi_page_block_size(page) == block_size && mi_page_has_any_available(page)) {
+        // a page has available free blocks of the right size
+        has_page = true;
+      }
+    }
+    else {
+      // empty span
+      if (slice->slice_count >= slices_needed) {
+        has_page = true;
+      }
+    }
+    slice = slice + slice->slice_count;
+  }
+  return has_page;
+}
+
+// Reclaim an abandoned segment; returns NULL if the segment was freed
+// set `right_page_reclaimed` to `true` if it reclaimed a page of the right `block_size` that was not full.
+static mi_segment_t* mi_segment_reclaim(mi_segment_t* segment, mi_heap_t* heap, size_t requested_block_size, bool* right_page_reclaimed, mi_segments_tld_t* tld) {
+  if (right_page_reclaimed != NULL) { *right_page_reclaimed = false; }
+  // can be 0 still with abandoned_next, or already a thread id for segments outside an arena that are reclaimed on a free.
+  mi_assert_internal(mi_atomic_load_relaxed(&segment->thread_id) == 0 || mi_atomic_load_relaxed(&segment->thread_id) == _mi_thread_id());
+  mi_assert_internal(segment->subproc == heap->tld->segments.subproc); // only reclaim within the same subprocess
+  mi_atomic_store_release(&segment->thread_id, _mi_thread_id());
+  segment->abandoned_visits = 0;
+  segment->was_reclaimed = true;
+  tld->reclaim_count++;
+  mi_segments_track_size((long)mi_segment_size(segment), tld);
+  mi_assert_internal(segment->next == NULL);
+  _mi_stat_decrease(&tld->stats->segments_abandoned, 1);
+
+  // for all slices
+  const mi_slice_t* end;
+  mi_slice_t* slice = mi_slices_start_iterate(segment, &end);
+  while (slice < end) {
+    mi_assert_internal(slice->slice_count > 0);
+    mi_assert_internal(slice->slice_offset == 0);
+    if (mi_slice_is_used(slice)) {
+      // in use: reclaim the page in our heap
+      mi_page_t* page = mi_slice_to_page(slice);
+      mi_assert_internal(page->is_committed);
+      mi_assert_internal(mi_page_thread_free_flag(page)==MI_NEVER_DELAYED_FREE);
+      mi_assert_internal(mi_page_heap(page) == NULL);
+      mi_assert_internal(page->next == NULL && page->prev==NULL);
+      _mi_stat_decrease(&tld->stats->pages_abandoned, 1);
+      segment->abandoned--;
+      // get the target heap for this thread which has a matching heap tag (so we reclaim into a matching heap)
+      mi_heap_t* target_heap = _mi_heap_by_tag(heap, page->heap_tag);  // allow custom heaps to separate objects
+      if (target_heap == NULL) {
+        target_heap = heap;
+        _mi_error_message(EFAULT, "page with tag %u cannot be reclaimed by a heap with the same tag (using heap tag %u instead)\n", page->heap_tag, heap->tag );
+      }
+      // associate the heap with this page, and allow heap thread delayed free again.
+      mi_page_set_heap(page, target_heap);
+      _mi_page_use_delayed_free(page, MI_USE_DELAYED_FREE, true); // override never (after heap is set)
+      _mi_page_free_collect(page, false); // ensure used count is up to date
+      if (mi_page_all_free(page)) {
+        // if everything free by now, free the page
+        slice = mi_segment_page_clear(page, tld);   // set slice again due to coalescing
+      }
+      else {
+        // otherwise reclaim it into the heap
+        _mi_page_reclaim(target_heap, page);
+        if (requested_block_size == mi_page_block_size(page) && mi_page_has_any_available(page) && heap == target_heap) {
+          if (right_page_reclaimed != NULL) { *right_page_reclaimed = true; }
+        }
+      }
+    }
+    else {
+      // the span is free, add it to our page queues
+      slice = mi_segment_span_free_coalesce(slice, tld); // set slice again due to coalescing
+    }
+    mi_assert_internal(slice->slice_count>0 && slice->slice_offset==0);
+    slice = slice + slice->slice_count;
+  }
+
+  mi_assert(segment->abandoned == 0);
+  mi_assert_expensive(mi_segment_is_valid(segment, tld));
+  if (segment->used == 0) {  // due to page_clear
+    mi_assert_internal(right_page_reclaimed == NULL || !(*right_page_reclaimed));
+    mi_segment_free(segment, false, tld);
+    return NULL;
+  }
+  else {
+    return segment;
+  }
+}
+
+
+// attempt to reclaim a particular segment (called from multi threaded free `alloc.c:mi_free_block_mt`)
+bool _mi_segment_attempt_reclaim(mi_heap_t* heap, mi_segment_t* segment) {
+  if (mi_atomic_load_relaxed(&segment->thread_id) != 0) return false;  // it is not abandoned
+  if (segment->subproc != heap->tld->segments.subproc)  return false;  // only reclaim within the same subprocess
+  if (!_mi_heap_memid_is_suitable(heap,segment->memid)) return false;  // don't reclaim between exclusive and non-exclusive arenas
+  const long target = _mi_option_get_fast(mi_option_target_segments_per_thread);
+  if (target > 0 && (size_t)target <= heap->tld->segments.count) return false; // don't reclaim if going above the target count
+
+  // don't reclaim more from a `free` call than half the current segments
+  // this is to prevent a purely freeing thread from owning too many segments
+  // (but not for out-of-arena segments as that is the main way to be reclaimed for those)
+  if (segment->memid.memkind == MI_MEM_ARENA && heap->tld->segments.reclaim_count * 2 > heap->tld->segments.count) {
+    return false;
+  }
+  if (_mi_arena_segment_clear_abandoned(segment)) {  // atomically unabandon
+    mi_segment_t* res = mi_segment_reclaim(segment, heap, 0, NULL, &heap->tld->segments);
+    mi_assert_internal(res == segment);
+    return (res != NULL);
+  }
+  return false;
+}
+
+void _mi_abandoned_reclaim_all(mi_heap_t* heap, mi_segments_tld_t* tld) {
+  mi_segment_t* segment;
+  mi_arena_field_cursor_t current;
+  _mi_arena_field_cursor_init(heap, tld->subproc, true /* visit all, blocking */, &current);
+  while ((segment = _mi_arena_segment_clear_abandoned_next(&current)) != NULL) {
+    mi_segment_reclaim(segment, heap, 0, NULL, tld);
+  }
+  _mi_arena_field_cursor_done(&current);
+}
+
+
+static bool segment_count_is_within_target(mi_segments_tld_t* tld, size_t* ptarget) {
+  const size_t target = (size_t)mi_option_get_clamp(mi_option_target_segments_per_thread, 0, 1024);
+  if (ptarget != NULL) { *ptarget = target; }
+  return (target == 0 || tld->count < target);
+}
+
+static long mi_segment_get_reclaim_tries(mi_segments_tld_t* tld) {
+  // limit the tries to 10% (default) of the abandoned segments with at least 8 and at most 1024 tries.
+  const size_t perc = (size_t)mi_option_get_clamp(mi_option_max_segment_reclaim, 0, 100);
+  if (perc <= 0) return 0;
+  const size_t total_count = mi_atomic_load_relaxed(&tld->subproc->abandoned_count);
+  if (total_count == 0) return 0;
+  const size_t relative_count = (total_count > 10000 ? (total_count / 100) * perc : (total_count * perc) / 100); // avoid overflow
+  long max_tries = (long)(relative_count <= 1 ? 1 : (relative_count > 1024 ? 1024 : relative_count));
+  if (max_tries < 8 && total_count > 8) { max_tries = 8;  }
+  return max_tries;
+}
+
+static mi_segment_t* mi_segment_try_reclaim(mi_heap_t* heap, size_t needed_slices, size_t block_size, bool* reclaimed, mi_segments_tld_t* tld)
+{
+  *reclaimed = false;
+  long max_tries = mi_segment_get_reclaim_tries(tld);
+  if (max_tries <= 0) return NULL;
+
+  mi_segment_t* result = NULL;
+  mi_segment_t* segment = NULL;
+  mi_arena_field_cursor_t current;
+  _mi_arena_field_cursor_init(heap, tld->subproc, false /* non-blocking */, &current);
+  while (segment_count_is_within_target(tld,NULL) && (max_tries-- > 0) && ((segment = _mi_arena_segment_clear_abandoned_next(&current)) != NULL))
+  {
+    mi_assert(segment->subproc == heap->tld->segments.subproc); // cursor only visits segments in our sub-process
+    segment->abandoned_visits++;
+    // todo: should we respect numa affinity for abandoned reclaim? perhaps only for the first visit?
+    // todo: an arena exclusive heap will potentially visit many abandoned unsuitable segments and use many tries
+    // Perhaps we can skip non-suitable ones in a better way?
+    bool is_suitable = _mi_heap_memid_is_suitable(heap, segment->memid);
+    bool has_page = mi_segment_check_free(segment,needed_slices,block_size,tld); // try to free up pages (due to concurrent frees)
+    if (segment->used == 0) {
+      // free the segment (by forced reclaim) to make it available to other threads.
+      // note1: we prefer to free a segment as that might lead to reclaiming another
+      // segment that is still partially used.
+      // note2: we could in principle optimize this by skipping reclaim and directly
+      // freeing, but that would temporarily violate some invariants.
+      mi_segment_reclaim(segment, heap, 0, NULL, tld);
+    }
+    else if (has_page && is_suitable) {
+      // found a large enough free span, or a page of the right block_size with free space
+      // we return the result of reclaim (which is usually `segment`) as it might free
+      // the segment due to concurrent frees (in which case `NULL` is returned).
+      result = mi_segment_reclaim(segment, heap, block_size, reclaimed, tld);
+      break;
+    }
+    else if (segment->abandoned_visits > 3 && is_suitable) {
+      // always reclaim after more than 3 visits to limit the abandoned segment count.
+      mi_segment_reclaim(segment, heap, 0, NULL, tld);
+    }
+    else {
+      // otherwise, push on the visited list so it is not looked at again too quickly
+      max_tries++; // don't count this as a try since it was not suitable
+      mi_segment_try_purge(segment, false /* true force? */); // force purge if needed as we may not visit soon again
+      _mi_arena_segment_mark_abandoned(segment);
+    }
+  }
+  _mi_arena_field_cursor_done(&current);
+  return result;
+}
+
+// collect abandoned segments
+void _mi_abandoned_collect(mi_heap_t* heap, bool force, mi_segments_tld_t* tld)
+{
+  mi_segment_t* segment;
+  mi_arena_field_cursor_t current; _mi_arena_field_cursor_init(heap, tld->subproc, force /* blocking? */, &current);
+  long max_tries = (force ? (long)mi_atomic_load_relaxed(&tld->subproc->abandoned_count) : 1024);  // limit latency
+  while ((max_tries-- > 0) && ((segment = _mi_arena_segment_clear_abandoned_next(&current)) != NULL)) {
+    mi_segment_check_free(segment,0,0,tld); // try to free up pages (due to concurrent frees)
+    if (segment->used == 0) {
+      // free the segment (by forced reclaim) to make it available to other threads.
+      // note: we could in principle optimize this by skipping reclaim and directly
+      // freeing, but that would temporarily violate some invariants.
+      mi_segment_reclaim(segment, heap, 0, NULL, tld);
+    }
+    else {
+      // otherwise, purge if needed and push on the visited list
+      // note: forced purge can be expensive if many threads are destroyed/created as in mstress.
+      mi_segment_try_purge(segment, force);
+      _mi_arena_segment_mark_abandoned(segment);
+    }
+  }
+  _mi_arena_field_cursor_done(&current);
+}
+
+/* -----------------------------------------------------------
+   Force abandon a segment that is in use by our thread
+----------------------------------------------------------- */
+
+// force abandon a segment
+static void mi_segment_force_abandon(mi_segment_t* segment, mi_segments_tld_t* tld)
+{
+  mi_assert_internal(!mi_segment_is_abandoned(segment));
+  mi_assert_internal(!segment->dont_free);
+
+  // ensure the segment does not get free'd underneath us (so we can check if a page has been freed in `mi_page_force_abandon`)
+  segment->dont_free = true;
+
+  // for all slices
+  const mi_slice_t* end;
+  mi_slice_t* slice = mi_slices_start_iterate(segment, &end);
+  while (slice < end) {
+    mi_assert_internal(slice->slice_count > 0);
+    mi_assert_internal(slice->slice_offset == 0);
+    if (mi_slice_is_used(slice)) {
+      // ensure used count is up to date and collect potential concurrent frees
+      mi_page_t* const page = mi_slice_to_page(slice);
+      _mi_page_free_collect(page, false);
+      {
+        // abandon the page if it is still in-use (this will free it if possible as well)
+        mi_assert_internal(segment->used > 0);
+        if (segment->used == segment->abandoned+1) {
+          // the last page.. abandon and return as the segment will be abandoned after this
+          // and we should no longer access it.
+          segment->dont_free = false;
+          _mi_page_force_abandon(page);
+          return;
+        }
+        else {
+          // abandon and continue
+          _mi_page_force_abandon(page);
+          // it might be freed, reset the slice (note: relies on coalesce setting the slice_offset)
+          slice = mi_slice_first(slice);
+        }
+      }
+    }
+    slice = slice + slice->slice_count;
+  }
+  segment->dont_free = false;
+  mi_assert(segment->used == segment->abandoned);
+  mi_assert(segment->used == 0);
+  if (segment->used == 0) {  // paranoia
+    // all free now
+    mi_segment_free(segment, false, tld);
+  }
+  else {
+    // perform delayed purges
+    mi_segment_try_purge(segment, false /* force? */);
+  }
+}
+
+
+// try abandon segments.
+// this should be called from `reclaim_or_alloc` so we know all segments are (about) fully in use.
+static void mi_segments_try_abandon_to_target(mi_heap_t* heap, size_t target, mi_segments_tld_t* tld) {
+  if (target <= 1) return;
+  const size_t min_target = (target > 4 ? (target*3)/4 : target);  // 75%
+  // todo: we should maintain a list of segments per thread; for now, only consider segments from the heap full pages
+  for (int i = 0; i < 64 && tld->count >= min_target; i++) {
+    mi_page_t* page = heap->pages[MI_BIN_FULL].first;
+    while (page != NULL && mi_page_block_size(page) > MI_LARGE_OBJ_SIZE_MAX) {
+      page = page->next;
+    }
+    if (page==NULL) {
+      break;
+    }
+    mi_segment_t* segment = _mi_page_segment(page);
+    mi_segment_force_abandon(segment, tld);
+    mi_assert_internal(page != heap->pages[MI_BIN_FULL].first); // as it is just abandoned
+  }
+}
+
+// try abandon segments.
+// this should be called from `reclaim_or_alloc` so we know all segments are (about) fully in use.
+static void mi_segments_try_abandon(mi_heap_t* heap, mi_segments_tld_t* tld) {
+  // we call this when we are about to add a fresh segment so we should be under our target segment count.
+  size_t target = 0;
+  if (segment_count_is_within_target(tld, &target)) return;
+  mi_segments_try_abandon_to_target(heap, target, tld);
+}
+
+void mi_collect_reduce(size_t target_size) mi_attr_noexcept {
+  mi_collect(true);
+  mi_heap_t* heap = mi_heap_get_default();
+  mi_segments_tld_t* tld = &heap->tld->segments;
+  size_t target = target_size / MI_SEGMENT_SIZE;
+  if (target == 0) {
+    target = (size_t)mi_option_get_clamp(mi_option_target_segments_per_thread, 1, 1024);
+  }
+  mi_segments_try_abandon_to_target(heap, target, tld);
+}
+
+/* -----------------------------------------------------------
+   Reclaim or allocate
+----------------------------------------------------------- */
+
+static mi_segment_t* mi_segment_reclaim_or_alloc(mi_heap_t* heap, size_t needed_slices, size_t block_size, mi_segments_tld_t* tld)
+{
+  mi_assert_internal(block_size <= MI_LARGE_OBJ_SIZE_MAX);
+
+  // try to abandon some segments to increase reuse between threads
+  mi_segments_try_abandon(heap,tld);
+
+  // 1. try to reclaim an abandoned segment
+  bool reclaimed;
+  mi_segment_t* segment = mi_segment_try_reclaim(heap, needed_slices, block_size, &reclaimed, tld);
+  if (reclaimed) {
+    // reclaimed the right page right into the heap
+    mi_assert_internal(segment != NULL);
+    return NULL; // pretend out-of-memory as the page will be in the page queue of the heap with available blocks
+  }
+  else if (segment != NULL) {
+    // reclaimed a segment with a large enough empty span in it
+    return segment;
+  }
+  // 2. otherwise allocate a fresh segment
+  return mi_segment_alloc(0, 0, heap->arena_id, tld, NULL);
+}
+
+
+/* -----------------------------------------------------------
+   Page allocation
+----------------------------------------------------------- */
+
+static mi_page_t* mi_segments_page_alloc(mi_heap_t* heap, mi_page_kind_t page_kind, size_t required, size_t block_size, mi_segments_tld_t* tld)
+{
+  mi_assert_internal(required <= MI_LARGE_OBJ_SIZE_MAX && page_kind <= MI_PAGE_LARGE);
+
+  // find a free page
+  size_t page_size = _mi_align_up(required, (required > MI_MEDIUM_PAGE_SIZE ? MI_MEDIUM_PAGE_SIZE : MI_SEGMENT_SLICE_SIZE));
+  size_t slices_needed = page_size / MI_SEGMENT_SLICE_SIZE;
+  mi_assert_internal(slices_needed * MI_SEGMENT_SLICE_SIZE == page_size);
+  mi_page_t* page = mi_segments_page_find_and_allocate(slices_needed, heap->arena_id, tld); //(required <= MI_SMALL_SIZE_MAX ? 0 : slices_needed), tld);
+  if (page==NULL) {
+    // no free page, allocate a new segment and try again
+    if (mi_segment_reclaim_or_alloc(heap, slices_needed, block_size, tld) == NULL) {
+      // OOM or reclaimed a good page in the heap
+      return NULL;
+    }
+    else {
+      // otherwise try again
+      return mi_segments_page_alloc(heap, page_kind, required, block_size, tld);
+    }
+  }
+  mi_assert_internal(page != NULL && page->slice_count*MI_SEGMENT_SLICE_SIZE == page_size);
+  mi_assert_internal(_mi_ptr_segment(page)->thread_id == _mi_thread_id());
+  mi_segment_try_purge(_mi_ptr_segment(page), false);
+  return page;
+}
+
+
+
+/* -----------------------------------------------------------
+   Huge page allocation
+----------------------------------------------------------- */
+
+static mi_page_t* mi_segment_huge_page_alloc(size_t size, size_t page_alignment, mi_arena_id_t req_arena_id, mi_segments_tld_t* tld)
+{
+  mi_page_t* page = NULL;
+  mi_segment_t* segment = mi_segment_alloc(size,page_alignment,req_arena_id,tld,&page);
+  if (segment == NULL || page==NULL) return NULL;
+  mi_assert_internal(segment->used==1);
+  mi_assert_internal(mi_page_block_size(page) >= size);
+  #if MI_HUGE_PAGE_ABANDON
+  segment->thread_id = 0; // huge segments are immediately abandoned
+  #endif
+
+  // for huge pages we initialize the block_size as we may
+  // overallocate to accommodate large alignments.
+  size_t psize;
+  uint8_t* start = _mi_segment_page_start(segment, page, &psize);
+  page->block_size = psize;
+  mi_assert_internal(page->is_huge);
+
+  // decommit the part of the prefix of a page that will not be used; this can be quite large (close to MI_SEGMENT_SIZE)
+  if (page_alignment > 0 && segment->allow_decommit) {
+    uint8_t* aligned_p = (uint8_t*)_mi_align_up((uintptr_t)start, page_alignment);
+    mi_assert_internal(_mi_is_aligned(aligned_p, page_alignment));
+    mi_assert_internal(psize - (aligned_p - start) >= size);
+    uint8_t* decommit_start = start + sizeof(mi_block_t);              // for the free list
+    ptrdiff_t decommit_size = aligned_p - decommit_start;
+    _mi_os_reset(decommit_start, decommit_size);   // note: cannot use segment_decommit on huge segments
+  }
+
+  return page;
+}
+
+#if MI_HUGE_PAGE_ABANDON
+// free huge block from another thread
+void _mi_segment_huge_page_free(mi_segment_t* segment, mi_page_t* page, mi_block_t* block) {
+  // huge page segments are always abandoned and can be freed immediately by any thread
+  mi_assert_internal(segment->kind==MI_SEGMENT_HUGE);
+  mi_assert_internal(segment == _mi_page_segment(page));
+  mi_assert_internal(mi_atomic_load_relaxed(&segment->thread_id)==0);
+
+  // claim it and free
+  mi_heap_t* heap = mi_heap_get_default(); // issue #221; don't use the internal get_default_heap as we need to ensure the thread is initialized.
+  // paranoia: if this is the last reference, the cas should always succeed
+  size_t expected_tid = 0;
+  if (mi_atomic_cas_strong_acq_rel(&segment->thread_id, &expected_tid, heap->thread_id)) {
+    mi_block_set_next(page, block, page->free);
+    page->free = block;
+    page->used--;
+    page->is_zero_init = false;
+    mi_assert(page->used == 0);
+    mi_tld_t* tld = heap->tld;
+    _mi_segment_page_free(page, true, &tld->segments);
+  }
+#if (MI_DEBUG!=0)
+  else {
+    mi_assert_internal(false);
+  }
+#endif
+}
+
+#else
+// reset memory of a huge block from another thread
+void _mi_segment_huge_page_reset(mi_segment_t* segment, mi_page_t* page, mi_block_t* block) {
+  MI_UNUSED(page);
+  mi_assert_internal(segment->kind == MI_SEGMENT_HUGE);
+  mi_assert_internal(segment == _mi_page_segment(page));
+  mi_assert_internal(page->used == 1); // this is called just before the free
+  mi_assert_internal(page->free == NULL);
+  if (segment->allow_decommit) {
+    size_t csize = mi_usable_size(block);
+    if (csize > sizeof(mi_block_t)) {
+      csize = csize - sizeof(mi_block_t);
+      uint8_t* p = (uint8_t*)block + sizeof(mi_block_t);
+      _mi_os_reset(p, csize);  // note: cannot use segment_decommit on huge segments
+    }
+  }
+}
+#endif
+
+/* -----------------------------------------------------------
+   Page allocation and free
+----------------------------------------------------------- */
+mi_page_t* _mi_segment_page_alloc(mi_heap_t* heap, size_t block_size, size_t page_alignment, mi_segments_tld_t* tld) {
+  mi_page_t* page;
+  if mi_unlikely(page_alignment > MI_BLOCK_ALIGNMENT_MAX) {
+    mi_assert_internal(_mi_is_power_of_two(page_alignment));
+    mi_assert_internal(page_alignment >= MI_SEGMENT_SIZE);
+    if (page_alignment < MI_SEGMENT_SIZE) { page_alignment = MI_SEGMENT_SIZE; }
+    page = mi_segment_huge_page_alloc(block_size,page_alignment,heap->arena_id,tld);
+  }
+  else if (block_size <= MI_SMALL_OBJ_SIZE_MAX) {
+    page = mi_segments_page_alloc(heap,MI_PAGE_SMALL,block_size,block_size,tld);
+  }
+  else if (block_size <= MI_MEDIUM_OBJ_SIZE_MAX) {
+    page = mi_segments_page_alloc(heap,MI_PAGE_MEDIUM,MI_MEDIUM_PAGE_SIZE,block_size,tld);
+  }
+  else if (block_size <= MI_LARGE_OBJ_SIZE_MAX) {
+    page = mi_segments_page_alloc(heap,MI_PAGE_LARGE,block_size,block_size,tld);
+  }
+  else {
+    page = mi_segment_huge_page_alloc(block_size,page_alignment,heap->arena_id,tld);
+  }
+  mi_assert_internal(page == NULL || _mi_heap_memid_is_suitable(heap, _mi_page_segment(page)->memid));
+  mi_assert_expensive(page == NULL || mi_segment_is_valid(_mi_page_segment(page),tld));
+  mi_assert_internal(page == NULL || _mi_page_segment(page)->subproc == tld->subproc);
+  return page;
+}
+
+
+/* -----------------------------------------------------------
+   Visit blocks in a segment (only used for abandoned segments)
+----------------------------------------------------------- */
+
+static bool mi_segment_visit_page(mi_page_t* page, bool visit_blocks, mi_block_visit_fun* visitor, void* arg) {
+  mi_heap_area_t area;
+  _mi_heap_area_init(&area, page);
+  if (!visitor(NULL, &area, NULL, area.block_size, arg)) return false;
+  if (visit_blocks) {
+    return _mi_heap_area_visit_blocks(&area, page, visitor, arg);
+  }
+  else {
+    return true;
+  }
+}
+
+bool _mi_segment_visit_blocks(mi_segment_t* segment, int heap_tag, bool visit_blocks, mi_block_visit_fun* visitor, void* arg) {
+  const mi_slice_t* end;
+  mi_slice_t* slice = mi_slices_start_iterate(segment, &end);
+  while (slice < end) {
+    if (mi_slice_is_used(slice)) {
+      mi_page_t* const page = mi_slice_to_page(slice);
+      if (heap_tag < 0 || (int)page->heap_tag == heap_tag) {
+        if (!mi_segment_visit_page(page, visit_blocks, visitor, arg)) return false;
+      }
+    }
+    slice = slice + slice->slice_count;
+  }
+  return true;
+}
diff --git a/compat/mimalloc/stats.c b/compat/mimalloc/stats.c
new file mode 100644
index 00000000000000..36e8c9813edb09
--- /dev/null
+++ b/compat/mimalloc/stats.c
@@ -0,0 +1,633 @@
+/* ----------------------------------------------------------------------------
+Copyright (c) 2018-2021, Microsoft Research, Daan Leijen
+This is free software; you can redistribute it and/or modify it under the
+terms of the MIT license. A copy of the license can be found in the file
+"LICENSE" at the root of this distribution.
+-----------------------------------------------------------------------------*/
+#include "mimalloc.h"
+#include "mimalloc/internal.h"
+#include "mimalloc/atomic.h"
+#include "mimalloc/prim.h"
+
+#include <string.h> // memset
+
+#if defined(_MSC_VER) && (_MSC_VER < 1920)
+#pragma warning(disable:4204)  // non-constant aggregate initializer
+#endif
+
+/* -----------------------------------------------------------
+  Statistics operations
+----------------------------------------------------------- */
+
+static bool mi_is_in_main(void* stat) {
+  return ((uint8_t*)stat >= (uint8_t*)&_mi_stats_main
+         && (uint8_t*)stat < ((uint8_t*)&_mi_stats_main + sizeof(mi_stats_t)));
+}
+
+static void mi_stat_update(mi_stat_count_t* stat, int64_t amount) {
+  if (amount == 0) return;
+  if mi_unlikely(mi_is_in_main(stat))
+  {
+    // add atomically (for abandoned pages)
+    int64_t current = mi_atomic_addi64_relaxed(&stat->current, amount);
+    // if (stat == &_mi_stats_main.committed) { mi_assert_internal(current + amount >= 0); };
+    mi_atomic_maxi64_relaxed(&stat->peak, current + amount);
+    if (amount > 0) {
+      mi_atomic_addi64_relaxed(&stat->total,amount);
+    }
+  }
+  else {
+    // add thread local
+    stat->current += amount;
+    if (stat->current > stat->peak) { stat->peak = stat->current; }
+    if (amount > 0) { stat->total += amount; }
+  }
+}
+
+void _mi_stat_counter_increase(mi_stat_counter_t* stat, size_t amount) {
+  if (mi_is_in_main(stat)) {
+    mi_atomic_addi64_relaxed( &stat->total, (int64_t)amount );
+  }
+  else {
+    stat->total += amount;
+  }
+}
+
+void _mi_stat_increase(mi_stat_count_t* stat, size_t amount) {
+  mi_stat_update(stat, (int64_t)amount);
+}
+
+void _mi_stat_decrease(mi_stat_count_t* stat, size_t amount) {
+  mi_stat_update(stat, -((int64_t)amount));
+}
+
+
+static void mi_stat_adjust(mi_stat_count_t* stat, int64_t amount) {
+  if (amount == 0) return;
+  if mi_unlikely(mi_is_in_main(stat))
+  {
+    // adjust atomically 
+    mi_atomic_addi64_relaxed(&stat->current, amount);
+    mi_atomic_addi64_relaxed(&stat->total,amount);
+  }
+  else {
+    // adjust local
+    stat->current += amount;
+    stat->total += amount;
+  }
+}
+
+void _mi_stat_adjust_decrease(mi_stat_count_t* stat, size_t amount) {
+  mi_stat_adjust(stat, -((int64_t)amount));
+}
+
+
+// must be thread safe as it is called from stats_merge
+static void mi_stat_count_add_mt(mi_stat_count_t* stat, const mi_stat_count_t* src) {
+  if (stat==src) return;
+  mi_atomic_void_addi64_relaxed(&stat->total, &src->total); 
+  const int64_t prev_current = mi_atomic_addi64_relaxed(&stat->current, src->current);
+
+  // The global current plus the thread peak approximates the new global peak.
+  // note: peak scores do not really work across threads.
+  // We used to just add them together, but that often overestimates in practice.
+  // Similarly, taking the max does not seem to work well. The current approach
+  // by Artem Kharytoniuk (@artem-lunarg) seems to work better; see PR#1112
+  // for a longer description.
+  mi_atomic_maxi64_relaxed(&stat->peak, prev_current + src->peak);
+}
+
+static void mi_stat_counter_add_mt(mi_stat_counter_t* stat, const mi_stat_counter_t* src) {
+  if (stat==src) return;
+  mi_atomic_void_addi64_relaxed(&stat->total, &src->total);
+}
+
+#define MI_STAT_COUNT(stat)    mi_stat_count_add_mt(&stats->stat, &src->stat);
+#define MI_STAT_COUNTER(stat)  mi_stat_counter_add_mt(&stats->stat, &src->stat);
+
+// must be thread safe as it is called from stats_merge
+static void mi_stats_add(mi_stats_t* stats, const mi_stats_t* src) {
+  if (stats==src) return;
+
+  // add all fields
+  MI_STAT_FIELDS()
+
+  #if MI_STAT>1
+  for (size_t i = 0; i <= MI_BIN_HUGE; i++) {
+    mi_stat_count_add_mt(&stats->malloc_bins[i], &src->malloc_bins[i]);
+  }
+  #endif
+  for (size_t i = 0; i <= MI_BIN_HUGE; i++) {
+    mi_stat_count_add_mt(&stats->page_bins[i], &src->page_bins[i]);
+  }
+}
+
+#undef MI_STAT_COUNT
+#undef MI_STAT_COUNTER
+
+/* -----------------------------------------------------------
+  Display statistics
+----------------------------------------------------------- */
+
+// unit > 0 : size in binary bytes
+// unit == 0: count as decimal
+// unit < 0 : count in binary
+static void mi_printf_amount(int64_t n, int64_t unit, mi_output_fun* out, void* arg, const char* fmt) {
+  char buf[32]; buf[0] = 0;
+  int  len = 32;
+  const char* suffix = (unit <= 0 ? " " : "B");
+  const int64_t base = (unit == 0 ? 1000 : 1024);
+  if (unit>0) n *= unit;
+
+  const int64_t pos = (n < 0 ? -n : n);
+  if (pos < base) {
+    if (n!=1 || suffix[0] != 'B') {  // skip printing 1 B for the unit column
+      _mi_snprintf(buf, len, "%lld   %-3s", (long long)n, (n==0 ? "" : suffix));
+    }
+  }
+  else {
+    int64_t divider = base;
+    const char* magnitude = "K";
+    if (pos >= divider*base) { divider *= base; magnitude = "M"; }
+    if (pos >= divider*base) { divider *= base; magnitude = "G"; }
+    const int64_t tens = (n / (divider/10));
+    const long whole = (long)(tens/10);
+    const long frac1 = (long)(tens%10);
+    char unitdesc[8];
+    _mi_snprintf(unitdesc, 8, "%s%s%s", magnitude, (base==1024 ? "i" : ""), suffix);
+    _mi_snprintf(buf, len, "%ld.%ld %-3s", whole, (frac1 < 0 ? -frac1 : frac1), unitdesc);
+  }
+  _mi_fprintf(out, arg, (fmt==NULL ? "%12s" : fmt), buf);
+}
+
+
+static void mi_print_amount(int64_t n, int64_t unit, mi_output_fun* out, void* arg) {
+  mi_printf_amount(n,unit,out,arg,NULL);
+}
+
+static void mi_print_count(int64_t n, int64_t unit, mi_output_fun* out, void* arg) {
+  if (unit==1) _mi_fprintf(out, arg, "%12s"," ");
+          else mi_print_amount(n,0,out,arg);
+}
+
+static void mi_stat_print_ex(const mi_stat_count_t* stat, const char* msg, int64_t unit, mi_output_fun* out, void* arg, const char* notok ) {
+  _mi_fprintf(out, arg,"%10s:", msg);
+  if (unit != 0) {
+    if (unit > 0) {
+      mi_print_amount(stat->peak, unit, out, arg);
+      mi_print_amount(stat->total, unit, out, arg);
+      // mi_print_amount(stat->freed, unit, out, arg);
+      mi_print_amount(stat->current, unit, out, arg);
+      mi_print_amount(unit, 1, out, arg);
+      mi_print_count(stat->total, unit, out, arg);
+    }
+    else {
+      mi_print_amount(stat->peak, -1, out, arg);
+      mi_print_amount(stat->total, -1, out, arg);
+      // mi_print_amount(stat->freed, -1, out, arg);
+      mi_print_amount(stat->current, -1, out, arg);
+      if (unit == -1) {
+        _mi_fprintf(out, arg, "%24s", "");
+      }
+      else {
+        mi_print_amount(-unit, 1, out, arg);
+        mi_print_count((stat->total / -unit), 0, out, arg);
+      }
+    }
+    if (stat->current != 0) {
+      _mi_fprintf(out, arg, "  ");
+      _mi_fprintf(out, arg, (notok == NULL ? "not all freed" : notok));
+      _mi_fprintf(out, arg, "\n");
+    }
+    else {
+      _mi_fprintf(out, arg, "  ok\n");
+    }
+  }
+  else {
+    mi_print_amount(stat->peak, 1, out, arg);
+    mi_print_amount(stat->total, 1, out, arg);
+    _mi_fprintf(out, arg, "%11s", " ");  // no freed
+    mi_print_amount(stat->current, 1, out, arg);
+    _mi_fprintf(out, arg, "\n");
+  }
+}
+
+static void mi_stat_print(const mi_stat_count_t* stat, const char* msg, int64_t unit, mi_output_fun* out, void* arg) {
+  mi_stat_print_ex(stat, msg, unit, out, arg, NULL);
+}
+
+#if MI_STAT>1
+static void mi_stat_total_print(const mi_stat_count_t* stat, const char* msg, int64_t unit, mi_output_fun* out, void* arg) {
+  _mi_fprintf(out, arg, "%10s:", msg);
+  _mi_fprintf(out, arg, "%12s", " ");  // no peak
+  mi_print_amount(stat->total, unit, out, arg);
+  _mi_fprintf(out, arg, "\n");
+}
+#endif
+
+static void mi_stat_counter_print(const mi_stat_counter_t* stat, const char* msg, mi_output_fun* out, void* arg ) {
+  _mi_fprintf(out, arg, "%10s:", msg);
+  mi_print_amount(stat->total, -1, out, arg);
+  _mi_fprintf(out, arg, "\n");
+}
+
+
+static void mi_stat_average_print(size_t count, size_t total, const char* msg, mi_output_fun* out, void* arg) {
+  const int64_t avg_tens = (count == 0 ? 0 : (total*10 / count));
+  const long avg_whole = (long)(avg_tens/10);
+  const long avg_frac1 = (long)(avg_tens%10);
+  _mi_fprintf(out, arg, "%10s: %5ld.%ld avg\n", msg, avg_whole, avg_frac1);
+}
+
+
+static void mi_print_header(mi_output_fun* out, void* arg ) {
+  _mi_fprintf(out, arg, "%10s: %11s %11s %11s %11s %11s\n", "heap stats", "peak   ", "total   ", "current   ", "block   ", "total#   ");
+}
+
+#if MI_STAT>1
+static void mi_stats_print_bins(const mi_stat_count_t* bins, size_t max, const char* fmt, mi_output_fun* out, void* arg) {
+  bool found = false;
+  char buf[64];
+  for (size_t i = 0; i <= max; i++) {
+    if (bins[i].total > 0) {
+      found = true;
+      int64_t unit = _mi_bin_size((uint8_t)i);
+      _mi_snprintf(buf, 64, "%s %3lu", fmt, (unsigned long)i);
+      mi_stat_print(&bins[i], buf, unit, out, arg);
+    }
+  }
+  if (found) {
+    _mi_fprintf(out, arg, "\n");
+    mi_print_header(out, arg);
+  }
+}
+#endif
+
+
+
+//------------------------------------------------------------
+// Use an output wrapper for line-buffered output
+// (which is nice when using loggers etc.)
+//------------------------------------------------------------
+typedef struct buffered_s {
+  mi_output_fun* out;   // original output function
+  void*          arg;   // and state
+  char*          buf;   // local buffer of at least size `count+1`
+  size_t         used;  // currently used chars `used <= count`
+  size_t         count; // total chars available for output
+} buffered_t;
+
+static void mi_buffered_flush(buffered_t* buf) {
+  buf->buf[buf->used] = 0;
+  _mi_fputs(buf->out, buf->arg, NULL, buf->buf);
+  buf->used = 0;
+}
+
+static void mi_cdecl mi_buffered_out(const char* msg, void* arg) {
+  buffered_t* buf = (buffered_t*)arg;
+  if (msg==NULL || buf==NULL) return;
+  for (const char* src = msg; *src != 0; src++) {
+    char c = *src;
+    if (buf->used >= buf->count) mi_buffered_flush(buf);
+    mi_assert_internal(buf->used < buf->count);
+    buf->buf[buf->used++] = c;
+    if (c == '\n') mi_buffered_flush(buf);
+  }
+}
+
+//------------------------------------------------------------
+// Print statistics
+//------------------------------------------------------------
+
+static void _mi_stats_print(mi_stats_t* stats, mi_output_fun* out0, void* arg0) mi_attr_noexcept {
+  // wrap the output function to be line buffered
+  char buf[256];
+  buffered_t buffer = { out0, arg0, NULL, 0, 255 };
+  buffer.buf = buf;
+  mi_output_fun* out = &mi_buffered_out;
+  void* arg = &buffer;
+
+  // and print using that
+  mi_print_header(out,arg);
+  #if MI_STAT>1
+  mi_stats_print_bins(stats->malloc_bins, MI_BIN_HUGE, "bin",out,arg);
+  #endif
+  #if MI_STAT
+  mi_stat_print(&stats->malloc_normal, "binned", (stats->malloc_normal_count.total == 0 ? 1 : -1), out, arg);
+  // mi_stat_print(&stats->malloc_large, "large", (stats->malloc_large_count.total == 0 ? 1 : -1), out, arg);
+  mi_stat_print(&stats->malloc_huge, "huge", (stats->malloc_huge_count.total == 0 ? 1 : -1), out, arg);
+  mi_stat_count_t total = { 0,0,0 };
+  mi_stat_count_add_mt(&total, &stats->malloc_normal);
+  // mi_stat_count_add(&total, &stats->malloc_large);
+  mi_stat_count_add_mt(&total, &stats->malloc_huge);
+  mi_stat_print_ex(&total, "total", 1, out, arg, "");
+  #endif
+  #if MI_STAT>1
+  mi_stat_total_print(&stats->malloc_requested, "malloc req", 1, out, arg);
+  _mi_fprintf(out, arg, "\n");
+  #endif
+  mi_stat_print_ex(&stats->reserved, "reserved", 1, out, arg, "");
+  mi_stat_print_ex(&stats->committed, "committed", 1, out, arg, "");
+  mi_stat_counter_print(&stats->reset, "reset", out, arg );
+  mi_stat_counter_print(&stats->purged, "purged", out, arg );
+  mi_stat_print_ex(&stats->page_committed, "touched", 1, out, arg, "");
+  mi_stat_print(&stats->segments, "segments", -1, out, arg);
+  mi_stat_print(&stats->segments_abandoned, "-abandoned", -1, out, arg);
+  mi_stat_print(&stats->segments_cache, "-cached", -1, out, arg);
+  mi_stat_print(&stats->pages, "pages", -1, out, arg);
+  mi_stat_print(&stats->pages_abandoned, "-abandoned", -1, out, arg);
+  mi_stat_counter_print(&stats->pages_extended, "-extended", out, arg);
+  mi_stat_counter_print(&stats->pages_retire, "-retire", out, arg);
+  mi_stat_counter_print(&stats->arena_count, "arenas", out, arg);
+  // mi_stat_counter_print(&stats->arena_crossover_count, "-crossover", out, arg);
+  mi_stat_counter_print(&stats->arena_rollback_count, "-rollback", out, arg);
+  mi_stat_counter_print(&stats->mmap_calls, "mmaps", out, arg);
+  mi_stat_counter_print(&stats->commit_calls, "commits", out, arg);
+  mi_stat_counter_print(&stats->reset_calls, "resets", out, arg);
+  mi_stat_counter_print(&stats->purge_calls, "purges", out, arg);
+  mi_stat_counter_print(&stats->malloc_guarded_count, "guarded", out, arg);
+  mi_stat_print(&stats->threads, "threads", -1, out, arg);
+  mi_stat_average_print(stats->page_searches_count.total, stats->page_searches.total, "searches", out, arg);
+  _mi_fprintf(out, arg, "%10s: %5i\n", "numa nodes", _mi_os_numa_node_count());
+
+  size_t elapsed;
+  size_t user_time;
+  size_t sys_time;
+  size_t current_rss;
+  size_t peak_rss;
+  size_t current_commit;
+  size_t peak_commit;
+  size_t page_faults;
+  mi_process_info(&elapsed, &user_time, &sys_time, &current_rss, &peak_rss, &current_commit, &peak_commit, &page_faults);
+  _mi_fprintf(out, arg, "%10s: %5zu.%03zu s\n", "elapsed", elapsed/1000, elapsed%1000);
+  _mi_fprintf(out, arg, "%10s: user: %zu.%03zu s, system: %zu.%03zu s, faults: %zu, peak rss: ", "process",
+              user_time/1000, user_time%1000, sys_time/1000, sys_time%1000, page_faults );
+  mi_printf_amount((int64_t)peak_rss, 1, out, arg, "%s");
+  if (peak_commit > 0) {
+    _mi_fprintf(out, arg, ", peak commit: ");
+    mi_printf_amount((int64_t)peak_commit, 1, out, arg, "%s");
+  }
+  _mi_fprintf(out, arg, "\n");
+}
+
+static mi_msecs_t mi_process_start; // = 0
+
+static mi_stats_t* mi_stats_get_default(void) {
+  mi_heap_t* heap = mi_heap_get_default();
+  return &heap->tld->stats;
+}
+
+static void mi_stats_merge_from(mi_stats_t* stats) {
+  if (stats != &_mi_stats_main) {
+    mi_stats_add(&_mi_stats_main, stats);
+    memset(stats, 0, sizeof(mi_stats_t));
+  }
+}
+
+void mi_stats_reset(void) mi_attr_noexcept {
+  mi_stats_t* stats = mi_stats_get_default();
+  if (stats != &_mi_stats_main) { memset(stats, 0, sizeof(mi_stats_t)); }
+  memset(&_mi_stats_main, 0, sizeof(mi_stats_t));
+  if (mi_process_start == 0) { mi_process_start = _mi_clock_start(); };
+}
+
+void mi_stats_merge(void) mi_attr_noexcept {
+  mi_stats_merge_from( mi_stats_get_default() );
+}
+
+void _mi_stats_merge_thread(mi_tld_t* tld) {
+  mi_stats_merge_from( &tld->stats );
+}
+
+void _mi_stats_done(mi_stats_t* stats) {  // called from `mi_thread_done`
+  mi_stats_merge_from(stats);
+}
+
+void mi_stats_print_out(mi_output_fun* out, void* arg) mi_attr_noexcept {
+  mi_stats_merge_from(mi_stats_get_default());
+  _mi_stats_print(&_mi_stats_main, out, arg);
+}
+
+void mi_stats_print(void* out) mi_attr_noexcept {
+  // for compatibility there is an `out` parameter (which can be `stdout` or `stderr`)
+  mi_stats_print_out((mi_output_fun*)out, NULL);
+}
+
+void mi_thread_stats_print_out(mi_output_fun* out, void* arg) mi_attr_noexcept {
+  _mi_stats_print(mi_stats_get_default(), out, arg);
+}
+
+
+// ----------------------------------------------------------------
+// Basic timer for convenience; use milli-seconds to avoid doubles
+// ----------------------------------------------------------------
+
+static mi_msecs_t mi_clock_diff;
+
+mi_msecs_t _mi_clock_now(void) {
+  return _mi_prim_clock_now();
+}
+
+mi_msecs_t _mi_clock_start(void) {
+  if (mi_clock_diff == 0) {
+    mi_msecs_t t0 = _mi_clock_now();
+    mi_clock_diff = _mi_clock_now() - t0;
+  }
+  return _mi_clock_now();
+}
+
+mi_msecs_t _mi_clock_end(mi_msecs_t start) {
+  mi_msecs_t end = _mi_clock_now();
+  return (end - start - mi_clock_diff);
+}
+
+
+// --------------------------------------------------------
+// Basic process statistics
+// --------------------------------------------------------
+
+mi_decl_export void mi_process_info(size_t* elapsed_msecs, size_t* user_msecs, size_t* system_msecs, size_t* current_rss, size_t* peak_rss, size_t* current_commit, size_t* peak_commit, size_t* page_faults) mi_attr_noexcept
+{
+  mi_process_info_t pinfo;
+  _mi_memzero_var(pinfo);
+  pinfo.elapsed        = _mi_clock_end(mi_process_start);
+  pinfo.current_commit = (size_t)(mi_atomic_loadi64_relaxed((_Atomic(int64_t)*)&_mi_stats_main.committed.current));
+  pinfo.peak_commit    = (size_t)(mi_atomic_loadi64_relaxed((_Atomic(int64_t)*)&_mi_stats_main.committed.peak));
+  pinfo.current_rss    = pinfo.current_commit;
+  pinfo.peak_rss       = pinfo.peak_commit;
+  pinfo.utime          = 0;
+  pinfo.stime          = 0;
+  pinfo.page_faults    = 0;
+
+  _mi_prim_process_info(&pinfo);
+
+  if (elapsed_msecs!=NULL)  *elapsed_msecs  = (pinfo.elapsed < 0 ? 0 : (pinfo.elapsed < (mi_msecs_t)PTRDIFF_MAX ? (size_t)pinfo.elapsed : PTRDIFF_MAX));
+  if (user_msecs!=NULL)     *user_msecs     = (pinfo.utime < 0 ? 0 : (pinfo.utime < (mi_msecs_t)PTRDIFF_MAX ? (size_t)pinfo.utime : PTRDIFF_MAX));
+  if (system_msecs!=NULL)   *system_msecs   = (pinfo.stime < 0 ? 0 : (pinfo.stime < (mi_msecs_t)PTRDIFF_MAX ? (size_t)pinfo.stime : PTRDIFF_MAX));
+  if (current_rss!=NULL)    *current_rss    = pinfo.current_rss;
+  if (peak_rss!=NULL)       *peak_rss       = pinfo.peak_rss;
+  if (current_commit!=NULL) *current_commit = pinfo.current_commit;
+  if (peak_commit!=NULL)    *peak_commit    = pinfo.peak_commit;
+  if (page_faults!=NULL)    *page_faults    = pinfo.page_faults;
+}
+
+
+// --------------------------------------------------------
+// Return statistics
+// --------------------------------------------------------
+
+void mi_stats_get(size_t stats_size, mi_stats_t* stats) mi_attr_noexcept {
+  if (stats == NULL || stats_size == 0) return;
+  _mi_memzero(stats, stats_size);
+  const size_t size = (stats_size > sizeof(mi_stats_t) ? sizeof(mi_stats_t) : stats_size);
+  _mi_memcpy(stats, &_mi_stats_main, size);
+  stats->version = MI_STAT_VERSION;
+}
+
+
+// --------------------------------------------------------
+// Statics in json format
+// --------------------------------------------------------
+
+typedef struct mi_heap_buf_s {
+  char*   buf;
+  size_t  size;
+  size_t  used;
+  bool    can_realloc;
+} mi_heap_buf_t;
+
+static bool mi_heap_buf_expand(mi_heap_buf_t* hbuf) {
+  if (hbuf==NULL) return false;
+  if (hbuf->buf != NULL && hbuf->size>0) {
+    hbuf->buf[hbuf->size-1] = 0;
+  }
+  if (hbuf->size > SIZE_MAX/2 || !hbuf->can_realloc) return false;
+  const size_t newsize = (hbuf->size == 0 ? mi_good_size(12*MI_KiB) : 2*hbuf->size);
+  char* const  newbuf  = (char*)mi_rezalloc(hbuf->buf, newsize);
+  if (newbuf == NULL) return false;
+  hbuf->buf = newbuf;
+  hbuf->size = newsize;
+  return true;
+}
+
+static void mi_heap_buf_print(mi_heap_buf_t* hbuf, const char* msg) {
+  if (msg==NULL || hbuf==NULL) return;
+  if (hbuf->used + 1 >= hbuf->size && !hbuf->can_realloc) return;
+  for (const char* src = msg; *src != 0; src++) {
+    char c = *src;
+    if (hbuf->used + 1 >= hbuf->size) {
+      if (!mi_heap_buf_expand(hbuf)) return;
+    }
+    mi_assert_internal(hbuf->used < hbuf->size);
+    hbuf->buf[hbuf->used++] = c;
+  }
+  mi_assert_internal(hbuf->used < hbuf->size);
+  hbuf->buf[hbuf->used] = 0;
+}
+
+static void mi_heap_buf_print_count_bin(mi_heap_buf_t* hbuf, const char* prefix, mi_stat_count_t* stat, size_t bin, bool add_comma) {
+  const size_t binsize = _mi_bin_size(bin);
+  const size_t pagesize = (binsize <= MI_SMALL_OBJ_SIZE_MAX ? MI_SMALL_PAGE_SIZE :
+                            (binsize <= MI_MEDIUM_OBJ_SIZE_MAX ? MI_MEDIUM_PAGE_SIZE :
+                              #if MI_LARGE_PAGE_SIZE
+                              (binsize <= MI_LARGE_OBJ_SIZE_MAX ? MI_LARGE_PAGE_SIZE : 0)
+                              #else
+                              0
+                              #endif
+                              ));
+  char buf[128];
+  _mi_snprintf(buf, 128, "%s{ \"total\": %lld, \"peak\": %lld, \"current\": %lld, \"block_size\": %zu, \"page_size\": %zu }%s\n", prefix, stat->total, stat->peak, stat->current, binsize, pagesize, (add_comma ? "," : ""));
+  buf[127] = 0;
+  mi_heap_buf_print(hbuf, buf);
+}
+
+static void mi_heap_buf_print_count(mi_heap_buf_t* hbuf, const char* prefix, mi_stat_count_t* stat, bool add_comma) {
+  char buf[128];
+  _mi_snprintf(buf, 128, "%s{ \"total\": %lld, \"peak\": %lld, \"current\": %lld }%s\n", prefix, stat->total, stat->peak, stat->current, (add_comma ? "," : ""));
+  buf[127] = 0;
+  mi_heap_buf_print(hbuf, buf);
+}
+
+static void mi_heap_buf_print_count_value(mi_heap_buf_t* hbuf, const char* name, mi_stat_count_t* stat) {
+  char buf[128];
+  _mi_snprintf(buf, 128, "  \"%s\": ", name);
+  buf[127] = 0;
+  mi_heap_buf_print(hbuf, buf);
+  mi_heap_buf_print_count(hbuf, "", stat, true);
+}
+
+static void mi_heap_buf_print_value(mi_heap_buf_t* hbuf, const char* name, int64_t val) {
+  char buf[128];
+  _mi_snprintf(buf, 128, "  \"%s\": %lld,\n", name, val);
+  buf[127] = 0;
+  mi_heap_buf_print(hbuf, buf);
+}
+
+static void mi_heap_buf_print_size(mi_heap_buf_t* hbuf, const char* name, size_t val, bool add_comma) {
+  char buf[128];
+  _mi_snprintf(buf, 128, "    \"%s\": %zu%s\n", name, val, (add_comma ? "," : ""));
+  buf[127] = 0;
+  mi_heap_buf_print(hbuf, buf);
+}
+
+static void mi_heap_buf_print_counter_value(mi_heap_buf_t* hbuf, const char* name, mi_stat_counter_t* stat) {
+  mi_heap_buf_print_value(hbuf, name, stat->total);
+}
+
+#define MI_STAT_COUNT(stat)    mi_heap_buf_print_count_value(&hbuf, #stat, &stats->stat);
+#define MI_STAT_COUNTER(stat)  mi_heap_buf_print_counter_value(&hbuf, #stat, &stats->stat);
+
+char* mi_stats_get_json(size_t output_size, char* output_buf) mi_attr_noexcept {
+  mi_heap_buf_t hbuf = { NULL, 0, 0, true };
+  if (output_size > 0 && output_buf != NULL) {
+    _mi_memzero(output_buf, output_size);
+    hbuf.buf = output_buf;
+    hbuf.size = output_size;
+    hbuf.can_realloc = false;
+  }
+  else {
+    if (!mi_heap_buf_expand(&hbuf)) return NULL;
+  }
+  mi_heap_buf_print(&hbuf, "{\n");
+  mi_heap_buf_print_value(&hbuf, "version", MI_STAT_VERSION);
+  mi_heap_buf_print_value(&hbuf, "mimalloc_version", MI_MALLOC_VERSION);
+
+  // process info
+  mi_heap_buf_print(&hbuf, "  \"process\": {\n");
+  size_t elapsed;
+  size_t user_time;
+  size_t sys_time;
+  size_t current_rss;
+  size_t peak_rss;
+  size_t current_commit;
+  size_t peak_commit;
+  size_t page_faults;
+  mi_process_info(&elapsed, &user_time, &sys_time, &current_rss, &peak_rss, &current_commit, &peak_commit, &page_faults);
+  mi_heap_buf_print_size(&hbuf, "elapsed_msecs", elapsed, true);
+  mi_heap_buf_print_size(&hbuf, "user_msecs", user_time, true);
+  mi_heap_buf_print_size(&hbuf, "system_msecs", sys_time, true);
+  mi_heap_buf_print_size(&hbuf, "page_faults", page_faults, true);
+  mi_heap_buf_print_size(&hbuf, "rss_current", current_rss, true);
+  mi_heap_buf_print_size(&hbuf, "rss_peak", peak_rss, true);
+  mi_heap_buf_print_size(&hbuf, "commit_current", current_commit, true);
+  mi_heap_buf_print_size(&hbuf, "commit_peak", peak_commit, false);
+  mi_heap_buf_print(&hbuf, "  },\n");
+
+  // statistics
+  mi_stats_t* stats = &_mi_stats_main;
+  MI_STAT_FIELDS()
+
+  // size bins
+  mi_heap_buf_print(&hbuf, "  \"malloc_bins\": [\n");
+  for (size_t i = 0; i <= MI_BIN_HUGE; i++) {
+    mi_heap_buf_print_count_bin(&hbuf, "    ", &stats->malloc_bins[i], i, i!=MI_BIN_HUGE);
+  }
+  mi_heap_buf_print(&hbuf, "  ],\n");
+  mi_heap_buf_print(&hbuf, "  \"page_bins\": [\n");
+  for (size_t i = 0; i <= MI_BIN_HUGE; i++) {
+    mi_heap_buf_print_count_bin(&hbuf, "    ", &stats->page_bins[i], i, i!=MI_BIN_HUGE);
+  }
+  mi_heap_buf_print(&hbuf, "  ]\n");
+  mi_heap_buf_print(&hbuf, "}\n");
+  return hbuf.buf;
+}

From 0971b7fa5bec384930b411bcfc6d0333828448d6 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 24 Jun 2019 23:41:27 +0200
Subject: [PATCH 419/553] mimalloc: adjust for building inside Git
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

We want to compile mimalloc's source code as part of Git, rather than
requiring the code to be built as an external library: mimalloc uses a
CMake-based build, which is not necessarily easy to integrate into the
flavors of Git for Windows (which will be the main benefitting port).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matthias Aßhauer <mha1993@live.de>
---
 compat/mimalloc/alloc.c    | 1 -
 compat/mimalloc/mimalloc.h | 3 ++-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/compat/mimalloc/alloc.c b/compat/mimalloc/alloc.c
index 120615b2ec8732..b6cdca1882ecb7 100644
--- a/compat/mimalloc/alloc.c
+++ b/compat/mimalloc/alloc.c
@@ -17,7 +17,6 @@ terms of the MIT license. A copy of the license can be found in the file
 #include <stdlib.h>      // malloc, abort
 
 #define MI_IN_ALLOC_C
-#include "alloc-override.c"
 #include "free.c"
 #undef MI_IN_ALLOC_C
 
diff --git a/compat/mimalloc/mimalloc.h b/compat/mimalloc/mimalloc.h
index b2d5f2c8df0734..fcd19cc9bc2e15 100644
--- a/compat/mimalloc/mimalloc.h
+++ b/compat/mimalloc/mimalloc.h
@@ -95,7 +95,8 @@ terms of the MIT license. A copy of the license can be found in the file
 // Includes
 // ------------------------------------------------------
 
-#include <stddef.h>     // size_t
+#include "compat/posix.h"
+
 #include <stdbool.h>    // bool
 #include <stdint.h>     // INTPTR_MAX
 

From 149dc789d0cc1ba511932801da7b08e5abef27d4 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 24 Jun 2019 23:43:06 +0200
Subject: [PATCH 420/553] mimalloc: offer a build-time option to enable it

By defining `USE_MIMALLOC`, Git can now be compiled with that
nicely-fast and small allocator.

Note that we have to disable a couple of `DEVELOPER` options to build
mimalloc's source code, as it makes heavy use of declarations after
statements, among other things that disagree with Git's conventions.

We even have to silence some GCC warnings in non-DEVELOPER mode. For
example, the `-Wno-array-bounds` flag is needed because in `-O2` builds,
trying to call `NtCurrentTeb()` (which `_mi_thread_id()` does on
Windows) causes a bogus warning about a system header, likely related
to https://sourceforge.net/p/mingw-w64/mailman/message/37674519/ and to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578:

C:/git-sdk-64-minimal/mingw64/include/psdk_inc/intrin-impl.h:838:1:
        error: array subscript 0 is outside array bounds of 'long long unsigned int[0]' [-Werror=array-bounds]
  838 | __buildreadseg(__readgsqword, unsigned __int64, "gs", "q")
      | ^~~~~~~~~~~~~~

Also: The `mimalloc` library uses C11-style atomics, therefore we must
require that standard when compiling with GCC if we want to use
`mimalloc` (instead of requiring "only" C99). This is what we do in the
CMake definition already, therefore this commit does not need to touch
`contrib/buildsystems/`.
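
As a minimal sketch of the kind of code that needs C11 (this is not
mimalloc's actual implementation; `stat_counter`, `stat_increase()`,
`stat_current()` and `stat_peak()` are hypothetical names), a statistics
counter built on `<stdatomic.h>` could look like this:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical statistics counter, not mimalloc's actual data structure. */
typedef struct {
    _Atomic(int64_t) current;
    _Atomic(int64_t) peak;
} stat_counter;

/* Add a non-negative amount and raise `peak` if the new value exceeds it. */
static void stat_increase(stat_counter *stat, int64_t amount)
{
    int64_t now, peak;

    assert(amount >= 0);
    /* fetch_add returns the previous value, so add `amount` again. */
    now = atomic_fetch_add_explicit(&stat->current, amount,
                                    memory_order_relaxed) + amount;
    peak = atomic_load_explicit(&stat->peak, memory_order_relaxed);

    /* On failure, `peak` is reloaded with the value another thread stored. */
    while (peak < now &&
           !atomic_compare_exchange_weak_explicit(&stat->peak, &peak, now,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
        ; /* retry */
}

static int64_t stat_current(stat_counter *stat)
{
    return atomic_load_explicit(&stat->current, memory_order_relaxed);
}

static int64_t stat_peak(stat_counter *stat)
{
    return atomic_load_explicit(&stat->peak, memory_order_relaxed);
}
```

Compiled with `-std=gnu99`, GCC would be expected to reject the
`_Atomic` qualifier under `-pedantic`, which is why the C99 restriction
has to be lifted when building with mimalloc.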

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Makefile              | 41 +++++++++++++++++++++++++++++++++++++++++
 compat/.gitattributes |  1 +
 compat/posix.h        | 10 ++++++++++
 config.mak.dev        |  2 ++
 config.mak.uname      |  2 +-
 5 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index b7eba509c6a0ca..4a69c3c75c73d4 100644
--- a/Makefile
+++ b/Makefile
@@ -1502,6 +1502,7 @@ BUILTIN_OBJS += builtin/write-tree.o
 # upstream unnecessarily (making merging in future changes easier).
 THIRD_PARTY_SOURCES += compat/inet_ntop.c
 THIRD_PARTY_SOURCES += compat/inet_pton.c
+THIRD_PARTY_SOURCES += compat/mimalloc/%
 THIRD_PARTY_SOURCES += compat/nedmalloc/%
 THIRD_PARTY_SOURCES += compat/obstack.%
 THIRD_PARTY_SOURCES += compat/poll/%
@@ -2243,6 +2244,46 @@ ifdef USE_NED_ALLOCATOR
 	OVERRIDE_STRDUP = YesPlease
 endif
 
+ifdef USE_MIMALLOC
+	MIMALLOC_OBJS = \
+		compat/mimalloc/alloc-aligned.o \
+		compat/mimalloc/alloc.o \
+		compat/mimalloc/arena.o \
+		compat/mimalloc/bitmap.o \
+		compat/mimalloc/heap.o \
+		compat/mimalloc/init.o \
+		compat/mimalloc/libc.o \
+		compat/mimalloc/options.o \
+		compat/mimalloc/os.o \
+		compat/mimalloc/page.o \
+		compat/mimalloc/random.o \
+		compat/mimalloc/prim/prim.o \
+		compat/mimalloc/segment.o \
+		compat/mimalloc/segment-map.o \
+		compat/mimalloc/stats.o
+
+	COMPAT_CFLAGS += -Icompat/mimalloc -DMI_DEBUG=0 -DUSE_MIMALLOC --std=gnu11
+	COMPAT_OBJS += $(MIMALLOC_OBJS)
+
+$(MIMALLOC_OBJS): COMPAT_CFLAGS += -DBANNED_H
+
+$(MIMALLOC_OBJS): COMPAT_CFLAGS += \
+	-DMI_WIN_USE_FLS \
+	-Wno-attributes \
+	-Wno-unknown-pragmas \
+	-Wno-unused-function \
+	-Wno-array-bounds
+
+ifdef DEVELOPER
+$(MIMALLOC_OBJS): COMPAT_CFLAGS += \
+	-Wno-pedantic \
+	-Wno-declaration-after-statement \
+	-Wno-old-style-definition \
+	-Wno-missing-prototypes \
+	-Wno-implicit-function-declaration
+endif
+endif
+
 ifdef OVERRIDE_STRDUP
 	COMPAT_CFLAGS += -DOVERRIDE_STRDUP
 	COMPAT_OBJS += compat/strdup.o
diff --git a/compat/.gitattributes b/compat/.gitattributes
index 40dbfb170dabc5..2b5a66a3b34bda 100644
--- a/compat/.gitattributes
+++ b/compat/.gitattributes
@@ -1 +1,2 @@
 /zlib-uncompress2.c	whitespace=-indent-with-non-tab,-trailing-space
+/mimalloc/**/*	whitespace=-trailing-space
diff --git a/compat/posix.h b/compat/posix.h
index 6a137480d93eef..18a01b19bc976a 100644
--- a/compat/posix.h
+++ b/compat/posix.h
@@ -176,6 +176,16 @@ typedef unsigned long uintptr_t;
 #define _ALL_SOURCE 1
 #endif
 
+#ifdef USE_MIMALLOC
+#include "mimalloc.h"
+#define malloc mi_malloc
+#define calloc mi_calloc
+#define realloc mi_realloc
+#define free mi_free
+#define strdup mi_strdup
+#define strndup mi_strndup
+#endif
+
 #ifdef MKDIR_WO_TRAILING_SLASH
 #define mkdir(a,b) compat_mkdir_wo_trailing_slash((a),(b))
 int compat_mkdir_wo_trailing_slash(const char*, mode_t);
diff --git a/config.mak.dev b/config.mak.dev
index e86b6e1b34a2d7..b63797ef509333 100644
--- a/config.mak.dev
+++ b/config.mak.dev
@@ -22,8 +22,10 @@ endif
 
 ifneq ($(uname_S),FreeBSD)
 ifneq ($(or $(filter gcc6,$(COMPILER_FEATURES)),$(filter clang7,$(COMPILER_FEATURES))),)
+ifndef USE_MIMALLOC
 DEVELOPER_CFLAGS += -std=gnu99
 endif
+endif
 else
 # FreeBSD cannot limit to C99 because its system headers unconditionally
 # rely on C11 features.
diff --git a/config.mak.uname b/config.mak.uname
index f00d23b57a9f4c..cada39d2d5b974 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -507,7 +507,7 @@ endif
 	CC = compat/vcbuild/scripts/clink.pl
 	AR = compat/vcbuild/scripts/lib.pl
 	CFLAGS =
-	BASIC_CFLAGS = -nologo -I. -Icompat/vcbuild/include -DWIN32 -D_CONSOLE -DHAVE_STRING_H -D_CRT_SECURE_NO_WARNINGS -D_CRT_NONSTDC_NO_DEPRECATE
+	BASIC_CFLAGS = -nologo -I. -Icompat/vcbuild/include -DWIN32 -D_CONSOLE -DHAVE_STRING_H -D_CRT_SECURE_NO_WARNINGS -D_CRT_NONSTDC_NO_DEPRECATE -MP -std:c11
 	COMPAT_OBJS = compat/msvc.o compat/winansi.o \
 		compat/win32/flush.o \
 		compat/win32/path-utils.o \

From 836a75213ec8d53f7f7c45b1daf0cc9c2f95e6d5 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 24 Jun 2019 23:45:21 +0200
Subject: [PATCH 421/553] mingw: use mimalloc

Thorough benchmarking with repacking a subset of linux.git (the commit
history reachable from 93a6fefe2f ([PATCH] fix the SYSCTL=n compilation,
2007-02-28), to be precise) suggests that this allocator is on par with
nedmalloc, and in multi-threaded situations maybe even better:

`git repack -adfq` with mimalloc, 8 threads:

31.166991900 27.576763800 28.712311000 27.373859000 27.163141900

`git repack -adfq` with nedmalloc, 8 threads:

31.915032900 27.149883100 28.244933700 27.240188800 28.580849500

In a different test using GitHub Actions build agents (probably
single-threaded, a core strength of nedmalloc):

`git repack -q -d -l -A --unpack-unreachable=2.weeks.ago` with mimalloc:

943.426 978.500 939.709 959.811 954.605

`git repack -q -d -l -A --unpack-unreachable=2.weeks.ago` with nedmalloc:

995.383 952.179 943.253 963.043 980.468

While these measurements were not executed with complete scientific
rigor, as no hardware was set aside specifically for these benchmarks,
they show that mimalloc and nedmalloc perform almost the same, nedmalloc
with a bit higher variance and also slightly higher average (further
testing suggests that nedmalloc performs worse in multi-threaded
situations than in single-threaded ones).
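
As a rough cross-check of that reading, the averages and sample
variances can be recomputed from the timings quoted above; the helper
names `mean()` and `sample_variance()` are only for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Timings copied from the benchmark runs quoted in this commit message. */
static const double mimalloc_8_threads[] = {
    31.166991900, 27.576763800, 28.712311000, 27.373859000, 27.163141900
};
static const double nedmalloc_8_threads[] = {
    31.915032900, 27.149883100, 28.244933700, 27.240188800, 28.580849500
};
static const double mimalloc_actions[] = {
    943.426, 978.500, 939.709, 959.811, 954.605
};
static const double nedmalloc_actions[] = {
    995.383, 952.179, 943.253, 963.043, 980.468
};

/* Arithmetic mean of `n` samples. */
static double mean(const double *v, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(n > 0);
    for (i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Unbiased sample variance (divides by n - 1). */
static double sample_variance(const double *v, size_t n)
{
    double m, sum = 0.0;
    size_t i;

    assert(n > 1);
    m = mean(v, n);
    for (i = 0; i < n; i++)
        sum += (v[i] - m) * (v[i] - m);
    return sum / (double)(n - 1);
}
```

With five runs per configuration, this is a sanity check rather than a
statistically meaningful comparison.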

In short: mimalloc seems to be slightly better suited for our purposes
than nedmalloc.

Seeing that mimalloc is developed actively, while nedmalloc has not
seen any updates in eight years, let's use mimalloc on Windows instead.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.mak.uname | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/config.mak.uname b/config.mak.uname
index cada39d2d5b974..c3d77a123b760e 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -755,9 +755,7 @@ ifeq ($(uname_S),MINGW)
 	HAVE_LIBCHARSET_H = YesPlease
 	USE_GETTEXT_SCHEME = fallthrough
 	USE_LIBPCRE = YesPlease
-        ifneq (CLANGARM64,$(MSYSTEM))
-		USE_NED_ALLOCATOR = YesPlease
-        endif
+	USE_MIMALLOC = YesPlease
 	NO_PYTHON =
         ifeq (/mingw64,$(subst 32,64,$(subst clangarm,mingw,$(prefix))))
 		# Move system config into top-level /etc/

From b9da4f898348670643b79abf73179d686942a53f Mon Sep 17 00:00:00 2001
From: Philip Oakley <philipoakley@iee.email>
Date: Fri, 12 Nov 2021 21:14:50 +0000
Subject: [PATCH 422/553] object-file.c: use size_t for header lengths

Continue walking the code path for the >4GB `hash-object --literally`
test. The `hash_object_file_literally()` function internally uses both
`hash_object_file()` and `write_object_file_prepare()`. Both function
signatures use `unsigned long` rather than `size_t` for the mem buffer
sizes. Use `size_t` instead, for LLP64 compatibility.
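
To illustrate the problem, here is a sketch of what happens to a 5GB
length when it passes through a 32-bit `unsigned long`, as on LLP64.
`llp64_ulong` and `length_seen_via_ulong()` are hypothetical names, and
`uint32_t` stands in for `unsigned long` so that the effect can be
reproduced on any platform:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for `unsigned long` under LLP64, where it is 32 bits wide. */
typedef uint32_t llp64_ulong;

static_assert(sizeof(llp64_ulong) == 4, "stand-in must be 32 bits wide");

/*
 * Return the length a callee would see if a 64-bit length were passed
 * through a parameter of type `unsigned long` on an LLP64 platform.
 */
static uint64_t length_seen_via_ulong(uint64_t len)
{
    llp64_ulong truncated = (llp64_ulong)len; /* keeps the low 32 bits */

    return truncated;
}
```

For a 5GB buffer the callee would see only 1GB, so the hash would be
computed over a fraction of the data without any error being reported.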

While at it, convert those functions' object header buffer length to
`size_t` for consistency. The value is already upcast to `uintmax_t` for
print format compatibility.

Note: The hash-object test still does not pass. A subsequent commit
continues to walk the call tree's lower level hash functions to identify
further fixes.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 object-file.c | 14 +++++++-------
 object-file.h |  4 ++--
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/object-file.c b/object-file.c
index 6280e42f3412c3..5615908967b75f 100644
--- a/object-file.c
+++ b/object-file.c
@@ -509,7 +509,7 @@ int odb_source_loose_read_object_info(struct odb_source *source,
 static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_ctx *c,
 			     const void *buf, unsigned long len,
 			     struct object_id *oid,
-			     char *hdr, int *hdrlen)
+			     char *hdr, size_t *hdrlen)
 {
 	algo->init_fn(c);
 	git_hash_update(c, hdr, *hdrlen);
@@ -518,9 +518,9 @@ static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_c
 }
 
 static void write_object_file_prepare(const struct git_hash_algo *algo,
-				      const void *buf, unsigned long len,
+				      const void *buf, size_t len,
 				      enum object_type type, struct object_id *oid,
-				      char *hdr, int *hdrlen)
+				      char *hdr, size_t *hdrlen)
 {
 	struct git_hash_ctx c;
 
@@ -663,11 +663,11 @@ int finalize_object_file_flags(struct repository *repo,
 }
 
 void hash_object_file(const struct git_hash_algo *algo, const void *buf,
-		      unsigned long len, enum object_type type,
+		      size_t len, enum object_type type,
 		      struct object_id *oid)
 {
 	char hdr[MAX_HEADER_LEN];
-	int hdrlen = sizeof(hdr);
+	size_t hdrlen = sizeof(hdr);
 
 	write_object_file_prepare(algo, buf, len, type, oid, hdr, &hdrlen);
 }
@@ -1108,7 +1108,7 @@ int odb_source_loose_write_stream(struct odb_source *source,
 }
 
 int odb_source_loose_write_object(struct odb_source *source,
-				  const void *buf, unsigned long len,
+				  const void *buf, size_t len,
 				  enum object_type type, struct object_id *oid,
 				  struct object_id *compat_oid_in, unsigned flags)
 {
@@ -1116,7 +1116,7 @@ int odb_source_loose_write_object(struct odb_source *source,
 	const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo;
 	struct object_id compat_oid;
 	char hdr[MAX_HEADER_LEN];
-	int hdrlen = sizeof(hdr);
+	size_t hdrlen = sizeof(hdr);
 
 	/* Generate compat_oid */
 	if (compat) {
diff --git a/object-file.h b/object-file.h
index 1229d5f675b44a..a5b7f9aa5b9964 100644
--- a/object-file.h
+++ b/object-file.h
@@ -65,7 +65,7 @@ int odb_source_loose_freshen_object(struct odb_source *source,
 				    const struct object_id *oid);
 
 int odb_source_loose_write_object(struct odb_source *source,
-				  const void *buf, unsigned long len,
+				  const void *buf, size_t len,
 				  enum object_type type, struct object_id *oid,
 				  struct object_id *compat_oid_in, unsigned flags);
 
@@ -177,7 +177,7 @@ int finalize_object_file_flags(struct repository *repo,
 			       enum finalize_object_file_flags flags);
 
 void hash_object_file(const struct git_hash_algo *algo, const void *buf,
-		      unsigned long len, enum object_type type,
+		      size_t len, enum object_type type,
 		      struct object_id *oid);
 
 /* Helper to check and "touch" a file */

From 840b7e45a49cba7ccff11b720fd853ab8438cd1e Mon Sep 17 00:00:00 2001
From: Philip Oakley <philipoakley@iee.email>
Date: Fri, 12 Nov 2021 21:16:51 +0000
Subject: [PATCH 423/553] hash algorithms: use size_t for section lengths

Continue walking the code path for the >4GB `hash-object --literally`
test to the hash algorithm step for LLP64 systems.

This patch lets the SHA1DC code use `size_t`, making it compatible with
LLP64 data models (as used e.g. by Windows).

The interested reader of this patch will note that we adjust the
signature of the `git_SHA1DCUpdate()` function without updating _any_
call site. This certainly puzzled at least one reviewer already, so here
is an explanation:

This function is never called directly, but always via the macro
`platform_SHA1_Update`, which is usually called via the macro
`git_SHA1_Update`. However, we never call `git_SHA1_Update()` directly
in `struct git_hash_algo`. Instead, we call `git_hash_sha1_update()`,
which is defined thusly:

    static void git_hash_sha1_update(git_hash_ctx *ctx,
                                     const void *data, size_t len)
    {
        git_SHA1_Update(&ctx->sha1, data, len);
    }

i.e. it contains an implicit downcast from `size_t` to `unsigned long`
(before this here patch). With this patch, there is no downcast anymore.

With this patch, finally, the t1007-hash-object.sh "files over 4GB hash
literally" test case is fixed.
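
The chunking that `git_SHA1DCUpdate()` performs to bridge `size_t` and
sha1dc's `int` length can be sketched as follows. This is a simplified
illustration, not the code in sha1dc_git.c: `update_in_chunks()`,
`byte_counter`, `count_bytes()` and the `max_chunk` parameter (which
stands in for `INT_MAX` to keep the example cheap to run) are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Backend whose length parameter is a plain int, like SHA1DCUpdate(). */
typedef void (*int_len_update_fn)(void *ctx, const char *data, int len);

/* Feed `len` bytes to `update` in pieces of at most `max_chunk` bytes. */
static void update_in_chunks(int_len_update_fn update, void *ctx,
                             const void *vdata, size_t len, int max_chunk)
{
    const char *data = vdata;

    assert(max_chunk > 0);
    while (len > (size_t)max_chunk) {
        update(ctx, data, max_chunk);
        data += max_chunk;
        len -= (size_t)max_chunk;
    }
    /* The remainder is now guaranteed to fit into an int. */
    update(ctx, data, (int)len);
}

/* Test double: counts bytes and calls instead of hashing anything. */
struct byte_counter {
    size_t bytes;
    size_t calls;
};

static void count_bytes(void *ctx, const char *data, int len)
{
    struct byte_counter *counter = ctx;

    (void)data;
    counter->bytes += (size_t)len;
    counter->calls++;
}
```

The loop only works as intended if `len` itself is wide enough to hold
the full size, which is what the switch to `size_t` ensures.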

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 object-file.c          | 4 ++--
 sha1dc_git.c           | 3 +--
 sha1dc_git.h           | 2 +-
 t/t1007-hash-object.sh | 2 +-
 4 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/object-file.c b/object-file.c
index 5615908967b75f..90d2087b19bb46 100644
--- a/object-file.c
+++ b/object-file.c
@@ -507,7 +507,7 @@ int odb_source_loose_read_object_info(struct odb_source *source,
 }
 
 static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_ctx *c,
-			     const void *buf, unsigned long len,
+			     const void *buf, size_t len,
 			     struct object_id *oid,
 			     char *hdr, size_t *hdrlen)
 {
@@ -527,7 +527,7 @@ static void write_object_file_prepare(const struct git_hash_algo *algo,
 	/* Generate the header */
 	*hdrlen = format_object_header(hdr, *hdrlen, type, len);
 
-	/* Sha1.. */
+	/* Hash (function pointers) computation */
 	hash_object_body(algo, &c, buf, len, oid, hdr, hdrlen);
 }
 
diff --git a/sha1dc_git.c b/sha1dc_git.c
index 9b675a046ee699..fe58d7962a30c9 100644
--- a/sha1dc_git.c
+++ b/sha1dc_git.c
@@ -27,10 +27,9 @@ void git_SHA1DCFinal(unsigned char hash[20], SHA1_CTX *ctx)
 /*
  * Same as SHA1DCUpdate, but adjust types to match git's usual interface.
  */
-void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, unsigned long len)
+void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, size_t len)
 {
 	const char *data = vdata;
-	/* We expect an unsigned long, but sha1dc only takes an int */
 	while (len > INT_MAX) {
 		SHA1DCUpdate(ctx, data, INT_MAX);
 		data += INT_MAX;
diff --git a/sha1dc_git.h b/sha1dc_git.h
index f6f880cabea382..0bcf1aa84b7241 100644
--- a/sha1dc_git.h
+++ b/sha1dc_git.h
@@ -15,7 +15,7 @@ void git_SHA1DCInit(SHA1_CTX *);
 #endif
 
 void git_SHA1DCFinal(unsigned char [20], SHA1_CTX *);
-void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, unsigned long len);
+void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, size_t len);
 
 #define platform_SHA_IS_SHA1DC /* used by "test-tool sha1-is-sha1dc" */
 
diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index 7867fd1dbf940c..10382a815e4c14 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -261,7 +261,7 @@ test_expect_success '--stdin outside of repository (uses default hash)' '
 	test_cmp expect actual
 '
 
-test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
 		'files over 4GB hash literally' '
 	test-tool genzeros $((5*1024*1024*1024)) >big &&
 	test_oid large5GB >expect &&

From c8273e76eb84882636bf13199a3cbe287744c47f Mon Sep 17 00:00:00 2001
From: Philip Oakley <philipoakley@iee.email>
Date: Mon, 6 Dec 2021 22:26:50 +0000
Subject: [PATCH 424/553] hash-object --stdin: verify that it works with
 >4GB/LLP64

Just like the `hash-object --literally` code path, the `--stdin` code
path also needs to use `size_t` instead of `unsigned long` to represent
memory sizes, otherwise it would cause problems on platforms using the
LLP64 data model (such as Windows).

To limit the scope of the test case, the object is explicitly not
written to the object store, nor are any filters applied.

The `big` file from the previous test case is reused to save setup time.
To avoid relying on that side effect, it is generated if it does not
exist (e.g. when running via `sh t1007-*.sh --long --run=1,41`).

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t1007-hash-object.sh | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index 10382a815e4c14..59efee3affcff4 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -269,4 +269,12 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
 	test_cmp expect actual
 '
 
+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+		'files over 4GB hash correctly via --stdin' '
+	{ test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
+	test_oid large5GB >expect &&
+	git hash-object --stdin <big >actual &&
+	test_cmp expect actual
+'
+
 test_done

From 8f2446647d0c8450c079e664c611acccb77e8f3a Mon Sep 17 00:00:00 2001
From: Philip Oakley <philipoakley@iee.email>
Date: Mon, 6 Dec 2021 22:42:46 +0000
Subject: [PATCH 425/553] hash-object: add another >4GB/LLP64 test case

To complement the `--stdin` and `--literally` test cases that verify
that we can hash files larger than 4GB on 64-bit platforms using the
LLP64 data model, here is a test case that exercises `hash-object`
_without_ any options.

Just as before, we use the `big` file from the previous test case if it
exists to save on setup time, otherwise generate it.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t1007-hash-object.sh | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index 59efee3affcff4..f2722380ee1436 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -277,4 +277,12 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
 	test_cmp expect actual
 '
 
+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+		'files over 4GB hash correctly' '
+	{ test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
+	test_oid large5GB >expect &&
+	git hash-object -- big >actual &&
+	test_cmp expect actual
+'
+
 test_done

From b8c422adedb398570b27889ed8a93539ce912804 Mon Sep 17 00:00:00 2001
From: Philip Oakley <philipoakley@iee.email>
Date: Tue, 7 Dec 2021 09:53:41 +0000
Subject: [PATCH 426/553] hash-object: add a >4GB/LLP64 test case using
 filtered input

To verify that the `clean` side of the `clean`/`smudge` filter code is
correct with regard to LLP64 (read: to ensure that `size_t` is used
instead of `unsigned long`), here is a test case using a trivial filter,
specifically _not_ writing anything to the object store to limit the
scope of the test case.

As in previous commits, the `big` file from previous test cases is
reused if available, to save setup time, otherwise re-generated.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t1007-hash-object.sh | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index f2722380ee1436..841a6671d1a3c1 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -285,4 +285,16 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
 	test_cmp expect actual
 '
 
+# This clean filter does nothing, other than exercising the interface.
+# We ensure that cleaning doesn't mangle large files on 64-bit Windows.
+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+		'hash filtered files over 4GB correctly' '
+	{ test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
+	test_oid large5GB >expect &&
+	test_config filter.null-filter.clean "cat" &&
+	echo "big filter=null-filter" >.gitattributes &&
+	git hash-object -- big >actual &&
+	test_cmp expect actual
+'
+
 test_done

From e9b7d365cede8983fcf96c915f230de5fcf75f70 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 27 Jan 2023 08:55:21 +0100
Subject: [PATCH 427/553] windows: skip linking `git-<command>` for built-ins

It is merely a historical wart that, say, `git-commit` exists in the
`libexec/git-core/` directory, a tribute to the original idea to let Git
be essentially a bunch of Unix shell scripts revolving around very few
"plumbing" (AKA low-level) commands.

Git has evolved a lot from there. These days, most of Git's
functionality is contained within the `git` executable, in the form of
"built-in" commands.

To accommodate scripts that use the "dashed" form of Git commands,
even today, Git provides hard-links that make the `git` executable
available as, say, `git-commit`, just in case an old script has not
been updated to invoke `git commit`.

Those hard-links do not come cheap: they take about half a minute for
every build of Git on Windows, and some Windows Explorer versions that
do not understand hard-links mistake them for taking up huge amounts of
space, which is why many a "bug" report had to be addressed.

The "dashed form" has been officially deprecated in Git version 1.5.4,
which was released on February 2nd, 2008, i.e. a very long time ago.
This deprecation was never finalized by skipping these hard-links, but
we can start the process now, in Git for Windows.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.mak.uname | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/config.mak.uname b/config.mak.uname
index c3d77a123b760e..00f1b3d2b668fa 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -498,6 +498,7 @@ ifeq ($(uname_S),Windows)
 	NO_POSIX_GOODIES = UnfortunatelyYes
 	NATIVE_CRLF = YesPlease
 	DEFAULT_HELP_FORMAT = html
+	SKIP_DASHED_BUILT_INS = YabbaDabbaDoo
 ifeq (/mingw64,$(subst 32,64,$(subst clangarm,mingw,$(prefix))))
 	# Move system config into top-level /etc/
 	ETC_GITCONFIG = ../etc/gitconfig
@@ -690,6 +691,7 @@ ifeq ($(uname_S),MINGW)
 	FSMONITOR_DAEMON_BACKEND = win32
 	FSMONITOR_OS_SETTINGS = win32
 
+	SKIP_DASHED_BUILT_INS = YabbaDabbaDoo
 	RUNTIME_PREFIX = YesPlease
 	HAVE_WPGMPTR = YesWeDo
 	NO_ST_BLOCKS_IN_STRUCT_STAT = YesPlease

From 7766c21b4f67516fb0ac824a101795ccd48bb95d Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sun, 23 Nov 2025 11:11:01 +0100
Subject: [PATCH 428/553] mingw: stop hard-coding `CC = gcc`

This is no longer true in general, now that Clang is supported out of
the box.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.mak.uname | 1 -
 1 file changed, 1 deletion(-)

diff --git a/config.mak.uname b/config.mak.uname
index 00f1b3d2b668fa..c2f6d11a2ff1e1 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -747,7 +747,6 @@ ifeq ($(uname_S),MINGW)
 		COMPAT_CFLAGS += -D_USE_32BIT_TIME_T
 		BASIC_LDFLAGS += -Wl,--large-address-aware
         endif
-	CC = gcc
 	COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \
 		-fstack-protector-strong
 	EXTLIBS += -lntdll

From 649d9914e6902da7ed3632b0407ac89e0f648574 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 21 Nov 2025 12:15:12 +0100
Subject: [PATCH 429/553] mingw: drop the -D_USE_32BIT_TIME_T option

This option was added in fa93bb20d72 (MinGW: Fix stat definitions to
work with MinGW runtime version 4.0, 2013-09-11), i.e. a _long_ time
ago. So long, in fact, that it still targeted MinGW. But we switched to
mingw-w64 in 2015, which seems not to share the problem, and therefore
does not require a fix.

Even worse: This flag is incompatible with UCRT64, which we are about to
support by way of upstreaming `mingw-w64-git` to the MSYS2 project, see
https://github.com/msys2/MINGW-packages/pull/26470 for details.

So let's send that option into its well-deserved retirement.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.mak.uname | 1 -
 1 file changed, 1 deletion(-)

diff --git a/config.mak.uname b/config.mak.uname
index c2f6d11a2ff1e1..a57543839a8314 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -744,7 +744,6 @@ ifeq ($(uname_S),MINGW)
 		HOST_CPU = aarch64
 		BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup
         else
-		COMPAT_CFLAGS += -D_USE_32BIT_TIME_T
 		BASIC_LDFLAGS += -Wl,--large-address-aware
         endif
 	COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \

From 7f1f795e09de51e7ab1e4579b6b5ffea28a36f83 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 21 Nov 2025 12:38:21 +0100
Subject: [PATCH 430/553] mingw: only use -Wl,--large-address-aware for 32-bit
 builds

That option only matters there, and is in fact only really understood in
those builds; UCRT64 versions of GCC, for example, do not know what to
do with that option.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.mak.uname | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/config.mak.uname b/config.mak.uname
index a57543839a8314..02aa37576113e1 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -733,9 +733,8 @@ ifeq ($(uname_S),MINGW)
         ifeq (MINGW32,$(MSYSTEM))
 		prefix = /mingw32
 		HOST_CPU = i686
-		BASIC_LDFLAGS += -Wl,--pic-executable,-e,_mainCRTStartup
-        endif
-        ifeq (MINGW64,$(MSYSTEM))
+		BASIC_LDFLAGS += -Wl,--pic-executable,-e,_mainCRTStartup -Wl,--large-address-aware
+        else ifeq (MINGW64,$(MSYSTEM))
 		prefix = /mingw64
 		HOST_CPU = x86_64
 		BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup
@@ -744,7 +743,6 @@ ifeq ($(uname_S),MINGW)
 		HOST_CPU = aarch64
 		BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup
         else
-		BASIC_LDFLAGS += -Wl,--large-address-aware
         endif
 	COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \
 		-fstack-protector-strong

From 6a316dac66c882a74290b6b2818a7f25181ec2c5 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 21 Nov 2025 13:44:56 +0100
Subject: [PATCH 431/553] mingw: avoid over-specifying `--pic-executable`

In bf2d5d8239e (Don't let ld strip relocations, 2016-01-16) (picked from
https://github.com/git-for-windows/git/pull/612/commits/6a237925bf10),
Git for Windows introduced the `-Wl,--pic-executable` flag, specifying
the exact entry point via `-e`. This required discerning between i686
and x86_64 code because the former required the symbol to be prefixed
with an underscore, the latter did not.

As per https://sourceware.org/bugzilla/show_bug.cgi?id=10865, the
specified symbols are already the default, though.

So let's drop the overly-specific definition.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.mak.uname | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/config.mak.uname b/config.mak.uname
index 02aa37576113e1..51e380d19506a1 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -733,15 +733,15 @@ ifeq ($(uname_S),MINGW)
         ifeq (MINGW32,$(MSYSTEM))
 		prefix = /mingw32
 		HOST_CPU = i686
-		BASIC_LDFLAGS += -Wl,--pic-executable,-e,_mainCRTStartup -Wl,--large-address-aware
+		BASIC_LDFLAGS += -Wl,--pic-executable -Wl,--large-address-aware
         else ifeq (MINGW64,$(MSYSTEM))
 		prefix = /mingw64
 		HOST_CPU = x86_64
-		BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup
+		BASIC_LDFLAGS += -Wl,--pic-executable
         else ifeq (CLANGARM64,$(MSYSTEM))
 		prefix = /clangarm64
 		HOST_CPU = aarch64
-		BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup
+		BASIC_LDFLAGS += -Wl,--pic-executable
         else
         endif
 	COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \

From 3e55d277143c0d1f5b32d28a46be4cdf2dd42ab7 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 21 Nov 2025 13:53:19 +0100
Subject: [PATCH 432/553] mingw: set the prefix and HOST_CPU as per MSYS2's
 settings

MSYS2 already defines a couple of helpful environment variables, and we
can use those to infer the installation location as well as the CPU. No
need for hard-coding ;-)

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.mak.uname | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/config.mak.uname b/config.mak.uname
index 51e380d19506a1..474d286f99a336 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -730,19 +730,13 @@ ifeq ($(uname_S),MINGW)
         ifneq (,$(findstring -O,$(filter-out -O0 -Og,$(CFLAGS))))
 		BASIC_LDFLAGS += -Wl,--dynamicbase
         endif
-        ifeq (MINGW32,$(MSYSTEM))
-		prefix = /mingw32
-		HOST_CPU = i686
-		BASIC_LDFLAGS += -Wl,--pic-executable -Wl,--large-address-aware
-        else ifeq (MINGW64,$(MSYSTEM))
-		prefix = /mingw64
-		HOST_CPU = x86_64
-		BASIC_LDFLAGS += -Wl,--pic-executable
-        else ifeq (CLANGARM64,$(MSYSTEM))
-		prefix = /clangarm64
-		HOST_CPU = aarch64
+        ifneq (,$(MSYSTEM))
+		prefix = $(MINGW_PREFIX)
+		HOST_CPU = $(patsubst %-w64-mingw32,%,$(MINGW_CHOST))
 		BASIC_LDFLAGS += -Wl,--pic-executable
-        else
+                ifeq (MINGW32,$(MSYSTEM))
+			BASIC_LDFLAGS += -Wl,--large-address-aware
+                endif
         endif
 	COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \
 		-fstack-protector-strong

From 8b4d819164b848a032bb6e84d16e6df81bdfe448 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 21 Nov 2025 14:09:40 +0100
Subject: [PATCH 433/553] mingw: only enable the MSYS2-specific stuff when
 compiling in MSYS2

The tell-tale is the presence of the `MSYSTEM` value while compiling, of
course. In that case, we want to ensure that `MSYSTEM` is set when
running `git.exe`, and also enable the magic MSYS2 tty detection.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.mak.uname | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/config.mak.uname b/config.mak.uname
index 474d286f99a336..ee734abc59d3c2 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -734,12 +734,12 @@ ifeq ($(uname_S),MINGW)
 		prefix = $(MINGW_PREFIX)
 		HOST_CPU = $(patsubst %-w64-mingw32,%,$(MINGW_CHOST))
 		BASIC_LDFLAGS += -Wl,--pic-executable
+		COMPAT_CFLAGS += -DDETECT_MSYS_TTY
                 ifeq (MINGW32,$(MSYSTEM))
 			BASIC_LDFLAGS += -Wl,--large-address-aware
                 endif
         endif
-	COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \
-		-fstack-protector-strong
+	COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -fstack-protector-strong
 	EXTLIBS += -lntdll
 	EXTRA_PROGRAMS += headless-git$X
 	INSTALL = /bin/install

From 06a6b287b0b5f78f04fb9516e12f126c47dfcc11 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 21 Nov 2025 14:17:24 +0100
Subject: [PATCH 434/553] mingw: rely on MSYS2's metadata instead of
 hard-coding it

MSYS2 defines some helpful environment variables, e.g. `MSYSTEM`. There
is code in Git for Windows to ensure that the `MSYSTEM` variable is
set, hard-coding a default.

However, the existing solution jumps through hoops to reconstruct the
proper default, and is even incomplete doing so, as we found out when we
extended it to support CLANGARM64.

This is absolutely unnecessary because there is already a perfectly
valid `MSYSTEM` value we can use at build time. This is even true when
building the MINGW32 variant on a MINGW64 system because `makepkg-mingw`
will override the `MSYSTEM` value as per the `MINGW_ARCH` array.

The same is equally true for the `/mingw64`, `/mingw32` and
`/clangarm64` prefix: those values are already available via the
`MINGW_PREFIX` environment variable, and we just need to pass that
setting through.

Only when `MINGW_PREFIX` is not set (as is the case in Git for Windows'
minimal SDK, where only `MSYSTEM` is guaranteed to be set correctly) do
we fall back to the top-level directory whose name is the down-cased
value of the `MSYSTEM` variable.
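
As a rough illustration only (this is not code from the patch), the
fall-back rule can be sketched in shell. The helper name `derive_prefix`
is hypothetical; it takes the `MSYSTEM` value and a possibly empty
`MINGW_PREFIX` value as arguments:

```shell
# Hypothetical helper, for illustration only.
# $1 = MSYSTEM value, $2 = MINGW_PREFIX value (may be empty).
derive_prefix () {
	case "$2" in
	/*)
		# Starts with a slash: pass the existing value through.
		printf '%s\n' "$2"
		;;
	*)
		# Empty, or not starting with a slash: use the
		# down-cased MSYSTEM value as top-level directory.
		printf '/%s\n' "$(printf '%s' "$1" | tr A-Z a-z)"
		;;
	esac
}

derive_prefix UCRT64 ""         # prints /ucrt64
derive_prefix MINGW64 /mingw64  # prints /mingw64
```

This mirrors the condition in `config.mak.uname`, which overrides
`MINGW_PREFIX` when it is empty or does not start with a slash.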

Incidentally, this also broadens the support to all the configurations
supported by the MSYS2 project, i.e. clang64 & ucrt64, too.

Note: This keeps the same, hard-coded MSYSTEM platform support for CMake
as before, but drops it for Meson (because it is unclear how Meson could
do this in a more flexible manner).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.mak.uname                    | 14 ++++++--------
 contrib/buildsystems/CMakeLists.txt |  9 ++++++++-
 meson.build                         | 13 ++++++++++++-
 meson_options.txt                   |  4 ++++
 4 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/config.mak.uname b/config.mak.uname
index ee734abc59d3c2..35a15a6c9963bd 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -438,14 +438,8 @@ ifeq ($(uname_S),Windows)
 	GIT_VERSION := $(GIT_VERSION).MSVC
 	pathsep = ;
 	# Assume that this is built in Git for Windows' SDK
-        ifeq (MINGW32,$(MSYSTEM))
-		prefix = /mingw32
-        else
-                ifeq (CLANGARM64,$(MSYSTEM))
-			prefix = /clangarm64
-                else
-			prefix = /mingw64
-                endif
+        ifneq (,$(MSYSTEM))
+		prefix = $(MINGW_PREFIX)
         endif
 	# Prepend MSVC 64-bit tool-chain to PATH.
 	#
@@ -731,6 +725,10 @@ ifeq ($(uname_S),MINGW)
 		BASIC_LDFLAGS += -Wl,--dynamicbase
         endif
         ifneq (,$(MSYSTEM))
+                ifeq ($(MINGW_PREFIX),$(filter-out /%,$(MINGW_PREFIX)))
+			# Override if empty or does not start with a slash
+			MINGW_PREFIX := /$(shell echo '$(MSYSTEM)' | tr A-Z a-z)
+                endif
 		prefix = $(MINGW_PREFIX)
 		HOST_CPU = $(patsubst %-w64-mingw32,%,$(MINGW_CHOST))
 		BASIC_LDFLAGS += -Wl,--pic-executable
diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index 28877feb9d1707..48c3feadde3c9b 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -256,7 +256,14 @@ if(CMAKE_SYSTEM_NAME STREQUAL "Windows")
 				_CONSOLE DETECT_MSYS_TTY STRIP_EXTENSION=".exe"  NO_SYMLINK_HEAD UNRELIABLE_FSTAT
 				NOGDI OBJECT_CREATION_MODE=1 __USE_MINGW_ANSI_STDIO=0
 				USE_NED_ALLOCATOR OVERRIDE_STRDUP MMAP_PREVENTS_DELETE USE_WIN32_MMAP
-				HAVE_WPGMPTR ENSURE_MSYSTEM_IS_SET HAVE_RTLGENRANDOM)
+				HAVE_WPGMPTR HAVE_RTLGENRANDOM)
+	if(CMAKE_GENERATOR_PLATFORM STREQUAL "x64")
+		add_compile_definitions(ENSURE_MSYSTEM_IS_SET="MINGW64" MINGW_PREFIX="mingw64")
+	elseif(CMAKE_GENERATOR_PLATFORM STREQUAL "arm64")
+		add_compile_definitions(ENSURE_MSYSTEM_IS_SET="CLANGARM64" MINGW_PREFIX="clangarm64")
+	elseif(CMAKE_GENERATOR_PLATFORM STREQUAL "x86")
+		add_compile_definitions(ENSURE_MSYSTEM_IS_SET="MINGW32" MINGW_PREFIX="mingw32")
+	endif()
 	list(APPEND compat_SOURCES
 		compat/mingw.c
 		compat/winansi.c
diff --git a/meson.build b/meson.build
index dd52efd1c87574..e7e39045da9468 100644
--- a/meson.build
+++ b/meson.build
@@ -1268,7 +1268,6 @@ elif host_machine.system() == 'windows'
 
   libgit_c_args += [
     '-DDETECT_MSYS_TTY',
-    '-DENSURE_MSYSTEM_IS_SET',
     '-DNATIVE_CRLF',
     '-DNOGDI',
     '-DNO_POSIX_GOODIES',
@@ -1278,6 +1277,18 @@ elif host_machine.system() == 'windows'
     '-D__USE_MINGW_ANSI_STDIO=0',
   ]
 
+  msystem = get_option('msystem')
+  if msystem != ''
+    mingw_prefix = get_option('mingw_prefix')
+    if mingw_prefix == ''
+      mingw_prefix = '/' + msystem.to_lower()
+    endif
+    libgit_c_args += [
+      '-DENSURE_MSYSTEM_IS_SET="' + msystem + '"',
+      '-DMINGW_PREFIX="' + mingw_prefix + '"'
+    ]
+  endif
+
   libgit_dependencies += compiler.find_library('ntdll')
   libgit_include_directories += 'compat/win32'
   if compiler.get_id() == 'msvc'
diff --git a/meson_options.txt b/meson_options.txt
index e0be260ae1bce8..c2d9f0bfc0c2fb 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -21,6 +21,10 @@ option('runtime_prefix', type: 'boolean', value: false,
   description: 'Resolve ancillary tooling and support files relative to the location of the runtime binary instead of hard-coding them into the binary.')
 option('sane_tool_path', type: 'array', value: [],
   description: 'An array of paths to pick up tools from in case the normal tools are broken or lacking.')
+option('msystem', type: 'string', value: '',
+  description: 'Fall-back on Windows when MSYSTEM is not set.')
+option('mingw_prefix', type: 'string', value: '',
+  description: 'Fall-back on Windows when MINGW_PREFIX is not set.')
 
 # Build information compiled into Git and other parts like documentation.
 option('build_date', type: 'string', value: '',

From a69ae8d13e8ab56adaa08d7dbc31051da874887e Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 21 Nov 2025 14:45:45 +0100
Subject: [PATCH 435/553] mingw: always define `ETC_*` for MSYS2 environments

Special-casing even more configurations simply does not make sense.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.mak.uname | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/config.mak.uname b/config.mak.uname
index 35a15a6c9963bd..3928bc95fa9ea1 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -493,7 +493,7 @@ ifeq ($(uname_S),Windows)
 	NATIVE_CRLF = YesPlease
 	DEFAULT_HELP_FORMAT = html
 	SKIP_DASHED_BUILT_INS = YabbaDabbaDoo
-ifeq (/mingw64,$(subst 32,64,$(subst clangarm,mingw,$(prefix))))
+ifneq (,$(MINGW_PREFIX))
 	# Move system config into top-level /etc/
 	ETC_GITCONFIG = ../etc/gitconfig
 	ETC_GITATTRIBUTES = ../etc/gitattributes
@@ -736,6 +736,9 @@ ifeq ($(uname_S),MINGW)
                 ifeq (MINGW32,$(MSYSTEM))
 			BASIC_LDFLAGS += -Wl,--large-address-aware
                 endif
+		# Move system config into top-level /etc/
+		ETC_GITCONFIG = ../etc/gitconfig
+		ETC_GITATTRIBUTES = ../etc/gitattributes
         endif
 	COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -fstack-protector-strong
 	EXTLIBS += -lntdll
@@ -747,11 +750,6 @@ ifeq ($(uname_S),MINGW)
 	USE_LIBPCRE = YesPlease
 	USE_MIMALLOC = YesPlease
 	NO_PYTHON =
-        ifeq (/mingw64,$(subst 32,64,$(subst clangarm,mingw,$(prefix))))
-		# Move system config into top-level /etc/
-		ETC_GITCONFIG = ../etc/gitconfig
-		ETC_GITATTRIBUTES = ../etc/gitattributes
-        endif
 endif
 ifeq ($(uname_S),QNX)
 	COMPAT_CFLAGS += -DSA_RESTART=0

From 1129fe0151680128f6dd078bb26c304feae9e65c Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 21 Nov 2025 15:15:42 +0100
Subject: [PATCH 436/553] max_tree_depth: lower it for clang builds in general
 on Windows

In 436a42215e5 (max_tree_depth: lower it for clangarm64 on Windows,
2025-04-23), I provided a work-around for a nasty issue with clangarm
builds, where the stack is exhausted before the maximal tree depth is
reached, and the resulting error cannot easily be handled by Git
(because it would require Windows-specific handling).

Turns out that this is not at all limited to ARM64. In my tests with
CLANG64 in MSYS2 on the GitHub Actions runners, the test t6700.4 failed
in the exact same way. What's worse: The limit needs to be quite a bit
lower for x86_64 than for aarch64. In aforementioned tests, the breaking
point was 1232: With 1231 it still worked as expected, with 1232 it
would fail with the `STATUS_STACK_OVERFLOW` incorrectly mapped to exit
code 127. For comparison, in my tests on GitHub Actions' Windows/ARM64
runners, the breaking point was 1439 instead.

Therefore the condition needs to be adapted once more, to accommodate
(with some safety margin) both aarch64 and x86_64 in clang-based builds
on Windows, to let that test pass.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 environment.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/environment.c b/environment.c
index a770b5921d9546..ce37dccc03d6a4 100644
--- a/environment.c
+++ b/environment.c
@@ -91,16 +91,22 @@ int max_allowed_tree_depth =
 	 * the stack overflow can occur.
 	 */
 	512;
-#elif defined(GIT_WINDOWS_NATIVE) && defined(__clang__) && defined(__aarch64__)
+#elif defined(GIT_WINDOWS_NATIVE) && defined(__clang__)
 	/*
-	 * Similar to Visual C, it seems that on Windows/ARM64 the clang-based
-	 * builds have a smaller stack space available. When running out of
-	 * that stack space, a `STATUS_STACK_OVERFLOW` is produced. When the
+	 * Similar to Visual C, it seems that clang-based builds on Windows
+	 * have a smaller stack space available. When running out of that
+	 * stack space, a `STATUS_STACK_OVERFLOW` is produced. When the
 	 * Git command was run from an MSYS2 Bash, this unfortunately results
 	 * in an exit code 127. Let's prevent that by lowering the maximal
-	 * tree depth; This value seems to be low enough.
+	 * tree depth; Unfortunately, it seems that the exact limit differs
+	 * for aarch64 vs x86_64, and the difference is too large to simply
+	 * use a single limit.
 	 */
+#if defined(__aarch64__)
 	1280;
+#else
+	1152;
+#endif
 #else
 	2048;
 #endif

From 23a8c951e3c5a090e47ffdfbddeae672bde75f22 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Tue, 21 Feb 2017 13:28:58 +0100
Subject: [PATCH 437/553] mingw: ensure valid CTYPE

A change between versions 2.4.1 and 2.6.0 of the MSYS2 runtime modified
how Cygwin's runtime (and hence Git for Windows' MSYS2 runtime
derivative) handles locales: d16a56306d (Consolidate wctomb/mbtowc calls
for POSIX-1.2008, 2016-07-20).

An unintended side-effect is that "cold-calling" into the POSIX
emulation will start with a locale based on the current code page,
something that Git for Windows is very ill-prepared for, as it expects
to be able to pass a command-line containing non-ASCII characters to the
shell without having those characters munged.

One symptom of this behavior: when `git clone` or `git fetch` shell out
to call `git-upload-pack` with a path that contains non-ASCII
characters, the shell tries to interpret the entire command-line
(including command-line parameters) as executable path, which obviously
must fail.
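
The guard this patch adds to `setup_windows_environment()` can be
sketched in shell as follows. This is an illustration under the
assumption that "set" means "present in the environment, even if empty"
(which is what `getenv()` reports); the helper name `ctype_default` is
hypothetical:

```shell
# Hypothetical helper, for illustration only: prints the CTYPE that
# would be chosen, or nothing if the user already configured a locale.
ctype_default () {
	# ${VAR+set} expands to "set" if VAR is set (even if empty),
	# and to nothing otherwise.
	if test -z "${LC_ALL+set}${LC_CTYPE+set}${LANG+set}"
	then
		echo C.UTF-8
	fi
}

# Typical use: only override when no locale variable is present.
ctype=$(ctype_default)
if test -n "$ctype"
then
	echo "would set LC_CTYPE=$ctype"
fi
```

The point of the design is that an explicit user choice always wins;
the UTF-8 default only kicks in when nothing was configured.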

This fixes https://github.com/git-for-windows/git/issues/1036

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/compat/mingw.c b/compat/mingw.c
index f09b49ff21ddab..88b4d9920539c4 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -2812,6 +2812,9 @@ static void setup_windows_environment(void)
 		if (!tmp && (tmp = getenv("USERPROFILE")))
 			setenv("HOME", tmp, 1);
 	}
+
+	if (!getenv("LC_ALL") && !getenv("LC_CTYPE") && !getenv("LANG"))
+		setenv("LC_CTYPE", "C.UTF-8", 1);
 }
 
 static void get_current_user_sid(PSID *sid, HANDLE *linked_token)

From 939e9198c8fd56456471edf2fac1f4a8f063cb4e Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sat, 1 Feb 2020 00:31:16 +0100
Subject: [PATCH 438/553] mingw: allow `git.exe` to be used instead of the "Git
 wrapper"

Git for Windows wants to add `git.exe` to the users' `PATH`, without
cluttering the latter with unnecessary executables such as `wish.exe`.
To that end, it invented the concept of its "Git wrapper", i.e. a tiny
executable located in `C:\Program Files\Git\cmd\git.exe` (originally a
CMD script) whose sole purpose is to set up a couple of environment
variables and then spawn the _actual_ `git.exe` (which nowadays lives in
`C:\Program Files\Git\mingw64\bin\git.exe` for 64-bit, and the obvious
equivalent for 32-bit installations).

Currently, the following environment variables are set unless already
initialized:

- `MSYSTEM`, to make sure that the MSYS2 Bash and the MSYS2 Perl
  interpreter behave as expected,

- `PLINK_PROTOCOL`, to force PuTTY's `plink.exe` to use the SSH
  protocol instead of Telnet, and

- `PATH`, to make sure that the `bin` folder in the user's home
  directory, as well as the `/mingw64/bin` and the `/usr/bin`
  directories are included. The trick here is that the `/mingw64/bin/`
  and `/usr/bin/` directories are relative to the top-level installation
  directory of Git for Windows (which the included Bash interprets as
  `/`, i.e. as the MSYS pseudo root directory).

Using the absence of `MSYSTEM` as a tell-tale, we can detect in
`git.exe` whether these environment variables have been initialized
properly. Therefore we can call `C:\Program Files\Git\mingw64\bin\git`
in-place after this change, without having to call Git through the Git
wrapper.

Obviously, above-mentioned directories must be _prepended_ to the `PATH`
variable, otherwise we risk picking up executables from unrelated Git
installations. We do that by constructing the new `PATH` value from
scratch, starting with `$HOME/bin` (if `HOME` is set), then adding the
MSYS2 system directories, and finally appending the original `PATH`.
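
The ordering can be sketched in shell like this (an illustration only,
not the C implementation; `build_path` is a hypothetical helper, and for
simplicity an empty argument is treated like an unset variable):

```shell
# Hypothetical helper, for illustration only.
# $1 = HOME (may be empty)
# $2 = system bin directories, each followed by ';' (may be empty)
# $3 = original PATH (may be empty)
build_path () {
	new=
	if test -n "$1"
	then
		# The user's own bin directory comes first.
		new="$1\\bin;"
	fi
	# Then the MSYS2 system directories.
	new="$new$2"
	if test -n "$3"
	then
		# The original PATH goes last, so it cannot shadow ours.
		new="$new$3"
	else
		# No original PATH: drop the trailing separator, if any.
		new="${new%;}"
	fi
	printf '%s\n' "$new"
}

build_path 'C:\Users\me' 'C:\Git\mingw64\bin;C:\Git\usr\bin;' 'C:\Windows'
```

Because the original `PATH` comes last, executables from an unrelated
Git installation further down the list are never picked up first.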

Side note: this modification of the `PATH` variable is independent of
the modification necessary to reach the executables and scripts in
`/mingw64/libexec/git-core/`, i.e. the `GIT_EXEC_PATH`. That
modification is still performed by Git, elsewhere, long after making the
changes described above.

While we _still_ cannot simply hard-link `mingw64\bin\git.exe` to `cmd`
(because the former depends on a couple of `.dll` files that are only in
`mingw64\bin`, i.e. calling `...\cmd\git.exe` would fail to load due to
missing dependencies), at least we can now avoid that extra process of
running the Git wrapper (which then has to wait for the spawned
`git.exe` to finish) by calling `...\mingw64\bin\git.exe` directly, via
its absolute path.

Testing this in Git's test suite is tricky: we set up a "new" MSYS
pseudo-root and copy the `git.exe` file into the appropriate location,
then verify that `MSYSTEM` is set properly, and also that the `PATH` is
modified so that scripts can be found in `$HOME/bin`, `/mingw64/bin/`
and `/usr/bin/`.

This addresses https://github.com/git-for-windows/git/issues/2283

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c        | 65 +++++++++++++++++++++++++++++++++++++++++++
 config.mak.uname      |  8 ++++--
 t/t0060-path-utils.sh | 33 +++++++++++++++++++++-
 3 files changed, 103 insertions(+), 3 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 88b4d9920539c4..4850dc4ffd9601 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -2761,6 +2761,45 @@ int xwcstoutf(char *utf, const wchar_t *wcs, size_t utflen)
 	return -1;
 }
 
+#ifdef ENSURE_MSYSTEM_IS_SET
+#if !defined(RUNTIME_PREFIX) || !defined(HAVE_WPGMPTR) || !defined(MINGW_PREFIX)
+static size_t append_system_bin_dirs(char *path UNUSED, size_t size UNUSED)
+{
+	return 0;
+}
+#else
+static size_t append_system_bin_dirs(char *path, size_t size)
+{
+	char prefix[32768];
+	const char *slash;
+	size_t len = xwcstoutf(prefix, _wpgmptr, sizeof(prefix)), off = 0;
+
+	if (len == 0 || len >= sizeof(prefix) ||
+	    !(slash = find_last_dir_sep(prefix)))
+		return 0;
+	/* strip trailing `git.exe` */
+	len = slash - prefix;
+
+	/* strip trailing `cmd` or `<mingw-prefix>\bin` or `bin` or `libexec\git-core` */
+	if (strip_suffix_mem(prefix, &len, "\\" MINGW_PREFIX "\\libexec\\git-core") ||
+	    strip_suffix_mem(prefix, &len, "\\" MINGW_PREFIX "\\bin"))
+		off += xsnprintf(path + off, size - off,
+				 "%.*s\\" MINGW_PREFIX "\\bin;", (int)len, prefix);
+	else if (strip_suffix_mem(prefix, &len, "\\cmd") ||
+		 strip_suffix_mem(prefix, &len, "\\bin") ||
+		 strip_suffix_mem(prefix, &len, "\\libexec\\git-core"))
+		off += xsnprintf(path + off, size - off,
+				 "%.*s\\" MINGW_PREFIX "\\bin;", (int)len, prefix);
+	else
+		return 0;
+
+	off += xsnprintf(path + off, size - off,
+			 "%.*s\\usr\\bin;", (int)len, prefix);
+	return off;
+}
+#endif
+#endif
+
 static void setup_windows_environment(void)
 {
 	char *tmp = getenv("TMPDIR");
@@ -2813,6 +2852,32 @@ static void setup_windows_environment(void)
 			setenv("HOME", tmp, 1);
 	}
 
+	if (!getenv("PLINK_PROTOCOL"))
+		setenv("PLINK_PROTOCOL", "ssh", 0);
+
+#ifdef ENSURE_MSYSTEM_IS_SET
+	if (!(tmp = getenv("MSYSTEM")) || !tmp[0]) {
+		const char *home = getenv("HOME"), *path = getenv("PATH");
+		char buf[32768];
+		size_t off = 0;
+
+		setenv("MSYSTEM", ENSURE_MSYSTEM_IS_SET, 1);
+
+		if (home)
+			off += xsnprintf(buf + off, sizeof(buf) - off,
+					 "%s\\bin;", home);
+		off += append_system_bin_dirs(buf + off, sizeof(buf) - off);
+		if (path)
+			off += xsnprintf(buf + off, sizeof(buf) - off,
+					 "%s", path);
+		else if (off > 0)
+			buf[off - 1] = '\0';
+		else
+			buf[0] = '\0';
+		setenv("PATH", buf, 1);
+	}
+#endif
+
 	if (!getenv("LC_ALL") && !getenv("LC_CTYPE") && !getenv("LANG"))
 		setenv("LC_CTYPE", "C.UTF-8", 1);
 }
diff --git a/config.mak.uname b/config.mak.uname
index 3928bc95fa9ea1..de9ab22e5777de 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -509,7 +509,9 @@ endif
 		compat/win32/pthread.o compat/win32/syslog.o \
 		compat/win32/trace2_win32_process_info.o \
 		compat/win32/dirent.o
-	COMPAT_CFLAGS = -D__USE_MINGW_ACCESS -DDETECT_MSYS_TTY -DNOGDI -DHAVE_STRING_H -Icompat -Icompat/regex -Icompat/win32 -DSTRIP_EXTENSION=\".exe\"
+	COMPAT_CFLAGS = -D__USE_MINGW_ACCESS -DDETECT_MSYS_TTY \
+		-DENSURE_MSYSTEM_IS_SET="\"$(MSYSTEM)\"" -DMINGW_PREFIX="\"$(patsubst /%,%,$(MINGW_PREFIX))\"" \
+		-DNOGDI -DHAVE_STRING_H -Icompat -Icompat/regex -Icompat/win32 -DSTRIP_EXTENSION=\".exe\"
 	BASIC_LDFLAGS = -IGNORE:4217 -IGNORE:4049 -NOLOGO -ENTRY:wmainCRTStartup -SUBSYSTEM:CONSOLE
 	# invalidcontinue.obj allows Git's source code to close the same file
 	# handle twice, or to access the osfhandle of an already-closed stdout
@@ -732,7 +734,9 @@ ifeq ($(uname_S),MINGW)
 		prefix = $(MINGW_PREFIX)
 		HOST_CPU = $(patsubst %-w64-mingw32,%,$(MINGW_CHOST))
 		BASIC_LDFLAGS += -Wl,--pic-executable
-		COMPAT_CFLAGS += -DDETECT_MSYS_TTY
+		COMPAT_CFLAGS += -DDETECT_MSYS_TTY \
+			-DENSURE_MSYSTEM_IS_SET="\"$(MSYSTEM)\"" \
+			-DMINGW_PREFIX="\"$(patsubst /%,%,$(MINGW_PREFIX))\""
                 ifeq (MINGW32,$(MSYSTEM))
 			BASIC_LDFLAGS += -Wl,--large-address-aware
                 endif
diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
index 8545cdfab559b4..56faf5fe732ee0 100755
--- a/t/t0060-path-utils.sh
+++ b/t/t0060-path-utils.sh
@@ -602,7 +602,8 @@ test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'RUNTIME_PREFIX wor
 	echo "echo HERE" | write_script pretend/libexec/git-core/git-here &&
 	GIT_EXEC_PATH= ./pretend/bin/git here >actual &&
 	echo HERE >expect &&
-	test_cmp expect actual'
+	test_cmp expect actual
+'
 
 test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD '%(prefix)/ works' '
 	git config yes.path "%(prefix)/yes" &&
@@ -611,4 +612,34 @@ test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD '%(prefix)/ works'
 	test_cmp expect actual
 '
 
+test_expect_success MINGW,RUNTIME_PREFIX 'MSYSTEM/PATH is adjusted if necessary' '
+	if test -z "$MINGW_PREFIX"
+	then
+		MINGW_PREFIX="/$(echo "${MSYSTEM:-MINGW64}" | tr A-Z a-z)"
+	fi &&
+	mkdir -p "$HOME"/bin pretend"$MINGW_PREFIX"/bin \
+		pretend"$MINGW_PREFIX"/libexec/git-core pretend/usr/bin &&
+	cp "$GIT_EXEC_PATH"/git.exe pretend"$MINGW_PREFIX"/bin/ &&
+	cp "$GIT_EXEC_PATH"/git.exe pretend"$MINGW_PREFIX"/libexec/git-core/ &&
+	# copy the .dll files, if any (happens when building via CMake)
+	if test -n "$(ls "$GIT_EXEC_PATH"/*.dll 2>/dev/null)"
+	then
+		cp "$GIT_EXEC_PATH"/*.dll pretend"$MINGW_PREFIX"/bin/ &&
+		cp "$GIT_EXEC_PATH"/*.dll pretend"$MINGW_PREFIX"/libexec/git-core/
+	fi &&
+	echo "env | grep MSYSTEM=" | write_script "$HOME"/bin/git-test-home &&
+	echo "echo ${MINGW_PREFIX#/}" | write_script pretend"$MINGW_PREFIX"/bin/git-test-bin &&
+	echo "echo usr" | write_script pretend/usr/bin/git-test-bin2 &&
+
+	(
+		MSYSTEM= &&
+		GIT_EXEC_PATH= &&
+		pretend"$MINGW_PREFIX"/libexec/git-core/git.exe test-home >actual &&
+		pretend"$MINGW_PREFIX"/libexec/git-core/git.exe test-bin >>actual &&
+		pretend"$MINGW_PREFIX"/bin/git.exe test-bin2 >>actual
+	) &&
+	test_write_lines MSYSTEM=$MSYSTEM "${MINGW_PREFIX#/}" usr >expect &&
+	test_cmp expect actual
+'
+
 test_done

From c9daf7523e8fe30198aa05c0e9e0130270ed0fcc Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Tue, 25 Aug 2020 12:13:26 +0200
Subject: [PATCH 439/553] mingw: ignore HOMEDRIVE/HOMEPATH if it points to
 Windows' system directory

Internally, Git expects the environment variable `HOME` to be set, and
to point to the current user's home directory.

This environment variable is not set by default on Windows, and
therefore Git tries its best to construct one if it finds `HOME` unset.

There are actually two different approaches Git tries: first, it looks
at `HOMEDRIVE`/`HOMEPATH` because this is widely used in corporate
environments with roaming profiles, and a user generally wants their
global Git settings to be in a roaming profile.

Only when `HOMEDRIVE`/`HOMEPATH` is either unset or does not point to a
valid location will Git fall back to using `USERPROFILE` instead.

However, starting with Windows Vista, for secondary logons and services,
the environment variables `HOMEDRIVE`/`HOMEPATH` point to Windows'
system directory (usually `C:\Windows\system32`).

That is undesirable, and that location is usually write-protected anyway.

So let's verify that the `HOMEDRIVE`/`HOMEPATH` combo does not point to
Windows' system directory before using it, falling back to `USERPROFILE`
if it does.
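
The resulting decision can be sketched in shell as follows. This is a
simplified illustration: `pick_home` is a hypothetical helper, the
system directory is passed in as an argument instead of being queried
from Windows, and the case-insensitive comparison only handles ASCII:

```shell
# Hypothetical helper, for illustration only.
# $1 = HOMEDRIVE and HOMEPATH concatenated (may be empty)
# $2 = Windows' system directory
# $3 = USERPROFILE, used as the fall-back
pick_home () {
	# Windows paths are case-insensitive, so compare down-cased.
	candidate=$(printf '%s' "$1" | tr A-Z a-z)
	system_dir=$(printf '%s' "$2" | tr A-Z a-z)
	if test -n "$1" &&
	   test "$candidate" != "$system_dir" &&
	   test -d "$1"
	then
		printf '%s\n' "$1"
	else
		printf '%s\n' "$3"
	fi
}
```

A candidate is accepted only if it is not the system directory and it
exists as a directory; in every other case `USERPROFILE` is used.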

This fixes git-for-windows#2709

Initial-Patch-by: Ivan Pozdeev <vano@mail.mipt.ru>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 4850dc4ffd9601..185488fc42ec55 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -2800,6 +2800,18 @@ static size_t append_system_bin_dirs(char *path, size_t size)
 #endif
 #endif
 
+static int is_system32_path(const char *path)
+{
+	WCHAR system32[MAX_PATH], wpath[MAX_PATH];
+
+	if (xutftowcs_path(wpath, path) < 0 ||
+	    !GetSystemDirectoryW(system32, ARRAY_SIZE(system32)) ||
+	    _wcsicmp(system32, wpath))
+		return 0;
+
+	return 1;
+}
+
 static void setup_windows_environment(void)
 {
 	char *tmp = getenv("TMPDIR");
@@ -2840,7 +2852,8 @@ static void setup_windows_environment(void)
 			strbuf_addstr(&buf, tmp);
 			if ((tmp = getenv("HOMEPATH"))) {
 				strbuf_addstr(&buf, tmp);
-				if (is_directory(buf.buf))
+				if (!is_system32_path(buf.buf) &&
+				    is_directory(buf.buf))
 					setenv("HOME", buf.buf, 1);
 				else
 					tmp = NULL; /* use $USERPROFILE */

From 94f27cb3b2195b30324bfb080633745df9ffb020 Mon Sep 17 00:00:00 2001
From: Jeff Hostetler <jeffhost@microsoft.com>
Date: Tue, 30 Mar 2021 14:25:31 -0400
Subject: [PATCH 440/553] clink.pl: fix libexpatd.lib link error when using
 MSVC

When building with `make MSVC=1 DEBUG=1`, link to `libexpatd.lib`
rather than `libexpat.lib`.

It appears that the `vcpkg` package for "libexpat" has changed and now
creates `libexpatd.lib` for debug mode builds.  Previously, both debug
and release builds created a ".lib" with the same basename.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 compat/vcbuild/scripts/clink.pl | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/compat/vcbuild/scripts/clink.pl b/compat/vcbuild/scripts/clink.pl
index 3bd824154be381..2768ae15f1879f 100755
--- a/compat/vcbuild/scripts/clink.pl
+++ b/compat/vcbuild/scripts/clink.pl
@@ -66,7 +66,11 @@
 		}
 		push(@args, $lib);
 	} elsif ("$arg" eq "-lexpat") {
+	    if ($is_debug) {
+		push(@args, "libexpatd.lib");
+	    } else {
 		push(@args, "libexpat.lib");
+	    }
 	} elsif ("$arg" =~ /^-L/ && "$arg" ne "-LTCG") {
 		$arg =~ s/^-L/-LIBPATH:/;
 		push(@lflags, $arg);

From 0e7e9b2b4b1753d80163750c2a0c11b230514697 Mon Sep 17 00:00:00 2001
From: Jeff Hostetler <jeffhost@microsoft.com>
Date: Mon, 5 Apr 2021 15:27:38 -0400
Subject: [PATCH 441/553] Makefile: clean up .ilk files when MSVC=1

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Makefile b/Makefile
index 4a69c3c75c73d4..4524d5b170e9c1 100644
--- a/Makefile
+++ b/Makefile
@@ -3913,12 +3913,15 @@ ifdef MSVC
 	$(RM) $(patsubst %.o,%.o.pdb,$(OBJECTS))
 	$(RM) headless-git.o.pdb
 	$(RM) $(patsubst %.exe,%.pdb,$(OTHER_PROGRAMS))
+	$(RM) $(patsubst %.exe,%.ilk,$(OTHER_PROGRAMS))
 	$(RM) $(patsubst %.exe,%.iobj,$(OTHER_PROGRAMS))
 	$(RM) $(patsubst %.exe,%.ipdb,$(OTHER_PROGRAMS))
 	$(RM) $(patsubst %.exe,%.pdb,$(PROGRAMS))
+	$(RM) $(patsubst %.exe,%.ilk,$(PROGRAMS))
 	$(RM) $(patsubst %.exe,%.iobj,$(PROGRAMS))
 	$(RM) $(patsubst %.exe,%.ipdb,$(PROGRAMS))
 	$(RM) $(patsubst %.exe,%.pdb,$(TEST_PROGRAMS))
+	$(RM) $(patsubst %.exe,%.ilk,$(TEST_PROGRAMS))
 	$(RM) $(patsubst %.exe,%.iobj,$(TEST_PROGRAMS))
 	$(RM) $(patsubst %.exe,%.ipdb,$(TEST_PROGRAMS))
 	$(RM) compat/vcbuild/MSVC-DEFS-GEN

From b905927fff090bf615961b9ff197b6b94036b7c2 Mon Sep 17 00:00:00 2001
From: Jeff Hostetler <jeffhost@microsoft.com>
Date: Mon, 5 Apr 2021 14:08:22 -0400
Subject: [PATCH 442/553] vcbuild: add support for compiling Windows resource
 files

Create a wrapper for the Windows Resource Compiler (RC.EXE)
for use by the MSVC=1 builds. This is similar to the CL.EXE
and LIB.EXE wrappers used for the MSVC=1 builds.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 compat/vcbuild/find_vs_env.bat |  7 ++++++
 compat/vcbuild/scripts/rc.pl   | 46 ++++++++++++++++++++++++++++++++++
 config.mak.uname               |  3 ++-
 3 files changed, 55 insertions(+), 1 deletion(-)
 create mode 100644 compat/vcbuild/scripts/rc.pl

diff --git a/compat/vcbuild/find_vs_env.bat b/compat/vcbuild/find_vs_env.bat
index b35d264c0e6bed..379b16296e09c2 100644
--- a/compat/vcbuild/find_vs_env.bat
+++ b/compat/vcbuild/find_vs_env.bat
@@ -99,6 +99,7 @@ REM ================================================================
 
    SET sdk_dir=%WindowsSdkDir%
    SET sdk_ver=%WindowsSDKVersion%
+   SET sdk_ver_bin_dir=%WindowsSdkVerBinPath%%tgt%
    SET si=%sdk_dir%Include\%sdk_ver%
    SET sdk_includes=-I"%si%ucrt" -I"%si%um" -I"%si%shared"
    SET sl=%sdk_dir%lib\%sdk_ver%
@@ -130,6 +131,7 @@ REM ================================================================
 
    SET sdk_dir=%WindowsSdkDir%
    SET sdk_ver=%WindowsSDKVersion%
+   SET sdk_ver_bin_dir=%WindowsSdkVerBinPath%bin\amd64
    SET si=%sdk_dir%Include\%sdk_ver%
    SET sdk_includes=-I"%si%ucrt" -I"%si%um" -I"%si%shared" -I"%si%winrt"
    SET sl=%sdk_dir%lib\%sdk_ver%
@@ -160,6 +162,11 @@ REM ================================================================
    echo msvc_includes=%msvc_includes%
    echo msvc_libs=%msvc_libs%
 
+   echo sdk_ver_bin_dir=%sdk_ver_bin_dir%
+   SET X1=%sdk_ver_bin_dir:C:=/C%
+   SET X2=%X1:\=/%
+   echo sdk_ver_bin_dir_msys=%X2%
+
    echo sdk_includes=%sdk_includes%
    echo sdk_libs=%sdk_libs%
 
diff --git a/compat/vcbuild/scripts/rc.pl b/compat/vcbuild/scripts/rc.pl
new file mode 100644
index 00000000000000..7bca4cd81c6c63
--- /dev/null
+++ b/compat/vcbuild/scripts/rc.pl
@@ -0,0 +1,46 @@
+#!/usr/bin/perl -w
+######################################################################
+# Compile Resources on Windows
+#
+# This is a wrapper to facilitate the compilation of Git with MSVC
+# using GNU Make as the build system. So, instead of manipulating the
+# Makefile into something nasty, just to support non-space arguments
+# etc, we use this wrapper to fix the command line options
+#
+######################################################################
+use strict;
+my @args = ();
+my @input = ();
+
+while (@ARGV) {
+	my $arg = shift @ARGV;
+	if ("$arg" =~ /^-[dD]/) {
+		# GIT_VERSION gets passed with too many
+		# layers of dquote escaping.
+		$arg =~ s/\\"/"/g;
+
+		push(@args, $arg);
+
+	} elsif ("$arg" eq "-i") {
+		my $arg = shift @ARGV;
+		# TODO complain if NULL or is dashed ??
+		push(@input, $arg);
+
+	} elsif ("$arg" eq "-o") {
+		my $arg = shift @ARGV;
+		# TODO complain if NULL or is dashed ??
+		push(@args, "-fo$arg");
+
+	} else {
+		push(@args, $arg);
+	}
+}
+
+push(@args, "-nologo");
+push(@args, "-v");
+push(@args, @input);
+
+unshift(@args, "rc.exe");
+printf("**** @args\n");
+
+exit (system(@args) != 0);
diff --git a/config.mak.uname b/config.mak.uname
index de9ab22e5777de..0fde937c92886b 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -447,7 +447,7 @@ ifeq ($(uname_S),Windows)
 	# link.exe next to, and required by, cl.exe, we have to prepend this
 	# onto the existing $PATH.
 	#
-	SANE_TOOL_PATH ?= $(msvc_bin_dir_msys)
+	SANE_TOOL_PATH ?= $(msvc_bin_dir_msys):$(sdk_ver_bin_dir_msys)
 	HAVE_ALLOCA_H = YesPlease
 	NO_PREAD = YesPlease
 	NEEDS_CRYPTO_WITH_SSL = YesPlease
@@ -518,6 +518,7 @@ endif
 	# See https://msdn.microsoft.com/en-us/library/ms235330.aspx
 	EXTLIBS = user32.lib advapi32.lib shell32.lib wininet.lib ws2_32.lib invalidcontinue.obj kernel32.lib ntdll.lib
 	PTHREAD_LIBS =
+	RC = compat/vcbuild/scripts/rc.pl
 	lib =
 	BASIC_CFLAGS += $(vcpkg_inc) $(sdk_includes) $(msvc_includes)
 ifndef DEBUG

From 2e960e4fe1ae5cf68e3ab15b0a981d69be47d34d Mon Sep 17 00:00:00 2001
From: Jeff Hostetler <jeffhost@microsoft.com>
Date: Mon, 5 Apr 2021 14:12:14 -0400
Subject: [PATCH 443/553] config.mak.uname: add git.rc to MSVC builds

Teach MSVC=1 builds to depend on the `git.rc` file so that
the resulting executables have Windows-style resources and
version number information within them.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 config.mak.uname | 1 +
 1 file changed, 1 insertion(+)

diff --git a/config.mak.uname b/config.mak.uname
index 0fde937c92886b..bfb0fa7e100653 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -517,6 +517,7 @@ endif
 	# handle twice, or to access the osfhandle of an already-closed stdout
 	# See https://msdn.microsoft.com/en-us/library/ms235330.aspx
 	EXTLIBS = user32.lib advapi32.lib shell32.lib wininet.lib ws2_32.lib invalidcontinue.obj kernel32.lib ntdll.lib
+	GITLIBS += git.res
 	PTHREAD_LIBS =
 	RC = compat/vcbuild/scripts/rc.pl
 	lib =

From 3d19409461f28e4bb3235cf2692ce8635ca4ecc2 Mon Sep 17 00:00:00 2001
From: Jeff Hostetler <jeffhost@microsoft.com>
Date: Mon, 5 Apr 2021 14:24:52 -0400
Subject: [PATCH 444/553] clink.pl: ignore no-stack-protector arg on MSVC=1
 builds

Ignore the `-fno-stack-protector` compiler argument when building
with MSVC.  This will be used in a later commit that needs to build
a Win32 GUI app.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 compat/vcbuild/scripts/clink.pl | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/compat/vcbuild/scripts/clink.pl b/compat/vcbuild/scripts/clink.pl
index 2768ae15f1879f..73c8a2b184f38b 100755
--- a/compat/vcbuild/scripts/clink.pl
+++ b/compat/vcbuild/scripts/clink.pl
@@ -122,6 +122,8 @@
 		push(@cflags, "-wd4996");
 	} elsif ("$arg" =~ /^-W[a-z]/) {
 		# let's ignore those
+	} elsif ("$arg" eq "-fno-stack-protector") {
+		# eat this
 	} else {
 		push(@args, $arg);
 	}

From 6504869d80cbb59cb2b47ce26421324fc597b4e3 Mon Sep 17 00:00:00 2001
From: Jeff Hostetler <jeffhost@microsoft.com>
Date: Mon, 5 Apr 2021 14:39:33 -0400
Subject: [PATCH 445/553] clink.pl: move default linker options for MSVC=1
 builds

Move the default `-ENTRY` and `-SUBSYSTEM` arguments for
MSVC=1 builds from `config.mak.uname` into `clink.pl`.
These args are constant for console-mode executables.

Add support to `clink.pl` for generating a Win32 GUI application
using the `-mwindows` argument (to match how GCC does it).  This
changes the `-ENTRY` and `-SUBSYSTEM` arguments accordingly.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 compat/vcbuild/scripts/clink.pl | 11 +++++++++++
 config.mak.uname                |  2 +-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/compat/vcbuild/scripts/clink.pl b/compat/vcbuild/scripts/clink.pl
index 73c8a2b184f38b..a38b360015ece9 100755
--- a/compat/vcbuild/scripts/clink.pl
+++ b/compat/vcbuild/scripts/clink.pl
@@ -15,6 +15,7 @@
 my @lflags = ();
 my $is_linking = 0;
 my $is_debug = 0;
+my $is_gui = 0;
 while (@ARGV) {
 	my $arg = shift @ARGV;
 	if ("$arg" eq "-DDEBUG") {
@@ -124,11 +125,21 @@
 		# let's ignore those
 	} elsif ("$arg" eq "-fno-stack-protector") {
 		# eat this
+	} elsif ("$arg" eq "-mwindows") {
+		$is_gui = 1;
 	} else {
 		push(@args, $arg);
 	}
 }
 if ($is_linking) {
+	if ($is_gui) {
+		push(@args, "-ENTRY:wWinMainCRTStartup");
+		push(@args, "-SUBSYSTEM:WINDOWS");
+	} else {
+		push(@args, "-ENTRY:wmainCRTStartup");
+		push(@args, "-SUBSYSTEM:CONSOLE");
+	}
+
 	push(@args, @lflags);
 	unshift(@args, "link.exe");
 } else {
diff --git a/config.mak.uname b/config.mak.uname
index bfb0fa7e100653..eac081b82ab996 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -512,7 +512,7 @@ endif
 	COMPAT_CFLAGS = -D__USE_MINGW_ACCESS -DDETECT_MSYS_TTY \
 		-DENSURE_MSYSTEM_IS_SET="\"$(MSYSTEM)\"" -DMINGW_PREFIX="\"$(patsubst /%,%,$(MINGW_PREFIX))\"" \
 		-DNOGDI -DHAVE_STRING_H -Icompat -Icompat/regex -Icompat/win32 -DSTRIP_EXTENSION=\".exe\"
-	BASIC_LDFLAGS = -IGNORE:4217 -IGNORE:4049 -NOLOGO -ENTRY:wmainCRTStartup -SUBSYSTEM:CONSOLE
+	BASIC_LDFLAGS = -IGNORE:4217 -IGNORE:4049 -NOLOGO
 	# invalidcontinue.obj allows Git's source code to close the same file
 	# handle twice, or to access the osfhandle of an already-closed stdout
 	# See https://msdn.microsoft.com/en-us/library/ms235330.aspx

From d4bc6526dc97baf87d2f736ada4bb6b6016dbfea Mon Sep 17 00:00:00 2001
From: Yuyi Wang <Strawberry_Str@hotmail.com>
Date: Sat, 11 Mar 2023 17:51:18 +0800
Subject: [PATCH 446/553] cmake: install headless-git.

headless-git is a variant of the git executable that runs without opening
a console window. It is useful when other GUI executables want to call
git. We should install it together with git on Windows.

Signed-off-by: Yuyi Wang <Strawberry_Str@hotmail.com>
---
 contrib/buildsystems/CMakeLists.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index 48c3feadde3c9b..ffb33b3d317e07 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -736,6 +736,7 @@ if(WIN32)
 	endif()
 
 	add_executable(headless-git ${CMAKE_SOURCE_DIR}/compat/win32/headless.c)
+	list(APPEND PROGRAMS_BUILT headless-git)
 	if(CMAKE_C_COMPILER_ID STREQUAL "GNU" OR CMAKE_C_COMPILER_ID STREQUAL "Clang")
 		target_link_options(headless-git PUBLIC -municode -Wl,-subsystem,windows)
 	elseif(CMAKE_C_COMPILER_ID STREQUAL "MSVC")
@@ -936,7 +937,7 @@ list(TRANSFORM git_perl_scripts PREPEND "${CMAKE_BINARY_DIR}/")
 
 #install
 foreach(program ${PROGRAMS_BUILT})
-if(program MATCHES "^(git|git-shell|scalar)$")
+if(program MATCHES "^(git|git-shell|headless-git|scalar)$")
 install(TARGETS ${program}
 	RUNTIME DESTINATION bin)
 else()

From 4846187758b3c911db6dccb8709e6feeb6bc9872 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Matthias=20A=C3=9Fhauer?= <mha1993@live.de>
Date: Sat, 2 Dec 2023 12:10:00 +0100
Subject: [PATCH 447/553] git.rc: include winuser.h
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

winuser.h contains the definition of RT_MANIFEST that our LLVM based
toolchain needs to understand that we want to embed
compat/win32/git.manifest as an application manifest. It currently just
embeds it as additional data that Windows doesn't understand.

This also helps our GCC based toolchain understand that we only want one
copy embedded. It currently embeds one working assembly manifest and one
nearly identical, but useless copy as additional data.

This also teaches our Visual Studio based buildsystems to pick up the
manifest file from git.rc. This means we don't have to explicitly specify
it in contrib/buildsystems/Generators/Vcxproj.pm anymore. Slightly
counter-intuitively, this also means we have to explicitly tell CMake
not to embed a default manifest.

This fixes https://github.com/git-for-windows/git/issues/4707

Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
---
 contrib/buildsystems/CMakeLists.txt | 1 +
 git.rc.in                           | 1 +
 2 files changed, 2 insertions(+)

diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index ffb33b3d317e07..de6c3921fccc77 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -208,6 +208,7 @@ if(CMAKE_C_COMPILER_ID STREQUAL "MSVC")
 	set(CMAKE_RUNTIME_OUTPUT_DIRECTORY_DEBUG ${CMAKE_BINARY_DIR})
 	set(CMAKE_RUNTIME_OUTPUT_DIRECTORY_RELEASE ${CMAKE_BINARY_DIR})
 	add_compile_options(/MP /std:c11)
+	add_link_options(/MANIFEST:NO)
 endif()
 
 #default behaviour
diff --git a/git.rc.in b/git.rc.in
index e69444eef3f0c5..1d5b627b610549 100644
--- a/git.rc.in
+++ b/git.rc.in
@@ -1,3 +1,4 @@
+#include<winuser.h>
 1 VERSIONINFO
 FILEVERSION     @GIT_MAJOR_VERSION@,@GIT_MINOR_VERSION@,@GIT_MICRO_VERSION@,@GIT_PATCH_LEVEL@
 PRODUCTVERSION  @GIT_MAJOR_VERSION@,@GIT_MINOR_VERSION@,@GIT_MICRO_VERSION@,@GIT_PATCH_LEVEL@

From e1e39bffae01e592208e28068d27684976da831c Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 3 Nov 2025 12:49:35 +0100
Subject: [PATCH 448/553] git-svn: mark it as unsupported by the Git for
 Windows project

There have been too many challenges supporting `git svn`, including lack
of participation in developing/maintaining the required stack.

See https://github.com/git-for-windows/git/issues/5405 for full details.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/git-svn.adoc                 |  1 +
 git-svn.perl                               | 13 +++++++++++++
 t/t9108-git-svn-glob.sh                    |  3 ++-
 t/t9109-git-svn-multi-glob.sh              |  3 ++-
 t/t9168-git-svn-partially-globbed-names.sh |  6 ++++--
 5 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-svn.adoc b/Documentation/git-svn.adoc
index c26c12bab37abf..047c412018adcc 100644
--- a/Documentation/git-svn.adoc
+++ b/Documentation/git-svn.adoc
@@ -9,6 +9,7 @@ SYNOPSIS
 --------
 [verse]
 'git svn' <command> [<options>] [<arguments>]
+(UNSUPPORTED!)
 
 DESCRIPTION
 -----------
diff --git a/git-svn.perl b/git-svn.perl
index 32c648c3956fa4..37af8e873a9738 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -305,6 +305,19 @@ sub term_init {
 			: new Term::ReadLine 'git-svn';
 }
 
+sub deprecated_warning {
+    my @lines = @_;
+    if (-t STDERR) {
+        @lines = map { "\e[33m$_\e[0m" } @lines;
+    }
+    warn join("\n", @lines), "\n";
+}
+
+deprecated_warning(
+	"WARNING: \`git svn\` is no longer supported by the Git for Windows project.",
+	"See https://github.com/git-for-windows/git/issues/5405 for details."
+);
+
 my $cmd;
 for (my $i = 0; $i < @ARGV; $i++) {
 	if (defined $cmd{$ARGV[$i]}) {
diff --git a/t/t9108-git-svn-glob.sh b/t/t9108-git-svn-glob.sh
index d5939d4753ece8..b867c5504ff452 100755
--- a/t/t9108-git-svn-glob.sh
+++ b/t/t9108-git-svn-glob.sh
@@ -110,7 +110,8 @@ test_expect_success 'test disallow multi-globs' '
 		svn_cmd commit -m "try to try"
 	) &&
 	test_must_fail git svn fetch three 2> stderr.three &&
-	test_cmp expect.three stderr.three
+	sed "/^WARNING.*no.* supported/{N;d}" <stderr.three >stderr.three.clean &&
+	test_cmp expect.three stderr.three.clean
 	'
 
 test_done
diff --git a/t/t9109-git-svn-multi-glob.sh b/t/t9109-git-svn-multi-glob.sh
index 648dcee1eac137..ebf34abcc3a952 100755
--- a/t/t9109-git-svn-multi-glob.sh
+++ b/t/t9109-git-svn-multi-glob.sh
@@ -161,7 +161,8 @@ test_expect_success 'test disallow multiple globs' '
 		svn_cmd commit -m "try to try"
 	) &&
 	test_must_fail git svn fetch three 2> stderr.three &&
-	test_cmp expect.three stderr.three
+	sed "/^WARNING.*no.* supported/{N;d}" <stderr.three >stderr.three.clean &&
+	test_cmp expect.three stderr.three.clean
 	'
 
 test_done
diff --git a/t/t9168-git-svn-partially-globbed-names.sh b/t/t9168-git-svn-partially-globbed-names.sh
index 854b3419b2c323..59be2eaf0f688a 100755
--- a/t/t9168-git-svn-partially-globbed-names.sh
+++ b/t/t9168-git-svn-partially-globbed-names.sh
@@ -155,7 +155,8 @@ test_expect_success 'test disallow prefixed multi-globs' '
 		svn_cmd commit -m "try to try"
 	) &&
 	test_must_fail git svn fetch four 2>stderr.four &&
-	test_cmp expect.four stderr.four &&
+	sed "/^WARNING.*no.* supported/{N;d}" <stderr.four >stderr.four.clean &&
+	test_cmp expect.four stderr.four.clean &&
 	git config --unset svn-remote.four.branches &&
 	git config --unset svn-remote.four.tags
 	'
@@ -223,7 +224,8 @@ test_expect_success 'test disallow multiple asterisks in one word' '
 		svn_cmd commit -m "try to try"
 	) &&
 	test_must_fail git svn fetch six 2>stderr.six &&
-	test_cmp expect.six stderr.six
+	sed "/^WARNING.*no.* supported/{N;d}" <stderr.six >stderr.six.clean &&
+	test_cmp expect.six stderr.six.clean
 	'
 
 test_done

From f33769fe7732cea223a63ceb1bff3e4296f0088c Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Thu, 13 Nov 2025 11:23:29 +0100
Subject: [PATCH 449/553] ci(macos): skip the `git p4` tests

Historically, the macOS jobs have always been among the longest-running
ones, and recently the `git p4` tests became another liability: They
started to fail much more often (maybe as of the switch away from the
`macos-13` pool?), requiring re-runs of the jobs that already were
responsible for long CI build times.

Of the 35 test scripts that exercise `git p4`, 32 are actually run on
macOS (3 are skipped for reasons like case-sensitive filesystem), and
they take an accumulated runtime of over half an hour.

Furthermore, the `git p4` command is not really affected by Git for
Windows' patches, at least not as far as macOS is concerned, therefore
it is not only causing developer friction to have these long-running,
frequently failing tests, it is also quite wasteful: There has not been
a single instance so far where any `git p4` test failure in Git for
Windows had demonstrated an actionable bug.

While upstream Git is confident to have addressed the flakiness of the
`git p4` tests via ffff0bb0dac1 (Use Perforce arm64 binary on macOS CI
jobs, 2025-11-16) (which got slipped in at the 11th hour into the
v2.52.0 release, fast-tracked without ever hitting `seen` even after
-rc2 was released), I am not quite so confident, and besides, the
runtime penalty of running those tests in Git for Windows' CI runs is
still a worrisome burden.

So let's just disable those tests in the CI runs, at least on macOS.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 ci/install-dependencies.sh | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index 6ee8216a05e127..4719c20844ee8d 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -108,11 +108,12 @@ macos-*)
 	# brew install gnu-time
 	brew link --force gettext
 
-	mkdir -p "$CUSTOM_PATH"
-	wget -q "$P4WHENCE/bin.macosx12arm64/helix-core-server.tgz" &&
-	tar -xf helix-core-server.tgz -C "$CUSTOM_PATH" p4 p4d &&
-	sudo xattr -d com.apple.quarantine "$CUSTOM_PATH/p4" "$CUSTOM_PATH/p4d" 2>/dev/null || true
-	rm helix-core-server.tgz
+	# Uncomment this block if you want to run `git p4` tests:
+	# mkdir -p "$CUSTOM_PATH"
+	# wget -q "$P4WHENCE/bin.macosx12arm64/helix-core-server.tgz" &&
+	# tar -xf helix-core-server.tgz -C "$CUSTOM_PATH" p4 p4d &&
+	# sudo xattr -d com.apple.quarantine "$CUSTOM_PATH/p4" "$CUSTOM_PATH/p4d" 2>/dev/null || true
+	# rm helix-core-server.tgz
 
 	case "$jobname" in
 	osx-meson)

From f153204132cbffff72ee2b93a22a481f175c29c2 Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Sat, 6 Jul 2013 02:09:35 +0200
Subject: [PATCH 450/553] Win32: make FILETIME conversion functions public

We will use them in the upcoming "FSCache" patches (to accelerate
sequential lstat() calls).

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw-posix.h | 18 ++++++++++++++++++
 compat/mingw.c       | 18 ------------------
 2 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/compat/mingw-posix.h b/compat/mingw-posix.h
index 0ef26f7f80c44c..4539d3ca49883b 100644
--- a/compat/mingw-posix.h
+++ b/compat/mingw-posix.h
@@ -340,6 +340,17 @@ static inline int getrlimit(int resource, struct rlimit *rlp)
 	return 0;
 }
 
+/*
+ * The unit of FILETIME is 100-nanoseconds since January 1, 1601, UTC.
+ * Returns the 100-nanoseconds ("hekto nanoseconds") since the epoch.
+ */
+static inline long long filetime_to_hnsec(const FILETIME *ft)
+{
+	long long winTime = ((long long)ft->dwHighDateTime << 32) + ft->dwLowDateTime;
+	/* Windows to Unix Epoch conversion */
+	return winTime - 116444736000000000LL;
+}
+
 /*
  * Use mingw specific stat()/lstat()/fstat() implementations on Windows,
  * including our own struct stat with 64 bit st_size and nanosecond-precision
@@ -356,6 +367,13 @@ struct timespec {
 #endif
 #endif
 
+static inline void filetime_to_timespec(const FILETIME *ft, struct timespec *ts)
+{
+	long long hnsec = filetime_to_hnsec(ft);
+	ts->tv_sec = (time_t)(hnsec / 10000000);
+	ts->tv_nsec = (hnsec % 10000000) * 100;
+}
+
 struct mingw_stat {
     _dev_t st_dev;
     _ino_t st_ino;
diff --git a/compat/mingw.c b/compat/mingw.c
index 780ba62cfd7b81..4673b1c34f8c8c 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -897,24 +897,6 @@ int mingw_chmod(const char *filename, int mode)
 	return _wchmod(wfilename, mode);
 }
 
-/*
- * The unit of FILETIME is 100-nanoseconds since January 1, 1601, UTC.
- * Returns the 100-nanoseconds ("hekto nanoseconds") since the epoch.
- */
-static inline long long filetime_to_hnsec(const FILETIME *ft)
-{
-	long long winTime = ((long long)ft->dwHighDateTime << 32) + ft->dwLowDateTime;
-	/* Windows to Unix Epoch conversion */
-	return winTime - 116444736000000000LL;
-}
-
-static inline void filetime_to_timespec(const FILETIME *ft, struct timespec *ts)
-{
-	long long hnsec = filetime_to_hnsec(ft);
-	ts->tv_sec = (time_t)(hnsec / 10000000);
-	ts->tv_nsec = (hnsec % 10000000) * 100;
-}
-
 /**
  * Verifies that safe_create_leading_directories() would succeed.
  */

From 3ca6e7652e437cc36dbe7c24ec6807e2081f2cee Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Sun, 8 Sep 2013 14:17:31 +0200
Subject: [PATCH 451/553] Win32: dirent.c: Move opendir down

Move opendir down in preparation for the next patch.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 compat/win32/dirent.c | 68 +++++++++++++++++++++----------------------
 1 file changed, 34 insertions(+), 34 deletions(-)

diff --git a/compat/win32/dirent.c b/compat/win32/dirent.c
index 52420ec7d4dad7..2603a0fa39f45a 100644
--- a/compat/win32/dirent.c
+++ b/compat/win32/dirent.c
@@ -18,40 +18,6 @@ static inline void finddata2dirent(struct dirent *ent, WIN32_FIND_DATAW *fdata)
 		ent->d_type = DT_REG;
 }
 
-DIR *opendir(const char *name)
-{
-	wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */
-	WIN32_FIND_DATAW fdata;
-	HANDLE h;
-	int len;
-	DIR *dir;
-
-	/* convert name to UTF-16 and check length < MAX_PATH */
-	if ((len = xutftowcs_path(pattern, name)) < 0)
-		return NULL;
-
-	/* append optional '/' and wildcard '*' */
-	if (len && !is_dir_sep(pattern[len - 1]))
-		pattern[len++] = '/';
-	pattern[len++] = '*';
-	pattern[len] = 0;
-
-	/* open find handle */
-	h = FindFirstFileW(pattern, &fdata);
-	if (h == INVALID_HANDLE_VALUE) {
-		DWORD err = GetLastError();
-		errno = (err == ERROR_DIRECTORY) ? ENOTDIR : err_win_to_posix(err);
-		return NULL;
-	}
-
-	/* initialize DIR structure and copy first dir entry */
-	dir = xmalloc(sizeof(DIR));
-	dir->dd_handle = h;
-	dir->dd_stat = 0;
-	finddata2dirent(&dir->dd_dir, &fdata);
-	return dir;
-}
-
 struct dirent *readdir(DIR *dir)
 {
 	if (!dir) {
@@ -90,3 +56,37 @@ int closedir(DIR *dir)
 	free(dir);
 	return 0;
 }
+
+DIR *opendir(const char *name)
+{
+	wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */
+	WIN32_FIND_DATAW fdata;
+	HANDLE h;
+	int len;
+	DIR *dir;
+
+	/* convert name to UTF-16 and check length < MAX_PATH */
+	if ((len = xutftowcs_path(pattern, name)) < 0)
+		return NULL;
+
+	/* append optional '/' and wildcard '*' */
+	if (len && !is_dir_sep(pattern[len - 1]))
+		pattern[len++] = '/';
+	pattern[len++] = '*';
+	pattern[len] = 0;
+
+	/* open find handle */
+	h = FindFirstFileW(pattern, &fdata);
+	if (h == INVALID_HANDLE_VALUE) {
+		DWORD err = GetLastError();
+		errno = (err == ERROR_DIRECTORY) ? ENOTDIR : err_win_to_posix(err);
+		return NULL;
+	}
+
+	/* initialize DIR structure and copy first dir entry */
+	dir = xmalloc(sizeof(DIR));
+	dir->dd_handle = h;
+	dir->dd_stat = 0;
+	finddata2dirent(&dir->dd_dir, &fdata);
+	return dir;
+}

From 1307818f676b8ca6e88c18e6dcbba990a49983d1 Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Sun, 8 Sep 2013 14:18:40 +0200
Subject: [PATCH 452/553] mingw: make the dirent implementation pluggable

Emulating the POSIX `dirent` API on Windows via
`FindFirstFile()`/`FindNextFile()` is pretty straightforward. However,
most of the information provided in the `WIN32_FIND_DATA` structure is
thrown away in the process. A more sophisticated implementation may
cache this data, e.g. for later reuse in calls to `lstat()`.

Make the `dirent` implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.

Define a base DIR structure with pointers to `readdir()`/`closedir()`
that match the `opendir()` implementation (similar to vtable pointers in
Object-Oriented Programming). Define `readdir()`/`closedir()` so that
they call the function pointers in the `DIR` structure. This allows
choosing the `opendir()` implementation on a call-by-call basis.
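
The dispatch idea can be sketched in portable C as follows. This is an
illustration only, not the Windows implementation: `demo_dir`, `list_dir`,
`list_open()` and the `demo_*` macros are hypothetical names, and the
"directory" is just a fixed array of strings.

```c
#include <assert.h>
#include <stddef.h>

/* Base structure: holds only the dispatch pointers. */
typedef struct demo_dir {
	const char *(*pread)(struct demo_dir *dir);
	int (*pclose)(struct demo_dir *dir);
} demo_dir;

/* One concrete implementation: iterates over a fixed list of names. */
typedef struct list_dir {
	demo_dir base; /* must come first so the casts below are valid */
	const char *const *names;
	size_t count;
	size_t next;
	int closed;
} list_dir;

static const char *list_read(demo_dir *dir)
{
	list_dir *ld = (list_dir *)dir;

	return ld->next < ld->count ? ld->names[ld->next++] : NULL;
}

static int list_close(demo_dir *dir)
{
	((list_dir *)dir)->closed = 1;
	return 0;
}

/* The "opendir" of this implementation wires up its own read/close. */
static demo_dir *list_open(list_dir *storage, const char *const *names,
			   size_t count)
{
	assert(storage && names);
	storage->base.pread = list_read;
	storage->base.pclose = list_close;
	storage->names = names;
	storage->count = count;
	storage->next = 0;
	storage->closed = 0;
	return &storage->base;
}

/* Callers only see the base type and dispatch through its pointers. */
#define demo_readdir(dir) ((dir)->pread(dir))
#define demo_closedir(dir) ((dir)->pclose(dir))
```

A second implementation (e.g. a caching one) would provide its own open
function that fills in different pointers; callers of `demo_readdir()` and
`demo_closedir()` would not need to change.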

Make the fixed-size `dirent.d_name` buffer a flex array, as `d_name` may
be implementation specific (e.g. a caching implementation may allocate a
`struct dirent` with _just_ the size needed to hold the `d_name` in
question).

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/win32/dirent.c | 30 +++++++++++++++++++-----------
 compat/win32/dirent.h | 28 +++++++++++++++++++++-------
 2 files changed, 40 insertions(+), 18 deletions(-)

diff --git a/compat/win32/dirent.c b/compat/win32/dirent.c
index 2603a0fa39f45a..139d2ba3c4da34 100644
--- a/compat/win32/dirent.c
+++ b/compat/win32/dirent.c
@@ -1,15 +1,21 @@
 #include "../../git-compat-util.h"
 
-struct DIR {
-	struct dirent dd_dir; /* includes d_type */
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wpedantic"
+typedef struct dirent_DIR {
+	struct DIR base_dir;  /* extend base struct DIR */
 	HANDLE dd_handle;     /* FindFirstFile handle */
 	int dd_stat;          /* 0-based index */
-};
+	struct dirent dd_dir; /* includes d_type */
+} dirent_DIR;
+#pragma GCC diagnostic pop
+
+DIR *(*opendir)(const char *dirname) = dirent_opendir;
 
 static inline void finddata2dirent(struct dirent *ent, WIN32_FIND_DATAW *fdata)
 {
-	/* convert UTF-16 name to UTF-8 */
-	xwcstoutf(ent->d_name, fdata->cFileName, sizeof(ent->d_name));
+	/* convert UTF-16 name to UTF-8 (d_name points to dirent_DIR.dd_name) */
+	xwcstoutf(ent->d_name, fdata->cFileName, MAX_PATH * 3);
 
 	/* Set file type, based on WIN32_FIND_DATA */
 	if (fdata->dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
@@ -18,7 +24,7 @@ static inline void finddata2dirent(struct dirent *ent, WIN32_FIND_DATAW *fdata)
 		ent->d_type = DT_REG;
 }
 
-struct dirent *readdir(DIR *dir)
+static struct dirent *dirent_readdir(dirent_DIR *dir)
 {
 	if (!dir) {
 		errno = EBADF; /* No set_errno for mingw */
@@ -45,7 +51,7 @@ struct dirent *readdir(DIR *dir)
 	return &dir->dd_dir;
 }
 
-int closedir(DIR *dir)
+static int dirent_closedir(dirent_DIR *dir)
 {
 	if (!dir) {
 		errno = EBADF;
@@ -57,13 +63,13 @@ int closedir(DIR *dir)
 	return 0;
 }
 
-DIR *opendir(const char *name)
+DIR *dirent_opendir(const char *name)
 {
 	wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */
 	WIN32_FIND_DATAW fdata;
 	HANDLE h;
 	int len;
-	DIR *dir;
+	dirent_DIR *dir;
 
 	/* convert name to UTF-16 and check length < MAX_PATH */
 	if ((len = xutftowcs_path(pattern, name)) < 0)
@@ -84,9 +90,11 @@ DIR *opendir(const char *name)
 	}
 
 	/* initialize DIR structure and copy first dir entry */
-	dir = xmalloc(sizeof(DIR));
+	dir = xmalloc(sizeof(dirent_DIR) + MAX_PATH);
+	dir->base_dir.preaddir = (struct dirent *(*)(DIR *dir)) dirent_readdir;
+	dir->base_dir.pclosedir = (int (*)(DIR *dir)) dirent_closedir;
 	dir->dd_handle = h;
 	dir->dd_stat = 0;
 	finddata2dirent(&dir->dd_dir, &fdata);
-	return dir;
+	return (DIR*) dir;
 }
diff --git a/compat/win32/dirent.h b/compat/win32/dirent.h
index 058207e4bfed62..a58a8075fd70e3 100644
--- a/compat/win32/dirent.h
+++ b/compat/win32/dirent.h
@@ -1,20 +1,34 @@
 #ifndef DIRENT_H
 #define DIRENT_H
 
-typedef struct DIR DIR;
-
 #define DT_UNKNOWN 0
 #define DT_DIR     1
 #define DT_REG     2
 #define DT_LNK     3
 
 struct dirent {
-	unsigned char d_type;      /* file type to prevent lstat after readdir */
-	char d_name[MAX_PATH * 3]; /* file name (* 3 for UTF-8 conversion) */
+	unsigned char d_type; /* file type to prevent lstat after readdir */
+	char d_name[/* FLEX_ARRAY */]; /* file name */
 };
 
-DIR *opendir(const char *dirname);
-struct dirent *readdir(DIR *dir);
-int closedir(DIR *dir);
+/*
+ * Base DIR structure, contains pointers to readdir/closedir implementations so
+ * that opendir may choose a concrete implementation on a call-by-call basis.
+ */
+typedef struct DIR {
+	struct dirent *(*preaddir)(struct DIR *dir);
+	int (*pclosedir)(struct DIR *dir);
+} DIR;
+
+/* default dirent implementation */
+extern DIR *dirent_opendir(const char *dirname);
+
+#define opendir git_opendir
+
+/* current dirent implementation */
+extern DIR *(*opendir)(const char *dirname);
+
+#define readdir(dir) (dir->preaddir(dir))
+#define closedir(dir) (dir->pclosedir(dir))
 
 #endif /* DIRENT_H */

From 78fc8077c3d5d946ddbc99fff6560c51f138710d Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Sun, 8 Sep 2013 14:21:30 +0200
Subject: [PATCH 453/553] Win32: make the lstat implementation pluggable

Emulating the POSIX lstat API on Windows via GetFileAttributes[Ex] is quite
slow. Windows operating system APIs seem to be much better at scanning the
status of entire directories than checking single files. A caching
implementation may improve performance by bulk-reading entire directories
or reusing data obtained via opendir / readdir.

Make the lstat implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.
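
As a rough illustration of the idea (not the code in this patch), the
sketch below routes every call through a function pointer that can be
re-pointed at runtime. All `demo_*` names and `struct demo_stat` are
hypothetical stand-ins; in the patch the pointer is `lstat` and the
default implementation is `mingw_lstat`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct stat; only the size is modelled. */
struct demo_stat {
	long size;
};

/* Counts how often the slow path runs, to make the switch observable. */
static int slow_calls;

/* Stand-in for the default per-file implementation. */
static int demo_slow_lstat(const char *file_name, struct demo_stat *buf)
{
	assert(file_name && buf);
	slow_calls++;
	buf->size = 42;
	return 0;
}

/* Stand-in for an alternative implementation, e.g. one backed by a cache. */
static int demo_cached_lstat(const char *file_name, struct demo_stat *buf)
{
	assert(file_name && buf);
	buf->size = 42;
	return 0;
}

/* The pluggable entry point: callers always go through this pointer. */
static int (*demo_lstat)(const char *file_name, struct demo_stat *buf) =
	demo_slow_lstat;

/* Select the implementation at runtime, e.g. from a config value. */
static void demo_select_lstat(int use_cache)
{
	demo_lstat = use_cache ? demo_cached_lstat : demo_slow_lstat;
}
```

Callers keep writing `demo_lstat(path, &st)`; only the target of the
pointer changes, so no call site needs to know which implementation
is active.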

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw-posix.h | 2 +-
 compat/mingw.c       | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/compat/mingw-posix.h b/compat/mingw-posix.h
index 4539d3ca49883b..e1546978654d60 100644
--- a/compat/mingw-posix.h
+++ b/compat/mingw-posix.h
@@ -406,7 +406,7 @@ int mingw_fstat(int fd, struct stat *buf);
 #ifdef lstat
 #undef lstat
 #endif
-#define lstat mingw_lstat
+extern int (*lstat)(const char *file_name, struct stat *buf);
 
 
 int mingw_utime(const char *file_name, const struct utimbuf *times);
diff --git a/compat/mingw.c b/compat/mingw.c
index 4673b1c34f8c8c..c27860e67f5e63 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -1036,6 +1036,8 @@ static int do_stat_internal(int follow, const char *file_name, struct stat *buf)
 	return do_lstat(follow, alt_name, buf);
 }
 
+int (*lstat)(const char *file_name, struct stat *buf) = mingw_lstat;
+
 static int get_file_info_by_handle(HANDLE hnd, struct stat *buf)
 {
 	BY_HANDLE_FILE_INFORMATION fdata;

From 52937e34510cab654f239f52e74dd9f0d2b81151 Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Sun, 8 Sep 2013 14:23:27 +0200
Subject: [PATCH 454/553] mingw: add infrastructure for read-only file system
 level caches

Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.

This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.

Enable read-only sections for 'git status' and preload_index.
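
A minimal sketch of how a platform might back the macro, assuming a
simple nesting counter: the `demo_*` names are hypothetical, and the
sketch is single-threaded, whereas a real implementation would need
atomic updates. The generic `#ifndef` fallback at the end mirrors the
one added to git-compat-util.h.

```c
#include <assert.h>

static long demo_enabled;      /* nesting depth of read-only sections */
static int demo_cache_entries; /* pretend cache content */

/* Enter (enable != 0) or leave (enable == 0) a read-only section. */
static int demo_fscache_enable(int enable)
{
	long result = enable ? ++demo_enabled : --demo_enabled;

	assert(result >= 0);
	if (!enable && !result)
		demo_cache_entries = 0; /* last section left: drop the cache */
	return (int)result;
}

/* A platform header would define the macro before the generic fallback. */
#define enable_fscache(x) demo_fscache_enable(x)

#ifndef enable_fscache
#define enable_fscache(x) /* noop */
#endif

/* A code section that only reads from the file system. */
static int demo_scan_worktree(void)
{
	int seen;

	enable_fscache(1);
	demo_cache_entries = 3; /* reads would populate the cache here */
	seen = demo_cache_entries;
	enable_fscache(0);
	return seen;
}
```

Because the sections nest, cached data survives an inner section's
exit and is only dropped when the outermost section is left.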

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 Documentation/config/core.adoc |  6 ++++++
 builtin/commit.c               |  1 +
 compat/mingw.c                 |  6 ++++++
 compat/mingw.h                 |  2 ++
 git-compat-util.h              | 15 +++++++++++++++
 preload-index.c                |  3 +++
 6 files changed, 33 insertions(+)

diff --git a/Documentation/config/core.adoc b/Documentation/config/core.adoc
index 9bc9de29d90ecd..80dcc4b004b1e9 100644
--- a/Documentation/config/core.adoc
+++ b/Documentation/config/core.adoc
@@ -710,6 +710,12 @@ relatively high IO latencies.  When enabled, Git will do the
 index comparison to the filesystem data in parallel, allowing
 overlapping IO's.  Defaults to true.
 
+core.fscache::
+	Enable additional caching of file system data for some operations.
++
+Git for Windows uses this to bulk-read and cache lstat data of entire
+directories (instead of doing lstat file by file).
+
 core.unsetenvvars::
 	Windows-only: comma-separated list of environment variables'
 	names that need to be unset before spawning any other process.
diff --git a/builtin/commit.c b/builtin/commit.c
index 0243f17d53c97c..2309cf06acad09 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1622,6 +1622,7 @@ struct repository *repo UNUSED)
 		       PATHSPEC_PREFER_FULL,
 		       prefix, argv);
 
+	enable_fscache(1);
 	if (status_format != STATUS_FORMAT_PORCELAIN &&
 	    status_format != STATUS_FORMAT_PORCELAIN_V2)
 		progress_flag = REFRESH_PROGRESS;
diff --git a/compat/mingw.c b/compat/mingw.c
index c27860e67f5e63..9cd3c2c8d21edb 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -248,6 +248,7 @@ enum hide_dotfiles_type {
 
 static enum hide_dotfiles_type hide_dotfiles = HIDE_DOTFILES_DOTGITONLY;
 static char *unset_environment_variables;
+int core_fscache;
 
 int mingw_core_config(const char *var, const char *value,
 		      const struct config_context *ctx UNUSED,
@@ -261,6 +262,11 @@ int mingw_core_config(const char *var, const char *value,
 		return 0;
 	}
 
+	if (!strcmp(var, "core.fscache")) {
+		core_fscache = git_config_bool(var, value);
+		return 0;
+	}
+
 	if (!strcmp(var, "core.unsetenvvars")) {
 		if (!value)
 			return config_error_nonbool(var);
diff --git a/compat/mingw.h b/compat/mingw.h
index 6ea53ee0d29e17..65df57d2a786e4 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -1,5 +1,7 @@
 #include "mingw-posix.h"
 
+extern int core_fscache;
+
 struct config_context;
 int mingw_core_config(const char *var, const char *value,
 		      const struct config_context *ctx, void *cb);
diff --git a/git-compat-util.h b/git-compat-util.h
index 14eb9e346f9086..fa4cdd0fae404b 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1020,6 +1020,21 @@ static inline int is_missing_file_error(int errno_)
 	return (errno_ == ENOENT || errno_ == ENOTDIR);
 }
 
+/*
+ * Enable/disable a read-only cache for file system data on platforms that
+ * support it.
+ *
+ * Implementing a live-cache is complicated and requires special platform
+ * support (inotify, ReadDirectoryChangesW...). enable_fscache shall be used
+ * to mark sections of git code that extensively read from the file system
+ * without modifying anything. Implementations can use this to cache e.g. stat
+ * data or even file content without the need to synchronize with the file
+ * system.
+ */
+#ifndef enable_fscache
+#define enable_fscache(x) /* noop */
+#endif
+
 int cmd_main(int, const char **);
 
 /*
diff --git a/preload-index.c b/preload-index.c
index b222821b448526..61e8f3a1f6ec84 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -141,6 +141,7 @@ void preload_index(struct index_state *index,
 		pthread_mutex_init(&pd.mutex, NULL);
 	}
 
+	enable_fscache(1);
 	for (i = 0; i < threads; i++) {
 		struct thread_data *p = data+i;
 		int err;
@@ -176,6 +177,8 @@ void preload_index(struct index_state *index,
 
 	trace2_data_intmax("index", NULL, "preload/sum_lstat", t2_sum_lstat);
 	trace2_region_leave("index", "preload", NULL);
+
+	enable_fscache(0);
 }
 
 int repo_read_index_preload(struct repository *repo,

From a983fd140fa87e8e09cb54d9b6ad2ac4d3f04f3d Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 20 Sep 2017 21:52:28 +0200
Subject: [PATCH 455/553] git-gui--askyesno: fix funny text wrapping

The text wrapping seems to be aligned to the right side of the Yes
button, leaving an awful lot of empty space.

Let's try to counter this by using pixel units.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 git-gui/git-gui--askyesno | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/git-gui/git-gui--askyesno b/git-gui/git-gui--askyesno
index 142d1bc3de229b..837281fe337b6f 100755
--- a/git-gui/git-gui--askyesno
+++ b/git-gui/git-gui--askyesno
@@ -29,8 +29,8 @@ if {$argc < 1} {
 }
 
 ${NS}::frame .t
-${NS}::label .t.m -text $prompt -justify center -width 40
-.t.m configure -wraplength 400
+${NS}::label .t.m -text $prompt -justify center -width 400px
+.t.m configure -wraplength 400px
 pack .t.m -side top -fill x -padx 20 -pady 20 -expand 1
 pack .t -side top -fill x -ipadx 20 -ipady 20 -expand 1
 

From d5219bf2f3fc67e33ead3b803e2366bcce2653fd Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Tue, 1 Oct 2013 12:51:54 +0200
Subject: [PATCH 456/553] mingw: add a cache below mingw's lstat and dirent
 implementations

Checking the work tree status is quite slow on Windows, due to slow
`lstat()` emulation (git calls `lstat()` once for each file in the
index). Windows operating system APIs seem to be much better at scanning
the status of entire directories than checking single files.

Add an `lstat()` implementation that uses a cache for lstat data. Cache
misses read the entire parent directory and add it to the cache.
Subsequent `lstat()` calls for the same directory are served directly
from the cache.
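
The cache key is a (directory, name) pair, so each path is first split
into those two parts. The sketch below mirrors the splitting logic of
`fscache_lstat()` in this patch in isolation; `demo_is_dir_sep()` and
`struct demo_split` are hypothetical stand-ins introduced only for
illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for Git's is_dir_sep() on Windows. */
static int demo_is_dir_sep(char c)
{
	return c == '/' || c == '\\';
}

struct demo_split {
	size_t dirlen; /* length of the directory part, no trailing separator */
	size_t base;   /* offset of the file name within the path */
	size_t len;    /* path length without a trailing separator */
};

/* Split a relative path into its directory part and its last component. */
static struct demo_split demo_split_path(const char *filename)
{
	struct demo_split s;

	assert(filename);
	s.len = strlen(filename);
	if (s.len && demo_is_dir_sep(filename[s.len - 1]))
		s.len--;
	s.base = s.len;
	while (s.base && !demo_is_dir_sep(filename[s.base - 1]))
		s.base--;
	s.dirlen = s.base ? s.base - 1 : 0;
	return s;
}
```

For "src/lib/file.c" this yields the directory "src/lib" (7 bytes) and
the name "file.c" starting at offset 8; a path without a separator has
an empty directory part, which denotes the current directory.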

Also implement `opendir()`/`readdir()`/`closedir()` so that they create
and use directory listings in the cache.

The cache doesn't track file system changes and doesn't plug into any
modifying file APIs, so it has to be explicitly enabled for git functions
that don't modify the working copy.

Note: in an earlier version of this patch, the cache was always active and
tracked file system changes via ReadDirectoryChangesW. However, this was
much more complex and had negative impact on the performance of modifying
git commands such as 'git checkout'.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/win32/fscache.c              | 473 ++++++++++++++++++++++++++++
 compat/win32/fscache.h              |  10 +
 config.mak.uname                    |   4 +-
 contrib/buildsystems/CMakeLists.txt |   3 +-
 git-compat-util.h                   |   2 +
 meson.build                         |   1 +
 6 files changed, 490 insertions(+), 3 deletions(-)
 create mode 100644 compat/win32/fscache.c
 create mode 100644 compat/win32/fscache.h

diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
new file mode 100644
index 00000000000000..dc765ddd57b5bc
--- /dev/null
+++ b/compat/win32/fscache.c
@@ -0,0 +1,473 @@
+#include "../../git-compat-util.h"
+#include "../../hashmap.h"
+#include "../win32.h"
+#include "fscache.h"
+#include "../../dir.h"
+#include "../../abspath.h"
+
+static int initialized;
+static volatile long enabled;
+static struct hashmap map;
+static CRITICAL_SECTION mutex;
+
+/*
+ * An entry in the file system cache. Used for both entire directory listings
+ * and file entries.
+ */
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wpedantic"
+struct fsentry {
+	struct hashmap_entry ent;
+	mode_t st_mode;
+	/* Pointer to the directory listing, or NULL for the listing itself. */
+	struct fsentry *list;
+	/* Pointer to the next file entry of the list. */
+	struct fsentry *next;
+
+	union {
+		/* Reference count of the directory listing. */
+		volatile long refcnt;
+		struct {
+			/* More stat members (only used for file entries). */
+			off64_t st_size;
+			struct timespec st_atim;
+			struct timespec st_mtim;
+			struct timespec st_ctim;
+		} s;
+	} u;
+
+	/* Length of name. */
+	unsigned short len;
+	/*
+	 * Name of the entry. For directory listings: relative path of the
+	 * directory, without trailing '/' (empty for cwd()). For file entries:
+	 * name of the file. Typically points to the end of the structure if
+	 * the fsentry is allocated on the heap (see fsentry_alloc), or to a
+	 * local variable if on the stack (see fsentry_init).
+	 */
+	struct dirent dirent;
+};
+#pragma GCC diagnostic pop
+
+#pragma GCC diagnostic push
+#ifdef __clang__
+#pragma GCC diagnostic ignored "-Wflexible-array-extensions"
+#endif
+struct heap_fsentry {
+	union {
+		struct fsentry ent;
+		char dummy[sizeof(struct fsentry) + MAX_PATH];
+	} u;
+};
+#pragma GCC diagnostic pop
+
+/*
+ * Compares the paths of two fsentry structures for equality.
+ */
+static int fsentry_cmp(void *cmp_data UNUSED,
+		       const struct fsentry *fse1, const struct fsentry *fse2,
+		       void *keydata UNUSED)
+{
+	int res;
+	if (fse1 == fse2)
+		return 0;
+
+	/* compare the list parts first */
+	if (fse1->list != fse2->list &&
+	    (res = fsentry_cmp(NULL, fse1->list ? fse1->list : fse1,
+			       fse2->list ? fse2->list	: fse2, NULL)))
+		return res;
+
+	/* if list parts are equal, compare len and name */
+	if (fse1->len != fse2->len)
+		return fse1->len - fse2->len;
+	return fspathncmp(fse1->dirent.d_name, fse2->dirent.d_name, fse1->len);
+}
+
+/*
+ * Calculates the hash code of an fsentry structure's path.
+ */
+static unsigned int fsentry_hash(const struct fsentry *fse)
+{
+	unsigned int hash = fse->list ? fse->list->ent.hash : 0;
+	return hash ^ memihash(fse->dirent.d_name, fse->len);
+}
+
+/*
+ * Initialize an fsentry structure for use by fsentry_hash and fsentry_cmp.
+ */
+static void fsentry_init(struct fsentry *fse, struct fsentry *list,
+			 const char *name, size_t len)
+{
+	fse->list = list;
+	if (len > MAX_PATH)
+		BUG("Trying to allocate fsentry for long path '%.*s'",
+		    (int)len, name);
+	memcpy(fse->dirent.d_name, name, len);
+	fse->dirent.d_name[len] = 0;
+	fse->len = len;
+	hashmap_entry_init(&fse->ent, fsentry_hash(fse));
+}
+
+/*
+ * Allocate an fsentry structure on the heap.
+ */
+static struct fsentry *fsentry_alloc(struct fsentry *list, const char *name,
+		size_t len)
+{
+	/* overallocate fsentry and copy the name to the end */
+	struct fsentry *fse = xmalloc(sizeof(struct fsentry) + len + 1);
+	/* init the rest of the structure */
+	fsentry_init(fse, list, name, len);
+	fse->next = NULL;
+	fse->u.refcnt = 1;
+	return fse;
+}
+
+/*
+ * Add a reference to an fsentry.
+ */
+inline static void fsentry_addref(struct fsentry *fse)
+{
+	if (fse->list)
+		fse = fse->list;
+
+	InterlockedIncrement(&(fse->u.refcnt));
+}
+
+/*
+ * Release the reference to an fsentry, frees the memory if it's the last ref.
+ */
+static void fsentry_release(struct fsentry *fse)
+{
+	if (fse->list)
+		fse = fse->list;
+
+	if (InterlockedDecrement(&(fse->u.refcnt)))
+		return;
+
+	while (fse) {
+		struct fsentry *next = fse->next;
+		free(fse);
+		fse = next;
+	}
+}
+
+/*
+ * Allocate and initialize an fsentry from a WIN32_FIND_DATA structure.
+ */
+static struct fsentry *fseentry_create_entry(struct fsentry *list,
+					     const WIN32_FIND_DATAW *fdata)
+{
+	char buf[MAX_PATH * 3];
+	int len;
+	struct fsentry *fse;
+	len = xwcstoutf(buf, fdata->cFileName, ARRAY_SIZE(buf));
+
+	fse = fsentry_alloc(list, buf, len);
+
+	fse->st_mode = file_attr_to_st_mode(fdata->dwFileAttributes);
+	fse->dirent.d_type = S_ISDIR(fse->st_mode) ? DT_DIR : DT_REG;
+	fse->u.s.st_size = (((off64_t) (fdata->nFileSizeHigh)) << 32)
+			| fdata->nFileSizeLow;
+	filetime_to_timespec(&(fdata->ftLastAccessTime), &(fse->u.s.st_atim));
+	filetime_to_timespec(&(fdata->ftLastWriteTime), &(fse->u.s.st_mtim));
+	filetime_to_timespec(&(fdata->ftCreationTime), &(fse->u.s.st_ctim));
+
+	return fse;
+}
+
+/*
+ * Create an fsentry-based directory listing (similar to opendir / readdir).
+ * Dir should not contain trailing '/'. Use an empty string for the current
+ * directory (not "."!).
+ */
+static struct fsentry *fsentry_create_list(const struct fsentry *dir)
+{
+	wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */
+	WIN32_FIND_DATAW fdata;
+	HANDLE h;
+	int wlen;
+	struct fsentry *list, **phead;
+	DWORD err;
+
+	/* convert name to UTF-16 and check length < MAX_PATH */
+	if ((wlen = xutftowcsn(pattern, dir->dirent.d_name, MAX_PATH,
+			       dir->len)) < 0) {
+		if (errno == ERANGE)
+			errno = ENAMETOOLONG;
+		return NULL;
+	}
+
+	/* append optional '/' and wildcard '*' */
+	if (wlen)
+		pattern[wlen++] = '/';
+	pattern[wlen++] = '*';
+	pattern[wlen] = 0;
+
+	/* open find handle */
+	h = FindFirstFileW(pattern, &fdata);
+	if (h == INVALID_HANDLE_VALUE) {
+		err = GetLastError();
+		errno = (err == ERROR_DIRECTORY) ? ENOTDIR : err_win_to_posix(err);
+		return NULL;
+	}
+
+	/* allocate object to hold directory listing */
+	list = fsentry_alloc(NULL, dir->dirent.d_name, dir->len);
+
+	/* walk directory and build linked list of fsentry structures */
+	phead = &list->next;
+	do {
+		*phead = fseentry_create_entry(list, &fdata);
+		phead = &(*phead)->next;
+	} while (FindNextFileW(h, &fdata));
+
+	/* remember result of last FindNextFile, then close find handle */
+	err = GetLastError();
+	FindClose(h);
+
+	/* return the list if we've got all the files */
+	if (err == ERROR_NO_MORE_FILES)
+		return list;
+
+	/* otherwise free the list and return error */
+	fsentry_release(list);
+	errno = err_win_to_posix(err);
+	return NULL;
+}
+
+/*
+ * Adds a directory listing to the cache.
+ */
+static void fscache_add(struct fsentry *fse)
+{
+	if (fse->list)
+		fse = fse->list;
+
+	for (; fse; fse = fse->next)
+		hashmap_add(&map, &fse->ent);
+}
+
+/*
+ * Clears the cache.
+ */
+static void fscache_clear(void)
+{
+	hashmap_clear_and_free(&map, struct fsentry, ent);
+	hashmap_init(&map, (hashmap_cmp_fn)fsentry_cmp, NULL, 0);
+}
+
+/*
+ * Checks if the cache is enabled for the given path.
+ */
+static inline int fscache_enabled(const char *path)
+{
+	return enabled > 0 && !is_absolute_path(path);
+}
+
+/*
+ * Looks up or creates a cache entry for the specified key.
+ */
+static struct fsentry *fscache_get(struct fsentry *key)
+{
+	struct fsentry *fse;
+
+	EnterCriticalSection(&mutex);
+	/* check if entry is in cache */
+	fse = hashmap_get_entry(&map, key, ent, NULL);
+	if (fse) {
+		fsentry_addref(fse);
+		LeaveCriticalSection(&mutex);
+		return fse;
+	}
+	/* if looking for a file, check if directory listing is in cache */
+	if (!fse && key->list) {
+		fse = hashmap_get_entry(&map, key->list, ent, NULL);
+		if (fse) {
+			LeaveCriticalSection(&mutex);
+			/* dir entry without file entry -> file doesn't exist */
+			errno = ENOENT;
+			return NULL;
+		}
+	}
+
+	/* create the directory listing (outside mutex!) */
+	LeaveCriticalSection(&mutex);
+	fse = fsentry_create_list(key->list ? key->list : key);
+	if (!fse)
+		return NULL;
+
+	EnterCriticalSection(&mutex);
+	/* add directory listing if it hasn't been added by some other thread */
+	if (!hashmap_get_entry(&map, key, ent, NULL))
+		fscache_add(fse);
+
+	/* lookup file entry if requested (fse already points to directory) */
+	if (key->list)
+		fse = hashmap_get_entry(&map, key, ent, NULL);
+
+	/* return entry or ENOENT */
+	if (fse)
+		fsentry_addref(fse);
+	else
+		errno = ENOENT;
+
+	LeaveCriticalSection(&mutex);
+	return fse;
+}
+
+/*
+ * Enables or disables the cache. Note that the cache is read-only, changes to
+ * the working directory are NOT reflected in the cache while enabled.
+ */
+int fscache_enable(int enable)
+{
+	int result;
+
+	if (!initialized) {
+		/* allow the cache to be disabled entirely */
+		if (!core_fscache)
+			return 0;
+
+		InitializeCriticalSection(&mutex);
+		hashmap_init(&map, (hashmap_cmp_fn) fsentry_cmp, NULL, 0);
+		initialized = 1;
+	}
+
+	result = enable ? InterlockedIncrement(&enabled)
+			: InterlockedDecrement(&enabled);
+
+	if (enable && result == 1) {
+		/* redirect opendir and lstat to the fscache implementations */
+		opendir = fscache_opendir;
+		lstat = fscache_lstat;
+	} else if (!enable && !result) {
+		/* reset opendir and lstat to the original implementations */
+		opendir = dirent_opendir;
+		lstat = mingw_lstat;
+		EnterCriticalSection(&mutex);
+		fscache_clear();
+		LeaveCriticalSection(&mutex);
+	}
+	return result;
+}
+
+/*
+ * Lstat replacement, uses the cache if enabled, otherwise redirects to
+ * mingw_lstat.
+ */
+int fscache_lstat(const char *filename, struct stat *st)
+{
+	int dirlen, base, len;
+#pragma GCC diagnostic push
+#ifdef __clang__
+#pragma GCC diagnostic ignored "-Wflexible-array-extensions"
+#endif
+	struct heap_fsentry key[2];
+#pragma GCC diagnostic pop
+	struct fsentry *fse;
+
+	if (!fscache_enabled(filename))
+		return mingw_lstat(filename, st);
+
+	/* split filename into path + name */
+	len = strlen(filename);
+	if (len && is_dir_sep(filename[len - 1]))
+		len--;
+	base = len;
+	while (base && !is_dir_sep(filename[base - 1]))
+		base--;
+	dirlen = base ? base - 1 : 0;
+
+	/* lookup entry for path + name in cache */
+	fsentry_init(&key[0].u.ent, NULL, filename, dirlen);
+	fsentry_init(&key[1].u.ent, &key[0].u.ent, filename + base, len - base);
+	fse = fscache_get(&key[1].u.ent);
+	if (!fse) {
+		errno = ENOENT;
+		return -1;
+	}
+
+	/* copy stat data */
+	st->st_ino = 0;
+	st->st_gid = 0;
+	st->st_uid = 0;
+	st->st_dev = 0;
+	st->st_rdev = 0;
+	st->st_nlink = 1;
+	st->st_mode = fse->st_mode;
+	st->st_size = fse->u.s.st_size;
+	st->st_atim = fse->u.s.st_atim;
+	st->st_mtim = fse->u.s.st_mtim;
+	st->st_ctim = fse->u.s.st_ctim;
+
+	/* don't forget to release fsentry */
+	fsentry_release(fse);
+	return 0;
+}
+
+typedef struct fscache_DIR {
+	struct DIR base_dir; /* extend base struct DIR */
+	struct fsentry *pfsentry;
+	struct dirent *dirent;
+} fscache_DIR;
+
+/*
+ * Readdir replacement.
+ */
+static struct dirent *fscache_readdir(DIR *base_dir)
+{
+	fscache_DIR *dir = (fscache_DIR*) base_dir;
+	struct fsentry *next = dir->pfsentry->next;
+	if (!next)
+		return NULL;
+	dir->pfsentry = next;
+	dir->dirent = &next->dirent;
+	return dir->dirent;
+}
+
+/*
+ * Closedir replacement.
+ */
+static int fscache_closedir(DIR *base_dir)
+{
+	fscache_DIR *dir = (fscache_DIR*) base_dir;
+	fsentry_release(dir->pfsentry);
+	free(dir);
+	return 0;
+}
+
+/*
+ * Opendir replacement, uses a directory listing from the cache if enabled,
+ * otherwise calls original dirent implementation.
+ */
+DIR *fscache_opendir(const char *dirname)
+{
+	struct heap_fsentry key;
+	struct fsentry *list;
+	fscache_DIR *dir;
+	int len;
+
+	if (!fscache_enabled(dirname))
+		return dirent_opendir(dirname);
+
+	/* prepare name (strip trailing '/', replace '.') */
+	len = strlen(dirname);
+	if ((len == 1 && dirname[0] == '.') ||
+	    (len && is_dir_sep(dirname[len - 1])))
+		len--;
+
+	/* get directory listing from cache */
+	fsentry_init(&key.u.ent, NULL, dirname, len);
+	list = fscache_get(&key.u.ent);
+	if (!list)
+		return NULL;
+
+	/* alloc and return DIR structure */
+	dir = (fscache_DIR*) xmalloc(sizeof(fscache_DIR));
+	dir->base_dir.preaddir = fscache_readdir;
+	dir->base_dir.pclosedir = fscache_closedir;
+	dir->pfsentry = list;
+	return (DIR*) dir;
+}
diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h
new file mode 100644
index 00000000000000..ed518b422d705e
--- /dev/null
+++ b/compat/win32/fscache.h
@@ -0,0 +1,10 @@
+#ifndef FSCACHE_H
+#define FSCACHE_H
+
+int fscache_enable(int enable);
+#define enable_fscache(x) fscache_enable(x)
+
+DIR *fscache_opendir(const char *dir);
+int fscache_lstat(const char *file_name, struct stat *buf);
+
+#endif
diff --git a/config.mak.uname b/config.mak.uname
index 4d5b1db4272cfa..bbcfeb4eee0f05 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -508,7 +508,7 @@ endif
 		compat/win32/path-utils.o \
 		compat/win32/pthread.o compat/win32/syslog.o \
 		compat/win32/trace2_win32_process_info.o \
-		compat/win32/dirent.o
+		compat/win32/dirent.o compat/win32/fscache.o
 	COMPAT_CFLAGS = -D__USE_MINGW_ACCESS -DDETECT_MSYS_TTY \
 		-DENSURE_MSYSTEM_IS_SET="\"$(MSYSTEM)\"" -DMINGW_PREFIX="\"$(patsubst /%,%,$(MINGW_PREFIX))\"" \
 		-DNOGDI -DHAVE_STRING_H -Icompat -Icompat/regex -Icompat/win32 -DSTRIP_EXTENSION=\".exe\"
@@ -713,7 +713,7 @@ ifeq ($(uname_S),MINGW)
 		compat/win32/flush.o \
 		compat/win32/path-utils.o \
 		compat/win32/pthread.o compat/win32/syslog.o \
-		compat/win32/dirent.o
+		compat/win32/dirent.o compat/win32/fscache.o
 	BASIC_CFLAGS += -DWIN32
 	EXTLIBS += -lws2_32
 	GITLIBS += git.res
diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index fe26cebdab7dd4..7be6a0a7fccb72 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -301,7 +301,8 @@ if(CMAKE_SYSTEM_NAME STREQUAL "Windows")
 		compat/win32/trace2_win32_process_info.c
 		compat/win32/dirent.c
 		compat/nedmalloc/nedmalloc.c
-		compat/strdup.c)
+		compat/strdup.c
+		compat/win32/fscache.c)
 	set(NO_UNIX_SOCKETS 1)
 
 elseif(CMAKE_SYSTEM_NAME STREQUAL "Linux")
diff --git a/git-compat-util.h b/git-compat-util.h
index fa4cdd0fae404b..0ec082c57dbb7d 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -162,9 +162,11 @@ static inline int is_xplatform_dir_sep(int c)
 /* pull in Windows compatibility stuff */
 #include "compat/win32/path-utils.h"
 #include "compat/mingw.h"
+#include "compat/win32/fscache.h"
 #elif defined(_MSC_VER)
 #include "compat/win32/path-utils.h"
 #include "compat/msvc.h"
+#include "compat/win32/fscache.h"
 #endif
 
 /* used on Mac OS X */
diff --git a/meson.build b/meson.build
index c7f62875afd710..e6c2e592b4297a 100644
--- a/meson.build
+++ b/meson.build
@@ -1260,6 +1260,7 @@ elif host_machine.system() == 'windows'
     'compat/winansi.c',
     'compat/win32/dirent.c',
     'compat/win32/flush.c',
+    'compat/win32/fscache.c',
     'compat/win32/path-utils.c',
     'compat/win32/pthread.c',
     'compat/win32/syslog.c',

From e43b2970258a466a1450b3dd092290fefdc6741f Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 20 Sep 2017 21:55:45 +0200
Subject: [PATCH 457/553] git-gui--askyesno (mingw): use Git for Windows' icon,
 if available

For additional GUI goodness.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 git-gui/git-gui--askyesno | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/git-gui/git-gui--askyesno b/git-gui/git-gui--askyesno
index 837281fe337b6f..e431f86a8e16ae 100755
--- a/git-gui/git-gui--askyesno
+++ b/git-gui/git-gui--askyesno
@@ -59,5 +59,17 @@ if {$::tcl_platform(platform) eq {windows}} {
 	}
 }
 
+if {$::tcl_platform(platform) eq {windows}} {
+	set icopath [file dirname [file normalize $argv0]]
+	if {[file tail $icopath] eq {git-core}} {
+		set icopath [file dirname $icopath]
+	}
+	set icopath [file dirname $icopath]
+	set icopath [file join $icopath share git git-for-windows.ico]
+	if {[file exists $icopath]} {
+		wm iconbitmap . -default $icopath
+	}
+}
+
 wm title . $title
 tk::PlaceWindow .

From 7841a159fb5ee358ae00fd03d7e49272affa8b62 Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Tue, 24 Jun 2014 13:22:35 +0200
Subject: [PATCH 458/553] fscache: load directories only once

If multiple threads access a directory that is not yet in the cache, each
thread loads the directory. Only one of the results is added to the cache;
all others are leaked. This wastes both time and memory.

On cache miss, add a future object to the cache to indicate that the
directory is currently being loaded. Subsequent threads register themselves
with the future object and wait. When the first thread has loaded the
directory, it replaces the future object with the result and notifies
waiting threads.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 compat/win32/fscache.c | 65 ++++++++++++++++++++++++++++++++++++------
 1 file changed, 56 insertions(+), 9 deletions(-)

diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index dc765ddd57b5bc..ff2479c7387f13 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -27,6 +27,8 @@ struct fsentry {
 	union {
 		/* Reference count of the directory listing. */
 		volatile long refcnt;
+		/* Handle to wait on the loading thread. */
+		HANDLE hwait;
 		struct {
 			/* More stat members (only used for file entries). */
 			off64_t st_size;
@@ -266,16 +268,43 @@ static inline int fscache_enabled(const char *path)
 	return enabled > 0 && !is_absolute_path(path);
 }
 
+/*
+ * Looks up a cache entry, waits if it's being loaded by another thread.
+ * The mutex must be owned by the calling thread.
+ */
+static struct fsentry *fscache_get_wait(struct fsentry *key)
+{
+	struct fsentry *fse = hashmap_get_entry(&map, key, ent, NULL);
+
+	/* return if it's a 'real' entry (future entries have refcnt == 0) */
+	if (!fse || fse->list || fse->u.refcnt)
+		return fse;
+
+	/* create an event and link our key to the future entry */
+	key->u.hwait = CreateEvent(NULL, TRUE, FALSE, NULL);
+	key->next = fse->next;
+	fse->next = key;
+
+	/* wait for the loading thread to signal us */
+	LeaveCriticalSection(&mutex);
+	WaitForSingleObject(key->u.hwait, INFINITE);
+	CloseHandle(key->u.hwait);
+	EnterCriticalSection(&mutex);
+
+	/* repeat cache lookup */
+	return hashmap_get_entry(&map, key, ent, NULL);
+}
+
 /*
  * Looks up or creates a cache entry for the specified key.
  */
 static struct fsentry *fscache_get(struct fsentry *key)
 {
-	struct fsentry *fse;
+	struct fsentry *fse, *future, *waiter;
 
 	EnterCriticalSection(&mutex);
 	/* check if entry is in cache */
-	fse = hashmap_get_entry(&map, key, ent, NULL);
+	fse = fscache_get_wait(key);
 	if (fse) {
 		fsentry_addref(fse);
 		LeaveCriticalSection(&mutex);
@@ -283,7 +312,7 @@ static struct fsentry *fscache_get(struct fsentry *key)
 	}
 	/* if looking for a file, check if directory listing is in cache */
 	if (!fse && key->list) {
-		fse = hashmap_get_entry(&map, key->list, ent, NULL);
+		fse = fscache_get_wait(key->list);
 		if (fse) {
 			LeaveCriticalSection(&mutex);
 			/* dir entry without file entry -> file doesn't exist */
@@ -292,16 +321,34 @@ static struct fsentry *fscache_get(struct fsentry *key)
 		}
 	}
 
+	/* add future entry to indicate that we're loading it */
+	future = key->list ? key->list : key;
+	future->next = NULL;
+	future->u.refcnt = 0;
+	hashmap_add(&map, &future->ent);
+
 	/* create the directory listing (outside mutex!) */
 	LeaveCriticalSection(&mutex);
-	fse = fsentry_create_list(key->list ? key->list : key);
-	if (!fse)
+	fse = fsentry_create_list(future);
+	EnterCriticalSection(&mutex);
+
+	/* remove future entry and signal waiting threads */
+	hashmap_remove(&map, &future->ent, NULL);
+	waiter = future->next;
+	while (waiter) {
+		HANDLE h = waiter->u.hwait;
+		waiter = waiter->next;
+		SetEvent(h);
+	}
+
+	/* leave on error (errno set by fsentry_create_list) */
+	if (!fse) {
+		LeaveCriticalSection(&mutex);
 		return NULL;
+	}
 
-	EnterCriticalSection(&mutex);
-	/* add directory listing if it hasn't been added by some other thread */
-	if (!hashmap_get_entry(&map, key, ent, NULL))
-		fscache_add(fse);
+	/* add directory listing to the cache */
+	fscache_add(fse);
 
 	/* lookup file entry if requested (fse already points to directory) */
 	if (key->list)

From 534b4d16bb575aef84481226b99a3fe96f74c75b Mon Sep 17 00:00:00 2001
From: Jeff Hostetler <jeffhost@microsoft.com>
Date: Tue, 24 Jan 2017 15:12:13 -0500
Subject: [PATCH 459/553] fscache: add key for GIT_TRACE_FSCACHE

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/win32/fscache.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index ff2479c7387f13..d67dc918d6b71c 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -4,11 +4,13 @@
 #include "fscache.h"
 #include "../../dir.h"
 #include "../../abspath.h"
+#include "../../trace.h"
 
 static int initialized;
 static volatile long enabled;
 static struct hashmap map;
 static CRITICAL_SECTION mutex;
+static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE);
 
 /*
  * An entry in the file system cache. Used for both entire directory listings
@@ -212,6 +214,8 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir)
 	if (h == INVALID_HANDLE_VALUE) {
 		err = GetLastError();
 		errno = (err == ERROR_DIRECTORY) ? ENOTDIR : err_win_to_posix(err);
+		trace_printf_key(&trace_fscache, "fscache: error(%d) '%s'\n",
+						 errno, dir->dirent.d_name);
 		return NULL;
 	}
 
@@ -397,6 +401,7 @@ int fscache_enable(int enable)
 		fscache_clear();
 		LeaveCriticalSection(&mutex);
 	}
+	trace_printf_key(&trace_fscache, "fscache: enable(%d)\n", enable);
 	return result;
 }
 

From 3aea2e5ab474e31d71f463a5b43f52617bc62ef0 Mon Sep 17 00:00:00 2001
From: Jeff Hostetler <jeffhost@microsoft.com>
Date: Tue, 13 Dec 2016 14:05:32 -0500
Subject: [PATCH 460/553] fscache: remember not-found directories

Teach FSCACHE to remember "not found" directories.

This is a performance optimization.

FSCACHE is a performance optimization available for Windows.  It
redirects POSIX-style lstat() calls to an in-memory directory
cache built with FindFirst/FindNext.  It catches the first lstat()
call in a directory, uses FindFirst/FindNext to read the list of
files (and attribute data) for the entire directory into the cache,
and short-cuts subsequent lstat() calls in the same directory.
This gives a major performance boost on Windows.

However, it does not remember "not found" directories.  When STATUS
runs and there are missing directories, the lstat() interception
fails to find the parent directory and simply returns ENOENT for the
file -- it does not remember that the FindFirst on the directory
failed.  Thus subsequent lstat() calls in the same directory each
re-attempt the FindFirst.  This completely defeats any performance
gains.
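
The idea can be sketched in isolation.  This is only an illustration,
not the code in this patch: scan_directory(), dir_exists_cached() and
the fixed-size table are hypothetical stand-ins for FindFirst and the
fscache hashmap.  The point is that a failed scan is recorded too, so
repeated lookups in a missing directory cost one scan in total.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins; none of these names exist in fscache.c. */
#define MAX_DIRS 16

struct dir_slot {
	const char *name;	/* caller keeps the string alive */
	int exists;		/* 0 marks a remembered "not found" directory */
};

struct dir_slot slots[MAX_DIRS];
int nr_slots;
int scans;			/* how many times the expensive scan ran */

/* Pretend directory scan: only "src" exists in this toy file system. */
int scan_directory(const char *dir)
{
	scans++;
	return !strcmp(dir, "src");
}

/* Returns 1 if dir exists, 0 if not; scans each directory at most once. */
int dir_exists_cached(const char *dir)
{
	int i;

	for (i = 0; i < nr_slots; i++)
		if (!strcmp(slots[i].name, dir))
			return slots[i].exists;

	/* Cache miss: scan once and remember the answer, even a negative one. */
	assert(nr_slots < MAX_DIRS);
	slots[nr_slots].name = dir;
	slots[nr_slots].exists = scan_directory(dir);
	return slots[nr_slots++].exists;
}
```

Asking about the same missing directory several times leaves the scan
counter at 1, which is the behavior the new test checks for through
the trace output.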

This can be seen by doing a sparse-checkout on a large repo and
then doing a read-tree to reset the skip-worktree bits and then
running status.

This change reduced status times for my very large repo by 60%.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/win32/fscache.c | 36 ++++++++++++++++++++++++++++++++----
 1 file changed, 32 insertions(+), 4 deletions(-)

diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index d67dc918d6b71c..7aa3450e7edf47 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -186,7 +186,8 @@ static struct fsentry *fseentry_create_entry(struct fsentry *list,
  * Dir should not contain trailing '/'. Use an empty string for the current
  * directory (not "."!).
  */
-static struct fsentry *fsentry_create_list(const struct fsentry *dir)
+static struct fsentry *fsentry_create_list(const struct fsentry *dir,
+					   int *dir_not_found)
 {
 	wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */
 	WIN32_FIND_DATAW fdata;
@@ -195,6 +196,8 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir)
 	struct fsentry *list, **phead;
 	DWORD err;
 
+	*dir_not_found = 0;
+
 	/* convert name to UTF-16 and check length < MAX_PATH */
 	if ((wlen = xutftowcsn(pattern, dir->dirent.d_name, MAX_PATH,
 			       dir->len)) < 0) {
@@ -213,6 +216,7 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir)
 	h = FindFirstFileW(pattern, &fdata);
 	if (h == INVALID_HANDLE_VALUE) {
 		err = GetLastError();
+		*dir_not_found = 1; /* or empty directory */
 		errno = (err == ERROR_DIRECTORY) ? ENOTDIR : err_win_to_posix(err);
 		trace_printf_key(&trace_fscache, "fscache: error(%d) '%s'\n",
 						 errno, dir->dirent.d_name);
@@ -221,6 +225,8 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir)
 
 	/* allocate object to hold directory listing */
 	list = fsentry_alloc(NULL, dir->dirent.d_name, dir->len);
+	list->st_mode = S_IFDIR;
+	list->dirent.d_type = DT_DIR;
 
 	/* walk directory and build linked list of fsentry structures */
 	phead = &list->next;
@@ -305,12 +311,16 @@ static struct fsentry *fscache_get_wait(struct fsentry *key)
 static struct fsentry *fscache_get(struct fsentry *key)
 {
 	struct fsentry *fse, *future, *waiter;
+	int dir_not_found;
 
 	EnterCriticalSection(&mutex);
 	/* check if entry is in cache */
 	fse = fscache_get_wait(key);
 	if (fse) {
-		fsentry_addref(fse);
+		if (fse->st_mode)
+			fsentry_addref(fse);
+		else
+			fse = NULL; /* non-existing directory */
 		LeaveCriticalSection(&mutex);
 		return fse;
 	}
@@ -319,7 +329,10 @@ static struct fsentry *fscache_get(struct fsentry *key)
 		fse = fscache_get_wait(key->list);
 		if (fse) {
 			LeaveCriticalSection(&mutex);
-			/* dir entry without file entry -> file doesn't exist */
+			/*
+			 * dir entry without file entry, or dir does not
+			 * exist -> file doesn't exist
+			 */
 			errno = ENOENT;
 			return NULL;
 		}
@@ -333,7 +346,7 @@ static struct fsentry *fscache_get(struct fsentry *key)
 
 	/* create the directory listing (outside mutex!) */
 	LeaveCriticalSection(&mutex);
-	fse = fsentry_create_list(future);
+	fse = fsentry_create_list(future, &dir_not_found);
 	EnterCriticalSection(&mutex);
 
 	/* remove future entry and signal waiting threads */
@@ -347,6 +360,18 @@ static struct fsentry *fscache_get(struct fsentry *key)
 
 	/* leave on error (errno set by fsentry_create_list) */
 	if (!fse) {
+		if (dir_not_found && key->list) {
+			/*
+			 * Record that the directory does not exist (or is
+			 * empty, which for all practical matters is the same
+			 * thing as far as fscache is concerned).
+			 */
+			fse = fsentry_alloc(key->list->list,
+					    key->list->dirent.d_name,
+					    key->list->len);
+			fse->st_mode = 0;
+			hashmap_add(&map, &fse->ent);
+		}
 		LeaveCriticalSection(&mutex);
 		return NULL;
 	}
@@ -358,6 +383,9 @@ static struct fsentry *fscache_get(struct fsentry *key)
 	if (key->list)
 		fse = hashmap_get_entry(&map, key, ent, NULL);
 
+	if (fse && !fse->st_mode)
+		fse = NULL; /* non-existing directory */
+
 	/* return entry or ENOENT */
 	if (fse)
 		fsentry_addref(fse);

From a063a3b2112a2218eb7a666718e2a10b50df33b2 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 25 Jan 2017 18:39:16 +0100
Subject: [PATCH 461/553] fscache: add a test for the dir-not-found
 optimization

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t1090-sparse-checkout-scope.sh | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/t/t1090-sparse-checkout-scope.sh b/t/t1090-sparse-checkout-scope.sh
index 3a14218b245d4c..529844e2862c74 100755
--- a/t/t1090-sparse-checkout-scope.sh
+++ b/t/t1090-sparse-checkout-scope.sh
@@ -106,4 +106,24 @@ test_expect_success 'in partial clone, sparse checkout only fetches needed blobs
 	test_cmp expect actual
 '
 
+test_expect_success MINGW 'no unnecessary opendir() with fscache' '
+	git clone . fscache-test &&
+	(
+		cd fscache-test &&
+		git config core.fscache 1 &&
+		echo "/excluded/*" >.git/info/sparse-checkout &&
+		for f in $(test_seq 10)
+		do
+			sha1=$(echo $f | git hash-object -w --stdin) &&
+			git update-index --add \
+				--cacheinfo 100644,$sha1,excluded/$f || exit 1
+		done &&
+		test_tick &&
+		git commit -m excluded &&
+		GIT_TRACE_FSCACHE=1 git status >out 2>err &&
+		grep excluded err >grep.out &&
+		test_line_count = 1 grep.out
+	)
+'
+
 test_done

From c6a1d46c123992c6fef2d7ce39ca3a012e181304 Mon Sep 17 00:00:00 2001
From: Jeff Hostetler <jeffhost@microsoft.com>
Date: Tue, 22 Nov 2016 11:26:38 -0500
Subject: [PATCH 462/553] add: use preload-index and fscache for performance

Teach "add" to use preload-index and fscache features
to improve performance on very large repositories.

During an "add", a call is made to run_diff_files()
which calls check_remove() for each index-entry.  This
calls lstat().  On Windows, the fscache code intercepts
the lstat() calls and builds a private cache using the
FindFirst/FindNext routines, which are much faster.

Somewhat independent of this is the preload-index code,
which distributes some of the start-up costs across
multiple threads.

We need to keep the call to read_cache() before parsing the
pathspecs (and hence cannot use the pathspecs to limit any preload)
because parse_pathspec() is using the index to determine whether a
pathspec is, in fact, in a submodule. If we did not read the index
first, parse_pathspec() would not error out on a path that is inside
a submodule, and t7400-submodule-basic.sh would fail with

	not ok 47 - do not add files from a submodule

We still want the nice preload performance boost, though, so we simply
call read_cache_preload(&pathspecs) after parsing the pathspecs.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/add.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/builtin/add.c b/builtin/add.c
index 32709794b3873f..25add8da962cab 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -493,6 +493,10 @@ int cmd_add(int argc,
 	die_in_unpopulated_submodule(repo->index, prefix);
 	die_path_inside_submodule(repo->index, &pathspec);
 
+	enable_fscache(1);
+	/* We do not really re-read the index but update the up-to-date flags */
+	preload_index(repo->index, &pathspec, 0);
+
 	if (add_new_files) {
 		int baselen;
 
@@ -605,5 +609,6 @@ int cmd_add(int argc,
 	free(ps_matched);
 	dir_clear(&dir);
 	clear_pathspec(&pathspec);
+	enable_fscache(0);
 	return exit_status;
 }

From 101718ffd0dd4acfa8246ff8454a6be16c26434e Mon Sep 17 00:00:00 2001
From: Jeff Hostetler <jeffhost@microsoft.com>
Date: Wed, 1 Nov 2017 15:05:44 -0400
Subject: [PATCH 463/553] dir.c: make add_excludes aware of fscache during
 status

Teach read_directory_recursive() and add_excludes() to
be aware of optional fscache and avoid trying to open()
and fstat() non-existent ".gitignore" files in every
directory in the worktree.

The current code in add_excludes() calls open() and then
fstat() for a ".gitignore" file in each directory present
in the worktree.  Change that when fscache is enabled to
call lstat() first and if present, call open().

This seems backwards because lstat needs to do more
work than fstat.  But when fscache is enabled, fscache will
already know if the .gitignore file exists and can completely
avoid the IO calls.  This works because of the lstat diversion
to mingw_lstat when fscache is enabled.

This reduced status times on a 350K file enlistment of the
Windows repo on an NVMe SSD by 0.25 seconds.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 compat/win32/fscache.c |  5 +++++
 compat/win32/fscache.h |  3 +++
 dir.c                  | 39 ++++++++++++++++++++++++++++++---------
 git-compat-util.h      |  4 ++++
 4 files changed, 42 insertions(+), 9 deletions(-)

diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index 7aa3450e7edf47..edec8f5813fcf1 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -12,6 +12,11 @@ static struct hashmap map;
 static CRITICAL_SECTION mutex;
 static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE);
 
+int fscache_is_enabled(void)
+{
+	return enabled;
+}
+
 /*
  * An entry in the file system cache. Used for both entire directory listings
  * and file entries.
diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h
index ed518b422d705e..9a21fd5709c5bc 100644
--- a/compat/win32/fscache.h
+++ b/compat/win32/fscache.h
@@ -4,6 +4,9 @@
 int fscache_enable(int enable);
 #define enable_fscache(x) fscache_enable(x)
 
+int fscache_is_enabled(void);
+#define is_fscache_enabled() (fscache_is_enabled())
+
 DIR *fscache_opendir(const char *dir);
 int fscache_lstat(const char *file_name, struct stat *buf);
 
diff --git a/dir.c b/dir.c
index b00821f294fea2..154eab4b405685 100644
--- a/dir.c
+++ b/dir.c
@@ -1156,16 +1156,37 @@ static int add_patterns(const char *fname, const char *base, int baselen,
 	size_t size = 0;
 	char *buf;
 
-	if (flags & PATTERN_NOFOLLOW)
-		fd = open_nofollow(fname, O_RDONLY);
-	else
-		fd = open(fname, O_RDONLY);
-
-	if (fd < 0 || fstat(fd, &st) < 0) {
-		if (fd < 0)
-			warn_on_fopen_errors(fname);
+	/*
+	 * Since `clang`'s `-Wunreachable-code` mode is clever, it would figure
+	 * out that on non-Windows platforms, this `lstat()` is unreachable.
+	 * We do want to keep the conditional block for the sake of Windows,
+	 * though, so let's use the `NOT_CONSTANT()` trick to suppress that error.
+	 */
+	if (NOT_CONSTANT(is_fscache_enabled(fname))) {
+		if (lstat(fname, &st) < 0) {
+			fd = -1;
+		} else {
+			fd = open(fname, O_RDONLY);
+			if (fd < 0)
+				warn_on_fopen_errors(fname);
+		}
+	} else {
+		if (flags & PATTERN_NOFOLLOW)
+			fd = open_nofollow(fname, O_RDONLY);
 		else
-			close(fd);
+			fd = open(fname, O_RDONLY);
+
+		if (fd < 0 || fstat(fd, &st) < 0) {
+			if (fd < 0)
+				warn_on_fopen_errors(fname);
+			else {
+				close(fd);
+				fd = -1;
+			}
+		}
+	}
+
+	if (fd < 0) {
 		if (!istate)
 			return -1;
 		r = read_skip_worktree_file_from_index(istate, fname,
diff --git a/git-compat-util.h b/git-compat-util.h
index 0ec082c57dbb7d..9da9ea8ea06e45 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1037,6 +1037,10 @@ static inline int is_missing_file_error(int errno_)
 #define enable_fscache(x) /* noop */
 #endif
 
+#ifndef is_fscache_enabled
+#define is_fscache_enabled() (0)
+#endif
+
 int cmd_main(int, const char **);
 
 /*

From b36f5300a969da4847f1cd5e196d1d1ec464faa6 Mon Sep 17 00:00:00 2001
From: Jeff Hostetler <jeffhost@microsoft.com>
Date: Wed, 20 Dec 2017 10:43:41 -0500
Subject: [PATCH 464/553] fscache: make fscache_enabled() public

Make fscache_enabled() function public rather than static.
Remove unneeded fscache_is_enabled() function.
Change is_fscache_enabled() macro to call fscache_enabled().

is_fscache_enabled() now takes a pathname so that the answer
is more precise and means "is fscache enabled for this pathname".
Since fscache only stores repo-relative paths and not absolute
paths, we can avoid attempting lookups for absolute paths.
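
A simplified sketch of such a per-path gate follows.  The names
looks_absolute() and cache_enabled_for() are hypothetical, and the
absolute-path test is a rough assumption (leading separator or drive
letter), not a copy of is_absolute_path().

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical switch standing in for the fscache "enabled" counter. */
int cache_enabled = 1;

/* Rough check: leading '/' or '\', or a drive prefix such as "C:". */
int looks_absolute(const char *path)
{
	assert(path);
	if (path[0] == '/' || path[0] == '\\')
		return 1;
	return isalpha((unsigned char)path[0]) && path[1] == ':';
}

/* The cache only holds relative paths, so absolute ones are skipped. */
int cache_enabled_for(const char *path)
{
	return cache_enabled > 0 && !looks_absolute(path);
}
```

With this gate, a relative path such as "dir/.gitignore" is eligible
for a cache lookup, while "C:/Users/me/.gitignore" goes straight to
the uncached code path.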

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 compat/win32/fscache.c | 7 +------
 compat/win32/fscache.h | 4 ++--
 git-compat-util.h      | 2 +-
 3 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index edec8f5813fcf1..6e44df0a2dc2e7 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -12,11 +12,6 @@ static struct hashmap map;
 static CRITICAL_SECTION mutex;
 static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE);
 
-int fscache_is_enabled(void)
-{
-	return enabled;
-}
-
 /*
  * An entry in the file system cache. Used for both entire directory listings
  * and file entries.
@@ -278,7 +273,7 @@ static void fscache_clear(void)
 /*
  * Checks if the cache is enabled for the given path.
  */
-static inline int fscache_enabled(const char *path)
+int fscache_enabled(const char *path)
 {
 	return enabled > 0 && !is_absolute_path(path);
 }
diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h
index 9a21fd5709c5bc..660ada053b4309 100644
--- a/compat/win32/fscache.h
+++ b/compat/win32/fscache.h
@@ -4,8 +4,8 @@
 int fscache_enable(int enable);
 #define enable_fscache(x) fscache_enable(x)
 
-int fscache_is_enabled(void);
-#define is_fscache_enabled() (fscache_is_enabled())
+int fscache_enabled(const char *path);
+#define is_fscache_enabled(path) fscache_enabled(path)
 
 DIR *fscache_opendir(const char *dir);
 int fscache_lstat(const char *file_name, struct stat *buf);
diff --git a/git-compat-util.h b/git-compat-util.h
index 9da9ea8ea06e45..86cf347f50a2f4 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1038,7 +1038,7 @@ static inline int is_missing_file_error(int errno_)
 #endif
 
 #ifndef is_fscache_enabled
-#define is_fscache_enabled() (0)
+#define is_fscache_enabled(path) (0)
 #endif
 
 int cmd_main(int, const char **);

From c05a4075445e91e898aae0ff839d13e8ab7355f8 Mon Sep 17 00:00:00 2001
From: Jeff Hostetler <jeffhost@microsoft.com>
Date: Wed, 20 Dec 2017 11:19:27 -0500
Subject: [PATCH 465/553] dir.c: regression fix for add_excludes with fscache

Fix regression described in:
https://github.com/git-for-windows/git/issues/1392

which was introduced in:
https://github.com/git-for-windows/git/commit/b2353379bba414e6c00dde913497cc9c827366f2

Problem Symptoms
================
When the user has a .gitignore file that is a symlink, the fscache
optimization introduced above caused the stat-data from the symlink,
rather than that of the target file, to be returned.  Later when the
ignore file was read, the buffer length did not match the stat.st_size
field and we called die("cannot use <path> as an exclude file").

Optimization Rationale
======================
The above optimization calls lstat() before open() primarily to ask
fscache if the file exists.  It gets the current stat-data as a side
effect essentially for free (since we already have it in memory).
If the file does not exist, it does not need to call open().  And
since very few directories have .gitignore files, we can greatly
reduce time spent in the filesystem.

Discussion of Fix
=================
The above optimization calls lstat() rather than stat() because the
fscache only intercepts lstat() calls.  Calls to stat() stay directed
to mingw_stat(), completely bypassing fscache.  Furthermore, calls
to mingw_stat() always call {open, fstat, close} so that symlinks are
properly dereferenced, which adds *additional* open/close calls on top
of what the original code in dir.c is doing.

Since the problem only manifests for symlinks, we add code to overwrite
the stat-data when the path is a symlink.  This preserves the effect of
the performance gains provided by the fscache in the normal case.
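
The pattern can be shown on its own with plain POSIX calls.  This is
a sketch under assumptions, not the dir.c code: the helper name
open_with_target_stat() is hypothetical, and error reporting is
reduced to returning -1.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` and fill `st` with stat-data that
 * describes the file that will actually be read.  Returns an open
 * descriptor, or -1 if the file is missing or cannot be opened.
 */
int open_with_target_stat(const char *path, struct stat *st)
{
	int fd;

	assert(path && st);

	/* Existence check; with fscache this would be answered from memory. */
	if (lstat(path, st) < 0)
		return -1;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	/* lstat() described the link itself, so re-stat via the descriptor. */
	if (S_ISLNK(st->st_mode) && fstat(fd, st) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

For a regular file the extra fstat() is never issued, so only the
rare symlink case pays for the second call.  The caller owns the
returned descriptor and must close() it.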

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 dir.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/dir.c b/dir.c
index 154eab4b405685..186b8832db98fa 100644
--- a/dir.c
+++ b/dir.c
@@ -1157,6 +1157,28 @@ static int add_patterns(const char *fname, const char *base, int baselen,
 	char *buf;
 
 	/*
+	 * A performance optimization for status.
+	 *
+	 * During a status scan, git looks in each directory for a .gitignore
+	 * file before scanning the directory.  Since .gitignore files are not
+	 * that common, we can waste a lot of time looking for files that are
+	 * not there.  Fortunately, the fscache already knows if the directory
+	 * contains a .gitignore file, since it has already read the directory
+	 * and it already has the stat-data.
+	 *
+	 * If the fscache is enabled, use the fscache-lstat() interlude to see
+	 * if the file exists (in the fscache hash maps) before trying to open()
+	 * it.
+	 *
+	 * This causes problem when the .gitignore file is a symlink, because
+	 * we call lstat() rather than stat() on the symlnk and the resulting
+	 * stat-data is for the symlink itself rather than the target file.
+	 * We CANNOT use stat() here because the fscache DOES NOT install an
+	 * interlude for stat() and mingw_stat() always calls "open-fstat-close"
+	 * on the file and defeats the purpose of the optimization here.  Since
+	 * symlinks are even more rare than .gitignore files, we force a fstat()
+	 * after our open() to get stat-data for the target file.
+	 *
 	 * Since `clang`'s `-Wunreachable-code` mode is clever, it would figure
 	 * out that on non-Windows platforms, this `lstat()` is unreachable.
 	 * We do want to keep the conditional block for the sake of Windows,
@@ -1169,6 +1191,11 @@ static int add_patterns(const char *fname, const char *base, int baselen,
 			fd = open(fname, O_RDONLY);
 			if (fd < 0)
 				warn_on_fopen_errors(fname);
+			else if (S_ISLNK(st.st_mode) && fstat(fd, &st) < 0) {
+				warn_on_fopen_errors(fname);
+				close(fd);
+				fd = -1;
+			}
 		}
 	} else {
 		if (flags & PATTERN_NOFOLLOW)

From bc0a7055cac9bc9dc9c024649eee9aca7ca02aac Mon Sep 17 00:00:00 2001
From: Takuto Ikuta <tikuta@chromium.org>
Date: Wed, 22 Nov 2017 20:39:38 +0900
Subject: [PATCH 466/553] fetch-pack.c: enable fscache for stats under
 .git/objects

When I run git fetch, git calls stat on files under .git/objects for
each ref. This takes time when there are many refs.

By enabling fscache, git gathers file stats by traversing directories,
which improves the speed of fetch-pack for repositories with a large
number of refs.

On my Windows workstation, this improves the time of `git fetch` for
the chromium repository as shown below. I took measurements 3 times.

* With this patch
TotalSeconds: 9.9825165
TotalSeconds: 9.1862075
TotalSeconds: 10.1956256
Avg: 9.78811653333333

* Without this patch
TotalSeconds: 15.8406702
TotalSeconds: 15.6248053
TotalSeconds: 15.2085938
Avg: 15.5580231

Signed-off-by: Takuto Ikuta <tikuta@chromium.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fetch-pack.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fetch-pack.c b/fetch-pack.c
index 40316c9a348f23..9cf2b6967c00d5 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -760,6 +760,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 	save_commit_buffer = 0;
 
 	trace2_region_enter("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL);
+	enable_fscache(1);
 	for (ref = *refs; ref; ref = ref->next) {
 		struct commit *commit;
 
@@ -784,6 +785,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 		if (!cutoff || cutoff < commit->date)
 			cutoff = commit->date;
 	}
+	enable_fscache(0);
 	trace2_region_leave("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL);
 
 	/*

From 4cf77f3b440e9533f0d9fde23864a0672d11bebe Mon Sep 17 00:00:00 2001
From: Takuto Ikuta <tikuta@chromium.org>
Date: Tue, 30 Jan 2018 22:42:58 +0900
Subject: [PATCH 467/553] checkout.c: enable fscache for checkout again

This is a retry of #1419.

I added a flush_fscache macro to flush cached stats after writing to
disk, along with tests for the regressions reported in #1438 and #1442.

git checkout checks each file path in sorted order, so cache flushing does not
make performance worse unless we have a large number of modified files in
a directory containing many files.
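
Why a flush is needed after a write can be seen with a toy one-entry
cache.  All names here are hypothetical stand-ins rather than the
fscache API: cached_stat_size() plays the role of a cached lstat()
and flush_cache() the role of flush_fscache().

```c
#include <assert.h>

/* Toy model: one file on "disk" and one cached copy of its size. */
long disk_size = 2;	/* what the file system would report */
long cached_size = -1;	/* -1 means "nothing cached" */

/* Stand-in for a cached lstat(): fills the cache on first use. */
long cached_stat_size(void)
{
	if (cached_size < 0)
		cached_size = disk_size;
	return cached_size;
}

/* Stand-in for flush_fscache(): forget everything that was cached. */
void flush_cache(void)
{
	cached_size = -1;
}

/* Stand-in for write_entry(): change the file, then flush the cache. */
void write_file(long new_size)
{
	assert(new_size >= 0);
	disk_size = new_size;
	flush_cache();
}
```

If the file changes without a flush, cached_stat_size() keeps
returning the old size; after write_file() the next lookup sees the
new one.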

Using the chromium repository, I tested `git checkout .` performance after
deleting 10 files in different directories.
With this patch:
TotalSeconds: 4.307272
TotalSeconds: 4.4863595
TotalSeconds: 4.2975562
Avg: 4.36372923333333

Without this patch:
TotalSeconds: 20.9705431
TotalSeconds: 22.4867685
TotalSeconds: 18.8968292
Avg: 20.7847136

I confirmed this patch passed all tests in t/ with core_fscache=1.

Signed-off-by: Takuto Ikuta <tikuta@chromium.org>
---
 builtin/checkout.c     |  2 ++
 compat/win32/fscache.c | 12 ++++++++++++
 compat/win32/fscache.h |  3 +++
 entry.c                |  3 +++
 git-compat-util.h      |  4 ++++
 parallel-checkout.c    |  1 +
 t/t7201-co.sh          | 36 ++++++++++++++++++++++++++++++++++++
 7 files changed, 61 insertions(+)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 261699e2f5fc97..d897d12331882d 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -415,6 +415,7 @@ static int checkout_worktree(const struct checkout_opts *opts,
 	if (pc_workers > 1)
 		init_parallel_checkout();
 
+	enable_fscache(1);
 	for (pos = 0; pos < the_repository->index->cache_nr; pos++) {
 		struct cache_entry *ce = the_repository->index->cache[pos];
 		if (ce->ce_flags & CE_MATCHED) {
@@ -440,6 +441,7 @@ static int checkout_worktree(const struct checkout_opts *opts,
 		errs |= run_parallel_checkout(&state, pc_workers, pc_threshold,
 					      NULL, NULL);
 	mem_pool_discard(&ce_mem_pool, should_validate_cache_entries());
+	enable_fscache(0);
 	remove_marked_cache_entries(the_repository->index, 1);
 	remove_scheduled_dirs();
 	errs |= finish_delayed_checkout(&state, opts->show_progress);
diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index 6e44df0a2dc2e7..b6de459c4d88a6 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -433,6 +433,18 @@ int fscache_enable(int enable)
 	return result;
 }
 
+/*
+ * Flush cached stats result when fscache is enabled.
+ */
+void fscache_flush(void)
+{
+	if (enabled) {
+		EnterCriticalSection(&mutex);
+		fscache_clear();
+		LeaveCriticalSection(&mutex);
+	}
+}
+
 /*
  * Lstat replacement, uses the cache if enabled, otherwise redirects to
  * mingw_lstat.
diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h
index 660ada053b4309..2f06f8df97dcd0 100644
--- a/compat/win32/fscache.h
+++ b/compat/win32/fscache.h
@@ -7,6 +7,9 @@ int fscache_enable(int enable);
 int fscache_enabled(const char *path);
 #define is_fscache_enabled(path) fscache_enabled(path)
 
+void fscache_flush(void);
+#define flush_fscache() fscache_flush()
+
 DIR *fscache_opendir(const char *dir);
 int fscache_lstat(const char *file_name, struct stat *buf);
 
diff --git a/entry.c b/entry.c
index 7817aee362ed9e..5ab78ca884b215 100644
--- a/entry.c
+++ b/entry.c
@@ -411,6 +411,9 @@ static int write_entry(struct cache_entry *ce, char *path, struct conv_attrs *ca
 	}
 
 finish:
+	/* Flush cached lstat in fscache after writing to disk. */
+	flush_fscache();
+
 	if (state->refresh_cache) {
 		if (!fstat_done && lstat(ce->name, &st) < 0)
 			return error_errno("unable to stat just-written file %s",
diff --git a/git-compat-util.h b/git-compat-util.h
index 86cf347f50a2f4..559de2ce706a04 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1041,6 +1041,10 @@ static inline int is_missing_file_error(int errno_)
 #define is_fscache_enabled(path) (0)
 #endif
 
+#ifndef flush_fscache
+#define flush_fscache() /* noop */
+#endif
+
 int cmd_main(int, const char **);
 
 /*
diff --git a/parallel-checkout.c b/parallel-checkout.c
index 0bf4bd6d4abd8c..8fadb7c804bc02 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -640,6 +640,7 @@ static void write_items_sequentially(struct checkout *state)
 {
 	size_t i;
 
+	flush_fscache();
 	for (i = 0; i < parallel_checkout.nr; i++) {
 		struct parallel_checkout_item *pc_item = &parallel_checkout.items[i];
 		write_pc_item(pc_item, state);
diff --git a/t/t7201-co.sh b/t/t7201-co.sh
index 9bcf7c0b40461f..545f388c44a515 100755
--- a/t/t7201-co.sh
+++ b/t/t7201-co.sh
@@ -35,6 +35,42 @@ fill () {
 }
 
 
+test_expect_success MINGW 'fscache flush cache' '
+
+	git init fscache-test &&
+	cd fscache-test &&
+	git config core.fscache 1 &&
+	echo A > test.txt &&
+	git add test.txt &&
+	git commit -m A &&
+	echo B >> test.txt &&
+	git checkout . &&
+	test -z "$(git status -s)" &&
+	echo A > expect.txt &&
+	test_cmp expect.txt test.txt &&
+	cd .. &&
+	rm -rf fscache-test
+'
+
+test_expect_success MINGW 'fscache flush cache dir' '
+
+	git init fscache-test &&
+	cd fscache-test &&
+	git config core.fscache 1 &&
+	echo A > test.txt &&
+	git add test.txt &&
+	git commit -m A &&
+	rm test.txt &&
+	mkdir test.txt &&
+	touch test.txt/test.txt &&
+	git checkout . &&
+	test -z "$(git status -s)" &&
+	echo A > expect.txt &&
+	test_cmp expect.txt test.txt &&
+	cd .. &&
+	rm -rf fscache-test
+'
+
 test_expect_success setup '
 	fill x y z >same &&
 	fill 1 2 3 4 5 6 7 8 >one &&

From 857a0d5503e0288ea30c5c87b40b60af96458508 Mon Sep 17 00:00:00 2001
From: Ben Peart <benpeart@microsoft.com>
Date: Fri, 7 Sep 2018 11:39:57 -0400
Subject: [PATCH 468/553] Enable the filesystem cache (fscache) in
 refresh_index().

On file systems that support it, this can dramatically speed up operations
like add, commit, describe, rebase, reset, rm that would otherwise have to
lstat() every file to "re-match" the stat information in the index to that
of the file system.

On a synthetic repo with 1M files, "git reset" dropped from 52.02 seconds to
14.42 seconds for a savings of 72%.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 read-cache.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 990d4ead0d8ae4..caf0827c00d042 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1501,6 +1501,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 	typechange_fmt = in_porcelain ? "T\t%s\n" : "%s: needs update\n";
 	added_fmt      = in_porcelain ? "A\t%s\n" : "%s: needs update\n";
 	unmerged_fmt   = in_porcelain ? "U\t%s\n" : "%s: needs merge\n";
+	enable_fscache(1);
 	/*
 	 * Use the multi-threaded preload_index() to refresh most of the
 	 * cache entries quickly then in the single threaded loop below,
@@ -1595,6 +1596,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 	display_progress(progress, istate->cache_nr);
 	stop_progress(&progress);
 	trace_performance_leave("refresh index");
+	enable_fscache(0);
 	return has_errors;
 }
 

From 055553063842ea7a88fa1f1a5248acfb93a3f655 Mon Sep 17 00:00:00 2001
From: Ben Peart <benpeart@microsoft.com>
Date: Tue, 23 Oct 2018 11:42:06 -0400
Subject: [PATCH 469/553] fscache: use FindFirstFileExW to avoid retrieving the
 short name

Use FindFirstFileExW with FindExInfoBasic to avoid forcing NTFS to look up
the short name.  Also switch to a larger (64K vs 4K) buffer using
FIND_FIRST_EX_LARGE_FETCH to minimize round trips to the kernel.

In a repo with ~200K files, this drops warm cache status times from 3.19
seconds to 2.67 seconds for a 16% savings.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 compat/win32/fscache.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index b6de459c4d88a6..c6ab9f1a2c7286 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -213,7 +213,8 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir,
 	pattern[wlen] = 0;
 
 	/* open find handle */
-	h = FindFirstFileW(pattern, &fdata);
+	h = FindFirstFileExW(pattern, FindExInfoBasic, &fdata, FindExSearchNameMatch,
+		NULL, FIND_FIRST_EX_LARGE_FETCH);
 	if (h == INVALID_HANDLE_VALUE) {
 		err = GetLastError();
 		*dir_not_found = 1; /* or empty directory */

From 44c6b29ba96ad461cdf6c9642af5d452641042c6 Mon Sep 17 00:00:00 2001
From: Ben Peart <benpeart@microsoft.com>
Date: Thu, 4 Oct 2018 18:10:21 -0400
Subject: [PATCH 470/553] fscache: add GIT_TEST_FSCACHE support

Add support to fscache to enable running the entire test suite with the
fscache enabled.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 compat/win32/fscache.c | 5 +++++
 t/README               | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index c6ab9f1a2c7286..13b38104732592 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -5,6 +5,7 @@
 #include "../../dir.h"
 #include "../../abspath.h"
 #include "../../trace.h"
+#include "config.h"
 
 static int initialized;
 static volatile long enabled;
@@ -406,7 +407,11 @@ int fscache_enable(int enable)
 	int result;
 
 	if (!initialized) {
+		int fscache = git_env_bool("GIT_TEST_FSCACHE", -1);
+
 		/* allow the cache to be disabled entirely */
+		if (fscache != -1)
+			core_fscache = fscache;
 		if (!core_fscache)
 			return 0;
 
diff --git a/t/README b/t/README
index adbbd9acf4ab27..f19468151410eb 100644
--- a/t/README
+++ b/t/README
@@ -479,6 +479,9 @@ GIT_TEST_NAME_HASH_VERSION=<int>, when set, causes 'git pack-objects' to
 assume '--name-hash-version=<n>'.
 
 
+GIT_TEST_FSCACHE=<boolean> exercises the uncommon fscache code path
+which adds a cache below mingw's lstat and dirent implementations.
+
 Naming Tests
 ------------
 

From 6784e9e04e2f3dbab9a155bb3f64831d17a35c78 Mon Sep 17 00:00:00 2001
From: Ben Peart <benpeart@microsoft.com>
Date: Tue, 25 Sep 2018 16:28:16 -0400
Subject: [PATCH 471/553] fscache: add fscache hit statistics

Track fscache hits and misses for lstat and opendir requests.  The
statistics are reported when the cache is disabled for the last time
and freed, and only if GIT_TRACE_FSCACHE is set.

Sample output is:

11:33:11.836428 compat/win32/fscache.c:433 fscache: lstat 3775, opendir 263, total requests/misses 4052/269
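
The last two numbers can be turned into a hit rate.  The helper below
is a hypothetical illustration and not part of this patch; it assumes
that every request that is not a miss was served from the cache.

```c
#include <assert.h>

/* Hypothetical helper: share of requests served without a new scan. */
double hit_rate_percent(unsigned int requests, unsigned int misses)
{
	assert(misses <= requests);
	if (!requests)
		return 0.0;
	return 100.0 * (requests - misses) / requests;
}
```

For the sample line above, hit_rate_percent(4052, 269) comes out at a
little over 93 percent.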

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 compat/win32/fscache.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index 13b38104732592..cf8ed5c63573a6 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -11,6 +11,10 @@ static int initialized;
 static volatile long enabled;
 static struct hashmap map;
 static CRITICAL_SECTION mutex;
+static unsigned int lstat_requests;
+static unsigned int opendir_requests;
+static unsigned int fscache_requests;
+static unsigned int fscache_misses;
 static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE);
 
 /*
@@ -270,6 +274,8 @@ static void fscache_clear(void)
 {
 	hashmap_clear_and_free(&map, struct fsentry, ent);
 	hashmap_init(&map, (hashmap_cmp_fn)fsentry_cmp, NULL, 0);
+	lstat_requests = opendir_requests = 0;
+	fscache_misses = fscache_requests = 0;
 }
 
 /*
@@ -316,6 +322,7 @@ static struct fsentry *fscache_get(struct fsentry *key)
 	int dir_not_found;
 
 	EnterCriticalSection(&mutex);
+	fscache_requests++;
 	/* check if entry is in cache */
 	fse = fscache_get_wait(key);
 	if (fse) {
@@ -379,6 +386,7 @@ static struct fsentry *fscache_get(struct fsentry *key)
 	}
 
 	/* add directory listing to the cache */
+	fscache_misses++;
 	fscache_add(fse);
 
 	/* lookup file entry if requested (fse already points to directory) */
@@ -416,6 +424,8 @@ int fscache_enable(int enable)
 			return 0;
 
 		InitializeCriticalSection(&mutex);
+		lstat_requests = opendir_requests = 0;
+		fscache_misses = fscache_requests = 0;
 		hashmap_init(&map, (hashmap_cmp_fn) fsentry_cmp, NULL, 0);
 		initialized = 1;
 	}
@@ -432,6 +442,10 @@ int fscache_enable(int enable)
 		opendir = dirent_opendir;
 		lstat = mingw_lstat;
 		EnterCriticalSection(&mutex);
+		trace_printf_key(&trace_fscache, "fscache: lstat %u, opendir %u, "
+						 "total requests/misses %u/%u\n",
+				lstat_requests, opendir_requests,
+				fscache_requests, fscache_misses);
 		fscache_clear();
 		LeaveCriticalSection(&mutex);
 	}
@@ -469,6 +483,7 @@ int fscache_lstat(const char *filename, struct stat *st)
 	if (!fscache_enabled(filename))
 		return mingw_lstat(filename, st);
 
+	lstat_requests++;
 	/* split filename into path + name */
 	len = strlen(filename);
 	if (len && is_dir_sep(filename[len - 1]))
@@ -550,6 +565,7 @@ DIR *fscache_opendir(const char *dirname)
 	if (!fscache_enabled(dirname))
 		return dirent_opendir(dirname);
 
+	opendir_requests++;
 	/* prepare name (strip trailing '/', replace '.') */
 	len = strlen(dirname);
 	if ((len == 1 && dirname[0] == '.') ||

From 6ae87a26d5a76903eea964cab88a826ca3426aa8 Mon Sep 17 00:00:00 2001
From: Ben Peart <benpeart@microsoft.com>
Date: Thu, 4 Oct 2018 18:10:21 -0400
Subject: [PATCH 472/553] mem_pool: add GIT_TRACE_MEMPOOL support

Add tracing around initializing and discarding mempools. On discard, report
the amount of memory left unused in the current block to help tune the
initial_size.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 mem-pool.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/mem-pool.c b/mem-pool.c
index 62441dcc71968f..0fab0a5ef26472 100644
--- a/mem-pool.c
+++ b/mem-pool.c
@@ -7,7 +7,9 @@
 #include "git-compat-util.h"
 #include "mem-pool.h"
 #include "gettext.h"
+#include "trace.h"
 
+static struct trace_key trace_mem_pool = TRACE_KEY_INIT(MEMPOOL);
 #define BLOCK_GROWTH_SIZE (1024 * 1024 - sizeof(struct mp_block))
 
 /*
@@ -65,12 +67,20 @@ void mem_pool_init(struct mem_pool *pool, size_t initial_size)
 
 	if (initial_size > 0)
 		mem_pool_alloc_block(pool, initial_size, NULL);
+
+	trace_printf_key(&trace_mem_pool,
+		"mem_pool (%p): init (%"PRIuMAX") initial size\n",
+		(void *)pool, (uintmax_t)initial_size);
 }
 
 void mem_pool_discard(struct mem_pool *pool, int invalidate_memory)
 {
 	struct mp_block *block, *block_to_free;
 
+	trace_printf_key(&trace_mem_pool,
+		"mem_pool (%p): discard (%"PRIuMAX") unused\n",
+		(void *)pool,
+		(uintmax_t)(pool->mp_block->end - pool->mp_block->next_free));
 	block = pool->mp_block;
 	while (block)
 	{

From 965494e94c0b64ec0d2d3c5f5ec7606175db2528 Mon Sep 17 00:00:00 2001
From: Ben Peart <benpeart@microsoft.com>
Date: Fri, 2 Nov 2018 11:19:10 -0400
Subject: [PATCH 473/553] fscache: fscache takes an initial size

Update enable_fscache() to take an initial size parameter (0 for the
default), which is used to size the hashmap up front so that it does not
have to rehash as additional entries are added.

Add a separate disable_fscache() macro to make the code clearer and easier
to read.
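
The resulting call pattern can be sketched as follows. This is a
stand-alone illustration, not the real implementation: the counters
below are stand-ins for the hashmap, and map_size_hint is a hypothetical
name for the value that would be handed to hashmap_init().

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the real fscache state; only the call pattern matters. */
static int initialized;
static long enabled;
static size_t map_size_hint;

static long fscache_enable(int enable, size_t initial_size)
{
	if (!initialized) {
		/* the size hint only takes effect on first initialization */
		map_size_hint = initial_size * 4;
		initialized = 1;
	}
	enabled += enable ? 1 : -1;
	assert(enabled >= 0);
	return enabled;
}

#define enable_fscache(initial_size) fscache_enable(1, initial_size)
#define disable_fscache() fscache_enable(0, 0)
```

A caller that knows how many entries it is about to lstat() passes that
count, e.g. enable_fscache(index->cache_nr); a caller with no estimate
passes 0. Each enable is paired with one disable_fscache().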

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/add.c          | 2 +-
 builtin/checkout.c     | 4 ++--
 builtin/commit.c       | 4 ++--
 compat/win32/fscache.c | 8 ++++++--
 compat/win32/fscache.h | 5 +++--
 fetch-pack.c           | 4 ++--
 git-compat-util.h      | 4 ++++
 preload-index.c        | 4 ++--
 read-cache.c           | 4 ++--
 9 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 25add8da962cab..d71161dbf31232 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -493,7 +493,7 @@ int cmd_add(int argc,
 	die_in_unpopulated_submodule(repo->index, prefix);
 	die_path_inside_submodule(repo->index, &pathspec);
 
-	enable_fscache(1);
+	enable_fscache(0);
 	/* We do not really re-read the index but update the up-to-date flags */
 	preload_index(repo->index, &pathspec, 0);
 
diff --git a/builtin/checkout.c b/builtin/checkout.c
index d897d12331882d..6e4217783157b8 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -415,7 +415,7 @@ static int checkout_worktree(const struct checkout_opts *opts,
 	if (pc_workers > 1)
 		init_parallel_checkout();
 
-	enable_fscache(1);
+	enable_fscache(the_repository->index->cache_nr);
 	for (pos = 0; pos < the_repository->index->cache_nr; pos++) {
 		struct cache_entry *ce = the_repository->index->cache[pos];
 		if (ce->ce_flags & CE_MATCHED) {
@@ -441,7 +441,7 @@ static int checkout_worktree(const struct checkout_opts *opts,
 		errs |= run_parallel_checkout(&state, pc_workers, pc_threshold,
 					      NULL, NULL);
 	mem_pool_discard(&ce_mem_pool, should_validate_cache_entries());
-	enable_fscache(0);
+	disable_fscache();
 	remove_marked_cache_entries(the_repository->index, 1);
 	remove_scheduled_dirs();
 	errs |= finish_delayed_checkout(&state, opts->show_progress);
diff --git a/builtin/commit.c b/builtin/commit.c
index 1b6def061cf33b..a7077bacb0fca5 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1622,7 +1622,7 @@ struct repository *repo UNUSED)
 		       PATHSPEC_PREFER_FULL,
 		       prefix, argv);
 
-	enable_fscache(1);
+	enable_fscache(0);
 	if (status_format != STATUS_FORMAT_PORCELAIN &&
 	    status_format != STATUS_FORMAT_PORCELAIN_V2)
 		progress_flag = REFRESH_PROGRESS;
@@ -1663,7 +1663,7 @@ struct repository *repo UNUSED)
 	wt_status_print(&s);
 	wt_status_collect_free_buffers(&s);
 
-	enable_fscache(0);
+	disable_fscache();
 	return 0;
 }
 
diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index cf8ed5c63573a6..e9c10908d0e686 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -410,7 +410,7 @@ static struct fsentry *fscache_get(struct fsentry *key)
  * Enables or disables the cache. Note that the cache is read-only, changes to
  * the working directory are NOT reflected in the cache while enabled.
  */
-int fscache_enable(int enable)
+int fscache_enable(int enable, size_t initial_size)
 {
 	int result;
 
@@ -426,7 +426,11 @@ int fscache_enable(int enable)
 		InitializeCriticalSection(&mutex);
 		lstat_requests = opendir_requests = 0;
 		fscache_misses = fscache_requests = 0;
-		hashmap_init(&map, (hashmap_cmp_fn) fsentry_cmp, NULL, 0);
+		/*
+		 * avoid having to rehash by leaving room for the parent dirs.
+		 * '4' was determined empirically by testing several repos
+		 */
+		hashmap_init(&map, (hashmap_cmp_fn) fsentry_cmp, NULL, initial_size * 4);
 		initialized = 1;
 	}
 
diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h
index 2f06f8df97dcd0..d49c9381114da6 100644
--- a/compat/win32/fscache.h
+++ b/compat/win32/fscache.h
@@ -1,8 +1,9 @@
 #ifndef FSCACHE_H
 #define FSCACHE_H
 
-int fscache_enable(int enable);
-#define enable_fscache(x) fscache_enable(x)
+int fscache_enable(int enable, size_t initial_size);
+#define enable_fscache(initial_size) fscache_enable(1, initial_size)
+#define disable_fscache() fscache_enable(0, 0)
 
 int fscache_enabled(const char *path);
 #define is_fscache_enabled(path) fscache_enabled(path)
diff --git a/fetch-pack.c b/fetch-pack.c
index 9cf2b6967c00d5..b97f25f7900e31 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -760,7 +760,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 	save_commit_buffer = 0;
 
 	trace2_region_enter("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL);
-	enable_fscache(1);
+	enable_fscache(0);
 	for (ref = *refs; ref; ref = ref->next) {
 		struct commit *commit;
 
@@ -785,7 +785,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 		if (!cutoff || cutoff < commit->date)
 			cutoff = commit->date;
 	}
-	enable_fscache(0);
+	disable_fscache();
 	trace2_region_leave("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL);
 
 	/*
diff --git a/git-compat-util.h b/git-compat-util.h
index 559de2ce706a04..0d7996dff69ae5 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1037,6 +1037,10 @@ static inline int is_missing_file_error(int errno_)
 #define enable_fscache(x) /* noop */
 #endif
 
+#ifndef disable_fscache
+#define disable_fscache() /* noop */
+#endif
+
 #ifndef is_fscache_enabled
 #define is_fscache_enabled(path) (0)
 #endif
diff --git a/preload-index.c b/preload-index.c
index 61e8f3a1f6ec84..e466fef15bcd79 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -141,7 +141,7 @@ void preload_index(struct index_state *index,
 		pthread_mutex_init(&pd.mutex, NULL);
 	}
 
-	enable_fscache(1);
+	enable_fscache(index->cache_nr);
 	for (i = 0; i < threads; i++) {
 		struct thread_data *p = data+i;
 		int err;
@@ -178,7 +178,7 @@ void preload_index(struct index_state *index,
 	trace2_data_intmax("index", NULL, "preload/sum_lstat", t2_sum_lstat);
 	trace2_region_leave("index", "preload", NULL);
 
-	enable_fscache(0);
+	disable_fscache();
 }
 
 int repo_read_index_preload(struct repository *repo,
diff --git a/read-cache.c b/read-cache.c
index caf0827c00d042..1719023d24feaf 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1501,7 +1501,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 	typechange_fmt = in_porcelain ? "T\t%s\n" : "%s: needs update\n";
 	added_fmt      = in_porcelain ? "A\t%s\n" : "%s: needs update\n";
 	unmerged_fmt   = in_porcelain ? "U\t%s\n" : "%s: needs merge\n";
-	enable_fscache(1);
+	enable_fscache(0);
 	/*
 	 * Use the multi-threaded preload_index() to refresh most of the
 	 * cache entries quickly then in the single threaded loop below,
@@ -1596,7 +1596,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 	display_progress(progress, istate->cache_nr);
 	stop_progress(&progress);
 	trace_performance_leave("refresh index");
-	enable_fscache(0);
+	disable_fscache();
 	return has_errors;
 }
 

From aa87b167eeead66ebc2191364da4b21a2a2bfe05 Mon Sep 17 00:00:00 2001
From: Ben Peart <benpeart@microsoft.com>
Date: Thu, 4 Oct 2018 15:38:08 -0400
Subject: [PATCH 474/553] fscache: update fscache to be thread specific instead
 of global

The threading model for fscache has been to have a single, global cache.
This requires it to be thread safe so that callers like preload-index can
call it from multiple threads.  This was implemented with a single mutex
and completion events, which introduce contention between the calling
threads.

Simplify the threading model by making fscache thread specific.  This allows
us to drop the locking and synchronization events from the lookup path (the
mutex is now only taken while merging) and instead associate an fscache with
every thread that requests one. This works well with the current
multi-threading, which divides the cache entries into blocks with a separate
thread processing each block.

At the end of each worker thread, if there is a fscache on the primary
thread, merge the cached results from the worker into the primary thread
cache. This enables us to reuse the cache later especially when scanning for
untracked files.

In testing, this reduced the time spent in preload_index() by about 25% and
also reduced the CPU utilization significantly.  On a repo with ~200K files,
it reduced overall status times by ~12%.
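
The per-thread model with a final merge can be sketched as below. This is
an illustration only: it swaps the Win32 TLS calls for POSIX
pthread_key_*() so that it builds outside Windows, a plain counter stands
in for the hashmap, and every toy_* name is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Toy per-thread cache: a counter stands in for the hashmap. */
struct toy_cache {
	unsigned int entries;
};

static pthread_key_t cache_key;
static pthread_once_t cache_key_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t merge_mutex = PTHREAD_MUTEX_INITIALIZER;

static void make_key(void)
{
	if (pthread_key_create(&cache_key, NULL))
		abort();
}

/* Returns the calling thread's cache, or NULL if it has none. */
static struct toy_cache *toy_getcache(void)
{
	pthread_once(&cache_key_once, make_key);
	return pthread_getspecific(cache_key);
}

static struct toy_cache *toy_enable(void)
{
	struct toy_cache *cache = toy_getcache();

	if (!cache) {
		cache = calloc(1, sizeof(*cache));
		assert(cache);
		if (pthread_setspecific(cache_key, cache))
			abort();
	}
	return cache;
}

/* Fold this thread's cache into dest; only the merge takes a lock. */
static void toy_merge(struct toy_cache *dest)
{
	struct toy_cache *cache = toy_getcache();

	assert(cache && dest);
	pthread_setspecific(cache_key, NULL);
	pthread_mutex_lock(&merge_mutex);
	dest->entries += cache->entries;
	pthread_mutex_unlock(&merge_mutex);
	free(cache);
}

static void *toy_worker(void *dest)
{
	struct toy_cache *cache = toy_enable();

	cache->entries = 100; /* lock-free: no other thread sees this cache */
	toy_merge(dest);
	return NULL;
}
```

Each worker fills its own cache without locking and contends with its
siblings only once, at merge time.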

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 compat/win32/fscache.c | 294 +++++++++++++++++++++++++----------------
 compat/win32/fscache.h |  22 ++-
 git-compat-util.h      |  12 ++
 preload-index.c        |   8 +-
 4 files changed, 215 insertions(+), 121 deletions(-)

diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index e9c10908d0e686..f27a7e45e365f4 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -7,14 +7,24 @@
 #include "../../trace.h"
 #include "config.h"
 
-static int initialized;
-static volatile long enabled;
-static struct hashmap map;
+static volatile long initialized;
+static DWORD dwTlsIndex;
 static CRITICAL_SECTION mutex;
-static unsigned int lstat_requests;
-static unsigned int opendir_requests;
-static unsigned int fscache_requests;
-static unsigned int fscache_misses;
+
+/*
+ * Store one fscache per thread to avoid thread contention and locking.
+ * This is ok because multi-threaded access is 1) uncommon and 2) always
+ * splitting up the cache entries across multiple threads so there isn't
+ * any overlap between threads anyway.
+ */
+struct fscache {
+	volatile long enabled;
+	struct hashmap map;
+	unsigned int lstat_requests;
+	unsigned int opendir_requests;
+	unsigned int fscache_requests;
+	unsigned int fscache_misses;
+};
 static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE);
 
 /*
@@ -34,8 +44,6 @@ struct fsentry {
 	union {
 		/* Reference count of the directory listing. */
 		volatile long refcnt;
-		/* Handle to wait on the loading thread. */
-		HANDLE hwait;
 		struct {
 			/* More stat members (only used for file entries). */
 			off64_t st_size;
@@ -258,86 +266,63 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir,
 /*
  * Adds a directory listing to the cache.
  */
-static void fscache_add(struct fsentry *fse)
+static void fscache_add(struct fscache *cache, struct fsentry *fse)
 {
 	if (fse->list)
 		fse = fse->list;
 
 	for (; fse; fse = fse->next)
-		hashmap_add(&map, &fse->ent);
+		hashmap_add(&cache->map, &fse->ent);
 }
 
 /*
  * Clears the cache.
  */
-static void fscache_clear(void)
+static void fscache_clear(struct fscache *cache)
 {
-	hashmap_clear_and_free(&map, struct fsentry, ent);
-	hashmap_init(&map, (hashmap_cmp_fn)fsentry_cmp, NULL, 0);
-	lstat_requests = opendir_requests = 0;
-	fscache_misses = fscache_requests = 0;
+	hashmap_clear_and_free(&cache->map, struct fsentry, ent);
+	hashmap_init(&cache->map, (hashmap_cmp_fn)fsentry_cmp, NULL, 0);
+	cache->lstat_requests = cache->opendir_requests = 0;
+	cache->fscache_misses = cache->fscache_requests = 0;
 }
 
 /*
  * Checks if the cache is enabled for the given path.
  */
-int fscache_enabled(const char *path)
+static int do_fscache_enabled(struct fscache *cache, const char *path)
 {
-	return enabled > 0 && !is_absolute_path(path);
+	return cache->enabled > 0 && !is_absolute_path(path);
 }
 
-/*
- * Looks up a cache entry, waits if its being loaded by another thread.
- * The mutex must be owned by the calling thread.
- */
-static struct fsentry *fscache_get_wait(struct fsentry *key)
+int fscache_enabled(const char *path)
 {
-	struct fsentry *fse = hashmap_get_entry(&map, key, ent, NULL);
-
-	/* return if its a 'real' entry (future entries have refcnt == 0) */
-	if (!fse || fse->list || fse->u.refcnt)
-		return fse;
-
-	/* create an event and link our key to the future entry */
-	key->u.hwait = CreateEvent(NULL, TRUE, FALSE, NULL);
-	key->next = fse->next;
-	fse->next = key;
-
-	/* wait for the loading thread to signal us */
-	LeaveCriticalSection(&mutex);
-	WaitForSingleObject(key->u.hwait, INFINITE);
-	CloseHandle(key->u.hwait);
-	EnterCriticalSection(&mutex);
+	struct fscache *cache = fscache_getcache();
 
-	/* repeat cache lookup */
-	return hashmap_get_entry(&map, key, ent, NULL);
+	return cache ? do_fscache_enabled(cache, path) : 0;
 }
 
 /*
  * Looks up or creates a cache entry for the specified key.
  */
-static struct fsentry *fscache_get(struct fsentry *key)
+static struct fsentry *fscache_get(struct fscache *cache, struct fsentry *key)
 {
-	struct fsentry *fse, *future, *waiter;
+	struct fsentry *fse;
 	int dir_not_found;
 
-	EnterCriticalSection(&mutex);
-	fscache_requests++;
+	cache->fscache_requests++;
 	/* check if entry is in cache */
-	fse = fscache_get_wait(key);
+	fse = hashmap_get_entry(&cache->map, key, ent, NULL);
 	if (fse) {
 		if (fse->st_mode)
 			fsentry_addref(fse);
 		else
 			fse = NULL; /* non-existing directory */
-		LeaveCriticalSection(&mutex);
 		return fse;
 	}
 	/* if looking for a file, check if directory listing is in cache */
 	if (!fse && key->list) {
-		fse = fscache_get_wait(key->list);
+		fse = hashmap_get_entry(&cache->map, key->list, ent, NULL);
 		if (fse) {
-			LeaveCriticalSection(&mutex);
 			/*
 			 * dir entry without file entry, or dir does not
 			 * exist -> file doesn't exist
@@ -347,25 +332,8 @@ static struct fsentry *fscache_get(struct fsentry *key)
 		}
 	}
 
-	/* add future entry to indicate that we're loading it */
-	future = key->list ? key->list : key;
-	future->next = NULL;
-	future->u.refcnt = 0;
-	hashmap_add(&map, &future->ent);
-
-	/* create the directory listing (outside mutex!) */
-	LeaveCriticalSection(&mutex);
-	fse = fsentry_create_list(future, &dir_not_found);
-	EnterCriticalSection(&mutex);
-
-	/* remove future entry and signal waiting threads */
-	hashmap_remove(&map, &future->ent, NULL);
-	waiter = future->next;
-	while (waiter) {
-		HANDLE h = waiter->u.hwait;
-		waiter = waiter->next;
-		SetEvent(h);
-	}
+	/* create the directory listing */
+	fse = fsentry_create_list(key->list ? key->list : key, &dir_not_found);
 
 	/* leave on error (errno set by fsentry_create_list) */
 	if (!fse) {
@@ -379,19 +347,18 @@ static struct fsentry *fscache_get(struct fsentry *key)
 					    key->list->dirent.d_name,
 					    key->list->len);
 			fse->st_mode = 0;
-			hashmap_add(&map, &fse->ent);
+			hashmap_add(&cache->map, &fse->ent);
 		}
-		LeaveCriticalSection(&mutex);
 		return NULL;
 	}
 
 	/* add directory listing to the cache */
-	fscache_misses++;
-	fscache_add(fse);
+	cache->fscache_misses++;
+	fscache_add(cache, fse);
 
 	/* lookup file entry if requested (fse already points to directory) */
 	if (key->list)
-		fse = hashmap_get_entry(&map, key, ent, NULL);
+		fse = hashmap_get_entry(&cache->map, key, ent, NULL);
 
 	if (fse && !fse->st_mode)
 		fse = NULL; /* non-existing directory */
@@ -402,59 +369,104 @@ static struct fsentry *fscache_get(struct fsentry *key)
 	else
 		errno = ENOENT;
 
-	LeaveCriticalSection(&mutex);
 	return fse;
 }
 
 /*
- * Enables or disables the cache. Note that the cache is read-only, changes to
+ * Enables the cache. Note that the cache is read-only, changes to
  * the working directory are NOT reflected in the cache while enabled.
  */
-int fscache_enable(int enable, size_t initial_size)
+int fscache_enable(size_t initial_size)
 {
-	int result;
+	int fscache;
+	struct fscache *cache;
+	int result = 0;
+
+	/* allow the cache to be disabled entirely */
+	fscache = git_env_bool("GIT_TEST_FSCACHE", -1);
+	if (fscache != -1)
+		core_fscache = fscache;
+	if (!core_fscache)
+		return 0;
 
+	/*
+	 * refcount the global fscache initialization so that the
+	 * opendir and lstat function pointers are redirected if
+	 * any threads are using the fscache.
+	 */
 	if (!initialized) {
-		int fscache = git_env_bool("GIT_TEST_FSCACHE", -1);
-
-		/* allow the cache to be disabled entirely */
-		if (fscache != -1)
-			core_fscache = fscache;
-		if (!core_fscache)
-			return 0;
-
 		InitializeCriticalSection(&mutex);
-		lstat_requests = opendir_requests = 0;
-		fscache_misses = fscache_requests = 0;
+		if (!dwTlsIndex) {
+			dwTlsIndex = TlsAlloc();
+			if (dwTlsIndex == TLS_OUT_OF_INDEXES) {
+				LeaveCriticalSection(&mutex);
+				return 0;
+			}
+		}
+
+		/* redirect opendir and lstat to the fscache implementations */
+		opendir = fscache_opendir;
+		lstat = fscache_lstat;
+	}
+	InterlockedIncrement(&initialized);
+
+	/* refcount the thread specific initialization */
+	cache = fscache_getcache();
+	if (cache) {
+		InterlockedIncrement(&cache->enabled);
+	} else {
+		cache = (struct fscache *)xcalloc(1, sizeof(*cache));
+		cache->enabled = 1;
 		/*
 		 * avoid having to rehash by leaving room for the parent dirs.
 		 * '4' was determined empirically by testing several repos
 		 */
-		hashmap_init(&map, (hashmap_cmp_fn) fsentry_cmp, NULL, initial_size * 4);
-		initialized = 1;
+		hashmap_init(&cache->map, (hashmap_cmp_fn)fsentry_cmp, NULL, initial_size * 4);
+		if (!TlsSetValue(dwTlsIndex, cache))
+			BUG("TlsSetValue error");
 	}
 
-	result = enable ? InterlockedIncrement(&enabled)
-			: InterlockedDecrement(&enabled);
+	trace_printf_key(&trace_fscache, "fscache: enable\n");
+	return result;
+}
 
-	if (enable && result == 1) {
-		/* redirect opendir and lstat to the fscache implementations */
-		opendir = fscache_opendir;
-		lstat = fscache_lstat;
-	} else if (!enable && !result) {
+/*
+ * Disables the cache.
+ */
+void fscache_disable(void)
+{
+	struct fscache *cache;
+
+	if (!core_fscache)
+		return;
+
+	/* update the thread specific fscache initialization */
+	cache = fscache_getcache();
+	if (!cache)
+		BUG("fscache_disable() called on a thread where fscache has not been initialized");
+	if (!cache->enabled)
+		BUG("fscache_disable() called on an fscache that is already disabled");
+	InterlockedDecrement(&cache->enabled);
+	if (!cache->enabled) {
+		TlsSetValue(dwTlsIndex, NULL);
+		trace_printf_key(&trace_fscache, "fscache_disable: lstat %u, opendir %u, "
+			"total requests/misses %u/%u\n",
+			cache->lstat_requests, cache->opendir_requests,
+			cache->fscache_requests, cache->fscache_misses);
+		fscache_clear(cache);
+		free(cache);
+	}
+
+	/* update the global fscache initialization */
+	InterlockedDecrement(&initialized);
+	if (!initialized) {
 		/* reset opendir and lstat to the original implementations */
 		opendir = dirent_opendir;
 		lstat = mingw_lstat;
-		EnterCriticalSection(&mutex);
-		trace_printf_key(&trace_fscache, "fscache: lstat %u, opendir %u, "
-						 "total requests/misses %u/%u\n",
-				lstat_requests, opendir_requests,
-				fscache_requests, fscache_misses);
-		fscache_clear();
-		LeaveCriticalSection(&mutex);
 	}
-	trace_printf_key(&trace_fscache, "fscache: enable(%d)\n", enable);
-	return result;
+
+	trace_printf_key(&trace_fscache, "fscache: disable\n");
+	return;
 }
 
 /*
@@ -462,10 +474,10 @@ int fscache_enable(int enable, size_t initial_size)
  */
 void fscache_flush(void)
 {
-	if (enabled) {
-		EnterCriticalSection(&mutex);
-		fscache_clear();
-		LeaveCriticalSection(&mutex);
+	struct fscache *cache = fscache_getcache();
+
+	if (cache && cache->enabled) {
+		fscache_clear(cache);
 	}
 }
 
@@ -483,11 +495,12 @@ int fscache_lstat(const char *filename, struct stat *st)
 	struct heap_fsentry key[2];
 #pragma GCC diagnostic pop
 	struct fsentry *fse;
+	struct fscache *cache = fscache_getcache();
 
-	if (!fscache_enabled(filename))
+	if (!cache || !do_fscache_enabled(cache, filename))
 		return mingw_lstat(filename, st);
 
-	lstat_requests++;
+	cache->lstat_requests++;
 	/* split filename into path + name */
 	len = strlen(filename);
 	if (len && is_dir_sep(filename[len - 1]))
@@ -500,7 +513,7 @@ int fscache_lstat(const char *filename, struct stat *st)
 	/* lookup entry for path + name in cache */
 	fsentry_init(&key[0].u.ent, NULL, filename, dirlen);
 	fsentry_init(&key[1].u.ent, &key[0].u.ent, filename + base, len - base);
-	fse = fscache_get(&key[1].u.ent);
+	fse = fscache_get(cache, &key[1].u.ent);
 	if (!fse) {
 		errno = ENOENT;
 		return -1;
@@ -565,11 +578,12 @@ DIR *fscache_opendir(const char *dirname)
 	struct fsentry *list;
 	fscache_DIR *dir;
 	int len;
+	struct fscache *cache = fscache_getcache();
 
-	if (!fscache_enabled(dirname))
+	if (!cache || !do_fscache_enabled(cache, dirname))
 		return dirent_opendir(dirname);
 
-	opendir_requests++;
+	cache->opendir_requests++;
 	/* prepare name (strip trailing '/', replace '.') */
 	len = strlen(dirname);
 	if ((len == 1 && dirname[0] == '.') ||
@@ -578,7 +592,7 @@ DIR *fscache_opendir(const char *dirname)
 
 	/* get directory listing from cache */
 	fsentry_init(&key.u.ent, NULL, dirname, len);
-	list = fscache_get(&key.u.ent);
+	list = fscache_get(cache, &key.u.ent);
 	if (!list)
 		return NULL;
 
@@ -589,3 +603,53 @@ DIR *fscache_opendir(const char *dirname)
 	dir->pfsentry = list;
 	return (DIR*) dir;
 }
+
+struct fscache *fscache_getcache(void)
+{
+	return (struct fscache *)TlsGetValue(dwTlsIndex);
+}
+
+void fscache_merge(struct fscache *dest)
+{
+	struct hashmap_iter iter;
+	struct hashmap_entry *e;
+	struct fscache *cache = fscache_getcache();
+
+	/*
+	 * Only do the merge if fscache was enabled and we have a dest
+	 * cache to merge into.
+	 */
+	if (!dest) {
+		fscache_enable(0);
+		return;
+	}
+	if (!cache)
+		BUG("fscache_merge() called on a thread where fscache has not been initialized");
+
+	TlsSetValue(dwTlsIndex, NULL);
+	trace_printf_key(&trace_fscache, "fscache_merge: lstat %u, opendir %u, "
+		"total requests/misses %u/%u\n",
+		cache->lstat_requests, cache->opendir_requests,
+		cache->fscache_requests, cache->fscache_misses);
+
+	/*
+	 * This is only safe because the primary thread we're merging into
+	 * isn't being used so the critical section only needs to prevent
+	 * the child threads from stomping on each other.
+	 */
+	EnterCriticalSection(&mutex);
+
+	hashmap_iter_init(&cache->map, &iter);
+	while ((e = hashmap_iter_next(&iter)))
+		hashmap_add(&dest->map, e);
+
+	dest->lstat_requests += cache->lstat_requests;
+	dest->opendir_requests += cache->opendir_requests;
+	dest->fscache_requests += cache->fscache_requests;
+	dest->fscache_misses += cache->fscache_misses;
+	LeaveCriticalSection(&mutex);
+
+	free(cache);
+
+	InterlockedDecrement(&initialized);
+}
diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h
index d49c9381114da6..2eb8bf3f5cfee8 100644
--- a/compat/win32/fscache.h
+++ b/compat/win32/fscache.h
@@ -1,9 +1,16 @@
 #ifndef FSCACHE_H
 #define FSCACHE_H
 
-int fscache_enable(int enable, size_t initial_size);
-#define enable_fscache(initial_size) fscache_enable(1, initial_size)
-#define disable_fscache() fscache_enable(0, 0)
+/*
+ * The fscache is thread specific. enable_fscache() must be called
+ * for each thread where caching is desired.
+ */
+
+int fscache_enable(size_t initial_size);
+#define enable_fscache(initial_size) fscache_enable(initial_size)
+
+void fscache_disable(void);
+#define disable_fscache() fscache_disable()
 
 int fscache_enabled(const char *path);
 #define is_fscache_enabled(path) fscache_enabled(path)
@@ -14,4 +21,13 @@ void fscache_flush(void);
 DIR *fscache_opendir(const char *dir);
 int fscache_lstat(const char *file_name, struct stat *buf);
 
+/* opaque fscache structure */
+struct fscache;
+
+struct fscache *fscache_getcache(void);
+#define getcache_fscache() fscache_getcache()
+
+void fscache_merge(struct fscache *dest);
+#define merge_fscache(dest) fscache_merge(dest)
+
 #endif
diff --git a/git-compat-util.h b/git-compat-util.h
index 0d7996dff69ae5..43bb9791f3d93d 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1033,6 +1033,10 @@ static inline int is_missing_file_error(int errno_)
  * data or even file content without the need to synchronize with the file
  * system.
  */
+
+ /* opaque fscache structure */
+struct fscache;
+
 #ifndef enable_fscache
 #define enable_fscache(x) /* noop */
 #endif
@@ -1049,6 +1053,14 @@ static inline int is_missing_file_error(int errno_)
 #define flush_fscache() /* noop */
 #endif
 
+#ifndef getcache_fscache
+#define getcache_fscache() (NULL) /* noop */
+#endif
+
+#ifndef merge_fscache
+#define merge_fscache(dest) /* noop */
+#endif
+
 int cmd_main(int, const char **);
 
 /*
diff --git a/preload-index.c b/preload-index.c
index e466fef15bcd79..ac0310008754a3 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -20,6 +20,8 @@
 #include "trace2.h"
 #include "config.h"
 
+static struct fscache *fscache;
+
 /*
  * Mostly randomly chosen maximum thread counts: we
  * cap the parallelism to 20 threads, and we want
@@ -57,6 +59,7 @@ static void *preload_thread(void *_data)
 		nr = index->cache_nr - p->offset;
 	last_nr = nr;
 
+	enable_fscache(nr);
 	do {
 		struct cache_entry *ce = *cep++;
 		struct stat st;
@@ -100,6 +103,7 @@ static void *preload_thread(void *_data)
 		pthread_mutex_unlock(&pd->mutex);
 	}
 	cache_def_clear(&cache);
+	merge_fscache(fscache);
 	return NULL;
 }
 
@@ -118,6 +122,7 @@ void preload_index(struct index_state *index,
 	if (!HAVE_THREADS || !core_preload_index)
 		return;
 
+	fscache = getcache_fscache();
 	threads = index->cache_nr / THREAD_COST;
 	if ((index->cache_nr > 1) && (threads < 2) && git_env_bool("GIT_TEST_PRELOAD_INDEX", 0))
 		threads = 2;
@@ -141,7 +146,6 @@ void preload_index(struct index_state *index,
 		pthread_mutex_init(&pd.mutex, NULL);
 	}
 
-	enable_fscache(index->cache_nr);
 	for (i = 0; i < threads; i++) {
 		struct thread_data *p = data+i;
 		int err;
@@ -177,8 +181,6 @@ void preload_index(struct index_state *index,
 
 	trace2_data_intmax("index", NULL, "preload/sum_lstat", t2_sum_lstat);
 	trace2_region_leave("index", "preload", NULL);
-
-	disable_fscache();
 }
 
 int repo_read_index_preload(struct repository *repo,

From 7e4634b8d4650a299b3cb8e0bf790b21332ae582 Mon Sep 17 00:00:00 2001
From: Ben Peart <benpeart@microsoft.com>
Date: Fri, 2 Nov 2018 11:19:10 -0400
Subject: [PATCH 475/553] fscache: teach fscache to use mempool

Now that the fscache is single threaded, take advantage of the mem_pool as
the allocator to significantly reduce the cost of allocations and frees.

With the reduced cost of free, in future patches, we can start freeing the
fscache at the end of commands instead of just leaking it.
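
Why a pool makes allocation and free cheap can be sketched with a toy
bump allocator. This is not Git's mem_pool (which chains blocks and grows
on demand); toy_pool and its functions are hypothetical names used only
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy single-block pool. */
struct toy_pool {
	char *base, *next_free, *end;
};

static void toy_pool_init(struct toy_pool *pool, size_t size)
{
	pool->base = malloc(size);
	assert(pool->base);
	pool->next_free = pool->base;
	pool->end = pool->base + size;
}

/* Bump-pointer allocation: no per-object header, no per-object free. */
static void *toy_pool_alloc(struct toy_pool *pool, size_t len)
{
	const size_t align = _Alignof(max_align_t);
	void *p;

	/* round up so that every object stays suitably aligned */
	len = (len + align - 1) & ~(align - 1);
	if ((size_t)(pool->end - pool->next_free) < len)
		return NULL; /* a real pool would allocate another block */
	p = pool->next_free;
	pool->next_free += len;
	return p;
}

/* A single free() releases every entry at once. */
static void toy_pool_discard(struct toy_pool *pool)
{
	free(pool->base);
	pool->base = pool->next_free = pool->end = NULL;
}
```

Allocating an entry is a pointer increment, and clearing the whole cache
is one discard instead of one free() per entry.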

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/win32/fscache.c | 45 ++++++++++++++++++++++--------------------
 1 file changed, 24 insertions(+), 21 deletions(-)

diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index f27a7e45e365f4..2d967bd62f129f 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -6,6 +6,7 @@
 #include "../../abspath.h"
 #include "../../trace.h"
 #include "config.h"
+#include "../../mem-pool.h"
 
 static volatile long initialized;
 static DWORD dwTlsIndex;
@@ -20,6 +21,7 @@ static CRITICAL_SECTION mutex;
 struct fscache {
 	volatile long enabled;
 	struct hashmap map;
+	struct mem_pool mem_pool;
 	unsigned int lstat_requests;
 	unsigned int opendir_requests;
 	unsigned int fscache_requests;
@@ -129,11 +131,12 @@ static void fsentry_init(struct fsentry *fse, struct fsentry *list,
 /*
  * Allocate an fsentry structure on the heap.
  */
-static struct fsentry *fsentry_alloc(struct fsentry *list, const char *name,
+static struct fsentry *fsentry_alloc(struct fscache *cache, struct fsentry *list, const char *name,
 		size_t len)
 {
 	/* overallocate fsentry and copy the name to the end */
-	struct fsentry *fse = xmalloc(sizeof(struct fsentry) + len + 1);
+	struct fsentry *fse =
+		mem_pool_alloc(&cache->mem_pool, sizeof(*fse) + len + 1);
 	/* init the rest of the structure */
 	fsentry_init(fse, list, name, len);
 	fse->next = NULL;
@@ -153,27 +156,21 @@ inline static void fsentry_addref(struct fsentry *fse)
 }
 
 /*
- * Release the reference to an fsentry, frees the memory if its the last ref.
+ * Release the reference to an fsentry.
  */
 static void fsentry_release(struct fsentry *fse)
 {
 	if (fse->list)
 		fse = fse->list;
 
-	if (InterlockedDecrement(&(fse->u.refcnt)))
-		return;
-
-	while (fse) {
-		struct fsentry *next = fse->next;
-		free(fse);
-		fse = next;
-	}
+	InterlockedDecrement(&(fse->u.refcnt));
 }
 
 /*
  * Allocate and initialize an fsentry from a WIN32_FIND_DATA structure.
  */
-static struct fsentry *fseentry_create_entry(struct fsentry *list,
+static struct fsentry *fseentry_create_entry(struct fscache *cache,
+					     struct fsentry *list,
 					     const WIN32_FIND_DATAW *fdata)
 {
 	char buf[MAX_PATH * 3];
@@ -181,7 +178,7 @@ static struct fsentry *fseentry_create_entry(struct fsentry *list,
 	struct fsentry *fse;
 	len = xwcstoutf(buf, fdata->cFileName, ARRAY_SIZE(buf));
 
-	fse = fsentry_alloc(list, buf, len);
+	fse = fsentry_alloc(cache, list, buf, len);
 
 	fse->st_mode = file_attr_to_st_mode(fdata->dwFileAttributes);
 	fse->dirent.d_type = S_ISDIR(fse->st_mode) ? DT_DIR : DT_REG;
@@ -199,7 +196,7 @@ static struct fsentry *fseentry_create_entry(struct fsentry *list,
  * Dir should not contain trailing '/'. Use an empty string for the current
  * directory (not "."!).
  */
-static struct fsentry *fsentry_create_list(const struct fsentry *dir,
+static struct fsentry *fsentry_create_list(struct fscache *cache, const struct fsentry *dir,
 					   int *dir_not_found)
 {
 	wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */
@@ -238,14 +235,14 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir,
 	}
 
 	/* allocate object to hold directory listing */
-	list = fsentry_alloc(NULL, dir->dirent.d_name, dir->len);
+	list = fsentry_alloc(cache, NULL, dir->dirent.d_name, dir->len);
 	list->st_mode = S_IFDIR;
 	list->dirent.d_type = DT_DIR;
 
 	/* walk directory and build linked list of fsentry structures */
 	phead = &list->next;
 	do {
-		*phead = fseentry_create_entry(list, &fdata);
+		*phead = fseentry_create_entry(cache, list, &fdata);
 		phead = &(*phead)->next;
 	} while (FindNextFileW(h, &fdata));
 
@@ -257,7 +254,7 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir,
 	if (err == ERROR_NO_MORE_FILES)
 		return list;
 
-	/* otherwise free the list and return error */
+	/* otherwise release the list and return error */
 	fsentry_release(list);
 	errno = err_win_to_posix(err);
 	return NULL;
@@ -280,7 +277,9 @@ static void fscache_add(struct fscache *cache, struct fsentry *fse)
  */
 static void fscache_clear(struct fscache *cache)
 {
-	hashmap_clear_and_free(&cache->map, struct fsentry, ent);
+	mem_pool_discard(&cache->mem_pool, 0);
+	mem_pool_init(&cache->mem_pool, 0);
+	hashmap_clear(&cache->map);
 	hashmap_init(&cache->map, (hashmap_cmp_fn)fsentry_cmp, NULL, 0);
 	cache->lstat_requests = cache->opendir_requests = 0;
 	cache->fscache_misses = cache->fscache_requests = 0;
@@ -333,7 +332,7 @@ static struct fsentry *fscache_get(struct fscache *cache, struct fsentry *key)
 	}
 
 	/* create the directory listing */
-	fse = fsentry_create_list(key->list ? key->list : key, &dir_not_found);
+	fse = fsentry_create_list(cache, key->list ? key->list : key, &dir_not_found);
 
 	/* leave on error (errno set by fsentry_create_list) */
 	if (!fse) {
@@ -343,7 +342,7 @@ static struct fsentry *fscache_get(struct fscache *cache, struct fsentry *key)
 			 * empty, which for all practical matters is the same
 			 * thing as far as fscache is concerned).
 			 */
-			fse = fsentry_alloc(key->list->list,
+			fse = fsentry_alloc(cache, key->list->list,
 					    key->list->dirent.d_name,
 					    key->list->len);
 			fse->st_mode = 0;
@@ -422,6 +421,7 @@ int fscache_enable(size_t initial_size)
 		 * '4' was determined empirically by testing several repos
 		 */
 		hashmap_init(&cache->map, (hashmap_cmp_fn)fsentry_cmp, NULL, initial_size * 4);
+		mem_pool_init(&cache->mem_pool, 0);
 		if (!TlsSetValue(dwTlsIndex, cache))
 			BUG("TlsSetValue error");
 	}
@@ -453,7 +453,8 @@ void fscache_disable(void)
 			"total requests/misses %u/%u\n",
 			cache->lstat_requests, cache->opendir_requests,
 			cache->fscache_requests, cache->fscache_misses);
-		fscache_clear(cache);
+		mem_pool_discard(&cache->mem_pool, 0);
+		hashmap_clear(&cache->map);
 		free(cache);
 	}
 
@@ -643,6 +644,8 @@ void fscache_merge(struct fscache *dest)
 	while ((e = hashmap_iter_next(&iter)))
 		hashmap_add(&dest->map, e);
 
+	mem_pool_combine(&dest->mem_pool, &cache->mem_pool);
+
 	dest->lstat_requests += cache->lstat_requests;
 	dest->opendir_requests += cache->opendir_requests;
 	dest->fscache_requests += cache->fscache_requests;

From 6e866f77cc11cbe4cf5f0bd9c93cd42f6aaa14f6 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Tue, 11 Dec 2018 12:59:29 +0100
Subject: [PATCH 476/553] fscache: remember the reparse tag for each entry

We will use this in the next commit to implement an FSCache-aware
version of is_mount_point().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/win32/fscache.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index ca16b35302a3ce..c05b931455945c 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -46,6 +46,7 @@ static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE);
 struct fsentry {
 	struct hashmap_entry ent;
 	mode_t st_mode;
+	ULONG reparse_tag;
 	/* Pointer to the directory listing, or NULL for the listing itself. */
 	struct fsentry *list;
 	/* Pointer to the next file entry of the list. */
@@ -202,6 +203,10 @@ static struct fsentry *fseentry_create_entry(struct fscache *cache,
 
 	fse = fsentry_alloc(cache, list, buf, len);
 
+	fse->reparse_tag =
+		fdata->FileAttributes & FILE_ATTRIBUTE_REPARSE_POINT ?
+		fdata->EaSize : 0;
+
 	fse->st_mode = file_attr_to_st_mode(fdata->FileAttributes);
 	fse->dirent.d_type = S_ISDIR(fse->st_mode) ? DT_DIR : DT_REG;
 	fse->u.s.st_size = fdata->EndOfFile.LowPart |

From 5d3269cf1b9a3c7fab5af3d944b6ccfddada050b Mon Sep 17 00:00:00 2001
From: Ben Peart <benpeart@microsoft.com>
Date: Fri, 16 Nov 2018 10:59:18 -0500
Subject: [PATCH 477/553] fscache: make fscache_enable() thread safe

The recent change to make fscache thread specific relied on fscache_enable()
being called first from the primary thread before being called in parallel
from worker threads.  Make that more robust and protect it with a critical
section to avoid any issues.

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 compat/mingw.c         |  4 ++++
 compat/win32/fscache.c | 23 +++++++++++++----------
 compat/win32/fscache.h |  2 ++
 3 files changed, 19 insertions(+), 10 deletions(-)
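
Not part of the commit message: the locking pattern described above can be
sketched in portable C. This is only a minimal sketch, assuming a pthread
mutex as a stand-in for the Windows CRITICAL_SECTION. All `demo_*` names are
hypothetical and do not exist in Git. The first caller switches the
redirection on, the last caller switches it off, and the counter is only
touched while the lock is held.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins; none of these names exist in Git. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static long demo_users;     /* plays the role of `initialized` */
static int demo_redirected; /* 1 while the cached implementations are active */

/* The first caller turns the redirection on; every caller takes a reference. */
void demo_enable(void)
{
	pthread_mutex_lock(&demo_lock);
	if (!demo_users)
		demo_redirected = 1;
	demo_users++;
	pthread_mutex_unlock(&demo_lock);
}

/* Every caller drops a reference; the last one turns the redirection off. */
void demo_disable(void)
{
	pthread_mutex_lock(&demo_lock);
	assert(demo_users > 0); /* unbalanced disable is a caller bug */
	demo_users--;
	if (!demo_users)
		demo_redirected = 0;
	pthread_mutex_unlock(&demo_lock);
}

/* Read the redirection state under the same lock. */
int demo_is_redirected(void)
{
	int r;

	pthread_mutex_lock(&demo_lock);
	r = demo_redirected;
	pthread_mutex_unlock(&demo_lock);
	return r;
}
```

Because the check and the update happen inside one locked region, a plain
increment is sufficient in the sketch, which mirrors why the patch replaces
the Interlocked* calls on `initialized` with `++` and `--`.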

diff --git a/compat/mingw.c b/compat/mingw.c
index 9cd3c2c8d21edb..54d73f7333f7be 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -14,6 +14,7 @@
 #include "symlinks.h"
 #include "trace2.h"
 #include "win32.h"
+#include "win32/fscache.h"
 #include "win32/lazyload.h"
 #include "wrapper.h"
 #include "write-or-die.h"
@@ -3714,6 +3715,9 @@ int wmain(int argc, const wchar_t **wargv)
 	/* initialize critical section for waitpid pinfo_t list */
 	InitializeCriticalSection(&pinfo_cs);
 
+	/* initialize critical section for fscache */
+	InitializeCriticalSection(&fscache_cs);
+
 	/* set up default file mode and file modes for stdin/out/err */
 	_fmode = _O_BINARY;
 	_setmode(_fileno(stdin), _O_BINARY);
diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index 2d967bd62f129f..7234318520b8e8 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -10,7 +10,7 @@
 
 static volatile long initialized;
 static DWORD dwTlsIndex;
-static CRITICAL_SECTION mutex;
+CRITICAL_SECTION fscache_cs;
 
 /*
  * Store one fscache per thread to avoid thread contention and locking.
@@ -393,12 +393,12 @@ int fscache_enable(size_t initial_size)
 	 * opendir and lstat function pointers are redirected if
 	 * any threads are using the fscache.
 	 */
+	EnterCriticalSection(&fscache_cs);
 	if (!initialized) {
-		InitializeCriticalSection(&mutex);
 		if (!dwTlsIndex) {
 			dwTlsIndex = TlsAlloc();
 			if (dwTlsIndex == TLS_OUT_OF_INDEXES) {
-				LeaveCriticalSection(&mutex);
+				LeaveCriticalSection(&fscache_cs);
 				return 0;
 			}
 		}
@@ -407,12 +407,13 @@ int fscache_enable(size_t initial_size)
 		opendir = fscache_opendir;
 		lstat = fscache_lstat;
 	}
-	InterlockedIncrement(&initialized);
+	initialized++;
+	LeaveCriticalSection(&fscache_cs);
 
 	/* refcount the thread specific initialization */
 	cache = fscache_getcache();
 	if (cache) {
-		InterlockedIncrement(&cache->enabled);
+		cache->enabled++;
 	} else {
 		cache = (struct fscache *)xcalloc(1, sizeof(*cache));
 		cache->enabled = 1;
@@ -446,7 +447,7 @@ void fscache_disable(void)
 		BUG("fscache_disable() called on a thread where fscache has not been initialized");
 	if (!cache->enabled)
 		BUG("fscache_disable() called on an fscache that is already disabled");
-	InterlockedDecrement(&cache->enabled);
+	cache->enabled--;
 	if (!cache->enabled) {
 		TlsSetValue(dwTlsIndex, NULL);
 		trace_printf_key(&trace_fscache, "fscache_disable: lstat %u, opendir %u, "
@@ -459,12 +460,14 @@ void fscache_disable(void)
 	}
 
 	/* update the global fscache initialization */
-	InterlockedDecrement(&initialized);
+	EnterCriticalSection(&fscache_cs);
+	initialized--;
 	if (!initialized) {
 		/* reset opendir and lstat to the original implementations */
 		opendir = dirent_opendir;
 		lstat = mingw_lstat;
 	}
+	LeaveCriticalSection(&fscache_cs);
 
 	trace_printf_key(&trace_fscache, "fscache: disable\n");
 	return;
@@ -638,7 +641,7 @@ void fscache_merge(struct fscache *dest)
 	 * isn't being used so the critical section only needs to prevent
 	 * the the child threads from stomping on each other.
 	 */
-	EnterCriticalSection(&mutex);
+	EnterCriticalSection(&fscache_cs);
 
 	hashmap_iter_init(&cache->map, &iter);
 	while ((e = hashmap_iter_next(&iter)))
@@ -650,9 +653,9 @@ void fscache_merge(struct fscache *dest)
 	dest->opendir_requests += cache->opendir_requests;
 	dest->fscache_requests += cache->fscache_requests;
 	dest->fscache_misses += cache->fscache_misses;
-	LeaveCriticalSection(&mutex);
+	initialized--;
+	LeaveCriticalSection(&fscache_cs);
 
 	free(cache);
 
-	InterlockedDecrement(&initialized);
 }
diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h
index 2eb8bf3f5cfee8..042b247a542554 100644
--- a/compat/win32/fscache.h
+++ b/compat/win32/fscache.h
@@ -6,6 +6,8 @@
  * for each thread where caching is desired.
  */
 
+extern CRITICAL_SECTION fscache_cs;
+
 int fscache_enable(size_t initial_size);
 #define enable_fscache(initial_size) fscache_enable(initial_size)
 

From 35652a36facf3aecff7fc1e7d987b332363e93a4 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <dstolee@microsoft.com>
Date: Wed, 12 Jun 2019 00:58:49 +0000
Subject: [PATCH 478/553] unpack-trees: enable fscache for sparse-checkout

When updating the skip-worktree bits in the index to align with new
values in a sparse-checkout file, Git scans the entire working
directory with lstat() calls. In a sparse-checkout, many of these
lstat() calls are for paths that do not exist.

Enable the fscache feature during this scan. Since enable_fscache()
calls nest, disable_fscache() decrements a counter and clears the
cache only when that counter reaches zero.

In a local test of a repo with ~2.2 million paths, updating the index
with git read-tree -m -u HEAD with a sparse-checkout file containing
only /.gitattributes improved from 2-3 minutes to ~6 seconds.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 unpack-trees.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/unpack-trees.c b/unpack-trees.c
index f38c761ab987a6..450dbdf7c1bd6c 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1823,7 +1823,9 @@ static void mark_new_skip_worktree(struct pattern_list *pl,
 	 * 2. Widen worktree according to sparse-checkout file.
 	 * Matched entries will have skip_wt_flag cleared (i.e. "in")
 	 */
+	enable_fscache(istate->cache_nr);
 	clear_ce_flags(istate, select_flag, skip_wt_flag, pl, show_progress);
+	disable_fscache();
 }
 
 static void populate_from_existing_patterns(struct unpack_trees_options *o,

From 7daf7fad83a5e3361f2a6ce94a21593a69212238 Mon Sep 17 00:00:00 2001
From: Ben Peart <benpeart@microsoft.com>
Date: Thu, 1 Nov 2018 11:40:51 -0400
Subject: [PATCH 479/553] status: disable and free fscache at the end of the
 status command

At the end of the status command, disable and free the fscache so that we
don't leak the memory and so that we can dump the fscache statistics.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 builtin/commit.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/builtin/commit.c b/builtin/commit.c
index 2309cf06acad09..1b6def061cf33b 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1663,6 +1663,7 @@ struct repository *repo UNUSED)
 	wt_status_print(&s);
 	wt_status_collect_free_buffers(&s);
 
+	disable_fscache();
 	return 0;
 }
 

From 257ad8faf42cb0370655f270501fafc41a00ab64 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Tue, 11 Dec 2018 12:17:49 +0100
Subject: [PATCH 480/553] fscache: implement an FSCache-aware is_mount_point()

When FSCache is active, we can cache the reparse tag and use it directly
to determine whether a path refers to an NTFS junction, without any
additional, costly I/O.

Note: this change only makes a difference with the next commit, which
will make use of the FSCache in `git clean` (contingent on
`core.fscache` set, of course).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c         |  2 ++
 compat/mingw.h         |  3 ++-
 compat/win32/fscache.c | 40 ++++++++++++++++++++++++++++++++++++++++
 compat/win32/fscache.h |  1 +
 4 files changed, 45 insertions(+), 1 deletion(-)
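
Not part of the commit message: the lookup added below first splits the path
into a directory part and a name part before querying the cache. The
following is a minimal sketch of that split in portable C, under the
assumption that both '/' and '\' count as separators. `demo_is_dir_sep` and
`demo_split_path` are hypothetical names used only for illustration, not
Git functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Git: both slashes count as separators. */
static int demo_is_dir_sep(char c)
{
	return c == '/' || c == '\\';
}

/*
 * Split path[0..len) into a directory part path[0..*dirlen) and a name part
 * path[*base..*base + *namelen), ignoring one trailing separator.
 */
void demo_split_path(const char *path, size_t len,
		     size_t *dirlen, size_t *base, size_t *namelen)
{
	size_t b;

	assert(path && dirlen && base && namelen);

	/* drop a single trailing separator, e.g. "a/b/" is treated as "a/b" */
	if (len && demo_is_dir_sep(path[len - 1]))
		len--;

	/* scan backwards to the start of the last path component */
	b = len;
	while (b && !demo_is_dir_sep(path[b - 1]))
		b--;

	/* the directory part excludes the separator in front of the name */
	*dirlen = b ? b - 1 : 0;
	*base = b;
	*namelen = len - b;
}
```

For "a/b/c" the sketch yields the directory "a/b" (length 3) and the name "c";
for a bare "foo" the directory part is empty, matching the convention that an
empty string denotes the current directory.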

diff --git a/compat/mingw.c b/compat/mingw.c
index 54d73f7333f7be..38dc08241f7148 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -2988,6 +2988,8 @@ pid_t waitpid(pid_t pid, int *status, int options)
 	return -1;
 }
 
+int (*win32_is_mount_point)(struct strbuf *path) = mingw_is_mount_point;
+
 int mingw_is_mount_point(struct strbuf *path)
 {
 	WIN32_FIND_DATAW findbuf = { 0 };
diff --git a/compat/mingw.h b/compat/mingw.h
index 65df57d2a786e4..96677cbb86716d 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -40,7 +40,8 @@ static inline void convert_slashes(char *path)
 }
 struct strbuf;
 int mingw_is_mount_point(struct strbuf *path);
-#define is_mount_point mingw_is_mount_point
+extern int (*win32_is_mount_point)(struct strbuf *path);
+#define is_mount_point win32_is_mount_point
 #define CAN_UNLINK_MOUNT_POINTS 1
 #define PATH_SEP ';'
 char *mingw_query_user_email(void);
diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index c05b931455945c..75dd33dc66bea0 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -474,6 +474,7 @@ int fscache_enable(size_t initial_size)
 		/* redirect opendir and lstat to the fscache implementations */
 		opendir = fscache_opendir;
 		lstat = fscache_lstat;
+		win32_is_mount_point = fscache_is_mount_point;
 	}
 	initialized++;
 	LeaveCriticalSection(&fscache_cs);
@@ -534,6 +535,7 @@ void fscache_disable(void)
 		/* reset opendir and lstat to the original implementations */
 		opendir = dirent_opendir;
 		lstat = mingw_lstat;
+		win32_is_mount_point = mingw_is_mount_point;
 	}
 	LeaveCriticalSection(&fscache_cs);
 
@@ -609,6 +611,44 @@ int fscache_lstat(const char *filename, struct stat *st)
 	return 0;
 }
 
+/*
+ * is_mount_point() replacement, uses cache if enabled, otherwise falls
+ * back to mingw_is_mount_point().
+ */
+int fscache_is_mount_point(struct strbuf *path)
+{
+	int dirlen, base, len;
+#pragma GCC diagnostic push
+#ifdef __clang__
+#pragma GCC diagnostic ignored "-Wflexible-array-extensions"
+#endif
+	struct heap_fsentry key[2];
+#pragma GCC diagnostic pop
+	struct fsentry *fse;
+	struct fscache *cache = fscache_getcache();
+
+	if (!cache || !do_fscache_enabled(cache, path->buf))
+		return mingw_is_mount_point(path);
+
+	cache->lstat_requests++;
+	/* split path into path + name */
+	len = path->len;
+	if (len && is_dir_sep(path->buf[len - 1]))
+		len--;
+	base = len;
+	while (base && !is_dir_sep(path->buf[base - 1]))
+		base--;
+	dirlen = base ? base - 1 : 0;
+
+	/* lookup entry for path + name in cache */
+	fsentry_init(&key[0].u.ent, NULL, path->buf, dirlen);
+	fsentry_init(&key[1].u.ent, &key[0].u.ent, path->buf + base, len - base);
+	fse = fscache_get(cache, &key[1].u.ent);
+	if (!fse)
+		return mingw_is_mount_point(path);
+	return fse->reparse_tag == IO_REPARSE_TAG_MOUNT_POINT;
+}
+
 typedef struct fscache_DIR {
 	struct DIR base_dir; /* extend base struct DIR */
 	struct fsentry *pfsentry;
diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h
index 042b247a542554..386c770a85d321 100644
--- a/compat/win32/fscache.h
+++ b/compat/win32/fscache.h
@@ -22,6 +22,7 @@ void fscache_flush(void);
 
 DIR *fscache_opendir(const char *dir);
 int fscache_lstat(const char *file_name, struct stat *buf);
+int fscache_is_mount_point(struct strbuf *path);
 
 /* opaque fscache structure */
 struct fscache;

From 778c05b22ea7e3a42668841c7124f0d190151b1e Mon Sep 17 00:00:00 2001
From: Ben Peart <benpeart@microsoft.com>
Date: Thu, 15 Nov 2018 14:15:40 -0500
Subject: [PATCH 481/553] fscache: teach fscache to use NtQueryDirectoryFile

Using FindFirstFileExW() requires the OS to allocate a 64K buffer for each
directory and then free it when we call FindClose().  Update fscache to call
the underlying kernel API NtQueryDirectoryFile so that we can do the buffer
management ourselves.  That allows us to allocate a single buffer for the
lifetime of the cache and reuse it for each directory.

This change improves performance of 'git status' by 18% in a repo with ~200K
files and 30k folders.

Documentation for NtQueryDirectoryFile can be found at:

https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/content/ntifs/nf-ntifs-ntquerydirectoryfile
https://docs.microsoft.com/en-us/windows/desktop/FileIO/file-attribute-constants
https://docs.microsoft.com/en-us/windows/desktop/fileio/reparse-point-tags

To determine if the specified directory is a symbolic link, inspect the
FileAttributes member to see if the FILE_ATTRIBUTE_REPARSE_POINT flag is
set. If so, EaSize will contain the reparse tag (this is a so far
undocumented feature, but confirmed by the NTFS developers). To
determine if the reparse point is a symbolic link (and not some other
form of reparse point), test whether the tag value equals the value
IO_REPARSE_TAG_SYMLINK.

The NtQueryDirectoryFile() call works best (and on Windows 8.1 and
earlier, it works *only*) with buffer sizes up to 64kB, which
corresponds to 32k wide characters, so let's use that as our buffer
size.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/win32/fscache.c | 123 ++++++++++++++++++++++++++++----------
 compat/win32/ntifs.h   | 131 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 224 insertions(+), 30 deletions(-)
 create mode 100644 compat/win32/ntifs.h
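
Not part of the commit message: each NtQueryDirectoryFile() call fills the
buffer with variable-length records that are chained by a byte offset, with
an offset of 0 marking the last record in the buffer. The following minimal
sketch shows that traversal in portable C. `struct demo_record` is a
hypothetical, simplified layout that is only loosely modelled on
FILE_FULL_DIR_INFORMATION, and `demo_count_records` is not a Git function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout; the real structure has many more fields. */
struct demo_record {
	uint32_t next_entry_offset; /* bytes to the next record, 0 for the last */
	uint32_t name_length;       /* bytes used in name[] */
	char name[8];
};

/* Count the records in one filled buffer by following next_entry_offset. */
size_t demo_count_records(const unsigned char *buf)
{
	size_t n = 0;

	assert(buf);
	for (;;) {
		struct demo_record rec;

		/* copy the header out so alignment of buf does not matter */
		memcpy(&rec, buf, sizeof(rec));
		n++;
		if (!rec.next_entry_offset)
			break; /* this buffer is exhausted */
		buf += rec.next_entry_offset;
	}
	return n;
}
```

In the patch itself, reaching an offset of 0 does not end the enumeration: the
code asks for the next batch into the same buffer and only stops when the
call reports STATUS_NO_MORE_FILES.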

diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index 7234318520b8e8..ca16b35302a3ce 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -7,6 +7,7 @@
 #include "../../trace.h"
 #include "config.h"
 #include "../../mem-pool.h"
+#include "ntifs.h"
 
 static volatile long initialized;
 static DWORD dwTlsIndex;
@@ -26,6 +27,13 @@ struct fscache {
 	unsigned int opendir_requests;
 	unsigned int fscache_requests;
 	unsigned int fscache_misses;
+	/*
+	 * 32k wide characters translates to 64kB, which is the maximum that
+	 * Windows 8.1 and earlier can handle. On network drives, not only
+	 * the client's Windows version matters, but also the server's,
+	 * therefore we need to keep this to 64kB.
+	 */
+	WCHAR buffer[32 * 1024];
 };
 static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE);
 
@@ -166,27 +174,44 @@ static void fsentry_release(struct fsentry *fse)
 	InterlockedDecrement(&(fse->u.refcnt));
 }
 
+static int xwcstoutfn(char *utf, int utflen, const wchar_t *wcs, int wcslen)
+{
+	if (!wcs || !utf || utflen < 1) {
+		errno = EINVAL;
+		return -1;
+	}
+	utflen = WideCharToMultiByte(CP_UTF8, 0, wcs, wcslen, utf, utflen, NULL, NULL);
+	if (utflen)
+		return utflen;
+	errno = ERANGE;
+	return -1;
+}
+
 /*
- * Allocate and initialize an fsentry from a WIN32_FIND_DATA structure.
+ * Allocate and initialize an fsentry from a FILE_FULL_DIR_INFORMATION structure.
  */
 static struct fsentry *fseentry_create_entry(struct fscache *cache,
 					     struct fsentry *list,
-					     const WIN32_FIND_DATAW *fdata)
+					     PFILE_FULL_DIR_INFORMATION fdata)
 {
 	char buf[MAX_PATH * 3];
 	int len;
 	struct fsentry *fse;
-	len = xwcstoutf(buf, fdata->cFileName, ARRAY_SIZE(buf));
+
+	len = xwcstoutfn(buf, ARRAY_SIZE(buf), fdata->FileName, fdata->FileNameLength / sizeof(wchar_t));
 
 	fse = fsentry_alloc(cache, list, buf, len);
 
-	fse->st_mode = file_attr_to_st_mode(fdata->dwFileAttributes);
+	fse->st_mode = file_attr_to_st_mode(fdata->FileAttributes);
 	fse->dirent.d_type = S_ISDIR(fse->st_mode) ? DT_DIR : DT_REG;
-	fse->u.s.st_size = (((off64_t) (fdata->nFileSizeHigh)) << 32)
-			| fdata->nFileSizeLow;
-	filetime_to_timespec(&(fdata->ftLastAccessTime), &(fse->u.s.st_atim));
-	filetime_to_timespec(&(fdata->ftLastWriteTime), &(fse->u.s.st_mtim));
-	filetime_to_timespec(&(fdata->ftCreationTime), &(fse->u.s.st_ctim));
+	fse->u.s.st_size = fdata->EndOfFile.LowPart |
+		(((off_t)fdata->EndOfFile.HighPart) << 32);
+	filetime_to_timespec((FILETIME *)&(fdata->LastAccessTime),
+			     &(fse->u.s.st_atim));
+	filetime_to_timespec((FILETIME *)&(fdata->LastWriteTime),
+			     &(fse->u.s.st_mtim));
+	filetime_to_timespec((FILETIME *)&(fdata->CreationTime),
+			     &(fse->u.s.st_ctim));
 
 	return fse;
 }
@@ -199,8 +224,10 @@ static struct fsentry *fseentry_create_entry(struct fscache *cache,
 static struct fsentry *fsentry_create_list(struct fscache *cache, const struct fsentry *dir,
 					   int *dir_not_found)
 {
-	wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */
-	WIN32_FIND_DATAW fdata;
+	wchar_t pattern[MAX_PATH];
+	NTSTATUS status;
+	IO_STATUS_BLOCK iosb;
+	PFILE_FULL_DIR_INFORMATION di;
 	HANDLE h;
 	int wlen;
 	struct fsentry *list, **phead;
@@ -216,15 +243,18 @@ static struct fsentry *fsentry_create_list(struct fscache *cache, const struct f
 		return NULL;
 	}
 
-	/* append optional '/' and wildcard '*' */
-	if (wlen)
-		pattern[wlen++] = '/';
-	pattern[wlen++] = '*';
-	pattern[wlen] = 0;
+	/* handle CWD */
+	if (!wlen) {
+		wlen = GetCurrentDirectoryW(ARRAY_SIZE(pattern), pattern);
+		if (!wlen || wlen >= (ssize_t)ARRAY_SIZE(pattern)) {
+			errno = wlen ? ENAMETOOLONG : err_win_to_posix(GetLastError());
+			return NULL;
+		}
+	}
 
-	/* open find handle */
-	h = FindFirstFileExW(pattern, FindExInfoBasic, &fdata, FindExSearchNameMatch,
-		NULL, FIND_FIRST_EX_LARGE_FETCH);
+	h = CreateFileW(pattern, FILE_LIST_DIRECTORY,
+		FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
+		NULL, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
 	if (h == INVALID_HANDLE_VALUE) {
 		err = GetLastError();
 		*dir_not_found = 1; /* or empty directory */
@@ -241,22 +271,55 @@ static struct fsentry *fsentry_create_list(struct fscache *cache, const struct f
 
 	/* walk directory and build linked list of fsentry structures */
 	phead = &list->next;
-	do {
-		*phead = fseentry_create_entry(cache, list, &fdata);
+	status = NtQueryDirectoryFile(h, NULL, 0, 0, &iosb, cache->buffer,
+		sizeof(cache->buffer), FileFullDirectoryInformation, FALSE, NULL, FALSE);
+	if (!NT_SUCCESS(status)) {
+		/*
+		 * NtQueryDirectoryFile returns STATUS_INVALID_PARAMETER when
+		 * asked to enumerate an invalid directory (ie it is a file
+		 * instead of a directory).  Verify that is the actual cause
+		 * of the error.
+		*/
+		if (status == (NTSTATUS)STATUS_INVALID_PARAMETER) {
+			DWORD attributes = GetFileAttributesW(pattern);
+			if (!(attributes & FILE_ATTRIBUTE_DIRECTORY))
+				status = ERROR_DIRECTORY;
+		}
+		goto Error;
+	}
+	di = (PFILE_FULL_DIR_INFORMATION)(cache->buffer);
+	for (;;) {
+
+		*phead = fseentry_create_entry(cache, list, di);
 		phead = &(*phead)->next;
-	} while (FindNextFileW(h, &fdata));
 
-	/* remember result of last FindNextFile, then close find handle */
-	err = GetLastError();
-	FindClose(h);
+		/* If there is no offset in the entry, the buffer has been exhausted. */
+		if (di->NextEntryOffset == 0) {
+			status = NtQueryDirectoryFile(h, NULL, 0, 0, &iosb, cache->buffer,
+				sizeof(cache->buffer), FileFullDirectoryInformation, FALSE, NULL, FALSE);
+			if (!NT_SUCCESS(status)) {
+				if (status == STATUS_NO_MORE_FILES)
+					break;
+				goto Error;
+			}
+
+			di = (PFILE_FULL_DIR_INFORMATION)(cache->buffer);
+			continue;
+		}
+
+		/* Advance to the next entry. */
+		di = (PFILE_FULL_DIR_INFORMATION)(((PUCHAR)di) + di->NextEntryOffset);
+	}
 
-	/* return the list if we've got all the files */
-	if (err == ERROR_NO_MORE_FILES)
-		return list;
+	CloseHandle(h);
+	return list;
 
-	/* otherwise release the list and return error */
+Error:
+	trace_printf_key(&trace_fscache,
+			 "fscache: status(%ld) unable to query directory "
+			 "contents '%s'\n", status, dir->dirent.d_name);
+	CloseHandle(h);
 	fsentry_release(list);
-	errno = err_win_to_posix(err);
 	return NULL;
 }
 
diff --git a/compat/win32/ntifs.h b/compat/win32/ntifs.h
new file mode 100644
index 00000000000000..64ed792c52f352
--- /dev/null
+++ b/compat/win32/ntifs.h
@@ -0,0 +1,131 @@
+#ifndef _NTIFS_
+#define _NTIFS_
+
+/*
+ * Copy necessary structures and definitions out of the Windows DDK
+ * to enable calling NtQueryDirectoryFile()
+ */
+
+typedef _Return_type_success_(return >= 0) LONG NTSTATUS;
+#define NT_SUCCESS(Status)  (((NTSTATUS)(Status)) >= 0)
+
+#if !defined(_NTSECAPI_) && !defined(_WINTERNL_) && \
+	!defined(__UNICODE_STRING_DEFINED)
+#define __UNICODE_STRING_DEFINED
+typedef struct _UNICODE_STRING {
+	USHORT Length;
+	USHORT MaximumLength;
+	PWSTR Buffer;
+} UNICODE_STRING;
+typedef UNICODE_STRING *PUNICODE_STRING;
+typedef const UNICODE_STRING *PCUNICODE_STRING;
+#endif /* !_NTSECAPI_ && !_WINTERNL_ && !__UNICODE_STRING_DEFINED */
+
+typedef enum _FILE_INFORMATION_CLASS {
+	FileDirectoryInformation = 1,
+	FileFullDirectoryInformation,
+	FileBothDirectoryInformation,
+	FileBasicInformation,
+	FileStandardInformation,
+	FileInternalInformation,
+	FileEaInformation,
+	FileAccessInformation,
+	FileNameInformation,
+	FileRenameInformation,
+	FileLinkInformation,
+	FileNamesInformation,
+	FileDispositionInformation,
+	FilePositionInformation,
+	FileFullEaInformation,
+	FileModeInformation,
+	FileAlignmentInformation,
+	FileAllInformation,
+	FileAllocationInformation,
+	FileEndOfFileInformation,
+	FileAlternateNameInformation,
+	FileStreamInformation,
+	FilePipeInformation,
+	FilePipeLocalInformation,
+	FilePipeRemoteInformation,
+	FileMailslotQueryInformation,
+	FileMailslotSetInformation,
+	FileCompressionInformation,
+	FileObjectIdInformation,
+	FileCompletionInformation,
+	FileMoveClusterInformation,
+	FileQuotaInformation,
+	FileReparsePointInformation,
+	FileNetworkOpenInformation,
+	FileAttributeTagInformation,
+	FileTrackingInformation,
+	FileIdBothDirectoryInformation,
+	FileIdFullDirectoryInformation,
+	FileValidDataLengthInformation,
+	FileShortNameInformation,
+	FileIoCompletionNotificationInformation,
+	FileIoStatusBlockRangeInformation,
+	FileIoPriorityHintInformation,
+	FileSfioReserveInformation,
+	FileSfioVolumeInformation,
+	FileHardLinkInformation,
+	FileProcessIdsUsingFileInformation,
+	FileNormalizedNameInformation,
+	FileNetworkPhysicalNameInformation,
+	FileIdGlobalTxDirectoryInformation,
+	FileIsRemoteDeviceInformation,
+	FileAttributeCacheInformation,
+	FileNumaNodeInformation,
+	FileStandardLinkInformation,
+	FileRemoteProtocolInformation,
+	FileMaximumInformation
+} FILE_INFORMATION_CLASS, *PFILE_INFORMATION_CLASS;
+
+typedef struct _FILE_FULL_DIR_INFORMATION {
+	ULONG NextEntryOffset;
+	ULONG FileIndex;
+	LARGE_INTEGER CreationTime;
+	LARGE_INTEGER LastAccessTime;
+	LARGE_INTEGER LastWriteTime;
+	LARGE_INTEGER ChangeTime;
+	LARGE_INTEGER EndOfFile;
+	LARGE_INTEGER AllocationSize;
+	ULONG FileAttributes;
+	ULONG FileNameLength;
+	ULONG EaSize;
+	WCHAR FileName[1];
+} FILE_FULL_DIR_INFORMATION, *PFILE_FULL_DIR_INFORMATION;
+
+typedef struct _IO_STATUS_BLOCK {
+	union {
+		NTSTATUS Status;
+		PVOID Pointer;
+	} u;
+	ULONG_PTR Information;
+} IO_STATUS_BLOCK, *PIO_STATUS_BLOCK;
+
+typedef VOID
+(NTAPI *PIO_APC_ROUTINE)(
+	IN PVOID ApcContext,
+	IN PIO_STATUS_BLOCK IoStatusBlock,
+	IN ULONG Reserved);
+
+NTSYSCALLAPI
+NTSTATUS
+NTAPI
+NtQueryDirectoryFile(
+	_In_ HANDLE FileHandle,
+	_In_opt_ HANDLE Event,
+	_In_opt_ PIO_APC_ROUTINE ApcRoutine,
+	_In_opt_ PVOID ApcContext,
+	_Out_ PIO_STATUS_BLOCK IoStatusBlock,
+	_Out_writes_bytes_(Length) PVOID FileInformation,
+	_In_ ULONG Length,
+	_In_ FILE_INFORMATION_CLASS FileInformationClass,
+	_In_ BOOLEAN ReturnSingleEntry,
+	_In_opt_ PUNICODE_STRING FileName,
+	_In_ BOOLEAN RestartScan
+);
+
+#define STATUS_NO_MORE_FILES             ((NTSTATUS)0x80000006L)
+
+#endif

From 817abd6aa67544701469e85fd76946b33c68e3e1 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Tue, 11 Dec 2018 12:17:49 +0100
Subject: [PATCH 482/553] clean: make use of FSCache

The `git clean` command needs to enumerate plenty of files and
directories, and can therefore benefit from the FSCache.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/clean.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/builtin/clean.c b/builtin/clean.c
index 6ed555000f9a41..e15d595c3dc7cc 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -1042,6 +1042,7 @@ int cmd_clean(int argc,
 
 	if (repo_read_index(the_repository) < 0)
 		die(_("index file corrupt"));
+	enable_fscache(the_repository->index->cache_nr);
 
 	pl = add_pattern_list(&dir, EXC_CMDL, "--exclude option");
 	for (i = 0; i < exclude_list.nr; i++)
@@ -1116,6 +1117,7 @@ int cmd_clean(int argc,
 		}
 	}
 
+	disable_fscache();
 	strbuf_release(&abs_path);
 	strbuf_release(&buf);
 	string_list_clear(&del_list, 0);

From bc43c6eb704abcfbd40f9dd5e7532f18fc556261 Mon Sep 17 00:00:00 2001
From: Doug Kelly <dougk.ff7@gmail.com>
Date: Wed, 8 Jan 2014 20:28:15 -0600
Subject: [PATCH 483/553] pack-objects (mingw): demonstrate a segmentation
 fault with large deltas

There is a problem in the way 9ac3f0e5b3e4 (pack-objects: fix
performance issues on packing large deltas, 2018-07-22) initializes that
mutex in the `packing_data` struct. The problem manifests in a
segmentation fault on Windows, when a mutex (AKA critical section) is
accessed without being initialized. (With pthreads, you apparently do
not really have to initialize them?)

This was reported in https://github.com/git-for-windows/git/issues/1839.

Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/meson.build                  |   1 +
 t/t7429-submodule-long-path.sh | 106 +++++++++++++++++++++++++++++++++
 2 files changed, 107 insertions(+)
 create mode 100755 t/t7429-submodule-long-path.sh

diff --git a/t/meson.build b/t/meson.build
index 1e26a4c7a9548d..b56620f16a85af 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -888,6 +888,7 @@ integration_tests = [
   't7422-submodule-output.sh',
   't7423-submodule-symlinks.sh',
   't7424-submodule-mixed-ref-formats.sh',
+  't7429-submodule-long-path.sh',
   't7450-bad-git-dotfiles.sh',
   't7500-commit-template-squash-signoff.sh',
   't7501-commit-basic-functionality.sh',
diff --git a/t/t7429-submodule-long-path.sh b/t/t7429-submodule-long-path.sh
new file mode 100755
index 00000000000000..f692cedbff7ff8
--- /dev/null
+++ b/t/t7429-submodule-long-path.sh
@@ -0,0 +1,106 @@
+#!/bin/sh
+#
+# Copyright (c) 2013 Doug Kelly
+#
+
+test_description='Test submodules with a path near PATH_MAX
+
+This test verifies that "git submodule" initialization, update and clones work, including with recursive submodules and paths approaching PATH_MAX (260 characters on Windows)
+'
+
+TEST_NO_CREATE_REPO=1
+. ./test-lib.sh
+
+longpath=""
+for (( i=0; i<4; i++ )); do
+	longpath="0123456789abcdefghijklmnopqrstuvwxyz$longpath"
+done
+# Pick a substring maximum of 90 characters
+# This should be good, since we'll add on a lot for temp directories
+longpath=${longpath:0:90}; export longpath
+
+test_expect_failure 'submodule with a long path' '
+	git config --global protocol.file.allow always &&
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=long init --bare remote &&
+	test_create_repo bundle1 &&
+	(
+		cd bundle1 &&
+		test_commit "shoot" &&
+		git rev-parse --verify HEAD >../expect
+	) &&
+	mkdir home &&
+	(
+		cd home &&
+		git clone ../remote test &&
+		cd test &&
+		git checkout -B long &&
+		git submodule add ../bundle1 $longpath &&
+		test_commit "sogood" &&
+		(
+			cd $longpath &&
+			git rev-parse --verify HEAD >actual &&
+			test_cmp ../../../expect actual
+		) &&
+		git push origin long
+	) &&
+	mkdir home2 &&
+	(
+		cd home2 &&
+		git clone ../remote test &&
+		cd test &&
+		git checkout long &&
+		git submodule update --init &&
+		(
+			cd $longpath &&
+			git rev-parse --verify HEAD >actual &&
+			test_cmp ../../../expect actual
+		)
+	)
+'
+
+test_expect_failure 'recursive submodule with a long path' '
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=long init --bare super &&
+	test_create_repo child &&
+	(
+		cd child &&
+		test_commit "shoot" &&
+		git rev-parse --verify HEAD >../expect
+	) &&
+	test_create_repo parent &&
+	(
+		cd parent &&
+		git submodule add ../child $longpath &&
+		test_commit "aim"
+	) &&
+	mkdir home3 &&
+	(
+		cd home3 &&
+		git clone ../super test &&
+		cd test &&
+		git checkout -B long &&
+		git submodule add ../parent foo &&
+		git submodule update --init --recursive &&
+		test_commit "sogood" &&
+		(
+			cd foo/$longpath &&
+			git rev-parse --verify HEAD >actual &&
+			test_cmp ../../../../expect actual
+		) &&
+		git push origin long
+	) &&
+	mkdir home4 &&
+	(
+		cd home4 &&
+		git clone ../super test --recursive &&
+		(
+			cd test/foo/$longpath &&
+			git rev-parse --verify HEAD >actual &&
+			test_cmp ../../../../expect actual
+		)
+	)
+'
+unset longpath
+
+test_done

From e0bb0fcaa4f7728278c4e2cb235fa6c34fbcc372 Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Tue, 28 Jul 2015 21:07:41 +0200
Subject: [PATCH 484/553] mingw: support long paths

Windows paths are typically limited to MAX_PATH = 260 characters, even
though the underlying NTFS file system supports paths up to 32,767 chars.
This limitation is also evident in Windows Explorer, cmd.exe and many
other applications (including IDEs).

Particularly annoying is that most Windows APIs return bogus error codes
if a relative path only barely exceeds MAX_PATH in conjunction with the
current directory, e.g. ERROR_PATH_NOT_FOUND / ENOENT instead of the
infinitely more helpful ERROR_FILENAME_EXCED_RANGE / ENAMETOOLONG.

Many Windows wide char APIs support paths longer than MAX_PATH through the
file namespace prefix ('\\?\' or '\\?\UNC\') followed by an absolute path.
Notable exceptions include functions dealing with executables and the
current directory (CreateProcess, LoadLibrary, Get/SetCurrentDirectory) as
well as the entire shell API (ShellExecute, SHGetSpecialFolderPath...).

Introduce a handle_long_path function to check the length of a specified
path properly (and fail with ENAMETOOLONG), and to optionally expand long
paths using the '\\?\' file namespace prefix. Short paths will not be
modified, so we don't need to worry about device names (NUL, CON, AUX).
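
As a rough illustration of the "short paths will not be modified" rule,
the check can be sketched in portable C. The name
fits_without_conversion is hypothetical and not part of this patch, and
cwd_len is assumed to be the value of GetCurrentDirectoryW(0, NULL),
which counts the terminating NUL:

```c
#include <wchar.h>

/*
 * Hypothetical predicate, not part of the patch: a path needs no further
 * handling if it is relative to the current directory and "cwd + path"
 * stays below max_path. cwd_len is an assumption: the buffer size that
 * GetCurrentDirectoryW(0, NULL) reports, including the terminating NUL.
 */
static int fits_without_conversion(const wchar_t *path, int len,
				   int cwd_len, int max_path)
{
	/* absolute if it starts with a separator or has a drive colon */
	int is_relative = len < 2 ||
		(path[0] != L'/' && path[0] != L'\\' && path[1] != L':');

	return is_relative && cwd_len + len < max_path;
}
```

Everything that fails this check (absolute, UNC, drive-relative and
already prefixed paths, or relative paths that are too long) goes
through the slower conversion.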

Contrary to MSDN docs, the GetFullPathNameW function doesn't seem to be
limited to MAX_PATH (at least not on Win7), so we can use it to do the
heavy lifting of the conversion (translate '/' to '\', eliminate '.' and
'..', and make an absolute path).
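
Once the path is absolute, choosing the prefix is a small decision. The
following is a minimal sketch only, in portable C so that it compiles
without windows.h. The helper add_long_path_prefix and its out_size
parameter are hypothetical, and the input is assumed to be an absolute
path with backslashes, as GetFullPathNameW would produce:

```c
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical helper, not part of the patch: copy the absolute path
 * `abs` into `out` with the file namespace prefix added. Returns the
 * length of the result, or -1 if `out` is too small.
 */
static int add_long_path_prefix(wchar_t *out, size_t out_size,
				const wchar_t *abs)
{
	size_t len = wcslen(abs);

	if (abs[0] == L'\\' && abs[1] == L'\\') {
		/* already prefixed with "\\?\" or "\\.\": copy unchanged */
		if (abs[2] == L'?' || abs[2] == L'.') {
			if (len + 1 > out_size)
				return -1;
			wcscpy(out, abs);
			return (int)len;
		}
		/* UNC path "\\server\share\...": replace the leading "\\" */
		if (len - 2 + 8 + 1 > out_size)
			return -1;
		wcscpy(out, L"\\\\?\\UNC\\");
		wcscpy(out + 8, abs + 2);
		return (int)(len + 6);
	}

	/* drive path "C:\dir\file": prepend "\\?\" */
	if (len + 4 + 1 > out_size)
		return -1;
	wcscpy(out, L"\\\\?\\");
	wcscpy(out + 4, abs);
	return (int)(len + 4);
}
```

Under these assumptions, "C:\dir\file" becomes "\\?\C:\dir\file" and
"\\server\share\f" becomes "\\?\UNC\server\share\f".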

Add long path error checking to xutftowcs_path for APIs with hard MAX_PATH
limit.

Add a new MAX_LONG_PATH constant and xutftowcs_long_path function for APIs
that support long paths.

While improved error checking is always active, long path support must be
explicitly enabled via the 'core.longpaths' option. This is to prevent end
users from shooting themselves in the foot by checking out files that
Windows Explorer, cmd/bash or their favorite IDE cannot handle.

Test suite:
Test the case where the full pathname length of a directory is close
to 260 (MAX_PATH).
Bug report and an original reproducer by Andrey Rogozhnikov:
https://github.com/msysgit/git/pull/122#issuecomment-43604199

[jes: adjusted test number to avoid conflicts, added support for
chdir(), etc]

Thanks-to: Martin W. Kirst <maki@bitkings.de>
Thanks-to: Doug Kelly <dougk.ff7@gmail.com>
Original-test-by: Andrey Rogozhnikov <rogozhnikov.andrey@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Stepan Kasal <kasal@ucw.cz>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Josh Soref <jsoref@gmail.com>
---
 Documentation/config/core.adoc |   7 ++
 compat/mingw.c                 | 173 ++++++++++++++++++++++++++-------
 compat/mingw.h                 |  75 +++++++++++++-
 compat/win32/dirent.c          |  17 ++--
 compat/win32/fscache.c         |  16 ++-
 t/meson.build                  |   1 +
 t/t2031-checkout-long-paths.sh | 102 +++++++++++++++++++
 t/t7429-submodule-long-path.sh |  24 +++--
 8 files changed, 348 insertions(+), 67 deletions(-)
 create mode 100755 t/t2031-checkout-long-paths.sh

diff --git a/Documentation/config/core.adoc b/Documentation/config/core.adoc
index 80dcc4b004b1e9..fb9aba0e08cef6 100644
--- a/Documentation/config/core.adoc
+++ b/Documentation/config/core.adoc
@@ -716,6 +716,13 @@ core.fscache::
 Git for Windows uses this to bulk-read and cache lstat data of entire
 directories (instead of doing lstat file by file).
 
+core.longpaths::
+	Enable long path (> 260) support for builtin commands in Git for
+	Windows. This is disabled by default, as long paths are not supported
+	by Windows Explorer, cmd.exe and the Git for Windows tool chain
+	(msys, bash, tcl, perl...). Only enable this if you know what you're
+	doing and are prepared to live with a few quirks.
+
 core.unsetenvvars::
 	Windows-only: comma-separated list of environment variables'
 	names that need to be unset before spawning any other process.
diff --git a/compat/mingw.c b/compat/mingw.c
index 38dc08241f7148..2dd5a12f4bb59d 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -251,6 +251,27 @@ static enum hide_dotfiles_type hide_dotfiles = HIDE_DOTFILES_DOTGITONLY;
 static char *unset_environment_variables;
 int core_fscache;
 
+int are_long_paths_enabled(void)
+{
+	/* default to `false` during initialization */
+	static const int fallback = 0;
+
+	static int enabled = -1;
+
+	if (enabled < 0) {
+		/* avoid infinite recursion */
+		if (!the_repository)
+			return fallback;
+
+		if (the_repository->config &&
+		    the_repository->config->hash_initialized &&
+		    repo_config_get_bool(the_repository, "core.longpaths", &enabled) < 0)
+			enabled = 0;
+	}
+
+	return enabled < 0 ? fallback : enabled;
+}
+
 int mingw_core_config(const char *var, const char *value,
 		      const struct config_context *ctx UNUSED,
 		      void *cb UNUSED)
@@ -307,8 +328,8 @@ static wchar_t *normalize_ntpath(wchar_t *wbuf)
 int mingw_unlink(const char *pathname, int handle_in_use_error)
 {
 	int ret, tries = 0;
-	wchar_t wpathname[MAX_PATH];
-	if (xutftowcs_path(wpathname, pathname) < 0)
+	wchar_t wpathname[MAX_LONG_PATH];
+	if (xutftowcs_long_path(wpathname, pathname) < 0)
 		return -1;
 
 	if (DeleteFileW(wpathname))
@@ -343,7 +364,7 @@ static int is_dir_empty(const wchar_t *wpath)
 {
 	WIN32_FIND_DATAW findbuf;
 	HANDLE handle;
-	wchar_t wbuf[MAX_PATH + 2];
+	wchar_t wbuf[MAX_LONG_PATH + 2];
 	wcscpy(wbuf, wpath);
 	wcscat(wbuf, L"\\*");
 	handle = FindFirstFileW(wbuf, &findbuf);
@@ -364,7 +385,7 @@ static int is_dir_empty(const wchar_t *wpath)
 int mingw_rmdir(const char *pathname)
 {
 	int ret, tries = 0;
-	wchar_t wpathname[MAX_PATH];
+	wchar_t wpathname[MAX_LONG_PATH];
 	struct stat st;
 
 	/*
@@ -386,7 +407,7 @@ int mingw_rmdir(const char *pathname)
 		return -1;
 	}
 
-	if (xutftowcs_path(wpathname, pathname) < 0)
+	if (xutftowcs_long_path(wpathname, pathname) < 0)
 		return -1;
 
 	while ((ret = _wrmdir(wpathname)) == -1 && tries < ARRAY_SIZE(delay)) {
@@ -465,15 +486,18 @@ static int set_hidden_flag(const wchar_t *path, int set)
 int mingw_mkdir(const char *path, int mode UNUSED)
 {
 	int ret;
-	wchar_t wpath[MAX_PATH];
+	wchar_t wpath[MAX_LONG_PATH];
 
 	if (!is_valid_win32_path(path, 0)) {
 		errno = EINVAL;
 		return -1;
 	}
 
-	if (xutftowcs_path(wpath, path) < 0)
+	/* CreateDirectoryW path limit is 248 (MAX_PATH - 8.3 file name) */
+	if (xutftowcs_path_ex(wpath, path, MAX_LONG_PATH, -1, 248,
+			      are_long_paths_enabled()) < 0)
 		return -1;
+
 	ret = _wmkdir(wpath);
 	if (!ret && needs_hiding(path))
 		return set_hidden_flag(wpath, 1);
@@ -637,7 +661,7 @@ int mingw_open (const char *filename, int oflags, ...)
 	va_list args;
 	unsigned mode;
 	int fd, create = (oflags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL);
-	wchar_t wfilename[MAX_PATH];
+	wchar_t wfilename[MAX_LONG_PATH];
 	open_fn_t open_fn;
 	WIN32_FILE_ATTRIBUTE_DATA fdata;
 
@@ -670,7 +694,7 @@ int mingw_open (const char *filename, int oflags, ...)
 
 	if (filename && !strcmp(filename, "/dev/null"))
 		wcscpy(wfilename, L"nul");
-	else if (xutftowcs_path(wfilename, filename) < 0)
+	else if (xutftowcs_long_path(wfilename, filename) < 0)
 		return -1;
 
 	/*
@@ -756,14 +780,14 @@ FILE *mingw_fopen (const char *filename, const char *otype)
 {
 	int hide = needs_hiding(filename);
 	FILE *file;
-	wchar_t wfilename[MAX_PATH], wotype[4];
+	wchar_t wfilename[MAX_LONG_PATH], wotype[4];
 	if (filename && !strcmp(filename, "/dev/null"))
 		wcscpy(wfilename, L"nul");
 	else if (!is_valid_win32_path(filename, 1)) {
 		int create = otype && strchr(otype, 'w');
 		errno = create ? EINVAL : ENOENT;
 		return NULL;
-	} else if (xutftowcs_path(wfilename, filename) < 0)
+	} else if (xutftowcs_long_path(wfilename, filename) < 0)
 		return NULL;
 
 	if (xutftowcs(wotype, otype, ARRAY_SIZE(wotype)) < 0)
@@ -785,14 +809,14 @@ FILE *mingw_freopen (const char *filename, const char *otype, FILE *stream)
 {
 	int hide = needs_hiding(filename);
 	FILE *file;
-	wchar_t wfilename[MAX_PATH], wotype[4];
+	wchar_t wfilename[MAX_LONG_PATH], wotype[4];
 	if (filename && !strcmp(filename, "/dev/null"))
 		wcscpy(wfilename, L"nul");
 	else if (!is_valid_win32_path(filename, 1)) {
 		int create = otype && strchr(otype, 'w');
 		errno = create ? EINVAL : ENOENT;
 		return NULL;
-	} else if (xutftowcs_path(wfilename, filename) < 0)
+	} else if (xutftowcs_long_path(wfilename, filename) < 0)
 		return NULL;
 
 	if (xutftowcs(wotype, otype, ARRAY_SIZE(wotype)) < 0)
@@ -842,7 +866,7 @@ ssize_t mingw_write(int fd, const void *buf, size_t len)
 		HANDLE h = (HANDLE) _get_osfhandle(fd);
 		if (GetFileType(h) != FILE_TYPE_PIPE) {
 			if (orig == EINVAL) {
-				wchar_t path[MAX_PATH];
+				wchar_t path[MAX_LONG_PATH];
 				DWORD ret = GetFinalPathNameByHandleW(h, path,
 								ARRAY_SIZE(path), 0);
 				UINT drive_type = ret > 0 && ret < ARRAY_SIZE(path) ?
@@ -879,27 +903,33 @@ ssize_t mingw_write(int fd, const void *buf, size_t len)
 
 int mingw_access(const char *filename, int mode)
 {
-	wchar_t wfilename[MAX_PATH];
+	wchar_t wfilename[MAX_LONG_PATH];
 	if (!strcmp("nul", filename) || !strcmp("/dev/null", filename))
 		return 0;
-	if (xutftowcs_path(wfilename, filename) < 0)
+	if (xutftowcs_long_path(wfilename, filename) < 0)
 		return -1;
 	/* X_OK is not supported by the MSVCRT version */
 	return _waccess(wfilename, mode & ~X_OK);
 }
 
+/* cached length of current directory for handle_long_path */
+static int current_directory_len = 0;
+
 int mingw_chdir(const char *dirname)
 {
-	wchar_t wdirname[MAX_PATH];
-	if (xutftowcs_path(wdirname, dirname) < 0)
+	int result;
+	wchar_t wdirname[MAX_LONG_PATH];
+	if (xutftowcs_long_path(wdirname, dirname) < 0)
 		return -1;
-	return _wchdir(wdirname);
+	result = _wchdir(wdirname);
+	current_directory_len = GetCurrentDirectoryW(0, NULL);
+	return result;
 }
 
 int mingw_chmod(const char *filename, int mode)
 {
-	wchar_t wfilename[MAX_PATH];
-	if (xutftowcs_path(wfilename, filename) < 0)
+	wchar_t wfilename[MAX_LONG_PATH];
+	if (xutftowcs_long_path(wfilename, filename) < 0)
 		return -1;
 	return _wchmod(wfilename, mode);
 }
@@ -947,8 +977,8 @@ static int has_valid_directory_prefix(wchar_t *wfilename)
 static int do_lstat(int follow, const char *file_name, struct stat *buf)
 {
 	WIN32_FILE_ATTRIBUTE_DATA fdata;
-	wchar_t wfilename[MAX_PATH];
-	if (xutftowcs_path(wfilename, file_name) < 0)
+	wchar_t wfilename[MAX_LONG_PATH];
+	if (xutftowcs_long_path(wfilename, file_name) < 0)
 		return -1;
 
 	if (GetFileAttributesExW(wfilename, GetFileExInfoStandard, &fdata)) {
@@ -1119,10 +1149,10 @@ int mingw_utime (const char *file_name, const struct utimbuf *times)
 	FILETIME mft, aft;
 	int rc;
 	DWORD attrs;
-	wchar_t wfilename[MAX_PATH];
+	wchar_t wfilename[MAX_LONG_PATH];
 	HANDLE osfilehandle;
 
-	if (xutftowcs_path(wfilename, file_name) < 0)
+	if (xutftowcs_long_path(wfilename, file_name) < 0)
 		return -1;
 
 	/* must have write permission */
@@ -1826,6 +1856,10 @@ static pid_t mingw_spawnve_fd(const char *cmd, const char **argv, char **deltaen
 
 	if (*argv && !strcmp(cmd, *argv))
 		wcmd[0] = L'\0';
+	/*
+	 * Paths to executables and to the current directory do not support
+	 * long paths, therefore we cannot use xutftowcs_long_path() here.
+	 */
 	else if (xutftowcs_path(wcmd, cmd) < 0)
 		return -1;
 	if (dir && xutftowcs_path(wdir, dir) < 0)
@@ -2515,12 +2549,12 @@ int mingw_rename(const char *pold, const char *pnew)
 	static int supports_file_rename_info_ex = 1;
 	DWORD attrs, gle;
 	int tries = 0;
-	wchar_t wpold[MAX_PATH], wpnew[MAX_PATH];
+	wchar_t wpold[MAX_LONG_PATH], wpnew[MAX_LONG_PATH];
 	int wpnew_len;
 
-	if (xutftowcs_path(wpold, pold) < 0)
+	if (xutftowcs_long_path(wpold, pold) < 0)
 		return -1;
-	wpnew_len = xutftowcs_path(wpnew, pnew);
+	wpnew_len = xutftowcs_long_path(wpnew, pnew);
 	if (wpnew_len < 0)
 		return -1;
 
@@ -2559,9 +2593,9 @@ int mingw_rename(const char *pold, const char *pnew)
 			 * flex array so that the structure has to be allocated on
 			 * the heap. As we declare this structure ourselves though
 			 * we can avoid the allocation and define FileName to have
-			 * MAX_PATH bytes.
+			 * MAX_LONG_PATH bytes.
 			 */
-			WCHAR FileName[MAX_PATH];
+			WCHAR FileName[MAX_LONG_PATH];
 		} rename_info = { 0 };
 		HANDLE old_handle = INVALID_HANDLE_VALUE;
 		BOOL success;
@@ -2924,9 +2958,9 @@ int mingw_raise(int sig)
 
 int link(const char *oldpath, const char *newpath)
 {
-	wchar_t woldpath[MAX_PATH], wnewpath[MAX_PATH];
-	if (xutftowcs_path(woldpath, oldpath) < 0 ||
-		xutftowcs_path(wnewpath, newpath) < 0)
+	wchar_t woldpath[MAX_LONG_PATH], wnewpath[MAX_LONG_PATH];
+	if (xutftowcs_long_path(woldpath, oldpath) < 0 ||
+	    xutftowcs_long_path(wnewpath, newpath) < 0)
 		return -1;
 
 	if (!CreateHardLinkW(wnewpath, woldpath, NULL)) {
@@ -2994,8 +3028,8 @@ int mingw_is_mount_point(struct strbuf *path)
 {
 	WIN32_FIND_DATAW findbuf = { 0 };
 	HANDLE handle;
-	wchar_t wfilename[MAX_PATH];
-	int wlen = xutftowcs_path(wfilename, path->buf);
+	wchar_t wfilename[MAX_LONG_PATH];
+	int wlen = xutftowcs_long_path(wfilename, path->buf);
 	if (wlen < 0)
 		die(_("could not get long path for '%s'"), path->buf);
 
@@ -3138,9 +3172,9 @@ static size_t append_system_bin_dirs(char *path, size_t size)
 
 static int is_system32_path(const char *path)
 {
-	WCHAR system32[MAX_PATH], wpath[MAX_PATH];
+	WCHAR system32[MAX_LONG_PATH], wpath[MAX_LONG_PATH];
 
-	if (xutftowcs_path(wpath, path) < 0 ||
+	if (xutftowcs_long_path(wpath, path) < 0 ||
 	    !GetSystemDirectoryW(system32, ARRAY_SIZE(system32)) ||
 	    _wcsicmp(system32, wpath))
 		return 0;
@@ -3567,6 +3601,68 @@ int is_valid_win32_path(const char *path, int allow_literal_nul)
 	}
 }
 
+int handle_long_path(wchar_t *path, int len, int max_path, int expand)
+{
+	int result;
+	wchar_t buf[MAX_LONG_PATH];
+
+	/*
+	 * we don't need special handling if path is relative to the current
+	 * directory, and current directory + path don't exceed the desired
+	 * max_path limit. This should cover > 99 % of cases with minimal
+	 * performance impact (git almost always uses relative paths).
+	 */
+	if ((len < 2 || (!is_dir_sep(path[0]) && path[1] != ':')) &&
+	    (current_directory_len + len < max_path))
+		return len;
+
+	/*
+	 * handle everything else:
+	 * - absolute paths: "C:\dir\file"
+	 * - absolute UNC paths: "\\server\share\dir\file"
+	 * - absolute paths on current drive: "\dir\file"
+	 * - relative paths on other drive: "X:file"
+	 * - prefixed paths: "\\?\...", "\\.\..."
+	 */
+
+	/* convert to absolute path using GetFullPathNameW */
+	result = GetFullPathNameW(path, MAX_LONG_PATH, buf, NULL);
+	if (!result) {
+		errno = err_win_to_posix(GetLastError());
+		return -1;
+	}
+
+	/*
+	 * return absolute path if it fits within max_path (even if
+	 * "cwd + path" doesn't due to '..' components)
+	 */
+	if (result < max_path) {
+		wcscpy(path, buf);
+		return result;
+	}
+
+	/* error out if we shouldn't expand the path or buf is too small */
+	if (!expand || result >= MAX_LONG_PATH - 6) {
+		errno = ENAMETOOLONG;
+		return -1;
+	}
+
+	/* prefix full path with "\\?\" or "\\?\UNC\" */
+	if (buf[0] == '\\') {
+		/* ...unless already prefixed */
+		if (buf[1] == '\\' && (buf[2] == '?' || buf[2] == '.'))
+			return len;
+
+		wcscpy(path, L"\\\\?\\UNC\\");
+		wcscpy(path + 8, buf + 2);
+		return result + 6;
+	} else {
+		wcscpy(path, L"\\\\?\\");
+		wcscpy(path + 4, buf);
+		return result + 4;
+	}
+}
+
 #if !defined(_MSC_VER)
 /*
  * Disable MSVCRT command line wildcard expansion (__getmainargs called from
@@ -3729,6 +3825,9 @@ int wmain(int argc, const wchar_t **wargv)
 	/* initialize Unicode console */
 	winansi_init();
 
+	/* init length of current directory for handle_long_path */
+	current_directory_len = GetCurrentDirectoryW(0, NULL);
+
 	/* invoke the real main() using our utf8 version of argv. */
 	exit_status = main(argc, argv);
 
diff --git a/compat/mingw.h b/compat/mingw.h
index 96677cbb86716d..ad1166b775322a 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -1,6 +1,7 @@
 #include "mingw-posix.h"
 
 extern int core_fscache;
+int are_long_paths_enabled(void);
 
 struct config_context;
 int mingw_core_config(const char *var, const char *value,
@@ -78,6 +79,42 @@ int is_path_owned_by_current_sid(const char *path, struct strbuf *report);
 int is_valid_win32_path(const char *path, int allow_literal_nul);
 #define is_valid_path(path) is_valid_win32_path(path, 0)
 
+/**
+ * Max length of long paths (exceeding MAX_PATH). The actual maximum supported
+ * by NTFS is 32,767 (* sizeof(wchar_t)), but we choose an arbitrary smaller
+ * value to limit required stack memory.
+ */
+#define MAX_LONG_PATH 4096
+
+/**
+ * Handles paths that would exceed the MAX_PATH limit of Windows Unicode APIs.
+ *
+ * With expand == false, the function checks for over-long paths and fails
+ * with ENAMETOOLONG. The path parameter is not modified, except if cwd + path
+ * exceeds max_path, but the resulting absolute path doesn't (e.g. due to
+ * eliminating '..' components). The path parameter must point to a buffer
+ * of max_path wide characters.
+ *
+ * With expand == true, an over-long path is automatically converted in place
+ * to an absolute path prefixed with '\\?\', and the new length is returned.
+ * The path parameter must point to a buffer of MAX_LONG_PATH wide characters.
+ *
+ * Parameters:
+ * path: path to check and / or convert
+ * len: size of path on input (number of wide chars without \0)
+ * max_path: max short path length to check (usually MAX_PATH = 260, but just
+ * 248 for CreateDirectoryW)
+ * expand: false to only check the length, true to expand the path to a
+ * '\\?\'-prefixed absolute path
+ *
+ * Return:
+ * length of the resulting path, or -1 on failure
+ *
+ * Errors:
+ * ENAMETOOLONG if path is too long
+ */
+int handle_long_path(wchar_t *path, int len, int max_path, int expand);
+
 /**
  * Converts UTF-8 encoded string to UTF-16LE.
  *
@@ -136,18 +173,46 @@ static inline int xutftowcs(wchar_t *wcs, const char *utf, size_t wcslen)
 }
 
 /**
- * Simplified file system specific variant of xutftowcsn, assumes output
- * buffer size is MAX_PATH wide chars and input string is \0-terminated,
- * fails with ENAMETOOLONG if input string is too long.
+ * Simplified file system specific wrapper of xutftowcsn and handle_long_path.
+ * Converts ERANGE to ENAMETOOLONG. If expand is true, wcs must be at least
+ * MAX_LONG_PATH wide chars (see handle_long_path).
  */
-static inline int xutftowcs_path(wchar_t *wcs, const char *utf)
+static inline int xutftowcs_path_ex(wchar_t *wcs, const char *utf,
+		size_t wcslen, int utflen, int max_path, int expand)
 {
-	int result = xutftowcsn(wcs, utf, MAX_PATH, -1);
+	int result = xutftowcsn(wcs, utf, wcslen, utflen);
 	if (result < 0 && errno == ERANGE)
 		errno = ENAMETOOLONG;
+	if (result >= 0)
+		result = handle_long_path(wcs, result, max_path, expand);
 	return result;
 }
 
+/**
+ * Simplified file system specific variant of xutftowcsn, assumes output
+ * buffer size is MAX_PATH wide chars and input string is \0-terminated,
+ * fails with ENAMETOOLONG if input string is too long. Typically used for
+ * Windows APIs that don't support long paths, e.g. SetCurrentDirectory,
+ * LoadLibrary, CreateProcess...
+ */
+static inline int xutftowcs_path(wchar_t *wcs, const char *utf)
+{
+	return xutftowcs_path_ex(wcs, utf, MAX_PATH, -1, MAX_PATH, 0);
+}
+
+/**
+ * Simplified file system specific variant of xutftowcsn for Windows APIs
+ * that support long paths via '\\?\'-prefix, assumes output buffer size is
+ * MAX_LONG_PATH wide chars, fails with ENAMETOOLONG if input string is too
+ * long. The 'core.longpaths' git-config option controls whether the path
+ * is only checked or expanded to a long path.
+ */
+static inline int xutftowcs_long_path(wchar_t *wcs, const char *utf)
+{
+	return xutftowcs_path_ex(wcs, utf, MAX_LONG_PATH, -1, MAX_PATH,
+				 are_long_paths_enabled());
+}
+
 /**
  * Converts UTF-16LE encoded string to UTF-8.
  *
diff --git a/compat/win32/dirent.c b/compat/win32/dirent.c
index 139d2ba3c4da34..c9fe2454efc01c 100644
--- a/compat/win32/dirent.c
+++ b/compat/win32/dirent.c
@@ -65,19 +65,24 @@ static int dirent_closedir(dirent_DIR *dir)
 
 DIR *dirent_opendir(const char *name)
 {
-	wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */
+	wchar_t pattern[MAX_LONG_PATH + 2]; /* + 2 for "\*" */
 	WIN32_FIND_DATAW fdata;
 	HANDLE h;
 	int len;
 	dirent_DIR *dir;
 
-	/* convert name to UTF-16 and check length < MAX_PATH */
-	if ((len = xutftowcs_path(pattern, name)) < 0)
+	/* convert name to UTF-16 and check length */
+	if ((len = xutftowcs_path_ex(pattern, name, MAX_LONG_PATH, -1,
+				     MAX_PATH - 2,
+				     are_long_paths_enabled())) < 0)
 		return NULL;
 
-	/* append optional '/' and wildcard '*' */
+	/*
+	 * append optional '\' and wildcard '*'. Note: we need to use '\' as
+	 * Windows doesn't translate '/' to '\' for "\\?\"-prefixed paths.
+	 */
 	if (len && !is_dir_sep(pattern[len - 1]))
-		pattern[len++] = '/';
+		pattern[len++] = '\\';
 	pattern[len++] = '*';
 	pattern[len] = 0;
 
@@ -90,7 +95,7 @@ DIR *dirent_opendir(const char *name)
 	}
 
 	/* initialize DIR structure and copy first dir entry */
-	dir = xmalloc(sizeof(dirent_DIR) + MAX_PATH);
+	dir = xmalloc(sizeof(dirent_DIR) + MAX_LONG_PATH);
 	dir->base_dir.preaddir = (struct dirent *(*)(DIR *dir)) dirent_readdir;
 	dir->base_dir.pclosedir = (int (*)(DIR *dir)) dirent_closedir;
 	dir->dd_handle = h;
diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index 75dd33dc66bea0..dbf640ca790fde 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -84,7 +84,7 @@ struct fsentry {
 struct heap_fsentry {
 	union {
 		struct fsentry ent;
-		char dummy[sizeof(struct fsentry) + MAX_PATH];
+		char dummy[sizeof(struct fsentry) + MAX_LONG_PATH];
 	} u;
 };
 #pragma GCC diagnostic pop
@@ -128,7 +128,7 @@ static void fsentry_init(struct fsentry *fse, struct fsentry *list,
 			 const char *name, size_t len)
 {
 	fse->list = list;
-	if (len > MAX_PATH)
+	if (len > MAX_LONG_PATH)
 		BUG("Trying to allocate fsentry for long path '%.*s'",
 		    (int)len, name);
 	memcpy(fse->dirent.d_name, name, len);
@@ -229,7 +229,7 @@ static struct fsentry *fseentry_create_entry(struct fscache *cache,
 static struct fsentry *fsentry_create_list(struct fscache *cache, const struct fsentry *dir,
 					   int *dir_not_found)
 {
-	wchar_t pattern[MAX_PATH];
+	wchar_t pattern[MAX_LONG_PATH];
 	NTSTATUS status;
 	IO_STATUS_BLOCK iosb;
 	PFILE_FULL_DIR_INFORMATION di;
@@ -240,13 +240,11 @@ static struct fsentry *fsentry_create_list(struct fscache *cache, const struct f
 
 	*dir_not_found = 0;
 
-	/* convert name to UTF-16 and check length < MAX_PATH */
-	if ((wlen = xutftowcsn(pattern, dir->dirent.d_name, MAX_PATH,
-			       dir->len)) < 0) {
-		if (errno == ERANGE)
-			errno = ENAMETOOLONG;
+	/* convert name to UTF-16 and check length */
+	if ((wlen = xutftowcs_path_ex(pattern, dir->dirent.d_name,
+				      MAX_LONG_PATH, dir->len, MAX_PATH - 2,
+				      are_long_paths_enabled())) < 0)
 		return NULL;
-	}
 
 	/* handle CWD */
 	if (!wlen) {
diff --git a/t/meson.build b/t/meson.build
index b56620f16a85af..4dd9ca9f303733 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -270,6 +270,7 @@ integration_tests = [
   't2026-checkout-pathspec-file.sh',
   't2027-checkout-track.sh',
   't2030-unresolve-info.sh',
+  't2031-checkout-long-paths.sh',
   't2050-git-dir-relative.sh',
   't2060-switch.sh',
   't2070-restore.sh',
diff --git a/t/t2031-checkout-long-paths.sh b/t/t2031-checkout-long-paths.sh
new file mode 100755
index 00000000000000..f30f8920ca689c
--- /dev/null
+++ b/t/t2031-checkout-long-paths.sh
@@ -0,0 +1,102 @@
+#!/bin/sh
+
+test_description='checkout long paths on Windows
+
+Ensures that Git for Windows can deal with long paths (>260) enabled via core.longpaths'
+
+. ./test-lib.sh
+
+if test_have_prereq !MINGW
+then
+	skip_all='skipping MINGW specific long paths test'
+	test_done
+fi
+
+test_expect_success setup '
+	p=longpathxx && # -> 10
+	p=$p$p$p$p$p && # -> 50
+	p=$p$p$p$p$p && # -> 250
+
+	path=${p}/longtestfile && # -> 263 (MAX_PATH = 260)
+
+	blob=$(echo foobar | git hash-object -w --stdin) &&
+
+	printf "100644 %s 0\t%s\n" "$blob" "$path" |
+	git update-index --add --index-info &&
+	git commit -m initial -q
+'
+
+test_expect_success 'checkout of long paths without core.longpaths fails' '
+	git config core.longpaths false &&
+	test_must_fail git checkout -f 2>error &&
+	grep -q "Filename too long" error &&
+	test ! -d longpa*
+'
+
+test_expect_success 'checkout of long paths with core.longpaths works' '
+	git config core.longpaths true &&
+	git checkout -f &&
+	test_path_is_file longpa*/longtestfile
+'
+
+test_expect_success 'update of long paths' '
+	echo frotz >>$(ls longpa*/longtestfile) &&
+	echo $path > expect &&
+	git ls-files -m > actual &&
+	test_cmp expect actual &&
+	git add $path &&
+	git commit -m second &&
+	git grep "frotz" HEAD -- $path
+'
+
+test_expect_success cleanup '
+	# bash cannot delete the trash dir if it contains a long path
+	# lets help cleaning up (unless in debug mode)
+	if test -z "$debug"
+	then
+		rm -rf longpa~1
+	fi
+'
+
+# check that the template used in the test won't be too long:
+abspath="$(pwd)"/testdir
+test ${#abspath} -gt 230 ||
+test_set_prereq SHORTABSPATH
+
+test_expect_success SHORTABSPATH 'clean up path close to MAX_PATH' '
+	p=/123456789abcdef/123456789abcdef/123456789abcdef/123456789abc/ef &&
+	p=y$p$p$p$p &&
+	subdir="x$(echo "$p" | tail -c $((253 - ${#abspath})) - )" &&
+	# Now, $abspath/$subdir has exactly 254 characters, and is inside CWD
+	p2="$abspath/$subdir" &&
+	test 254 = ${#p2} &&
+
+	# Be careful to overcome path limitations of the MSys tools and split
+	# the $subdir into two parts. ($subdir2 has to contain 16 chars and a
+	# slash somewhere following; that is why we asked for abspath <= 230 and
+	# why we placed a slash near the end of the $subdir template.)
+	subdir2=${subdir#????????????????*/} &&
+	subdir1=testdir/${subdir%/$subdir2} &&
+	mkdir -p "$subdir1" &&
+	i=0 &&
+	# The most important case is when absolute path is 258 characters long,
+	# and that will be when i == 4.
+	while test $i -le 7
+	do
+		mkdir -p $subdir2 &&
+		touch $subdir2/one-file &&
+		mv ${subdir2%%/*} "$subdir1/" &&
+		subdir2=z${subdir2} &&
+		i=$(($i+1)) ||
+		exit 1
+	done &&
+
+	# now check that git is able to clear the tree:
+	(cd testdir &&
+	 git init &&
+	 git config core.longpaths yes &&
+	 git clean -fdx) &&
+	test ! -d "$subdir1"
+'
+
+test_done
diff --git a/t/t7429-submodule-long-path.sh b/t/t7429-submodule-long-path.sh
index f692cedbff7ff8..458519eafd6f03 100755
--- a/t/t7429-submodule-long-path.sh
+++ b/t/t7429-submodule-long-path.sh
@@ -11,15 +11,20 @@ This test verifies that "git submodule" initialization, update and clones work,
 TEST_NO_CREATE_REPO=1
 . ./test-lib.sh
 
-longpath=""
-for (( i=0; i<4; i++ )); do
-	longpath="0123456789abcdefghijklmnopqrstuvwxyz$longpath"
-done
-# Pick a substring maximum of 90 characters
-# This should be good, since we'll add on a lot for temp directories
-longpath=${longpath:0:90}; export longpath
+# cloning a submodule calls is_git_directory("$path/../.git/modules/$path"),
+# which effectively limits the maximum length to PATH_MAX / 2 minus some
+# overhead; start with 3 * 36 = 108 chars (test 2 fails if >= 110)
+longpath36=0123456789abcdefghijklmnopqrstuvwxyz
+longpath180=$longpath36$longpath36$longpath36$longpath36$longpath36
 
-test_expect_failure 'submodule with a long path' '
+# the git database must fit within PATH_MAX, which limits the submodule name
+# to PATH_MAX - len(pwd) - ~90 (= len("/objects//") + 40-byte sha1 + some
+# overhead from the test case)
+pwd=$(pwd)
+pwdlen=$(echo "$pwd" | wc -c)
+longpath=$(echo $longpath180 | cut -c 1-$((170-$pwdlen)))
+
+test_expect_success 'submodule with a long path' '
 	git config --global protocol.file.allow always &&
 	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
 	git -c init.defaultBranch=long init --bare remote &&
@@ -59,7 +64,7 @@ test_expect_failure 'submodule with a long path' '
 	)
 '
 
-test_expect_failure 'recursive submodule with a long path' '
+test_expect_success 'recursive submodule with a long path' '
 	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
 	git -c init.defaultBranch=long init --bare super &&
 	test_create_repo child &&
@@ -101,6 +106,5 @@ test_expect_failure 'recursive submodule with a long path' '
 		)
 	)
 '
-unset longpath
 
 test_done

From 4df4c8b6bd70ae62218284c58b9128cc0c8a2e87 Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Sat, 5 Jul 2014 00:00:36 +0200
Subject: [PATCH 485/553] Win32: fix 'lstat("dir/")' with long paths

Use a sufficiently large buffer to strip the trailing slash.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 compat/mingw.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 2dd5a12f4bb59d..0cc4aa453904d8 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -1049,7 +1049,7 @@ static int do_lstat(int follow, const char *file_name, struct stat *buf)
 static int do_stat_internal(int follow, const char *file_name, struct stat *buf)
 {
 	size_t namelen;
-	char alt_name[PATH_MAX];
+	char alt_name[MAX_LONG_PATH];
 
 	if (!do_lstat(follow, file_name, buf))
 		return 0;
@@ -1065,7 +1065,7 @@ static int do_stat_internal(int follow, const char *file_name, struct stat *buf)
 		return -1;
 	while (namelen && file_name[namelen-1] == '/')
 		--namelen;
-	if (!namelen || namelen >= PATH_MAX)
+	if (!namelen || namelen >= MAX_LONG_PATH)
 		return -1;
 
 	memcpy(alt_name, file_name, namelen);

From b8fb8c017676ad662c90a824fb1b26b705713ee4 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 6 Sep 2023 09:14:47 +0200
Subject: [PATCH 486/553] win32(long path support): leave drive-less absolute
 paths intact

When trying to ensure that long paths are handled correctly, we
first normalize absolute paths as we encounter them.

However, if the path is a so-called "drive-less" absolute path, i.e. if
it is relative to the current drive but _does_ start with a directory
separator, we would want the normalized path to be such a drive-less
absolute path, too.

Let's do that, being careful to still include the drive prefix when we
need to go through the `\\?\` dance (because there, the drive prefix is
absolutely required).

This fixes https://github.com/git-for-windows/git/issues/4586.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c                 | 12 +++++++++++-
 t/t2031-checkout-long-paths.sh |  9 +++++++++
 2 files changed, 20 insertions(+), 1 deletion(-)
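Note (not part of the commit message): the prefix decision made in the
`handle_long_path()` hunk below can be sketched in portable C as follows.
`copy_normalized()` and `is_sep()` are hypothetical names used only for
illustration, and the extra empty-string check is an assumption of the
sketch; the real code copies back into `path` in place.

```c
#include <assert.h>
#include <wchar.h>

/* Hypothetical stand-in for is_wdir_sep() from the patch. */
static int is_sep(wchar_t c)
{
	return c == L'/' || c == L'\\';
}

/*
 * Copy the normalized path `buf` into `out`. If the original `path` was
 * drive-less absolute (it starts with a separator) and `buf` gained an
 * "X:\" style prefix, drop the two-character drive prefix again.
 * `out` must have room for wcslen(buf) + 1 wide characters.
 */
static void copy_normalized(wchar_t *out, const wchar_t *path,
			    const wchar_t *buf)
{
	assert(out && path && buf);
	if (is_sep(path[0]) && buf[0] != L'\0' &&
	    !is_sep(buf[0]) && buf[1] == L':' && is_sep(buf[2]))
		wcscpy(out, buf + 2);
	else
		wcscpy(out, buf);
}
```

With this logic, an input of `/Program Files` normalized to
`C:\Program Files` would be copied back as `\Program Files`, while paths
that already carried a drive prefix are left unchanged.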

diff --git a/compat/mingw.c b/compat/mingw.c
index 0cc4aa453904d8..623e538d0a3bfb 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -300,6 +300,11 @@ int mingw_core_config(const char *var, const char *value,
 	return 0;
 }
 
+static inline int is_wdir_sep(wchar_t wchar)
+{
+	return wchar == L'/' || wchar == L'\\';
+}
+
 /* Normalizes NT paths as returned by some low-level APIs. */
 static wchar_t *normalize_ntpath(wchar_t *wbuf)
 {
@@ -3637,7 +3642,12 @@ int handle_long_path(wchar_t *path, int len, int max_path, int expand)
 	 * "cwd + path" doesn't due to '..' components)
 	 */
 	if (result < max_path) {
-		wcscpy(path, buf);
+		/* Be careful not to add a drive prefix if there was none */
+		if (is_wdir_sep(path[0]) &&
+		    !is_wdir_sep(buf[0]) && buf[1] == L':' && is_wdir_sep(buf[2]))
+			wcscpy(path, buf + 2);
+		else
+			wcscpy(path, buf);
 		return result;
 	}
 
diff --git a/t/t2031-checkout-long-paths.sh b/t/t2031-checkout-long-paths.sh
index f30f8920ca689c..15416a1d6ee8c7 100755
--- a/t/t2031-checkout-long-paths.sh
+++ b/t/t2031-checkout-long-paths.sh
@@ -99,4 +99,13 @@ test_expect_success SHORTABSPATH 'clean up path close to MAX_PATH' '
 	test ! -d "$subdir1"
 '
 
+test_expect_success SYMLINKS_WINDOWS 'leave drive-less, short paths intact' '
+	printf "/Program Files" >symlink-target &&
+	symlink_target_oid="$(git hash-object -w --stdin <symlink-target)" &&
+	git update-index --add --cacheinfo 120000,$symlink_target_oid,PF &&
+	git -c core.symlinks=true checkout -- PF &&
+	cmd //c dir >actual &&
+	grep "<SYMLINKD\\?> *PF *\\[\\\\Program Files\\]" actual
+'
+
 test_done

From 75c9a0d7c03eca2ee8a13376975c03640da2fd18 Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Thu, 19 Mar 2015 16:33:44 +0100
Subject: [PATCH 487/553] mingw: Support `git_terminal_prompt` with more
 terminals

The `git_terminal_prompt()` function expects the terminal window to be
attached to a Win32 Console. However, this is not the case with terminal
windows other than `cmd.exe`'s, e.g. with MSys2's own `mintty`.

Non-cmd terminals such as `mintty` still have to have a Win32 Console
to be proper console programs, but have to hide the Win32 Console to
be able to provide more flexibility (such as being resizeable not only
vertically but also horizontally). By writing to that Win32 Console,
`git_terminal_prompt()` manages only to send the prompt to nowhere and
to wait for input from a Console to which the user has no access.

This commit introduces a function specifically to support `mintty` -- or
other terminals that are compatible with MSys2's `/dev/tty` emulation. We
use the `TERM` environment variable as an indicator for that: if the value
starts with "xterm" (such as `mintty`'s "xterm-256color"), we prefer to
let `xterm_prompt()` handle the user interaction.

The most prominent user of `git_terminal_prompt()` is certainly
`git-remote-https.exe`. It is an interesting use case because both
`stdin` and `stdout` are redirected when Git calls said executable, yet
it still wants to access the terminal.

When running inside a `mintty`, the terminal is not accessible to the
`git-remote-https.exe` program, though, because it is a MinGW program
and the `mintty` terminal is not backed by a Win32 console.

To solve that problem, we simply call out to the shell -- which is an
*MSys2* program and can therefore access `/dev/tty`.

Helped-by: nalla <nalla@hamal.uberspace.de>
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/terminal.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/compat/terminal.c b/compat/terminal.c
index 584f27bf7e1078..cdcde283644e41 100644
--- a/compat/terminal.c
+++ b/compat/terminal.c
@@ -418,6 +418,54 @@ static int getchar_with_timeout(int timeout)
 	return getchar();
 }
 
+static char *shell_prompt(const char *prompt, int echo)
+{
+	const char *read_input[] = {
+		/* Note: call 'bash' explicitly, as 'read -s' is bash-specific */
+		"bash", "-c", echo ?
+		"cat >/dev/tty && read -r line </dev/tty && echo \"$line\"" :
+		"cat >/dev/tty && read -r -s line </dev/tty && echo \"$line\" && echo >/dev/tty",
+		NULL
+	};
+	struct child_process child = CHILD_PROCESS_INIT;
+	static struct strbuf buffer = STRBUF_INIT;
+	int prompt_len = strlen(prompt), len = -1, code;
+
+	strvec_pushv(&child.args, read_input);
+	child.in = -1;
+	child.out = -1;
+
+	if (start_command(&child))
+		return NULL;
+
+	if (write_in_full(child.in, prompt, prompt_len) != prompt_len) {
+		error("could not write to prompt script");
+		close(child.in);
+		goto ret;
+	}
+	close(child.in);
+
+	strbuf_reset(&buffer);
+	len = strbuf_read(&buffer, child.out, 1024);
+	if (len < 0) {
+		error("could not read from prompt script");
+		goto ret;
+	}
+
+	strbuf_strip_suffix(&buffer, "\n");
+	strbuf_strip_suffix(&buffer, "\r");
+
+ret:
+	close(child.out);
+	code = finish_command(&child);
+	if (code) {
+		error("failed to execute prompt script (exit code %d)", code);
+		return NULL;
+	}
+
+	return len < 0 ? NULL : buffer.buf;
+}
+
 #endif
 
 #ifndef FORCE_TEXT
@@ -429,6 +477,12 @@ char *git_terminal_prompt(const char *prompt, int echo)
 	static struct strbuf buf = STRBUF_INIT;
 	int r;
 	FILE *input_fh, *output_fh;
+#ifdef GIT_WINDOWS_NATIVE
+	const char *term = getenv("TERM");
+
+	if (term && starts_with(term, "xterm"))
+		return shell_prompt(prompt, echo);
+#endif
 
 	input_fh = fopen(INPUT_PATH, "r" FORCE_TEXT);
 	if (!input_fh)

From 5108b619ca3123f23db7d4f43e8d2f9f8bf2eacf Mon Sep 17 00:00:00 2001
From: Jeff Hostetler <jeffhost@microsoft.com>
Date: Fri, 25 Mar 2022 16:56:04 -0400
Subject: [PATCH 488/553] compat/fsmonitor/fsm-*-win32: support long paths

Update wchar_t buffers to use MAX_LONG_PATH instead of MAX_PATH and call
xutftowcs_long_path() in the Win32 backend source files.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 compat/fsmonitor/fsm-health-win32.c     |  6 +++---
 compat/fsmonitor/fsm-listen-win32.c     | 18 +++++++++---------
 compat/fsmonitor/fsm-path-utils-win32.c |  8 ++++----
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/compat/fsmonitor/fsm-health-win32.c b/compat/fsmonitor/fsm-health-win32.c
index 2aa8c219acee4d..4b53360d194105 100644
--- a/compat/fsmonitor/fsm-health-win32.c
+++ b/compat/fsmonitor/fsm-health-win32.c
@@ -34,7 +34,7 @@ struct fsm_health_data
 
 	struct wt_moved
 	{
-		wchar_t wpath[MAX_PATH + 1];
+		wchar_t wpath[MAX_LONG_PATH + 1];
 		BY_HANDLE_FILE_INFORMATION bhfi;
 	} wt_moved;
 };
@@ -143,8 +143,8 @@ static int has_worktree_moved(struct fsmonitor_daemon_state *state,
 		return 0;
 
 	case CTX_INIT:
-		if (xutftowcs_path(data->wt_moved.wpath,
-				   state->path_worktree_watch.buf) < 0) {
+		if (xutftowcs_long_path(data->wt_moved.wpath,
+					state->path_worktree_watch.buf) < 0) {
 			error(_("could not convert to wide characters: '%s'"),
 			      state->path_worktree_watch.buf);
 			return -1;
diff --git a/compat/fsmonitor/fsm-listen-win32.c b/compat/fsmonitor/fsm-listen-win32.c
index 9a6efc9bea340b..afcc172750af10 100644
--- a/compat/fsmonitor/fsm-listen-win32.c
+++ b/compat/fsmonitor/fsm-listen-win32.c
@@ -28,7 +28,7 @@ struct one_watch
 	DWORD count;
 
 	struct strbuf path;
-	wchar_t wpath_longname[MAX_PATH + 1];
+	wchar_t wpath_longname[MAX_LONG_PATH + 1];
 	DWORD wpath_longname_len;
 
 	HANDLE hDir;
@@ -131,8 +131,8 @@ static int normalize_path_in_utf8(wchar_t *wpath, DWORD wpath_len,
  */
 static void check_for_shortnames(struct one_watch *watch)
 {
-	wchar_t buf_in[MAX_PATH + 1];
-	wchar_t buf_out[MAX_PATH + 1];
+	wchar_t buf_in[MAX_LONG_PATH + 1];
+	wchar_t buf_out[MAX_LONG_PATH + 1];
 	wchar_t *last;
 	wchar_t *p;
 
@@ -197,8 +197,8 @@ static enum get_relative_result get_relative_longname(
 	const wchar_t *wpath, DWORD wpath_len,
 	wchar_t *wpath_longname, size_t bufsize_wpath_longname)
 {
-	wchar_t buf_in[2 * MAX_PATH + 1];
-	wchar_t buf_out[MAX_PATH + 1];
+	wchar_t buf_in[2 * MAX_LONG_PATH + 1];
+	wchar_t buf_out[MAX_LONG_PATH + 1];
 	DWORD root_len;
 	DWORD out_len;
 
@@ -298,10 +298,10 @@ static struct one_watch *create_watch(const char *path)
 		FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE;
 	HANDLE hDir;
 	DWORD len_longname;
-	wchar_t wpath[MAX_PATH + 1];
-	wchar_t wpath_longname[MAX_PATH + 1];
+	wchar_t wpath[MAX_LONG_PATH + 1];
+	wchar_t wpath_longname[MAX_LONG_PATH + 1];
 
-	if (xutftowcs_path(wpath, path) < 0) {
+	if (xutftowcs_long_path(wpath, path) < 0) {
 		error(_("could not convert to wide characters: '%s'"), path);
 		return NULL;
 	}
@@ -545,7 +545,7 @@ static int process_worktree_events(struct fsmonitor_daemon_state *state)
 	struct string_list cookie_list = STRING_LIST_INIT_DUP;
 	struct fsmonitor_batch *batch = NULL;
 	const char *p = watch->buffer;
-	wchar_t wpath_longname[MAX_PATH + 1];
+	wchar_t wpath_longname[MAX_LONG_PATH + 1];
 
 	/*
 	 * If the kernel gets more events than will fit in the kernel
diff --git a/compat/fsmonitor/fsm-path-utils-win32.c b/compat/fsmonitor/fsm-path-utils-win32.c
index f4f9cc1f336720..c6eb065bde48b4 100644
--- a/compat/fsmonitor/fsm-path-utils-win32.c
+++ b/compat/fsmonitor/fsm-path-utils-win32.c
@@ -69,8 +69,8 @@ static int check_remote_protocol(wchar_t *wpath)
  */
 int fsmonitor__get_fs_info(const char *path, struct fs_info *fs_info)
 {
-	wchar_t wpath[MAX_PATH];
-	wchar_t wfullpath[MAX_PATH];
+	wchar_t wpath[MAX_LONG_PATH];
+	wchar_t wfullpath[MAX_LONG_PATH];
 	size_t wlen;
 	UINT driveType;
 
@@ -78,7 +78,7 @@ int fsmonitor__get_fs_info(const char *path, struct fs_info *fs_info)
 	 * Do everything in wide chars because the drive letter might be
 	 * a multi-byte sequence.  See win32_has_dos_drive_prefix().
 	 */
-	if (xutftowcs_path(wpath, path) < 0) {
+	if (xutftowcs_long_path(wpath, path) < 0) {
 		return -1;
 	}
 
@@ -97,7 +97,7 @@ int fsmonitor__get_fs_info(const char *path, struct fs_info *fs_info)
 	 * slashes to backslashes.  This is essential to get GetDriveTypeW()
 	 * correctly handle some UNC "\\server\share\..." paths.
 	 */
-	if (!GetFullPathNameW(wpath, MAX_PATH, wfullpath, NULL)) {
+	if (!GetFullPathNameW(wpath, MAX_LONG_PATH, wfullpath, NULL)) {
 		return -1;
 	}
 

From 48631e119d45ac98b41551f98c4ee33393814c2e Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Sat, 9 May 2015 02:11:48 +0200
Subject: [PATCH 489/553] compat/terminal.c: only use the Windows console if
 bash 'read -r' fails
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Accessing the Windows console through the special CONIN$ / CONOUT$ devices
doesn't work properly for non-ASCII usernames and passwords.

It also doesn't work for terminal emulators that hide the native console
window (such as mintty), and 'TERM=xterm*' is not necessarily a reliable
indicator for such terminals.

The new shell_prompt() function, on the other hand, works fine for both
MSys1 and MSys2, in native console windows as well as mintty, and properly
supports Unicode. It just needs bash on the path (for 'read -s', which is
bash-specific).

On Windows, try to use the shell to read from the terminal. If that fails
with ENOENT (i.e. bash was not found), use CONIN/OUT as fallback.

Note: To test this, create a UTF-8 credential file with non-ASCII chars,
e.g. in git-bash: 'echo url=http://täst.com > cred.txt'. Then in git-cmd,
'git credential fill <cred.txt' works (shell version), while calling git
without the git-wrapper (i.e. 'mingw64\bin\git credential fill <cred.txt')
mangles non-ASCII chars in both console output and input.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 compat/terminal.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)
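Note (not part of the commit message): the "try the shell first, fall
back only on ENOENT" control flow can be sketched as below. All names
(`prompt_fn`, `prompt_with_fallback()`, the `demo_*` stubs) are
hypothetical. Clearing `errno` before the first attempt is an assumption
of this sketch, not something the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef char *(*prompt_fn)(const char *prompt);

/*
 * Call `primary`; use `fallback` only if `primary` returned NULL because
 * its helper program was not found (errno == ENOENT). Any other failure
 * is reported to the caller as NULL.
 */
static char *prompt_with_fallback(prompt_fn primary, prompt_fn fallback,
				  const char *prompt)
{
	char *result;

	assert(primary && fallback);
	errno = 0; /* sketch-only: avoid acting on a stale errno value */
	result = primary(prompt);
	if (result || errno != ENOENT)
		return result;
	return fallback(prompt);
}

/* Demo stub: behaves as if bash could not be found. */
static char *demo_shell_missing(const char *prompt)
{
	(void)prompt;
	errno = ENOENT;
	return NULL;
}

/* Demo stub: behaves as if the prompt script itself failed. */
static char *demo_shell_failed(const char *prompt)
{
	(void)prompt;
	errno = EIO;
	return NULL;
}

/* Demo stub: stands in for the CONIN$/CONOUT$ code path. */
static char *demo_console(const char *prompt)
{
	static char answer[] = "from-console";
	(void)prompt;
	return answer;
}
```

In this sketch only the "missing shell" stub leads to the console
fallback; a shell that ran but failed does not.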

diff --git a/compat/terminal.c b/compat/terminal.c
index cdcde283644e41..a89c5cd9ccf604 100644
--- a/compat/terminal.c
+++ b/compat/terminal.c
@@ -434,6 +434,7 @@ static char *shell_prompt(const char *prompt, int echo)
 	strvec_pushv(&child.args, read_input);
 	child.in = -1;
 	child.out = -1;
+	child.silent_exec_failure = 1;
 
 	if (start_command(&child))
 		return NULL;
@@ -477,11 +478,14 @@ char *git_terminal_prompt(const char *prompt, int echo)
 	static struct strbuf buf = STRBUF_INIT;
 	int r;
 	FILE *input_fh, *output_fh;
+
 #ifdef GIT_WINDOWS_NATIVE
-	const char *term = getenv("TERM");
 
-	if (term && starts_with(term, "xterm"))
-		return shell_prompt(prompt, echo);
+	/* try shell_prompt first, fall back to CONIN/OUT if bash is missing */
+	char *result = shell_prompt(prompt, echo);
+	if (result || errno != ENOENT)
+		return result;
+
 #endif
 
 	input_fh = fopen(INPUT_PATH, "r" FORCE_TEXT);

From d13aa686a74b33e11fed9f2ba8b6c2aa489d4a6e Mon Sep 17 00:00:00 2001
From: Ben Boeckel <mathstuf@gmail.com>
Date: Fri, 22 Apr 2022 09:06:23 -0400
Subject: [PATCH 490/553] clean: suggest using `core.longPaths` if paths are
 too long to remove

On Windows, Git repositories may contain extra files that need to be
cleaned (e.g., a build directory) and that may be arbitrarily deep.
Suggest using `core.longPaths` if such situations are encountered.

Fixes: #2715
Signed-off-by: Ben Boeckel <mathstuf@gmail.com>
---
 Documentation/config/advice.adoc |  3 +++
 advice.c                         |  1 +
 advice.h                         |  1 +
 builtin/clean.c                  | 13 +++++++++++++
 4 files changed, 18 insertions(+)

diff --git a/Documentation/config/advice.adoc b/Documentation/config/advice.adoc
index 257db58918179a..0b3199f4660886 100644
--- a/Documentation/config/advice.adoc
+++ b/Documentation/config/advice.adoc
@@ -64,6 +64,9 @@ all advice messages.
 		set their identity configuration.
 	mergeConflict::
 		Shown when various commands stop because of conflicts.
+	nameTooLong::
+		Advice shown if a filepath operation is attempted where the
+		path was too long.
 	nestedTag::
 		Shown when a user attempts to recursively tag a tag object.
 	pushAlreadyExists::
diff --git a/advice.c b/advice.c
index 0018501b7bc103..fec2b37627d2df 100644
--- a/advice.c
+++ b/advice.c
@@ -61,6 +61,7 @@ static struct {
 	[ADVICE_IGNORED_HOOK]				= { "ignoredHook" },
 	[ADVICE_IMPLICIT_IDENTITY]			= { "implicitIdentity" },
 	[ADVICE_MERGE_CONFLICT]				= { "mergeConflict" },
+	[ADVICE_NAME_TOO_LONG]				= { "nameTooLong" },
 	[ADVICE_NESTED_TAG]				= { "nestedTag" },
 	[ADVICE_OBJECT_NAME_WARNING]			= { "objectNameWarning" },
 	[ADVICE_PUSH_ALREADY_EXISTS]			= { "pushAlreadyExists" },
diff --git a/advice.h b/advice.h
index 8def28068861df..b826620fb45916 100644
--- a/advice.h
+++ b/advice.h
@@ -28,6 +28,7 @@ enum advice_type {
 	ADVICE_IGNORED_HOOK,
 	ADVICE_IMPLICIT_IDENTITY,
 	ADVICE_MERGE_CONFLICT,
+	ADVICE_NAME_TOO_LONG,
 	ADVICE_NESTED_TAG,
 	ADVICE_OBJECT_NAME_WARNING,
 	ADVICE_PUSH_ALREADY_EXISTS,
diff --git a/builtin/clean.c b/builtin/clean.c
index e15d595c3dc7cc..f8a54a4a47bc7b 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -26,6 +26,7 @@
 #include "pathspec.h"
 #include "help.h"
 #include "prompt.h"
+#include "advice.h"
 
 static int require_force = -1; /* unset */
 static int interactive;
@@ -221,6 +222,9 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
 			quote_path(path->buf, prefix, &quoted, 0);
 			errno = saved_errno;
 			warning_errno(_(msg_warn_remove_failed), quoted.buf);
+			if (saved_errno == ENAMETOOLONG) {
+				advise_if_enabled(ADVICE_NAME_TOO_LONG, _("Setting `core.longPaths` may allow the deletion to succeed."));
+			}
 			*dir_gone = 0;
 		}
 		ret = res;
@@ -256,6 +260,9 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
 				quote_path(path->buf, prefix, &quoted, 0);
 				errno = saved_errno;
 				warning_errno(_(msg_warn_remove_failed), quoted.buf);
+				if (saved_errno == ENAMETOOLONG) {
+					advise_if_enabled(ADVICE_NAME_TOO_LONG, _("Setting `core.longPaths` may allow the deletion to succeed."));
+				}
 				*dir_gone = 0;
 				ret = 1;
 			}
@@ -299,6 +306,9 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
 				quote_path(path->buf, prefix, &quoted, 0);
 				errno = saved_errno;
 				warning_errno(_(msg_warn_remove_failed), quoted.buf);
+				if (saved_errno == ENAMETOOLONG) {
+					advise_if_enabled(ADVICE_NAME_TOO_LONG, _("Setting `core.longPaths` may allow the deletion to succeed."));
+				}
 				*dir_gone = 0;
 				ret = 1;
 			}
@@ -1109,6 +1119,9 @@ int cmd_clean(int argc,
 				qname = quote_path(item->string, NULL, &buf, 0);
 				errno = saved_errno;
 				warning_errno(_(msg_warn_remove_failed), qname);
+				if (saved_errno == ENAMETOOLONG) {
+					advise_if_enabled(ADVICE_NAME_TOO_LONG, _("Setting `core.longPaths` may allow the deletion to succeed."));
+				}
 				errors++;
 			} else if (!quiet) {
 				qname = quote_path(item->string, NULL, &buf, 0);

From c8a8772b44ccfc0fa7f6595cd2c90d3963f1c3e7 Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Mon, 11 May 2015 19:54:23 +0200
Subject: [PATCH 491/553] strbuf_readlink: don't call readlink twice if hint is
 the exact link size

strbuf_readlink() calls readlink() twice if the hint argument specifies the
exact size of the link target (e.g. by passing stat.st_size as returned by
lstat()). This is necessary because 'readlink(..., hint) == hint' could
mean that the buffer was too small.

Use hint + 1 as buffer size to prevent this.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 strbuf.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
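Note (not part of the commit message): a minimal POSIX sketch of why one
extra byte lets a single readlink() call distinguish "fits exactly" from
"possibly truncated". `read_link_exact()` is a hypothetical helper and
does not grow the buffer the way strbuf_readlink() does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read a link target when `hint` is believed to be its exact length
 * (e.g. st_size from lstat()). Returns a malloc'ed, NUL-terminated
 * string, or NULL on error or if the target is longer than `hint`.
 */
static char *read_link_exact(const char *path, size_t hint)
{
	char *buf;
	ssize_t len;

	assert(path);
	buf = malloc(hint + 1);
	if (!buf)
		return NULL;
	/* ask for one byte more than expected */
	len = readlink(path, buf, hint + 1);
	if (len < 0 || (size_t)len > hint) {
		/* error, or the buffer was filled and may be truncated */
		free(buf);
		return NULL;
	}
	buf[len] = '\0';
	return buf;
}
```

Because the buffer holds `hint + 1` bytes, a return value of at most
`hint` shows that the whole target was read, so no second call is needed.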

diff --git a/strbuf.c b/strbuf.c
index 7fb7d12ac0cb9e..523be77a31ad19 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -578,12 +578,12 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
 	while (hint < STRBUF_MAXLINK) {
 		ssize_t len;
 
-		strbuf_grow(sb, hint);
-		len = readlink(path, sb->buf, hint);
+		strbuf_grow(sb, hint + 1);
+		len = readlink(path, sb->buf, hint + 1);
 		if (len < 0) {
 			if (errno != ERANGE)
 				break;
-		} else if (len < hint) {
+		} else if (len <= hint) {
 			strbuf_setlen(sb, len);
 			return 0;
 		}

From c00f2640817f875eda1ef28d58b9d4d2c601b41f Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Mon, 11 May 2015 22:15:40 +0200
Subject: [PATCH 492/553] strbuf_readlink: support link targets that exceed
 PATH_MAX

strbuf_readlink() refuses to read link targets that exceed PATH_MAX (even
if a sufficient size was specified by the caller).

As some platforms support longer paths, remove this restriction (similar
to strbuf_getcwd()).

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 strbuf.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/strbuf.c b/strbuf.c
index 523be77a31ad19..228e430ef96b42 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -566,8 +566,6 @@ ssize_t strbuf_write(struct strbuf *sb, FILE *f)
 	return sb->len ? fwrite(sb->buf, 1, sb->len, f) : 0;
 }
 
-#define STRBUF_MAXLINK (2*PATH_MAX)
-
 int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
 {
 	size_t oldalloc = sb->alloc;
@@ -575,7 +573,7 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
 	if (hint < 32)
 		hint = 32;
 
-	while (hint < STRBUF_MAXLINK) {
+	for (;;) {
 		ssize_t len;
 
 		strbuf_grow(sb, hint + 1);

From ac035ff822a795f8603be72f1f5d4d0bc0ae62f9 Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Mon, 11 May 2015 19:58:14 +0200
Subject: [PATCH 493/553] lockfile.c: use is_dir_sep() instead of hardcoded '/'
 checks

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 lockfile.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
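Note (not part of the commit message): a standalone sketch of the loop
after this change, operating on a plain C string instead of a strbuf.
`is_sep()` is a hypothetical stand-in for is_dir_sep() as it behaves on
Windows builds (accepting both '/' and '\\'), and
`trim_last_component()` is an illustrative name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for Git's is_dir_sep() on Windows builds. */
static int is_sep(char c)
{
	return c == '/' || c == '\\';
}

/*
 * Truncate `path` in place so that its last component is removed; the
 * separator in front of that component is kept. Returns the new length.
 */
static size_t trim_last_component(char *path)
{
	size_t i;

	assert(path);
	i = strlen(path);
	/* back up past trailing separators, if any */
	while (i && is_sep(path[i - 1]))
		i--;
	/* then go backwards until a separator or the start of the string */
	while (i && !is_sep(path[i - 1]))
		i--;
	path[i] = '\0';
	return i;
}
```

With the hardcoded '/' checks, a path written with backslashes would be
trimmed to the empty string; with the separator-aware check it is
trimmed to its parent directory.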

diff --git a/lockfile.c b/lockfile.c
index 1d5ed016828746..67082a9caaeb18 100644
--- a/lockfile.c
+++ b/lockfile.c
@@ -19,14 +19,14 @@ static void trim_last_path_component(struct strbuf *path)
 	int i = path->len;
 
 	/* back up past trailing slashes, if any */
-	while (i && path->buf[i - 1] == '/')
+	while (i && is_dir_sep(path->buf[i - 1]))
 		i--;
 
 	/*
 	 * then go backwards until a slash, or the beginning of the
 	 * string
 	 */
-	while (i && path->buf[i - 1] != '/')
+	while (i && !is_dir_sep(path->buf[i - 1]))
 		i--;
 
 	strbuf_setlen(path, i);

From 6d8f233c2d5d00b165b6198fb1235e8dbeed73b1 Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Tue, 12 May 2015 11:09:01 +0200
Subject: [PATCH 494/553] Win32: don't call GetFileAttributes twice in
 mingw_lstat()

GetFileAttributes cannot handle paths with trailing dir separator. The
current [l]stat implementation calls GetFileAttributes twice if the path
has trailing slashes (first with the original path passed to [l]stat, and
a second time with a path copy with trailing '/' removed).

With Unicode conversion, we get the length of the path for free and also
have a (wide char) buffer that can be modified.

Remove trailing directory separators before calling the Win32 API.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 compat/mingw.c | 48 ++++++++++++------------------------------------
 1 file changed, 12 insertions(+), 36 deletions(-)
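Note (not part of the commit message): the stripping step added to
do_lstat() can be sketched in isolation as below. `is_sep()` and
`strip_trailing_seps()` are hypothetical names; the real code works on
the buffer filled by xutftowcs_long_path().

```c
#include <assert.h>
#include <errno.h>
#include <wchar.h>

/* Hypothetical stand-in for is_dir_sep() applied to wide characters. */
static int is_sep(wchar_t c)
{
	return c == L'/' || c == L'\\';
}

/*
 * Strip trailing separators from the wide string `w` of length `wlen`.
 * Returns the new length, or -1 with errno set to ENOENT if nothing is
 * left, mirroring the hunk below.
 */
static int strip_trailing_seps(wchar_t *w, int wlen)
{
	assert(w && wlen >= 0);
	while (wlen && is_sep(w[wlen - 1]))
		w[--wlen] = 0;
	if (!wlen) {
		errno = ENOENT;
		return -1;
	}
	return wlen;
}
```

Under this logic a path that consists only of separators is reported as
ENOENT rather than being passed on to the Win32 API.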

diff --git a/compat/mingw.c b/compat/mingw.c
index 623e538d0a3bfb..0bddf16750fe37 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -983,8 +983,17 @@ static int do_lstat(int follow, const char *file_name, struct stat *buf)
 {
 	WIN32_FILE_ATTRIBUTE_DATA fdata;
 	wchar_t wfilename[MAX_LONG_PATH];
-	if (xutftowcs_long_path(wfilename, file_name) < 0)
+	int wlen = xutftowcs_long_path(wfilename, file_name);
+	if (wlen < 0)
+		return -1;
+
+	/* strip trailing '/', or GetFileAttributes will fail */
+	while (wlen && is_dir_sep(wfilename[wlen - 1]))
+		wfilename[--wlen] = 0;
+	if (!wlen) {
+		errno = ENOENT;
 		return -1;
+	}
 
 	if (GetFileAttributesExW(wfilename, GetFileExInfoStandard, &fdata)) {
 		buf->st_ino = 0;
@@ -1045,39 +1054,6 @@ static int do_lstat(int follow, const char *file_name, struct stat *buf)
 	return -1;
 }
 
-/* We provide our own lstat/fstat functions, since the provided
- * lstat/fstat functions are so slow. These stat functions are
- * tailored for Git's usage (read: fast), and are not meant to be
- * complete. Note that Git stat()s are redirected to mingw_lstat()
- * too, since Windows doesn't really handle symlinks that well.
- */
-static int do_stat_internal(int follow, const char *file_name, struct stat *buf)
-{
-	size_t namelen;
-	char alt_name[MAX_LONG_PATH];
-
-	if (!do_lstat(follow, file_name, buf))
-		return 0;
-
-	/* if file_name ended in a '/', Windows returned ENOENT;
-	 * try again without trailing slashes
-	 */
-	if (errno != ENOENT)
-		return -1;
-
-	namelen = strlen(file_name);
-	if (namelen && file_name[namelen-1] != '/')
-		return -1;
-	while (namelen && file_name[namelen-1] == '/')
-		--namelen;
-	if (!namelen || namelen >= MAX_LONG_PATH)
-		return -1;
-
-	memcpy(alt_name, file_name, namelen);
-	alt_name[namelen] = 0;
-	return do_lstat(follow, alt_name, buf);
-}
-
 int (*lstat)(const char *file_name, struct stat *buf) = mingw_lstat;
 
 static int get_file_info_by_handle(HANDLE hnd, struct stat *buf)
@@ -1105,11 +1081,11 @@ static int get_file_info_by_handle(HANDLE hnd, struct stat *buf)
 
 int mingw_lstat(const char *file_name, struct stat *buf)
 {
-	return do_stat_internal(0, file_name, buf);
+	return do_lstat(0, file_name, buf);
 }
 int mingw_stat(const char *file_name, struct stat *buf)
 {
-	return do_stat_internal(1, file_name, buf);
+	return do_lstat(1, file_name, buf);
 }
 
 int mingw_fstat(int fd, struct stat *buf)

From 737ad11de87a138a85ea4dc22d6df2bed2336694 Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Sat, 16 May 2015 01:18:14 +0200
Subject: [PATCH 495/553] Win32: implement stat() with symlink support

With respect to symlinks, the current stat() implementation is almost the
same as lstat(): except for the file type (st_mode & S_IFMT), it returns
information about the link rather than the target.

Implement stat by opening the file with as few permissions as possible
and calling GetFileInformationByHandle on it. This way, all link resolution
is handled by the Windows file system layer.

If symlinks are disabled, use lstat() as before, but fail with ELOOP if a
symlink would have to be resolved.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 0bddf16750fe37..ca593998570250 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -1083,9 +1083,26 @@ int mingw_lstat(const char *file_name, struct stat *buf)
 {
 	return do_lstat(0, file_name, buf);
 }
+
 int mingw_stat(const char *file_name, struct stat *buf)
 {
-	return do_lstat(1, file_name, buf);
+	wchar_t wfile_name[MAX_LONG_PATH];
+	HANDLE hnd;
+	int result;
+
+	/* open the file and let Windows resolve the links */
+	if (xutftowcs_long_path(wfile_name, file_name) < 0)
+		return -1;
+	hnd = CreateFileW(wfile_name, 0,
+			FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL,
+			OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
+	if (hnd == INVALID_HANDLE_VALUE) {
+		errno = err_win_to_posix(GetLastError());
+		return -1;
+	}
+	result = get_file_info_by_handle(hnd, buf);
+	CloseHandle(hnd);
+	return result;
 }
 
 int mingw_fstat(int fd, struct stat *buf)

From e64833d74b0d30d3ee264ae85a2713e37da4d907 Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Tue, 12 May 2015 00:58:39 +0200
Subject: [PATCH 496/553] Win32: remove separate do_lstat() function

With the new mingw_stat() implementation, do_lstat() is only called from
mingw_lstat() (with follow == 0). Remove the extra function and the old
mingw_stat()-specific (follow == 1) logic.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 compat/mingw.c | 22 ++--------------------
 1 file changed, 2 insertions(+), 20 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index ca593998570250..ef84192df2bd93 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -972,14 +972,7 @@ static int has_valid_directory_prefix(wchar_t *wfilename)
 	return 1;
 }
 
-/* We keep the do_lstat code in a separate function to avoid recursion.
- * When a path ends with a slash, the stat will fail with ENOENT. In
- * this case, we strip the trailing slashes and stat again.
- *
- * If follow is true then act like stat() and report on the link
- * target. Otherwise report on the link itself.
- */
-static int do_lstat(int follow, const char *file_name, struct stat *buf)
+int mingw_lstat(const char *file_name, struct stat *buf)
 {
 	WIN32_FILE_ATTRIBUTE_DATA fdata;
 	wchar_t wfilename[MAX_LONG_PATH];
@@ -1013,13 +1006,7 @@ static int do_lstat(int follow, const char *file_name, struct stat *buf)
 			if (handle != INVALID_HANDLE_VALUE) {
 				if ((findbuf.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) &&
 						(findbuf.dwReserved0 == IO_REPARSE_TAG_SYMLINK)) {
-					if (follow) {
-						char buffer[MAXIMUM_REPARSE_DATA_BUFFER_SIZE];
-						buf->st_size = readlink(file_name, buffer, MAXIMUM_REPARSE_DATA_BUFFER_SIZE);
-					} else {
-						buf->st_mode = S_IFLNK;
-					}
-					buf->st_mode |= S_IREAD;
+					buf->st_mode = S_IFLNK | S_IREAD;
 					if (!(findbuf.dwFileAttributes & FILE_ATTRIBUTE_READONLY))
 						buf->st_mode |= S_IWRITE;
 				}
@@ -1079,11 +1066,6 @@ static int get_file_info_by_handle(HANDLE hnd, struct stat *buf)
 	return 0;
 }
 
-int mingw_lstat(const char *file_name, struct stat *buf)
-{
-	return do_lstat(0, file_name, buf);
-}
-
 int mingw_stat(const char *file_name, struct stat *buf)
 {
 	wchar_t wfile_name[MAX_LONG_PATH];

From 264325a36fc05629120ee1d53799e36541a02e3e Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Sun, 24 May 2015 00:17:56 +0200
Subject: [PATCH 497/553] Win32: let mingw_lstat() error early upon problems
 with reparse points

When obtaining lstat information for reparse points, we need to call
FindFirstFile() in addition to GetFileAttributesEx() to obtain the type
of the reparse point (symlink, mount point etc.). However, currently there
is no error handling whatsoever if FindFirstFile() fails.

Call FindFirstFile() before modifying the stat *buf output parameter and
error out if the call fails.

Note: The FindFirstFile() return value includes all the data that we get
from GetFileAttributesEx(), so we could replace GetFileAttributesEx() with
FindFirstFile(). We don't do that because GetFileAttributesEx() is about
twice as fast for single files. I.e. we only pay the extra cost of calling
FindFirstFile() in the rare case that we encounter a reparse point.

Note: The indentation of the remaining reparse point code will be fixed in
the next patch.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 compat/mingw.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index ef84192df2bd93..e97c9f0da40aa1 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -975,6 +975,7 @@ static int has_valid_directory_prefix(wchar_t *wfilename)
 int mingw_lstat(const char *file_name, struct stat *buf)
 {
 	WIN32_FILE_ATTRIBUTE_DATA fdata;
+	WIN32_FIND_DATAW findbuf = { 0 };
 	wchar_t wfilename[MAX_LONG_PATH];
 	int wlen = xutftowcs_long_path(wfilename, file_name);
 	if (wlen < 0)
@@ -989,6 +990,13 @@ int mingw_lstat(const char *file_name, struct stat *buf)
 	}
 
 	if (GetFileAttributesExW(wfilename, GetFileExInfoStandard, &fdata)) {
+		/* for reparse points, use FindFirstFile to get the reparse tag */
+		if (fdata.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) {
+			HANDLE handle = FindFirstFileW(wfilename, &findbuf);
+			if (handle == INVALID_HANDLE_VALUE)
+				goto error;
+			FindClose(handle);
+		}
 		buf->st_ino = 0;
 		buf->st_gid = 0;
 		buf->st_uid = 0;
@@ -1001,20 +1009,16 @@ int mingw_lstat(const char *file_name, struct stat *buf)
 		filetime_to_timespec(&(fdata.ftLastWriteTime), &(buf->st_mtim));
 		filetime_to_timespec(&(fdata.ftCreationTime), &(buf->st_ctim));
 		if (fdata.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) {
-			WIN32_FIND_DATAW findbuf;
-			HANDLE handle = FindFirstFileW(wfilename, &findbuf);
-			if (handle != INVALID_HANDLE_VALUE) {
 				if ((findbuf.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) &&
 						(findbuf.dwReserved0 == IO_REPARSE_TAG_SYMLINK)) {
 					buf->st_mode = S_IFLNK | S_IREAD;
 					if (!(findbuf.dwFileAttributes & FILE_ATTRIBUTE_READONLY))
 						buf->st_mode |= S_IWRITE;
 				}
-				FindClose(handle);
-			}
 		}
 		return 0;
 	}
+error:
 	switch (GetLastError()) {
 	case ERROR_ACCESS_DENIED:
 	case ERROR_SHARING_VIOLATION:

From 59c9e23ed2864b5936d492771961675a7e12e7d8 Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Tue, 10 Jan 2017 23:21:56 +0100
Subject: [PATCH 498/553] mingw: teach fscache and dirent about symlinks

Move S_IFLNK detection to file_attr_to_st_mode() and reuse it in fscache.

Implement DT_LNK detection in dirent.c and the fscache readdir version.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c         | 13 +++----------
 compat/win32.h         |  6 ++++--
 compat/win32/dirent.c  |  5 ++++-
 compat/win32/fscache.c | 11 +++++++----
 4 files changed, 18 insertions(+), 17 deletions(-)
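Note (not part of the commit message): a portable sketch of the mapping
that file_attr_to_st_mode() performs once it also receives the reparse
tag. The `DEMO_*` constants are local stand-ins so that the sketch builds
outside Windows: the attribute and tag values repeat those in the
Windows SDK headers, and the mode bits use the conventional POSIX octal
values. `demo_attr_to_mode()` is a hypothetical name.

```c
#include <assert.h>

/* Stand-ins for FILE_ATTRIBUTE_* and IO_REPARSE_TAG_* (Windows SDK). */
#define DEMO_ATTR_READONLY      0x00000001u
#define DEMO_ATTR_DIRECTORY     0x00000010u
#define DEMO_ATTR_REPARSE_POINT 0x00000400u
#define DEMO_TAG_MOUNT_POINT    0xA0000003u
#define DEMO_TAG_SYMLINK        0xA000000Cu

/* Stand-ins for S_IFMT, S_IFDIR, S_IFREG, S_IFLNK, S_IREAD, S_IWRITE. */
#define DEMO_IFMT   0170000
#define DEMO_IFDIR  0040000
#define DEMO_IFREG  0100000
#define DEMO_IFLNK  0120000
#define DEMO_IREAD  0000400
#define DEMO_IWRITE 0000200

/* Map Win32 file attributes plus reparse tag to st_mode-style bits. */
static int demo_attr_to_mode(unsigned long attr, unsigned long tag)
{
	int mode = DEMO_IREAD;

	assert((attr >> 32) == 0 || sizeof(attr) == 4);
	if ((attr & DEMO_ATTR_REPARSE_POINT) && tag == DEMO_TAG_SYMLINK)
		mode |= DEMO_IFLNK;	/* symlink wins over directory */
	else if (attr & DEMO_ATTR_DIRECTORY)
		mode |= DEMO_IFDIR;
	else
		mode |= DEMO_IFREG;
	if (!(attr & DEMO_ATTR_READONLY))
		mode |= DEMO_IWRITE;
	return mode;
}
```

Checking the symlink tag first means that a directory symlink is
reported as a link, while other reparse points such as mount points keep
their directory or regular-file type.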

diff --git a/compat/mingw.c b/compat/mingw.c
index e97c9f0da40aa1..b33005df1a0f6c 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -1001,21 +1001,14 @@ int mingw_lstat(const char *file_name, struct stat *buf)
 		buf->st_gid = 0;
 		buf->st_uid = 0;
 		buf->st_nlink = 1;
-		buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes);
+		buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes,
+				findbuf.dwReserved0);
 		buf->st_size = fdata.nFileSizeLow |
 			(((off_t)fdata.nFileSizeHigh)<<32);
 		buf->st_dev = buf->st_rdev = 0; /* not used by Git */
 		filetime_to_timespec(&(fdata.ftLastAccessTime), &(buf->st_atim));
 		filetime_to_timespec(&(fdata.ftLastWriteTime), &(buf->st_mtim));
 		filetime_to_timespec(&(fdata.ftCreationTime), &(buf->st_ctim));
-		if (fdata.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) {
-				if ((findbuf.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) &&
-						(findbuf.dwReserved0 == IO_REPARSE_TAG_SYMLINK)) {
-					buf->st_mode = S_IFLNK | S_IREAD;
-					if (!(findbuf.dwFileAttributes & FILE_ATTRIBUTE_READONLY))
-						buf->st_mode |= S_IWRITE;
-				}
-		}
 		return 0;
 	}
 error:
@@ -1060,7 +1053,7 @@ static int get_file_info_by_handle(HANDLE hnd, struct stat *buf)
 	buf->st_gid = 0;
 	buf->st_uid = 0;
 	buf->st_nlink = 1;
-	buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes);
+	buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes, 0);
 	buf->st_size = fdata.nFileSizeLow |
 		(((off_t)fdata.nFileSizeHigh)<<32);
 	buf->st_dev = buf->st_rdev = 0; /* not used by Git */
diff --git a/compat/win32.h b/compat/win32.h
index a97e880757b6f1..671bcc81f93351 100644
--- a/compat/win32.h
+++ b/compat/win32.h
@@ -6,10 +6,12 @@
 #include <windows.h>
 #endif
 
-static inline int file_attr_to_st_mode (DWORD attr)
+static inline int file_attr_to_st_mode (DWORD attr, DWORD tag)
 {
 	int fMode = S_IREAD;
-	if (attr & FILE_ATTRIBUTE_DIRECTORY)
+	if ((attr & FILE_ATTRIBUTE_REPARSE_POINT) && tag == IO_REPARSE_TAG_SYMLINK)
+		fMode |= S_IFLNK;
+	else if (attr & FILE_ATTRIBUTE_DIRECTORY)
 		fMode |= S_IFDIR;
 	else
 		fMode |= S_IFREG;
diff --git a/compat/win32/dirent.c b/compat/win32/dirent.c
index c9fe2454efc01c..87063101f57202 100644
--- a/compat/win32/dirent.c
+++ b/compat/win32/dirent.c
@@ -18,7 +18,10 @@ static inline void finddata2dirent(struct dirent *ent, WIN32_FIND_DATAW *fdata)
 	xwcstoutf(ent->d_name, fdata->cFileName, MAX_PATH * 3);
 
 	/* Set file type, based on WIN32_FIND_DATA */
-	if (fdata->dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
+	if ((fdata->dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT)
+			&& fdata->dwReserved0 == IO_REPARSE_TAG_SYMLINK)
+		ent->d_type = DT_LNK;
+	else if (fdata->dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
 		ent->d_type = DT_DIR;
 	else
 		ent->d_type = DT_REG;
diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index dbf640ca790fde..41fae636c12a41 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -207,10 +207,13 @@ static struct fsentry *fseentry_create_entry(struct fscache *cache,
 		fdata->FileAttributes & FILE_ATTRIBUTE_REPARSE_POINT ?
 		fdata->EaSize : 0;
 
-	fse->st_mode = file_attr_to_st_mode(fdata->FileAttributes);
-	fse->dirent.d_type = S_ISDIR(fse->st_mode) ? DT_DIR : DT_REG;
-	fse->u.s.st_size = fdata->EndOfFile.LowPart |
-		(((off_t)fdata->EndOfFile.HighPart) << 32);
+	fse->st_mode = file_attr_to_st_mode(fdata->FileAttributes,
+					    fdata->EaSize);
+	fse->dirent.d_type = S_ISREG(fse->st_mode) ? DT_REG :
+			S_ISDIR(fse->st_mode) ? DT_DIR : DT_LNK;
+	fse->u.s.st_size = S_ISLNK(fse->st_mode) ? MAX_LONG_PATH :
+			fdata->EndOfFile.LowPart |
+			(((off_t)fdata->EndOfFile.HighPart) << 32);
 	filetime_to_timespec((FILETIME *)&(fdata->LastAccessTime),
 			     &(fse->u.s.st_atim));
 	filetime_to_timespec((FILETIME *)&(fdata->LastWriteTime),

From 6b9ee97a28e2c691e6bc41956086b2ed6119ebfa Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Sat, 16 May 2015 01:11:37 +0200
Subject: [PATCH 499/553] Win32: lstat(): return adequate stat.st_size for
 symlinks

Git typically doesn't trust the stat.st_size member of symlinks (e.g. see
strbuf_readlink()). However, some functions take shortcuts if st_size is 0
(e.g. diff_populate_filespec()).

In mingw_lstat() and fscache_lstat(), make sure to return an adequate size.

The extra overhead of opening and reading the reparse point to calculate
the exact size is not necessary, as git doesn't rely on the value anyway.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index b33005df1a0f6c..7b6c5fa6c29e08 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -1003,8 +1003,8 @@ int mingw_lstat(const char *file_name, struct stat *buf)
 		buf->st_nlink = 1;
 		buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes,
 				findbuf.dwReserved0);
-		buf->st_size = fdata.nFileSizeLow |
-			(((off_t)fdata.nFileSizeHigh)<<32);
+		buf->st_size = S_ISLNK(buf->st_mode) ? MAX_LONG_PATH :
+			fdata.nFileSizeLow | (((off_t) fdata.nFileSizeHigh) << 32);
 		buf->st_dev = buf->st_rdev = 0; /* not used by Git */
 		filetime_to_timespec(&(fdata.ftLastAccessTime), &(buf->st_atim));
 		filetime_to_timespec(&(fdata.ftLastWriteTime), &(buf->st_mtim));

From 61d130eff7bf176b2a018d1183d20fb0f91fc8fc Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Tue, 19 May 2015 21:48:55 +0200
Subject: [PATCH 500/553] Win32: factor out retry logic

The retry pattern is duplicated in three places. It also seems to be too
hard to use: mingw_unlink() and mingw_rmdir() duplicate the code to retry,
and both of them do so incompletely. They also do not restore errno if the
user answers 'no'.

Introduce a retry_ask_yes_no() helper function that handles retry with
small delay, asking the user, and restoring errno.

mingw_unlink: include _wchmod in the retry loop (which may fail if the
file is locked exclusively).

mingw_rmdir: include special error handling in the retry loop.
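
The control flow of the helper can be sketched in portable C as shown
below. This is a simplified model, not the code in this patch:
record_sleep() and answer_no() are hypothetical stand-ins for Sleep()
and the interactive prompt, and attempts_until_done() simulates an
operation that fails a given number of times.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for Sleep() and the interactive yes/no prompt. */
static int total_slept_ms;
static void record_sleep(int ms) { total_slept_ms += ms; }
static int answer_no(const char *question)
{
	(void)question;
	errno = 0;	/* simulate a prompt that clobbers errno */
	return 0;
}

static const int delay_ms[] = { 0, 1, 10, 20, 40 };
#define DELAY_STEPS ((int)(sizeof(delay_ms) / sizeof(delay_ms[0])))

/* Returns nonzero if the caller should run the operation again. */
static int retry_or_ask(int *tries, const char *question)
{
	int result, saved_errno = errno;

	if (*tries < DELAY_STEPS) {
		record_sleep(delay_ms[*tries]);	/* back off silently first */
		(*tries)++;
		return 1;
	}
	result = answer_no(question);	/* delays exhausted: ask the user */
	errno = saved_errno;
	return result;
}

/* Simulated operation that fails `failures` times, then succeeds. */
static int attempts_until_done(int failures)
{
	int tries = 0, attempts = 0;

	total_slept_ms = 0;
	do {
		if (++attempts > failures)
			return attempts;	/* success */
		errno = EACCES;		/* "file in use" */
	} while (retry_or_ask(&tries, "Should I try again?"));
	return -1;
}
```

With this shape the caller keeps the whole operation, including any
preparatory step, inside a single do/while loop, and errno still
describes the failed operation after the user declines.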

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 compat/mingw.c | 104 ++++++++++++++++++++++---------------------------
 1 file changed, 46 insertions(+), 58 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 7b6c5fa6c29e08..604e8f7abab398 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -30,8 +30,6 @@
 
 #define HCAST(type, handle) ((type)(intptr_t)handle)
 
-static const int delay[] = { 0, 1, 10, 20, 40 };
-
 void open_in_gdb(void)
 {
 	static struct child_process cp = CHILD_PROCESS_INIT;
@@ -207,15 +205,12 @@ static int read_yes_no_answer(void)
 	return -1;
 }
 
-static int ask_yes_no_if_possible(const char *format, ...)
+static int ask_yes_no_if_possible(const char *format, va_list args)
 {
 	char question[4096];
 	const char *retry_hook;
-	va_list args;
 
-	va_start(args, format);
 	vsnprintf(question, sizeof(question), format, args);
-	va_end(args);
 
 	retry_hook = mingw_getenv("GIT_ASK_YESNO");
 	if (retry_hook) {
@@ -240,6 +235,31 @@ static int ask_yes_no_if_possible(const char *format, ...)
 	}
 }
 
+static int retry_ask_yes_no(int *tries, const char *format, ...)
+{
+	static const int delay[] = { 0, 1, 10, 20, 40 };
+	va_list args;
+	int result, saved_errno = errno;
+
+	if ((*tries) < ARRAY_SIZE(delay)) {
+		/*
+		 * We assume that some other process had the file open at the wrong
+		 * moment and retry. In order to give the other process a higher
+		 * chance to complete its operation, we give up our time slice now.
+		 * If we have to retry again, we do sleep a bit.
+		 */
+		Sleep(delay[*tries]);
+		(*tries)++;
+		return 1;
+	}
+
+	va_start(args, format);
+	result = ask_yes_no_if_possible(format, args);
+	va_end(args);
+	errno = saved_errno;
+	return result;
+}
+
 /* Windows only */
 enum hide_dotfiles_type {
 	HIDE_DOTFILES_FALSE = 0,
@@ -332,7 +352,7 @@ static wchar_t *normalize_ntpath(wchar_t *wbuf)
 
 int mingw_unlink(const char *pathname, int handle_in_use_error)
 {
-	int ret, tries = 0;
+	int tries = 0;
 	wchar_t wpathname[MAX_LONG_PATH];
 	if (xutftowcs_long_path(wpathname, pathname) < 0)
 		return -1;
@@ -340,29 +360,19 @@ int mingw_unlink(const char *pathname, int handle_in_use_error)
 	if (DeleteFileW(wpathname))
 		return 0;
 
-	/* read-only files cannot be removed */
-	_wchmod(wpathname, 0666);
-	while ((ret = _wunlink(wpathname)) == -1 && tries < ARRAY_SIZE(delay)) {
+	do {
+		/* read-only files cannot be removed */
+		_wchmod(wpathname, 0666);
+		if (!_wunlink(wpathname))
+			return 0;
 		if (!is_file_in_use_error(GetLastError()))
 			break;
 		if (!handle_in_use_error)
-			return ret;
+			return -1;
 
-		/*
-		 * We assume that some other process had the source or
-		 * destination file open at the wrong moment and retry.
-		 * In order to give the other process a higher chance to
-		 * complete its operation, we give up our time slice now.
-		 * If we have to retry again, we do sleep a bit.
-		 */
-		Sleep(delay[tries]);
-		tries++;
-	}
-	while (ret == -1 && is_file_in_use_error(GetLastError()) &&
-	       ask_yes_no_if_possible("Unlink of file '%s' failed. "
-			"Should I try again?", pathname))
-	       ret = _wunlink(wpathname);
-	return ret;
+	} while (retry_ask_yes_no(&tries, "Unlink of file '%s' failed. "
+			"Should I try again?", pathname));
+	return -1;
 }
 
 static int is_dir_empty(const wchar_t *wpath)
@@ -389,7 +399,7 @@ static int is_dir_empty(const wchar_t *wpath)
 
 int mingw_rmdir(const char *pathname)
 {
-	int ret, tries = 0;
+	int tries = 0;
 	wchar_t wpathname[MAX_LONG_PATH];
 	struct stat st;
 
@@ -415,7 +425,11 @@ int mingw_rmdir(const char *pathname)
 	if (xutftowcs_long_path(wpathname, pathname) < 0)
 		return -1;
 
-	while ((ret = _wrmdir(wpathname)) == -1 && tries < ARRAY_SIZE(delay)) {
+	do {
+		if (!_wrmdir(wpathname)) {
+			invalidate_lstat_cache();
+			return 0;
+		}
 		if (!is_file_in_use_error(GetLastError()))
 			errno = err_win_to_posix(GetLastError());
 		if (errno != EACCES)
@@ -424,23 +438,9 @@ int mingw_rmdir(const char *pathname)
 			errno = ENOTEMPTY;
 			break;
 		}
-		/*
-		 * We assume that some other process had the source or
-		 * destination file open at the wrong moment and retry.
-		 * In order to give the other process a higher chance to
-		 * complete its operation, we give up our time slice now.
-		 * If we have to retry again, we do sleep a bit.
-		 */
-		Sleep(delay[tries]);
-		tries++;
-	}
-	while (ret == -1 && errno == EACCES && is_file_in_use_error(GetLastError()) &&
-	       ask_yes_no_if_possible("Deletion of directory '%s' failed. "
-			"Should I try again?", pathname))
-	       ret = _wrmdir(wpathname);
-	if (!ret)
-		invalidate_lstat_cache();
-	return ret;
+	} while (retry_ask_yes_no(&tries, "Deletion of directory '%s' failed. "
+			"Should I try again?", pathname));
+	return -1;
 }
 
 static inline int needs_hiding(const char *path)
@@ -2645,20 +2645,8 @@ int mingw_rename(const char *pold, const char *pnew)
 			SetFileAttributesW(wpnew, attrs);
 		}
 	}
-	if (tries < ARRAY_SIZE(delay) && gle == ERROR_ACCESS_DENIED) {
-		/*
-		 * We assume that some other process had the source or
-		 * destination file open at the wrong moment and retry.
-		 * In order to give the other process a higher chance to
-		 * complete its operation, we give up our time slice now.
-		 * If we have to retry again, we do sleep a bit.
-		 */
-		Sleep(delay[tries]);
-		tries++;
-		goto repeat;
-	}
 	if (gle == ERROR_ACCESS_DENIED &&
-	       ask_yes_no_if_possible("Rename from '%s' to '%s' failed. "
+	       retry_ask_yes_no(&tries, "Rename from '%s' to '%s' failed. "
 		       "Should I try again?", pold, pnew))
 		goto repeat;
 

From 6ed885351d9fbfe770b45f2ae8434e30ff44bde0 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 23 Feb 2018 02:50:03 +0100
Subject: [PATCH 501/553] mingw (git_terminal_prompt): do fall back to
 CONIN$/CONOUT$ method

To support Git Bash running in a MinTTY, we use a dirty trick to access
the MSYS2 pseudo terminal: we execute a Bash snippet that accesses
/dev/tty.

The idea was to fall back to writing to/reading from CONOUT$/CONIN$ if
that Bash call failed because Bash was not found.

However, we should fall back even in other error conditions, because we
have not successfully read the user input. Let's make it so.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/terminal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/compat/terminal.c b/compat/terminal.c
index a89c5cd9ccf604..882b027e41e52b 100644
--- a/compat/terminal.c
+++ b/compat/terminal.c
@@ -483,7 +483,7 @@ char *git_terminal_prompt(const char *prompt, int echo)
 
 	/* try shell_prompt first, fall back to CONIN/OUT if bash is missing */
 	char *result = shell_prompt(prompt, echo);
-	if (result || errno != ENOENT)
+	if (result)
 		return result;
 
 #endif

From a2de74c6f5d886a91becd5454accb6ad45531e87 Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Sun, 24 May 2015 01:55:05 +0200
Subject: [PATCH 502/553] Win32: change default of 'core.symlinks' to false

Symlinks on Windows don't work the same way as on Unix systems. For
example, there are different types of symlinks for directories and files,
and creating symlinks requires administrative privileges.

By default, disable symlink support on Windows. That is, users have to
enable it explicitly with 'git config [--system|--global] core.symlinks true'.

The test suite ignores system / global config files. Allow testing *with*
symlink support by checking if native symlinks are enabled in MSys2 (via
'MSYS=winsymlinks:nativestrict').

Reminder: This would need to be changed if / when we find a way to run the
test suite in a non-MSys-based shell (e.g. dash).
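
The environment check amounts to a substring test on the value of MSYS.
A minimal sketch, where msys_enables_native_symlinks() is a hypothetical
helper name that is not part of this patch:

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 if the given value of the MSYS environment variable asks
 * for native symlinks, 0 otherwise (including when it is unset).
 */
static int msys_enables_native_symlinks(const char *msys_value)
{
	return msys_value &&
		strstr(msys_value, "winsymlinks:nativestrict") != NULL;
}
```

A caller would pass getenv("MSYS") and clear the symlink default only
when the helper returns 0. Note that the match is a plain substring
search, so the option may appear anywhere in the variable.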

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 compat/mingw.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/compat/mingw.c b/compat/mingw.c
index 604e8f7abab398..66d4ae86c24966 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -3239,6 +3239,15 @@ static void setup_windows_environment(void)
 
 	if (!getenv("LC_ALL") && !getenv("LC_CTYPE") && !getenv("LANG"))
 		setenv("LC_CTYPE", "C.UTF-8", 1);
+
+	/*
+	 * Change 'core.symlinks' default to false, unless native symlinks are
+	 * enabled in MSys2 (via 'MSYS=winsymlinks:nativestrict'). Thus we can
+	 * run the test suite (which doesn't obey config files) with or without
+	 * symlink support.
+	 */
+	if (!(tmp = getenv("MSYS")) || !strstr(tmp, "winsymlinks:nativestrict"))
+		has_symlinks = 0;
 }
 
 static void get_current_user_sid(PSID *sid, HANDLE *linked_token)

From a1d0f3f802accc4a81042f3510e9de01bdb486ce Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Sat, 16 May 2015 00:32:03 +0200
Subject: [PATCH 503/553] Win32: add symlink-specific error codes

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 compat/mingw.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/compat/mingw.c b/compat/mingw.c
index 66d4ae86c24966..caffc57644025e 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -104,6 +104,7 @@ int err_win_to_posix(DWORD winerr)
 	case ERROR_INVALID_PARAMETER: error = EINVAL; break;
 	case ERROR_INVALID_PASSWORD: error = EPERM; break;
 	case ERROR_INVALID_PRIMARY_GROUP: error = EINVAL; break;
+	case ERROR_INVALID_REPARSE_DATA: error = EINVAL; break;
 	case ERROR_INVALID_SIGNAL_NUMBER: error = EINVAL; break;
 	case ERROR_INVALID_TARGET_HANDLE: error = EIO; break;
 	case ERROR_INVALID_WORKSTATION: error = EACCES; break;
@@ -118,6 +119,7 @@ int err_win_to_posix(DWORD winerr)
 	case ERROR_NEGATIVE_SEEK: error = ESPIPE; break;
 	case ERROR_NOACCESS: error = EFAULT; break;
 	case ERROR_NONE_MAPPED: error = EINVAL; break;
+	case ERROR_NOT_A_REPARSE_POINT: error = EINVAL; break;
 	case ERROR_NOT_ENOUGH_MEMORY: error = ENOMEM; break;
 	case ERROR_NOT_READY: error = EAGAIN; break;
 	case ERROR_NOT_SAME_DEVICE: error = EXDEV; break;
@@ -138,6 +140,9 @@ int err_win_to_posix(DWORD winerr)
 	case ERROR_PIPE_NOT_CONNECTED: error = EPIPE; break;
 	case ERROR_PRIVILEGE_NOT_HELD: error = EACCES; break;
 	case ERROR_READ_FAULT: error = EIO; break;
+	case ERROR_REPARSE_ATTRIBUTE_CONFLICT: error = EINVAL; break;
+	case ERROR_REPARSE_TAG_INVALID: error = EINVAL; break;
+	case ERROR_REPARSE_TAG_MISMATCH: error = EINVAL; break;
 	case ERROR_SEEK: error = EIO; break;
 	case ERROR_SEEK_ON_DEVICE: error = ESPIPE; break;
 	case ERROR_SHARING_BUFFER_EXCEEDED: error = ENFILE; break;

From 371fbd9480e22d2b955c42a33415e59e14838bbe Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Sun, 24 May 2015 01:06:10 +0200
Subject: [PATCH 504/553] Win32: mingw_unlink: support symlinks to directories

_wunlink() / DeleteFileW() refuse to delete symlinks to directories. If
_wunlink() fails with ERROR_ACCESS_DENIED, try _wrmdir() as well.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 compat/mingw.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index caffc57644025e..7e23e5efea870e 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -372,9 +372,16 @@ int mingw_unlink(const char *pathname, int handle_in_use_error)
 			return 0;
 		if (!is_file_in_use_error(GetLastError()))
 			break;
+		/*
+		 * _wunlink() / DeleteFileW() for directory symlinks fails with
+		 * ERROR_ACCESS_DENIED (EACCES), so try _wrmdir() as well. This is the
+		 * same error we get if a file is in use (already checked above).
+		 */
+		if (!_wrmdir(wpathname))
+			return 0;
+
 		if (!handle_in_use_error)
 			return -1;
-
 	} while (retry_ask_yes_no(&tries, "Unlink of file '%s' failed. "
 			"Should I try again?", pathname));
 	return -1;

From 5e1fa352a7289286e563ac78c739ed8453b00e91 Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Tue, 19 May 2015 22:42:48 +0200
Subject: [PATCH 505/553] Win32: mingw_rename: support renaming symlinks

MSVCRT's _wrename() cannot rename symlinks over existing files: it returns
success without doing anything. Newer MSVCR*.dll versions probably do not
have this problem: according to CRT sources, they just call MoveFileEx()
with the MOVEFILE_COPY_ALLOWED flag.

Get rid of _wrename() and call MoveFileEx() with proper error handling.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 compat/mingw.c | 38 ++++++++++++++++----------------------
 1 file changed, 16 insertions(+), 22 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 7e23e5efea870e..2609e423ae6b2e 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -2536,7 +2536,7 @@ int mingw_accept(int sockfd1, struct sockaddr *sa, socklen_t *sz)
 int mingw_rename(const char *pold, const char *pnew)
 {
 	static int supports_file_rename_info_ex = 1;
-	DWORD attrs, gle;
+	DWORD attrs = INVALID_FILE_ATTRIBUTES, gle;
 	int tries = 0;
 	wchar_t wpold[MAX_LONG_PATH], wpnew[MAX_LONG_PATH];
 	int wpnew_len;
@@ -2547,15 +2547,6 @@ int mingw_rename(const char *pold, const char *pnew)
 	if (wpnew_len < 0)
 		return -1;
 
-	/*
-	 * Try native rename() first to get errno right.
-	 * It is based on MoveFile(), which cannot overwrite existing files.
-	 */
-	if (!_wrename(wpold, wpnew))
-		return 0;
-	if (errno != EEXIST)
-		return -1;
-
 repeat:
 	if (supports_file_rename_info_ex) {
 		/*
@@ -2631,13 +2622,22 @@ int mingw_rename(const char *pold, const char *pnew)
 		 * to retry.
 		 */
 	} else {
-		if (MoveFileExW(wpold, wpnew, MOVEFILE_REPLACE_EXISTING))
+		if (MoveFileExW(wpold, wpnew,
+				MOVEFILE_REPLACE_EXISTING | MOVEFILE_COPY_ALLOWED))
 			return 0;
 		gle = GetLastError();
 	}
 
-	/* TODO: translate more errors */
-	if (gle == ERROR_ACCESS_DENIED &&
+	/* revert file attributes on failure */
+	if (attrs != INVALID_FILE_ATTRIBUTES)
+		SetFileAttributesW(wpnew, attrs);
+
+	if (!is_file_in_use_error(gle)) {
+		errno = err_win_to_posix(gle);
+		return -1;
+	}
+
+	if (attrs == INVALID_FILE_ATTRIBUTES &&
 	    (attrs = GetFileAttributesW(wpnew)) != INVALID_FILE_ATTRIBUTES) {
 		if (attrs & FILE_ATTRIBUTE_DIRECTORY) {
 			DWORD attrsold = GetFileAttributesW(wpold);
@@ -2649,16 +2649,10 @@ int mingw_rename(const char *pold, const char *pnew)
 			return -1;
 		}
 		if ((attrs & FILE_ATTRIBUTE_READONLY) &&
-		    SetFileAttributesW(wpnew, attrs & ~FILE_ATTRIBUTE_READONLY)) {
-			if (MoveFileExW(wpold, wpnew, MOVEFILE_REPLACE_EXISTING))
-				return 0;
-			gle = GetLastError();
-			/* revert file attributes on failure */
-			SetFileAttributesW(wpnew, attrs);
-		}
+		    SetFileAttributesW(wpnew, attrs & ~FILE_ATTRIBUTE_READONLY))
+			goto repeat;
 	}
-	if (gle == ERROR_ACCESS_DENIED &&
-	       retry_ask_yes_no(&tries, "Rename from '%s' to '%s' failed. "
+	if (retry_ask_yes_no(&tries, "Rename from '%s' to '%s' failed. "
 		       "Should I try again?", pold, pnew))
 		goto repeat;
 

From c5c5a1b0beb2b1c15a1dbfeb48dad1c7bf1cb4c4 Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Sun, 24 May 2015 01:17:31 +0200
Subject: [PATCH 506/553] Win32: mingw_chdir: change to symlink-resolved
 directory

If symlinks are enabled, resolve all symlinks when changing directories,
as required by POSIX.

Note: Git's real_path() function bases its link resolution algorithm on
this property of chdir(). Unfortunately, the current directory on Windows
is limited to only MAX_PATH (260) characters. Therefore using symlinks and
long paths in combination may be problematic.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 2609e423ae6b2e..50825ca41ca413 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -938,7 +938,24 @@ int mingw_chdir(const char *dirname)
 	wchar_t wdirname[MAX_LONG_PATH];
 	if (xutftowcs_long_path(wdirname, dirname) < 0)
 		return -1;
-	result = _wchdir(wdirname);
+
+	if (has_symlinks) {
+		HANDLE hnd = CreateFileW(wdirname, 0,
+				FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL,
+				OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
+		if (hnd == INVALID_HANDLE_VALUE) {
+			errno = err_win_to_posix(GetLastError());
+			return -1;
+		}
+		if (!GetFinalPathNameByHandleW(hnd, wdirname, ARRAY_SIZE(wdirname), 0)) {
+			errno = err_win_to_posix(GetLastError());
+			CloseHandle(hnd);
+			return -1;
+		}
+		CloseHandle(hnd);
+	}
+
+	result = _wchdir(normalize_ntpath(wdirname));
 	current_directory_len = GetCurrentDirectoryW(0, NULL);
 	return result;
 }

From 32fd784f904cf9e68cadf4ed3defb400e9f2c08c Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Sun, 24 May 2015 01:24:41 +0200
Subject: [PATCH 507/553] Win32: implement readlink()

Implement readlink() by reading NTFS reparse points. Works for symlinks
and directory junctions. If symlinks are disabled, fail with ENOSYS.
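
The final copy-out step follows the readlink() convention that a buffer
that is too small truncates the result instead of failing. A standalone
sketch of just that step, assuming the target has already been converted
to a NUL-terminated UTF-8 string (copy_link_target() and demo_buf are
hypothetical names used only for illustration):

```c
#include <stddef.h>
#include <string.h>

/* Scratch buffer for trying the helper out. */
static char demo_buf[16];

/*
 * Copy at most bufsiz bytes of target into buf. The terminating NUL is
 * copied only if it fits. Returns the number of payload bytes placed in
 * buf, which never counts the NUL.
 */
static int copy_link_target(char *buf, size_t bufsiz, const char *target)
{
	size_t len = strlen(target);
	size_t ncopy = len + 1 < bufsiz ? len + 1 : bufsiz;

	memcpy(buf, target, ncopy);
	return (int)(len < bufsiz ? len : bufsiz);
}
```

Callers that need the complete target therefore have to compare the
return value against the buffer size and retry with a larger buffer.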

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 compat/mingw-posix.h |  3 +-
 compat/mingw.c       | 98 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 99 insertions(+), 2 deletions(-)

diff --git a/compat/mingw-posix.h b/compat/mingw-posix.h
index e1546978654d60..0781423f0f3ab4 100644
--- a/compat/mingw-posix.h
+++ b/compat/mingw-posix.h
@@ -121,8 +121,6 @@ struct utsname {
  * trivial stubs
  */
 
-static inline int readlink(const char *path UNUSED, char *buf UNUSED, size_t bufsiz UNUSED)
-{ errno = ENOSYS; return -1; }
 static inline int symlink(const char *oldpath UNUSED, const char *newpath UNUSED)
 { errno = ENOSYS; return -1; }
 static inline int fchmod(int fildes UNUSED, mode_t mode UNUSED)
@@ -197,6 +195,7 @@ int setitimer(int type, struct itimerval *in, struct itimerval *out);
 int sigaction(int sig, struct sigaction *in, struct sigaction *out);
 int link(const char *oldpath, const char *newpath);
 int uname(struct utsname *buf);
+int readlink(const char *path, char *buf, size_t bufsiz);
 
 /*
  * replacements of existing functions
diff --git a/compat/mingw.c b/compat/mingw.c
index 50825ca41ca413..59786ad174de3f 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -24,6 +24,7 @@
 #define SECURITY_WIN32
 #include <sspi.h>
 #include <wchar.h>
+#include <winioctl.h>
 #include <winternl.h>
 
 #define STATUS_DELETE_PENDING ((NTSTATUS) 0xC0000056)
@@ -2958,6 +2959,103 @@ int link(const char *oldpath, const char *newpath)
 	return 0;
 }
 
+#ifndef _WINNT_H
+/*
+ * The REPARSE_DATA_BUFFER structure is defined in the Windows DDK (in
+ * ntifs.h) and in MSYS1's winnt.h (which defines _WINNT_H). So define
+ * it ourselves if we are on MSYS2 (whose winnt.h defines _WINNT_).
+ */
+typedef struct _REPARSE_DATA_BUFFER {
+	DWORD  ReparseTag;
+	WORD   ReparseDataLength;
+	WORD   Reserved;
+#ifndef _MSC_VER
+	_ANONYMOUS_UNION
+#endif
+	union {
+		struct {
+			WORD   SubstituteNameOffset;
+			WORD   SubstituteNameLength;
+			WORD   PrintNameOffset;
+			WORD   PrintNameLength;
+			ULONG  Flags;
+			WCHAR PathBuffer[1];
+		} SymbolicLinkReparseBuffer;
+		struct {
+			WORD   SubstituteNameOffset;
+			WORD   SubstituteNameLength;
+			WORD   PrintNameOffset;
+			WORD   PrintNameLength;
+			WCHAR PathBuffer[1];
+		} MountPointReparseBuffer;
+		struct {
+			BYTE   DataBuffer[1];
+		} GenericReparseBuffer;
+	} DUMMYUNIONNAME;
+} REPARSE_DATA_BUFFER, *PREPARSE_DATA_BUFFER;
+#endif
+
+int readlink(const char *path, char *buf, size_t bufsiz)
+{
+	HANDLE handle;
+	WCHAR wpath[MAX_LONG_PATH], *wbuf;
+	REPARSE_DATA_BUFFER *b = alloca(MAXIMUM_REPARSE_DATA_BUFFER_SIZE);
+	DWORD dummy;
+	char tmpbuf[MAX_LONG_PATH];
+	int len;
+
+	if (xutftowcs_long_path(wpath, path) < 0)
+		return -1;
+
+	/* read reparse point data */
+	handle = CreateFileW(wpath, 0,
+			FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL,
+			OPEN_EXISTING,
+			FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OPEN_REPARSE_POINT, NULL);
+	if (handle == INVALID_HANDLE_VALUE) {
+		errno = err_win_to_posix(GetLastError());
+		return -1;
+	}
+	if (!DeviceIoControl(handle, FSCTL_GET_REPARSE_POINT, NULL, 0, b,
+			MAXIMUM_REPARSE_DATA_BUFFER_SIZE, &dummy, NULL)) {
+		errno = err_win_to_posix(GetLastError());
+		CloseHandle(handle);
+		return -1;
+	}
+	CloseHandle(handle);
+
+	/* get target path for symlinks or mount points (aka 'junctions') */
+	switch (b->ReparseTag) {
+	case IO_REPARSE_TAG_SYMLINK:
+		wbuf = (WCHAR*) (((char*) b->SymbolicLinkReparseBuffer.PathBuffer)
+				+ b->SymbolicLinkReparseBuffer.SubstituteNameOffset);
+		*(WCHAR*) (((char*) wbuf)
+				+ b->SymbolicLinkReparseBuffer.SubstituteNameLength) = 0;
+		break;
+	case IO_REPARSE_TAG_MOUNT_POINT:
+		wbuf = (WCHAR*) (((char*) b->MountPointReparseBuffer.PathBuffer)
+				+ b->MountPointReparseBuffer.SubstituteNameOffset);
+		*(WCHAR*) (((char*) wbuf)
+				+ b->MountPointReparseBuffer.SubstituteNameLength) = 0;
+		break;
+	default:
+		errno = EINVAL;
+		return -1;
+	}
+
+	/*
+	 * Adapt to strange readlink() API: Copy up to bufsiz *bytes*, potentially
+	 * cutting off a UTF-8 sequence. Insufficient bufsize is *not* a failure
+	 * condition. There is no conversion function that produces invalid UTF-8,
+	 * so convert to a (hopefully large enough) temporary buffer, then memcpy
+	 * the requested number of bytes (including '\0' for robustness).
+	 */
+	if ((len = xwcstoutf(tmpbuf, normalize_ntpath(wbuf), MAX_LONG_PATH)) < 0)
+		return -1;
+	memcpy(buf, tmpbuf, min(bufsiz, len + 1));
+	return min(bufsiz, len);
+}
+
 pid_t waitpid(pid_t pid, int *status, int options)
 {
 	HANDLE h = OpenProcess(SYNCHRONIZE | PROCESS_QUERY_INFORMATION,

From 85dc713b9652a090d0c1df2421f1db3ccee2b50c Mon Sep 17 00:00:00 2001
From: Bill Zissimopoulos <billziss@navimatics.com>
Date: Thu, 28 May 2020 16:35:57 -0700
Subject: [PATCH 508/553] mingw: lstat: compute correct size for symlinks

This commit fixes mingw_lstat by computing the proper size for symlinks
according to POSIX. POSIX specifies that upon successful return from
lstat: "the value of the st_size member shall be set to the length of
the pathname contained in the symbolic link not including any
terminating null byte".

Prior to this commit the mingw_lstat function returned a fixed size of
4096. This caused problems in git repositories that were accessed by
git for Cygwin or git for WSL. For example, doing `git reset --hard`
using git for Windows would update the size of symlinks in the index
to be 4096; at a later time git for Cygwin or git for WSL would find
that symlinks have changed size during `git status`. Conversely, doing
`git reset --hard` in git for Cygwin or git for WSL would update the
size of symlinks in the index with the correct value, only for git for
Windows to report incorrectly at a later time that the size had changed.

Signed-off-by: Bill Zissimopoulos <billziss@navimatics.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c         | 65 ++++++++++++++++++++++++++++--------------
 compat/win32/fscache.c | 12 ++++++++
 2 files changed, 56 insertions(+), 21 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 59786ad174de3f..a41db274ffa312 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -1002,10 +1002,14 @@ static int has_valid_directory_prefix(wchar_t *wfilename)
 	return 1;
 }
 
+static int readlink_1(const WCHAR *wpath, BOOL fail_on_unknown_tag,
+		      char *tmpbuf, int *plen, DWORD *ptag);
+
 int mingw_lstat(const char *file_name, struct stat *buf)
 {
 	WIN32_FILE_ATTRIBUTE_DATA fdata;
-	WIN32_FIND_DATAW findbuf = { 0 };
+	DWORD reparse_tag = 0;
+	int link_len = 0;
 	wchar_t wfilename[MAX_LONG_PATH];
 	int wlen = xutftowcs_long_path(wfilename, file_name);
 	if (wlen < 0)
@@ -1020,20 +1024,21 @@ int mingw_lstat(const char *file_name, struct stat *buf)
 	}
 
 	if (GetFileAttributesExW(wfilename, GetFileExInfoStandard, &fdata)) {
-		/* for reparse points, use FindFirstFile to get the reparse tag */
+		/* for reparse points, get the link tag and length */
 		if (fdata.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) {
-			HANDLE handle = FindFirstFileW(wfilename, &findbuf);
-			if (handle == INVALID_HANDLE_VALUE)
-				goto error;
-			FindClose(handle);
+			char tmpbuf[MAX_LONG_PATH];
+
+			if (readlink_1(wfilename, FALSE, tmpbuf, &link_len,
+				       &reparse_tag) < 0)
+				return -1;
 		}
 		buf->st_ino = 0;
 		buf->st_gid = 0;
 		buf->st_uid = 0;
 		buf->st_nlink = 1;
 		buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes,
-				findbuf.dwReserved0);
-		buf->st_size = S_ISLNK(buf->st_mode) ? MAX_LONG_PATH :
+				reparse_tag);
+		buf->st_size = S_ISLNK(buf->st_mode) ? link_len :
 			fdata.nFileSizeLow | (((off_t) fdata.nFileSizeHigh) << 32);
 		buf->st_dev = buf->st_rdev = 0; /* not used by Git */
 		filetime_to_timespec(&(fdata.ftLastAccessTime), &(buf->st_atim));
@@ -1041,7 +1046,7 @@ int mingw_lstat(const char *file_name, struct stat *buf)
 		filetime_to_timespec(&(fdata.ftCreationTime), &(buf->st_ctim));
 		return 0;
 	}
-error:
+
 	switch (GetLastError()) {
 	case ERROR_ACCESS_DENIED:
 	case ERROR_SHARING_VIOLATION:
@@ -2995,17 +3000,13 @@ typedef struct _REPARSE_DATA_BUFFER {
 } REPARSE_DATA_BUFFER, *PREPARSE_DATA_BUFFER;
 #endif
 
-int readlink(const char *path, char *buf, size_t bufsiz)
+static int readlink_1(const WCHAR *wpath, BOOL fail_on_unknown_tag,
+		      char *tmpbuf, int *plen, DWORD *ptag)
 {
 	HANDLE handle;
-	WCHAR wpath[MAX_LONG_PATH], *wbuf;
+	WCHAR *wbuf;
 	REPARSE_DATA_BUFFER *b = alloca(MAXIMUM_REPARSE_DATA_BUFFER_SIZE);
 	DWORD dummy;
-	char tmpbuf[MAX_LONG_PATH];
-	int len;
-
-	if (xutftowcs_long_path(wpath, path) < 0)
-		return -1;
 
 	/* read reparse point data */
 	handle = CreateFileW(wpath, 0,
@@ -3025,7 +3026,7 @@ int readlink(const char *path, char *buf, size_t bufsiz)
 	CloseHandle(handle);
 
 	/* get target path for symlinks or mount points (aka 'junctions') */
-	switch (b->ReparseTag) {
+	switch ((*ptag = b->ReparseTag)) {
 	case IO_REPARSE_TAG_SYMLINK:
 		wbuf = (WCHAR*) (((char*) b->SymbolicLinkReparseBuffer.PathBuffer)
 				+ b->SymbolicLinkReparseBuffer.SubstituteNameOffset);
@@ -3039,10 +3040,34 @@ int readlink(const char *path, char *buf, size_t bufsiz)
 				+ b->MountPointReparseBuffer.SubstituteNameLength) = 0;
 		break;
 	default:
-		errno = EINVAL;
-		return -1;
+		if (fail_on_unknown_tag) {
+			errno = EINVAL;
+			return -1;
+		} else {
+			*plen = MAX_LONG_PATH;
+			return 0;
+		}
 	}
 
+	if ((*plen =
+	     xwcstoutf(tmpbuf, normalize_ntpath(wbuf), MAX_LONG_PATH)) <  0)
+		return -1;
+	return 0;
+}
+
+int readlink(const char *path, char *buf, size_t bufsiz)
+{
+	WCHAR wpath[MAX_LONG_PATH];
+	char tmpbuf[MAX_LONG_PATH];
+	int len;
+	DWORD tag;
+
+	if (xutftowcs_long_path(wpath, path) < 0)
+		return -1;
+
+	if (readlink_1(wpath, TRUE, tmpbuf, &len, &tag) < 0)
+		return -1;
+
 	/*
 	 * Adapt to strange readlink() API: Copy up to bufsiz *bytes*, potentially
 	 * cutting off a UTF-8 sequence. Insufficient bufsize is *not* a failure
@@ -3050,8 +3075,6 @@ int readlink(const char *path, char *buf, size_t bufsiz)
 	 * so convert to a (hopefully large enough) temporary buffer, then memcpy
 	 * the requested number of bytes (including '\0' for robustness).
 	 */
-	if ((len = xwcstoutf(tmpbuf, normalize_ntpath(wbuf), MAX_LONG_PATH)) < 0)
-		return -1;
 	memcpy(buf, tmpbuf, min(bufsiz, len + 1));
 	return min(bufsiz, len);
 }
diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index 41fae636c12a41..0f5e00ae18f949 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -594,6 +594,18 @@ int fscache_lstat(const char *filename, struct stat *st)
 		return -1;
 	}
 
+	/*
+	 * Special case symbolic links: FindFirstFile()/FindNextFile() did not
+	 * provide us with the length of the target path.
+	 */
+	if (fse->u.s.st_size == MAX_LONG_PATH && S_ISLNK(fse->st_mode)) {
+		char buf[MAX_LONG_PATH];
+		int len = readlink(filename, buf, sizeof(buf) - 1);
+
+		if (len > 0)
+			fse->u.s.st_size = len;
+	}
+
 	/* copy stat data */
 	st->st_ino = 0;
 	st->st_gid = 0;

From 5ea5a194e9b4a1aa3ee1737ddf48236c9c8e1bd1 Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Sun, 24 May 2015 01:32:03 +0200
Subject: [PATCH 509/553] Win32: implement basic symlink() functionality (file
 symlinks only)

Implement symlink() that always creates file symlinks. Fails with ENOSYS
if symlinks are disabled or unsupported.

Note: CreateSymbolicLinkW() was introduced with symlink support in Windows
Vista. For compatibility with Windows XP, we need to load it dynamically
and fail gracefully if it isn't available.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 compat/mingw-posix.h |  3 +--
 compat/mingw.c       | 28 ++++++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/compat/mingw-posix.h b/compat/mingw-posix.h
index 0781423f0f3ab4..1a917890b18930 100644
--- a/compat/mingw-posix.h
+++ b/compat/mingw-posix.h
@@ -121,8 +121,6 @@ struct utsname {
  * trivial stubs
  */
 
-static inline int symlink(const char *oldpath UNUSED, const char *newpath UNUSED)
-{ errno = ENOSYS; return -1; }
 static inline int fchmod(int fildes UNUSED, mode_t mode UNUSED)
 { errno = ENOSYS; return -1; }
 #ifndef __MINGW64_VERSION_MAJOR
@@ -195,6 +193,7 @@ int setitimer(int type, struct itimerval *in, struct itimerval *out);
 int sigaction(int sig, struct sigaction *in, struct sigaction *out);
 int link(const char *oldpath, const char *newpath);
 int uname(struct utsname *buf);
+int symlink(const char *target, const char *link);
 int readlink(const char *path, char *buf, size_t bufsiz);
 
 /*
diff --git a/compat/mingw.c b/compat/mingw.c
index a41db274ffa312..430a7c18dbacc3 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -2964,6 +2964,34 @@ int link(const char *oldpath, const char *newpath)
 	return 0;
 }
 
+int symlink(const char *target, const char *link)
+{
+	wchar_t wtarget[MAX_LONG_PATH], wlink[MAX_LONG_PATH];
+	int len;
+
+	/* fail if symlinks are disabled or API is not supported (WinXP) */
+	if (!has_symlinks) {
+		errno = ENOSYS;
+		return -1;
+	}
+
+	if ((len = xutftowcs_long_path(wtarget, target)) < 0
+			|| xutftowcs_long_path(wlink, link) < 0)
+		return -1;
+
+	/* convert target dir separators to backslashes */
+	while (len--)
+		if (wtarget[len] == '/')
+			wtarget[len] = '\\';
+
+	/* create file symlink */
+	if (!CreateSymbolicLinkW(wlink, wtarget, 0)) {
+		errno = err_win_to_posix(GetLastError());
+		return -1;
+	}
+	return 0;
+}
+
 #ifndef _WINNT_H
 /*
  * The REPARSE_DATA_BUFFER structure is defined in the Windows DDK (in

From 64f0b6140a3c2c8abb0edf7129841940d766780f Mon Sep 17 00:00:00 2001
From: Karsten Blees <blees@dcon.de>
Date: Sun, 24 May 2015 01:48:35 +0200
Subject: [PATCH 510/553] Win32: symlink: add support for symlinks to
 directories

Symlinks on Windows have a flag that indicates whether the target is a file
or a directory. Symlinks of the wrong type simply don't work. This even affects
core Win32 APIs (e.g. DeleteFile() refuses to delete directory symlinks).

However, CreateFile() with FILE_FLAG_BACKUP_SEMANTICS doesn't seem to care.
Check the target type by first creating a tentative file symlink, opening
it, and checking the type of the resulting handle. If it is a directory,
recreate the symlink with the directory flag set.

It is possible to create symlinks before the target exists (or in case of
symlinks to symlinks: before the target type is known). If this happens,
create a tentative file symlink and postpone the directory decision: keep
a list of phantom symlinks to be processed whenever a new directory is
created in mingw_mkdir().

Limitations: This algorithm may fail if a link target changes from file to
directory or vice versa, or if the target directory is created in another
process.
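
The list handling described above can be sketched in portable C. This
is only an illustration, not the code in this patch: `struct pending`,
`push_pending()`, `process_pending()` and `resolve_chain()` are
hypothetical names, and the resolver merely simulates targets that
become resolvable one after another.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum result { RETRY, DONE, DIRECTORY };

/* hypothetical stand-in for an entry in the phantom symlinks list */
struct pending {
	struct pending *next;
	int id;
};

typedef enum result (*resolve_fn)(int id, void *ctx);

/* prepend a new entry, as symlink() does for unresolved targets */
static void push_pending(struct pending **head, int id)
{
	struct pending *p = malloc(sizeof(*p));

	if (!p)
		abort();
	p->id = id;
	p->next = *head;
	*head = p;
}

/*
 * Walk the list through a pointer-to-pointer so that entries can be
 * unlinked in place. RETRY keeps the entry; anything else removes it.
 * A DIRECTORY result restarts the walk, because a freshly converted
 * directory link may make earlier entries resolvable.
 */
static size_t process_pending(struct pending **head, resolve_fn resolve,
			      void *ctx)
{
	struct pending *cur, **pp = head;
	size_t removed = 0;

	assert(head && resolve);
	while ((cur = *pp)) {
		enum result r = resolve(cur->id, ctx);

		if (r == RETRY) {
			pp = &cur->next;
		} else {
			*pp = cur->next;
			free(cur);
			removed++;
			if (r == DIRECTORY)
				pp = head;
		}
	}
	return removed;
}

/* simulated resolver: entry N becomes a directory once N-1 has */
static enum result resolve_chain(int id, void *ctx)
{
	int *ready = ctx;

	if (id > *ready)
		return RETRY;
	if (id < *ready)
		return DONE;
	(*ready)++;
	return DIRECTORY;
}
```

With entries 3, 2, 1 queued in that order and only entry 1 resolvable
at first, a single call still drains the whole list; without the
restart, entries 3 and 2 would stay queued until the next call.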

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 159 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 159 insertions(+)

diff --git a/compat/mingw.c b/compat/mingw.c
index 430a7c18dbacc3..8ef1d6b41e3c0e 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -331,6 +331,126 @@ static inline int is_wdir_sep(wchar_t wchar)
 	return wchar == L'/' || wchar == L'\\';
 }
 
+static const wchar_t *make_relative_to(const wchar_t *path,
+				       const wchar_t *relative_to, wchar_t *out,
+				       size_t size)
+{
+	size_t i = wcslen(relative_to), len;
+
+	/* Is `path` already absolute? */
+	if (is_wdir_sep(path[0]) ||
+	    (iswalpha(path[0]) && path[1] == L':' && is_wdir_sep(path[2])))
+		return path;
+
+	while (i > 0 && !is_wdir_sep(relative_to[i - 1]))
+		i--;
+
+	/* Is `relative_to` in the current directory? */
+	if (!i)
+		return path;
+
+	len = wcslen(path);
+	if (i + len + 1 > size) {
+		error("Could not make '%ls' relative to '%ls' (too large)",
+		      path, relative_to);
+		return NULL;
+	}
+
+	memcpy(out, relative_to, i * sizeof(wchar_t));
+	wcscpy(out + i, path);
+	return out;
+}
+
+enum phantom_symlink_result {
+	PHANTOM_SYMLINK_RETRY,
+	PHANTOM_SYMLINK_DONE,
+	PHANTOM_SYMLINK_DIRECTORY
+};
+
+/*
+ * Changes a file symlink to a directory symlink if the target exists and is a
+ * directory.
+ */
+static enum phantom_symlink_result
+process_phantom_symlink(const wchar_t *wtarget, const wchar_t *wlink)
+{
+	HANDLE hnd;
+	BY_HANDLE_FILE_INFORMATION fdata;
+	wchar_t relative[MAX_LONG_PATH];
+	const wchar_t *rel;
+
+	/* check that wlink is still a file symlink */
+	if ((GetFileAttributesW(wlink)
+			& (FILE_ATTRIBUTE_REPARSE_POINT | FILE_ATTRIBUTE_DIRECTORY))
+			!= FILE_ATTRIBUTE_REPARSE_POINT)
+		return PHANTOM_SYMLINK_DONE;
+
+	/* make it relative, if necessary */
+	rel = make_relative_to(wtarget, wlink, relative, ARRAY_SIZE(relative));
+	if (!rel)
+		return PHANTOM_SYMLINK_DONE;
+
+	/* let Windows resolve the link by opening it */
+	hnd = CreateFileW(rel, 0,
+			FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL,
+			OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
+	if (hnd == INVALID_HANDLE_VALUE) {
+		errno = err_win_to_posix(GetLastError());
+		return PHANTOM_SYMLINK_RETRY;
+	}
+
+	if (!GetFileInformationByHandle(hnd, &fdata)) {
+		errno = err_win_to_posix(GetLastError());
+		CloseHandle(hnd);
+		return PHANTOM_SYMLINK_RETRY;
+	}
+	CloseHandle(hnd);
+
+	/* if target exists and is a file, we're done */
+	if (!(fdata.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
+		return PHANTOM_SYMLINK_DONE;
+
+	/* otherwise recreate the symlink with directory flag */
+	if (DeleteFileW(wlink) && CreateSymbolicLinkW(wlink, wtarget, 1))
+		return PHANTOM_SYMLINK_DIRECTORY;
+
+	errno = err_win_to_posix(GetLastError());
+	return PHANTOM_SYMLINK_RETRY;
+}
+
+/* keep track of newly created symlinks to non-existing targets */
+struct phantom_symlink_info {
+	struct phantom_symlink_info *next;
+	wchar_t *wlink;
+	wchar_t *wtarget;
+};
+
+static struct phantom_symlink_info *phantom_symlinks = NULL;
+static CRITICAL_SECTION phantom_symlinks_cs;
+
+static void process_phantom_symlinks(void)
+{
+	struct phantom_symlink_info *current, **psi;
+	EnterCriticalSection(&phantom_symlinks_cs);
+	/* process phantom symlinks list */
+	psi = &phantom_symlinks;
+	while ((current = *psi)) {
+		enum phantom_symlink_result result = process_phantom_symlink(
+				current->wtarget, current->wlink);
+		if (result == PHANTOM_SYMLINK_RETRY) {
+			psi = &current->next;
+		} else {
+			/* symlink was processed, remove from list */
+			*psi = current->next;
+			free(current);
+			/* if symlink was a directory, start over */
+			if (result == PHANTOM_SYMLINK_DIRECTORY)
+				psi = &phantom_symlinks;
+		}
+	}
+	LeaveCriticalSection(&phantom_symlinks_cs);
+}
+
 /* Normalizes NT paths as returned by some low-level APIs. */
 static wchar_t *normalize_ntpath(wchar_t *wbuf)
 {
@@ -517,6 +637,8 @@ int mingw_mkdir(const char *path, int mode UNUSED)
 		return -1;
 
 	ret = _wmkdir(wpath);
+	if (!ret)
+		process_phantom_symlinks();
 	if (!ret && needs_hiding(path))
 		return set_hidden_flag(wpath, 1);
 	return ret;
@@ -2989,6 +3111,42 @@ int symlink(const char *target, const char *link)
 		errno = err_win_to_posix(GetLastError());
 		return -1;
 	}
+
+	/* convert to directory symlink if target exists */
+	switch (process_phantom_symlink(wtarget, wlink)) {
+	case PHANTOM_SYMLINK_RETRY:	{
+		/* if target doesn't exist, add to phantom symlinks list */
+		wchar_t wfullpath[MAX_LONG_PATH];
+		struct phantom_symlink_info *psi;
+
+		/* convert to absolute path to be independent of cwd */
+		len = GetFullPathNameW(wlink, MAX_LONG_PATH, wfullpath, NULL);
+		if (!len || len >= MAX_LONG_PATH) {
+			errno = err_win_to_posix(GetLastError());
+			return -1;
+		}
+
+		/* over-allocate and fill phantom_symlink_info structure */
+		psi = xmalloc(sizeof(struct phantom_symlink_info)
+			+ sizeof(wchar_t) * (len + wcslen(wtarget) + 2));
+		psi->wlink = (wchar_t *)(psi + 1);
+		wcscpy(psi->wlink, wfullpath);
+		psi->wtarget = psi->wlink + len + 1;
+		wcscpy(psi->wtarget, wtarget);
+
+		EnterCriticalSection(&phantom_symlinks_cs);
+		psi->next = phantom_symlinks;
+		phantom_symlinks = psi;
+		LeaveCriticalSection(&phantom_symlinks_cs);
+		break;
+	}
+	case PHANTOM_SYMLINK_DIRECTORY:
+		/* if we created a dir symlink, process other phantom symlinks */
+		process_phantom_symlinks();
+		break;
+	default:
+		break;
+	}
 	return 0;
 }
 
@@ -3963,6 +4121,7 @@ int wmain(int argc, const wchar_t **wargv)
 
 	/* initialize critical section for waitpid pinfo_t list */
 	InitializeCriticalSection(&pinfo_cs);
+	InitializeCriticalSection(&phantom_symlinks_cs);
 
 	/* initialize critical section for fscache */
 	InitializeCriticalSection(&fscache_cs);

From ef80a1ed1dfcf466d87617230fbef0e36a412f46 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Tue, 30 May 2017 21:50:57 +0200
Subject: [PATCH 511/553] mingw: try to create symlinks without elevated
 permissions

With Windows 10 Build 14972 in Developer Mode, a new flag is supported
by CreateSymbolicLink() to create symbolic links even when running
outside of an elevated session (which was previously required).

This new flag is called SYMBOLIC_LINK_FLAG_ALLOW_UNPRIVILEGED_CREATE and
has the numeric value 0x02.

Previous Windows 10 versions will not understand that flag and return an
ERROR_INVALID_PARAMETER, therefore we have to be careful to try passing
that flag only when the build number indicates that it is supported.

For more information about the new flag, see this blog post:
https://blogs.windows.com/buildingapps/2016/12/02/symlinks-windows-10/

This patch is loosely based on the patch submitted by Samuel D. Leslie
as https://github.com/git-for-windows/git/pull/1184.
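
The build number test can be sketched in portable C as follows. The
sketch assumes the value layout of GetVersion() on NT-based systems
(major version in the low byte, minor version in the next byte, build
number in the high-order word); `pack_version()`, `build_number()` and
`symlink_flags()` are hypothetical helpers, not part of this patch.

```c
#include <assert.h>
#include <stdint.h>

#define FLAG_DIRECTORY            0x1u /* SYMBOLIC_LINK_FLAG_DIRECTORY */
#define FLAG_UNPRIVILEGED_CREATE  0x2u /* value quoted in this message */
#define MIN_BUILD                 14972u

/* hypothetical: build a value laid out like GetVersion()'s result */
static uint32_t pack_version(unsigned major, unsigned minor, unsigned build)
{
	assert(major < 256 && minor < 256 && build < 32768);
	return (uint32_t)build << 16 | minor << 8 | major;
}

/* assumption: the build number lives in the high-order word */
static unsigned build_number(uint32_t packed)
{
	return packed >> 16;
}

/* pick the flags to pass as the last CreateSymbolicLink() parameter */
static uint32_t symlink_flags(uint32_t packed_version, int directory)
{
	uint32_t flags = directory ? FLAG_DIRECTORY : 0;

	if (build_number(packed_version) >= MIN_BUILD)
		flags |= FLAG_UNPRIVILEGED_CREATE;
	return flags;
}
```

Comparing the build number alone is equivalent to comparing the whole
packed value against `14972 << 16`, because the low-order word cannot
change the outcome of that unsigned comparison.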

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 8ef1d6b41e3c0e..cfe09fa09fa7ee 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -361,6 +361,8 @@ static const wchar_t *make_relative_to(const wchar_t *path,
 	return out;
 }
 
+static DWORD symlink_file_flags = 0, symlink_directory_flags = 1;
+
 enum phantom_symlink_result {
 	PHANTOM_SYMLINK_RETRY,
 	PHANTOM_SYMLINK_DONE,
@@ -411,7 +413,8 @@ process_phantom_symlink(const wchar_t *wtarget, const wchar_t *wlink)
 		return PHANTOM_SYMLINK_DONE;
 
 	/* otherwise recreate the symlink with directory flag */
-	if (DeleteFileW(wlink) && CreateSymbolicLinkW(wlink, wtarget, 1))
+	if (DeleteFileW(wlink) &&
+	    CreateSymbolicLinkW(wlink, wtarget, symlink_directory_flags))
 		return PHANTOM_SYMLINK_DIRECTORY;
 
 	errno = err_win_to_posix(GetLastError());
@@ -3107,7 +3110,7 @@ int symlink(const char *target, const char *link)
 			wtarget[len] = '\\';
 
 	/* create file symlink */
-	if (!CreateSymbolicLinkW(wlink, wtarget, 0)) {
+	if (!CreateSymbolicLinkW(wlink, wtarget, symlink_file_flags)) {
 		errno = err_win_to_posix(GetLastError());
 		return -1;
 	}
@@ -4056,6 +4059,24 @@ static void maybe_redirect_std_handles(void)
 				  GENERIC_WRITE, FILE_FLAG_NO_BUFFERING);
 }
 
+static void adjust_symlink_flags(void)
+{
+	/*
+	 * Starting with Windows 10 Build 14972, symbolic links can be created
+	 * using CreateSymbolicLink() without elevation by passing the flag
+	 * SYMBOLIC_LINK_FLAG_ALLOW_UNPRIVILEGED_CREATE (0x02) as last
+	 * parameter, provided the Developer Mode has been enabled. Some
+	 * earlier Windows versions complain about this flag with an
+	 * ERROR_INVALID_PARAMETER, hence we have to test the build number
+	 * specifically.
+	 */
+	if (GetVersion() >= 14972 << 16) {
+		symlink_file_flags |= 2;
+		symlink_directory_flags |= 2;
+	}
+
+}
+
 #ifdef _MSC_VER
 #ifdef _DEBUG
 #include <crtdbg.h>
@@ -4091,6 +4112,7 @@ int wmain(int argc, const wchar_t **wargv)
 #endif
 
 	maybe_redirect_std_handles();
+	adjust_symlink_flags();
 	fsync_object_files = 1;
 
 	/* determine size of argv and environ conversion buffer */

From 3979ab70621b5941c6da8900423942ed69794cb0 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 2 Mar 2020 21:54:29 +0100
Subject: [PATCH 512/553] mingw: emulate stat() a little more faithfully

When creating directories via `safe_create_leading_directories()`, we
might encounter an already-existing directory which is not
readable by the current user. To handle that situation, Git's code calls
`stat()` to determine whether we're looking at a directory.

In such a case, `CreateFile()` will fail, though, no matter what, and
consequently `mingw_stat()` will fail, too. But POSIX semantics seem to
still allow `stat()` to go forward.

So let's call `mingw_lstat()` to the rescue if we fail to get a file
handle due to denied permission in `mingw_stat()`, and fill the stat
info that way.

We need to be careful not to allow this to go forward in case we're
looking at a symbolic link: to resolve the link, we would still have to
create a file handle, and we just found out that we cannot. Therefore,
`stat()` still needs to fail with `EACCES` in that case.

This fixes https://github.com/git-for-windows/git/issues/2531.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index cfe09fa09fa7ee..064ff581f25f22 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -1236,7 +1236,19 @@ int mingw_stat(const char *file_name, struct stat *buf)
 			FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL,
 			OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
 	if (hnd == INVALID_HANDLE_VALUE) {
-		errno = err_win_to_posix(GetLastError());
+		DWORD err = GetLastError();
+
+		if (err == ERROR_ACCESS_DENIED &&
+		    !mingw_lstat(file_name, buf) &&
+		    !S_ISLNK(buf->st_mode))
+			/*
+			 * POSIX semantics state to still try to fill
+			 * information, even if permission is denied to create
+			 * a file handle.
+			 */
+			return 0;
+
+		errno = err_win_to_posix(err);
 		return -1;
 	}
 	result = get_file_info_by_handle(hnd, buf);

From 5e432e7d54e1ab1d5ffb88fe94c306ddb9ea98bc Mon Sep 17 00:00:00 2001
From: JiSeop Moon <zcube@zcube.kr>
Date: Mon, 23 Apr 2018 22:30:18 +0900
Subject: [PATCH 513/553] mingw: introduce code to detect whether we're inside
 a Windows container

This will come in handy in the next commit.

Signed-off-by: JiSeop Moon <zcube@zcube.kr>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 32 ++++++++++++++++++++++++++++++++
 compat/mingw.h |  5 +++++
 2 files changed, 37 insertions(+)

diff --git a/compat/mingw.c b/compat/mingw.c
index 064ff581f25f22..298931f95b7056 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -4216,3 +4216,35 @@ int mingw_have_unix_sockets(void)
 	return ret;
 }
 #endif
+
+/*
+ * Based on https://stackoverflow.com/questions/43002803
+ *
+ * [HKLM\SYSTEM\CurrentControlSet\Services\cexecsvc]
+ * "DisplayName"="@%systemroot%\\system32\\cexecsvc.exe,-100"
+ * "ErrorControl"=dword:00000001
+ * "ImagePath"=hex(2):25,00,73,00,79,00,73,00,74,00,65,00,6d,00,72,00,6f,00,
+ *    6f,00,74,00,25,00,5c,00,73,00,79,00,73,00,74,00,65,00,6d,00,33,00,32,00,
+ *    5c,00,63,00,65,00,78,00,65,00,63,00,73,00,76,00,63,00,2e,00,65,00,78,00,
+ *    65,00,00,00
+ * "Start"=dword:00000002
+ * "Type"=dword:00000010
+ * "Description"="@%systemroot%\\system32\\cexecsvc.exe,-101"
+ * "ObjectName"="LocalSystem"
+ * "ServiceSidType"=dword:00000001
+ */
+int is_inside_windows_container(void)
+{
+	static int inside_container = -1; /* -1 uninitialized */
+	const char *key = "SYSTEM\\CurrentControlSet\\Services\\cexecsvc";
+	HKEY handle = NULL;
+
+	if (inside_container != -1)
+		return inside_container;
+
+	inside_container = ERROR_SUCCESS ==
+		RegOpenKeyExA(HKEY_LOCAL_MACHINE, key, 0, KEY_READ, &handle);
+	RegCloseKey(handle);
+
+	return inside_container;
+}
diff --git a/compat/mingw.h b/compat/mingw.h
index ad1166b775322a..807ee7b7e2e573 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -288,3 +288,8 @@ int mingw_have_unix_sockets(void);
 #undef have_unix_sockets
 #define have_unix_sockets mingw_have_unix_sockets
 #endif
+
+/*
+ * Check whether the current process is running inside a Windows container.
+ */
+int is_inside_windows_container(void);

From 3691e9ef904416c949967ed424de6cd617ec37ae Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Thu, 4 Jun 2020 23:16:07 +0200
Subject: [PATCH 514/553] mingw: special-case index entries for symlinks with
 buggy size

In https://github.com/git-for-windows/git/pull/2637, we fixed a bug
where symbolic links' target path sizes were recorded incorrectly in the
index. The downside of this fix was that every user with tracked
symbolic links in their checkouts would see them as modified in `git
status`, but not in `git diff`, and only a `git add <path>` (or `git add
-u`) would "fix" this.

Let's do better than that: we can detect that situation and simply
pretend that a symbolic link with a known bad size (or a size that just
happens to be that bad size, a _very_ unlikely scenario because it would
overflow our buffers due to the trailing NUL byte) means that it needs
to be re-checked as if we had just checked it out.
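
As a rough, simplified model of that decision (not the code in this
patch): `struct entry`, `trust_stat_change()` and the bit value chosen
for `DATA_CHANGED` are hypothetical, and the bad size of 4096 is taken
from the code comment in the diff.

```c
#include <assert.h>
#include <stdint.h>

#define DATA_CHANGED        0x2u  /* hypothetical bit value */
#define BUGGY_SYMLINK_SIZE  4096u /* size recorded by the old bug */

/* hypothetical, flattened view of an index entry */
struct entry {
	int is_symlink;
	int is_gitlink;
	uint32_t recorded_size;
};

/*
 * Returns `changed` when the stat data alone is enough to call the
 * entry modified, or 0 when the content has to be compared instead.
 */
static unsigned trust_stat_change(unsigned changed, const struct entry *e)
{
	assert(e);
	if (!(changed & DATA_CHANGED))
		return 0;
	/* a symlink with the known bad size must be re-checked */
	if (e->is_symlink && e->recorded_size == BUGGY_SYMLINK_SIZE)
		return 0;
	if (e->is_gitlink || e->recorded_size != 0)
		return changed;
	return 0;
}
```

Only symbolic links are affected: a regular file whose recorded size
happens to be 4096 is still reported as changed right away.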

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 read-cache.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 1719023d24feaf..2fbe1f968469ec 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -470,6 +470,17 @@ int ie_modified(struct index_state *istate,
 	 * then we know it is.
 	 */
 	if ((changed & DATA_CHANGED) &&
+#ifdef GIT_WINDOWS_NATIVE
+	    /*
+	     * Work around Git for Windows v2.27.0 fixing a bug where symlinks'
+	     * target path lengths were not read at all, and instead recorded
+	     * as 4096: now, all symlinks would appear as modified.
+	     *
+	     * So let's just special-case symlinks with a target path length
+	     * (i.e. `sd_size`) of 4096 and force them to be re-checked.
+	     */
+	    (!S_ISLNK(st->st_mode) || ce->ce_stat_data.sd_size != MAX_LONG_PATH) &&
+#endif
 	    (S_ISGITLINK(ce->ce_mode) || ce->ce_stat_data.sd_size != 0))
 		return changed;
 

From ab106afa26fe4ad0b03c6c047a9ea699a6d6d59d Mon Sep 17 00:00:00 2001
From: Bert Belder <bertbelder@gmail.com>
Date: Fri, 26 Oct 2018 11:13:45 +0200
Subject: [PATCH 515/553] Win32: symlink: move phantom symlink creation to a
 separate function

Signed-off-by: Bert Belder <bertbelder@gmail.com>
---
 compat/mingw.c | 91 +++++++++++++++++++++++++++-----------------------
 1 file changed, 49 insertions(+), 42 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 064ff581f25f22..89bbc3f35e3365 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -454,6 +454,54 @@ static void process_phantom_symlinks(void)
 	LeaveCriticalSection(&phantom_symlinks_cs);
 }
 
+static int create_phantom_symlink(wchar_t *wtarget, wchar_t *wlink)
+{
+	int len;
+
+	/* create file symlink */
+	if (!CreateSymbolicLinkW(wlink, wtarget, symlink_file_flags)) {
+		errno = err_win_to_posix(GetLastError());
+		return -1;
+	}
+
+	/* convert to directory symlink if target exists */
+	switch (process_phantom_symlink(wtarget, wlink)) {
+	case PHANTOM_SYMLINK_RETRY: {
+		/* if target doesn't exist, add to phantom symlinks list */
+		wchar_t wfullpath[MAX_LONG_PATH];
+		struct phantom_symlink_info *psi;
+
+		/* convert to absolute path to be independent of cwd */
+		len = GetFullPathNameW(wlink, MAX_LONG_PATH, wfullpath, NULL);
+		if (!len || len >= MAX_LONG_PATH) {
+			errno = err_win_to_posix(GetLastError());
+			return -1;
+		}
+
+		/* over-allocate and fill phantom_symlink_info structure */
+		psi = xmalloc(sizeof(struct phantom_symlink_info) +
+			      sizeof(wchar_t) * (len + wcslen(wtarget) + 2));
+		psi->wlink = (wchar_t *)(psi + 1);
+		wcscpy(psi->wlink, wfullpath);
+		psi->wtarget = psi->wlink + len + 1;
+		wcscpy(psi->wtarget, wtarget);
+
+		EnterCriticalSection(&phantom_symlinks_cs);
+		psi->next = phantom_symlinks;
+		phantom_symlinks = psi;
+		LeaveCriticalSection(&phantom_symlinks_cs);
+		break;
+	}
+	case PHANTOM_SYMLINK_DIRECTORY:
+		/* if we created a dir symlink, process other phantom symlinks */
+		process_phantom_symlinks();
+		break;
+	default:
+		break;
+	}
+	return 0;
+}
+
 /* Normalizes NT paths as returned by some low-level APIs. */
 static wchar_t *normalize_ntpath(wchar_t *wbuf)
 {
@@ -3121,48 +3169,7 @@ int symlink(const char *target, const char *link)
 		if (wtarget[len] == '/')
 			wtarget[len] = '\\';
 
-	/* create file symlink */
-	if (!CreateSymbolicLinkW(wlink, wtarget, symlink_file_flags)) {
-		errno = err_win_to_posix(GetLastError());
-		return -1;
-	}
-
-	/* convert to directory symlink if target exists */
-	switch (process_phantom_symlink(wtarget, wlink)) {
-	case PHANTOM_SYMLINK_RETRY:	{
-		/* if target doesn't exist, add to phantom symlinks list */
-		wchar_t wfullpath[MAX_LONG_PATH];
-		struct phantom_symlink_info *psi;
-
-		/* convert to absolute path to be independent of cwd */
-		len = GetFullPathNameW(wlink, MAX_LONG_PATH, wfullpath, NULL);
-		if (!len || len >= MAX_LONG_PATH) {
-			errno = err_win_to_posix(GetLastError());
-			return -1;
-		}
-
-		/* over-allocate and fill phantom_symlink_info structure */
-		psi = xmalloc(sizeof(struct phantom_symlink_info)
-			+ sizeof(wchar_t) * (len + wcslen(wtarget) + 2));
-		psi->wlink = (wchar_t *)(psi + 1);
-		wcscpy(psi->wlink, wfullpath);
-		psi->wtarget = psi->wlink + len + 1;
-		wcscpy(psi->wtarget, wtarget);
-
-		EnterCriticalSection(&phantom_symlinks_cs);
-		psi->next = phantom_symlinks;
-		phantom_symlinks = psi;
-		LeaveCriticalSection(&phantom_symlinks_cs);
-		break;
-	}
-	case PHANTOM_SYMLINK_DIRECTORY:
-		/* if we created a dir symlink, process other phantom symlinks */
-		process_phantom_symlinks();
-		break;
-	default:
-		break;
-	}
-	return 0;
+	return create_phantom_symlink(wtarget, wlink);
 }
 
 #ifndef _WINNT_H

From ada6887a9d34379f6bf9e7cc1c62fbdcede9ed56 Mon Sep 17 00:00:00 2001
From: JiSeop Moon <zcube@zcube.kr>
Date: Mon, 23 Apr 2018 22:31:42 +0200
Subject: [PATCH 516/553] mingw: when running in a Windows container, try to
 rename() harder

It is a known issue that a rename() can fail with an "Access denied"
error at times, when copying followed by deleting the original file
works. Let's just fall back to that behavior.

Signed-off-by: JiSeop Moon <zcube@zcube.kr>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/compat/mingw.c b/compat/mingw.c
index 298931f95b7056..b5b9dc21391b0c 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -2788,6 +2788,13 @@ int mingw_rename(const char *pold, const char *pnew)
 		gle = GetLastError();
 	}
 
+	if (gle == ERROR_ACCESS_DENIED && is_inside_windows_container()) {
+		/* Fall back to copy to destination & remove source */
+		if (CopyFileW(wpold, wpnew, FALSE) && !mingw_unlink(pold))
+			return 0;
+		gle = GetLastError();
+	}
+
 	/* revert file attributes on failure */
 	if (attrs != INVALID_FILE_ATTRIBUTES)
 		SetFileAttributesW(wpnew, attrs);

From ca8467d4985f2bc086c9d0045e89c47e9701751f Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 11 Feb 2019 14:19:18 +0100
Subject: [PATCH 517/553] Introduce helper to create symlinks that knows about
 index_state

On Windows, symbolic links actually have a type depending on the target:
it can be a file or a directory.

In certain circumstances, this poses problems, e.g. when a symbolic link
is supposed to point into a submodule that is not checked out, so there
is no way for Git to auto-detect the type.

To help with that, we will add support over the course of the next
commits to specify that symlink type via the Git attributes. This
requires an index_state, though, something that Git for Windows'
`symlink()` replacement cannot know about because the function signature
is defined by the POSIX standard and not ours to change.

So let's introduce a helper function to create symbolic links that
*does* know about the index_state.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 apply.c              |  2 +-
 builtin/difftool.c   |  2 +-
 compat/mingw-posix.h |  4 +++-
 compat/mingw.c       |  2 +-
 entry.c              |  2 +-
 git-compat-util.h    | 10 ++++++++++
 refs/files-backend.c |  2 +-
 setup.c              |  4 ++--
 8 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/apply.c b/apply.c
index 3de4aa4d2eaac5..b93e09f5a9ebd3 100644
--- a/apply.c
+++ b/apply.c
@@ -4440,7 +4440,7 @@ static int try_create_file(struct apply_state *state, const char *path,
 		/* Although buf:size is counted string, it also is NUL
 		 * terminated.
 		 */
-		return !!symlink(buf, path);
+		return !!create_symlink(state && state->repo ? state->repo->index : NULL, buf, path);
 
 	fd = open(path, O_CREAT | O_EXCL | O_WRONLY, (mode & 0100) ? 0777 : 0666);
 	if (fd < 0)
diff --git a/builtin/difftool.c b/builtin/difftool.c
index e4bc1f831696a8..8d10e2489f088e 100644
--- a/builtin/difftool.c
+++ b/builtin/difftool.c
@@ -544,7 +544,7 @@ static int run_dir_diff(struct repository *repo,
 				}
 				add_path(&wtdir, wtdir_len, dst_path);
 				if (dt_options->symlinks) {
-					if (symlink(wtdir.buf, rdir.buf)) {
+					if (create_symlink(lstate.istate, wtdir.buf, rdir.buf)) {
 						ret = error_errno("could not symlink '%s' to '%s'", wtdir.buf, rdir.buf);
 						goto finish;
 					}
diff --git a/compat/mingw-posix.h b/compat/mingw-posix.h
index 1a917890b18930..9158f89d89d239 100644
--- a/compat/mingw-posix.h
+++ b/compat/mingw-posix.h
@@ -193,8 +193,10 @@ int setitimer(int type, struct itimerval *in, struct itimerval *out);
 int sigaction(int sig, struct sigaction *in, struct sigaction *out);
 int link(const char *oldpath, const char *newpath);
 int uname(struct utsname *buf);
-int symlink(const char *target, const char *link);
 int readlink(const char *path, char *buf, size_t bufsiz);
+struct index_state;
+int mingw_create_symlink(struct index_state *index, const char *target, const char *link);
+#define create_symlink mingw_create_symlink
 
 /*
  * replacements of existing functions
diff --git a/compat/mingw.c b/compat/mingw.c
index 89bbc3f35e3365..897fb7b940d592 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -3149,7 +3149,7 @@ int link(const char *oldpath, const char *newpath)
 	return 0;
 }
 
-int symlink(const char *target, const char *link)
+int mingw_create_symlink(struct index_state *index UNUSED, const char *target, const char *link)
 {
 	wchar_t wtarget[MAX_LONG_PATH], wlink[MAX_LONG_PATH];
 	int len;
diff --git a/entry.c b/entry.c
index 5ab78ca884b215..b299e3f1071ff6 100644
--- a/entry.c
+++ b/entry.c
@@ -324,7 +324,7 @@ static int write_entry(struct cache_entry *ce, char *path, struct conv_attrs *ca
 		if (!has_symlinks || to_tempfile)
 			goto write_file_entry;
 
-		ret = symlink(new_blob, path);
+		ret = create_symlink(state->istate, new_blob, path);
 		free(new_blob);
 		if (ret)
 			return error_errno("unable to create symlink %s", path);
diff --git a/git-compat-util.h b/git-compat-util.h
index 43bb9791f3d93d..9a566982893866 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -367,6 +367,16 @@ static inline int git_has_dir_sep(const char *path)
 #define is_mount_point is_mount_point_via_stat
 #endif
 
+#ifndef create_symlink
+struct index_state;
+static inline int git_create_symlink(struct index_state *index UNUSED,
+				     const char *target, const char *link)
+{
+	return symlink(target, link);
+}
+#define create_symlink git_create_symlink
+#endif
+
 #ifndef query_user_email
 #define query_user_email() NULL
 #endif
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 6f6f76a8d86dc4..8e98da8e44694c 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2110,7 +2110,7 @@ static int create_ref_symlink(struct ref_lock *lock, const char *target)
 
 	ref_path = get_locked_file_path(&lock->lk);
 	unlink(ref_path);
-	ret = symlink(target, ref_path);
+	ret = create_symlink(NULL, target, ref_path);
 	free(ref_path);
 
 	if (ret)
diff --git a/setup.c b/setup.c
index cdd73b87fee8f8..e98c4c92ccf1d6 100644
--- a/setup.c
+++ b/setup.c
@@ -2224,7 +2224,7 @@ static void copy_templates_1(struct strbuf *path, struct strbuf *template_path,
 			if (strbuf_readlink(&lnk, template_path->buf,
 					    st_template.st_size) < 0)
 				die_errno(_("cannot readlink '%s'"), template_path->buf);
-			if (symlink(lnk.buf, path->buf))
+			if (create_symlink(NULL, lnk.buf, path->buf))
 				die_errno(_("cannot symlink '%s' '%s'"),
 					  lnk.buf, path->buf);
 			strbuf_release(&lnk);
@@ -2485,7 +2485,7 @@ static int create_default_files(const char *template_path,
 		repo_git_path_replace(the_repository, &path, "tXXXXXX");
 		if (!close(xmkstemp(path.buf)) &&
 		    !unlink(path.buf) &&
-		    !symlink("testing", path.buf) &&
+		    !create_symlink(NULL, "testing", path.buf) &&
 		    !lstat(path.buf, &st1) &&
 		    S_ISLNK(st1.st_mode))
 			unlink(path.buf); /* good */

From da65298e88d12c87ec7772c05a438f6754a48e75 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Thu, 20 Jul 2017 22:45:01 +0200
Subject: [PATCH 518/553] mingw: explicitly specify with which cmd to prefix
 the cmdline

The main idea of this patch is that even if we have to look up the
absolute path of the script, if only the basename was specified as
argv[0], then we should use that basename on the command line, too, not
the absolute path.

This patch will also help with the upcoming patch where we automatically
substitute "sh ..." by "busybox sh ..." if "sh" is not in the PATH but
"busybox" is: we will do that by substituting the actual executable, but
still keep prepending "sh" to the command line.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)
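
Note: the effect of turning `prepend_cmd` from a flag into a string can
be sketched in isolation. The helper below is hypothetical (the name
`build_cmdline` and the example paths are not part of compat/mingw.c)
and omits the argument quoting that the real code performs. It only
illustrates that the prefix now comes from the caller instead of being
derived from the resolved executable path.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of compat/mingw.c: join argv into one
 * command line, optionally prefixed by `prepend_cmd`. Quoting is omitted.
 * `outlen` must be at least 1.
 * Returns 0 on success, -1 if `out` is too small.
 */
int build_cmdline(char *out, size_t outlen,
		  const char *prepend_cmd, const char **argv)
{
	size_t used = 0;
	int n;

	out[0] = '\0';
	if (prepend_cmd) {
		/* The prefix is whatever the caller passed, e.g. "sh". */
		n = snprintf(out, outlen, "%s", prepend_cmd);
		if (n < 0 || (size_t)n >= outlen)
			return -1;
		used = (size_t)n;
	}
	for (; *argv; argv++) {
		/* Separate words with a single space, except at the start. */
		n = snprintf(out + used, outlen - used, "%s%s",
			     used ? " " : "", *argv);
		if (n < 0 || (size_t)n >= outlen - used)
			return -1;
		used += (size_t)n;
	}
	return 0;
}
```

Passing "sh" as the prefix keeps the basename on the command line even
when the process is started from an absolute path; passing NULL yields
the plain argv.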

diff --git a/compat/mingw.c b/compat/mingw.c
index ac6954a37b4320..b3af187493d8c6 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -1985,8 +1985,8 @@ static int is_msys2_sh(const char *cmd)
 }
 
 static pid_t mingw_spawnve_fd(const char *cmd, const char **argv, char **deltaenv,
-			      const char *dir,
-			      int prepend_cmd, int fhin, int fhout, int fherr)
+			      const char *dir, const char *prepend_cmd,
+			      int fhin, int fhout, int fherr)
 {
 	STARTUPINFOEXW si;
 	PROCESS_INFORMATION pi;
@@ -2066,9 +2066,9 @@ static pid_t mingw_spawnve_fd(const char *cmd, const char **argv, char **deltaen
 	/* concatenate argv, quoting args as we go */
 	strbuf_init(&args, 0);
 	if (prepend_cmd) {
-		char *quoted = (char *)quote_arg(cmd);
+		char *quoted = (char *)quote_arg(prepend_cmd);
 		strbuf_addstr(&args, quoted);
-		if (quoted != cmd)
+		if (quoted != prepend_cmd)
 			free(quoted);
 	}
 	for (; *argv; argv++) {
@@ -2188,7 +2188,8 @@ static pid_t mingw_spawnve_fd(const char *cmd, const char **argv, char **deltaen
 	return (pid_t)pi.dwProcessId;
 }
 
-static pid_t mingw_spawnv(const char *cmd, const char **argv, int prepend_cmd)
+static pid_t mingw_spawnv(const char *cmd, const char **argv,
+			  const char *prepend_cmd)
 {
 	return mingw_spawnve_fd(cmd, argv, NULL, NULL, prepend_cmd, 0, 1, 2);
 }
@@ -2216,14 +2217,14 @@ pid_t mingw_spawnvpe(const char *cmd, const char **argv, char **deltaenv,
 				pid = -1;
 			}
 			else {
-				pid = mingw_spawnve_fd(iprog, argv, deltaenv, dir, 1,
+				pid = mingw_spawnve_fd(iprog, argv, deltaenv, dir, interpr,
 						       fhin, fhout, fherr);
 				free(iprog);
 			}
 			argv[0] = argv0;
 		}
 		else
-			pid = mingw_spawnve_fd(prog, argv, deltaenv, dir, 0,
+			pid = mingw_spawnve_fd(prog, argv, deltaenv, dir, NULL,
 					       fhin, fhout, fherr);
 		free(prog);
 	}
@@ -2248,7 +2249,7 @@ static int try_shell_exec(const char *cmd, char *const *argv)
 		argv2[0] = (char *)cmd;	/* full path to the script file */
 		COPY_ARRAY(&argv2[1], &argv[1], argc);
 		exec_id = trace2_exec(prog, (const char **)argv2);
-		pid = mingw_spawnv(prog, (const char **)argv2, 1);
+		pid = mingw_spawnv(prog, (const char **)argv2, interpr);
 		if (pid >= 0) {
 			int status;
 			if (waitpid(pid, &status, 0) < 0)
@@ -2272,7 +2273,7 @@ int mingw_execv(const char *cmd, char *const *argv)
 		int exec_id;
 
 		exec_id = trace2_exec(cmd, (const char **)argv);
-		pid = mingw_spawnv(cmd, (const char **)argv, 0);
+		pid = mingw_spawnv(cmd, (const char **)argv, NULL);
 		if (pid < 0) {
 			trace2_exec_result(exec_id, -1);
 			return -1;

From 73c20fa19ae32bc31983d1d2c00f06d74ce45fdf Mon Sep 17 00:00:00 2001
From: JiSeop Moon <zcube@zcube.kr>
Date: Mon, 23 Apr 2018 22:35:26 +0200
Subject: [PATCH 519/553] mingw: move the file_attr_to_st_mode() function
 definition

In preparation for making this function a bit more complicated (to allow
for special-casing the `ContainerMappedDirectories` in Windows
containers, which look like a symbolic link, but are not), let's move it
out of the header.

Signed-off-by: JiSeop Moon <zcube@zcube.kr>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 14 ++++++++++++++
 compat/win32.h | 14 +-------------
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index b5b9dc21391b0c..54e6f51574eb70 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -4255,3 +4255,17 @@ int is_inside_windows_container(void)
 
 	return inside_container;
 }
+
+int file_attr_to_st_mode (DWORD attr, DWORD tag)
+{
+	int fMode = S_IREAD;
+	if ((attr & FILE_ATTRIBUTE_REPARSE_POINT) && tag == IO_REPARSE_TAG_SYMLINK)
+		fMode |= S_IFLNK;
+	else if (attr & FILE_ATTRIBUTE_DIRECTORY)
+		fMode |= S_IFDIR;
+	else
+		fMode |= S_IFREG;
+	if (!(attr & FILE_ATTRIBUTE_READONLY))
+		fMode |= S_IWRITE;
+	return fMode;
+}
diff --git a/compat/win32.h b/compat/win32.h
index 671bcc81f93351..52169ae19f4371 100644
--- a/compat/win32.h
+++ b/compat/win32.h
@@ -6,19 +6,7 @@
 #include <windows.h>
 #endif
 
-static inline int file_attr_to_st_mode (DWORD attr, DWORD tag)
-{
-	int fMode = S_IREAD;
-	if ((attr & FILE_ATTRIBUTE_REPARSE_POINT) && tag == IO_REPARSE_TAG_SYMLINK)
-		fMode |= S_IFLNK;
-	else if (attr & FILE_ATTRIBUTE_DIRECTORY)
-		fMode |= S_IFDIR;
-	else
-		fMode |= S_IFREG;
-	if (!(attr & FILE_ATTRIBUTE_READONLY))
-		fMode |= S_IWRITE;
-	return fMode;
-}
+extern int file_attr_to_st_mode (DWORD attr, DWORD tag);
 
 static inline int get_file_attr(const char *fname, WIN32_FILE_ATTRIBUTE_DATA *fdata)
 {

From c9f4d0c881cedf0ed5451c65b6ebb6c944852b55 Mon Sep 17 00:00:00 2001
From: Bert Belder <bertbelder@gmail.com>
Date: Fri, 26 Oct 2018 11:51:51 +0200
Subject: [PATCH 520/553] mingw: allow to specify the symlink type in
 .gitattributes

On Windows, symbolic links have a type: a "file symlink" must point at
a file, and a "directory symlink" must point at a directory. If the
type of symlink does not match its target, it doesn't work.

Git does not record the type of symlink in the index or in a tree. On
checkout it'll guess the type, which only works if the target exists
at the time the symlink is created. This may often not be the case,
for example when the link points at a directory inside a submodule.

By specifying `symlink=file` or `symlink=dir` the user can specify what
type of symlink Git should create, so Git doesn't have to rely on
unreliable heuristics.

Signed-off-by: Bert Belder <bertbelder@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/gitattributes.adoc | 30 ++++++++++++++++
 compat/mingw.c                   | 60 ++++++++++++++++++++++++++++++--
 2 files changed, 88 insertions(+), 2 deletions(-)
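
Note: the mapping from attribute value to symlink type can be read in
isolation as the sketch below. `parse_symlink_attr` is a hypothetical
stand-alone function; a NULL `value` stands in for an unset attribute,
which the patch itself detects with ATTR_UNSET(). The sketch also shows
that, per the diff, `directory` is accepted as a synonym for `dir`,
although the documentation only mentions `file` and `dir`.

```c
#include <string.h>

enum symlink_type {
	SYMLINK_TYPE_UNSPECIFIED = 0,
	SYMLINK_TYPE_FILE,
	SYMLINK_TYPE_DIRECTORY,
};

/*
 * Hypothetical stand-alone version of the value check; NULL means
 * "attribute not set" in this sketch.
 */
enum symlink_type parse_symlink_attr(const char *value)
{
	if (!value)
		return SYMLINK_TYPE_UNSPECIFIED;
	if (!strcmp(value, "file"))
		return SYMLINK_TYPE_FILE;
	if (!strcmp(value, "dir") || !strcmp(value, "directory"))
		return SYMLINK_TYPE_DIRECTORY;
	/* The patch additionally emits a warning for unknown values. */
	return SYMLINK_TYPE_UNSPECIFIED;
}
```

Any value that maps to SYMLINK_TYPE_UNSPECIFIED falls back to the
phantom-symlink heuristic.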

diff --git a/Documentation/gitattributes.adoc b/Documentation/gitattributes.adoc
index f20041a323d174..7794bf0fd98dad 100644
--- a/Documentation/gitattributes.adoc
+++ b/Documentation/gitattributes.adoc
@@ -403,6 +403,36 @@ sign `$` upon checkout.  Any byte sequence that begins with
 with `$Id$` upon check-in.
 
 
+`symlink`
+^^^^^^^^^
+
+On Windows, symbolic links have a type: a "file symlink" must point at
+a file, and a "directory symlink" must point at a directory. If the
+type of symlink does not match its target, it doesn't work.
+
+Git does not record the type of symlink in the index or in a tree. On
+checkout it'll guess the type, which only works if the target exists
+at the time the symlink is created. This may often not be the case,
+for example when the link points at a directory inside a submodule.
+
+The `symlink` attribute allows you to explicitly set the type of symlink
+to `file` or `dir`, so Git doesn't have to guess. If you have a set of
+symlinks that point at other files, you can do:
+
+------------------------
+*.gif 	symlink=file
+------------------------
+
+To tell Git that a symlink points at a directory, use:
+
+------------------------
+tools_folder 	symlink=dir
+------------------------
+
+The `symlink` attribute is ignored on platforms other than Windows,
+since they don't distinguish between different types of symlinks.
+
+
 `filter`
 ^^^^^^^^
 
diff --git a/compat/mingw.c b/compat/mingw.c
index 897fb7b940d592..ac6954a37b4320 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -4,6 +4,7 @@
 #include "git-compat-util.h"
 #include "abspath.h"
 #include "alloc.h"
+#include "attr.h"
 #include "config.h"
 #include "dir.h"
 #include "environment.h"
@@ -3149,7 +3150,38 @@ int link(const char *oldpath, const char *newpath)
 	return 0;
 }
 
-int mingw_create_symlink(struct index_state *index UNUSED, const char *target, const char *link)
+enum symlink_type {
+	SYMLINK_TYPE_UNSPECIFIED = 0,
+	SYMLINK_TYPE_FILE,
+	SYMLINK_TYPE_DIRECTORY,
+};
+
+static enum symlink_type check_symlink_attr(struct index_state *index, const char *link)
+{
+	static struct attr_check *check;
+	const char *value;
+
+	if (!index)
+		return SYMLINK_TYPE_UNSPECIFIED;
+
+	if (!check)
+		check = attr_check_initl("symlink", NULL);
+
+	git_check_attr(index, link, check);
+
+	value = check->items[0].value;
+	if (ATTR_UNSET(value))
+		return SYMLINK_TYPE_UNSPECIFIED;
+	if (!strcmp(value, "file"))
+		return SYMLINK_TYPE_FILE;
+	if (!strcmp(value, "dir") || !strcmp(value, "directory"))
+		return SYMLINK_TYPE_DIRECTORY;
+
+	warning(_("ignoring invalid symlink type '%s' for '%s'"), value, link);
+	return SYMLINK_TYPE_UNSPECIFIED;
+}
+
+int mingw_create_symlink(struct index_state *index, const char *target, const char *link)
 {
 	wchar_t wtarget[MAX_LONG_PATH], wlink[MAX_LONG_PATH];
 	int len;
@@ -3169,7 +3201,31 @@ int mingw_create_symlink(struct index_state *index UNUSED, const char *target, c
 		if (wtarget[len] == '/')
 			wtarget[len] = '\\';
 
-	return create_phantom_symlink(wtarget, wlink);
+	switch (check_symlink_attr(index, link)) {
+	case SYMLINK_TYPE_UNSPECIFIED:
+		/* Create a phantom symlink: it is initially created as a file
+		 * symlink, but may change to a directory symlink later if/when
+		 * the target exists. */
+		return create_phantom_symlink(wtarget, wlink);
+	case SYMLINK_TYPE_FILE:
+		if (!CreateSymbolicLinkW(wlink, wtarget, symlink_file_flags))
+			break;
+		return 0;
+	case SYMLINK_TYPE_DIRECTORY:
+		if (!CreateSymbolicLinkW(wlink, wtarget,
+					 symlink_directory_flags))
+			break;
+		/* There may be dangling phantom symlinks that point at this
+		 * one, which should now morph into directory symlinks. */
+		process_phantom_symlinks();
+		return 0;
+	default:
+		BUG("unhandled symlink type");
+	}
+
+	/* CreateSymbolicLinkW failed. */
+	errno = err_win_to_posix(GetLastError());
+	return -1;
 }
 
 #ifndef _WINNT_H

From 4c4ce15eb34ce715c2db3c7ddb080b6523c8a808 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Thu, 20 Jul 2017 20:41:29 +0200
Subject: [PATCH 521/553] mingw: when path_lookup() failed, try BusyBox

BusyBox comes with a ton of applets ("applet" being the same concept as
Git's "builtins"). And similar to Git's builtins, the applets
can be called via `busybox <command>`, or the BusyBox executable can be
copied/hard-linked to the command name.

The similarities do not end here. Just as with Git's builtins, it is
problematic that BusyBox' hard-linked applets cannot easily be put into
a .zip file: .zip archives have no concept of hard-links and therefore
would store identical copies (and also extract identical copies,
"inflating" the archive unnecessarily).

To counteract that issue, MinGit already ships without hard-linked
copies of the builtins, and the plan is to do the same with BusyBox'
applets: simply ship busybox.exe as single executable, without
hard-linked applets.

To accommodate that, Git is being taught by this commit a very special
trick, exploiting the fact that it is possible to call an executable
with a command-line whose argv[0] is different from the executable's
name: when `sh` is to be spawned, and no `sh` is found in the PATH, but
busybox.exe is, use that executable (with unchanged argv).

Likewise, if any executable to be spawned is not on the PATH, but
busybox.exe is found, parse the output of `busybox.exe --help` to find
out what applets are included, and if the command matches an included
applet name, use busybox.exe to execute it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c   | 63 ++++++++++++++++++++++++++++++++++++++++++++++++
 t/t0014-alias.sh |  2 +-
 2 files changed, 64 insertions(+), 1 deletion(-)
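
Note: the applet detection relies on the list that `busybox --help`
prints after the "Currently defined functions:" header. A minimal
sketch of that parsing, assuming the same separators as in the diff
(newline, tab, space, comma), could look as follows. The function name
`help_lists_applet` is hypothetical, and unlike the patch this version
does not cache the list or modify the buffer.

```c
#include <string.h>

/*
 * Hypothetical helper: return 1 if `cmd` is listed after the
 * "Currently defined functions:" header in `help`, else 0.
 */
int help_lists_applet(const char *help, const char *cmd)
{
	static const char header[] = "Currently defined functions:\n";
	static const char seps[] = "\n\t ,";
	const char *p = strstr(help, header);
	size_t cmdlen = strlen(cmd);

	if (!p)
		return 0;
	p += strlen(header);
	for (;;) {
		size_t len;

		p += strspn(p, seps);	/* skip separators */
		len = strcspn(p, seps);	/* length of the next name */
		if (!len)
			return 0;	/* end of the list */
		if (len == cmdlen && !strncmp(p, cmd, len))
			return 1;
		p += len;
	}
}
```

Comparing lengths before comparing bytes avoids treating a prefix such
as "as" as a match for "ash".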

diff --git a/compat/mingw.c b/compat/mingw.c
index b3af187493d8c6..428bae765ed942 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -12,6 +12,7 @@
 #include "repository.h"
 #include "run-command.h"
 #include "strbuf.h"
+#include "string-list.h"
 #include "symlinks.h"
 #include "trace2.h"
 #include "win32.h"
@@ -1754,6 +1755,65 @@ static char *lookup_prog(const char *dir, int dirlen, const char *cmd,
 	return NULL;
 }
 
+static char *path_lookup(const char *cmd, int exe_only);
+
+static char *is_busybox_applet(const char *cmd)
+{
+	static struct string_list applets = STRING_LIST_INIT_DUP;
+	static char *busybox_path;
+	static int busybox_path_initialized;
+
+	/* Avoid infinite loop */
+	if (!strncasecmp(cmd, "busybox", 7) &&
+	    (!cmd[7] || !strcasecmp(cmd + 7, ".exe")))
+		return NULL;
+
+	if (!busybox_path_initialized) {
+		busybox_path = path_lookup("busybox.exe", 1);
+		busybox_path_initialized = 1;
+	}
+
+	/* Assume that sh is compiled in... */
+	if (!busybox_path || !strcasecmp(cmd, "sh"))
+		return xstrdup_or_null(busybox_path);
+
+	if (!applets.nr) {
+		struct child_process cp = CHILD_PROCESS_INIT;
+		struct strbuf buf = STRBUF_INIT;
+		char *p;
+
+		strvec_pushl(&cp.args, busybox_path, "--help", NULL);
+
+		if (capture_command(&cp, &buf, 2048)) {
+			string_list_append(&applets, "");
+			return NULL;
+		}
+
+		/* parse output */
+		p = strstr(buf.buf, "Currently defined functions:\n");
+		if (!p) {
+			warning("Could not parse output of busybox --help");
+			string_list_append(&applets, "");
+			return NULL;
+		}
+		p = strchrnul(p, '\n');
+		for (;;) {
+			size_t len;
+
+			p += strspn(p, "\n\t ,");
+			len = strcspn(p, "\n\t ,");
+			if (!len)
+				break;
+			p[len] = '\0';
+			string_list_insert(&applets, p);
+			p = p + len + 1;
+		}
+	}
+
+	return string_list_has_string(&applets, cmd) ?
+		xstrdup(busybox_path) : NULL;
+}
+
 /*
  * Determines the absolute path of cmd using the split path in path.
  * If cmd contains a slash or backslash, no lookup is performed.
@@ -1782,6 +1842,9 @@ static char *path_lookup(const char *cmd, int exe_only)
 		path = sep + 1;
 	}
 
+	if (!prog && !isexe)
+		prog = is_busybox_applet(cmd);
+
 	return prog;
 }
 
diff --git a/t/t0014-alias.sh b/t/t0014-alias.sh
index 62b4d81db875ca..ee0f0a54b6623f 100755
--- a/t/t0014-alias.sh
+++ b/t/t0014-alias.sh
@@ -53,7 +53,7 @@ test_expect_success 'looping aliases - deprecated builtins' '
 
 test_expect_success 'run-command formats empty args properly' '
 	test_must_fail env GIT_TRACE=1 git frotz a "" b " " c 2>actual.raw &&
-	sed -ne "/run_command:/s/.*trace: run_command: //p" actual.raw >actual &&
+	sed -ne "/run_command: git-frotz/s/.*trace: run_command: //p" actual.raw >actual &&
 	echo "git-frotz a '\'''\'' b '\'' '\'' c" >expect &&
 	test_cmp expect actual
 '

From 845bfa70830cac7ad1241cd6b9b633448004396d Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 23 Apr 2018 23:20:00 +0200
Subject: [PATCH 522/553] mingw: Windows Docker volumes are *not* symbolic
 links

... even if they may look like them.

As looking up the target of the "symbolic link" (just to see whether it
starts with `/ContainerMappedDirectories/`) is pretty expensive, we
do it only when we can be *really* sure that there is a possibility
that this might be the case.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: JiSeop Moon <zcube@zcube.kr>
---
 compat/mingw.c         | 25 +++++++++++++++++++------
 compat/win32.h         |  2 +-
 compat/win32/fscache.c | 24 +++++++++++++++++++++++-
 3 files changed, 43 insertions(+), 8 deletions(-)
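
Note: the `> 27` in the diff is tied to the length of the prefix
`/ContainerMappedDirectories/`, which is 28 bytes. The sketch below
spells that relationship out. `is_mapped_volume_target` and
`MAPPED_PREFIX` are hypothetical names, and the target is passed with
an explicit length so that the buffer need not be NUL-terminated.

```c
#include <string.h>

#define MAPPED_PREFIX "/ContainerMappedDirectories/"

/*
 * Hypothetical helper: does a link target of `len` bytes point into a
 * container-mapped volume?
 */
int is_mapped_volume_target(const char *target, size_t len)
{
	size_t prefix_len = strlen(MAPPED_PREFIX); /* 28 bytes */

	/* `len >= 28` is the same condition as the diff's `> 27`. */
	return len >= prefix_len &&
	       !strncmp(target, MAPPED_PREFIX, prefix_len);
}
```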

diff --git a/compat/mingw.c b/compat/mingw.c
index 54e6f51574eb70..825d6486f6a204 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -1162,7 +1162,7 @@ int mingw_lstat(const char *file_name, struct stat *buf)
 		buf->st_uid = 0;
 		buf->st_nlink = 1;
 		buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes,
-				reparse_tag);
+				reparse_tag, file_name);
 		buf->st_size = S_ISLNK(buf->st_mode) ? link_len :
 			fdata.nFileSizeLow | (((off_t) fdata.nFileSizeHigh) << 32);
 		buf->st_dev = buf->st_rdev = 0; /* not used by Git */
@@ -1213,7 +1213,7 @@ static int get_file_info_by_handle(HANDLE hnd, struct stat *buf)
 	buf->st_gid = 0;
 	buf->st_uid = 0;
 	buf->st_nlink = 1;
-	buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes, 0);
+	buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes, 0, NULL);
 	buf->st_size = fdata.nFileSizeLow |
 		(((off_t)fdata.nFileSizeHigh)<<32);
 	buf->st_dev = buf->st_rdev = 0; /* not used by Git */
@@ -4256,12 +4256,25 @@ int is_inside_windows_container(void)
 	return inside_container;
 }
 
-int file_attr_to_st_mode (DWORD attr, DWORD tag)
+int file_attr_to_st_mode (DWORD attr, DWORD tag, const char *path)
 {
 	int fMode = S_IREAD;
-	if ((attr & FILE_ATTRIBUTE_REPARSE_POINT) && tag == IO_REPARSE_TAG_SYMLINK)
-		fMode |= S_IFLNK;
-	else if (attr & FILE_ATTRIBUTE_DIRECTORY)
+	if ((attr & FILE_ATTRIBUTE_REPARSE_POINT) &&
+	    tag == IO_REPARSE_TAG_SYMLINK) {
+		int flag = S_IFLNK;
+		char buf[MAX_LONG_PATH];
+
+		/*
+		 * Windows containers' mapped volumes are marked as reparse
+		 * points and look like symbolic links, but they are not.
+		 */
+		if (path && is_inside_windows_container() &&
+		    readlink(path, buf, sizeof(buf)) > 27 &&
+		    starts_with(buf, "/ContainerMappedDirectories/"))
+			flag = S_IFDIR;
+
+		fMode |= flag;
+	} else if (attr & FILE_ATTRIBUTE_DIRECTORY)
 		fMode |= S_IFDIR;
 	else
 		fMode |= S_IFREG;
diff --git a/compat/win32.h b/compat/win32.h
index 52169ae19f4371..299f01bdf0f5a4 100644
--- a/compat/win32.h
+++ b/compat/win32.h
@@ -6,7 +6,7 @@
 #include <windows.h>
 #endif
 
-extern int file_attr_to_st_mode (DWORD attr, DWORD tag);
+extern int file_attr_to_st_mode (DWORD attr, DWORD tag, const char *path);
 
 static inline int get_file_attr(const char *fname, WIN32_FILE_ATTRIBUTE_DATA *fdata)
 {
diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index 0f5e00ae18f949..3f9a70e15df853 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -207,8 +207,30 @@ static struct fsentry *fseentry_create_entry(struct fscache *cache,
 		fdata->FileAttributes & FILE_ATTRIBUTE_REPARSE_POINT ?
 		fdata->EaSize : 0;
 
+	/*
+	 * On certain Windows versions, host directories mapped into
+	 * Windows Containers ("Volumes", see https://docs.docker.com/storage/volumes/)
+	 * look like symbolic links, but their targets are paths that
+	 * are valid only in kernel mode.
+	 *
+	 * Let's work around this by detecting that situation and
+	 * telling Git that these are *not* symbolic links.
+	 */
+	if (fse->reparse_tag == IO_REPARSE_TAG_SYMLINK &&
+	    sizeof(buf) > (size_t)(list ? list->len + 1 : 0) + fse->len + 1 &&
+	    is_inside_windows_container()) {
+		size_t off = 0;
+		if (list) {
+			memcpy(buf, list->dirent.d_name, list->len);
+			buf[list->len] = '/';
+			off = list->len + 1;
+		}
+		memcpy(buf + off, fse->dirent.d_name, fse->len);
+		buf[off + fse->len] = '\0';
+	}
+
 	fse->st_mode = file_attr_to_st_mode(fdata->FileAttributes,
-					    fdata->EaSize);
+					    fdata->EaSize, buf);
 	fse->dirent.d_type = S_ISREG(fse->st_mode) ? DT_REG :
 			S_ISDIR(fse->st_mode) ? DT_DIR : DT_LNK;
 	fse->u.s.st_size = S_ISLNK(fse->st_mode) ? MAX_LONG_PATH :

From 0e9a2a2722572a0a67e22b34d7450d94c88dbda6 Mon Sep 17 00:00:00 2001
From: David Lomas <dl3@pale-eds.co.uk>
Date: Fri, 28 Jul 2023 15:20:43 +0100
Subject: [PATCH 523/553] mingw: work around rename() failing on a read-only
 file

At least on _some_ APFS network shares, Git fails to rename object
files that are marked as read-only: the read-only attribute has the
effect of setting the uchg flag on APFS, which means the file can be
neither renamed nor deleted.

To work around that, when a rename failed and the read-only flag is
set, clear the flag, retry the rename, and then restore the flag.

This fixes https://github.com/git-for-windows/git/issues/4482

Signed-off-by: David Lomas <dl3@pale-eds.co.uk>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
---
 compat/mingw.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)
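
Note: the control flow of the workaround (clear the flag, retry,
restore the flag whether or not the retry worked) can be illustrated
with a toy model. Everything below is hypothetical: `struct fake_file`
is an in-memory stand-in for a file whose read-only flag blocks
renames, and no Win32 call is involved.

```c
#include <stdbool.h>

/* Hypothetical in-memory stand-in for a file on the affected share. */
struct fake_file {
	bool read_only;
	bool renamed;
};

/* Fails while the read-only flag is set, like the share in the report. */
static bool try_rename(struct fake_file *f)
{
	if (f->read_only)
		return false;
	f->renamed = true;
	return true;
}

/* Clear the flag, retry once, then restore the original flag. */
bool rename_with_readonly_retry(struct fake_file *f)
{
	bool was_read_only = f->read_only;
	bool ok;

	if (try_rename(f))
		return true;
	if (!was_read_only)
		return false;	/* some other failure; nothing to retry */

	f->read_only = false;
	ok = try_rename(f);
	f->read_only = was_read_only;	/* restore on success and failure */
	return ok;
}
```

In the patch, the restored attributes go to the new path on success and
to the old path on failure; the toy model collapses both into a single
struct.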

diff --git a/compat/mingw.c b/compat/mingw.c
index 825d6486f6a204..6e3bc9fa700f5e 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -2696,7 +2696,7 @@ int mingw_accept(int sockfd1, struct sockaddr *sa, socklen_t *sz)
 int mingw_rename(const char *pold, const char *pnew)
 {
 	static int supports_file_rename_info_ex = 1;
-	DWORD attrs = INVALID_FILE_ATTRIBUTES, gle;
+	DWORD attrs = INVALID_FILE_ATTRIBUTES, gle, attrsold;
 	int tries = 0;
 	wchar_t wpold[MAX_LONG_PATH], wpnew[MAX_LONG_PATH];
 	int wpnew_len;
@@ -2788,11 +2788,24 @@ int mingw_rename(const char *pold, const char *pnew)
 		gle = GetLastError();
 	}
 
-	if (gle == ERROR_ACCESS_DENIED && is_inside_windows_container()) {
-		/* Fall back to copy to destination & remove source */
-		if (CopyFileW(wpold, wpnew, FALSE) && !mingw_unlink(pold))
-			return 0;
-		gle = GetLastError();
+	if (gle == ERROR_ACCESS_DENIED) {
+		if (is_inside_windows_container()) {
+			/* Fall back to copy to destination & remove source */
+			if (CopyFileW(wpold, wpnew, FALSE) && !mingw_unlink(pold, 1))
+				return 0;
+			gle = GetLastError();
+		} else if ((attrsold = GetFileAttributesW(wpold)) & FILE_ATTRIBUTE_READONLY) {
+			/* if file is read-only, change and retry */
+			SetFileAttributesW(wpold, attrsold & ~FILE_ATTRIBUTE_READONLY);
+			if (MoveFileExW(wpold, wpnew,
+					MOVEFILE_REPLACE_EXISTING | MOVEFILE_COPY_ALLOWED)) {
+				SetFileAttributesW(wpnew, attrsold);
+				return 0;
+			}
+			gle = GetLastError();
+			/* revert attribute change on failure */
+			SetFileAttributesW(wpold, attrsold);
+		}
 	}
 
 	/* revert file attributes on failure */

From 198b593670a518efa0f65a4add53f4683c40e141 Mon Sep 17 00:00:00 2001
From: Bert Belder <bertbelder@gmail.com>
Date: Fri, 26 Oct 2018 23:42:09 +0200
Subject: [PATCH 524/553] Win32: symlink: add test for `symlink` attribute

To verify that the symlink is resolved correctly, we use the fact that
`git.exe` is a native Win32 program, and that `git.exe config -f <path>`
therefore uses the native symlink resolution.

Signed-off-by: Bert Belder <bertbelder@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/meson.build                    |  1 +
 t/t2040-checkout-symlink-attr.sh | 46 ++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+)
 create mode 100755 t/t2040-checkout-symlink-attr.sh

diff --git a/t/meson.build b/t/meson.build
index 4dd9ca9f303733..f4277cc5784f95 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -271,6 +271,7 @@ integration_tests = [
   't2027-checkout-track.sh',
   't2030-unresolve-info.sh',
   't2031-checkout-long-paths.sh',
+  't2040-checkout-symlink-attr.sh',
   't2050-git-dir-relative.sh',
   't2060-switch.sh',
   't2070-restore.sh',
diff --git a/t/t2040-checkout-symlink-attr.sh b/t/t2040-checkout-symlink-attr.sh
new file mode 100755
index 00000000000000..e00c31d096ce88
--- /dev/null
+++ b/t/t2040-checkout-symlink-attr.sh
@@ -0,0 +1,46 @@
+#!/bin/sh
+
+test_description='checkout symlinks with `symlink` attribute on Windows
+
+Ensures that Git for Windows creates symlinks of the right type,
+as specified by the `symlink` attribute in `.gitattributes`.'
+
+# Tell MSYS to create native symlinks. Without this flag test-lib's
+# prerequisite detection for SYMLINKS doesn't detect the right thing.
+MSYS=winsymlinks:nativestrict && export MSYS
+
+. ./test-lib.sh
+
+if ! test_have_prereq MINGW,SYMLINKS
+then
+	skip_all='skipping $0: MinGW-only test, which requires symlink support.'
+	test_done
+fi
+
+# Adds a symlink to the index without clobbering the work tree.
+cache_symlink () {
+	sha=$(printf '%s' "$1" | git hash-object --stdin -w) &&
+	git update-index --add --cacheinfo 120000,$sha,"$2"
+}
+
+test_expect_success 'checkout symlinks with attr' '
+	cache_symlink file1 file-link &&
+	cache_symlink dir dir-link &&
+
+	printf "file-link symlink=file\ndir-link symlink=dir\n" >.gitattributes &&
+	git add .gitattributes &&
+
+	git checkout . &&
+
+	mkdir dir &&
+	echo "[a]b=c" >file1 &&
+	echo "[x]y=z" >dir/file2 &&
+
+	# MSYS2 is very forgiving, it will resolve symlinks even if the
+	# symlink type is incorrect. To make this test meaningful, try
+	# them with a native, non-MSYS executable, such as `git config`.
+	test "$(git config -f file-link a.b)" = "c" &&
+	test "$(git config -f dir-link/file2 x.y)" = "z"
+'
+
+test_done

From fc0a47ffeaf532d5ec46e45558d602cf15086a8c Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Thu, 20 Jul 2017 22:18:56 +0200
Subject: [PATCH 525/553] test-tool: learn to act as a drop-in replacement for
 `iconv`

It is convenient to assume that everybody who wants to build & test Git
has access to a working `iconv` executable (after all, we already pretty
much require libiconv).

However, that limits esoteric test scenarios such as Git for Windows',
where an end user installation has to ship with `iconv` for the sole
purpose of being testable. That payload serves no other purpose.

So let's just have a test helper (to be able to test Git, the test
helpers have to be available, after all) to act as `iconv` replacement.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Makefile              |  1 +
 t/helper/meson.build  |  1 +
 t/helper/test-iconv.c | 47 +++++++++++++++++++++++++++++++++++++++++++
 t/helper/test-tool.c  |  1 +
 t/helper/test-tool.h  |  1 +
 5 files changed, 51 insertions(+)
 create mode 100644 t/helper/test-iconv.c
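
Note: the helper in this patch converts via Git's internal
reencode_string_len(). For readers who want to see the same kind of
conversion outside of Git, here is a minimal sketch that uses the POSIX
iconv(3) interface directly instead. `convert_buffer` is a hypothetical
name, and the set of supported encoding names depends on the platform's
iconv implementation.

```c
#include <iconv.h>
#include <stddef.h>

/*
 * Convert `inlen` bytes of `in` from encoding `from` to encoding `to`,
 * writing to `out`. Returns the number of bytes written, or (size_t)-1
 * on error (unknown encoding, invalid input, or `out` too small).
 */
size_t convert_buffer(const char *from, const char *to,
		      const char *in, size_t inlen,
		      char *out, size_t outlen)
{
	iconv_t cd = iconv_open(to, from);
	char *inp = (char *)in;	/* iconv() takes a non-const pointer */
	char *outp = out;
	size_t inleft = inlen, outleft = outlen;
	size_t rc;

	if (cd == (iconv_t)-1)
		return (size_t)-1;
	rc = iconv(cd, &inp, &inleft, &outp, &outleft);
	if (rc != (size_t)-1)
		/* Flush any pending shift state of stateful encodings. */
		rc = iconv(cd, NULL, NULL, &outp, &outleft);
	iconv_close(cd);
	return rc == (size_t)-1 ? (size_t)-1 : outlen - outleft;
}
```

Note the argument order of iconv_open(): the target encoding comes
first, mirroring the `to, from` order in the call to
reencode_string_len() in the diff.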

diff --git a/Makefile b/Makefile
index f122f9198f1029..f67c4c60da399a 100644
--- a/Makefile
+++ b/Makefile
@@ -834,6 +834,7 @@ TEST_BUILTINS_OBJS += test-hash-speed.o
 TEST_BUILTINS_OBJS += test-hash.o
 TEST_BUILTINS_OBJS += test-hashmap.o
 TEST_BUILTINS_OBJS += test-hexdump.o
+TEST_BUILTINS_OBJS += test-iconv.o
 TEST_BUILTINS_OBJS += test-json-writer.o
 TEST_BUILTINS_OBJS += test-lazy-init-name-hash.o
 TEST_BUILTINS_OBJS += test-match-trees.o
diff --git a/t/helper/meson.build b/t/helper/meson.build
index 675e64c0101b61..cba4a9bf4f1434 100644
--- a/t/helper/meson.build
+++ b/t/helper/meson.build
@@ -29,6 +29,7 @@ test_tool_sources = [
   'test-hash.c',
   'test-hashmap.c',
   'test-hexdump.c',
+  'test-iconv.c',
   'test-json-writer.c',
   'test-lazy-init-name-hash.c',
   'test-match-trees.c',
diff --git a/t/helper/test-iconv.c b/t/helper/test-iconv.c
new file mode 100644
index 00000000000000..d3c772fddf990b
--- /dev/null
+++ b/t/helper/test-iconv.c
@@ -0,0 +1,47 @@
+#include "test-tool.h"
+#include "git-compat-util.h"
+#include "strbuf.h"
+#include "gettext.h"
+#include "parse-options.h"
+#include "utf8.h"
+
+int cmd__iconv(int argc, const char **argv)
+{
+	struct strbuf buf = STRBUF_INIT;
+	char *from = NULL, *to = NULL, *p;
+	size_t len;
+	int ret = 0;
+	const char * const iconv_usage[] = {
+		N_("test-helper --iconv [<options>]"),
+		NULL
+	};
+	struct option options[] = {
+		OPT_STRING('f', "from-code", &from, "encoding", "from"),
+		OPT_STRING('t', "to-code", &to, "encoding", "to"),
+		OPT_END()
+	};
+
+	argc = parse_options(argc, argv, NULL, options,
+			iconv_usage, 0);
+
+	if (argc > 1 || !from || !to)
+		usage_with_options(iconv_usage, options);
+
+	if (!argc) {
+		if (strbuf_read(&buf, 0, 2048) < 0)
+			die_errno("Could not read from stdin");
+	} else if (strbuf_read_file(&buf, argv[0], 2048) < 0)
+		die_errno("Could not read from '%s'", argv[0]);
+
+	p = reencode_string_len(buf.buf, buf.len, to, from, &len);
+	if (!p)
+		die_errno("Could not reencode");
+	if (write(1, p, len) < 0)
+		ret = !!error_errno("Could not write %"PRIuMAX" bytes",
+				    (uintmax_t)len);
+
+	strbuf_release(&buf);
+	free(p);
+
+	return ret;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index a7abc618b3887e..9d1b41c8e39b89 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -39,6 +39,7 @@ static struct test_cmd cmds[] = {
 	{ "hashmap", cmd__hashmap },
 	{ "hash-speed", cmd__hash_speed },
 	{ "hexdump", cmd__hexdump },
+	{ "iconv", cmd__iconv },
 	{ "json-writer", cmd__json_writer },
 	{ "lazy-init-name-hash", cmd__lazy_init_name_hash },
 	{ "match-trees", cmd__match_trees },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 7f150fa1eb9ad2..e18e5a9ed9de81 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -32,6 +32,7 @@ int cmd__getcwd(int argc, const char **argv);
 int cmd__hashmap(int argc, const char **argv);
 int cmd__hash_speed(int argc, const char **argv);
 int cmd__hexdump(int argc, const char **argv);
+int cmd__iconv(int argc, const char **argv);
 int cmd__json_writer(int argc, const char **argv);
 int cmd__lazy_init_name_hash(int argc, const char **argv);
 int cmd__match_trees(int argc, const char **argv);

From bc8d94de0bed594000150a18c0afd172312e7ff2 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Thu, 20 Jul 2017 22:25:21 +0200
Subject: [PATCH 526/553] tests(mingw): if `iconv` is unavailable, use
 `test-tool iconv`

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/test-lib.sh | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 0fb76f7d11e840..ef0294a30df8e0 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1682,6 +1682,12 @@ Darwin)
 	test_set_prereq GREP_STRIPS_CR
 	test_set_prereq WINDOWS
 	GIT_TEST_CMP="GIT_DIR=/dev/null git diff --no-index --ignore-cr-at-eol --"
+	if ! type iconv >/dev/null 2>&1
+	then
+		iconv () {
+			test-tool iconv "$@"
+		}
+	fi
 	;;
 *CYGWIN*)
 	test_set_prereq POSIXPERM

From 2f3b70fbec9e313fc0cc9fc15eb1839ef0479879 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Thu, 11 Oct 2018 23:55:44 +0200
Subject: [PATCH 527/553] gitattributes: mark .png files as binary

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 .gitattributes | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitattributes b/.gitattributes
index 38b1c52fe0e230..532b246caeb1f0 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -6,6 +6,7 @@
 *.pm text eol=lf diff=perl
 *.py text eol=lf diff=python
 *.bat text eol=crlf
+*.png binary
 CODE_OF_CONDUCT.md -whitespace
 /Documentation/**/*.adoc text eol=lf whitespace=trail,space,incomplete
 /command-list.txt text eol=lf

From 4453543d98a086941f1b157bca00ee0331db041d Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sat, 5 Aug 2017 20:28:37 +0200
Subject: [PATCH 528/553] tests: move test PNGs into t/lib-diff/

We already have a directory where we store files intended for use by
multiple test scripts. The same directory is a better home for the
test-binary-*.png files than t/.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/{ => lib-diff}/test-binary-1.png | Bin
 t/{ => lib-diff}/test-binary-2.png | Bin
 t/t3307-notes-man.sh               |   2 +-
 t/t3903-stash.sh                   |   2 +-
 t/t4012-diff-binary.sh             |   2 +-
 t/t4049-diff-stat-count.sh         |   2 +-
 t/t4108-apply-threeway.sh          |  12 ++++++------
 t/t6403-merge-file.sh              |   4 ++--
 t/t6407-merge-binary.sh            |   2 +-
 t/t9200-git-cvsexportcommit.sh     |  14 +++++++-------
 10 files changed, 20 insertions(+), 20 deletions(-)
 rename t/{ => lib-diff}/test-binary-1.png (100%)
 rename t/{ => lib-diff}/test-binary-2.png (100%)

diff --git a/t/test-binary-1.png b/t/lib-diff/test-binary-1.png
similarity index 100%
rename from t/test-binary-1.png
rename to t/lib-diff/test-binary-1.png
diff --git a/t/test-binary-2.png b/t/lib-diff/test-binary-2.png
similarity index 100%
rename from t/test-binary-2.png
rename to t/lib-diff/test-binary-2.png
diff --git a/t/t3307-notes-man.sh b/t/t3307-notes-man.sh
index 1aa366a410e9a3..7e5c06e6615d7a 100755
--- a/t/t3307-notes-man.sh
+++ b/t/t3307-notes-man.sh
@@ -26,7 +26,7 @@ test_expect_success 'example 1: notes to add an Acked-by line' '
 '
 
 test_expect_success 'example 2: binary notes' '
-	cp "$TEST_DIRECTORY"/test-binary-1.png . &&
+	cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png . &&
 	git checkout B &&
 	blob=$(git hash-object -w test-binary-1.png) &&
 	git notes --ref=logo add -C "$blob" &&
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index 70879941c22f8c..0c9022290fad0f 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -1377,7 +1377,7 @@ test_expect_success 'stash -- <subdir> works with binary files' '
 	mkdir -p subdir &&
 	>subdir/untracked &&
 	>subdir/tracked &&
-	cp "$TEST_DIRECTORY"/test-binary-1.png subdir/tracked-binary &&
+	cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png subdir/tracked-binary &&
 	git add subdir/tracked* &&
 	git stash -- subdir/ &&
 	test_path_is_missing subdir/tracked &&
diff --git a/t/t4012-diff-binary.sh b/t/t4012-diff-binary.sh
index d1d30ac2a9474e..73b1e43779783d 100755
--- a/t/t4012-diff-binary.sh
+++ b/t/t4012-diff-binary.sh
@@ -19,7 +19,7 @@ test_expect_success 'prepare repository' '
 	echo AIT >a && echo BIT >b && echo CIT >c && echo DIT >d &&
 	git update-index --add a b c d &&
 	echo git >a &&
-	cat "$TEST_DIRECTORY"/test-binary-1.png >b &&
+	cat "$TEST_DIRECTORY"/lib-diff/test-binary-1.png >b &&
 	echo git >c &&
 	cat b b >d
 '
diff --git a/t/t4049-diff-stat-count.sh b/t/t4049-diff-stat-count.sh
index eceb47c8594416..2161a1e8cf5ba6 100755
--- a/t/t4049-diff-stat-count.sh
+++ b/t/t4049-diff-stat-count.sh
@@ -33,7 +33,7 @@ test_expect_success 'binary changes do not count in lines' '
 	git reset --hard &&
 	echo a >a &&
 	echo c >c &&
-	cat "$TEST_DIRECTORY"/test-binary-1.png >d &&
+	cat "$TEST_DIRECTORY"/lib-diff/test-binary-1.png >d &&
 	cat >expect <<-\EOF &&
 	 a | 1 +
 	 c | 1 +
diff --git a/t/t4108-apply-threeway.sh b/t/t4108-apply-threeway.sh
index f30e85659dbb87..7f84edd9653a7d 100755
--- a/t/t4108-apply-threeway.sh
+++ b/t/t4108-apply-threeway.sh
@@ -272,11 +272,11 @@ test_expect_success 'apply with --3way --cached and conflicts' '
 
 test_expect_success 'apply binary file patch' '
 	git reset --hard main &&
-	cp "$TEST_DIRECTORY/test-binary-1.png" bin.png &&
+	cp "$TEST_DIRECTORY/lib-diff/test-binary-1.png" bin.png &&
 	git add bin.png &&
 	git commit -m "add binary file" &&
 
-	cp "$TEST_DIRECTORY/test-binary-2.png" bin.png &&
+	cp "$TEST_DIRECTORY/lib-diff/test-binary-2.png" bin.png &&
 
 	git diff --binary >bin.diff &&
 	git reset --hard &&
@@ -287,11 +287,11 @@ test_expect_success 'apply binary file patch' '
 
 test_expect_success 'apply binary file patch with 3way' '
 	git reset --hard main &&
-	cp "$TEST_DIRECTORY/test-binary-1.png" bin.png &&
+	cp "$TEST_DIRECTORY/lib-diff/test-binary-1.png" bin.png &&
 	git add bin.png &&
 	git commit -m "add binary file" &&
 
-	cp "$TEST_DIRECTORY/test-binary-2.png" bin.png &&
+	cp "$TEST_DIRECTORY/lib-diff/test-binary-2.png" bin.png &&
 
 	git diff --binary >bin.diff &&
 	git reset --hard &&
@@ -302,11 +302,11 @@ test_expect_success 'apply binary file patch with 3way' '
 
 test_expect_success 'apply full-index patch with 3way' '
 	git reset --hard main &&
-	cp "$TEST_DIRECTORY/test-binary-1.png" bin.png &&
+	cp "$TEST_DIRECTORY/lib-diff/test-binary-1.png" bin.png &&
 	git add bin.png &&
 	git commit -m "add binary file" &&
 
-	cp "$TEST_DIRECTORY/test-binary-2.png" bin.png &&
+	cp "$TEST_DIRECTORY/lib-diff/test-binary-2.png" bin.png &&
 
 	git diff --full-index >bin.diff &&
 	git reset --hard &&
diff --git a/t/t6403-merge-file.sh b/t/t6403-merge-file.sh
index 06ab4d7aede081..3e06db0cbc579b 100755
--- a/t/t6403-merge-file.sh
+++ b/t/t6403-merge-file.sh
@@ -355,12 +355,12 @@ test_expect_success "expected conflict markers" '
 
 test_expect_success 'binary files cannot be merged' '
 	test_must_fail git merge-file -p \
-		orig.txt "$TEST_DIRECTORY"/test-binary-1.png new1.txt 2> merge.err &&
+		orig.txt "$TEST_DIRECTORY"/lib-diff/test-binary-1.png new1.txt 2> merge.err &&
 	grep "Cannot merge binary files" merge.err
 '
 
 test_expect_success 'binary files cannot be merged with --object-id' '
-	cp "$TEST_DIRECTORY"/test-binary-1.png . &&
+	cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png . &&
 	git add orig.txt new1.txt test-binary-1.png &&
 	test_must_fail git merge-file --object-id \
 		:orig.txt :test-binary-1.png :new1.txt 2> merge.err &&
diff --git a/t/t6407-merge-binary.sh b/t/t6407-merge-binary.sh
index e8a28717cece32..2547f1d504a2c5 100755
--- a/t/t6407-merge-binary.sh
+++ b/t/t6407-merge-binary.sh
@@ -9,7 +9,7 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 test_expect_success setup '
 
-	cat "$TEST_DIRECTORY"/test-binary-1.png >m &&
+	cat "$TEST_DIRECTORY"/lib-diff/test-binary-1.png >m &&
 	git add m &&
 	git ls-files -s | sed -e "s/ 0	/ 1	/" >E1 &&
 	test_tick &&
diff --git a/t/t9200-git-cvsexportcommit.sh b/t/t9200-git-cvsexportcommit.sh
index a44eabf0d80fa8..5249a9eb886e0b 100755
--- a/t/t9200-git-cvsexportcommit.sh
+++ b/t/t9200-git-cvsexportcommit.sh
@@ -54,8 +54,8 @@ test_expect_success 'New file' '
 	mkdir A B C D E F &&
 	echo hello1 >A/newfile1.txt &&
 	echo hello2 >B/newfile2.txt &&
-	cp "$TEST_DIRECTORY"/test-binary-1.png C/newfile3.png &&
-	cp "$TEST_DIRECTORY"/test-binary-1.png D/newfile4.png &&
+	cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png C/newfile3.png &&
+	cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png D/newfile4.png &&
 	git add A/newfile1.txt &&
 	git add B/newfile2.txt &&
 	git add C/newfile3.png &&
@@ -80,8 +80,8 @@ test_expect_success 'Remove two files, add two and update two' '
 	rm -f B/newfile2.txt &&
 	rm -f C/newfile3.png &&
 	echo Hello5  >E/newfile5.txt &&
-	cp "$TEST_DIRECTORY"/test-binary-2.png D/newfile4.png &&
-	cp "$TEST_DIRECTORY"/test-binary-1.png F/newfile6.png &&
+	cp "$TEST_DIRECTORY"/lib-diff/test-binary-2.png D/newfile4.png &&
+	cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png F/newfile6.png &&
 	git add E/newfile5.txt &&
 	git add F/newfile6.png &&
 	git commit -a -m "Test: Remove, add and update" &&
@@ -169,7 +169,7 @@ test_expect_success 'New file with spaces in file name' '
 	mkdir "G g" &&
 	echo ok then >"G g/with spaces.txt" &&
 	git add "G g/with spaces.txt" && \
-	cp "$TEST_DIRECTORY"/test-binary-1.png "G g/with spaces.png" && \
+	cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png "G g/with spaces.png" && \
 	git add "G g/with spaces.png" &&
 	git commit -a -m "With spaces" &&
 	id=$(git rev-list --max-count=1 HEAD) &&
@@ -181,7 +181,7 @@ test_expect_success 'New file with spaces in file name' '
 
 test_expect_success 'Update file with spaces in file name' '
 	echo Ok then >>"G g/with spaces.txt" &&
-	cat "$TEST_DIRECTORY"/test-binary-1.png >>"G g/with spaces.png" && \
+	cat "$TEST_DIRECTORY"/lib-diff/test-binary-1.png >>"G g/with spaces.png" && \
 	git add "G g/with spaces.png" &&
 	git commit -a -m "Update with spaces" &&
 	id=$(git rev-list --max-count=1 HEAD) &&
@@ -206,7 +206,7 @@ test_expect_success !MINGW 'File with non-ascii file name' '
 	mkdir -p Å/goo/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z/å/ä/ö &&
 	echo Foo >Å/goo/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z/å/ä/ö/gårdetsågårdet.txt &&
 	git add Å/goo/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z/å/ä/ö/gårdetsågårdet.txt &&
-	cp "$TEST_DIRECTORY"/test-binary-1.png Å/goo/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z/å/ä/ö/gårdetsågårdet.png &&
+	cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png Å/goo/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z/å/ä/ö/gårdetsågårdet.png &&
 	git add Å/goo/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z/å/ä/ö/gårdetsågårdet.png &&
 	git commit -a -m "Går det så går det" && \
 	id=$(git rev-list --max-count=1 HEAD) &&

From fb2fc453960fad1879f6cf5b3dd495256e304982 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Tue, 18 Jul 2017 01:15:40 +0200
Subject: [PATCH 529/553] tests: only override sort & find if there are usable
 ones in /usr/bin/

The idea is to allow running the test suite on MinGit with BusyBox
installed in /mingw64/bin/sh.exe. In that case, we will want to exclude
sort & find (and other Unix utilities) from being bundled.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 git-sh-setup.sh | 21 ++++++++++++++-------
 t/test-lib.sh   | 21 ++++++++++++++-------
 2 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/git-sh-setup.sh b/git-sh-setup.sh
index 19aef72ec25530..fad4f9df94e143 100644
--- a/git-sh-setup.sh
+++ b/git-sh-setup.sh
@@ -292,13 +292,20 @@ create_virtual_base() {
 # Platform specific tweaks to work around some commands
 case $(uname -s) in
 *MINGW*)
-	# Windows has its own (incompatible) sort and find
-	sort () {
-		/usr/bin/sort "$@"
-	}
-	find () {
-		/usr/bin/find "$@"
-	}
+	if test -x /usr/bin/sort
+	then
+		# Windows has its own (incompatible) sort; override
+		sort () {
+			/usr/bin/sort "$@"
+		}
+	fi
+	if test -x /usr/bin/find
+	then
+		# Windows has its own (incompatible) find; override
+		find () {
+			/usr/bin/find "$@"
+		}
+	fi
 	# git sees Windows-style pwd
 	pwd () {
 		builtin pwd -W
diff --git a/t/test-lib.sh b/t/test-lib.sh
index ef0294a30df8e0..5fbc88aec95f4d 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1662,13 +1662,20 @@ Darwin)
 	test_set_prereq EXECKEEPSPID
 	;;
 *MINGW*)
-	# Windows has its own (incompatible) sort and find
-	sort () {
-		/usr/bin/sort "$@"
-	}
-	find () {
-		/usr/bin/find "$@"
-	}
+	if test -x /usr/bin/sort
+	then
+		# Windows has its own (incompatible) sort; override
+		sort () {
+			/usr/bin/sort "$@"
+		}
+	fi
+	if test -x /usr/bin/find
+	then
+		# Windows has its own (incompatible) find; override
+		find () {
+			/usr/bin/find "$@"
+		}
+	fi
 	# git sees Windows-style pwd
 	pwd () {
 		builtin pwd -W

From 0d1d89280786e0a8beb158880222c55a3d28080f Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 19 Nov 2018 20:34:13 +0100
Subject: [PATCH 530/553] tests: use the correct path separator with BusyBox

BusyBox-w32 is a true Win32 application, i.e. it does not come with a
POSIX emulation layer.

That also means that it does *not* follow the Unix convention of
separating the entries in the PATH variable with colons; it uses
semicolons instead.

However, there are also BusyBox ports to Windows which use a POSIX
emulation layer such as Cygwin's or MSYS2's runtime, i.e. using colons
as PATH separators.

As a tell-tale, let's use the presence of semicolons in the PATH
variable: on Unix, it is highly unlikely that it contains semicolons,
and on Windows (without POSIX emulation), it is virtually guaranteed, as
everybody should have both $SYSTEMROOT and $SYSTEMROOT/system32 in their
PATH.
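
A minimal sketch of that check, written as a function so that it can be
tried on arbitrary strings (the name `detect_path_sep` is made up for
this illustration; the patch itself assigns PATH_SEP inline):

```shell
# Hypothetical helper, not part of the patch: print the separator that
# a PATH-like string appears to use.
detect_path_sep () {
	case "$1" in
	*\;*) echo ";" ;;   # any semicolon: assume a Windows-style list
	*) echo ":" ;;      # otherwise fall back to the Unix colon
	esac
}

detect_path_sep "/usr/local/bin:/usr/bin"
detect_path_sep "C:/Windows;C:/Windows/system32"
```

The first call prints a colon, the second a semicolon, even though the
Windows-style value also contains colons after the drive letters.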

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/interop/interop-lib.sh    |  8 ++++++--
 t/lib-proto-disable.sh      |  2 +-
 t/t0021-conversion.sh       |  2 +-
 t/t0060-path-utils.sh       | 24 ++++++++++++------------
 t/t0061-run-command.sh      |  6 +++---
 t/t0300-credentials.sh      |  2 +-
 t/t1504-ceiling-dirs.sh     | 10 +++++-----
 t/t2300-cd-to-toplevel.sh   |  2 +-
 t/t3418-rebase-continue.sh  |  4 ++--
 t/t5615-alternate-env.sh    |  4 ++--
 t/t5802-connect-helper.sh   |  2 +-
 t/t7006-pager.sh            |  4 ++--
 t/t7606-merge-custom.sh     |  2 +-
 t/t7811-grep-open.sh        |  2 +-
 t/t9003-help-autocorrect.sh |  2 +-
 t/t9800-git-p4-basic.sh     |  2 +-
 t/test-lib.sh               | 17 +++++++++++++----
 17 files changed, 54 insertions(+), 41 deletions(-)

diff --git a/t/interop/interop-lib.sh b/t/interop/interop-lib.sh
index 1b5864d2a7f22c..1facc69d97741a 100644
--- a/t/interop/interop-lib.sh
+++ b/t/interop/interop-lib.sh
@@ -4,6 +4,10 @@
 . ../../GIT-BUILD-OPTIONS
 INTEROP_ROOT=$(pwd)
 BUILD_ROOT=$INTEROP_ROOT/build
+case "$PATH" in
+*\;*) PATH_SEP=\; ;;
+*) PATH_SEP=: ;;
+esac
 
 build_version () {
 	if test -z "$1"
@@ -57,7 +61,7 @@ wrap_git () {
 	write_script "$1" <<-EOF
 	GIT_EXEC_PATH="$2"
 	export GIT_EXEC_PATH
-	PATH="$2:\$PATH"
+	PATH="$2$PATH_SEP\$PATH"
 	export GIT_EXEC_PATH
 	exec git "\$@"
 	EOF
@@ -71,7 +75,7 @@ generate_wrappers () {
 	echo >&2 fatal: test tried to run generic git: $*
 	exit 1
 	EOF
-	PATH=$(pwd)/.bin:$PATH
+	PATH=$(pwd)/.bin$PATH_SEP$PATH
 }
 
 VERSION_A=${GIT_TEST_VERSION_A:-$VERSION_A}
diff --git a/t/lib-proto-disable.sh b/t/lib-proto-disable.sh
index 890622be81642b..9db481e1be15b2 100644
--- a/t/lib-proto-disable.sh
+++ b/t/lib-proto-disable.sh
@@ -214,7 +214,7 @@ setup_ext_wrapper () {
 		cd "$TRASH_DIRECTORY/remote" &&
 		eval "$*"
 		EOF
-		PATH=$TRASH_DIRECTORY:$PATH &&
+		PATH=$TRASH_DIRECTORY$PATH_SEP$PATH &&
 		export TRASH_DIRECTORY
 	'
 }
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index f0d50d769e9fc5..0c5975336f2104 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -8,7 +8,7 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-terminal.sh
 
-PATH=$PWD:$PATH
+PATH=$PWD$PATH_SEP$PATH
 TEST_ROOT="$(pwd)"
 
 write_script <<\EOF "$TEST_ROOT/rot13.sh"
diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
index 3cdc4738644dbc..5abfa202c19dca 100755
--- a/t/t0060-path-utils.sh
+++ b/t/t0060-path-utils.sh
@@ -147,25 +147,25 @@ ancestor /foo /fo -1
 ancestor /foo /foo -1
 ancestor /foo /bar -1
 ancestor /foo /foo/bar -1
-ancestor /foo /foo:/bar -1
-ancestor /foo /:/foo:/bar 0
-ancestor /foo /foo:/:/bar 0
-ancestor /foo /:/bar:/foo 0
+ancestor /foo "/foo$PATH_SEP/bar" -1
+ancestor /foo "/$PATH_SEP/foo$PATH_SEP/bar" 0
+ancestor /foo "/foo$PATH_SEP/$PATH_SEP/bar" 0
+ancestor /foo "/$PATH_SEP/bar$PATH_SEP/foo" 0
 ancestor /foo/bar / 0
 ancestor /foo/bar /fo -1
 ancestor /foo/bar /foo 4
 ancestor /foo/bar /foo/ba -1
-ancestor /foo/bar /:/fo 0
-ancestor /foo/bar /foo:/foo/ba 4
+ancestor /foo/bar "/$PATH_SEP/fo" 0
+ancestor /foo/bar "/foo$PATH_SEP/foo/ba" 4
 ancestor /foo/bar /bar -1
 ancestor /foo/bar /fo -1
-ancestor /foo/bar /foo:/bar 4
-ancestor /foo/bar /:/foo:/bar 4
-ancestor /foo/bar /foo:/:/bar 4
-ancestor /foo/bar /:/bar:/fo 0
-ancestor /foo/bar /:/bar 0
+ancestor /foo/bar "/foo$PATH_SEP/bar" 4
+ancestor /foo/bar "/$PATH_SEP/foo$PATH_SEP/bar" 4
+ancestor /foo/bar "/foo$PATH_SEP/$PATH_SEP/bar" 4
+ancestor /foo/bar "/$PATH_SEP/bar$PATH_SEP/fo" 0
+ancestor /foo/bar "/$PATH_SEP/bar" 0
 ancestor /foo/bar /foo 4
-ancestor /foo/bar /foo:/bar 4
+ancestor /foo/bar "/foo$PATH_SEP/bar" 4
 ancestor /foo/bar /bar -1
 
 # Windows-specific: DOS drives, network shares
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 74529e219e2aef..84335c0b884ba1 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -69,7 +69,7 @@ test_expect_success 'run_command does not try to execute a directory' '
 	cat bin2/greet
 	EOF
 
-	PATH=$PWD/bin1:$PWD/bin2:$PATH \
+	PATH=$PWD/bin1$PATH_SEP$PWD/bin2$PATH_SEP$PATH \
 		test-tool run-command run-command greet >actual 2>err &&
 	test_cmp bin2/greet actual &&
 	test_must_be_empty err
@@ -86,7 +86,7 @@ test_expect_success POSIXPERM 'run_command passes over non-executable file' '
 	cat bin2/greet
 	EOF
 
-	PATH=$PWD/bin1:$PWD/bin2:$PATH \
+	PATH=$PWD/bin1$PATH_SEP$PWD/bin2$PATH_SEP$PATH \
 		test-tool run-command run-command greet >actual 2>err &&
 	test_cmp bin2/greet actual &&
 	test_must_be_empty err
@@ -106,7 +106,7 @@ test_expect_success POSIXPERM,SANITY 'unreadable directory in PATH' '
 	git config alias.nitfol "!echo frotz" &&
 	chmod a-rx local-command &&
 	(
-		PATH=./local-command:$PATH &&
+		PATH=./local-command$PATH_SEP$PATH &&
 		git nitfol >actual
 	) &&
 	echo frotz >expect &&
diff --git a/t/t0300-credentials.sh b/t/t0300-credentials.sh
index 07aa834d33e248..e740ce362988a5 100755
--- a/t/t0300-credentials.sh
+++ b/t/t0300-credentials.sh
@@ -80,7 +80,7 @@ test_expect_success 'setup helper scripts' '
 	printf "username=\\007latrix Lestrange\\n"
 	EOF
 
-	PATH="$PWD:$PATH"
+	PATH="$PWD$PATH_SEP$PATH"
 '
 
 test_expect_success 'credential_fill invokes helper' '
diff --git a/t/t1504-ceiling-dirs.sh b/t/t1504-ceiling-dirs.sh
index e04420f4368b93..ff9fb804827b59 100755
--- a/t/t1504-ceiling-dirs.sh
+++ b/t/t1504-ceiling-dirs.sh
@@ -84,9 +84,9 @@ then
 	GIT_CEILING_DIRECTORIES="$TRASH_ROOT/top/"
 	test_fail subdir_ceil_at_top_slash
 
-	GIT_CEILING_DIRECTORIES=":$TRASH_ROOT/top"
+	GIT_CEILING_DIRECTORIES="$PATH_SEP$TRASH_ROOT/top"
 	test_prefix subdir_ceil_at_top_no_resolve "sub/dir/"
-	GIT_CEILING_DIRECTORIES=":$TRASH_ROOT/top/"
+	GIT_CEILING_DIRECTORIES="$PATH_SEP$TRASH_ROOT/top/"
 	test_prefix subdir_ceil_at_top_slash_no_resolve "sub/dir/"
 fi
 
@@ -116,13 +116,13 @@ GIT_CEILING_DIRECTORIES="$TRASH_ROOT/subdi"
 test_prefix subdir_ceil_at_subdi_slash "sub/dir/"
 
 
-GIT_CEILING_DIRECTORIES="/foo:$TRASH_ROOT/sub"
+GIT_CEILING_DIRECTORIES="/foo$PATH_SEP$TRASH_ROOT/sub"
 test_fail second_of_two
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/sub:/bar"
+GIT_CEILING_DIRECTORIES="$TRASH_ROOT/sub$PATH_SEP/bar"
 test_fail first_of_two
 
-GIT_CEILING_DIRECTORIES="/foo:$TRASH_ROOT/sub:/bar"
+GIT_CEILING_DIRECTORIES="/foo$PATH_SEP$TRASH_ROOT/sub$PATH_SEP/bar"
 test_fail second_of_three
 
 
diff --git a/t/t2300-cd-to-toplevel.sh b/t/t2300-cd-to-toplevel.sh
index c8de6d8a190220..91f523d5198d8d 100755
--- a/t/t2300-cd-to-toplevel.sh
+++ b/t/t2300-cd-to-toplevel.sh
@@ -16,7 +16,7 @@ test_cd_to_toplevel () {
 	test_expect_success $3 "$2" '
 		(
 			cd '"'$1'"' &&
-			PATH="$EXEC_PATH:$PATH" &&
+			PATH="$EXEC_PATH$PATH_SEP$PATH" &&
 			. git-sh-setup &&
 			cd_to_toplevel &&
 			[ "$(pwd -P)" = "$TOPLEVEL" ]
diff --git a/t/t3418-rebase-continue.sh b/t/t3418-rebase-continue.sh
index f9b8999db50f1b..e03a28c0aaad24 100755
--- a/t/t3418-rebase-continue.sh
+++ b/t/t3418-rebase-continue.sh
@@ -82,7 +82,7 @@ test_expect_success 'rebase --continue remembers merge strategy and options' '
 
 	rm -f actual &&
 	(
-		PATH=./test-bin:$PATH &&
+		PATH=./test-bin$PATH_SEP$PATH &&
 		test_must_fail git rebase -s funny -X"option=arg with space" \
 				-Xop\"tion\\ -X"new${LF}line " main topic
 	) &&
@@ -91,7 +91,7 @@ test_expect_success 'rebase --continue remembers merge strategy and options' '
 	echo "Resolved" >F2 &&
 	git add F2 &&
 	(
-		PATH=./test-bin:$PATH &&
+		PATH=./test-bin$PATH_SEP$PATH &&
 		git rebase --continue
 	) &&
 	test_cmp expect actual
diff --git a/t/t5615-alternate-env.sh b/t/t5615-alternate-env.sh
index 9d6aa2187f2aaa..1bfeccdeb49958 100755
--- a/t/t5615-alternate-env.sh
+++ b/t/t5615-alternate-env.sh
@@ -39,7 +39,7 @@ test_expect_success 'access alternate via absolute path' '
 '
 
 test_expect_success 'access multiple alternates' '
-	check_obj "$PWD/one.git/objects:$PWD/two.git/objects" <<-EOF
+	check_obj "$PWD/one.git/objects$PATH_SEP$PWD/two.git/objects" <<-EOF
 	$one blob
 	$two blob
 	EOF
@@ -75,7 +75,7 @@ test_expect_success 'access alternate via relative path (subdir)' '
 quoted='"one.git\057objects"'
 unquoted='two.git/objects'
 test_expect_success 'mix of quoted and unquoted alternates' '
-	check_obj "$quoted:$unquoted" <<-EOF
+	check_obj "$quoted$PATH_SEP$unquoted" <<-EOF
 	$one blob
 	$two blob
 	EOF
diff --git a/t/t5802-connect-helper.sh b/t/t5802-connect-helper.sh
index a7be375bceb8d3..26cbcebf3b2b24 100755
--- a/t/t5802-connect-helper.sh
+++ b/t/t5802-connect-helper.sh
@@ -86,7 +86,7 @@ test_expect_success 'set up fake git-daemon' '
 		"$TRASH_DIRECTORY/remote"
 	EOF
 	export TRASH_DIRECTORY &&
-	PATH=$TRASH_DIRECTORY:$PATH
+	PATH=$TRASH_DIRECTORY$PATH_SEP$PATH
 '
 
 test_expect_success 'ext command can connect to git daemon (no vhost)' '
diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh
index 9717e825f0d7a5..e3aa496a286331 100755
--- a/t/t7006-pager.sh
+++ b/t/t7006-pager.sh
@@ -54,7 +54,7 @@ test_expect_success !MINGW,TTY 'LESS and LV envvars set by git-sh-setup' '
 		sane_unset LESS LV &&
 		PAGER="env >pager-env.out; wc" &&
 		export PAGER &&
-		PATH="$(git --exec-path):$PATH" &&
+		PATH="$(git --exec-path)$PATH_SEP$PATH" &&
 		export PATH &&
 		test_terminal sh -c ". git-sh-setup && git_pager"
 	) &&
@@ -388,7 +388,7 @@ test_default_pager() {
 		EOF
 		chmod +x \$less &&
 		(
-			PATH=.:\$PATH &&
+			PATH=.$PATH_SEP\$PATH &&
 			export PATH &&
 			$full_command
 		) &&
diff --git a/t/t7606-merge-custom.sh b/t/t7606-merge-custom.sh
index 81fb7c474c14c1..8197a1c46bb5b6 100755
--- a/t/t7606-merge-custom.sh
+++ b/t/t7606-merge-custom.sh
@@ -23,7 +23,7 @@ test_expect_success 'set up custom strategy' '
 	EOF
 
 	chmod +x git-merge-theirs &&
-	PATH=.:$PATH &&
+	PATH=.$PATH_SEP$PATH &&
 	export PATH
 '
 
diff --git a/t/t7811-grep-open.sh b/t/t7811-grep-open.sh
index 3160be59fd2e26..1a98d733dceb86 100755
--- a/t/t7811-grep-open.sh
+++ b/t/t7811-grep-open.sh
@@ -52,7 +52,7 @@ test_expect_success SIMPLEPAGER 'git grep -O' '
 	EOF
 	echo grep.h >expect.notless &&
 
-	PATH=.:$PATH git grep -O GREP_PATTERN >out &&
+	PATH=.$PATH_SEP$PATH git grep -O GREP_PATTERN >out &&
 	{
 		test_cmp expect.less pager-args ||
 		test_cmp expect.notless pager-args
diff --git a/t/t9003-help-autocorrect.sh b/t/t9003-help-autocorrect.sh
index 8da318d2b543da..c7a03aae697ac0 100755
--- a/t/t9003-help-autocorrect.sh
+++ b/t/t9003-help-autocorrect.sh
@@ -13,7 +13,7 @@ test_expect_success 'setup' '
 		echo distimdistim was called
 	EOF
 
-	PATH="$PATH:." &&
+	PATH="$PATH$PATH_SEP." &&
 	export PATH &&
 
 	git commit --allow-empty -m "a single log entry" &&
diff --git a/t/t9800-git-p4-basic.sh b/t/t9800-git-p4-basic.sh
index 0816763e46639c..b3dbd02961fae3 100755
--- a/t/t9800-git-p4-basic.sh
+++ b/t/t9800-git-p4-basic.sh
@@ -286,7 +286,7 @@ test_expect_success 'exit when p4 fails to produce marshaled output' '
 	EOF
 	chmod 755 badp4dir/p4 &&
 	(
-		PATH="$TRASH_DIRECTORY/badp4dir:$PATH" &&
+		PATH="$TRASH_DIRECTORY/badp4dir$PATH_SEP$PATH" &&
 		export PATH &&
 		test_expect_code 1 git p4 clone --dest="$git" //depot >errs 2>&1
 	) &&
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 5fbc88aec95f4d..f9a084c8b7683f 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -15,6 +15,15 @@
 # You should have received a copy of the GNU General Public License
 # along with this program.  If not, see https://www.gnu.org/licenses/ .
 
+# On Unix/Linux, the path separator is the colon, on other systems it
+# may be different, though. On Windows, for example, it is a semicolon.
+# If the PATH variable contains semicolons, it is pretty safe to assume
+# that the path separator is a semicolon.
+case "$PATH" in
+*\;*) PATH_SEP=\; ;;
+*) PATH_SEP=: ;;
+esac
+
 # Test the binaries we have just built.  The tests are kept in
 # t/ subdirectory and are run in 'trash directory' subdirectory.
 if test -z "$TEST_DIRECTORY"
@@ -1392,7 +1401,7 @@ then
 		done
 	done
 	IFS=$OLDIFS
-	PATH=$GIT_VALGRIND/bin:$PATH
+	PATH=$GIT_VALGRIND/bin$PATH_SEP$PATH
 	GIT_EXEC_PATH=$GIT_VALGRIND/bin
 	export GIT_VALGRIND
 	GIT_VALGRIND_MODE="$valgrind"
@@ -1404,7 +1413,7 @@ elif test -n "$GIT_TEST_INSTALLED"
 then
 	GIT_EXEC_PATH=$($GIT_TEST_INSTALLED/git --exec-path)  ||
 	error "Cannot run git from $GIT_TEST_INSTALLED."
-	PATH=$GIT_TEST_INSTALLED:$GIT_BUILD_DIR/t/helper:$PATH
+	PATH=$GIT_TEST_INSTALLED$PATH_SEP$GIT_BUILD_DIR/t/helper$PATH_SEP$PATH
 	GIT_EXEC_PATH=${GIT_TEST_EXEC_PATH:-$GIT_EXEC_PATH}
 else # normal case, use ../bin-wrappers only unless $with_dashes:
 	if test -n "$no_bin_wrappers"
@@ -1420,12 +1429,12 @@ else # normal case, use ../bin-wrappers only unless $with_dashes:
 			fi
 			with_dashes=t
 		fi
-		PATH="$git_bin_dir:$PATH"
+		PATH="$git_bin_dir$PATH_SEP$PATH"
 	fi
 	GIT_EXEC_PATH=$GIT_BUILD_DIR
 	if test -n "$with_dashes"
 	then
-		PATH="$GIT_BUILD_DIR:$GIT_BUILD_DIR/t/helper:$PATH"
+		PATH="$GIT_BUILD_DIR$PATH_SEP$GIT_BUILD_DIR/t/helper$PATH_SEP$PATH"
 	fi
 fi
 GIT_TEMPLATE_DIR="$GIT_TEST_TEMPLATE_DIR"

From 78d56771bc9cf7ce1f8c90bf357a8e7bddc2d5b7 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 30 Jun 2017 00:35:40 +0200
Subject: [PATCH 531/553] mingw: only use Bash-ism `builtin pwd -W` when
 available

Traditionally, Git for Windows' SDK uses Bash as its default shell.
However, other Unix shells are available, too. Most notably, the Win32
port of BusyBox comes with `ash` whose `pwd` command already prints
Windows paths as Git for Windows wants them, while there is not even a
`builtin` command.

Therefore, let's be careful not to override `pwd` unless we know that
the `builtin` command is available.
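
The probe can be sketched as follows; `have_command` is a hypothetical
name used only for illustration, and the patch performs the check
inline with `type builtin`:

```shell
# Hypothetical helper, not part of the patch: succeed if the running
# shell knows the given command (builtin, function or executable).
have_command () {
	type "$1" >/dev/null 2>&1
}

if have_command builtin
then
	echo "builtin is available"
else
	echo "no builtin command in this shell"
fi
```

Which branch is taken depends on the shell that runs the snippet.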

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 git-sh-setup.sh | 14 ++++++++++----
 t/test-lib.sh   | 14 ++++++++++----
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/git-sh-setup.sh b/git-sh-setup.sh
index fad4f9df94e143..c51ad34148ccf3 100644
--- a/git-sh-setup.sh
+++ b/git-sh-setup.sh
@@ -306,10 +306,16 @@ case $(uname -s) in
 			/usr/bin/find "$@"
 		}
 	fi
-	# git sees Windows-style pwd
-	pwd () {
-		builtin pwd -W
-	}
+	# On Windows, Git wants Windows paths. But /usr/bin/pwd spits out
+	# Unix-style paths. At least in Bash, we have a builtin pwd that
+	# understands the -W option to force "mixed" paths, i.e. with drive
+	# prefix but still with forward slashes. Let's use that, if available.
+	if type builtin >/dev/null 2>&1
+	then
+		pwd () {
+			builtin pwd -W
+		}
+	fi
 	is_absolute_path () {
 		case "$1" in
 		[/\\]* | [A-Za-z]:*)
diff --git a/t/test-lib.sh b/t/test-lib.sh
index f9a084c8b7683f..a11d20949e7fb8 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1685,10 +1685,16 @@ Darwin)
 			/usr/bin/find "$@"
 		}
 	fi
-	# git sees Windows-style pwd
-	pwd () {
-		builtin pwd -W
-	}
+	# On Windows, Git wants Windows paths. But /usr/bin/pwd spits out
+	# Unix-style paths. At least in Bash, we have a builtin pwd that
+	# understands the -W option to force "mixed" paths, i.e. with drive
+	# prefix but still with forward slashes. Let's use that, if available.
+	if type builtin >/dev/null 2>&1
+	then
+		pwd () {
+			builtin pwd -W
+		}
+	fi
 	# no POSIX permissions
 	# backslashes in pathspec are converted to '/'
 	# exec does not inherit the PID

From a432140351dab0e31e4dc61e59dff6f352bbb39a Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 30 Jun 2017 22:32:33 +0200
Subject: [PATCH 532/553] tests (mingw): remove Bash-specific pwd option

The -W option is only understood by MSYS2 Bash's pwd command. We already
make sure to override `pwd` with `builtin pwd -W` for MINGW, so let's
not duplicate the effort here.

This will also help when switching the shell to another one (such as
BusyBox' ash) whose pwd does *not* understand the -W option.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t9902-completion.sh | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
index 964e1f156932c6..661083d89a7c1e 100755
--- a/t/t9902-completion.sh
+++ b/t/t9902-completion.sh
@@ -139,12 +139,7 @@ invalid_variable_name='${foo.bar}'
 
 actual="$TRASH_DIRECTORY/actual"
 
-if test_have_prereq MINGW
-then
-	ROOT="$(pwd -W)"
-else
-	ROOT="$(pwd)"
-fi
+ROOT="$(pwd)"
 
 test_expect_success 'setup for __git_find_repo_path/__gitdir tests' '
 	mkdir -p subdir/subsubdir &&

From 34654b275c08001430d4c7c4495064ebac7f25af Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 19 Jul 2017 17:07:56 +0200
Subject: [PATCH 533/553] test-lib: add BUSYBOX prerequisite

When running with BusyBox, we will want to avoid calling executables on
the PATH that are implemented in BusyBox itself.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/test-lib.sh | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index a11d20949e7fb8..50cc6186c17ae6 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1879,6 +1879,10 @@ test_lazy_prereq UNZIP '
 	test $? -ne 127
 '
 
+test_lazy_prereq BUSYBOX '
+	case "$($SHELL --help 2>&1)" in *BusyBox*) true;; *) false;; esac
+'
+
 run_with_limited_cmdline () {
 	(ulimit -s 128 && "$@")
 }

From 54351505887c8bc05422c90b3e437b2d75744c57 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sat, 5 Aug 2017 21:36:01 +0200
Subject: [PATCH 534/553] t5003: use binary file from t/lib-diff/

At some stage, t5003-archive-zip wants to add a file that is not ASCII.
To that end, it uses /bin/sh. But that file may actually not exist (it
is too easy to forget that not all the world is Unix/Linux...)! Besides,
we already have perfectly fine binary files intended for use solely by
the tests. So let's use one of them instead.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5003-archive-zip.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
index 961c6aac256135..2c3d5a13ad027f 100755
--- a/t/t5003-archive-zip.sh
+++ b/t/t5003-archive-zip.sh
@@ -88,7 +88,7 @@ test_expect_success \
     'mkdir a &&
      echo simple textfile >a/a &&
      mkdir a/bin &&
-     cp /bin/sh a/bin &&
+     cp "$TEST_DIRECTORY/lib-diff/test-binary-1.png" a/bin &&
      printf "text\r"	>a/text.cr &&
      printf "text\r\n"	>a/text.crlf &&
      printf "text\n"	>a/text.lf &&

From e78d797691dc6e32f7ad83508ea5042a9c5365b6 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 21 Jul 2017 12:48:33 +0200
Subject: [PATCH 535/553] t5532: workaround for BusyBox on Windows

While it may seem super convenient to some old Unix hands to simply
require Perl to be available when running the test suite, this is a
major hassle on Windows, where we want to verify that Perl is not,
actually, required in a NO_PERL build.

As a super ugly workaround, we "install" a script into /usr/bin/perl
reading like this:

	#!/bin/sh

	# We'd much rather avoid requiring Perl altogether when testing
	# an installed Git. Oh well, that's why we cannot have nice
	# things.
	exec c:/git-sdk-64/usr/bin/perl.exe "$@"

The problem with that is that BusyBox assumes that the #! line in a
script refers to an executable, not to a script. So when it encounters
the line #!/usr/bin/perl in t5532's proxy-get-cmd, it barfs.

Let's help this situation by simply executing the Perl script with the
"interpreter" specified explicitly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5532-fetch-proxy.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t5532-fetch-proxy.sh b/t/t5532-fetch-proxy.sh
index 95d0f33b29531c..86fe5d8f752147 100755
--- a/t/t5532-fetch-proxy.sh
+++ b/t/t5532-fetch-proxy.sh
@@ -32,7 +32,7 @@ test_expect_success 'setup proxy script' '
 
 	write_script proxy <<-\EOF
 	echo >&2 "proxying for $*"
-	cmd=$(./proxy-get-cmd)
+	cmd=$("$PERL_PATH" ./proxy-get-cmd)
 	echo >&2 "Running $cmd"
 	exec $cmd
 	EOF

From 1c6cc4fdf9e044375840f1ff8e08f03c48b77f60 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 21 Jul 2017 13:24:55 +0200
Subject: [PATCH 536/553] t5605: special-case hardlink test for BusyBox-w32

When t5605 tries to verify that files are hardlinked (or that they are
not), it uses the `-links` option of the `find` utility.

BusyBox' implementation does not support that option, and BusyBox-w32's
lstat() does not even report the number of hard links correctly (for
performance reasons).

So let's just switch to a different method that actually works on
Windows.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5605-clone-local.sh | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/t/t5605-clone-local.sh b/t/t5605-clone-local.sh
index 2397f8fa618054..a7444acc5f89e4 100755
--- a/t/t5605-clone-local.sh
+++ b/t/t5605-clone-local.sh
@@ -11,6 +11,21 @@ repo_is_hardlinked() {
 	test_line_count = 0 output
 }
 
+if test_have_prereq MINGW,BUSYBOX
+then
+	# BusyBox' `find` does not support `-links`. Besides, BusyBox-w32's
+	# lstat() does not report hard links, just like Git's mingw_lstat()
+	# (from where BusyBox-w32 got its initial implementation).
+	repo_is_hardlinked() {
+		for f in $(find "$1/objects" -type f)
+		do
+			"$SYSTEMROOT"/system32/fsutil.exe \
+				hardlink list $f >links &&
+			test_line_count -gt 1 links || return 1
+		done
+	}
+fi
+
 test_expect_success 'preparing origin repository' '
 	: >file && git add . && git commit -m1 &&
 	git clone --bare . a.git &&

From cdefdda1c37bc6828920ef8970d1909d577c4638 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 5 Jul 2017 15:14:50 +0200
Subject: [PATCH 537/553] t5813: allow for $PWD to be a Windows path

Git for Windows uses MSYS2's Bash to run the test suite, which comes
with benefits but also at a heavy price: on the plus side, MSYS2's
POSIX emulation layer allows us to continue pretending that we are on a
Unix system, e.g. use Unix paths instead of Windows ones, yet this is
bought at a rather noticeable performance penalty.

There *are* some more native ports of Unix shells out there, though,
most notably BusyBox-w32's ash. These native ports do not use any POSIX
emulation layer (or at most a *very* thin one, choosing to avoid
features such as fork() that are expensive to emulate on Windows), and
they use native Windows paths (usually with forward slashes instead of
backslashes, which is perfectly legal in almost all use cases).

And here comes the problem: with a $PWD looking like, say,
C:/git-sdk-64/usr/src/git/t/trash directory.t5813-proto-disable-ssh
Git's test scripts get quite a bit confused, as their assumptions have
been shattered. Not only does this path contain a colon (oh no!), it
also does not start with a slash.

This is a problem e.g. when constructing a URL as t5813 does it:
ssh://remote$PWD. Not only is it impossible to separate the "host" from
the path with a $PWD as above, even prefixing $PWD by a slash won't
work, as /C:/git-sdk-64/... is not a valid path.

As a workaround, detect when $PWD does not start with a slash on
Windows, and simply strip the drive prefix, using an obscure feature of
Windows paths: if an absolute Windows path starts with a slash, it is
implicitly prefixed by the drive prefix of the current directory. As we
are talking about the current directory here, anyway, that strategy
works.
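
The prefix stripping can be tried in isolation. The following is a
minimal sketch, not the code of this patch: `strip_drive_prefix` is a
hypothetical helper name, and unlike the test script it prints
unrecognized paths unchanged instead of skipping the remaining tests.

```sh
# Hypothetical helper: turn a drive-prefixed Windows path such as
# C:/foo into /foo and print it; any other path is printed unchanged.
strip_drive_prefix () {
	case "$1" in
	[A-Za-z]:/*)
		# "#?:" removes the shortest leading match of "?:",
		# i.e. the drive letter and the colon.
		printf '%s\n' "${1#?:}"
		;;
	*)
		printf '%s\n' "$1"
		;;
	esac
}

hostdir=$(strip_drive_prefix "C:/git-sdk-64/usr/src/git/t")
echo "ssh://remote$hostdir/remote/repo.git"
```

With the sample path, the host part ("remote") and the path part
("/git-sdk-64/...") of the URL can be told apart again, because the
path no longer contains a colon and starts with a slash.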

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5813-proto-disable-ssh.sh | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/t/t5813-proto-disable-ssh.sh b/t/t5813-proto-disable-ssh.sh
index 045e2fe6ce376a..c78581dc9f4a1e 100755
--- a/t/t5813-proto-disable-ssh.sh
+++ b/t/t5813-proto-disable-ssh.sh
@@ -15,8 +15,23 @@ test_expect_success 'setup repository to clone' '
 '
 
 test_proto "host:path" ssh "remote:repo.git"
-test_proto "ssh://" ssh "ssh://remote$PWD/remote/repo.git"
-test_proto "git+ssh://" ssh "git+ssh://remote$PWD/remote/repo.git"
+
+hostdir="$PWD"
+if test_have_prereq MINGW && test "/${PWD#/}" != "$PWD"
+then
+	case "$PWD" in
+	[A-Za-z]:/*)
+		hostdir="${PWD#?:}"
+		;;
+	*)
+		skip_all="Unhandled PWD '$PWD'; skipping rest"
+		test_done
+		;;
+	esac
+fi
+
+test_proto "ssh://" ssh "ssh://remote$hostdir/remote/repo.git"
+test_proto "git+ssh://" ssh "git+ssh://remote$hostdir/remote/repo.git"
 
 # Don't even bother setting up a "-remote" directory, as ssh would generally
 # complain about the bogus option rather than completing our request. Our

From 335976771f4062f3485bfcbe3fdcb4296ef2a5d1 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 17 May 2017 17:05:09 +0200
Subject: [PATCH 538/553] mingw: kill child processes in a gentler way

The TerminateProcess() function does not actually leave the child
processes any chance to perform any cleanup operations. This is bad
insofar as Git itself expects its signal handlers to run.

A symptom is e.g. a left-behind .lock file that would not be left behind
if the same operation was run, say, on Linux.

To remedy this situation, we use an obscure trick: we inject a thread
into the process that needs to be killed and let that thread run the
ExitProcess() function with the desired exit status. Thanks J Wyman for
describing this trick.

The advantage is that the ExitProcess() function lets the atexit
handlers run. While this is still different from what Git expects (i.e.
running a signal handler), in practice Git sets up signal handlers and
atexit handlers that call the same code to clean up after itself.

In case the gentle method to terminate the process fails, we still
fall back to calling TerminateProcess(), but in that case we now also
make sure that processes spawned by the spawned process are terminated;
TerminateProcess() does not give the spawned process a chance to do so
itself.

Please note that this change only affects how Git for Windows tries to
terminate processes spawned by Git's own executables. Third-party
software that *calls* Git and wants to terminate it *still* needs to make
sure to imitate this gentle method, otherwise this patch will not have
any effect.
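
The left-behind .lock symptom is not specific to Windows. As a rough
POSIX analogy (it does not use the Win32 mechanism of this patch), the
sketch below starts a hypothetical worker that removes its lock file in
a TERM handler, then compares a catchable signal with SIGKILL, which,
like TerminateProcess(), gives the process no chance to clean up. All
names in it are made up for illustration.

```sh
workdir=$(mktemp -d)
lock="$workdir/demo.lock"

# Start a hypothetical worker that holds the lock file and removes it
# when it receives SIGTERM; return once the lock file exists.
start_worker () {
	sh -c '
		trap "rm -f \"\$1\"; exit 143" TERM
		: >"$1"
		while :; do sleep 1; done
	' worker "$lock" &
	worker_pid=$!
	while ! test -f "$lock"; do sleep 1; done
}

# Gentle: SIGTERM lets the handler run, so the lock is cleaned up.
start_worker
kill -TERM "$worker_pid"
gentle_status=0
wait "$worker_pid" || gentle_status=$?
if test -f "$lock"; then gentle_left_lock=yes; else gentle_left_lock=no; fi

# Harsh: SIGKILL cannot be caught, so the lock is left behind.
start_worker
kill -KILL "$worker_pid"
harsh_status=0
wait "$worker_pid" || harsh_status=$?
if test -f "$lock"; then harsh_left_lock=yes; else harsh_left_lock=no; fi

echo "TERM: status=$gentle_status, lock left behind: $gentle_left_lock"
echo "KILL: status=$harsh_status, lock left behind: $harsh_left_lock"
rm -rf "$workdir"
```

The injected ExitProcess() thread plays the role of the TERM case here:
the target process gets to run its own cleanup code before it exits.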

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c              |  29 +++++--
 compat/win32/exit-process.h | 165 ++++++++++++++++++++++++++++++++++++
 2 files changed, 186 insertions(+), 8 deletions(-)
 create mode 100644 compat/win32/exit-process.h

diff --git a/compat/mingw.c b/compat/mingw.c
index 064ff581f25f22..eb39defc6e93c7 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -14,6 +14,7 @@
 #include "symlinks.h"
 #include "trace2.h"
 #include "win32.h"
+#include "win32/exit-process.h"
 #include "win32/fscache.h"
 #include "win32/lazyload.h"
 #include "wrapper.h"
@@ -2252,16 +2253,28 @@ int mingw_execvp(const char *cmd, char *const *argv)
 int mingw_kill(pid_t pid, int sig)
 {
 	if (pid > 0 && sig == SIGTERM) {
-		HANDLE h = OpenProcess(PROCESS_TERMINATE, FALSE, pid);
-
-		if (TerminateProcess(h, -1)) {
+		HANDLE h = OpenProcess(PROCESS_CREATE_THREAD |
+				       PROCESS_QUERY_INFORMATION |
+				       PROCESS_VM_OPERATION | PROCESS_VM_WRITE |
+				       PROCESS_VM_READ | PROCESS_TERMINATE,
+				       FALSE, pid);
+		int ret;
+
+		if (h)
+			ret = exit_process(h, 128 + sig);
+		else {
+			h = OpenProcess(PROCESS_TERMINATE, FALSE, pid);
+			if (!h) {
+				errno = err_win_to_posix(GetLastError());
+				return -1;
+			}
+			ret = terminate_process_tree(h, 128 + sig);
+		}
+		if (ret) {
+			errno = err_win_to_posix(GetLastError());
 			CloseHandle(h);
-			return 0;
 		}
-
-		errno = err_win_to_posix(GetLastError());
-		CloseHandle(h);
-		return -1;
+		return ret;
 	} else if (pid > 0 && sig == 0) {
 		HANDLE h = OpenProcess(PROCESS_QUERY_INFORMATION, FALSE, pid);
 		if (h) {
diff --git a/compat/win32/exit-process.h b/compat/win32/exit-process.h
new file mode 100644
index 00000000000000..d53989884cfb0c
--- /dev/null
+++ b/compat/win32/exit-process.h
@@ -0,0 +1,165 @@
+#ifndef EXIT_PROCESS_H
+#define EXIT_PROCESS_H
+
+/*
+ * This file contains functions to terminate a Win32 process, as gently as
+ * possible.
+ *
+ * At first, we will attempt to inject a thread that calls ExitProcess(). If
+ * that fails, we will fall back to terminating the entire process tree.
+ *
+ * For simplicity, these functions are marked as file-local.
+ */
+
+#include <tlhelp32.h>
+
+/*
+ * Terminates the process corresponding to the process ID and all of its
+ * directly and indirectly spawned subprocesses.
+ *
+ * This way of terminating the processes is not gentle: the processes get
+ * no chance of cleaning up after themselves (closing file handles, removing
+ * .lock files, terminating spawned processes (if any), etc).
+ */
+static int terminate_process_tree(HANDLE main_process, int exit_status)
+{
+	HANDLE snapshot = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
+	PROCESSENTRY32 entry;
+	DWORD pids[16384];
+	int max_len = sizeof(pids) / sizeof(*pids), i, len, ret = 0;
+	pid_t pid = GetProcessId(main_process);
+
+	pids[0] = (DWORD)pid;
+	len = 1;
+
+	/*
+	 * Even if Process32First()/Process32Next() seem to traverse the
+	 * processes in topological order (i.e. parent processes before
+	 * child processes), there is nothing in the Win32 API documentation
+	 * suggesting that this is guaranteed.
+	 *
+	 * Therefore, run through them at least twice and stop when no more
+	 * process IDs were added to the list.
+	 */
+	for (;;) {
+		int orig_len = len;
+
+		memset(&entry, 0, sizeof(entry));
+		entry.dwSize = sizeof(entry);
+
+		if (!Process32First(snapshot, &entry))
+			break;
+
+		do {
+			for (i = len - 1; i >= 0; i--) {
+				if (pids[i] == entry.th32ProcessID)
+					break;
+				if (pids[i] == entry.th32ParentProcessID)
+					pids[len++] = entry.th32ProcessID;
+			}
+		} while (len < max_len && Process32Next(snapshot, &entry));
+
+		if (orig_len == len || len >= max_len)
+			break;
+	}
+
+	for (i = len - 1; i > 0; i--) {
+		HANDLE process = OpenProcess(PROCESS_TERMINATE, FALSE, pids[i]);
+
+		if (process) {
+			if (!TerminateProcess(process, exit_status))
+				ret = -1;
+			CloseHandle(process);
+		}
+	}
+	if (!TerminateProcess(main_process, exit_status))
+		ret = -1;
+	CloseHandle(main_process);
+
+	return ret;
+}
+
+/**
+ * Determine whether a process runs in the same architecture as the current
+ * one. That test is required before we assume that GetProcAddress() returns
+ * a valid address *for the target process*.
+ */
+static inline int process_architecture_matches_current(HANDLE process)
+{
+	static BOOL current_is_wow = -1;
+	BOOL is_wow;
+
+	if (current_is_wow == -1 &&
+	    !IsWow64Process (GetCurrentProcess(), &current_is_wow))
+		current_is_wow = -2;
+	if (current_is_wow == -2)
+		return 0; /* could not determine current process' WoW-ness */
+	if (!IsWow64Process (process, &is_wow))
+		return 0; /* cannot determine */
+	return is_wow == current_is_wow;
+}
+
+/**
+ * Inject a thread into the given process that runs ExitProcess().
+ *
+ * Note: as kernel32.dll is loaded before any process, the other process and
+ * this process will have ExitProcess() at the same address.
+ *
+ * This function expects the process handle to have the access rights for
+ * CreateRemoteThread(): PROCESS_CREATE_THREAD, PROCESS_QUERY_INFORMATION,
+ * PROCESS_VM_OPERATION, PROCESS_VM_WRITE, and PROCESS_VM_READ.
+ *
+ * The idea comes from the Dr Dobb's article "A Safer Alternative to
+ * TerminateProcess()" by Andrew Tucker (July 1, 1999),
+ * http://www.drdobbs.com/a-safer-alternative-to-terminateprocess/184416547
+ *
+ * If this method fails, we fall back to running terminate_process_tree().
+ */
+static int exit_process(HANDLE process, int exit_code)
+{
+	DWORD code;
+
+	if (GetExitCodeProcess(process, &code) && code == STILL_ACTIVE) {
+		static int initialized;
+		static LPTHREAD_START_ROUTINE exit_process_address;
+		PVOID arg = (PVOID)(intptr_t)exit_code;
+		DWORD thread_id;
+		HANDLE thread = NULL;
+
+		if (!initialized) {
+			HINSTANCE kernel32 = GetModuleHandleA("kernel32");
+			if (!kernel32)
+				die("BUG: cannot find kernel32");
+			exit_process_address =
+				(LPTHREAD_START_ROUTINE)(void (*)(void))
+				GetProcAddress(kernel32, "ExitProcess");
+			initialized = 1;
+		}
+		if (!exit_process_address ||
+		    !process_architecture_matches_current(process))
+			return terminate_process_tree(process, exit_code);
+
+		thread = CreateRemoteThread(process, NULL, 0,
+					    exit_process_address,
+					    arg, 0, &thread_id);
+		if (thread) {
+			CloseHandle(thread);
+			/*
+			 * If the process survives for 10 seconds (a completely
+			 * arbitrary value picked from thin air), fall back to
+			 * killing the process tree via TerminateProcess().
+			 */
+			if (WaitForSingleObject(process, 10000) ==
+			    WAIT_OBJECT_0) {
+				CloseHandle(process);
+				return 0;
+			}
+		}
+
+		return terminate_process_tree(process, exit_code);
+	}
+
+	return 0;
+}
+
+#endif

From e0ca06c21a1febd6c10da23f367cd0ff67b57281 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 7 Jul 2017 10:15:36 +0200
Subject: [PATCH 539/553] t9200: skip tests when $PWD contains a colon

On Windows, the current working directory is pretty much guaranteed to
contain a colon. If we feed that path to CVS, it mistakes it for a
separator between host and port, though.

This has not been a problem so far because Git for Windows uses MSYS2's
Bash using a POSIX emulation layer that also pretends that the current
directory is a Unix path (at least as long as we're in a shell script).

However, that is rather limiting, as Git for Windows also explores other
ports of other Unix shells. One of those is BusyBox-w32's ash, which is
a native port (i.e. *not* using any POSIX emulation layer, and certainly
not emulating Unix paths).

So let's just detect if there is a colon in $PWD and punt in that case.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t9200-git-cvsexportcommit.sh | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/t/t9200-git-cvsexportcommit.sh b/t/t9200-git-cvsexportcommit.sh
index 5249a9eb886e0b..026089f6806733 100755
--- a/t/t9200-git-cvsexportcommit.sh
+++ b/t/t9200-git-cvsexportcommit.sh
@@ -11,6 +11,13 @@ if ! test_have_prereq PERL; then
 	test_done
 fi
 
+case "$PWD" in
+*:*)
+	skip_all='cvs would get confused by the colon in `pwd`; skipping tests'
+	test_done
+	;;
+esac
+
 cvs >/dev/null 2>&1
 if test $? -ne 1
 then

From 45f857b44f5777ad82d876fa4634caa79b405049 Mon Sep 17 00:00:00 2001
From: xungeng li <xungeng@gmail.com>
Date: Wed, 7 Jun 2023 20:26:33 +0800
Subject: [PATCH 540/553] mingw: optionally enable wsl compatibility file mode
 bits

The Windows Subsystem for Linux (WSL) version 2 allows using `chmod` on
NTFS volumes provided that they are mounted with metadata enabled (see
https://devblogs.microsoft.com/commandline/chmod-chown-wsl-improvements/
for details), for example:

	$ chmod 0755 /mnt/d/test/a.sh

In order to facilitate better collaboration between the Windows
version of Git and the WSL version of Git, we can make the Windows
version of Git also support reading and writing NTFS file modes
in a manner compatible with WSL.

Since this slightly slows down operations where lots of files are
created (such as an initial checkout), this feature is only enabled when
`core.WSLCompat` is set to true. Note that you also have to set
`core.fileMode=true` in repositories that have been initialized without
enabling WSL compatibility.
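
For an existing repository, both settings can be applied with
`git config`. A minimal sketch; `enable_wsl_compat` is a hypothetical
helper name, and the repository path is whatever the caller passes in:

```sh
# Hypothetical helper: enable WSL-compatible mode bits in the
# repository at "$1" (defaults to the current directory).
enable_wsl_compat () {
	git -C "${1:-.}" config core.WSLCompat true &&
	git -C "${1:-.}" config core.fileMode true
}
```

Run it with the repository's working directory as its only argument.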

There are several ways to enable metadata loading for NTFS volumes
in WSL, one of which is to modify `/etc/wsl.conf` by adding:

```
[automount]
enabled = true
options = "metadata,umask=027,fmask=117"
```

Then restart WSL.

It can also be enabled temporarily by this incantation:

	$ sudo umount /mnt/c &&
	  sudo mount -t drvfs C: /mnt/c -o metadata,uid=1000,gid=1000,umask=22,fmask=111

It's important to note that this modification is compatible with, but
does not depend on, WSL. The helper functions in this commit operate
independently and function normally on devices where WSL is not
installed or properly configured.

Signed-off-by: xungeng li <xungeng@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config/core.adoc      |   6 ++
 compat/mingw.c                      |  13 +++
 compat/win32/fscache.c              |  16 ++++
 compat/win32/wsl.c                  | 142 ++++++++++++++++++++++++++++
 compat/win32/wsl.h                  |  12 +++
 config.mak.uname                    |   4 +-
 contrib/buildsystems/CMakeLists.txt |   1 +
 meson.build                         |   1 +
 8 files changed, 193 insertions(+), 2 deletions(-)
 create mode 100644 compat/win32/wsl.c
 create mode 100644 compat/win32/wsl.h

diff --git a/Documentation/config/core.adoc b/Documentation/config/core.adoc
index fb9aba0e08cef6..eb8957d4daca94 100644
--- a/Documentation/config/core.adoc
+++ b/Documentation/config/core.adoc
@@ -790,3 +790,9 @@ core.maxTreeDepth::
 	to allow Git to abort cleanly, and should not generally need to
 	be adjusted. When Git is compiled with MSVC, the default is 512.
 	Otherwise, the default is 2048.
+
+core.WSLCompat::
+	Tells Git whether to enable WSL compatibility mode.
+	The default value is false. When set to true, Git reads and writes
+	file mode bits the way WSL does, so that the executable flag of
+	files can be set and read correctly.
diff --git a/compat/mingw.c b/compat/mingw.c
index 428bae765ed942..6a1ecfdfac7fe8 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -18,6 +18,7 @@
 #include "win32.h"
 #include "win32/fscache.h"
 #include "win32/lazyload.h"
+#include "win32/wsl.h"
 #include "wrapper.h"
 #include "write-or-die.h"
 #include <aclapi.h>
@@ -920,6 +921,11 @@ int mingw_open (const char *filename, int oflags, ...)
 	if (fd < 0 && create && GetLastError() == ERROR_ACCESS_DENIED &&
 	    INIT_PROC_ADDR(RtlGetLastNtStatus) && RtlGetLastNtStatus() == STATUS_DELETE_PENDING)
 		errno = EEXIST;
+	else if ((oflags & O_CREAT) && fd >= 0 && are_wsl_compatible_mode_bits_enabled()) {
+		_mode_t wsl_mode = S_IFREG | (mode&0777);
+		set_wsl_mode_bits_by_handle((HANDLE)_get_osfhandle(fd), wsl_mode);
+	}
+
 	if (fd < 0 && (oflags & O_ACCMODE) != O_RDONLY && errno == EACCES) {
 		DWORD attrs = GetFileAttributesW(wfilename);
 		if (attrs != INVALID_FILE_ATTRIBUTES && (attrs & FILE_ATTRIBUTE_DIRECTORY))
@@ -1219,6 +1225,11 @@ int mingw_lstat(const char *file_name, struct stat *buf)
 		filetime_to_timespec(&(fdata.ftLastAccessTime), &(buf->st_atim));
 		filetime_to_timespec(&(fdata.ftLastWriteTime), &(buf->st_mtim));
 		filetime_to_timespec(&(fdata.ftCreationTime), &(buf->st_ctim));
+		if (S_ISREG(buf->st_mode) &&
+		    are_wsl_compatible_mode_bits_enabled()) {
+			copy_wsl_mode_bits_from_disk(wfilename, -1,
+						     &buf->st_mode);
+		}
 		return 0;
 	}
 
@@ -1270,6 +1281,8 @@ static int get_file_info_by_handle(HANDLE hnd, struct stat *buf)
 	filetime_to_timespec(&(fdata.ftLastAccessTime), &(buf->st_atim));
 	filetime_to_timespec(&(fdata.ftLastWriteTime), &(buf->st_mtim));
 	filetime_to_timespec(&(fdata.ftCreationTime), &(buf->st_ctim));
+	if (are_wsl_compatible_mode_bits_enabled())
+	    get_wsl_mode_bits_by_handle(hnd, &buf->st_mode);
 	return 0;
 }
 
diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c
index 0f5e00ae18f949..d2e67bd5ac0cd1 100644
--- a/compat/win32/fscache.c
+++ b/compat/win32/fscache.c
@@ -8,6 +8,7 @@
 #include "config.h"
 #include "../../mem-pool.h"
 #include "ntifs.h"
+#include "wsl.h"
 
 static volatile long initialized;
 static DWORD dwTlsIndex;
@@ -220,6 +221,21 @@ static struct fsentry *fseentry_create_entry(struct fscache *cache,
 			     &(fse->u.s.st_mtim));
 	filetime_to_timespec((FILETIME *)&(fdata->CreationTime),
 			     &(fse->u.s.st_ctim));
+	if (fdata->EaSize > 0 &&
+	    sizeof(buf) >= (size_t)(list ? list->len+1 : 0) + fse->len+1 &&
+	    are_wsl_compatible_mode_bits_enabled()) {
+		size_t off = 0;
+		wchar_t wpath[MAX_LONG_PATH];
+		if (list && list->len) {
+			memcpy(buf, list->dirent.d_name, list->len);
+			buf[list->len] = '/';
+			off = list->len + 1;
+		}
+		memcpy(buf + off, fse->dirent.d_name, fse->len);
+		buf[off + fse->len] = '\0';
+		if (xutftowcs_long_path(wpath, buf) >= 0)
+			copy_wsl_mode_bits_from_disk(wpath, -1, &fse->st_mode);
+	}
 
 	return fse;
 }
diff --git a/compat/win32/wsl.c b/compat/win32/wsl.c
new file mode 100644
index 00000000000000..ab599770138b4e
--- /dev/null
+++ b/compat/win32/wsl.c
@@ -0,0 +1,142 @@
+#define USE_THE_REPOSITORY_VARIABLE
+#include "../../git-compat-util.h"
+#include "../win32.h"
+#include "../../repository.h"
+#include "config.h"
+#include "ntifs.h"
+#include "wsl.h"
+
+int are_wsl_compatible_mode_bits_enabled(void)
+{
+	/* default to `false` during initialization */
+	static const int fallback = 0;
+	static int enabled = -1;
+
+	if (enabled < 0) {
+		/* avoid infinite recursion */
+		if (!the_repository)
+			return fallback;
+
+		if (the_repository->config &&
+		    the_repository->config->hash_initialized &&
+		    repo_config_get_bool(the_repository, "core.wslcompat", &enabled) < 0)
+			enabled = 0;
+	}
+
+	return enabled < 0 ? fallback : enabled;
+}
+
+int copy_wsl_mode_bits_from_disk(const wchar_t *wpath, ssize_t wpathlen,
+				 _mode_t *mode)
+{
+	int ret = -1;
+	HANDLE h;
+	if (wpathlen >= 0) {
+		/*
+		 * It's caller's duty to make sure wpathlen is reasonable so
+		 * it does not overflow.
+		 */
+		wchar_t *fn2 = (wchar_t*)alloca((wpathlen + 1) * sizeof(wchar_t));
+		memcpy(fn2, wpath, wpathlen * sizeof(wchar_t));
+		fn2[wpathlen] = 0;
+		wpath = fn2;
+	}
+	h = CreateFileW(wpath, FILE_READ_EA | SYNCHRONIZE,
+			FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
+			NULL, OPEN_EXISTING,
+			FILE_FLAG_BACKUP_SEMANTICS |
+				FILE_FLAG_OPEN_REPARSE_POINT,
+			NULL);
+	if (h != INVALID_HANDLE_VALUE) {
+		ret = get_wsl_mode_bits_by_handle(h, mode);
+		CloseHandle(h);
+	}
+	return ret;
+}
+
+#ifndef LX_FILE_METADATA_HAS_UID
+#define LX_FILE_METADATA_HAS_UID 0x1
+#define LX_FILE_METADATA_HAS_GID 0x2
+#define LX_FILE_METADATA_HAS_MODE 0x4
+#define LX_FILE_METADATA_HAS_DEVICE_ID 0x8
+#define LX_FILE_CASE_SENSITIVE_DIR 0x10
+typedef struct _FILE_STAT_LX_INFORMATION {
+	LARGE_INTEGER FileId;
+	LARGE_INTEGER CreationTime;
+	LARGE_INTEGER LastAccessTime;
+	LARGE_INTEGER LastWriteTime;
+	LARGE_INTEGER ChangeTime;
+	LARGE_INTEGER AllocationSize;
+	LARGE_INTEGER EndOfFile;
+	uint32_t FileAttributes;
+	uint32_t ReparseTag;
+	uint32_t NumberOfLinks;
+	ACCESS_MASK EffectiveAccess;
+	uint32_t LxFlags;
+	uint32_t LxUid;
+	uint32_t LxGid;
+	uint32_t LxMode;
+	uint32_t LxDeviceIdMajor;
+	uint32_t LxDeviceIdMinor;
+} FILE_STAT_LX_INFORMATION, *PFILE_STAT_LX_INFORMATION;
+#endif
+
+/*
+ * This struct is extended from the original FILE_FULL_EA_INFORMATION of
+ * Microsoft Windows.
+ */
+struct wsl_full_ea_info_t {
+	uint32_t NextEntryOffset;
+	uint8_t Flags;
+	uint8_t EaNameLength;
+	uint16_t EaValueLength;
+	char EaName[7];
+	char EaValue[4];
+	char Padding[1];
+};
+
+enum {
+	FileStatLxInformation = 70,
+};
+__declspec(dllimport) NTSTATUS WINAPI
+	NtQueryInformationFile(HANDLE FileHandle,
+			       PIO_STATUS_BLOCK IoStatusBlock,
+			       PVOID FileInformation, ULONG Length,
+			       uint32_t FileInformationClass);
+__declspec(dllimport) NTSTATUS WINAPI
+	NtSetInformationFile(HANDLE FileHandle, PIO_STATUS_BLOCK IoStatusBlock,
+			     PVOID FileInformation, ULONG Length,
+			     uint32_t FileInformationClass);
+__declspec(dllimport) NTSTATUS WINAPI
+	NtSetEaFile(HANDLE FileHandle, PIO_STATUS_BLOCK IoStatusBlock,
+		    PVOID EaBuffer, ULONG EaBufferSize);
+
+int set_wsl_mode_bits_by_handle(HANDLE h, _mode_t mode)
+{
+	uint32_t value = mode;
+	struct wsl_full_ea_info_t ea_info;
+	IO_STATUS_BLOCK iob;
+	/* mode should be valid to make WSL happy */
+	assert(S_ISREG(mode) || S_ISDIR(mode));
+	ea_info.NextEntryOffset = 0;
+	ea_info.Flags = 0;
+	ea_info.EaNameLength = 6;
+	ea_info.EaValueLength = sizeof(value); /* 4 */
+	strlcpy(ea_info.EaName, "$LXMOD", sizeof(ea_info.EaName));
+	memcpy(ea_info.EaValue, &value, sizeof(value));
+	ea_info.Padding[0] = 0;
+	return NtSetEaFile(h, &iob, &ea_info, sizeof(ea_info));
+}
+
+int get_wsl_mode_bits_by_handle(HANDLE h, _mode_t *mode)
+{
+	FILE_STAT_LX_INFORMATION fxi;
+	IO_STATUS_BLOCK iob;
+	if (NtQueryInformationFile(h, &iob, &fxi, sizeof(fxi),
+				   FileStatLxInformation) == 0) {
+		if (fxi.LxFlags & LX_FILE_METADATA_HAS_MODE)
+			*mode = (_mode_t)fxi.LxMode;
+		return 0;
+	}
+	return -1;
+}
diff --git a/compat/win32/wsl.h b/compat/win32/wsl.h
new file mode 100644
index 00000000000000..1f5ad7e67a4fc2
--- /dev/null
+++ b/compat/win32/wsl.h
@@ -0,0 +1,12 @@
+#ifndef COMPAT_WIN32_WSL_H
+#define COMPAT_WIN32_WSL_H
+
+int are_wsl_compatible_mode_bits_enabled(void);
+
+int copy_wsl_mode_bits_from_disk(const wchar_t *wpath, ssize_t wpathlen,
+				 _mode_t *mode);
+
+int get_wsl_mode_bits_by_handle(HANDLE h, _mode_t *mode);
+int set_wsl_mode_bits_by_handle(HANDLE h, _mode_t mode);
+
+#endif
diff --git a/config.mak.uname b/config.mak.uname
index bbcfeb4eee0f05..24580d8ae28b88 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -508,7 +508,7 @@ endif
 		compat/win32/path-utils.o \
 		compat/win32/pthread.o compat/win32/syslog.o \
 		compat/win32/trace2_win32_process_info.o \
-		compat/win32/dirent.o compat/win32/fscache.o
+		compat/win32/dirent.o compat/win32/fscache.o compat/win32/wsl.o
 	COMPAT_CFLAGS = -D__USE_MINGW_ACCESS -DDETECT_MSYS_TTY \
 		-DENSURE_MSYSTEM_IS_SET="\"$(MSYSTEM)\"" -DMINGW_PREFIX="\"$(patsubst /%,%,$(MINGW_PREFIX))\"" \
 		-DNOGDI -DHAVE_STRING_H -Icompat -Icompat/regex -Icompat/win32 -DSTRIP_EXTENSION=\".exe\"
@@ -713,7 +713,7 @@ ifeq ($(uname_S),MINGW)
 		compat/win32/flush.o \
 		compat/win32/path-utils.o \
 		compat/win32/pthread.o compat/win32/syslog.o \
-		compat/win32/dirent.o compat/win32/fscache.o
+		compat/win32/dirent.o compat/win32/fscache.o compat/win32/wsl.o
 	BASIC_CFLAGS += -DWIN32
 	EXTLIBS += -lws2_32
 	GITLIBS += git.res
diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index 7be6a0a7fccb72..65336d921375e8 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -300,6 +300,7 @@ if(CMAKE_SYSTEM_NAME STREQUAL "Windows")
 		compat/win32/syslog.c
 		compat/win32/trace2_win32_process_info.c
 		compat/win32/dirent.c
+		compat/win32/wsl.c
 		compat/nedmalloc/nedmalloc.c
 		compat/strdup.c
 		compat/win32/fscache.c)
diff --git a/meson.build b/meson.build
index e6c2e592b4297a..a479703802286d 100644
--- a/meson.build
+++ b/meson.build
@@ -1264,6 +1264,7 @@ elif host_machine.system() == 'windows'
     'compat/win32/path-utils.c',
     'compat/win32/pthread.c',
     'compat/win32/syslog.c',
+    'compat/win32/wsl.c',
     'compat/win32mmap.c',
     'compat/nedmalloc/nedmalloc.c',
   ]

From 93517b01e1b7690fd52afbae980b4f16c42cbd50 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 23 Apr 2018 00:24:29 +0200
Subject: [PATCH 541/553] mingw: really handle SIGINT

Previously, we did not install any handler for Ctrl+C, but now we really
want to because the MSYS2 runtime learned the trick to call the
ConsoleCtrlHandler when Ctrl+C was pressed.

With this, hitting Ctrl+C while `git log` is running will only terminate
the Git process, but not the pager. This finally matches the behavior on
Linux and on macOS.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/compat/mingw.c b/compat/mingw.c
index eb39defc6e93c7..bec3cc01fa0504 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -4099,7 +4099,14 @@ static void adjust_symlink_flags(void)
 		symlink_file_flags |= 2;
 		symlink_directory_flags |= 2;
 	}
+}
 
+static BOOL WINAPI handle_ctrl_c(DWORD ctrl_type)
+{
+	if (ctrl_type != CTRL_C_EVENT)
+		return FALSE; /* we did not handle this */
+	mingw_raise(SIGINT);
+	return TRUE; /* we did handle this */
 }
 
 #ifdef _MSC_VER
@@ -4136,6 +4143,8 @@ int wmain(int argc, const wchar_t **wargv)
 #endif
 #endif
 
+	SetConsoleCtrlHandler(handle_ctrl_c, TRUE);
+
 	maybe_redirect_std_handles();
 	adjust_symlink_flags();
 	fsync_object_files = 1;

From 63c8ebd75a3a12951875ce1991bd70d721100056 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Thu, 25 Nov 2021 11:26:41 +0100
Subject: [PATCH 542/553] Partially un-revert "editor: save and reset terminal
 after calling EDITOR"

In e3f7e01b50be (Revert "editor: save and reset terminal after calling
EDITOR", 2021-11-22), we reverted the commit wholesale where the
terminal state would be saved and restored before/after calling an
editor.

The reverted commit was intended to fix a problem with Windows Terminal
where simply calling `vi` would cause problems afterwards.

To fix the problem addressed by the revert, but _still_ keep the problem
with Windows Terminal fixed, let's revert the revert, with a twist: we
restrict the save/restore _specifically_ to the case where `vi` (or
`vim`) is called, and do not do the same for any other editor.

This should still catch the majority of the cases, and will bridge the
time until the original patch is re-done in a way that addresses all
concerns.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 editor.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/editor.c b/editor.c
index fd174e6a034f1c..f6d960c6f30782 100644
--- a/editor.c
+++ b/editor.c
@@ -13,6 +13,7 @@
 #include "strvec.h"
 #include "run-command.h"
 #include "sigchain.h"
+#include "compat/terminal.h"
 
 #ifndef DEFAULT_EDITOR
 #define DEFAULT_EDITOR "vi"
@@ -64,6 +65,7 @@ static int launch_specified_editor(const char *editor, const char *path,
 		return error("Terminal is dumb, but EDITOR unset");
 
 	if (strcmp(editor, ":")) {
+		int save_and_restore_term = !strcmp(editor, "vi") || !strcmp(editor, "vim");
 		struct strbuf realpath = STRBUF_INIT;
 		struct child_process p = CHILD_PROCESS_INIT;
 		int ret, sig;
@@ -92,7 +94,11 @@ static int launch_specified_editor(const char *editor, const char *path,
 			strvec_pushv(&p.env, (const char **)env);
 		p.use_shell = 1;
 		p.trace2_child_class = "editor";
+		if (save_and_restore_term)
+			save_and_restore_term = !save_term(1);
 		if (start_command(&p) < 0) {
+			if (save_and_restore_term)
+				restore_term();
 			strbuf_release(&realpath);
 			return error("unable to start editor '%s'", editor);
 		}
@@ -100,6 +106,8 @@ static int launch_specified_editor(const char *editor, const char *path,
 		sigchain_push(SIGINT, SIG_IGN);
 		sigchain_push(SIGQUIT, SIG_IGN);
 		ret = finish_command(&p);
+		if (save_and_restore_term)
+			restore_term();
 		strbuf_release(&realpath);
 		sig = ret - 128;
 		sigchain_pop(SIGINT);

From 1c0a7e713ba1542c4fcfb49ead3d4711d56341cd Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Tue, 29 Sep 2020 13:50:59 +0200
Subject: [PATCH 543/553] Add a GitHub workflow to monitor component updates
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Rather than using private IFTTT Applets that send mails to this
maintainer whenever a new version of a Git for Windows component was
released, let's use the power of GitHub workflows to make this process
publicly visible.

This workflow monitors the Atom/RSS feeds, and opens a ticket whenever a
new version was released.

Note: Bash sometimes releases multiple patched versions within a few
minutes of each other (i.e. 5.1p1 through 5.1p4, 5.0p15 and 5.0p16). The
MSYS2 runtime also has a similar system. We can address those patches as
a group, so we shouldn't get multiple issues about them.

Note further: We're not acting on newlib releases, OpenSSL alphas, Perl
release candidates or non-stable Perl releases. There's no need to open
issues about them.

Co-authored-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 .github/workflows/monitor-components.yml | 94 ++++++++++++++++++++++++
 1 file changed, 94 insertions(+)
 create mode 100644 .github/workflows/monitor-components.yml

diff --git a/.github/workflows/monitor-components.yml b/.github/workflows/monitor-components.yml
new file mode 100644
index 00000000000000..f15ff218d28b81
--- /dev/null
+++ b/.github/workflows/monitor-components.yml
@@ -0,0 +1,94 @@
+name: Monitor component updates
+
+# Git for Windows is a slightly modified subset of MSYS2. Some of its
+# components are maintained by Git for Windows, others by MSYS2. To help
+# keep the former up to date, this workflow monitors the Atom/RSS feeds
+# and opens new tickets for each new component version.
+
+on:
+  schedule:
+    - cron: "23 8,11,14,17 * * *"
+  workflow_dispatch:
+
+env:
+  CHARACTER_LIMIT: 5000
+  MAX_AGE: 7d
+
+jobs:
+  job:
+    # Only run this in Git for Windows' fork
+    if: github.event.repository.owner.login == 'git-for-windows'
+    runs-on: ubuntu-latest
+    permissions:
+      issues: write
+    strategy:
+      matrix:
+        component:
+          - label: git
+            feed: https://github.com/git/git/tags.atom
+          - label: git-lfs
+            feed: https://github.com/git-lfs/git-lfs/tags.atom
+          - label: git-credential-manager
+            feed: https://github.com/git-ecosystem/git-credential-manager/tags.atom
+          - label: tig
+            feed: https://github.com/jonas/tig/tags.atom
+          - label: cygwin
+            feed: https://github.com/cygwin/cygwin/releases.atom
+            title-pattern: ^(?!.*newlib)
+          - label: msys2-runtime-package
+            feed: https://github.com/msys2/MSYS2-packages/commits/master/msys2-runtime.atom
+          - label: msys2-runtime
+            feed: https://github.com/msys2/msys2-runtime/commits/HEAD.atom
+            aggregate: true
+          - label: openssh
+            feed: https://github.com/openssh/openssh-portable/tags.atom
+          - label: libfido2
+            feed: https://github.com/Yubico/libfido2/tags.atom
+          - label: libcbor
+            feed: https://github.com/PJK/libcbor/tags.atom
+          - label: openssl
+            feed: https://github.com/openssl/openssl/tags.atom
+            title-pattern: ^(?!.*alpha)
+          - label: gnutls
+            feed: https://gnutls.org/news.atom
+          - label: heimdal
+            feed: https://github.com/heimdal/heimdal/tags.atom
+          - label: git-sizer
+            feed: https://github.com/github/git-sizer/tags.atom
+          - label: gitflow
+            feed: https://github.com/petervanderdoes/gitflow-avh/tags.atom
+          - label: curl
+            feed: https://github.com/curl/curl/tags.atom
+            title-pattern: ^(?!rc-)
+          - label: mintty
+            feed: https://github.com/mintty/mintty/releases.atom
+          - label: 7-zip
+            feed: https://sourceforge.net/projects/sevenzip/rss?path=/7-Zip
+            aggregate: true
+          - label: bash
+            feed: https://git.savannah.gnu.org/cgit/bash.git/atom/?h=master
+            aggregate: true
+          - label: perl
+            feed: https://github.com/Perl/perl5/tags.atom
+            title-pattern: ^(?!.*(5\.[0-9]+[13579]|RC))
+          - label: pcre2
+            feed: https://github.com/PCRE2Project/pcre2/tags.atom
+          - label: mingw-w64-llvm
+            feed: https://github.com/msys2/MINGW-packages/commits/master/mingw-w64-llvm.atom
+          - label: innosetup
+            feed: https://github.com/jrsoftware/issrc/tags.atom
+          - label: mimalloc
+            feed: https://github.com/microsoft/mimalloc/tags.atom
+            title-pattern: ^(?!v1\.|v3\.[01]\.)
+      fail-fast: false
+    steps:
+      - uses: git-for-windows/rss-to-issues@v0
+        with:
+          feed: ${{matrix.component.feed}}
+          prefix: "[New ${{matrix.component.label}} version]"
+          labels: component-update
+          github-token: ${{ secrets.GITHUB_TOKEN }}
+          character-limit: ${{ env.CHARACTER_LIMIT }}
+          max-age: ${{ env.MAX_AGE }}
+          aggregate: ${{matrix.component.aggregate}}
+          title-pattern: ${{matrix.component.title-pattern}}

From e084de92d47fa44fb3a950d0e7c23aa8173e3a6d Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Tue, 10 Dec 2019 21:41:57 +0100
Subject: [PATCH 544/553] reset: reinstate support for the deprecated --stdin
 option

The `--stdin` option was a well-established paradigm in other commands,
therefore we implemented it in `git reset` for use by Visual Studio.

Unfortunately, upstream Git decided that it is time to introduce
`--pathspec-from-file` instead.

To keep backwards-compatibility for some grace period, we therefore
reinstate the `--stdin` option on top of the `--pathspec-from-file`
option, but mark it firmly as deprecated.

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/git-reset.adoc | 11 +++++++++++
 builtin/reset.c              | 16 ++++++++++++++++
 t/meson.build                |  1 +
 t/t7108-reset-stdin.sh       | 32 ++++++++++++++++++++++++++++++++
 4 files changed, 60 insertions(+)
 create mode 100755 t/t7108-reset-stdin.sh

diff --git a/Documentation/git-reset.adoc b/Documentation/git-reset.adoc
index 3b9ba9aee95203..31cca828a36820 100644
--- a/Documentation/git-reset.adoc
+++ b/Documentation/git-reset.adoc
@@ -12,6 +12,7 @@ git reset [-q] [<tree-ish>] [--] <pathspec>...
 git reset [-q] [--pathspec-from-file=<file> [--pathspec-file-nul]] [<tree-ish>]
 git reset (--patch | -p) [<tree-ish>] [--] [<pathspec>...]
 git reset [--soft | --mixed [-N] | --hard | --merge | --keep] [-q] [<commit>]
+DEPRECATED: git reset [-q] [--stdin [-z]] [<tree-ish>]
 
 DESCRIPTION
 -----------
@@ -136,6 +137,16 @@ include::diff-context-options.adoc[]
 +
 For more details, see the 'pathspec' entry in linkgit:gitglossary[7].
 
+`--stdin`::
+	DEPRECATED (use `--pathspec-from-file=-` instead): Instead of taking
+	list of paths from the command line, read list of paths from the
+	standard input. Paths are separated by LF (i.e. one path per line) by
+	default.
+
+`-z`::
+	DEPRECATED (use `--pathspec-file-nul` instead): Only meaningful with
+	`--stdin`; paths are separated with NUL character instead of LF.
+
 EXAMPLES
 --------
 
diff --git a/builtin/reset.c b/builtin/reset.c
index ed35802af15c94..54244b6e32ea5a 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -38,6 +38,8 @@
 #include "trace2.h"
 #include "dir.h"
 #include "add-interactive.h"
+#include "strbuf.h"
+#include "quote.h"
 
 #define REFRESH_INDEX_DELAY_WARNING_IN_MS (2 * 1000)
 
@@ -46,6 +48,7 @@ static const char * const git_reset_usage[] = {
 	N_("git reset [-q] [<tree-ish>] [--] <pathspec>..."),
 	N_("git reset [-q] [--pathspec-from-file [--pathspec-file-nul]] [<tree-ish>]"),
 	N_("git reset --patch [<tree-ish>] [--] [<pathspec>...]"),
+	N_("DEPRECATED: git reset [-q] [--stdin [-z]] [<tree-ish>]"),
 	NULL
 };
 
@@ -347,6 +350,7 @@ int cmd_reset(int argc,
 	struct pathspec pathspec;
 	int intent_to_add = 0;
 	struct add_p_opt add_p_opt = ADD_P_OPT_INIT;
+	int nul_term_line = 0, read_from_stdin = 0;
 	const struct option options[] = {
 		OPT__QUIET(&quiet, N_("be quiet, only report errors")),
 		OPT_BOOL(0, "no-refresh", &no_refresh,
@@ -377,6 +381,10 @@ int cmd_reset(int argc,
 				N_("record only the fact that removed paths will be added later")),
 		OPT_PATHSPEC_FROM_FILE(&pathspec_from_file),
 		OPT_PATHSPEC_FILE_NUL(&pathspec_file_nul),
+		OPT_BOOL('z', NULL, &nul_term_line,
+			N_("DEPRECATED (use --pathspec-file-nul instead): paths are separated with NUL character")),
+		OPT_BOOL(0, "stdin", &read_from_stdin,
+				N_("DEPRECATED (use --pathspec-from-file=- instead): read paths from <stdin>")),
 		OPT_END()
 	};
 
@@ -386,6 +394,14 @@ int cmd_reset(int argc,
 						PARSE_OPT_KEEP_DASHDASH);
 	parse_args(&pathspec, argv, prefix, patch_mode, &rev);
 
+	if (read_from_stdin) {
+		warning(_("--stdin is deprecated, please use --pathspec-from-file=- instead"));
+		free(pathspec_from_file);
+		pathspec_from_file = xstrdup("-");
+		if (nul_term_line)
+			pathspec_file_nul = 1;
+	}
+
 	if (pathspec_from_file) {
 		if (patch_mode)
 			die(_("options '%s' and '%s' cannot be used together"), "--pathspec-from-file", "--patch");
diff --git a/t/meson.build b/t/meson.build
index 1e26a4c7a9548d..97af1c56b723c1 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -860,6 +860,7 @@ integration_tests = [
   't7105-reset-patch.sh',
   't7106-reset-unborn-branch.sh',
   't7107-reset-pathspec-file.sh',
+  't7108-reset-stdin.sh',
   't7110-reset-merge.sh',
   't7111-reset-table.sh',
   't7112-reset-submodule.sh',
diff --git a/t/t7108-reset-stdin.sh b/t/t7108-reset-stdin.sh
new file mode 100755
index 00000000000000..b7cbcbf869296c
--- /dev/null
+++ b/t/t7108-reset-stdin.sh
@@ -0,0 +1,32 @@
+#!/bin/sh
+
+test_description='reset --stdin'
+
+. ./test-lib.sh
+
+test_expect_success 'reset --stdin' '
+	test_commit hello &&
+	git rm hello.t &&
+	test -z "$(git ls-files hello.t)" &&
+	echo hello.t | git reset --stdin &&
+	test hello.t = "$(git ls-files hello.t)"
+'
+
+test_expect_success 'reset --stdin -z' '
+	test_commit world &&
+	git rm hello.t world.t &&
+	test -z "$(git ls-files hello.t world.t)" &&
+	printf world.tQworld.tQhello.tQ | q_to_nul | git reset --stdin -z &&
+	printf "hello.t\nworld.t\n" >expect &&
+	git ls-files >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success '--stdin requires --mixed' '
+	echo hello.t >list &&
+	test_must_fail git reset --soft --stdin <list &&
+	test_must_fail git reset --hard --stdin <list &&
+	git reset --mixed --stdin <list
+'
+
+test_done

From 1904cea6f03c4587d15cf8f6eba11ce14563413f Mon Sep 17 00:00:00 2001
From: Victoria Dye <vdye@github.com>
Date: Mon, 4 Apr 2022 15:38:58 -0700
Subject: [PATCH 545/553] fsmonitor: reintroduce core.useBuiltinFSMonitor

Reintroduce the 'core.useBuiltinFSMonitor' config setting (originally added
in 0a756b2a25 (fsmonitor: config settings are repository-specific,
2021-03-05)) after its removal from the upstream version of FSMonitor.

Upstream, the 'core.useBuiltinFSMonitor' setting was rendered obsolete by
"overloading" the 'core.fsmonitor' setting to take a boolean value. However,
several applications (e.g., 'scalar') utilize the original config setting,
so it should be preserved for a deprecation period before complete removal:

* if 'core.fsmonitor' is a boolean, the user is correctly using the new
  config syntax; do not use 'core.useBuiltinFSMonitor'.
* if 'core.fsmonitor' is unspecified, use 'core.useBuiltinFSMonitor'.
* if 'core.fsmonitor' is a path, override and use the builtin FSMonitor if
  'core.useBuiltinFSMonitor' is 'true'; otherwise, use the FSMonitor hook
  indicated by the path.
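
As a rough illustration of the precedence rules above (not the code in
this patch), the decision can be modeled as a small pure function. The
names `fsm_mode`, `parse_bool()` and `resolve_fsmonitor()` are
hypothetical, the boolean parsing is simplified compared to Git's config
parser, and the `GIT_TEST_FSMONITOR` fallback is left out:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration only; not Git's internal API. */
enum fsm_mode { FSM_DISABLED, FSM_BUILTIN, FSM_HOOK };

/*
 * Simplified boolean parser (assumption: Git's real parser accepts more
 * spellings). Returns 1 and stores the value if `value` is a boolean.
 */
static int parse_bool(const char *value, int *result)
{
	static const char *const truthy[] = { "true", "yes", "on", "1" };
	static const char *const falsy[] = { "false", "no", "off", "0" };
	size_t i;

	for (i = 0; i < sizeof(truthy) / sizeof(truthy[0]); i++) {
		if (!strcmp(value, truthy[i])) {
			*result = 1;
			return 1;
		}
		if (!strcmp(value, falsy[i])) {
			*result = 0;
			return 1;
		}
	}
	return 0;
}

/*
 * core_fsmonitor: raw value of 'core.fsmonitor', or NULL if unset.
 * use_builtin:    value of 'core.useBuiltinFSMonitor' (0 if unset).
 */
static enum fsm_mode resolve_fsmonitor(const char *core_fsmonitor,
				       int use_builtin)
{
	int b;

	/* New syntax: a boolean 'core.fsmonitor' always wins. */
	if (core_fsmonitor && parse_bool(core_fsmonitor, &b))
		return b ? FSM_BUILTIN : FSM_DISABLED;

	/* Unset or a path: the deprecated setting overrides if true. */
	if (use_builtin)
		return FSM_BUILTIN;

	/* Otherwise a path selects the hook; unset means disabled. */
	return core_fsmonitor ? FSM_HOOK : FSM_DISABLED;
}
```

In this model, `resolve_fsmonitor("/path/to/hook", 1)` yields
`FSM_BUILTIN`, while `resolve_fsmonitor("false", 1)` yields
`FSM_DISABLED`.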

Additionally, for this deprecation period, advise users to switch to using
'core.fsmonitor' to specify their use of the builtin FSMonitor.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 Documentation/config/advice.adoc |  4 ++++
 advice.c                         |  1 +
 advice.h                         |  1 +
 fsmonitor-settings.c             | 34 ++++++++++++++++++++++++++++++--
 4 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/Documentation/config/advice.adoc b/Documentation/config/advice.adoc
index 257db58918179a..f156f638dcd5ee 100644
--- a/Documentation/config/advice.adoc
+++ b/Documentation/config/advice.adoc
@@ -166,4 +166,8 @@ all advice messages.
 		Shown when the user tries to create a worktree from an
 		invalid reference, to tell the user how to create a new unborn
 		branch instead.
+
+	useCoreFSMonitorConfig::
+		Advice shown if the deprecated 'core.useBuiltinFSMonitor' config
+		setting is in use.
 --
diff --git a/advice.c b/advice.c
index 0018501b7bc103..01f0fe407e84a4 100644
--- a/advice.c
+++ b/advice.c
@@ -89,6 +89,7 @@ static struct {
 	[ADVICE_SUBMODULE_MERGE_CONFLICT]               = { "submoduleMergeConflict" },
 	[ADVICE_SUGGEST_DETACHING_HEAD]			= { "suggestDetachingHead" },
 	[ADVICE_UPDATE_SPARSE_PATH]			= { "updateSparsePath" },
+	[ADVICE_USE_CORE_FSMONITOR_CONFIG]		= { "useCoreFSMonitorConfig" },
 	[ADVICE_WAITING_FOR_EDITOR]			= { "waitingForEditor" },
 	[ADVICE_WORKTREE_ADD_ORPHAN]			= { "worktreeAddOrphan" },
 };
diff --git a/advice.h b/advice.h
index 8def28068861df..d5d7696897351e 100644
--- a/advice.h
+++ b/advice.h
@@ -56,6 +56,7 @@ enum advice_type {
 	ADVICE_SUBMODULE_MERGE_CONFLICT,
 	ADVICE_SUGGEST_DETACHING_HEAD,
 	ADVICE_UPDATE_SPARSE_PATH,
+	ADVICE_USE_CORE_FSMONITOR_CONFIG,
 	ADVICE_WAITING_FOR_EDITOR,
 	ADVICE_WORKTREE_ADD_ORPHAN,
 };
diff --git a/fsmonitor-settings.c b/fsmonitor-settings.c
index a6587a8972b184..b4c29f44a27827 100644
--- a/fsmonitor-settings.c
+++ b/fsmonitor-settings.c
@@ -5,6 +5,7 @@
 #include "fsmonitor-ipc.h"
 #include "fsmonitor-settings.h"
 #include "fsmonitor-path-utils.h"
+#include "advice.h"
 
 /*
  * We keep this structure definition private and have getters
@@ -100,6 +101,31 @@ static struct fsmonitor_settings *alloc_settings(void)
 	return s;
 }
 
+static int check_deprecated_builtin_config(struct repository *r)
+{
+	int core_use_builtin_fsmonitor = 0;
+
+	/*
+	 * If 'core.useBuiltinFSMonitor' is set, print a deprecation warning
+	 * suggesting the use of 'core.fsmonitor' instead. If the config is
+	 * set to true, set the appropriate mode and return 1 indicating that
+	 * the check resulted in the config being set by this (deprecated) setting.
+	 */
+	if (!repo_config_get_bool(r, "core.useBuiltinFSMonitor", &core_use_builtin_fsmonitor) &&
+	   core_use_builtin_fsmonitor) {
+		if (!git_env_bool("GIT_SUPPRESS_USEBUILTINFSMONITOR_ADVICE", 0)) {
+			advise_if_enabled(ADVICE_USE_CORE_FSMONITOR_CONFIG,
+					  _("core.useBuiltinFSMonitor=true is deprecated; "
+					    "please set core.fsmonitor=true instead"));
+			setenv("GIT_SUPPRESS_USEBUILTINFSMONITOR_ADVICE", "1", 1);
+		}
+		fsm_settings__set_ipc(r);
+		return 1;
+	}
+
+	return 0;
+}
+
 static void lookup_fsmonitor_settings(struct repository *r)
 {
 	const char *const_str;
@@ -126,12 +152,16 @@ static void lookup_fsmonitor_settings(struct repository *r)
 		return;
 
 	case 1: /* config value was unset */
+		if (check_deprecated_builtin_config(r))
+			return;
+
 		const_str = getenv("GIT_TEST_FSMONITOR");
 		break;
 
 	case -1: /* config value set to an arbitrary string */
-		if (repo_config_get_pathname(r, "core.fsmonitor", &to_free))
-			return; /* should not happen */
+		if (check_deprecated_builtin_config(r) ||
+		    repo_config_get_pathname(r, "core.fsmonitor", &to_free))
+			return;
 		const_str = to_free;
 		break;
 

From e08b70347e8c14cf3f2a7c70d8312c47e3caa4fd Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Tue, 6 Feb 2024 18:45:35 +0100
Subject: [PATCH 546/553] dependabot: help keep GitHub Actions versions up
 to date

See https://docs.github.com/en/code-security/dependabot/working-with-dependabot/keeping-your-actions-up-to-date-with-dependabot#enabling-dependabot-version-updates-for-actions for details.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 .github/dependabot.yml | 13 +++++++++++++
 1 file changed, 13 insertions(+)
 create mode 100644 .github/dependabot.yml

diff --git a/.github/dependabot.yml b/.github/dependabot.yml
new file mode 100644
index 00000000000000..22d5376407abf1
--- /dev/null
+++ b/.github/dependabot.yml
@@ -0,0 +1,13 @@
+# To get started with Dependabot version updates, you'll need to specify which
+# package ecosystems to update and where the package manifests are located.
+# Please see the documentation for all configuration options:
+# https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file
+# especially
+# https://docs.github.com/en/code-security/dependabot/working-with-dependabot/keeping-your-actions-up-to-date-with-dependabot#enabling-dependabot-version-updates-for-actions
+
+version: 2
+updates:
+  - package-ecosystem: "github-actions" # See documentation for possible values
+    directory: "/" # Location of package manifests
+    schedule:
+      interval: "weekly"

From 23e94407fa5b7489eef603fc67f1c6c84e824720 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 13 Feb 2023 13:31:35 +0100
Subject: [PATCH 547/553] Describe Git for Windows' architecture [no ci]

The Git for Windows project has grown quite complex over the years,
certainly much more complex than during the first years where the
`msysgit.git` repository was abusing Git for package management purposes
and the `git/git` fork was called `4msysgit.git`.

Let's describe the status quo in a thorough way.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 ARCHITECTURE.md | 116 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 116 insertions(+)
 create mode 100644 ARCHITECTURE.md

diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
new file mode 100644
index 00000000000000..7de4f99bf71ec4
--- /dev/null
+++ b/ARCHITECTURE.md
@@ -0,0 +1,116 @@
+# Architecture of Git for Windows
+
+Git for Windows is a complex project.
+
+## What _is_ Git for Windows?
+
+### A fork of `git/git`
+
+First and foremost, it is a friendly fork of [`git/git`](https://github.com/git/git), aiming to improve Git's Windows support. The [`git-for-windows/git`](https://github.com/git-for-windows/git) repository contains dozens of topics on top of `git/git`, some waiting to be "upstreamed" (i.e. to be contributed to `git/git`), some still being stabilized, and a few specific to the Git for Windows project and not intended to be integrated into `git/git` at all.
+
+### Enhancing and maintaining Git's support for Windows
+
+On the source code side, Git's Windows support is made a bit more tricky than strictly necessary by the fact that Git does not have any platform abstraction layer (unlike other version control systems, such as Subversion). It relies on the presence of POSIX features such as the `hstrerror()` function, and on platforms lacking that functionality, Git provides shims. That leads to some challenges e.g. with the `stat()` function which is very slow on Windows because it has to collect much more metadata than what e.g. the very quick `GetFileAttributesExW()` Win32 API function provides, even when Git calls `stat()` merely to test for the presence of a file (for which all that gathered metadata is totally irrelevant).
+
+### Providing more than just source code
+
+In contrast to the Git project, Git for Windows not only publishes tagged source code versions, but full builds of Git. In fact, Git for Windows' primary purpose, as far as most users are concerned, is to provide a convenient installer that end-users can run to have Git on their computer, without ever having to check out `git-for-windows/git` let alone build it. In essence, Git for Windows has to maintain a separate project altogether in addition to the fork of `git/git`, just to build these release artifacts: [`git-for-windows/build-extra`](https://github.com/git-for-windows/build-extra). This repository also contains the definition for a couple of other release artifacts published by Git for Windows, e.g. the "portable" edition of Git for Windows which is a self-extracting 7-Zip archive that does not need to be installed.
+
+### A software distribution, really
+
+Another aspect that contributes to the complexity of Git for Windows is that it does not just build `git.exe` and distribute that. Due to its heritage within the Linux project, Git takes certain things for granted, such as the presence of a Unix shell, or for that matter, a package management system from which dependencies can be fetched and updated independently of Git itself; these things are distinctly not present in most Windows setups. To accommodate that, Git for Windows originally relied on the MSys project, a minimal fork of Cygwin providing a Unix shell ("Bash"), a Perl interpreter and similar Unix-like tools, and on the MINGW project, a project to build libraries and executables using a GNU C Compiler that relies only on Win32 API functions. As of Git for Windows v2.x, the project has switched away from [MSys](https://sourceforge.net/projects/mingw/files/MSYS/)/[MinGW](https://osdn.net/projects/mingw/) (due to less-than-active maintenance) to [the MSYS2 project](https://msys2.org). That switch brought along the benefit of a robust package management system based on [Pacman](https://archlinux.org/pacman/) (hailing from Arch Linux). To support Windows users, who are in general unfamiliar with Linux-like package management and the need to update installed packages frequently, Git for Windows bundles a subset of its own fork of MSYS2. To put things in perspective: Git for Windows bundles files from ~170 packages, one of which contains Git, and another one contains Git's help files. In that respect, Git for Windows acts like a distribution more than like a mere single software application.
+
+Most of MSYS2's packages that are bundled in Git for Windows are consumed directly from MSYS2. Others need forks that are maintained by the Git for Windows project, to support Git for Windows better. These forks live in the [`git-for-windows/MSYS2-packages`](https://github.com/git-for-windows/MSYS2-packages) and [`git-for-windows/MINGW-packages`](https://github.com/git-for-windows/MINGW-packages) repositories. There are several reasons justifying these forks. For example, Git for Windows' flavor of the MSYS2 runtime behaves like Git's test suite expects it while MSYS2's flavor does not. Another example: The Bash executable bundled in Git for Windows is code-signed with the same certificate as `git.exe` to help anti-malware programs get out of the users' way. That is why Git for Windows maintains its own `bash` Pacman package. And since MSYS2 dropped 32-bit support already, Git for Windows has to update the 32-bit Pacman packages itself, which is done in the `git-for-windows/MSYS2-packages` repository. (Side note: the 32-bit issue is a bit more complicated, actually: MSYS2 _still_ builds _MINGW_ packages targeting i686 processors, but no longer any _MSYS_ packages for said processor architecture, and Git for Windows does not keep all of the 32-bit MSYS packages up to date but instead judiciously decides which packages are vital enough as far as Git is concerned to justify the maintenance cost.)
+
+### Supporting third-party applications that use Git's functionality
+
+Since the infrastructure required by Git is non-trivial, the installer (or for that matter, the Portable Git) is not exactly light-weight: As of January 2023, both artifacts are over fifty megabytes. This is a problem for third-party applications wishing to bundle a version of Git for Windows, which is often advisable given that applications may depend on features that have been introduced only in recent Git versions and therefore relying on an installed Git for Windows could break things. To help with that, the Git for Windows project also provides MinGit as a release artifact, a zip file that is much smaller than the full installer and that contains only the parts of Git for Windows relevant for third-party applications. It lacks Git GUI, for example, as well as the terminal program MinTTY, or for that matter, the documentation.
+
+### Supporting `git/git`'s GitHub workflows
+
+The Git for Windows project is also responsible for keeping the Windows part of `git/git`'s automated builds up and running. On Windows, there is no canonical and easy way to get a build environment necessary to build Git and run its test suite, therefore this is a non-trivial task that comes with its own maintenance cost. Git for Windows provides two GitHub Actions to help with that: [`git-for-windows/setup-git-for-windows-sdk`](https://github.com/git-for-windows/setup-git-for-windows-sdk) to set up a tiny subset of Git for Windows' full SDK (which would require about 500MB to be cloned, as opposed to the ~75MB of that subset) and [`git-for-windows/get-azure-pipelines-artifact`](https://github.com/git-for-windows/get-azure-pipelines-artifact) e.g. to download some regularly pre-built artifacts (for example, when `git/git`'s automated tests ran on an Ubuntu version that did not provide an up to date [Coccinelle](https://coccinelle.gitlabpages.inria.fr/website/) package, this GitHub Action was used to download a pre-built version of that Debian package).
+
+## Maintaining Git for Windows' components
+
+Git for Windows uses a combination of [a GitHub App called GitForWindowsHelper](https://github.com/git-for-windows/gfw-helper-github-app) (to listen for so-called [slash commands](https://github.com/git-for-windows/gfw-helper-github-app#slash-commands)) and workflows in [the `git-for-windows-automation` repository](https://github.com/git-for-windows/git-for-windows-automation/) (for computationally heavy tasks) to support Git for Windows' repetitive tasks.
+
+This heavy automation serves two purposes:
+
+1. Document the knowledge about "how things are done" in the Git for Windows project.
+2. Make Git for Windows' maintenance less tedious by off-loading as many tasks onto machines as possible.
+
+One neat trick of some `git-for-windows-automation` workflows is that they "mirror back" check runs to the targeted PRs in another repository. This essentially allows versioning the source code independently of the workflow definition.
+
+Here is a diagram showing how the bits and pieces fit together.
+
+```mermaid
+graph LR
+  A[`monitor-components`] --> |opens| B
+  B{issues labeled<br />`component-update`} --> |/open pr| C
+  C((GitForWindowsHelper)) --> |triggers| D
+  D[`open-pr`] --> |opens| E
+  E{PR in<br />MINGW-packages<br />MSYS2-packages<br />build-extra} --> |closes| B
+  E --> |/deploy| F
+  F((GitForWindowsHelper)) --> |triggers| G
+  G[`build-and-deploy`] --> |deploys to| H
+  H{Pacman repository}
+  C --> |backed by| I
+  F --> |backed by| I
+  I[[Azure Function]]
+  D --> |running in| J
+  G --> |running in| J
+  J[[git-for-windows-automation]]
+  K[[git-sdk-32<br />git-sdk-64<br />git-sdk-arm64]] --> |syncing from| H
+  B --> |/add release note| L
+  L[`add-release-note`]
+```
+
+For the curious mind, here are [detailed instructions how the Azure Function backing the GitForWindowsHelper GitHub App was set up](https://github.com/git-for-windows/gfw-helper-github-app#how-this-github-app-was-set-up).
+
+### The `monitor-components` workflow
+
+When new versions of components that Git for Windows builds become available, new Pacman packages have to be built. To this end, [the `monitor-components` workflow](https://github.com/git-for-windows/git/blob/main/.github/workflows/monitor-components.yml) monitors a couple of RSS feeds and opens new tickets labeled `component-update` for such new versions.
+
+### Opening Pull Requests to update Git for Windows' components
+
+After determining that such a ticket indeed indicates the need for a new Pacman package build, a Git for Windows maintainer issues the `/open pr` command via an issue comment ([example](https://github.com/git-for-windows/git/issues/4281#issuecomment-1426859787)), which gets picked up by the GitForWindowsHelper GitHub App, which in turn triggers [the `open-pr` workflow](https://github.com/git-for-windows/git-for-windows-automation/blob/main/.github/workflows/open-pr.yml) in the `git-for-windows-automation` repository.
+
+### Deploying the Pacman packages
+
+This will open a Pull Request in one of Git for Windows' repositories, and once the PR build passes, a Git for Windows maintainer issues the `/deploy` command ([example](https://github.com/git-for-windows/MINGW-packages/pull/69#issuecomment-1427591890)), which gets picked up by the GitForWindowsHelper GitHub App, which triggers [the `build-and-deploy` workflow](https://github.com/git-for-windows/git-for-windows-automation/blob/main/.github/workflows/build-and-deploy.yml).
+
+### Adding release notes
+
+Finally, once the packages have been built and deployed to the Pacman repository (which is hosted in Azure Blob Storage), a Git for Windows maintainer will merge the PR(s), which in turn will close the ticket, and the maintainer then issues an `/add release note` command ([example](https://github.com/git-for-windows/MINGW-packages/pull/69#issuecomment-1427782230)), which again gets picked up by the GitForWindowsHelper GitHub App that triggers [the `add-release-note` workflow](https://github.com/git-for-windows/build-extra/blob/main/.github/workflows/add-release-note.yml) that creates and pushes a new commit to the `ReleaseNotes.md` file in `build-extra` ([example](https://github.com/git-for-windows/build-extra/commit/b39c148ff8dc0e987afdb677d17c46a8e99fd0ef)).
+
+## Releasing official Git for Windows versions
+
+A relatively infrequent part of Git for Windows' maintainers' duties, if the most rewarding part, is the task of releasing new versions of Git for Windows.
+
+Most commonly, this is done in response to the "upstream" Git project releasing a new version. When that happens, a Git for Windows maintainer runs [the helper script](https://github.com/git-for-windows/build-extra/blob/main/shears.sh) to perform a "merging rebase" (i.e. a rebase that starts with a fake-merge of the previous tip commit, to maintain both a clean set of commits as well as a [fast-forwarding](https://git-scm.com/docs/git-merge#Documentation/git-merge.txt---ff-only) commit history).
+
+Once that is done, the maintainer will open a Pull Request to benefit from the automated builds and tests ([example](https://github.com/git-for-windows/git/pull/4160)) as well as from reviews of the [`range-diff`](https://git-scm.com/docs/git-range-diff) relative to the current `main` branch.
+
+Once everything looks good, the maintainer will issue the `/git-artifacts` command ([example](https://github.com/git-for-windows/git/pull/4160#issuecomment-1346801735)). This will trigger an automated workflow that builds all of the release artifacts: installers, Portable Git, MinGit, `.tar.xz` archive and a NuGet package. Apart from the NuGet package, two sets of artifacts are built: targeting 32-bit ("x86") and 64-bit ("amd64").
+
+Once these artifacts are built, the maintainer will download the installer and run [the "pre-flight checklist"](https://github.com/git-for-windows/build-extra/blob/main/installer/checklist.txt).
+
+If everything looks good, a `/release` command will be issued, which triggers yet another workflow that will download the just-built-and-verified release artifacts, publish them as a new GitHub release, publish the NuGet packages, deploy the Pacman packages to the Pacman repository, send out an announcement mail, and update the respective repositories including [Git for Windows' website](https://gitforwindows.org/).
+
+As mentioned [before](#architecture-of-git-for-windows), the `/git-artifacts` and `/release` commands are picked up by the GitForWindowsHelper GitHub App which subsequently triggers the respective workflows in the `git-for-windows-automation` repository. Here is a diagram:
+
+```mermaid
+graph LR
+  A{Pull Request<br />updating to<br />new Git version} --> |/git-artifacts| B
+  B((GitForWindowsHelper)) --> |triggers| C
+  C[`tag-git`] --> |upon successful build<br />triggers| D
+  D((GitForWindowsHelper)) --> |triggers| E
+  E[`git-artifacts`]
+  E --> |maintainer verifies artifacts| E
+  A --> |upon verified `git-artifacts`<br />/release| F
+  F[`release-git`]
+  C --> |running in| J
+  E --> |running in| J
+  F --> |running in| J
+  J[[git-for-windows-automation]]
+```
\ No newline at end of file

From 677de27af2f7e83e47edebf9d349e8cd20a5cca8 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 11 Oct 2019 13:22:24 +0200
Subject: [PATCH 548/553] Modify the Code of Conduct for Git for Windows

The Git project followed Git for Windows' lead and added their Code of
Conduct, based on the Contributor Covenant v1.4, later updated to v2.0.

We adapt it slightly to Git for Windows.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 CODE_OF_CONDUCT.md | 58 +++++++++++++++++++++-------------------------
 1 file changed, 26 insertions(+), 32 deletions(-)

diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md
index e58917c50a96dc..4daef7e3ce9196 100644
--- a/CODE_OF_CONDUCT.md
+++ b/CODE_OF_CONDUCT.md
@@ -1,9 +1,9 @@
-# Git Code of Conduct
+# Git for Windows Code of Conduct
 
 This code of conduct outlines our expectations for participants within
-the Git community, as well as steps for reporting unacceptable behavior.
-We are committed to providing a welcoming and inspiring community for
-all and expect our code of conduct to be honored. Anyone who violates
+the **Git for Windows** community, as well as steps for reporting unacceptable
+behavior. We are committed to providing a welcoming and inspiring community
+for all and expect our code of conduct to be honored. Anyone who violates
 this code of conduct may be banned from the community.
 
 ## Our Pledge
@@ -12,8 +12,8 @@ We as members, contributors, and leaders pledge to make participation in our
 community a harassment-free experience for everyone, regardless of age, body
 size, visible or invisible disability, ethnicity, sex characteristics, gender
 identity and expression, level of experience, education, socio-economic status,
-nationality, personal appearance, race, religion, or sexual identity
-and orientation.
+nationality, personal appearance, race, caste, color, religion, or sexual
+identity and orientation.
 
 We pledge to act and interact in ways that contribute to an open, welcoming,
 diverse, inclusive, and healthy community.
@@ -28,17 +28,17 @@ community include:
 * Giving and gracefully accepting constructive feedback
 * Accepting responsibility and apologizing to those affected by our mistakes,
   and learning from the experience
-* Focusing on what is best not just for us as individuals, but for the
-  overall community
+* Focusing on what is best not just for us as individuals, but for the overall
+  community
 
 Examples of unacceptable behavior include:
 
-* The use of sexualized language or imagery, and sexual attention or
-  advances of any kind
+* The use of sexualized language or imagery, and sexual attention or advances of
+  any kind
 * Trolling, insulting or derogatory comments, and personal or political attacks
 * Public or private harassment
-* Publishing others' private information, such as a physical or email
-  address, without their explicit permission
+* Publishing others' private information, such as a physical or email address,
+  without their explicit permission
 * Other conduct which could reasonably be considered inappropriate in a
   professional setting
 
@@ -58,20 +58,14 @@ decisions when appropriate.
 
 This Code of Conduct applies within all community spaces, and also applies when
 an individual is officially representing the community in public spaces.
-Examples of representing our community include using an official e-mail address,
+Examples of representing our community include using an official email address,
 posting via an official social media account, or acting as an appointed
 representative at an online or offline event.
 
 ## Enforcement
 
 Instances of abusive, harassing, or otherwise unacceptable behavior may be
-reported to the community leaders responsible for enforcement at
-git@sfconservancy.org, or individually:
-
-  - Ævar Arnfjörð Bjarmason <avarab@gmail.com>
-  - Christian Couder <christian.couder@gmail.com>
-  - Junio C Hamano <gitster@pobox.com>
-  - Taylor Blau <me@ttaylorr.com>
+reported by contacting the Git for Windows maintainer.
 
 All complaints will be reviewed and investigated promptly and fairly.
 
@@ -94,15 +88,15 @@ behavior was inappropriate. A public apology may be requested.
 
 ### 2. Warning
 
-**Community Impact**: A violation through a single incident or series
-of actions.
+**Community Impact**: A violation through a single incident or series of
+actions.
 
 **Consequence**: A warning with consequences for continued behavior. No
 interaction with the people involved, including unsolicited interaction with
 those enforcing the Code of Conduct, for a specified period of time. This
 includes avoiding interactions in community spaces as well as external channels
-like social media. Violating these terms may lead to a temporary or
-permanent ban.
+like social media. Violating these terms may lead to a temporary or permanent
+ban.
 
 ### 3. Temporary Ban
 
@@ -118,27 +112,27 @@ Violating these terms may lead to a permanent ban.
 ### 4. Permanent Ban
 
 **Community Impact**: Demonstrating a pattern of violation of community
-standards, including sustained inappropriate behavior,  harassment of an
+standards, including sustained inappropriate behavior, harassment of an
 individual, or aggression toward or disparagement of classes of individuals.
 
-**Consequence**: A permanent ban from any sort of public interaction within
-the community.
+**Consequence**: A permanent ban from any sort of public interaction within the
+community.
 
 ## Attribution
 
 This Code of Conduct is adapted from the [Contributor Covenant][homepage],
-version 2.0, available at
-[https://www.contributor-covenant.org/version/2/0/code_of_conduct.html][v2.0].
+version 2.1, available at
+[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
 
 Community Impact Guidelines were inspired by
 [Mozilla's code of conduct enforcement ladder][Mozilla CoC].
 
 For answers to common questions about this code of conduct, see the FAQ at
-[https://www.contributor-covenant.org/faq][FAQ]. Translations are available
-at [https://www.contributor-covenant.org/translations][translations].
+[https://www.contributor-covenant.org/faq][FAQ]. Translations are available at
+[https://www.contributor-covenant.org/translations][translations].
 
 [homepage]: https://www.contributor-covenant.org
-[v2.0]: https://www.contributor-covenant.org/version/2/0/code_of_conduct.html
+[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
 [Mozilla CoC]: https://github.com/mozilla/diversity
 [FAQ]: https://www.contributor-covenant.org/faq
 [translations]: https://www.contributor-covenant.org/translations

From b2fe567dab20b2876e801269a6e0046e7dc203ab Mon Sep 17 00:00:00 2001
From: Derrick Stolee <dstolee@microsoft.com>
Date: Thu, 1 Mar 2018 12:10:14 -0500
Subject: [PATCH 549/553] CONTRIBUTING.md: add guide for first-time
 contributors

Getting started contributing to Git can be difficult on a Windows
machine. CONTRIBUTING.md contains a guide to getting started, including
detailed steps for setting up build tools, running tests, and
submitting patches to upstream.

[includes an example by Pratik Karki how to submit v2, v3, v4, etc.]

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 CONTRIBUTING.md | 417 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 417 insertions(+)
 create mode 100644 CONTRIBUTING.md

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 00000000000000..48ff9029374df3
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,417 @@
+How to Contribute to Git for Windows
+====================================
+
+Git was originally designed for Unix systems and still today, all the build tools for the Git
+codebase assume you have standard Unix tools available in your path. If you have an open-source
+mindset and want to start contributing to Git, but primarily use a Windows machine, then you may
+have trouble getting started. This guide is for you.
+
+Get the Source
+--------------
+
+Clone the [GitForWindows repository on GitHub](https://github.com/git-for-windows/git).
+It is helpful to create your own fork for storing your development branches.
+
+Windows uses different line endings than Unix systems. See
+[this GitHub article on working with line endings](https://help.github.com/articles/dealing-with-line-endings/#refreshing-a-repository-after-changing-line-endings)
+if you have trouble with line endings.
+
+Build the Source
+----------------
+
+First, download and install the latest [Git for Windows SDK (64-bit)](https://github.com/git-for-windows/build-extra/releases/latest).
+When complete, you can run the Git SDK, which creates a new Git Bash terminal window with
+the additional development commands, such as `make`.
+
+    As of the time of writing, the SDK uses a different credential manager, so you may still want to use normal Git
+    Bash for interacting with your remotes.  Alternatively, use SSH rather than HTTPS and
+    avoid credential manager problems.
+
+You should now be ready to type `make` from the root of your `git` source directory.
+Here are some helpful variations:
+
+* `make -j[N] DEVELOPER=1`: Compile new sources using up to N concurrent processes.
+  The `DEVELOPER` flag turns on all warnings; code failing these warnings will not be
+  accepted upstream ("upstream" = "the core Git project").
+* `make clean`: Delete all compiled files.
+
+When running `make`, you can use `-j$(nproc)` to automatically use the number of processors
+on your machine as the number of concurrent build processes.
+
+You can go deeper on the Windows-specific build process by reading the
+[technical overview](https://gitforwindows.org/technical-overview) or the
+[guide to compiling Git with Visual Studio](https://gitforwindows.org/compiling-git-with-visual-studio).
+
+## Building `git` on Windows with Visual Studio
+
+The typical approach to building `git` is to use the standard `Makefile` with GCC, as
+above. Developers working in a Windows environment may want to instead build with the
+[Microsoft Visual C++ compiler and libraries toolset (MSVC)](https://blogs.msdn.microsoft.com/vcblog/2017/03/07/msvc-the-best-choice-for-windows/).
+There are a few benefits to using MSVC over GCC during your development, including creating
+symbols for debugging and [performance tracing](https://github.com/Microsoft/perfview#perfview-overview).
+
+There are two ways to build Git for Windows using MSVC. Each has its own merits.
+
+### Using SDK Command Line
+
+Use one of the following commands from the SDK Bash window to build Git for Windows:
+
+```
+    make MSVC=1 -j12
+    make MSVC=1 DEBUG=1 -j12
+```
+
+The first form produces release-mode binaries; the second produces debug-mode binaries.
+Both forms produce PDB files and can be debugged.  However, the first is best for perf
+tracing and the second is best for single-stepping.
+
+You can then open Visual Studio and select File -> Open -> Project/Solution and select
+the compiled `git.exe` file. This creates a basic solution and you can use the debugging
+and performance tracing tools in Visual Studio to monitor a Git process. Use the Debug
+Properties page to set the working directory and command line arguments.
+
+Be sure to clean up before switching back to GCC (or to switch between debug and
+release MSVC builds):
+
+```
+    make MSVC=1 -j12 clean
+    make MSVC=1 DEBUG=1 -j12 clean
+```
+
+### Using the IDE
+
+If you prefer working in Visual Studio with a solution full of projects, then you can use
+CMake, either by letting Visual Studio configure it automatically (simply open Git's
+top-level directory via `File>Open>Folder...`) or by (downloading and) running
+[CMake](https://cmake.org) manually.
+
+What to Change?
+---------------
+
+Many new contributors ask: What should I start working on?
+
+One way to win big with the open-source community is to look at the
+[issues page](https://github.com/git-for-windows/git/issues) and see if there are any issues that
+you can fix quickly, or if anything catches your eye.
+
+You can also look at [the unofficial Chromium issues page](https://crbug.com/git) for
+multi-platform issues. You can look at recent user questions on
+[the Git mailing list](https://public-inbox.org/git).
+
+Or you can "scratch your own itch", i.e. address an issue you have with Git. The team at Microsoft where the Git for Windows maintainer works, for example, is focused almost entirely on [improving performance](https://blogs.msdn.microsoft.com/devops/2018/01/11/microsofts-performance-contributions-to-git-in-2017/).
+We approach our work by finding something that is slow and trying to speed it up. We start our
+investigation by reliably reproducing the slow behavior, then running that example using
+the MSVC build and tracing the results in PerfView.
+
+You could also think of something you wish Git could do, and make it do that thing! The
+only concern I would have with this approach is whether or not that feature is something
+the community also wants. If this excites you though, go for it! Don't be afraid to
+[get involved in the mailing list](http://vger.kernel.org/vger-lists.html#git) early for
+feedback on the idea.
+
+Test Your Changes
+-----------------
+
+After you make your changes, it is important that you test your changes. Manual testing is
+important, but checking and extending the existing test suite is even more important. You
+want to run the functional tests to see if you broke something else during your change, and
+you want to extend the functional tests to be sure no one breaks your feature in the future.
+
+### Functional Tests
+
+Navigate to the `t/` directory and type `make` to run all tests or use `prove` as
+[described on this Git for Windows page](https://gitforwindows.org/building-git):
+
+```
+prove -j12 --state=failed,save ./t[0-9]*.sh
+```
+
+You can also run each test directly by running the corresponding shell script with a name
+like `tNNNN-descriptor.sh`.
+
+If you are adding new functionality, you may need to create unit tests by creating
+helper commands that test a very limited action. These commands are stored in `t/helper`.
+When adding a helper, be sure to add a line to `t/Makefile` and to the `.gitignore` for the
+binary file you add. The Git community prefers functional tests using the full `git`
+executable, so try to exercise your new code using `git` commands before creating a test
+helper.
+
+To find out why a test failed, repeat the test with the `-x -v -d -i` options and then
+navigate to the appropriate "trash" directory to see the data shape that was used for the
+failed test step.
+
+Read [`t/README`](t/README) for more details.
+
+### Performance Tests
+
+If you are working on improving performance, you will need to be acquainted with the
+performance tests in `t/perf`. There are not too many performance tests yet, but adding one
+as your first commit in a patch series helps to communicate the boost your change provides.
+
+To check the change in performance across multiple versions of `git`, you can use the
+`t/perf/run` script. For example, to compare the performance of `git rev-list` on the
+`core/master` and `core/next` branches against that of a `topic` branch, you can run
+
+```
+cd t/perf
+./run core/master core/next topic -- p0001-rev-list.sh
+```
+
+You can also set certain environment variables to help test the performance on different
+repositories or with more repetitions. The full list is available in
+[the `t/perf/README` file](t/perf/README),
+but here are a few important ones:
+
+```
+GIT_PERF_REPO=/path/to/repo
+GIT_PERF_LARGE_REPO=/path/to/large/repo
+GIT_PERF_REPEAT_COUNT=10
+```
+
+When running the performance tests on Linux, you may see a message "Can't locate JSON.pm in
+@INC"; that means you need to run `sudo cpanm JSON` to get the JSON Perl package.
+
+For running performance tests, it can be helpful to set up a few repositories with strange
+data shapes, such as:
+
+**Many objects:** Clone repos such as [Kotlin](https://github.com/jetbrains/kotlin), [Linux](https://github.com/torvalds/linux), or [Android](https://source.android.com/setup/downloading).
+
+**Many pack-files:** You can split a fresh clone into multiple pack-files of size at most
+16MB by running `git repack -adfF --max-pack-size=16m`. See the
+[`git repack` documentation](https://git-scm.com/docs/git-repack) for more information.
+You can count the number of pack-files using `ls .git/objects/pack/*.pack | wc -l`.
+
+**Many loose objects:** If you already split your repository into multiple pack-files, then
+you can pick one to split into loose objects using `cat .git/objects/pack/[id].pack | git unpack-objects`;
+delete the `[id].pack` and `[id].idx` files after this. You can count the number of loose
+objects using `ls .git/objects/??/* | wc -l`.
+
+**Deep history:** Usually large repositories also have deep histories, but you can use the
+[test-many-commits-1m repo](https://github.com/cirosantilli/test-many-commits-1m/) to
+target deep histories without the overhead of many objects. One issue with this repository:
+there are no merge commits, so you will need to use a different repository to test a "wide"
+commit history.
+
+**Large Index:** You can generate a large index and repo by using the scripts in
+`t/perf/repos`.  There are two scripts. `many-files.sh` generates a repo with the
+same tree and blobs but different paths; `many-files.sh -d 5 -w 10 -f 9` creates
+a repo with ~1 million entries in the index. `inflate-repo.sh` takes an existing repo
+and copies the current work tree until it reaches a specified size.
+
+Test Your Changes on Linux
+--------------------------
+
+It can be important to work directly on the [core Git codebase](https://github.com/git/git),
+such as a recent commit into the `master` or `next` branch that has not been incorporated
+into Git for Windows. Also, it can help to run functional and performance tests on your
+code in Linux before submitting patches to the mailing list, which focuses on many platforms.
+The differences between Windows and Linux are usually enough to catch most cross-platform
+issues.
+
+### Using the Windows Subsystem for Linux
+
+The [Windows Subsystem for Linux (WSL)](https://docs.microsoft.com/en-us/windows/wsl/install-win10)
+allows you to [install Ubuntu Linux as an app](https://www.microsoft.com/en-us/store/p/ubuntu/9nblggh4msv6)
+that can run Linux executables on top of the Windows kernel. Internally,
+Linux syscalls are interpreted by the WSL; everything else is plain Ubuntu.
+
+First, open WSL (either type "Bash" in Cortana, or execute "bash.exe" in a CMD window).
+Then install the prerequisites, and `git` for the initial clone:
+
+```
+sudo apt-get update
+sudo apt-get install git gcc make libssl-dev libcurl4-openssl-dev \
+		     libexpat-dev tcl tk gettext git-email zlib1g-dev
+```
+
+Then, clone and build:
+
+```
+git clone https://github.com/git-for-windows/git
+cd git
+git remote add -f upstream https://github.com/git/git
+make
+```
+
+Be sure to clone into `/home/[user]/` and not into any folder under `/mnt/?/` or your build
+will fail due to colons in file names.
+
+### Using a Linux Virtual Machine with Hyper-V
+
+If you prefer, you can use a virtual machine (VM) to run Linux and test your changes in the
+full environment. The test suite runs a lot faster on Linux than on Windows or with the WSL.
+You can connect to the VM using an SSH terminal like
+[PuTTY](https://www.chiark.greenend.org.uk/~sgtatham/putty/).
+
+The following instructions are for using Hyper-V, which is available in some versions of Windows.
+There are many virtual machine alternatives available, if you do not have such a version installed.
+
+* [Download an Ubuntu Server ISO](https://www.ubuntu.com/download/server).
+* Open [Hyper-V Manager](https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/quick-start/enable-hyper-v).
+* [Set up a virtual switch](https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/quick-start/connect-to-network)
+  so your VM can reach the network.
+* Select "Quick Create", name your machine, select the ISO as installation source, and un-check
+  "This virtual machine will run Windows."
+* Go through the Ubuntu install process, being sure to select to install OpenSSH Server.
+* When install is complete, log in and check the SSH server status with `sudo service ssh status`.
+    * If the service is not found, install with `sudo apt-get install openssh-server`.
+    * If the service is not running, then use `sudo service ssh start`.
+* Use `shutdown -h now` to shut down the VM, go to the Hyper-V settings for the VM, expand Network Adapter
+  to select "Advanced Features", and set the MAC address to be static (this can save your VM from losing
+  network if shut down incorrectly).
+* Provide as many cores to your VM as you can (for parallel builds).
+* Restart your VM, but do not connect.
+* Use `ssh` in Git Bash, download [PuTTY](https://www.chiark.greenend.org.uk/~sgtatham/putty/), or use your favorite SSH client to connect to the VM through SSH.
+
+In order to build and use `git`, you will need the following libraries via `apt-get`:
+
+```
+sudo apt-get update
+sudo apt-get install git gcc make libssl-dev libcurl4-openssl-dev \
+                     libexpat-dev tcl tk gettext git-email zlib1g-dev
+```
+
+To get your code from your Windows machine to the Linux VM, it is easiest to push the branch to your fork of Git and clone your fork in the Linux VM.
+
+Don't forget to set your `git` config with your preferred name, email, and editor.
+
+Polish Your Commits
+-------------------
+
+Before submitting your patch, be sure to read the [coding guidelines](https://github.com/git/git/blob/master/Documentation/CodingGuidelines)
+and check that your code follows them as closely as you can. This can be a lot of effort, but
+avoiding style issues saves time during review.
+
+The other possibly major difference between the mailing list submissions and GitHub PR workflows
+is that each commit will be reviewed independently. Even if you are submitting a
+patch series with multiple commits, each commit must stand on its own and be reviewable
+by itself. Make sure the commit message clearly explains the why of the commit, not the how.
+Describe what is wrong with the current code and how your changes have made the code better.
+
+When preparing your patch, it is important to put yourself in the shoes of the Git community.
+Accepting a patch requires more justification than approving a pull request from someone on
+your team. The community has a stable product and is responsible for keeping it stable. If
+you introduce a bug, then they cannot count on you being around to fix it. When you decided
+to start work on a new feature, they were not part of the design discussion and may not
+even believe the feature is worth introducing.
+
+Questions to answer in your patch message (and commit messages) may include:
+* Why is this patch necessary?
+* How does the current behavior cause pain for users?
+* What kinds of repositories are necessary for noticing a difference?
+* What design options did you consider before writing this version? Do you have links to
+  code for those alternate designs?
+* Is this a performance fix? Provide clear performance numbers for various well-known repos.
+
+Here are some other tips that we use when cleaning up our commits:
+
+* Commit messages should be wrapped at 76 columns per line (or less; 72 is also a
+  common choice).
+* Make sure the commits are signed off using `git commit (-s|--signoff)`. See
+  [SubmittingPatches](https://github.com/git/git/blob/v2.8.1/Documentation/SubmittingPatches#L234-L286)
+  for more details about what this sign-off means.
+* Check for whitespace errors using `git diff --check [base]...HEAD` or `git log --check`.
+* Run `git rebase --whitespace=fix` to correct upstream issues with whitespace.
+* Become familiar with interactive rebase (`git rebase -i`) because you will be reordering,
+  squashing, and editing commits as your patch or series of patches is reviewed.
+* Make sure any shell scripts that you add have the executable bit set on them.  This is
+  usually for test files that you add in the `t/` directory.  You can use
+  `git add --chmod=+x [file]` to update it. You can test whether a file is marked as executable
+  using `git ls-files --stage \*.sh`; the first number is 100755 for executable files.
+* Your commit titles should match the "area: change description" format. Rules of thumb:
+    * Choose "<area>: " prefix appropriately.
+    * Keep the description short and to the point.
+    * The word that follows the "<area>: " prefix is not capitalized.
+    * Do not include a full-stop at the end of the title.
+    * Read a few commit messages -- using `git log origin/master`, for instance -- to
+      become acquainted with the preferred commit message style.
+* Build source using `make DEVELOPER=1` for extra-strict compiler warnings.
+
+Submit Your Patch
+-----------------
+
+Git for Windows [accepts pull requests on GitHub](https://github.com/git-for-windows/git/pulls), but
+these are reserved for Windows-specific improvements. For core Git, submissions are accepted on
+[the Git mailing list](https://public-inbox.org/git).
+
+### Configure Git to Send Emails
+
+There are a bunch of options for configuring the `git send-email` command. These options can
+be found in the documentation for
+[`git config`](https://git-scm.com/docs/git-config) and
+[`git send-email`](https://git-scm.com/docs/git-send-email).
+
+```
+git config --global sendemail.smtpserver <smtp server>
+git config --global sendemail.smtpserverport 587
+git config --global sendemail.smtpencryption tls
+git config --global sendemail.smtpuser <email address>
+```
+
+To avoid storing your password in the config file, store it in the Git credential manager:
+
+```
+$ git credential approve
+protocol=smtp
+host=<smtp server>
+username=<email address>
+password=password
+```
+
+Before submitting a patch, read the [Git documentation on submitting patches](https://github.com/git/git/blob/master/Documentation/SubmittingPatches).
+
+To construct a patch set, use the `git format-patch` command. There are three important options:
+
+* `--cover-letter`: If specified, create a `[v#-]0000-cover-letter.patch` file that can be
+  edited to describe the patch as a whole. If you previously added a branch description using
+  `git branch --edit-description`, you will end up with a 0/N mail with that description and
+  a nice overall diffstat.
+* `--in-reply-to=[Message-ID]`: This will mark your cover letter as replying to the given
+  message (which should correspond to your previous iteration). To determine the correct Message-ID,
+  find the message you are replying to on [public-inbox.org/git](https://public-inbox.org/git) and take
+  the ID from between the angle brackets.
+
+* `--subject-prefix=[prefix]`: This defaults to `PATCH`, which produces `[PATCH]`. For subsequent iterations, you will want to
+  override it like `--subject-prefix="PATCH v2"`.  You can also use the `-v` option to have it
+  automatically generate the version number in the patches.
+
+If you have multiple commits and use the `--cover-letter` option, be sure to open the
+`0000-cover-letter.patch` file to update the subject and add some details about the overall purpose
+of the patch series.
+
+### Examples
+
+To generate a single commit patch file:
+```
+git format-patch -s -o [dir] -1
+```
+To generate patch files for the last four commits, plus a cover letter:
+```
+git format-patch --cover-letter -s -o [dir] HEAD~4
+```
+To generate version 3 with four patch files from the last four commits with a cover letter:
+```
+git format-patch --cover-letter -s -o [dir] -v 3 HEAD~4
+```
+
+### Submit the Patch
+
+Run [`git send-email`](https://git-scm.com/docs/git-send-email), starting with a test email:
+
+```
+git send-email --to=yourself@address.com [dir with patches]/*.patch
+```
+
+After checking the receipt of your test email, you can send to the list and to any
+potentially interested reviewers.
+
+```
+git send-email --to=git@vger.kernel.org --cc=<email1> --cc=<email2> [dir with patches]/*.patch
+```
+
+To submit the nth version of a patch series (say, version 3):
+
+```
+git send-email --to=git@vger.kernel.org --cc=<email1> --cc=<email2> \
+    --in-reply-to=<the message id of cover letter of patch v2> [dir with patches]/*.patch
+```

From 21a1601d87d2d10d5e31fd72b3ce30042ad2fa90 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 10 Jan 2014 16:16:03 -0600
Subject: [PATCH 550/553] README.md: Add a Windows-specific preamble
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Includes touch-ups by 마누엘, Philip Oakley and 孙卓识.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 README.md | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 76 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index d87bca1b8c3ebf..026d5d85caef09 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,77 @@
-[![Build status](https://github.com/git/git/workflows/CI/badge.svg)](https://github.com/git/git/actions?query=branch%3Amaster+event%3Apush)
+Git for Windows
+===============
+
+[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-2.1-4baaaa.svg)](CODE_OF_CONDUCT.md)
+[![Open in Visual Studio Code](https://img.shields.io/static/v1?logo=visualstudiocode&label=&message=Open%20in%20Visual%20Studio%20Code&labelColor=2c2c32&color=007acc&logoColor=007acc)](https://open.vscode.dev/git-for-windows/git)
+[![Build status](https://github.com/git-for-windows/git/workflows/CI/badge.svg)](https://github.com/git-for-windows/git/actions?query=branch%3Amain+event%3Apush)
+[![Join the chat at https://gitter.im/git-for-windows/git](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/git-for-windows/git?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
+
+This is [Git for Windows](http://git-for-windows.github.io/), the Windows port
+of [Git](http://git-scm.com/).
+
+The Git for Windows project is run using a [governance
+model](http://git-for-windows.github.io/governance-model.html). If you
+encounter problems, you can report them as [GitHub
+issues](https://github.com/git-for-windows/git/issues), discuss them in Git
+for Windows' [Discussions](https://github.com/git-for-windows/git/discussions)
+or on the [Git mailing list](mailto:git@vger.kernel.org), and [contribute bug
+fixes](https://gitforwindows.org/how-to-participate).
+
+To build Git for Windows, please either install [Git for Windows'
+SDK](https://gitforwindows.org/#download-sdk), start its `git-bash.exe`, `cd`
+to your Git worktree and run `make`, or open the Git worktree as a folder in
+Visual Studio.
+
+To verify that your build works, use one of the following methods:
+
+- If you want to test the built executables within Git for Windows' SDK,
+  prepend `<worktree>/bin-wrappers` to the `PATH`.
+- Alternatively, run `make install` in the Git worktree.
+- If you need to test this in a full installer, run `sdk build
+  git-and-installer`.
+- You can also "install" Git into an existing portable Git via `make install
+  DESTDIR=<dir>` where `<dir>` refers to the top-level directory of the
+  portable Git. In this instance, you will want to prepend that portable Git's
+  `/cmd` directory to the `PATH`, or test by running that portable Git's
+  `git-bash.exe` or `git-cmd.exe`.
+- If you built using a recent Visual Studio, you can use the menu item
+  `Build>Install git` (you will want to click on `Project>CMake Settings for
+  Git` first, then click on `Edit JSON` and then point `installRoot` to the
+  `mingw64` directory of an already-unpacked portable Git).
+
+  As in the previous bullet point, you will then prepend `/cmd` to the `PATH`
+  or run using the portable Git's `git-bash.exe` or `git-cmd.exe`.
+- If you want to run the built executables in-place, but in a CMD instead of
+  inside a Bash, you can run a snippet like this in the `git-bash.exe` window
+  where Git was built (ensure that the `EOF` line has no leading spaces), and
+  then paste into the CMD window what was put in the clipboard:
+
+  ```sh
+  clip.exe <<EOF
+  set GIT_EXEC_PATH=$(cygpath -aw .)
+  set PATH=$(cygpath -awp ".:contrib/scalar:/mingw64/bin:/usr/bin:$PATH")
+  set GIT_TEMPLATE_DIR=$(cygpath -aw templates/blt)
+  set GITPERLLIB=$(cygpath -aw perl/build/lib)
+  EOF
+  ```
+- If you want to run the built executables in-place, but outside of Git for
+  Windows' SDK, and without an option to set/override any environment
+  variables (e.g. in Visual Studio's debugger), you can call the Git executable
+  by its absolute path and use the `--exec-path` option, like so:
+
+  ```cmd
+  C:\git-sdk-64\usr\src\git\git.exe --exec-path=C:\git-sdk-64\usr\src\git help
+  ```
+
+  Note: for this to work, you have to hard-link (or copy) the `.dll` files from
+  the `/mingw64/bin` directory to the Git worktree, or add the `/mingw64/bin`
+  directory to the `PATH` somehow or other.
+
+To make sure that you are testing the correct binary, call `./git.exe version`
+in the Git worktree, and then call `git version` in a directory/window where
+you want to test Git, and verify that they refer to the same version (you may
+even want to pass the command-line option `--build-options` to look at the
+exact commit from which the Git version was built).
 
 Git - fast, scalable, distributed revision control system
 =========================================================
@@ -29,7 +102,7 @@ CVS users may also want to read [Documentation/gitcvs-migration.adoc][]
 (`man gitcvs-migration` or `git help cvs-migration` if git is
 installed).
 
-The user discussion and development of Git take place on the Git
+The user discussion and development of core Git take place on the Git
 mailing list -- everyone is welcome to post bug reports, feature
 requests, comments and patches to git@vger.kernel.org (read
 [Documentation/SubmittingPatches][] for instructions on patch submission
@@ -43,6 +116,7 @@ To subscribe to the list, send an email to <git+subscribe@vger.kernel.org>
 (see https://subspace.kernel.org/subscribing.html for details). The mailing
 list archives are available at <https://lore.kernel.org/git/>,
 <https://marc.info/?l=git> and other archival sites.
+The core Git mailing list is plain text (no HTML!).
 
 Issues which are security relevant should be disclosed privately to
 the Git Security mailing list <git-security@googlegroups.com>.

From e944b9792018f07cdd1e6ae538772fff1f2abf07 Mon Sep 17 00:00:00 2001
From: Brendan Forster <brendan@github.com>
Date: Thu, 18 Feb 2016 21:29:50 +1100
Subject: [PATCH 551/553] Add an issue template

With improvements by Clive Chan, Adric Norris, Ben Bodenmiller and
Philip Oakley.

Helped-by: Clive Chan <cc@clive.io>
Helped-by: Adric Norris <landstander668@gmail.com>
Helped-by: Ben Bodenmiller <bbodenmiller@hotmail.com>
Helped-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Brendan Forster <brendan@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 .github/ISSUE_TEMPLATE/bug-report.yml | 105 ++++++++++++++++++++++++++
 .github/ISSUE_TEMPLATE/config.yml     |   1 +
 2 files changed, 106 insertions(+)
 create mode 100644 .github/ISSUE_TEMPLATE/bug-report.yml
 create mode 100644 .github/ISSUE_TEMPLATE/config.yml

diff --git a/.github/ISSUE_TEMPLATE/bug-report.yml b/.github/ISSUE_TEMPLATE/bug-report.yml
new file mode 100644
index 00000000000000..b49593339932b2
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/bug-report.yml
@@ -0,0 +1,105 @@
+name: Bug report
+description: Use this template to report bugs.
+body:
+  - type: checkboxes
+    id: search
+    attributes:
+      label: Existing issues matching what you're seeing
+      description: Please search for [open](https://github.com/git-for-windows/git/issues?q=is%3Aopen) or [closed](https://github.com/git-for-windows/git/issues?q=is%3Aclosed) issues matching what you're seeing before submitting a new issue.
+      options:
+        - label: I was not able to find an open or closed issue matching what I'm seeing
+  - type: textarea
+    id: git-for-windows-version
+    attributes:
+      label: Git for Windows version
+      description: Which version of Git for Windows are you using?
+      placeholder: Please insert the output of `git --version --build-options` here
+      render: shell
+    validations:
+      required: true
+  - type: dropdown
+    id: windows-version
+    attributes:
+      label: Windows version
+      description: Which version of Windows are you running?
+      options:
+        - Windows 8.1
+        - Windows 10
+        - Windows 11
+        - Other
+      default: 2
+    validations:
+      required: true
+  - type: dropdown
+    id: windows-arch
+    attributes:
+      label: Windows CPU architecture
+      description: What CPU architecture does your Windows target?
+      options:
+        - i686 (32-bit)
+        - x86_64 (64-bit)
+        - ARM64
+      default: 1
+    validations:
+      required: true
+  - type: textarea
+    id: windows-version-cmd
+    attributes:
+      label: Additional Windows version information
+      description: This provides us with further information about your Windows such as the build number
+      placeholder: Please insert the output of `cmd.exe /c ver` here
+      render: shell
+  - type: textarea
+    id: options
+    attributes:
+      label: Options set during installation
+      description: What options did you set as part of the installation? Or did you choose the defaults?
+      placeholder: |
+        One of the following:
+        > type "C:\Program Files\Git\etc\install-options.txt"
+        > type "C:\Program Files (x86)\Git\etc\install-options.txt"
+        > type "%USERPROFILE%\AppData\Local\Programs\Git\etc\install-options.txt"
+        > type "$env:USERPROFILE\AppData\Local\Programs\Git\etc\install-options.txt"
+        $ cat /etc/install-options.txt
+      render: shell
+    validations:
+      required: true
+  - type: textarea
+    id: other-things
+    attributes:
+      label: Other interesting things
+      description: Any other interesting things about your environment that might be related to the issue you're seeing?
+  - type: input
+    id: terminal
+    attributes:
+      label: Terminal/shell
+      description: Which terminal/shell are you running Git from? e.g. Bash/CMD/PowerShell/other
+    validations:
+      required: true
+  - type: textarea
+    id: commands
+    attributes:
+      label: Commands that trigger the issue
+      description: What commands did you run to trigger this issue? If you can provide a [Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve) this will help us understand the issue.
+      render: shell
+    validations:
+      required: true
+  - type: textarea
+    id: expected-behaviour
+    attributes:
+      label: Expected behaviour
+      description: What did you expect to occur after running these commands?
+    validations:
+      required: true
+  - type: textarea
+    id: actual-behaviour
+    attributes:
+      label: Actual behaviour
+      description: What actually happened instead?
+    validations:
+      required: true
+  - type: textarea
+    id: repository
+    attributes:
+      label: Repository
+      description: If the problem was occurring with a specific repository, can you provide the URL to that repository to help us with testing?
\ No newline at end of file
diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml
new file mode 100644
index 00000000000000..ec4bb386bcf8a4
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1 @@
+blank_issues_enabled: false
\ No newline at end of file

From 1de126f49692ac11cc78a75dfca7bea760783a67 Mon Sep 17 00:00:00 2001
From: Philip Oakley <philipoakley@iee.org>
Date: Fri, 22 Dec 2017 17:15:50 +0000
Subject: [PATCH 552/553] Modify the GitHub Pull Request template (to reflect
 Git for Windows)

Git for Windows accepts pull requests; Core Git does not. Therefore we
need to adjust the template (because it only matches core Git's
project management style, not ours).

Also: direct Git for Windows enhancements to their contributions page,
space out the text for easy reading, and clarify that the mailing list
is plain text, not HTML.

Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 .github/PULL_REQUEST_TEMPLATE.md | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
index 37654cdfd7abcf..7baf31f2c471ec 100644
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -1,7 +1,19 @@
-Thanks for taking the time to contribute to Git! Please be advised that the
-Git community does not use github.com for their contributions. Instead, we use
-a mailing list (git@vger.kernel.org) for code submissions, code reviews, and
-bug reports. Nevertheless, you can use GitGitGadget (https://gitgitgadget.github.io/)
+Thanks for taking the time to contribute to Git!
+
+Those seeking to contribute to the Git for Windows fork should see
+http://gitforwindows.org/#contribute on how to contribute Windows specific
+enhancements.
+
+If your contribution is for the core Git functions and documentation
+please be aware that the Git community does not use the github.com issues
+or pull request mechanism for their contributions.
+
+Instead, we use the Git mailing list (git@vger.kernel.org) for code and
+documentation submissions, code reviews, and bug reports. The
+mailing list is plain text only (anything with HTML is sent directly
+to the spam folder).
+
+Nevertheless, you can use GitGitGadget (https://gitgitgadget.github.io/)
 to conveniently send your Pull Requests commits to our mailing list.
 
 For a single-commit pull request, please *leave the pull request description

From c818d9eaeac38ab67267425147453a301aac5121 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Fri, 23 Aug 2019 14:14:42 +0200
Subject: [PATCH 553/553] SECURITY.md: document Git for Windows' policies

This is the recommended way on GitHub to describe policies revolving around
security issues and about supported versions.

Helped-by: Sven Strickroth <email@cs-ware.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 SECURITY.md | 56 +++++++++++++++++++++++++++++++++--------------------
 1 file changed, 35 insertions(+), 21 deletions(-)

diff --git a/SECURITY.md b/SECURITY.md
index c720c2ae7f9580..42b6d458bfd557 100644
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -28,24 +28,38 @@ Examples for details to include:
 
 ## Supported Versions
 
-There are no official "Long Term Support" versions in Git.
-Instead, the maintenance track (i.e. the versions based on the
-most recently published feature release, also known as ".0"
-version) sees occasional updates with bug fixes.
-
-Fixes to vulnerabilities are made for the maintenance track for
-the latest feature release and merged up to the in-development
-branches. The Git project makes no formal guarantee for any
-older maintenance tracks to receive updates. In practice,
-though, critical vulnerability fixes are applied not only to the
-most recent track, but to at least a couple more maintenance
-tracks.
-
-This is typically done by making the fix on the oldest and still
-relevant maintenance track, and merging it upwards to newer and
-newer maintenance tracks.
-
-For example, v2.24.1 was released to address a couple of
-[CVEs](https://cve.mitre.org/), and at the same time v2.14.6,
-v2.15.4, v2.16.6, v2.17.3, v2.18.2, v2.19.3, v2.20.2, v2.21.1,
-v2.22.2 and v2.23.1 were released.
+Git for Windows is a "friendly fork" of [Git](https://git-scm.com/), i.e. changes in Git for Windows are frequently contributed back, and Git for Windows' release cycle closely follows Git's.
+
+While Git maintains several release trains (when v2.19.1 was released, there were updates to v2.14.x-v2.18.x, too, for example), Git for Windows follows only the latest Git release. For example, there is no Git for Windows release corresponding to Git v2.16.5 (which was released after v2.19.0).
+
+One exception is [MinGit for Windows](https://gitforwindows.org/mingit) (a minimal subset of Git for Windows, intended for bundling with third-party applications that do not need any interactive commands nor support for `git svn`): critical security fixes are backported to the v2.11.x, v2.14.x, v2.19.x, v2.21.x and v2.23.x release trains.
+
+## Version number scheme
+
+The Git for Windows versions reflect the Git version on which they are based. For example, Git for Windows v2.21.0 is based on Git v2.21.0.
+
+As Git for Windows bundles more than just Git (such as Bash, OpenSSL, OpenSSH, GNU Privacy Guard), sometimes there are interim releases without corresponding Git releases. In these cases, Git for Windows appends a number in parentheses, starting with the number 2, then 3, etc. For example, both Git for Windows v2.17.1 and v2.17.1(2) were based on Git v2.17.1, but the latter included updates for Git Credential Manager and Git LFS, fixing critical regressions.
+
+## Tag naming scheme
+
+Every Git for Windows version is tagged using a name that starts with the Git version on which it is based, with the suffix `.windows.<patchlevel>` appended. For example, Git for Windows v2.17.1's source code is tagged as [`v2.17.1.windows.1`](https://github.com/git-for-windows/git/releases/tag/v2.17.1.windows.1) (the patch level is always at least 1, given that Git for Windows always has patches on top of Git). Likewise, Git for Windows v2.17.1(2)'s source code is tagged as [`v2.17.1.windows.2`](https://github.com/git-for-windows/git/releases/tag/v2.17.1.windows.2).
+
+## Release Candidate (rc) versions
+
+As a friendly fork of Git (the "upstream" project), Git for Windows is closely correlated with that project.
+
+Consequently, Git for Windows publishes versions based on Git's release candidates (for upcoming "`.0`" versions, see [Git's release schedule](https://tinyurl.com/gitCal)). These versions end in `-rc<n>`, starting with `-rc0` for a very early preview of what is to come, and as with regular versions, Git for Windows tries to follow Git's releases as quickly as possible.
+
+Note: there is currently a bug in the "Check daily for updates" code, where it mistakes the final version for a downgrade from release candidates. Example: if you installed Git for Windows v2.23.0-rc3 and enabled the auto-updater, it would ask you whether you want to "downgrade" to v2.23.0 when that version became available.
+
+[All releases](https://github.com/git-for-windows/git/releases/), including release candidates, are listed via a link at the footer of the [Git for Windows](https://gitforwindows.org/) home page.
+
+## Snapshot versions ('nightly builds')
+
+Git for Windows also provides snapshots (these are not releases) of the current development as per git-for-windows/git's `master` branch at the [Snapshots](https://gitforwindows.org/git-snapshots/) page. This link is also listed in the footer of the [Git for Windows](https://gitforwindows.org/) home page.
+
+Note: even if those builds are not exactly "nightly", they are sometimes referred to as "nightly builds" to keep with other projects' nomenclature.
+
+## Following upstream's developments
+
+The [git-for-windows/git repository](https://github.com/git-for-windows/git) also provides the `shears/*` branches. The `shears/*` branches reflect Git for Windows' patches, rebased onto the upstream integration branches, [updated (mostly) via automated CI builds](https://dev.azure.com/git-for-windows/git/_build?definitionId=25).