Skip to content

Commit 1b7dea1

Browse files
newrendscho
authored andcommitted
object-name: be more strict in parsing describe-like output
From Documentation/revisions.txt: '<describeOutput>', e.g. 'v1.7.4.2-679-g3bee7fb':: Output from `git describe`; i.e. a closest tag, optionally followed by a dash and a number of commits, followed by a dash, a 'g', and an abbreviated object name. which means that output of the format ${REFNAME}-${INTEGER}-g${HASH} should parse to fully expanded ${HASH}. This is fine. However, we currently don't validate any of ${REFNAME}-${INTEGER}, we only parse -g${HASH} and assume the rest is valid. That is problematic, since it breaks things like git cat-file -p branchname:path/to/file/named/i-gaffed which, when commit (or tree or blob) affed exists, will not return us information about the file we are looking for but will instead erroneously tell us about object affed. A few additional notes: - This is a slight backward incompatibility break, because we used to allow ${GARBAGE}-g${HASH} as a way to spell ${HASH}. However, a backward incompatible break is necessary, because there is no other way for someone to be more specific and disambiguate that they want the blob master:path/to/who-gabbed instead of the object abbed. - There is a possibility that check_refname_format() rules change in the future. However, we can only realistically loosen the rules for what that function accepts rather than tighten. If we were to tighten the rules, some real world repositories may already have refnames that suddenly become unacceptable and we break those repositories. As such, any describe-like syntax of the form ${VALID_FOR_A_REFNAME}-${INTEGER}-g${HASH} that is valid with the changes in this commit will remain valid in the future. - The fact that check_refname_format() rules could loosen in the future is probably also an important reason to make this change. If the rules loosen, there might be additional cases within ${GARBAGE}-g${HASH} that become ambiguous in the future. While abbreviated hashes can be disambiguated by abbreviating less, it may well be that these alternative object names have no way of being disambiguated (much like pathnames cannot be). Accepting all random ${GARBAGE} thus makes it difficult for us to allow future extensions to object naming. So, tighten up the parsing to make sure ${REFNAME} and ${INTEGER} are present in the string, and would be considered a valid ref and non-negative integer. Also, add a few tests for git describe using object names of the form ${REVISION_NAME}${MODIFIERS} since an early version of this patch failed on constructs like git describe v2.48.0-rc2-161-g6c2274cdbc^0 Reported-by: Gabriel Amaral <gabriel-amaral@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
1 parent c9bd22e commit 1b7dea1

File tree

2 files changed

+78
-1
lines changed

2 files changed

+78
-1
lines changed

object-name.c

+54-1
Original file line numberDiff line numberDiff line change
@@ -1272,6 +1272,58 @@ static int peel_onion(struct repository *r, const char *name, int len,
12721272
return 0;
12731273
}
12741274

1275+
/*
1276+
* Documentation/revisions.txt says:
1277+
* '<describeOutput>', e.g. 'v1.7.4.2-679-g3bee7fb'::
1278+
* Output from `git describe`; i.e. a closest tag, optionally
1279+
* followed by a dash and a number of commits, followed by a dash, a
1280+
* 'g', and an abbreviated object name.
1281+
*
1282+
* which means that the stuff before '-g${HASH}' needs to be a valid
1283+
* refname, a dash, and a non-negative integer. This function verifies
1284+
* that.
1285+
*
1286+
* In particular, we do not want to treat
1287+
* branchname:path/to/file/named/i-gaffed
1288+
* as a request for commit affed.
1289+
*
1290+
* More generally, we should probably not treat
1291+
* 'refs/heads/./../.../ ~^:/?*[////\\\&}/busted.lock-g050e0ef6ead'
1292+
* as a request for object 050e0ef6ead either.
1293+
*
1294+
* We are called with name[len] == '-' and name[len+1] == 'g', i.e.
1295+
* we are verifying ${REFNAME}-{INTEGER} part of the name.
1296+
*/
1297+
static int ref_and_count_parts_valid(const char *name, int len)
1298+
{
1299+
struct strbuf sb;
1300+
const char *cp;
1301+
int flags = REFNAME_ALLOW_ONELEVEL;
1302+
int ret = 1;
1303+
1304+
/* Ensure we have at least one digit */
1305+
if (!isxdigit(name[len-1]))
1306+
return 0;
1307+
1308+
/* Skip over digits backwards until we get to the dash */
1309+
for (cp = name + len - 2; name < cp; cp--) {
1310+
if (*cp == '-')
1311+
break;
1312+
if (!isxdigit(*cp))
1313+
return 0;
1314+
}
1315+
/* Ensure we found the leading dash */
1316+
if (*cp != '-')
1317+
return 0;
1318+
1319+
len = cp - name;
1320+
strbuf_init(&sb, len);
1321+
strbuf_add(&sb, name, len);
1322+
ret = !check_refname_format(sb.buf, flags);
1323+
strbuf_release(&sb);
1324+
return ret;
1325+
}
1326+
12751327
static int get_describe_name(struct repository *r,
12761328
const char *name, int len,
12771329
struct object_id *oid)
@@ -1285,7 +1337,8 @@ static int get_describe_name(struct repository *r,
12851337
/* We must be looking at g in "SOMETHING-g"
12861338
* for it to be describe output.
12871339
*/
1288-
if (ch == 'g' && cp[-1] == '-') {
1340+
if (ch == 'g' && cp[-1] == '-' &&
1341+
ref_and_count_parts_valid(name, cp - 1 - name)) {
12891342
cp++;
12901343
len -= cp - name;
12911344
return get_short_oid(r,

t/t6120-describe.sh

+24
Original file line numberDiff line numberDiff line change
@@ -82,11 +82,13 @@ check_describe R-2-gHASH HEAD^^
8282
check_describe A-3-gHASH HEAD^^2
8383
check_describe B HEAD^^2^
8484
check_describe R-1-gHASH HEAD^^^
85+
check_describe R-1-gHASH R-1-g$(git rev-parse --short HEAD^^)~1
8586

8687
check_describe c-7-gHASH --tags HEAD
8788
check_describe c-6-gHASH --tags HEAD^
8889
check_describe e-1-gHASH --tags HEAD^^
8990
check_describe c-2-gHASH --tags HEAD^^2
91+
check_describe c-2-gHASH --tags c-2-g$(git rev-parse --short HEAD^^2)^0
9092
check_describe B --tags HEAD^^2^
9193
check_describe e --tags HEAD^^^
9294
check_describe e --tags --exact-match HEAD^^^
@@ -725,4 +727,26 @@ test_expect_success '--exact-match does not show --always fallback' '
725727
test_must_fail git describe --exact-match --always
726728
'
727729

730+
test_expect_success 'avoid being fooled by describe-like filename' '
731+
test_when_finished rm out &&
732+
733+
git rev-parse --short HEAD >out &&
734+
FILENAME=filename-g$(cat out) &&
735+
touch $FILENAME &&
736+
git add $FILENAME &&
737+
git commit -m "Add $FILENAME" &&
738+
739+
git cat-file -t HEAD:$FILENAME >actual &&
740+
741+
echo blob >expect &&
742+
test_cmp expect actual
743+
'
744+
745+
test_expect_success 'do not be fooled by invalid describe format ' '
746+
test_when_finished rm out &&
747+
748+
git rev-parse --short HEAD >out &&
749+
test_must_fail git cat-file -t "refs/tags/super-invalid/./../...../ ~^:/?*[////\\\\\\&}/busted.lock-42-g"$(cat out)
750+
'
751+
728752
test_done

0 commit comments

Comments
 (0)