rust ja4h not sorting cookies right #58

Closed
awick opened this issue Jan 26, 2024 · 2 comments · Fixed by #70

awick commented Jan 26, 2024

Wireshark and Arkime agree on the JA4H for the 7th session of https://github.com/arkime/arkime/raw/main/tests/pcap/single-packets.pcap, but the Rust implementation produces different cookie hashes.

rust:

  • ja4h: ge11cr06enus_8c2f9ef95269_fa51fb2862b2_c1eaa758c543
    ja4h_r: ge11cr06enus_Accept,Accept-Language,User-Agent,Accept-Encoding,Host,Connection_pardot,visitor_id413862-hash,visitor_id413862_pardot=tee2foreb3fefpgvk8u1056vt3,visitor_id413862-hash=1f00bdb076b5fb707c70254849819ec1797d3e27cef91a61a9488cb7ca0ebf77f226caa4075591b2591bf9a1ccdf29432c67379b,visitor_id413862=286585660

arkime:
"ja4h":["ge11cr06enus_8c2f9ef95269_d23bf79698dc_69e42fa741fe"]
"ja4h_r":["ge11cr06enus_Accept,Accept-Language,User-Agent,Accept-Encoding,Host,Connection_pardot,visitor_id413862,visitor_id413862-hash_pardot=tee2foreb3fefpgvk8u1056vt3,visitor_id413862=286585660,visitor_id413862-hash=1f00bdb076b5fb707c70254849819ec1797d3e27cef91a61a9488cb7ca0ebf77f226caa4075591b2591bf9a1ccdf29432c67379b"]

wireshark:
JA4H: ge11cr06enus_8c2f9ef95269_d23bf79698dc_69e42fa741fe
JA4H Raw [truncated]: ge11cr06enus_Accept,Accept-Language,User-Agent,Accept-Encoding,Host,Connection_pardot,visitor_id413862,visitor_id413862-hash_pardot=tee2foreb3fefpgvk8u1056vt3,visitor_id413862=286585660,visitor_id413862-hash=1f00bdb07
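
To see at a glance which parts of the fingerprint disagree, the two values can be split on `_` and compared chunk by chunk. This is only a small sketch; the function name `differing_chunks` and the a-d labels are my own shorthand, not part of any JA4 tool.

```python
# Fingerprints copied from the report above.
rust = "ge11cr06enus_8c2f9ef95269_fa51fb2862b2_c1eaa758c543"
arkime = "ge11cr06enus_8c2f9ef95269_d23bf79698dc_69e42fa741fe"


def differing_chunks(left: str, right: str) -> list[str]:
    """Return the labels (a-d) of the underscore-separated chunks that differ."""
    labels = ["a", "b", "c", "d"]
    triples = zip(labels, left.split("_"), right.split("_"))
    return [label for label, l_chunk, r_chunk in triples if l_chunk != r_chunk]


print(differing_chunks(rust, arkime))  # ['c', 'd']
```

So only the two cookie-related chunks (JA4H_c and JA4H_d) differ; the header chunks match.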

vvv self-assigned this Feb 3, 2024

vvv commented Feb 4, 2024

@awick Thanks for reporting this! 🙏🏻

  1. Indeed, the Rust app used to generate the JA4H_c chunk incorrectly. That is fixed now; see "Fix JA4SSH and JA4H" (#69).
  2. The JA4H_d chunk produced by Wireshark and Python is wrong. The JA4H for TCP stream 7 of single-packets.pcap should end with d23bf79698dc_c1eaa758c543.

JA4H calculation

cookie-string:

❯ cookie-string() { tshark -J http -r pcap/single-packets.pcap -T fields -e http.cookie 'tcp.stream == 7'; }

❯ cookie-string
visitor_id413862=286585660; visitor_id413862-hash=1f00bdb076b5fb707c70254849819ec1797d3e27cef91a61a9488cb7ca0ebf77f226caa4075591b2591bf9a1ccdf29432c67379b; pardot=tee2foreb3fefpgvk8u1056vt3

cookie-pairs:

❯ cookie-pairs() { cookie-string | sed 's/; /\n/g'; }

❯ cookie-pairs
visitor_id413862=286585660
visitor_id413862-hash=1f00bdb076b5fb707c70254849819ec1797d3e27cef91a61a9488cb7ca0ebf77f226caa4075591b2591bf9a1ccdf29432c67379b
pardot=tee2foreb3fefpgvk8u1056vt3

cookie-names:

❯ cookie-names() { cookie-pairs | cut -d= -f1; }

❯ cookie-names
visitor_id413862
visitor_id413862-hash
pardot

Comma-separated list of sorted cookie-names:

❯ cookie-names | LC_COLLATE=C sort | paste -s -d, -
pardot,visitor_id413862,visitor_id413862-hash

Helper function:

❯ comma_join() {
    # HACK: `tr -d \\n` removes the newline character
    # that `paste` appends to its output
    paste -s -d, - |
    tr -d \\n
}

JA4H_c:

❯ JA4H_c() { cookie-names | LC_COLLATE=C sort | comma_join | sha256sum | head -c 12; echo; }

❯ JA4H_c
d23bf79698dc
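
For readers who prefer not to chain shell tools, the same JA4H_c steps can be sketched in Python. This is an illustration only, not the code of any JA4 implementation; the helper names (`cookie_names`, `truncated_sha256`, `ja4h_c`) are made up for this example, and it assumes cookie pairs are separated by `"; "` exactly as in the tshark output above.

```python
import hashlib

# Cookie header value taken from the tshark output above.
COOKIE = (
    "visitor_id413862=286585660; "
    "visitor_id413862-hash=1f00bdb076b5fb707c70254849819ec1797d3e27cef91a61a9488cb7ca0ebf77f226caa4075591b2591bf9a1ccdf29432c67379b; "
    "pardot=tee2foreb3fefpgvk8u1056vt3"
)


def cookie_names(cookie_header: str) -> list[str]:
    # Split into pairs on "; ", then keep the part before the first "=".
    return [pair.split("=", 1)[0] for pair in cookie_header.split("; ")]


def truncated_sha256(text: str) -> str:
    # First 12 hex characters of the SHA-256 digest, like `sha256sum | head -c 12`.
    return hashlib.sha256(text.encode()).hexdigest()[:12]


def ja4h_c(cookie_header: str) -> str:
    # Python compares strings by code point, which for ASCII names gives the
    # same order as `LC_COLLATE=C sort`.
    return truncated_sha256(",".join(sorted(cookie_names(cookie_header))))


print(sorted(cookie_names(COOKIE)))
print(ja4h_c(COOKIE))
```

The joined string carries no trailing newline, which is the same thing the `tr -d \\n` in `comma_join` takes care of.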

Comma-separated list of sorted cookie-pairs:

❯ cookie-pairs | LC_COLLATE=C sort | paste -s -d, -
pardot=tee2foreb3fefpgvk8u1056vt3,visitor_id413862-hash=1f00bdb076b5fb707c70254849819ec1797d3e27cef91a61a9488cb7ca0ebf77f226caa4075591b2591bf9a1ccdf29432c67379b,visitor_id413862=286585660

JA4H_d:

❯ JA4H_d() { cookie-pairs | LC_COLLATE=C sort | comma_join | sha256sum | head -c12; echo; }

❯ JA4H_d
c1eaa758c543

vvv added a commit to vvv/ja4 that referenced this issue Feb 4, 2024
vvv mentioned this issue Feb 4, 2024
igr001-galactica pushed a commit that referenced this issue Feb 4, 2024
* [fix] Generate a JA4SSH fingerprint every 200 _SSH_ packets

* Fix calculation of JA4H_c

Related issue: #58

awick commented Feb 4, 2024

We should double-check with John, but I think _d is RIGHT in Wireshark and Arkime, because _d has to use the same cookie order as _c. You can NOT sort the cookie pairs directly; you have to walk through the _c list and build the _d list from it, because - sorts before = in byte order, so sorting whole pairs puts visitor_id413862-hash=... ahead of visitor_id413862=.... That particular capture is a good test case.

So everyone agrees C is:
pardot
visitor_id413862
visitor_id413862-hash

That means D should be:
pardot=tee2foreb3fefpgvk8u1056vt3
visitor_id413862=286585660
visitor_id413862-hash=1f00bdb076b5fb707c70254849819ec1797d3e27cef91a61a9488cb7ca0ebf77f226caa4075591b2591bf9a1ccdf29432c67379b

NOT
pardot=tee2foreb3fefpgvk8u1056vt3
visitor_id413862-hash=1f00bdb076b5fb707c70254849819ec1797d3e27cef91a61a9488cb7ca0ebf77f226caa4075591b2591bf9a1ccdf29432c67379b
visitor_id413862=286585660
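
The difference between the two orderings can be reproduced with a short Python sketch. The function names are hypothetical, and the dictionary lookup assumes cookie names are unique within the header, which holds for this capture but is not guaranteed in general.

```python
# Cookie pairs in the order they appear in the request (from the capture above).
PAIRS = [
    "visitor_id413862=286585660",
    "visitor_id413862-hash=1f00bdb076b5fb707c70254849819ec1797d3e27cef91a61a9488cb7ca0ebf77f226caa4075591b2591bf9a1ccdf29432c67379b",
    "pardot=tee2foreb3fefpgvk8u1056vt3",
]


def pairs_in_name_order(pairs: list[str]) -> list[str]:
    # Map each cookie name to its full "name=value" pair.
    by_name = {pair.split("=", 1)[0]: pair for pair in pairs}
    # Walk the sorted names (the _c list) and emit the matching pair (the _d list).
    return [by_name[name] for name in sorted(by_name)]


def pairs_sorted_as_strings(pairs: list[str]) -> list[str]:
    # Sorting whole pairs compares "-" (0x2D) with "=" (0x3D) right after the
    # shared prefix "visitor_id413862", so the "-hash" cookie moves ahead.
    return sorted(pairs)


for label, ordered in [
    ("by name", pairs_in_name_order(PAIRS)),
    ("as strings", pairs_sorted_as_strings(PAIRS)),
]:
    print(label, [pair.split("=", 1)[0] for pair in ordered])
```

Sorting by name keeps `visitor_id413862` ahead of `visitor_id413862-hash` because a string sorts before any longer string it is a prefix of; sorting the whole pairs loses that property.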

vvv added a commit to vvv/ja4 that referenced this issue Feb 4, 2024
Don't sort cookie-strings. Instead, split the cookie-pair on the first `'='`
and sort the vector of `(cookie-name, cookie-value)`.

Kudos to @awick for reporting the bug and explaining the correct semantics!

Closes FoxIO-LLC#58
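
The approach described in the commit message can be sketched roughly as follows. This is a Python illustration of the idea, not the actual Rust change; `split_pair`, `sorted_cookies`, and `raw_chunks` are hypothetical names, and the sample header `EXAMPLE` is invented to show a value that itself contains `=`.

```python
def split_pair(pair: str) -> tuple[str, str]:
    # partition() splits on the first "=" only, so "=" inside a value survives.
    name, _, value = pair.partition("=")
    return name, value


def sorted_cookies(cookie_header: str) -> list[tuple[str, str]]:
    # Tuples compare element by element, so the cookie name decides the order
    # and the value is only a tie-breaker.
    return sorted(split_pair(pair) for pair in cookie_header.split("; "))


def raw_chunks(cookie_header: str) -> tuple[str, str]:
    # Build the unhashed name list and pair list from the same sorted sequence,
    # so both chunks always share one cookie order.
    cookies = sorted_cookies(cookie_header)
    names = ",".join(name for name, _ in cookies)
    pairs = ",".join(f"{name}={value}" for name, value in cookies)
    return names, pairs


# Hypothetical header, chosen so that "a" and "a-z" hit the "-" versus "=" case.
EXAMPLE = "b=2; a=x=y; a-z=1"
print(raw_chunks(EXAMPLE))  # ('a,a-z,b', 'a=x=y,a-z=1,b=2')
```

Hashing each of the two strings with SHA-256 and keeping the first 12 hex characters would then give the two cookie chunks.
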
igr001-galactica pushed a commit that referenced this issue Feb 5, 2024