fix: update ja4 compliance #4773

Merged
merged 6 commits into from
Sep 20, 2024
4 changes: 2 additions & 2 deletions api/unstable/fingerprint.h
Expand Up @@ -106,7 +106,7 @@ S2N_API int s2n_fingerprint_get_hash_size(const struct s2n_fingerprint *fingerpr
*
* JA4: A string consisting of three parts, separated by underscores: the prefix,
* and the hex-encoded truncated SHA256 hashes of the other two parts of the raw string.
* - See https://github.com/FoxIO-LLC/ja4/blob/v0.18.2/technical_details/JA4.md
* - See https://github.com/FoxIO-LLC/ja4/blob/df3c067/technical_details/JA4.md
* - Example: "t13i310900_e8f1e7e78f70_1f22a2ca17c4"
*
* @param fingerprint The s2n_fingerprint to be used for the hash
Expand Down Expand Up @@ -145,7 +145,7 @@ S2N_API int s2n_fingerprint_get_raw_size(const struct s2n_fingerprint *fingerpri
* 156-61-60-53-47-255,11-10-35-22-23-13-43-45-51,29-23-30-25-24,0-1-2"
*
* JA4: A string consisting of three parts: a prefix, and two lists of hex values.
* - See https://github.com/FoxIO-LLC/ja4/blob/v0.18.2/technical_details/JA4.md
* - See https://github.com/FoxIO-LLC/ja4/blob/df3c067/technical_details/JA4.md
* - Example: "t13i310900_002f,0033,0035,0039,003c,003d,0067,006b,009c,009d,009e,
* 009f,00ff,1301,1302,1303,c009,c00a,c013,c014,c023,c024,c027,c028,
* c02b,c02c,c02f,c030,cca8,cca9,ccaa_000a,000b,000d,0016,0017,0023,
Expand Down
Expand Up @@ -5,7 +5,7 @@
JA4 looks at the TLS Client Hello packet and builds a fingerprint of the client based on attributes within the packet.

### JA4 Algorithm:
(QUIC=”q” or TCP=”t”)
(QUIC=”q”, DTLS="d", or Normal TLS=”t”)
(2 character TLS version)
(SNI=”d” or no SNI=”i”)
(2 character count of ciphers)
Expand All @@ -22,22 +22,29 @@ t13d1516h2_8daaf6152771_b186095e22b6
## Details:
The program needs to ignore GREASE values anywhere it sees them: (https://datatracker.ietf.org/doc/html/draft-davidben-tls-grease-01#page-5)

### QUIC:
### QUIC and DTLS:
“q”, "d" or “t”, denotes whether the hello packet is for QUIC, DTLS, or normal TLS.

https://en.wikipedia.org/wiki/QUIC
“q” or “t”, which denotes whether the hello packet is for QUIC or TCP. QUIC is the protocol which the new HTTP/3 standard utilizes, encapsulating TLS 1.3 into UDP packets. As QUIC was developed by Google, if an organization heavily utilizes Google products, QUIC could make up half of their network traffic, so this is important to capture.
QUIC is the protocol which the new HTTP/3 standard utilizes, encapsulating TLS 1.3 into UDP packets. As QUIC was developed by Google, if an organization heavily utilizes Google products, QUIC could make up half of their network traffic, so this is important to capture.

https://en.wikipedia.org/wiki/Datagram_Transport_Layer_Security
DTLS is a version of TLS that can operate over UDP or SCTP.

If the protocol is QUIC then the first character of the fingerprint is “q” if not, it’s “t”.
If the protocol is QUIC then the first character of the fingerprint is “q”, if DTLS it is "d", else it is “t”.

### TLS Version:
TLS version is shown in 3 different places. If extension 0x002b exists (supported_versions), then the version is the highest value in the extension. Remember to ignore GREASE values. If the extension doesn’t exist, then the TLS version is the value of the Protocol Version. Handshake version (located at the top of the packet) should be ignored.
### TLS and DTLS Version:
The TLS version is shown in 3 different places. If extension 0x002b exists (supported_versions), then the version is the highest value in the extension. Remember to ignore GREASE values. If the extension doesn’t exist, then the TLS version is the value of the Protocol Version. Handshake version (located at the top of the packet) should be ignored.

0x0304 = TLS 1.3 = “13”
0x0303 = TLS 1.2 = “12”
0x0302 = TLS 1.1 = “11”
0x0301 = TLS 1.0 = “10”
0x0300 = SSL 3.0 = “s3”
0x0200 = SSL 2.0 = “s2”
0x0100 = SSL 1.0 = “s1”
0x0002 = SSL 2.0 = “s2”
0xfeff = DTLS 1.0 = "d1"
0xfefd = DTLS 1.2 = "d2"
0xfefc = DTLS 1.3 = "d3"

Unknown = “00”
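The version lookup described above can be sketched as follows. This is a minimal illustration, not the s2n-tls implementation: the names `VERSION_CODES`, `is_grease`, and `ja4_version` are hypothetical, the table uses the `0x0002` SSL 2.0 row and the DTLS rows (assuming the `0x0200` and `0x0100` rows are the removed side of the diff), and falling back to the Protocol Version when the extension holds only GREASE values is an assumption the text does not spell out. The sketch follows the literal "highest value" wording and does not address how that ordering should apply to DTLS code points.

```python
from typing import Iterable, Optional

# Code points and their two-character JA4 labels, taken from the table above.
VERSION_CODES = {
    0x0304: "13",
    0x0303: "12",
    0x0302: "11",
    0x0301: "10",
    0x0300: "s3",
    0x0002: "s2",
    0xFEFF: "d1",
    0xFEFD: "d2",
    0xFEFC: "d3",
}


def is_grease(value: int) -> bool:
    """GREASE values are 0x0a0a, 0x1a1a, ..., 0xfafa."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def ja4_version(protocol_version: int,
                supported_versions: Optional[Iterable[int]] = None) -> str:
    """Return the two-character version label for a Client Hello.

    supported_versions is None when extension 0x002b is absent.
    """
    if supported_versions is not None:
        candidates = [v for v in supported_versions if not is_grease(v)]
        if candidates:
            # Highest non-GREASE value in the extension wins.
            return VERSION_CODES.get(max(candidates), "00")
        # Assumption: an extension with no usable values falls through
        # to the Protocol Version field.
    return VERSION_CODES.get(protocol_version, "00")
```

For example, a hello with Protocol Version `0x0303` whose supported_versions lists a GREASE value, `0x0304`, and `0x0303` would be labeled from `0x0304`, while the same hello without the extension would be labeled from `0x0303`.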

Expand All @@ -51,16 +58,21 @@ If the SNI extension (0x0000) exists, then the destination of the connection is
Same as counting ciphers. Ignore GREASE. Include SNI and ALPN.

### ALPN Extension Value:
The first and last characters of the ALPN (Application-Layer Protocol Negotiation) first value.
The first and last alphanumeric characters of the ALPN (Application-Layer Protocol Negotiation) first value.
List of possible ALPN Values (scroll down): https://www.iana.org/assignments/tls-extensiontype-values/tls-extensiontype-values.xhtml



In the above example, the first ALPN value is h2 so the first and last characters to use in the fingerprint are “h2”. IF the first ALPN listed was http/1.1 then the first and last characters to use in the fingerprint would be “h1”.
In the above example, the first ALPN value is h2 so the first and last characters to use in the fingerprint are “h2”. If the first ALPN listed was http/1.1 then the first and last characters to use in the fingerprint would be “h1”.

In Wireshark this field is located under tls.handshake.extensions_alpn_str

If there are no ALPN values or no ALPN extension then we print “00” as the value in the fingerprint.
If there is no ALPN extension, no ALPN values, or the first ALPN value is empty, then we print "00" as the value in the fingerprint. If the first ALPN value is only a single character, then that character is treated as both the first and last character.

If the first or last byte of the first ALPN is non-alphanumeric (meaning not `0x30-0x39`, `0x41-0x5A`, or `0x61-0x7A`), then we print the first and last characters of the hex representation of the first ALPN instead. For example:
* `0xAB` would be printed as "ab"
* `0xAB 0xCD` would be printed as "ad"
* `0x30 0xAB` would be printed as "3b"
* `0x30 0x31 0xAB 0xCD` would be printed as "3d"
* `0x30 0xAB 0xCD 0x31` would be printed as "01"
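The ALPN rules above, including the hex fallback, can be sketched roughly like this. The function names `_is_alnum` and `ja4_alpn_chars` are hypothetical and chosen for illustration; the input is assumed to be the raw bytes of the first ALPN value, or `None` when the extension or its list is missing.

```python
from typing import Optional


def _is_alnum(byte: int) -> bool:
    """True for 0x30-0x39, 0x41-0x5A, or 0x61-0x7A."""
    return (0x30 <= byte <= 0x39
            or 0x41 <= byte <= 0x5A
            or 0x61 <= byte <= 0x7A)


def ja4_alpn_chars(first_alpn: Optional[bytes]) -> str:
    """Return the two ALPN characters used in the JA4_a section."""
    # No extension, no values, or an empty first value all print "00".
    if not first_alpn:
        return "00"

    # For a single-byte value, index 0 and index -1 are the same byte,
    # so that byte serves as both the first and last character.
    first, last = first_alpn[0], first_alpn[-1]
    if _is_alnum(first) and _is_alnum(last):
        return chr(first) + chr(last)

    # Otherwise use the first and last characters of the lowercase
    # hex representation of the whole value.
    hex_repr = first_alpn.hex()
    return hex_repr[0] + hex_repr[-1]
```

Only the first and last bytes decide which branch is taken, which is why `0x30 0xAB 0xCD 0x31` prints as "01" even though its middle bytes are non-alphanumeric.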

### Cipher hash:
A 12 character truncated sha256 hash of the list of ciphers sorted in hex order, first 12 characters. The list is created using the 4 character hex values of the ciphers, lower case, comma delimited, ignoring GREASE.
Expand All @@ -73,10 +85,13 @@ Is sorted to:
002f,0035,009c,009d,1301,1302,1303,c013,c014,c02b,c02c,c02f,c030,cca8,cca9 = 8daaf6152771
```

If there are no ciphers in the sorted cipher list, then the value of JA4_b is set to `000000000000`
We do this rather than running a sha256 hash of nothing as this makes it clear to the user when a field has no values.
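A minimal sketch of the cipher hash, including the empty-list case, might look like the following. The names `is_grease` and `ja4_b` are hypothetical, the input is assumed to be the cipher suite values as integers in the order they appear in the hello, and whether duplicate values should be collapsed is not stated in the text, so the sketch keeps them.

```python
import hashlib
from typing import Iterable


def is_grease(value: int) -> bool:
    """GREASE values are 0x0a0a, 0x1a1a, ..., 0xfafa."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def ja4_b(ciphers: Iterable[int]) -> str:
    """Return the 12-character JA4_b section for a list of cipher values."""
    # Sorting the integers matches hex order because every value is
    # rendered as exactly four zero-padded hex digits.
    kept = sorted(c for c in ciphers if not is_grease(c))
    if not kept:
        # An empty list is reported as zeros rather than a hash of nothing.
        return "000000000000"

    raw = ",".join(f"{c:04x}" for c in kept)
    return hashlib.sha256(raw.encode("ascii")).hexdigest()[:12]
```

The text above states that the sorted list beginning `002f,0035,009c` hashes to `8daaf6152771`, so passing those fifteen cipher values to `ja4_b` is a convenient way to check the sketch against the specification.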

### Extension hash:
A 12 character truncated sha256 hash of the list of extensions, sorted by hex value, followed by the list of signature algorithms, in the order that they appear (not sorted).

The extension list is created using the 4 character hex values of the extensions, lower case, comma delimited, sorted (not in the order they appear). Ignore the SNI extension (0000) and the ALPN extension (0010) as we’ve already captured them in the _a_ section of the fingerprint. These values are omitted so that the same application would have the same _b_ section of the fingerprint regardless of if it were going to a domain, IP, or changing ALPNs.
The extension list is created using the 4 character hex values of the extensions, lower case, comma delimited, sorted (not in the order they appear). Ignore the SNI extension (0000) and the ALPN extension (0010) as we’ve already captured them in the _a_ section of the fingerprint. These values are omitted so that the same application would have the same _c_ section of the fingerprint regardless of if it were going to a domain, IP, or changing ALPNs.

For example:
```
Expand Down Expand Up @@ -112,6 +127,9 @@ For example:
0005,000a,000b,000d,0012,0015,0017,001b,0023,002b,002d,0033,4469,ff01 = 6d807ffa2a79
```

If there are no extensions in the sorted extensions list, then the value of JA4_c is set to `000000000000`
We do this rather than running a sha256 hash of nothing as this makes it clear to the user when a field has no values.
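The extension hash can be sketched in the same style. The names `is_grease` and `ja4_c` are hypothetical. Two points are assumptions rather than statements from the text: the empty check is applied after GREASE, SNI, and ALPN have been dropped, and GREASE values are also filtered from the signature algorithms under the general "ignore GREASE anywhere" rule.

```python
import hashlib
from typing import Iterable

SNI_EXTENSION = 0x0000
ALPN_EXTENSION = 0x0010


def is_grease(value: int) -> bool:
    """GREASE values are 0x0a0a, 0x1a1a, ..., 0xfafa."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def ja4_c(extensions: Iterable[int],
          signature_algorithms: Iterable[int]) -> str:
    """Return the 12-character JA4_c section."""
    # Extensions are sorted; SNI and ALPN are already in JA4_a.
    kept = sorted(
        e for e in extensions
        if not is_grease(e) and e not in (SNI_EXTENSION, ALPN_EXTENSION)
    )
    if not kept:
        # Assumption: "no extensions" is judged after filtering.
        return "000000000000"

    raw = ",".join(f"{e:04x}" for e in kept)

    # Signature algorithms keep their original order. With none present,
    # the string ends without an underscore and is hashed as-is.
    sigs = [s for s in signature_algorithms if not is_grease(s)]
    if sigs:
        raw += "_" + ",".join(f"{s:04x}" for s in sigs)

    return hashlib.sha256(raw.encode("ascii")).hexdigest()[:12]
```

Because SNI and ALPN are dropped before hashing, two hellos that differ only in those extensions produce the same JA4_c value, which is the property the paragraph above describes.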

### Example

JA4 fingerprint:
Expand Down
@@ -0,0 +1,44 @@
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/df3c067/technical_details/JA4.md#alpn-extension-value"

# ### ALPN Extension Value:
#
# The first and last alphanumeric characters of the ALPN (Application-Layer Protocol Negotiation) first value.
# List of possible ALPN Values (scroll down): https://www.iana.org/assignments/tls-extensiontype-values/tls-extensiontype-values.xhtml
#
# In the above example, the first ALPN value is h2 so the first and last characters to use in the fingerprint are “h2”. If the first ALPN listed was http/1.1 then the first and last characters to use in the fingerprint would be “h1”.
#
# In Wireshark this field is located under tls.handshake.extensions_alpn_str
#
# If there is no ALPN extension, no ALPN values, or the first ALPN value is empty, then we print "00" as the value in the fingerprint. If the first ALPN value is only a single character, then that character is treated as both the first and last character.
#
# If the first or last byte of the first ALPN is non-alphanumeric (meaning not `0x30-0x39`, `0x41-0x5A`, or `0x61-0x7A`), then we print the first and last characters of the hex representation of the first ALPN instead. For example:
# * `0xAB` would be printed as "ab"
# * `0xAB 0xCD` would be printed as "ad"
# * `0x30 0xAB` would be printed as "3b"
# * `0x30 0x31 0xAB 0xCD` would be printed as "3d"
# * `0x30 0xAB 0xCD 0x31` would be printed as "01"
#

[[spec]]
level = "MUST"
quote = '''
The first and last alphanumeric characters of the ALPN (Application-Layer Protocol Negotiation) first value.
'''

[[spec]]
level = "MUST"
quote = '''
If there is no ALPN extension, no ALPN values, or the first ALPN value is empty, then we print "00" as the value in the fingerprint.
'''

[[spec]]
level = "MUST"
quote = '''
If the first ALPN value is only a single character, then that character is treated as both the first and last character.
'''

[[spec]]
level = "MUST"
quote = '''
If the first or last byte of the first ALPN is non-alphanumeric (meaning not `0x30-0x39`, `0x41-0x5A`, or `0x61-0x7A`), then we print the first and last characters of the hex representation of the first ALPN instead.
'''
@@ -1,7 +1,6 @@
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/v0.18.2/technical_details/JA4.md#cipher-hash"
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/df3c067/technical_details/JA4.md#cipher-hash"

# ### Cipher hash:
#
# A 12 character truncated sha256 hash of the list of ciphers sorted in hex order, first 12 characters. The list is created using the 4 character hex values of the ciphers, lower case, comma delimited, ignoring GREASE.
# Example:
# ```
Expand All @@ -12,6 +11,9 @@ target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/v0.18.2/technical_deta
# 002f,0035,009c,009d,1301,1302,1303,c013,c014,c02b,c02c,c02f,c030,cca8,cca9 = 8daaf6152771
# ```
#
# If there are no ciphers in the sorted cipher list, then the value of JA4_b is set to `000000000000`
# We do this rather than running a sha256 hash of nothing as this makes it clear to the user when a field has no values.
#

[[spec]]
level = "MUST"
Expand All @@ -25,3 +27,8 @@ quote = '''
The list is created using the 4 character hex values of the ciphers, lower case, comma delimited, ignoring GREASE.
'''

[[spec]]
level = "MUST"
quote = '''
If there are no ciphers in the sorted cipher list, then the value of JA4_b is set to `000000000000`
'''
@@ -1,4 +1,4 @@
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/v0.18.2/technical_details/JA4.md#details"
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/df3c067/technical_details/JA4.md#details"

# ## Details:
#
Expand Down
@@ -1,4 +1,4 @@
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/v0.18.2/technical_details/JA4.md#example"
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/df3c067/technical_details/JA4.md#example"

# ### Example
#
Expand Down
@@ -1,10 +1,10 @@
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/v0.18.2/technical_details/JA4.md#extension-hash"
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/df3c067/technical_details/JA4.md#extension-hash"

# ### Extension hash:
#
# A 12 character truncated sha256 hash of the list of extensions, sorted by hex value, followed by the list of signature algorithms, in the order that they appear (not sorted).
#
# The extension list is created using the 4 character hex values of the extensions, lower case, comma delimited, sorted (not in the order they appear). Ignore the SNI extension (0000) and the ALPN extension (0010) as we’ve already captured them in the _a_ section of the fingerprint. These values are omitted so that the same application would have the same _b_ section of the fingerprint regardless of if it were going to a domain, IP, or changing ALPNs.
# The extension list is created using the 4 character hex values of the extensions, lower case, comma delimited, sorted (not in the order they appear). Ignore the SNI extension (0000) and the ALPN extension (0010) as we’ve already captured them in the _a_ section of the fingerprint. These values are omitted so that the same application would have the same _c_ section of the fingerprint regardless of if it were going to a domain, IP, or changing ALPNs.
#
# For example:
# ```
Expand Down Expand Up @@ -40,6 +40,9 @@ target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/v0.18.2/technical_deta
# 0005,000a,000b,000d,0012,0015,0017,001b,0023,002b,002d,0033,4469,ff01 = 6d807ffa2a79
# ```
#
# If there are no extensions in the sorted extensions list, then the value of JA4_c is set to `000000000000`
# We do this rather than running a sha256 hash of nothing as this makes it clear to the user when a field has no values.
#

[[spec]]
level = "MUST"
Expand Down Expand Up @@ -70,3 +73,9 @@ level = "MUST"
quote = '''
If there are no signature algorithms in the hello packet, then the string ends without an underscore and is hashed.
'''

[[spec]]
level = "MUST"
quote = '''
If there are no extensions in the sorted extensions list, then the value of JA4_c is set to `000000000000`
'''
@@ -1,8 +1,8 @@
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/v0.18.2/technical_details/JA4.md#ja4-algorithm"
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/df3c067/technical_details/JA4.md#ja4-algorithm"

# ### JA4 Algorithm:
#
# (QUIC=”q” or TCP=”t”)
# (QUIC=”q”, DTLS="d", or Normal TLS=”t”)
# (2 character TLS version)
# (SNI=”d” or no SNI=”i”)
# (2 character count of ciphers)
Expand Down
@@ -1,4 +1,4 @@
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/v0.18.2/technical_details/JA4.md#ja4-tls-client-fingerprinting"
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/df3c067/technical_details/JA4.md#ja4-tls-client-fingerprinting"

# # JA4: TLS Client Fingerprinting
#
Expand Down
@@ -1,4 +1,4 @@
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/v0.18.2/technical_details/JA4.md#number-of-ciphers"
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/df3c067/technical_details/JA4.md#number-of-ciphers"

# ### Number of Ciphers:
#
Expand Down
@@ -1,4 +1,4 @@
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/v0.18.2/technical_details/JA4.md#number-of-extensions"
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/df3c067/technical_details/JA4.md#number-of-extensions"

# ### Number of Extensions:
#
Expand Down
@@ -0,0 +1,19 @@
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/df3c067/technical_details/JA4.md#quic-and-dtls"

# ### QUIC and DTLS:
# “q”, "d" or “t”, denotes whether the hello packet is for QUIC, DTLS, or normal TLS.
#
# https://en.wikipedia.org/wiki/QUIC
# QUIC is the protocol which the new HTTP/3 standard utilizes, encapsulating TLS 1.3 into UDP packets. As QUIC was developed by Google, if an organization heavily utilizes Google products, QUIC could make up half of their network traffic, so this is important to capture.
#
# https://en.wikipedia.org/wiki/Datagram_Transport_Layer_Security
# DTLS is a version of TLS that can operate over UDP or SCTP.
#
# If the protocol is QUIC then the first character of the fingerprint is “q”, if DTLS it is "d", else it is “t”.
#

[[spec]]
level = "MUST"
quote = '''
If the protocol is QUIC then the first character of the fingerprint is “q”, if DTLS it is "d", else it is “t”.
'''
@@ -1,4 +1,4 @@
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/v0.18.2/technical_details/JA4.md#raw-output"
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/df3c067/technical_details/JA4.md#raw-output"

# ### Raw Output
#
Expand Down
@@ -1,4 +1,4 @@
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/v0.18.2/technical_details/JA4.md#sni"
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/df3c067/technical_details/JA4.md#sni"

# ### SNI:
#
Expand Down
@@ -1,16 +1,17 @@
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/v0.18.2/technical_details/JA4.md#tls-version"
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/df3c067/technical_details/JA4.md#tls-and-dtls-version"

# ### TLS Version:
#
# TLS version is shown in 3 different places. If extension 0x002b exists (supported_versions), then the version is the highest value in the extension. Remember to ignore GREASE values. If the extension doesn’t exist, then the TLS version is the value of the Protocol Version. Handshake version (located at the top of the packet) should be ignored.
# ### TLS and DTLS Version:
# The TLS version is shown in 3 different places. If extension 0x002b exists (supported_versions), then the version is the highest value in the extension. Remember to ignore GREASE values. If the extension doesn’t exist, then the TLS version is the value of the Protocol Version. Handshake version (located at the top of the packet) should be ignored.
#
# 0x0304 = TLS 1.3 = “13”
# 0x0303 = TLS 1.2 = “12”
# 0x0302 = TLS 1.1 = “11”
# 0x0301 = TLS 1.0 = “10”
# 0x0300 = SSL 3.0 = “s3”
# 0x0200 = SSL 2.0 = “s2”
# 0x0100 = SSL 1.0 = “s1”
# 0x0002 = SSL 2.0 = “s2”
# 0xfeff = DTLS 1.0 = "d1"
# 0xfefd = DTLS 1.2 = "d2"
# 0xfefc = DTLS 1.3 = "d3"
#
# Unknown = “00”
#
Expand Down

This file was deleted.

This file was deleted.

Binary file added tests/pcap/data/no_extensions.pcap
Binary file not shown.