Missing JA4 fingerprints in output #136

Open
elpy1 opened this issue Jul 29, 2024 · 2 comments

elpy1 commented Jul 29, 2024

Hi 👋. While working on a personal project that implements JA4, I noticed some discrepancies when comparing JA4 (TCP) fingerprint output for some of the TLS PCAP files in your repo.

For example, I get the following TLS fingerprints from tls-handshake.pcapng:

$ python pcap.py --file ~/git/ext/ja4/pcap/tls-handshake.pcapng | sort | uniq -c | sort -nr
     54 t13d1516h2_8daaf6152771_e5627efa2ab1
      5 t13d1515h2_8daaf6152771_f37e75b10bcc
      3 t13d1516h1_8daaf6152771_e5627efa2ab1
      1 t13d1517h1_8daaf6152771_6cdcb247c39b
      1 t13d151400_8daaf6152771_de4a06bb82e3

With ja4.py I get the same fingerprints, but only 49 occurrences of the first one instead of 54:

$ python ja4.py --ja4 ~/git/ext/ja4/pcap/tls-handshake.pcapng | grep -E -o 't\w{9}_\w{12}_\w{12}' | sort | uniq -c | sort -nr
     49 t13d1516h2_8daaf6152771_e5627efa2ab1
      5 t13d1515h2_8daaf6152771_f37e75b10bcc
      3 t13d1516h1_8daaf6152771_e5627efa2ab1
      1 t13d1517h1_8daaf6152771_6cdcb247c39b
      1 t13d151400_8daaf6152771_de4a06bb82e3

With tshark (TShark (Wireshark) 4.2.6, Git commit fca52ffc018f) I get:

$ tshark -r  ~/git/ext/ja4/pcap/tls-handshake.pcapng -Y 'tls.handshake.type == 1' -Tfields -e 'tls.handshake.ja4' | grep '^t' | sort | uniq -c | sort -nr
     54 t13d1516h2_8daaf6152771_e5627efa2ab1
      5 t13d1515h2_8daaf6152771_f37e75b10bcc
      3 t13d1516h1_8daaf6152771_e5627efa2ab1
      1 t13d1517h1_8daaf6152771_6cdcb247c39b
      1 t13d151400_8daaf6152771_de4a06bb82e3

Upon looking into this a bit further, I realised the caching functionality in common.py is keyed on streams. So, if there is more than one fingerprint in a stream, does the earlier one get overwritten in the cache? Example stream:
[image: example stream]

I was able to resolve this locally by hacking together a change that uses a tuple containing the stream and frame number as the cache key, but this probably isn't suitable because it results in multiple outputs for a stream, instead of multiple fingerprints inside a single stream output.
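
A minimal sketch of the behaviour described above, assuming the cache is a plain dict keyed by stream number; the actual structure in common.py may differ. The records, function names, and frame numbers are hypothetical and only illustrate why a stream-only key drops a second ClientHello while a (stream, frame) key keeps it as a separate entry.

```python
# Hypothetical (stream, frame, ja4) records; not extracted from the PCAP.
records = [
    (0, 4, "t13d1516h2_8daaf6152771_e5627efa2ab1"),
    (0, 9, "t13d1515h2_8daaf6152771_f37e75b10bcc"),
    (1, 12, "t13d1516h2_8daaf6152771_e5627efa2ab1"),
]


def cache_by_stream(records):
    """Key on the stream only: a later fingerprint replaces an earlier one."""
    cache = {}
    for stream, _frame, ja4 in records:
        cache[stream] = ja4
    return cache


def cache_by_stream_and_frame(records):
    """Key on (stream, frame): every fingerprint gets its own entry."""
    cache = {}
    for stream, frame, ja4 in records:
        cache[(stream, frame)] = ja4
    return cache


by_stream = cache_by_stream(records)
by_frame = cache_by_stream_and_frame(records)
print(len(by_stream), len(by_frame))
```

With the composite key nothing is lost, but stream 0 now produces two separate entries, which is the drawback noted above.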

john-althouse added the enhancement label Aug 5, 2024

john-althouse (Collaborator) commented

Thanks for bringing this up! We should add any additional JA4s seen in a stream to the output as JA4.2, etc., like we do with JA4X, I think. Would that work?
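
One way the suggestion could look, as a sketch only: the first fingerprint in a stream keeps the plain `JA4` key and later ones are stored as `JA4.2`, `JA4.3`, and so on. The helper name `add_fingerprint` and the per-stream dict layout are assumptions for illustration, and the key format follows the "JA4.2" wording in this comment rather than the existing JA4X code, which was not checked.

```python
def add_fingerprint(entry, name, value):
    """Store value under name, or under name.2, name.3, ... if name is taken.

    Returns the key that was actually used.
    """
    if name not in entry:
        entry[name] = value
        return name
    n = 2
    while f"{name}.{n}" in entry:
        n += 1
    key = f"{name}.{n}"
    entry[key] = value
    return key


# Hypothetical per-stream output entry.
entry = {"stream": 0}
first_key = add_fingerprint(entry, "JA4", "t13d1516h2_8daaf6152771_e5627efa2ab1")
second_key = add_fingerprint(entry, "JA4", "t13d1515h2_8daaf6152771_f37e75b10bcc")
print(entry)
```

This keeps a single output entry per stream while preserving every fingerprint seen in it.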

elpy1 commented Aug 6, 2024

Considering the core functionality currently involves extracting fingerprints from each stream, that makes sense to me.

I'm simply grepping for the JA4 pattern, so it doesn't matter where it is in the output for my use-case. Thanks.
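
For reference, the `grep -E -o ... | sort | uniq -c | sort -nr` pipeline from the first comment can be approximated in Python as below. The regular expression is the one used in that command; the sample text is a made-up stand-in for tool output (including a hypothetical `JA4.2` key), so the sketch only shows that position-independent matching still picks up additional fingerprints.

```python
import re
from collections import Counter

# Same pattern as in the grep command: "t" + 9 word chars, then two
# 12-character parts separated by underscores.
JA4_PATTERN = re.compile(r"t\w{9}_\w{12}_\w{12}")


def count_ja4(text):
    """Count every JA4-shaped string in text, wherever it appears."""
    return Counter(JA4_PATTERN.findall(text))


# Hypothetical output lines; the real ja4.py output format may differ.
sample = (
    '{"stream": 0, "JA4": "t13d1516h2_8daaf6152771_e5627efa2ab1", '
    '"JA4.2": "t13d1515h2_8daaf6152771_f37e75b10bcc"}\n'
    '{"stream": 1, "JA4": "t13d1516h2_8daaf6152771_e5627efa2ab1"}\n'
)

counts = count_ja4(sample)
for fingerprint, n in counts.most_common():
    print(f"{n:7d} {fingerprint}")
```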
