Add fields for executable object format metadata for ELF, Mach-O and PE #2083

Merged on Nov 1, 2022 (5 commits)


@efd6 efd6 commented Oct 26, 2022

Fields are added to ELF for malware detection signatures, and the corresponding ELF fields are mirrored in PE. A new macOS Mach-O field group is added, although it does not cover every possible Mach-O field.

Go-specific executable fields are added because the import structure of Go executables is not described by the standard dynamic library imports, and Go is a reasonably popular language both generally and, significantly, in malware.

See discussion at elastic/beats#28802 (comment).

Optimistically putting in 8.6, but happy to bump to 8.7 if this is too tight. If this needs an RFC will do that.

ELF:

  • file.elf.go_import_hash
  • file.elf.go_imports
  • file.elf.go_imports_names_entropy
  • file.elf.go_imports_names_var_entropy
  • file.elf.go_stripped
  • file.elf.import_hash — alias for ELF equivalent of imphash to simplify cross-platform searches
  • file.elf.imports_names_entropy
  • file.elf.imports_names_var_entropy
  • file.elf.sections.var_entropy

PE:

  • file.pe.go_import_hash
  • file.pe.go_imports
  • file.pe.go_imports_names_entropy
  • file.pe.go_imports_names_var_entropy
  • file.pe.go_stripped
  • file.pe.imports_names_entropy
  • file.pe.imports_names_var_entropy
  • file.pe.sections.var_entropy

The following are added to PE as analogues of the existing ELF fields. All bar virtual_size are strictly informative (virtual_size is always equal to physical_size, since only ELF has compressed sections):

  • file.pe.import_hash — alias for imphash to simplify cross-platform searches
  • file.pe.imports
  • file.pe.sections
  • file.pe.sections.entropy
  • file.pe.sections.name
  • file.pe.sections.physical_size
  • file.pe.sections.virtual_size — always the same as physical_size on PE (meaningful on ELF as a potential marker for evasive ELF)

ECS doesn't have any fields for Mach-O so all the following are new:

  • file.macho.go_import_hash
  • file.macho.go_imports_names_entropy
  • file.macho.go_imports_names_var_entropy
  • file.macho.go_imports
  • file.macho.go_stripped
  • file.macho.import_hash — alias for Mach-O equivalent of imphash
  • file.macho.imports
  • file.macho.imports_names_entropy
  • file.macho.imports_names_var_entropy
  • file.macho.sections
  • file.macho.sections.entropy
  • file.macho.sections.name
  • file.macho.sections.physical_size
  • file.macho.sections.var_entropy
  • file.macho.sections.virtual_size — always the same as physical_size on Mach-O (meaningful on ELF as a potential marker for evasive ELF)
  • file.macho.symhash — alias for Mach-O equivalent of imphash

For elastic/beats#28802

Please take a look.

@efd6 efd6 requested review from ebeahan and peasead October 26, 2022 23:56
@efd6 efd6 self-assigned this Oct 26, 2022
efd6 commented Oct 27, 2022

The failure appears to be due to the absence of the [ecs-macho] section in field_details.

INFO:build_docs:The Asciidoctor migration is complete! --asciidoctor will emit this message
INFO:build_docs:forever in honor of our success but otherwise doesn't do anything.
INFO:build_docs:
INFO:build_docs:Building HTML from /doc/ecs/docs/index.asciidoc
INFO:build_docs:Guessed toplevel=[/doc/ecs] remote=[git@github.com:elastic/ecs] branch=[master] repo=[ecs]
INFO:build_docs:Browserslist: caniuse-lite is outdated. Please run next command `npm update`
INFO:build_docs:
INFO:build_docs:asciidoctor: WARNING: invalid reference: ecs-macho
make: *** [docs] Error 255

I don't see how to generate this. Resolved: mark fields in schemas/subsets/main.yml

@efd6 efd6 marked this pull request as ready for review October 27, 2022 00:30
@efd6 efd6 requested a review from a team as a code owner October 27, 2022 00:30
@peasead peasead left a comment

Though I don't approve ECS merges, this LGTM.

@andrewkroh andrewkroh left a comment

My hope is that ECS contains enough detail that there is a high probability that all implementations produce the same values for these fields. With that in mind, when you read the descriptions, do you think there's enough detail?

schemas/elf.yml Outdated
- name: go_import_hash
  short: A hash of the Go language imports in an ELF file.
  description: >
    A hash of the Go language imports in an ELF file. An import hash can be used to

Member

Is this a well-known hash standard? If so, can we link to its documentation? If not, can we write a document somewhere that describes how this is calculated?

Contributor Author

The Go import hash is imphash/symhash but using the static Go imports rather than the dynamic imports. There is no standard. Where would you like a document put?

The parallels can be seen here https://github.com/elastic/toutoumomoma/blob/594ef30cb64046757baa988d4451b8f547dcb695/toutoumomoma.go#L149-L231
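
For readers unfamiliar with imphash-style fingerprints, here is a minimal sketch of the general idea in Go. It assumes that the import names are joined with commas in the order given and hashed with MD5. The function name `importHash` and the sample package paths are hypothetical. The linked toutoumomoma code remains the reference, and it may normalise or order names differently.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"strings"
)

// importHash returns the hex-encoded MD5 of the comma-joined import names.
// The join character and the ordering are assumptions made for this sketch.
func importHash(imports []string) string {
	sum := md5.Sum([]byte(strings.Join(imports, ",")))
	return hex.EncodeToString(sum[:])
}

func main() {
	// Hypothetical static Go imports recovered from a binary.
	imports := []string{
		"github.com/example/dropper",
		"golang.org/x/sys/unix",
	}
	fmt.Println(importHash(imports))
}
```

Because only the import list feeds the hash, two builds with the same imports produce the same value even when their code differs.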

Member

I don't have a strong preference. How about in the https://github.com/elastic/toutoumomoma repo? You could point to the code as the reference implementation.

@efd6 efd6 Oct 31, 2022

Will do.

  • Enhance readme in toutoumomoma.
  • Pull description from *File.ImportHash out to godoc for both it and *File.GoSymbolHash
  • Link to relevant locations in the ECS descriptions

An import hash can be used to fingerprint binaries even after recompilation or other
code-level transformations have occurred, which would change more traditional hash values.

The algorithm used to calculate the Go symbol hash and a reference implementation

Member

Does ECS expect this calculated with or without the inclusion of the stdlib imports?

Contributor Author

I guess that depends on what we decide here since it's a new symbol. I excluded stdlib packages because it reduces the size of indexed documents. However, I can see some merit in including stdlib given that it is entirely feasible to write malicious code using only stdlib packages. The argument against that is that the packages that you might use to do so are entirely valid for benign applications.

We have to make a decision and this seems like the better of the two.

Member

> excluding standard library import

Oh, it is specified. I came up with the question after reading the changes in toutoumomoma and didn't re-read this part.

By including stdlib we would have some signal to use when the binary does not utilize external packages. I assume that for any binary that only uses stdlib that the result is MD5("").

Contributor Author

> I assume that for any binary that only uses stdlib that the result is MD5("").

Yes, this is correct: d41d8cd98f00b204e9800998ecf8427e

Member

I don't have a strong opinion either way, but I am happy you pointed out the trade-offs. So I'm good with proceeding as is specified.

efd6 added 5 commits November 2, 2022 07:31
Fields are added to ELF for malware detection signatures and fields in ELF for
this are reflected in PE. A new Mac OS Mach-O field group is added. Not all
possible fields are added to Mach-O.
@efd6 efd6 merged commit 3b182ef into elastic:main Nov 1, 2022

3 participants