remove the concept of relocatable files from the loader #1189

Closed
ivg opened this issue Jul 24, 2020 · 0 comments · Fixed by #1187

ivg commented Jul 24, 2020

Historically, we designated a special class of binaries, the relocatable files, and treated them specially. Over time, as we developed our backends, we enabled relocation (via llvm-offset and bias) for all files, so there is no real need to separate them, and we can remove a lot of code duplication.

More importantly, presumably due to some confusion, we output relocations and other special classes of entries only for relocatable files, despite the fact that they can also occur in regular files. So we lose important information here.

In the same vein, I would like our backend not to ignore any information. I noticed that we prune some relocations and symbols if they have zero addresses or other values that are not yet filled in by the system loader. This information shouldn't be hidden; if necessary, it can be filtered later.

Basically, I would like to see as few if/then/else branches as possible in the part that produces the binary specification. All the loading logic should go to the disassemblers and other downstream components. The loader should just provide the information as is.
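
The split between "report everything" and "filter later" can be illustrated with a minimal sketch. It is written in Python for brevity rather than in BAP's own languages, and every name in it (`Relocation`, `emit_relocations`, `resolved_only`) is hypothetical; it is not BAP's loader API. The loader side reports each entry unconditionally, including entries whose target is still zero, and a downstream consumer decides what to drop.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Relocation:
    """A hypothetical relocation entry as read from the file."""
    fixup: int   # address of the place to be patched
    target: int  # address of the referenced symbol; 0 if not yet filled in


def emit_relocations(raw: Iterable[Tuple[int, int]]) -> Iterator[Relocation]:
    """Loader side: report every entry as is, with no conditions."""
    for fixup, target in raw:
        yield Relocation(fixup, target)


def resolved_only(relocs: Iterable[Relocation]) -> List[Relocation]:
    """Downstream side: a consumer that needs resolved targets filters here."""
    return [r for r in relocs if r.target != 0]


# Made-up entries; the second one has a target that is not yet filled in.
raw_entries = [(0x1000, 0x2000), (0x1008, 0x0), (0x1010, 0x2040)]
spec = list(emit_relocations(raw_entries))
```

Here `spec` keeps all three entries, while `resolved_only(spec)` returns only the two with non-zero targets. The zero-target entry stays available to any consumer that wants it.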

We started removing the special treatment of relocatable files in #1187, but more work is needed. This issue can be closed once we remove the is-relocatable property from the specification.

ivg added a commit to ivg/bap that referenced this issue Aug 3, 2020
Dropping support of old LLVM and legacy backends
-------------------------------------------------

We drop a lot of old code (minus 3k lines of code) thus removing the support
burden and making it easier to maintain, fix, and upgrade the code.

Fixes BinaryAnalysisPlatform#1166

Simplifies the implementation
-----------------------------

The remaining code base is significantly simplified. We dropped the
separation between relocatable and non-relocatable files and removed any
transformations of addresses from the LLVM backend (we now emit
absolute virtual addresses). The whole logic of transforming from the
llvm view to the bap image view now fits into a hundred lines of
code (instead of hundreds of lines spread across 16 files as it was
before).

Fixes BinaryAnalysisPlatform#1183
Fixes BinaryAnalysisPlatform#1189

Produces more information
-------------------------

The relocation information is now emitted for all files (not only for
relocatable). Also, removes tons of checks that were preventing our
backends from emitting valuable symbolic information.

Paves the road to BinaryAnalysisPlatform#1135 and BinaryAnalysisPlatform#1161
ivg added a commit to ivg/bap that referenced this issue Aug 4, 2020
ivg added a commit to ivg/bap that referenced this issue Aug 5, 2020
ivg closed this as completed in #1187 Aug 5, 2020
ivg added a commit that referenced this issue Aug 5, 2020
* fixes the base calculation

1. For ELF files we compute base as the difference between the address of
any loadable code segment and its offset. If there are no loadable code
segments, then we find the section with the minimal offset value and
subtract its offset from its address.

2. For MachO, when the file is relocatable, i.e., it doesn't have addresses, we
compute base as $vaddr - offset$, the same as we do in ELF. This
gives us results that match objdump (they do not match radare2, but
radare2 is not seeing any symbols, so it doesn't really matter).

3. For COFF nothing is done, and I am not sure that we need
to do anything.

4. Removed special computation of the base
address (Base.from_sections_offset) from ELF, MachO, and COFF.
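
The rule in items 1 and 2 can be sketched as follows. This is an illustration in Python, not the code of the LLVM backend; `Region` and `elf_base` are hypothetical names, and picking the first matching segment is an assumption standing in for "any loadable code segment".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Region:
    """A hypothetical segment or section: virtual address and file offset."""
    addr: int
    offset: int
    loadable: bool = False
    executable: bool = False


def elf_base(segments: List[Region], sections: List[Region]) -> Optional[int]:
    """Compute base as address minus offset, following item 1 above."""
    # Prefer a loadable code segment if the file has one.
    for seg in segments:
        if seg.loadable and seg.executable:
            return seg.addr - seg.offset
    # Otherwise fall back to the section with the minimal file offset.
    if sections:
        first = min(sections, key=lambda s: s.offset)
        return first.addr - first.offset
    return None  # nothing to derive a base from
```

For example, a code segment mapped at `0x401000` from file offset `0x1000` gives a base of `0x400000` in this sketch, which is the same `vaddr - offset` difference that item 2 uses for MachO.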

It is not tested on LLVM versions below 6, but I believe it should
work up to 3.4.

resolves #1183

Co-authored-by: gitoleg <forown@yandex.ru>

* re-enables the failing test again

Hope we will pass it now.

* updates paths to artifacts

* renovates the LLVM backend

Dropping support of old LLVM and legacy backends
-------------------------------------------------

We drop a lot of old code (minus 3k lines of code) thus removing the support
burden and making it easier to maintain, fix, and upgrade the code.

Fixes #1166

Simplifies the implementation
-----------------------------

The remaining code base is significantly simplified. We dropped the
separation between relocatable and non-relocatable files and removed any
transformations of addresses from the LLVM backend (we now emit
absolute virtual addresses). The whole logic of transforming from the
llvm view to the bap image view now fits into a hundred lines of
code (instead of hundreds of lines spread across 16 files as it was
before).

Fixes #1183
Fixes #1189

Produces more information
-------------------------

The relocation information is now emitted for all files (not only for
relocatable). Also, removes tons of checks that were preventing our
backends from emitting valuable symbolic information.

Paves the road to #1135 and #1161

Co-authored-by: gitoleg <forown@yandex.ru>