Commit

Improved projects section
radare committed Oct 11, 2024
1 parent 9914a47 commit d31c037
Showing 7 changed files with 397 additions and 74 deletions.
2 changes: 2 additions & 0 deletions src/SUMMARY.md
@@ -87,6 +87,8 @@
* [Usage](projects/usage.md)
* [Annotations](projects/annotations.md)
* [Handmade Setup](projects/handmade.md)
* [Versioning](projects/version.md)
* [Challenges](projects/challenges.md)
* [Disassembling](arch/intro.md)
* [Decompilers](arch/decompile.md)
* [Metadata](arch/metadata.md)
88 changes: 79 additions & 9 deletions src/projects/annotations.md
@@ -1,22 +1,92 @@
## Annotations

Comment annotations (See the `ano` command) it's a cross-project metadata feature introduced in radare2 5.9.6.
Annotations are a powerful feature introduced in Radare2 5.9.6 that allows users to associate cross-project metadata with specific functions. Unlike comments or other metadata that are tied to a specific project or session, annotations are persistent and stored globally. This makes them particularly useful for tracking notes and observations across different projects without needing to worry about manually saving them.

The annotations are associated with each function and stored in a dedicated cache directory, making them available even after you leave the session. You can think of annotations as a place to store function-specific notes, decompilation output, or other important information that you want to keep handy.

### Using the `ano` Command

Annotations can be managed using the `ano` command. Below is an overview of the available options:

```console
[0x00000000]> ano?
Usage: ano [*] # function anotations
Usage: ano [*] # function annotations
| ano show or edit annotations for the current function
| ano-* remove all annotations of current file
| ano* dump all annotations in ano= commands
| ano=[b64text] set anotation text in base64 for current function
| ano-* remove all annotations of the current file
| ano* dump all annotations in `ano=` commands
| ano=[b64text] set annotation text in base64 for current function
| anoe edit annotation
| anos show annotation
| anol show first line of function annotation if any
| anol show the first line of function annotation if any
[0x00000000]>
```

The annotations are not tied to projects or sessions. They are stored in your home as separate files, and they are associated with each function.
### Key Features

1. **Persistent Across Sessions:**
Annotations are stored globally in the `~/.local/share/radare2/cache` directory. This ensures that they are accessible across different sessions, even if the project is closed and reopened later.

2. **Multiline Annotations:**
Annotations can contain multiple lines of text, making them ideal for storing detailed notes, such as decompilation output, comments, or any other observations about a function.

3. **Cross-Project:**
Since annotations are not tied to any specific project, they can be shared across different projects that analyze the same binary. This is useful when working with multiple teams or revisiting an old analysis.
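
Because annotations live outside any project, they are not captured when a project directory is copied or versioned. The following is a minimal sketch for backing up the cache directory named above. The `backup_annotations` helper and the default archive name are hypothetical, and nothing is assumed about the layout of the files inside the cache.

```shell
#!/bin/sh
# Hypothetical helper: archive the radare2 cache directory that holds annotations.
# $1 = cache directory (defaults to the path given in this section)
# $2 = output archive (hypothetical default name)
backup_annotations() {
    cache_dir="${1:-$HOME/.local/share/radare2/cache}"
    archive="${2:-r2-annotations.tar.gz}"
    if [ ! -d "$cache_dir" ]; then
        echo "no cache directory at $cache_dir" >&2
        return 1
    fi
    # -C switches to the parent directory so the archive stores a relative path
    tar -czf "$archive" -C "$(dirname "$cache_dir")" "$(basename "$cache_dir")"
}
```

Calling `backup_annotations` with no arguments archives the default location; passing a directory and an archive path overrides both.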

### Examples

#### Setting an Annotation for a Function

To add an annotation to the current function, you can simply use the `ano=` command with base64-encoded text:

```console
[0x00000000]> ano=[b64text]
```
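
The base64 text can be produced with standard shell tools before pasting it into the r2 prompt. The sketch below builds the full `ano=` command from plain text. `make_ano_cmd` is a hypothetical helper name, and the sketch assumes a `base64` utility (coreutils or the BSD equivalent) is installed.

```shell
#!/bin/sh
# Hypothetical helper: print an `ano=` command for the given plain-text note.
make_ano_cmd() {
    # printf '%s' avoids the trailing newline that echo would add;
    # tr strips the line wraps some base64 implementations insert.
    printf 'ano=%s\n' "$(printf '%s' "$1" | base64 | tr -d '\n')"
}

# Prints: ano=aGVsbG8=
make_ano_cmd 'hello'
```

The printed line can then be typed or pasted at the r2 prompt while seeked inside the target function.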

#### Editing and Viewing Annotations

If you want to manually edit the annotation for the current function, use the `anoe` command, which allows you to modify the annotation interactively:

```console
[0x00000000]> anoe
```

You can also view the full annotation using `anos`:

```console
[0x00000000]> anos
```

If you only want to see the first line of the annotation (which is useful for quick reference), you can use `anol`:

```console
[0x00000000]> anol
```

#### Removing Annotations

To remove all annotations for the current file, you can use the `ano-*` command:

```console
[0x00000000]> ano-*
```

### Annotations in Action: Using Annotations to Cache Decompilation Output

Annotations can also be used to improve efficiency when working with decompiled code. For example, the `-e cache=true` setting in Radare2 enables the caching of decompiled output. This prevents Radare2 from having to re-decompile the same function multiple times, thus saving time during the analysis.

Here's an example of how this works:

1. Decompiling a function in language mode normally requires Radare2 to call the decompiler for each function.
2. By enabling caching with `-e cache=true`, Radare2 will store the decompilation output in an annotation. The next time you view the same function, the cached annotation will be used instead of calling the decompiler again.

This is particularly helpful when working with large binaries or performing repetitive decompilation tasks.

```bash
$ r2 -e cache=true <binary>
```

By leveraging annotations in this way, you can significantly reduce the overhead of reprocessing functions during analysis.

This is useful because you can have a multiline comment for each function where you drop some notes, paste the decompilation output, etc and you can leave the shell without worrying about saving it later.
### Conclusion

These files are stored in the cache subdirectory of the radare2 datadir: `~/.local/share/radare2/cache`.
Annotations are an essential tool for efficiently managing function-specific metadata across multiple sessions and projects. Whether you are adding notes, decompilation output, or general observations, annotations allow you to persist important information and retrieve it at any time. The ability to use annotations for caching decompilation results further enhances the analysis workflow, saving both time and effort.
61 changes: 61 additions & 0 deletions src/projects/challenges.md
@@ -0,0 +1,61 @@
## Challenges Managing Projects

Managing metadata during binary analysis is a critical aspect of reverse engineering. Metadata includes function names, comments, analysis flags, decompilation results, and much more. However, there are several inherent challenges that reverse engineering tools need to address to ensure efficient project management. Let's explore some of these challenges:

### Key Challenges

1. **Lack of a Standard Format**

There is no universally accepted format for saving or sharing metadata associated with binary analysis. Different tools may have their own methods, making interoperability and collaboration across tools challenging.

2. **Tool Evolution and Metadata Changes**

As tools evolve, their analysis algorithms and metadata formats may change. This means that older analysis data may become obsolete or incompatible with newer versions of the tool, potentially leading to discrepancies when reloading projects.

3. **Metadata Storage and Analysis Dependency**

The order in which metadata is stored and loaded is crucial. For example, you cannot name a function before it is analyzed, and any changes to the analysis steps could affect the final outcome. This creates dependencies between analysis steps and metadata storage.

4. **Impact of Analysis Order**

The sequence in which analysis steps are performed can significantly impact the final results. Skipping or reordering analysis commands can lead to different interpretations of the binary.

5. **Versioning and Rebaselining**

Projects can be versioned over time, but rebasing metadata can lead to unexpected outcomes. For instance, changes in one version may conflict with updates in another version, making it difficult to reconcile differences.

6. **Real-Time Syncing and Conflicts**

Synchronizing metadata in real-time between different clients or sessions can lead to conflicts. This can occur when multiple users are analyzing the same binary simultaneously, or when working across distributed systems.

7. **Large Metadata Sets**

As analysis progresses, the amount of metadata grows, and storing or loading large sets of metadata becomes slower compared to keeping it in memory. This adds overhead to project management and can affect performance.

8. **Address Space Layout Randomization (ASLR)**

When debugging, binaries can be loaded at different addresses due to ASLR (Address Space Layout Randomization). This means that metadata must be adaptable to different memory layouts, complicating the process of restoring projects in different environments.

Note that this problem also occurs when working through Frida or remote GDB instances. Project metadata must be stored relative to the map it belongs to, and that map need not be loaded in the same order, or even be allocated at all, because this depends on the execution state of the child process.

9. **User Settings and Metadata Registration**

User-specific settings can influence how analysis and metadata are registered. For example, if different users have different analysis preferences, the resulting metadata could vary significantly even for the same binary.

10. **Handling Incremental Metadata Patches**

Metadata must be updated incrementally during the analysis. Each change to the analysis (e.g., renaming a function, adding a comment) must be stacked properly, and failure to do so can lead to inconsistency or corruption of the project state.
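
The ASLR item above comes down to simple arithmetic: store each address as an offset from the base of the map it belongs to, then add that offset to whatever base the map has in the next session. A minimal sketch follows, with hypothetical helper names and made-up addresses; it is not how Radare2 itself stores project data.

```shell
#!/bin/sh
# Hypothetical helpers illustrating map-relative metadata addresses.

# to_offset ADDR MAP_BASE -> offset of ADDR inside its map
to_offset() {
    printf '0x%x\n' $(( $1 - $2 ))
}

# rebase OFFSET NEW_MAP_BASE -> absolute address in the new session
rebase() {
    printf '0x%x\n' $(( $1 + $2 ))
}

# Made-up example: a comment at 0x555555555139 in a map based at 0x555555554000
off=$(to_offset 0x555555555139 0x555555554000)
echo "$off"

# The same map loaded at 0x7f0000000000 in another run
rebase "$off" 0x7f0000000000
```

Storing the offset together with a map identifier, rather than the absolute address, is what lets the same comment land on the right instruction after the binary is loaded elsewhere.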

### Why These Challenges Matter

Looking at the challenges listed above, it's clear that managing reverse engineering projects is a complex task. Each of these issues can affect the accuracy, consistency, and efficiency of the analysis process. This complexity is magnified when working in collaborative environments or when dealing with long-term projects that may span multiple tool versions.

### Challenges Specific to Radare2

In the case of **Radare2**, the flexibility of the tool—while powerful—adds an additional layer of complexity. Radare2 allows users to configure many aspects of the tool, from analysis steps to how metadata is handled. This makes it harder to find a one-size-fits-all solution for serializing project information into a file and restoring it accurately.

Unlike other tools that may impose stricter constraints or fewer configuration options, Radare2's versatility requires more care when saving and loading projects. This flexibility, while advantageous for advanced users, can lead to additional challenges in managing project metadata.

### Conclusion

Understanding these challenges helps users troubleshoot potential issues that may arise when managing projects. Whether it's dealing with metadata conflicts, loading times, or tool versioning, addressing these problems head-on will improve both the efficiency and accuracy of reverse engineering workflows.
67 changes: 52 additions & 15 deletions src/projects/handmade.md
@@ -1,16 +1,28 @@
## Handmade Projects

When you need full control and complete flexibility about your project metadata you can opt out to create a set of scripts that load the binary, setting the base address, setup the memory layout, run the analysis commands of interest, create flags, add comments, run other scripts autogenerated from SVD or other files with just one entrypoint.
When you need full control and complete flexibility over your project metadata, you can choose to create a set of scripts that manage your binary analysis. This allows you to:

The main inconvenience of these projects is that comments, flags or function names won't be saved to disk when leaving the session. We will need to manually type them into the scripts everytime we find them necessary.
- Load the binary.

- Set the base address.
- Configure the memory layout.
- Run specific analysis commands.
- Have full control over the steps.

This setup requires some level of consistency from the users and also must get used to the workflow to avoid
Note that having full control over the commands you run is important in a variety of situations (see the section on challenges for more details).

As long as handmade projects are organized in directories and files it is ideal to use it with git, allowing other people to jump into the same files and have a proper versioned.

### Setup the files
These scripts are essentially the entry points for your analysis. However, this approach comes with a trade-off: comments, flags, or function names won’t be saved automatically when exiting the session. Instead, you’ll need to manually type them into the scripts every time they are required.

First of all we will create a directory to contain the binar(y/ies) you want to work on. Then you can create a **Makefile** or a shellscript with the commands to run like this:
This workflow demands consistency and discipline from the user. It’s crucial to maintain an organized system to ensure that important metadata isn’t lost, as nothing is auto-saved unless explicitly written into the project script by the user.

### Advantages and Best Practices

Handmade projects allow you to fully customize your setup. Since they are typically organized into directories and files, they are ideal for use with Git, which provides version control and enables collaboration.

### Folder Organization

To begin, create a directory to store the binaries you want to work on. Then, you can create a **Makefile** or a shell script to automate running commands:

```console
$ mkdir project/bins
@@ -21,40 +33,65 @@ $ echo r2 -s main -i script.r2 bins/ls > main.sh
$ chmod +x main.sh
```

Now you can run edit `script.r2` with the commands you like:
Next, you can edit the `script.r2` file to include the commands you need:

```
```bash
aaa
CC good boy @ sym.success
```

You can just run `./main.sh` everytime you want to open the handmade project. Note that you can create custom memory maps. Load the contents of ram from a file, enable cache, patch instructions, etc..
Running the script is simple. Just execute `./main.sh` to open the project. You can set up custom memory maps, load the contents of RAM from a file, enable cache, patch instructions, and much more.

### Primitives
### Radare2 Primitives

Most of the radare2 environment is built on top of the shell, and as you may know many commands handle the `*` suffix/subcommand which lists the data in r2 commands.
Much of Radare2’s environment is built around its shell. Many commands accept a `*` suffix or subcommand that prints their data as Radare2 commands, which can be saved to a file and replayed later.

For example, if you want to save the flags you can do this:
For example, to save flags:

```console
> f* > flags.r2
```

Then reload them like this:
And to reload them:

```console
> . flags.r2
```

Same thing happens for comments `CC*`, function names `afn*@@F`, etc..
You can apply this same principle to comments (`CC*`), function names (`afn*@@F`), and more.
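
The dump-and-reload pattern above can be collected into two small r2 scripts kept next to the binary. The sketch below only writes those scripts from the shell and does not run r2. The file names `save.r2`, `load.r2`, and `comments.r2` are hypothetical, and redirecting `CC*` to a file is assumed to work the same way as the `f*` example.

```shell
#!/bin/sh
# Write a pair of r2 scripts: one dumps metadata, one reloads it.
# File names are hypothetical; adjust them to your project layout.

# save.r2: dump flags and comments as replayable r2 commands
cat > save.r2 <<'EOF'
f* > flags.r2
CC* > comments.r2
EOF

# load.r2: replay the dumped commands
cat > load.r2 <<'EOF'
. flags.r2
. comments.r2
EOF
```

Inside a session, `. save.r2` would dump the current flags and comments, and `. load.r2` would replay them once the analysis commands have run.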

### Default Script
### Default Script Behavior

If you save an `.r2` file next to the file you are opening, r2 will prompt the user to run it when starting. This way you don't need to use the `-i` flag by yourself.
If you save an `.r2` script in the same directory as the binary, Radare2 will prompt you to run it upon opening the binary. This can save time by eliminating the need to use the `-i` flag:

```bash
$ echo '?e hello world' > ls.r2
$ cp /bin/ls .
$ r2 ls
Do you want to run the 'ls.r2' script? (y/N)
```

### Advanced Customization

You can learn more about custom handmade projects in the firmware reversing chapter of this book, where you’ll find examples of setting up memory layouts and project configurations.

For users who enjoy customization, Radare2 allows you to modify the environment by tweaking configuration variables. For example, `scr.prompt.prj` shows the project name in the command prompt:

```bash
$ r2 -p jeje
-- The more 'a' you add after 'aa' the more analysis steps are executed.
[0x00000000]> e scr.prompt.prj=true
<jeje>[0x00000000]> e prj.name
jeje
<jeje>[0x00000000]>
```

### Autosaving Options

Radare2 also provides a mechanism for users to define actions that are performed when leaving the shell. The `cmd.exit` configuration variable can be set to run specific commands or scripts when the session ends. For example, you could create a `.r2.js` script that saves all comments and function names before closing the session.

However, it’s important to remember that it’s up to the user to manage and manually update project scripts with new flags, analysis commands, or comments. While this approach requires more effort, it offers unmatched flexibility.
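
As a sketch of the `cmd.exit` idea, the project script can point that variable at a second script that dumps metadata when the session ends. The file names are hypothetical, the exact value syntax accepted by `cmd.exit` is an assumption that has not been verified here, and a plain r2 script is used instead of the `.r2.js` variant mentioned above.

```shell
#!/bin/sh
# Hypothetical layout: script.r2 is the project entry point,
# onexit.r2 holds the commands to run when the session ends.

# Commands executed on exit: dump flags and comments as r2 commands
cat > onexit.r2 <<'EOF'
f* > flags.r2
CC* > comments.r2
EOF

# Entry point: analyze, then register the exit hook (assumed syntax)
cat > script.r2 <<'EOF'
aaa
e cmd.exit=. onexit.r2
EOF
```

The dumped files still have to be reloaded explicitly at the start of the next session, so this automates only the saving half of the workflow.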

### Conclusion

The handmade project approach in Radare2 provides full control over your workflow but requires meticulous organization and manual updates. Over time, these processes will improve, and user contributions are always welcome, whether through feedback or pull requests.
27 changes: 4 additions & 23 deletions src/projects/intro.md
@@ -1,26 +1,7 @@
# Projects
# Introduction to Projects

There are some scenarios where you need to work for a long period of time on a single or a set of binaries.
When working on a binary analysis, there are scenarios where you need to continuously work on a single or a set of binaries over a long period of time. This could be due to the complexity of the binaries or the amount of time required for in-depth analysis.

Sometimes when working with large binaries or even if it's small you want to store and keep your progress, the comments, function names, and other metadata so you don't have to handle it when loading the binary again.
In such cases, you often want to store and keep track of your progress, such as comments, function names, and other metadata, so you don't have to redo or reanalyze the same information every time you reload the binary.

This chapter will cover all these needs by exposing the challenges and limitations because, despite looking like a simple problem, projects is one of the hardest issues to cope, and first we need to understand why.

## Challenges

Metadata associated with a binary analysis is an important feature to support for all reverse engineering tools for several reasons, let's explore some of them:

* There's no standard format for saving or sharing it.
* Tools change over time, its analysis and metadata too.
* Metadata storage order matters, you can't name a function if its not analyzed
* Analysis order and steps can affect the final result
* Projects can be versioned, rebasing it can result on unexpected results
* Syncing metadata in realtime between different clients can cause conflicts
* Amount of data tends to be large, storing/loading is slower than keeping it in memory
* Binaries can be loaded in different addresses, aslr when debugging
* User settings affect analysis and metadata registration
* Incremental metadata patches must be stacked up properly

After checking this list we observe how difficult the problem is, and how many of the solutions don't fit in all the possible environments and use cases users will face.

In the case of **radare2**, as long as the tool permits creating so many different configuration paths it is harder to find a way to serialize project information into a file and restoring it back compared to other tools which are tied to much less options.
This chapter will explore how to handle these needs efficiently by introducing Radare2 projects. We will also discuss the challenges and limitations inherent in this process. Although managing projects may seem straightforward, it is one of the more difficult issues to address, and it's essential to understand why before delving into solutions.