Description
It would be nice to have a more fine-grained option for selecting which file types should be skipped. The current --skip-magic option lets you select specific types to skip, but it does not let you extend the default list of magics. It would be nice to be able to extend the default list instead of overwriting it. Moreover, using the magic prefix is confusing when the magic bytes of different file types are the same (for example, zip and apk files).
Is your feature request related to a problem? Please describe.
Issue #262 described the exact same problem, and the solution was to add the --skip-magic option. That is a nice way to solve the problem, but I suggest extending the feature.
The problem I'm facing is that I want to extract an Android image but don't want to extract certain file types (e.g., .apk, .ttf). However, the --skip-magic option isn't really user-friendly here, because I would need to pass a --skip-magic parameter for every file type to exclude as well as for every entry in the default list of magics defined by unblob.
Consider the following example: let's assume I have a .zip file that contains only three files (an .xlsx, an .apk, and a .jar file). We want to extract this .zip file with unblob but don't want to extract the .apk, .jar, and .xlsx files inside it. As a user, I would expect it to be sufficient to add --skip-magic "APK" --skip-magic "JAR" to skip these file types. However, these two parameters don't seem to match apk and jar files. Moreover, setting a --skip-magic parameter overwrites unblob's default skip-magic list. Thus, unblob extracts all the files, including the .xlsx, which is not what we want.
docker run --platform=linux/amd64 --rm --pull always -v /Volumes/ExtremeSSD/test/output/:/data/output -v /Volumes/ExtremeSSD/test/input/:/data/input ghcr.io/onekey-sec/unblob:latest --skip-magic "APK" --skip-magic "JAR" "/data/input/Test.apk.zip"
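To make the behaviour concrete, here is a minimal Python sketch of what seems to be happening. It assumes the skip list is matched as a prefix against the file's magic description, and that any user-supplied value replaces the defaults. The names `DEFAULT_SKIP_MAGIC`, `effective_skip_list`, and `is_skipped` are made up for illustration and are not unblob's actual code. The default list is a tiny placeholder, and the description strings are illustrative rather than captured output.

```python
# Placeholder for unblob's built-in skip list; the real list is much longer.
DEFAULT_SKIP_MAGIC = ["Microsoft Excel", "PNG"]


def effective_skip_list(user_values):
    """Behaviour as described in this issue: user values replace the defaults."""
    return list(user_values) if user_values else list(DEFAULT_SKIP_MAGIC)


def is_skipped(description, skip_list):
    """A file is skipped when its magic description starts with a listed prefix."""
    return any(description.startswith(prefix) for prefix in skip_list)


# Illustrative magic descriptions for the three files inside the .zip.
descriptions = {
    "app.apk": "Android package (APK)",
    "lib.jar": "Java archive data (JAR)",
    "sheet.xlsx": "Microsoft Excel 2007+",
}

# What the command above asks for: --skip-magic "APK" --skip-magic "JAR"
skip_list = effective_skip_list(["APK", "JAR"])
for name, description in descriptions.items():
    print(name, "skipped" if is_skipped(description, skip_list) else "extracted")
```

Under these assumptions none of the three files is skipped: "APK" and "JAR" are not prefixes of the descriptions, and the "Microsoft Excel" default is gone because the list was replaced.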
To overcome this problem, we have to figure out the correct magic prefix for apk and jar files. We found that adding the magics "Android" and "Java" actually skips the apk and jar files. However, we would also need to add another --skip-magic parameter for each default to avoid overwriting the default magic list and to skip the .xlsx file as well. The list of defaults is quite long, so we would need to add around 20 --skip-magic parameters to cover all of them.
docker run --platform=linux/amd64 --rm --pull always -v /Volumes/ExtremeSSD/test/output/:/data/output -v /Volumes/ExtremeSSD/test/input/:/data/input ghcr.io/onekey-sec/unblob:latest --skip-magic "Android" --skip-magic "Java" --skip-magic "Microsoft Excel" "/data/input/Test.apk.zip"
I hope the example is understandable.
Describe the solution you'd like
There are two things I would like to suggest to make the --skip-magic parameter more user-friendly:
- Add the possibility to extend the default magic list without overwriting it.
- Map file extensions within unblob to a magic if it is a known file type. For instance, "APK" = "Android"
I think users should just be able to type --skip-magic "<some-file-extension>" to match the correct magic instead of having to extract the magic from a file themselves.
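The two suggestions could be combined roughly as in the following Python sketch. This is only an illustration of the idea, not a proposal for the actual implementation: `DEFAULT_SKIP_MAGIC` is a small placeholder for unblob's real default list, and `EXTENSION_TO_MAGIC` and `resolve_skip_magic` are hypothetical names that do not exist in unblob.

```python
# Placeholder for unblob's built-in skip list; the real list is much longer.
DEFAULT_SKIP_MAGIC = ("Microsoft Excel", "PNG")

# Hypothetical mapping from known file extensions to magic prefixes.
EXTENSION_TO_MAGIC = {
    "apk": "Android",
    "jar": "Java",
    "xlsx": "Microsoft Excel",
}


def resolve_skip_magic(user_values, extend_defaults=True):
    """Build the final skip list from user-supplied values.

    Known extensions (case-insensitive, with or without a leading dot) are
    translated to their magic prefix; anything else is kept as a raw prefix.
    With extend_defaults=True the user values are appended to the defaults
    instead of replacing them.
    """
    resolved = list(DEFAULT_SKIP_MAGIC) if extend_defaults else []
    for value in user_values:
        key = value.lower().lstrip(".")
        prefix = EXTENSION_TO_MAGIC.get(key, value)
        if prefix not in resolved:  # avoid duplicate entries
            resolved.append(prefix)
    return resolved


print(resolve_skip_magic(["APK", ".jar"]))
print(resolve_skip_magic(["APK", ".jar"], extend_defaults=False))
```

With this, --skip-magic "APK" --skip-magic "JAR" would be enough for the example above. How extending would be exposed on the command line (a separate option, or a change to --skip-magic itself) is left open.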
P.S. If there is a better solution to match apk files, I'm open to suggestions.