forked from trolldbois/python-haystack
List of heuristics:
- BasicCachingReverser: uses heapwalker to organise heap user allocation chunks into raw records.
- AbstractReverser: Abstract controller
- FieldReverser: basic types on fields
- TextFieldCorrection: correction for text fields
- PointerFieldReverser: pointer type metadata enrichment
- KnownRecordTypeReverser: Apply Haystack search API
- DoubleLinkedListReverser: identifies record types that contain a doubly linked list
- PointerGraphReverser: Create graph of records using pointer as edges
- ArrayFieldsReverser: TODO
- InlineRecordReverser: TODO (sub record type)
- TypeReverser: TODO compare records signatures to match types.
- CommonTypeReverser: Used by x2LinkedReverser to determine the best types from a list of similar records
- WriteRecordToFile: TODO save to file
FIX the predecessors search. Most string pointers in the example start non-aligned, at <address> + 1,
because the size is stored at <address>.
But it doesn't seem like the PointerEnumerator checks for that :??
Check for 0xc329f8/0xc329f9, present in memory map 0xc30000
It seems the PointerEnumerator does not work. ???
And maybe we need to reverse the aligned check on DSA FieldReverser, or create an advanced one.
-> definitely a solution, but what does it break ?
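A minimal sketch of a predecessors lookup that tolerates non-aligned pointers such as <address> + 1. It is not the PointerEnumerator code: the function name, the (start, size) tuple layout and the chunk sizes are assumptions made for illustration only.

```python
import bisect


def find_containing_chunk(pointer_value, chunks):
    """Return the (start, size) chunk that contains pointer_value, or None.

    chunks is a list of (start_address, size) tuples sorted by start address.
    The pointer does not have to point at the start of the chunk.
    """
    starts = [start for start, _ in chunks]
    # Index of the last chunk whose start address is <= pointer_value.
    idx = bisect.bisect_right(starts, pointer_value) - 1
    if idx < 0:
        return None
    start, size = chunks[idx]
    if start <= pointer_value < start + size:
        return (start, size)
    return None


# Hypothetical allocations; the sizes are made up for this example.
chunks = [(0xc32000, 0x100), (0xc329f8, 0x20), (0xc33000, 0x40)]
aligned_hit = find_containing_chunk(0xc329f8, chunks)
unaligned_hit = find_containing_chunk(0xc329f9, chunks)
miss = find_containing_chunk(0xc32a18, chunks)
```

With this kind of lookup, 0xc329f8 and 0xc329f9 resolve to the same chunk, so a pointer to <address> + 1 still counts as a predecessor edge.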
1) integrate pattern and signature modules in the reverse API
2) add a Reverser that takes and search for known records types. - Done
# apply libraries' structures/known structures search against the heap before running anything else.
# that should cut our reverse time way down
# prioritize search per reverse struct size. limit to big struct and their graph.
# then make a Named/Anonymous Struct out of them
# The controller/Reverser should keep the structs coherent, and apply maker to each of them
# the controller can have different heuristics to apply to struct :
# * aggregates: char[][], buffers
# * type definition: substructs, final reverse type step, c++ objects,
# on each integer array, look for the indices of \x00
# if there is a regular interval between \x00 in the sequence (5 chars
# then 0), then make NUL-terminated sub-arrays
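The regular-interval idea above could look like the sketch below. The function name and the "terminator at a fixed stride" rule are assumptions; the real heuristic may want to tolerate shorter strings padded with extra NULs.

```python
def split_on_regular_nul(values):
    """Split a sequence of small ints into NUL-terminated sub-arrays.

    Returns the list of sub-arrays (terminator included) when the zeros occur
    at a regular interval and the sequence ends on a terminator, else None.
    """
    zero_positions = [i for i, v in enumerate(values) if v == 0]
    if len(zero_positions) < 2:
        return None
    # The first zero gives the candidate stride (string length + terminator).
    stride = zero_positions[0] + 1
    if stride < 2 or len(values) % stride != 0:
        return None
    expected = list(range(stride - 1, len(values), stride))
    if zero_positions != expected:
        return None
    return [list(values[i:i + stride]) for i in range(0, len(values), stride)]


regular = split_on_regular_nul(list(b"hello\x00world\x00"))
irregular = split_on_regular_nul(list(b"hi\x00world\x00"))
```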
# magic len approach on untyped bytearrays or array of int. - TRY TO ALIGN ON 2**x
# if len(fields[i:i+n]) == 4096 // or a power of 2 > 63 # m = math.modf(math.log( l, 2)) %% m[0] == 0.0 && m[1]>5.0
# then we have a buffer of size l
# fields[i:i+n] should contain only zeroes, untyped and int
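A hedged sketch of the magic length check. It uses a bit test instead of math.log to avoid floating point rounding. The (kind, size) field representation and the kind labels are hypothetical, and it assumes the length in question is the total size in bytes of the field run.

```python
# Hypothetical labels for the field kinds allowed inside a buffer.
BUFFER_KINDS = {"zeroes", "untyped", "int"}


def is_buffer_length(length, min_exponent=6):
    """True when length is a power of two >= 2**min_exponent (64 by default)."""
    return length >= (1 << min_exponent) and (length & (length - 1)) == 0


def looks_like_buffer(fields, min_exponent=6):
    """fields is a list of (kind, size_in_bytes) tuples for fields[i:i+n]."""
    total = sum(size for _, size in fields)
    only_buffer_kinds = all(kind in BUFFER_KINDS for kind, _ in fields)
    return only_buffer_kinds and is_buffer_length(total, min_exponent)


candidate = [("zeroes", 2048), ("int", 4), ("untyped", 2044)]
with_pointer = [("zeroes", 2048), ("pointer", 4), ("untyped", 2044)]
```

Here looks_like_buffer(candidate) accepts the 4096-byte run, while with_pointer is rejected because of the pointer field.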
- use coroutine/threadify the StructureAnalyzer
# TODO look for VFT and malloc metadata ?
# use stdc++ to demangle C++
# vivisect ?
# TODO 1: make an interactive thread on that anon_struct and a struct Comparator to find similar struct.
# that is a first step towards structure identification && naming. + caching of info
# create a typename for \xff * 8/16. buffer color ? array of char?
- make a unit test for the DSASimple bug. Use the bytes from that record and create
a debug helper that prints the resolution criteria for a record/field
- do we trust get_user_allocation or not ? for libc ? for winheap ? YES
- change the reversers anonymous structure to be POPO.
- we want all funcs outside of it.
- keep the original address in the Outputters, python output.
++ we also need advanced constraints in the search API to be able to check for next_back == current
- re-correlate the is_valid methods to the record type, so that advanced code-based
validation is allowed on a record, on top of constraints
Will require a registration of specific validation functions, and probably access
to the parent nodes' 'validation state'.
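One possible shape for such a registration, including a next_back == current check. This is only a sketch: VALIDATORS, register_validator and the dict-based records are hypothetical and are not part of the haystack API.

```python
# Hypothetical registry: record type name -> list of validation functions.
VALIDATORS = {}


def register_validator(record_type_name):
    """Decorator that attaches a code-based check to a record type."""
    def decorator(func):
        VALIDATORS.setdefault(record_type_name, []).append(func)
        return func
    return decorator


def is_valid(record_type_name, address, record, records_by_address):
    """Run every registered check; a type with no checks is considered valid."""
    checks = VALIDATORS.get(record_type_name, [])
    return all(check(address, record, records_by_address) for check in checks)


@register_validator("list_entry")
def next_back_is_current(address, record, records_by_address):
    """The record pointed to by 'next' must point back to this record."""
    nxt = records_by_address.get(record["next"])
    return nxt is not None and nxt["back"] == address


ring = {0x1000: {"next": 0x2000, "back": 0x2000},
        0x2000: {"next": 0x1000, "back": 0x1000}}
broken = {0x1000: {"next": 0x2000, "back": 0x2000},
          0x2000: {"next": 0x1000, "back": 0x3000}}
```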
- find the text outputter bug. haystack refresh PEB returns different and bad results with a
constraint file and a string output.
see field process Heap
while the python outputters give a None value for the pointer, the string outputters return the first pointer value
BAD: python /home/other/Compil/python-haystack/scripts/haystack --debug --string --dumpname /home/jal/outputs/vol/zeus.vmem.1668.dump/ refresh winxp_32_peb.struct__PEB 0x7ffde000 --validate --constraints_file winxpheap.constraints
OK: python /home/other/Compil/python-haystack/scripts/haystack --debug --string --dumpname /home/jal/outputs/vol/zeus.vmem.1668.dump/ refresh winxp_32_peb.struct__PEB 0x7ffde000 --validate
- change Levenshtein for fuzzywuzzy or similar
- following memory_handler changes, the __book cache seems to forget about some ctypes buffers.
not cool. on make_context_for_heap.
Function getRef deactivated. it doesn't bug anymore.
deactivating the cache impacts performance _a lot_.
To be obsoleted ?
But output_to_python depends on it... TODO
- use PEB search to double check that we find all HEAPs in standard scenarios.
- use pycallgraph to cProfile a HEAP validation.
- make a callback profiler that profiles the graph path validation of a structure in graphical format
- using a decorator would be fun
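A minimal sketch of the decorator idea using only the standard library cProfile and pstats modules (pycallgraph would be needed for the graphical output). The names profiled, last_report and validate_heap are placeholders, not existing haystack functions.

```python
import cProfile
import functools
import io
import pstats


def profiled(func):
    """Profile each call of func and keep the text report on the wrapper."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        profiler = cProfile.Profile()
        result = profiler.runcall(func, *args, **kwargs)
        stream = io.StringIO()
        stats = pstats.Stats(profiler, stream=stream)
        stats.sort_stats("cumulative").print_stats(10)
        wrapper.last_report = stream.getvalue()
        return result
    wrapper.last_report = ""
    return wrapper


@profiled
def validate_heap(chunk_sizes):
    """Stand-in for a real HEAP validation."""
    return all(size > 0 for size in chunk_sizes)


heap_ok = validate_heap([16, 32, 64])
```

After the call, validate_heap.last_report holds the profile text for the last validation.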
- add a depth parameter to constraints loading on list fields.
- separate listmodel in a constraints like config file ?
- a working str() on Ctypes.loadableMembers requires no instantiated utils
- add a searcher configuration for padding value == 0 ( to be forwarded to basicmodel)
- padding value should be 0 most of the time. make it configurable/monkeypatchable
- use ASN.1 for constraints. ??
- add a hintoffset search thingy, so that one can search for a structure at a particular offset of a mmap ? maybe too complex for the API
- make a listmodel method for arrays of structures, or not.
Attribute packed is not correct in ctypeslib
- proper way to load constraints and bit-dependent modules is side by side
# A - the caller has to load the proper bit-dependent python code,
ctypes3 = cls.my_model.import_module("test.src.ctypes3_gen64")
# my_model is already target dependent
# B - then apply constraints on module.
handler = constraints.ConstraintsConfigHandler()
my_constraints = handler.read('test/src/ctypes3.constraints')
results = haystack.search_record(self.memory_handler, self.ctypes3.struct_test3, my_constraints)
Todo:
- Add a struct beautifier for the string formatting. (hex vs int)
- easy API - done
- documented example - done
- Rekall plugin - done
- Rekall memorymap - done
- make basicmodel:loadable members work with vtypes (vol/rekall) ?
- Check why pdbparse reports some gaps in structs gap_in_pdb_ofs_3C (HEAP)
- pylint ignore W0212 in profiles.
- add PyQt4 as dependency for optional functions
- Add to ipython
- Make reverse work again - done