It turned out the scheduler is surprisingly inefficient at loading very large lists. After doing some math, it's clear it needs to be redesigned to handle lists of that size in one go:
On a 64-bit system, even just collecting the pointers to all newlines takes a huge amount of memory:
`14_000_000_000 * 8 = 112_000_000_000  # 104.3 GiB`
Three important bits we need to keep in mind for feature parity with the current system:

- due to threading, we need to be able to process the list at multiple positions at once
- to measure progress, we need to know how many credentials we've processed, but also how many we have to process in total (a counter sketch follows below)
- jobs can fail and need to be rescheduled
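
For the progress requirement, here is a minimal sketch of what a shared counter could look like; the `Progress` type and its methods are illustrative, not something that exists in the current code:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical shared progress tracker: workers bump `done`,
/// the status output reads both values.
struct Progress {
    total: u64,      // known once the initial newline count has finished
    done: AtomicU64, // incremented after every attempt, successful or not
}

impl Progress {
    fn new(total: u64) -> Arc<Self> {
        Arc::new(Self { total, done: AtomicU64::new(0) })
    }

    fn finish_one(&self) {
        self.done.fetch_add(1, Ordering::Relaxed);
    }

    fn snapshot(&self) -> (u64, u64) {
        (self.done.load(Ordering::Relaxed), self.total)
    }
}
```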
To support lists that large, we'd have to change the scheduler design:
### generator thread
1. Open the list of credentials
2. Scan the whole file and count newlines
3. Seek back to 0
4. Start the worker threads
5. Fill a size-limited mpsc queue with credentials, then block at `send`
6. Every time a worker receives from the queue, `send` unblocks and the next line can be loaded and inserted into the queue
Memory-wise, this would be one of the most lightweight solutions.
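
A minimal sketch of that flow, assuming std's bounded `sync_channel` as the size-limited queue and an `Arc<Mutex<Receiver>>` so multiple workers can pull from it (function and variable names are made up for illustration):

```rust
use std::fs::File;
use std::io::{BufRead, BufReader, Read, Seek, SeekFrom};
use std::sync::{mpsc::sync_channel, Arc, Mutex};
use std::thread;

fn run(path: &str, workers: usize) -> std::io::Result<()> {
    let mut reader = BufReader::new(File::open(path)?);

    // Pass 1: scan the whole file and count lines (the total for the progress display).
    let total = reader.by_ref().lines().count();
    // Seek back to 0; BufReader discards its internal buffer on seek.
    reader.seek(SeekFrom::Start(0))?;
    eprintln!("{} credentials to test", total);

    // Size-limited queue: send() blocks once `workers * 2` lines are in flight.
    let (tx, rx) = sync_channel::<String>(workers * 2);
    let rx = Arc::new(Mutex::new(rx));

    // Start the worker threads before filling the queue.
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || loop {
                // The MutexGuard is a temporary, released as soon as recv() returns.
                let msg = rx.lock().unwrap().recv();
                match msg {
                    Ok(_credential) => { /* parse the credential and test it here */ }
                    Err(_) => break, // channel closed: the generator is done
                }
            })
        })
        .collect();

    // Generator loop: every recv() by a worker frees a slot and unblocks send().
    for line in reader.lines() {
        tx.send(line?).expect("all workers exited early");
    }
    drop(tx); // close the channel so the workers drain the queue and stop

    for handle in handles {
        handle.join().unwrap();
    }
    Ok(())
}
```

The counting pass above allocates a `String` per line just to keep the sketch short; a real implementation would count raw `\n` bytes with a reused buffer to avoid 14 billion allocations.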
### offset + limit
This could be applied to dict-style runs as well:
- Skip `offset` number of attempts
- Submit `limit` number of attempts
- Ignore everything else
This would also allow resuming aborted jobs (assuming the offset has been saved) and distributing a test across machines (especially for dict-style runs).
It would be quirky to use though.
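
As a rough sketch, assuming hypothetical `offset`/`limit` parameters, the windowing over a line-based list could boil down to two iterator adaptors:

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};

// Hypothetical offset/limit window over a wordlist: skip `offset` attempts,
// submit `limit` attempts, ignore everything after that.
fn window(path: &str, offset: usize, limit: usize) -> io::Result<Vec<String>> {
    let reader = BufReader::new(File::open(path)?);
    reader
        .lines()
        .skip(offset) // attempts already done, or assigned to another node
        .take(limit)  // attempts this run is responsible for
        .collect()    // stops at the first read error
}
```

For dict-style runs the same offset/limit would index into the generated user/password combinations instead of a single file.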
### zero-copy + chunk assignment
To avoid the overhead of our own data structures, we could map the whole file into RAM and operate on slices. Since we need to process the list in parallel, we could split the file into chunks of a specific size; each worker processes its chunk independently, with no synchronization needed until it reaches the end of its chunk.
This still requires enough RAM to hold the whole file at once.
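
A sketch of the chunk assignment, assuming the file is memory-mapped with the external `memmap2` crate and each chunk boundary is pushed forward to the next newline so no credential is split between two workers (path and chunk count are placeholders):

```rust
use memmap2::Mmap; // external crate, one possible way to map the file
use std::fs::File;

// Split a memory-mapped wordlist into roughly `chunks` byte ranges,
// moving each boundary forward so every range ends right after a '\n' (or at EOF).
fn chunk_ranges(map: &Mmap, chunks: usize) -> Vec<(usize, usize)> {
    let len = map.len();
    let step = (len / chunks.max(1)).max(1);
    let mut ranges = Vec::with_capacity(chunks);
    let mut start = 0;
    while start < len {
        let mut end = (start + step).min(len);
        while end < len && map[end - 1] != b'\n' {
            end += 1;
        }
        ranges.push((start, end));
        start = end;
    }
    ranges
}

fn main() -> std::io::Result<()> {
    let file = File::open("credentials.txt")?; // placeholder path
    // Safety: the mapping is read-only and the file must not be truncated while mapped.
    let map = unsafe { Mmap::map(&file)? };
    for (start, end) in chunk_ranges(&map, 8) {
        let _slice: &[u8] = &map[start..end]; // a worker would iterate the lines in this slice
    }
    Ok(())
}
```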
### `Mutex<Cursor>`
We can simply scan the file in the main thread, count the credentials, seek back to 0 and then wrap the file handle in a mutex. Each worker then does the following:
1. lock the bufreader
2. read an entry
3. release the mutex
4. parse the credentials and test them
This would introduce the need for an error message to the msg loop, since reading from the file might fail in a non-recoverable way.
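
A sketch of the worker loop under this design; the `Msg` enum and its `ReadError` variant are hypothetical stand-ins for whatever the real msg loop carries:

```rust
use std::fs::File;
use std::io::{BufRead, BufReader};
use std::sync::{mpsc::Sender, Arc, Mutex};

// Hypothetical message type: read failures get reported back to the main loop.
enum Msg {
    Tested(String),
    ReadError(std::io::Error),
}

fn worker(reader: Arc<Mutex<BufReader<File>>>, tx: Sender<Msg>) {
    loop {
        let mut line = String::new();
        // lock the bufreader, read exactly one entry, then release the mutex
        let read = {
            let mut guard = reader.lock().unwrap();
            guard.read_line(&mut line)
        };
        match read {
            Ok(0) => break, // EOF: nothing left to schedule
            Ok(_) => {
                // parse the credential outside the lock and test it
                let _ = tx.send(Msg::Tested(line.trim_end().to_string()));
            }
            Err(err) => {
                // non-recoverable read failure: tell the msg loop and stop
                let _ = tx.send(Msg::ReadError(err));
                break;
            }
        }
    }
}
```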
Note that there's also some overhead from the way the threadpool currently works: it allocates some memory for each job we want to run. While this isn't much per job, keep in mind that even a single byte per credential adds up to 14 GB.
In the end, I'm not sure if tests that large are realistic and how much effort should go into this.