Use fully in-memory DataNodes in the same JVM #4

xkrogen · 2018-02-09T00:44:08Z

Currently, each DataNode is launched as a separate process/JVM, and we fool it into thinking it has all of its necessary blocks by creating the block files as 0-length files. It would be much more efficient to launch all of the DataNodes in the same JVM using MiniDFSCluster, and to use SimulatedFSDataset to keep the block metadata in memory only, which saves us from having to create millions of sparse files on disk.
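
To illustrate the idea of keeping block metadata in memory only, here is a rough, self-contained sketch. It is not Hadoop's SimulatedFSDataset and does not use any Hadoop API; the class `InMemoryBlockStore` and its methods are hypothetical names made up for this example. It only shows that tracking a block ID and a reported length per block needs no files on disk.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical illustration only: tracks block metadata (ID and length)
 * in memory and never stores block contents or touches the disk.
 * This is NOT Hadoop's SimulatedFSDataset.
 */
class InMemoryBlockStore {
    // blockId -> reported block length in bytes
    private final Map<Long, Long> blockLengths = new ConcurrentHashMap<>();

    /** Registers a block with the given reported length. */
    void addBlock(long blockId, long numBytes) {
        if (numBytes < 0) {
            throw new IllegalArgumentException("Negative length: " + numBytes);
        }
        blockLengths.put(blockId, numBytes);
    }

    /** Returns true if the block has been registered. */
    boolean contains(long blockId) {
        return blockLengths.containsKey(blockId);
    }

    /** Returns the reported length of a registered block. */
    long getLength(long blockId) {
        Long len = blockLengths.get(blockId);
        if (len == null) {
            throw new IllegalArgumentException("Unknown block: " + blockId);
        }
        return len;
    }

    /** Returns the number of registered blocks. */
    int blockCount() {
        return blockLengths.size();
    }

    public static void main(String[] args) {
        InMemoryBlockStore store = new InMemoryBlockStore();
        // Register many blocks; the block size here is an arbitrary example value.
        for (long id = 0; id < 100_000; id++) {
            store.addBlock(id, 128L * 1024 * 1024);
        }
        System.out.println(store.blockCount() + " blocks tracked, no files created");
    }
}
```

The trade-off in a design like this is heap usage instead of file-system load: each block costs one map entry rather than one file.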

…ch uses MiniDFSCluster to launch many DataNodes in the same JVM. Additionally, SimulatedFSDataset is used to avoid the requirement of writing sparse files to the disk - all block metadata is now stored in-memory.

xkrogen closed this as completed in 272dc66 Mar 27, 2018
