Use fully in-memory DataNodes in the same JVM #4

Closed
xkrogen opened this issue Feb 9, 2018 · 0 comments
Comments

Collaborator

xkrogen commented Feb 9, 2018

Currently each DataNode is launched as a separate process/JVM, and we trick it into believing it holds all of its expected blocks by creating each block file with length 0. It would be much more efficient to launch all of the DataNodes in a single JVM using MiniDFSCluster, and to use SimulatedFSDataset so that block metadata is stored only in memory. This would save us from having to create millions of sparse files on disk.
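
To make the in-memory idea concrete, here is a minimal sketch in plain Java. It is not Hadoop's SimulatedFSDataset and not the code in the commits referenced on this issue; `InMemoryBlockStore`, `BlockMeta`, and every method name below are hypothetical, chosen only for illustration. It shows several simulated DataNodes in one JVM, each tracking block metadata in a map rather than through one placeholder file per block.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical stand-in for an in-memory block dataset.
 * This is an illustration only; it does not use or mirror any Hadoop class.
 */
class InMemoryBlockStore {

    /** Metadata only: no block contents and no backing file on disk. */
    static final class BlockMeta {
        final long numBytes;
        final long generationStamp;

        BlockMeta(long numBytes, long generationStamp) {
            this.numBytes = numBytes;
            this.generationStamp = generationStamp;
        }
    }

    // Block ID -> metadata. A concurrent map is used on the assumption that
    // several threads in the shared JVM may touch the store.
    private final Map<Long, BlockMeta> blocks = new ConcurrentHashMap<>();

    /** Registers a block by recording its metadata; nothing is written to disk. */
    void addBlock(long blockId, long numBytes, long generationStamp) {
        blocks.put(blockId, new BlockMeta(numBytes, generationStamp));
    }

    boolean hasBlock(long blockId) {
        return blocks.containsKey(blockId);
    }

    /** Returns the recorded length, or throws if the block is unknown. */
    long getLength(long blockId) {
        BlockMeta meta = blocks.get(blockId);
        if (meta == null) {
            throw new IllegalArgumentException("Unknown block: " + blockId);
        }
        return meta.numBytes;
    }

    int blockCount() {
        return blocks.size();
    }

    public static void main(String[] args) {
        // Three simulated DataNodes sharing one JVM, each with its own store.
        List<InMemoryBlockStore> dataNodes = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            dataNodes.add(new InMemoryBlockStore());
        }

        // Spread some blocks across the nodes round-robin.
        long blockSize = 128L * 1024 * 1024; // assumed block size for the demo
        for (long blockId = 0; blockId < 9; blockId++) {
            int node = (int) (blockId % dataNodes.size());
            dataNodes.get(node).addBlock(blockId, blockSize, 1000L + blockId);
        }

        for (int i = 0; i < dataNodes.size(); i++) {
            System.out.println("DataNode " + i + " holds "
                + dataNodes.get(i).blockCount() + " blocks in memory");
        }
    }
}
```

The per-block cost here is one map entry instead of one file, which is the saving the proposal is after. A real implementation would also have to answer block reports and reads consistently with the stored lengths.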

xkrogen added a commit to xkrogen/dynamometer that referenced this issue Feb 9, 2018
…ch uses MiniDFSCluster to launch many DataNodes in the same JVM. Additionally, SimulatedFSDataset is used to avoid the requirement of writing sparse files to the disk - all block metadata is now stored in-memory.
xkrogen added a commit to xkrogen/dynamometer that referenced this issue Feb 21, 2018
…ch uses MiniDFSCluster to launch many DataNodes in the same JVM. Additionally, SimulatedFSDataset is used to avoid the requirement of writing sparse files to the disk - all block metadata is now stored in-memory.
xkrogen added a commit to xkrogen/dynamometer that referenced this issue Feb 22, 2018
…ch uses MiniDFSCluster to launch many DataNodes in the same JVM. Additionally, SimulatedFSDataset is used to avoid the requirement of writing sparse files to the disk - all block metadata is now stored in-memory.
xkrogen added a commit to xkrogen/dynamometer that referenced this issue Feb 23, 2018
…ch uses MiniDFSCluster to launch many DataNodes in the same JVM. Additionally, SimulatedFSDataset is used to avoid the requirement of writing sparse files to the disk - all block metadata is now stored in-memory.
xkrogen added a commit to xkrogen/dynamometer that referenced this issue Mar 7, 2018
…ch uses MiniDFSCluster to launch many DataNodes in the same JVM. Additionally, SimulatedFSDataset is used to avoid the requirement of writing sparse files to the disk - all block metadata is now stored in-memory.