feat: add a dense memory backend implementation #1342

tianrui-wei

Currently, mem_t supports only a sparse memory implementation. This
commit adds dense memory support. It behaves as follows:

  1. it tries to take advantage of HugeTLB and Transparent Huge Pages
    in modern Linux kernels
  2. it allocates once and frees once for every memory region, instead of
    allocating on demand
  3. it paves the way for a simple memory-backed block device implementation

Signed-off-by: Tianrui Wei <tianrui@tianruiwei.com>
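
As a rough illustration of the behavior listed above, the following is a minimal sketch of a densely allocated region on Linux. It is not the code from this PR: the class name `dense_region_t` and its `load`/`store` methods are hypothetical. It assumes one anonymous `mmap` per region and a best-effort `madvise(MADV_HUGEPAGE)` hint for Transparent Huge Pages; explicit HugeTLB mappings (`MAP_HUGETLB`) need reserved huge pages on the host and are not shown.

```cpp
#include <sys/mman.h>

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>

// Hypothetical illustration of a dense backing store; not the PR's class.
class dense_region_t {
 public:
  explicit dense_region_t(size_t size) : size_(size) {
    assert(size > 0);
    // Allocate the whole region once; anonymous mappings are zero-filled.
    void* p = mmap(nullptr, size_, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) throw std::runtime_error("mmap failed");
    base_ = static_cast<uint8_t*>(p);
#ifdef MADV_HUGEPAGE
    // Best-effort hint for Transparent Huge Pages; a failure is ignored.
    (void)madvise(base_, size_, MADV_HUGEPAGE);
#endif
  }
  // Free the whole region once.
  ~dense_region_t() { munmap(base_, size_); }
  dense_region_t(const dense_region_t&) = delete;
  dense_region_t& operator=(const dense_region_t&) = delete;

  bool load(size_t addr, size_t len, uint8_t* bytes) const {
    if (!in_range(addr, len)) return false;
    std::memcpy(bytes, base_ + addr, len);
    return true;
  }
  bool store(size_t addr, size_t len, const uint8_t* bytes) {
    if (!in_range(addr, len)) return false;
    std::memcpy(base_ + addr, bytes, len);
    return true;
  }

 private:
  // Overflow-safe bounds check for [addr, addr + len).
  bool in_range(size_t addr, size_t len) const {
    return addr <= size_ && len <= size_ - addr;
  }
  size_t size_;
  uint8_t* base_ = nullptr;
};
```

Because the region is a single contiguous buffer, an access is a bounds check plus a `memcpy`, with no per-page lookup.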

@tianrui-wei tianrui-wei force-pushed the tianrui-dense-mem-impl branch 2 times, most recently from 5ed5f1b to 32acdef Compare April 25, 2023 21:46
Contributor

@scottj97 scottj97 left a comment

This does far too much in one commit:

  1. Adding command-line option :d
  2. Creating class hierarchy in mem_t
  3. Adding dense to mem_cfg_t
  4. Implementing dense_mem_t

That's at least 4 separate commits.

The bugfix just pushed should be squashed back in so there is no bug in any commit.

But more importantly, I think this needs more justification. What is the benefit here? Does this improve performance when simulating programs that actually use a large block of memory? By how much?

riscv/devices.h Outdated
sparse_mem_t(reg_t size);
~sparse_mem_t();

char* contents(reg_t addr);
Contributor

Use virtual and override keywords on overridden virtual methods
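
A small sketch of the convention requested here, using hypothetical types (`memory_iface_t`, `fixed_example_t`) rather than the real spike classes. With `override`, the compiler rejects the declaration if it does not match a virtual method in the base class; repeating `virtual` is redundant to the compiler and only follows the style asked for in this review.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical abstract interface, standing in for a memory base class.
struct memory_iface_t {
  virtual ~memory_iface_t() = default;
  virtual char* contents(uint64_t addr) = 0;
};

// Hypothetical concrete backend with a tiny fixed buffer.
struct fixed_example_t : memory_iface_t {
  // 'override' makes a signature mismatch with the base a compile error.
  virtual char* contents(uint64_t addr) override {
    return addr < sizeof(buf_) ? &buf_[addr] : nullptr;
  }
  char buf_[16] = {};
};

// Callers dispatch through the base-class reference.
char read_byte(memory_iface_t& m, uint64_t addr) {
  char* p = m.contents(addr);
  assert(p != nullptr && "address out of range");
  return *p;
}
```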

@tianrui-wei
Author

Hi Scott,

Thanks for reviewing my PR. I'll address your comments in subsequent commits. I'll split this single commit as you suggested.

The benefit is that the current lazy allocation scheme is inefficient compared to allocating the entire backing memory at once. For example, in ucb-bar/chipyard#1438, we unify the backing memory in both spike cosim and the actual RTL simulation to detect divergence. At the start of the cosimulation, the entire RTL memory is copied into spike, which in turn calls calloc many times.
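
To make the cost concrete, here is a simplified sketch of a lazily allocated, page-granular store. The names (`lazy_pages_t`, `copy_in`, `kPageSize`) are hypothetical and the 4 KiB page size is an assumption; the sketch only mirrors the general scheme described in this comment, not spike's actual code. Copying a contiguous image in triggers one `calloc` per touched page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <map>
#include <new>

constexpr uint64_t kPageSize = 4096;  // assumed page granularity

// Hypothetical sparse store: pages are allocated on first touch.
class lazy_pages_t {
 public:
  lazy_pages_t() = default;
  lazy_pages_t(const lazy_pages_t&) = delete;
  lazy_pages_t& operator=(const lazy_pages_t&) = delete;
  ~lazy_pages_t() {
    for (auto& kv : pages_) std::free(kv.second);
  }

  char* contents(uint64_t addr) {
    const uint64_t ppn = addr / kPageSize;
    auto it = pages_.find(ppn);
    if (it == pages_.end()) {
      // One calloc per page that has never been touched before.
      char* page = static_cast<char*>(std::calloc(kPageSize, 1));
      if (!page) throw std::bad_alloc();
      it = pages_.emplace(ppn, page).first;
      ++allocations_;
    }
    return it->second + addr % kPageSize;
  }

  // Copy a contiguous host buffer in, one page-sized piece at a time.
  void copy_in(uint64_t addr, const char* src, uint64_t len) {
    assert(src != nullptr);
    while (len > 0) {
      const uint64_t n = std::min(len, kPageSize - addr % kPageSize);
      std::memcpy(contents(addr), src, n);
      addr += n;
      src += n;
      len -= n;
    }
  }

  size_t allocations() const { return allocations_; }

 private:
  std::map<uint64_t, char*> pages_;
  size_t allocations_ = 0;
};
```

Under these assumptions, copying a 1 MiB image performs 256 separate allocations and map insertions, where a dense backend would perform a single allocation and one `memcpy`.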

The other reason is that this could serve as the backing memory for a block device, for use with drivers such as https://github.com/u-boot/u-boot/blob/master/drivers/mmc/piton_mmc.c. If you'd like, I could run a performance comparison using memcpy benchmarks.

Thanks,
Tianrui

@tianrui-wei tianrui-wei force-pushed the tianrui-dense-mem-impl branch from 32acdef to ca08a1f Compare April 25, 2023 22:00
@tianrui-wei tianrui-wei requested a review from scottj97 April 25, 2023 22:01
@aswaterman
Collaborator

I suspect we can improve the current scheme to be more efficient, e.g. by using memalign and allocating larger chunks (say, 2 MiB to match the superpage size). This would require some experimentation, but it seems preferable to having two different schemes.
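
One way the suggestion above might look, as an untested-for-performance sketch: keep the on-demand map but allocate 2 MiB chunks aligned to 2 MiB. The names `chunked_mem_t` and `kChunkSize` are hypothetical, and `posix_memalign` is used here in place of `memalign`. Whether the kernel actually backs such chunks with superpages depends on the host's Transparent Huge Page settings.

```cpp
#include <stdlib.h>  // posix_memalign, free

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <new>

constexpr uint64_t kChunkSize = 2ull << 20;  // 2 MiB, the superpage size mentioned above

// Hypothetical sparse store with large, aligned, lazily allocated chunks.
class chunked_mem_t {
 public:
  chunked_mem_t() = default;
  chunked_mem_t(const chunked_mem_t&) = delete;
  chunked_mem_t& operator=(const chunked_mem_t&) = delete;
  ~chunked_mem_t() {
    for (auto& kv : chunks_) free(kv.second);
  }

  char* contents(uint64_t addr) {
    const uint64_t idx = addr / kChunkSize;
    auto it = chunks_.find(idx);
    if (it == chunks_.end()) {
      void* p = nullptr;
      if (posix_memalign(&p, kChunkSize, kChunkSize) != 0) throw std::bad_alloc();
      assert(reinterpret_cast<uintptr_t>(p) % kChunkSize == 0);
      // posix_memalign does not zero memory, so clear the chunk explicitly.
      // Note that this touches every page of the chunk up front.
      std::memset(p, 0, kChunkSize);
      it = chunks_.emplace(idx, static_cast<char*>(p)).first;
    }
    return it->second + addr % kChunkSize;
  }

  size_t chunk_count() const { return chunks_.size(); }

 private:
  std::map<uint64_t, char*> chunks_;
};
```

Compared with 4 KiB pages, this cuts the number of allocations and map entries for a contiguous image by a factor of 512, at the cost of coarser sparseness.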

I suspect the block device will end up being a separate device_t anyway, so I question whether this PR really serves as a building block towards that.

@tianrui-wei tianrui-wei force-pushed the tianrui-dense-mem-impl branch from e2300a3 to 286f8b5 Compare April 25, 2023 23:33
@tianrui-wei
Author

I would argue that multiple backends are beneficial, and that they are common practice in other projects such as QEMU: https://github.com/qemu/qemu/blob/master/backends/hostmem.c. For one, dense allocation will crash if the allocated physical memory is larger than host memory, whereas the sparse backend can still be used to test, for example, memory addresses above 39 bits. Dense allocation also makes sharing memory easier and more flexible, because the memory is one contiguous buffer rather than something reached through a lookup.

@jerryz123
Collaborator

@tianrui-wei for our use case, we can just provide our own implementation of mem_t that uses a dense memory in our code that links with spike, and pass it to the sim_t constructor.

@tianrui-wei
Author

Could we perhaps cherry-pick only e71df64, which implements an abstract class for the memory interface, and expose a flexible interface in libriscv?

Contributor

@scottj97 scottj97 left a comment

Commit e71df64 fails to compile:

/local_home/sjohnson/spike-regress/riscv-isa-sim/spike_main/spike.cc: In function ‘std::vector<std::pair<long unsigned int, mem_t*> > make_mems(const std::vector<mem_cfg_t>&)’:
/local_home/sjohnson/spike-regress/riscv-isa-sim/spike_main/spike.cc:273:75: error: invalid new-expression of abstract class type ‘mem_t’
             mems.push_back(std::make_pair(cfg.get_base(), new mem_t(cfg.get_size())));
                                                                                   ^
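
The error above is what any C++ compiler reports when code tries to `new` a class that still has pure virtual methods. A minimal sketch of the situation and one possible fix, using hypothetical stand-in types (`abstract_mem_t`, `concrete_mem_t`, `region_cfg_t`, `make_regions`) rather than the real spike ones: the factory has to construct a concrete subclass.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for an abstract memory interface.
struct abstract_mem_t {
  virtual ~abstract_mem_t() = default;
  virtual uint64_t size() const = 0;  // pure virtual makes the class abstract
};
static_assert(std::is_abstract<abstract_mem_t>::value,
              "abstract_mem_t cannot be instantiated directly");

// Hypothetical concrete backend.
struct concrete_mem_t : abstract_mem_t {
  explicit concrete_mem_t(uint64_t size) : size_(size) {}
  virtual uint64_t size() const override { return size_; }

 private:
  uint64_t size_;
};

// Hypothetical region description (base address and size).
struct region_cfg_t {
  uint64_t base;
  uint64_t size;
};

using region_list_t =
    std::vector<std::pair<uint64_t, std::unique_ptr<abstract_mem_t>>>;

region_list_t make_regions(const std::vector<region_cfg_t>& cfgs) {
  region_list_t mems;
  for (const auto& cfg : cfgs) {
    assert(cfg.size > 0);
    // 'new abstract_mem_t(cfg.size)' would fail to compile here;
    // a concrete subclass must be constructed instead.
    mems.emplace_back(cfg.base, std::make_unique<concrete_mem_t>(cfg.size));
  }
  return mems;
}
```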

@tianrui-wei tianrui-wei force-pushed the tianrui-dense-mem-impl branch 2 times, most recently from 3d6c41b to 69e70ad Compare April 26, 2023 19:42
@tianrui-wei
Author

Hi Scott,

Thank you for reviewing and shepherding the PR. I've updated the commit to address your review comments and cherry-picked only the first commit.

Thanks,
Tianrui

@jerryz123 jerryz123 force-pushed the tianrui-dense-mem-impl branch from 69e70ad to 30a0169 Compare April 26, 2023 21:59
Signed-off-by: Tianrui Wei <tianrui@tianruiwei.com>
@tianrui-wei tianrui-wei force-pushed the tianrui-dense-mem-impl branch from 30a0169 to b72cf05 Compare April 26, 2023 22:23
@tianrui-wei tianrui-wei requested a review from scottj97 April 26, 2023 22:25
Contributor

@scottj97 scottj97 left a comment

I'll leave it up to @aswaterman but this seems like a pointless change now.

virtual void dump(std::ostream& o) override;

private:
bool load_store(reg_t addr, size_t len, uint8_t* bytes, bool store) override;
Contributor

Needs virtual

@michalt michalt mentioned this pull request Jul 10, 2023
@jerryz123
Collaborator

Resolved by #1408

@jerryz123 jerryz123 closed this Jul 25, 2023