Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ADIOS2 User Side KVCache to accelerate the remote query access. #4210
ADIOS2 User Side KVCache to accelerate the remote query access. #4210
Changes from 3 commits
ce2d56b
1e6e129
d7f5e27
9812eef
6d1bf17
e07c30b
398a0fb
2afdca6
5790fcd
22267e9
8079eac
ced5680
5f5c249
a4f572a
9414a33
8e3de8c
b336025
fbf1cf2
7c93c8a
4218e96
738e47e
52da2e2
27e0aab
49d4000
ac40ef6
1b67484
4168d6c
4352dcd
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to add as at adios2_option in this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A future idea is that this could be a docker-compose file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use CURL/wget rather than git here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My lurking worry here is that we have expanded a simple loop with a 2 line body to 100+ lines. Maybe we can break out the KVCache case to a different subroutine to keep the non-cache case simple to understand? Also you might use CoreDims/DimsArray rather than Dims. It's probably not as critical here, but Dims uses std::vector, which means lots of tiny mallocs whenever these are created, destroyed, passed by value, etc. This caused serious performance problems in metadata manipulation with BP4 at scale. As a result, BP5 tries to avoid Dims.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to add a subroutine in:
if (m_Remote)
{
PerformRemoteGets();
}
else
{
PerformLocalGets();
}
However, what I can do is like:
#ifdef ADIOS2_HAVE_KVCACHE
if (getenv("useKVCache"))
{
PerformLocalGetsWithKVCache();
}
else
{
PerformRemoteGets();
}
#else
PerformRemoteGets();
#endif
do you think this is fine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer this version...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I could get rid of
getenv("useKVCache")
->If the user builds with KVCache and wants to use remote Gets (DoRemote=1), KVCache will be used automatically. However, if the user prefers to use only the remote service without KVCache, they will need to rebuild (possibly in a separate folder).
We could also consider checking whether the cache connection is active, and if it is not, default to using the pure remote service.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, let me tell you my worry. Your code in your cache stuff is in a lot of ways similar to the code in NdCopy (which looks at whether or not two blocks intersect, the extent of the overlap, etc.). We used to have some major performance problems in this when metadata was large (many thousand blocks per variable) because every time it did something like try to determine an intersection it ended up creating a destroying a dozen or more Dims vectors (doing things like converting a taking a StartEnd Box and creating a StartCount box from it, having all those temporary values get free'd when the intersection test returned, only to recreate them later, etc.). We traced the issue back to lots and lots of small malloc/free calls when those Dims were created, initialized and destroyed. It took a ton of effort to rewrite the worst parts of that code to use ArrayDims (stack allocated dimension array, so cheap to create/destroy, own their own data) or CoreDims (the equivalent of a "span", don't own their data but let us pass it in places and operate on it like a vector without a copy). With a lot of careful work kill all the object creation/destruction overhead in NdCopy so its only overhead was necessary control and data movement. (And if you go look at its source in helper/adiosMemory.cpp you'll see extensive use of ArrayDims and CoreDims.).
The storm of vector construction and destruction in ADIOS metadata handling wasn't noticeable at small scales and it was easy to ignore the implicit overhead of casual use of Dims, but as we ran at higher scales it became seriously problematic. You're probably not seeing an issue at the scales you're testing caching at either, but we hope to use this in large-metadata situations, so it'll become an issue if it's not done well. There's just too much similarity in your processing and the NdCopy case for me not to be concerned. We obviously didn't eliminate Dims from all the ADIOS code, but we killed it where it was used most intensively. You might do the same.
(And of course you do know the dimensionality of the variable in the place where you declare ReqInfo variable, it's in the Req variable that you're iterating over in that loop. Or it would be easy to add a default constructor and initialize the dimcount later. My concern is about places where there might be intensive vector usage.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test script we talked about just now is: https://github.com/Change72/adios-test/blob/master/TestCoreDims.cpp
// will make a value copy of dimsPoint2
DimsArray dimsPointer3(reinterpret_cast<CoreDims&>(dimsPointer2));
// will only copy the pointer
DimsArray dimsPointer3(dimsPointer2);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, the following code will also give wrong results, but vector works fine.
Vector code: