Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Functions in session #270

Merged
merged 17 commits into from
Jul 10, 2020

Conversation

marius-klimantavicius
Copy link
Contributor

@marius-klimantavicius marius-klimantavicius commented May 27, 2020

For #230.
Not sure about naming (ISynchronizationListener).
There is at least one breaking change (ReadAsyncResult is in different class [different number of type arguments]).
For now I duplicated state machine code for sync and async versions, if this is not ideal I can unify into one method again.
Needed to introduce IClientSession, but this seems to be one-time call so no harm there.

@marius-klimantavicius
Copy link
Contributor Author

Fixed naming (at least to my liking).
Still missing tests and example.

@badrishc
Copy link
Contributor

Thanks for the contribution. As this is a breaking change, we need to better understand what benefits it brings users, in order to justify the change.

@tli2 what do you think of this: "For now I duplicated state machine code for sync and async versions, if this is not ideal I can unify into one method again."

@tli2
Copy link
Contributor

tli2 commented Jun 2, 2020

@badrishc

I only took a cursory look. In principle I think that's the right thing to do. For this particular case there also seems to be not too much actual code duplication.

If I am to be nit-picky I'd say we need a better model for when it is appropriate to use which, and reflect that in the API naming and documentation. We might also consider enforcing that separation in the code (e.g. if we decide that OnThreadAsync can only be called from sessions that support async, we can check that in FasterStateMachine.cs)

Not necessarily within the scope of this PR, of course.

@marius-klimantavicius
Copy link
Contributor Author

Should I experiment with moving Input, Output and Context out of FasterKV (to be part of method signature not of class itself, part of IFunctions and ClientSession)? Rationale - different callbacks might work with different Intput, Output, Context.

@marius-klimantavicius
Copy link
Contributor Author

marius-klimantavicius commented Jun 2, 2020

One (relatively annoying) issue with moving Input, Output and Context is that type inference for NewSession and ResumeSession do not work for primary FasterKV<Key, Value> (there is also legacy FasterKV for backwards compat).

Can slightly reduce the noise by adding helper "builder" (still need to specify Input, Output and Context but functions can be inferred):

Builder<Input, Output, Context> For<Input, Output, Context>() => new Builder<...>(this);
using (var session = fasterKV.For<MyInput, MyOutput, MyContext>().NewSession(new MyFunctions()...)

@badrishc
Copy link
Contributor

badrishc commented Jun 5, 2020

I think it would be great for us to support net 462 as well, so those changes are welcome, I would think, as long as perf is unaffected.

Fixed a bug where compaction would use stale value if multiple records with different lengths (for varlen structs) where encountered during compaction.
@marius-klimantavicius marius-klimantavicius marked this pull request as ready for review June 6, 2020 17:45
@marius-klimantavicius
Copy link
Contributor Author

Latest commit is for #272 - Added custom compaction functions. Note that in addition to having custom IsDeleted I added Copy and CopyInPlace to be able to deal with concurrency (now that I am writing this - do we really need these? Is it possible to compact inside mutable region? Probably yes as untilAddress can be tail).

Net462 is supported when using FASTER as binary dependency (e.g. via nuget) however I am embedding sources directly (in addition I sometimes do have more changes) thus locally I've removed readonly members, nullables, local static functions, etc. For PRs back to FASTER I am trying to keep it close to original but still some changes might come through (like extracting static local functions to full functions [this is more of a personal preference], I can move them back).

@marius-klimantavicius
Copy link
Contributor Author

I think I will do one more change - I want to add an overload to IVariableLengthStruct.GetLength to take in TInput. It is possible that due to RMW I might want to change the size of the record based on the incoming input.

@badrishc
Copy link
Contributor

For now I duplicated state machine code for sync and async versions, if this is not ideal I can unify into one method again.

I think this PR should not duplicate the async and sync versions, and instead should keep them unified as before. It seems orthogonal to the intent of this PR, correct me if that's not the case. We can consider the issue of better handling the async vs sync state machine in a separate PR.

@@ -95,15 +95,36 @@ internal class IndexSnapshotTask : ISynchronizationTask

if (async && !faster.IsIndexFuzzyCheckpointCompleted())
{
clientSession?.UnsafeSuspendThread();
fasterSession?.UnsafeSuspendThread();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will this get inlined?

@badrishc
Copy link
Contributor

Hi @marius-klimantavicius, this is a really nicely done PR, thanks for your contribution. I am done with the review and have no other comments than this one. Let me know what you think.

For now I duplicated state machine code for sync and async versions, if this is not ideal I can unify into one method again.

I think this PR should not duplicate the async and sync versions, and instead should keep them unified as before. It seems orthogonal to the intent of this PR, correct me if that's not the case. We can consider the issue of better handling the async vs sync state machine in a separate PR.

@badrishc
Copy link
Contributor

One thing left is to verify that there is no performance regression - have you done some testing in this regard?

@TedHartMS - can we use your PerfTest benchmark PR to see the impact of this PR?

@marius-klimantavicius
Copy link
Contributor Author

Removed sync methods from state machine (it seems that during my experiments I stopped using sync version after all).

As for performance regression - I haven't performed any testing regarding this. Later will try to see if I can use PerfTest for that.

@marius-klimantavicius
Copy link
Contributor Author

I tried running PerfTest a few times (I can't find YCSB files so I am using synthetic test only) however I am unable to get stable results on my machine (even with fixed seed for random number generator). I've tried running only a few times (~10-15) as each run (Release build, Ctrl-F5) takes at least 5 minutes.

@TedHartMS
Copy link
Contributor

What's your machine configuration? The Benchmark test's Release build has a working set around 24GB. The PerfTest app @badrishc referred to is a separate branch and will be available soon.

@marius-klimantavicius
Copy link
Contributor Author

I see, I have only 16GB of RAM on this machine maybe it was due swapping (I might try the PerfTest branch a bit later).

@badrishc
Copy link
Contributor

badrishc commented Jul 9, 2020

We are noticing a 10% drop in performance on a single thread.

Before

FASTER.benchmark.exe -b 0 -t 1 -n 1 -r 50 -d zipf

loading keys from D:\ycsb_files\load_zipf_250M_raw.dat into memory...
loaded 250000000 keys.
loading txns from D:\ycsb_files\run_zipf_250M_1000M_raw.dat into memory...
loaded 1000000000 txns.
Executing setup.
Loading time: 79089ms
Executing experiment.
Thread 0 done; 50645771 reads, 50645996 writes, in 30105 ms.
End tail address = 6000001496
Total 101291767 ops done in 30.107 secs.
##, zipf, 1, 50, 1, 3364392.5665127714, 0

After

FASTER.benchmark.exe -b 0 -t 1 -n 1 -r 50 -d zipf
loading keys from D:\ycsb_files\load_zipf_250M_raw.dat into memory...
loaded 250000000 keys.
loading txns from D:\ycsb_files\run_zipf_250M_1000M_raw.dat into memory...
loaded 1000000000 txns.
Executing setup.
Loading time: 88491ms
Executing experiment.
Thread 0 done; 45542661 reads, 45542155 writes, in 30139 ms.
End tail address = 6000001496
Total 91084816 ops done in 30.141 secs.
##, zipf, 1, 50, 1, 3021957.333864172, 0

@badrishc
Copy link
Contributor

badrishc commented Jul 9, 2020

With a minor update to ClientSession.cs, I was able to restore performance to the same level as before. Please review if the change is ok with you.

@marius-klimantavicius
Copy link
Contributor Author

Yep these are absolutely OK (I was hoping new AsyncFasterSession(this) to be optimized away but I was wrong). Having readonly functions is the same as before this PR (though once I figure out how to properly benchmark [provided my pc is powerful enough] I will check if perf is not affected when functions is struct [due to defensive copy though only matter for non-empty structs]).

@badrishc
Copy link
Contributor

Yeah, 'new AsyncFasterSession(this)' is the main change that got back the 10%. The other change was just a move to be in the safe side. Thanks again!

@badrishc
Copy link
Contributor

I tried removing and re-adding the readonly marker for functions, but the results are inconclusive. I'm seeing +/- 10% across runs. It does not block this PR from getting merged, but it would be nice to get to the bottom of whats going on in the critical path, and why certain benchmark runs are regularly (but not always) 10% slower.

@badrishc badrishc merged commit 9b96efc into microsoft:master Jul 10, 2020
@badrishc
Copy link
Contributor

badrishc commented Sep 3, 2020

One (relatively annoying) issue with moving Input, Output and Context is that type inference for NewSession and ResumeSession do not work for primary FasterKV<Key, Value> (there is also legacy FasterKV for backwards compat).

Can slightly reduce the noise by adding helper "builder" (still need to specify Input, Output and Context but functions can be inferred):

Builder<Input, Output, Context> For<Input, Output, Context>() => new Builder<...>(this);
using (var session = fasterKV.For<MyInput, MyOutput, MyContext>().NewSession(new MyFunctions()...)

@marius-klimantavicius: In PR #310, I have extended the builder slightly, to support writing of slightly cleaner code that produces a proper client session with the right generic params (i.e., specific Function instead of IFunctions<>):

var session = store.For(new MyFunctions()).NewSession<MyFunctions>(...);

It is still ugly (MyFunctions has to be uttered twice), but at least user does not need to utter the types of Input, Output, Context.

@marius-klimantavicius marius-klimantavicius deleted the functions_in_session branch September 3, 2020 06:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants