Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add AsyncResult and BATLedgerContext classes to bat-native-ledger #8118

Merged
merged 1 commit into from
Mar 6, 2021

Conversation

zenparsing
Copy link
Collaborator

@zenparsing zenparsing commented Mar 3, 2021

Resolves brave/brave-browser#14449

Submitter Checklist:

  • I confirm that no security/privacy review is needed, or that I have requested one
  • There is a ticket for my issue
  • Used Github auto-closing keywords in the PR description above
  • Wrote a good PR/commit description
  • Added appropriate labels (QA/Yes or QA/No; release-notes/include or release-notes/exclude; OS/...) to the associated issue
  • Checked the PR locally: npm run test -- brave_browser_tests, npm run test -- brave_unit_tests, npm run lint, npm run gn_check, npm run tslint
  • Ran git rebase master (if needed)

Reviewer Checklist:

  • A security review is not needed, or a link to one is included in the PR description
  • New files have MPL-2.0 license header
  • Adequate test coverage exists to prevent regressions
  • Major classes, functions and non-trivial code blocks are well-commented
  • Changes in component dependencies are properly reflected in gn
  • Code follows the style guide
  • Test plan is specified in PR before merging

After-merge Checklist:

Test Plan:

All unit tests should pass.

@zenparsing zenparsing marked this pull request as ready for review March 3, 2021 23:23
@zenparsing zenparsing requested a review from emerick March 4, 2021 20:49
Copy link
Contributor

@emerick emerick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zenparsing zenparsing merged commit 4f55ccd into master Mar 6, 2021
@zenparsing zenparsing deleted the ksmith-ledger-context branch March 6, 2021 17:45
@zenparsing zenparsing added this to the 1.23.x - Nightly milestone Mar 8, 2021

void Listen(CompleteCallback on_complete) {
Listener listener = {.on_complete = std::move(on_complete),
.task_runner = base::SequencedTaskRunnerHandle::Get()};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it could very easily be misused to access things on the wrong thread and I'm not seeing the benefit over just passing a callback. cc @iefremov

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the purpose of this is to attach multiple callback, I think an observer would be simpler and easier to provide thread guarantees with a thread checker

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I gather this is something already handled very well using existing tools, like PostTaskAndReplyWithResult. I don't think we shouldn't introduce new async tools without serious reasons

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time to look at this - fortunately we haven't created any PRs yet that use these constructs! 😄

Ultimately the intent here is to address some of the underlying issues in the bat-native-ledger codebase so that rewards devs can be more productive in the future. I'm not particularly attached to any way of doing things but I'm hoping the discussion can proceed with that goal in mind. That said, there is a natural limit to the amount of work that we can invest given other priorities.

Re threading: AsyncResult values are always delivered to listeners as const references - an assumption here is that const references will be safe to read concurrently. Obviously you can do things in C++ to make it unsafe.

Re callbacks: We have well-known issues in bat-native-ledger with the current callback approach. It is natural for devs to want to compose async tasks out of smaller async tasks, but this has lead to readability problems for the codebase, where callbacks are bound in to other callbacks (which are bound in to other callbacks, etc.). AsyncResult is one part of an intended solution out of this mess (the other being "Task" components).

Re observers: Defining observers for each abstraction that does something asynchronous is another option for dealing with the callback problem that we currently have, and one that I considered. I think it is appropriate when a component does things "on its own" (e.g. on a timer) and needs to publish those events. I think it is less appropriate for managing async "flow" tasks (of which there are many in bat-native-ledger); using the observer pattern in those cases also leads to code understandability problems.

Re existing tools: Trust me, I don't introduce new primitives lightly (and would prefer not to at all), but as mentioned above we have some issues with the current codebase and the existing tools haven't helped. Also, one of the goals is to take something that we must surely admit is fairly painful in Chromium code (async flows), and make it simpler for devs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess existing tools haven't helped because nobody tried to use them? I'd say Chromium async tools are pretty much straightforward when you get used to them.

just for clarity: how often would we need multiple listeners for a given task? Ho often do they listen on different sequences?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to learn more!

how often would we need multiple listeners for a given task

It comes up, in particular if you want to dedupe tasks. When I did the publisher info fetcher, you can see here how I had to put together a map of publisher_key -> vector<callback> and manage all that. Of course, it would have been a bit easier with base::CallbackList, but we couldn't do that because bat-native-ledger doesn't use Bind. (Part of what I'm trying to change!) It would have been even easier with AsyncResult, though, because I could have just stored a map of publisher_key -> AsyncResult<T> and handled those out to consumers as they requested publisher info.

Ho often do they listen on different sequences?

Not sure - I added the sequence logic in later after I learned a bit more about how sequences work and how I would want AsyncResult to work if there's more than one sequence. (I could very well not have something correct, though.) It's not a requirement for bat-native-ledger right now because that library only uses a single sequence.

In general, "reifying" the result of an async operation as a first class value that you can pass around means that you can abstract over it in a generic way. So for instance, you can write something like context()->StartTask<MyTask>(...) or a generic mapping function and things like that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we could start with transitioning from std::bind to base::Bind first

it would be nice to finally understand how many sequenced are used by Ledger. E.g., I see task_runner_ in LedgerImpl that is apparently not used 🤔

std::move(on_complete).Run(*store->value);
}

std::shared_ptr<Store> store_;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shared_ptr is a banned feature https://chromium-cpp.appspot.com/

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm aware, and since this is an internal use it could be changed to use base::RefCounted without changing the interface. A couple of considerations that lead to using std::shared_ptr here:

  • std::shared_ptr is used throughout the bat-native-ledger code.
  • It really is better than base::RefCounted for this use case.
  • It really is better than base::RefCounted for this use case.

class LedgerImpl;
class LedgerClient;

// Represents a per-user running instance of the BAT client engine. It serves as
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure I'm following, LedgerImpl is already per-user (profile) and how can singletons be per-user?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LedgerImpl is exactly the problem that this abstraction is attempting to solve: it is a manual dependency hub in which all of its subcomponents store a raw pointer back to LedgerImpl. It is non-scalable, non-compositional, non-testable, and, have you seen all of the raw this pointers?

Re "singleton": perhaps a misuse, but the intended meaning here is one instance of the component per context.

LedgerClient* ledger_client_;
LedgerImpl* ledger_impl_ = nullptr;
std::unordered_map<size_t, std::unique_ptr<Component>> components_;
std::unordered_map<size_t, std::unique_ptr<Component>> tasks_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would happen if the context dies before tasks are completed? Can't see any code canceling the tasks

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The task does not care who owns it. Generally, it will use a WeakPtrFactory, so any callbacks that it has created and registered will no-op. The associated AsyncResult for the task will be left in an always-pending state, and the AsyncResult's internal "state" data structure will be cleaned up once there are no more live instances of AsyncResult pointing to it.

Adding a "Cancel" method to tasks would be interesting (where presumably "Cancel" would be called by the context's destructor?), but it could result in subtle teardown ordering issues, and I don't think we need it for bat-native-ledger.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there is any need for "Cancel", I just wanted to understand how a task lifetime is managed. If there is a weak pointer than someone has to invalidate the weak pointer on a proper sequence - just don't quite follow how this is supposed to work with this map

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this model, the weak pointer would be invalidated by the WeakPtrFactory that is owned by the Component that is owned by the map that is owned by the BATLedgerContext. So a context and all of its components would live, and have to remain on, the same sequence. That was the idea anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so is it supposed that a task running on another sequence can dereference that weak pointer?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so is it supposed that a task running on another sequence can dereference that weak pointer?

No.

I think there might be some terminology confusion. (My fault!) "Task" here means: a ledger component that implements both a Start method and a result() -> AsyncResult method, and abstracts over a chain of async operations. We have these "chains of async ops" all over the place in ledger. I'm trying to think of a name that will distinguish this concept from a Chromium "task", which is some callback that's been posted to a TaskRunner, but I haven't been able to come up with another name yet. Let's just call it an "AsyncOp" for now. All AsyncOps that are registered with the ledger context would have their WeakPtrs deref'd on the same sequence as the context. The management of these "in progress" AsyncOps is separate from whether we're doing any multithreading.

Ledger doesn't do any multithreading at the moment, but in my imagination it would work something like:

  • Some ledger component (perhaps an AsyncOp) wants to do something in the background, so it creates an AsyncResult<T>::Resolver to represent the result of that something.
  • It passes that resolver, along with any additional (thread-safe) data to the other sequence through a TaskRunner.
  • Work is done on the other sequence, when finished, it calls resolver.Complete(...).
  • Internally, that posts a task to the originating sequence to update the async result data structure and post additional tasks to any sequences that have registered a listener for the async result.

Hope that helps!

T* ptr = instance.get();

size_t key = std::hash<T*>()(ptr);
tasks_[key] = std::move(instance);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw why this is not just a base::flat_set ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right, it should be. Or just std::set, but the number of entries (i.e. concurrent tasks) will likely be smallish so a flat_set is probably best.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, you're also asking why a map? I wanted an std::unique_ptr in the collection and I think I ran into some minor difficulty using that in a set, but I forget at the moment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah you won't need that juggling with keys when just using a set

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement BATLedgerContext
4 participants