
[FEATURE] Migrate Akka.Persistence.Sql.Common to Akka.Persistence.Linq2Db as the new "core engine" #5408

Closed
Aaronontheweb opened this issue Dec 1, 2021 · 7 comments

@Aaronontheweb
Member

Title says it all: why not consolidate the implementations in order to improve performance + consistency? Any reasons not to?

cc @ismaelhamed @to11mtm

@to11mtm
Member

to11mtm commented Dec 2, 2021

I'm for it, but there are a few things to be considered:

Edit: I've added my personal opinion of triage level on each of these.

Setup classes:

We don't have them. They might be important for some users who don't want to mess with HOCON. I'd kinda like them too, so long as we make sure multiple configurations can be defined.

Opinion: Far sooner than later. In every Akka implementation I've ever done I've just YOLOed a HOCON generator, having written one at each shop I've been at, but setup classes sound really nice.
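For context, the kind of fragment those hand-rolled HOCON generators end up emitting looks roughly like this (a minimal sketch; key names are from memory and should be verified against the plugin's reference config):

```hocon
# Illustrative sketch -- verify key names against the plugin's reference HOCON
akka.persistence.journal {
  plugin = "akka.persistence.journal.linq2db"
  linq2db {
    connection-string = "Server=localhost;Database=akka;Trusted_Connection=True;"
    provider-name = "SqlServer"
  }
}
```

A setup class would let this be built programmatically instead of via string templating.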

Packaging:

Sooooo, the fun thing about Akka.Persistence.Linq2Db is that it replaces -all- the DB-specific querying (if not provider-specific features at present) in one dependency chain. In general, Linq2Db itself makes the 'sane' choice in how to do things for a given database whenever possible. There are a couple of spots where we have to get 'specific', but never on a hot path, and without pulling in RDBMS-specific libs.

To that point, here's the mixed bag: Linq2Db is written such that it just magically wires up a given provider based on ProviderName, with possibly a little bit of magic, so long as a valid provider is there under the config 😄 (I use those weasel words because for some DBs that have multiple providers, Linq2Db will sometimes use some sniffing. For instance, ProviderName.SQLite will sniff for both System.Data.SQLite and Microsoft.Data.SQLite and will only error if it finds neither.)

IMO we can go one of two ways from a NuGet standpoint:

  • Give people a good, concise README explaining that they must pull in the DB provider they want for unit tests/apps.
  • Provide a series of 'Akka.Persistence.Linq2Db.%DataProvider%' packages that basically pull in the main repo as well as whatever DataProvider we want to throw in there. This may honestly work better once we integrate things like 'setup' classes, or if bits wind up with more DB-dependent code (one example that comes to mind is the Postgres JSON payload option).
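To illustrate the first option: today the app itself references both the plugin and whichever ADO.NET provider it wants Linq2Db to find (package names are illustrative, versions deliberately elided):

```xml
<!-- Illustrative: the plugin plus a user-chosen ADO.NET provider -->
<ItemGroup>
  <PackageReference Include="Akka.Persistence.Linq2Db" Version="..." />
  <PackageReference Include="Microsoft.Data.Sqlite" Version="..." />
</ItemGroup>
```

The second option would fold that second reference into per-provider meta-packages.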

From a 'Consolidation' standpoint:

  • We get all the DB tests running under a single repo. This is actually mostly handled in the Linq2Db repo already, but we do not yet have Oracle or MySQL tests, because I was waiting for an up/down on including those in CI/CD due to license concerns.
  • Speaking of which, we can theoretically run on a whole bunch of other RDBMSs. Would just be a matter of setting up unit tests and configuring for whatever particular schema quirk that DB may or may not have.
  • But the advantage of this is we can catch breaking changes easily and fix quickly.
    • Edit: IDK if you've ever seen the support provided at the Linq2Db repo, but over there a best effort is made to fix quickly or at least provide a workaround. If bus factor is a concern, look at their closed-issue history and how they handle things; it's the most GSD-professional project I've ever worked with.
  • Overall Maintenance story for the project gets easier, only one PR for all providers to get a new release.

Opinion: This is low effort and overall your call on priority, since I don't think I can do the requisite CI/CD changes; the most I can do myself is 'UX-optimize' the README further to keep folks from footgunning.

Indexing:

First, to continue the discussion from #5407 about how Akka.Persistence.Linq2Db handles 'indexes' right now:

We kinda cheat by PKing (PersistenceId, SequenceNr) on DBs other than SQLite (which, unlike other DBs, has PK requirements around its identity field). This will again wind up creating a non-clustered index on most sane RDBMSs as well as SQL Server.

From a lib standpoint, my plan was to make other indices opt-in, either as config toggles (bootstrap-friendly) or provided via a generator utility that could read a HOCON fragment or similar (devops-friendly for playbooking, enterprise-friendly for getting DBA approval). For instance, AFAIK Akka.Cluster.Sharding doesn't do any queries around timestamp, so you can have a separate journal/snapshot config for it and save the cost of indexes that will never be used. I think config options will be good to include though, since most folks will have to look at the reference HOCON anyway to see where to plug in the connection string.

In either case, we can easily 'copy/paste' our scripts from existing journals in the short term, possibly adapting them to our ability to define column names. Picking which string to generate based on config would be easy to do/maintain and is not on the hot path. Long term it would be nice to have a generator of some sort. FluentMigrator is a thing, but it feels like building a sandcastle with a bulldozer.
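As a sketch of what one of those opt-in scripts might look like (table and column names here are assumptions based on a typical default journal schema, not the plugin's actual scripts):

```sql
-- Hypothetical opt-in index for timestamp-based Persistence.Query reads;
-- skip it on journals (e.g. Cluster.Sharding's) that never query by timestamp.
CREATE NONCLUSTERED INDEX ix_event_journal_timestamp
    ON event_journal (persistence_id, timestamp);
```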

Opinion: Not a show-stopper but probably a 'resolve quickly'.

Persistence.Query Streaming configurability and/or better batching:

I've filed this as an issue in the Akka.Persistence.Linq2Db repo: akkadotnet/Akka.Persistence.Sql#45, and it's better discussed there. I bring it up mostly because, in light of this issue, IDK whether the fairly greedy way we sometimes .ToList() is going to be a problem for Persistence.Query users.

Edit: Frankly this may be less of an issue than I think, but as a non-user of Persistence.Query I can't be sure.

Opinion: Resolve soon-ish or on issue file.

Special sauces:

Some of the other persistence libraries have other options implemented. For example, the Postgres provider has JSON support. Amazingly, this is probably 100% doable (Linq2Db is very extensible in that regard), but it's not there right now.

Opinion: I am fine with the project letting users/orgs know that I am more than willing to help them with PRs for functionality if needed for the foreseeable future. This would be in addition to assistance the Linq2Db team would be able to provide as a cushion to 'bus factor'.

Side comments

All of the above said, there are two parts to Akka.Persistence.Linq2Db: the overall query architecture and the DB layer.

The query architecture I tried very hard, in porting from JVM akka-persistence-jdbc, to keep as close to the original as possible within my understanding of Scala. There were a couple of spots where I deliberately deviated for the sake of backward compatibility or Linq2Db's architecture, but I tried to be thoughtful about it. The spots where we have 'fancy' code are fairly well delineated.

The DB layer: I've mentioned the bus factor above, but I'll say it here. I've written absurd semi-dynamic datagrid code. I know how to use Dapper, EF Classic, and EF Core. Linq2Db is a swiss-army laser for CRUD and the easiest way to write DB-agnostic code while hating the minimum number of vendors. That said, I tried to port it such that if someone had a better library to try, swapping it in wouldn't be a complete nightmare.

@ismaelhamed
Member

ismaelhamed commented Dec 3, 2021

Yep. And I know @to11mtm has done a hell of a job porting this, but before I can push this at work, I need to be certain that it's ready for prime time.

@to11mtm
Member

to11mtm commented Dec 9, 2021

> Yep. And I know @to11mtm has done a hell of a job porting this, but before I can push this at work, I need to be certain that it's ready for prime time.

This is the chicken-and-egg problem I've run into from a 'core engine' standpoint.

I've used this in production against SQL Server for Sharding and AtLeastOnceDelivery, and it's worked great there. The few bugs I have found over time have been fixed, and really only two come to mind: one was that we needed an explicit transaction on batch writes; the other was a translation-from-Scala issue in write error reporting (ironically, the one spot I called out when I first ported it because I was worried I'd done it wrong). Both are fixed now.

However, I have not used any parts of persistence.query. And, for whatever it's worth, our count of persistent actors is bounded.

@ismaelhamed Do you have any sort of acceptance tests to possibly run against? The persistence logic is supposed to work such that, if you are running compatibility mode, everything written by Akka.Persistence.Linq2Db should be readable by Sql.Common, and vice versa. We might need to make some minor tweaks for Oracle (I have a branch somewhere for that; I -think- the important stuff is present in mainline), but I would think you could run whatever acceptance tests you have against the library in a configuration that tests migration.

I think, as somebody (@Aaronontheweb?) mentioned, some 'heavy load' tests for Persistence.Query would be good for everyone. I think we should try to include scenarios that would be realistic in production (i.e. queries with poor bounding limits), which would help everyone have confidence in moving over.

However, I think none of the above issues (helpers for indexing, potentially needing to batch-read more Persistence.Query types) would be hard to solve if required. I can guide others if needed, but I don't have the bandwidth to commit to handling it on my own at this time.

@ismaelhamed
Member

@to11mtm I'm most interested in how it'll perform under load. So I'll just take it for a spin in our load-testing environment after the holidays.

Regarding Oracle, we only have one customer on it and the Sql.Common plugin is working just fine for them, so it's not a priority. We could focus more on SQL Server at this point.

@to11mtm
Member

to11mtm commented Dec 18, 2021

> @to11mtm I'm most interested in how it'll perform under load. So I'll just take it for a spin in our load-testing environment after the holidays.

@ismaelhamed Please do.

Note that if you want to test against Oracle you may or may not need to rebase the branch I did a while back for Oracle support. If you need any help with that, let me know here or on Gitter.

In either case, SQL Server works well IME; please do refer to all of the unit tests, including the compatibility-mode tests (which show configuration for forward/backward compat with Sql.Common).

Scaling up in general:

I think there's lots of options for scaling and tuning with Persistence.Linq2Db's architecture, with more opportunities in the future.

In general, I have found it scales well in production, and I have adapted its write and read patterns to other application code to improve memory usage and scaling. If a needed setting is missing, we can add it fairly easily. I originally ported this over because the batching SQL.Common journal unfortunately didn't play well with our persistence strategies under load tests and other conditions.

The chunked write architecture almost immediately proves its worth in benchmarks. When I test in Docker against a SQL Server container, writes with Persistence.Linq2Db scale up to 4-6x for single writes across many actors and 5-20x on single Persist/PersistAsync calls compared to batching SQL.Common.

The chunked reads are a little trickier: under low load they are the same or better, and under light load they can trade blows with batching SQL. This is tricky to benchmark under heavier loads (lol, threadpool/taskscheduler is the theme this week). On my laptop it appears that, as you scale up, Persistence.Linq2Db prevails fairly quickly, likely because it's returning connections to pools faster via chunked reads. I guess I'm saying it's at least as good overall, but don't expect the miraculous improvements seen on writes.

In a 'framework-level benchmark' context (i.e. benchmarking parts of an app and/or its routing architecture; not real workflow numbers, but not just a measurement of the persistence system) we saw a 2-5x improvement. Throwing that out there since I know micro-benchmarks can be deceptive.

The Arch Rant:

So here's how the architecture works overall (warning: possibly TMI, but good info for people looking to tweak to the max!):

Writes:

  • We push all rows to write into an Akka Streams queue. The way batching works, we make sure that any 'batch' of events from a PersistAll will land in a single 'batch' write downstream.
    • When we push to the queue we get back a TCS. If the queue is errored (i.e. the queue is full or persistence is shut down), the TCS will be completed with an exception.
  • At the end of the queue is a .SelectAsync that grabs batches from the stream queue (the max size of a batch is based on config, with the exception of PersistAll, of course) and then writes either a single-row insert or one or more batched insert statements. (Note: in the case of SQL Server, once batches are over a certain size it will switch to BULK INSERT. You can get away from this behavior via config.)
    • The end result is best described as a 'Turbocharger' effect:
      • When parameters are not preferred on bulk insert, each sub-batch will be raw SQL, constrained to a certain maximum size based on number of rows and insert length. We can theoretically tune the preferred max insert string size for latency in this case; the cost of this mode is that it puts pressure on the LOH due to the size of the strings used. (We could tune for the 'prefer to avoid the LOH' case too, but...)
        • As terrifying as this might sound, it has extraordinarily efficient builders for binary data.
      • When parameters are preferred, each sub-batch is SQL that is parameterized wherever possible, but still limited to a max row count.
    • Multi-row writes are appropriately transactioned, to make sure operations like .PersistAll that may span multiple sub-batches remain atomic.
    • When the stream-batch write succeeds or fails, we set the TCS we handed back when the caller queued to Completed or Exception, respectively.
    • (I'm speaking very theoretically here, like SRE-interview level.) If there is a 'possible' weak point to this approach, it is that we serialize rows to be written into the byte array -before- we toss them into the queue, so we might be using a little more memory here. But you can still set constraints if needed via buffer-size (where writes will fail if the buffer is full), and size per insert will be constrained between the plugin's batching via Akka.Streams and Linq2Db's batching of bulk inserts.
      • In practice, I have observed low memory usage, but I did observe one or two situations where a poorly timed load shift of 1000 sharded reliable-delivery actors to a single node caused a meme-worthy memory-usage graph. I don't know how much of that was the app itself versus Cluster Sharding's algo, but I can say nobody got woken up over it. :)
      • If that isn't sufficient, one could consider adding a configuration option to instead have the WriteQueue look at the total size of a writeset's binary versus what's in the batch. The kinda crazy-cool thing about the way it's written is that this would -not- be a hard change, but it would be a tradeoff without further refactoring (since we'd be summing up each entry's message size rather than just counting rows, it could be detrimental for the common case).
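The write path above can be sketched language-agnostically. Here is a toy Python model of the idea (bounded queue in, bounded batches out, one future per caller standing in for the TCS); it is not the plugin's actual C#, just the shape of the algorithm:

```python
import queue
import threading
from concurrent.futures import Future

class WriteQueue:
    """Toy model of the write path: callers enqueue pre-serialized row-sets
    and get back a future (the TCS); a background worker drains the queue in
    bounded batches and completes every future in the batch with the result."""

    def __init__(self, write_batch, max_batch_size=100, max_buffer=10_000):
        self._queue = queue.Queue(maxsize=max_buffer)  # bounded buffer-size
        self._write_batch = write_batch                # e.g. one multi-row INSERT
        self._max_batch_size = max_batch_size
        threading.Thread(target=self._drain, daemon=True).start()

    def enqueue(self, rows):
        """Enqueue an atomic row-set (a PersistAll); fail fast if full."""
        fut = Future()
        try:
            self._queue.put_nowait((rows, fut))
        except queue.Full:
            fut.set_exception(RuntimeError("write buffer full"))
        return fut

    def _drain(self):
        while True:
            batch = [self._queue.get()]  # block for at least one entry
            # Greedily take more entries up to the batch size, but never
            # split a caller's row-set across two batches.
            while len(batch) < self._max_batch_size and not self._queue.empty():
                batch.append(self._queue.get_nowait())
            rows = [r for row_set, _ in batch for r in row_set]
            try:
                self._write_batch(rows)  # one transactioned batch write
                for _, fut in batch:
                    fut.set_result(None)
            except Exception as exc:
                for _, fut in batch:
                    fut.set_exception(exc)
```

The per-caller future is what lets a `PersistAll` succeed or fail atomically even when several callers share one downstream batch.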

Reads:

For reads, the plugin does a 'chunked read': we figure out the max sequence number for a persistence ID, then read in batches (X rows per batch, based on config, so you can tune for memory usage) and fire them to the actor.
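A minimal sketch of that chunked-read loop (illustrative Python; `read_chunk` stands in for the bounded SQL query, and is an assumption of this sketch rather than a real plugin API):

```python
def chunked_replay(max_seq_nr, read_chunk, chunk_size=500):
    """Replay events in fixed-size sequence-number ranges so only one
    chunk of rows is held in memory at a time."""
    seq = 1
    while seq <= max_seq_nr:
        upper = min(seq + chunk_size - 1, max_seq_nr)
        # e.g. SELECT ... WHERE SequenceNr BETWEEN seq AND upper
        for event in read_chunk(seq, upper):
            yield event  # fire to the replaying actor
        seq = upper + 1
```

`chunk_size` plays the role of the rows-per-batch config knob mentioned above.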

The one thing we -might- be able to do better here is to use the ordering column in the queries rather than SequenceNr; even with Persistence.Linq2Db's index/PK strategy, ordering is more likely to wind up as a clustered index under both our strategy and SQL.Common's.

Growth over time:

If one looks through the repo (e.g. my most recent PR) you'll see that both the query logic and the schema setup/evolution can be handled in a way that might be slightly crude, but will be performant and easy to understand. Supporting things like array tags in PostgreSQL (or other DBs that have more optimal patterns) would go from painful to grumblable.

Also, since akka-persistence-jdbc was used as the template, we have some guidance for growth. Note that if we want to support tags in separate tables the way they have done, it will require a decent-sized refactoring and/or contributions to the Linq2Db project. (That said, it looks like that change was a decent-sized refactoring for them too.)

Maintenance level:

Frankly, porting Persistence.Linq2Db was my 'learning experience' for Akka.Streams. If you understand basic LINQ (the 'fancy' parts of Linq2Db that we use are clear in intent) and the Akka.Streams operators used (and that the Seq sink isn't really different from an ImmutableList sink), it is very straightforward to understand, if perhaps a bit over-layered in classes/interfaces where I kept the port literal.

Dependency support:

I'm partially bringing this up because I know that @Aaronontheweb worries about things like bus factor.

Yes, there are parts of Linq2Db that are complex; any LINQ parser is. However, the code is typically built in a way that makes it pretty clear where an issue is. Plus, in my experience, the Linq2Db team is among the top of any .NET project for support/response/turnaround on issues and bugfix releases. Multiple incredibly competent contributors/members across multiple continents says low bus factor to me.

In either case, this is mitigated in our current design because, quite frankly, we don't do anything particularly fancy. It's just CRUD under a facade, the stable API surface; we don't play with semi-dynamic dictionaries or the like. Even today, most of the particularly fancy features can be opted out of with mindful use of configs under the current distribution. As an example: if you never want to hit BULK INSERT under SQL Server, one can set max-row-by-row-size to an absurdly large value.
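For example (key name per the comment above; treat the fragment as illustrative and verify the path and key against the reference HOCON):

```hocon
# Illustrative: keep every batch under the row-by-row threshold so SQL Server
# writes never switch over to BULK INSERT
akka.persistence.journal.linq2db {
  max-row-by-row-size = 100000000
}
```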

@Aaronontheweb
Member Author

I'm going to need to write a spec for how we introduce a package migration for these components. It should be relatively straightforward, but I mostly want to make sure we don't catch users by surprise.

@Aaronontheweb Aaronontheweb modified the milestones: 1.5.0, 1.5.1 Mar 2, 2023
@Aaronontheweb Aaronontheweb modified the milestones: 1.5.1, 1.5.2 Mar 15, 2023
@Aaronontheweb Aaronontheweb moved this from Spec Written to Spec Approved in Akka.NET v1.5 Total Delivery Apr 5, 2023
@Aaronontheweb Aaronontheweb modified the milestones: 1.5.2, 1.5.3 Apr 6, 2023
@Aaronontheweb Aaronontheweb modified the milestones: 1.5.3, 1.5.4, 1.5.5 Apr 20, 2023
@Aaronontheweb Aaronontheweb modified the milestones: 1.5.5, 1.5.6, 1.5.7 May 4, 2023
@Aaronontheweb Aaronontheweb modified the milestones: 1.5.7, 1.5.8 May 17, 2023
@Aaronontheweb Aaronontheweb modified the milestones: 1.5.8, 1.5.9 Jun 15, 2023
@Aaronontheweb Aaronontheweb modified the milestones: 1.5.9, 1.5.11, 1.5.12 Jul 25, 2023
@Aaronontheweb Aaronontheweb modified the milestones: 1.5.12, 1.5.13 Aug 2, 2023
@Aaronontheweb
Member Author

This is essentially done as of last week, but the deprecation process won't happen formally until 1.6.

@Aaronontheweb Aaronontheweb modified the milestones: 1.5.13, 1.6.0 Aug 17, 2023
@github-project-automation github-project-automation bot moved this from Spec Approved to Done in Akka.NET v1.5 Total Delivery Aug 17, 2023