-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEATURE] Migrate Akka.Persistence.Sql.Common to Akka.Persistence.Linq2Db as the new "core engine" #5408
Comments
I'm for it, but there's a few things to be considered: Edit: I've added my personal opinion of triage level on each of these. Setup classes:Don' t have them. It might be important for some users who don't want to mess with HOCONs. I'd kinda like them too, so long as we can make sure to define multiple configurations. Opinion: Far sooner than later. In every Implementation I've ever done for Akka I've just YOLOed a HOCON generator, but setup classes sound really nice, having written a hocon genrator at each shop I've been at. Packaging:Sooooo, The Fun thing about Akka.Persistence.Linq2Db is, it replaces -all- the DB Specific Querying (If not provider-specific features at present) in one Dependency chain. In general Linq2Db itself makes the 'sane' choice in how to do things for a given database whenever possible. There's a couple spots where we have to get 'specific' but never on a hot path and it is without pulling in RDBMS specific libs. To that point is the mixed bag, Linq2Db is written such that it just magically wires up a given provider based on ProviderName with possibly a little bit of magic, so long as a valid provider under the config is there 😄 (I use those weasel words because for some DBs that have multiple providers, Linq2Db will sometimes use some sniffing. For instance, IMO we can go one of two ways from a nuget standpoint:
From a 'Consolidation' standpoint:
Opinion: This is low effort and overall your call on priority, since I don't think I can do the requisite CI/CD changes to make anything other than 'UX Optimizing' the readme further to avoid folks footgunning. Indexing:First, to continue the discussion about how Akka.Persistence.Linq2Db handles 'Indexes' right now from #5407; We kinda cheat by PKing From a lib standpoint my plan was to make other indices opt-in, either as config toggles (bootstrap friendly) or provided via a generator utility that could read a Hocon fragment or simialr (devops friendly for playbooking, enterprise friendly for getting DBA approval). For instance AFAIK Akka.Cluster.Sharding doesn't do any queries around timestamp, so you can have a separate journal/snapshot config for it and save the cost of indexes that will never be used. I think config options will be good to include though, since most folks will have to look at the reference hocon to look where to plug the connection string. In either case, we can easily 'copy/paste' our scripts from existing journals in the short term, possibly adapting them to our ability to define column names. Picking which string to generate if config specifies would be easy to do/maintain and not on the hot path. Long term it would be nice to have a generator of some sort. FluentMigrator is a thing but feels like building a sandcastle with a bulldozer. Opinion: Not a show-stopper but probably a 'resolve quickly'. Persistence.Query Streaming configurability and/or better batching:I've filed this as an issue in the Akka.Persistence.Linq2Db repo: akkadotnet/Akka.Persistence.Sql#45 and is better discussed there. I bring it up mostly because in light of this issue IDK whether the way we .ToList() in a fairly greedy way sometimes is going to be an issue for Persistence.Query users. Edit: Frankly this may be less of an issue than I think, but as a non-user of Persistence.Query I can't be sure. Opinion: Resolve soon-ish or on issue file. Special sauces:Some of the other Persistence libraries have other options implemented. For example Postgres provider has JSON Support. Amazingly, this is probably 100% doable, Linq2Db is very extensible in that regard, but it's not there right now. Opinion: I am fine with the project letting users/orgs know that I am more than willing to help them with PRs for functionality if needed for the foreseeable future. This would be in addition to assistance the Linq2Db team would be able to provide as a cushion to 'bus factor'. Side commentsAll of the above said, there's two parts to Akka.Persistence.Linq2Db. The overall Query Architecture and the DB Layer. The query architecture, I tried very hard in porting from JVM Akka.Persistence.Jdbc to be as close as possible within my understanding of Scala. There were a couple of spots where I distinctly deviated for the purpose of backward compatibility or Linq2Dbs architecture but I tried to be thoughtful about it. The spots where we have 'fancy' code are somewhat delineated. The DB Layer, I've mentioned the bus factor above. But I'll say it here. I've written absurd semi-dynamic Datagrid code. I know how to use Dapper, I know how to use EF Classic and EF Core. Linq2Db is a swiss army laser for CRUD and the easiest way to write DB Agnostic code while hating the minimum number of vendors. That said, I tried to port it such that if someone had a better library to try, it wouldn't be a complete nightmare. |
Yep. And I know @to11mtm has done a hell of a job porting this but, before I can push this at work, I need to be certain that this is ready for prime time. |
This is the Chicken+Egg problem I've run into from a 'core engine' standpoint. I've used this in production against SqlServer for Sharding and AtLeastOnceDelivery, and it's worked great there. The few bugs that I have found over time have been fixed, and really there were only two that come to mind. One was we needed explicit transaction on batch writes, there was a translation-from-scala issue on write error reporting (ironically, the one spot I called out when I first ported because I was worried I did it wrong.) Both are fixed now. However, I have not used any parts of persistence.query. And, for whatever it's worth, our count of persistent actors is bounded. @ismaelhamed Do you have any sort of acceptance tests to possibly run against? The persistence logic is supposed to work such that if you are running compatibility mode, everything written by Akka.Persistence.Linq2Db should be readable by Sql.Common, and vice versa. We might need to make some minor tweaks for Oracle (I have a branch somewhere for that, I -think- the important stuff is present in mainline) but I would think you could run whatever acceptance tests you have against the library in a configuration to test migration. I think as somebody (@Aaronontheweb ?) mentioned some 'heavy load' tests for persistence.query would be good for everyone. I think we should try to include scenarios that would be realistic in production (i.e. queries with poor bounding limits,) and would help everyone have confidence in moving over. However, I think none of the above issues (helpers for indexing, potentially needing to batch-read more Persistence.Query types) are hard to solve if required. I can guide others if needed but do not have the bandwidth to commit to handling it on my own at this time. |
@to11mtm I'm most interested in how it'll perform under load. So, I must just take it for a spin in our load testing environment after the holydays. Regarding Oracle, we only have one customer and the |
@ismaelhamed Please do. Note if you want to test against Oracle you may or may not need to rebase the branch I did a while for Oracle support. If you need any help with that let me know via here or on Gitter. In either case, SQL Server works well IME, do please refer to all of the unit tests including compatibility mode tests (Which show configuration for forward/backward compat with Sql.Common.) Scaling up in general:I think there's lots of options for scaling and tuning with Persistence.Linq2Db's architecture, with more opportunities in the future. In general, I have found it scales well in production, and have adapted the write and read patterns to other application code to improve memory usage and scaling. If a setting that is needed is missing we can add it fairly easily. I originally ported this over because Batching.SQL unfortunately didn't play well with our persistence strategies under load tests and other conditions. The chunked write architecture almost immediately proves it's worth in benchmarks. When I test against in Docker against an SqlServer container, Writes with Persistence.Linq2Db scale up to 4-6x for single writes across many actors and 5-20x on single Persist/PersistAsyncs compared to Batching SQL Common. The chunked reads, are a little trickier; Under low load they are the same or better, under light load they can trade blows with Batching SQL. This is tricky to benchmark under heavier loads (lol threadpool/taskscheduler is the theme this week). On my laptop it appears that overall as you scale up Persistence.Linq2Db prevails fairly quickly; Likely because it's returning connections to pools faster via chunked reads. I guess I'm saying It's at least as good overall but don't expect the miraculous improvements seen on writes. In an 'Framework level benchmark' context (i.e. using in a benchmark of parts of an App and/or its routing architecture; not real workflow numbers, but not just measurement of the persistence system) we saw a 2-5x improvement. Throwing that out there since I know micro-benchmarks can be deceptive. The Arch Rant:So here's how the architecture works overall (warning; possibly TMI but good info for people looking to tweak to the max!): Writes:
Reads:For Reads, what the Plugin does is a 'chunked read' mode. We figure out what the Max Sequence number is for a persistence ID, and then read in batches (X rows per batch, based on config, so you can tune for memory usage) and fire to the actor. The one thing we -might- be able to do better here, is use the Growth over time:If one looks through the repo (e.x. my most recent PR) you'll see that both Query logic as well as Schema setup/evolution are able to be handled in a way that might be slightly crude, but will be performant and easy to understand. Supporting things like Array tags in PostgreSql (Or other Dbs that have more optimal patterns) would go from painful to grumblable. Also, since akka-persistence-jdbc was used as the template, we have some guidance for growth. Note that if we want to support tags in separate tables the way they have done, it will require decent size refactoring and/or contributions to the Linq2Db Project. (That said, it looks like that change was a decent size refactoring for them too.) Maintenance level:Frankly, porting Persistence.Linq2Db was my 'learning experience' for Akka Streams. If you understand basic LINQ (the 'fancy' parts of Linq2Db that we use are clear in intent) and the Akka.Streams operators used (and that the Dependency support:I'm partially bringing this up because I know that @Aaronontheweb worries about things like bus factor. Yes, there are parts of Linq2Db that are complex. Any Linq parser is. However the code is typically built in a way to be pretty clear where an issue is. Plus in my experience, the Linq2Db team is at the among the top for support/response/turnaround on issues and bugfix releases in a .NET Project. Multiple incredibly competent project contributors/members across multiple continents to me says low bus factor. In either case This is mitigated in our current design because, quite frankly, we don't do anything particularly fancy. It's just CRUD under a facade, the stable API surface. We don't play with semi-dynamic dictionaries or the like. Even today most of the particularly fancy features can be opted out with mindful use of configs under current distribution. As an example: if you never want to go into |
I'm going to need to write a spec for how we introduce a package migration for these components - it should be relatively straightforward, but mostly want to make sure we don't catch users by surprise. |
This is essentially done as of last week, but the deprecation process won't happen formally until 1.6. |
Title says it all. Why not consolidate the implementations in order to improve performance + consistency? Any reasons why not?
cc @ismaelhamed @to11mtm
The text was updated successfully, but these errors were encountered: