Skip to content
This repository has been archived by the owner on Oct 18, 2023. It is now read-only.

namespaces #575

Merged
merged 21 commits into from
Aug 10, 2023
Merged

namespaces #575

merged 21 commits into from
Aug 10, 2023

Conversation

MarinPostma
Copy link
Collaborator

@MarinPostma MarinPostma commented Aug 2, 2023

This PR introduces namespaces to sqld.

Namespaces enable multiple database to be managed by a single sqld instance. A primary sqld instance can manage primary databases, and a replica sqld instance can manage replica databases.

Namespaces are lazily created based on the value of the host header. The host header must be present for every request. It is split by . and the first split determines the namespace. For example, if the host is somedatabase.turso.com, then the namespace is somedatabase. For websocket the handshake host header must also be set.

A hrana connection is opened for a specific namespace. In other words, a hrana connection can only open a connection to the database specified during the handshake.

Namespaces are lazily created on either the primary or a replica. When either receives a request to some namespaces, they create it on the fly. In the replica case, a replication stream is created on namespace creation that will trigger namespace creation on the primary. When a namespace is created on the primary, the creation is not eagerly propagated to the replicas. Rather, the replica will start replication from the primary upon the first request it gets for that namespace.

Integration with bottomless works seamlessly. The database id of the bottomless replicator is set to the namespace name, and the primary can lazily recover a database from bottomless if it exists there before responding to a request.

The snapshot script callback is also passed to the namespace. as a second argument

The --allow-default-namespace flag can be used to fall back to the default namespace if the host header can't be parsed properly.

--allow-replica-overwrite has been removed since the handshake is now explicit about the database we are trying to replicate. In case of log incompatibility, the replica is always reset.

Remaining work:

  • Finer-grained authentication: currently, the same token is used to query any database
  • Loading dumps: We need an API to load dumps into a namespace
  • Create dump: we need an API to perform a dump for a namespace dumps can be created for specific namespaces by passing the --namespace arguments to the dump subcommand.
  • Eager namespace creation: namespaces are lazily created. Let's move to a more convenient model where namespaces are created on demand through the admin API
  • Allow namespace deletion: deleting namespaces can be tricky since we need to ensure consistency between replicas and the primary

@MarinPostma MarinPostma force-pushed the namespace branch 3 times, most recently from 6359208 to da49ca2 Compare August 8, 2023 12:48
@MarinPostma MarinPostma marked this pull request as ready for review August 9, 2023 09:46
@MarinPostma MarinPostma requested review from LucioFranco and psarna and removed request for LucioFranco August 9, 2023 09:48
@penberg
Copy link
Contributor

penberg commented Aug 9, 2023

Finer-grained authentication: currently, the same token is used to query any database

This is fine for now. We can merge without this, let's just open a ticket to remind us that we need to do it.

Copy link
Collaborator

@LucioFranco LucioFranco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks really good! I think we could use a better PR title and I left a few nits/comments.

sqld/src/hrana/ws/handshake.rs Outdated Show resolved Hide resolved
sqld/src/hrana/ws/handshake.rs Show resolved Hide resolved
Comment on lines 39 to 41
namespace = match extract_namespace(req) {
Ok(ns) => Some(ns),
Err(_) if allow_default_ns => Some(DEFAULT_NAMESPACE_NAME.into()),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove the option here right? Since you return in one branch you can make the type the T from the Option and then remove an unwrap?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you can do that since namespace is set by the callback body, and Rust can't prove that the callback will be invoked at any point, leaving the namespace uninitialized. Or maybe I misunderstood your suggestion?

sqld/src/http/mod.rs Outdated Show resolved Hide resolved
pub struct DbFactoryExtractor<D>(pub Arc<dyn DbFactory<Db = D>>);

#[async_trait::async_trait]
impl<F> FromRequestParts<AppState<F>> for DbFactoryExtractor<F::Database>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whats the benefit of this extractor over using the previous pattern? I assume its less work since you don't need do write out all the structs etc?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We must extract the namespace from the host header in multiple places and retrieve the corresponding DbFactory. I could have a function like retrieve_db_factory(&Request, &Namespaces) and call it everywhere, but I preferred to go with an extractor instead

};

#[async_trait::async_trait]
pub trait NamespaceFactory: Sync + Send + 'static {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this trait name is super confusing. I would maybe call it NamedspacedDb and just drop the factory since I think we all would know this is a factory :D (or can call it Maker which feels more rusty).

}

pub struct Namespaces<F: NamespaceFactory> {
#[allow(clippy::type_complexity)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should just globally allow this lint....its really dumb

}

#[allow(dead_code)]
pub struct Namespace<D, T> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name as well doesn't make sense to me either would be good to have a doc comment or rename it to something more digestible.

}

#[async_trait::async_trait]
pub trait NamespaceCommon {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this just be called Namespace and we rename the struct above?

enable_console: bool,
idle_shutdown_layer: Option<IdleShutdownLayer>,
stats: Stats,
logger: Option<Arc<ReplicationLogger>>,
) -> anyhow::Result<()> {
replication_service: Option<S>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't figure out why this was changed to be generic? Could you expand on that, I think adding generics if we don't need it can make this API much harder to read.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 step plan:

  • don't leak namespace-specific information to HTTP (Namespace<PrimaryNamespaceFactory>)
  • rn, only the primary can serve replication requests, but, it's easy enough to make a proxy rpc service so that replication request can be served from any node, which is what I want to do in a followup PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to just enum that service rather than make things more generic?

@LucioFranco
Copy link
Collaborator

Out of curiosity, how did you test all of this code?

@MarinPostma
Copy link
Collaborator Author

Out of curiosity, how did you test all of this code?

I ran plenty of tests:

  • in standalone mode:
    • create multiple namespaces
    • add data to those different tables
    • check that the data went to the correct namespaces
  • in primary replica mode:
    • Start a primary and 2 replicas
    • trigger namespace creation from replica by querying a namespace there
    • check that namespace was created on both the primary and replica
    • trigger namespace creation from primary by querying namespace there
    • check that namespace was created on both primary and replica
    • add data from both replica and primary, on different namespaces
    • check consistency between primary and replicas for all namespaces
    • kill the primary and bring it back up, and check the replicas connect to their respective primaries, attempt to add more data, and check the consistency
  • In primary/replica mode, with bottomless enabled
    • Same as above
    • destroy the primary (remove all its files) and start it again, let it recover from bottomless, and check that replicas end in a consistent state, all while replicas are still running.

@MarinPostma
Copy link
Collaborator Author

MarinPostma commented Aug 10, 2023

@LucioFranco optionally 1c12b95contains a refactor that should help with clarity. I can split it off in another PR if necessary, lmk.

I tried to make the concepts a bit clearer between database/connection/namespace

@MarinPostma MarinPostma force-pushed the namespace branch 3 times, most recently from 1c12b95 to b014270 Compare August 10, 2023 09:25
Copy link
Collaborator

@LucioFranco LucioFranco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks really good, we chatted about some naming stuff over slack but no need to fix them now. Almost all my comments are stuff we can do in a follow up so if we don't do them now we should at least make an issue that captures all of the changes we might want to make in the future.

The only comment I want to bring to your attention is the one around the namespace extraction and the fly domain, I think that is a blocker for at least getting embedded replicas somewhat working due to the fly issue. Let me know if you have any comments.

@@ -125,20 +122,20 @@ where
}

#[async_trait::async_trait]
impl<W> DbFactory for LibSqlDbFactory<W>
impl<W> MakeConnection for LibSqlDbFactory<W>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You do not need to make this change, but MakeConnection throws me off super hard because tower has https://docs.rs/tower/latest/tower/make/trait.MakeConnection.html lol

}
}

#[async_trait::async_trait]
impl<F, DB, Fut> DbFactory for F
impl<F, C, Fut> MakeConnection for F
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that we plan to expose this but I would assume generally you would actually want to implement this on a new type like MakeConnectionFn<F> so that it doesn't affect other blanket impls. But since this is private not a big deal.

}
}

pub struct TrackedDb<DB> {
pub struct TrackedConnection<DB> {
inner: DB,
#[allow(dead_code)] // just hold on to it
permit: tokio::sync::OwnedSemaphorePermit,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can remove the allow and name this field _permit and I think it will get rid of the error.

impl Database for PrimaryDatabase {
type Connection = TrackedConnection<LibSqlConnection>;

fn connection_maker(&self) -> Arc<dyn MakeConnection<Connection = Self::Connection>> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should expose Arc<dyn ...> but instead should wrap this in a new type that encapsulates the dyn trait + arc behavior. This should make other code easier to read in the rest of the codebase. The reason I bring this up here is that its encoded into a trait so this is going to leak everywhere.


use super::super::{ProtocolError, Version};
use super::handshake::WebSocket;
use super::{handshake, proto, session, Server, Upgrade};

/// State of a Hrana connection.
struct Conn<D> {
struct Conn<F: MakeNamespace> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit: By why F why not M? 😄


use super::AppState;

pub struct MakeConnectionExtractor<D>(pub Arc<dyn MakeConnection<Connection = D>>);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would rename this to NamespaceMakeConnection and drop the extractor part since that is implicit. (Another name option could be NsMakeConnection)

sqld/src/http/db_factory.rs Show resolved Hide resolved

match split_namespace(host_str) {
Ok(ns) => Ok(ns),
Err(_) if allow_default_namespace => Ok(DEFAULT_NAMESPACE_NAME.into()),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like a debug log could be useful here to catch weird issues where we expect a host to be parsed as a namespace but it gets sent to the default one.

}

/// A namspace isolates the resources pertaining to a database of type T
pub struct Namespace<T: Database> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can drop the bound and leave this to be just T since you don't use any of the trait items in the struct def.

logger: Arc<ReplicationLogger>,
replicas_with_hello: RwLock<HashSet<SocketAddr>>,
namespaces: Arc<NamespaceStore<PrimaryNamespaceMaker>>,
replicas_with_hello: RwLock<HashSet<(SocketAddr, Bytes)>>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this bytes supposed to be a namespace? If so I would suggest we maybe newtype this to make sure its clear that this bytes was extracted as a namespace, this can be done in a follow up though.

Copy link
Collaborator

@LucioFranco LucioFranco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit: 🔥

@MarinPostma MarinPostma added this pull request to the merge queue Aug 10, 2023
Merged via the queue into main with commit 87492b3 Aug 10, 2023
@MarinPostma MarinPostma deleted the namespace branch August 25, 2023 13:05
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants