Transaction API. High prio #4918

Closed
3 tasks
Tracked by #4931
drmingdrmer opened this issue Apr 18, 2022 · 12 comments · Fixed by #5030
Assignees
Labels
A-meta Area: databend meta serive community-take

Comments

@drmingdrmer
Member

drmingdrmer commented Apr 18, 2022

metasrv should provide a small language in which a complete transaction (tx) can be expressed, as the demo below illustrates (it shows both the embedded-tx solution and the tx-language solution).

  • Define a struct (maybe Statements) to describe a transaction. This can be considered a small script language.
  • Define a runtime that interprets and executes a transaction defined with Statements.
  • Define the format of the transaction execution result.

Workflow:
A caller (the metasrv client) creates a Statements and sends it to metasrv.
metasrv replicates it through the raft protocol, commits it, and executes it. It then returns the result to the caller.

// embedded transaction
fn create_table() {
    let xx = dbs.get((args.db_name, args.table_name));
    if xx.is_ok() {
        // The table already exists: nothing to do.
        return Ok(());
    } else {
        // Allocate a new table id and record both mappings.
        let id = new_id();
        dbs.insert((args.db_name, args.table_name), id);
        tables.insert((args.db_name, id), args.table);
        return Ok(());
    }
}
// user defined transaction

/// List of commands available to the user.
pub enum Cmd {

    /// Conditional evaluation:
    /// - first evaluate `condition`; it has to produce a bool value.
    /// - then evaluate `then_cmd` or `else_cmd` accordingly.
    IfThenElse {
        // Boxed because an enum cannot contain itself by value.
        condition: Box<Cmd>,
        then_cmd: Box<Cmd>,
        else_cmd: Box<Cmd>,
    },

    /// Evaluate `cmd` and save its output into a temporary variable named `var`.
    Assign {
        var: String,
        cmd: Box<Cmd>,
    },

    /// A series of Cmd that will be evaluated one by one.
    Statements(Vec<Cmd>),

    /// Return true if a variable is `None`.
    IsNull(String),

    /// Return the value of a variable; the value type is `AppliedState`.
    Var(String),

    /// Generate a new unique id in the specified namespace.
    NewId(String),

    // Tree operations

    /// Alias of BTreeMap.get()
    GetKV {
        /// The tree to access.
        /// Only predefined trees are provided, such as:
        /// table, db,
        tree: String,
        // Pseudocode: any type that can be converted into a tree key.
        key: impl IntoTreeKey,
    },

    /// Insert or overwrite a record. This variant is not spelled out in the
    /// draft; its fields are inferred from how the demo below uses it.
    Upsert {
        tree: String,
        key: impl IntoTreeKey,
        value: Box<Cmd>,
    },
}

// Demo of a user defined transaction which does the same as the embedded transaction above.
Statements(vec![

    // Look up the table id by (db_name, table_name).
    Assign {
        var: "curr",
        cmd: GetKV{
            tree: "db_tbl_id",
            key: (args.db_name, args.table_name)
        }
    },
    IfThenElse {
        condition: IsNull("curr"),
        // No such table yet: allocate an id and write both mappings.
        then_cmd: Statements(vec![
            Assign {
                var: "id",
                cmd: NewId("table"),
            },
            Upsert {
                tree: "db_tbl_id",
                key: (args.db_name, args.table_name),
                value: Var("id"),
            },
            Upsert {
                tree: "db_tbl_t",
                key: (args.db_name, Var("id")),
                value: args.table,
            },
            Var("id")
        ]),
        // The table already exists: return the existing id.
        else_cmd: Var("curr"),
    }

])
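
To make the "runtime that interprets and executes the transaction" step more concrete, here is a minimal sketch of what such an interpreter could look like. It is not the metasrv implementation: `Value`, `Runtime`, the string keys, and the in-memory trees are all assumptions made for illustration (the draft's result type is `AppliedState`, and its keys are tuples).

```rust
use std::collections::BTreeMap;

/// Hypothetical runtime value; stands in for the draft's `AppliedState`.
#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Id(u64),
}

/// Trimmed-down `Cmd`; keys are plain strings here (an assumption).
#[derive(Clone, Debug)]
pub enum Cmd {
    IfThenElse { condition: Box<Cmd>, then_cmd: Box<Cmd>, else_cmd: Box<Cmd> },
    Assign { var: String, cmd: Box<Cmd> },
    Statements(Vec<Cmd>),
    IsNull(String),
    Var(String),
    NewId(String),
    GetKV { tree: String, key: String },
    Upsert { tree: String, key: String, value: Box<Cmd> },
}

/// In-memory state machine: named trees plus one id counter per namespace.
#[derive(Default)]
pub struct Runtime {
    trees: BTreeMap<String, BTreeMap<String, Value>>,
    next_id: BTreeMap<String, u64>,
}

impl Runtime {
    /// Run one transaction with a fresh set of temporary variables.
    pub fn run(&mut self, cmd: &Cmd) -> Value {
        let mut vars = BTreeMap::new();
        self.eval(cmd, &mut vars)
    }

    fn eval(&mut self, cmd: &Cmd, vars: &mut BTreeMap<String, Value>) -> Value {
        match cmd {
            Cmd::IfThenElse { condition, then_cmd, else_cmd } => {
                if self.eval(condition, vars) == Value::Bool(true) {
                    self.eval(then_cmd, vars)
                } else {
                    self.eval(else_cmd, vars)
                }
            }
            Cmd::Assign { var, cmd } => {
                let v = self.eval(cmd, vars);
                vars.insert(var.clone(), v.clone());
                v
            }
            // A sequence evaluates to the result of its last command.
            Cmd::Statements(cmds) => {
                let mut last = Value::Null;
                for c in cmds {
                    last = self.eval(c, vars);
                }
                last
            }
            Cmd::IsNull(var) => {
                Value::Bool(matches!(vars.get(var), None | Some(Value::Null)))
            }
            Cmd::Var(var) => vars.get(var).cloned().unwrap_or(Value::Null),
            Cmd::NewId(ns) => {
                let n = self.next_id.entry(ns.clone()).or_insert(0);
                *n += 1;
                Value::Id(*n)
            }
            Cmd::GetKV { tree, key } => self
                .trees
                .get(tree)
                .and_then(|t| t.get(key))
                .cloned()
                .unwrap_or(Value::Null),
            Cmd::Upsert { tree, key, value } => {
                let v = self.eval(value, vars);
                self.trees.entry(tree.clone()).or_default().insert(key.clone(), v.clone());
                v
            }
        }
    }
}

/// Build a create-table transaction for one key (single tree, for brevity).
pub fn create_table(key: &str) -> Cmd {
    Cmd::Statements(vec![
        Cmd::Assign {
            var: "curr".into(),
            cmd: Box::new(Cmd::GetKV { tree: "db_tbl_id".into(), key: key.into() }),
        },
        Cmd::IfThenElse {
            condition: Box::new(Cmd::IsNull("curr".into())),
            then_cmd: Box::new(Cmd::Statements(vec![
                Cmd::Assign { var: "id".into(), cmd: Box::new(Cmd::NewId("table".into())) },
                Cmd::Upsert {
                    tree: "db_tbl_id".into(),
                    key: key.into(),
                    value: Box::new(Cmd::Var("id".into())),
                },
            ])),
            else_cmd: Box::new(Cmd::Var("curr".into())),
        },
    ])
}
```

Running `create_table` twice against the same `Runtime` with the same key returns the same id both times, which is the idempotent behaviour the embedded version has. Because `eval` only reads the replicated state and the command itself, every replica that applies the same `Cmd` ends up in the same state.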
@BohuTANG BohuTANG added the A-meta Area: databend meta serive label Apr 18, 2022
@flaneur2020
Member

Can I take this issue?

I have run into problems maintaining data relationships in a transactional manner, like #4823 (a role is deleted, but users still have the leaked role granted). It would be much easier to keep these data relationships clean with these transactional APIs.

@drmingdrmer
Member Author

Of course my man! Thank you so much!
This is still kind of a draft and an elaborately detailed design can be added if required.

@flaneur2020
Member

flaneur2020 commented Apr 18, 2022

Thank you. I'll first try transforming the existing create_table-like cases into the mini-tx-language representation before coding the implementation; maybe the existing procedures could be implemented with the mini-language. 🤝

@flaneur2020
Member

/assignme

@drmingdrmer
Member Author

maybe the existing procedures could be implemented with the mini-language. 🤝

That will be the great final goal :D

@drmingdrmer
Member Author

BTW, the pseudo-code above lacks seq support.

Every record value in metasrv is a SeqV<T>, which contains a seq number (AKA version), some metadata, and the value itself:

pub struct SeqV<T = Vec<u8>> {
    pub seq: u64,
    pub meta: Option<KVMeta>,
    pub data: T,
}

The transaction mini-lang needs several commands that manipulate SeqV.
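
One shape such a command could take is a seq-guarded write: the write succeeds only if the record's current seq matches what the caller expects, which gives a compare-and-swap on the version. The sketch below is only an illustration. `Store`, `MatchSeq`, `upsert`, the stubbed `KVMeta`, and the convention that a missing record counts as seq 0 are assumptions, not the metasrv API.

```rust
use std::collections::BTreeMap;

/// Placeholder for the real metadata type, which this sketch does not model.
#[derive(Clone, Debug, PartialEq)]
pub struct KVMeta;

#[derive(Clone, Debug, PartialEq)]
pub struct SeqV<T = Vec<u8>> {
    pub seq: u64,
    pub meta: Option<KVMeta>,
    pub data: T,
}

/// Hypothetical condition on the current seq of a record.
/// Assumption: a missing record is treated as having seq 0.
#[derive(Clone, Copy, Debug)]
pub enum MatchSeq {
    Any,
    Exact(u64),
}

/// Hypothetical store: one tree plus a global, monotonically increasing seq.
#[derive(Default)]
pub struct Store {
    last_seq: u64,
    kv: BTreeMap<String, SeqV>,
}

impl Store {
    /// Write `data` under `key` only if the current seq satisfies `cond`.
    /// Returns the new seq on success, or the current seq on a mismatch.
    pub fn upsert(&mut self, key: &str, cond: MatchSeq, data: Vec<u8>) -> Result<u64, u64> {
        let curr = self.kv.get(key).map(|v| v.seq).unwrap_or(0);
        let matched = match cond {
            MatchSeq::Any => true,
            MatchSeq::Exact(s) => s == curr,
        };
        if !matched {
            return Err(curr);
        }
        self.last_seq += 1;
        let seq = self.last_seq;
        self.kv.insert(key.to_string(), SeqV { seq, meta: None, data });
        Ok(seq)
    }
}
```

With this shape, `Exact(0)` expresses "create only if absent", and `Exact(n)` expresses "update only if nobody has written since I read seq n".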

// Since there is already the famous pua-lang it looks like we can not name it puny-user-action lang. 😭

@BohuTANG
Member

It seems we need a simple VM with a few opcodes, like https://github.com/vectordotdev/vector/blob/master/lib/vrl/compiler/src/vm/machine.rs ?

@ariesdevil
Contributor

FYI, the GraphQL way: https://hasura.io/docs/latest/graphql/core/databases/postgres/mutations/multiple-mutations/

@flaneur2020
Member

flaneur2020 commented Apr 19, 2022

Here's the transaction API from etcd. It seems quite simple: only a batch of IF expressions and a batch of THEN operations:

https://etcd.io/docs/v3.4/learning/api/#transaction
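
For comparison with the mini-language, an etcd-style transaction can be sketched as a flat struct: a list of guards, a list of operations applied when all guards hold, and (as etcd's API also allows) a list applied otherwise. The types below (`Cond`, `Op`, `Txn`, `apply`) and the key layout are hypothetical names for illustration, not etcd's or metasrv's actual types.

```rust
use std::collections::BTreeMap;

/// Hypothetical guard on the current state of one key.
pub enum Cond {
    Absent(String),
    ValueEq(String, String),
}

/// Hypothetical write operation.
pub enum Op {
    Put(String, String),
    Delete(String),
}

/// All guards are checked first; then exactly one batch of ops is applied.
pub struct Txn {
    pub if_all: Vec<Cond>,
    pub then_ops: Vec<Op>,
    pub else_ops: Vec<Op>,
}

/// Apply a transaction atomically to an in-memory map.
/// Returns whether the guards held (that is, which branch ran).
pub fn apply(kv: &mut BTreeMap<String, String>, txn: Txn) -> bool {
    let succeeded = txn.if_all.iter().all(|c| match c {
        Cond::Absent(k) => !kv.contains_key(k),
        Cond::ValueEq(k, v) => kv.get(k) == Some(v),
    });
    let ops = if succeeded { txn.then_ops } else { txn.else_ops };
    for op in ops {
        match op {
            Op::Put(k, v) => {
                kv.insert(k, v);
            }
            Op::Delete(k) => {
                kv.remove(&k);
            }
        }
    }
    succeeded
}
```

A case like #4823 would then be one transaction whose THEN batch deletes both the role and the grant that refers to it, so neither can be left behind without the other. The trade-off against the mini-language is that there are no variables or nesting, so values produced inside the transaction (such as a freshly generated id) cannot be fed into later operations.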

@lichuang
Contributor

One brainstorm: maybe we can introduce a Lua VM into metasrv, like Redis?

@flaneur2020
Member

A script is one way to make transactions serializable (it meets the Isolation in ACID), but it might make it hard to keep things consistent across different nodes. It requires every computation to be deterministic; any nondeterministic behaviour would be dangerous under raft replication, because it might corrupt the data across the nodes:

set(k, current_timestamp()) -- different nodes get different value of k 

IMO, a restricted operation set, rather than a Turing-complete VM, might be easier to reason about with respect to consistency across the nodes. 🤔
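
One common way around the timestamp problem, sketched here as an assumption rather than anything proposed in this thread, is to resolve the nondeterministic input once on the node that proposes the log entry and replicate the resolved value, so that applying the entry is a pure function of the state and the entry. `LogEntry`, `propose`, `apply`, and `replay` are hypothetical names.

```rust
use std::collections::BTreeMap;
use std::time::{SystemTime, UNIX_EPOCH};

/// A replicated log entry. The timestamp is fixed once, at propose time,
/// and then travels with the entry to every node.
#[derive(Clone, Debug)]
pub struct LogEntry {
    pub key: String,
    pub proposed_at_ms: u64,
}

/// Runs only on the proposing node: the single place that reads the clock.
pub fn propose(key: &str) -> LogEntry {
    let now_ms = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis() as u64)
        .unwrap_or(0);
    LogEntry { key: key.to_string(), proposed_at_ms: now_ms }
}

/// Runs on every node: depends only on the current state and the entry.
pub fn apply(state: &mut BTreeMap<String, u64>, entry: &LogEntry) {
    state.insert(entry.key.clone(), entry.proposed_at_ms);
}

/// Rebuild a node's state by replaying the log from scratch.
pub fn replay(log: &[LogEntry]) -> BTreeMap<String, u64> {
    let mut state = BTreeMap::new();
    for entry in log {
        apply(&mut state, entry);
    }
    state
}
```

Two nodes that replay the same log reach the same state no matter when they run `apply`. A restricted operation set makes this property easy to check, since there is simply no command that can read the clock or a random source during apply.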

@GrapeBaBa
Contributor

A script is one way to make transactions serializable (it meets the Isolation in ACID), but it might make it hard to keep things consistent across different nodes. It requires every computation to be deterministic; any nondeterministic behaviour would be dangerous under raft replication, because it might corrupt the data across the nodes:

set(k, current_timestamp()) -- different nodes get different value of k 

IMO, a restricted operation set, rather than a Turing-complete VM, might be easier to reason about with respect to consistency across the nodes. 🤔

Make metasrv a blockchain, with raft consensus plus an EVM?

@drmingdrmer drmingdrmer changed the title Transaction API. low prio Transaction API. High prio Apr 21, 2022
@BohuTANG BohuTANG assigned drmingdrmer and lichuang and unassigned flaneur2020 Apr 23, 2022
@BohuTANG BohuTANG mentioned this issue Apr 23, 2022
@mergify mergify bot closed this as completed in #5030 Apr 26, 2022