[Question]Does SCQL support query jobs on bigdata? #318

Open
xyz-scorpio opened this issue Jul 10, 2024 · 2 comments

xyz-scorpio commented Jul 10, 2024

Issue Type

Have you searched for existing issues?

Yes

Link to Relevant Documentation

No response

Question Details

What is the upper bound on the dataset scale that SCQL can handle? Say I want to run a query job on two datasets from Alice and Bob, each ~TB in size: can SCQL handle that?

Also, does SCQL support distributed computing? If I have 4 AWS EC2 instances, can SCQL take advantage of all of those resources, and if so, how?

tongke6 commented Jul 11, 2024

Hello @xyz-scorpio, SCQL is a system implementation of MPC SQL. Because it is limited by MPC's network communication, computation, and memory overhead, I think its upper bound is data analysis on a scale of tens of millions within an acceptable time (e.g. < 6 hours).

For now, SCQL can use only one computing node per party to process a query job, but different jobs can be scheduled to different computing nodes.
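
To make the job-level parallelism above concrete, here is a minimal Python sketch. It is not SCQL's scheduler or API: the function `assign_jobs`, the node names, and the round-robin policy are all hypothetical and only illustrate that each job runs whole on a single node of a party, while separate jobs can land on separate nodes.

```python
from collections import defaultdict
from itertools import cycle


def assign_jobs(job_ids, nodes):
    """Assign each whole job to exactly one node (hypothetical round-robin policy).

    A job is never split across nodes; only different jobs are spread out.
    """
    if not nodes:
        raise ValueError("at least one computing node is required")
    assignment = defaultdict(list)
    # cycle() repeats the node list so jobs are dealt out in turn.
    for job_id, node in zip(job_ids, cycle(nodes)):
        assignment[node].append(job_id)
    return dict(assignment)


# Hypothetical setup: one party with 4 computing nodes and 6 pending jobs.
nodes = ["ec2-1", "ec2-2", "ec2-3", "ec2-4"]
jobs = [f"query-{i}" for i in range(6)]
plan = assign_jobs(jobs, nodes)

for node in nodes:
    print(node, plan.get(node, []))
```

Under this assumption, extra nodes raise throughput across many queries but do not shorten a single large query.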


tongke6 commented Jul 11, 2024

The exception is the private set intersection (PSI) scenario, which can scale well.

tongke6 self-assigned this Jul 11, 2024
2 participants