better doc request: pool, replica, pool-id #8
Answering myself; this should be improved and placed into the docs somewhere:
Time Series (TS) = named, ordered set of [time, value] pairs. Time series are always (automatically) assigned to a pool. They cannot move to another pool.
Shard = continuous, time-limited part of a TS (from - to).
More questions:
Thank you for the feedback. We will update the documentation soon and try to clarify some concepts. On your last questions:
Will writing data to any server return success even if it is not yet replicated anywhere?
The server responds with success once the data is at least saved in a replication (queue) file, so yes, it can return success before the data is actually replicated.
Is it possible to configure the number of successful replications before returning success (1, 2, ..., N, most, 40%, 67%, ...)?
No, this is not possible, and the current version of SiriDB supports only two servers in each pool.
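To make the acknowledgement behaviour above concrete, here is a minimal toy model in Python. It is not SiriDB's implementation: the `Server` class, its `insert` and `flush_replication` methods, and the in-memory `deque` standing in for the replication queue file are all assumptions made for illustration.

```python
from collections import deque


class Server:
    """Toy stand-in for one server in a pool (hypothetical, not SiriDB code)."""

    def __init__(self, name):
        self.name = name
        self.points = []                  # points stored on this server
        self.replication_queue = deque()  # stands in for the replication (queue) file
        self.replica = None               # the other server in the pool, if any

    def insert(self, series, timestamp, value):
        """Store a point and queue it for the replica, then report success."""
        point = (series, timestamp, value)
        self.points.append(point)
        if self.replica is not None:
            # The point is only queued here; the replica has not seen it yet.
            self.replication_queue.append(point)
        return True

    def flush_replication(self):
        """Send every queued point to the replica and return how many were sent."""
        sent = 0
        while self.replication_queue:
            self.replica.points.append(self.replication_queue.popleft())
            sent += 1
        return sent


primary = Server("server-a")
replica = Server("server-b")
primary.replica = replica

acknowledged = primary.insert("cpu.load", 1, 0.5)  # success is reported here
pending = len(primary.replication_queue)           # replication still outstanding
flushed = primary.flush_replication()              # replica catches up later
```

In this sketch `insert` returns success while the point is still waiting in the queue, which mirrors the answer above: a successful write does not imply the data has already reached the second server.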
Hi guys, I am currently looking into horizontally scalable time-series DBMSs and came across the very interesting SiriDB. However, I have similar problems understanding the distribution concepts of SiriDB based on the existing documentation. In particular, I have the following questions:
Thanks a lot in advance for your help!
Hi @seybi87,
Here are a few benchmarks we did to compare some cloud solutions. The results show that insert performance scales almost linearly as more pools are added.
Azure (NetApp Files, SiriDB 1 Pool):
Azure + ONTAP Cloud (SiriDB 1 Pool):
Google Cloud (SSD, SiriDB 1 Pool):
Google Cloud (HDD, SiriDB 1 Pool):
Google Cloud (HDD, SiriDB 2 Pools):
Google Cloud (HDD, SiriDB 5 Pools (bottleneck, one insert host)):
Google Cloud (HDD, SiriDB 5 Pools (split over 2 insert hosts)):
Hi @joente, I have just two follow-up questions based on the information you provided:
@seybi87, no, that's not exactly right. Once you add a replica, both servers in the pool receive the replica role and data is synchronized across both servers. Both servers are active, and one of them is chosen at random to handle each query request. Maybe an example is easier to understand. Suppose you have four SiriDB servers; then both of these configurations are possible:
Four servers, four pools (no redundancy and no replica roles)
Or, four servers, two pools (redundancy in each pool)
Note that adding a server (either as a replica or as a new pool) has no impact on the running database. SiriDB extends in the background and the progress can be viewed with
For the benchmarks we used TSBS.
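The two layouts described above (four servers as four pools, or four servers as two pools with a replica each) can be sketched as a small Python model. This is only an illustration: the `Pool` class, `build_layout`, the server names, and the pool numbering are hypothetical, and the only facts taken from this thread are the two-server limit per pool and the random choice of server for queries.

```python
import random
from dataclasses import dataclass, field

MAX_SERVERS_PER_POOL = 2  # per this thread: at most two servers in each pool


@dataclass
class Pool:
    pool_id: int  # numbering is illustrative only
    servers: list = field(default_factory=list)

    def add_server(self, name):
        """Add a server; a second server in a pool acts as a replica."""
        if len(self.servers) >= MAX_SERVERS_PER_POOL:
            raise ValueError(f"pool {self.pool_id} already has two servers")
        self.servers.append(name)

    @property
    def has_replica(self):
        return len(self.servers) == 2

    def pick_server_for_query(self, rng=random):
        """Both servers are active; pick one at random for a query."""
        return rng.choice(self.servers)


def build_layout(server_names, servers_per_pool):
    """Group servers into pools of the given size (hypothetical helper)."""
    pools = []
    for start in range(0, len(server_names), servers_per_pool):
        pool = Pool(pool_id=len(pools))
        for name in server_names[start:start + servers_per_pool]:
            pool.add_server(name)
        pools.append(pool)
    return pools


servers = ["siridb1", "siridb2", "siridb3", "siridb4"]  # made-up host names

four_pools = build_layout(servers, servers_per_pool=1)  # no redundancy
two_pools = build_layout(servers, servers_per_pool=2)   # redundancy in each pool

for pool in two_pools:
    print(pool.pool_id, pool.servers, "replica:", pool.has_replica)
```

The trade-off the model shows is the one discussed above: with the same four servers you either get four pools and no redundancy, or two pools in which every server has a replica partner.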
@joente thanks a lot, that clarified all my questions!
I am trying to understand how SiriDB works.
What exactly are a replica and a pool? How do they work? What exactly is the relation between server, pool, replica, database, time series, and shards? How do I check (list) how I have configured those relations?
PS: I just found the blog post http://siridb.net/blog/how-we-store-time-series-in-siridb/
which is interesting, and I really like the explanation about files and indexes,
but the blog post is not enough to resolve this issue.
Idea: a simple (UML) diagram explaining the relations between server, pool, replica, database, time series, and shards might be very helpful.
Also, the help text is somewhat wrong here (new pool instead of new replica):