[Design] Time Series Data #149
Comments
Can you give one example of an aggregation feature? Like aggregation by time periods like month, week, etc?
Just the total number of each one by time? For the tokens, will we want to have the number of NFTs as well? Or just any kind of token?
If the main concern is reorgs, I think the full-nodes will handle them as well, for the sake of the metrics we need to collect. I don't see any real difference between the two sources in this respect. If there is one, maybe you should detail the description a little more. Thinking through a reorg flow, it seems to me that both solutions would behave exactly the same, for instance in a reorg that changes the number of blocks in the accepted chain.
Both the full-node and the wallet-service should return the same information in each of the steps, right?
Maybe you could explain this a little more. You only mention this one time without explanation. I know what this is about just because you talked to me on Slack, but for others it may not be so clear.
Do we have confirmation that this is really feasible with Logstash? And what would be the calculation?
I think there is a concern here about someone trying to abuse the iframe to DDoS the service, right? Do they have any built-in protection against this? Or can we do something about it? I see that you have a task Additional comments
I changed the design to include two visualizations: accumulated data over a period of time and number of items created per period of time. I included a visual representation of what the Tokens visualization would look like. The user will be able to change the time range, and the Kibana dashboard will automatically adjust the aggregation (by hour, day, week, month...).
It will be the accumulated sum and the total number of each by time. I included a visual representation to help with understanding this part. We won't have NFTs at this time, just any kind of token. NFT information is quick to get, and we may be able to include it without extra effort.
Yes, you are right. I removed the part where I stated that getting the info from wallet-service is better than the full-node. Both are equal on this part.
I added extra sections about backfilling data on both Prometheus and ElasticSearch.
It is feasible, and I will test our use case on the POC. This is possible using Ruby Filter Plugin. I will use it to add a field on Block index called
Elastic has built-in API Rate Limiting, which we will test during the POC. Also, currently this is not a big concern as this is a short-term solution and few features are running inside our ES Cluster (Tokens page and now the feature being introduced on this design)
I added the PoC as a task.
My main concern with the design is the security of using Kibana's iframe. Did you do any PoC with that? Can users access the data somehow? Or change the Kibana dashboard? How do caching and rate limits work there?
Following Luis's question here, I feel we could have a dashboard with the number of NFTs, if possible.
Using the full node exporter would also require development, because we don't have the token metrics there.
What happens if the storage is fully consumed? Do we have alerts for that?
I agree with the conclusion. It's still not clear to me what the explorer UI will look like. We are going to have two charts each for Tokens, Transactions and Blocks, and one chart for Hash Rate, is that correct? If we add NFTs separately, that's two more. Are we going to add orphan blocks to this blocks index, or just the height of the blockchain? (I think we should have the height.) Also, about the chart, what's the default view? The last week of data would be my first choice. This is one more feature that will depend on the wallet service sync (which will later be changed to the data service sync). Should we maybe use one more dev day to create a mechanism in ElasticSearch so we can get the latest update timestamp? Then we can show in the explorer screen "This screen is updated until block at height X and the last update was on timestamp Y". I feel that's more interesting than just removing the feature like we are doing in the Tokens API (which is working perfectly before we release). What do you think?
I will confirm that on the PoC, but according to this ElasticSearch PR, we will need to set up an anonymous user that Kibana will use to log in, and then define read-only privileges on the indexes we want.
Yes, I agree. We should not have extra development. I added that to the design.
Thanks! I updated the design with this information.
We do have alerts configured here. We are currently using 894MB out of 60GB we have available.
Yes, in total we will have 9 charts (2 for tokens, 2 for transactions, 2 for NFTs, 2 for blocks, 1 for hash rate). I updated the visual representation description.
It is possible by just adding a new API on Explorer Service that calls ElasticSearch for the last
That's great. I feel we should start with this PoC just to make sure everything will work as we expect. For me this is approved.
✔️ for me as well
Summary
The TimeSeries Project will introduce a new way for users to explore data. They will be able to check dashboards with Token, NFT, Transaction, Block, and Hash Rate information. HTR deposited for mint will not be covered in this document.
This project, however, brings challenges that are beyond its initial scope. For a long-term solution, we would need to rearrange how data is stored and how services are organized in order to get the most out of our data. This rearrangement would need many more dev days than expected for this project. Therefore, we are proposing a short-term solution that changes the data only as much as this implementation requires.
Currently, two stacks are candidates for managing the data: ElasticSearch and Prometheus. Both are part of our stack, can be utilized for time series data and easily integrate with the UI (Using Kibana and Grafana, respectively).
Acceptance Criteria
1 - The Explorer must show the Time Series data of Transactions, Blocks, Tokens, NFTs, and Hash Rates.
2 - Users can check the data from its beginning until the current date: from 12/31/2019 on testnet and from 1/3/2020 on mainnet.
3 - Users can choose a time range of data.
4 - For Tokens, Transactions, NFTs, and Blocks, two graphs must be rendered: Accumulated sum of items over time and number of items created by period of time. Depending on the time range, the aggregation must change automatically. See image below for a visual representation.
5 - For HashRate, one graph must be rendered: Average of hash rate by a period of time.
6 - In the header, we should render a message stating "This screen is updated until block at height X and the last update was on timestamp Y", so users can know the last time the graph was updated.
7 - By default, we must render data from last week until now.
8 - This document must result in a short-term solution while we work in parallel on a scalable solution for this demand.
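Criterion 6 could be served by a single ElasticSearch query for the highest indexed block. The sketch below is only an illustration: the field names `height` and `timestamp`, the assumption that `timestamp` is stored as epoch seconds, and the choice to use the block's own timestamp as "timestamp Y" are hypothetical and not taken from this design.

```python
from datetime import datetime, timezone


def latest_block_query() -> dict:
    """Search body asking for the single highest block (field names are assumed)."""
    return {
        "size": 1,
        "sort": [{"height": {"order": "desc"}}],
        "_source": ["height", "timestamp"],
    }


def banner_from_response(response: dict) -> str:
    """Build the header message from a standard search response body."""
    hits = response["hits"]["hits"]
    if not hits:
        return "No block data has been indexed yet."
    block = hits[0]["_source"]
    # Assumption: the block timestamp is stored as epoch seconds.
    updated = datetime.fromtimestamp(block["timestamp"], tz=timezone.utc)
    return (
        f"This screen is updated until block at height {block['height']} "
        f"and the last update was on {updated:%Y-%m-%d %H:%M:%S} UTC"
    )
```

The Explorer Service would send `latest_block_query()` to the Block index and pass the response body to `banner_from_response`.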
Visual representation of acceptance criterion 4 (only for Tokens; consider 2 more dashboards each for NFTs, Transactions, and Blocks, and 1 more for Hash Rate):
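To make the two graphs of criterion 4 concrete, here is a minimal Python sketch of the underlying arithmetic. In the proposed solution Kibana would perform this aggregation itself; the interval thresholds and function names below are illustrative assumptions only.

```python
from collections import Counter
from datetime import datetime, timedelta
from itertools import accumulate


def pick_interval(start: datetime, end: datetime) -> str:
    """Choose a bucket size from the width of the selected range (thresholds are made up)."""
    span = end - start
    if span <= timedelta(days=2):
        return "hour"
    if span <= timedelta(days=60):
        return "day"
    return "month"


def bucket_start(ts: datetime, interval: str) -> datetime:
    """Truncate a timestamp to the start of its bucket."""
    if interval == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if interval == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


def build_series(created_at, start, end):
    """Return (interval, buckets, items created per bucket, accumulated total per bucket)."""
    interval = pick_interval(start, end)
    counts = Counter(
        bucket_start(ts, interval) for ts in created_at if start <= ts <= end
    )
    buckets = sorted(counts)
    per_period = [counts[b] for b in buckets]
    # Items created before the range still count toward the accumulated graph.
    baseline = sum(1 for ts in created_at if ts < start)
    accumulated = list(accumulate(per_period, initial=baseline))[1:]
    return interval, buckets, per_period, accumulated
```

The first graph plots `accumulated` and the second plots `per_period`, both against `buckets`.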
Alternatives
Prometheus
Prometheus is currently used for monitoring our services and full-nodes. We already use a Grafana dashboard to help us visualize the data. Prometheus works by pulling data from exporters. In our case, there is an exporter on the full-node and on the Hathor Wallet Service. Both can be used as a source of truth for the domain of this project.
This is the high-level diagram if we use Prometheus:
Description of steps:
1 - Prometheus pulls data from the exporter. There are two options for getting data. Check the Pulling the Data section below for more information.
2 - Prometheus stores the data obtained from the exporter. There are two options for data storage. Check the Data Storage section below for more information.
3 - User, accessing Explorer, requests Time Series data.
4 - Grafana, embedded on the page, gets data from Prometheus.
Pulling the data
We might get data directly from the full node exporter or use the Wallet Service exporter. Both would require development.
Data Storage
The first option is local storage, the default mode provided by Prometheus and what we currently use for monitoring our systems. However, Prometheus itself says that “Prometheus's local storage is not intended to be durable long-term storage”[1][2]. Prometheus's default retention period is 15 days (which can be changed), and we have more than 2 years of data, so we consider that we need long-term storage.
Alternatively, it is possible to attach remote storage to the Prometheus service. One of the best-known options is InfluxDB[3]. Whether we choose InfluxDB or another backend, we would need to introduce a new tech component into our stack, which is not the intention of this short-term solution and would add extra costs. A performance comparison[4] concluded that InfluxDB outperforms ElasticSearch on ingest performance, but that ElasticSearch's mean query response time is 11.54x faster than InfluxDB's.
Backfilling Data
As exporters only provide the latest data, we would need to create a script to backfill the old data on Prometheus. According to this article[5], we would need to:
Steps 2 and 3 should be quick, but step 1 requires script development and reading data from Wallet Service SQL database. This would add 2 dev days to the project.
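As an illustration of what such a backfill script might emit, the sketch below renders historical samples in the OpenMetrics text format, which `promtool tsdb create-blocks-from openmetrics` can convert into TSDB blocks. Whether this matches the exact steps of the referenced article is an assumption, and the metric name `hathor_tokens_total` and the sample values are hypothetical.

```python
def to_openmetrics(metric: str, help_text: str, samples) -> str:
    """Render (unix_seconds, value) pairs as OpenMetrics text, oldest sample first."""
    lines = [
        f"# HELP {metric} {help_text}",
        f"# TYPE {metric} gauge",
    ]
    for unix_seconds, value in sorted(samples):
        # OpenMetrics sample line: name, value, timestamp in seconds.
        lines.append(f"{metric} {value} {unix_seconds}")
    # The format requires an explicit end-of-file marker.
    lines.append("# EOF")
    return "\n".join(lines) + "\n"


# Hypothetical rows, as they might come from a Wallet Service SQL query.
rows = [(1577836860, 2), (1577836800, 1)]
text = to_openmetrics("hathor_tokens_total", "Accumulated number of tokens.", rows)
```

Writing `text` to a file would give the input for the block-creation step; the resulting blocks still have to be placed in the Prometheus data directory and fall within its retention window.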
Pros and Cons
As advantages of using Prometheus, we have:
As disadvantages, we have:
Conclusion
Although Prometheus is an excellent tool for monitoring systems, it does not meet the criteria of this project, especially because of the data storage concerns we presented above.
ElasticSearch
ElasticSearch is a search engine that was recently introduced on Hathor’s tech stack. External sources (Like Logstash) must push data into the cluster, where indexes are created and data is mapped. In front of ElasticSearch data, we can use Kibana to create dashboards and import them directly on Hathor Explorer (Similarly to Grafana).
This is the high-level diagram if we use ElasticSearch:
Description of steps:
1 - Logstash makes a select statement on both the Transaction and Token tables to transfer the data to ElasticSearch. Note: 1a is already implemented.
2 - Logstash handles the data and sends it to the correct index. In the case of the Transaction Pipeline, we will need to evaluate if the transaction is a block. If it is, send the information to the Block index, including a new field for Hash Rate. If the transaction is not a block, send it to the Transaction index. Note: 2a is already implemented.
3 - Users, via Hathor Explorer, access a Kibana iframe.
4 - Kibana directly accesses the cluster querying for information.
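The routing in step 2 can be sketched in Python as follows. This only illustrates the logic and is not the Logstash pipeline itself. The `is_block` flag, the index names, and the field names are hypothetical, and the hash rate formula assumes that `weight` is the base-2 logarithm of the expected number of hashes, which this document does not state.

```python
from typing import Optional, Tuple


def route_document(tx: dict, previous_block_timestamp: Optional[int] = None) -> Tuple[str, dict]:
    """Return the target index name and the document to store in it."""
    doc = dict(tx)
    if not tx["is_block"]:
        return "transaction", doc
    if previous_block_timestamp is not None:
        elapsed = tx["timestamp"] - previous_block_timestamp
        if elapsed > 0:
            # Assumption: 2 ** weight approximates the hashes needed to find this block,
            # so dividing by the seconds since the previous block estimates hashes per second.
            doc["hash_rate"] = 2 ** tx["weight"] / elapsed
    return "block", doc
```

A per-block estimate like this is noisy, which fits criterion 5 asking for an average of the hash rate over a period of time.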
Backfilling weight data
Currently, Hathor Wallet Service does not store weight data from Transactions and Blocks. We will need to:
weight column on Transaction table.
weight from the full node and insert on the table.
Pros and Cons
As advantages of using ElasticSearch, we have:
As disadvantages, we have:
Conclusion
Our application will not be ingest-heavy at the beginning. Also, we can tolerate data taking seconds or even minutes to be ingested. On the other hand, response time is much more important, as many users may be accessing the data from Hathor Explorer. Therefore, we consider ElasticSearch to be the best short-term solution for this use case.
Task Breakdown
Considering ElasticSearch as the best tool for the solution, we will have the tasks:
Wallet Service - Total: 2.1 dev/days
weight column on Transaction table. - 0.5 dev/day
weight from the full node and insert on the table. - 0.8 dev/day
Logstash - Total: 2 dev/days
ElasticSearch - Total: 1.6 dev/days
Kibana - Total: 0.8 dev/days
Explorer - Total: 1.6 dev/days
Explorer Service - 0.7 dev/days
Total: 8.8 dev/days.
References:
[1] - https://prometheus.io/docs/prometheus/latest/storage/#operational-aspects
[2] - https://stackoverflow.com/questions/68891824/why-prometheus-is-not-suitable-for-long-term-storage
[3] - https://www.influxdata.com/
[4] - https://jolicode.com/blog/influxdb-vs-elasticsearch-for-time-series-and-metrics-data
[5] - https://medium.com/tlvince/prometheus-backfilling-a92573eb712c