Avoid long heavy tasks in the GraphQL service #2340
Conversation
@@ -134,7 +141,7 @@ impl ReadView {
     pub fn transaction(&self, tx_id: &TxId) -> StorageResult<Transaction> {
         let result = self.on_chain.transaction(tx_id);
         if result.is_not_found() {
-            if let Some(tx) = self.old_transaction(tx_id)? {
+            if let Some(tx) = self.off_chain.old_transaction(tx_id)? {
I prefer the old helper... although the meaning of "old" here isn't clear to me.
@@ -12,6 +12,10 @@ pub struct GraphQLArgs {
     #[clap(long = "port", default_value = "4000", env)]
     pub port: u16,

+    /// The size of the batch fetched from the database by GraphQL service.
+    #[clap(long = "graphql-database-batch-size", default_value = "100", env)]
+    pub database_batch_size: usize,
does this have to be configurable? can we just set it to a global now to reduce the config surface? we have too many options rn :)
I decided to make it configurable for now so we can adjust performance on the fly. Later we can remove it, once we have a better idea of the best value.
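To make the discussion concrete, here is a minimal std-only sketch of the batching idea the `database_batch_size` flag controls. The names `fetch_all` and `fetch_batch` are hypothetical stand-ins for the service's stream pipeline and its database call; this is an illustration, not the PR's implementation.

```rust
// Illustrative sketch only: `fetch_batch` is a hypothetical stand-in
// for the real database query, and `batch_size` plays the role of the
// `database_batch_size` config value.
fn fetch_all(ids: &[u32], batch_size: usize) -> Vec<String> {
    let mut out = Vec::with_capacity(ids.len());
    // Split the full request into batches so no single database call
    // (and no single poll of the task) grows unboundedly.
    for batch in ids.chunks(batch_size) {
        out.extend(fetch_batch(batch));
    }
    out
}

// Hypothetical single-batch fetch; the real service queries storage here.
fn fetch_batch(batch: &[u32]) -> Vec<String> {
    batch.iter().map(|id| format!("value-{id}")).collect()
}

fn main() {
    let ids: Vec<u32> = (0..10).collect();
    let values = fetch_all(&ids, 4); // processed as batches of 4, 4, 2
    assert_eq!(values.len(), 10);
    println!("fetched {} values", values.len());
}
```

A smaller batch size trades throughput for shorter individual database calls, which is exactly why having it tunable at runtime is convenient while the sweet spot is unknown.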
Force-pushed from 7cb4231 to ab5e940.
Co-authored-by: Mårten Blankfors <marten@blankfors.se>
…re/avoid-long-heavy-tasks

# Conflicts:
#	CHANGELOG.md
#	crates/fuel-core/src/query/balance/asset_query.rs
#	crates/fuel-core/src/query/coin.rs
#	crates/fuel-core/src/query/message.rs
#	crates/fuel-core/src/query/tx.rs
A bunch of nits, but otherwise looks good to me.
            })
        });

    futures::stream::StreamExt::chunks(stream, database.batch_size)
We could just refer to `self.database` here:
-    futures::stream::StreamExt::chunks(stream, database.batch_size)
+    futures::stream::StreamExt::chunks(stream, self.database.batch_size)
Or add a getter for `batch_size` :)
        })
        .try_filter_map(move |chunk| async move {
            let chunk = database
And here:

-            let chunk = database
+            let chunk = self.database
…re/avoid-long-heavy-tasks

# Conflicts:
#	CHANGELOG.md
#	bin/fuel-core/src/cli/run.rs
#	bin/fuel-core/src/cli/run/graphql.rs
#	crates/fuel-core/src/graphql_api.rs
#	crates/fuel-core/src/service/config.rs
I have one outstanding comment, but that's mostly a nit. Otherwise LGTM. Will wait until there is another review to approve.
# Conflicts:
#	CHANGELOG.md
#	crates/fuel-core/src/coins_query.rs
#	crates/fuel-core/src/graphql_api/database.rs
#	crates/fuel-core/src/query/balance.rs
#	crates/fuel-core/src/query/balance/asset_query.rs
#	crates/fuel-core/src/query/block.rs
#	crates/fuel-core/src/query/coin.rs
#	crates/fuel-core/src/query/message.rs
#	crates/fuel-core/src/query/tx.rs
#	crates/fuel-core/src/schema/block.rs
#	crates/fuel-core/src/schema/tx.rs
Nice stuff! Just one question about fusing the underlying stream in `YieldStream`.
        .map(|tx_id| self.transaction(tx_id))
        .collect::<Vec<_>>();
    // Give a chance to other tasks to run.
    tokio::task::yield_now().await;
What is the point of this yield? This task has already finished using the database by now.
Should we be applying the same chunk batching to the iterator above?
The idea is that in the future we will have an async caching mechanism, where a query waits for a notification when the cache fetches a newly desired value (which can potentially be used by several queries against the last block with transactions). This `yield_now` imitates that behavior and also lets Tokio schedule other tasks.
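For readers unfamiliar with what `yield_now` actually does: it is a future that returns `Pending` exactly once, handing control back to the executor so other ready tasks can run before this one is polled again. Below is a minimal std-only sketch of that mechanism: `YieldNow` is hand-rolled (not Tokio's implementation), and the tiny busy-poll `block_on` with a no-op waker stands in for a real executor.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Hand-rolled equivalent of `tokio::task::yield_now()`: returns `Pending`
// once, which suspends the current task and gives the executor a chance
// to run other tasks before polling this one again.
struct YieldNow {
    yielded: bool,
}

impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref(); // ask to be polled again
            Poll::Pending
        }
    }
}

// Minimal no-op waker and busy-poll executor, just to run the demo on std.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

fn block_on<F: Future>(mut fut: F) -> F::Output {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    // Safety: `fut` is shadowed and never moved after being pinned.
    let mut fut = unsafe { Pin::new_unchecked(&mut fut) };
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        // A real executor would run other ready tasks at this point.
    }
}

fn main() {
    let resumes = block_on(async {
        let mut count = 1; // first poll runs this far
        YieldNow { yielded: false }.await; // suspends exactly once
        count += 1; // resumed on the second poll
        count
    });
    assert_eq!(resumes, 2);
    println!("future completed after {resumes} polls");
}
```

This also shows why the batch boundary is a natural place to yield: each poll processes at most one batch, so no single poll can monopolize the runtime.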
            })
        });

    futures::stream::StreamExt::chunks(stream, database.batch_size)
Could this all be cleaned up with the new `yield_each` extension, to avoid the need to map chunks and flatten results?
No, here we actually send a batch request to the database, like "please fetch all the coins for this list of UtxoIds".
Later we can optimize it with multi-get plus caching (the next follow-up PR will add caching).
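As a rough illustration of that follow-up direction, here is a hypothetical sketch of a batch lookup backed by a cache: the cache answers what it can, and only the missing keys go to storage in a single multi-get. `CachedStore` and `storage_multi_get` are invented names for this sketch, not code from the PR.

```rust
use std::collections::HashMap;

// Hypothetical sketch: a batch lookup that consults a cache first and
// fetches only the missing keys from storage in one multi-get.
struct CachedStore {
    cache: HashMap<u64, String>,
}

impl CachedStore {
    fn multi_get(&mut self, keys: &[u64]) -> Vec<String> {
        // Collect the keys the cache cannot answer yet.
        let missing: Vec<u64> = keys
            .iter()
            .copied()
            .filter(|k| !self.cache.contains_key(k))
            .collect();
        // One batched storage request for everything missing
        // (stand-in for a real database multi-get).
        for (k, v) in storage_multi_get(&missing) {
            self.cache.insert(k, v);
        }
        // Every requested key is now cached.
        keys.iter().map(|k| self.cache[k].clone()).collect()
    }
}

// Hypothetical storage backend call.
fn storage_multi_get(keys: &[u64]) -> Vec<(u64, String)> {
    keys.iter().map(|&k| (k, format!("coin-{k}"))).collect()
}

fn main() {
    let mut store = CachedStore { cache: HashMap::new() };
    let values = store.multi_get(&[1, 2, 3]);
    assert_eq!(values, vec!["coin-1", "coin-2", "coin-3"]);
    // A repeated lookup is now served from the cache.
    assert!(store.cache.contains_key(&2));
}
```

The point of the batch shape is that the storage round-trip cost is paid once per chunk rather than once per key, which is why a per-item `yield_each` would not replace it.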
Oh, I see: since it might be a multi-get later?
Yep
## Version v0.40.0

### Added

- [2347](#2347): Add GraphQL complexity histogram to metrics.
- [2350](#2350): Added a new CLI flag `graphql-number-of-threads` to limit the number of threads used by the GraphQL service. The default value is `2`; `0` enables the old behavior.
- [2335](#2335): Added CLI arguments for configuring GraphQL query costs.

### Fixed

- [2345](#2345): In PoA, increased the priority of the block creation timer trigger compared to txpool event management.

### Changed

- [2334](#2334): Prepare the GraphQL service for the switch to `async` methods.
- [2310](#2310): New metrics: "The gas prices used in a block" (`importer_gas_price_for_block`), "The total gas used in a block" (`importer_gas_per_block`), "The total fee (gwei) paid by transactions in a block" (`importer_fee_per_block_gwei`), "The total number of transactions in a block" (`importer_transactions_per_block`), and P2P metrics for swarm and protocol.
- [2340](#2340): Avoid long heavy tasks in the GraphQL service by splitting work into batches.
- [2341](#2341): Updated all pagination queries to work with the async stream instead of the sync iterator.
- [2350](#2350): Limited the number of threads used by the GraphQL service.

#### Breaking

- [2310](#2310): The `metrics` command-line parameter has been replaced with `disable-metrics`. Metrics are now enabled by default, with the option to disable them entirely or on a per-module basis.
- [2341](#2341): The maximum number of processed coins from the `coins_to_spend` query is limited to `max_inputs`.
## What's Changed

* fix(gas_price_service): service name and unused trait impl by @rymnc in #2317
* Do not require build of docker images to pass CI by @xgreenx in #2342
* Prepare the GraphQL service for the switching to `async` methods by @xgreenx in #2334
* Limited the number of threads used by the GraphQL service by @xgreenx in #2350
* Increase priority of timer over txpool event by @xgreenx in #2345
* Disable flaky `test_poa_multiple_producers` test by @rafal-ch in #2353
* feat: CLI arguments for configuring GraphQL query costs. by @netrome in #2335
* Add graphql query complexity histogram metric by @AurelienFT in #2349
* Updated all pagination queries to work with the `Stream` instead of `Iterator` by @xgreenx in #2341
* Avoid long heavy tasks in the GraphQL service by @xgreenx in #2340
* Add more metrics by @rafal-ch in #2310

**Full Changelog**: v0.39.0...v0.40.0

Co-authored-by: Rafał Chabowski <rafal.chabowski@fuel.sh>
Co-authored-by: acerone85 <andrea.cerone@gmail.com>
Co-authored-by: rymnc <43716372+rymnc@users.noreply.github.com>
Co-authored-by: Rafał Chabowski <88321181+rafal-ch@users.noreply.github.com>
This PR adds chunking to the stream in GraphQL and requests data in batches. The current implementation is simple and executes batches on the same runtime, but at the end of each batch fetch it yields, allowing other tasks to be processed.