diff --git a/content/blog/2025-06-30-cancellation.md b/content/blog/2025-06-30-cancellation.md
new file mode 100644
index 00000000..d8127cb8
--- /dev/null
+++ b/content/blog/2025-06-30-cancellation.md
@@ -0,0 +1,490 @@
+---
+layout: post
+title: Using Rust async for Query Execution and Cancelling Long-Running Queries
+date: 2025-06-30
+author: Pepijn Van Eeckhoudt
+categories: [features]
+---
+
+
+Have you ever tried to cancel a query that just wouldn't stop?
+In this post, we'll review how Rust's [`async` programming model](https://doc.rust-lang.org/book/ch17-00-async-await.html) works, how [DataFusion](https://datafusion.apache.org/) uses that model for CPU intensive tasks, and how this is used to cancel queries.
+Then we'll review some cases where queries could not be cancelled in DataFusion and what the community did to resolve the problem.
+
+## Understanding Rust's Async Model
+
+DataFusion, somewhat unconventionally, [uses the Rust async system and the Tokio task scheduler](https://docs.rs/datafusion/latest/datafusion/#thread-scheduling-cpu--io-thread-pools-and-tokio-runtimes) for CPU intensive processing.
+To really understand the cancellation problem, you first need to be familiar with Rust's asynchronous programming model, which is a bit different from what you might be used to in other ecosystems.
+Let's go over the basics again as a refresher.
+If you're familiar with the ins and outs of `Future` and `async` you can skip this section.
+
+
+### Futures Are Inert
+
+Rust's asynchronous programming model is built around the [`Future`](https://doc.rust-lang.org/std/future/trait.Future.html) trait.
+In contrast to, for instance, Javascript's [`Promise`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise) or Java's [`Future`](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/concurrent/Future.html) a Rust `Future` does not necessarily represent an actively running asynchronous job.
+Instead, a `Future` represents a lazy calculation that only makes progress when explicitly asked to do so.
+This is done by calling the [`poll`](https://doc.rust-lang.org/std/future/trait.Future.html#tymethod.poll) method of a `Future`.
+If nobody polls a `Future` explicitly, it is [an inert object](https://doc.rust-lang.org/std/future/trait.Future.html#runtime-characteristics).
+
+Calling `Future::poll` returns one of two results:
+
+- [`Poll::Pending`](https://doc.rust-lang.org/std/task/enum.Poll.html#variant.Pending) if the evaluation is not yet complete, most often because it needs to wait for something like I/O before it can continue
+- [`Poll::Ready`](https://doc.rust-lang.org/std/task/enum.Poll.html#variant.Ready) when it has completed and produced a value
+
+When a `Future` returns `Pending`, it saves its internal state so it can pick up where it left off the next time you poll it.
+This internal state management makes Rust's `Future`s memory-efficient and composable.
+Rather than freezing the entire call stack leading up to the suspension point, only the state needed to resume the future has to be retained.
+
+Additionally, a `Future` must set up the necessary signaling to notify the caller when it should call `poll` again, to avoid a busy-waiting loop.
+This is done using a [`Waker`](https://doc.rust-lang.org/std/task/struct.Waker.html) which the `Future` receives via the `Context` parameter of the `poll` function.
+
+Manual implementations of `Future` are most often little finite state machines.
+Each state in the process of completing the calculation is modeled as a variant of an `enum`.
+Before a `Future` returns `Pending`, it bundles the data required to resume in an enum variant, stores that enum variant in itself, and then returns.
+While the resulting state machines are compact and efficient at runtime, the source code is often quite verbose.
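+
+As an illustration, here's a hand-written `Future` of this kind: a hypothetical countdown that stores its remaining work in an enum variant each time it yields. The `noop_waker` helper is not part of any DataFusion API; it exists only so we can drive `poll` by hand without a runtime.
+
+```rust
+use std::future::Future;
+use std::pin::Pin;
+use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};
+
+// Completes after being polled `remaining` more times. Each `Pending`
+// return saves the updated state back into the enum before yielding.
+enum Countdown {
+    Running { remaining: u32 },
+    Done,
+}
+
+impl Future for Countdown {
+    type Output = ();
+
+    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
+        match *self {
+            Countdown::Running { remaining } if remaining > 0 => {
+                // Store the state needed to resume, then yield.
+                *self = Countdown::Running { remaining: remaining - 1 };
+                // A real future would hand this waker to an I/O source;
+                // here we wake immediately so the caller polls again.
+                cx.waker().wake_by_ref();
+                Poll::Pending
+            }
+            Countdown::Running { .. } => {
+                *self = Countdown::Done;
+                Poll::Ready(())
+            }
+            Countdown::Done => panic!("polled after completion"),
+        }
+    }
+}
+
+// The smallest possible waker: cloning and waking do nothing.
+fn noop_waker() -> Waker {
+    fn clone(_: *const ()) -> RawWaker {
+        RawWaker::new(std::ptr::null(), &VTABLE)
+    }
+    fn noop(_: *const ()) {}
+    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
+    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
+}
+
+fn main() {
+    let waker = noop_waker();
+    let mut cx = Context::from_waker(&waker);
+    let mut fut = Countdown::Running { remaining: 2 };
+    let mut fut = Pin::new(&mut fut);
+    assert!(fut.as_mut().poll(&mut cx).is_pending());
+    assert!(fut.as_mut().poll(&mut cx).is_pending());
+    assert!(fut.as_mut().poll(&mut cx).is_ready());
+}
+```
+
+Note how the future is inert between `poll` calls: if `main` stopped polling after the first `Pending`, the countdown would simply never finish.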
+
+The `async` keyword was introduced to make life easier on Rust programmers.
+It provides elegant syntactic sugar for the manual state machine `Future` approach.
+When you write an `async` function or block, the compiler transforms linear code into a state machine based `Future` similar to the one described above for you.
+Since all the state management is compiler generated and hidden from sight, async code tends to be easier to write and more readable, while the underlying mechanics stay the same.
+
+The `await` keyword complements `async`, pausing execution until a `Future` completes.
+When you `.await` a `Future`, you're essentially telling the compiler to generate code that:
+
+1. Polls the `Future` with the current (implicit) asynchronous context
+2. If `poll` returns `Poll::Pending`, save the state of the `Future` so that it can resume at this point and return `Poll::Pending`
+3. If it returns `Poll::Ready(value)`, continue execution with that value
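+
+To see that `async` really is sugar for such a state machine, we can poll an `async` block by hand. This is a standard-library-only sketch; the `noop_waker` helper is the usual no-op `RawWaker` boilerplate, not something a real program would normally need:
+
+```rust
+use std::future::Future;
+use std::pin::pin;
+use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};
+
+// A waker that does nothing, so we can call `poll` by hand.
+fn noop_waker() -> Waker {
+    fn clone(_: *const ()) -> RawWaker {
+        RawWaker::new(std::ptr::null(), &VTABLE)
+    }
+    fn noop(_: *const ()) {}
+    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
+    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
+}
+
+fn main() {
+    // The compiler turns this block into an anonymous type that
+    // implements `Future<Output = i32>`; nothing runs until it is polled.
+    let fut = async { 20 + 22 };
+    let mut fut = pin!(fut);
+
+    let waker = noop_waker();
+    let mut cx = Context::from_waker(&waker);
+    // There are no `await` points inside, so the generated state machine
+    // completes on the very first poll.
+    assert_eq!(fut.as_mut().poll(&mut cx), Poll::Ready(42));
+}
+```
+
+An `async` block with `.await` points inside would instead return `Pending` from some polls, resuming at the saved `await` point on the next one.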
+
+### From Futures to Streams
+
+The [`futures`](https://docs.rs/futures/latest/futures/) crate extends the `Future` model with a trait named [`Stream`](https://docs.rs/futures/latest/futures/prelude/trait.Stream.html).
+`Stream` represents a sequence of values that are each produced asynchronously rather than just a single value.
+It's the asynchronous equivalent of `Iterator`.
+
+The `Stream` trait has one method named [`poll_next`](https://docs.rs/futures/latest/futures/prelude/trait.Stream.html#tymethod.poll_next) that returns:
+
+- `Poll::Pending` when the next value isn't ready yet, just like a `Future` would
+- `Poll::Ready(Some(value))` when a new value is available
+- `Poll::Ready(None)` when the stream is exhausted
+
+Under the hood, an implementation of `Stream` is very similar to a `Future`.
+Typically, they're also implemented as state machines, the main difference being that they produce multiple values rather than just one.
+Just like `Future`, a `Stream` is inert unless explicitly polled.
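+
+To make this concrete, here's a toy stream written by hand. The `Stream` trait below mirrors the shape of the one in the `futures` crate (redeclared locally so the sketch needs only the standard library), and `Counter` is a made-up implementation that yields a few numbers and then reports exhaustion:
+
+```rust
+use std::pin::Pin;
+use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};
+
+// Local stand-in for `futures::Stream`, same method shape.
+trait Stream {
+    type Item;
+    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
+}
+
+// Yields `0..limit`, keeping its position as resumable state,
+// just like a future keeps the state needed to resume.
+struct Counter {
+    next: u32,
+    limit: u32,
+}
+
+impl Stream for Counter {
+    type Item = u32;
+
+    fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<u32>> {
+        if self.next < self.limit {
+            let value = self.next;
+            self.next += 1;
+            Poll::Ready(Some(value)) // a new value is available
+        } else {
+            Poll::Ready(None) // the stream is exhausted
+        }
+    }
+}
+
+// No-op waker so we can drive the stream without a runtime.
+fn noop_waker() -> Waker {
+    fn clone(_: *const ()) -> RawWaker {
+        RawWaker::new(std::ptr::null(), &VTABLE)
+    }
+    fn noop(_: *const ()) {}
+    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
+    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
+}
+
+fn main() {
+    let waker = noop_waker();
+    let mut cx = Context::from_waker(&waker);
+    let mut stream = Counter { next: 0, limit: 3 };
+    let mut stream = Pin::new(&mut stream);
+
+    let mut collected = Vec::new();
+    while let Poll::Ready(Some(v)) = stream.as_mut().poll_next(&mut cx) {
+        collected.push(v);
+    }
+    assert_eq!(collected, vec![0, 1, 2]);
+}
+```
+
+A stream backed by I/O or heavy computation would also return `Pending` from some polls; `Counter` just never needs to.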
+
+Now that we understand the basics of Rust's async model, let's see how DataFusion leverages these concepts to execute queries.
+
+## How DataFusion Executes Queries
+
+In DataFusion, the short version of how queries are executed is as follows (you can find more in-depth coverage of this in the [DataFusion documentation](https://docs.rs/datafusion/latest/datafusion/#streaming-execution)):
+
+1. First the query is compiled into a tree of [`ExecutionPlan`](https://docs.rs/datafusion/latest/datafusion/physical_plan/trait.ExecutionPlan.html) nodes
+2. [`ExecutionPlan::execute`](https://docs.rs/datafusion/latest/datafusion/physical_plan/trait.ExecutionPlan.html#tymethod.execute) is called on the root of the tree
+3. This method returns a [`SendableRecordBatchStream`](https://docs.rs/datafusion/latest/datafusion/execution/type.SendableRecordBatchStream.html) (a pinned `Box<dyn RecordBatchStream + Send>`)
+4. `Stream::poll_next` is called in a loop to get the results
+
+In other words, the execution of a DataFusion query boils down to polling an asynchronous stream.
+Like all `Stream` implementations, we need to explicitly poll the stream for the query to make progress.
+
+The `Stream` we get in step 3 is actually the root of a tree of `Stream`s that mostly mirrors the execution plan tree.
+Each stream tree node processes the record batches it gets from its children.
+The leaves of the tree produce record batches themselves.
+
+Query execution progresses each time you call `poll_next` on the root stream.
+This call typically cascades down the tree, with each node calling `poll_next` on its children to get the data it needs to process.
+
+Here's where the first signs of problems start to show up: some operations (like aggregations, sorts, or certain join phases) need to process a lot of data before producing any output.
+When `poll_next` encounters one of these operations, it might require substantial work before it can return a record batch.
+
+### Tokio and Cooperative Scheduling
+
+We need to make a small detour now via Tokio's scheduler before we can get to the query cancellation problem.
+DataFusion makes use of the [Tokio asynchronous runtime](https://tokio.rs), which uses a [cooperative scheduling model](https://docs.rs/tokio/latest/tokio/task/index.html#what-are-tasks).
+This is fundamentally different from preemptive scheduling that you might be used to:
+
+- In **preemptive scheduling**, the system can interrupt a task at any time to run something else
+- In **cooperative scheduling**, tasks must voluntarily yield control back to the scheduler
+
+This distinction is crucial for understanding our cancellation problem.
+
+A task in Tokio is modeled as a `Future` which is passed to one of the task initiation functions like [`spawn`](https://docs.rs/tokio/latest/tokio/task/fn.spawn.html).
+Tokio runs the task by calling `Future::poll` in a loop until it returns `Poll::Ready`.
+While that `Future::poll` call is running, Tokio has no way to forcibly interrupt it.
+The task must cooperate by periodically yielding control, which it does by returning `Poll::Pending` or `Poll::Ready` from `poll`.
+
+Similarly, when you try to abort a task by calling [`JoinHandle::abort()`](https://docs.rs/tokio/latest/tokio/task/struct.JoinHandle.html#method.abort), the Tokio runtime can't immediately force it to stop.
+You're just telling Tokio: "When this task next yields control, don't call `Future::poll` anymore."
+If the task never yields, it can't be aborted.
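+
+This contract can be sketched without Tokio at all. `CooperativeWork` below is a made-up future that does one bounded unit of work per poll and then yields; the loop in `main` plays the role of the scheduler, and "aborting" is nothing more than deciding to stop calling `poll`:
+
+```rust
+use std::future::Future;
+use std::pin::Pin;
+use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};
+
+// A long-running "task" that cooperates: one unit of work per poll,
+// then it yields by returning `Pending`.
+struct CooperativeWork {
+    steps_left: u32,
+}
+
+impl Future for CooperativeWork {
+    type Output = ();
+
+    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
+        if self.steps_left == 0 {
+            return Poll::Ready(());
+        }
+        self.steps_left -= 1; // one bounded unit of work
+        cx.waker().wake_by_ref(); // ask to be polled again soon
+        Poll::Pending // voluntarily yield control to the scheduler
+    }
+}
+
+// No-op waker so the "scheduler" below can poll by hand.
+fn noop_waker() -> Waker {
+    fn clone(_: *const ()) -> RawWaker {
+        RawWaker::new(std::ptr::null(), &VTABLE)
+    }
+    fn noop(_: *const ()) {}
+    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
+    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
+}
+
+fn main() {
+    let waker = noop_waker();
+    let mut cx = Context::from_waker(&waker);
+    let mut task = CooperativeWork { steps_left: 1000 };
+    let mut task = Pin::new(&mut task);
+
+    // The "scheduler": poll until ready, but cancel after 10 yields by
+    // simply never calling `poll` again.
+    let mut polls = 0;
+    loop {
+        polls += 1;
+        if task.as_mut().poll(&mut cx).is_ready() || polls == 10 {
+            break;
+        }
+    }
+    // The remaining 990 steps never run: the future is inert once
+    // nobody polls it, and dropping it discards the work entirely.
+    assert_eq!(polls, 10);
+}
+```
+
+In spirit, this is what `abort` does: it marks the task so that, at its next yield point, the runtime stops polling it and drops the future. A `poll` that never returns gives the runtime no such opportunity.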
+
+### The Cancellation Problem
+
+With all the necessary background in place, now let's look at how the DataFusion CLI tries to run and cancel a query.
+The code below is a simplified version of [what the CLI actually does](https://github.com/apache/datafusion/blob/db13dd93579945628cd81d534c032f5e6cc77967/datafusion-cli/src/exec.rs#L179-L186):
+
+```rust
+fn exec_query() {
+ let runtime: tokio::runtime::Runtime = ...;
+ let stream: SendableRecordBatchStream = ...;
+
+ runtime.block_on(async {
+ tokio::select! {
+ next_batch = stream.next() => ...
+ _ = signal::ctrl_c() => ...,
+ }
+ })
+}
+```
+
+First the CLI sets up a Tokio runtime instance.
+It then reads the query to execute from standard input or a file and turns it into a `Stream`.
+Then it calls `next` on the stream, which is an `async` wrapper around `poll_next`.
+It passes this to the [`select!`](https://docs.rs/tokio/latest/tokio/macro.select.html) macro along with a ctrl-C handler.
+
+The `select!` macro races these two `Future`s and completes when either one finishes.
+The intent is that when you press Ctrl+C, the `signal::ctrl_c()` `Future` should complete.
+Dropping the stream [cancels it](https://docs.rs/datafusion/latest/datafusion/physical_plan/trait.ExecutionPlan.html#cancellation--aborting-execution): since the stream is inert by itself, nothing will ever call `poll_next` again.
+
+But there's a catch: `select!` still follows cooperative scheduling rules.
+It polls each `Future` in sequence, and if the first one (our query) gets stuck in a long computation, it never gets around to polling the cancellation signal.
+
+Imagine a query that needs to calculate something intensive, like sorting billions of rows.
+Unless the sorting Stream is written with care (which the one in DataFusion is), the `poll_next` call may take several minutes or even longer without returning.
+During this time, Tokio can't check if you've pressed Ctrl+C, and the query continues running despite your cancellation request.
+
+## A Closer Look at Blocking Operators
+
+Let's peel back a layer of the onion and look at what's happening in a blocking `poll_next` implementation.
+Here's a drastically simplified version of a `COUNT(*)` aggregation - something you might use in a query like `SELECT COUNT(*) FROM table`:
+
+```rust
+struct BlockingStream {
+ // the input: an inner stream that is wrapped
+ stream: SendableRecordBatchStream,
+ count: usize,
+ finished: bool,
+}
+
+impl Stream for BlockingStream {
+    type Item = Result<RecordBatch>;
+    fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {