Add telemetry on number of ops we are hoarding for delay-loaded data stores / DDSs #4616
Labels: focus, good first issue, perf, telemetry
Today, we hoard all the ops for unrealized data stores / DDSs. It may take a long time before we need to load a given data store / DDS, so the number of ops we then have to process could be huge. It would be great to have statistics on two fronts:
Here are some references to the code:
FluidDataStoreContext.pending is the array that tracks all ops for a data store while that data store is not realized.
We process these ops in bindRuntime(), and it would be great to at least add a telemetry event there with the number of ops we process, emitted only if that number is higher than some threshold (otherwise the event would be too noisy).
Ops are added in the process() method. Similarly, it might be good to record an event every time the number of pending ops reaches a multiple of 1000, or something like that.
Similar logic exists in RemoteChannelContext. The corresponding methods are processOp() and loadChannel(), respectively.
I'd skip LocalChannelContext for this task, as it's a corner case (serializing and loading back a detached container, then attaching that container, with some channels not yet realized?)
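To make the proposal concrete, here is a rough sketch of the two reporting points described above: a periodic event while ops accumulate, and a thresholded event when they are finally processed. It is not the actual FluidDataStoreContext or RemoteChannelContext code. `PendingOpsTracker`, `SketchLogger`, the event names, and the default threshold for the realize event are all hypothetical, and the real telemetry logger API is deliberately not reproduced here.

```typescript
// Hypothetical stand-in for a telemetry logger; NOT the real Fluid logger interface.
interface SketchLogger {
    send(event: { eventName: string; count: number }): void;
}

// Hypothetical helper that models the "pending" array of an unrealized data store / DDS.
class PendingOpsTracker<T> {
    private pending: T[] = [];
    private readonly logger: SketchLogger;
    private readonly realizeReportThreshold: number;
    private readonly accumulateReportInterval: number;

    constructor(
        logger: SketchLogger,
        realizeReportThreshold: number = 1000, // assumed default, not from the issue
        accumulateReportInterval: number = 1000, // "multiple of 1000 or something like that"
    ) {
        this.logger = logger;
        this.realizeReportThreshold = realizeReportThreshold;
        this.accumulateReportInterval = accumulateReportInterval;
    }

    // Plays the role of process() / processOp(): queue an op while unrealized,
    // and report each time the queue length hits a multiple of the interval.
    public addPendingOp(op: T): void {
        this.pending.push(op);
        if (this.pending.length % this.accumulateReportInterval === 0) {
            this.logger.send({ eventName: "PendingOpsAccumulated", count: this.pending.length });
        }
    }

    // Plays the role of bindRuntime() / loadChannel(): drain the queue on realization,
    // reporting only when the count exceeds the threshold to keep the event quiet.
    public drain(processOp: (op: T) => void): number {
        const ops = this.pending;
        this.pending = [];
        if (ops.length > this.realizeReportThreshold) {
            this.logger.send({ eventName: "PendingOpsProcessedOnRealize", count: ops.length });
        }
        for (const op of ops) {
            processOp(op);
        }
        return ops.length;
    }
}
```

In the real change the counting would presumably live inline in the existing methods rather than in a separate class. The helper is only there to keep the sketch self-contained.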