Description
context: I recently set about trying to find a faster Python interpreter for my async app, which has a high rate of context switching and many coroutines, each running different code. I found that the app runs 2x slower on PyPy, and 20% faster on Pyston-full. One reason may be that a tracing JIT naively traces across context switches, so that a given trace is never repeated, because coroutine resumes occur in arbitrary order.
There really seems to be nothing in the benchmark world that represents this kind of async workload-- which, by the way, I expect to become more popular with time. The current async benchmarks in pyperformance have nothing like this-- their coroutines are trivial and homogeneous.
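To make the workload shape concrete, here is a minimal sketch of what such a benchmark might look like-- the function names and sizes are illustrative, not taken from any existing suite. The point is that consecutive resumes on the event loop execute different coroutine bodies, so a tracing JIT sees a different code path on nearly every switch.

```python
import asyncio

# Hypothetical benchmark sketch: many heterogeneous coroutines with a
# high context-switch rate. Each worker runs different code, and
# `await asyncio.sleep(0)` forces a yield to the event loop on every
# iteration, so resumes interleave across different code paths.

async def worker_sum(n):
    total = 0
    for i in range(n):
        total += i
        await asyncio.sleep(0)  # yield: forces a context switch
    return total

async def worker_strings(n):
    parts = []
    for i in range(n):
        parts.append(str(i))
        await asyncio.sleep(0)
    return len("".join(parts))

async def worker_dict(n):
    d = {}
    for i in range(n):
        d[i] = i * i
        await asyncio.sleep(0)
    return len(d)

async def main(num_tasks=30, n=50):
    # Round-robin over distinct coroutine bodies so that adjacent tasks
    # on the loop run different code -- the pattern described above.
    workers = [worker_sum, worker_strings, worker_dict]
    tasks = [asyncio.ensure_future(workers[i % len(workers)](n))
             for i in range(num_tasks)]
    return await asyncio.gather(*tasks)

if __name__ == "__main__":
    results = asyncio.run(main())
    print(len(results))
```

A real benchmark would of course use larger, more varied coroutine bodies, but even this shape defeats the assumption that the code running after a switch is the same code that ran before it.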
It's concerning that the Pyston-full fork is being retired, while PyPy continues as if everything's OK, and faster-python proceeds at a furious pace-- all without evaluating a workload that is significant today, and may become more so in the future.