CRIU: Add tests to verify behaviour Executors in restore runs #15195

Open
tajila opened this issue Jun 2, 2022 · 1 comment
Labels
beta Used to track items that will be included in a feature beta release comp:vm criu Used to track CRIU snapshot related work

Comments

@tajila
Contributor

tajila commented Jun 2, 2022

Add tests to verify the behaviour of:

  • ScheduledThreadPoolExecutor
  • ThreadPoolExecutor
  • ForkJoinPool
@tajila tajila added comp:vm criu Used to track CRIU snapshot related work labels Jun 2, 2022
@tajila tajila added the beta Used to track items that will be included in a feature beta release label Jun 2, 2022
@JasonFengJ9
Member

From static code analysis:

ScheduledThreadPoolExecutor has four schedule methods:

schedule(Runnable command, long delay, TimeUnit unit)
schedule(Callable<V> callable, long delay, TimeUnit unit)
scheduleAtFixedRate(Runnable command, long initialDelay, long period, TimeUnit unit)
scheduleWithFixedDelay(Runnable command, long initialDelay, long delay, TimeUnit unit)

long triggerTime(long delay) [1] uses System.nanoTime() to calculate the time at which a delayed task becomes due.
The scheduled task is then added to super.getQueue(), the work queue held by the superclass ThreadPoolExecutor, to be run later.
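
To make the role of System.nanoTime() concrete, here is a minimal sketch of a deadline kept on the nanoTime timeline. It is not the JDK source; the class NanoDeadlineSketch and its methods are hypothetical names used only for illustration. The point is that both the deadline and the remaining wait are derived from nanoTime readings, so if nanoTime excludes the downtime, the remaining delay is unchanged across checkpoint/restore.

```java
import java.util.concurrent.TimeUnit;

// Illustrative sketch only (hypothetical names, not the JDK implementation).
final class NanoDeadlineSketch {
    /** Deadline on the System.nanoTime() timeline for a task scheduled at nowNanos. */
    static long triggerTime(long nowNanos, long delayNanos) {
        return nowNanos + Math.max(0L, delayNanos);
    }

    /** Remaining wait; using subtraction keeps the result correct if nanoTime wraps. */
    static long remainingNanos(long triggerNanos, long nowNanos) {
        return triggerNanos - nowNanos;
    }

    public static void main(String[] args) {
        long trigger = triggerTime(System.nanoTime(), TimeUnit.MILLISECONDS.toNanos(200));
        System.out.println("remaining ns: " + remainingNanos(trigger, System.nanoTime()));
    }
}
```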

The main worker run loop in ThreadPoolExecutor is runWorker(Worker w) -> getTask() -> workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS).
workQueue is an instance of BlockingQueue, which could be one of the concrete classes ArrayBlockingQueue, DelayQueue, LinkedBlockingQueue, PriorityBlockingQueue, ScheduledThreadPoolExecutor.DelayedWorkQueue & SynchronousQueue.
All of these classes except SynchronousQueue implement poll(long timeout, TimeUnit unit) -> Condition.awaitNanos(nanos).
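
A small sketch of the timed poll that an idle worker performs may help when writing the tests. It polls an empty LinkedBlockingQueue and reports how long the call blocked, measured with System.nanoTime(). The class and method names are hypothetical, and the sketch does not take a checkpoint; it only shows the call whose timeout behaviour would be observed across a restore.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only (hypothetical names).
final class TimedPollSketch {
    /** Polls a queue that is expected to be empty; returns the time spent blocked, in ns. */
    static long pollEmptyQueue(BlockingQueue<Runnable> queue, long timeoutMillis) {
        long start = System.nanoTime();
        try {
            // Same shape as the getTask() call: a timed poll on the work queue.
            Runnable task = queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (task != null) {
                throw new IllegalStateException("queue was expected to be empty");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while polling", e);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long elapsed = pollEmptyQueue(new LinkedBlockingQueue<>(), 100);
        System.out.println("blocked for ms: " + TimeUnit.NANOSECONDS.toMillis(elapsed));
    }
}
```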

Condition.awaitNanos(nanos) -> ReentrantLock.Sync.newCondition().awaitNanos(nanos) -> AbstractQueuedSynchronizer.ConditionObject.awaitNanos(long nanosTimeout), which uses System.nanoTime().
The code path using System.nanoTime() can take advantage of #15016 and ignore the JVM downtime between checkpoint and restore.
SynchronousQueue.poll(long timeout, TimeUnit unit) -> transferer.transfer(null, true, unit.toNanos(timeout)) -> awaitFulfill(s, timed, nanos) which uses System.nanoTime() as well.
So it appears ScheduledThreadPoolExecutor & ThreadPoolExecutor could ignore the JVM downtime and run the scheduled tasks normally. This will be verified through some tests.
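
One possible shape for such a test, as a sketch under stated assumptions: schedule a task with a delay, and compare the time at which it actually ran against the time of scheduling, both read from System.nanoTime(). The class ScheduledDelayCheck is a hypothetical name. The checkpoint itself is not shown and is only marked by a comment, since how it is triggered depends on the test harness.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only (hypothetical names; no checkpoint is taken here).
final class ScheduledDelayCheck {
    /** Schedules a task and returns the observed delay before it ran, in ms. */
    static long observedDelayMillis(long delayMillis) {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
        try {
            long start = System.nanoTime();
            Callable<Long> task = System::nanoTime; // records when the task ran
            ScheduledFuture<Long> future =
                    executor.schedule(task, delayMillis, TimeUnit.MILLISECONDS);
            // A CRIU test would take the checkpoint at this point, before the task is due.
            long ranAt = future.get();
            return TimeUnit.NANOSECONDS.toMillis(ranAt - start);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("scheduled task failed", e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("observed delay ms: " + observedDelayMillis(200));
    }
}
```

In a restore run, the expectation from the analysis above would be that the observed delay on the nanoTime timeline stays close to the requested delay regardless of how long the JVM was down.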

On the other hand, the ForkJoinPool top-level run loop for workers, runWorker(WorkQueue w), uses System.currentTimeMillis(), which is known to be sensitive to the downtime between checkpoint and restore. A JCL patch is required to support this in a CRIU JVM.
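
The sensitivity can be illustrated with plain arithmetic. The sketch below does not reproduce the ForkJoinPool code; it assumes only that an idle timeout is stored as a wall-clock deadline and later compared against the current wall-clock time. All names and numbers are hypothetical.

```java
// Illustrative sketch only (hypothetical names and values, not ForkJoinPool code).
final class WallClockDeadlineSketch {
    /** True if a keep-alive that started at parkedAtMillis looks expired at nowMillis. */
    static boolean keepAliveExpired(long parkedAtMillis, long keepAliveMillis, long nowMillis) {
        long deadline = parkedAtMillis + keepAliveMillis;
        return deadline - nowMillis <= 0;
    }

    public static void main(String[] args) {
        long parkedAt = 0L;            // wall-clock time when the worker went idle
        long keepAlive = 60_000L;      // 60 s keep-alive
        long idleTime = 1_000L;        // time the JVM actually spent running
        long downtime = 3_600_000L;    // one hour between checkpoint and restore

        // Without downtime the worker is still within its keep-alive.
        System.out.println(keepAliveExpired(parkedAt, keepAlive, parkedAt + idleTime));
        // The wall clock includes the downtime, so the keep-alive looks expired.
        System.out.println(keepAliveExpired(parkedAt, keepAlive, parkedAt + idleTime + downtime));
    }
}
```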

[1] https://github.com/ibmruntimes/openj9-openjdk-jdk11/blob/29b9a5c558b8730701d47f79d9151bac2da910d8/src/java.base/share/classes/java/util/concurrent/ScheduledThreadPoolExecutor.java#L524-L530
