Unhandled NoSuchElementException when looking for executable process while deploying BPMN resource #11414

Closed
lenaschoenburg opened this issue Jan 16, 2023 · 8 comments · Fixed by #12196
Assignees
Labels
area/reliability Marks an issue as related to improving the reliability of our software (i.e. it behaves as expected) area/ux Marks an issue as related to improving the user experience kind/bug Categorizes an issue or PR as a bug version:8.1.11 Marks an issue as being completely or in parts released in 8.1.11 version:8.3.0-alpha1 Marks an issue as being completely or in parts released in 8.3.0-alpha1 version:8.3.0 Marks an issue as being completely or in parts released in 8.3.0

Comments

@lenaschoenburg
Member

Describe the bug

 java.util.NoSuchElementException: No value present
	at java.util.Optional.orElseThrow(Unknown Source) ~[?:?]
	at io.camunda.zeebe.engine.state.deployment.DbProcessState.updateInMemoryState(DbProcessState.java:180) ~[zeebe-workflow-engine-8.2.0-alpha3.jar:8.2.0-alpha3]

To Reproduce
Unclear

Expected behavior

Invalid BPMN resources should be handled gracefully and not result in an unhandled NoSuchElementException.
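
For context on the top two stack frames, here is a minimal sketch of how an empty `Optional` produces exactly this exception and message when it is unwrapped with the no-argument `orElseThrow()` (Java 10+). The class, method, and variable names are hypothetical and are not taken from `DbProcessState`.

```java
import java.util.NoSuchElementException;
import java.util.Optional;

// Hypothetical demo class; it only illustrates the JDK behavior seen in the stack trace.
final class EmptyOptionalDemo {

  // Unwraps an empty Optional and returns the message of the resulting exception.
  static String unwrapFailureMessage() {
    final Optional<String> executableProcess = Optional.empty();
    try {
      executableProcess.orElseThrow();
      return "no exception";
    } catch (final NoSuchElementException e) {
      return e.getMessage();
    }
  }

  public static void main(final String[] args) {
    System.out.println(unwrapFailureMessage());
  }
}
```

The message carries no information about which process was being looked up, which is part of why the error is hard to act on.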

Log/Stacktrace

Full Stacktrace

 java.util.NoSuchElementException: No value present
	at java.util.Optional.orElseThrow(Unknown Source) ~[?:?]
	at io.camunda.zeebe.engine.state.deployment.DbProcessState.updateInMemoryState(DbProcessState.java:180) ~[zeebe-workflow-engine-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.engine.state.deployment.DbProcessState.lookupProcessByIdAndPersistedVersion(DbProcessState.java:314) ~[zeebe-workflow-engine-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.engine.state.deployment.DbProcessState.getLatestProcessVersionByProcessId(DbProcessState.java:217) ~[zeebe-workflow-engine-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.engine.processing.deployment.transform.BpmnResourceTransformer.transformProcessResource(BpmnResourceTransformer.java:138) ~[zeebe-workflow-engine-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.engine.processing.deployment.transform.BpmnResourceTransformer.lambda$transformResource$0(BpmnResourceTransformer.java:77) ~[zeebe-workflow-engine-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.util.Either$Right.map(Either.java:355) ~[zeebe-util-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.engine.processing.deployment.transform.BpmnResourceTransformer.lambda$transformResource$1(BpmnResourceTransformer.java:75) ~[zeebe-workflow-engine-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.util.Either$Right.flatMap(Either.java:366) ~[zeebe-util-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.engine.processing.deployment.transform.BpmnResourceTransformer.transformResource(BpmnResourceTransformer.java:65) ~[zeebe-workflow-engine-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.engine.processing.deployment.transform.DeploymentTransformer.transformResource(DeploymentTransformer.java:120) ~[zeebe-workflow-engine-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.engine.processing.deployment.transform.DeploymentTransformer.transform(DeploymentTransformer.java:97) ~[zeebe-workflow-engine-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.engine.processing.deployment.DeploymentCreateProcessor.processRecord(DeploymentCreateProcessor.java:96) ~[zeebe-workflow-engine-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.engine.Engine.process(Engine.java:127) ~[zeebe-workflow-engine-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.stream.impl.ProcessingStateMachine.lambda$processCommand$3(ProcessingStateMachine.java:264) ~[zeebe-stream-platform-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.db.impl.rocksdb.transaction.ZeebeTransaction.run(ZeebeTransaction.java:84) ~[zeebe-db-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.stream.impl.ProcessingStateMachine.processCommand(ProcessingStateMachine.java:260) ~[zeebe-stream-platform-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.stream.impl.ProcessingStateMachine.tryToReadNextRecord(ProcessingStateMachine.java:209) ~[zeebe-stream-platform-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.stream.impl.ProcessingStateMachine.readNextRecord(ProcessingStateMachine.java:185) ~[zeebe-stream-platform-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.scheduler.ActorJob.invoke(ActorJob.java:92) ~[zeebe-scheduler-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.scheduler.ActorJob.execute(ActorJob.java:45) ~[zeebe-scheduler-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.scheduler.ActorTask.execute(ActorTask.java:119) ~[zeebe-scheduler-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.scheduler.ActorThread.executeCurrentTask(ActorThread.java:106) ~[zeebe-scheduler-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.scheduler.ActorThread.doWork(ActorThread.java:87) ~[zeebe-scheduler-8.2.0-alpha3.jar:8.2.0-alpha3]
	at io.camunda.zeebe.scheduler.ActorThread.run(ActorThread.java:198) ~[zeebe-scheduler-8.2.0-alpha3.jar:8.2.0-alpha3] 

Environment:

  • SaaS
  • Zeebe Version: 8.2.0-alpha3

Error group

@lenaschoenburg lenaschoenburg added kind/bug Categorizes an issue or PR as a bug area/ux Marks an issue as related to improving the user experience area/reliability Marks an issue as related to improving the reliability of our software (i.e. it behaves as expected) labels Jan 16, 2023
@lenaschoenburg
Member Author

Might be related to #11392

@korthout
Member

We need to understand the impact before we can prioritize this. Let's investigate whether this leads to a blacklisted instance.

@remcowesterhoud
Contributor

I had a look into this with @koevskinikola. We suspect that somehow, during the deployment, the process metadata in the deployment record does not contain a process that we do have in the BPMN file. As a result, we have a process in the BPMN file that we have not stored as a PersistedProcess, resulting in this exception.

We are unsure how this could occur and how we can reproduce it. The exception has occurred 3 times in trial clusters thus far.

@remcowesterhoud
Contributor

remcowesterhoud commented Feb 3, 2023

We discussed this within the team and will have another look at the issue together during our next mob-programming hour, two weeks from now.

@korthout korthout self-assigned this Feb 16, 2023
@korthout
Member

Today, the team looked again at this issue. Having no way to reproduce it nor a way to find out how it might have happened makes this hard to debug.

I've had another look at several parts:

  • ❌ could this have been caused by "NPE during transformation of BPMN model" (#11392)? I checked multiple scenarios (e.g. deploying valid processes with IDs that were already used in deployments that encountered that bug)
  • ❓ what is this code actually doing?

This question led me down a path where I noticed something:

  • when we deploy a BPMN file, we transform it
  • then, for the duplication check, we look up the latest deployed version
  • this either uses the cached version, or it will read it from the state and then cache it
  • if it wasn't cached already but does exist in the state, it will then proceed to transform the persisted process XML
  • there is no real reason why we transform it here. The latest version is only used to determine version duplicates (and in that case, return the same key, version, etc as a response)
  • but this is the transformation where this specific bug happened.
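
The lookup path described in the list above can be sketched roughly as follows. This is a simplified, hypothetical model and not the actual `DbProcessState` code: the maps stand in for the persisted state and the in-memory cache, the `Function` stands in for the BPMN transformation, and all names are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical, simplified model of the cache-or-load lookup; not the real engine code.
final class LatestProcessLookupSketch {
  // Stand-in for the persisted state: process id -> stored BPMN XML.
  private final Map<String, String> persistedXmlByProcessId = new HashMap<>();
  // Stand-in for the in-memory cache: process id -> transformed (executable) process.
  private final Map<String, String> cachedExecutableByProcessId = new HashMap<>();
  // Stand-in for the transformation; returns empty when no executable process is found.
  private final Function<String, Optional<String>> transformer;
  private int transformations = 0;

  LatestProcessLookupSketch(final Function<String, Optional<String>> transformer) {
    this.transformer = transformer;
  }

  void persist(final String processId, final String xml) {
    persistedXmlByProcessId.put(processId, xml);
  }

  int transformations() {
    return transformations;
  }

  // Returns the latest executable process, or null if the id was never deployed.
  String getLatestProcess(final String processId) {
    final String cached = cachedExecutableByProcessId.get(processId);
    if (cached != null) {
      return cached; // cache hit: no state read and no transformation
    }
    final String xml = persistedXmlByProcessId.get(processId);
    if (xml == null) {
      return null; // nothing deployed under this id yet
    }
    transformations++;
    // Cache miss: the persisted XML is transformed again. An empty result here
    // surfaces as NoSuchElementException("No value present").
    final String executable = transformer.apply(xml).orElseThrow();
    cachedExecutableByProcessId.put(processId, executable);
    return executable;
  }

  public static void main(final String[] args) {
    final LatestProcessLookupSketch lookup =
        new LatestProcessLookupSketch(xml -> Optional.of("executable:" + xml));
    lookup.persist("order-process", "<bpmn/>");
    lookup.getLatestProcess("order-process"); // reads state and transforms
    lookup.getLatestProcess("order-process"); // served from the cache
    System.out.println(lookup.transformations());
  }
}
```

In this model the exception can only occur on a cache miss, which matches the observation that the failing transformation sits on the read-from-state path.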

So we could perhaps swat two flies at once (this is a Dutch saying):

  • remove the code path that was part of this bug
  • improve the performance when a new process version is deployed, when the latest isn't cached anymore (not sure if that ever happens, though).

I'm curious as to what others think about this. Is that worth it? WDYT?

//cc @remcowesterhoud @koevskinikola @berkaycanbc

@koevskinikola
Member

koevskinikola commented Feb 17, 2023

Hey @korthout, I would vote for the following:

  1. Close this issue since we don't have enough data to reproduce and qualify it.
  2. Create an issue for optimizing the performance of the deployment of new process versions (with a low priority).
    • I'm not sure if we should work on this soon, or wait until we understand the bug.

My reasoning is that currently we still don't understand why the bug is happening. The code you're suggesting for removal is just the place where the issue becomes visible (through a NoSuchElementException). By removing that code, we might miss any new occurrences of the bug, so it will become more difficult to detect and reproduce.

UPDATE: Maybe we can wrap the existing exception here into a more understandable message like: "This might indicate a bug with our deployment process, please raise an issue with our GitHub tracker"?

@korthout
Member

korthout commented Feb 20, 2023

Thanks @koevskinikola

  1. Agreed 👍
  2. I don't think the performance improvement is a good enough reason to open an issue. The improvement could be tiny. I agree with your reasoning that "it will become more difficult to detect and reproduce" 👍

> UPDATE: Maybe we can wrap the existing exception here into a more understandable message like: "This might indicate a bug with our deployment process, please raise an issue with our GitHub tracker"?

👍 I like this idea, but perhaps we can change the message. Likely, we read the error message in our own environment. So there is no need to ask users to report the bug if we already know it exists. Let's add some details about the current scenario instead.

EDIT: I've moved this back into READY, so we can make an effort to root cause this

@berkaycanbc
Contributor

Implemented a PR to create a detailed error message. We should re-open this issue if we encounter the following log message:

Expected to find executable process in persisted process with key '%s', but after transformation no such executable process could be found

cc: @korthout
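
As an illustration of the approach, here is a minimal sketch of replacing the bare `orElseThrow()` with a supplier that builds the message quoted above. The exception type (`IllegalStateException`), the class name, and the method signature are assumptions made for this example; the actual change is in the linked PR.

```java
import java.util.Optional;

// Hypothetical helper; only the message text is taken from this thread.
final class DescriptiveLookupFailure {
  static final String MESSAGE_FORMAT =
      "Expected to find executable process in persisted process with key '%s',"
          + " but after transformation no such executable process could be found";

  // Returns the executable process, or fails with a message naming the process key.
  static String requireExecutableProcess(
      final long processDefinitionKey, final Optional<String> executableProcess) {
    return executableProcess.orElseThrow(
        () -> new IllegalStateException(String.format(MESSAGE_FORMAT, processDefinitionKey)));
  }

  public static void main(final String[] args) {
    try {
      requireExecutableProcess(2251799813685249L, Optional.empty());
    } catch (final IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Unlike "No value present", this message identifies the affected persisted process, which gives a starting point for finding the root cause.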

@ghost ghost closed this as completed in 39ba69c Apr 3, 2023
ghost pushed a commit that referenced this issue Apr 19, 2023
12219: [Backport stable/8.0] refactor(engine): specify an error message when deployed process id not found in state r=berkaycanbc a=backport-action

# Description
Backport of #12196 to `stable/8.0`.

relates to #11414

Co-authored-by: berkaycanbc <berkay.can@camunda.com>
ghost pushed a commit that referenced this issue Apr 19, 2023
12220: [Backport stable/8.1] refactor(engine): specify an error message when deployed process id not found in state r=berkaycanbc a=backport-action

# Description
Backport of #12196 to `stable/8.1`.

relates to #11414

Co-authored-by: berkaycanbc <berkay.can@camunda.com>
ghost pushed a commit that referenced this issue Apr 20, 2023
12220: [Backport stable/8.1] refactor(engine): specify an error message when deployed process id not found in state r=berkaycanbc a=backport-action

# Description
Backport of #12196 to `stable/8.1`.

relates to #11414

Co-authored-by: berkaycanbc <berkay.can@camunda.com>
ghost pushed a commit that referenced this issue Apr 20, 2023
12220: [Backport stable/8.1] refactor(engine): specify an error message when deployed process id not found in state r=oleschoenburg a=backport-action

# Description
Backport of #12196 to `stable/8.1`.

relates to #11414

12429: [Backport stable/8.1] test(qa): save logs of zeebe containers if the test fails r=oleschoenburg a=backport-action

# Description
Backport of #12428 to `stable/8.1`.

relates to #12396

12468: [Backport 8.1]: skip unnecessary blacklist check r=oleschoenburg a=Zelldon

## Description

Backport  #12306
<!-- Please explain the changes you made here. -->

## Related issues

<!-- Which issues are closed by this PR or are related -->

closes to #12041



Co-authored-by: berkaycanbc <berkay.can@camunda.com>
Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
Co-authored-by: Meggle (Sebastian Bathke) <sebastian.bathke@camunda.com>
ghost pushed a commit that referenced this issue Apr 20, 2023
12220: [Backport stable/8.1] refactor(engine): specify an error message when deployed process id not found in state r=oleschoenburg a=backport-action

# Description
Backport of #12196 to `stable/8.1`.

relates to #11414

Co-authored-by: berkaycanbc <berkay.can@camunda.com>
@lenaschoenburg lenaschoenburg added the version:8.1.11 Marks an issue as being completely or in parts released in 8.1.11 label Apr 21, 2023
@remcowesterhoud remcowesterhoud added version:8.3.0-alpha1 Marks an issue as being completely or in parts released in 8.3.0-alpha1 release/8.0.14 and removed version:8.3.0-alpha1 Marks an issue as being completely or in parts released in 8.3.0-alpha1 labels May 3, 2023
@remcowesterhoud remcowesterhoud added the version:8.3.0-alpha1 Marks an issue as being completely or in parts released in 8.3.0-alpha1 label May 3, 2023
@megglos megglos added the version:8.3.0 Marks an issue as being completely or in parts released in 8.3.0 label Oct 5, 2023
This issue was closed.
6 participants