Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[modbus] Fix error handling with DNS resolution / Unknown host errors #3490

Merged
merged 1 commit into from
Apr 15, 2023

Conversation

ssalonen
Copy link
Contributor

@ssalonen ssalonen commented Mar 27, 2023

Potential fix for unhandled exception, leading to "unrecoverable" error state (openHAB restart needed). See the community post below.

I cannot verify whether this actually fixes the issue as the error is very rare

This would be good candidate for 3.x maintenance branch as well

  • Corrected null typing -- apparently IDE was not strict enough to complain
  • handle case of ModbusSlaveConnectionFactoryImpl::create returning null ModbusSlaveConnection, leading to activateObject called with PooledObject(null) [NPE was handled], and then leading to validateObject called with PooledObject(null) [was not handled, p.getObject().isConnected() crashes -- probably bad things happened!]

From https://community.openhab.org/t/modbus-binding-problems-with-silently-disconnected-items/139363/28?u=ssalonen

logs from user showing error in ModbusSlaveConnectionFactoryImpl:create calling getInetAddress. getInetAddress fails which leads to null connection being created.

2023-03-20 21:46:35.356 [WARN ] [ing.ModbusSlaveConnectionFactoryImpl] - KeyedPooledModbusSlaveConnectionFactory: Unknown host: REDACTED.fritz.box: Temporary failure in name resolution. Connection creation failed.
2023-03-20 21:46:35.359 [WARN ] [ing.ModbusSlaveConnectionFactoryImpl] - Error connecting connection null for endpoint ModbusIPSlaveEndpoint [address=REDACTED.fritz.box, port=502]: null

followed then by

2023-03-21 10:00:02.536 [WARN ] [rt.modbus.internal.ModbusManagerImpl] - Error getting a new connection for endpoint ModbusIPSlaveEndpoint [address=REDACTED.fritz.box, port=502]. Error was: java.lang.InterruptedException null 
2023-03-21 10:00:02.538 [WARN ] [rt.modbus.internal.ModbusManagerImpl] - Could not connect to endpoint ModbusIPSlaveEndpoint [address=REDACTED.fritz.box, port=502] -- aborting request ModbusReadRequestBlueprint [slaveId=1, functionCode=READ_INPUT_REGISTERS, start=8197, length=59, maxTries=3] [operation ID 14e6d99f-c13e-4b7b-8e86-3615c813ca1f]

@ssalonen ssalonen requested a review from a team as a code owner March 27, 2023 18:01
@openhab-bot
Copy link
Collaborator

This pull request has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/modbus-binding-problems-with-silently-disconnected-items/139363/28

@ssalonen
Copy link
Contributor Author

Unrelated jenkins error:

org.openhab.core.thing: The following files had format violations:

@wborn
Copy link
Member

wborn commented Mar 28, 2023

Unrelated jenkins error

I side with Jenkins because you added the file with the format error. 🙂

Signed-off-by: Sami Salonen <ssalonen@gmail.com>
@ssalonen ssalonen force-pushed the modbus-npe-fix-on-connection branch from ca676af to 905d5ae Compare March 28, 2023 13:56
@ssalonen
Copy link
Contributor Author

Thanks @wborn 👍 Now re-pushed the relevant stuff

public ModbusSlaveConnection create(ModbusSlaveEndpoint endpoint) throws Exception {
return endpoint.accept(new ModbusSlaveEndpointVisitor<ModbusSlaveConnection>() {
public @Nullable ModbusSlaveConnection create(ModbusSlaveEndpoint endpoint) throws Exception {
return endpoint.accept(new ModbusSlaveEndpointVisitor<@Nullable ModbusSlaveConnection>() {
Copy link
Contributor Author

@ssalonen ssalonen Mar 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the thing IDE did not catch -- the anonymous ModbusSlaveEndpointVisitor methods were returning @Nullable ModbusSlaveConnection even though the generic type argument was <ModbusSlaveConnection>

@@ -238,8 +249,9 @@ public void passivateObject(ModbusSlaveEndpoint endpoint, @Nullable PooledObject
}

@Override
public boolean validateObject(ModbusSlaveEndpoint key, @Nullable PooledObject<ModbusSlaveConnection> p) {
boolean valid = p != null && p.getObject().isConnected();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one likely raised unhandled NullPointerException before -- with unknown implications. Might not be expected in apache commons pool2

}

@Override
public void activateObject(ModbusSlaveEndpoint endpoint, @Nullable PooledObject<ModbusSlaveConnection> obj)
throws Exception {
public void activateObject(ModbusSlaveEndpoint endpoint,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method raised NPE as well, but it was caught by the catch(Exception e) block below... leading to message logged.

@ssalonen ssalonen changed the title Modbus npe fix on connection [modbus] Fix for error handling with DNS resolution / Unknown host errors Mar 28, 2023
@ssalonen ssalonen changed the title [modbus] Fix for error handling with DNS resolution / Unknown host errors [modbus] Fix error handling with DNS resolution / Unknown host errors Mar 28, 2023
@J-N-K J-N-K added rebuild Triggers the Jenkins PR build and removed rebuild Triggers the Jenkins PR build labels Apr 2, 2023
Copy link
Member

@J-N-K J-N-K left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks

@J-N-K J-N-K merged commit fdea66a into openhab:main Apr 15, 2023
@J-N-K J-N-K added the bug An unexpected problem or unintended behavior of the Core label Apr 15, 2023
@J-N-K J-N-K added this to the 4.0 milestone Apr 15, 2023
splatch pushed a commit to ConnectorIO/copybara-hab-core that referenced this pull request Jul 12, 2023
…#3490)

Signed-off-by: Sami Salonen <ssalonen@gmail.com>
GitOrigin-RevId: fdea66a
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug An unexpected problem or unintended behavior of the Core
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants