wip(protocol-next): implementing another heat pump protocol #52

Draft
wants to merge 11 commits into
base: main

Conversation

tspopp
Owner

@tspopp tspopp commented Nov 3, 2024

Implementing another heat pump protocol

Help

⚠️ This PR needs help from the community, as I don't have a machine that uses the new protocol ⚠️

Missing Features / Open Tasks:

  • Checksum Calculation
  • MITM
    • Passthrough (untested)
    • Error Passthrough (untested)
    • Override Values
  • HMI Message
    • time/date
    • target temperature
    • operation types (timer/always-on)
    • operation modes (boost, absence, eco...)
    • air-duct config (int/int..)
    • anti-legionella setting
    • enable emergency mode
    • disable heat-element
    • enable pv-input
    • timer window a/b
    • test modes
    • fan exhaust config (INT/EXT)
    • error request id / error number
  • Main Message
    • hot water temp
    • input air temp
    • lower evaporator temp
    • upper evaporator temp
    • NEW: compressor outlet temp
    • states: heatpump, heatelement, ext, fan, defrost, pv-input, solar-input...
    • ...
  • Energy Message
    • ...
  • Error Message
    • errorCode
    • requestId
    • optional: error message content
  • Compatibility to legacy protocol
  • Refactoring

Status Quo:

This branch contains a modified version of AquaMQTT that is meant to be installed in LISTENER mode. It can currently identify HMI, main, and energy messages from the new protocol (#45) and publishes them to MQTT on three topics:

aquamqtt/hmi/debug
aquamqtt/main/debug
aquamqtt/energy/debug

The new protocol findings are documented within PROTOCOL_NEXT.md

Tracing

Using the debug Python script https://github.com/tspopp/AquaMQTT/blob/main/tools/debug.py you can record changing messages over time and identify which byte position holds which attribute.

Create the python environment

python3 -m venv venv
source venv/bin/activate
pip install paho-mqtt

Run the python script

source venv/bin/activate
python3 debug.py

How to help here?

  • start a recording of the serial messages
  • change a single (previously unidentified) value at the hmi, e.g. water target temperature from 50 to 52
  • stop the recording of the serial message
  • attach the traces together with a description ("changed water target temperature from 50 to 52 and back")

@tspopp tspopp changed the title wip(protocol-next): first steps wip(protocol-next): implementing another heat pump protocol Nov 3, 2024
@tspopp
Owner Author

tspopp commented Nov 3, 2024

@taloriko I've already added parsing of time and date from the hmi message. You may want to check if the values are correct.

@tspopp tspopp linked an issue Nov 3, 2024 that may be closed by this pull request
@taloriko

taloriko commented Nov 4, 2024

Okay, that was my first PR that I used, but it seems to be working quite well.

These new topics were generated:

[image]

Check Time and Date:

HMI set to 17:00 --> OK
[image]

HMI set to 16:00 --> OK
[image]

HMI set to 04 November --> Day OK | Month --> NOK | Year --> OK
[image]

I initially tried to locate the bytes for HMI debugging. At first glance, the byte positions are similar to the previous ones. Here is the initial list of what I was able to reproduce on the HMI in a short amount of time.

HMI Debug

  1. Byte ????

  2. Byte --> target temperature -->HEX
    • 40 in Hex: 28
    • 41 in Hex: 29
    • 42 in Hex: 2A
    • 43 in Hex: 2B
    • 44 in Hex: 2C

In this dump, I set the target temperature from 40°C to 44°C in individual increments:

2228110000001000062C01D0020000000B44312310000000004E450000060422013E
2228110000001000062C01D0020000000C44312310000000004E450000060422013E
2228110000001000062C01D0020000000D44312310000000004E450000060422013E
2228110000001000062C01D0020000000E44312310000000004E450000060422013E
2228110000001000062C01D0020000000F44312310000000004E450000060422013E
2228110000001000062C01D0020000001044312310000000004E450000060422013E
2229110000001000062C01D0020000001144312310000000004E450000060422013E
2229110000001000062C01D0020000001244312310000000004E450000060422013E
2229110000001000062C01D0020000001344312310000000004E450000060422013E
222A110000001000062C01D0020000001444312310000000004E450000060422013E
222A110000001000062C01D0020000001544312310000000004E450000060422013E
222A110000001000062C01D0020000001644312310000000004E450000060422013E
222B110000001000062C01D0020000001644312310000000004E450000060422013E
222B110000001000062C01D0020000001744312310000000004E450000060422013E
222B110000001000062C01D0020000001844312310000000004E450000060422013E
222B110000001000062C01D0020000001944312310000000004E450000060422013E
222C110000001000062C01D0020000001944312310000000004E450000060422013E
222C110000001000062C01D0020000001A44312310000000004E450000060422013E
222C110000001000062C01D0020000001B44312310000000004E450000060422013E
222C110000001000062C01D0020000001C44312310000000004E450000060422013E
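The mapping above can be checked with a few lines of Python. This is only a minimal sketch: it assumes the target temperature is stored as a plain byte in whole degrees Celsius at the second position (index 1 when counting from 0), which is what the dump suggests. The helper name `target_temperature` is made up for illustration and is not part of AquaMQTT.

```python
def target_temperature(frame_hex: str) -> int:
    """Return the second byte of an HMI frame, read as whole degrees Celsius."""
    frame = bytes.fromhex(frame_hex)
    return frame[1]


# Three frames copied from the dump above (first, seventh and seventeenth line).
frames = [
    "2228110000001000062C01D0020000000B44312310000000004E450000060422013E",
    "2229110000001000062C01D0020000001144312310000000004E450000060422013E",
    "222C110000001000062C01D0020000001944312310000000004E450000060422013E",
]

for frame in frames:
    print(target_temperature(frame))
```

For these three frames the values are 40, 41 and 44, matching the hex table (28, 29, 2C).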

  3. Byte --> Operating mode

10 = timer operation | Auto
11 = timer operation | Eco/Manual, Eco=active
12 = timer operation | Eco/Manual, Eco=inactive
15 = timer operation | Absence
19 = timer operation | Boost
40 = continuous operation | Auto
41 = continuous operation | Eco/Manual, Eco=active
42 = continuous operation | Eco/Manual, Eco=inactive
45 = continuous operation | Absence
49 = continuous operation | Boost

In this dump, I switched the operating mode under "timer operation" from Auto --> Eco/Manual, Eco=inactive --> Eco/Manual, Eco=active:

2200100000001000062C01D0020000002F44313510000000004E450000060422013E
2200100000001000062C01D0020000003044313510000000004E450000060422013E
2200100000001000062C01D0020000003144313510000000004E450000060422013E
222B120000001000062C01D0020000003244313510000000004E450000060422013E
222B120000001000062C01D0020000003344313510000000004E450000060422013E
222B120000001000062C01D0020000003444313510000000004E450000060422013E
222B120000001000062C01D0020000003544313510000000004E450000060422013E
222B120000001000062C01D0020000003644313510000000004E450000060422013E
222B120000001000062C01D0020000003744313510000000004E450000060422013E
222B120000001000062C01D0020000003844313510000000004E450000060422013E
222B120000001000062C01D0020000003944313510000000004E450000060422013E
222B110000001000062C01D0020000003944313510000000004E450000060422013E
222B110000001000062C01D0020000003A44313510000000004E450000060422013E
222B110000001000062C01D0020000003B44313510000000004E450000060422013E
222B110000001000062C01D0020000000044313610000000004E450000060422013E
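The operating-mode table can be turned into a small lookup for checking traces. This is a hedged sketch: it assumes the listed values are hexadecimal (the dump shows 10, 12 and 11 in the third byte, index 2 when counting from 0), and the names `OPERATING_MODES` and `operating_mode` are hypothetical, not taken from the AquaMQTT sources.

```python
# Values copied from the table above, interpreted as hex (assumption).
OPERATING_MODES = {
    0x10: ("timer", "auto"),
    0x11: ("timer", "eco active"),
    0x12: ("timer", "eco inactive"),
    0x15: ("timer", "absence"),
    0x19: ("timer", "boost"),
    0x40: ("continuous", "auto"),
    0x41: ("continuous", "eco active"),
    0x42: ("continuous", "eco inactive"),
    0x45: ("continuous", "absence"),
    0x49: ("continuous", "boost"),
}


def operating_mode(frame_hex: str) -> tuple[str, str]:
    """Look up the third byte of an HMI frame in the table."""
    value = bytes.fromhex(frame_hex)[2]
    return OPERATING_MODES.get(value, ("unknown", "unknown"))


# First, fourth and twelfth line of the dump above.
print(operating_mode("2200100000001000062C01D0020000002F44313510000000004E450000060422013E"))
print(operating_mode("222B120000001000062C01D0020000003244313510000000004E450000060422013E"))
print(operating_mode("222B110000001000062C01D0020000003944313510000000004E450000060422013E"))
```

The three frames decode to auto, eco inactive and eco active in timer operation, which is the order in which the modes were changed.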

If the mode is set to Auto, the target temperature byte is 00, and the HMI does not show a target temperature either.

  1. Byte --> air connection / anti-legionella

0X --> recirculated air
1X --> one air connection
2X --> two air connections
X0 --> anti-legionella --> inactive
X1 --> anti-legionella --> 1 Month Cycle
X2 --> anti-legionella --> 2 Month Cycle
X3 --> anti-legionella --> 3 Month Cycle
X4 --> anti-legionella --> 4 Month Cycle
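The "0X / X0" notation means that one byte carries two settings, one per nibble. A minimal sketch of splitting such a byte follows. The function and table names are invented for illustration, and the byte position within the HMI message is deliberately left out because it is not stated here.

```python
# High nibble: air connection, as listed above.
AIR_CONNECTION = {
    0: "recirculated air",
    1: "one air connection",
    2: "two air connections",
}


def decode_air_and_legionella(value: int) -> tuple[str, int]:
    """Split one byte into (air connection, anti-legionella setting).

    The anti-legionella setting is returned as the raw low nibble:
    0 means inactive, 1 to 4 select the cycle listed above.
    """
    high = value >> 4
    low = value & 0x0F
    return AIR_CONNECTION.get(high, "unknown"), low


print(decode_air_and_legionella(0x12))
print(decode_air_and_legionella(0x00))
```

With this split, a byte of 0x12 would read as one air connection with anti-legionella setting 2, and 0x00 as recirculated air with anti-legionella inactive.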

  1. Byte --> emergency heating

00 --> inactive
01 --> active

  1. Byte --> heating element

00 --> Automatic
04 --> locked
06 --> PV System <> Yes

I hope the initial steps are okay for you.

@tspopp
Owner Author

tspopp commented Nov 4, 2024

Yes, looks great so far. I will incorporate the findings later. One more hint, if you use the python script, you will get dumps in hex and dec representation. It is WAY easier to spot the changed attributes. For example:

e.g. in dec

2024-11-04 18:39:09.587725,34 50 18 0 0 0 16 0 6 44 1 208 2 0 0 0 0 33 30 13 12 0 0 0 0 78 69 0 0 6 4 34 1 62
2024-11-04 18:39:10.135443,34 50 18 0 0 0 16 0 6 44 1 208 2 0 0 0 2 33 30 13 12 0 0 0 0 78 69 0 0 6 4 34 1 62

or in hex

2024-11-04 18:39:09.587725,2232120000001000062C01D00200000000211E0D0C000000004E450000060422013E
2024-11-04 18:39:10.135443,2232120000001000062C01D00200000002211E0D0C000000004E450000060422013E
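Spotting the changed position can also be automated. The sketch below compares two frames in the decimal representation and lists every index that differs. It is a stand-alone helper written for this example (the name `changed_bytes` is an assumption), not a part of debug.py.

```python
def changed_bytes(before: str, after: str) -> list[tuple[int, int, int]]:
    """Return (index, old, new) for every byte position that differs."""
    old = [int(v) for v in before.split()]
    new = [int(v) for v in after.split()]
    if len(old) != len(new):
        raise ValueError("frames differ in length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(old, new)) if a != b]


# The two decimal frames from the example above, without the timestamps.
before = "34 50 18 0 0 0 16 0 6 44 1 208 2 0 0 0 0 33 30 13 12 0 0 0 0 78 69 0 0 6 4 34 1 62"
after = "34 50 18 0 0 0 16 0 6 44 1 208 2 0 0 0 2 33 30 13 12 0 0 0 0 78 69 0 0 6 4 34 1 62"

print(changed_bytes(before, after))  # [(16, 0, 2)]
```

Only index 16 (counting from 0) differs between the two frames, changing from 0 to 2.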

I think you've missed the operation modes Boost and Absence/Vacation. Once we have these, the operation modes are complete 🎉

@tspopp
Owner Author

tspopp commented Nov 4, 2024

I updated the implementation based on your findings. Please have a look at PROTOCOL_NEXT.md. You could also edit that file directly, or add comments to the document by reviewing this PR. That would actually be a nice way to do this 😬

It might also be helpful to have a look at the pre-existing PROTOCOL.md, since I expect that there will be a lot of similarities :)

@taloriko

taloriko commented Nov 5, 2024

I can't edit the PROTOCOL_NEXT.md file directly. I'm using GitHub for the first time and hope that the PR Patch 1 is suitable.

I'm currently stuck on the timer. Counting from byte 0, the changes start at index 9.

I have a few examples:

Timer 07:00-12:00 | 16:00-22:00 | 11h

34 48 18 0 0 0 48 0 6 192 3 124 1 0 0 0 34 69 49 47 16 0 0 0 0 78 69 0 0 164 1 44 1 62

Timer 08:00-12:00 | 16:00-22:00 | 10h

34 43 17 0 0 0 16 0 6 192 3 104 1 0 0 0 7 68 49 3 18 0 0 0 0 78 69 0 0 224 1 240 0 62

Timer 09:00-13:00 | 17:00-23:00 | 10h

34 43 17 0 0 0 16 0 6 252 3 104 1 0 0 0 58 68 49 8 18 0 0 0 0 78 69 0 0 28 2 240 0 62

Timer 01:00-05:00 | 10:00-18:00 | 12h

34 48 18 0 0 0 48 0 6 88 2 224 1 0 0 0 54 69 49 27 16 0 0 0 0 78 69 0 0 60 0 240 0 62

Timer 01:00-06:00 | 10:00-18:00 | 13h

34 48 18 0 0 0 48 0 6 88 2 224 1 0 0 0 12 69 49 35 16 0 0 0 0 78 69 0 0 60 0 44 1 62

Timer 00:00-06:00 | 10:00-18:00 | 14h

34 48 18 0 0 0 48 0 6 88 2 224 1 0 0 0 38 69 49 37 16 0 0 0 0 78 69 0 0 60 0 44 1 62

@taloriko

taloriko commented Nov 6, 2024

To narrow down the results more precisely, I only changed the beginning of the first time window and observed the respective outcome.

Byte 9 could represent the minutes since 00:00.
Up to 4:00, the minutes fit into the first byte.
At 5:00, Byte 10 comes into play: 44 and 1 → (256x1) + 44 = 300 / 60 = 5
At 6:00, however, it no longer fits: (256x3) + 192 = 960 / 60 = 16

From 6:00, it looks like it represents minutes until midnight.

At 7:00, it doesn’t fit anymore – or am I calculating it incorrectly?

Starting from 6 a.m., it seems that Byte 9 and 10 switch with Byte 29 and 30, and thus the time windows change.
Or my assumption may be incorrect.

00:00-12:00 | 16:00-22:00 | 18h,22,48,18,0,0,0,48,0,2,0,0,208,2,0,0,0,40,70,49,54,16,0,0,0,0,78,69,0,0,192,3,104,1
01:00-12:00 | 16:00-22:00 | 17h,22,48,18,0,0,0,48,0,2,60,0,148,2,0,0,0,7,70,49,55,16,0,0,0,0,78,69,0,0,192,3,104,1
02:00-12:00 | 16:00-22:00 | 16h,22,48,18,0,0,0,48,0,2,120,0,88,2,0,0,0,47,70,49,55,16,0,0,0,0,78,69,0,0,192,3,104,1
03:00-12:00 | 16:00-22:00 | 15h,22,48,18,0,0,0,48,0,2,180,0,28,2,0,0,0,34,70,49,56,16,0,0,0,0,78,69,0,0,192,3,104,1
04:00-12:00 | 16:00-22:00 | 14h,22,48,18,0,0,0,48,0,2,240,0,224,1,0,0,0,15,70,49,57,16,0,0,0,0,78,69,0,0,192,3,104,1
05:00-12:00 | 16:00-22:00 | 13h,22,48,18,0,0,0,48,0,2,44,1,164,1,0,0,0,57,70,49,57,16,0,0,0,0,78,69,0,0,192,3,104,1
06:00-12:00 | 16:00-22:00 | 12h,22,48,18,0,0,0,48,0,2,192,3,104,1,0,0,0,43,70,49,58,16,0,0,0,0,78,69,0,0,104,1,104,1
07:00-12:00 | 16:00-22:00 | 11h,22,48,18,0,0,0,48,0,2,192,3,104,1,0,0,0,30,70,49,59,16,0,0,0,0,78,69,0,0,164,1,44,1
08:00-12:00 | 16:00-22:00 | 10h,22,48,18,0,0,0,48,0,2,192,3,104,1,0,0,0,24,70,49,0,17,0,0,0,0,78,69,0,0,224,1,240,0
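The arithmetic above treats two neighbouring bytes as one little-endian 16-bit number of minutes. Here is a small sketch of that reading, applied to bytes 9/10 and 29/30 of two rows from the table. Whether this interpretation is correct is still open; the helper names `minutes_le` and `as_clock` are made up for this example.

```python
def minutes_le(frame: list[int], index: int) -> int:
    """Combine frame[index] (low byte) and frame[index + 1] (high byte)."""
    return frame[index] | (frame[index + 1] << 8)


def as_clock(minutes: int) -> str:
    """Format minutes since 00:00 as HH:MM."""
    return f"{minutes // 60:02d}:{minutes % 60:02d}"


# Rows "05:00-12:00 | 16:00-22:00" and "06:00-12:00 | 16:00-22:00" from above.
row_0500 = [int(v) for v in
            "22,48,18,0,0,0,48,0,2,44,1,164,1,0,0,0,57,70,49,57,16,0,0,0,0,78,69,0,0,192,3,104,1".split(",")]
row_0600 = [int(v) for v in
            "22,48,18,0,0,0,48,0,2,192,3,104,1,0,0,0,43,70,49,58,16,0,0,0,0,78,69,0,0,104,1,104,1".split(",")]

print(as_clock(minutes_le(row_0500, 9)))   # 05:00
print(as_clock(minutes_le(row_0600, 9)))   # 16:00
print(as_clock(minutes_le(row_0600, 29)))  # 06:00
```

In the 06:00 row, bytes 9/10 read as 16:00 (the start of the second window) while bytes 29/30 read as 06:00, which would be consistent with the suspicion that the two byte pairs swap roles from 6 a.m. onwards.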

Timer 00:00-04:00 | 05:00-09:00 | 8h
34 44 18 0 0 0 48 0 2 44 1 240 0 0 0 0 48 71 49 52 15 0 0 0 0 78 69 0 0 0 0 240 0 62

Timer 01:00-05:00 | 06:00-11:00 | 9h
34 44 18 0 0 0 48 0 2 104 1 44 1 0 0 0 42 71 49 8 16 0 0 0 0 78 69 0 0 0 0 44 1 62

Timer 01:00-05:00 | 06:00-14:00 | 12h
34 44 18 0 0 0 48 0 2 104 1 224 1 0 0 0 36 71 49 14 16 0 0 0 0 78 69 0 0 60 0 240 0 62

Timer 01:00-09:00 | XXXXXXX | 8h
34 44 18 0 0 0 48 0 2 60 0 224 1 0 0 0 8 71 49 22 16 0 0 0 0 78 69 0 0 104 1 224 1 62

Time window: Setting limitations
• One time window: Minimum duration of 8 hours, maximum duration of 12 hours.
• Two time windows: Each at least 4 hours, combined maximum of 20 hours.
• Disable the second time window: Set the end time to less than 4 hours.

@tspopp
Owner Author

tspopp commented Nov 6, 2024

No worries, we can focus on other attributes first. I will look into these dumps as soon as I have some time. Alternatively, if you have a formula that seems promising, we can try to implement it and see if it matches...

@taloriko

taloriko commented Nov 7, 2024

I’ve now tested everything I could find on the HMI (Menu + Installer Menu). I couldn’t identify any further details.

I’ve added the time windows, though I’m still not entirely certain about their full functionality.

Here are my observations:

Bytes 3, 14, 15, 23, 24, 27, and 28 did not change during testing and were always set to 0.
Bytes 25 and 26 consistently held the values 78 and 69 (I have no idea).
The function of Byte 33 is still unclear (possibly a checksum).

Do you have any further ideas on how we could test these bytes? Otherwise, I would wrap up the HMI testing and shift focus to energy management.

@tspopp
Owner Author

tspopp commented Nov 7, 2024

I think you have identified all major attributes of interest in the HMI message for now 🍻 A few of the leftover bytes might be reserved for commands. The checksum is the last byte, Byte 34, which is not available in the dumps from AquaMQTT, so it is likely that something else is stored in Byte 33. You may try to enter the secret menu ("spin the wheel left and then to the right"), where you get some more advanced options. You may reverse-engineer these items as well, but AquaMQTT currently does not implement commands. I think I documented how commands work in my protocol document; I'm not sure whether the pattern changed with the new protocol 🤷‍♂️

Changing these advanced settings is interesting, though, because it changes the main message and lets you identify more information in it. For example, if you set the fan speed level to 55%, you will see some value within the main message changing to 55% 😉

In the meantime I will implement the changes needed to speak both protocol versions at the same time, so we can merge this to main as soon as we feel it's ready.

At some point I will need you to provide a large (maybe a few minutes long) raw dump using AquaDebug. Only the logs from AquaDebug contain the checksum, and we need to find out how the CRC values are actually calculated. In MITM mode we need to recreate messages and therefore have to generate the checksum the same way. So understanding this pattern is crucial for getting MITM to work 😬

@taloriko

taloriko commented Nov 7, 2024

In the secret menu, I can only read.

Should the heat pump be running during the dump?

Screenshot_20241107_182233_Gallery.jpg

Screenshot_20241107_182212_Gallery.jpg

Screenshot_20241107_182312_Gallery.jpg

Screenshot_20241107_182322_Gallery.jpg

Screenshot_20241107_182244_Gallery.jpg

Screenshot_20241107_182253_Gallery.jpg

Screenshot_20241107_182303_Gallery.jpg

Screenshot_20241107_182222_Gallery.jpg

Screenshot_20241107_182151_Gallery.jpg

@tspopp
Owner Author

tspopp commented Nov 7, 2024

No, it does not need to run during the dump. There will be a lot of HMI messages, and their checksum changes frequently since the time is always changing :)

But while dumping, you may also open the secret menu once. We should see what error messages look like. We need to identify them for MITM, too.

@taloriko

Today, I wanted to analyze the next set of data, and I noticed that the ESP32 restarts every 10 seconds.

I have always received the data so far, but there were occasional dropouts. I initially thought that IP-Symcon couldn’t receive the topics quickly enough, which caused these gaps.

First, I searched for a timeout issue and found one in the WiFi settings. I adjusted the value to 600:

constexpr uint16_t WIFI_RECONNECT_CYCLE_S = 600;

However, this change did not fix the problem.

I’m not sure if it’s related, but I also had to adjust the code slightly to be able to flash without errors:


#ifdef CUSTOM_CONFIGURATION
#    include "ExampleConfiguration.h"
#else
#    include "ExampleConfiguration.h"
#endif

I replaced CustomConfiguration with a different configuration.

@tspopp
Owner Author

tspopp commented Nov 11, 2024

Most probably I broke something with my latest commits. I will check. Sorry for that 😅

Edit: with my simulation it is not crashing, so it's probably related to the protocol 🤷‍♂️, but I fixed something that could lead to the crash. It is most likely memory corruption, since I've refactored a lot of code to support both protocols in parallel. In any case, the large dump from AquaDebug (while opening the super secret menu) would be really helpful. Once I have that, I can simulate your heat pump, find out what the error message looks like, and begin trying to figure out how the checksum works.

If it is still crashing with the latest commit, you can go back to the one before I refactored the serial protocols via git checkout b55b3b2. As soon as this branch is in a state where I can put it on my device, I will test it myself. We will get those things sorted... 👍

Looking forward!

@taloriko

I have finally created the log files with AquaDebug.

I created various files:

Without opening a menu
"Super Secret Menu" opened
Browsed errors in the "Super Secret Menu"
"Installer Menu" opened

aqua_debug_data_Normal.txt
aqua_debug_data_SuperSecret.txt
aqua_debug_data_SuperSecret_Errors.txt
aqua_debug_data_Installer.txt

@tspopp
Owner Author

tspopp commented Nov 11, 2024

Nice, we found the error message 🎉 Here is an example:

4A410D0F2D00270015001800130000000000000000020000000000671D00000200000045110000581100007A2200000800000002000000211E000C003200000000086B

I will add debug topics for those, but you don't need to figure out what the bytes in this message actually mean (of course you can try, if you like). AquaMQTT is currently mostly interested in forwarding these messages to the HMI controller in MITM mode, and therefore we have to know what they look like.

error_request_sequence.txt

This is how it works:

- messages without checksum
- hmi message requests error message with request id 0D
- controller answers with 4A410D (identifier + length + requestId)
- pattern is repeated with increased requestId

C2222E120000003000062C0126020000001C4B310A15000000004E4501 0D 6603C2013E
4A41 0D 0F2D00270015001800130000000000000000020000000000671D00000200000045110000581100007A2200000800000002000000211E000C00320000000008
C2222E120000003000062C0126020000001C4B310A15000000004E4502 0E 6603C2013E
4A41 0E 0F2C00270016001900130000000000000000020000000000661D0000020000004411000057110000792200000800000002000000211E000C00320000000007
C2222E120000003000062C0126020000001D4B310A15000000004E4503 0F 6603C2013E
4A41 0F 0F2A00260013001200110000000000000000020000000000611D0000020000004011000053110000712200000800000002000000211E000C00320000000006
C2222E120000003000062C0126020000001D4B310A15000000004E4504 10 6603C2013E
4A41 10 0F2D002800110010001100000000000000000200000000005B1D0000000000003B1100004E110000682200000700000002000000211E000C00320000000005
C2222E120000003000062C0126020000001E4B310A15000000004E4505 11 6603C2013E
4A41 11 0F2A003D00080008001000000000000000000200000000003D1D0000000000002211000028110000362200000600000000000000211E000C01320000000004
C2222E120000003000062C0126020000001F4B310A15000000004E4506 12 6603C2013E
4A41 12 0F2D00260017001700140000000000000000020000000000391D0000000000001F11000025110000312200000600000000000000211E000C01320000000003

- at the end, hmi controller knows somehow that all errors have been received and sets requestId to 00 which leads to no more error messages

C2222E120000003000062C0126020000001F4B310A15000000004E4500 00 6603C2013E
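The first three bytes of the controller's answer can be split as described in the list above (identifier, length, requestId). This is only an illustrative sketch: the type `ErrorHeader` and the function `parse_error_header` are hypothetical names, and the meaning of the remaining bytes is not interpreted at all.

```python
from typing import NamedTuple


class ErrorHeader(NamedTuple):
    identifier: int
    length: int
    request_id: int


def parse_error_header(frame_hex: str) -> ErrorHeader:
    """Read identifier, length and requestId from the start of an error message."""
    raw = bytes.fromhex(frame_hex.replace(" ", ""))
    if len(raw) < 3:
        raise ValueError("frame too short for an error header")
    return ErrorHeader(raw[0], raw[1], raw[2])


# Start of the first and last answer in the sequence above (shortened here).
first = parse_error_header("4A41 0D 0F2D")
last = parse_error_header("4A41 12 0F2D")
print(hex(first.identifier), hex(first.length), hex(first.request_id))
print(hex(last.request_id))
```

For the sequence above, the requestId read this way goes from 0x0D up to 0x12, mirroring the value the HMI message requested each time.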

So we have also identified the error requestId in the hmi message, and the requestId in the error message. Enough for today

@tspopp
Owner Author

tspopp commented Nov 12, 2024

Based on the dumps, I figured out how the checksum is generated. I also added error message parsing and passthrough. You might want to try MITM mode now and see if it works. Once you have removed the passthrough jumper, changed the configuration to constexpr EOperationMode OPERATION_MODE = EOperationMode::MITM, and flashed, you may want to test the following items to see if this is good:

  • HMI controller is still getting data from the main controller (temps are shown etc)
  • HMI controller is still able to control the main controller (e.g. turn on Boost)
  • HMI controller still gets a list of errors when opening the super secret menu

Be aware that overrides / controls are not yet supported for the next protocol. If the above items are working, we will add that functionality for sure :)

@taloriko

  • HMI controller is still getting data from the main controller (temps are shown etc)
  • HMI controller is still able to control the main controller (e.g. turn on Boost)
  • HMI controller still gets a list of errors when opening the super secret menu

The values I receive via MQTT seem correct at first glance. I can see the values on the HMI and control various components in test mode. In the super-secret menu, I can read errors.

Unfortunately, the controller restarts every 5–12 seconds.

Here are the new topics:
[image]

[image]

[image]

@tspopp
Owner Author

tspopp commented Nov 13, 2024

Actually, this is very good news; happy to get your heat pump fully supported soon 👍 I'll do the refactoring soon and resolve the crash you are seeing. In the meantime you should go back to LISTENER mode, because crashes during MITM might lead to protocol dropouts. Not sure if the heat pump likes that behavior or not 🤷‍♂️ You might want to proceed with completing the protocol; observing the states in the main message would be nice. These are the icons shown on the HMI controller (fan is on, heat element is on, external boiler is on...). But of course we can also add things step by step after the main PR has been merged.

@taloriko

Okay, I’ll switch back to LISTENER mode.

As time permits, I’ll continue analyzing the main message.

Successfully merging this pull request may close these issues.

After installation, no values are displayed on the HMI
2 participants