Sonoff Si7021 works unreliably with Sonoff TH10 #1798

aerofeev2k · 2018-09-25T13:48:07Z

Hi,

I tried using mega-20180923 build recently with the combination of Sonoff TH10 (Rev 2.0) station and Sonoff Si7021 sensor and found that temperature/humidity readings do not work reliably. Depending on the sensor instance (and I have 3 of them), the readings have between 25% and 75% success ratio.

This prompted an investigation, I've found previous reports on this issue like this (#691), this (#284) and this (adafruit/DHT-sensor-library#48), downloaded ESPEasy source code, compiled it in Atom, attached logic analyzer to the sensor connector on TH10, collected traces, tried some changes, and I think I've got what's wrong in the current code and how to make it work reliably with Sonoff Si7021...

I believe that the initial post on adafruit (last link above) has unnecessarily spooked developers from properly issuing a HIGH after 500us LOW under assumption that this would create almost like a short-circuit between the ESP pulling HIGH and sensor pulling LOW. In reality it's a non-issue, at least with Si7021, because: 1) the sensor doesn't pull LOW for at least 40us anyway, so there is plenty of time to issue a safe HIGH from ESP; 2) even when the sensor pulls LOW with ESP pulling HIGH nothing really blows; 3) someone online has translated a DHT datasheet from Chinese and posted that the datasheet itself says that while having a proper pullup resistor is the recommended way, it's still acceptable to issue a HIGH pulse after the initial LOW: http://forum.arduino.cc/index.php?topic=377413.msg2699553#msg2699553; 4) All quited pullup discussions happen in the context of DHT11 or DHT22, it may simply be wrong to extrapolate those findings on Si7021 even though it appears to use a protocol similar to DHT11/DHT22.

So, allow me to elaborate on my findings...

The preamble in the current ESPEasy code (introduced as a result of #691) pretty much looks like this (I've removed pieces that are not relevant to Si7021):

pinMode(Plugin_005_DHT_Pin, OUTPUT);
digitalWrite(Plugin_005_DHT_Pin, LOW);
delayMicroseconds(500);
pinMode(Plugin_005_DHT_Pin, INPUT);
delayMicroseconds(50);

Then go waitState(0), waitState(1), waitState(0) and reading of 5 bytes...

As you see, there is a 500us LOW followed up by changing the GPIO to INPUT (btw, should've been INPUT_PULLUP), followed up by a 50us wait.

In reality at least with Sonoff TH10 this doesn't produce the signal that the code thinks it's generating. All I see on the wire is a LO(555), meaning a 555us LOW signal followed up by a positive edge, and nothing else. Now, my logic analyzer isn't a recording oscilloscope like the one in the adafruit link above, it doesn't record signal transitions, it only records logic levels at 100MHz sampling rate. My guess about what's going on here is that in the absence of a strong HIGH signal from ESP, it's up to the existing pull up resistors (whether external or internal) to bring the signal up, and those resistors unfortunately have values too high, and as a result the initial 500us pulse stretches into 555us, which upsets microcontroller on the sensor board, and so it sends nothing back. Interestingly, replacing that pinMode(INPUT) with pinMode(INPUT_PULLUP) doesn't make any difference.

One can argue that this is a hardware issue, and indeed TH10 datasheet shows no external pullup resistors on GPIO14, however do note that TH10 and Si7021 are offered by iTead together, and they work fine with stock firmware. While the issue can probably be solved by adding a proper (smaller) pull up resistor, it can also be solved in software, one just needs to stop relying on the external pullups to do the weight pulling after the initial LOW pulse.

So, thinking that extended initial LOW signal might be the problem, I've modified the code to read as below and made a few runs with different values for the delay:

pinMode(Plugin_005_DHT_Pin, OUTPUT);
digitalWrite(Plugin_005_DHT_Pin, LOW);
delayMicroseconds(500);
digitalWrite(Plugin_005_DHT_Pin, HIGH);
delayMicroseconds(delay);
pinMode(Plugin_005_DHT_Pin, INPUT);

This is what I found...

For delay=10us or delay=20us, readings are 100% successful and the signal on the wire looks like this:

LO(519), HI(39), LO(41), HI(80), [LO(43), HI(24)], [LO(43), HI(24)], ... [LO(43), HI(74)]

You see the initial 519us LOW from ESP followed up by 39us HIGH, then 41us LOW, then 80us HIGH, and then a sequence of transmitted bits where 0 bit is encoded as [LO(43), HI(24)], and 1 bit is encoded as [LO(43), HI(74)]. There are a couple of things to note here:

The initial LOW pulse got reduced from 555us to 519us, and that's what's most likely allowing the sensor to respond back, it's now within the sensor microcontroller's specs.
ESP code is still too optimistic about the timing. It may think that it's issuing a 500us LOW on the wire, but in reality it's 519us. The big "if (Par3)" or still enabled interrupts might contribute to the discrepancy.
When the line goes HIGH after initial LOW, the HIGH stays for 39us, which means that the microcontroller on the sensor isn't issuing its first LOW pulse for 39us after seeing the initial positive front. So for 39us there should be no fear of shorting the bus by continuing to pull HIGH from ESP.

For delay=40us, readings are also 100% successful, but the signal now looks like this:

LO(519), HI(46), LO(33), HI(80), [LO(43), HI(24)], [LO(43), HI(24)], ... [LO(43), HI(74)]

You'll notice the same 519us initial pulse plus a few other things:

The first HIGH pulse is now 46us long. It's extended by ESP's output, and for 6us we now have the sensor pulling LOW and ESP pulling HIGH, and yet nothing blows.
While ESP thinks that it's issuing a 40us HIGH pulse, in reality it's 46us long. IDK whether it's the slowness of Arduino layer on top of ESP, or lack of optimization or something else, but lengths of actual pulses on the wire are really quite off from what ESP is trying to generate. This discrepancy makes things like delayMicroseconds(1) in waitState() and Plugin_005_read_dht_dat() suspicious at least.

For delay=50us things stop working, ESPEasy code is now reporting checksum errors, and the bus now looks like this:

LO(519), HI(56), LO(23), HI(80), [LO(43), HI(24)], [LO(43), HI(24)], ... [LO(43), HI(74)]

Things that are of interest here:

Again the initial HIGH pulse is 56us instead of the expected 50us - same kind of 6us slowness on the ESP side
The sum of the initial HIGH and the following LOW durations is always 80us, so whether the ESP generates its HIGH for 10us, 20us, 40us or even 50us, the microcontroller on the sensor is doing exactly the same thing: it's waiting 40us after the initial positive front, then is issuing a 40us LOW, then 80us HIGH and then is sending the actual bits.
The LO(23), being only 23us long is most likely missed by ESPEasy with it calling a separate waitState(0) function and losing just enough time before the initial digitalRead() in it. So ESPEasy ends up reading the stream off by a bit, and that screws up the checksum.

Frankly, I think the LO(23) should've been detected, and I'm actually kind of curious to try this in pure IDF SDK w/o any Arduino layer on top, just to see whether it's the CPU that's slow, the layer or something else.

But anyway, to summarize my findings:

Current code doesn't work reliably with Sonoff Si7021 and TH10 board because the board lacks external pullups, and the code depends on them. Yet the combination of Si7021 and TH10 is sold by iTead and works fine with stock firmware.
The worries about issuing a HIGH signal after initial LOW that prompted the dependency on external pullups are overrated because the reverse LOW from the Si7021 sensor wouldn't come for 40us anyway, and even when it comes, nothing really blows.
ESP is too sluggish to generate precisely timed signals. The error in delayMicroseconds(40) or delayMicroseconds(50) is 6us, and the error in the initial 500us pulse is 19us. IDK whether it's because code is not compiled with -O2 or because the Arduino layer on top of ESP is inefficient.
The code that generates 100% correct readings for Sonoff Si7021 and TH10 for me is this:

pinMode(Plugin_005_DHT_Pin, OUTPUT);
digitalWrite(Plugin_005_DHT_Pin, LOW);
delayMicroseconds(500);
digitalWrite(Plugin_005_DHT_Pin, HIGH);
delayMicroseconds(20);
pinMode(Plugin_005_DHT_Pin, INPUT);
// continue with waitState(0), waitState(1), etc.

Please review _P005_DHT.ino and consider changing Si7021 codepath to look something like the above.

TD-er · 2018-09-29T19:22:16Z

Maybe I am missing something here, but there is also a P014_Si7021.ino file.
Both are included in the stable plugin set.
The P014 version seems to perform I2C readings, while the P005_DHT does things a lot more low-level.
Also both sensors look a lot different when taking the datasheets.

So just to be sure, what Sonoff is calling a Si7021, is actually a DHTxx with some additional circuit?

Edit:
The Sonoff Si7021 appears to be an I2C device which is mimicking a DTHxx with additional circuitry.
See Sonoff Si7021 schematic.

See letscontrolit#1798

TD-er · 2018-09-29T20:55:19Z

Could you please test this PR: #1827
I also did make it more readable and removed some F-string duplicates from the source.
Also note the return value of P005_waitState is now inverted, which makes more sense. (return true when valid)

Implemented your suggestions (only for Si7021)
Make sure the interrupts are enabled again
Wait for state change does now use microsecond clock, instead of counting microsecond waits.

aerofeev2k · 2018-10-01T01:32:32Z

Yes, the new code works with Sonoff Si7021 reliably.

It's cool that the waitState() is now using hardware clock, but it would be even cooler if the clock was used to measure the width of positive pulses and then the decision regarding whether a pulse is a 0 or 1 would be made based on whether the width is e.g. longer than 50us or not. The potential problem with the current implementation that waits for the positive front and then sleeps for 35us is that the entire transmission is terminated with only a 16us low. So the timing has to be precise enough to measure that last 0 bit and hit the 16us window. But apparently it all works out, so no worries.

On the other hand, measuring pulse widths can also be tricky. That micros() is too heavy. It's calling esp_timer_get_time(), which is doing way too many things to measure intervals that are only 24us long. People on ESP use xthal_get_ccount() instead. It's basically a single assembly instruction querying the value of CCOUNT register that gets incremented every clock cycle (80 times per usec for ESP8266).

Btw, I did some experimentation with ESP SDK and ets_delay_us(1) from it in particular (I guess it's about the same as what delayMicroseconds(1) does). Normally it waits for 2.2us instead of 1us, but the biggest problem with timing comes when the code is not sitting in IRAM cache and has to be downloaded from the flash via SPI. Then ets_delay_us(1) suddenly takes 6us. So unless one pins time-critical code to IRAM via IRAM_ATTR, there is really no guarantee of how fast that method will run on ESP.

TD-er · 2018-10-01T07:33:03Z

That's great info.
Could you perhaps form this into a new issue to look into improving calls to micros()?
I use it also in timing statistics

[#1798] Bugfix Sonoff Si7021 stability

TD-er · 2018-10-11T22:39:49Z

I have taken all interesting information into a new issue #1893
So this one can be closed

TD-er added a commit to TD-er/ESPEasy that referenced this issue Sep 29, 2018

[letscontrolit#1798] Bugfix Sonoff Si7021 stability

c8ecd8d

See letscontrolit#1798

TD-er mentioned this issue Sep 29, 2018

[#1798] Bugfix Sonoff Si7021 stability #1827

Merged

TD-er added a commit that referenced this issue Oct 1, 2018

Merge pull request #1827 from TD-er/bugfix/DTH_Si7021_reading_stability

3e20f61

[#1798] Bugfix Sonoff Si7021 stability

TD-er mentioned this issue Oct 11, 2018

[timing] Improve timing by measuring time instead of waiting #1893

Open

TD-er added Type: Bug Considered a bug Category: Plugin Related to supported sensors Category: Stabiliy Things that work, but not as long as desired Status: Fixed Commit has been made, ready for testing labels Oct 11, 2018

TD-er closed this as completed Oct 11, 2018

TD-er mentioned this issue Oct 17, 2018

Bug in _P005 DHT #1917

Closed

pawart mentioned this issue Nov 18, 2018

DHT12: No reading! - on Sonoff touch extended with standard DHT12 running version 148 AND mega-20181101_normal_ESP8285_1024 #2062

Closed

sfromis mentioned this issue May 31, 2021

Tasmota 9.4 contact with SI7021 is unreliable arendst/Tasmota#12180

Closed

13 tasks

lmagyar mentioned this issue Feb 21, 2023

v12.4.0 on Sonoff TH10 with SI7021 randomly reports false temperature and humidity values arendst/Tasmota#17988

Closed

14 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sonoff Si7021 works unreliably with Sonoff TH10 #1798

Sonoff Si7021 works unreliably with Sonoff TH10 #1798

aerofeev2k commented Sep 25, 2018

TD-er commented Sep 29, 2018 •

edited

Loading

TD-er commented Sep 29, 2018

aerofeev2k commented Oct 1, 2018

TD-er commented Oct 1, 2018

TD-er commented Oct 11, 2018

Sonoff Si7021 works unreliably with Sonoff TH10 #1798

Sonoff Si7021 works unreliably with Sonoff TH10 #1798

Comments

aerofeev2k commented Sep 25, 2018

TD-er commented Sep 29, 2018 • edited Loading

TD-er commented Sep 29, 2018

aerofeev2k commented Oct 1, 2018

TD-er commented Oct 1, 2018

TD-er commented Oct 11, 2018

TD-er commented Sep 29, 2018 •

edited

Loading