-
Notifications
You must be signed in to change notification settings - Fork 13.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Millis() Function Count Drift #3078
Comments
It sounds like you're observing inaccuracy drift in the onboard RTC which is a fairly widely documented and commented thing. If you've not found a solution, you might find this interesting (and the linked espressif RTC sample): http://www.esp8266.com/viewtopic.php?p=38750#p38750 |
I don't think that millis drift has to do anything with RTC... We don't use RTC clock for anything other than deep sleep. The observation about the rounding error in the current code is the correct one. |
I'm confirming the same drift in core code. Now trying the code above in my app. |
I confirm the same drift. |
@holla2040 were you able to test the above? It will be a while before I can get to it myself. |
Hate to say this, but I can't remember if I tested this. Nor do I remember what project I was working on where drift was important. Sorry for being a loser. Kinda busy around here. |
@holla2040 not a loser, you found a very subtle thing. |
fair enough, I'll try working on it tomorrow. |
@holla2040 did you get around to testing this? |
No but thanks for reminding me of my slackerness. I have a esp8266 connected for another project. I'll work on it right now. |
I'm still taking data. |
Here's my testing approach. get time from ntpdate function as reference, then build table with ref time, millis_orig, millis_corr, millis_altn. see attached. I was expecting to see a drift over time of millis_orig, but didn't. So I think my approach is flawed or my setup is incorrect or millis() has been corrected in esp8266 source. I do have a question though, what does setting us_ovflow do? |
Unfortunately, I can't test this anymore (my engineer who tested before no longer works for me). |
lastest git of esp8266? |
I meant for OP to rerun the original tests with latest git of the repo and compare results against what was reported in the first post. |
Hi @devyte and @holla2040, Got you messages.
By 'git', I'm assuming that you mean clone the arduino/esp8266 It wouldn't matter anyhow, as the variables required to test the code are inaccessible. This is why I created the test sketch with duplicated timing routines from 'core_esp8266_wiring.c'. To simulate impractically long elapsed times, I only have to manipulate the now exposed (and renamed) 'us_overflow'. Given that I'm working with the 'master' branch, I was surprised to see that no changes to the millis() function or its support routines have been made. I did see a new 'micro64()' function (#3830), but its used elsewhere. Since millis() wasn't changed, my guess is that @iggr decided that for the timescales that millis() is used over, the reported drift is too small to bother with. So please confirm that I'm looking in the right repo branch, and if so, ascertain the reason millis() was not changed (and if it will ever be changed). If everything is cool, I would would just close out this issue, as there is nothing to test. |
According to what I understand in @holla2040's test, there is no observable drift over ~40hours (a shift of ~300ms exists at the beginning and at the end of this test). During that time we would have had a drift of 10ms which cannot be observable without strong spectacles. I do not question the validity of this issue, and it is interesting to clarify it. It is obviously worth the change or addition, for the sake of precision, but we need first to evaluate the increased cost of calling Would it be interesting to make a new function available in the core (in a way it would not increase sketchs size if it is not used) with a different name like About how to use the current version of the core, it is explained there: (link). It is advised to make an install from scratch. |
for the problem of the OP it is not necessary to prove by sketch, it is a pure systematic (rounding) error that can be proven mathematically on paper as igrr allready stated.
by this
the rest of the initial posted setch is only for proof and demonstration of error and effect |
@mrwgx3 yes, by latest git I meant master branch with latest changes, which (usually) means installing via git. Instructions are on readthedocs as linked by @d-a-v . |
@devyte commented:
Neither do I. Normally, I look at the compiler's assembly source to make a decision, but I'm not up to speed on the Tensillica? processor the ESP8266 uses. You are correct that 'millis_altn()' tries to confine itself to 32-bit word arithmetic, with maybe (1) 64-bit interm generated by 'c*4294967'. The original 'millis()' deals with that issue as well. The intent was to avoid the 128-bit result from a 64bit x 64bit multiply. Your proposed benchmark would resolve which approach is faster. I probably can get to it sometime next week. |
DescriptionAs promised, here are some benchmarks comparing the run-times for various Millis() function candidates (see 'Sketch Output'). After some experimentation, I settled on the following function: //---------------------------------------------------------------------------
// Alternate form, corrected millis, 2nd pass (1.13 x Original)
//
// One (1/1000) divide converted to right-shifts
//
// (1.024 * N) * ((clo * 296 + chi * 656) / (N * 1024) )
// = ((clo * (1.024 * N) * 296 + chi * (1.024 * N) * 656) >> (N+10) )
// = ((clo * 2425 + chi * 5374) >> 13) for N = 3
//
unsigned long ICACHE_RAM_ATTR millis_test( void )
{
uint32_t m = system_get_time();
uint32_t c = us_ovflow + ((m < us_at_last_ovf) ? 1 : 0);
// High words of 'c'
uint16_t chi = c >> 16; // clo = c & 0xFFFF;
// Note that truncation of 'c * 4294967' is explicitly enforced
return (
(uint32_t)(c * 4294967) + (m / 1000) +
(( (c & 0xFFFF) * 2425 + chi * 5374) >> 13) + chi * 19398
); //end-return
} //millis_test
It runs about 15% slower than the original function, and is still reasonably compact. Sketch Output
Sketch//
// Millis() Runtime Benchmark
//
// a) Include code from Millis_Drift
// This code illustrates how the ESP8266 implementation of Millis()
// incrementally drifts 296us / ~71 minutes (2^32 us) of operation.
// This drift is cumlative, i.e., does not reset upon roll-over of
// millis()
//
// b) Add code to determine the runtime in 'usec' of various millis()
// function candidates.
//
#include <Arduino.h>
#include <ESP8266WiFi.h>
#include <stdio.h>
// Include API-Headers
extern "C" { // SDK functions for Arduino IDE access
#include "osapi.h"
#include "user_interface.h"
#include "espconn.h"
} //end, 'extern C'
//---------------------------------------------------------------------------
// Globals
const int LedBlu = 2, LedRed = 0, LedOff = 1, LedOn = 0;
// For 64-bit usec ring counter
static os_timer_t us_ovf_timer;
static uint32_t us_cnt = 0;
static uint32_t us_ovflow = 0;
static uint32_t us_at_last_ovf = 0;
//---------------------------------------------------------------------------
// Interrupt code lifted directly from "cores/core_esp8266_wiring.c",
// with some variable renaming
//---------------------------------------------------------------------------
// Callback for usec counter overflow timer interrupt
void us_overflow_tick ( void* arg )
{
us_cnt = system_get_time();
// Check for usec counter overflow
if ( us_cnt < us_at_last_ovf ) {
++us_ovflow;
} //end-if
us_at_last_ovf = us_cnt;
} //us_overflow_tick
//---------------------------------------------------------------------------
#define REPEAT 1
void us_count_init ( void )
{
os_timer_setfn( &us_ovf_timer, (os_timer_func_t*)&us_overflow_tick, 0 );
os_timer_arm( &us_ovf_timer, 60000, REPEAT );
} //us_count_init
//---------------------------------------------------------------------------
// Original millis() function (1.0 x Orig)
unsigned long ICACHE_RAM_ATTR millis_orig ( void )
{
uint32_t m = system_get_time();
uint32_t c = us_ovflow + ((m < us_at_last_ovf) ? 1 : 0);
return ( c * 4294967 + m / 1000 );
// Above equation appears to be an approximation for
//
// (c * 4294967296 + m) / 1000 ~= c * 4294967 + m / 1000
//
// Does this simplify the compilers work, i.e., is the multiply
// 'c * 4294967' less complex than 'c * 4294967296'? Both would
// seem to require a 64-bit result, although the former could
// discard the most significant 32 bits.
} //millis_orig
//---------------------------------------------------------------------------
// Corrected millis(), 64-bit arithmetic gold standard (3.0 x Orig)
// truncated to 32-bits by return
unsigned long ICACHE_RAM_ATTR millis_corr( void )
{
uint32_t m = system_get_time();
uint32_t c = us_ovflow + ((m < us_at_last_ovf) ? 1 : 0);
return ( (c * 4294967296 + m) / 1000 );
} //millis_corr
//---------------------------------------------------------------------------
// Alternate form, corrected millis() (1.5 x Orig)
//
// Convert 64-bit form into 32-bit arithmetic, assuming that the compiler
// is smart enough to return the least-significant 32-bit word for the
// 64-bit multiply, 'c * 4294967'
//
// ( c * 4294967296 + m ) / 1000
// c * ( 4294967000 + 296 ) / 1000 + (m / 1000)
// c * 4294967 + ( m + c*296 ) / 1000
//
// noting c = 65536 * chi + clo, where chi & clo are uint16_t quantities...
//
// c*4294967 + ( m + (65536 * chi + clo) * 296 )/1000
// ..... + (m/1000) + ( clo*296 )/1000 + chi*(65536*296/1000)
// ..... + (m/1000) + ( clo*296 )/1000 + chi*(19398 + 0.656)
// ..... + (m/1000) + ( clo*296 )/1000 + (chi*1000*0.656)/1000 + chi*19398
// ..... + (m/1000) + ( clo*296 )/1000 + (chi*656)/1000 + chi*19398
// ..... + (m/1000) + ( clo*296 + chi*656 )/1000 + chi*19398
//
// Original millis() form Additional terms
//
// Truncated
// 64-bit? 32-bit 32-bit 32-bit
// c*4294967 + (m/1000) + ((clo*296 + chi*656 )/1000) + chi*19398
//
unsigned long ICACHE_RAM_ATTR millis_altn ( void )
{
uint32_t m = system_get_time();
uint32_t c = us_ovflow + ((m < us_at_last_ovf) ? 1 : 0);
// High/low words of 'c'
uint16_t chi = c >> 16, clo = c & 0xFFFF;
// Note that truncation of 'c * 4294967' is explicitly enforced
return (
(uint32_t)(c * 4294967) + (m / 1000) +
((clo * 296 + chi * 656) / 1000) + chi * 19398
); //end-return
} //millis_altn
//---------------------------------------------------------------------------
// Corrected millis(), Replace 1st mult by shift (3.2 x Orig)
unsigned long ICACHE_RAM_ATTR millis_test_LSH( void )
{
uint32_t m = system_get_time();
uint32_t c = us_ovflow + ( (c < us_at_last_ovf) ? 1 : 0);
return( ( ((uint64_t)c << 32) + m ) / 1000 );
} //millis_test_LSH
//---------------------------------------------------------------------------
// Corrected millis(), Eliminate 1st mult by little endian union (3.4 x Orig)
unsigned long ICACHE_RAM_ATTR millis_test_LEU( void )
{
union {
uint64_t sum;
uint32_t mc[2];
} a;
a.mc[0] = system_get_time();
a.mc[1] = us_ovflow + ( (a.mc[0] < us_at_last_ovf) ? 1 : 0);
return( a.sum / 1000 );
} //millis_test_LEU
//---------------------------------------------------------------------------
// Alternate form, corrected millis, 2nd pass (1.13 x Original)
//
// One (1/1000) divide converted to right-shifts
//
// (1.024 * N) * ((clo * 296 + chi * 656) / (N * 1024) )
// = ((clo * (1.024 * N) * 296 + chi * (1.024 * N) * 656) >> (N+10) )
// = ((clo * 2425 + chi * 5374) >> 13) for N = 3
//
unsigned long ICACHE_RAM_ATTR millis_test( void )
{
uint32_t m = system_get_time();
uint32_t c = us_ovflow + ((m < us_at_last_ovf) ? 1 : 0);
// High words of 'c'
uint16_t chi = c >> 16; // clo = c & 0xFFFF;
// Note that truncation of 'c * 4294967' is explicitly enforced
return (
(uint32_t)(c * 4294967) + (m / 1000) +
(( (c & 0xFFFF) * 2425 + chi * 5374) >> 13) + chi * 19398
); //end-return
} //millis_test
//---------------------------------------------------------------------------
// Output routines
//---------------------------------------------------------------------------
void millis_header ( void )
{
Serial.print( "us_ovflow err: " ); Serial.print( 0.296 * us_ovflow ); Serial.println( " ms" );
Serial.println( F(" Original Corrected Test Corr - Test Orig - Corr") );
Serial.println( F(" mills() millis() millis() Difference Difference Heap") );
} //millis_header
//---------------------------------------------------------------------------
void millis_compare ( void )
{
uint32_t ms_orig = millis_orig();
uint32_t ms_corr = millis_altn();
uint32_t ms_test = millis_test();
uint16_t heap = ESP.getFreeHeap();
char buf[60] = "";
sprintf( buf, "%12u %12u %12u",
ms_orig, ms_corr, ms_test
); //end-sprint
Serial.print( buf );
Serial.print( " " );
sprintf( buf, "%10d %10d %7d",
(int32_t)(ms_corr - ms_test),
(int32_t)(ms_orig - ms_corr),
heap
); //end-sprint
Serial.println( buf );
} //millis_compare
//---------------------------------------------------------------------------
//---------------------------------------------------------------------------
void Millis_DriftTest ( char *cmnt )
{
Serial.println();
Serial.print( cmnt );
Serial.print( ", us overflow count = " );
Serial.println( us_ovflow );
us_at_last_ovf = 0xFFFFFFFE; // Set close to max
millis_header();
for ( int i = 0; i < 3; i++ ) {
millis_compare();
delay(1000);
} //end-for
} //Millis_DriftTest
//---------------------------------------------------------------------------
void Millis_DriftAD ( void )
{
// Test A
us_at_last_ovf = 0xFFFFFFFE;
us_ovflow = 1000;
Millis_DriftTest( "1st Millis() overflow at ~49 days" );
// Test B
us_at_last_ovf = 0xFFFFFFFE;
us_ovflow = 7348;
Millis_DriftTest( "At 1 year" );
// Test C
us_at_last_ovf = 0xFFFFFFFE;
us_ovflow = 73476;
Millis_DriftTest( "At 10 years" );
// Test D
us_at_last_ovf = 0xFFFFFFFE;
us_ovflow = 734758;
Millis_DriftTest( "At 100 years" );
Serial.println();
Serial.println( F("*** End, Drift Test ***") );
} //Millis_DriftAD
//---------------------------------------------------------------------------
//---------------------------------------------------------------------------
void Millis_RunTimes ( void )
{
Serial.println();
Serial.println( "Millis RunTime Benchmark" );
uint32_t lc = 100000; // Samples
float nsf = 1 / float(lc); // Normalization
uint32_t bgn, cnto = 0, cntc = 0, cnta = 0, cntx = 0;
us_at_last_ovf = 0xFFFFFFE; // Slightly less than max
us_ovflow = 1000;
for (uint32_t i = 0; i < lc; i++ )
{
bgn = system_get_time();
millis_orig();
cnto += system_get_time() - bgn;
bgn = system_get_time();
millis_corr();
cntc += system_get_time() - bgn;
bgn = system_get_time();
millis_altn();
cnta += system_get_time() - bgn;
bgn = system_get_time();
millis_test();
cntx += system_get_time() - bgn;
yield();
} //end-for
Serial.println();
Serial.println( " usec x Orig Comment" );
Serial.print( "Orig: " );
Serial.print( nsf * (float)cnto );
Serial.print( " " );
Serial.print( 1.00 );
Serial.println( " Original code" );
Serial.print( "Corr: " );
Serial.print( nsf * (float)cntc );
Serial.print( " " );
Serial.print( (float)cntc / (float)cnto );
Serial.println( " 64-bit reference code" );
Serial.print( "Altn: " );
Serial.print( nsf * (float)cnta );
Serial.print( " " );
Serial.print( (float)cnta / (float)cnto );
Serial.println( " 32-bit trunc. code" );
Serial.print( "Test: " );
Serial.print( nsf * (float)cntx );
Serial.print( " " );
Serial.print( (float)cntx / (float)cnto );
Serial.println( " 32-bit trunc. code, some divides rescaled to shifts" );
Serial.println();
Serial.println( F("*** End, Bench Test ***") );
} //Millis_RunTimes
//---------------------------------------------------------------------------
//---------------------------------------------------------------------------
void setup ()
{
pinMode(LedRed, OUTPUT); // Turn off LEDs
pinMode(LedBlu, OUTPUT); // ...
digitalWrite(LedRed, LedOff); // ...
digitalWrite(LedBlu, LedOff); // ...
Serial.begin(115200);
while (!Serial) delay(250); // Wait until Arduino Serial Monitor opens
Serial.println( " " );
Serial.println(F("Millis() Drift Correction / Runtime Estimates"));
WiFi.mode( WIFI_OFF );
// Interrupt for overflow detection, millis() function tests
us_count_init();
Millis_DriftAD(); // Drift test, 49 days, 1, 10, & 100 years
Millis_RunTimes(); // Runtime benchmarks
} //setup
//---------------------------------------------------------------------------
void loop(void)
{
yield();
} //loop
//---------------------------------------------------------------------------
//--------------------------------------------------------------------------- |
This is awesome. I didn't even know the trick of converting to right shift for speed. |
Hi @devyte |
Hi All It turns out that I had (2) problems with the proposed 'millis()' solution:
The only thing left to explore was to find a fast implementation for the 64-bit arithmetic used by the slow 'gold-standard' routine. After doing a little research, I found that one can do fixed-point division by a constant simply multiplying by a scaled multiplicative inverse of the divisor. These 'magic numbers' are often used by compilers when dividing by small constants. See this link for a good discussion: http://ridiculousfish.com/blog/posts/labor-of-division-episode-i.html Given a magic number of 64-bits precision and a divisor of 10-bits precision (1000), one can accurately divide a dividend with up to 54-bits precision simply by multiplying by the magic number. This corresponds a dividend range of (54-32 = 22 bits), or 0x400000, which translates to about 570 years of usec counter overflows.
I have exhaustively tested the above code using the Borland C++ Builder v4.0 compiler for 0 < m < (2^32-1) at c = 0x400000, and did sampled testing for 0 < 'c' <= 0x400000. Benchmark times for the ESP8266 Huzzah Feather were:
The magic multiplier ran ~3x faster than the gold-standard. Execution times, however, vary considerably with the numbers being multiplied, so one should probably lower this factor to the expected worst case of around x2. Given the trade-offs between complexity and speed, this is best algorithm I've been able to come up with. Got any ideas for improvements? How do you wish to proceed? |
NERD please take this as polite compliment <8) |
@mrwgx3 awesome work. It took me a while to understand the init of a[], it's... strange usage, but the init does look correct.
|
Thank you, I do have to work at it!
Just a matter of personal preference. When proofing code, I find it much harder to spot a missing '*' de-reference at the beginning of a line. Both are equivalent, pick the one you like best.
The initialization / summing order is dictated by the little endian format. Here is the summing alignment presented graphically:
If one replaces the 'm kl' product with
the incidental save to a[0] could be eliminated, and the accumulator ultimately shortened by 32-bits. The right-shift, however, comes with a modest time penalty. If you need the ram space, it might be worth it. Is this what you meant by your question? |
Some additional comments after reviewing last night's post. As C or C++ does not have an intrinsic test for addition overflow, one has to code specifically to detect one. As proposed millis() contains none of this because of speed requirements, it means that:
To meet this criteria, not only do we have to pick 'k' to achieve our desired precision, we also have to split 'k' appropriately to avoid any addition overflows. Ideally, 'k' must be also chosen as to align the various products on byte boundaries to avoid any 64-bit word shifts before additions. These shifts impose major time penalties for shifts not equal to multiples of 8, and modest ones for those which are, The 'k' chosen for this specific division by 1000 was picked primarily to avoid shifts as well as meeting reasonable precision requirements. Hence this routine is NOT a general one, as picking another divisor may break the addition overflow requirement or greatly slow down the algorithm. |
@mrwgx3 My prior comment was literal: it's been a while since I've done pointer casting voodoo, so my voodoo eyes are rusty, hence why I found it strange and difficult to understand. The graphic in your previous comment helps, I think we should keep that as a comment close to the implementation. Also, we should keep your explanation about how the code was derived, why this implementation was picked over the others (performance numbers, breaking at 7years, etc), and especially the criteria for picking the divisor to avoid shifts. I have no trouble imagining somebody (myself included) taking a look at this code in the future and having a hard time understanding it, so having as much detail explained in the code comments is a Good Thing. Can you please prepare a PR with all this? I'd like to merge soon after 2.4.1 is out the door, to allow time for testing in the wild. |
Per your request, I've generated PR #4264 containing the corrected millis() function. I did go ahead and reduce the accumulator size, per the discussion a couple of comments back.
I was successful in installing the Arduino/esp8266 repo in my Arduino install, and tested the new millis() function using the 'BlinkWithoutDelay' example sketch. |
* Correct millis() drift, issue 3078 * Add 'test_millis_mm.ino' runtime benchmark * Eliminate 'punning' warning Add union 'acc' in millis() to eliminate 'punning' compiler warning * Correct minor typo * Eliminate 'punning' warning Add union 'acc' to eliminate 'punning' compiler warning in 'millis_test'_DEBUG() and 'millis_test()'
* Correct millis() drift, issue 3078 * Add 'test_millis_mm.ino' runtime benchmark * Eliminate 'punning' warning Add union 'acc' in millis() to eliminate 'punning' compiler warning * Correct minor typo * Eliminate 'punning' warning Add union 'acc' to eliminate 'punning' compiler warning in 'millis_test'_DEBUG() and 'millis_test()' (cherry picked from commit 3934d10)
PR #4264 is already merged, pulling this into 2.4.2. |
Hi, Sorry to barge in like this but I just can't find this anywhere else. with arduino uno i can do the following:
to reset ticks(), which is practically same as micros(), I do a: This way I don't have to care about the overflows. I would really appreciate if someone could help with this. Thnx a lot. |
Basic Infos
Hardware
OS, IDE, and SDK
Description
While studying how millis() and micros() work in order to extending them to 64-bits, I noticed an approximation in the millis() function which causes the returned millisecond value to incrementally drift 296us / ~71 minutes (2^32 us) of operation. This drift is cumlative, i.e., does not reset upon roll-over of millis(). The sketch included at the end of this post demonstrates the drift, and gives some corrections.
Settings in IDE
Sketch Output
Sketch
The text was updated successfully, but these errors were encountered: