Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sys/log: Add number of entries support in log #3168

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

vrahane
Copy link
Contributor

@vrahane vrahane commented Mar 20, 2024

Please check email on the dev list for more information:

  • This adds support for logging number of entries since a given index of the log, it is optional, so only used when enabled
  • The MYNEWT_VAL for the same is LOG_FLAGS_TLV_NUM_ENTRIES and the TLV support syscfg is LOG_FLAGS_TLV_SUPPORT
  • No LOG_VERSION: 4 needed
  • Backwards and forwards compatible with earlier log header since the TLV is added at the end of the log entry
  • CLI command support is also added to help read the number of entries since a particular index on the CLI
  • Added selftests for num_entries
    CLI command:
compat> log -ne reboot_log -i 50

043837 entries: 5

Example log entries:

7143966 [ih=0x92ea3333][ne=0] [1577940571820322]  [ix=16] {_ "rsn": "OTHER: 0x0", "cnt": 238, "img": "24.3.19.131805", "hash": "92ea3333ffe0c6eaa01b842e2a0f8692f6b4a748de10378423b7abc8887993d2", "flags": "active bootable confirmed"}
7143973 [ih=0x92ea3333][ne=1] [1577949942585905]  [ix=33] {_ "rsn": "SOFT", "cnt": 239, "img": "24.3.19.131805", "hash": "92ea3333ffe0c6eaa01b842e2a0f8692f6b4a748de10378423b7abc8887993d2", "flags": "active bootable confirmed"}
7143978 [ih=0x92ea3333][ne=2] [1577951411585993]  [ix=50] {_ "rsn": "SOFT", "cnt": 240, "img": "24.3.19.131805", "hash": "92ea3333ffe0c6eaa01b842e2a0f8692f6b4a748de10378423b7abc8887993d2", "flags": "active bootable confirmed"}
7143984 [ih=0x92ea3333][ne=3] [1577953577609410]  [ix=67] {_ "rsn": "SOFT", "cnt": 241, "img": "24.3.19.131805", "hash": "92ea3333ffe0c6eaa01b842e2a0f8692f6b4a748de10378423b7abc8887993d2", "flags": "active bootable confirmed"}
7143989 [ih=0x92ea3333][ne=4] [1577956085609419]  [ix=84] {_ "rsn": "WDOG", "cnt": 242, "img": "24.3.19.131805", "hash": "92ea3333ffe0c6eaa01b842e2a0f8692f6b4a748de10378423b7abc8887993d2", "flags": "active bootable confirmed"}

Tests performed

  • NRF52840 slinky logs fine as it is
  • NRF52840 slinky logs fine with LOG_FLAGS_TLV_SUPPORT and LOG_TLV_NUM_ENTRIES support
  • NRF52840 slinky logs fine with downgrading firmware with no LOG_FLAGS_TLV_SUPPORT and LOG_TLV_NUM_ENTRIES support

@andrzej-kaczmarek
Copy link
Contributor

Would be good to explain what's the purpose of logging number of entries in a log itself.

Since you introduce new log version which is not backwards compatible, why no just revamp this so we don't need to add new log version and bunch of ifdefs every time a new field is added, e.g. by using tlv.

@vrahane
Copy link
Contributor Author

vrahane commented Mar 26, 2024

The purpose of logging number of entries in the entry itself is that we do not have to walk all the entries to count the number of entries. For a large number of log entries, walking logs serially to find the end counting number of entries is little slow. So, this gives us a way to persist number of entries and retrieve number of entries whenever needed without having to walk the log.

This change is supposed to be backwards compatible since the base header is the same 15 bytes but somehow entries with log version 3 are not getting read correctly. I am still figuring out why that is happening. We wanted to open a PR with changes sooner than later and hence I thought the backwards compatibility fix could be added later on in a follow on PR. It does not require any format/log header changes as such.

The ifdefs can actually be removed to be similar to image hash. The only reason I kept them is to save us some stack space and code space.

TLV would be a good idea but wouldn't that require too many changes to the logging code ? I wanted to keep the change backwards compatible.

@vrahane
Copy link
Contributor Author

vrahane commented Mar 26, 2024

If we want to make this change such that it can log various entries of different types, flags could be utilized but this header in itself won't be enough to make the log format generic enough, perhaps we could agree upon a log header/trailer format and define its functional aspect. I am open to making some changes to the way it works but do keep in mind we want to keep this change backwards compatible with log version 3.

@andrzej-kaczmarek
Copy link
Contributor

If the base header is the same, you don't need the log version 4 in the first place. The new struct for v4 is redundant and also invalid since counter doesn't need to follow img hash - this depends on flags.

@vrahane
Copy link
Contributor Author

vrahane commented Mar 27, 2024

Yes, that’s what I have been thinking as well. I think the log version 3 was added at the time to add support for flags. So, I should be able to just add a field to the same header. Haven’t gotten to this yet again. Will let you know after I try it again and will update this PR accordingly.

@vrahane vrahane closed this Apr 2, 2024
@vrahane vrahane reopened this Apr 2, 2024
@vrahane
Copy link
Contributor Author

vrahane commented Apr 3, 2024

@andrzej-kaczmarek I did try this out, keeping the same log version works fine with upgrades but downgrades do not work as the header now has an extra field which the old firmware does not know about which is fine in a way. I am thinking of adding the num_entries field after the log entry body, that way downgrades will work fine as well. Another option which we spoke about internally was to read the logs at bootup and populate number of entries and keep incrementing in RAM which is doable but then there are corner cases to worry about. What are your thoughts on it ?

@vrahane vrahane changed the title sys/log: Add number of entries support in log and add log version 4 sys/log: Add number of entries support in log Apr 3, 2024
@kasjer kasjer added the LOGS label May 18, 2024
@vrahane vrahane force-pushed the vipul/logging_num_entries branch 4 times, most recently from b5ff0a6 to 32c6f41 Compare June 3, 2024 20:01
- Add number of entries support in logs
- Add TLV for number of entries support
- Also update log_shell and add syscfgs to support the same.
- Added support in fcb2 and cbmem for number of entries
- This makes it such that log entries are backwards and forwards
  compatible.
- Number of entries can be read using the console or newtmgr
- This is an optional feature enabled only if
  LOG_FLAGS_TLV_SUPPORT: 1 and LOG_TLV_NUM_ENTRIES: 1
- fcb2 and config_fcb2 and log_fcb2 should be selectable via the target
  using syscfgs, it does not need to be specific to the app
@vrahane vrahane force-pushed the vipul/logging_num_entries branch from 6784178 to 73d13af Compare June 4, 2024 20:39
@vrahane vrahane force-pushed the vipul/logging_num_entries branch 3 times, most recently from 7f369d3 to f9683b0 Compare June 4, 2024 23:05
@vrahane vrahane force-pushed the vipul/logging_num_entries branch 3 times, most recently from 011d31f to 6b97bfd Compare June 8, 2024 06:58
- Earlier these would only account for the entries, now they
  also account for headers and trailers
@vrahane vrahane force-pushed the vipul/logging_num_entries branch 2 times, most recently from a5029e5 to cb9ab28 Compare June 11, 2024 01:37
@vrahane vrahane force-pushed the vipul/logging_num_entries branch from cb9ab28 to 4760b88 Compare June 11, 2024 22:10
@vrahane
Copy link
Contributor Author

vrahane commented Jun 12, 2024

@sjanc @andrzej-kaczmarek @kasjer Can you please start reviewing this ?

Copy link
Contributor

@kasjer kasjer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TLV implementation looks suspicious to me.

Lets consider how log entry was stored in FCB

[2 byte len] managed by fcb code
[8 byte] time stamp
[4 byte index] that would be what num_entreis is not due to global log indexing
[1 byte] module
[1 byte] level
[1 byte] type/flags
[n bytes] of log body
[1 byte] crc managed by fcb

Now it would look like this

[2 byte len] managed by fcb code
[8 byte] time stamp
[4 byte index] that would be what num_entreis is not due to global log indexing
[1 byte] module
[1 byte] level
[1 byte] type/flags with TLV bit set
[n bytes] of log body
[6 bytes] for TLV with num_entries
[1 byte] crc managed by fcb

Code assumes that TLV start 6 bytes from the end of the body which is true if this is the only TLV with known size.
It there was more entries in TLV style possibly with dynamic size as is allowed by TLV there would be no way of finding place where all TLVs are.

Maybe if TLVs were place before normal log body it would be more future proof

sys/log/full/include/log/log.h Outdated Show resolved Hide resolved
sys/log/full/include/log/log.h Show resolved Hide resolved
sys/log/full/include/log/log.h Show resolved Hide resolved
@vrahane
Copy link
Contributor Author

vrahane commented Jun 17, 2024

@kasjer Thanks for taking a look at this, please see below:

TLV implementation looks suspicious to me.

Lets consider how log entry was stored in FCB

[2 byte len] managed by fcb code
[8 byte] time stamp
[4 byte index] that would be what num_entreis is not due to global log indexing
[1 byte] module
[1 byte] level
[1 byte] type/flags
[n bytes] of log body
[1 byte] crc managed by fcb
Now it would look like this

[2 byte len] managed by fcb code
[8 byte] time stamp
[4 byte index] that would be what num_entreis is not due to global log indexing
[1 byte] module
[1 byte] level
[1 byte] type/flags with TLV bit set
[n bytes] of log body
[6 bytes] for TLV with num_entries
[1 byte] crc managed by fcb
Code assumes that TLV start 6 bytes from the end of the body which is true if this is the only TLV with known size.
It there was more entries in TLV style possibly with dynamic size as is allowed by TLV there would be no way of finding place where all TLVs are.

That is a good point. There is a limitation in our current log implementation since the header itself does not contain the length and log entries are of variable length. The length of the log entry can only be known either from the fcb header or the cbmem header after it is written. For backwards/forwards compatibility this is important.

So, the TLVs have to start from the end of the actual body of the log entry.

Maybe if TLVs were place before normal log body it would be more future proof

The way the code is structured and the log entry in flash is accessed if the log entry has to be backwards/forwards compatible the TLV cannot be stored before the entry. That is exactly what I was doing earlier but it breaks compatibility with old format which we do not want in case of repeated upgrades/downgrades.

Well, one way to mitigate this concern is to do the following:

  • Have reverse TLVS which means this:
> [2 byte len] managed by fcb code
> [8 byte] time stamp
> [4 byte index] that would be what num_entries is not due to global log indexing
> [1 byte] module
> [1 byte] level
> [1 byte] type/flags
> [n bytes] of log body
> [4 bytes] for num_entries
> [2 bytes] for tag and length in TLV
> [1 byte] crc managed by fcb

So, what will happen in this case is while reading we would read at the end of the fcb entry, read the TLV and interpret the value and we can keep reading TLVs in a repeated fashion in that way so the dynamic nature of TLVs is preserved.

Let me know if that seems like a good approach ?

Copy link
Contributor

@kasjer kasjer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced that it will help to improve whatever it aims to do.

Code was tested with spi flash and logs generated before PR was applied were not correctly dumped with new code. Logs with new content was not correctly dumped by old code. So I can't say I saw any backward compatibility nor future change readiness.
Placement of TLV at the end does not look like it can be extended in the future.

sys/log/full/include/log/log.h Show resolved Hide resolved
sys/log/full/src/log_fcb.c Outdated Show resolved Hide resolved
sys/log/full/src/log_fcb.c Outdated Show resolved Hide resolved
sys/log/full/src/log_fcb.c Outdated Show resolved Hide resolved
sys/log/full/src/log.c Outdated Show resolved Hide resolved
sys/log/full/src/log_fcb.c Outdated Show resolved Hide resolved
uint16_t offset = 0;
uint16_t padding = 0;

if (body_len > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If whole body was written together with header trailer would not be written at all. (Maybe could happen on STM32H7 where alignment is 32 bytes)

sys/log/full/src/log_fcb2.c Outdated Show resolved Hide resolved
sys/log/full/src/log.c Show resolved Hide resolved
sys/log/full/src/log.c Outdated Show resolved Hide resolved
@vrahane vrahane force-pushed the vipul/logging_num_entries branch 5 times, most recently from 46cfdae to 1147dc9 Compare July 29, 2024 17:51
@vrahane vrahane force-pushed the vipul/logging_num_entries branch from 1147dc9 to 2c38d9e Compare July 29, 2024 19:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants