sys/coding: add XOR based coding module #17045

Merged: 2 commits, Feb 7, 2023

Conversation

benpicco
Contributor

Contribution description

This implements the XOR based error-correction code described by @jue89 at the RIOT Summit.

A parity byte is generated for each 3 payload bytes, then the payload array is transposed by interpreting it as a 2D matrix with height of 3.

This is to reduce the chance of consecutive bytes ending up in the same packet.
With this it is possible to recover one in 3 lost data packets (if parity packets are received).

I'm not sure if the transpose function is ideal for mixing the bytes as it still generates a rather predictable pattern and we can't recover from two consecutive lost packets, but this might require more advanced codes (e.g. raptor codes).
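A minimal sketch of the scheme described above, assuming one parity byte per group of 3 payload bytes and an out-of-place transpose. The `sketch_*` helper names and the exact index mapping are assumptions for illustration, not the API or layout of the module added in this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GROUP 3 /* payload bytes covered by one parity byte (from the description) */

/* Hypothetical helper: parity[i] is the XOR of payload bytes 3i, 3i+1, 3i+2.
 * len must be a multiple of GROUP. */
static void sketch_gen_parity(const uint8_t *payload, size_t len, uint8_t *parity)
{
    assert(len % GROUP == 0);
    for (size_t i = 0; i < len / GROUP; i++) {
        parity[i] = (uint8_t)(payload[GROUP * i] ^ payload[GROUP * i + 1]
                              ^ payload[GROUP * i + 2]);
    }
}

/* Hypothetical helper: read the payload as len/3 rows of 3 bytes and write it
 * out as 3 rows of len/3 bytes, so the bytes of one group end up len/3 apart. */
static void sketch_transpose(const uint8_t *in, size_t len, uint8_t *out)
{
    size_t width = len / GROUP;

    assert(len % GROUP == 0);
    for (size_t i = 0; i < width; i++) {
        for (size_t k = 0; k < GROUP; k++) {
            out[k * width + i] = in[GROUP * i + k];
        }
    }
}

/* Hypothetical helper: rebuild the single lost byte at index `lost` from the
 * two surviving bytes of its group and the group's parity byte. */
static uint8_t sketch_recover_byte(const uint8_t *payload, const uint8_t *parity,
                                   size_t lost)
{
    size_t group = lost / GROUP;
    uint8_t value = parity[group];

    for (size_t k = 0; k < GROUP; k++) {
        size_t idx = GROUP * group + k;
        if (idx != lost) {
            value ^= payload[idx];
        }
    }
    return value;
}
```

With this mapping, a packet that carries a contiguous slice of at most len/3 transposed bytes holds at most one byte of any parity group, so losing that single packet costs each group no more than one byte.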

Testing procedure

So far only unit tests have been added to unittests/tests-coding.
This should be the basis of future work to make firmware updates more reliable / efficient by transmitting parity data along with the payload to recover lost packets without retransmissions.

Issues/PRs references

@benpicco benpicco requested a review from miri64 as a code owner October 24, 2021 12:25
@benpicco benpicco requested review from maribu and jue89 October 24, 2021 12:25
@github-actions github-actions bot added Area: sys Area: System Area: tests Area: tests and testing framework labels Oct 24, 2021
@benpicco benpicco force-pushed the coding/xor branch 2 times, most recently from dc5f61c to 573c7d6 Compare October 24, 2021 12:35
Member

@maribu maribu left a comment

only did a quick peek, but looks good so far. I will try to review in detail as soon as possible

@benpicco benpicco force-pushed the coding/xor branch 2 times, most recently from 73e8e11 to c1c006a Compare October 24, 2021 12:57
@github-actions github-actions bot added the Area: doc Area: Documentation label Oct 24, 2021
@jue89
Contributor

jue89 commented Oct 24, 2021

Hey!

Thank you for picking up this topic! It has been a while since I was last active in RIOT development. Sorry!

I had a quick look at your PR. I wasn't able to verify it in depth yet, since my own attempt organizes the layout into structs instead of a stream of data with virtual chunks. But maybe a quick description of what I did so far helps you check whether our ideas are in line.

The implementation has been carried out by XORing chunks of data (64 bytes if I remember correctly). Every chunk is transmitted in one UDP packet.

This is the XOR method I implemented
/**
 * @brief   Helper function for XORing chunks
 *
 * @param[in]  chksize   Size of one chunk in byte
 * @param[out] dst       Destination chunk
 * @param[in]  srccnt    Count of input chunks
 * @param[in]  src       Array of pointer to the input chunks
 */
void mota_block_chunk_xor(size_t chksize, uint8_t * dst, size_t srccnt, const uint8_t * src[])
{
    size_t i, j;

    // Copy first block
    memcpy(dst, src[0], chksize);

    // XOR following blocks
    for (i = 1; i < srccnt; i++) {
        for (j = 0; j < chksize; j++) {
            dst[j] ^= src[i][j];
        }
    }
}
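To illustrate why one XOR helper serves both encoding and repair, here is a small sketch: the parity chunk is the XOR of all data chunks, and XORing the surviving chunks with the parity yields the missing one. `chunk_xor()` repeats the logic of `mota_block_chunk_xor()` so the example compiles on its own; `DEMO_CHUNK_SIZE` and the demo function are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CHUNK_SIZE 4 /* hypothetical, kept small for readability */

/* Same logic as mota_block_chunk_xor(): dst = src[0] ^ src[1] ^ ... */
static void chunk_xor(size_t chksize, uint8_t *dst, size_t srccnt, const uint8_t *src[])
{
    assert(srccnt > 0);
    memcpy(dst, src[0], chksize);
    for (size_t i = 1; i < srccnt; i++) {
        for (size_t j = 0; j < chksize; j++) {
            dst[j] ^= src[i][j];
        }
    }
}

/* Returns 0 if chunk b was rebuilt correctly from a, c and the parity chunk */
static int demo_recover_middle_chunk(void)
{
    const uint8_t a[DEMO_CHUNK_SIZE] = { 0x01, 0x02, 0x03, 0x04 };
    const uint8_t b[DEMO_CHUNK_SIZE] = { 0x10, 0x20, 0x30, 0x40 };
    const uint8_t c[DEMO_CHUNK_SIZE] = { 0xaa, 0xbb, 0xcc, 0xdd };
    uint8_t parity[DEMO_CHUNK_SIZE];
    uint8_t rebuilt[DEMO_CHUNK_SIZE];

    /* Sender side: parity = a ^ b ^ c */
    const uint8_t *all[] = { a, b, c };
    chunk_xor(DEMO_CHUNK_SIZE, parity, 3, all);

    /* Receiver side, chunk b lost: a ^ c ^ parity == b */
    const uint8_t *survivors[] = { a, c, parity };
    chunk_xor(DEMO_CHUNK_SIZE, rebuilt, 3, survivors);

    return memcmp(rebuilt, b, DEMO_CHUNK_SIZE);
}
```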

The overall picture is:

  • 1 block has m columns
  • 1 column has n data chunks and 1 parity chunk. Parity is calculated across one column.
  • 1 chunk has x bytes

So one block holds m * n * x bytes of data plus m * x bytes of parity.

Furthermore, I also introduced the term channel chunks: these are the data chunks plus the parity chunks. Every block has m * n data chunks and m * (n+1) channel chunks.

/**
 * @brief   The smallest unit holding data
 */
typedef struct {
    uint8_t data[CONFIG_MOTA_CHUNK_SIZE];
    int defined;  /**< Set to non-zero if data is valid */
} mota_chunk_t;

/**
 * @brief   One column contains 1..n data chunks and 1 parity chunk
 */
typedef struct {
    mota_chunk_t d[CONFIG_MOTA_DATA_CHUNKS_PER_COL]; /**< data chunks */
    mota_chunk_t p;                                  /**< parity */
} mota_col_t;

/**
 * @brief   One block consists of 1..m columns
 */
typedef struct {
    mota_col_t col[CONFIG_MOTA_COLS_PER_BLOCK];
} mota_block_t;
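The sizes from the overall picture can be checked with a short sketch. The three configuration values below are assumptions (only the 64-byte chunk size is mentioned in this thread, and only from memory); the structs repeat the definitions above so the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration; the real configuration is not stated here */
#define CONFIG_MOTA_CHUNK_SIZE          64 /* x */
#define CONFIG_MOTA_DATA_CHUNKS_PER_COL 3  /* n */
#define CONFIG_MOTA_COLS_PER_BLOCK      4  /* m */

typedef struct {
    uint8_t data[CONFIG_MOTA_CHUNK_SIZE];
    int defined;
} mota_chunk_t;

typedef struct {
    mota_chunk_t d[CONFIG_MOTA_DATA_CHUNKS_PER_COL];
    mota_chunk_t p;
} mota_col_t;

typedef struct {
    mota_col_t col[CONFIG_MOTA_COLS_PER_BLOCK];
} mota_block_t;

/* m * n * x bytes of data per block */
static size_t block_data_bytes(void)
{
    return (size_t)CONFIG_MOTA_COLS_PER_BLOCK * CONFIG_MOTA_DATA_CHUNKS_PER_COL
           * CONFIG_MOTA_CHUNK_SIZE;
}

/* m * x bytes of parity per block */
static size_t block_parity_bytes(void)
{
    return (size_t)CONFIG_MOTA_COLS_PER_BLOCK * CONFIG_MOTA_CHUNK_SIZE;
}

/* m * (n + 1) channel chunks per block */
static size_t block_channel_chunks(void)
{
    return (size_t)CONFIG_MOTA_COLS_PER_BLOCK * (CONFIG_MOTA_DATA_CHUNKS_PER_COL + 1);
}

/* RAM for one block: data + parity plus the per-chunk `defined` flags and padding */
static size_t block_ram_bytes(void)
{
    size_t ram = sizeof(mota_block_t);

    assert(ram >= block_data_bytes() + block_parity_bytes());
    return ram;
}
```

Under these assumed values a block carries 768 bytes of data and 256 bytes of parity in 16 channel chunks. The exact `sizeof(mota_block_t)` depends on the platform's `int` size and struct padding.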

The procedure is:

1. Receive all the channel chunks of the block row-wise
static inline size_t idx2col(size_t n) {
    return n % CONFIG_MOTA_COLS_PER_BLOCK;
}

static inline size_t idx2row(size_t n) {
    return n / CONFIG_MOTA_COLS_PER_BLOCK;
}

/**
 * @brief   Copies given channel chunk into the block
 *
 * @param[out] block     Block instance
 * @param[in]  n         Channel chunk index
 * @param[in]  data      Channel chunk data
 *
 * @return  0 on success, -EOVERFLOW if n is out of bounds
 */
int mota_block_cchunk_write(mota_block_t * block, size_t n, const uint8_t * data) {
    size_t c = idx2col(n);
    size_t r = idx2row(n);

    // Data chunk
    if (r < CONFIG_MOTA_DATA_CHUNKS_PER_COL) {
        memcpy(block->col[c].d[r].data, data, CONFIG_MOTA_CHUNK_SIZE);
        block->col[c].d[r].defined = 1;
        return 0;
    }

    // Parity chunk
    if (r == CONFIG_MOTA_DATA_CHUNKS_PER_COL) {
        memcpy(block->col[c].p.data, data, CONFIG_MOTA_CHUNK_SIZE);
        block->col[c].p.defined = 1;
        return 0;
    }

    // Index out of bounds
    return -EOVERFLOW;
}
2. Check and repair the block if there are missing chunks
/**
 * @brief   Tries to calc missing data chunks from parity
 *
 * @param[out] block     Block instance
 *
 * @return  Count of chunks that couldn't be repaired
 */
int mota_block_repair(mota_block_t * block) {
    int rc = 0;

    for (size_t c = 0; c < CONFIG_MOTA_COLS_PER_BLOCK; c++) {
        // Check if the column must be repaired
        size_t r;
        size_t i;
        size_t missing = 0;
        const uint8_t * known_chunks[CONFIG_MOTA_DATA_CHUNKS_PER_COL];
        uint8_t * unknown_chunk = NULL;

        for (r = 0; r < CONFIG_MOTA_DATA_CHUNKS_PER_COL; r++) {
            if (block->col[c].d[r].defined == 0) {
                missing += 1;
            }
        }

        // Everything is fine :)
        if (missing == 0) {
            continue;
        }

        // We cannot repair this column :(
        if (missing > 1 || !block->col[c].p.defined) {
            rc += missing;
            DEBUG_PUTS("[mota-block] Cannot restore chunk(s)");
            continue;
        }

        // We can repair the missing data chunk
        i = 0;
        for (r = 0; r < CONFIG_MOTA_DATA_CHUNKS_PER_COL; r++) {
            if (block->col[c].d[r].defined == 0) {
                unknown_chunk = block->col[c].d[r].data;
                block->col[c].d[r].defined = 1;
                DEBUG("[mota-block] Restore chunk at col %u row %u\n", (unsigned)c, (unsigned)r);
            } else {
                known_chunks[i++] = block->col[c].d[r].data;
            }
        }
        known_chunks[i++] = block->col[c].p.data;
        mota_block_chunk_xor(CONFIG_MOTA_CHUNK_SIZE, unknown_chunk, i, known_chunks);
    }

    return rc;
}
3. If step 2 returned `0`: Fetch the data chunks from the block for further processing
/**
 * @brief   Retrieve pointer to data chunk
 *
 * @param[in]  block     Block instance
 * @param[in]  n         Data chunk index
 *
 * @return  Address to chunk or NULL if chunk is missing
 */
uint8_t * mota_block_get_dchunk_ptr(mota_block_t * block, size_t n) {
    size_t c = idx2col(n);
    size_t r = idx2row(n);

    // Row index out of bounds
    if (r >= CONFIG_MOTA_DATA_CHUNKS_PER_COL) {
        return NULL;
    }

    if (block->col[c].d[r].defined == 0) {
        return NULL;
    } else {
        return block->col[c].d[r].data;
    }
}
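One consequence of receiving the channel chunks row-wise can be sketched as follows: consecutive channel chunk indices land in different columns, so a burst of up to m consecutive lost chunks hits every column at most once and stays repairable. `COLS`, `DATA_ROWS` and `burst_is_repairable()` are hypothetical and only model the repair rule from step 2 (a column is repairable if it lost at most one of its channel chunks).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define COLS           4 /* m, assumed for illustration */
#define DATA_ROWS      3 /* n, assumed for illustration */
#define CHANNEL_CHUNKS (COLS * (DATA_ROWS + 1))

/* Same mapping as idx2col() above, with the assumed column count */
static size_t idx2col(size_t n)
{
    return n % COLS;
}

/* Returns true if losing channel chunks [first, first + len) still leaves
 * every column with at most one missing chunk. */
static bool burst_is_repairable(size_t first, size_t len)
{
    unsigned lost_per_col[COLS] = { 0 };

    assert(first + len <= CHANNEL_CHUNKS);
    for (size_t n = first; n < first + len; n++) {
        lost_per_col[idx2col(n)]++;
    }
    for (size_t c = 0; c < COLS; c++) {
        if (lost_per_col[c] > 1) {
            return false;
        }
    }
    return true;
}
```

With these values, `burst_is_repairable(2, 4)` holds because chunks 2 to 5 map to columns 2, 3, 0 and 1, while a burst of 5 starting at chunk 2 hits column 2 twice and cannot be repaired.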

I hope this description helps.

@benpicco benpicco requested a review from chrysn October 25, 2021 15:08
@fjmolinas fjmolinas added this to the Release 2022.01 milestone Nov 18, 2021
@benpicco
Contributor Author

benpicco commented Dec 1, 2021

Sorry for the long time it took me to reply, I had to think about this for a bit (and then DOSE kept me busy 😉)

Do I get it right that you can reconstruct one in 64 (CONFIG_MOTA_CHUNK_SIZE) packets?
Also how large are CONFIG_MOTA_DATA_CHUNKS_PER_COL and CONFIG_MOTA_COLS_PER_BLOCK?

I tried to keep the memory footprint low with my implementation; a drawback, however, is that I can't recover if two consecutive chunks are lost (I can recover 1 in 4 chunks).

How much RAM do you need for one mota_block_t?

@fjmolinas
Contributor

ping @jue89 :)

@OlegHahm OlegHahm added the State: waiting for maintainer State: Action by a maintainer is required label Mar 9, 2022
@benpicco benpicco removed this from the Release 2022.04 milestone Mar 28, 2022
@stale

stale bot commented Nov 2, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.

@stale stale bot added the State: stale State: The issue / PR has no activity for >185 days label Nov 2, 2022
@chrysn
Member

chrysn commented Nov 2, 2022

There have been recent chat exchanges on how this might look in the big picture of CoAP, so maybe it's not as stale as the bot makes it sound.

@stale stale bot removed the State: stale State: The issue / PR has no activity for >185 days label Nov 10, 2022
@benpicco
Contributor Author

benpicco commented Feb 2, 2023

Some feedback on the algorithm would be great though, otherwise I would say this is pretty stale.

Member

@maribu maribu left a comment

API looks nice. I didn't really check the mathematics behind the C code, but the unit tests look reassuring that it does what is claimed.

@benpicco
Contributor Author

benpicco commented Feb 3, 2023

I was hoping @jue89 could chime in on whether his algorithm performs any better than mine.
I can only recover one in three packets, so that seems pretty poor. And if 2 consecutive packets are lost, there is only a 33% chance of being able to recover them.

@maribu
Member

maribu commented Feb 3, 2023

I thought this was intentionally a less capable but more lightweight alternative to Reed-Solomon codes... IMO this is still useful for when RS is too large.

This implements the XOR based error-correction code described by
Jürgen Fitschen (@jue89) at the RIOT Summit [0].

A parity byte is generated for each 3 payload bytes, then the payload array
is transposed by interpreting it as a 2D matrix with height of 3.

This is to reduce the chance of consecutive bytes ending up in the same
packet.

This makes it possible to recover one in 3 lost data packets (if parity packets are received).

[0] https://summit.riot-os.org/2021/wp-content/uploads/sites/16/2021/09/s02-01.pdf
@benpicco benpicco added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed State: waiting for maintainer State: Action by a maintainer is required labels Feb 6, 2023
@riot-ci

riot-ci commented Feb 6, 2023

Murdock results

✔️ PASSED

24cb2da tests/coding: add tests for XOR codding

Success Failures Total Runtime
6851 0 6851 12m:28s

Artifacts

This only reflects a subset of all builds from https://ci-prod.riot-os.org. Please refer to https://ci.riot-os.org for a complete build for now.

@benpicco
Contributor Author

benpicco commented Feb 6, 2023

bors merge

bors bot added a commit that referenced this pull request Feb 6, 2023
17045: sys/coding: add XOR based coding module r=benpicco a=benpicco



19248: cpu/gd32v: add periph_dac support r=benpicco a=gschorcht

### Contribution description

This PR provides the `periph_dac` support for GD32VF103.

### Testing procedure

`tests/periph_dac` should work on `sipeed-longan-nano` port on PA4 and PA5.

### Issues/PRs references

19251: tests/driver_dac_dds: fix output of sine and saw functions r=benpicco a=benpicco



Co-authored-by: Benjamin Valentin <benjamin.valentin@ml-pa.com>
Co-authored-by: Gunar Schorcht <gunar@schorcht.net>
@bors
Contributor

bors bot commented Feb 6, 2023

This PR was included in a batch that was canceled, it will be automatically retried

@benpicco
Contributor Author

benpicco commented Feb 7, 2023

bors merge

bors bot added a commit that referenced this pull request Feb 7, 2023
17045: sys/coding: add XOR based coding module r=benpicco a=benpicco



19251: tests/driver_dac_dds: fix output of sine and saw functions r=benpicco a=benpicco



19254: cpu/gd32v: add periph_rtc_mem support r=benpicco a=gschorcht

### Contribution description

This PR provides the `periph_rtc_mem` support for GD32VF103.

A modified version of this driver could also be used for STM32F1.

### Testing procedure

`tests/periph_rtt` should work on any GD32V board, for example:
```
BOARD=sipeed-longan-nano make -C tests/periph_rtt flash
```
```
Help: Press s to start test, r to print it is ready
START
main(): This is RIOT! (Version: 2023.04-devel-319-gebc86-cpu/gd32v/periph_rtc_mem)

RIOT RTT low-level driver test
RTT configuration:
RTT_MAX_VALUE: 0xffffffff
RTT_FREQUENCY: 32768

Testing the tick conversion
Trying to convert 1 to seconds and back
Trying to convert 256 to seconds and back
Trying to convert 65536 to seconds and back
Trying to convert 16777216 to seconds and back
Trying to convert 2147483648 to seconds and back
All ok

Initializing the RTT driver
RTC mem OK
This test will now display 'Hello' every 5 seconds

RTT now: 1
Setting initial alarm to now + 5 s (163841)
rtt_get_alarm() PASSED
RTC mem OK
```

### Issues/PRs references

Co-authored-by: Benjamin Valentin <benjamin.valentin@ml-pa.com>
Co-authored-by: Gunar Schorcht <gunar@schorcht.net>
@bors
Contributor

bors bot commented Feb 7, 2023

Build failed (retrying...):

bors bot added a commit that referenced this pull request Feb 7, 2023
17045: sys/coding: add XOR based coding module r=benpicco a=benpicco



Co-authored-by: Benjamin Valentin <benjamin.valentin@ml-pa.com>
@bors
Contributor

bors bot commented Feb 7, 2023

Build failed:

@benpicco
Contributor Author

benpicco commented Feb 7, 2023

bors merge

2 similar comments
@benpicco
Contributor Author

benpicco commented Feb 7, 2023

bors merge

@benpicco
Contributor Author

benpicco commented Feb 7, 2023

bors merge

@bors
Contributor

bors bot commented Feb 7, 2023

Already running a review

1 similar comment
@bors
Contributor

bors bot commented Feb 7, 2023

Already running a review

@bors
Contributor

bors bot commented Feb 7, 2023

Build succeeded:

@bors bors bot merged commit f341ad6 into RIOT-OS:master Feb 7, 2023
@benpicco benpicco deleted the coding/xor branch February 7, 2023 20:23
@MrKevinWeiss MrKevinWeiss added this to the Release 2023.04 milestone Apr 25, 2023