
Optimize Partition Size computation #2230

Closed
sehz opened this issue Feb 25, 2022 · 9 comments

@sehz
Contributor

sehz commented Feb 25, 2022

With #2212, we have the ability to get the size of partitions. However, the performance of the size computation can be improved: instead of querying filesystem metadata each time, it can query the segments, which already have their sizes cached.
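A minimal sketch of the idea (hypothetical names; the real Fluvio types differ): each segment keeps its own byte length in memory, so computing the partition size is just a sum of cached values rather than a filesystem metadata query per call.

```rust
// Hypothetical sketch: segments cache their byte length in memory,
// so computing the partition size never touches the filesystem.
struct Segment {
    cached_len: u64, // updated on every append; no fstat needed
}

struct Partition {
    active: Segment,
    prev: Vec<Segment>,
}

impl Partition {
    /// Partition size = active segment + all previous segments,
    /// summed from cached values.
    fn size(&self) -> u64 {
        self.active.cached_len + self.prev.iter().map(|s| s.cached_len).sum::<u64>()
    }
}

fn main() {
    let p = Partition {
        active: Segment { cached_len: 512 },
        prev: vec![Segment { cached_len: 1024 }, Segment { cached_len: 1024 }],
    };
    println!("{}", p.size());
}
```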

@sehz
Contributor Author

sehz commented Feb 25, 2022

@Afourcat interested?

@Afourcat
Contributor

Hi,
So far I have it working with the active segment, using an implementation similar to this one:

async fn get_partition_size(&self) -> Result<u64, ErrorCode> {
    let active_len = self.active_segment.get_msg_log().get_pos();

    // Work in progress: also add the previous segments' total length.
    // let total_len = {
    //     let reader = self.prev_segments.read().await;
    //
    //     active_len + reader.get_total_logs_len() as u32
    // };

    Ok(active_len.into())
}

But while adding the total_len implementation I ran into a strange error.

To reproduce:
With a topic configured with a segment size of 1024,
produce until the 1024-byte segment size is exceeded.
Then, when entering the new-segment rollout phase, I get this error:

2022-02-26T16:05:54.424067Z DEBUG handle_request{host="0.0.0.0:9010"}:respond{socket=Socket(2060) _connection=peer("127.0.0.1:39258")}:handle_produce_request{id=4 client=fluvio}:handle_produce_topic{topic=lyon}:handle_produce_partition{replica_id=lyon-0}:write_record_set:write_record_set:write_recordset:write_batch: fluvio_storage::index: opening index index_file_path="./logs/spu-logs-5001/lyon-0/00000000000000000000.index"
2022-02-26T16:05:54.424398Z ERROR handle_request{host="0.0.0.0:9010"}:respond{socket=Socket(2060) _connection=peer("127.0.0.1:39258")}:handle_produce_request{id=4 client=fluvio}:handle_produce_topic{topic=lyon}:handle_produce_partition{replica_id=lyon-0}: fluvio_spu::services::public::produce_handler: Error writing to replica: Io(
    Os {
        code: 22,
        kind: InvalidInput,
        message: "Invalid argument",
    },
) replica_id=lyon-0

Going further, it seems that the mmap fails to open the index file.
So I straced to trace the mmap call, but it seems the syscall isn't made at all.
Because of this error, self.prev_segments.add_segment(old_segment).await isn't called, so I can't fetch the old segments' sizes.

The error happens on both local and k8s clusters.

The index opening also seems to fail when stopping and restarting the SPU.

2022-02-26T16:42:56.367716Z ERROR fluvio_storage::segments: error opening segment: Io(
    Os {
        code: 22,
        kind: InvalidInput,
        message: "Invalid argument",
    },
)

Does anyone have any ideas?
Thank you,

@sehz
Contributor Author

sehz commented Feb 26, 2022

Is the problem happening with or without the commented lines?

If it happens with the commented lines, then this is probably the problem:

let reader = self.prev_segments.read().await;

Avoid locking, which could interfere with the writer. Instead, use an AtomicU64, as here:

fn get_log_start_offset(&self) -> Offset {
    let min_base_offset = self.prev_segments.min_offset();
    if min_base_offset < 0 {
        self.active_segment.get_base_offset()
    } else {
        min_base_offset
    }
}
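The AtomicU64 pattern suggested here could look roughly like this (a sketch with hypothetical names, not Fluvio's actual SharedSegments API): the shared segment list keeps a running byte total that the writer bumps on rollover and readers load without ever taking the RwLock.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Sketch of lock-free size tracking (hypothetical names): the writer
// updates the counter when a segment rolls over; readers just load it,
// so size queries never contend with the write path.
#[derive(Default)]
struct SharedSegments {
    total_logs_len: AtomicU64,
}

impl SharedSegments {
    /// Called by the writer when an old segment is added to the list.
    fn add_segment_len(&self, len: u64) {
        self.total_logs_len.fetch_add(len, Ordering::AcqRel);
    }

    /// Called by readers; never blocks the writer.
    fn get_total_logs_len(&self) -> u64 {
        self.total_logs_len.load(Ordering::Acquire)
    }
}

fn main() {
    let segs = SharedSegments::default();
    segs.add_segment_len(1024);
    segs.add_segment_len(2048);
    println!("{}", segs.get_total_logs_len());
}
```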

@Afourcat
Contributor

Hello,
The issue happens without the commented code. In fact, I also hit this issue with the stable cluster and CLI.

Release Channel      : stable
Fluvio CLI           : 0.9.20
Fluvio CLI SHA256    : 01e4d42525292d6ca924ca6f34443f7533c269c974981634bcca1f676c80fad1
Fluvio Platform      : 0.9.20 (minikube)
Git Commit           : 50446a0ec086b9d87f7800f143a8f4f4f66285cd
OS Details           : Gentoo 2.8 (kernel 5.15.5-gentoo-dist)
=== Plugin Versions ===
Fluvio Runner (fluvio-run)     : 0.0.0
Infinyon Cloud CLI (fluvio-cloud) : 0.1.6

I think this is the line that throws the "22: Invalid argument" error,
in fluvio/crates/fluvio-storage/src/index.rs:119:

        // make sure it is log file
        let (m_file, file) =
            MemoryMappedFile::open(&index_file_path, INDEX_ENTRY_SIZE as u64).await?;

2022-02-27T19:51:58.305053Z DEBUG handle_request{host="0.0.0.0:9010"}:respond{socket=Socket(1317) _connection=peer("127.0.0.1:39330")}:handle_produce_request{id=1 client=fluvio}:handle_produce_topic{topic=lyon-seg}:handle_produce_partition{replica_id=lyon-seg-0}:write_record_set:write_record_set:write_recordset:write_batch: fluvio_storage::index: opening index index_file_path="/home/alex/fluvio-logs/spu-logs-5001/lyon-seg-0/00000000000000000001.index"
2022-02-27T18:53:28.725918Z ERROR handle_request{host="0.0.0.0:9005"}:respond{socket=Socket(25) _connection=peer("172.17.0.1:22939")}:handle_produce_request{id=6 client=fluvio}:handle_produce_topic{topic=stable}:handle_produce_partition{replica_id=stable-0}: fluvio_spu::services::public::produce_handler: Error writing to replica: Io(
    Os {
        code: 22,
        kind: InvalidInput,
        message: "Invalid argument",
    },
) replica_id=stable-0

To Reproduce:

./k8-util/cluster/reset-minikube.sh
fluvio cluster start
fluvio topic create stable --segment-size 2048
fluvio produce stable
[...Producing 1000 bytes 5 times]
kubectl logs fluvio-spg-main-0
[...Error is here]

Weirdly, the index files look like this:

           size
            vv
.rw-r--r--   0 user 27 Feb 20:51 ~/fluvio-logs/spu-logs-5001/lyon-seg-0/00000000000000000000.index
.rw-r--r--   0 user 27 Feb 20:51 ~/fluvio-logs/spu-logs-5001/lyon-seg-0/00000000000000000001.index
.rw-r--r-- 10M user 27 Feb 20:51 ~/fluvio-logs/spu-logs-5001/lyon-seg-0/00000000000000000002.index
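One possible explanation for those zero-byte files (an assumption on my part, not confirmed by the thread): POSIX mmap fails with EINVAL when asked to map a length of zero, so if the index is mapped with the file's current size, a 0-byte index file would produce exactly this error. A defensive sketch (hypothetical helper, not Fluvio's actual code) would pre-size the file before mapping:

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::Path;

/// Hypothetical helper: ensure an index file is at least `min_len` bytes
/// before it is memory-mapped, since mmap with a length of 0 fails with EINVAL.
fn ensure_min_len(path: &Path, min_len: u64) -> io::Result<u64> {
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .open(path)?;
    let len = file.metadata()?.len();
    if len < min_len {
        // Extend with zeros so the mapping length is non-zero.
        file.set_len(min_len)?;
        return Ok(min_len);
    }
    Ok(len)
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("demo.index");
    let len = ensure_min_len(&path, 16)?;
    println!("{}", len);
    std::fs::remove_file(&path)?;
    Ok(())
}
```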

Could it be due to my testing environment?

Linux gentoo 5.15.5-gentoo-dist #1 SMP Thu Nov 25 18:25:42 -00 2021 x86_64 Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz GenuineIntel GNU/Linux

Unfortunately, I don't have another one.

I think I'm missing something pretty obvious.
Thank you very much,

@sehz
Contributor Author

sehz commented Feb 27, 2022

So this is the latest release? Can you create a reproducible test? You can add an e2e test, like a smoke test or a CLI test.

@Afourcat
Contributor

Afourcat commented Feb 28, 2022

So this is last release?

I tried with CLI 0.9.20, using the binary at ~/.fluvio/bin/fluvio.

Here is a bats file that I placed at: ./tests/cli/smoke_tests/e2e-segment-partitions.bats

#!/usr/bin/env bats

TEST_HELPER_DIR="$BATS_TEST_DIRNAME/../test_helper"
export TEST_HELPER_DIR

load "$TEST_HELPER_DIR"/tools_check.bash
load "$TEST_HELPER_DIR"/fluvio_dev.bash
load "$TEST_HELPER_DIR"/bats-support/load.bash
load "$TEST_HELPER_DIR"/bats-assert/load.bash

setup_file() {
    TOPIC_NAME=$(random_string)
    export TOPIC_NAME
    debug_msg "Topic name: $TOPIC_NAME"

    TOPIC_NAME_2=$(random_string)
    export TOPIC_NAME_2
    debug_msg "Topic name: $TOPIC_NAME_2"

    TOPIC_NAME_3=$(random_string)
    export TOPIC_NAME_3
    debug_msg "Topic name: $TOPIC_NAME_3"

    MESSAGE="$(random_string 1000)"
    export MESSAGE
    debug_msg "$MESSAGE"
}

teardown_file() {
    run timeout 15s "$FLUVIO_BIN" topic delete "$TOPIC_NAME"
    run timeout 15s "$FLUVIO_BIN" topic delete "$TOPIC_NAME_2"
    run timeout 15s "$FLUVIO_BIN" topic delete "$TOPIC_NAME_3"
}

@test "Create a topic" {
    debug_msg "topic: $TOPIC_NAME with segment size of 1024"
    run timeout 15s "$FLUVIO_BIN" topic create "$TOPIC_NAME" --segment-size 1024
    assert_success

    debug_msg "topic: $TOPIC_NAME with segment size of 2048"
    run timeout 15s "$FLUVIO_BIN" topic create "$TOPIC_NAME_2" --segment-size 2048
    assert_success

    debug_msg "topic: $TOPIC_NAME with segment size of 4096"
    run timeout 15s "$FLUVIO_BIN" topic create "$TOPIC_NAME_3" --segment-size 4096
    assert_success
}

@test "Produce message" {
    run bash -c 'echo "$MESSAGE" | timeout 15s "$FLUVIO_BIN" produce "$TOPIC_NAME"'
    run bash -c 'echo "$MESSAGE" | timeout 15s "$FLUVIO_BIN" produce "$TOPIC_NAME"'

    run bash -c 'echo "$MESSAGE" | timeout 15s "$FLUVIO_BIN" produce "$TOPIC_NAME_2"'
    run bash -c 'echo "$MESSAGE" | timeout 15s "$FLUVIO_BIN" produce "$TOPIC_NAME_2"'
    run bash -c 'echo "$MESSAGE" | timeout 15s "$FLUVIO_BIN" produce "$TOPIC_NAME_2"'

    run bash -c 'echo "$MESSAGE" | timeout 15s "$FLUVIO_BIN" produce "$TOPIC_NAME_3"'
    run bash -c 'echo "$MESSAGE" | timeout 15s "$FLUVIO_BIN" produce "$TOPIC_NAME_3"'
    run bash -c 'echo "$MESSAGE" | timeout 15s "$FLUVIO_BIN" produce "$TOPIC_NAME_3"'
    run bash -c 'echo "$MESSAGE" | timeout 15s "$FLUVIO_BIN" produce "$TOPIC_NAME_3"'
    run bash -c 'echo "$MESSAGE" | timeout 15s "$FLUVIO_BIN" produce "$TOPIC_NAME_3"'
    assert_success
}

@test "Check Invalid Argument Error" {
    run timeout 15s kubectl logs pod/fluvio-spg-main-0
    debug_msg "Checking for 'Invalid argument' error in SPG logs"
    refute_output --partial 'message: "Invalid argument"'
}

I think the problem comes from my testing environment, but I'm stuck: I don't know how opening a file could produce an EINVAL error.
This test only covers the case of a log-file segment rollout.
But I get the same EINVAL error when restarting an SPU, in the FileReplica::create_or_load function.

From what I've seen, this error only occurs on index files.

@sehz
Contributor Author

sehz commented Feb 28, 2022

Can you make a PR for the test, so we can see if this can be reproduced in the CI?

@Afourcat
Contributor

Afourcat commented Mar 2, 2022

Hello,
As you can see, the error also occurs in the CI:
https://github.com/infinyon/fluvio/runs/5391890697?check_suite_focus=true

I'm not sure how to proceed.

bors bot pushed a commit that referenced this issue Mar 11, 2022
This PR implements #2230.

- Adds a smoke test
- Bumps `fluvio-future` to `0.3.15`
- Replaces the file-based computation of log length with the sum of the active and previous `Segment`s' `msg_log` sizes, which removes the need for filesystem IO
- Changes tests
@sehz sehz added this to the 0.9.21 milestone Mar 14, 2022
@sehz
Contributor Author

sehz commented Mar 14, 2022

This is done!

@sehz sehz closed this as completed Mar 14, 2022