RDB: style enhancement for rdb load #1839

Merged
merged 4 commits into from
Oct 20, 2023

@mapleFU mapleFU commented Oct 19, 2023

After reviewing #1798: the patch is great; here are just a few style enhancements:

  1. RDBStream::Read returns StatusOr<size_t>, but when the status is OK the size always equals the requested length, so change it to return Status.
  2. Use move rather than copy in a few places.

Comment on lines +179 to +181
std::string read_string;
read_string.resize(len);
GET_OR_RET(stream_->Read(read_string.data(), len));
Member Author

I wonder: if this is an RdbStringStream, Read might call memcpy on a std::vector<char>, and when the size is 0, memcpy could be called to copy 0 bytes with an invalid address, which might be UB?
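To make the concern concrete: the C and C++ standard library rules (as documented for std::memcpy) treat a null or invalid pointer argument as undefined behavior even when the byte count is zero, and std::vector<char>::data() on an empty vector may return a null pointer. A minimal sketch of a guard follows; CopyBytes is a hypothetical helper for illustration, not a kvrocks function, and whether the stream implementation in this PR needs such a guard is not settled in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not part of kvrocks): copies n bytes from src to dst.
// When n == 0 it returns before calling memcpy, so a possibly-null pointer
// (for example from an empty std::vector<char>) is never passed to memcpy.
inline void CopyBytes(char *dst, const char *src, size_t n) {
  if (n == 0) return;
  std::memcpy(dst, src, n);
}
```

With this shape, a call such as CopyBytes(dst.data(), empty_vector.data(), 0) is a no-op regardless of what data() returns.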

@@ -471,7 +475,7 @@ Status RDB::saveRdbObject(int type, const std::string &key, const RedisObjValue
const auto &member_scores = std::get<std::vector<MemberScore>>(obj);
redis::ZSet zset_db(storage_, ns_);
uint64_t count = 0;
-db_status = zset_db.Add(key, ZAddFlags(0), (redis::ZSet::MemberScores *)&member_scores, &count);
+db_status = zset_db.Add(key, ZAddFlags(0), const_cast<std::vector<MemberScore> *>(&member_scores), &count);
Member Author

This is a const_cast, but I think it's OK to call it here?
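As a general C++ rule, casting away const is well defined by itself; undefined behavior arises only if the callee then modifies an object that was originally declared const. The sketch below illustrates the safe case, where the callee takes a non-const pointer but only reads. The types and functions are hypothetical stand-ins, and whether zset_db.Add writes through its pointer argument is not stated in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for illustration; not the kvrocks definition.
struct MemberScore {
  std::string member;
  double score;
};

// A callee with a non-const pointer parameter that, in this sketch, only reads.
inline uint64_t CountMembers(std::vector<MemberScore> *members) {
  return static_cast<uint64_t>(members->size());
}

// The caller only holds a const reference. The const_cast is fine here because
// CountMembers never writes through the pointer.
inline uint64_t CountViaConstRef(const std::vector<MemberScore> &members) {
  return CountMembers(const_cast<std::vector<MemberScore> *>(&members));
}
```

Compared with the C-style cast it replaces, const_cast can only change constness, so it cannot silently reinterpret the pointer as an unrelated type.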

@mapleFU mapleFU requested a review from git-hulk October 19, 2023 15:11
Member Author

mapleFU commented Oct 19, 2023

History:
 - Download (curl) error for 'http://mirrorcache-us.opensuse.org/update/leap/15.5/backports/repodata/6693d5099c437a8a40fc4c4d94d4916f050d24e533507f4e8a3360ba1d2709e3-deltainfo.xml.gz':
   Error code: Connection failed
   Error message: Failed to connect to mirrorcache-us.opensuse.org port 80 after 0 ms: Couldn't connect to server
 - Can't provide ./repodata/6693d5099c437a8a40fc4c4d94d4916f050d24e533507f4e8a3360ba1d2709e3-deltainfo.xml.gz

Hmmm

Comment on lines 53 to 66
+Status RdbFileStream::Read(char *buf, size_t len) {
   while (len) {
     size_t read_bytes = std::min(max_read_chunk_size_, len);
     ifs_.read(buf, static_cast<std::streamsize>(read_bytes));
     if (!ifs_.good()) {
-      return Status(Status::NotOK, fmt::format("read failed: {}:", strerror(errno)));
+      return {Status::NotOK, fmt::format("read failed: {}:", strerror(errno))};
     }
     check_sum_ = crc64(check_sum_, reinterpret_cast<const unsigned char *>(buf), read_bytes);
-    buf = (char *)buf + read_bytes;
+    buf = buf + read_bytes;
     DCHECK(len >= read_bytes);
     len -= read_bytes;
     total_read_bytes_ += read_bytes;
-    n += read_bytes;
   }

-  return n;
+  return Status::OK();
Member

A good change! But I think the real situation can be more complex, e.g. if EOF is reached, do we need to use istream::gcount to get the number of bytes actually read?

Member Author

Yes, you're right, let me update this.

Member Author

    ifs_.read(buf, static_cast<std::streamsize>(read_bytes));
    if (!ifs_.good()) {
      if (!ifs_.eof()) {
        return {Status::NotOK, fmt::format("read failed: {}:", strerror(errno))};
      }
      auto eof_read_bytes = static_cast<size_t>(ifs_.gcount());
      if (read_bytes != eof_read_bytes) {
        return {Status::NotOK, fmt::format("read failed: {}:", strerror(errno))};
      }
    }

I've updated it to the code above. With read(2), the call returns 0 when the file reaches EOF and -1 when it hits an error, so I think the previous code was OK. But I still don't fully understand how ifstream behaves here, even after going through cppreference, so I changed it to the code above.
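For reference, the documented behavior of std::istream::read is that when end of input is reached before the requested count has been extracted, both eofbit and failbit are set, and gcount() reports how many characters were actually stored. A small sketch to observe this follows; ReadOutcome and ReadUpTo are hypothetical names for illustration, and an in-memory std::istringstream can stand in for the file stream.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical result type for this sketch; not a kvrocks type.
struct ReadOutcome {
  size_t bytes_read;  // characters actually extracted, taken from gcount()
  bool hit_eof;       // eofbit is set: end of input was reached during a read
  bool io_error;      // badbit is set: an unrecoverable stream error occurred
};

// Attempts to read len bytes and reports what the stream state says afterwards.
inline ReadOutcome ReadUpTo(std::istream &in, char *buf, size_t len) {
  in.read(buf, static_cast<std::streamsize>(len));
  return {static_cast<size_t>(in.gcount()), in.eof(), in.bad()};
}
```

On a six-byte input, asking for four bytes twice yields a full read followed by a two-byte read with hit_eof set. A read that consumes exactly the remaining bytes does not set eofbit, so a short count on EOF is the case that needs handling.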

@@ -23,13 +23,13 @@
#include "fmt/format.h"
#include "vendor/crc64.h"

-StatusOr<size_t> RdbStringStream::Read(char *buf, size_t n) {
+Status RdbStringStream::Read(char *buf, size_t n) {
Member

After a deeper look at this function, I think it needs another change. In my view, it should not simply return an error when it hits EOF.

StatusOr<size_t> Read(char *dst, size_t len) {
    auto l = std::min(len, real_len);
    copy(src, dst, l);
    return l;
}

Member

An error should be returned only when an I/O failure occurs, not when fewer bytes are available than requested.
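The short-read semantics suggested here could look roughly like the following for an in-memory source. This is only a sketch of the idea: StringSource is a hypothetical class, and it returns a plain size_t where the real code would presumably use StatusOr<size_t>.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical in-memory source used only to illustrate short-read semantics.
class StringSource {
 public:
  explicit StringSource(std::string data) : data_(std::move(data)) {}

  // Copies at most len bytes into dst and returns the number copied.
  // Returning fewer bytes than requested (including 0 at end of input)
  // is a normal result here, not an error.
  size_t Read(char *dst, size_t len) {
    size_t n = std::min(len, data_.size() - pos_);
    if (n > 0) std::memcpy(dst, data_.data() + pos_, n);
    pos_ += n;
    return n;
  }

 private:
  std::string data_;
  size_t pos_ = 0;
};
```

Under this contract the caller decides whether a short count is acceptable, which is the usual convention for file-system style read APIs.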

Member Author

@PragmaTwice The API is designed like this; it's not a standard fs API.

I've checked the Redis RDB implementation in rio, and it provides the same semantics.

Member Author

For comparison:

static inline size_t rioRead(rio *r, void *buf, size_t len) {
    if (r->flags & RIO_FLAG_READ_ERROR) return 0;
    while (len) {
        size_t bytes_to_read = (r->max_processing_chunk && r->max_processing_chunk < len) ? r->max_processing_chunk : len;
        if (r->read(r,buf,bytes_to_read) == 0) {
            r->flags |= RIO_FLAG_READ_ERROR;
            return 0;
        }
        if (r->update_cksum) r->update_cksum(r,buf,bytes_to_read);
        buf = (char*)buf + bytes_to_read;
        len -= bytes_to_read;
        r->processed_bytes += bytes_to_read;
    }
    return 1;
}

Member Author

I think if we need an fs API, we can wrap an API in the style of

StatusOr<size_t> Read(buf, size_t);
StatusOr<Buffer> Read(..);
StatusOr<Buffer> ReadAt(..);

as the underlying implementation, but in this case I think RDB guarantees that the size will match.
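One possible reading of that layering, sketched under assumptions: a short-read primitive sits underneath, and an exact-read wrapper on top treats a premature end of input as a failure, which matches the all-or-nothing contract of rioRead. ShortRead, ReadExact, and MakeChunkedSource are hypothetical names, and a bool stands in for Status.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <string>
#include <utility>

// Hypothetical short-read primitive: copies up to len bytes into the buffer
// and returns the number copied (0 at end of input).
using ShortRead = std::function<size_t(char *, size_t)>;

// Returns true only if exactly len bytes were read. Running out of input
// before len bytes arrive is reported as a failure.
inline bool ReadExact(const ShortRead &read_some, char *buf, size_t len) {
  while (len > 0) {
    size_t n = read_some(buf, len);
    if (n == 0) return false;
    buf += n;
    len -= n;
  }
  return true;
}

// Demo source that hands out at most `chunk` bytes per call.
inline ShortRead MakeChunkedSource(std::string data, size_t chunk) {
  size_t pos = 0;
  return [data = std::move(data), pos, chunk](char *dst, size_t len) mutable -> size_t {
    size_t n = std::min({len, chunk, data.size() - pos});
    if (n > 0) std::memcpy(dst, data.data() + pos, n);
    pos += n;
    return n;
  };
}
```

With a six-byte source that yields two bytes per call, ReadExact for four bytes succeeds after two underlying reads, and a following request for four more bytes fails because only two remain.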

Member

I see. LGTM.

@PragmaTwice PragmaTwice merged commit a618574 into apache:unstable Oct 20, 2023
28 checks passed
@mapleFU mapleFU deleted the rdb/tiny-code-style-change branch October 20, 2023 10:00
3 participants