Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client read not allocate segment #172

Merged

Conversation

SeanHai
Copy link
Contributor

@SeanHai SeanHai commented Nov 26, 2020

What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

What is changed and how it works?

What's Changed:

How it Works:

Side effects(Breaking backward compatibility? Performance regression?):

Check List

  • Relevant documentation/comments is changed or added
  • I acknowledge that all my contributions will be made under the project's license

Copy link
Contributor

@wu-hanqing wu-hanqing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

再添加一下相关场景的单元测试吧。

curveCombineCtx->curveCtx.op = LIBCURVE_OP::LIBCURVE_OP_READ;
curveCombineCtx->curveCtx.cb = CurveAioCallback;

int ret = curveClient->AioRead(fd, &curveCombineCtx->curveCtx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个调用少了一个参数吧?

Copy link
Contributor Author

@SeanHai SeanHai Nov 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个调用少了一个参数吧?

改为:
int ret = curveClient->AioRead(fd, &curveCombineCtx->curveCtx, userDataType_);

// add zero data
std::string zeroData(r->rawlength_, 0);
butil::IOBuf zeroDataBuf;
zeroDataBuf.append(zeroData);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IOBuf直接有一个resize的接口,可以直接用。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IOBuf直接有一个resize的接口,可以直接用。

改为:
butil::IOBuf zeroDataBuf;
zeroDataBuf.resize(r->rawlength_, 0);


for(auto r : reqlist_) {
// fake subrequest
if(0 == r->idinfo_.lpid_ && 0 == r->idinfo_.cpid_
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个判断可以封装一下,或者直接用ChunkIDInfo::Valid?再确认一下,正常情况下chunkid、copysetid、logicalpoolid会不会等于0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个判断可以封装一下,或者直接用ChunkIDInfo::Valid?再确认一下,正常情况下chunkid、copysetid、logicalpoolid会不会等于0

改为:
ChunKIDInfo中新增 字段 bool chunkExist = true; 标记chunk不存在时添加到metaChache中的特殊ChunkIdInfo

std::string zeroData(r->rawlength_, 0);
butil::IOBuf zeroDataBuf;
zeroDataBuf.append(zeroData);
readDatas_[r->subIoIndex_] = zeroDataBuf;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里r->subIoIndex_还没赋值吧?

Copy link
Contributor Author

@SeanHai SeanHai Nov 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里r->subIoIndex_还没赋值吧?

改为readDatas_[subIoIndex] = zeroDataBuf;

ret = scheduler_->ScheduleRequest(reqlist_);
ret = ReadFromOrigin(originReadVec, fileInfo->userinfo);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里会把上一步返回的值给覆盖掉

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里会把上一步返回的值给覆盖掉

改为:
ret = scheduler_->ScheduleRequest(reqlist_);
ret += ReadFromOrigin(originReadVec, fileInfo->userinfo);
...
if (ret < 0) {
LOG(ERROR) << "split or schedule failed, return and recycle resource!";
ReturnOnFail();
}

int fd = 0;
{
std::unique_lock<std::mutex> lock(mtx_);
auto iter = fdMap_.find(fileName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IOTracker是跟踪一个用户IO请求的,fdMap_放IOTracker内部,每次都是空的。
可以考虑放到FileClient4ReadClone里面去。

而且需要添加一下,定期关闭一直没有IO的卷。

@SeanHai SeanHai force-pushed the curve_client_readNotAllocateSegment branch 5 times, most recently from 08fa2f1 to c5f812e Compare December 13, 2020 04:02
fakeReqNum++;
} else {
// read from original volume
originReadVec.push_back(r);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

改用emplace_back吧

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

改用emplace_back吧

done

// add zero data
butil::IOBuf zeroDataBuf;
zeroDataBuf.resize(r->rawlength_, 0);
r->readData_ = zeroDataBuf;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这一段,直接r->readData_.resize就可以了吧

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这一段,直接r->readData_.resize就可以了吧

done

originReadVec.push_back(r);
}

r->subIoIndex_ = subIoIndex++;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个是不是跟下面的r->subIoIndex_ = subIoIndex++重复了?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个是不是跟下面的r->subIoIndex_ = subIoIndex++重复了?

不会,这个下句就continue了

ret = scheduler_->ScheduleRequest(reqlist_);
ret += ReadFromOrigin(originReadVec, fileInfo->userinfo);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为啥是+=?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为啥是+=?

意思是这两个都返回0(成功)才会成功,否则其中一个为失败(-1)都算失败。
这两可以表达成+ 或者 ||

done->SetFailed(LIBCURVE_ERROR::FAILED);
}
delete curveCombineCtx;
done->SetFailed(LIBCURVE_ERROR::OK);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

上面SetFailed成FAILED之后,这里又Set成OK??

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

上面SetFailed成FAILED之后,这里又Set成OK??

done

// LogicalPoolCopysetIDInfo_t lpcsIDInfo;
// mdsclient_.GetOrAllocateSegment(true, 0, &fi, &sinfo);
// int count = 0;
// for (auto iter : sinfo.chunkvec) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为啥注释掉?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为啥注释掉?

这块注释掉的内容为调用GetOrAllocateSegment,获取一个segment,并且将相关信息(chunkIdInfo、copysetInfo)。注释掉的原因是:
1、这和上面 2. 设置GetOrAllocateSegmentresponse 功能类似,注释掉会在调用GetOrAllocateSegment时返回segment信息并更新metacache。
2、如果这更新了metacache,新增测试用例读普通卷(未写)和克隆卷(未写)时,需要调整offset到第二个segment,就与上面测试用例在形式上没法保持一致。如果保持原来不注释,可以先修改下offset测试看看


LOG(ERROR) << "address = " << &data;
ASSERT_EQ(0, data[0]);
ASSERT_EQ(0, data[4 * 1024 - 1]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这几个断言是为了断言data是全0的吗?可以写在循环里

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这几个断言是为了断言data是全0的吗?可以写在循环里

是的,验证全0;这里形式主要是为了和上面之前的用例在形式上保持一致。


LOG(ERROR) << "address = " << &data;
ASSERT_EQ('a', data[0]);
ASSERT_EQ('a', data[4 * 1024 - 1]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

全是a?可以写成,没分配的部分从源卷读到的是a,已分配的部分从当前的卷读到是b。方便验证。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

全是a?可以写成,没分配的部分从源卷读到的是a,已分配的部分从当前的卷读到是b。方便验证。

这里读当前卷和源卷复用是同一个mock_schedule,这个mock_schedule中是根据请求fakedata[index%10]来填充请求数据的(fakadata{'a','b',...'j'}),可以调整下用例中已写过的chunk index,来使结果看起来更明显。

LOG(ERROR) << "address = " << &data;
ASSERT_EQ('a', data[0]);
ASSERT_EQ('a', data[4 * 1024 - 1]);
ASSERT_EQ('a', data[4 * 1024]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

同上。


LOG(ERROR) << "address = " << &data;
ASSERT_EQ('a', data[0]);
ASSERT_EQ('a', data[4 * 1024 - 1]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个应该包含了上面那个全0的情况吧?是不是有点重复?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个应该包含了上面那个全0的情况吧?是不是有点重复?

上面那个用例是读的普通卷,且读的chunk全都没写过,这个是读的普通卷,部分写过,部分没写过。严格来说也可以合成下面这个

* @return 0 success; -1 fail
*/
int ReadFromOrigin(std::vector<RequestContext*> reqCtxVec,
UserInfo_t userInfo);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

userInfo改成const引用

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

userInfo改成const引用

done

@@ -34,10 +34,14 @@ CurveClient::~CurveClient() {
}

int CurveClient::Init(const std::string& configPath) {
return fileClient_->Init(configPath);
int ret = fileClient_->Init(configPath);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

configPath可以保存一份全局的,FileClient4ReadClone::GetFileClient4ReadClone().Init不用放到这里

return 0;
}

FileClient4ReadClone &fileClient4Clone =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

可以把逻辑都封装到FileClient4ReadClone里面,提供一个接口,这里直接调用就行了。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

可以把逻辑都封装到FileClient4ReadClone里面,提供一个接口,这里直接调用就行了。

done

FileClient4ReadClone& operator=(const FileClient4ReadClone &);
~FileClient4ReadClone() {}

FileClient *fileClient_ = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

用unique_ptr吧

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

用unique_ptr吧

done

// the mutex lock for fdMap_
std::mutex mtx_;
// the mapping from filename to fd
std::unordered_map<std::string, int> fdMap_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里上次讨论的时候,应该是需要加一个定时关闭的功能。一段时间不用,自动把卷close。

// with no clonesource
TEST_F(IOTrackerSplitorTest, StartReadNotAllocateSegment2) {
curvefsservice.SetGetOrAllocateSegmentFakeReturn(notallocatefakeret);
MockRequestScheduler* mockschuler = new MockRequestScheduler;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete

done


TEST_F(IOTrackerSplitorTest, AsyncStartReadNotAllocateSegment2) {
curvefsservice.SetGetOrAllocateSegmentFakeReturn(notallocatefakeret);
MockRequestScheduler* mockschuler = new MockRequestScheduler;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete

done

ret = scheduler_->ScheduleRequest(reqlist_);
ret += ReadFromOrigin(originReadVec, fileInfo->userinfo);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里为什么用+=

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里为什么用+=

意思是这两个都返回0(成功)才会成功,否则其中一个为失败(-1)都算失败。
这两可以表达成+ 或者 ||

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

用if来判断吧 这种方式太隐晦了,读起来很不友好

uint32_t fakeReqNum = 0;
std::vector<RequestContext*> originReadVec;

for (auto r : reqlist_) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里的逻辑有些分散了。
这里应该可以把所有的请求,都放到Schedule中,在Schedule::Process里面,根据差异再不同处理。

@@ -334,6 +334,8 @@ class SnapshotCloneServerTest : public ::testing::Test {

fileClient_ = new FileClient();
fileClient_->Init(kClientConfigPath);
curve::client::FileClient4ReadClone::GetFileClient4ReadClone().
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里尽量把初始化放到FileClient4ReadClone内部,不要让其他模块来做初始化。

}

r->subIoIndex_ = subIoIndex++;
continue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里不需要continue吧

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里不需要continue吧

done

} else {
LOG(ERROR) << "splitor read io failed, "
<< "offset = " << offset_ << ", length = " << length_;
}

if (ret == -1) {
if (ret < 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

上面else不是已经处理了ret<0的场景?

else {
LOG(ERROR) << "splitor read io failed, " LOG(ERROR) << "splitor read io failed, "
<< "offset = " << offset_ << ", length = " << length_; << "offset = " << offset_ << ", length = " << length_;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

上面else不是已经处理了ret<0的场景?

else {
LOG(ERROR) << "splitor read io failed, " LOG(ERROR) << "splitor read io failed, "
<< "offset = " << offset_ << ", length = " << length_; << "offset = " << offset_ << ", length = " << length_;
}

ret = scheduler_->ScheduleRequest(reqlist_);
ret += ReadFromOrigin(originReadVec, &fileInfo->userinfo);
这里都会gengxret的值,意思是两个都成功(0)才算成功

uint32_t fakeReqNum = 0;
std::vector<RequestContext*> originReadVec;

for (auto r : reqlist_) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

可以把这一段for封装一下

}
CurveAioCombineContext *curveCombineCtx = new CurveAioCombineContext();
curveCombineCtx->done = reqCtx->done_;
curveCombineCtx->curveCtx.offset = reqCtx->sourceInfo_.cloneFileOffset
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为啥不直接是reqCtx->offset_

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为啥不直接是reqCtx->offset_

reqCtx->offset_只是chunk内偏移,reqCtx->sourceInfo_.cloneFileOffset 在CalcRequestSourceInfo中会被计算出chunk偏移

ret = scheduler_->ScheduleRequest(reqlist_);
ret += ReadFromOrigin(originReadVec, fileInfo->userinfo);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里ret为什么要加?ret返回的是0或者-1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里ret为什么要加?ret返回的是0或者-1

想表达两个都成功才算成功,用+ 或 || 应该都行,下面进行 ret<0 判断

}

FileClient4ReadClone &fileClient4Clone =
FileClient4ReadClone::GetFileClient4ReadClone();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为什么需要新增一个FileClient4ReadClone的类,直接使用原来的不行吗?无论是clone还是非clone, 读的流程不都是一样的?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为什么需要新增一个FileClient4ReadClone的类,直接使用原来的不行吗?无论是clone还是非clone, 读的流程不都是一样的?

一方面在IO_tracker中FileClient无法获取,另外如果复用原FileClient在打开源卷read时 FileClient::Read中
ReadLockGuard lk(rwlock_)这个锁无法获得

@@ -299,6 +320,59 @@ class FileClient {
bvar::Adder<uint64_t> openedFileNum_;
};

class FileClient4ReadClone {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FileClient不能满足下面的接口要求吗

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FileClient不能满足下面的接口要求吗

理由同上

@@ -91,6 +91,14 @@ int RequestScheduler::ScheduleRequest(
if (running_.load(std::memory_order_acquire)) {
/* TODO(wudemiao): 后期考虑 qos */
for (auto it : requests) {
// skip the fake request
if (!it->idinfo_.chunkExist) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fake的request不是不会丢在这里面吗 为啥需要判断?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fake的request不是不会丢在这里面吗 为啥需要判断?

丢到这里的请求包括非源卷的所有请求,最初不包括fake request,改动原因是:
如果只有真实的请求,fake请求之间在IOTracker中变量时填充fakedata, 那么reqcount_.store计数就只有真实请求,
RequestScheduler::ScheduleRequest 会对reqlist逐个处理,在场景假设有64个子请求,第一个是读写过的真实请求,后面都是fake请求,当第一个真实请求返回后在IOTracker::Done中会DestoryRequestList(); 这边遍历就会访问已经释放的空间。

static_cast<uint64_t>(chunkidx) * fileinfo->chunksize,
mdsclient, metaCache, fileinfo)) {
mdsclient, metaCache, fileinfo, chunkidx)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chunkidx是用于?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chunkidx是用于?

Splitor::GetOrAllocateSegment 中更新ChunkIDInfo使用
ChunkIDInfo chunkIdInfo(0, 0, 0);
chunkIdInfo.chunkExist = false;
metaCache->UpdateChunkInfoByIndex(chunkidx, chunkIdInfo);

// for (auto iter : sinfo.chunkvec) {
// uint64_t index = (sinfo.startoffset + count*fi.chunksize )
// / fi.chunksize;
// mc->UpdateChunkInfoByIndex(index, iter);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为啥这里全部注释掉了

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为啥这里全部注释掉了

这块注释掉的内容为调用GetOrAllocateSegment,获取一个segment,并且将相关信息(chunkIdInfo、copysetInfo)。注释掉的原因是:
1、这和上面 2. 设置GetOrAllocateSegmentresponse 功能类似,注释掉会在调用GetOrAllocateSegment时返回segment信息并更新metacache。
2、如果这更新了metacache,新增测试用例读普通卷(未写)和克隆卷(未写)时,需要调整offset到第二个segment,就与上面测试用例在形式上没法保持一致。如果保持原来不注释,可以先修改下offset测试看看

@SeanHai SeanHai force-pushed the curve_client_readNotAllocateSegment branch 11 times, most recently from 548340d to 2831bea Compare December 23, 2020 02:06
/**
* close the timeout fd with timed thread
*/
int Closefd();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

既然是后台线程来删除,作为private感觉更合理

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

既然是后台线程来删除,作为private感觉更合理

done

std::mutex mtx_;
// the mapping from filename to fd_timestamp
std::unordered_map<std::string, std::string> fdMap_;
std::thread closeThread;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fdCloseThread

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fdCloseThread
done, fdCloseThread_

@@ -491,6 +501,66 @@ class CurveClient {
FileClient* fileClient_{nullptr};
};

class OriginReader {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SourceReader?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SourceReader?

done

ret = scheduler_->ScheduleRequest(reqlist_);
ret += ReadFromOrigin(originReadVec, fileInfo->userinfo);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

用if来判断吧 这种方式太隐晦了,读起来很不友好

sleeper_.interrupt();
closeThread.join();
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里打一下reader stop的日志

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里打一下reader stop的日志

done

}
CurveAioCombineContext *curveCombineCtx = new CurveAioCombineContext();
curveCombineCtx->done = reqCtx->done_;
curveCombineCtx->curveCtx.offset = reqCtx->sourceInfo_.cloneFileOffset
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里为什么需要offset相加

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里为什么需要offset相加

reqCtx->offset_只是chunk内偏移,reqCtx->sourceInfo_.cloneFileOffset 在CalcRequestSourceInfo中会被计算出chunk偏移

// the mutex lock for fdMap_
std::mutex mtx_;
// the mapping from filename to fd_timestamp
std::unordered_map<std::string, std::string> fdMap_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里注释一下first是什么 second是什么

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里注释一下first是什么 second是什么

done

std::unique_lock<std::mutex> lock(mtx_);
auto iter = fdMap_.find(fileName);
if (iter != fdMap_.end()) {
curve::common::SplitString(iter->second, FD_TIME_DELIM,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里直接封装一个函数出去吧
Getfd()

std::vector<std::string> fdtimestamp;
curve::common::SplitString(iter->second, FD_TIME_DELIM,
&fdtimestamp);
if (fdtimestamp.size() != 2) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里是第三次重复了

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里是第三次重复了

done


int fd = std::stoi(fdtimestamp[0]);
time_t timestamp = std::stol(fdtimestamp[1]);
if (time(0) - timestamp > FD_TIMEOUT) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这样减出来不是负的吗

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这样减出来不是负的吗

不会,time(0)是现在的时间,timestamp是之前访问时保存的,所以肯定大于timestamp

@@ -180,10 +180,12 @@ bool Splitor::AssignInternal(IOTracker* iotracker, MetaCache* metaCache,
metaCache->GetChunkInfoByIndex(chunkidx, &chunkIdInfo);

if (errCode == MetaCacheErrorType::CHUNKINFO_NOT_FOUND) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里应该少了一个条件,如果读请求的时候,把chunk标记为chunkExist=false,后续的写请求会拿到一个chunk不存在的ChunkInfo。这种情况下,也需要取分配segment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里应该少了一个条件,如果读请求的时候,把chunk标记为chunkExist=false,后续的写请求会拿到一个chunk不存在的ChunkInfo。这种情况下,也需要取分配segment

done

@@ -491,6 +501,66 @@ class CurveClient {
FileClient* fileClient_{nullptr};
};

class OriginReader {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个类没必要暴露出去,在client内部维护就可以了。这样就不需要其他模块初始化这个类。
可以在第一次使用的时候,进行初始化,比如IOTracker::ReadFromOrigin这个地方。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done
在curveclient初始化时进行 sourcereader的初始化,如果放到IOTracker::ReadFromSource每次都会去尝试初始化,并且感觉sourcereader和curveclient的级别好像差不多

// the mutex lock for fdMap_
std::mutex mtx_;
// the mapping from filename to fd_timestamp
std::unordered_map<std::string, std::string> fdMap_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里改成std::unordered_map<std::string, std::pair<int, uint64_t> fdMap_
std::pair<int, uint64_t> 第一个存fd,第二个存timestamp
或者也可以把<fd,timestamp>再封装一次,用string存,太蛮烦了。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里改成std::unordered_map<std::string, std::pair<int, uint64_t> fdMap_
std::pair<int, uint64_t> 第一个存fd,第二个存timestamp
或者也可以把<fd,timestamp>再封装一次,用string存,太蛮烦了。

done


#define IO_ALIGNED_BLOCK_SIZE 4096
#define PATH_MAX_SIZE 4096
#define NAME_MAX_SIZE 256

#define FD_TIME_DELIM std::string("_")
#define FD_TIMEOUT 300
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里的参数,用配置文件设置吧。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里的参数,用配置文件设置吧。

done

brpc::ClosureGuard doneGuard(done);
}

OriginReader::OriginReader() : fileClient_(new FileClient()) {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个类的实现,放到一个单独的cpp文件中吧。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个类的实现,放到一个单独的cpp文件中吧。

done


FileClient *fileClient_{nullptr};
// the mutex lock for fdMap_
std::mutex mtx_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个锁改成读写锁吧

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个锁改成读写锁吧

done

@SeanHai SeanHai force-pushed the curve_client_readNotAllocateSegment branch 2 times, most recently from ce7655a to 70d88df Compare December 24, 2020 05:40
@@ -124,4 +131,4 @@ void CurveClient::SetFileClient(FileClient* client) {
}

} // namespace client
} // namespace curve
} // namespace curve
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

文件末尾应该要有一个空行,没有cpplint会报错

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -269,6 +269,31 @@ class FileClient {
return openedFileNum_.get_value();
}

/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里几个函数的缩进

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里几个函数的缩进

done

std::string fileName = reqCtx->sourceInfo_.cloneFileSource;
int fd = 0;
{
curve::common::WriteLockGuard writeLockGuard(rwLock_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个锁的流程,应该是先拿读锁,如fdMap_里面查找,如果找到了,就可以直接用。
如果没找到,拿写锁去新建一个。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return -1;
}
}
fdMap_[fileName] = std::make_pair(fd, time(0));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里为什么都要更新成time(0)?
我的理解是,如果新建,或者使用的时候,更新timestap为当前的时间,比如用GetTimeofDayMs来记录。
后台线程在扫描关闭的时候,对比当前的时间戳与fd的时间戳,如果超过阈值,就关闭。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里为什么都要更新成time(0)?
我的理解是,如果新建,或者使用的时候,更新timestap为当前的时间,比如用GetTimeofDayMs来记录。
后台线程在扫描关闭的时候,对比当前的时间戳与fd的时间戳,如果超过阈值,就关闭。

time(0)就是获取现在时间戳

@@ -194,6 +196,15 @@ bool Splitor::AssignInternal(IOTracker* iotracker, MetaCache* metaCache,
int ret = 0;
uint64_t appliedindex_ = 0;

// check whether the chunkIdInfo is normal
if (!chunkIdInfo.chunkExist && iotracker->Optype() == OpType::WRITE) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里的判断,最好放到183行吧

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里的判断,最好放到183行吧

done

@SeanHai SeanHai force-pushed the curve_client_readNotAllocateSegment branch from 70d88df to db9c4ec Compare December 24, 2020 09:00
int IOTracker::ReadFromSource(std::vector<RequestContext*> reqCtxVec,
const UserInfo_t& userInfo) {
SourceReader &sourceReader = SourceReader::GetInstance();
return sourceReader.Read(reqCtxVec, userInfo);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SourceReader::GetInstance().Read(reqCtxVec, userInfo)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SourceReader::GetInstance().Read(reqCtxVec, userInfo)?

done

@@ -246,6 +246,15 @@ class CURVE_CACHELINE_ALIGNMENT IOTracker {
// perform read operation
void DoRead(MDSClient* mdsclient, const FInfo_t* fileInfo);

/**
* read from the origin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read from source

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read from source

done

@SeanHai SeanHai force-pushed the curve_client_readNotAllocateSegment branch from db9c4ec to b02d604 Compare December 24, 2020 12:05
0 == SourceReader::GetInstance().Init(configPath)) {
SourceReader::GetInstance().Run();
return LIBCURVE_ERROR::OK;
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error日志

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error日志

done

sleeper_.interrupt();
fdCloseThread_.join();
LOG(INFO) << "SourceReader fdCloseThread stoped successfully";
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else 也打下日志吧 not running or already stopped

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else 也打下日志吧 not running or already stopped

done

}
if (fileClient_ != nullptr) {
for (auto &pair : fdMap_) {
fileClient_->Close(pair.second.first);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

close有返回值吗?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

close有返回值吗?

有的,修改后增加了判断
if (LIBCURVE_ERROR::OK != fileClient_->Close(pair.second.first)) {
LOG(ERROR) << "Close fd failed, fd = " << pair.second.first;
return;
}

curve::common::ReadLockGuard readLockGuard(rwLock_);
auto iter = fdMap_.find(fileName);
if (iter != fdMap_.end()) {
iter->second.second = time(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为什么这里赋值为time(0)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为什么这里赋值为time(0)

time(0)就是当前时间戳

}
{
int fd = 0;
curve::common::WriteLockGuard lk(rwLock_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

跟curve::common::ReadLockGuard readLockGuard(rwLock_); 命名一致吧?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

跟curve::common::ReadLockGuard readLockGuard(rwLock_); 命名一致吧?

done

{
int fd = 0;
curve::common::WriteLockGuard lk(rwLock_);
auto iter = fdMap_.find(fileName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为啥这里还需要再find一次,而不是直接open

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为啥这里还需要再find一次,而不是直接open

因为上面拿的是读锁,有可能多个线程同时获取到然后去fdMap中查询结果都为不存在,下面获取写锁时,其中一个首先获得,其他等待,如果不再查询一次就会出现打开多次的问题

// iter->done_->SetFailed(0);
// iter->done_->Run();
// break;
// }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

不要的代码直接删掉吧

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

不要的代码直接删掉吧

done

@SeanHai SeanHai force-pushed the curve_client_readNotAllocateSegment branch 2 times, most recently from 3f2d861 to f3292a0 Compare December 24, 2020 12:28
@SeanHai SeanHai force-pushed the curve_client_readNotAllocateSegment branch from f3292a0 to 211176c Compare December 25, 2020 02:10
@ilixiaocui ilixiaocui merged commit fcd778a into opencurve:master Dec 25, 2020
@SeanHai SeanHai deleted the curve_client_readNotAllocateSegment branch April 29, 2021 01:48
wu-hanqing added a commit to wu-hanqing/curve that referenced this pull request May 19, 2021
After opencurve#172 and opencurve#209, source volume data of the clone volume
can be read within the client through SourceReader, and each
source volume was also opened and represented by a FileInstance,
but `FileInstance::mdsclient_` points to `FileClient::mdsClient_`,
so after FileClient is destroyed, this `FileInstance::mdsclient_`
become dangling pointer.

Signed-off-by: Hanqing Wu <wuhanqing@corp.netease.com>
wu-hanqing added a commit to wu-hanqing/curve that referenced this pull request May 19, 2021
After opencurve#172 and opencurve#209, source volume data of the clone volume
can be read within the client through SourceReader, and each
source volume was also opened and represented by a FileInstance,
but `FileInstance::mdsclient_` points to `FileClient::mdsClient_`,
so after FileClient is destroyed, `FileInstance::mdsclient_`
becomes dangling pointer.

To fix this problem, we let `FileClient::mdsClient_` be a shared_ptr,
and each FileInstance holds ownership of it.

Signed-off-by: Hanqing Wu <wuhanqing@corp.netease.com>
wu-hanqing added a commit to wu-hanqing/curve that referenced this pull request May 19, 2021
After opencurve#172 and opencurve#209, source volume data of the clone volume
can be read within the client through SourceReader, and each
source volume was also opened and represented by a FileInstance,
but `FileInstance::mdsclient_` points to `FileClient::mdsClient_`,
so after FileClient is destroyed, `FileInstance::mdsclient_`
becomes dangling pointer.

To fix this problem, we let `FileClient::mdsClient_` be a shared_ptr,
and each FileInstance holds ownership of it.

Signed-off-by: Hanqing Wu <wuhanqing@corp.netease.com>
wu-hanqing added a commit to wu-hanqing/curve that referenced this pull request May 19, 2021
After opencurve#172 and opencurve#209, source volume data of the clone volume
can be read within the client through SourceReader, and each
source volume was also opened and represented by a FileInstance,
but `FileInstance::mdsclient_` points to `FileClient::mdsClient_`,
so after FileClient is destroyed, `FileInstance::mdsclient_`
becomes dangling pointer.

To fix this problem, we let `FileClient::mdsClient_` be a shared_ptr,
and each FileInstance holds ownership of it.

Signed-off-by: Hanqing Wu <wuhanqing@corp.netease.com>
wu-hanqing added a commit to wu-hanqing/curve that referenced this pull request May 21, 2021
After opencurve#172 and opencurve#209, source volume data of the clone volume
can be read within the client through SourceReader, and each
source volume was also opened and represented by a FileInstance,
but `FileInstance::mdsclient_` points to `FileClient::mdsClient_`,
so after FileClient is destroyed, `FileInstance::mdsclient_`
becomes dangling pointer.

To fix this problem, we let `FileClient::mdsClient_` be a shared_ptr,
and each FileInstance holds ownership of it.

Signed-off-by: Hanqing Wu <wuhanqing@corp.netease.com>
wu-hanqing added a commit to wu-hanqing/curve that referenced this pull request May 24, 2021
After opencurve#172 and opencurve#209, source volume data of the clone volume
can be read within the client through SourceReader, and each
source volume was also opened and represented by a FileInstance,
but `FileInstance::mdsclient_` points to `FileClient::mdsClient_`,
so after FileClient is destroyed, `FileInstance::mdsclient_`
becomes dangling pointer.

To fix this problem, we let `FileClient::mdsClient_` be a shared_ptr,
and each FileInstance holds ownership of it.

Signed-off-by: Hanqing Wu <wuhanqing@corp.netease.com>
xu-chaojie pushed a commit that referenced this pull request May 24, 2021
After #172 and #209, source volume data of the clone volume
can be read within the client through SourceReader, and each
source volume was also opened and represented by a FileInstance,
but `FileInstance::mdsclient_` points to `FileClient::mdsClient_`,
so after FileClient is destroyed, `FileInstance::mdsclient_`
becomes dangling pointer.

To fix this problem, we let `FileClient::mdsClient_` be a shared_ptr,
and each FileInstance holds ownership of it.

Signed-off-by: Hanqing Wu <wuhanqing@corp.netease.com>
wu-hanqing added a commit to wu-hanqing/curve that referenced this pull request Jul 27, 2021
After opencurve#172 and opencurve#209, source volume data of the clone volume
can be read within the client through SourceReader, and each
source volume was also opened and represented by a FileInstance,
but `FileInstance::mdsclient_` points to `FileClient::mdsClient_`,
so after FileClient is destroyed, `FileInstance::mdsclient_`
becomes dangling pointer.

To fix this problem, we let `FileClient::mdsClient_` be a shared_ptr,
and each FileInstance holds ownership of it.

Signed-off-by: Hanqing Wu <wuhanqing@corp.netease.com>
ilixiaocui pushed a commit that referenced this pull request Jul 28, 2021
After #172 and #209, source volume data of the clone volume
can be read within the client through SourceReader, and each
source volume was also opened and represented by a FileInstance,
but `FileInstance::mdsclient_` points to `FileClient::mdsClient_`,
so after FileClient is destroyed, `FileInstance::mdsclient_`
becomes dangling pointer.

To fix this problem, we let `FileClient::mdsClient_` be a shared_ptr,
and each FileInstance holds ownership of it.

Signed-off-by: Hanqing Wu <wuhanqing@corp.netease.com>
SeanHai pushed a commit to SeanHai/curve that referenced this pull request Oct 26, 2021
After opencurve#172 and opencurve#209, source volume data of the clone volume
can be read within the client through SourceReader, and each
source volume was also opened and represented by a FileInstance,
but `FileInstance::mdsclient_` points to `FileClient::mdsClient_`,
so after FileClient is destroyed, `FileInstance::mdsclient_`
becomes dangling pointer.

To fix this problem, we let `FileClient::mdsClient_` be a shared_ptr,
and each FileInstance holds ownership of it.

Signed-off-by: Hanqing Wu <wuhanqing@corp.netease.com>
ilixiaocui pushed a commit that referenced this pull request Oct 26, 2021
After #172 and #209, source volume data of the clone volume
can be read within the client through SourceReader, and each
source volume was also opened and represented by a FileInstance,
but `FileInstance::mdsclient_` points to `FileClient::mdsClient_`,
so after FileClient is destroyed, `FileInstance::mdsclient_`
becomes dangling pointer.

To fix this problem, we let `FileClient::mdsClient_` be a shared_ptr,
and each FileInstance holds ownership of it.

Signed-off-by: Hanqing Wu <wuhanqing@corp.netease.com>
ilixiaocui pushed a commit to ilixiaocui/curve that referenced this pull request Feb 6, 2023
CNCF Community Bridge mentors evaluation form added
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants