-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat/multi group snapshot #42
Conversation
|
@@ -39,7 +39,9 @@ public static boolean isWindows() { | |||
} | |||
|
|||
private static boolean isWindows0() { | |||
boolean windows = SystemPropertyUtil.get("os.name", "").toLowerCase(Locale.US).contains("win"); | |||
final boolean windows = SystemPropertyUtil.get("os.name", "") // |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
我觉的这里可以搞成全局的静态变量,然后在 static block 里初始化这个变量,这里就变成 getter 了,是不是性能更好点? 如果这个方法调用不频繁,倒也没关系。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
恩,其实现在的代码就是和你描述的思路一样的,因为 isWindows0() 是只供内部调用的,并且有一个静态变量,
private static final boolean IS_WINDOWS = isWindows0();
对使用者开放的方法是 isWindows() , 所以说 isWindows0() 只会被调用一次
this.rawKVStore.initFencingToken(parentKey, splitKey); | ||
this.storeEngine.doSplit(regionIds.getKey(), regionIds.getValue(), splitKey, closure); | ||
} catch (final Exception e) { | ||
if (closure != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
打印下日志,加上堆栈,这里这样处理,会丢掉堆栈,后续排查问题会很痛苦。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
} finally { | ||
writeLock.unlock(); | ||
} | ||
this.opts = opts; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里是因为将 map 修改为 concurrent map 后,所以读写锁都去掉了? 但是怎么保证几个 map 之间的数据是同步的。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
对几个 map 同时有操作的方法不多,比如 tryLockWith 与 onSnapshotLoad()
- 对于同 region 的并发操作时没有的,因为操作都要走 raft log,是串行的
- 线性一致读场景之后操作 defaultDB, 跟其他几个 map 无关联
- 对于不同 region 的并发操作,因为不同 region 之间是严格的 range 隔离,所以也没有影响
} finally { | ||
writeLock.unlock(); | ||
} | ||
this.defaultDB.clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
类似这个方法,可能你在 clear 几个 map 之间,还有其他操作在执行,这样会有什么后果,也就是 map 之间数据不完全同步。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个是 store 关闭方法中的, 在 store close 之前,是要 close 所有当前节点 raft group 服务的,不会进来任何操作
try { | ||
closeMemoryDB(); | ||
openMemoryDB(this.opts, defaultDB, sequenceDB.data(), fencingKeyDB.data(), lockerDB.data()); | ||
final File file = new File(snapshotPath); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个方法建议封装成一个类似 MemoryKeyStoreSnapshotFile 的类,将这些逻辑放进去。类似 Counter 的 snapshot 处理。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
主要问题是 snapshot 严重依赖 store 内部的各种数据以及状态,我尝试将这以外的比如 io 操作等剥离出来
@Override | ||
public void onSnapshotLoad(final String snapshotPath, final LocalFileMeta meta, final Region region) | ||
throws Exception { | ||
final Timer.Context timeCtx = getTimeContext("SNAPSHOT_LOAD"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
同上
openRocksDB(opts); | ||
LOG.info("[RocksRawKVStore] start successfully, options: {}.", opts); | ||
return true; | ||
} catch (final RocksDBException e) { | ||
} catch (final Exception e) { | ||
LOG.error("Fail to open rocksDB at path {}, {}.", opts.getDbPath(), StackTraceUtil.stackTrace(e)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
异常,没有 return false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return false 了,在下边
} | ||
|
||
public long getDatabaseVersion() { | ||
return this.databaseVersion.get(); | ||
} | ||
|
||
public void addStatisticsCollectorCallback(final StatisticsCollectorCallback callback) { | ||
if (this.statisticsCollector == null || this.statistics == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
可以改成 requires 调用
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -1069,6 +1107,7 @@ public void createSstFiles(final EnumMap<SstColumnFamily, File> sstFileTable, fi | |||
it.seek(startKey); | |||
} | |||
sstFileWriter.open(sstFile.getAbsolutePath()); | |||
long count = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int 似乎够了
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
感觉还是 long 稳妥一点呀,int 正整数范围才到 21 亿
@@ -1222,6 +1270,66 @@ private void readSnapshot(final String snapshotPath) { | |||
} | |||
} | |||
|
|||
private LocalFileMeta writeSstSnapshot(final String snapshotPath, final Region region) { | |||
final Timer.Context timeCtx = getTimeContext("WRITE_SST_SNAPSHOT"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这块 snapshot 处理,也同样建议重构成 SnapshotFile 类似的接口,一是更内聚,不用在 state machine 暴露这个逻辑,另一个是可以搞成接口,后续可以扩展。
} | ||
}; | ||
this.kvStore.addStatisticsCollectorCallback(callback); | ||
// StatisticsCollectorCallback callback = new StatisticsCollectorCallback() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
不要的代码删掉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@fengjiachun 过了一遍,你看着改下。现在测试似乎过不了? |
@killme2008 先等等,还有 snapshot 部分的建议没修完 |
@killme2008 重构了 snapshot 相关,抽空再帮看看 |
* (fix) multi-group snapshot with memoryStore * (fix) multi-group snapshot with rocksDBStore * (fix) add multi sst snapshot benchmark * (fix) add meta check on load snapshot * (fix) add meta check on load snapshot * (fix) add timer metric with memory snapshot * (fix) typo * (fix) log * (bugfix) fencing token isolation * (bugfix) rename local variable * (bugfix) fencing token isolation by region * (fix) typo * (fix) region info visibility * (feat) add comment * (fix) delete existing data on rocksdb restart, because raft will play them back * (fix) delete existing data on rocksdb restart, because raft will play them back * (fix) add restart test #86 * (fix) add restart test #86 * (fix) update restart test * (fix) add error log on splitting fail * (fix) requires * (fix) remove useless code * (fix) code refactoring with snapshot * (fix) fix bad name * (fix) typo
背景:
#37
rheaKV 是 multi-raft-group 设计,但是 kv 存储是共享的,原来做 snapshot 是当前节点(Store)全量数据 snapshot,这有几个问题:
改进方案是:
改进以后的好处是:
缺点:
生成 sst file 不是很快,需要控制一个 region 的大小,简单测试了下对于 100 个字节的 value,最好一个 region 的 key 数量不要超过千万量级
以下为一千万个 key 的 snapshot 压测数据,其中 sst 前缀的为新的 snapshot 实现,测试代码见 SnapshotBenchmark: