
[Proposal] Resolve error "fail to init reader.res=-230" by delayed deletion of rowset #4017

@ZhangYu0123

Describe the bug

The compaction task on the BE continuously merges Rowset versions and deletes the input Rowsets once a merge completes. If the query version issued by the FE falls within the merged versions, the BE can no longer find the Rowset version path to read, and returns the error OLAP_ERR_VERSION_ALREADY_MERGED = -230.

The specific meaning of this error is described in #3270, and in PR #3271, #3859.

Resolution
The goal is to keep Rowset compaction efficient, still allow queries to read a previous version, and keep the change low-risk. This design therefore adds delayed deletion of merged Rowsets. The main ideas are as follows:

(1) Data structure changes

  • Add _expired_snapshot_rs_version_map to the Tablet to hold the merged Rowsets.
  • Add _expired_snapshot_rs_metas to TabletMeta to hold the merged RowsetMetas.
  • Redefine the RowsetGraph structure in Rowset as VersionedRowsetTracker, with the following responsibilities:
    a) Provide the original RowsetGraph functionality, and add path information (pathVersion) to each Vertex. Vertices with the same pathVersion belong to the same merged path; a pathVersion of -1 indicates that the Rowset has not been merged.
    b) Maintain the collection of merged Rowsets, _expired_snapshot_rs_path_map. The key of the map is the pathVersion and the value is the list of Rowsets with that pathVersion.
    c) Maintain the current maximum path value, and assign the next value to the Vertices of the Rowsets merged by the next compaction.
    [image]
    In the image, the Rowsets on paths whose pathVersion is not -1 are the ones that can be deleted after a delay.
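
The bookkeeping described in a) to c) can be sketched roughly as follows. This is only an illustration under simplifying assumptions: `VersionedRowsetTrackerSketch`, `TrackedRowset`, `add_live`, and `add_expired_path` are hypothetical names, the graph edges of the original RowsetGraph are left out, and starting the path counter at 1 is an arbitrary choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical, simplified stand-ins; not the actual Doris types.
struct Version {
    int64_t first;   // start version (inclusive)
    int64_t second;  // end version (inclusive)
};

constexpr int64_t kNotMerged = -1;  // pathVersion of a Rowset that has not been merged

struct TrackedRowset {
    Version version;
    int64_t path_version;
};

class VersionedRowsetTrackerSketch {
public:
    // Registers a Rowset that has not been merged (pathVersion == -1).
    void add_live(const Version& v) { _live.push_back({v, kNotMerged}); }

    // Records one batch of merged Rowsets. All of them share one new
    // pathVersion and are kept for delayed deletion.
    // Returns the pathVersion that was assigned.
    int64_t add_expired_path(const std::vector<Version>& merged) {
        assert(!merged.empty());
        const int64_t path_version = _next_path_version++;
        std::vector<TrackedRowset>& path = _expired_snapshot_rs_path_map[path_version];
        for (const Version& v : merged) {
            path.push_back({v, path_version});
        }
        return path_version;
    }

    const std::map<int64_t, std::vector<TrackedRowset>>& expired_paths() const {
        return _expired_snapshot_rs_path_map;
    }

    std::size_t live_count() const { return _live.size(); }

private:
    std::vector<TrackedRowset> _live;
    // key: pathVersion, value: Rowsets merged together under that pathVersion
    std::map<int64_t, std::vector<TrackedRowset>> _expired_snapshot_rs_path_map;
    int64_t _next_path_version = 1;  // assumption: path values start at 1
};
```

Keying the expired Rowsets by pathVersion means a whole merged path can later be dropped with a single map erase.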

(2) Compaction process changes

  • After the compaction merge, enter the modify_rowsets stage. At the end of modify_rowsets, the tablet adds the Rowsets deleted from rs_version_map to _expired_snapshot_rs_version_map; the same applies to the deleted RowsetMeta.
  • In the reconstruct_rowset_graph logic of VersionedRowsetTracker, also use the Rowsets in _expired_snapshot_rs_metas to build the VersionedRowsetTracker. Add the list of merged Rowsets to _expired_snapshot_rs_path_map, and increment the pathVersion by 1.
  • Remove the gc operation in the last step of compaction.
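
A minimal sketch of the modify_rowsets change, assuming a much simplified tablet: `TabletSketch` and its `Rowset` struct are hypothetical stand-ins, and only the two version maps from the proposal are modelled (RowsetMeta handling and locking are omitted).

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

using Version = std::pair<int64_t, int64_t>;  // [start, end], both inclusive

struct Rowset {  // hypothetical stand-in for the real Rowset class
    Version version;
};
using RowsetPtr = std::shared_ptr<Rowset>;

struct TabletSketch {
    std::map<Version, RowsetPtr> rs_version_map;
    std::map<Version, RowsetPtr> expired_snapshot_rs_version_map;

    // Replaces the compaction inputs with the output Rowset, but parks the
    // inputs in the expired map instead of dropping them right away.
    void modify_rowsets(const RowsetPtr& output, const std::vector<Version>& inputs) {
        assert(output != nullptr);
        for (const Version& v : inputs) {
            auto it = rs_version_map.find(v);
            if (it == rs_version_map.end()) {
                continue;  // input is not in the live map; nothing to move
            }
            expired_snapshot_rs_version_map[v] = it->second;
            rs_version_map.erase(it);
        }
        rs_version_map[output->version] = output;
    }
};
```

Because the inputs are only moved between maps, a query that was planned against the pre-compaction versions can still resolve them until the GC task removes them.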

(3) GC process changes

  • Add a cleanup task for _expired_snapshot_rs_metas to start_trash_sweep of TabletManager.
  • When cleaning, check all paths in VersionedRowsetTracker whose pathVersion is not -1. When the time elapsed since the createtime of the Rowset with the largest version number in a path exceeds config:tablet_rowset_expired_snapshot_sweep_time (new configuration, the default is 30 minutes), add the Rowsets on the entire pathVersion path to storage_engine's unused_rowset for cleaning.
  • After cleaning, use _expired_snapshot_rs_metas and _rs_meta to reconstruct VersionedRowsetTracker, and delete the keys of the cleaned pathVersions from _expired_snapshot_rs_path_map.
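
The sweep rule above might look like the following sketch. `ExpiredRowset`, `ExpiredPathMap`, and `sweep_expired_paths` are hypothetical names, and expressing both the creation time and the sweep threshold in seconds is an assumption (the proposal only gives the default as 30 minutes). The returned vector stands in for handing the Rowsets to storage_engine's unused_rowset.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

struct ExpiredRowset {    // hypothetical stand-in for an expired Rowset
    int64_t end_version;  // largest version covered by the Rowset
    int64_t create_time;  // assumption: seconds since the epoch
};

// key: pathVersion, value: Rowsets on that merged path
using ExpiredPathMap = std::map<int64_t, std::vector<ExpiredRowset>>;

// Removes every path whose newest Rowset (largest end version) is older than
// sweep_time_sec and returns the Rowsets that were removed.
std::vector<ExpiredRowset> sweep_expired_paths(ExpiredPathMap& paths,
                                               int64_t now_sec,
                                               int64_t sweep_time_sec) {
    std::vector<ExpiredRowset> unused;
    for (auto it = paths.begin(); it != paths.end();) {
        const std::vector<ExpiredRowset>& path = it->second;
        auto newest = std::max_element(
                path.begin(), path.end(),
                [](const ExpiredRowset& a, const ExpiredRowset& b) {
                    return a.end_version < b.end_version;
                });
        // An empty path has nothing worth keeping, so it is dropped as well.
        const bool expired = newest == path.end() ||
                             now_sec - newest->create_time > sweep_time_sec;
        if (expired) {
            unused.insert(unused.end(), path.begin(), path.end());
            it = paths.erase(it);  // erase returns the next valid iterator
        } else {
            ++it;
        }
    }
    return unused;
}
```

Checking only the Rowset with the largest version keeps the decision per path rather than per Rowset, so a path is either fully readable or fully gone.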

(4) Find the Rowset to be read
When reading data, also look for the Rowset in _expired_snapshot_rs_version_map.
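
One way to read this step is as a fallback lookup: try the live map first, then the expired one. The sketch below assumes exactly that order; `find_rowset`, `RowsetMap`, and the `Rowset` struct are hypothetical, and the real read path resolves a whole version path rather than a single Rowset.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>

using Version = std::pair<int64_t, int64_t>;  // [start, end], both inclusive

struct Rowset {  // hypothetical stand-in for the real Rowset class
    Version version;
};
using RowsetPtr = std::shared_ptr<Rowset>;
using RowsetMap = std::map<Version, RowsetPtr>;

// Looks up the Rowset for `v` in the live map first and falls back to the
// expired map. Returns nullptr when neither map has it, which is the case
// that would still surface as error -230.
RowsetPtr find_rowset(const RowsetMap& live, const RowsetMap& expired, const Version& v) {
    auto it = live.find(v);
    if (it != live.end()) {
        return it->second;
    }
    it = expired.find(v);
    if (it != expired.end()) {
        return it->second;
    }
    return nullptr;
}
```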
