PIP-146: ManagedCursorInfo compression #14529
This proposal has 3 binding +1 votes and 0 -1 votes, and has stayed open for at least 48 hours.
codelipenghui pushed a commit that referenced this issue on Apr 19, 2022:
Fixes #14529
### Motivation
Cursor data is stored in the ZooKeeper/etcd metadata store. As cursor data grows, its size increases and it takes longer to pull from the metadata store. Compressing the cursor data reduces both its size and the time needed to pull it.
### Modifications
- Add a message named `ManagedCursorInfoMetadata` to `MLDataFormats.proto` to serve as compression metadata
- Add `managedCursorInfoCompressionType` to `org.apache.pulsar.broker.ServiceConfiguration` and `org.apache.bookkeeper.mledger.ManagedLedgerFactoryConfig`
- This feature follows the implementation of ManagedLedgerInfo compression, so the code is refactored to avoid duplication
Discussion thread: https://lists.apache.org/thread/j92bzsby9n2ozc9gcw5psgcy2026l1wm
Motivation
Cursor data is stored in the ZooKeeper/etcd metadata store. As cursor data grows, its size increases and it takes longer to pull from the metadata store. Compressing the cursor data reduces both its size and the time needed to pull it.
Goal
Support compressing the ManagedCursorInfo with LZ4, ZLIB, ZSTD, or SNAPPY.
Implementation
CursorInfo compression format
[MAGIC_NUMBER] + [METADATA_SIZE] + [METADATA_PAYLOAD] + [MANAGED_CURSOR_INFO_PAYLOAD]
MAGIC_NUMBER
Use 0x4778, the same magic number that is used for ledger info.
METADATA
Add a named
ManagedCursorInfoMetadata
message toMLDataFormats.proto
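The layout above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class and method names are hypothetical, the magic number is assumed to occupy 2 bytes and the metadata size 4 bytes (both big-endian), and the metadata is treated as opaque bytes rather than a serialized `ManagedCursorInfoMetadata` message.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper that lays out the frame described in this section. */
final class CursorInfoFrame {
    // Magic number given in the proposal.
    static final short MAGIC_NUMBER = 0x4778;

    /**
     * Builds [MAGIC_NUMBER][METADATA_SIZE][METADATA_PAYLOAD][MANAGED_CURSOR_INFO_PAYLOAD].
     * Assumption: 2-byte magic and 4-byte size, big-endian (ByteBuffer's default order).
     */
    static byte[] frame(byte[] metadata, byte[] compressedCursorInfo) {
        ByteBuffer buf = ByteBuffer.allocate(2 + 4 + metadata.length + compressedCursorInfo.length);
        buf.putShort(MAGIC_NUMBER);       // header marker
        buf.putInt(metadata.length);      // METADATA_SIZE
        buf.put(metadata);                // METADATA_PAYLOAD (opaque here)
        buf.put(compressedCursorInfo);    // MANAGED_CURSOR_INFO_PAYLOAD
        return buf.array();
    }
}
```

Writing the metadata size before the metadata lets a reader skip straight to the cursor payload without understanding the metadata encoding.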
CursorInfo compression and decompression design
These compression types are already defined and implemented in Pulsar, so we only need to handle compression and decompression of the `ManagedCursorInfo` data:
Get CursorInfo from the metadata store
We check the cursor data header. If the data is compressed, we parse the bytes using the compressed format; otherwise we parse the cursor data directly, as before.
Add/Update CursorInfo to the metadata store
If a compression type is specified, the data is compressed before it is written; otherwise it is written to the metadata store directly.
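The read and write paths described above could look roughly like the following sketch. It is not the Pulsar implementation: `CursorInfoCodecSketch`, its methods, and the `Compression` enum are hypothetical, ZLIB from `java.util.zip` stands in for whichever codec is configured, and a bare 4-byte uncompressed size stands in for the serialized `ManagedCursorInfoMetadata` message. The header field widths (2-byte magic, 4-byte size) are also assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class CursorInfoCodecSketch {
    enum Compression { NONE, ZLIB } // hypothetical subset of the supported types

    static final short MAGIC_NUMBER = 0x4778;

    /** Write path: compress and frame only when a compression type is configured. */
    static byte[] toStoredBytes(byte[] serializedCursorInfo, Compression type) {
        if (type == Compression.NONE) {
            return serializedCursorInfo; // stored directly, as before
        }
        byte[] compressed = deflate(serializedCursorInfo);
        // Stand-in metadata (assumption): just the uncompressed size as a 4-byte int.
        byte[] metadata = ByteBuffer.allocate(4).putInt(serializedCursorInfo.length).array();
        return ByteBuffer.allocate(2 + 4 + metadata.length + compressed.length)
                .putShort(MAGIC_NUMBER)
                .putInt(metadata.length)
                .put(metadata)
                .put(compressed)
                .array();
    }

    /** Read path: check the header, then decompress or fall back to the raw bytes. */
    static byte[] fromStoredBytes(byte[] stored) throws DataFormatException {
        ByteBuffer buf = ByteBuffer.wrap(stored);
        if (stored.length < 6 || buf.getShort() != MAGIC_NUMBER) {
            return stored; // no header: data was written uncompressed
        }
        int metadataSize = buf.getInt();
        if (metadataSize != 4 || buf.remaining() < 4) {
            throw new DataFormatException("unexpected metadata");
        }
        int uncompressedSize = buf.getInt();
        if (uncompressedSize < 0) {
            throw new DataFormatException("negative uncompressed size");
        }
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(stored, buf.position(), buf.remaining());
            byte[] out = new byte[uncompressedSize];
            int n = 0;
            while (n < out.length && !inflater.finished()) {
                int r = inflater.inflate(out, n, out.length - n);
                if (r == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or unsupported input
                }
                n += r;
            }
            if (n != uncompressedSize) {
                throw new DataFormatException("uncompressed size mismatch");
            }
            return out;
        } finally {
            inflater.end();
        }
    }

    private static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            while (!deflater.finished()) {
                out.write(chunk, 0, deflater.deflate(chunk));
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }
}
```

Because the reader falls back to the raw bytes when the header is absent, cursor data written before compression was enabled remains readable in this sketch.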
CursorInfo compression type configuration
Add `managedCursorInfoCompressionType` in `org.apache.pulsar.broker.ServiceConfiguration` and `org.apache.bookkeeper.mledger.ManagedLedgerFactoryConfig`.
Compatibility