-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-22623 - Add RegionObserver coprocessor hook for preWALAppend #390
Conversation
@apurtell - here's my implementation of HBASE-22623. In the end I did manage to get everything working through a single coprocessor hook, but coprocs will need something like Phoenix's BatchMutateContext (or eventually HBASE-18127) for the mutation context. |
Would it be possible to resubmit the PR via force push without HBASE-22622? Not a big deal but not sure this can be merged after #352 as is. |
@@ -7953,6 +7954,7 @@ private WriteEntry doWALAppend(WALEdit walEdit, Durability durability, List<UUID | |||
if (walEdit.isReplay()) { | |||
walKey.setOrigLogSeqNum(origLogSeqNum); | |||
} | |||
walKey = this.coprocessorHost.preWALAppend(walKey); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this.coprocessorHost
be null here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrapped in a null-check.
* @return | ||
* @throws IOException | ||
*/ | ||
default WALKeyImpl preWALAppend(ObserverContext<RegionCoprocessorEnvironment> ctx, WALKeyImpl key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This only allows change of the WAL key. What if we want to also mutate or add more cells to the WALedit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead maybe WALEntry preWALAppend(ctx, entry)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@apurtell - the header comments for WALEdit state that the class is meant to be read-only for coprocs. There are coprocessor hooks that take a WALEdit parameter, but unless I missed one, they all seem to be after the WAL has been appended so that mutating the edit has no function.
I don't understand the reason for the restriction -- we trust coprocessor implementers to do sane things already everywhere else -- but since the code had an explicit policy I tried to honor it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it can't easily be WALEntry preWALAppend(ctx, entry), because doWALAppend does not currently create a WALEntry (it passes the WALKey and WALEdit as separate parameters)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, separate parameters is fine. Implementer can invoke WALKey and WALEdit methods to make changes.
@@ -7934,8 +7934,9 @@ private WriteEntry doWALAppend(WALEdit walEdit, Durability durability, List<UUID | |||
/** | |||
* @return writeEntry associated with this append | |||
*/ | |||
private WriteEntry doWALAppend(WALEdit walEdit, Durability durability, List<UUID> clusterIds, | |||
long now, long nonceGroup, long nonce, long origLogSeqNum) throws IOException { | |||
private WriteEntry doWALAppend( WALEdit walEdit, Durability durability, List<UUID> clusterIds, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No change but whitespace as far as I can tell. Unintentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Undid the accidental whitespace changes.
7c911fa
to
f88cc43
Compare
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
* @param key | ||
* @param extendedAttributes | ||
*/ | ||
public WALKeyImpl(WALKeyImpl key, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of CP creating a new WALKey object, we can give API to add the ext attributes? It allows to add any custom key value pair so that later CPs or Replication EPs an make use. So we might not even allow the CP to return a brand new WALKey object. (?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also WALKeyImpl is marked LimitedPrivate({HBaseInterfaceAudience.REPLICATION}) I can see. So what this patch trying to do is to expose the class for CPs also. I can not see where WALKeyImpl to be used by Replication area. WALKey was enough. Anyways for this CP hook, IMO its enough to pass WALKey interface. We can add the API to add ext attributes into the interface. That is just adding ext attributes. It should be fine as it is not exposing any setters which takes other key attributes of the object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
Pushed up fixes to the checkstyle and test failures the CI pipeline found. Haven't yet incorporated the feedback on the coprocessor signature that @apurtell and @anoopsjohn requested -- will do that once I get a clean run. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment on JIRA: https://issues.apache.org/jira/browse/HBASE-22623?focusedCommentId=16901262&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16901262
The LP(coproc) annotations on WALKey and WALEdit are already present.
Remove this comment:
// TODO: Do not expose this class to Coprocessors. It has set methods. A CP might meddle. |
Remove or improve this comment:
* Read-only. No Setters. For limited audience such as Coprocessors. |
Otherwise lgtm
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Thanks, @apurtell , just removed the comments as you requested. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After backing out the specifically requested changes, which do not affect the proposed hook signature, I will approve and merge this PR. The requested changes allow us to side step a rathole we went down on the JIRA.
@@ -48,15 +48,12 @@ | |||
* Used in HBase's transaction log (WAL) to represent a collection of edits (Cell/KeyValue objects) | |||
* that came in as a single transaction. All the edits for a given transaction are written out as a | |||
* single record, in PB format, followed (optionally) by Cells written via the WALCellEncoder. | |||
* <p>This class is LimitedPrivate for CPs to read-only. The {@link #add} methods are |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to ask you to back this out, but discussion on the JIRA ratholed on whether or not WALEdit should be immutable or not. We can keep the hook signature as proposed, because WALEdit is useful to the coprocessor even if read only, and this would allow the discussion on WALEdit to be tabled until some future time without consequence to today's need for this change.
* <p>WALEdit will accumulate a Set of all column family names referenced by the Cells | ||
* {@link #add(Cell)}'d. This is an optimization. Usually when loading a WALEdit, we have the | ||
* column family name to-hand.. just shove it into the WALEdit if available. Doing this, we can | ||
* save on a parse of each Cell to figure column family down the line when we go to add the | ||
* WALEdit to the WAL file. See the hand-off in FSWALEntry Constructor. | ||
*/ | ||
// TODO: Do not expose this class to Coprocessors. It has set methods. A CP might meddle. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same
@@ -163,13 +160,11 @@ public boolean isReplay() { | |||
return this.replay; | |||
} | |||
|
|||
@InterfaceAudience.Private |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same
public WALEdit add(Cell cell, byte [] family) { | ||
getOrCreateFamilies().add(family); | ||
return addCell(cell); | ||
} | ||
|
||
@InterfaceAudience.Private |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
* @param ctx the environment provided by the region server | ||
* @param key the WALKey associated with a particular append to a WAL | ||
*/ | ||
default void preWALAppend(ObserverContext<RegionCoprocessorEnvironment> ctx, WALKey key, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please mark this method as deprecated so we keep consistent on that WALEdit should not be exposed directly.
And no postWALAppend seems a bit strange to users but I think it is fine here. We can add it when we actually want to use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Apache9 Just to make sure I understand you right...you want me to write a method which is Deprecated, as of the moment of its creation?
Adding a method to a public interface tells people they can use it. Deprecating a method tells people they should stop using it or refrain from starting to. Adding a deprecated method...just adds dead code.
Someday soon, there's going to be a Phoenix code review for changes I'll make that use this method, and someone's going to say, "You can't use this method; it's deprecated. -1" And they'll be completely right to say so.
Huh?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Phoenix already uses lots of deprecated methods in HBase, and I believe you can not find alternate solutions, for example, this one
/**
* Called before creation of Reader for a store file.
* Calling {@link org.apache.hadoop.hbase.coprocessor.ObserverContext#bypass()} has no
* effect in this hook.
*
* @param ctx the environment provided by the region server
* @param fs fileystem to read from
* @param p path to the file
* @param in {@link FSDataInputStreamWrapper}
* @param size Full size of the file
* @param cacheConf
* @param r original reference file. This will be not null only when reading a split file.
* @param reader the base reader, if not {@code null}, from previous RegionObserver in the chain
* @return a Reader instance to use instead of the base reader if overriding
* default behavior, null otherwise
* @deprecated For Phoenix only, StoreFileReader is not a stable interface.
*/
@Deprecated
// Passing InterfaceAudience.Private args FSDataInputStreamWrapper, CacheConfig and Reference.
// This is fine as the hook is deprecated any way.
default StoreFileReader preStoreFileReaderOpen(ObserverContext<RegionCoprocessorEnvironment> ctx,
FileSystem fs, Path p, FSDataInputStreamWrapper in, long size, CacheConfig cacheConf,
Reference r, StoreFileReader reader) throws IOException {
return reader;
}
You can add javadoc to say that now it is only supposed to be used in Phoenix, but I still think we should mark it as deprecated, otherwise it will be confusing that why in WALObserver, we say that WALEdit is private and should not be used, but then in RegionObserver, we allow users to use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal isn't to create a Phoenix-specific hack, but a general-purpose HBase API which Phoenix can use. You're at the same time complaining about Phoenix's previous and existing use of unapproved hooks while resisting efforts to create approved APIs the right way.
preStoreFileReaderOpen wasn't created deprecated -- it's still alive and well in branch-1 -- but was deprecated in HBASE-18793, which you know because you were the one who deprecated it.
Given that I'm backporting this new hook to branch-1, if I create it deprecated that presumably means that as soon as 1.5 releases, the new hook is eligible for cleanup in master via semver rules before it's even been released!
The comments are confusing, not because of this patch, but because the position that created those comments is already self-contradictory in existing code. WALEdit must never be exposed to coprocessors...except the 6 times it already is in RegionObserver and the 4 times it is in WALObserver.
When a policy leads to increasingly strange and nonsensical results -- and creating an already deprecated method is nonsensical -- it's time to rethink the policy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new method is not going to be deprecated out of the bat.
We are not monolithic in our approach (and hostility) to coprocessor interfaces as a community. Imposing that disagreement on contributors is not appropriate.
I am going to merge this as is and we can follow up on what should or should not be deprecated as a larger conversation on the future of coprocessors. The community has some big disagreements in approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also agree the "policy", such as it is, is contradictory and confusing. We need to attack the bigger picture on dev@ in a discussion about the future of coprocessors and our tolerance (or not) to the requests of the Phoenix project. The opinions are not monolithic. There are some supporters, there are some hostile positions, both are valid in my view, we need to sort out the disagreement. This issue isn't the right scope for that. The contradictory positions are evident in the tug and pull of the suggestions to the contributor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied these comments over onto the JIRA for visibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I sent an email with the subject "Coprocessors, clean ups, compatibility, deprecations, Phoenix... it's a bit of a mess" to dev@
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Waiting 24 hours for time zone turn around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This API will be used by adding to the passed in WALEdit? Using WALEdit #setters?
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
|
||
/** | ||
* Called just before the WAL Entry is appended to the WAL. Implementing this hook allows | ||
* coprocessors to add extended attributes to the WALKey that then get persisted to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having trouble understanding 'add extended attributes to the WALKey'. WALKey is read-only. You mean WALEdit here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WALKey is no longer read only. You can get and set extended attributes after HBASE-22622 WALKey Extended Attributes (#352)
* @param ctx the environment provided by the region server | ||
* @param key the WALKey associated with a particular append to a WAL | ||
*/ | ||
default void preWALAppend(ObserverContext<RegionCoprocessorEnvironment> ctx, WALKey key, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This API will be used by adding to the passed in WALEdit? Using WALEdit #setters?
@@ -7951,6 +7951,11 @@ private WriteEntry doWALAppend(WALEdit walEdit, Durability durability, List<UUID | |||
if (walEdit.isReplay()) { | |||
walKey.setOrigLogSeqNum(origLogSeqNum); | |||
} | |||
//don't call the coproc hook for writes to the WAL caused by | |||
//system lifecycle events like flushes or compactions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not? I'd think we would to be consistent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No this API is so we can get the WALKey and add an extended attribute just before it commits to the WAL from the RPC context. See above discussion and JIRA for context.
* @param attributeKey Name of the attribute | ||
* @param attributeValue Value of the attribute | ||
*/ | ||
void addExtendedAttribute(String attributeKey, byte[] attributeValue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, so WALKey goes from being read-only to now carrying burden? WALEdit is for freight?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I've made this remark before... Looking for the response then, I see loads of commentary in issue... Let me review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, don't find where its ok to add freight to WALKey marked read-only (but alternative of letting WALEdit be mutable is a 'rat-hole'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WALKey is not read only.
WALedit is not where we want to add the state
The WALKey change is already committed.
Having a WALedit parameter on this hook is useful whether it is read only or not.
Still going to merge this now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is the second half of work that began with HBASE-22622 WALKey Extended Attributes (#352)
…pache#390) Signed-off-by: Andrew Purtell <apurtell@apache.org>
…pache#390) Signed-off-by: Andrew Purtell <apurtell@apache.org>
No description provided.