-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-24205 Create metric to know the number of reads that happens fr… #1552
Conversation
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
Will check the test case failure. Seems metrics needs to be explicitly handled. And other Javadoc issues needs to be fixed |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Some checkstyles seems to be coming from already existing code. Not sure. Fixed them. Lets see. Tests seems to be passing. The javadoc for hbase-hadoop2-compat seems to be failing due to
|
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall
@@ -45,6 +46,16 @@ private synchronized MetricsRegionAggregateSourceImpl getRegionAggregate() { | |||
} | |||
} | |||
|
|||
private synchronized MetricsStoreAggregateSourceImpl | |||
getStoreAggregate(MetricsStoreWrapper wrapper) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrapper argument can be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ya not needed.
private final MetricsExecutorImpl executor = new MetricsExecutorImpl(); | ||
|
||
private final Set<MetricsStoreSource> storeSources = | ||
Collections.newSetFromMap(new ConcurrentHashMap<MetricsStoreSource, Boolean>()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Collections.newSetFromMap(new ConcurrentHashMap<>())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the compiler was saying some warnings. Hence left it as is.
} catch (Exception e) { | ||
// Ignored. If this errors out it means that someone is double | ||
// closing the region source and the region is already nulled out. | ||
LOG.info( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to have LOG.error here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually the removal is not needed to handle throw catch. I just used what was being done in MetricsREgionAggregateSource. Seems that is not needed.
private final MetricsStoreSource source; | ||
private MetricsStoreWrapper storeWrapper; | ||
|
||
public MetricsStore(final MetricsStoreWrapper wrapper, Configuration conf) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove conf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
public class MetricsStoreWrapperImpl implements MetricsStoreWrapper, Closeable { | ||
|
||
private final HStore store; | ||
private static final Logger LOG = LoggerFactory.getLogger(MetricsStoreWrapperImpl.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you like to use LOG somewhere? Or maybe dropped the idea and hence it's not being used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed it.
|
||
@Override | ||
public long getMemStoreSize() { | ||
// todo : change this - we need to expose data, heapsize and offheapdatasize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noted. For this change, is it good to give name memstoreDataSize
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that next time when we add heapsize and offheapdatasize, we can give name: memstoreHeapSize
, memstoreOffHeapSize
etc.
Your call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now lets be it this way. All other metrics has to be changed. We can do it that time.
public static final int PERIOD = 45; | ||
public static final String UNKNOWN = "unknown"; | ||
private ScheduledExecutorService executor; | ||
private Runnable runnable; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
executor and runnable both could be local variable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
if (!heap.current.isFileScanner()) { | ||
updateMetricsStore(true); | ||
} else { | ||
updateMetricsStore(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be simplified to updateMetricsStore(!heap.current.isFileScanner())
@@ -2547,7 +2565,7 @@ public CacheConfig getCacheConfig() { | |||
} | |||
|
|||
public static final long FIXED_OVERHEAD = | |||
ClassSize.align(ClassSize.OBJECT + (27 * ClassSize.REFERENCE) + (2 * Bytes.SIZEOF_LONG) | |||
ClassSize.align(ClassSize.OBJECT + (31 * ClassSize.REFERENCE) + (2 * Bytes.SIZEOF_LONG) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is a must or it is an improvement? Just trying to understand it better :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually here added 5 more refs but considering 4 only. And infact 4 is enough when u keep getRequestsFromMemstore and getRequestsFromStore .
getRequestsFromStore => getRequests.. This is store anyways so 'FromStore' is implicit.
if (memstoreRead) { | ||
metricsStore.updateMemstoreGet(); | ||
} | ||
metricsStore.updateFileGet(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be an else statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch. Thanks for the reviews.
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
/** | ||
* Description | ||
*/ | ||
String METRICS_DESCRIPTION = "Metrics about Stores under a region"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this will treat only one store under a region for this aggregation.
When we want to see an aggregated metric across all the store instances (under diff regions) for a table, this wont help right? (I mean metric for table CF) Do you think that will be more useful? The stuff of memstore hit rate and all at a CF level would be useful IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have now tried to consolidate the metric at the table level also. Pls have a look. @anoopsjohn .
not treat as mixed reads
Entry<String, Long> entry = iterator.next(); | ||
// append 'store' and its name to the metric | ||
mrb.addGauge(Interns.info(this.tableNamePrefixPart1 + _STORE | ||
+ entry.getKey().split(MetricsTableWrapperAggregate.UNDERSCORE)[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here the key will be <regionName>_<storeName>?
At this level we should call CF not store?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean the key should be Columnfamily? So _STORE - that should be _ColumnFamily?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Store is an instance of CF. So here what we have is an aggregated value across all instances (Stores) of a given CF in a table. So the name CF make sense than STORE? WDYT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it was addressed.
/** | ||
* @return number of get requests from file per store for this table | ||
*/ | ||
Map<String, Long> getMixedRequestsCount(String table); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We dont say whether it is read reqs in this method name. This is the total #rows reads from this Store right? Can we name in that way and avoid the term mixed?
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
this.executor = CompatibilitySingletonFactory.getInstance(MetricsExecutor.class).getExecutor(); | ||
this.runnable = new TableMetricsWrapperRunnable(); | ||
this.tableMetricsUpdateTask = this.executor.scheduleWithFixedDelay(this.runnable, period, | ||
this.period, TimeUnit.MILLISECONDS); | ||
this.tableMetricsUpdateTask = this.executor.scheduleWithFixedDelay(this.runnable, PERIOD, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a bug fix? (Like the one below) .. Can u explain .. Previously it was a config based thing and now hard coded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a bug fix and also the config based update I believe it was just added because the MetricsREgionServer based aggregate therad was having that config. But if you see the MetricRegionSource it is same 45 sec. The period of updation was rather too frequent. I just changed it to be in sync with MetricREgionSource.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would request to keep this out of this PR. Raise another specific Jira to address this issue and may be below one also and get it committed as part of that. Later it will be easy for some one who is searching the change history/ bug fix history.
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Committing the patch today. Thanks for all the reviews @anoopsjohn, @saintstack and @virajjasani . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a release note on the jira
MetricsRegionSource.ROW_READS_ONLY_ON_MEMSTORE_DESC); | ||
addCounter(mrb, this.regionWrapper.getMixedRowReadsCount(), | ||
MetricsRegionSource.MIXED_ROW_READS, | ||
MetricsRegionSource.MIXED_ROW_READS_ON_STORE_DESC); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still need this count? We don't have it already with the general read count?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My aversion to the extra counting is that we already do so much; it costs us loads in cpu. Was trying to do less if we can.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are asking in terms of CPU that we add on while collecting the metric? Ifyou see we do collect the metric at the HStore level per row when the StoreScanner completes a row process. That is now a longadder. Seems it is more performant than AtomicLong. Also the above change that we have done at the region level is nothing but just get that metric when that runnable thread keeps running. We don do any metric collection at this level. Are you still thinking it may be a problem. @saintstack ? BTW thanks for your review here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have it already with the general read count?
This is a read count across all stores. But now what we get additionally is per store how much is the read count that hit both memstore and files - also one more where we say how many rows per store came out of memstore only. Ideally the sum of these values per store should be equal to the total read count per region.
"_metric_"; | ||
regionNamePrefix1 = "Namespace_" + regionWrapper.getNamespace() + "_table_" | ||
+ regionWrapper.getTableName() + "_region_" + regionWrapper.getRegionName(); | ||
regionNamePrefix2 = "_metric_"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This 'metric' addition to name was of no value?
/** | ||
* @return the number of row reads on memstore and file per store | ||
*/ | ||
Map<String, Long> getMixedRowReadsCount(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, do we need to keep this count? It doesn't overlap w/ another?
Entry<String, Long> entry = iterator.next(); | ||
// append 'store' and its name to the metric | ||
mrb.addGauge(Interns.info(this.tableNamePrefixPart1 + _STORE | ||
+ entry.getKey().split(MetricsTableWrapperAggregate.UNDERSCORE)[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it was addressed.
Initial PR to get the metrics that indicates the gets that happens on the memstore. We just try to get the count at the row level and not at the cell level.
Only StoreScanner has details about how the cell or row was retrieved. Even if one of the cell is retrieved from the Memstore we will account the read to have used the memstore indicating it was trying to access latest data. We can add metric for scan too later.
Another thing to note is that the real time counter on the Store level will use MetricStore which will use the Counters with hbase-metric-api and not the hadoop's counters.