Skip to content

Conversation

@dev-work-8103
Copy link
Contributor

No description provided.

@dev-work-8103 dev-work-8103 marked this pull request as ready for review January 6, 2026 08:41
@Apache-HBase

This comment has been minimized.

return Long.MAX_VALUE;
}
return dateFormat.parse(extractedValue).getTime();
} catch (IndexOutOfBoundsException e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: omit this catch, since we validate the condition on init?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

public static final String ROWKEY_DATE_FORMAT =
"hbase.hstore.datatiering.tieringvalueprovider.dateformat";
public static final String ROWKEY_REGEX_EXTRACT_GROUP =
"hbase.hstore.datatiering.tieringvalueprovider.regexextractgroup";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Since these would be table specific configs, let's follow the related all caps naming format:

  1. TIERING_KEY_DATE_PATTERN
  2. TIERING_KEY_DATE_FORMAT
  3. TIERING_KEY_DATE_GROUP

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant, the properties names itself should follow that pattern, just like we do for TIERING_CELL_QUALIFIER:

}

@Test
public void TestCustomCellTieredCompactorWithRowKeyDateTieringValueProvider() throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: camel casing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 155 to 160
utility.getConfiguration().set(TIERING_VALUE_PROVIDER,
RowKeyDateTieringValueProvider.class.getName());
utility.getConfiguration().set(RowKeyDateTieringValueProvider.ROWKEY_REGEX_PATTERN,
"(\\d{17})$");
utility.getConfiguration().set(RowKeyDateTieringValueProvider.ROWKEY_DATE_FORMAT,
"yyyyMMddHHmmssSSS");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should test that we can set these configs at the table level configuration, since multiple tables in a cluster may have its own row key format and thus require different regexes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added two new tests, one has Global configurations for value provider and other one has table level configurations with 2 different REGEX patterns for 2 different tables.
Thanks.

rowKeyStr = Bytes.toString(rowArray);
// Validate UTF-8 encoding
if (rowKeyStr.contains("\ufffd")) {
LOG.debug("Row key contains invalid UTF-8 sequences");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Let's say we failed to extract the date in the log message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

Comment on lines 80 to 82
utility.getConfiguration().setInt("hbase.hfile.compaction.discharger.interval", 10);
utility.startMiniCluster();

Copy link
Contributor

@wchevreuil wchevreuil Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we adding these here?

Copy link
Contributor Author

@dev-work-8103 dev-work-8103 Jan 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To handle different configurations across test methods, I've adopted the pattern used in TestColumnFamilyDescriptorDefaultVersions.

  1. The setUp() method initializes HBaseTestingUtil and starts the mini cluster with the default CustomCellTieringValueProvider configuration.
  2. testCustomCellTieredCompactor() executes using this default configuration.
  3. testCustomCellTieredCompactorWithRowKeyDateTieringValue() shuts down the mini cluster, reconfigures it with RowKeyDateTieringValueProvider, and restarts before executing its test logic.
  4. The tearDown() method ensures proper cleanup after each test.

Comment on lines 155 to 247
@Test
public void testCustomCellTieredCompactorWithRowKeyDateTieringValueProviderWithGlobalConf()
throws Exception {
utility.getConfiguration().set(TIERING_VALUE_PROVIDER,
RowKeyDateTieringValueProvider.class.getName());
utility.getConfiguration().set(RowKeyDateTieringValueProvider.TIERING_KEY_DATE_PATTERN,
"(\\d{17})$");
utility.getConfiguration().set(RowKeyDateTieringValueProvider.TIERING_KEY_DATE_FORMAT,
"yyyyMMddHHmmssSSS");
utility.startMiniCluster();

ColumnFamilyDescriptorBuilder clmBuilder = ColumnFamilyDescriptorBuilder.newBuilder(FAMILY);
clmBuilder.setValue("hbase.hstore.engine.class", CustomTieredStoreEngine.class.getName());

TableName tableName = TableName.valueOf("testCustomCellTieredCompactor");
TableDescriptorBuilder tblBuilder = TableDescriptorBuilder.newBuilder(tableName);
tblBuilder.setColumnFamily(clmBuilder.build());
utility.getAdmin().createTable(tblBuilder.build());
utility.waitTableAvailable(tableName);

Connection connection = utility.getConnection();
Table table = connection.getTable(tableName);
long recordTime = System.currentTimeMillis();

SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMddHHmmssSSS");

// Write data with date embedded in row key
for (int i = 0; i < 6; i++) {
List<Put> puts = new ArrayList<>(2);

// Old data - embed old date in row key (11 years ago)
String oldDate = sdf.format(new Date(recordTime - (11L * 366L * 24L * 60L * 60L * 1000L)));
String oldRowKey = "row_" + i + "_" + oldDate;
Put put = new Put(Bytes.toBytes(oldRowKey));
put.addColumn(FAMILY, Bytes.toBytes("val"), Bytes.toBytes("v" + i));
puts.add(put);

// Recent data - embed current date in row key
String recentDate = sdf.format(new Date(recordTime));
String recentRowKey = "row_" + (i + 1000) + "_" + recentDate;
put = new Put(Bytes.toBytes(recentRowKey));
put.addColumn(FAMILY, Bytes.toBytes("val"), Bytes.toBytes("v" + (i + 1000)));
puts.add(put);

table.put(puts);
utility.flush(tableName);
}
table.close();

long firstCompactionTime = System.currentTimeMillis();
utility.getAdmin().majorCompact(tableName);
Waiter.waitFor(utility.getConfiguration(), 5000,
() -> utility.getMiniHBaseCluster().getMaster().getLastMajorCompactionTimestamp(tableName)
> firstCompactionTime);

long numHFiles = utility.getNumHFiles(tableName, FAMILY);
assertEquals(1, numHFiles);

utility.getMiniHBaseCluster().getRegions(tableName).get(0).getStore(FAMILY).getStorefiles()
.forEach(file -> {
byte[] rangeBytes = file.getMetadataValue(CUSTOM_TIERING_TIME_RANGE);
assertNotNull(rangeBytes);
try {
TimeRangeTracker timeRangeTracker = TimeRangeTracker.parseFrom(rangeBytes);
assertEquals((recordTime - (11L * 366L * 24L * 60L * 60L * 1000L)),
timeRangeTracker.getMin());
assertEquals(recordTime, timeRangeTracker.getMax());
} catch (IOException e) {
fail(e.getMessage());
}
});

long secondCompactionTime = System.currentTimeMillis();
utility.getAdmin().majorCompact(tableName);
Waiter.waitFor(utility.getConfiguration(), 5000,
() -> utility.getMiniHBaseCluster().getMaster().getLastMajorCompactionTimestamp(tableName)
> secondCompactionTime);

numHFiles = utility.getNumHFiles(tableName, FAMILY);
assertEquals(2, numHFiles);

utility.getMiniHBaseCluster().getRegions(tableName).get(0).getStore(FAMILY).getStorefiles()
.forEach(file -> {
byte[] rangeBytes = file.getMetadataValue(CUSTOM_TIERING_TIME_RANGE);
assertNotNull(rangeBytes);
try {
TimeRangeTracker timeRangeTracker = TimeRangeTracker.parseFrom(rangeBytes);
assertEquals(timeRangeTracker.getMin(), timeRangeTracker.getMax());
} catch (IOException e) {
fail(e.getMessage());
}
});
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These configs shouldn't be applied globally, let's just keep the testCustomCellTieredCompactorWithRowKeyDateTieringValueProviderWithTableLevelConf test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 49s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
_ master Compile Tests _
+1 💚 mvninstall 5m 41s master passed
+1 💚 compile 5m 19s master passed
+1 💚 checkstyle 1m 33s master passed
+1 💚 spotbugs 2m 24s master passed
+1 💚 spotless 1m 12s branch has no errors when running spotless:check.
_ Patch Compile Tests _
+1 💚 mvninstall 4m 40s the patch passed
+1 💚 compile 4m 50s the patch passed
+1 💚 javac 4m 50s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 1m 28s /results-checkstyle-hbase-server.txt hbase-server: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
+1 💚 spotbugs 2m 46s the patch passed
+1 💚 hadoopcheck 16m 35s Patch does not cause any errors with Hadoop 3.3.6 3.4.1.
+1 💚 spotless 1m 21s patch has no errors when running spotless:check.
_ Other Tests _
+1 💚 asflicense 0m 16s The patch does not generate ASF License warnings.
60m 15s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7593/3/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #7593
JIRA Issue HBASE-29569
Optional Tests dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless
uname Linux c30dad5039ed 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / f4dab16
Default Java Eclipse Adoptium-17.0.11+9
Max. process+thread count 84 (vs. ulimit of 30000)
modules C: hbase-server U: hbase-server
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7593/3/console
versions git=2.34.1 maven=3.9.8 spotbugs=4.7.3
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 13s Docker mode activated.
-0 ⚠️ yetus 0m 4s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+1 💚 mvninstall 3m 19s master passed
+1 💚 compile 0m 59s master passed
+1 💚 javadoc 0m 29s master passed
+1 💚 shadedjars 6m 37s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 2m 53s the patch passed
+1 💚 compile 0m 59s the patch passed
+1 💚 javac 0m 59s the patch passed
+1 💚 javadoc 0m 28s the patch passed
+1 💚 shadedjars 6m 38s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
-1 ❌ unit 261m 4s /patch-unit-hbase-server.txt hbase-server in the patch failed.
287m 58s
Subsystem Report/Notes
Docker ClientAPI=1.52 ServerAPI=1.52 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7593/3/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR #7593
JIRA Issue HBASE-29569
Optional Tests javac javadoc unit compile shadedjars
uname Linux f9e97b465297 6.14.0-1018-aws #18~24.04.1-Ubuntu SMP Mon Nov 24 19:46:27 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / f4dab16
Default Java Eclipse Adoptium-17.0.11+9
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7593/3/testReport/
Max. process+thread count 4983 (vs. ulimit of 30000)
modules C: hbase-server U: hbase-server
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7593/3/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants