Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add NotificationType.VALIDATE which can serve as a dry-run of a CREATE without a metadata file #321

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -171,7 +171,7 @@ public class BasePolarisCatalog extends BaseMetastoreViewCatalog
private CloseableGroup closeableGroup;
private Map<String, String> catalogProperties;
private Map<String, String> tableDefaultProperties;
private final FileIOFactory fileIOFactory;
private FileIOFactory fileIOFactory;
private PolarisMetaStoreManager metaStoreManager;

/**
Expand Down Expand Up @@ -1613,6 +1613,11 @@ private PolarisMetaStoreManager getMetaStoreManager() {
return metaStoreManager;
}

@VisibleForTesting
void setFileIOFactory(FileIOFactory newFactory) {
this.fileIOFactory = newFactory;
}

@VisibleForTesting
long getCatalogId() {
// TODO: Properly handle initialization
Expand Down Expand Up @@ -1873,6 +1878,51 @@ private boolean sendNotificationForTableLike(
if (notificationType == NotificationType.DROP) {
return dropTableLike(PolarisEntitySubType.TABLE, tableIdentifier, Map.of(), false /* purge */)
.isSuccess();
} else if (notificationType == NotificationType.VALIDATE) {
// In this mode we don't want to make any mutations, so we won't auto-create non-existing
// parent namespaces. This means when we want to validate allowedLocations for the proposed
// table metadata location, we must independently find the deepest non-null parent namespace
// of the TableIdentifier, which may even be the base CatalogEntity if no parent namespaces
// actually exist yet. We can then extract the right StorageInfo entity via a normal call
// to findStorageInfoFromHierarchy.
PolarisResolvedPathWrapper resolvedStorageEntity = null;
Optional<PolarisEntity> storageInfoEntity = Optional.empty();
for (int i = tableIdentifier.namespace().length(); i >= 0; i--) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still validate that this principle has privileges to create namespaces and table (if they don't exist)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. This is validated in the updated PolarisCatalogHandlerWrapperAuthzTest.java

Namespace nsLevel =
Namespace.of(
Arrays.stream(tableIdentifier.namespace().levels())
.limit(i)
.toArray(String[]::new));
resolvedStorageEntity = resolvedEntityView.getResolvedPath(nsLevel);
if (resolvedStorageEntity != null) {
storageInfoEntity = findStorageInfoFromHierarchy(resolvedStorageEntity);
break;
}
}

if (resolvedStorageEntity == null || storageInfoEntity.isEmpty()) {
throw new BadRequestException(
"Failed to find StorageInfo entity for TableIdentifier %s", tableIdentifier);
}

// Validate location against the resolvedStorageEntity
String metadataLocation =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To confirm, is the expectation is to supply a fake metadata.json location in the payload?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I described some of the options in the rest spec description, but in a nutshell it can be either a parent directory or a prospective full filename, and the behavior is simply for the validation to ensure that whatever path is specified conforms to the ALLOWED_LOCATIONS of the storage config.

transformTableLikeLocation(request.getPayload().getMetadataLocation());
validateLocationForTableLike(tableIdentifier, metadataLocation, resolvedStorageEntity);

// Validate that we can construct a FileIO
String locationDir = metadataLocation.substring(0, metadataLocation.lastIndexOf("/"));
refreshIOWithCredentials(
tableIdentifier,
Set.of(locationDir),
resolvedStorageEntity,
new HashMap<>(tableDefaultProperties),
Set.of(PolarisStorageActions.READ));

LOGGER.debug(
"Successful VALIDATE notification for tableIdentifier {}, metadataLocation {}",
tableIdentifier,
metadataLocation);
} else if (notificationType == NotificationType.CREATE
|| notificationType == NotificationType.UPDATE) {

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,8 @@ public enum NotificationType {
UNKNOWN(0, "UNKNOWN"),
CREATE(1, "CREATE"),
UPDATE(2, "UPDATE"),
DROP(3, "DROP");
DROP(3, "DROP"),
VALIDATE(4, "VALIDATE");

NotificationType(int id, String displayName) {
this.id = id;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -360,6 +360,142 @@ public void testCreateNestedNamespaceUnderMissingParent() {
.hasMessageContaining("Parent");
}

@Test
public void testValidateNotificationWhenTableAndNamespacesDontExist() {
Assumptions.assumeTrue(
requiresNamespaceCreate(),
"Only applicable if namespaces must be created before adding children");
Assumptions.assumeTrue(
supportsNestedNamespaces(), "Only applicable if nested namespaces are supported");
Assumptions.assumeTrue(
supportsNotifications(), "Only applicable if notifications are supported");

final String tableLocation = "s3://externally-owned-bucket/validate_table/";
final String tableMetadataLocation = tableLocation + "metadata/v1.metadata.json";
BasePolarisCatalog catalog = catalog();

Namespace namespace = Namespace.of("parent", "child1");
TableIdentifier table = TableIdentifier.of(namespace, "table");

// For a VALIDATE request we can pass in a full metadata JSON filename or just the table's
// metadata directory; either way the path will be validated to be under the allowed locations,
// but any actual metadata JSON file will not be accessed.
NotificationRequest request = new NotificationRequest();
request.setNotificationType(NotificationType.VALIDATE);
TableUpdateNotification update = new TableUpdateNotification();
update.setMetadataLocation(tableMetadataLocation);
update.setTableName(table.name());
update.setTableUuid(UUID.randomUUID().toString());
update.setTimestamp(230950845L);
request.setPayload(update);

// We should be able to send the notification without creating the metadata file since it's
// only validating the ability to send the CREATE/UPDATE notification possibly before actually
// creating the table at all on the remote catalog.
Assertions.assertThat(catalog.sendNotification(table, request))
.as("Notification should be sent successfully")
.isTrue();
Assertions.assertThat(catalog.namespaceExists(namespace))
.as("Intermediate namespaces should not be created")
.isFalse();
Assertions.assertThat(catalog.tableExists(table))
.as("Table should not be created for a VALIDATE notification")
.isFalse();

// Now also check that despite creating the metadata file, the validation call still doesn't
// create any namespaces or tables.
InMemoryFileIO fileIO = (InMemoryFileIO) catalog.getIo();
fileIO.addFile(
tableMetadataLocation,
TableMetadataParser.toJson(createSampleTableMetadata(tableLocation)).getBytes(UTF_8));

Assertions.assertThat(catalog.sendNotification(table, request))
.as("Notification should be sent successfully")
.isTrue();
Assertions.assertThat(catalog.namespaceExists(namespace))
.as("Intermediate namespaces should not be created")
.isFalse();
Assertions.assertThat(catalog.tableExists(table))
.as("Table should not be created for a VALIDATE notification")
.isFalse();
}

@Test
public void testValidateNotificationInDisallowedLocation() {
Assumptions.assumeTrue(
requiresNamespaceCreate(),
"Only applicable if namespaces must be created before adding children");
Assumptions.assumeTrue(
supportsNestedNamespaces(), "Only applicable if nested namespaces are supported");
Assumptions.assumeTrue(
supportsNotifications(), "Only applicable if notifications are supported");

// The location of the metadata JSON file specified in the create will be forbidden.
// For a VALIDATE call we can pass in the metadata/ prefix itself instead of a metadata JSON
// filename.
final String tableLocation = "s3://forbidden-table-location/table/";
final String tableMetadataLocation = tableLocation + "metadata/";
BasePolarisCatalog catalog = catalog();

Namespace namespace = Namespace.of("parent", "child1");
TableIdentifier table = TableIdentifier.of(namespace, "table");

NotificationRequest request = new NotificationRequest();
request.setNotificationType(NotificationType.VALIDATE);
TableUpdateNotification update = new TableUpdateNotification();
update.setMetadataLocation(tableMetadataLocation);
update.setTableName(table.name());
update.setTableUuid(UUID.randomUUID().toString());
update.setTimestamp(230950845L);
request.setPayload(update);

Assertions.assertThatThrownBy(() -> catalog.sendNotification(table, request))
.isInstanceOf(ForbiddenException.class)
.hasMessageContaining("Invalid location");
}

@Test
public void testValidateNotificationFailToCreateFileIO() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another test case to add would be if the role doesn't have privileges to create the namespace or table

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Authz tests are covered in PolarisCatalogHandlerWrapperAuthzTest.java

Assumptions.assumeTrue(
requiresNamespaceCreate(),
"Only applicable if namespaces must be created before adding children");
Assumptions.assumeTrue(
supportsNestedNamespaces(), "Only applicable if nested namespaces are supported");
Assumptions.assumeTrue(
supportsNotifications(), "Only applicable if notifications are supported");

// The location of the metadata JSON file specified in the create will be allowed, but
// we'll inject a separate ForbiddenException during FileIO instantiation.
// For a VALIDATE call we can pass in the metadata/ prefix itself instead of a metadata JSON
// filename.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: update comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

final String tableLocation = "s3://externally-owned-bucket/validate_table/";
final String tableMetadataLocation = tableLocation + "metadata/";
BasePolarisCatalog catalog = catalog();

Namespace namespace = Namespace.of("parent", "child1");
TableIdentifier table = TableIdentifier.of(namespace, "table");

NotificationRequest request = new NotificationRequest();
request.setNotificationType(NotificationType.VALIDATE);
TableUpdateNotification update = new TableUpdateNotification();
update.setMetadataLocation(tableMetadataLocation);
update.setTableName(table.name());
update.setTableUuid(UUID.randomUUID().toString());
update.setTimestamp(230950845L);
request.setPayload(update);

catalog.setFileIOFactory(
new FileIOFactory() {
@Override
public FileIO loadFileIO(String impl, Map<String, String> properties) {
throw new ForbiddenException("Fake failure applying downscoped credentials");
}
});
Assertions.assertThatThrownBy(() -> catalog.sendNotification(table, request))
.isInstanceOf(ForbiddenException.class)
.hasMessageContaining("Fake failure applying downscoped credentials");
}

@Test
public void testUpdateNotificationWhenTableAndNamespacesDontExist() {
Assumptions.assumeTrue(
Expand Down
Loading