-
Notifications
You must be signed in to change notification settings - Fork 382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[#2234] feat(core): Add the support of User entity #2481
Conversation
@Override | ||
public Namespace namespace() { | ||
return namespace; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need namespace for User
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
User is under a metalake. So we should have a namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's weird to use namespace here, we don't have a namespace concept for "User".
You can use metalake directly, or remove this interface. Since I don't see the usage of namespace from the current code.
uint64 id = 1; | ||
string name = 2; | ||
map<string, string> properties = 3; | ||
AuditInfo audit_info = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need to add some more basic information for user?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hesitate to add them. If some properties are very important, we can make them become fields later. At the beginning, we can put them into the properties.
|
||
/** A class representing a user metadata entity in Gravitino. */ | ||
@ToString | ||
public class MetalakeUser implements User, Entity, Auditable, HasIdentifier { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not to choose to name this CombinedUser
Because the underlying system may have different users. We can't map them one to one.
Not to choose to name this BaseUser
. Because BaseUser looks like an abstract class although we already have a class BaseMetalake
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like this name, user is just a user, MetalakeUser
will make me think that this user is only for metalake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about concreteUser
, UserImpl
and BasicUser
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about ManagedUser?
core/src/main/java/com/datastrato/gravitino/storage/kv/KvEntityStore.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/datastrato/gravitino/tenant/AccessControlManager.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/datastrato/gravitino/tenant/SupportsUserManagement.java
Outdated
Show resolved
Hide resolved
1e79fd7
to
2808f2d
Compare
api/src/main/java/com/datastrato/gravitino/exceptions/UserAlreadyExistsException.java
Show resolved
Hide resolved
public static final Field NAME = | ||
Field.required("name", String.class, "The name of the user entity."); | ||
public static final Field PROPERTIES = | ||
Field.optional("properties", Map.class, "The properties of the user entity."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you define properties here, but don't have a related method in the User
interface, how do users use this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have defined the method properties()
in User.java
#24L.
@@ -390,6 +407,7 @@ public boolean delete(NameIdentifier ident, EntityType entityType, boolean casca | |||
.build()); | |||
} | |||
|
|||
deleteUserEntitiesIfNecessary(ident, entityType); | |||
return transactionalKvBackend.delete(dataKey); | |||
}); | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yuqi1129 please help to check this part. I don't think the change is enough to support User
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- If a catalog and a user share the same entity, Gravitino will receive an error, so we need to take actions to prevent this from happening.
- You may also need to implement rational entity stores.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- How to prevent it?
- I prefer doing it in another pr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How to prevent it?
We need to consider it any way, let me think about it also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@qqqttt123
We'd better add some comments about it, If the name of a user and catalog is the same, they will share the same id in the id-name mapping, If we drop any one( the user or the catalog), the ID of it may also be removed( though we will not abandon the id of the name when dropping an entity), so I suggest we need to add some comment about it.
You can add the following comments here:
// As the name of users and catalogs shares the same ID if their names are the same, we must be cautious
// that we should NOT drop the id when remove a user or catalog, for more please refer to `KvNameMappingService`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my thought,
- Metalake is a tenant concept us. We have users, groups and roles under the metalake. They will be isolated among metalakes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I want is that the semantic is correct to support different operations, no matter what underlying mapping you use.
Besides, I don't think the storage hierarchy should be directly mapping the semantic layer. Like user, it is a concept under the metalake. But we can store them in a system catalog/table, we don't have to store it directly under metalake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another potential issue is that renaming may not go as planned. For instance, the catalog 'A' and user A share the same id 1, when we rename catalog A to B, then the mapping then become B to 1 and user A can't be found.
This problem should also exist for other entities, it should be fixed. For example, renaming table from A to B, then the fileset A cannot be found either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should never assume that there's only one entity type under the specific namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to wait for @yuqi1129 's fix or we can merge it beforehand?
@@ -390,6 +407,7 @@ public boolean delete(NameIdentifier ident, EntityType entityType, boolean casca | |||
.build()); | |||
} | |||
|
|||
deleteUserEntitiesIfNecessary(ident, entityType); | |||
return transactionalKvBackend.delete(dataKey); | |||
}); | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- If a catalog and a user share the same entity, Gravitino will receive an error, so we need to take actions to prevent this from happening.
- You may also need to implement rational entity stores.
core/src/main/java/com/datastrato/gravitino/storage/kv/KvEntityStore.java
Outdated
Show resolved
Hide resolved
import org.slf4j.LoggerFactory; | ||
|
||
/* AccessControlManager is used for manage users, roles, grant information, this class is | ||
* an important class for tenant management. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to be not a standard Java document format.
From the Snowflake, the user model is as follows.
We could ignore the session properties. Gravitino doesn't have the concept of the session. For object projecties, some of them may be useful like
|
From the Databricks, https://docs.databricks.com/api/workspace/users/create
|
From the Ranger, https://ranger.apache.org/apidocs/resource_XUserREST.html#resource_XUserREST_createXUser_POST
Some properties are similar to the Snowflake. Description is similar to the comment. |
From the surveys above, I think we can add the properties
|
CI isn't stable. It's not caused by this pull request. |
import com.google.errorprone.annotations.FormatMethod; | ||
import com.google.errorprone.annotations.FormatString; | ||
|
||
/** An exception thrown when a resource already exists. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/** An exception thrown when a user already exists. */
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
* @throws UserAlreadyExistsException If a User with the same identifier already exists. | ||
* @throws RuntimeException If adding the User encounters storage issues. | ||
*/ | ||
public User addUser(String metalake, String name) throws UserAlreadyExistsException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can have a UserManager
under this AccessControlManager
, so that we don't have to put everything in the AccessControlManager
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
.withNamespace( | ||
Namespace.of( | ||
metalake, | ||
AuthorizationConstants.SYSTEM_CATALOG_RESERVED_NAME, | ||
AuthorizationConstants.USER_SCHEMA_NAME)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'd better define this reserved Namespace
in the entity/user entity, not here in auth constants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put SYSTEM_CATALOG_RESERVED_NAME into CatalogEntity.
Put USER_SCHEMA_NAME into UserEntity.
try { | ||
return store.get(ofUser(metalake, user), Entity.EntityType.USER, UserEntity.class); | ||
} catch (NoSuchEntityException e) { | ||
LOG.warn("user {} does not exist in the metalake {}", user, metalake, e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Capitalize the first letter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -268,6 +269,10 @@ public Catalog createCatalog( | |||
Map<String, String> properties) | |||
throws NoSuchMetalakeException, CatalogAlreadyExistsException { | |||
|
|||
if (AuthorizationConstants.SYSTEM_CATALOG_RESERVED_NAME.equals(ident.name())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should also added to the catalog's common rule? @mchades
.startInclusive(true) | ||
.end(Bytes.increment(Bytes.wrap(prefix)).get()) | ||
.build()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of this method, I cannot get it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we delete a metalake, we should remove the users under the metalake.
I resolved the conflict with the main branch. |
import org.slf4j.Logger; | ||
import org.slf4j.LoggerFactory; | ||
|
||
/** UserManager is used for manage users */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is meaningless, please add more meaningful description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
private final IdGenerator idGenerator; | ||
|
||
/** | ||
* Constructs a User instance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is "User instance"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Removed the comment.
* | ||
* @param store The EntityStore to use for managing access control. | ||
* @param idGenerator The IdGenerator to use for generating identifiers. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can remove some of the javadoc here, it's not so useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done. Removed the comment.
What changes were proposed in this pull request?
Add the UserEntity. Gravitino doesn't manage users, just sets up the relationship between the metalake and the user. So we don't bring too many fields in the User entity. More user information should be managed by external user system.
Why are the changes needed?
Fix: #2234
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Add UT.