Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add new metadata type - data element #1191

Merged
merged 4 commits into from
Mar 8, 2023

Conversation

jczhong84
Copy link
Collaborator

Add a new metadata type: data element.

A data element is a basic unit of data that has a specific meaning and can be used to represent information. Data elements are used to represent various types of information, such as names, addresses, dates, numbers, and so on. Each data element has a name or identifier, a data type that specifies the kind of data that can be stored, and a set of properties that determine how the data element can be used. A data element will be associated with a table column.

This PR is to add the support of loading and displaying data elements from a metastore. Currently we dont support adding a data element through Querybook UI.

image
image

class DataElementAssociationType(Enum):
REF = "ref"
ARRAY = "array"
MAP = "map"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also add ROW here?
while I don't think we will need it for now, but we don't need to do DB changes in future if we do need it


@with_session
def get_data_element_by_name(name: str, session=None):
return session.query(DataElement).filter(DataElement.name == name).first()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return DataElement.get(name=name, session=session)

)
return None

resp = {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you define resp type here?
also logic might not be used solely as resp to api, can you rename it to data_element?

f"Can't find created_by of data element {data_element_tuple.name}: {data_element_tuple.created_by}"
)

data_element_tuple.created_by = created_by or None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if created_by is None, then the expr is None or None?

type: DataElementAssociationType
# data element name. required for all association types
value: str
value_primitive: str = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value_primitive_type

same as key, also can you add a comment for what it is for?

data_element = get_data_element_by_name(data_element_name, session=session)

if not data_element and not primitive_type:
LOG.error(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just raise exception here and fail the whole flow with invalid values? Otherwise you might end up with partial data elements

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will check the return value of this function to determine if adding both key and value together, will not end up with partial data elements.

The reason of not raising exception to fail the whole flow is, adding data elements is just part of the whole table sync session, would like other parts to be finished even data elements can't be synced succesfully.



@with_session
def create_or_update_data_element(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function is not called anywhere?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not used in this PR, but will be used by the data element sync task

default=DataElementAssociationType.REF,
nullable=False,
)
property_name = sql.Column(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make this strings? otherwise we are gonna have fun implementing rows in future if needed lol

/>
)}
<span>
{typeof dataElement === 'string'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!isDataElement

export const DataElementCard = ({ dataElement }: IProps) => {
const { id, name, type, description, properties } = dataElement;

const onExternalLinkClick = async () => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

useCallback

Copy link
Collaborator Author

@jczhong84 jczhong84 Mar 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed this before, other than cases like

  • it will be depended by other hooks
  • it's a true performance issue

the cost of creating the memoized function itself may not less than the cost of creating the function itself. So i'd prefer not using useCallback if not really needed.

@czgu czgu merged commit 2742cdc into pinterest:master Mar 8, 2023
@jczhong84 jczhong84 deleted the feat/dataelement branch April 7, 2023 00:41
rohan-sh1 pushed a commit to CAI-TECHNOLOGIES/cai-ext-db-explorer that referenced this pull request Apr 11, 2023
* feat: add new metadata type: data element

* add create data element logic

* fix linter

* update db model and comments
aidenprice pushed a commit to arrowtail-precision/querybook that referenced this pull request Jan 3, 2024
* feat: add new metadata type: data element

* add create data element logic

* fix linter

* update db model and comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants