Add server config option to disable validation of outgoing data #1530
@@ -91,7 +91,7 @@ def meta_values(


 def handle_response_fields(
-    results: Union[List[EntryResource], EntryResource],
+    results: Union[List[EntryResource], EntryResource, List[Dict], Dict],
     exclude_fields: Set[str],
     include_fields: Set[str],
 ) -> List[Dict[str, Any]]:

@@ -115,7 +115,11 @@ def handle_response_fields(

     new_results = []
     while results:
-        new_entry = results.pop(0).dict(exclude_unset=True, by_alias=True)
+        new_entry = results.pop(0)
+        try:
+            new_entry = new_entry.dict(exclude_unset=True, by_alias=True)  # type: ignore[union-attr]
+        except AttributeError:
+            pass
Comment on lines +119 to +122

You should not use try and except here. Handling an exception is very slow, so you should only use it when failure is rare (< 1%).

I'm not sure this is so clear-cut; I just made an artificial benchmark with a very simple pydantic model with exception handling and isinstance checks. If you use exception handling then the [...] I would rather avoid slowing down the "slower" method, i.e., using exception handling by default.

If performance is important, the database will probably turn off validation to speed things up.

Perhaps, though I'm not convinced that disabling validation provides any meaningful performance boost; I think it is instead used to bypass some of the strict rules we have on databases where applying them takes too much effort (e.g., NOMAD uses "X" in about 10 out of millions of chemical formulae, and trying to query them with validation on causes crashes).

I just did a quick try on my laptop with the test data, and it takes 25% longer to process the [...]

Wow, really? I tried via the validator and could only get a 1-2% difference. I'll re-investigate if I get time.

I meant that the total processing time of a request increases by 25% if I do the validation, compared to not validating. I just did some more testing, and it seems that the try/except block takes about 1.5 times longer to execute than the "if" statement. Using "if" saves about 2.25 µs per entry, which is smaller than I had expected. So for our example server we would only save 40 µs on 0.2 s, i.e., only 0.02%.
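
The try/except versus isinstance trade-off discussed in this thread can be checked with a small micro-benchmark. The sketch below is not the benchmark the reviewers ran: `FakeEntry` is a hypothetical stand-in for a pydantic model (so the snippet needs only the standard library), and `to_dict_try`, `to_dict_isinstance`, and `bench` are illustrative names. Absolute timings will differ between machines and Python versions.

```python
import timeit


class FakeEntry:
    """Hypothetical stand-in for a pydantic model that exposes `.dict()`."""

    def __init__(self, data):
        self._data = data

    def dict(self, **kwargs):
        # Keyword arguments such as `exclude_unset` are accepted but ignored here.
        return dict(self._data)


def to_dict_try(entry):
    """Convert via try/except, as in the diff above."""
    try:
        return entry.dict(exclude_unset=True, by_alias=True)
    except AttributeError:
        # Plain dicts have no `.dict()` method and are returned unchanged.
        return entry


def to_dict_isinstance(entry):
    """Convert via an explicit type check, as the reviewer proposes."""
    if isinstance(entry, dict):
        return entry
    return entry.dict(exclude_unset=True, by_alias=True)


def bench(func, entries, number=10_000):
    """Return the total time in seconds for `number` passes over `entries`."""
    return timeit.timeit(lambda: [func(e) for e in entries], number=number)


if __name__ == "__main__":
    models = [FakeEntry({"id": str(i)}) for i in range(10)]
    dicts = [{"id": str(i)} for i in range(10)]
    for label, entries in (("models", models), ("dicts", dicts)):
        t_try = bench(to_dict_try, entries)
        t_if = bench(to_dict_isinstance, entries)
        print(f"{label}: try/except {t_try:.4f} s, isinstance {t_if:.4f} s")
```

Running it on both an all-model and an all-dict input shows the two extremes: the exception is never raised in the first case and always raised in the second, which is the situation that matters when validation is disabled.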

         # Remove fields excluded by their omission in `response_fields`
         for field in exclude_fields:

@@ -133,7 +137,7 @@ def handle_response_fields(


 def get_included_relationships(
-    results: Union[EntryResource, List[EntryResource]],
+    results: Union[EntryResource, List[EntryResource], Dict, List[Dict]],
     ENTRY_COLLECTIONS: Dict[str, EntryCollection],
     include_param: List[str],
 ) -> List[Union[EntryResource, Dict]]:

@@ -170,11 +174,17 @@ def get_included_relationships(
         if doc is None:
             continue

-        relationships = doc.relationships
+        try:
+            relationships = doc.relationships  # type: ignore
+        except AttributeError:
+            relationships = doc.get("relationships", None)
JPBergsma marked this conversation as resolved.
+
         if relationships is None:
             continue

-        relationships = relationships.dict()
+        if not isinstance(relationships, dict):
+            relationships = relationships.dict()
+
         for entry_type in ENTRY_COLLECTIONS:
             # Skip entry type if it is not in `include_param`
             if entry_type not in include_param:

@@ -187,7 +197,9 @@ def get_included_relationships(
                     if ref["id"] not in endpoint_includes[entry_type]:
                         endpoint_includes[entry_type][ref["id"]] = ref

-    included = {}
+    included: Dict[
+        str, Union[List[EntryResource], EntryResource, List[Dict], Dict]
+    ] = {}
     for entry_type in endpoint_includes:
         compound_filter = " OR ".join(
             ['id="{}"'.format(ref_id) for ref_id in endpoint_includes[entry_type]]

@@ -203,6 +215,8 @@ def get_included_relationships(

         # still need to handle pagination
         ref_results, _, _, _, _ = ENTRY_COLLECTIONS[entry_type].find(params)
+        if ref_results is None:
+            ref_results = []
         included[entry_type] = ref_results

     # flatten dict by endpoint to list

@@ -273,7 +287,7 @@ def get_entries(

     return response(
         links=links,
-        data=results,
+        data=results if results else [],
         meta=meta_values(
             url=request.url,
             data_returned=data_returned,

@@ -326,7 +340,7 @@ def get_single_entry(

     return response(
         links=links,
-        data=results,
+        data=results if results else None,
         meta=meta_values(
             url=request.url,
             data_returned=data_returned,

I think these lines are no longer needed for our implementation, now that we always pass a list.