-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve performance of custom field models and serializers #4998
Comments
I think we would be best to hold off on this until we get #4878 completed |
While I appreciate the thoughtful analysis, as Dan points out, #4878 was opened specifically to address the shortcomings of the current implementation. I'd hate to see people spend too much time working to improve the current implementation only for it to be ripped out in v2.10. That said, I'll leave this open for anyone who wants to volunteer to make improvements, with the understanding that any changes to the data model itself are untenable given the pending work. |
That's entirely reasonable. The primary motivation for submitting this was that I'd done the profiling and I figured a ticket seemed as good a place as any to document my findings since I hadn't seen an deep analysis of the root causes anywhere yet. Would a PR for roganartu@13238ea be compatible with your plans for #4878? It represents a nearly 20% improvement by cutting out one SQL query per result object, and only touches the model to add an optional kwarg for custom field memoisation, otherwise it just uses the existing serialiser cache. |
Given that #4878 has been merged into |
Environment
Proposed Functionality
Improve the performance of custom field models and serializers by reducing the number of SQL queries issued per request.
GET /api/dcim/devices?limit=1000
takes ~20-30s on our best bare metal hardware (~10-15s withexclude=config_context
), and results in thousands of duplicate SQL queries relating to custom fields.The profiling I did seems to suggest some of this performance may be improved by #4878 too, though this issue seemed distinct enough to warrant a separate ticket. If you disagree, please feel free to close this.
Profiling
I did some Python profiling and Postgres query logging and found that the majority of queries come from a small number of sources. I will try to document the process I used thoroughly to make reproducing this easier.
I used the following command to generate the request each time:
With the following postgres query logging config:
Each query then created two lines in the query log like this:
I then used the following (on a freshly rotated log file for each request) to aggregate query counts:
I also wrote an adhoc middleware plugin that let me pass
enable_profiler=1
to get acProfile.Profile()
of the request dumped to/tmp
that I then extracted the call graph from usinggprof2dot
and visualised usingxdot
. This allowed me to narrow down some of the sources of the SQL a little easier, since I'm not very familiar with Django.The test database I did this profiling on contains:
dcim_device
obj_type_id
Vanilla v2.8.8
The request typically takes ~10-15s depending on server load and executes ~3k SQL queries. This example is relatively fast because it was run with a local test DB that was idle:
Almost all of these are identical, repeatedly fetching custom field names and choices:
CustomFieldModelSerializer
Found here:
netbox/netbox/extras/api/customfields.py
Line 132 in ee51dae
These two lines result in one SQL query per device object:
netbox/netbox/extras/api/customfields.py
Lines 146 to 148 in ee51dae
The specific query this executes ~1000 times is:
However, afaict these lines can be removed and
fields
can be replaced with the existingself.context["custom_fields"]
without any problems. It requires a minor change to howCustomFieldModel
caches too since that seems to be where the DRF lazy execution actually ends up triggering the query. Someone who understands Django better should comment on the viability, but this use ofself.context
reduced the query count to ~2k and response time by ~17%:Here is the commit that achieved this, let me know if this makes sense and I can submit a PR: roganartu@13238ea
I did some adhoc testing of the returned objects, and the custom_fields in them appeared to be correct. I admit I do not understand how DRF serialization contexts work (further than a cursory read while doing this investigation), so if this is obviously the incorrect way to fix this I'd love to learn more.
CustomFieldManager
This one is basically the same as above, but the calls are generated from each instantiated model instead of each serializer. This seems to make it harder to deal with on a per-request basis, since the
CustomFieldManager
doesn't appear to have any kind of request context, unlike the serializer which has the per-requestself.context["custom_fields"]
.I don't know how to approach this correctly, but as a proof-of-concept to demonstrate this is where these queries are coming from, I made the
CustomFieldManager
a singleton with a per-model._meta.concrete_model
lru_cache
around theCustomFieldManager.get_for_model
method from here:netbox/netbox/extras/models/customfields.py
Line 61 in d5b9722
Before (including the
CustomFieldModelSerializer
fix):After (including the
CustomFieldModelSerializer
fix):This is the part of this issue that I think is most related to #4878. I wonder if thread locals could be used (with a middleware to create/del a context dict) to give a per-request cache for this stuff to the model objects?
Missing Index
I didn't do enough digging to see what exactly is generating this query, but it's missing an index which makes it rather slow on large
extras_customfieldvalue
tables.Before:
Proposed index:
After:
I can submit a PR with a migration to add this index if you agree it should be fixed.
Remaining queries
This should probably be a separate issue, but about half the remaining queries seem to be parent device lookups. It'd be nice if these were just one query (eg as
WHERE dcim_device.id IN (...)
) but I'm not sure how easy Django makes that. This is obviously a much less significant improvement than the other 3,000 queries though!Use Case
We do a lot of things that involve fetching data for a large number of devices. Additionally, we have a large number of custom fields, though that doesn't seem to really matter that much here. It seems that the number of queries currently executed is
3*N
where N is number ofCustomFieldModel
objects instantiated. Reducing this to a constant number of queries should significantly improve general Netbox performance.Database Changes
See
Proposed index
section above.External Dependencies
None.
The text was updated successfully, but these errors were encountered: