Add elastic logging for reference data logs #3566
Merged
What this PR does / why we need it:
This PR adds Elasticsearch logging for reference data to the request logger component.
Reference data can be sent as cloud events to the request logger, which will parse and insert the data exactly like it would with live predictions, with the only differences being that:
reference-log-<deployment_type>-<deployment_namespace>-<deployment_name>
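As a rough illustration of the naming pattern above, the index name could be assembled as in the sketch below. The function name `reference_index_name` and the sample values are hypothetical and are not taken from the request logger code; only the `reference-log-<deployment_type>-<deployment_namespace>-<deployment_name>` pattern comes from this PR.

```python
def reference_index_name(deployment_type: str,
                         deployment_namespace: str,
                         deployment_name: str) -> str:
    """Build an index name following the pattern described in this PR.

    Hypothetical helper: the real request logger may build the name differently.
    """
    return "-".join(
        ["reference-log", deployment_type, deployment_namespace, deployment_name]
    )


if __name__ == "__main__":
    # Sample values are made up for illustration.
    print(reference_index_name("seldondeployment", "default", "income"))
```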
This PR also introduces a fix for catching bad payloads: if metadata can be found for a request but the payload does not match it, a BadPayloadException is thrown. The exception is caught, a warning is logged, and the data is not inserted into Elasticsearch. Previously, if the payload had more columns than the metadata had features, no error was thrown and the data was inserted anyway. If the payload had fewer columns than the metadata had features, a runtime error was thrown (here or here).
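The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `BadPayloadException` class here is a local stand-in for the one named above, and `validate_row`, `process_row`, and the `insert` callback are hypothetical names. It assumes a mismatch means the number of payload columns differs from the number of metadata features.

```python
import logging

logger = logging.getLogger("request_logger_sketch")


class BadPayloadException(Exception):
    """Local stand-in for the exception named in this PR."""


def validate_row(row, feature_names):
    """Pair each payload column with a metadata feature, or raise on mismatch."""
    if len(row) != len(feature_names):
        raise BadPayloadException(
            f"payload has {len(row)} columns but metadata defines "
            f"{len(feature_names)} features"
        )
    return dict(zip(feature_names, row))


def process_row(row, feature_names, insert):
    """Insert a validated row; on a bad payload, log a warning and skip it.

    `insert` is a hypothetical callback standing in for the Elasticsearch write.
    Returns True if the row was inserted, False if it was skipped.
    """
    try:
        doc = validate_row(row, feature_names)
    except BadPayloadException as exc:
        logger.warning("Skipping insert: %s", exc)
        return False
    insert(doc)
    return True
```

With this shape, both the too-many-columns and too-few-columns cases take the same path: a warning is logged and nothing is written.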
Special notes for your reviewer:
A new index pattern has been introduced, so the Elasticsearch token needs the required permissions for it.
Does this PR introduce a user-facing change?:
An example of data inserted into Elasticsearch, using a reference dataset taken from the tabular income dataset in this example, is shown here: