Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: allow rename of structuremetadata labels #13955

Merged
merged 5 commits into from
Sep 27, 2024

Conversation

ravishankar15
Copy link
Contributor

@ravishankar15 ravishankar15 commented Aug 25, 2024

What this PR does / why we need it:

Which issue(s) this PR fixes:
Fixes #13937

Special notes for your reviewer:

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • Title matches the required conventional commits format, see here
    • Note that Promtail is considered to be feature complete, and future development for logs collection will be in Grafana Alloy. As such, feat PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • For Helm chart changes bump the Helm chart version in production/helm/loki/Chart.yaml and update production/helm/loki/CHANGELOG.md and production/helm/loki/README.md. Example PR
  • If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

@ravishankar15 ravishankar15 requested a review from a team as a code owner August 25, 2024 17:42
Copy link
Contributor

@chaudum chaudum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution!

Comment on lines 513 to 515
StructuredMetadata: []logproto.LabelAdapter{
{Name: "boo", Value: "a"},
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The correctness of the fields StructedMetadata and Parsed is not asserted in this test.

@@ -407,6 +407,7 @@ func (lf *LabelsFormatter) Process(ts int64, l []byte, lbs *LabelsBuilder) ([]by
lbs.SetErrorDetails(err.Error())
continue
}
lbs.Del(f.Name) // Deleting to avoid duplication
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, from what I can tell, this would solve the bug. However, I wonder if the bug is somewhere else, if we need to delete the label here. Or phrased differently: Could we hide the root cause of the bug by deleting the label?

There are two issues to the problem:

  1. A label filter after a label format expression needs to match the newly assigned value.
    At the moment, we incorrectly filter on the raw structured metadata label, instead of on the parsed one, because of the lookup order in the internal labels builder:
    StreamLabel -> StructuredMetadataLabel -> ParsedLabel, see https://github.com/grafana/loki/blob/7c1a8493b/pkg/logql/log/labels.go#L301-L307
  2. There must not be duplicate fields in the response.
    The labels builder handles stream labels differently than the rest when building the resulting labels, see https://github.com/grafana/loki/blob/7c1a8493b/pkg/logql/log/labels.go#L442-L449
    But since both metadata labels and parsed labels fall under "added" labels, just different "categories", label names will be duplicate, when the LabelsFormat stage adds the label name=value as "parsed" label, see https://github.com/grafana/loki/blob/7c1a8493b/pkg/logql/log/fmt.go#L410

@salvacorts Maybe I am overly cautious and my worries are ungrounded, so I want your opinion on this.

Copy link
Contributor

@salvacorts salvacorts Aug 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @ravishankar15 for creating the PR!

@salvacorts Maybe I am overly cautious and my worries are ungrounded, so I want your opinion on this.

I agree @chaudum, this problem may not only limited to the label formatter, but to any pipeline expression that may overwrite a label that already exists in structured metadata.

The bug is in the LabelsBuilder.UnsortedLabels method, which is used by IntoMap, later used by LabelsFormatter.Process. This is what I'd do:

  • Similarly to how we skip stream labels that will be replaced, skip structured metadata fields that will be replaced.
  • The BaseLabelsBuilder.add field is an array of 3 items (one per category). Each item is a list of labels to add for the category. I would replace this by two new fields:
    • addStructuredMetadata and addParsed
    • Then in LabelsBuilder.UnsortedLabels, we can filter out stream labels looking at addStructuredMetadata and addParsed, and then filter out structured metadata looking at the addParsed
    • Note that the items for the StreamLabels category is always empty as the stems labels are in the base field.

I've added a code snipped below exemplifying the idea above.

On top of this, I'd add tests to the labels test file

How confortable would you feel making these changes @ravishankar15? We are happy to help!


func labelsContain(labels labels.Labels, name string) bool {
	for _, l := range labels {
		if l.Name == name {
			return true
		}
	}
	return false
}

func (b *LabelsBuilder) UnsortedLabels(buf labels.Labels, categories ...LabelCategory) labels.Labels {
    ...
    if categoriesContain(categories, StreamLabel) {
	for _, l := range b.base {
		// Skip stream labels to be deleted
                 if labelsContain(b.del, l.Name) {
                     continue
                 }
			
		// Skip stream labels which value will be replaced by structured metadata
                if labelsContain(b.addStructuredMetadata, l.Name) {
                     continue
                }

                 // Skip stream labels which value will be replaced by parsed labels
                 if labelsContain(b.addParsed, l.Name) {
                     continue
                 }

		buf = append(buf, l)
	}
    }

    if categoriesContain(categories, StructuredMetadataLabel) {
	for _, l := range b.addStructuredMetadata {			
		// Skip stream labels which value will be replaced by parsed labels
                 if labelsContain(b.addParsed, l.Name) {
                     continue
                 }

		buf = append(buf, l)
	}
    }

    if categoriesContain(categories, ParsedLabel) {
	buf = append(buf, b.addParsed...)
    }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a look at the PR will check these out over the weekend

Copy link
Contributor Author

@ravishankar15 ravishankar15 Aug 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @chaudum mentioned I think there are two parts of the problem.

  1. The suggestion from Salvacorts will solve the problem where we will not have duplicate values in the response
  2. For label filter after a label format expression needs to match the newly assigned value. as mentioned by Chandum we need to make changes to the getWithCategory method on that note we are setting the flag referencedStructuredMetadata to true if the label is from StructuredMetadata in our issue the label is from StructuredMetadata but its renamed by LabelFormat in ParsedLabel so if we make the change the flag will be false I hope that should be okay and expected behaviour.
  3. Also I believe we should handle the case in Add and Set method of LabelsBuilder ?

I believe we need to solve both the issues for this bug to be resolved truly. Can you clarify my (2) and (3) and I can change my PR accordingly.

Also, In @salvacorts suggestion was to move away from having single field BaseLabelsBuilder.add implementation and have separate fields addStructuredMetadata and addParsed. Do you feel we need to combine both the changes in a single PR. I feel it will be little cumbersome If its okay shall we split it into two PR's like

  1. Fixes the two issue mentioned in chandum comment -> With existing BaseLabelsBuilder.add implementation
  2. Modify the BaseLabelsBuilder.add implementation to use separate fields addStructuredMetadata and addParsed

I also think it will be better to handle the cases only in Set and UnsortedLabels since that should remove the duplicates from getting added to builder. The Get and Add methods should be taken care automatically

@ravishankar15
Copy link
Contributor Author

ravishankar15 commented Sep 5, 2024

I have made the changes to the builder as per our discussion without reformatting. I am hoping this should fix the issue.

Also @salvacorts I see the StreamLabels is not used logically outside of builder but I see some tests using that category to add labels and test some scenarios are all those outdated ? Let me know I can raise a separate PR as a follow up to remove all those stale logic and clean up the builder.

@ravishankar15
Copy link
Contributor Author

ravishankar15 commented Sep 16, 2024

Hi Team, Is something else required in this PR ? The order of label processing will be ParsedLabel -> Structured Meta -> Stream (Anyway stream can be removed since its not used logically)

Copy link
Contributor

@chaudum chaudum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
@salvacorts Can I have a second pair of eyes on this PR?

Copy link
Contributor

@salvacorts salvacorts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great 👏 I left only a comment to add some comments to the Set method, and a nit for the test (feel free to disregard it).

I'm approving to unblock. Thank you tons for going the extra mile here :)


// Take value from stream label if present
if labelsContain(b.add[StreamLabel], l.Name) {
buf = append(buf, labels.Label{Name: l.Name, Value: b.add[StreamLabel].Get(l.Name)})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you pointed out, the StreamLabel labels are not effectively used so the code will never get here, but good that you added the check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@salvacorts Once this PR is merged I am planning to raise a separate PR, Removing the stream label and separating the add object as discussed here #13955 (comment)

if category == StructuredMetadataLabel {
b.deleteWithCategory(StreamLabel, n)
if labelsContain(b.add[ParsedLabel], n) {
return b
Copy link
Contributor

@salvacorts salvacorts Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the Set operation is kinda misleading as it may not set the actual value. RankedSet would be abetter name but renaming all the occurrences will be a pain. For now I think we are good as long as we leave a comment explaining how it works.

I'd add this comment to the top of the method\

// Set the name/value pair as a label. 
// The value `v` may not be set if a category with higher preference already contains `n`. 
// Category preference goes as Parsed > Structured Metadata > Stream.

Then for each category check I'd add these comments:

For category == ParsedLabel

Parsed takes precedence over Structured Metadata and Stream labekls. If category is Parsed, we delete `n` from the structured metadata and stream labels.

For category == StructuredMetadataLabel

Structured Metadata takes precedence over Stream labels. If category is `StructuredMetadataLabel`,we delete `n` from the stream labels. If `n` exists in the parsed labels, we won't overwrite it's value and we just return what we have.

For category == StreamLabel

Finally, if category is `StreamLabel` and `n` already exists in either the structured metadata or parsed labels, the `Set` operation is a noop and we return the unmodified labels builder.

actual := b.UnsortedLabels(nil)
sort.Sort(expected)
sort.Sort(actual)
assert.Equal(t, expected, actual)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you can use require.ElementsMatch() and skip the sorts

@salvacorts salvacorts merged commit 2d4792a into grafana:main Sep 27, 2024
62 checks passed
@salvacorts
Copy link
Contributor

PR merged, thank you @ravishankar15 👏

jeschkies pushed a commit to jeschkies/loki that referenced this pull request Oct 1, 2024
Co-authored-by: Christian Haudum <christian.haudum@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Label filters ignore parsed label if it exists in structured metadata
3 participants