This repository was archived by the owner on Aug 23, 2023. It is now read-only.

index pruning panic #1272

Closed
Dieterbe opened this issue Apr 8, 2019 · 10 comments

@Dieterbe
Contributor

Dieterbe commented Apr 8, 2019

metrictank:v0.11.0-219-g88830e3

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xc1f1ff]

goroutine 70808 [running]:
github.com/grafana/metrictank/idx/memory.(*UnpartitionedMemoryIdx).delete(0xc000420b80, 0x17a59, 0xc2c2b28fc0, 0x1, 0x2, 0x1, 0x1)
        /go/src/github.com/grafana/metrictank/idx/memory/memory.go:1268 +0x23f
github.com/grafana/metrictank/idx/memory.(*UnpartitionedMemoryIdx).Prune(0xc000420b80, 0xbf229911ab4385c4, 0xab4cfb02a6b, 0x1894c40, 0xc0003fb6f8, 0x44b01f, 0x0, 0xc00205a1e0, 0xc00205a1e0)
        /go/src/github.com/grafana/metrictank/idx/memory/memory.go:1456 +0x1102
github.com/grafana/metrictank/idx/cassandra.(*CasIdx).Prune(0xc000420bc0, 0xbf229911ab4385c4, 0xab4cfb02a6b, 0x1894c40, 0x18b4f80, 0xc1922ea000, 0xc000246054, 0xc1922ea050, 0x4348a9)
        /go/src/github.com/grafana/metrictank/idx/cassandra/cassandra.go:631 +0xb5
github.com/grafana/metrictank/idx/cassandra.(*CasIdx).prune(0xc000420bc0)
        /go/src/github.com/grafana/metrictank/idx/cassandra/cassandra.go:645 +0x94
created by github.com/grafana/metrictank/idx/cassandra.(*CasIdx).Init
        /go/src/github.com/grafana/metrictank/idx/cassandra/cassandra.go:231 +0x248
@woodsaj
Member

woodsaj commented Apr 8, 2019

deletedDefs = append(deletedDefs, *m.defById[id])

Looks like the constraint documented in this comment is not being obeyed:

// deleteTaggedByIdSet deletes a map of ids from the tag index and also the DefByIds
// it is important that only IDs of series with tags get passed in here, because
// otherwise the result might be inconsistencies between DefByIDs and the tree index.
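For illustration, here is a minimal sketch of why a line like the one above can panic and one way to guard against it. Indexing a map of pointers with a missing key yields nil, and dereferencing that nil gives exactly this kind of SIGSEGV. The `def` type, `collectDeleted`, and the string IDs are hypothetical stand-ins, not metrictank's actual types or the workaround that was merged.

```go
package main

import "fmt"

// def is a hypothetical stand-in for a metric definition.
type def struct{ Name string }

// collectDeleted copies the definitions for ids out of defByID.
// IDs that are absent (or map to nil) are reported in missing
// instead of being dereferenced, which would panic.
func collectDeleted(defByID map[string]*def, ids []string) (deleted []def, missing []string) {
	for _, id := range ids {
		d, ok := defByID[id]
		if !ok || d == nil {
			missing = append(missing, id)
			continue
		}
		deleted = append(deleted, *d)
	}
	return deleted, missing
}

func main() {
	defByID := map[string]*def{"a": {Name: "some.series.name"}}
	// "b" was already removed from the map, e.g. by an earlier delete.
	deleted, missing := collectDeleted(defByID, []string{"a", "b"})
	fmt.Println("deleted:", len(deleted), "missing:", missing)
}
```

A guard like this hides the inconsistency rather than explaining it, so logging the missing IDs would help with tracking down the root cause.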

@Dieterbe
Contributor Author

Dieterbe commented Apr 8, 2019

@woodsaj can you elaborate? is deleteTaggedByIdSet() being called for series that don't have tags? where/how?

@woodsaj
Member

woodsaj commented Apr 8, 2019

This is originating in Prune()

func (m *UnpartitionedMemoryIdx) Prune(now time.Time) ([]idx.Archive, error) {

Somehow the same ID ended up in both toPruneTagged and toPruneUntagged.
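A quick way to test this hypothesis would be to check the two prune sets for common IDs before deleting anything. The sketch below assumes both sets can be flattened to plain sets of string IDs; the real structures in Prune() are shaped differently, and `overlap` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// overlap returns the IDs present in both prune sets, sorted for stable output.
func overlap(tagged, untagged map[string]struct{}) []string {
	var both []string
	for id := range tagged {
		if _, ok := untagged[id]; ok {
			both = append(both, id)
		}
	}
	sort.Strings(both)
	return both
}

func main() {
	tagged := map[string]struct{}{"id1": {}, "id2": {}}
	untagged := map[string]struct{}{"id2": {}, "id3": {}}
	// Any ID reported here would be deleted twice, once per index.
	fmt.Println("in both sets:", overlap(tagged, untagged))
}
```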

@woodsaj
Member

woodsaj commented Apr 8, 2019

Looking at the code, the only way for that to happen is if a series is sent with the tags in the "Name" field and then sent separately with the tags in the "Tags" field, and, as in this example, both expired at the same time.

eg.
series1:

Name: "some.series.name;k1=v1;k2=v2"
Tags: []

series2:

Name: "some.series.name"
Tags: ["k1=v1", "k2=v2"]

Both of these series will end up with the same ID, but one will be added to the Tree index and the other to the Tags index.

This just looks like another example of us running into problems because we are not strictly validating metric names on ingestion.
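As a rough sketch of what stricter validation could look like, the check below rejects any name that carries tags inline, so that tags can only arrive via the Tags field. `validateName` and `errTagInName` are hypothetical names, not existing metrictank functions, and a real check would likely cover more than this one character.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errTagInName is a hypothetical error for names with inline tags.
var errTagInName = errors.New("metric name must not contain ';'")

// validateName rejects names that embed tags after a ';' separator.
func validateName(name string) error {
	if strings.ContainsRune(name, ';') {
		return errTagInName
	}
	return nil
}

func main() {
	for _, name := range []string{"some.series.name", "some.series.name;k1=v1;k2=v2"} {
		fmt.Printf("%q -> %v\n", name, validateName(name))
	}
}
```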

@woodsaj
Member

woodsaj commented Apr 8, 2019

We should be able to verify whether this is in fact the cause by dumping the index and looking for series whose names contain ';'.

@Dieterbe
Contributor Author

Dieterbe commented Apr 8, 2019

This doesn't seem to be the case.

$ ./hm-dump-index.sh instancename 48h
$ zcat idx.list.gz | sort | uniq -c > sort-uniq-c
$ grep -v '^      1 ' sort-uniq-c
$ # no output, so we have no dupes
$ zcat idx.list.gz | sed 's#;.*#;tags#' | sort | uniq > reduce-all-tag-combos-to-one.txt
$ sed 's#;tags.*##' reduce-all-tag-combos-to-one.txt  | sort | uniq -c | grep -v '^      1 '
$ # no output, so there's no metricnameWithTag that clashes with a name without tags

The other thing worth noting is that the instance was upgraded last week from v0.11.0 to v0.11.0-219-g88830e3 and we only started seeing this now, so I suspect it has something to do with recent code, e.g. the partitioned memory index.

@woodsaj
Member

woodsaj commented Apr 8, 2019

@Dieterbe the latest version of hm-dump-index.sh already removes duplicates, and it uses nameWithTags, not 'name'.

So it is not going to give you any insights that can be used for this issue.

@woodsaj
Member

woodsaj commented Apr 8, 2019

I dumped the series names from Cassandra to CSV. There are no names with ";" in them, confirming your findings @Dieterbe.
So the issue must be something else, but I have no good ideas for what it could be.

Dieterbe added a commit that referenced this issue Apr 8, 2019
Dieterbe added a commit that referenced this issue Apr 8, 2019
@Dieterbe
Contributor Author

Dieterbe commented Apr 9, 2019

I'm gonna close this. We have merged a workaround. It's not a true fix, but it'll do for now.

@Dieterbe Dieterbe closed this as completed Apr 9, 2019
@fkaleo
Contributor

fkaleo commented Nov 25, 2019

Was the final fix #1303?
