Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exec: 'index out of range' errors for some aggregations #38750

Closed
solongordon opened this issue Jul 9, 2019 · 3 comments · Fixed by #38900
Closed

exec: 'index out of range' errors for some aggregations #38750

solongordon opened this issue Jul 9, 2019 · 3 comments · Fixed by #38900
Assignees
Labels
A-sql-vec SQL vectorized engine C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Comments

@solongordon
Copy link
Contributor

Some aggregations cause an "index out of range" error in the vectorized implementation. You can see examples of this by running the aggregate logic test with the local-vec configuration. Some of the failing tests:

testdata/logic_test/aggregate:192: SELECT 3 FROM kv HAVING TRUE
testdata/logic_test/aggregate:254: SELECT count(*), kv.s FROM kv GROUP BY s
testdata/logic_test/aggregate:262: SELECT count(*), s FROM kv GROUP BY kv.s
testdata/logic_test/aggregate:270: SELECT count(*), kv.s FROM kv GROUP BY kv.s
testdata/logic_test/aggregate:278: SELECT count(*), s FROM kv GROUP BY s
testdata/logic_test/aggregate:287: SELECT v, count(*), w FROM kv GROUP BY v, w
testdata/logic_test/aggregate:297: SELECT v, count(*), w FROM kv GROUP BY 1, 3
testdata/logic_test/aggregate:339: SELECT count(*), k+v FROM kv GROUP BY k+v
@solongordon solongordon added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-sql-vec SQL vectorized engine labels Jul 9, 2019
@rafiss
Copy link
Collaborator

rafiss commented Jul 12, 2019

Tracking some findings so far:

The table in the example looks like this

  k |  v   | w |  s
+---+------+---+------+
  1 |    2 | 3 | a
  3 |    4 | 5 | a
  5 | NULL | 5 | NULL
  6 |    2 | 3 | b
  7 |    2 | 2 | b
  8 |    4 | 2 | A

SELECT 3 FROM kv HAVING TRUE should be addressed by #38809

I believe many of the other failures have to do with the presence of NULLs in the group columns. The offending line is accessing op.ht.head[headID] in hash_aggregator.go

		for i := uint64(0); i < op.ht.size; i++ {
			op.distinct[i] = false
			for !op.ht.head[headID] || curID == 0 {
				op.distinct[i] = true
				headID++
				curID = headID
			}
			op.sel[i] = curID - 1
			curID = op.ht.same[curID]
		}

One interesting thing is that SELECT v, count(*), w FROM kv GROUP BY v, w; has the out-of-bounds issue, but SELECT v, count(*), k FROM kv GROUP BY v, k; does not.

@rafiss
Copy link
Collaborator

rafiss commented Jul 12, 2019

Noting this down here so I don't forget over the weekend --

Removing the line ht.groupID[ht.toCheck[i]] = 0 in hashjoiner_tmpl#_CHECK_COL_BODY fixes some of the test cases. It seems like this line is setting the keyID of any null key to be 0, which is causing the issue in the index-out-of-bounds in hash_aggregator. (I haven't yet determined why this line is there so I don't know the proper fix yet; just noting down the immediate cause of the error.)

@rafiss
Copy link
Collaborator

rafiss commented Jul 16, 2019

Here's the explanation: the hashTable data structure is shared between the hash joiner and the hash aggregator. It was originally implemented for the hash joiner, and in that case, NULLs are always supposed to be treated as non-equal. However, in the hash aggregator, we want NULLs to compare as equal to each other. I'll put up a PR with an attempted fix.

craig bot pushed a commit that referenced this issue Jul 17, 2019
38900: exec: handle NULL group by keys in hash aggregator r=rafiss a=rafiss

Previously, if an aggregation was requested with NULL group keys, an
index out of bounds error would occur in the hash aggregator. The reason
was that the hashTable data structure is shared between the hash joiner
and the hash aggregator. It was originally implemented for the hash joiner,
and in that case, NULLs are always supposed to be treated as non-equal.
However, in the hash aggregator, we want NULLs to compare as equal to each
other.

This change allows the tests in logic_test/aggregate to pass with the
local-vec configuration.

In order to exercise more of the null handling logic, the tests now set
garbage data in the corresponing position of the vector whenever an
element is null.

fixes #38750 

Release note: None

Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
@craig craig bot closed this as completed in #38900 Jul 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-vec SQL vectorized engine C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants