Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: 100x performance degradation in pgjdbc query to fetch columns #55140

Closed
rafiss opened this issue Oct 1, 2020 · 38 comments · Fixed by #55212, #55833 or #56345
Closed

sql: 100x performance degradation in pgjdbc query to fetch columns #55140

rafiss opened this issue Oct 1, 2020 · 38 comments · Fixed by #55212, #55833 or #56345
Labels
A-sql-memmon SQL memory monitoring A-sql-vtables Virtual tables - pg_catalog, information_schema etc C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Comments

@rafiss
Copy link
Collaborator

rafiss commented Oct 1, 2020

Describe the problem

I was running the liquibase integration tests and encountered a panic:

* ERROR: [n1,client=[::1]:65462,hostnossl,user=root] a panic has occurred!
* txn: no bytes in account to release, current 47738, free 47764
* (1) attached stack trace
*   -- stack trace:
*   | github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn.func1
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:508
*   | runtime.gopanic
*   | 	/usr/local/opt/go@1.13/libexec/src/runtime/panic.go:679
*   | github.com/cockroachdb/cockroach/pkg/util/log.ReportOrPanic
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/util/log/crash_reporting.go:340
*   | github.com/cockroachdb/cockroach/pkg/util/mon.(*BoundAccount).Shrink
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/util/mon/bytes_usage.go:583
*   | github.com/cockroachdb/cockroach/pkg/sql/rowcontainer.(*RowContainer).PopFirst
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/rowcontainer/datum_row_container.go:298
*   | github.com/cockroachdb/cockroach/pkg/sql.(*vTableLookupJoinNode).Next
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/virtual_table.go:275
*   | github.com/cockroachdb/cockroach/pkg/sql.(*planNodeToRowSource).Next
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/plan_node_to_row_source.go:178
*   | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*hashJoiner).receiveNext
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/hashjoiner.go:656
*   | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*hashJoiner).build
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/hashjoiner.go:303
*   | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*hashJoiner).Next
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/hashjoiner.go:237
*   | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*windower).accumulateRows
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/windower.go:282
*   | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*windower).Next
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/windower.go:234
*   | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*sortAllProcessor).fill
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/sorter.go:291
*   | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*sortAllProcessor).Start
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/sorter.go:272
*   | github.com/cockroachdb/cockroach/pkg/sql/execinfra.(*ProcessorBase).Run
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/execinfra/processorsbase.go:774
*   | github.com/cockroachdb/cockroach/pkg/sql/flowinfra.(*FlowBase).Run
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/flowinfra/flow.go:392
*   | github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).Run
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:422
*   | github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).PlanAndRun
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:991
*   | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execWithDistSQLEngine
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1002
*   | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).dispatchToExecutionEngine
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:873
*   | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execStmtInOpenState
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:639
*   | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execStmt
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:114
*   | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execCmd
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1472
*   | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).run
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1401
*   | github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:510
*   | github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*conn).processCommandsAsync.func1
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:627
* Wraps: (2) while executing: SELECT * FROM (SELECT _._, _._, _._, _._, _._ OR ((_._ = _) AND _._) AS _, _._, _._, _._, row_number() OVER (PARTITION BY _._ ORDER BY _._) AS _, _ AS _, pg_get_expr(_._, _._) AS _, _._, _._, _._ FROM _._ AS _ JOIN _._ AS _ ON (_._ = _._) JOIN _._ AS _ ON (_._ = _._) JOIN _._ AS _ ON (_._ = _._) LEFT JOIN _._ AS _ ON ((_._ = _._) AND (_._ = _._)) LEFT JOIN _._ AS _ ON ((_._ = _._) AND (_._ = _._)) LEFT JOIN _._ AS _ ON ((_._ = _._) AND (_._ = _)) LEFT JOIN _._ AS _ ON ((_._ = _._) AND (_._ = _)) WHERE ((((_._ IN (_, _, __more3__)) AND (_._ > _)) AND (NOT _._)) AND (_._ LIKE _)) AND (_._ LIKE _)) AS _ WHERE _ AND (_ LIKE _) ORDER BY _, _._, _
* Wraps: (3) attached stack trace
*   -- stack trace:
*   | github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn.func1
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:508
*   | runtime.gopanic
*   | 	/usr/local/opt/go@1.13/libexec/src/runtime/panic.go:679
*   | [...repeated from below...]
* Wraps: (4) attached stack trace
*   -- stack trace:
*   | github.com/cockroachdb/cockroach/pkg/util/log.ReportOrPanic
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/util/log/crash_reporting.go:338
*   | github.com/cockroachdb/cockroach/pkg/util/mon.(*BoundAccount).Shrink
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/util/mon/bytes_usage.go:583
*   | github.com/cockroachdb/cockroach/pkg/sql/rowcontainer.(*RowContainer).PopFirst
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/rowcontainer/datum_row_container.go:298
*   | github.com/cockroachdb/cockroach/pkg/sql.(*vTableLookupJoinNode).Next
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/virtual_table.go:275
*   | github.com/cockroachdb/cockroach/pkg/sql.(*planNodeToRowSource).Next
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/plan_node_to_row_source.go:178
*   | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*hashJoiner).receiveNext
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/hashjoiner.go:656
*   | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*hashJoiner).build
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/hashjoiner.go:303
*   | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*hashJoiner).Next
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/hashjoiner.go:237
*   | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*windower).accumulateRows
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/windower.go:282
*   | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*windower).Next
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/windower.go:234
*   | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*sortAllProcessor).fill
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/sorter.go:291
*   | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*sortAllProcessor).Start
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/sorter.go:272
*   | github.com/cockroachdb/cockroach/pkg/sql/execinfra.(*ProcessorBase).Run
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/execinfra/processorsbase.go:774
*   | github.com/cockroachdb/cockroach/pkg/sql/flowinfra.(*FlowBase).Run
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/flowinfra/flow.go:392
*   | github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).Run
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:422
*   | github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).PlanAndRun
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:991
*   | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execWithDistSQLEngine
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1002
*   | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).dispatchToExecutionEngine
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:873
*   | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execStmtInOpenState
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:639
*   | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execStmt
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:114
*   | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execCmd
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1472
*   | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).run
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1401
*   | github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:510
*   | github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*conn).processCommandsAsync.func1
*   | 	/Users/rafiss/go/src/github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:627
*   | runtime.goexit
*   | 	/usr/local/opt/go@1.13/libexec/src/runtime/asm_amd64.s:1357
* Wraps: (5) txn: no bytes in account to release, current 47738, free 47764
* Error types: (1) *withstack.withStack (2) *safedetails.withSafeDetails (3) *withstack.withStack (4) *withstack.withStack (5) *errutil.leafError

To Reproduce

Ran the Liquibase integration tests on a fresh cluster.

This is the query that caused the panic:

SELECT
  *
FROM
  (
    SELECT
      n.nspname,
      c.relname,
      a.attname,
      a.atttypid,
      a.attnotnull
      OR ((t.typtype = 'd') AND t.typnotnull)
        AS attnotnull,
      a.atttypmod,
      a.attlen,
      t.typtypmod,
      row_number() OVER (
        PARTITION BY a.attrelid ORDER BY a.attnum
      )
        AS attnum,
      NULL AS attidentity,
      pg_get_expr(def.adbin, def.adrelid) AS adsrc,
      dsc.description,
      t.typbasetype,
      t.typtype
    FROM
      pg_catalog.pg_namespace AS n
      JOIN pg_catalog.pg_class AS c ON (c.relnamespace = n.oid)
      JOIN pg_catalog.pg_attribute AS a ON (a.attrelid = c.oid)
      JOIN pg_catalog.pg_type AS t ON (a.atttypid = t.oid)
      LEFT JOIN pg_catalog.pg_attrdef AS def ON
          (
            (a.attrelid = def.adrelid)
            AND (a.attnum = def.adnum)
          )
      LEFT JOIN pg_catalog.pg_description AS dsc ON
          ((c.oid = dsc.objoid) AND (a.attnum = dsc.objsubid))
      LEFT JOIN pg_catalog.pg_class AS dc ON
          (
            (dc.oid = dsc.classoid)
            AND (dc.relname = 'pg_class')
          )
      LEFT JOIN pg_catalog.pg_namespace AS dn ON
          (
            (dc.relnamespace = dn.oid)
            AND (dn.nspname = 'pg_catalog')
          )
    WHERE
      (
        (
          (
            (c.relkind IN ('r', 'p', 'v', 'f', 'm'))
            AND (a.attnum > 0)
          )
          AND (NOT a.attisdropped)
        )
        AND (n.nspname LIKE 'public')
      )
      AND (c.relname LIKE '%')
  )
    AS c
WHERE
  true AND (attname LIKE '%')
ORDER BY
  nspname, c.relname, attnum;

But running just the query on a fresh cluster is fine. I haven't figured out what preceding steps would make this panic.

Environment:
Using CockroachDB master (sha 33c1bb6)

@rafiss rafiss added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-sql-memmon SQL memory monitoring labels Oct 1, 2020
@yuzefovich
Copy link
Member

I think it's the same as #53416.

@yuzefovich
Copy link
Member

Yeah, the query matches the signature of #53416 (comment), thanks for the report.

@rafiss
Copy link
Collaborator Author

rafiss commented Oct 1, 2020

No problem. I still have the cluster so I'll attach the debug zip here at least until I can figure out all the repro steps.

Nice catch recognizing that query string signature!

@rafiss
Copy link
Collaborator Author

rafiss commented Oct 1, 2020

debug55140.zip

@rafiss rafiss changed the title sql: panic in memory accounting caused by liquibase query to fetch columns sql: panic in memory accounting caused by pgjdbc query to fetch columns Oct 2, 2020
@rafiss
Copy link
Collaborator Author

rafiss commented Oct 2, 2020

Actually it looks like the query is from PGJDBC's getColumns, meaning that any Java ORM/app could hit this. I'm still working on figuring out what initial conditions cause this query to fail.

@rafiss
Copy link
Collaborator Author

rafiss commented Oct 2, 2020

OK I found a reliable repro. Basically, just creating a lot of columns will cause the issue. master and 20.2 seem affected. v20.2.0-beta.3 does not have the panic (since log.ReportOrPanic is used here, and will not trigger a panic for official releases), but it does have incorrect results. In the repro below, there are supposed to be 1375 rows returned, but only 1100 rows are returned.

Steps:

  • Run the following Python program:
f = open('creates.sql', 'w')
for i in range(275):
    f.write('create table t{} (a int primary key, b string, c date, e interval);\n'.format(i))
  • Start a fresh cluster
  • Run ./cockroach sql --insecure -f creates.sql
  • Open a Cockroach SQL shell and run the query. Here it is in one line.
SELECT * FROM (SELECT n.nspname, c.relname, a.attname, a.atttypid, a.attnotnull OR ((t.typtype = 'd') AND t.typnotnull) AS attnotnull, a.atttypmod, a.attlen, t.typtypmod, row_number() OVER (PARTITION BY a.attrelid ORDER BY a.attnum) AS attnum, NULL AS attidentity, pg_get_expr(def.adbin, def.adrelid) AS adsrc, dsc.description, t.typbasetype, t.typtype FROM pg_catalog.pg_namespace AS n JOIN pg_catalog.pg_class AS c ON (c.relnamespace = n.oid) JOIN pg_catalog.pg_attribute AS a ON (a.attrelid = c.oid) JOIN pg_catalog.pg_type AS t ON (a.atttypid = t.oid) LEFT JOIN pg_catalog.pg_attrdef AS def ON ((a.attrelid = def.adrelid) AND (a.attnum = def.adnum)) LEFT JOIN pg_catalog.pg_description AS dsc ON ((c.oid = dsc.objoid) AND (a.attnum = dsc.objsubid)) LEFT JOIN pg_catalog.pg_class AS dc ON ((dc.oid = dsc.classoid) AND (dc.relname = 'pg_class')) LEFT JOIN pg_catalog.pg_namespace AS dn ON ((dc.relnamespace = dn.oid) AND (dn.nspname = 'pg_catalog')) WHERE ((((c.relkind IN ('r', 'p', 'v', 'f', 'm')) AND (a.attnum > 0)) AND (NOT a.attisdropped)) AND (n.nspname LIKE 'public')) AND (c.relname LIKE '%')) AS c WHERE true AND (attname LIKE '%') ORDER BY nspname, c.relname, attnum;

1100 rows returned but 1375 are expected.

Here is the explain output of the problematic query.

---------------------------------------------------------------------------------+--------------+----------------------------------------------------------------
                                                                                 | distribution | local
                                                                                 | vectorized   | false
  sort                                                                           |              |
   │                                                                             | order        | +relname,+row_number
   └── render                                                                    |              |
        └── filter                                                               |              |
             │                                                                   | filter       | attname LIKE '%'
             └── window                                                          |              |
                  └── hash join (left outer)                                     |              |
                       │                                                         | equality     | (relnamespace) = (oid)
                       ├── virtual table lookup join                             |              |
                       │    │                                                    | table        | pg_type@pg_type_oid_idx
                       │    │                                                    | equality     | (atttypid) = (oid)
                       │    └── virtual table lookup join (left outer)           |              |
                       │         │                                               | table        | pg_class@pg_class_oid_idx
                       │         │                                               | equality     | (classoid) = (oid)
                       │         │                                               | pred         | relname = 'pg_class'
                       │         └── hash join (right outer)                     |              |
                       │              │                                          | equality     | (objoid, objsubid) = (oid, attnum)
                       │              ├── filter                                 |              |
                       │              │    │                                     | filter       | objsubid > 0
                       │              │    └── virtual table                     |              |
                       │              │                                          | table        | pg_description@primary
                       │              └── virtual table lookup join (left outer) |              |
                       │                   │                                     | table        | pg_attrdef@pg_attrdef_adrelid_idx
                       │                   │                                     | equality     | (attrelid) = (adrelid)
                       │                   │                                     | pred         | (attnum = adnum) AND (adnum > 0)
                       │                   └── virtual table lookup join         |              |
                       │                        │                                | table        | pg_attribute@pg_attribute_attrelid_idx
                       │                        │                                | equality     | (oid) = (attrelid)
                       │                        │                                | pred         | (attnum > 0) AND (NOT attisdropped)
                       │                        └── hash join                    |              |
                       │                             │                           | equality     | (relnamespace) = (oid)
                       │                             ├── filter                  |              |
                       │                             │    │                      | filter       | (relkind IN ('f', 'm', 'p', 'r', 'v')) AND (relname LIKE '%')
                       │                             │    └── virtual table      |              |
                       │                             │                           | table        | pg_class@primary
                       │                             └── filter                  |              |
                       │                                  │                      | filter       | nspname LIKE 'public'
                       │                                  └── virtual table      |              |
                       │                                                         | table        | pg_namespace@primary
                       └── filter                                                |              |
                            │                                                    | filter       | nspname = 'pg_catalog'
                            └── virtual table                                    |              |
                                                                                 | table        | pg_namespace@primary
(45 rows)

@jordanlewis
Copy link
Member

Nice repro, works for me

@rafiss
Copy link
Collaborator Author

rafiss commented Oct 2, 2020

It seems like v20.1.3 also has incorrect results. It also returns 1100 rows instead of 1375 rows.

Interestingly, even though v20.1 and v20.2 both have the wrong results, v20.1 completes the getColumns query in 69ms while v20.2 takes ~7 seconds.

@jordanlewis
Copy link
Member

20.1 didn't have the vtableLookupJoinNode stuff I don't think, but I'm also noticing that the plan in 20.2 doesn't even use that. The purpose of that is to avoid fully materializing all of the descriptors in memory.

But, the query you pasted also doesn't have a constraint on the name of the table - is that what you're testing in 20.2 and 20.1 both?

@jordanlewis
Copy link
Member

#48226 is the PR that added vtable lookup join

@rafiss
Copy link
Collaborator Author

rafiss commented Oct 2, 2020

That's right -- none of the queries should have a constraint on the table name.

Hmmm Is my math off? Postgres is also returning 1100 rows. whoops.

@rafiss
Copy link
Collaborator Author

rafiss commented Oct 2, 2020

OK, so I guess there's not a correctness issue in v20.1 or v20.2.

But v20.2 does have the memory accounting issue in the logs:

W201002 19:08:26.295868 1885 util/log/crash_reporting.go:342 ⋮ [n1,client=‹[::1]:62043›,hostnossl,user=root] ‹txn›: no bytes in account to release, current 25392, free 25393
E201002 19:08:26.302761 1885 util/log/crash_reporting.go:321 ⋮ [n1,client=‹[::1]:62043›,hostnossl,user=root] Queued as error ‹0286cbcfac3644ebb05d261e36db399a›

@jordanlewis
Copy link
Member

Makes sense - release binaries don't crash on things like this, merely reporting to sentry. Development binaries crash instead, to make it easier to notice.

@jordanlewis
Copy link
Member

This is looking like a weirder than normal memory account issue. We are trying to shrink by 25418 bytes, but only have 25416 bytes available. The small difference makes nervous that there was some kind of mutation to the data being accounted for, rather than a double close or whatever, in which situation we'd see a much larger delta.

@rafiss
Copy link
Collaborator Author

rafiss commented Oct 2, 2020

Yeah, I noted the thing about not crashing on official releases above.

And just to close the loop, my math was wrong because I thought I was making 5 columns per table, but I was actually making 4 columns. Confusingly, I named them a, b, c, and e.

So the goals of this issue should be:

  • figure out why the query takes 100 times as long starting in v20.2
  • figure out why the memory accounting is off starting in v20.2

I'm guessing those two could be related?

@jordanlewis
Copy link
Member

I'm starting to believe my theory about mutated rows. Here is the log of the sizes of all row additions and subtractions from the lookup join node. I used the following modified query that uses only the vtable lookup join that crashes:

SELECT
	*
FROM
	(
		SELECT
			n.nspname,
			c.relname,
			a.attname,
			a.atttypid,
			a.attnotnull
			OR ((t.typtype = 'd') AND t.typnotnull)
				AS attnotnull,
			a.atttypmod,
			a.attlen,
			t.typtypmod,
			row_number() OVER (
				PARTITION BY a.attrelid ORDER BY a.attnum
			)
				AS attnum,
			NULL AS attidentity,
			pg_get_expr(def.adbin, def.adrelid) AS adsrc,
			dsc.description,
			t.typbasetype,
			t.typtype
		FROM
			pg_catalog.pg_namespace AS n
			JOIN pg_catalog.pg_class AS c ON
					(c.relnamespace = n.oid)
			JOIN pg_catalog.pg_attribute AS a ON
					(a.attrelid = c.oid)
			inner hash JOIN pg_catalog.pg_type AS t ON
					(a.atttypid = t.oid)
			LEFT JOIN pg_catalog.pg_attrdef AS def ON
					(
						(a.attrelid = def.adrelid)
						AND (a.attnum = def.adnum)
					)
			LEFT JOIN pg_catalog.pg_description AS dsc ON
					(
						(c.oid = dsc.objoid)
						AND (a.attnum = dsc.objsubid)
					)
			LEFT JOIN pg_catalog.pg_class AS dc ON
					(
						(dc.oid = dsc.classoid)
						AND (dc.relname = 'pg_class')
					)
			LEFT JOIN pg_catalog.pg_namespace AS dn ON
					(
						(dc.relnamespace = dn.oid)
						AND (dn.nspname = 'public')
					)
		WHERE
			(
				(
					(
						(
							c.relkind
							IN ('r', 'p', 'v', 'f', 'm')
						)
						AND (a.attnum > 0)
					)
					AND (NOT a.attisdropped)
				)
				AND (n.nspname LIKE 'public')
			)
			AND (c.relname LIKE '%')
	)
		AS c
WHERE
	true AND (attname LIKE '%')
ORDER BY
	nspname, c.relname, attnum;
0xc000139a20 +171
0xc000139a20 +171
0xc000139a20 -171
0xc000139a20 -171
0xc000139a20 +184
0xc000139a20 +188
0xc000139a20 -184
0xc000139a20 -188
0xc000139a20 +174
0xc000139a20 -174
0xc000139a20 +171
0xc000139a20 +174
0xc000139a20 +175
0xc000139a20 -171
0xc000139a20 -174
0xc000139a20 -175
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 +172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 -172
0xc000139a20 +173
0xc000139a20 +173
0xc000139a20 +173
0xc000139a20 +173
0xc000139a20 -173
0xc000139a20 -173
0xc000139a20 -173
0xc000139a20 -173
0xc000139a20 +173
0xc000139a20 +173
0xc000139a20 +173
0xc000139a20 +173
0xc000139a20 -173
0xc000139a20 -173
0xc000139a20 -173
0xc000139a20 -173
0xc000139a20 +173
0xc000139a20 +173
0xc000139a20 +173
0xc000139a20 +173
0xc000139a20 -173
0xc000139a20 -173
0xc000139a20 -173
0xc000139a20 -173
0xc000139a20 +173
0xc000139a20 +173
0xc000139a20 +173
0xc000139a20 +173
0xc000139a20 -173
0xc000139a20 -173
0xc000139a20 -173
0xc000139a20 -173

The numbers in this log sum to zero, as expected. But, I also captured the row sizes from the perspective of the row container, which only actually edits the memory account every 64th removed row due to chunking. Here is the log of those row sizes:

- 171
- 188
- 184
- 174
- 175
- 171
- 174
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 172
- 173
- 173
- 173
- 173
- 173
- 173
- 173
- 173
- 173
- 173
- 173
- 173
- 173
- 173
- 173
- 173
- 173

Summing up the subtractions in the second log with the additions in the first, we get -2 bytes, the same as the discrepancy in the error.

This means that, in between some calls to At(0)/PopFirst and the time that PopFirst is called, one of the rows' sizes changes, which should never happen. Or there's another bug in RowContainer, which I kind of doubt because this infrastructure is quite old.

@jordanlewis
Copy link
Member

Here is the log of the rows removed. The first half ("fake" at the end of the line) is the output of At(0) of the row container. The second half is the output of what PopFirst actually deletes. Notice the discrepancy - the first half removes two rows with "b" first, and then removes 4 rows with "t13". The second half removes 1 row with "b" first, and then removes 5 rows with "t13". The first "t13" row in the second half appears to be identical to the last "t13" row in the first half in the last half. I am starting to think there is a bug in RowContainer...

@jordanlewis
Copy link
Member

Forgot to include the log.

[487 'b' 1330834471 'r' 1330834471 'public' 487 'a' 2950 16 1 -1 true false] 171 fake
[487 'b' 1330834471 'r' 1330834471 'public' 487 'b' 2950 16 2 -1 true false] 171 fake
[500 'i suck. badly)' 1330834471 'r' 1330834471 'public' 500 'a' 20 8 1 -1 false false] 184 fake
[500 'i suck. badly)' 1330834471 'r' 1330834471 'public' 500 'rowid' 20 8 2 -1 true false] 188 fake
[516 'derp' 1330834471 'r' 1330834471 'public' 516 'a' 2950 16 1 -1 true false] 174 fake
[517 'a' 1330834471 'r' 1330834471 'public' 517 'a' 20 8 1 -1 false false] 171 fake
[517 'a' 1330834471 'r' 1330834471 'public' 517 'blob' 25 -1 2 -1 false false] 174 fake
[517 'a' 1330834471 'r' 1330834471 'public' 517 'rowid' 20 8 3 -1 true false] 175 fake
[518 't0' 1330834471 'r' 1330834471 'public' 518 'a' 20 8 1 -1 true false] 172 fake
[518 't0' 1330834471 'r' 1330834471 'public' 518 'b' 25 -1 2 -1 false false] 172 fake
[518 't0' 1330834471 'r' 1330834471 'public' 518 'c' 1082 16 3 -1 false false] 172 fake
[518 't0' 1330834471 'r' 1330834471 'public' 518 'e' 1186 24 4 -1 false false] 172 fake
[519 't1' 1330834471 'r' 1330834471 'public' 519 'a' 20 8 1 -1 true false] 172 fake
[519 't1' 1330834471 'r' 1330834471 'public' 519 'b' 25 -1 2 -1 false false] 172 fake
[519 't1' 1330834471 'r' 1330834471 'public' 519 'c' 1082 16 3 -1 false false] 172 fake
[519 't1' 1330834471 'r' 1330834471 'public' 519 'e' 1186 24 4 -1 false false] 172 fake
[520 't2' 1330834471 'r' 1330834471 'public' 520 'a' 20 8 1 -1 true false] 172 fake
[520 't2' 1330834471 'r' 1330834471 'public' 520 'b' 25 -1 2 -1 false false] 172 fake
[520 't2' 1330834471 'r' 1330834471 'public' 520 'c' 1082 16 3 -1 false false] 172 fake
[520 't2' 1330834471 'r' 1330834471 'public' 520 'e' 1186 24 4 -1 false false] 172 fake
[521 't3' 1330834471 'r' 1330834471 'public' 521 'a' 20 8 1 -1 true false] 172 fake
[521 't3' 1330834471 'r' 1330834471 'public' 521 'b' 25 -1 2 -1 false false] 172 fake
[521 't3' 1330834471 'r' 1330834471 'public' 521 'c' 1082 16 3 -1 false false] 172 fake
[521 't3' 1330834471 'r' 1330834471 'public' 521 'e' 1186 24 4 -1 false false] 172 fake
[522 't4' 1330834471 'r' 1330834471 'public' 522 'a' 20 8 1 -1 true false] 172 fake
[522 't4' 1330834471 'r' 1330834471 'public' 522 'b' 25 -1 2 -1 false false] 172 fake
[522 't4' 1330834471 'r' 1330834471 'public' 522 'c' 1082 16 3 -1 false false] 172 fake
[522 't4' 1330834471 'r' 1330834471 'public' 522 'e' 1186 24 4 -1 false false] 172 fake
[523 't5' 1330834471 'r' 1330834471 'public' 523 'a' 20 8 1 -1 true false] 172 fake
[523 't5' 1330834471 'r' 1330834471 'public' 523 'b' 25 -1 2 -1 false false] 172 fake
[523 't5' 1330834471 'r' 1330834471 'public' 523 'c' 1082 16 3 -1 false false] 172 fake
[523 't5' 1330834471 'r' 1330834471 'public' 523 'e' 1186 24 4 -1 false false] 172 fake
[524 't6' 1330834471 'r' 1330834471 'public' 524 'a' 20 8 1 -1 true false] 172 fake
[524 't6' 1330834471 'r' 1330834471 'public' 524 'b' 25 -1 2 -1 false false] 172 fake
[524 't6' 1330834471 'r' 1330834471 'public' 524 'c' 1082 16 3 -1 false false] 172 fake
[524 't6' 1330834471 'r' 1330834471 'public' 524 'e' 1186 24 4 -1 false false] 172 fake
[525 't7' 1330834471 'r' 1330834471 'public' 525 'a' 20 8 1 -1 true false] 172 fake
[525 't7' 1330834471 'r' 1330834471 'public' 525 'b' 25 -1 2 -1 false false] 172 fake
[525 't7' 1330834471 'r' 1330834471 'public' 525 'c' 1082 16 3 -1 false false] 172 fake
[525 't7' 1330834471 'r' 1330834471 'public' 525 'e' 1186 24 4 -1 false false] 172 fake
[526 't8' 1330834471 'r' 1330834471 'public' 526 'a' 20 8 1 -1 true false] 172 fake
[526 't8' 1330834471 'r' 1330834471 'public' 526 'b' 25 -1 2 -1 false false] 172 fake
[526 't8' 1330834471 'r' 1330834471 'public' 526 'c' 1082 16 3 -1 false false] 172 fake
[526 't8' 1330834471 'r' 1330834471 'public' 526 'e' 1186 24 4 -1 false false] 172 fake
[527 't9' 1330834471 'r' 1330834471 'public' 527 'a' 20 8 1 -1 true false] 172 fake
[527 't9' 1330834471 'r' 1330834471 'public' 527 'b' 25 -1 2 -1 false false] 172 fake
[527 't9' 1330834471 'r' 1330834471 'public' 527 'c' 1082 16 3 -1 false false] 172 fake
[527 't9' 1330834471 'r' 1330834471 'public' 527 'e' 1186 24 4 -1 false false] 172 fake
[528 't10' 1330834471 'r' 1330834471 'public' 528 'a' 20 8 1 -1 true false] 173 fake
[528 't10' 1330834471 'r' 1330834471 'public' 528 'b' 25 -1 2 -1 false false] 173 fake
[528 't10' 1330834471 'r' 1330834471 'public' 528 'c' 1082 16 3 -1 false false] 173 fake
[528 't10' 1330834471 'r' 1330834471 'public' 528 'e' 1186 24 4 -1 false false] 173 fake
[529 't11' 1330834471 'r' 1330834471 'public' 529 'a' 20 8 1 -1 true false] 173 fake
[529 't11' 1330834471 'r' 1330834471 'public' 529 'b' 25 -1 2 -1 false false] 173 fake
[529 't11' 1330834471 'r' 1330834471 'public' 529 'c' 1082 16 3 -1 false false] 173 fake
[529 't11' 1330834471 'r' 1330834471 'public' 529 'e' 1186 24 4 -1 false false] 173 fake
[530 't12' 1330834471 'r' 1330834471 'public' 530 'a' 20 8 1 -1 true false] 173 fake
[530 't12' 1330834471 'r' 1330834471 'public' 530 'b' 25 -1 2 -1 false false] 173 fake
[530 't12' 1330834471 'r' 1330834471 'public' 530 'c' 1082 16 3 -1 false false] 173 fake
[530 't12' 1330834471 'r' 1330834471 'public' 530 'e' 1186 24 4 -1 false false] 173 fake
[531 't13' 1330834471 'r' 1330834471 'public' 531 'a' 20 8 1 -1 true false] 173 fake
[531 't13' 1330834471 'r' 1330834471 'public' 531 'b' 25 -1 2 -1 false false] 173 fake
[531 't13' 1330834471 'r' 1330834471 'public' 531 'c' 1082 16 3 -1 false false] 173 fake
[531 't13' 1330834471 'r' 1330834471 'public' 531 'e' 1186 24 4 -1 false false] 173 fake

[487 'b' 1330834471 'r' 1330834471 'public' 487 'a' 2950 16 1 -1 true false] 171 real
[500 'i suck. badly)' 1330834471 'r' 1330834471 'public' 500 'rowid' 20 8 2 -1 true false] 188 real
[500 'i suck. badly)' 1330834471 'r' 1330834471 'public' 500 'a' 20 8 1 -1 false false] 184 real
[516 'derp' 1330834471 'r' 1330834471 'public' 516 'a' 2950 16 1 -1 true false] 174 real
[517 'a' 1330834471 'r' 1330834471 'public' 517 'rowid' 20 8 3 -1 true false] 175 real
[517 'a' 1330834471 'r' 1330834471 'public' 517 'a' 20 8 1 -1 false false] 171 real
[517 'a' 1330834471 'r' 1330834471 'public' 517 'blob' 25 -1 2 -1 false false] 174 real
[518 't0' 1330834471 'r' 1330834471 'public' 518 'e' 1186 24 4 -1 false false] 172 real
[518 't0' 1330834471 'r' 1330834471 'public' 518 'a' 20 8 1 -1 true false] 172 real
[518 't0' 1330834471 'r' 1330834471 'public' 518 'b' 25 -1 2 -1 false false] 172 real
[518 't0' 1330834471 'r' 1330834471 'public' 518 'c' 1082 16 3 -1 false false] 172 real
[519 't1' 1330834471 'r' 1330834471 'public' 519 'e' 1186 24 4 -1 false false] 172 real
[519 't1' 1330834471 'r' 1330834471 'public' 519 'a' 20 8 1 -1 true false] 172 real
[519 't1' 1330834471 'r' 1330834471 'public' 519 'b' 25 -1 2 -1 false false] 172 real
[519 't1' 1330834471 'r' 1330834471 'public' 519 'c' 1082 16 3 -1 false false] 172 real
[520 't2' 1330834471 'r' 1330834471 'public' 520 'e' 1186 24 4 -1 false false] 172 real
[520 't2' 1330834471 'r' 1330834471 'public' 520 'a' 20 8 1 -1 true false] 172 real
[520 't2' 1330834471 'r' 1330834471 'public' 520 'b' 25 -1 2 -1 false false] 172 real
[520 't2' 1330834471 'r' 1330834471 'public' 520 'c' 1082 16 3 -1 false false] 172 real
[521 't3' 1330834471 'r' 1330834471 'public' 521 'e' 1186 24 4 -1 false false] 172 real
[521 't3' 1330834471 'r' 1330834471 'public' 521 'a' 20 8 1 -1 true false] 172 real
[521 't3' 1330834471 'r' 1330834471 'public' 521 'b' 25 -1 2 -1 false false] 172 real
[521 't3' 1330834471 'r' 1330834471 'public' 521 'c' 1082 16 3 -1 false false] 172 real
[522 't4' 1330834471 'r' 1330834471 'public' 522 'e' 1186 24 4 -1 false false] 172 real
[522 't4' 1330834471 'r' 1330834471 'public' 522 'a' 20 8 1 -1 true false] 172 real
[522 't4' 1330834471 'r' 1330834471 'public' 522 'b' 25 -1 2 -1 false false] 172 real
[522 't4' 1330834471 'r' 1330834471 'public' 522 'c' 1082 16 3 -1 false false] 172 real
[523 't5' 1330834471 'r' 1330834471 'public' 523 'e' 1186 24 4 -1 false false] 172 real
[523 't5' 1330834471 'r' 1330834471 'public' 523 'a' 20 8 1 -1 true false] 172 real
[523 't5' 1330834471 'r' 1330834471 'public' 523 'b' 25 -1 2 -1 false false] 172 real
[523 't5' 1330834471 'r' 1330834471 'public' 523 'c' 1082 16 3 -1 false false] 172 real
[524 't6' 1330834471 'r' 1330834471 'public' 524 'e' 1186 24 4 -1 false false] 172 real
[524 't6' 1330834471 'r' 1330834471 'public' 524 'a' 20 8 1 -1 true false] 172 real
[524 't6' 1330834471 'r' 1330834471 'public' 524 'b' 25 -1 2 -1 false false] 172 real
[524 't6' 1330834471 'r' 1330834471 'public' 524 'c' 1082 16 3 -1 false false] 172 real
[525 't7' 1330834471 'r' 1330834471 'public' 525 'e' 1186 24 4 -1 false false] 172 real
[525 't7' 1330834471 'r' 1330834471 'public' 525 'a' 20 8 1 -1 true false] 172 real
[525 't7' 1330834471 'r' 1330834471 'public' 525 'b' 25 -1 2 -1 false false] 172 real
[525 't7' 1330834471 'r' 1330834471 'public' 525 'c' 1082 16 3 -1 false false] 172 real
[526 't8' 1330834471 'r' 1330834471 'public' 526 'e' 1186 24 4 -1 false false] 172 real
[526 't8' 1330834471 'r' 1330834471 'public' 526 'a' 20 8 1 -1 true false] 172 real
[526 't8' 1330834471 'r' 1330834471 'public' 526 'b' 25 -1 2 -1 false false] 172 real
[526 't8' 1330834471 'r' 1330834471 'public' 526 'c' 1082 16 3 -1 false false] 172 real
[527 't9' 1330834471 'r' 1330834471 'public' 527 'e' 1186 24 4 -1 false false] 172 real
[527 't9' 1330834471 'r' 1330834471 'public' 527 'a' 20 8 1 -1 true false] 172 real
[527 't9' 1330834471 'r' 1330834471 'public' 527 'b' 25 -1 2 -1 false false] 172 real
[527 't9' 1330834471 'r' 1330834471 'public' 527 'c' 1082 16 3 -1 false false] 172 real
[528 't10' 1330834471 'r' 1330834471 'public' 528 'e' 1186 24 4 -1 false false] 173 real
[528 't10' 1330834471 'r' 1330834471 'public' 528 'a' 20 8 1 -1 true false] 173 real
[528 't10' 1330834471 'r' 1330834471 'public' 528 'b' 25 -1 2 -1 false false] 173 real
[528 't10' 1330834471 'r' 1330834471 'public' 528 'c' 1082 16 3 -1 false false] 173 real
[529 't11' 1330834471 'r' 1330834471 'public' 529 'e' 1186 24 4 -1 false false] 173 real
[529 't11' 1330834471 'r' 1330834471 'public' 529 'a' 20 8 1 -1 true false] 173 real
[529 't11' 1330834471 'r' 1330834471 'public' 529 'b' 25 -1 2 -1 false false] 173 real
[529 't11' 1330834471 'r' 1330834471 'public' 529 'c' 1082 16 3 -1 false false] 173 real
[530 't12' 1330834471 'r' 1330834471 'public' 530 'e' 1186 24 4 -1 false false] 173 real
[530 't12' 1330834471 'r' 1330834471 'public' 530 'a' 20 8 1 -1 true false] 173 real
[530 't12' 1330834471 'r' 1330834471 'public' 530 'b' 25 -1 2 -1 false false] 173 real
[530 't12' 1330834471 'r' 1330834471 'public' 530 'c' 1082 16 3 -1 false false] 173 real
[531 't13' 1330834471 'r' 1330834471 'public' 531 'e' 1186 24 4 -1 false false] 173 real
[531 't13' 1330834471 'r' 1330834471 'public' 531 'a' 20 8 1 -1 true false] 173 real
[531 't13' 1330834471 'r' 1330834471 'public' 531 'b' 25 -1 2 -1 false false] 173 real
[531 't13' 1330834471 'r' 1330834471 'public' 531 'c' 1082 16 3 -1 false false] 173 real
[531 't13' 1330834471 'r' 1330834471 'public' 531 'e' 1186 24 4 -1 false false] 173 real

@jordanlewis
Copy link
Member

The rows on the output side are also out of order... something seems really messed up

@jordanlewis
Copy link
Member

It was a memory aliasing issue, a bug in the vTableLookupJoinNode. Here's a diff that fixes it:

@@ -271,7 +277,9 @@ func (v *vTableLookupJoinNode) Next(params runParams) (bool, error) {
        for {
                // Check if there are any rows left to emit from the last input row.
                if v.run.rows.Len() > 0 {
-                       v.run.row = v.run.rows.At(0)
+                       copy(v.run.row, v.run.rows.At(0))

this only explains the crash, not the slowdown, though.

jordanlewis added a commit to jordanlewis/cockroach that referenced this issue Oct 5, 2020
This commit adds a new function called ConstantWithTestValue that is
designed to be used to initialize "magic constants", like batch sizes,
that cause edge conditions in the code.

ConstantWithTestValue should be used to initialize "magic constants" that
should be varied during test scenarios to check for bugs at boundary
conditions. When built with the test_constants build tag, the test value
will be used. In all other cases, the production value will be used.

An example of a "magic constant" that behaves this way is a batch size. Batch
sizes tend to present testing problems, because often the logic that deals
with what to do when a batch is finished is less likely to be exercised by
simple unit tests that don't use enough data to fill up a batch.

For example, instead of writing:

```
const batchSize = 64
```

you should write:

```
var batchSize = util.ConstantWithTestValue(64, 1)
```

This will give your code a batch size of 1 in the test_constants build
configuration, increasing the amount of exercise the edge conditions get.

This commit also adds several uses of `ConstantWithTestValue` in the SQL
package. One of them in particular (the datum row container chunk size)
was tailored to catch cockroachdb#55140.

Release note: None
rytaft added a commit to rytaft/cockroach that referenced this issue Oct 22, 2020
This commit adds a cost equal to 10*randIOCostFactor for each virtual
scan in order to represent the cost of fetching table descriptors.
This cost is especially important when perfoming lookup joins, because
the descriptors may need to be fetched on each lookup. As a result of
this change, the optimizer is much less likely to plan a lookup join
into a virtual table.

This commit also includes some fixes to the test catalog to provide
better support for testing virtual tables with indexes.

Fixes cockroachdb#55140

Release note (performance improvement): Adjusted the cost model in
the optimizer so that the optimizer is less likely to plan a lookup
join into a virtual table. Performing a lookup join into a virtual
table is expensive, so this change will generally result in better
performance for queries involving joins with virtual tables.
craig bot pushed a commit that referenced this issue Oct 23, 2020
55833: opt: add cost for table descriptor fetch during virtual scan r=rytaft a=rytaft

This commit adds a cost equal to `10*randIOCostFactor` for each virtual
scan in order to represent the cost of fetching table descriptors.
This cost is especially important when perfoming lookup joins, because
the descriptors may need to be fetched on each lookup. As a result of
this change, the optimizer is much less likely to plan a lookup join
into a virtual table.

Fixes #55140

Release note (performance improvement): Adjusted the cost model in
the optimizer so that the optimizer is less likely to plan a lookup
join into a virtual table. Performing a lookup join into a virtual
table is expensive, so this change will generally result in better
performance for queries involving joins with virtual tables.

Co-authored-by: Rebecca Taft <becca@cockroachlabs.com>
craig bot pushed a commit that referenced this issue Oct 24, 2020
55833: opt: add cost for table descriptor fetch during virtual scan r=rytaft a=rytaft

This commit adds a cost equal to `10*randIOCostFactor` for each virtual
scan in order to represent the cost of fetching table descriptors.
This cost is especially important when perfoming lookup joins, because
the descriptors may need to be fetched on each lookup. As a result of
this change, the optimizer is much less likely to plan a lookup join
into a virtual table.

Fixes #55140

Release note (performance improvement): Adjusted the cost model in
the optimizer so that the optimizer is less likely to plan a lookup
join into a virtual table. Performing a lookup join into a virtual
table is expensive, so this change will generally result in better
performance for queries involving joins with virtual tables.

Co-authored-by: Rebecca Taft <becca@cockroachlabs.com>
@craig craig bot closed this as completed in dcec6ee Oct 24, 2020
rafiss pushed a commit to rafiss/cockroach that referenced this issue Oct 26, 2020
This commit adds a cost equal to 10*randIOCostFactor for each virtual
scan in order to represent the cost of fetching table descriptors.
This cost is especially important when perfoming lookup joins, because
the descriptors may need to be fetched on each lookup. As a result of
this change, the optimizer is much less likely to plan a lookup join
into a virtual table.

This commit also includes some fixes to the test catalog to provide
better support for testing virtual tables with indexes.

Fixes cockroachdb#55140

Release note (performance improvement): Adjusted the cost model in
the optimizer so that the optimizer is less likely to plan a lookup
join into a virtual table. Performing a lookup join into a virtual
table is expensive, so this change will generally result in better
performance for queries involving joins with virtual tables.
@rafiss
Copy link
Collaborator Author

rafiss commented Oct 29, 2020

@rytaft apologies for noticing this late, but I just tried running my reproduction above (#55140 (comment)) and I found that the query is still taking around 6 seconds when I test with the latest version from master (commit f945cae), as opposed to ~50ms when I run with v20.1.3 locally.

I'm going to re-open and hopefully we can diagnose this further.

@rafiss rafiss reopened this Oct 29, 2020
@rytaft
Copy link
Collaborator

rytaft commented Oct 29, 2020

Hmm -- I did run those reproduction steps when I was working on this PR and confirmed that the new plan was significantly faster than the old one (at least an order of magnitude).

Are you sure that what you're seeing is plan-related?

@rafiss
Copy link
Collaborator Author

rafiss commented Oct 30, 2020

Here is the plan I am getting now (taking 6 seconds to execute)

                                                         info
-----------------------------------------------------------------------------------------------------------------------
  distribution: local
  vectorized: false

  • sort
  │ order: +relname,+row_number
  │
  └── • render
      │
      └── • filter
          │ filter: attname LIKE '%'
          │
          └── • window
              │
              └── • merge join
                  │ equality: (oid) = (atttypid)
                  │
                  ├── • virtual table
                  │     table: pg_type@pg_type_oid_idx
                  │
                  └── • sort
                      │ order: +atttypid
                      │
                      └── • hash join (left outer)
                          │ equality: (classoid) = (oid)
                          │
                          ├── • hash join (right outer)
                          │   │ equality: (adrelid, adnum) = (attrelid, attnum)
                          │   │
                          │   ├── • filter
                          │   │   │ filter: adnum > 0
                          │   │   │
                          │   │   └── • virtual table
                          │   │         table: pg_attrdef@primary
                          │   │
                          │   └── • hash join (right outer)
                          │       │ equality: (objoid, objsubid) = (oid, attnum)
                          │       │
                          │       ├── • filter
                          │       │   │ filter: objsubid > 0
                          │       │   │
                          │       │   └── • virtual table
                          │       │         table: pg_description@primary
                          │       │
                          │       └── • virtual table lookup join
                          │           │ table: pg_attribute@pg_attribute_attrelid_idx
                          │           │ equality: (oid) = (attrelid)
                          │           │ pred: (attnum > 0) AND (NOT attisdropped)
                          │           │
                          │           └── • hash join
                          │               │ equality: (relnamespace) = (oid)
                          │               │
                          │               ├── • filter
                          │               │   │ filter: (relkind IN ('f', 'm', 'p', 'r', 'v')) AND (relname LIKE '%')
                          │               │   │
                          │               │   └── • virtual table
                          │               │         table: pg_class@primary
                          │               │
                          │               └── • filter
                          │                   │ filter: nspname LIKE 'public'
                          │                   │
                          │                   └── • virtual table
                          │                         table: pg_namespace@primary
                          │
                          └── • hash join (left outer)
                              │ equality: (relnamespace) = (oid)
                              │
                              ├── • filter
                              │   │ filter: relname = 'pg_class'
                              │   │
                              │   └── • virtual table
                              │         table: pg_class@primary
                              │
                              └── • filter
                                  │ filter: nspname = 'pg_catalog'
                                  │
                                  └── • virtual table
                                        table: pg_namespace@primary

Here is explain analyze output.

It looks like the EXPLAIN format itself has changed, but it also does seem like this is different plan... There is only one virtual table lookup join now, but it seems to be taking 5.6 seconds. Not sure what's going on (or what is expected) yet.

rafiss pushed a commit to rafiss/cockroach that referenced this issue Nov 3, 2020
This commit adds a cost equal to 10*randIOCostFactor for each virtual
scan in order to represent the cost of fetching table descriptors.
This cost is especially important when perfoming lookup joins, because
the descriptors may need to be fetched on each lookup. As a result of
this change, the optimizer is much less likely to plan a lookup join
into a virtual table.

This commit also includes some fixes to the test catalog to provide
better support for testing virtual tables with indexes.

Fixes cockroachdb#55140

Release note (performance improvement): Adjusted the cost model in
the optimizer so that the optimizer is less likely to plan a lookup
join into a virtual table. Performing a lookup join into a virtual
table is expensive, so this change will generally result in better
performance for queries involving joins with virtual tables.
rytaft added a commit to rytaft/cockroach that referenced this issue Nov 5, 2020
This commit adds a cost equal to 10*randIOCostFactor for each virtual
scan in order to represent the cost of fetching table descriptors.
This cost is especially important when perfoming lookup joins, because
the descriptors may need to be fetched on each lookup. As a result of
this change, the optimizer is much less likely to plan a lookup join
into a virtual table.

This commit also includes some fixes to the test catalog to provide
better support for testing virtual tables with indexes.

Fixes cockroachdb#55140

Release note (performance improvement): Adjusted the cost model in
the optimizer so that the optimizer is less likely to plan a lookup
join into a virtual table. Performing a lookup join into a virtual
table is expensive, so this change will generally result in better
performance for queries involving joins with virtual tables.
rafiss pushed a commit to rafiss/cockroach that referenced this issue Nov 5, 2020
This commit adds a cost equal to 10*randIOCostFactor for each virtual
scan in order to represent the cost of fetching table descriptors.
This cost is especially important when perfoming lookup joins, because
the descriptors may need to be fetched on each lookup. As a result of
this change, the optimizer is much less likely to plan a lookup join
into a virtual table.

This commit also includes some fixes to the test catalog to provide
better support for testing virtual tables with indexes.

Fixes cockroachdb#55140

Release note (performance improvement): Adjusted the cost model in
the optimizer so that the optimizer is less likely to plan a lookup
join into a virtual table. Performing a lookup join into a virtual
table is expensive, so this change will generally result in better
performance for queries involving joins with virtual tables.
rytaft added a commit to rytaft/cockroach that referenced this issue Nov 6, 2020
This commit adds a cost equal to 10*randIOCostFactor for each virtual
scan in order to represent the cost of fetching table descriptors.
This cost is especially important when perfoming lookup joins, because
the descriptors may need to be fetched on each lookup. As a result of
this change, the optimizer is much less likely to plan a lookup join
into a virtual table.

This commit also includes some fixes to the test catalog to provide
better support for testing virtual tables with indexes.

Fixes cockroachdb#55140

Release note (performance improvement): Adjusted the cost model in
the optimizer so that the optimizer is less likely to plan a lookup
join into a virtual table. Performing a lookup join into a virtual
table is expensive, so this change will generally result in better
performance for queries involving joins with virtual tables.
rytaft added a commit to rytaft/cockroach that referenced this issue Nov 6, 2020
This commit bumps the cost of each virtual scan to 25*randIOCostFactor
from its previous value of 10*randIOCostFactor. This new value threads
the needle so that a lookup join will still be chosen if the predicate
is very selective, but the plan for the PGJDBC query identified in cockroachdb#55140
no longer includes lookup joins.

Fixes cockroachdb#55140

Release note (performance improvement): Adjusted the cost model in
the optimizer so that the optimizer is less likely to plan a lookup
join into a virtual table. Performing a lookup join into a virtual
table is expensive, so this change will generally result in better
performance for queries involving joins with virtual tables.
craig bot pushed a commit that referenced this issue Nov 12, 2020
55808: kvserver: skip non-live nodes when considering candidates for transfers r=tbg a=knz

(This PR is forked off #55460 to simplify the discussion. I believe there's no discussion left here? Maybe I can merge it directly?)

Fixes #55440.

Prior to this patch, 3 components could attempt to transfer a replica
to a node currently being drained:

- the store rebalancer, which rebalances replicas based on disk
  usage and QPS.
- the allocator, to place new replicas.
- the allocator, to rebalance replicas depending on load.

This commit introduces a consideration for node liveness when building
the list of candidates, to detect whether a target node is
acceptable. Any node that is not LIVE according to its liveness status
is not considered for a transfer.

Release note (bug fix): In some cases CockroachDB would attempt to
transfer ranges to nodes in the process of being decommissioned or
being shut down; this could cause disruption the moment the node
did actually terminate. This bug has been fixed. It had been
introduced some time before v2.0.

56334: kvserver: use messages on NotLeaseholderErrors everywhere r=andreimatei a=andreimatei

NLHE permits custom messages in it, but the field was rarely used. This
patch makes every instance where we instantiate the error provide a
message, since this error comes from a wide variety of conditions.

Release note: None

56345: opt: increase cost for table descriptor fetch during virtual scan r=rytaft a=rytaft

This commit bumps the cost of each virtual scan to `25*randIOCostFactor`
from its previous value of `10*randIOCostFactor`. This new value threads
the needle so that a lookup join will still be chosen if the predicate
is very selective, but the plan for the PGJDBC query identified in #55140
no longer includes lookup joins.

Fixes #55140

Release note (performance improvement): Adjusted the cost model in
the optimizer so that the optimizer is less likely to plan a lookup
join into a virtual table. Performing a lookup join into a virtual
table is expensive, so this change will generally result in better
performance for queries involving joins with virtual tables.

56525: bazel: Move third party repositories to c-deps/REPOSITORIES.bzl r=otan a=alan-mas

bazel: Move third party repositories to c-deps/REPOSITORIES.bzl

This is one of the Bazel re-factoring that we are working on
and it is about to move third party repositories out of root WORKSPACE. 
fixes #56053

Best practices is to separate external dependencies and it also
hides the repo WORKSPACE from being used by other directories.

We are creating a new .bzl file inside c-deps with all the external dependencies
and then load then inside our root WORKSPACE

Release note: None

56589: sql: resolve error due to drop table after schema change in same txn r=ajwerner a=jayshrivastava

Previously, if a drop table statement was executed in a transaction
following other schema changes to the table in the same transaction,
an error would occur. This error was due to the drop table statement
marking previous jobs as succeeded and then proceeding to modify them.
This change ensures that drop table statement will delete all existing
jobs from the job cache so that it does not interfere with previous jobs.

Release note (sql change): A table can successfully be dropped in
a transaction following other schema changes to the table in the
same transaction.

This resolves one of the issues in #56235

56597: colflow: fix recent misuse of two slices in the flow setup r=yuzefovich a=yuzefovich

We've recently added the reusing of metadataSourcesQueue and toClose
slices in order to reduce some allocations. However, the components that
are using those slices don't make a deep copy, and as a result, we
introduced a bug in which we were breaking the current contract. This
commit fixes the issue by going back to the old method (with slight
difference in that we currently delay any allocations unlike previously
when we allocated a slice with capacity of 1).

Release note: None (no release with this bug)

56598: tree: introduce concept of "default" collation r=rafiss a=otan

Resolves #54989


Release note (sql change): Introduced a pg_collation of "default".
Strings now return the "default" collation OID in the pg_attribute
table (this was previously en_US). The "default" collation is also
visible on the pg_collation virtual table.



56602: roachpb: remove various `(gogoproto.equal)` options r=nvanbenschoten a=tbg

The first commit explains why some cleanup was necessary,
the others are the result of spending a little extra time
cleaning up "unnecessarily".

There are plenty of Equals left to clean up, but the returns
were diminishing. The important part is that when additions
to the KV API are made, nobody will be forced to add the
`equal` option any more.

- roachpb: don't generate Equal() on Error
- roachpb: remove more `Equal` methods
- roachpb: remove (gogoproto.equal) from api.proto
- roachpb: mostly remove (gogoproto.equal) from data.proto
- roachpb: remove Value.Equal
- kvserverpb: remove Equal from ReplicatedEvalResult


Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
Co-authored-by: Rebecca Taft <becca@cockroachlabs.com>
Co-authored-by: Alanmas <acostas.alan@gmail.com>
Co-authored-by: Jayant Shrivastava <jayants@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Oliver Tan <otan@cockroachlabs.com>
Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com>
@craig craig bot closed this as completed in 0830131 Nov 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-memmon SQL memory monitoring A-sql-vtables Virtual tables - pg_catalog, information_schema etc C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
Projects
None yet
4 participants