List multiple table names in the Hot Ranges table for coalesced ranges containing different tables #130997

Closed
NikRocher opened this issue Sep 19, 2024 · 0 comments · Fixed by #133190, #133840 or #134106
Assignees
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) O-support Would prevent or help troubleshoot a customer escalation - bugs, missing observability/tooling, docs P-1 Issues/test failures with a fix SLA of 1 month T-observability

Comments

NikRocher commented Sep 19, 2024

Is your feature request related to a problem? Please describe.

A recent investigation into an issue revealed that it's difficult to trace the root cause of a hot range when the range is a coalesced range comprising many different tables.

Describe the solution you'd like

We need better observability: when ranges have been coalesced and include data from different tables, the Hot Ranges table should show the names of all the tables that are part of the same range. This will be useful for identifying which tables are contributing to a hot range when the range spans multiple tables.

Describe alternatives you've considered

On a remote session with the customer, we used LogSpy to look at the requests that were hitting the range from a single IP address. The IP was associated with an HAProxy instance, so the traffic was coming in from an external source.

After analyzing the sessions and noting the IP and port from the LogSpy output, we found that it was only running SQL against a different table. Upon checking the range's keyspan, we saw that the range was a coalesced range comprising many different tables, including the table that was getting the brunt of the load.

Jira issue: CRDB-42328

Epic CRDB-43151

@NikRocher NikRocher added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) O-support Would prevent or help troubleshoot a customer escalation - bugs, missing observability/tooling, docs T-kv KV Team labels Sep 19, 2024
@exalate-issue-sync exalate-issue-sync bot added T-observability P-1 Issues/test failures with a fix SLA of 1 month and removed T-kv KV Team labels Oct 7, 2024
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Oct 22, 2024
Historically, to estimate which tables and indexes existed within the
span of a range, we used the Collection's `GetAllDescriptors` function
and returned the values which fell in between the start and end keys of
the range's span. This approach had the benefit of being precise, but
the drawback of being computationally expensive - since the system can
have hundreds of nodes and theoretically millions of tables, using
`GetAllDescriptors` begins to become a bottleneck.

This approach was modified so that instead of pulling all descriptors
for the system when computing range contents, the system instead used
the start key of the range to identify the exact table + index at that
point within the keyspace
([change](https://github.com/cockroachdb/cockroach/pull/77277/files)).
This works if each table or index has its own range, but quickly breaks
down if tables share a range. Consider the following layout of data:

```
T1=Table 1
T2=Table 2
T3=Table 3
R1=Range 1

└─────T1─────┴─────T2────┴─────T3─────┘
└────────┴─────────R1───────┴─────────┘
```

Since the start key of the range falls within Table 1, the system
associates the range with only Table 1, despite it also containing
Tables 2 and 3.
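
The difference between the two lookups can be sketched as follows. This is
only an illustration under simplifying assumptions: keys are plain integers,
and the `table`, `tableAtKey` and `tablesInSpan` names are hypothetical, not
part of the actual change.

```go
package main

import "fmt"

// table is a hypothetical stand-in for a table's slice of the keyspace,
// expressed as a half-open interval [start, end) over integer keys.
type table struct {
	name       string
	start, end int
}

// tableAtKey mimics the start-key lookup: it returns only the table that
// contains the given key, or "" if no table contains it.
func tableAtKey(tables []table, key int) string {
	for _, t := range tables {
		if key >= t.start && key < t.end {
			return t.name
		}
	}
	return ""
}

// tablesInSpan returns every table whose interval overlaps [start, end).
func tablesInSpan(tables []table, start, end int) []string {
	var names []string
	for _, t := range tables {
		if t.start < end && start < t.end {
			names = append(names, t.name)
		}
	}
	return names
}

func main() {
	tables := []table{{"T1", 0, 10}, {"T2", 10, 20}, {"T3", 20, 30}}
	// R1 starts inside T1 and ends inside T3, as in the diagram.
	fmt.Println(tableAtKey(tables, 7))       // T1
	fmt.Println(tablesInSpan(tables, 7, 25)) // [T1 T2 T3]
}
```

The start-key lookup reports only T1 for R1, while the overlap check reports
all three tables.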

Given this, it becomes necessary to identify the set of descriptors
within a given span. This PR introduces the `ScanDescriptorsInSpans`
function, which does just that: it allows the caller to specify a set of
spans whose descriptors are of interest, and returns a catalog including
those descriptors.

It does this by translating the span keys into descriptor span keys and
scanning them from the descriptors table. For example, given a span
`[/Table/Users/PKEY/1, /Table/Users/SECONDARY/chicago]` where the ID for
the Users table is `5`, it will generate the descriptor span
`[/Table/Descriptors/PKEY/5, /Table/Descriptors/PKEY/6]`.
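
A minimal sketch of that translation, assuming the table IDs at the start
and end of the data span have already been decoded. The `dataSpan`,
`descriptorSpan` and `toDescriptorSpan` names are hypothetical and do not
reflect the real `ScanDescriptorsInSpans` signature; the end ID is bumped by
one so that the last table's descriptor falls inside the scanned span.

```go
package main

import "fmt"

// dataSpan is a hypothetical, simplified view of a span over table data:
// only the table IDs at the start and end keys matter for this translation.
type dataSpan struct {
	startTableID, endTableID uint32
}

// descriptorSpan is a span over the primary key of the descriptors table,
// covering descriptor IDs from startID up to (but not including) endID.
type descriptorSpan struct {
	startID, endID uint32
}

// toDescriptorSpan maps a data span to the descriptor IDs it touches.
func toDescriptorSpan(s dataSpan) descriptorSpan {
	return descriptorSpan{startID: s.startTableID, endID: s.endTableID + 1}
}

// String renders the span in the notation used in the text above.
func (d descriptorSpan) String() string {
	return fmt.Sprintf("[/Table/Descriptors/PKEY/%d, /Table/Descriptors/PKEY/%d]",
		d.startID, d.endID)
}

func main() {
	// Both ends of the example span fall in the Users table (ID 5).
	users := dataSpan{startTableID: 5, endTableID: 5}
	fmt.Println(toDescriptorSpan(users))
	// prints [/Table/Descriptors/PKEY/5, /Table/Descriptors/PKEY/6]
}
```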

This approach too comes with its drawbacks: within the descriptor
space, keys are scoped by table, not by index. That means that in a
following PR, the status server will be responsible for taking these
descriptors, which include all indexes in the tables pulled, and
filtering them down to only the indexes which appear in the specified
range.

Epic: none
Fixes: cockroachdb#130997

Release note: None
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Oct 22, 2024
Historically, to estimate which tables and indexes existed within the
span of a range, we used the Collection's `GetAllDescriptors` function
and returned the values which fell in between the start and end keys of
the range's span. This approach had the benefit of being precise, but
the drawback of being computationally expensive - since the system can
have hundreds of nodes and theoretically millions of tables, using
`GetAllDescriptors` begins to become a bottleneck.

This approach was modified so that instead of pulling all descriptors
for the system when computing range contents, the system instead used
the start key of the range to identify the exact table + index at that
point within the keyspace
([change](https://github.com/cockroachdb/cockroach/pull/77277/files)).
This works if each table or index has its own range, but quickly breaks
down if tables share a range. Consider the following layout of data:

```
T1=Table 1
T2=Table 2
T3=Table 3
R1=Range 1

└─────T1─────┴─────T2────┴─────T3─────┘
└────────┴─────────R1───────┴─────────┘
```

Since the start key of the range falls within Table 1, the system
associates the range with only Table 1, despite it also containing
Tables 2 and 3.

Given this, it becomes necessary to identify the set of descriptors
within a given span. This PR introduces the `ScanDescriptorsInSpans`
function, which does just that: it allows the caller to specify a set of
spans whose descriptors are of interest, and returns a catalog including
those descriptors.

It does this by translating the span keys into descriptor span keys and
scanning them from the descriptors table. For example, given a span
`[/Table/Users/PKEY/1, /Table/Users/SECONDARY/chicago]` where the ID for
the Users table is `5`, it will generate the descriptor span
`[/Table/Descriptors/PKEY/5, /Table/Descriptors/PKEY/6]`.

This approach too comes with its drawbacks: within the descriptor
space, keys are scoped by table, not by index. That means that in a
following PR, the status server will be responsible for taking these
descriptors, which include all indexes in the tables pulled, and
filtering them down to only the indexes which appear in the specified
range.

The bulk of the changeset is in updating the datadriven tests to test
this behavior; the primary area of focus for review should be the
`pkg/sql/catalog/internal/catkv/catalog_reader.go` file (~75 LOC).

Epic: none
Fixes: cockroachdb#130997

Release note: None
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Oct 30, 2024
Previously each range correlated to a single table, or even a single
index in a database, so all that was required to identify which tables
and indexes were in the range was to look at the start key of the range
and map it accordingly.

With range coalescing however, it's possible for one, or many, tables,
indexes and the like to reside within the same range. To properly
identify the contents of a range, this PR adds the following utilities:

 1. A utility function which turns a range into a span, and binds it
		within its tenant's descriptor space.
 2. A utility function which takes the above spans and uses the catalog
		and new descriptor by span utility to turn those spans into a set of
		table descriptors ordered by id.
 3. A utility function which transforms those table descriptors into a
		set of (database, table, index) names which deduplicate and identify
		each index uniquely.
 4. A utility function, which merges the ranges and indexes into a map
		keyed by RangeID whose values are the above index names.
 5. A primary entrypoint for consumers from which a set of ranges can be
		passed in and a mapping from those ranges to indexes can be
		returned.
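
The overall shape of steps 3 through 5 can be sketched roughly as below.
This is a simplified illustration, not the code in this PR: keys are
integers, and `keyRange`, `indexSpan`, `indexName` and `indexesByRange` are
hypothetical names standing in for the real range and descriptor types.

```go
package main

import "fmt"

type rangeID int

// keyRange is a hypothetical stand-in for a range: an ID plus the
// half-open key interval [start, end) it covers.
type keyRange struct {
	id         rangeID
	start, end int
}

// indexName identifies an index uniquely by (database, table, index).
type indexName struct {
	database, table, index string
}

// indexSpan pairs an index name with the key interval it occupies.
type indexSpan struct {
	name       indexName
	start, end int
}

// indexesByRange maps each range ID to the deduplicated names of the
// indexes whose key interval overlaps that range.
func indexesByRange(ranges []keyRange, indexes []indexSpan) map[rangeID][]indexName {
	out := make(map[rangeID][]indexName, len(ranges))
	for _, r := range ranges {
		seen := make(map[indexName]struct{})
		for _, idx := range indexes {
			if idx.start >= r.end || r.start >= idx.end {
				continue // no overlap with this range
			}
			if _, dup := seen[idx.name]; dup {
				continue
			}
			seen[idx.name] = struct{}{}
			out[r.id] = append(out[r.id], idx.name)
		}
	}
	return out
}

func main() {
	indexes := []indexSpan{
		{indexName{"app", "users", "primary"}, 0, 10},
		{indexName{"app", "users", "by_city"}, 10, 20},
		{indexName{"app", "orders", "primary"}, 20, 30},
	}
	ranges := []keyRange{{id: 1, start: 5, end: 25}, {id: 2, start: 30, end: 40}}
	byRange := indexesByRange(ranges, indexes)
	for _, r := range ranges {
		fmt.Println(r.id, len(byRange[r.id]), "indexes")
	}
}
```

A range that overlaps no index, such as range 2 here, simply has no entry in
the resulting map.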

A variety of caveats come with this approach. It attempts to scan the
descriptors all at once, but it will still scan a sizable portion of the
descriptors table if the request is large enough. It makes no attempt
to describe system information which does not have a descriptor. It will
describe system tables which appear in the descriptors table, but it
will not try to explain "tables" which do not have descriptors (for
example tsdb), or any other information stored in the keyspace without a
descriptor (PseudoTableIDs, GossipKeys for example).

Throughout this work, many existing utilities were duplicated, and then
un-duplicated (`keys.TableDataMin`, `roachpb.Span.Overlap`, etc). If you
see anything that seems to already exist, feel free to point it out
accordingly.

Epic: none
Fixes: cockroachdb#130997

Release note: None
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Oct 30, 2024
Previously each range correlated to a single table, or even a single
index in a database, so all that was required to identify which tables
and indexes were in the range was to look at the start key of the range
and map it accordingly.

With range coalescing however, it's possible for one, or many, tables,
indexes and the like to reside within the same range. To properly
identify the contents of a range, this PR adds the following utilities:

 1. A utility function which turns a range into a span, and binds it
		within its tenant's descriptor space.
 2. A utility function which takes the above spans and uses the catalog
		and new descriptor by span utility to turn those spans into a set of
		table descriptors ordered by id.
 3. A utility function which transforms those table descriptors into a
		set of (database, table, index) names which deduplicate and identify
		each index uniquely.
 4. A utility function, which merges the ranges and indexes into a map
		keyed by RangeID whose values are the above index names.
 5. A primary entrypoint for consumers from which a set of ranges can be
		passed in and a mapping from those ranges to indexes can be
		returned.

A variety of caveats come with this approach. It attempts to scan the
descriptors all at once, but it will still scan a sizable portion of the
descriptors table if the request is large enough. It makes no attempt
to describe system information which does not have a descriptor. It will
describe system tables which appear in the descriptors table, but it
will not try to explain "tables" which do not have descriptors (for
example tsdb), or any other information stored in the keyspace without a
descriptor (PseudoTableIDs, GossipKeys for example).

Throughout this work, many existing utilities were duplicated, and then
un-duplicated (`keys.TableDataMin`, `roachpb.Span.Overlap`, etc). If you
see anything that seems to already exist, feel free to point it out
accordingly.

Epic: none
Fixes: cockroachdb#130997

Release note: None
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Oct 30, 2024
Previously each range correlated to a single table, or even a single
index in a database, so all that was required to identify which tables
and indexes were in the range was to look at the start key of the range
and map it accordingly.

With range coalescing however, it's possible for one, or many, tables,
indexes and the like to reside within the same range. To properly
identify the contents of a range, this PR adds the following utilities:

 1. A utility function which turns a range into a span, and clamps it
		to its tenant's table space.
 2. A utility function which takes the above spans and uses the catalog
		and new descriptor by span utility to turn those spans into a set of
		table descriptors ordered by id.
 3. A utility function which transforms those table descriptors into a
		set of (database, table, index) names which deduplicate and identify
		each index uniquely.
 4. A utility function, which merges the ranges and indexes into a map
		keyed by RangeID whose values are the above index names.
 5. A primary entrypoint for consumers from which a set of ranges can be
		passed in and a mapping from those ranges to indexes can be
		returned.
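
The clamping in step 1 could look something like the sketch below. It
assumes integer keys and a hypothetical `clamp` helper; the real utility
works on encoded keys and tenant-specific bounds, which are not shown here.

```go
package main

import "fmt"

// span is a half-open key interval [start, end) over integer keys.
type span struct {
	start, end int
}

// clamp restricts s to the given bounds. The boolean result is false when
// nothing of s remains inside the bounds.
func clamp(s, bounds span) (span, bool) {
	if s.start < bounds.start {
		s.start = bounds.start
	}
	if s.end > bounds.end {
		s.end = bounds.end
	}
	if s.start >= s.end {
		return span{}, false
	}
	return s, true
}

func main() {
	// Hypothetical bounds for a tenant's table space.
	tableSpace := span{start: 10, end: 100}

	// A range that begins before the table space is trimmed to it.
	fmt.Println(clamp(span{start: 0, end: 50}, tableSpace)) // {10 50} true

	// A range that lies entirely outside the table space is dropped.
	fmt.Println(clamp(span{start: 0, end: 5}, tableSpace)) // {0 0} false
}
```

Clamping this way keeps ranges that straddle non-table keys from producing
lookups outside the table space.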

A variety of caveats come with this approach. It attempts to scan the
descriptors all at once, but it will still scan a sizable portion of the
descriptors table if the request is large enough. It makes no attempt
to describe system information which does not have a descriptor. It will
describe system tables which appear in the descriptors table, but it
will not try to explain "tables" which do not have descriptors (for
example tsdb), or any other information stored in the keyspace without a
descriptor (PseudoTableIDs, GossipKeys for example).

Throughout this work, many existing utilities were duplicated, and then
un-duplicated (`keys.TableDataMin`, `roachpb.Span.Overlap`, etc). If you
see anything that seems to already exist, feel free to point it out
accordingly.

Epic: none
Fixes: cockroachdb#130997

Release note: None
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 1, 2024
This change is the last in a set of commits to change the hot ranges
page from only showing one table, index per range to many. It builds on
top of the changes in the catalog reader
(10b9ee0) and the range utilities
(109219d) to surface a set of tables,
indexes for each range.

The primary changes in this commit specifically are the modification of
the status server to use the new `rangeutil` utilities, and changing the
wire, presentation format of the information.

Epic: none
Fixes: cockroachdb#130997

Release note: None
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 1, 2024
This change is the last in a set of commits to change the hot ranges
page from only showing one table, index per range to many. It builds on
top of the changes in the catalog reader
(10b9ee0) and the range utilities
(109219d) to surface a set of tables,
indexes for each range.

The primary changes in this commit specifically are the modification of
the status server to use the new `rangeutil` utilities, and changing the
wire, presentation format of the information.

Epic: none
Fixes: cockroachdb#130997

Release note: changes the table, index contents of the hot ranges page
in DB console.
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 1, 2024
Previously each range correlated to a single table, or even a single
index in a database, so all that was required to identify which tables
and indexes were in the range was to look at the start key of the range
and map it accordingly.

With range coalescing however, it's possible for one, or many, tables,
indexes and the like to reside within the same range. To properly
identify the contents of a range, this PR adds the following utilities:

 1. A utility function which turns a range into a span, and clamps it
		to its tenant's table space.
 2. A utility function which takes the above spans and uses the catalog
		and new descriptor by span utility to turn those spans into a set of
		table descriptors ordered by id.
 3. A utility function which transforms those table descriptors into a
		set of (database, table, index) names which deduplicate and identify
		each index uniquely.
 4. A utility function, which merges the ranges and indexes into a map
		keyed by RangeID whose values are the above index names.
 5. A primary entrypoint for consumers from which a set of ranges can be
		passed in and a mapping from those ranges to indexes can be
		returned.

A variety of caveats come with this approach. It attempts to scan the
descriptors all at once, but it will still scan a sizable portion of the
descriptors table if the request is large enough. It makes no attempt
to describe system information which does not have a descriptor. It will
describe system tables which appear in the descriptors table, but it
will not try to explain "tables" which do not have descriptors (for
example tsdb), or any other information stored in the keyspace without a
descriptor (PseudoTableIDs, GossipKeys for example).

Throughout this work, many existing utilities were duplicated, and then
un-duplicated (`keys.TableDataMin`, `roachpb.Span.Overlap`, etc). If you
see anything that seems to already exist, feel free to point it out
accordingly.

Epic: none
Fixes: cockroachdb#130997

Release note: None
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 1, 2024
Historically, to estimate which tables and indexes existed within the
span of a range, we used the Collection's `GetAllDescriptors` function
and returned the values which fell in between the start and end keys of
the range's span. This approach had the benefit of being precise, but
the drawback of being computationally expensive - since the system can
have hundreds of nodes and theoretically millions of tables, using
`GetAllDescriptors` begins to become a bottleneck.

This approach was modified so that instead of pulling all descriptors
for the system when computing range contents, the system instead used
the start key of the range to identify the exact table + index at that
point within the keyspace
([change](https://github.com/cockroachdb/cockroach/pull/77277/files)).
This works if each table or index has its own range, but quickly breaks
down if tables share a range. Consider the following layout of data:

```
T1=Table 1
T2=Table 2
T3=Table 3
R1=Range 1

└─────T1─────┴─────T2────┴─────T3─────┘
└────────┴─────────R1───────┴─────────┘
```

Since the start key of the range falls within Table 1, the system
associates the range with only Table 1, despite it also containing
Tables 2 and 3.

Given this, it becomes necessary to identify the set of descriptors
within a given span. This PR introduces the `ScanDescriptorsInSpans`
function, which does just that: it allows the caller to specify a set of
spans whose descriptors are of interest, and returns a catalog including
those descriptors.

It does this by translating the span keys into descriptor span keys and
scanning them from the descriptors table. For example, given a span
`[/Table/Users/PKEY/1, /Table/Users/SECONDARY/chicago]` where the ID for
the Users table is `5`, it will generate a descriptor span
`[/Table/Descriptors/PKEY/5, /Table/Descriptors/PKEY/6]`.
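
As a rough illustration of that translation, the sketch below maps a data span to a range of descriptor IDs. It assumes the table IDs have already been decoded from the span's start and end keys; `tableSpan`, `descIDSpan`, and `toDescriptorSpan` are hypothetical names, not the functions added in this PR.

```go
package main

import "fmt"

// tableSpan is a toy stand-in for a key span. Only the table IDs
// decoded from its two endpoints matter for this translation.
type tableSpan struct {
	startTableID uint32 // table ID decoded from the span's start key
	endTableID   uint32 // table ID decoded from the span's end key
}

// descIDSpan is a half-open range [startID, endID) of descriptor IDs
// to scan from the descriptors table.
type descIDSpan struct{ startID, endID uint32 }

// toDescriptorSpan widens a data span to whole tables: every table ID
// from the start table through the end table is included. This is
// deliberately conservative and may include a table that the span
// touches only at its exclusive end key.
func toDescriptorSpan(s tableSpan) descIDSpan {
	return descIDSpan{startID: s.startTableID, endID: s.endTableID + 1}
}

func main() {
	// A span that starts and ends inside the Users table (ID 5).
	fmt.Println(toDescriptorSpan(tableSpan{startTableID: 5, endTableID: 5})) // {5 6}

	// A coalesced span covering tables 5 through 7.
	fmt.Println(toDescriptorSpan(tableSpan{startTableID: 5, endTableID: 7})) // {5 8}
}
```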

This approach also comes with a drawback: within the
descriptor space, keys are scoped by table, not by index.
That means in a following PR, the status server will be responsible for
taking these descriptors, which include all indexes in the tables
pulled, and filtering them down to only the indexes which appear in the
specified range.
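
The filtering step described above might look something like the following. This is a simplified sketch under the assumption that keys can be modeled as `(tableID, indexID)` prefixes, so row-level key suffixes are ignored; the `key` and `index` types and `indexesInRange` are hypothetical.

```go
package main

import "fmt"

// key is a toy key prefix: a table ID followed by an index ID.
type key struct{ tableID, indexID uint32 }

// less orders keys by table ID first, then by index ID.
func (a key) less(b key) bool {
	if a.tableID != b.tableID {
		return a.tableID < b.tableID
	}
	return a.indexID < b.indexID
}

// index identifies one index of one table.
type index struct {
	tableID, indexID uint32
	name             string
}

// indexesInRange keeps only the indexes whose key space
// [(table, index), (table, index+1)) overlaps the range [start, end).
func indexesInRange(all []index, start, end key) []index {
	var out []index
	for _, ix := range all {
		lo := key{ix.tableID, ix.indexID}
		hi := key{ix.tableID, ix.indexID + 1}
		if lo.less(end) && start.less(hi) {
			out = append(out, ix)
		}
	}
	return out
}

func main() {
	// The descriptor scan returns every index of table 5.
	all := []index{{5, 1, "users_pkey"}, {5, 2, "users_city_idx"}, {5, 3, "users_name_idx"}}

	// The range only covers index 2 of table 5.
	for _, ix := range indexesInRange(all, key{5, 2}, key{5, 3}) {
		fmt.Println(ix.name) // users_city_idx
	}
}
```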

The bulk of the changeset updates the datadriven tests to cover
this behavior; the primary area of focus for review should be the
`pkg/sql/catalog/internal/catkv/catalog_reader.go` file (~75 LOC).

Epic: CRDB-35928
Fixes: cockroachdb#130997

Release note: None
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 1, 2024
Previously, each range corresponded to a single table, or even a single
index in a database, so all that was required to identify which tables and
indexes were in the range was to look at the start key of the range and
map it accordingly.

With range coalescing, however, it's possible for one or many tables,
indexes, and the like to reside within the same range. To properly
identify the contents of a range, this PR adds the following utilities:

 1. A utility function which turns a range into a span, and clamps it
    to its tenant's table space.
 2. A utility function which takes the above spans and uses the catalog
    and new descriptor-by-span utility to turn those spans into a set of
    table descriptors ordered by ID.
 3. A utility function which transforms those table descriptors into a
    set of (database, table, index) names which deduplicate and identify
    each index uniquely.
 4. A utility function which merges the ranges and indexes into a map
    keyed by RangeID whose values are the above index names.
 5. A primary entrypoint for consumers from which a set of ranges can be
    passed in and a mapping from those ranges to indexes can be
    returned.
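
The steps above can be sketched end to end on a toy integer keyspace. Everything here is an assumption made for illustration: the `span`, `rng`, and `indexEntry` types, the `clamp` and `indexesByRange` helpers, and the table-space bounds are hypothetical and do not mirror the actual utilities in this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// span is a half-open interval [start, end) over a toy integer keyspace.
type span struct{ start, end int }

// rng is a range identified by an ID and the span of keys it holds.
type rng struct {
	id   int
	span span
}

// indexEntry pairs a fully qualified "db.table.index" name with its key span.
type indexEntry struct {
	name string
	span span
}

// clamp restricts s to bounds and reports whether anything remains.
func clamp(s, bounds span) (span, bool) {
	if s.start < bounds.start {
		s.start = bounds.start
	}
	if s.end > bounds.end {
		s.end = bounds.end
	}
	return s, s.start < s.end
}

// indexesByRange maps each range ID to the sorted, deduplicated names
// of the indexes that overlap the range once it is clamped to the table space.
func indexesByRange(ranges []rng, indexes []indexEntry, tableSpace span) map[int][]string {
	out := make(map[int][]string, len(ranges))
	for _, r := range ranges {
		out[r.id] = nil // ranges outside the table space map to no indexes
		s, ok := clamp(r.span, tableSpace)
		if !ok {
			continue
		}
		seen := map[string]bool{}
		for _, ix := range indexes {
			if ix.span.start < s.end && s.start < ix.span.end && !seen[ix.name] {
				seen[ix.name] = true
				out[r.id] = append(out[r.id], ix.name)
			}
		}
		sort.Strings(out[r.id])
	}
	return out
}

func main() {
	tableSpace := span{100, 1000}
	indexes := []indexEntry{
		{"db.users.pk", span{100, 200}},
		{"db.users.by_city", span{200, 300}},
		{"db.orders.pk", span{300, 400}},
	}
	ranges := []rng{{1, span{0, 50}}, {2, span{150, 350}}}

	m := indexesByRange(ranges, indexes, tableSpace)
	fmt.Println(len(m[1])) // 0
	fmt.Println(m[2])      // [db.orders.pk db.users.by_city db.users.pk]
}
```

Range 1 lies entirely below the table space, so it maps to no indexes, which matches the caveat that keyspace without descriptors is not described.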

A variety of caveats come with this approach. It attempts to scan the
descriptors all at once, but it will still scan a sizable portion of the
descriptors table if the request is large enough. This makes no attempt
to describe system information which does not have a descriptor. It will
describe system tables which appear in the descriptors table, but it
will not try to explain "tables" which do not have descriptors (for
example, tsdb), or any other information stored in the keyspace without a
descriptor (for example, PseudoTableIDs and GossipKeys).

Throughout this work, many existing utilities were duplicated, and then
un-duplicated (`keys.TableDataMin`, `roachpb.Span.Overlap`, etc). If you
see anything that seems to already exist, feel free to point it out
accordingly.

Epic: CRDB-35928
Fixes: cockroachdb#130997

Release note: None
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 1, 2024
This change is the last in a set of commits that change the hot ranges
page from showing only one table and index per range to showing many. It
builds on the changes in the catalog reader (10b9ee0) and the range
utilities (109219d) to surface a set of tables and indexes for each range.

The primary changes in this commit are the modification of
the status server to use the new `rangeutil` utilities, and changes to the
wire and presentation formats of the information.

Epic: CRDB-35928
Fixes: cockroachdb#130997

Release note (bug fix): changes the table, index contents of the hot ranges page
in DB console.
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 1, 2024
This change is the last in a set of commits that change the hot ranges
page from showing only one table and index per range to showing many. It
builds on the changes in the catalog reader (10b9ee0) and the range
utilities (109219d) to surface a set of tables and indexes for each range.

The primary changes in this commit are the modification of
the status server to use the new `rangeutil` utilities, and changes to the
wire and presentation formats of the information.

Epic: CRDB-43151
Fixes: cockroachdb#130997

Release note (bug fix): changes the table, index contents of the hot ranges page
in DB console.
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 1, 2024
Epic: CRDB-43151
Fixes: cockroachdb#130997

Release note: None
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 4, 2024
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 5, 2024
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 5, 2024
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 5, 2024
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 5, 2024
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 5, 2024
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 6, 2024
craig bot pushed a commit that referenced this issue Nov 8, 2024
133190: sql: using catalog reader, allow descriptor scanning by span r=angles-n-daemons a=angles-n-daemons

Summary: new `catalogReader.GetDescriptorsInSpans(..., spans []roachpb.Span)` function which acts as named.

-----------
sql: using catalog reader, allow descriptor scanning by span

Historically, to estimate which tables and indexes existed within the
span of a range, we used the Collection's `GetAllDescriptors` function
and returned the values which fell in between the start and end keys of
the range's span. This approach had the benefit of being precise, but
the drawback of being computationally expensive: since the system can
have hundreds of nodes and theoretically millions of tables, using
`GetAllDescriptors` becomes a bottleneck.

This approach was modified so that, instead of pulling all descriptors
for the system when computing range contents, the system used
the start key of the range to identify the exact table and index at that
point within the keyspace
([change](https://github.com/cockroachdb/cockroach/pull/77277/files)).
This works if each table or index has its own range, but quickly breaks
down if tables share a range. Consider the following layout of data:

```
T1=Table 1
T2=Table 2
T3=Table 3
R1=Range 1

└─────T1─────┴─────T2────┴─────T3─────┘
└────────┴─────────R1───────┴─────────┘
```

Since the start key of the range falls within Table 1, the system
associates the range with only Table 1, even though the range also
contains data from Tables 2 and 3.

Given this limitation, it becomes necessary to identify the set of
descriptors within a certain span. This PR introduces the
`ScanDescriptorsInSpans` function, which does just that: it allows the user
to specify a set of spans whose descriptors are important and
returns a catalog including those descriptors.

It does this by translating the span keys into descriptor span keys and
scanning them from the descriptors table. For example, given a span
`[/Table/Users/PKEY/1, /Table/Users/SECONDARY/chicago]` where the ID for
the Users table is `5`, it will generate a descriptor span
`[/Table/Descriptors/PKEY/5, /Table/Descriptors/PKEY/6]`.

This approach also comes with a drawback: within the
descriptor space, keys are scoped by table, not by index.
That means in a following PR, the status server will be responsible for
taking these descriptors, which include all indexes in the tables
pulled, and filtering them down to only the indexes which appear in the
specified range.

The bulk of the changeset updates the datadriven tests to cover
this behavior; the primary area of focus for review should be the
`pkg/sql/catalog/internal/catkv/catalog_reader.go` file (~75 LOC).

Epic: CRDB-43151
Fixes: #130997

Release note: None

Co-authored-by: Brian Dillmann <brian.dillmann@cockroachlabs.com>
@craig craig bot closed this as completed in 7bd74a1 Nov 8, 2024
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 8, 2024
Epic: none
Fixes: cockroachdb#130997

Release note: None
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 8, 2024
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 8, 2024
craig bot pushed a commit that referenced this issue Nov 11, 2024
133840: apiutil, roachpb: create utilities to map descriptors to ranges r=angles-n-daemons a=angles-n-daemons

apiutil, roachpb: create utilities to map descriptors to ranges

Previously, each range corresponded to a single table, or even a single
index in a database, so all that was required to identify which tables and
indexes were in the range was to look at the start key of the range and
map it accordingly.

With range coalescing, however, it's possible for one or many tables,
indexes, and the like to reside within the same range. To properly
identify the contents of a range, this PR adds the following utilities:

 1. A utility function which turns a range into a span, and clamps it
    to its tenant's table space.
 2. A utility function which takes the above spans and uses the catalog
    and new descriptor-by-span utility to turn those spans into a set of
    table descriptors ordered by ID.
 3. A utility function which transforms those table descriptors into a
    set of (database, table, index) names which deduplicate and identify
    each index uniquely.
 4. A utility function which merges the ranges and indexes into a map
    keyed by RangeID whose values are the above index names.
 5. A primary entrypoint for consumers from which a set of ranges can be
    passed in and a mapping from those ranges to indexes can be
    returned.

A variety of caveats come with this approach. It attempts to scan the
descriptors all at once, but it will still scan a sizable portion of the
descriptors table if the request is large enough. This makes no attempt
to describe system information which does not have a descriptor. It will
describe system tables which appear in the descriptors table, but it
will not try to explain "tables" which do not have descriptors (for
example, tsdb), or any other information stored in the keyspace without a
descriptor (for example, PseudoTableIDs and GossipKeys).

Throughout this work, many existing utilities were duplicated, and then
un-duplicated (`keys.TableDataMin`, `roachpb.Span.Overlap`, etc). If you
see anything that seems to already exist, feel free to point it out
accordingly.

Epic: CRDB-43151
Fixes: #130997

Release note: None

134748: storage: propagate KeySchemas to debug tools r=RaduBerinde a=jbowens

Ensure the `cockroach debug pebble ...` debug CLI tools can understand sstables with columnar blocks by propagating the CockroachDB KeySchema to the tool.

Epic: none
Release note: none

134770: sql/tests: include error with full stack trace in RSG tests r=rafiss a=rafiss

informs #133913
informs #134280
informs #133510
informs #134752
Release note: None

Co-authored-by: Brian Dillmann <brian.dillmann@cockroachlabs.com>
Co-authored-by: Jackson Owens <jackson@cockroachlabs.com>
Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
@exalate-issue-sync exalate-issue-sync bot reopened this Nov 11, 2024
@craig craig bot closed this as completed in 37f422d Nov 11, 2024
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 11, 2024
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 11, 2024
craig bot pushed a commit that referenced this issue Nov 11, 2024
134106: ui, server: modify hot ranges api and table to use new contents approx. r=angles-n-daemons a=angles-n-daemons

ui, server: modify hot ranges api and table to use new contents approx.

This change is the last in a set of commits that change the hot ranges
page from showing only one table and index per range to showing many. It
builds on the changes in the catalog reader (10b9ee0) and the range
utilities (109219d) to surface a set of tables and indexes for each range.

The primary changes in this commit are the modification of
the status server to use the new `rangeutil` utilities, and changes to the
wire and presentation formats of the information.

![image](https://github.com/user-attachments/assets/c5e0a175-7940-4051-b4de-c9ba9c492951)

Epic: CRDB-43151
Fixes: #130997

Release note: changes the table, index contents of the hot ranges page in DB console.


134868: licenses: remove unneeded notice file r=rail a=jlinder

Part of CRDB-43871

Release note: None

Co-authored-by: Brian Dillmann <brian.dillmann@cockroachlabs.com>
Co-authored-by: James H. Linder <jamesl@cockroachlabs.com>
craig bot pushed a commit that referenced this issue Nov 11, 2024
134106: ui, server: modify hot ranges api and table to use new contents approx. r=angles-n-daemons a=angles-n-daemons

134597: vecstore: add the vecstore package r=mw5h a=andy-kimball

The vecstore package defines a Store interface that abstracts away the interactions that a vector index has with the component storing the vectors. This PR contains a simple implementation of Store that stores the vectors in memory. In the future, we'll have an implementation that stores the vectors in CRDB. The Store interface supports batching of important operations like Search so that in the future we can push such operations closer to the data. In addition, Store methods take a transaction so that the vector index can perform complex multi-step operations like split or merge with transactional guarantees.

Epic: CRDB-42943

Release note: None

Co-authored-by: Brian Dillmann <brian.dillmann@cockroachlabs.com>
Co-authored-by: Andrew Kimball <andyk@cockroachlabs.com>
craig bot pushed a commit that referenced this issue Nov 11, 2024
134106: ui, server: modify hot ranges api and table to use new contents approx. r=angles-n-daemons a=angles-n-daemons

Co-authored-by: Brian Dillmann <brian.dillmann@cockroachlabs.com>
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 12, 2024
craig bot pushed a commit that referenced this issue Nov 12, 2024
133879: schemachanger: refactor subzone config elements r=annrpom a=annrpom

### config/zonepb: add subzoneSpan-related methods
This patch adds subzoneSpan-related methods that aid in configuring
subzone zcs in the declarative schemachanger.

Epic: none

Release note: None

---

### schemachanger: extend Catalog interface to support subzone writes
This patch extends the `Catalog` interface to support getting and
writing a zone config. This is necessary for us to be able to get
the parent zone config, apply all subzone changes, and then write
to kv in `immediateState`.

Epic: none

Release note: None

---

### schemachanger: refactor subzone config elements
This patch refactors how subzoneSpans are handled in the declarative schema changer. We now filter out subzoneSpans that do not relate to the index/partition zone config element it is nested within. This makes discarding these subzone elements feasible.

In order to support zone config changes where the target's subzone spans -- along with other targets' subzone spans -- are modified, we make those changes a side-effect. That is:

```
...
ALTER PARTITION default OF TABLE db.person CONFIGURE ZONE USING gc.ttlseconds = 1;
-> PartitionZoneConfig{default}

ALTER PARTITION australia OF TABLE db.person CONFIGURE ZONE USING gc.ttlseconds = 2;
-> PartitionZoneConfig{default}
and
-> PartitionZoneConfig{australia}
```

Informs: #133158 
Epic: none

Release note: None

---

### schemachanger: refactor add/drop zc elements
This patch separates marking zc elements as dropped from
the add direction in the builder as subzone configs will have
a mixed state of drop subzone + add side effects.

Epic: none

Release note: None

134106: ui, server: modify hot ranges api and table to use new contents approx. r=angles-n-daemons a=angles-n-daemons

ui, server: modify hot ranges api and table to use new contents approx.

This change is the last in a set of commits to change the hot ranges
page from only showing one table, index per range to many. It builds on
top of the changes in the catalog reader
(10b9ee0) and the range utilities
(109219d) to surface a set of tables,
indexes for each range.

The primary changes in this commit specifically are the modification of
the status server to use the new `rangeutil` utilities, and changing the
wire, presentation format of the information.

![image](https://github.com/user-attachments/assets/c5e0a175-7940-4051-b4de-c9ba9c492951)

Epic: CRDB-43151
Fixes: #130997

Release note: changes the table, index contents of the hot ranges page in DB console.


Co-authored-by: Annie Pompa <annie@cockroachlabs.com>
Co-authored-by: Brian Dillmann <brian.dillmann@cockroachlabs.com>
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 12, 2024
Historically, to estimate which tables and indexes existed within the
span of a range, we used the Collection's `GetAllDescriptors` function
and returned the values which fell in between the start and end keys of
the range's span. This approach had the benefit of being precise, but
the drawback of being computationally expensive - since the system can
have hundreds of nodes and theoretically millions of tables, using
`GetAllDescriptors` begins to become a bottleneck.

This approach was modified so that instead of pulling all descriptors
for the system when computing range contents, the system instead used
the start key of the range to identify the exact table + index at that
point within the keyspace
([change](https://github.com/cockroachdb/cockroach/pull/77277/files)).
This works if each table or index has its own range, but quickly breaks
down if tables share a range. Consider the following layout of data:

```
T1=Table 1
T2=Table 2
T3=Table 3
R1=Range 1

└─────T1─────┴─────T2────┴─────T3─────┘
└────────┴─────────R1───────┴─────────┘
```

Since the start key of the range falls within Table 1, the system
associates the range with only Table 1, despite it also containing
Tables 2 and 3.
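
The difference between the two lookups can be sketched as follows. This is a minimal illustration, not the code in this PR: it assumes an integer keyspace instead of byte-string keys, and the `span`, `tableSpan`, `tableAtStartKey` and `tablesOverlapping` names are hypothetical.

```go
package main

import "fmt"

// span is a half-open interval [start, end) over a simplified integer
// keyspace. Real keys are byte strings; integers are an assumption made
// here for brevity.
type span struct{ start, end int }

// tableSpan is a hypothetical pairing of a table name with the keys it owns.
type tableSpan struct {
	name string
	span span
}

// tableAtStartKey mimics the start-key-only lookup: it returns the single
// table whose span contains the start key of the range.
func tableAtStartKey(r span, tables []tableSpan) string {
	for _, t := range tables {
		if r.start >= t.span.start && r.start < t.span.end {
			return t.name
		}
	}
	return ""
}

// tablesOverlapping returns every table whose span intersects the range.
func tablesOverlapping(r span, tables []tableSpan) []string {
	var out []string
	for _, t := range tables {
		if r.start < t.span.end && t.span.start < r.end {
			out = append(out, t.name)
		}
	}
	return out
}

func main() {
	tables := []tableSpan{
		{"T1", span{0, 10}},
		{"T2", span{10, 20}},
		{"T3", span{20, 30}},
	}
	r1 := span{7, 22} // starts inside T1, ends inside T3

	fmt.Println(tableAtStartKey(r1, tables))   // only the first table
	fmt.Println(tablesOverlapping(r1, tables)) // all three tables
}
```

The overlap test is the usual half-open interval check, so a range that ends exactly where a table begins is not attributed to that table.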

Given this, it becomes necessary to identify the set of
descriptors within a certain span. This PR introduces the
`ScanDescriptorsInSpans` function, which does just that: it lets the
caller specify a set of spans whose descriptors are important and
returns a catalog including those descriptors.

It does this by translating the span keys into descriptor span keys and
scanning them from the descriptors table. For example, given a span
`[/Table/Users/PKEY/1, /Table/Users/SECONDARY/chicago]` where the ID for
the Users table is `5`, it will generate a descriptor span
`[/Table/Descriptors/PKEY/5, /Table/Descriptors/PKEY/6]`.
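
As a rough sketch of that translation, assuming the data span has already been reduced to the table IDs of its start and end keys (the `tableSpan`, `descriptorSpan` and `toDescriptorSpan` names are hypothetical and are not the identifiers used in `catalog_reader.go`):

```go
package main

import "fmt"

// tableSpan is a data span reduced to the table IDs that its start and end
// keys belong to. This reduction is an assumption of the sketch.
type tableSpan struct{ startTableID, endTableID uint32 }

// descriptorSpan is a span over the primary key of the descriptors table,
// covering descriptor IDs from startID up to (but excluding) endID.
type descriptorSpan struct{ startID, endID uint32 }

// toDescriptorSpan widens the table IDs touched by the data span into a
// span over descriptor IDs. The end is bumped by one so that the
// descriptor of the last table is included in the scan.
func toDescriptorSpan(s tableSpan) descriptorSpan {
	return descriptorSpan{startID: s.startTableID, endID: s.endTableID + 1}
}

// String renders the span in the notation used in the commit message.
func (d descriptorSpan) String() string {
	return fmt.Sprintf("[/Table/Descriptors/PKEY/%d, /Table/Descriptors/PKEY/%d]",
		d.startID, d.endID)
}

func main() {
	// Both ends of the example span fall inside the Users table (ID 5).
	users := tableSpan{startTableID: 5, endTableID: 5}
	fmt.Println(toDescriptorSpan(users))
}
```

Note that this sketch may scan one descriptor too many when the end key of the data span sits exactly on a table boundary; how the real implementation treats that case is not stated here.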

This approach comes with its own drawback: within the
descriptor space, keys are scoped by table, not by index.
That means in a following PR, the status server will be responsible for
taking these descriptors, which include all indexes in the tables
pulled, and filtering them down to only the indexes which appear in the
specified range.
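
A minimal sketch of that filtering step, under the assumption that the range bounds have been reduced to the (table ID, index ID) prefixes of their start and end keys. All names here (`indexRef`, `indexPrefix`, `filterIndexes`, and the `TERTIARY` index) are hypothetical:

```go
package main

import "fmt"

// indexRef is a hypothetical reference to one index of a scanned table.
type indexRef struct {
	tableID, indexID uint32
	name             string
}

// indexPrefix identifies the (table, index) prefix that a key falls under.
type indexPrefix struct{ tableID, indexID uint32 }

// lessOrEqual orders prefixes by table ID first, then by index ID.
func (p indexPrefix) lessOrEqual(o indexPrefix) bool {
	if p.tableID != o.tableID {
		return p.tableID < o.tableID
	}
	return p.indexID <= o.indexID
}

// filterIndexes keeps only the indexes whose prefix lies between the
// prefixes of the range's start and end keys, both treated as inclusive.
func filterIndexes(all []indexRef, start, end indexPrefix) []indexRef {
	var kept []indexRef
	for _, ix := range all {
		p := indexPrefix{ix.tableID, ix.indexID}
		if start.lessOrEqual(p) && p.lessOrEqual(end) {
			kept = append(kept, ix)
		}
	}
	return kept
}

func main() {
	// The descriptor scan returns every index of the Users table (ID 5).
	users := []indexRef{
		{5, 1, "users.PKEY"},
		{5, 2, "users.SECONDARY"},
		{5, 3, "users.TERTIARY"},
	}
	// The range starts in the primary index and ends in the secondary one.
	for _, ix := range filterIndexes(users, indexPrefix{5, 1}, indexPrefix{5, 2}) {
		fmt.Println(ix.name)
	}
}
```

Treating the end prefix as inclusive keeps the sketch short, at the cost of over-reporting one index when the range ends exactly at an index boundary.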

The bulk of the changeset is in updating the datadriven tests to test
this behavior. The primary area of focus for review should be the
`pkg/sql/catalog/internal/catkv/catalog_reader.go` file (~75 LOC).

Epic: CRDB-43151
Fixes: cockroachdb#130997

Release note: None
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 12, 2024
Previously each range correlated to a single table, or even a single
index in a database, so all that was required to identify which tables
and indexes were in the range was to look at the start key of the range
and map it accordingly.

With range coalescing however, it's possible for one, or many, tables,
indexes and the like to reside within the same range. To properly
identify the contents of a range, this PR adds the following utilities:

 1. A utility function which turns a range into a span, and clamps it
    to its tenant's table space.
 2. A utility function which takes the above spans and uses the catalog
    and new descriptor by span utility to turn those spans into a set of
    table descriptors ordered by id.
 3. A utility function which transforms those table descriptors into a
    set of (database, table, index) names which deduplicate and identify
    each index uniquely.
 4. A utility function, which merges the ranges and indexes into a map
    keyed by RangeID whose values are the above index names.
 5. A primary entrypoint for consumers from which a set of ranges can be
    passed in and a mapping from those ranges to indexes can be
    returned.
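
Taken together, the steps above might be sketched roughly as below. This is an illustration under simplifying assumptions (integer keys, index spans already known, no catalog lookup), and none of the names are the actual `rangeutil` identifiers:

```go
package main

import (
	"fmt"
	"sort"
)

// span is a half-open interval [start, end) over a simplified integer keyspace.
type span struct{ start, end int }

// rangeDesc is a hypothetical stand-in for a range descriptor.
type rangeDesc struct {
	id   int
	span span
}

// index is a hypothetical (database, table, index) triple plus the keys it covers.
type index struct {
	database, table, name string
	span                  span
}

// clamp restricts s to bounds and reports whether anything is left.
func clamp(s, bounds span) (span, bool) {
	if s.start < bounds.start {
		s.start = bounds.start
	}
	if s.end > bounds.end {
		s.end = bounds.end
	}
	return s, s.start < s.end
}

// indexesByRange maps each range ID to the deduplicated, sorted names of
// the indexes that overlap the range once it is clamped to the table space.
func indexesByRange(ranges []rangeDesc, indexes []index, tableSpace span) map[int][]string {
	out := make(map[int][]string, len(ranges))
	for _, r := range ranges {
		clamped, ok := clamp(r.span, tableSpace)
		if !ok {
			out[r.id] = nil // range holds no table data
			continue
		}
		seen := map[string]bool{}
		var names []string
		for _, ix := range indexes {
			if clamped.start < ix.span.end && ix.span.start < clamped.end {
				n := ix.database + "." + ix.table + "." + ix.name
				if !seen[n] {
					seen[n] = true
					names = append(names, n)
				}
			}
		}
		sort.Strings(names)
		out[r.id] = names
	}
	return out
}

func main() {
	indexes := []index{
		{"db", "t1", "primary", span{0, 10}},
		{"db", "t2", "primary", span{10, 20}},
		{"db", "t2", "by_city", span{20, 30}},
	}
	ranges := []rangeDesc{{1, span{-5, 15}}, {2, span{15, 30}}}
	fmt.Println(indexesByRange(ranges, indexes, span{0, 100}))
}
```

Keying the result by range ID lets a consumer such as the status server look up the contents of each hot range without repeating the overlap computation.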

A variety of caveats come with this approach. It attempts to scan the
descriptors all at once, but it still will scan a sizable portion of the
descriptors table if the request is large enough. This makes no attempt
to describe system information which does not have a descriptor. It will
describe system tables which appear in the descriptors table, but it
will not try to explain "tables" which do not have descriptors (example
tsdb), or any other information stored in the keyspace without a
descriptor (PseudoTableIDs, GossipKeys for example).

Throughout this work, many existing utilities were duplicated, and then
un-duplicated (`keys.TableDataMin`, `roachpb.Span.Overlap`, etc). If you
see anything that seems to already exist, feel free to point it out
accordingly.

Epic: CRDB-43151
Fixes: cockroachdb#130997

Release note: None
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this issue Nov 12, 2024
Previously each range correlated to a single table, or even a single
index in a database, so all that was required to identify which tables
and indexes were in the range was to look at the start key of the range
and map it accordingly.

With range coalescing however, it's possible for one, or many, tables,
indexes and the like to reside within the same range. To properly
identify the contents of a range, this PR adds the following utilities:

 1. A utility function which turns a range into a span, and clamps it
    to its tenant's table space.
 2. A utility function which takes the above spans and uses the catalog
    and new descriptor by span utility to turn those spans into a set of
    table descriptors ordered by id.
 3. A utility function which transforms those table descriptors into a
    set of (database, table, index) names which deduplicate and identify
    each index uniquely.
 4. A utility function, which merges the ranges and indexes into a map
    keyed by RangeID whose values are the above index names.
 5. A primary entrypoint for consumers from which a set of ranges can be
    passed in and a mapping from those ranges to indexes can be
    returned.

A variety of caveats come with this approach. It attempts to scan the
descriptors all at once, but it still will scan a sizable portion of the
descriptors table if the request is large enough. This makes no attempt
to describe system information which does not have a descriptor. It will
describe system tables which appear in the descriptors table, but it
will not try to explain "tables" which do not have descriptors (example
tsdb), or any other information stored in the keyspace without a
descriptor (PseudoTableIDs, GossipKeys for example).

Throughout this work, many existing utilities were duplicated, and then
un-duplicated (`keys.TableDataMin`, `roachpb.Span.Overlap`, etc). If you
see anything that seems to already exist, feel free to point it out
accordingly.

Epic: none
Fixes: cockroachdb#130997

Release note: None