Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow unowned lookup_unique vindex columns to be null #9302

Conversation

arthurschreiber
Copy link
Contributor

@arthurschreiber arthurschreiber commented Nov 30, 2021

Description

We're running into an issue when inserting data into a table that uses a secondary, unowned lookup_unique VIndex.

When inserting data in such a table with a NULL value for the VIndex column, Vitess returns an error:

value must be supplied for column <column name>

We believe that stems from a bug in the code that processes unowned VIndexes for INSERT operations, as we don't see a reason why a value for these columns would be required.

Example

Let's say we have two tables, issues and pull_requests. An issues row can optionally be associated with a pull_requests row via the issues.pull_request_id column. Both issues and pull_requests are sharded by repository_id. An issue can only be associated to a pull request with the same repository_id.

SQL Schema for sharded keyspace
CREATE TABLE `pull_requests_id_keyspace_idx` (
  `id` bigint(20) unsigned NOT NULL,
  `keyspace_id` varbinary(10) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

CREATE TABLE `issues_id_keyspace_idx` (
  `id` bigint(20) unsigned NOT NULL,
  `keyspace_id` varbinary(10) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

CREATE TABLE `pull_requests` (
  `id` bigint(20) unsigned NOT NULL,
  `repository_id` bigint(20) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  KEY `index_pull_requests_on_repository_id` (`repository_id`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

CREATE TABLE `issues` (
  `id` bigint(20) unsigned NOT NULL,
  `repository_id` bigint(20) unsigned NOT NULL,
  `pull_request_id` bigint(20) unsigned DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `index_issues_on_pull_request_id` (`pull_request_id`),
  KEY `index_issues_on_repository_id` (`repository_id`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
VSchema for sharded keyspace
{
  "sharded": true,
  "vindexes": {
    "hash": {
      "type": "hash"
    },
    "issues_id_keyspace_idx": {
      "type": "lookup_unique",
      "params": {
        "table": "issues_id_keyspace_idx",
        "from": "id",
        "to": "keyspace_id"
      },
      "owner": "issues"
    },
    "pull_requests_id_keyspace_idx": {
      "type": "lookup_unique",
      "params": {
        "table": "pull_requests_id_keyspace_idx",
        "from": "id",
        "to": "keyspace_id"
      },
      "owner": "pull_requests"
    },
  },
  "tables": {
    "issues_id_keyspace_idx": {
      "column_vindexes": [
        {
          "column": "id",
          "name": "hash"
        }
      ]
    },
    "issues": {
      "column_vindexes": [
        {
          "column": "repository_id",
          "name": "hash"
        },
        {
          "column": "id",
          "name": "issues_id_keyspace_idx"
        },
        {
          "column": "pull_request_id",
          "name": "pull_requests_id_keyspace_idx"
        }
      ]
    },
    "pull_requests_id_keyspace_idx": {
      "column_vindexes": [
        {
          "column": "id",
          "name": "hash"
        }
      ]
    },
    "pull_requests": {
      "column_vindexes": [
        {
          "column": "repository_id",
          "name": "hash"
        },
        {
          "column": "id",
          "name": "pull_requests_id_keyspace_idx"
        }
      ]
    }
  }
}

When we want to create a pull request, we first add a new pull_requests row, which will write a new entry to the pull_requests_id_keyspace_idx lookup table. Then we create an issues row with a pull_request_id value - this will cause a lookup to the pull_requests_id_keyspace_idx lookup table to ensure that the keyspace id matches with the keyspace id generated from the repository_id column.

But when we want to create a standalone issues row, where pull_requests_id is NULL, this will fail with the above mentioned error.

Other Considerations

This change is unrelated to the ignore_nulls option. ignore_nulls allows NULL values on the owner table of the lookup vindex, but this is not the case here. We don't want to allow NULL values on e.g. pull_requests.id on the owner table.

Adding another option to the lookup vindex for this functionality is not required - nullability of the column is fully handled by the underlying database schema (in the example above, the issues.pull_request_id column itself defines whether it can contain NULL values or not).

Writing a NULL value to an unowned, non-unique lookup Vindex (lookup/consistent_lookup) was possible even without the fix - due to the nature of the bug this only affected lookup_unique, consistent_lookup_unique and other non-reversible single column vindexes.

Related Issue(s)

N/A

Checklist

  • Should this PR be backported?
  • Tests were added or are not required
  • Documentation was added or is not required

@arthurschreiber arthurschreiber changed the title Allow unowned lookup vindex columns to be null Allow unowned lookup_unique vindex columns to be null Dec 2, 2021
@arthurschreiber arthurschreiber marked this pull request as ready for review December 2, 2021 10:06
@arthurschreiber arthurschreiber force-pushed the allow-null-in-unowned-lookup-index-columns branch from 5ee66b3 to 701bb7e Compare December 2, 2021 10:36
Comment on lines 598 to 599
// Verify keyspace ids from primary and secondary VIndexes match
if len(verifyIndexes) > 0 {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of Go is not great, but I believe the verifyKsids != nil check was always returning true, because verifyKsids is declared as an empty slice, not a nil slice. Using len(verifyIndexes) > 0 is more in line of the actual intention of this code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently using nil slices is preferred in Go, according to https://github.com/golang/go/wiki/CodeReviewComments#declaring-empty-slices. I'll make that change and push it up shortly.

Copy link
Member

@harshit-gangal harshit-gangal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes overall looks good.

Could you please add an end to end test for it here go/test/endtoend/vtgate/lookup_test.go ?

Comment on lines 1585 to 1587
require.EqualError(t, err, `value must be supplied for column [c3]`)
if err != nil {
t.Fatal(err)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:
require.NoError(t, err) instead of if condition

@arthurschreiber
Copy link
Contributor Author

@harshit-gangal Thank you for the review! ❤️ I'll try to find some time to make the requested change and add a few end to end test cases tomorrow! 👍

@arthurschreiber arthurschreiber force-pushed the allow-null-in-unowned-lookup-index-columns branch 2 times, most recently from 2091144 to c560ca4 Compare December 13, 2021 09:52
@arthurschreiber
Copy link
Contributor Author

@harshit-gangal I updated the pull request to also add an end to end test case. 🙇‍♂️

@arthurschreiber arthurschreiber force-pushed the allow-null-in-unowned-lookup-index-columns branch 2 times, most recently from 5b4a8ae to 3da8d3b Compare December 13, 2021 10:24
@mattlord
Copy link
Contributor

@arthurschreiber if you rebase on origin/main then the dependency issues will be fixed. Thank you for the contribution!

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
@arthurschreiber arthurschreiber force-pushed the allow-null-in-unowned-lookup-index-columns branch from 3da8d3b to f52da1e Compare December 14, 2021 21:31
@arthurschreiber
Copy link
Contributor Author

@mattlord CI has been quite flaky for me. Could you kick off that one failing build again? 🙇‍♂️

@harshit-gangal harshit-gangal merged commit aa3da1f into vitessio:main Dec 22, 2021
@arthurschreiber arthurschreiber deleted the allow-null-in-unowned-lookup-index-columns branch January 7, 2022 11:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants