actual_lrps internal_routes field does not get re-encrypted when rotating bbs_encryption_key #626

Closed
sleepychild opened this issue Jun 29, 2022 · 7 comments

@sleepychild

Summary

Rotation of bbs_encryption_key makes the internal_routes field of the actual_lrps table unreadable.

Steps to Reproduce

1. Push an application.
2. Rotate the bbs_encryption_key so that the key used to encrypt the app's internal_routes is no longer in the encryption_keys list in the BBS config.
3. Run cfdot actual-lrps and observe the following error.

Error: BBS error
Type 0: UnknownError
Message: Key with label "{label of the key used to encrypt the internal_routes field when the lrp was created}" was not found

Diego repo

It's in the current bbs release. Probably introduced when internal_routes were introduced to the actual_lrps table.
cloudfoundry/bbs@2cfb94f

Environment Details

diego-release ~= 2.61.0 is where the issue manifested on our deployments
It could have been introduced prior to that.

Possible Causes or Fixes (optional)

None that I can think of.

Additional Text Output, Screenshots, contextual information (optional)

From the diegodb we select the actual LRPs that look suspicious: those that have crashed 200 times, are still CRASHED, and have no cell_id. We order them from oldest to newest.

SELECT process_guid, since, internal_routes FROM actual_lrps WHERE cell_id='' AND state='CRASHED' AND crash_count=200 ORDER BY since ASC;

Note: The output has been modified to show the since field as human-readable times.

                               process_guid                                |            since           |                                  internal_routes                                   
---------------------------------------------------------------------------+----------------------------+------------------------------------------------------------------------------------
 ae89aca9-0ea8-4bf2-9ce5-67cb3c27b03a-03e07a2e-982a-4307-8273-36eb9897a287 | 2020-09-13 16:03:28.506457 | 
 c8999d5f-6a8b-4adc-b4b9-edc08e99c423-56a5cfcd-f897-4a41-83cd-5a4d7b63f3d2 | 2021-09-03 21:07:57.263618 | 
 2b11e173-0c0e-4cb0-97f6-7784666253de-9df7e555-85c8-4355-b061-c43d5b76b85e | 2021-10-09 02:55:27.335427 | 
 46b52952-53ff-4d83-abce-f10f3e6c53b2-cadea321-4d40-4af8-9322-50c7d9fe59d9 | 2021-11-27 00:09:14.887557 | 
 aaa1da98-c455-43d5-b1b2-fb6e8f020e4d-9de3a015-a299-4566-9634-e03eb6fc70f2 | 2021-11-28 01:06:39.908895 | 
 efc715e7-fe02-4402-ab2d-2f9500029861-112cae23-3b95-4e53-ab5a-249a00c6e7b1 | 2021-12-10 03:38:45.860073 | 
 a22f1a98-cd2c-47ed-8102-7abe45d70fd3-19e97ded-abe7-44cb-8f6c-456d2604cecc | 2022-01-08 01:36:24.592237 | 
 7f4537fd-66cf-4e92-ae1e-758015149250-edbf3038-1df4-4ae9-a4c7-625298d7d3c7 | 2022-01-23 18:28:13.988153 | 
 0cd19389-2965-488c-b3ba-e22870ae348f-9cd1eefe-b6b3-411b-9520-5ffb43ee424b | 2022-02-03 01:09:35.666100 | 
 f892a70c-c5ce-4c9b-acce-1abbb8cfb2d0-fa28762f-7596-453b-9931-890355cc7b5d | 2022-02-03 01:28:38.491541 | 
 fcbda808-70ef-4ea4-875b-c2f79884b218-deaf75be-3390-4037-b5ac-dbdc92bf0841 | 2022-03-18 20:52:32.211675 | 
 24e7458a-e478-429b-a0ed-911711f67e9f-f8058377-7871-4274-a266-8d0c4f9d708b | 2022-03-25 16:59:36.861837 | 
 9a72d6a9-9e55-4c39-a814-c881848c6454-3cc0b0d9-f41f-446e-93ea-2663e7c66de8 | 2022-03-27 16:50:29.306901 | 
 4874ab52-b0e5-493a-9475-972e9b794e3a-5ffa27f7-def5-4411-8d95-cf2c765abafb | 2022-03-30 13:48:35.714268 | 
 3a11917c-94c8-47c8-88f6-83a6b50390ff-255cafc3-6a78-4462-9ed5-95d44fa2b8fd | 2022-04-07 23:29:18.333547 | 
 885ed994-253a-4635-8f3b-eb285024cc8c-32c2b1f5-fe9a-48df-9782-b885f7ee2504 | 2022-04-13 18:59:02.870381 | 
 79b5d722-9c41-4ac4-b2ee-1c1c54d6362f-0ae9569d-f5aa-4382-944d-0ea606274ef3 | 2022-04-13 19:06:03.521516 | 
 def35092-f043-486e-852b-6fc75aa9efb1-d6d0c617-511a-4882-87bc-283d2dad30c8 | 2022-04-13 19:18:08.964568 | 
 bab207c6-c0c8-44d4-880f-6c9c05f8ba92-2e2e26ba-cb8e-45cc-81af-e2997262585c | 2022-04-13 19:35:05.970331 | 
 e5b217a1-3825-4cfa-b02b-6032d0984ada-58969920-d9ac-4a86-808f-37357c597d42 | 2022-04-13 19:45:41.102326 | 
 bc362781-d3b3-44fa-8b82-57a0d383ca14-32b2ac1b-f03c-4e86-985d-05e7ed2ff102 | 2022-04-13 19:58:12.648285 | 
 68cc1eb8-1c02-4e09-8e97-d18d8d13522e-8b6941ae-c158-4c6f-99ed-ebfcfc0cb5ce | 2022-04-13 20:10:43.393329 | 
 27c91377-edbc-4a41-bbf4-139dbea989ec-0059b083-a987-411c-8c44-54528a7e8594 | 2022-05-29 19:43:49.169786 | 02HWJic19lbmNyeXB0aW9uX2tleV8yMDIyVDA0Yl8wCoZGNQVczEfDkMKOxyQdCMeRFvtqpjFl0WILOCXC
 b98f8918-1c10-49ea-9280-c609f9784f19-dcbb183c-75f5-4792-9f3e-f29781dab80c | 2022-06-01 21:07:45.759119 | 02HWJic19lbmNyeXB0aW9uX2tleV8yMDIyVDA0Yl8wmLzwyPT5w+FrVXE93MWNx1NY3IiLCkBEMZpezD6y
 a58dee1c-d65a-444f-95e1-4b29f40052dc-15526680-dfed-46c5-bff0-829c425fe724 | 2022-06-09 20:15:38.286955 | 02HWJic19lbmNyeXB0aW9uX2tleV8yMDIyVDA1YV8wXQAPI41fklrora0FpbmTkdhhfkWnvvcyCUB7zqUk
 846e9941-19f7-48ab-929f-80fd6af6b2fc-912daed2-1d16-434c-a0fe-1b1547e7c336 | 2022-06-16 01:41:36.250757 | 02HWJic19lbmNyeXB0aW9uX2tleV8yMDIyVDA1YV8w7MnpTdSBfx/LFXLY91Zb2kFrHkLVSDYxExkwKdyR
 0b9f75a7-136f-4c70-868a-d253646aaee7-55692b04-91fc-44d7-90d9-373f6f512e30 | 2022-06-26 14:01:27.369856 | 02HWJic19lbmNyeXB0aW9uX2tleV8yMDIyVDA1YV8weYtedPrhsObnUODQlIisqqvRnwMTzlL7OCBHnIgi
 cfdf2117-d8ed-4d2c-a3d4-05af63b7b34c-7b7b04cd-8e46-4395-9bb8-9087bd664711 | 2022-06-27 13:51:36.207459 | 02HWJic19lbmNyeXB0aW9uX2tleV8yMDIyVDA1YV8w6+0kCUkhg1pucAoI3gWomjvyUA8F7KvHa0wmF2KS
 cfdf2117-d8ed-4d2c-a3d4-05af63b7b34c-7b7b04cd-8e46-4395-9bb8-9087bd664711 | 2022-06-27 13:51:36.254745 | 02HWJic19lbmNyeXB0aW9uX2tleV8yMDIyVDA1YV8w4vFjutioKVCsKIKAFp1IHA7mg2Nbsk5XEAgg1g9d
 cfdf2117-d8ed-4d2c-a3d4-05af63b7b34c-7b7b04cd-8e46-4395-9bb8-9087bd664711 | 2022-06-27 13:51:36.488093 | 02HWJic19lbmNyeXB0aW9uX2tleV8yMDIyVDA1YV8wnSS1nadxzAGXQxvjcuR6uKj8gQeiiJ9ybrJ5O7nA
(30 rows)
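
The timestamps in the table were converted by hand from the raw since column. Judging by the cfdot output in this issue, since looks like a Unix timestamp in nanoseconds (this is an inference from the data shown here, not from the BBS source). Under that assumption, a minimal Go sketch of the conversion could look like this; formatSince is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"time"
)

// formatSince renders a `since` value as a UTC timestamp with microsecond
// precision. Assumption: `since` is Unix time in nanoseconds.
func formatSince(since int64) string {
	return time.Unix(0, since).UTC().Format("2006-01-02 15:04:05.000000")
}

func main() {
	// Value taken from the cfdot output for process 68cc1eb8-... in this issue.
	fmt.Println(formatSince(1649880643393329116))
}
```

For the 68cc1eb8-... entry this yields 2022-04-13 20:10:43.393329, which matches the corresponding table row. Go truncates the fractional seconds instead of rounding them, so the last digit can differ from the table for other rows.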

For each entry we run cfdot actual-lrps -p {process_guid}. For brevity, only four results are shown: the entry just before the two broken ones, the two broken ones, and the entry just after them.

diego-cell/859ab2d2-9b6a-4aa2-8a2b-c143a9576796:/var/vcap/bosh_ssh/bosh_aad697f4924f492# cfdot actual-lrps -p 68cc1eb8-1c02-4e09-8e97-d18d8d13522e-8b6941ae-c158-4c6f-99ed-ebfcfc0cb5ce
{"process_guid":"68cc1eb8-1c02-4e09-8e97-d18d8d13522e-8b6941ae-c158-4c6f-99ed-ebfcfc0cb5ce","index":0,"domain":"cf-apps","instance_guid":"","cell_id":"","address":"","ports":null,"preferred_address":"UNKNOWN","crash_count":200,"crash_reason":"APP/PROC/WEB: Exited with status 1","state":"CRASHED","since":1649880643393329116,"modification_tag":{"epoch":"fcf272b5-8d11-4497-7c90-2bf4c842ee96","index":597},"presence":"ORDINARY"}
diego-cell/859ab2d2-9b6a-4aa2-8a2b-c143a9576796:/var/vcap/bosh_ssh/bosh_aad697f4924f492# cfdot actual-lrps -p 27c91377-edbc-4a41-bbf4-139dbea989ec-0059b083-a987-411c-8c44-54528a7e8594
Error: BBS error
Type 0: UnknownError
Message: Key with label "bbs_encryption_key_2022T04b_0" was not found
diego-cell/859ab2d2-9b6a-4aa2-8a2b-c143a9576796:/var/vcap/bosh_ssh/bosh_aad697f4924f492# cfdot actual-lrps -p b98f8918-1c10-49ea-9280-c609f9784f19-dcbb183c-75f5-4792-9f3e-f29781dab80c
Error: BBS error
Type 0: UnknownError
Message: Key with label "bbs_encryption_key_2022T04b_0" was not found
diego-cell/859ab2d2-9b6a-4aa2-8a2b-c143a9576796:/var/vcap/bosh_ssh/bosh_aad697f4924f492# cfdot actual-lrps -p a58dee1c-d65a-444f-95e1-4b29f40052dc-15526680-dfed-46c5-bff0-829c425fe724
{"process_guid":"a58dee1c-d65a-444f-95e1-4b29f40052dc-15526680-dfed-46c5-bff0-829c425fe724","index":0,"domain":"cf-apps","instance_guid":"","cell_id":"","address":"","ports":null,"preferred_address":"UNKNOWN","crash_count":200,"crash_reason":"APP/PROC/WEB: Exited with status 1","state":"CRASHED","since":1654805738286954714,"modification_tag":{"epoch":"47ab4452-ed34-49c8-5b7e-00c0ea0bf017","index":597},"presence":"ORDINARY"}

As we can see, the actual LRPs that fail are the ones old enough that the key originally used to encrypt their internal_routes has since been rotated out of the BBS config. The even older entries, whose internal_routes field is empty, still work, and so do the newer entries whose encryption key is still available in the BBS config.
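
The key label in the error message can also be read straight out of a stored internal_routes value. The sketch below assumes a layout inferred only from the rows in this issue, not from the BBS source: a two-character prefix "02", followed by standard base64 of one length byte, the key label, and then the remaining encrypted bytes. keyLabel is a hypothetical helper name.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
)

// keyLabel extracts the encryption key label from a stored column value.
// Assumed layout (inferred from the data in this issue):
//   "02" + base64([1 byte: label length][label][remaining encrypted bytes])
func keyLabel(stored string) (string, error) {
	if len(stored) < 2 || stored[:2] != "02" {
		return "", errors.New(`value does not start with the assumed "02" prefix`)
	}
	payload, err := base64.StdEncoding.DecodeString(stored[2:])
	if err != nil {
		return "", fmt.Errorf("decoding base64 payload: %w", err)
	}
	if len(payload) == 0 {
		return "", errors.New("empty payload")
	}
	n := int(payload[0]) // first byte holds the label length
	if len(payload) < 1+n {
		return "", errors.New("payload shorter than the declared label length")
	}
	return string(payload[1 : 1+n]), nil
}

func main() {
	// internal_routes value of process 27c91377-... from the query output.
	label, err := keyLabel("02HWJic19lbmNyeXB0aW9uX2tleV8yMDIyVDA0Yl8wCoZGNQVczEfDkMKOxyQdCMeRFvtqpjFl0WILOCXC")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(label)
}
```

For the 27c91377-... row this prints bbs_encryption_key_2022T04b_0, the label named in the error. The same helper applied to the a58dee1c-... row gives bbs_encryption_key_2022T05a_0, a key that is still configured, which is consistent with that entry being readable.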

@xavierW

xavierW commented Jun 29, 2022 via email

@klapkov
Contributor

klapkov commented Jun 30, 2022

Hello,
I think I found where the problem is. The internal_routes field was added to the ActualLRP table, but it seems it was not added to the PerformEncryption function, which re-encrypts the columns when the bbs_encryption_keys are rotated.

func (db *SQLDB) PerformEncryption(ctx context.Context, logger lager.Logger) error {
	errCh := make(chan error)

	funcs := []func(){
		func() {
			errCh <- db.reEncrypt(ctx, logger, encryptable{
				TableName:       tasksTable,
				PrimaryKeyNames: []string{"guid"},
				Columns:         []string{"task_definition"},
				EncryptIfEmpty:  true,
				PrimaryKeyFunc:  func() primaryKey { return &taskPrimaryKey{} },
			})
		},
		func() {
			errCh <- db.reEncrypt(ctx, logger, encryptable{
				TableName:       desiredLRPsTable,
				PrimaryKeyNames: []string{"process_guid"},
				Columns:         []string{"run_info", "volume_placement", "routes"},
				EncryptIfEmpty:  true,
				PrimaryKeyFunc:  func() primaryKey { return &desiredLRPPrimaryKey{} },
			})
		},
		func() {
			errCh <- db.reEncrypt(ctx, logger, encryptable{
				TableName:       actualLRPsTable,
				PrimaryKeyNames: []string{"process_guid", "instance_index", "presence"},
				Columns:         []string{"net_info"},
				EncryptIfEmpty:  false,
				PrimaryKeyFunc:  func() primaryKey { return &actualLRPPrimaryKey{} },
			})
		},
	}

As you can see, only net_info is being re-encrypted on the ActualLRP table. Yesterday I managed to reproduce the issue on a dev environment by rotating out the key that was originally used to encrypt the internal_routes column. Today I added internal_routes to the Columns, cleared the broken records from the diegodb, and rotated the keys again. This time the ActualLRPs endpoint worked fine and there were no broken records in the database. So it seems like a simple fix, unless I am missing something; please correct me if I am.
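
The effect described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the BBS implementation: row, reEncrypt, unreadable, and the labels key_old and key_new are all hypothetical names, and each column is reduced to the label of the key that encrypted it.

```go
package main

import (
	"fmt"
	"sort"
)

// row maps a column name to the label of the key that encrypted its value.
// An empty label stands for an empty column.
type row map[string]string

// reEncrypt models a key rotation: only the listed, non-empty columns are
// rewritten with the active key; every other column keeps its old label.
func reEncrypt(r row, columns []string, activeLabel string) {
	for _, c := range columns {
		if r[c] != "" {
			r[c] = activeLabel
		}
	}
}

// unreadable returns, in sorted order, the columns whose key label is no
// longer among the configured keys.
func unreadable(r row, configuredKeys map[string]bool) []string {
	var broken []string
	for column, label := range r {
		if label != "" && !configuredKeys[label] {
			broken = append(broken, column)
		}
	}
	sort.Strings(broken)
	return broken
}

func main() {
	keys := map[string]bool{"key_new": true} // key_old has been rotated out

	// Only net_info is listed: internal_routes keeps the old key label.
	before := row{"net_info": "key_old", "internal_routes": "key_old"}
	reEncrypt(before, []string{"net_info"}, "key_new")
	fmt.Println(unreadable(before, keys)) // prints [internal_routes]

	// Both columns are listed: nothing is left on the old key.
	after := row{"net_info": "key_old", "internal_routes": "key_old"}
	reEncrypt(after, []string{"net_info", "internal_routes"}, "key_new")
	fmt.Println(unreadable(after, keys)) // prints []
}
```

In this model, any column left out of the list stays tied to the old key and becomes unreadable once that key is removed, which is the failure mode reported in this issue.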

@mariash
Copy link
Member

mariash commented Jul 6, 2022

@sleepychild thank you for reporting this. Seems like a bug indeed. @klapkov this looks like the right place to make a fix. We need to make sure we also have a test for this. @klapkov would you be willing to make a PR for this fix? Please let me know if you have any more questions or need help.

@klapkov
Contributor

klapkov commented Jul 7, 2022

Hey @mariash, thanks for the response, I will make a PR with the fix and tests, but we will probably leave it for the next sprint, meaning in about 2 weeks. I hope that's okay with you.

@mariash
Member

mariash commented Jul 8, 2022

@klapkov thank you for this information. We consider this bug too important to wait for 2 weeks and we will be working on a fix.

mariash added a commit to cloudfoundry/bbs that referenced this issue Jul 8, 2022
When encryption key is rotated internal_routes field was not getting
re-encrypted and thus BBS was failing to read it from database.

* Add a test to verify all text fields are being re-encrypted.

cloudfoundry/diego-release#626
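
The test added by this commit is not shown in the thread. As a rough illustration of the kind of guard the commit message describes, the sketch below compares an inventory of encrypted columns against the columns listed for re-encryption. notReEncrypted and the column inventory are hypothetical; this is not the test from the commit.

```go
package main

import (
	"fmt"
	"sort"
)

// notReEncrypted returns the encrypted columns that the re-encryption
// configuration does not list, sorted for stable output.
func notReEncrypted(encrypted, configured []string) []string {
	listed := make(map[string]bool, len(configured))
	for _, c := range configured {
		listed[c] = true
	}
	var missing []string
	for _, c := range encrypted {
		if !listed[c] {
			missing = append(missing, c)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Hypothetical inventory: the two actual_lrps columns that this
	// issue discusses as encrypted.
	encrypted := []string{"net_info", "internal_routes"}

	fmt.Println(notReEncrypted(encrypted, []string{"net_info"}))
	fmt.Println(notReEncrypted(encrypted, []string{"net_info", "internal_routes"}))
}
```

A check of this shape fails as soon as a new encrypted column is added to a table without also being added to the re-encryption list.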
@mariash
Member

mariash commented Jul 8, 2022

We pushed the fix, thank you for reporting this issue.

mariash closed this as completed Jul 8, 2022
@sleepychild
Author

@mariash, thank you for the quick response and fix.
