Skip to content
This repository has been archived by the owner on May 21, 2024. It is now read-only.

Feat/ota 4984/secondary replacement #1686

Merged
merged 10 commits into from
Jun 17, 2020

Conversation

pattivacek
Copy link
Collaborator

Allow ECU re-registration to support adding, removing, and replacing Secondaries. Tested manually and with unit tests for virtual and IP Secondaries.

@codecov-commenter
Copy link

codecov-commenter commented Jun 8, 2020

Codecov Report

Merging #1686 into master will decrease coverage by 13.22%.
The diff coverage is 56.25%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master    #1686       +/-   ##
===========================================
- Coverage   78.26%   65.03%   -13.23%     
===========================================
  Files         190      190               
  Lines       12756    15346     +2590     
===========================================
- Hits         9983     9980        -3     
- Misses       2773     5366     +2593     
Impacted Files Coverage Δ
src/aktualizr_secondary/aktualizr_secondary.cc 63.58% <0.00%> (-22.36%) ⬇️
...aktualizr_secondary/aktualizr_secondary_factory.cc 48.14% <ø> (-13.76%) ⬇️
src/libaktualizr/primary/aktualizr.cc 58.88% <ø> (-37.02%) ⬇️
src/libaktualizr/primary/aktualizr.h 75.00% <ø> (-25.00%) ⬇️
src/libaktualizr/primary/sotauptaneclient.h 57.89% <0.00%> (-17.11%) ⬇️
src/libaktualizr/storage/invstorage.h 24.35% <0.00%> (-7.85%) ⬇️
src/libaktualizr/storage/sqlstorage.h 33.33% <ø> (ø)
src/libaktualizr/primary/sotauptaneclient.cc 50.73% <33.33%> (-36.26%) ⬇️
src/libaktualizr/primary/initializer.cc 53.72% <51.76%> (-36.03%) ⬇️
src/libaktualizr-posix/ipuptanesecondary.cc 88.98% <58.82%> (-0.31%) ⬇️
... and 56 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 889ba11...d943cd9. Read the comment docs.

@pattivacek
Copy link
Collaborator Author

The LGTM failure looks pretty bogus to me; it failed because of a timeout fetching from the Ubuntu servers. I can't figure out how to retry it, though.

@lbonn
Copy link
Contributor

lbonn commented Jun 9, 2020

The LGTM failure looks pretty bogus to me; it failed because of a timeout fetching from the Ubuntu servers. I can't figure out how to retry it, though.

If you log in, they have a "retry analysis" button. Just clicked it.

@pattivacek pattivacek force-pushed the feat/OTA-4984/secondary-replacement branch from 81e1837 to c890cf7 Compare June 9, 2020 07:39
@pattivacek
Copy link
Collaborator Author

If you log in, they have a "retry analysis" button. Just clicked it.

Thanks. I logged in and looked around but couldn't find it. Anyway, I just pushed another tiny commit, so it probably got retriggered by that, too.

@pattivacek pattivacek force-pushed the feat/OTA-4984/secondary-replacement branch from c890cf7 to 70df3f1 Compare June 9, 2020 07:53
auto statement = db.prepareStatement<std::string, std::string, int>(
"INSERT INTO misconfigured_ecus VALUES (?,?,?);", it->serial.ToString(), it->hardware_id.ToString(),
static_cast<int>(it->state));
auto statement = db.prepareStatement<std::string>("SELECT count(*) FROM misconfigured_ecus WHERE serial = ?;",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could get rid of the count and use INSERT OR REPLACE here. serial would be better off being a primary key instead of just "unique" but it should also work.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could get rid of the count and use INSERT OR REPLACE here.

Good point.

serial would be better off being a primary key instead of just "unique" but it should also work.

True, but then I have to add a migration. Is that worth it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should work without so probably not.

@pattivacek pattivacek force-pushed the feat/OTA-4984/secondary-replacement branch 3 times, most recently from 0cbc6bc to 5201bf4 Compare June 15, 2020 12:22
<< it->second->getHwId() << " not found in storage.";
register_ecus = true;
} else {
found[static_cast<size_t>(std::distance(stored_ecu_serials.cbegin(), store_it))] = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
found[static_cast<size_t>(std::distance(stored_ecu_serials.cbegin(), store_it))] = true;
found[store_it - stored_ecu_serials.cbegin()] = true;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah this code was just moved...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, but I think I also wrote it originally. Can't remember why exactly I used std::distance here. Do you see a good reason to prefer one over the other?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh oops I thought I answered earlier, must have missed the button.

So after reading some docs, I can't really see the difference but the substraction is shorter and seems clearer to me.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -65,53 +64,11 @@ TEST(PrimarySecondaryReg, SecondariesMigration) {
EXPECT_EQ(secs_info[0].type, "IP");
EXPECT_EQ(secs_info[0].extra, R"({"ip":"127.0.0.1","port":9061})");
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason to remove this part of the test?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I knew you were going to ask about that! Well, I spent a lot of time trying to get it to work, and then I realized it was testing something that we never tried to fix: it's migrating two Secondaries from the old storage layout to the new storage. In the actual migration code for that (in src/aktualizr_primary/secondary.cc), it explicitly only does the migration for one Secondary. Previously, this test was doing something kind of strange. Now, this isn't allowed: we just reprovision the device with two Secondaries in this case. Migration doesn't really make sense here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! That's a satisfying reason.

@pattivacek pattivacek force-pushed the feat/OTA-4984/secondary-replacement branch from 5201bf4 to b73bde3 Compare June 17, 2020 08:15
@pattivacek
Copy link
Collaborator Author

I've added a fix for a somewhat obscure issue with trying to reregister Secondaries while an update is active. The backend will reject the registration, and the only resolution is to restore the previous Secodnary configuration, which was complicated by the fact that we'd already stored the new Secondary information. Now we wait to store it until registration is succesful.

@pattivacek pattivacek force-pushed the feat/OTA-4984/secondary-replacement branch from b73bde3 to 0e688c5 Compare June 17, 2020 08:54
lbonn
lbonn previously approved these changes Jun 17, 2020
@pattivacek pattivacek force-pushed the feat/OTA-4984/secondary-replacement branch from 0e688c5 to 6b43af9 Compare June 17, 2020 09:13
@pattivacek
Copy link
Collaborator Author

@lbonn Sorry, just blasted your review because I realized I'd broken the networking test without realizing it. Thankfully the fix was easy. :)

@pattivacek
Copy link
Collaborator Author

Finally fixed all the tests.

lbonn
lbonn previously approved these changes Jun 17, 2020
auto statement = db.prepareStatement<std::string, std::string, int>(
"INSERT OR REPLACE INTO misconfigured_ecus VALUES (?,?,?);", ecu.serial.ToString(), ecu.hardware_id.ToString(),
static_cast<int>(ecu.state));
if (statement.step() != SQLITE_DONE || sqlite3_changes(db.get()) != 1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it's worth checking sqlite3_changes value here. My understanding was that it is only useful when you have WHERE in your statement or something similar.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, that was lazy copying on my part, didn't even think about it. I'll double check but will probably remove it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Not is it not necessary, but according to https://www.sqlite.org/c3ref/changes.html it doesn't even do anything for REPLACE, so it would've caused a bug.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my first impression as well, but it looks like while REPLACE doesn't increment the counter, the subsequent INSERT does. As I understood, REPLACE does only the deleting part.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. Well, I removed it anyway. :) Seems unnecessary.

Works for Virtual Secondaries, but still a work in progress for IP
Secondaries.

Signed-off-by: Patrick Vacek <patrickvacek@gmail.com>
Signed-off-by: Patrick Vacek <patrickvacek@gmail.com>
Now they cover adding, remove, and replacing Secondary ECUs.

Signed-off-by: Patrick Vacek <patrickvacek@gmail.com>
The logic for matching addresses to ECUs is not trivial, so it's worth
being extra careful. Also test replacing an ECU that reuses the same
address and port and that we correctly track an ECU that changes its
address or port.

Also note some the timeouts (and some ordering issues) that were
necessary to get some of these tests to pass. These are more reasons why
a proper connection manager would be nice to have; things like this
really shouldn't be necessary.

Signed-off-by: Patrick Vacek <patrickvacek@gmail.com>
We no longer track unregistered ECUs, since that is no longer a problem.
Still dump anything in the database with that flag, just in case, but it
shouldn't happen.

Signed-off-by: Patrick Vacek <patrickvacek@gmail.com>
There's no real problem, just an ugly log message that will go away
eventually on its own.

Signed-off-by: Patrick Vacek <patrickvacek@gmail.com>
For device credential provisioning, it is still a fatal error if the
cert is unavailable. For shared cred prov, we revert to the old behavior
of generating a random pet name if the cert is unavailable.

This is generally only observable if you attempt to replace a Primary
ECU, which requires reusing the same device certificate.

Signed-off-by: Patrick Vacek <patrickvacek@gmail.com>
The backend does not accept empty filenames, but we have to say
something. This comes up if you wipe a Secondary's database but keep the
image file.

Signed-off-by: Patrick Vacek <patrickvacek@gmail.com>
Signed-off-by: Patrick Vacek <patrickvacek@gmail.com>
If registration fails (due to the server complaining about an update in
progress, for example), we want to make it easier to undo the changes
and prevent re-registration.

Signed-off-by: Patrick Vacek <patrickvacek@gmail.com>
@pattivacek pattivacek force-pushed the feat/OTA-4984/secondary-replacement branch from 6b43af9 to d943cd9 Compare June 17, 2020 11:37
@eu-siemann eu-siemann self-requested a review June 17, 2020 12:03
@pattivacek pattivacek merged commit f868186 into master Jun 17, 2020
@pattivacek pattivacek deleted the feat/OTA-4984/secondary-replacement branch June 17, 2020 12:35
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants