-
Notifications
You must be signed in to change notification settings - Fork 61
Wait for secondaries before installing #1533
Conversation
config/sql/migration/migrate.23.sql
Outdated
-- Don't modify this! Create a new migration instead--see docs/schema-migrations.adoc | ||
SAVEPOINT MIGRATION; | ||
|
||
ALTER TABLE ecu_serials ADD COLUMN sec_type TEXT NOT NULL DEFAULT ""; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following comment is not exactly related to this PR and is not blocker for its merging, just simply general comment to the overall data model of aktualizr.
I think, it makes sense to have a notion of ECU in the data model (not ecu_serial) and corresponding table in the DB (ECU table), so manifest
, public_key
, etc are attributes of ECU not ecu_serials entity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I've also wondered if it should just be another table.
As we've added fields to ecu_serials
it has effectively become what you describe and would probably more aptly named ecus
.
src/aktualizr_primary/secondary.cc
Outdated
@@ -141,20 +147,53 @@ class SecondaryWaiter { | |||
std::unordered_set<std::string> secondaries_to_wait_for_; | |||
}; | |||
|
|||
static Secondaries createIPSecondaries(const IPSecondariesConfig& config) { | |||
static Secondaries createIPSecondaries(const IPSecondariesConfig& config, bool provision, Aktualizr& aktualizr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This dependency of the IP Secondary factory on Aktualizr is questionable. Actually, it's me who implicitly introduced this dependency by putting createIPSecondaries() implementation in this file :) , it should be under libaktualizr-posix folder, IMHO. Anyway, this is not exactly comment to this PR, just thing that we might need to consider at some point.
src/aktualizr_primary/secondary.cc
Outdated
|
||
sec_waiter.wait(); | ||
} else { | ||
auto secondaries_info = aktualizr.GetRegisteredSecondaries(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the secondaries that have been added to the aktualizr instance be 'not registered', if not then perhaps renaming of GetRegisteredSecondaries() to just GetSecondaries() would make sense. Or, as an option, storage_
can be passed to createIPSecondaries instead of aktualizr
so we can call storage_->loadSecondaryInfo(&info);
here and replace dependency on Aktualizr with dependency on Storage which looks better since it reduces cross-dependencies, IMHO. Or even more better, just pass exactly what is required to do the job here, which is std::vector<SecondaryInfo>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GetSecondaries()
would probably work.
My first version was also passing the storage around, the problem I had is that call originates from src/aktualizr_primary/main.cc
directly where the underlying storage object is not accessible.
Passing std::vector<SecondaryInfo>
might be fine or maybe too specific? Also, because of the previous point, the aktualizr method would still be needed in the current state.
src/aktualizr_primary/secondary.cc
Outdated
throw std::runtime_error(err_ss.str()); | ||
} | ||
|
||
// note: order in configuration and storage should be the same |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps, IP and port could be stored in the DB too, so an order that secondaries are listed in the conf won't matter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the same vein as the previous comment, it is made harder by the fact we've abstracted the ip secondary concept from SotaUptaneClient and the storage.
One version I've explored was to add another free form string column to the ecu_serials
table and SecondaryInfo
that might have to be re-parsed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll want to avoid tables with secondary type-specific columns, so the free-form string might be the best option. Tedious but might be the most future-proof.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that adding secondary specific columns into a generic table (ecu_serials here) is a bad idea. I see a few options here:
-
extend the existing ecu_serials table with an opaque column (e.g. sec_data, sec_info, etc) intended for serialized secondary specific data required for its instantiation;
-
don't alter the existing ecu_serials table. Introduce another table
ecu
with the generic fields + the column containing serialized secondary specific data; -
Introduce another table ip_secondary that contains the ip secondary specific info. This table is managed solely by the IP secondary factory object/method/function. In this case we can avoid dependency on Aktualizr object at all, what is needed just a path to the DB if we would like to store this table in the same DB file.
-
Introduce two new tables ECU containing generic columns + the ip_secondary specific table.
} | ||
|
||
for (auto sec_it = targeted_secondaries.begin(); sec_it != targeted_secondaries.end();) { | ||
if (sec_it->second->getManifest() != Json::Value()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sec_it->second->getManifest() sets up a TCP connection with Secondary, a timeout of the setting up TCP connection really depends on multiple factors and configuration of TCP stack at a kernel level. The same is true for write/read to/from socket, so it won't be 10 minutes for sure and actually will significantly differs in a case of a local host and a non-local one (http://willbryant.net/overriding_the_default_linux_kernel_20_second_tcp_socket_connect_timeout)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that's true, I will rework something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, why not just do certain amount of retries within getManifest() (I mean Asn1Rpc call) or keep retrying until certain time if we need to wait at least some period of time.
a7fdec1
to
5b6b169
Compare
Codecov Report
@@ Coverage Diff @@
## master #1533 +/- ##
=========================================
- Coverage 82.19% 82.1% -0.09%
=========================================
Files 186 187 +1
Lines 11494 11744 +250
=========================================
+ Hits 9447 9642 +195
- Misses 2047 2102 +55
Continue to review full report at Codecov.
|
5b6b169
to
3f22919
Compare
Looks like I've overlooked something, e2e62c6 breaks Uptane.UptaneSecondaryMisconfigured |
Not necessarily unexpected, right? That test might need some reworking to still be meaningful. |
if (secmanifest == Json::Value()) { | ||
// could not get the secondary manifest, a cached value will be provided | ||
std::string cached; | ||
if (storage->loadCachedEcuManifest(ecu_serial, &cached)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This 'manifest caching' functionality looks like IP Secondary specific functionality, so I would 'hide' it behind the secondary interface and implemented it in IpUptaneSecondary::getManifest()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like that also, but there should then be a way to access the storage from there. To evaluate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe dedicated table in sql.db or even DB that is initialized in IP secondary factory (preferable) or ctor.
// note: this fail after a time out but will be retried at the next install | ||
// phase if the targets have not been changed. This is done to avoid being | ||
// stuck in an unrecoverable state here | ||
if (!waitSecondariesReachable(updates)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure that this functionality belongs here, IMHO, this is secondary specific functionality and should not be in a generic code. Why we wait just here before the sending metatada but not wait before sending actual images, secondary can be available at the first step but becomes not available before the second (installation) step ?
This rhetorical question is another argument for having this wait
or retry policy
withing the IP secondary implementation, specifically, within IpUptaneSecondary implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine. We're likely to need a generic way to wait for Secondaries regardless of protocol. Obviously for something like virtual Secondaries it can just return true. And if we can improve this later, that's fine too.
@@ -14,6 +14,7 @@ class SecondaryInterface { | |||
using Ptr = std::shared_ptr<SecondaryInterface>; | |||
|
|||
public: | |||
virtual std::string Type() const = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, this Type()
as well as the logic of passing around of the registered
secondaries can be eliminated if we persist required information about IP secondaries within the IP secondary builder/factory, e.g. in createIPSecondaries
under if (provision) {
branch. Therefore, we won't need aktualizr instance here and in its else
we won't need to secondaries_info.erase(
, we will just load the info from the DB.
Actually, I think, even provision
parameter is not required here. In createIPSecondaries
, we just check if the info about the secondary is persisted in the DB if so, just `Uptane::IpUptaneSecondary::connectAndCheck() if not connectAndCreate() + persist the info into the DB. What we need here, is just an instance of the object that can operate over the DB, BTW, I think, it's not necessary has to be the same instance that the sotauptaneclient owns as we can have IP secondary specific table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing the provision
around was indeed needlessly complicated, the last version is a bit more simple.
I'll continue the investigations about the Type()
method and dedicated storage.
Regardless of the solution/PR design, I think, it makes sense to add test(s) into |
b18a231
to
d631a21
Compare
Here is another version with two tables |
2e631a4
to
0fbd57f
Compare
The version currently pushed only caches the manifest in memory and it is indeed simpler. But I am gonna revert to put it in the storage somehow because it doesn't fit the bill: if the secondary never gets up between the primary start and the attempted install, the backend marks the install as failed before it can start as the manifest is incomplete. And we clearly want to support this case. |
config/sql/migration/migrate.23.sql
Outdated
SAVEPOINT MIGRATION; | ||
|
||
ALTER TABLE ecu_serials RENAME TO ecus; | ||
CREATE TABLE secondary_ecus(serial TEXT PRIMARY KEY, sec_type TEXT NOT NULL, public_key_type TEXT NOT NULL, public_key TEXT NOT NULL, FOREIGN KEY(serial) REFERENCES ecus ON DELETE CASCADE ON UPDATE CASCADE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO, these two tables should not know anything about each other. libaktualizr should not know details of any secondary implementation. In ideal world the IP secondary should have its own dedicated persistent layer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well the real world is all about compromises.
0d0ece5
to
71cdefc
Compare
Not used anywhere now Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
It does not seem to have a meaningful purpose here and is needlessly limiting. Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
The extra check in `addNewSecondary()` can be added anywhere Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
71cdefc
to
f9ea804
Compare
It duplicates information and can be easily replaced by an helper function. Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
98ecfdc
to
505166c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Can you still document somewhere (maybe in the ticket?) what might still require some followup?
'''Test Aktualizr with multiple ip secondaries''' | ||
|
||
with aktualizr, secondary, secondary2: | ||
aktualizr.wait_for_completion() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is all happy-path, with everything more or less starting simultaneously, right? Can we extended this for the cases when one or more Secondaries don't come up immediately?
Secondary type, public key and an "extra" field. Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
We have to store some additional data to be able to pick it back up. Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
And fix some other test after changes Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
If we did not store a secondary public key (migrated from old storage for example): - store it later when we get it - do not compare against the actual one, but do it on subsequent tries Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
e389c18
to
10638d8
Compare
Solves some headaches Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
10638d8
to
83d0aa9
Compare
83d0aa9
to
c927768
Compare
Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
c927768
to
ba40f33
Compare
This is quite big and will need more tests.