-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add HTTPS/certificate authentication support to file transfer service #2474
Add HTTPS/certificate authentication support to file transfer service #2474
Conversation
Signed-off-by: James Rhodes <jarhodes314@gmail.com>
Signed-off-by: James Rhodes <jarhodes314@gmail.com>
Signed-off-by: James Rhodes <jarhodes314@gmail.com>
Codecov Report
Additional details and impacted files
|
Signed-off-by: James Rhodes <jarhodes314@gmail.com>
Signed-off-by: James Rhodes <jarhodes314@gmail.com>
6ce986f
to
6127ba2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have only one request - adding some documentation to the new axum_tls
crate. I think it having doc comments for our crates especially (and sometimes for some complex modules) significantly helps with understanding and why the code is the way it is, and in this case, I think even a short summary would be helpful.
|
||
Once HTTPS is enabled for the file-transfer service, certificate-based authentication can also be enabled. | ||
The directory containing the certificates that the mapper will trust can be configured using `http.ca_path`, | ||
and the agent can be configured to use a trusted certificate using the `http.client.auth.cert_file` and `http.client.auth.key_file` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and the agent can be configured to use a trusted certificate using the `http.client.auth.cert_file` and `http.client.auth.key_file` | |
and the mapper can be configured to use a trusted certificate using the `http.client.auth.cert_file` and `http.client.auth.key_file` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that's true, but the "mapper" in the sentence above is incorrect (as you can probably tell, I copied and pasted it from the auth proxy docs since they're broadly the same, but forgot to change that bit).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh? There seems to be some misunderstanding about this setting then, as I see @didier-wenzek also suggesting a similar change. This is the client
setting, right? Meant to be used by the clients of the agent: which are the mappers and other child devices.
Or these are the client settings to be used by the agent
, when it is the client of other services like the c8y.proxy
? If that's the case, then you may wanna stress that and also mention the settings to be used by clients of the tedge-agent
like the mapper and child devices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've adjusted it to be in line with Didier's comment, as well as correcting the line above, which was also adding to my confusion as that was also wrong.
## HTTPS and authenticated access | ||
By default, the service is unauthenticated and does not support HTTPS connections. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
## HTTPS and authenticated access | |
By default, the service is unauthenticated and does not support HTTPS connections. | |
## HTTPS and authenticated access | |
By default, the service is unauthenticated and does not support HTTPS connections. |
…ent config generation Signed-off-by: James Rhodes <jarhodes314@gmail.com>
/// The file that will be used as a the server certificate for the File Transfer Service | ||
#[tedge_config(example = "/etc/tedge/device-certs/file_transfer_certificate.pem")] | ||
#[doku(as = "PathBuf")] | ||
cert_path: Utf8PathBuf, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will take the time to write scripts and docs on how to setup all these certificates.
I suspect that the current set of settings could be made simpler. For instance, in practice, the tedge-agent
running on the main device can use the same certificate to connect the c8y auth proxy and to serve files to child devices. Also, on a child device, a user might wrongly set http.cert_path
instead of http.client.auth.cert_file
.
It would be good to have a certificate per service instance and to have config settings that reflect that.
Writing scripts and docs will help me to clarify this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is first raw version for a script (where all names are hard-coded!): https://gist.github.com/didier-wenzek/110b618eda6866814dc02fe4f168ff2d
Using this script, I can confirm that we can configure different levels of authentication (HTTP -> HTTPS with server authentication -> HTTPS with symmetric authentication) for c8y-auth-proxy and file-transfer. This flexibility adds complexity though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to have a certificate per service instance and to have config settings that reflect that.
I just hope per-service certificates won't make the setup too complicated for clients. Ideally, a child device should be able to connect to both the c8y-proxy
and FTS
using the same client certificate. At least for our docs/demos, having the same root CA sign all the service certificates as well as the client certificates should simplify things I guess. Just saw that your script is doing exactly that.
Writing scripts and docs will help me to clarify this.
That'd really help. With the agent being a client of the mapper for the proxy and the reverse for the file transfer service, I found the client cert setting twins of both the services really confusing to get my head around and made a lot of errors.
I feel that it'd be even more complicated when a secure tedge-agent
on a child device wants to connect to a secure tedge-agent
on the main device.
Here is first raw version for a script (where all names are hard-coded!): https://gist.github.com/didier-wenzek/110b618eda6866814dc02fe4f168ff2d
That really helped and that long list of config settings highlights the complexity as well. It all makes sense, but as a regular user I find it extremely cumbersome.
Signed-off-by: James Rhodes <jarhodes314@gmail.com>
Signed-off-by: James Rhodes <jarhodes314@gmail.com>
Signed-off-by: James Rhodes <jarhodes314@gmail.com>
Signed-off-by: James Rhodes <jarhodes314@gmail.com>
46cddf5
to
77d05ba
Compare
Signed-off-by: James Rhodes <jarhodes314@gmail.com>
d89b8d6
to
c795e91
Compare
So the thing that's happening here is the PUT request to upload the file cannot be retried, so the response to that request is a 307. As far as reqwest is concerned, this isn't an error, the request has succeeded since it's not a 4xx or 5xx status code. That explains why (a) the upload failed and (b) why the log plugin ignored the issue. I plan to solve this by making a HEAD request to the URL we intend to upload to, and using the final URL from that request as the target to upload to. This is a bit of a hack at the moment since we don't actually implement the HEAD method in the file transfer service, but it's sufficient to generate the desired 307 response in the case where we are using HTTPS. @didier-wenzek does this sound like a reasonable approach to you? I've implemented it locally, and I'm just refining the error handling so we get clear error messages from the log plugin in the case of a failure. |
I'm okay to send a first HEAD request, even if not usual nor ideal. Could the client issue directly an HTTPS request, guessing the scheme from the config? |
…n mapper Signed-off-by: James Rhodes <jarhodes314@gmail.com>
It could issue an HTTPS request directly, but...
I think we're already on the verge of having too many configurations for this feature, I think making this "just work" at the cost of an additional HTTP request is worthwhile compared to adding another configuration. |
Signed-off-by: James Rhodes <jarhodes314@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
The unit test coverage is definitely good, but shouldn't we have some integration tests as well, which tests a secure setup end-to-end? May be just one test that validates secure access to the file-transfer service with curl
or something.
But, to really make sure that our own components can work together when security is enabled, the following tests would also be required:
tedge-log-plugin
on child device connecting to secure tedgetedge-config-plugin
on child device connecting to secure tedgetedge-agent
on child device connecting to secure tedge
And the variations of the above cases as well: authenticated client and unauthenticated client. These tests can later be used as a reference for our docs on how to configure a secure deployment.
Since adding all those tests would be a fairly big task, it can be done in follow-up tasks and someone else should be able to support you with those tests.
@@ -214,8 +214,9 @@ impl CumulocityConverter { | |||
let download = match download_result { | |||
Err(err) => { | |||
let smartrest_error = SmartRestSetOperationToFailed::new( | |||
CumulocitySupportedOperations::C8yLogFileRequest, | |||
format!("Upload failed with {:?}", err), | |||
// TODO test I'm the right message type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks right. The previous op type: CumulocitySupportedOperations::C8yLogFileRequest
was clearly wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was surprised to see this was how we handle failed operations. We must already know this value is order to execute the operation, so all this function should need to provide is the failure reason, it shouldn't care what the operation is called. This is a particularly critical thing to handle correctly due to Cumulocity's assumption that devices always complete operations, and reproducing this failure is pretty difficult in order to test it (and expensive to test since it requires an integration test if you want more of a guarantee than "hopefully the test author wrote the correct value in their test").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We must already know this value is order to execute the operation, so all this function should need to provide is the failure reason, it shouldn't care what the operation is called.
It's partly due to the fact that the operation status transitions in SmartREST is generic and requires the operation type to be sent in that message. That same design was adopted here as well, in generic structures like SmartRestSetOperationToFailed
which leads to such accidental errors. But, I agree that some sort of "current-operation-aware" warpper should have been used here to avoid such mistakes.
Signed-off-by: James Rhodes <jarhodes314@gmail.com>
@@ -30,7 +30,7 @@ Request with non-existing log type | |||
... fragments={"c8y_LogfileRequest":{"dateFrom":"${start_timestamp}","dateTo":"${end_timestamp}","logFile":"example1","searchText":"first","maximumLines":10}} | |||
Operation Should Be FAILED | |||
... ${operation} | |||
... failure_reason=.*No such file or directory for log type: example1 | |||
... failure_reason=.*No logs found for log type example1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've created (and will likely work on soon if no one objects) #2487 for including the quotes in the error message. It causes a bunch of unit test failures making the required change locally, so that will be a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No objections. But, in the meantime you can also use single quotes or back quotes to get around that problem.
I confirm that things just work.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved. Thank you for this quite involved and really useful PR.
@@ -187,6 +187,7 @@ impl CumulocityConverter { | |||
download_result: DownloadResult, | |||
fts_download: FtsDownloadOperationData, | |||
) -> Result<Vec<Message>, ConversionError> { | |||
dbg!(&download_result); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dbg!(&download_result); |
@albinsuresh @didier-wenzek I've added some integration tests to the configuration plugin. There is currently one case where I believe I have found a genuine bug, where a configuration update is marked as successful in cloud but the file isn't actually updated on disk. This only appears to happen if a child device is configured to use client certificates, but the mapper is not configured to use client certificates. Taking a configuration fails as expected with a helpful error in this case, so it should be obvious that something is wrong. I think this is a sufficiently niche error in a niche feature that it doesn't justify blocking the merging of this PR. I'll start to investigate the root cause now, but unless anyone objects, I will merge this PR when the tests are passing and @albinsuresh has approved. |
I'm okay with that. |
Signed-off-by: James Rhodes <jarhodes314@gmail.com>
9ddd420
to
7fd9a0c
Compare
Signed-off-by: James Rhodes <jarhodes314@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work. The integration test also seems pretty complete. The rest of the variations can be added later on (even by someone else).
@@ -238,7 +238,7 @@ impl ConfigManagerActor { | |||
} | |||
Err(err) => { | |||
let error_message = format!( | |||
"Failed uploading configuration snapshot to file-transfer service: {}", | |||
"tedge-configuration-plugin failed uploading configuration snapshot: {}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the name of the plugin directly here won't be a good idea, as this actor is reused in the c8y-config-plugin
as well as the tedge-agent-v1
. So, some generic reference like "config manager" would be better.
Proposed changes
This adds support for HTTPS and certificate authentication to the file transfer service.
Types of changes
Paste Link to the issue
#2475
Checklist
cargo fmt
as mentioned in CODING_GUIDELINEScargo clippy
as mentioned in CODING_GUIDELINESFurther comments