Set up CA #19

Merged
merged 26 commits into main from another-version
Jun 11, 2022

Conversation

bafread (Collaborator) commented Jun 6, 2022

Description

This PR will

Testing the PR
You can verify the PR by following /terraform/README.md#Set-up-CA and set up the CA manually.

Future improvements

Developers notes

  • For the CA's hostname, use the short hostname sacrificial-vm (the output of $ hostname) instead of the FQDN (the output of $ hostname -f).
    Reason: the FQDN differs between GCP accounts, so it cannot be hardcoded, and using it would make the configuration more cumbersome.

Takeaways

How to connect to a remote docker engine

Method 1: Create a docker context

  • Term (Docker context): a named set of connection settings (endpoint and TLS material) that tells the docker CLI which engine to talk to.
# create a context
docker context create --docker host=ssh://username@host remote-engine
# connect to the context
docker context use remote-engine

Now your docker commands run against the remote host.

Method 2 (Our case): set DOCKER_HOST

Docker supports remote connections via SSH, TCP, and TCP with TLS (our case).

export DOCKER_HOST=ssh://username@host

With DOCKER_HOST set, docker sends its requests directly to the remote host, without creating or switching a context.
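
Since this setup uses TCP with TLS rather than SSH, the same approach can be sketched with the docker CLI's DOCKER_HOST, DOCKER_TLS_VERIFY, and DOCKER_CERT_PATH environment variables. This is only a sketch: the helper name use_remote_engine, the host sacrificial-vm on port 2376, and the certificate directory ~/.docker are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: point the docker CLI at a TLS-protected remote engine.
# use_remote_engine is a hypothetical helper, not part of this repository.

use_remote_engine() {
  remote_host="$1"
  cert_dir="${2:-$HOME/.docker}"   # assumed to contain ca.pem, cert.pem, key.pem

  export DOCKER_HOST="tcp://${remote_host}:2376"   # 2376: encrypted port
  export DOCKER_TLS_VERIFY=1                       # same effect as --tlsverify
  export DOCKER_CERT_PATH="$cert_dir"
}

use_remote_engine sacrificial-vm
echo "DOCKER_HOST=$DOCKER_HOST"
# After this, a plain `docker version` would talk to the remote engine.
```

Unsetting the three variables returns the CLI to its default local engine.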

Others

see unresolved comments below

  • packer & terraform variable declaration vs definition
  • ...

Q&A

Regarding Container SSH documentation - Step 4:
Answers from Janos:

Q1: SSH vs. TLS

The documentation recommends connecting to the remote Docker engine over TLS. Why not over SSH? (SSH seems much easier to set up.)
A1: SSH is unstable and should only be used for human access.

Janos: SSH is an additional layer of complexity. You would need to keep the SSH tunnel alive all the time for this to work, which in practice doesn't really work all that well. I once ran an SSHFS mount for a longer period of time and it frequently crashed or became unresponsive since SSH is generally meant for human use. There are very few mechanisms to handle connection failures transparently.

[...] you need ContainerSSH to be able to connect to the remote Docker socket whenever a connection comes in. We have a limited set of retries for the initial connection and no retries afterwards. If the SSH tunnel breaks, the client gets disconnected. I can't precisely tell you why, but my experience is that using SSH for anything but human access is a bad idea.

Q2: docker run -H doesn't work

The command in the documentation doesn't work.

$ docker run -H tcp://your-sacrificial-host:2376 -ti ubuntu
unknown shorthand flag: 'H' in -H
See 'docker run --help'.

Is the command wrong? It works with

docker --tlsverify -H=tcp://sacrificial-vm:2376 run -ti ubuntu

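when ~/.docker is set up as described in the "Secure by default" section of the Docker documentation.

A2: Janos: It should work with docker -H <host> <command>.
(-H is a global docker option, so it goes before the subcommand, e.g. run.)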

# GOOD 1
docker --tlsverify -H tcp://sacrificial-vm:2376 version

# GOOD 2 (with -H=)
docker --tlsverify -H=tcp://sacrificial-vm:2376 version

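# GOOD 3 (TLS verification enabled via environment variable)
export DOCKER_TLS_VERIFY=1
docker -H=tcp://sacrificial-vm:2376 version

# BAD (neither --tlsverify nor DOCKER_TLS_VERIFY is set)
docker -H=tcp://sacrificial-vm:2376 version
> Error response from daemon: Client sent an HTTP request to an HTTPS server.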
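
The failing and working commands differ only in where -H sits: it is an option of docker itself, not of docker run. A small wrapper can keep the global options in the right place. This is a sketch under assumptions: rdocker and the DOCKER_BIN override are hypothetical names introduced here, and port 2376 is taken from the commands above.

```shell
#!/bin/sh
# rdocker: hypothetical wrapper that always puts the global options
# (--tlsverify, -H) before the subcommand.
# DOCKER_BIN is a made-up override so the wrapper can be dry-run with echo.

rdocker() {
  remote_host="$1"
  shift
  "${DOCKER_BIN:-docker}" --tlsverify -H "tcp://${remote_host}:2376" "$@"
}

# Dry run: substitute echo for docker to see the resulting argument order.
DOCKER_BIN=echo rdocker sacrificial-vm run -ti ubuntu
# prints: --tlsverify -H tcp://sacrificial-vm:2376 run -ti ubuntu
```

Without DOCKER_BIN set, rdocker sacrificial-vm version runs the real CLI, equivalent to GOOD 1.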

Q3: Do we need Docker daemon to also run on port 2375?

Step 4 gives two ports for Docker connection, but only 2376 seems to be used in other places in the doc.
A3: Janos: typically 2376 for encrypted connection, 2375 for unencrypted.

Action: Our 2375 is not open. We can probably ignore it.

bafread added 7 commits June 4, 2022 14:41
1. Rename and add Firewall-rules with tags
2. Change the image of gateway-vm to ubuntu with docker
3. Build containerssh-guest-image on Sacrificial-vm
This reverts commit 292877b.
1. install ssh-container-guest-image directly with docker pull.
2. add ca_cert for client and server (still needs to be tested with another server that has a different FQDN)
3. unclean!

Comment on lines 42 to 43
sudo shutdown -r +1
sleep 5m
Owner

changed it because ...

Collaborator Author

it will fix the disconnected issue :)

Owner

Just changed it back to reboot, as it's much faster and works now (I tested it more than 10 times),
maybe because of:

  provisioner "shell" {
    script            = "./scripts/update_apt_packages.sh"
+    expect_disconnect = true
  }

Owner

@paseaf paseaf left a comment

Some TODOs for us:

  • clean up what needs to be cleaned
  • try to automate everything (especially key file generation; we should not commit key files to GitHub in the end)
  • finish the whole setup
  • add some description to the PR if needed

Notes:

  • hostname -f prints the FQDN.
    Use the FQDN for $HOST in the commands from the Docker manual that generate the .pem files.

  • config.yaml is not working yet.

  • Docker key generation is still a manual process and needs to be automated.

Decided to make CA work first to control the PR size.
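
One possible way to automate the manual key generation noted above is a non-interactive variant of the openssl steps from Docker's guide on protecting the daemon socket. This is a minimal sketch, not the PR's implementation: the variable names SERVER_HOST and OUT, the CA subject containerssh-ca, the unencrypted CA key (no passphrase prompt), and the 365-day validity are all assumptions. SERVER_HOST plays the role of $HOST in the manual commands.

```shell
#!/bin/sh
# Sketch: generate a CA and a server certificate without prompts.
# All names and defaults below are assumptions for illustration.
set -eu

SERVER_HOST="${SERVER_HOST:-sacrificial-vm}"   # or pass the output of hostname -f
OUT="${OUT:-$(mktemp -d)}"                     # keep generated keys out of the repo
cd "$OUT"

# 1. CA private key and self-signed CA certificate.
openssl genrsa -out ca-key.pem 4096
openssl req -new -x509 -days 365 -sha256 \
  -key ca-key.pem -subj "/CN=containerssh-ca" -out ca.pem

# 2. Server private key and certificate signing request.
openssl genrsa -out server-key.pem 4096
openssl req -new -sha256 \
  -key server-key.pem -subj "/CN=${SERVER_HOST}" -out server.csr

# 3. Sign the server certificate with the CA, restricted to server auth.
printf 'subjectAltName = DNS:%s,IP:127.0.0.1\nextendedKeyUsage = serverAuth\n' \
  "$SERVER_HOST" > extfile.cnf
openssl x509 -req -days 365 -sha256 \
  -in server.csr -CA ca.pem -CAkey ca-key.pem -CAcreateserial \
  -out server-cert.pem -extfile extfile.cnf

echo "Certificates written to $OUT"
```

A client certificate would follow the same three steps with extendedKeyUsage = clientAuth. Writing to a temporary directory by default is a deliberate choice here, in line with the goal of not committing key files.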

bafread added 5 commits June 7, 2022 11:59
- make small changes to the firewall rule (open ports 9090 and 9091 only for logger-vm)
- add a new script to run a container in gateway-vm, as Janos suggested (does not work correctly yet; the container keeps exiting instead of running in the background)
the script runs a docker container inside our VM (the container runs in the background) and mounts the settings for containerssh

variable "project_id" {
type = string
default = "containerssh-352007"
Collaborator Author

Enter your project ID here

Owner

Will move it to variables.auto.pkrvars.hcl, as recommended by
https://www.packer.io/guides/hcl/variables#from-a-file

allow {
protocol = "tcp"
-ports = ["22", "9091"]
+ports = ["22"]
Collaborator Author

Open port 22 on all VMs (for SSH), so that they are easier to control.
TODO (later): open port 22 only on gateway-vm

Owner

We can finalize firewall rules in #5 after everything works

"./scripts/download_node_exporter.sh",
-"./scripts/run_node_exporter.sh"
+"./scripts/run_node_exporter.sh",
+"./scripts/run_docker_container.sh"
Collaborator Author

As per Janos's suggestion, run a container inside gateway-vm to run ContainerSSH.

packer/README.md
@@ -30,7 +30,7 @@ What you need:

> Note: if you want to use a different file name or location, change `account_file` in [`./main.pkr.hcl`](./main.pkr.hcl) accordingly

-3. Update `project-id` in `main.pkr.hcl` to match yours
+3. Update `project-id` in `variables.pkr.hcl` to match yours
Collaborator Author

Update readme file to match the changes

Owner

Moved to a *.auto.pkrvars.hcl file, as Packer recommends.

@paseaf paseaf changed the title from "Makre some changes :)" to "Set up CA" Jun 11, 2022
@paseaf paseaf force-pushed the another-version branch from f81a23c to be287d6 Compare June 11, 2022 17:04
@paseaf paseaf marked this pull request as ready for review June 11, 2022 17:30
@paseaf paseaf added the bug, documentation, deployment, and packer labels Jun 11, 2022
Owner

@paseaf paseaf left a comment

Finished the CA setup with manual configuration.
Will do the next steps in future PRs.

TODOs before merge:

  • check the answers to the open questions and update our code if needed
  • remove ca*.tar files and the relevant config
  • verify the CA setup again with the readme

.gitignore
Comment on lines 1 to 4
terraform/variables.tf
terraform/.terraform.lock.hcl
packer/files/Neuer Ordner/
terraform/variables.tf
Owner

Removed this file for the following reasons:

Maybe we shouldn't ignore these files.

  • For the .terraform.lock.hcl file, refer to
    https://stackoverflow.com/a/67975490

    We should update the file instead of ignoring it (done in this PR):

    terraform providers lock -platform=windows_amd64 -platform=darwin_amd64 -platform=linux_amd64
  • Neuer Ordner ("New Folder") is neither a commonly ignored path nor relevant to the project, so it seems a bit off topic here.

  • variables.tf declares variables and their types, and should be checked into the repo.


type = string
default = "containerssh-352007"
// Sensitive vars are hidden from output as of Packer v1.6.5
sensitive = true
Owner

Removed sensitive = true, as the project ID should not be sensitive.

@paseaf paseaf merged commit c0133b4 into main Jun 11, 2022
@paseaf paseaf deleted the another-version branch June 11, 2022 19:31