BUGFIX | Fix Teleport ALPN Proxy not being HTTP CONNECT Proxy Aware #30

Open
wants to merge 233 commits into
base: v8.3.14-backport
from

Conversation

alexatcanva
Owner

@alexatcanva alexatcanva commented Oct 13, 2022

No description provided.

strideynet and others added 30 commits June 9, 2022 09:34
* thread `context.Context` from tctl `Run()` to subcommands (gravitational#13029)

* update e submodule ref

* update e submodule ref to branch/v8
Backport of gravitational#12827.

`tctl version`, just as `tsh version`,
prints out the version of the binary itself,
not the version of the cluster it's connected to.

The cluster version can be discovered by running
`tctl status`.
…ravitational#13402)

* Don't GetAuthServers in transport.start

* Don't GetAuthServers in AuthProxyDialerService

* Don't GetAuthServers in localSite

* Fix lib/web tests

* Review comments

Co-authored-by: Alan Parra <alan.parra@goteleport.com>
Co-authored-by: rosstimothy <39066650+rosstimothy@users.noreply.github.com>
* Attempt to deflake TestAgentForward

This was another case of the t.TempDir() being cleaned up while the
audit logger is still writing to the directory, which happens when
tests don't properly clean up after themselves. Ensure that any
services we spin up are closed via cleanup actions.

* Prevent the use of disk-based logging in TestAgentForward

The disk-based logger runs a background process to complete uploads,
which occasionally fails to finish before the test cleanup tries
to remove the temporary directory.

There are two ways to prevent the use of a disk based logger:

1. Set IsTestStub on the SSH *ServerContext
2. Use a sync session recording mode

Option 2 was selected, because the ServerContext is created by the SSH
server instead of the test, so plumbing that value through would be
a larger change, and I generally dislike test specific modes that can
be mistakenly enabled in non-test situations.

Additionally, update the lib/srv/regular test fixture to allow for
configuring the audit log to use. This allows us to set up a discarding
logger, since these tests are about agent forwarding behavior and not
audit logging.

Co-authored-by: Zac Bergquist <zmb3@users.noreply.github.com>
* Backport of gravitational#10746 to v8

* Added prerelease check to new APT promotion pipeline
Make the description and first sentence of the Teleport Cloud
introduction clearer for users who have not heard of the Auth Service,
Proxy Service, or Teleport in general.
…cfg (gravitational#13470) (gravitational#13518)

* fix CA rotation watcher not starting when database svc enabled w/ no cfg

* move shouldInitDatabase test to db_test.go and t.Parallel()
Backports gravitational#12840

* Fix resource links

Fixes gravitational#12839

Some video links still refer to the outdated "/teleport/" path.
This change adds the videos these links refer to to the "img"
directory and updates the links.

Note that two of the three MDX files that are changed here do not
actually render the video. I've changed the links here anyway in
case someone uses these as a reference for the link format.

* Apply suggestions from code review

Co-authored-by: Ben Arent <ben@goteleport.com>
Co-authored-by: Roman Tkachenko <roman@goteleport.com>

This test fails similarly to TestAgentForward, because the
file-based session uploader runs in the background and is
sometimes still writing to disk when the test goes to clean
up temp directories.

Since this test has nothing to do with the disk-based uploader
or the audit log, we use a discard emitter and set the session
recording mode to synchronous so there are no files on disk
which need to be uploaded.
Backports gravitational#12828

* Add port information for Cloud users

Fixes gravitational#10552

In the Networking page and Cloud FAQ page, add information on how to
determine which Proxy Service ports are open, and whether TLS routing
is enabled. Also corrects the earlier Networking Page information for
Cloud users, which implied that ports were identical across tenants.

* Add links to the TLS Routing guide
* Hide Setup menu items based on scope

Backports gravitational#12742

See gravitational#11383

Help ensure that no visitor to the Teleport docs site sees content that
is irrelevant to their scope (e.g., Cloud, Open Source, or Enterprise) by
hiding scope-irrelevant content from the navigation menu and menu
pages.

For pages that aren't step-by-step guides and are meant to convey
general information about a Teleport edition, show these pages in all
scopes so users who are curious about another scope can get the
information they need.

This PR focuses on the Setup section.

* Hide Kube Access menu items based on scope

Backports gravitational#12737

See gravitational#11383

Help ensure that no visitor to the Teleport docs site sees content that
is irrelevant to their scope (e.g., Cloud, Open Source, or Enterprise) by
hiding scope-irrelevant content from the navigation menu and menu
pages.

For pages that aren't step-by-step guides and are meant to convey
general information about a Teleport edition, show these pages in all
scopes so users who are curious about another scope can get the
information they need.

This PR focuses on the Kubernetes Access section.

It also adds a short note at the top of the teleport-cluster Helm chart
reference that the chart supports custom agent configurations along with
the Auth/Proxy.

* Hide Getting Started pages/links based on scope

Backports gravitational#12718

See gravitational#11383

Help ensure that no visitor to the Teleport docs site sees content that
is irrelevant to their scope (e.g., Cloud, Open Source, or Enterprise) by
hiding scope-irrelevant content from the navigation menu and menu
pages.

For pages that aren't step-by-step guides and are meant to convey
general information about a Teleport edition, show these pages in all
scopes so users who are curious about another scope can get the
information they need.

This PR focuses on the Getting Started section.

Also reorganizes the getting-started.mdx menu page with the assumption
that users of all editions will visit this page for some of their first
guidance on using Teleport. With this change, getting-started.mdx now
includes links to Getting Started guides in all editions, plus all of
our local labs.

- fixes gravitational#10594
- fixes gravitational#10199

* Hide Enterprise links/pages based on scope

Backports gravitational#12716

See gravitational#11383

Ensure that no visitor to the Teleport docs site sees content that is
irrelevant to their scope (e.g., Cloud, Open Source, or Enterprise) by
hiding scope-irrelevant content from the navigation menu.

In some cases, e.g., the introduction pages for the Cloud, Getting
Started, and Enterprise sections, the content is irrelevant to certain
scopes but a reader still may want to find out more. This change
preserves purely informational content for all scopes, while including
links to other scopes.

This PR focuses on the Enterprise section.

* Hide Cloud links/pages based on scope

Backports gravitational#12712

* Hide Cloud links/pages based on scope

See gravitational#11383

Help ensure that no visitor to the Teleport docs site sees content that
is irrelevant to their scope (e.g., Cloud, Open Source, or Enterprise) by
hiding scope-irrelevant content from the navigation menu and menu
pages.

For pages that aren't step-by-step guides and are meant to convey
general information about a Teleport edition, show these pages in all
scopes so users who are curious about another scope can get the
information they need.

This PR focuses on the Cloud section, and also fleshes out the
introduction page to the Cloud section a bit.

* Respond to PR feedback

* Hide Access Controls links/pages based on scope

Backports gravitational#12708

See gravitational#11383

Ensure that no visitor to the Teleport docs site sees content that is
irrelevant to their scope (e.g., Cloud, Open Source, or Enterprise) by
hiding scope-irrelevant content from the navigation menu and menu
pages.

For pages that are only relevant to a specific scope, show users with
unintended scopes a menu of links to supported scopes.

This PR focuses on the Access Controls section.
Fix CLI reference for the tsh --ttl flag

The value of the --ttl flag for tsh commands was incorrectly stated to
be a duration. Instead, it is an integer indicating a number of minutes.
The default is 12 hours.

Also does some minor copy-editing on the tsh global flags.
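
To make the distinction concrete, here is a minimal Go sketch that treats the flag value as a count of minutes rather than a duration string. The helper `ttlFromMinutes` is hypothetical and is not taken from the tsh source; it only illustrates the interpretation described above.

```go
package main

import (
	"fmt"
	"time"
)

// ttlFromMinutes is a hypothetical helper (not Teleport's actual code).
// It interprets an integer flag value as a number of minutes.
func ttlFromMinutes(minutes int) time.Duration {
	return time.Duration(minutes) * time.Minute
}

func main() {
	// 720 minutes corresponds to the 12-hour default mentioned above.
	fmt.Println(ttlFromMinutes(720))
}
```

Under this reading, a value such as `720` is valid for the flag, while a duration string such as `12h` would not be.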
…13251)

Convert the Trusted Clusters guide to a tutorial

Backports gravitational#10708

* Edit the Trusted Clusters guide for Cloud

See gravitational#10633

- Misc style/grammar/clarity tweaks
- Turn the Teleport Node Tunneling Admonition into a Details
  box so it can be invisible for Cloud users. In Cloud, Nodes
  must connect via Node Tunneling.
- Use Tabs components to add Cloud versions of CLI commands
- Only show the static join token method for self-hosted users
  via Tabs
- Use a Details box to show content relevant only for Enterprise
  and Cloud users
- Remove an Admonition that was duplicated in the Troubleshooting
  section

* Respond to PR feedback

* Address PR feedback

* Turn the Trusted Clusters guide into a tutorial

See: gravitational#11841

The Trusted Clusters guide is organized as a conceptual introduction,
with configuration/command snippets used as illustrations. To make this
guide easier to follow, I have structured it as a step-by-step tutorial
where a user should be able to copy each command/config snippet on
their own environment, establish trust between clusters, and connect to
a remote Node.

Some more specific changes:

- Remove Details box re: Node Tunneling: This isn't strictly relevant
  to Trusted Clusters, so removing it shortens and simplifies what is
  quite a long guide.

- Make "How Trusted Clusters work" more concise and add the information
  to the introduction.

- Move long explanatory passages into Details boxes. Eventually, it
  would be great to split this guide into multiple guides that explain
  different topics in more depth (e.g., a section of the docs devoted
  to Trusted Clusters). For now, this is the quickest way to organize
  conceptual information without detracting from the tutorial structure.
Fix Teleport welcome screen image

The Linux Server getting started guide shows the wrong screenshot
when referring to the Teleport welcome screen. This change uses
a screenshot of the view an unauthenticated user would see when
first visiting the Web UI.
Our upgrade procedure has always been such that servers (proxy/auth)
should be upgraded before upgrading nodes or other agents. This is
because servers can tolerate old clients, but servers do not necessarily
support newer clients (this sometimes works, but is not a supported
configuration).

While this has been the case in practice, it's not entirely clear in
our docs. The docs just say "applies to clients and servers" and never
gave a specific example of a new client trying to connect to an old
server.

Also update the versioning RFD to be more accurate with reality
(we no longer maintain release branches for minor versions).
gravitational#13674)

Improve log message when we fail to retrieve the client cert pool

Include the remote addr of the client to help cluster administrators
identify the misbehaving client.

Fixes gravitational#8594
…ravitational#13961)

CertAuthority watcher filtering (gravitational#10020)

Including similar changes to the v7 backport
…eb session (gravitational#13968)

* Open a new remote client when the remote site has changed in a web session

* Test coverage for remoteClientCache
* Repoint docttest build

* relabel image

* Fix CHANGELOG links for new resolution method

In gravitational/docs, relative links in partials are evaluated
relative to the filepath of the partial, not to the page that includes
that partial. In other words, a link in the same partial will be
resolved differently depending on the filepath of the page that
includes it.

To accommodate this link resolution method, this change edits relative
links within the CHANGELOG to be relative to that file, rather than to
docs/pages/changelog.mdx, which includes it.
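
As a rough illustration of why the base file matters, the Go sketch below resolves the same target from two different base files. The `resolve` helper, the assumption that the CHANGELOG lives at the repository root as `CHANGELOG.md`, and the target page `docs/pages/setup/guide.mdx` are all hypothetical; the real resolution is done by the docs engine, not by this code.

```go
package main

import (
	"fmt"
	"path"
)

// resolve is a hypothetical helper: it resolves a relative link against
// the directory of the file it is considered relative to.
func resolve(baseFile, link string) string {
	return path.Join(path.Dir(baseFile), link)
}

func main() {
	// Link written relative to the including page.
	fmt.Println(resolve("docs/pages/changelog.mdx", "./setup/guide.mdx"))
	// Link written relative to the partial itself (assumed location).
	fmt.Println(resolve("CHANGELOG.md", "./docs/pages/setup/guide.mdx"))
}
```

Both calls produce the same target path, which shows that a link has to be rewritten whenever the file it is resolved against changes.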
…3737)

Ensure tctl commands include login instructions

Backports gravitational#12944

* Ensure tctl commands include login instructions

See gravitational#10192

Add the tctl.mdx partial or a "tsh login" command in some pages that
include example tctl commands. Note that this change does not address
SSO guides, which will be handled separately.

Where a guide requires a complete restructuring to provide full context
(e.g., "docs/pages/application-access/guides/connecting-apps.mdx"), I've
added a "tsh login" instruction above the first tctl command.

* Respond to PR feedback
…l#14119)

docs: remove blocks hiding content and scope links

Backports gravitational#14085

remove blocks hiding content and scope links, fixes gravitational#14052

Co-authored-by: Alexander Klizhentas <klizhentas@gmail.com>
rosstimothy and others added 7 commits October 6, 2022 17:30
…vitational#16916)

The periodic version metric calculation loaded all nodes, database
servers and app servers into memory in order to tally the versions
of each. For larger clusters unmarshalling all resources and loading
them into memory is quite expensive. To prevent Teleport from potentially
being OOM killed we can use `ListNodes` and `ListResources` to limit
the number of resources being loaded into memory.
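
The idea of tallying versions one page at a time can be sketched as follows. This is a minimal illustration only: `resource`, `listPage`, and `countVersions` are hypothetical names, and the in-memory slice stands in for the backend that the real `ListNodes` and `ListResources` calls would query.

```go
package main

import "fmt"

// resource is a hypothetical stand-in for a node, database server, or app server.
type resource struct{ Version string }

// listPage returns at most limit resources starting at index start, plus the
// index of the next page (-1 when there are no more pages). The slice "all"
// stands in for the backend; a real lister would fetch each page remotely.
func listPage(all []resource, start, limit int) (page []resource, next int) {
	end := start + limit
	if end >= len(all) {
		return all[start:], -1
	}
	return all[start:end], end
}

// countVersions tallies versions page by page, so the caller only ever
// processes one page of resources at a time.
func countVersions(all []resource, limit int) map[string]int {
	if limit <= 0 {
		limit = 1 // guard against a page size that would never advance
	}
	counts := make(map[string]int)
	for start := 0; start != -1; {
		var page []resource
		page, start = listPage(all, start, limit)
		for _, r := range page {
			counts[r.Version]++
		}
	}
	return counts
}

func main() {
	all := []resource{{"8.3.14"}, {"8.3.14"}, {"9.0.0"}}
	fmt.Println(countVersions(all, 2))
}
```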
The cacert flag was removed from the curl output during the tsh app login as
most production Teleport clusters are likely to be using publicly trusted CAs,
and therefore wouldn't need the flag. If the user specifies an insecure login,
however, the cacert flag is included with the curl output as it used to be.

Additionally, some tests have been added for the formatAppConfig function. It
was discovered that the YAML output format was outputting two newlines, so a
small modification was made to remove this.

Addresses issue gravitational#7518.
Without this any tag that isn't part of the history on master will fail
to successfully promote.  This breaks most dev builds, which don't end
up as part of master or a release branch.

(cherry picked from commit 531bc51)
@alexatcanva alexatcanva changed the base branch from master to v8.3.14-backport October 13, 2022 08:29
@alexatcanva alexatcanva changed the title BUGFIX | Teleport ALPN Proxy no HTTP CONNECT Proxy Aware BUGFIX | Fix Teleport ALPN Proxy not being HTTP CONNECT Proxy Aware Oct 13, 2022
alexatcanva and others added 21 commits October 13, 2022 18:09
)

This PR updates our various Drone pipelines to use AWS roles for publishing.

Our AWS FTR requires that we do not use any long lived credentials in our AWS accounts and instead use roles. This means we need to move from attaching policies directly to users to attaching policies to roles and having policyless users assume those roles.

https://aws.amazon.com/partners/foundational-technical-review/

Contributes to gravitational/SecOps#213
Co-authored-by: Steven Martin <steven@goteleport.com>
This PR allows Kube proxy/service to use the global logger settings defined for the process.

Fixes gravitational#17461
* Serialize apt/yum promote pipelines

These were running in parallel, but we want them to run serially.
Therefore, we add a dependency between each step and its previous step.

* Allow dev build promotes to proceed in deb/rpm pipelines

This helps test a couple more changes from this pipeline when cutting a
dev build.  Particularly, we saw the download and role assumption steps
fail in gravitational#17334, and this
change would have allowed us to catch that error during testing.

* Fix globbing bug

This bug does not appear to affect anything currently.  However it
should be fixed in case the rm is important at some point in the future.

The bug is: when a wildcard is inside quotes, it is treated as a literal
filename.  So rm -rf "$ARTIFACT_PATH/*" tries to remove the file named
'*' instead of trying to remove everything in artifact path.
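
The same distinction can be reproduced outside the shell. The Go sketch below is an analogy only (the Drone script itself is shell, and all function names here are hypothetical): a path containing `*` passed straight to a remove call is a literal name, while `filepath.Glob` performs an expansion similar to an unquoted wildcard. One difference is that Go's `*` also matches dotfiles, which a default shell glob does not.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// removeLiteral mirrors rm -rf "$ARTIFACT_PATH/*": the wildcard is not
// expanded, so only an entry literally named "*" would be removed.
func removeLiteral(dir string) error {
	return os.RemoveAll(filepath.Join(dir, "*"))
}

// removeContents mirrors rm -rf "$ARTIFACT_PATH"/*: the pattern is
// expanded and every match is removed.
func removeContents(dir string) error {
	matches, err := filepath.Glob(filepath.Join(dir, "*"))
	if err != nil {
		return err
	}
	for _, m := range matches {
		if err := os.RemoveAll(m); err != nil {
			return err
		}
	}
	return nil
}

// countEntries reports how many entries dir contains, or -1 on error.
func countEntries(dir string) int {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return -1
	}
	return len(entries)
}

// demo creates two files in a temp dir and returns the number of entries
// left after the literal removal and after the expanded removal.
func demo() (int, int, error) {
	dir, err := os.MkdirTemp("", "artifacts")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)
	for _, name := range []string{"a.deb", "b.rpm"} {
		if err := os.WriteFile(filepath.Join(dir, name), []byte("x"), 0o600); err != nil {
			return 0, 0, err
		}
	}
	if err := removeLiteral(dir); err != nil {
		return 0, 0, err
	}
	afterLiteral := countEntries(dir)
	if err := removeContents(dir); err != nil {
		return 0, 0, err
	}
	return afterLiteral, countEntries(dir), nil
}

func main() {
	fmt.Println(demo())
}
```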
Backport of gravitational#17827

As part of migrating the AWS account used to host the helm chart repository,
the bucket for the helm chart repo is moving from `us-east-2` to `us-west-2`.

This change updates the drone script to use the new region.

The new bucket configuration also forbids public ACLs on the objects added to it (because
public access is now provided by CloudFront), so this change also removes the explicit ACL
on promoted objects.

See-Also: gravitational#5184
See-Also: gravitational#17827
See-Also: gravitational#16582
…6456) (gravitational#17805)

* Allow tsh to retrieve cluster details in one request

Prior to connecting to a node via `tsh ssh` we need two bits of
information about the cluster:
 1) The session recording mode
 2) Whether FIPS is enabled

In order to retrieve the information `tsh` first would send the
global ssh request `RecordingProxyReqType` to determine the
session recording mode. Later on `tsh` would Ping the auth
server to determine if the cluster was running in FIPS mode.

In an effort to reduce the number of round trips to retrieve
this data, a new global ssh request `ClusterDetailsReqType` is
introduced that returns both the session recording mode and
whether FIPS is enabled. This allows `tsh` to get all the
information it needs in one request, which helps reduce latency
for `tsh ssh`. The request is also extensible in case more
information is needed later.
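
The round-trip saving can be modelled with a small Go sketch. Everything here is hypothetical (`clusterDetails`, `fakeCluster`, and its methods are illustrative names, not the actual `tsh` or SSH request types); the sketch only counts requests in the old and new flows.

```go
package main

import "fmt"

// clusterDetails is a hypothetical payload carrying both values in one reply.
type clusterDetails struct {
	RecordingMode string
	FIPSEnabled   bool
}

// fakeCluster stands in for the proxy and auth server and counts round trips.
type fakeCluster struct {
	details    clusterDetails
	roundTrips int
}

// recordingMode models the older, dedicated recording-mode request.
func (c *fakeCluster) recordingMode() string {
	c.roundTrips++
	return c.details.RecordingMode
}

// pingFIPS models the later Ping used to learn whether FIPS is enabled.
func (c *fakeCluster) pingFIPS() bool {
	c.roundTrips++
	return c.details.FIPSEnabled
}

// fetchDetails models the combined request that returns both values.
func (c *fakeCluster) fetchDetails() clusterDetails {
	c.roundTrips++
	return c.details
}

func main() {
	details := clusterDetails{RecordingMode: "proxy", FIPSEnabled: true}

	oldFlow := &fakeCluster{details: details}
	oldFlow.recordingMode()
	oldFlow.pingFIPS()

	newFlow := &fakeCluster{details: details}
	newFlow.fetchDetails()

	fmt.Println("old round trips:", oldFlow.roundTrips)
	fmt.Println("new round trips:", newFlow.roundTrips)
}
```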
…l#16434) (gravitational#17799)

By only providing the tunnel address from the `reversetunnel.Resolver`
callers would still need to lookup the proxy listener mode to determine
how to dial the address. This results in sending a request to
`/webapi/find` once by the resolver to get the tunnel address and then
a second request to `/webapi/find` by users of the `Resolver` to determine
the proxy listener mode. Propagating the listener mode along with the
tunnel address by the `Resolver` ensures only one `/webapi/find` call
is needed.

This is especially impactful because the `reversetunnel.TunnelAuthDialer`
which is used by the auth http client would do this every time the
`http.Client` connection pool was empty. When the `http.Client` needed
to dial the auth server it was incurring the additional roundtrip to the
proxy.
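
A minimal Go sketch of the idea, assuming a resolver is simply a function value: the resolver returns the listener mode together with the tunnel address, so the caller does not have to issue a second lookup. The names `listenerMode`, `findResponse`, and `newResolver` are hypothetical and do not reproduce the actual `reversetunnel.Resolver` signature.

```go
package main

import "fmt"

// listenerMode is a hypothetical enum for the proxy listener mode.
type listenerMode int

const (
	modeSeparate listenerMode = iota
	modeMultiplex
)

// findResponse is a hypothetical summary of what one /webapi/find call returns.
type findResponse struct {
	TunnelAddr string
	Mode       listenerMode
}

// resolver returns the tunnel address and the listener mode together.
type resolver func() (addr string, mode listenerMode, err error)

// newResolver wraps a single find call and propagates both values from it.
func newResolver(find func() (findResponse, error)) resolver {
	return func() (string, listenerMode, error) {
		resp, err := find()
		if err != nil {
			return "", modeSeparate, err
		}
		return resp.TunnelAddr, resp.Mode, nil
	}
}

func main() {
	calls := 0
	resolve := newResolver(func() (findResponse, error) {
		calls++ // count simulated /webapi/find requests
		return findResponse{TunnelAddr: "proxy.example.com:443", Mode: modeMultiplex}, nil
	})

	addr, mode, err := resolve()
	fmt.Println(addr, mode == modeMultiplex, err, "find calls:", calls)
}
```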
…new attributes (gravitational#18097) (gravitational#18114)

When users re-login after a failed attempt to access a Kubernetes cluster, Teleport may continue to use the old credentials for cluster access. This behavior results in successive failures until the credential cache expires (~1h).

This PR includes changes made by @r0mant to resolve the cache issue. It introduces certificate expiration in the cache key. Every time the user logs in again, the key will be different because the certificate expiration date is different. Thus, Teleport won't reuse the cached credentials.

Fixes gravitational#18070
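
The effect of putting the certificate expiration into the cache key can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: `credsKey`, `credsCache`, and the choice of Unix seconds for the expiry are all hypothetical.

```go
package main

import "fmt"

// credsKey is a hypothetical cache key. Including the certificate expiry
// means a fresh login (new certificate, new expiry) yields a different key.
type credsKey struct {
	user        string
	kubeCluster string
	certExpiry  int64 // Unix seconds of the client certificate's expiration
}

// credsCache is a minimal stand-in for the credential cache.
type credsCache struct {
	entries map[credsKey]string
}

func newCredsCache() *credsCache {
	return &credsCache{entries: make(map[credsKey]string)}
}

// put stores credentials under a key that includes the certificate expiry.
func (c *credsCache) put(user, kubeCluster string, certExpiry int64, creds string) {
	c.entries[credsKey{user, kubeCluster, certExpiry}] = creds
}

// get looks up credentials; a different expiry is a different key.
func (c *credsCache) get(user, kubeCluster string, certExpiry int64) (string, bool) {
	creds, ok := c.entries[credsKey{user, kubeCluster, certExpiry}]
	return creds, ok
}

func main() {
	cache := newCredsCache()
	cache.put("alice", "prod", 1000, "old-creds")

	_, hit := cache.get("alice", "prod", 1000) // same certificate
	fmt.Println("same certificate hit:", hit)

	_, hit = cache.get("alice", "prod", 4600) // certificate issued by a re-login
	fmt.Println("after re-login hit:", hit)
}
```

In this model the stale entry is never returned after a new login; it simply ages out of the cache on its own schedule.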
* Periodically resync proxies to agents (gravitational#18050)

Prior to gravitational#14262, resource watchers would periodically close their watcher,
create a new one and refetch the current set of resources. It turns out
that the reverse tunnel subsystem relied on this behavior to periodically
broadcast the list of proxies to agents during steady state. Now that
watchers are persistent and no longer perform a refetch, agents that are
unable to connect to a proxy expire them after a period of time, and
since they never receive the periodic refresh, they never attempt to
connect to said proxy again.

To remedy this, a new ticker is added to the `localsite` that grabs
the current set of proxies from its proxy watcher and sends a discovery
request to the agent. The frequency of the ticker is set to fire
before the tracker would expire the proxy so that if a proxy exists
in the cluster, then the agent will continually try to connect to it.
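
The periodic resync described above can be sketched as a loop driven by a tick channel. This is a simplified model with hypothetical names (`resyncLoop`, `demo`); it is not the `localsite` implementation, and the tick channel is injected so the example runs without waiting on a real timer.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// resyncLoop sends the current proxy list to the agent on every tick
// until the context is cancelled.
func resyncLoop(ctx context.Context, ticks <-chan time.Time, currentProxies func() []string, send func([]string)) {
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticks:
			send(currentProxies())
		}
	}
}

// demo feeds n ticks into the loop and returns how many discovery
// requests were sent.
func demo(n int) int {
	ctx, cancel := context.WithCancel(context.Background())
	ticks := make(chan time.Time)
	done := make(chan struct{})
	sent := 0

	go func() {
		defer close(done)
		resyncLoop(ctx, ticks,
			func() []string { return []string{"proxy-1", "proxy-2"} },
			func([]string) { sent++ })
	}()

	for i := 0; i < n; i++ {
		ticks <- time.Now()
	}
	cancel()
	<-done // wait for the loop to exit before reading sent
	return sent
}

func main() {
	fmt.Println("discovery requests sent:", demo(3))
}
```

In a long-running process the tick channel would typically come from `time.NewTicker`, with an interval shorter than the period after which the agent's tracker expires a proxy.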
…) (gravitational#18227)

* Ensure invalid tunnel agent connections get closed

Connections from reverse tunnel agents were being marked
as invalid by the proxy under certain conditions but would
ultimately never be closed. This could lead to scenarios where
the agent thought things were fine but the proxy considered
that agent unhealthy and unroutable.

Pruning of invalid connections used to occur when a proxy
tried to retrieve a connection for that tunnel. This also
further muddied the point in time at which the proxy could
close a connection as it never explicitly stopped tracking
the connection and closed it at the same time.

To remedy this, connections are explicitly closed by the proxy
and removed from the mapping to stop tracking immediately. In order
to prevent a connection that is servicing an active connection
from being closed the proxy now tracks which connections have
sessions. Closing does not occur when there are any active
sessions to prevent them from being force terminated.

When the proxy receives a heartbeat from an agent it now restores
the connection to a valid state. In the event that too many heartbeats
have been missed for an agent, the proxy will now terminate
the connection, again only if it is not serving any sessions.

Fixes gravitational#15911
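
The rules described above (close and stop tracking in one step, never close a connection with active sessions, restore validity on heartbeat) can be modelled with a short Go sketch. All names and the threshold `maxMissedHeartbeats` are hypothetical; this is not the proxy's actual data structure.

```go
package main

import "fmt"

// maxMissedHeartbeats is a hypothetical threshold, not Teleport's value.
const maxMissedHeartbeats = 3

// tunnelConn is a simplified model of one reverse tunnel agent connection.
type tunnelConn struct {
	invalid        bool
	missed         int
	activeSessions int
}

// tracker maps agent IDs to the connections the proxy is tracking.
type tracker struct {
	conns map[string]*tunnelConn
}

func newTracker() *tracker { return &tracker{conns: make(map[string]*tunnelConn)} }

func (t *tracker) add(id string) { t.conns[id] = &tunnelConn{} }

// heartbeat restores the connection to a valid state.
func (t *tracker) heartbeat(id string) {
	if c, ok := t.conns[id]; ok {
		c.invalid, c.missed = false, 0
	}
}

// missedHeartbeat records a miss; at the threshold the connection is
// marked invalid and closed if it is idle.
func (t *tracker) missedHeartbeat(id string) {
	c, ok := t.conns[id]
	if !ok {
		return
	}
	c.missed++
	if c.missed >= maxMissedHeartbeats {
		c.invalid = true
		t.closeIfIdle(id)
	}
}

// closeIfIdle closes an invalid connection and stops tracking it in the
// same step, but only when it is not serving any sessions.
func (t *tracker) closeIfIdle(id string) bool {
	c, ok := t.conns[id]
	if !ok || !c.invalid || c.activeSessions > 0 {
		return false
	}
	delete(t.conns, id) // a real implementation would also close the socket here
	return true
}

func main() {
	t := newTracker()
	t.add("idle-agent")
	t.add("busy-agent")
	t.conns["busy-agent"].activeSessions = 1

	for i := 0; i < maxMissedHeartbeats; i++ {
		t.missedHeartbeat("idle-agent")
		t.missedHeartbeat("busy-agent")
	}
	_, idleTracked := t.conns["idle-agent"]
	_, busyTracked := t.conns["busy-agent"]
	fmt.Println("idle tracked:", idleTracked, "busy tracked:", busyTracked)
}
```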