Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

session: track sessions with a group construct #1551

Merged
merged 4 commits into from
Jul 8, 2020

Conversation

tonistiigi
Copy link
Member

Avoid hidden session passing and allow one session to drop when
multiple builds share a vertex.

Signed-off-by: Tonis Tiigi tonistiigi@gmail.com

@tonistiigi tonistiigi marked this pull request as ready for review July 1, 2020 05:19
@tonistiigi tonistiigi force-pushed the session-group branch 3 times, most recently from ad9bb6e to 6acceaa Compare July 1, 2020 06:55
@tonistiigi
Copy link
Member Author

@hinshun PTAL . This fixes the issue described in #1432 (comment) . I didn't make changes to the way the session interacts with frontends atm. I think more changes are needed there to really prepare for better shared session support but this should get us into a better state for these changes.

@tonistiigi tonistiigi requested a review from hinshun July 1, 2020 16:26
Copy link
Collaborator

@hinshun hinshun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So to be clear, currently when multiple sessions share a vertex, if one session drops the solve is cancelled for everyone?

I looked closely at the changes, and it LGTM. I can see how various functions that depend on sessions now can choose Any of the session in the session.Group.

I didn't make changes to the way the session interacts with frontends atm.

I didn't see any cases where multiple sessions are added to a group, but I think that's what you alluded to in the PR comment above.

func getContentStore(ctx context.Context, sm *session.Manager, storeID string) (content.Store, error) {
sessionID := session.FromContext(ctx)
func getContentStore(ctx context.Context, sm *session.Manager, g session.Group, storeID string) (content.Store, error) {
// TODO: to ensure correct session is detected, new api for finding if storeID is supported is needed
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it mean by storeID is supported?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When there are multiple sessions daemon could send "detect" request to know which one supports the current storeID. If a specific session does not know about storeID then the next one is tried.

exporter/exporter.go Outdated Show resolved Hide resolved
solver/llbsolver/ops/exec.go Outdated Show resolved Hide resolved
util/resolver/resolver.go Outdated Show resolved Hide resolved
NextSession() string
}

func NewGroup(ids ...string) Group {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see this being called with more than one session ID in this PR. How will multiple sessions come together in a group in the future?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vertexes use their own implementation of Group defined in jobs.go. This is the group passed to ops/subbuild etc.

return nil, err
}

// because ssh socket remains active, to actually handle session disconnecting ssh error
// should restart the whole exec with new session
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand on this? I don't understand.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have 2 builds both running the same exec() with ssh mounted the ssh remains active for the whole duration of the exec. So if one session goes away there is no way to switch this ssh socket to another session as it might be in unknown state. Atm we ignore this and only validate session works when exec starts. But if we would want to handle this case then when connection would drop from ssh, we could just restart the whole exec with the new session. If you look at the "local source" implementation now, then this is what I do there. If a transfer fails it will check if we can attempt a new transfer from another session before failing the build.

@tonistiigi
Copy link
Member Author

So to be clear, currently when multiple sessions share a vertex, if one session drops the solve is cancelled for everyone?

If a session drops while it is being used and it was the session that was randomly chosen for the op then the whole op fails. The race window is actually quite small, even my reproducer isn't very effective.

A somewhat more interesting case is also what happens to the puller. We reuse the resolver to avoid making new registry connections but this means CacheKey and Exec may have a different active session for getting the credentials. So there is a need to "update" the resolver with the new authentication backend.

I looked closely at the changes, and it LGTM. I can see how various functions that depend on sessions now can choose Any of the session in the session.Group.

Yes, and Any automatically retries the next session, should the previous one fail.

I didn't see any cases where multiple sessions are added to a group, but I think that's what you alluded to in the PR comment above.

Multiple session are added to the group automatically with the jobs mechanism. This existed before. The difference is that when passing to ops, previously solver would pick a random one and pass it as string, and now it passes a callback (masked in Group interface) for op to get access to all the valid sessions when needed.

@tonistiigi
Copy link
Member Author

Found an issue with ResolveImageConfig and Resolver cache that needs some refactoring.

Avoid hidden session passing and allow one session to drop when
multiple builds share a vertex.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Copy link
Collaborator

@hinshun hinshun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question, otherwise looks good.

func (a *SessionAuthenticator) credentials(h string) (string, string, error) {
a.cacheMu.RLock()
c, ok := a.cache[h]
if ok && time.Since(c.created) < time.Minute {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does time.Minute come from? Is that the expiration time of creds?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I guess I could add a named constant for it so it is clearer.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
@tonistiigi
Copy link
Member Author

@hinshun good to merge?

@hinshun hinshun merged commit 4881300 into moby:master Jul 8, 2020
thaJeztah added a commit to thaJeztah/cli that referenced this pull request Jul 28, 2020
full diff: moby/buildkit@df35e98...4d1f260

- moby/buildkit#1551 session: track sessions with a group construct
- moby/buildkit#1534 secrets: allow providing secrets with env
- moby/buildkit#1533 git: support for token authentication
- moby/buildkit#1549 progressui: fix logs time formatting

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
thaJeztah added a commit to thaJeztah/cli that referenced this pull request Jul 28, 2020
full diff: moby/buildkit@df35e98...4d1f260

- moby/buildkit#1551 session: track sessions with a group construct
- moby/buildkit#1534 secrets: allow providing secrets with env
- moby/buildkit#1533 git: support for token authentication
- moby/buildkit#1549 progressui: fix logs time formatting

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
thaJeztah added a commit to thaJeztah/cli that referenced this pull request Jul 28, 2020
full diff: moby/buildkit@df35e98...4d1f260

- moby/buildkit#1551 session: track sessions with a group construct
- moby/buildkit#1534 secrets: allow providing secrets with env
- moby/buildkit#1533 git: support for token authentication
- moby/buildkit#1549 progressui: fix logs time formatting

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
thaJeztah added a commit to thaJeztah/cli that referenced this pull request Sep 9, 2020
full diff: moby/buildkit@df35e98...4d1f260

- moby/buildkit#1551 session: track sessions with a group construct
- moby/buildkit#1534 secrets: allow providing secrets with env
- moby/buildkit#1533 git: support for token authentication
- moby/buildkit#1549 progressui: fix logs time formatting

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
docker-jenkins pushed a commit to docker-archive/docker-ce that referenced this pull request Sep 11, 2020
full diff: moby/buildkit@df35e98...4d1f260

- moby/buildkit#1551 session: track sessions with a group construct
- moby/buildkit#1534 secrets: allow providing secrets with env
- moby/buildkit#1533 git: support for token authentication
- moby/buildkit#1549 progressui: fix logs time formatting

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 7edc00d8088795798ae4e82d2e529a9829acfe72
Component: cli
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants