Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: check if the remote exists just before reading it #1713

Merged
merged 4 commits into from
Sep 7, 2024

Conversation

vmaerten
Copy link
Member

@vmaerten vmaerten commented Jul 7, 2024

Hey!

Context

When we have multiple remote Taskfiles, I noticed it can takes time to download all of them, so I added --offline.
But it was still taking times (either in running tasks, either in the autocomplete) so I took a look.

Problem

It appears that we check if the file exist in the remote even if the Offline flags is set, this cause unnecessary HTTP requests (at the end it takes more time than needed)

Solution

Move this check in Read method, this way it'll run only if there is no --offline .
I don't know if store the logger in the struct can causes performance issues, I did not notice that.

NOTE: as Read is executed before ResolveEntrypoint it shoud not cause any issues. I have a remote Taskfile included in a remote one and it's working

It can also improve preformance for this issue #1406

Benchmark locally

I've ran some local benchmark in my projects :

image
vs
image

EDIT:
Thanks to @ccoVeille, it appears that the timeout error does not work on main. cf :
image
with the fix :
image

PS: we may need to discuss about putting --offline by default in the autocomplete files or at least provide an env variable to do it easily
PS2: if this PR is merged, will backport to my remote-git PR

@vmaerten vmaerten added type: refactoring type: performance Issues with optimisation or performance. area: remote Changes related to remote taskfiles. labels Jul 7, 2024
@vmaerten vmaerten requested review from andreynering and pd93 July 7, 2024 10:06
return nil, err
}
node.URL = url
if errors.Is(ctx.Err(), context.DeadlineExceeded) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this one be before the previous if?

I mean you should be testing this no ?

Suggested change
if errors.Is(ctx.Err(), context.DeadlineExceeded) {
if errors.Is(err, context.DeadlineExceeded) {

Said otherwise, is there any case where RemoteExist will not return an error when the context is expired?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, it should be the previous if but it's a bit more complexe to fix than apply your commit
After checking, the Timeout error does not work on main.
Thanks for reporting this

taskfile/reader.go Outdated Show resolved Hide resolved
@@ -60,6 +56,11 @@ func (node *HTTPNode) Remote() bool {
}

func (node *HTTPNode) Read(ctx context.Context) ([]byte, error) {
url, err := RemoteExists(ctx, node.logger, node.URL, node.timeout)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You removed the context.WithTimeout

You could put it back here before this

Because the timeout should be used with the head and the get

Suggested change
url, err := RemoteExists(ctx, node.logger, node.URL, node.timeout)
ctx, cancel := context.WithTimeout(ctx, node.timeout)
defer cancel()
url, err := RemoteExists(ctx, node.logger, node.URL)

You are not using the timeout right now in remote except for the error

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timeout is set in the reader :

ctx, cf := context.WithTimeout(context.Background(), r.timeout)
defer cf()

// Read the file
b, err = node.Read(ctx)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then you should catch the CTX error in the caller ? This way, you don't introduce the timeout parameter, that is only used for reporting errors,no?

Maybe I missed something again 😅😁

@vmaerten
Copy link
Member Author

vmaerten commented Jul 7, 2024

Ok third attempt I think I have to desired behavior:

  • If the request timeout && --download : error
  • If the request timeout, use cached copy :
    • If the cache is here = OK
    • If not = error
  • if the request does not timeout = OK
    @pd93 Let me know if I missed something in the behavior

Copy link
Member

@pd93 pd93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if I missed something in the behavior

What you described sounds right to me.

PS: we may need to discuss about putting --offline by default in the autocomplete files or at least provide an env variable to do it easily

Agreed. If we haven't fetched the file yet then I'm assuming we haven't prompted the user to allow the remote file either, so we probably shouldn't fetch it for completions. I don't think a file should enter the user's filesystem until they have accepted the prompt.

PS2: if this PR is merged, will backport to my remote-git PR

❤️

@pd93 pd93 merged commit c77c8a4 into go-task:main Sep 7, 2024
14 checks passed
pd93 added a commit that referenced this pull request Sep 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: remote Changes related to remote taskfiles. type: performance Issues with optimisation or performance. type: refactoring
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants