gsutil cp bad request #217

Closed
jonasfugedi opened this issue Apr 19, 2020 · 6 comments · Fixed by #1182

Comments

@jonasfugedi
Contributor

jonasfugedi commented Apr 19, 2020

I need to test some bash scripts which run gsutil but I keep getting errors when trying to create objects using gsutil. The scenario I run is basically:

docker run --name fake-gcs-server -p 4443:4443 fsouza/fake-gcs-server

gsutil -o "Credentials:gs_json_host=0.0.0.0" -o "Credentials:gs_json_port=4443" -o "Boto:https_validate_certificates=False" mb "gs://test"

gsutil -o "Credentials:gs_json_host=0.0.0.0" -o "Credentials:gs_json_port=4443" -o "Boto:https_validate_certificates=False" ls "gs://test"

echo "Hello" | gsutil -o "Credentials:gs_json_host=0.0.0.0" -o "Credentials:gs_json_port=4443" -o "Boto:https_validate_certificates=False" cp - "gs://test/hello.txt"

Copying from ...
ResumableUploadStartOverException: 404 Bad Request

gsutil -o "Credentials:gs_json_host=0.0.0.0" -o "Credentials:gs_json_port=4443" -o "Boto:https_validate_certificates=False" cp ./tmp/funny-memes-81.jpg "gs://test/"

Copying file://./tmp/funny-memes-81.jpg [Content-Type=image/jpeg]...
BadRequestException: 400 Bad Request

@fsouza
Owner

fsouza commented Apr 20, 2020

Hey @jonasfugedi, thanks for opening this issue. What do you see in the server logs?

@jonasfugedi
Contributor Author

jonasfugedi commented Apr 21, 2020

The logs did not give me any good clues. Is there any debug flag I can enable to get more details?

time="2020-04-19T17:01:22Z" level=info msg="172.17.0.1 - - [19/Apr/2020:17:01:22 +0000] \"POST /resumable/upload/storage/v1/b/test/o?fields=generation%2CcustomerEncryption%2Cmd5Hash%2Ccrc32c%2Cetag%2Csize&alt=json&uploadType=resumable HTTP/1.1\" 404 19"

server_log.txt

Also, I assume this is reproducible anywhere? I've only tried it on two machines so far.

@fsouza
Owner

fsouza commented Apr 21, 2020

@jonasfugedi thanks for sharing. I wanted to check where exactly the 400 was happening.

So:

  1. gsutil is calling the /resumable/upload endpoint, which is not defined in fake-gcs-server. I couldn't find docs for that endpoint, so we may need to reverse engineer it from the client code or by tapping into some requests.
  2. It happens that gsutil falls back to multipart upload and calls upload with uploadType=multipart, which fails with a 400. As far as I can tell, that endpoint fails when the Content-Type header can't be parsed as a multipart header, which may mean that whatever gsutil is sending isn't recognized by fake-gcs-server.

I believe the next step would be to tap into what gsutil is sending, create a test, and fix the issue in fake-gcs-server. Will tag this as a bug.
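To make the second failure mode concrete, here is a minimal sketch of a handler that rejects a request when its Content-Type cannot be parsed as a multipart media type. The names `multipartUpload` and `statusFor` are hypothetical and this is not fake-gcs-server's actual handler; it only illustrates how an unparsable header could surface as a 400.

```go
package main

import (
	"fmt"
	"mime"
	"net/http"
	"net/http/httptest"
)

// multipartUpload is a hypothetical stand-in for an upload handler.
// It answers 400 when the Content-Type header cannot be parsed or
// carries no boundary parameter, and 200 otherwise.
func multipartUpload(w http.ResponseWriter, r *http.Request) {
	_, params, err := mime.ParseMediaType(r.Header.Get("Content-Type"))
	if err != nil || params["boundary"] == "" {
		http.Error(w, "invalid Content-Type header", http.StatusBadRequest)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// statusFor sends one in-memory request with the given Content-Type
// and returns the status code the handler produced.
func statusFor(contentType string) int {
	req := httptest.NewRequest(http.MethodPost,
		"/upload/storage/v1/b/test/o?uploadType=multipart", nil)
	req.Header.Set("Content-Type", contentType)
	rec := httptest.NewRecorder()
	multipartUpload(rec, req)
	return rec.Code
}

func main() {
	// An unquoted token boundary parses fine.
	fmt.Println(statusFor("multipart/related; boundary=abc123"))
	// A header with no boundary at all is rejected.
	fmt.Println(statusFor("application/json"))
}
```

A test built this way can replay whatever header gsutil turns out to send, once it has been captured.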

Thanks again for reporting and for sharing the logs!

@fsouza fsouza added the bug label Apr 21, 2020
@ex-nerd

ex-nerd commented Jun 3, 2020

I think I'm running into this same issue in my tests to create signed upload URLs. When using fake-gcs-server I just return a "direct" URL without the signing key (since I can't fake KMS stuff, and this works for downloads).

Edit: I'm not entirely sure this is the same bug, so I moved this comment over to #270 as its own thing.

@StephenWithPH

[...] that endpoint fails when the Content-Type header can't be parsed as a multipart header, which may mean that whatever gsutil is sending isn't recognized by fake-gcs-server.

I'm having a similar problem with gsutil cp. I did some digging. I think you are correct.

gsutil -DD cp ... enables debugging. I was able to capture the headers. In my case, they were:

Headers: {'accept': 'application/json',
 'accept-encoding': 'gzip, deflate',
 'content-length': '401',
 'content-type': 'multipart/related; '
                 "boundary='===============1523364337061494617=='",
 'user-agent': 'apitools Python/3.8.5 gsutil/4.52 (linux) analytics/disabled '
               'interactive/True command/cp google-cloud-sdk/306.0.0'}

https://stackoverflow.com/questions/43527820/mime-parsemediatype-fails-on-multipart-boundary gave me enough of a clue to mess with ' -> " (thanks, Python?), and that seems to fix it. See https://play.golang.org/p/TJ5qzwTzSOk.

I made the change on a fork and verified I was able to get past this error. See StephenWithPH@87e3e3e.
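The parsing behaviour can be reproduced with a short standalone program. This is a sketch of the idea only: `parseBoundary` is a hypothetical helper, not the code from the fork, and the blanket quote replacement is an assumption about how the workaround could look.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// parseBoundary extracts the multipart boundary from a Content-Type value.
// When normalize is true, single quotes are rewritten to double quotes
// before parsing (the ' -> " workaround discussed above).
func parseBoundary(contentType string, normalize bool) (string, error) {
	if normalize {
		contentType = strings.ReplaceAll(contentType, "'", `"`)
	}
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return "", err
	}
	return params["boundary"], nil
}

func main() {
	// Header value as captured from gsutil -DD in this comment.
	header := "multipart/related; boundary='===============1523364337061494617=='"

	// As sent: the parser stops at the first '=' inside the value and
	// reports an invalid media parameter.
	_, err := parseBoundary(header, false)
	fmt.Println("as sent, error:", err)

	// Normalized: the value is a valid quoted-string and parses cleanly.
	boundary, err := parseBoundary(header, true)
	fmt.Println("normalized:", boundary, err)
}
```

One caveat with this approach: replacing every apostrophe would also alter a boundary that legitimately contains one, so a narrower rewrite limited to the quotes wrapping the boundary value may be safer.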

However...

gsutil cp ... is now failing at a different point:

Traceback (most recent call last):
  File "/usr/lib/google-cloud-sdk/platform/gsutil/gsutil", line 21, in <module>
    gsutil.RunMain()
  File "/usr/lib/google-cloud-sdk/platform/gsutil/gsutil.py", line 123, in RunMain
    sys.exit(gslib.__main__.main())
  File "/usr/lib/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 429, in main
    return _RunNamedCommandAndHandleExceptions(
  File "/usr/lib/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 767, in _RunNamedCommandAndHandleExceptions
    _HandleUnknownFailure(e)
  File "/usr/lib/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 625, in _RunNamedCommandAndHandleExceptions
    return command_runner.RunNamedCommand(command_name,
  File "/usr/lib/google-cloud-sdk/platform/gsutil/gslib/command_runner.py", line 411, in RunNamedCommand
    return_code = command_inst.RunCommand()
  File "/usr/lib/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 1196, in RunCommand
    self.Apply(_CopyFuncWrapper,
  File "/usr/lib/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1514, in Apply
    self._SequentialApply(func, args_iterator, exception_handler, caller_id,
  File "/usr/lib/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1586, in _SequentialApply
    worker_thread.PerformTask(task, self)
  File "/usr/lib/google-cloud-sdk/platform/gsutil/gslib/command.py", line 2306, in PerformTask
    results = task.func(cls, task.args, thread_state=self.thread_gsutil_api)
  File "/usr/lib/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 778, in _CopyFuncWrapper
    cls.CopyFunc(args,
  File "/usr/lib/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 1059, in CopyFunc
    self.total_bytes_transferred += bytes_transferred
TypeError: unsupported operand type(s) for +=: 'int' and 'NoneType'

This is at https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/commands/cp.py#L1053. I'm working backwards, but it looks like fake-gcs-server doesn't send back the size of the object in its response.

Of note, the object uploads successfully (see logs):

time="2020-08-25T01:02:11Z" level=info msg="172.18.0.3 - - [25/Aug/2020:01:02:11 +0000] \"POST /upload/storage/v1/b/<redacted>/o?alt=json&fields=crc32c%2Cgeneration%2CcustomerEncryption%2Cetag%2Csize%2Cmd5Hash&key=<redacted>&uploadType=multipart HTTP/1.1\" 200 360"

And the object is actually there in fake-gcs-server:

cat fake-gcs/<redacted>/foo.txt 
{"ContentType":"text/plain; charset=us-ascii","ContentEncoding":"","Content":"Zm9vCg==","Crc32c":"liY0ew==","Md5Hash":"07BzhNET7exJ6qYjitX/AA==","ACL":[{"Entity":"projectOwner","EntityID":"","Role":"OWNER","Domain":"","Email":"","ProjectTeam":null}],"Metadata":null,"Created":"2020-08-24T23:41:17.771381Z","Deleted":"0001-01-01T00:00:00Z","Updated":"2020-08-24T23:41:17.771384Z","Generation":0}
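Since the request asks for `size` in its `fields` parameter, one way to picture the missing piece is a response that always carries it. The sketch below is hypothetical (`objectResponse` and `buildResponse` are illustrative names, not fake-gcs-server types). It assumes the stored `Content` value is base64, and it encodes `size` as a JSON string, which is how I understand the GCS JSON API represents it.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// objectResponse is a hypothetical, trimmed-down upload response.
// The ",string" option makes encoding/json emit the integer as a
// quoted string.
type objectResponse struct {
	Name string `json:"name"`
	Size int64  `json:"size,string"`
}

// buildResponse derives the object size from base64-encoded content
// (an assumption about the on-disk format shown above) and marshals
// a response that includes it.
func buildResponse(name, contentB64 string) ([]byte, error) {
	raw, err := base64.StdEncoding.DecodeString(contentB64)
	if err != nil {
		return nil, err
	}
	return json.Marshal(objectResponse{Name: name, Size: int64(len(raw))})
}

func main() {
	// "Zm9vCg==" is the Content value of the stored foo.txt above.
	body, err := buildResponse("foo.txt", "Zm9vCg==")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body)) // {"name":"foo.txt","size":"4"}
}
```

With a non-empty `size` in the response, a client that adds the transferred byte count to a running total would have a number to work with instead of `None`.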

@ekimekim
Contributor

An update: it seems that the second half of @StephenWithPH's comment has been fixed. Fixing the multipart boundary bug is now sufficient for gsutil cp to work, at least for the version I'm using.
I've made a PR with the ' -> " hack.

fsouza added a commit that referenced this issue May 27, 2023
Hoping to reproduce #217. Still need to improve the situation with
examples vs integration tests (see #1168).

5 participants