Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Smartly join URLs #101

Merged
merged 2 commits into from
Apr 30, 2024
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Next Next commit
Smartly join URLs
When paginating, some registries return an absolute URL in the Link HTTP
header. This happened on Amazon ECR, and docker_registry2 generated a
bad URI exception when trying to request `https://….com:443https://…`.
The absolute URL was naively appended to the base URL.

URI.join provides a smarter way to concatenate URLs, and behaves pretty
much like `<a href="…">` would in an HTML document. To preserve the path
of the base URL, I forced a trailing slash and made the API paths
relative. Otherwise, the semantic of `/v2` is to go back to the root.
  • Loading branch information
fmang committed Apr 29, 2024
commit 846e8b3d27006f763a657089e61c60a440527e20
36 changes: 20 additions & 16 deletions lib/registry/registry.rb
Original file line number Diff line number Diff line change
@@ -17,7 +17,9 @@ class Registry # rubocop:disable Metrics/ClassLength
# @option options [Hash] :http_options Extra options for RestClient::Request.execute.
def initialize(uri, options = {})
@uri = URI.parse(uri)
@base_uri = "#{@uri.scheme}://#{@uri.host}:#{@uri.port}#{@uri.path}"
@base_uri = +"#{@uri.scheme}://#{@uri.host}:#{@uri.port}#{@uri.path}"
# `URI.join("https://example.com/foo/bar", "v2")` drops `bar` in the base URL. A trailing slash prevents that.
@base_uri << '/' unless @base_uri.end_with? '/'
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, the only change here is ensuring it ends in a /. Is this because beforehand we always did @base_uri + path, and so it always treated it as appending?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clearer, when an URL does not end with a trailing slash, the last part is assumed to be a file, so joining another path to it will make the path relative to the directory and not the file.

To give a more natural example, /foo/index.html joined with style.css would become /foo/style.css, white foo/index/ joined with style.css becomes /foo/index/style.css.

@user = options[:user]
@password = options[:password]
@http_options = options[:http_options] || {}
@@ -49,16 +51,18 @@ def paginate_doget(url)
response = doget(url)
yield response

link_header = response.headers[:link]
break unless link_header
link_header = response.headers[:link] or break
next_url = parse_link_header(link_header)[:next] or break

url = parse_link_header(link_header)[:next]
# The next URL in the Link header may be relative to the request URL, or absolute.
# URI.join handles both cases nicely.
url = URI.join(response.request.url, next_url)
end
end

def search(query = '')
all_repos = []
paginate_doget('/v2/_catalog') do |response|
paginate_doget('v2/_catalog') do |response|
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove the leading / here (and in all other cases)? Is this because it used to just @base_uri + path so it always added it? And thus if the base was http://foo.com/a/b then this would give http://foo.com/a/b/v2/_catalog. But with URI.join, it will treat the leading / as an absolute path, and so unless you remove the leading /, you would end up with http://foo.com/v2/_catalog instead of http://foo.com/a/b/v2/_catalog?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly. In convential URL semantics, paths starting with a / are absolute.

repos = JSON.parse(response)['repositories']
repos.select! { |repo| repo.match?(/#{query}/) } unless query.empty?
all_repos += repos
@@ -75,7 +79,7 @@ def tags(repo, count = nil, last = '', withHashes = false, auto_paginate: false)
query_vars = ''
query_vars = "?#{URI.encode_www_form(params)}" if params.length.positive?

response = doget "/v2/#{repo}/tags/list#{query_vars}"
response = doget "v2/#{repo}/tags/list#{query_vars}"
# parse the response
resp = JSON.parse response
# parse out next page link if necessary
@@ -104,7 +108,7 @@ def tags(repo, count = nil, last = '', withHashes = false, auto_paginate: false)

def manifest(repo, tag)
# first get the manifest
response = doget "/v2/#{repo}/manifests/#{tag}"
response = doget "v2/#{repo}/manifests/#{tag}"
parsed = JSON.parse response.body
manifest = DockerRegistry2::Manifest[parsed]
manifest.body = response.body
@@ -113,7 +117,7 @@ def manifest(repo, tag)
end

def blob(repo, digest, outpath = nil)
blob_url = "/v2/#{repo}/blobs/#{digest}"
blob_url = "v2/#{repo}/blobs/#{digest}"
if outpath.nil?
response = doget(blob_url)
DockerRegistry2::Blob.new(response.headers, response.body)
@@ -127,7 +131,7 @@ def blob(repo, digest, outpath = nil)
end

def manifest_digest(repo, tag)
tag_path = "/v2/#{repo}/manifests/#{tag}"
tag_path = "v2/#{repo}/manifests/#{tag}"
dohead(tag_path).headers[:docker_content_digest]
rescue DockerRegistry2::InvalidMethod
# Pre-2.3.0 registries didn't support manifest HEAD requests
@@ -161,9 +165,9 @@ def digest(image, tag, architecture = nil, os = nil, variant = nil)

def rmtag(image, tag)
# TODO: Need full response back. Rewrite other manifests() calls without JSON?
reference = doget("/v2/#{image}/manifests/#{tag}").headers[:docker_content_digest]
reference = doget("v2/#{image}/manifests/#{tag}").headers[:docker_content_digest]

dodelete("/v2/#{image}/manifests/#{reference}").code
dodelete("v2/#{image}/manifests/#{reference}").code
end

def pull(repo, tag, dir)
@@ -224,15 +228,15 @@ def tag(repo, tag, newrepo, newtag)

raise DockerRegistry2::RegistryVersionException unless manifest['schemaVersion'] == 2

doput "/v2/#{newrepo}/manifests/#{newtag}", manifest.to_json
doput "v2/#{newrepo}/manifests/#{newtag}", manifest.to_json
end

def copy(repo, tag, newregistry, newrepo, newtag); end

# gets the size of a particular blob, given the repo and the content-addressable hash
# usually unneeded, since manifest includes it
def blob_size(repo, blobSum)
response = dohead "/v2/#{repo}/blobs/#{blobSum}"
response = dohead "v2/#{repo}/blobs/#{blobSum}"
Integer(response.headers[:content_length], 10)
end

@@ -287,7 +291,7 @@ def doreq(type, url, stream = nil, payload = nil)
end
response = RestClient::Request.execute(@http_options.merge(
method: type,
url: @base_uri + url,
url: URI.join(@base_uri, url).to_s,
headers: headers(payload: payload),
block_response: block,
payload: payload
@@ -324,7 +328,7 @@ def do_basic_req(type, url, stream = nil, payload = nil)
end
response = RestClient::Request.execute(@http_options.merge(
method: type,
url: @base_uri + url,
url: URI.join(@base_uri, url).to_s,
user: @user,
password: @password,
headers: headers(payload: payload),
@@ -357,7 +361,7 @@ def do_bearer_req(type, url, header, stream = false, payload = nil)
end
response = RestClient::Request.execute(@http_options.merge(
method: type,
url: @base_uri + url,
url: URI.join(@base_uri, url).to_s,
headers: headers(payload: payload, bearer_token: token),
block_response: block,
payload: payload
13 changes: 6 additions & 7 deletions spec/docker_registry2_spec.rb
Original file line number Diff line number Diff line change
@@ -31,12 +31,11 @@
end
end

describe 'search' do
let(:search_hello_world) do
VCR.use_cassette('search/hello_world') { connected_object.search('hello-world') }
describe '#search' do
it 'lists all the repositories matching the query' do
repos = VCR.use_cassette('search/hello_world') { connected_object.search('hello-world') }
expect(repos).to eq %w[hello-world-v1 hello-world-v2 hello-world-v3 hello-world-v4]
end
it { expect { search_hello_world }.not_to raise_error }
it { expect(search_hello_world.size).to eq 2 }
end

describe 'manifest' do
@@ -88,7 +87,7 @@
let(:registry) { DockerRegistry2::Registry.new(uri) }

it 'The @path should be empty' do
expect(registry.instance_variable_get(:@base_uri)).to eq('https://example.com:443')
expect(registry.instance_variable_get(:@base_uri)).to eq('https://example.com:443/')
end
end

@@ -97,7 +96,7 @@
let(:registry) { DockerRegistry2::Registry.new(uri) }

it 'The @path is included' do
expect(registry.instance_variable_get(:@base_uri)).to eq('https://registry.myCompany.com:443/dockerproxy')
expect(registry.instance_variable_get(:@base_uri)).to eq('https://registry.myCompany.com:443/dockerproxy/')
end
end

30 changes: 30 additions & 0 deletions spec/vcr/search/hello_world.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading