WIP: Factorisation of implementation into modules and layers #135

samoconnor · 2017-12-13T02:06:26Z

Update: Architecture documentation is now more complete: HTTP.jl Internal Architecture and
the new HTTP.open function provides a more general interface for request and/or response body streaming:

HTTP.open("GET", "https://tinyurl.com/bach-cello-suite-1-ogg") do http
    open(`vlc -q --play-and-exit --intf dummy -`, "w") do vlc
        write(vlc, http)
    end
end

Update: Server side is working (but not yet integrated with existing API). See example of using HTTP.listen to create a minimal WebSockets server,
See also example here

This branch started as PR #124 (fixes for parser state machine bugs #125 and #126). It evolved from there in an attempt to address some of the other issues I've found (e.g. #108, #115, #120, #122, #130, #131).

The result is an attempt to:

split functionality into distinct modules and layers,
reduce the size of the interfaces between modules and layers,
reduce the degree of type and data interdependancy between modules and layers.

For example the parse! function signature before and after:

function parse!(r::Union{Request, Response}, parser, bytes, len=length(bytes);
        lenient::Bool=true, host::String="", method::Method=GET,
        maxuri::Int64=DEFAULT_MAX_URI, maxheader::Int64=DEFAULT_MAX_HEADER,
        maxbody::Int64=DEFAULT_MAX_BODY, maintask::Task=current_task(),
        canonicalizeheaders::Bool=true)::Tuple{ParsingErrorCode, Bool, Bool, String}

parseheaders(::Parser, bytes) do h::Pair{String,String} ... -> excess
parsebody(::Parser, bytes) -> data, excess

Another example is the single consolidated retry loop using Base.retry:

abstract type RetryLayer{Next <: Layer} <: Layer end

isrecoverable(e::Base.UVError) = true
isrecoverable(e::Base.DNSError) = true
isrecoverable(e::Base.EOFError) = true
isrecoverable(e::HTTP.StatusError) = e.status < 200 || e.status >= 500
isrecoverable(e::Exception) = false

function request(::Type{RetryLayer{Next}}, a...; maxretries=2, kw...) where Next
    retry(request,
          delays=ExponentialBackOff(n = maxretries),
          check=(s,ex)->(s,isrecoverable(ex)))(Next, a...; kw...)
end

And another example is the seperate layer for StatusError:

abstract type ExceptionLayer{Next <: Layer} <: Layer end

struct StatusError <: Exception
    status::Int16
    response::Messages.Response
end

function request(::Type{ExceptionLayer{Next}}, a...; kw...) where Next
    res = request(Next, a...; kw...)
    if iserror(res)
        throw(StatusError(res.status, res))
    end
    return res
end

I have started writing some architecture notes. This is probably the best place to start to get a feel for the changes in this PR.

One of the outcomes is that cold requests are a bit faster. The command below
was taking about 8 seconds, and now takes about 5.

time julia -e 'using HTTP; @show HTTP.request("GET", "http://192.168.0.239")'

I don't know exactly why it is faster, but I suspect that separating the code
into smaller methods with less arguments and simpler return types makes things
easier for the compiler.

Add mark/reset support to Form <: IO Dont call eof() in body(::IO, ...) if content length is known (eof() can block) Work around JuliaLang/julia#24465 Use non-blocking readavailable() to read serverlog in test/server.jl Hack test/fifobuffer.jl to work with BufferStream/IOBuffer implementation. The BufferStream API does not expose the size of the buffer, so tests that passed a fixed size to FIFOBuffer(n) have been tweaked to pass with BufferStream's automatic buffer size managment behaviour. Add note re @test_broken and https://github.com/kennethreitz/httpbin/issues/340#issuecomment-330176449

Move header canonicalization out of parser.

@nread

move "close(response.body)" from parser to client remove @nread -- redundant now that header size is unrestricted remove unused handlers: onmessagecomplete, onstatus, onmessagebegin, onheaderscomplete

…uliaWeb#126. Replace local `KEY` in `parse!()` with `parser.previous_field`. Add onurlbytes(). Accumulates URL in `parser.valuebuffer`. Use `Ref{String}` to return `extra`/`upgrade` so that caller can tell the difference between upgrade with empty string and no upgrade. Add `s_req_fragment_start` to list of states where the url_mark should be reset to 1.

…ate Parser from Request/Response

Store Vector{Pair} headers in Parser struct. Move cookie processing from parser to client. Move special treatement of multi-line and repeated headers out of main parser function.

@time

…Parser. append!() is replaced by write(). Faster, less allocaiton. unsafe_string(pointer(x.buf), length(x.buf)) is replaced by String(take!(x.buf)) ``` mutable struct s1 buf::Vector{UInt8} end mutable struct s2 buf::IOBuffer end a(x::s1, s) = append!(x.buf, s) a(x::s2, s) = write(x.buf, s) r(x::s1) = unsafe_string(pointer(x.buf), length(x.buf)) r(x::s2) = String(take!(x.buf)) function go(x) inbuf = Vector{UInt8}([0]) for i = 1:100000 inbuf[1] = UInt8(i % 255) a(x, inbuf) end outbuf = String[] for i = 1:1000 push!(outbuf, r(x)) end println(length(outbuf)) end t1() = go(s1(Vector{UInt8}())) t2() = go(s2(IOBuffer())) t1() t2() println("t1, Vector...") @time t1() println("") println("t2, IOBuffer...") @time t2() ``` ``` t1, Vector... 1000 0.101289 seconds (1.12 k allocations: 95.733 MiB, 65.69% gc time) t2, IOBuffer... 1000 0.002676 seconds (2.04 k allocations: 241.281 KiB) ```

… to body in the Parser struct

Explicitly handle data within main parse loop instead of using *_mark to clean up loose ends. Number of places that callbacks are called is reduced. Add `while p <= len` inner loop for URL parsing (follows pattern used for header field and value parsing) Remove on*bytes functions, just `write()` to valuebuffer or fieldbuffer as needed.

Purpose is not clear. Wasn't working anyway in some cases in origin master branch. e.g. In the "Foo: F\01ailure" test case nread was buffer length + 1.

no error. Remove redundant write-back of parser.state.

@debug

with single @debug at top of main loop.

Add to utils.jl: setkey, getkey

fix tests for invalid content length

… calls to parse!() ensure that catch-all return at end of main loop is only used for incomplete messages

functions. Replace multiple return points in main parse!() loop with single return at the bottom. Make s_message_done a loop termination condidion.

(was using string(r) for both). Work around for JuliaLang/MbedTLS.jl#113 Try to rationalise Base.write, Base.string, Base.show and opts.logbody (WIP)

EricForgy · 2018-01-14T17:32:04Z

When I ran test/clients.jl as part of the full test suite, it failed, but when I ran it in isolation, it passed. Is there some non-deterministic test in there? Trying again... 😅

Update: Client test failed again when run all together...

Update^2: Failed again when run in isolation 🤔

Update^3: Looks like a bug in client. Headers are not being populated correctly. Or it could be in the parameters. Parser?

samoconnor · 2018-01-14T22:19:42Z

Is tmpout a path in Linux or something?

function testget(url)
    mktempdir() do d
        cd(d) do
            cmd = `"curl -v -s $url > tmpout 2>&1"`
            cmd = `bash -c $cmd`
            run(cmd)
            return String(read(joinpath(d, "tmpout")))
        end
    end
end

The command curl -v -s $url > tmpout 2>&1 says send curl's STDOUT to a file called tmpout, merging stream 2 (STDERR) into stream 1 (STDOUT).

We can just use Requests because all you seem to want is an independent test of get.

That's fine for windows. The advantages of curl are that it is widely field-validated; it supports fetching multiple urls over the same connection; and it runs in a seperate process.

I'll submit a PR that uses Requests instead of curl. I hope that is alright 🙏

👍

samoconnor · 2018-01-14T22:21:17Z

Update^2: Failed again when run in isolation 🤔

Update^3: Looks like a bug in client. Headers are not being populated correctly. Or it could be in the parameters. Parser?

Please post the full logs whenever a test fails (create a seperate issue on HTTP.jl if you like). Records of intermittent failure are very valuable.

EricForgy · 2018-01-15T01:11:15Z

Please post the full logs whenever a test fails (create a seperate issue on HTTP.jl if you like). Records of intermittent failure are very valuable.

Yes. Sorry. Basically. the header test with parameter Dict("hey" => "dude") was failing with `cannot find key "hey"``. Then I tried again putting the parameters directly in the URL and it was still failing.

URLs are the subset of URIs that specify a means of acessing a resource in addition to specifying the resource name. e.g http://foo.com/bar?a=b is a URL but bar?a=b is not. Changes: - Variables of type URI that must contain a compelte URL renamed uri -> url. - Remove the "URL" constructor for URIs. It seems to have been constructing URI's that were not URLs for some input, which seems wrong. The other "URI" constructors should suffice. - Simplify the main kwargs URI constructor by putting the sanity checks in preconditions. Changes: - Renames HTTP.Request.uri -> HTTP.Request.target - Added "target=" kw arg to request(MessageLayer, ...) to allow setting the target to somthing that cannot be derived from the request URL. - Remove the old behaviour (I assume untested) of handling "CONNECT" as a special case and using the Request URL host:port as the target. I seems that this would have created a request for the server to proxy to itself. - Remove the now redundant option isconnect from the url parser. A client making a CONNECT Request should use target="$host:$port". A server implemetning a CONNECT proxy should use http_parse_host(target)

samoconnor · 2018-01-15T16:48:16Z

HI @EricForgy , re the Dict("hey" => "dude") test, I suspect that was down to dispatch ambiguity between the HTTP.get defined in the old client.jl file and the new HTTP.get defined in HTTP.jl.

samoconnor · 2018-01-15T17:13:18Z

v0.6 win64: 436548 passed, 0 failed, 1 errored, 4 broken.
-- Error is rmdir: Permission denied for tmp files

v0.6 linux all passing.

EricForgy · 2018-01-15T20:17:17Z

Yay! 🎉

This is great progress 🙌

(expr-fu stolen from Test.jl)

that the result can be losslessly transformed back into the input. Don't allow '*' at the start of URI path. JuliaWeb@1226c4b#r26878192 Add sentinal const blank_userinfo to handle test cases with empty '@' userinfo. Remove UF_XXX and UF_XXX_MASK constants. It's simpler to use the state machine constants to keep track of state, and use local variables to keep track of values

…0#section-5.3.4

EricForgy · 2018-01-16T19:42:01Z

I'm trying my best to get the tests to pass on Windows v0.6.2.

@samoconnor I submitted a PR to your PR, but it is still a work in progress, but please have a look at the new testget. I think this captures what you wanted and I can get the tests that use it to pass now. There are still 3 tests failing, but I'm running out of steam tonight 😅

quinnj · 2018-01-16T22:08:11Z

I went ahead and merged. Sorry if that was premature, but the travis queues are downright awful today and I want to do some more testing locally on some related things. I also added @samoconnor as an official collaborator to the repo, so you should be able to make branches off the repo directly/merge/revert etc as needed. Let's get things cleaned up and in a good state wrt to tests and we can do a release.

EricForgy · 2018-01-17T01:42:32Z

Yay! 🎉 🎉 🎉

samoconnor · 2018-01-17T04:53:04Z

Thx @quinnj, I've unsquashed the commit to avoid loosing git blame granularity.

quinnj · 2018-01-17T04:56:27Z

Oh, good call. Sorry about squashing that; it's the default so I was just clicking through things.

samoconnor added 30 commits November 6, 2017 16:48

Rename canonicalize!(::String) -> tocameldash!(::String)

9f01226

Move header canonicalization out of parser.

Remove URI, header and body size limits from parser.

8d8841e

Replace unused parser option "lenient" with global const "strict".

fabde09

Parser simplification

1b4aee0

move "close(response.body)" from parser to client remove @nread -- redundant now that header size is unrestricted remove unused handlers: onmessagecomplete, onstatus, onmessagebegin, onheaderscomplete

Store method, major, minor, url and status in Parser - begin to separ…

2dd90a8

…ate Parser from Request/Response

Use Vector{Pair} instead of Dict for Headers. See JuliaWeb#108.

bb120bd

Store Vector{Pair} headers in Parser struct. Move cookie processing from parser to client. Move special treatement of multi-line and repeated headers out of main parser function.

reduce main parse! function to a single method by keeping a reference…

35dfe80

… to body in the Parser struct

Remove (mostly) unused Parser.nread field

b1eebe8

Purpose is not clear. Wasn't working anyway in some cases in origin master branch. e.g. In the "Foo: F\01ailure" test case nread was buffer length + 1.

Explicitly return HPE_OK (not errno) in Parser in places where there is

7e0d643

no error. Remove redundant write-back of parser.state.

Clean up gotos and main loop condition in parser

2bf357e

Replace @debug(PARSING_DEBUG, ParsingStateCode(p_state)) in every ifelse

602d7ed

with single @debug at top of main loop.

Merge branch 'buffer_stream_branch' into simplify_parser_branch

c3194c0

Add to utils.jl: setkey, getkey

Move header and body procssing out of parser (via callbacks)

770247f

throw message incomplete error from parse()

ac46ca2

fix tests for invalid content length

move upgrade flag to Parser struct so that state is preserved between…

5af3f19

… calls to parse!() ensure that catch-all return at end of main loop is only used for incomplete messages

throw ArgumentError for zero len passed to parse!

175819e

simplify ParsingError display

5d3ffa1

Replace multiple status var returned by parse!() with state accessor

f788921

functions. Replace multiple return points in main parse!() loop with single return at the bottom. Make s_message_done a loop termination condidion.

Merge remote-tracking branch 'origin/master' into simplify_parser_branch

70a8373

replace "while true" read loop in client with "while !eof(socket)" loop

aaf8e5d

throw errors directly from parse!()

8f49465

Call show(r) for verbose logging and write(r) for comms.

5febfb8

(was using string(r) for both). Work around for JuliaLang/MbedTLS.jl#113 Try to rationalise Base.write, Base.string, Base.show and opts.logbody (WIP)

fix parse error when fragmentation splits CRLF

dbb1bae

simple debug macro

9accb0b

function absuri(u::URI, context::URI)

77ba528

parser interface tweaks in server (untested)

ebced40

iso8859_1_to_utf8 per JuliaWeb#141

6d978be

samoconnor added 2 commits January 15, 2018 12:35

typo

aff5cac

samoconnor added 2 commits January 16, 2018 03:51

update tests and compat

6c6dfbb

try to fix Inexact Error in win32

7d92068

samoconnor added 3 commits January 16, 2018 15:37

Update @Ensure macro to print values of failed comparison.

d27163c

(expr-fu stolen from Test.jl)

handle asterix-form Request Target https://tools.ietf.org/html/rfc723…

23dab21

…0#section-5.3.4

KristofferC mentioned this pull request Jan 16, 2018

HTTP 400 response in server side logs JuliaComputing/FemtoCleaner.jl#71

Closed

quinnj merged commit e5d43cc into JuliaWeb:master Jan 16, 2018

This was referenced Jan 17, 2018

Add test for parsing one charactar at a time, fix #125 and #126 #124

Closed

Remove MicroLogging #142

Closed

samoconnor added a commit that referenced this pull request Jan 18, 2018

kw args for body and headers per #135

6bc739f

samoconnor mentioned this pull request Jan 19, 2018

Potential for URI parsing performance improvement. #151

Closed

EricForgy mentioned this pull request Jan 23, 2018

Tag a new release? #168

Closed

samoconnor mentioned this pull request Feb 14, 2018

Use proposed new WebSockets.jl #197

Closed

samoconnor mentioned this pull request Dec 17, 2018

Move AWS4 signing into a separate repo? #355

Closed

ararslan pushed a commit to JuliaCloud/AWSAuth.jl that referenced this pull request Dec 19, 2018

https://github.com/JuliaWeb/HTTP.jl/pull/135#pullrequestreview-88370960

15fab02

WIP: Factorisation of implementation into modules and layers #135

WIP: Factorisation of implementation into modules and layers #135

Uh oh!

Conversation

samoconnor commented Dec 13, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

EricForgy commented Jan 14, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

samoconnor commented Jan 14, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

samoconnor commented Jan 14, 2018

Uh oh!

EricForgy commented Jan 15, 2018

Uh oh!

samoconnor commented Jan 15, 2018

Uh oh!

samoconnor commented Jan 15, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

EricForgy commented Jan 15, 2018

Uh oh!

EricForgy commented Jan 16, 2018

Uh oh!

quinnj commented Jan 16, 2018

Uh oh!

EricForgy commented Jan 17, 2018

Uh oh!

samoconnor commented Jan 17, 2018

Uh oh!

quinnj commented Jan 17, 2018

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

samoconnor commented Dec 13, 2017 •

edited

Loading

EricForgy commented Jan 14, 2018 •

edited

Loading

samoconnor commented Jan 14, 2018 •

edited

Loading

samoconnor commented Jan 15, 2018 •

edited

Loading