Async IO with CLibvenice #15

Open
wants to merge 6 commits into
base: master

robertjpayne

This PR makes PostgreSQL non-blocking so that it plays nicely with Libvenice:

  • Uses async connection
  • Uses async querying

@Danappelxx please review 👍

Contributor

Danappelxx left a comment

This is really great! Sorry it took so long to review. The PR needs a bit of cleanup (not a problem), but I'm not 100% certain the semantics are correct. More in the next comment.

case PGRES_POLLING_OK:
    break loop
case PGRES_POLLING_READING:
    mill_fdwait_(fd, FDW_IN, 15.seconds.fromNow().int64milliseconds, nil)
Contributor

You can use Venice.poll here

Author

Venice.poll doesn't clean the file descriptor when it's done

    mill_fdwait_(fd, FDW_IN, 15.seconds.fromNow().int64milliseconds, nil)
    mill_fdclean_(fd)
case PGRES_POLLING_WRITING:
    mill_fdwait_(fd, FDW_OUT, 15.seconds.fromNow().int64milliseconds, nil)
Contributor

Here too

Author

Venice.poll doesn't clean the file descriptor when it's done

import CLibpq
import CLibvenice
Contributor

CLibvenice -> Venice. I don't think we're using any APIs that Venice doesn't wrap (if we are, should we import both?)

Author

Does Venice have an API to explicitly clean up an fdwait? We have to wait and then immediately clean it up; otherwise libpq crashes when it goes to read.

Contributor

Hmm, doesn't look like it. Venice.poll just wraps fdwait - I guess we can use fdclean from CLibvenice.
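
The wait-then-clean ordering discussed in this thread can be sketched without linking libpq or libvenice. Everything named here is a hypothetical stand-in: `PollStatus` mimics the polling states, `poll` plays the role of the libpq poll call, `wait` the role of `fdwait`, and `clean` the role of `fdclean`. The 15-second deadline from the PR is left out, so treat this as an outline of the control flow only.

```swift
// Hypothetical stand-ins; these are not the real CLibpq or CLibvenice symbols.
enum PollStatus { case ok, reading, writing, active, failed }
enum WaitDirection { case input, output }
struct HandshakeError: Error {}

/// Polls until the handshake completes, cleaning the descriptor right after each wait.
func driveHandshake(
    poll: () -> PollStatus,
    wait: (WaitDirection) -> Void,
    clean: () -> Void
) throws {
    loop: while true {
        switch poll() {
        case .ok:
            break loop            // handshake finished, leave the loop
        case .reading:
            wait(.input)
            clean()               // clean before the next poll touches the descriptor
        case .writing:
            wait(.output)
            clean()
        case .active:
            break                 // leaves only the switch; the loop polls again
        case .failed:
            throw HandshakeError()
        }
    }
}

/// Test double that replays a fixed list of states and records what was called.
final class ScriptedSocket {
    var script: [PollStatus]
    var events: [String] = []
    init(script: [PollStatus]) { self.script = script }
    func poll() -> PollStatus { return script.isEmpty ? .failed : script.removeFirst() }
    func wait(_ direction: WaitDirection) {
        events.append(direction == .input ? "wait-in" : "wait-out")
    }
    func clean() { events.append("clean") }
}

let socket = ScriptedSocket(script: [.writing, .reading, .active, .ok])
do {
    try driveHandshake(poll: socket.poll, wait: socket.wait, clean: socket.clean)
} catch {
    print("handshake failed: \(error)")
}
print(socket.events)
```

Passing the three operations as closures keeps the sketch testable; real code would call the C functions directly.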

return try Result(result)
extension Collection where Iterator.Element == String {

func withUnsafeCStringArray<T>(_ body: (UnsafePointer<UnsafePointer<Int8>?>) throws -> T) rethrows -> T {
Contributor

I think you forgot to use this :)

Author

Haha yes! I need to get rid of it 👍

throw ConnectionError(description: "Connection already opened.")
}

var components = URLComponents()
Contributor

IMO, better style would be

let url: String = {
    var components = URLComponents()
    // ...
    return components.url!.absoluteString
}()

Author

Personally I think that's just a nit; it's a locally scoped var and gets mutated step by step either way…
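
For readers weighing the two styles, here is a minimal sketch of the closure-initialized form. The function name `makeURLString` is hypothetical, and the host, port, and database values are placeholders borrowed from the test URL in this PR; the actual fields set in the PR are elided in the review, so the ones below are assumptions.

```swift
import Foundation

/// Hypothetical helper showing the immediately-invoked closure style.
func makeURLString(host: String, port: Int, database: String) -> String? {
    // `components` is mutable only inside the closure;
    // the surrounding scope sees a single immutable `let`.
    let url: String? = {
        var components = URLComponents()
        components.scheme = "postgres"
        components.host = host
        components.port = port
        components.path = "/" + database
        return components.string
    }()
    return url
}

print(makeURLString(host: "localhost", port: 5432, database: "swift_test") ?? "invalid")
```

The sketch returns an optional instead of force-unwrapping, which is a design choice of the example rather than something settled in this thread.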

throw mostRecentError ?? ConnectionError(description: "Could not get file descriptor.")
}

loop: while true {
Contributor

Loop isn't nested, so I don't think a label is necessary

Author

Yeah I can get rid of that loop label

Author

Actually, a plain break inside a switch only exits the switch, not the loop, hence the label?
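
A small self-contained sketch of the behaviour in question: in Swift, an unlabeled `break` inside a `switch` leaves only the `switch`, so exiting the enclosing `while` from a `case` needs a label (or a flag, or a `return`). The `State` enum and both function names are made up for illustration.

```swift
enum State { case pending, done }

/// Unlabeled `break`: the loop keeps running until the iteration cap stops it.
func iterationsWithUnlabeledBreak(_ states: [State], cap: Int) -> Int {
    var index = 0
    var iterations = 0
    while iterations < cap {
        iterations += 1
        let state: State = index < states.count ? states[index] : .done
        index += 1
        switch state {
        case .pending:
            break
        case .done:
            break          // exits the switch, not the while loop
        }
    }
    return iterations
}

/// Labeled `break`: the loop stops as soon as `.done` is seen.
func iterationsWithLabeledBreak(_ states: [State], cap: Int) -> Int {
    var index = 0
    var iterations = 0
    loop: while iterations < cap {
        iterations += 1
        let state: State = index < states.count ? states[index] : .done
        index += 1
        switch state {
        case .pending:
            continue       // `continue` targets the loop even without a label
        case .done:
            break loop     // exits the while loop
        }
    }
    return iterations
}

print(iterationsWithUnlabeledBreak([.pending, .done], cap: 10)) // prints 10
print(iterationsWithLabeledBreak([.pending, .done], cap: 10))   // prints 2
```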

    mill_fdwait_(fd, FDW_OUT, 15.seconds.fromNow().int64milliseconds, nil)
    mill_fdclean_(fd)
case PGRES_POLLING_ACTIVE:
    break
Contributor

continue instead of break to avoid the label. This might be "clever", though...

Author

Only if the loop label can be fixed

var parameterData = [UnsafePointer<Int8>?]()
var deallocators = [() -> ()]()
defer { deallocators.forEach { $0() } }

for parameter in parameters {
if let parameters = parameters {
Contributor

this function got really long - needs some cleaning

Author

Possibly, it works though? It's still pretty small

do {
connection = try! PostgreSQL.Connection(info: .init(URL(string: "postgres://localhost:5432/swift_test")!))
Contributor

Is there a reason the connection is being reconstructed each time? Is this the right way to use the api now?

Author

Each test should use its own connection; we shouldn't share one because of busy states, dangling state, etc.

Connections are also unrecoverable (you can't "reconnect").

@Danappelxx
Contributor

Two things lead me to think that we're not polling quite right:

  1. In my testing, the blocking version was actually faster than this nonblocking version. In theory, this is not supposed to happen. I tested using wrk -d 10 -t 10 -c 100 http://localhost:8080/api/v1/posts on an endpoint which is 100% database io.

without this pr:

> wrk -d 10 -t 10 -c 100 http://localhost:8080/api/v1/posts
Running 10s test @ http://localhost:8080/api/v1/posts
  10 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    44.86ms    3.71ms 101.55ms   87.31%
    Req/Sec   223.54     22.24   300.00     75.20%
  22326 requests in 10.05s, 8.84MB read
Requests/sec:   2221.08
Transfer/sec:      0.88MB

with this pr:

> wrk -d 10 -t 10 -c 100 http://localhost:8080/api/v1/posts
Running 10s test @ http://localhost:8080/api/v1/posts
  10 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.06ms    5.66ms  82.92ms   87.77%
    Req/Sec   670.91    280.70     1.16k    66.05%
  10877 requests in 10.08s, 3.56MB read
  Socket errors: connect 0, read 2360, write 0, timeout 0
  Non-2xx or 3xx responses: 2459
Requests/sec:   1079.14
Transfer/sec:    361.59KB

  2. In the logs for this test, I found 2425 occurrences of Error: another command is already in progress. This is probably a symptom of the same issue :)

@Danappelxx
Contributor

Danappelxx commented Dec 17, 2016

Also, it didn't actually build for me (3.0.2). Are you using a modified version of libmill/venice? mill_fdclean_ doesn't exist in libvenice (it's just fdclean). Renaming a few function calls fixed it.

@robertjpayne
Author

@Danappelxx regarding performance: I assume your tests reuse a single connection? That is why it appears slower.

Before this PR, a request executed its full SQL command before any other request in the entire process could continue, so there was effectively an implicit lock around all SQL execution.

Now, while libpq is waiting for IO, another coroutine picks up a new request and tries to use the same Postgres connection. That connection is halfway through executing another SQL statement, hence the errors.

For this async approach to work we really need a connection pool: a request checks a connection out and puts it back when it is finished.
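
The check-out/check-in idea can be sketched as a small generic pool. This is an illustration only: `Pool`, `PoolError`, and `withResource` are hypothetical names, not part of this library or Venice. The sketch is not thread-safe and fails fast when the pool is empty, whereas a coroutine-based pool would more likely suspend the caller until a connection is returned.

```swift
enum PoolError: Error { case exhausted }

/// Hypothetical minimal pool: each resource is held by at most one caller at a time.
final class Pool<Resource> {
    private var idle: [Resource]

    init(_ resources: [Resource]) { idle = resources }

    var idleCount: Int { return idle.count }

    /// Removes a resource from the idle list, or throws if none is left.
    func checkOut() throws -> Resource {
        guard let resource = idle.popLast() else { throw PoolError.exhausted }
        return resource
    }

    /// Puts a resource back so another caller can use it.
    func checkIn(_ resource: Resource) { idle.append(resource) }

    /// Borrows a resource for the duration of `body` and returns it even if `body` throws.
    func withResource<T>(_ body: (Resource) throws -> T) throws -> T {
        let resource = try checkOut()
        defer { checkIn(resource) }
        return try body(resource)
    }
}

// Usage, with strings standing in for database connections.
let pool = Pool(["conn-1", "conn-2"])
do {
    let summary = try pool.withResource { outer -> String in
        // A second request arriving while `outer` is busy gets a different connection.
        try pool.withResource { inner -> String in
            "\(outer) and \(inner) in use, \(pool.idleCount) idle"
        }
    }
    print(summary)
} catch {
    print("pool error: \(error)")
}
```

Because a busy connection is never in the idle list, two requests cannot issue commands on the same connection at once, which is the condition behind the "another command is already in progress" errors.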

@Danappelxx
Contributor

Did some more testing and I'm satisfied! Works great with a connection pool as you suggested. Just a little bit of cleanup and it's good to go.

@Danappelxx
Contributor

(merge whenever you're ready)
