Commit 9a26c5e

Rewrite AsyncHttp\Client for cleaner API and Transfer-Encoding support (#113)
Refactors the `AsyncHttp\Client` to simplify both its usage and its internal implementation. This will be helpful for [rewriting URLs in WordPress posts and downloading the related assets](WordPress/data-liberation#74).

## Changes

* Handle errors at each step of the HTTP request lifecycle.
* Drop support for PHP 7.0 and 7.1, since WordPress is dropping that support, too.
* Provide `await_next_event()` as a single, filterable interface for consuming all the HTTP activity. Remove the `onProgress` callback and the various other ways of waiting for information on specific requests.
* Introduce an internal `event_loop_tick()` function that runs all the available non-blocking operations.
* Move all the logic from functions into the `Client` class. It is now less generic, but I'd argue it already wasn't that generic, and at least now we can avoid going back and forth between functions and that class.
* Support `Transfer-Encoding: chunked`, `Transfer-Encoding: gzip`, and `Content-Encoding: gzip` via stream wrappers.
* Remove most of the complexity associated with making PHP streams central to how the library works. In this version, the focus is on the `Client` object, so we no longer have to go out of our way to store data in stream context, struggle with stream filters, pass data through layers of stream wrappers, etc.

This PR also ships an implementation of an HTTP proxy built with this client library. It could come in handy for running an [in-browser Git client](https://adamadam.blog/2024/06/21/cloning-a-git-repository-from-a-web-browser-using-fetch/): https://github.com/WordPress/blueprints-library/blob/http-client-api-refactir/http_proxy.php

## Usage example

```php
$requests = [
    new Request( "https://wordpress.org/latest.zip" ),
    new Request( "https://raw.githubusercontent.com/wpaccessibility/a11y-theme-unit-test/master/a11y-theme-unit-test-data.xml" ),
];

$client = new Client();
$client->enqueue( $requests );

while ( $client->await_next_event() ) {
    $request = $client->get_request();
    echo "Request " . $request->id . ": " . $client->get_event() . " ";
    switch ( $client->get_event() ) {
        case Client::EVENT_BODY_CHUNK_AVAILABLE:
            echo $request->response->received_bytes . "/". $request->response->total_bytes ." bytes received";
            file_put_contents( 'downloads/' . $request->id, $client->get_response_body_chunk(), FILE_APPEND);
            break;
        case Client::EVENT_REDIRECT:
        case Client::EVENT_GOT_HEADERS:
        case Client::EVENT_FINISHED:
            break;
        case Client::EVENT_FAILED:
            echo "– ❌ Failed request to " . $request->url . " – " . $request->error;
            break;
    }
    echo "\n";
}
```

## HTTP Proxy example

```php
// Encode the current request details in a Request object
$requests = [
    new Request(
        $target_url,
        [
            'method' => $_SERVER['REQUEST_METHOD'],
            'headers' => [
                ...getallheaders(),
                // Ensure we won't receive an unsupported content encoding
                // just because the client browser supports it.
                'Accept-Encoding' => 'gzip, deflate',
                'Host' => parse_url($target_url, PHP_URL_HOST),
            ],
            // Naively assume only POST requests have body
            'body_stream' => $_SERVER['REQUEST_METHOD'] === 'POST' ? fopen('php://input', 'r') : null,
        ]
    ),
];

$client = new Client();
$client->enqueue( $requests );

$headers_sent = false;
while ( $client->await_next_event() ) {
    // Pass the response headers and body to the client.
    // Consult the previous example for the details.
}
```

## Future work

* Unit tests.
* Abundant inline documentation with examples and explanations of technical decisions.
* A standard way of piping HTTP responses into the ZIP processor, XML processor, HTML tag processor, etc.
* Find a useful way of treating HTTP error codes such as 404 or 501. Currently these requests are marked as "finished", not "failed", because the connection was successfully created and the server replied with a valid HTTP response. Perhaps it's fine to leave it that way: this could be a lower-level library, and treating error codes as failures could belong to a higher-level client.

cc @dmsnell @maypaw @reimic
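The last future-work item leaves open where HTTP error codes such as 404 or 501 should be interpreted. As a rough sketch of what a higher-level layer might do, the snippet below classifies a status code by its class digit and flags 4xx and 5xx as failures. It is written in JavaScript to match the test server in this commit; `classifyStatus` and `isHttpFailure` are hypothetical names and are not part of `AsyncHttp\Client`.

```javascript
// Hypothetical helper: map an HTTP status code to its class.
// Assumption: a higher-level client would call something like this
// after the low-level client reports a request as "finished".
function classifyStatus(statusCode) {
    if (!Number.isInteger(statusCode) || statusCode < 100 || statusCode > 599) {
        throw new RangeError(`Not an HTTP status code: ${statusCode}`);
    }
    if (statusCode >= 500) return 'server_error';
    if (statusCode >= 400) return 'client_error';
    if (statusCode >= 300) return 'redirect';
    if (statusCode >= 200) return 'success';
    return 'informational';
}

// Hypothetical policy: treat 4xx and 5xx responses as failures even
// though the HTTP exchange itself completed successfully.
function isHttpFailure(statusCode) {
    const kind = classifyStatus(statusCode);
    return kind === 'client_error' || kind === 'server_error';
}

console.log(classifyStatus(404), classifyStatus(501), classifyStatus(200));
console.log(isHttpFailure(404), isHttpFailure(200));
```

Keeping this policy outside the transport layer would let callers that want to inspect a 404 body still receive it as an ordinary response.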
1 parent 239f43e commit 9a26c5e

20 files changed: +1459 / -804 lines changed

chunked_encoding_server.js

Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
+/**
+ * Use with `http_api.php` to test chunked transfer encoding:
+ *
+ * ```php
+ * $requests = [
+ *    new Request( "http://127.0.0.1:3000/", [
+ *        'http_version' => '1.1'
+ *    ] ),
+ *    new Request( "http://127.0.0.1:3000/", [
+ *        'http_version' => '1.0',
+ *        'headers' => [
+ *            'please-redirect' => 'yes',
+ *        ],
+ *    ] ),
+ * ];
+ */
+
+const http = require('http');
+const zlib = require('zlib');
+
+const server = http.createServer((req, res) => {
+    // Check if the client is using HTTP/1.1
+    const isHttp11 = req.httpVersion === '1.1';
+    res.useChunkedEncodingByDefault = false
+
+    // Check if the client accepts gzip encoding
+    const acceptEncoding = req.headers['accept-encoding'];
+    const useGzip = acceptEncoding && acceptEncoding.includes('gzip');
+
+    if (req.headers['please-redirect']) {
+        res.writeHead(301, { Location: req.url });
+        res.end();
+        return;
+    }
+
+    // Set headers for chunked transfer encoding if HTTP/1.1
+    if (isHttp11) {
+        res.setHeader('Transfer-Encoding', 'chunked');
+    }
+
+    res.setHeader('Content-Type', 'text/plain');
+
+    // Create a function to write chunks
+    const writeChunks = (stream) => {
+        stream.write(`<!DOCTYPE html>
+<html lang=en>
+<head>
+<meta charset='utf-8'>
+<title>Chunked transfer encoding test</title>
+</head>\r\n`);
+
+        stream.write('<body><h1>Chunked transfer encoding test</h1>\r\n');
+
+        setTimeout(() => {
+            stream.write('<h5>This is a chunked response after 100 ms.</h5>\n');
+
+            setTimeout(() => {
+                stream.write('<h5>This is a chunked response after 1 second. The server should not close the stream before all chunks are sent to a client.</h5></body></html>\n');
+                stream.end();
+            }, 1000);
+        }, 100);
+    };
+
+    if (useGzip) {
+        res.setHeader('Content-Encoding', 'gzip');
+        const gzip = zlib.createGzip();
+        gzip.pipe(res);
+
+        if (isHttp11) {
+            writeChunks({
+                write(data) {
+                    gzip.write(data);
+                    gzip.flush();
+                },
+                end() {
+                    gzip.end();
+                }
+            });
+        } else {
+            gzip.write('Chunked transfer encoding test\n');
+            gzip.write('This is a chunked response after 100 ms.\n');
+            gzip.write('This is a chunked response after 1 second.\n');
+            gzip.end();
+        }
+    } else {
+        if (isHttp11) {
+            writeChunks(res);
+        } else {
+            res.write('Chunked transfer encoding test\n');
+            res.write('This is a chunked response after 100 ms.\n');
+            res.write('This is a chunked response after 1 second.\n');
+            res.end();
+        }
+    }
+});
+
+const port = 3000;
+server.listen(port, () => {
+    console.log(`Server is listening on http://127.0.0.1:${port}`);
+});
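The test server above can combine `Content-Encoding: gzip` with `Transfer-Encoding: chunked`, so a client has to undo the two layers in the right order: strip the chunk framing first, then decompress. The following is a minimal sketch of that order for a fully buffered body, assuming the standard HTTP/1.1 chunk layout (hex size line, CRLF, data, CRLF, terminated by a zero-size chunk). `encodeChunked` and `decodeChunkedBody` are hypothetical helpers written for this illustration; the PHP client in this commit decodes incrementally via stream wrappers and does not necessarily work this way.

```javascript
const zlib = require('zlib');

// Hypothetical helper: frame a payload as a chunked body,
// `chunkSize` bytes per chunk, ending with the zero-size chunk.
function encodeChunked(payload, chunkSize) {
    const parts = [];
    for (let i = 0; i < payload.length; i += chunkSize) {
        const chunk = payload.subarray(i, i + chunkSize);
        parts.push(Buffer.from(chunk.length.toString(16) + '\r\n', 'latin1'));
        parts.push(chunk);
        parts.push(Buffer.from('\r\n', 'latin1'));
    }
    parts.push(Buffer.from('0\r\n\r\n', 'latin1'));
    return Buffer.concat(parts);
}

// Hypothetical helper: remove the chunk framing from a complete body.
// Chunk extensions are skipped and trailers are ignored in this sketch.
function decodeChunkedBody(raw) {
    const parts = [];
    let offset = 0;
    while (offset < raw.length) {
        // Each chunk starts with "<hex size>[;extensions]\r\n".
        const lineEnd = raw.indexOf('\r\n', offset);
        if (lineEnd === -1) {
            throw new Error('Incomplete chunk header');
        }
        const sizeHex = raw.toString('latin1', offset, lineEnd).split(';')[0].trim();
        if (!/^[0-9a-fA-F]+$/.test(sizeHex)) {
            throw new Error('Invalid chunk size');
        }
        const size = parseInt(sizeHex, 16);
        offset = lineEnd + 2;
        if (size === 0) {
            break; // The zero-size chunk ends the body.
        }
        if (offset + size + 2 > raw.length) {
            throw new Error('Incomplete chunk data');
        }
        parts.push(raw.subarray(offset, offset + size));
        offset += size + 2; // Skip the data and its trailing CRLF.
    }
    return Buffer.concat(parts);
}

// Round trip: gzip first, then chunk; decode in the reverse order.
const original = 'Chunked transfer encoding test\n';
const wire = encodeChunked(zlib.gzipSync(Buffer.from(original, 'utf8')), 7);
const decoded = zlib.gunzipSync(decodeChunkedBody(wire)).toString('utf8');
console.log(decoded === original);
```

Because the chunk size is read from the header line and the data is skipped by length, CRLF bytes that happen to occur inside the compressed payload do not confuse the parser.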

composer.json

Lines changed: 1 addition & 2 deletions
@@ -30,8 +30,7 @@
         "files": [
             "src/WordPress/Blueprints/functions.php",
             "src/WordPress/Zip/functions.php",
-            "src/WordPress/Streams/stream_str_replace.php",
-            "src/WordPress/AsyncHttp/async_http_streams.php"
+            "src/WordPress/Streams/stream_str_replace.php"
         ]
     },
     "autoload-dev": {

http_api.php

Lines changed: 25 additions & 47 deletions
@@ -1,57 +1,35 @@
 <?php
 
 use WordPress\AsyncHttp\Client;
+use WordPress\AsyncHttp\ClientEvent;
 use WordPress\AsyncHttp\Request;
 
 require __DIR__ . '/vendor/autoload.php';
 
-$client = new Client();
-$client->set_progress_callback( function ( Request $request, $downloaded, $total ) {
-    echo "$request->url – Downloaded: $downloaded / $total\n";
-} );
-
-$streams1 = $client->enqueue( [
-    new Request( "https://downloads.wordpress.org/plugin/gutenberg.17.7.0.zip" ),
-    new Request( "https://downloads.wordpress.org/theme/pendant.zip" ),
-] );
-// Enqueuing another request here is instant and won't start the download yet.
-//$streams2 = $client->enqueue( [
-// new Request( "https://downloads.wordpress.org/plugin/hello-dolly.1.7.3.zip" ),
-//] );
+$requests = [
+    new Request( "https://wordpress.org/latest.zip" ),
+    new Request( "https://raw.githubusercontent.com/wpaccessibility/a11y-theme-unit-test/master/a11y-theme-unit-test-data.xml" ),
+];
 
-// Stream a single file, while streaming all the files
-file_put_contents( 'output-round1-0.zip', stream_get_contents( $streams1[0] ) );
-//file_put_contents( 'output-round1-1.zip', stream_get_contents( $streams1[1] ) );
-die();
-// Initiate more HTTPS requests
-$streams3 = $client->enqueue( [
-    new Request( "https://downloads.wordpress.org/plugin/akismet.4.1.12.zip" ),
-    new Request( "https://downloads.wordpress.org/plugin/hello-dolly.1.7.3.zip" ),
-    new Request( "https://downloads.wordpress.org/plugin/hello-dolly.1.7.3.zip" ),
-] );
-
-// Download the rest of the files. Foreach() seems like downloading things
-// sequentially, but we're actually streaming all the files in parallel.
-$streams = array_merge( $streams2, $streams3 );
-foreach ( $streams as $k => $stream ) {
-    file_put_contents( 'output-round2-' . $k . '.zip', stream_get_contents( $stream ) );
+$client = new Client();
+$client->enqueue( $requests );
+
+while ( $client->await_next_event() ) {
+    $request = $client->get_request();
+    echo "Request " . $request->id . ": " . $client->get_event() . " ";
+    switch ( $client->get_event() ) {
+        case Client::EVENT_BODY_CHUNK_AVAILABLE:
+            echo $request->response->received_bytes . "/". $request->response->total_bytes ." bytes received";
+            file_put_contents( 'downloads/' . $request->id, $client->get_response_body_chunk(), FILE_APPEND);
+            break;
+        case Client::EVENT_REDIRECT:
+        case Client::EVENT_GOT_HEADERS:
+        case Client::EVENT_FINISHED:
+            break;
+        case Client::EVENT_FAILED:
+            echo "– ❌ Failed request to " . $request->url . "" . $request->error;
+            break;
+    }
+    echo "\n";
 }
 
-echo "Done! :)";
-
-// ----------------------------
-//
-// Previous explorations:
-
-// Non-blocking parallel processing – the fastest method.
-//while ( $results = sockets_http_response_await_bytes( $streams, 8096 ) ) {
-// foreach ( $results as $k => $chunk ) {
-// file_put_contents( 'output' . $k . '.zip', $chunk, FILE_APPEND );
-// }
-//}
-
-// Blocking sequential processing – the slowest method.
-//foreach ( $streams as $k => $stream ) {
-// stream_set_blocking( $stream, 1 );
-// file_put_contents( 'output' . $k . '.zip', stream_get_contents( $stream ) );
-//}
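The rewritten `http_api.php` replaces per-request streams and progress callbacks with one loop that pulls events for all requests from a single method. The toy model below illustrates that consumption pattern only: each "request" is a scripted list of events and no networking happens. `ToyClient` and its methods are hypothetical stand-ins written in JavaScript, and the round-robin order is an assumption of this sketch, not a statement about how `AsyncHttp\Client` schedules work.

```javascript
// Toy model of a client that exposes all activity through one
// "await next event" call. All names here are hypothetical.
class ToyClient {
    constructor() {
        this.requests = [];
        this.event = null;
        this.request = null;
        this.cursor = 0;
    }

    // Each request carries a scripted queue of events for illustration.
    enqueue(requests) {
        for (const r of requests) {
            this.requests.push({ id: r.id, pending: [...r.events] });
        }
    }

    // Advance to the next event of any request that still has events
    // pending. Returns false once every request is drained.
    awaitNextEvent() {
        const active = this.requests.filter((r) => r.pending.length > 0);
        if (active.length === 0) {
            return false;
        }
        const next = active[this.cursor % active.length];
        this.cursor++;
        this.request = next;
        this.event = next.pending.shift();
        return true;
    }

    getEvent() {
        return this.event;
    }

    getRequest() {
        return this.request;
    }
}

const client = new ToyClient();
client.enqueue([
    { id: 1, events: ['GOT_HEADERS', 'BODY_CHUNK_AVAILABLE', 'FINISHED'] },
    { id: 2, events: ['GOT_HEADERS', 'FAILED'] },
]);

// One loop observes both requests; the caller filters by event type.
const log = [];
while (client.awaitNextEvent()) {
    log.push(`${client.getRequest().id}:${client.getEvent()}`);
}
console.log(log.join(' '));
```

The point of the pattern is that the caller never blocks on one particular request: whichever request has something to report next is surfaced through the same call.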

http_proxy.php

Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
+<?php
+/**
+ * HTTP Proxy implemented using AsyncHttp\Client
+ *
+ * This could be a replacement for the curl-based PHPProxy shipped
+ * in https://github.com/WordPress/wordpress-playground/pull/1546.
+ */
+
+use WordPress\AsyncHttp\Client;
+use WordPress\AsyncHttp\ClientEvent;
+use WordPress\AsyncHttp\Request;
+
+require __DIR__ . '/vendor/autoload.php';
+
+function get_target_url($server_data=null) {
+    if ($server_data === null) {
+        $server_data = $_SERVER;
+    }
+    $requestUri = $server_data['REQUEST_URI'];
+    $targetUrl = $requestUri;
+
+    // Remove the current script name from the beginning of $targetUrl
+    if (strpos($targetUrl, $server_data['SCRIPT_NAME']) === 0) {
+        $targetUrl = substr($targetUrl, strlen($server_data['SCRIPT_NAME']));
+    }
+
+    // Remove the leading slash
+    if ($targetUrl[0] === '/' || $targetUrl[0] === '?') {
+        $targetUrl = substr($targetUrl, 1);
+    }
+
+    return $targetUrl;
+}
+$target_url = get_target_url();
+$host = parse_url($target_url, PHP_URL_HOST);
+$requests = [
+    new Request(
+        $target_url,
+        [
+            'method' => $_SERVER['REQUEST_METHOD'],
+            'headers' => [
+                ...getallheaders(),
+                'Accept-Encoding' => 'gzip, deflate',
+                'Host' => $host,
+            ],
+            'body_stream' => $_SERVER['REQUEST_METHOD'] === 'POST' ? fopen('php://input', 'r') : null,
+        ]
+    ),
+];
+
+$client = new Client();
+$client->enqueue( $requests );
+
+$headers_sent = false;
+while ( $client->await_next_event() ) {
+    $request = $client->get_request();
+    switch ( $client->get_event() ) {
+        case Client::EVENT_GOT_HEADERS:
+            http_response_code($request->response->status_code);
+            foreach ( $request->response->get_headers() as $name => $value ) {
+                if(
+                    $name === 'transfer-encoding' ||
+                    $name === 'set-cookie' ||
+                    $name === 'content-encoding'
+                ) {
+                    continue;
+                }
+                header("$name: $value");
+            }
+            $headers_sent = true;
+            break;
+        case Client::EVENT_BODY_CHUNK_AVAILABLE:
+            echo $client->get_response_body_chunk();
+            break;
+        case Client::EVENT_FAILED:
+            if(!$headers_sent) {
+                http_response_code(500);
+                echo "Failed request to " . $request->url . "" . $request->error;
+            }
+            break;
+        case Client::EVENT_REDIRECT:
+        case Client::EVENT_FINISHED:
+            break;
+    }
+    echo "\n";
+}
+
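The proxy above skips `transfer-encoding`, `content-encoding`, and `set-cookie` when relaying upstream headers. A plausible reason for the first two (an assumption, not stated in the commit) is that the client hands over an already decoded body, so repeating those headers would describe bytes the browser never receives. The sketch below shows the same filter in JavaScript; `filterUpstreamHeaders` is a hypothetical name. Unlike the PHP code, which compares against lowercase names directly, this version lowercases each name first, since HTTP header names are case-insensitive.

```javascript
// Headers the proxy in this commit does not relay to the browser.
const SKIPPED_HEADERS = new Set([
    'transfer-encoding',
    'set-cookie',
    'content-encoding',
]);

// Hypothetical helper: return a copy of `headers` without the skipped
// names, matching case-insensitively.
function filterUpstreamHeaders(headers) {
    const forwarded = {};
    for (const [name, value] of Object.entries(headers)) {
        if (SKIPPED_HEADERS.has(name.toLowerCase())) {
            continue;
        }
        forwarded[name] = value;
    }
    return forwarded;
}

const upstream = {
    'Content-Type': 'text/plain',
    'Content-Encoding': 'gzip',
    'Transfer-Encoding': 'chunked',
    'Set-Cookie': 'session=abc',
    'Cache-Control': 'no-store',
};
console.log(filterUpstreamHeaders(upstream));
```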
