Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HttpConnection#prepareByteData() : should close bodyStream to reuse socket connection #1232

Closed
ghost opened this issue Jun 13, 2019 · 3 comments
Milestone

Comments

@ghost
Copy link

ghost commented Jun 13, 2019

It is significant when crawling maven repo index,

  • reuse socket: 0.3s/page
  • open a new socket: 1s/page

Will reuse the socket connection

Connection conn = HttpConnection.connect("http://localhost");
conn.execute().parse();
conn.url("http://localhost/index.html");
conn.execute().parse();

Will close and open a new socket connection

Connection conn = HttpConnection.connect("http://localhost");
conn.execute().body();
conn.url("http://localhost/index.html");
conn.execute().body();
  • parse() invoke DataUtil.parseInputStream, which close the bodyStream;
  • body() invoke DataUtil.readToByteBuffer, which not close the bodyStream.

If the InputStream closed & the host not changed, java.net.HttpURLConnection will reuse the socket, though invoked disconnect().


I do the following change:

private void prepareByteData() {
  Validate.isTrue(executed,
      "Request must be executed (with .execute(), .get(), or .post() before getting response body");
  if (byteData == null) {
    Validate.isFalse(inputStreamRead, "Request has already been read (with .parse())");
    try {
      byteData = DataUtil.readToByteBuffer(bodyStream, req.maxBodySize());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } finally {
      inputStreamRead = true;
      ////////////////////////////// start
      try {
        bodyStream.close();
      } catch (IOException e) {
        // no-op
      } finally {
        bodyStream = null;
      }
      ////////////////////////////// end
      safeClose();
    }
  }
}
@ghost
Copy link
Author

ghost commented Jun 13, 2019

a test case

import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class URLConnectionDemo {

  public static void main(String[] args) throws IOException {
    fetch_url("http://localhost/");
    fetch_url("http://localhost/index.html");
  }

  public static void fetch_url(String link) throws IOException {
    URL url = new URL(link);
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    InputStream in = conn.getInputStream();
    Files.copy(in, Paths.get("temp"), StandardCopyOption.REPLACE_EXISTING);
    in.close();
    conn.disconnect();
  }
}

jhy_jsoup_issues_1232_test

File Server: My Java Server https://github.com/h4675/web

run: java wjs.sample.Debug

@ghost
Copy link
Author

ghost commented Jun 13, 2019

a better way: close bodyStream first in safeClose()

private void safeClose() {
  if (bodyStream != null) {
    try {
      bodyStream.close();
    } catch (IOException e) {
      // no-op
    } finally {
      bodyStream = null;
    }
  }
  if (conn != null) {
    conn.disconnect();
    conn = null;
  }
}

@jhy jhy closed this as completed in fb58877 Jul 5, 2019
@jhy jhy added the improvement label Jul 5, 2019
@jhy jhy added this to the 1.12.2 milestone Jul 5, 2019
@jhy
Copy link
Owner

jhy commented Jul 5, 2019

Thanks!

This was referenced Feb 9, 2020
This was referenced Feb 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant