Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a URLHelper factory API to ParsingUtils #1421

Merged
merged 5 commits into from
Oct 11, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 0 additions & 1 deletion src/main/java/htsjdk/tribble/util/FTPHelper.java
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,6 @@ public InputStream openInputStream() throws IOException {
}

@Override
@Deprecated
public InputStream openInputStreamForRange(long start, long end) throws IOException {
throw new UnsupportedOperationException("Cannot perform range operations over FTP");
}
Expand Down
19 changes: 12 additions & 7 deletions src/main/java/htsjdk/tribble/util/HTTPHelper.java
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,7 @@ public URL getUrl() {
}

/**
* @return content length of the resource
* @return content length of the resource, or -1 if not available
* @throws IOException
*/
@Override
Expand All @@ -88,30 +88,35 @@ public long getContentLength() throws IOException {

@Override
public InputStream openInputStream() throws IOException {

HttpURLConnection connection = openConnection();
return new WrapperInputStream(connection, connection.getInputStream());
}


/**
* Open an input stream for the requested byte range. Its the client's responsibility to close the stream.
* Open an InputStream to stream a slice (range) of the resource. The host server must support
* range byte requests and return a 206 response code (partial response). If it does not an IOException will
* be thrown.
*
* Its the client's responsibility to close the stream.
*
* @param start start of range in bytes
* @param end end of range ni bytes
* @return
* @throws IOException
*
* @deprecated since 12/10/14 Will be removed in a future release, as is somewhat fragile
* and not used.
*/
@Override
@Deprecated
public InputStream openInputStreamForRange(long start, long end) throws IOException {

HttpURLConnection connection = openConnection();
String byteRange = "bytes=" + start + "-" + end;
connection.setRequestProperty("Range", byteRange);

if (connection.getResponseCode() != 206) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good check to add if this is important. I definitely did not know this detail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If servers don't support range requests they are free according to the spec to return the entire file with response code 200. This is bad when its a 300 gb BAM

String msg = "Error: range requested, expected response code 206 but found " + connection.getResponseCode();
throw new IOException(msg);
}

return new WrapperInputStream(connection, connection.getInputStream());
}

Expand Down
53 changes: 18 additions & 35 deletions src/main/java/htsjdk/tribble/util/ParsingUtils.java
Original file line number Diff line number Diff line change
Expand Up @@ -46,11 +46,10 @@ public class ParsingUtils {

public static final Map<Object, Color> colorCache = new WeakHashMap<>(100);

// HTML 4.1 color table, + orange and magenta
static Map<String, String> colorSymbols = new HashMap();
private static URLHelperFactory urlHelperFactory = RemoteURLHelper::new;

private static final Class defaultUrlHelperClass = RemoteURLHelper.class;
public static Class urlHelperClass = defaultUrlHelperClass;
// HTML 4.1 color table, + orange and magenta
private static Map<String, String> colorSymbols = new HashMap();

static {
colorSymbols.put("white", "FFFFFF");
Expand Down Expand Up @@ -423,50 +422,34 @@ public static boolean resourceExists(String resource) throws IOException{
}

/**
* Return the registered URLHelper, constructed with the provided URL
* @see #registerHelperClass(Class)
* Return a URLHelper from the current URLHelperFactory
* @see #setURLHelperFactory(URLHelperFactory)
*
* @param url
* @return
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update the javadoc here to point to the new method instead of the deprecated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The deprecated (reflection) method has been removed.

public static URLHelper getURLHelper(URL url) {
try {
return getURLHelper(urlHelperClass, url);
} catch (Exception e) {
return getURLHelper(defaultUrlHelperClass, url);
}
}

private static URLHelper getURLHelper(Class helperClass, URL url) {
try {
Constructor constr = helperClass.getConstructor(URL.class);
return (URLHelper) constr.newInstance(url);
} catch (Exception e) {
String errMsg = "Error instantiating url helper for class: " + helperClass;
throw new IllegalStateException(errMsg, e);
}
return urlHelperFactory.getHelper(url);
}

/**
* Register a {@code URLHelper} class to be used for URL operations. The helper
* may be used for both FTP and HTTP operations, so if any FTP URLs are used
* the {@code URLHelper} must support it.
*
* The default helper class is {@link RemoteURLHelper}, which delegates to FTP/HTTP
* helpers as appropriate.
* Set the factory object for providing URLHelpers. {@see URLHelperFactory}.
*
* @see URLHelper
* @param helperClass Class which implements {@link URLHelper}, and have a constructor
* which takes a URL as it's only argument.
* @param factory
*/
public static void registerHelperClass(Class helperClass) {
if (!URLHelper.class.isAssignableFrom(helperClass)) {
throw new IllegalArgumentException("helperClass must implement URLHelper");
//TODO check that it has 1 arg constructor of proper type
public static void setURLHelperFactory(URLHelperFactory factory) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably throw if the value is null.

if(factory == null) {
throw new NullPointerException("Null URLHelperFactory");
}
urlHelperClass = helperClass;
urlHelperFactory = factory;
}

public static URLHelperFactory getURLHelperFactory() {
return urlHelperFactory;
}

/**
*
* Add the {@code indexExtension} to the {@code filepath}, preserving
* query string elements if present. Intended for use where {@code filepath}
* is a URL. Will behave correctly on regular file paths (just add the extension
Expand Down
1 change: 0 additions & 1 deletion src/main/java/htsjdk/tribble/util/RemoteURLHelper.java
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,6 @@ public InputStream openInputStream() throws IOException {
}

@Override
@Deprecated
public InputStream openInputStreamForRange(long start, long end) throws IOException {
return this.wrappedHelper.openInputStreamForRange(start, end);
}
Expand Down
24 changes: 19 additions & 5 deletions src/main/java/htsjdk/tribble/util/URLHelper.java
Original file line number Diff line number Diff line change
Expand Up @@ -23,8 +23,10 @@
import java.net.URL;

/**
* Interface defining a helper class for dealing with URL resources. Purpose of this class is to provide the
* flexibility to use either URLConnection or Apache HTTPClient. Also want to delegate to either HTTP or FTP
* Interface defining a helper class for dealing with URL resources. The purpose of this class is to provide the
* flexibility to use alternative http implementations, for example Apache HTTPClient, and secondly to provide
* a hook for clients to inject custom headers, for example oAuth tokens, into the requests. An instance of
* URLHelper is created for a URL (there is a 1-1 relation between a URL and HRLHelper).
*
* @see HTTPHelper
* @see FTPHelper
Expand All @@ -33,22 +35,34 @@
*/
public interface URLHelper {

/**
* @return URL of the associated resource
*/
URL getUrl();

/**
* @return content length of the resource, or -1 if not available
* @throws IOException
*/
long getContentLength() throws IOException;

/**
* Open an InputStream to stream the contents of the resource
* @return
* @throws IOException
*/
InputStream openInputStream() throws IOException;

/**
* Open an InputStream to stream a slice (range) of the resource.
*
* May throw an OperationUnsupportedException
* @deprecated Will be removed in a future release, as is somewhat fragile
* and not used.
* @param start
* @param end
* @return
* @throws IOException
*/
@Deprecated

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick, unecessary newline here.

InputStream openInputStreamForRange(long start, long end) throws IOException;

public boolean exists() throws IOException;
Expand Down
17 changes: 17 additions & 0 deletions src/main/java/htsjdk/tribble/util/URLHelperFactory.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
package htsjdk.tribble.util;

import java.net.URL;

/**
* A factory for creating {@link URLHelper} instances. The factory contains a single function
* @see #getHelper(URL) which should return a <code>URLHelper</code> instance for the given URL.
*/
public interface URLHelperFactory {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add javadoc explaining what this interface is for and what the contract of getHelper is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added, although not much to say. What else would you like to know here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, thank you, it's basically boilerplate but it lets you know where to look for more information and it makes it clear that someone at least cared enough to write SOMETHING. Maybe it's just me, but I'm always suspicious of library code with 0 documentation.

I think the updated doc in URLHelper saying that there is a 1-1 relationship between urls and helpers is a useful addition. Naively I would have assumed that a helper could be used for many URL's so it's good to spell that out.


/**
* @param url
* @return a {@link URLHelper} object for the given URL
*/
URLHelper getHelper(URL url);

}