add warning around opening suspicious (non-.adam extension) ADAM files #522

Closed
ryan-williams opened this issue Dec 16, 2014 · 3 comments

Comments

@ryan-williams
Member

Porting this over from guacamole #266:

I just got bitten by trying to open a path that was a directory (I meant to pass a BAM); the failure mode was not very clear:

Exception in thread "Driver" java.io.IOException: Could not read footer: java.lang.RuntimeException: hdfs://demeter-nn1.demeter.hpc.mssm.edu:8020/user/willir31/data/set3/normal/set3.normal.fq/part-01487 is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [67, 68, 67, 10]
    at parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:238)
    at parquet.hadoop.ParquetFileReader.readAllFootersInParallelUsingSummaryFiles(ParquetFileReader.java:179)
    at parquet.hadoop.ParquetInputFormat.getFooters(ParquetInputFormat.java:389)
    at parquet.hadoop.ParquetInputFormat.getFooters(ParquetInputFormat.java:361)
    at parquet.hadoop.ParquetInputFormat.getSplits(ParquetInputFormat.java:245)

From digging around, #190 seems to imply that there is a decent amount of convention behind assuming ADAM parquet files will have an .adam extension; any interest in:

  • requiring presumed-ADAM files to have a .adam extension (throwing an Exception if the extension is not recognized)
  • printing a warning
  • Arun suggested: catching such an Exception and attempting to throw a new Exception with a better inferred message, e.g. inspecting whether the failing path is actually a directory and making this explicit in the re-thrown exception?
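
The third option could be sketched roughly as follows. This is an illustration only: `FooterErrors` and `describe` are hypothetical names, and it uses `java.nio.file` on a local filesystem as a stand-in for the Hadoop filesystem calls the real load path would need.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of ADAM or Parquet.
class FooterErrors {
    /** Wraps a footer-read failure in a new exception that says what looks wrong with the path. */
    static IOException describe(Path path, IOException cause) {
        String hint;
        if (Files.isDirectory(path)) {
            // The case from this issue: a directory was passed where a single file was meant.
            hint = path + " is a directory; was a single file (e.g. a BAM) intended?";
        } else if (!path.toString().endsWith(".adam")) {
            hint = path + " does not end in .adam, so it may not be an ADAM Parquet file";
        } else {
            hint = "could not read " + path + " as Parquet";
        }
        // Keep the original exception as the cause so the Parquet stack trace is not lost.
        return new IOException(hint, cause);
    }
}
```

A caller would catch the `IOException` from the Parquet read and throw `FooterErrors.describe(path, e)` instead.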
@fnothaft
Member

+1, I would be interested. I like the idea of moving to the conventions from #190, with the slight change of .align.adam to .read.adam.

@ryan-williams
Member Author

How about I just start by throwing an exception if we get to the "load as ADAM" code-path and the path doesn't end in .adam?
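
A minimal sketch of that check, assuming the path is available as a plain string at that point. `AdamPathCheck` and `requireAdamExtension` are hypothetical names rather than anything in ADAM, and the sketch is in Java rather than ADAM's Scala.

```java
// Hypothetical guard, not ADAM's actual API.
class AdamPathCheck {
    /** Throws if the path does not end in .adam (trailing slashes are ignored). */
    static void requireAdamExtension(String path) {
        // ADAM Parquet output is a directory, so tolerate a trailing "/" on the path.
        String trimmed = path.replaceAll("/+$", "");
        if (!trimmed.endsWith(".adam")) {
            throw new IllegalArgumentException(
                "Refusing to load " + path + " as ADAM: expected a path ending in .adam");
        }
    }
}
```

Calling `AdamPathCheck.requireAdamExtension("set3.normal.fq")` would fail fast with a message naming the path, before any Parquet footer is read.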

ryan-williams added a commit to ryan-williams/adam that referenced this issue Dec 16, 2014
@fnothaft
Member

Cool by me.
