Improve format identification #719
@@ -7,6 +7,8 @@
 from astrodendro import Dendrogram
 from ..data import Data

+from .gridded import is_fits, is_hdf5
+
 __all__ = ['load_dendro']


@@ -34,3 +36,69 @@ def load_dendro(file):
     im = Data(intensity=dg.data, structure=dg.index_map)
     im.join_on_key(dendro, 'structure', dendro.pixel_component_ids[0])
     return [dendro, im]
+
+
+def is_dendro(file, **kwargs):
+
+    if is_hdf5(file):
+
+        import h5py
+
+        f = h5py.File(file, 'r')
+
+        return 'data' in f and 'index_map' in f and 'newick' in f
+
+    elif is_fits(file):
+
+        from ...external.astro import fits
+
+        hdulist = fits.open(file)
+
+        # In recent versions of Astropy, we could do 'DATA' in hdulist etc. but
+        # this doesn't work with Astropy 0.3, so we use the following method
+        # instead:
+        try:
+            hdulist['DATA']

Minor nit: does this pattern load data from disk? If so, it's an overly expensive test

I don't think so but I'll do some tests.

No, this does not read any data from the disk, since

Cool -- I also checked to see what happens when memmap=False, but hdulist[label] still doesn't seem to read any data (until

Ah interesting, I always assumed that with memmap=False the whole file was read immediately

+            hdulist['INDEX_MAP']
+            hdulist['NEWICK']
+        except KeyError:
+            pass  # continue
+        else:
+            return True
+
+        # For older versions of astrodendro, the HDUs did not have names
+
+        # Here we use heuristics to figure out if this is likely to be a
+        # dendrogram. Specifically, there should be three HDU extensions.
+        # The primary HDU should be empty, HDU 1 and HDU 2 should have
+        # matching shapes, and HDU 3 should have a 1D array. Also, if the
+        # HDUs do have names then this is not a dendrogram since the old
+        # files did not have names
+
+        # This branch can be removed once we think most dendrogram files
+        # will have HDU names.
+
+        if len(hdulist) != 4:
+            return False
+
+        if hdulist[1].name != '' or hdulist[2].name != '' or hdulist[3].name != '':
+            return False
+
+        if hdulist[0].data is not None:
+            return False
+
+        if hdulist[1].data is None or hdulist[2].data is None or hdulist[3].data is None:
+            return False
+
+        if hdulist[1].data.shape != hdulist[2].data.shape:
+            return False
+
+        if hdulist[3].data.ndim != 1:
+            return False
+
+        # We're probably ok, so return True
+        return True
+
+    else:
+
+        return False
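
The fallback heuristic for unnamed HDUs can be tried out without any FITS files. The following is a minimal sketch, not the PR's code: `FakeHDU` and `looks_like_legacy_dendro` are hypothetical names, and each stand-in object carries only a `name` and a `shape` in place of a real HDU's `.name` and `.data`.

```python
from collections import namedtuple

# Hypothetical stand-in for a FITS HDU: `shape` is None for an empty HDU,
# otherwise the shape tuple of its data array.
FakeHDU = namedtuple('FakeHDU', ['name', 'shape'])


def looks_like_legacy_dendro(hdus):
    """Mirror the unnamed-HDU heuristic from the diff on stand-in objects."""
    # Primary HDU plus three extensions
    if len(hdus) != 4:
        return False
    # Old files did not name their extension HDUs
    if any(hdu.name != '' for hdu in hdus[1:]):
        return False
    # The primary HDU should be empty
    if hdus[0].shape is not None:
        return False
    # All three extensions must hold data
    if any(hdu.shape is None for hdu in hdus[1:]):
        return False
    # HDU 1 and HDU 2 should have matching shapes
    if hdus[1].shape != hdus[2].shape:
        return False
    # HDU 3 should be a 1D array
    return len(hdus[3].shape) == 1


legacy = [FakeHDU('PRIMARY', None), FakeHDU('', (4, 5)),
          FakeHDU('', (4, 5)), FakeHDU('', (120,))]
named = [FakeHDU('PRIMARY', None), FakeHDU('DATA', (4, 5)),
         FakeHDU('INDEX_MAP', (4, 5)), FakeHDU('NEWICK', (120,))]

print(looks_like_legacy_dendro(legacy))
print(looks_like_legacy_dendro(named))
```

In the diff itself, files with named HDUs are accepted earlier by the `try`/`except KeyError` branch, so the heuristic only needs to recognise the older unnamed layout.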

@@ -5,7 +5,7 @@
 from ...utils import file_format
 from ..coordinates import coordinates_from_header

-from .helpers import set_default_factory, __factories__
+from .helpers import __factories__

 __all__ = ['is_casalike', 'gridded_data', 'casalike_cube']

@@ -71,10 +71,8 @@ def is_gridded_data(filename, **kwargs):

 gridded_data.label = "FITS/HDF5 Image"
 gridded_data.identifier = is_gridded_data
+gridded_data.priority = 2

Ah I see, we are not always adding 'item' namedtuples to the factory registry, so

Yes, it was for backward-compatibility in case anyone was defining it like us here. But then it's probably true that most other people are using the decorator (in which case

That sounds like a good idea in a followup PR

 __factories__.append(gridded_data)
-set_default_factory('fits', gridded_data)
-set_default_factory('hd5', gridded_data)
-set_default_factory('hdf5', gridded_data)


 def casalike_cube(filename, **kwargs):

when is priority not defined in a namedtuple?
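
One way to tolerate factories that were registered without a `priority` attribute, the case this question asks about, is to fall back to a default when ranking candidates. This is a minimal sketch under assumptions: `pick_factory`, the default of 0, and the two toy factories are hypothetical and not taken from the PR. Only the `identifier` and `priority` attribute names come from the diff.

```python
def pick_factory(factories, filename):
    """Return the highest-priority factory whose identifier accepts filename."""
    candidates = [f for f in factories if f.identifier(filename)]
    if not candidates:
        return None
    # getattr supplies an assumed default for factories without a priority
    return max(candidates, key=lambda f: getattr(f, 'priority', 0))


def generic_image(filename):
    return 'generic:' + filename


# Registered the old way: an identifier but no priority attribute
generic_image.identifier = lambda filename: filename.endswith('.fits')


def dendrogram(filename):
    return 'dendro:' + filename


dendrogram.identifier = lambda filename: filename.endswith('.fits')
dendrogram.priority = 10

chosen = pick_factory([generic_image, dendrogram], 'cube.fits')
print(chosen.__name__)
```

With a default in place, plain functions appended directly to the registry keep working alongside entries that define a priority explicitly.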