Fixes for use with TNG-like simulations #228

Merged (22 commits), Oct 11, 2023

Commits
3905be3  Fix ingesting TNG-like simulations (apontzen, Oct 5, 2023)
d50f4b2  Add r200c, r200m and rvirial properties (apontzen, Oct 5, 2023)
3ed2c21  Clarify log output from tangos write (apontzen, Oct 8, 2023)
86f5def  Clarify how one class is preferred over another when asking for a pro… (apontzen, Oct 9, 2023)
428a494  Fix errors and clarify coding in providing_class (apontzen, Oct 9, 2023)
1cecd1e  Don't import property modules when testing, since they can override e… (apontzen, Oct 9, 2023)
30c9500  Provide option to explain why particular property classes get selecte… (apontzen, Oct 9, 2023)
3e6324d  Better docstring and explanations for property class selection (apontzen, Oct 9, 2023)
4187b7b  Fix pre-commit failures (apontzen, Oct 9, 2023)
eee95be  Delete tutorial files after ingesting them, to save disk space on git… (apontzen, Oct 9, 2023)
57bdbc9  Attempt to fit the integration testing into free github runner's redu… (apontzen, Oct 9, 2023)
d8f668c  Fix syntax errors in build.sh (apontzen, Oct 9, 2023)
e68d9c2  Update integration testing to retain tutorial_changa for crosslinking… (apontzen, Oct 10, 2023)
653578f  Correct paths for test data (apontzen, Oct 10, 2023)
6ff1919  Further fixes to paths for fetching data (apontzen, Oct 10, 2023)
66b8067  Fix confusing output from enumerate_objects (apontzen, Oct 10, 2023)
3bca3a7  Don't scan over files more often than necessary (important for slow f… (apontzen, Oct 10, 2023)
76db6df  Fix issue with new versions of yt? (apontzen, Oct 10, 2023)
fa77756  Fix test where numbers have changed slightly due to different file fi… (apontzen, Oct 10, 2023)
ceb5a40  Fix out of date yt reference (will this break people using yt<4?) (apontzen, Oct 10, 2023)
bd621f9  Now try getting something to work for yt 3 and yt 4 alike (apontzen, Oct 11, 2023)
a887b21  Fix typo in integration-test.yaml and update zenodo record to latest … (apontzen, Oct 11, 2023)
23 changes: 4 additions & 19 deletions .github/workflows/integration-test.yaml
@@ -43,26 +43,9 @@ jobs:
       - name: Install latest yt
         run: python -m pip install yt
 
-      - name: Cache test datasets
-        id: cache-test-datasets
-        uses: actions/cache@v2
-        with:
-          path: |
-            test_tutorial_build/tutorial_*
-            test_tutorial_build/enzo.tinycosmo
-            test_tutorial_build/reference_database.db
-          key: replace-later-with-md5 # need to work out a way to generate a key here at some point if test data changes
-
-      - name: Fetch test datasets
-        if: steps.cache-test-datasets.outputs.cache-hit != 'true'
-        working-directory: test_tutorial_build
-        run: |
-          wget -T 60 -nv ftp://zuserver2.star.ucl.ac.uk/app/tangos/mini_tutorial_test.tar.gz
-          tar -xzvf mini_tutorial_test.tar.gz
-
       - name: Build test database
         working-directory: test_tutorial_build
-        run: bash build.sh
+        run: export INTEGRATION_TESTING=1; bash build.sh
 
       - uses: actions/upload-artifact@v2
         with:
@@ -71,7 +54,9 @@ jobs:
 
       - name: Verify database
         working-directory: test_tutorial_build
-        run: tangos diff data.db reference_database.db --property-tolerance dm_density_profile 1e-2 0
+        run: |
+          wget https://zenodo.org/record/8430190/files/reference_database.db?download=1 -O reference_database.db -nv
+          tangos diff data.db reference_database.db --property-tolerance dm_density_profile 1e-2 0
         # --property-tolerance dm_density_profile here is because if a single particle crosses between bins
         # (which seems to happen due to differing library versions), the profile can change by this much
         #
2 changes: 1 addition & 1 deletion setup.py
@@ -28,7 +28,7 @@
     'pytest >= 5.0.0',
     'webtest >= 2.0',
     'pyquery >= 1.3.0',
-    'pynbody >= 1.2.2',
+    'pynbody >= 1.3.2',
     'yt>=3.4.0',
     'PyMySQL>=1.0.2',
 ]
2 changes: 1 addition & 1 deletion tangos/__init__.py
@@ -20,4 +20,4 @@
 from .core import *
 from .query import *
 
-__version__ = '1.7.1'
+__version__ = '1.8.0'
8 changes: 5 additions & 3 deletions tangos/cached_writer.py
@@ -16,18 +16,20 @@ def create_property(halo, name, prop, session):
 
 def _insert_list_unlocked(property_list):
     session = core.get_default_session()
 
+    number = 0
     for p in property_list:
         if p[2] is not None:
             session.add(create_property(p[0], p[1], p[2], session))
+            number += 1
 
     session.commit()
+    return number
 
 def insert_list(property_list):
     from tangos import parallel_tasks as pt
 
     if pt.backend!=None:
         with pt.ExclusiveLock("insert_list"):
-            _insert_list_unlocked(property_list)
+            return _insert_list_unlocked(property_list)
     else:
-        _insert_list_unlocked(property_list)
+        return _insert_list_unlocked(property_list)
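A note on the return-inside-lock pattern above: returning from within the with pt.ExclusiveLock(...) block is safe, because the context manager still releases the lock on the way out while the property count is handed back to the caller. A minimal, self-contained sketch of that Python behaviour (nothing here is tangos-specific; the names are illustrative):

from contextlib import contextmanager

@contextmanager
def demo_lock():
    # stands in for pt.ExclusiveLock; prints instead of locking
    print("acquired")
    try:
        yield
    finally:
        print("released")

def locked_count(items):
    with demo_lock():
        return len(items)   # the lock is still released after this return

print(locked_count([1, 2, 3]))   # prints: acquired, released, then 3
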
4 changes: 2 additions & 2 deletions tangos/core/dictionary.py
@@ -20,9 +20,9 @@ def __repr__(self):
     def __init__(self, text):
         self.text = text
 
-    def providing_class(self, handler):
+    def providing_class(self, handler, explain=False):
         from .. import properties
-        return properties.providing_class(self.text, handler)
+        return properties.providing_class(self.text, handler, explain)
 
 raise_exception = object()
 
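A hedged usage sketch of the new option (not part of the diff): the flag is threaded through to tangos.properties.providing_class, which can then report why one property class was preferred over another. The property name, the handler argument and the keyword spelling below are assumptions made for illustration only.

from tangos import properties
from tangos.input_handlers.pynbody import GadgetSubfindInputHandler

# Ask which class would compute "r200c" for this handler, and have tangos
# explain the choice (assumes the third argument is exposed as "explain").
cls = properties.providing_class("r200c", GadgetSubfindInputHandler, explain=True)
print(cls)
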
1 change: 1 addition & 0 deletions tangos/input_handlers/caterpillar.py
@@ -12,6 +12,7 @@
 
 class CaterpillarInputHandler(PynbodyInputHandler):
     patterns = ["snapdir_???"]
+    auxiliary_file_patterns = ["halos/halos_???", "halos_???"]
 
     @classmethod
     def _snap_id_from_snapdir_path(cls, path):
10 changes: 7 additions & 3 deletions tangos/input_handlers/finding.py
@@ -65,11 +65,11 @@ def best_matching_handler(cls, basename):
 
     def enumerate_timestep_extensions(self):
         base = os.path.join(config.base, self.basename)
-        extensions = find(basename=base + "/", patterns=self.patterns)
+        extensions = sorted(find(basename=base + "/", patterns=self.patterns))
         logger.info("Enumerate timestep extensions base=%r patterns=%s", base, self.patterns)
         for e in extensions:
-            if self._is_able_to_load(e):
-                yield e[len(base) + 1:]
+            if self._is_able_to_load(self._transform_extension(e)):
+                yield self._transform_extension(e[len(base) + 1:])
             else:
                 logger.info("Could not load %s with class %s", e, self)
 
@@ -78,3 +78,7 @@ def _is_able_to_load(self, fname):
 
         Override in child class to filter the pattern-based file matches"""
         return True
+
+    def _transform_extension(self, extension_name):
+        """Can be overriden by child classes to map from the literal filename discovered to a different name for loading"""
+        return extension_name
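For illustration, a minimal hypothetical subclass using the new hook; it mirrors the Gadget4/TNG-like override added later in this PR, mapping the file name that pattern matching discovers to the name pynbody should actually open.

from tangos.input_handlers.pynbody import PynbodyInputHandler

class MultiFileSnapHandler(PynbodyInputHandler):
    # discover multi-file HDF5 outputs via their first chunk
    patterns = ["snapshot_???.0.hdf5"]

    def _transform_extension(self, extension_name):
        # "snapshot_099.0.hdf5" -> "snapshot_099", so the whole set is loaded
        if extension_name.endswith(".0.hdf5"):
            return extension_name[:-7]
        return extension_name
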
35 changes: 22 additions & 13 deletions tangos/input_handlers/pynbody.py
@@ -229,7 +229,7 @@ def enumerate_objects(self, ts_extension, object_typetag="halo", min_halo_partic
         if self._can_enumerate_objects_from_statfile(ts_extension, object_typetag):
             yield from self._enumerate_objects_from_statfile(ts_extension, object_typetag)
         else:
-            logger.warning("No halo statistics file found for timestep %r",ts_extension)
+            logger.warning("No %s statistics file found for timestep %r", object_typetag, ts_extension)
 
         snapshot_keep_alive = self.load_timestep(ts_extension)
         try:
@@ -260,9 +260,9 @@ def enumerate_objects(self, ts_extension, object_typetag="halo", min_halo_partic
             pass
 
     def get_properties(self):
-        timesteps = list(self.enumerate_timestep_extensions())
-        if len(timesteps)>0:
-            f = self.load_timestep_without_caching(sorted(timesteps)[-1])
+        timesteps = self.enumerate_timestep_extensions()
+        try:
+            f = self.load_timestep_without_caching(next(timesteps))
             if self.quicker:
                 res_kpc = self._estimate_spatial_resolution_quicker(f)
                 res_msol = self._estimate_mass_resolution_quicker(f)
@@ -271,7 +271,7 @@ def get_properties(self):
                 res_msol = self._estimate_mass_resolution(f)
             return {'approx_resolution_kpc': res_kpc, 'approx_resolution_Msol': res_msol}
 
-        else:
+        except StopIteration:
             return {}
 
     def _estimate_spatial_resolution(self, f):
@@ -326,7 +326,7 @@ class GadgetSubfindInputHandler(PynbodyInputHandler):
 
     _property_prefix_for_type = {'halo': 'sub_'}
 
-    _sub_parent_name = 'sub_groupNr'
+    _sub_parent_names = ['sub_groupNr']
 
     _hidden_properties = ['children', 'group_len', 'group_off', 'Nsubs', 'groupNr', 'len', 'off']
 
@@ -381,9 +381,9 @@ def available_object_property_names_for_timestep(self, ts_extension, object_type
             if new_p.startswith("_"):
                 new_p = new_p[1:]
             properties[i] = new_p
-            if p==self._sub_parent_name:
+            if p in self._sub_parent_names:
                 properties[i] = 'parent'
-            if p=='children': # NB 'children' is generated by pynbody for both Subfind and SubfindHDF catalogues
+            if p == 'children': # NB 'children' is generated by pynbody for both Subfind and SubfindHDF catalogues
                 properties[i] = 'child'
 
         properties = [p for p in properties if p not in self._hidden_properties]
@@ -401,15 +401,17 @@ def iterate_object_properties_for_timestep(self, ts_extension, object_typetag, p
             pynbody_properties = h.get_halo_properties(i,with_unit=False)
 
             if k=='parent':
-                adapted_k = self._sub_parent_name
+                for adapted_k in self._sub_parent_names:
+                    if adapted_k in pynbody_properties.keys():
+                        break
             else:
                 adapted_k = pynbody_prefix + k
                 if adapted_k not in pynbody_properties:
                     adapted_k = pynbody_prefix + "_" + k
 
             if adapted_k in pynbody_properties:
                 data = self._resolve_units(pynbody_properties[adapted_k])
-                if adapted_k == self._sub_parent_name and data is not None:
+                if adapted_k in self._sub_parent_names and data is not None:
                     # turn into a link
                     data = proxy_object.IncompleteProxyObjectFromFinderId(data, 'group')
             elif k=='child' and "children" in pynbody_properties:
@@ -425,18 +425,25 @@ def iterate_object_properties_for_timestep(self, ts_extension, object_typetag, p
             yield all_data
 
 class Gadget4HDFSubfindInputHandler(GadgetSubfindInputHandler):
-    patterns = ["snapshot_???.hdf5"]
-    auxiliary_file_patterns =["fof_subhalo_tab_???.hdf5"]
+    patterns = ["snapshot_???.hdf5", "snapshot_???.0.hdf5", "snap_???.hdf5", "snap_???.0.hdf5"]
+    auxiliary_file_patterns =["fof_subhalo_tab_???.hdf5", "fof_subhalo_tab_???.0.hdf5"]
     snap_class_name = "pynbody.snapshot.gadgethdf.GadgetHDFSnap"
     catalogue_class_name = "pynbody.halo.Gadget4SubfindHDFCatalogue"
 
     _property_prefix_for_type = {'halo': 'Subhalo', 'group': 'Group'}
 
-    _sub_parent_name = 'SubhaloGroupNr'
+    _sub_parent_names = ['SubhaloGroupNr', 'SubhaloGrNr']
 
     _hidden_properties = ['Len', 'LenType', 'OffsetType', 'ParentRank', 'RankInGr', 'Nr', 'Ascale', 'FirstSub',
                           'OffsetType']
 
+    def _transform_extension(self, extension_name):
+        if extension_name.endswith(".0.hdf5"):
+            return extension_name[:-7]
+        else:
+            return extension_name
+
+
 
 class GadgetRockstarInputHandler(PynbodyInputHandler):
     patterns = ["snapshot_???"]
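One remark on the get_properties change above: enumerate_timestep_extensions is a generator, so taking a single next() probes one timestep rather than scanning and test-loading every one, which matters on slow filesystems (per the commit message). An illustrative, tangos-free sketch of why, with made-up names:

def enumerate_extensions_slowly():
    # stands in for a handler whose per-timestep check hits a slow filesystem
    for i in range(1000):
        print(f"probing snapshot_{i:03d}")
        yield f"snapshot_{i:03d}"

timesteps = enumerate_extensions_slowly()
first = next(timesteps, None)   # probes only the first timestep
print(first)                    # -> snapshot_000

# the old pattern forced every probe before picking one:
# all_of_them = list(enumerate_extensions_slowly())
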
2 changes: 1 addition & 1 deletion tangos/input_handlers/ramsesHOP.py
@@ -167,7 +167,7 @@ def enumerate_objects(self, ts_extension, object_typetag="halo", min_halo_partic
         if self._can_enumerate_objects_from_statfile(ts_extension, object_typetag):
             yield from self._enumerate_objects_from_statfile(ts_extension, object_typetag)
         else:
-            logger.warning("No halo statistics file found for timestep %r", ts_extension)
+            logger.warning("No %s statistics file found for timestep %r", object_typetag, ts_extension)
 
         try:
             h = self._construct_halo_cat(ts_extension, object_typetag)
12 changes: 9 additions & 3 deletions tangos/input_handlers/yt.py
@@ -90,7 +90,13 @@ class YtChangaAHFInputHandler(YtInputHandler):
     patterns = ["*.00???", "*.00????"]
 
     def _load_halo_cat_without_caching(self, ts_extension, snapshot_file):
-        cat = yt.frontends.ahf.AHFHalosDataset(self._extension_to_filename("halos/"+ts_extension)+".AHF_param",
+        try:
+            # yt 4
+            from yt.frontends.ahf.api import AHFHalosDataset
+        except ImportError:
+            # yt 3
+            from yt.frontends.ahf import AHFHalosDataset
+        cat = AHFHalosDataset(self._extension_to_filename("halos/"+ts_extension)+".AHF_param",
                               hubble_constant = snapshot_file.hubble_constant)
         cat_data = cat.all_data()
         return cat, cat_data
@@ -146,13 +152,13 @@ def _load_halo_cat_without_caching(self, ts_extension, snapshot_file):
         rockfiles = np.array(rockfiles)[sortord]
         timestep_ind = np.argwhere(np.array([s.split('/')[-1] for s in snapfiles])==ts_extension.split('/')[0])[0]
         fnum = int(rockfiles[timestep_ind][0].split('.')[0].split('_')[-1])
-        cat = yt.frontends.rockstar.RockstarDataset(self._extension_to_filename("halos_"+str(fnum)+".0.bin"))
+        cat = yt.load(self._extension_to_filename("halos_"+str(fnum)+".0.bin"))
         cat_data = cat.all_data()
         # Check whether rockstar was run with Behroozi's distribution or Wise's
         if np.any(cat_data["halos","particle_identifier"]<0):
             del cat
             del cat_data
-            cat = yt.frontends.rockstar.RockstarDataset(self._extension_to_filename("halos_"+str(fnum)+".0.bin"))
+            cat = yt.load(self._extension_to_filename("halos_"+str(fnum)+".0.bin"))
             cat.parameters['format_revision'] = 2 #
             cat_data = cat.all_data()
         return cat, cat_data
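Two compatibility patterns appear above; a sketch of both, with an illustrative catalogue filename that is an assumption rather than something from this PR: import the AHF frontend from its yt 4 location with a yt 3 fallback, and let yt.load() auto-detect the Rockstar frontend instead of hard-coding a frontend class whose import path differs across yt releases.

import yt

try:
    # yt 4 location
    from yt.frontends.ahf.api import AHFHalosDataset
except ImportError:
    # yt 3 location
    from yt.frontends.ahf import AHFHalosDataset

# yt.load() picks the appropriate frontend for a Rockstar binary catalogue
cat = yt.load("halos_63.0.bin")   # hypothetical filename
print(cat.all_data())
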