Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[yt-4.0] check for bbox before reverting to all_data #2634

Merged
merged 2 commits into from
Jun 23, 2020
Merged

[yt-4.0] check for bbox before reverting to all_data #2634

merged 2 commits into from
Jun 23, 2020

Conversation

AshKelly
Copy link
Member

@AshKelly AshKelly commented Jun 12, 2020

PR Summary

Prior to this PR the code says if the edges of the selected region are equal to the domain edges then we need to return every particle stored on disk.

This is not true if a bounding box has been applied. I've added a check for a _domain_override

This fixes the example I posted in #2612

PR Checklist

  • Code passes flake8 checker
  • New features are documented, with docstrings and narrative docs
  • Adds a test for any bugs fixed. Adds tests for new features.

@AshKelly AshKelly added bug yt-4.0 feature targeted for the yt-4.0 release labels Jun 12, 2020
@AshKelly
Copy link
Member Author

Thanks for the help with thinking this through @munkm

@AshKelly AshKelly requested review from matthewturk and munkm June 12, 2020 14:50
Copy link
Member

@matthewturk matthewturk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This absolutely makes sense. Great catch. I believe this is the only place this gets checked, and everything else just looks at is_all_data.

@AshKelly
Copy link
Member Author

That's great news! So now this example,

import numpy as np
import yt

n_ref = 32
fname = "../test_data/GadgetDiskGalaxy/snapshot_200.hdf5"

ptypes = ['PartType1']
left, right = (
    np.array([28007.81291869, 22181.34092477, 28461.73994773]),
    np.array([33147.21291312, 33320.7409192 , 33601.13994216]))

# Test a GadgetDiskGalaxy snapshot
bbox1 = [
    [left[0], right[0]],
    [left[1], right[1]],
    [left[2], right[2]],
]
bbox2 = None


for bbox in [bbox1, bbox2]:
    print("bbox:", bbox)
    ds = yt.load(fname, bounding_box=bbox)
    ds.index

    print(left, right)
    center = None
    _data_source = ds.region(center, left, right)

    positions = []
    for ptype in ptypes:
        field = (ptype, 'particle_position')
        for chunk in _data_source.chunks([field], "io"):
            positions.append(
                chunk[field].to("code_length").d
            )
    positions = np.concatenate(positions)
    print("Total num particles:", 4786616)
    print("Loaded:", positions.shape)

    mask = (
        (positions[:, 0] > left[0]) &
        (positions[:, 0] < right[0]) &
        (positions[:, 1] > left[1]) &
        (positions[:, 1] < right[1]) &
        (positions[:, 2] > left[2]) &
        (positions[:, 2] < right[2])
    )
    print("In region:", np.sum(mask))

outputs:

bbox: [[28007.81291869, 33147.21291312], [22181.34092477, 33320.7409192], [28461.73994773, 33601.13994216]]
yt : [INFO     ] 2020-06-12 15:52:29,466 Calculating time from 3.448e-01 to be 1.108e+17 seconds
yt : [INFO     ] 2020-06-12 15:52:29,471 Assuming length units are in kpc/h (comoving)
yt : [INFO     ] 2020-06-12 15:52:29,565 Parameters: current_time              = 1.1075810732534835e+17 s
yt : [INFO     ] 2020-06-12 15:52:29,565 Parameters: domain_dimensions         = [1 1 1]
yt : [INFO     ] 2020-06-12 15:52:29,565 Parameters: domain_left_edge          = [28007.81291869 22181.34092477 28461.73994773]
yt : [INFO     ] 2020-06-12 15:52:29,565 Parameters: domain_right_edge         = [33147.21291312 33320.7409192  33601.13994216]
yt : [INFO     ] 2020-06-12 15:52:29,565 Parameters: cosmological_simulation   = 1
yt : [INFO     ] 2020-06-12 15:52:29,565 Parameters: current_redshift          = 1.8999965286929705
yt : [INFO     ] 2020-06-12 15:52:29,566 Parameters: omega_lambda              = 0.721
yt : [INFO     ] 2020-06-12 15:52:29,566 Parameters: omega_matter              = 0.279
yt : [INFO     ] 2020-06-12 15:52:29,566 Parameters: omega_radiation           = 0.0
yt : [INFO     ] 2020-06-12 15:52:29,566 Parameters: hubble_constant           = 0.7
yt : [INFO     ] 2020-06-12 15:52:29,586 Allocating for 1.191e+07 particles
Loading particle index: 100%|████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 297.77it/s]
[28007.81291869 22181.34092477 28461.73994773] [33147.21291312 33320.7409192  33601.13994216]
('domain_override:', True)
Total num particles: 4786616
Loaded: (2475166, 3)
In region: 1898397
bbox: None
yt : [INFO     ] 2020-06-12 15:52:31,530 Calculating time from 3.448e-01 to be 1.108e+17 seconds
yt : [INFO     ] 2020-06-12 15:52:31,533 Assuming length units are in kpc/h (comoving)
yt : [INFO     ] 2020-06-12 15:52:31,614 Parameters: current_time              = 1.1075810732534835e+17 s
yt : [INFO     ] 2020-06-12 15:52:31,615 Parameters: domain_dimensions         = [1 1 1]
yt : [INFO     ] 2020-06-12 15:52:31,615 Parameters: domain_left_edge          = [0. 0. 0.]
yt : [INFO     ] 2020-06-12 15:52:31,615 Parameters: domain_right_edge         = [64000. 64000. 64000.]
yt : [INFO     ] 2020-06-12 15:52:31,615 Parameters: cosmological_simulation   = 1
yt : [INFO     ] 2020-06-12 15:52:31,615 Parameters: current_redshift          = 1.8999965286929705
yt : [INFO     ] 2020-06-12 15:52:31,615 Parameters: omega_lambda              = 0.721
yt : [INFO     ] 2020-06-12 15:52:31,615 Parameters: omega_matter              = 0.279
yt : [INFO     ] 2020-06-12 15:52:31,615 Parameters: omega_radiation           = 0.0
yt : [INFO     ] 2020-06-12 15:52:31,615 Parameters: hubble_constant           = 0.7
yt : [INFO     ] 2020-06-12 15:52:31,649 Allocating for 1.191e+07 particles
Loading particle index: 100%|████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 318.38it/s]
[28007.81291869 22181.34092477 28461.73994773] [33147.21291312 33320.7409192  33601.13994216]
('domain_override:', False)
Total num particles: 4786616
Loaded: (1898397, 3)
In region: 1898397

(I should note this is the second time I ran this after deleting the .ewah file. Which is great news, so now the bbox'd dataset loads up the particle index and uses it correctly)

The only other things I'm unsure about is- the bbox version does load slightly more particles than the none bbox in the exact same region. Would that be something we expect?

@matthewturk
Copy link
Member

Hm, I'm not sure. If anything I'd expect it to go the opposite way -- because regions will select particles whose smoothing length intrudes. But in this case, your bbox should filter those out.

@AshKelly
Copy link
Member Author

AshKelly commented Jun 12, 2020

I'm actually only selecting PartType1 to avoid smoothing length differences at the moment. With no bbox every particle yt selects is inside the region,

Loaded: (1898397, 3)
In region: 1898397

but, with the bounding_box it returns too many (but it does return all of the ones we expected),

Loaded: (2475166, 3)
In region: 1898397

Copy link
Member

@munkm munkm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AHhhhh great catch. I'm glad you added the inline comment too; it's helpful here.

@AshKelly
Copy link
Member Author

AshKelly commented Jun 12, 2020

Aha! I know why the bbox leads to a different number of particles.

It's something to do with the periodicity. If I set ds.periodicity = [False]*3 for the bbox dataset then the selector only selects the 1898397 particles which I expect it to. However, with periodicity on, we select 2475166 particles.

I'm not entirely convinced that the periodicity works when a bounding box is applied.... edit: I'm pretty confident it doesn't!

@munkm munkm changed the base branch from yt-4.0 to master June 22, 2020 15:14
@AshKelly
Copy link
Member Author

Assuming the tests pass I think this is good to go. I am still concerned about periodicity and bounding boxes but that might be better in a separate PR.

@munkm munkm merged commit 0793d3e into yt-project:master Jun 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug yt-4.0 feature targeted for the yt-4.0 release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants