Add custom samplers, better collection, and better plotting to sinter #804

Merged
merged 27 commits into from
Sep 10, 2024


Strilanc
Collaborator

@Strilanc Strilanc commented Jul 27, 2024

  • Add sinter.Sampler and sinter.CompiledSampler classes
    • They can go anywhere a Decoder would go, but they are responsible for all parts of the sampling instead of only prediction
  • Add a new default sampler perfectionist, which discards anything with detection events and predicts the observables are not flipped
  • Improve the layout of the progress printouts shown while collect is running
  • Sinter decoders can now flag that they want to discard shots by adding an extra byte to the returned observable data, with 0 meaning keep and not-0 meaning discard
  • Change how sinter collect distributes work
    • Workers are now distributed as widely as possible, instead of all on one task
    • Workers are now never switched between tasks until their current task is done
  • Add sinter plot --point_label_func argument for drawing text next to data points
  • Augment sinter plot --group_func to support dictionaries with special keys controlling precise grouping behaviors
    • If group_func returns a dict with a "color" key, all items with the same "color" value are drawn with the same color
    • If group_func returns a dict with a "linestyle" key, all items with the same "linestyle" value are drawn with the same linestyle
    • If group_func returns a dict with a "marker" key, all items with the same "marker" value are drawn with the same marker
    • If group_func returns a dict with a "label" key, this forces the label shown in the legend
    • If group_func returns a dict with an "order" key, this takes priority for ordering the legend
  • sinter collect --processes is no longer required (defaults to "auto")
  • sinter plot --show is no longer required (the plot is shown by default; specifying --out suppresses the display unless --show is also given)
  • Group some of sinter's code into private subpackages
  • Show traditional error bars instead of a filled region for high/low fit when only one data point is present
  • Add sinter plot --preprocess_stats_func
  • Add sinter.TaskStats.with_edits
  • Add safety error when adding stats that have equal strong ids but differing identifying information (json_metadata or decoder)
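
As a rough illustration of the discard convention and the perfectionist behavior listed above, the following NumPy sketch builds bit-packed observable predictions with one extra trailing byte per shot. The function names (`perfectionist_predict`, `split_discard_flags`) and the exact array layout are assumptions made for illustration; they are not taken from sinter's implementation.

```python
import numpy as np


def perfectionist_predict(dets_bit_packed: np.ndarray, num_observables: int) -> np.ndarray:
    """Hypothetical sketch: predict 'no flip' and flag any shot with a detection event."""
    num_shots = dets_bit_packed.shape[0]
    num_obs_bytes = (num_observables + 7) // 8
    # One extra trailing byte per shot carries the discard flag.
    out = np.zeros((num_shots, num_obs_bytes + 1), dtype=np.uint8)
    # Flag byte: 0 means keep, not-0 means discard.
    out[:, -1] = np.any(dets_bit_packed != 0, axis=1)
    return out


def split_discard_flags(predictions: np.ndarray, num_observables: int):
    """Hypothetical helper: separate observable bytes from the optional flag byte."""
    num_obs_bytes = (num_observables + 7) // 8
    extra = predictions.shape[1] - num_obs_bytes
    if extra == 0:
        # No flag byte was returned, so every shot is kept.
        return predictions, np.zeros(predictions.shape[0], dtype=bool)
    if extra == 1:
        return predictions[:, :num_obs_bytes], predictions[:, -1] != 0
    raise ValueError("expected at most one extra byte per shot")


# Three shots, two bytes of bit-packed detection data each; only shot 1 has an event.
dets = np.array([[0, 0], [0b100, 0], [0, 0]], dtype=np.uint8)
preds = perfectionist_predict(dets, num_observables=1)
obs, discard = split_discard_flags(preds, num_observables=1)
```

Under these assumptions, a caller that only understands the old format can detect the extra byte by comparing the returned width against `ceil(num_observables / 8)`.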

Some of the sampler design is adapted from @inmzhang's design in #735

Fixes #774

Fixes #682

Fixes #392

Strilanc added 5 commits July 26, 2024 17:11
- Improve output printing to be in a table
- Change work distribution strategy to go wide instead of deep, and to avoid retargeting workers
- Added key property support to `sinter plot --group_func`
    - Activate this functionality by evaluating to a dict with certain keys
    - The "color" key controls color grouping
    - The "marker" key controls marker grouping
    - The "linestyle" key controls linestyle grouping
    - The "label" key controls what's shown in the legend
    - The "order" key can be used to force sorting order
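
The dict-returning form of the grouping function can be sketched as below. This is a minimal example under stated assumptions: the metadata keys `"d"` and `"basis"` are hypothetical, and the `SimpleNamespace` object is only a stand-in exposing the `decoder` and `json_metadata` fields mentioned in this PR, not a real `sinter.TaskStats`.

```python
from types import SimpleNamespace


def group_func(stat):
    """Hypothetical grouping function returning the special keys described above."""
    d = stat.json_metadata["d"]  # assumed metadata key: code distance
    basis = stat.json_metadata.get("basis", "X")  # assumed metadata key
    return {
        "color": stat.decoder,      # same decoder -> same color
        "marker": d,                # same distance -> same marker
        "linestyle": basis,         # same basis -> same linestyle
        "label": f"{stat.decoder} d={d} basis={basis}",  # text shown in the legend
        "order": d,                 # takes priority when ordering the legend
    }


# Stand-in for a collected statistic (not a real sinter object).
stat = SimpleNamespace(decoder="pymatching", json_metadata={"d": 5, "basis": "Z"})
group = group_func(stat)
```

Keeping each visual property tied to a single field means curves that differ only in one field still share the other visual cues.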
@Strilanc Strilanc mentioned this pull request Jul 27, 2024
@Strilanc Strilanc marked this pull request as draft July 27, 2024 05:40
@Strilanc Strilanc marked this pull request as ready for review July 30, 2024 21:16
@Strilanc Strilanc enabled auto-merge (squash) September 10, 2024 03:57
@Strilanc Strilanc merged commit 5e4f8b2 into main Sep 10, 2024
54 checks passed
@Strilanc Strilanc deleted the sinter2 branch September 10, 2024 04:25
Strilanc added a commit that referenced this pull request Sep 10, 2024
…#804)

- Add `sinter.Sampler` and `sinter.CompiledSampler` classes
  - They can go anywhere a Decoder would go, but they are responsible for
    all parts of the sampling instead of only prediction
- Add a new default sampler `perfectionist`, which discards anything
  with detection events and predicts the observables are not flipped
- Improved layout of the progress printouts when collect is running
- Sinter decoders can now flag that they want to discard shots by adding
  an extra byte to the returned observable data, with 0 meaning keep and
  not-0 meaning discard
- Change how `sinter collect` distributes work
  - Workers are now distributed as widely as possible, instead of all on
    one task
  - Workers are now never switched between tasks until their current task
    is done
- Add `sinter plot --point_label_func` argument for drawing text next to
  data points
- Augment `sinter plot --group_func` to support dictionaries with
  special keys controlling precise grouping behaviors
  - If group_func returns a dict with a `"color"` key, all items with the
    same `"color"` value are drawn with the same color
  - If group_func returns a dict with a `"linestyle"` key, all items with
    the same `"linestyle"` value are drawn with the same linestyle
  - If group_func returns a dict with a `"marker"` key, all items with the
    same `"marker"` value are drawn with the same marker
  - If group_func returns a dict with a `"label"` key, this forces the
    label shown in the legend
  - If group_func returns a dict with an `"order"` key, this takes
    priority for ordering the legend
- `sinter collect --processes` is no longer required (defaults to
  `"auto"`)
- `sinter plot --show` is no longer required (the plot is shown by
  default; specifying `--out` suppresses the display unless `--show` is
  also given)
- Group some of sinter's code into private subpackages
- Show traditional error bars instead of a filled region for high/low
  fit when only one data point is present
- Add `sinter plot --preprocess_stats_func`
- Add `sinter.TaskStats.with_edits`
- Add safety error when adding stats that have equal strong ids but
  differing identifying information (json_metadata or decoder)

Some of the sampler design is adapted from @inmzhang's design in
#735

Fixes #774

Fixes #682

Fixes #392

---------

Co-authored-by: Matt McEwen <mmcewen@google.com>