
Commit 2b2e049

Merge pull request #1011 from rhayes777/feature/docs
Feature/docs

2 parents f3b9120 + 04ef5e3, commit 2b2e049
13 files changed: +1310 −467 lines

autofit/non_linear/search/nest/dynesty/search/static.py (+1 −1)

@@ -21,7 +21,6 @@ def __init__(self, function):
     def grad(self):
         import jax
         from jax import grad
-
         print("Compiling gradient")
         return jax.jit(grad(self.function))

@@ -135,6 +134,7 @@ def search_internal_from(
        The number of CPU's over which multiprocessing is performed, determining how many samples are stored
        in the dynesty queue for samples.
        """
+
        if self.use_gradient:
            gradient = GradWrapper(fitness)
        else:
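The fragments above show a wrapper whose ``grad`` method imports JAX lazily and JIT-compiles the gradient of the wrapped function. A minimal sketch of such a wrapper is below; the class body is an assumption reconstructed from the diff fragments, and the ``_grad`` cache is an addition of this sketch (the diff itself recompiles on each call):

```python
class GradWrapper:
    """Sketch of a lazy gradient wrapper in the style of the diff above.

    The attribute names follow the diff; the caching via ``_grad`` is a
    hypothetical refinement, not part of the original code.
    """

    def __init__(self, function):
        self.function = function
        self._grad = None  # compiled gradient, filled on first use

    @property
    def grad(self):
        # Import JAX lazily so it is only required when gradients are used.
        if self._grad is None:
            import jax
            from jax import grad

            print("Compiling gradient")
            self._grad = jax.jit(grad(self.function))
        return self._grad
```

The lazy import mirrors the diff's intent: dynesty users who do not set ``use_gradient`` never pay for (or need) the JAX dependency.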
docs/cookbooks/analysis.rst (+1 −1)

@@ -181,7 +181,7 @@ Visualization
 
 If a ``name`` is input into a non-linear search, all results are output to hard-disk in a folder.
 
-By overwriting the ``Visualizer`` object of an ``Analysis`` class with a custom `Visualizer` class, custom results of the
+By overwriting the ``Visualizer`` object of an ``Analysis`` class with a custom ``Visualizer`` class, custom results of the
 model-fit can be visualized during the model-fit.
 
 The ``Visualizer`` below has the methods ``visualize_before_fit`` and ``visualize``, which perform model specific

docs/cookbooks/database.rst (+1 −1)

@@ -3,7 +3,7 @@
 Database
 ========
 
-The default behaviour of model-fitting results output is to be written to hard-disc in folders. These are simple to
+The default behaviour of model-fitting results output is to be written to hard-disk in folders. These are simple to
 navigate and manually check.
 
 For small model-fitting tasks this is sufficient, however it does not scale well when performing many model fits to

docs/cookbooks/multiple_datasets.rst (+5)

@@ -164,6 +164,11 @@ To fit multiple datasets via a non-linear search we use this summed analysis obj
 
 result_list = search.fit(model=model, analysis=analysis)
 
+In the example above, the same ``Analysis`` class was used twice (to set up ``analysis_0`` and ``analysis_1``) and summed.
+
+**PyAutoFit** supports the summing together of different ``Analysis`` classes, which may take as input completely different
+datasets and fit the model to them (via the ``log_likelihood_function``) following a completely different procedure.
+
 Result List
 -----------
 
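The added paragraphs can be illustrated with a toy sketch. The classes below are hypothetical plain-Python stand-ins (not PyAutoFit's actual ``Analysis`` classes): each fits its own dataset with a different likelihood procedure, and the summed object simply adds the log likelihoods:

```python
class AnalysisA:
    """Hypothetical analysis using a Gaussian-style (squared-residual) likelihood."""

    def __init__(self, data):
        self.data = data

    def log_likelihood_function(self, instance):
        return -sum((d - instance) ** 2 for d in self.data)


class AnalysisB:
    """Hypothetical analysis using a completely different (absolute-residual) procedure."""

    def __init__(self, data):
        self.data = data

    def log_likelihood_function(self, instance):
        return -sum(abs(d - instance) for d in self.data)


class SummedAnalysis:
    """Sketch of summing: the combined log likelihood is the sum over analyses."""

    def __init__(self, analyses):
        self.analyses = analyses

    def log_likelihood_function(self, instance):
        return sum(a.log_likelihood_function(instance) for a in self.analyses)
```

In PyAutoFit itself the summing is done with the ``+`` operator on ``Analysis`` objects; this sketch only shows why heterogeneous analyses compose cleanly, since each contributes an independent log-likelihood term.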
docs/cookbooks/result.rst (+64)

@@ -12,6 +12,7 @@ This cookbook provides an overview of using the results.
 
 - **Model Fit**: Perform a simple model-fit to create a ``Result`` object.
 - **Info**: Print the ``info`` attribute of the ``Result`` object to display a summary of the model-fit.
+- **Loading From Hard-disk**: Loading results from hard-disk to Python variables via the aggregator.
 - **Samples**: The ``Samples`` object contained in the ``Result``, containing all non-linear samples (e.g. parameters, log likelihoods, etc.).
 - **Maximum Likelihood**: The maximum likelihood model instance.
 - **Posterior / PDF**: The median PDF model instance and PDF vectors of all model parameters via 1D marginalization.

@@ -98,6 +99,69 @@ The output appears as follows:
     normalization 24.79 (24.65, 24.94)
     sigma 9.85 (9.78, 9.90)
 
+Loading From Hard-disk
+----------------------
+
+When performing fits which output results to hard-disk, a ``files`` folder is created containing .json / .csv files of
+the model, samples, search, etc. You should check it out now for a completed fit on your hard-disk if you have
+not already!
+
+These files can be loaded from hard-disk to Python variables via the aggregator, making them accessible in a
+Python script or Jupyter notebook. They are loaded as the internal **PyAutoFit** objects we are familiar with,
+for example the ``model`` is loaded as the ``Model`` object we passed to the search above.
+
+Below, we will access these results using the aggregator's ``values`` method. A full list of what can be loaded is
+as follows:
+
+- ``model``: The ``model`` defined above and used in the model-fit (``model.json``).
+- ``search``: The non-linear search settings (``search.json``).
+- ``samples``: The non-linear search samples (``samples.csv``).
+- ``samples_info``: Additional information about the samples (``samples_info.json``).
+- ``samples_summary``: A summary of key results of the samples (``samples_summary.json``).
+- ``info``: The info dictionary passed to the search (``info.json``).
+- ``covariance``: The inferred covariance matrix (``covariance.csv``).
+- ``data``: The 1D noisy data that is fitted (``data.json``).
+- ``noise_map``: The 1D noise-map fitted (``noise_map.json``).
+
+The ``samples`` and ``samples_summary`` results contain a lot of repeated information. The ``samples`` result contains
+the full non-linear search samples, for example every parameter sample and its log likelihood. The ``samples_summary``
+contains a summary of the results, for example the maximum log likelihood model and error estimates on parameters
+at 1 and 3 sigma confidence.
+
+Accessing results via the ``samples_summary`` is much faster, because it does not reperform calculations using the full
+list of samples. Therefore, if the result you want is accessible via the ``samples_summary`` you should use it,
+but if not you can revert to the ``samples``.
+
+.. code-block:: python
+
+    from autofit.aggregator.aggregator import Aggregator
+
+    agg = Aggregator.from_directory(
+        directory=path.join("output", "cookbook_result"),
+    )
+
+Before using the aggregator to inspect results, let's discuss Python generators.
+
+A generator is an object that iterates over a function when it is called. The aggregator creates all of the objects
+that it loads from the database as generators (as opposed to a list, or dictionary, or another Python type).
+
+This is because generators are memory efficient, as they do not store the entries of the database in memory
+simultaneously. This contrasts with objects like lists and dictionaries, which store all entries in memory all at once.
+If you fit a large number of datasets, lists and dictionaries will use a lot of memory and could crash your computer!
+
+Once we use a generator in the Python code, it cannot be used again. To perform the same task twice, the
+generator must be remade. This cookbook therefore rarely stores generators as variables and instead uses the
+aggregator to create each generator at the point of use.
+
+To create a generator of a specific set of results, we use the ``values`` method. This takes the ``name`` of the
+object we want to create a generator of, for example inputting ``name="samples"`` will return the results ``Samples``
+object (which is illustrated in detail below).
+
+.. code-block:: python
+
+    for samples in agg.values("samples"):
+        print(samples.parameter_lists[0])
+
 Samples
 -------
 
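The generator behaviour the new text describes, exhaustion after a single pass, can be demonstrated with plain Python; the generator returned by ``agg.values(...)`` behaves the same way:

```python
# A generator expression yields its entries once; a second pass yields nothing.
gen = (n * n for n in range(3))

first_pass = list(gen)   # consumes the generator
second_pass = list(gen)  # already exhausted, so empty

print(first_pass)   # [0, 1, 4]
print(second_pass)  # []
```

This is why the cookbook recreates the generator via ``agg.values("samples")`` at each point of use rather than storing it in a variable.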
docs/cookbooks/search.rst (+13 −7)

@@ -60,24 +60,30 @@ Output To Hard-Disk
 -------------------
 
 By default, a non-linear search does not output its results to hard-disk and its results can only be inspected
-in Python via the ``result`` object.
+in a Jupyter Notebook or Python script via the ``result`` object.
 
 However, the results of any non-linear search can be output to hard-disk by passing the ``name`` and / or ``path_prefix``
 attributes, which are used to name files and output the results to a folder on your hard-disk.
 
 The benefits of doing this include:
 
-- Inspecting results via folders on your computer can be more efficient than using a Jupyter Notebook.
-- Results are output on-the-fly, making it possible to check that a fit i progressing as expected mid way through.
-- Additional information about a fit (e.g. visualization) is output.
+- Inspecting results via folders on your computer is more efficient than using a Jupyter Notebook for multiple datasets.
+- Results are output on-the-fly, making it possible to check that a fit is progressing as expected mid way through.
+- Additional information about a fit (e.g. visualization) can be output.
 - Unfinished runs can be resumed from where they left off if they are terminated.
-- On high performance super computers which use a batch system, results must be output in this way.
+- On high performance super computers results often must be output in this way.
 
-These outputs are fully described in the scientific workflow example.
+The code below shows how to enable outputting of results to hard-disk:
 
 .. code-block:: python
 
-    search = af.Emcee(path_prefix=path.join("folder_0", "folder_1"), name="example_mcmc")
+    search = af.Emcee(
+        path_prefix=path.join("folder_0", "folder_1"),
+        name="example_mcmc"
+    )
+
+
+These outputs are fully described in the scientific workflow example.
 
 Output Customization
 --------------------
docs/index.rst (+1)

@@ -144,6 +144,7 @@ model and marginalized probability density functions.
 
    overview/the_basics
    overview/scientific_workflow
+   overview/statistical_methods
 
 .. toctree::
    :caption: Cookbooks:
