Commit
Merge pull request #5 from martinkilbinger/p3
P3
martinkilbinger authored Dec 26, 2023
2 parents ac0a339 + efb2799 commit 1f36bf5
Showing 17 changed files with 1,123 additions and 378 deletions.
80 changes: 16 additions & 64 deletions docs/source/post_processing.md
@@ -3,63 +3,14 @@
This page shows all required steps of post-processing the results from one or
more `ShapePipe` runs. Post-processing combines various individual `ShapePipe`
output files, and creates joint results, for example combining individual tile
catalogues in a large sky area. The output of post-processing is a joint _shape
catalogues into a large sky area. The output of post-processing is a joint _shape
catalogue_, containing all required information to create a calibrated shear
catalogue via _metacalibration_, a joint star catalogue, and PSF diagnostic plots.

Some of the following steps pertain specifically to runs carried out on [canfar](https://www.canfar.net/en),
but most are general.
If the main `ShapePipe` processing was carried out on the old canfar VM system (e.g. for CFIS v0 and v1), see
[here](vos_retrieve.md) for details on how to retrieve the `ShapePipe` output files.

1. Retrieve `ShapePipe` result files

For a local run on the same machine as the post-processing, nothing needs to be done.
In some cases, the run was carried out on a remote machine or cluster, and the resulting `ShapePipe`
output files need to be retrieved.

In the specific case of `canfar_avail_results`, this is done as follows.

A. Check availability of results

A `canfar` job can submit a large number of tiles, whose processing time can vary a lot.
We assume that the list of submitted tile IDs is available locally in the ASCII file `tile_numbers.txt`.
To check which tiles have finished running and whose results have been uploaded, use
```bash
canfar_avail_results -i tile_numbers.txt -v -p PSF --input_path INPUT_PATH
```
where `PSF` is one of `psfex` or `mccd`, and `INPUT_PATH` is the input path on `vos`, by default `vos:cfis/cosmostat/kilbinger/results`.
See `-h` for all options.

B. Download results

All results files will be downloaded with
```bash
canfar_download_results -i tile_numbers.txt -v -p PSF --input_vos INPUT_VOS
```
Use the same options as for `canfar_avail_results`.

This command can be run in the same directory at subsequent times, to complete an ongoing run: only newer files will be downloaded
from the `vos` directory. This also ensures that partially downloaded or corrupt files are replaced.

Checking the `vos` directory can be slow for large patches.
To only download files that are not yet present locally (in `.`), first write the missing ones to an ASCII file, again using the
script `canfar_avail_results`, but this time with `.` as input path:
```bash
canfar_avail_results -i tile_numbers.txt --input_path . -p PSF -v -o missing.txt
```
Then, download only the missing files with
```bash
canfar_download_results -i missing.txt --input_vos cosmostat/kilbinger/results_mccd_oc2 -p mccd -v
```
C. Un-tar results
```bash
untar_results -p PSF
```
On success, the `ShapePipe` output `fits` and `log` files will now be in various subdirectories of the `output` directory.
At this point, all required `ShapePipe` output files are available in the current working directory.
2. Optional: Split output in sub-samples
1. Optional: Split output into sub-samples

An optional intermediate step is to create directories for sub-samples, for example one directory
for each patch on the sky. This will create symbolic links to the results `.tgz` files downloaded in
@@ -70,33 +21,34 @@
```
The following steps will then be done in the directory `tiles_W3`.
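In shell terms, the sub-sample step amounts to linking the relevant result tarballs into a per-patch directory. Below is a toy sketch of the idea; the patch name, the tile IDs, and the `<ID>.tgz` file naming are illustrative assumptions, not `ShapePipe` conventions:

```shell
# Toy sketch: link per-patch result tarballs into a dedicated directory.
# Patch name, tile IDs, and the <ID>.tgz naming are illustrative assumptions.
mkdir -p tiles_W3
printf '271.281\n272.282\n' > tiles_W3.txt     # toy list of tile IDs in patch W3
while read -r tile; do
    touch "${tile}.tgz"                        # stand-in for a downloaded result file
    ln -sf "../${tile}.tgz" "tiles_W3/${tile}.tgz"
done < tiles_W3.txt
```

Since only symbolic links are created, the same downloaded `.tgz` files can be shared between several patch directories without duplicating disk space.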

3. Run PSF diagnostics, create merged catalogue
2. Run PSF diagnostics, create merged catalogue

Type
```bash
post_proc_sp -p PSF
```
to automatically perform a number of post-processing steps. Chose the PSF model with the option
to automatically perform a number of post-processing steps. Choose the PSF model with the option
`-p psfex|mccd`. In detail, these are (and can also be done individually
by hand):

A. Analyse psf validation files
1. Analyse psf validation files

```bash
prepare_star_cat -p PSF
combine_runs -t psf -p PSF
```
with options as for `post_proc_sp`.
This script identifies all psf validation files (from all processed tiles downloaded to `pwd`), creates symbolic links,
merges the catalogues, and creates plots of PSF ellipticity, size, and residuals over the focal plane.
This script creates a new combined psf run in the ShapePipe `output` directory, by identifying all psf validation files
and creating symbolic links. The run log file is updated.

B. Create plots of the PSF and their residuals in the focal plane, as a diagnostic of the overall PSF model.
As a scale-dependent test, which propagates directly to the shear correlation function, the rho statistics are computed,
see {cite:p}`rowe:10` and {cite:p}`jarvis:16`,
3. Merge individual psf validation files into one catalogue. Create plots of the PSF and their residuals in the focal plane,
as a diagnostic of the overall PSF model.
As a scale-dependent test, which propagates directly to the shear correlation function, the rho statistics are computed,
see {cite:p}`rowe:10` and {cite:p}`jarvis:16`,
```bash
shapepipe_run -c /path/to/shapepipe/example/cfis/config_MsPl_PSF.ini
```
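For reference, the rho statistics are auto- and cross-correlations of the PSF model ellipticity $e_{\rm P}$, its residual $\delta e_{\rm P}$ (star minus model), and the relative size residual $\delta T_{\rm P}/T_{\rm P}$. A sketch in the notation of {cite:p}`jarvis:16`; the exact conventions used by `ShapePipe` may differ:

$$
\begin{aligned}
\rho_1(\theta) &= \left\langle \delta e_{\rm P}^*(\boldsymbol{x})\, \delta e_{\rm P}(\boldsymbol{x}+\boldsymbol{\theta}) \right\rangle , \\
\rho_2(\theta) &= \left\langle e_{\rm P}^*(\boldsymbol{x})\, \delta e_{\rm P}(\boldsymbol{x}+\boldsymbol{\theta}) \right\rangle , \\
\rho_3(\theta) &= \left\langle \left(e_{\rm P}^* \frac{\delta T_{\rm P}}{T_{\rm P}}\right)\!(\boldsymbol{x}) \left(e_{\rm P} \frac{\delta T_{\rm P}}{T_{\rm P}}\right)\!(\boldsymbol{x}+\boldsymbol{\theta}) \right\rangle , \\
\rho_4(\theta) &= \left\langle \delta e_{\rm P}^*(\boldsymbol{x}) \left(e_{\rm P} \frac{\delta T_{\rm P}}{T_{\rm P}}\right)\!(\boldsymbol{x}+\boldsymbol{\theta}) \right\rangle , \\
\rho_5(\theta) &= \left\langle e_{\rm P}^*(\boldsymbol{x}) \left(e_{\rm P} \frac{\delta T_{\rm P}}{T_{\rm P}}\right)\!(\boldsymbol{x}+\boldsymbol{\theta}) \right\rangle .
\end{aligned}
$$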

C. Prepare output directory
4. Prepare output directory

Create links to all 'final_cat' result files with
```bash
@@ -105,7 +57,7 @@
The corresponding output directory that is created is `output/run_sp_combined/make_catalog_runner/output`.
On success, it contains links to all `final_cat` output catalogues.

D. Merge final output files
5. Merge final output files

Create a single main shape catalogue:
```bash
51 changes: 51 additions & 0 deletions docs/source/vos_retrieve.md
@@ -0,0 +1,51 @@
## Retrieve files from VOspace

This page describes how ShapePipe output files can be retrieved via the Virtual Observatory Space
on canfar. This system was used for the CFIS v0 and v1 runs, and is now obsolete.

1. Retrieve ShapePipe result files

For a local run on the same machine as the post-processing, nothing needs to be done. In some cases, the run was carried out on a remote machine or cluster, and the resulting ShapePipe output files
need to be retrieved.

In the specific case of `canfar_avail_results`, this is done as follows.

1. Check availability of results

A canfar job can submit a large number of tiles, whose processing time can vary a lot. We assume that the list of submitted tile IDs is available locally in the ASCII file `tile_numbers.txt`. To check
which tiles have finished running and whose results have been uploaded, use
```bash
canfar_avail_results -i tile_numbers.txt -v -p PSF --input_path INPUT_PATH
```
where `PSF` is one of `psfex` or `mccd`, and `INPUT_PATH` is the input path on `vos`, by default `vos:cfis/cosmostat/kilbinger/results`.
See `-h` for all options.

2. Download results

All results files will be downloaded with
```bash
canfar_download_results -i tile_numbers.txt -v -p PSF --input_vos INPUT_VOS
```
Use the same options as for `canfar_avail_results`.

This command can be run in the same directory at subsequent times, to complete an ongoing run: only newer files will be downloaded
from the `vos` directory. This also ensures that partially downloaded or corrupt files are replaced.

Checking the `vos` directory can be slow for large patches.
To only download files that are not yet present locally (in `.`), first write the missing ones to an ASCII file, again using the
script `canfar_avail_results`, but this time with `.` as input path:
```bash
canfar_avail_results -i tile_numbers.txt --input_path . -p PSF -v -o missing.txt
```
Then, download only the missing files with
```bash
canfar_download_results -i missing.txt --input_vos cosmostat/kilbinger/results_mccd_oc2 -p mccd -v
```
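The "only download what is missing" logic can be pictured with a self-contained toy sketch; the `<ID>.tgz` naming of result files is an assumption here, and in practice the comparison is done by `canfar_avail_results` as shown above:

```shell
# Toy sketch of the "download only missing files" logic: compare the submitted
# tile list against result tarballs already present locally.
# The <ID>.tgz naming is an assumption; the real check is canfar_avail_results.
printf '123.456\n789.012\n' > tile_numbers.txt   # toy tile ID list
touch 123.456.tgz                                # pretend this tile was already downloaded
: > missing.txt
while read -r tile; do
    [ -f "${tile}.tgz" ] || echo "$tile" >> missing.txt
done < tile_numbers.txt
cat missing.txt   # -> 789.012
```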

3. Un-tar results
```bash
untar_results -p PSF
```
On success, the `ShapePipe` output `fits` and `log` files will now be in various subdirectories of the `output` directory.
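What the un-tar step amounts to can be sketched as follows; the toy tarball stands in for a real result file, and the actual directory layout created by `untar_results` may differ:

```shell
# Toy sketch of the un-tar step: extract every result tarball into output/.
# The toy tarball below stands in for a real ShapePipe result file.
mkdir -p toy_run && echo done > toy_run/dummy.log
tar -czf 123.456.tgz toy_run
mkdir -p output
for tgz in *.tgz; do
    tar -xzf "$tgz" -C output
done
```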

At this point, all required `ShapePipe` output files are available in the current working directory.
4 changes: 2 additions & 2 deletions example/cfis/config_MsPl_psfex.ini
@@ -35,7 +35,7 @@ LOG_NAME = log_sp
RUN_LOG_NAME = log_run_sp

# Input directory, containing input files, single string or list of names
INPUT_DIR = $SP_RUN/psf_validation_ind
INPUT_DIR = $SP_RUN/output

# Output directory
OUTPUT_DIR = $SP_RUN/output
@@ -54,7 +54,7 @@ TIMEOUT = 96:00:00
## Module options
[MERGE_STARCAT_RUNNER]

INPUT_DIR = psf_validation_ind
INPUT_DIR = last:psfex_interp_runner

PSF_MODEL = psfex

68 changes: 68 additions & 0 deletions example/cfis/config_Ms_psfex.ini
@@ -0,0 +1,68 @@
# ShapePipe configuration file for post-processing.
# Merge star catalogues.


## Default ShapePipe options
[DEFAULT]

# verbose mode (optional), default: True, print messages on terminal
VERBOSE = True

# Name of run (optional) default: shapepipe_run
RUN_NAME = run_sp_Ms

# Add date and time to RUN_NAME, optional, default: False
RUN_DATETIME = False


## ShapePipe execution options
[EXECUTION]

# Module name, single string or comma-separated list of valid module runner names
MODULE = merge_starcat_runner

# Parallel processing mode, SMP or MPI
MODE = SMP


## ShapePipe file handling options
[FILE]

# Log file master name, optional, default: shapepipe
LOG_NAME = log_sp

# Runner log file name, optional, default: shapepipe_runs
RUN_LOG_NAME = log_run_sp

# Input directory, containing input files, single string or list of names
INPUT_DIR = $SP_RUN/output

# Output directory
OUTPUT_DIR = $SP_RUN/output


## ShapePipe job handling options
[JOB]

# Batch size of parallel processing (optional), default is 1, i.e. run all jobs in serial
SMP_BATCH_SIZE = 4

# Timeout value (optional), default is None, i.e. no timeout limit applied
TIMEOUT = 96:00:00


## Module options
[MERGE_STARCAT_RUNNER]

INPUT_DIR = last:psfex_interp_runner

PSF_MODEL = psfex

NUMBERING_SCHEME = -0000000-0

# Input file pattern(s), list of strings with length matching number of expected input file types
# Cannot contain wild cards
FILE_PATTERN = validation_psf

# FILE_EXT (optional) list of string extensions to identify input files
FILE_EXT = .fits
2 changes: 1 addition & 1 deletion example/cfis/config_exp_Pi.ini
@@ -46,7 +46,7 @@ OUTPUT_DIR = $SP_RUN/output
[JOB]

# Batch size of parallel processing (optional), default is 1, i.e. run all jobs in serial
SMP_BATCH_SIZE = 2
SMP_BATCH_SIZE = 1

# Timeout value (optional), default is None, i.e. no timeout limit applied
TIMEOUT = 96:00:00
2 changes: 1 addition & 1 deletion scripts/python/link_to_exp_for_tile.py
@@ -327,7 +327,7 @@ def main(argv=None):
patterns = ["run_sp_exp_SxSePsf", "run_sp_exp_Pi"]
for pattern in patterns:
paths, number = get_paths(exp_base_dir, exp_shdu_IDs, pattern)
print(number)
#print(number)

create_links_paths(tile_base_dir, tile_ID, paths, verbose=verbose)

14 changes: 12 additions & 2 deletions scripts/python/summary_run.py
@@ -24,7 +24,7 @@ def main(argv=None):
list_tile_IDs_dot = get_IDs_from_file(tile_ID_path)

# tile IDs with dashes
list_tile_IDs = replace_dot_dash(list_tile_IDs_dot)
list_tile_IDs = job_data.replace_dot_dash(list_tile_IDs_dot)
n_tile_IDs = len(list_tile_IDs)
n_CCD = 40

@@ -147,6 +147,16 @@ def main(argv=None):
verbose=verbose,
)


jobs["1024"] = job_data(
"1024",
"run_sp_combined_psf",
["psfex_interp_runner"],
"shdus",
path_left=f"{main_dir}/output",
verbose=verbose
)

job_data.print_stats_header()

for key in "1":
@@ -169,7 +179,7 @@
print_par_runtime(par_runtime, verbose=verbose)

#for key in ["2", "4", "8", "16", "32", "64", "128"]:
for key in ["128"]:
for key in ["1024"]:
job = jobs[key]
job.print_intro()
job.check_numbers(par_runtime=par_runtime)
