Collaboration repository for novel work based upon HPDC'23 and ICS'23 papers.
Goal: determine whether embeddings from a graph neural network (GNN) can be used to improve Gaussian Copula (GC) performance.
- Extract template from GC_TLA for use
- Picked PolyBench's Syr2k benchmark because I have exhaustive empirical data for it at two application scales, which will make the analyses easier
- File: syr2k_reference/mmp.c
- Reconstruct the filled-in templates based on original tuning data used in the Gaussian Copula
- Original training data referenced from YTOPT source data
- File: syr2k_reference/syr2k_S.csv
- File: syr2k_reference/syr2k_L.csv
- Script fill_in.py mimics the experiment templating process.
- Execute: python3 fill_in.py --csv syr2k_reference/*.csv --template syr2k_reference/mmp.c --output-dir syr2k_recreations
- Compile all templates in the same manner as original experiments
- Script build_compile_script.py writes a bash script to mitigate environment/replication issues.
- Execute: python3 build_compile_script.py
- Running the script creates compile_script.sh. Make it executable (chmod +x compile_script.sh), then run it.
- Execute: ./compile_script.sh
- There are a LOT of templates, so this can take up to half an hour or so to complete
- Use a GNN to generate embeddings for each executable in syr2k_recreations
- Compare GC performance:
- With GC_TLA approach only (quantile filtering on source task objectives)
- With GNN embeddings after quantile filtering
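
As an illustration of the quantile-filtering baseline named above, here is a minimal sketch and not the GC_TLA implementation itself. It assumes the source-task CSV has a column named `objective` where lower is better (for example, runtime); the column name, the sample rows, and the quantile `q` are all hypothetical choices made for the example.

```python
import csv
import io
import math


def quantile_filter(rows, objective="objective", q=0.3):
    """Keep rows whose objective is at or below the q-quantile cutoff.

    Assumes lower objective values are better (e.g. runtime).
    """
    if not rows:
        return []
    values = sorted(float(r[objective]) for r in rows)
    # Nearest-rank quantile: the value at position ceil(q * n), 1-indexed.
    rank = min(len(values), max(1, math.ceil(q * len(values))))
    cutoff = values[rank - 1]
    return [r for r in rows if float(r[objective]) <= cutoff]


# Hypothetical stand-in for a source-task CSV; the real column names may differ.
SAMPLE_CSV = """p0,p1,objective
a,16,4.0
b,32,1.0
a,64,3.0
b,128,2.0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
kept = quantile_filter(rows, q=0.5)
print([r["objective"] for r in kept])
```

With `q=0.5` the sketch keeps the better half of the sample rows. In the comparison above, the filtered rows would be what the GC is fitted on, either directly or after further selection using the GNN embeddings.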