A package for machine learning inference on FPGAs. We create firmware implementations of machine learning algorithms using high-level synthesis (HLS) languages. We translate models from traditional open-source machine learning packages into HLS code that can be configured for your use case!
If you have any questions, comments, or ideas regarding hls4ml or just want to show us how you use hls4ml, don't hesitate to reach us through the discussions tab.
For more information visit the webpage: https://fastmachinelearning.org/hls4ml/.
For introductory material on FPGAs, HLS, and ML inference using hls4ml, check out the video.
Detailed tutorials on how to use hls4ml's various functionalities can be found here.
pip install hls4ml
To install the extra dependencies for profiling:
pip install hls4ml[profiling]
import hls4ml
# Fetch a Keras model from our example repository
# This will download our example model to your working directory and return an example configuration
config = hls4ml.utils.fetch_example_model('KERAS_3layer.json')
# You can print the configuration to see some default parameters
print(config)
# Convert it to an HLS project
hls_model = hls4ml.converters.keras_to_hls(config)
# Print full list of example models if you want to explore more
hls4ml.utils.fetch_example_list()
We will build the project using Xilinx Vivado HLS, which can be downloaded and installed from here. Alongside Vivado HLS, hls4ml also supports Vitis HLS, Intel HLS, and Catapult HLS, and has experimental support for Intel oneAPI. The target backend can be changed using the backend argument when building the model.
# Use Vivado HLS to synthesize the model
# This might take several minutes
hls_model.build()
# Print out the report if you want
hls4ml.report.read_vivado_report('my-hls-test')
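As a rough illustration of switching the target backend, the sketch below edits a configuration dictionary before conversion. It is not part of hls4ml: `with_backend` and `ASSUMED_BACKENDS` are hypothetical names, and both the "Backend" key and the identifier strings ("Vivado", "Vitis", "Quartus", "Catapult", "oneAPI") are assumptions to check against the documentation for your installed version. A small stand-in dictionary is used in place of the configuration returned by `fetch_example_model`, so the snippet runs without any HLS tools installed.

```python
from copy import deepcopy

# Assumed backend identifiers for the tools named above; verify the exact
# spellings against the hls4ml documentation for your version.
ASSUMED_BACKENDS = {
    "Vivado": "Xilinx Vivado HLS",
    "Vitis": "Vitis HLS",
    "Quartus": "Intel HLS",
    "Catapult": "Catapult HLS",
    "oneAPI": "Intel oneAPI (experimental)",
}


def with_backend(config, backend):
    """Return a copy of config that targets the given backend.

    Hypothetical helper: assumes the configuration is a dict and that the
    backend is stored under a "Backend" key.
    """
    if backend not in ASSUMED_BACKENDS:
        raise ValueError(
            f"Unknown backend {backend!r}; expected one of {sorted(ASSUMED_BACKENDS)}"
        )
    updated = deepcopy(config)  # leave the caller's configuration untouched
    updated["Backend"] = backend
    return updated


# Stand-in for the configuration dictionary used in the example above
example_config = {"Backend": "Vivado", "OutputDir": "my-hls-test"}
vitis_config = with_backend(example_config, "Vitis")
print(vitis_config["Backend"])  # prints "Vitis"
```

If these assumptions hold for your version, the returned dictionary could be passed to `hls4ml.converters.keras_to_hls` in place of the original configuration. Copying rather than mutating keeps the default configuration available for comparison between backends.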
A list of frequently asked questions and common HLS synthesis issues can be found here.
If you use this software in a publication, please cite the software:
@software{fastml_hls4ml,
author = {{FastML Team}},
title = {fastmachinelearning/hls4ml},
year = 2024,
publisher = {Zenodo},
version = {v1.0.0},
doi = {10.5281/zenodo.1201549},
url = {https://github.com/fastmachinelearning/hls4ml}
}
and first publication:
@article{Duarte:2018ite,
author = "Duarte, Javier and others",
title = "{Fast inference of deep neural networks in FPGAs for particle physics}",
eprint = "1804.06913",
archivePrefix = "arXiv",
primaryClass = "physics.ins-det",
reportNumber = "FERMILAB-PUB-18-089-E",
doi = "10.1088/1748-0221/13/07/P07027",
journal = "JINST",
volume = "13",
number = "07",
pages = "P07027",
year = "2018"
}
Additionally, if you use specific features developed in later papers, please cite those as well. For example, CNNs:
@article{Aarrestad:2021zos,
author = "Aarrestad, Thea and others",
title = "{Fast convolutional neural networks on FPGAs with hls4ml}",
eprint = "2101.05108",
archivePrefix = "arXiv",
primaryClass = "cs.LG",
reportNumber = "FERMILAB-PUB-21-130-SCD",
doi = "10.1088/2632-2153/ac0ea1",
journal = "Mach. Learn. Sci. Tech.",
volume = "2",
number = "4",
pages = "045015",
year = "2021"
}
@article{Ghielmetti:2022ndm,
author = "Ghielmetti, Nicol\`{o} and others",
title = "{Real-time semantic segmentation on FPGAs for autonomous vehicles with hls4ml}",
eprint = "2205.07690",
archivePrefix = "arXiv",
primaryClass = "cs.CV",
reportNumber = "FERMILAB-PUB-22-435-PPD",
doi = "10.1088/2632-2153/ac9cb5",
journal = "Mach. Learn. Sci. Tech.",
year = "2022"
}
binary/ternary networks:
@article{Loncar:2020hqp,
author = "Ngadiuba, Jennifer and others",
title = "{Compressing deep neural networks on FPGAs to binary and ternary precision with HLS4ML}",
eprint = "2003.06308",
archivePrefix = "arXiv",
primaryClass = "cs.LG",
reportNumber = "FERMILAB-PUB-20-167-PPD-SCD",
doi = "10.1088/2632-2153/aba042",
journal = "Mach. Learn. Sci. Tech.",
volume = "2",
pages = "015001",
year = "2021"
}
If you benefited from participating in our community, we ask that you please acknowledge the Fast Machine Learning collaboration, and particular individuals who helped you, in any publications. Please use the following text for this acknowledgment:
We acknowledge the Fast Machine Learning collective as an open community of multi-domain experts and collaborators. This community and <names of individuals>, in particular, were important for the development of this project.
We gratefully acknowledge previous and current support from the U.S. National Science Foundation (NSF) Harnessing the Data Revolution (HDR) Institute for Accelerating AI Algorithms for Data Driven Discovery (A3D3) under Cooperative Agreement No. PHY-2117997, U.S. Department of Energy (DOE) Office of Science, Office of Advanced Scientific Computing Research under the Real‐time Data Reduction Codesign at the Extreme Edge for Science (XDR) Project (DE-FOA-0002501), DOE Office of Science, Office of High Energy Physics Early Career Research Program (DE-SC0021187, DE-0000247070), and the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (Grant No. 772369).