All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Agent: Add a compiler flag to generate debug info for the `windows-libfuzzer` load library test target. #1684
- Agent: Add a Rust crate to debug missing dynamic library errors on Windows. #1713
- Agent: Add support for detecting missing dynamic libraries on Linux. #1718
- Service: Connect the auto scaling diagnostics to the log analytics workspace. #1708
- Service: Handle the situation where a VM scale set instance is destroyed before we have removed scale-in protection. #1719
- Service: Add additional support for auto scaling including changes to the CLI. New scale sets will automatically be created with auto scaling enabled. #1717, #1763
- Agent/Service/CLI: Add support for generating log files that can be downloaded using the CLI. #1727, #1723, #1721
- Service: Port ARM templates to Bicep. #1724, #1732
- Service: Initial changes to port the service from Python to C#. #1734, #1733, #1736, #1737, #1738, #1742, #1744, #1749, #1750, #1753, #1755, #1760, #1761, #1762, #1765, #1757, #1780, #1782, #1783, #1777, #1791, #1801, #1805, #1804, #1803
- Service: Make sure the scale set nodes are unable to accept work while in the `setup` state. #1731
- Agent: Reduce the logging level down from `warn` to `debug` when we are unable to parse an ASan log. #1705
- Service: Move the creation of the event grid topic to the deployment template from the `deploy.py` script. #1591
- Agent/Supervisor/Proxy: Updated multiple third-party Rust dependencies. #1548, #1617, #1618
- Service: Consolidate the two log analytics down to one. #1679
- Service: Updated resource name in Bicep file to prevent name clash when deploying 5.3.0. #1808
- Service: The auto scale setting log statement is not an `error`; changed it to `info`. #1745
- Agent: Fixed Cobertura output so that the coverage summary renders correctly in Azure DevOps. #1728
- Agent: Continue after non-fatal errors during static recovery of SanCov coverage sites. #1796
- Service: Fixed name generation for a few resources in the Bicep file to increase uniqueness which prevents resource name clash. #1800
- Service: Added a new webhook message format compatible with Azure Event Grid. #1640
- Service: Added initial auto scaling support for VM scale sets. #1647, #1661
- Agent: Add an explicit timeout to setup scripts so hangs are easier to debug. #1659
- CLI/Service: Updated multiple first-party and third-party Python dependencies. #1606, #1634
- Agent: Check system-wide memory usage and fail tasks that are nearly out of memory. #1657
- Service: Fix `task` field to the correct `NodeTasks` type so serialization works correctly. #1627
- Agent: Convert escaped characters when accessing the name of a blob in a URL. #1673
- Agent: Override `runs` parameter when testing inputs as we only want to test them once. #1651
- Service: Remove deprecated `warn()` method. #1641
- CLI/Service: Added `fuzzer_target_options` argument to the `libfuzzer` templates to allow passing some target options only in persistent fuzzing mode #1610
- Agent/Supervisor/Proxy: Updated multiple third-party Rust dependencies. #1530
- CLI/Service: Updated multiple first-party and third-party Python dependencies. #1576 #1577 #1579 #1582 #1586 #1599
- CLI/Service: Begin update of scale set instances before reimaging to ensure they match the latest scale set model. #1612
- Agent: Removed the `process_stats` telemetry event, which fixes a class of memory leaks on Windows `libfuzzer_fuzz` tasks. #1608
- CLI/Service: Fixed seven day stale node reimaging check. #1616
- Agent: Added source line coverage data #1518 #1534 #1538 #1535 #1572
- Agent: Added Cobertura XML output for source code visualization #1533
- Service: Added auto configuration properties to the monitoring agents #1541
- Service: Added tags to scalesets and VMs #1560
- Agent/Supervisor/Proxy: Updated multiple third-party Rust dependencies. #1489 #1495 #1496 #1501 #1502 #1507 #1510 #1513 #1514 #1517 #1519 #1521 #1522 #1528 #1557 #1566
- Agent: Changed the function that gets the `machine_id` to be `async` to avoid runtime nesting #1468
- Service: Removed generic reset command from the CLI #1511
- Service: Updated the way we check for endpoint authorization #1472
- Service: Increase reliability of integration tests. #1505
- Agent: Avoid leaking unused file and cache data #1539
- Agent: Fixed new clippy errors #1516
- Agent: Added common source coverage format. #1403
- Service: Added class to store and retrieve rules associated with an API endpoint. This supports the ability to control who has access to an API. #1420
- Service: Support for NSG creation during deployment, allowing restricted access to the scaleset and repro VMs. #1331, #1340, #1358, #1385, #1393, #1395, #1400, #1404, #1406, #1410
- Service: Guest account access is disabled by default when creating the default service principal during deployment. #1425
- Service: Group membership check added. #1074
- Service: Exposed the `target_timeout` parameter in the `radamsa basic` template. #1499
- Agent/Supervisor/Proxy: Updated multiple third-party Rust dependencies. #1360, #1364, #1367, #1368, #1369, #1382, #1429, #1455, #1456, #1414, #1416, #1417, #1423, #1438, #1446, #1458, #1463, #1470, #1453, #1492, #1493, #1480, #1488, #1490
- Service: Fixed Azure DevOps work item creation by adding missing client initialization. #1370
- Service: Fixed validation of the `target_exe` blob name, enabling nesting in a subdirectory of the `setup` container. #1371
- Service: Migrated to MS Graph, as `azure-graphrbac` is soon to be deprecated. #966
- Service: Stopped ignoring unexpected errors when authenticating the client secret. #1376
- Service: Fixed regex to correctly capture the object ID when trying to remove an invalid application ID. #1408
- Service: Added check for service principal use during user role assignment. #1479
- Service: Added support for Compute Gallery images. #1450
- Agent/Supervisor/Proxy: Updated multiple third-party Rust dependencies. #1301, #1302, #1310, #1312, #1332, #1335, #1336, #1337, #1341, #1342, #1343, #1344, #1353
- CLI/Service: Updated multiple first-party and third-party Python dependencies. #1346, #1348, #1355, #1356
- Service: Fixed authentication when using a client secret. #1300
- Deployment: Fixed an issue where the wrong AppRole was assigned when creating new CLI registrations. #1308
- Deployment: Suppress a dependency's noisy logging of handled errors when deploying. #1304
- Agent: Added ability to handle fake crash reports generated by debugging tools during regression tasks. #1233
- Service: Added ability to configure virtual network IP ranges. #1268
- Deployment: Added `flake8` to the deployment process to align with the rest of the Python codebase linting. #1286
- Service: Added custom extensions to enable Microsoft Security Monitoring extensions. #1184
- CLI: Added `--readonly_inputs` option to the `libfuzzer basic` template. #1247
- CLI: Increased the default verbosity of destructive CLI commands. #1264
- Agent/Supervisor/Proxy: Updated multiple third-party Rust dependencies. #1239, #1240, #1236, #1238, #1245, #1246, #1252, #1253, #1254, #1257, #1261, #1262, #1276, #1278
- Deployment: Fixed deployment in some regions by specifying widely-supported versions of Application Insights resources. #1291
- Deployment: Fixed an issue with multi-tenant deployment caused by a mismatch between the identifier used to configure the app registration and value used to authenticate the CLI client. #1270
- Service: Fixed `scaleset proxy reset` to reset all proxies in the specified region. #1275
- CLI: Temporarily ignore type errors from `azure-storage-blob` due to invalid Python type signatures. #1258
- CLI/Deployment/Service: Move to using `api://` for AAD Application "identifier URIs". Pre-3.0 clients will not be able to connect to newer instances. (BREAKING CHANGE) #1243
- Agent/Supervisor/Proxy: Redact device, IP, and machine name in runtime statistics reported to Microsoft via Application Insights. #1242
- Agent/Supervisor/Proxy: Updated multiple third-party Rust dependencies. #1232, #1230, #1228, #1229, #1231, #1242
- CLI: Fixed an issue printing results that include `SecretData`. #1223
- Agent: Added `machine_id` configuration value expansion for all tasks. #1217, #1216
- Agent/Supervisor/Proxy: Updated multiple third-party Rust dependencies. #1215, #1214, #1213, #1211, #1218, #1219
- Deployment: Fixed the example deployment rule to include the required Azure Storage Queue support. #1207
- CLI: Fixed an issue printing results that include `set`, `datetime`, or `None`. #1208, #1221
- CLI/Service: The Azure VM SKU used for proxies is now configurable via `onefuzz instance_config`. #1128
- CLI: Added `onefuzz status pool` command to give status information for a pool. #1170
- Agent/Supervisor/Proxy: Updated multiple third-party Rust dependencies. #1152, #1155, #1156, #1157, #1158, #1163, #1164, #1165, #1166, #1176, #1177, #1178, #1179, #1181, #1182, #1183, #1185, #1186, #1191, #1198, #1199, #1200, #1201, #1202, #1203, #1204, #1205
- Agent: Changed `azcopy` calls to always retry when source files are modified mid-copy. #1196
- Agent: Continued development related to upcoming features. #1146
- Agent: SAS URLs are now redacted in logged `azcopy` failures. #1194
- CLI: Include the number of VMs used per-task in `onefuzz status top`. #1169
- Deployment: Application credentials created during deployment are no longer logged. #1172
- Deployment: Clarify logging when retrying AAD interactions. #1173
- Deployment: Replaced custom Azure Storage Queue creation with ARM templates. #1193
- Service: The validity period for SAS URLs is now back-dated to avoid time synchronization issues. #1195
- Deployment: Invalid preauthorized application references are removed during application registration. #1175
- Service: Fixed an issue logging node status. #1160
- Supervisor: Added recording of STDOUT and STDERR of the supervisor to file. #1109
- CLI/Service/Agent: Supervisor tasks can now optionally have a managed coverage container. #1123
- Agent/Supervisor/Proxy: Updated multiple third-party Rust dependencies. #1151, #1149, #1145, #1134, #1135, #1137, #1133, #1138, #1132, #1140
- Service: Enabled testing of the Azure DevOps work item rendering. #1144
- Agent: Continued development related to upcoming features. #1142
- CLI: No longer retry service API requests that fail with service-level errors. #1129
- Agent/Supervisor/Proxy: Addressed multiple new `cargo-clippy` warnings. #1125
- CLI/Service: Updated third-party Python dependencies. #1124
- Service: Fixed an issue with incomplete authorization in multi-tenant deployments. CVE-2021-37705 #1153
- Agent/Supervisor/Proxy: Updated multiple third-party Rust dependencies. #1116
- Service: Fixed an error when replacing notifications for a container. #1115
- Service: Fixed Python 3.9 compatibility issues. #1117
- Agent/Supervisor/Proxy: Addressed multiple new `cargo-clippy` warnings. #1118
- Agent: Fixed an issue with the "Premium" storage account utilities. #1111
- Agent: Addressed a rate-limiting issue when using `azcopy` from a large number of VMs with numerous cores. #1112
- Service: PII is now removed from Jobs, Tasks, and Repros after 18 months. #1051
- Service: Unused notifications are now removed after 18 months. #1051
- Service: SignalR events are routed through an Azure Storage Queue to prevent SignalR outages from impacting the entire service. #1100, #1102
- Service: Functionality used prior to 1.0.0 for assigning tasks to VMs rather than Pools is no longer supported. #1105
- Service: The `coverage` and `generic_generator` tasks now verify `{input}` is used in `target_env` or `target_options`. #1106
- Service: Fixed an issue reimaging old nodes with `debug_keep_node` set. #1103
- Service: Fixed an issue authenticating to Azure services. #1099
- Service: Fixed an issue preventing Pools and Scalesets set to `shutdown` from being set to `halt`. #1104
- CLI: Added the ability to remove existing container notifications upon creating a notification integration. #1084
- CLI/Documentation: Added an example `generic_analysis` task that demonstrates collecting LLVM source-based coverage. #1072
- Supervisor: Added service-interaction resiliency for node commands. #1098
- Agent/Supervisor/Proxy: Addressed multiple new `cargo-clippy` warnings. #1089
- Agent: Added more context to errors in generator tasks. #1094
- Agent: Added support for ASAN runtime identification of format string bugs. #1093
- Agent: Added verification that `{input}` is provided to the application under test via `target_env` or `target_options`. #1097
- Agent: Continued development related to upcoming features. #1090, #1091
- CLI/Service: Updated multiple first-party and third-party Python dependencies. #1086
- CLI: Changed job templates to replace existing notifications for the unique report container. #1084
- Service: Added more context to Azure DevOps errors. #1082
- Service: Notification secrets are now deleted from Azure KeyVault upon notification deletion. #1085
- Agent: Fixed an issue logging ASAN output upon ASAN log parse errors. #1092
- Agent: Fixed issues handling non-UTF8 output from applications under test. #1088
- Agent: Batch processing results are now saved after every 10 executions. #1076
- Service: Optimized `file_added` event queueing by avoiding unnecessary Azure queries. #1075
- Agent: Optimized directory change monitoring. #1078
- Supervisor: Optimized agent monitoring. #1080
- CLI: Fixed an issue handling long-running requests. #1068
- CLI/Service: Fixed an issue related to upcoming features. #1067
- CLI: Fixed an issue handling `target_options` for libFuzzer jobs. #1066
- Supervisor: Added a `panic` handler to record supervisor failures. #1062
- Agent: Added more context to file upload errors. #1063
- CLI: Made errors locating `azcopy` clearer. #1061
- Service: Fixed an issue where long-lived VM scaleset instances could get reimaged with out-of-date VM setup scripts. #1060
- Service: Fixed an issue where VM setup script updates were not always pushed. #1059
- Service: Fixed an issue detecting and reimaging failed nodes. #1054
- Service: Fixed an issue with the supervisor restarting too quickly. #1055
- Agent: Added `minimized_stack_function_lines` and `minimized_stack_function_lines_sha256` to crash reports. #993
- CLI/Service: Added `timestamp` to `Notification` objects. #1043
- Service: Added the scaleset_resize_scheduled event. #1047
- Service: Added `pool_id` to `Node` objects. #1049
- Agent/Supervisor/Proxy: Updated multiple third-party Rust dependencies. #1040, #1052
- CLI/Deployment/Service: Updated multiple first-party and third-party Python dependencies. #922, #1045
- CLI/Service: Moved to using Pydantic built-in size validation for types. #1048
- Service: Continued development related to upcoming features. #1046, #1050
- CLI: Fixed an issue handling column sorting in `onefuzz status top`. #1037
- Service: Fixed an issue adding SSH keys to Windows VMs. #1038
- CLI/Service: Added instance configuration that can be managed via `onefuzz instance_config`. #1010
- Service: Added automatic retry for Azure DevOps notifications. #1026
- CLI/Service: Added validation to GitHub Issues integration configuration. #1019
- Agent/Supervisor/Proxy: Moved to `rustls` to enable running the Agent and Supervisor on Ubuntu 20.04. #1029
- Agent: Continued development related to upcoming features. #1016
- Agent: Fixed an issue handling invalid data during coverage collection. #1032
- Agent: Fixed retry logic on coverage recording failures #1033
- Service: Fixed an issue preventing deletion or reimaging of nodes in some cases. #1023
- Agent/Supervisor/Proxy: Updated multiple third-party Rust dependencies. #1018, #1009, #1004
- Service: Tasks running on nodes without recent heartbeats are now marked as failed due to heartbeat issues. #1015
- Service: Updated multiple first-party Python dependencies. #1012
- Agent: Fixed an issue where `libfuzzer_fuzz` tasks on Windows that found crashes too rapidly were unable to recover handles. #1002
- Agent: Fixed an issue with the regression tasks after using the `onefuzz debug notification` commands. #1011
- Deployment: Fixed a configuration issue reducing log retention durations. #1007
- Service: Fixed an issue creating GitHub Issues notifications. #1008
- Service: Fixed an issue handling reimaging nodes that took an excessive amount of time. #1005
- Service: Update node and task-related log messages to ease debugging. #988
- Agent: Changed the log level for `azcopy` retry notification to `DEBUG`. #986
- Agent: Updated stack minimization regular expressions from `libclusterfuzz`. #992
- Agent: Added more context to synchronized directory errors. #995
- Deployment: Reduced the Application Insights log retention duration to 30 days. #997
- Agent: Improved tracking of threads during win32 debugging. #1000
- Agent: Fixed an issue using relative paths with synchronized directories. #996
- Service: Fixed an issue creating GitHub Issues notifications #990
- CLI/Service: Fixed an issue handling `Union` fields in the `onefuzztypes` library #982
- Service: Fixed an issue handling manually-resized scalesets #984
- CLI: Added `onefuzz debug job rerun` command. #960
- Agent: Added more context to coverage recording errors. #979
- Agent: The coverage task now retries an input in the case of coverage recording failure. #978
- Service: Nodes with the `debug_keep_node` flag will now be reimaged once the node is 7 days old. #968
- Service: Updates to scalesets can now be requested while the node is in the `resize` state. #969
- Service: Fixed an issue when reimaging nodes that previously failed to reimage as expected. #970
- Service: Fixed an issue when resizing scalesets that exceed Azure VM quotas. #967
- Supervisor: Fixed an issue with refreshing service authentication tokens. #976
- Agent: Added a new `coverage` task that enables coverage analysis for both uninstrumented and Sancov targets on Linux and Windows. #763
- Agent: Improved performance of the libFuzzer fuzzing tasks. #941
- CLI: Changed the `libfuzzer basic` job template to use the new `coverage` task. #763
- Deployment: Added automatic retry when authorizing newly-created applications during deployment. #959
- Supervisor: Simplified the service coordination logic and added increased context upon failure. #963
- Agent/Supervisor: Added azcopy log recording upon azcopy failure. #945
- CLI: Added `onefuzz jobs containers delete` command. #949
- CLI: Added `onefuzz jobs containers download` command. #953
- Agent/Service: Agents scheduled to shut down no longer wait for work prior to shutting down. #940
- Agent/Supervisor/Proxy: Updated multiple third-party Rust dependencies. #942
- Agent: Continued development related to upcoming features. #937, #929, #919
- CLI: Message details are now always shown in `onefuzz status top`. #933
- CLI: Renamed template helper methods for uploading task setup files. #926
- Contrib: Updated multiple third-party Python dependencies. #950
- Service: Tasks that are stopped without ever having started are now marked as failed. #935
- Supervisor: Added increased context when recording supervisor failures. #931
- CLI/Service: Worked around a third-party dependency issue in handling Python Unions in Events. #939
- Deployment: Fixed an authentication issue during deployment. #947, #954
- Deployment: Fixed an issue limiting application creation logs. #952
- Service: Fixed an issue deleting nodes with expired heartbeats. #930
- Service: Fixed an issue deleting nonexistent containers. #948
- Service: Fixed an issue deleting proxies. #932
- Service: Fixed an issue that prevented automatic migration of notification secrets to Azure KeyVault in some cases. #936
- Supervisor: Fixed an issue adding multiple SSH keys to Windows VMs. #928
- Agent: Added `setup_dir` configuration value expansion for generator tasks. #901
- CLI: Enable specifying alternate tenant configuration via command line arguments. #900
- CLI/Service: Proxy status is now available via the `onefuzz scaleset_proxy list` command. #905
- Deployment: Moved to using Microsoft Graph `User.Read` rather than Azure AD Graph. #894
- Service: Tasks are now stopped on nodes before task related storage queues are deleted. #801
- Proxy: Proxies are automatically deployed and always available based on regions with active fuzzing scalesets. #839, #908, #907, #909, #904
- CLI: Added explanations to errors generated when parsing arguments whose values are key/value pairs. #910, #911
- Agent: Continued development related to upcoming features. #913, #918
- Service: Updated first-party Python libraries #903
- Documentation: Added descriptions for the Azure AD entities used by OneFuzz. #896
- Service: Added the scaleset_state_updated event. #882
- Agent/Supervisor/Proxy: Addressed multiple new `cargo-clippy` warnings. #884
- Agent/Supervisor/Proxy: Updated and removed third-party Rust dependencies. #892, #873, #865
- Service: Improved the Python typing signatures used in the service. #881
- Service: Updated multiple first-party and third-party Python libraries. #893, #889, #866, #885, #861, #890
- Supervisor: The supervisor now includes the full error context upon failure. #879
- Service: Cleaned up scaleset update logs. #880
- Agent: Continued development related to upcoming features. #874, #868, #864
- SDK/CLI: Replaced Python based directory uploading with `azcopy sync`. #878
- Service/Supervisor: Fixed an issue shrinking scalesets where idle nodes would not shut down as expected. #866
- Deployment: Fixed an issue deploying to non-Microsoft single-tenant instances. #872, #898
- Deployment: Added ability to only deploy RBAC resources. #818
- Agent: Continued development related to upcoming features. #855, #858
- Agent: Fixed issue where directory monitoring would fail due to `azcopy` temporary files. #859
- Service: Fixed issue where scalesets could get stuck trying to resize if also manually deleted. #860
- Agent: Added context to errors generated during configuration value expansion. #835
- CLI/Service: Added messages awaiting processing for a node to the node status API. #836
- Agent: Continued development related to upcoming features. #844, #852, #850, #843, #837, #838, #844
- Agent/Proxy/Supervisor: Updated multiple third-party Rust dependencies. #842, #826, #829
- Service/Contrib: Updated multiple Python dependencies. #828, #827, #823, #822, #821, #847
- Service: Resetting nodes no longer requires waiting for the node to acknowledge the shutdown in some cases. #834
- Supervisor: Fixed an issue introduced in 2.14.0 that sometimes prevents nodes from stopping processing tasks. #833
- Service: Fixed an issue related to Azure Storage Queues being deleted while in use. #832
- Deployment: Fixed an issue where the CLI client application role was not assigned during deployment. #825
- Contrib: Added a sample GitHub Actions workflow and an Azure DevOps Pipeline to demonstrate deploying OneFuzz jobs using CICD. #778
- CLI/Service: Added creation timestamps to `Job`, `Node`, `Pool`, `Scaleset`, `Repro`, `Task`, and `TaskEvent` records returned by the service. #796, #805, #804
- Agent/Proxy/Supervisor: Added additional context to web request failures to assist in debugging issues. #798
- Service: Added task configuration to the crash_reported and regression_reported events. #793
- Agent: The full error context is now logged upon task failure. #802
- CLI: The `libfuzzer-dotnet` template no longer defaults to failing the task if the fuzzer exits with a non-zero status but no crash artifact. #807
- Agent/Proxy/Supervisor: Updated multiple Rust dependencies. #800
- Service: When multiple failures are reported for a given task, only the first failure is recorded. #797
- Agent: Continued development related to upcoming features. #820, #816, #790, #809, #812, #811, #810, #794, #799, #779
- Deployment: Added missing actions to the example Custom Azure Role for deployment. #808
- Service: Fixed an issue in scaleset creation with incompatible VM SKUs and VM Images. #803
- Service: Fixed an issue removing user identity information from logging to user instances. #795
- Deployment: Allow specifying the Azure subscription to use for deployment, instead of always using the default #774
- Agent/Supervisor: Added automatic retry when executing `azcopy`. #701
- Service: When task setup fails, the error that caused the setup failure is now included in the Task error message. #781
- Agent: The `libfuzzer-fuzz` task no longer queries the full local system status when only reporting process status. #784
- Agent: The `libfuzzer-fuzz` task now limits the stderr collected to the last 1024 lines for potential failure reporting. #785
- Agent: The `libfuzzer-fuzz` task now summarizes the executions per second and iteration counts from all of the workers on each VM. #786
- Agent: The `libfuzzer-coverage` task no longer removes the initial copy of inputs. #788
- Agent: Debugger scripts for extracting libFuzzer coverage are now embedded in the agent. #783
- Agent: Continued development related to upcoming features. #787, #776, #663
- CLI: Fixed issue relating to line endings in the `libfuzzer-qemu` job template setup script. #782
- Service: Fixed backward compatibility issue in ephemeral disk support when creating scalesets. #780
- Deployment: Fixed issue in multi-tenant deployment support. #773
- Agent: LibFuzzer tasks now include a verification step that verifies the fuzzer can test a small number of seeds at the start of the task. #752
- Integration Tests: Added verification that no errors are logged to Application Insights during testing. #700
- Agent/Supervisor/Service/Deployment: Added support for multi-tenant authentication. #746
- CLI/Service: Added support for Ephemeral OS Disks. #461, #761
- Agent: Continued development related to upcoming features. #765, #762, #754, #756, #750, #744, #753
- Contrib: Updated multiple python dependencies. #764
- CLI/Agent: LibFuzzer fuzzing tasks no longer default to failing the task if the fuzzer exits with a non-zero status but no crash artifact. #748
- Agent/Proxy/Supervisor: Fixed issues preventing HTTPS retries. #766
- Agent/Service/Proxy/Supervisor: Fixed logging and telemetry from the agent. #769
- Agent/Proxy/Supervisor: Fixed issues preventing heartbeats. #749
- Agent: Continued log simplification and clarification. #736, #740, #742
- Agent: Prevent invalid queue messages from being ignored. #731
- Agent: Separated module and symbol names for Windows debugger-based crash reports. #723
- Deployment/Agent: Updated AFL++ to 3.11c. #728
- CLI/Deployment: Updated Python dependencies. #721
- Agent: Updated stack minimization regular expressions from ClusterFuzz. #722
- Service: Removed user's identity information from logging to user instances. #724, #725
- Agent: Continued development related to upcoming features. #699, #729, #733, #735, #738, #739
- Deployment: Worked around a race condition in service principal creation. #716
- Agent: Dotfiles are now ignored in libFuzzer-related directories. #741
- Agent/CLI/Service: Added regression testing tasks, including enabling git bisect using OneFuzz. #664, #691
- Agent/CLI/Service: Added call stack minimization using a Rust port of ClusterFuzz stack trace parsing. #591, #705, #706, #707, #714, #715, #719
- CLI: Added `onefuzz privacy_statement` command, which displays OneFuzz's privacy statement. #695
- Agent: Added installation of the `x86` and `x86_64` Visual Studio C++ redistributable runtimes on Windows nodes. #686
- Agent/Proxy/Supervisor: Changed web request retry logic to include the underlying failure upon giving up retrying a request. #696
- Supervisor: Added automatic web request retry logic when communicating to the service. #704
- CLI/Service: Updated Python dependencies. #698, #687
- Supervisor: Clarified log message when the supervisor unexpectedly exits. #685
- Proxy: Simplified service communication logic. #683
- Proxy: Increased log verbosity on proxy failure. #702
- Agent: Increased setup script timestamp resolution. #709
- Agent: Continued development related to an upcoming feature. #508, #688, #703, #710, #711
- Agent: Fixed support for libFuzzer targets that use shared objects or DLLs from the setup container. #680, #681, #682, #689, #713
- Contrib: Added sample Webhook Service #666
- Agent: Add OneFuzz version and Software role to telemetry #586
- Agent: Add multiple telemetry data types for the upcoming functionality #619
- Agent: Added `input_file_sha256` to configuration value expansion. #641
- Agent: Added `job_id` to Task Heartbeat #646
- Service: Added task information to job_stopped events #648
- Service: task_stopped and task_failed now trigger once the task has stopped instead of upon entering the `stopping` state. #651
- CLI: Authentication tokens are saved upon successful login rather than on program exit. #665
- Service: If a task with dependent tasks fails, all of the dependent tasks are marked as failed. #650
- Agent: Fixed PC address in crash report backtraces. #658
- Service: Upon task completion, if all of the tasks in the associated job are completed, the job is marked as stopped. #649
- Deployment/Agent: Updated AFL++ to 3.11c. #675
- Agent/Proxy/Supervisor: Changed web request retry logic to always retry any request that fails, regardless of why the request failed. #674
- Agent: Downloading files from task queues will now automatically retry on failure. #676
- Service: User information is now stripped from Events before being logged to Application Insights. #661
- Service: Handle exception related to manually deleted scalesets #672
- Agent: Fixed Rust lifetime issues exposed by an update to Rust regex library #671
- CLI: Added support for Aarch64 libFuzzer targets using the QEMU user space emulator. #600
- Build: Added CodeQL pipeline. #617
- Service: Added node and task heartbeat events. #621
- Agent: Clarified batch-processing logs. #622
- Agent/Proxy: Updated multiple rust dependencies. #624
- Service/CLI/Contrib: Updated multiple python dependencies. #607, #608, #610, #611, #612, #625, #626, #630, #640
- Service: Update task configuration to verify `target_exe` is a canonicalized relative path. #613
- Deployment/Agent: Updated AFL++ to 3.10c. #609
- Deployment: Clarify application password creation succeeded after earlier failures. #629
- Service: VM passwords are no longer set on Linux VMs. #620
- Service: Clarify source of task failures when notification integration marks a task as failed. #635
- Agent/Proxy/Supervisor: Fixed web request retry logic when handling operating system level errors. #623
- Service: Handle exceptions when creating scalesets fails due to Azure VM quota issues. #614
- CLI: Added `onefuzz containers files download_dir` to enable downloading the contents of a container. #598
- Agent: Added `microsoft_telemetry_key` and `instance_telemetry_key` and expanded the availability of `reports_dir` in configuration value expansion. #561
- Agent/Service: Added `job_id` to agent-based heartbeats. #594
- Agent/Proxy/Supervisor: Added additional context to errors during Storage Queue and service interactions to improve debugging. #601
- Agent/Proxy/Supervisor: Renamed the Application Insights token names used for telemetry to `microsoft_telemetry_key` and `instance_telemetry_key` and the function that gated telemetry sharing to `can_share_with_microsoft` to make the telemetry implementation easier to understand. #587
- Deployment: Updated multiple Python dependencies. #596
- Service: Updated multiple Python dependencies. Addresses potential security issue CVE-2020-28493 #595
- Service: Don't let nodes run new tasks if they are part of a scaleset or pool that is scheduled to be shut down. #583
- Service: Fixed the queries used to identify nodes running outdated OneFuzz releases. #597
- Agent: Fixed an issue that would stop an agent or supervisor from performing work if an HTTPS request has failed in certain conditions. #603
- Agent: Fixed an issue that would stop a task if the task printed a significant amount of data to stdout or stderr. #588
- Deployment: Address deployment failures relating to cross-region Azure Active Directory resource creation delays. #585
- Service: Jobs that do not start within 30 days are automatically stopped. #565
- Service: Debug proxies now use ports 28000 through 32000. #552
- Service: Events now include the instance name and unique identifier. #577
- Service: All task related Events now include the task configuration. #580
- Service: Errors generated during crash report notification due to invalid jobs or tasks now include the reason for the error. #576
- CLI: Namespaced containers for coverage used in job templates now include `build` and `platform` in addition to `project` and `name`. #572
- Service: User triggered node reimaging no longer waits for confirmation from the node prior to starting the reimage process. #566
- Service: Fixed an error condition when users recreate a container immediately after deleting it. #582
- Service: Fixed an issue where, when one task on a node ended, the node was reimaged regardless of the state of other tasks running on the node. #567
- CLI: Added the ability to poll task status until the tasks have started to managed templates using `--wait_for_running`. #532
- CLI: Added libfuzzer-dotnet support. #535
- Agent: Added `crashes_account` and `crashes_container` to configuration value expansion. #551
- CLI: Added `onefuzz status job` and `onefuzz status project` to provide a user-friendly job status. #550
- Agent: Logs and local telemetry from the agent now include the role (`agent` or `supervisor`) in recorded events. #527
- Agent: Clarified the errors generated when libFuzzer coverage extraction fails #554
- Service: Handled `SkuNotAvailable` errors from Azure when creating scalesets. #557
- Agent/Proxy: Updated multiple third-party Rust libraries. Addresses potential security issue RUSTSEC-2021-0023. #548
- Agent: Verifying LibFuzzer targets at the start of a task using `-help=1` now happens prior to sending heartbeats. #528
- Service: Fixed issue related to Azure Functions not always providing the JWT token via Authorization headers. #531
- CLI: Fixed `--wait_for_running` in job templates. #530
- Deployment: Fixed a log error by setting the default SignalR transport used by Azure Functions. #525
- Agent: Fixed LibFuzzer coverage collection when instrumenting DLLs loaded at runtime. #519
- Service: Fixed issue where the cached Azure Identity was not being used. #526
- Service: Fixed log message related to identifying secondary corpus instances. #524
- Service: Handle scaleset nodes that never register, such as nodes with instance-specific setup script failures. #518
- Agent: Added stdout/stderr logging and clarifying context during failures to the `generic_analysis` task. #522
- Agent/Service/Proxy: Clarify log messages from the scaleset proxy. #520
- Agent/Proxy: Update multiple third-party Rust libraries. #517
- Agent: Fixed potential race condition when single stepping when debugging during the `generic_crash_reporter` and `generic_generator` tasks running on Windows. #440
- Service: Clarify log messages when the service and agent versions mismatch. #510
- Service: Scalesets and Nodes are now updated in a consistent order during scheduled updates. #512
- CLI/Service: Expanded the use of Primitive data types that provide data validation. #514
- Service: Fixed an error generated when scalesets scheduled for deletion had configurations updated. #511
- Service: Fixed an issue where scaleset configurations were updated too frequently. #511
- Proxy: The logs from the proxy manager are now logged to Application Insights. #502
- Agent: Updated the web request retry logic to retry requests upon connection refused errors. #506
- Service: Improved the performance of shutting down pools. #503
- Service: Updated `azure-mgmt-compute` Python dependency. #499
- Proxy: Fixed an issue in the proxy heartbeats that caused proxy VMs to be reset after 10 minutes. #502
- Agent: Fixed an issue that broke libFuzzer based crash reporting that was introduced in 2.1.1. #505
- Agent: Added Rust Clippy static analysis to CICD. #490
- CLI/Service: Added Bandit static analysis to CICD. #491
- Service: Fixed an issue where scalesets could get in a state that would stop updating configurations. #489
- Agent: Added `job_id` and `task_id` to configuration value expansion. #481
- Agent: Broadened the availability of `tools_dir` to configuration value expansion. #480
- Agent: Added clarifying context to command errors. #466
- CLI/Service/Agent: Supervisor tasks can now be fully self-contained fuzzing tasks, no longer requiring `target_exe`. Additionally, supervisor tasks can now optionally have managed report containers. #474
- Service: Managed nodes that are unused beyond 7 days are automatically reimaged to ensure OS patch levels are maintained. #476
- CLI/Service: Updated the default Windows VM image to `MicrosoftWindowsDesktop:Windows-10:20h2-pro:latest`. Existing scalesets will not be impacted by this change, only newly created scalesets using the default image. #469
- Agent: New inputs discovered by supervisor tasks are now saved to the `inputs` container. #484
- CLI: The license is now properly set in the python package metadata. #472
- Agent: Failure to download files via HTTP from queues now results in a failure, rather than the HTTP error being interpreted as the requested file. #485
- Deployment: Fixed error when checking if the default CLI application exists. #488
- Agent: Added clarifying context to file system errors. #423
- CLI/Service: Significantly expanded the events available for webhooks. #394
- Agent: Added `{setup_dir}` to configuration value expansion #417
- Agent: Added `{tools_dir}` configuration value expansion to `{supervisor_options}` and `{supervisor_env}` #444
- CLI/Service: Migrated `onefuzz status top` to use Webhook Events. (BREAKING CHANGE) #394
- CLI/Service: New notification secrets, such as ADO tokens, are managed in Azure KeyVault and are no longer accessible to the user once created. (BREAKING CHANGE) #326, #389
- CLI/Service: Updated multiple Python dependencies. #426, #427, #430
- Agent: Fixed triggering condition for new unique report events #422
- Deployment: Mitigate issues related to deployments within conditional access policy scenarios. #447
- Agent: Fixed an issue where unused nodes would stop requesting new work. #459
- Service: Fixed dead node cleanup. #458
- Service: Fixed an issue logging excessively large stdout/stderr from tasks. #460
- Service: Added support for sharding corpus storage accounts using "Premium" storage accounts for improved IOPs. #334
- CLI/Service/Agent: Added the ability to optionally colocate multiple compatible tasks on a single machine. The coverage and crash reporting tasks in the LibFuzzer template make use of this functionality by default. #402
- CLI: Added `onefuzz debug log tail` which enables continuously following Application Insights query results. #401
- CLI/Agent: Support verifying LibFuzzer targets at the start of a task using `-help=1`, which will enable identifying non-functional LibFuzzer targets. #381
- CLI/Agent: Support specifying whether to log a warning or fail the task when a LibFuzzer target exits with a non-zero status code (without also generating a crashing input). #381
- Agent: The stdout and stderr for the supervisors and generators are now logged to Application Insights. #400
- Service: Enabled per-Scaleset SSH keys on Windows VMs, similar to existing Linux support, enabling `onefuzz debug node ssh` to both Windows and Linux nodes. #390
- Agent: Support ASAN odr-violation results. #380
- CLI/Service/Agent: Added the ability to add SSH keys to nodes within scalesets. #441
- CLI: Added support for multi-tenant authentication. #346
- Service: Updating outdated nodes is now limited to 500 nodes at a time. #397
- Service: Restrict agent from accessing API endpoints not specific to the agent. #404
- Service: Increased Azure Functions runtime timeout to 15 minutes. #384
- Deployment/Agent: Updated AFL++ to 3.00c. #393
- Agent: Added randomized initial jitter to agent heartbeats, which reduces API query storms when launching a large number of nodes concurrently. #387
- CLI/Agent: Add support to verify LibFuzzer targets execute correctly at the start of a task using `-help=1`. #381
- Service: Re-enable API endpoint used by `onefuzz nodes update`. #412
- Agent: Addressed a race condition in LibFuzzer coverage analysis without initial seeds. #403
- Agent: Prevent supervisor that fatally exits from processing additional new tasks. #378
- Agent: Address issues handling LibFuzzer targets that produce non-UTF8 output to stderr. #379
- CLI: Added `libfuzzer merge` job template, which enables performing libFuzzer input minimization as a batch operation. #282
- CLI/Service: Added the instance-specific Application Insights telemetry key to `onefuzz info get`, which will enable logging to the instance-specific Application Insights from the SDK. #353
- Agent: Added support for parsing ASAN `CHECK failed` entries, which can occur during large amounts of memory corruption. #358
- Agent/Service: Added support for parsing the ASAN "scariness" score and description when `print_scariness=1` in `ASAN_OPTIONS`. #359
- Agent: Mark tasks as failed if the application under test generates an ASAN log file that the agent is unable to parse. #351
- Agent: Updated the `libfuzzer_merge` task to merge pre-existing inputs in a single pass. #282
- CLI: Clarified the error messages when prefix-expansion fails. #342
- Service: Rendered `pydantic` models as JSON when logging to prevent `error=None` from showing up in the error logs. #350
- Deployment: Pinned the version of pyOpenssl to the version used by multiple Azure libraries. #348
- CLI/Service: (PREVIEW FEATURE) Multiple updates to job template management. #354, #360, #361
- Agent: Fixed issue preventing the supervisor from notifying the service on some state changes. #337
- Deployment: Fixed a regression in retrying password creation during deployment #338
- Deployment: Fixed uploading tools when rolling back deployments. #347
- CLI/Service: Added Service-Managed Job Templates as a preview feature. Enable via `onefuzz config --enable_feature job_templates`. #226
- Service/agent: Added internal support for unmanaged nodes. This paves the way for bring your own compute for fuzzing. #318
- CLI: Added `onefuzz debug` subcommands to simplify coverage and fuzzing performance for libFuzzer jobs from Application Insights. #325
- Service: Information about the user responsible for creating jobs and repro VMs is now associated with the Job and Repro VMs. #327
- Deployment: `deploy.py` now automatically retries on failure when deploying the Azure Function App. #330
- Service: Address multiple minor issues previously hidden by function decorators used for caching. #322
- Agent: Fixed libFuzzer coverage support for internal builds of MSVC #324
- Agent: Address issue preventing instance-wide setup scripts from executing in some cases. #331
- CLI/Service: Added Event-based webhooks. #296
- Service: Information about the user responsible for creating tasks is now associated with the tasks (this information is available in the task related event webhooks). #303
- Contrib: Azure Devops deployment pipeline uses the `--upgrade` feature added in 1.7.0. #304
- Service: Fixed setting `target_workers`, used to configure the number of concurrent libFuzzer workers within a task. #305
- Deployment: `deploy.py` now takes `--upgrade` to simplify upgrading deployments. For now, this skips assignment of the managed identity role which only needs to be done on installation. #271
- CLI: Added Application Insights debug CLI. See `onefuzz debug logs` #281
- CLI: Added unique_inputs to the default container types for `onefuzz reset --containers` and `onefuzz containers reset`. #290
- CLI: Added `onefuzz debug node` to enable debugging a node in a scaleset without having to specify the scaleset. #298
- Service: When shutting down an individual scaleset, all of the nodes in the scaleset are now marked for shutdown. #252
- Service: The scaleset service principal IDs are now cached as part of the respective Scaleset object #255
- Service: The association between a task and the nodes that ran it is now kept until the node is reimaged, making it easy to connect to the node that ran a task after task completion. #273
- Deployment: Pinned `urllib3` version due to an incompatible new release #292
- CLI: Removed calls to `containers.list`, significantly improving job template creation performance. #289
- Service: No longer use HTTP 404 response codes during agent registration. #287
- Agent: Heartbeats are now only sent as part of the execution loop. #283
- Service: Refactored handlers for agent events, including much more detailed logging. #261
- Deployment: Prevent users from enabling public access to containers. #300
- Service: Fixed libfuzzer_merge tasks #240
- Service: Fixed an issue where scheduled tasks waiting in the queue for longer than 7 days would never get scheduled. #259
- Service: Removed stale Node references from scalesets #275
- Service: The service now auto-scales the number of Azure Functions instances as needed #238
- CLI/Service/Agent: Added the ability to configure ensemble synchronization interval (including disabling ensemble altogether) #229
- Contrib: Added sample Azure Devops pipeline to maintain instances of OneFuzz #233
- Deployment: Added utility to create CLI application registrations #236
- Deployment/Service/Agent: Added a per-instance uniquely generated UUID to telemetry (see docs/telemetry.md for more information) #245
- CLI: The CLI now internally caches container authorization tokens #224
- Service: Moved to using user-assigned managed identities for Scalesets #219
- Agent: Added stdout to azcopy error logs #247
- Service: Increased function timeouts to 5 minutes
- CLI/Service: Added the ability to prevent a VM from getting reset in order to debug tasks #201
- SDK: Add examples directory to the python package #216
- Agent: Added connection resiliency via automatic retry (with back-off) throughout the agent #153
- Deployment: Added the ability to log the application passwords during registration #214
- Agent: LibFuzzer Coverage metrics are now reported after the batch processing phase #218
- Deployment: Added a utility to assign scalesets to roles #185
- Contrib: Added a utility to automate deployment of new releases of OneFuzz via Azure Devops pipelines #208
- Agent: Addressed a race condition syncing input seeds #204
- Agent: Instead of ignoring all access violations during libFuzzer coverage processing, stop on second-chance access violations #210
- Agent: During libFuzzer coverage, disable default symbol paths unless `_NT_SYMBOL_PATH` is set via `target_env`. #222
- CLI: Added `onefuzz containers reset` to delete containers by type en masse. #198, #202
- Agent: Added missing approved telemetry as to tool names & crash report identification. #203
- Service: Enabled log sampling at the service at 20 items per second. #174
- Service: Fixed multiple bugs in the service, including an exception caused by an invalid format string during proxy or repro VM creation #206
- CLI: Fixed incorrect resetting of granularly selected components introduced in 1.3.3 #193
- Service: Fixed rate-limiting issues requesting MSI and Storage Account tokens #195
- Service: Moved the SDK to use the same `pydantic` models as the service in request generation #191
- Service: Improved performance of container validation #196
- Service: Fixed exception generated when deleting repro & proxy VMs #188
- Service/Agent: Non-functional nodes are now automatically re-imaged #154, #164, #30
- CLI: Added more granularity for the `onefuzz reset` sub-command #161, #182
- Deployment/Agent: Now includes AFL++ #7
- Deployment/Agent: Now includes Radamsa for Windows #143
- CLI: The `onefuzz status top` TUI now allows filtering based on job ID, project, or name #152
- Service: Nodes no longer have to wait for the scaleset to finish setup before being able to fuzz #144
- Agent: Agent now only notifies the service about its current state upon state change #175
- Service: Task error messages now limit the stdout and stderr to the last 4096 bytes #170
- Service: Replaced custom queue based event loop with timers #160, #159
- Agent: Uploads that fail now report the failure earlier #166
- Agent: All timers now include automatic jitter to reduce request storms #180
- Agent: Ensemble container synchronization has been unified to once every 60 seconds (plus jitter) #180
- Agent: Upon agent failure, it will no longer incorrectly re-register and request new work. #150, #146
- Deployment: Addressed an issue with nested exceptions triggered during a failed deployment #172
- Deployment: Addressed incompatible prerequisite library warnings during deployment #167
- Testing: Added rust based libFuzzer in the end-to-end integration tests #132
- Agent: Always parse stderr when generating crash reports for LibFuzzer instead of using `ASAN_OPTIONS=log_path`, which fixes crash reports from non-sanitizer based crashes. #131
- Deployment: Added data-migration script to fix notifications for pre-release installs #135
- Agent: Crash reporting for LibFuzzer now attempts to parse stderr in addition to `ASAN_OPTIONS=log_path`. This enables crash reporting of go-fuzz based binaries. #127
- Deployment: During deployment, App Insights logs can be configured to automatically export logs to the `app-insights` container in the instance-specific `func` storage account. #102
- Agent: Reduced logs sent from the agent #125
- Service: Scalesets now use multiple placement groups, allowing a scaleset to grow to 1000 nodes (or 600 if using a custom image). #121
- Deployment: Support deploying additional platforms (such as OSX). #126
- Service: Fixed typing error in sorting TaskEvent. #129
- CLI/Service: Added creating and updating GitHub Issues based on crash reports. #110
- Agent: LibFuzzer fuzzing that exits with a non-zero exit code without a resulting crashing input now marks the task as failed. #108
- Service: The automatic variable `repro_cmd` used in crash report notifications now includes '--endpoint URL' to reduce friction for users with multiple OneFuzz instances. #113
- Agent/Service: Added the ability to automatically re-image nodes that are out-of-date #35
- Deployment: Added data-migration scripts for pre-release installs #12
- SDK/CLI: Added more `onefuzz debug` sub-commands to support debugging tasks #95
- Agent: Added machine_id and version to log messages #94
- Service: Errors in creating Azure Devops work items from reports now mark the task as failed #77
- Service: The nodes executing a task are now included when fetching details for a task (such as `onefuzz tasks get $TASKID`) #54
- SDK: Added example Azure Functions that uses the SDK #56
- SDK/CLI: Added the ability to execute debugger commands automatically during `repro` #39
- CLI: Added documentation of CLI sub-command arguments (used to describe `afl_container` in AFL templates) #10
- Agent: Added `ONEFUZZ_TARGET_SETUP_PATH` environment variable that indicates the path to the task specific setup container on the fuzzing nodes #15
- CICD: Use sccache to speed up build times #47
- SDK: Added end-to-end integration test script to verify full fuzzing pipelines #46
- Documentation: Added definitions for pool, node, and scaleset #17
- Agent/Service: Refactored state management for on-VM supervisors #96
- Agent: Added 'done' semaphore to the agent to prevent agent from fetching additional work once the node should be reset. #86
- Agent: Nodes now sleep longer between checking for new work. #78
- Agent: The task execution clock is now started once the task is in the 'setting up' state #82
- Service: Drastically reduced logs sent to App Insights from third-party libraries #63
- Agent/Service: Added the ability to upgrade out-of-date VMs upon requesting new tasking #35
- CICD: Non-release builds now include the GIT hash in the versions and `localchanges` if built locally with un-committed code. #58
- Agent: Command replacements now use absolute rather than relative paths. #22
- CLI: Fixed issue using `onefuzz template stop` which would improperly stop jobs that had the same 'name' but different 'project' values. #97
- Agent: Fixed input marker expansion (used in AFL templates related to handling `@@`). #87
- Service: Errors generated after the task shutdown has started are ignored. #83
- Agent: Instance specific tools now download and run on windows nodes as expected #81
- CLI: Using `--wait_for_running` in `onefuzz template` jobs now properly waits for tasks to launch before exiting #84
- Service: Handled more Azure Devops notification errors #80
- Agent: WSearch service is now properly disabled by default on Windows VMs #67
- Service: Properly deletes `repro` VMs #36
- Agent: Supervisor now flushes logs to Application Insights upon exit #21
- Agent: Task specific setup script failures now properly get recorded as a failed task and trigger the node to be re-imaged #24
- Initial public release