aws-samples/drs-malware-scan

Perform malware scan analysis of on-prem servers using AWS services

Challenges with on-premises malware detection

It can be difficult for security teams to continuously monitor all on-premises servers due to budget and resource constraints. Signature-based antivirus alone is insufficient because modern malware uses various obfuscation techniques. Server admins may lack historical visibility into security events across all servers. Without centralized monitoring and alerting, it is hard to determine which systems are compromised and which backups are safe to restore from during an incident. It is onerous for server admins to set up and maintain additional security tools for advanced threat detection. A short mean time to detect and remediate infections is critical, but difficult to achieve without the right automated solution.

Determining which backup image is safe to restore from during incidents without comprehensive threat intelligence is another hard problem. Even if backups are available, without knowing when exactly a system got compromised, it is risky to blindly restore from backups. This increases the chance of restoring malware and losing even more valuable data and systems during incident response. There is a need for an automated solution that can pinpoint the timeline of infiltration and recommend safe backups for restoration.

How to use AWS services to address these challenges

The solution leverages AWS Elastic Disaster Recovery (AWS DRS), Amazon GuardDuty and AWS Security Hub to address the challenges of malware detection for on-premises servers.

This combination of services provides a cost-effective way to continuously monitor on-premises servers for malware without impacting their performance. It also helps identify safe point-in-time backups for restoration by establishing the timeline of a compromise through centralized threat analytics.

  • AWS Elastic Disaster Recovery (AWS DRS) minimizes downtime and data loss with fast, reliable recovery of on-premises and cloud-based applications using affordable storage, minimal compute, and point-in-time recovery.

  • Amazon GuardDuty is a threat detection service that continuously monitors your AWS accounts and workloads for malicious activity and delivers detailed security findings for visibility and remediation.

  • AWS Security Hub is a cloud security posture management (CSPM) service that performs security best practice checks, aggregates alerts, and enables automated remediation.
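
The idea of restoring from a point in time that predates the compromise can be sketched in a few lines. This is a minimal illustration and not part of the solution's code: the function name `latest_safe_recovery_point` and its inputs (a list of snapshot timestamps and an estimated infection time) are hypothetical. In practice the infection time is only an estimate, so a more conservative cutoff, such as the time of the last clean scan, may be preferable.

```python
from datetime import datetime, timezone
from typing import Iterable, Optional


def latest_safe_recovery_point(
    snapshot_times: Iterable[datetime],
    estimated_infection_time: datetime,
) -> Optional[datetime]:
    """Return the newest snapshot taken strictly before the estimated infection.

    Hypothetical helper for illustration only. Returns None when every
    snapshot is at or after the cutoff, meaning no snapshot can be
    considered safe under this simple rule.
    """
    candidates = [t for t in snapshot_times if t < estimated_infection_time]
    return max(candidates, default=None)


if __name__ == "__main__":
    utc = timezone.utc
    # Four daily snapshots (illustrative values).
    snapshots = [datetime(2024, 1, day, tzinfo=utc) for day in (1, 2, 3, 4)]
    infected_at = datetime(2024, 1, 3, 12, 0, tzinfo=utc)
    print(latest_safe_recovery_point(snapshots, infected_at))
```

The comparison is strict on purpose: a snapshot taken at exactly the cutoff time is treated as unsafe.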

Architecture

[Image: architecture diagram]

Solution description

The Malware Scan solution assumes that on-premises servers are already being replicated with AWS DRS, and that Amazon GuardDuty and AWS Security Hub are enabled. The CDK stack in this repository deploys only the boxes labelled DRS Malware Scan in the architecture diagram.

  1. AWS DRS is replicating source servers from the on-premises environment to AWS (or from any cloud provider for that matter). For further details about setting up AWS DRS please follow the Quick Start Guide.
  2. Amazon GuardDuty is already enabled.
  3. AWS Security Hub is already enabled.
  4. The Malware Scan solution is triggered by a Schedule Rule in Amazon EventBridge (with prefix DrsMalwareScanStack-ScheduleScanRule). You can adjust the scan frequency as needed (e.g. once a day or once a week).
  5. The Schedule Rule in Amazon EventBridge triggers the Submit Orders lambda function (with prefix DrsMalwareScanStack-SubmitOrders) which gathers the source servers to scan from the Source Servers DynamoDB table.
  6. Orders are placed on the SQS FIFO queue named Scan Orders (with prefix DrsMalwareScanStack-ScanOrdersfifo). The queue is used to serialize scan requests mapped to the same DRS instance, preventing a race condition.
  7. The Process Order lambda picks a malware scan order from the queue and enriches it, preparing the upcoming malware scan operation. For instance, it inserts the ID of the replicating DRS instance associated with the DRS source server provided in the order. The output of Process Order is a set of malware scan commands containing all the information needed to invoke a GuardDuty malware scan.
  8. Malware scan operations are tracked using the DRSVolumeAnnotationsDDBTable at the volume-level, providing reporting capabilities.
  9. Malware scan commands are inserted in the Scan Commands SQS FIFO queue (with prefix DrsMalwareScanStack-ScanCommandsfifo) to increase resiliency.
  10. The Process Commands function submits queued scan commands at a maximum rate of 1 command per second to avoid API throttling. It triggers the on-demand malware scan function provided by Amazon GuardDuty.
  11. The execution of the on-demand Amazon GuardDuty Malware job can be monitored from the Amazon GuardDuty service.
  12. The outcome of the malware scan job is routed to Amazon CloudWatch Logs.
  13. The Subscription Filter lambda function receives the outcome of the scan and tracks the result using DynamoDB (step #14).
  14. The DRS Instance Annotations DynamoDB Table tracks the status of the malware scan job at the instance level.
  15. The CDK stack named ScanReportStack deploys the Scan Report lambda function (with prefix ScanReportStack-ScanReport) to populate the Amazon S3 bucket with prefix scanreportstack-scanreportbucket.
  16. AWS Security Hub aggregates and correlates findings from Amazon GuardDuty.
  17. The Security Hub finding event is caught by an EventBridge Rule (with prefix DrsMalwareScanStack-SecurityHubAnnotationsRule).
  18. The Security Hub Annotations lambda function (with prefix DrsMalwareScanStack-SecurityHubAnnotation) generates additional Notes (Annotations) to the Finding with contextualized information about the source server being affected. This additional information can be seen in the Notes section within the Security Hub Finding.
  19. The follow-up activities depend on the incident response process being adopted. For example, based on the date of the infection, AWS DRS can be used to perform a point-in-time recovery using a snapshot taken before the date of the malware infection.
  20. In a Multi-Account scenario, this solution can be deployed directly on the AWS account hosting the AWS DRS solution. The Amazon GuardDuty findings will be automatically sent to the centralized Security Account.
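
Steps 5 to 10 above can be sketched as a small, self-contained simulation. This is an illustration under stated assumptions, not the solution's actual code: the names `ScanOrder`, `ScanCommand`, `process_order`, `to_fifo_message` and `submit_throttled` are hypothetical, and using the replication instance ID as the FIFO `MessageGroupId` is only one plausible way to serialize scans that target the same DRS instance. `MessageGroupId` and `MessageDeduplicationId` are standard SQS FIFO message parameters.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class ScanOrder:
    source_server_id: str


@dataclass(frozen=True)
class ScanCommand:
    source_server_id: str
    replication_instance_id: str


def process_order(order: ScanOrder, replication_map: Dict[str, str]) -> ScanCommand:
    """Enrich an order with the replication instance that holds its data."""
    instance_id = replication_map[order.source_server_id]
    return ScanCommand(order.source_server_id, instance_id)


def to_fifo_message(command: ScanCommand) -> dict:
    """Build SQS FIFO message parameters for a command.

    Assumption: commands that share a replication instance share a message
    group, so the queue delivers them in order rather than concurrently.
    """
    return {
        "MessageBody": command.source_server_id,
        "MessageGroupId": command.replication_instance_id,
        "MessageDeduplicationId": (
            f"{command.replication_instance_id}:{command.source_server_id}"
        ),
    }


def submit_throttled(
    commands: Iterable[ScanCommand],
    submit: Callable[[ScanCommand], None],
    min_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call submit() for each command, at most once per min_interval seconds."""
    last_submit = None
    count = 0
    for command in commands:
        if last_submit is not None:
            remaining = min_interval - (clock() - last_submit)
            if remaining > 0:
                sleep(remaining)
        submit(command)
        last_submit = clock()
        count += 1
    return count
```

The `clock` and `sleep` parameters are injectable so that the throttling logic can be exercised without real waiting. In a deployed function, `submit` would wrap the call that starts the GuardDuty on-demand malware scan.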

Usage

Pre-requisites

  • An AWS Account.

  • AWS Elastic Disaster Recovery (AWS DRS) configured, with at least one source server in sync. If not, please check this documentation. The replication configuration must use EBS encryption with a customer managed key (CMK) from AWS Key Management Service (AWS KMS). Amazon GuardDuty Malware Protection does not support the default AWS managed key for EBS.

  • IAM Privileges to deploy the components of this solution.

  • Amazon GuardDuty enabled. If not, please check this documentation.

  • AWS Security Hub enabled. If not, please check this documentation.

    Warning
    Currently, Amazon GuardDuty malware scan does not support EBS volumes encrypted with the default AWS managed key for EBS. If you want to use this solution to scan your on-premises (or other-cloud) servers replicated with DRS, you need to set up DRS replication with your own encryption key in KMS. If your replicating servers currently use the default key, you can change the encryption settings to use your own KMS key in the DRS console.
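
As a rough aid for checking this prerequisite, the sketch below inspects a dictionary shaped like a DRS replication configuration. It is a heuristic under assumptions: the field names `ebsEncryption` and `ebsEncryptionKeyArn` and the value `CUSTOM` are modeled on the DRS replication configuration API and should be verified against the current API reference, and the function name `looks_like_customer_managed_key` is hypothetical. It cannot tell whether the referenced key is really customer managed, only that a custom key ARN has been configured.

```python
from typing import Any, Mapping


def looks_like_customer_managed_key(replication_config: Mapping[str, Any]) -> bool:
    """Return True if the configuration appears to use a custom KMS key.

    Assumed fields (verify against the DRS API reference):
      - "ebsEncryption": expected to be "CUSTOM" when a specific key is set
      - "ebsEncryptionKeyArn": ARN of the KMS key used for replication
    """
    uses_custom_mode = replication_config.get("ebsEncryption") == "CUSTOM"
    has_key_arn = bool(replication_config.get("ebsEncryptionKeyArn"))
    return uses_custom_mode and has_key_arn


if __name__ == "__main__":
    # Illustrative configuration; the ARN is a placeholder, not a real key.
    example = {
        "ebsEncryption": "CUSTOM",
        "ebsEncryptionKeyArn": "arn:aws:kms:us-east-1:111122223333:key/example",
    }
    print(looks_like_customer_managed_key(example))
```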

Deploy

  1. Create a Cloud9 environment with an Ubuntu image (at least t3.small for better performance) in your AWS account. Open your Cloud9 environment and clone the code in this repository. Note: Amazon Linux 2 ships Node.js v16, which has been unsupported since 2023-09-11.

    git clone https://github.com/aws-samples/drs-malware-scan
    
    cd drs-malware-scan
    
    sh check_loggroup.sh
    
  2. Deploy the CDK stack by running the following commands in the Cloud9 terminal, and confirm the deployment:

    npm install
    
    cdk bootstrap
    
    cdk deploy --all
    

    Note
    The solution is made of 2 stacks:

    • DrsMalwareScanStack: deploys all resources needed for the malware scanning feature. This stack is mandatory. To deploy only this stack, run cdk deploy DrsMalwareScanStack
    • ScanReportStack: deploys the resources needed for reporting (AWS Lambda and Amazon S3). This stack is optional. To deploy only this stack, run cdk deploy ScanReportStack

    If you want to deploy both stacks you can run cdk deploy --all

Configuration

  1. Make sure that the DRS source server(s) are continuously replicating and in a healthy replication state. This is required because of a limitation in the GuardDuty API: at the time of writing, there is no public AWS API to scan DRS snapshots. The only way to run a malware scan over DRS source server data is to scan the Replication Server(s) (Amazon EC2 instances) managed by AWS DRS. If replication is not in a healthy state, your DRS malware scan may not complete successfully. Confirm that ReadyforRecovery=Ready.

  2. Identify the Source Servers to scan. From all the servers being replicated with AWS DRS, determine the list of candidate servers to scan. From the AWS DRS console, copy the names of the source servers that you would like the solution to scan and paste them into a text editor. This list is used in the next step.

        sample

  3. Update the DynamoDB table. The list of servers to be scanned is stored in a DynamoDB table created by the CDK stack (with prefix DrsMalwareScanStack-SourceServersDDBTable). You have to create a DynamoDB item for each Source Server being replicated by AWS DRS that you want to scan. To do so, go to the Amazon DynamoDB service and follow these steps:

        sample

  4. Schedule the malware scan job. Go to the Amazon EventBridge service and modify the existing rule created by the stack. Edit the rule with prefix DrsMalwareScanStack-ScheduleScanRule and set the frequency of the malware scan. This rule is DISABLED by default, so please ENABLE it.

        sample

  5. Check that Amazon GuardDuty triggered a malware scan operation. To confirm that the solution is working as expected, check the Amazon GuardDuty -> Malware scans console. A few seconds after the scheduled time, you should see a job with ScanStatus = Running. If not, please check the Troubleshooting section below.

        sample

  6. Check AWS Security Hub for potential malware findings on on-premises servers. The Amazon GuardDuty integration with Security Hub enables you to send findings from GuardDuty to Security Hub.
    • Security Hub only shows findings, so malware scan jobs with ScanResult=Clean are not displayed in the Security Hub console (only those with ScanResult=Infected).
    • Within the AWS Security Hub console, you can go to Findings and apply a filter by ProductName=GuardDuty (as shown in the animation below).
    • The solution adds annotations to the Notes section of the finding, highlighting the names of the infected on-premises servers.
    • Security Hub is integrated with Amazon EventBridge to easily automate response and remediation activities, such as sending an email to a SOC or reporting the incident in a Slack channel. You can check this link for further details.

        sample

  7. Optional: Check the malware scan report file on S3. If you deployed the ScanReportStack stack, you can schedule the report to run as frequently as you need. The report extracts the content of the DRSVolumeAnnotationsDDBTable DynamoDB table and writes it to the Amazon S3 bucket created by the stack (with prefix scanreportstack-scanreportbucket). The report is cumulative, and the file is overwritten every time the rule is triggered.

    • Enable the Amazon EventBridge rule and set the schedule to run the report: edit the rule with prefix ScanReportStack-ScanReportRule to set the scanning frequency of the malware scan analysis and the list of DRS source server(s) to analyze. This rule is disabled by default, so please enable it.

           sample

    • To check the report, you can query the CSV file in the Amazon S3 bucket (with prefix scanreportstack-scanreportbucket).

           sample

  8. Optional: Multi-account configuration. If you have a designated security account that centralizes all security findings as part of a multi-account strategy, this solution also works. The security team can perform the same analysis from the security account, because the Security Hub and GuardDuty findings reported in the linked accounts are automatically copied to the centralized security account.
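
When editing the schedule of the EventBridge rules mentioned in the steps above, the frequency can be written as a rate expression. The helper below is a small sketch of that format, assuming the standard EventBridge `rate(value unit)` syntax in which the unit is singular for a value of 1 and plural otherwise. The function name `rate_expression` is hypothetical.

```python
def rate_expression(value: int, unit: str) -> str:
    """Build an EventBridge rate expression such as "rate(1 day)".

    unit must be given in singular form: "minute", "hour" or "day".
    """
    if value < 1:
        raise ValueError("value must be a positive integer")
    if unit not in ("minute", "hour", "day"):
        raise ValueError("unit must be 'minute', 'hour' or 'day'")
    suffix = "" if value == 1 else "s"
    return f"rate({value} {unit}{suffix})"


if __name__ == "__main__":
    print(rate_expression(1, "day"))   # daily scan
    print(rate_expression(7, "day"))   # weekly scan
```

A cron expression is the alternative when the scan has to run at a specific time of day rather than at a fixed interval.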

Troubleshooting

All lambda functions route logs to Amazon CloudWatch. You can verify the execution of each function by inspecting its CloudWatch log group. Look for log groups matching the /aws/lambda/DrsMalwareScanStack-* pattern.
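
If you already have a list of log group names (for example, exported from the console), the pattern above can be applied with a few lines of standard-library Python. This is only a convenience sketch: the function name `solution_log_groups` is hypothetical, and the sample names in the demo are made up.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def solution_log_groups(
    names: Iterable[str],
    pattern: str = "/aws/lambda/DrsMalwareScanStack-*",
) -> List[str]:
    """Return the log group names that match the pattern, sorted."""
    return sorted(name for name in names if fnmatchcase(name, pattern))


if __name__ == "__main__":
    # Made-up log group names for illustration.
    sample = [
        "/aws/lambda/DrsMalwareScanStack-SubmitOrdersEXAMPLE",
        "/aws/lambda/SomeOtherFunction",
        "/aws/guardduty/malware-scan-events",
    ]
    for name in solution_log_groups(sample):
        print(name)
```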

The duration of the malware scan operation depends on the number of servers and volumes to scan, and on their size. When Amazon GuardDuty finds malware, it generates a Security Hub finding. The solution intercepts this event and runs the $StackName-SecurityHubAnnotations lambda to augment the Security Hub finding with a note containing the name(s) of the DRS source server(s) with malware.

The SQS FIFO queues can be monitored using the Messages available and Messages in flight metrics from the Amazon SQS console.
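
The same two numbers are exposed as queue attributes, which makes it easy to check programmatically whether a queue has been drained. The sketch below assumes a dictionary shaped like the `Attributes` returned by the SQS `GetQueueAttributes` call, where `ApproximateNumberOfMessages` corresponds to messages available and `ApproximateNumberOfMessagesNotVisible` to messages in flight. The function name `queue_backlog` is hypothetical, and both counts are approximate by design.

```python
from typing import Dict, Mapping, Union


def queue_backlog(attributes: Mapping[str, str]) -> Dict[str, Union[int, bool]]:
    """Summarize queue depth from SQS queue attributes.

    SQS returns attribute values as strings, so they are converted to int.
    Missing attributes are treated as zero.
    """
    available = int(attributes.get("ApproximateNumberOfMessages", 0))
    in_flight = int(attributes.get("ApproximateNumberOfMessagesNotVisible", 0))
    return {
        "available": available,
        "in_flight": in_flight,
        "drained": available == 0 and in_flight == 0,
    }


if __name__ == "__main__":
    print(queue_backlog({
        "ApproximateNumberOfMessages": "3",
        "ApproximateNumberOfMessagesNotVisible": "1",
    }))
```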

The DRS Volume Annotations DynamoDB table keeps track of the status of each malware scan operation.

Amazon GuardDuty has documented reasons to skip scan operations. For further information, please check Reasons for skipping resource during malware scan.

To analyze logs from Amazon GuardDuty malware scan operations, check the /aws/guardduty/malware-scan-events Amazon CloudWatch log group. The default log retention period for this log group is 90 days, after which the log events are deleted automatically.

Cleanup

  1. Run the following command in your terminal:

    cdk destroy --all
    
  2. (Optional) Delete the CloudWatch log groups associated with Lambda Functions.

AWS Cost Estimation Analysis

For the purpose of this analysis, we assume a fictitious example scenario. The following cost estimates are based on services located in the US East (N. Virginia) region (us-east-1).

Estimated scenario:

  • 2 Source Servers to replicate (DR) (Total Storage: 100GB - 4 disks)
  • 3 TB Malware Scanned/Month
  • 30 days of EBS snapshot Retention period
  • Daily Malware scans

| Monthly Cost | Total Cost for 12 Months |
| --- | --- |
| 171.22 USD | 2,054.74 USD |

Service Breakdown:

| Service Name | Description | Monthly Cost (USD) |
| --- | --- | --- |
| AWS Elastic Disaster Recovery | 2 Source Servers / 1 Replication Server / 4 disks / 100GB / 30 days of EBS Snapshot Retention Period | 71.41 |
| Amazon GuardDuty | 3 TB Malware Scanned/Month | 94.56 |
| Amazon DynamoDB | 100MB / 1 Read/Second / 1 Write/Second | 3.65 |
| AWS Security Hub | 1 Account / 100 Security Checks / 1000 Findings Ingested | 0.10 |
| Amazon EventBridge | 1M custom events | 1.00 |
| Amazon CloudWatch | 1GB ingested/month | 0.50 |
| AWS Lambda | 5 ARM Lambda Functions - 128MB / 10secs | 0.00 |
| Amazon SQS | 2 SQS FIFO queues | 0.00 |
| Total | | 171.22 |

Note: The figures presented here are estimates based on the assumptions described above, derived from the AWS Pricing Calculator. For further details, please check this pricing calculator as a reference. You can adjust the service configuration in the referenced calculator to make your own estimate. This estimate does not include taxes or additional charges that might apply. Actual fees can vary based on usage and on any additional services not covered in this analysis. For critical environments, it is advisable to include a Business Support plan (not considered in the estimate).
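
The arithmetic behind the table can be reproduced with a short script, which is also a convenient starting point for plugging in your own figures. The per-service amounts are copied from the breakdown above, while the names `MONTHLY_COSTS_USD`, `monthly_total` and `annual_total` are hypothetical. Note that twelve times the rounded monthly total gives 2,054.64 USD, slightly below the 2,054.74 USD shown for 12 months. The difference may come from rounding inside the pricing calculator, but the source does not say.

```python
from decimal import Decimal
from typing import Mapping

# Monthly cost per service in USD, copied from the service breakdown.
MONTHLY_COSTS_USD = {
    "AWS Elastic Disaster Recovery": Decimal("71.41"),
    "Amazon GuardDuty": Decimal("94.56"),
    "Amazon DynamoDB": Decimal("3.65"),
    "AWS Security Hub": Decimal("0.10"),
    "Amazon EventBridge": Decimal("1.00"),
    "Amazon CloudWatch": Decimal("0.50"),
    "AWS Lambda": Decimal("0.00"),
    "Amazon SQS": Decimal("0.00"),
}


def monthly_total(costs: Mapping[str, Decimal]) -> Decimal:
    """Sum the per-service monthly costs."""
    return sum(costs.values(), Decimal("0"))


def annual_total(monthly: Decimal, months: int = 12) -> Decimal:
    """Project a monthly total over a number of months (no discounts applied)."""
    return monthly * months


if __name__ == "__main__":
    total = monthly_total(MONTHLY_COSTS_USD)
    print(f"Monthly: {total} USD")
    print(f"12 months: {annual_total(total)} USD")
```

`Decimal` is used instead of floating point so that the sums match the two-decimal figures exactly.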

Security

See CONTRIBUTING for more information.

Authors

License

This sample code is licensed under the MIT-0 License. See the LICENSE file.