Conversation

Contributor

@eric-forte-elastic eric-forte-elastic commented Oct 28, 2025

Pull Request

Issue link(s):

Resolves #5255

Summary - What I changed

Initial Approach

Added an alignment check for subqueries. The error-message column can become misaligned for unknown field types in subqueries, because the reported position may include the indentation offset of the subquery in the printed output rather than the field's exact location, which is what the function expects. This is caught and handled by checking whether the character immediately before the expected start of the field name is alphanumeric or a ".", both of which are valid field characters. If it is, the reported position is not the beginning of the field, so the start is moved backward until a non-field character (e.g. a space or parenthesis) is reached.
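
A minimal sketch of that backward scan (the helper name and example line are illustrative, not the repo's actual code):

def realign_field_start(line: str, start: int) -> int:
    """Walk backward from the reported column until a non-field character is hit."""

    def is_field_char(ch: str) -> bool:
        # alphanumerics and '.' are treated as valid field characters
        return ch.isalnum() or ch == "."

    # If the character just before `start` is a field character, the reported
    # column points into the middle of the field name, so keep stepping left.
    while start > 0 and is_field_char(line[start - 1]):
        start -= 1
    return start


line = ' [any where integration.Field in~ ("SELECT COUNT() FROM Fake")]'
print(line[13:].split()[0])                             # "ntegration.Field" (misaligned)
print(line[realign_field_start(line, 13):].split()[0])  # "integration.Field"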

Since the new schema validation parses EQL sequence queries into subqueries, the original row/column in the error is no longer accurate: it references the original query, not the subquery. My first fix was to have the auto add recalculate the correct offset with a maximum number of retries, but in testing I found other situations where it was impossible to recalculate the offset. The better/more correct fix, which is what is done now, is to parse the row/column at the point the error is thrown and pass the extracted field down to the auto add.
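
A hypothetical shape of that flow (the helper names are placeholders rather than the repo's actual API; realign_field_start is the backward scan sketched above):

import re

FIELD_NAME = re.compile(r"[A-Za-z0-9_.]+")


def extract_unknown_field(subquery: str, row: int, column: int) -> str:
    # row/column come from the validation error at the moment it is raised,
    # so they still refer to the subquery text they were computed against
    # (row assumed zero-based here)
    line = subquery.splitlines()[row]
    start = realign_field_start(line, column)
    match = FIELD_NAME.match(line, start)
    return match.group(0) if match else ""

The extracted field name is then handed straight to the auto add, instead of asking the auto add to re-derive an offset against the full sequence query.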

There are also 3 additional fixes in this PR:

  • This PR also fixes an issue where we did not allow rules with a risk_score of 0, which Kibana allows. 373877c

  • The rule prompt would not accurately load values of 0, which are permitted for fields other than risk_score.
    The diff 9aedd6d updates the rule prompt so that 0 values are not evaluated as False. This patch is required regardless of the risk_score change, as 0 should be considered a valid input value in the prompt (see the sketch after this list).

  • Additionally, now that schema validation of field types is fixed, pre-existing typos in detection_rules/etc/custom-consolidated-rules.ndjson surfaced and have been fixed.
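
To illustrate the 0-value pitfall from the second bullet (a minimal sketch, not the actual prompt code): truthiness checks drop 0 the same way they drop missing values, so the fallback has to compare against None explicitly.

def prompt_value(user_input, default=None):
    # buggy form: `return user_input or default` would silently replace a
    # legitimate 0 (or empty string) with the default
    if user_input is None:
        return default
    return user_input


assert prompt_value(0, default=21) == 0      # 0 is a valid risk_score and must survive
assert prompt_value(None, default=21) == 21  # missing input still falls back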

How To Test

Export a rule with a query similar to the following, starting from an empty etc/schemas/auto_gen.json file.

query = '''
sequence by user.email with maxspan=6h
 [any where integration.Field in~ ("SELECT COUNT() FROM Fake" , "SELECT COUNT() FROM User" , "SELECT COUNT() FROM AnotherFake")]
 [any where integration.Field in~ ("SELECT COUNT() FROM Fake" , "SELECT COUNT() FROM User" , "SELECT COUNT() FROM AnotherFake")]
 [any where integration.Field in~ ("SELECT COUNT() FROM Fake" , "SELECT COUNT() FROM User" , "SELECT COUNT() FROM AnotherFake")]
'''

The auto_gen.json file should be updated with integration.Field rather than the truncated ntegration.Field.
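
A quick, hypothetical sanity check of that outcome (it assumes field names appear as dictionary keys somewhere in auto_gen.json, which may not match the file's real layout):

import json
from pathlib import Path


def all_keys(node):
    # yield every dictionary key in a nested JSON structure
    if isinstance(node, dict):
        for key, value in node.items():
            yield key
            yield from all_keys(value)
    elif isinstance(node, list):
        for item in node:
            yield from all_keys(item)


keys = set(all_keys(json.loads(Path("etc/schemas/auto_gen.json").read_text())))
assert "integration.Field" in keys, "full field name should have been auto-added"
assert "ntegration.Field" not in keys, "a truncated name means the offset bug is back"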

Use the following rule as an example:

rules_export_py_fail.ndjson.txt

And run:

 python -m detection_rules import-rules-to-repo /home/forteea1/Downloads/rules_export_py_fail.ndjson --required-only -ac

This should result in the rule below. One can also run view-rule on the same rule to achieve the same effect.

Details

[metadata]
creation_date = "2025/10/28"
maturity = "development"
updated_date = "2025/10/28"

[rule]
actions = []
author = ["username"]
data_view_id = "7225791c-5821-46dc-8145-cba32b5b6e4e"
description = "Fake Rule"
enabled = true
false_positives = ["Fake Rule", "Fake Rule"]
filters = []
from = "now-365m"
interval = "5m"
language = "eql"
max_signals = 100
name = "Fake Rule with Subqueries"
note = "note"
references = ["Fake Rule"]
revision = 1
risk_score = 73
rule_id = "644539e9-72b1-4698-aec2-81cdaef009de"
setup = "None"
severity = "high"
tags = ["Integration", "Another Tag"]
timestamp_override = "event.ingested"
to = "now"
type = "eql"
version = 1

query = '''
sequence by user.email with maxspan=6h
 [any where Integration.Field in~ ("SELECT COUNT() FROM Account" , "SELECT COUNT() FROM User" , "SELECT COUNT() FROM Contact")]
 [any where Integration.FieldTwo in~ ("SELECT COUNT() FROM Account" , "SELECT COUNT() FROM User" , "SELECT COUNT() FROM Contact")]
 [any where Integration.FieldThree in~ ("SELECT COUNT() FROM Account" , "SELECT COUNT() FROM User" , "SELECT COUNT() FROM Contact")]
'''


[[rule.threat]]
framework = "MITRE ATT&CK"
[[rule.threat.technique]]
id = "T1526"
name = "Cloud Service Discovery"
reference = "https://attack.mitre.org/techniques/T1526/"
subtechnique = []


[rule.threat.tactic]
id = "TA0007"
name = "Discovery"
reference = "https://attack.mitre.org/tactics/TA0007/"

[rule.alert_suppression]
group_by = ["user.email"]
missing_fields_strategy = "suppress"

[rule.alert_suppression.duration]
unit = "h"
value = 24

We also need to support validation and auto-gen schema generation for rules in this case (the alternate EQL sequence form shown below).

Details

[metadata]
creation_date = "2020/02/18"
integration = ["endpoint", "windows"]
maturity = "production"
updated_date = "2025/03/20"

[rule]
author = ["Elastic"]
description = """
Indicates the creation of a scheduled task. Adversaries can use these to establish persistence, move laterally, and/or
escalate privileges.
"""
false_positives = ["Legitimate scheduled tasks may be created during installation of new software."]
from = "now-9m"
index = ["winlogbeat-*", "logs-endpoint.events.process-*", "logs-windows.sysmon_operational-*"]
language = "eql"
license = "Elastic License v2"
name = "Local Scheduled Task Creation"
note = """## Triage and analysis

> **Disclaimer**:
> This investigation guide was created using generative AI technology and has been reviewed to improve its accuracy and relevance. While every effort has been made to ensure its quality, we recommend validating the content and adapting it to suit your specific environment and operational needs.

### Investigating Local Scheduled Task Creation

Scheduled tasks in Windows automate routine tasks, but adversaries exploit them for persistence, lateral movement, or privilege escalation. They may use command-line tools like `schtasks.exe` to create tasks under non-system accounts. The detection rule identifies suspicious task creation by monitoring specific processes and command-line arguments, excluding those initiated by system-level users, to flag potential misuse.

### Possible investigation steps

- Review the process entity ID to identify the parent process that initiated the scheduled task creation. This can provide context on whether the task was created by a legitimate application or a potentially malicious one.
- Examine the command-line arguments used with schtasks.exe, specifically looking for unusual or suspicious parameters that might indicate malicious intent, such as unexpected task names or execution paths.
- Check the user account associated with the task creation to determine if it is a non-system account and assess whether this account should have the capability to create scheduled tasks.
- Investigate the integrity level of the process to confirm it is not running with elevated privileges, which could indicate an attempt to bypass security controls.
- Correlate the event with other recent activities on the host, such as file modifications or network connections, to identify any patterns or additional indicators of compromise.
- Review the code signature of the initiating process to determine if it is trusted or untrusted, which can help assess the legitimacy of the process creating the task.

### False positive analysis

- Scheduled tasks created by legitimate administrative tools or scripts may trigger false positives. Users should identify and whitelist these known benign processes to prevent unnecessary alerts.
- Routine maintenance tasks initiated by IT departments, such as software updates or system checks, can be mistaken for suspicious activity. Exclude these tasks by specifying their unique process names or command-line arguments.
- Tasks created by trusted third-party applications for legitimate purposes might be flagged. Review and exclude these applications by verifying their code signatures and adding them to an exception list.
- Automated tasks set up by non-system accounts for regular operations, like backups or monitoring, can be misinterpreted. Document these tasks and exclude them based on their specific parameters or user accounts involved.
- Consider excluding tasks with a consistent and verified schedule that aligns with organizational policies, as these are less likely to be malicious.

### Response and remediation

- Immediately isolate the affected system from the network to prevent potential lateral movement by the adversary.
- Terminate any suspicious scheduled tasks identified by the alert using Task Scheduler or command-line tools like schtasks.exe to stop further execution.
- Review and remove any unauthorized scheduled tasks created by non-system accounts to eliminate persistence mechanisms.
- Conduct a thorough scan of the affected system using updated antivirus or endpoint detection and response (EDR) tools to identify and remove any additional malicious artifacts.
- Analyze the user account involved in the task creation for signs of compromise, and reset credentials if necessary to prevent further unauthorized access.
- Escalate the incident to the security operations center (SOC) or incident response team for further investigation and to determine if additional systems are affected.
- Implement enhanced monitoring and logging for scheduled task creation events to detect similar threats in the future, ensuring alerts are configured to notify the appropriate teams promptly."""
references = [
    "https://www.elastic.co/security-labs/hunting-for-persistence-using-elastic-security-part-1",
    "https://www.elastic.co/security-labs/hunting-for-persistence-using-elastic-security-part-2",
    "https://www.elastic.co/security-labs/invisible-miners-unveiling-ghostengine",
    "https://www.elastic.co/security-labs/elastic-protects-against-data-wiper-malware-targeting-ukraine-hermeticwiper",
]
risk_score = 21
rule_id = "afcce5ad-65de-4ed2-8516-5e093d3ac99a"
severity = "low"
tags = [
    "Domain: Endpoint",
    "OS: Windows",
    "Use Case: Threat Detection",
    "Tactic: Persistence",
    "Data Source: Elastic Defend",
    "Data Source: Sysmon",
    "Resources: Investigation Guide",
]
type = "eql"

query = '''
sequence with maxspan=1m
  [process where host.os.type == "windows" and event.type == "start" and
    ((process.fake : ("cmd.exe", "wscript.exe", "rundll32.exe", "regsvr32.exe", "wmic.exe", "mshta.exe",
                      "powershell.exe", "pwsh.exe", "powershell_ise.exe", "WmiPrvSe.exe", "wsmprovhost.exe", "winrshost.exe") or
    process.pe.original_file_name : ("cmd.exe", "wscript.exe", "rundll32.exe", "regsvr32.exe", "wmic.exe", "mshta.exe",
                                     "powershell.exe", "pwsh.dll", "powershell_ise.exe", "WmiPrvSe.exe", "wsmprovhost.exe",
                                     "winrshost.exe")) or
    ?process.code_signature.trusted == false)] by process.entity_id
  [process where host.os.type == "windows" and event.type == "start" and
    (process.name : "schtasks.exe" or process.pe.original_file_name == "schtasks.exe") and
    process.args : ("/create", "-create") and process.args : ("/RU", "/SC", "/TN", "/TR", "/F", "/XML") and
    /* exclude SYSTEM Integrity Level - look for task creations by non-SYSTEM user */
    not (?process.Ext.token.integrity_level_name : "System" or ?winlog.event_data.IntegrityLevel : "System")
  ] by process.parent.entity_id
'''


[[rule.threat]]
framework = "MITRE ATT&CK"
[[rule.threat.technique]]
id = "T1053"
name = "Scheduled Task/Job"
reference = "https://attack.mitre.org/techniques/T1053/"
[[rule.threat.technique.subtechnique]]
id = "T1053.005"
name = "Scheduled Task"
reference = "https://attack.mitre.org/techniques/T1053/005/"



[rule.threat.tactic]
id = "TA0003"
name = "Persistence"
reference = "https://attack.mitre.org/tactics/TA0003/"


To test the risk_score fix/update to allow 0:

sed -i 's/"risk_score":[0-9]\+/"risk_score":0/g' detection_rules/etc/custom-consolidated-rules.ndjson

make test-remote-cli

Outcome:
make_test_cli.txt

Checklist

  • Added a label for the type of pr: bug, enhancement, schema, maintenance, Rule: New, Rule: Deprecation, Rule: Tuning, Hunt: New, or Hunt: Tuning so guidelines can be generated
  • Added the meta:rapid-merge label if planning to merge within 24 hours
  • Secret and sensitive material has been managed correctly
  • Automated testing was updated or added to match the most common scenarios
  • Documentation and comments were added for features that require explanation

Contributor checklist

@eric-forte-elastic eric-forte-elastic self-assigned this Oct 28, 2025
@eric-forte-elastic eric-forte-elastic added the bug and patch labels Oct 28, 2025
@eric-forte-elastic eric-forte-elastic linked an issue Oct 28, 2025 that may be closed by this pull request
@botelastic botelastic bot added the python label Oct 28, 2025
@github-actions
Contributor

Bug - Guidelines

These guidelines serve as a reminder set of considerations when addressing a bug in the code.

Documentation and Context

  • Provide detailed documentation (description, screenshots, reproducing the bug, etc.) of the bug if not already documented in an issue.
  • Include additional context or details about the problem.
  • Ensure the fix includes necessary updates to the release documentation and versioning.

Code Standards and Practices

  • Code follows established design patterns within the repo and avoids duplication.
  • Ensure that the code is modular and reusable where applicable.

Testing

  • New unit tests have been added to cover the bug fix or edge cases.
  • Existing unit tests have been updated to reflect the changes.
  • Provide evidence of testing and detecting the bug fix (e.g., test logs, screenshots).
  • Validate that any rules affected by the bug are correctly updated.
  • Ensure that performance is not negatively impacted by the changes.
  • Verify that any release artifacts are properly generated and tested.
  • Conducted system testing, including fleet, import, and create APIs (e.g., run make test-cli, make test-remote-cli, make test-hunting-cli)

Additional Checks

  • Verify that the bug fix works across all relevant environments (e.g., different OS versions).
  • Confirm that the proper version label is applied to the PR: patch, minor, or major.

@eric-forte-elastic eric-forte-elastic marked this pull request as draft October 28, 2025 16:00
@eric-forte-elastic eric-forte-elastic marked this pull request as ready for review October 28, 2025 17:39
@botelastic botelastic bot added the schema label Oct 28, 2025
Contributor Author

eric-forte-elastic commented Oct 28, 2025

This PR also fixes an issue where we did not allow rules with a risk_score of 0, which Kibana allows. 373877c

The diff 9aedd6d updates the rule prompt so that 0 values are not evaluated as False. This patch is required regardless of the risk_score change, as 0 should be considered a valid input value in the prompt.

Additionally, now that schema validation of field types is fixed, typos in detection_rules/etc/custom-consolidated-rules.ndjson surfaced and have been fixed via 69eedd0.

To test the risk_score fix/update to allow 0:

sed -i 's/"risk_score":[0-9]\+/"risk_score":0/g' detection_rules/etc/custom-consolidated-rules.ndjson

make test-remote-cli

Outcome:
make_test_cli.txt

@shashank-elastic shashank-elastic mentioned this pull request Oct 29, 2025
Contributor

@shashank-elastic shashank-elastic left a comment

Executed test-remote-cli from the modified ndjson and it works as expected.

Labels

backport: auto, bug (Something isn't working), patch, python (Internal python for the repository), schema

Development

Successfully merging this pull request may close these issues.

[Bug] [DAC] Auto Gen Schema Fails on Certain Subqueries

3 participants