[Security solution] naturalLanguageToEsql Tool added to default assistant graph #192042
Changes from 10 commits
5f4d328
5f8ba1b
7881215
2441e20
5bebe12
62e72b3
10a87ff
a97fee4
7330c3b
dd40a8b
cc06eab
96ae4eb
340649c

@@ -13,6 +13,7 @@
      "ml",
      "taskManager",
      "licensing",
      "inference",
      "spaces",
      "security"
    ]

@@ -48,6 +48,7 @@
    "@kbn/apm-utils",
    "@kbn/std",
    "@kbn/zod",
    "@kbn/inference-plugin"
  ],
  "exclude": [
    "target/**/*",

@@ -0,0 +1,83 @@
/*
 * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
 * or more contributor license agreements. Licensed under the Elastic License
 * 2.0; you may not use this file except in compliance with the Elastic License
 * 2.0.
 */

import { DynamicStructuredTool } from '@langchain/core/tools';
import { z } from '@kbn/zod';
import type { AssistantTool, AssistantToolParams } from '@kbn/elastic-assistant-plugin/server';
import { lastValueFrom } from 'rxjs';
import { naturalLanguageToEsql } from '@kbn/inference-plugin/server';
import { APP_UI_ID } from '../../../../common';

export type ESQLToolParams = AssistantToolParams;

const TOOL_NAME = 'NaturalLanguageESQLTool';

const toolDetails = {
  id: 'nl-to-esql-tool',
  name: TOOL_NAME,
  description: `You MUST use the "${TOOL_NAME}" function when the user wants to:
- visualize data
- run any arbitrary query
- breakdown or filter ES|QL queries that are displayed on the current page
- convert queries from another language to ES|QL
- asks general questions about ES|QL

DO NOT UNDER ANY CIRCUMSTANCES generate ES|QL queries or explain anything about the ES|QL query language yourself.
DO NOT UNDER ANY CIRCUMSTANCES try to correct an ES|QL query yourself - always use the "${TOOL_NAME}" function for this.

If the user asks for a query, and one of the dataset info functions was called and returned no results, you should still call the query function to generate an example query.

Even if the "${TOOL_NAME}" function was used before that, follow it up with the "${TOOL_NAME}" function. If a query fails, do not attempt to correct it yourself. Again you should call the "${TOOL_NAME}" function,
even if it has been called before.`,
};

export const NL_TO_ESQL_TOOL: AssistantTool = {
  ...toolDetails,
  sourceRegister: APP_UI_ID,
  isSupported: (params: ESQLToolParams): params is ESQLToolParams => {
    const { chain, isEnabledKnowledgeBase, modelExists } = params;
    return isEnabledKnowledgeBase && modelExists && chain != null;
Just making a note from my other recent work, but
  },
  getTool(params: ESQLToolParams) {
    if (!this.isSupported(params)) return null;

    const { connectorId, inference, logger, request } = params as ESQLToolParams;
    if (inference == null || connectorId == null) return null;

    const callNaturalLanguageToEsql = async (question: string) => {
      return lastValueFrom(
        naturalLanguageToEsql({
          client: inference.getClient({ request }),
spong marked this conversation as resolved.
          connectorId,
          input: question,
What exactly The task behaves better when additional context is provided (info about which index / index pattern is being targeted, the index's schema or relevant fields to avoid hallucinating fields, and so on). Is that kind of user query rewriting / enhancement performed before calling the task, or not at all? (This is not something that can be done in this black box, unfortunately.)

This is going to be however the LLM interprets how it needs to phrase the question to the tool. For example, take a look at this trace. Within this trace, click into the first orange graph step. Here you can see the user's question is:
The LLM is then given this message along with tools, tool descriptions, and tool schemas. The LLM then determines that it needs to create an input of
Right now the tool schema is:
We can try to add prompting for optional fields if an index is specified, but if the user does not provide it we will not have it. Should I try that and run evaluations to see if there is an improvement?
Patryk did have some code in his tool, before the NL to ESQL task was available, that validated the query alongside the available data views: https://github.com/elastic/kibana/pull/186489/files#diff-bf442ff72176edbab83ea0e5c13d7d23b9273d851bda705fbc6a000afd19232aR209 I did not include that in this PR. I was going to let him expand on that when he returns from PTO on Sept 23.

@pgayvallet, IMO the This would be extremely beneficial for both consistency and ease of use on the consumer side. Soon we'll be bundling this task into other/more complex tasks like generalized retrievers, visualizations, etc., so the less initial input and custom context packing required, the better. Looks like we'll need to update the interface to take some

Yeah, I mostly agree, except for that part:
Retrieving the mapping from an index is fine, but I'm wondering about the "resolving/deducing which index to use" part: how would you see that being done in the task itself? The task is a black box; it has no knowledge of the current "context" of the user. How do you see it being able to deduce which index to target? FWIW, the
          logger: {
            debug: (source) => {
              logger.debug(typeof source === 'function' ? source() : source);
            },
          },
        })
      );
    };

    return new DynamicStructuredTool({
      name: toolDetails.name,
      description: toolDetails.description,
      schema: z.object({
        question: z.string().describe(`The user's exact question about ESQL`),
      }),
      func: async (input) => {
        const generateEvent = await callNaturalLanguageToEsql(input.question);
        const answer = generateEvent.content ?? 'An error occurred in the tool';

        logger.debug(`Received response from NL to ESQL tool: ${answer}`);
        return answer;
      },
      tags: ['esql', 'query-generation', 'knowledge-base'],
      // TODO: Remove after ZodAny is fixed https://github.com/langchain-ai/langchainjs/blob/main/langchain-core/src/tools.ts
    }) as unknown as DynamicStructuredTool;
  },
};
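
The review thread on `input: question` raises the idea of passing index context along with the question. A minimal sketch of that idea follows, assuming optional `index` and `fields` inputs. Both are hypothetical and are not part of the schema in this PR, which only defines `question`. The helper name `buildTaskInput` is also invented for illustration.

```typescript
// Hypothetical input shape: `index` and `fields` are NOT in the PR's schema,
// which only defines `question`.
interface EsqlToolInput {
  question: string;
  index?: string;
  fields?: string[];
}

// Fold any optional context into the text that would be passed as `input`
// to the NL-to-ES|QL task. With no context, the question is returned unchanged.
function buildTaskInput({ question, index, fields }: EsqlToolInput): string {
  const context: string[] = [];
  if (index != null && index.trim() !== '') {
    context.push(`Target index or index pattern: ${index.trim()}`);
  }
  if (fields != null && fields.length > 0) {
    context.push(`Known fields: ${fields.join(', ')}`);
  }
  return context.length === 0 ? question : `${question}\n\n${context.join('\n')}`;
}

const bare = buildTaskInput({ question: 'Count failed logins by user' });
const enriched = buildTaskInput({
  question: 'Count failed logins by user',
  index: 'logs-*',
  fields: ['user.name', 'event.outcome'],
});
console.log(bare);
console.log(enriched);
```

Keeping the extra inputs optional matches the concern in the thread: when the user does not name an index, the tool still works with the question alone.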

@@ -7,17 +7,18 @@

import type { AssistantTool } from '@kbn/elastic-assistant-plugin/server';

import { ALERT_COUNTS_TOOL } from './alert_counts/alert_counts_tool';
import { ESQL_KNOWLEDGE_BASE_TOOL } from './esql_language_knowledge_base/esql_language_knowledge_base_tool';
import { NL_TO_ESQL_TOOL } from './esql_language_knowledge_base/esql_language_tool';
import { ALERT_COUNTS_TOOL } from './alert_counts/alert_counts_tool';
import { OPEN_AND_ACKNOWLEDGED_ALERTS_TOOL } from './open_and_acknowledged_alerts/open_and_acknowledged_alerts_tool';
import { ATTACK_DISCOVERY_TOOL } from './attack_discovery/attack_discovery_tool';
import { KNOWLEDGE_BASE_RETRIEVAL_TOOL } from './knowledge_base/knowledge_base_retrieval_tool';
import { KNOWLEDGE_BASE_WRITE_TOOL } from './knowledge_base/knowledge_base_write_tool';

export const getAssistantTools = (): AssistantTool[] => [
export const getAssistantTools = (naturalLanguageESQLToolEnabled: boolean): AssistantTool[] => [
nit: consider making
  ALERT_COUNTS_TOOL,
  ATTACK_DISCOVERY_TOOL,
  ESQL_KNOWLEDGE_BASE_TOOL,
  naturalLanguageESQLToolEnabled ? NL_TO_ESQL_TOOL : ESQL_KNOWLEDGE_BASE_TOOL,
  KNOWLEDGE_BASE_RETRIEVAL_TOOL,
  KNOWLEDGE_BASE_WRITE_TOOL,
  OPEN_AND_ACKNOWLEDGED_ALERTS_TOOL,

We'll probably want to update this to be less suggestive of the additional o11y functions like dataset info, visualize data, etc. I'll keep an eye out while testing for any flake here, but fine for now while behind a feature flag 👍
, etc.I'll keep an eye out while testing for any flake here, but fine for now while behind a feature flag 👍