[Inference] Implement NL-to-ESQL task #190433

dgieselaar · 2024-08-13T15:57:46Z

Implements the NL-to-ESQL task and migrates the Observability AI Assistant to use the new task. Most of the files are simply generated documentation. I've also included two scripts: one to generate the documentation, and another to evaluate the task against a real LLM.

TBD: run evaluation framework in Observability AI Assistant to ensure there are no performance regressions.

obltmachine · 2024-08-13T15:57:59Z

🤖 GitHub comments

Expand to view the GitHub comments

Just comment with:

/oblt-deploy : Deploy a Kibana instance using the Observability test environments.
run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

dgieselaar · 2024-08-15T17:16:39Z

@elasticmachine merge upstream

dgieselaar · 2024-08-28T08:33:18Z

@pgayvallet is currently working on adding Gemini/Bedrock adapters, so I'll put this back in draft until that work has completed.

dgieselaar · 2024-09-03T06:13:27Z

x-pack/plugins/inference/common/chat_complete/without_token_count_events.ts

@@ -8,7 +8,7 @@
 import { filter, OperatorFunction } from 'rxjs';
 import { ChatCompletionEvent, ChatCompletionEventType, ChatCompletionTokenCountEvent } from '.';

-export function withoutTokenCountEvents<T extends ChatCompletionEvent>(): OperatorFunction<
+export function withoutTokenCountEvents<T extends ChatCompletionEvent<any>>(): OperatorFunction<


Note to self: this shouldn't be needed

pgayvallet

LGTM in current state.

Left a few NITS, comments, and questions for follow-ups (on my side)

pgayvallet · 2024-09-03T05:56:08Z

x-pack/plugins/inference/common/chat_complete/is_chat_completion_message_event.ts

+import { ChatCompletionEvent, ChatCompletionEventType, ChatCompletionMessageEvent } from '.';
+import type { ToolOptions } from './tools';
+
+export function isChatCompletionMessageEvent<T extends ToolOptions<string>>(


NIT: isChatCompletionChunkEvent, isChatCompletionEvent and isChatCompletionMessageEvent could live in the same file (but I'll end up moving them myself anyway)
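
For illustration, here is a minimal sketch of what co-locating the guards in one module could look like. The event shapes and enum values below are simplified stand-ins (assumptions, not the plugin's real types), and `withoutTokenCounts` is a hypothetical array counterpart of the rxjs-based `withoutTokenCountEvents` operator.

```typescript
// Simplified stand-ins for the plugin's event types (assumed shapes).
enum ChatCompletionEventType {
  ChatCompletionChunk = 'chatCompletionChunk',
  ChatCompletionTokenCount = 'chatCompletionTokenCount',
  ChatCompletionMessage = 'chatCompletionMessage',
}

interface CompletionChunkEvent {
  type: ChatCompletionEventType.ChatCompletionChunk;
  content: string;
}

interface CompletionTokenCountEvent {
  type: ChatCompletionEventType.ChatCompletionTokenCount;
  tokens: { prompt: number; completion: number; total: number };
}

interface CompletionMessageEvent {
  type: ChatCompletionEventType.ChatCompletionMessage;
  content: string;
}

type CompletionEvent =
  | CompletionChunkEvent
  | CompletionTokenCountEvent
  | CompletionMessageEvent;

type NonTokenCountEvent = Exclude<CompletionEvent, CompletionTokenCountEvent>;

// All one-line guards live side by side in a single module.
function isChatCompletionChunkEvent(event: CompletionEvent): event is CompletionChunkEvent {
  return event.type === ChatCompletionEventType.ChatCompletionChunk;
}

function isChatCompletionMessageEvent(event: CompletionEvent): event is CompletionMessageEvent {
  return event.type === ChatCompletionEventType.ChatCompletionMessage;
}

function isTokenCountEvent(event: CompletionEvent): event is CompletionTokenCountEvent {
  return event.type === ChatCompletionEventType.ChatCompletionTokenCount;
}

// Hypothetical array version of the filter: drops token count events and
// narrows the element type at the same time.
function withoutTokenCounts(events: CompletionEvent[]): NonTokenCountEvent[] {
  return events.filter((event): event is NonTokenCountEvent => !isTokenCountEvent(event));
}

const sampleEvents: CompletionEvent[] = [
  { type: ChatCompletionEventType.ChatCompletionChunk, content: 'FROM logs-*' },
  {
    type: ChatCompletionEventType.ChatCompletionTokenCount,
    tokens: { prompt: 10, completion: 5, total: 15 },
  },
  { type: ChatCompletionEventType.ChatCompletionMessage, content: 'FROM logs-* | LIMIT 10' },
];

const filteredEvents = withoutTokenCounts(sampleEvents);
```

Keeping the guards together means a consumer needs a single import to narrow any event in the stream.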

pgayvallet · 2024-09-03T05:58:41Z

x-pack/plugins/inference/common/chat_complete/without_token_count_events.ts

@@ -8,7 +8,7 @@
 import { filter, OperatorFunction } from 'rxjs';
 import { ChatCompletionEvent, ChatCompletionEventType, ChatCompletionTokenCountEvent } from '.';

-export function withoutTokenCountEvents<T extends ChatCompletionEvent>(): OperatorFunction<
+export function withoutTokenCountEvents<T extends ChatCompletionEvent<any>>(): OperatorFunction<


NIT: would that work with unknown instead of any?
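
Whether `unknown` can replace `any` in a generic constraint depends on where the type parameter is used. The sketch below uses two hypothetical interfaces, `Producer` and `Consumer` (not the plugin's types), to show the difference. A parameter that only appears in output positions accepts `unknown`, while one that appears as the parameter of a function-typed property is rejected under `strictFunctionTypes`.

```typescript
// T only appears in an output position (covariant).
interface Producer<T> {
  produce: () => T;
}

// T appears as the parameter of a function-typed property (contravariant).
interface Consumer<T> {
  consume: (input: T) => void;
}

// `unknown` works as a constraint when T is only produced.
function acceptProducer<T extends Producer<unknown>>(producer: T): T {
  return producer;
}

// `any` disables the variance check, so every Consumer<X> satisfies it.
function acceptConsumerAny<T extends Consumer<any>>(consumer: T): T {
  return consumer;
}

// With `T extends Consumer<unknown>`, passing a Consumer<string> is a compile
// error under `strictFunctionTypes`, because `unknown` is not assignable to `string`:
//
//   function acceptConsumerUnknown<T extends Consumer<unknown>>(consumer: T): T { ... }
//   acceptConsumerUnknown(stringConsumer); // rejected by the type checker

const received: string[] = [];

const stringProducer: Producer<string> = { produce: () => 'query' };
const stringConsumer: Consumer<string> = {
  consume: (input) => {
    received.push(input.toUpperCase());
  },
};

const produced = acceptProducer(stringProducer).produce();
acceptConsumerAny(stringConsumer).consume('from logs');
```

If the event type only exposes its type parameter in output positions, `unknown` should be enough. Otherwise `any` (or a dedicated base type) remains the pragmatic choice.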

pgayvallet · 2024-09-03T06:01:00Z

x-pack/plugins/inference/common/index.ts

+ * 2.0.
+ */
+
+export {


What? So you mean you're aware that index files are supposed to be used for re-exporting, not living alone in their folder containing concrete stuff?

who'd have thought!

pgayvallet · 2024-09-03T06:04:20Z

x-pack/plugins/inference/common/output/create_output_api.ts

+      messages: ensureMultiTurn([
+        ...(messages || []),


Do you really think we want that done at that level? We already have that done for Gemini at the adapter's level:

kibana/x-pack/plugins/inference/server/chat_complete/adapters/gemini/gemini_adapter.ts

Lines 153 to 164 in 48e4dcf

    function messagesToGemini({ messages }: { messages: Message[] }): GeminiMessage[] {
      return messages.map(messageToGeminiMapper()).reduce<GeminiMessage[]>((output, message) => {
        // merging consecutive messages from the same user, as Gemini requires multi-turn messages
        const previousMessage = output.length ? output[output.length - 1] : undefined;
        if (previousMessage?.role === message.role) {
          previousMessage.parts.push(...message.parts);
        } else {
          output.push(message);
        }
        return output;
      }, []);
    }

Which is cleaner, as it can work with the underlying format and append things the right way instead of injecting `-` messages in the middle (that's not much noise, but still).

If other LLM services have the same limitation / restriction, I would do the patching at the adapter's level for them too, WDYT?

yeah you're probably right, I was trying to avoid the overhead of doing it three times, doesn't openai need it as well?

IIRC no, OpenAI doesn't care about it.

Now regarding the issue about doing it multiple times: I went with the approach to do it at the adapter's level because I felt like doing it during the conversion to the underlying LLM's syntax was more powerful, but if we want to avoid doing it multiple times, we can do that check / transform in the chat API before calling the adapter.
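
The two approaches discussed in this thread can be sketched side by side. This is an illustration under assumptions, not the code from the PR: `SimpleMessage`, `insertPlaceholders` and `mergeConsecutive` are hypothetical names, and the message shape is reduced to a role and a string.

```typescript
type Role = 'user' | 'assistant';

interface SimpleMessage {
  role: Role;
  content: string;
}

// Approach 1 (format-agnostic, done once before calling any adapter):
// insert a placeholder turn between two consecutive messages of the same role.
function insertPlaceholders(messages: SimpleMessage[], placeholder = '-'): SimpleMessage[] {
  const output: SimpleMessage[] = [];
  for (const message of messages) {
    const previous = output[output.length - 1];
    if (previous && previous.role === message.role) {
      const oppositeRole: Role = message.role === 'user' ? 'assistant' : 'user';
      output.push({ role: oppositeRole, content: placeholder });
    }
    output.push(message);
  }
  return output;
}

// Approach 2 (adapter level): merge consecutive messages of the same role,
// which avoids placeholder noise but has to be repeated per provider format.
function mergeConsecutive(messages: SimpleMessage[]): SimpleMessage[] {
  return messages.reduce<SimpleMessage[]>((output, message) => {
    const previous = output[output.length - 1];
    if (previous && previous.role === message.role) {
      previous.content = `${previous.content}\n${message.content}`;
    } else {
      // Copy so that merging never mutates the caller's messages.
      output.push({ ...message });
    }
    return output;
  }, []);
}

const conversation: SimpleMessage[] = [
  { role: 'user', content: 'Show me the top domains' },
  { role: 'user', content: 'Use the packetbeat-* data view' },
  { role: 'assistant', content: 'Here is a query' },
];

const withPlaceholders = insertPlaceholders(conversation);
const merged = mergeConsecutive(conversation);
```

Both functions return a strictly alternating conversation. The first keeps every original message intact at the cost of extra turns, while the second keeps the turn count minimal but changes message boundaries.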

pgayvallet · 2024-09-03T06:07:36Z

x-pack/plugins/inference/common/output/is_output_complete_event.ts

+
+import type { ToolOptions } from '../chat_complete/tools';
+
+export function isOutputCompleteEvent<TId extends string, TToolOptions extends ToolOptions<string>>(


NIT: same, the isOutput****Event guards don't need a dedicated file each for one-line functions.

pgayvallet · 2024-09-03T06:18:03Z

x-pack/plugins/inference/scripts/evaluation/evaluation_client.ts

+    evaluate: async (input, criteria) => {
+      const evaluation = await lastValueFrom(
+        outputApi('evaluate', {
+          connectorId,
+          system: `You are a helpful, respected assistant for evaluating task
+            inputs and outputs in the Elastic Platform.
+
+            Your goal is to verify whether the output of a task


As discussed on Slack, I'm afraid having a generic evaluation function for all tasks will be too naive. We already see with the NL-to-ESQL test suite that the evaluator gets bamboozled by the observed LLM: the evaluator has no knowledge of ESQL and no way to retrieve the documentation, so it doesn't spot invalid syntax or wrong parameter usage.

I'll be taking that one as a follow-up

agreed, def room for improvement here
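
One possible direction for that follow-up, sketched under assumptions: let each task pass reference documentation to the evaluator so it can check syntax against it. `EvaluationRequest`, `referenceDocs` and `buildEvaluationPrompt` are hypothetical names introduced for illustration and are not part of the evaluation client in this PR.

```typescript
interface EvaluationRequest {
  input: string;
  output: string;
  criteria: string[];
  // Hypothetical: task-specific documentation the evaluator can rely on.
  referenceDocs?: string[];
}

// Builds the text handed to the evaluator LLM. Reference documentation is
// appended only when the task provides some.
function buildEvaluationPrompt({
  input,
  output,
  criteria,
  referenceDocs = [],
}: EvaluationRequest): string {
  const numberedCriteria = criteria
    .map((criterion, index) => `${index + 1}. ${criterion}`)
    .join('\n');

  const sections = [
    'Evaluate whether the task output satisfies each criterion.',
    `Input:\n${input}`,
    `Output:\n${output}`,
    `Criteria:\n${numberedCriteria}`,
  ];

  if (referenceDocs.length > 0) {
    sections.push(
      `Reference documentation (use it to check syntax and parameters):\n${referenceDocs.join('\n---\n')}`
    );
  }

  return sections.join('\n\n');
}

const genericPrompt = buildEvaluationPrompt({
  input: 'Top 10 domains by doc count',
  output: 'FROM packetbeat-* | STATS doc_count = COUNT(*) BY destination.domain',
  criteria: ['The query groups by domain', 'The query limits the result to 10 rows'],
});

const esqlAwarePrompt = buildEvaluationPrompt({
  input: 'Top 10 domains by doc count',
  output: 'FROM packetbeat-* | STATS doc_count = COUNT(*) BY destination.domain',
  criteria: ['The query groups by domain', 'The query limits the result to 10 rows'],
  referenceDocs: ['LIMIT: restricts the number of rows returned by the query.'],
});
```

The generic prompt is unchanged for tasks without documentation, so existing scenarios would keep working while the NL-to-ESQL suite could opt in.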

pgayvallet · 2024-09-03T06:20:55Z

x-pack/plugins/inference/scripts/evaluation/scenarios/esql/index.spec.ts

+const buildTestDefinitions = (): Section[] => {
+  const testDefinitions: Section[] = [
+    {
+      title: 'ES|QL query generation',
+      tests: [
+        {
+          title: 'Generates a query to show the top 10 domains by doc count',
+          question: `For standard Elastic ECS compliant packetbeat data view (\`packetbeat-*\`),


We only have 13 tests here, which, given the very wide area of things we should be testing, is a very small number.

I'm not asking to add more in this PR, this is very fine for merging given it's just a port of the existing code from o11y, but I was wondering what the best way to add more tests and to cover more scenarios would be. Like, what was the initial strategy when those tests were added / how were they chosen?

These tests have been around for a while and they're probably too focused on the Observability domain; I would love to see us add more. As far as how they were chosen, it's a bit of a mix between "I want the Assistant to be able to do this" and "I see the Assistant makes this mistake".
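
To make that mix visible as the suite grows, each test could record why it exists. The sketch below is hypothetical: the `origin` field, `TestOrigin` and `summarizeByOrigin` are assumptions added for illustration, and the `Section` shape is reduced to the `title`, `tests` and `question` fields visible in the diff.

```typescript
// Hypothetical tag: was the test added to cover a desired capability,
// or to guard against a mistake the Assistant was seen making?
type TestOrigin = 'capability' | 'regression';

interface TestDefinition {
  title: string;
  question: string;
  origin: TestOrigin;
}

interface Section {
  title: string;
  tests: TestDefinition[];
}

// Counts tests per origin across all sections, to spot coverage gaps.
function summarizeByOrigin(sections: Section[]): Record<TestOrigin, number> {
  const summary: Record<TestOrigin, number> = { capability: 0, regression: 0 };
  for (const section of sections) {
    for (const test of section.tests) {
      summary[test.origin] += 1;
    }
  }
  return summary;
}

const sections: Section[] = [
  {
    title: 'ES|QL query generation',
    tests: [
      {
        title: 'Generates a query to show the top 10 domains by doc count',
        question: 'Show the top 10 domains by doc count for packetbeat-*',
        origin: 'capability',
      },
      {
        title: 'Does not invent unsupported functions',
        question: 'Compute the median response time per service',
        origin: 'regression',
      },
    ],
  },
];

const originSummary = summarizeByOrigin(sections);
```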

pgayvallet · 2024-09-03T06:29:34Z

x-pack/plugins/inference/scripts/load_esql_docs/load_esql_docs.ts


-          const builtDocsDir = Path.join(__dirname, '../../../../../../../built-docs');
+          const builtDocsDir = Path.join(__dirname, '../../../../../../built-docs');


NIT: import { REPO_ROOT } from '@kbn/repo-info';
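
The diff above shows why counting `..` segments is fragile: moving the script changed the number of levels. A minimal sketch of the difference follows, assuming the script lives at the path shown for this file. The `REPO_ROOT` constant here is a hard-coded stand-in, since `@kbn/repo-info` only resolves inside the Kibana repository.

```typescript
import * as path from 'path';

// Stand-in for `REPO_ROOT` from '@kbn/repo-info' (assumed checkout location).
const REPO_ROOT = '/work/kibana';

// Directory of the script after it moved into the inference plugin.
const scriptDir = path.posix.join(REPO_ROOT, 'x-pack/plugins/inference/scripts/load_esql_docs');

// Relative approach: correct only while the script stays at this exact depth.
const viaRelative = path.posix.join(scriptDir, '../../../../../../built-docs');

// The previous segment count, left unchanged after the move, overshoots.
const viaStaleRelative = path.posix.join(scriptDir, '../../../../../../../built-docs');

// Root-anchored approach: independent of where the script lives.
const viaRepoRoot = path.posix.join(REPO_ROOT, '..', 'built-docs');
```

Here `viaRelative` and `viaRepoRoot` point at the same sibling `built-docs` directory, while `viaStaleRelative` resolves one level too high.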

pgayvallet · 2024-09-03T06:33:08Z

x-pack/plugins/inference/scripts/load_esql_docs/load_esql_docs.ts


 yargs(process.argv.slice(2))
  .command(
    '*',
-    'Extract ES|QL documentation for the Observability AI Assistant',
+    'Extract ES|QL documentation',


No objections to me moving the ESQL doc generation script to a package in a follow-up? It's more consistent with how we've been working with scripts in AppEx, and I like the isolation of concerns (especially given we will likely be adding more extraction scripts later, e.g. for the elastic.co doc).

no, that's a good idea!

pgayvallet · 2024-09-03T06:34:55Z

x-pack/plugins/inference/scripts/util/read_kibana_config.ts

+export type KibanaConfig = ReturnType<typeof readKibanaConfig>;
+
+export const readKibanaConfig = () => {
+  const kibanaConfigDir = path.join(__filename, '../../../../../../config');


NIT: REPO_ROOT

stephmilovic · 2024-09-03T17:00:42Z

x-pack/plugins/inference/common/ensure_multi_turn.ts

+import { Message, MessageRole } from './chat_complete';
+
+function isUserMessage(message: Message): boolean {
+  return message.role !== MessageRole.Assistant;


Is MessageRole.Tool considered a user message then?

Yes, it is a reply from the user's system.
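
In other words, roles are split into two sides: the assistant, and everything that replies to it. A small sketch of that classification follows, with a hypothetical `alternatesTurns` helper that checks a conversation against it. The enum values and message shape are simplified assumptions rather than the plugin's definitions.

```typescript
enum MessageRole {
  User = 'user',
  Assistant = 'assistant',
  Tool = 'tool',
}

interface Message {
  role: MessageRole;
  content: string;
}

// Anything that is not the assistant counts as the user side, so a tool
// result is treated as a reply coming from the user's system.
function isUserMessage(message: Message): boolean {
  return message.role !== MessageRole.Assistant;
}

// Hypothetical helper: true when user-side and assistant turns strictly alternate.
function alternatesTurns(messages: Message[]): boolean {
  return messages.every((message, index) => {
    const previous = messages[index - 1];
    return previous === undefined || isUserMessage(message) !== isUserMessage(previous);
  });
}

const toolConversation: Message[] = [
  { role: MessageRole.User, content: 'How many hosts are reporting?' },
  { role: MessageRole.Assistant, content: 'Calling the query tool' },
  { role: MessageRole.Tool, content: '{"hosts": 42}' },
  { role: MessageRole.Assistant, content: 'There are 42 hosts.' },
];

const brokenConversation: Message[] = [
  { role: MessageRole.User, content: 'How many hosts are reporting?' },
  { role: MessageRole.Tool, content: '{"hosts": 42}' },
];

const toolConversationOk = alternatesTurns(toolConversation);
const brokenConversationOk = alternatesTurns(brokenConversation);
```

Under this classification a user message directly followed by a tool message counts as two consecutive user-side turns, which is the kind of sequence a multi-turn fix-up would need to handle.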

stephmilovic

Great work Dario, LGTM!

In a draft PR, I've implemented the new NL to ESQL task in a LangChain tool and ran it successfully with the Security solution's default assistant graph 🥳 . I will move forward once you merge: #192042

kibana-ci · 2024-09-04T15:28:18Z

💚 Build Succeeded

Buildkite Build
Commit: fe3596e
Kibana Serverless Image: docker.elastic.co/kibana-ci/kibana-serverless:pr-190433-fe3596ee99fd

Metrics [docs]

Module Count

Fewer modules lead to a faster build time

id	before	after	diff
`inference`	14	15	+1

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id	before	after	diff
`inference`	14	39	+25

Public APIs missing exports

Total count of every type that is part of your API that should be exported but is not. This will cause broken links in the API documentation system. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats exports for more detailed information.

id	before	after	diff
`inference`	11	13	+2

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id	before	after	diff
`inference`	5.2KB	5.6KB	+390.0B

Unknown metric groups

API count

id	before	after	diff
`inference`	16	41	+25

ESLint disabled line counts

id	before	after	diff
`inference`	0	1	+1

Total ESLint disabled count

id	before	after	diff
`inference`	2	3	+1

History

💛 Build #231924 was flaky 4cb642c
💛 Build #231894 was flaky e71fe3b
💛 Build #231847 was flaky c886419
💚 Build #231803 succeeded 1b57f13
💚 Build #231789 succeeded 47afdbe

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

neptunian

Tested in the UI and LGTM!

## Summary

Follow-up of #190433

Fix [#192762](#192762)

- Clean up and refactor the documentation generation script
- Tweak the documentation to improve efficiency and make better use of tokens
- Perform a human review of the generated content to make sure everything is accurate

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>

## Summary

Follow-up of elastic#190433

Fix [elastic#192762](elastic#192762)

- Clean up and refactor the documentation generation script
- Tweak the documentation to improve efficiency and make better use of tokens
- Perform a human review of the generated content to make sure everything is accurate

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>

(cherry picked from commit 3226eb6)

# Backport

This will backport the following commits from `main` to `8.x`:

- [[inference] NL-to-ESQL: improve doc generation (#192378)](#192378)

### Questions ?

Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Pierre Gayvallet <pierre.gayvallet@elastic.co>

[Inference] Implement NL-to-ESQL task

d1eeee3

[Inference] Migrate o11y Assistant to use NL-to-ESQL task

badb149

dgieselaar force-pushed the nl-to-esql-task branch from 8610ec1 to badb149 Compare August 13, 2024 15:58

dgieselaar added the v8.16.0 label Aug 13, 2024

dgieselaar marked this pull request as ready for review August 13, 2024 17:32

dgieselaar requested review from a team as code owners August 13, 2024 17:32

dgieselaar added the release_note:skip Skip the PR/issue when compiling release notes label Aug 13, 2024

Update test coverage

dcb213e

botelastic bot added ci:project-deploy-observability Create an Observability project Team:Obs AI Assistant Observability AI Assistant labels Aug 13, 2024

dgieselaar added 6 commits August 13, 2024 20:04

Appease linting gods

7d06a40

Appease linting gods more

bf3d01e

Fix tests

015b162

Merge branch 'main' of github.com:elastic/kibana into nl-to-esql-task

3f3a1d1

Improve tool calling decisions

7a63bf7

Merge branch 'main' of github.com:elastic/kibana into nl-to-esql-task

7f23db5

Merge branch 'main' into nl-to-esql-task

88a418e

dgieselaar mentioned this pull request Aug 27, 2024

Inference plugin: Add Gemini model adapter #191292

Merged

dgieselaar marked this pull request as draft August 28, 2024 08:34

pgayvallet added 6 commits August 30, 2024 13:10

Merge remote-tracking branch 'upstream/main' into nl-to-esql-task

fd85185

adapt things due to rebase

5fee0cd

structure consistency

4af0c90

remove duplicate tool

e1c0450

get rid of the stack_connectors dep

48e4dcf

fix imports and types

67fe6df

We have unit tests now Dario. Amazing, I know.

3a052ed

dgieselaar commented Sep 3, 2024

pgayvallet approved these changes Sep 3, 2024

dgieselaar and others added 5 commits September 3, 2024 09:44

Fix OutputAPI type issues

edbd3af

Review feedback

e30b709

[CI] Auto-commit changed files from 'node scripts/yarn_deduplicate'

9ea7e6c

Improve the ESQL task evaluation

8d516e2

yeah okay eslint

81b0712

stephmilovic reviewed Sep 3, 2024

pgayvallet added 2 commits September 3, 2024 20:40

Add a test + small tweak

47afdbe

moar tests

1b57f13

stephmilovic mentioned this pull request Sep 3, 2024

[Security solution] naturalLanguageToEsql Tool added to default assistant graph #192042

Merged

stephmilovic approved these changes Sep 3, 2024

pgayvallet added 5 commits September 4, 2024 08:54

add workaround for table rendering truncate glitch

c886419

extract table renderer

e71fe3b

Merge remote-tracking branch 'upstream/main' into nl-to-esql-task

b1ca6a0

nits

4cb642c

fix error when calling output without a schema

fe3596e

neptunian approved these changes Sep 4, 2024

dgieselaar merged commit 5c298a1 into elastic:main Sep 4, 2024
23 checks passed

dgieselaar deleted the nl-to-esql-task branch September 4, 2024 16:30

kibanamachine added the backport:skip This commit does not require backporting label Sep 4, 2024

This was referenced Sep 4, 2024

[inference] NL-to-ESQL: improve doc generation #192112

Closed

[inference] NL-to-ESQL: improve doc generation #192378

Merged
