[Security Assistant] Migrates to LangGraph and adds KB Tools #184554

Merged: 26 commits, Jun 14, 2024
Commits
0b03e98
Migrates to LangSmith and adds KB Tools
spong May 30, 2024
b888052
Merge branch 'main' of github.com:elastic/kibana into kb-tools
spong May 30, 2024
de9d530
yarn.lock update
spong May 31, 2024
b862a29
Port title generation into langgraph node
spong May 31, 2024
0b0bdab
Merge branch 'main' of github.com:elastic/kibana into kb-tools
spong Jun 3, 2024
6451142
yarn.lock update
spong Jun 3, 2024
14b0c65
Primes initial context with required kb docs and updates kb retrieval…
spong Jun 4, 2024
62754d3
Merge branch 'main' of github.com:elastic/kibana into kb-tools
spong Jun 4, 2024
83cdc37
Remove chat title generation node
spong Jun 4, 2024
b04da0e
yarn.lock update
spong Jun 4, 2024
c564e2e
Merge branch 'main' into kb-tools
spong Jun 4, 2024
5bc866b
yarn.lock update2
spong Jun 4, 2024
8477e13
fix types
patrykkopycinski Jun 4, 2024
7a92a86
Merge branch 'main' of github.com:elastic/kibana into kb-tools
spong Jun 5, 2024
d9a5b54
Merge branch 'main' of github.com:elastic/kibana into kb-tools
spong Jun 6, 2024
c3dc271
test
patrykkopycinski Jun 12, 2024
800ffc7
Merge branch 'main' of github.com:elastic/kibana into kb-tools
patrykkopycinski Jun 12, 2024
249339f
fix
patrykkopycinski Jun 12, 2024
684a5f1
Fix tests missing logger
spong Jun 13, 2024
837c29c
Merge branch 'main' of github.com:elastic/kibana into kb-tools
spong Jun 13, 2024
81ec705
Merge branch 'main' into kb-tools
kibanamachine Jun 13, 2024
5fca287
Update langgraph and remove TS5 workaround
spong Jun 13, 2024
d38107d
Update import
spong Jun 13, 2024
3d2c140
Merge branch 'main' of github.com:elastic/kibana into kb-tools
spong Jun 13, 2024
269f8aa
Merge branch 'main' into kb-tools
spong Jun 14, 2024
f9505d9
Skip flakey test
spong Jun 14, 2024
@@ -33,11 +33,11 @@ export const KnowledgeBaseEntryErrorSchema = z
 export type Metadata = z.infer<typeof Metadata>;
 export const Metadata = z.object({
   /**
-   * Knowledge Base resource name
+   * Knowledge Base resource name for grouping entries, e.g. 'esql', 'lens-docs', etc
    */
   kbResource: z.string(),
   /**
-   * Original text content source
+   * Source document name or filepath
    */
   source: z.string(),
   /**
@@ -32,10 +32,10 @@ components:
       properties:
         kbResource:
           type: string
-          description: Knowledge Base resource name
+          description: Knowledge Base resource name for grouping entries, e.g. 'esql', 'lens-docs', etc
         source:
           type: string
-          description: Original text content source
+          description: Source document name or filepath
         required:
           type: boolean
           description: Whether or not this resource should always be included
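To make the revised field descriptions concrete, here is a minimal sketch of metadata values that would fit this schema. The `KbMetadata` interface and the `groupByResource` helper are hypothetical stand-ins written for illustration, not code from this PR, and the sample `source` values are made up; only the three field names and their meanings come from the schema above.

```typescript
// Hypothetical mirror of the Metadata schema above (not the generated type).
interface KbMetadata {
  kbResource: string; // grouping tag, e.g. 'esql' or 'lens-docs'
  source: string; // source document name or filepath
  required: boolean; // whether the entry should always be included
}

// Groups entries by their kbResource tag.
function groupByResource(entries: KbMetadata[]): Map<string, KbMetadata[]> {
  const groups = new Map<string, KbMetadata[]>();
  for (const entry of entries) {
    const group = groups.get(entry.kbResource) ?? [];
    group.push(entry);
    groups.set(entry.kbResource, group);
  }
  return groups;
}

// Illustrative sample data only; these paths do not come from the PR.
const sampleEntries: KbMetadata[] = [
  { kbResource: 'esql', source: 'docs/esql/functions.md', required: false },
  { kbResource: 'esql', source: 'docs/esql/commands.md', required: true },
  { kbResource: 'user', source: 'conversation', required: false },
];

const grouped = groupByResource(sampleEntries);
console.log([...grouped.keys()]);
```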
@@ -18,8 +18,8 @@ import { CreateKnowledgeBaseEntrySchema } from './types';

 export interface CreateKnowledgeBaseEntryParams {
   esClient: ElasticsearchClient;
-  logger: Logger;
   knowledgeBaseIndex: string;
+  logger: Logger;
   spaceId: string;
   user: AuthenticatedUser;
   knowledgeBaseEntry: KnowledgeBaseEntryCreateProps;
@@ -6,6 +6,8 @@
  */

 import { errors } from '@elastic/elasticsearch';
+import { QueryDslQueryContainer } from '@elastic/elasticsearch/lib/api/types';
+import { AuthenticatedUser } from '@kbn/core-security-common';

 export const isModelAlreadyExistsError = (error: Error) => {
   return (
@@ -14,3 +16,87 @@ export const isModelAlreadyExistsError = (error: Error) => {
       error.body.error.type === 'status_exception')
   );
 };
+
+/**
+ * Returns an Elasticsearch query DSL that performs a vector search against the Knowledge Base for the given query/user/filter.
+ *
+ * @param filter - Optional filter to apply to the search
+ * @param kbResource - Specific resource tag to filter for, e.g. 'esql' or 'user'
+ * @param modelId - ID of the model to search with, e.g. `.elser_model_2`
+ * @param query - The search query provided by the user
+ * @param required - Whether to only include required entries
+ * @param user - The authenticated user
+ * @returns
+ */
+export const getKBVectorSearchQuery = ({
+  filter,
+  kbResource,
+  modelId,
+  query,
+  required,
+  user,
+}: {
+  filter?: QueryDslQueryContainer | undefined;
+  kbResource?: string | undefined;
+  modelId: string;
+  query: string;
+  required?: boolean | undefined;
+  user: AuthenticatedUser;
+}): QueryDslQueryContainer => {
+  const resourceFilter = kbResource
+    ? [
+        {
+          term: {
+            'metadata.kbResource': kbResource,
+          },
+        },
+      ]
+    : [];
+  const requiredFilter = required
+    ? [
+        {
+          term: {
+            'metadata.required': required,
+          },
+        },
+      ]
+    : [];
+
+  const userFilter = [
+    {
+      nested: {
Review comment (Contributor):

I think we should extend the filter to support empty ("shared") KBs, which are available to all users in the space.

Reply (Member Author):

I'm working through adding the different permutations of KB mock data today for #184974 and will use that to start crafting tests to cover these scenarios.

I want to take a moment to rethink how we're categorizing/namespacing content using `kbResource`, though, to see if there will be any issues here. For now I've kept things matching the original KB implementation and just introduced the `user` kbResource for all user-created entries. We will also need to capture `entryType` somewhere to differentiate between raw text content and index-backed entries. It would be nice to have catch-all `tags` for organization and labeling too.

I will start thinking through more of this early next week as I get back to plumbing through the remainder of the kbDataClient methods and REST APIs, but if you have any thoughts here I would love to hear them 🙂

+        path: 'users',
+        query: {
+          bool: {
+            must: [
+              {
+                match: user.profile_uid
+                  ? { 'users.id': user.profile_uid }
+                  : { 'users.name': user.username },
+              },
+            ],
+          },
+        },
+      },
+    },
+  ];
+
+  return {
+    bool: {
+      must: [
+        {
+          text_expansion: {
+            'vector.tokens': {
+              model_id: modelId,
+              model_text: query,
+            },
+          },
+        },
+        ...requiredFilter,
+        ...resourceFilter,
+        ...userFilter,
+      ],
+      filter,
+    },
+  };
+};
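The review thread on `userFilter` asks for shared entries (entries with no `users`) to be matched as well. A rough sketch of what that could look like follows. It is not part of this PR: `buildUserOrSharedFilter` and the local `SketchUser` and `QueryClause` types are hypothetical, and the `must_not` + `nested` + `match_all` clause is one common way to match documents that have no nested `users` objects, assumed here rather than taken from the source.

```typescript
// Local stand-ins so the sketch is self-contained; the PR itself uses
// QueryDslQueryContainer and AuthenticatedUser.
type QueryClause = Record<string, unknown>;
interface SketchUser {
  username: string;
  profile_uid?: string;
}

// Same shape as the userFilter in the PR: match entries owned by the user,
// preferring the profile uid and falling back to the username.
function buildUserFilter(user: SketchUser): QueryClause {
  return {
    nested: {
      path: 'users',
      query: {
        bool: {
          must: [
            {
              match: user.profile_uid
                ? { 'users.id': user.profile_uid }
                : { 'users.name': user.username },
            },
          ],
        },
      },
    },
  };
}

// Hypothetical extension: also match "shared" entries that have no users.
function buildUserOrSharedFilter(user: SketchUser): QueryClause {
  const sharedFilter: QueryClause = {
    bool: {
      must_not: [{ nested: { path: 'users', query: { match_all: {} } } }],
    },
  };
  return {
    bool: {
      should: [buildUserFilter(user), sharedFilter],
      minimum_should_match: 1,
    },
  };
}

const exampleFilter = buildUserOrSharedFilter({ username: 'elastic' });
console.log(JSON.stringify(exampleFilter, null, 2));
```

Wrapping both clauses in a `should` with `minimum_should_match: 1` keeps the result usable as a single entry in the outer `must` array, so the rest of the query would not need to change.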
@@ -11,19 +11,23 @@ import {
 } from '@elastic/elasticsearch/lib/api/types';
 import type { MlPluginSetup } from '@kbn/ml-plugin/server';
 import type { KibanaRequest } from '@kbn/core-http-server';
-import type { Document } from 'langchain/document';
+import { Document } from 'langchain/document';
 import type { SavedObjectsClientContract } from '@kbn/core-saved-objects-api-server';
-import { KnowledgeBaseEntryResponse } from '@kbn/elastic-assistant-common';
+import {
+  KnowledgeBaseEntryCreateProps,
+  KnowledgeBaseEntryResponse,
+} from '@kbn/elastic-assistant-common';
 import pRetry from 'p-retry';
+import { QueryDslQueryContainer } from '@elastic/elasticsearch/lib/api/typesWithBodyKey';
 import { AIAssistantDataClient, AIAssistantDataClientParams } from '..';
 import { ElasticsearchStore } from '../../lib/langchain/elasticsearch_store/elasticsearch_store';
 import { loadESQL } from '../../lib/langchain/content_loaders/esql_loader';
 import { GetElser } from '../../types';
-import { transformToCreateSchema } from './create_knowledge_base_entry';
+import { createKnowledgeBaseEntry, transformToCreateSchema } from './create_knowledge_base_entry';
 import { EsKnowledgeBaseEntrySchema } from './types';
 import { transformESSearchToKnowledgeBaseEntry } from './transforms';
 import { ESQL_DOCS_LOADED_QUERY } from '../../routes/knowledge_base/constants';
-import { isModelAlreadyExistsError } from './helpers';
+import { getKBVectorSearchQuery, isModelAlreadyExistsError } from './helpers';

 interface KnowledgeBaseDataClientParams extends AIAssistantDataClientParams {
   ml: MlPluginSetup;
@@ -217,8 +221,7 @@ export class AIAssistantKnowledgeBaseDataClient extends AIAssistantDataClient {
   /**
    * Adds LangChain Documents to the knowledge base
    *
-   * @param documents
-   * @param authenticatedUser
+   * @param documents LangChain Documents to add to the knowledge base
    */
   public addKnowledgeBaseDocuments = async ({
     documents,
@@ -261,4 +264,100 @@ export class AIAssistantKnowledgeBaseDataClient extends AIAssistantDataClient {

     return created?.data ? transformESSearchToKnowledgeBaseEntry(created?.data) : [];
   };
+
+  /**
+   * Performs similarity search to retrieve LangChain Documents from the knowledge base
+   */
+  public getKnowledgeBaseDocuments = async ({
+    filter,
+    kbResource,
+    query,
+    required,
+  }: {
+    filter?: QueryDslQueryContainer;
+    kbResource?: string;
+    query: string;
+    required?: boolean;
+  }): Promise<Document[]> => {
+    const user = this.options.currentUser;
+    if (user == null) {
+      throw new Error(
+        'Authenticated user not found! Ensure kbDataClient was initialized from a request.'
+      );
+    }
+
+    const esClient = await this.options.elasticsearchClientPromise;
+    const modelId = await this.options.getElserId();
+
+    const vectorSearchQuery = getKBVectorSearchQuery({
+      filter,
+      kbResource,
+      modelId,
+      query,
+      required,
+      user,
+    });
+
+    try {
+      const result = await esClient.search<EsKnowledgeBaseEntrySchema>({
+        index: this.indexTemplateAndPattern.alias,
+        size: 10,
+        query: vectorSearchQuery,
+      });
+
+      const results = result.hits.hits.map(
+        (hit) =>
+          new Document({
+            pageContent: hit?._source?.text ?? '',
+            metadata: hit?._source?.metadata ?? {},
+          })
+      );
+
+      this.options.logger.debug(
+        `getKnowledgeBaseDocuments() - Similarity Search Query:\n ${JSON.stringify(
+          vectorSearchQuery
+        )}`
+      );
+      this.options.logger.debug(
+        `getKnowledgeBaseDocuments() - Similarity Search Results:\n ${JSON.stringify(results)}`
+      );
+
+      return results;
+    } catch (e) {
+      this.options.logger.error(`Error performing KB Similarity Search: ${e.message}`);
+      return [];
+    }
+  };
+
+  /**
+   * Creates a new Knowledge Base Entry.
+   *
+   * @param knowledgeBaseEntry
+   */
+  public createKnowledgeBaseEntry = async ({
+    knowledgeBaseEntry,
+  }: {
+    knowledgeBaseEntry: KnowledgeBaseEntryCreateProps;
+  }): Promise<KnowledgeBaseEntryResponse | null> => {
+    const authenticatedUser = this.options.currentUser;
+    if (authenticatedUser == null) {
+      throw new Error(
+        'Authenticated user not found! Ensure kbDataClient was initialized from a request.'
+      );
+    }
+
+    this.options.logger.debug(
+      `Creating Knowledge Base Entry:\n ${JSON.stringify(knowledgeBaseEntry, null, 2)}`
+    );
+    this.options.logger.debug(`kbIndex: ${this.indexTemplateAndPattern.alias}`);
+    const esClient = await this.options.elasticsearchClientPromise;
+    return createKnowledgeBaseEntry({
+      esClient,
+      knowledgeBaseIndex: this.indexTemplateAndPattern.alias,
+      logger: this.options.logger,
+      spaceId: this.spaceId,
+      user: authenticatedUser,
+      knowledgeBaseEntry,
+    });
+  };
 }
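The hit-to-Document mapping inside `getKnowledgeBaseDocuments` can be tried in isolation with the sketch below. `SketchDocument`, `SketchHit`, and `hitsToDocuments` are hypothetical stand-ins for the LangChain `Document` class and the Elasticsearch hit type, introduced only so the example runs without dependencies, and the sample hit content is made up.

```typescript
// Minimal stand-ins for the LangChain Document and the Elasticsearch hit shape.
interface SketchDocument {
  pageContent: string;
  metadata: Record<string, unknown>;
}
interface SketchHit {
  _source?: { text?: string; metadata?: Record<string, unknown> };
}

// Mirrors the mapping in getKnowledgeBaseDocuments: missing fields fall back
// to an empty string or an empty object instead of throwing.
function hitsToDocuments(hits: SketchHit[]): SketchDocument[] {
  return hits.map((hit) => ({
    pageContent: hit._source?.text ?? '',
    metadata: hit._source?.metadata ?? {},
  }));
}

const docs = hitsToDocuments([
  { _source: { text: 'FROM logs | LIMIT 10', metadata: { kbResource: 'esql' } } },
  {}, // a hit without _source still yields a (blank) document
]);
console.log(docs.length);
```

Because the method in the diff also returns `[]` when the search throws, callers cannot tell a failed search from an empty result without checking the error log.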
@@ -89,12 +89,13 @@ export const callAgentExecutor: AgentExecutor<true | false> = async ({

   // Fetch any applicable tools that the source plugin may have registered
   const assistantToolParams: AssistantToolParams = {
-    anonymizationFields,
     alertsIndexPattern,
-    isEnabledKnowledgeBase,
+    anonymizationFields,
     chain,
-    llm,
     esClient,
+    isEnabledKnowledgeBase,
+    llm,
+    logger,
     modelExists,
     onNewReplacements,
     replacements,
@@ -9,14 +9,29 @@ import { PluginStartContract as ActionsPluginStart } from '@kbn/actions-plugin/s
 import { ElasticsearchClient } from '@kbn/core-elasticsearch-server';
 import { BaseMessage } from '@langchain/core/messages';
 import { Logger } from '@kbn/logging';
-import { KibanaRequest, ResponseHeaders } from '@kbn/core-http-server';
+import { KibanaRequest, KibanaResponseFactory, ResponseHeaders } from '@kbn/core-http-server';
 import type { LangChainTracer } from '@langchain/core/tracers/tracer_langchain';
 import { ExecuteConnectorRequestBody, Message, Replacements } from '@kbn/elastic-assistant-common';
 import { StreamResponseWithHeaders } from '@kbn/ml-response-stream/server';
 import { AnonymizationFieldResponse } from '@kbn/elastic-assistant-common/impl/schemas/anonymization_fields/bulk_crud_anonymization_fields_route.gen';
 import { ResponseBody } from '../types';
 import type { AssistantTool } from '../../../types';
 import { ElasticsearchStore } from '../elasticsearch_store/elasticsearch_store';
+import { AIAssistantKnowledgeBaseDataClient } from '../../../ai_assistant_data_clients/knowledge_base';
+import { AIAssistantConversationsDataClient } from '../../../ai_assistant_data_clients/conversations';
+import { AIAssistantDataClient } from '../../../ai_assistant_data_clients';
+
+export type OnLlmResponse = (
+  content: string,
+  traceData?: Message['traceData'],
+  isError?: boolean
+) => Promise<void>;
+
+export interface AssistantDataClients {
+  anonymizationFieldsDataClient?: AIAssistantDataClient;
+  conversationsDataClient?: AIAssistantConversationsDataClient;
+  kbDataClient?: AIAssistantKnowledgeBaseDataClient;
+}

 export interface AgentExecutorParams<T extends boolean> {
   abortSignal?: AbortSignal;
@@ -26,6 +41,8 @@ export interface AgentExecutorParams<T extends boolean> {
   isEnabledKnowledgeBase: boolean;
   assistantTools?: AssistantTool[];
   connectorId: string;
+  conversationId?: string;
+  dataClients?: AssistantDataClients;
   esClient: ElasticsearchClient;
   esStore: ElasticsearchStore;
   langChainMessages: BaseMessage[];
@@ -34,12 +51,9 @@ export interface AgentExecutorParams<T extends boolean> {
   onNewReplacements?: (newReplacements: Replacements) => void;
   replacements: Replacements;
   isStream?: T;
-  onLlmResponse?: (
-    content: string,
-    traceData?: Message['traceData'],
-    isError?: boolean
-  ) => Promise<void>;
+  onLlmResponse?: OnLlmResponse;
   request: KibanaRequest<unknown, unknown, ExecuteConnectorRequestBody>;
+  response?: KibanaResponseFactory;
   size?: number;
   traceOptions?: TraceOptions;
 }
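Extracting `OnLlmResponse` into a named type means callbacks can be written and tested on their own. The sketch below shows one such callback, under stated assumptions: `SketchTraceData` is a local stand-in for `Message['traceData']` whose field names are assumed, and `createRecordingCallback` is a hypothetical helper, not something this PR adds.

```typescript
// Local stand-in for Message['traceData']; the field names are an assumption.
interface SketchTraceData {
  transactionId?: string;
  traceId?: string;
}

// Same signature as the OnLlmResponse type extracted in this PR.
type OnLlmResponse = (
  content: string,
  traceData?: SketchTraceData,
  isError?: boolean
) => Promise<void>;

// Hypothetical callback factory that records each response in memory.
function createRecordingCallback(log: string[]): OnLlmResponse {
  return async (content, traceData, isError = false) => {
    const prefix = isError ? 'error' : 'ok';
    log.push(`${prefix}:${traceData?.traceId ?? 'no-trace'}:${content}`);
  };
}

const responseLog: string[] = [];
const onLlmResponse = createRecordingCallback(responseLog);

async function main(): Promise<void> {
  await onLlmResponse('Hello', { traceId: 'abc' });
  await onLlmResponse('Request failed', undefined, true);
  console.log(responseLog);
}

const done = main();
```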