feat(conversation): support response streaming #2986

atierian · 2024-10-29T15:10:36Z

Description of changes

Adds support for streaming assistant responses.
Updates E2E test cases for streaming.
Adds E2E test case for streaming with client tools.
Adds snapshot test cases for streaming resolvers.

Design

The key change here is that the lambda function sends chunks / events to the assistant response mutation rather than a full message.

Example
A Bedrock streaming response of “hello world” is broken up into two chunks: “hello” and “ world”.

// first event
{
  // mutation response
  conversationId: '123',
  associatedUserMessageId: 'abc',
  contentBlockIndex: 0,
  contentBlockDeltaIndex: 0,
  contentBlockText: 'hello',
  
  // persisted in messages table
  accumulatedTurnContent: [{ text: 'hello' }]
}

// second event
{
  // mutation response  
  conversationId: '123',
  associatedUserMessageId: 'abc',
  contentBlockIndex: 0,
  contentBlockDeltaIndex: 1,
  contentBlockText: ' world',
  
  // persisted in messages table
  accumulatedTurnContent: [{ text: 'hello world' }]
}
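For illustration, the relationship between the per-chunk fields and accumulatedTurnContent in the two events above can be sketched in TypeScript. This is a minimal sketch, not the Lambda's actual implementation: the names TextChunkEvent and toTextChunkEvents are hypothetical, and it assumes a single text content block at contentBlockIndex 0.

```typescript
// Hypothetical shape mirroring the example events above (text chunks only).
interface TextChunkEvent {
  conversationId: string;
  associatedUserMessageId: string;
  contentBlockIndex: number;
  contentBlockDeltaIndex: number;
  contentBlockText: string;
  accumulatedTurnContent: { text: string }[];
}

// Hypothetical helper: turns raw text chunks into events for one text block.
// Each event carries its own chunk plus everything accumulated so far.
function toTextChunkEvents(
  conversationId: string,
  associatedUserMessageId: string,
  chunks: string[],
): TextChunkEvent[] {
  let accumulated = '';
  return chunks.map((chunk, deltaIndex) => {
    accumulated += chunk;
    return {
      conversationId,
      associatedUserMessageId,
      contentBlockIndex: 0, // assumption: a single content block
      contentBlockDeltaIndex: deltaIndex,
      contentBlockText: chunk,
      accumulatedTurnContent: [{ text: accumulated }],
    };
  });
}

const chunkEvents = toTextChunkEvents('123', 'abc', ['hello', ' world']);
```

Under these assumptions, chunkEvents[1] matches the second event above: contentBlockDeltaIndex is 1, contentBlockText is ' world', and accumulatedTurnContent holds 'hello world'.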

Types

GraphQL Input
The input type for the assistant response stream mutation invoked by the Lambda function.

input CreateConversationMessageRouteAssistantStreamingInput {
  # always included
  conversationId: ID!
  associatedUserMessageId: ID!
  contentBlockIndex: Int!
  accumulatedTurnContent: [ContentBlock]
  
  # text chunk 
  contentBlockDeltaIndex: Int
  contentBlockText: String
 
  # end of block. applicable to text blocks.
  contentBlockDoneAtIndex: Int
   
  # well-formed tool use (client tool)
  contentBlockToolUse: AWSJSON 
  
  # turn complete
  stopReason: String  
}

ConversationMessageStreamPart Type
The response type of the assistant response stream mutation and its paired subscription.

type ConversationMessageStreamPart {
  id: ID!
  owner: String
  conversationId: ID!
  associatedUserMessageId: ID!
  contentBlockIndex: Int!
  contentBlockText: String
  contentBlockDeltaIndex: Int
  contentBlockToolUse: ToolUseBlock
  contentBlockDoneAtIndex: Int
  stopReason: String
}
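A consumer of the subscription might distinguish the event kinds implied by the optional fields above roughly as follows. This is a sketch under assumptions: the StreamPartKind union and classifyStreamPart are hypothetical names, the order in which fields are checked is an illustrative choice rather than documented behavior, and contentBlockToolUse is typed as unknown because its shape is not spelled out here.

```typescript
// Client-side mirror of the ConversationMessageStreamPart type above.
interface ConversationMessageStreamPart {
  id: string;
  owner?: string | null;
  conversationId: string;
  associatedUserMessageId: string;
  contentBlockIndex: number;
  contentBlockText?: string | null;
  contentBlockDeltaIndex?: number | null;
  contentBlockToolUse?: unknown; // shape not specified here
  contentBlockDoneAtIndex?: number | null;
  stopReason?: string | null;
}

// Hypothetical labels for the kinds of events described by the input type comments.
type StreamPartKind = 'textDelta' | 'blockDone' | 'toolUse' | 'turnComplete' | 'unknown';

// Hypothetical classifier; the precedence of the checks is an assumption.
function classifyStreamPart(part: ConversationMessageStreamPart): StreamPartKind {
  if (part.stopReason != null) return 'turnComplete';
  if (part.contentBlockToolUse != null) return 'toolUse';
  if (part.contentBlockDoneAtIndex != null) return 'blockDone';
  if (part.contentBlockText != null && part.contentBlockDeltaIndex != null) return 'textDelta';
  return 'unknown';
}

const baseFields = { id: '1', conversationId: '123', associatedUserMessageId: 'abc', contentBlockIndex: 0 };
const textPartKind = classifyStreamPart({ ...baseFields, contentBlockText: 'hello', contentBlockDeltaIndex: 0 });
const donePartKind = classifyStreamPart({ ...baseFields, contentBlockDoneAtIndex: 1 });
const stopPartKind = classifyStreamPart({ ...baseFields, stopReason: 'end_turn' });
```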

Data Flow

Related PRs

CDK / CloudFormation Parameters Changed

N/A

Issue #, if available

N/A

Description of how you validated changes

E2E Test Run

Checklist

PR description included
yarn test passes
E2E test run linked
Tests are changed or added
~~Relevant documentation is changed or added (and PR referenced)~~
~~New AWS SDK calls or CloudFormation actions have been added to relevant test and service IAM policies~~
~~Any CDK or CloudFormation parameter changes are called out explicitly~~

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

atierian · 2024-10-31T14:49:49Z

packages/amplify-graphql-conversation-transformer/src/graphql-types/message-model.ts

+export const constructStreamResponseType = (): ObjectTypeDefinitionNode => {
+  return {
+    kind: 'ObjectTypeDefinition',
+    name: { kind: 'Name', value: STREAM_RESPONSE_TYPE_NAME },
+    fields: [
+      makeField('id', [], makeNonNullType(makeNamedType('ID'))),
+      makeField('owner', [], makeNamedType('String')),
+      makeField('conversationId', [], makeNonNullType(makeNamedType('ID'))),
+      makeField('associatedUserMessageId', [], makeNonNullType(makeNamedType('ID'))),
+
+      makeField('contentBlockIndex', [], makeNonNullType(makeNamedType('Int'))),
+
+      makeField('contentBlockText', [], makeNamedType('String')),
+      makeField('contentBlockDeltaIndex', [], makeNamedType('Int')),
+
+      makeField('contentBlockToolUse', [], makeNamedType('AWSJSON')),
+
+      makeField('contentBlockDoneAtIndex', [], makeNamedType('Int')),
+
+      makeField('stopReason', [], makeNamedType('String')),
+    ],
+  };
+};


This currently isn't being used; the type comes from data-schema.
However, that is changing in a follow-up PR where the supporting types are consolidated within the transformer.

If the type is coming from data-schema, do we need to refactor the definition so it's common to both data-schema and the transformer?

This currently isn't being used. I have a WIP change that removes all supporting type definitions from data-schema. For that change, this type is necessary. Happy to remove it from this PR if you'd like; in hindsight it shouldn't have been included here.

p5quared · 2024-10-31T18:47:06Z

packages/amplify-graphql-api-construct-tests/src/__tests__/conversations/conversation.test.ts

+        // reconstruct the message from the events
+        const sortedEvents = events
+          .filter((event) => event.contentBlockText)
+          .sort((a, b) => a.contentBlockDeltaIndex - b.contentBlockDeltaIndex);


It looks like we are able to assume that all of the events from this single subscription have the same contentBlockIndex? I'm curious what would happen if I sent two messages at the same time with the same conversationId, e.g. does one fail, does one hang, or is it undefined?

It looks like we are able to assume that all of the events from this single subscription have the same contentBlockIndex?

Close, but not quite.
The contentBlockIndex represents the index of a content block for a given assistant response message. In the example below, the first content block has a contentBlockIndex of 0, the second 1.

{
  // other fields
  content: [
    { text: "Checking the weather" },
    { text: "The temperature in Charleston, SC is 84° F currently" },
  ]
}

This test is indeed assuming that there will only be one content block returned from the assistant because the test implementation guarantees it.

In hindsight, this test should just be using the reconcileStreamEvents function. I'll switch to using that in a follow up.
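For a response with more than one content block, the reconstruction would need to group events by contentBlockIndex before ordering by contentBlockDeltaIndex. The sketch below illustrates that idea only; it is not the reconcileStreamEvents function mentioned above, and the names TextStreamEvent and reconstructTextBlocks are hypothetical.

```typescript
// Hypothetical minimal event shape: only the fields needed to rebuild text.
interface TextStreamEvent {
  contentBlockIndex: number;
  contentBlockDeltaIndex?: number | null;
  contentBlockText?: string | null;
}

// Hypothetical helper: rebuilds the text of each content block, in block order.
function reconstructTextBlocks(events: TextStreamEvent[]): string[] {
  const byBlock = new Map<number, { deltaIndex: number; text: string }[]>();
  for (const event of events) {
    // Skip events that are not text chunks (block done, tool use, turn complete).
    if (event.contentBlockText == null || event.contentBlockDeltaIndex == null) continue;
    const deltas = byBlock.get(event.contentBlockIndex) ?? [];
    deltas.push({ deltaIndex: event.contentBlockDeltaIndex, text: event.contentBlockText });
    byBlock.set(event.contentBlockIndex, deltas);
  }
  return Array.from(byBlock.entries())
    .sort(([blockA], [blockB]) => blockA - blockB)
    .map(([, deltas]) =>
      deltas
        .sort((a, b) => a.deltaIndex - b.deltaIndex)
        .map((delta) => delta.text)
        .join(''),
    );
}

// Events may arrive out of order and interleaved across blocks.
const reconstructed = reconstructTextBlocks([
  { contentBlockIndex: 1, contentBlockDeltaIndex: 0, contentBlockText: 'The temperature' },
  { contentBlockIndex: 0, contentBlockDeltaIndex: 1, contentBlockText: ' the weather' },
  { contentBlockIndex: 0, contentBlockDeltaIndex: 0, contentBlockText: 'Checking' },
  { contentBlockIndex: 1 }, // non-text event, ignored
]);
```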

I'm curious what would happen if I sent two messages at the same time with the same conversationId, e.g. does one fail, does one hang, or is it undefined?

No blocking or failing 😄
When a user sends a message, we read the conversation history from the ConversationMessage DDB table. If an assistant response hasn't yet been written to that table for a separate in-flight user message, that other in-flight user message isn't included in history.
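The effect described above can be illustrated with a small in-memory filter. This is only a sketch of the described behavior, not the actual history query against the ConversationMessage table: HistoryMessage and selectHistory are hypothetical names, and linking an assistant message to its user message through associatedUserMessageId is assumed from the stream types in this PR.

```typescript
// Hypothetical, simplified view of a stored conversation message.
interface HistoryMessage {
  id: string;
  role: 'user' | 'assistant';
  associatedUserMessageId?: string; // assumed to be set on assistant messages
}

// Hypothetical helper: keeps the current user message, all assistant messages,
// and only those other user messages that already have an assistant response.
function selectHistory(messages: HistoryMessage[], currentUserMessageId: string): HistoryMessage[] {
  const answeredUserMessageIds = new Set(
    messages
      .filter((message) => message.role === 'assistant')
      .map((message) => message.associatedUserMessageId),
  );
  return messages.filter(
    (message) =>
      message.role === 'assistant' ||
      message.id === currentUserMessageId ||
      answeredUserMessageIds.has(message.id),
  );
}

// Two user messages in flight at once: 'u2' and 'u3' have no assistant response yet.
const history = selectHistory(
  [
    { id: 'u1', role: 'user' },
    { id: 'a1', role: 'assistant', associatedUserMessageId: 'u1' },
    { id: 'u2', role: 'user' },
    { id: 'u3', role: 'user' },
  ],
  'u3',
);
const historyIds = history.map((message) => message.id).join(',');
```

Under these assumptions, the history read for 'u3' contains 'u1', 'a1', and 'u3', and omits the other in-flight message 'u2'.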

Thanks for clarifying contentBlockIndex and the double-message behavior for me.

- adds assistant response mutation streaming resolver implementation.
- adds assistant response mutation resolver pipeline.
- adds assistant response stream mutation input type to schema.
- updates invoke-lambda resolver payload to include streaming metadata.


atierian · 2024-11-04T15:26:08Z

...fy-graphql-conversation-transformer/src/transformer-steps/conversation-resolver-generator.ts

    const assistantResponsePipelineResolver = generateResolverPipeline(assistantResponsePipelineDefinition, directive, ctx);
    ctx.resolvers.addResolver(parentName, directive.assistantResponseMutation.field.name.value, assistantResponsePipelineResolver);


This is the non-streaming response pipeline; it will be removed in a follow up.

palpatim · 2024-11-04T15:45:04Z

packages/amplify-graphql-conversation-transformer/src/graphql-types/message-model.ts

+
+      makeInputValueDefinition('stopReason', makeNamedType('String')),
+
+      makeInputValueDefinition('accumulatedTurnContent', makeListType(makeNamedType('ContentBlockInput'))),


re ContentBlockInput: I'm mildly nervous about referring to predefined AI types defined with a string literal, especially when that type is not local to the module. Do we have a formal definition of this somewhere that we can reference rather than using string typing? If not, maybe we just refactor it into a module-scoped constant like you do with STREAM_RESPONSE_TYPE_NAME?

Yes, good point. I'm going to be changing names of supporting types in an upcoming PR to reduce the likelihood of naming collisions with existing schema types. I'll const the names there.

palpatim · 2024-11-04T15:54:34Z

packages/amplify-graphql-conversation-transformer/src/graphql-types/message-model.ts

@@ -256,3 +283,29 @@ const constructConversationMessageModel = (

  return object;
 };
+
+const STREAM_RESPONSE_TYPE_NAME = 'ConversationMessageStreamPart';


Future PR nit: move to top of file for easier discovery

palpatim · 2024-11-04T16:01:20Z

packages/amplify-graphql-conversation-transformer/src/graphql-types/message-model.ts

+export const constructStreamResponseType = (): ObjectTypeDefinitionNode => {
+  return {
+    kind: 'ObjectTypeDefinition',
+    name: { kind: 'Name', value: STREAM_RESPONSE_TYPE_NAME },
+    fields: [
+      makeField('id', [], makeNonNullType(makeNamedType('ID'))),
+      makeField('owner', [], makeNamedType('String')),
+      makeField('conversationId', [], makeNonNullType(makeNamedType('ID'))),
+      makeField('associatedUserMessageId', [], makeNonNullType(makeNamedType('ID'))),
+
+      makeField('contentBlockIndex', [], makeNonNullType(makeNamedType('Int'))),
+
+      makeField('contentBlockText', [], makeNamedType('String')),
+      makeField('contentBlockDeltaIndex', [], makeNamedType('Int')),
+
+      makeField('contentBlockToolUse', [], makeNamedType('AWSJSON')),
+
+      makeField('contentBlockDoneAtIndex', [], makeNamedType('Int')),
+
+      makeField('stopReason', [], makeNamedType('String')),
+    ],
+  };
+};


If the type is coming from data-schema, do we need to refactor the definition so it's common to both data-schema and the transformer?

palpatim · 2024-11-04T16:08:11Z

...phql-conversation-transformer/src/resolvers/assistant-response-stream-pipeline-definition.ts

+/**
+ * The init slot for the assistant response mutation resolver.
+ */
+function init(): ResolverFunctionDefinition {


nit: We prefer arrow functions rather than function declarations. I'm not sure how this even passed linting?

palpatim · 2024-11-04T16:10:15Z

...phql-conversation-transformer/src/resolvers/assistant-response-stream-pipeline-definition.ts

+    fileName: 'init-resolver-fn.template.js',
+    generateTemplate: (_, code) => MappingTemplate.inlineTemplateFromString(code),
+    substitutions: (_, ctx) => ({
+      GRAPHQL_API_ENDPOINT: ctx.api.graphqlUrl,


Future refactor nit: we should join up the template and substitution keys in a structure so they're easier to manage.

palpatim · 2024-11-04T16:12:29Z

...phql-conversation-transformer/src/resolvers/assistant-response-stream-pipeline-definition.ts

+}
+
+/**
+ * The auth slot for the assistant response mutation resolver.


nit here & throughout: it'd be great to document the important functionality going on in each of the slots rather than reiterating the name. e.g., auth enforces CUP & ownership; session owner verifies session owned by sub, etc...

palpatim · 2024-11-04T16:14:20Z

...phql-conversation-transformer/src/resolvers/assistant-response-stream-pipeline-definition.ts

+ * Creates a template generator specific to the assistant response pipeline for a given slot name.
+ */
+function templateGenerator(slotName: string) {
+  return createS3AssetMappingTemplateGenerator('Mutation', slotName, fieldName);


to make sure I'm reading this correctly: you're intentionally passing the fieldName function as an argument to createS3AssetMappingTemplateGenerator right?

palpatim · 2024-11-04T16:15:58Z

...s/amplify-graphql-conversation-transformer/src/resolvers/send-message-pipeline-definition.ts

@@ -129,3 +131,4 @@ function templateGenerator(slotName: string) {
 }

 const selectionSet = `id conversationId content { image { format source { bytes }} text toolUse { toolUseId name input } toolResult { status toolUseId content { json text image { format source { bytes }} document { format name source { bytes }} }}} role owner createdAt updatedAt`;


Is this in use any more?

palpatim · 2024-11-04T16:19:32Z

...ion-transformer/src/resolvers/templates/assistant-streaming-mutation-resolver-fn.template.js

+    ':updatedAt': updatedAt,
+  });
+
+  // https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ReservedWords.html


What reserved word prompted this?

palpatim · 2024-11-04T16:53:11Z

...plify-graphql-conversation-transformer/src/transformer-steps/conversation-prepare-handler.ts

@@ -65,15 +66,17 @@ export class ConversationPrepareHandler {
   */
  private prepareResourcesForDirective(directive: ConversationDirectiveConfiguration, ctx: TransformerPrepareStepContextProvider): void {
    // TODO: Add @aws_cognito_user_pools directive to send messages mutation


Is this TODO covered by this change?

This was referenced Oct 31, 2024

feat: add support for streaming responses in conversation routes aws-amplify/amplify-data#379

Merged

feat(ai): support streaming aws-amplify/amplify-codegen#899

Merged

atierian marked this pull request as ready for review October 31, 2024 16:53

atierian requested review from a team as code owners October 31, 2024 16:53

atierian mentioned this pull request Oct 31, 2024

feat(conversation): propagate errors from lambda through subscription #2995

Closed

7 tasks

p5quared previously approved these changes Oct 31, 2024

atierian mentioned this pull request Nov 1, 2024

feat(conversation): sorting and performant list queries #2997

Merged

7 tasks

atierian added 6 commits November 4, 2024 09:51

bump ai-constructs to 0.7.0

c4ff55b

test(conversation): add snapshot tests for streaming resolver fn impl…

8f220af

…ementations

update e2e supporting types, functions

f3f081d

test(conversation): update schema and generated gql client code in e2…

e38d55c

…e tests for streaming

test(conversation): add e2e tests for streaming

d8cdc68

atierian dismissed p5quared’s stale review via d8cdc68 November 4, 2024 15:19

atierian force-pushed the ai.conversation-streaming-accumulated-turn-content branch from 51ae7fc to d8cdc68 Compare November 4, 2024 15:19

extract dependency licenses

7db6a79

palpatim reviewed Nov 4, 2024

palpatim approved these changes Nov 4, 2024

p5quared approved these changes Nov 4, 2024

atierian merged commit 815d51f into main Nov 4, 2024
7 checks passed

atierian deleted the ai.conversation-streaming-accumulated-turn-content branch November 4, 2024 17:35
