diff --git a/packages/ai-sdk-tools/README.md b/packages/ai-sdk-tools/README.md index a2a0546..eb55f44 100644 --- a/packages/ai-sdk-tools/README.md +++ b/packages/ai-sdk-tools/README.md @@ -12,7 +12,7 @@ pnpm add ai @parallel-web/ai-sdk-tools yarn add ai @parallel-web/ai-sdk-tools ``` -> **Note:** This package requires AI SDK v5. If you're using AI SDK v4, see the [AI SDK v4 Implementation](#ai-sdk-v4-implementation) section below. +> **Note:** This package requires AI SDK v5. For AI SDK v4, use `parameters` instead of `inputSchema` when defining tools manually with the `parallel-web` SDK. ## Usage @@ -20,17 +20,26 @@ Add `PARALLEL_API_KEY` obtained from [Parallel Platform](https://platform.parall ### Search Tool -`searchTool` uses [Parallel's web search API](https://docs.parallel.ai/api-reference/search-api/search) to get fresh relevant search results. +`searchTool` uses [Parallel's Search API](https://docs.parallel.ai/api-reference/search-beta/search) to perform web searches and return LLM-optimized results. + +**Input schema:** +- `objective` (required): Natural-language description of what the web search is trying to find +- `search_queries` (optional): List of keyword search queries (1-6 words each) +- `mode` (optional): `'agentic'` (default) for concise results in agentic loops, or `'one-shot'` for comprehensive single-response results ### Extract Tool -`extractTool` uses [Parallel's extract API](https://docs.parallel.ai/api-reference/search-and-extract-api-beta/extract) to extract a web-page's content, for a given objective. +`extractTool` uses [Parallel's Extract API](https://docs.parallel.ai/api-reference/extract-beta/extract) to fetch and extract relevant content from specific URLs. 
+ +**Input schema:** +- `urls` (required): List of URLs to extract content from (max 10) +- `objective` (optional): Natural-language description of what information you're looking for ### Basic Example ```typescript import { openai } from '@ai-sdk/openai'; -import { streamText, type Tool } from 'ai'; +import { streamText } from 'ai'; import { searchTool, extractTool } from '@parallel-web/ai-sdk-tools'; const result = streamText({ @@ -49,13 +58,44 @@ const result = streamText({ return result.toUIMessageStreamResponse(); ``` -### Custom Tools +## Factory Functions + +For more control over the tool configuration, use the factory functions to create tools with custom defaults: + +### createSearchTool -You can create custom tools that wrap the Parallel Web API: +Create a search tool with custom defaults for mode, max_results, excerpts, source_policy, or fetch_policy. ```typescript -import { tool, generateText } from 'ai'; -import { openai } from '@ai-sdk/openai'; +import { createSearchTool } from '@parallel-web/ai-sdk-tools'; + +const myCustomSearchTool = createSearchTool({ + mode: 'one-shot', // 'one-shot' returns more comprehensive results and longer excerpts to answer questions from a single response. + max_results: 5, // Limit to 5 results +}); +``` + +### createExtractTool + +Create an extract tool with custom defaults for excerpts, full_content, or fetch_policy. 
+ +```typescript +import { createExtractTool } from '@parallel-web/ai-sdk-tools'; + +const myExtractTool = createExtractTool({ + full_content: true, // Include full page content + excerpts: { + max_chars_per_result: 10000, + }, +}); +``` + +## Direct API Usage + +You can also use the `parallel-web` SDK directly for maximum flexibility: + +```typescript +import { tool } from 'ai'; import { z } from 'zod'; import { Parallel } from 'parallel-web'; @@ -64,149 +104,120 @@ const parallel = new Parallel({ }); const webSearch = tool({ - description: 'Use this tool to search the web.', + description: 'Search the web for information.', inputSchema: z.object({ - searchQueries: z.array(z.string()).describe('Search queries'), - usersQuestion: z.string().describe("The user's question"), + query: z.string().describe("The user's question"), }), - execute: async ({ searchQueries, usersQuestion }) => { - const search = await parallel.beta.search({ - objective: usersQuestion, - search_queries: searchQueries, - max_results: 3, - max_chars_per_result: 1000, + execute: async ({ query }) => { + const result = await parallel.beta.search({ + objective: query, + mode: 'agentic', + max_results: 5, }); - return search.results; + return result; }, }); ``` -## AI SDK v4 Implementation +## API Reference -If you're using AI SDK v4, you can implement the tools manually using the Parallel Web API. The key difference is that v4 uses `parameters` instead of `inputSchema`. 
+- [Search API Documentation](https://docs.parallel.ai/search/search-quickstart) +- [Extract API Documentation](https://docs.parallel.ai/extract/extract-quickstart) +- [Search API Best Practices](https://docs.parallel.ai/search/best-practices) -### Search Tool (v4) +## Response Format + +Both tools return the raw API response from Parallel: + +### Search Response ```typescript -import { tool } from 'ai'; -import { z } from 'zod'; -import { Parallel } from 'parallel-web'; +{ + search_id: string; + results: Array<{ + url: string; + title?: string; + publish_date?: string; + excerpts: string[]; + }>; + usage?: Array<{ name: string; count: number }>; + warnings?: Array<{ code: string; message: string }>; +} +``` -const parallel = new Parallel({ - apiKey: process.env.PARALLEL_API_KEY, -}); +### Extract Response -function getSearchParams( - search_type: 'list' | 'targeted' | 'general' | 'single_page' -): Pick<BetaSearchParams, 'max_results' | 'max_chars_per_result'> { - switch (search_type) { - case 'targeted': - return { - max_results: 5, - max_chars_per_result: 16000 - }; - case 'general': - return { - max_results: 10, - max_chars_per_result: 9000 - }; - case 'single_page': - return { - max_results: 2, - max_chars_per_result: 30000 - }; - case 'list': - default: - return { - max_results: 20, - max_chars_per_result: 1500 - }; - } +```typescript +{ + extract_id: string; + results: Array<{ + url: string; + title?: string; + excerpts?: string[]; + full_content?: string; + publish_date?: string; + }>; + errors: Array<{ + url: string; + error_type: string; + http_status_code?: number; + content?: string; + }>; + usage?: Array<{ name: string; count: number }>; + warnings?: Array<{ code: string; message: string }>; } +``` -const searchTool = tool({ - description: `Use the web_search_parallel tool to access information from the web. The -web_search_parallel tool returns ranked, extended web excerpts optimized for LLMs. 
-Intelligently scale the number of web_search_parallel tool calls to get more information -when needed, from a single call for simple factual questions to five or more calls for -complex research questions.`, - parameters: z.object({ // v4 uses parameters instead of inputSchema - objective: z.string().describe( - 'Natural-language description of what the web research goal is.' - ), - search_type: z - .enum(['list', 'general', 'single_page', 'targeted']) - .optional() - .default('list'), - search_queries: z - .array(z.string()) - .optional() - .describe('List of keyword search queries of 1-6 words.'), - include_domains: z - .array(z.string()) - .optional() - .describe('List of valid URL domains to restrict search results.'), - }), - execute: async ( - { ...args }, - { abortSignal }: { abortSignal?: AbortSignal } - ) => { - const results = const results = await search( - { ...args, ...getSearchParams(args.search_type) }, - { abortSignal } - ); - return { - searchParams: { objective, search_type, search_queries, include_domains }, - answer: results, - }; - }, +## Migration from v0.1.x + +Version 0.2.0 introduces an updated API that conforms with Parallel's Search and Extract MCP tools: + +### searchTool changes + +- **Input schema changed**: Removed `search_type` and `include_domains`. Added `mode` parameter. +- **Return value changed**: Now returns raw API response (`{ search_id, results, ... }`) instead of `{ searchParams, answer }`. 
+ +**Before (v0.1.x):** +```typescript +const result = await searchTool.execute({ + objective: 'Find TypeScript info', + search_type: 'list', + search_queries: ['TypeScript'], + include_domains: ['typescriptlang.org'], }); +console.log(result.answer.results); ``` -### Extract Tool (v4) - +**After (v0.2.0):** ```typescript -import { tool } from 'ai'; -import { z } from 'zod'; -import { Parallel } from 'parallel-web'; +const result = await searchTool.execute({ + objective: 'Find TypeScript info', + search_queries: ['TypeScript'], + mode: 'agentic', // optional, defaults to 'agentic' +}); +console.log(result.results); +``` -const parallel = new Parallel({ - apiKey: process.env.PARALLEL_API_KEY, +### extractTool changes + +- **Input schema changed**: `urls` is now first, `objective` is optional. +- **Return value changed**: Now returns raw API response (`{ extract_id, results, errors, ... }`) instead of `{ searchParams, answer }`. + +**Before (v0.1.x):** +```typescript +const result = await extractTool.execute({ + objective: 'Extract content', + urls: ['https://example.com'], + search_queries: ['keyword'], }); +console.log(result.answer.results); +``` -const extractTool = tool({ - description: `Purpose: Fetch and extract relevant content from specific web URLs. - -Ideal Use Cases: -- Extracting content from specific URLs you've already identified -- Exploring URLs returned by a web search in greater depth`, - parameters: z.object({ - // v4 uses parameters instead of inputSchema - objective: z - .string() - .describe( - "Natural-language description of what information you're looking for from the URLs." - ), - urls: z - .array(z.string()) - .describe( - 'List of URLs to extract content from. Maximum 10 URLs per request.' 
- ), - search_queries: z - .array(z.string()) - .optional() - .describe('Optional keyword search queries related to the objective.'), - }), - execute: async ({ objective, urls, search_queries }) => { - const results = await parallel.beta.extract({ - objective, - urls, - search_queries, - }); - return { - searchParams: { objective, urls, search_queries }, - answer: results, - }; - }, +**After (v0.2.0):** +```typescript +const result = await extractTool.execute({ + urls: ['https://example.com'], + objective: 'Extract content', // optional }); +console.log(result.results); ``` diff --git a/packages/ai-sdk-tools/package.json b/packages/ai-sdk-tools/package.json index c771a99..1d50072 100644 --- a/packages/ai-sdk-tools/package.json +++ b/packages/ai-sdk-tools/package.json @@ -1,6 +1,6 @@ { "name": "@parallel-web/ai-sdk-tools", - "version": "0.1.6", + "version": "0.2.0", "description": "AI SDK tools for Parallel Web", "author": "Parallel Web", "license": "MIT", @@ -41,8 +41,8 @@ "access": "public" }, "dependencies": { - "parallel-web": "^0.2.1", - "zod": "^3.23.0" + "parallel-web": "^0.3.1", + "zod": "^4.3.6" }, "peerDependencies": { "ai": "^5.0.0" diff --git a/packages/ai-sdk-tools/src/__tests__/extract-integration.test.ts b/packages/ai-sdk-tools/src/__tests__/extract-integration.test.ts index 612a7e5..b174583 100644 --- a/packages/ai-sdk-tools/src/__tests__/extract-integration.test.ts +++ b/packages/ai-sdk-tools/src/__tests__/extract-integration.test.ts @@ -1,33 +1,39 @@ import { describe, it, expect } from 'vitest'; -import { extractTool } from '../index.js'; +import { extractTool, createExtractTool } from '../index.js'; +import type { ExtractResponse } from 'parallel-web/resources/beta/beta.mjs'; + +// Helper to execute tools in tests with proper typing +async function executeExtract( + tool: typeof extractTool, + params: Parameters<NonNullable<typeof extractTool.execute>>[0] +): Promise<ExtractResponse> { + const result = await tool.execute!(params, { + toolCallId: 'test-call-id', + messages: [], + abortSignal: undefined, 
+ }); + return result as ExtractResponse; +} describe.skipIf(!process.env.PARALLEL_API_KEY)( 'extractTool integration tests', () => { // Increase timeout for API calls - const timeout = 30000; + const timeout = 60000; - describe('basic extract execution', () => { + describe.concurrent('basic extract execution', () => { it( 'should extract content from a single URL', async () => { - const result = await extractTool.execute( - { - objective: 'Extract information about TypeScript', - urls: ['https://www.typescriptlang.org/'], - }, - { abortSignal: undefined } - ); + const result = await executeExtract(extractTool, { + urls: ['https://www.typescriptlang.org/'], + objective: 'Extract information about TypeScript', + }); expect(result).toBeDefined(); - expect(result.searchParams).toBeDefined(); - expect(result.searchParams.objective).toBe( - 'Extract information about TypeScript' - ); - expect(result.searchParams.urls).toEqual([ - 'https://www.typescriptlang.org/', - ]); - expect(result.answer).toBeDefined(); + expect(result.extract_id).toBeDefined(); + expect(result.results).toBeDefined(); + expect(Array.isArray(result.results)).toBe(true); }, timeout ); @@ -35,153 +41,87 @@ describe.skipIf(!process.env.PARALLEL_API_KEY)( it( 'should extract content from multiple URLs', async () => { - const result = await extractTool.execute( - { - objective: 'Extract JavaScript documentation', - urls: [ - 'https://developer.mozilla.org/en-US/docs/Web/JavaScript', - 'https://javascript.info/', - ], - }, - { abortSignal: undefined } - ); + const result = await executeExtract(extractTool, { + urls: [ + 'https://developer.mozilla.org/en-US/docs/Web/JavaScript', + 'https://javascript.info/', + ], + objective: 'Extract JavaScript documentation', + }); expect(result).toBeDefined(); - expect(result.searchParams).toBeDefined(); - expect(result.searchParams.urls).toHaveLength(2); - expect(result.answer).toBeDefined(); + expect(result.extract_id).toBeDefined(); + 
expect(result.results).toBeDefined(); }, timeout ); it( - 'should extract content from Wikipedia URL', + 'should extract content without objective', async () => { - const result = await extractTool.execute( - { - objective: 'Extract information about Python programming', - urls: [ - 'https://en.wikipedia.org/wiki/Python_(programming_language)', - ], - }, - { abortSignal: undefined } - ); + const result = await executeExtract(extractTool, { + urls: ['https://example.com'], + }); expect(result).toBeDefined(); - expect(result.searchParams).toBeDefined(); - expect(result.searchParams.objective).toBe( - 'Extract information about Python programming' - ); - expect(result.answer).toBeDefined(); + expect(result.extract_id).toBeDefined(); + expect(result.results).toBeDefined(); }, timeout ); }); - describe('extract with optional parameters', () => { - it( - 'should extract content with search_queries parameter', - async () => { - const result = await extractTool.execute( - { - objective: 'Find React hooks information', - urls: ['https://react.dev/'], - search_queries: ['React hooks', 'useState', 'useEffect'], - }, - { abortSignal: undefined } - ); - - expect(result).toBeDefined(); - expect(result.searchParams).toBeDefined(); - expect(result.searchParams.search_queries).toEqual([ - 'React hooks', - 'useState', - 'useEffect', - ]); - expect(result.answer).toBeDefined(); - }, - timeout - ); - + describe.concurrent('response structure validation', () => { it( - 'should extract content without search_queries', + 'should return raw API response structure', async () => { - const result = await extractTool.execute( - { - objective: 'Get Node.js documentation', - urls: ['https://nodejs.org/en/docs/'], - }, - { abortSignal: undefined } - ); + const result = await executeExtract(extractTool, { + urls: ['https://example.com'], + objective: 'Test extraction', + }); - expect(result).toBeDefined(); - expect(result.searchParams).toBeDefined(); - expect(result.searchParams.objective).toBe( 
- 'Get Node.js documentation' - ); - expect(result.searchParams.search_queries).toBeUndefined(); - expect(result.answer).toBeDefined(); + // Should return raw API response, not wrapped + expect(result).toHaveProperty('extract_id'); + expect(result).toHaveProperty('results'); + expect(result).toHaveProperty('errors'); + expect(result).not.toHaveProperty('searchParams'); + expect(result).not.toHaveProperty('answer'); }, timeout ); it( - 'should extract content from GitHub repository URL', + 'should have results array with expected properties', async () => { - const result = await extractTool.execute( - { - objective: 'Extract README information', - urls: ['https://github.com/vercel/ai'], - search_queries: ['AI SDK', 'installation'], - }, - { abortSignal: undefined } - ); + const result = await executeExtract(extractTool, { + urls: ['https://www.typescriptlang.org/'], + objective: 'TypeScript information', + }); - expect(result).toBeDefined(); - expect(result.searchParams).toBeDefined(); - expect(result.searchParams.urls).toEqual([ - 'https://github.com/vercel/ai', - ]); - expect(result.answer).toBeDefined(); + expect(result.results.length).toBeGreaterThan(0); + const firstResult = result.results[0]; + expect(firstResult).toHaveProperty('url'); }, timeout ); }); - describe('response structure validation', () => { - it( - 'should return result with correct structure', - async () => { - const result = await extractTool.execute( - { - objective: 'Test extraction', - urls: ['https://example.com'], - }, - { abortSignal: undefined } - ); - - expect(result).toHaveProperty('searchParams'); - expect(result).toHaveProperty('answer'); - expect(typeof result.searchParams).toBe('object'); - expect(typeof result.answer).toBe('object'); - }, - timeout - ); - + describe.concurrent('createExtractTool factory', () => { it( - 'should preserve all input parameters in searchParams', + 'should create tool with custom defaults', async () => { - const inputParams = { - objective: 'Extract 
specific content', - urls: ['https://www.npmjs.com/package/ai'], - search_queries: ['AI SDK', 'features'], - }; + const customExtractTool = createExtractTool({ + full_content: true, + }); - const result = await extractTool.execute(inputParams, { - abortSignal: undefined, + const result = await executeExtract(customExtractTool, { + urls: ['https://example.com'], + objective: 'Extract full content', }); - expect(result.searchParams).toEqual(inputParams); + expect(result).toBeDefined(); + expect(result.extract_id).toBeDefined(); + expect(result.results).toBeDefined(); }, timeout ); diff --git a/packages/ai-sdk-tools/src/__tests__/index.test.ts b/packages/ai-sdk-tools/src/__tests__/index.test.ts index d87e6cc..149ed0a 100644 --- a/packages/ai-sdk-tools/src/__tests__/index.test.ts +++ b/packages/ai-sdk-tools/src/__tests__/index.test.ts @@ -1,7 +1,7 @@ import { describe, it, expect } from 'vitest'; -describe('@parallel-web/ai-sdk-tools (default export)', () => { - describe('exports', () => { +describe('@parallel-web/ai-sdk-tools exports', () => { + describe('default tools', () => { it('should export searchTool', async () => { const { searchTool } = await import('../index.js'); expect(searchTool).toBeDefined(); @@ -19,8 +19,38 @@ describe('@parallel-web/ai-sdk-tools (default export)', () => { }); }); - describe('default export should match v5', () => { - it('should use inputSchema (v5 API)', async () => { + describe('factory functions', () => { + it('should export createSearchTool', async () => { + const { createSearchTool } = await import('../index.js'); + expect(createSearchTool).toBeDefined(); + expect(typeof createSearchTool).toBe('function'); + }); + + it('should export createExtractTool', async () => { + const { createExtractTool } = await import('../index.js'); + expect(createExtractTool).toBeDefined(); + expect(typeof createExtractTool).toBe('function'); + }); + + it('createSearchTool should return a tool with execute function', async () => { + const { 
createSearchTool } = await import('../index.js'); + const customTool = createSearchTool({ mode: 'one-shot', max_results: 5 }); + expect(customTool).toBeDefined(); + expect(typeof customTool.execute).toBe('function'); + expect(customTool.description).toBeDefined(); + }); + + it('createExtractTool should return a tool with execute function', async () => { + const { createExtractTool } = await import('../index.js'); + const customTool = createExtractTool({ full_content: true }); + expect(customTool).toBeDefined(); + expect(typeof customTool.execute).toBe('function'); + expect(customTool.description).toBeDefined(); + }); + }); + + describe('AI SDK v5 compatibility', () => { + it('searchTool should use inputSchema (v5 API)', async () => { const { searchTool } = await import('../index.js'); expect(searchTool).toHaveProperty('inputSchema'); expect(searchTool).not.toHaveProperty('parameters'); @@ -34,9 +64,9 @@ describe('@parallel-web/ai-sdk-tools (default export)', () => { }); describe('tool descriptions', () => { - it('searchTool should have web_search_parallel description', async () => { + it('searchTool should have web search description', async () => { const { searchTool } = await import('../index.js'); - expect(searchTool.description).toContain('web_search_parallel'); + expect(searchTool.description).toContain('web search'); }); it('extractTool should have extract description', async () => { diff --git a/packages/ai-sdk-tools/src/__tests__/search-integration.test.ts b/packages/ai-sdk-tools/src/__tests__/search-integration.test.ts index fc9eba9..7596963 100644 --- a/packages/ai-sdk-tools/src/__tests__/search-integration.test.ts +++ b/packages/ai-sdk-tools/src/__tests__/search-integration.test.ts @@ -1,182 +1,164 @@ import { describe, it, expect } from 'vitest'; -import { searchTool } from '../index.js'; +import { searchTool, createSearchTool } from '../index.js'; +import type { SearchResult } from 'parallel-web/resources/beta/beta.mjs'; + +type SearchParams = 
Parameters<NonNullable<typeof searchTool.execute>>[0]; + +// Helper to execute tools in tests with proper typing +// Uses Partial because Zod .default() makes mode optional at runtime +async function executeSearch( + tool: typeof searchTool | ReturnType<typeof createSearchTool>, + params: Partial<SearchParams> +): Promise<SearchResult> { + const result = await tool.execute!(params as SearchParams, { + toolCallId: 'test-call-id', + messages: [], + abortSignal: undefined, + }); + return result as SearchResult; +} describe.skipIf(!process.env.PARALLEL_API_KEY)( 'searchTool integration tests', () => { // Increase timeout for API calls - const timeout = 30000; + const timeout = 60000; - describe('basic search execution', () => { + describe.concurrent('basic search execution', () => { it( - 'should execute search with list search_type', + 'should execute search with default agentic mode', async () => { - const result = await searchTool.execute( - { - objective: 'Find information about TypeScript', - search_type: 'list', - search_queries: ['TypeScript programming'], - }, - { abortSignal: undefined } - ); + const result = await executeSearch(searchTool, { + objective: 'Find information about TypeScript', + search_queries: ['TypeScript programming'], + }); expect(result).toBeDefined(); - expect(result.searchParams).toBeDefined(); - expect(result.searchParams.objective).toBe( - 'Find information about TypeScript' - ); - expect(result.searchParams.search_type).toBe('list'); - expect(result.answer).toBeDefined(); + expect(result.search_id).toBeDefined(); + expect(result.results).toBeDefined(); + expect(Array.isArray(result.results)).toBe(true); }, timeout ); it( - 'should execute search with general search_type', + 'should execute search with one-shot mode', async () => { - const result = await searchTool.execute( - { - objective: 'Information about Node.js', - search_type: 'general', - search_queries: ['Node.js'], - }, - { abortSignal: undefined } - ); + const result = await executeSearch(searchTool, { + objective: 'Information about Node.js', + search_queries: 
['Node.js'], + mode: 'one-shot', + }); expect(result).toBeDefined(); - expect(result.searchParams).toBeDefined(); - expect(result.searchParams.search_type).toBe('general'); - expect(result.answer).toBeDefined(); + expect(result.search_id).toBeDefined(); + expect(result.results).toBeDefined(); }, timeout ); it( - 'should execute search with targeted search_type', + 'should execute search with agentic mode explicitly', async () => { - const result = await searchTool.execute( - { - objective: 'React documentation', - search_type: 'targeted', - search_queries: ['React hooks'], - }, - { abortSignal: undefined } - ); + const result = await executeSearch(searchTool, { + objective: 'React documentation', + search_queries: ['React hooks'], + mode: 'agentic', + }); expect(result).toBeDefined(); - expect(result.searchParams).toBeDefined(); - expect(result.searchParams.search_type).toBe('targeted'); - expect(result.answer).toBeDefined(); + expect(result.search_id).toBeDefined(); + expect(result.results).toBeDefined(); }, timeout ); + }); + describe.concurrent('search with optional parameters', () => { it( - 'should execute search with single_page search_type', + 'should execute search with objective only', async () => { - const result = await searchTool.execute( - { - objective: 'Get content from Wikipedia Python page', - search_type: 'single_page', - search_queries: ['Python programming Wikipedia'], - }, - { abortSignal: undefined } - ); + const result = await executeSearch(searchTool, { + objective: 'Current weather trends', + }); expect(result).toBeDefined(); - expect(result.searchParams).toBeDefined(); - expect(result.searchParams.search_type).toBe('single_page'); - expect(result.answer).toBeDefined(); + expect(result.search_id).toBeDefined(); + expect(result.results).toBeDefined(); }, timeout ); - }); - describe('search with optional parameters', () => { it( - 'should execute search with include_domains parameter', + 'should execute search with objective and 
search_queries', async () => { - const result = await searchTool.execute( - { - objective: 'Find JavaScript tutorials', - search_type: 'list', - search_queries: ['JavaScript tutorial'], - include_domains: ['developer.mozilla.org', 'javascript.info'], - }, - { abortSignal: undefined } - ); + const result = await executeSearch(searchTool, { + objective: 'AI SDK information', + search_queries: ['AI SDK', 'Vercel AI'], + }); expect(result).toBeDefined(); - expect(result.searchParams).toBeDefined(); - expect(result.searchParams.include_domains).toEqual([ - 'developer.mozilla.org', - 'javascript.info', - ]); - expect(result.answer).toBeDefined(); + expect(result.search_id).toBeDefined(); + expect(result.results).toBeDefined(); }, timeout ); + }); + describe.concurrent('response structure validation', () => { it( - 'should execute search with only objective and search_queries', + 'should return raw API response structure', async () => { - const result = await searchTool.execute( - { - objective: 'AI SDK information', - search_queries: ['AI SDK', 'Vercel AI'], - }, - { abortSignal: undefined } - ); - - expect(result).toBeDefined(); - expect(result.searchParams).toBeDefined(); - expect(result.searchParams.objective).toBe('AI SDK information'); - expect(result.searchParams.search_queries).toEqual([ - 'AI SDK', - 'Vercel AI', - ]); - expect(result.answer).toBeDefined(); + const result = await executeSearch(searchTool, { + objective: 'Test query', + search_queries: ['test'], + }); + + // Should return raw API response, not wrapped + expect(result).toHaveProperty('search_id'); + expect(result).toHaveProperty('results'); + expect(result).not.toHaveProperty('searchParams'); + expect(result).not.toHaveProperty('answer'); }, timeout ); it( - 'should execute search without search_queries', + 'should have results array with expected properties', async () => { - const result = await searchTool.execute( - { - objective: 'Current weather trends', - search_type: 'general', - }, - { 
abortSignal: undefined } - ); - - expect(result).toBeDefined(); - expect(result.searchParams).toBeDefined(); - expect(result.searchParams.objective).toBe('Current weather trends'); - expect(result.answer).toBeDefined(); + const result = await executeSearch(searchTool, { + objective: 'TypeScript programming language', + search_queries: ['TypeScript'], + }); + + expect(result.results.length).toBeGreaterThan(0); + const firstResult = result.results[0]; + expect(firstResult).toHaveProperty('url'); + expect(firstResult).toHaveProperty('excerpts'); }, timeout ); }); - describe('response structure validation', () => { + describe.concurrent('createSearchTool factory', () => { it( - 'should return result with correct structure', + 'should create tool with custom defaults', async () => { - const result = await searchTool.execute( - { - objective: 'Test query', - search_type: 'list', - search_queries: ['test'], - }, - { abortSignal: undefined } - ); - - expect(result).toHaveProperty('searchParams'); - expect(result).toHaveProperty('answer'); - expect(typeof result.searchParams).toBe('object'); - expect(typeof result.answer).toBe('object'); + const customSearchTool = createSearchTool({ + mode: 'one-shot', + max_results: 3, + }); + + const result = await executeSearch(customSearchTool, { + objective: 'JavaScript frameworks', + search_queries: ['JavaScript framework'], + }); + + expect(result).toBeDefined(); + expect(result.search_id).toBeDefined(); + expect(result.results).toBeDefined(); + // max_results may limit results + expect(result.results.length).toBeLessThanOrEqual(3); }, timeout ); diff --git a/packages/ai-sdk-tools/src/client.ts b/packages/ai-sdk-tools/src/client.ts index 6e4c0f7..c698573 100644 --- a/packages/ai-sdk-tools/src/client.ts +++ b/packages/ai-sdk-tools/src/client.ts @@ -2,17 +2,22 @@ * Shared Parallel Web client instance */ +declare const __PACKAGE_VERSION__: string; + import { Parallel } from 'parallel-web'; let _parallelClient: Parallel | null = null; 
 export const parallelClient = new Proxy({} as Parallel, {
-  get(_target, prop) {
+  get(_target, prop: keyof Parallel) {
     if (!_parallelClient) {
       _parallelClient = new Parallel({
         apiKey: process.env['PARALLEL_API_KEY'],
+        defaultHeaders: {
+          'X-Tool-Calling-Package': `npm:@parallel-web/ai-sdk-tools/v${__PACKAGE_VERSION__ ?? '0.0.0'}`,
+        },
       });
     }
-    return (_parallelClient as any)[prop];
+    return _parallelClient[prop];
   },
 });
diff --git a/packages/ai-sdk-tools/src/index.ts b/packages/ai-sdk-tools/src/index.ts
index d130c5d..46fcf25 100644
--- a/packages/ai-sdk-tools/src/index.ts
+++ b/packages/ai-sdk-tools/src/index.ts
@@ -6,5 +6,10 @@
  * For AI SDK v4 compatibility, see the README for implementation examples.
  */
 
-export { searchTool } from './tools/search.js';
-export { extractTool } from './tools/extract.js';
+// Default tools (MCP-like API)
+export { searchTool, createSearchTool } from './tools/search.js';
+export { extractTool, createExtractTool } from './tools/extract.js';
+
+// Types for factory options
+export type { CreateSearchToolOptions } from './tools/search.js';
+export type { CreateExtractToolOptions } from './tools/extract.js';
diff --git a/packages/ai-sdk-tools/src/tools/extract.ts b/packages/ai-sdk-tools/src/tools/extract.ts
index 1f418d6..ea4a236 100644
--- a/packages/ai-sdk-tools/src/tools/extract.ts
+++ b/packages/ai-sdk-tools/src/tools/extract.ts
@@ -4,8 +4,48 @@
 
 import { tool } from 'ai';
 import { z } from 'zod';
+import type {
+  ExcerptSettings,
+  FetchPolicy,
+  BetaExtractParams,
+} from 'parallel-web/resources/beta/beta.mjs';
 import { parallelClient } from '../client.js';
 
+/**
+ * Options for creating a custom extract tool with code-supplied defaults.
+ */
+export interface CreateExtractToolOptions {
+  /**
+   * Include excerpts from each URL relevant to the search objective and queries.
+   * Can be a boolean or ExcerptSettings object. Defaults to true.
+   */
+  excerpts?: boolean | ExcerptSettings;
+
+  /**
+   * Include full content from each URL. Can be a boolean or FullContentSettings object.
+   * Defaults to false.
+   */
+  full_content?: BetaExtractParams['full_content'];
+
+  /**
+   * Fetch policy for controlling cached vs fresh content.
+   */
+  fetch_policy?: FetchPolicy | null;
+
+  /**
+   * Custom tool description. If not provided, uses the default description.
+   */
+  description?: string;
+}
+
+const urlsDescription = `List of URLs to extract content from. Must be valid HTTP/HTTPS URLs. Maximum 10 URLs per request.`;
+
+const objectiveDescription = `Natural-language description of what information you're looking for from the URLs.`;
+
+/**
+ * Extract tool that mirrors the MCP web_fetch tool.
+ * Takes urls and optional objective, returns raw extract response.
+ */
 export const extractTool = tool({
   description: `Purpose: Fetch and extract relevant content from specific web URLs.
 
@@ -13,38 +53,77 @@ Ideal Use Cases:
 - Extracting content from specific URLs you've already identified
 - Exploring URLs returned by a web search in greater depth`,
   inputSchema: z.object({
-    objective: z.string().describe(
-      `Natural-language description of what information you're looking for from the URLs.
-      Limit to 200 characters.`
-    ),
-
-    urls: z.array(z.string()).describe(
-      `List of URLs to extract content from. Must be valid
-HTTP/HTTPS URLs. Maximum 10 URLs per request.`
-    ),
-    search_queries: z
-      .array(z.string())
-      .optional()
-      .describe(
-        `(optional) List of keyword search queries of 1-6
-        words, which may include search operators. The search queries should be related to the
-        objective. Limited to 5 entries of 200 characters each. Usually 1-3 queries are
-        ideal.`
-      ),
+    urls: z.array(z.string()).describe(urlsDescription),
+    objective: z.string().optional().describe(objectiveDescription),
   }),
 
-  execute: async function ({ ...args }, { abortSignal }) {
-    const results = await parallelClient.beta.extract(
-      { ...args },
+  execute: async function (
+    { urls, objective }: { urls: string[]; objective?: string },
+    { abortSignal }: { abortSignal?: AbortSignal }
+  ) {
+    return await parallelClient.beta.extract(
+      {
+        urls,
+        objective,
+      },
       {
         signal: abortSignal,
-        headers: { 'parallel-beta': 'search-extract-2025-10-10' },
       }
     );
-
-    return {
-      searchParams: args,
-      answer: results,
-    };
   },
 });
+
+const defaultExtractDescription = `Purpose: Fetch and extract relevant content from specific web URLs.
+
+Ideal Use Cases:
+- Extracting content from specific URLs you've already identified
+- Exploring URLs returned by a web search in greater depth`;
+
+/**
+ * Factory function to create an extract tool with custom defaults.
+ *
+ * Use this when you want to set defaults for excerpts, full_content, or
+ * fetch_policy in your code, so the LLM only needs to provide urls and objective.
+ *
+ * @example
+ * ```ts
+ * const myExtractTool = createExtractTool({
+ *   excerpts: { max_chars_per_result: 5000 },
+ *   full_content: true,
+ * });
+ * ```
+ */
+export function createExtractTool(options: CreateExtractToolOptions = {}) {
+  const {
+    excerpts,
+    full_content,
+    fetch_policy,
+    description = defaultExtractDescription,
+  } = options;
+
+  return tool({
+    description,
+    inputSchema: z.object({
+      urls: z.array(z.string()).describe(urlsDescription),
+      objective: z.string().optional().describe(objectiveDescription),
+    }),
+
+    execute: async function (
+      { urls, objective }: { urls: string[]; objective?: string },
+      { abortSignal }: { abortSignal?: AbortSignal }
+    ) {
+      return await parallelClient.beta.extract(
+        {
+          urls,
+          objective,
+          excerpts,
+          full_content,
+          fetch_policy,
+        },
+        {
+          signal: abortSignal,
+        }
+      );
+    },
+  });
+}
diff --git a/packages/ai-sdk-tools/src/tools/search.ts b/packages/ai-sdk-tools/src/tools/search.ts
index 29d9164..ac7024a 100644
--- a/packages/ai-sdk-tools/src/tools/search.ts
+++ b/packages/ai-sdk-tools/src/tools/search.ts
@@ -4,107 +4,150 @@
 
 import { tool } from 'ai';
 import { z } from 'zod';
-import { BetaSearchParams } from 'parallel-web/resources/beta/beta.mjs';
+import type {
+  ExcerptSettings,
+  FetchPolicy,
+} from 'parallel-web/resources/beta/beta.mjs';
+import type { SourcePolicy } from 'parallel-web/resources/shared.mjs';
 import { parallelClient } from '../client.js';
 
-function getSearchParams(
-  search_type: 'list' | 'targeted' | 'general' | 'single_page'
-): Pick {
-  switch (search_type) {
-    case 'targeted':
-      return { max_results: 5, max_chars_per_result: 16000 };
-    case 'general':
-      return { max_results: 10, max_chars_per_result: 9000 };
-    case 'single_page':
-      return { max_results: 2, max_chars_per_result: 30000 };
-    case 'list':
-    default:
-      return { max_results: 20, max_chars_per_result: 1500 };
-  }
+/**
+ * Options for creating a custom search tool with code-supplied defaults.
+ */
+export interface CreateSearchToolOptions {
+  /**
+   * Default mode for search. 'agentic' returns concise, token-efficient results
+   * for multi-step workflows. 'one-shot' returns comprehensive results with
+   * longer excerpts. Defaults to 'agentic'.
+   */
+  mode?: 'agentic' | 'one-shot';
+
+  /**
+   * Maximum number of search results to return. Defaults to 10.
+   */
+  max_results?: number;
+
+  /**
+   * Excerpt settings for controlling excerpt length.
+   */
+  excerpts?: ExcerptSettings;
+
+  /**
+   * Source policy for controlling which domains to include/exclude and freshness.
+   */
+  source_policy?: SourcePolicy | null;
+
+  /**
+   * Fetch policy for controlling cached vs fresh content.
+   */
+  fetch_policy?: FetchPolicy | null;
+
+  /**
+   * Custom tool description. If not provided, uses the default description.
+   */
+  description?: string;
 }
 
-const search = async (
-  searchArgs: BetaSearchParams,
-  { abortSignal }: { abortSignal: AbortSignal | undefined }
-) => {
-  return await parallelClient.beta.search(
-    {
-      ...searchArgs,
-    },
-    {
-      signal: abortSignal,
-      headers: { 'parallel-beta': 'search-extract-2025-10-10' },
-    }
-  );
-};
+const objectiveDescription = `Natural-language description of what the web search is trying to find.
+Try to make the search objective atomic, looking for a specific piece of information. May include guidance about preferred sources or freshness.`;
+
+const searchQueriesDescription = `(optional) List of keyword search queries of 1-6 words, which may include search operators. The search queries should be related to the objective. Limited to 5 entries of 200 characters each.`;
+
+const modeDescription = `Presets default values for different use cases. "one-shot" returns more comprehensive results and longer excerpts to answer questions from a single response, while "agentic" returns more concise, token-efficient results for use in an agentic loop. Defaults to "agentic".`;
 
+/**
+ * Search tool that mirrors the MCP web_search_preview tool.
+ * Takes objective and optional search_queries/mode, returns raw search response.
+ */
 export const searchTool = tool({
-  description: `Use the web_search_parallel tool to access information from the web. The
-web_search_parallel tool returns ranked, extended web excerpts optimized for LLMs.
-Intelligently scale the number of web_search_parallel tool calls to get more information
-when needed, from a single call for simple factual questions to five or more calls for
-complex research questions.
-
-* Keep queries concise - 1-6 words for best results. Start broad with very short
-  queries and medium context, then add words to narrow results or use high context
-  if needed.
-* Include broader context about what the search is trying to accomplish in the
-  \`objective\` field. This helps the search engine understand the user's intent and
-  provide relevant results and excerpts.
-* Never repeat similar search queries - make every query unique. If initial results are
-  insufficient, reformulate queries to obtain new and better results.
-
-How to use:
-- For simple queries, a one-shot call to depth is usually sufficient.
-- For complex multi-hop queries, first try to use breadth to narrow down sources. Then
-use other search types with include_domains to get more detailed results.`,
+  description: `Purpose: Perform web searches and return results in an LLM-friendly format.
+
+Use the web search tool to search the web and access information from the web. The tool returns ranked, extended web excerpts optimized for LLMs.`,
   inputSchema: z.object({
-    objective: z.string().describe(
-      `Natural-language description of what the web research goal
-      is. Specify the broad intent of the search query here. Also include any source or
-      freshness guidance here. Limit to 200 characters. This should reflect the end goal so
-      that the tool can better understand the intent and return the best results. Do not
-      dump long texts.`
-    ),
-    search_type: z
-      .enum(['list', 'general', 'single_page', 'targeted'])
-      .describe(
-        `Can be "list", "general", "single_page" or "targeted".
-        "list" should be used for searching for data broadly, like aggregating data or
-        considering multiple sources or doing broad initial research. "targeted" should be
-        used for searching for data from a specific source set. "general" is a catch all case
-        if there is no specific use case from list or targeted. "single_page" extracts data
-        from a single page - extremely targeted. If there is a specific webpage you want the
-        data from, use "single_page" and mention the URL in the objective.
-        Use search_type appropriately.`
-      )
-      .optional()
-      .default('list'),
+    objective: z.string().describe(objectiveDescription),
     search_queries: z
       .array(z.string())
       .optional()
-      .describe(
-        `(optional) List of keyword search queries of 1-6
-        words, which may include search operators. The search queries should be related to the
-        objective. Limited to 5 entries of 200 characters each. Usually 1-3 queries are
-        ideal.`
-      ),
-    include_domains: z.array(z.string()).optional()
-      .describe(`(optional) List of valid URL domains to explicitly
-      focus on for the search. This will restrict all search results to only include results
-      from the provided list. This is useful when you want to only use a specific set of
-      sources. example: ["google.com", "wikipedia.org"]. Maximum 10 entries.`),
+      .describe(searchQueriesDescription),
+    mode: z
+      .enum(['agentic', 'one-shot'])
+      .optional()
+      .default('agentic')
+      .describe(modeDescription),
   }),
 
-  execute: async function ({ ...args }, { abortSignal }) {
-    const results = await search(
-      { ...args, ...getSearchParams(args.search_type) },
-      { abortSignal }
+  execute: async function (
+    { objective, search_queries, mode },
+    { abortSignal }
+  ) {
+    return await parallelClient.beta.search(
+      {
+        objective,
+        search_queries,
+        mode,
+      },
+      {
+        signal: abortSignal,
+      }
     );
-
-    return {
-      searchParams: args,
-      answer: results,
-    };
   },
 });
+
+const defaultSearchDescription = `Purpose: Perform web searches and return results in an LLM-friendly format.
+
+Use the web search tool to search the web and access information from the web. The tool returns ranked, extended web excerpts optimized for LLMs.`;
+
+/**
+ * Factory function to create a search tool with custom defaults.
+ *
+ * Use this when you want to set defaults for mode, max_results, excerpts,
+ * source_policy, or fetch_policy in your code, so the LLM only needs to
+ * provide objective and search_queries.
+ *
+ * @example
+ * ```ts
+ * const mySearchTool = createSearchTool({
+ *   mode: 'one-shot',
+ *   max_results: 5,
+ *   excerpts: { max_chars_per_result: 5000 },
+ * });
+ * ```
+ */
+export function createSearchTool(options: CreateSearchToolOptions = {}) {
+  const {
+    mode: defaultMode = 'agentic',
+    max_results,
+    excerpts,
+    source_policy,
+    fetch_policy,
+    description = defaultSearchDescription,
+  } = options;
+
+  return tool({
+    description,
+    inputSchema: z.object({
+      objective: z.string().describe(objectiveDescription),
+      search_queries: z
+        .array(z.string())
+        .optional()
+        .describe(searchQueriesDescription),
+    }),
+
+    execute: async function ({ objective, search_queries }, { abortSignal }) {
+      return await parallelClient.beta.search(
+        {
+          objective,
+          search_queries,
+          mode: defaultMode,
+          max_results,
+          excerpts,
+          source_policy,
+          fetch_policy,
+        },
+        {
+          signal: abortSignal,
+        }
+      );
+    },
+  });
+}
diff --git a/packages/ai-sdk-tools/tsup.config.ts b/packages/ai-sdk-tools/tsup.config.ts
index 8ee9364..e972acf 100644
--- a/packages/ai-sdk-tools/tsup.config.ts
+++ b/packages/ai-sdk-tools/tsup.config.ts
@@ -1,4 +1,7 @@
 import { defineConfig } from 'tsup';
+import { readFileSync } from 'fs';
+
+const pkg = JSON.parse(readFileSync('./package.json', 'utf-8'));
 
 export default defineConfig({
   entry: {
@@ -12,4 +15,7 @@ export default defineConfig({
   treeshake: true,
   minify: false,
   outDir: 'dist',
+  define: {
+    __PACKAGE_VERSION__: JSON.stringify(pkg.version),
+  },
 });
diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml
index 9a288a7..06bcac8 100644
--- a/pnpm-lock.yaml
+++ b/pnpm-lock.yaml
@@ -36,18 +36,18 @@ importers:
   packages/ai-sdk-tools:
     dependencies:
       parallel-web:
-        specifier: ^0.2.1
-        version: 0.2.1
+        specifier: ^0.3.1
+        version: 0.3.1
       zod:
-        specifier: ^3.23.0
-        version: 3.25.76
+        specifier: ^4.3.6
+        version: 4.3.6
     devDependencies:
       '@types/node':
         specifier: ^20.0.0
         version: 20.19.21
       ai:
         specifier: ^5.0.0
-        version: 5.0.76(zod@3.25.76)
+        version: 5.0.76(zod@4.3.6)
 
 packages:
 
@@ -1155,8 +1155,8 @@ packages:
   package-json-from-dist@1.0.1:
     resolution: {integrity: sha512-UEZIS3/by4OC8vL3P2dTXRETpebLI2NiI5vIrjaD/5UtrkFX/tNbwjTSRAGC/+7CAo2pIcBaRgWmcBBHcsaCIw==}
 
-  parallel-web@0.2.1:
-    resolution: {integrity: sha512-WJgEN92xdS+u8U0J7WXlltZHDZ6SDHJeJL1CYgZ+JQgJNwxhFKdIlawVkUL++EyGPSMu+lE8u/zsxDXtoxMryQ==}
+  parallel-web@0.3.1:
+    resolution: {integrity: sha512-1yLiqdjFFhwgZUV2qOLTvu36A3wiCld8/Y8Zu1FsGk6y4ov4cZPhXNGafjDkM68aFc1c5KVB9ZYpkJZZ/mZW/w==}
 
   parent-module@1.0.1:
     resolution: {integrity: sha512-GQ2EWRpQV8/o+Aw8YqtfZZPfNRWZYkbidE9k5rpl/hC3vtHHBfGm2Ifi6qWV+coDGkrUKZAxE3Lot5kcsRlh+g==}
@@ -1554,24 +1554,24 @@ packages:
     resolution: {integrity: sha512-AyeEbWOu/TAXdxlV9wmGcR0+yh2j3vYPGOECcIj2S7MkrLyC7ne+oye2BKTItt0ii2PHk4cDy+95+LshzbXnGg==}
     engines: {node: '>=12.20'}
 
-  zod@3.25.76:
-    resolution: {integrity: sha512-gzUt/qt81nXsFGKIFcC3YnfEAx5NkunCfnDlvuBSSFS02bcXu4Lmea0AFIUwbLWxWPx3d9p8S5QoaujKcNQxcQ==}
+  zod@4.3.6:
+    resolution: {integrity: sha512-rftlrkhHZOcjDwkGlnUtZZkvaPHCsDATp4pGpuOOMDaTdDDXF91wuVDJoWoPsKX/3YPQ5fHuF3STjcYyKr+Qhg==}
 
 snapshots:
 
-  '@ai-sdk/gateway@2.0.0(zod@3.25.76)':
+  '@ai-sdk/gateway@2.0.0(zod@4.3.6)':
     dependencies:
       '@ai-sdk/provider': 2.0.0
-      '@ai-sdk/provider-utils': 3.0.12(zod@3.25.76)
+      '@ai-sdk/provider-utils': 3.0.12(zod@4.3.6)
       '@vercel/oidc': 3.0.3
-      zod: 3.25.76
+      zod: 4.3.6
 
-  '@ai-sdk/provider-utils@3.0.12(zod@3.25.76)':
+  '@ai-sdk/provider-utils@3.0.12(zod@4.3.6)':
     dependencies:
       '@ai-sdk/provider': 2.0.0
       '@standard-schema/spec': 1.0.0
       eventsource-parser: 3.0.6
-      zod: 3.25.76
+      zod: 4.3.6
 
   '@ai-sdk/provider@2.0.0':
     dependencies:
@@ -2012,13 +2012,13 @@ snapshots:
 
   acorn@8.15.0: {}
 
-  ai@5.0.76(zod@3.25.76):
+  ai@5.0.76(zod@4.3.6):
     dependencies:
-      '@ai-sdk/gateway': 2.0.0(zod@3.25.76)
+      '@ai-sdk/gateway': 2.0.0(zod@4.3.6)
       '@ai-sdk/provider': 2.0.0
-      '@ai-sdk/provider-utils': 3.0.12(zod@3.25.76)
+      '@ai-sdk/provider-utils': 3.0.12(zod@4.3.6)
       '@opentelemetry/api': 1.9.0
-      zod: 3.25.76
+      zod: 4.3.6
 
   ajv@6.12.6:
     dependencies:
@@ -2564,7 +2564,7 @@ snapshots:
 
   package-json-from-dist@1.0.1: {}
 
-  parallel-web@0.2.1: {}
+  parallel-web@0.3.1: {}
 
   parent-module@1.0.1:
     dependencies:
@@ -2926,4 +2926,4 @@ snapshots:
 
   yocto-queue@1.2.1: {}
 
-  zod@3.25.76: {}
+  zod@4.3.6: {}
diff --git a/vitest.config.ts b/vitest.config.ts
index 2e27da4..fc4f94b 100644
--- a/vitest.config.ts
+++ b/vitest.config.ts
@@ -1,6 +1,14 @@
 import { defineConfig } from 'vitest/config';
+import { readFileSync } from 'fs';
+
+const pkg = JSON.parse(
+  readFileSync('./packages/ai-sdk-tools/package.json', 'utf-8')
+);
 
 export default defineConfig({
+  define: {
+    __PACKAGE_VERSION__: JSON.stringify(pkg.version),
+  },
   test: {
     globals: true,
     environment: 'node',