diff --git a/README.md b/README.md
index 38249f6..e7268bd 100644
--- a/README.md
+++ b/README.md
@@ -4,12 +4,13 @@ The following cookbook is designed to get you building with Parallel APIs as qui
 
 ## Recipes & Examples
 
-| Title | Description | Code | Demo |
-| ------------------------------- | ---------------------------------------- | ----------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------- |
-| Tasks Playground with Streaming | Using Durable Objects and SSE Events API | [Recipe](https://github.com/parallel-web/parallel-cookbook/tree/main/typescript-recipes/parallel-tasks-sse) | [oss.parallel.ai/tasks-sse](https://oss.parallel.ai/tasks-sse/) |
-| Market Analysis Demo | Deep Research for Market Analysis | [App Repo](https://github.com/parallel-web/parallel-cookbook/tree/main/python-recipes/market-analysis-demo) | [market-analysis-demo.parallel.ai](https://market-analysis-demo.parallel.ai/) |
-| Search Agent | AI SDK + Parallel SDK Search API as tool | [Recipe](typescript-recipes/parallel-search-agent) | [oss.parallel.ai/agent](https://oss.parallel.ai/agent) |
-| Competitive Analysis | Using Web Enrichment and Reddit MCP | [App Repo](https://github.com/parallel-web/competitive-analysis-demo/tree/main) | [competitive-analsis-demo.parallel.ai](https://competitive-analysis-demo.parallel.ai/) |
+| Title | Description | Code | Demo |
+| ------------------------------- | --------------------------------------------- | --------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------- |
+| Tasks Playground with Streaming | Using Durable Objects and SSE Events API | [Recipe](https://github.com/parallel-web/parallel-cookbook/tree/main/typescript-recipes/parallel-tasks-sse) | [oss.parallel.ai/tasks-sse](https://oss.parallel.ai/tasks-sse/) |
+| Market Analysis Demo | Deep Research for Market Analysis | [App Repo](https://github.com/parallel-web/parallel-cookbook/tree/main/python-recipes/market-analysis-demo) | [market-analysis-demo.parallel.ai](https://market-analysis-demo.parallel.ai/) |
+| Search Agent | AI SDK + Parallel SDK Search API as tool | [Recipe](typescript-recipes/parallel-search-agent) | [oss.parallel.ai/agent](https://oss.parallel.ai/agent) |
+| Competitive Analysis | Using Web Enrichment and Reddit MCP | [App Repo](https://github.com/parallel-web/competitive-analysis-demo/tree/main) | [competitive-analsis-demo.parallel.ai](https://competitive-analysis-demo.parallel.ai/) |
+| Person Entity Resolution | Use the Task API to perform entity resolution | [App Repo](https://github.com/parallel-web/parallel-cookbook/tree/main/typescript-recipes/parallel-entity-resolution) | [entity-resolution-demo.parallel.ai](https://entity-resolution-demo.parallel.ai/) |
 
 ## Community Examples
 
diff --git a/typescript-recipes/parallel-entity-resolution/BLOG.md b/typescript-recipes/parallel-entity-resolution/BLOG.md
new file mode 100644
index 0000000..5f15ef4
--- /dev/null
+++ b/typescript-recipes/parallel-entity-resolution/BLOG.md
@@ -0,0 +1,201 @@
+# Building an AI-powered person entity resolution API with Parallel
+
+TL;DR: We built a person entity resolution API that maps any identifier—name, email, username, or profile URL—to verified social profiles across Twitter, LinkedIn, GitHub, and more. One Task API call handles the entire resolution pipeline.
+
+Tags: [Cookbook](/blog?tag=cookbook)
+
+Reading time: 5 min
+
+[Github](https://github.com/parallel-web/parallel-cookbook/tree/main/typescript-recipes/parallel-entity-resolution) [Try the App](https://entity-resolution-demo.parallel.ai)
+
+---
+
+The average professional maintains 7+ social media accounts. Sales reps, recruiters, and researchers face the same challenge: given a name and an email, find the complete digital footprint.
+
+Manual searches don't scale. Name matching breaks down ("John Smith" returns millions). Cross-references are invisible to traditional search.
+
+We built an API that solves this using Parallel's Task API.
+
+
+
+## Key features
+
+- **Any identifier input**: Name, email, username, or profile URL
+- **Multi-platform search**: Twitter, LinkedIn, GitHub, Instagram, Facebook, TikTok
+- **Chain following**: Automatically follows self-proclaimed links (Twitter bio → GitHub → LinkedIn)
+- **Confidence indicators**: Distinguishes self-proclaimed vs externally discovered profiles
+- **Conservative matching**: Only returns high-confidence matches
+
+## Architecture
+
+The entity resolution API implements a single-call pattern:
+
+1. Parse input for identifiers (handles, emails, URLs, names, affiliations)
+2. Search across platforms using multiple strategies
+3. Follow transitive chains of self-references
+4. Verify bidirectionally where possible
+5. Return structured profiles with reasoning
+
+This architecture leverages the Task API's built-in web research capabilities—no separate search, scrape, or ranking pipeline required.
+
+## Technology stack
+
+- [Parallel Task API](https://docs.parallel.ai/task-api/task-quickstart) for the complete resolution pipeline
+- [Parallel OAuth Provider](https://docs.parallel.ai/integrations/oauth-provider) for API key management
+- [Cloudflare Workers](https://workers.cloudflare.com/) for serverless deployment
+- Pure HTML/JavaScript/CSS frontend
+
+## Why this architecture
+
+**Single Task API call vs research pipeline**
+
+Traditional entity resolution requires building a pipeline: search each platform, scrape results, extract profiles, cross-reference, rank by confidence. The Task API collapses this into one call with structured output.
+
+The `pro` processor handles the reasoning required to follow chains of self-references and verify matches—tasks that would otherwise require custom logic.
+
+**Structured output for integration**
+
+The API returns typed JSON that slots directly into CRM systems, enrichment pipelines, or downstream analysis:
+
+### Example response
+
+```json
+{
+  "profiles": [
+    {
+      "platform_slug": "twitter",
+      "profile_url": "https://twitter.com/johndoe",
+      "is_self_proclaimed": true,
+      "is_self_referring": true,
+      "match_reasoning": "Profile bio links to LinkedIn and GitHub profiles found in search",
+      "profile_snippet": "CTO @TechCorp | Building AI infrastructure"
+    },
+    {
+      "platform_slug": "github",
+      "profile_url": "https://github.com/johndoe",
+      "is_self_proclaimed": true,
+      "is_self_referring": false,
+      "match_reasoning": "Linked from Twitter profile, same name and company affiliation",
+      "profile_snippet": "CTO at TechCorp. 50 repositories, 2.3k followers"
+    }
+  ]
+}
+```
+
+**Conservative by design**
+
+False positives in entity resolution are worse than false negatives. The API only returns high-confidence matches with explicit reasoning for each.
+
+## Implementation
+
+### Defining the Task API call
+
+The resolution request uses a structured output schema to ensure consistent, typed responses:
+
+### Task API request with JSON schema
+
+```typescript
+const client = new Parallel({ apiKey });
+
+const result = await client.taskRun.create({
+  input: `You are a person entity resolution system. Given information about a person, find and return their digital profiles across various platforms.
+
+Input: "${body.input}"
+
+Instructions:
+1. Analyze the input for social media handles, usernames, email addresses, names, or other identifying information
+2. Search for profiles across Twitter, LinkedIn, GitHub, Instagram, Facebook, TikTok
+3. For each profile, determine is_self_proclaimed, is_self_referring, match_reasoning, and profile_snippet
+4. Only return profiles you're confident belong to the same person`,
+  processor: "pro",
+  task_spec: {
+    output_schema: { json_schema: output_json_schema, type: "json" },
+  },
+});
+```
+
+### The confidence framework
+
+Two boolean fields capture match quality:
+
+**`is_self_proclaimed`**: Whether this profile was discovered through the input's chain of references.
+
+- `true` if directly mentioned in input, linked from an input profile, or part of a transitive chain
+- `false` if discovered only through external search
+
+**`is_self_referring`**: Whether this profile links back to other found profiles.
+
+- Creates bidirectional verification
+- Highest confidence signal when combined with `is_self_proclaimed`
+
+### Async polling pattern
+
+Entity resolution takes time—following chains, verifying cross-references, searching multiple platforms. The API uses async submission with polling:
+
+### Submit and poll
+
+```typescript
+// Submit resolution request
+const response = await fetch("/resolve", {
+  method: "POST",
+  headers: { "Content-Type": "application/json", "x-api-key": "YOUR_API_KEY" },
+  body: JSON.stringify({ input: "john.doe@techcorp.com, @johndoe on Twitter" }),
+});
+
+const { trun_id } = await response.json();
+
+// Poll for results
+const result = await fetch(`/resolve/${trun_id}`, {
+  headers: { "x-api-key": "YOUR_API_KEY" },
+});
+```
+
+### OAuth integration
+
+The demo uses Parallel's OAuth Provider for API key management, allowing users to authenticate with their Parallel account:
+
+### OAuth flow with PKCE
+
+```typescript
+const authUrl = new URL("https://platform.parallel.ai/getKeys/authorize");
+authUrl.searchParams.set("client_id", window.location.hostname);
+authUrl.searchParams.set("redirect_uri", `${window.location.origin}/callback`);
+authUrl.searchParams.set("response_type", "code");
+authUrl.searchParams.set("scope", "key:read");
+authUrl.searchParams.set("code_challenge", codeChallenge);
+authUrl.searchParams.set("code_challenge_method", "S256");
+```
+
+## Use cases
+
+**Sales intelligence**: Complete digital footprint beyond LinkedIn. 10x more context per lead.
+
+**Technical recruiting**: Evaluate actual code contributions on GitHub, communication style on Twitter, thought leadership across platforms.
+
+**Customer success**: Connect product usage to social presence. Find advocates. Spot at-risk accounts through sentiment.
+
+**Data quality**: Merge duplicate CRM records by linking profiles to verified external identities.
+
+## Getting started
+
+### Clone and deploy
+
+```bash
+git clone https://github.com/parallel-web/parallel-cookbook
+cd typescript-recipes/parallel-entity-resolution
+npm install
+npm run dev
+```
+
+
+
+## Resources
+
+- [Live Demo](https://entity-resolution-demo.parallel.ai)
+- [Source Code](https://github.com/parallel-web/parallel-cookbook/tree/main/typescript-recipes/parallel-entity-resolution)
+- [Task API Documentation](https://docs.parallel.ai/task-api/task-quickstart)
+- [OAuth Provider Documentation](https://docs.parallel.ai/integrations/oauth-provider)
+
+By Parallel
+
+January 19, 2026
diff --git a/typescript-recipes/parallel-entity-resolution/README.md b/typescript-recipes/parallel-entity-resolution/README.md
index 6a5af13..4577c92 100644
--- a/typescript-recipes/parallel-entity-resolution/README.md
+++ b/typescript-recipes/parallel-entity-resolution/README.md
@@ -14,29 +14,11 @@ This API finds all social media profiles belonging to a single person. Give it a
 
 ## Why It Matters
 
-### For Sales & Revenue
-
-- **Richer prospect intelligence**: Find prospects' GitHub repos, technical blogs, conference talks beyond LinkedIn
-- **Better qualification**: A CTO active on GitHub is different from one who isn't
-- **Warmer introductions**: Discover mutual connections across any platform
-
-### For Recruiting
-
-- **Complete candidate profiles**: Evaluate technical skills (GitHub), communication (Twitter), thought leadership (blogs)
-- **Passive sourcing**: Find engineers active on GitHub but not updating LinkedIn
-- **Cultural fit**: See how candidates present themselves across contexts
-
-### For Customer Success
-
-- **Relationship intelligence**: Identify vocal advocates and at-risk customers
-- **Multi-channel engagement**: Meet champions where they are
-- **Expansion signals**: Track when decision-makers change roles
-
-### For Data Quality
-
-- **CRM deduplication**: Merge duplicate records by linking profiles to single entities
-- **Enrichment at scale**: Turn sparse contact data into rich profiles
-- **Attribution accuracy**: Know which database profiles are the same person
+- Sales & Revenue: Find prospects' complete digital footprint beyond LinkedIn for richer intelligence
+- Recruiting: Evaluate technical skills (GitHub), communication (Twitter), and thought leadership in one view
+- Customer Success: Identify vocal advocates and at-risk customers across all their active channels
+- Data Quality: Merge duplicate CRM records by linking profiles to single verified entities
+- SaaS Products: Understand your users by connecting their product usage to their public presence
 
 ## How It Works
 
@@ -46,6 +28,8 @@ This API finds all social media profiles belonging to a single person. Give it a
 
 Built using [Parallel's Task API](https://docs.parallel.ai/task-api/guides/choose-a-processor) and [OAuth Provider](https://docs.parallel.ai/integrations/oauth-provider).
 
+
+
 ## Quick Start
 
 ```ts example.ts
@@ -98,9 +82,17 @@ const { profiles } = await result.json();
 
 ## Field Explanations
 
-**`is_self_proclaimed`**: Profile was discovered through the person's own references. Either directly mentioned in input, linked from a mentioned profile, or linked transitively. High confidence indicator.
+**`is_self_proclaimed`**: Whether this profile was discovered through the input's chain of references.
+
+`true` if:
+
+1. directly mentioned in the original input,
+2. linked from a profile mentioned in the input, or
+3. linked from any profile in this chain (transitive relationship).
+
+`false` if discovered only through external search without a self-reference chain.",
 
-**`is_self_referring`**: Profile links back to other profiles in the result set. Bidirectional verification increases confidence.
+**`is_self_referring`**: Whether this profile links back to input profile(s) or other found profile(s)
 
 **`match_reasoning`**: Human-readable explanation of why the AI matched this profile. Use for quality assurance and debugging.
 
diff --git a/typescript-recipes/parallel-entity-resolution/TODO.md b/typescript-recipes/parallel-entity-resolution/TODO.md
deleted file mode 100644
index 08045de..0000000
--- a/typescript-recipes/parallel-entity-resolution/TODO.md
+++ /dev/null
@@ -1,3 +0,0 @@
-Needs a benchmark for each usecase with real people.
-
-Can nicely connect this to the llmtext project!
diff --git a/typescript-recipes/parallel-entity-resolution/index.html b/typescript-recipes/parallel-entity-resolution/index.html
index e4f7d21..0d8c432 100644
--- a/typescript-recipes/parallel-entity-resolution/index.html
+++ b/typescript-recipes/parallel-entity-resolution/index.html
@@ -4,7 +4,7 @@
-