fix: Fix composio tool add #2188

Merged
merged 5 commits into from
Dec 7, 2024

@mattzh72 (Collaborator) commented Dec 7, 2024

Fix composio tool add:

  • Tweaks the tool update to do fewer efficiency checks (the checks were tripping schema generation errors for composio tools)
  • Attaches their API key to all composio wrappers
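
For readers unfamiliar with the wrappers, here is a minimal sketch of what attaching the key could look like. It mirrors the `source_code` shape visible in the curl output below; the helper name `generate_composio_wrapper` and the `api_key=` keyword passed to `ComposioToolSet` are assumptions for illustration, not the code merged in this PR.

```python
from typing import Optional


def generate_composio_wrapper(action_name: str, api_key: Optional[str] = None) -> tuple[str, str]:
    """Return (function_name, source_code) for a Composio action wrapper.

    Hypothetical helper: it only builds a source string and imports nothing
    from composio itself.
    """
    func_name = action_name.lower()
    # Assumption: ComposioToolSet accepts an `api_key` keyword argument.
    # When no key is given, the toolset is constructed with no arguments.
    toolset_args = f"api_key={api_key!r}" if api_key else ""
    source = f'''
def {func_name}(**kwargs):
    from composio_langchain import ComposioToolSet

    composio_toolset = ComposioToolSet({toolset_args})
    tool = composio_toolset.get_tools(actions=[{action_name!r}])[0]
    return tool.func(**kwargs)['data']
'''
    return func_name, source


name, src = generate_composio_wrapper("PEOPLEDATALABS_SEARCH_PERSON_DATA", api_key="test-key")
print(name)
print(src)
```

Embedding the key in generated source is the simplest way to make a wrapper self-contained when it runs in a sandbox, but it also means the key is stored wherever the tool source is stored.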

Tests:

  • Tested manually by invoking curls:
(letta-py3.12) (base) mattzhou@Matts-MacBook-Pro MemGPT % curl -X POST http://localhost:8283/v1/tools/composio/PEOPLEDATALABS_SEARCH_PERSON_DATA

{"id":"tool-57f42553-cee5-4566-864f-8f948aa13cb9","description":"Search Person Data Is A Tool That Searches For Person Data Based On A Given Sql Query.","source_type":"python","module":null,"organization_id":"org-00000000-0000-4000-8000-000000000000","name":"peopledatalabs_search_person_data","tags":["composio"],"source_code":"\ndef peopledatalabs_search_person_data(**kwargs):\n    from composio import Action, App, Tag\n    from composio_langchain import ComposioToolSet\n\n    composio_toolset = ComposioToolSet()\n    tool = composio_toolset.get_tools(actions=['action_name'])[0]\n    return tool.func(**kwargs)['data']\n    ","json_schema":{"name":"peopledatalabs_search_person_data","description":"Search Person Data Is A Tool That Searches For Person Data Based On A Given Sql Query.","parameters":{"type":"object","properties":{"sql":{"type":"string","description":"\n   # PDL Schema Documentation\n    A SQL query for People Data Labs (PDL) person profiles using Elasticsearch SQL syntax.\n\n    ## FUNDAMENTAL STRUCTURE & LIMITATIONS\n    0. All queries MUST be formatted in Elasticsearch SQL syntax.\n\n    0. **Limited Clauses**:\n    - No `LIMIT` clause (use `size` parameter instead)\n    - No `GROUP BY`, `HAVING`, or subqueries\n    - Must always use `SELECT * FROM person`\n\n    1. **Pattern Matching**:\n    - Uses `LIKE` and `NOT LIKE` with `%` wildcards\n    - Use `WHERE field_name LIKE 'pattern1%' OR field_name LIKE 'pattern2%' OR field_name LIKE 'pattern3%'` for multiple patterns\n    - Maximum 20 wildcards per query\n\n    2. **Nested Fields**:\n    - Uses dot notation (e.g., `experience.company.name`)\n    - Cannot compare array elements with each other\n\n    3. **Pattern Matching**:\n    - Uses `LIKE` with `%` wildcards\n    - `LIKE ANY` for multiple patterns (similar to SQL's `IN`)\n    - Maximum 20 wildcards per query\n\n    4. **Current Employment**:\n    - Must include `experience.is_primary = true` when querying current job details\n\n    5. 
**No Aggregations**:\n    - Cannot use `COUNT`, `SUM`, `AVG`, etc.\n    - No array element counting or comparison\n\n    1. Query Format MUST be: SELECT * FROM person WHERE <conditions>\n    2. NO column selections, JOINs, UNNEST, LIMIT clauses, or subqueries\n    3. Maximum 20 wildcard terms (LIKE with %) per request\n    4. Must use subfield notation for nested fields\n    5. All field names use snake_case\n    6. NO aggregate functions (COUNT, SUM, AVG, etc.)\n    7. NO GROUP BY or HAVING clauses\n    8. NO self-joins or array element comparisons\n    9. MUST include experience.is_primary = true when querying current employment\n    10. Correct field usage is critical (education.majors vs education.degrees)\n\n    ## TOP-LEVEL QUERYABLE FIELDS\n    ### Identity:\n    - id: Unique identifier\n    - first_name, last_name, full_name, last_initial: Name variations\n    - name_aliases: Array of name variations\n    - birth_date (YYYY-MM-DD), birth_year (integer)\n    - sex: male/female\n    - languages: Array[object]\n    Object fields:\n        - languages.language (canonical format)\n\n    ### Current Status:\n    - job_title: Current position\n    - location_name: Current location\n    - inferred_years_experience: Career duration (integer)\n\n    ### Social Profiles (Direct Access):\n    - linkedin_url, linkedin_username, linkedin_connections (integer)\n    - github_url, github_username\n    - facebook_url, facebook_username\n    - twitter_url, twitter_username\n\n    ### Current Company Information:\n    - job_company_12mo_employee_growth_rate: float\n    - job_company_founded: integer\n    - job_company_employee_count: integer\n    - job_company_location_continent: canonical continent name\n    - job_company_location_country: canonical country name\n    - job_company_location_metro: canonical metro name\n    - job_company_name: string\n    - job_company_total_funding_raised: integer > 0\n    - job_company_website: string   \n    - job_last_changed: string 
(Date)\n    - job_summary: string\n\n    ### Contact Information:\n    - emails: Array[Object]\n    Object fields:\n        - emails.address: Email address\n        - emails.type: Email type\n    - phones: Array[Object]\n    Object fields:\n        - phones.number: Phone number\n    - work_email: Current work email\n    - mobile_phone\n    - phone_numbers: Array[string]\n\n    ## NESTED STRUCTURES & ARRAYS\n    ### Experience Fields:\n    - experience.company.name: Company name\n    - experience.company.industry: canonical Industry classification\n    - experience.company.founded: integer\n    - experience.company.size: canonical Company size category\n    - experience.company.type: canonical Company type\n    - experience.company.location.continent: canonical Continent name\n    - experience.company.location.country: canonical Country name\n    - experience.company.location.region: canonical State/Province\n    - experience.company.location.locality: canonical City name\n    - experience.title.name: Job title (string)\n    - experience.title.role: canonical Job role\n    - experience.title.levels: canonical Job levels (Array [Enum (String)])\n    - experience.start_date, experience.end_date: Employment dates\n    - experience.is_primary: Boolean for current job\n\n    ### Education Fields:\n    - education.school.name: Institution name (string)\n    - education.school.type: canonical Institution type\n    - education.degrees: Degree types (e.g., 'BS', 'MS', 'PhD')\n    - education.majors: Fields of study (e.g., 'computer science', 'physics')\n    - education.gpa: Grade point average (float)\n    - education.start_date, education.end_date: Study dates\n\n    ## CRITICAL FIELD USAGE\n    1. Current Employment Queries:\n    - MUST include experience.is_primary = true\n    - Example: WHERE experience.company.name = 'Google' AND experience.is_primary = true\n\n    2. 
Education Field Usage:\n    - education.majors: For fields of study (e.g., 'computer science', 'physics')\n    - education.degrees: For degree types (e.g., 'BS', 'MS', 'PhD')\n    - education.school.name: For institution names\n\n    3. Array Field Access:\n    - Cannot compare array elements with each other\n    - Cannot use subqueries on arrays\n    - Cannot count array elements\n    3. Job Title Field Usage:\n    - job_title: For current position/role queries (e.g., 'VP of Engineering', 'Software Engineer')\n    - experience.title.levels: Only for job level classifications ('entry', 'senior', 'vp', 'director', 'cxo')\n    Example: \n    USE: WHERE job_title LIKE '%vp of engineering%'\n    NOT: WHERE experience.title.levels LIKE '%vp of engineering%'\n\n    ## CANONICAL VALUES (Standard Field Values)\n    ### Professional Information:\n    1. Title Levels (job_title_levels, experience.title.levels) (canonical formats):\n    ONLY SUPPORTED VALUES:\n    - cxo \n    - vp\n    - director\n    - manager\n    - senior\n    - entry\n    - owner\n    - partner\n    - training\n    - unpaid\n    2. Role (job_title_role, experience.title.role) (canonical formats):\n    - customer_service\n    - design\n    - education\n    - engineering\n    - finance\n    - health\n    - human_resources\n    - legal\n    - marketing\n    - media\n    - operations\n    - public_relations\n    - real_estate\n    - sales\n    - trades\n\n    2. Title Classes (job_title_class, experience.title.class):\n    - 'general_and_administrative'\n    - 'research_and_development'\n    - 'sales_and_marketing'\n    - 'services'\n    - 'unemployed'\n\n    3. Inferred Salary Ranges (canonical formats) (inferred_salary):\n    - '<20,000', '20,000-25,000', '25,000-35,000'\n    - '35,000-45,000', '45,000-55,000', '55,000-70,000'\n    - '70,000-85,000', '85,000-100,000', '100,000-150,000'\n    - '150,000-250,000', '> 250,000'\n\n    ### Company Information:\n    1. 
Industries (canonical formats) (job_company_industry, experience.company.industry):\n    MAJOR SUPPORTED INDUSTRIES, TRY TO USE THESE AS MUCH AS POSSIBLE:\n    - accounting\n    - airlines/aviation\n    - apparel & fashion\n    - automotive\n    - architecture & planning\n    - banking\n    - biotechnology\n    - computer software\n    - construction\n    - consumer goods\n    - consulting\n    - defense & space\n    - education management\n    - entertainment\n    - events services\n    - financial services\n    - food & beverage\n    - gambling & casinos\n    - health, wellness and fitness\n    - hospital & health care\n    - hospitality\n    - human resources\n    - information technology and services\n    - legal services\n    - luxury goods & jewelry\n    - logistics and supply chain\n    - mechanical or industrial engineering\n    - military\n    - machinery\n    - media production\n    - pharmaceuticals\n    - package/freight delivery\n    - real estate\n    - recreational facilities and services\n    - retail\n    - telecommunications\n    - textiles\n    - transportation/trucking/railroad\n    - utilities\n    - venture capital & private equity\n    - warehousing\n    - wholesale\n\n    2. Company Types (canonical formats) (job_company_type, experience.company.type):\n    ONLY SUPPORTED VALUES FOR COMPANY TYPE:\n    - public\n    - private\n    - public_subsidiary\n    - educational\n    - government\n    - nonprofit\n\n    3. Company Sizes (canonical formats) (job_company_size, experience.company.size):\n    ONLY SUPPORTED VALUES FOR COMPANY SIZE, DO NOT USE ANYTHING ELSE LIKE '1-100' OR '200-300', ONLY USE THE VALUES BELOW:\n    - '1-10', '11-50', '51-200', '201-500'\n    - '501-1000', '1001-5000', '5001-10000', '10001+'\n\n\n    4. 
Inferred Revenue Ranges (canonical formats) (job_company_inferred_revenue):\n    ONLY SUPPORTED VALUES FOR INFERRED REVENUE RANGES:\n    - '$0-$1M', '$1M-$10M', '$10M-$25M', '$25M-$50M'\n    - '$50M-$100M', '$100M-$250M', '$250M-$500M'\n    - '$500M-$1B', '$1B-$10B', '$10B+'\n\n    ### Education Information:\n    1. School Types (canonical formats):\n    ONLY SUPPORTED VALUES BELOW:\n    - 'post-secondary institution'\n    - 'primary school'\n    - 'secondary school'\n\n    2. Degree Types (canonical formats): \n    - Bachelor's: 'bachelor of arts', 'bachelor of science'\n    - Master's: 'master of science', 'master of arts'\n    - Other: 'associate of arts', 'phd'\n\n    3. Major Fields (canonical formats):\n    - Tech: 'computer science', 'software engineering'\n    - Business: 'accounting', 'business administration'\n\n    ### Contact & Communication:\n    1. Email Types (emails.type) (canonical formats):\n    - 'current_professional'\n    - 'personal'\n    - 'professional'\n    - 'disposable'\n\n    ### Location Information:\n    1. Metro Areas (canonical formats) (job_company_location_metro, location_metro, experience.company.location.metro):\n    - 'san francisco, california'\n    - 'new york, new york'\n    - 'london, england'\n    - 'los angeles, california'\n    [Follow standard format: city, region]\n    2. Countries (canonical formats): \n    - 'united states'\n    - 'united kingdom'\n    - 'canada'\n    - 'australia'\n    3. Continent is also supported: \n\n    2. Confidence Levels (canonical formats): \n    - 'very high', 'high'\n    - 'moderate'\n    - 'low', 'very low'  \n\n    ## VALID QUERY PATTERNS\n    1. Simple Field Query:\n    ```sql\n    SELECT * FROM person \n    WHERE job_title LIKE '%engineer%'\n    AND location_name LIKE '%san francisco%'\n    ```\n\n    2. 
Nested Field Query:\n    ```sql\n    SELECT * FROM person \n    WHERE experience.company.name LIKE '%google%'\n    AND experience.company.size IN ('1001-5000', '5001-10000')\n    AND experience.is_primary = true\n    ```\n\n    3. Multiple Location Query:\n    ```sql\n    SELECT * FROM person \n    WHERE experience.company.location.locality LIKE '%new york%'\n    AND experience.company.location.country = 'united states'\n    AND experience.is_primary = true\n    ```\n\n    4. Date and Social Profile Query:\n    ```sql\n    SELECT * FROM person \n    WHERE experience.start_date >= '2020-01-01'\n    AND linkedin_url IS NOT NULL\n    AND github_url IS NOT NULL\n    ```\n\n    5. Education Query Pattern:\n    ```sql\n    SELECT * FROM person \n    WHERE education.majors LIKE '%computer science%'  -- Field of study\n    AND education.degrees LIKE '%BS%'                 -- Degree type\n    AND education.school.name LIKE '%stanford%'       -- Institution\n    ```\n\n    6. Current Employment with Education:\n    ```sql\n    SELECT * FROM person \n    WHERE job_title LIKE '%software engineer%'\n    AND experience.company.name LIKE '%google%'\n    AND experience.is_primary = true                  -- Required for current job\n    AND education.majors LIKE '%computer science%'    -- Field of study\n    ```\n\n    ## COMMON MISTAKES (DO NOT USE)\n    ❌ Counting or aggregating:\n    WHERE COUNT(experience) > 2\n\n    ❌ Comparing array elements:\n    WHERE experience.location != experience.previous_location\n\n    ❌ Using subqueries:\n    WHERE field IN (SELECT...)\n\n    ❌ Direct array access:\n    WHERE experience[0].company.name\n\n    ❌ Non-existent fields:\n    email (use emails.address)\n    city (use locality)\n    verified_emails\n    phone_numbers.location\n\n    ❌ Missing experience.is_primary = true when querying current employment\n\n    ❌ Using education.degrees for fields of study (use education.majors instead)\n\n    ❌ Using education.majors for degree types (use 
education.degrees instead)\n    ❌ Using experience.title.levels for full job titles (use job_title instead)\n\n    ## QUERY BEST PRACTICES\n    1. Always use dot notation for nested fields\n    2. Keep wildcards under 20 per query\n    3. Use LIKE for pattern matching\n    4. Use experience.is_primary = true for current job\n    5. Use correct date format: 'YYYY-MM-DD'\n    6. Use IN clauses for multiple exact matches\n    7. Use IS NOT NULL for existence checks\n    8. Use AND, OR, NOT for boolean conditions\n    9. ALWAYS INCLUDE experience.is_primary = true when querying current employment\n    10. Use education.majors for fields of study and education.degrees for degree types\n    11. For complex queries, validate field paths against the schema documentation\n    12. For canonical values, they are enums and have specific values - You can use LIKE but try to use equals as much as possible.\n    13. For company size, or any size related fields, only use the canonical values.\n    ## Example Complex Valid Query:\n    ```sql\n    SELECT * FROM person \n    WHERE job_title LIKE '%engineering manager%'\n    AND experience.company.industry = 'computer software'\n    AND experience.company.size IN ('1001-5000', '5001-10000')\n    AND education.school.name LIKE ('%stanford%', '%mit%')\n    AND location_name LIKE '%california%'\n    AND linkedin_connections > 500\n    AND github_url IS NOT NULL\n    AND experience.is_primary = true\n    AND experience.start_date >= '2020-01-01'\n    ```\n    . Please provide a value of type string."},"size":{"type":"integer","description":"The number of matched records to return for this query if they exist*. Must be between 1 and 100. Please provide a value of type integer."},"scroll_token":{"type":"string","description":"Each search API response returns a scroll_token. Include it in the next request to fetch the next size matching records. 
Please provide a value of type string."},"dataset":{"type":"string","description":"Specifies which dataset category the API should search against. Valid dataset categories are ONLY 'resume', 'email', 'phone', 'mobile_phone', 'street_address', 'consumer_social', 'developer', 'all'. Please provide a value of type string."},"titlecase":{"type":"boolean","description":"Setting titlecase to true will titlecase any records returned. Please provide a value of type boolean."},"pretty":{"type":"boolean","description":"Whether the output should have human-readable indentation. Please provide a value of type boolean."},"request_heartbeat":{"type":"boolean","description":"Request an immediate heartbeat after function execution. Set to `True` if you want to send a follow-up message or run a follow-up function."}},"required":["request_heartbeat"]}},"created_by_id":"user-00000000-0000-4000-8000-000000000000","last_updated_by_id":"user-00000000-0000-4000-8000-000000000000"}%     
  • Also tested using the composio usage tooling example, which adds a composio tool and runs it end to end in an e2b sandbox
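
The manual curl check above could also be scripted. The sketch below is a hypothetical helper (not part of this PR) that inspects a tool-creation response for the properties visible in the output: the lowercased tool name, the `composio` tag, and `request_heartbeat` listed as a required parameter.

```python
def check_composio_tool(payload: dict, expected_name: str) -> list[str]:
    """Return a list of problems found in a tool-creation response (empty if none)."""
    problems = []
    if payload.get("name") != expected_name:
        problems.append(f"unexpected name: {payload.get('name')!r}")
    if "composio" not in payload.get("tags", []):
        problems.append("missing 'composio' tag")
    # The schema nests parameters under json_schema.parameters.
    params = payload.get("json_schema", {}).get("parameters", {})
    if "request_heartbeat" not in params.get("properties", {}):
        problems.append("request_heartbeat missing from properties")
    if "request_heartbeat" not in params.get("required", []):
        problems.append("request_heartbeat not marked required")
    return problems


# Trimmed-down sample shaped like the response shown above.
sample = {
    "name": "peopledatalabs_search_person_data",
    "tags": ["composio"],
    "json_schema": {
        "parameters": {
            "properties": {"sql": {"type": "string"}, "request_heartbeat": {"type": "boolean"}},
            "required": ["request_heartbeat"],
        }
    },
}
print(check_composio_tool(sample, "peopledatalabs_search_person_data"))
```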

@mattzh72 mattzh72 merged commit 14d1009 into main Dec 7, 2024
32 of 34 checks passed