fix: remove html from AI prompt #598

Merged
nizzyabi merged 5 commits into Mail-0:staging from Vicentesan:fix-remove-html-from-ai-prompt on Apr 6, 2025

Conversation

@Vicentesan
Contributor

@Vicentesan Vicentesan commented Apr 5, 2025

What Changed

This PR introduces several improvements to the email processing system in the apps/mail directory:

  • Added a new stripHtmlTags utility function, defined in lib/ai.ts and called from ai-reply.ts, to clean HTML tags from email threads before AI processing
  • Integrated embeddings generation for email content in lib/ai.ts to improve context awareness in AI responses
  • Removed numerous console logging statements from lib/groq.ts to clean up the codebase

Why This Change Matters

  • Improved AI Response Quality: By stripping HTML tags from email content before processing, we ensure the AI receives cleaner text input, resulting in more relevant and accurate responses.
  • Enhanced Context Understanding: The new embeddings integration allows the AI to better understand the context of email threads, leading to more appropriate and helpful replies.
  • Code Maintenance: Removing unnecessary logging statements makes the codebase cleaner and more maintainable while reducing noise in production logs.

Areas Affected

  • Email Integration
  • AI Processing Pipeline
  • Code Quality

Notes for Reviewers

This change focuses on improving the quality of AI-generated email responses while cleaning up the codebase. No breaking changes are introduced, and all existing functionality should continue to work as expected.

Related Issues

https://www.loom.com/share/32e08357c70545569016fc0c8e61740a?sid=2553d5d6-f5cf-4b65-b27c-2e21b66c9543

Summary by CodeRabbit

  • New Features

    • Email summaries now automatically remove HTML formatting for cleaner, more readable content.
    • AI-generated email content has been enhanced to incorporate richer contextual insights, resulting in improved clarity and presentation.
    • The search bar functionality has been refined to better process and display search queries.
  • Bug Fixes

    • Minor formatting adjustments have been applied to refine the display of AI responses.

@vercel

vercel bot commented Apr 5, 2025

@Vicentesan is attempting to deploy a commit to the Zero Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai
Contributor

coderabbitai bot commented Apr 5, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

This pull request enhances email processing by making the extractEmailSummary function asynchronous and incorporating HTML tag removal to improve email summarization. It introduces a new function, extractMetaText, for refining search queries. Additionally, several debugging log statements are removed across various files, and minor formatting adjustments are made for better readability without altering core functionality.

Changes

File(s) | Change Summary
--- | ---
apps/mail/actions/ai-reply.ts | Async Function Update: Changed extractEmailSummary to return a Promise<string> and added HTML stripping functionality. Minor formatting adjustment in generateAIResponse with a comma addition.
apps/mail/actions/ai-search.ts | Log Cleanup: Removed debugging console logs related to the search query enhancement.
apps/mail/actions/ai.ts | Formatting: Added whitespace for improved readability in generateAIEmailContent.
apps/mail/lib/ai.ts | Contextual Embeddings: Enhanced generateEmailContent to include contextual embeddings and improved HTML processing. Removed the previous implementation of formatEmailContent.
apps/mail/lib/groq.ts | Embedding Update: Replaced Groq embedding API with OpenAIEmbeddings. Updated error handling and added createEmbeddingsBatch for batch processing.
apps/mail/components/mail/search-bar.tsx | New Function: Added extractMetaText to process search queries, modified submitSearch to utilize this function, and adjusted error handling logic and input field styling.
apps/mail/package.json | Dependency Addition: Added @langchain/groq dependency with version ^0.2.1.

Possibly related PRs

  • Improved ai with custom prompt #534: The changes in the main PR are related to the modifications in the generateAIResponse function in the retrieved PR, as both involve updates to the apps/mail/actions/ai-reply.ts file and adjustments to the function signatures and logic for generating AI responses.
  • perf: switch from open ai to groq api #594: The changes in the main PR are directly related to the modifications in the retrieved PR, as both involve the introduction and modification of the extractEmailSummary function in apps/mail/actions/ai-reply.ts.

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4172041 and 22d1834.

📒 Files selected for processing (3)
  • apps/mail/lib/ai.ts (3 hunks)
  • apps/mail/lib/groq.ts (6 hunks)
  • apps/mail/package.json (1 hunks)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bc8ab8a and 119292a.

📒 Files selected for processing (5)
  • apps/mail/actions/ai-reply.ts (3 hunks)
  • apps/mail/actions/ai-search.ts (1 hunks)
  • apps/mail/actions/ai.ts (2 hunks)
  • apps/mail/lib/ai.ts (4 hunks)
  • apps/mail/lib/groq.ts (0 hunks)
💤 Files with no reviewable changes (1)
  • apps/mail/lib/groq.ts
🧰 Additional context used
🧬 Code Definitions (2)
apps/mail/actions/ai-reply.ts (1)
apps/mail/lib/ai.ts (1)
  • stripHtmlTags (149-187)
apps/mail/lib/ai.ts (1)
apps/mail/lib/groq.ts (2)
  • createEmbeddings (132-146)
  • generateCompletions (168-278)
🔇 Additional comments (10)
apps/mail/actions/ai.ts (2)

42-42: Code style improvement.

The added whitespace improves readability by properly separating the error handling section from the subsequent API call.


61-61: Code style improvement.

The added whitespace improves readability by properly separating the email responses filtering from the content cleaning logic.

apps/mail/actions/ai-search.ts (1)

57-60: Removal of debug logging improves code quality.

The PR cleaned up the code by removing console.log statements that were used for debugging purposes. This is good practice for production code as it prevents unnecessary logging and keeps the output clean.

apps/mail/actions/ai-reply.ts (3)

6-6: Added import for HTML tag stripping functionality.

The new import brings in the stripHtmlTags utility function from the AI module, supporting the goal of removing HTML from AI prompts.


10-12: Key implementation: HTML tag removal from email threads.

This change addresses the core issue in the PR by stripping HTML tags from thread content before processing, resulting in cleaner text input for the AI to work with. This is a critical improvement that should lead to better AI-generated content.


46-46: Code style improvement.

Added whitespace improves readability by separating logical sections of the code.

apps/mail/lib/ai.ts (4)

1-1: Enhanced import to support embedding generation.

Updated import to include createEmbeddings alongside generateCompletions, which is used in the new embedding functionality to improve AI context awareness.


79-107: Added embedding generation for improved AI context awareness.

This significant enhancement creates embeddings from relevant context sources (current email, user prompt, and recent conversation history), enabling the AI to better understand the conversational context when generating responses. The implementation includes proper error handling to continue without embeddings if generation fails.
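
The "continue without embeddings if generation fails" behavior described above could look roughly like the sketch below. It is an illustration only, not the code in apps/mail/lib/ai.ts: the Embedder type, the buildContextEmbeddings name, the ContextInputs shape, and the three-message history window are all assumptions, and the real createEmbeddings may have a different signature.

```typescript
// Hypothetical embedder signature; the real createEmbeddings in
// apps/mail/lib/groq.ts may take different arguments.
type Embedder = (text: string) => Promise<number[]>;

interface ContextInputs {
  currentContent?: string;
  userPrompt?: string;
  recentMessages?: string[];
}

interface ContextEmbeddings {
  embeddings: Record<string, number[]>;
  failed: string[]; // keys whose embedding call threw
}

async function buildContextEmbeddings(
  inputs: ContextInputs,
  embed: Embedder,
  historySize = 3, // assumed window of recent messages
): Promise<ContextEmbeddings> {
  // Collect only the non-empty context sources.
  const sources: Array<[string, string]> = [];
  const content = inputs.currentContent?.trim();
  if (content) sources.push(["content", content]);
  const prompt = inputs.userPrompt?.trim();
  if (prompt) sources.push(["prompt", prompt]);
  const recent = historySize > 0 ? (inputs.recentMessages ?? []).slice(-historySize) : [];
  const history = recent.join("\n").trim();
  if (history) sources.push(["history", history]);

  const result: ContextEmbeddings = { embeddings: {}, failed: [] };
  for (const [key, text] of sources) {
    try {
      result.embeddings[key] = await embed(text);
    } catch {
      // Embeddings are an optional enrichment: record the failure and move on
      // so the completion call can still run without this vector.
      result.failed.push(key);
    }
  }
  return result;
}
```

Returning the failed keys, rather than swallowing errors silently, lets the caller decide whether to log them.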


115-117: Updated generateCompletions call to include embeddings.

Improved the API call by passing the generated embeddings to the AI model, which enhances the contextual understanding for more relevant responses. Also fixed a max_tokens parameter that may have been modified as part of this change.


149-187: Added comprehensive stripHtmlTags utility function.

This well-implemented utility function:

  1. Removes HTML style and script tags completely
  2. Replaces common HTML entities with their text equivalents
  3. Strips all remaining HTML tags
  4. Normalizes whitespace and fixes common formatting issues
  5. Adds appropriate line breaks for better readability

The implementation is thorough and handles various edge cases. This function is a key part of the PR's goal to remove HTML from AI prompts.
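
The five steps above can be sketched roughly as follows. This is a minimal regex-based illustration, not the PR's actual stripHtmlTags: the name stripHtmlTagsSketch, the entity table, and the list of block-level tags are assumptions. The sketch also reorders the steps on purpose: line breaks are inserted before tags are removed, and entities are decoded last so that text such as `&lt;b&gt;` is not mistaken for a tag.

```typescript
// Hypothetical entity table; a real implementation would likely cover more.
const ENTITY_MAP: Record<string, string> = {
  "&nbsp;": " ",
  "&lt;": "<",
  "&gt;": ">",
  "&quot;": '"',
  "&#39;": "'",
  "&amp;": "&",
};

function stripHtmlTagsSketch(html: string): string {
  return html
    // Drop <style> and <script> blocks together with their contents.
    .replace(/<(style|script)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, "")
    // Turn <br> and closing block-level tags into line breaks.
    .replace(/<br\s*\/?>|<\/(p|div|li|tr|h[1-6])\s*>/gi, "\n")
    // Strip every remaining tag.
    .replace(/<[^>]+>/g, "")
    // Decode common entities in a single pass (avoids double decoding).
    .replace(/&(nbsp|lt|gt|quot|#39|amp);/g, (match) => ENTITY_MAP[match] ?? match)
    // Normalize whitespace: collapse runs, tidy line edges, cap blank lines.
    .replace(/[ \t]+/g, " ")
    .replace(/ *\n */g, "\n")
    .replace(/\n{3,}/g, "\n\n")
    .trim();
}
```

Regex-based stripping is fragile on malformed markup, which may be one reason a later commit in this PR switched to a cheerio-based utility.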

Comment on lines +114 to +117
prompt,
temperature: 0.7,
max_tokens: 500
})
});

⚠️ Potential issue

Fixed syntax error in generateCompletions call.

Added a missing comma after the prompt parameter, correcting what would have been a syntax error. This is an important fix for proper function operation.

Comment on lines 202 to 204
// Fixed bug: was using formatEmailContent.toString() instead of formattedContent
formattedContent = stripHtmlTags(formattedContent).trim();

⚠️ Potential issue

Fixed content cleaning bug in formatEmailContent.

This change fixes a significant bug where the function was incorrectly using formatEmailContent.toString() instead of the actual formattedContent variable, and applies the new stripHtmlTags function to ensure the email content is properly cleaned of HTML.
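
To see why calling .toString() on the function itself is a bug, consider the sketch below. The names formatDemo, buggy, and fixed are hypothetical and only illustrate the mistake described in the code comment; they are not taken from apps/mail/lib/ai.ts.

```typescript
// Hypothetical stand-in for a formatter such as formatEmailContent.
function formatDemo(content: string): string {
  return content.trim();
}

const formattedContent = "  <p>Hello</p>  ";

// Bug: Function.prototype.toString() returns the function's source text,
// so this value has nothing to do with the email being formatted.
const buggy = formatDemo.toString().trim();

// Fix: operate on the variable that actually holds the content.
const fixed = formattedContent.trim();
```

Here buggy contains the source of formatDemo, while fixed contains the trimmed email markup.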

@Vicentesan Vicentesan marked this pull request as draft April 6, 2025 00:08
@Vicentesan
Contributor Author

converted to draft cuz AI-search isn't working

@nizzyabi
Collaborator

nizzyabi commented Apr 6, 2025

it isnt your fault. lets ship and worry about ai search after

@Vicentesan
Contributor Author

it isnt your fault. lets ship and worry about ai search after

i actually think i might be... just saw a log and it's returning a whole email instead of the filter

@Vicentesan
Contributor Author

trying to understand why now

@MrgSub MrgSub left a comment

LGTM

@Vicentesan Vicentesan marked this pull request as ready for review April 6, 2025 01:59
@Vicentesan Vicentesan marked this pull request as draft April 6, 2025 01:59

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
apps/mail/components/mail/search-bar.tsx (1)

885-932: Consider using optional chaining for regex match groups and add tests for edge cases.

  1. You can simplify the safety checks for quotedQueryMatch and captures by applying optional chaining, e.g. if (quotedQueryMatch?.[1]) {...}, improving readability and reducing potential defensive checks.
  2. Additionally, test the function against unusual inputs (e.g., partial quotes, misleading phrases, or no recognized patterns) to ensure robust coverage.
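
As a rough illustration of the optional-chaining suggestion, the sketch below uses a hypothetical helper, extractQuotedQuery, with an assumed double-quote pattern. The real extractMetaText in search-bar.tsx is not shown in this thread and may match different patterns.

```typescript
// Hypothetical helper; only the quotedQueryMatch name comes from the review.
function extractQuotedQuery(text: string): string | undefined {
  const quotedQueryMatch = text.match(/"([^"]+)"/);
  // Before: if (quotedQueryMatch && quotedQueryMatch[1]) { ... }
  // After: optional chaining covers both a null match and a missing group.
  const captured = quotedQueryMatch?.[1];
  if (captured) {
    return captured.trim();
  }
  return undefined;
}
```

The edge cases mentioned in the second point, such as a partial quote or no recognized pattern, both fall through to undefined in this sketch.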
🧰 Tools
🪛 Biome (1.9.4)

[error] 889-890: Change to an optional chain.

Unsafe fix: Change to an optional chain.

(lint/complexity/useOptionalChain)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 119292a and 4172041.

📒 Files selected for processing (5)
  • apps/mail/actions/ai-reply.ts (3 hunks)
  • apps/mail/actions/ai-search.ts (2 hunks)
  • apps/mail/components/mail/search-bar.tsx (3 hunks)
  • apps/mail/lib/ai.ts (5 hunks)
  • apps/mail/lib/groq.ts (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • apps/mail/actions/ai-search.ts
  • apps/mail/actions/ai-reply.ts
  • apps/mail/lib/groq.ts
🧰 Additional context used
🧬 Code Definitions (1)
apps/mail/lib/ai.ts (1)
apps/mail/lib/groq.ts (2)
  • createEmbeddings (132-146)
  • generateCompletions (168-262)
🪛 Biome (1.9.4)
apps/mail/components/mail/search-bar.tsx

[error] 889-890: Change to an optional chain.

Unsafe fix: Change to an optional chain.

(lint/complexity/useOptionalChain)

🔇 Additional comments (6)
apps/mail/components/mail/search-bar.tsx (2)

640-640: Minor style improvement.

The updated class names enhance the visual styling of the input field, improving readability and consistency with the theme. No issues found here.


201-204: Validate the fallback behavior after extractMetaText.

You gracefully handle cases where extractMetaText returns undefined by defaulting to an empty string. Ensure that dropping to an empty search query is desired behavior when extractMetaText returns nothing. Otherwise, consider logging or informing the user that no actionable query could be extracted.
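
One way to make the empty-query case explicit, instead of silently searching for an empty string, is sketched below. The SearchResolution type, the resolveSearchQuery name, and the message text are assumptions for illustration; submitSearch in search-bar.tsx may be structured differently.

```typescript
// Hypothetical result type that separates "have a query" from "nothing usable".
type SearchResolution =
  | { kind: "query"; value: string }
  | { kind: "empty"; reason: string };

function resolveSearchQuery(
  aiResponse: string,
  extract: (text: string) => string | undefined, // stand-in for extractMetaText
): SearchResolution {
  const extracted = extract(aiResponse)?.trim();
  if (extracted) {
    return { kind: "query", value: extracted };
  }
  // The caller can log this or surface it to the user.
  return { kind: "empty", reason: "No actionable search query could be extracted." };
}
```

A caller would then branch on kind and decide whether to run the search or show the reason.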

apps/mail/lib/ai.ts (4)

1-2: Import statements look correct.

Using extractTextFromHTML from @/actions/extractText and bringing createEmbeddings, generateCompletions into this file is appropriate for local usage. No changes suggested here.


80-108: Embedding strategy is logical—verify large content performance.

You're generating embeddings for current content, prompt, and recent messages, which should enrich AI context. For extremely large email drafts or prompts, confirm performance remains acceptable. Consider chunking or summarizing large texts if you encounter timeouts or memory issues.
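
A minimal sketch of the chunking idea, assuming a simple character budget per chunk: the chunkText name and the maxChars parameter are hypothetical, and a production version would more likely budget by tokens. Note that this sketch collapses all whitespace, including line breaks, into single spaces.

```typescript
// Split text into chunks of at most maxChars characters, breaking on
// whitespace where possible.
function chunkText(text: string, maxChars: number): string[] {
  if (!Number.isInteger(maxChars) || maxChars <= 0) {
    throw new RangeError("maxChars must be a positive integer");
  }
  const chunks: string[] = [];
  let current = "";
  for (const word of text.split(/\s+/).filter(Boolean)) {
    // A single word longer than the limit is hard-split into pieces.
    for (let i = 0; i < word.length; i += maxChars) {
      const piece = word.slice(i, i + maxChars);
      const candidate = current ? `${current} ${piece}` : piece;
      if (candidate.length <= maxChars) {
        current = candidate;
      } else {
        chunks.push(current);
        current = piece;
      }
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Each chunk could then be embedded separately, for example through the createEmbeddingsBatch function mentioned in the change summary.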


116-117: Condensing or adjusting tokens for questions is sensible.

Reducing max_tokens to 150 for queries helps tailor responses. This logic is reasonable, but ensure shorter outputs don’t cut off essential context. Keep an eye on user feedback to see if 150 tokens suffice.


150-150: Async content extraction and cleaning fix look solid.

  1. Changing formatEmailContent to async allows for proper HTML extraction before trimming, which resolves past mishaps where .toString() was mistakenly used.
  2. The call to extractTextFromHTML(formattedContent) ensures HTML tags are stripped, improving email response clarity.

No issues found with this fix.

Also applies to: 163-164

@nizzyabi
Collaborator

nizzyabi commented Apr 6, 2025

@Vicentesan some issues with drafting up an email. it tends to not space things out and won't perform some requests like writing an email about someone getting fired. let's fix plz. this only happens on the create-email page. search emails is great, same with reply-composer one. only one that needs fixing is create-email
[Screenshot: CleanShot 2025-04-05 at 23 04 57@2x]

@Vicentesan
Contributor Author

working on it

@Vicentesan
Contributor Author

@Vicentesan some issues with drafting up an email. it tends to not space things out and won't perform some requests like writing an email about someone getting fired. let's fix plz. this only happens on the create-email page. search emails is great, same with reply-composer one. only one that needs fixing is create-email

it worked with me... can u share the used prompt?

[Screenshot: CleanShot 2025-04-06 at 6 30 59@2x]

@nizzyabi
Collaborator

nizzyabi commented Apr 6, 2025

even in your email, it's not spaced correctly. it needs to be like this.
[Screenshot: CleanShot 2025-04-06 at 11 40 36@2x]

@nizzyabi nizzyabi marked this pull request as ready for review April 6, 2025 20:44
@nizzyabi
Collaborator

nizzyabi commented Apr 6, 2025

not ideal but we can improve on this over the coming days

@nizzyabi nizzyabi merged commit 3cc5385 into Mail-0:staging Apr 6, 2025
1 of 3 checks passed
nizzyabi added a commit that referenced this pull request Apr 6, 2025
* fix: show primary emails

* (review) add "All Mail" section to inbox switcher

* ph provider

* perf: switch from open ai to groq api

* chore: remove openai package

* chore: remove duplicated function

* chore: remove duplicated function

* feat: use system prompt from env

* perf: use base js fetch instead of better-fetch

* homepage open fix

* context

* cleanup

* cleanup

* feat: update categories logic

* cleanup

* cleanup

* feat: use nuqs

* action buttons

* feat: added mark as read functionality for mails

* added try and catch for better error handling

* better error than codeRabbit lmao

* hold to scroll down emails

* fix spacing on reply composer
;

* spacing

* tooltip to thread display bar for ux

* fix to reply composer

* forward emails button

* cleanup

* cleanup

* reply ring

* overflow

* cleanup

* un animate x

* New translations en.json (French)

* New translations en.json (Spanish)

* New translations en.json (Arabic)

* New translations en.json (Catalan)

* New translations en.json (Czech)

* New translations en.json (German)

* New translations en.json (Japanese)

* New translations en.json (Korean)

* New translations en.json (Polish)

* New translations en.json (Portuguese)

* New translations en.json (Russian)

* New translations en.json (Turkish)

* New translations en.json (Latvian)

* New translations en.json (Hindi)

* New translations en.json (Russian)

* New translations en.json (Latvian)

* cleanup

* loading state and padding to mail list to not overflow when scrolling

* onMouseDown

* fix: remove html from AI prompt (#598)

* fix: remove html from AI prompt

* refact: use cheerio util to remove html from email

* feat: add regex to ai-search

* chore: remove debug logs

* refactor: migrate from groq embeddings to openai for and improve email formatting

* fix: join waitlist button contrast (#603)

* fix: join waitlist button contrast

The old contrast was terrible in light mode, so now I just made it black so it is always readable.

* Revert "regenerating bun file"

This reverts parts of commit cef74f3.

* Add tooltips and i18n for editor MenuBar (#604)

* feat: add tooltips and i18n to editor MenuBar

* fix: incorrect translation key

---------

Co-authored-by: Sergio Vazquez <sergiovazag@gmail.com>
Co-authored-by: plyght <plyght@peril.lol>
Co-authored-by: Sergio <60497216+sergio-jva@users.noreply.github.com>
Co-authored-by: Nizzy <nizabizaher@gmail.com>
Co-authored-by: Vicentesan <vikom.sanchez@gmail.com>
Co-authored-by: user12224 <122770437+user12224@users.noreply.github.com>
Co-authored-by: Rajmeet <rajmeetchandok@gmail.com>
Co-authored-by: [bot] <zero@ibra.rip>
Co-authored-by: needle <122770437+needleXO@users.noreply.github.com>
Co-authored-by: Dominik Koch <dominik@koch-bautechnik.de>
Co-authored-by: YK <70700647+yaraslau-klimuk@users.noreply.github.com>