[bounty] speaker identification (next steps) #695

Open
louis030195 opened this issue Nov 18, 2024 · 10 comments
Labels
💎 Bounty, enhancement (New feature or request)

Comments

@louis030195
Collaborator

louis030195 commented Nov 18, 2024

WIP

also i'm curious how we could associate frames (and thus screen text) with a specific person (speaker), but this is lower priority

/bounty 200

(TBD: what exactly needs to be done)

louis030195 added the enhancement (New feature or request) label Nov 18, 2024
@louis030195
Collaborator Author

louis030195 commented Nov 18, 2024

[Screenshot 2024-11-18 at 1 33 07 PM]

kinda thinking about the google photos ui

i think we need something similar for speaker identification over long ranges, eg "listen to this voice, is it john? can you tell me the name?"

i bet they compute face embeddings, group them together, and tune based on user feedback
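
A minimal sketch of that idea applied to voices, assuming each audio segment already has an embedding vector. The `SpeakerGroups` class, its method names, and the 0.8 threshold are hypothetical and not part of screenpipe; this is not a description of how Google Photos works. It greedily groups embeddings by cosine similarity and lets the user attach a name to a group.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SpeakerGroups:
    """Hypothetical helper: greedy grouping of embeddings, named via user feedback."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed value, would need tuning on real data
        self.centroids = []         # one running-mean embedding per group
        self.counts = []            # number of embeddings merged into each group
        self.names = {}             # group index -> name confirmed by the user

    def assign(self, embedding):
        """Return the group index for an embedding, creating a new group if none is close."""
        best, best_sim = None, self.threshold
        for i, centroid in enumerate(self.centroids):
            sim = cosine(embedding, centroid)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            self.centroids.append(list(embedding))
            self.counts.append(1)
            return len(self.centroids) - 1
        # Fold the new embedding into the running mean of the matched group.
        n = self.counts[best]
        self.centroids[best] = [
            (c * n + e) / (n + 1) for c, e in zip(self.centroids[best], embedding)
        ]
        self.counts[best] = n + 1
        return best

    def confirm(self, group, name):
        """Record user feedback: this group is the named person."""
        self.names[group] = name

    def name_for(self, embedding):
        """Assign the embedding to a group and return its name; None means ask the user."""
        return self.names.get(self.assign(embedding))
```

With this shape, the "is it john?" prompt only needs to be shown when `name_for` returns `None`, and a single confirmation then covers every later segment that lands in the same group.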

@louis030195
Collaborator Author

@EzraEllette what do you mean by "Attempt to use LLM for identification through meeting context"

we already do this in meeting page


algora-pbc bot commented Nov 18, 2024

💎 $200 bounty • Screenpi.pe

Steps to solve:

  1. Start working: Comment /attempt #695 with your implementation plan
  2. Submit work: Create a pull request including /claim #695 in the PR body to claim the bounty
  3. Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

Thank you for contributing to mediar-ai/screenpipe!

| Attempt | Started (GMT+0) | Solution |
| --- | --- | --- |
| 🟢 @EzraEllette | Nov 18, 2024, 9:46:17 PM | WIP |

@EzraEllette
Contributor

> @EzraEllette what do you mean by "Attempt to use LLM for identification through meeting context"
>
> we already do this in meeting page

When this function is called, it should now be able to update speaker information in the database with user permission.
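
A rough sketch of that permission gate, assuming a SQLite table. The `speakers(id, name)` schema and the `update_speaker_name` function are hypothetical placeholders, not screenpipe's actual schema or API. The suggested name is written only when the user has approved it.

```python
import sqlite3


def update_speaker_name(conn, speaker_id, name, user_approved):
    """Write the suggested name only if the user approved; return True if a row changed."""
    if not user_approved:
        return False
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE speakers SET name = ? WHERE id = ?", (name, speaker_id)
        )
    return cur.rowcount == 1


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE speakers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO speakers (id, name) VALUES (1, NULL)")
```

Keeping the approval flag as an explicit argument means the caller (for example the meeting page) decides how to ask the user, while the database layer refuses to write without it.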

@EzraEllette
Contributor

EzraEllette commented Nov 18, 2024

/attempt #695

| Algora profile | Completed bounties | Tech |
| --- | --- | --- |
| @EzraEllette | 11 mediar-ai bounties | Rust, TypeScript, JavaScript & more |

@NicodemPL

Limitless uses emails/names from calendar events. Might be a good idea to implement calendar to meetings at this stage.

Other idea - OCR from meeting frames to find and confirm names. We already have this data (OCR)
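
One way the OCR idea could be sketched, assuming OCR text is available per frame with a timestamp and that speaker segments carry start and end times. The function names, input shapes, and the `known_names` list are all hypothetical. It counts how often each name is on screen while a given speaker is talking and proposes the most frequent one for the user to confirm.

```python
from collections import Counter


def candidate_names(ocr_frames, speaker_segments, known_names):
    """Count on-screen name occurrences during each speaker's segments.

    ocr_frames: list of (timestamp_seconds, ocr_text)
    speaker_segments: list of (speaker_id, start_seconds, end_seconds)
    known_names: names to look for (hypothetical input)
    """
    counts = {}
    for speaker_id, start, end in speaker_segments:
        counter = counts.setdefault(speaker_id, Counter())
        for ts, text in ocr_frames:
            if start <= ts <= end:
                lowered = text.lower()
                for name in known_names:
                    if name.lower() in lowered:
                        counter[name] += 1
    return counts


def best_guess(counts, speaker_id):
    """Most frequent on-screen name for a speaker, or None if nothing matched."""
    counter = counts.get(speaker_id)
    if not counter:
        return None
    return counter.most_common(1)[0][0]
```

Plain co-occurrence is a weak signal, since meeting apps often show every participant's name at once, so the result is best treated as a suggestion to confirm rather than an identification.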

@louis030195
Collaborator Author

louis030195 commented Nov 19, 2024

> Limitless uses emails/names from calendar events. Might be a good idea to implement calendar to meetings at this stage.
>
> Other idea - OCR from meeting frames to find and confirm names. We already have this data (OCR)

yes to 2

the screen (and mic) is the universal interface that contains all apps - we do not need any integration if we just use pixels and LLMs well

let's try without calendar first and if it's too hard we can think about it but it involves many things (auth, cloud API, db etc) that are not core to screenpipe

@EzraEllette
Contributor

3 participants