Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request for Spark! Tech Resources for Columbia Extradition #86

Open
jh000107 opened this issue Nov 20, 2024 · 2 comments
Open

Request for Spark! Tech Resources for Columbia Extradition #86

jh000107 opened this issue Nov 20, 2024 · 2 comments
Labels
tech resource request Request Tech Resources

Comments

@jh000107
Copy link

jh000107 commented Nov 20, 2024

Project Name

Columbia Extradition

Project Type

Data Science / Machine Learning

Team Members + Emails

Junhui Cho (jh00@bu.edu)

Detailed List of Resources Needed

OpenAI API Key

Description of Resource Usage

We are dealing with extremely unstructured data (hand typed) from judicial database, and we need to extract attorney information from more than 4000 cases. We concluded that utilizing a LLM would be our best choice to deal with inconsistencies and noisiness. We tried utilizing latest open-source models like llama3.2, but gpt-4o performed much better when manually tested on some of the cases.

Course Deadlines (if applicable)

Dec 05 2024

@funkyvoong

@jh000107 jh000107 added the tech resource request Request Tech Resources label Nov 20, 2024
@jh000107
Copy link
Author

It costs about $0.01568627451 on average per case using gpt-4o, and we have about 3400 cases left.

@funkyvoong
Copy link
Contributor

@jh000107 I added you to our OpenAI Org, please accept the invite. Let me know if there are any issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tech resource request Request Tech Resources
Projects
None yet
Development

No branches or pull requests

2 participants