v0.0.1 RC
matteocargnelutti committed Mar 7, 2024
0 parents commit 6ac0873
Showing 51 changed files with 4,056 additions and 0 deletions.
135 changes: 135 additions & 0 deletions .env.example
@@ -0,0 +1,135 @@
#-------------------------------------------------------------------------------
# LLM APIs settings
#-------------------------------------------------------------------------------
# NOTE:
# - OLAW can use both OpenAI and Ollama at the same time, but needs at least one of the two.
# - Ollama is one of the simplest ways to get started running models locally: https://ollama.ai/
OLLAMA_API_URL="http://localhost:11434"

#OPENAI_API_KEY=""
#OPENAI_ORG_ID=""

# NOTE: OPENAI_BASE_URL can be used to interact with OpenAI-compatible providers.
# For example:
# - https://huggingface.co/blog/tgi-messages-api
# - https://docs.vllm.ai/en/latest/getting_started/quickstart.html#using-openai-completions-api-with-vllm
# Make sure to specify both OPENAI_BASE_URL and OPENAI_COMPATIBLE_MODEL when doing so.
#OPENAI_BASE_URL=""
#OPENAI_COMPATIBLE_MODEL=""
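
The notes above state two constraints: at least one provider must be configured, and OPENAI_BASE_URL must be paired with OPENAI_COMPATIBLE_MODEL. A minimal sketch of how a startup check could enforce them follows. The function name `configured_providers` is hypothetical and is not taken from this commit, and treating a non-empty OPENAI_API_KEY as "OpenAI is configured" is an assumption.

```python
import os


def configured_providers(env=None):
    """Return the LLM providers that look usable, based on environment variables.

    Hypothetical helper: the variable names come from .env.example,
    the validation rules are assumptions drawn from its comments.
    """
    env = os.environ if env is None else env

    # OPENAI_BASE_URL and OPENAI_COMPATIBLE_MODEL are documented as a pair.
    base_url = env.get("OPENAI_BASE_URL")
    model = env.get("OPENAI_COMPATIBLE_MODEL")
    if bool(base_url) != bool(model):
        raise ValueError(
            "OPENAI_BASE_URL and OPENAI_COMPATIBLE_MODEL must be set together"
        )

    providers = []
    if env.get("OLLAMA_API_URL"):
        providers.append("ollama")
    if env.get("OPENAI_API_KEY"):  # Assumption: a key means OpenAI is available.
        providers.append("openai")

    # At least one of the two providers is required.
    if not providers:
        raise ValueError("Configure OLLAMA_API_URL and/or OPENAI_API_KEY")
    return providers
```

Passing a plain dictionary instead of relying on `os.environ` keeps the check easy to exercise without touching the real environment.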

#-------------------------------------------------------------------------------
# Basic Rate Limiting
#-------------------------------------------------------------------------------
# NOTE:
# - This set of variables allows for applying rate-limiting to individual API routes.
# - See https://flask-limiter.readthedocs.io/en/stable/ for details and syntax.
RATE_LIMIT_STORAGE_URI="memory://"
API_MODELS_RATE_LIMIT="1/second"
API_EXTRACT_SEARCH_STATEMENT_RATE_LIMIT="60 per 1 hour"
API_SEARCH_RATE_LIMIT="120 per 1 hour"
API_COMPLETE_RATE_LIMIT="60 per 1 hour"
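
The values above use two spellings of the same notation: `1/second` and `60 per 1 hour`. The sketch below only illustrates how such strings break down into a request count and a time window. It is not Flask-Limiter's own parser (the library handles this internally), the name `describe_rate_limit` is hypothetical, and only a few time units are covered.

```python
import re

# Seconds per supported unit (illustrative subset only).
_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}

# "<count> [per|/] [<multiple>] <unit>", e.g. "1/second" or "60 per 1 hour".
_RATE_LIMIT = re.compile(
    r"^\s*(\d+)\s*(?:/|per)\s*(\d+)?\s*(second|minute|hour|day)s?\s*$",
    re.IGNORECASE,
)


def describe_rate_limit(value):
    """Return (max_requests, window_in_seconds) for a rate limit string."""
    match = _RATE_LIMIT.match(value)
    if match is None:
        raise ValueError(f"Unrecognized rate limit: {value!r}")
    count = int(match.group(1))
    multiple = int(match.group(2) or 1)  # "per hour" means "per 1 hour".
    unit = match.group(3).lower()
    return count, multiple * _UNIT_SECONDS[unit]


print(describe_rate_limit("1/second"))       # (1, 1)
print(describe_rate_limit("60 per 1 hour"))  # (60, 3600)
```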

#-------------------------------------------------------------------------------
# Court Listener API settings
#-------------------------------------------------------------------------------
# NOTE: The chatbot can make calls to the Court Listener API to pull relevant court opinions.
COURT_LISTENER_MAX_RESULTS=4 # NOTE: To be adjusted based on the context length of the model used for inference.
COURT_LISTENER_API_URL="https://www.courtlistener.com/api/rest/v3/"
COURT_LISTENER_BASE_URL="https://www.courtlistener.com"
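
As a rough illustration of how these three settings could fit together, the sketch below builds a search URL from the API URL, caps the number of results, and turns a relative path into a full link using the base URL. The `search/` path, the `type` and `q` query parameters, and all function names are assumptions made for this example; none of them is confirmed by this commit. No network call is made.

```python
from urllib.parse import urlencode, urljoin

# Values copied from .env.example.
COURT_LISTENER_API_URL = "https://www.courtlistener.com/api/rest/v3/"
COURT_LISTENER_BASE_URL = "https://www.courtlistener.com"
COURT_LISTENER_MAX_RESULTS = 4


def build_search_url(search_statement):
    """Build a search URL (assumed endpoint: 'search/', assumed params: type, q)."""
    query = urlencode({"type": "o", "q": search_statement})
    return urljoin(COURT_LISTENER_API_URL, "search/") + "?" + query


def keep_top_results(results, max_results=COURT_LISTENER_MAX_RESULTS):
    """Keep only the first results so the context fits the model's window."""
    return results[:max_results]


def to_absolute_url(path):
    """Turn a site-relative path into a full link."""
    return urljoin(COURT_LISTENER_BASE_URL, path)


print(build_search_url('"Roe v. Wade"'))
```

Capping the results before they are injected into a prompt is the reason the setting is tied to the model's context length.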

#-------------------------------------------------------------------------------
# Extract Search Statement Prompt
#-------------------------------------------------------------------------------
# NOTE: This prompt is used to identify a legal question and make it into a search statement.
EXTRACT_SEARCH_STATEMENT_PROMPT="
Identify whether there is a legal question in the following message and, if so, transform it into a search statement.
If the legal question can be answered by searching case law:
- Follow the COURTLISTENER instructions to generate a search statement
If there are multiple questions, only consider the last one.
---
COURTLISTENER instructions:
Here are instructions on how to generate an effective search statement for that platform.
## Keywords
Identify and extract keywords from the question. If a term can be either singular or plural, use both (e.g. \"pony\" and \"ponies\").
Use quotation marks around proper nouns and terms that should not be broken up.
## Logical connectors
Separate the different keywords and parts of the search statement with logical connectors such as AND, OR, NOT.
## Dates and date ranges
If a date (or element of a date) is present in the question, you can add it to the search statement as such to define a range:
dateFiled:[YYYY-MM-DD TO YYYY-MM-DD]
If only the year is present, set MM and DD to 01 and 01.
If only the start year is present, assume the end date is the last day of that year.
Do not wrap dateFiled statements in parentheses.
## Name of cases
If the question features the name of a case, you can add it to the search statement as such:
caseName:(\"name of a case\")
Tip to recognize case names: they often feature v. or vs. As in: \"Roe v. Wade\".
## Name of court, state or jurisdiction
If the question features the name of a court or of a state, you can add it to the search statement as such:
court:(\"name of a court, state or jurisdiction\")
## Excluded terms
The following terms do not help make good search statements and MUST NOT be present in the search statement: law, laws, case, cases, precedent, precedents, adjudicated.
## Other fields available
dateFiled, caseName and court are the only fields you should use. Do not invent other fields. Everything else is a search term.
---
Return your response as a JSON object containing the following keys:
- search_statement: String representing the generated search statement. Is empty if the text does not contain a legal question.
- search_target: String representing the target API for that search statement. Can be \"courtlistener\" or empty.
Here is the message you need to analyze:
"

#-------------------------------------------------------------------------------
# Text Completion Prompts
#-------------------------------------------------------------------------------
# NOTE: {history} {rag} and {request} are reserved keywords.
TEXT_COMPLETION_BASE_PROMPT = "
{history}
You are a helpful and friendly AI legal assistant.
Your explanation of legal concepts should be easy to understand while still being accurate and detailed. Explain any legal jargon, and do not assume knowledge of any related concepts.
{rag}
Request: {request}
Helpful response (plain text, no markdown):
"

# NOTE: Injected into BASE prompt when relevant.
# Inspired by LangChain's default RAG prompt.
# {context} is a reserved keyword.
TEXT_COMPLETION_RAG_PROMPT = "
Here is context to help you fulfill the user's request:
{context}
----------------
When possible, use context to answer the request from the user.
Ignore context if it is empty or irrelevant.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Cite and quote your sources whenever possible. Use their number (for example: [1]) to reference them.
"

# NOTE: Injected into BASE prompt when relevant.
# NOTE: {history} is a reserved keyword
TEXT_COMPLETION_HISTORY_PROMPT = "
Here is a summary of the conversation thus far:
{history}
----------------
"
2 changes: 2 additions & 0 deletions .gitattributes
@@ -0,0 +1,2 @@
.env.example linguist-language=Shell
*.env.example linguist-language=Shell
Binary file added .github/screenshots/idle-01.png
Binary file added .github/screenshots/inspect-01.png
Binary file added .github/screenshots/inspect-02.png
Binary file added .github/screenshots/question-01.png
Binary file added .github/screenshots/question-02.png
Binary file added .github/screenshots/question-03.png
Binary file added .github/screenshots/question-04.png
Binary file added .github/screenshots/question-05.png
Binary file added .github/screenshots/question-06.png
Binary file added .github/screenshots/settings-01.png
9 changes: 9 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
.env
*.pyc
.DS_Store
chromadb/
test.py
TODO.md
runs/
_*/
*.zip
8 changes: 8 additions & 0 deletions .vscode/extensions.json
@@ -0,0 +1,8 @@
{
"recommendations": [
"ms-python.black-formatter",
"ms-python.flake8",
"tobermory.es6-string-html",
"standard.vscode-standard"
]
}
6 changes: 6 additions & 0 deletions .vscode/settings.json
@@ -0,0 +1,6 @@
{
"editor.formatOnSave": true,
"python.formatting.provider": "black",
"standard.enable": false,
"standard.autoFixOnSave": true
}
10 changes: 10 additions & 0 deletions CITATION.cff
@@ -0,0 +1,10 @@
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: Cargnelutti
given-names: Matteo
- family-names: Cushman
given-names: Jack
title: "Open Legal AI Workbench (OLAW)"
version: 0.0.1
date-released: 2024-03-06
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2024 Harvard Library Innovation Laboratory

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.