70 commits
218ddfc
Initial working release notes generator
benjaminmah Jan 23, 2025
51e7f34
Fixed prompt and list generation
benjaminmah Jan 24, 2025
6af501b
Fixed prompt and excluded Nightly
benjaminmah Jan 27, 2025
98ea3a9
Added duplicate remover
benjaminmah Jan 28, 2025
a8e9880
Added additional filtering
benjaminmah Jan 28, 2025
afd0d2c
Fixed prompt to clean
benjaminmah Jan 31, 2025
ce3bdce
Added extra conversation
benjaminmah Feb 1, 2025
bc1dd9c
New prompt
benjaminmah Feb 5, 2025
6ba5b88
Changed prompt
benjaminmah Feb 5, 2025
be742b1
Made prompt more strict
benjaminmah Feb 7, 2025
1dacebf
Fixed prompt and increased chunk size
benjaminmah Feb 10, 2025
4f0e081
Removed asterisks
benjaminmah Feb 10, 2025
821790c
Changed version
benjaminmah Feb 11, 2025
34e366a
Added bug filtering for webextensions
benjaminmah Feb 13, 2025
27204d1
Edited prompt
benjaminmah Feb 19, 2025
2033b89
Separated release notes into a runner and tool, updated the method to…
benjaminmah Feb 21, 2025
7638d2a
Fixed up runner to take in only one version
benjaminmah Feb 24, 2025
d9f6831
Moved version to the function
benjaminmah Feb 25, 2025
93a5982
Fixed release notes script to make use of URL instead of local repo
benjaminmah Mar 3, 2025
2b702fe
Removed old script
benjaminmah Mar 3, 2025
35ad073
Removed HTML parsing with json
benjaminmah Mar 4, 2025
2d0030c
Removed .get and response 200
benjaminmah Mar 5, 2025
fbb3c30
Made input and output list instead of string
benjaminmah Mar 5, 2025
3d6f4d0
Using LangChain
benjaminmah Mar 5, 2025
c220bb3
Using data.values()
benjaminmah Mar 5, 2025
1bf92b2
Added LLMChain
benjaminmah Mar 5, 2025
06af4d8
Cleaned up code
benjaminmah Mar 6, 2025
a14ad87
Added typings
benjaminmah Mar 6, 2025
20d0b6e
Removed OpenAI
benjaminmah Mar 6, 2025
b48089f
Changed type hints from List to list
benjaminmah Mar 6, 2025
5bd61ad
Removed regex search for bug id
benjaminmah Mar 6, 2025
2dddedb
Replaced token chunking with commit chunking
benjaminmah Mar 6, 2025
3572bd8
Changed chunk param to commit chunk
benjaminmah Mar 6, 2025
020fed3
Renamed functions
benjaminmah Mar 6, 2025
1c5cbe2
Fixed variable names
benjaminmah Mar 6, 2025
0d173ad
Changed to generator
benjaminmah Mar 7, 2025
030d705
Removed shortlist_with_gpt function
benjaminmah Mar 14, 2025
6326f66
Simplified filtering irrelevant commits
benjaminmah Mar 14, 2025
e25aad5
Removed refining shortlist function
benjaminmah Mar 14, 2025
6418551
Added author filtering
benjaminmah Mar 14, 2025
4140c52
Added generative_model_tool
benjaminmah Mar 14, 2025
b10f809
Fixed up code
benjaminmah Mar 14, 2025
c6eafb8
Generalized previous version function
benjaminmah Mar 14, 2025
1191215
Removed explicit llm arg
benjaminmah Mar 21, 2025
51d6d9f
Replaced regex with inequality
benjaminmah Mar 21, 2025
69af386
Added ignore commit list and specific component/product ignore list
benjaminmah Mar 21, 2025
88cf631
Addressed PR comments
benjaminmah Mar 21, 2025
2c0a3ce
Converted list to set
benjaminmah Mar 24, 2025
66dd826
Added test for previous version
benjaminmah Mar 24, 2025
f177f16
Fixed test to not require downloading DB
benjaminmah Mar 26, 2025
1946dca
Initial cloud function
benjaminmah Apr 7, 2025
75848d9
Moved cloud function file to functions folder
benjaminmah Apr 7, 2025
284c6f2
Added requirements
benjaminmah Apr 7, 2025
89bac35
Fixed args
benjaminmah Apr 7, 2025
ff62313
Fixed args
benjaminmah Apr 7, 2025
4a042fc
Added workflow to deploy
benjaminmah Apr 8, 2025
fbb46ad
Moved workflow file and fixed to trigger every tag rather than every …
benjaminmah Apr 10, 2025
2c5d73c
Addressed PR comments
benjaminmah Apr 10, 2025
bf239d0
Addressed PR comments
benjaminmah Apr 10, 2025
c79890e
Addressed PR comments
benjaminmah Apr 10, 2025
b72d217
Added explicit deduplication
benjaminmah Apr 10, 2025
3e9c7f7
Hard coded llm name and chunk size
benjaminmah Apr 14, 2025
0da3e8a
Changed output to be a list and JSON
benjaminmah Apr 15, 2025
1d6ecc6
Addressed PR comments
benjaminmah Apr 16, 2025
9b07b5b
Simplified LLM creation
benjaminmah Apr 16, 2025
e852e9d
Replaced DB with Bugzilla calls
benjaminmah Apr 17, 2025
11a6444
Addressed PR comments
benjaminmah Apr 23, 2025
aebee0a
Addressed PR comments
benjaminmah Apr 27, 2025
38499c3
Changed input to have channel and release separately
benjaminmah Apr 29, 2025
2187aab
Removed test and function
benjaminmah Apr 29, 2025
38 changes: 38 additions & 0 deletions .github/workflows/release_notes.yml
@@ -0,0 +1,38 @@
name: Deploy Release Notes Function

on: workflow_dispatch

jobs:
  deploy:
    runs-on: ubuntu-latest

    permissions:
      contents: read
      id-token: write

    steps:
      - uses: actions/checkout@v4

      - name: Google Cloud Auth
        id: auth
        uses: google-github-actions/auth@v2
        with:
          credentials_json: ${{ secrets.GCP_SA_CREDENTIALS }}

      - name: Set up gcloud
        uses: google-github-actions/setup-gcloud@v2

      - name: Deploy to Cloud Functions
        working-directory: functions/release_notes
        run: |
          gcloud functions deploy release-notes \
            --gen2 \
            --trigger-http \
            --allow-unauthenticated \
            --region=us-central1 \
            --timeout=240 \
            --memory=2Gi \
            --runtime=python311 \
            --entry-point=handle_release_notes \
            --service-account=review-helper@moz-bugbug.iam.gserviceaccount.com \
            --set-secrets=OPENAI_API_KEY=openai-api-key:latest
259 changes: 259 additions & 0 deletions bugbug/tools/release_notes.py
@@ -0,0 +1,259 @@
import logging
import re
from itertools import batched
from typing import Iterator, Optional

import requests
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from libmozdata.bugzilla import Bugzilla

KEYWORDS_TO_REMOVE = [
    "Backed out",
    "a=testonly",
    "DONTBUILD",
    "add tests",
    "disable test",
    "back out",
    "backout",
    "add test",
    "added test",
    "ignore-this-changeset",
    "CLOSED TREE",
    "nightly",
]

PRODUCT_OR_COMPONENT_TO_IGNORE = [
    "Firefox Build System::Task Configuration",
    "Developer Infrastructure::",
]


def fetch_bug_components(bug_ids: list[int]) -> dict[int, str]:
    bug_id_to_component = {}

    def bug_handler(bug):
        bug_id_to_component[bug["id"]] = f"{bug['product']}::{bug['component']}"

    Bugzilla(
        bugids=bug_ids,
        include_fields=["id", "product", "component"],
        bughandler=bug_handler,
    ).wait()

    return bug_id_to_component


logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


class ReleaseNotesCommitsSelector:
    def __init__(self, chunk_size: int, llm: LLMChain):
        self.chunk_size = chunk_size
        self.bug_id_to_component: dict[int, str] = {}
        self.llm = llm
        self.summarization_prompt = PromptTemplate(
            input_variables=["commit_list"],
template="""You are an expert in writing Firefox release notes. Your task is to analyze a list of commits and identify important user-facing changes. Follow these steps:

1. Must Include Only Meaningful Changes:
- Only keep commits that significantly impact users and are strictly user-facing, such as:
- New features
- UI changes
- Major performance improvements
- Security patches (if user-facing)
- Web platform changes that affect how websites behave
- DO NOT include:
- Small bug fixes unless critical
- Internal code refactoring
- Test changes or documentation updates
- Developer tooling or CI/CD pipeline changes
Again, only include changes that are STRICTLY USER-FACING.

2. Output Format:
- Use simple, non-technical language suitable for release notes.
- Use the following strict format for each relevant commit, in CSV FORMAT:
[Type of Change],Description of the change,Bug XXXX,Reason why the change is impactful for end users
- Possible types of change: [Feature], [Fix], [Performance], [Security], [UI], [DevTools], [Web Platform], etc.

3. Be Aggressive in Filtering:
- If you're unsure whether a commit impacts end users, EXCLUDE it.
- Do not list developer-focused changes.

4. Select Only the Top 10 Commits:
- If there are more than 10 relevant commits, choose the most impactful ones.

5. Output Requirements:
- Output must be raw CSV text—no formatting, no extra text.
- Do not wrap the output in triple backticks (` ``` `) or use markdown formatting.
- Do not include the words "CSV" or any headers—just the data.

6. Input:
Here is the list of commits you need to focus on:
{commit_list}
""",
        )

        self.summarization_chain = LLMChain(
            llm=self.llm,
            prompt=self.summarization_prompt,
        )

        self.cleanup_prompt = PromptTemplate(
            input_variables=["combined_list"],
template="""Review the following list of release notes and remove anything that is not worthy of official release notes. Keep only changes that are meaningful, impactful, and directly relevant to end users, such as:
- New features that users will notice and interact with.
- Significant fixes that resolve major user-facing issues.
- Performance improvements that make a clear difference in speed or responsiveness.
- Accessibility enhancements that improve usability for a broad set of users.
- Critical security updates that protect users from vulnerabilities.

Strict Filtering Criteria - REMOVE the following:
- Overly technical web platform changes (e.g., spec compliance tweaks, behind-the-scenes API adjustments).
- Developer-facing features that have no direct user impact.
- Minor UI refinements (e.g., button width adjustments, small animation tweaks).
- Bug fixes that don’t impact most users.
- Obscure web compatibility changes that apply only to edge-case websites.
- Duplicate entries or similar changes that were already listed.

Instructions:
- KEEP THE SAME FORMAT (do not change the structure of entries that remain).
- REMOVE UNWORTHY ENTRIES ENTIRELY (do not rewrite them—just delete).
- DO NOT ADD ANY TEXT BEFORE OR AFTER THE LIST.
- The output must be only the cleaned-up list, formatted exactly the same way.

Here is the list to filter:
{combined_list}
""",
        )

        self.cleanup_chain = LLMChain(
            llm=self.llm,
            prompt=self.cleanup_prompt,
        )

    def batch_commit_logs(self, commit_log: str) -> list[str]:
        return [
            "\n".join(batch)
            for batch in batched(commit_log.strip().split("\n"), self.chunk_size)
        ]

    def generate_commit_shortlist(self, commit_log_list: list[str]) -> list[str]:
        commit_log_list_combined = "\n".join(commit_log_list)
        chunks = self.batch_commit_logs(commit_log_list_combined)
        return [
            self.summarization_chain.run({"commit_list": chunk}).strip()
            for chunk in chunks
        ]

    def filter_irrelevant_commits(self, commit_log_list: list[dict]) -> Iterator[str]:
        ignore_revs_url = "https://hg.mozilla.org/mozilla-central/raw-file/tip/.hg-annotate-ignore-revs"
        response = requests.get(ignore_revs_url)
        response.raise_for_status()
        raw_commits_to_ignore = response.text.strip().splitlines()
        hashes_to_ignore = {
            line.split(" ", 1)[0]
            for line in raw_commits_to_ignore
            if re.search(r"Bug \d+", line, re.IGNORECASE)
        }

        for commit in commit_log_list:
            desc = commit["desc"]
            author = commit["author"]
            node = commit["node"]
            bug_id = commit["bug_id"]

            if (
                not any(
                    keyword.lower() in desc.lower() for keyword in KEYWORDS_TO_REMOVE
                )
                and bug_id
                and re.search(r"\br=[^\s,]+", desc)
                and author
                != "Mozilla Releng Treescript <release+treescript@mozilla.org>"
                and node not in hashes_to_ignore
            ):
                bug_component = self.bug_id_to_component.get(bug_id)
                if bug_component and any(
                    to_ignore in bug_component
                    for to_ignore in PRODUCT_OR_COMPONENT_TO_IGNORE
                ):
                    continue
                yield desc

    def get_commit_logs(
        self, target_release: int, channel: str
    ) -> Optional[list[dict]]:
        preceding_release = target_release - 1

        target_version = f"FIREFOX_{channel}_{target_release}_BASE".upper()
        preceding_version = f"FIREFOX_{channel}_{preceding_release}_BASE".upper()

        url = f"https://hg.mozilla.org/releases/mozilla-{channel.lower()}/json-pushes?fromchange={preceding_version}&tochange={target_version}&full=1"
        response = requests.get(url)
        response.raise_for_status()
        data = response.json()
        commit_log_list = []
        for push_data in data.values():
            for changeset in push_data["changesets"]:
                if "desc" in changeset and changeset["desc"].strip():
                    desc = changeset["desc"].strip()
                    author = changeset.get("author", "").strip()
                    node = changeset.get("node", "").strip()
                    match = re.search(r"Bug (\d+)", desc, re.IGNORECASE)
                    bug_id = int(match.group(1)) if match else None
                    commit_log_list.append(
                        {
                            "desc": desc,
                            "author": author,
                            "node": node,
                            "bug_id": bug_id,
                        }
                    )
        return commit_log_list if commit_log_list else None

    def remove_duplicate_bugs(self, csv_text: str) -> str:
        seen = set()
        unique_lines = []
        for line in csv_text.strip().splitlines():
            parts = line.split(",", 3)
            if len(parts) < 3:
                continue
            bug_id = parts[2].strip()
            if bug_id not in seen:
                seen.add(bug_id)
                unique_lines.append(line)
        return "\n".join(unique_lines)

    def get_final_release_notes_commits(
        self, target_release: int, channel: str
    ) -> Optional[list[str]]:
        logger.info(
            f"Generating commit shortlist for release {target_release} in channel {channel}"
        )
        commit_log_list = self.get_commit_logs(
            target_release=target_release, channel=channel
        )

        if not commit_log_list:
            return None

        bug_ids = [commit["bug_id"] for commit in commit_log_list if commit["bug_id"]]

        self.bug_id_to_component = fetch_bug_components(bug_ids)
        filtered_commits = list(self.filter_irrelevant_commits(commit_log_list))

        if not filtered_commits:
            return None

        commit_shortlist = self.generate_commit_shortlist(filtered_commits)

        if not commit_shortlist:
            return None

        combined_list = "\n".join(commit_shortlist)
        cleaned = self.cleanup_chain.run({"combined_list": combined_list}).strip()

        deduped = self.remove_duplicate_bugs(cleaned)
        return deduped.splitlines()
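The non-LLM steps of `ReleaseNotesCommitsSelector` can be exercised on their own. The sketch below is a rough, standalone approximation, not the code under review: `pushlog_url`, `chunked`, and `dedupe_by_bug` are hypothetical helper names that mirror `get_commit_logs`, `batch_commit_logs`, and `remove_duplicate_bugs`. `chunked` is an assumed stand-in for `itertools.batched`, which exists only from Python 3.12 onward, while the workflow above deploys with `--runtime=python311`.

```python
from itertools import islice
from typing import Iterable, Iterator


def pushlog_url(target_release: int, channel: str) -> str:
    """Build the json-pushes URL the same way get_commit_logs does."""
    previous = f"FIREFOX_{channel}_{target_release - 1}_BASE".upper()
    target = f"FIREFOX_{channel}_{target_release}_BASE".upper()
    return (
        f"https://hg.mozilla.org/releases/mozilla-{channel.lower()}/json-pushes"
        f"?fromchange={previous}&tochange={target}&full=1"
    )


def chunked(items: Iterable[str], size: int) -> Iterator[list[str]]:
    """Hypothetical stand-in for itertools.batched that also runs on 3.11."""
    iterator = iter(items)
    # islice pulls at most `size` items; an empty list ends the loop.
    while chunk := list(islice(iterator, size)):
        yield chunk


def dedupe_by_bug(rows: list[str]) -> list[str]:
    """Keep the first row per bug field, as remove_duplicate_bugs does."""
    seen: set[str] = set()
    kept = []
    for row in rows:
        parts = row.split(",", 3)
        if len(parts) < 3:
            continue  # rows with fewer than three fields are dropped
        bug = parts[2].strip()
        if bug not in seen:
            seen.add(bug)
            kept.append(row)
    return kept


if __name__ == "__main__":
    print(pushlog_url(137, "beta"))
    print(list(chunked(["commit a", "commit b", "commit c"], 2)))
    print(dedupe_by_bug(["[UI],New tab page,Bug 1,Visible to all users",
                         "[Fix],New tab page fix,Bug 1,Same bug again"]))
```

One consequence of splitting on plain commas: if the model emits a description that itself contains a comma, the third field is no longer the bug reference, so deduplication keys on the wrong text. Parsing with the `csv` module and asking the model to quote fields would be one way to tighten this, if that turns out to matter in practice.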
33 changes: 33 additions & 0 deletions functions/release_notes/main.py
@@ -0,0 +1,33 @@
import flask
import functions_framework

from bugbug import generative_model_tool
from bugbug.tools.release_notes import ReleaseNotesCommitsSelector

tool: ReleaseNotesCommitsSelector | None = None

DEFAULT_CHUNK_SIZE = 1000


@functions_framework.http
def handle_release_notes(request: flask.Request):
    global tool

    if request.method != "GET":
        return "Only GET requests are allowed", 405

    # type=int yields None for a missing or non-numeric value instead of raising.
    release = request.args.get("release", type=int)
    channel = request.args.get("channel")

    if not release or not channel:
        return "Missing 'release' or 'channel' query parameter", 400

    if tool is None:
        llm = generative_model_tool.create_openai_llm()
        tool = ReleaseNotesCommitsSelector(chunk_size=DEFAULT_CHUNK_SIZE, llm=llm)

    commit_list = tool.get_final_release_notes_commits(
        target_release=release, channel=channel
    )

    return {"commits": commit_list}, 200
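For reviewers who want to poke at the handler, a minimal client-side sketch follows. It only builds the GET URL that `handle_release_notes` expects (`release` as an integer, `channel` as a string). `BASE_URL` and `release_notes_url` are hypothetical; the real endpoint depends on the project and region the function is deployed to.

```python
from urllib.parse import urlencode

# Hypothetical placeholder, not the deployed endpoint.
BASE_URL = "https://example.invalid/release-notes"


def release_notes_url(release: int, channel: str, base_url: str = BASE_URL) -> str:
    """Build the query string the handler reads: ?release=<int>&channel=<name>."""
    query = urlencode({"release": release, "channel": channel})
    return f"{base_url}?{query}"


if __name__ == "__main__":
    print(release_notes_url(137, "beta"))
```

Going by the handler above, a successful call should return JSON of the form `{"commits": [...]}`, with `null` in place of the list when no commits survive filtering.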
3 changes: 3 additions & 0 deletions functions/release_notes/requirements.txt
@@ -0,0 +1,3 @@
bugbug==0.0.573
Flask==2.2.5
functions-framework==3.5.0
28 changes: 28 additions & 0 deletions scripts/release_notes_runner.py
@@ -0,0 +1,28 @@
import argparse
import logging

from bugbug import generative_model_tool
from bugbug.tools.release_notes import ReleaseNotesCommitsSelector

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def main():
    parser = argparse.ArgumentParser(description="Generate Firefox release notes.")
    generative_model_tool.create_llm_to_args(parser)
    # get_final_release_notes_commits takes the release number and channel separately.
    parser.add_argument(
        "--release", type=int, required=True, help="Target release number"
    )
    parser.add_argument("--channel", required=True, help="Release channel")
    parser.add_argument(
        "--chunk-size", type=int, default=100, help="Number of commits per chunk"
    )

    args = parser.parse_args()
    llm = generative_model_tool.create_llm_from_args(args)

    selector = ReleaseNotesCommitsSelector(chunk_size=args.chunk_size, llm=llm)
    results = selector.get_final_release_notes_commits(
        target_release=args.release, channel=args.channel
    )
    print(results)


if __name__ == "__main__":
    main()