Release notes generator script #4866
Open
benjaminmah wants to merge 70 commits into mozilla:master from benjaminmah:release-notes
+361
−0
Changes from 63 commits
Commits
218ddfc Initial working release notes generator (benjaminmah)
51e7f34 Fixed prompt and list generation (benjaminmah)
6af501b Fixed prompt and excluded Nightly (benjaminmah)
98ea3a9 Added duplicate remover (benjaminmah)
a8e9880 Added additional filtering (benjaminmah)
afd0d2c Fixed prompt to clean (benjaminmah)
ce3bdce Added extra conversation (benjaminmah)
bc1dd9c New prompt (benjaminmah)
6ba5b88 Changed prompt (benjaminmah)
be742b1 Made prompt more strict (benjaminmah)
1dacebf Fixed prompt and increased chunk size (benjaminmah)
4f0e081 Removed asterisks (benjaminmah)
821790c Changed version (benjaminmah)
34e366a Added bug filtering for webextensions (benjaminmah)
27204d1 Edited prompt (benjaminmah)
2033b89 Separated release notes into a runner and tool, updated the method to… (benjaminmah)
7638d2a Fixed up runner to take in only one version (benjaminmah)
d9f6831 Moved version to the function (benjaminmah)
93a5982 Fixed release notes script to make use of URL instead of local repo (benjaminmah)
2b702fe Removed old script (benjaminmah)
35ad073 Removed HTML parsing with json (benjaminmah)
2d0030c Removed .get and response 200 (benjaminmah)
fbb3c30 Made input and output list instead of string (benjaminmah)
3d6f4d0 Using LangChain (benjaminmah)
c220bb3 Using data.values() (benjaminmah)
1bf92b2 Added LLMChain (benjaminmah)
06af4d8 Cleaned up code (benjaminmah)
a14ad87 Added typings (benjaminmah)
20d0b6e Removed OpenAI (benjaminmah)
b48089f Changed type hints from List to list (benjaminmah)
5bd61ad Removed regex search for bug id (benjaminmah)
2dddedb Replaced token chunking with commit chunking (benjaminmah)
3572bd8 Changed chunk param to commit chunk (benjaminmah)
020fed3 Renamed functions (benjaminmah)
1c5cbe2 Fixed variable names (benjaminmah)
0d173ad Changed to generator (benjaminmah)
030d705 Removed shortlist_with_gpt function (benjaminmah)
6326f66 Simplified filtering irrelevant commits (benjaminmah)
e25aad5 Removed refining shortlist function (benjaminmah)
6418551 Added author filtering (benjaminmah)
4140c52 Added generative_model_tool (benjaminmah)
b10f809 Fixed up code (benjaminmah)
c6eafb8 Generalized previous version function (benjaminmah)
1191215 Removed explicit llm arg (benjaminmah)
51d6d9f Replaced regex with inequality (benjaminmah)
69af386 Added ignore commit list and specific component/product ignore list (benjaminmah)
88cf631 Addressed PR comments (benjaminmah)
2c0a3ce Converted list to set (benjaminmah)
66dd826 Added test for previous version (benjaminmah)
f177f16 Fixed test to not require downloading DB (benjaminmah)
1946dca Initial cloud function (benjaminmah)
75848d9 Moved cloud function file to functions folder (benjaminmah)
284c6f2 Added requirements (benjaminmah)
89bac35 Fixed args (benjaminmah)
ff62313 Fixed args (benjaminmah)
4a042fc Added workflow to deploy (benjaminmah)
fbb46ad Moved workflow file and fixed to trigger every tag rather than every … (benjaminmah)
2c5d73c Addressed PR comments (benjaminmah)
bf239d0 Addressed PR comments (benjaminmah)
c79890e Addressed PR comments (benjaminmah)
b72d217 Added explicit deduplication (benjaminmah)
3e9c7f7 Hard coded llm name and chunk size (benjaminmah)
0da3e8a Changed output to be a list and JSON (benjaminmah)
1d6ecc6 Addressed PR comments (benjaminmah)
9b07b5b Simplified LLM creation (benjaminmah)
e852e9d Replaced DB with Bugzilla calls (benjaminmah)
11a6444 Addressed PR comments (benjaminmah)
aebee0a Addressed PR comments (benjaminmah)
38499c3 Changed input to have channel and release separately (benjaminmah)
2187aab Removed test and function (benjaminmah)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,38 @@ | ||
| name: Deploy Release Notes Function | ||
|
|
||
| on: workflow_dispatch | ||
|
|
||
| jobs: | ||
| deploy: | ||
| runs-on: ubuntu-latest | ||
|
|
||
| permissions: | ||
| contents: read | ||
| id-token: write | ||
|
|
||
| steps: | ||
| - uses: actions/checkout@v4 | ||
|
|
||
| - name: Google Cloud Auth | ||
| id: auth | ||
| uses: google-github-actions/auth@v2 | ||
| with: | ||
| credentials_json: ${{ secrets.GCP_SA_CREDENTIALS }} | ||
|
|
||
| - name: Set up gcloud | ||
| uses: google-github-actions/setup-gcloud@v2 | ||
|
|
||
| - name: Deploy to Cloud Functions | ||
| working-directory: functions/release_notes | ||
| run: | | ||
| gcloud functions deploy release-notes \ | ||
| --gen2 \ | ||
| --trigger-http \ | ||
| --allow-unauthenticated \ | ||
| --region=us-central1 \ | ||
| --timeout=240 \ | ||
| --memory=2Gi \ | ||
| --runtime=python311 \ | ||
| --entry-point=handle_release_notes \ | ||
| --service-account=review-helper@moz-bugbug.iam.gserviceaccount.com \ | ||
| --set-secrets=OPENAI_API_KEY=openai-api-key:latest |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,250 @@ | ||
| import logging | ||
| import re | ||
| from itertools import batched | ||
| from typing import Generator, Optional | ||
|
|
||
| import requests | ||
| from langchain.chains import LLMChain | ||
| from langchain.prompts import PromptTemplate | ||
|
|
||
| from bugbug import bugzilla, db | ||
|
|
||
| KEYWORDS_TO_REMOVE = [ | ||
| "Backed out", | ||
| "a=testonly", | ||
| "DONTBUILD", | ||
| "add tests", | ||
| "disable test", | ||
| "back out", | ||
| "backout", | ||
| "add test", | ||
| "added test", | ||
| "ignore-this-changeset", | ||
| "CLOSED TREE", | ||
| "nightly", | ||
| ] | ||
|
|
||
| PRODUCT_OR_COMPONENT_TO_IGNORE = [ | ||
| "Firefox Build System::Task Configuration", | ||
| "Developer Infrastructure::", | ||
| ] | ||
|
|
||
|
|
||
| def get_previous_version(current_version: str) -> str: | ||
| match = re.search(r"(\d+)", current_version) | ||
| if not match: | ||
| raise ValueError("No number found in the version string") | ||
|
|
||
| number = match.group(0) | ||
| decremented_number = str(int(number) - 1) | ||
| return ( | ||
| current_version[: match.start()] | ||
| + decremented_number | ||
| + current_version[match.end() :] | ||
| ) | ||
|
|
||
|
|
||
| logging.basicConfig(level=logging.INFO) | ||
| logger = logging.getLogger(__name__) | ||
|
|
||
|
|
||
| class ReleaseNotesCommitsSelector: | ||
| def __init__(self, chunk_size: int, llm: LLMChain): | ||
| self.chunk_size = chunk_size | ||
| self.bug_id_to_component = {} | ||
| db.download(bugzilla.BUGS_DB) | ||
| for bug in bugzilla.get_bugs(): | ||
| self.bug_id_to_component[ | ||
| bug["id"] | ||
| ] = f"{bug['product']}::{bug['component']}" | ||
| self.llm = llm | ||
| self.summarization_prompt = PromptTemplate( | ||
| input_variables=["input_text"], | ||
| template="""You are an expert in writing Firefox release notes. Your task is to analyze a list of commits and identify important user-facing changes. Follow these steps: | ||
| 1. Must Include Only Meaningful Changes: | ||
| - Only keep commits that significantly impact users and are strictly user-facing, such as: | ||
| - New features | ||
| - UI changes | ||
| - Major performance improvements | ||
| - Security patches (if user-facing) | ||
| - Web platform changes that affect how websites behave | ||
| - DO NOT include: | ||
| - Small bug fixes unless critical | ||
| - Internal code refactoring | ||
| - Test changes or documentation updates | ||
| - Developer tooling or CI/CD pipeline changes | ||
| Again, only include changes that are STRICTLY USER-FACING. | ||
| 2. Output Format: | ||
| - Use simple, non-technical language suitable for release notes. | ||
| - Use the following strict format for each relevant commit, in CSV FORMAT: | ||
| [Type of Change],Description of the change,Bug XXXX,Reason why the change is impactful for end users | ||
| - Possible types of change: [Feature], [Fix], [Performance], [Security], [UI], [DevTools], [Web Platform], etc. | ||
| 3. Be Aggressive in Filtering: | ||
| - If you're unsure whether a commit impacts end users, EXCLUDE it. | ||
| - Do not list developer-focused changes. | ||
| 4. Select Only the Top 10 Commits: | ||
| - If there are more than 10 relevant commits, choose the most impactful ones. | ||
| 5. Input: | ||
| Here is the chunk of commit logs you need to focus on: | ||
| {input_text} | ||
| 6. Output Requirements: | ||
| - Output must be raw CSV text—no formatting, no extra text. | ||
| - Do not wrap the output in triple backticks (` ``` `) or use markdown formatting. | ||
| - Do not include the words "CSV" or any headers—just the data. | ||
| """, | ||
| ) | ||
|
|
||
| self.summarization_chain = LLMChain( | ||
| llm=self.llm, | ||
| prompt=self.summarization_prompt, | ||
| ) | ||
|
|
||
| self.cleanup_prompt = PromptTemplate( | ||
| input_variables=["combined_list"], | ||
| template="""Review the following list of release notes and remove anything that is not worthy of official release notes. Keep only changes that are meaningful, impactful, and directly relevant to end users, such as: | ||
| - New features that users will notice and interact with. | ||
| - Significant fixes that resolve major user-facing issues. | ||
| - Performance improvements that make a clear difference in speed or responsiveness. | ||
| - Accessibility enhancements that improve usability for a broad set of users. | ||
| - Critical security updates that protect users from vulnerabilities. | ||
| Strict Filtering Criteria - REMOVE the following: | ||
| - Overly technical web platform changes (e.g., spec compliance tweaks, behind-the-scenes API adjustments). | ||
| - Developer-facing features that have no direct user impact. | ||
| - Minor UI refinements (e.g., button width adjustments, small animation tweaks). | ||
| - Bug fixes that don’t impact most users. | ||
| - Obscure web compatibility changes that apply only to edge-case websites. | ||
| - Duplicate entries or similar changes that were already listed. | ||
| Here is the list to filter: | ||
| {combined_list} | ||
| Instructions: | ||
| - KEEP THE SAME FORMAT (do not change the structure of entries that remain). | ||
| - REMOVE UNWORTHY ENTRIES ENTIRELY (do not rewrite them—just delete). | ||
| - DO NOT ADD ANY TEXT BEFORE OR AFTER THE LIST. | ||
| - The output must be only the cleaned-up list, formatted exactly the same way. | ||
| """, | ||
| ) | ||
|
|
||
| self.cleanup_chain = LLMChain( | ||
| llm=self.llm, | ||
| prompt=self.cleanup_prompt, | ||
| ) | ||
|
|
||
| def batch_commit_logs(self, commit_log: str) -> list[str]: | ||
| return [ | ||
| "\n".join(batch) | ||
| for batch in batched(commit_log.strip().split("\n"), self.chunk_size) | ||
| ] | ||
|
|
||
| def generate_commit_shortlist(self, commit_log_list: list[str]) -> list[str]: | ||
| commit_log_list_combined = "\n".join(commit_log_list) | ||
| chunks = self.batch_commit_logs(commit_log_list_combined) | ||
| return [ | ||
| self.summarization_chain.run({"input_text": chunk}).strip() | ||
| for chunk in chunks | ||
| ] | ||
|
|
||
| def filter_irrelevant_commits( | ||
| self, commit_log_list: list[tuple[str, str, str]] | ||
| ) -> Generator[str, None, None]: | ||
| ignore_revs_url = "https://hg.mozilla.org/mozilla-central/raw-file/tip/.hg-annotate-ignore-revs" | ||
| response = requests.get(ignore_revs_url) | ||
| response.raise_for_status() | ||
| raw_commits_to_ignore = response.text.strip().splitlines() | ||
| hashes_to_ignore = { | ||
| line.split(" ", 1)[0] | ||
| for line in raw_commits_to_ignore | ||
| if re.search(r"Bug \d+", line, re.IGNORECASE) | ||
| } | ||
|
|
||
| for desc, author, node in commit_log_list: | ||
| bug_match = re.search(r"(Bug (\d+).*)", desc, re.IGNORECASE) | ||
| if ( | ||
| not any( | ||
| keyword.lower() in desc.lower() for keyword in KEYWORDS_TO_REMOVE | ||
| ) | ||
| and bug_match | ||
| and re.search(r"\br=[^\s,]+", desc) | ||
| and author | ||
| != "Mozilla Releng Treescript <[email protected]>" | ||
| and node not in hashes_to_ignore | ||
| ): | ||
| bug_id = int(bug_match.group(2)) | ||
|
|
||
| bug_component = self.bug_id_to_component.get(bug_id) | ||
| if bug_component and any( | ||
| to_ignore in bug_component | ||
| for to_ignore in PRODUCT_OR_COMPONENT_TO_IGNORE | ||
| ): | ||
| continue | ||
| yield bug_match.group(1) | ||
|
|
||
| def get_commit_logs(self) -> Optional[list[tuple[str, str, str]]]: | ||
| url = f"https://hg.mozilla.org/releases/mozilla-release/json-pushes?fromchange={self.version1}&tochange={self.version2}&full=1" | ||
| response = requests.get(url) | ||
| response.raise_for_status() | ||
|
|
||
| data = response.json() | ||
| commit_log_list = [ | ||
| ( | ||
| changeset["desc"].strip(), | ||
| changeset.get("author", "").strip(), | ||
| changeset.get("node", "").strip(), | ||
| ) | ||
| for push_data in data.values() | ||
| for changeset in push_data["changesets"] | ||
| if "desc" in changeset and changeset["desc"].strip() | ||
| ] | ||
|
|
||
| return commit_log_list if commit_log_list else None | ||
|
|
||
| def remove_duplicate_bugs(self, csv_text: str) -> str: | ||
| seen = set() | ||
| unique_lines = [] | ||
| for line in csv_text.strip().splitlines(): | ||
| parts = line.split(",", 3) | ||
| if len(parts) < 3: | ||
| continue | ||
| bug_id = parts[2].strip() | ||
| if bug_id not in seen: | ||
| seen.add(bug_id) | ||
| unique_lines.append(line) | ||
| return "\n".join(unique_lines) | ||
|
|
||
| def get_final_release_notes_commits(self, version: str) -> Optional[list[str]]: | ||
| self.version2 = version | ||
| self.version1 = get_previous_version(version) | ||
|
|
||
| logger.info(f"Generating commit shortlist for: {self.version2}") | ||
| commit_log_list = self.get_commit_logs() | ||
|
|
||
| if not commit_log_list: | ||
| return None | ||
|
|
||
| logger.info("Filtering irrelevant commits...") | ||
| filtered_commits = list(self.filter_irrelevant_commits(commit_log_list)) | ||
|
|
||
| if not filtered_commits: | ||
| return None | ||
|
|
||
| logger.info("Generating commit shortlist...") | ||
| commit_shortlist = self.generate_commit_shortlist(filtered_commits) | ||
|
|
||
| if not commit_shortlist: | ||
| return None | ||
|
|
||
| logger.info("Refining commit shortlist...") | ||
| combined_list = "\n".join(commit_shortlist) | ||
| cleaned = self.cleanup_chain.run({"combined_list": combined_list}).strip() | ||
|
|
||
| logger.info("Removing duplicates...") | ||
| deduped = self.remove_duplicate_bugs(cleaned) | ||
| return deduped.splitlines() | ||
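
`batch_commit_logs` in the diff above imports `batched` from `itertools`, which is available only from Python 3.12, while the deploy workflow requests `--runtime=python311`. Under Python 3.11 that import would raise `ImportError`. The following is a minimal sketch of an equivalent helper built on `itertools.islice`, assuming the same newline-separated commit log as input. The standalone function signature (taking `chunk_size` as an argument rather than reading it from `self`) is an illustration, not the PR's code.

```python
from itertools import islice
from typing import Iterable, Iterator


def batched(iterable: Iterable[str], n: int) -> Iterator[tuple[str, ...]]:
    """Yield successive tuples of up to n items, like itertools.batched in 3.12+."""
    if n < 1:
        raise ValueError("n must be at least one")
    iterator = iter(iterable)
    # islice pulls at most n items per pass; an empty tuple ends the loop.
    while batch := tuple(islice(iterator, n)):
        yield batch


def batch_commit_logs(commit_log: str, chunk_size: int) -> list[str]:
    """Split a newline-separated commit log into chunks of chunk_size commits."""
    return [
        "\n".join(batch)
        for batch in batched(commit_log.strip().split("\n"), chunk_size)
    ]


if __name__ == "__main__":
    print(batch_commit_logs("a\nb\nc\nd\ne", 2))
```

The last chunk may be shorter than `chunk_size`, which matches the behavior of `itertools.batched`.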
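
The message-based part of `filter_irrelevant_commits` in the diff above can be illustrated in isolation. This is a simplified sketch, not the PR's implementation: it keeps only the keyword, bug-number, and reviewer checks, and leaves out the author check, the `.hg-annotate-ignore-revs` lookup, and the Bugzilla component check. The helper name `keep_commit`, the shortened keyword list, and the sample commit messages are hypothetical.

```python
import re
from typing import Optional

# Shortened, illustrative subset of KEYWORDS_TO_REMOVE from the diff.
KEYWORDS_TO_REMOVE = ["Backed out", "a=testonly", "DONTBUILD", "add test", "nightly"]


def keep_commit(desc: str) -> Optional[str]:
    """Return the text from 'Bug N' to the end of that line if the commit
    passes the message-based filters, otherwise None."""
    lowered = desc.lower()
    if any(keyword.lower() in lowered for keyword in KEYWORDS_TO_REMOVE):
        return None
    # Same pattern as in the diff; '.' does not match newlines, so only the
    # remainder of the line that contains the bug number is captured.
    bug_match = re.search(r"(Bug (\d+).*)", desc, re.IGNORECASE)
    if not bug_match:
        return None
    # Require a reviewer marker such as 'r=alice'.
    if not re.search(r"\br=[^\s,]+", desc):
        return None
    return bug_match.group(1)


if __name__ == "__main__":
    samples = [
        "Bug 1900001 - Add vertical tabs to the sidebar. r=alice",
        "Bug 1900002 - Add tests for the sidebar. r=bob",
        "Backed out changeset abc123 (bug 1900003) for causing failures",
        "No bug - Update docs. r=carol",
        "Bug 1900005 - Fix crash on startup",
    ]
    for sample in samples:
        print(repr(sample), "->", keep_commit(sample))
```

Because the keyword match is a case-insensitive substring test, a user-facing change whose message happens to mention "nightly" or "add test" is dropped as well.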
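
`remove_duplicate_bugs` in the diff above splits each line with `line.split(",", 3)`, so a description that itself contains a comma shifts the bug ID out of the third field. The sketch below swaps in Python's `csv` module, which is a different technique from the one the PR uses, and it only helps if the model quotes such fields. Whether the model does so is an assumption, and the sample rows are made up.

```python
import csv
import io


def remove_duplicate_bugs(csv_text: str) -> str:
    """Keep the first line seen for each bug ID (third CSV field)."""
    seen: set[str] = set()
    unique_lines: list[str] = []
    for line in csv_text.strip().splitlines():
        try:
            # Parse a single line so quoted commas stay inside one field.
            fields = next(csv.reader(io.StringIO(line)))
        except (csv.Error, StopIteration):
            continue  # skip blank or unparsable lines
        if len(fields) < 3:
            continue
        bug_id = fields[2].strip()
        if bug_id not in seen:
            seen.add(bug_id)
            unique_lines.append(line)
    return "\n".join(unique_lines)


if __name__ == "__main__":
    text = (
        '[Feature],"Adds tabs, vertically",Bug 1,Reason A\n'
        "[UI],Other wording,Bug 1,Reason B\n"
        "[Fix],Fixes crash,Bug 2,Reason C"
    )
    print(remove_duplicate_bugs(text))
```

With the plain `split(",", 3)` approach, the first sample row would yield `vertically"` as its "bug ID", and the duplicate `Bug 1` entry would survive.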
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,49 @@ | ||
| import logging | ||
| import os | ||
|
|
||
| import flask | ||
| import functions_framework | ||
|
|
||
| from bugbug import generative_model_tool | ||
| from bugbug.tools.release_notes import ReleaseNotesCommitsSelector | ||
| from bugbug.utils import get_secret | ||
|
|
||
| logging.basicConfig(level=logging.INFO) | ||
| logger = logging.getLogger(__name__) | ||
|
|
||
| os.environ["OPENAI_API_KEY"] = get_secret("OPENAI_API_KEY") | ||
|
|
||
| tool: ReleaseNotesCommitsSelector | None = None | ||
|
|
||
| DEFAULT_LLM_NAME = "openai" | ||
| DEFAULT_CHUNK_SIZE = 1000 | ||
|
|
||
|
|
||
| @functions_framework.http | ||
| def handle_release_notes(request: flask.Request): | ||
| global tool | ||
|
|
||
| if request.method != "GET": | ||
| return "Only GET requests are allowed", 405 | ||
|
|
||
| version = request.args.get("version") | ||
| if not version: | ||
| return "Missing 'version' query parameter", 400 | ||
|
|
||
| if ( | ||
| tool is None | ||
| or tool.llm_name != DEFAULT_LLM_NAME | ||
| or tool.chunk_size != DEFAULT_CHUNK_SIZE | ||
| ): | ||
| logger.info("Initializing new ReleaseNotesCommitsSelector...") | ||
| llm = generative_model_tool.create_llm_from_request(DEFAULT_LLM_NAME, {}) | ||
| tool = ReleaseNotesCommitsSelector(chunk_size=DEFAULT_CHUNK_SIZE, llm=llm) | ||
| tool.llm_name = DEFAULT_LLM_NAME | ||
| tool.chunk_size = DEFAULT_CHUNK_SIZE | ||
|
|
||
| notes = tool.get_final_release_notes_commits(version=version) | ||
|
|
||
| if not notes: | ||
| return {"commits": []}, 200, {"Content-Type": "application/json"} | ||
|
|
||
| return {"commits": notes}, 200, {"Content-Type": "application/json"} | ||
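
The handler above accepts only GET requests, requires a `version` query parameter, and responds with a JSON object holding a `commits` list. A minimal client sketch under those assumptions follows. The base URL and the version string are hypothetical placeholders (neither appears in the PR), and the 240-second timeout simply mirrors the `--timeout=240` flag in the deploy workflow.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(base_url: str, version: str) -> str:
    """Attach the 'version' query parameter expected by the handler."""
    return f"{base_url}?{urlencode({'version': version})}"


def parse_commits(body: str) -> list[str]:
    """Extract the 'commits' list from the handler's JSON response body."""
    payload = json.loads(body)
    return payload.get("commits", [])


def fetch_release_notes(base_url: str, version: str, timeout: float = 240.0) -> list[str]:
    """Issue the GET request; non-2xx responses raise urllib.error.HTTPError."""
    with urlopen(build_url(base_url, version), timeout=timeout) as response:
        return parse_commits(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical endpoint and version; replace with real values before use.
    print(build_url("https://example.invalid/release-notes", "FIREFOX_135_0_RELEASE"))
```

An empty `commits` list is a normal response: the handler returns it with status 200 when no commits survive filtering.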
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| bugbug | ||
| Flask==2.2.5 | ||
| functions-framework==3.5.0 | ||
| langchain | ||
| openai | ||
| requests | ||