By Miley Fu, CNCF Ambassador, DevRel at CNCF WasmEdge project
Code review is a critical aspect of modern software development. In a GitHub workflow, the code review starts when a Pull Request (PR) is created, and concludes when the PR is approved and merged or rejected. The reviewers are typically senior developers or architects. They help ensure that the code submitted to the repository is correct, maintainable, scalable, and secure. That is especially important for open source projects, where many code contributions come from the community.
However, code reviews in PRs are also often the greatest friction point for software development.
- Senior developers are very busy and very expensive. They have the least amount of time for code reviews.
- Yet, the development process cannot move forward (e.g., merging the PR) without the code review. Developers are often idle waiting for reviews. For open-source community developers, untimely code reviews discourage further contributions.
- Management often asks the senior developer to report and explain the key changes and risk factors associated with a PR, further delaying the process.
According to a LinearB study of over 700,000 PRs by 26,000 developers, it takes on average more than 4 days to review a PR. Developers sit idle for two days on every PR they submit. That is an enormous waste of productivity.
In this blog post, we will discuss a GitHub PR code review bot created by CNCF’s WasmEdge community. It runs on the open-source WasmEdge runtime and uses ChatGPT / GPT4 to perform the code review tasks. It is already deployed on WasmEdge repositories to automatically review every PR. For the impatient, you can create and deploy your own code review bot on GitHub in less than 5 minutes!
Real-world examples
But, is ChatGPT/4 smart enough to do code review? Isn’t this a job for senior developers? Without further ado, let’s see an example. The figure below shows a PR submitted to one of the WasmEdge open source repos. It adds a check_prime() function to check whether an input number is prime. The implementation looks pretty standard. It loops from 2 to the square root of n and tries every integer for divisibility.
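For readers who cannot see the figure, here is a minimal sketch of the approach just described. It is not the contributor's actual code: the signature and the `u64` type are assumptions made for illustration.

```rust
/// Returns true if `n` is prime, using trial division up to sqrt(n).
/// Illustrative sketch only; the signature and integer type are assumptions.
fn check_prime(n: u64) -> bool {
    if n < 2 {
        return false;
    }
    let mut i: u64 = 2;
    // `i <= n / i` is equivalent to `i * i <= n` but cannot overflow.
    while i <= n / i {
        if n % i == 0 {
            return false;
        }
        i += 1;
    }
    true
}

fn main() {
    for n in [1u64, 2, 17, 21, 97] {
        println!("{n} is prime: {}", check_prime(n));
    }
}
```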
The bot provides the following code review comment. I have to say that I am very impressed!
If you follow up with the conversation, you can get ChatGPT/4 to further optimize the code and come up with a solution that skips all multiples of prime numbers already discovered in the loop.
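The bot's follow-up suggestion is not reproduced here, but one common form of this idea is to rule out multiples of 2 and 3 up front and then test only candidates of the form 6k ± 1. The sketch below assumes that variant; the function name `check_prime_fast` is hypothetical.

```rust
/// Primality test that skips multiples of 2 and 3 (the 6k ± 1 pattern).
/// Hypothetical name; a sketch of one possible optimization, not the bot's output.
fn check_prime_fast(n: u64) -> bool {
    if n < 2 {
        return false;
    }
    if n < 4 {
        return true; // 2 and 3 are prime
    }
    if n % 2 == 0 || n % 3 == 0 {
        return false;
    }
    // Every remaining prime factor has the form 6k - 1 or 6k + 1.
    let mut i: u64 = 5;
    while i <= n / i {
        if n % i == 0 || n % (i + 2) == 0 {
            return false;
        }
        i += 6;
    }
    true
}

fn main() {
    let primes: Vec<u64> = (0..50).filter(|&n| check_prime_fast(n)).collect();
    println!("{:?}", primes);
}
```

Compared with plain trial division, this checks roughly one third as many candidate divisors.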
As a manager / maintainer, I found technical summaries written by the code review bot very helpful as well.
How it works
The code reviewer bot is a serverless function (i.e., a flow function) written in Rust (and soon JavaScript!). It is compiled into Wasm and runs in a WasmEdge runtime hosted by flows.network.
Flows.network is a PaaS that provides UI and hosted services to run the WasmEdge functions and connect them to external APIs (e.g., GitHub). It has a generous free tier. Of course, you can run your own WasmEdge cloud services if you wish.
When a PR is created in the connected GitHub repo, the flow function (or 🤖) is triggered. The flow function collects the patches and files in the PR and asks ChatGPT/4 to review and summarize them. The result is then posted back to the PR as a comment.
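The flow just described can be sketched in plain Rust. This is a simplified model, not the template's code: the `Reviewer` trait, the `MockReviewer` type, and the `review_pull_request` function are hypothetical stand-ins, and the real bot calls OpenAI and GitHub through the flows.network SDKs.

```rust
/// Stand-in for the LLM call (hypothetical; the real bot calls OpenAI).
trait Reviewer {
    fn ask(&self, system_prompt: &str, question: &str) -> String;
}

/// Asks the reviewer about each commit patch and assembles one PR comment.
fn review_pull_request(title: &str, patches: &[&str], reviewer: &dyn Reviewer) -> String {
    let system = format!(
        "You are an experienced software developer. You will act as a reviewer for a GitHub Pull Request titled \"{}\".",
        title
    );
    let mut comment = String::from("Here are my reviews of code commits in this PR.\n\n");
    for (i, patch) in patches.iter().enumerate() {
        let question = format!(
            "The following is a GitHub patch. Please summarize the key changes and identify potential problems.\n\n{}",
            patch
        );
        let review = reviewer.ask(&system, &question);
        comment.push_str(&format!("### Commit {}\n\n{}\n\n", i + 1, review));
    }
    comment
}

/// Mock reviewer so the sketch runs without network access.
struct MockReviewer;

impl Reviewer for MockReviewer {
    fn ask(&self, _system_prompt: &str, question: &str) -> String {
        format!("(mock review of {} characters of input)", question.len())
    }
}

fn main() {
    let patches = ["diff --git a/src/lib.rs b/src/lib.rs\n+fn check_prime() {}"];
    let comment = review_pull_request("Add check_prime", &patches, &MockReviewer);
    println!("{comment}");
}
```

Putting the LLM behind a trait keeps the review loop testable without an API key.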
The bot continuously monitors the PR for new commits and updates. It updates (overwrites) the code review comment in the PR as needed.
The bot can also be triggered by a magic phrase in the PR’s comments section. For example, if a reviewer wants the bot to update the summary, they can simply comment “flows summarize”.
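The triggering and overwrite behavior can be modeled with two small decisions: does this event call for a review, and should the bot create a new comment or update its earlier one? The sketch below is a simplified model under stated assumptions. The `Event`, `Comment`, and `CommentAction` types are hypothetical, case-insensitive phrase matching is an assumption, and so is recognizing the bot's earlier comment by its greeting.

```rust
/// Simplified, hypothetical view of the GitHub events the bot cares about.
#[derive(Debug, PartialEq)]
enum Event<'a> {
    PullRequestOpened,
    NewCommitsPushed,
    CommentCreated { body: &'a str },
    Other,
}

const TRIGGER_PHRASE: &str = "flows summarize";

/// Decides whether an event should start (or refresh) a review.
fn should_review(event: &Event) -> bool {
    match event {
        Event::PullRequestOpened | Event::NewCommitsPushed => true,
        // Assumption: the phrase is matched case-insensitively anywhere in the comment.
        Event::CommentCreated { body } => body.to_lowercase().contains(TRIGGER_PHRASE),
        Event::Other => false,
    }
}

/// Hypothetical minimal model of an existing PR comment.
struct Comment<'a> {
    id: u64,
    body: &'a str,
}

#[derive(Debug, PartialEq)]
enum CommentAction {
    Create,
    Update(u64),
}

/// Assumption: the bot recognizes its own earlier comment by its greeting.
const BOT_MARKER: &str = "Hello, I am a [code review bot]";

/// Overwrites the bot's earlier comment if one exists, otherwise posts a new one.
fn comment_action(existing: &[Comment]) -> CommentAction {
    match existing.iter().find(|c| c.body.starts_with(BOT_MARKER)) {
        Some(c) => CommentAction::Update(c.id),
        None => CommentAction::Create,
    }
}

fn main() {
    let events = [
        Event::PullRequestOpened,
        Event::NewCommitsPushed,
        Event::CommentCreated { body: "flows summarize" },
        Event::CommentCreated { body: "Looks good to me" },
        Event::Other,
    ];
    for event in &events {
        println!("{:?} -> review: {}", event, should_review(event));
    }
    let existing = [Comment { id: 1, body: "Thanks for the PR!" }];
    println!("{:?}", comment_action(&existing));
}
```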
Create your own bot
To create and deploy your own code review bot, follow these 3 simple steps. It takes less than 5 minutes!
There are two bot templates you can choose from. One is to summarize each commit in the PR (create a bot from it). The other is to review each of the changed files in the PR (create a bot from it). The following shows the steps for the former one.
- Load the code review bot template in flows.network. The template contains the source code for the bot itself. We will clone the source code to your own GitHub account so that you can modify and customize it later. Click on Create and Deploy.
- Give the bot your OpenAI API key. If you have saved API keys in the past, you can skip this step and reuse these keys.
- Authorize bot access to GitHub. The github_owner and github_repo point to the target GitHub repo where the bot will review PRs. Click on Authorize to give the bot the necessary permissions in GitHub.
The figures below show steps 2 and 3 above.
Authorize the bot to access the WasmEdge/wasmedge-db-examples GitHub repo using the OAuth UI provided by GitHub.
That’s it. Create a new Pull Request on the github_owner/github_repo repo and see the bot work its magic!
Customize the bot
In the above process, you first cloned the bot source code from a template into your own GitHub account (e.g., the your_id/summarize-github-pull-requests repo). The bot is then created from this source code. You can customize or modify the bot behavior by making changes to the bot source code in your own account.
You must push changes to the bot source code to GitHub in order for flows.network to pick up those changes and rebuild your bot (i.e., flow function).
Here are some simple code changes you can make to customize your bot. Just change the src/github-pr-summary.rs source code file in your own cloned repo as follows. Remember to push your changes to GitHub so that flows.network can pick them up.
- Choose a different model. The bot uses the GPT 3.5 model by default. If you have access to the more advanced GPT-4 model, change “GPT35Turbo” to “GPT4” in the following source code. GPT4 provides better code reviews but is more expensive.
static MODEL : ChatModel = ChatModel::GPT35Turbo;
// static MODEL : ChatModel = ChatModel::GPT4;
- Engineer ChatGPT prompts. For example, you can let ChatGPT be an experienced Java developer to review Java source code files. Using custom prompts, you can make the bot focus on certain aspects of the code (e.g., security issues or performance). You can also prompt the bot to give specific types of review comments, such as providing code snippets for suggested changes or bullet points for security issues. The following code is the prompt in the template. There are many prompt libraries you can draw inspiration from.
let chat_id = format!("PR#{pull_number}");
let system = &format!("You are an experienced software developer. You will act as a reviewer for a GitHub Pull Request titled \"{}\".", title);
let mut reviews: Vec<String> = Vec::new();
let mut reviews_text = String::new();
for (_i, commit) in commits.iter().enumerate() {
    let commit_hash = &commit[5..45];
    let co = ChatOptions {
        model: MODEL,
        restart: true,
        system_prompt: Some(system),
        retry_times: 3,
    };
    let question = "The following is a GitHub patch. Please summarize the key changes and identify potential problems. Start with the most important findings.\n\n".to_string() + truncate(commit, CHAR_SOFT_LIMIT);
- Make the bot more friendly. You can change the content and style of the bot’s pull request comments by changing the sentence starting with “Hello, I am a code review bot on flows.network” in the following source code. For example, you can add a customized greeting to your community members.
let mut resp = String::new();
resp.push_str("Hello, I am a [code review bot](https://github.com/flows-network/github-pr-summary/) on [flows.network](https://flows.network/). Here are my reviews of code commits in this PR.\n\n------\n\n");
if reviews.len() > 1 {
    let co = ChatOptions {
        model: MODEL,
        restart: true,
        system_prompt: Some(system),
        retry_times: 3,
    };
- Customize the review strategy. By default, the bot will review every changed file and every commit in a pull request. You can edit the source code to review only certain files or only changes by specific developers.
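To make the prompt and review-strategy customizations above more concrete, here is a self-contained sketch. None of it is taken from the template: `system_prompt`, `ChangedFile`, and `should_review_file` are hypothetical helpers, the `truncate` body is a guess at what the helper called in the excerpt might do, and the `CHAR_SOFT_LIMIT` value is a placeholder.

```rust
/// Placeholder soft limit on patch size in characters (assumed value).
const CHAR_SOFT_LIMIT: usize = 9000;

/// Returns at most `max_chars` characters of `s`, cutting on a char boundary.
/// A guess at the helper used in the excerpt; the template's version may differ.
fn truncate(s: &str, max_chars: usize) -> &str {
    match s.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &s[..byte_idx],
        None => s,
    }
}

/// Builds a system prompt for a given language and review focus (hypothetical helper).
fn system_prompt(language: &str, focus: &str, title: &str) -> String {
    format!(
        "You are an experienced {language} developer. You will act as a reviewer for a GitHub Pull Request titled \"{title}\". Focus on {focus}."
    )
}

/// Hypothetical description of one changed file in a PR.
struct ChangedFile<'a> {
    path: &'a str,
    author: &'a str,
}

/// Reviews a file only if its extension is listed and, when `authors` is
/// non-empty, only if its author is listed too.
fn should_review_file(file: &ChangedFile, extensions: &[&str], authors: &[&str]) -> bool {
    let ext_ok = file
        .path
        .rsplit_once('.')
        .map_or(false, |(_, ext)| extensions.contains(&ext));
    let author_ok = authors.is_empty() || authors.contains(&file.author);
    ext_ok && author_ok
}

fn main() {
    let prompt = system_prompt("Java", "security issues", "Fix login flow");
    println!("{prompt}");

    let patch = "diff --git a/src/Main.java b/src/Main.java\n+int x = 1;";
    println!("{}", truncate(patch, CHAR_SOFT_LIMIT));

    let file = ChangedFile { path: "src/Main.java", author: "alice" };
    println!("review? {}", should_review_file(&file, &["java"], &["alice"]));
}
```

Cutting on a character boundary matters because slicing a Rust string in the middle of a multi-byte character panics.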
Use the bot on multiple repos
Once you have the bot running successfully on one repo, you will probably want code reviews on every one of your repos! You could deploy a separate bot from the template for each repo, but then each bot has its own copy of the source code, which quickly becomes hard to manage. Instead, you can use the same bot source code to create multiple bots! In flows.network, we call each bot a “flow”.
First, you can click on “Create a flow” and import your bot source code for the flow. Your bot source code is in the GitHub repo cloned from the template. It is not to be confused with the repos you want to deploy the bot on for PR reviews!
Next, in the “advanced” section, you can add the github_owner and github_repo settings to point to the target GitHub repo where the bot will review PRs.
The figures below show steps of “Create a new bot (flow) from an existing bot source code repo you cloned from the template”.
Finally, you will go through the process to authorize the bot (flow) access to your OpenAI API key and the target GitHub repo to deploy the bot on.
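Because the target repo is a setting rather than part of the source code, the same code can serve many flows. The sketch below shows one way a flow function might read these settings. It assumes, without confirmation from this post, that github_owner and github_repo are exposed to the function as environment variables; the `target_repo` helper is hypothetical.

```rust
/// Resolves the target repo from a settings lookup (hypothetical helper).
/// Taking the lookup as a closure keeps the function testable without
/// touching the real process environment.
fn target_repo<F>(lookup: F) -> Result<(String, String), String>
where
    F: Fn(&str) -> Option<String>,
{
    let owner = lookup("github_owner").ok_or("github_owner is not set")?;
    let repo = lookup("github_repo").ok_or("github_repo is not set")?;
    Ok((owner, repo))
}

fn main() {
    // Assumption: the settings reach the function as environment variables.
    match target_repo(|key| std::env::var(key).ok()) {
        Ok((owner, repo)) => println!("Reviewing PRs on {owner}/{repo}"),
        Err(msg) => println!("Missing setting: {msg}"),
    }
}
```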
What’s next?
AI-assisted code review is a fast-evolving field. CNCF’s WasmEdge provides an efficient runtime for code review bot applications. The community is experimenting with many new ideas to improve the bot template. Below are some near-term improvements to look forward to!
- Support additional LLMs that are trained on coding tasks, such as Claude, PaLM and others.
- Support fine-tuned models, such as Llama models that are trained on CVE databases.
- Integration with other R&D management tools such as issue trackers and project management tools.
- Support code hosting services beyond GitHub.
Boost code quality and developer productivity for your open source software repos today!