Marmelab Blog

Introducing Sedy, the Serverless GitHub Bot That Fixes Typos for You

We do code reviews on pull requests several times a day. All the code we push to production is read at least twice. But sometimes the required fix is so simple that the reviewer could make it themselves:

Super simple code review

To ease the work of pull request authors, marmelab developers got into the habit of using the Linux sed syntax ("s/[search pattern]/[replace]/") to explain what should be replaced:

Even simpler code review

The next logical step is to help developers even further and let a bot do the simple replacements. That would limit the work resulting from a code review to the edits that add real value.

Introducing Sedy

Sedy brings the power of the Linux sed command to GitHub comments. It's a bot registered as a GitHub user that listens for code review comments of the form s/[search pattern]/[replace]/ and tries to do the substitution. If it succeeds, it pushes the result to the pull request.

Sedy in action
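To make the idea concrete, here is a minimal sketch of parsing one sed-style comment and applying it to a line of text. It is an illustration only, not Sedy's actual code (Sedy is written in Node.js): the names `Substitution`, `parse_command` and `apply_substitution` are hypothetical, and the sketch assumes that the search part is a regular expression, that the replacement is literal text, and that neither contains a slash.

```python
import re
from typing import NamedTuple, Optional


class Substitution(NamedTuple):
    search: str
    replace: str


# Assumed grammar: "s/", a non-empty search pattern, "/", a replacement, "/".
# Slashes inside either part are not supported in this simplified sketch.
COMMAND = re.compile(r"^s/(?P<search>[^/]+)/(?P<replace>[^/]*)/$")


def parse_command(comment: str) -> Optional[Substitution]:
    """Return the substitution described by the comment, or None if it is not one."""
    match = COMMAND.match(comment.strip())
    if match is None:
        return None
    return Substitution(match["search"], match["replace"])


def apply_substitution(line: str, sub: Substitution) -> Optional[str]:
    """Replace the first match of sub.search in line; None when nothing matches."""
    # A function replacement keeps sub.replace literal (no backreference expansion).
    result, count = re.subn(sub.search, lambda _m: sub.replace, line, count=1)
    return result if count else None


sub = parse_command("s/recieve/receive/")
fixed = apply_substitution("We recieve webhooks from GitHub.", sub)
print(fixed)
```

Returning None when nothing matches lets the caller skip the commit instead of pushing an empty change.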

In many cases, pull requests can be merged right after sedy-bot pushes - without any need to check out the branch, open the files, fix the typos, save the edits, commit, and push. The greatest benefit is the lack of context switching for a developer who is already working on another feature. These small gains add up to a lot at the end of the day.

Sedy Is Smart

This looks like a very simple idea, but it took us a lot longer to develop than we expected. That's because Sedy must understand various input types:

  • Single comments on pull requests
  • Pull request reviews
  • Several Sedy commands per comment
Multiple calls to Sedy in one comment

Also, we don't want to be spammed by Sedy. So while it makes one commit per substitution, it only pushes once at the end. If you watch a repository, that means you'll receive at most one notification from Sedy per Pull Request.
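The "one commit per substitution, one push at the end" behaviour can be sketched as follows. This is a simplified model under stated assumptions, not Sedy's implementation: `FakeRemote`, `extract_commands`, `process_comment` and the commit message format are all hypothetical, and a real bot would create Git commits through the GitHub API rather than collect strings in a list.

```python
import re
from typing import List, Tuple

# Assumed grammar: a command starts at the beginning of the comment or after
# whitespace, and neither part contains a slash or a newline.
COMMAND = re.compile(r"(?<!\S)s/([^/\n]+)/([^/\n]*)/")


def extract_commands(comment: str) -> List[Tuple[str, str]]:
    """Return every (search, replace) pair found in a comment, in order."""
    return [(m.group(1), m.group(2)) for m in COMMAND.finditer(comment)]


class FakeRemote:
    """Stand-in for the Git remote; records pushes so the batching is visible."""

    def __init__(self) -> None:
        self.pushes: List[List[str]] = []

    def push(self, commits: List[str]) -> None:
        self.pushes.append(list(commits))


def process_comment(comment: str, content: str, remote: FakeRemote) -> str:
    """Apply each command as its own commit, then push all commits at once."""
    commits: List[str] = []
    for search, replace in extract_commands(comment):
        patched, count = re.subn(search, lambda _m: replace, content, count=1)
        if count:
            content = patched
            commits.append(f"Typo fix s/{search}/{replace}/")
    if commits:
        remote.push(commits)  # a single push, hence a single notification
    return content


remote = FakeRemote()
result = process_comment(
    "s/teh/the/ and also s/wich/which/", "teh file wich has typos", remote
)
print(result)
print(remote.pushes)
```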

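Lastly, substitutions with a sed pattern can sometimes be tricky - just think of special characters, Unicode, or messages that contain something that looks like a sed pattern by accident (here, "s/blog/summer/" inside a URL), like: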

You should add this image: http://mydomain.com/images/blog/summer/flower.jpg
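The message above shows why a naive pattern is not enough. The sketch below compares a naive regular expression with a stricter one that only accepts a command surrounded by whitespace (or the start and end of the comment). Both patterns are assumptions made for illustration; how Sedy itself avoids this false positive is not described here.

```python
import re

# Naive: any "s/.../.../" sequence, wherever it appears.
NAIVE = re.compile(r"s/([^/]+)/([^/]*)/")

# Stricter (hypothetical): the command must not be glued to other characters.
STRICT = re.compile(r"(?<!\S)s/([^/\n]+)/([^/\n]*)/(?!\S)")

message = "You should add this image: http://mydomain.com/images/blog/summer/flower.jpg"

naive_hit = NAIVE.search(message)
strict_hit = STRICT.search(message)

# The naive pattern picks up "s/blog/summer/" from the middle of the URL.
print(naive_hit.groups() if naive_hit else None)
# The stricter pattern ignores the URL entirely.
print(strict_hit)
# A genuine command is still recognized.
print(STRICT.search("Typo: s/flwoer/flower/").groups())
```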

We've been using Sedy internally for about 6 months, and it is now stable enough to be used outside our organization.

Using Sedy in Open-Source Repositories

We're opening Sedy to every open-source project hosted on GitHub. If you want to enable Sedy on your repository, follow these steps:

  1. Go to https://marmelab.com/sedy/

    Sedy homepage
  2. Click on the Authorize button to add the Sedy application to your public or private projects

    Sedy GitHub Auth
  3. Sedy loads the list of your repositories from GitHub, and lets you choose where to enable it:

    Sedy GitHub Auth

    Click on the plus button in front of the repositories of your choice. This adds a webhook on your repo, and adds the Sedy GitHub account as a collaborator of this repository.

And that's all. Sedy receives notifications for pull request comments, and pushes commits whenever it thinks a comment is addressed to it.
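For readers curious about what enabling a repository involves, the sketch below builds the two GitHub REST API requests that correspond to the steps described: creating a webhook and adding a collaborator. It only builds the requests and sends nothing. The function name `registration_requests` and the `octocat/hello-world` repository are hypothetical, and the event list and `content_type` are assumptions; the registration app may configure the hook differently.

```python
import json
from typing import Any, Dict, List, Optional, Tuple

# (HTTP method, API path, JSON body or None)
Request = Tuple[str, str, Optional[Dict[str, Any]]]


def registration_requests(
    owner: str,
    repo: str,
    hook_url: str = "https://sedy.marmelab.com",
    bot: str = "sedy-bot",
) -> List[Request]:
    """Describe the GitHub REST calls that enable the bot on one repository."""
    hook_body = {
        "name": "web",
        "active": True,
        # Assumption: only review comments are forwarded to the bot.
        "events": ["pull_request_review_comment"],
        "config": {"url": hook_url, "content_type": "json"},
    }
    return [
        ("POST", f"/repos/{owner}/{repo}/hooks", hook_body),
        ("PUT", f"/repos/{owner}/{repo}/collaborators/{bot}", None),
    ]


for method, path, body in registration_requests("octocat", "hello-world"):
    print(method, path, json.dumps(body))
```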

Sedy only fetches your personal repositories for now. If you want to enable Sedy on an organization right now, you can still invite the sedy-bot user to your organization, and add a GitHub hook on Pull Request Review Comments pointing to https://sedy.marmelab.com. Since we need to accept the invitation on behalf of the Sedy bot, you'll have to notify us about it. For the time being, we only accept notifications by postcard sent to:

Kevin Maschtaler
marmelab
31 rue du Haut Bourgeois
54 000 Nancy
France

What About Usage Limits?

Sedy was developed in Node.js, using the Serverless approach. It is hosted on AWS Lambda, where the first 1,000,000 requests each month are free. In six months, we've barely reached 5,000 calls to the Sedy lambda.

We're pretty confident that we can keep Lambda usage under the free tier with Sedy if it's not too popular, so we decided not to impose any usage limits. In case of sudden fame, we may revise this policy in the future to protect our bank account.

However, since Sedy has to fetch files, patch them, commit, and push the commits in less than 1 minute (a limit that is set by Lambda), it usually fails on Pull Request Reviews with more than 5 comments. Be gentle with Sedy and use it sparingly.

Security Concerns

By granting the Sedy bot access to your public or private repository, you effectively grant us (marmelab) read and write access. We take security seriously, so:

  • We pledge not to do anything with this access outside of the Sedy bot.
  • The content published by the Sedy bot on your repository is the property of the commenter who wrote the sed pattern in the first place - we don't claim intellectual property on Sedy commits.
  • We have protected the Sedy bot account with a strong password and two-factor authentication (2FA).

If those guarantees aren't enough for you, you can still host your own Sedy bot.

Hosting Your Own Sedy Bot

The Sedy code is open-source, and hosted at marmelab/sedy on GitHub. You can easily publish Sedy to a lambda of your own, associate it with a GitHub Application that you define, and manage authorizations for your organization. It's all explained in the README.

Under The Hood

The Registration app is a static app running entirely on the client side. When you associate Sedy with one of your repositories, this app adds sedy-bot as a collaborator and registers the webhook. Since the app has no server side, no GitHub authorization token is ever stored on our servers.

The web service that receives GitHub hooks is a simple Lambda function behind Amazon API Gateway. It has no persistence, which means that it relies entirely on the GitHub API to fetch files, patch them, and push commits.
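The shape of such a stateless handler can be sketched as follows. This is not the Sedy source (which is Node.js): `GitHubClient`, `InMemoryGitHub` and `handle` are hypothetical names, the in-memory client stands in for real GitHub API calls, and the sketch replaces the first match in the whole file rather than targeting the commented line. The payload fields used (`comment.body`, `comment.path`, `pull_request.head.ref`, `repository.full_name`) are those of GitHub's pull request review comment webhook.

```python
import json
import re
from typing import Any, Dict, List, Protocol, Tuple

COMMAND = re.compile(r"(?<!\S)s/([^/\n]+)/([^/\n]*)/(?!\S)")


class GitHubClient(Protocol):
    """Hypothetical wrapper around the GitHub API; the handler keeps no state."""

    def get_file(self, repo: str, path: str, ref: str) -> str: ...

    def put_file(self, repo: str, path: str, ref: str, content: str, message: str) -> None: ...


class InMemoryGitHub:
    """Test double that stores files in a dict instead of calling GitHub."""

    def __init__(self, files: Dict[Tuple[str, str, str], str]) -> None:
        self.files = dict(files)
        self.commits: List[str] = []

    def get_file(self, repo: str, path: str, ref: str) -> str:
        return self.files[(repo, path, ref)]

    def put_file(self, repo: str, path: str, ref: str, content: str, message: str) -> None:
        self.files[(repo, path, ref)] = content
        self.commits.append(message)


def handle(event: Dict[str, Any], github: GitHubClient) -> Dict[str, Any]:
    """Process one webhook delivery; everything needed comes from the payload."""
    payload = json.loads(event["body"])
    comment = payload["comment"]
    match = COMMAND.search(comment["body"])
    if match is None:
        return {"statusCode": 200, "body": "ignored"}
    search, replace = match.groups()
    repo = payload["repository"]["full_name"]
    ref = payload["pull_request"]["head"]["ref"]
    original = github.get_file(repo, comment["path"], ref)
    patched, count = re.subn(search, lambda _m: replace, original, count=1)
    if count == 0:
        return {"statusCode": 200, "body": "no match"}
    github.put_file(repo, comment["path"], ref, patched, f"Typo fix s/{search}/{replace}/")
    return {"statusCode": 200, "body": "patched"}


key = ("acme/site", "README.md", "fix-docs")
github = InMemoryGitHub({key: "Wellcome to the docs\n"})
event = {"body": json.dumps({
    "comment": {"body": "s/Wellcome/Welcome/", "path": "README.md"},
    "pull_request": {"head": {"ref": "fix-docs"}},
    "repository": {"full_name": "acme/site"},
})}
response = handle(event, github)
print(response["body"], repr(github.files[key]))
```

Passing the client in as an argument keeps the handler free of global state, which mirrors the "no database, no filesystem" design described here.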

So Sedy is completely serverless - no database, no filesystem. No permission or token stored whatsoever. If you don't trust us, see for yourself in the source:

Sedy Repository

What's next

At marmelab, we use Sedy on a daily basis. We plan to add the features that we think can make us more efficient. This includes support for the sed flags (i to ignore case, g to fix the typo everywhere in the current file), and installation via GitHub Integrations (currently in early access), which simplifies installation and supports organizations out of the box.
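As an illustration of what the two flags could mean in practice, here is a small sketch that maps i to a case-insensitive match and g to replacing every occurrence. Since the feature is only planned, this is purely a guess at the behaviour, and the `substitute` function and its grammar are hypothetical.

```python
import re

# Assumed grammar: s/search/replace/ followed by any combination of g and i.
COMMAND = re.compile(r"^s/([^/]+)/([^/]*)/([gi]*)$")


def substitute(command: str, text: str) -> str:
    """Apply a sed-style command to text, honouring the g and i flags."""
    match = COMMAND.match(command.strip())
    if match is None:
        raise ValueError(f"not a substitution command: {command!r}")
    search, replace, flags = match.groups()
    re_flags = re.IGNORECASE if "i" in flags else 0
    count = 0 if "g" in flags else 1  # in re.sub, count=0 means "replace all"
    return re.sub(search, lambda _m: replace, text, count=count, flags=re_flags)


print(substitute("s/github/GitHub/", "github and github"))
print(substitute("s/github/GitHub/g", "github and github"))
print(substitute("s/github/GitHub/gi", "Github and GITHUB"))
```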

Later, we'd love a nice logo, a sedy.yml file to store your Sedy preferences for each repository, or even a GitLab and Bitbucket integration.

As you can see, Sedy is still in its infancy!

For all these, we'd welcome a helping hand. Feel free to open a pull request on the repository. If you run into a problem, please open an issue with the name of the repository and a reference to the pull request.

Happy seding!