Managing code

One of the core functions of the engineering group is to mange the code that powers our systems and applications. This is different from the day-to-day feature development and bug fixes, as it is work that won't be ticketed unless we identify and address it.

This page outlines how code is managed at Norton.

Tending the garden

There's no "right" way to manage code, but gardening can be a helpful metaphor and mental model for how to cultivate a healthy codebase. 🧑‍🌾🍅🥕

1. Design for growth

2. Feed regularly

3. Prune overgrowth

4. Protect from the elements

5. Know when to pull it all up and start over

GitLab

We use GitLab for source code management. Because it is a paid service, membership is limited to staff who are doing active work with code. Please reach out to the Director of Engineering (Evan Yamanishi) if you believe you or one of your team members should have access.

GitLab organizes projects like a file tree, with groups functioning like folders and projects functioning like files. Our GitLab space is organized around the principles of domain-driven design and usability. The top-level GitLab groups are organized according to type of project (the domain), not business unit ("department"), meaning no projects will ever be part of the "Engineering" group, even if they are managed by us. The top-level groups that are most relevant to engineers are:

Applications - our official products/applications. This is where we do most of our work.
Operations - projects that help keep things running.
The Lab - unofficial, experimental, or exploratory projects. Projects in here may "graduate" to the Applications space.
Groups - a group of groups that contains no actual code or projects. This is how we define and manage our business units in GitLab, which we can use to simplify member management. When new members are added to GitLab, they should be added to one of these groups.

Standards

While we prioritize team autonomy over uniformity to ensure that teams can focus on making it work, there are times when doing things in a standard way can help improve efficiency, reduce complexity, promote a shared understanding, and create greater confidence in our engineering practices.

The following standards are designed to address each of these concerns, as well as enabling our continuous delivery practice. All engineers are expected to follow these standards, which use key words for requirements like "must" and "should" to convey the level of requirement where appropriate.

Code ownership

Code owners are required for all projects in the applications and operations groups on GitLab. This feature helps us avoid a whole range of diffusion of responsibility problems, but it more specifically enables us to require approvals on protected branches. This also help discourage the creation of code with no clear maintenance plan or ownership structure, and will help highlight projects might need to be moved or retired.

Repositories should have more than one code owner. This helps avoid single-person bottlenecks and keeps us honest.
- Reach out to your manager if you're not sure who else can or should manage a project that you maintain.
Leads are responsible for the CODEOWNERS file in their projects.
- Lead software engineers ("tech leads") should work with their team to determine who should own what parts of a codebase. Leads should own the code by default.
Code owners don't need to be part of the core team that maintains the parent project.
- If it makes sense for someone from another team to own a file, directory, or whole repository, they should be listed as an owner.
- For instance, if nobody on the team is well versed in Docker, it might make sense to have two code owners for the Dockerfile: one person from the core team, and a Docker expert from another team.

Continuous integration checks

Continuous integration, or CI, is the process of running code through checks every time a change is made (continuously!). It is the first half of CI/CD, or continuous integration and continuous delivery/deployment.

Checking our code continuously helps us catch issues faster, allows us to focus on higher-value concerns, and gives everyone greater confidence in our work.

Central to CI is the simple reality that the vast majority of code mistakes are much easier for us to make than they are for us to find, but those same mistakes can be found extremely quickly by computers. For instance, accidentally spelling const as cost or not consistently returning a value in a JavaScript function could completely break a build, and trying to find them manually might mean sifting through lines of code for hours.

All projects are required to run continuous integration checks, though the nature of the checks will depend on the project.

Teams should discuss and iterate on the design of their pipeline, but the general shape of it should be similar across all projects. This example provides a high-level overview of an end-to-end pipeline for a containerized application that is deployed to Kubernetes. It is meant to be informative, not normative.

Check the code - code quality checks, unit tests, or other checks that should be run on the code. In other words, if it doesn't require or benefit from a build to run, run it here.
Build the image - we build the code after it has been checked. Under most circumstances, this will mean both compiling/transpiling the code and containerizing it with docker build.
Check the image - jobs that require or benefit from being run on the container but don't need a full environment to run. For instance: container scanning, integration tests, secret detection, or static application security testing (SAST).
Publish the image - tag and push the image to the container registry. When this occurs, our continuous delivery system (Rancher Fleet) will also publish the image to the relevant environment.
Check the environment - end-to-end tests (typically authored by our QA team), accessibility tests, load performance tests, or any other tests that require a live environment to run.

If code changes make it through the entire pipeline without any errors, congrats 🎉, it may be safe to merge. Of course, teams may want to do additional manual checks after the pipeline succeeds.

Once confident that the change is good to go, merging it should run a pipeline that reruns all these stages and then deploys it to a higher environment such as production.

Tips for pipeline design

Run one task per job.
- Why: we want to have a sense for what's failing at a glance, and if we run multiple scripts in the same job, we'll have to open the job log to know which failed.
- For instance, run ESLint and Stylelint scripts in two separate jobs rather than one job with two scripts. Jobs are very cheap to run, so don't worry about optimizing for instance efficiency or costs at the pipeline/job level.
Batch similar jobs in stages.
- Why: jobs in the same stage will run in parallel, speeding up our overall pipeline.
Run stages from cheapest to most expensive, where expense is a heuristic for what's required for the job to run, as well as the time the jobs might take.
- Why: we want to fail as quickly as possible and catch easy errors like syntax errors before we run checks that may require more people, time, and work to fix.
- For instance, run code quality jobs like linting first since they only require source code to run, typically run very quickly, are easy to fix, and only the engineer who wrote the code needs to be involved in the fix. Conversely, run end-to-end tests last since they require the change to be deployed to a live environment, often require live external endpoints to be deployed as well, and may require additional team members to validate errors.
Keep it simple by using GitLab's CI syntax.
- Why: it may be easier to reach for advanced YAML language features or shell scripting to solve pipeline problems, but GitLab's CI syntax is robust enough to solve most problems on its own and using it will result in less technical debt than more complex solutions.
Borrow from GitLab's templates.
- Why: GitLab provides templates for a lot of jobs so that we can extend them, but since they're all open source, we can also use them as references for our own jobs and pipeline design.

Branch management

Starting in 2023¹, we will be retiring two branch management conventions that we have used in previous years.

We will no longer use the term "master" in projects, including as a branch name. Projects that use a master branch must switch to main.
- Why: "main" is shorter, more descriptive, and more accurate than "master," and it also brings us in line with Norton's commitment to inclusive practices, as well as the broader engineering community².
- If you manage a project that still uses master as a branch name:
  1. Update the repository's default branch name to main
  2. If they exist, make sure any external references are updated.
  3. Optionally run git config --global init.defaultBranch main to ensure that locally-created repositories start with main.
All projects that use Git flow must migrate to a simpler branching strategy to facilitate continuous delivery.
- Why: while Git flow has helped us keep our main branch clean for many years, it was created in 2010, before the existence of more robust solutions for protecting the primary branch, such as continuous integration checks, code owners, or even modern code reviews. In fact, even the author of Git flow no longer recommends it, stating, "If your team is doing continuous delivery of software, I would suggest to adopt a much simpler workflow (like GitHub flow) instead of trying to shoehorn git-flow into your team.""

Teams should use GitHub flow by default and use GitLab flow in those rare occasions where GitHub flow won't work. Standardizing this across teams enables us to use templates for CI/CD pipelines, and greatly simplifies the configuration of infrastructure that needs to connect to our repositories.

Teams must also follow a few simple conventions related to branches on GitLab:

The default branch must be named main².
- Tip: ensure that locally-initialized repositories use main with git config --global init.defaultBranch main.
The main branch must be protected.
Resist the urge to create long-lived branches. If they are needed, they must be protected.
- Why: we should maintain one single source of truth, the main branch. Keeping multiple sources of truth can undermine that, but may be necessary in some rare circumstances like when we need to support multiple public releases at the same time.
Once merged, branches should be deleted.
- Why: when we embrace main as our single source of truth, all other branches are simply ephemeral working spaces that we use to stage changes before merging them into main. The history and paper trail of that merge will be captured in the merge request and commit history, so there's no reason to keep the branch around.
- Tip: select the delete source branch when merge request accepted option in the merge request to automatically delete the branch when it's merged.
- Tip: use the delete merged branches feature in GitLab to prune branches that weren't deleted when they were merged.

Aside from main, there are no branch naming requirements, meaning teams can name their branches feature/awesome-service, just awesome-service, or even Jira issue keys like NCIA-1234 (note that the issue key is required in the merge request name).

Git flow adds process and merge complexity in an effort to keep the main branch clean. But with the introduction of CI/CD and required merge approvals, we can simplify our branching strategies to reduce complexity while still maintaining a clean main branch.

GitLab's Git flow and its problems provides a great summary of the complexity problems that arise with Git flow:

The second problem of Git flow is the complexity introduced by the hotfix and release branches. These branches can be a good idea for some organizations but are overkill for the vast majority of them. Nowadays, most organizations practice continuous delivery, which means that your default branch can be deployed. Continuous delivery removes the need for hotfix and release branches, including all the ceremony they introduce. An example of this ceremony is the merging back of release branches. Though specialized tools do exist to solve this, they require documentation and add complexity. Frequently, developers make mistakes such as merging changes only into main and not into the develop branch. The reason for these errors is that Git flow is too complicated for most use cases. For example, many projects do releases but don’t need to do hotfixes.

Merge requests

Merge requests (MR) are the heart and soul of the development process. They are the primary place we propose changes, discuss technical problems and implementation strategies, and collaborate on solutions before changes are merged into the main branch.

The following requirements are designed to facilitate effective code reviews.

Every change must be reviewed and approved.
- This is enabled in GitLab, meaning it will be impossible to merge any change until it has been reviewed and approved.
- This applies to trivial changes as well as significant ones.
Merge request names must include a short title and the Jira issue key.
- The short title helps the reviewer understand it at a glance and should briefly describe the change in the author's own words.
- The Jira issue key is required to connect the MR to Jira via the Jira/GitLab integration. It can be anywhere in the name.
- Example: "Add trial access expired modal | NERD-1530" or "SW5-5675: fix multi-step question module".
Describe the changes in the description.
- Provide details that anticipate the code review, such as why specific choices were made and what alternatives were considered.
- Focus on the intent and design of the changes, not the details. The changes view provides the details.
Request a review as soon as there is code to review.
- The bar is working code, so resist the urge to wait until it is "perfect."
- Just as we favor fast continuous feedback through CI, we should favor quick, small reviews.
Mark work-in-progress merge requests as a draft.
- This will signal to teammates that we're still working on the changes.
Use merge request templates to provide more clarity about expectations.
- Teams that have a high volume of merge requests or many contributors may want to consider using templates to streamline the MR process.

Reviewing code

Code reviews are the primary way we collaborate with each other on technical details. They are an asynchronous way to communicate—that is, they don't require the participants to work together at the same time. We favor this method of review because it is more remote-friendly, flexible, inclusive, and functions as a record of the exchange and process.

Synchronous methods of reviewing or discussing code like hopping on a call to walk through it, virtual whiteboarding, or remote pair programming are encouraged, but teams must also capture comments in the code review.

We've been set as a reviewer in a merge request and we're ready to provide some feedback so we can approve it, merge it, and close the issue! 🚀 How should we proceed?

Read review a merge request to learn how reviews work in GitLab.
Understanding the changes tab and how to comment on lines of code are especially important.
Thank the contributor for their work.
This helps set a friendly tone—critically important when we're providing feedback to others—and it costs us next to nothing.
If the pipeline is failing, ask them to make it pass before looking at the code.
We shouldn't waste our time or the contributor's time on feedback that's already provided by the pipeline. Asking them to fix it will also help set clear expectations.
Provide feedback on code design, not syntax.
Let the continuous integration checks find the easy stuff, instead prioritizing more high-level concerns, like suggesting alternative methods or design patterns, or asking for more or less abstraction.
Only provide feedback on style if it is documented ahead of time.
If we expect contributors to follow a certain code style, that style must be documented in the repository (for instance, in the readme or in ESLint rules). Asking contributors to follow an undocumented style will feel like we're asking them to guess at our standards.
Try to frame feedback as questions, not suggestions.
Prompting the contributor to respond to questions gives them an opportunity to solve the problem themselves. Provide suggestions after they've responded.
Low-effort feedback is always better than no feedback. And sometimes it's the perfect amount of feedback!
Using the reaction feature, posting an emoji, and shorthands like "LGTM" (looks good to me) may be enough for very simple merge requests, such as a single character change. But it's always helpful to give some human feedback beyond clicking the approval button.
Only approve the MR when the review is done.
A review is done when all threads have been resolved and the we are happy with the changes.

Commits

Every time you perform a commit, you're recording a snapshot of your project that you can revert to or compare to later.
— The official Git book

Commits are one of the most powerful tools for documenting our code, and we should treat them as a fundamental part of our work. They are the lowest level of work that we describe within our development cycles, telling the story about how our code has changed over time.

These guidelines are designed to help us use commits effectively.

Commit early and often.
- It's better to have a verbose commit history than a sparse one.
- We can always squash commits later to combine them into one message if necessary.
Commit messages should be an accurate, clear, and brief (in that order) synopsis of the change.
- It should be up to the team to set their own expectations beyond just accuracy, clarity, and brevity because there are a lot opinions and ways to do commits well³, and there are even commit standards that teams can follow if they choose.
- Beware of commit anti-patterns.
Commit messages should only describe one change. More often than not, it takes multiple code changes to complete a ticket. Rather than committing everything all at once, stage and commit related changes together, but don't include unrelated changes in the same commit.
- This can be incredibly difficult but is always worth doing. For instance, a change to fix a modal dialog might require changes in CSS and React. Should we commit those changes together or separately? Reasonable people will disagree. It's not important that we get it "right"—it's important that we make an attempt to isolate our changes.
- Tip: Visual Studio Code has the stage selected ranges function, giving us the ability to stage and commit only a portion of a file's changes rather than the whole file.
Commit messages should uniquely describe the change. Duplicate commit messages make it difficult to understand how the code changed over time, and especially if those messages happen in succession.
Don't reference tickets in the commit message.
- We don't use smart commits because commits should independently tell a story of how code changes. In other words, if we need to know anything about the ticket in order to understand the commit, it is not a good commit message.
- Use the merge request to connect the ticket to the branch.
Use the commit body for explanation and detail.
- The commit message should be short and descriptive, whereas explanation typically requires more space and will be better-suited in the commit body.
- Read My favorite Git commit for a great example of using the commit body well.

Avoid the following commit conventions, which either add little value or actually make it more difficult to understand a repository's commit history. Maintainers who see commits like these should ask the contributor to reword the commit or revert it and try again.

Don't include redundant information in the message. Commit message space is precious and should be dense with useful information, not throwaway content. Examples of redundant information include:
- "code changes for x" - it's obviously a code change. What's it changing?
- "updated x {file,function,variable,etc.}" - the fact that a specific item was changed is already captured in the changes and the diff, so there's no reason to also say it in the message.
- "resolve JIRA-123 - tickets are resolved at the branch/merge-request level, not the commit level. This requires readers to either know the ticket numbers or click through to understand the change when reviewing the commit history of a project.
Don't use the same commit message multiple times in a row.
- It may be okay to use the exact same commit message for common spaced-out tasks like monthly dependency updates, but it is never okay to use the same message multiple times in a row.
- Think of commits like points in a timeline. If two points are the same, it's impossible to know how or if they're different without clicking into each.
Don't commit unrelated changes in the same commit.
- This makes it difficult to review the changes, and makes it especially difficult to revert or modify specific parts of the changes.

Code libraries

Code libraries (or simply libraries) are projects that are installed by other projects rather than being hosted on a server. They are typically used by many teams to accomplish functions that benefit from standardization, such as the React implementation of the Norton Design System that allows teams to use a common library of official Norton components.

Unlike all other repositories that a team might create in support of their product, teams must consult engineering management before they create a library to align around expectations.

All libraries must:

Use semantic versioning via Git tags.
- Node.js libraries must use the package.json version key in addition to Git tags.
- The versioning & release process should be automated with common tooling.
Maintain a changelog.
- It is highly recommend that this is automated through conventions and tooling such as conventional commits and semantic-release.
Have full code coverage in their tests before major releases (e.g., 1.0.0 or 2.0.0).
- Frontend libraries should strive for 90%+ coverage, and backend libraries should strive for 100% coverage.
Present a plan for publication and maintenance that describes who the users of the library will be, as well as how and when releases will occur.
- An engineering manager will review this plan.
- Open source Node.js libraries should be published to the public NPM registry under the @wwnorton scope, and packages that only Norton staff will use should be published to our private NPM registry.

The exact date when these changes will be required is still to be determined, and will be set once teams have had the opportunity to review the new standards. ↩
GitLab, GitHub, and Atlassian, have already switched to using “main” as their default branch name when creating a new project through their interface. The Git maintainers have also announced that they intend to change the initial branch name (related patches: addition of init.defaultBranch, test updates to use "main", updates to the Git documentation). ↩ ↩²
A lot has been written about commit messages. Here are some places to begin:
↩

Tending the garden​

GitLab​

Standards​

Code ownership​

Continuous integration checks​

Tips for pipeline design​

Branch management​

Merge requests​

Reviewing code​

Commits​

Code libraries​

Footnotes​