Managing code
One of the core functions of the engineering group is to mange the code that powers our systems and applications. This is different from the day-to-day feature development and bug fixes, as it is work that won't be ticketed unless we identify and address it.
This page outlines how code is managed at Norton.
Tending the garden
There's no "right" way to manage code, but gardening can be a helpful metaphor and mental model for how to cultivate a healthy codebase. 🧑🌾🍅🥕
1. Design for growth
2. Feed regularly
3. Prune overgrowth
4. Protect from the elements
5. Know when to pull it all up and start over
GitLab
We use GitLab for source code management. Because it is a paid service, membership is limited to staff who are doing active work with code. Please reach out to the Director of Engineering (Evan Yamanishi) if you believe you or one of your team members should have access.
GitLab organizes projects like a file tree, with groups functioning like folders and projects functioning like files. Our GitLab space is organized around the principles of domain-driven design and usability. The top-level GitLab groups are organized according to type of project (the domain), not business unit ("department"), meaning no projects will ever be part of the "Engineering" group, even if they are managed by us. The top-level groups that are most relevant to engineers are:
- Applications - our official products/applications. This is where we do most of our work.
- Operations - projects that help keep things running.
- The Lab - unofficial, experimental, or exploratory projects. Projects in here may "graduate" to the Applications space.
- Groups - a group of groups that contains no actual code or projects. This is how we define and manage our business units in GitLab, which we can use to simplify member management. When new members are added to GitLab, they should be added to one of these groups.
Standards
While we prioritize team autonomy over uniformity to ensure that teams can focus on making it work, there are times when doing things in a standard way can help improve efficiency, reduce complexity, promote a shared understanding, and create greater confidence in our engineering practices.
The following standards are designed to address each of these concerns, as well as enabling our continuous delivery practice. All engineers are expected to follow these standards, which use key words for requirements like "must" and "should" to convey the level of requirement where appropriate.
Code ownership
Code owners are required for all projects in the applications and operations groups on GitLab. This feature helps us avoid a whole range of diffusion of responsibility problems, but it more specifically enables us to require approvals on protected branches. This also help discourage the creation of code with no clear maintenance plan or ownership structure, and will help highlight projects might need to be moved or retired.
- Repositories should have more than one code owner. This helps avoid single-person bottlenecks and keeps us honest.
- Reach out to your manager if you're not sure who else can or should manage a project that you maintain.
- Leads are responsible for the
CODEOWNERSfile in their projects.- Lead software engineers ("tech leads") should work with their team to determine who should own what parts of a codebase. Leads should own the code by default.
- Code owners don't need to be part of the core team that maintains the parent project.
- If it makes sense for someone from another team to own a file, directory, or whole repository, they should be listed as an owner.
- For instance, if nobody on the team is well versed in Docker, it might make sense to have two code owners for the
Dockerfile: one person from the core team, and a Docker expert from another team.
Continuous integration checks
Continuous integration, or CI, is the process of running code through checks every time a change is made (continuously!). It is the first half of CI/CD, or continuous integration and continuous delivery/deployment.
Checking our code continuously helps us catch issues faster, allows us to focus on higher-value concerns, and gives everyone greater confidence in our work.
Central to CI is the simple reality that the vast majority of code mistakes are much easier for us to make than they are for us to find, but those same mistakes can be found extremely quickly by computers.
For instance, accidentally spelling const as cost or not consistently returning a value in a JavaScript function could completely break a build, and trying to find them manually might mean sifting through lines of code for hours.
All projects are required to run continuous integration checks, though the nature of the checks will depend on the project.
Tips for pipeline design
- Run one task per job.
- Why: we want to have a sense for what's failing at a glance, and if we run multiple scripts in the same job, we'll have to open the job log to know which failed.
- For instance, run ESLint and Stylelint scripts in two separate jobs rather than one job with two scripts. Jobs are very cheap to run, so don't worry about optimizing for instance efficiency or costs at the pipeline/job level.
- Batch similar jobs in stages.
- Why: jobs in the same stage will run in parallel, speeding up our overall pipeline.
- Run stages from cheapest to most expensive, where expense is a heuristic for what's required for the job to run, as well as the time the jobs might take.
- Why: we want to fail as quickly as possible and catch easy errors like syntax errors before we run checks that may require more people, time, and work to fix.
- For instance, run code quality jobs like linting first since they only require source code to run, typically run very quickly, are easy to fix, and only the engineer who wrote the code needs to be involved in the fix. Conversely, run end-to-end tests last since they require the change to be deployed to a live environment, often require live external endpoints to be deployed as well, and may require additional team members to validate errors.
- Keep it simple by using GitLab's CI syntax.
- Why: it may be easier to reach for advanced YAML language features or shell scripting to solve pipeline problems, but GitLab's CI syntax is robust enough to solve most problems on its own and using it will result in less technical debt than more complex solutions.
- Borrow from GitLab's templates.
- Why: GitLab provides templates for a lot of jobs so that we can extend them, but since they're all open source, we can also use them as references for our own jobs and pipeline design.
Branch management
Teams should use GitHub flow by default and use GitLab flow in those rare occasions where GitHub flow won't work. Standardizing this across teams enables us to use templates for CI/CD pipelines, and greatly simplifies the configuration of infrastructure that needs to connect to our repositories.
Teams must also follow a few simple conventions related to branches on GitLab:
- The default branch must be named
main2.- Tip: ensure that locally-initialized repositories use
mainwithgit config --global init.defaultBranch main.
- Tip: ensure that locally-initialized repositories use
- The
mainbranch must be protected. - Resist the urge to create long-lived branches. If they are needed, they must be protected.
- Why: we should maintain one single source of truth, the
mainbranch. Keeping multiple sources of truth can undermine that, but may be necessary in some rare circumstances like when we need to support multiple public releases at the same time.
- Why: we should maintain one single source of truth, the
- Once merged, branches should be deleted.
- Why: when we embrace
mainas our single source of truth, all other branches are simply ephemeral working spaces that we use to stage changes before merging them intomain. The history and paper trail of that merge will be captured in the merge request and commit history, so there's no reason to keep the branch around. - Tip: select the delete source branch when merge request accepted option in the merge request to automatically delete the branch when it's merged.
- Tip: use the delete merged branches feature in GitLab to prune branches that weren't deleted when they were merged.
- Why: when we embrace
Aside from main, there are no branch naming requirements, meaning teams can name their branches feature/awesome-service, just awesome-service, or even Jira issue keys like NCIA-1234 (note that the issue key is required in the merge request name).
Merge requests
Merge requests (MR) are the heart and soul of the development process.
They are the primary place we propose changes, discuss technical problems and implementation strategies, and collaborate on solutions before changes are merged into the main branch.
The following requirements are designed to facilitate effective code reviews.
- Every change must be reviewed and approved.
- This is enabled in GitLab, meaning it will be impossible to merge any change until it has been reviewed and approved.
- This applies to trivial changes as well as significant ones.
- Merge request names must include a short title and the Jira issue key.
- The short title helps the reviewer understand it at a glance and should briefly describe the change in the author's own words.
- The Jira issue key is required to connect the MR to Jira via the Jira/GitLab integration. It can be anywhere in the name.
- Example:
"Add trial access expired modal | NERD-1530"or"SW5-5675: fix multi-step question module".
- Describe the changes in the description.
- Provide details that anticipate the code review, such as why specific choices were made and what alternatives were considered.
- Focus on the intent and design of the changes, not the details. The changes view provides the details.
- Request a review as soon as there is code to review.
- The bar is working code, so resist the urge to wait until it is "perfect."
- Just as we favor fast continuous feedback through CI, we should favor quick, small reviews.
- Mark work-in-progress merge requests as a draft.
- This will signal to teammates that we're still working on the changes.
- Use merge request templates to provide more clarity about expectations.
- Teams that have a high volume of merge requests or many contributors may want to consider using templates to streamline the MR process.
Reviewing code
Code reviews are the primary way we collaborate with each other on technical details. They are an asynchronous way to communicate—that is, they don't require the participants to work together at the same time. We favor this method of review because it is more remote-friendly, flexible, inclusive, and functions as a record of the exchange and process.
Synchronous methods of reviewing or discussing code like hopping on a call to walk through it, virtual whiteboarding, or remote pair programming are encouraged, but teams must also capture comments in the code review.
Commits
Every time you perform a commit, you're recording a snapshot of your project that you can revert to or compare to later.
— The official Git book
Commits are one of the most powerful tools for documenting our code, and we should treat them as a fundamental part of our work. They are the lowest level of work that we describe within our development cycles, telling the story about how our code has changed over time.
These guidelines are designed to help us use commits effectively.
- Commit early and often.
- It's better to have a verbose commit history than a sparse one.
- We can always squash commits later to combine them into one message if necessary.
- Commit messages should be an accurate, clear, and brief (in that order) synopsis of the change.
- It should be up to the team to set their own expectations beyond just accuracy, clarity, and brevity because there are a lot opinions and ways to do commits well3, and there are even commit standards that teams can follow if they choose.
- Beware of commit anti-patterns.
- Commit messages should only describe one change. More often than not, it takes multiple code changes to complete a ticket. Rather than committing everything all at once, stage and commit related changes together, but don't include unrelated changes in the same commit.
- This can be incredibly difficult but is always worth doing. For instance, a change to fix a modal dialog might require changes in CSS and React. Should we commit those changes together or separately? Reasonable people will disagree. It's not important that we get it "right"—it's important that we make an attempt to isolate our changes.
- Tip: Visual Studio Code has the stage selected ranges function, giving us the ability to stage and commit only a portion of a file's changes rather than the whole file.
- Commit messages should uniquely describe the change. Duplicate commit messages make it difficult to understand how the code changed over time, and especially if those messages happen in succession.
- Don't reference tickets in the commit message.
- We don't use smart commits because commits should independently tell a story of how code changes. In other words, if we need to know anything about the ticket in order to understand the commit, it is not a good commit message.
- Use the merge request to connect the ticket to the branch.
- Use the commit body for explanation and detail.
- The commit message should be short and descriptive, whereas explanation typically requires more space and will be better-suited in the commit body.
- Read My favorite Git commit for a great example of using the commit body well.
Code libraries
Code libraries (or simply libraries) are projects that are installed by other projects rather than being hosted on a server. They are typically used by many teams to accomplish functions that benefit from standardization, such as the React implementation of the Norton Design System that allows teams to use a common library of official Norton components.
Unlike all other repositories that a team might create in support of their product, teams must consult engineering management before they create a library to align around expectations.
All libraries must:
- Use semantic versioning via Git tags.
- Node.js libraries must use the package.json version key in addition to Git tags.
- The versioning & release process should be automated with common tooling.
- Maintain a changelog.
- It is highly recommend that this is automated through conventions and tooling such as conventional commits and semantic-release.
- Have full code coverage in their tests before major releases (e.g.,
1.0.0or2.0.0).- Frontend libraries should strive for 90%+ coverage, and backend libraries should strive for 100% coverage.
- Present a plan for publication and maintenance that describes who the users of the library will be, as well as how and when releases will occur.
- An engineering manager will review this plan.
- Open source Node.js libraries should be published to the public NPM registry under the
@wwnortonscope, and packages that only Norton staff will use should be published to our private NPM registry.
Footnotes
-
The exact date when these changes will be required is still to be determined, and will be set once teams have had the opportunity to review the new standards. ↩
-
GitLab, GitHub, and Atlassian, have already switched to using “main” as their default branch name when creating a new project through their interface. The Git maintainers have also announced that they intend to change the initial branch name (related patches: addition of init.defaultBranch, test updates to use "main", updates to the Git documentation). ↩ ↩2
-
A lot has been written about commit messages. Here are some places to begin:
- Commit Often, Perfect Later, Publish Once: Git Best Practices
- How to Write a Git Commit Message
- Zen and the art of writing good commit messages