Chapter 5

Version Control & Collaboration

You've been here before.

It's 11 PM, and you're working on a presentation. You save it as pitch_deck_final.pptx. Then your cofounder sends notes. You make changes: pitch_deck_final_v2.pptx. Your designer updates three slides: pitch_deck_final_v2_Priya.pptx. You merge their changes with yours, but accidentally overwrite your own edits. Panic. You dig through your Downloads folder and find pitch_deck_final_v2_Priya_BACKUP.pptx, except that one is from Tuesday and missing everything you did on Wednesday.

Now imagine this happening with code — not one file, but hundreds. Not two people, but ten thousand.

This is the problem version control solves. And the tool that solved it — Git — is the single most important collaboration tool in software history. Not Slack. Not Jira. Not email. Git.

The "Final Version" Problem

Every profession hits this wall eventually.

Architects pass AutoCAD files back and forth. Lawyers send contract_v7_reviewed_final_SIGNED_v2.docx to each other. Filmmakers manage hundreds of gigabytes of footage across drives. Everyone invents their own janky system: numbered files, date-stamped folders, color-coded naming conventions that make sense to exactly one person for exactly three days.

These systems all share the same fatal flaw: they track files, not changes. When you save pitch_deck_v3.pptx, you know it's newer than v2, but you don't know what changed. Was it one slide or twenty? Did someone rewrite the financial projections or fix a typo?

Analogy

Imagine writing a novel in a notebook. Every time you make an edit, you photocopy the entire notebook and put a number on it. After a month, you have 47 photocopies. Someone asks: "What did you change between copy 23 and copy 24?" You have no idea — you'd have to compare both copies page by page.

Now imagine a different system. Every time you make an edit, you write a short note on a sticky label: "Rewrote Chapter 3 ending — the villain now escapes." You stick that label on the page. The notebook tracks itself. You can flip to any point in time and see exactly what changed and why.

That second system is version control.

Git: Time Travel for Your Project

Git is a version control system — a tool that tracks every change ever made to every file in a project, who made it, when, and why. It was created in 2005, and today it is used by virtually every software team on Earth. There is no close second.

Real-Life Example

When Google Docs shows you "Version History" and lets you click on any past state of your document — that's version control. Git does the same thing, but for entire projects with thousands of files, and with far more power.

Git runs on your computer. It's not a website, not an app, not a cloud service. It's a program you install that watches a folder and remembers everything.

Let's build up the core concepts one at a time.

Concept 1: The Repository

A repository (or repo) is a project folder that Git is tracking. Any folder can become a repository. When you tell Git to start tracking a folder, it creates a hidden subfolder called .git inside it. That hidden folder contains the entire history of every change ever made.

Analogy

A repo is a notebook with an invisible journal embedded in its spine. The notebook looks normal — your files are right there, like pages. But hidden in the spine is a complete record of every edit, every crossed-out sentence, every inserted paragraph, going all the way back to the first word.

Concept 2: The Commit — Saving Your Game

A commit is a snapshot of your entire project at a specific moment in time, plus a message describing what changed.

When you play a video game and reach a checkpoint, the game saves your exact state: your position, your health, your inventory. If you die later, you can reload that checkpoint and try again from that exact spot.

A commit works the same way. You make some changes to your code — fix a bug, add a feature, rewrite a function — and then you commit. Git takes a snapshot of every file in your project at that moment. You write a short message: "Fix login button not responding on mobile." That snapshot is now permanently saved. You can always return to it.

commit a3f7b2c — "Fix login button not responding on mobile"
commit 91e4d8a — "Add dark mode toggle to settings page"
commit 5c0f123 — "Initial project setup"

Each commit gets a unique ID (that string of letters and numbers). Each one is a save point. You can jump to any of them at any time.

Why This Exists

A commit is not "saving a file." You save files all the time — Cmd+S, Ctrl+S, auto-save. A commit is something more deliberate. It's you saying: "This set of changes is complete and meaningful. I want to remember this moment." Think of it as putting a bookmark in the history of your project.

Here's what makes commits powerful: they don't store full copies of every file. Git is smarter than that. It stores only the differences — what changed between this commit and the previous one. This is called a diff (short for difference). A diff might say: "In line 47 of login.js, the word password was changed to hashedPassword." This makes Git incredibly efficient. A project with 10,000 commits doesn't take 10,000 times the disk space. It barely takes more than the current version.

Concept 3: The Branch — Parallel Universes

Here's where Git goes from useful to extraordinary.

A branch is a separate line of development — an independent copy of your project where you can make changes without affecting the original.

Analogy

You're reading a "choose your own adventure" book. At page 50, the story splits: "If you enter the cave, turn to page 72. If you follow the river, turn to page 88." Each path is a different branch. The story up to page 50 is shared, but after the split, each path evolves independently. And here's the magic — you can read both paths and later decide which one becomes the "real" story.

In practice, this is how branches work:

Your project has a default branch called main (some older projects call it master). This is the "official" version — the code that's live, that users interact with, that works.

When you want to add a new feature — say, a dark mode toggle — you don't modify main directly. That would be reckless, like performing surgery on a patient who's running a marathon. Instead, you create a new branch: feature/dark-mode. This branch starts as an exact copy of main, but now you can make changes freely. If you break something, main is untouched. If your experiment doesn't work out, you delete the branch and nothing happened.

Here's what this looks like over time:

main:        ──●──●──●──────────────●──●
                    \               /
dark-mode:           ●──●──●──●──●

Each ● is a commit. The main branch keeps moving forward. At some point, you branch off to work on dark mode. You make five commits on that branch. When the feature is ready and tested, you merge it back into main. The two lines converge. The dark mode code is now part of the official project.

Real-Life Example

Spotify's mobile app has hundreds of engineers working simultaneously. At any given moment, there might be 200+ active branches — one for a new playlist feature, one for a podcast UI redesign, one for fixing a memory leak on Android. Each team works in isolation on their branch. Nobody steps on anyone else's toes. When a feature is ready, it merges into main and eventually reaches your phone.

Concept 4: The Merge — Combining Universes

A merge is the act of taking changes from one branch and incorporating them into another.

Most merges are uneventful. Git looks at both branches, sees that they changed different files (or different parts of the same file), and combines them automatically. Your dark mode branch added a toggle in the settings page. Meanwhile, someone else on main fixed a typo in the footer. These changes don't overlap, so Git merges them without breaking a sweat.

But sometimes two branches change the same lines of the same file. When this happens, Git can't decide which version to keep. This is called a merge conflict — and Git stops and asks you to resolve it manually.

Analogy

Two chefs are working on the same recipe. One changes "2 teaspoons of salt" to "1 teaspoon of salt." The other changes the same line to "2 teaspoons of sea salt." A merge conflict is Git saying: "Both of you edited this line. I don't know which one you want. You two figure it out."

Merge conflicts sound scary but they're routine. Modern code editors highlight the conflicting sections and let you pick one version, the other, or a combination. Experienced engineers encounter them daily. The key insight: merge conflicts are a feature, not a bug. They prevent one person's work from silently overwriting another's.

Concept 5: The Pull Request — Review Before Merging

A pull request (often abbreviated as PR) is a formal proposal to merge one branch into another. It's where collaboration happens.

Instead of merging your dark mode branch directly into main, you open a pull request. This does three things:

Shows the diff: Everyone on the team can see exactly what you changed — every line added, removed, or modified.
Enables code review: Your teammates read your changes and leave comments. "This function could be simpler." "You forgot to handle the case where the user hasn't set a preference." "Nice approach — I learned something."
Runs automated checks: Tests run automatically to verify your changes don't break anything (more on this soon).

Why This Exists

A pull request is like submitting a draft of an article to an editor. You're not publishing it directly. You're saying: "Here's what I wrote. Please review it before it goes live." The editor might approve it, request changes, or catch a factual error you missed. The article gets better through the process.

Here's the full lifecycle, visualized:

    1. Create branch         2. Write code          3. Open PR
    ─────────────────       ──────────────        ──────────────
    main ──●──●              main ──●──●           main ──●──●
               \                        \                     \
    feature     ●            feature     ●──●──●   feature     ●──●──●
                                                              ↑
                                                         PR: "Add dark mode"
                                                         Review + Tests

    4. Review & approve      5. Merge               6. Done
    ──────────────────      ──────────────          ──────────────
    main ──●──●              main ──●──●────●       main ──●──●────●
               \                        \  /
    feature     ●──●──●      feature     ●──●──●    (branch deleted)
               ↑
          "LGTM! Ship it"

LGTM stands for "Looks Good To Me" — the universal approval signal in code review.

GitHub: Where Git Meets the World

Git runs on your computer. GitHub is a website (github.com) that hosts Git repositories in the cloud and adds collaboration features on top.

Think of it this way: Git is the engine. GitHub is the car built around it — with a dashboard, GPS, heated seats, and a place for passengers.

Analogy

Git is like writing in your personal journal. GitHub is like publishing that journal on a platform where others can read it, suggest edits, report typos, and even write new chapters that you can choose to include.

GitHub is where you push code (upload commits to the cloud), pull other people's changes (download their latest work), open pull requests, track issues (bugs, feature requests, tasks), host documentation, and run automated pipelines (tests, deployments, security scans).

GitLab and Bitbucket are alternatives, but GitHub dominates: over 100 million developers use it, hosting more than 420 million repositories. When someone says "put it on GitHub," they mean "make the code available for the world (or your team) to see."

Real-Life Example

When engineers interview at most tech companies, the interviewer often checks their GitHub profile. Not to judge the quantity of code, but to see how they write commit messages, how they review others' pull requests, and how they collaborate. GitHub is, in a real sense, a social network for builders.

Microsoft bought GitHub in 2018 for $7.5 billion — making it, at the time, their third-largest acquisition ever. The bet: if every developer in the world uses GitHub, the enterprise contracts will follow. They were right.

How 10,000 Engineers Work on the Same Code

Now scale this up. Way up.

Google has a single repository containing billions of lines of code. Search, Gmail, Maps, YouTube, Android — all of it lives in one massive codebase. This approach is called a monorepo (mono = one, repo = repository).

How does this not descend into chaos?

Three systems work together:

1. Branching rules and code ownership. Every directory in Google's codebase has designated owners — engineers or teams responsible for that code. If you want to change something in the Maps codebase, a Maps engineer must approve your pull request. You can't sneak changes into code you don't own.

2. Continuous Integration (CI). Every time someone proposes a change (opens a pull request), an automated system builds the entire project and runs thousands of tests. If any test fails, the change is blocked. You cannot merge broken code. Period.

Analogy

Imagine a factory assembly line with quality inspectors at every station. A worker can't pass a part to the next station until the inspector verifies it meets spec. CI is the automated inspector — it checks every change before it enters the codebase.

3. Trunk-based development. Instead of long-lived feature branches that diverge for weeks, Google engineers make small, frequent changes directly to the main branch (the "trunk"). Changes are tiny — sometimes a single line. This minimizes merge conflicts because everyone is working on roughly the same version of the code at all times.

Why This Exists

The longer two branches stay apart, the harder they are to merge. It's like two people writing different sections of a group essay without checking in. If they sync up every day, the final merge is easy. If they wait until the deadline, they'll discover they both wrote an introduction and defined key terms differently.

The numbers are staggering: Google's monorepo handles around 60,000+ commits per day across 25,000+ engineers, managed by a custom system called Piper.

The lesson for PMs: the technical infrastructure of collaboration determines the speed of product development. A team's ability to ship features fast is directly tied to how well their version control, code review, and CI systems work. When an engineer says "our merge queue is backed up" or "CI is slow," they're telling you that the collaboration infrastructure is a bottleneck. This is a product problem, not a "technical" problem. It affects your roadmap.

Open Source: The Movement That Changed Everything

In 1991, a 21-year-old Finnish computer science student named Linus Torvalds posted a message to an internet forum: "I'm doing a (free) operating system (just a hobby, won't be big and professional)."

That hobby project became Linux — and it quietly took over the world.

Today, Linux powers:

100% of the world's top 500 supercomputers
Over 96% of the world's top 1 million web servers
Every Android phone ever made (Android is built on the Linux kernel)
Every major cloud provider (AWS, Google Cloud, Azure)
The International Space Station
The Mars Ingenuity helicopter

Linux is open source, which means its source code is publicly available. Anyone can read it, modify it, use it, and contribute improvements back. This seems like a strange way to build software — giving it away for free — but it turns out to be the most powerful development model ever invented.

Analogy

Imagine someone publishes a recipe for the world's best bread. Instead of keeping it secret, they post it online and say: "Anyone can bake this bread. If you figure out how to make it better, share your improvement, and we'll update the recipe." Over 30 years, ten thousand bakers from eighty countries contribute improvements. The bread becomes something no single baker could have created alone. That's open source.

The Linux kernel today has received contributions from over 20,000 developers across more than 1,700 companies. Competitors like Google, Microsoft, Intel, Red Hat, and Samsung all contribute code to the same project. Why would competitors collaborate? Because Linux is infrastructure — it's the foundation everything else is built on. Having a better foundation benefits everyone.

Open source extends far beyond Linux. The tools from previous chapters — React, Node.js, PostgreSQL, VS Code, Kubernetes, PyTorch — are all open source. The business model: the core software is free. Companies make money by selling hosting, support, or managed versions. MongoDB gives away its database engine; it makes money from MongoDB Atlas, the hosted version. This is why Git and GitHub matter so much — open source lives there. Contributing to open source, opening issues, reviewing code, and improving documentation are all done through pull requests.

Real-Life Example

When a junior developer at a startup in Lagos finds a bug in a tool maintained by a team in Stockholm, they can fix it themselves — fork the repository, make the change, open a pull request. If the maintainers accept it, that fix becomes part of the tool for every user worldwide. No meetings. No approval chains. No corporate hierarchy. The code speaks for itself.

For PMs, understanding open source is critical because your product's dependencies are almost certainly open source. If a vulnerability is discovered in an open-source library your app uses, your team needs to update it — fast. In 2014, a bug in OpenSSL called Heartbleed affected an estimated 17% of all secure web servers globally. One library. One bug. Millions of servers exposed.

The Origin Story: Two Weeks of Fury

The story of Git itself is a lesson in how constraints produce innovation.

Throughout the early 2000s, the Linux kernel project used a proprietary version control system called BitKeeper. In 2005, a licensing dispute ended the free arrangement. Torvalds was furious. He looked at every existing alternative and found them all inadequate: too slow, too centralized, too fragile.

So he built his own. In two weeks.

Between April 3 and April 18, 2005, Torvalds created Git — a distributed version control system designed for speed, data integrity, and thousands of simultaneous contributors. By April 18, Git was managing the Linux kernel itself.

Why This Exists

"Distributed" means every developer has a complete copy of the repository on their own machine — the entire history, every branch, every commit. There's no single server that everything depends on. If GitHub goes down, every developer who has cloned the repository still has a full backup. This is unlike older systems like Subversion (SVN), where a single central server held the "truth" and if it went down, everyone was stuck.

Torvalds named it with characteristic humor: "I'm an egotistical bastard, and I name all my projects after myself. First 'Linux', now 'Git'." (In British slang, "git" means an unpleasant person.)

The decisions he made in those two weeks still define how software teams work today:

Commits are cryptographically hashed: Each commit's ID is generated by running the content through a mathematical function (SHA-1). If even one character changes, the entire ID changes. This means it's impossible to tamper with the history without detection — a property that matters when 20,000 strangers are contributing code to your operating system.
Branching is nearly free: In older systems, creating a branch meant copying the entire codebase. In Git, a branch is a lightweight pointer — a 41-byte file. This is why Git encourages branching for everything, and why modern workflows create dozens of branches per day.
Speed over everything: Most Git operations — viewing history, switching branches, comparing versions — happen in milliseconds because they work on local data. No network required.

Putting It All Together: A Day in the Life

Let's trace how all these concepts connect in a real engineering workflow. This is what happens when an engineer on your team fixes a bug:

Morning: The engineer pulls the latest code from GitHub — downloading any changes their teammates pushed overnight.

git pull origin main

9:15 AM: They create a new branch for the bug fix.

git checkout -b fix/checkout-crash-on-empty-cart

10:30 AM: The fix is working. They stage the changed files and commit.

git add src/checkout.js src/cart.js
git commit -m "Fix crash when user clicks checkout with empty cart"

10:45 AM: They push the branch to GitHub.

git push origin fix/checkout-crash-on-empty-cart

10:50 AM: They open a pull request on GitHub. The title says "Fix checkout crash on empty cart." The description explains the root cause and the solution. Automated tests begin running.

11:30 AM: A teammate reviews the PR. They leave two comments: one suggesting a more robust check, one saying "Nice catch on this edge case." The engineer makes the suggested change, commits again, and pushes.

12:00 PM: All tests pass. The reviewer approves. The PR is merged into main. The bug fix is now part of the codebase. The branch is deleted — its work is done.

12:05 PM: A deployment pipeline automatically picks up the new code in main, runs final tests, and deploys to production. Users who were hitting the crash an hour ago no longer experience it.

Total time from bug to fix in production: three hours. No meetings. No file attachments. No "which version is the latest?" Every step recorded, reviewable, and reversible.

What PMs Need to Know (That Most Don't)

You don't need to memorize Git commands. But understanding these concepts will make you a fundamentally better product manager:

1. Read the pull request, not the ticket. When you want to understand what actually shipped, look at the PR — the diff shows every line that changed. Tickets describe intent. PRs describe reality.

2. Branch names tell you what's in flight. If you look at your team's active branches, you see a real-time map of what everyone is working on. Five branches prefixed with fix/ means five active bug fixes. A branch called experiment/new-onboarding-flow means someone is prototyping. This is more honest than any standup.

3. Commit frequency signals health. A healthy team produces many small commits — each one a focused change. Long gaps between commits often signal that someone is stuck, the problem is more complex than estimated, or the task was scoped too broadly.

4. CI failures are diagnostic gold. When the automated test suite catches a bug before it reaches users, that's the system working. When CI is slow or flaky (tests that sometimes pass, sometimes fail, without any code changes), it's a systemic problem that slows the entire team. If your engineers complain about CI, take it seriously.

5. Git history is your audit trail. If you ever need to understand "when did this feature change?" or "who decided to remove that button?", the Git log has the answer. Every change, every author, every timestamp, every reason (if the commit messages are well-written).

Exercise: Your First Repository

This exercise walks you through the full Git workflow — creating a repository, making changes, committing, branching, merging, and pushing to GitHub. You'll need Git installed (see Chapter 0.2) and a free GitHub account.

Step 1: Create a project folder and initialize Git.

mkdir my-first-repo
cd my-first-repo
git init

You'll see: Initialized empty Git repository. Git is now watching this folder.

Step 2: Create a file and make your first commit.

echo "# My First Repo" > README.md
git add README.md
git commit -m "Initial commit: add README"

git add stages files for the next snapshot. git commit takes it.

Step 3: Make a change and commit again.

echo "This is a project where I learn Git." >> README.md
git add README.md
git commit -m "Add project description to README"

Run git log --oneline to see both commits.

Step 4: Create a branch.

git checkout -b add-contact-info

You're now on a new branch called add-contact-info. Any changes you make here won't affect main.

Step 5: Make changes on the branch and commit.

echo "Contact: yourname@email.com" >> README.md
git add README.md
git commit -m "Add contact information"

Step 6: Switch back to main and merge.

git checkout main
git merge add-contact-info

The changes from your branch are now part of main. Run cat README.md — you'll see all three lines.

Step 7: Push to GitHub.

Go to github.com, click "New repository," name it my-first-repo, leave it empty. Then run:

git remote add origin https://github.com/YOUR-USERNAME/my-first-repo.git
git push -u origin main

Refresh the GitHub page. Your code is in the cloud.

Step 8: Verify. Run git log --oneline. Three commits — a complete history of everything you did, with messages explaining each step. This history will remain forever.

The Bigger Picture

Version control is more than a technical tool. It's a philosophy: every change should be intentional, documented, and reversible.

This philosophy scales beyond code. The best product teams apply it to design decisions (Architecture Decision Records), product specs, and infrastructure (Infrastructure as Code — server configuration stored in Git, not clicked together in a dashboard).

The reason every serious software company runs on Git isn't because engineers like typing commands. It's because Git makes collaboration safe. Experiment without fear, review before shipping, undo mistakes without panic. It transforms the chaos of hundreds of people changing the same thing into a structured, auditable, reliable process.

The next time you see an engineer typing git rebase -i HEAD~3, you'll know what they're doing: editing history — cleaning up commits before presenting them for review. Same instinct as revising a draft before sending it to your boss.

Chapter endnotes

[1] Scott Chacon and Ben Straub, Pro Git (2nd edition, Apress, 2014). Available free at git-scm.com/book. This is the definitive Git reference, and it's open source — maintained through pull requests on GitHub, naturally.

[2] Linus Torvalds' 2007 talk on Git at Google ("Tech Talk: Linus Torvalds on Git") is one of the most entertaining and instructive technical talks ever recorded. Torvalds spends the first 20 minutes roasting every other version control system before explaining why he designed Git the way he did. Available on YouTube.

[3] Rachel Potvin and Josh Levenberg, "Why Google Stores Billions of Lines of Code in a Single Repository," Communications of the ACM, Vol. 59, No. 7 (July 2016). The paper that introduced the concept of monorepos at scale to the wider engineering community.

[4] The Heartbleed vulnerability (CVE-2014-0160) exposed a critical weakness in the OpenSSL library. The incident led to the creation of the Core Infrastructure Initiative, later reorganized as the Open Source Security Foundation (OpenSSF), dedicated to funding and securing critical open-source projects that the entire internet depends on.

[5] Jim Zemlin (Linux Foundation): "The greatest thing about open source is not the code — it's the process. The fact that the code is public forces discipline: clear commit messages, code review, documentation. These are not byproducts. They are the product."