What is the packfile format in Git?

Why Interviewers Ask This

Advanced questions like this reveal whether a candidate has internalized Git & GitHub deeply enough to make architectural decisions. Strong answers demonstrate both breadth and depth of experience.

Answer

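Git first writes each object (blob, tree, commit, tag) as a separate zlib-compressed loose file under .git/objects/. As a repository grows, this produces thousands of small files, which slows filesystem operations. A packfile consolidates many objects into a pair of files: a .pack file holding the compressed object data and a .idx file that indexes it for fast lookup by object ID.

Git creates packfiles in three situations: when git gc runs (manually, or automatically as git gc --auto after certain commands once enough loose objects accumulate); during git push, git fetch, and git clone, which build a pack to transfer objects efficiently; and when git repack is run directly.

Delta compression: a packfile stores many objects as deltas (differences) against similar objects rather than as full copies. Successive versions of the same file often differ by only a few bytes, so the pack keeps one full version and small deltas for the others. This reduces storage dramatically.

Pack commands:

git gc: runs garbage collection, packing loose objects and pruning unreachable objects older than the grace period (two weeks by default).
git gc --aggressive: repacks more thoroughly, recomputing existing deltas, at the cost of a much slower run.
git count-objects -v: reports the number of loose objects (count), packed objects (in-pack), and packs.
git repack -a -d -f --depth=250 --window=250: manual repack with aggressive settings, where -a puts everything into a single pack, -d deletes packs made redundant, -f recomputes deltas, and --depth and --window set the maximum delta chain length and the number of candidate objects compared per delta.

Verification: git verify-pack -v .git/objects/pack/*.idx lists the objects in each pack along with their type, size, and delta chain information.

Git writes pack format version 2. The object IDs and trailing checksum in a pack use the repository's hash algorithm: SHA-1 by default, or SHA-256 in repositories created with --object-format=sha256.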
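To make the pack format version mentioned above concrete, here is a minimal Python sketch that reads the fixed 12-byte header at the start of a .pack file: the 4-byte signature PACK, a 4-byte version number, and a 4-byte object count, all in network byte order. The function name parse_pack_header and the synthetic sample header are illustrative assumptions, not part of Git itself, and the sketch does not decode the objects that follow the header.

```python
import struct

PACK_SIGNATURE = b"PACK"
HEADER_SIZE = 12  # 4-byte signature + 4-byte version + 4-byte object count


def parse_pack_header(data):
    """Return (version, object_count) from the first 12 bytes of a .pack file.

    Hypothetical helper for illustration; it only inspects the header.
    """
    if len(data) < HEADER_SIZE:
        raise ValueError("a pack header is 12 bytes long")
    # ">" selects big-endian (network) byte order; "I" is an unsigned 32-bit int.
    signature, version, count = struct.unpack(">4sII", data[:HEADER_SIZE])
    if signature != PACK_SIGNATURE:
        raise ValueError("missing PACK signature")
    if version not in (2, 3):
        raise ValueError("unsupported pack version: %d" % version)
    return version, count


# Synthetic header built in memory (an assumption for the demo):
# version 2, three objects.
sample_header = struct.pack(">4sII", PACK_SIGNATURE, 2, 3)
version, count = parse_pack_header(sample_header)
print(version, count)  # prints "2 3"
```

To try it on a real repository, open one of the files under .git/objects/pack/ in binary mode and pass the first 12 bytes to the function. The object count should match the number of objects that git verify-pack -v reports for the same pack.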

Pro Tip

Before answering, structure your response: one-line definition → real-world analogy → concrete example from a project. This makes even complex Git & GitHub answers easy to follow.