THE MACHINERY OF PATH DEPENDENCE

A Complete Guide to Irreversible History

How the Past Builds the Walls of the Future

What follows is not advice.

It is not a framework for better decision-making. Not a guide to avoiding regret. Not a strategy for keeping your options open.

It is mechanism.

The actual machinery by which history becomes structure. The physics of why where you are constrains where you can go. The mathematics of how small early moves become permanent features of the landscape.

Most people sense this. They feel the weight of prior choices. The narrowing of possibility as years accumulate. The way certain doors closed quietly while they were looking elsewhere.

But they attribute it to fate, bad luck, or character.

They never see the machinery.

This document is that seeing.

Nothing more.

What you do with it is your business.

PART ONE: THE DEFINITION

What Path Dependence Actually Means

Path dependence is not the claim that history matters.

Everyone knows history matters. Every culture, every folk wisdom, every grandmother. That is not a discovery. That is a platitude.

Path dependence is a precise structural claim. It says: the outcome of a process depends not just on current conditions but on the specific sequence of events that produced those conditions. Two systems with identical present states, arrived at through different histories, will behave differently going forward.

The formal statement is this. A process is path dependent when its outcome is a function of the entire trajectory, not just the current position.

In physics and mathematics, this maps to a distinction so fundamental it sits at the base of thermodynamics and multivariable calculus.

State functions depend only on where you are. Path functions depend on how you got there.

Temperature is a state function. You can measure it right now. It does not care how the system reached this temperature.

Work is a path function. The amount of work extracted from a gas expanding from state A to state B depends entirely on the route taken. Expand it slowly and you get one number. Expand it fast and you get another. Same starting point. Same ending point. Different path. Different result.

    STATE FUNCTION vs PATH FUNCTION

    ┌────────────────────────────────────────────────────┐
    │                                                    │
    │               STATE A ────────► STATE B            │
    │                 │                  │               │
    │                 │    Path 1        │               │
    │                 │   (slow)         │               │
    │                 │                  │               │
    │            Temperature          Temperature        │
    │            = 300 K              = 500 K            │
    │                                                    │
    │    Temperature change = 200 K (path irrelevant)    │
    │                                                    │
    └────────────────────────────────────────────────────┘

    ┌────────────────────────────────────────────────────┐
    │                                                    │
    │               STATE A ────────► STATE B            │
    │                                                    │
    │    Path 1 (slow expansion):   Work = 4,500 J       │
    │    Path 2 (fast expansion):   Work = 2,800 J       │
    │    Path 3 (two-stage):        Work = 3,600 J       │
    │                                                    │
    │    Same endpoints. Different work. Path matters.   │
    │                                                    │
    └────────────────────────────────────────────────────┘
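
The claim about work can be checked with arithmetic. What follows is a minimal sketch, not a reproduction of the figures in the diagram. It assumes an ideal gas expanding at constant temperature between the same two volumes. The function names and the values of n, T, v1, and v2 are hypothetical, chosen only for illustration.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def reversible_work(n, T, v1, v2):
    """Work done by the gas in a quasi-static isothermal expansion."""
    return n * R * T * math.log(v2 / v1)

def staged_work(n, T, volumes):
    """Work done expanding in sudden steps, each against a constant
    external pressure equal to the gas pressure at the end of that step."""
    total = 0.0
    for v_start, v_end in zip(volumes, volumes[1:]):
        p_ext = n * R * T / v_end          # ideal-gas pressure at step end
        total += p_ext * (v_end - v_start)
    return total

# Assumed values: 1 mol at 300 K, expanding from 10 L to 40 L.
n, T, v1, v2 = 1.0, 300.0, 0.010, 0.040

w_slow = reversible_work(n, T, v1, v2)
w_two_stage = staged_work(n, T, [v1, 0.020, v2])
w_fast = staged_work(n, T, [v1, v2])

print(f"slow (reversible): {w_slow:7.1f} J")
print(f"two-stage:         {w_two_stage:7.1f} J")
print(f"fast (one step):   {w_fast:7.1f} J")
```

Under these assumptions the ordering matches the diagram: the slow path yields the most work, the single sudden step the least, and the two-stage path falls between them. The endpoints never change. Only the route does.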

This is not metaphor. This is the distinction that separates reversible from irreversible. Recoverable from lost. Undoable from permanent.

And the entire observable universe runs almost exclusively on path functions.

PART TWO: THE PHYSICS

Irreversibility and the Arrow of Time

The deepest instance of path dependence is time itself.

The fundamental laws of physics are time-symmetric. Run Newton’s equations backwards and they work perfectly. Every microscopic interaction is reversible.

Yet the macroscopic world has a direction. Eggs break but do not unbreak. Coffee cools but does not spontaneously reheat. Smoke disperses but does not reconverge.

The second law of thermodynamics names this asymmetry. Entropy in an isolated system does not decrease. But entropy production is itself a path function. The amount of entropy generated depends on the specific trajectory the system takes through its state space.

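A gas expanding reversibly (infinitely slowly, always in equilibrium) generates zero entropy. The same gas expanding freely into a vacuum generates the maximum. Same initial state. Same final state. The entropy of the gas itself is a state function and changes by the same amount either way. What differs is the total entropy generated across gas and surroundings, and that difference is radical.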

    ENTROPY PRODUCTION AS PATH FUNCTION

    Entropy
    Generated
         │
         │
    MAX  │    ████████████████████████  ← Free expansion
         │    ████████████████████████    (irreversible)
         │
         │
    MED  │    ██████████████  ← Finite-speed expansion
         │    ██████████████    (partially irreversible)
         │
         │
    ZERO │    █  ← Quasi-static expansion
         │    █    (reversible limit)
         │
         └──────────────────────────────────────────
              Same start state. Same end state.
              Path determines entropy cost.
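
The three bars can be given numbers. This is a sketch under stated assumptions: an ideal gas, constant temperature, a single heat reservoir, and hypothetical values for n, T, v1, and v2. The function name is invented for illustration. The total entropy generated is the entropy gained by the gas (identical on every path) plus the entropy lost by the reservoir (which depends on how much work was extracted).

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def entropy_generated(n, T, v1, v2, work_extracted):
    """Total entropy generated (gas + reservoir) for an isothermal
    ideal-gas expansion from v1 to v2 that delivers work_extracted joules."""
    ds_gas = n * R * math.log(v2 / v1)   # state function: same on every path
    ds_reservoir = -work_extracted / T   # heat drawn from the reservoir equals the work done
    return ds_gas + ds_reservoir

# Assumed values: 1 mol at 300 K, expanding from 10 L to 40 L.
n, T, v1, v2 = 1.0, 300.0, 0.010, 0.040

w_reversible = n * R * T * math.log(v2 / v1)   # quasi-static limit
w_one_step = (n * R * T / v2) * (v2 - v1)      # sudden step against final pressure
w_free = 0.0                                   # free expansion does no work

for label, w in [("quasi-static", w_reversible),
                 ("one-step", w_one_step),
                 ("free", w_free)]:
    s_gen = entropy_generated(n, T, v1, v2, w)
    print(f"{label:13s} entropy generated = {s_gen:6.2f} J/K")
```

The less work a path extracts, the more entropy it generates. The quasi-static path generates none, which is exactly what makes it reversible.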

This is the arrow of time. Not a law written into the equations. An emergent consequence of the fact that the universe is a path-dependent process. Each configuration carries the record of how it was assembled. And that record constrains what comes next.

The coffee cannot reheat because the specific molecular configuration required to reverse every collision is vanishingly improbable. Not impossible. Improbable to a degree that makes the age of the universe look like a rounding error.

Path dependence at the physical level means that the universe accumulates history. And that accumulated history is irreversible. Not because reversal is forbidden. Because the path back does not exist in any accessible region of the state space.

The Feynman Insight

Richard Feynman’s path integral formulation of quantum mechanics provides a startling perspective.

A quantum particle traveling from point A to point B does not take one path. It takes every possible path simultaneously. Each path contributes a probability amplitude proportional to e^(iS/ħ), where S is the action (the integral of the Lagrangian along that path) and ħ is the reduced Planck constant.

The observed behavior emerges from the interference of all these amplitudes.

For classical objects, the action is enormous compared to Planck’s constant. The phase oscillates so rapidly for non-classical paths that they cancel each other out. Only paths near the classical trajectory survive. The classical world selects one path from the infinite ensemble.

    QUANTUM TO CLASSICAL PATH SELECTION

    ┌──────────────────────────────────────────────────┐
    │                                                  │
    │              QUANTUM REGIME                      │
    │                                                  │
    │    All paths contribute                          │
    │    Amplitudes interfere                          │
    │    No single history                             │
    │    Outcome = sum over all histories              │
    │                                                  │
    └──────────────────────────────────────────────────┘
                          │
                          │  As scale increases
                          │  (S >> ħ)
                          ▼
    ┌──────────────────────────────────────────────────┐
    │                                                  │
    │             CLASSICAL REGIME                     │
    │                                                  │
    │    Non-classical paths cancel                    │
    │    One path dominates                            │
    │    History becomes definite                      │
    │    Outcome = single trajectory                   │
    │                                                  │
    └──────────────────────────────────────────────────┘
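
The cancellation can be seen in a toy calculation. This is not a path integral. It is a sketch that assumes a one-parameter family of detours around the classical path, where a detour of size a adds a phase k*a^2. The constant k stands in for how large the action is relative to the reduced Planck constant. The function name, the range of detours, and the values of k are all hypothetical.

```python
import cmath

def summed_amplitude(k, half_width=5.0, steps=100_001):
    """Riemann sum of exp(i*k*a^2) over detours a in [-half_width, half_width].

    a is the size of a detour away from the classical path (a = 0).
    k*a^2 plays the role of (S - S_classical) / hbar in this toy model.
    """
    da = 2 * half_width / (steps - 1)
    total = 0j
    for j in range(steps):
        a = -half_width + j * da
        total += cmath.exp(1j * k * a * a) * da
    return total

for k in (0.0, 1.0, 100.0):
    size = abs(summed_amplitude(k))
    print(f"k = {k:6.1f}   |sum of amplitudes| = {size:.3f}")
```

With k = 0 every detour adds in phase. As k grows, detours far from a = 0 rotate against each other and cancel, and the surviving sum comes from an ever narrower band around the classical path, with a width that shrinks roughly as 1/sqrt(k).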

The classical world is the regime where path dependence becomes absolute. In quantum mechanics, the system holds all paths in superposition. In the classical world, you get one path. And that path forecloses the others permanently.

This is not philosophy. This is the mathematical structure of reality. Path dependence is not imposed on the world by human psychology. It is the default condition of any system large enough to have a definite history.

PART THREE: THE MATHEMATICS

The Polya Urn

The simplest mathematical model of path dependence was introduced by George Polya in 1923.

An urn contains one red ball and one blue ball. You draw a ball at random. Whatever color you draw, you put it back and add another ball of the same color. Then draw again.

The rule is trivial. The consequences are not.

After the first draw, the urn is either 2:1 red or 2:1 blue. The majority color is now more likely to be drawn next. If it is drawn, the ratio shifts further. The advantage compounds.

Run this process to infinity. The proportion of red balls converges to some value. But that value is different every time you run the experiment.

    POLYA URN: THREE SAMPLE RUNS

    Draw #    Run 1       Run 2       Run 3
              Red:Blue    Red:Blue    Red:Blue

      0       1:1         1:1         1:1
      1       2:1         1:2         2:1
      2       3:1         2:2         2:2
      3       4:1         2:3         3:2
      5       5:2         3:4         4:3
     10       8:4         4:8         6:6
     50      38:14       12:40       27:25
    100      74:28       22:80       52:50
     ∞       ~73%        ~22%        ~51%

    Same rules. Same starting conditions.
    Different sequences. Different final states.
    The path IS the outcome.
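
The experiment is easy to run. The sketch below follows the rule exactly as stated: start with one red and one blue, draw, return the ball, add one of the same color. The function name, the number of draws, and the seeds are hypothetical. The seeds exist only to make each run reproducible.

```python
import random

def polya_run(draws, seed):
    """Simulate one Polya urn starting at 1 red, 1 blue.
    Returns the final share of red balls."""
    rng = random.Random(seed)
    red, blue = 1, 1
    for _ in range(draws):
        if rng.random() < red / (red + blue):
            red += 1      # drew red: return it and add another red
        else:
            blue += 1     # drew blue: return it and add another blue
    return red / (red + blue)

shares = [polya_run(10_000, seed) for seed in range(5)]
for seed, share in enumerate(shares):
    print(f"run {seed}: red share after 10,000 draws = {share:.3f}")
```

Each run settles on its own share. For this 1:1 starting urn, the limiting share is known to be uniformly distributed between 0 and 1, so no value is favored in advance.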

The Polya urn has three properties that define path dependence formally.

Non-ergodicity. Different runs converge to different equilibria. The long-run outcome is not independent of the specific sequence. You cannot predict the final proportion from the rules alone. You need the actual history.

Unpredictability. Early draws have disproportionate influence on the final outcome. The first five draws matter more than the next five hundred. But those first draws are random.

Lock-in. As the process continues, the proportions stabilize. The system becomes increasingly resistant to change. After a thousand draws, a single contrary draw barely moves the ratio.

This is the mathematical skeleton of path dependence. Positive feedback. Increasing returns. Convergence to one of many possible equilibria, selected by early random events, then locked in by accumulated mass.
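
Lock-in has simple arithmetic behind it. A single draw can move the red share by at most 1/(n + 3) after n draws, because the urn then holds n + 2 balls and is about to hold one more. A minimal sketch, assuming for comparison an urn that happens to be evenly split at each stage (the function name is hypothetical):

```python
def shift_from_one_draw(red, blue, drew_red):
    """Change in the red share caused by a single additional draw."""
    before = red / (red + blue)
    added_red = 1 if drew_red else 0
    after = (red + added_red) / (red + blue + 1)
    return after - before

for draws_so_far in (0, 10, 1_000, 100_000):
    total = draws_so_far + 2
    red = total // 2          # assumption: evenly split urn, for comparison only
    blue = total - red
    up = shift_from_one_draw(red, blue, drew_red=True)
    print(f"after {draws_so_far:>7,} draws, one more red moves the share by {up:.6f}")
```

The first draw moves the share by a sixth. The thousand-and-first moves it by less than a thousandth. The rule never changed. The accumulated mass did.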

PART FOUR: THE MECHANISM OF LOCK-IN

Increasing Returns

W. Brian Arthur formalized the economic theory of path dependence in the late 1980s. His insight was that certain processes exhibit increasing returns to adoption. The more people adopt a technology, a standard, or an institution, the more attractive it becomes to the next adopter.

Four mechanisms produce increasing returns.

Learning effects. The more a technology is used, the better understood it becomes. Users develop skills. Developers fix bugs. The knowledge base compounds.

Network effects. The value of a network increases with the number of participants. A fax machine is useless alone. A phone network with one subscriber has zero value. Each new participant increases value for all existing participants.

Scale economies. Unit costs decrease with volume. Higher adoption means lower prices. Lower prices attract more adoption.

Adaptive expectations. People adopt what they expect others to adopt. Expectation becomes self-fulfilling. The technology expected to win attracts adopters, which makes it win.

    THE FOUR ENGINES OF LOCK-IN

    ┌──────────────────┐      ┌──────────────────┐
    │                  │      │                  │
    │  LEARNING        │      │  NETWORK         │
    │  EFFECTS         │      │  EFFECTS         │
    │                  │      │                  │
    │  More use =      │      │  More users =    │
    │  better product  │      │  more value      │
    │                  │      │                  │
    └────────┬─────────┘      └────────┬─────────┘
             │                         │
             └────────────┬────────────┘
                          │
                          ▼
              ┌───────────────────────┐
              │                       │
              │   INCREASING RETURNS  │
              │                       │
              │   Each adoption       │
              │   makes the next      │
              │   adoption more       │
              │   likely              │
              │                       │
              └───────────────────────┘
                          │
             ┌────────────┴────────────┐
             │                         │
    ┌────────┴─────────┐      ┌────────┴─────────┐
    │                  │      │                  │
    │  SCALE           │      │  ADAPTIVE        │
    │  ECONOMIES       │      │  EXPECTATIONS    │
    │                  │      │                  │
    │  More volume =   │      │  Expected winner │
    │  lower cost      │      │  becomes winner  │
    │                  │      │                  │
    └──────────────────┘      └──────────────────┘

Under increasing returns, the competitive landscape does not converge to a single optimum determined by the properties of the competing options. It converges to one of several possible equilibria determined by the sequence of early adoption events.

The first technology to gain a small lead gains a larger lead. The larger lead attracts more adoption. More adoption widens the lead further.

This is the Polya urn with a different skin. Same mathematics. Same convergence to one of many possible states. Same dependence on early random events. Same lock-in.
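
A simplified adoption race shows the lock-in directly. This is a sketch in the spirit of Arthur's argument, not his published model. It assumes two technologies, A and B. Each adopter arrives with a random built-in taste for one of them and picks whichever pays more: the taste bonus plus a return proportional to the number of prior adopters. The function name and the parameters returns, taste, and seed are hypothetical.

```python
import random

def adoption_race(adopters, returns=0.1, taste=2.0, seed=0):
    """Sequential adoption of technologies A and B under increasing returns.
    Returns the list of choices in arrival order."""
    rng = random.Random(seed)
    count = {"A": 0, "B": 0}
    choices = []
    for _ in range(adopters):
        preferred = rng.choice("AB")
        other = "B" if preferred == "A" else "A"
        # Extra payoff the other technology offers from its installed base.
        gap = returns * (count[other] - count[preferred])
        # Defect from personal taste only if the installed-base gap outweighs it.
        pick = other if gap > taste else preferred
        count[pick] += 1
        choices.append(pick)
    return choices

for seed in range(5):
    choices = adoption_race(20_000, seed=seed)
    share_a = choices.count("A") / len(choices)
    tail = set(choices[-1_000:])
    print(f"seed {seed}: overall A share = {share_a:.3f}, last 1,000 adopters chose {tail}")
```

While the two technologies are close, adopters follow their tastes and the lead wanders at random. Once the lead exceeds taste / returns, every later adopter picks the leader regardless of taste. Which technology wins depends on the seed, not on any difference between A and B. In this sketch there is none.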

The QWERTY Parable

Paul David’s 1985 analysis of the QWERTY keyboard layout became the canonical example.

The QWERTY layout was designed in 1873 for the Sholes and Glidden typewriter. Its arrangement was constrained by the mechanical linkages of the time. Keys that were commonly struck in sequence had to be separated to prevent jamming.

Those mechanical constraints vanished with electric typewriters. They vanished completely with computers. There are no mechanical linkages in a laptop.

Yet QWERTY persists. Not because it is optimal. Because the switching costs exceed any individual’s incentive to switch.

Everyone learns QWERTY because everyone else uses QWERTY. Keyboards are manufactured in QWERTY because that is what people buy. People buy QWERTY keyboards because that is what they learned on.

    THE LOCK-IN CYCLE

            Manufacturers produce QWERTY
                       │
                       ▼
            Schools teach QWERTY
                       │
                       ▼
            Workers know QWERTY
                       │
                       ▼
            Employers require QWERTY
                       │
                       ▼
            Manufacturers produce QWERTY
                       │
                       ▼
                    (repeat)

    No single participant can break the cycle.
    The cost of switching falls on the switcher.
    The benefit of switching requires everyone else
    to switch simultaneously.

Whether QWERTY is actually inferior to alternatives like Dvorak is debated. Liebowitz and Margolis challenged David’s claims in the 1990s, arguing the Navy studies David cited were flawed and that no clearly superior alternative has been demonstrated.

But the structural point stands regardless. The specific question of keyboard optimality is less important than the mechanism it illustrates. Under increasing returns, the outcome is selected by history, not by intrinsic superiority. And once selected, it is maintained by the coordination structure that grew around it.

The system is not stuck because it is stupid. The system is stuck because every individual inside it faces the correct incentives to stay.
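
Those incentives can be written down. The sketch below is a hypothetical payoff model, not data about keyboards. It assumes a typist's payoff on a layout equals a network value times the share of other users on that layout, that the new layout carries a small intrinsic bonus, and that switching costs a one-time retraining fee. Every number is invented for illustration.

```python
def net_gain_from_switching(share_already_switched, network_value=10.0,
                            layout_bonus=1.0, retraining_cost=3.0):
    """Payoff change for one typist who moves from the old layout to the new one,
    given the share of everyone else who has already moved."""
    stay = network_value * (1.0 - share_already_switched)
    switch = (network_value * share_already_switched
              + layout_bonus - retraining_cost)
    return switch - stay

for share in (0.0, 0.25, 0.5, 0.75, 1.0):
    gain = net_gain_from_switching(share)
    print(f"{share:4.0%} of others switched -> net gain {gain:+6.1f}")
```

With these assumed numbers the lone switcher loses heavily even though the new layout is better by construction. The gain turns positive only after most others have already moved. Nobody has a reason to go first.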

PART FIVE: THE LANDSCAPE

Phase Space Narrowing

Every system begins with a space of possibilities. The set of states it could occupy. The configurations it could reach.

As the system evolves, each step narrows the accessible region of that space. Not because the laws change. Because each state transition eliminates certain future states while making others accessible.

A river cutting through soft rock illustrates this. The initial landscape is roughly uniform. Water flows across a broad surface. Small variations in topography concentrate flow slightly. Concentrated flow erodes faster. Faster erosion deepens the channel. A deeper channel concentrates more flow.

Over geological time, a slight initial variation becomes a canyon. The Grand Canyon is a path-dependent structure. It does not exist because that location was uniquely suited for a canyon. It exists because water happened to concentrate there first, and the positive feedback of erosion-concentration-erosion carved the rest.

    PHASE SPACE NARROWING OVER TIME

    Time 0: Full possibility space
    ┌──────────────────────────────────────────────┐
    │  ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○  │
    │  ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○  │
    │  ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○  │
    └──────────────────────────────────────────────┘

    Time 1: Early choices eliminate some regions
    ┌──────────────────────────────────────────────┐
    │  · · · · · ○ ○ ○ ○ ○ ○ ○ ○ · · · · · ·  │
    │  · · · ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ · · · · ·  │
    │  · · · · · ○ ○ ○ ○ ○ ○ ○ · · · · · · ·  │
    └──────────────────────────────────────────────┘

    Time 2: Positive feedback concentrates trajectory
    ┌──────────────────────────────────────────────┐
    │  · · · · · · · · · ○ ○ · · · · · · · · ·  │
    │  · · · · · · · · ○ ○ ○ ○ · · · · · · · ·  │
    │  · · · · · · · · · ○ ○ · · · · · · · · ·  │
    └──────────────────────────────────────────────┘

    Time 3: Lock-in to narrow channel
    ┌──────────────────────────────────────────────┐
    │  · · · · · · · · · · · · · · · · · · · ·  │
    │  · · · · · · · · · ○ ○ · · · · · · · · ·  │
    │  · · · · · · · · · · · · · · · · · · · ·  │
    └──────────────────────────────────────────────┘

    ○ = accessible states    · = foreclosed states

This happens in every domain.

A city grows where a trading post was established at a river crossing. The trading post attracted merchants. Merchants attracted infrastructure. Infrastructure attracted more merchants. Three centuries later, millions of people live there not because it is the optimal location for a city but because the self-reinforcing dynamics of urban growth locked in the original accident.

A career narrows the same way. Early skill investments compound. Opportunities flow toward demonstrated capability. Capability deepens in the direction of prior investment. The phase space of possible careers contracts with every year of specialization.

Critical Junctures

Not all moments are equal.

Path dependence theory identifies critical junctures. Brief windows where the system’s degrees of freedom are temporarily expanded. Where contingent events have disproportionate leverage over the future trajectory.

Before a critical juncture, the system is in an open state. Multiple outcomes are possible. Small pushes can redirect the entire trajectory.

After the critical juncture, self-reinforcing mechanisms activate. The system locks into one of the available paths. Switching costs escalate. The window closes.

    ANATOMY OF A CRITICAL JUNCTURE

              OPEN PHASE                LOCKED PHASE
         (high contingency)         (self-reinforcing)

    Degrees
    of Freedom
         │
         │████
    HIGH │████
         │████
         │████  ████
         │████  ████
    MED  │████  ████
         │████  ████
         │████  ████  ████
         │████  ████  ████
    LOW  │████  ████  ████  ████████████████████████
         │████  ████  ████  ████████████████████████
         │
         └──────────────────────────────────────────►
                   │              Time
                   │
                   ▼
           Critical juncture:
           Small event, large
           downstream consequence

James Mahoney formalized this structure. Path dependence characterizes sequences where contingent events set into motion institutional patterns that have deterministic properties. The contingency is at the juncture. The determinism is after it.

The founding of a nation. The adoption of a standard. The choice of programming language for a critical system. The first hire at a startup. These are not merely important decisions. They are structural forks after which the landscape rearranges itself around the choice.

PART SIX: NETWORKS AND PREFERENTIAL ATTACHMENT

The Rich Get Richer

In 1999, Albert-Laszlo Barabasi and Reka Albert published a model that explained why so many real-world networks exhibit power law degree distributions. The internet. Citation networks. Social networks. Metabolic pathways.

The mechanism is preferential attachment. When a new node joins a network, it does not connect randomly. It connects preferentially to nodes that already have many connections.

The probability of connecting to node i is proportional to k_i, the degree (number of connections) of node i.

This is the Polya urn mapped onto network topology. Each new connection is a “draw.” Each connection increases the probability of further connections to that node. Early well-connected nodes accumulate connections faster. The advantage compounds.

    PREFERENTIAL ATTACHMENT

    Time 1: Small network, roughly equal
    ┌──────────────────────────────────────┐
    │                                      │
    │        A ─── B                       │
    │        │     │                       │
    │        C ─── D ─── E                 │
    │                                      │
    │    Degrees: A=2, B=2, C=2, D=3, E=1 │
    │                                      │
    └──────────────────────────────────────┘

    Time 2: New nodes mostly attach to D (highest degree)
    ┌──────────────────────────────────────┐
    │                                      │
    │        A ─── B                       │
    │        │     │                       │
    │    F ──C ─── D ─── E                 │
    │              │                       │
    │          G ──┤                       │
    │              │                       │
    │              H                       │
    │                                      │
    │    D: 5 connections (hub forming)    │
    │                                      │
    └──────────────────────────────────────┘

    Time 3: Hub dominates
    ┌──────────────────────────────────────┐
    │                                      │
    │    A ─── B                           │
    │    │     │                           │
    │    C ─── D ─── E                     │
    │    │   / │ \                         │
    │    F  G  H  I                        │
    │       │  │  │                        │
    │       J  K  L                        │
    │                                      │
    │    D: 6 connections (dominant hub)   │
    │    Original advantage compounded     │
    │                                      │
    └──────────────────────────────────────┘

The result is a power law distribution. A few nodes have enormous numbers of connections. Most nodes have very few. The distribution has a long tail that stretches far beyond what a random network would produce.

This is path dependent because which nodes become hubs depends on the order of arrival. The first nodes in the network have more time to accumulate connections. A node that arrives early and gains a small initial advantage can become a permanent hub. A node with identical properties arriving later may remain peripheral forever.
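
The early-arrival advantage shows up in a few lines of simulation. The sketch below is a stripped-down version of preferential attachment, assuming each new node brings exactly one link (the published model allows more). The function name and network size are hypothetical. The list named endpoints holds every node once per link it touches, so a uniform pick from that list is a pick proportional to degree.

```python
import random

def grow_network(nodes, seed=0):
    """Grow a network one node at a time. Each newcomer links to one existing
    node chosen with probability proportional to that node's degree.
    Returns the degree of every node, indexed by arrival order."""
    rng = random.Random(seed)
    degree = [1, 1]        # start from two nodes joined by one link
    endpoints = [0, 1]     # each node appears once per link it touches
    for new in range(2, nodes):
        target = rng.choice(endpoints)   # degree-proportional choice
        degree[target] += 1
        degree.append(1)
        endpoints.extend((target, new))
    return degree

degree = grow_network(10_000)
ranked = sorted(degree)
print("largest degree:", ranked[-1])
print("median degree: ", ranked[len(ranked) // 2])
print("mean degree of first 10 arrivals:", sum(degree[:10]) / 10)
print("mean degree of last 10 arrivals: ", sum(degree[-10:]) / 10)
```

Every node follows the same rule and has the same properties. The only thing separating the hubs from the periphery is when they arrived.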

Google was not the first search engine. Facebook was not the first social network. But each arrived at a critical juncture when the network effects were available to capture, and each rode preferential attachment into dominance.

The topology of the resulting network is a fossil record of its growth history.

PART SEVEN: BIOLOGICAL PATH DEPENDENCE

The Tape of Life

Stephen Jay Gould proposed the most vivid thought experiment in evolutionary biology.

Rewind the tape of life to the Cambrian explosion, 540 million years ago. Press play again. Would you get the same result?

Gould argued no. Evolution is path dependent. Whatever mutations happen to come first set the stage for what later mutations are possible. Early stochastic events channel all subsequent development.

The vertebrate eye has a blind spot because the retina is wired backwards. Photoreceptors point away from incoming light. Blood vessels and nerve fibers run across the front of the retina, and the optic nerve punches through it to reach the brain, creating a gap.

The cephalopod eye has no blind spot. Its retina is wired correctly. Photoreceptors face the light.

Both solutions work. Both emerged from different ancestral starting points. The vertebrate arrangement is not a defect that evolution could fix. It is locked in by 500 million years of development built on top of it. Every subsequent adaptation assumed the existing architecture. Changing it would require unwinding a cascade of dependencies that extends across the entire visual system.

    EVOLUTIONARY PATH DEPENDENCE

    ┌──────────────────────────────────────────────────┐
    │  ANCESTRAL BODY PLAN                             │
    │  (Cambrian, ~540 Mya)                            │
    └──────────────────────┬───────────────────────────┘
                           │
              ┌────────────┴────────────┐
              │                         │
              ▼                         ▼
    ┌──────────────────┐      ┌──────────────────┐
    │                  │      │                  │
    │  VERTEBRATE      │      │  CEPHALOPOD      │
    │  LINEAGE         │      │  LINEAGE         │
    │                  │      │                  │
    │  Retina inverted │      │  Retina correct  │
    │  (accident of    │      │  (different      │
    │   early wiring)  │      │   starting path) │
    │                  │      │                  │
    └────────┬─────────┘      └────────┬─────────┘
             │                         │
             ▼                         ▼
    ┌──────────────────┐      ┌──────────────────┐
    │                  │      │                  │
    │  500 million     │      │  500 million     │
    │  years of        │      │  years of        │
    │  adaptations     │      │  adaptations     │
    │  built on        │      │  built on        │
    │  inverted retina │      │  correct retina  │
    │                  │      │                  │
    │  Cannot rewire.  │      │  Cannot rewire.  │
    │  Too many        │      │  Too many        │
    │  dependencies.   │      │  dependencies.   │
    │                  │      │                  │
    └──────────────────┘      └──────────────────┘

This is path dependence at the deepest biological level. The initial wiring pattern was not selected for its superiority. It was contingent. Random. But once established, it became the substrate on which everything else was built. And the cost of changing the substrate exceeds the cost of tolerating its imperfections.

The debate between Gould’s contingency and the opposing view of convergent evolution (that natural selection would produce similar outcomes regardless of path) remains active. But both sides agree on the mechanism. Early events constrain later possibilities. They disagree only on how tightly.

PART EIGHT: INSTITUTIONAL PATH DEPENDENCE

Douglass North and the Persistence of Institutions

Economist Douglass North extended path dependence from technology to institutions. His argument was that the same increasing returns mechanisms that lock in keyboard layouts also lock in legal systems, property rights, governance structures, and social norms.

Institutions create path dependence through four self-reinforcing mechanisms.

High setup costs. Building new institutions requires enormous initial investment. Legal codes. Administrative infrastructure. Training. Coordination. Once built, these are sunk costs that discourage replacement.

Learning effects. People and organizations learn to operate within existing institutions. They develop expertise, routines, and mental models calibrated to the current system. This expertise becomes worthless if the institution changes.

Coordination effects. Institutions work because participants share expectations. Everyone drives on the right (or left). Everyone uses the same currency. Everyone follows the same legal procedures. These shared expectations are enormously valuable and enormously costly to rebuild.

Adaptive expectations. Individuals shape their behavior around the expectation that existing institutions will persist. They invest, plan, and commit based on institutional continuity.

    INSTITUTIONAL SELF-REINFORCEMENT

    ┌─────────────────────────────────────────────────┐
    │                                                 │
    │               INSTITUTION                       │
    │        (law, norm, standard)                    │
    │                                                 │
    └─────────────────────┬───────────────────────────┘
                          │
            ┌─────────────┼─────────────┐
            │             │             │
            ▼             ▼             ▼
    ┌──────────────┐ ┌──────────┐ ┌──────────────┐
    │              │ │          │ │              │
    │  Sunk costs  │ │ Learned  │ │ Coordinated  │
    │  accumulate  │ │ skills   │ │ expectations │
    │              │ │ deepen   │ │ solidify     │
    │              │ │          │ │              │
    └──────┬───────┘ └────┬─────┘ └──────┬───────┘
           │              │              │
           └──────────────┼──────────────┘
                          │
                          ▼
    ┌─────────────────────────────────────────────────┐
    │                                                 │
    │        SWITCHING COSTS INCREASE                 │
    │                                                 │
    │   Cost of change grows with each year of        │
    │   operation. Benefits of change must exceed     │
    │   total accumulated switching costs.            │
    │                                                 │
    │   They almost never do.                         │
    │                                                 │
    └─────────────────────────────────────────────────┘

North observed that this explains one of the deepest puzzles in economic history. Why do some societies remain poor despite knowing what wealthy societies do differently? The answer is not ignorance. It is path dependence. The institutional structure of a society is not a choice that can be remade by fiat. It is a cumulative architecture with dependencies running through every level of social organization.

Changing the formal rules (laws, constitutions) is relatively easy. Changing the informal constraints (norms, habits, mental models) that make those rules operational is orders of magnitude harder. And the informal constraints are themselves path dependent, products of centuries of accumulated cultural evolution.

PART NINE: THE RELATIONSHIP TO ERGODICITY

The Ergodic Assumption and Its Failure

A process is ergodic when its time average equals its ensemble average. Run one system for a long time and you get the same statistics as running many copies of the system simultaneously.

Path-dependent processes are non-ergodic by definition.

If you run the Polya urn a thousand times, each run converges to a different proportion. The time average of one run does not predict the time average of another. The ensemble average (across all runs) does not predict any individual outcome.

    ERGODIC vs NON-ERGODIC (PATH-DEPENDENT)

    ERGODIC PROCESS:

    Run 1:  ████████████████████  → Average = 50%
    Run 2:  ████████████████████  → Average = 51%
    Run 3:  ████████████████████  → Average = 49%
    Run 4:  ████████████████████  → Average = 50%

    Ensemble average = 50%
    Every run converges to same value.
    History does not matter.


    NON-ERGODIC (PATH-DEPENDENT) PROCESS:

    Run 1:  ████████████████████  → Average = 73%
    Run 2:  ████████████████████  → Average = 22%
    Run 3:  ████████████████████  → Average = 51%
    Run 4:  ████████████████████  → Average = 88%

    Ensemble average = 59%
    No individual run converges to 59%.
    History IS the outcome.
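The contrast in the diagram can be reproduced in a few lines. What follows is a minimal sketch, not a canonical implementation. It assumes a Polya urn that starts with one red and one blue ball and adds one ball of the drawn colour per step, and it uses independent fair coin flips as the ergodic baseline. The function names, the seed, and the run lengths are illustrative choices.

```python
import random


def fair_coin_run(steps, rng):
    """Ergodic baseline: fraction of heads in independent fair flips."""
    heads = sum(rng.random() < 0.5 for _ in range(steps))
    return heads / steps


def polya_urn_run(steps, rng):
    """Polya urn: draw a ball at random, return it, add one more of the same colour."""
    red, blue = 1, 1  # assumed starting composition
    for _ in range(steps):
        if rng.random() < red / (red + blue):
            red += 1
        else:
            blue += 1
    return red / (red + blue)


def spread(values):
    """Gap between the largest and smallest outcome across runs."""
    return max(values) - min(values)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
coin_runs = [fair_coin_run(10_000, rng) for _ in range(20)]
urn_runs = [polya_urn_run(10_000, rng) for _ in range(20)]

print("coin:", [round(x, 2) for x in coin_runs[:4]], "spread:", round(spread(coin_runs), 3))
print("urn: ", [round(x, 2) for x in urn_runs[:4]], "spread:", round(spread(urn_runs), 3))
```

Every coin run lands near one half. The urn runs scatter across the interval, and each one settles on its own proportion. Same rules. Different histories.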

This distinction has enormous consequences. Most of classical economics, most of statistical mechanics at equilibrium, most of standard probability theory assumes ergodicity. The assumption permits a critical simplification: you can replace time with ensemble. You can predict the future of one trajectory by studying many trajectories.

When the assumption fails, the simplification fails.

You cannot predict the wealth trajectory of one person from the average wealth trajectory of the population. You cannot predict the technology that will dominate from the average properties of competing technologies. You cannot predict the institutional structure of a society from the abstract properties of institutional alternatives.

Because the outcome depends on the path. And the path is unique.

Ole Peters at the London Mathematical Laboratory has argued that the failure to distinguish ergodic from non-ergodic processes is the foundational error of much economic theory. Expected value calculations that assume ergodicity give systematically wrong answers for path-dependent processes. What is rational for an ensemble (on average, across many independent trials) is not rational for a single trajectory unfolding through time.

The difference between ensemble-optimal and path-optimal behavior is the difference between knowing what would happen across all possible histories and knowing what will happen in your actual history.
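A repeated multiplicative gamble is a common way to illustrate this gap. The sketch below assumes a fair coin that multiplies wealth by 1.5 on heads and by 0.6 on tails. The payoffs, the seed, and the names are assumptions chosen for illustration, not figures from the text above.

```python
import math
import random

UP, DOWN = 1.5, 0.6  # assumed payoffs: +50% on heads, -40% on tails

# Ensemble view: expected multiplier per round, averaged over many copies.
ensemble_factor = 0.5 * UP + 0.5 * DOWN

# Time view: per-round growth factor along one long trajectory
# (the geometric mean of the two multipliers).
time_factor = math.sqrt(UP * DOWN)


def one_trajectory(rounds, rng, wealth=1.0):
    """Play the gamble `rounds` times in sequence and return final wealth."""
    for _ in range(rounds):
        wealth *= UP if rng.random() < 0.5 else DOWN
    return wealth


rng = random.Random(1)  # fixed seed so the sketch is reproducible
finals = [one_trajectory(100, rng) for _ in range(10_000)]
share_below_start = sum(w < 1.0 for w in finals) / len(finals)

print("ensemble multiplier per round:", round(ensemble_factor, 4))
print("time-average multiplier per round:", round(time_factor, 4))
print("share of trajectories that end below their starting wealth:", share_below_start)
```

The ensemble multiplier is above one, so the average across all copies grows. The multiplier along a single long trajectory is below one, so most individual trajectories shrink. The gamble looks good to the ensemble and bad to anyone who has to live through it in sequence.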

Path dependence means you only get one history.

PART TEN: THE CONSTRAINTS

What Path Dependence Is Not

Path dependence is not determinism.

Determinism says: given initial conditions, the outcome is fixed. Path dependence says: given initial conditions, multiple outcomes are possible. Which one obtains depends on the sequence. The sequence includes contingent events. The outcome is not predictable from the starting state alone.

Path dependence is not inertia.

Inertia is resistance to change due to some frictional force. Remove the friction and the system changes readily. Path dependence is structural. The system cannot reach certain states not because of resistance but because the transitions required to reach them do not exist from the current position.
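The distinction can be sketched as a reachability question on a directed graph. This is an illustrative toy, assuming a hypothetical state graph in which an early choice commits the system to one of two branches. The state names and the `reachable` helper are inventions for this example.

```python
from collections import deque


def reachable(transitions, start):
    """Return every state reachable from `start` by following directed transitions."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in transitions.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Path dependence: once on branch A, no transition leads to branch B.
transitions = {
    "start": ["A1", "B1"],
    "A1": ["A2"],
    "A2": ["A3"],
    "B1": ["B2"],
    "B2": ["B3"],
}

# Inertia, by contrast: an exit transition exists, however costly it may be.
with_exit = {**transitions, "A3": ["start"]}

print(sorted(reachable(transitions, "start")))  # all seven states
print(sorted(reachable(transitions, "A2")))     # only the rest of branch A
print("B3" in reachable(with_exit, "A2"))       # the exit restores access
```

In the first graph, no amount of effort from A2 reaches B3. The edge is not expensive. It is absent. In the second graph the edge exists, so the question is only one of cost.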

Path dependence is not the claim that all choices are permanent.

Some path-dependent processes have escape mechanisms. Phase transitions can break lock-in. Technological disruptions can make switching costs irrelevant by replacing the entire substrate. Institutional revolutions can reset the landscape. These are real. But they are rare, discontinuous, and costly. They do not operate through gradual optimization.

    WHAT PATH DEPENDENCE IS AND IS NOT

    ┌──────────────────────────────────────────────────┐
    │                                                  │
    │   PATH DEPENDENCE IS:                            │
    │                                                  │
    │   • Non-ergodic (outcome depends on sequence)    │
    │   • Self-reinforcing (early leads compound)      │
    │   • Structurally constraining (not just sticky)  │
    │   • Present in physics, biology, economics,      │
    │     technology, institutions, landscapes         │
    │                                                  │
    └──────────────────────────────────────────────────┘

    ┌──────────────────────────────────────────────────┐
    │                                                  │
    │   PATH DEPENDENCE IS NOT:                        │
    │                                                  │
    │   • Determinism (multiple outcomes possible)     │
    │   • Inertia (not mere resistance to change)      │
    │   • Permanence (disruptions can break lock-in)   │
    │   • Evidence that the outcome is suboptimal      │
    │     (lock-in can select good outcomes too)       │
    │                                                  │
    └──────────────────────────────────────────────────┘

The Optimality Question

The most contested question in path dependence theory is whether lock-in produces suboptimal outcomes.

David and Arthur argue that it can. Increasing returns can lock in inferior technologies, inefficient institutions, suboptimal standards. The market does not self-correct because the switching costs exceed any individual’s incentive to switch.

Liebowitz and Margolis counter that documented cases of genuine lock-in to inferior outcomes are rare or nonexistent. They argue that when switching costs are real, the current outcome may be the best achievable outcome given those costs. The system is not stuck at a bad equilibrium. It is at the best equilibrium accessible from where it is.

Both positions are compatible with the mechanism itself. Path dependence does not require suboptimality. It requires only that the outcome depends on the sequence. Whether that outcome is also inferior to some unreachable alternative is a separate empirical question.

The mechanism is neutral. It locks in whatever gets the early advantage. Sometimes that is the best option. Sometimes it is not. The machinery does not know the difference.

PART ELEVEN: THE COMPLETE PICTURE

The Unified Framework

Path dependence operates at every scale of the physical and social world.

    PATH DEPENDENCE ACROSS SCALES

    ┌─────────────────────────────────────────────────────┐
    │  PHYSICS                                            │
    │  Entropy production is path-dependent               │
    │  The arrow of time is accumulated irreversibility   │
    │  Classical reality selects one path from quantum    │
    │  superposition of all paths                         │
    └──────────────────────┬──────────────────────────────┘
                           │
                           ▼
    ┌─────────────────────────────────────────────────────┐
    │  MATHEMATICS                                        │
    │  Polya urn: identical rules produce different       │
    │  outcomes depending on sequence                     │
    │  Increasing returns: early advantage compounds      │
    │  Non-ergodicity: time average ≠ ensemble average    │
    └──────────────────────┬──────────────────────────────┘
                           │
                           ▼
    ┌─────────────────────────────────────────────────────┐
    │  NETWORKS                                           │
    │  Preferential attachment: hubs emerge from          │
    │  arrival order, not intrinsic superiority           │
    │  Power law distributions: fossils of growth         │
    │  history embedded in network topology               │
    └──────────────────────┬──────────────────────────────┘
                           │
                           ▼
    ┌─────────────────────────────────────────────────────┐
    │  BIOLOGY                                            │
    │  Evolution builds on what exists, not what is       │
    │  optimal. Ancestral accidents become permanent      │
    │  architecture. Gould's tape would play differently  │
    │  every time.                                        │
    └──────────────────────┬──────────────────────────────┘
                           │
                           ▼
    ┌─────────────────────────────────────────────────────┐
    │  INSTITUTIONS                                       │
    │  Self-reinforcing mechanisms lock in legal,         │
    │  economic, and social structures. Switching costs   │
    │  grow with each year of operation. The system is    │
    │  a fossil record of its founding conditions.        │
    └─────────────────────────────────────────────────────┘

The common structure across all these domains is this:

A system with multiple possible outcomes encounters a sequence of events. Early events, often contingent and small, establish an initial direction. Positive feedback mechanisms amplify the initial direction. The amplification narrows the accessible state space. The narrowing makes reversal increasingly costly. Eventually, the system locks into a trajectory that is maintained not by the superiority of the outcome but by the accumulated structure of the path itself.
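The sequence above can be sketched as a toy adoption model in the spirit of increasing-returns models of technology choice. This is a simplified sketch, not Arthur's own formulation. It assumes two technologies, adopters who arrive one at a time with a uniformly random taste, and a fixed bonus per unit of installed-base lead. All names and parameter values are hypothetical.

```python
import random


def adoption_run(n_adopters, rng, network_benefit=0.1):
    """Return technology A's final share of adopters.

    Each adopter weighs a private taste against the current installed-base lead.
    """
    installed = {"A": 0, "B": 0}
    for _ in range(n_adopters):
        taste = rng.uniform(-1.0, 1.0)  # > 0 favours A, < 0 favours B
        lead = installed["A"] - installed["B"]
        score = taste + network_benefit * lead  # positive feedback term
        installed["A" if score > 0 else "B"] += 1
    return installed["A"] / n_adopters


rng = random.Random(2)  # fixed seed so the sketch is reproducible
shares = [adoption_run(2_000, rng) for _ in range(50)]

a_wins = sum(s > 0.5 for s in shares)
print("runs won by A:", a_wins, "runs won by B:", len(shares) - a_wins)
print("closest final share to an even split:", round(min(shares, key=lambda s: abs(s - 0.5)), 3))
```

Neither technology is better. The rules are symmetric. Yet every run ends with one side holding nearly the whole market, and which side it is depends on the first few arrivals. Once the lead is large enough that the bonus outweighs any possible taste, no later adopter can choose the other option.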

The Asymmetry

Here is the thing that matters.

Path dependence produces a fundamental asymmetry between creation and dissolution.

Building a path-dependent structure is gradual. One adoption at a time. One investment at a time. One year of learning at a time. The lock-in accumulates slowly, invisibly, without a single decisive moment.

Breaking a path-dependent structure is discontinuous. It requires either overwhelming force (revolution, disruption, phase transition) or patient erosion across a timescale that dwarfs the timescale of lock-in.

    THE ASYMMETRY

    BUILDING LOCK-IN:
    ┌──────────────────────────────────────────────────┐
    │                                                  │
    │  · · · ▪ ▪ ▪ ▪ ▪ █ █ █ █ █ ████████████████    │
    │                                                  │
    │  Gradual. Invisible. Each step is small.         │
    │  No single moment where you could say "stop."    │
    │                                                  │
    └──────────────────────────────────────────────────┘

    BREAKING LOCK-IN:
    ┌──────────────────────────────────────────────────┐
    │                                                  │
    │  ████████████████████████████████████ │ · · · ·  │
    │                                      │           │
    │  Requires discontinuous break.       │           │
    │  The cost is concentrated at the     │           │
    │  moment of transition.               ▼           │
    │                                   Disruption     │
    │                                                  │
    └──────────────────────────────────────────────────┘

This asymmetry is not a flaw in the mechanism. It is the mechanism. Path dependence IS the asymmetry between the ease of entering a trajectory and the difficulty of leaving it.

Final Synthesis

Path dependence is not a special case.

It is the general case.

Ergodic, path-independent processes are the special case. They exist in textbooks, in idealized models, in systems carefully constructed to eliminate history. The actual world, at every scale from thermodynamics to civilization, is path dependent.

The universe accumulates its history in its structure. Every river canyon is a memory of water flow. Every power law network is a memory of growth sequence. Every biological body plan is a memory of ancestral contingency. Every institution is a memory of founding conditions.

The machinery does not care whether you understand it.

Rivers carve canyons whether or not the water knows what it is doing.

Technologies lock in whether or not the market understands the dynamics.

Institutions persist whether or not the participants can see the structure holding them.

But understanding changes the relationship to the machinery.

Not because understanding creates freedom. It does not. The switching costs are real. The accumulated structure is real. The foreclosed paths are genuinely gone.

Understanding changes what is visible.

The person who sees path dependence stops asking “why can’t this change?” and starts asking “what would the switching cost actually be?” The first question assumes the system is irrational. The second question respects the mechanism.

The person who sees path dependence stops treating early decisions as reversible experiments and starts treating them as what they are. Foundational events. Critical junctures. Moments where the leverage over future trajectory is at maximum and will never be this high again.

The person who sees path dependence stops being surprised when the world resists optimization. The world is not resisting. The world is running a path-dependent process. The best achievable outcome is not the one with the best properties in the abstract. It is the best one accessible from where the system actually is, given the actual history that produced it.

The machinery runs.

Whether you see it or not.

CITATIONS

Physics and Thermodynamics

Path Functions and State Functions

Chemistry LibreTexts. “State vs. Path Functions.” https://chem.libretexts.org/Bookshelves/Physical_and_Theoretical_Chemistry_Textbook_Maps/Supplemental_Modules_(Physical_and_Theoretical_Chemistry)/Thermodynamics/Fundamentals_of_Thermodynamics/State_vs._Path_Functions

Entropy Production and Irreversibility

Andrieux, D. & Gaspard, P. (2008). “Entropy production and the arrow of time.” ResearchGate. https://www.researchgate.net/publication/24268815_Entropy_production_and_the_arrow_of_time

PMC (2020). “Time, Irreversibility and Entropy Production in Nonequilibrium Systems.” https://pmc.ncbi.nlm.nih.gov/articles/PMC7517493/

Feynman Path Integrals

Kronecker Wallis. “Feynman Path Integral: A Revolutionary View of Quantum Mechanics.” https://www.kroneckerwallis.com/feynman-path-integral-a-revolutionary-view-of-quantum-mechanics/

Mathematics and Formal Models

Polya Urn Model

Grokipedia. “Polya Urn Model.” https://grokipedia.com/page/P%C3%B3lya_urn_model

MetaSD (2011). “Polya Urn with Increasing Returns.” https://metasd.com/2011/04/polya-urn-with-increasing-returns/

Economics and Technology

Increasing Returns and Path Dependence

Arthur, W.B. (1994). Increasing Returns and Path Dependence in the Economy. University of Michigan Press. https://books.google.com/books/about/Increasing_Returns_and_Path_Dependence_i.html?id=k6Vk5YZRzpEC

QWERTY and Lock-In

David, P.A. (1985). “Clio and the Economics of QWERTY.” American Economic Review 75(2):332-337.

Liebowitz, S.J. & Margolis, S.E. “Path Dependence, Lock-In, and History.” https://personal.utdallas.edu/~liebowit/paths.html

Path Dependence in Economic History

EH.net. “Path Dependence.” https://eh.net/encyclopedia/path-dependence/

Network Theory

Preferential Attachment

Barabási, A.-L. & Albert, R. (1999). “Emergence of scaling in random networks.” Science 286(5439):509-512. Wikipedia overview: https://en.wikipedia.org/wiki/Barab%C3%A1si%E2%80%93Albert_model

Evolutionary Biology

Contingency and Path Dependence

Gould, S.J. (1989). Wonderful Life: The Burgess Shale and the Nature of History. W.W. Norton.

Blount, Z.D., Lenski, R.E., & Losos, J.B. (2018). “Contingency and determinism in evolution: Replaying life’s tape.” Science 362(6415). https://www.science.org/doi/10.1126/science.aam5979

Penn Today. “Evolution Is Unpredictable and Irreversible, Penn Biologists Show.” https://penntoday.upenn.edu/news/evolution-unpredictable-and-irreversible-penn-biologists-show

Institutional Theory

Institutions and Path Dependence

North, D.C. (1990). Institutions, Institutional Change and Economic Performance. Cambridge University Press.

Mahoney, J. (2000). “Path Dependence in Historical Sociology.” Theory and Society 29(4):507-548. https://www.critical-juncture.net/uploads/2/1/9/9/21997192/mahoney_path_dependence_in_historical_sociology.pdf

Critical Junctures

Soifer, H.D. “The Causal Logic of Critical Junctures.” https://www.critical-juncture.net/uploads/2/1/9/9/21997192/soifer_the_causal_logic_of_critical_junctures.pdf

Ergodicity and Non-Ergodic Economics

Ergodicity Economics

Peters, O. (2019). “The ergodicity problem in economics.” Nature Physics 15:1216-1221.

Grokipedia. “Path Dependence.” https://grokipedia.com/page/Path_dependence

Document compiled from research across thermodynamics, quantum mechanics, probability theory, evolutionary biology, network science, institutional economics, and complex systems theory.

THE MACHINERY OF ERGODICITY. Path dependence is the regime where ergodicity fails. Non-ergodic processes cannot substitute ensemble averages for time averages because the path determines the outcome.
THE MACHINERY OF ENTROPY. Entropy production is itself a path function. The arrow of time that entropy describes is the deepest physical expression of path dependence.
THE MACHINERY OF FEEDBACK LOOPS. Positive feedback is the engine that converts small initial advantages into lock-in. Every increasing-returns mechanism in path dependence is a feedback loop running unchecked.
THE MACHINERY OF SYMMETRY BREAKING. Critical junctures are moments of symmetry breaking. A system with multiple equivalent possible outcomes selects one, and the symmetry among alternatives is permanently destroyed.