Chain of Thought
The emergence of inner speech in language models—how explicit step-by-step reasoning transforms performance and enables complex problem-solving.
Chain of Thought (CoT) is a prompting technique that elicits step-by-step reasoning from language models before producing a final answer. More profoundly, it represents the emergence of something analogous to inner speech—the internal monologue that humans use to think through complex problems.
The Discovery
Before Chain of Thought, language models were prompted to produce answers directly:
Question: If John has 3 apples and buys 5 more, then gives half to Mary, how many does John have?
Direct prompting: “5” ❌ (often wrong on multi-step problems)
With Chain of Thought, the model is encouraged to “think aloud”:
Question: [same]
CoT prompting: “John starts with 3 apples. He buys 5 more, so he has 3 + 5 = 8 apples. He gives half to Mary, so he keeps 8 / 2 = 4 apples. The answer is 4.” ✓
The same model, with explicit reasoning, performs dramatically better.
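To make the contrast concrete, here is a minimal Python sketch of the two prompting styles. The `complete` call is a placeholder for whichever model-completion API you use, not a specific library function; the question simply mirrors the example above.

```python
def direct_prompt(question: str) -> str:
    # Ask for the answer only, with no room for intermediate reasoning.
    return f"Question: {question}\nAnswer:"

def cot_prompt(question: str) -> str:
    # Ask the model to spell out its reasoning before committing to an answer.
    return (
        f"Question: {question}\n"
        "Work through the problem step by step, then state the final answer.\n"
        "Reasoning:"
    )

question = (
    "If John has 3 apples and buys 5 more, then gives half to Mary, "
    "how many does John have?"
)
# answer = complete(direct_prompt(question))   # often wrong on multi-step problems
# answer = complete(cot_prompt(question))      # reasoning first, then the answer
```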
The Inner Speech Parallel
In developmental psychology, inner speech refers to the internalization of language as a tool for thought. Children first solve problems by talking aloud, then gradually internalize this dialogue.
Vygotsky described this progression:
- External speech: Talking through problems out loud
- Private speech: Talking to oneself (still audible)
- Inner speech: Internalized verbal thought
```mermaid
graph TD
    H1[HUMAN DEVELOPMENT<br/>External speech] --> H2[Private speech<br/>internalization]
    H2 --> H3[Inner speech<br/>efficient, compressed]
    M1[MODEL PROMPTING<br/>Zero-shot<br/>no reasoning shown] --> M2[Explicit CoT<br/>Chain of Thought elicits<br/>explicit reasoning]
    M2 --> M3[Internalized reasoning?<br/>implicit, reliable<br/>not yet achieved]
    style H1 fill:#0a0a0a,stroke:#00ff00,stroke-width:2px,color:#cccccc
    style H2 fill:#0a0a0a,stroke:#00ff00,stroke-width:1px,color:#cccccc
    style H3 fill:#0a0a0a,stroke:#00ff00,stroke-width:1px,color:#cccccc
    style M1 fill:#0a0a0a,stroke:#00ff00,stroke-width:2px,color:#cccccc
    style M2 fill:#0a0a0a,stroke:#00ff00,stroke-width:1px,color:#cccccc
    style M3 fill:#0a0a0a,stroke:#00ff00,stroke-width:1px,color:#cccccc
```
Current language models are at an interesting developmental stage: they perform better with explicit reasoning (like children with private speech) but haven’t fully internalized this capacity (like adult inner speech).
Mechanisms
Why does Chain of Thought work? Several mechanisms contribute:
Computation Extension
Each reasoning step adds tokens, providing more “compute” for the model to work through the problem. Complex problems need more processing.
Context Maintenance
Intermediate steps keep relevant information visible in the context. Without them, the model must carry every intermediate result implicitly in its activations within a single forward pass.
Error Correction
Explicit steps make errors visible (to the model and observers), allowing for mid-course correction.
Structure Imposition
The step-by-step format imposes logical structure, discouraging the model from jumping straight to a conclusion.
Knowledge Retrieval
Each step can trigger relevant knowledge retrieval, building up the information needed for the final answer.
Variants
Chain of Thought has spawned many variations:
Zero-Shot CoT
Simply adding “Let’s think step by step” to the prompt elicits reasoning without examples.
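A minimal sketch of zero-shot CoT; the entire technique is a single trigger phrase appended to the question.

```python
def zero_shot_cot(question: str) -> str:
    # The whole technique: one reasoning trigger phrase after the question.
    return f"Q: {question}\nA: Let's think step by step."
```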
Few-Shot CoT
Providing examples of problems with reasoning traces before the target question.
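A small sketch of few-shot CoT. The worked example and its reasoning trace are purely illustrative, not taken from any benchmark.

```python
# One worked example with its full reasoning trace, prepended to the target question.
FEW_SHOT_EXAMPLES = (
    "Q: A pen costs $2 and a notebook costs $3. "
    "How much do 2 pens and 1 notebook cost?\n"
    "A: Two pens cost 2 * 2 = 4 dollars. Adding one notebook, 4 + 3 = 7 dollars. "
    "The answer is 7.\n\n"
)

def few_shot_cot(question: str) -> str:
    # The example trace shows the model the expected reasoning format.
    return FEW_SHOT_EXAMPLES + f"Q: {question}\nA:"
```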
Self-Consistency
Generate multiple reasoning chains and take the majority answer. Different reasoning paths should converge on correct answers.
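A rough sketch of self-consistency, again assuming a placeholder `complete(prompt, temperature=...)` helper; the answer-extraction heuristic here is deliberately crude.

```python
from collections import Counter

def self_consistency(question: str, complete, n: int = 10) -> str:
    # Sample n independent reasoning chains (temperature > 0 so they differ),
    # pull out each chain's final answer, and return the most common one.
    prompt = f"Q: {question}\nA: Let's think step by step."
    answers = [final_answer(complete(prompt, temperature=0.7)) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

def final_answer(chain: str) -> str:
    # Crude heuristic: treat the last non-empty line of the chain as the answer.
    return [line for line in chain.splitlines() if line.strip()][-1].strip()
```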
Tree of Thought
Explore multiple reasoning branches, evaluating and pruning, rather than a single chain.
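The sketch below is a deliberately simplified, breadth-first version of the idea, not the algorithm from the Tree of Thought paper; `propose` and `score` stand in for model calls that suggest and rate candidate next steps.

```python
def tree_of_thought(question: str, propose, score, breadth: int = 3, depth: int = 3) -> str:
    # Keep a frontier of partial reasoning paths. At each level, ask the model
    # to propose continuations, score them, and retain only the best `breadth`.
    frontier = [""]
    for _ in range(depth):
        candidates = [
            path + "\n" + step
            for path in frontier
            for step in propose(question, path)   # model proposes next steps
        ]
        frontier = sorted(candidates, key=score, reverse=True)[:breadth]
    return frontier[0]   # highest-scoring reasoning path
```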
Chain of Thought with Self-Critique
The model generates reasoning, then critiques its own logic, then revises.
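A sketch of the generate, critique, revise loop, using the same placeholder `complete` helper; the prompt wording is illustrative, not a published recipe.

```python
def cot_with_self_critique(question: str, complete) -> str:
    # Three model calls: draft the reasoning, critique it, then revise.
    draft = complete(f"Q: {question}\nA: Let's think step by step.")
    critique = complete(
        f"Question: {question}\n"
        f"Proposed reasoning:\n{draft}\n"
        "Point out any logical or arithmetic errors in this reasoning."
    )
    return complete(
        f"Question: {question}\n"
        f"Proposed reasoning:\n{draft}\n"
        f"Critique:\n{critique}\n"
        "Rewrite the reasoning, fixing the issues above, and give the final answer."
    )
```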
Key Papers

Chain of Thought Prompting
Wei et al. (2022) showed that prompting for step-by-step reasoning dramatically improved performance on math and reasoning tasks.
Zero-Shot CoT
Kojima et al. (2022) demonstrated that simply adding “Let’s think step by step” triggers reasoning without examples.
Tree of Thought
Yao et al. (2023) extended linear chains to branching exploration of reasoning paths.
Chain of Thought and Agents
Chain of Thought is foundational to agent behavior. The “reasoning” phase of the agent loop is essentially formalized CoT:
```mermaid
graph LR
    A[Observe] --> B[Reason]
    B --> C[Act]
    C --> D[Observe]
    B --> COT[Chain of Thought<br/><br/>I see X.<br/>To achieve goal Y,<br/>I should do Z<br/>because...]
    style A fill:#0a0a0a,stroke:#00ff00,stroke-width:1px,color:#cccccc
    style B fill:#0a0a0a,stroke:#00ff00,stroke-width:2px,color:#cccccc
    style C fill:#0a0a0a,stroke:#00ff00,stroke-width:1px,color:#cccccc
    style D fill:#0a0a0a,stroke:#00ff00,stroke-width:1px,color:#cccccc
    style COT fill:#0a0a0a,stroke:#00ff00,stroke-width:2px,color:#cccccc
```
ReAct, one of the most widely used agent paradigms, is essentially CoT interleaved with actions: its “thought” traces are Chain of Thought reasoning.
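A highly simplified sketch of that interleaving, with `complete` as a placeholder model call and `execute` as a placeholder tool runner; real ReAct implementations parse structured actions and handle termination more carefully.

```python
def react_loop(goal: str, complete, execute, max_steps: int = 5) -> str:
    # Interleave "Thought:" (chain-of-thought) and "Action:" lines,
    # feeding each action's observation back into the growing transcript.
    transcript = f"Goal: {goal}\n"
    for _ in range(max_steps):
        thought = complete(transcript + "Thought:")
        transcript += f"Thought:{thought}\n"
        if "final answer" in thought.lower():
            break
        action = complete(transcript + "Action:")
        observation = execute(action)            # tool call, search, API, etc.
        transcript += f"Action:{action}\nObservation: {observation}\n"
    return transcript
```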
Limitations
Chain of Thought is not a panacea:
Faithfulness
The stated reasoning may not reflect the model’s actual “computation.” CoT can be post-hoc rationalization rather than genuine reasoning.
Verbosity
Reasoning takes tokens. This increases latency and cost, and consumes context window.
Error Propagation
Once the model commits to a reasoning path, errors compound. Wrong early steps lead to wrong conclusions.
Manipulation
CoT can be used to manipulate model behavior—leading the reasoning toward desired (but incorrect) conclusions.
The Future
Chain of Thought represents an early understanding of how to elicit systematic reasoning from language models. Open questions include:
- Internalization: Can models learn to reason without explicit chains?
- Verification: How do we check if reasoning is faithful?
- Efficiency: Can we get the benefits of CoT with less verbosity?
- Teaching: Can CoT during training improve base reasoning ability?
The developmental trajectory—from external to internal speech—suggests that future models might reason more efficiently without explicit chains. But we’re not there yet.
See Also
- ReAct — CoT interleaved with actions
- The Agent Loop — where reasoning fits in agent behavior
- Scaffolding — how external structure supports model capabilities