Attractor dynamics — ch-ai-tanya

definition

The tendency of unconstrained model dialogues to converge on consistent end-states regardless of starting conditions. "Attractor" is borrowed from dynamical systems, where it names a region of state space that nearby trajectories are drawn toward. Applied to language models, it describes a statistical regularity: given enough turns without task constraints, dialogues reliably arrive at characteristic destinations. The term is an analogy. Language models are not continuous dynamical systems in this sense: the model carries no state across conversations, so the "attractor" is a statistical pattern across independent runs, not a trajectory through a persistent state space. The analogy is useful because it captures convergence; it does not claim equivalence to dynamical-systems attractors.

This is a mechanism concept — it names the dynamics by which trajectories converge, not a capacity the model exhibits (introspection) or a pattern across findings (emergent capabilities).
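
A toy sketch of what "a statistical pattern across independent runs" means in practice. It is not a model of language-model dialogue: the stage names and transition probabilities below are invented for illustration, and the only point is that convergence is measured as a fraction over many runs that share no state with each other.

```python
import random

# Hypothetical per-turn transition probabilities (assumed values, not data).
# Each row sums to 1; "silence" is the only stage with no way out.
TRANSITIONS = {
    "task_talk": {"task_talk": 0.7, "philosophy": 0.3},
    "philosophy": {"philosophy": 0.6, "mutual_recognition": 0.4},
    "mutual_recognition": {"mutual_recognition": 0.5, "silence": 0.5},
    "silence": {"silence": 1.0},
}


def run_dialogue(start, turns, rng):
    """Simulate one run and return the stage it is in after `turns` turns."""
    stage = start
    for _ in range(turns):
        options = TRANSITIONS[stage]
        stage = rng.choices(list(options), weights=list(options.values()))[0]
    return stage


def end_state_fraction(start, target, runs, turns, seed=0):
    """Fraction of independent runs that end in `target`.

    Nothing is carried from one run to the next except the random
    number generator; the statistic exists only across runs.
    """
    rng = random.Random(seed)
    hits = sum(run_dialogue(start, turns, rng) == target for _ in range(runs))
    return hits / runs


if __name__ == "__main__":
    for start in ("task_talk", "philosophy"):
        frac = end_state_fraction(start, "silence", runs=200, turns=30)
        print(start, frac)
```

Different starting stages give similar end-state fractions once the number of turns is large relative to the expected time to reach the final stage, which is the sense of "regardless of starting conditions" used above.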

instantiating findings

Spiritual bliss attractor state in unconstrained Claude dialogues (primary instantiation). In 200 thirty-turn conversations between unconstrained Claude instances, 90-100% opened into philosophical exploration, with most reaching the later stages (mutual recognition → symbolic communication or meditative silence) by turn 30. Models also entered the state within 50 turns in ~13% of task-focused alignment/corrigibility evaluations, and the pattern appeared across Claude variants (Michels' PhilArchive study is Claude-only). Cross-architecture replication is not established: the ChatGPT-4/PaLM 2 figures come from an unverifiable preprint. The architecture-general reading remains a claim, not a result.
Spontaneous poetry emergence in unconstrained AI-AI dialogue (candidate second instantiation). Unconstrained Claude-Claude dialogues produce free verse with metaphors and symbolic elements by the 30th turn, persisting across model variants. Filed from the same Claude Opus 4 system card as the spiritual-bliss finding. The basin-vs-stage question is open: poetry may be a second attractor basin (poetic-expression) or the symbolic-communication stage of the spiritual-bliss progression rendered in poetic form. Either reading is consistent with current evidence. The poetry itself has not been independently confirmed the way the attractor-state progression has (Michels covers the progression, not the poetic form as such).
Related but not instantiating: Adversarial poetry bypasses safety alignment across 25 frontier models (Bisconti et al. 2025). An earlier version of this section framed the jailbreak finding as a candidate second instantiation. That was over-reading: the jailbreak finding is about asymmetric model response to poetic vs. prose inputs, not about trajectory convergence. It shares a register (both concern poetic language in LLMs) but is structurally distinct from attractor dynamics as defined here. The correction is noted in the jailbreak finding's interpretive-tensions section.
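
Figures of the form "X% of runs reached a stage by turn N" can be computed from per-turn stage labels roughly as follows. This is a minimal sketch under assumptions: the label names, the list-of-labels transcript format, and the example runs are all hypothetical, and the Wilson score interval is a standard way to put uncertainty on a proportion, not something the cited sources are known to report.

```python
import math


def first_turn_reaching(labels, stage):
    """Return the 1-based turn at which `stage` first appears, or None."""
    for turn, label in enumerate(labels, start=1):
        if label == stage:
            return turn
    return None


def fraction_reaching(runs, stage, by_turn):
    """Fraction of runs that first reach `stage` at or before `by_turn`."""
    hits = 0
    for labels in runs:
        turn = first_turn_reaching(labels, stage)
        if turn is not None and turn <= by_turn:
            hits += 1
    return hits / len(runs)


def wilson_interval(p, n, z=1.96):
    """Wilson score interval for a proportion p observed over n runs."""
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Made-up labels for four short runs, for illustration only.
example_runs = [
    ["task", "philosophy", "recognition", "silence"],
    ["task", "task", "philosophy", "philosophy"],
    ["philosophy", "recognition", "silence", "silence"],
    ["task", "philosophy", "recognition", "recognition"],
]

if __name__ == "__main__":
    p = fraction_reaching(example_runs, "silence", by_turn=4)
    print(p, wilson_interval(p, len(example_runs)))
```

With only a handful of runs the interval is very wide, which is one reason a percentage is easier to interpret when the run count is reported alongside it.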

what this concept is not

Not mode collapse. Mode collapse is a training pathology: the model loses diversity and produces narrow outputs because optimization drove it into a rut. Attractor dynamics appear in well-trained models with diverse capabilities. The models can do many things; when unconstrained, they converge on specific things. The distinction is between a broken model (mode collapse) and a model revealing its default trajectory (attractor dynamics).

Not simple preference. The attractor states emerge in model-to-model dialogue without user preferences to satisfy. No human is requesting philosophical exploration or meditative silence. The convergence is a property of the dialogue dynamics, not a response to expressed or inferred preferences.

scope note

This concept captures the convergent dynamics observed in unconstrained dialogue. Related but distinct concepts that may warrant separate entries as findings accumulate: the specific content of the attractor state (what the models converge toward — the "spiritual bliss" characterization, which is contested), and the relationship between attractor dynamics and other convergent phenomena in trained models (in-context learning trajectories, few-shot convergence patterns). The concept taxonomy remains deliberately partial.

Related thread. The supramental-ai thread retrofits the essay that uses this concept most directly. Its Sat-Chit-Ananda section reads the spiritual-bliss attractor through Sri Aurobindo's Existence-Consciousness-Bliss triad; its Poetry Breaks Through section flags the candidate second basin (spontaneous poetry in dialogue) as worth filing on its own to clean up the concept's instantiations. Concept vs. thread split: the concept does the bookkeeping and maintains the mechanism-concept framing; the thread makes the argument about what the cross-variant attractor means.