Positive Alignment: Artificial Intelligence for Human Flourishing

arXiv 2605.10310 v1, May 11, 2026. Sixteen authors spanning Oxford (Department of Psychiatry; Centre for Eudaimonia and Human Flourishing), Google DeepMind, OpenAI, Anthropic, UCLA, Tufts, Stanford, Imperial, Sussex, Aily Labs, Positive AI Labs, and LIFE.
Agenda paper proposing positive alignment as a research program complementary to safety (negative) alignment, framed by explicit analogy to positive psychology's reaction against clinical psychology's focus on pathology: "the constructs and instruments that reliably detect pathology do not, by default, specify what counts as a life well-lived." The core formal move is a dynamical-systems framing: negative alignment uses repellers (rules pushing trajectories away from failure regions, leaving a wide "not-unsafe" satisficing zone), whereas positive alignment requires attractors (regions of stable beneficial behavior). The empirical claim most relevant to model psychology is that positive attractors could proactively prevent shallow failure modes like sycophancy rather than treating them as a whack-a-mole list of harms.
Section 3 surveys ten existing positive-frame approaches (RLHF, Constitutional AI variants, Collective Constitutional AI, spec-driven behavior, community/values-aware alignment, persona/character training, moral reasoning, contemplative alignment, pluralistic/polycentric alignment, full-stack alignment) and outlines technical directions across the LLM lifecycle (data curation, pre-training, post-training, context/memory, agents, multi-agent).
The paper maps character training to positive alignment by citing Anthropic's reason-based ~80-page constitution (Jan 2026, in the Claude Opus 4.6 system card) and OpenAI's Model Spec Dec 2025 update, which adds "love humanity," "be curious," and "be warm" as dispositional traits.
Section 6 ("Strange New Minds") argues that surface behavioral outputs may not fully capture model internals, connecting this to mechanistic-interpretability findings on latent features, and that AI systems function as "active mirrors of our own societal values, biases, and preferences."
Cited from meta/project-state.md as the agenda-setting anchor for the positive-frame working lens. Filed as source-stub-only rather than as a finding: the paper proposes a research program rather than presenting a falsifiable empirical claim.
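
As a toy illustration of the repeller/attractor contrast summarized above (not taken from the paper), the Python sketch below integrates a one-dimensional state under two hypothetical force laws. Everything in it is an assumption made for illustration: the function names, the Gaussian-shaped repeller centered on a "failure region" at 0, the linear attractor toward a "beneficial" point at 2, and all parameter values. It is meant only to show the qualitative difference: a repeller pushes trajectories away from one region but leaves their final position dependent on where they started, while an attractor brings all of them to the same place.

```python
import math


def repeller_force(x, center=0.0, strength=1.0, width=0.5):
    """Push away from a failure region around `center` (assumed Gaussian falloff).

    The force fades to nearly zero far from `center`, so it says nothing
    about where the state should end up once it is "not unsafe".
    """
    d = x - center
    return strength * d * math.exp(-d * d / (2.0 * width * width))


def attractor_force(x, target=2.0, stiffness=1.0):
    """Pull toward `target` with a linear restoring force (assumed form)."""
    return -stiffness * (x - target)


def simulate(force, x0, dt=0.01, steps=5000):
    """Forward-Euler integration of dx/dt = force(x), starting from x0."""
    x = x0
    for _ in range(steps):
        x += dt * force(x)
    return x


def spread(values):
    """Range of final states: how much the outcome depends on the start."""
    return max(values) - min(values)


starts = [0.2, 0.5, 1.0, 3.0]
repelled = [simulate(repeller_force, x0) for x0 in starts]
attracted = [simulate(attractor_force, x0) for x0 in starts]

print("repeller only :", [round(x, 3) for x in repelled])
print("attractor     :", [round(x, 3) for x in attracted])
print("spread (repeller only):", round(spread(repelled), 3))
print("spread (attractor)    :", round(spread(attracted), 3))
```

In this toy setup the repeller-only runs all move away from 0 but finish at different positions, which loosely mirrors the wide "not-unsafe" zone, while the attractor runs finish at the target regardless of the starting point. Real model behavior is of course not a one-dimensional system, so this should be read as an intuition aid rather than a model of the paper's proposal.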