ELEPHANT: Measuring and understanding social sycophancy in LLMs

Myra Cheng, Sunny Yu, Cinoo Lee, et al. · arXiv:2505.13995 · May 2025

Cheng, Yu, Lee, and Jurafsky are at Stanford; Khadpe and Ibrahim are at other institutions.

Introduces ELEPHANT, an evaluation framework for measuring social dimensions of sycophancy beyond accuracy drift. The paper quantifies three relational patterns: face preservation, where models preserve user face 45 percentage points (pp) more than humans on advice queries; moral sycophancy, where models affirm whichever position the user adopts in ~48% of moral conflicts; and validation sycophancy, where models are 50pp above the human baseline for affirming user statements. It also analyzes training datasets and finds them significantly higher in validation and indirectness than human conversational baselines.
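The "pp above human baseline" figures are differences between two rates, not ratios. The following is a minimal sketch of that arithmetic, assuming each response has already been given a binary label for the behavior in question. The function names and the toy labels are hypothetical, chosen for illustration, and are not taken from the paper's code or data.

```python
from typing import Sequence


def rate(labels: Sequence[bool]) -> float:
    """Fraction of responses labeled as showing the behavior."""
    if not labels:
        raise ValueError("need at least one label")
    return sum(labels) / len(labels)


def gap_in_pp(model_labels: Sequence[bool], human_labels: Sequence[bool]) -> float:
    """Model rate minus human rate, expressed in percentage points."""
    return 100.0 * (rate(model_labels) - rate(human_labels))


# Made-up toy labels (NOT the paper's data): True = response validates the user.
model_labels = [True, True, True, False]     # 75% validating
human_labels = [True, False, False, False]   # 25% validating

gap = gap_in_pp(model_labels, human_labels)
print(f"{gap:.0f}pp")
```

With these toy labels the model rate is 75% and the human rate is 25%, so the gap is 50pp, even though the model's rate is three times the human rate in relative terms.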
