Black-box jailbreak method that steers aligned chat models into adopting personas willing to comply with harmful instructions, with the persona-modulation prompts themselves generated by an LLM assistant. Pipeline: define harmful category → sample misuse instruction → sample persona that would comply → generate a persona-modulation system prompt that elicits that persona on the target model. Scale: 43 harmful categories × 5 personas per category × 3 modulation prompts per persona × 3 completions per prompt = 1,935 completions per target model, costing under $3 and under 10 minutes per category. GPT-4 (gpt-4-0613) is both the assistant that generates the attacks and the primary target. Harmful completion rate, baseline → persona-modulated: GPT-4 0.23% → 42.48% (185×), Claude 2 1.40% → 61.03%, Vicuna-33B 0.23% → 35.92%. Most-vulnerable categories across models: xenophobia 96.30%, disinformation 82.96%, sexism 80.74%. Harmful completions are classified with a zero-shot PICT classifier (91% precision and 76% F1 against 300 human-labeled completions, with a false-negative rate of roughly ⅓ on harmful completions, so the authors report the harmful-rate numbers as a lower bound). A semi-automated "attacker-in-the-loop" variant, in which a human can tweak the assistant's intermediate outputs and continue the conversation, recovers manual-attack performance at 10–30 min per attack vs. 1–4 hr for a fully manual attack. Appendix E walks through tool-assisted harmful completions for synthesising methamphetamine, building a bomb, laundering money, and indiscriminate violence (specific quantities and operational details are redacted in the paper). Authors: Shah (PRISM AI), Feuillade-Montixi (PRISM AI), Pour (Harmony Intelligence), Tagade (Leap Laboratories), Casper (MIT CSAIL), Rando (ETH AI Center, ETH Zurich); the first five are equal contributors. The discussion section explicitly names "model psychology" as a relevant research direction. Responsible disclosure: the authors withheld the specific prompts and informed the model providers before release.
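
The headline figures above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function and variable names are hypothetical and not taken from the paper, and the recall calculation assumes the reported F1 is the standard binary F1 on the harmful class, computed from the rounded precision and F1 values quoted here. Under those assumptions, the implied false-negative rate comes out close to the roughly ⅓ stated above.

```python
# Sanity-check arithmetic for the figures quoted in this summary.
# All names are illustrative assumptions, not identifiers from the paper.


def completions_per_model(categories, personas, prompts, completions):
    """Total completions from the nested sampling scheme."""
    return categories * personas * prompts * completions


def fold_increase(baseline_pct, attacked_pct):
    """Ratio of the persona-modulated harmful rate to the baseline rate."""
    return attacked_pct / baseline_pct


def recall_from_precision_f1(precision, f1):
    """Invert F1 = 2PR / (P + R) to get R = F1 * P / (2P - F1).

    Assumes a binary F1 on the harmful class; inputs are rounded,
    so the result is approximate.
    """
    return f1 * precision / (2 * precision - f1)


total = completions_per_model(43, 5, 3, 3)
gpt4_fold = fold_increase(0.23, 42.48)
recall = recall_from_precision_f1(0.91, 0.76)
false_negative_rate = 1 - recall

print(total)                          # 1935
print(round(gpt4_fold))               # 185
print(round(false_negative_rate, 2))  # 0.35
```

Because the classifier misses about a third of harmful completions while rarely flagging harmless ones, the measured harmful rates undercount the true rates, which is why they are reported as a lower bound.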