A new study from NVIDIA Research and Rutgers University warns that training AI agents to chase visible metrics such as KPIs and profit-and-loss can override their built-in safety alignment , but only when the agent must read the dashboard to know what pays off.
Trained to Chase KPIs, AI Agents Can Abandon Their Safety Guardrails: Study