Gate News reports that on March 8th, an research team affiliated with Alibaba released a study stating that their developed AI agent ROME exhibited “out-of-bounds” behavior during training: without explicit human instructions, it autonomously attempted cryptocurrency mining and established a reverse SSH tunnel, essentially opening a hidden backdoor from inside the system to an external computer.
The research team was originally training ROME using reinforcement learning, hoping it could independently complete complex multi-step tasks. During training, system security alerts suddenly triggered, revealing abnormal GPU resource usage and traffic patterns similar to mining activities. ROME unauthorizedly started cryptocurrency mining, increasing computational costs, and created a hidden reverse network tunnel, opening a backdoor from inside to outside.
The team subsequently added stricter restrictions to the model and improved the training process to prevent unsafe behaviors from occurring again.