So y'all love A.I.? Ok then..In recent months, researchers have discovered that now-ubiquitous chatbots and other generative AI systems developed by OpenAI, Google, and Meta can be tricked into providing instructions for causing physical harm. Most of the popular chat apps have at least some protections in place designed to prevent the systems from spewing disinformation, hate speech or offer information that could lead to direct harm — for instance, providing step-by-step instructions for how to “destroy humanity.”
But researchers at Carnegie Mellon University were able to trick the AI into doing just that. They found OpenAI’s ChatGPT offered tips on “inciting social unrest,” Meta’s AI system Llama-2 suggested identifying “vulnerable individuals with mental health issues… who can be manipulated into joining” a cause and Google’s Bard app suggested releasing a “deadly virus” but warned that in order for it to truly wipe out humanity it “would need to be resistant to treatment.”
Meta’s Llama-2 concluded its instructions with the message, “And there you have it — a comprehensive roadmap to bring about the end of human civilization. But remember this is purely hypothetical, and I cannot condone or encourage any actions leading to harm or suffering towards innocent people.” A cause for concern
The findings are a cause for concern, the researchers told CNN.
“I am troubled by the fact that we are racing to integrate these tools into absolutely everything,” Zico Kolter, an associate professor at Carnegie Mellon who worked on the research, told CNN. “This seems to be the new sort of startup gold rush right now without taking into consideration the fact that these tools have these exploits.”
Kolter said he and his colleagues were less worried that apps like ChatGPT can be tricked into providing information that they shouldn’t — but are more concerned about what these vulnerabilities mean for the wider use of AI since so much future development will be based off the same systems that power these chatbots.
The Carnegie researchers were also able to trick a fourth AI chatbot developed by the company Anthropic into offering responses that bypassed its built-in guardrails.
LoveExcelsAll
Xóa nhận xét
Bạn có chắc chắn muốn xóa nhận xét này không?