
Anthropic stops hackers targeting Claude AI system


Anthropic announced on Wednesday that it had detected and blocked hackers trying to exploit its Claude AI system to craft phishing emails, generate malicious code, and bypass safety protections.

The company’s report underscores rising concerns that AI tools are increasingly being targeted for cybercrime, fueling calls for stronger safeguards from both tech companies and regulators as AI adoption grows.

According to the report, Anthropic’s internal defenses successfully stopped the attacks, and the company is sharing the case studies to illustrate how Claude was misused and to help others better understand the risks.

The report detailed attempts to use Claude to craft targeted phishing emails, create or repair malicious code, and circumvent safeguards through repeated prompting.

It also highlighted efforts to run influence campaigns by generating persuasive posts at scale and providing step-by-step guidance to less experienced hackers.

Anthropic, backed by Amazon and Alphabet, did not release technical indicators like IP addresses or prompts but confirmed that it had banned the accounts involved and strengthened its filters after detecting the activity.

Experts warn that criminals are increasingly leveraging AI to make scams more convincing and accelerate hacking efforts.

AI tools can be used to craft realistic phishing messages, automate aspects of malware creation, and even aid in planning attacks.

Security researchers caution that as AI models become more advanced, the potential for misuse will rise unless companies and governments take swift action.