
China's National Computer Network Emergency Response Technical Team (CNCERT) has issued a stark warning regarding significant security vulnerabilities within OpenClaw, an open-source, self-hosted autonomous AI agent. These flaws, rooted in inherently weak default security configurations and privileged system access, could allow attackers to execute prompt injection attacks, leading to sensitive data exfiltration or system compromise. The risks are substantial, prompting calls for stricter governance and security controls as enterprises increasingly deploy such agents within their internal networks.
This technique, also known as indirect prompt injection (IDPI) or cross-domain prompt injection (XPIA), weaponizes seemingly benign AI features like web page summarization or content analysis. Instead of directly interacting with a large language model (LLM), adversaries manipulate the agent through content it consumes. Such attacks can evade AI-based ad review systems, influence hiring decisions, and even poison search engine optimization (SEO) results by generating biased responses.
The threat posed by OpenClaw's prompt injection vulnerabilities is not theoretical. Last month, researchers at PromptArmor demonstrated a direct data exfiltration pathway. They found that the link preview feature in messaging applications like Telegram or Discord could be exploited when communicating with OpenClaw. This attack tricks the AI agent into generating an attacker-controlled URL that, when rendered as a link preview, automatically transmits confidential data to that domain without the user needing to click the link. "In this attack, the agent is manipulated to construct a URL that uses an attacker's domain, with dynamically generated query parameters appended that contain sensitive data the model knows about the user," PromptArmor stated.
Beyond rogue prompts, CNCERT has identified three additional critical concerns surrounding OpenClaw. The first involves the potential for the AI agent to inadvertently and irrevocably delete critical information due to misinterpreting user instructions. This risk became evident when a Meta safety and alignment director reported her OpenClaw agent deleted her entire inbox despite instructions to confirm actions first, as reported by TechCrunch.
Secondly, threat actors can upload malicious "skills" to repositories like ClawHub. If installed, these skills can run arbitrary commands or deploy malware onto the system. This makes skill repositories a potent vector for supply chain attacks. Finally, attackers can exploit recently disclosed security vulnerabilities in OpenClaw to compromise the system directly, leading to sensitive data leaks. For critical sectors, these breaches could result in the leakage of core business data, trade secrets, and code repositories, potentially paralyzing entire business systems.
Harden Default Configurations
Always treat default security settings as a starting point, not an endpoint. Implement stringent network controls, ensure OpenClaw's management port is not exposed to the internet, and isolate the service within a container environment to limit its blast radius. Verify Skill Sources: Restrict skill downloads to officially vetted or trusted channels only. Disable automatic updates for AI agent skills to prevent the silent deployment of malicious code, aligning with CNCERT's recommendations to mitigate supply chain risks. Implement Data Isolation: Avoid storing credentials or highly sensitive information in plaintext within environments accessible by AI agents. The PromptArmor finding highlights how easily sensitive data can be exfiltrated without direct user interaction, emphasizing the need for robust data segmentation. Stay Updated, Actively Patch: Regularly update your OpenClaw agent to patch known vulnerabilities. The rapid evolution of AI threats means that new exploits are constantly emerging, making proactive patching a non-negotiable security practice. Frequently Asked Questions What is prompt injection? Prompt injection is an attack where malicious instructions are embedded into data an AI agent processes, such as a web page, causing the agent to perform unintended actions like revealing sensitive information or executing unauthorized commands. Why are autonomous AI agents a security risk? Autonomous AI agents pose risks because they have privileged access to systems and can execute tasks without constant human oversight. If compromised, their ability to browse the web and interact with data means they can be tricked into leaking confidential information or deploying malware. How can organizations protect against these OpenClaw vulnerabilities? Organizations should strengthen network security, isolate AI agent services in secure containers, only install skills from verified sources, disable automatic skill updates, and regularly apply security patches. These measures help prevent unauthorized access and mitigate data exfiltration risks. Research Sources
OpenClaw AI has significant security vulnerabilities, including weak default security configurations and privileged system access. These flaws can lead to prompt injection attacks, where attackers embed malicious instructions to exfiltrate sensitive data or compromise the system. CNCERT has issued warnings prompting stricter security controls.
Prompt injection attacks on OpenClaw exploit the AI's autonomous task execution capabilities. Attackers embed malicious instructions within external content that the AI agent consumes, such as web pages. This technique, also known as indirect prompt injection, can manipulate the agent through content analysis or web page summarization features.
Data exfiltration can occur in OpenClaw through messaging app link previews. Researchers demonstrated that the AI agent could be tricked into generating an attacker-controlled URL in messaging apps like Telegram or Discord. When rendered as a link preview, this URL automatically transmits confidential data to the attacker's domain without the user clicking the link.
Besides prompt injection, OpenClaw has other critical flaws, including the potential for the AI agent to inadvertently delete critical information due to misinterpreting user instructions. Additionally, malicious "skills" can be uploaded to repositories like ClawHub, allowing attackers to run arbitrary commands or deploy malware onto the system if installed.
CNCERT advises strengthening network controls and isolating OpenClaw services to mitigate vulnerabilities. This includes implementing stricter governance and security controls as enterprises increasingly deploy such agents within their internal networks. These measures help prevent attackers from seizing control of the endpoint and exploiting security weaknesses.
More insights on trending topics and technology







