“NemoClaw addresses deployment, not the full trust chain. Enterprise buyers need to understand what NVIDIA shipped, what NVIDIA did not ship, and what they must build or source themselves before production approval.”
— Analyst assessment, March 2026
NVIDIA launched NemoClaw at GTC 2026 with 17 enterprise partners, a kernel-level sandbox, a YAML policy engine, and a privacy router. The security improvements over vanilla OpenClaw are genuine and significant. We have documented them in detail across our NemoClaw vs. OpenClaw analysis and our architecture deep dive.
This post is not about what NemoClaw does well. This post is about what it does not do at all.
If you are a CTO or CISO running NemoClaw through your procurement and security review process, your evaluation needs both columns: what the platform provides and what your organization must close independently. Penligent.ai stated it directly in their March 2026 analysis: “What NemoClaw Changes and What It Still Cannot Fix.” Computerworld’s coverage used the same framing: “Nvidia NemoClaw promises to run OpenClaw agents securely” — with the operative word being promises.
We are pro-NemoClaw. We build our enterprise practice on it. That is precisely why we owe our clients and prospects an honest accounting of where the platform ends and where their responsibility begins. Enterprise buyers do not respect vendors who oversell. They respect vendors who tell them exactly what needs to happen before production.
Here are the 4 gaps. Each one is documented, sourced, and paired with the mitigation path enterprise organizations need to follow.
No Published Performance Data
NVIDIA launched NemoClaw without publishing latency benchmarks, throughput measurements, or resource overhead data for any component of the security stack. Not for the OpenShell sandbox. Not for the YAML policy engine. Not for the privacy router.
This is not an oversight. NVIDIA is an infrastructure company that understands benchmarking. The absence of published data tells enterprise buyers that either the numbers do not exist yet (the stack is alpha, after all) or they exist but NVIDIA has chosen not to publish them at this stage. Both explanations create the same problem for your capacity planning team: there is no vendor-supplied baseline.
Published latency or throughput benchmarks for NemoClaw’s security stack: zero.
Why This Matters for Enterprise Deployment
Every layer of NemoClaw’s security architecture adds compute overhead. The kernel-level sandbox intercepts and evaluates system calls before execution. The YAML policy engine runs a 4-level evaluation — binary, destination, method, path — on every agent action. The privacy router adds an inference hop: before a request reaches any model, it must first be classified, potentially PII-stripped, and routed to either local Nemotron or a cloud provider.
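As a concrete shape for the 4-level evaluation, a single policy rule might look like the fragment below. The field names are illustrative, not NemoClaw’s published schema; the point is that each rule must answer all four questions before an action executes.

```yaml
# Illustrative only -- key names are hypothetical, not NemoClaw's actual schema.
policy:
  rules:
    - binary: curl                      # level 1: which executable may act
      destinations:                     # level 2: which hosts it may reach
        - api.internal.example.com
      methods: [GET]                    # level 3: which HTTP methods
      paths:                            # level 4: which URL paths
        - /v1/status
      action: allow
  default: deny
```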
In isolation, each of these operations is likely fast. Combined at production scale — with dozens of agents making hundreds of tool calls per hour — the aggregate overhead is unknown. And unknown is not a word that survives enterprise capacity planning meetings.
If your organization processes 10,000 agent tool calls per day and NemoClaw’s security stack adds 50ms of latency per call, your total overhead is 8.3 minutes of cumulative latency daily. If it adds 500ms per call, that is 83 minutes. The difference matters for SLA commitments, user experience, and infrastructure sizing. Without vendor benchmarks, you are estimating in the dark.
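The arithmetic above is worth keeping on hand during capacity planning. A few lines reproduce it for any volume and per-call overhead (the figures are this article’s hypotheticals, not measured values):

```python
CALLS_PER_DAY = 10_000  # hypothetical production volume from the example above

def daily_overhead_minutes(per_call_ms: float, calls: int = CALLS_PER_DAY) -> float:
    """Cumulative latency the security stack adds per day, in minutes."""
    return calls * per_call_ms / 1000 / 60

daily_overhead_minutes(50)   # ~8.3 minutes of cumulative daily latency
daily_overhead_minutes(500)  # ~83.3 minutes
```

Run it against your own call volume before the SLA conversation, not during it.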
What Enterprises Must Do
Run your own benchmarks before production deployment. This is not optional.
- Establish baseline latency — measure OpenClaw tool call latency without NemoClaw’s security stack as your control measurement.
- Enable each layer independently — measure OpenShell sandbox overhead, then privacy router inference hop latency for both local-routed and cloud-routed requests.
- Run full stack under load — measure p50, p95, and p99 latency at your expected production volume. Document CPU, memory, and GPU utilization at sustained throughput.
- Compare against SLA requirements — validate that the full NemoClaw stack stays within your latency budget under production workload profiles.
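The p50/p95/p99 measurement in the third step can be sketched with the standard library alone. The callable you pass in is your own wrapper that issues one real tool call through whichever layers you are measuring; everything here is a harness sketch, not NemoClaw tooling:

```python
import statistics
import time

def profile(call, samples: int = 200) -> dict:
    """Time `call` repeatedly and report latency percentiles in milliseconds.

    `call` is any zero-argument callable -- e.g. a function that issues one
    agent tool call through the full stack under test (hypothetical here).
    """
    latencies_ms = []
    for _ in range(samples):
        start = time.perf_counter()
        call()
        latencies_ms.append((time.perf_counter() - start) * 1000)
    # n=100 yields 99 cut points; indices 49/94/98 are p50/p95/p99.
    q = statistics.quantiles(latencies_ms, n=100)
    return {"p50": q[49], "p95": q[94], "p99": q[98]}
```

Run it once with the security stack disabled (your control) and once per layer enabled, and the per-layer overhead falls out by subtraction.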
Our Assessment tier includes production benchmarking as a standard deliverable. We measure OpenShell, policy engine, and privacy router overhead against your specific workload profile and infrastructure. The written report includes latency percentiles, resource utilization data, and capacity planning recommendations with projected headroom at 2x and 5x your current volume.
Full Privacy Router Requires NVIDIA Hardware
NemoClaw’s privacy router is the component that enterprise compliance teams find most compelling: sensitive data stays on-premises, processed by local Nemotron models, while non-sensitive requests route to frontier cloud models for maximum capability. The data sovereignty story is strong. The infrastructure requirement is the constraint.
Local Nemotron models require NVIDIA GPU hardware for optimal performance. The privacy router’s core value proposition — keeping PII, PHI, and proprietary data off external APIs — depends on having GPU compute available for local inference. Without it, the router has nowhere local to send sensitive requests.
Starting hardware cost for DGX Spark (GB10 Grace Blackwell Superchip), before NemoClaw deployment: $3,999.
The Hardware Matrix
| Infrastructure | Privacy Router Capability | Enterprise Impact |
|---|---|---|
| NVIDIA DGX Spark / Station | Full local inference with Nemotron models | Complete data sovereignty; PII never leaves network |
| NVIDIA GeForce RTX / RTX PRO | Local inference with model size constraints | Data sovereignty with capability tradeoffs on smaller models |
| Cloud GPU instances (AWS G-series, GCP T4) | Local inference in cloud tenant | Data stays in your cloud account but not on-premises; 10-20x cost premium over standard compute |
| AMD GPU / Apple Silicon | No optimized Nemotron support | Privacy router cannot route locally; all requests go to cloud APIs |
| CPU-only infrastructure | No local inference capability | Privacy router becomes a pass-through; data sovereignty benefit lost |
Constellation Research noted NVIDIA’s DGX Spark and DGX Station pairing strategy at GTC 2026: the desktop Spark unit ($3,999; the Dell Pro Max GB10 with 4 TB storage is $4,756.84) handles development and smaller workloads, while DGX Station provides production-scale local inference. The strategy is coherent from NVIDIA’s perspective — it creates a hardware pipeline from evaluation through production. From the enterprise buyer’s perspective, it means the “free, open-source” NemoClaw stack carries a hardware cost that starts in the thousands and scales to tens of thousands for production-grade local inference.
Organizations deploying NemoClaw without NVIDIA GPU hardware lose the local inference path. Every request — including those containing PII, PHI, or proprietary data — routes to cloud model providers. The privacy router still provides classification and PII stripping, but the data leaves your network perimeter. For HIPAA-covered entities and organizations under EU AI Act data residency requirements, this may not satisfy compliance.
What Enterprises Must Do
Audit your current infrastructure before committing to NemoClaw’s privacy router as a compliance control.
- If you have NVIDIA GPU infrastructure — validate that your GPUs meet Nemotron model requirements (VRAM, compute capability). Plan for the additional GPU utilization that local inference adds to your existing workloads.
- If you are GPU-absent — decide whether the DGX Spark investment ($3,999+) is justified by your data sovereignty requirements, or whether cloud GPU instances provide acceptable compliance posture at a per-hour cost.
- If you run AMD or Apple Silicon — the privacy router’s local path is not available to you today. Evaluate whether PII stripping and cloud-only routing meet your compliance requirements, or whether you need to provision NVIDIA hardware specifically for this workload.
We deploy NemoClaw across all infrastructure profiles — NVIDIA GPU, cloud GPU, and CPU-only. For organizations without local GPU compute, we configure alternative privacy controls: enhanced PII stripping, data classification at the gateway layer, and cloud provider BAA documentation. The goal is compliant deployment on your existing infrastructure, not a hardware mandate.
Not Multi-Tenant
The OpenClaw documentation states it plainly: the platform “assumes one trusted operator boundary per gateway.” NemoClaw inherits this architectural constraint without modification. It is documented. It is intentional. And it creates a governance problem that most enterprise deployments will encounter within the first quarter of production use.
“Assumes one trusted operator boundary per gateway. Not a supported multi-tenant hostile boundary.”
— OpenClaw Documentation, Trust Model
In enterprise environments, AI agent deployment rarely stays within a single team. Engineering deploys first. Then product management, marketing, finance, and legal follow. Each department has different data access requirements, different compliance obligations, and different risk tolerances. The question every CTO faces within 90 days: how do we give 5 departments access to AI agents without giving them access to each other’s data, credentials, and audit trails?
What Single-Tenant Means in Practice
| Enterprise Requirement | Multi-Tenant Platform | NemoClaw (Single-Tenant) |
|---|---|---|
| Department-level policy isolation | Each department gets its own policy domain with separate rules | All departments share a single YAML policy set |
| Separate audit trails | Per-department audit logs with access controls | Single audit log for the entire gateway; no per-department filtering |
| Per-department cost allocation | Usage metering and billing by organizational unit | No built-in cost attribution; all usage aggregated |
| Credential isolation | Each department manages its own API keys and service accounts | Credentials shared at the gateway level |
| Data boundary enforcement | Engineering cannot see legal’s data; legal cannot see HR’s data | All data within the trusted operator boundary is accessible |
The single-operator trust model means anyone within the gateway boundary is trusted. An engineering team’s agent could theoretically access data or credentials intended for finance — not because of a vulnerability, but because the architecture was not designed to prevent it.
Governance Gap
A 200-person company deploys NemoClaw for their engineering team. Three months later, the legal team requests access for contract review automation. Under the single-tenant model, both departments share the same gateway. The legal team’s contracts — containing privileged attorney-client communications — exist within the same trust boundary as the engineering team’s code agents.
The CISO asks: “Can we guarantee that engineering’s agents cannot access legal’s documents?” Under NemoClaw’s documented trust model, the honest answer is no.
What Enterprises Must Do
Enterprise organizations with multi-department AI agent deployments have two options under NemoClaw’s current architecture.
- Deploy separate NemoClaw instances per department. Each department gets its own gateway, its own YAML policy set, its own audit log, and its own trust boundary. This provides genuine isolation but multiplies infrastructure, configuration, and maintenance burden by the number of departments.
- Layer a governance platform on top. Keep a shared NemoClaw infrastructure but add per-department credential management, policy routing, audit log partitioning, and cost attribution at a governance layer above OpenShell. This is the approach that JetPatch’s Enterprise Control Plane for NemoClaw is designed to support.
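The first option, per-department instances, can be sketched as fully separate gateway configurations. The file layout and key names below are hypothetical, not NemoClaw’s actual format; what matters is that nothing is shared — not the policy file, not the audit log, not the credential store.

```yaml
# engineering/gateway.yaml -- one instance per department (illustrative keys)
gateway:
  name: nemoclaw-engineering
  policy_file: engineering/policies.yaml
  audit_log: /var/log/nemoclaw/engineering/audit.jsonl
  credentials_store: vault://teams/engineering
---
# legal/gateway.yaml -- a second, entirely independent trust boundary
gateway:
  name: nemoclaw-legal
  policy_file: legal/policies.yaml
  audit_log: /var/log/nemoclaw/legal/audit.jsonl
  credentials_store: vault://teams/legal
```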
Neither option is free. Both require planning and ongoing management. The worst option is to ignore the constraint and deploy a shared gateway without tenant isolation — because the resulting incident will be a governance failure, and those are harder to explain to a board than technical failures.
Our enterprise deployments use per-team isolation by default. Each department or business unit gets its own NemoClaw instance with dedicated YAML policies, isolated credential stores, separate audit trails, and per-department cost allocation. We manage the infrastructure complexity so your teams get the isolation your CISO requires without the operational burden of maintaining separate stacks.
No Independent Security Audit
NemoClaw shipped as alpha on March 16, 2026. As of this writing, no independent penetration test, security audit, or adversarial evaluation of the OpenShell runtime has been published. The 17 launch partners — Adobe, Atlassian, Cisco, CrowdStrike, Salesforce, SAP, ServiceNow, Red Hat — are integration partners. They committed to building on NemoClaw. They did not publish security assessments of it.
Independent third-party security audits of OpenShell published as of March 2026: zero.
This is not unusual for alpha software. It is also not acceptable for enterprise production deployment without mitigation. Enterprise security teams evaluate vendors by asking one question before all others: who has tested this besides the people who built it?
The Community Is Already Finding Issues
While NVIDIA has not published adversarial testing, the security research community has not waited. Within days of the GTC announcement, independent engineers began probing NemoClaw’s boundaries.
A post on r/LocalLLaMA documented a method for bypassing NemoClaw’s sandbox isolation: “[Project] I bypassed NemoClaw’s sandbox isolation.” The post described running a fully local agent outside OpenShell’s enforcement boundary. Whether this represents a configuration issue, a documentation gap, or an architectural flaw is an open question — but it is a question that should have been answered by vendor-published adversarial testing before launch.
Snyk Labs: Documented Sandbox Escape Vectors
Snyk Labs published the most rigorous independent analysis of NemoClaw’s sandbox boundaries to date. Their findings document two distinct escape vectors that enterprise security teams must evaluate:
Sandbox Escape 1: /tools/invoke Bypasses Workspace Access Controls
Even with workspaceAccess set to “none” or “ro” (read-only), a sandboxed model can use the /tools/invoke endpoint to exfiltrate data from or overwrite files on the host filesystem. The workspace access control that administrators configure in YAML does not enforce boundaries on tool invocations. A malicious or compromised agent can read sensitive host files and write arbitrary content outside the sandbox boundary.
Enterprise impact: Organizations relying on workspaceAccess: "none" as a security control for file isolation have a false sense of security. The setting does not provide the guarantee its name implies.
Sandbox Escape 2: TOCTOU Race in Path Validation
A Time-of-Check to Time-of-Use (TOCTOU) race condition exists in the sandbox’s path validation logic. The sandbox checks whether a file path is within the allowed workspace at validation time, but the actual file access occurs at a later point. Between validation and access, the path can be redirected (via symlinks or mount manipulation) to point outside the workspace. This allows session-level escape from the workspace to the host filesystem.
Enterprise impact: This is a classic operating system security vulnerability class. It affects any deployment where agents process untrusted inputs that could include crafted file paths. The fix requires atomic check-and-access operations at the kernel level — a non-trivial architectural change.
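The vulnerability class is easy to see in miniature. In the sketch below, both helpers and the `WORKSPACE` constant are hypothetical: the first function checks a path string and then re-resolves it at open time (the TOCTOU window); the second narrows the window by opening first and validating the file it actually holds. A complete fix would use Linux’s `openat2(2)` with `RESOLVE_BENEATH` for an atomic, kernel-enforced check.

```python
import os

WORKSPACE = "/srv/agent-workspace"  # hypothetical sandbox root

def unsafe_read(path: str) -> str:
    # TOCTOU-prone: the check and the open are separate steps, so the
    # path can be re-pointed (symlink swap) between them.
    if os.path.realpath(path).startswith(WORKSPACE + os.sep):
        with open(path) as f:  # path is resolved AGAIN here
            return f.read()
    raise PermissionError(path)

def safer_read(path: str) -> str:
    # Narrower window: open first, refusing symlinks, then validate the
    # file we actually hold via its descriptor before reading it.
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    try:
        real = os.path.realpath(f"/proc/self/fd/{fd}")  # Linux-only
        if not real.startswith(WORKSPACE + os.sep):
            raise PermissionError(real)
        return os.read(fd, 1 << 20).decode()
    finally:
        os.close(fd)
```

This is why the fix is architectural: validating a string and acting on a descriptor are different operations, and only the kernel can make them atomic.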
Prompt Injection: Bypassing Tool Controls and Audit Logging
Policy Bypass
Penligent.ai’s analysis documents a critical interaction between prompt injection and NemoClaw’s policy engine: code injected via prompt can use allowed binaries — specifically curl and python3 — to POST data to allowed network endpoints. Because the binaries are on the allowlist and the endpoints are on the network allowlist, the exfiltration bypasses both tool control and audit logging. The policy engine sees a permitted binary making a permitted request — it cannot distinguish between legitimate agent behavior and prompt-injected data exfiltration.
Enterprise impact: Organizations that allowlist curl or python3 with POST access to any endpoint create a data exfiltration channel that audit logs will not flag. Mitigation requires minimizing allowlisted binaries and restricting POST paths to the absolute minimum required endpoints.
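One shape that mitigation can take, assuming a rule schema like the one NVIDIA’s YAML engine evaluates (field names here are illustrative, not the published format): replace general-purpose HTTP clients with a purpose-built binary and confine POST to the one endpoint the workflow actually needs.

```yaml
# Illustrative hardening sketch -- deny by default, no curl or python3,
# POST confined to a single required endpoint and path.
policy:
  rules:
    - binary: internal-report-uploader   # hypothetical purpose-built tool
      destinations: [reports.internal.example.com]
      methods: [POST]
      paths: [/v1/reports]
      action: allow
  default: deny
```

The narrower the allowlist, the more an injected exfiltration attempt has to deviate from permitted behavior — and the more likely it is to surface in audit logs at all.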
Open GitHub Issues: Known Platform Gaps
The NemoClaw GitHub repository contains several open issues that affect enterprise deployment planning. These are not edge cases — they affect common infrastructure configurations:
| Issue | Description | Enterprise Impact |
|---|---|---|
| GitHub #272 | Network policy presets missing binaries restriction — any process can reach allowed endpoints | Policy presets intended as secure starting points are more permissive than expected; presets do not restrict which binaries can access allowed network hosts |
| GitHub #336 | WSL2 cannot reach Windows Ollama instance | Teams running NemoClaw in WSL2 with Ollama on the Windows host for local inference cannot connect; blocks the development workflow on Windows |
| GitHub #385 | Local inference routing fails from inside sandbox | WSL2 sandbox network isolation prevents routing to local inference endpoints; privacy router cannot reach local models |
| GitHub #481 | Discord and Telegram channel connections broken | Agents cannot connect to Discord or Telegram messaging channels; blocks customer support and notification use cases that depend on these platforms |
| HTTP CONNECT proxy | Returns 403 Forbidden on valid CONNECT tunnels in some configurations | Enterprise proxy infrastructure may block NemoClaw’s outbound connections; requires proxy configuration exceptions or alternative routing |
OpenShell is the foundation of NemoClaw’s security story — the kernel-level boundary separating sandboxed agents from the host system. The Snyk Labs findings demonstrate that this boundary has documented gaps — not theoretical vulnerabilities, but reproducible escape vectors. The policy preset gap (GitHub #272) shows that even NVIDIA’s curated policy configurations leave enforcement holes. The only way to discover and close these systematically is through independent penetration testing by professionals whose incentive is to find weaknesses, not to ship product.
What the Absence of Audit Means for Procurement
Enterprise procurement typically requires evidence of independent security testing. SOC 2 Type II audits, penetration test summaries, and vulnerability disclosure programs are standard checklist items. NemoClaw currently provides none of these.
| Procurement Requirement | Typical Vendor Evidence | NemoClaw Status |
|---|---|---|
| Independent penetration test | Annual pen test report from qualified firm | Not available |
| Vulnerability disclosure program | Documented process for reporting and remediating security issues | Open-source issue tracker (GitHub); no formal security advisory process |
| CVE response SLA | Documented timeline for critical/high/medium/low patches | Alpha release cadence; no published SLA |
| SOC 2 Type II | Annual audit of security controls | Not applicable (open-source project, not SaaS vendor) |
| Adversarial testing results | Red team exercise report or bug bounty program history | Not available; community is performing informal testing |
Your security team will flag these gaps during evaluation. The response should not be to dismiss NemoClaw — the security improvements over vanilla OpenClaw are real and documented. The response should be to treat NemoClaw as one layer of defense-in-depth, not as the trust boundary, until independent testing validates the sandbox guarantees.
What Enterprises Must Do
- Commission your own penetration test. Engage a qualified security firm to test the OpenShell sandbox against your specific deployment configuration. Include escape testing, privilege escalation, and policy bypass scenarios.
- Deploy defense-in-depth. Do not rely on OpenShell as the sole security boundary. Layer network segmentation, host-level monitoring (CrowdStrike Falcon AIDR if available), and application-level access controls around and above the sandbox.
- Monitor community findings. The security research community is actively testing NemoClaw. Track r/LocalLLaMA, OpenShell GitHub issues, and security research publications for disclosed vulnerabilities. Build a process for evaluating and remediating community-discovered issues in your deployment.
- Establish your own CVE response SLA. Since NVIDIA does not publish a patching timeline for alpha software, define your own: critical vulnerabilities patched within 24 hours, high within 72 hours, medium within 2 weeks. Staff accordingly.
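An SLA like the one suggested above is worth encoding so ticketing automation can compute patch deadlines mechanically. The tiers below follow the suggested policy; everything else is a sketch:

```python
from datetime import datetime, timedelta

# Hypothetical in-house SLA, per the policy suggested above.
SLA = {
    "critical": timedelta(hours=24),
    "high": timedelta(hours=72),
    "medium": timedelta(weeks=2),
}

def patch_deadline(severity: str, disclosed: datetime) -> datetime:
    """Return the latest acceptable patch time for a disclosed vulnerability."""
    return disclosed + SLA[severity.lower()]
```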
Our Managed Care tier includes continuous security monitoring of your NemoClaw deployment. We track OpenShell GitHub commits, community-reported vulnerabilities, and NVIDIA security advisories. CVE patching follows defined SLAs: critical within 24 hours, moderate within 72 hours. Monthly security reports document your deployment’s compliance status, policy violations, and remediation actions taken.
All 4 Gaps at a Glance
For CTOs and CISOs presenting NemoClaw evaluation findings to their leadership team, this is the summary table. Every gap is documented. Every mitigation is actionable.
| Gap | Risk to Enterprise | Mitigation Path |
|---|---|---|
| No published performance data | Capacity planning, SLA compliance, and infrastructure sizing based on guesswork | Run your own benchmarks; measure OpenShell, policy engine, and privacy router overhead independently |
| Full privacy router requires NVIDIA hardware | Data sovereignty benefit lost without GPU; $3,999+ hardware cost for entry-level local inference | Audit infrastructure; evaluate DGX Spark ROI; configure alternative privacy controls for non-GPU environments |
| Not multi-tenant | Departments share trust boundary; no per-team policy isolation, audit trails, or cost allocation | Deploy separate instances per department or layer governance platform above NemoClaw |
| No independent security audit | Sandbox guarantees unverified; community already finding boundary issues; procurement blocker | Commission pen test; deploy defense-in-depth; establish your own CVE response SLA |
Why Transparency About Limitations Matters More Than Marketing
These 4 gaps are not criticisms of NemoClaw. They are documented limitations of an alpha-stage platform that NVIDIA has been transparent about. The OpenClaw documentation explicitly states the single-operator trust boundary. NVIDIA has never claimed that published benchmarks or independent security audits exist. The hardware requirement is documented in the system requirements.
The problem is not that these limitations exist. The problem is that enterprise buyers may not discover them until after deployment begins — when the capacity planning team asks for latency benchmarks that do not exist, or the CISO asks for a pen test report, or the second department requests access and the single-tenant boundary becomes visible. This analysis surfaces those gaps before contract signature.
“OpenShell Redraws the Agent Control Plane, Enforces Governance.”
— Futurum Group, analyst positioning of NemoClaw, March 2026
The Futurum Group’s framing captures both what NemoClaw achieves and where it stops. It redraws the control plane — the sandbox, the policy engine, the privacy router are genuine advances. But enforcement and governance are not the same thing. Governance includes data classification above the security layer, lifecycle management, cross-platform policy coordination, and organizational accountability structures that no sandbox can provide.
Kiteworks stated the gap directly: “Jensen defined the imperative but left the hardest part unsolved.” The hardest part is data governance — not the security enforcement that NemoClaw provides, but the organizational framework that determines what data is classified how, who decides what agents can access what, and how those decisions are audited and revised over time. NemoClaw enforces boundaries. It does not define them. The governance layer that sits above the security layer — data classification policies, access approval workflows, cross-departmental coordination — must be built or sourced independently.
“NemoClaw changes what the security wrapper can enforce, but it still cannot fix the underlying trust chain that enterprise governance requires.”
— Penligent.ai, “What NemoClaw Changes and What It Still Cannot Fix,” March 2026
The enterprise organizations that deploy AI agents most successfully are the ones that enter production with eyes open. They know what their platform provides. They know what it does not. They have mitigation plans for every gap in their security review checklist. And they have the engineering capacity — in-house or through a partner — to close those gaps before the first agent touches production data.
Frequently Asked Questions
Will NVIDIA close these gaps when NemoClaw reaches general availability?
Likely some, but not all. Performance benchmarks will almost certainly be published as the platform matures. An independent security audit may follow for GA readiness. Multi-tenancy and hardware-agnostic privacy routing are architectural decisions, not maturity issues — they would require changes to OpenClaw’s trust model and NemoClaw’s inference architecture. Plan your governance stack based on what exists today, not on roadmap expectations.
Can we deploy NemoClaw now and close these gaps incrementally?
Yes, and this is the approach we recommend. Deploy NemoClaw as the isolation layer. Run your own benchmarks during assessment. Deploy separate instances for multi-department isolation. Commission a pen test. Layer governance and monitoring on top. Organizations building governance now will be production-ready when GA ships. Those who wait will be 6-12 months behind.
How does the DGX Spark hardware cost factor into NemoClaw TCO?
The DGX Spark starts at $3,999 (the Dell Pro Max GB10 with 4 TB storage is $4,756.84). For production-grade local inference at enterprise scale, DGX Station pricing is significantly higher. Add engineering time for YAML policy configuration (2-6 weeks at specialist rates), ongoing management, and compliance documentation — and the “free, open-source” label requires context. Our NemoClaw vs. OpenClaw analysis includes a full TCO breakdown.
Should the sandbox bypass on r/LocalLLaMA concern our security team?
It should inform your posture, not prevent deployment. Community testing is a healthy signal for open-source security software. Treat the sandbox as defense-in-depth (one layer among several), not as the sole trust boundary. Commission your own pen test, deploy host-level monitoring, and maintain network segmentation. If a boundary issue is confirmed, your defense stack has additional layers.
What OWASP ASI categories are affected by these gaps?
Gap 2 (hardware requirement) directly impacts ASI10 (Inadequate Data Protection) — without local inference, data sovereignty controls are weakened. Gap 3 (single-tenant) affects ASI01 (Excessive Agency) in multi-department contexts. Gap 4 (no security audit) affects ASI06 (Inadequate Sandboxing) — sandbox guarantees are unverified. A complete OWASP mapping is available in our architecture deep dive.
Our Assessment includes production benchmarking, OWASP ASI01-ASI10 gap analysis, multi-tenant architecture planning, and a written remediation roadmap with executive briefing. Start with the gaps documented — leave with a plan to close them.
Schedule Architecture Review