Overview
Published by the NCSC, CISA, and 20+ international partners, this document provides a secure-by-default framework for anyone building or operating AI systems.
Shared Responsibility Model
Users
- Responsible for the inputs provided to the AI system and the outputs they act on
- Any decision made based on AI output remains the sole responsibility of the human in the loop
Providers
- Responsible for the security of the AI model itself
- Must implement secure defaults, disclose risks, and protect users further down the supply chain
The 4 Pillars of Secure AI Development
1. Secure Design
- Model threats early and raise staff awareness — e.g. train developers in secure coding and include AI-specific risks as part of standard InfoSec training
- Design for security alongside functionality and performance — e.g. apply least privilege principles to limit what parts of the system each user or component can access
- Evaluate supply chain risks before choosing external components or APIs — e.g. when using a third-party model, conduct due diligence on that provider's own security posture before integrating
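
The least-privilege point above can be illustrated with a deny-by-default permission check. This is a minimal sketch, not something prescribed by the guidelines: the role names, the `Action` values, and the `is_allowed` helper are all hypothetical and would be replaced by whatever access-control system is actually in use.

```python
from enum import Enum, auto


class Action(Enum):
    QUERY_MODEL = auto()
    READ_TRAINING_DATA = auto()
    UPDATE_WEIGHTS = auto()


# Hypothetical roles: each one is granted only the actions it needs.
PERMISSIONS = {
    "end_user": frozenset({Action.QUERY_MODEL}),
    "data_engineer": frozenset({Action.QUERY_MODEL, Action.READ_TRAINING_DATA}),
    "release_pipeline": frozenset({Action.UPDATE_WEIGHTS}),
}


def is_allowed(role: str, action: Action) -> bool:
    """Deny by default: unknown roles and unlisted actions are refused."""
    return action in PERMISSIONS.get(role, frozenset())
```

Looking up an unknown role returns an empty set, so a new component gets no access until someone grants it explicitly.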
2. Secure Development
- Secure your supply chain and vet third-party components — e.g. be ready to fail over to an alternative solution for mission-critical systems if a supplier cannot meet your security standards
- Identify, track and protect all AI-related assets — e.g. treat logs as sensitive data and implement controls to protect their confidentiality, integrity and availability
- Document models, datasets and prompts thoroughly — e.g. use model cards and software bills of materials (SBOMs) to record training data sources, guardrails, and known failure modes
- Manage technical debt proactively — e.g. include decommissioning plans in your lifecycle strategy to avoid inheriting unresolved security risks in future systems
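
As a rough illustration of the documentation point above, a model card can be kept as a small structured record and serialised alongside the model. The field names and example values here are assumptions chosen for illustration; they do not follow any particular model card or SBOM standard.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ModelCard:
    """Illustrative record; the fields are assumed, not taken from a standard."""

    name: str
    version: str
    training_data_sources: list[str]
    guardrails: list[str] = field(default_factory=list)
    known_failure_modes: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        # Sorted keys keep diffs between versions easy to review.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Hypothetical example values.
card = ModelCard(
    name="example-classifier",
    version="1.2.0",
    training_data_sources=["internal-tickets-2023 (hypothetical)"],
    guardrails=["output length limit"],
    known_failure_modes=["degrades on non-English input"],
)
```

Storing the output of `card.to_json()` in version control next to the model gives reviewers a single place to see what changed between releases.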
3. Secure Deployment
- Apply strong infrastructure access controls — e.g. segregate environments that hold sensitive code or data from general development environments
- Protect models from direct and indirect attacks continuously — e.g. compute and share cryptographic hashes of model weights as soon as training is complete so downstream systems can verify integrity
- Develop and rehearse incident management procedures — e.g. store critical digital resources in offline backups and train responders specifically on AI-related incident scenarios
- Release AI only after red teaming and benchmarking — e.g. subject models to adversarial testing before release and clearly communicate known limitations or failure modes to users
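
The integrity check on model weights mentioned above might look like the following sketch, which uses SHA-256 from the Python standard library. The function names and the idea of passing a file path are assumptions for illustration; the guidelines do not specify a hash algorithm or file layout.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_weights(path: Path, expected_hex: str) -> bool:
    """Return True only if the file's digest matches the published one."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())
```

A digest only proves integrity if the expected value reaches the downstream system through a channel the attacker cannot also modify, so it is usually published separately from the weights themselves.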
4. Secure Operation and Maintenance
- Monitor system behaviour and inputs for anomalies — e.g. explicitly detect out-of-distribution or adversarial inputs, such as manipulated image crops designed to exploit preprocessing steps
- Apply automated, secure-by-default update processes — e.g. treat major model or data updates like new software versions, with their own testing and evaluation regimes
- Share lessons learned with the broader security community — e.g. publish vulnerability bulletins and provide consent for security researchers to responsibly disclose findings about your system
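
For the monitoring point above, one very simple way to flag unusual inputs is a z-score check on a single numeric feature, such as input length. This is a deliberately crude sketch: the `ZScoreMonitor` class, the choice of feature, and the threshold of 3.0 are assumptions, and a check like this would not catch carefully crafted adversarial inputs on its own.

```python
import statistics
from dataclasses import dataclass


@dataclass
class ZScoreMonitor:
    """Flags values that sit far from a reference distribution."""

    mean: float
    stdev: float
    threshold: float = 3.0

    @classmethod
    def fit(cls, reference: list[float], threshold: float = 3.0) -> "ZScoreMonitor":
        # Needs at least two reference values to compute a sample deviation.
        return cls(statistics.fmean(reference), statistics.stdev(reference), threshold)

    def is_anomalous(self, value: float) -> bool:
        if self.stdev == 0:
            # Every reference value was identical, so any difference is unusual.
            return value != self.mean
        return abs(value - self.mean) / self.stdev > self.threshold
```

In use, the monitor would be fitted on a feature measured over known-good traffic, and flagged inputs would be logged for review rather than silently dropped.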