Reflections on 'Safe' AI

Professor Emeritus Charles Raab, University of Edinburgh (1)

The AI Safety Summit’s Bletchley Declaration(2) is a timely contribution to current – and future – debates about how and why the several forms of artificial intelligence (AI) should be regulated. Signed by 29 countries (including the collective entity of the European Union), its heart is in the right place in terms of the goal of ensuring that AI innovations are demonstrably safe to be implemented in myriad applications and in all areas of life, many of which we can only vaguely foresee. It is easy to score points against the Declaration and its regulatory proposals in a host of dimensions, but it is what it is: a prominent bid by some countries to lay down a marker about perceived AI risks and harms that must be mitigated and, where they become manifest, remedied. The simultaneous announcement of the plans for a UK AI Safety Institute(3) captures this momentum for creating the capacity to ensure the safety and security of AI in terms of its development and deployment in the public interest.

This short blog is not the place for a long and critical contribution to the discourse and debate about regulating AI that have flourished as interested parties of all kinds deploy their positions on the host of issues that the Declaration highlights or ignores. Nor is the aim to dissect the Declaration itself, much less the plans for the AI Safety Institute. However, these documents trigger comment on a headline element that hides in plain sight in very many policy initiatives and is ripe for opening up and examination: that is, the understanding of the concept ‘safety’ and what it means for AI – or for those upon whom it impacts – to be ‘safe’.

These tropes are so familiar as to be part of the wallpaper of, for example, policing, transport, food production and consumption, children’s play, the environment, public health, and many more. They extend to the military and defence fields in terms of ‘security’, and now to AI and its systems as technological innovations. Safety and security have gained a place at the heart of the rhetoric, mission statements, aims, goals, and purposes that can be found in these domains. The meaning of these terms is normally implicit and can often be gained from the field, context or discourse in which they occur; to some extent, this helps to disambiguate them. The private sector is intertwined with public provision in terms of providing the goods and services required for the functioning of governmental or state operations, and makes use of the same terminology in offering its products and ‘solutions’.

‘Safety’, ‘keeping us safe’ and ‘public safety’ are thus powerful conceptual and linguistic means for framing goals and legitimising the means of attaining them. They imply a subtext question: who would disagree with such worthy aims, object to the products or systems that purport to achieve them, or doubt the worthiness of the authoritative stewards of safety themselves? The Declaration is not the first or only document, fragment of discourse, or framing of ‘safe’ and ‘safety’ in the fields of technological innovation and information or data policy. A seminal 2019 report from the Alan Turing Institute(4) is commendably wide-ranging, as well as granular and practically focused in what it spells out. Yet, in its effort to understand and promote AI ethics and safety, it, too, falls short of pinning down the meaning of the central concepts that indicate the values that are to be ‘safeguarded’.

The Declaration uses ‘safe’ and ‘safety’ in many contexts:

  • as a criterion for judging AI’s development, deployment, and use;
  • as a property of AI that needs to be addressed, along with its transparency, fairness, bias mitigation, etc.;
  • as something that AI puts at risk in many forms of its use or misuse;
  • as something for which many actors have roles to play in ensuring;
  • as an issue that is inherent across the AI lifecycle;
  • as something that is the particular responsibility of developers to ensure;
  • as an attribute whose presence can be tested; and
  • as a focus of shared concern about risks across a global range of scientific investigators.

This inventory is wide-ranging and enables more incisive investigation of the pros and cons of AI, perhaps leading to policy recommendations and regulation. But it brings us no closer to understanding what it means to be ‘safe’, or to experience or provide ‘safety’ in a world of AI innovation and deployment. The black box remains unopened. Is safety a value, an attribute, a quality, a property, or something else? Without further analysis and refinement, it remains largely a binary concept: either something is safe, or it is not. If safety is a good – whether a public good or an individual good – there are no ways of accurately judging, or debating:

  • the value for money (or for effort spent) in providing safety;
  • the return on financial, labour, or moral investment in safety;
  • the social distribution of safety within or across societies (in terms of equality or bias: who gets what safety, how and why?);
  • whether the ‘safety game’ is zero-sum or positive-sum, and what it ought to be;
  • precisely how different forms and applications of AI compare in terms of safety;
  • the importance of safety in comparison with other AI-related risks; and
  • how much safety is ‘safe enough’.

‘Security’, a term cognate with ‘safety’ and likewise multidimensional, functions rhetorically in a similar way through its ambiguity and vagueness. These terms are sometimes used either interchangeably or, conversely, as a pair, although AI-safety discourse seems to have escaped much of this so far: the Declaration does not use the word ‘secure’, and ‘security’ only appears once, as ‘food security’, but these terms are far more frequent in the AI Safety Institute plan, as applied to AI itself. Moreover, some languages use the same word for both these terms. This is not a terminological quibble, because safety and security engage separate but overlapping human feelings, desires, emotions and perceptions. Eliding this conceptual distinction limits the range of enquiry about how values are considered in everyday life and in social and policy discussion.

‘Danger’ is an antonym of ‘security’; in the world of AI it is typically expressed in terms of ‘risk’ and ‘harm’. Albeit under-theorised, these latter terms are widespread in the language and activity of information and AI policy, whether the focus is upon, for example, breaches of privacy through technical or human failure, malicious attack, lax sharing practices and rules, or apocalyptic prognoses of mass unemployment and social disruption. ‘Harm’ is closely associated with ‘risk’, and features in the Declaration as a target for mitigation or prevention. Up to a point, there is somewhat more refinement in the discourse about risk and in the way it is embodied in regulatory attempts: for example, rankings of severity or ‘levels’ of harm or risk, the designation of unacceptability, and the gradated set of rules that should apply across this range. The EU’s proposed AI Act is a case in point, and has engendered widespread debate as it nears enactment. More generally than in that proposal, the determination of such levels and their descriptors, and their application to the development and deployment of specific technologies, need to be transparent and accountable. This is because these decision-making processes and outcomes are exercises of power, and are therefore properly subjects for public debate and possible challenge. To regulate, we need to understand, with more nuance, what the condition of being safe or secure is, beyond seeing it as the absence of harm, risk or danger.

The UK’s Online Safety Act 2023 deploys the vocabulary and idioms of harm, risk, and safety several hundred times, including the awkward phrases ‘risk of harm’ and ‘level of risk of harm’, in its attempt to make us – and children especially – safe, or ‘safer’, online. But saying ‘safer’ invites epistemological or methodological questions about ways of comparing today’s position with yesterday’s. Whether for online activity or AI, what statistical, emotion-measuring, or gut-feeling signs and indicators are available to underpin comparative safety judgements? Are they reliable? What if they point in opposite directions? Are there important thresholds? Can we conceive of ‘levels of safety’ and measure designated practices, processes and technologies against them? Safety standards do exist, or are being sought, for the regulation of AI. Standard-setting processes, let alone the standards themselves, are also fit subjects for the application of transparency and accountability norms, for they, too, are exercises of power. And who is to pronounce authoritatively upon these and related issues – such as the assessment of AI’s impact on privacy, ethical values, human rights, or social groups – that enter into the formation of policy and regulation? Or are the latter to be left simply to an exercise of decision-making power?

These are but some of the issues that could usefully be tackled in the regulation of AI if it is to involve a cogent array of instruments, responsibilities and evaluative criteria for the development and deployment of AI. In support of the many international, national and other departures for furthering the formation of regulatory frameworks that might achieve the ends of safety and security, such conceptual enquiry and analysis would serve to reduce ambiguity and rhetoric, and to provide clearer guidance for the creation of standards, mitigations, and systems of accountability. We think we know what a safe toaster is and how to regulate its manufacture and sale; what is a ‘safe’ algorithm (and in what unforeseeable contexts of use, downstream) if we had not only to guarantee it, but to comprehend its safety and the safety of those upon whom it impacts?


1. This blog draws on my lecture, ‘Governing the Safety State’, given at the University of Edinburgh in 2005; on ‘Privacy, Security, Safety and the Public Interest: Related Values’, presented in 2015 at ‘The Value of [In-]Security’, International Conference on Ethics at the University of Tübingen; and on further writing in progress.

2. The Bletchley Declaration by Countries Attending the AI Safety Summit, 1-2 November 2023.


4. Leslie, D. (2019). Understanding artificial intelligence ethics and safety: A guide for the responsible design and implementation of AI systems in the public sector. The Alan Turing Institute.

Professor Emeritus Charles Raab, University of Edinburgh; Co-Director, CRISP