From Paper Compliance to Runtime Enforcement: What Deploying AI in European Public Sector Operations Taught Us About the EU AI Act

Grigorios Tsinaforniotis
CEO, PROTOS AI Agency IKE | Registered EU Expert (EX2025D1322797) | EU AI Pact Signatory
---
The Gap Between Compliance Documentation and Operational Reality
There is a growing body of excellent policy analysis on the EU AI Act — risk classifications, transparency obligations, documentation requirements. What is less discussed is what happens when you actually try to make an AI system compliant while deploying it in a real European public sector environment, under real operational pressure, with real citizens affected by the outputs.
Over the past three years, our team has deployed AI systems in Greek government agencies: speech-to-text for law enforcement (98.9% accuracy across 100+ languages, reducing transcription time from 6 hours to 5 minutes per hour of audio) and legal document analysis for judicial research (3.5 million documents from 15+ authoritative EU and national sources). We recently submitted a proposal for the European Commission's DSA and AI Act multilingual chatbot (DG CNECT), and we are currently coordinating two consortia: CyberSentinel, a Digital Europe consortium developing AI-powered cybersecurity tools for European SOCs and Cyber Hubs, and MUSEION, a Horizon Europe consortium building autonomous AI agents for cross-jurisdictional legal reasoning.
These are not research prototypes. They run in production, processing real data, producing outputs that real people act on. And it is precisely in production that the EU AI Act's requirements reveal their most important — and most underexamined — dimension: the enforcement gap between what is documented and what actually executes at runtime.
Four Architectural Lessons from Real Deployments
1. Compliance Must Be an Architecture Decision, Not a Documentation Exercise
When we built our legal AI platform NOMOKRATIA.AI for Greek law, we initially treated EU AI Act compliance as a documentation task — risk assessments, transparency notices, data governance policies. All necessary. All insufficient.
The system processes 3.5 million legal documents. It serves legal professionals who make decisions based on its outputs. The moment we deployed it, we realised that compliance documentation describes intended behaviour. What matters is actual behaviour — in production, under load, with edge cases the documentation never anticipated.
Our architectural response was to embed compliance as a runtime constraint: every response must cite a verifiable source document with paragraph-level traceability. If the system cannot ground a claim in a specific legal text, it must say so explicitly rather than generate a plausible-sounding answer. Human oversight is not a policy statement but an interface design decision. Data governance is enforced at the pipeline level, not described in a policy document.
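To make the pattern concrete, here is a minimal sketch of a "no source, no answer" constraint enforced at the inference layer. It is illustrative only: the Passage and GroundedAnswer types, the retriever and generator interfaces, and the refusal wording are assumptions for the example, not NOMOKRATIA.AI's actual code.

```python
from dataclasses import dataclass, field

@dataclass
class Passage:
    document_id: str   # identifier of the source legal text
    paragraph_id: str  # paragraph-level locator for traceability
    text: str          # the passage itself

@dataclass
class GroundedAnswer:
    text: str
    citations: list = field(default_factory=list)  # Passages backing the answer

def answer_query(query, retriever, generator) -> GroundedAnswer:
    """Enforce "no source, no answer" at the inference layer.

    `retriever.search` is assumed to return Passage objects; `generator.generate`
    is assumed to return a draft with `.text` and `.cited_paragraphs` fields.
    """
    passages = retriever.search(query, top_k=5)
    if not passages:
        # Refusal path 1: nothing retrieved, so the system declines to answer.
        return GroundedAnswer(text="No supporting legal text was found for this question.")
    draft = generator.generate(query, context=passages)
    # Keep only the passages the generator explicitly marked as support.
    citations = [p for p in passages if p.paragraph_id in draft.cited_paragraphs]
    if not citations:
        # Refusal path 2: a fluent draft with no traceable grounding is rejected.
        return GroundedAnswer(
            text="The system cannot ground an answer in a specific legal text "
                 "for this question; human review is recommended."
        )
    return GroundedAnswer(text=draft.text, citations=citations)
```

The point of the sketch is that both refusal paths are unconditional code paths, not a policy a reviewer checks after the fact.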
This is not a novel insight in software engineering. But in the EU AI Act compliance discussion, there remains a significant gap between the "what" (documentation, risk assessment, transparency) and the "how" (runtime enforcement, architectural constraints, operational monitoring).
2. Compliance Tooling Must Be Accessible, Not Reserved for Enterprises
This experience led us to build DOKIMASIA.AI — an EU AI Act compliance platform that embodies the compliance-by-architecture principle we advocate. The name comes from ancient Athens: the dokimasia (δοκιμασία) was the official fitness examination every officeholder had to pass before assuming office. Today, DOKIMASIA.AI examines AI systems.
The platform ingests the full EU AI Act text (all 113 Articles and 13 Annexes) into a legal knowledge base and provides three core capabilities: automated risk classification against Annex III categories, AI-powered generation of Annex IV technical documentation and Fundamental Rights Impact Assessments pre-filled for the user's specific system, and a legal advisor grounded in the actual regulatory text rather than training data approximations. Available in 24 EU languages.
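For a sense of what the first capability involves, here is a deliberately naive sketch of screening a system description against the eight Annex III areas. This is not DOKIMASIA.AI's classification logic: the keyword lists are invented, and real classification must also evaluate the Annex III carve-outs (for example, narrow procedural tasks) that no keyword match can see.

```python
# Illustrative only: a simplified first-pass screen against the eight
# Annex III areas. A match is a flag for deeper legal review, never a
# final risk classification.
ANNEX_III_AREAS = {
    "biometrics": ["biometric", "face recognition", "emotion recognition"],
    "critical_infrastructure": ["energy grid", "water supply", "road traffic"],
    "education": ["admission", "exam scoring", "student assessment"],
    "employment": ["recruitment", "cv screening", "promotion decision"],
    "essential_services": ["credit scoring", "benefits eligibility", "insurance pricing"],
    "law_enforcement": ["police", "crime", "evidence analysis"],
    "migration_border": ["visa", "asylum", "border control"],
    "justice_democracy": ["judicial", "court", "election"],
}

def screen_annex_iii(system_description: str) -> list[str]:
    """Return the Annex III areas whose trigger terms appear in the description."""
    text = system_description.lower()
    return [
        area
        for area, triggers in ANNEX_III_AREAS.items()
        if any(term in text for term in triggers)
    ]

# A transcription tool used by investigators flags the law-enforcement area
# and is escalated to legal review of the carve-outs, not given a verdict.
print(screen_annex_iii("Speech-to-text transcription of police interview recordings"))
```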
We built DOKIMASIA.AI because we observed a structural gap in the compliance tooling market: enterprise platforms like Credo AI and Holistic AI serve Fortune 500 companies at five-figure annual contracts. The EU's own compliance checker is free but rudimentary. Between these extremes — where European SMEs, public administrations, and startups actually operate — there was effectively nothing. Yet these are precisely the organisations that need the most guidance: they lack dedicated AI compliance teams, they operate under tight budgets, and the August 2026 deadline for high-risk systems applies to them just as much as to large enterprises.
DOKIMASIA.AI is operational today. It is not a concept in a proposal — it is a running platform that any organisation in Europe can use to assess whether and how the AI Act applies to them, and to generate the documentation they need.
3. Multilingual Deployment Exposes Safety Asymmetries That Monolingual Testing Cannot Detect
El Bairi and Faruna's recent contribution on this platform — demonstrating that LLM safety guardrails degrade from 100% to 30-45% effectiveness when switching from English to low-resource languages — resonates directly with our operational experience.
Our speech-to-text system operates across 100+ languages in a law enforcement context. During deployment, we observed that transcription accuracy varied significantly across languages — not because the underlying model was incapable, but because the confidence thresholds, post-processing rules, and quality checks that worked well for high-resource languages produced unreliable results for others.
The EU AI Act's robustness requirement under Article 15 demands "consistent performance throughout the AI system's lifecycle." But "consistent" across what dimensions? Our experience suggests that cross-lingual consistency must be treated as a first-class robustness requirement, not an afterthought. For any AI system deployed in the European context — 24 official languages, 27 legal traditions — this is not an edge case. It is the default operating condition.
Our practical solution: language-specific confidence thresholds with explicit fallback to human review when confidence drops below validated levels. The system does not silently degrade — it tells the operator "I am not confident in this transcription" and routes to human verification. This is a design pattern, not a policy. And it is the kind of pattern that Article 15 compliance should actively encourage.
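The routing logic itself is small enough to show in full. A minimal sketch, with invented threshold values; in production each threshold is validated per language against held-out test data:

```python
# Conservative default for languages without a validated threshold:
# unknown languages fail safe into human review.
DEFAULT_THRESHOLD = 0.95

VALIDATED_THRESHOLDS = {
    "en": 0.85,  # high-resource language, threshold validated on test sets
    "de": 0.86,
    "el": 0.90,  # fewer validation samples, so a stricter threshold
}

def route_transcription(segment_text: str, confidence: float, language: str) -> dict:
    """Route a transcription segment to auto-accept or explicit human review.

    The system never silently degrades: below the validated threshold it
    returns a low-confidence flag so the operator sees the uncertainty
    instead of an unreliable transcript.
    """
    threshold = VALIDATED_THRESHOLDS.get(language, DEFAULT_THRESHOLD)
    if confidence >= threshold:
        return {"status": "accepted", "text": segment_text}
    return {
        "status": "human_review",
        "text": segment_text,
        "reason": f"confidence {confidence:.2f} below {language} threshold {threshold:.2f}",
    }
```

The important property is the asymmetric default: a language without a validated threshold inherits the strict one, so newly added languages fall back to human review rather than silently shipping low-quality transcripts.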
4. Protecting AI Systems Is as Important as Regulating Them
The EU AI Act focuses primarily on what AI systems do to people. Rightly so. But in critical infrastructure environments — SOCs, Cyber Hubs, NIS2-regulated entities — there is an equally urgent question: what can adversaries do to AI systems?
Through our CyberSentinel work with eight partners across five EU Member States, we are confronting this directly. SOC analysts increasingly rely on AI for threat detection, anomaly analysis, and incident response. But these AI systems themselves are attack surfaces: prompt injection, model manipulation, adversarial inputs designed to make the AI miss a threat or generate a false positive.
The EU AI Act does not yet adequately address the security of AI systems in adversarial environments. Article 15 mentions cybersecurity, but primarily in terms of resilience against "attempts to alter the use or performance of the AI system by malicious third parties." In practice, protecting an AI system deployed in a SOC requires runtime monitoring of AI model behaviour, isolation of AI inference from user-manipulable inputs, and continuous validation that the deployed model matches the assessed model.
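The last of those three controls can be sketched in a few lines: refuse to serve any model whose artifact differs from the one that passed assessment. The registry format, model name, and truncated hash below are illustrative assumptions.

```python
import hashlib
from pathlib import Path

# Hashes recorded at assessment time, kept outside the deployment
# environment (for example, in a signed registry).
ASSESSED_MODELS = {
    "threat-detector-v3": "9f2c1a...",  # truncated, illustrative value
}

def sha256_of(path: Path) -> str:
    """Stream the artifact in 1 MiB chunks and return its SHA-256 digest."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_model(name: str, artifact: Path) -> None:
    """Block inference for any model that does not match the assessed artifact."""
    expected = ASSESSED_MODELS.get(name)
    actual = sha256_of(artifact)
    if expected is None or actual != expected:
        raise RuntimeError(
            f"Model '{name}' does not match the assessed artifact "
            f"(expected {expected}, got {actual}); refusing to serve."
        )
```

Run at load time and on a schedule, this turns "the deployed model matches the assessed model" from an audit claim into a check that blocks execution when it fails.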
Das' contribution on this platform about "execution-time architectures" for governance enforcement points in the right direction. The architectural insight — that policy defines what is permitted, but only runtime enforcement ensures that only what is permitted can execute — applies directly to AI security in critical infrastructure. Governance and security converge at the execution layer.
What This Means for European AI Policy
These are not theoretical observations. They emerge from building, deploying, and operating AI systems in European public sector environments where outputs have real consequences for real people.
Three concrete recommendations for policymakers and the AI Office:
First, the EU AI Act's compliance framework should explicitly recognise and encourage compliance-by-architecture approaches alongside documentation-based compliance. A system that architecturally prevents ungrounded outputs (by requiring source citation at the inference level) provides stronger guarantees than a system that documents a policy against hallucination but enforces it only through post-hoc review. Compliance tooling should be accessible to SMEs and public administrations — not only available as enterprise products at prohibitive cost.
Second, cross-lingual robustness testing should be a standard component of conformity assessment for any AI system deployed in multilingual European contexts. The current framework allows a system to be assessed primarily in one language and deployed across all 24. This is a structural gap.
Third, for AI systems deployed in critical infrastructure and cybersecurity contexts, the AI Act's security provisions should be developed in closer coordination with NIS2 and the Cyber Resilience Act. The AI Firewall concept — runtime protection of AI systems against adversarial manipulation — is not yet reflected in any EU regulatory framework, but it is operationally necessary today.
Conclusion
The EU AI Act is the right framework. Its principles — risk-based regulation, transparency, human oversight, robustness — are sound. But principles implemented only as documentation will not protect European citizens or institutions. Compliance must be architectural. Safety must be operational. And the organisations deploying AI in European public services need practical tools and design patterns — not just legal checklists — to meet the Act's ambitions.
Europe's regulatory leadership in AI is globally recognised. The next step is ensuring that this leadership translates into systems that are not just documented as compliant, but built to be compliant — by design, at runtime, across all 24 languages and 27 legal traditions.
---
Grigorios Tsinaforniotis is CEO of PROTOS AI Agency IKE (Thessaloniki, Greece), EU AI Pact Signatory, Registered EU Expert (EX2025D1322797), and EU AI Alliance Member. PROTOS develops regulatory AI systems for the European public sector, operates DOKIMASIA.AI (EU AI Act compliance platform) and NOMOKRATIA.AI (Greek legal AI), and coordinates two EU-funded consortia: CyberSentinel (Digital Europe, AI cybersecurity) and MUSEION (Horizon Europe, autonomous legal AI agents). The views expressed are the author's own, based on operational deployment experience.
Tags: AI regulation, Trustworthy AI, EU AI Act, compliance-by-design, multilingual AI, cybersecurity


Comments

Posted by Mototsugu Shiraki, Fri, 27/03/2026 - 12:30

Excellent and very grounded perspective.
What you describe is not just a compliance gap — it is a structural gap between intended behavior and executed behavior.

In our work, we describe this gap through four elements:
Concept (what the system is supposed to do),
Intent (why it is supposed to do it),
Boundary (what it must not violate), and
Rationale (why a specific output is justified).

Documentation captures Concept and Intent,
but failures occur because Boundary and Rationale are not enforced at runtime.

This is exactly the shift from paper compliance to runtime enforcement.

Posted by Grigorios Tsinaforniotis, Wed, 01/04/2026 - 15:28

Thank you, Mototsugu. Your four-element framework — Concept, Intent, Boundary, Rationale — captures precisely what I was trying to articulate. Documentation covers the first two; runtime enforcement must cover the latter two. That is a very clean decomposition.

In our experience with DOKIMASIA.AI, we see this play out daily: organisations can articulate what their AI system is supposed to do (Concept) and why (Intent), but struggle to define enforceable Boundaries and to ensure that every output carries a machine-verifiable Rationale. The result is what you rightly call a structural gap.

I would be interested to learn more about your work on this. Are you applying this framework in a specific regulatory or deployment context?

Posted by Emre Öcal, Fri, 10/04/2026 - 14:54

Strong post. This is one of the most important gaps in the current discussion: compliance documentation can describe intent, but it does not guarantee runtime behaviour.

What stands out to me is your insistence that compliance has to become an architectural and operational property of the system itself. That is especially true in public sector and multilingual European deployments, where edge cases, language asymmetries, and real-world pressure expose the distance between paper compliance and actual execution.

We see the same pattern at lextrace: teams do not just need legal interpretation, they need compliance translated into enforceable delivery decisions, evidence flows, human review paths, and re-check triggers when reality changes. Otherwise "compliant by design" stays too abstract.

Also fully agree that this cannot remain a large-enterprise privilege. SMEs and public administrations need practical, usable tooling and design patterns now, not after the deadline.

More of this operator-level perspective is needed in the EU AI Pact conversation.