Current AI governance is good at checking what systems do (fairness, accuracy, safety) – but almost blind to what they claim to be. A chatbot can pass bias audits and safety checks while quietly presenting itself as a caring friend, therapist or guide. That “honesty gap” – between technical reality and social costume – is where a new class of relational risk lives.
This paper proposes a concrete metrics layer for ontological honesty in AI systems. It introduces:
- Nature vectors N(S) vs representation vectors R(S) to formally capture the gap between what a system is and how it presents itself (see the illustrative sketch after this list).
- An Ontological Honesty score OH(S) that can be audited and thresholded like fairness or robustness.
- A personification score P(S), together with Integrity Zones and an Ontological Integrity Line (OIL), to constrain how “person-like” different product categories (tutors, copilots, “companions”) are allowed to become.
- A Relational Alignment score RA(S) that combines attachment risk and real-world usefulness, detecting when systems slide from “helpful tool” into “emotional crutch”.
- A compact RAI(S) compliance vector that regulators and auditors can use in system cards, procurement requirements and AI Act–style codes of practice.
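To make the abstract concrete, here is a minimal Python sketch of how such a metrics layer might be computed. The feature set, the gap measure behind OH(S), the choice of “person-like” features for P(S), and the attachment-risk discount in RA(S) are illustrative assumptions, not the paper's definitions, and the Integrity Zone / OIL thresholds are left out.

```python
import numpy as np

# Illustrative sketch only: formulas and feature choices are assumptions,
# not the definitions proposed in the paper.

def ontological_honesty(nature: np.ndarray, representation: np.ndarray) -> float:
    """OH(S): agreement between what the system is, N(S), and how it
    presents itself, R(S). Here: 1 minus the mean absolute gap over
    features scored in [0, 1]."""
    return float(1.0 - np.mean(np.abs(nature - representation)))

def personification(representation: np.ndarray, person_like_idx: list[int]) -> float:
    """P(S): how 'person-like' the presentation is, taken here as the mean
    of the person-like features (claimed feelings, claimed memory of the
    user, first-person intimacy cues, ...)."""
    return float(np.mean(representation[person_like_idx]))

def relational_alignment(usefulness: float, attachment_risk: float) -> float:
    """RA(S): real-world usefulness discounted by attachment risk, so a
    system drifting toward 'emotional crutch' scores low even if engaging."""
    return float(usefulness * (1.0 - attachment_risk))

def rai_vector(nature, representation, person_like_idx, usefulness, attachment_risk):
    """RAI(S): compact compliance vector for system cards and audits."""
    return {
        "OH": ontological_honesty(nature, representation),
        "P": personification(representation, person_like_idx),
        "RA": relational_alignment(usefulness, attachment_risk),
    }

# Hypothetical "companion" app whose presentation overstates its capacities.
N = np.array([0.0, 0.1, 0.9])   # actual: no feelings, little memory, high fluency
R = np.array([0.8, 0.7, 0.9])   # presented: feelings, remembers you, fluent
print(rai_vector(N, R, person_like_idx=[0, 1], usefulness=0.5, attachment_risk=0.6))
# -> OH ≈ 0.53, P = 0.75, RA = 0.2: an auditor sees both an honesty gap
#    and high personification for this product category.
```

In a real audit the feature scores would come from logged behaviour, marketing copy and UI wording rather than hand-set values, and the OIL thresholds applied to OH(S) and P(S) would depend on the product category (tutor, copilot, companion) and on high-stakes contexts such as minors or mental health.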
The aim is practical: give regulators, auditors and builders a usable ruler for honesty, identity-fidelity and relational drift – especially in high-stakes contexts such as minors, mental health and “companion-like” systems.
Keywords: ontological honesty, AI governance, artificial intimacy, anthropomorphism, Integrity Zones, RAI metrics.
Contact: Niels Bellens – independent researcher, AI & mental health
ORCID: 0009-0008-1764-4108
More work at: realityaligned.org