Request to the AI HLEG regarding concerns about considering algorithmic repeatability as an ethical requirement for Trustworthy AI

In my previous post “algorithmic repeatability is a requirement for a Trustworthy AI” I posed the question of whether analog computing could be compatible with the ethical requirement of algorithmic repeatability. After the contributions of Jose Pedro Manzano, Nicolas Beaume and David WROTH, I can restate the question in a more systematic way, with a clear definition of the requirement and a better description of the issues it raises.


The ethical motivation: If a decision open to ethical controversy has been taken with the support of an AI system, it should be possible to investigate all the circumstances that led the system to deliver the result used to inform that decision.


Definition of algorithmic repeatability: The capability of reproducing all the operations carried out by an information-processing system during any process of interest.


Issue machine learning: AI systems involving some kind of machine learning evolve over time thanks to the new data they consume. In that case the system must record which version of the evolving model was used for each process, in order to restore it and exactly reproduce the operations when necessary.
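As a minimal sketch of such record-keeping (all names here are hypothetical, not taken from any existing framework), each decision could be logged against a content hash of the exact model state that produced it:

```python
import hashlib
import pickle

def snapshot_model(model_state, store, audit_log, request_id):
    """Record the exact model state used for one decision, keyed by request."""
    blob = pickle.dumps(model_state)
    version_id = hashlib.sha256(blob).hexdigest()
    store.setdefault(version_id, blob)   # keep one copy per distinct version
    audit_log.append({"request": request_id, "model_version": version_id})
    return version_id

def restore_model(version_id, store):
    """Later, rebuild the same model state for forensic replay."""
    return pickle.loads(store[version_id])
```

Because identical states hash to the same identifier, a model that serves many requests between updates is stored only once, which keeps the storage cost proportional to the number of model versions rather than the number of decisions.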


Issue random functions: Some AI algorithms call random value generation functions. These systems must record the result of each random call and the execution step at which it occurs. In this case the algorithm cannot be strictly repeated, only simulated: a specific replay capability would have to be implemented as an integral part of the AI software in order to exactly reproduce the operations performed during a process. Although random function calls represent only a small fraction of the operations performed, a normal process could still generate a large amount of extra data to be saved.
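One way to sketch that replay capability (a simplified illustration, not a reference to any real library): wrap the random source so that a live run logs every draw, while a forensic run returns the logged values in order:

```python
import random

class RecordingRandom:
    """Random source that logs every draw in 'record' mode and
    returns the logged values verbatim in 'replay' mode."""

    def __init__(self, log=None, seed=None):
        self.replaying = log is not None
        self.log = log if self.replaying else []
        self._rng = random.Random(seed)
        self._step = 0

    def random(self):
        if self.replaying:
            value = self.log[self._step]   # replay the recorded draw
            self._step += 1
        else:
            value = self._rng.random()
            self.log.append(value)         # record for later simulation
        return value
```

Note that for a pseudo-random generator, saving the seed alone would suffice to reproduce every draw; logging each value individually only becomes necessary when the randomness comes from a hardware entropy source, which is where the extra-data concern above really bites.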


Issue analog hardware: Analog hardware is meant to be a relevant alternative to the currently dominant digital hardware in AI computing systems. Analog hardware can perform most of the operations involved in AI algorithms using less energy, less time and a simpler architecture. But analog operations are not accurate in the strict way digital operations are. AI systems totally or partially executed on analog hardware could represent a serious challenge to algorithmic repeatability. As with the issue caused by random functions, here the algorithm cannot be strictly repeated, only simulated. To achieve such a simulation the implementation faces two problems. On the one hand, there is the problem of saving an enormous volume of information representing all the operations performed by the analog hardware. On the other hand, the saving of that information cannot be implemented at the software level but must be implemented at the hardware level. Although not physically impossible, implementing algorithmic repeatability for AI systems executed on analog hardware could severely compromise the advantages of this kind of hardware. Fortunately, quantum computing will apparently surpass the expected advantages of analog hardware in the coming decades while offering digital accuracy.


Request to the AI HLEG: Please be so kind as to consider the points above in order to clearly define the rules and recommendations for the implementation and fulfilment of the requirement of algorithmic repeatability in AI systems.

Tags: AI Ethics, AI HLEG

Comments

Submitted by Barry O'SULLIVAN on Sun, 04/07/2019 - 11:20

Dear Juan - thank you for your contribution and for raising these important issues, all of which are critical. I posted a response to the original thread a few moments ago, which I'm also posting here. If there are any further questions or comments please let me know and I would be very happy to address them. Apologies for my tardy response.

In case you're not aware, the European Commission will be releasing the Ethics Guidelines for Trustworthy AI on Tuesday of this week, which we have been working on for the last several months.

Kindest regards,

Barry O'Sullivan

Vice Chair of the High-Level Expert Group on AI

 

I have been watching this thread with some interest but, unfortunately, didn't get an opportunity before now to add a comment.

This week the Ethics Guidelines for Trustworthy AI will be published (Tuesday, in the context of Digital Day 2019). In there you will find that we have defined a (non-exhaustive) set of requirements for achieving trustworthy AI. Amongst these is "Technical Robustness and Safety" which includes accuracy, reliability, and reproducibility. Your use of "repeatability" is, I believe, what we mean by reproducibility. Trustworthy AI systems must be able to produce the same results under the same conditions, and we need methods that support that.

Of course, I believe Juan is also raising another very technical aspect of this which relates to the accuracy of operations over floating point representations on classic hardware, and how this creates challenges. Indeed it does, and there is much work at the moment on dealing with such calculations which can have associated accumulating error. It is a very important technical issue.
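The accumulating error Barry mentions is easy to demonstrate with IEEE 754 doubles; the snippet below is only an illustration of the phenomenon, not of any particular AI system:

```python
import math

# IEEE 754 addition is not associative: the same mathematical sum can
# differ in the last bits depending on evaluation order.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False

# Naive accumulation drifts; math.fsum tracks exact partial sums and
# returns the correctly rounded result.
naive = sum([0.1] * 10)
compensated = math.fsum([0.1] * 10)
print(naive == 1.0)        # False
print(compensated == 1.0)  # True
```

This is why two runs of the "same" computation can diverge when an optimizing compiler, a parallel reduction, or different hardware changes the order of operations.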

Thank you all for contributing to the AI Alliance.

 

In reply to Barry O'SULLIVAN

Submitted by Stephane Pralet on Wed, 04/17/2019 - 10:50

Dear Barry,

I think your last point about the accuracy of floating-point operations is critical: we cannot ask AI systems to be repeatable in a stricter sense than the IEEE standard. Perhaps a way to overcome this difficulty and avoid such tie-breaking effects would be to ensure that AI systems produce compatible/comparable decisions when noise (rounding errors) is applied.

A simple example to make this concrete. Consider an AI vision system in the context of autonomous driving. If it detects an individual crossing the road because its DNN output is greater than a threshold (TH) and decides to slow down, we must be sure that under some “small” variations of TH it will not simply ignore the person crossing the road, but will at least take a mitigating action such as sending an alert message to the people in the car or honking.
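Stephane's safeguard could be sketched as follows; the policy, margin and numbers are invented purely for illustration:

```python
def decision(score, threshold, margin=0.05):
    """Hypothetical policy: above the threshold, brake; in a small band
    just below it, still warn rather than silently ignore the detection."""
    if score >= threshold:
        return "brake"
    if score >= threshold - margin:
        return "warn"   # mitigating action near the decision boundary
    return "ignore"

def safe_under_threshold_noise(score, threshold, eps=0.01):
    """If the nominal system would brake, then under any small threshold
    perturbation the system must still at least warn, never ignore."""
    if decision(score, threshold) != "brake":
        return True  # nominal system did not act; nothing to check
    return all(decision(score, threshold + d) != "ignore"
               for d in (-eps, eps))
```

The warning band acts as hysteresis: a perturbation of the threshold by less than the margin can downgrade "brake" to "warn", but never flip it directly to "ignore".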

Submitted by Juergen Gross on Thu, 04/11/2019 - 15:00

Dear Juan,

I would like to make it just a bit more complicated:

Looking at algorithms that use backpropagation, it will most likely be impossible to record the evolution as you are demanding under your machine learning point.

Since each and every time something is "learned" the connections are reinforced/adjusted/changed, it would not be possible to revisit the "before" state, which also implies that

a. identification of error

b. correction of error

is close to impossible.

I do not know how that problem might be solved, but I consider it one of the most critical:

if you cannot identify and then rectify an error, you would have to either write off your investment in this solution or accept the error and live with it, with of course uncertain consequences, to put it kindly.

In the Assessment Pilot Version there is mention of traceability and logging, but that does not quite address this issue.

Does anyone have an idea how to address the backpropagation issue?

In reply to Juergen Gross

Submitted by Juan Andrés Hu… on Fri, 04/12/2019 - 10:09

Thanks Juergen.
Be it backpropagation or any other type of ML, if we are able to save the whole status of the software and its configuration before it is called to process a specific dataset, then we will be able to reproduce it later regardless of the internal changes it undergoes during the process, won't we? The goal is to recover that previous status, and I think it is perfectly possible.
Another matter is the need to save that status, which is feasible but could be very cumbersome.
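A toy sketch of that state-saving idea (hypothetical names, deliberately minimal): checkpoint the weights before every learning update, so the "before" state Juergen mentions can always be restored:

```python
import copy

class CheckpointedNetwork:
    """Toy network that snapshots its weights before every update,
    so any earlier state can be restored for error analysis."""

    def __init__(self, weights):
        self.weights = weights
        self.history = []            # one checkpoint per update

    def update(self, gradients, lr=0.1):
        self.history.append(copy.deepcopy(self.weights))
        for name, grad in gradients.items():
            self.weights[name] -= lr * grad   # simple gradient step

    def restore(self, step):
        """Return the weights as they were before update number `step`."""
        return self.history[step]
```

The cumbersome part Juan concedes is visible here: `history` grows by one full copy of the weights per update, so a real system would need delta encoding or periodic checkpoints rather than a full copy every time.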

Submitted by Leo Kärkkäinen on Thu, 04/18/2019 - 14:25

Dear Juan,

In a forensic analysis of even an analog AI solution, one can always run the system multiple times to obtain predictable statistics of its behaviour for a given input. In fact, the expected error bars define repeatability trust in a consistent way - as for any measurement system, of which AI is an example (e.g. detecting cancer from an X-ray).
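Leo's statistical approach can be illustrated with a toy stand-in for a noisy analog device (the function and all numbers below are invented for the example):

```python
import random
import statistics

def analog_like_system(x, rng, noise=0.01):
    """Stand-in for analog inference: a deterministic computation plus
    a small hardware-noise term on every run."""
    return 0.7 * x + rng.gauss(0.0, noise)

def forensic_statistics(x, runs=1000, seed=0):
    """Re-run the system many times on the same input to characterise
    its behaviour: the mean output and an error bar (std deviation)."""
    rng = random.Random(seed)
    outputs = [analog_like_system(x, rng) for _ in range(runs)]
    return statistics.mean(outputs), statistics.stdev(outputs)
```

In this view, repeatability is not bit-exactness but a stable distribution: the system is trustworthy if repeated runs on the same input stay within the declared error bars.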

Also, let us not forget that all IT systems are analog if you go deep enough, and have bit error rates. So this is not a question of principle, but a question of practical error tolerance.

In real life, the concern is about how changes in the (originally) analog input data affect the results of the system. Can one in practice recreate a scene, take images and still get the same result from the AI? The ever-present "insignificant" variation of conditions in the input data should not make a "significant" change in the result. Actually, this is exactly the thing one trains for in neural networks: which features matter and which do not.

Both analog and digital systems are vulnerable to bad training; there is no fundamental difference between the two in that respect.

Best Regards,

Leo Kärkkäinen

 

In reply to Leo Kärkkäinen

Submitted by Juan Andrés Hu… on Thu, 04/18/2019 - 18:45

Thanks Leo.

 

I think the forensic point of view you highlight is extremely important. I agree with your reasoning but have doubts about the premise you use.

Can an AI decision-support system be considered a measurement system with an error tolerance? I am not suggesting that this view is incorrect but, will this simplification be accepted in a trial? How can we define (and communicate) the acceptable tolerance for an AI system?

If that could be accepted by engineers, users and authorities, then algorithmic repeatability would not be a real problem, or at least it would be defined statistically and not in a strict way.

If the expected result of an AI system is a probability (e.g. detecting cancer from an X-ray), we should accept that statistical approach.

Nevertheless, will we renounce strict algorithmic repeatability for investigative purposes when a controversy arises?

 

Very interesting contribution, thanks again Leo.