Ethics in AI (Artificial Intelligence) appears to be an area of focus for the EDPS and EDPB today, as this and the next few years may prove decisive for whether our societies suffer a lasting negative impact from AI tools. (Think of social media and the upcoming elections, but also the overall influence of "Big Tech" algorithms on everyday life.)
I saw arguments being made at the European IAPP conference and in drafts that "ethics-by-design" is somehow embedded in "privacy-by-design", and that AI could be controlled by skillful application of GDPR.
AI enters our daily lives in different shapes. We find it embedded in finished end-consumer products (e.g. smart home surveillance cameras or software products) or, in the corporate world, as part of Software-as-a-Service offerings or local software.
Some companies develop their own AI tools and integrate them into their own products and services. Many others (probably the majority) buy AI components from various vendors and then integrate them into something new.
AIs have interesting features:
- AI implicitly holds personal data - even if no personal data is seemingly present. In a way, AI dormantly embodies categorizations, views, and decisions on data subjects without having interacted with them beforehand.
- AI components can be bundled.
- A good AI might also be used to train another, inferior AI.
Overall, there can be quite some distance between the original developer of an AI, the integrators, and the final manufacturer of a product or service provider - and none of these might actually be affected by GDPR directly.
The primary obligations under GDPR are with the data controller - who could also be the end-user of an AI-enabled product, e.g. a gadget or toy.
In many situations, it is unreasonable to expect that the data controller would be able to enforce privacy/ethics-by-design for an AI product/service in a practical way.
- Very often there won't be any "golden records" to verify that the AI is fair and unbiased.
- In many cases, the controller wouldn't even know if and how many AI components are in the underlying product or service - and would have no clue about their pedigree.
- The practical influence of the controller on the vendors will be small - due to the controller's limited understanding of the product/service, as well as the market dynamics associated with disruptive AI-based solutions.
In my mind, regulators must address this unfair position of the controller under GDPR by
- introducing an obligation for vendors to declare any AI in their products and services - accompanied by adequate information to allow for privacy-compliant usage
- requiring a pedigree to be provided for any AI component - including evidence of the component being fair and unbiased (see the illustrative sketch after this list)
- establishing a licensing scheme for certain high-risk types of AI - maybe similar to the existing CE mark for medical devices
- emphasizing the importance of privacy seals for AI-based products and services
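To make the first two points a little more concrete, here is a purely hypothetical sketch (as a simple Python data structure) of the kind of declaration and pedigree information a vendor could ship with an AI component; every field name is invented for illustration and does not refer to any existing standard or legal requirement:

```python
# Hypothetical vendor-supplied declaration for one AI component.
# All field names are illustrative; no existing standard is implied.
ai_component_declaration = {
    "component": "face-matching module, v2.3",
    "developer": "original AI vendor (may differ from the product manufacturer)",
    "intended_purpose": "access control in a consumer smart-home product",
    "personal_data": {
        # the 'dormant' personal data embodied in the model, as discussed above
        "categories_in_training_data": ["facial images", "demographic labels"],
        "retained_in_model": True,
    },
    "pedigree": {
        "training_data_sources": ["licensed dataset X", "vendor-collected set Y"],
        "upstream_components": ["embedding model bought from vendor Z"],
        "fairness_evidence": "per-group error rates in the accompanying test report",
    },
    "controller_guidance": "information the controller needs for a DPIA and privacy-compliant use",
}
```

Something along these lines, handed down the chain of integrators, would at least give the controller a fighting chance of knowing what AI is inside the product and where it came from.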
Tags: GDPR, AI, controller, processor
Comments
I believe that Ethics and Privacy are two different aspects, with different degrees of maturity in the regulatory framework, and should be tackled differently.
GDPR already determines the way training sets (the data used to train ML/AI algorithms) are managed. Companies that don't comply with GDPR are subject to its conditions and penalties, regardless of how they use the information they collect and store.
In my opinion, Ethics and AI is a greenfield (and a minefield). I break it into two broad categories:
- Ethical problems in the way the algorithm has been created (are its results correct/fair?)
- Ethical problems in the usage of the algorithm (what do we use it for?)
The simplest case in the first category is algorithm bias, which can and must be detected by the engineers creating the model. They can and they must, because if they don't, their algorithm won't perform well. But the quality of these bias-detection processes will depend directly on the representativeness of both the training and cross-validation data sets, as well as on good randomization. How can this be controlled, in absolute terms (can it actually be done?) and in practical terms (such processes are subject to strong intellectual property protection and are kept secret)?
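As a minimal sketch of what such a check could look like in practice, assume a pandas DataFrame `df` with a binary `label` column and a sensitive attribute column `group`; the column names and the per-group accuracy comparison are illustrative assumptions, not a standard procedure:

```python
# Illustrative sketch only: verify that a sensitive attribute is represented
# in similar proportions in the training and validation splits, and compare
# model accuracy per group as a coarse bias signal.
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

def split_with_representativeness_check(df, group="group"):
    # Stratifying on the sensitive attribute keeps group proportions similar
    # across the splits; random_state makes the randomization reproducible.
    train, valid = train_test_split(df, test_size=0.2,
                                    stratify=df[group], random_state=42)
    print("train group shares:\n", train[group].value_counts(normalize=True))
    print("valid group shares:\n", valid[group].value_counts(normalize=True))
    return train, valid

def per_group_accuracy(model, valid, features, label="label", group="group"):
    # Large accuracy gaps between groups are one (crude) indicator of bias.
    preds = model.predict(valid[features])
    return (valid.assign(correct=(preds == valid[label].values))
                 .groupby(group)["correct"].mean())

# Hypothetical usage:
# features = ["age", "income"]
# train, valid = split_with_representativeness_check(df)
# model = LogisticRegression().fit(train[features], train["label"])
# print(per_group_accuracy(model, valid, features))
```

Even a simple check like this only works if the engineers have access to the sensitive attribute and a representative sample in the first place, which is exactly the point of the question above.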
If we get into unsupervised learning, then this representativeness of the training data just cannot be guaranteed. As an example, let's remember what happened with Microsoft's AI bot, which was exposed to a live Twitter stream and became a racist bigot.
The second category is discussed on a piecemeal basis as new AI applications see the light of day. An example could be Amazon's recent patent for a door lock that scans the faces of passers-by, compares them to police databases of "most wanted" people and triggers a police alert if there is a match. Leaving aside for a moment the right to privacy, the thought of "false positives" and the implications such a system would have for the life of an unfortunate passer-by wrongly matched against a criminal is blood-chilling.
http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2F…
Food for thought.