Bias in AI

Hi All, 

I would be very much interested in any exchange of best practices/ideas/initiatives on addressing bias in AI systems (in particular gender bias). 

P.S. Great initiative by the Commission.

Thank You!

Tags
AI Bias

Comments

Posted by Vladimiros Pei… on Mon, 18/06/2018 - 12:17

I believe that, much like a human organization, an AI organization could be divided into two or more basic operational models. For example, a significant difference in the capabilities of different AIs could create the illusion of separate species, exactly the way we differentiate male and female beings based on significant external and operational differences, even while acknowledging that both are human.

Posted by Anonymous (not verified) on Mon, 18/06/2018 - 15:30

User account was deleted

In reply to Anonymous (not verified)

Posted by Emmanouil PATAVOS on Mon, 18/06/2018 - 16:22

Yes, that would be great - my schedule is more flexible next month, if that works for you?

Posted by Eleftherios Ch… on Mon, 18/06/2018 - 15:40

Hi Emmanouil,

Take a look at the recent work of the EU Fundamental Rights Agency on this topic, namely "#BigData: Discrimination in data-supported decision making (May 2018)". It suggests four potential ways of minimising the risk of discrimination when using big data for automated decision making.

In reply to Eleftherios Ch…

Posted by Pawel RZESZUCINSKI on Mon, 06/08/2018 - 14:37

Great report. Thanks.

Posted by Pawel RZESZUCINSKI on Wed, 27/06/2018 - 23:27

Hi Emmanouil,

I'm slightly puzzled by the context of your question, and judging by the different responses provided, I'm not the only one.

Nevertheless, from a strictly data-scientific point of view - you want your data to be as bias-free as possible (not to be confused with model bias, as in the bias/variance trade-off). If the data to be modelled are biased and you attempt to perform some sort of multi-category classification, the algorithm may end up learning to detect only the dominant category, rather than truly determining the correct one. Not all estimators suffer from this sensitivity issue (decision-tree-based techniques do pretty well on imbalanced data), but in principle you should start by considering three options: downsampling the dominant category, i.e. removing some samples of the most common class from the dataset; upsampling the underrepresented classes, making sure you don't alter the true nature of the real data; or mixing the two approaches, typically upsampling followed by downsampling.
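The down/upsampling idea above can be sketched in a few lines of NumPy. This is a minimal illustration on a toy dataset (the class sizes, the `downsample`/`upsample` helper names, and the target count of 30 per class are all made up for the example); real work would more likely use a library utility such as scikit-learn's `resample`:

```python
import numpy as np

rng = np.random.default_rng(0)

def downsample(X, y, label, n):
    """Keep only n randomly chosen samples of the given (dominant) class."""
    idx = np.where(y == label)[0]
    keep = rng.choice(idx, size=n, replace=False)
    mask = np.ones(len(y), dtype=bool)
    mask[idx] = False          # drop all samples of this class...
    mask[keep] = True          # ...then restore the n we keep
    return X[mask], y[mask]

def upsample(X, y, label, n):
    """Duplicate random samples of an underrepresented class up to n total."""
    idx = np.where(y == label)[0]
    extra = rng.choice(idx, size=n - len(idx), replace=True)
    return np.vstack([X, X[extra]]), np.concatenate([y, y[extra]])

# toy imbalanced dataset: 90 samples of class 0, 10 of class 1
X = rng.normal(size=(100, 3))
y = np.array([0] * 90 + [1] * 10)

X_d, y_d = downsample(X, y, label=0, n=30)    # 30 of class 0, 10 of class 1
X_u, y_u = upsample(X_d, y_d, label=1, n=30)  # 30 of each
print(np.bincount(y_u))  # [30 30]
```

Note the order follows the comment above: upsampling the rare class is combined here with downsampling the dominant one, rather than inflating the rare class all the way to the original majority size.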

Not sure this is the angle you were aiming at, but if so, I hope this helps guide you in the right direction.

Regards,

Paweł

Posted by Jarosław Dukat on Sun, 01/07/2018 - 21:53

Hi,

This is a very specific case example, but I'm sending it because it is not clear from your question whether you are looking for generic info or for specific techniques and algorithms.

Check this one - the video about Debiasing word embeddings may give you some insights.

https://www.coursera.org/learn/nlp-sequence-models/home/week/2
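The core "neutralize" step behind debiasing word embeddings (the technique the video covers) can be sketched as follows: project a word vector onto a gender direction and subtract that component. The 4-d vectors below are toy values invented for illustration, not real embeddings:

```python
import numpy as np

def neutralize(word_vec, bias_axis):
    """Remove the component of word_vec along the (normalized) bias direction."""
    g = bias_axis / np.linalg.norm(bias_axis)
    return word_vec - np.dot(word_vec, g) * g

# toy 4-d "embeddings" (real ones would come from e.g. GloVe or word2vec)
he = np.array([1.0, 0.2, 0.0, 0.1])
she = np.array([-1.0, 0.2, 0.0, 0.1])
gender_axis = he - she  # approximate gender direction

receptionist = np.array([-0.5, 0.3, 0.4, 0.2])  # vector leaning toward "she"
debiased = neutralize(receptionist, gender_axis)

# after neutralizing, the projection on the gender axis is zero
print(np.dot(debiased, gender_axis / np.linalg.norm(gender_axis)))  # 0.0
```

For gender-definitional pairs like he/she, the full method also includes an equalization step so both words sit symmetrically around the neutral subspace.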


Cheers,

Jarek