Ethical questioning

The use of ethnic statistics is prohibited in France. Since this MOOC is hosted by INRIA, perhaps removing the ‘Race’ feature from the course dataset would be an option.

This data was lawfully collected and published by the US administration, and I don’t think any French law prohibits studying such public data. It’s just that collecting such data in France would not be lawful in the first place.

Using such a feature to build a predictive model could raise ethical problems, depending on how the model is trained and on how, and for what purpose, the model is used. We could probably improve the MOOC by mentioning the possible ethical considerations of machine learning models and giving references to external resources.

But hiding or removing a feature is not necessarily the best way to address the ethical problems of machine learning. It is actually often useful to keep such features in order to assess whether the deployment of a given model could be detrimental to a group of people by causing harms (for instance allocation harms or quality-of-service harms); see: Fairness in Machine Learning — Fairlearn 0.6.2 documentation
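As a concrete illustration of that last point, here is a minimal sketch (with made-up data and group labels) of how one might compare a metric such as accuracy across groups defined by a sensitive feature, to flag a possible quality-of-service harm. The Fairlearn library cited above provides this kind of disaggregated analysis out of the box (e.g. its `MetricFrame`); the pure-Python version below is just for illustration:

```python
# Sketch: per-group accuracy as a simple quality-of-service check.
# All data below is made up for illustration only.
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, sensitive):
    """Accuracy computed separately for each value of the sensitive feature."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for yt, yp, g in zip(y_true, y_pred, sensitive):
        total[g] += 1
        correct[g] += int(yt == yp)
    return {g: correct[g] / total[g] for g in total}

# Hypothetical labels, predictions, and group membership ("A" / "B").
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 1, 1, 0]
group  = ["A", "A", "A", "A", "B", "B", "B", "B"]

per_group = accuracy_by_group(y_true, y_pred, group)
# A large gap between groups would suggest the model serves one group worse.
gap = max(per_group.values()) - min(per_group.values())
print(per_group, gap)  # {'A': 1.0, 'B': 0.75} 0.25
```

Note that this kind of audit is only possible if the sensitive feature is available at evaluation time, which is precisely the argument for not discarding it from the dataset outright.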
