Stanford CS224N NLP with Deep Learning | Winter 2021 | Lecture 16 - Social & Ethical Considerations

0:00:00 - 0:00:13     Text: Hello, everyone. Welcome back to CS224N. And today I'm delighted to introduce our final guest speaker,

0:00:13 - 0:00:20     Text: Yulia Tsvetkov. So Yulia is currently a professor at Carnegie Mellon University, but actually starting

0:00:20 - 0:00:25     Text: from next year, she's going to be a professor at the University of Washington, as you can already

0:00:25 - 0:00:31     Text: see updated in her email address. Yulia's research focuses on extending the capabilities of human

0:00:31 - 0:00:38     Text: language technology beyond individual cultures and across language boundaries. So lots of work that

0:00:38 - 0:00:44     Text: considers the roles of human beings in different multilingual situations. And today she's going to be

0:00:44 - 0:00:52     Text: giving a talk to us on social and ethical considerations in NLP systems. Just one more note on the

0:00:52 - 0:00:58     Text: way things are going to run. So Yulia has some interactive exercises. So what we're going to do is,

0:00:58 - 0:01:06     Text: for the interactive exercises, you'll be asked to put something into the Zoom comments. So that's

0:01:06 - 0:01:12     Text: our, sorry, the Zoom chats. That means really using the chat. And you might want to say who the two

0:01:12 - 0:01:17     Text: to the chat is to, I think it's by default panelists, which is okay or to Julia. So goes to

0:01:17 - 0:01:23     Text: her panelists is good, but probably not all attendees. So that'll be a bit overwhelming. And then if

0:01:23 - 0:01:31     Text: you have questions, put them in the Q&A as usual, because that'll keep the two streams separate. And

0:01:31 - 0:01:37     Text: as for our other invited lectures, if you've got some questions that you'd like to ask her

0:01:37 - 0:01:44     Text: at the end, stay on the line and raise your hand and we can promote people to be panelists

0:01:44 - 0:01:49     Text: and have a chat with you. Okay, so now without further ado, I'm delighted to hand over to you, Yulia.

0:01:49 - 0:01:55     Text: Thank you very much, Chris. I'm very excited to speak to you all today even though, unfortunately,

0:01:55 - 0:02:05     Text: I cannot see you. But I'm excited. And so this lecture is structured as follows. We'll have

0:02:05 - 0:02:11     Text: three parts. The first part will be primarily a discussion in which I will ask questions. It's

0:02:11 - 0:02:18     Text: supposed to be interactive, but I realize we are very limited in ways we can interact now.

0:02:18 - 0:02:25     Text: So this is when, if you want, please put your responses in the chat window. And I will answer my own

0:02:25 - 0:02:31     Text: questions, also following your responses, and maybe read some of your responses. So this will be the

0:02:31 - 0:02:36     Text: first part. And the goal of this part is to provide you with some practical tools for when you have a new

0:02:36 - 0:02:44     Text: problem to work on in AI in your field. How would you assess this problem in terms of how

0:02:44 - 0:02:48     Text: ethical it is to solve it? What kind of biases it may incorporate and so on.

0:02:49 - 0:02:54     Text: So in the second part, I will try to generalize and give an overview of the

0:02:56 - 0:03:02     Text: overall topics at the intersection of ethics and NLP, because it's actually a very big field.

0:03:02 - 0:03:07     Text: And what I will talk about today is just a motivational lecture, but there is a lot of

0:03:07 - 0:03:14     Text: interesting technical content and a lot of subfields in this field. And I will dive a little deeper

0:03:14 - 0:03:22     Text: in one topic in this field, specifically focusing on algorithmic bias. And if time is left,

0:03:22 - 0:03:28     Text: which I'm not sure about, I will talk about one or two projects in my lab. So specific research

0:03:28 - 0:03:35     Text: projects, but if we don't have time to cover them, then you can always read the papers. So the

0:03:35 - 0:03:40     Text: first two parts are more important for the purposes of this lecture. So let's start.

0:03:42 - 0:03:47     Text: This is, as far as I understand, a course on deep learning and natural language processing.

0:03:47 - 0:03:53     Text: So you've probably covered various deep learning architectures and their applications to various

0:03:53 - 0:03:58     Text: NLP tasks, like machine translation, dialogue systems, question answering, and there is an obvious

0:03:58 - 0:04:06     Text: question: what does it all have to do with ethics? What do syntactic parsing or part-of-speech

0:04:06 - 0:04:17     Text: tagging have to do with ethics? And the answer, which I want to suggest with this quote, is a

0:04:17 - 0:04:24     Text: simple answer: the common misconception is that language has to do with words, but it doesn't.

0:04:24 - 0:04:32     Text: It has to do with people. So every word, every sentence that we produce: language is produced

0:04:32 - 0:04:38     Text: by people. It is directed towards other people and everything that is related to language necessarily

0:04:38 - 0:04:46     Text: involves people. And it has social meaning and incorporates human biases. And this is why also

0:04:46 - 0:04:53     Text: models that we build, which will be used by other people, may incorporate social biases.

0:04:55 - 0:05:01     Text: So this is why decisions that we make about our data, and the kinds of considerations that we incorporate

0:05:02 - 0:05:09     Text: into our models, may have a direct impact on people, and maybe on societies.

0:05:09 - 0:05:18     Text: And to start this lecture, we need to start by understanding what ethics is. So what is ethics?

0:05:18 - 0:05:24     Text: Here is a definition from a textbook on ethics. Ethics is the study of what are good and bad ends to

0:05:24 - 0:05:31     Text: pursue in life, and what is right and wrong to do in the conduct of life. So it is a practical

0:05:31 - 0:05:39     Text: discipline. And the primary goal is to determine how one ought to live and what actions one ought to do

0:05:39 - 0:05:47     Text: in the conduct of one's life. So to summarize, it is very practical and it's simple. It's just

0:05:47 - 0:05:55     Text: doing the good things and doing the right things. Then my question to you is: how simple is it to

0:05:55 - 0:06:04     Text: define what is good and what is right? So let's start the discussion by diving into various

0:06:04 - 0:06:11     Text: problems. And we start with a boring theoretical problem, which everybody knows about, which is the

0:06:11 - 0:06:17     Text: trolley dilemma. And we won't spend too much time on it; I'm sure all of you know about

0:06:17 - 0:06:27     Text: it. So it's a classical problem in ethics. This is you, standing near the lever.

0:06:27 - 0:06:32     Text: And here is a trolley coming and there are several people. So the trolley cannot see the people

0:06:32 - 0:06:38     Text: and the people cannot see the trolley. And you are the only one in control, in charge.

0:06:39 - 0:06:45     Text: You can save people, and maybe you need to make decisions about people's lives. You might ask yourself,

0:06:45 - 0:06:57     Text: why me? But the point here is that imagine that there are five people on one side and no one on

0:06:57 - 0:07:03     Text: the other side. And then I would ask you, would you pull the lever to save five people if the trolley

0:07:03 - 0:07:15     Text: is supposed to go straight? And if I asked you interactively, everybody would say, yes,

0:07:15 - 0:07:21     Text: I will pull the lever. And then I will follow up with the next question: okay, what about if there are five people

0:07:21 - 0:07:27     Text: on one side and only one person on the other side? So would you pull the lever to minimize the number

0:07:28 - 0:07:33     Text: of lives that will be sacrificed? And some people will not answer, some people will say, yes,

0:07:34 - 0:07:39     Text: some people will say no. And to those who say yes, I will ask: what if this one person

0:07:39 - 0:07:46     Text: is your brother? And on the other side, just five random people, what would be your answer?

0:07:46 - 0:07:53     Text: And I can go on and on and on to make this problem harder and harder. And as you can imagine,

0:07:55 - 0:08:03     Text: the answers are difficult. And also, we don't know what the answer will be in the actual situation.

0:08:03 - 0:08:09     Text: And while this problem is theoretical, it is in part becoming relevant now when we talk about

0:08:09 - 0:08:18     Text: self-driving cars. So I am now moving closer to the topics that we will discuss today. And I want

0:08:18 - 0:08:27     Text: to introduce a new problem, which I call the chicken dilemma. So in this dilemma, let's train a

0:08:27 - 0:08:38     Text: classifier. And this will be a simple CNN classifier. And the input to the classifier is an egg.

0:08:39 - 0:08:44     Text: And the classifier needs to determine the gender of the chick.

0:08:45 - 0:08:54     Text: So it decides: if it's a hen, it will go to an egg-laying farm, and if it's a rooster, it will go to a

0:08:54 - 0:09:04     Text: meat farm. So first of all, do you think you can build such a classifier? I'm sure every student

0:09:04 - 0:09:09     Text: in this course could easily build such a classifier. And I'm sure it would have quite good accuracy.
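
As a minimal sketch, not from the lecture, of the kind of "simple CNN classifier" being described, a binary image classifier in PyTorch might look roughly like the following; the input size, labels, and training details are illustrative assumptions.

```python
# Illustrative sketch only: a small CNN mapping a 64x64 RGB image (e.g., an egg scan)
# to a binary label such as hen vs. rooster.
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x):
        x = self.features(x)        # (batch, 32, 16, 16) for 64x64 inputs
        x = x.flatten(start_dim=1)  # flatten feature maps for the linear head
        return self.classifier(x)   # unnormalized class scores (logits)

# Training would be the usual cross-entropy loop over labeled images.
model = SimpleCNN()
logits = model(torch.randn(8, 3, 64, 64))  # dummy batch, just to check shapes
print(logits.shape)                        # torch.Size([8, 2])
```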

0:09:11 - 0:09:13     Text: And then the question to you is, do you think it is ethical?

0:09:13 - 0:09:31     Text: And I invite you to type your responses in the chat. Yes, no. I mean, you can justify a little bit.

0:09:34 - 0:09:40     Text: Thank you for participating. Someone asked me to repeat the question. So the question is,

0:09:40 - 0:09:46     Text: there is an egg. And you need to determine the gender of the chick. And if it's a rooster,

0:09:46 - 0:09:54     Text: it will go to a meat farm. And if it's a hen, it will go to an egg-laying farm. And the question is,

0:09:54 - 0:10:03     Text: is this ethical? So there are all kinds of responses. Let's see. So yes and no. One of you writes that you could use the exact

0:10:03 - 0:10:13     Text: same thing to target ethnic groups instead. So yes, thank you. And I see there are many interesting

0:10:13 - 0:10:21     Text: responses here. And with this number of responses, I don't even have time to read them all. So anyway,

0:10:23 - 0:10:31     Text: so based on this question, I can tell you what my thoughts are. So as a vegetarian, I maybe think

0:10:31 - 0:10:38     Text: it's unethical. But as a mother, I actually want my kids to eat meat. And whether I think it's

0:10:38 - 0:10:43     Text: ethical or not, we are doing this anyway today. And there are all kinds of considerations,

0:10:44 - 0:10:51     Text: pros and cons. For example, this is already being done today, and then maybe such a classifier would

0:10:51 - 0:10:58     Text: minimize the suffering of the animal. But on the other hand, we hope that in a future

0:10:58 - 0:11:06     Text: society the life of a chicken will be as valuable as the life of a person. And I can continue on and

0:11:06 - 0:11:12     Text: on. But from this example (and I don't want to stay on it too long), you can see that

0:11:12 - 0:11:21     Text: questions of ethics are difficult. Even if you don't know too much about this field,

0:11:21 - 0:11:29     Text: you can feel what the right answer is. So ethics is an inner guide; it's moral principles.

0:11:30 - 0:11:39     Text: And there are often no easy answers. So there are many gray areas. And importantly, ethics changes

0:11:39 - 0:11:45     Text: over time with the values and beliefs of people. So whatever we discuss today, we may think it's

0:11:45 - 0:11:52     Text: ethical or not ethical, but that may change in a hundred years. And maybe a hundred years ago,

0:11:52 - 0:11:59     Text: this would not even be a question; no one would ask why this could be unethical. And another important point is that

0:12:01 - 0:12:07     Text: this is what we are doing today. So what is ethical and what is legal are not necessarily aligned.

0:12:07 - 0:12:14     Text: We can do legal things that will still be unethical. And now having this primer, I want to move

0:12:14 - 0:12:21     Text: to the actual problems, the kinds of applications that we may be asked to build, where we decide whether

0:12:21 - 0:12:27     Text: we want to build them or not. And the way I will guide this discussion is that I will ask you specific

0:12:27 - 0:12:33     Text: questions and ask for your answers. And I realize it's very difficult to read

0:12:33 - 0:12:39     Text: specific answers in the chat. But the point is that the types of questions that I will ask you are

0:12:39 - 0:12:44     Text: the questions that you could ask yourself when you need to build a technology. And maybe the

0:12:44 - 0:12:50     Text: question of whether something is ethical or not is a difficult one, but let's try to break down the

0:12:50 - 0:12:57     Text: analysis of a specific application of a specific model to derive an answer, which

0:12:59 - 0:13:03     Text: will give us some tools to derive an answer in an easier way.

0:13:05 - 0:13:12     Text: So here is a classifier that we want to build: we want to build an IQ classifier.

0:13:12 - 0:13:20     Text: So we will be talking about predictive technology. So based on people's personal data, for example,

0:13:20 - 0:13:27     Text: facial images, and maybe the text these people have posted on social media,

0:13:28 - 0:13:42     Text: let's predict the IQ of the person. So if you don't know what IQ is, IQ is the general capacity of an

0:13:42 - 0:13:50     Text: individual to consciously adjust their thinking to new requirements. So it's basically how intelligent

0:13:50 - 0:13:57     Text: a person is. So this is already not a hypothetical problem. You can collect the individual's data,

0:13:57 - 0:14:05     Text: you can collect the text online, and you can collect training data to predict people's IQ.

0:14:05 - 0:14:11     Text: But if I ask you whether this is ethical or not, it might be a difficult question to

0:14:11 - 0:14:18     Text: answer immediately. Thank you very much for participating. I really appreciate it.

0:14:18 - 0:14:22     Text: I hope I can save this chat later to read the answers.

0:14:31 - 0:14:33     Text: Okay, so let's start with the first question.

0:14:33 - 0:14:42     Text: We need to predict people's IQ from their photos and text. And then the first question:

0:14:42 - 0:14:48     Text: if I ask you whether it is ethical or not, I don't know. So you can ask yourself first,

0:14:49 - 0:14:55     Text: who would benefit from such a technology? So can you think of who would benefit from a technology

0:14:55 - 0:15:11     Text: that predicts the IQ of a person? Hiring, employers, schools, universities; so I see your answers.

0:15:11 - 0:15:20     Text: Right. So overall, it can be a useful technology. Immigration services can benefit from it.

0:15:20 - 0:15:30     Text: and invite only smart people to immigrate to the country. Even individuals with a high IQ can

0:15:30 - 0:15:37     Text: benefit from this, right? Because they would maybe not need to take the GRE and SAT. They would not need

0:15:37 - 0:15:44     Text: to write essays. They would just need to show their IQ. Okay, so this technology can potentially be

0:15:44 - 0:15:53     Text: useful. And then the next question is, let's assume we can build such a technology.

0:15:53 - 0:15:59     Text: I will show you later that we actually cannot. But even if we can build such a technology,

0:15:59 - 0:16:06     Text: let's think about corner cases and understand who can be harmed by this technology. So basically,

0:16:06 - 0:16:14     Text: what is the potential for dual use? How can this technology be misused? So assume that the

0:16:14 - 0:16:20     Text: classifier is 100% accurate for a second. And please think about it and type:

0:16:20 - 0:16:30     Text: who do you think can be harmed by such a classifier? And how can this classifier be misused?

0:16:30 - 0:16:48     Text: Right, so I can see answers. And I wish we could have this fully interactive, but I can try to summarize

0:16:48 - 0:17:02     Text: what I have read so far. So first of all, one of you wrote that IQ is... let me just answer my own question,

0:17:02 - 0:17:08     Text: because it's difficult to summarize the chat. The interactive format is difficult. So

0:17:08 - 0:17:17     Text: I would think about it in this way. First of all, why would we want to build such a classifier?

0:17:18 - 0:17:26     Text: So to build a classifier, to predict an IQ, companies, universities, they don't really need to

0:17:26 - 0:17:35     Text: know your IQ. What they are trying to predict is your future success: how you will succeed at a

0:17:35 - 0:17:43     Text: job or how you will do at school. And then the question is, is IQ the right proxy for

0:17:43 - 0:17:52     Text: future success? And the answer is no, IQ correlates with future success, but it's not necessarily the

0:17:52 - 0:17:59     Text: right proxy for future success. And then, who are the people who could be harmed? For example, people

0:17:59 - 0:18:07     Text: who have a lower IQ but are very hard-working. People who have a lower IQ but have good

0:18:07 - 0:18:18     Text: soft skills. So, first of all, using IQ as a proxy for future success is an

0:18:18 - 0:18:26     Text: incorrect, biased proxy. And this kind of problem, of using a proxy for the actual label because we cannot

0:18:26 - 0:18:33     Text: have the actual label, future success, is actually very common in

0:18:33 - 0:18:39     Text: other types of predictive technology. If you think about parole decisions, technology that

0:18:39 - 0:18:44     Text: decides on parole, what they want to predict is whether the individual will

0:18:44 - 0:18:50     Text: recommit a crime. But this is a label that is very hard to obtain. And this is why they might

0:18:50 - 0:19:01     Text: resort to another label: whether this individual will be convicted of a crime again,

0:19:01 - 0:19:07     Text: and build this technology to predict future conviction. But conviction of a crime is a biased proxy

0:19:07 - 0:19:15     Text: for the actual objective that we want to have, the likelihood of committing the crime. And this is

0:19:15 - 0:19:22     Text: one example of a biased proxy, which does not allow us to build the right application for the

0:19:22 - 0:19:31     Text: goals that we have. So this is one problem. The second problem is the IQ test itself. It is a biased

0:19:31 - 0:19:42     Text: test. So actually, we cannot build a proper and accurate classifier for the right IQ.

0:19:42 - 0:19:52     Text: And also, if we look at the data that we use, this data, profile photos or social media posts,

0:19:52 - 0:19:59     Text: is itself biased. So there are all kinds of biases because of which we cannot actually

0:19:59 - 0:20:05     Text: build the right model. And this is why this classifier will not be 100 percent accurate, and there

0:20:05 - 0:20:13     Text: will be many individuals who can be harmed. And then there will be questions. For example,

0:20:14 - 0:20:23     Text: assume that the classifier we build is not perfectly accurate, but it has high accuracy. For example,

0:20:24 - 0:20:34     Text: 90 percent or 95 percent. And then I would ask you, is 95 percent a good accuracy or 99 percent a

0:20:34 - 0:20:44     Text: good accuracy? And then the question to think about is what would happen with misclassifications.

0:20:45 - 0:20:52     Text: What would be the impact on individuals' lives if the classifier makes mistakes? And in this

0:20:52 - 0:21:01     Text: case, the important point is that the cost of misclassification is very high. It has an effect on

0:21:01 - 0:21:08     Text: people's lives. So accuracy may not be the right evaluation measure for this classifier.

0:21:09 - 0:21:17     Text: And another question that I could ask is, for example, the condition on this slide: suppose we find

0:21:17 - 0:21:24     Text: out that white females have 99 percent accuracy. But people with blonde hair under age 25 have

0:21:24 - 0:21:40     Text: only 60 percent accuracy. So what does it tell you about this classifier?

0:21:44 - 0:21:51     Text: Right. So the data set itself is biased. This means basically that people with blonde hair

0:21:51 - 0:21:59     Text: under the age of 25 are underrepresented in your data set. So there are all kinds of questions

0:21:59 - 0:22:04     Text: and all kinds of probing questions that you can ask about the classifier to understand, is this

0:22:04 - 0:22:11     Text: the right problem to solve? Who can be harmed? Am I optimizing towards the right objective?

0:22:11 - 0:22:19     Text: Is my data biased? And what is the cost of misclassification? How do I assess the potential for

0:22:19 - 0:22:24     Text: dual use, and how much harm this technology can bring in addition to how useful it can be?
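
A concrete way to act on these questions is to report evaluation numbers disaggregated by subgroup rather than a single accuracy, as in the 99% versus 60% example above. A minimal sketch, with made-up placeholder arrays rather than any real data:

```python
# Illustrative sketch: accuracy broken down by subgroup instead of one overall number.
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])                  # gold labels (placeholders)
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])                  # model predictions (placeholders)
groups = np.array(["A", "A", "B", "B", "A", "B", "B", "A"])  # e.g., a demographic attribute

print("overall accuracy:", (y_true == y_pred).mean())
for g in np.unique(groups):
    mask = groups == g
    acc = (y_true[mask] == y_pred[mask]).mean()
    print(f"accuracy for group {g}: {acc:.2f} (n={mask.sum()})")
```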

0:22:26 - 0:22:31     Text: And one last question, which is a hard one: I want to ask you who is responsible.

0:22:32 - 0:22:37     Text: So I'm your manager at Google, you're working at the company, and I ask you to please build an

0:22:37 - 0:22:44     Text: IQ classifier. And you build the IQ classifier and you publish a paper about the IQ classifier.

0:22:44 - 0:22:50     Text: And this paper is publicized in the media. And then the question is, who is responsible?

0:22:50 - 0:22:56     Text: Is it the researcher or developer? Is it the manager? Is it a reviewer who didn't catch the

0:22:56 - 0:23:02     Text: problems with the IQ classifier? Is it the university or the company? Or is it society?

0:23:07 - 0:23:13     Text: Yeah, so there is one nice answer that I want to read, which is that all of us should be responsible.

0:23:13 - 0:23:19     Text: So in practice, there is very little awareness about which problems are ethical or not,

0:23:19 - 0:23:27     Text: and there are no clear policies here. This is a complicated issue and it's not clear who is responsible.

0:23:27 - 0:23:33     Text: This is why, presumably, whoever is aware of such dangers should be responsible.

0:23:33 - 0:23:37     Text: So I don't know what is the right answer to this question.

0:23:37 - 0:23:50     Text: So now, what is the difference between the chicken classifier and the IQ classifier?

0:23:55 - 0:24:01     Text: Right, so one of your answers is that one affects people and one does not, right?

0:24:01 - 0:24:08     Text: And while the chicken classifier actually affects chickens' lives, the IQ classifier will not kill anyone.

0:24:08 - 0:24:17     Text: It can harm, but it will not kill. Still, we do feel that the IQ classifier can have potentially

0:24:17 - 0:24:25     Text: worse impacts. So AI systems are pervasive in our world, and questions about ethics

0:24:25 - 0:24:32     Text: are raised specifically and commonly about people-centered AI systems. And these systems are really

0:24:32 - 0:24:38     Text: pervasive. So they interact with people, like conversational agents. They reason about people,

0:24:39 - 0:24:46     Text: such as in profiling applications or recommendation systems. They affect people's lives,

0:24:46 - 0:24:52     Text: like the parole decision applications that I mentioned, face recognition, voice recognition;

0:24:52 - 0:24:59     Text: all of these actually have this component of predictive technology and human-centered

0:24:59 - 0:25:06     Text: technology. And this is why ethics is critical here. So I want to move to the next study.

0:25:07 - 0:25:20     Text: The next study is a study of detection: so we again build a classifier, and we want to test

0:25:20 - 0:25:27     Text: the ability to accurately identify one's sexual orientation from mere observation.

0:25:27 - 0:25:36     Text: So this study is called AI gaydar. And as I mentioned, there are many similar studies: studies that predict

0:25:36 - 0:25:43     Text: the potential for terrorist attacks, studies on predictive policing.

0:25:43 - 0:25:49     Text: And also if you heard about Cambridge Analytica, all of them incorporate very similar technology.

0:25:50 - 0:25:59     Text: So let's talk about the AI gaydar study, and the goal is to understand again what kinds of questions

0:25:59 - 0:26:07     Text: we could ask about this study, and what kinds of pitfalls we could prevent if we asked these

0:26:07 - 0:26:17     Text: questions. So to summarize this study, the research question is: we need to identify sexual

0:26:17 - 0:26:25     Text: orientation from people's images. And the data collection process is that we can download photos

0:26:25 - 0:26:38     Text: from a popular American dating website. And there are 35,000 pictures, all white, with equal

0:26:38 - 0:26:43     Text: representation of gay and straight, and of male and female. Everybody is represented

0:26:44 - 0:26:54     Text: evenly. The method that was used is a deep learning model to extract facial features and grooming

0:26:54 - 0:27:00     Text: features. And then a logistic regression classifier is applied to predict the final label,

0:27:00 - 0:27:08     Text: gay or straight. And the accuracy of this classifier is 81% for men and 74% for women.

0:27:08 - 0:27:15     Text: So this is a summary of the study. And I see rightfully asked questions: why would we ever need such

0:27:15 - 0:27:24     Text: a system? This is a good question. But I don't want to pick on this study or disparage a

0:27:24 - 0:27:30     Text: specific researcher; rather, this is a good study to present as an example of what could go wrong

0:27:30 - 0:27:38     Text: at all levels of the study. So this is why I am discussing it now. So what went wrong here?

0:27:38 - 0:27:47     Text: So let's start with the ethics of the research question. So is it ethical at all to predict

0:27:47 - 0:27:57     Text: sexual orientation from any kind of features? And I see a lot of comments and thank you for the comments.

0:27:57 - 0:28:08     Text: So first of all, this is not a new research question. Since the 19th century there have been multiple studies

0:28:08 - 0:28:15     Text: trying to correlate sexual identity with some external features, and then with genetics:

0:28:15 - 0:28:22     Text: people were looking for gay genes, gay brains, gay ring fingers, and so on. So, moving from the

0:28:22 - 0:28:35     Text: 19th century to the 21st century, we can again ask who can benefit from such a classifier and who

0:28:35 - 0:28:42     Text: can be harmed by such a classifier. So what do I think? Who benefits from such a classifier?

0:28:42 - 0:29:00     Text: So autocratic governments, right? But also maybe dating apps, advertisers, conservative

0:29:00 - 0:29:06     Text: religious groups, and so on. So we could think about who would want to use such a classifier.

0:29:06 - 0:29:19     Text: Then, maybe, who can be harmed by such a classifier? Now again, we are not yet asking whether it's

0:29:19 - 0:29:25     Text: possible at all to build such a classifier, and as you can guess, we will see that it's not possible.

0:29:25 - 0:29:33     Text: But what would stop you from building such a classifier? What do you think could be harmful in

0:29:33 - 0:29:49     Text: this classifier? So yeah, thank you for your answers. I will summarize them. So

0:29:53 - 0:29:59     Text: Many people can be harmed by such a classifier, and I have summarized many of your answers

0:29:59 - 0:30:05     Text: here on this slide. So this can potentially be a dangerous technology. In many countries,

0:30:05 - 0:30:14     Text: being a gay person is prosecutable by law, and it can even lead to the death penalty. It might affect

0:30:14 - 0:30:23     Text: people's employment, relationships, health opportunities, right? Importantly, this is not only

0:30:23 - 0:30:31     Text: about sexual orientation. There are many attributes, including sexual identity, that are private

0:30:32 - 0:30:39     Text: to people, right? They are protected attributes, and they can be non-binary, they can be

0:30:39 - 0:30:48     Text: intimate and not visible publicly. And most importantly, these attributes are specifically

0:30:48 - 0:30:54     Text: the attributes against which people are discriminated. And this is why it is

0:30:54 - 0:31:01     Text: basically dangerous to build such a technology. So in the paper, in the published paper,

0:31:01 - 0:31:09     Text: the argument for building this, for presenting this study, was that

0:31:11 - 0:31:15     Text: the study is an alert to how easy it is to build such a classifier,

0:31:15 - 0:31:27     Text: and basically, it is an alert to expose the threat to the privacy and safety of people.

0:31:27 - 0:31:49     Text: And then I would be interested to hear if you have counterarguments. So basically,

0:31:49 - 0:31:57     Text: there can be many counterarguments. One of them is that this is just a classifier, a technology.

0:31:57 - 0:32:03     Text: So like a knife is a technology and with a knife you can kill people and you can

0:32:03 - 0:32:10     Text: cook food, right? And you don't necessarily need to kill people with a knife to

0:32:10 - 0:32:17     Text: expose the dangers and harms of this technology, right? And another issue is that

0:32:17 - 0:32:25     Text: it is actually not possible to build such a classifier with the data that the researchers had.

0:32:25 - 0:32:32     Text: And this is what we will see when we discuss additional details about the data.

0:32:33 - 0:32:39     Text: And another comment is, as I said, this is only one instance of such a technology. Here is

0:32:39 - 0:32:48     Text: another instance, which is a successful startup called Faception, which has drawn a lot of

0:32:48 - 0:32:55     Text: funding and its goal is to identify terrorists based on facial features. And unlike in the

0:32:55 - 0:33:03     Text: previous study, the startup doesn't show how they built the technology they developed, but you can guess

0:33:03 - 0:33:11     Text: that it can have similar dangers. So in general, building predictive technology is very pervasive

0:33:11 - 0:33:21     Text: and ubiquitous, but sometimes it's not as clear-cut whether it is unethical. For example,

0:33:23 - 0:33:28     Text: many people in NLP have published papers on predicting gender from comments.

0:33:28 - 0:33:38     Text: And it is not always clear when the technology is clearly harmful and unethical and when it can

0:33:38 - 0:33:48     Text: be actually used in good ways. For example, we all want our search to work, right? And to work well,

0:33:48 - 0:33:55     Text: and to be personalized, the algorithm actually needs to know something about us. So again, this is

0:33:55 - 0:34:03     Text: not an easy question, but in the case of the gaydar classifier, maybe it's already at the extreme.

0:34:03 - 0:34:15     Text: Okay, let's move to the data. Again, to discuss basically what questions could we ask about the data?

0:34:15 - 0:34:20     Text: So here's how the data was collected. So photos were downloaded from a popular American dating

0:34:20 - 0:34:30     Text: website. They were public. And there were a few thousand images, all white, in a balanced data

0:34:30 - 0:34:40     Text: set. And my first question is, what can you say about the data? So is it okay to use this data if

0:34:40 - 0:34:47     Text: there is no robots.txt exclusion and the photos are public? What can be counterarguments to using people's

0:34:47 - 0:34:50     Text: photos from a dating website?

0:35:03 - 0:35:07     Text: There is a hint on the slide.

0:35:07 - 0:35:19     Text: A lack of consent. People did not intend for their photos to be used to build a classifier,

0:35:19 - 0:35:25     Text: private information, right? So thank you for your answers. The points that I wanted to emphasize

0:35:25 - 0:35:33     Text: here are that it was legal to collect this data, but again, it's not clear whether it was ethical to

0:35:33 - 0:35:40     Text: collect this data, because, as you said, these 35,000 people did not provide consent

0:35:40 - 0:35:47     Text: for their data to be used specifically in this way. And the more important and global issue here is that there

0:35:47 - 0:35:54     Text: is a difference between data that is public and data which is publicized. So public is fine,

0:35:54 - 0:36:01     Text: because these people want to be found by the social circle that they are targeting when they

0:36:01 - 0:36:07     Text: publish their photos on the dating website. But this does not necessarily mean that they want to be

0:36:07 - 0:36:15     Text: found by a broader social circle, by their families, by their colleagues, and so on. So there is a

0:36:15 - 0:36:23     Text: big difference between data that is public and data which is publicized. So overall,

0:36:23 - 0:36:30     Text: even if they did not violate the site's terms of service (I don't know, I didn't actually read

0:36:31 - 0:36:39     Text: them in depth), they did violate the social contract, because this was not the intent of

0:36:39 - 0:36:50     Text: the users for their data to be used in this way. Next question about the data: so what do you think about

0:36:50 - 0:36:58     Text: this data set? 35,000 pictures, all white, balanced in terms of sexual orientation and balanced

0:36:58 - 0:37:18     Text: in terms of gender. It's all white; okay, so it does not represent the population. Right, so basically,

0:37:18 - 0:37:25     Text: you can guess that this data set has many, many biases incorporated. It contains only white people,

0:37:25 - 0:37:33     Text: only people who have self-disclosed their sexual identity. It represents very specific social

0:37:33 - 0:37:40     Text: groups: people who put their photos on a dating website, of a specific age, a specific ethnicity.

0:37:40 - 0:37:49     Text: And basically, these are photos that were carefully selected to be attractive to that target audience.

0:37:50 - 0:37:57     Text: So this data set contains many types of biases. And also, as one of you mentioned,

0:37:57 - 0:38:03     Text: and as written on this slide, the data set is balanced, which does not represent the

0:38:03 - 0:38:11     Text: true distribution of the population. So what does this mean? This means that this model

0:38:12 - 0:38:23     Text: is built on a very biased data set. And you, as students at Stanford, you understand that

0:38:24 - 0:38:31     Text: it cannot be used, for example, on a non-white population. It cannot be used on photos of people,

0:38:31 - 0:38:38     Text: not from the dating website. We don't actually know what this classifier learned.

0:38:38 - 0:38:44     Text: Maybe the most important features were the watermark of this specific website, I don't know,

0:38:44 - 0:38:53     Text: or some other curious confounds. But the point is that once the classifier is out,

0:38:53 - 0:39:00     Text: those who want to use it maliciously don't know that this technology is actually not applicable

0:39:00 - 0:39:06     Text: to any other data set except for this specific data set. So this technology is biased,

0:39:07 - 0:39:14     Text: and it also shows that it's basically not a credible result.

0:39:16 - 0:39:23     Text: Okay, so let's move on. And the final question is that this is basically a deep learning model,

0:39:23 - 0:39:32     Text: a black-box model, and then there is the question of how to analyze errors in the learned

0:39:32 - 0:39:40     Text: model, specifically when we work on such critical, such sensitive topics as predictive

0:39:40 - 0:39:47     Text: technology, not necessarily predicting sexual orientation, but for example, predicting gender,

0:39:47 - 0:39:55     Text: which is again used in many companies. So it is very difficult to understand whether it is okay

0:39:55 - 0:40:01     Text: to use this technology. But the point is that we need to be able to analyze it and

0:40:03 - 0:40:11     Text: evaluate it properly. And the last point is about accuracy. Again, I'm going back to the

0:40:11 - 0:40:18     Text: points that I also mentioned for the IQ classifier. So the accuracy of this classifier is

0:40:18 - 0:40:26     Text: 81% for men and 74% for women. Is it a good accuracy or is it a bad accuracy?

0:40:28 - 0:40:36     Text: So the numbers are okay for some tasks, but not for others. But importantly for this type of problem,

0:40:36 - 0:40:43     Text: it's important to understand that the cost of misclassification is not equal to the cost of

0:40:43 - 0:40:53     Text: correct prediction. And here are some visual examples. So if my algorithm misclassifies my dog

0:40:53 - 0:41:01     Text: as a cookie, is the cost of this misclassification high or low?

0:41:01 - 0:41:10     Text: So I guess it's just funny, right? It's funny. There is nothing offensive here. And then the next

0:41:10 - 0:41:19     Text: question: if my algorithm misclassifies me as my dog, is the cost of misclassification high or low?

0:41:21 - 0:41:26     Text: It can be funny, but maybe not for everyone. We don't really know anymore. And then the photo that I

0:41:26 - 0:41:33     Text: don't put here, but one that many of you have maybe heard of, is the gorilla incident that

0:41:33 - 0:41:42     Text: happened at Google in 2016. So in this case, there was a misclassification of African-American women

0:41:42 - 0:41:48     Text: as gorillas, and to understand how high the cost of this misclassification is, we need to

0:41:48 - 0:41:58     Text: understand the whole history of dehumanization of Black people in the US, and so on. So we can see

0:41:58 - 0:42:05     Text: the difference: for the same algorithm and the same types of errors, some errors

0:42:05 - 0:42:15     Text: are more expensive than others. So this is why it is important to assess AI systems adversarially.
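
One hedged way to reflect this in evaluation, as a sketch rather than anything from the lecture, is to replace plain accuracy with an error-weighted score in which different mistakes carry different costs; the cost values below are arbitrary placeholders.

```python
# Illustrative sketch: weight error types differently instead of treating all mistakes equally.
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])   # gold labels (1 = sensitive/positive class)
y_pred = np.array([1, 0, 0, 1, 1, 0, 1, 0])   # model predictions

# cost[true_label][predicted_label]: correct predictions cost 0; a false negative (1 -> 0)
# is assumed here to be five times worse than a false positive (0 -> 1).
cost = {0: {0: 0.0, 1: 1.0},
        1: {0: 5.0, 1: 0.0}}

accuracy = (y_true == y_pred).mean()
avg_cost = np.mean([cost[t][p] for t, p in zip(y_true, y_pred)])
print(f"accuracy: {accuracy:.2f}, average misclassification cost: {avg_cost:.2f}")
```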

0:42:15 - 0:42:21     Text: And now I just want to reiterate the types of questions that I asked because these are the kinds

0:42:21 - 0:42:26     Text: of questions that you might want to ask yourself next time when you need to build another predictive

0:42:26 - 0:42:33     Text: technology. And the first is to understand the ethics of the research question. And sometimes it's

0:42:33 - 0:42:41     Text: not very easy to understand, but just ask yourself these more specific questions: if I build this

0:42:41 - 0:42:47     Text: technology, who could benefit from it? And who can be harmed by it? So try to see the

0:42:47 - 0:42:55     Text: corner cases. And also, what about the data? Could sharing this data have a major effect on people's

0:42:55 - 0:43:04     Text: lives, like in the case of the AI gaydar classifier? The next question that you can ask is about privacy.

0:43:04 - 0:43:11     Text: As we discussed, who owns the data? And is this data not only public or legal to use, but also,

0:43:11 - 0:43:19     Text: are we violating the social circles to which the data is publicized? Are we

0:43:19 - 0:43:26     Text: violating a social contract in the way that the public data is expected to be used? User

0:43:26 - 0:43:36     Text: consent is not always possible to obtain, but we need to understand the implicit assumptions of people

0:43:36 - 0:43:44     Text: who put their data online. Now, the next question is, what are possible biases in the data? What are the

0:43:44 - 0:43:51     Text: artifacts in the data? What are the distributions for specific populations and subpopulations?

0:43:51 - 0:44:00     Text: How representative is it, or what kinds of misrepresentations are in my data? And next, what is the

0:44:00 - 0:44:07     Text: potential bias in these models? When I build this model, do I control for confounding

0:44:07 - 0:44:13     Text: variables? And do I optimize for the right objective, like in the case of the IQ classifier?

0:44:14 - 0:44:21     Text: And also, if I have biases, does my system amplify them? And finally, it is not enough to measure

0:44:21 - 0:44:30     Text: accuracy, because the semantics of false positives and false negatives can be different.

0:44:30 - 0:44:36     Text: Sometimes the cost of misclassification is much higher than the cost of correct prediction.

0:44:36 - 0:44:41     Text: So we also need to understand how to evaluate the models properly.

0:44:43 - 0:44:50     Text: And why is it especially relevant now? Because, as you all know, there is an exponential

0:44:50 - 0:44:56     Text: growth of user-generated data, and it's really easy to build these tools. Each of us can build an

0:44:56 - 0:45:03     Text: IQ classifier or a gaydar. But the question is, what kind of technology will we produce?

0:45:05 - 0:45:12     Text: So with this, I have finished the first part of the discussion, and I put on this slide some recommended papers

0:45:12 - 0:45:20     Text: and talks, specifically on introductory topics on the impact of NLP. And I think

0:45:20 - 0:45:26     Text: there are hundreds or thousands of similar talks, but these are my favorites. So if you want to read

0:45:26 - 0:45:35     Text: more, please take a look. Should I stop for questions, or should we move to the next part?

0:45:38 - 0:45:43     Text: Right at the moment there aren't any outstanding questions. So, maybe it's okay to move on

0:45:43 - 0:45:51     Text: unless anyone is desperately typing. Okay, and the chat window is really nice. There are so

0:45:51 - 0:45:58     Text: many responses. Thank you all. And I hope to get this chat file later so I can read it.

0:45:58 - 0:46:05     Text: I think we can save it, yeah. Thank you. Okay, we can move on to the second part about algorithmic bias.

0:46:05 - 0:46:16     Text: So, what are the topics at the intersection of ethics and NLP? The first one is algorithmic bias,

0:46:16 - 0:46:21     Text: and this is about bias in data and NLP models, and this is something that I will talk about more in the

0:46:21 - 0:46:28     Text: second part. But it's important to understand that the field is much broader. So, the next topic is

0:46:28 - 0:46:36     Text: incivility: the ability to develop NLP tools and data analytics to identify and understand

0:46:36 - 0:46:43     Text: hate speech, toxicity, and incivility online. And this is a very complicated field, because it's not

0:46:43 - 0:46:50     Text: only about building the right classifiers. There are many, many questions, such as: if I post a hateful

0:46:50 - 0:46:59     Text: comment, who does this comment belong to? Does it belong to the company? Does it belong to me? Should

0:46:59 - 0:47:04     Text: it be removed or not? Because it is not clear where the boundary is between free speech and

0:47:06 - 0:47:13     Text: moderated speech, or how to minimize the harms but defend democracy. These questions are

0:47:13 - 0:47:23     Text: very subjective, and they are not regulated. So this is a big, difficult field. The next field is

0:47:23 - 0:47:29     Text: about privacy. So, again, who does this data belong to, and how do we protect privacy? This

0:47:30 - 0:47:36     Text: field of privacy is actually very, very under-explored in NLP. In other subfields, I think, there is

0:47:36 - 0:47:42     Text: some research, on incivility or on algorithmic bias, but very little research on

0:47:42 - 0:47:53     Text: privacy. Misinformation: information manipulation, opinion manipulation, fake news. So, there is

0:47:53 - 0:48:03     Text: a whole range of ways that information can be manipulated, from generated texts and disinformation to

0:48:03 - 0:48:12     Text: advertisement, propaganda, and the most subtle opinion manipulation. And there are

0:48:12 - 0:48:19     Text: many, many interesting research projects that can be done with a focus on the language of

0:48:19 - 0:48:28     Text: manipulation. I think it's just an interesting topic to explore. And finally, the technological divide.

0:48:28 - 0:48:35     Text: So, when we build our tools, even if it's a part-of-speech tagger, what are the populations that

0:48:35 - 0:48:42     Text: are served by these tools? So, there is a certain divide. The technologies are built unequally. There

0:48:42 - 0:48:48     Text: is not one language, not two languages. There are six thousand languages in the world, and there are

0:48:48 - 0:48:56     Text: many populations. And there are certain areas of NLP which are completely under-explored. For

0:48:56 - 0:49:02     Text: example, language varieties. We think we can solve a problem for English, the problem of dependency

0:49:02 - 0:49:08     Text: parsing, but we don't account for different varieties of English. What about Nigerian English,

0:49:08 - 0:49:15     Text: what about African-American English, what about Indian English? So, there is a

0:49:15 - 0:49:26     Text: technological divide that is currently present. And as you see at the bottom, this picture shows that

0:49:26 - 0:49:33     Text: the field is highly interdisciplinary. So, AI researchers cannot actually solve the problem of

0:49:34 - 0:49:41     Text: misinformation alone. To be able to address the problem of misinformation or hate speech,

0:49:41 - 0:49:49     Text: we need to have not only engineers, but also ethicists and social scientists and activists,

0:49:53 - 0:50:01     Text: and politicians, who are actually responsible for policies, and linguists, because many of these

0:50:01 - 0:50:08     Text: phenomena are interesting phenomena which are not in the words, but more in pragmatics. So,

0:50:08 - 0:50:13     Text: this is a very interesting field, scientifically, but also very challenging to work on.

0:50:15 - 0:50:19     Text: And there are some recommended resources, and in particular, the one that might be interesting to

0:50:19 - 0:50:29     Text: you all is CS384, a seminar by Dan Jurafsky, which also has an amazing reading list. So, if you want to

0:50:29 - 0:50:37     Text: look into specific subfields, please take a look. So, this was a general overview, and now I want to talk about

0:50:37 - 0:50:44     Text: one of these subfields, and give some explanation of why we have algorithmic bias in our models.

0:50:45 - 0:50:53     Text: So, let's start again with interaction. I know you have the slides, but don't look ahead.

0:50:54 - 0:50:59     Text: I want to ask you questions, and please type, which word is more likely to be used by a female?

0:50:59 - 0:51:11     Text: Giggle or laugh? Just type quickly, don't think too much. So, please look at the chat and see

0:51:11 - 0:51:23     Text: the majority response. It is absolutely giggle. You're right. Next question, which word is more likely

0:51:23 - 0:51:38     Text: to be used by a female? Brutal or fierce? Oh, lovely. Like 99% fierce. Thank you. Next question,

0:51:39 - 0:51:43     Text: which word is more likely to be used by an older person? Impressive or amazing?

0:51:43 - 0:51:54     Text: So, actually from what I see, it's 100% impressive. Very impressive. Thank you for your answers.

0:51:55 - 0:52:01     Text: Which word is more likely to be used by a person of a higher occupational class? Suggestions or proposals?

0:52:05 - 0:52:07     Text: Do you see how correctly you answered my questions?

0:52:07 - 0:52:16     Text: Next question: why do we intuitively recognize these social defaults? Why do we all know the right answer?

0:52:21 - 0:52:27     Text: Right, our brains are biased. So, this is about implicit biases in our brains.

0:52:28 - 0:52:33     Text: And this is a very good example. And you can also see how language perpetuates and propagates

0:52:33 - 0:52:38     Text: biases. Right? It's all in the language. If you can tell from one word who the person who

0:52:38 - 0:52:46     Text: said it is, you can imagine what kinds of biases we can extract from a longer text.
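
As an aside not from the lecture, one way to see such associations quantitatively is to probe pretrained word embeddings; the minimal sketch below assumes the gensim library and its downloadable GloVe vectors, and is a crude, WEAT-style check rather than a rigorous bias measure.

```python
# Illustrative sketch: how strongly does each word associate with female vs. male
# attribute words in pretrained GloVe embeddings?
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-100")  # downloads pretrained vectors on first use

female_words = ["she", "her", "woman", "female"]
male_words = ["he", "his", "man", "male"]

def gender_association(word: str) -> float:
    """Mean similarity to female attribute words minus mean similarity to male ones."""
    f = sum(vectors.similarity(word, w) for w in female_words) / len(female_words)
    m = sum(vectors.similarity(word, w) for w in male_words) / len(male_words)
    return f - m

for word in ["giggle", "laugh", "fierce", "brutal"]:
    print(f"{word:>7}: {gender_association(word):+.3f}")  # positive = more female-associated
```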

0:52:46 - 0:52:52     Text: So, to understand what's happening with biases, we need to understand how cognition works. This framework

0:52:52 - 0:52:58     Text: was introduced by Kahneman and Tversky. So, conceptually, our brain is divided into System 1 and

0:52:58 - 0:53:06     Text: System 2. System 1 is our autopilot. It is used to make decisions without thinking. It is

0:53:06 - 0:53:14     Text: very fast, parallel, effortless, and so on. System 2 is our logical part. It knows how to analyze and

0:53:14 - 0:53:20     Text: make decisions that are unusual for us. So it is not automatic, and it is slow, serial, controlled.

0:53:20 - 0:53:28     Text: It requires a lot of mental energy. So, our brain constantly receives signals of all kinds

0:53:28 - 0:53:35     Text: through all the senses, through eyes and ears. There is a lot of incoming data. There are a lot of

0:53:35 - 0:53:42     Text: pixels here around me. But System 2 is only able to process a very small portion

0:53:42 - 0:53:51     Text: of the signals that we receive. So, System 1 is automatic, System 2 is effortful, but in practice

0:53:51 - 0:54:01     Text: over 95% of the signals that we receive from the world are relegated to System 1. And the funny

0:54:01 - 0:54:09     Text: thing, as Kahneman wrote, is that we identify ourselves with System 2. We believe that we are conscious

0:54:09 - 0:54:15     Text: and reasonable beings, but in practice most of our decisions are made by System 1.

0:54:17 - 0:54:29     Text: So, since we are using the autopilot most of the time, all the information that our brain

0:54:29 - 0:54:37     Text: perceives gets categorized, clustered, and labeled automatically. And this is how cognitive stereotypes

0:54:37 - 0:54:47     Text: are created. And there are multiple cognitive stereotypes that aim to fill the gaps if

0:54:47 - 0:54:53     Text: we don't have enough information, to reduce and generalize if we have too much information,

0:54:54 - 0:55:01     Text: or to complete the facts if we are missing facts, and so on. And this leads to all kinds of cognitive

0:55:01 - 0:55:11     Text: biases. So, examples of biases would be in-group favoritism: we grow up seeing a majority

0:55:11 - 0:55:15     Text: of specific people, and we tend to like those people more than minorities.

0:55:17 - 0:55:23     Text: Or the halo effect: we know very little about a person or a specific social group,

0:55:23 - 0:55:29     Text: but we tend to generalize; based on one trait we generalize to the whole group and to

0:55:29 - 0:55:40     Text: other traits. And so there are many biases. And thanks to these stereotypes, if I

0:55:40 - 0:55:45     Text: ask you the questions about the words, or if I show you these pictures, you immediately know

0:55:45 - 0:55:51     Text: this is calming, cute, tasty, and if I show these pictures, you will know that maybe it's

0:55:51 - 0:55:58     Text: dangerous and unpleasant. And automatically, when we see a snake, we will automatically step

0:55:58 - 0:56:07     Text: back, right? And we only need to invoke System 2, for example,

0:56:07 - 0:56:14     Text: if we decide to touch it. But most of our decisions are automatic, and this is exactly the same

0:56:14 - 0:56:22     Text: mechanism that creates social stereotypes in our brains. Through exactly the same mechanism, we

0:56:22 - 0:56:30     Text: internalize these associations and make generalizations about specific groups.

0:56:32 - 0:56:37     Text: And this is why, when I ask you which word is more likely to be used by an older person,

0:56:37 - 0:56:45     Text: 100% of you typed that the word impressive would be used. And importantly,

0:56:45 - 0:56:52     Text: these implicit biases are very pervasive and they operate unconsciously. And one important

0:56:52 - 0:56:58     Text: property is that they are transitive. So basically, we see that a Black person is playing

0:56:58 - 0:57:04     Text: basketball, and in a movie we see that a Black person uses drugs, and we immediately connect these and

0:57:04 - 0:57:13     Text: reinforce, for example, the associations with specific groups. And social stereotypes are

0:57:13 - 0:57:22     Text: not necessarily all negative. I can name some stereotypes that are positive on the surface. For example,

0:57:22 - 0:57:28     Text: Asians are good at math. Importantly, they all have negative effects, even the seemingly positive

0:57:28 - 0:57:36     Text: stereotypes, because they pigeonhole individuals and put expectations on them, or they can

0:57:36 - 0:57:44     Text: just be harmful. And then how do these biases manifest? They manifest in language. And for example,

0:57:44 - 0:57:54     Text: they manifest in subtle microaggressions. And importantly, microaggressions do not necessarily correlate

0:57:54 - 0:58:02     Text: with sentiment, so sentiment analysis tools would not detect them. On the surface level,

0:58:02 - 0:58:09     Text: microaggressions can be negative, neutral, or positive, like in these examples. But they actually

0:58:09 - 0:58:16     Text: bring prolonged harm, even if they are meant as a compliment. And there has been a lot of research

0:58:16 - 0:58:24     Text: in the social sciences showing that they can bring even more harm than overt hate speech,

0:58:24 - 0:58:30     Text: because they bring significant emotional harm and reinforce

0:58:30 - 0:58:37     Text: problematic stereotypes. So, if I collect these conversations from Twitter: "Do I look okay?"

0:58:37 - 0:58:42     Text: "You are so pretty." Is this a positive or negative interaction? It's probably a positive interaction.

0:58:43 - 0:58:49     Text: And then the next interaction: "Check out my new physics paper." Is it a positive or negative interaction?

0:58:51 - 0:58:56     Text: "Why physics? You're so pretty." So, we don't know. We don't have the right context.

0:58:56 - 0:59:03     Text: And then for the question "Do I look okay?" there are all kinds of responses, for example, "You are so

0:59:03 - 0:59:08     Text: pretty for your age." In this case, these are negative. These are microaggressions. They make us

0:59:08 - 0:59:14     Text: cringe, right? And then the problem is that all of this human-generated data, which

0:59:14 - 0:59:20     Text: necessarily incorporates a lot of microaggressions and stereotypes that we all have

0:59:20 - 0:59:30     Text: and are not aware of, is fed to our systems. So, there is a lot of bias in language:

0:59:30 - 0:59:36     Text: stereotypes or historical biases that are perpetuated. For example, there are more photos of male

0:59:37 - 0:59:45     Text: doctors than female doctors on the web, or human reporting biases. And later,

0:59:45 - 0:59:52     Text: there are also biases in our data sets. So, for example, what kind of data is sampled for annotation,

0:59:52 - 0:59:57     Text: from which populations, from which language varieties, from which locations?

0:59:57 - 1:00:03     Text: And then, who are we choosing as annotators? So, there is a bias in who the annotators are that will

1:00:03 - 1:00:10     Text: annotate our data. And then there are the cognitive biases of the annotators themselves: how they decide

1:00:10 - 1:00:17     Text: what is a microaggression and what is not, and other questions. And all these types of

1:00:17 - 1:00:24     Text: biases later propagate into our computational systems. And this is how we get from cognitive biases,

1:00:24 - 1:00:31     Text: social and cognitive biases, to algorithmic biases. Because if you remember System 1 and System 2,

1:00:31 - 1:00:39     Text: the way we currently develop systems, AI is only System 1. And why is that? Because

1:00:41 - 1:00:48     Text: currently the way we develop our tools, the dominant paradigm, is a data-centric approach.

1:00:48 - 1:00:55     Text: So, we need a lot of data to train good models. And we do know well how to leverage a lot of data.

1:00:55 - 1:01:04     Text: But again, language is about people. It is produced by people. But our existing systems, they

1:01:07 - 1:01:13     Text: do not leverage social and cultural context. We don't know how to incorporate it,

1:01:14 - 1:01:21     Text: and we usually don't do it. We don't encode which social biases are positive and

1:01:21 - 1:01:26     Text: which are negative, which inductive biases are good to have and which are bad to have. So overall,

1:01:28 - 1:01:34     Text: our models are really powerful and they are powerful at making generalizations. But we don't know

1:01:34 - 1:01:41     Text: how to control for the right inductive biases, which biases are good and which inductive biases are not.

1:01:41 - 1:01:48     Text: And going to the next point, these models are opaque. We also don't know how to

1:01:48 - 1:01:55     Text: interpret deep learning networks well, which means it's not easy to analyze them and spot

1:01:55 - 1:02:02     Text: problems. And as you can guess, this is not only related to the field of ethics and NLP.

1:02:02 - 1:02:07     Text: These are just interesting research questions. How to incorporate social and cultural

1:02:07 - 1:02:14     Text: knowledge into deep learning models or how to develop interpretation approaches.

1:02:14 - 1:02:27     Text: So, what is missing today? For example, with existing classifiers for

1:02:27 - 1:02:34     Text: toxicity detection, if we want to build data analytics to clean up our data before it propagates

1:02:34 - 1:02:42     Text: to the models, we only know how to detect overt toxic language, such as hate speech, because we

1:02:42 - 1:02:49     Text: are primarily sampling and training our data based on lexicons. And there is almost no

1:02:49 - 1:02:58     Text: focus on actual microaggressions and more subtle biases, which are often not in the words, but in

1:02:58 - 1:03:04     Text: the pragmatics of the conversation and in understanding who the people involved in the conversation are.

1:03:04 - 1:03:13     Text: So today's tools, such as hate speech detection or sentiment analysis, could be applied to

1:03:13 - 1:03:21     Text: these kinds of microaggressions, but they would necessarily fail. The next point is that, again, our

1:03:21 - 1:03:30     Text: models do not incorporate social cultural knowledge. And basically, the same comment can be toxic

1:03:30 - 1:03:36     Text: or non-toxic depending on who are the people involved in the conversation. But our models are

1:03:36 - 1:03:46     Text: data-centric and not people-centric. And the more general problem is that deep learning models

1:03:47 - 1:03:54     Text: really tend to pick up spurious correlations. And this is why, for example, in this paper,

1:03:54 - 1:04:02     Text: there are three comments where the only difference between the three sentences is the name,

1:04:02 - 1:04:12     Text: and probably the association of this name with race or ethnicity. So our models do

1:04:12 - 1:04:19     Text: pick up on spurious confounds. We think we predict sentiment, but we also predict all kinds

1:04:19 - 1:04:25     Text: of labels that correlate with sentiment but are not necessarily true predictors of sentiment,

1:04:25 - 1:04:29     Text: for example, gender or race. And this is something very pervasive.

1:04:31 - 1:04:38     Text: And finally, the models are not explainable. So we have these deficiencies just

1:04:38 - 1:04:44     Text: in core approaches to deep learning. And with this data we train

1:04:44 - 1:04:51     Text: conversational agents, personal systems, all kinds of systems. And why do we care now? Because

1:04:51 - 1:04:59     Text: it can bring harms. So what kind of unintended harms can it bring? Here is an example from

1:04:59 - 1:05:05     Text: image search. If you search for "three black teenagers" (and I searched for it when I prepared

1:05:05 - 1:05:11     Text: this talk for the first time), this is how it looked in June 2017. It was fixed later, I guess.

1:05:11 - 1:05:20     Text: And then when you search for a doctor, you get primarily male doctors, right? And

1:05:20 - 1:05:26     Text: primarily white. And if you search for a nurse, this is the stereotypical image of a nurse.

1:05:27 - 1:05:35     Text: And if you search for a homemaker, these are just the top search results for this query.

1:05:35 - 1:05:42     Text: If you search for CEO, it's a very specific, stereotypical image of a CEO.

1:05:44 - 1:05:49     Text: And if you search for a professor, this one is my personal favorite. You can see all the

1:05:51 - 1:05:56     Text: images, and there is only one woman. But if you look at her background, you can see this

1:05:56 - 1:06:05     Text: is simply a mistake. It's just an error in the search; she is not a professor. And here are results from,

1:06:05 - 1:06:14     Text: for example, face recognition. These are two examples. One camera does not recognize

1:06:14 - 1:06:22     Text: Asian faces and thinks they blinked, on the right. And this is a video of a face-tracking

1:06:22 - 1:06:29     Text: camera that is able to track white faces but immediately shuts down on a black face. So it is

1:06:29 - 1:06:36     Text: not able to track black faces. These are all consequences of biased data that propagates into

1:06:36 - 1:06:44     Text: models that do not intentionally incorporate safeguards against these specific biases.

1:06:45 - 1:06:50     Text: Now what's going on with natural language processing? So this is a slide from the very beginning

1:06:50 - 1:06:57     Text: that just lists all possible applications that I could think of. As you can guess, since 2016,

1:06:57 - 1:07:03     Text: there have been many, many papers; I don't think there is any application or core

1:07:03 - 1:07:11     Text: technology of NLP left for which biases have not been exposed. So here is an example

1:07:11 - 1:07:18     Text: of bias in machine translation. So this is visual. This is why I'm showing it. So there are

1:07:18 - 1:07:26     Text: languages that mark third-person pronouns with gender and other languages that do not

1:07:26 - 1:07:31     Text: mark third-person pronouns with gender. So if you translate from a language such as

1:07:32 - 1:07:41     Text: Hungarian or Estonian, which don't mark third-person pronouns with gender, into English,

1:07:41 - 1:07:46     Text: which does mark third-person pronouns with gender, you might see results like these. You will not

1:07:46 - 1:07:55     Text: see them now; this is what was exposed maybe a year or two ago. So basically, the translation

1:07:55 - 1:08:01     Text: of "they are a nurse" would be "she is a nurse", but for "they are a scientist" the translation

1:08:01 - 1:08:08     Text: would be "he is a scientist". The same for engineer, baker, teacher: all the

1:08:08 - 1:08:11     Text: historical stereotypes that you could think about.

1:08:11 - 1:08:23     Text: So what are the possibilities to fix it? One way to fix it is actually simple. You could treat

1:08:23 - 1:08:32     Text: the target gender just as you treat a target language in multilingual NMT. So I don't know, but I suspect you

1:08:32 - 1:08:39     Text: did look at the paper on multilingual neural machine translation. Basically, you can add

1:08:39 - 1:08:46     Text: another token, for example, and you can controllably generate a female or male translation.

1:08:46 - 1:08:53     Text: So the fix is not difficult, but you need to be aware of the potential dangers to be able to fix the

1:08:53 - 1:08:59     Text: model. And importantly, this is not only about fixing the model itself, but also about fixing

1:08:59 - 1:09:06     Text: the user interface. The way Google fixed this in the interface is that they provide different translations,

1:09:06 - 1:09:09     Text: basically all possible translations for the different genders.
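
A minimal sketch of the control-token idea just described, in Python. The token names and the preprocessing function are hypothetical; this is the general trick from multilingual NMT, not Google's actual production fix:

```python
# Sketch of controllable gender in translation via a target-side control token,
# in the spirit of multilingual NMT, where a token such as <2fr> selects the
# target language. All token names here are hypothetical.
GENDER_TOKENS = {"female": "<2F>", "male": "<2M>", "neutral": "<2N>"}

def add_gender_token(source_sentence: str, target_gender: str) -> str:
    """Prepend a control token so a seq2seq model can condition on the requested gender."""
    return f"{GENDER_TOKENS[target_gender]} {source_sentence}"

# At training time, pairs would be tagged with the gender actually used in the
# reference, e.g. ("<2F> ő ápoló", "she is a nurse"), ("<2M> ő tudós", "he is a scientist").
# At inference time, the interface can request every variant and show them all:
if __name__ == "__main__":
    source = "ő ápoló"  # Hungarian: the third-person pronoun is gender-neutral
    for gender in GENDER_TOKENS:
        print(gender, "->", add_gender_token(source, gender))
```

The token only makes the choice controllable and visible in the interface; the model still has to be trained on data where the references carry the requested gender.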

1:09:12 - 1:09:20     Text: Similar kinds of harms were shown especially in dialogue systems. Occasionally such

1:09:20 - 1:09:28     Text: models make big headlines, like Microsoft's Tay chatbot that became very racist and

1:09:28 - 1:09:37     Text: sexist overnight, or the GPT-3-based model that was offering suicide advice. And about two

1:09:37 - 1:09:44     Text: weeks ago there was a Korean chatbot that became extremely homophobic very quickly and had to be

1:09:44 - 1:09:50     Text: removed. So these headlines come up again and again and again.

1:09:50 - 1:09:58     Text: And I guess the point here is that what we do in NLP today, I call it a reactive approach.

1:09:59 - 1:10:10     Text: We expose a specific problem: a problem in search, a problem in chatbots, racist chatbots,

1:10:10 - 1:10:16     Text: or a problem in machine translation. Then it creates bad publicity, and then we start

1:10:16 - 1:10:23     Text: debiasing the models. But it's not necessarily the case that we need to develop the tools in this way,

1:10:23 - 1:10:30     Text: right? So I hope that in the future we make a paradigm shift toward a more proactive

1:10:30 - 1:10:42     Text: approach. What would a proactive approach require? For example,

1:10:42 - 1:10:52     Text: building new data analytics: rather than exposing biases, going further up the pipeline,

1:10:52 - 1:10:58     Text: starting actually with the data, and building automatic moderators and data analytics that

1:10:58 - 1:11:05     Text: can identify problematic texts and problematic images beyond overt hate speech. Then, incorporating

1:11:05 - 1:11:14     Text: the right inductive biases into the models, and moving from

1:11:14 - 1:11:21     Text: data-centric approaches to people-centric approaches, incorporating social, cultural, and

1:11:21 - 1:11:26     Text: pragmatic knowledge. And in modeling, there are interesting research questions on how to

1:11:26 - 1:11:35     Text: demote spurious confounds and predict only the target label, rather than picking up on

1:11:35 - 1:11:40     Text: spurious correlations. And finally, on building more interpretable models. Importantly,

1:11:40 - 1:11:47     Text: these are not orthogonal research directions. For example, to build good data analytics,

1:11:48 - 1:11:55     Text: you may well need an interpretable model and also be able to incorporate the right

1:11:55 - 1:12:02     Text: social and cultural knowledge, because again, with microaggressions, the bias is not necessarily in the words.

1:12:04 - 1:12:13     Text: So what I was going to do, if I had time, is show two case studies,

1:12:13 - 1:12:19     Text: research studies from my group, that specifically focus on these data analytics:

1:12:19 - 1:12:28     Text: unsupervised bias identification, and an interpretable model for making hate speech classifiers more robust.

1:12:28 - 1:12:34     Text: But I will skip it because we are out of time. (You've got a few minutes stored;

1:12:34 - 1:12:41     Text: you could show one quickly, for five minutes.) So basically, these green

1:12:41 - 1:12:48     Text: boxes are what we have today, hate speech or sentiment analysis. But we are trying to build a new

1:12:48 - 1:12:55     Text: class of models specifically for social bias analysis. In these models, we would want to

1:12:55 - 1:13:01     Text: detect who the people involved are, so who the comment, for example, is directed to, if it's

1:13:01 - 1:13:09     Text: a conversational domain, and also to understand what kinds of microaggressions these are, and to

1:13:09 - 1:13:15     Text: maybe generate explanations or interpretations by building more interpretable models.

1:13:15 - 1:13:24     Text: And these are the two papers that I was going to talk about. One is an unsupervised approach to

1:13:25 - 1:13:33     Text: detecting gender bias. And one is about the following: if we have just a few examples of microaggressions and a

1:13:34 - 1:13:39     Text: classifier for hate speech, these examples of microaggressions are adversarial examples for the

1:13:39 - 1:13:46     Text: classifier. The classifier is not able to deal with them. But what we can do is focus on the

1:13:46 - 1:13:53     Text: interpretability of the classifier, and specifically make the classifier show, for each

1:13:53 - 1:14:02     Text: probe example, which examples in the training data influenced the classifier's decision. So we change

1:14:02 - 1:14:08     Text: the approach to interpretability from interpreting specific salient words in the input of the classifier

1:14:08 - 1:14:16     Text: to looking at the training data, sorting the training data, and identifying which examples were

1:14:16 - 1:14:21     Text: most influential for the classifier's predictions, using influence functions, for example;

1:14:23 - 1:14:33     Text: this is a paper that Percy Liang published in 2017. And through this we are able to surface

1:14:33 - 1:14:39     Text: microaggressions even when the classifier makes the wrong prediction.
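
For a concrete picture of this interpretability strategy, here is a minimal sketch of influence functions (Koh and Liang, 2017) on a toy logistic-regression classifier. The data, features, and training loop below are placeholders, not the setup from the actual study:

```python
# Toy sketch of influence functions for ranking training examples by their
# influence on one probe prediction. Everything here (data, features, the tiny
# logistic-regression model) is a placeholder used only to show the mechanics.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_single(w, x, y):
    # Gradient of the logistic loss for one example (x, y), y in {0, 1}.
    return (sigmoid(x @ w) - y) * x

def hessian(w, X, l2=1e-2):
    # Hessian of the mean training loss, with L2 damping to keep it invertible.
    p = sigmoid(X @ w)
    return (X.T * (p * (1 - p))) @ X / len(X) + l2 * np.eye(X.shape[1])

def influence_scores(w, X_tr, y_tr, x_probe, y_probe):
    # I(z_i, z_probe) = -grad L(z_probe)^T  H^{-1}  grad L(z_i).
    # More negative: upweighting z_i would lower the probe loss, i.e. z_i helped.
    H_inv = np.linalg.inv(hessian(w, X_tr))
    g_probe = grad_single(w, x_probe, y_probe)
    return np.array([-(g_probe @ H_inv @ grad_single(w, x, y))
                     for x, y in zip(X_tr, y_tr)])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_tr = rng.normal(size=(200, 10))                        # toy "text features"
    y_tr = (X_tr[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(float)
    w = np.zeros(10)
    for _ in range(300):                                     # plain gradient descent
        w -= 0.5 * X_tr.T @ (sigmoid(X_tr @ w) - y_tr) / len(X_tr)
    x_probe, y_probe = X_tr[0], y_tr[0]                      # pretend: a microaggression probe
    scores = influence_scores(w, X_tr, y_tr, x_probe, y_probe)
    print("most helpful training examples:", np.argsort(scores)[:5])
```

Sorting training examples by these scores is what lets one inspect which training data pushed the classifier toward its (possibly wrong) decision on a probe example.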

1:14:39 - 1:14:47     Text: So this is just a very high-level summary without talking about the actual studies. Let me just skip

1:14:50 - 1:14:56     Text: the actual papers. And the slides are there. I'm happy to discuss later. I just don't want to go

1:14:56 - 1:15:05     Text: over time. So to summarize, the field of computational ethics is super interesting and there are

1:15:05 - 1:15:11     Text: problems that are technically interesting and challenging. So you don't need to

1:15:11 - 1:15:16     Text: separate the important problems from the technically interesting problems. We can work on

1:15:16 - 1:15:21     Text: important problems which are also technically interesting, and focus on important things like

1:15:21 - 1:15:28     Text: building better deep learning models. And these are interesting subfields. And if some of you are

1:15:28 - 1:15:37     Text: interested in specific projects, in our course we just put together a presentation

1:15:37 - 1:15:45     Text: that summarizes all kinds of possible projects. Thank you very much.

1:15:45 - 1:15:54     Text: I wish I could see the audience. This is so weird. Thank you, Julia, for that great talk.

1:15:55 - 1:16:03     Text: Yeah, so if people would like to ask some questions to Julia, if you raise your hand,

1:16:04 - 1:16:10     Text: we can promote you to be panelists. And I think then we can even have you turn on your cameras if

1:16:10 - 1:16:18     Text: you want to show you're a real human being. But, you know, while we're waiting to see if there are

1:16:19 - 1:16:26     Text: people who would like to do that, I mean, there is one question that's outstanding at the moment,

1:16:26 - 1:16:32     Text: which is: do these bots become racist and sexist so quickly after exposure to the public

1:16:32 - 1:16:37     Text: due to the public intentionally trying to bias them, or is it that common talk among the public

1:16:37 - 1:16:44     Text: is racist and sexist enough to bias any model upon exposure? So I think it is both. But in the case,

1:16:44 - 1:16:51     Text: for example, of the Tay bot, the way it was built is as a continual learning system. So it collects

1:16:52 - 1:17:00     Text: inputs from people and then uses them as training examples to generate further answers. And

1:17:00 - 1:17:07     Text: people, as people usually do, pick up on such things very quickly. And then they intentionally became

1:17:07 - 1:17:13     Text: racist and sexist against the bot. And the bot very quickly learned to just

1:17:13 - 1:17:21     Text: mimic the people's behavior. So it was some malicious attempt to turn this bot racist and sexist.

1:17:21 - 1:17:33     Text: But this is how the model was designed: to collect inputs from people, but not to monitor the kinds of

1:17:33 - 1:17:39     Text: sentences that are used or not used in the training data. So this again goes back to the

1:17:39 - 1:17:46     Text: discussion that we actually don't have good analytics. Many of these analytics are just

1:17:46 - 1:17:53     Text: blocklists or whitelists; they are very, very primitive. It's not very easy to incorporate such

1:17:53 - 1:18:02     Text: constraints into generation or the automatic filtering of data. There is another question.

1:18:06 - 1:18:10     Text: Do you want to ask your question? Yes. So I guess you got both questions. Thank you. And I

1:18:10 - 1:18:18     Text: can't see people, so go ahead and ask the question. Yeah, I'm curious, maybe

1:18:18 - 1:18:24     Text: going into a little bit more detail about how you measure your model's performance. Are there actually public

1:18:24 - 1:18:32     Text: benchmark data sets, or any sort of well-defined metrics with which you can sort of objectively

1:18:32 - 1:18:38     Text: measure your model's improvements? Are you asking specifically about the papers that I skipped?

1:18:38 - 1:18:46     Text: I mean, I know it's a very new field and maybe it's harder to define a real objective

1:18:47 - 1:18:54     Text: measure of bias. So how do you measure progress in general? Are you also just mentioning this?

1:18:54 - 1:19:01     Text: This is a good question. It's very difficult. There is a growing body of data sets. For example,

1:19:01 - 1:19:08     Text: Yejin Choi's group created the Social Bias Inference Corpus; I don't remember exactly what

1:19:08 - 1:19:18     Text: it's called, SBIC. Overall, the problem of evaluation is actually very difficult. And there are some

1:19:18 - 1:19:25     Text: problems in which there are existing evaluation data sets. If you think about hate speech, for example,

1:19:25 - 1:19:35     Text: there are many data sets for training and evaluating performance of hate speech classifiers.

1:19:36 - 1:19:43     Text: But when we think about biases, there are many fewer. And the big problem here is that it is not easy to

1:19:43 - 1:19:51     Text: collect such a data set. Let me actually show why it is difficult to collect a

1:19:51 - 1:19:57     Text: data set of, say, microaggressions. So a naive solution would be the following:

1:20:02 - 1:20:08     Text: if you think about the standard way of data collection, we would sample some data from the

1:20:08 - 1:20:15     Text: internet and give it to Mechanical Turk annotators. And then they would annotate: is it biased or

1:20:15 - 1:20:19     Text: not? And we build a supervised classifier. This is what we cannot do in the case of more subtle

1:20:19 - 1:20:27     Text: biases. First, because we don't have a strong lexicon to sample the right data, because again,

1:20:27 - 1:20:32     Text: these biases are not in the words. And if you just sample from the whole Reddit corpus, it's not

1:20:32 - 1:20:37     Text: clear how to annotate it in a way that is feasible and not too expensive. But more importantly,

1:20:39 - 1:20:43     Text: every annotator will incorporate their own biases, so you actually need very well-trained

1:20:43 - 1:20:50     Text: annotators and multiple annotations per sample. So the question of how to create such a data set is

1:20:50 - 1:20:58     Text: very, very difficult. In our study, we collected data from a website called microaggressions.com

1:20:59 - 1:21:06     Text: that has self-reported microaggressions, where people actually recall experiences of

1:21:06 - 1:21:16     Text: microaggressions against them and quote them. And this is what we used to evaluate our approach.

1:21:16 - 1:21:23     Text: But data collection is as big a problem currently as the modeling itself.

1:21:23 - 1:21:32     Text: Do you want to ask any questions?

1:21:39 - 1:21:41     Text: Yeah.

1:21:44 - 1:21:46     Text: I don't have any other questions to ask.

1:21:46 - 1:21:54     Text: Oh, sorry. Okay, so I should go on.

1:21:55 - 1:21:58     Text: Will that pay for me? Yes.

1:21:58 - 1:22:05     Text: Yeah, thanks for the great lecture. It's a very appropriate topic for a guest lecture.

1:22:07 - 1:22:15     Text: So I took the course CS 182, which introduced many notions of fairness through case studies and

1:22:15 - 1:22:21     Text: assignments. I've been thinking a lot about how to use those notions. And of course there was

1:22:21 - 1:22:30     Text: the research done by Kleinberg, which showed that three different notions of fairness

1:22:30 - 1:22:37     Text: cannot be simultaneously satisfied: calibration, which is like the probability of the outcome given

1:22:37 - 1:22:43     Text: the risk score, the false positive rate, and the false negative rate cannot all be

1:22:43 - 1:22:49     Text: completely independent across protected traits. So past a certain point,

1:22:49 - 1:22:56     Text: you know, these metrics just become direct trade-offs. Is it the case that fairness becomes

1:22:56 - 1:23:03     Text: subjective after that? And I guess, more generally, you know, in ethics research have

1:23:03 - 1:23:09     Text: there been frameworks for creating sort of upper bounds or constraints among these different

1:23:09 - 1:23:13     Text: metrics, so we can sort of measure how close we get to the ideal?

1:23:15 - 1:23:21     Text: This is a very difficult question. So right, fairness research actually proves that you

1:23:21 - 1:23:34     Text: cannot satisfy all the measures of performance and inclusivity. And this is why they are measured

1:23:34 - 1:23:43     Text: separately: false positives, false negatives. And the question of whether, because of this

1:23:43 - 1:23:52     Text: issue, fairness becomes subjective, I guess the question here is even bigger,

1:23:52 - 1:24:02     Text: because the question of inclusivity competes with the question of monetization.

1:24:02 - 1:24:09     Text: If you think about who the main owners of data are and how they train algorithms, the goal is

1:24:09 - 1:24:15     Text: basically to have better monetization, like deciding who will see this advertisement. But there is a

1:24:15 - 1:24:25     Text: competing objective of inclusivity: will this advertisement reach all kinds of populations?

1:24:25 - 1:24:33     Text: And it's not only subjective, there is a kind of clear incentive, for example, in companies,

1:24:34 - 1:24:38     Text: to maximize monetization rather than inclusivity, right, because it's also internal.

1:24:40 - 1:24:46     Text: I don't have an easy answer to this. I agree, it can be subjective or can be

1:24:46 - 1:24:55     Text: more than subjective, because these objectives compete. So would you say it's sort of more

1:24:56 - 1:25:02     Text: that the field of ethics research overall is more interdisciplinary, and a lot of these answers to

1:25:03 - 1:25:07     Text: these questions are more context dependent. It's very context dependent.

1:25:08 - 1:25:11     Text: Right, it is very context dependent. As I mentioned, for example,

1:25:11 - 1:25:22     Text: the same application in different contexts can be used for good and for bad, right, and different

1:25:22 - 1:25:28     Text: thresholds on performance can be applied in different types of settings.

1:25:30 - 1:25:36     Text: And also, I really think I'm not qualified even to answer this question, right? We should ask

1:25:36 - 1:25:46     Text: maybe a philosopher or an expert in policy, right? Because eventually, I know how to build the

1:25:48 - 1:25:55     Text: tools and I'm trying to make technologies kind of more ethical, but the

1:25:55 - 1:26:01     Text: question of how they are deployed and what the specific decisions are, it's very difficult to

1:26:01 - 1:26:08     Text: control for and to give definite answers about. Well, here's another good

1:26:08 - 1:26:16     Text: question for you, and you can't use that cop-out of an answer. So you said earlier, you

1:26:16 - 1:26:22     Text: showed the example of AI gaydar, with the question of why we would want to study this. The authors

1:26:22 - 1:26:28     Text: justified it by claiming that, given the widespread use of facial recognition, our findings have critical

1:26:28 - 1:26:34     Text: implications for the protection of civil liberties. Given that some unscrupulous governments may

1:26:34 - 1:26:40     Text: indeed implement such technology to oppress minorities based on such things as orientation,

1:26:40 - 1:26:46     Text: do social scientists have an obligation to get ahead of this threat by understanding the

1:26:46 - 1:26:53     Text: properties of such models? How do we weigh the ethical trade-offs? Oh gosh, now I need to respond

1:26:53 - 1:27:02     Text: from the point of view of all social scientists. I don't want to answer philosophical questions,

1:27:02 - 1:27:13     Text: but I kind of have an answer, maybe a simple answer, to why I don't agree with the

1:27:13 - 1:27:19     Text: claim of the researchers that we needed to actually publish this paper to

1:27:19 - 1:27:27     Text: expose the dangers of this technology. The knife analogy that I gave is one

1:27:27 - 1:27:35     Text: of the answers. So if you think about a similar type of

1:27:36 - 1:27:46     Text: interaction in security, right, in cyber security: it's very common to kind of break the algorithm

1:27:46 - 1:27:53     Text: to show its vulnerabilities and then to iteratively fix it. So this is the approach that the researchers

1:27:53 - 1:28:00     Text: took: let's show the vulnerabilities, show that we are able to build this technology, to expose its threats.

1:28:00 - 1:28:11     Text: But unlike in the security field, here the exposure of this technology, the publication of

1:28:11 - 1:28:20     Text: this technology, can have real implications for human lives. So again, the cost of misclassification.

1:28:23 - 1:28:31     Text: And if you think about other, similar problems, I can give

1:28:31 - 1:28:37     Text: many other similar examples in which we could expose a technology in a way that would harm people.

1:28:37 - 1:28:47     Text: Like, let's create a deepfake video, a porn video with professors. We could do it, right, to expose

1:28:47 - 1:28:54     Text: the danger of deepfake technology. But what kind of harm would it bring to the specific people

1:28:54 - 1:29:01     Text: who were involved in this kind of exposure of the harm of this technology? So I have an answer

1:29:01 - 1:29:08     Text: for why it was wrong to publish this study in the first place, and why it's not productive,

1:29:08 - 1:29:14     Text: not helpful. But it's very difficult to answer the question of what social scientists should do. I don't

1:29:14 - 1:29:24     Text: know. Okay, well, I can ask the next question. I was just disappointed. Oh wait, I think, can

1:29:24 - 1:29:31     Text: you hear me now? Yeah. Okay, thank you so much for the talk. That's really interesting. And

1:29:31 - 1:29:36     Text: as we've just seen really challenging stuff. I guess my question is a little bit more practical.

1:29:36 - 1:29:44     Text: So maybe that's a reprieve for you. But, like, unfortunately it seems like in a lot of

1:29:45 - 1:29:51     Text: NLP and AI more broadly, some of this ethics and bias stuff is kind of an afterthought.

1:29:51 - 1:29:57     Text: A lot of projects don't really necessarily take it into account from the outset and it's more

1:29:57 - 1:30:06     Text: sort of incidental. So my question is like, you know, as we're working on NLP projects maybe even

1:30:06 - 1:30:11     Text: our final course project, what are some kind of concrete steps that we can take or like a

1:30:11 - 1:30:18     Text: systematic approach that we can use to sort of incorporate some of this ethics knowledge into

1:30:18 - 1:30:22     Text: things that might not explicitly seem like they have a lot to do with ethics?

1:30:27 - 1:30:34     Text: So it depends on the project, right? I cannot answer generally; the first part of

1:30:34 - 1:30:41     Text: the lecture was exactly about this: if I build my project, what kind of questions can I ask to

1:30:41 - 1:30:51     Text: know if there are some pitfalls? If it's a different kind of project, I think overall,

1:30:55 - 1:31:04     Text: these are important questions which are general for deep learning models, which could be later

1:31:04 - 1:31:12     Text: used to create better technology. So, how to incorporate an understanding of who the people are

1:31:14 - 1:31:20     Text: who produce the language, or who the users are; incorporating the right inductive biases; or

1:31:20 - 1:31:26     Text: technology for demoting confounds. It doesn't have to be specifically about ethics-related

1:31:26 - 1:31:35     Text: problems or the interpretability of deep learning. It doesn't have to be an ethics-in-NLP project,

1:31:35 - 1:31:41     Text: but if you can develop such technology, eventually it can be useful for

1:31:41 - 1:31:49     Text: building better models that proactively prevent unintended harms.

1:31:49 - 1:31:58     Text: So I guess the strategy would be sort of to just lay out these general topics,

1:31:59 - 1:32:05     Text: pose these questions to yourself, maybe write them down or just think about them and then proceed

1:32:05 - 1:32:12     Text: as such. Yeah, so think about it. Think about the potential: if this technology were

1:32:12 - 1:32:21     Text: deployed, what could be the corner cases? What is the potential for dual use? If it works, how can it be

1:32:21 - 1:32:31     Text: misused? And also, when it doesn't work, what kinds of errors can be harmful? So, if you go

1:32:31 - 1:32:37     Text: back in the slides to the beginning of the lecture, these questions could be applied to

1:32:37 - 1:32:45     Text: many kinds of technology, and they give clearer guidelines as to

1:32:45 - 1:32:52     Text: how bad outcomes could be prevented. I'm sure maybe I'm missing something, right?

1:32:53 - 1:33:01     Text: Much of this content I just came up with by reading a lot of different

1:33:01 - 1:33:07     Text: papers, but there are no clear guidelines. It's such a new field. So maybe there are things that I

1:33:07 - 1:33:17     Text: am also missing. Here's another question: are our models wrong for being biased? In the end,

1:33:17 - 1:33:22     Text: they just learn what they're designed to learn, and isn't our intervention to correct this behavior

1:33:22 - 1:33:33     Text: actually causing a bias? This is a good question. It is a question that I also have

1:33:33 - 1:33:40     Text: been thinking about: is a model wrong for, for example, accurately reflecting the real world,

1:33:40 - 1:33:48     Text: right? Do we need to actively de-bias models to make them fair when the world is not fair,

1:33:48 - 1:33:56     Text: when our data is not fair? So, first of all, the way we train models today, they don't only

1:33:56 - 1:34:06     Text: perpetuate biases, they amplify biases. This is a natural behavior of a machine learning model:

1:34:08 - 1:34:16     Text: basically, when you have an input example for which the confidence is lower, it will default

1:34:16 - 1:34:25     Text: to the majority class. This is why, if your data contains biases, these biases will be amplified

1:34:25 - 1:34:34     Text: in a machine learning model trained on this data. And this is clearly wrong.
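
As a toy, self-contained illustration of that amplification effect (not taken from any of the studies mentioned): a classifier that has no usable signal for an individual example and falls back to the majority class turns a 70/30 skew in the data into a 100/0 skew in its predictions.

```python
# Toy illustration: defaulting to the majority class amplifies a data skew.
from collections import Counter

def majority_baseline(labels):
    """The degenerate 'classifier' a model collapses to when individual
    examples give it no confident signal: always predict the majority label."""
    return Counter(labels).most_common(1)[0][0]

if __name__ == "__main__":
    training_labels = ["male"] * 70 + ["female"] * 30       # 70/30 skew in the data
    predicted = [majority_baseline(training_labels)] * 100  # "male" for everyone
    print("skew in the data:       ", Counter(training_labels))
    print("skew in the predictions:", Counter(predicted))   # 100/0: amplified
```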

1:34:36 - 1:34:43     Text: Whether it is wrong to build models that do not reflect the actual, true distribution in the data is a much more

1:34:43 - 1:34:54     Text: difficult question. But there are clear cases in which I would say it is

1:34:54 - 1:35:06     Text: wrong to build a model that, in the search for CEO, shows only male CEOs. Why?

1:35:07 - 1:35:14     Text: Because it amplifies biases. But yeah, this is already a subjective kind of answer. It's just

1:35:14 - 1:35:20     Text: my personal opinion, because it doesn't have much to do with the research that we are doing, right?

1:35:20 - 1:35:25     Text: So, would you like to ask your question?

1:35:27 - 1:35:33     Text: Oh yeah, one sec. Okay, I was trying to start my video but it's saying I can't, so I guess I'm

1:35:33 - 1:35:38     Text: just going to ask. By the way, I don't see people anyway. Well, you'd like to start

1:35:38 - 1:35:43     Text: your video, but yeah, anyway, let's just go. Yeah, I'll just ask. Oh yes, thank you so much for

1:35:43 - 1:35:50     Text: the talk. I have a question about microaggressions. So who is to decide what counts as a

1:35:50 - 1:35:55     Text: microaggression? Is it the people whom microaggressions are potentially targeted against? Is it

1:35:55 - 1:36:01     Text: philosophers or social scientists, or just people in education? And in case opinions differ,

1:36:01 - 1:36:06     Text: you know, do we just listen to the majority? It all seems culturally dependent to me, but very,

1:36:06 - 1:36:10     Text: very difficult to standardize and kind of reach consensus, especially because across time and

1:36:10 - 1:36:16     Text: culture it is perceived differently. Thank you very much for these questions. These are amazing

1:36:16 - 1:36:26     Text: questions. These are difficult questions. This is why we did not create our own corpus of microaggressions,

1:36:26 - 1:36:31     Text: right? Because it's culturally dependent; it's very personal, very subjective. This is why we

1:36:31 - 1:36:45     Text: focused on a corpus of perceived microaggressions, from people who actually felt that the

1:36:45 - 1:36:53     Text: interactions were negative, because they knew that these were microaggressions. As for

1:36:53 - 1:37:00     Text: who is to decide whether something is a microaggression...

1:37:07 - 1:37:17     Text: Yeah, I don't know. This is very difficult. What I can think about in terms of practical solutions,

1:37:17 - 1:37:27     Text: I would say, is to have very well-trained annotators who understand what a microaggression is. So kind

1:37:27 - 1:37:37     Text: of we explain what a microaggression is; they see many examples; they understand, for example,

1:37:37 - 1:37:47     Text: a sentence that targets a minority group, and other things; and then have many annotators

1:37:47 - 1:37:58     Text: per sentence. This is like, say, studies about other social concepts that are

1:37:58 - 1:38:07     Text: abstract. For example, in Dan Jurafsky's study on respect in police interactions, respect is also a

1:38:07 - 1:38:13     Text: subjective thing, right? So what they did, they took every utterance and they had multiple

1:38:13 - 1:38:20     Text: annotators, multiple trained annotators for each utterance and just increased the number of

1:38:20 - 1:38:30     Text: voters on whether an utterance is respectful or not. So, practically, I think this should be

1:38:31 - 1:38:37     Text: the procedure for creating a corpus of microaggressions. But the more philosophical question of

1:38:37 - 1:38:47     Text: who is to decide is a more difficult one.
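
A minimal sketch of the aggregation procedure just described: several trained annotators label each utterance and the corpus keeps the majority decision, requiring a minimum number of votes. The utterances, labels, and threshold below are hypothetical.

```python
# Toy aggregation of multiple trained annotators' judgments per utterance.
from collections import Counter

def aggregate(votes, min_votes=3):
    """Majority label if enough annotations exist and a strict majority agrees."""
    if len(votes) < min_votes:
        return None                     # not enough annotations yet
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else "no_consensus"

if __name__ == "__main__":
    annotations = {
        "You are so pretty for your age.": ["microaggression"] * 4 + ["not"],
        "Check out my new physics paper.": ["not", "not", "not"],
        "Why physics? You're so pretty.":  ["microaggression", "not"],
    }
    for utterance, votes in annotations.items():
        print(aggregate(votes), "|", utterance)
```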

1:38:47 - 1:38:51     Text: So there are still more questions, but you're allowed to say that you're worn out at any point, Julia. And if you're up for the next question...

1:38:51 - 1:38:57     Text: I feel bad about not being able to answer the big questions about society, but

1:38:59 - 1:39:02     Text: yeah, I'm happy to answer the next question.

1:39:02 - 1:39:08     Text: Here it is: can you talk a bit more about the unsupervised approach to identifying implicit bias?

1:39:08 - 1:39:20     Text: I can; I just need to think how to talk about it in a few words.

1:39:26 - 1:39:33     Text: So intuitively, we cannot create a corpus which has an utterance

1:39:33 - 1:39:43     Text: and then a label saying whether the sentence is biased or not. So we create a causal framework

1:39:44 - 1:39:52     Text: in which our target label is more objective: is the sentence directed to a man or

1:39:52 - 1:40:03     Text: to a woman? So our labels are gender labels. And in a naive way, given a sentence towards a person,

1:40:03 - 1:40:08     Text: without looking at the actual person and their comment, just from this comment towards the person,

1:40:08 - 1:40:18     Text: if we can predict that it's directed to a woman, we can say that there is some bias in the

1:40:18 - 1:40:28     Text: sentence. But it's a naive approach, because there are other ways in which we can predict the target

1:40:28 - 1:40:35     Text: gender that are not associated with bias: for example, the context of the

1:40:35 - 1:40:41     Text: conversation, the traits of the person that we are writing to, and so on. So the crux of our

1:40:41 - 1:40:48     Text: technology is that we predict gender but demote all kinds of confounds in the task of detecting

1:40:48 - 1:40:59     Text: bias. So we demote the signals of the source sentence; we demote the latent traits of the target

1:40:59 - 1:41:06     Text: person. And so we make this task of detecting who the target is, what the target gender is, very difficult.

1:41:06 - 1:41:13     Text: And if, after all these demotions of the confounds, given an utterance that is directed to a specific

1:41:13 - 1:41:20     Text: person, we can still classify this utterance as clearly directed to a woman, it is likely

1:41:20 - 1:41:25     Text: that this utterance contains bias. So what I was going to talk about is all kinds of demotion approaches

1:41:25 - 1:41:34     Text: that we developed. But once we demote these confounds and we can still predict, for an utterance,

1:41:35 - 1:41:42     Text: what the gender of the target is, we actually can surface some biased sentences. These

1:41:42 - 1:41:48     Text: are the main findings. For example, if we look at comments directed to politicians,

1:41:49 - 1:41:56     Text: after all these demotions, and we see comments that clearly predict the target gender,

1:41:56 - 1:42:03     Text: we can see that comments to politicians talk about their spouses, about their family and love, and also

1:42:03 - 1:42:10     Text: about their competence, maybe question their competence. And if we look at comments towards

1:42:10 - 1:42:16     Text: public figures like actresses, we can see a lot of words that are just related to

1:42:16 - 1:42:21     Text: objectification and sexualization, regardless of the source content. So they can talk about their

1:42:21 - 1:42:30     Text: movie, but the comments will always be about how she is sexy. And this is what our model is

1:42:30 - 1:42:35     Text: able to surface. But again, it's just an initial study; there is a lot of follow-up work.
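
To make the demotion idea above a bit more concrete, here is a rough sketch on synthetic data, using a simple linear projection to remove a confound before predicting the target gender. This is only an illustration of the general recipe, not the actual method or demotion approaches from the paper:

```python
# Rough sketch of confound demotion for surfacing bias, on synthetic data.
# X_text stands in for comment representations; "confound" stands in for
# signals we do NOT want the gender classifier to use (source post, traits of
# the target person, etc.). Utterances whose target gender is still confidently
# predictable after demotion are surfaced as potentially biased.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d = 1000, 20
X_text = rng.normal(size=(n, d))
confound = X_text[:, 1] + 0.1 * rng.normal(size=n)         # confound leaks into feature 1
gender = ((1.5 * confound + 2.0 * X_text[:, 0]              # feature 0 carries the "biased" signal
           + rng.normal(size=n)) > 0).astype(int)

# Demotion step: project out of the text features the direction that best
# predicts the confound, so the classifier cannot rely on it.
w = np.linalg.lstsq(X_text, confound, rcond=None)[0]
w /= np.linalg.norm(w)
X_demoted = X_text - np.outer(X_text @ w, w)

# Predict the target gender from the demoted features only.
clf = LogisticRegression(max_iter=1000).fit(X_demoted, gender)
confidence = clf.predict_proba(X_demoted).max(axis=1)

# Comments that remain confidently gender-predictable after demotion are the
# candidates for containing bias (here, the ones loading heavily on feature 0).
print("candidate biased comments:", np.argsort(-confidence)[:10])
```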

1:42:35 - 1:42:44     Text: Okay, so here's another one: if microaggressions are pulled from a site where people can list what

1:42:44 - 1:42:48     Text: they have experienced, isn't that data very vulnerable to social engineering?

1:42:56 - 1:43:01     Text: Yes, this data is vulnerable. So in our case, we

1:43:01 - 1:43:11     Text: anonymized this data. We extracted only the quotes from the data. We removed the actual users who published

1:43:11 - 1:43:21     Text: it. We removed all the text around these quotes. And this is a good question. Maybe we should also

1:43:21 - 1:43:29     Text: not even make this data public, yeah. Hey, there are more questions. Thank you.

1:43:29 - 1:43:38     Text: More questions. So, by the way, Chris and John, and I'm saying Chris and John because these are

1:43:38 - 1:43:45     Text: the two faces that I see on my screen: please let me know when we need to finish. I'm

1:43:45 - 1:43:49     Text: happy to continue answering. We've often gone on for a few more minutes. I can ask a couple more

1:43:49 - 1:43:58     Text: questions. So here's one that's very prominent in AI right at the moment. Do you think it is

1:43:58 - 1:44:04     Text: fair for AI scientists in tech and academia, who are definitely not representative of the general

1:44:04 - 1:44:11     Text: population, to decide what is biased and what is not? I.e., the act of debiasing itself might be biased.

1:44:13 - 1:44:20     Text: Yeah, this is one problem, and it is a more general problem: that researchers,

1:44:20 - 1:44:31     Text: even those who work on bias, can incorporate their own biases. We currently don't have any

1:44:31 - 1:44:38     Text: other alternative, right? We don't have training in how to do it. I think it's a good

1:44:38 - 1:44:45     Text: thing to work on these topics, to try to promote these topics as much as possible, with the

1:44:45 - 1:44:51     Text: awareness that we as researchers can incorporate our own biases. This is what we also

1:44:53 - 1:45:00     Text: write in the ethical implications sections of the paper: that we try to identify bias, we try to

1:45:00 - 1:45:07     Text: de-bias, but there are limitations to this study because we could incorporate our own biases

1:45:07 - 1:45:13     Text: into our analysis, right? This is how we interpret these results; maybe this is what we were looking

1:45:13 - 1:45:25     Text: for, and this is confirmation bias. Yeah. Okay, maybe we should just have one more,

1:45:26 - 1:45:29     Text: oh no, a new question just turned up. Maybe we'll have to do two more questions.

1:45:34 - 1:45:40     Text: Now, maybe I should do that one immediately because it directly relates to that answer,

1:45:40 - 1:45:47     Text: which was: how are the perspectives of community stakeholders,

1:45:47 - 1:45:51     Text: or people from minoritized groups, included when these systems are being built?

1:45:53 - 1:46:00     Text: This is a wonderful question too, yeah. Currently, not very well. Actually, we

1:46:00 - 1:46:06     Text: currently have a paper in submission with an analysis of how race has been

1:46:06 - 1:46:14     Text: treated in NLP systems, from data sets to models to potential users. And one of the things

1:46:14 - 1:46:23     Text: that we found is that even people who work on identifying racism don't actually involve

1:46:24 - 1:46:31     Text: in-group members. So yes, you identified yet another problem in the community: that the

1:46:31 - 1:46:39     Text: perspectives of the community are not often incorporated. In our position paper, we try to advocate

1:46:39 - 1:46:47     Text: for its importance. All these questions are very good, but maybe first

1:46:47 - 1:46:54     Text: somebody like Chris, who has a lot of influence, could make changes in the community. It's very

1:46:54 - 1:47:03     Text: difficult to make such changes. Yes, I'm hopeful that there's actually starting to be a bit of change

1:47:03 - 1:47:11     Text: right now. I mean, you know, like one can be pessimistic given the history and one can be

1:47:12 - 1:47:20     Text: pessimistic given the current statistics, but you know, I actually believe that, you know,

1:47:20 - 1:47:27     Text: through recent events like Black Lives Matter and other things, that there's actually just more

1:47:28 - 1:47:37     Text: genuine attempts to create change around, well, certainly the Stanford Computer Science Department,

1:47:37 - 1:47:44     Text: but I think more generally around the field of AI than there's been at any time in the past 30

1:47:44 - 1:47:50     Text: years when I've been watching it. Right, even when I did my postdoc, it was 2014 to 2016,

1:47:52 - 1:47:58     Text: we started working on the problem of gender bias and it was a total outlier. Like, I didn't know

1:47:58 - 1:48:04     Text: if what I was doing would be relevant to anyone. And now look, we just discussed this as a

1:48:05 - 1:48:10     Text: kind of relevant question. This is already an amazing change in the community. And if there will

1:48:10 - 1:48:15     Text: be more focus also on the right hiring, which clearly now has more awareness than ever...

1:48:17 - 1:48:21     Text: Right, I'm also more optimistic now than say three years ago.

1:48:23 - 1:48:28     Text: Okay, well, maybe we'll give you one more of these wonderful questions for which there are no good answers yet.

1:48:30 - 1:48:36     Text: Maybe you can do this as the last question, and just say something, though really

1:48:36 - 1:48:43     Text: I could listen to a lot more. What do you think the social and ethics space might look like, say, five to

1:48:43 - 1:48:49     Text: ten years down the line? Do you think the industry might come around to a unified standard for ethics

1:48:49 - 1:48:55     Text: for AI systems, given that a lot of the challenges come from the fact that social and ethics discussions

1:48:55 - 1:49:04     Text: are often subjective? Yeah, I'm also optimistic about the field of ethics in five years.

1:49:04 - 1:49:10     Text: These are difficult problems. The field of ethics, by the way, itself is 2,000 years old, right?

1:49:11 - 1:49:16     Text: Aristotle already asked these questions. Now we're asking these questions about AI, but

1:49:19 - 1:49:31     Text: given the current awareness and the bad publicity, I am hopeful. Currently, companies are the main

1:49:31 - 1:49:38     Text: players, right? Even more so than governments. And there is a big incentive for companies

1:49:38 - 1:49:48     Text: to fix things because of the bad publicity. For example, today I read an article saying

1:49:48 - 1:50:00     Text: that Google will stop the kind of advertising that tracks and profiles individual users.

1:50:03 - 1:50:08     Text: So overall, I do see a very positive trajectory. It's very difficult to predict what exactly

1:50:08 - 1:50:13     Text: things will be like in five years. I don't think all the problems will be resolved, but overall,

1:50:13 - 1:50:27     Text: I'm also optimistic that new policies will mean these decisions are not entirely in the hands of

1:50:27 - 1:50:35     Text: companies. And definitely about research, because I see how many students now are interested in

1:50:35 - 1:50:44     Text: these topics, which is totally amazing. Okay. And another comment that I want to make is that actually

1:50:44 - 1:50:51     Text: NLP is important in all this, which was much less the case before; the field of fairness, say, was very much focused

1:50:51 - 1:50:56     Text: on image recognition, but I think we will see more and more research on

1:50:56 - 1:51:05     Text: bias in language, which is also exciting. Okay. Well, maybe we should call it

1:51:05 - 1:51:26     Text: quits at that point. Thank you so much, Julia.