Lucy Bernholz, Stanford University | Stanford Women in Data Science (WiDS) Conference 2020
>> Announcer: Live from Stanford University, it's theCUBE, covering Stanford Women in Data Science 2020, brought to you by SiliconANGLE Media. (upbeat music)

>> Hi, and welcome to theCUBE. I'm your host, Sonia Tagare, and we're live at Stanford University covering the fifth annual WiDS Women in Data Science Conference. Joining us today is Lucy Bernholz, Senior Research Scholar at Stanford University. Lucy, welcome to theCUBE.

>> Thanks for having me.

>> So you've led the Digital Civil Society Lab at Stanford for the past 11 years. Tell us more about that.

>> Sure. The Digital Civil Society Lab actually exists because we don't think digital civil society exists yet. So let me take that apart for you. Civil society is that weird third space outside of markets and outside of government. It's where we associate together, where we as people get together and do things that help other people. It could be the nonprofit sector, it might be political action, it might be the eight of us just getting together and cleaning up a park or protesting something we don't like. That's civil society. But what's happened over the last 30 years is that everything we use to do that work has become dependent on digital systems, at every tier, from our phones to the infrastructure over which data is exchanged. That entire digital system is built by companies and surveilled by governments. So where do we as people get to go digitally? Where can we have a private conversation to say, "Hey, let's go meet downtown and protest x and y," or "Let's get together and create an alternative educational opportunity because we feel our kids are being overlooked"? All of that information that gets exchanged, all of that associating we might do in the digital world, is being watched. It's all being captured (laughs). And that's a problem, because both history and democratic theory show us that when there's no space for people to get together voluntarily, take collective action, and do that kind of thinking and planning and communicating just between the people they want involved, democracies fall. So the lab exists to try to recreate that space. To do that, we first have to recognize that the space is being closed in. Second, we have to make real technological progress: we need a whole set of different digital devices and norms. We need different kinds of organizations, and we need different laws. That's what the lab does.

>> And how does ethics play into that?

>> It's all about ethics. And it's a word I try to avoid, actually, because especially in the tech industry, and I'll be completely blunt here, it's an empty term. It means nothing; companies are using it to avoid being regulated. People are trying to talk about ethics, but they don't want to talk about values. But you can't do that. Ethics is a code of practice built on a set of articulated values. If you don't want to talk about values, you're not really having a conversation about ethics. You're not having a conversation about the choices you're going to make in a difficult situation. You're not having a conversation about whether one life is worth 5,000 lives, or whether everybody's lives are equal, or whether you should shift the playing field to account for the millennia of systemic and structural biases that have been built into our systems.
There's no conversation about ethics if you're not talking about those things. As long as we're just talking about "ethics," we're not talking about anything.

>> And you were actually on the ethics panel just now. So tell us a little bit about what you talked about and what some of the highlights were.

>> I think one of the key things about the ethics panel here at WiDS this morning was, first of all, that it started the day, which is a good sign. This shouldn't be a separate topic of discussion. This conversation about values, about what we're trying to build for, who we're trying to protect, and how we're trying to recognize individual human agency, has to be built in throughout data science. So it's a good start to have a panel about it at the beginning of the conference, but I'm hopeful that the rest of the conversation will not leave it behind. We talked about the fact that, just as civil society is now dependent on digital systems it doesn't control, data scientists are building data sets and algorithmic forms of analysis, and both of those things are just encoded sets of values. If you try to have that conversation at just the math level, you're going to miss the social level; you're going to miss the fact that it's humanity you're talking about. So it needs to be integrated throughout the process: talking about the values of what you're manipulating, and the values of the world that you're releasing these tools into.

>> And what are some key issues today regarding ethics and data science? And what are some solutions?

>> So this is the Women in Data Science Conference, which happens because five years ago the organizers realized, "Hey, women are really underrepresented in data science, and maybe we should do something about that." That's true across the board. It's great to see hundreds of women here and around the world participating in the live stream, right? But as women, we need to make sure that as you're thinking about the data and the algorithm, the data and the analysis, you're thinking about all of the people represented in that data set, all of the different kinds of people, all of the different languages, abilities, races, and ages, you name it, and that you understand those people in context. In your data set, they may look like just two different points of data. But in the world writ large, we know perfectly well that women of color face a different environment than white men, right? They don't walk through the world in the same way. And it's ridiculous to assume that your shopping algorithm isn't going to reflect that difference they experience in the real world in some way. It's fantasy to imagine it won't. So we need different kinds of people involved in creating the algorithms, and different kinds of people in power in the companies who can say, "We shouldn't build that, we shouldn't use it." We need a different set of teaching mechanisms where people are actually trained to consider from the beginning: what's the intended positive, what's the intended negative, and what are some likely negatives, and then decide how far to go down that path.
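(A minimal illustrative sketch in Python of that point about context, with entirely made-up names and numbers: slicing a data set one attribute at a time can understate the gap that an intersectional slice reveals.)

```python
# Toy illustration: an algorithm's approval rates can look merely
# uneven when sliced by one attribute, while the intersectional
# slice shows who actually bears the cost. All data is invented.

from collections import defaultdict

# Each record: (gender, race, approved_by_model)
records = [
    ("woman", "black", 0), ("woman", "black", 0), ("woman", "black", 1),
    ("woman", "white", 1), ("woman", "white", 1), ("woman", "white", 0),
    ("man",   "black", 1), ("man",   "black", 0), ("man",   "black", 1),
    ("man",   "white", 1), ("man",   "white", 1), ("man",   "white", 1),
]

def rate(group_key):
    """Approval rate for every subgroup defined by group_key."""
    totals, approved = defaultdict(int), defaultdict(int)
    for gender, race, ok in records:
        key = group_key(gender, race)
        totals[key] += 1
        approved[key] += ok
    return {k: round(approved[k] / totals[k], 2) for k in totals}

print("by gender:", rate(lambda g, r: g))       # woman 0.5, man 0.83
print("by race:  ", rate(lambda g, r: r))       # black 0.5, white 0.83
print("by both:  ", rate(lambda g, r: (g, r)))  # (woman, black) 0.33 ... (man, white) 1.0
```

Run on these toy numbers, each single-attribute slice shows a gap of about 0.33, while the intersectional slice shows a gap of 0.67 between (woman, black) and (man, white): the same two rows of data, understood in context.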
>> Right. And we actually had on Dr. Rumman Chowdhury from Accenture, who's really big in data ethics. She brought up the idea that just because we can doesn't mean that we should. So can you elaborate more on that?

>> Yeah. Just because we can analyze massive data sets and possibly build some kind of mathematical model that, based on a set of value statements, says this person is more likely to get this disease, or this person is more likely to excel in school, or this person is more likely to commit a crime... those are human experiences. Even in the best scenario, an analysis of large data sets might take into account the societal context those actual people are living in. Trying to extract that kind of analysis from its social setting is, first of all, absurd. Second of all, it's going to accelerate the existing systemic problems. So you've got to weigh that calculation: just because we could maybe do some things faster or with larger numbers, are the externalities that will be caused by doing it that way, the actual harm to living human beings, worth it? Or should those just be ignored so you can meet your shipping deadline? Because if we expanded our time horizon a little bit and looked at some of the big companies out there now, they're now facing those externalities, and they're doing everything they possibly can to pretend they didn't create them. That loop needs to be shortened, so that you actually sit down at some point in the process, before you release these things, and say: in the short term it might look like we'd make x profit, but spread out that time horizon to, I don't know, two x. And you face an election in the world's largest, longest-lasting stable democracy that people are losing faith in. Is that the right price for a single company to pay to meet its quarterly profit goals? I don't think so. So we need to reconnect those externalities back to the processes and the organizations that are causing those larger problems.

>> Because essentially, having externalities just means that your data is biased.

>> Data about people are biased, because people collect the data. This idea that there's some magic, debiased data set is science fiction. It doesn't exist. And it certainly doesn't exist for more than one purpose, right? Even if we could, and I don't think we can, debias a data set to create an algorithm to do A, that same data set is not going to be debiased for creating algorithm B. Humans are biased. Let's get past this idea that we can strip that bias out of human-created tools. What we're doing is embedding those biases in systems that accelerate them and expand them; they make them worse (laughs), right? They make them worse. So I'd spend a whole lot of time figuring out how to improve the systems and structures that we've already encoded with those biases, and use that to inform the data science. In my opinion, we're going about this backwards: we're building the biases into the data science, and then exporting those tools into biased systems. And guess what, the problems are getting worse. So let's stop doing that (laughs).

>> Thank you so much for your insight, Lucy. Thank you for being on theCUBE.

>> Oh, thanks for having me.

>> I'm Sonia Tagare, thanks for watching theCUBE. Stay tuned for more. (upbeat music)
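(A companion sketch, again with invented numbers, of the claim that a data set adjusted for one purpose is not thereby debiased for another: reweighting the rows so outcome A looks balanced across two groups leaves outcome B, computed on the same rows, skewed.)

```python
# Toy illustration of "debiased for A is not debiased for B".
# Every group label, outcome, and weight here is made up.

rows = [
    # (group, outcome_a, outcome_b)
    ("g1", 1, 1), ("g1", 1, 1), ("g1", 1, 0), ("g1", 0, 0),
    ("g2", 1, 0), ("g2", 0, 0), ("g2", 0, 0), ("g2", 0, 1),
]

def weighted_rate(outcome, weights):
    """Weighted positive rate of the given outcome column, per group."""
    out = {}
    for g in ("g1", "g2"):
        grp = [(r, w) for r, w in zip(rows, weights) if r[0] == g]
        out[g] = round(sum(r[outcome] * w for r, w in grp)
                       / sum(w for _, w in grp), 2)
    return out

uniform = [1.0] * len(rows)
print("A before:", weighted_rate(1, uniform))   # {'g1': 0.75, 'g2': 0.25}

# Reweight within each group so outcome A's rate is 0.5 everywhere,
# a crude stand-in for any "debiasing" scheme aimed only at outcome A.
w = {("g1", 1): 1.0, ("g1", 0): 3.0, ("g2", 1): 3.0, ("g2", 0): 1.0}
balanced = [w[(r[0], r[1])] for r in rows]

print("A after: ", weighted_rate(1, balanced))  # {'g1': 0.5, 'g2': 0.5}
print("B after: ", weighted_rate(2, balanced))  # {'g1': 0.33, 'g2': 0.17}, still skewed
```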