Hannah Sperling, SAP | WiDS 2022
>>Hey everyone. Welcome back to theCUBE's live coverage of the Women in Data Science Worldwide Conference, WiDS 2022. I'm Lisa Martin, coming to you from Stanford University at the Arrillaga Alumni Center. And I'm pleased to welcome my next guest. Hannah Sperling joins me, Business Process Intelligence, or BPI, Academic and Research Alliances at SAP. Hannah, welcome to the program. >>Hi, thank you so much for having me. >>So you just flew in from Germany. >>I did, last week. Yeah. Long way away. I'm very excited to be here. But before we get started, I would like to say that I feel very fortunate to be able to be here, and that my heart and wishes still go out to people that might be in more difficult situations right now. >>I agree. One of my favorite things about WiDS is the community that it's grown into. There's going to be about 100,000 people involved annually in WiDS, but you walk into the Arrillaga Alumni Center and you feel this energy from all the women here, from what Margot and team started seven years ago to what it has become. I happened to be listening to one of the panels this morning, and they were talking about something that's just so important for everyone to hear, not just women: the importance of mentors and sponsors, and being able to kind of build your own personal board of directors. Talk to me about some of the mentors that you've had in the past and some of the ones that you have at SAP now. >>Yeah. Thank you. That's actually a great starting point. So maybe I'll talk a bit about how I got involved in tech. SAP is a global software company, but I actually studied business and I was hired directly from university around four years ago. And that was to join SAP's analytics department. And I've always had a weird thing for databases, even when I was in my undergrad.
I did enjoy working with data, and so, working in analytics with those teams and some people mentoring me, I got into database modeling and eventually ventured even further into development. I was working in analytics development for a couple of years. And yeah, I still am with a global software provider now, which brought me to Women in Data Science, because now I'm also involved in research again, because, yeah, for some reason I couldn't get enough of that. Maybe to learn about the stuff that I didn't do in my undergrad. And post-grad now, researching at university, and, yeah, one big part in at least European data science efforts is the topic of sensitive data and data privacy considerations. And this is also a topic very close to my heart, because you can only manage what you measure, right? But if everybody is afraid to touch certain pieces of sensitive data, I think we might not get to where we want to be as fast as we possibly could. And so I've been really getting into data anonymization procedures, because I think if we could render workforce data usable, especially when it comes to increasing diversity in STEM or in technology jobs, we should really be letting the data speak. >>Letting the data speak. I like that. One of the things they were talking about this morning was the bias in data, the challenges that presents. And I've had some interesting conversations on theCUBE today about data in health care, data in transportation equity. If we think of International Women's Day, which is tomorrow, breaking the bias is the theme. Where do you think we are, from your perspective, on breaking the bias that's across all these different data sets? >>Right. So I guess as somebody working with data on a daily basis, I'm sometimes amazed at how many people still seem to think that data can be unbiased.
And this was actually touched upon also in the first keynote, which I very much enjoyed, talking about human-centered data science. People that believe that you can take the human factor out of any effort related to analysis are definitely on the wrong path. So I feel like the sooner we realize that we need to take into account certain biases that will definitely be there, because data is humanly generated, the closer we're going to get to something that represents reality better and might help us to change reality for the better as well, because we don't want to stick with the status quo. And any time you look at data, it's definitely going to be a backward-looking effort. So I think the first step is to be aware of that and not to strive for complete objectivity, but to understand and come to terms with the fact, just as it was mentioned in the equity panel, that that is logically impossible, right? >>You bring up a really important point. It's important to understand that that is not possible, but what can we work with? What is possible? What can we get to? Where do you think we are on the journey of being able to get there? >>I think that initiatives like WiDS are playing an important role in making that better and increasing that awareness. There's a big trend around explainability and interpretability in AI that you see, not just in Europe but worldwide, because I think the awareness around those topics is increasing. And that will then also show you the blind spots that you may still have, no matter how much you think about the context. One thing that we still need to get a lot better at, though, is including everybody in these types of projects, because otherwise you're always going to have a certain selection in terms of the perspectives that you're getting. >>Right. That thought diversity. There's so much value in thought diversity.
I think I first started talking about thought diversity at a WiDS conference a few years ago, and really understanding the impact that that can make to every industry. >>Totally. And I love this example of, I think it was a soap dispenser. It's one of these really early examples of how technology, if you don't watch out for these human-centered considerations, can go wrong and just perpetuate bias. So, a soap dispenser that would only recognize the hand when it was a certain light skin type that would be placed underneath it. It's simple examples like that that I think beautifully illustrate what we need to watch out for when we design automatic decision aids, for example, because anywhere you don't have a human checking what's ultimately decided upon, you might end up with much more grave examples. >>Right. I agree. Cecilia Aragon gave the talk this morning on human-centered AI. I was able to interview her a couple of weeks ago for WiDS, a very inspiring woman and an author herself, but she brought up a great point about it's the humans and the AI working together. You can't ditch the humans completely. To your point, there are things that will go wrong. I think that sends a good message, that it's not going to be AI taking jobs, but we have to have those two components working better. >>Yeah. And maybe to also refer to the panel discussion we heard on equity, I very much liked Professor Bowles's point, and how she emphasized that we're never going to get to this perfectly objective state.
And then also during that panel, a data scientist said that 80% of her work is still cleaning the data, most likely. Because I feel sometimes there is this almost mysticism around the role of a data scientist that sounds really catchy and cool, but there are so many different aspects of work in data science that I feel it's hard to put that all in a nutshell, narrowed down to one role. I think in the end, if you enjoy working with data, and maybe you can even combine that with a certain domain that you're particularly interested in, be it sustainability or urban planning, whatever, that is the perfect match. >>It is. And having that passion that goes along with that also can be very impactful. So you love data. You talked about that; you said you had a strange love for databases. Where do you want to go from where you are now? How much more deeply are you going to dive into the world of data? >>That's a good question, because I would, at this point, definitely not consider myself a data scientist, but I feel like, you know, taking baby steps, I'm maybe on a path to becoming one in the future. And so being at university again gives me the opportunity to dive back into certain courses, and I've done, you know, smaller data science projects. And I was actually amazed, and this was touched on in a panel as well earlier, at how outdated so many really frequently used data sets are in the realm of research, you know, AI and machine learning research, all these models that you feed with these super outdated data sets. And that's happened to me; it's something I can relate to. And then when you go down that path, you come back to the sort of data engineering path that I really enjoy. So I could see myself keeping on working on that, the whole data privacy and analytics, both topics that are very close to my heart, and I think they can be combined. They're not opposites.
That is something I would definitely stay true to. >>Data privacy is a really interesting topic. We're seeing so many, you know, GDPR is a few years old now, and we've got other countries, and states within the United States; for example, California has CCPA, which will become CPRA next year. And it's expanding the definition of what private, sensitive data is. So companies have to be sensitive to that, but it's a huge challenge to do so, because there's so much potential that can come from the data, yet we've got that personal aspect, that sensitive aspect, that they have to be aware of; otherwise there are huge fines. >>Totally. >>Where do you think we are with that in terms of kind of compliance? >>So I think in the past years we've seen quite a few rather shocking examples, in the United States for instance, where, yeah, personal data was used, or proxies, that led to detrimental outcomes. In Europe, thanks to the strong data regulations, I think we haven't had as many problems, but here the question remains: where do you draw the line? And how do you design this trade-off between increasing efficiency, making business applications better, for example in the case of SAP, while protecting the individual privacy rights of people? So I guess in one way SAP has an easier position, because we deal with business data. So anybody who doesn't want to care about the human element maybe would like to try building models on machine-generated data first. I mean, at least I would feel much more comfortable, because as soon as you look at personally identifiable data, you really need to watch out. There are, however, ways to make that happen. And I was touching upon these anonymization techniques that I think are going to be more and more important in the coming years. There is a proposal on the way by the European Commission.
And I was actually impressed by the sophistication of legislation in that area. And the plan is, for the future, to tie the rules around the use of data science to the specific objectives of the project. And I think that's the only way to go, because if the data's out there, it's going to be used, right? We've sort of learned that, and true anonymization might not even be possible because of the amount of data that's out there. So I think this approach of trying to limit the projects in terms of looking at what they want to achieve, not just for an individual company but also for us as a society, needs to play a much bigger role in any data-related projects. >>You said getting true anonymization isn't really feasible. Where are we, though, on the anonymization pathway, if you will? >>I mean, it's always the cost-benefit trade-off, right? Because if the question is not interesting enough, you're not going to allocate enough resources to trying to reverse engineer the tie to an individual, for example; sticking to this anonymization example, nobody's going to do it, right? We live in a world where there's data everywhere, so I feel like that's not going to be our problem. And that is where this approach of trying to look at the objectives of a project comes in, because sometimes maybe we're just lucky that it's not valuable enough to figure out certain details about our personal lives, so that nobody will try. Because I am sure that if data scientists tried hard enough, I wonder if there are challenges they wouldn't be able to solve. And there have been companies that have put out data sets that were supposedly anonymized, and then it wasn't actually that hard to make inferences. And in the panel on equity, one last thought about that:
We heard Jessica speak about construction, and how she was trying to use synthetic data because it's so hard to get the real data, and the challenge of getting the synthetic data to sort of mimic the true data. And the question came up of sensors in the household and so on. That is obviously a huge opportunity, but for me, as somebody who's very sensitive when it comes to privacy considerations, straight away I'm like, but what if we generate all this data and then somebody uses it for the wrong reasons, which might not be better urban planning for all different communities, but simple profit maximization, right? So this is something that's also very dear to my heart, and I'm definitely going to go down that path further. >>Well, Hannah, it's been great having you on the program. Congratulations on being a WiDS ambassador. I'm sure there's going to be a lot of great lessons and experiences that you'll take back to Germany from here. Thank you so much. We appreciate your time. For Hannah Sperling, I'm Lisa Martin. You're watching theCUBE's live coverage of the Women in Data Science conference 2022. Stick around. I'll be right back with my next guest.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Hannah | PERSON | 0.99+ |
Lisa Martin | PERSON | 0.99+ |
Cecilia Aragon | PERSON | 0.99+ |
Hannah Sperling | PERSON | 0.99+ |
Jessica | PERSON | 0.99+ |
Europe | LOCATION | 0.99+ |
Germany | LOCATION | 0.99+ |
80% | QUANTITY | 0.99+ |
United States | LOCATION | 0.99+ |
2020 | DATE | 0.99+ |
Bowles | PERSON | 0.99+ |
next year | DATE | 0.99+ |
today | DATE | 0.99+ |
seven years ago | DATE | 0.99+ |
first step | QUANTITY | 0.99+ |
one role | QUANTITY | 0.99+ |
SAP | ORGANIZATION | 0.99+ |
tomorrow | DATE | 0.99+ |
last week | DATE | 0.99+ |
first keynote | QUANTITY | 0.99+ |
European commission | ORGANIZATION | 0.98+ |
first | QUANTITY | 0.98+ |
two components | QUANTITY | 0.98+ |
One | QUANTITY | 0.97+ |
SAP HANA | TITLE | 0.97+ |
one | QUANTITY | 0.96+ |
this morning | DATE | 0.95+ |
around four years ago | DATE | 0.94+ |
both topics | QUANTITY | 0.94+ |
100,000 people | QUANTITY | 0.93+ |
four winds | QUANTITY | 0.93+ |
international women's day | EVENT | 0.91+ |
California | LOCATION | 0.9+ |
GDPR | TITLE | 0.89+ |
one way | QUANTITY | 0.88+ |
couple of weeks ago | DATE | 0.87+ |
few years ago | DATE | 0.87+ |
2022 | DATE | 0.86+ |
Stanford university | ORGANIZATION | 0.84+ |
European | OTHER | 0.82+ |
Arriaga | ORGANIZATION | 0.8+ |
CPRA | ORGANIZATION | 0.8+ |
Wood | PERSON | 0.78+ |
one thing | QUANTITY | 0.75+ |
one last | QUANTITY | 0.74+ |
one of | QUANTITY | 0.74+ |
QS | EVENT | 0.72+ |
CCPA | ORGANIZATION | 0.69+ |
years | DATE | 0.6+ |
Margo | PERSON | 0.6+ |
about | QUANTITY | 0.54+ |
years | QUANTITY | 0.52+ |
WiDS | EVENT | 0.47+ |
Wiz | ORGANIZATION | 0.39+ |