1.6 – AI Has Eyes: Computer Vision

Can a fake mustache fool facial recognition technology? How is computer vision being used? Can we really be tracked everywhere we go? Jessica Chobot dissects the hype in the season finale.
Computer vision technology is already being used in a number of ways. Facial recognition is being used in lieu of boarding passes by some airlines. Facial biometrics can be used to unlock our devices, and even our doors. Conservationists are using the technology to identify and conserve rare species and promote biodiversity. Yet with this increasingly powerful technology comes increasing concern about other ways it may be used. Can we be tracked everywhere we go? What can be done to mitigate privacy loss, bias, and identity theft? In this episode of AI: Hype vs. Reality, host Jessica Chobot talks with experts to explore how this artificial intelligence technology works and how it’s being applied in schools, stadiums, retail and in the wild. Then she puts current-day tech to the test: what really happens when computer vision is used in combination with CCTV security cameras? See for yourself.

What you’ll hear in this episode:

  • Can a fake mustache fool facial recognition?
  • What is being done to prevent bad actors’ usage of computer vision technology?
  • What happens when six members of Congress get run through AI face-scanning software. Parsing through the nuances of representation bias.
  • People are biased. Data can be biased. Technology is not.
  • The role of employees and shareholders in holding developers accountable.
  • How to identify any animal, anywhere.
  • Saving sharks (doo-doo-doo-doo)
  • What makes zebras unique?
  • Can AI tell how you’re feeling just by looking at you?

Guest List

  • Dave Graham is the director of emerging technologies messaging at Dell Technologies and specializes in AI and social transformation.
  • Tanya Berger-Wolf is a Professor of Computer Science at the University of Illinois at Chicago, and the project leader of Wildbook, an AI program which automatically identifies individual animals by their unique coat patterns or other hallmark features, such as fluke or ear outlines.
  • Ali Farhadi is an Associate Professor in the Department of Computer Science and Engineering at the University of Washington. Farhadi also leads the Perceptual Reasoning and Interaction Research team at the Allen Institute for Artificial Intelligence.
  • Jason Goodman is an electrical engineering and computer science student at the University of California Berkeley, and is a part of a project called "Seeing Like an Algorithm," which highlights the underlying biases in facial recognition and facial analysis technologies, supported by the CITRIS Tech for Social Good Program at UC Berkeley.
  • Shaun Moore is the co-founder and CEO of computer vision company Trueface. Trueface is focused on identifying patterns like faces, objects, threats, age and emotion in real time video footage or pictures.

Jessica Chobot: I’m Jessica Chobot and this is AI Hype Vs. Reality, an original podcast from Dell Technologies. Facial recognition is grabbing all the headlines these days, but just how powerful is this tech and is there any way to hoodwink the AI? I’m here in Venice Beach, California to find out.

Jessica Chobot: Hi Jessica.

Shaun Moore: Shaun, nice to meet you.

Jessica Chobot: Nice to meet you. Shaun, what’s your role here?

Shaun Moore: I’m the co-founder of Trueface.

Jessica Chobot: What is Trueface?

Shaun Moore: Trueface is a computer vision company focused on identifying patterns in real time, in video footage or in pictures.

Jessica Chobot: What does that mean in layman’s terms?

Shaun Moore: It means we are ingesting video feed in real time and analyzing it for things like faces, objects, threats, age, emotion.

Jessica Chobot: We’ve all heard how powerful facial recognition AI can be, so I wanted to find out just how powerful it is, and if it could be fooled with something as simple as a mustache. Got a Ned Flanders mustache here, which is the smarty, or the rogue, which is I think a Magnum PI-esque kind of mustache, or the bandit.

Shaun Moore: Let’s go with the bandit.

Jessica Chobot: The bandit. All right. Let me get this. Wait. Ready?

Shaun Moore: Let’s check it out.

Jessica Chobot: Whoa. Just a second. Before we find out the results of that. Let’s sort the hype from the reality when it comes to AI and facial recognition. Quick and seamless security checks through airports, banks and major events. Any device including your car and your front door will be unlocked with facial biometrics. No more fingerprints, ID cards and best of all, no more passwords. Governments and corporations will be able to trace people no matter where they go. Of course, all of this will be happening any day now.

Jessica Chobot: To help me sift through all of that hype, I’m joined by Dell Technologies emerging tech guy, Dave Graham. Dave, when I go overseas, I have to go to those kiosks now and have them take a picture of my face and then take that up to a border patrol officer and they look and check everything and they still do fingerprints, but they are looking at my face, so some aspect of facial recognition is already in use.

Dave Graham: Absolutely. I think, yeah. What you’re seeing here is more on the front end of things, even domestically where JetBlue for example, and Delta, I believe, are two companies that have actually trialed. Instead of using a boarding pass, you now use your face and that’s freaked some people out because the understanding of where that data has come from is the challenge. How is this being set up? Who’s in control of that data? It leads into all kinds of privacy concerns and obviously those are pretty significant in this day and age. As those things relent, as people give up privacy in order for convenience to occur, you’ll start to see more of this.

Dave Graham: But, I think they’re controlled right now. They’re controlled experiments, social experiments really, in the airports to allow that type of thing to happen.

Jessica Chobot: Yeah. Based on what you just said, our discussion about privacy, that is actually the big issue that surrounds facial recognition. We hear lots of news stories that people can be spotted within minutes and then forever tracked all over the country. Is that proof that AI facial recognition is actually already super powerful?

Dave Graham: It’s an indication that facial recognition is being used. If you talk to researchers, there is no CSI-like “I’m going to track this person every single place that they go.” There certainly are projects that are being worked on to get there, but a lot of this is being used in non-real time. We are finding this person or finding them on tape. We’re going to resources that might or might not have captured this person we’re following around. It’s very much a point in time, not an “I can scan every face at every given second and every given moment.”

Jessica Chobot: Along those same lines, I also heard from Ali Farhadi. He’s a machine learning and computer vision professor at the University of Washington and he thinks a lot of these concerns are based on hype.

Ali Farhadi: The biggest challenge in face recognition, even with the state of the art, despite all the hype and noise about it, is that the minute that you actually start thinking about recognizing millions of faces from each other, most of this technology would fall apart. The key challenge is that scaling face recognition to a large number of faces is really hard. As for concerns along the lines that you’re going to stick a camera somewhere at the corner of a street that’s going to recognize all 400 million Americans that walk in front of that camera, I do not believe that technology exists today.

Ali Farhadi: Country-level, national, city-level face recognition in the wild, where you actually stick a camera at the corner of an intersection and you would recognize everybody who passes by that camera like we see in sci-fi movies, I do believe we are still years away from that.

Jessica Chobot: All right. I understand what Ali is saying, but even if he thinks we are still a ways off from super powerful facial recognition, it does sound like we are still headed in that direction. What is being done in this country to protect privacy?

Dave Graham: The biggest things being done to protect our privacy are really around legislation: how things are enacted at the state, local, and federal government levels. For example, just in the news recently, San Francisco banned the use of facial recognition technology within municipal government. That’s a step to actively prevent it being used in ways that are disrespectful of its constituents’ privacy. That does prevent certain activities from happening. It’s not a carte blanche though to say it’s never going to be used. It’s more of preventative medicine, in this case.

Dave Graham: I think you’re going to start to see that type of mentality carried over in different cities as things move further along. There was a study that took six active sitting members of Congress, ran them through Amazon’s facial recognition software, and all of them came back as criminals. Despite your political affiliation or leanings, it’s probably not the case that they’re criminals actually sitting in office there.

Jessica Chobot: Bouncing off of that, if AI has only been trained to use male Caucasian faces, does it and will it then have trouble recognizing darker-skinned people and also even women because it’s used to looking at men?

Dave Graham: One of the things you touched on is essentially representation bias. You’ve grabbed a small portion of a population, and you’ve attempted to use that to describe the whole, so what is happening here is mugshots are being used or criminal databases that are being used that are predisposed one way or the other. They don’t accurately represent the population that you’re supposed to be taking images from. You end up in a place where there is bias from the onset. Your data is biased, not maybe in a maladaptive or with bad volition. It’s just something that’s happened, part of your dataset.

Dave Graham: Same thing gender to gender, race to race. There has to be equal and fair representation in your data. Otherwise, you will always end up with your results being biased to one direction or the other.

Jessica Chobot: In terms of what bias in facial recognition actually looks like, excusing the pun, I heard from Jason Goodman. He’s a researcher at UC Berkeley working in the Tech for Social Good program.

Jason Goodman: One of the ways that facial recognition technologies are employed is that they’re used a lot by law enforcement agencies. When these technologies are potentially biased or inaccurate, they may perpetuate unfair systems such as mass incarceration that may unfairly harm marginalized communities. It’s an unfortunate fact that many of our prison systems today tend to have a large representation of black and Latino males. If it’s the case that these technologies are trained on datasets that also have an unfair representation of black and Latino males and in particular, if these datasets label those individuals as potential criminals, it could be the case that facial recognition technologies that try to identify criminals in our society may reflect that bias and reflect that disproportionate representation.

Jason Goodman: One potential way to overcome this problem is that employees and shareholders can hold accountable the developers of facial recognition technologies, so that they change the way that the technology works such that it’s not necessarily targeting these individuals.

Jessica Chobot: Taking the points that Jason Goodman is making, I don’t know if I can trust this.

Dave Graham: It comes down to with every program that we write, who’s watching the watchers, who’s holding people accountable. I agree with Jason that there is a level of accountability that needs to be taken, whether it be shareholder activism or again, legislation or people in public place, public spaces or public offices pushing back.

Jessica Chobot: Well, Dave, thanks again for the insight and the knowledge. I’m actually going to take that with me as I put facial recognition AI to the test. First, Dave, though, I have a question for you and you’re going to have to answer because it’s in your contracts. What do whale sharks, YouTube and computer vision have in common?

Dave Graham: I have absolutely no idea.

Jessica Chobot: Well, it has something to do with Wildbook and I’m actually going to find out all about it. I’ll see you next time, Dave.

Dave Graham: All right. Bye.

Jessica Chobot: Bye.

Tanya Berger-Wolf: I’m Tanya Berger-Wolf. I’m a professor of computer science at the University of Illinois at Chicago.

Jessica Chobot: Tanya is also the co-founder and director of a nonprofit group and their animal conservation project, Wildbook.

Tanya Berger-Wolf: We can take images and videos from different sources such as scientists, field assistants, camera traps, drones and autonomous vehicles on the water, ground and air, as well as tourists posting their vacation and safari pictures on social media and automatically find all the images that contain animals, where the animals are in those pictures and not only tell you what species they are, but also what animal, individual animals they are. Zippy the Zebra, Joe the Giraffe, Terry the Turtle and Willy the Whale.

Jessica Chobot: By sifting through millions of images and videos from around the world, Wildbook means that conservation scientists can get a much better idea of just how many of a certain animal there are and where they’re located, where they migrate, that kind of thing. Wildbook scans all of these images using computer vision technology and recognizing animals is actually a lot trickier than recognizing faces.

Tanya Berger-Wolf: The big difference is that in facial recognition for humans at least, we kind of sort of know what matters, what aspects of the face make it recognizable. We don’t really know what parts of a zebra’s pattern really make this zebra unique and different from all the other zebras. Where should we look on the zebra’s body? What’s also different from facial recognition is that most facial recognition technology is about matching the face that we see in an image to the ones that we already have in our reference data set. For a majority of the animals that we see out there, we only see them once or twice. From the very first time that we see an animal, we have to be able to tell you that this is a new individual.
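The difference Tanya describes, matching against a reference set versus recognizing that an individual is new, can be sketched in a few lines. This is only an illustration under stated assumptions, not Wildbook's actual method: it assumes each image has already been turned into a numeric feature vector, and the function names, the sample vectors, and the 0.9 threshold are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Similarity between two feature vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)  # assumes neither vector is all zeros

def identify(embedding, reference, threshold=0.9):
    """Return the best-matching known name, or None for a new individual.

    `reference` maps a name to a stored feature vector. The threshold is a
    hypothetical value; a real system would tune it on labelled data.
    """
    best_name, best_score = None, -1.0
    for name, ref in reference.items():
        score = cosine_similarity(embedding, ref)
        if score > best_score:
            best_name, best_score = name, score
    # Below the threshold, nothing in the reference set is close enough,
    # so the sighting is treated as an individual we have not seen before.
    return best_name if best_score >= threshold else None

# Made-up vectors standing in for features extracted from photos.
reference = {"Zippy": [0.9, 0.1, 0.2], "Joe": [0.1, 0.8, 0.3]}
known = identify([0.88, 0.12, 0.21], reference)   # close to Zippy's vector
unknown = identify([0.2, 0.2, 0.9], reference)    # close to nobody
print(known, unknown)
```

Returning None rather than the nearest name is the design choice that matters here: a plain nearest-match lookup would always answer with someone already in the database, which is exactly what fails for animals seen only once.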

Jessica Chobot: One example of Wildbook in action cataloging an elusive animal is with whale sharks.

Tanya Berger-Wolf: Every day, we’re scraping YouTube videos that have whale sharks in them. We automatically find the frames that contain the whale sharks, track and identify the individual, and use natural language processing to analyze the title and the text around the video to understand when and where the video was taken. We also have an intelligent agent interacting with the video poster, asking them for additional information that we need but that is not contained in the text. We’ve gone from a few hundred known individuals before Wildbook for whale sharks was around in its current form to now, when we just crossed the 10,000 mark. We now know more than 10,000 identified individuals from almost 60,000 sightings contributed by nearly 8,000 citizen scientists and volunteers.

Jessica Chobot: It’s not only amazing that this is being done, that Wildbook is able to scan the web and figure out how many whale sharks are out there based on vacation videos. It’s what they’re doing with that information that matters.

Tanya Berger-Wolf: Before the existence of Wildbook, whale shark global population numbers were estimated using genetic diversity, so that’s a very scientific way of saying we don’t know. You can’t manage what you can’t measure. These are endangered species, so we really need to know how many of them there are, and so now we have much better data. By combining together all these data, we can understand the migration patterns. The seasonality of migration patterns, the dispersal from nurseries to adulthood and where they’re going. We’ve discovered new populations in Madagascar.

Tanya Berger-Wolf: This is by combining the data, the efforts, the many eyes all over the globe to understand the picture of this global species.

Jessica Chobot: But, Tanya points out that it’s not all great news. In fact, Wildbook’s success could actually undermine its goals of animal conservation.

Tanya Berger-Wolf: There is a little bit of a danger here because what is gold for scientists and conservation managers is also highly useful information for poachers and wildlife criminals. One of the aspects of using artificial intelligence to enable this wonderful tool, bringing in data and extracting knowledge out of it, from images in this case, is that we have to be very careful in the process not to enable the extinction of the species we’re trying to protect.

Jessica Chobot: Still, that being said, the successes of Wildbook and its potential greatly outweigh the risk.

Tanya Berger-Wolf: The most recent WWF report on the status of the living organisms on our planet is showing that we’re losing biodiversity of this planet at an unprecedented rate. So, there is urgency in understanding what’s going on, how we can help, and how we can reverse those trends, and what policies should be put in place. But for that, we really, really need to have even basic data. Wildbook comes in to engage everybody in the process of conservation, not only by contributing data but also by engaging with the biodiversity of our planet, by engaging with conservation in a very personal way.

Jessica Chobot: All right. It’s time to take this out of the animal kingdom and bring it back to you and me. Well, specifically me and to my face. I’m heading off to Venice Beach to test the hype around facial recognition and I’m meeting with Shaun Moore, the co-founder of facial recognition company, Trueface to find out just how this tech works.

Shaun Moore: Our artificial intelligence is a self learning machine, so we feed it an abundance of data and like a face, like a bunch of human faces, and the model is trained to recognize faces then. The next time we see a face, we know it’s a face.

Jessica Chobot: Then as it scans more stuff, does it gain in intelligence and what it’s looking for?

Shaun Moore: The more commercial deployments or the more visibility it starts to understand with each instance, so it gets smarter as time goes on.

Jessica Chobot: Awesome. Then how does your AI handle things like potential bias? I mean race and gender.

Shaun Moore: Right. It’s important to know that the technology itself is not biased. What is biased is the data. If that data is biased, meaning it’s not proportionate to the population or the ethnic backgrounds, then you’re going to have an output that is also biased. We’ve partnered with companies around the world, through which we collect data from those underrepresented areas of the world to train our models on to ensure that we do not have bias.

Jessica Chobot: Awesome. All right. What are you going to show me?

Shaun Moore: First, I’m going to show you realtime facial recognition. As we walked in, this camera behind us was actually recognizing us. I’ve loaded us into the system. I took an image of you off the internet and I had one of myself, and just with that one reference image we’re able to positively identify you.

Jessica Chobot: Yeah. To emphasize that with just one photo of me randomly taken off the internet, Trueface was able to identify me as Jessica Chobot the moment I walked into the office.

Shaun Moore: Then, we take that and leverage the technology we built there to run an emotional analysis. What you’re seeing here now is an emotional analysis tool where depending on the way your face looks, you’ll see fear, surprise, happy, neutral, sad.

Jessica Chobot: Yeah.

Shaun Moore: Then, the graph on the left is a realtime reflection of your emotions.

Jessica Chobot: It keeps going to disgust a lot. I’m not sure what that means.

Shaun Moore: Where we see this technology really being used is at specific points. When you think about focus groups, you want to gauge reaction based on the first time you look at that product or they see a video footage.

Jessica Chobot: Interesting. As I’m sitting looking at the pie graph, both graphs actually, I do seem to be defaulting to fear a lot, but it’s not like 100% fear. It’s 75% fear or 30% fear. Then, it kind of breaks down from there giving a percentage to each emotion. How can I be 75% scared and also 12%, like how does that work?

Shaun Moore: It’s similar in the way that facial recognition works. We take an abundance of emotion data, so people that are happy, people that are sad, that are fearful. We basically create a template for what those emotions look like from just a little happy to very happy. That’s where you’re getting the percentages from. Because you have 20% fear doesn’t mean that you are actually fearful. You could have 80% happy there, so happy would be the winning emotion, if you will.
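One common way to get a percentage for every emotion at once, with a single "winning" emotion, is to turn a classifier's raw scores into a probability distribution with a softmax and then pick the largest value. The sketch below assumes that approach purely for illustration; the transcript does not say how Trueface computes its percentages, and the emotion labels and raw scores here are made up.

```python
import math

def softmax(scores):
    """Convert raw scores into positive values that sum to 1.0."""
    highest = max(scores.values())
    # Subtracting the maximum keeps exp() numerically stable.
    exps = {label: math.exp(value - highest) for label, value in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

# Hypothetical raw scores for one video frame.
raw_scores = {"happy": 2.0, "fear": 0.6, "neutral": 0.1, "sad": -1.0}

percentages = softmax(raw_scores)
winner = max(percentages, key=percentages.get)

for label, share in percentages.items():
    print(f"{label}: {share:.0%}")
print("winning emotion:", winner)
```

Every label receives a nonzero share, which is why a face can show some fear on the graph while happy is still the top result.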

Jessica Chobot: Got It. All right. I’m going to test it out. I’m going to try and throw it some curve balls.

Shaun Moore: Okay.

Jessica Chobot: Like a smize, like Tyra Banks smizing where she’s not really angry. She’s trying to be sultry. I noticed you don’t have sultry as an option, so let’s see what …

Shaun Moore: Not yet.

Jessica Chobot: Sultry. Fear.

Shaun Moore: It’s at 35% fear though, so it’s definitely trying to understand.

Jessica Chobot: It’s like something else is happening here. She’s squeezing her eyes shut, but I’m not feeling the fear off of her so much as some other emotion. Then, how does the AI learn?

Shaun Moore: We’re taking in hundreds of thousands, millions of examples of what happy looks like, what sad looks like, what fearful, what surprised. We’re teaching it to learn those emotions. We’re providing an abundance of data to the artificial intelligence, which is learning patterns in that data. These people with mouths that lift up are happy; eyes up are fearful or surprised. It’s just learning with time based on reference data.

Jessica Chobot: Okay. All right. What am I looking at here?

Shaun Moore: This is our pose recognition, and it’s measuring your eye movement. When you think about attention while driving, so if we see you look down, we’ll know you’re looking down. As you can see, we’re mapping out your ears, your eyes, and your nose so we know in which direction you were looking. That’s great for the automotive industry.

Jessica Chobot: I was going to say the self driving cars hitch, yeah.

Shaun Moore: Right. Then the pose detection is something that we’re deploying in schools right now to detect fights, also used in retail to see when people are reaching for objects. You can see there as you move your body around, we’re measuring the different movements of your physical body.

Jessica Chobot: Well, how does it know that it’s a punch and not a … Because it’s not looking at necessarily at the fist that I’m making. It’s looking at my wrist and my elbow and my shoulder.

Shaun Moore: Right. Similar to emotion or face recognition, we’ve shown artificial intelligence or our machine learning what fights look like, and so mathematically we understand what a punch looks like versus what something grabbing looks like.

Jessica Chobot: What about like horsing around? Like two friends are kind of pretend fighting, but they’re not really fighting, how does it differentiate between horsing around and an actual fight?

Shaun Moore: It goes back to Trueface’s principles of humanity first. Again, it’s meant to inform. It’s not meant to make a decision. We’re not telling you that there’s a fight. We’re saying this is an anomaly in typical behavior, two students are coming together and it looks like they may be fighting. Please investigate. Then, pose is also used to see if someone’s falling over, if you think about elderly home care. If an individual falls over, we can see them actually falling over if this technology is running on a camera inside the house.

Jessica Chobot: Interesting. I see that we mentioned it before, that it’s recognizing inanimate objects, like the bottles and things like that. What’s the purpose of that? If it’s supposed to be following the pose of the person, why is it also looking at inanimate objects?

Shaun Moore: There’s multiple uses for object recognition. One is we track the dwell time of objects. If a bag is left at an airport, we know that that bag has not moved for 32 minutes, 33 minutes, whatever that number is.
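Dwell-time tracking of the kind Shaun mentions can be sketched as remembering where and when each object was last seen to move. This is a minimal illustration, not Trueface's implementation: it assumes an upstream detector already supplies a stable object ID and a pixel position per frame, and the class name, the 5-pixel tolerance, and the sample coordinates are hypothetical.

```python
class DwellTracker:
    """Tracks how long each detected object has stayed in one place."""

    def __init__(self, move_tolerance=5.0):
        # Movement smaller than this (in pixels) counts as "not moved".
        self.move_tolerance = move_tolerance
        self._anchor = {}  # object_id -> (x, y, time it settled there)

    def update(self, object_id, x, y, timestamp):
        """Record a sighting; return seconds the object has stayed put."""
        previous = self._anchor.get(object_id)
        if previous is not None:
            px, py, since = previous
            still = (abs(x - px) <= self.move_tolerance
                     and abs(y - py) <= self.move_tolerance)
            if still:
                return timestamp - since
        # First sighting, or the object moved: restart the clock here.
        self._anchor[object_id] = (x, y, timestamp)
        return 0.0

tracker = DwellTracker()
first = tracker.update("bag-1", 100, 200, timestamp=0)
later = tracker.update("bag-1", 101, 199, timestamp=32 * 60)  # 32 minutes on
moved = tracker.update("bag-1", 300, 200, timestamp=33 * 60)  # bag picked up
print(first, later, moved)
```

A caller could compare the returned value with a limit of its choosing and raise an alert for a person to check, in line with the inform-rather-than-decide approach described earlier in the conversation.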

Jessica Chobot: All right. I got some stuff that I brought from my house that I just had lying about.

Shaun Moore: Everyday items.

Jessica Chobot: I want to try and see if we can’t trick your AI. Let’s go with this horrific thing. It’s kind of a combination of just like some dude’s semi bald head and also the clown from It. Okay. Here we go. I don’t have any of my face covered up except for this ginormous forehead that I now have and fake red clown hair.

Shaun Moore: So that we are able to correctly identify you.

Jessica Chobot: It knows. Yeah. I guess that would make sense because major, all my …

Shaun Moore: You’re not fundamentally changing the structure of your face. It’s just hair.

Jessica Chobot: It’s an unfortunate accident with the clippers.

Shaun Moore: Exactly. Hair, sunglasses, hats won’t do enough to trick the system.

Jessica Chobot: Okay. This, I’m kind of 50-50 on guessing whether or not this would work because they are fake mustaches. You got a Ned Flanders mustache here, which is the smarty, or the rogue, which is I think a Magnum PI-esque kind of mustache or the bandit. Which one? I’m going to let you choose. Which one should we go with here?

Shaun Moore: Let’s go with the bandit.

Jessica Chobot: The bandit. I wonder if the bandit would also affect the emotions.

Shaun Moore: It could.

Jessica Chobot: Yeah. Ready?

Shaun Moore: Let’s check it out. Did not trick it.

Jessica Chobot: Really? What? It just looks … You know what because it’s beige. It doesn’t even look like I have a mustache on. It’s perfectly matching my skin tone. That’s a bad choice. All right. Let’s pick a different one. I am actually wearing a Groucho Marx glasses, fake nose and mustache combo right now.

Shaun Moore: As you can see here, we’re identifying it’s a face, but we don’t know who it is, so structurally changing the dimensions of your face and blocking the others will prevent us from properly identifying you.

Jessica Chobot: I mean, this is like a $0.99 disguise. I mean, have we foiled your technique?

Shaun Moore: No, you’ve not. You got to remember the environment which this is being deployed. You wearing that disguise will be far more alarming than anything else.

Jessica Chobot: That’s a really good point. Okay. Neat. I like this. All right. Hair doesn’t matter. It’s all just facial structure. Cool. There’s no way to hide. Shaun, thanks so much for chatting with me and giving me a rundown on the demos and putting up with my antics.

Shaun Moore: Yeah. Thank you for coming.

Jessica Chobot: Thanks.

Shaun Moore: Very nice to meet you.

Jessica Chobot: Nice to meet you too. After testing facial recognition AI and hearing from a variety of experts, is the hype justified? Will facial recognition technology replace all forms of passwords or even our house keys and will it track us everywhere we go? Well first, there’s a lot to figure out when it comes to privacy. But that thorny issue aside, while facial recognition is already powerful today and getting more accurate, it’s not quite at the stage that it can find anyone under any condition. If you want to see all the crazy antics that went down when I was trying to trick the facial recognition AI, make sure to check out delltechnologies.com/hypeVreality.

Jessica Chobot: Now, this is the last episode of AI hype versus reality from Dell Technologies. I’ve had so much fun working on the show, and I’d love to know what you think. The best way to do that is to leave a review. Thanks so much for listening. Until next time.