AI isn’t really intelligent; it is a system of trained responses. Trained is the key word here.
The gist of AI and deep learning is that you have a set of inputs and a set of outputs. The outputs are generally restricted; too many outputs and things get complicated. You take a sample set of inputs, feed it to the AI, and it guesses at what to do. If the guess is good, that decision, along with its inputs, is remembered. Random numbers are thrown in as well, and sometimes bad decisions are deliberately kept. Over time the AI makes better and better decisions.
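For readers who want something concrete, here is a minimal sketch of that trial-and-error loop. Everything in it is hypothetical: the toy inputs, the three outputs, and the reward rule are invented purely to illustrate guessing, remembering good decisions, and occasionally keeping a bad one.

```python
import random

INPUTS = list(range(10))                 # toy input set
OUTPUTS = ["left", "right", "stop"]      # small, restricted output set

def reward(x, action):
    # Hypothetical scoring rule: "right" is best for even inputs, "left" for odd.
    best = "right" if x % 2 == 0 else "left"
    return 1.0 if action == best else 0.0

policy = {x: random.choice(OUTPUTS) for x in INPUTS}     # initial random guesses

for episode in range(1000):
    x = random.choice(INPUTS)
    # A little randomness keeps the AI exploring instead of settling too early.
    guess = random.choice(OUTPUTS) if random.random() < 0.1 else policy[x]
    # Remember decisions that score at least as well; rarely keep a worse one anyway.
    if reward(x, guess) >= reward(x, policy[x]) or random.random() < 0.02:
        policy[x] = guess

print(policy)   # over many episodes the remembered decisions get better and better
```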
The problem is that AIs are goal driven. This means that when you set the goals the AI will make decisions that will cause it to reach those goals.
As an example, if your goal is to have an AI evaluate resumes to determine who is the best fit for the job you are offering, you need to provide it with a training set and a set of rewards.
As an example, in the video included, the rewards are based on distance traveled. The programmer changes the goals over time to get different results, but the basic reward is distance traveled. Other rewards could be considered. One such reward could be based on “smoothness”: the less the inputs change, the better the reward. This is sort of cheating, as we can guess that smooth driving will give better results overall.
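As a rough sketch of what those two reward ideas might look like in code (the function names and the use of a simple progress coordinate are my own illustration, not anything from the video):

```python
def distance_reward(progress):
    """Reward = total distance traveled, from a list of progress samples along the track."""
    return sum(abs(b - a) for a, b in zip(progress, progress[1:]))

def smoothness_reward(steering_inputs, scale=1.0):
    """Reward improves (toward zero) as the control inputs change less between steps."""
    change = sum(abs(b - a) for a, b in zip(steering_inputs, steering_inputs[1:]))
    return -scale * change
```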
I don’t do a lot of work with AIs; I’ve got experts that I call upon for that.
In the case of judging resumes, the AI is given rewards based on picking candidates that were successful by some metric. Let’s assume that the metric is “number of successfully resolved calls” or “number of positive feedback points on calls”. There are hundreds of different metrics that can be used to define “successful”, and those are used to create the feedback on what is a “good” choice.
The AI is then given the resumes. Those resumes might be pre-processed in some way, but for this discussion just consider them to be the full resumes.
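To make that pipeline a little more concrete, here is a hypothetical sketch of how past hires and a “success” metric could be turned into training labels. The field names and thresholds are invented for illustration; the real system’s metrics and preprocessing are not described here.

```python
def success_label(hire, resolved_threshold=100, feedback_threshold=50):
    """A past hire counts as a 'good' outcome if their metrics clear some threshold."""
    return int(hire["resolved_calls"] >= resolved_threshold
               or hire["positive_feedback"] >= feedback_threshold)

def build_training_set(past_hires):
    """Pair each (possibly pre-processed) resume with the outcome label for that hire."""
    return [(hire["resume_text"], success_label(hire)) for hire in past_hires]
```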
They did this, and after they got the AI trained they started feeding it new resumes. The AI consistently picked people who were not BIPOC. Yep, the AI became “racist”.
When this was discovered, the AI was discarded. Having a racist AI was taken as a sign that the programmers/developers who created it were racist themselves, that it was racism inherent in the system that caused the AI to be racist.
Reality is that the AI isn’t racist. It was just picking the resumes that best matched the resumes of “good” hires. This implies that there are characteristics associated with race that lead to better outcomes. It also implies that those characteristics remain in resumes that are stripped of identifying marks.
When I was hiring for a government contract, by the time I saw a resume all personally identifying marks had been removed. You could not know whether the applicant was male or female, white or black or purple. You couldn’t tell how old or how young they were.
Out of a set of 100 resumes, 10 would be female. Of those 100 resumes, no more than 20 would be forwarded to me for final evaluation. In general, the final 20 would contain more than 10% female candidates.
Those female candidates were rejected time after time, even though I had no way of knowing they were female. This was bad for the company, because we needed female hires to help with our Equal Opportunity Employment numbers. It didn’t seem to matter who was choosing or when the cut was made; there was some characteristic in their resumes that caused them to not make the final cut.
We did hire two females, but the question was: why were so many females rejected?
The AI is even worse, as it doesn’t care about race or sex. It cares about the predicted outcome. And for whatever reason, it was showing its bias.
In a paper that was blocked from publication by Google and led to Gebru’s termination, she and her co-authors forced the company to reckon with a hard-to-swallow truth: that there is no clear way to build complex AI systems trained on massive datasets in a safe and responsible way, and that they stand to amplify biases that harm marginalized people.
Perhaps the film’s greatest feat is linking all of these stories to highlight a systemic problem: it’s not just that the algorithms “don’t work,” it’s that they were built by the same mostly-male, mostly-white cadre of engineers, who took the oppressive models of the past and deployed them at scale. As author and mathematician Cathy O’Neill points out in the film, we can’t understand algorithms—or technology in general—without understanding the asymmetric power structure of those who write code versus those who have code imposed on them.
—‘Coded Bias’ Is the Most Important Film About AI You Can Watch Today
“I have a dream that my four little children will one day live in a nation where they will not be judged by the color of their skin, but by the content of their character.” — Martin Luther King, Jr.
We can’t judge people by the content of their character. We can’t judge people by their skills. We can’t judge people by their success.
Judging people by merit or ability causes certain groups to be “underrepresented”.
This is your problem. This is my problem. We just need to stop being judgemental and racist.
Maybe, at some point, those groups that are underrepresented will take it upon themselves to succeed in larger and larger numbers.
Comments
7 responses to “AI is biased because…”
You speak of AI programs as being “goal driven”. That’s an anthropomorphic image promoted by the people who play with that stuff. Similarly, the statement that over time AI will make better and better decisions (the more input is fed into it) is a pious hope not based in sound engineering reality.
The best way to describe AI is as an algorithm that accepts a collection of inputs (its “training data”) and as a consequence changes its transfer function. Furthermore, the manner in which it changes is not fully understood by the programmers who contrapted the algorithm, nor is it knowable in general whether the resulting behavior meets any particular specification or goal or desired result.
For example, an image recognition AI will change how it reacts to inputs (images presented to it) based on what body of previous images it has been fed. But how accurately it will recognize any given image, or how well it will classify a pile of images of various types, is unknown and unknowable. It may be true that “well, I fed it 1000 test images and it correctly identified 990 of them” but that doesn’t really tell you how well it will do with a different set of 1000, or with a set of 1000 scenes encountered in moving through a city street.
This is why I have no faith in self-driving cars — the whole concept is built from AI, and AI is inherently and incurably unfit for safety critical applications.
There do exist adaptive algorithms that behave in well defined ways, and these are very useful: adaptive equalizers and adaptive FIR filters are examples of this, very useful elements of communication systems. But those are not “AI” and have none of the inherent defects of the AI concept.
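For contrast with the opaque AI case, here is a minimal sketch of one of those well-defined adaptive algorithms, an LMS adaptive FIR filter. The parameter values and the toy two-tap system at the end are my own example, not anything from a real communications design, but the update rule itself is the standard LMS recursion.

```python
import numpy as np

def lms_filter(x, desired, num_taps=8, mu=0.01):
    """Adapt FIR tap weights so the filter output tracks the desired signal (LMS update)."""
    x = np.asarray(x, dtype=float)
    desired = np.asarray(desired, dtype=float)
    w = np.zeros(num_taps)
    out = np.zeros(len(x))
    for n in range(num_taps - 1, len(x)):
        window = x[n - num_taps + 1 : n + 1][::-1]   # newest sample first
        out[n] = w @ window                          # current filter output
        error = desired[n] - out[n]                  # deviation from the reference signal
        w += 2 * mu * error * window                 # the well-understood LMS weight update
    return w, out

# Toy check: identify a made-up two-tap system; the taps converge toward [0.6, 0.3, 0, ...].
rng = np.random.default_rng(1)
sig = rng.standard_normal(2000)
desired = 0.6 * sig + 0.3 * np.roll(sig, 1)
taps, _ = lms_filter(sig, desired)
print(np.round(taps, 2))
```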
My favorite example is a neural network (what we used to call what gets the label “AI” today) that was trained to identify tanks. Once they got past the training data and to verification, it was no better than flipping a coin. They went back to check the training data and realized all the “tank” photos had clouds in the background, and the “not tank” photos had mostly clear skies.
They had trained the computer to recognize clouds.
I’m not sure I’d use a neural network for a safety-critical application either. I don’t trust them. The words surrounding the entire field have warped in huge ways over time. The term AI was coined a long time ago. Today people in the field talk about “deep learning” and other terms of art.
The phrase “goal driven” is not applied to the AI itself. It means that the quality of an answer is based on how close the AI came to a given goal. The developers, at a human level, set a goal such as “reach the end of the race course” and then develop a scoring methodology to put a numeric value on how good a particular outcome is. If the developer chooses goals poorly, the results will not be what is expected.
As an example, if the goal is set as driving through as many “reward points” on the track as possible, rather than distance traveled, the AI might decide that it is better to go slower and pick up every reward point rather than skipping some reward points now in order to get more later.
As you say, and as @rob points out, bad training data leads to bad results. For any given situation an AI is given an input set. A subset of that input is picked as the training set, and the AI is trained with it. It is important that the training set be representative. Next, the AI is presented with the non-training set, and that is used to judge the quality of the AI’s results.
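A bare-bones version of that split might look something like this (a sketch only; real pipelines also worry about stratification so the held-out set stays representative):

```python
import random

def split_dataset(examples, train_fraction=0.8, seed=42):
    """Shuffle, then hold out part of the data so the AI is judged on examples it never trained on."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]   # (training set, held-out evaluation set)
```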
When you say that the developer doesn’t know how it works, this is not quite true. What the developer doesn’t know is the parameters that get the best results. With a well-trained AI we still don’t know what parameters give us the best results. We only know what parameters give us the best results so far.
Having used image recognition AIs, I have a good idea of what they can and can’t do. They actually do better in a moving environment because they can use past decisions to improve current detections. So the object detected in the last frame was identified as a car at a 62% confidence level. In this frame we identify it as a truck with a 75% confidence level. In the next frame it is at 78% truck, and in the next it is at 90%. As it turns the corner the confidence level starts to drop, but because the AI carries the previous identifications forward it continues to identify it as a truck, even though that particular image, in isolation, might only be identified as a vehicle or a car.
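One simple way to get that frame-to-frame behavior is to accumulate each label’s confidence over time, for example with an exponential moving average. The commenter doesn’t say which method their system used, so treat the following as an illustrative sketch, not a description of any particular product:

```python
def update_track(scores, frame_label, frame_confidence, alpha=0.3):
    """Blend one frame's detection into running per-label scores; return the current best label."""
    for label in scores:
        scores[label] *= (1 - alpha)          # older evidence fades slowly
    scores[frame_label] = scores.get(frame_label, 0.0) + alpha * frame_confidence
    return max(scores, key=scores.get)        # label the track by accumulated evidence

# A few frames like the ones described: "car 0.62", then "truck 0.75", "truck 0.78", "truck 0.90".
scores = {}
for label, confidence in [("car", 0.62), ("truck", 0.75), ("truck", 0.78), ("truck", 0.90)]:
    best = update_track(scores, label, confidence)
print(best)   # the track settles on "truck" even when a single frame alone is ambiguous
```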
My point in the article is that AIs are showing bias because there is something in the training set that is creating that bias. We are likely unwilling, for political reasons, to look for that something.
It might be simple. There are more successful purple people than green people in the training set. Most of the resumes from purple people mention that they belonged to the Purple People Eaters club; the green people never belong to that club. Thus the AI discovers a correlation between PPE membership and high-scoring resumes, chooses PPE members over non-members, and creates a bias toward picking purple people.
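Here is a toy demonstration of that proxy effect. All of the numbers are invented; the point is only that a model which never sees “purple” or “green”, just the club-membership feature, still ends up favoring purple people because the club correlates with the training labels.

```python
import random

random.seed(0)
training = []
for _ in range(5000):
    color = random.choice(["purple", "green"])
    in_ppe_club = (color == "purple" and random.random() < 0.8)   # the club is mostly purple
    good_hire = random.random() < (0.6 if in_ppe_club else 0.4)   # club membership correlates with outcome
    training.append((in_ppe_club, good_hire))

# "Training" here is just measuring the success rate for each value of the proxy feature.
rate = {}
for flag in (True, False):
    group = [good for member, good in training if member == flag]
    rate[flag] = sum(group) / len(group)

def pick(resume_mentions_ppe):
    """Pick whichever resumes look most like past 'good' hires."""
    return rate[resume_mentions_ppe] > 0.5

print(rate)                      # club members score higher on average
print(pick(True), pick(False))   # so the model favors them, and by proxy, purple people
```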
When I say that the developer doesn’t know how it works, I mean it in the sense that neither the developer nor anyone else can describe with rigor what the properties of the algorithm are.
An algorithm is a function that takes some input and produces an output. Given that computers are deterministic machines, ignoring time-dependent algorithms (which these are not, by and large), the function is deterministic: a given input produces a given output. For example, I can create an algorithm that accepts a number and produces its square root. A more precise specification would be: the square root, correctly rounded to the machine representation used.
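The square-root case shows what a checkable spec looks like. The integer version below is my own illustration of that point: the specification (“the largest r with r*r <= n”) is precise enough that a program can verify it input by input, which is exactly the kind of check the rest of this comment argues is missing for AI.

```python
import math

def meets_spec(n):
    """Spec: isqrt(n) must be the largest integer r such that r*r <= n."""
    r = math.isqrt(n)
    return r * r <= n < (r + 1) * (r + 1)

print(all(meets_spec(n) for n in range(100_000)))   # the spec can be checked mechanically
```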
Consider an AI image recognizer. Its input is a matrix of pixels; its output is an identification. (Perhaps a boolean, or a category code.) But what is the specification?
The key property of reliable algorithms is that you can demonstrate, with some rigor, that the implementation meets the spec. For things like square root, you can do this completely. For larger algorithms it gets a lot harder. But with AI, there is no precise spec, nor even a precise definition of the inputs, and no understanding of how the program processes inputs to produce its output other than vague religious handwaving. To pick Rob’s example, the vague spec is “if the picture is a tank, the function outputs True, else False”. But what does that mean, and even assuming you can answer that, how would you demonstrate that an AI program implements that function? It can’t be done and the methodology doesn’t even understand the notion, much less attempt to address the question.
Hmm.
I wonder if it correlates to culture vs skin color. Of course there is a correlation there also, but…
Nah. Can’t criticize culture either. Ah well.
My favorite observation on AI came from a physics prof I had in the late 80s, before they called it AI. He said you could train them to recognize a dog with enough pictures to choke a mainframe (you have to be old enough to appreciate that reference) and they would still mistake a stool for a dog. A dog never makes that mistake.
I took my first programming class back before distributed PCs and we had to go to a computer room to work on teletype terminals. There was a limerick someone printed out on the wall that sums it up perfectly.
I really hate this damned machine
I wish that they would sell it.
It never does do what I want
Only what I tell it.
That’s just as true of AI as of a BASIC program (kids, ask your parents). It only does what you tell it.
A buddy of mine always points out that it is about lowering standards, not racism/equality/equity/etc…
This example bears that assertion out. The AI evaluating resumes has only achievements to evaluate, nothing else, and curiously the BIPOC folks did not make the cut often enough. Of course, one could conclude that BIPOC folks are not qualified or capable, but I think there are other reasons for them not making the AI cut.