Behind the A.I.

On the borderline between Machine Learning and workers’ rights.

Raising awareness of the labor exploitation behind the new algorithms that are improving life (for a few).

with Milagros Miceli
April 2023 issue

Recently, YouTube’s autoplay brought me back to the official video of This Is America by Childish Gambino (one of the many personas of American actor and singer Donald Glover, known for Atlanta, Community, Guava Island, etc.). In one long shot, Gambino sings and dances surrounded by extras while, in the background, the police perpetrate abuses against African Americans. The video, which went viral as soon as it came out, breaking the Internet, is a metaphor for how mass media and the entertainment industry distract American society from the violence used against Black people. That was in 2018, before George Floyd’s death and the rise of the Black Lives Matter movement.

I kept thinking about that video even after my meeting with Milagros Miceli. A sociologist and computer scientist, an immigrant and a first-generation academic, she enlightened me on the reality behind the progress and comfort provided by the algorithms underlying A.I. A small (and more privileged) part of the world’s population enjoys the conveniences of Machine Learning software (ChatGPT is still a hot topic) without a clue about the hundreds of (literally) invisible workers who provide that privilege while having no rights, healthcare or salary.

But also: what about Big Tech, and the researchers and engineers who work in close contact with A.I.? What are their responsibilities? Is real progress possible without exploiting minorities and using the specter of hunger and poverty to blackmail them? That’s the focus of Miceli’s work, and what she is trying to raise awareness about with her research.

There is a lot of hype in the imaginaries of how we think of AI and what we believe that AI is able to do. But really, truly, what AI is as of now, it's labor: it's humans behind it powering the systems that we believe to be automated. So it's like a puppet that is controlled by humans, but it's just a puppet, still a puppet. Milagros Miceli, AI researcher

These days Artificial Intelligence is kind of a buzzword that has moved from technical, academic, and science fiction contexts to the public discourse, and part of your work specifically addresses a domain related to AI which is Machine Learning. What is Machine Learning and how is it used?

So, Machine Learning is a subset of AI, of what we call artificial intelligence. It's a way of doing AI. Basically, it's algorithms that learn from vast amounts of data, and they learn to either classify or predict or to do both. So they learn from patterns in the data and learn to classify what's what. Or they learn to predict what will happen, but in a very simple way. It is not that they can read the future. When we talk about prediction, we don't talk about machines learning to tell us what will happen in 100 years, but they learn from patterns in the sense of imitating behavior.

So these are very difficult words. Just to put it in simple words, or in a couple of examples: when we talk about computer vision, which is, again, a subset of machine learning, we talk about algorithms that have been trained on a bunch of pictures. Let's just put it simply like that. And they learn to classify. They learn to recognize objects or faces or different things around the world. So they have been trained, let's say, on different pictures of dogs, all kinds of images of different kinds of dogs. Also drawings, like, for example, my son drawing a dog, and what it can look like. So they learn the basic patterns of what a dog could look like in the real world. So if you have what is called a smart camera and you point it at the street when a dog passes by, hopefully that algorithm will be able to say: "Oh, that's a dog. I've seen that before. I've seen those patterns before. That's a dog." So that's classification.

When we talk about predictions, we can talk about technologies that have to do with language models. Like right now, we talk a lot about ChatGPT, and what that does is recognize patterns in the way we humans write or speak. So, for example, they can predict that if I say "Hello, my", then the next word would probably be "name", because that's a common formula: "Hello, my name is Milagros." So they learned those simple patterns, and they learned to use them in different situations and contexts.
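Editor's note: the "Hello, my name" example can be sketched in a few lines of Python. This is only a toy illustration of predicting the next word from patterns in text, assuming a simple word-pair count; it is not how ChatGPT or any real language model is built. The corpus and the function names `train_bigrams` and `predict_next` are invented for this sketch.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for every word, which words follow it in the training text."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().replace(",", "").split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Toy corpus, invented for illustration only.
corpus = [
    "Hello, my name is Milagros",
    "Hello, my name is Donald",
    "my dog is friendly",
]

model = train_bigrams(corpus)
print(predict_next(model, "my"))  # "name" follows "my" twice, "dog" only once
```

The sketch "predicts" only in the sense described in the interview: it repeats the pattern it has seen most often, and it has nothing to say about a word that never appeared in its training text.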

The term "Machine Learning" might give the impression that everything is done automatically but there is indeed a lot of labor involved, as your work highlights. Could you give us some examples of where the labor happens in data work and what this labor looks like?

Yeah, that's an excellent question. Yes, machine learning, or in general artificial intelligence, gives the impression that we are in front of something that is automated. And when we talk about machine learning algorithms, we often get the feeling that those algorithms have learned everything there is to be learned about the world and can recognize things that we humans cannot recognize, predict and see things that we are not able to see. But that is not true.

As it is right now, these algorithms, machine learning systems and in general AI, are not as smart as they want us to believe. And when I say they, I mean big tech. So there is a lot of hype around it. As you said, there is a lot of hype in the imaginaries of how we think of AI and what we believe that AI is able to do. But really, truly, what AI is as of now, it's labor: it's humans behind it powering the systems that we believe to be automated. So it's like a puppet that is controlled by humans, but it's just a puppet, still a puppet. So examples of the labor that goes into AI are many.

There are millions of people behind the AI systems that we know and the ones that we don't know. So one obvious example is tech workers in Silicon Valley working for Google, for OpenAI, or for whatever big tech company is right now developing the latest algorithm. These tech workers, the engineers, keep on controlling it, and the work of engineers is not something that is done once and then finished. It's not: "Okay, now I've created this automated system, now the system runs on its own, and I don't have anything to do anymore as an engineer." No, the work of those engineers continues: checking and improving and correcting.

That's very important, because what happens is that these systems, at first, just release misinformation or biased results or even harmful results or information, aggressive behavior, because they are mostly trained on data scraped from the Internet. And as we know, the Internet can be a very toxic place, especially social media. There are people all the time controlling what happens with these systems. But there is also another type of work, mostly invisibilized or not discussed as much as the work of engineers: the work of data workers. And those are workers who, first of all, participate in the training of these algorithms in the sense of collecting, or sometimes generating, the data that will feed these models.

They also participate in labeling the data because, as I said before, when we have computer vision systems that are able to recognize things and patterns in images, we need to give them a name for those patterns that they are recognizing. No machine learning algorithm is able to interpret the content of an image. They just recognize patterns, and they say: “Okay, this is a shape that I know, and this resembles something that I've seen before.” But if we don't give them a name for that, if we don't attach the label "dog" to that image, the system will never be able to give a name to that, to actually produce a classification that we understand.
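Editor's note: the role of labels can be illustrated with a small Python sketch. It assumes a nearest-example classifier and made-up pairs of numbers standing in for the patterns a vision system extracts from images; the data and the `classify` function are hypothetical and are not drawn from any real system. The point is only that the algorithm matches patterns, while the name "dog" has to come from a person who labeled the data.

```python
import math

# Hypothetical "feature vectors" standing in for patterns found in images.
# The labels are the part contributed by human data workers.
labeled_examples = [
    ((0.9, 0.8), "dog"),
    ((0.8, 0.9), "dog"),
    ((0.1, 0.2), "cat"),
    ((0.2, 0.1), "cat"),
]


def classify(features, examples):
    """Return the label attached to the closest training example."""
    nearest = min(examples, key=lambda ex: math.dist(features, ex[0]))
    return nearest[1]


new_image = (0.85, 0.75)

# With human labels, the match comes back with a name we understand.
print(classify(new_image, labeled_examples))  # dog

# Same patterns with the labels stripped: the closest match is still found,
# but the system has no name to give it.
unlabeled_examples = [(features, None) for features, _ in labeled_examples]
print(classify(new_image, unlabeled_examples))  # None
```

In both calls the pattern matching is identical; only the first produces a usable classification, because someone attached a label to each example beforehand.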

You can listen to the full interview with Milagros Miceli on Spotify, Google Podcasts, Amazon Music, or YouTube.
artificial intelligence, machine learning, workers rights, DAIR institute, large language models, participatory approaches

Milagros Miceli

A sociologist and computer scientist, she is interested in questions of meaning-making, knowledge production, and symbolic power encoded in ML data. Through ethnographic fieldwork, interviews, and participatory engagements with data annotators, collectors, and scientists all over the world, she investigates how ground-truth data for Machine Learning is produced, focusing on the labor conditions and power dynamics in data generation and labeling. She also leads the newly funded research group Data, Algorithmic Systems, and Ethics at the Weizenbaum-Institut and works as a researcher at the DAIR Institute.
