Like most people who are extremely online, Brazilian screenwriter Fernando Marés was fascinated by the images generated by the DALL·E mini artificial intelligence (AI) model. In recent weeks, the AI system has gone viral by creating images based on seemingly random and wacky prompts from users, such as “Lady Gaga as the Joker,” “Elon Musk being sued by a capybara,” and more.
Marés, a veteran hacktivist, started using DALL·E mini in early June. But instead of entering text for a specific request, he tried something different: he left the field blank. Fascinated by the seemingly random results, he ran the blank search over and over again. That was when Marés noticed something odd: almost every time he ran an empty request, DALL·E mini generated portraits of dark-skinned women wearing saris, a type of clothing common in South Asia.
To check whether it was just a coincidence, Marés queried DALL·E mini thousands of times with the command input left blank. Then he invited his friends to take turns on his computer, generating images on five browser tabs at once. He said the experiment went on for nearly 10 hours without interruption. He built a vast repository of over 5,000 unique images and shared 1.4 GB of raw DALL·E mini data with Rest of World.
Most of those images show dark-skinned women in saris. Why is DALL·E mini apparently obsessed with this very specific type of image? According to AI researchers, the answer may have something to do with sloppy labeling and incomplete datasets.
DALL·E mini was developed by AI artist Boris Dayma and inspired by DALL·E 2, an OpenAI program that generates hyper-realistic art and images from text prompts. From meditating cats to robot dinosaurs battling monster trucks in a colosseum, its images have blown people’s minds, and some have called it a threat to human illustrators. Recognizing the potential for misuse, OpenAI restricted access to its model to a select group of 400 researchers.
Dayma was fascinated by the art produced by DALL·E 2 and “wanted to have an open-source version that everyone could access and improve,” he told Rest of World. So he went ahead and built a stripped-down, open-source version of the model and called it DALL·E mini. He launched it in July 2021, and the model has been training and refining its results ever since.
DALL·E mini is now a viral internet phenomenon. The images it produces are not as sharp as DALL·E 2’s and show noticeable distortion and blurring, but the system’s wild renders, everything from the Demogorgon from Stranger Things holding a basketball to a public execution at Disney World, have spawned an entire subculture, with subreddits and Twitter accounts dedicated to curating its images. It has inspired a cartoon in the New Yorker magazine, and the Twitter handle Weird Dall-E Creations has over 730,000 followers. Dayma told Rest of World that the model fields about 5 million requests per day, and that he is currently working to keep pace with extreme growth in user interest. (DALL·E mini has no relation to OpenAI and, at OpenAI’s insistence, renamed its open-source model Craiyon as of June 20.)
Dayma admits he is puzzled as to why the system generates images of dark-skinned women in saris for empty requests, but he suspects it has something to do with the program’s dataset. “It’s quite interesting and I’m not sure why it happens,” Dayma told Rest of World after looking at the images. “It’s also possible that this type of image was heavily represented in the dataset, perhaps with short captions as well,” he said. Rest of World also reached out to OpenAI, the creator of DALL·E 2, to see if it had any insight, but has yet to hear back.
Artificial intelligence models such as DALL·E mini learn to draw an image by parsing millions of images from the internet along with their associated captions. The DALL·E mini model was developed on three major datasets: the Conceptual Captions dataset, which contains 3 million pairs of images and captions; Conceptual 12M, which contains 12 million pairs of images and captions; and OpenAI’s corpus of approximately 15 million images. Dayma and DALL·E mini co-creator Pedro Cuenca noted that their model was also trained on unfiltered data from the internet, which opens it up to unknown and unexplained biases in datasets that can trickle down to image generation models.
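Dayma’s “overrepresentation” hypothesis can be made concrete with a small sketch: counting keywords across a corpus’s captions is one crude way to spot content that dominates the training pairs. The captions below are invented for illustration; real datasets such as Conceptual Captions hold millions of image–caption pairs.

```python
from collections import Counter

# Hypothetical caption records, standing in for entries from a dataset
# like Conceptual Captions (pairs of image URLs and short captions).
captions = [
    "a woman in a red sari smiling",
    "a dog running on the beach",
    "a woman in a sari at a wedding",
]

# Tallying caption keywords surfaces which content is overrepresented
# in the corpus before any training happens.
word_counts = Counter(word for c in captions for word in c.lower().split())
print(word_counts["sari"])  # 2
```

In a real audit, the same tallying over millions of captions would reveal whether a motif like “sari” is unusually frequent relative to the rest of the corpus.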
Dayma isn’t alone in suspecting the underlying dataset and training model. In search of answers, Marés turned to the popular machine learning discussion forum Hugging Face, where DALL·E mini is hosted. There, the tech community weighed in, with some members repeatedly offering plausible explanations: the AI could have been trained on millions of images of South and Southeast Asian people that are “unlabeled” in the training data corpus. Dayma disputes this theory, arguing that no image in the dataset is without a caption.
Michael Cook, who currently researches the intersection of artificial intelligence, creativity, and game design at Queen Mary University of London, disputed the theory that the dataset included too many images of South Asian people. “Machine learning systems typically have the opposite problem: they don’t actually include enough photos of non-white people,” Cook said.
Cook has his own theory about DALL·E mini’s puzzling results. “One thing that occurred to me while reading around is that a lot of these datasets strip out text that isn’t English, and also strip out information about specific people, such as proper names,” Cook said.
“What we might be seeing is a weird side effect of some of these filters or preprocessing steps, where images of Indian women, for example, are less likely to be filtered out of the dataset, or the text describing the images is removed and they are added to the dataset with no labels attached.” For example, if the captions were in Hindi or another language, it is possible that the text gets mangled during data processing, leaving the image without a caption. “I can’t say for sure, it’s just a theory that came to my mind while exploring the data.”
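Cook’s filtering theory can be illustrated with a minimal, hypothetical sketch: a naive caption cleaner that drops non-English words (approximated here as non-ASCII) would strip a Hindi caption entirely, leaving its image effectively unlabeled in the dataset. The filenames and the filter are invented for illustration; this is not DALL·E mini’s actual pipeline.

```python
# Hypothetical image-caption pairs; the second caption is in Hindi.
pairs = [
    ("img_001.jpg", "a cat sleeping on a sofa"),
    ("img_002.jpg", "साड़ी पहने महिला"),  # "woman wearing a sari"
]

def clean_caption(text: str) -> str:
    # Naive filter: keep only ASCII words (a stand-in for an
    # English-only preprocessing step).
    kept = [w for w in text.split() if w.isascii()]
    return " ".join(kept)

cleaned = [(img, clean_caption(cap)) for img, cap in pairs]
# The Hindi caption is stripped entirely, so the image ends up
# in the dataset with an empty label.
print(cleaned[1])  # ('img_002.jpg', '')
```

The side effect Cook describes is exactly this: the image survives the filter, but its descriptive text does not, so certain kinds of images accumulate in the corpus with blank captions.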
Biases in AI systems are universal, and even well-funded Big Tech initiatives such as Microsoft’s Tay chatbot and Amazon’s AI recruiting tool have succumbed to the problem. Indeed, Google’s text-to-image generation model, Imagen, and OpenAI’s DALL·E 2 explicitly disclose that their models have the potential to recreate harmful biases and stereotypes, as does DALL·E mini.
Cook has been a vocal critic of what he sees as a growing callousness and rote disclosures that brush off bias as an inevitable part of emerging AI models. He told Rest of World that while it’s commendable that a new piece of technology is letting people have a lot of fun, “I think there are serious cultural and social issues with this technology that we don’t really appreciate.”
Dayma, the creator of DALL·E mini, admits the model is still a work in progress, and the extent of its biases has yet to be fully documented. “The model sparked a lot more interest than I expected,” Dayma told Rest of World. He wants the model to remain open source so that his team can study its limitations and biases faster. “I think it’s interesting for the public to be aware of what’s possible, so they can develop a critical mind toward the media they receive as images, in the same way they do toward the media they receive as news articles.”
Meanwhile, the mystery remains unsolved. “I’m learning a lot just by seeing how people use the model,” Dayma told Rest of World. “When it’s blank, it’s a gray area, so [I] still need to research it in more detail.”
Marés said it’s important for people to learn about the possible harms of seemingly fun AI systems like DALL·E mini. The fact that even Dayma cannot discern why the system spits out these images reinforces his concerns. “This is what the press and critics have [been] saying for years: that these things are unpredictable and they can’t control them.”