Written by Ben Zhao, Neubauer Professor of Computer Science at the University of Chicago
Since 2022, generative AI systems have made significant inroads into creative industries such as art, music and creative writing, areas long considered the exclusive domain of humans. In artistic imagery alone, human creatives have been displaced in significant numbers across fields ranging from graphic design and illustration to game design. Predictions of massive job loss have been borne out by repeated waves of layoffs across the entertainment industry in 2023 and 2024, many of them explicitly linked to the use of AI. Today, court cases and public discourse are debating the legality and ethics of training generative AI on copyrighted content without consent.
Generative AI models today use powerful machine learning algorithms to extract patterns from large volumes of popular content, to "learn" what is good art, what is good music, and what is compelling writing. If human tastes for art and creative content evolve over time, curated by stewards such as art critics and publishing editors, how do AI models do the same?
One answer might be that generative AI can also find new art styles or the next new genre of popular music, by scanning and filtering all possible genres of music and art. This answer assumes that the space of possible artistic styles is finite and searchable. However, in my experience developing and optimizing tools that explore and disrupt style mimicry, we have found that the number of distinctive styles in art and music is nearly infinite. How would generative AI find the next version of hip-hop, a musical genre that has transformed the music industry and influenced genres as disparate as country music? Many historians trace the origin of hip-hop to a mix of Black, Latino and Caribbean youth in the 1970s Bronx in New York, protesting and expressing their rage and pain during an economic downward spiral. How would future AI models find ways to identify and transform that human condition into music, so that it can connect with other humans sharing similar emotions and experiences?
Part of the reason is that appreciation of music and other artistic mediums is subjective, and fundamentally based on human tastes. For an AI model to understand and predict how humans do or do not appreciate a specific style, it would first have to understand human emotions. Contrast this with domains like software engineering, where success or failure is clearly defined, usually in a design document.
Taking this perspective, it is not hard to understand why current research predicts that AI models trained on their own output will eventually collapse. If each generation of a generative AI model is trying to approximate and mimic the complex human appreciation of an art form, then its output will be a facsimile with some amount of error. A model can reproduce the most popular styles with confidence, but far less so at the edges. With each iteration, the new model adds its own error on top of the previous generation's, moving further and further away from the ground truth of subjective human standards.
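To make the error-accumulation argument concrete, here is a minimal toy sketch (not from the article or its cited research): fit a simple statistical model to data, sample from the fit, retrain on those samples, and repeat. The Gaussian "ground truth", sample size, and number of generations are all illustrative assumptions; the point is only that each refit inherits the previous generation's sampling error and adds its own.

```python
# Toy illustration of recursive training on model output (a hypothetical setup,
# not the experiments referenced in the article). Each generation fits a Gaussian
# to the previous generation's samples, then generates the next training set.
import numpy as np

rng = np.random.default_rng(0)

# "Ground truth": the full, wide range of human-made styles, modeled as a Gaussian.
truth_mean, truth_std = 0.0, 1.0
data = rng.normal(truth_mean, truth_std, size=100)

for generation in range(1, 21):
    # "Train" the next model: estimate the distribution from current training data.
    mean, std = data.mean(), data.std()
    # That model's output becomes the next generation's training data.
    data = rng.normal(mean, std, size=100)
    print(f"gen {generation:2d}: mean={mean:+.3f}  std={std:.3f}")
```

In a typical run the estimated mean wanders and the spread drifts away from the true values, with the rare "edge" styles in the tails disappearing first; no generation ever gets back to the original distribution, because the ground truth is no longer in the training data.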
Perhaps this provides the most compelling reason why those building AI models must foster and protect human artists and creatives, regardless of how we feel about the ethics or legality of generative AI training. If we allow generative AI to destroy human creative industries, displacing jobs and discouraging aspiring artists, we are heading towards a future where art and music styles are fixed and static, and we are doomed to hear and see the same styles forever.