Using Generative AI as an Imaginative Aid
A post by Claire Anscomb
Recent advances in AI image generation have been said to assist the imaginative and creative powers of human users. As artist Sarah Meyohas has enthused, algorithmic models, like Generative Adversarial Networks (GANs), can generate “images one couldn’t have imagined” (quoted in Ploin et al. 2022, 42). Broadly speaking, GAN-style models, which date back to 2014, are trained on images, which may be pre-curated or assembled by the user themselves, to generate visual outputs that come from the same distribution as the training set (Elgammal et al. 2017, 5). Since the early 2020s, with the development of Text-to-Image (TTI) systems such as DALL-E and Midjourney, users have been able to generate images to visualise their ideas based on just a text prompt. Although this may aid a user who is unable to imagine how an image corresponding to their prompt could look, it does not necessarily involve or facilitate their creativity in producing novel and valuable work.
Some users may, without much thought for which TTI system they are using, supply a simple prompt and, due to the sophistication of these systems, generate impressively realistic or stylistic images. However, in such cases, the user has likely not contributed very much to shaping the features of the images. Moreover, the results of TTI systems can be quite repetitive, which is an inherent part of the design – they have been created so that they produce new images in the likeness of those that already exist. But these are not all images; they are just the images that are in the training data. Notably, this training data typically consists of billions of pairings of text and images scraped from the internet (Heaven 2022). Although generative AI systems can represent non-existent and existing subjects alike, they struggle if something is rare or missing from the training data. The biases and noise of these systems may then “dominate our own imagination” (Manovich 2023, 17). So, what prospects are there for using generative AI systems as imaginative aids in creative undertakings?
To address this question, we should consider the role of imagination in creative art-making more generally (Anscomb, forthcoming b). Typically, this has been framed in terms of the powers of the individual imagination, particularly in the form of mental visual images. This is exemplified by Vasari writing of Michelangelo in the 16th century that he “had such a distinctive and perfect imagination and the works he envisioned were of such a nature that he found it impossible to express such grandiose and awesome conceptions with his hands, and he often abandoned his works” (1998, 472). Although such mental visual imagery may be central for some artists, attention to the art-making process reveals that even “geniuses” like Michelangelo typically partake in a creative process that, in large part, consists of careful and deliberate thought (Stokes 2014). As Vasari went on to write of Michelangelo in the same passage: “just before his death he burned a large number of his own drawings, sketches and cartoons to prevent anyone from seeing the labours he endured or the ways he tested his genius, for fear that he might seem less than perfect” (1998, 472). Accordingly, Berys Gaut has proposed that active creativity seems more common and important in the arts – while a painter may suddenly “see” how their painting will look:
…much of the subsequent work will involve scrutinising the painting as it is being made, imagining how it could be improved by altering it in various ways, trying out these changes, observing the results, making more alterations, and so forth. And this process may take the painting far away from its original imagined look. (2003, 157)
So, while visual imagery conjured in the mind may drive some creative art-making, the role of imagination in this process seems to gain more significance from the release it allows from commitments to truth and action as artists consider how their work might be realised and progressed.
The freedom afforded by imagination is key to facilitating cognitive play, which Dustin Stokes has proposed may be “essential to creative art-making for the simple reason that creative things are, in part, new things” (2014, 158-9). Nevertheless, this does not necessarily mean that the imagination functions best when it runs completely wild. Stokes has highlighted how artists, working from within constraints that they impose on themselves, for example, can imagine how certain processes or formal features may achieve particular goals. That is, by use of the imagination, artists can manipulate the relevant practice-based information in ways that are not bound to representing it accurately, in order to achieve a creative output.
Concordantly, Amy Kind has proposed that the achievement of novelty in art can be understood in terms of imaginative constraints where the right balance is achieved between releasing and imposing constraints (2022). As Kind highlights, while creativity in the domain of art may seem to rely on transcendent uses of imagination, different artistic ends call for different degrees of constraints: “While some art strives for realism, some does not. An artist working on a series of realistic paintings set in New York will thus need to constrain their imaginings in a way that an artist working on a series of fantastical paintings set in the lost city of Atlantis will not” (2022, 16). Analogously, working with GAN-style models, artist Anna Ridler constrained the dataset to 10,000 photographs she took of tulips to create the work Mosaic Virus between 2018 and 2019, producing an evolving set of images of plausibly real tulips to interrogate ideas around capitalism, value and collapse. David Young, meanwhile, trained his system with a handful of images consisting of solid colours or basic shapes to explore the materiality of AI and create the abstract series Tabula Rasa.
Kind and Stokes are not alone in holding the view that constraints can aid imagination and enhance artistic creativity. One neat way of summing up how artists may navigate different types of constraints – such as those that are self-imposed, imposed from outside (by patrons, for example), or intrinsic to a specific technology or craft – comes from Jon Elster, who has proposed “that both choice of constraints and choice within constraints” is how artists try to maximise artistic value (2000, 178).
As indicated earlier, artists working with GAN-style models have plenty of opportunities to select and balance their constraints – including the models they select or build and the kinds of images that will be produced based on the choice of dataset – and to make decisions within the constraints that come with these technologies, from the size of the datasets used or constructed to the length of time the models are trained. To an extent, those working with TTI systems can also both choose their constraints, by deciding to work with DALL-E rather than Midjourney, for example, and make choices within the constraints of these models. For instance, Boris Eldagsen, a photographer by trade, has deployed his practical and linguistic knowledge of this domain to work within the constraints imposed by the prescribed datasets and produce an innovative body of generated images that employ the visual lexicon of photography to explore ideas pertaining to the representation of reality and false memories. Alongside inpainting and outpainting, Eldagsen used his knowledge of the domain to skilfully shape the way the TTI system would “interpret” his prompts and, as a result, produced “immensely evocative” images (Williams 2023) – one of which, Pseudomnesia: The Electrician, controversially won a Sony World Photography Award in 2023.
We have seen, then, two senses in which generative AI systems can be used as imaginative aids:
(a) to aid a user in visualising an idea;
(b) to aid a user’s creative process by assisting them in cognitive play (Anscomb, forthcoming a).
The latter may involve the former, but the former need not entail the latter. (a) is something that TTI systems can potentially do very well. There is also some scope for these systems to do (b). For example, results that come back that do not align with a user’s ideas may help them to isolate potentially valuable features and imagine a composition incorporating more of those features or produce alternative imaginative associations, and hence assist them in the cognitive manipulation that, as Stokes argues, is vital for creativity.
Those working with GAN-style models can certainly do (a) (although perhaps less reliably than those working with TTI systems) and have far greater potential for using these systems as per (b), given the wider range of choices that help artists to balance constraints. Furthermore, as Elgammal (2023) and others have highlighted, these models are more likely to return surprising results – often due to mislearning and failures – that can really spur on cognitive play.
In sum, generative AI systems can potentially be very valuable as imaginative aids. However, particularly considering the increasing dominance of TTI systems, if they are to assist the creative powers of human users, it is vital to facilitate their use in the active, iterated process of externalising and perceiving that seems key to the cognitive manipulation involved in making visual art and in making creative leaps in this domain.
References
Anscomb, C. (forthcoming a). Creating Images with Generative AI: An Imaginative Aid? In F. T. Moura (Ed.), Artificial Intelligence, Co-Creation & Creativity: The New Frontier for Innovation. Routledge.
Anscomb, C. (forthcoming b). Imagination, Creativity, and the Visual Arts. In A. Kind & J. Langkau (Eds.), Handbook of Philosophy of Imagination and Creativity. Oxford University Press.
Elgammal, A. (2023). Text-to-Image Generators Have Altered the Digital Art Landscape—But Killed Creativity. Here’s Why an Era of A.I. Art Is Over. Artnet News. https://news.artnet.com/opinion/ahmed-elgammal-op-ed-ai-art-is-over-2304028
Elgammal, A., et al. (2017). CAN: Creative Adversarial Networks and Generating “Art” by Learning About Styles and Deviating from Style Norms. In A. Goel, A. Jordanous, & A. Pease (Eds.), Proceedings of the 8th International Conference on Computational Creativity, ICCC 2017.
Elster, J. (2000). Ulysses Unbound: Studies in Rationality, Precommitment, and Constraints. Cambridge University Press.
Gaut, B. (2003). Creativity and Imagination. In B. Gaut & P. Livingston (Eds.), The Creation of Art: New Essays in Philosophical Aesthetics (pp. 148–173). Cambridge University Press.
Heaven, W. D. (2022). Generative AI is changing everything. But what’s left when the hype is gone? MIT Technology Review. https://www.technologyreview.com/2022/12/16/1065005/generative-ai-revolution-art/
Kind, A. (2022). Imagination and Creative Thinking. Cambridge University Press.
Manovich, L. (2023). AI Images and Generative Media. In E. Arielli, & L. Manovich (Eds.), Artificial Aesthetics: A Critical Guide to AI, Media and Design. http://manovich.net/index.php/projects/artificial-aesthetics
Ploin, A., Eynon, R., Hjorth, I., & Osborne, M. A. (2022). AI and the Arts: How Machine Learning is Changing Artistic Work. Oxford Internet Institute.
Stokes, D. (2014). The Role of Imagination in Creativity. In E. Paul & S. Kaufman (Eds.), The Philosophy of Creativity (pp. 157–184). Oxford University Press.
Vasari, G. (1998). The Lives of the Artists. Oxford University Press.
Williams, Z. (2023). ‘AI isn’t a threat’ – Boris Eldagsen, whose fake photo duped the Sony judges, hits back. The Guardian. https://www.theguardian.com/artanddesign/2023/apr/18/ai-threat-boris-eldagsen-fake-photo-duped-sony-judges-hits-back