Machines who Imagine

March 20, 2019 Sridhar Mahadevan

Sridhar Mahadevan has conducted research in artificial intelligence and machine learning for over three decades. He is currently a Research Professor at the College of Information and Computer Sciences at the University of Massachusetts, Amherst, and Director of the Data Science Laboratory at Adobe Research in San Jose. He has published over 150 peer-reviewed publications in AI, and was elected Fellow of the Association for the Advancement of AI (AAAI) in 2014, in recognition of his significant contributions to machine learning.

A post by Sridhar Mahadevan.

“Imagination is more important than knowledge. Knowledge is limited. Imagination encircles the world.” – Albert Einstein

“Imagine there’s no countries. It isn’t hard to do. Nothing to kill or die for. And no religion too.” – Song by John Lennon

We discuss a fundamental challenge for artificial intelligence (AI) enabled systems: can machines imagine? According to the Stanford Encyclopedia of Philosophy, “to imagine is to represent without aiming at things as they actually, presently, and subjectively are… to represent possibilities other than the actual, to represent times other than the present, and to represent perspectives other than one’s own.”[1] Art is perhaps the paradigmatic example of human imagination. Figure 1 shows an untitled painting by Jean-Michel Basquiat that sold at a recent auction in New York City for over $100 million.

The scope of imagination in human society goes far beyond art: numerous examples can be given to illustrate that human achievements in the sciences, technology, literature, sculpture, poetry, religion, and beyond, depend fundamentally on our ability to imagine. The importance of imagination to humans naturally raises the question of whether intelligent machines can be endowed with similar abilities.

Figure 1: An untitled painting by Jean-Michel Basquiat sold at an auction for over $100 million.

Much of the recent success of AI comes from a revolution in data science, specifically the use of deep learning neural networks to extract hidden patterns in data (Goodfellow, Bengio, and Courville 2016). Deep learning networks are able to drive cars autonomously, recognize a large variety of common objects from their visual appearance, and respond to human speech using devices such as Amazon’s Alexa or Apple’s Siri. Data science is possibly the fastest growing area in AI currently. A recent study by LinkedIn projects that the United States is facing a shortfall of over 150,000 data scientists, a number that may grow as demand outstrips the supply. Yet, for those of us who have worked in AI for three decades, data science is but a way station on the road towards designing machines that truly capture human intelligence. Imagination is perhaps foremost among the abilities that still separate machines from humans. We explain below where the challenges lie.

Broadly, the field of data science concerns itself with statistical summarization of experience: it answers the question “what is the world like now?” based on an analysis of historical data. Imagination, however, requires answering a different sort of question, one that has to do with “what if?” counterfactual inference and with “why?” explanations. Answering “what if?” and “why?” questions requires developing a new field, which I have termed imagination science (Mahadevan, 2018).

Basquiat’s famous painting provides a vivid example of the challenge in designing machines that imagine. This painting is imaginative because it portrays a human face in an “artistic” way that is difficult to mechanize. A leading class of deep learning models, called generative adversarial networks or GANs (Goodfellow et al., 2014), can learn to generate new images that resemble those in a given collection. In recent years, there has been a surge of interest in GAN models, and there are dozens of variants. As Figure 2 shows, GANs can produce frighteningly realistic images of faces of people who don’t exist.[2] Although this GAN approach is impressive, it does not capture the type of imagination represented by Basquiat’s painting.

Figure 2: Generative adversarial networks (GANs) can be used to create realistic looking novel faces of people who don't exist.

Can GANs be extended to do more imaginative types of art? To understand the limitations of traditional GANs, we need to delve a bit deeper into how they work. The basic architecture of a GAN consists of two modules: a generator module that aims to produce novel images, and a discriminator module that is tasked with telling generated (fake) images apart from real images drawn from a given dataset. The two are trained against each other, so the generator improves by learning to fool the discriminator. GANs are an example of a generative model in machine learning, a type of statistical machinery that can produce new samples from some unknown probability distribution, accessible only through a dataset of known samples. However, imaginative art requires going beyond generating additional samples of a known type. If Basquiat had painted in the style of Monet, his paintings would not be regarded so highly.
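
To make the tug-of-war between the two modules concrete, here is a minimal sketch of the losses each module tries to minimize, following the standard formulation in Goodfellow et al. (2014). It is only an illustration: the function names and the example probabilities are hypothetical, the generator loss shown is the commonly used "non-saturating" variant, and a real GAN would compute these quantities from neural network outputs rather than hand-picked numbers.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Cross-entropy loss: real images should score near 1, generated ones near 0.

    d_real and d_fake are lists of discriminator outputs, each a probability
    that the corresponding image is real.
    """
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating generator loss: generated images should score near 1."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs, chosen only for illustration.
confident = discriminator_loss(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
fooled = discriminator_loss(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])

print(f"discriminator loss when it separates real from fake: {confident:.3f}")
print(f"discriminator loss when it can only guess:           {fooled:.3f}")
print(f"generator loss when its images are rejected:  {generator_loss([0.1]):.3f}")
print(f"generator loss when its images are accepted:  {generator_loss([0.9]):.3f}")
```

Note that nothing in either loss rewards novelty: the generator is paid only for producing samples the discriminator cannot tell apart from the training collection.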

Capturing imagination in art requires extending GANs to enable creativity. Ultimately, creativity is the basis for all human achievement in the arts, sciences, and technology. This fundamental ability remains elusive in machines. Recent work on a variant of GANs called a creative adversarial network (or CAN) has made some progress in this direction.[3] A CAN is given a dataset of thousands of paintings in different styles (Impressionism, Cubism, etc.). The generator in a CAN is trained to produce a painting that is “art”, but uses a style that is different from any of the given styles. Figure 3 shows examples of art produced by a CAN.

Figure 3: These images were "painted" by a deep learning neural network called a creative adversarial network (CAN).
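
One way to express the CAN idea of "art, but in no known style" is to add a style ambiguity penalty to the generator's loss: the generator is rewarded when a style classifier cannot decide which known style the painting belongs to. The sketch below is a simplified illustration under that assumption. The function names, the weight parameter, and the example probabilities are hypothetical, and the objective used in the published CAN work [3] may differ in its details.

```python
import math

def style_ambiguity(style_probs):
    """Cross-entropy between a uniform target and the predicted style probabilities.

    It is smallest when the style classifier spreads its belief evenly over
    all known styles, that is, when the painting fits none of them well.
    """
    k = len(style_probs)
    return -sum(math.log(p) for p in style_probs) / k

def generator_objective(p_art, style_probs, weight=1.0):
    """Hypothetical combined loss: look like art, but in an ambiguous style."""
    art_term = -math.log(p_art)  # small when the discriminator accepts the image as art
    return art_term + weight * style_ambiguity(style_probs)

# Illustrative classifier outputs over four known styles.
clearly_cubist = [0.97, 0.01, 0.01, 0.01]
hard_to_place = [0.25, 0.25, 0.25, 0.25]

print(f"imitates a known style: {generator_objective(0.9, clearly_cubist):.3f}")
print(f"ambiguous style:        {generator_objective(0.9, hard_to_place):.3f}")
```

Under this loss, a painting that is accepted as art but closely imitates one known style scores worse than one that is equally accepted yet hard to classify.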

How can we evaluate paintings produced by a GAN? One objective way is to ask how much money someone would pay for such a painting. The first AI-generated artwork to come up for auction at Christie’s recently sold for $432,500, an impressive sum, but not in the stratospheric price range of a Picasso or Monet painting.[4]

CANs represent a specialized solution to endowing machines with a limited type of imagination for generating art. Can such an approach work in other mediums, like poetry? Unfortunately, thus far, the GAN framework works less effectively on text. One of the most revered poems in the English canon is Robert Frost’s “Stopping by Woods on a Snowy Evening”. Its last stanza reads:

“The woods are lovely, dark and deep,
But I have promises to keep,
And miles to go before I sleep,
And miles to go before I sleep.”

The imaginative nature of Frost’s poetry is undeniable. I have spent the last two decades teaching in the picturesque New England town of Amherst, where Robert Frost also taught, and this poem evokes in me images of the New England woods in winter. Recently, there has been much interest in computer-generated poetry.[5] A significant challenge in automating poetry writing is that it is not enough to simply capture rhythmic structure. The essence of great poetry is the ability to evoke metaphors. A wonderful example of this ability is in Bob Dylan’s song “It’s Alright, Ma (I’m Only Bleeding)”:

“While money doesn’t talk, it swears”

Or take another poet who lived in Amherst, Massachusetts, Emily Dickinson, who evoked this rich metaphor of death in her poem “Because I Could Not Stop for Death”:

“Because I could not stop for Death –
He kindly stopped for me –
The Carriage held but just Ourselves –
And Immortality.”

How can a machine capture such rich metaphors? There is much research on developing a computational theory of metaphors, but it has yet to be successful in practice at generating metaphors rich enough for beautiful verse (Veale et al., 2016). Recent work in natural language processing by Mikolov and others has shown that it is possible to construct vector-space embeddings of words by processing a large corpus, like Wikipedia, and characterizing each word in terms of the millions of contexts in which it occurs.[6] Mikolov’s program, called word2vec, is able to learn vectorized word embeddings and solve simple word analogies, such as “He is to she as king is to X” (where X is of course “queen”). Word2vec is thereby able to reason about linguistic regularities in a relatively simple way, although its ability to capture rich metaphors and more powerful analogies seems quite limited at this point.
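
The analogy trick amounts to simple vector arithmetic: subtract the vector for "he" from "king", add the vector for "she", and look for the nearest remaining word. The sketch below illustrates this with tiny hand-made three-dimensional vectors. These numbers are invented for the example and are not learned embeddings; real word2vec vectors have hundreds of dimensions and are estimated from a large corpus.

```python
import math

# Hand-made toy vectors (rough meaning of the axes: royalty, gender, fruit).
# They are assumptions for illustration, not output of word2vec.
EMBEDDINGS = {
    "he":     (0.0,  1.0, 0.0),
    "she":    (0.0, -1.0, 0.0),
    "king":   (1.0,  1.0, 0.0),
    "queen":  (1.0, -1.0, 0.0),
    "prince": (0.7,  1.0, 0.0),
    "apple":  (0.0,  0.0, 1.0),
}

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def solve_analogy(a, b, c, embeddings=EMBEDDINGS):
    """Return the word X such that a is to b as c is to X."""
    va, vb, vc = embeddings[a], embeddings[b], embeddings[c]
    target = [y - x + z for x, y, z in zip(va, vb, vc)]
    # Exclude the three query words, as is customary for this task.
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, embeddings[w]))

print(solve_analogy("he", "she", "king"))  # prints "queen"
```

The arithmetic succeeds here because the toy vectors were built so that gender is a single direction. Metaphors such as Dylan's or Dickinson's do not reduce to one consistent offset of this kind.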

Imagination is also the basis for causal reasoning and counterfactual inference (Pearl and Mackenzie, 2018; Walker and Gopnik, 2013). Probabilistic models of causal and counterfactual inference are being actively studied in AI, and represent another area where much progress has been made in recent years (Pearl, 2009). Figure 4 shows a three-layer cognitive architecture proposed by Judea Pearl that combines reasoning by association, which underlies most of the work in data mining, with causal and counterfactual reasoning.

Figure 4: A cognitive architecture proposed by Judea Pearl combining seeing, doing, and imagination.
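
The difference between the three layers can be seen in a very small structural causal model. The sketch below is an illustration only: the variables, the structural equations, and the noise probabilities are all invented for the example. A hidden factor Z influences both a treatment X and an outcome Y, and we ask one question at each layer: what do we expect for Y when we see X = 1, when we set X = 1, and when we imagine X = 1 in a case where we actually observed X = 0 and Y = 0?

```python
from itertools import product

# Hypothetical probabilities that each exogenous noise variable equals 1.
P_NOISE = {"u_z": 0.5, "u_x": 0.2, "u_y": 0.5}

def worlds():
    """Yield every setting of the noise variables together with its probability."""
    names = ("u_z", "u_x", "u_y")
    for values in product([0, 1], repeat=3):
        prob = 1.0
        for name, value in zip(names, values):
            prob *= P_NOISE[name] if value else 1.0 - P_NOISE[name]
        yield values, prob

def run(noise, do_x=None):
    """Evaluate the structural equations, optionally forcing X to a value."""
    u_z, u_x, u_y = noise
    z = u_z                                   # hidden common cause
    x = (z ^ u_x) if do_x is None else do_x   # X usually follows Z unless forced
    y = z | (x & u_y)                         # Y depends on both Z and X
    return x, y

def prob_y(condition, do_x=None):
    """P(Y = 1) under do_x, among worlds whose factual (x, y) meets `condition`."""
    num = den = 0.0
    for noise, p in worlds():
        if condition(*run(noise)):
            den += p
            num += p * run(noise, do_x)[1]
    return num / den

seeing = prob_y(lambda x, y: x == 1)                        # P(Y=1 | X=1)
doing = prob_y(lambda x, y: True, do_x=1)                   # P(Y=1 | do(X=1))
imagining = prob_y(lambda x, y: x == 0 and y == 0, do_x=1)  # counterfactual

print(f"seeing: {seeing:.2f}  doing: {doing:.2f}  imagining: {imagining:.2f}")
```

In this toy model the three answers come out to roughly 0.90, 0.75, and 0.50. Summarizing observed data yields only the first number; the other two require knowing, or hypothesizing, the mechanism that generated the data.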

Imagination in problem solving, science, and technology is another area where machines have yet to make much progress. Einstein prized imagination because it enabled him to pose hypothetical questions, such as “What would the world look like if I rode a beam of light?”, a question that led him to develop the revolutionary theory of special (and later, general) relativity. These and other imaginative abilities lie far beyond the scope of intelligent machines at present.

[1] See https://plato.stanford.edu/entries/imagination/.

[2] See https://research.nvidia.com/publication/2017-10_Progressive-Growing-of

[3] See https://arxiv.org/abs/1706.07068.

[4] See https://www.christies.com/features/A-collaboration-between-two-artists-one-human-one-a-machine-9332-1.aspx.

[5] See https://www.ischool.berkeley.edu/news/2016/teaching-computer-write-poetry.

[6] See https://arxiv.org/abs/1301.3781.

References

Mahadevan, S. 2018. Imagination machines: A new challenge for artificial intelligence. In Proceedings of the AAAI Conference, New Orleans.

Goodfellow, I.; Bengio, Y.; and Courville, A. C. 2016. Deep Learning. Adaptive computation and machine learning. MIT Press.

Goodfellow, I. J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A. C.; and Bengio, Y. 2014. Generative adversarial nets. In Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, December 8-13 2014, Montreal, Quebec, Canada, 2672–2680.

Pearl, J. 2009. Causality: Models, Reasoning and Inference. Cambridge University Press.

Walker, C., and Gopnik, A. 2013. Causality and imagination. In Taylor, M., ed., Oxford Handbook of the Development of Imagination. Oxford University Press. 342–358.

Pearl, J., and Mackenzie, D. 2018. The Book of Why: The New Science of Cause and Effect. Basic Books.

Veale, T.; Shutova, E.; and Beigman Klebanov, B. 2016. Metaphor: A Computational Perspective. Synthesis Lectures on Human Language Technologies. Morgan & Claypool.