Collaboration With AI: Experiments With DALL-E 2

Updated: Jan 23

Perhaps the most fraught and controversial aspect of text-to-image diffusion models has been the role of the human artist in the act of creation. In general, artists, art critics and commentators have stressed the need for the human artist to assert control over the creative process. The artist can do this by continuing to experiment with different prompts, or by using a tool like Facebook’s Make-A-Scene, which accepts a simple sketch along with a text prompt and returns a result more faithful to the artist’s vision.

Too much emphasis on placing humans at the center of the creative enterprise, however, may mean losing the unexpected or random elements generated by the hallucinatory process that is as common inside deep neural networks (DNNs) as it is in the human brains upon which they are modelled. A better and more fruitful approach might be to embrace interspecies collaboration, using both natural and artificial intelligence to co-create a work of art. What follows is my experiment in using DALL-E 2, perhaps the most popular of the text-to-image tools, to demonstrate the creative power of its DNN as applied to a portion of the work of the great Italian artist Giorgio de Chirico (1888-1978), specifically his ‘Metaphysical Town Square’ series. In these paintings a set of visual elements tends to repeat: rounded arcades, a statue usually facing away from the viewer, long shadows and sometimes a diminutive figure or two. But what is most remarkable about these images is a sensation of profound loneliness, and an unsettling, haunted quality that a text-to-image tool might find very difficult to convey.

Since my goal was to test the creative capabilities of DALL-E 2, I decided on a simple strategy designed to minimize human input: enter only one simple prompt, “A plaza in the style of Giorgio de Chirico”, and run it four times to get 16 machine-generated images. The results were as follows:

From squared arcades to lampposts, from brightly colored tiled surfaces to pop-art inspired designs, what is fascinating and unexpected about these images is how little they resemble the de Chirico originals. Given the opportunity and the scope provided by a minimalist prompt, the machine demonstrated a high degree of creativity, hallucinating new details and improbable configurations while retaining de Chirico’s trademark skewed perspectives. Above all, despite its radical reinterpretations of de Chirico’s phantasmagoric architecture, the machine was able to retain the haunted, slightly menacing mood of the ‘Metaphysical Town Square’ series.

What lessons can artists engaged in designing VR environments for the metaverse draw from this experiment? For one thing, the results argue for starting with minimal prompts to get the full benefit of machine creativity, which is likely to present unexpected, random elements that the artist may not have thought of on their own. The ability to incorporate these random machine-generated elements into the final design may well make for a final product that is truly unique.

B. Jerbic, M. Svaco (2022). Interspecies Collaboration in the Design of Visual Identity. arXiv.