06 Jan OpenAI’s new DALL-E AI program generates images of anything with fascinating success

Source: DP Review

San Francisco-based artificial intelligence research firm OpenAI has built a program, DALL-E, that generates images from text descriptions. The program uses a 12-billion-parameter version of the Generative Pre-trained Transformer 3 (GPT-3) autoregressive language model, which was itself developed by OpenAI.

DALL-E creates illustrations, paintings, photos, renders, sketches and more of basically anything you can describe using text. OpenAI’s paper about DALL-E showcases numerous examples. A text prompt of ‘the exact same cat on the top as a sketch on the bottom’ produced a photo of a gray cat accompanied by five sketches of the cat in different styles. Given another prompt, ‘an armchair in the shape of an avocado,’ DALL-E produced five different realistic renders of, well, an armchair shaped like an avocado.

In this instance, the prompt is ‘the exact same cat on the top as a sketch on the bottom.’ Image credit: OpenAI

OpenAI describes DALL-E as a ‘simple decoder-only transformer that receives both the text and the image as a single stream of 1280 tokens – 256 for the text and 1024 for the image – and models all of them autoregressively. The attention mask at each of its 64 self-attention layers allows each image token to attend to all text tokens. DALL·E uses the standard causal mask for the text tokens, and sparse attention for the image tokens with either a row, column, or convolutional
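
To make the token layout in that description more concrete, here is a minimal Python sketch of an attention mask over a single text-plus-image stream. It is an illustration only, not OpenAI's code: the names `build_mask`, `TEXT_TOKENS` and `IMAGE_TOKENS` are hypothetical, and the image tokens use a plain dense causal mask as a stand-in for the sparse patterns the quote mentions, which are not modeled here.

```python
# Sizes taken from the quoted description: 256 text + 1024 image tokens.
TEXT_TOKENS = 256
IMAGE_TOKENS = 1024
STREAM_LEN = TEXT_TOKENS + IMAGE_TOKENS  # 1280 tokens in one stream


def build_mask(text_len=TEXT_TOKENS, image_len=IMAGE_TOKENS):
    """Return mask[q][k], True when query position q may attend to key k.

    Assumption: a dense causal mask over the whole stream. Because the text
    tokens come first, every image token can see every text token, while no
    text token can see an image token. The sparse image-token patterns are
    deliberately left out of this sketch.
    """
    total = text_len + image_len
    return [[k <= q for k in range(total)] for q in range(total)]


if __name__ == "__main__":
    # Tiny stream (2 text tokens, 3 image tokens) so the mask is readable.
    for row in build_mask(text_len=2, image_len=3):
        print("".join("x" if allowed else "." for allowed in row))
```

Each printed row is one query position and each column one key position; an `x` marks a position the query is allowed to attend to. With the real sizes the same function would give a 1280-by-1280 mask.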
