Image Generating Models: Creating New Photographs from Scratch
Image-generating models create new photographs from scratch with the help of existing datasets
The concept of generating an image from scratch is only a few years old, but since their emergence, image generative models have been driving breakthrough changes in the digital world. Advanced techniques such as machine learning and Generative Adversarial Networks (GANs) are used in the image generation process.
Generally, image generation is the task of generating new images from an existing dataset. It is broadly divided into two categories: unconditional and conditional image generation. Unconditional generation refers to sampling new images from the learned data distribution without any constraints, while conditional generation means generating samples guided by additional information such as labels or captions. For example, a generative image model can produce new photos of animals from scratch that look like real animals: it generates examples that plausibly come from the distribution of an existing dataset, similar to, but distinct from, the photographs it was trained on. In this article, GlobalTech Outlook brings you some of the emerging image generative models that are drawing attention across the technology sector.
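To make the unconditional/conditional distinction concrete, here is a minimal toy sketch (not a real generative model; the dataset, function names, and labels are illustrative assumptions): unconditional sampling draws from the whole dataset, while conditional sampling restricts the draw to samples matching a given label.

```python
import random

# Toy "dataset" of (label, sample-id) pairs. In a real generative model the
# samples would be images and the model would learn their distribution
# rather than drawing from stored examples.
dataset = [
    ("cat", "cat_01"), ("cat", "cat_02"),
    ("dog", "dog_01"), ("dog", "dog_02"),
]

def sample_unconditional(data, rng):
    """Unconditional generation: draw from the whole distribution."""
    return rng.choice(data)

def sample_conditional(data, label, rng):
    """Conditional generation: draw only from samples matching the label."""
    matching = [item for item in data if item[0] == label]
    return rng.choice(matching)

rng = random.Random(0)
print(sample_unconditional(dataset, rng))       # any sample from the dataset
print(sample_conditional(dataset, "dog", rng))  # always a "dog" sample
```

A conditional model trained on captioned data generalizes this idea: the condition steers generation, but the output is a new sample, not a lookup.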
DALL.E by OpenAI
OpenAI began 2021 with a surprise for tech geeks. The company introduced an image-generating model called DALL.E that uses machine learning to create images from a given caption. A famous example from the model was generating images of armchairs in the shape of an avocado. DALL.E extends the company's earlier GPT-3 model: instead of replying with words, it responds to text prompts with pictures. Remarkably, unlike many other image-generating models, DALL.E produces a large set of candidate images from a single prompt. The model uses the transformer neural network architecture that is responsible for many recent advances in machine learning.
StyleGAN by NVIDIA
StyleGAN, an advanced creation of NVIDIA, generates high-resolution human faces with the help of machine learning. The neural networks that make this possible are called generative adversarial networks. This image-generating model uses a new architecture for the 'generator' network of a GAN, which provides a new method for the generation process. StyleGAN injects a style vector at each layer of the generator, so small adjustments can be made layer by layer. This architecture helps separate high-level attributes, such as a person's identity, from low-level attributes, such as hairstyle, within the image. This layer-by-layer separation is what lets StyleGAN create new faces based on its training dataset.
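The per-layer style idea can be illustrated with a toy sketch (hypothetical function names and string "styles", not NVIDIA's implementation): coarse layers shape high-level attributes such as identity, fine layers shape low-level ones such as hairstyle, and "style mixing" swaps the fine-layer styles of one latent code for those of another.

```python
# Toy illustration of per-layer style injection, as described for StyleGAN.
# Real style vectors are learned tensors; strings stand in for them here.

def styles_for(latent, num_layers):
    """Map one latent code to one style vector per generator layer."""
    return [f"{latent}-style-{layer}" for layer in range(num_layers)]

def mix_styles(source_a, source_b, crossover, num_layers):
    """Style mixing: coarse layers (before `crossover`) take styles from
    latent A; fine layers (from `crossover` on) take styles from latent B."""
    a = styles_for(source_a, num_layers)
    b = styles_for(source_b, num_layers)
    return a[:crossover] + b[crossover:]

mixed = mix_styles("faceA", "faceB", crossover=2, num_layers=4)
print(mixed)
# ['faceA-style-0', 'faceA-style-1', 'faceB-style-2', 'faceB-style-3']
```

Because each layer has its own style input, changing only the fine-layer styles alters details like hairstyle while leaving identity-level features intact.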
CLIP by OpenAI
OpenAI has introduced a neural network called CLIP (Contrastive Language-Image Pre-training), which efficiently learns visual concepts from natural language supervision. Similar to the zero-shot capabilities of GPT-2 and GPT-3, CLIP can be applied to any visual classification benchmark simply by providing the names of the visual categories to be recognized. CLIP builds on a large body of work on zero-shot transfer, natural language supervision, and multimodal learning. Deep learning needs a lot of data, and vision models have traditionally been trained on manually labeled datasets, which makes them expensive to construct. In contrast, CLIP learns from text-image pairs that are already publicly available on the internet. The model can perform a wide variety of visual classification tasks without needing additional training examples.
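The zero-shot classification step described above can be sketched in a few lines. This is a simplified illustration with made-up, pre-computed embeddings (real CLIP runs an image encoder and a text encoder to produce them): the predicted category is the caption whose embedding is most similar to the image embedding.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

# Hypothetical embeddings in a shared image-text space. CLIP's training
# pushes matching image/caption pairs close together in this space.
image_embedding = [0.9, 0.1, 0.2]
text_embeddings = {
    "a photo of a cat": [0.85, 0.15, 0.25],
    "a photo of a dog": [0.1, 0.9, 0.1],
    "a photo of a car": [0.2, 0.1, 0.95],
}

# Zero-shot classification: pick the caption most similar to the image.
best = max(text_embeddings,
           key=lambda caption: cosine(image_embedding, text_embeddings[caption]))
print(best)  # a photo of a cat
```

Because the "classifier" is just a list of captions, new categories can be added by writing new text prompts, with no additional labeled training images.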