Latest Generative AI System From Meta Generates Amazing Visuals From Sketches and Words
Text-to-image generation is the trendy algorithmic process of the moment: Craiyon (formerly DALL-E mini) and Google’s Imagen have unleashed waves of wonderfully weird, procedurally generated art synthesized from human and computer imaginations. On Tuesday, Meta announced that it, too, has developed an AI image generation engine, one intended to facilitate the creation of high digital art and immersive worlds in the Metaverse.
Meta’s Latest AI System Generates Amazing Sketches and Visuals
Generating an appropriate image from a phrase like “there’s a horse in the hospital” requires a lot of work from a generative AI. The phrase is first fed into a transformer model, a neural network that analyzes the sentence word by word and builds a contextual understanding of the relationships between the words. Once it has a general idea of what the user is describing, the AI uses a set of GANs (generative adversarial networks) to create a new image.
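To make the idea of "relationships between the words" a little more concrete, here is a minimal sketch of the attention step at the heart of a transformer. It is a simplified illustration, not the code behind any of the systems named in this article: the two-number word vectors are made up for the example, and the learned projection layers a real transformer applies are left out.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Simplified scaled dot-product self-attention (no learned projections)."""
    dim = len(vectors[0])
    weights, outputs = [], []
    for query in vectors:
        # Score how strongly this word relates to every word in the sentence.
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        row = softmax(scores)
        weights.append(row)
        # The word's new vector is a weighted blend of all the word vectors.
        outputs.append([sum(w * v[d] for w, v in zip(row, vectors))
                        for d in range(dim)])
    return weights, outputs

words = ["horse", "in", "hospital"]
# Made-up 2-D word vectors, purely for illustration.
embeddings = [[1.0, 0.0], [0.1, 0.1], [0.0, 1.0]]
weights, contextual = self_attention(embeddings)

for word, row in zip(words, weights):
    print(word, [round(w, 2) for w in row])
```

Each printed row shows how much one word "pays attention" to the others. A real model stacks many such layers and learns its vectors from training data.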
Today’s state-of-the-art AIs can create photorealistic images of pretty much any nonsense you feed them, all thanks to efforts to train ML models on ever-expanding, high-definition image sets with well-curated text descriptions. Each AI has its own unique method of production.
Google’s Imagen, for example, employs a diffusion model, “which learns to convert a pattern of random dots to images,” according to a June post on Google’s Keyword blog. These images start out at a low resolution and are then progressively increased in resolution. Google’s Parti AI, on the other hand, “first converts a collection of images into a sequence of code entries, similar to puzzle pieces. A given text prompt is then translated into these code entries, and a new image is created.”
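The "random dots to images" idea can be sketched in a few lines. The example below is a toy illustration of the general diffusion recipe, not Imagen's actual implementation: it blends a tiny five-pixel "image" with Gaussian noise using the standard forward-noising formula, then removes that noise again. In a real system a trained neural network predicts the noise; here, as a loudly stated shortcut, the true noise is reused in its place, and the `alpha_bar` value of 0.1 is an arbitrary choice.

```python
import math
import random

def noise_image(pixels, alpha_bar, rng):
    """Forward step: blend the clean pixels with Gaussian noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [math.sqrt(alpha_bar) * p + math.sqrt(1.0 - alpha_bar) * n
             for p, n in zip(pixels, noise)]
    return noisy, noise

def denoise_image(noisy, predicted_noise, alpha_bar):
    """Reverse step: subtract the predicted noise to estimate the clean pixels."""
    return [(x - math.sqrt(1.0 - alpha_bar) * n) / math.sqrt(alpha_bar)
            for x, n in zip(noisy, predicted_noise)]

rng = random.Random(0)
clean = [0.0, 0.25, 0.5, 0.75, 1.0]  # a tiny 5-pixel "image"

# alpha_bar close to 0 means the result is mostly noise.
noisy, true_noise = noise_image(clean, alpha_bar=0.1, rng=rng)

# Assumption: a trained network would supply the noise estimate here.
# This sketch cheats and reuses the true noise instead.
restored = denoise_image(noisy, true_noise, alpha_bar=0.1)

print([round(p, 2) for p in restored])
```

Because the noise estimate is perfect in this sketch, the original pixels come back exactly. The hard part in practice is training a network whose noise estimates are good enough to do this from pure noise, guided by the text prompt.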
Although these systems can produce virtually any visual result that is described to them, the user cannot control the specific details of the final image. Mark Zuckerberg, CEO of Meta, wrote on Tuesday that “people should be able to shape and control the content a system generates” to realize AI’s potential to fully advance creative expression.
Make-A-Scene, the company’s “exploratory AI research concept,” accomplishes this by adding user-generated sketches to the company’s text-based image generation, yielding a 2,048 x 2,048-pixel image. This mash-up lets the user specify not only the individual elements of the image but also how those elements should be arranged. “It demonstrates how people can use both text and simple drawings to convey their vision with greater specificity, using a variety of elements, forms, arrangements, depth, compositions, and structures,” Zuckerberg said.
A human evaluation panel found that the text-and-sketch image was more aligned with the original sketch (99.54% of the time) and with the original text description (66% of the time) than the text-only image. Prominent AI artists such as Sofia Crespo, Scott Eaton, Alexander Reben, and Refik Anadol have been given access to Meta’s Make-A-Scene demo to help shape its future development. No public release date for the AI has been announced.