Stability AI just debuted the latest version of Stable Diffusion, and the model does not disappoint.
Stable Diffusion XL (SDXL) v0.9 delivers ultra-photorealistic imagery, surpassing previous iterations in terms of sophistication and visual quality.
This means, among other things, that the new Stability AI model doesn’t generate those annoying “spaghetti hands” as often. You also don’t need to type a lot of words to get a stunning image, because the model is trained to do most of the heavy lifting for you. Talking to the model will feel more natural.
The company announced the release on Twitter yesterday, saying that the new version “delivers a leap in use cases for generative AI imagery.”
Introducing the latest release from Stability AI: Breaking down barriers to #SDXL 0.9!
SDXL 0.9 makes massively improved text-to-image and composition details in the beta release and provides a leap in use cases for generative AI imagery. #StabilityAI
Unleash your creativity… pic.twitter.com/rX3BOoY7hE
— Stability AI (@StabilityAI) June 22, 2023
Dubbed SDXL v0.9, the image generator excels in response to text-based prompts, showing more composition detail than the previous SDXL beta version, which was launched in April. A detailed comparison of the images produced by the two versions highlights the distinct edge of the latest model.
For example, the prompt “A wolf in Yosemite National Park, cold nature documentary film photo” yields a more realistic image with the new AI model, which succeeds where the previous version failed to render real-life details. These improvements are attributed to the increased number of parameters in SDXL v0.9, which gives it greater learning capacity than its predecessor.
Stability AI, known for releasing the open-source image generator Stable Diffusion in August 2022, has further stepped up its competition with OpenAI’s Dall-E and MidJourney. Stable Diffusion is currently the most popular open-source AI image generator in the world.
The company was recognized by TIME yesterday as one of the most influential companies of 2023. Other AI companies that appear on the list are OpenAI (ChatGPT), Hugging Face (collaborative open-source AI platform), Runway AI (generative video), Nvidia and Google DeepMind. In the crypto space, Polygon and Chainalysis (blockchain forensics) also made the list.
Beautiful Images With Little Work
In a remarkable move, SDXL v0.9 does away with the need for complex prompts, generating better results from simpler, less structured inputs. This was clearly shown when Decrypt submitted the brief prompt “two hands pointing at each other bright art,” which gave an impressively realistic result with SDXL v0.9, and less inspiring scribbles with the standard Stable Diffusion versions 1.5 and 2.1.
This new ease of use could be a serious threat to MidJourney, whose main appeal is user-friendliness. Additionally, SDXL v0.9’s cinematic aesthetics and accurate object rendering, reminiscent of MidJourney’s visual style, could serve as a strong selling point for Stability AI.
The latest Stability AI gem can be accessed via Clipdrop, the AI image generating and editing tool developed by Init ML, a recent acquisition of Stability. The company’s API customers should also get access soon. However, the model is not yet ready for training or fine-tuning and cannot yet be run locally. When released to the public, it will require a system with at least 16GB of RAM and a GPU with 8GB of VRAM.
Meanwhile, Stability AI continues to work on two other projects: a large language model (LLM) named StableLM and the impressive DeepFloyd IF, an advanced text-to-image generator capable of embedding readable text into images, a feat not yet achieved by current models.
According to Stability AI, mid-July is the expected date for the public release of this game-changing model as open-source software, marking another important milestone for the company.