Image: Stable Diffusion
Updated on 21 Aug 2022:
Stable Diffusion is now available through a web interface. Once you are logged in, you can use text prompts to generate images, similar to DALL-E 2, and you have several additional fine-tuning options. As with DALL-E 2, prompts for sexual or violent images are prohibited.
The open source Stable Diffusion model, which can run locally or in the cloud, will not have these restrictions. The model is expected to be published on GitHub in the next few days.
You can try Stable Diffusion on the web for free. You can buy around 1000 image prompts for the equivalent of just 12 euros. The actual number of prompts available depends on the computational complexity and the resolution of your images.
DreamStudio, the web interface for Stable Diffusion, is available here.
Original article from 14 August 2022:
The Open Source DALL-E Competition Runs on Your Graphics Card
OpenAI’s DALL-E 2 gets free competition in the form of Stable Diffusion. The project is backed by the open source AI movement and the startup Stability AI.
Artificial intelligence that can generate images from text descriptions has been progressing rapidly since early 2021. At the time, OpenAI showed impressive results with DALL-E 1 and CLIP. The open source community used CLIP for several alternative projects during the year. Then, in 2022, OpenAI released the impressive DALL-E 2, Google showed Imagen and Parti, Midjourney reached millions of users, and Craiyon flooded social media with AI images.
The startup Stability AI has now announced the release of Stable Diffusion, another DALL-E 2-like system that will initially be made available gradually to researchers and other groups via a Discord server.
After a testing phase, Stable Diffusion will be released free of charge: the code and the fully trained model will be published as open source. There will also be a hosted version with a web interface that users can use to test the system.
Stability AI Funds Free DALL-E 2 Competitor
Stable Diffusion was created in a collaboration between researchers from Stability AI, RunwayML, LMU Munich, EleutherAI, and LAION. The research group EleutherAI is known for its open source language models GPT-J-6B and GPT-NeoX-20B, and is also researching multimodal models.
The non-profit LAION (Large-Scale Artificial Intelligence Open Network) provided the training data with the open source dataset LAION 5B, which the team filtered with human feedback in the first testing phase to create the final training dataset, LAION-Aesthetics.
Patrick Esser of Runway and Robin Rombach of LMU Munich led the project, building on their work in the CompVis group at the University of Heidelberg. This is where the widely used VQGAN and Latent Diffusion originated. The latter, together with research from OpenAI and Google Brain, served as the basis for Stable Diffusion.
— Stable Diffusion Pics (@DiffusionPics) 14 August 2022
The mathematician and computer scientist behind Stability AI, which was founded in 2020, is Emad Mostaque. He worked as an analyst for various hedge funds for several years before turning to public work. In 2019, he helped found Symmetry, a project that seeks to reduce the cost of smartphones and Internet access for disadvantaged populations.
With Stability AI and his personal fortune, Mostaque wants to foster an open source community for AI research. For example, his startup previously supported the creation of the “LAION 5B” dataset. For training the Stable Diffusion model, Stability AI provided servers with 4,000 Nvidia A100 GPUs.
“No one has voting rights other than our 75 employees: no billionaires, big funds, governments, or anyone else who controls the company or the communities we support. We are completely independent,” Mostaque told TechCrunch. “We use our computing power to accelerate open-source AI.”
Stable Diffusion is an open source milestone
A beta test for Stable Diffusion is currently underway, with new users being added in waves. The results that can be seen on Twitter, for example, suggest that a real DALL-E 2 competitor is emerging here.
Unlike DALL-E 2, Stable Diffusion can generate images of prominent people and other subjects that OpenAI prohibits in DALL-E 2. Other systems such as Midjourney or Pixelz.ai can do this too, but none of them achieve comparable quality with the high diversity seen in Stable Diffusion, and none of the other systems are open source.
Turns out #stablediffusion can do really awesome interpolations between text prompts if you fix the initialization noise and slerp between the prompt conditioning vectors: pic.twitter.com/lWOoETYVZ3
— Xander Steenbrugge (@xsteenbrugge) 7 August 2022
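The technique mentioned in the tweet, spherical linear interpolation ("slerp") between two prompt conditioning vectors while the initialization noise stays fixed, can be sketched roughly as follows. This is only an illustration and not the code used in the tweet: the random vectors are hypothetical stand-ins for real prompt embeddings, the vector sizes are arbitrary, and no actual image model is called.

```python
import math
import random


def slerp(v0, v1, t, eps=1e-7):
    """Spherical linear interpolation between two non-zero vectors (flat lists of floats)."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the vectors, clamped against rounding error.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # Nearly parallel (or opposite) vectors: fall back to linear interpolation.
        return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1.0 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Hypothetical stand-ins: random vectors in place of real prompt embeddings.
rng = random.Random(0)
cond_a = [rng.gauss(0.0, 1.0) for _ in range(8)]
cond_b = [rng.gauss(0.0, 1.0) for _ in range(8)]

# The initialization noise is sampled once and reused for every frame.
fixed_noise = [rng.gauss(0.0, 1.0) for _ in range(16)]

# Each frame pairs the same noise with an interpolated conditioning vector.
steps = 5
frames = [(fixed_noise, slerp(cond_a, cond_b, i / (steps - 1))) for i in range(steps)]
```

In a real pipeline, each (noise, conditioning) pair would be passed to the image model in turn. Because the noise is identical for every frame, only the conditioning changes from one image to the next, which is presumably why the transitions look smooth.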
Stable Diffusion is said to already run on a single graphics card with 5.1 gigabytes of VRAM. The project thus brings to the edge an AI technology that was previously available only through cloud services.
Stable Diffusion thus gives researchers and other interested parties an opportunity to experiment with a modern generative AI model without access to GPU servers. The model should also run on MacBooks with Apple’s M1 chip. However, image generation there takes several minutes instead of seconds.
Stability AI itself also wants to enable companies to train their own versions of Stable Diffusion. Multimodal models are thus following the path that large language models have already taken: away from a single provider and toward the widespread availability of multiple alternatives through open source.
Runway is already researching text-to-video editing enabled by Stable Diffusion.
#stablediffusion text-to-image checkpoints are now available for research purposes upon request at https://t.co/7SFUVKoUdl.
Working on a more permissive release and inpainting checkpoints.
— Patrick Esser (@pess_r) 11 August 2022
Stable Diffusion: Pandora’s box and net benefit
Of course, with open access and the ability to run models on widely used GPUs, the potential for abuse increases dramatically.
“A certain percentage of people are simply unpleasant and weird, but that’s humanity,” Mostaque said. “We are convinced that this technology will become widespread and that the paternalistic and somewhat condescending attitude of many AI enthusiasts is a mistake because they do not trust society.”
However, Mostaque emphasizes that the free availability enables the community to develop countermeasures.
“We take comprehensive security measures, including the development of modern tools to help minimize potential harm across our release and our services. With thousands working on this model, we believe the net benefit will be highly positive, and with billions of people using this technology, the harm will fade into the background.”
For more information, see the Stable Diffusion GitHub. You can find many examples of Stable Diffusion’s image generation capabilities on the Stable Diffusion subreddit. You can sign up for the Stable Diffusion beta here.