OpenAI has continually expanded its ChatGPT offerings, adding an AI voice assistant, file and image understanding, advanced research capabilities, AI agents, and more. However, there was one glaring omission: a truly capable image generator.
Last week, OpenAI launched 4o image generation. This image model is significantly better, albeit slower, than the DALL-E models OpenAI previously offered. It handles very difficult requests, such as realistic images and, most impressively, accurately rendered text.
For example, in the livestream demo, OpenAI CEO Sam Altman, joined by researchers Gabriel Goh and Prafulla Dhariwal, prompted 4o to create a photo from a specific POV with a flyer that included lots of text. After loading for a few seconds, it got the cinematic direction right and accurately printed all the text.
The model also offers capabilities that OpenAI's previous image generator lacked, such as image referencing: a reference image can be rendered in a new form (such as an anime version or a selfie) or used as inspiration for a completely new work.
this was a real labor of love from @gabeeegoooh. congrats gabe; excellent work!
here is what we generated during the livestream: pic.twitter.com/fmHWp4d9AF