Can DALL-E 3 in ChatGPT read and modify images? Come see for yourself

Dec 04, 2023, Hi-network.com
steam-santa.png
Screenshot by David Gewirtz

I've been exploring the use of DALL-E 3 inside of ChatGPT Plus. I'm doing this because it's my job, not because I have some kind of unhealthy little addiction to describing something in my mind and seeing it manifest in mere minutes on the screen. I can stop at any time. Sure, that's the ticket, I can stop at any time.

But not today. Today, I found a new toy. DALL-E 3 inside of ChatGPT can read and modify images. Sort of. You see, it's a bit fussy. But I'm getting ahead of myself. Let's start this story at the beginning...

I've been using Midjourney to customize uploaded images for a while. The problem is that it's very convoluted. You have to be running Midjourney in Discord, and then you have to go through a number of steps to upload an image into Discord, get a URL, yada, yada, yada...

In ChatGPT Plus, you simply have to click on the paperclip icon and upload your image. One and done.

That makes it a lot easier to use, and also a lot more fun. But how well does it work? To test it out, I tried three images: a picture of my car, a picture of me, and the ZDNET logo. Let's look at the results.

My car

Here's a picture of my car, a 2013 Dodge Challenger.

my-car
David Gewirtz

Once the image was uploaded, I instructed DALL-E 3:

Put car in city

The results were promising. DALL-E 3 successfully reproduced a likeness of the car, in a city scene:

car-in-city.png
Screenshot by David Gewirtz

Then, because I have a definite steampunk fascination, I asked DALL-E to:

Make it steampunk

Here's what we got. It still retained the overall body style of the Dodge Challenger:

steampunk.png
Screenshot by David Gewirtz

DALL-E keeps breaking

One thing to note is that I couldn't get DALL-E to do too many iterations without failure. Every two or three requests (and never more than four), I got this message:

failure
Screenshot by David Gewirtz

My workaround was to take the last successfully created image and upload it into a new ChatGPT Plus session, and work from that.
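
For readers who think in code, the workaround can be sketched as a loop that carries the last good image into a fresh session whenever a request fails. This is only an illustration of the logic: the article does everything by hand in the ChatGPT interface, and `open_session`, `GenerationFailed`, and `fake_session` are hypothetical stand-ins, not part of any real API.

```python
class GenerationFailed(Exception):
    """Hypothetical stand-in for the failure message shown after a few edits."""


def run_with_restarts(prompts, start_image, open_session, max_restarts=5):
    """Apply prompts in order; on failure, reopen a session from the last good image.

    open_session(image) is an assumed factory that returns a function
    edit(prompt) -> new_image, which may raise GenerationFailed.
    """
    last_good = start_image
    edit = open_session(last_good)
    restarts = 0
    for prompt in prompts:
        while True:
            try:
                last_good = edit(prompt)
                break
            except GenerationFailed:
                restarts += 1
                if restarts > max_restarts:
                    raise
                # The workaround: start fresh, seeded with the last good image.
                edit = open_session(last_good)
    return last_good, restarts


def fake_session(image):
    """Toy session that succeeds twice and then fails, to exercise the loop."""
    state = {"image": image, "calls": 0}

    def edit(prompt):
        state["calls"] += 1
        if state["calls"] > 2:
            raise GenerationFailed("generation failed")
        state["image"] = f"{state['image']} > {prompt}"
        return state["image"]

    return edit


final, restarts = run_with_restarts(
    ["Put car in city", "Make it steampunk", "Make the car fly"],
    "my-car",
    fake_session,
)
print(final)
print(restarts)
```

The `max_restarts` guard is there so that a session that fails immediately every time cannot loop forever.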

Flying car

So, I uploaded the last image, the steampunk version of my car, and told DALL-E to:

Make the car fly

Here's what I got back. The representation of my original car was gone, and we got back a very cool Chitty Chitty Bang Bang-style vehicle, in the air:

chitty.png
Screenshot by David Gewirtz

It was cool, but it was no longer my car. But that's OK, let's see where we can go.

The Santa Clause

My next attempt, given the season, was to try to put Santa in the driver's seat. Here, I ran into a rather overzealous guardrail, with ChatGPT telling me, "I'm sorry, but I can't assist with that request."

Yes, we have arrived at that time in the future when an AI says the precise equivalent of, "I'm sorry, Dave. I'm afraid I can't do that." Life imitates art.

sorry-dave
Screenshot by David Gewirtz

As you can see, the reasoning was a bit bizarre. But it seemed to hinge on the fact that I was asking it to modify an uploaded picture. So I gave it the above prompts as a single request:

Put car in city. Make it steampunk. Make the car fly. Put Santa in the driver's seat.

I got this:

bright-santa.png
Screenshot by David Gewirtz

It was a nice picture, but it missed the whole Victorian aesthetic that the previous images had. There's a fix for that, though.

DALL-E with ChatGPT essentially rewrites every prompt into a more detailed instruction. So, for example, "Make it steampunk" got rewritten into:

Image of a steampunk-themed street scene with vintage architecture, featuring a classic muscle car modified with steampunk elements like brass pipes, gears, and steam exhausts. The car is parked on a cobblestone street with ambient street lamps and a backdrop of old-fashioned buildings that blend Victorian and industrial revolution influences. There are people dressed in steampunk attire, with goggles and Victorian clothing, walking on the sidewalks. The overall atmosphere is that of a retro-futuristic city from an alternate history where steam power is the primary technology.

So I grabbed portions of the descriptive material from both "make it steampunk" and "make the car fly" and combined them for this prompt:

Put car in city. Make it steampunk. Make the car fly. Put Santa in the driver's seat. Below, the cobblestone streets are lined with gaslight street lamps, and people in Victorian attire look up in amazement. The sky is a dusk orange with a hint of smog and the buzz of smaller steampunk drones and airships in the distance. The overall atmosphere is that of a retro-futuristic city from an alternate history where steam power is the primary technology.

Here's what I got back:

steam-santa.png
Screenshot by David Gewirtz

Strictly speaking, it's not a flying car, but it's cool. Unfortunately, there's no connection at all to the original car image I started with.
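
The prompt-combining trick above is, at bottom, string assembly: the short instructions go first, followed by descriptive sentences borrowed from ChatGPT's own rewritten prompts. Here is a minimal sketch of that idea in Python. The helper name `build_prompt` and the two list names are assumptions made for illustration, and nothing here calls ChatGPT or DALL-E. It only builds the text you would paste in.

```python
# Short instructions that were first issued one at a time.
SHORT_STEPS = [
    "Put car in city.",
    "Make it steampunk.",
    "Make the car fly.",
    "Put Santa in the driver's seat.",
]

# Descriptive sentences taken from the expanded prompts quoted in the article.
STYLE_DETAILS = [
    "Below, the cobblestone streets are lined with gaslight street lamps, "
    "and people in Victorian attire look up in amazement.",
    "The sky is a dusk orange with a hint of smog and the buzz of smaller "
    "steampunk drones and airships in the distance.",
    "The overall atmosphere is that of a retro-futuristic city from an "
    "alternate history where steam power is the primary technology.",
]


def build_prompt(steps, details):
    """Join instructions and style sentences into one request (hypothetical helper)."""
    parts = [text.strip() for text in list(steps) + list(details) if text.strip()]
    return " ".join(parts)


prompt = build_prompt(SHORT_STEPS, STYLE_DETAILS)
print(prompt)
```

Keeping the instructions and the style sentences in separate lists makes it easy to swap in a different aesthetic while leaving the sequence of steps alone.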

Stop, Dave. Will you stop, Dave? Stop, Dave.

I had another HAL moment when I asked ChatGPT to put this picture of me in an office setting:

stop-dave
Screenshot by David Gewirtz

It told me, "I'm sorry, but I can't assist with that request." At least ChatGPT didn't say, "Look Dave, I can see you're really upset about this. I honestly think you ought to sit down calmly, take a stress pill, and think things over."

Fine. And now for something completely different.

Leaving on a jet train

Here's the ZDNET logo, which I uploaded to DALL-E:

zdnet
Screenshot by David Gewirtz

First, I tried to get it to put it on a jet:

Put this logo on the side of a jumbo jet

At least it got the color right:

jet.png
Screenshot by David Gewirtz

Then I tried to get it to put the logo on a building.

Put this logo on the side of a brick building

It remembered green, but not the right green:

building.png
Screenshot by David Gewirtz

So I tried to get DALL-E to move the building onto a model railroad.

Put the building on a model railroad

The result is something resembling a model railroad (although the track in the foreground is likely to cause a derailment).

railroad.png
Screenshot by David Gewirtz

There is a brick building, but it's not the same brick building, and any pretense of the logo is gone. Not even the green remains.

So, of course, I asked it to do this:

Also put the jumbo jet on a model railroad

I got this. I just want to know if those are planes or missiles in the water.

jet-train.png
Screenshot by David Gewirtz

What have we learned?

After tinkering with this DALL-E feature, I think we can conclude the following:

  • You can upload images to DALL-E.
  • You can ask it to modify them, but with mixed results.
  • DALL-E fails a lot.
  • ChatGPT may not be demonstrating Artificial General Intelligence, but it's got abstract expressionism down.
  • Its responses are uncomfortably close to those of the HAL 9000.

And there you go. Have you uploaded images to DALL-E? How has it done for you? Let us know in the comments below.


Copyright © 2014-2024 Hi-Network.com | HAILIAN TECHNOLOGY CO., LIMITED | All Rights Reserved.