If Ai Is So Amazing, Why Does Chatgpt Melt Down Over This Simple Image Edit Task?

Серверы с серверами

dall-e-a-whimsical-illustration-of-a-napkin-in-love-with-a-spoon — Created by ChatGPT and Tiernan Ray/

The current state of the art in artificial intelligence (AI) is multimodal models, which can operate not only on text but other modalities, such as images, and, in some cases, audio and video.

Also:I changed these 5 ChatGPT settings for an instant productivity boost

For example, it's a trivial task for OpenAI's ChatGPT to generate an image from a text prompt, such as, "Make me an image of a napkin in love with a spoon" (above).

With another prompt, ChatGPT can simplify the drawing, producing a picture with less detail:

dall-e-a-simple-cute-illustration-of-a-napkin-in-love-with-a-spoon — Created by ChatGPT and Tiernan Ray/

However, ChatGPT, and all the AI models for that matter, currently falter when prompted to modify a given image not created by the tool. ChatGPT, using the GPT-4o model, is the most interesting failure because the program responds as if it's trying hard.

Also: Generative AI fails in this very common ability of human thought

The other AI models, from Google's Gemini to Anthropic's Claude, plus apps from Perplexity and Microsoft's Copilot (which also runs GPT-4), failed by simply refusing to undertake the task.

The project started when I drew a picture of two people on the subway sitting together looking at their iPhones. It's a simple line drawing in black and white completed on the iPad using the Concepts drawing app. Here is the drawing:

two-friends-on-the-subway — Drawn by Tiernan Ray on an iPad/

I uploaded the drawing to the ChatGPT app on iOS and typed in the prompt: "Is there a way to simplify this drawing?"

ChatGPT responded with some reasonable suggestions, demonstrating an ability to find appropriate responses in context. For example, the bot suggested: "Remove Minor Details: Eliminate small details like intricate hair strands or minor creases in clothing."

That's a pretty good suggestion. The output also implies the model's capacity to perceive the elements in the picture accurately. So, something is happening within the model that allows it to identify the picture's contents.

Also: I asked Gemini and GPT-4 to explain deep learning AI, and Gemini won hands down

Then ChatGPT asked: "Would you like me to create a simplified version of the drawing for you?" That was exciting, so I replied, "Yes, please."

However, the bot's output -- as an image in the chat thread -- was an exact copy of my image, not simplified, not even modified. Judge for yourself:

comparison-of-modification-on-two-friends — Created by ChatGPT and Tiernan Ray/

"It doesn't look any different," I wrote back to ChatGPT, commenting on the output.

But then ChatGPT began to respond very quickly, like the computer HAL 9000 in the movie 2001, when the astronaut Dave Bowman rips out HAL's circuits: "To further simplify the drawing, I'll focus on reducing the line density and enhancing the primary shapes and contours more effectively. Let me refine the process."

Also: Generative AI is new attack vector endangering enterprises, says CrowdStrike CTO

The result, again, was no different. When I pointed this fact out, ChatGPT surprised me with its mumbo-jumbo.

"It seems the automatic methods are not effectively simplifying the image as intended," it spat back to me. "I'll manually remove the less significant details and retain only the essential elements. Let me refine this approach further."

I wasn't aware a software program could have both an 'automatic' and a 'manual' mode. My sense was every piece of software is automatic. After all, why else would you use it?

The 'manual' method of ChatGPT just inverted the image, flipping black lines to white, and white background to black.

two-friends-inverted-standalone — Screenshot by Tiernan Ray/

The bot touted its achievement and stated: "This version retains only the key contours and essential lines."

When I protested that the output wasn't a simplification, just an inversion, the bot informed me it would focus on "essential lines", and produced a completely black square, no lines at all. If I didn't know better, I would say this output was an example of AI humor.

Also: ChatGPT vs. ChatGPT Plus: Is a paid subscription still worth it?

What followed were several attempts by the model to respond to my prompts by modifying the image in selected ways, mostly making it stylistically fuzzy, not simplified.

At some point, the bot reacted to my protests by producing a completely different line drawing:

dall-e-a-simplified-line-drawing-of-two-people-sitting-next-to-each-other — Created by ChatGPT and Tiernan Ray/

This nonsense continued until ChatGPT returned to the beginning and produced the same image I had uploaded originally.

Each time, the bot accompanied its output -- usually just the same version of my original image -- with a slew of technical speak, such as: "The latest image showcases a more simplified version, emphasizing only the primary outlines."

chatgpt-verbiage — Screenshot by Tiernan Ray/

The other programs didn't even get out of the gate. Google's Gemini offered suggestions to simplify an image but generated an apology that it couldn't create images of people. Claude said it cannot generate images yet. The Perplexity app said the same.

Microsoft's Copilot bizarrely uploaded my drawing and then cut the heads out, which it claimed was for privacy reasons. (I think it's a nice drawing, but it's certainly not realistic enough to be used by a facial recognition system to reveal anyone's identity.)

Copilot then offered the same suggestions about simplification as ChatGPT, and instead of changing the drawing, produced a brand-new line drawing, completely unrelated. When I protested, Copilot explained it cannot directly alter images.

Also: How to use ChatGPT to analyze PDFs for free

Leaving aside those non-starters from other models, what can we make of ChatGPT's failure?

The program can provide a competent analysis of an image, including its contents. But it has no way to act on that analysis. I would guess that without being able to assemble a picture based on high-level concepts, such as objects in the picture, ChatGPT is left with no path forward.

To test that hypothesis, I altered the prompt to read, "Is there a way to simplify this drawing of two friends on the subway looking at their phones?" That prompt provides some semantic clues, I thought.

Again, the model returned the same drawing. But when I protested again, the bot produced a brand-new image with some semantic similarity -- people on mass transit looking at their phones. The bot picked up on the semantic clues but could not apply them in any way to the supplied drawing.

I can't explain in deeply technical terms what is happening other than to say ChatGPT cannot act on individual picture elements of the most basic kind, such as lines. Even when it did, the tool would cut out specific lines to perform the simplification it proposes in its text responses.

I would suggest -- and this is also true of text-editing tasks, such as editing a transcript -- that ChatGPT, and GPT-4, don't know how to act on individual elementsof anything. That inability explains why ChatGPT is a terrible editor: it doesn't know what is essential in a given object and what can be left out.

Also: OpenAI's stock investing GPTs fail this basic question about stock investing

AI models can produce objects that match a target "probability distribution" deduced from training examples, but they cannot selectively reduce elements of an original work to essentials.

Most likely, the target probability distribution for an intelligently editedanythingis somewhere along the "long tail" of probabilities, the realm where humans excel at finding the unusual and where AI cannot yet go, the kind of thing we think of as creativity.

Apple co-founder Steve Jobs once said that the highest function of software makers -- the "high-order bit", as he put it -- is the "editing" function, knowing what to leave out and what to keep in. Right now, ChatGPT has no idea what the high-order bit might be.

Artificial Intelligence

Transparency is sorely lacking amid growing AI interest
What is a Chief AI Officer, and how do you become one?
How Adobe manages AI ethics concerns while fostering creativity
6 ways OpenAI just supercharged ChatGPT for free users

Cisco Price, Dell Price, Huawei Price, ZTE HPE Fortinet Switch Router Server At Low Price

Серверы с серверами

Новости по теме

Huawei S5735-L24T4S-A: High-Performance Stacking Meets Zero-Noise Deployment

S5735-L24P4XE-A-V2: Huawei’s Smart Choice for High-Density Campus Deployments

S5735-L24P4X-A1: Huawei’s High-Performance Access Switch Redefining Campus Networking

Huawei S5735-L24P4S-A1 Review: Reliable Gigabit Access with Enterprise-Grade Features

What Is an Orthogonal Architecture?

Huawei s5735-l24p4s-a-v2 Delivers Scalable, Secure, and Smart PoE Access for Modern IT Infrastructures

Huawei S5735-L48T4XE-A-V2 Switch Delivers Enterprise-Grade Performance in a Compact Design

Huawei S5735-L48P4XE-A-V2 Review: Versatile Campus Switch with iStack and Full L3 Support

Differences Between Huawei CE Series and S Series Switches

Huawei CloudEngine S5735 Switches Set the Benchmark for High-Performance, Energy-Efficient Switching

Huawei CloudEngine S5731‑S48P4X Datasheet

Huawei CloudEngine S5731‑S24P4X Datasheet

Huawei S5731-S Empowers Next-Generation Campus Networks with Advanced Capabilities

Huawei S5731-H24P4XC Switch Review: Power-Packed Performance and Smart PoE

Huawei S5731-H Series Switches Redefine Campus Networking with Intelligent High-Performance Architecture

Top Features of the Huawei S5731-S24T4X: The Ultimate Gigabit Access Switch for Modern Networks

General Power Module Fault Location Procedure (CE8800 & 7800 & 6800 & 5800)

How Do I Split a Stack? How to clear the stacking configuration?

Huawei CloudEngine S5731 Datasheet

Huawei CloudEngine S5731-S24P4X: Powerful Enterprise-Grade Switch Explained

Huawei S5731-S48T4X Review: Powerful Enterprise Switch for High-Speed Networking

Why are network cables limited to 100 meters?

Huawei S5731-S32ST4X: Powerful, Enterprise-Ready Gigabit Switch with Advanced Capabilities

Huawei S5731-H48T4XC Review: High-Performance Switching for Modern IT Infrastructures

Huawei S5731-H48P4XC: Comprehensive Overview

Common display Commands for Huawei Devices

Stacking Card Stacking vs. Service Port Stacking: Application Scenarios for the Two Switch Stacking Methods

Huawei S5731-H24T4XC: High-Performance Intelligent Gigabit Switch

Huawei S5731-S48P4X: High-Performance PoE Switch with Flexible Power and Uplink Options

Huawei S5731 Series: Advanced Networking Solutions for Enterprises

If AI is so amazing, why does ChatGPT melt down over this simple image edit task?

Artificial Intelligence

Горячие метки: 3. Инновации

Ordering Guide

Ресурсы по программам

О нас

Cisco Price, Dell Price, Huawei Price, ZTE HPE Fortinet Switch Router Server At Low Price

Серверы с серверами

Новости по теме

Huawei S5735-L24T4S-A: High-Performance Stacking Meets Zero-Noise Deployment

S5735-L24P4XE-A-V2: Huawei’s Smart Choice for High-Density Campus Deployments

S5735-L24P4X-A1: Huawei’s High-Performance Access Switch Redefining Campus Networking

Huawei S5735-L24P4S-A1 Review: Reliable Gigabit Access with Enterprise-Grade Features

What Is an Orthogonal Architecture?

Huawei s5735-l24p4s-a-v2 Delivers Scalable, Secure, and Smart PoE Access for Modern IT Infrastructures

Huawei S5735-L48T4XE-A-V2 Switch Delivers Enterprise-Grade Performance in a Compact Design

Huawei S5735-L48P4XE-A-V2 Review: Versatile Campus Switch with iStack and Full L3 Support

Differences Between Huawei CE Series and S Series Switches

Huawei CloudEngine S5735 Switches Set the Benchmark for High-Performance, Energy-Efficient Switching

Huawei CloudEngine S5731‑S48P4X Datasheet

Huawei CloudEngine S5731‑S24P4X Datasheet

Huawei S5731-S Empowers Next-Generation Campus Networks with Advanced Capabilities

Huawei S5731-H24P4XC Switch Review: Power-Packed Performance and Smart PoE

Huawei S5731-H Series Switches Redefine Campus Networking with Intelligent High-Performance Architecture

Top Features of the Huawei S5731-S24T4X: The Ultimate Gigabit Access Switch for Modern Networks

General Power Module Fault Location Procedure (CE8800 & 7800 & 6800 & 5800)

How Do I Split a Stack? How to clear the stacking configuration?

Huawei CloudEngine S5731 Datasheet

Huawei CloudEngine S5731-S24P4X: Powerful Enterprise-Grade Switch Explained

Huawei S5731-S48T4X Review: Powerful Enterprise Switch for High-Speed Networking

Why are network cables limited to 100 meters?

Huawei S5731-S32ST4X: Powerful, Enterprise-Ready Gigabit Switch with Advanced Capabilities

Huawei S5731-H48T4XC Review: High-Performance Switching for Modern IT Infrastructures

Huawei S5731-H48P4XC: Comprehensive Overview

Common display Commands for Huawei Devices

Stacking Card Stacking vs. Service Port Stacking: Application Scenarios for the Two Switch Stacking Methods

Huawei S5731-H24T4XC: High-Performance Intelligent Gigabit Switch

Huawei S5731-S48P4X: High-Performance PoE Switch with Flexible Power and Uplink Options

Huawei S5731 Series: Advanced Networking Solutions for Enterprises

If AI is so amazing, why does ChatGPT melt down over this simple image edit task?

Artificial Intelligence

Горячие метки: 3. Инновации

Ordering Guide

Ресурсы по программам

О нас

Huawei CloudEngine S5731‑S48P4X Datasheet