- ChatGPT Plus now allows you to upload images for analysis, enabling identification of plants, animals, and more with multiple photos to narrow down results.
- ChatGPT can analyze complex diagrams like flowcharts or slides, explaining their contents and answering specific questions.
- You can utilize ChatGPT to sort information in images, such as listing books or movie titles alphabetically, proving it a useful tool for organization and accessibility.
ChatGPT Plus users now have the ability to upload images for the AI chatbot to analyze. With all the flexibility ChatGPT already had, giving it the ability to see the world offers a huge number of possibilities, starting with these.
Identifying Plant and Animal Life (And Most Things)
If you’re like me and love taking photos of plants and animals, then ChatGPT can now help you identify them, at least to a degree. I tested its ability with photos I’ve taken of spiders, and in general it was able to correctly tell the broad type of spider, if not the specific species.
The neat thing ChatGPT can do here, which you can’t do with Google Lens, for example, is narrow things down over multiple photos. In some of my testing, I provided more photos of the same spider over the course of the conversation, and ChatGPT seemed to use this additional information to get closer to the correct answer.
Understanding Complex Diagrams
If you’ve been given a flowchart or an overly dense and complicated set of PowerPoint slides, you can now use ChatGPT to make sense of it. Have it explain the contents of the image to you and answer any specific questions you have.
Here, I’ve taken a wonderful flowchart created by the University of Alberta, which describes whether something is in the public domain under Canadian law. Then I asked ChatGPT to use the flowchart to determine whether Alice in Wonderland qualifies.
This is a good time to remind everyone that ChatGPT is still prone to making mistakes, or simply making things up. So double-check the answers it gives you, or at the very least run the same analysis more than once in separate chat threads to see if you get consistent results.
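If you do run the same question in several separate chats, a few lines of Python can tally how often the answers agree. This is only a minimal sketch: it assumes you copy each chat’s one-line verdict by hand, and the sample verdicts and function names below are made up for illustration.

```python
from collections import Counter


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(answer.lower().split())


def agreement(answers: list[str]) -> tuple[str, float]:
    """Return the most frequent normalized answer and the share of runs that gave it."""
    if not answers:
        raise ValueError("need at least one answer")
    counts = Counter(normalize(a) for a in answers)
    top, n = counts.most_common(1)[0]
    return top, n / len(answers)


# Hypothetical verdicts copied by hand from three separate chats.
runs = ["Public domain", "public  domain", "Not public domain"]
verdict, share = agreement(runs)
print(f"{verdict!r} in {share:.0%} of runs")
```

A low share is a hint to check the source material yourself rather than trust any single answer.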
Here are two cool things I did with ChatGPT that have broad applications. First, I took a photo of my bookshelf and asked it to list all the books in alphabetical order. It did a great job, with the main limitation being the legibility of text in the photo, which comes down to you as the photographer and how good your camera is.
Next, I took a photo of our DVD/Blu-Ray shelf and asked ChatGPT to list all the titles alphabetically. It did this with perfect accuracy, which I suspect is down to taking a photo with much better legibility.
You can probably already think of a bunch of uses for this, but what immediately came to mind was finding things in my physical collection. For example, I asked ChatGPT where my copy of “Dune” was in the image, and it did a good job, apart from mistaking the top of the bookshelf for one of the shelves.
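If you want to double-check a list ChatGPT hands back, a short Python sketch can confirm the order for you. It assumes you paste the titles in as plain strings, and it ignores a leading “The”, “A”, or “An” when sorting, which is a common shelving convention but not necessarily the one ChatGPT follows. The titles below are just examples.

```python
def sort_key(title: str) -> str:
    """Lowercase the title and drop a leading article for sorting purposes."""
    lowered = title.strip().lower()
    for article in ("the ", "a ", "an "):
        if lowered.startswith(article):
            return lowered[len(article):]
    return lowered


def is_alphabetical(titles: list[str]) -> bool:
    """True if the titles are already in order under sort_key."""
    keys = [sort_key(t) for t in titles]
    return keys == sorted(keys)


# Example titles, typed in by hand for illustration.
titles = ["Alien", "The Matrix", "Dune", "A Quiet Place"]
ordered = sorted(titles, key=sort_key)

print(is_alphabetical(titles))
print(ordered)
```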
Various Accessibility Options
Combined with ChatGPT’s new voice chat capabilities in the mobile app, ChatGPT Plus’s image input abilities have turned it into a potent accessibility tool. You can take photos and send them straight to ChatGPT, and then use the voice chat mode (by tapping the headphone icon) to have a conversation about the image. So if you have visual problems that prevent you from parsing complicated scenes, but can still frame a photo, this could be a game changer.
If you use images on your website, or post images on social media platforms, you can also use this new feature of ChatGPT to write rich and descriptive ALT text. This is text that screen readers for visually-impaired users can use to provide descriptions of images. For the most part, ALT text is written manually. For example, both Facebook and X (formerly Twitter) let you add ALT text to images you post. If you care about accessibility or visually-impaired audiences, you can now have ChatGPT quickly draft a rich ALT text description and then simply check it for correctness.
Creating AI Image Prompts From Images
Coming up with prompts for AI image generation tools like Midjourney or DALL-E is harder than it sounds. Things are getting better, with DALL-E 3, for example, offering a much better understanding of natural language prompts, which means it sticks more precisely to what you ask for. Even then, not everyone is that detail-oriented. One cool thing you can do with ChatGPT’s image input ability is to ask it for a prompt based on an image you provide. So if there’s an AI-generated image you like, or any image really, you can ask it to write a prompt that reflects the contents of the image and use that as a starting point rather than a blank page.
Writing Based on AI Images
We can flip things around, and instead of asking for a prompt to generate images, ask ChatGPT to use images that we’ve generated using AI as inspiration for creative writing. In this case, I’ve generated some fantasy art, and then asked ChatGPT to come up with a story idea that goes with it. You could use this as a springboard for your own creativity.
The Tip of the Iceberg
These are just some of the lowest-hanging fruit when it comes to visual input in ChatGPT. I expect that over the coming days and weeks, creative users will come up with even more ways this can make life easier or let people get more done. Of course, I also expect some nefarious new uses will be part and parcel of that, but only time will tell. For now, geeks have a hot new toy to play with.