I asked Claude, ChatGPT and Gemini to create a packing app and saw who really understood me.


As much as I love being in front of a computer screen for fourteen hours or more each day, in some cases I’m still a stickler for pen and paper. One of those cases, of course, is packing lists. I tend to travel once every two months, whether for a couple of days or a little longer, and the one thing I can never seem to get right is preparing my lists.

Sure, I have some checklists in Google Keep, but nothing beats the pen and paper method. Unfortunately, the catch with that approach is that I have to create a new list each time, which is why I finally decided to see whether Claude, ChatGPT and Gemini could make the whole process frictionless with a React-based packing app that I could load up whenever I had suitcases to fill.



Claude went first and definitely set the bar.

I hardly bothered with the rest of them.

To start, my prompt had to be the same across all three LLMs I tried. At the same time, I had to be extremely detailed, as it was the only way to ensure that I got the most out of my prompt and that the LLM had as much information as possible to work with while creating my planner. The one thing I knew not to do was treat Claude as a magic box; instead, I would talk to it as a collaborator. As such, I had to be intentional with my prompt and design.

I approached Claude the same way you would brief a real developer. I started by defining what I wanted, my packing habits, my preference for a clean user interface, and the fact that I absolutely love cats. I then reinforced the structure by separating the duffle bag from my other bag, and even defined how the items should be arranged dynamically. The fun part, of course, was the giant SVG cat mascot that changes as the packing progresses. Among the three LLMs, Claude clearly did the best job with the cat. I even made sure to let the LLM ask me a few clarifying questions before it started building the app, which Claude did brilliantly: three precise questions that got to the heart of what I wanted.

The result, then, was a brilliant piece of software that asked me if I was going on a day trip or an overnight trip. It segmented items into sections, and each item had a triple-tap status: unpacked, packed, and double-checked. The entire time, a sticky status bar at the top kept track of my progress as I moved through the sections. Without a doubt, my favorite feature has been the triple-tap check for each item, which feels like a godsend when I’m about to leave an hour before departure. Clearly, Claude knew what it was doing, and then some. The end result almost made me stop the experiment because I knew I had gotten exactly what I needed.
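For the curious, here is a minimal sketch of how a triple-tap status and a progress percentage could be modeled in TypeScript. This is not the code Claude generated, which isn't reproduced here; every name below (`PackStatus`, `PackingItem`, `tapItem`, `progressPercent`) is hypothetical, and two behaviors are my own assumptions: that a fourth tap wraps back to unpacked, and that only double-checked items count toward the progress figure.

```typescript
// Illustrative sketch only: all names here are hypothetical,
// not taken from any of the generated apps.
type PackStatus = "unpacked" | "packed" | "double-checked";

interface PackingItem {
  name: string;
  section: string;
  status: PackStatus;
}

// Assumption: tapping a double-checked item resets it to "unpacked".
const NEXT_STATUS: Record<PackStatus, PackStatus> = {
  "unpacked": "packed",
  "packed": "double-checked",
  "double-checked": "unpacked",
};

// Returns a new array with the tapped item advanced by one status.
// The input array is left untouched, which suits React state updates.
function tapItem(items: PackingItem[], name: string): PackingItem[] {
  return items.map((item) =>
    item.name === name ? { ...item, status: NEXT_STATUS[item.status] } : item
  );
}

// Assumption: only "double-checked" items count as done.
// Returns a whole-number percentage, or 0 for an empty list.
function progressPercent(items: PackingItem[]): number {
  if (items.length === 0) return 0;
  const done = items.filter((item) => item.status === "double-checked").length;
  return Math.round((done / items.length) * 100);
}
```

In a React component, `tapItem` would typically be called from an item's click handler with the result passed to a state setter, and `progressPercent` would feed the sticky status bar.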



ChatGPT fell short, but not by much

It couldn’t quite match Claude, though.

At that point, I knew exactly what to ask ChatGPT to create for me, and OpenAI’s GPT-5.5 needed to beat Sonnet 4.6, or at least compete head-to-head. With a master prompt ready, I went ahead and asked ChatGPT to create a similar planner for me under similar conditions, and asked it to make sure my beloved triple-tap feature stayed in place throughout the planner. This time, the only difference would be in the color palette and theme, over which I asked GPT to exercise full control.

Since I had made cats an integral part of all of this, ChatGPT needed to get the graphics right. It managed to give me an aesthetic that was easier on the eyes, and the warmer pastel tones certainly looked prettier. However, it didn’t have the polish that Claude’s version had. It still landed in second place, though I could see a case for choosing ChatGPT’s version of my packing planner over Claude’s, simply because it worked just as brilliantly but had a completely different design philosophy.

The questions GPT-5.5 asked me once it got the main prompt were precise and brief, and after a single answer to each of the five questions it asked, I was left looking at a rather endearing planner that I can’t wait to use again, perhaps for a different trip.



Gemini completely lost the plot

Unpolished and hard to look at.

By the time I moved to Gemini, I thought I had the formula locked. The prompt had already produced two really usable planners, so all Gemini really needed to do was follow the instructions and inject a little personality of its own into the experience, right? Instead, what I got looked like a small HTML document coded in Notepad by a student who barely passed the class. The design was clean, sure, but only because it lacked color, personality, or even graphics, save for the lone cat at the top. I had spent paragraphs defining the application, but Gemini seemed determined to give me the bare minimum.

At that point, I wasn’t even interested in modifying the prompt or telling Gemini what it had done wrong, especially considering how impressive its rivals had been so far. In fact, the most interesting part of all this is that Gemini was the only LLM that told me it didn’t need to ask any clarifying questions, and it went ahead and created the React app.

The result was an app that even a mother would find hard to love. None of the critical pre-trip tasks had a line break before them. In each bag, every item simply had “CRITICAL” and “UNPACKED” written right next to it. Next, the percentage progress bar I specifically asked for simply didn’t appear, with just a little bit of text counting up and down as I checked each box. Plus, it didn’t even stay sticky like I needed it to. None of the items or checkboxes looked clickable, and all the app really did was attach an emoji that I didn’t even know I had to click on.

This was hard to swallow, but I had to make sure I didn’t give Gemini additional instructions or point out the long list of things it had done wrong. Claude had set the bar immensely high right off the bat, and ChatGPT followed suit. Gemini, on the other hand, simply read the instructions and wrote code, as if it expected to finish the job and get by on weaponized incompetence.


There is a difference between generating code and understanding people

The future will belong to the tool that understands intention and small human annoyances.

We’ve spent the better part of two years pretending that all the major LLMs are playing roughly the same game, but after watching these three tackle the exact same task, I’m no longer convinced that’s true at all. While Claude and ChatGPT felt like collaborators I was working with, it seemed like Gemini had just read the assignment five minutes before the exam.

Clearly, the future of these tools will not belong to the chatbot that can generate the most code the fastest. Instead, it will belong to those who truly understand the intent, the rhythm, the design language, and the little human annoyances that make or break software. Claude achieved that balance perfectly, while ChatGPT came impressively close with a softer, more fun identity of its own. Meanwhile, Gemini never really understood the task in the first place.


