I'm still not sure I understand Anthropic's general strategy right now. They are...

bobbylarrybobby · 2026-02-06T05:18:13 1770355093

I really like that Claude feels transactional. It answers my question quickly and concisely and then shuts up. I don't need the LLM I use to act like my best friend.

endymion-light · 2026-02-06T10:40:07 1770374407

I love doing a personal side project code review with claude code, because it doesn't beat around the bush for criticism.

I recently compared reviews of a data processor class I wrote for a side project, which had quite horrible temporal coupling.

Gemini - ends up rating it a 7/10, some small bits of feedback etc

Claude - Brutal dismemberment of how awful the naming convention, structure, coupling etc, provides examples how this will mess me up in the future. Gives a few citations for python documentation I should re-read.

ChatGPT - you're a beautiful developer who can never do anything wrong, you're the best developer that's ever existed and this class is the most perfect class i've ever seen

majora2007 · 2026-02-06T15:07:56 1770390476

This is exactly what got me to actually pay. I had a side project with an architecture I thought was good. Fed it into Claude and ChatGPT. ChatGPT made small suggestions but overall thought it was good. Claude shit all over it and after validating its suggestions, I realized Claude was what I needed.

I haven't looked back. I just use Claude at home and ChatGPT at work (no Claude). ChatGPT at work is much worse than Claude in my experience.

Willish42 · 2026-02-06T22:11:20 1770415880

I feel like this anecdote represents the differing incentives / philosophies of each group rather well.

I've noticed ChatGPT is rather high in its praise regardless of how valuable the input is, Gemini is less placating but still largely influenced by the perspective of the prompter, and Claude feels the most "honest" but humans are rather poor at judging this sort of thing.

Does anyone know if "sycophancy" has documented benchmarks the models are compared against? Maybe it's subjective and hard to measure, but given the issues with GPT 4o, this seems like a good thing to measure model to model to compare individual companies' changes as well as compare across companies.

endymion-light · 2026-02-09T11:09:49 1770635389

The issue i think is that to model sycophancy you'd need another model that can address signs of sycophancy - it's turtles all the way down

andkenneth · 2026-02-06T06:18:27 1770358707

Weirdly I feel like partially because of this it feels more "human" and more like a real person I'm talking to. GPT models feel fake and forced, and will yap in a way that is like they're trying to get to be my friend, but offputting in a way that makes it not work. Meanwhile claude has always had better "emotional intelligence".

Claude also seems a lot better at picking up what's going on. If you're focused on tasks, then yeah, it's going to know you want quick answers rather than detailed essays. Could be part of it.

8note · 2026-02-10T22:42:07 1770763327

as a problem, it means you need a ralph loop on top of it, if you want it to finish a problem without it waiting on a checkpoint

apples_oranges · 2026-02-06T08:25:21 1770366321

fyi in settings, you can configure chatGPT to do the same

matkoniecz · 2026-02-06T08:50:08 1770367808

where?

maxbond · 2026-02-06T09:33:13 1770370393

Settings > Personalization > Custom Instructions.

Here's what I use:

    WE ARE PROFESSIONALS. DO NOT FLATTER ME. BE BLUNT AND FORTHRIGHT.

cryptoegorophy · 2026-02-06T06:48:55 1770360535

Then why are they advertising to people that are the complete opposite of you? Why couldn’t they just … ask an LLM what their target audience is?

tsss · 2026-02-06T09:11:06 1770369066

Quickly and concisely? In my experience, Claude drivels on and on forever. The answers are always far longer than Gemini's, which is mostly fine for coding but annoying for planning/questions.

tgtweak · 2026-02-05T18:09:50 1770314990

Claude itself (outside of code workflows) actually works very well for general purpose chat. I have a few non-technical friends that have moved over from chatgpt after some side-by-side testing and I've yet to see one go back - which is good since claude circa 8 months ago was borderline unusable for anything but coding on the api.

pattar · 2026-02-06T16:46:09 1770396369

I got my partner using claude for her non technical work. They write a lot of proposals, create spreadsheets, and occasionally want some graphs to visualize things. They love that claude creates all of the artifacts right there in the browser and saves them for later in a versioned way.

Squarex · 2026-02-05T20:59:21 1770325161

Claude sucks at non English languages. Gemini and ChatGPT are much better. Grok is the worst. I am a native Czech speaker and Claude makes up words and Grok sometimes respond in Russian. So while I love it for coding, it’s unusable for general purpose for me.

JV00 · 2026-02-06T13:07:29 1770383249

I tried coding in Italian with Claude and it sounds somewhat less professional than in English. Like it uses different language than what you would expect in the context. In the end I felt the result on the work per se was pretty much the same, just its comments sound strange. Thinking about it again, it's probably because Italian developers don't really speak pure Italian between themselves, we use a lot of English words or distorted Italianised English words when talking about software engineering because all the source material we refer to is written in English and for many things we don't even have translations. Then you talk with an LLM and it actually tries to use proper Italian, when human speakers gave up long ago. So it sounds like a humanities scholar talking about software engineering, not like an insider. It is quite entertaining. I wouldn't say it sucks with non English languages by the way, I even tried describing a bug in dialect and was amused that Claude code one-shotted the fix!

Squarex · 2026-02-06T14:08:03 1770386883

yeah, i overextrapolated it on my specific case on the czech language, but for me the difference is quite large and the czech internet has been quite active in the history, the computer linguistic department on the charles university is world tier... there is plenty of czech literature. it should not be that much of a problem to be proficient in it for major labs

9dev · 2026-02-05T21:05:55 1770325555

> Grok sometimes respond in Russian

Geopolitically speaking this is hilarious.

Squarex · 2026-02-05T21:58:47 1770328727

The voice mode sounded like a Ukrainian trying to speak Czech. I don’t think it means anything.

deaux · 2026-02-06T03:04:11 1770347051

You mean Claude sucks at Czech. You're extrapolating here. I can name languages that Claude is better at than GPT.

Gemini is the most fluent in the highest number of human languages and has been for years (!) at this point - namely since Gemini 1.5 Pro, which was released Feb 2024. Two years ago.

Squarex · 2026-02-06T05:55:41 1770357341

Yeah, sure, I was overly generalising it from one experience.

kuboble · 2026-02-05T21:58:15 1770328695

Claude code (opus) is very good in Polish.

I sometimes vibe code in polish and it's as good as with English for me. It speaks a natural, native level Polish.

I used opus to translate thousands of strings in my app into polish, Korean, and two Chinese dialects. Polish one is great, and the others are also good according to my customers.

Squarex · 2026-02-06T05:58:58 1770357538

> I sometimes vibe code in polish

This is interesting to me. I always switch to English automatically when using Claude Code as I have learned software engineering on an English speaking Internet. Plus the muscle memory of having to query google in English.

kuboble · 2026-02-06T12:33:32 1770381212

English is also default for me.

I mostly use Polish when I pair-vibe-code with my kids

koakuma-chan · 2026-02-06T02:27:39 1770344859

You could say its Polish is polished.

altern8 · 2026-02-06T00:35:41 1770338141

Your game is amazing!

I wish there was a "Reset" button to go back to the original position.

Where are you in Poland?

kuboble · 2026-02-06T04:34:37 1770352477

Thanks :) Click "Level" -> "Try again"

Originally from Wrocław, but don't live in Poland anymore

altern8 · 2026-02-06T09:55:13 1770371713

Ah, I'm originally from Italy and living in Wroclaw now, LOL.

BUT, I meant a button to restart after a few moves. Anyways, cool!

kuboble · 2026-02-06T12:24:53 1770380693

Yes, that's what I'm referring to https://kuboble.com/hn/level_try_again.mp4

altern8 · 2026-02-07T15:55:47 1770479747

Ah, I see.

But how would I know that I have to click on the level? I would expect that to live next to "Undo".

Just saying :-)

jorl17 · 2026-02-05T22:07:17 1770329237

Claude is quite good at European Portuguese in my limited tests. Gemini 3 is also very good. ChatGPT is just OK and keeps code-switching all the time, it's very bizarre.

I used to think of Gemini as the lead in terms of Portuguese, but recently subjectively started enjoying Claude more (even before Opus 4.5).

In spite of this, ChatGPT is what I use for everyday conversational chat because it has loads of memories there, because of the top of the line voice AI, and, mostly, because I just brainstorm or do 1-off searches with it. I think effectively ChatGPT is my new Google and first scratchpad for ideas.

khendron · 2026-02-06T01:01:47 1770339707

Claude is helping me learn French right now. I am using it as a supplementary tutor for a class I am taking. I have caught it in a couple of mistakes, but generally it seems to be working pretty well.

eaf7e281 · 2026-02-05T18:16:31 1770315391

I kinda agree. Their model just doesn't feel "daily" enough. I would use it for any "agentic" tasks and for using tools, but definitely not for day to day questions.

lukebechtel · 2026-02-05T18:22:40 1770315760

Why? I use it for all and love it.

That doesn't mean you have to, but I'm curious why you think it's behind in the personal assistant game.

legitster · 2026-02-05T18:41:54 1770316914

I have three specific use cases where I try both but ChatGPT wins:

- Recipes and cooking: ChatGPT just has way more detailed and practical advice. It also thinks outside of the box much more, whereas Claude gets stuck in a rut and sticks very closely to your prompt. And ChatGPT's easier to understand/skim writing style really comes in useful.

- Travel and itinerary: Again, ChatGPT can anticipate details much more, and give more unique suggestions. I am much more likely to find hidden gems or get good time-savers than with Claude, which often feels like it is just rereading Yelp for you.

- Historical research: ChatGPT wins on this by a mile. You can tell ChatGPT has been trained on actual historical texts and physical books. You can track long historical trends, pull examples and quotes, and it even gives you specific book or page(!) references of where to check the sources. Meanwhile, all Claude will give you is a web search on the topic.

aggie · 2026-02-05T19:59:43 1770321583

How does #3 square with Anthropic's literal warehouse full of books we've seen from the copyright case? Did OpenAI scan more books? Or did they take a shadier route of training on digital books despite copyright issues, but end up with a deeper library?

legitster · 2026-02-05T22:27:31 1770330451

I have no idea, but I suspect there's a difference between using books to train an LLM and be able to reproduce text/writing styles, and being able to actually recall knowledge in said books.

rolisz · 2026-02-05T20:10:09 1770322209

I think they bought the books after they were caught pirating the books and lost that case (because they pirated, not because of copyright).

FergusArgyll · 2026-02-06T00:36:41 1770338201

My 2 cents:

All the labs seem to do very different post training. OpenAI focuses on search. If it's set to thinking, it will search 30 websites before giving you an answer. Claude regularly doesn't search at all even for questions it obviously should. Its post training seems more focused on "reasoning" or planning - things that would be useful in programming where the bottleneck is: just writing code without thinking how you'll integrate it later and search is mostly useless. But for non coding - day to day "what's the news with x" "How to improve my bread" "cheap tasty pizza" or even medical questions, you really just want a distillation of the internet plus some thought

eaf7e281 · 2026-02-05T21:25:55 1770326755

It's hard to say. Maybe it has to do with the way Claude responds or the lack of "thinking" compared to other models. I personally love Claude and it's my only subscription right now, but it just feels weird compared to the others as a personal assistant.

lukebechtel · 2026-02-06T00:07:34 1770336454

Oh, I always use opus 4.5 thinking mode. Maybe that's the diff.

quietsegfault · 2026-02-05T22:40:07 1770331207

Claude is far superior for daily chat. I have to work hard to get it to not learn how to work around various bad behaviors I have but don’t want to change.

solarkraft · 2026-02-05T18:43:14 1770316994

But that’s what makes it so powerful (yeah, mixing model and frontend discussion here yet again). I have yet to see a non-DIY product that can so effortlessly call tens of tools by different providers to satisfy your request.

dimgl · 2026-02-06T01:49:01 1770342541

I don't get what's so difficult to understand. They have ambitions beyond just coding. And Claude is generally a good LLM. Even beyond just the coding applications.

int_19h · 2026-02-06T09:08:49 1770368929

I suspect it very much depends on the "generic research topics", but in my experience one thing that Claude is good at is in-depth research because it can keep going for such a long time; I've had research sessions go well over an hour, producing very detailed reports with lots of sources etc. Gemini Deep Research is nowhere even close.

zaphirplane · 2026-02-06T14:09:45 1770386985

Correct me if I’m wrong, but aren’t they the innovators of multiple things like skills, sub agents, mcp, and whatever this memory thing is (agents files)?

Seriously, they are to LLMs what the Apple iPhone or AWS was a decade or so ago.

redox99 · 2026-02-06T00:33:33 1770338013

Why would I even use Claude for asking something on their web, considering that chips away at my claude code usage limit?

Their limit system is so bad.

faxmeyourcode · 2026-02-06T14:21:58 1770387718

Everybody is different, I simply cannot stand the sight of chatgpt styled writing. Give me paragraphs.

derwiki · 2026-02-06T00:14:01 1770336841

It feels very similar to how Lyft positioned themselves against Uber. (And we know how that played out)

fnordpiglet · 2026-02-06T05:49:47 1770356987

Enterprise, government, and regulated institutions. It’s also the de facto standard for programming assistants at most places. They have a better story around compliance, alignment, task based inference, agentic workflows, etc. Their retail story is meh, but I think their view is to be the AWS of LLMs while OpenAI can be the retail and Gemini the whatever Google does with products.

dev1ycan · 2026-02-06T07:37:15 1770363435

Their "constitution" is just garbage meant to defend them ripping off copyrighted material with the excuse that "it's not plagiarizing, it thinks!!!!1" which is false.

handoflixue · 2026-02-06T07:48:36 1770364116

I don't recall them ever offering that legal reasoning - I'm sure you can provide a citation?

dev1ycan · 2026-02-07T13:36:26 1770471386

Did using LLMs too much remove your ability to critically think too?

handoflixue · 2026-02-09T03:02:30 1770606150

Just to be clear: you're mad because your "critical thinking" led you to a spurious argument that you disagree with, and that they never actually made?

You explicitly said: "the excuse that "it's not plagiarizing, it thinks!!!!1"", and it seems rather relevant that they've never actually used that excuse.