> Like most skeptics and critics, I use these tools daily. And 50% of the time they work 50% of the time.
I've been using LLMs nearly every day for my job for about a year now, and they solve my issues about 90% of the time. I have a hard time deciphering whether these kinds of complaints about AI/LLMs should be taken seriously or written off as irrational use patterns by some users. For example, I have never fed an LLM a codebase and expected it to work magic. I ask direct, specific questions at the edge of my understanding (not beyond it) and apply the solutions in a deliberate, testable manner.
If you're taking a different approach and complaining about LLMs, I'm inclined to think you're doing it wrong, and missing out on the actual magic, which is small, useful, and fairly consistent.
Hmm. OK, so you're basically quoting the line from Anchorman: "60% of the time, it works every time."
I also use GPT and Claude daily via Cursor.
GPT o3 is decent for general knowledge searches. Claude falls down all the time, but I've noticed that while it's spending tokens to jerk itself off, it quite often happens upon the actual issue without recognizing it.
Models are dumb, more idiot than idiot savant, but sometimes they hit on relevant items. As long as you personally have an idea of what you need to happen and treat LLMs like rat terriers in a farm field, you can use them properly.
I just went through the last 10 chat titles and all of them were spot on for me. Maybe the person you’re responding to has a different experience than you do and calling their perspective “suspect” is somewhat uncharitable.
(There are times I do other kinds of work and it fails terribly. My main point stands.)
You're doing the same thing the article warns against. Some people claim miraculous results, while the reality for most is far less successful. But maybe you keep rolling the LLM dice and you keep winning? I personally don't like gambling with my time and energy, especially when I know the rules of the game are so iffy.
Nah, I'm all over the place. I said the last 10 to check whether the 90% claim could be true if you do what I've done recently: use it for tons of little general ad hoc things rather than, say, code needing serious accuracy.
I don't "trust" it the way I'd trust a smart colleague. We know how this works: use it for info that has a lot of results, or ask it to ground itself if it's, say, new info and you can't rely on training memory. Asking it about esoteric languages or algos or numbers will just make you sad: it will generate 1000 confident tokens. But if you told me to lose Google or ChatGPT+Claude, Google is getting dumped instantly.
It ranged from whether an Epic V10 Sport surf ski was a good fit for a newbie, to Entra ID questions, to local data residency compliance laws, new Jira alternatives, why schools ask for closed shoes, and a text-to-speech tool search. For many of these I use, e.g., o4-mini-high, because I want it to ground itself: find material and compile something for me, but get me an answer fairly quickly.
Ones that didn't work but weren't in the last 10: voice. It sounds amazing but is dumb as rocks; it feels like most of the GPU compute goes to the media, not the smarts. A question about floating solar heaters for pools: it fed me scam material. A question about negotiating software pricing: just useless, it parroted my points back at me.
I scale models up and down based on need. Very simple: GPT-4o. Smarts: o4-mini-high. Research: Deep Research. I love Claude, but at some point it kept running out of capacity, so I'd move elsewhere, although nothing beats it for Artifacts. MS Copilot if I want a quick answer to something MS-oriented. It's terrible, but it's easy to access.
Coding is generally Windsurf, but honestly that's been rare for the last month. I've been too busy doing other things.
It either helps me find a solution or it doesn't. About 90% of the time, or less formally I would just say "almost all of the time", it does. Keep in mind that I, the user, decide which questions to ask in the first place. If my batting average seems unbelievably high, perhaps my skill is in knowing when to use an LLM and when not to.
Okay, well, your vague response suggests to me that you aren't asking the LLM anything important at all, and most likely it's things that could have appeared on the first page of a Google search. So, sure, 90% of the time it's going to give you the top Google result. The other 10% were probably best answered by the top Google result too, but the LLM chose to hallucinate instead. Is that really better?