> Like most skeptics and critics, I use these tools daily. And 50% of the time they work 50% of the time.
I've been using LLMs nearly every day for my job for about a year now, and they solve my issues about 90% of the time. I have a hard time deciphering whether these kinds of complaints about AI/LLMs should be taken seriously or written off as irrational use patterns by some users. For example, I have never fed an LLM a codebase and expected it to work magic. I ask direct, specific questions at the edge of my understanding (not beyond it) and apply the solutions in a deliberate, testable manner.
If you're taking a different approach and complaining about LLMs, I'm inclined to think you're doing it wrong, and missing out on the actual magic, which is small, useful, and fairly consistent.
Hmm. OK, so you're basically quoting the line from Anchorman: "60% of the time, it works every time."
I also use GPT and Claude daily via Cursor.
GPT o3 is decent for general knowledge searches. Claude falls down all the time, but I've noticed that while it's spending tokens to jerk itself off, it quite often happens upon the actual issue without recognizing it.
Models are dumb, more idiot than idiot savant, but sometimes they hit on relevant items. As long as you personally have an idea of what you need to happen and treat LLMs like rat terriers in a farm field, you can use them properly.
I just went through the last 10 chat titles and all of them were spot on for me. Maybe the person you’re responding to has a different experience than you do and calling their perspective “suspect” is somewhat uncharitable.
(There are times I do other kinds of work and it fails terribly. My main point stands.)
You're doing the same thing the article warns against. Some people claim miraculous results, while the reality for most is far less successful. But maybe you keep rolling the LLM dice and you keep winning? I personally don't like gambling with my time and energy, especially when I know the rules of the game are so iffy.
Nah, I'm all over the place. I said the last 10 to check whether the 90% claim could be true if you do what I've done recently: use it for tons of little general ad hoc things rather than, say, code needing serious accuracy.
I don't "trust" it the way I'd trust a smart colleague. We know how this works: use it for info that has a lot of results, or ask it to ground itself if it's, say, new info and you can't rely on training memory. Asking it about esoteric languages or algos or numbers will just make you sad: it will generate 1000 confident tokens. But if you told me to lose Google or ChatGPT+Claude, Google is getting dumped instantly.
It ranged from whether an Epic V10 Sport surf ski was a good fit for a newbie, to Entra ID questions, to local data residency compliance laws, new Jira alternatives, why schools ask for closed shoes, and a text-to-speech tool search. For many of these I use, e.g., o4-mini-high, because I want it to ground itself: find material and compile something for me, but get me an answer fairly quickly.
Ones that didn't work but weren't in the last 10: voice. It sounds amazing but is dumb as rocks; it feels like most of the GPU compute goes to the media, not the smarts. A question about floating solar heaters for pools: it fed me scam material. A question about negotiating software pricing: just useless, it parroted my points back at me.
I scale models up and down based on need. Very simple: GPT-4o. Smarts: o4-mini-high. Research: Deep Research. I love Claude, but at some point it kept running out of capacity, so I'd move elsewhere, although nothing beats it for Artifacts. MS Copilot if I want a quick answer to something MS-oriented. It's terrible, but it's easy to access.
Coding is generally Windsurf, but honestly that's been rare for the last month. I've been too busy doing other things.
It either helps me find a solution or it doesn't. About 90% of the time, or less formally I would just say "almost all of the time", it does. Keep in mind that I, the user, decide which questions to ask in the first place. If my batting average seems unbelievably high, perhaps my skill is in knowing when to use an LLM and when not to.
Okay, well, your vague response suggests to me that you aren't asking the LLM anything important at all, and most likely it's things that could have appeared on the first page of a Google search. So, sure, 90% of the time it's going to give you the top Google result. The other 10% were probably best answered by the top Google result too, but the LLM chose to hallucinate instead. Is that really better?