this post was submitted on 16 Jun 2026
63 points (92.0% liked)

Technology

[–] rozodru@piefed.world 6 points 3 hours ago (1 children)

I tried it. Had to VPN in to do so, but I tried it. I gave it 5 tasks; it succeeded in 2 of them, and the rest were hallucinations. So... yeah... guess it's much better than Opus.

[–] Hackworth@piefed.ca 2 points 2 hours ago (1 children)

rest were hallucinations

I'm having trouble parsing whatcha mean here if they were coding tasks. The code didn't run? Ran but had 0 functionality? If they were non-coding tasks, then agreed, I didn't notice it being significantly more accurate. Though I did appreciate the larger vocab. I wasn't gonna be able to afford to keep using it once it went to API pricing anyway.

[–] rozodru@piefed.world 3 points 1 hour ago

Sorry, should have been more specific. It was a mix of coding and non-coding. One coding task ran fine; another one just didn't work at all. One was a basic walkthrough, tutorial-type task that was accurate; the others were hallucinations.