
Fuck AI


"We did it, Patrick! We made a technological breakthrough!"

A place for all those who loathe AI to discuss things, post articles, and ridicule the AI hype. Proud supporter of working people. And proud booer of SXSW 2024.

AI, in this case, refers to LLMs, GPT technology, and anything listed as "AI" meant to increase market valuations.

[–] PoliteDudeInTheMood@lemmy.ca 1 points 1 day ago (1 children)

Opus is heavily throttled outside the enterprise tiers; I was regularly blowing through my weekly usage limits by Tuesday using Opus. 5.3 on the higher thinking profiles matches or exceeds Opus's capabilities, and I have yet to hit a single limit.

If I need to process via API, I'll run tests against Anthropic's Haiku or Sonnet before trying GPT-5-mini. If I need to use 5.3 and what I'm doing isn't time-critical, I'll use batch processing. Smaller token batches complete very quickly, often in under 2 hours, and the 50% batch discount provides serious cost savings.

[–] SuspciousCarrot78@lemmy.world 1 points 1 day ago* (last edited 1 day ago) (1 children)

Yeah, me too. Opus 4.5 is awesome, but my god... om nom nom go my daily/weekly quotas. I probably shouldn't yeet the entire repo at it lol.

4.6 is supposedly about 2x the quota burn for not much better output.

Viewed against that, Codex 5.3 @ medium is actual daylight robbery of OAI.

I was just looking at benchmarks, and even smaller 8-10B models (Qwen 3-8, Nemotron 9B, Critique) are now at around 65-70% of Sonnet's level and 110-140% of Haiku's.

If I had the VRAM, I'd switch to local Qwen3 Next (which scores almost 90% of Opus 4.5 on SWE-Bench) and just git gud. More likely I'll settle for smaller models, API calls, and the git gud part.

An RTX 3060 (probably what you'd need to run Qwen3 Next decently) is $1500 here :(

For that much $$$ I can probably get 5 years of surgical API calls via OR + actual skills.

PS: how are you using batch processing? How did you set it up?

[–] PoliteDudeInTheMood@lemmy.ca 1 points 1 day ago* (last edited 1 day ago)

It's very content-specific. What are you processing with the API?

One of my little side projects right now is translating Russian fiction, specifically a genre over there called 'boyar-anime', which is essentially fantasy set in Imperial Russia. I do most of my heavy translation with Anthropic's Haiku, which is very cheap and, unlike the higher-end models, tends to dumb down some of the more complex parts of the Imperial Russian aristocracy so it reads more like similar fiction over here.

I take the source book and chunk it into small segments that I translate individually so I don't get context bleed, then I mechanically process the output to find anything that didn't translate well. I combine roughly 40 of these weirdly translated segments into a JSONL file and submit the file through the API. The OpenAI Batch API can accept up to 900k tokens, but you'll wait close to 11 hours for something that large; 40 segments is around 30k tokens, and that usually processes in anywhere from a few minutes to an hour.
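
The chunking step is roughly this (a simplified sketch, not my exact pipeline; the paragraph-based packing and the 4000-character cap are just illustrative):

def chunk_paragraphs(text, max_chars=4000):
    # Greedily pack paragraphs into segments under a size cap, so each
    # segment can be translated on its own without context bleed.
    segments, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) > max_chars:
            segments.append(current)
            current = para
        else:
            current = current + "\n\n" + para if current else para
    if current:
        segments.append(current)
    return segments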

The JSONL file is essentially made up of smaller JSON blocks, one request per line:

{
  "custom_id": "SEGMENT-NUM",
  "method": "POST",
  "url": "/v1/responses",
  "body": {
    "model": "gpt-5.3",
    "input": [
      {
        "role": "system",
        "content": [
          {
            "type": "input_text",
            "text": "You are a meticulous English language proofreader."
          }
        ]
      },
      {
        "role": "user",
        "content": [
          {
            "type": "input_text",
            "text": "PROMPT - SUBMITTED SEGMENT"
          }
        ]
      }
    ],
    "max_output_tokens": 8192
  }
}
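
Building and submitting the file looks roughly like this with the openai Python SDK (a sketch; the filenames, segment list, and custom_id format are illustrative placeholders):

import json
from openai import OpenAI

client = OpenAI()

weird_segments = ["..."]  # the ~40 awkward segments flagged by the mechanical pass

# One request object per line; custom_id ties each result back to its segment.
with open("segments.jsonl", "w", encoding="utf-8") as f:
    for i, seg in enumerate(weird_segments):
        request = {
            "custom_id": f"SEGMENT-{i:04d}",
            "method": "POST",
            "url": "/v1/responses",
            "body": {
                "model": "gpt-5.3",
                "input": [
                    {"role": "system", "content": [
                        {"type": "input_text",
                         "text": "You are a meticulous English language proofreader."}]},
                    {"role": "user", "content": [
                        {"type": "input_text", "text": seg}]},
                ],
                "max_output_tokens": 8192,
            },
        }
        f.write(json.dumps(request, ensure_ascii=False) + "\n")

# Upload the file, then queue the batch with the discounted 24h window.
batch_input = client.files.create(file=open("segments.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_input.id,
    endpoint="/v1/responses",
    completion_window="24h",
)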

I then set up polling to check back with the API every few minutes; when the submitted queries are complete, I automatically send more until everything has been processed.
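
The polling itself is just a loop against the batch status (again a sketch; the 5-minute interval and output filename are arbitrary):

import time

# Wait for the batch to reach a terminal state.
while True:
    status = client.batches.retrieve(batch.id)
    if status.status in ("completed", "failed", "expired", "cancelled"):
        break
    time.sleep(300)  # check back every 5 minutes

# Download the results file; each line is a JSON result keyed by custom_id.
if status.status == "completed":
    client.files.content(status.output_file_id).write_to_file("results.jsonl")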