this post was submitted on 14 Jan 2026
1565 points (99.3% liked)

Technology

[–] kayohtie@pawb.social 4 points 6 days ago (1 children)

If AIGM were like VSTs or Vocaloids, that'd be one thing. But it's more like an imitation of sounds: it synthesizes chunks of songs rather than the instruments and voices themselves.

The best way to think of it is something creating an audio file solely by using the Photoshop clone stamp tool across millions of source files.

[–] Tja@programming.dev 0 points 6 days ago (1 children)

That's not how transformer neural networks work...

[–] kayohtie@pawb.social 1 points 5 days ago (1 children)

Sure, but we're talking about generative models here, as is the article. Pretending it refers to a tool that's been standard in libraries and even VSTs for over a decade is either misunderstanding the article or being disingenuous on purpose.

[–] Tja@programming.dev 0 points 5 days ago (1 children)

No, I get it. It's generative. GPT: Generative Pretrained Transformer. Music generators add a diffusion layer, but it's fundamentally new music being generated, not copies of existing songs.

My point is that it's just another tool, one that automates the process even more. It's not the same thing; it's the next step.

[–] kayohtie@pawb.social 1 points 3 days ago (1 children)

Text prompt -> audio is not a transformer in the sense people are talking about here. Either you know that, you don't care, or you don't fully understand how these systems work under the hood either.

What I'm referring to are neural models that take input audio and effectively act as a filter implemented as a neural network. Voice mods, instrument adapters, virtual pedals, amp models... These are all actually transformative. Actual music and effort go into them. And that is not what Bandcamp is after; those were already in heavy use like 15 years ago.

The things that generate from text are transformers in the most technically correct sense, but not in the sense people mean when they call something transformative.

They have fundamentally different purposes and usages. One isn't vocals generated from nothing but the lyrics; it's someone actually singing, and then a model transforming that sound to match a specific, pre-trained target rather than generalizing.
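To make the distinction concrete, here is a rough sketch of the two interfaces being contrasted. Everything in it is a hypothetical stand-in: `soft_clip` is a plain tanh waveshaper playing the role of a learned amp or pedal model (it is not a neural network), and `toy_generate` just produces prompt-seeded noise in place of a real text-to-audio model. The only point is what each function takes as input.

```python
import hashlib
import math
import random


def soft_clip(samples, drive=4.0):
    """Audio in -> audio out.

    Hypothetical stand-in for an amp/pedal model: every output sample
    is a reshaped version of a sample somebody actually played.
    """
    return [math.tanh(drive * s) for s in samples]


def toy_generate(prompt, n_samples=8):
    """Text in -> audio out.

    Hypothetical stand-in for a text-to-audio generator: there is no
    input performance at all, only a prompt. Here the prompt merely
    seeds a random number generator.
    """
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


# A recorded performance goes in, a shaped version of it comes out.
performance = [0.0, 0.1, 0.5, -0.5, 1.0]
shaped = soft_clip(performance)

# No performance goes in, only a description of the desired result.
generated = toy_generate("sad piano ballad")
```

The effect cannot produce anything without a performance to shape, while the generator never sees one. Real systems on both sides are far more complex, but the difference in inputs is the same.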

[–] Tja@programming.dev 1 points 3 days ago

They are transformers in the same sense that ChatGPT is a transformer. Hence they do generate new content that shares characteristics and patterns with existing content. It's no clone tool. The lyrics are new. They probably follow the grammar rules of a certain language, but it's not copy-paste. Chords will probably be shared, but the melody is new. Etc.