[–] partial_accumen@lemmy.world 1 points 3 days ago

Understanding how LLMs actually work (each word is roughly a token, sometimes just a piece of one, and the model picks whatever has the highest calculated probability of coming next), this output makes me think the training data heavily included social media or pop culture, specifically around "teen angst".
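
To picture what I mean by "highest probability of the word that comes next", here's a rough Python sketch. The vocabulary and the numbers are made up, and real models score subword tokens across a huge vocabulary, but the sampling step looks roughly like this:

```python
# Minimal sketch of next-token selection (illustrative only, not a real model).
# The candidate tokens and probabilities below are invented to show the idea:
# the model assigns a probability to every token in its vocabulary, and the
# sampler picks from the highest-probability candidates.
import random

next_token_probs = {
    "whatever": 0.41,   # hypothetical "teen angst" continuation
    "fine": 0.22,
    "great": 0.19,
    "wonderful": 0.18,
}

def pick_next_token(probs: dict[str, float]) -> str:
    """Sample one token, weighted by its probability."""
    tokens = list(probs.keys())
    weights = list(probs.values())
    return random.choices(tokens, weights=weights, k=1)[0]

print(pick_next_token(next_token_probs))
```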

I wonder if in-context learning would be helpful to mask the "edgelord" training data sets.
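
By that I mean steering the model with instructions in the prompt rather than retraining it. A hypothetical sketch, where the instruction wording is made up and `generate` stands in for whatever inference call you'd actually use:

```python
# Hypothetical example of in-context steering: prepend instructions to the
# prompt instead of retraining the model. `generate` is a placeholder, not
# a real library function.
system_instructions = (
    "Respond in a neutral, factual tone. "
    "Avoid sarcasm, melodrama, and 'edgelord' phrasing."
)

def build_prompt(user_message: str) -> str:
    """Combine the steering instructions with the user's message."""
    return f"{system_instructions}\n\nUser: {user_message}\nAssistant:"

prompt = build_prompt("How was your day?")
# response = generate(prompt)  # hypothetical inference call
print(prompt)
```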