this post was submitted on 01 Dec 2025
-10 points (37.5% liked)

Technology

At least 80 million (3.3%) of Wikipedia's facts are inconsistent; LLMs may help find them

A paper titled "Detecting Corpus-Level Knowledge Inconsistencies in Wikipedia with Large Language Models",^[1]^ presented earlier this month at the EMNLP conference, examines

all 12 comments
[–] Derpenheim@lemmy.zip 27 points 5 days ago (2 children)

My knee-jerk reaction is no, because fuck AI, but LLMs are literally made to parse vast amounts of data quickly. The analysis and corrections need to be done manually, but finding these errors is literally what they were originally made to do.

[–] CptBread@lemmy.world 5 points 5 days ago

Well, it could end up overloading volunteers with reports. That's especially bad if the false positive rate is high enough.

[–] fodor@lemmy.zip 2 points 4 days ago

That isn’t what most LLMs were designed for, though. It’s just one possible use case.

[–] Aussiemandeus@aussie.zone 12 points 5 days ago

It can probably help make it 160 million

[–] FriendBesto@lemmy.ml 2 points 5 days ago

I watch a YT channel that researches and talks about the history of Wales, and on that somewhat narrow topic alone, he has found some ridiculous mistakes on Wikipedia. There are tons, but few people are aware of them, as they may lack the knowledge or background to know how wrong the articles are. AI will surely make that problem worse. I have caught ChatGPT being wrong numerous times on topics within my wheelhouse. When I tell it that it is wrong, it "apologizes," corrects itself, and just adds what I told it. Well, if it had found the data before, then why does it have to wait until it is corrected? If kids use this for school, they are so fucked.

Who wants to put glue on their pizza?

[–] msage@programming.dev -4 points 5 days ago (1 children)
[–] TheLeadenSea@sh.itjust.works 18 points 5 days ago (1 children)

I know everyone on Lemmy hates LLMs, but analysing large amounts of text to find inconsistencies is actually something they're good at. Not correcting them, of course; that can be left to humans. Just finding them.

[–] msage@programming.dev 0 points 5 days ago (1 children)

It's hard to believe that when their output is at best inconsistent.

[–] artyom@piefed.social 6 points 5 days ago* (last edited 5 days ago) (1 children)

That's why you have to manually review them. The biggest problem with LLMs is abuse. People just print their outputs without ever checking their validity.

[–] msage@programming.dev 0 points 4 days ago (1 children)

Is it faster than doing it all by yourself?

[–] artyom@piefed.social 5 points 4 days ago

Doing what? Manually reviewing the entirety of Wikipedia? Absolutely.