10 tests per model seems like way too few, and they should give confidence intervals…
The 10/10 vs. 8/10 is just as likely due to chance as to any real difference. But some people will definitely use this to justify model choice.
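For anyone who wants to check the 10/10 vs. 8/10 point, here is a rough sketch in plain Python. It assumes each of the 10 runs is an independent pass/fail trial, which the write-up doesn't confirm, and the function names are made up for illustration. It computes a 95% Wilson score interval for each score and a two-sided Fisher exact test on the two results.

```python
from math import comb, sqrt


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a pass rate (z=1.96 gives roughly 95%)."""
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def fisher_two_sided(a_pass, a_total, b_pass, b_total):
    """Two-sided Fisher exact test p-value for two pass/fail samples."""
    total_pass = a_pass + b_pass
    n = a_total + b_total

    def prob(k):
        # Hypergeometric probability that model A gets exactly k of the passes.
        return comb(total_pass, k) * comb(n - total_pass, a_total - k) / comb(n, a_total)

    observed = prob(a_pass)
    lowest = max(0, a_total - (n - total_pass))
    highest = min(a_total, total_pass)
    # Sum every outcome that is no more likely than the one observed.
    return sum(
        prob(k) for k in range(lowest, highest + 1) if prob(k) <= observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    print("10/10:", wilson_interval(10, 10))   # roughly 0.72 to 1.00
    print(" 8/10:", wilson_interval(8, 10))    # roughly 0.49 to 0.94
    print("Fisher p:", fisher_two_sided(10, 10, 8, 10))  # roughly 0.47
```

Under those assumptions the two intervals overlap heavily and the p-value is nowhere near the usual 0.05 cutoff, so 10 runs per model can't separate a perfect score from 8/10.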
Mistral (the free version) seems to get it right. Maybe they fixed it specifically?
Drive. Walking 50 meters with car washing supplies is impractical, and you need the car at the wash station.

I remember getting downvoted into oblivion years ago, both here and on Reddit, for saying that AI would be a disaster.
Kinda neat about the human responses... sure some are trolling but maybe we have to test our global expectations. In North America, a car wash tends to be this garage thing with either automated cleaning or a set of supplies to clean your car, and your car has to be in the shed to be cleaned effectively. But if washing your car by hand is the norm, I wonder if people in some countries surmise that the cleaning staff could just walk over with the sponges, buckets and hoses and stuff to the car, if you're already 50 metres away from the washing point.