Hacker News

Another update: I ran an automated severity analysis on each bug report (~2000 of them!) using an LLM at temp=0 with a very strict rubric, and I checked that it rated things in a consistent, stable way. The rubric, the LLM used, and some example ratings are included in the methodology section. For now, the ratings are just stored per bug in the DuckDB database and used to filter out reports that aren't actually bugs, to get a clearer signal. Next, I'm going to try to use them to see whether the post-Claude bugs were more severe in any way.
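For anyone who wants to try something similar, here is a rough sketch of the stability check and the non-bug filter. It is not the actual pipeline: `stability_report`, `drop_non_bugs`, `fake_rate`, and the label names (`high`, `low`, `not_a_bug`) are all made up for illustration, and `fake_rate` is a keyword stand-in for the real LLM call with the rubric.

```python
from collections import Counter


def stability_report(reports, rate, runs=3):
    """Rate each (id, text) report `runs` times with `rate`.

    Returns (stable, unstable): `stable` maps id -> the single agreed label,
    `unstable` maps id -> a dict of label counts for reports that disagreed.
    """
    stable, unstable = {}, {}
    for report_id, text in reports:
        labels = [rate(text) for _ in range(runs)]
        counts = Counter(labels)
        if len(counts) == 1:
            stable[report_id] = labels[0]
        else:
            unstable[report_id] = dict(counts)
    return stable, unstable


def drop_non_bugs(ratings, non_bug_label="not_a_bug"):
    """Keep only reports whose label is not the (hypothetical) non-bug label."""
    return {rid: label for rid, label in ratings.items() if label != non_bug_label}


def fake_rate(text):
    # Hypothetical stand-in for the LLM call at temp=0 with the rubric.
    lowered = text.lower()
    if "crash" in lowered:
        return "high"
    if "how do i" in lowered:
        return "not_a_bug"
    return "low"


reports = [
    (1, "Crash on startup"),
    (2, "How do I change the theme?"),
    (3, "Typo in tooltip"),
]
stable, unstable = stability_report(reports, fake_rate)
real_bugs = drop_non_bugs(stable)
print(real_bugs, unstable)
```

Anything that lands in `unstable` is a sign the rubric is ambiguous for that report, so it is probably worth reading by hand before trusting the rating.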