
Although I like the article, the author doesn't acknowledge the current way that (I think) most people are using llama.cpp at this point. ollama.com has simplified all of this down to two lines:

curl -fsSL https://ollama.com/install.sh | sh

ollama run llama3.2
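Once the daemon is running, Ollama also exposes a local HTTP API, which is handy for scripting. A minimal sketch (the endpoint and fields follow Ollama's documented /api/generate route; the prompt is just an example):

```shell
# Ask the local Ollama daemon (default port 11434) for a one-shot completion.
# The model must already be pulled, e.g. via `ollama run llama3.2` above.
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Explain llama.cpp in one sentence.",
  "stream": false
}'
```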



llamafile simplified it even further - you just download and run it :)


https://github.com/Mozilla-Ocho/llamafile

Wait... there is one binary that executes just fine on half a dozen platforms? What wizardry is this?

edit: Their default LLM worked great on Windows, with fast inference on my 2080 Ti. You have to pass "-ngl 9999" to offload onto the GPU, or it runs on the CPU. It is multi-modal too.
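For anyone who wants to try it, the whole llamafile flow on Linux/macOS is roughly the sketch below; the release asset name is a placeholder, so pick an actual file from the GitHub releases page.

```shell
# Download a self-contained llamafile (one file = inference engine + weights).
# The URL path here is a placeholder; use a real asset from the releases page.
curl -L -o model.llamafile \
  "https://github.com/Mozilla-Ocho/llamafile/releases/<pick-a-release-asset>"

# Mark it executable (on Windows, rename it to model.exe instead).
chmod +x model.llamafile

# Run it; -ngl 9999 offloads as many layers as possible to the GPU,
# otherwise inference stays on the CPU.
./model.llamafile -ngl 9999
```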


Cosmopolitan Libc, by Justine Tunney (jart here on HN). How have you missed this?


That's wild. Thank you. I don't know how I missed this.



