Although I like the article, the author doesn't acknowledge the current way that (I think) most people are using llama.cpp at this point. ollama.com has simplified his work into two lines:
curl -fsSL https://ollama.com/install.sh | sh
ollama run llama3.2

Wait... there is one binary that executes just fine on half a dozen platforms? What wizardry is this?

edit: Their default LLM worked great on Windows, with fast inference on my 2080 Ti. You have to pass "-ngl 9999" to offload onto the GPU, or else it runs on the CPU. It is multi-modal, too.
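
(To make the reply's last point concrete: a minimal sketch of where that "-ngl 9999" flag goes on a llama.cpp-style command line. The binary and model names below are placeholders, not anything from the article; the flag sets how many layers are offloaded to the GPU, and an oversized value like 9999 effectively means "all of them".)

# placeholder names; substitute your own binary and GGUF model file
./main -m model.gguf -p "Why is the sky blue?" -ngl 9999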
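
(And on the ollama side from the top of the thread: once "ollama run" has pulled a model, the background server it talks to can also be queried over HTTP, which is how editors and other tools usually hook into it. A small illustration, assuming ollama is listening on its default address of localhost:11434:)

# ask the locally running ollama server for a completion (curl -d makes this a POST)
curl http://localhost:11434/api/generate -d '{"model": "llama3.2", "prompt": "Why is the sky blue?", "stream": false}'

Without "stream": false the response comes back as a stream of JSON objects, one chunk per line.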