> The drawback, however, is that these implementations only get that fast because ChaCha is embarrassingly parallel.

That is my point exactly. Why would you design, in this day and age, a generator that doesn't vectorize well? It cannot take full advantage of the CPU. Even with parallel streams, large integer multiplication and variable-length rotation are SIMD-killers. Regarding the 1.5KB of data, I suspect you can get away with less than that if you specialize to this application, but note that even 1.5KB is only around 60% of the state size of mt19937 (624 32-bit words, roughly 2.5KB).
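To make the SIMD point concrete, here is a rough sketch in plain Python that models vector lanes as lists. The ChaCha quarter-round is the standard one (constants 16, 12, 8, 7); the helper names (`add`, `xor`, `rotl`, `quarter_round_lanes`) are mine, and I am assuming the generator being compared against uses a PCG32-style XSH-RR output step, which the thread does not spell out. Real code would use SIMD intrinsics; this only shows the shape of the operations.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit value left by a fixed amount n (1..31)."""
    return ((x << n) | (x >> (32 - n))) & MASK32

def rotr32(x, n):
    """Rotate a 32-bit value right by n, where n may depend on the data."""
    n &= 31
    return ((x >> n) | (x << (32 - n))) & MASK32

# Lane-wise helpers: each list index models one independent ChaCha block.
# Every helper maps onto a single vector instruction, because all lanes
# share the same operation and the same rotation constant.
def add(x, y):
    return [(p + q) & MASK32 for p, q in zip(x, y)]

def xor(x, y):
    return [p ^ q for p, q in zip(x, y)]

def rotl(x, n):
    return [rotl32(p, n) for p in x]

def quarter_round_lanes(a, b, c, d):
    """ChaCha quarter-round applied to all lanes at once."""
    a = add(a, b); d = rotl(xor(d, a), 16)
    c = add(c, d); b = rotl(xor(b, c), 12)
    a = add(a, b); d = rotl(xor(d, a), 8)
    c = add(c, d); b = rotl(xor(b, c), 7)
    return a, b, c, d

def pcg32_output(state):
    """PCG32-style XSH-RR output step (assumed here for contrast).

    The rotation count comes from the top 5 bits of the 64-bit state,
    so parallel streams would each need a different rotate amount.
    """
    xorshifted = (((state >> 18) ^ state) >> 27) & MASK32
    rot = state >> 59
    return rotr32(xorshifted, rot)

# Four lanes, all loaded with the quarter-round test input from RFC 7539.
a, b, c, d = quarter_round_lanes(
    [0x11111111] * 4, [0x01020304] * 4, [0x9B8D6F43] * 4, [0x01234567] * 4
)
print([hex(v) for v in (a[0], b[0], c[0], d[0])])
```

The quarter-round only needs 32-bit add, xor, and rotate-by-constant, all of which exist as vector operations on mainstream SIMD instruction sets. The PCG-style step needs a per-lane rotate count on top of the 64-bit multiply in the state update, and that is where older SIMD instruction sets fall short.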


