Never tried those, so I couldn't say. I guess it would.

Even so, building all the abstractions needed to implement plain matrix multiplication generically in Spiral took me two months, so I'd consider that a good enough exercise.

You could do it a lot faster by specializing for specific matrix sizes, as in Nvidia's CUDA examples repo, but then you'd miss the opportunity to do the tensor magic that I did in the playlist.
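For anyone unsure what specializing for specific matrix sizes means in practice, here is a minimal CPU-side C++ sketch of the contrast. It is not taken from Spiral or from Nvidia's repo, and the names `matmul_generic` and `matmul_fixed` are made up for illustration. The first function takes sizes at run time; the second bakes them in as template parameters, which is the simpler route but gives up the generality.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Generic version (hypothetical name): sizes are runtime values,
// so one function handles every shape. Matrices are row-major.
std::vector<float> matmul_generic(const std::vector<float>& a,
                                  const std::vector<float>& b,
                                  std::size_t m, std::size_t k, std::size_t n) {
    std::vector<float> c(m * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t p = 0; p < k; ++p)
            for (std::size_t j = 0; j < n; ++j)
                c[i * n + j] += a[i * k + p] * b[p * n + j];
    return c;
}

// Size-specialized version (hypothetical name): M, K, N are compile-time
// constants, so each shape gets its own instantiation and the compiler
// knows every loop bound. Matrices are row-major.
template <std::size_t M, std::size_t K, std::size_t N>
std::array<float, M * N> matmul_fixed(const std::array<float, M * K>& a,
                                      const std::array<float, K * N>& b) {
    std::array<float, M * N> c{};  // value-initialized to zeros
    for (std::size_t i = 0; i < M; ++i)
        for (std::size_t p = 0; p < K; ++p)
            for (std::size_t j = 0; j < N; ++j)
                c[i * N + j] += a[i * K + p] * b[p * N + j];
    return c;
}
```

Calling `matmul_fixed<2, 2, 2>(a, b)` only ever works for 2x2 inputs, whereas `matmul_generic(a, b, m, k, n)` accepts any compatible shapes.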



You are the author of the playlist/maker of the videos?


Yes.



