DeepSeek FAQ
Just came across this nice technical breakdown of what, exactly, R1 accomplished. The interesting news to me was that they had to bypass CUDA entirely and write optimizations in PTX, Nvidia's low-level GPU instruction set. Good complementary reading if you're here from my DeepSeek post.