Courses
Lecture 3: Getting Started With CUDA for Python Programmers
https://www.youtube.com/watch?v=4sgKnKbR-WE&t=202s

Papers
Fast Inference from Transformers via Speculative Decoding
https://arxiv.org/abs/2211.17192

Tutorials
https://andrewkchan.dev/posts/yalm.html