Shashank Shekhar


    20 March, 2025 Readings

    Mar 20, 2025 · 1 min read

    Courses

    • MLC Episode 1: Overview of ML Compilation

    Tutorials

    • The PyTorch developer’s guide to JAX fundamentals https://cloud.google.com/blog/products/ai-machine-learning/guide-to-jax-for-pytorch-developers
    • Learning JAX as a PyTorch developer https://kidger.site/thoughts/torch2jax/

    Exploratory

    • The State of the JAX Ecosystem in 2025 https://www.youtube.com/watch?v=TzULI2PomIw


    Created with Quartz v4.4.0 © 2025
