keshan@blog:~$ ./home.sh

Implementing a Fast Attention Fusion Kernel

Writing Fast Attention on TPU — From Naive Kernel to Fused FlashAttention with Pallas. Part 1 of the KernelForge series on writing, profiling, and optimizing custom TPU kernels in Python…

./read_more.sh

~/articles

Explore my thoughts on Machine Learning, AI, and more
