hdson07/GPU_MODE_Study


GPU MODE Lecture Study Notes

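This repo contains my personal notes from studying the materials in gpu-mode/lectures and the related YouTube lectures. The notes were compiled from the lecture content with the help of GPT, so if you find any errors or gaps, feedback is welcome at any time.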

Study Material Sources

Study Progress

| # | Lecture | Speaker | Study done | Notes written | Exercises done |
|---|---|---|---|---|---|
| 01 | Profiling and Integrating CUDA kernels in PyTorch | Mark Saroufim | | | |
| 02 | Recap Ch. 1-3 from the PMPP book | Andreas Koepf | | | |
| 03 | Getting Started With CUDA | Jeremy Howard | | | |
| 04 | Intro to Compute and Memory Architecture | Thomas Viehmann | | | |
| 05 | Going Further with CUDA for Python Programmers | Jeremy Howard | | | |
| 06 | Optimizing PyTorch Optimizers | Jane Xu | | | |
| 07 | Advanced Quantization | Charles Hernandez | | | |
| 08 | CUDA Performance Checklist | Mark Saroufim | | | |
| 09 | Reductions | Mark Saroufim | | | |
| 10 | Build a Prod Ready CUDA Library | Oscar Amoros Huguet | | | |
| 11 | Sparsity | Jesse Cai | | | |
| 12 | Flash Attention | Thomas Viehmann | | | |
| 13 | Ring Attention | Andreas Koepf | | | |
| 14 | Practitioner's Guide to Triton | Umer Adil | | | |
| 15 | CUTLASS | Eric Auld | | | |
| 16 | On Hands profiling | Taylor Robbie | | | |
| 17 | GPU Collective Communication (NCCL) | Dan Johnson | | | |
| 18 | Fused Kernels | Kapil Sharma | | | |
| 19 | Data Processing on GPUs | Devavret Makkar | | | |
| 20 | Scan Algorithm | Izzat El Haj | | | |
| 21 | Scan Algorithm Part 2 | Izzat El Haj | | | |
| 22 | Hacker's Guide to Speculative Decoding in VLLM | Cade Daniel | | | |
| 23 | Tensor Cores | Vijay Thakkar & Pradeep Ramani | | | |
| 24 | Scan at the Speed of Light | Jake Hemstad & Georgii Evtushenko | | | |
| 25 | Speaking Composable Kernel | Haocong Wang | | | |
| 26 | SYCL MODE (Intel GPU) | Patric Zhao | | | |
| 27 | gpu.cpp | Austin Huang | | | |
| 28 | Liger Kernel | Byron Hsu | | | |
| 29 | Triton Internals | Kapil Sharma | | | |
| 30 | Quantized training | Thien Tran | | | |
| 31 | Beginners Guide to Metal Kernels | Nikita Shulga | | | |
| 32 | Unsloth - LLM Systems Engineering | Daniel Han | | | |
| 33 | BitBLAS | Wang Lei | | | |
| 34 | Low Bit Triton Kernels | Hicham Badri | | | |
| 35 | SGLang Performance Optimization | Yineng Zhang | | | |
| 36 | CUTLASS and Flash Attention 3 | Jay Shah | | | |
| 37 | Introduction to SASS & GPU Microarchitecture | Arun Demeure | | | |
| 38 | Lowbit kernels for ARM CPU | Scott Roy | | | |
| 39 | TorchTitan | Mark Saroufim and Tianyu Liu | | | |
| 40 | Flash Infer | Zihao Ye | | | |
| 41 | CUDA Docs for Humans | Charles Frye | | | |
| 42 | Mosaic GPU | Adam Paszke | | | |
| 43 | TBD | Erik Schultheis | | | |
