Role: First author
Timeline: 2024
Stack: CUDA, PyTorch, FlexGen, TensorRT