Wonung Kim, Yubin Lee, Yoonsung Kim, Jinwoo Hwang, Seongryong Oh, Jiyong Jung, Aziz Huseynov, Woong Gyu Park, Chang Hyun Park, Divya Mahajan, Jongse Park
Pimba: A Processing-in-Memory Acceleration for Post-Transformer Large Language Model Serving
International Symposium on Microarchitecture (MICRO), 2025.