TPUs Go Brrr

Hi, I am Simon.


I am a first-year Computer Science Ph.D. student at Stanford University, currently rotating with Prof. Azalia Mirhoseini in the Scaling Intelligence Lab.

I studied Electrical Engineering and Computer Sciences as an undergraduate at Berkeley, where I was fortunate to work in the SLICE lab with Prof. Sophia Shao and in the RISE lab with Prof. Ion Stoica.

I am broadly interested in computer systems and machine learning. Most recently, I spent some time pre-training language models at Cohere. Previously, I designed GPUs at Apple, scaled out distributed systems at Anyscale, and made cars drive themselves at NVIDIA DRIVE.

If you are interested in my journey, please check out the rest of this site. Feel free to contact me at simonguo [@] stanford dot edu.

Resume CV

Research

I believe the future of computing will be specialized and distributed, enabling intelligence that is both ubiquitous and available at scale. To that end, my research spans the stack, including efficient ML algorithms, hardware-software codesign, parallelism and efficient compilation, and systems for machine learning. I have published at machine learning, computer systems, and computer architecture conferences.

BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts

Qizhen Zhang, Nikolas Gritsch, Dwaraknath Gnaneshwar, Simon Guo, David Cairuz, Bharat Venkitesh, Jakob Foerster, Phil Blunsom, Sebastian Ruder, Ahmet Üstün, Acyr Locatelli

To appear at the Conference on Neural Information Processing Systems (NeurIPS), 2024
NGSM (Spotlight) and ES-FoMo-II Workshops at the International Conference on Machine Learning (ICML), 2024

Upcycling MoE with Mixture-of-Attention for more efficient MoE pre-training


arXiv

Parallelism in Bundle Adjustment for SLAM

Simon Zirui Guo, Yakun Sophia Shao

ACM Student Research Competition at the IEEE/ACM International Symposium on Microarchitecture (MICRO), 2022

Speed up SLAM by exploiting structural sparsity and custom kernels on Tensor Cores


Extended Abstract

Gemmini: An Open-Source, Full-System DNN Accelerator Design and Evaluation Platform

Hasan Genc, Seah Kim, Vadim Vadimovich Nikiforov, Simon Zirui Guo, Borivoje Nikolić, Krste Asanović, Yakun Sophia Shao

First Workshop on Open-Source Computer Architecture Research (OSCAR) at the ACM/IEEE International Symposium on Computer Architecture (ISCA), 2022

Design Space Exploration for DNN accelerators across the stack


Workshop Presentation

D3: A Dynamic Deadline-Driven Approach for Building Autonomous Vehicles

Ionel Gog, Sukrit Kalra, Peter Schafhalter*, Joseph E. Gonzalez, Ion Stoica

* I worked as an undergraduate research assistant for this author.

In Proceedings of the European Conference on Computer Systems (EuroSys), 2022

OS for self-driving cars and robots


ACM