Efficient AI Inference
Hardware-software co-design across speculative decoding, quantization, and pruning.
Building agent tooling and constructing benchmarks for evaluating agent capabilities.
Using world-model ideas to strengthen representation learning.