Journey to optimize large scale transformer model inference with ONNX Runtime

June 30, 2021

By Xiaoyu Liu, Applied Scientist II, Data&AI, Developer Division (DevDiv); Eric Lin, Senior Researcher SDE, Turing Team; Emma Ning, Principal Program Manager, AI Frameworks

“With its resource-efficient and high-performance nature, ONNX Runtime helped us meet the need of deploying a large-scale multi-layer…