News • JavaScript • Feb 29 • 4 min read
ONNX Runtime Web unleashes generative AI in the browser using WebGPU
By Emma Ning, Principal Program Manager, AI Frameworks; Yulong Wang, Senior Software Engineer, AI Frameworks; Satya Jandhyala, Principal Software Engineer, AI Frameworks
ONNX Runtime Web featuring WebGPU is now available in the ONNX Runtime 1.17 release, unlocking new possibilities.

News • Tools • PyTorch • June 26, 2023 • 4 min read
Olive: A user-friendly toolchain for hardware-aware model optimization
By Emma Ning, Principal Program Manager, AI Frameworks; Devang Patel, Principal Architect, AI Frameworks; Guoliang Hua, Principal Software Engineer Manager, Microsoft
Introducing Olive, an easy-to-use toolchain for optimizing models with hardware awareness. With Olive, you don't need to be…

News • Cloud • June 26, 2023 • 3 min read
Automate optimization techniques for transformer models
By Emma Ning, Principal Program Manager, AI Frameworks; Feng Tian, AI Architect, Intel; Yuwen Zhou, AI Engineer, Intel; Haihao Shen, Leading AI Architect, Intel; Saurabh Tangri, Principal AI Engineer, Intel
Intel has collaborated with Microsoft to integrate Intel® Neural Compressor into Olive, enabling developers to easily take advantage…

Project updates • PyTorch • May 2, 2022 • 5 min read
Optimizing and deploying transformer INT8 inference with ONNX Runtime-TensorRT on NVIDIA GPUs
By Emma Ning, Principal Program Manager, AI Frameworks; Mohit Ayani, Solutions Architect, NVIDIA; Shang Zhang, Senior AI Developer Technology Engineer, NVIDIA; Jay Rodge, Product Marketing Manager, AI,…

Project updates • AI + Machine Learning • JavaScript • September 2, 2021 • 5 min read
ONNX Runtime Web: running your machine learning model in browser
By Emma Ning, Principal Program Manager, AI Frameworks; Yulong Wang, Senior Software Engineer, AI Frameworks; Du Li, Senior Software Engineer, AI Frameworks
We are introducing ONNX Runtime Web (ORT Web), a new feature in ONNX Runtime to enable JavaScript developers…

AI + Machine Learning • PyTorch • June 30, 2021 • 7 min read
Journey to optimize large scale transformer model inference with ONNX Runtime
By Xiaoyu Liu, Applied Scientist II, Data&AI, Developer Division (DevDiv); Eric Lin, Senior Researcher SDE, Turing Team; Emma Ning, Principal Program Manager, AI Frameworks
"With its resource-efficient and high-performance nature, ONNX Runtime helped us meet the need of deploying a large-scale multi-layer…

Project updates • AI + Machine Learning • January 21, 2020 • 4 min read
Microsoft open sources breakthrough optimizations for transformer inference on GPU and CPU
By Emma Ning, Principal Program Manager, AI Frameworks
This post is co-authored by Emma Ning, Azure Machine Learning; Nathan Yan, Azure Machine Learning; Jeffrey Zhu, Bing;…