Join Microsoft at Open Source Summit North America 2024
Join Microsoft at Open Source Summit North America 2024, taking place in Seattle, Washington, from April 16 to 18, 2024.
ONNX Script is a new open-source library for directly authoring ONNX models in Python.
Introducing Olive, an easy-to-use toolchain for optimizing models with hardware awareness. With Olive, you don't need to be an expert to explore diverse hardware optimization toolchains.
Intel has collaborated with Microsoft to integrate Intel® Neural Compressor into Olive, enabling developers to easily take advantage of model compression techniques in their deployment platform, including Intel processors and accelerators.
The team at Pieces shares the problems and solutions evaluated for their on-device model serving stack and how ONNX Runtime enables their success.
Choosing which machine learning model to use, sharing a model with a colleague, and quickly trying out a model are all reasons why you may find yourself wanting to quickly run inference on a model.
Scale, performance, and efficient deployment of state-of-the-art Deep Learning models are ubiquitous challenges as applied machine learning grows across the industry.
Scikit-learn is one of the most useful libraries for general machine learning in Python. To minimize the cost of deployment and avoid discrepancies, deploying scikit-learn models to production usually leverages Docker containers and pickle, the object serialization module of the Python standard library.
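A minimal sketch of that pickle-based serving pattern (the model, dataset, and variable names here are illustrative, not taken from the post):

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Train a small model to stand in for a production scikit-learn pipeline.
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Serialize with pickle, as is common for container-based deployment;
# in practice this payload would be written to a file baked into the image.
payload = pickle.dumps(model)

# At serving time, deserialize and run inference.
restored = pickle.loads(payload)
preds = restored.predict(X[:5])
```

Note that pickle ties the artifact to the exact scikit-learn and Python versions used at training time, which is one reason the post pairs it with Docker containers.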
Since its open source debut two years ago, ONNX Runtime has seen strong growth with performance improvements, expanded platform and device compatibility, hardware accelerator support, an expansion into training acceleration, and more.
In summer 2019, I worked as a high school intern for the ONNX AI team at Microsoft and loved working on various projects with the team, including the BERT text classification model. However, due to COVID-19, the Microsoft Internship Program for high school students was canceled in the summer of 2020.
ONNX Runtime is an open source project that is designed to accelerate machine learning across a wide range of frameworks, operating systems, and hardware platforms. Today, we are excited to announce ONNX Runtime release v1.5 as part of our AI at Scale initiative.
Model training is an important step when developing and deploying large scale Artificial Intelligence (AI) models. Training typically utilizes a large amount of compute resources to tune the model based on the input dataset.
ONNX Runtime is an open source project that is designed to accelerate machine learning across a wide range of frameworks, operating systems, and hardware platforms. It is used extensively in Microsoft products, like Office 365 and Bing, delivering over 20 billion inferences every day and up to 17 times faster inferencing.