Performant on-device inferencing with ONNX Runtime
The team at Pieces shares the problems and solutions evaluated for their on-device model serving stack and how ONNX Runtime enables their success.
Make large models smaller and faster with the OpenVINO Execution Provider, NNCF, and ONNX Runtime, leveraging Azure Machine Learning.
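As a rough illustration of the runtime side of that workflow, the sketch below shows how a compressed ONNX model might be served through the OpenVINO Execution Provider in ONNX Runtime. It assumes the onnxruntime-openvino build is installed; the model path and input shape are placeholders, not taken from the post.

```python
# Minimal sketch: run an ONNX model with the OpenVINO Execution Provider.
# "model_int8.onnx" is a hypothetical path to a model compressed with a tool such as NNCF.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "model_int8.onnx",
    providers=["OpenVINOExecutionProvider", "CPUExecutionProvider"],  # CPU as fallback
)

input_name = session.get_inputs()[0].name
dummy_input = np.random.rand(1, 3, 224, 224).astype(np.float32)  # assumed image-shaped input
outputs = session.run(None, {input_name: dummy_input})
print(outputs[0].shape)
```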
Together with our colleagues at LinkedIn, we are happy to announce that Feathr is joining the LF AI & Data Foundation, an umbrella foundation of the Linux Foundation supporting open-source innovation in AI and data.
Choosing which machine learning model to use, sharing a model with a colleague, and quickly trying one out are all reasons why you may find yourself wanting to run inference on a model with minimal setup.
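A minimal sketch of that quick try-out in Python with ONNX Runtime is shown below. The file name and dummy input shape are placeholders to adapt to your own model.

```python
# Minimal sketch: inspect an ONNX model's inputs and run one inference pass.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# Check what the model expects before feeding it data.
first_input = session.get_inputs()[0]
print(first_input.name, first_input.shape, first_input.type)

dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)  # adjust to your model's input
result = session.run(None, {first_input.name: dummy})
print(result[0])
```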
Scale, performance, and efficient deployment of state-of-the-art deep learning models are ubiquitous challenges as applied machine learning grows across the industry.
This post was co-authored by Jithun Nair and Aswin Mathews, members of technical staff at AMD. In recent years, large-scale deep learning models have demonstrated impressive capabilities, excelling at tasks across natural language processing, computer vision, and speech domains.
ONNX Runtime now supports building mobile applications in C# with Xamarin. Support for Android and iOS is included in the ONNX Runtime release 1.10 NuGet package. This enables C# developers to build AI applications for Android and iOS that execute ONNX models on mobile devices with ONNX Runtime.
We are introducing ONNX Runtime Web (ORT Web), a new feature in ONNX Runtime that enables JavaScript developers to run and deploy machine learning models in browsers. It also helps enable new classes of on-device computation. ORT Web will replace the soon-to-be-deprecated ONNX.js.
With a simple change to your PyTorch training script, you can now speed up training large language models with torch_ort.ORTModule, running on the target hardware of your choice. Training deep learning models requires ever-increasing compute and memory resources. Today we release torch_ort.ORTModule.
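The "simple change" is wrapping an existing PyTorch model so that its forward and backward passes run through ONNX Runtime. The sketch below shows one way that might look; the model, optimizer, and data are placeholder examples, not from the original post.

```python
# Minimal sketch: wrap a PyTorch model in torch_ort.ORTModule for accelerated training.
import torch
from torch_ort import ORTModule

model = torch.nn.Sequential(
    torch.nn.Linear(784, 512),
    torch.nn.ReLU(),
    torch.nn.Linear(512, 10),
)
model = ORTModule(model)  # the one-line change; the training loop stays the same

optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = torch.nn.CrossEntropyLoss()

inputs = torch.randn(64, 784)
labels = torch.randint(0, 10, (64,))

optimizer.zero_grad()
loss = loss_fn(model(inputs), labels)
loss.backward()
optimizer.step()
print(loss.item())
```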
This post was co-authored by Jeff Daily, a Principal Member of Technical Staff, Deep Learning Software for AMD. ONNX Runtime is an open-source project that is designed to accelerate machine learning across a wide range of frameworks, operating systems, and hardware platforms.
This post was co-authored by Alejandro Saucedo, Director of Machine Learning Engineering at Seldon Technologies. About the co-author: Alejandro leads teams of machine learning engineers focused on the scalability and extensibility of machine learning deployment and monitoring products with over five million installations.
With its resource-efficient and high-performance nature, ONNX Runtime helped us meet the need to deploy a large-scale multi-layer generative transformer model for code, a.k.a. GPT-C, to empower IntelliCode with whole-line code completion suggestions in Visual Studio and Visual Studio Code.
Watch our webinar at the Open Data Science Conference and read the white paper on SmartNoise differential privacy machine learning case studies. The COVID-19 pandemic demonstrates the tremendous importance of sufficient and relevant data for research, causal analysis, government action, and medical progress. However, for understandable data-protection reasons, individuals and decision-makers are often very reluctant to share personal or sensitive data.