
Performant on-device inferencing with ONNX Runtime
The team at Pieces shares the problems and solutions evaluated for their on-device model serving stack and how…
Make large models smaller and faster with the OpenVINO Execution Provider, NNCF, and ONNX Runtime, leveraging Azure Machine Learning.
Together with our colleagues at LinkedIn, we are happy to announce that Feathr is joining the LF AI…
Choosing which machine learning model to use, sharing a model with a colleague, and quickly trying out a…
Scale, performance, and efficient deployment of state-of-the-art deep learning models are ubiquitous challenges as applied machine learning grows…
This post was co-authored by Jithun Nair and Aswin Mathews, members of technical staff at AMD. In recent…
ONNX Runtime now supports building mobile applications in C# with Xamarin. Support for Android and iOS is included…
We are introducing ONNX Runtime Web (ORT Web), a new feature in ONNX Runtime to enable JavaScript developers…
With a simple change to your PyTorch training script, you can now speed up training large language models…
This post was co-authored by Jeff Daily, a Principal Member of Technical Staff, Deep Learning Software for AMD.…
This post was co-authored by Alejandro Saucedo, Director of Machine Learning Engineering at Seldon Technologies. About the co-author:…
With its resource-efficient and high-performance nature, ONNX Runtime helped us meet the need of deploying a large-scale multi-layer…