Optimizing memory usage in large language model fine-tuning with KAITO: Best practices from Phi-3
The Cloud Native team at Azure is working to make AI on Kubernetes more cost-effective and approachable for a broader range of users.
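
A common memory-saving recipe for fine-tuning a model like Phi-3 is QLoRA: quantize the frozen base weights to 4 bits and train only small low-rank adapters on top. The sketch below illustrates the idea with the Hugging Face transformers, peft, and bitsandbytes libraries; it is a minimal example, not KAITO's internal implementation, and the model checkpoint, LoRA hyperparameters, and target module names are illustrative assumptions.

```python
# Minimal QLoRA-style sketch: 4-bit quantized base model + LoRA adapters.
# Assumes transformers, peft, bitsandbytes, and accelerate are installed.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "microsoft/Phi-3-mini-4k-instruct"  # illustrative checkpoint

# Load the base model with 4-bit NF4 quantization to shrink weight memory
# roughly 4x compared to 16-bit weights.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Train only small low-rank adapters; the quantized base weights stay frozen,
# so optimizer state and gradients exist only for the adapter parameters.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["qkv_proj", "o_proj"],  # assumed Phi-3 projection names
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```

On top of adapter-only training, enabling gradient checkpointing and a paged 8-bit optimizer are standard levers for trimming activation and optimizer-state memory further; which of these a given KAITO tuning preset applies depends on its configuration.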
