Transformer | Microsoft Open Source Blog

News
Cloud

June 26, 2023 • 3 min read

Automate optimization techniques for transformer models

Intel has collaborated with Microsoft to integrate Intel® Neural Compressor into Olive, enabling developers to easily take advantage of model compression techniques on their deployment platforms, including Intel processors and accelerators.

January 25, 2023 • 5 min read

Improve BERT inference speed by combining the power of Optimum, OpenVINO™, ONNX Runtime, and Azure

Make large models smaller and faster with the OpenVINO™ Execution Provider, NNCF, and ONNX Runtime, leveraging Azure Machine Learning.

May 2, 2022 • 5 min read

Optimizing and deploying transformer INT8 inference with ONNX Runtime-TensorRT on NVIDIA GPUs

Mohit Ayani, Solutions Architect, NVIDIA; Shang Zhang, Senior AI Developer Technology Engineer, NVIDIA; Jay Rodge, Product Marketing Manager-AI, NVIDIA

Transformer-based models have revolutionized the natural language processing (NLP) domain.

June 30, 2021 • 7 min read

Journey to optimize large scale transformer model inference with ONNX Runtime

With its resource-efficient and high-performance nature, ONNX Runtime helped us meet the need to deploy a large-scale multi-layer generative transformer model for code, a.k.a. GPT-C, to empower IntelliCode with whole-line code completion suggestions in Visual Studio and Visual Studio Code.