Microsoft open sources breakthrough optimizations for transformer inference on GPU and CPU
This post is co-authored by Emma Ning, Azure Machine Learning; Nathan Yan, Azure Machine Learning; Jeffrey Zhu, Bing; Jason Li, Bing One of the most popular deep learning models used for natural language processing is BERT (Bidirectional Encoder Representations from Transformers).