Optimizing and deploying transformer INT8 inference with ONNX Runtime-TensorRT on NVIDIA GPUs
Mohit Ayani, Solutions Architect, NVIDIA
Shang Zhang, Senior AI Developer Technology Engineer, NVIDIA
Jay Rodge, Product Marketing Manager-AI, NVIDIA

Transformer-based models have revolutionized the natural language processing (NLP) domain.