{"id":82664,"date":"2020-09-29T10:00:04","date_gmt":"2020-09-29T17:00:04","guid":{"rendered":"https:\/\/cloudblogs.microsoft.com\/opensource\/?p=82664"},"modified":"2025-06-24T10:43:32","modified_gmt":"2025-06-24T17:43:32","slug":"accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird","status":"publish","type":"post","link":"https:\/\/opensource.microsoft.com\/blog\/2020\/09\/29\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\/","title":{"rendered":"Accelerate traditional machine learning models on GPU with ONNX Runtime"},"content":{"rendered":"\n<p>With the growing trend towards deep learning techniques in AI, there are many investments in accelerating neural network models using GPUs and other specialized hardware. However, many models used in production are still based on traditional machine learning libraries or sometimes a combination of traditional machine learning (ML) and DNNs. We\u2019ve previously shared the performance gains that <a href=\"https:\/\/onnxruntime.ai\" target=\"_blank\" rel=\"noopener noreferrer\">ONNX Runtime<\/a> provides for popular DNN models such as <a href=\"https:\/\/cloudblogs.microsoft.com\/opensource\/2020\/01\/21\/microsoft-onnx-open-source-optimizations-transformer-inference-gpu-cpu\/\" target=\"_blank\" rel=\"noopener noreferrer\">BERT<\/a>, <a href=\"https:\/\/medium.com\/microsoftazure\/faster-and-smaller-quantized-nlp-with-hugging-face-and-onnx-runtime-ec5525473bb7\" target=\"_blank\" rel=\"noopener noreferrer\">quantized GPT-2<\/a>, and <a href=\"https:\/\/medium.com\/microsoftazure\/accelerate-your-nlp-pipelines-using-hugging-face-transformers-and-onnx-runtime-2443578f4333\" target=\"_blank\" rel=\"noopener noreferrer\">other Huggingface Transformer models<\/a>. 
Now, by utilizing <a href=\"https:\/\/github.com\/microsoft\/hummingbird\" target=\"_blank\" rel=\"noopener noreferrer\">Hummingbird<\/a> with ONNX Runtime, you can also capture the benefits of GPU acceleration for traditional ML models.<\/p>\n\n\n\n<p>This capability is enabled through the recently added integration of Hummingbird with the LightGBM converter in <a href=\"https:\/\/github.com\/onnx\/onnxmltools\" target=\"_blank\" rel=\"noopener noreferrer\">ONNXMLTools,<\/a> an open source library that can convert models to the interoperable <a href=\"http:\/\/onnx.ai\/\">ONNX<\/a> format. LightGBM is a gradient boosting framework that uses tree-based learning algorithms, designed for fast training speed and low memory usage. By simply setting a flag, you can feed a <a href=\"https:\/\/github.com\/microsoft\/LightGBM\">LightGBM<\/a> model to the converter to produce an ONNX model that uses neural network operators rather than traditional ML. This Hummingbird integration allows users of LightGBM to take advantage of the GPU accelerations typically only available for neural networks.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"what-is-hummingbird\">What is Hummingbird?<\/h2>\n\n\n\n<p><a href=\"https:\/\/github.com\/microsoft\/hummingbird\" target=\"_blank\" rel=\"noopener noreferrer\">Hummingbird<\/a> is a library for converting traditional ML operators to tensors, with the goal of accelerating inference (scoring\/prediction) for traditional machine learning models. You can learn more about Hummingbird in our introductory <a href=\"http:\/\/aka.ms\/hb-blog\">blog post<\/a>, but we\u2019ll present a short summary here.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Traditional ML libraries and toolkits are usually developed to run in CPU environments. For example, LightGBM does not support using GPU for inference, only for training. 
Traditional ML models (such as DecisionTrees and LinearRegressors) also do not support hardware acceleration.<\/li>\n\n\n\n<li>Hummingbird addresses this gap and allows users to seamlessly leverage hardware acceleration without having to re-engineer their models.\u00a0This is done by reconfiguring algorithmic operators in the traditional ML pipelines such that we can perform computations which are amenable to GPU execution.<\/li>\n\n\n\n<li>Hummingbird is competitive and even <a href=\"https:\/\/aka.ms\/hb-paper\">outperforms<\/a> hand-crafted kernels on micro-benchmarks, while enabling seamless end-to-end acceleration of ML pipelines. We\u2019ll show an example of this speedup below.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"why-use-onnx-runtime\">Why use ONNX Runtime?<\/h2>\n\n\n\n<p>The integration of Hummingbird with ONNXMLTools allows users to take advantage of the flexibility and performance benefits of ONNX Runtime. ONNX Runtime provides a consistent API across platforms and architectures with APIs in Python, C++, C#, Java, and more. This allows models trained in Python to be used in a variety of production environments. ONNX Runtime also provides an abstraction layer for hardware accelerators, such as Nvidia CUDA and TensorRT, Intel OpenVINO, Windows DirectML, and others. This gives users the flexibility to deploy on their hardware of choice with minimal changes to the runtime integration and no changes in the converted model.<\/p>\n\n\n\n<p>While ONNX Runtime does natively support both DNNs and traditional ML models, the Hummingbird integration provides performance improvements by using the neural network form of LightGBM models for inferencing. This may be particularly useful for those already utilizing GPUs for the acceleration of other DNNs. 
Let\u2019s take a look at this in action.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"code-and-performance\">Code and performance<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"import\">Import<\/h3>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nimport numpy as np\nimport lightgbm as lgb\nimport timeit\n \nimport onnxruntime as ort\nfrom onnxmltools.convert import convert_lightgbm\nfrom onnxconverter_common.data_types import FloatTensorType\n<\/pre><\/div>\n\n\n<h3 class=\"wp-block-heading\" id=\"create-some-random-data-for-binary-classification\">Create some random data for binary classification<\/h3>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nmax_depth = 8\nnum_classes = 2\nn_estimators = 1000\nn_features = 30\nn_fit = 1000\nn_pred = 10000\nX = np.random.rand(n_fit, n_features)\nX = np.array(X, dtype=np.float32)\ny = np.random.randint(num_classes, size=n_fit)\ntest_data = np.random.rand(n_pred, n_features).astype('float32')\n<\/pre><\/div>\n\n\n<h3 class=\"wp-block-heading\" id=\"create-and-train-a-lightgbm-model\">Create and train a LightGBM model<\/h3>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nmodel = lgb.LGBMClassifier(n_estimators=n_estimators, max_depth=max_depth, pred_early_stop=False)\nmodel.fit(X, y)\n<\/pre><\/div>\n\n\n<h3 class=\"wp-block-heading\" id=\"use-onnxmltools-to-convert-the-model-to-onnxml\">Use ONNXMLTools to convert the model to ONNXML<\/h3>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\ninput_types = [(\"input\", FloatTensorType([n_pred, n_features]))] # Define the input types for the ONNX model\nonnx_ml_model = convert_lightgbm(model, initial_types=input_types)\n<\/pre><\/div>\n\n\n<h3 class=\"wp-block-heading\" id=\"predict-with-lightgbm\">Predict with 
LightGBM<\/h3>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nlgbm_time = timeit.timeit(\"model.predict_proba(test_data)\", number=7, \n                          setup=\"from __main__ import model, test_data\")\nprint(\"LightGBM (CPU): {}\".format(lgbm_time))\n<\/pre><\/div>\n\n\n<h3 class=\"wp-block-heading\" id=\"predict-with-onnx-ml-model\">Predict with ONNX ML model<\/h3>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nsessionml = ort.InferenceSession(onnx_ml_model.SerializeToString())\nonnxml_time = timeit.timeit(\"sessionml.run([sessionml.get_outputs()[1].name], {sessionml.get_inputs()[0].name: test_data})\",\n                            number=7, setup=\"from __main__ import sessionml, test_data\")\nprint(\"LGBM->ONNXML (CPU): {}\".format(onnxml_time))\n<\/pre><\/div>\n\n\n<p>The result is the following:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nLightGBM (CPU): 1.1157575770048425\nLGBM->ONNXML (CPU): 1.0180995319969952\n<\/pre><\/div>\n\n\n<p>Not bad! Now let\u2019s see Hummingbird in action. 
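One methodological note on these timings: `timeit.timeit(..., number=7)` returns the total wall-clock seconds for all seven calls, not a per-call average, so the figures are comparable as totals. A standard-library-only sketch of the same pattern (the workload function here is a stand-in, not the LightGBM model):

```python
import timeit

def workload():
    # Stand-in for model.predict_proba(test_data); any callable works.
    return sum(i * i for i in range(10_000))

# Total seconds across all 7 calls, mirroring number=7 in the snippets above.
total = timeit.timeit(workload, number=7)
per_call = total / 7
print("workload total: {:.6f}s, per call: {:.6f}s".format(total, per_call))
```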
The only change to the conversion code above is the addition of <code>without_onnx_ml=True<\/code>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"use-onnxmltools-to-generate-an-onnx-model-without-any-ml-operator-using-hummingbird\">Use ONNXMLTools to generate an ONNX model (without any ML operators) using Hummingbird<\/h3>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\ninput_types = [(\"input\", FloatTensorType([n_pred, n_features]))] # Define the input types for the ONNX model\nonnx_model = convert_lightgbm(model, initial_types=input_types, without_onnx_ml=True)\n<\/pre><\/div>\n\n\n<p>We can now <code>pip install onnxruntime-gpu<\/code> and run the prediction over the <code>onnx_model<\/code>:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"predict-with-the-onnx-model-on-gpu\">Predict with the ONNX model (on GPU)<\/h3>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nsess_options = ort.SessionOptions()\nsession = ort.InferenceSession(onnx_model.SerializeToString(), sess_options)\nonnx_time = timeit.timeit(\"session.run([session.get_outputs()[1].name], {session.get_inputs()[0].name: test_data})\",\n                          number=7, setup=\"from __main__ import session, test_data\")\nprint(\"LGBM->ONNX (GPU): {}\".format(onnx_time))\n<\/pre><\/div>\n\n\n<p>And we get:<\/p>\n\n\n\n<p><code>LGBM-&gt;ONNX (GPU): 0.2364534509833902<\/code><\/p>\n\n\n\n<p>There is an approximate 5x improvement over the CPU implementation. Additionally, the ONNX model can take advantage of any additional optimizations available in future releases of ONNX Runtime (ORT), and it can run on any hardware accelerator supported by ORT.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"going-forward\">Going forward<\/h2>\n\n\n\n<p>Hummingbird currently supports converters for ONNX, scikit-learn, XGBoost, and LightGBM. 
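One deployment detail not covered in the post: with `onnxruntime-gpu` installed, recent ONNX Runtime releases accept an explicit execution-provider preference list when creating a session. The provider names below are the standard ORT identifiers; the session construction is left commented out because it needs the converted model and a CUDA-capable GPU:

```python
# Preference order: try CUDA first, fall back to CPU if unavailable.
providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]

# Sketch (assumes the onnx_model produced by the conversion step above):
# import onnxruntime as ort
# session = ort.InferenceSession(onnx_model.SerializeToString(), providers=providers)
# print(session.get_providers())  # reports which providers are actually active

print(providers)
```

Keeping `CPUExecutionProvider` last means the same code runs unchanged on machines without a GPU.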
In the future, we plan to provide similar features for other converters in the ONNXMLTools family, such as XGBoost and scikit-learn. If there are additional operators or integrations you would like to see, please <a href=\"https:\/\/github.com\/microsoft\/hummingbird\/issues\">file an issue<\/a>. We would love to hear about how Hummingbird can help speed up your workloads, and we look forward to adding more features!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>With the growing trend towards deep learning techniques in AI, there are many investments in accelerating neural network models using GPUs and other specialized hardware. However, many models used in production are still based on traditional machine learning libraries or sometimes a combination of traditional machine learning (ML) and DNNs.<\/p>\n","protected":false},"author":5562,"featured_media":95473,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"msxcm_post_with_no_image":false,"ep_exclude_from_search":false,"_classifai_error":"","_classifai_text_to_speech_error":"","footnotes":""},"post_tag":[2272],"content-type":[361,340],"topic":[2238],"programming-languages":[],"coauthors":[1697,1694,657],"class_list":["post-82664","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","tag-microsoft","content-type-project-updates","content-type-tutorials-and-demos","topic-ai-machine-learning","review-flag-1593580428-734","review-flag-1593580415-931","review-flag-1593580419-521","review-flag-1-1593580432-963","review-flag-2-1593580437-411","review-flag-7-1593580463-151","review-flag-8-1593580468-572","review-flag-lever-1593580265-989","review-flag-machi-1680214156-53","review-flag-ml-1680214110-748"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Accelerate traditional machine learning models on GPU with ONNX Runtime | Microsoft Open 
Source Blog<\/title>\n<meta name=\"description\" content=\"By utilizing Hummingbird with ONNX Runtime, you can capture the benefits of GPU acceleration for traditional machine learning models.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/opensource.microsoft.com\/blog\/2020\/09\/29\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Accelerate traditional machine learning models on GPU with ONNX Runtime | Microsoft Open Source Blog\" \/>\n<meta property=\"og:description\" content=\"By utilizing Hummingbird with ONNX Runtime, you can capture the benefits of GPU acceleration for traditional machine learning models.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/opensource.microsoft.com\/blog\/2020\/09\/29\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\/\" \/>\n<meta property=\"og:site_name\" content=\"Microsoft Open Source Blog\" \/>\n<meta property=\"article:published_time\" content=\"2020-09-29T17:00:04+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-06-24T17:43:32+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/opensource.microsoft.com\/blog\/wp-content\/uploads\/2020\/09\/hummingbird_square.png\" \/>\n\t<meta property=\"og:image:width\" content=\"350\" \/>\n\t<meta property=\"og:image:height\" content=\"350\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Matteo Interlandi, Karla Saur, Faith Xu\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/opensource.microsoft.com\/blog\/wp-content\/uploads\/2020\/09\/hummingbird_square.png\" \/>\n<meta name=\"twitter:creator\" content=\"@OpenAtMicrosoft\" \/>\n<meta 
name=\"twitter:site\" content=\"@OpenAtMicrosoft\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Matteo Interlandi, Karla Saur, Faith Xu\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 min read\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/2020\\\/09\\\/29\\\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/2020\\\/09\\\/29\\\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\\\/\"},\"author\":[{\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/author\\\/matteo-interlandi\\\/\",\"@type\":\"Person\",\"@name\":\"Matteo Interlandi\"},{\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/author\\\/karla-saur\\\/\",\"@type\":\"Person\",\"@name\":\"Karla Saur\"},{\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/author\\\/faith-xu\\\/\",\"@type\":\"Person\",\"@name\":\"Faith Xu\"}],\"headline\":\"Accelerate traditional machine learning models on GPU with ONNX 
Runtime\",\"datePublished\":\"2020-09-29T17:00:04+00:00\",\"dateModified\":\"2025-06-24T17:43:32+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/2020\\\/09\\\/29\\\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\\\/\"},\"wordCount\":726,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/2020\\\/09\\\/29\\\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/wp-content\\\/uploads\\\/2024\\\/06\\\/CLO24-Azure-Fintech-006.webp\",\"keywords\":[\"Microsoft\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/2020\\\/09\\\/29\\\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/2020\\\/09\\\/29\\\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\\\/\",\"url\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/2020\\\/09\\\/29\\\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\\\/\",\"name\":\"Accelerate traditional machine learning models on GPU with ONNX Runtime | Microsoft Open Source 
Blog\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/2020\\\/09\\\/29\\\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/2020\\\/09\\\/29\\\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/wp-content\\\/uploads\\\/2024\\\/06\\\/CLO24-Azure-Fintech-006.webp\",\"datePublished\":\"2020-09-29T17:00:04+00:00\",\"dateModified\":\"2025-06-24T17:43:32+00:00\",\"description\":\"By utilizing Hummingbird with ONNX Runtime, you can capture the benefits of GPU acceleration for traditional maching learning models.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/2020\\\/09\\\/29\\\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/2020\\\/09\\\/29\\\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/2020\\\/09\\\/29\\\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\\\/#primaryimage\",\"url\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/wp-content\\\/uploads\\\/2024\\\/06\\\/CLO24-Azure-Fintech-006.webp\",\"contentUrl\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/wp-content\\\/uploads\\\/2024\\\/06\\\/CLO24-Azure-Fintech-006.webp\",\"width\":1170,\"height\":640},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/2020\\\/09\\\/29\\\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\"
,\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Accelerate traditional machine learning models on GPU with ONNX Runtime\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/\",\"name\":\"Microsoft Open Source Blog\",\"description\":\"Open dialogue about openness at Microsoft \u2013 open source, standards, interoperability\",\"publisher\":{\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/#organization\",\"name\":\"Microsoft Open Source Blog\",\"url\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/wp-content\\\/uploads\\\/2019\\\/08\\\/Microsoft-Logo.png\",\"contentUrl\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/wp-content\\\/uploads\\\/2019\\\/08\\\/Microsoft-Logo.png\",\"width\":259,\"height\":194,\"caption\":\"Microsoft Open Source Blog\"},\"image\":{\"@id\":\"https:\\\/\\\/opensource.microsoft.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/x.com\\\/OpenAtMicrosoft\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Accelerate traditional machine learning models on GPU with ONNX Runtime | Microsoft Open Source Blog","description":"By utilizing Hummingbird with ONNX Runtime, you can capture the benefits of GPU acceleration for traditional maching learning models.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/opensource.microsoft.com\/blog\/2020\/09\/29\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\/","og_locale":"en_US","og_type":"article","og_title":"Accelerate traditional machine learning models on GPU with ONNX Runtime | Microsoft Open Source Blog","og_description":"By utilizing Hummingbird with ONNX Runtime, you can capture the benefits of GPU acceleration for traditional maching learning models.","og_url":"https:\/\/opensource.microsoft.com\/blog\/2020\/09\/29\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\/","og_site_name":"Microsoft Open Source Blog","article_published_time":"2020-09-29T17:00:04+00:00","article_modified_time":"2025-06-24T17:43:32+00:00","og_image":[{"width":350,"height":350,"url":"https:\/\/opensource.microsoft.com\/blog\/wp-content\/uploads\/2020\/09\/hummingbird_square.png","type":"image\/png"}],"author":"Matteo Interlandi, Karla Saur, Faith Xu","twitter_card":"summary_large_image","twitter_image":"https:\/\/opensource.microsoft.com\/blog\/wp-content\/uploads\/2020\/09\/hummingbird_square.png","twitter_creator":"@OpenAtMicrosoft","twitter_site":"@OpenAtMicrosoft","twitter_misc":{"Written by":"Matteo Interlandi, Karla Saur, Faith Xu","Est. 
reading time":"4 min read"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/opensource.microsoft.com\/blog\/2020\/09\/29\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\/#article","isPartOf":{"@id":"https:\/\/opensource.microsoft.com\/blog\/2020\/09\/29\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\/"},"author":[{"@id":"https:\/\/opensource.microsoft.com\/blog\/author\/matteo-interlandi\/","@type":"Person","@name":"Matteo Interlandi"},{"@id":"https:\/\/opensource.microsoft.com\/blog\/author\/karla-saur\/","@type":"Person","@name":"Karla Saur"},{"@id":"https:\/\/opensource.microsoft.com\/blog\/author\/faith-xu\/","@type":"Person","@name":"Faith Xu"}],"headline":"Accelerate traditional machine learning models on GPU with ONNX Runtime","datePublished":"2020-09-29T17:00:04+00:00","dateModified":"2025-06-24T17:43:32+00:00","mainEntityOfPage":{"@id":"https:\/\/opensource.microsoft.com\/blog\/2020\/09\/29\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\/"},"wordCount":726,"commentCount":0,"publisher":{"@id":"https:\/\/opensource.microsoft.com\/blog\/#organization"},"image":{"@id":"https:\/\/opensource.microsoft.com\/blog\/2020\/09\/29\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\/#primaryimage"},"thumbnailUrl":"https:\/\/opensource.microsoft.com\/blog\/wp-content\/uploads\/2024\/06\/CLO24-Azure-Fintech-006.webp","keywords":["Microsoft"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/opensource.microsoft.com\/blog\/2020\/09\/29\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/opensource.microsoft.com\/blog\/2020\/09\/29\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\/","url":"https:\/\/opensource.microsoft.com\/blog\/2020\/09\/29\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\/","name":"Accelerate 
traditional machine learning models on GPU with ONNX Runtime | Microsoft Open Source Blog","isPartOf":{"@id":"https:\/\/opensource.microsoft.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/opensource.microsoft.com\/blog\/2020\/09\/29\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\/#primaryimage"},"image":{"@id":"https:\/\/opensource.microsoft.com\/blog\/2020\/09\/29\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\/#primaryimage"},"thumbnailUrl":"https:\/\/opensource.microsoft.com\/blog\/wp-content\/uploads\/2024\/06\/CLO24-Azure-Fintech-006.webp","datePublished":"2020-09-29T17:00:04+00:00","dateModified":"2025-06-24T17:43:32+00:00","description":"By utilizing Hummingbird with ONNX Runtime, you can capture the benefits of GPU acceleration for traditional maching learning models.","breadcrumb":{"@id":"https:\/\/opensource.microsoft.com\/blog\/2020\/09\/29\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/opensource.microsoft.com\/blog\/2020\/09\/29\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/opensource.microsoft.com\/blog\/2020\/09\/29\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\/#primaryimage","url":"https:\/\/opensource.microsoft.com\/blog\/wp-content\/uploads\/2024\/06\/CLO24-Azure-Fintech-006.webp","contentUrl":"https:\/\/opensource.microsoft.com\/blog\/wp-content\/uploads\/2024\/06\/CLO24-Azure-Fintech-006.webp","width":1170,"height":640},{"@type":"BreadcrumbList","@id":"https:\/\/opensource.microsoft.com\/blog\/2020\/09\/29\/accelerate-machine-learning-models-gpu-onnx-runtime-hummingbird\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/opensource.microsoft.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Accelerate traditional machine 
learning models on GPU with ONNX Runtime"}]},{"@type":"WebSite","@id":"https:\/\/opensource.microsoft.com\/blog\/#website","url":"https:\/\/opensource.microsoft.com\/blog\/","name":"Microsoft Open Source Blog","description":"Open dialogue about openness at Microsoft \u2013 open source, standards, interoperability","publisher":{"@id":"https:\/\/opensource.microsoft.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/opensource.microsoft.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/opensource.microsoft.com\/blog\/#organization","name":"Microsoft Open Source Blog","url":"https:\/\/opensource.microsoft.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/opensource.microsoft.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/opensource.microsoft.com\/blog\/wp-content\/uploads\/2019\/08\/Microsoft-Logo.png","contentUrl":"https:\/\/opensource.microsoft.com\/blog\/wp-content\/uploads\/2019\/08\/Microsoft-Logo.png","width":259,"height":194,"caption":"Microsoft Open Source Blog"},"image":{"@id":"https:\/\/opensource.microsoft.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/OpenAtMicrosoft"]}]}},"msxcm_animated_featured_image":null,"bloginabox_display_generated_audio":false,"distributor_meta":false,"distributor_terms":false,"distributor_media":false,"distributor_original_site_name":"Microsoft Open Source 
Blog","distributor_original_site_url":"https:\/\/opensource.microsoft.com\/blog","push-errors":false,"_links":{"self":[{"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/posts\/82664","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/users\/5562"}],"replies":[{"embeddable":true,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/comments?post=82664"}],"version-history":[{"count":1,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/posts\/82664\/revisions"}],"predecessor-version":[{"id":97641,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/posts\/82664\/revisions\/97641"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/media\/95473"}],"wp:attachment":[{"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/media?parent=82664"}],"wp:term":[{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/post_tag?post=82664"},{"taxonomy":"content-type","embeddable":true,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/content-type?post=82664"},{"taxonomy":"topic","embeddable":true,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/topic?post=82664"},{"taxonomy":"programming-languages","embeddable":true,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/programming-languages?post=82664"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/coauthors?post=82664"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}