{"id":94855,"date":"2023-10-04T08:00:00","date_gmt":"2023-10-04T15:00:00","guid":{"rendered":""},"modified":"2024-08-23T09:13:54","modified_gmt":"2024-08-23T16:13:54","slug":"accelerating-over-130000-hugging-face-models-with-onnx-runtime","status":"publish","type":"post","link":"https:\/\/opensource.microsoft.com\/blog\/2023\/10\/04\/accelerating-over-130000-hugging-face-models-with-onnx-runtime\/","title":{"rendered":"Accelerating over 130,000 Hugging Face models with ONNX Runtime"},"content":{"rendered":"\n<p>There are currently over 320,000 models on Hugging Face (HF), and this number continues to grow every day. Only about 6,000 of these models have an indication of ONNX support in the HF Model Hub, but over 130,000 support the ONNX format.<\/p>\n\n\n\n<p>ONNX models can be accelerated with ONNX Runtime (ORT), which works cross-platform and provides coverage for many cloud models and language models. Updating the HF Model Hub with more accurate information about ONNX coverage will ensure that users can leverage all the benefits of ORT when deploying HF models. This blog post will provide an overview of HF model architectures with ORT support, discuss ORT coverage for cloud models and language models, and provide the next steps for increasing the number of ONNX models listed in the HF Model Hub. Ultimately, readers will have a better understanding of why they should use <a href=\"https:\/\/onnxruntime.ai\/\">ONNX Runtime<\/a> to accelerate open source machine learning models from Hugging Face.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"hf-ort-support-overview\">HF ORT support overview<\/h2>\n\n\n\n<p>Hugging Face provides a list of supported model architectures in its <a href=\"https:\/\/huggingface.co\/docs\/transformers\/index\" target=\"_blank\" rel=\"noreferrer noopener\">Transformers documentation<\/a>. 
Model architectures are groups of models with similar operators, meaning that if one model within a model architecture is supported by ONNX, the other models in the architecture are supported by ONNX as well (with rare exceptions). Models in the HF Model Hub can be filtered by model architecture using search queries (e.g., the number of models from the BERT model architecture can be found using this <a href=\"https:\/\/huggingface.co\/models?other=bert\" target=\"_blank\" rel=\"noreferrer noopener\">Hugging Face tool<\/a>).<\/p>\n\n\n\n<p>ORT supports model architectures where:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>One or more models in the model architecture have ONNX listed as a library in the <a href=\"https:\/\/huggingface.co\/models?library=onnx&amp;sort=trending\" target=\"_blank\" rel=\"noreferrer noopener\">HF Model Hub<\/a><\/li>\n\n\n\n<li>The model architecture is supported by the Optimum API (<a href=\"https:\/\/huggingface.co\/docs\/optimum\/exporters\/onnx\/overview\" target=\"_blank\" rel=\"noreferrer noopener\">more information here<\/a>)<\/li>\n\n\n\n<li>The model architecture is supported by Transformers.js (<a href=\"https:\/\/huggingface.co\/docs\/transformers.js\/index\" target=\"_blank\" rel=\"noreferrer noopener\">more information here<\/a>)<\/li>\n<\/ul>\n\n\n\n<p>ORT can greatly improve performance for some of the most popular models in the HF Model Hub. Using ORT instead of PyTorch can reduce average latency per inference (the time it takes to process a single input), with gains of up to 50.10 percent over PyTorch for the whisper-large model and up to 74.30 percent for the whisper-tiny model:<\/p>\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img decoding=\"async\" src=\"https:\/\/opensource.microsoft.com\/blog\/wp-content\/uploads\/2023\/11\/Corrected-whisper-Perf-Charts.webp\" alt=\"(whisper-large model): Line chart of average latency per inference (in seconds) vs. 
(batch, beam) for whisper-large model showing that ORT CUDA performs better than PT GPU and ORT CPU performs better than PT CPU for (batch, beam) combinations (1, 1), (1, 2), and (2, 2).\n\" class=\"wp-image-94977 webp-format\" srcset=\"\" data-orig-src=\"https:\/\/opensource.microsoft.com\/blog\/wp-content\/uploads\/2023\/11\/Corrected-whisper-Perf-Charts.webp\"><\/figure>\n\n\n\n<p>These benchmark results were run with FP32 on an A100 40GB device. For CPU benchmarks, an AMD EPYC 7V12 64-core processor was used.<\/p>\n\n\n\n<p>Other notable models for which ORT has been shown to improve performance include Stable Diffusion versions 1.5 and 2.1, T5, and many more.<\/p>\n\n\n\n<p>The top 30 HF model architectures are all supported by ORT, and over 90 HF model architectures in total boast ORT support. Any gaps in ORT coverage generally represent less popular model architectures.<\/p>\n\n\n\n<p>The following table includes a list of the top 11 model architectures, all of which are convertible to ONNX using the Hugging Face Optimum API, along with the corresponding number of models uploaded to HF (as of the date this post was published). These numbers will continue to grow over time, as will the list of supported model architectures.<\/p>\n\n\n\n<figure class=\"wp-block-table aligncenter is-style-stripes\"><table><thead><tr><th class=\"has-text-align-left\" data-align=\"left\"><strong>Model Architecture<\/strong><\/th><th class=\"has-text-align-left\" data-align=\"left\"><strong>Approx. No. 
of Models<\/strong><\/th><\/tr><\/thead><tbody><tr><td class=\"has-text-align-left\" data-align=\"left\"><a href=\"https:\/\/huggingface.co\/models?other=bert\" target=\"_blank\" rel=\"noreferrer noopener\">bert<\/a><\/td><td class=\"has-text-align-left\" data-align=\"left\">28180<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><a href=\"https:\/\/huggingface.co\/models?other=gpt2\" target=\"_blank\" rel=\"noreferrer noopener\">gpt2<\/a><\/td><td class=\"has-text-align-left\" data-align=\"left\">14060<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><a href=\"https:\/\/huggingface.co\/models?other=distilbert\" target=\"_blank\" rel=\"noreferrer noopener\">distilbert<\/a><\/td><td class=\"has-text-align-left\" data-align=\"left\">11540<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><a href=\"https:\/\/huggingface.co\/models?other=roberta\" target=\"_blank\" rel=\"noreferrer noopener\">roberta<\/a><\/td><td class=\"has-text-align-left\" data-align=\"left\">10800<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><a href=\"https:\/\/huggingface.co\/models?other=t5\" target=\"_blank\" rel=\"noreferrer noopener\">t5<\/a><\/td><td class=\"has-text-align-left\" data-align=\"left\">10450<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><a href=\"https:\/\/huggingface.co\/models?other=wav2vec2\" target=\"_blank\" rel=\"noreferrer noopener\">wav2vec2<\/a><\/td><td class=\"has-text-align-left\" data-align=\"left\">6560<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><a href=\"https:\/\/huggingface.co\/models?other=stable-diffusion\" target=\"_blank\" rel=\"noreferrer noopener\">stable-diffusion<\/a><\/td><td class=\"has-text-align-left\" data-align=\"left\">5880<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><a href=\"https:\/\/huggingface.co\/models?other=xlm-roberta\" target=\"_blank\" rel=\"noreferrer 
noopener\">xlm-roberta<\/a><\/td><td class=\"has-text-align-left\" data-align=\"left\">5100<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><a href=\"https:\/\/huggingface.co\/models?other=whisper\" target=\"_blank\" rel=\"noreferrer noopener\">whisper<\/a><\/td><td class=\"has-text-align-left\" data-align=\"left\">4400<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><a href=\"https:\/\/huggingface.co\/models?other=bart\" target=\"_blank\" rel=\"noreferrer noopener\">bart<\/a><\/td><td class=\"has-text-align-left\" data-align=\"left\">3590<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><a href=\"https:\/\/huggingface.co\/models?other=marian\" target=\"_blank\" rel=\"noreferrer noopener\">marian<\/a><\/td><td class=\"has-text-align-left\" data-align=\"left\">2840<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"language-models\">Language models<\/h2>\n\n\n\n<p>ONNX Runtime also supports many increasingly popular language model architectures, including most of those available in the HF Model Hub. These model architectures include the following, all of which are convertible to ONNX using the Hugging Face Optimum API:<\/p>\n\n\n\n<figure class=\"wp-block-table aligncenter is-style-stripes\"><table><thead><tr><th class=\"has-text-align-left\" data-align=\"left\"><strong>Language Model Architecture<\/strong><\/th><th class=\"has-text-align-left\" data-align=\"left\"><strong>Approx. No. 
of Models<\/strong><\/th><\/tr><\/thead><tbody><tr><td class=\"has-text-align-left\" data-align=\"left\"><a href=\"https:\/\/huggingface.co\/models?other=llama\" target=\"_blank\" rel=\"noreferrer noopener\">llama<\/a><\/td><td class=\"has-text-align-left\" data-align=\"left\">8030<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><a href=\"https:\/\/huggingface.co\/models?other=gpt_neox\" target=\"_blank\" rel=\"noreferrer noopener\">gpt_neox<\/a><\/td><td class=\"has-text-align-left\" data-align=\"left\">1240<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><a href=\"https:\/\/huggingface.co\/models?other=gpt_neo\" target=\"_blank\" rel=\"noreferrer noopener\">gpt_neo<\/a><\/td><td class=\"has-text-align-left\" data-align=\"left\">950<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><a href=\"https:\/\/huggingface.co\/models?other=opt\" target=\"_blank\" rel=\"noreferrer noopener\">opt<\/a><\/td><td class=\"has-text-align-left\" data-align=\"left\">680<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><a href=\"https:\/\/huggingface.co\/models?other=bloom\" target=\"_blank\" rel=\"noreferrer noopener\">bloom<\/a><\/td><td class=\"has-text-align-left\" data-align=\"left\">620<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><a href=\"https:\/\/huggingface.co\/models?other=gpt-j\" target=\"_blank\" rel=\"noreferrer noopener\">gpt-j<\/a><\/td><td class=\"has-text-align-left\" data-align=\"left\">530<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><a href=\"https:\/\/huggingface.co\/models?other=flan-t5\" target=\"_blank\" rel=\"noreferrer noopener\">flan-t5<\/a><\/td><td class=\"has-text-align-left\" data-align=\"left\">10<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>ONNX Runtime support for the recently released llama2 model architecture is still in the works but will be available on Hugging Face very soon. 
For more detailed tracking and evaluation of recently released language models from the community, see HF\u2019s <a href=\"https:\/\/huggingface.co\/spaces\/HuggingFaceH4\/open_llm_leaderboard\" target=\"_blank\" rel=\"noreferrer noopener\">Open LLM Leaderboard<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"azure-machine-learning-cloud-models\">Azure Machine Learning cloud models<\/h2>\n\n\n\n<p>Models accelerated by ONNX Runtime can be easily deployed to the cloud through Azure Machine Learning, which improves time-to-value, streamlines MLOps, provides built-in AI governance, and supports the design of responsible AI solutions.<\/p>\n\n\n\n<p><a href=\"https:\/\/azure.microsoft.com\/en-us\/products\/machine-learning\" target=\"_blank\" rel=\"noreferrer noopener\">Azure Machine Learning<\/a> also publishes a curated model list that is updated regularly and includes some of the most popular models at the moment. Of the models on this list that are available in the HF Model Hub, over 84 percent have HF Optimum ONNX support. 
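Models that already advertise ONNX support can be enumerated programmatically with the `huggingface_hub` client; this is a sketch assuming the package is installed, network access to the Hub, and that the `library` filter (the same filter the Hub website uses for its ONNX library tag) is available in the installed version.

```python
# Sketch: list Hub models that explicitly tag ONNX as a library.
from huggingface_hub import HfApi

api = HfApi()
# limit=5 keeps the query small; drop it to iterate over all matches
onnx_models = list(api.list_models(library="onnx", limit=5))
for m in onnx_models:
    print(m.id)
```

Because the Hub tag is self-reported, this undercounts actual ONNX coverage, which is exactly the gap between the ~6,000 tagged models and the 130,000+ convertible ones discussed above.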
Six of the remaining models are of the llama2 model architecture, so, as previously stated, ONNX Runtime support is coming soon.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"next-steps-with-onnx\">Next steps with ONNX<\/h2>\n\n\n\n<p>The top priority moving forward is to add as many ONNX models as possible to the HF Model Hub so these models are easily accessible to the community.<\/p>\n\n\n\n<p>We are currently in the process of identifying a scalable way to run the Optimum API and working with the HF team directly to increase the number of models indicated to have ONNX support in the HF Model Hub.<\/p>\n\n\n\n<p>We also encourage members of the community to add their own ONNX models to HF, as over 100,000 models in the HF Model Hub have ONNX support that is not indicated.<\/p>\n\n\n\n<p>A more condensed version of this post can also be found on the Hugging Face <a href=\"https:\/\/huggingface.co\/blog?tag=open-source-collab\" target=\"_blank\" rel=\"noreferrer noopener\">Open Source Collab blog<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>ONNX models can be accelerated with ONNX Runtime, which works cross-platform and provides coverage for many cloud and language 
models.<\/p>\n","protected":false},"author":6220,"featured_media":95475,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msxcm_post_with_no_image":false,"ep_exclude_from_search":false,"_classifai_error":"","_classifai_text_to_speech_error":"","footnotes":""},"post_tag":[],"content-type":[346],"topic":[2238],"programming-languages":[],"coauthors":[2051],"class_list":["post-94855","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","content-type-news","topic-ai-machine-learning","review-flag-1-1593580432-963","review-flag-2-1593580437-411","review-flag-5-1593580453-725","review-flag-6-1593580457-852","review-flag-lever-1593580265-989","review-flag-machi-1680214156-53","review-flag-perce-1706214400-122"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.2 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Accelerating over 130,000 Hugging Face models with ONNX Runtime | Microsoft Open Source Blog<\/title>\n<meta name=\"description\" content=\"Learn more on how ONNX Runtime helps users accelerate open source machine learning models from Hugging Face.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/opensource.microsoft.com\/blog\/2023\/10\/04\/accelerating-over-130000-hugging-face-models-with-onnx-runtime\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Accelerating over 130,000 Hugging Face models with ONNX Runtime | Microsoft Open Source Blog\" \/>\n<meta property=\"og:description\" content=\"Learn more on how ONNX Runtime helps users accelerate open source machine learning models from Hugging Face.\" \/>\n<meta property=\"og:url\" 
content=\"https:\/\/opensource.microsoft.com\/blog\/2023\/10\/04\/accelerating-over-130000-hugging-face-models-with-onnx-runtime\/\" \/>\n<meta property=\"og:site_name\" content=\"Microsoft Open Source Blog\" \/>\n<meta property=\"article:published_time\" content=\"2023-10-04T15:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-08-23T16:13:54+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/opensource.microsoft.com\/blog\/wp-content\/uploads\/2024\/06\/CLO24-Azure-Retail-025.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1170\" \/>\n\t<meta property=\"og:image:height\" content=\"640\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Sophie Schoenmeyer\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@OpenAtMicrosoft\" \/>\n<meta name=\"twitter:site\" content=\"@OpenAtMicrosoft\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Sophie Schoenmeyer\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 min read\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/opensource.microsoft.com\/blog\/2023\/10\/04\/accelerating-over-130000-hugging-face-models-with-onnx-runtime\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/opensource.microsoft.com\/blog\/2023\/10\/04\/accelerating-over-130000-hugging-face-models-with-onnx-runtime\/\"},\"author\":[{\"@id\":\"https:\/\/opensource.microsoft.com\/blog\/author\/sophie-schoenmeyer\/\",\"@type\":\"Person\",\"@name\":\"Sophie Schoenmeyer\"}],\"headline\":\"Accelerating over 130,000 Hugging Face models with ONNX Runtime\",\"datePublished\":\"2023-10-04T15:00:00+00:00\",\"dateModified\":\"2024-08-23T16:13:54+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/opensource.microsoft.com\/blog\/2023\/10\/04\/accelerating-over-130000-hugging-face-models-with-onnx-runtime\/\"},\"wordCount\":829,\"publisher\":{\"@id\":\"https:\/\/opensource.microsoft.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/opensource.microsoft.com\/blog\/2023\/10\/04\/accelerating-over-130000-hugging-face-models-with-onnx-runtime\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/opensource.microsoft.com\/blog\/wp-content\/uploads\/2024\/06\/CLO24-Azure-Retail-025.webp\",\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/opensource.microsoft.com\/blog\/2023\/10\/04\/accelerating-over-130000-hugging-face-models-with-onnx-runtime\/\",\"url\":\"https:\/\/opensource.microsoft.com\/blog\/2023\/10\/04\/accelerating-over-130000-hugging-face-models-with-onnx-runtime\/\",\"name\":\"Accelerating over 130,000 Hugging Face models with ONNX Runtime | Microsoft Open Source 
Blog\",\"isPartOf\":{\"@id\":\"https:\/\/opensource.microsoft.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/opensource.microsoft.com\/blog\/2023\/10\/04\/accelerating-over-130000-hugging-face-models-with-onnx-runtime\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/opensource.microsoft.com\/blog\/2023\/10\/04\/accelerating-over-130000-hugging-face-models-with-onnx-runtime\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/opensource.microsoft.com\/blog\/wp-content\/uploads\/2024\/06\/CLO24-Azure-Retail-025.webp\",\"datePublished\":\"2023-10-04T15:00:00+00:00\",\"dateModified\":\"2024-08-23T16:13:54+00:00\",\"description\":\"Learn more on how ONNX Runtime helps users accelerate open source machine learning models from Hugging Face.\",\"breadcrumb\":{\"@id\":\"https:\/\/opensource.microsoft.com\/blog\/2023\/10\/04\/accelerating-over-130000-hugging-face-models-with-onnx-runtime\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/opensource.microsoft.com\/blog\/2023\/10\/04\/accelerating-over-130000-hugging-face-models-with-onnx-runtime\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/opensource.microsoft.com\/blog\/2023\/10\/04\/accelerating-over-130000-hugging-face-models-with-onnx-runtime\/#primaryimage\",\"url\":\"https:\/\/opensource.microsoft.com\/blog\/wp-content\/uploads\/2024\/06\/CLO24-Azure-Retail-025.webp\",\"contentUrl\":\"https:\/\/opensource.microsoft.com\/blog\/wp-content\/uploads\/2024\/06\/CLO24-Azure-Retail-025.webp\",\"width\":1170,\"height\":640},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/opensource.microsoft.com\/blog\/2023\/10\/04\/accelerating-over-130000-hugging-face-models-with-onnx-runtime\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/opensource.microsoft.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Accelerating over 130,000 Hugging Face 
models with ONNX Runtime\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/opensource.microsoft.com\/blog\/#website\",\"url\":\"https:\/\/opensource.microsoft.com\/blog\/\",\"name\":\"Microsoft Open Source Blog\",\"description\":\"Open dialogue about openness at Microsoft \u2013 open source, standards, interoperability\",\"publisher\":{\"@id\":\"https:\/\/opensource.microsoft.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/opensource.microsoft.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/opensource.microsoft.com\/blog\/#organization\",\"name\":\"Microsoft Open Source Blog\",\"url\":\"https:\/\/opensource.microsoft.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/opensource.microsoft.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/opensource.microsoft.com\/blog\/wp-content\/uploads\/2019\/08\/Microsoft-Logo.png\",\"contentUrl\":\"https:\/\/opensource.microsoft.com\/blog\/wp-content\/uploads\/2019\/08\/Microsoft-Logo.png\",\"width\":259,\"height\":194,\"caption\":\"Microsoft Open Source Blog\"},\"image\":{\"@id\":\"https:\/\/opensource.microsoft.com\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/OpenAtMicrosoft\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","msxcm_display_generated_audio":false,"msxcm_animated_featured_image":null,"distributor_meta":false,"distributor_terms":false,"distributor_media":false,"distributor_original_site_name":"Microsoft Open Source 
Blog","distributor_original_site_url":"https:\/\/opensource.microsoft.com\/blog","push-errors":false,"_links":{"self":[{"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/posts\/94855","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/users\/6220"}],"replies":[{"embeddable":true,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/comments?post=94855"}],"version-history":[{"count":16,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/posts\/94855\/revisions"}],"predecessor-version":[{"id":96344,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/posts\/94855\/revisions\/96344"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/media\/95475"}],"wp:attachment":[{"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/media?parent=94855"}],"wp:term":[{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/post_tag?post=94855"},{"taxonomy":"content-type","embeddable":true,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/content-type?post=94855"},{"taxonomy":"topic","embeddable":true,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/topic?post=94855"},{"taxonomy":"programming-languages","embeddable":true,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/programming-languages?post=94855"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/opensource.microsoft.com\/blog\/wp-json\/wp\/v2\/coauthors?post=94855"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}