Top 30 Cloud GPU Providers & the GPUs They Offer in 2024
Excerpt
Explore all cloud GPU providers’ offerings incl. deep learning chips from Nvidia / AMD, regions, focus markets, energy usage & bare metal options.
GPU procurement complexity has been increasing with more providers adding GPU cloud to their offering. AIMultiple analyzed GPU cloud providers across most relevant dimensions to facilitate cloud GPU procurement. If you
- know which GPU model (e.g. H100) you need, check out GPU cloud providers and the models they offer.
- want to see major cloud GPU providers:
GPU / AI CSP* | Brands** | Models*** | Combinations**** | Comments |
---|---|---|---|---|
Seeweb | | 4 | 25 | Focus: Serving EU customers from EU data centers |
AWS | AWS chips like Trainium | 7 | 16 | |
Azure | Working on own chips | 9 | 20 | |
GCP | Google Cloud tensor processing units (TPUs) | 6 | 22 | |
Nvidia DGX | | 2 | 2 | Sole focus: High-scale enterprise workloads |
OCI | | 5 | 12 | Bare metal available |
IBM Cloud | | 3 | 6 | |
CoreWeave | | 9 | 63 | |
Jarvis Labs | | 5 | 5 | Sole focus: Cloud GPUs |
Lambda Labs | | 3 | 7 | Sole focus: Cloud GPUs |
Paperspace CORE | Graphcore | 10 | 28 | Sole focus: Cloud GPUs |
ACE Cloud | | 3 | 11 | |
Alibaba Cloud | Alibaba chips like Hanguang 800 | 11 | 12 | |
Cirrascale | Cerebras, Graphcore, SambaNova | 16 | 29 | Focus: Research workloads |
CoreWeave | | 7 | 13 | |
Crusoe Cloud | | 4 | 18 | |
Datacrunch.io | | 4 | 16 | Sole focus: Cloud GPUs |
FluidStack | | 3 | 6 | |
Latitude.sh | | 2 | 4 | Bare metal available; US & EU data centers |
LeaderGPU | | 3 | 3 | |
Linode | | 1 | 4 | |
Nebius AI | | 4 | 8 | |
OVHcloud | | 3 | 3 | |
RunPod | | 4 | 4 | |
Scaleway | | 1 | 2 | |
TensorDock | | 12 | 14 | |
Vast.ai | | 8 | 28 | |
Vultr | | 5 | 16 | |
Serverless GPU providers | Depends on provider | Not relevant | Not relevant | |
Ranking: Sponsors have links and are highlighted at the top. The remaining providers are sorted alphabetically.
* Cloud service provider (CSP)
** All providers offer NVIDIA GPUs. In addition, some CSPs provide hardware from other AI chip makers, as indicated in this column.
*** Distinct Nvidia GPU models offered. For example, “A100 40 GB” and “A100 80 GB” are counted as separate models.
**** Distinct multi-GPU combinations offered. For example, “1 x A100 40 GB” and “2 x A100 40 GB” are counted as separate multi-GPU combinations.
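To make the counting convention in these footnotes concrete, here is a minimal Python sketch (the catalog entries are made up for illustration) showing how distinct models and distinct multi-GPU combinations are tallied separately:

```python
# Hypothetical provider catalog: each entry is "<count> x <model>".
catalog = [
    "1 x A100 40 GB",
    "2 x A100 40 GB",
    "1 x A100 80 GB",
    "8 x H100 80 GB",
]

# Distinct multi-GPU combinations: every catalog entry counts separately.
combinations = set(catalog)

# Distinct models: strip the GPU-count prefix before deduplicating.
models = {entry.split(" x ", 1)[1] for entry in catalog}

print(len(models))        # 3 distinct models
print(len(combinations))  # 4 distinct combinations
```

Note that "A100 40 GB" and "A100 80 GB" count as two models, matching footnote ***.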
If you know which GPU model you are interested in, please check out all providers with the specific GPU models that they offer.
GPUs can be delivered in a serverless manner, as virtual GPUs or as bare metal. While serverless offers the easiest way to manage workloads, bare metal offers the highest level of control over the hardware. If you are specifically looking for these, please visit the relevant sections:
While listing pros and cons for each provider, we relied on user reviews on G2, other online reviews as well as our assessment.
What are Virtual GPU providers?
Virtual GPUs (vGPUs) are virtual machines that allow multiple users to share GPUs over the cloud. They are the most commonly offered form of cloud GPUs. Leading providers include:
Amazon Web Services (AWS)
AWS is the largest cloud platform provider and a leading cloud GPU provider.1 Amazon EC2 (Elastic Compute Cloud) offers GPU-powered virtual machine instances facilitating accelerated computations for deep learning tasks.
Pros
Offers seamless integration with other popular AWS solutions like:
- SageMaker, used for creating, training, deploying, and applying ML models at scale
- Simple Storage Service (Amazon S3), Amazon RDS (Relational Database Services) or other AWS storage services, which can serve as a storage solution for training data
Cons
- AWS offers fewer GPU options than some other players like Azure.2
- Users find the UI complex3
Pricing of key options
G4dn Instances (NVIDIA T4 GPUs)
- On-Demand pricing starts from ~ $0.526 per hour for the smallest instance type (g4dn.xlarge) with 4 vCPUs and 16 GB memory.
- Prices increase with larger instance types, such as the g4dn.12xlarge, which costs ~ $4.352 per hour and offers 48 vCPUs and 192 GB memory.
- Spot Instances can offer significant discounts, sometimes up to 90% off the On-Demand prices.
G4ad Instances (AMD Radeon Pro V520 GPUs)
- On-Demand pricing for the smallest instance type (g4ad.xlarge) starts at ~ $0.75 per hour with 4 vCPUs and 16 GB memory.
- The g4ad.16xlarge, a larger instance type, costs ~$7.00 per hour and offers 64 vCPUs and 256 GB memory.4
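As a rough budgeting aid, the hourly rates above can be converted into monthly estimates. The sketch below assumes an average of ~730 hours per month and an instance running 24/7; actual AWS bills vary by region, usage pattern, and current spot prices:

```python
HOURS_PER_MONTH = 730  # average hours in a month (assumption)

def monthly_cost(hourly_rate: float, discount: float = 0.0) -> float:
    """Estimate the monthly cost of running one instance around the clock."""
    return hourly_rate * HOURS_PER_MONTH * (1 - discount)

# On-demand rates quoted above
print(round(monthly_cost(0.526), 2))   # g4dn.xlarge: ~384 USD/month
print(round(monthly_cost(4.352), 2))   # g4dn.12xlarge: ~3,177 USD/month

# Spot instances at the maximum quoted discount (up to 90% off)
print(round(monthly_cost(0.526, discount=0.90), 2))  # ~38 USD/month
```

The same helper works for any of the hourly rates in this article.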
Microsoft Azure
Microsoft Azure, the second largest cloud provider, provides a cloud-based GPU service known as Azure N-Series Virtual Machines, which leverages NVIDIA GPUs like other providers to deliver high-performance computing capabilities. This service is particularly suited for demanding applications such as deep learning, simulations, rendering and the training of AI models.
Microsoft is also rumored to have started producing its own chips.5
Pros
- Microsoft Azure offers a larger set of GPU options than most other providers
- Free plan offers 12 months of access to some services
- Azure’s intuitive user interface is praised for its ease of use
Cons
- Some users find that certain advanced features within Azure require a high level of technical expertise to configure and manage effectively6
Pricing of key options
Instance Type | vCPUs | Memory (GiB) | GPU | Price per Hour |
---|---|---|---|---|
ND A100 v4 1 | 8 | 112 | 1 | $3.06 |
ND A100 v4 2 | 16 | 224 | 2 | $6.12 |
ND A100 v4 4 | 32 | 448 | 4 | $12.24 |
ND A100 v4 8 | 64 | 896 | 8 | $24.48 |
This table contains the pricing of the latest versions of the N series.
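A quick sanity check on the table: ND A100 v4 pricing scales linearly with GPU count, i.e. every instance size works out to the same per-GPU hourly rate:

```python
# (gpus, price_per_hour) pairs taken from the ND A100 v4 table above
nd_a100_v4 = [(1, 3.06), (2, 6.12), (4, 12.24), (8, 24.48)]

per_gpu_rates = [price / gpus for gpus, price in nd_a100_v4]
print(per_gpu_rates)  # every size costs 3.06 USD per GPU-hour
```

So on this series, scaling up to more GPUs carries no per-GPU premium at list price.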
Google Cloud Platform (GCP)
Google Cloud Platform (GCP) is the third biggest cloud platform.7 GCP offers GPU instances that can be attached to existing virtual machines (VMs) or can be part of a new VM setup.
Pros
- UI is easier to use than that of other common platforms such as AWS
- Offers limited free GPU options for Kaggle and Colab users
- Customers can use 20+ products for free, up to monthly usage limits
Cons
- GPUs must be attached to standard VMs, making pricing confusing
- Like AWS, GCP offers fewer GPU options than some players like Azure
Pricing of key options
Below is the pricing of the A3 VM with NVIDIA H100 GPUs, Google's latest cloud GPU offering:
Pricing Model | Estimated Cost (USD) |
---|---|
Per Hour | 126 |
Per Month (Pay-as-you-go) | 92 |
Per Year (with Commitment) | 64 |
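Taking the table's figures at face value (the exact billing units depend on GCP's current terms), the implied discount of a yearly commitment relative to the pay-per-hour rate can be computed as:

```python
on_demand = 126   # per-hour figure from the table above (USD)
committed = 64    # per-year-commitment figure (USD)

discount = 1 - committed / on_demand
print(f"{discount:.1%}")  # a yearly commitment saves roughly 49%
```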
NVIDIA DGX Cloud
NVIDIA is the leader in the GPU hardware market. NVIDIA launched its GPU cloud offering, DGX Cloud, by leasing space in leading cloud providers’ (e.g. OCI, Azure and GCP) data centers.
DGX Cloud offers NVIDIA Base Command™, NVIDIA AI Enterprise and NVIDIA networking platforms. DGX Cloud instances featured 8 NVIDIA H100 or A100 80GB Tensor Core GPUs at launch.
Amgen, an early customer, claims 3x faster training of protein LLMs with BioNeMo and up to 100x faster post-training analysis with NVIDIA RAPIDS.8
The offering is enterprise focused with the list price of DGX Cloud instances starting at $36,999 per instance per month at launch.
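For comparison shopping, the list price can be normalized to an effective per-GPU hourly rate. The sketch below assumes an 8-GPU instance running an average of ~730 hours a month:

```python
monthly_list_price = 36_999  # USD per DGX Cloud instance per month at launch
gpus_per_instance = 8
hours_per_month = 730        # average hours in a month (assumption)

per_gpu_hour = monthly_list_price / (gpus_per_instance * hours_per_month)
print(round(per_gpu_hour, 2))  # ~6.34 USD per GPU-hour
```

At roughly $6.34 per GPU-hour, this is about double Azure's $3.06 ND A100 v4 per-GPU rate shown earlier, consistent with the stacked-margin point below.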
Pros
- Support from NVIDIA engineers
- Multi-node scaling that can support training across up to 256 GPUs, enabling faster large-scale model training
- Pre-configured with NVIDIA AI software for quick deployment, reducing setup time
Cons
- Offering is not suitable for firms with limited GPU needs
- The service is provided on top of cloud providers’ physical infrastructure. Therefore, buyers pay the margins of both the cloud provider and NVIDIA.
IBM Cloud
IBM Cloud’s GPU offering allows for flexible server selection and integrates seamlessly with IBM Cloud’s architecture, applications, and APIs. This is delivered via a globally distributed network of interconnected data centers.
Pros
- Powerful integration with IBM Cloud architecture and applications
- Worldwide distributed data centers increase data protection
Cons
- Limited adoption compared to the top 3 providers.9
Oracle Cloud Infrastructure (OCI)
Oracle ramped up its GPU offering after formalizing its partnership with NVIDIA.10
Oracle provides GPU instances in both bare-metal and virtual machine formats for quick, cost-effective, and high-efficiency computing. Oracle’s Bare-Metal instances offer customers the capability to execute tasks in non-virtualized settings. These instances are accessible in regions such as the United States, Germany, and the United Kingdom, with availability under both on-demand and interruptible pricing models.
Customers
Oracle serves some of the leading LLM providers like Cohere, a company that Oracle also invested in.11
Pros
- Wide range of cloud products and services. Among the tech giants’ cloud services, only OCI offers bare metal GPUs.12 For GPU cluster users, only OCI offers RoCE v2 for its cluster technology among the tech giants’ cloud services.13
- Cost-effective compared to other major cloud providers
- Offers provision for free trial period and some free-forever products
Cons
- User interface perceived as clunky and slow by users14
- Some users find the documentation difficult to understand15
- The process of starting to use Oracle Cloud compute services was viewed as bureaucratic, complicated, and time-consuming by some users
CoreWeave
CoreWeave is a specialized GPU cloud provider. NVIDIA is one of CoreWeave’s investors. CoreWeave claims to have 45,000 GPUs and to be the first Elite-level cloud services provider selected by NVIDIA.16
Jarvis Labs
Jarvis Labs, established in 2019 and based in India, specializes in facilitating swift and straightforward training of deep learning models on GPU compute instances. With its data centers located in India, Jarvis Labs is recognized for its user-friendly setup that enables users to start operations promptly.
Jarvis Labs claims to serve 10,000+ AI practitioners.17
Pros
- No credit card required to register
- A simple interface for beginners
Cons
- Although Jarvis Labs is gaining momentum, its suitability for enterprise-level tasks would need to be validated. It appears to cater to smaller workloads, since it does not offer multi-GPU instances.
Lambda Labs
Originally, Lambda Labs was a hardware company offering GPU desktop assembly and server hardware solutions. Since 2018, Lambda Labs has offered Lambda Cloud as a GPU platform. The virtual machines it offers come pre-equipped with popular deep learning frameworks, CUDA drivers, and a dedicated Jupyter notebook. Users can connect to these instances through the web terminal in the cloud dashboard or directly using the provided SSH keys.
Lambda Labs claims to be used by 10,000+ research teams and has a purely GPU focused offering.18
Paperspace CORE
Paperspace is a cloud computing platform that offers GPU-accelerated virtual machines, among other services. The company is well-regarded for its focus on GPU-intensive workloads and provides a cloud platform for developing, training, and deploying machine learning models.
Paperspace claims to have served 650,000 users.19
Pros
- Offers a wide range of GPUs compared to other providers
- Users find the prices fair for the computing power provided
- Users find the customer service to be friendly and responsive
Cons
- Some users complain about machine availability, both in terms of the free virtual machines and specific machine types not being available in all regions20
- The integrated Jupyter interface is criticized and lacks some keyboard shortcuts, although a native Jupyter Notebook interface is offered
- Longer loading or creation times for machines
- Monthly subscription fee on top of machine costs can be a downside, and multi-GPU training can be expensive
What are serverless GPU providers?
Serverless is a new cloud computing approach that facilitates cloud management. Many cloud providers are starting to offer serverless GPU offerings. We will be sharing a list here soon.
Explore more on Serverless GPUs.
What are bare metal GPU providers?
Bare metal is not as commonly provided as GPU VMs. The providers include:
- Latitude.sh offers bare-metal A100 and H100 GPUs.
- Oracle Cloud Infrastructure
For more, see AIMultiple’s bare-metal GPU provider list.
What are cloud GPU providers based in Europe?
European businesses may prefer to keep their data in Europe for
- GDPR compliance and data security
- Offering faster AI inference services to European users
This is possible with some of the global cloud providers, but there are also Europe-based cloud GPU providers.
Seeweb
Seeweb is a public cloud provider headquartered in Italy that runs 100% on renewable energy. Seeweb supports IaC via Terraform and offers 5 different GPU models.
Datacrunch.io
Datacrunch provides Nvidia’s A100, H100, RTX 6000 and V100 models in groups of 1, 2, 4 or 8 GPUs. The company is based in Helsinki, Finland and relies on 100% renewable energy.
OVHcloud
OVHcloud is a public cloud provider headquartered in France. It started offering Nvidia GPUs in 2023 and plans to expand its offering.21
Scaleway
Scaleway offers H100 instances, provides 3 European regions (Paris, Amsterdam, Warsaw) and relies 100% on renewable energy. For high-scale users, the Nabu 2023 supercomputer with its 1,016 Nvidia H100 Tensor Core GPUs is available.
What are upcoming GPU cloud providers?
These providers have limited reach or scope or recently launched their offerings. Therefore they were not included in the top 10:
Alibaba Cloud
Alibaba’s offering may be attractive for businesses operating in China. It is also available across 20 regions including those in Australia, Dubai, Germany, India, Japan, Singapore, the USA and the UK.22
However, a US or EU organization with access to top secret data in domains such as state, defense or telecom may not prefer to work with a cloud service provider headquartered in China.
Cirrascale
Cirrascale specializes in providing different AI hardware to research teams. Though it is one of the smallest teams in this domain with ~20 employees, it offers AI hardware from 4 different AI hardware producers.23
Voltage Park
Voltage Park is a non-profit that spent ~$500 million with NVIDIA to set up 24,000 [cloud H100 GPUs](https://aimultiple.com/cloud-h100).24 25 It offers low-price GPU rental to AI-focused companies like Character AI.
FAQ
What is a cloud GPU platform?
Why do you need cloud GPU services?
How secure are cloud GPU services?
External links
- 1. “Big Three Dominate the Global Cloud Market”. Statista. Retrieved July 19, 2023
- 2. https://www.g2.com/products/amazon-ec2/reviews/amazon-ec2-review-8154729
- 3. https://www.g2.com/products/aws-cloud/reviews/aws-cloud-review-8271023
- 4. “Amazon EC2 G4 Instances — Amazon Web Services (AWS)”. Retrieved July 23, 2024
- 5. “Microsoft to Debut AI Chip Next Month That Could Cut Nvidia GPU Costs”. The Information. October 6, 2023. Retrieved October 8, 2023
- 6. “G2 Review”
- 7. Same Statista source as above
- 8. “NVIDIA Launches DGX Cloud, Giving Every Enterprise Instant Access to AI Supercomputer From a Browser”. NVIDIA. March 21, 2023. Retrieved September 26, 2023
- 9. Same Statista source as above
- 10. “Oracle and NVIDIA Partner to Speed AI Adoption for Enterprises”. Oracle. October 18, 2022. Retrieved September 26, 2023
- 11. “Oracle to Deliver Powerful and Secure Generative AI Services for Business”. Oracle. June 13, 2023. Retrieved October 10, 2023
- 12. “GPU instances”. Oracle. Retrieved July 19, 2023
- 13. Oracle Cloud Infrastructure Blog. Oracle. Retrieved July 19, 2023
- 14. “G2 Review”
- 15. “G2 Review”
- 16. “CoreWeave Becomes NVIDIA’s First Elite Cloud Services Provider for Compute”. CoreWeave. Retrieved September 26, 2023
- 17. “Rent GPU”. Jarvislabs.ai. Retrieved October 3, 2023
- 18. “NVIDIA DGX™ Systems with Lambda”. Lambda Labs. Retrieved October 3, 2023
- 19. “Customers”. Paperspace. Retrieved October 3, 2023
- 20. “G2 Review”
- 21. “GPU”. OVHcloud. Retrieved October 8, 2023
- 22. “Choosing the Best Hosting Region for You”. Alibaba Cloud. Retrieved October 3, 2023
- 23. “Cirrascale Cloud Services”. LinkedIn. Retrieved October 11, 2023
- 24. “Crypto billionaire’s nonprofit buys $500 million of AI data center chips”. Reuters. October 30, 2023. Retrieved November 25, 2023
- 25. “Voltage Park launches massive new cloud for AI development”. Voltage Park. October 29, 2023. Retrieved November 25, 2023