One of Thunder Compute's main benefits is that you only pay for active GPU usage. On other cloud platforms, you pay for an instance's entire runtime, even when the GPU sits idle. Depending on your use case, this can translate into anything from slight to substantial savings. This blog post explores the use cases where Thunder Compute is likely the most cost-effective option, as well as one scenario where it may not be.
When prototyping for AI or data science, developers spend significant time on tasks that don't need a GPU. A typical workflow begins on a laptop, testing basic model functionality without a GPU. Eventually, the developer needs a GPU for further development and moves the code to a cloud instance with a basic GPU. Even after switching to the GPU instance, much of their time goes to coding, debugging, and preparing data, tasks that don't require active GPU usage. During this time, they're paying for costly GPU resources that sit unused. On Thunder Compute, the developer saves money any time they are not actively using the GPU.
Production AI inference servers often experience significant downtime between requests, leaving GPUs idle. These servers handle user requests in real time, so they must stay online around the clock, even while the GPU is not being used. As a result, the GPU is often inactive 90%+ of the time. On most cloud platforms, you pay for the entire time the server is online, regardless of GPU activity. With Thunder Compute, you only pay when user requests actively use the GPU, leading to significant savings compared to other cloud platforms.
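To make the difference concrete, here is a back-of-the-envelope comparison of the two billing models for an inference server that is idle 90% of the time. The hourly rate and utilization figure are illustrative assumptions, not Thunder Compute's actual pricing:

```python
# Hypothetical cost comparison: always-on billing vs. usage-based billing
# for an inference server whose GPU is idle 90% of the time.
# HOURLY_RATE and GPU_UTILIZATION are illustrative assumptions.

HOURS_PER_MONTH = 730        # average hours in a month
HOURLY_RATE = 1.00           # assumed $/hour for a basic cloud GPU
GPU_UTILIZATION = 0.10       # GPU busy 10% of the time (idle 90%)

# Traditional cloud: pay for every hour the server is online.
always_on_cost = HOURS_PER_MONTH * HOURLY_RATE

# Usage-based billing: pay only while the GPU is actively in use.
usage_based_cost = HOURS_PER_MONTH * GPU_UTILIZATION * HOURLY_RATE

savings = always_on_cost - usage_based_cost
print(f"Always-on:   ${always_on_cost:,.2f}/month")
print(f"Usage-based: ${usage_based_cost:,.2f}/month")
print(f"Savings:     ${savings:,.2f} ({savings / always_on_cost:.0%})")
```

With these assumed numbers, the always-on server costs $730/month while the usage-based one costs $73/month, a 90% reduction that scales directly with how idle the GPU is.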
Jupyter Notebooks see major benefits with Thunder Compute. Even while a notebook kernel is running (technically using the GPU), Thunder Compute allows each GPU to be shared among multiple users without performance degradation. This benefit is so substantial that we plan to offer a notebook-specific pricing tier with even lower rates in the future—stay tuned!
Since Thunder Compute yields the most savings when GPUs are frequently idle, the savings are smaller in scenarios where GPUs run continuously for extended periods, like multi-day training runs. That said, Thunder Compute still saves you money during any time spent configuring instances, processing or loading data, and debugging. In the future, Thunder Compute will support large multi-node configurations, making it the only platform offering per-second access to clusters of 100+ GPUs, a significant advantage in itself.