
Making Deep Learning Go Brrrr From First Principles - Horace He
So, if you want to keep your GPUs going brrrr, let's discuss the three components your system might be spending time on: compute, memory bandwidth, and overhead. Behind the bitter lesson is a legion of engineers keeping GPUs running efficiently.
GPUs Go Brrr · Hazy Research - Stanford University
May 12, 2024 · On the practical side, we’re going to talk about what we’ve learned about making GPUs go brr -- and release an embedded DSL, ThunderKittens, that we’ve built to help us write some particularly speedy kernels (which we are also releasing).
GPUs Go Brrr : r/LocalLLaMA - Reddit
This post is a mixture of practice and philosophy.
GPUs Go Brrr - Hacker News
Newer GPUs actually support dynamic memory allocation and recursion, and GPU threads have their own stacks, so you could in fact treat them as sequential devices and write games and simulators directly on them.
[D] Making Deep Learning Go Brrrr From First Principles
Mar 15, 2022 · To help address that, I wrote a blog called "Making Deep Learning Go Brrrr From First Principles": https://horace.io/brrr_intro.html. Basically, for most models, there are 3 regimes that you might be spending all of your time on - Compute, Memory-Bandwidth, and Overhead.
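A quick way to probe which regime you are in, sketched below in plain PyTorch (my own illustration, not code from the post, and it assumes a CUDA GPU is available): if doubling the batch size barely changes the per-iteration wall-clock time, the run is dominated by overhead rather than compute or memory bandwidth.

```python
import time
import torch

def time_batched_matmul(batch, size=1024, iters=50):
    """Average seconds per iteration for `iters` batched matmuls on the GPU."""
    a = torch.randn(batch, size, size, device="cuda", dtype=torch.float16)
    b = torch.randn(batch, size, size, device="cuda", dtype=torch.float16)
    torch.cuda.synchronize()              # finish setup work before timing
    start = time.perf_counter()
    for _ in range(iters):
        a @ b
    torch.cuda.synchronize()              # wait for all queued kernels
    return (time.perf_counter() - start) / iters

for batch in (1, 2, 4, 8):
    # Overhead-bound: times stay roughly flat as batch grows.
    # Compute- or bandwidth-bound: times grow roughly in proportion.
    print(f"batch={batch}: {time_batched_matmul(batch) * 1e3:.2f} ms/iter")
```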
ThunderKittens: A Simple Embedded DSL for AI kernels
May 12, 2024 · Relatively quickly, we had a small library (DSL?) that we called ThunderKittens that we hope lets us write simple-to-understand clean code that indeed makes GPUs go brrr. Our observations for TK are pretty simple: You want to keep tensor cores busy.
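The sketch below is not ThunderKittens code, but the observation translates directly into a measurement you can make from plain PyTorch (again assuming a CUDA GPU): time a large bf16 matmul, which runs on the tensor cores, and compare the achieved FLOP/s against the peak on your GPU's datasheet.

```python
import time
import torch

n, iters = 8192, 20
a = torch.randn(n, n, device="cuda", dtype=torch.bfloat16)
b = torch.randn(n, n, device="cuda", dtype=torch.bfloat16)

torch.cuda.synchronize()
start = time.perf_counter()
for _ in range(iters):
    a @ b
torch.cuda.synchronize()
elapsed = (time.perf_counter() - start) / iters

flops = 2 * n ** 3                        # multiply-adds in an n x n x n matmul
print(f"achieved ~{flops / elapsed / 1e12:.0f} TFLOP/s; "
      "compare against the bf16 tensor-core peak on your GPU's datasheet")
```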
Making your GPU go BRRR: Creating a CUDA Layer in PyTorch
Mar 13, 2024 · Implement the forward and backward pass in PyTorch. This gives access to an online debugger and the full functionality of Python, like Jupyter Notebooks. Validate the implementation with gradcheck. This somewhat magic function runs your forward pass and does numerical differentiation to validate your backward pass code.
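As a concrete illustration of that workflow (a toy stand-in, not the layer from the article): define the forward and backward pass as a `torch.autograd.Function`, then let `torch.autograd.gradcheck` perturb the inputs numerically and compare the result against the analytic backward.

```python
import torch

class ScaledExp(torch.autograd.Function):
    """Toy custom op: y = exp(scale * x)."""

    @staticmethod
    def forward(ctx, x, scale):
        out = torch.exp(scale * x)
        ctx.save_for_backward(out)
        ctx.scale = scale
        return out

    @staticmethod
    def backward(ctx, grad_out):
        (out,) = ctx.saved_tensors
        # d/dx exp(scale * x) = scale * exp(scale * x); no gradient w.r.t. scale
        return grad_out * ctx.scale * out, None

# gradcheck wants double-precision inputs; it raises if the analytic and
# numerical Jacobians disagree.
x = torch.randn(8, dtype=torch.double, requires_grad=True)
torch.autograd.gradcheck(lambda t: ScaledExp.apply(t, 0.5), (x,))
```

Once the Python version passes gradcheck, a CUDA implementation can be validated against it in the same way.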
GPU go brrr: Estimating OLS (with standard errors) via deep learning
Jul 20, 2020 · So a bunch of my criminologist friends have methods envy. To help them out, I made some Python functions to estimate OLS models using pytorch (a deep learning Python library).
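A minimal sketch of that idea (mine, not the author's code): fit the OLS coefficients by gradient descent in torch, then recover the classical standard errors from the usual closed-form variance of the estimator.

```python
import torch

torch.manual_seed(0)
n, k = 1_000, 3
X = torch.cat([torch.ones(n, 1), torch.randn(n, k - 1)], dim=1)  # intercept + 2 covariates
beta_true = torch.tensor([1.0, 2.0, -0.5])
y = X @ beta_true + 0.3 * torch.randn(n)

# Fit by gradient descent on the least-squares loss.
beta = torch.zeros(k, requires_grad=True)
opt = torch.optim.Adam([beta], lr=0.05)
for _ in range(2_000):
    opt.zero_grad()
    loss = torch.mean((y - X @ beta) ** 2)
    loss.backward()
    opt.step()

# Classical standard errors: sqrt of the diagonal of sigma^2 * (X'X)^-1.
with torch.no_grad():
    resid = y - X @ beta
    sigma2 = (resid @ resid) / (n - k)    # unbiased residual variance
    se = torch.sqrt(sigma2 * torch.diag(torch.inverse(X.T @ X)))

print("beta:", beta.detach(), "\nse:  ", se)
```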
GPUs Go Brrr - Aili
May 12, 2024 · The article discusses optimizing the performance of AI models on GPUs, particularly the NVIDIA H100. It covers hardware features and techniques that can be leveraged to maximize GPU utilization.
GPUs Go Brrr - Simon Willison
May 13, 2024 · GPUs Go Brrr (via) Fascinating, detailed low-level notes on how to get the most out of NVIDIA's H100 GPUs (currently selling for around $40,000 apiece) from the research team at Stanford who created FlashAttention, among other things.