

Most Arch users are casuals who finally figured out how to read a manual. Then you have the 1% of Arch users who are writing the manual…
It’s the Gentoo and BSD users we should fear and respect, walking quietly with a big stick of competence.
Yeah, that’s the thing.
The gaming market only barely exists at this point, which is why Nvidia can ignore it for as long as they want to.
~~Pheasants~~ Gamers buy cheap ~~inference cards~~ gaming cards.
The absolute majority of Nvidia's sales globally are top-of-the-line AI SKUs. Gaming cards are just a way of letting data scientists and developers have cheap CUDA hardware at home (while allowing some Cyberpunk), so they keep buying NVL clusters at work.
Nvidia’s networking division is probably a greater revenue stream than gaming GPUs.
The H200 has a very impressive bandwidth of 4.89 TB/s, but for the same price you can get 37 TB/s spread across 58 RX 9070s. Whether that actually works in practice, I don't know.
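For a rough sanity check of that comparison, here's a back-of-the-envelope sketch. The per-card bandwidth figures and especially the prices are my own assumptions (street prices swing a lot), not vendor quotes:

```python
# Back-of-the-envelope check of the bandwidth claim above.
# All prices are assumed street prices, not quotes.

def aggregate_bandwidth(card_bw_tbs: float, count: int) -> float:
    """Total memory bandwidth across `count` cards, in TB/s."""
    return card_bw_tbs * count

h200_bw = 4.89         # TB/s per card (HBM3e)
h200_price = 32_000    # USD, assumed

rx9070_bw = 0.64       # TB/s per card (GDDR6)
rx9070_price = 550     # USD, assumed
rx9070_count = h200_price // rx9070_price   # ~58 cards for the same money

fleet_bw = aggregate_bandwidth(rx9070_bw, rx9070_count)
print(f"{rx9070_count} x RX 9070 -> {fleet_bw:.1f} TB/s aggregate")
print(f"1 x H200       -> {h200_bw:.2f} TB/s")
print(f"TB/s per $1k: RX 9070 fleet {fleet_bw / (rx9070_count * rx9070_price) * 1000:.2f}, "
      f"H200 {h200_bw / h200_price * 1000:.2f}")
```

Of course this only counts raw, unconnected bandwidth; as pointed out below, whether your workload can actually use it spread across 58 cards is the real question.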
Your math checks out, but only for some workloads. Other workloads scale out like shit, and then you want all your bandwidth concentrated. At some point you’ll also want to consider power draw:
Now include power and cooling over a few years and do the same calculations.
As for apples and oranges: this is why you can't go by the marketing numbers, you need to benchmark your workload yourself.
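To make the "include power and cooling over a few years" point concrete, here's a hedged sketch. Every number in it (card power, host overhead, PUE, electricity price) is an assumption for illustration; plug in your own:

```python
# Rough sketch of what power and cooling over a few years do to the comparison.
# All figures below are assumptions, not specs: adjust for your own datacenter.

def energy_cost(draw_kw: float, years: float, usd_per_kwh: float = 0.15, pue: float = 1.4) -> float:
    """Electricity plus cooling cost of running `draw_kw` continuously for `years`."""
    hours = years * 365 * 24
    return draw_kw * pue * hours * usd_per_kwh

# One H200 (~0.7 kW) vs 58 RX 9070s (~0.22 kW each) plus some host machines.
h200_kw = 0.7
rx_fleet_kw = 58 * 0.22 + 8 * 0.4   # cards + an assumed 8 host boxes at ~400 W each

for name, kw in [("H200", h200_kw), ("RX 9070 fleet", rx_fleet_kw)]:
    print(f"{name}: {kw:.1f} kW draw, ~${energy_cost(kw, years=3):,.0f} power+cooling over 3 years")
```

With these assumptions the consumer fleet burns north of $80k in power and cooling over three years versus a few thousand for the single card, which eats a large chunk of the purchase-price advantage.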
Well, a few issues:
For fun, home use, research or small-time hacking? Sure, buy all the gaming cards you can. If you actually need support and have a commercial use case? Pony up. Either way, benchmark your workload; don't go by the marketing numbers.
Is it a scam? Of course, but you can’t avoid it.
Your numbers are old. If you are building today with anyone so much as mentioning AI, you might as well consider 100 kW/rack as "normal". An off-the-shelf CPU today runs at 500 W, and you usually have two of them per server, along with memory, storage and networking. With old-school 1U pizza boxes, that's basically 100 kW per rack. If you start adding GPUs, just double or quadruple the power density right off the bat. And of course, assume everything is direct liquid cooled.
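As a sketch of where the ~100 kW/rack figure comes from, assuming a 42U rack full of dual-socket 1U boxes (the per-server "everything else" number is my guess, not a spec):

```python
# Sanity check of the ~100 kW/rack figure. Server composition and the
# non-CPU power budget are assumptions; adjust for your actual hardware.

RACK_UNITS = 42            # standard full-height rack, all 1U servers
CPU_W = 500                # one modern high-core-count server CPU
CPUS_PER_SERVER = 2
OTHER_W = 1_400            # assumed: memory, storage, NICs, fans, PSU losses per box

server_w = CPUS_PER_SERVER * CPU_W + OTHER_W          # ~2.4 kW per 1U server
rack_kw = RACK_UNITS * server_w / 1000                # ~100 kW per rack

print(f"Per server: {server_w} W, per rack: {rack_kw:.0f} kW")
print(f"With GPUs:  {2 * rack_kw:.0f}-{4 * rack_kw:.0f} kW/rack")
```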
There is an argument that training actually is a type of (lossy) compression. You can build (bad) language models by using standard compression algorithms to "train" them.
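A minimal sketch of that idea, using zlib as the "model": a continuation that adds fewer compressed bytes on top of the context is treated as more probable. This is only meant to illustrate the compression-as-training argument, not how actual LLMs are built:

```python
# "Language modelling" with a stock compressor: score a candidate continuation
# by how many extra compressed bytes it adds to the context. Fewer = more likely.
import zlib

def compressed_len(text: str) -> int:
    return len(zlib.compress(text.encode("utf-8"), 9))

def score(context: str, candidate: str) -> int:
    """Extra compressed bytes the candidate adds; lower means more 'probable'."""
    return compressed_len(context + candidate) - compressed_len(context)

context = "the cat sat on the mat. the cat sat on the "
for cand in ["mat", "dog", "xylophone"]:
    print(cand, score(context, cand))
# Continuations already seen in the context (like "mat") typically cost the
# fewest extra bytes, so the compressor "predicts" them, like a very bad LM.
```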
By that argument, any model contains lossy and unstructured copies of all the data it was trained on. If you download a 480p, low-quality, h.264-encoded Blu-ray rip of a Ghibli movie, it's still not legal, despite the fact that you aren't downloading the same bits that were on the Blu-ray.
Besides, even if we consider the model itself to be fine, they did not buy all the media they trained the model on. The act of downloading media, regardless of purpose, is piracy. At least, that has been the interpretation for normal people sailing the seas; large companies are of course exempt from filthy things like laws.
https://wildergardenaudio.com/maim/