
Not an expert, so excuse me if this is obvious, but would these integrated graphics be any good for NLP? A GPU with 24 GB of video memory costs $2000, but you can put one of these in a system with 128 GB or 256 GB of DDR4 or DDR5 and give your neural network training software over 100 GB of video memory if you want.

You only have 12 CUs, 768 shading units, 48 texture mapping units, and 32 ROPs, but huge amounts of cheap memory. I'm not sure where the bottleneck is, but at least it won't crash and burn if you ask it to start a neural network training run that requires 100 GB of RAM, and you don't have to take out a second mortgage for a video card with the requisite amount of graphics memory.
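
For a sense of scale (my own back-of-envelope numbers, nothing from the article): the weights alone for a 70B model at fp16 are ~140 GB, which fits in 256 GB of system RAM but on no consumer GPU. Training needs several times that again for gradients, optimizer state, and activations.

    # Rough weight-memory sketch (illustrative only; assumes dense weights, inference only)
    def weight_gb(params_billion, bytes_per_param):
        # 1e9 params * bytes-per-param / 1e9 bytes-per-GB
        return params_billion * bytes_per_param

    for params in (7, 13, 70):
        for name, b in (("fp16", 2), ("int8", 1), ("int4", 0.5)):
            print(f"{params}B @ {name}: ~{weight_gb(params, b):.0f} GB of weights")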



They’re good from the “do it at home” perspective, not from the business or enterprise performance perspective.

One of the ways folks do this now is to use the Mac M* chips, since they have so much unified memory. The raw performance isn't as high as discrete GPUs, but they can fit substantially larger models in memory.


The bottleneck would almost certainly be memory: without careful optimization, you'll quickly overwhelm the on-die cache.

That said, I think AMD's chiplet strategy might come into play. I could see AMD releasing a 4-core/8-thread processor with increased on-die cache and the other chiplets being neural compute units.


People keep reiterating this, but in practice one needs compute and bandwidth, especially outside of tiny-context test prompts. On my 4900HS, mlc-llm's Vulkan backend is far faster than CPU inference on the same memory bus, even though the iGPU has less cache, which wouldn't be the case if inference were purely bandwidth/cache bound (since the CPU has far more cache as well).

My 7800X3D has 96 MB of L3 and a golden-bin DDR5 overclock, but it's absolutely dreadful for inference.
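
A rough roofline with round numbers (mine, not measurements from either machine): batch-1 decode has to stream the full weight set for every token, so bandwidth sets a hard ceiling, but prompt processing is GEMM-heavy and compute bound, which is why an iGPU can beat a CPU sitting on the identical memory bus.

    # Back-of-envelope decode ceiling: tokens/s <= bandwidth / weight bytes.
    # Bandwidth figures are approximate spec-sheet numbers, not benchmarks.
    def decode_ceiling(bandwidth_gb_s, params_billion, bytes_per_param):
        return bandwidth_gb_s / (params_billion * bytes_per_param)

    for name, bw in [("dual-channel DDR5-6000 (~96 GB/s)", 96),
                     ("Apple M2 Max (~400 GB/s)", 400),
                     ("RTX 4090 (~1000 GB/s)", 1000)]:
        print(f"{name}: ~{decode_ceiling(bw, 7, 2):.0f} tok/s ceiling for a 7B fp16 model")

    # Actually reaching the ceiling still takes enough compute; fall short on
    # FLOPS (as CPUs do during prefill) and you never get near it.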


I don't disagree, but the new chips here have dedicated neural compute units, and he's specifically talking about models larger than 24 GB.


They're slow, but OK for inference.

In practice no one uses AMD/Intel IGPs because no one knows about the mlc-llm Vulkan backend. llama.cpp, which is in vogue on the desktop, does not support IGPs outside of Apple's, and otherwise people use backends targeted at server GPUs.
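
For anyone who wants to try it, a minimal sketch of the mlc-llm Vulkan path. The module, model string, and method names below are from memory of the older mlc_chat Python package and may not match current releases, so treat them as placeholders and check the MLC LLM docs.

    # Sketch only: API names assumed from the older mlc_chat package.
    from mlc_chat import ChatModule

    # device="vulkan" is the part that matters: inference runs on the IGP
    # instead of the CPU, while the weights still sit in ordinary system RAM.
    cm = ChatModule(model="Llama-2-7b-chat-hf-q4f16_1", device="vulkan")
    print(cm.generate(prompt="Explain memory bandwidth in one sentence."))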



