
CXL is that, but a compromise. Practically, that means an APU.

Intel and AMD are reportedly coming out with wide, M1 Pro-like chips.



At the end of the day, it's limited by the PCIe lanes it is attached to, no different from a second GPU.

Just saw one yesterday: 128 GB on PCIe 5.0 x8, which tops out at about 32 GB/s in each direction, whereas the 4090 has about 1 TB/s of memory bandwidth.
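For anyone who wants to check the arithmetic, here is a rough sketch of where those numbers come from. It assumes the PCIe 5.0 figures of 32 GT/s per lane with 128b/130b encoding and the vendor-quoted 1008 GB/s for the 4090; the function and constant names are made up for illustration, and real throughput will be lower once protocol overhead is counted.

```python
# Back-of-the-envelope peak bandwidth; ignores packet/protocol overhead.
GT_PER_S_PER_LANE = 32.0          # PCIe 5.0 signalling rate per lane
ENCODING_EFFICIENCY = 128 / 130   # 128b/130b line encoding

def pcie5_bandwidth_gbs(lanes: int) -> float:
    """Peak one-direction PCIe 5.0 bandwidth in GB/s for a given lane count."""
    # One transfer carries one bit per lane, so divide by 8 for bytes.
    return GT_PER_S_PER_LANE * ENCODING_EFFICIENCY * lanes / 8

RTX_4090_GBS = 1008.0  # vendor-quoted GDDR6X bandwidth (assumption, ~1 TB/s)

x8 = pcie5_bandwidth_gbs(8)
ratio = RTX_4090_GBS / x8
print(f"PCIe 5.0 x8: {x8:.1f} GB/s per direction")
print(f"4090 local memory is roughly {ratio:.0f}x faster")
```

That works out to about 31.5 GB/s per direction for the x8 link, so the card's local memory is on the order of 32x faster than anything sitting behind that link.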

IMO for inference, dual-socket EPYC (24-channel DDR5) is the way to go, and CXL can theoretically let you bump up the bandwidth in that situation (assuming you can optimize the software properly). llama.cpp already seems to have some NUMA issues on 2P servers.
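A rough sketch of the theoretical ceiling, to put numbers on it. This assumes DDR5-4800 (the speed grade is my assumption, not something stated above), 8 bytes per transfer per channel, and a hypothetical four CXL expanders at about 31.5 GB/s each. It also assumes software can spread reads perfectly across both sockets and every device, which is exactly the part NUMA makes hard.

```python
def ddr5_channel_gbs(mt_per_s: int = 4800) -> float:
    """Peak bandwidth of one 64-bit DDR5 channel in GB/s."""
    return mt_per_s * 8 / 1000  # 8 bytes per transfer, MT/s -> GB/s

def system_peak_gbs(channels: int, cxl_devices: int = 0,
                    mt_per_s: int = 4800, cxl_gbs: float = 31.5) -> float:
    """Sum of DRAM channel peaks plus CXL link peaks (ideal interleaving)."""
    return channels * ddr5_channel_gbs(mt_per_s) + cxl_devices * cxl_gbs

base = system_peak_gbs(24)                     # dual socket, 12 channels each
with_cxl = system_peak_gbs(24, cxl_devices=4)  # hypothetical 4 x8 expanders
print(f"24-channel DDR5-4800: {base:.1f} GB/s peak")
print(f"plus 4 CXL x8 devices: {with_cxl:.1f} GB/s peak")
```

Under those assumptions the DRAM alone peaks around 920 GB/s across both sockets, and each CXL device adds only a few percent on top, so the gain is real but modest and only shows up if the software avoids cross-socket traffic.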



