I'm running 70B models (usually q4 to q5_K_M, though q6 is possible) on my 96 GB MacBook Pro with an M2 Max (12 CPU cores, 38 GPU cores). That still leaves me plenty of RAM for other purposes.
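For scale, here's a rough size estimate (just a sketch; the bits-per-weight figures are approximate averages for llama.cpp k-quants, so actual GGUF files will differ a bit):

    # Rough GGUF size: params * bits-per-weight / 8
    # BPW values are approximate llama.cpp k-quant averages.
    BPW = {"q4_K_M": 4.85, "q5_K_M": 5.7, "q6_K": 6.6}

    def gguf_gb(params_billions, quant):
        # 1e9 params and 1e9 bytes/GB cancel out
        return params_billions * BPW[quant] / 8

    for q in BPW:
        print(f"70B @ {q}: ~{gguf_gb(70, q):.0f} GB")           # ~42 / ~50 / ~58 GB
    print(f"405B @ q4_K_M: ~{gguf_gb(405, 'q4_K_M'):.0f} GB")   # ~246 GB

So a q4 70B weighs in around 42 GB, leaving roughly half of the 96 GB free.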
I'm currently using reflection:70b_q4, which does a very good job in my opinion. It generates responses at about 5.5 tokens/s, which is just about my reading speed.
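That checks out: assuming the common rule of thumb of ~0.75 English words per token (the exact ratio varies by tokenizer and text), 5.5 tokens/s comes out to roughly 250 words per minute, a typical silent reading pace:

    # Sanity check: tokens/s vs. reading speed
    tokens_per_s = 5.5
    words_per_token = 0.75  # rough rule of thumb; varies by tokenizer
    print(f"~{tokens_per_s * words_per_token * 60:.0f} words/min")  # ~248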
edit: I usually don't run the larger quants (q6) because of the speed. I'd guess a 405B model would just be awfully slow.