Unified Memory Architecture
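Apple Silicon’s design gives the CPU, GPU, and Neural Engine a single shared memory pool. This removes the separate, fixed-size VRAM limit of discrete GPU setups, so the size of model you can load is bounded by installed RAM (less what macOS and other processes are using) rather than by a dedicated GPU memory budget.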
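As a rough illustration of what this means for model loading, the sketch below estimates whether a quantized model's weights fit in the shared pool. The function names, the 1 GiB runtime overhead, and the 0.75 usable fraction are assumptions made for the example, not measured or documented values; the real share available to the GPU depends on the machine and on what else is running.

```python
GIB = 2**30  # bytes per GiB


def model_footprint_gib(params_billions: float,
                        bits_per_weight: float,
                        overhead_gib: float = 1.0) -> float:
    """Approximate memory needed for the weights plus a flat runtime overhead.

    overhead_gib is a placeholder for KV cache, activations and buffers
    (assumed value, not a measurement).
    """
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes / GIB + overhead_gib


def fits_in_unified_memory(params_billions: float,
                           bits_per_weight: float,
                           installed_ram_gib: float,
                           usable_fraction: float = 0.75,
                           overhead_gib: float = 1.0) -> bool:
    """Return True if the estimated footprint fits in the assumed usable share of RAM.

    usable_fraction is an assumption standing in for the memory left over
    after macOS and other applications take their share.
    """
    budget_gib = installed_ram_gib * usable_fraction
    needed_gib = model_footprint_gib(params_billions, bits_per_weight, overhead_gib)
    return needed_gib <= budget_gib


if __name__ == "__main__":
    for ram in (8, 32, 64):
        print(f"70B @ 4-bit on {ram} GiB:", fits_in_unified_memory(70, 4, ram))
```

Under these assumptions the limit scales with installed RAM, not with a fixed VRAM size, which is why the same model may fit on a higher-memory configuration and not on a lower one.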
Related
ram-capacity-constraints metal-gpu-acceleration local-llm-inference apple mac-mini-m2 mac-mini-m4-pro mac-studio