Harris Notes

GGUF 4-bit Quantisation

Apr 14, 2026 · 1 min read

concept
general
technique

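A model compression technique, stored in the GGUF file format, that keeps each LLM weight at roughly 4 bits instead of 16. This cuts the memory footprint to about a quarter of the 16-bit original, allowing 7B-parameter models to fit within approximately 6 GB of RAM and making them viable on 16 GB Apple Silicon devices.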
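The arithmetic behind the 6 GB figure can be sketched roughly as follows. This is a back-of-the-envelope estimate, not a measurement: the 4.5 bits per weight (4-bit values plus some per-block scaling data) and the 2 GB runtime allowance (context cache, buffers) are illustrative assumptions, and real figures vary by quantisation variant and context length.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 7e9  # the "7B" model size mentioned in the note

# 16-bit baseline: 2 bytes per weight.
fp16_gb = weight_memory_gb(N_PARAMS, 16)

# Assumption: about 4.5 effective bits per weight once per-block
# scaling data is included alongside the 4-bit values.
q4_gb = weight_memory_gb(N_PARAMS, 4.5)

# Assumption: a placeholder allowance for context cache and buffers.
RUNTIME_OVERHEAD_GB = 2.0
total_gb = q4_gb + RUNTIME_OVERHEAD_GB

print(f"16-bit weights: {fp16_gb:.1f} GB")   # 14.0 GB
print(f"4-bit weights:  {q4_gb:.2f} GB")     # 3.94 GB
print(f"4-bit total:    {total_gb:.1f} GB")  # 5.9 GB
```

Under these assumptions the 16-bit weights alone would take most of a 16 GB machine's memory, while the 4-bit version leaves room for the operating system and other applications.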
