Llama 2 7B-Chat
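Meta’s 7B-parameter chat model. It uses approximately 7 GB of RAM with 4-bit quantisation and has a mature ecosystem. It is released under the Llama 2 Community Licence, which permits commercial use subject to an acceptable use policy and a separate licence requirement for services with more than 700 million monthly active users.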
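As a rough sanity check on the memory figure, the sketch below estimates the two largest contributors: the quantised weights and the key/value cache. It is a back-of-the-envelope calculation, not a measurement. The function names are hypothetical, the parameter count is taken as a nominal 7 billion, and the architecture values (32 layers, hidden size 4096, 4096-token context, 16-bit cache entries) are assumptions about the standard Llama 2 7B configuration. Real usage also includes quantisation scales, runtime buffers, and any layers kept at higher precision, which is why observed figures sit above the sum shown here.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def kv_cache_memory_gb(n_layers: int, hidden_size: int,
                       context_len: int, bytes_per_value: int = 2) -> float:
    """Approximate key/value cache size at full context, in GB.

    The factor of 2 accounts for storing both keys and values per layer.
    Assumes standard multi-head attention (no grouped-query attention).
    """
    return 2 * n_layers * hidden_size * context_len * bytes_per_value / 1e9


# Assumed values: nominal 7e9 parameters at 4 bits per weight.
weights = weight_memory_gb(7e9, 4.0)

# Assumed Llama 2 7B shape: 32 layers, hidden size 4096, 4096-token context.
kv_cache = kv_cache_memory_gb(n_layers=32, hidden_size=4096, context_len=4096)

print(f"weights:  {weights:.1f} GB")   # 3.5 GB
print(f"kv cache: {kv_cache:.1f} GB")  # 2.1 GB
print(f"subtotal: {weights + kv_cache:.1f} GB before runtime overhead")
```

Shorter context windows shrink the cache term proportionally, so the footprint on a given machine depends heavily on the context length configured in the inference runtime.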
Details
- Services: text generation, chat
Related
gguf-4-bit-quantisation local-llm-inference llama-cpp ollama meta