
QLoRA

A fine-tuning method that combines quantization with LoRA. The frozen base model is loaded in 4-bit precision (typically the NF4 data type) to save memory, while small LoRA adapters are trained on top of it in higher precision. QLoRA enables fine-tuning very large models on a single GPU with minimal quality loss.
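The setup above can be sketched with the Hugging Face transformers and peft libraries; the model id and hyperparameter values below are illustrative assumptions, not part of the QLoRA definition:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Quantization config: load frozen base weights in 4-bit NF4,
# run compute in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # example model id (assumption)
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapters: small trainable low-rank matrices kept in
# higher precision; only these weights receive gradients.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # example attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # adapters are a tiny fraction of the model
```

Because only the adapter parameters train, optimizer state and gradients stay small, which is what lets a single GPU hold a model that would not fit in 16-bit.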

Related terms

LoRA (Low-Rank Adaptation) · Quantization · Fine-Tuning