What is LoRA?
Low-Rank Adaptation: a lightweight fine-tuning method that trains small adapter layers on top of a frozen base model.
Definition
LoRA freezes the original LLM weights and trains small low-rank matrices that modify the model's behavior. The result: a fraction of the storage and compute of full fine-tuning, often with comparable quality. LoRA adapters can be loaded and unloaded at runtime, so a single base model can serve multiple specialized variants.
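The core idea can be sketched in a few lines of NumPy: a frozen weight matrix W plus a trainable low-rank update (alpha/r)·AB. The dimensions and scaling below are illustrative assumptions, not values from any specific model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: r << d_in is the low-rank bottleneck
d_in, d_out, r, alpha = 64, 64, 8, 16

W = rng.standard_normal((d_in, d_out))      # frozen base weight (never updated)
A = rng.standard_normal((d_in, r)) * 0.01   # trainable down-projection
B = np.zeros((r, d_out))                    # trainable up-projection, zero-initialized

def lora_forward(x):
    # Base path plus scaled low-rank update: x @ (W + (alpha / r) * A @ B)
    return x @ W + (alpha / r) * (x @ A) @ B

x = rng.standard_normal((4, d_in))
# With B zero-initialized, the adapter starts as a no-op on the base model
assert np.allclose(lora_forward(x), x @ W)
```

Only A and B are trained, so the adapter here stores 1,024 parameters against the base matrix's 4,096 — and that gap widens dramatically at real model scale.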
Example
A company fine-tunes 5 LoRA adapters on a single Llama 3 base — one per business unit — and switches between them per request.
How Vedwix uses LoRA in client work
LoRA is our default fine-tuning approach. We reserve full fine-tunes for very large datasets or cases where LoRA falls short.
We ship this.
If you're building with LoRA in production, we can help — from architecture review to full implementation.
Brief us