LoRA (Low-Rank Adaptation) is the technique that makes consistent AI characters possible. Without a LoRA, every image you generate of your character looks slightly different. With a trained LoRA, your AI influencer has the same face, features, and identity across hundreds of images. It is the foundation of every successful AI influencer account.
This guide covers the complete LoRA training process from dataset preparation through testing and stacking.
What Is LoRA and Why You Need It
Without LoRA
- Every generation produces a different face
- No way to maintain character identity
- Prompts alone cannot guarantee consistency
- IPAdapter helps but is not reliable for exact features
- Impossible to build a recognizable AI persona
With LoRA
- Same face, same features, every time
- Works across different poses, outfits, and settings
- Stackable with style and clothing LoRAs
- Small file size (10-200MB vs 2-7GB for full models)
- Trainable in 30-90 minutes
Dataset Preparation: The Most Important Step
Critical
As a rule of thumb, dataset quality determines roughly 80% of your LoRA's quality. Extra time spent on preparation saves hours of retraining, so follow these steps carefully.
Collect 20-30 Reference Images
Generate your character images using Midjourney, Flux, or SDXL. You need variety across angles, expressions, and contexts while keeping the core features consistent.
Include:
- 5-7 front-facing portraits (different expressions)
- 3-5 three-quarter angle shots
- 2-3 profile views
- 3-5 half-body shots (different outfits)
- 2-3 full-body shots
- Mix of indoor and outdoor lighting
Avoid:
- Heavily filtered or stylized images
- Images with other people (confuses training)
- Extreme angles (directly above/below)
- Heavy accessories covering facial features
- Low resolution or blurry images
- Duplicate or near-identical poses
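The shot counts in the "Include" list can be checked against a planned dataset before you generate anything. The sketch below is one way to do that; the category keys and the `check_shot_plan` helper are hypothetical names chosen for illustration, and the 20-30 total comes from the heading of this step.

```python
from typing import Dict, List

# Per-category ranges copied from the "Include" list (key names are illustrative).
SHOT_RANGES = {
    "front_portrait": (5, 7),
    "three_quarter": (3, 5),
    "profile": (2, 3),
    "half_body": (3, 5),
    "full_body": (2, 3),
}
TOTAL_RANGE = (20, 30)  # "Collect 20-30 Reference Images"


def check_shot_plan(plan: Dict[str, int]) -> List[str]:
    """Return a list of human-readable issues; an empty list means the plan fits."""
    issues = []
    for shot, (low, high) in SHOT_RANGES.items():
        count = plan.get(shot, 0)
        if not low <= count <= high:
            issues.append(f"{shot}: {count} images, expected {low}-{high}")
    total = sum(plan.values())
    if not TOTAL_RANGE[0] <= total <= TOTAL_RANGE[1]:
        issues.append(f"total: {total} images, expected {TOTAL_RANGE[0]}-{TOTAL_RANGE[1]}")
    return issues


if __name__ == "__main__":
    plan = {"front_portrait": 7, "three_quarter": 5, "profile": 3,
            "half_body": 5, "full_body": 3}
    print(check_shot_plan(plan))
```

Note that the lower bounds of the five categories add up to only 15 images, so a plan needs to sit near the upper end of each range to reach the 20-30 target.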
Crop and Resize Images
For predictable results, keep all training images at the same resolution. The standard for SDXL LoRAs is 1024x1024 pixels.
- Crop each image to a square aspect ratio centered on the face/body
- Resize all images to 1024x1024 pixels (for SDXL) or 512x512 (for SD 1.5)
- Use a batch processing tool like Birme.net or IrfanView for bulk resizing
- Save as PNG to avoid compression artifacts
- Ensure the subject fills at least 40-60% of the frame in most images
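If you prefer to script the crop instead of using a batch tool, the geometry is simple. The following is a minimal sketch of the square-crop calculation only; `square_crop_box` and `TARGET_SIDE` are hypothetical names, and the optional `cx`/`cy` arguments assume you already know roughly where the face or body sits in the image.

```python
from typing import Optional, Tuple

# Target side length per base model, as listed in the steps above.
TARGET_SIDE = {"sdxl": 1024, "sd15": 512}


def square_crop_box(width: int, height: int,
                    cx: Optional[int] = None,
                    cy: Optional[int] = None) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) for the largest square crop.

    The square is centered on (cx, cy) when given, otherwise on the image
    center, and is shifted as needed so it never leaves the image.
    """
    side = min(width, height)
    cx = width // 2 if cx is None else cx
    cy = height // 2 if cy is None else cy
    # Clamp the top-left corner so the square stays inside the image.
    left = min(max(cx - side // 2, 0), width - side)
    top = min(max(cy - side // 2, 0), height - side)
    return (left, top, left + side, top + side)


if __name__ == "__main__":
    print(square_crop_box(1920, 1080))           # landscape, image-centered
    print(square_crop_box(800, 1200, cy=1100))   # portrait, subject near the bottom
```

The returned tuple uses the (left, upper, right, lower) ordering that Pillow's `Image.crop` accepts, so it can be passed to that method and followed by a resize to `TARGET_SIDE["sdxl"]` or `TARGET_SIDE["sd15"]`.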
Caption Every Image
Each image needs a text file with the same base filename (for example, img_001.png and img_001.txt) describing what is in the image. Captions teach the LoRA what is unique about your character versus what is just context.
Captioning format (example):
"ohwx woman, brown hair, green eyes, light skin, smiling, wearing a white blouse, standing in a park, natural lighting, upper body shot"
- Use a trigger word (like "ohwx") at the start of every caption to activate the LoRA
- Describe fixed features (hair color, eye color, skin tone) in every caption
- Describe variable elements (clothing, setting, pose) so the model learns to separate them
- Use auto-captioning tools like BLIP or WD Tagger for a starting point, then edit manually
- Keep captions consistent in style and terminology across all images
Training Methods: Choose Your Tool
Option 1: Kohya_ss (Local Training)
The gold standard for local LoRA training. Free, highly configurable, and produces the best results with fine-tuned parameters. Requires an NVIDIA GPU with 8GB+ VRAM.
Recommended Parameters for SDXL:
- Network rank (dim): 32-64 (higher = more detail, larger file)
- Network alpha: 16-32 (half of rank is a safe default)
- Learning rate: 1e-4 (reduce to 5e-5 if results are distorted)
- Training steps: 1500-3000 (for 20-30 images)
- Batch size: 1-2 (depends on VRAM)
- Optimizer: AdamW8bit or Prodigy
- Resolution: 1024x1024
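To see how the step target relates to dataset size, it helps to work the arithmetic. The sketch below assumes the common accounting in which each image is shown a fixed number of times ("repeats") per epoch and one step processes one batch; the `repeats` and `epochs` values are illustrative assumptions and do not appear in the list above.

```python
import math

RECOMMENDED_STEPS = (1500, 3000)  # range from the parameter list above


def total_steps(num_images: int, repeats: int, epochs: int, batch_size: int = 1) -> int:
    """Total optimizer steps: batches per epoch (rounded up) times epochs."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


def in_recommended_range(steps: int) -> bool:
    """True when the step count falls inside the 1500-3000 window."""
    low, high = RECOMMENDED_STEPS
    return low <= steps <= high


if __name__ == "__main__":
    for batch_size in (1, 2):
        steps = total_steps(num_images=25, repeats=10, epochs=10, batch_size=batch_size)
        print(batch_size, steps, in_recommended_range(steps))
```

With 25 images, 10 repeats, and 10 epochs, batch size 1 gives 2500 steps, while batch size 2 halves that to 1250, which falls below the range. Raising the batch size therefore usually means raising repeats or epochs to compensate.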
Installation: Clone the Kohya_ss repository from GitHub, install dependencies, and launch the GUI. The web interface makes it easy to configure and start training runs.
Option 2: CivitAI Online Training
The easiest option for beginners. Upload your images directly to CivitAI and train in the cloud with no setup required. Free with a CivitAI account.
- Upload your captioned images to a new model page
- Select base model (SDXL recommended)
- Choose training preset or configure manually
- Training takes 20-60 minutes in the cloud
- Download the trained LoRA file when complete
- Optionally publish to CivitAI for community sharing
Option 3: Replicate (API Training)
Cloud-based training via API. Costs $1-3 per training run. Good for automation and batch training multiple characters.
- Create a Replicate account and add billing
- Use the SDXL LoRA training model
- Upload a ZIP file of your captioned images
- Configure parameters via the web UI or API
- Training completes in 15-45 minutes
- Download the LoRA weights file from the output
Testing Your LoRA
After training, test your LoRA thoroughly before using it in production. Load it in ComfyUI or Automatic1111 and generate images with your trigger word.
Test at different strengths
Generate the same prompt at LoRA strengths 0.5, 0.7, 0.8, 0.9, and 1.0. The sweet spot is usually 0.7-0.85. Too low and the character is vague. Too high and the image becomes rigid or distorted.
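One way to run the strength comparison is to generate the prompt variants up front and render them all with the same seed, so strength is the only thing that changes. The sketch below builds prompts using the `<lora:name:weight>` tag that Automatic1111 reads from the prompt text; the LoRA name `my_character` and the base prompt are placeholders, and in ComfyUI you would set the strength on the Load LoRA node instead.

```python
from typing import List, Sequence

# Strengths suggested in the text above.
STRENGTHS = (0.5, 0.7, 0.8, 0.9, 1.0)


def sweep_prompts(base_prompt: str, lora_name: str,
                  strengths: Sequence[float] = STRENGTHS) -> List[str]:
    """Return one prompt per strength, identical except for the LoRA weight."""
    return [f"{base_prompt} <lora:{lora_name}:{s}>" for s in strengths]


if __name__ == "__main__":
    # "my_character" is a placeholder for your trained LoRA's filename.
    for prompt in sweep_prompts("ohwx woman, portrait, natural lighting", "my_character"):
        print(prompt)
```

Place the resulting images side by side and pick the lowest strength at which the face is clearly recognizable; that leaves the most headroom for outfits, settings, and stacked LoRAs.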
Test different contexts
Generate your character in various settings: indoor, outdoor, different lighting, different outfits, different poses. A good LoRA maintains identity regardless of context. If the face changes significantly between settings, the training data lacked variety.
Test with other models
Try your LoRA with different base checkpoints. An SDXL LoRA should work across SDXL-based models (RealVisXL, Juggernaut XL, etc.). If it only works with one specific model, the training may have been too specific.
Check for overfitting
Signs of overfitting: the character always appears in the same pose, the background looks like training images, or adding new clothing or settings produces artifacts. Fix by reducing training steps by 20-30% and retraining.
Advanced: LoRA Stacking
LoRA stacking combines multiple LoRAs simultaneously. This lets you separate character identity from style, clothing, and other attributes.
Common LoRA combinations
- Character LoRA (0.8) + Style LoRA (0.5) = Your character in a specific art style
- Character LoRA (0.8) + Clothing LoRA (0.6) = Your character in specific outfits
- Character LoRA (0.8) + Pose LoRA (0.5) = Your character in specific poses
- Character LoRA (0.7) + Detail LoRA (0.4) = Enhanced skin texture and eye detail
Stacking rules
- Keep total combined strength at or below 1.8 (e.g., 0.8 + 0.6 + 0.4 = 1.8)
- Character LoRA should always be the strongest
- Test combinations at low strength first, then increase
- More than 3 LoRAs at once usually degrades quality
- In ComfyUI, chain Load LoRA nodes sequentially
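The numeric stacking rules can be checked mechanically before you queue a batch. The following sketch encodes three of them; `check_stack` is a hypothetical helper, it treats a total of exactly 1.8 as acceptable (matching the 0.8 + 0.6 + 0.4 example), and it allows another LoRA to tie with the character LoRA.

```python
from typing import Dict, List

MAX_TOTAL_STRENGTH = 1.8
MAX_LORAS = 3


def check_stack(strengths: Dict[str, float], character: str) -> List[str]:
    """Return rule violations for a LoRA stack; an empty list means it passes."""
    if character not in strengths:
        raise KeyError(f"character LoRA {character!r} is not in the stack")
    problems = []
    total = sum(strengths.values())
    # Small tolerance so float rounding on sums like 0.8 + 0.6 + 0.4 is not flagged.
    if total > MAX_TOTAL_STRENGTH + 1e-9:
        problems.append(f"total strength {total:.2f} exceeds {MAX_TOTAL_STRENGTH}")
    if len(strengths) > MAX_LORAS:
        problems.append(f"{len(strengths)} LoRAs stacked, more than {MAX_LORAS}")
    stronger = [name for name, value in strengths.items()
                if name != character and value > strengths[character]]
    if stronger:
        problems.append("stronger than character LoRA: " + ", ".join(stronger))
    return problems


if __name__ == "__main__":
    print(check_stack({"character": 0.8, "clothing": 0.6, "detail": 0.4}, "character"))
    print(check_stack({"character": 0.5, "style": 0.9}, "character"))
```

A passing check only confirms the stack is within the guidelines; visual testing at low strength is still needed, because two LoRAs can conflict even when the numbers look fine.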
Frequently Asked Questions
How many images do I need to train a good LoRA?
For character LoRAs, 20-30 high-quality images is the sweet spot. Fewer than 15 images often produce inconsistent results. More than 50 images can cause overfitting, where the LoRA memorizes specific poses instead of learning the character. Quality matters more than quantity. Across the set, the images should show the character from different angles, with different expressions, in different lighting conditions, and wearing different outfits.
Do I need a GPU to train a LoRA?
For local training with Kohya_ss, you need an NVIDIA GPU with at least 8GB VRAM (12GB+ recommended for SDXL LoRAs). Training takes 30-90 minutes on a 3090/4090. If you do not have a GPU, use cloud training services: CivitAI offers free LoRA training with an account, Replicate charges approximately $1-3 per training run, and RunPod lets you rent a GPU for $0.40-0.70 per hour. Cloud training is the easiest option for beginners.
What is the difference between LoRA and Dreambooth?
LoRA (Low-Rank Adaptation) trains a small adapter file (10-200MB) that is applied on top of the base model's weights at load time, leaving the base checkpoint itself unchanged. Dreambooth fine-tunes the entire model (2-7GB output). LoRA is faster to train (30-90 minutes vs 2-4 hours), produces smaller files, and can be stacked with other LoRAs. Dreambooth can produce slightly higher fidelity but is less flexible. In 2026, LoRA is the standard for most character training.
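The size difference follows from how a low-rank adapter is built: instead of storing a full update for a weight matrix, LoRA stores two thin matrices whose product has the same shape, scaled by alpha divided by rank. The sketch below counts parameters for a single layer; the 1280x1280 layer size is an illustrative assumption, not a figure from any specific model.

```python
def full_param_count(d_out: int, d_in: int) -> int:
    """Parameters in a full update of a d_out x d_in weight matrix."""
    return d_out * d_in


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in a LoRA update: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


def lora_scale(alpha: float, rank: int) -> float:
    """Multiplier applied to the low-rank update (alpha / rank)."""
    return alpha / rank


if __name__ == "__main__":
    d = 1280  # illustrative layer width
    full = full_param_count(d, d)
    lora = lora_param_count(d, d, rank=32)
    print(full, lora, lora / full)
    print(lora_scale(alpha=16, rank=32))
```

For this example layer, rank 32 stores 81,920 values instead of 1,638,400, or 5% of the full update. Setting alpha to half the rank gives a scale of 0.5, and doubling the rank doubles the adapter's size, which is why higher rank means a larger file.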
Why does my LoRA look nothing like the training images?
Common causes: too few training images (aim for 20-30), poor image quality or inconsistent lighting across images, wrong training parameters (a learning rate that is too high causes distortion, one that is too low produces weak results), or insufficient training steps. Fix by using the recommended 1500-3000 steps for 20-30 images, a learning rate of 1e-4 for SDXL, and training images of consistent quality. Also check that your captions accurately describe each image.
Can I stack multiple LoRAs together?
Yes, LoRA stacking is a powerful technique. You can combine a character LoRA (for face consistency) with a style LoRA (for artistic look) or a clothing LoRA (for specific outfits). In ComfyUI, chain multiple Load LoRA nodes together. Start each LoRA at strength 0.7 and adjust down if they conflict. Two LoRAs at 0.7 each is a good starting point. More than three LoRAs simultaneously often causes quality degradation.