What You'll Achieve
- Facial consistency: the same face in every generation
- Training time: 30-90 minutes on an RTX 4070 or better
- Image variations: unlimited poses and outfits
Understanding LoRA: The Basics
What is LoRA?
LoRA (Low-Rank Adaptation) is a technique that teaches Stable Diffusion to recognize and generate a specific face/character/style without retraining the entire model. Think of it as adding a new "word" to SD's vocabulary.
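Conceptually, LoRA freezes the original model weights and trains two small low-rank matrices whose product is added onto each adapted layer. A minimal NumPy sketch of the idea (illustrative only, not the actual Stable Diffusion implementation; the dimensions and rank are placeholders):

```python
import numpy as np

# Frozen pretrained weight matrix of one layer (e.g., an attention projection)
d_out, d_in, rank = 768, 768, 32          # rank = "Network Rank" in training UIs
W = np.random.randn(d_out, d_in)          # stays frozen during LoRA training

# LoRA learns a low-rank update: delta_W = B @ A, with far fewer parameters
A = np.random.randn(rank, d_in) * 0.01    # trainable
B = np.zeros((d_out, rank))               # trainable (starts at zero, so the update starts at zero)

alpha = 16                                # "Network Alpha" scales the update
scale = alpha / rank

def adapted_forward(x):
    # Original path plus the scaled low-rank correction
    return W @ x + scale * (B @ (A @ x))

# Parameter savings: only A and B are saved in the LoRA file
print(W.size, "frozen params vs.", A.size + B.size, "trainable LoRA params")
```

Because only the small A and B matrices are stored, the resulting file stays in the 100-200MB range instead of a full multi-gigabyte checkpoint.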
Advantages
- 95-98% face consistency (vs 60% without LoRA)
- Small file size (100-200MB vs 6GB full model)
- Works with any SD checkpoint
- Train in 30-90 minutes
- Can combine multiple LoRAs
- Easy to share and distribute
Use Cases
- AI influencer character consistency
- Product mascots (brand characters)
- Comic/manga recurring characters
- Personal avatar creation
- Style transfer (artistic styles)
- Concept replication (poses, objects)
Pro Tip: When to Use LoRA vs Textual Inversion
LoRA: Better for faces, complex characters, high consistency requirements. File size 100-200MB.
Textual Inversion: Better for styles, concepts, simple objects. File size 10-50KB but lower consistency (80%).
Phase 1: Dataset Preparation
Dataset Requirements
Image Count & Quality
Optimal: 15-30 images
Sweet spot for character LoRAs. More isn't always better - quality over quantity.
Minimum: 10 images
Can work but consistency drops to 85-90%. Only for quick tests.
Avoid: 50+ images
Overfitting risk. Model memorizes images instead of learning face features.
Diversity Requirements
Your dataset must include variety across these dimensions:
Angles (Critical)
- 8-10 images: Front-facing (straight on)
- 4-6 images: 3/4 view (45° angle)
- 2-3 images: Side profile (90° angle)
- 1-2 images: Looking down/up
Expressions
- 10 images: Neutral/slight smile
- 3-4 images: Big smile/laughing
- 2-3 images: Serious/confident
- 1-2 images: Other (surprised, etc.)
Lighting
- 8-10 images: Soft natural light
- 3-4 images: Bright outdoor light
- 2-3 images: Studio/dramatic lighting
- 1-2 images: Low light/golden hour
Backgrounds
- 6-8 images: Simple/blurred backgrounds
- 4-6 images: Indoor settings
- 3-4 images: Outdoor settings
- 2-3 images: Urban/complex backgrounds
Common Dataset Mistakes
- All same angle: Model can't generate 3/4 or profile views
- All same expression: Character looks robotic, can't show emotion
- Inconsistent style: Mixing anime + realistic breaks training
- Multiple people in frame: Model gets confused about which face to learn
- Low resolution: Under 512px leads to blurry generations
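The image-count and resolution problems above are easy to catch automatically before you start training. A minimal audit sketch using Pillow; the dataset path and thresholds are assumptions you would adjust to your own folder layout:

```python
from pathlib import Path
from PIL import Image

DATASET_DIR = Path("dataset/20_character_name")   # hypothetical dataset folder
MIN_SIDE = 512                                     # flag anything smaller than 512px

images = [p for p in DATASET_DIR.iterdir() if p.suffix.lower() in {".png", ".jpg", ".jpeg"}]
print(f"Found {len(images)} images (target: 15-30)")

for path in images:
    with Image.open(path) as img:
        w, h = img.size
        if min(w, h) < MIN_SIDE:
            print(f"  LOW RES  {path.name}: {w}x{h}")
        if img.mode not in ("RGB", "L"):
            print(f"  NON-RGB  {path.name}: mode={img.mode} (strip alpha before training)")
```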
Step-by-Step: Creating Your Dataset
Generate Seed Images
Use Stable Diffusion to create 50-100 candidate images:
Example Prompt:
"photo of a beautiful woman, 25 years old, brown hair, blue eyes, natural makeup, looking at camera, soft lighting, professional photography, detailed face, 8k uhd, high quality"
Settings: Realistic Vision 5.1 checkpoint, DPM++ 2M Karras, 30 steps, CFG 7, 768x768
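If you prefer scripting the candidate run instead of clicking through the WebUI, the same settings translate to Hugging Face diffusers roughly as below. This is a hedged sketch: the checkpoint path is a hypothetical local file, and scheduler options can vary slightly between diffusers versions.

```python
import os
import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

# Load a local SD 1.5-class checkpoint (e.g., a Realistic Vision .safetensors file)
pipe = StableDiffusionPipeline.from_single_file(
    "models/realisticVisionV51.safetensors",      # hypothetical local path
    torch_dtype=torch.float16,
).to("cuda")

# Roughly equivalent to DPM++ 2M Karras in the WebUI
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config, use_karras_sigmas=True
)

prompt = ("photo of a beautiful woman, 25 years old, brown hair, blue eyes, "
          "natural makeup, looking at camera, soft lighting, professional photography, "
          "detailed face, 8k uhd, high quality")

os.makedirs("candidates", exist_ok=True)
for i in range(50):                                # generate 50-100 candidates
    image = pipe(prompt, num_inference_steps=30, guidance_scale=7.0,
                 width=768, height=768).images[0]
    image.save(f"candidates/seed_{i:03d}.png")
```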
Select Best 20 Images
Curation criteria:
- ✓ Clear, sharp facial features (no blur)
- ✓ No artifacts (weird fingers, distortions)
- ✓ Good variety in angles/expressions
- ✓ Consistent ethnicity/age/gender
- ✓ Natural-looking (no AI "tells")
Crop & Standardize
Prepare images for training:
Resolution:
512x512 minimum, 768x768 optimal, 1024x1024 for SD XL
Cropping:
Face should fill 60-80% of frame. Include shoulders/chest for context.
Format:
PNG or JPG (PNG preferred for quality). Remove alpha channels.
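A small Pillow sketch for the crop-and-standardize pass (center crop to a square, resize, strip the alpha channel). Folder names are placeholders, and face-aware framing is still best checked by eye afterwards:

```python
from pathlib import Path
from PIL import Image

SRC = Path("candidates_selected")          # your curated 20 images
DST = Path("dataset/20_character_name")    # training folder
SIZE = 768                                 # 512 for SD 1.5, 1024 for SDXL
DST.mkdir(parents=True, exist_ok=True)

for i, path in enumerate(sorted(SRC.glob("*.*"))):
    with Image.open(path) as img:
        img = img.convert("RGB")                       # removes any alpha channel
        side = min(img.size)                           # center crop to a square
        left = (img.width - side) // 2
        top = (img.height - side) // 2
        img = img.crop((left, top, left + side, top + side))
        img = img.resize((SIZE, SIZE), Image.LANCZOS)
        img.save(DST / f"image_{i + 1:03d}.png")       # PNG preserves quality
```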
Tag Your Images
Create a .txt file with the same name as each image, describing its contents:
Example: image_001.txt
1girl, brown hair, blue eyes, smiling, white shirt, looking at viewer, natural lighting, portrait, high quality
Pro tip: Use WD14 Tagger in Automatic1111 to auto-generate tags, then manually refine.
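If you write or refine tags by hand, a tiny helper that pairs every image with a same-named .txt file keeps the dataset consistent. A sketch with a hypothetical base tag string you would then edit per image:

```python
from pathlib import Path

DATASET_DIR = Path("dataset/20_character_name")
BASE_TAGS = "1girl, brown hair, blue eyes, looking at viewer, high quality"  # edit per image

for img in sorted(DATASET_DIR.glob("*.png")):
    txt = img.with_suffix(".txt")
    if not txt.exists():                 # don't overwrite hand-edited captions
        txt.write_text(BASE_TAGS + "\n", encoding="utf-8")
        print("wrote", txt.name)
```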
Phase 2: Training Configuration
Training Parameters Explained
| Parameter | Recommended | Explanation |
|---|---|---|
| Learning Rate | 1e-4 | How fast model learns. 1e-4 (0.0001) is safest. 5e-5 for slower/safer, 5e-4 for faster/riskier. |
| Batch Size | 2-4 | Images processed per step. Higher = faster but more VRAM. RTX 4070: use 3-4. RTX 3060: use 2. |
| Epochs | 15-20 | Full passes through dataset. 20 images × 15 epochs = 300 training steps. Sweet spot for faces. |
| Network Rank | 32-64 | LoRA complexity. 32 = lighter/faster, 64 = more detail, 128 = overkill for faces. |
| Network Alpha | 16-32 | Usually half of Network Rank. Affects LoRA strength scaling. |
| Resolution | 512 or 768 | Training resolution. Match your dataset. 768 = better quality but slower. |
| Optimizer | AdamW8bit | Training algorithm. AdamW8bit uses less VRAM. ProdigyPlus for advanced users. |
| LR Scheduler | cosine | Learning rate changes over time. Cosine smoothly reduces LR toward end. |
Beginner Safe Settings
Learning Rate: 1e-4
Batch Size: 2
Epochs: 15
Network Rank: 32
Network Alpha: 16
Resolution: 512
Works 95% of the time. Start here.
Advanced High-Quality Settings
Learning Rate: 5e-5
Batch Size: 4
Epochs: 20
Network Rank: 64
Network Alpha: 32
Resolution: 768
Slower but maximum quality. Needs 12GB+ VRAM.
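Under the hood, the Kohya GUI calls the sd-scripts trainer, and the beginner preset above maps roughly onto a command like the sketch below. Treat this as a hedged illustration: flag names and defaults can differ between sd-scripts versions, and the model/dataset paths are placeholders, so the GUI remains the source of truth.

```python
import subprocess

# Hypothetical invocation of kohya-ss/sd-scripts train_network.py with the beginner preset.
# Run from the sd-scripts directory; verify flag names against your installed version.
cmd = [
    "accelerate", "launch", "train_network.py",
    "--pretrained_model_name_or_path", "models/realisticVisionV51.safetensors",
    "--train_data_dir", "dataset",            # contains 20_character_name/
    "--output_dir", "output/character_name",
    "--resolution", "512",
    "--network_module", "networks.lora",
    "--network_dim", "32",                    # Network Rank
    "--network_alpha", "16",
    "--learning_rate", "1e-4",
    "--lr_scheduler", "cosine",
    "--optimizer_type", "AdamW8bit",
    "--train_batch_size", "2",
    "--max_train_epochs", "15",
    "--save_every_n_epochs", "5",
    "--mixed_precision", "fp16",
]
subprocess.run(cmd, check=True)
```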
Phase 3: Training Process (Kohya SS)
Kohya SS GUI Setup
Installation
1. Download Kohya SS from GitHub: bmaltais/kohya_ss
2. Run the setup script: setup.bat (Windows) or setup.sh (Linux)
3. Launch the GUI: gui.bat or gui.sh
4. Navigate to the LoRA tab in the web interface
Configuration Steps
Source Model
Select base checkpoint (Realistic Vision 5.1, DreamShaper 8, etc). Choose same model you used for dataset generation.
Training Folder
Point to the folder containing your 20 images + .txt tags. Structure: 20_character_name/ (the 20_ prefix tells Kohya how many times to repeat each image per epoch).
Output Folder
Where trained LoRA will be saved. Create: output/character_name/
Parameters
Enter training settings from table above. Enable "Save every N epochs" (set to 5) for checkpoints.
Start Training
Click "Train model" button. Monitor terminal for progress:
Epoch 1/15: [====================] 100% Loss: 0.142
Epoch 2/15: [====================] 100% Loss: 0.118
Epoch 5/15: [====================] 100% Loss: 0.092 - Checkpoint saved
...
Epoch 15/15: [====================] 100% Loss: 0.064
Training complete! LoRA saved to output folder.
Training time: RTX 4090: 20-30 min | RTX 4070: 30-45 min | RTX 3060: 60-90 min
Alternative: Automatic1111 Training
If you prefer training in Automatic1111 WebUI, use the built-in training tab:
A1111 Steps
1. Navigate to the "Train" tab in the WebUI
2. Select the "LoRA" tab (not Dreambooth)
3. Create a new LoRA, name your character
4. Set source model, training folder, epochs, learning rate
5. Click "Train" - monitor in terminal/console
Note: Kohya SS is more powerful with better settings control. A1111 works but is more limited.
Phase 4: Quality Control & Testing
Testing Your LoRA
Test Different Weights
LoRA strength can be adjusted 0-1. Test multiple values:
Weight 0.6 (Subtle)
Face is recognizable but allows more prompt influence. Good for artistic styles.
Weight 0.8 (Balanced)
Strong consistency while maintaining flexibility. Recommended for most use cases.
Weight 1.0 (Maximum)
Strongest consistency. Use for photorealistic influencers where exact face match is critical.
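A weight sweep is easy to script once the LoRA file is exported. The sketch below assumes a diffusers version that supports `load_lora_weights`/`set_adapters` (PEFT backend); paths, filenames, and the trigger word are placeholders. In Automatic1111 the equivalent is simply writing `<lora:character_name:0.8>` in the prompt.

```python
import os
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_single_file(
    "models/realisticVisionV51.safetensors", torch_dtype=torch.float16
).to("cuda")

# Load the trained LoRA under a named adapter
pipe.load_lora_weights("output/character_name",
                       weight_name="character_name.safetensors",
                       adapter_name="character")

prompt = "photo of character_name, wearing red dress, city street background, detailed face"

os.makedirs("tests", exist_ok=True)
for weight in (0.6, 0.8, 1.0):
    pipe.set_adapters(["character"], adapter_weights=[weight])
    image = pipe(prompt, num_inference_steps=30, guidance_scale=7.0).images[0]
    image.save(f"tests/weight_{weight:.1f}.png")
```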
Test Prompts
Generate 20-30 test images with varied prompts:
Different Outfits
"wearing red dress", "in business suit", "casual jeans and t-shirt"
Different Locations
"at beach", "in office", "city street background"
Different Poses
"sitting on chair", "walking", "hands on hips"
Different Expressions
"laughing", "serious expression", "winking"
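The same loaded pipeline can batch through these variations. A short sketch that continues from the weight-test snippet above (so `pipe` and the "character" adapter are already set up; the trigger word and output paths are placeholders):

```python
import os

variations = [
    "wearing red dress", "in business suit", "casual jeans and t-shirt",
    "at beach", "in office", "city street background",
    "sitting on chair", "walking", "hands on hips",
    "laughing", "serious expression", "winking",
]

os.makedirs("tests/prompts", exist_ok=True)
pipe.set_adapters(["character"], adapter_weights=[0.8])   # balanced weight from above

for i, variation in enumerate(variations):
    prompt = f"photo of character_name, {variation}, detailed face, high quality"
    image = pipe(prompt, num_inference_steps=30, guidance_scale=7.0).images[0]
    image.save(f"tests/prompts/test_{i:02d}.png")
```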
Consistency Checklist
Your LoRA passes if 95%+ of test images match these criteria:
- ✓ Same facial structure (jawline, cheekbones, chin)
- ✓ Same eye color and shape
- ✓ Same nose shape
- ✓ Same hair color (unless you prompt otherwise)
- ✓ Same overall facial proportions
- ✓ Same age appearance
- ✓ Same ethnicity/skin tone
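Eyeballing 20-30 images works, but you can also score consistency numerically with face embeddings. A sketch using the `face_recognition` library (an assumption on my part; any face-embedding model works the same way): compare every test image against one reference and flag low-similarity outliers.

```python
from pathlib import Path
import numpy as np
import face_recognition

# One good, on-model image to compare everything else against (hypothetical path)
ref = face_recognition.load_image_file("tests/reference.png")
ref_encoding = face_recognition.face_encodings(ref)[0]

for path in sorted(Path("tests/prompts").glob("*.png")):
    img = face_recognition.load_image_file(str(path))
    encodings = face_recognition.face_encodings(img)
    if not encodings:
        print(f"{path.name}: no face detected")
        continue
    distance = np.linalg.norm(ref_encoding - encodings[0])
    # ~0.6 is the library's usual "same person" threshold; lower = more similar
    status = "OK" if distance < 0.6 else "INCONSISTENT"
    print(f"{path.name}: distance={distance:.3f} {status}")
```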
Troubleshooting Common Issues
| Problem | Cause | Solution |
|---|---|---|
| Face inconsistent | Not enough epochs or too low learning rate | Increase epochs to 20-25 or learning rate to 1e-4 |
| Overfitted (copies dataset) | Too many epochs or too high learning rate | Reduce epochs to 10-12 or learning rate to 5e-5 |
| Can't generate profiles | Dataset lacks side-view images | Add 3-5 profile images to dataset, retrain |
| Strange artifacts | Low-quality dataset images | Curate dataset more strictly, remove blurry/distorted images |
| Style drift (wrong style) | Mixed styles in dataset | Keep dataset style consistent (all realistic or all anime) |
| Works poorly with prompts | Network rank too high | Reduce network rank to 32, retrain |
Case Study: 98% Consistent Character
@LunaRae - AI Fashion Influencer
Created with custom-trained LoRA | 98% face consistency
Results
Face consistency rate: 98%
Images generated: 1,200+ (400/month for 3 months)
Audience feedback: "Feels like a real person" is a recurring comment
Creator's Success Metrics
- Only 3 images (out of 1,200) had noticeable face inconsistency
- Can generate unlimited variations: outfits, poses, locations
- Works with multiple checkpoints (Realistic Vision, DreamShaper, ChilloutMix)
- Followers believe she's a real person (no questions about AI)
- Landed 4 brand deals - brands trust the consistency
Frequently Asked Questions
How long does LoRA training take?
With an RTX 4070 or better, expect 30-60 minutes for a standard character LoRA (20 images, 15-20 epochs). RTX 3060 takes 60-90 minutes. High-end GPUs like RTX 4090 can complete training in 15-25 minutes. Cloud services (RunPod, Vast.ai) offer similar speeds for $0.50-1.00 per training session.
Can I train a LoRA without a powerful GPU?
Yes! Use cloud services like RunPod ($0.34/hour for RTX 4090), Google Colab (free tier with limits, Pro for $10/month), or Vast.ai ($0.25-0.50/hour). These let you rent high-end GPUs by the hour. Training one LoRA costs $0.50-1.50, far cheaper than buying a GPU.
Why is my LoRA generating the same image every time?
This is overfitting - your LoRA memorized the training images instead of learning the face. Solution: Reduce epochs (try 10-12 instead of 20), lower learning rate (5e-5 instead of 1e-4), or add more variety to your dataset. Save checkpoints every 5 epochs and compare - use the epoch before overfitting started.
Can I update my LoRA with new images later?
Not directly - you can't "add" to an existing LoRA. However, you can create a new training dataset combining old + new images and retrain from scratch. This takes another 30-60 minutes but gives you a refreshed LoRA with updated features. Most creators retrain monthly to refine their character.
How do I share my LoRA with others?
Upload to CivitAI (largest LoRA community), Hugging Face, or your own hosting. LoRA files are 100-200MB. Include example images, recommended settings (weight, prompt), and base model used. Be aware: once shared, others can generate infinite images of your character.
Can I combine multiple LoRAs?
Yes! You can stack LoRAs - for example, character LoRA (weight 1.0) + style LoRA (weight 0.6) + clothing LoRA (weight 0.4). Keep total weights under 3.0 to avoid artifacts. This is powerful for creating unique combinations without training new LoRAs.
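In diffusers, this stacking maps onto multiple named adapters; the sketch below assumes the PEFT-backed `load_lora_weights`/`set_adapters` API, and every path, filename, and trigger word is a placeholder. In Automatic1111 the equivalent is chaining `<lora:...>` tags in the prompt.

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_single_file(
    "models/realisticVisionV51.safetensors", torch_dtype=torch.float16
).to("cuda")

# Load each LoRA as a named adapter (paths/filenames are hypothetical)
pipe.load_lora_weights("loras", weight_name="character.safetensors", adapter_name="character")
pipe.load_lora_weights("loras", weight_name="style.safetensors", adapter_name="style")
pipe.load_lora_weights("loras", weight_name="clothing.safetensors", adapter_name="clothing")

# Character dominates; style and clothing add flavor. Keep the total well under 3.0.
pipe.set_adapters(["character", "style", "clothing"], adapter_weights=[1.0, 0.6, 0.4])

image = pipe("photo of character_name in a watercolor illustration style, wearing a kimono",
             num_inference_steps=30, guidance_scale=7.0).images[0]
image.save("stacked_loras.png")
```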
What's the difference between LoRA and Dreambooth?
LoRA trains small adapter weights (100-200MB) that work with any checkpoint. Dreambooth fine-tunes the entire model (6GB file) and is locked to that model. LoRA is faster, more flexible, and better for most use cases. Dreambooth gives slightly more control but requires 10x more training time and VRAM.
How often should I retrain my LoRA?
Most AI influencer creators retrain every 2-3 months to refine consistency or update the character's look (new hairstyle, style evolution). Initial LoRA lasts indefinitely if results are good. Retrain if: consistency drops, you want to add new features, or your audience wants visual updates.
Want to master AI Influencers Academy? Get it + 3 more complete courses
Complete Creator Academy - All Courses
Master Instagram growth, AI influencers, n8n automation, and digital products for just $99/month. Cancel anytime.
All 4 premium courses (Instagram, AI Influencers, Automation, Digital Products)
100+ hours of training content
Exclusive templates and workflows
Weekly live Q&A sessions
Private community access
New courses and updates included
Cancel anytime - no long-term commitment
✨ Includes: Instagram Ignited • AI Influencers Academy • AI Automations • Digital Products Empire