Same face, 1,000 poses. LoRA training is the secret to 98% facial consistency for successful AI influencers. Complete step-by-step tutorial.
• Face consistency: 95-98% (same face, every generation)
• Training time: 30-60 minutes (RTX 4070 or better)
• Images needed: 15-25 (optimal dataset size)
• Variations: unlimited poses and outfits
Without LoRA training, your AI influencer will have a different face in every photo. Followers notice. Engagement drops. Brand deals disappear. LoRA training solves this by teaching Stable Diffusion to recognize and reproduce your character's exact facial features with 95-98% accuracy.
Without LoRA training:
• 60-70% facial consistency: the face changes noticeably between posts
• Followers question authenticity: "Why does she look different every day?"
• Limited monetization: brands won't trust inconsistent characters
• Hours spent cherry-picking: generate 100 images to find 10 consistent ones
With LoRA training:
• 95-98% facial consistency: the same recognizable face every single time
• Followers build real connection: "She feels like a real person to me"
• Brand deals at scale: $2K-$15K per sponsored post
• 95%+ keeper rate: generate 100, use 95, with minimal waste
A study of 50 AI influencers (10K-100K followers) shows a direct correlation between facial consistency and monetization:
| Facial consistency | Avg monthly revenue |
|---|---|
| 60-70% | $0-500 |
| 80-90% | $2K-5K |
| 95-98% (LoRA) | $8K-25K |
LoRA (Low-Rank Adaptation) is a fine-tuning technique that teaches Stable Diffusion to generate a specific face, character, or style without retraining the entire 6GB model. Instead, it trains a pair of small low-rank matrices for each adapted layer and saves them as a small 100-200MB "adapter" file (the exact size depends on the network rank) that contains only the information needed to reproduce your character.
1. You provide 15-25 images of your character: different angles, expressions, and lighting
2. Training analyzes facial features and patterns: eye shape, nose structure, face geometry, etc.
3. Training creates a lightweight adapter file: a 100-200MB LoRA that "knows" your character
4. You load the LoRA when generating: SD now reproduces your character's face
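To see why the adapter file is so much smaller than the base model, it helps to count parameters. The sketch below compares a full update of one weight matrix with a rank-r update; the 768-wide layer is a hypothetical size chosen for illustration, not a measurement of any specific Stable Diffusion layer.

```python
def full_update_params(d_out: int, d_in: int) -> int:
    """Parameters touched when fine-tuning a full d_out x d_in weight matrix."""
    return d_out * d_in


def lora_update_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in a rank-r update B @ A, with B: d_out x r and A: r x d_in."""
    return rank * (d_out + d_in)


def lora_scale(alpha: float, rank: int) -> float:
    """Multiplier applied to the low-rank update (alpha / rank)."""
    return alpha / rank


if __name__ == "__main__":
    d = 768  # hypothetical layer width, for illustration only
    full = full_update_params(d, d)
    lora = lora_update_params(d, d, rank=32)
    print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.1%}")
    print("scale:", lora_scale(alpha=16, rank=32))
```

For this example layer, a rank-32 update trains about 8% of the parameters a full update would, which is why the saved file stays small.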
| Method | Consistency | File size | Notes | Best for |
|---|---|---|---|---|
| ✓ LoRA Training | 95-98% | 100-200MB | Works with checkpoints that share the same base model | AI influencers, serious projects |
| ◐ Textual Inversion (Embeddings) | 75-85% | 10-50KB | Limited control | Styles, concepts, simple characters |
| ◐ Dreambooth | 97-99% | 6GB | Locked to one model | Maximum control, advanced users only |
| ✗ No Training (Prompts Only) | 60-70% | None | Very inconsistent | One-off images, testing concepts |
🎯 Highest Consistency
95-98% facial accuracy beats all alternatives except full Dreambooth
💾 Small File Size
100-200MB means easy sharing, fast loading, minimal storage
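🔄 Works Across Checkpoints
Compatible with checkpoints built on the same base model (for example, an SD 1.5 LoRA works with SD 1.5 checkpoints), so you can swap models freely within that family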
Get all courses, templates, and automation systems for just $99/month
Start Learning for $99/month
90% of LoRA training failures come from poor dataset quality or composition. Get this right and training is easy. Get it wrong and no amount of parameter tweaking will save you.
• 15-25 images (OPTIMAL): 95-98% consistency. Industry standard for character LoRAs.
• 10-14 images (MINIMUM): 85-92% consistency. Works but is less reliable. Use only for testing.
• 26-35 images (ADVANCED): 96-98% consistency. Requires careful curation. Overfitting risk if not diverse.
• 40+ images (AVOID): High overfitting risk. The model memorizes images instead of learning features.
Your dataset must include variety across these dimensions. This teaches the LoRA to recognize your character in ANY situation:
Camera Angles (Critical)
• 40% - Front-facing (looking at camera, straight on)
• 30% - 3/4 view (45° angle, most flattering)
• 20% - Side profile (90° angle, shows bone structure)
• 10% - Looking up/down (varied head tilts)
Expressions
• 50% - Neutral or slight smile (natural resting)
• 25% - Big smile or laughing (shows teeth)
• 15% - Serious or confident (no smile)
• 10% - Other (surprised, pensive, etc.)
Lighting Conditions
• 40% - Soft natural light (window light, overcast)
• 30% - Bright outdoor (direct sunlight)
• 20% - Studio/dramatic (strong directional)
• 10% - Low light or golden hour
Backgrounds
• 40% - Simple/blurred (focus on face)
• 30% - Indoor settings (rooms, cafes)
• 20% - Outdoor settings (parks, streets)
• 10% - Urban/complex (city, busy scenes)
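The percentages above turn into whole image counts once a dataset size is chosen. A minimal sketch of that arithmetic follows, using the camera-angle mix as the example; the function name and the rounding rule (leftover images go to the categories with the largest remainders) are illustrative choices, not part of any tool.

```python
# Camera-angle shares from the list above, as whole percentages.
ANGLE_MIX = {
    "front-facing": 40,
    "3/4 view": 30,
    "side profile": 20,
    "looking up/down": 10,
}


def plan_counts(total: int, mix: dict[str, int]) -> dict[str, int]:
    """Split `total` images across categories so the counts sum to `total`."""
    counts = {name: total * pct // 100 for name, pct in mix.items()}
    remainders = {name: total * pct % 100 for name, pct in mix.items()}
    leftover = total - sum(counts.values())
    # Hand the leftover images to the categories with the largest remainders.
    ranked = sorted(mix, key=lambda name: remainders[name], reverse=True)
    for name in ranked[:leftover]:
        counts[name] += 1
    return counts


if __name__ == "__main__":
    for size in (20, 23):
        print(size, plan_counts(size, ANGLE_MIX))
```

The same function can be reused for the expression, lighting, and background mixes by passing a different dictionary.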
Use Stable Diffusion (no LoRA) to create 50-100 candidate images of your character concept. Goal: Find 20-25 high-quality images that feel like the same person.
Example Prompt (Photorealistic Female):
photo of a beautiful woman, 25 years old, long brown hair, blue eyes, natural makeup, clear skin, looking at camera, soft lighting, professional photography, detailed face, high quality, 8k uhd, photorealistic
Model: Realistic Vision 5.1, DreamShaper 8, or ChilloutMix
Sampler: DPM++ 2M Karras or Euler A
Steps: 25-35
CFG Scale: 6-8
Resolution: 768x768 (1:1 ratio for portraits)
Pro Tip: Batch Generate with Seed Variations
Set batch size to 4 and generate 25 batches (100 images). Use a different seed for each batch to get variety. Nearby seeds are no more alike than distant ones, so it is the shared prompt, not the seed, that keeps features similar. Keep the same prompt for all batches.
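The batch plan above can be scripted. The sketch below only builds the request payloads; it assumes the Automatic1111 web UI is running with its API enabled and that the field names match the `/sdapi/v1/txt2img` endpoint of your installed version, which should be checked before sending anything. The steps and CFG values are picks from within the ranges listed above.

```python
import random

PROMPT = (
    "photo of a beautiful woman, 25 years old, long brown hair, blue eyes, "
    "natural makeup, clear skin, looking at camera, soft lighting, "
    "professional photography, detailed face, high quality, 8k uhd, photorealistic"
)


def build_batches(n_batches: int = 25, batch_size: int = 4, rng_seed: int = 0) -> list[dict]:
    """Build one txt2img payload per batch, each with its own distinct seed."""
    rng = random.Random(rng_seed)  # fixed so the seed list is reproducible
    seeds = rng.sample(range(2**32), n_batches)
    return [
        {
            "prompt": PROMPT,  # same prompt for every batch
            "steps": 30,  # within the 25-35 range
            "cfg_scale": 7,  # within the 6-8 range
            "width": 768,
            "height": 768,
            "sampler_name": "DPM++ 2M Karras",  # assumed sampler label; varies by version
            "batch_size": batch_size,
            "seed": seed,
        }
        for seed in seeds
    ]


if __name__ == "__main__":
    payloads = build_batches()
    print(len(payloads), "batches,", sum(p["batch_size"] for p in payloads), "images")
```

Recording the seed next to each saved image makes it possible to regenerate a keeper later with the same settings.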
Review all 50-100 generated images and select the best 20-25. This is the most important step - quality over quantity!
Selection Criteria (Keep If:)
Rejection Criteria (Delete If:)
⚠️ The "Same Person" Test
Line up all your selected images side by side. If you showed these to a friend, would they believe it's the same person in all of them? If not, remove outliers that look too different.
Prepare your 20-25 selected images for training by standardizing resolution and cropping.
Resolution Targets:
• SD 1.5: 512x512 minimum, 768x768 optimal
• SD XL: 1024x1024 standard
• Note: All images should be same resolution
Cropping Guidelines:
• Face should fill 60-80% of frame
• Include shoulders/upper chest for context
• Center the face in frame (unless artistic intent)
• Maintain consistent aspect ratio (1:1 for portraits)
File Format:
• Best: PNG (lossless, highest quality)
• Acceptable: JPG at 95%+ quality
• Avoid: WebP, heavily compressed JPG
• Remove alpha channels (convert RGBA to RGB)
Tool Recommendation: BIRME (Bulk Image Resizing)
Use BIRME (birme.net) to batch resize all images to exact dimensions. Upload all 20-25 images, set target resolution, export as PNG. Takes 2 minutes.
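As a scripted alternative to BIRME, the same preparation can be sketched in Python. This assumes the third-party Pillow package is installed; the function names and the centered-crop rule are illustrative. A centered crop does not guarantee that the face fills 60-80% of the frame, so each result still needs a visual check.

```python
from pathlib import Path


def center_square_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) for the largest centered square crop."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def prepare_image(src: Path, dst_dir: Path, size: int = 768) -> Path:
    """Crop to a centered square, resize, drop the alpha channel, save as PNG."""
    from PIL import Image  # Pillow, assumed installed (pip install pillow)

    with Image.open(src) as img:
        rgb = img.convert("RGB")  # RGBA -> RGB, per the file format notes
        square = rgb.crop(center_square_box(*rgb.size))
        resized = square.resize((size, size), Image.LANCZOS)
    dst = dst_dir / f"{src.stem}.png"
    resized.save(dst)
    return dst


if __name__ == "__main__":
    print(center_square_box(1024, 768))
```

Use `size=512` or `size=1024` to match the other resolution targets listed above.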
Create .txt files with the same name as each image, describing the contents. This helps the LoRA understand what to learn.
Example: character_001.png → character_001.txt
1girl, brown hair, blue eyes, smiling, white shirt, looking at viewer, natural lighting, portrait, high quality
What to Include:
What NOT to Include:
Auto-Tagging Tool: WD14 Tagger
The WD14 Tagger extension for Automatic1111 (installed separately from the Extensions tab) can do the first pass. Point it at your dataset folder and it auto-generates tags for all images; you then refine them manually. Saves 80% of tagging time.
Extensions → WD14 Tagger → Batch from directory → Review and edit auto-generated tags
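Whether captions are written by hand or auto-tagged, it is worth checking that every image has a matching `.txt` file and that the tag lists are clean. A minimal sketch using only the standard library follows; the helper names are hypothetical.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def missing_captions(dataset_dir: Path) -> list[str]:
    """Return image file names that have no same-named .txt caption file."""
    missing = []
    for image in sorted(dataset_dir.iterdir()):
        is_image = image.suffix.lower() in IMAGE_SUFFIXES
        if is_image and not image.with_suffix(".txt").exists():
            missing.append(image.name)
    return missing


def normalize_caption(text: str) -> str:
    """Trim whitespace, drop empty and duplicate tags, keep first-seen order."""
    tags = [tag.strip() for tag in text.split(",")]
    unique = dict.fromkeys(tag for tag in tags if tag)
    return ", ".join(unique)


if __name__ == "__main__":
    print(normalize_caption("1girl,  brown hair, ,blue eyes, brown hair"))
```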
These settings control how your LoRA learns. Copy these recommended values for 95%+ success rate:
| Parameter | Beginner Safe | Advanced | What It Does |
|---|---|---|---|
| Learning Rate | 1e-4 | 5e-5 to 5e-4 | Speed of learning. 1e-4 (0.0001) is safest. Lower = slower/safer, higher = faster/riskier. |
| Batch Size | 2 | 3-4 | Images processed per step. Higher = faster training but needs more VRAM. RTX 4070: 3-4, RTX 3060: 2. |
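| Epochs | 15-20 | 10-30 | Full passes through dataset. 20 images × 15 epochs = 300 image passes, which equals 300 training steps only at batch size 1 with 1 repeat per image; a larger batch size divides the step count and folder repeats multiply it. Sweet spot for faces. |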
| Network Rank (Dim) | 32 | 64-128 | LoRA complexity/capacity. 32 = lighter/faster, 64 = more detail, 128 = maximum (overkill for most). |
| Network Alpha | 16 | 32-64 | Usually half of Network Rank. Affects LoRA strength scaling and learning stability. |
| Resolution | 512 | 768-1024 | Training resolution. Match your dataset. 768 = better quality but 2x slower. SD XL uses 1024. |
| Optimizer | AdamW8bit | AdamW / Lion | Training algorithm. AdamW8bit uses less VRAM (12GB vs 16GB). Lion is experimental but faster. |
| LR Scheduler | cosine | cosine_with_restarts | How learning rate changes over time. Cosine smoothly reduces LR, preventing overfitting at end. |
| Save Every N Epochs | 5 | 3-5 | Save checkpoint every N epochs. Lets you compare quality at epoch 5, 10, 15, 20 and choose best. |
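The step count in the Epochs row depends on batch size and on the repeat count as well. The sketch below shows the arithmetic as Kohya-style trainers are generally understood to count steps; treat the formula as an assumption and compare it with the step total your trainer prints at startup.

```python
import math


def training_steps(images: int, epochs: int, batch_size: int = 1, repeats: int = 1) -> int:
    """Optimizer steps: ceil(images * repeats / batch_size) per epoch, times epochs."""
    steps_per_epoch = math.ceil(images * repeats / batch_size)
    return steps_per_epoch * epochs


if __name__ == "__main__":
    print(training_steps(20, 15))                            # batch size 1, 1 repeat
    print(training_steps(20, 15, batch_size=2))              # beginner batch size
    print(training_steps(20, 15, batch_size=2, repeats=20))  # with 20 repeats per image
```

With 20 images and 15 epochs this gives 300, 150, and 3000 steps respectively, so the repeat count matters far more than the other settings when estimating training time.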
Beginner configuration:
Learning Rate: 0.0001 (1e-4)
Batch Size: 2
Epochs: 15
Network Rank: 32
Network Alpha: 16
Resolution: 512
Optimizer: AdamW8bit
LR Scheduler: cosine
Save Every: 5 epochs
Works 95% of the time for character faces. Start here. Total training time: 30-60 min (RTX 4070).
Advanced configuration:
Learning Rate: 0.00005 (5e-5)
Batch Size: 4
Epochs: 20
Network Rank: 64
Network Alpha: 32
Resolution: 768
Optimizer: AdamW8bit
LR Scheduler: cosine_with_restarts
Save Every: 5 epochs
Maximum quality, slower learning. Needs 12GB+ VRAM. Training time: 60-120 min (RTX 4070).
Kohya SS is the industry-standard LoRA training tool. It's powerful, free, and supports all advanced features. Here's the complete setup and training workflow:
1. Download Kohya SS from GitHub: visit github.com/bmaltais/kohya_ss and download the latest release ZIP
2. Extract to a simple path: for example C:\kohya_ss\ (Windows) or ~/kohya_ss (Linux); avoid spaces in the path
3. Run the setup script: double-click setup.bat (Windows) or run bash setup.sh (Linux/Mac); takes 5-10 minutes
4. Launch the GUI: run gui.bat (Windows) or bash gui.sh (Linux/Mac); opens a web browser at localhost:7860
Tab 1: Source Model
• Select base checkpoint (Realistic Vision 5.1, DreamShaper 8, etc.)
• Important: Use same model you generated dataset with
• Path example: C:\stable-diffusion\models\Stable-diffusion\realisticVision.safetensors
Tab 2: Folders
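Image folder: Point to the parent folder that contains your dataset subfolder
C:\kohya_ss\datasets (the images themselves go in C:\kohya_ss\datasets\20_charactername)
Note: The subfolder name format is "repeatcount_triggername" (e.g., 20_mycharacter). The leading number is how many times each image is repeated per epoch, so it multiplies the total step count.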
Output folder: Where LoRA will save
C:\kohya_ss\output\mycharacter
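Because the folder name carries both the repeat count and the trigger word, a typo there silently changes training. A small sketch for validating the "repeatcount_triggername" format described above follows; the function name is hypothetical.

```python
def parse_dataset_folder(name: str) -> tuple[int, str]:
    """Split a 'repeatcount_triggername' folder name into (repeats, trigger)."""
    repeats, separator, trigger = name.partition("_")
    if not separator or not repeats.isdigit() or not trigger:
        raise ValueError(f"expected 'repeatcount_triggername', got {name!r}")
    return int(repeats), trigger


if __name__ == "__main__":
    print(parse_dataset_folder("20_charactername"))
```

Only the first underscore is treated as the separator, so a trigger word that itself contains underscores is kept whole.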
Tab 3: Training Parameters
• Paste the beginner config values from table above
• Learning rate: 0.0001 (1e-4)
• Batch size: 2
• Epochs: 15
• Network dim (rank): 32
• Network alpha: 16
• Resolution: 512,512
• Save every N epochs: 5
Tab 4: Advanced Settings (Optional)
• Enable xformers: ✓ (reduces VRAM usage)
• Mixed precision: fp16 (faster training)
• Cache latents: ✓ (speeds up training)
• Gradient checkpointing: ✓ if low on VRAM
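The GUI settings above correspond to command-line flags of the underlying training script. The sketch below assembles such a command from the beginner values. The flag names follow the kohya-ss sd-scripts `train_network.py` options as commonly documented, but they are an assumption here and should be verified against your installed version; the paths in the usage example are placeholders.

```python
def build_train_command(model_path: str, data_dir: str, output_dir: str, output_name: str) -> list[str]:
    """Assemble a LoRA training command from the beginner configuration."""
    return [
        "accelerate", "launch", "train_network.py",  # assumed entry point
        "--pretrained_model_name_or_path", model_path,
        "--train_data_dir", data_dir,
        "--output_dir", output_dir,
        "--output_name", output_name,
        "--network_module", "networks.lora",
        "--network_dim", "32",
        "--network_alpha", "16",
        "--learning_rate", "1e-4",
        "--train_batch_size", "2",
        "--max_train_epochs", "15",
        "--resolution", "512,512",
        "--optimizer_type", "AdamW8bit",
        "--lr_scheduler", "cosine",
        "--save_every_n_epochs", "5",
        "--save_model_as", "safetensors",
        "--mixed_precision", "fp16",
        "--xformers",
        "--cache_latents",
    ]


if __name__ == "__main__":
    command = build_train_command(
        model_path=r"C:\stable-diffusion\models\Stable-diffusion\realisticVision.safetensors",
        data_dir=r"C:\kohya_ss\datasets",
        output_dir=r"C:\kohya_ss\output\mycharacter",
        output_name="mycharacter",
    )
    print(command)
```

Keeping the command in a script makes a training run reproducible, which helps when comparing checkpoints from different settings.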
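Click the "Start training" button at the bottom of the page. Monitor the terminal/console window for progress. The output below is illustrative; the exact format depends on your Kohya SS version: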
Loading model from: realisticVision.safetensors
Preparing dataset: 20 images found
Starting training with 300 steps (20 images × 15 epochs)
Epoch 1/15: [====================] 100% | Loss: 0.142 | ETA: 28 min
Epoch 2/15: [====================] 100% | Loss: 0.118 | ETA: 24 min
Epoch 5/15: [====================] 100% | Loss: 0.092 | ETA: 15 min
Checkpoint saved: mycharacter-000005.safetensors
...
Epoch 15/15: [====================] 100% | Loss: 0.064 | ETA: 0 min
Training complete! Final LoRA saved to output folder.
• RTX 4090: 20-30 minutes (fastest training)
• RTX 4070 / 3080: 30-50 minutes (good balance)
• RTX 3060: 60-90 minutes (minimum viable)
"Loss" measures training error. Lower = better learning. Watch for this pattern:
• Healthy loss curve (good): starts high (0.15-0.20), drops steadily, and plateaus around 0.05-0.08 at the end. This is the target.
• Loss too low too fast (overfitting): drops to 0.01-0.02 in the first 5 epochs. The LoRA is memorizing, not learning. Reduce the learning rate or epochs.
• Loss not dropping (undertraining): stays above 0.15 for the entire training. The LoRA isn't learning. Increase the learning rate or epochs.
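The three patterns above can be turned into a quick check on the per-epoch loss values copied from the console. This is a rough sketch: the thresholds are the ones quoted above, the function name is hypothetical, and a loss curve is only a hint, so sample images remain the real test.

```python
def diagnose_loss(epoch_losses: list[float]) -> str:
    """Flag the overfitting and undertraining patterns described above."""
    if not epoch_losses:
        raise ValueError("need at least one epoch loss")
    if min(epoch_losses[:5]) <= 0.02:
        # Loss collapsed within the first 5 epochs.
        return "possible overfitting"
    if min(epoch_losses) > 0.15:
        # Loss never dropped below the starting range.
        return "possible undertraining"
    return "no warning"


if __name__ == "__main__":
    # Loss values taken from the sample console output in this guide.
    print(diagnose_loss([0.142, 0.118, 0.092, 0.064]))
```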
You now have 3-4 LoRA checkpoints (epoch 5, 10, 15, 20). Test each to find the best quality:
LoRA "weight" controls influence strength (0.0 to 1.5). Test each checkpoint at multiple weights:
Weight 0.6 (Subtle)
Face is recognizable but allows more prompt influence
Best for: Artistic styles, heavy customization
Weight 0.8 (Balanced)
Strong consistency while maintaining flexibility
Best for: Most use cases, recommended default
Weight 1.0 (Maximum)
Strongest consistency, exact face match
Best for: Photorealistic influencers, maximum accuracy
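Testing every checkpoint at every weight is easier with a generated prompt list. The sketch below uses the `<lora:name:weight>` prompt tag from the Automatic1111 web UI; the checkpoint names follow the file name pattern shown in the training log, and the base prompt is a placeholder.

```python
from itertools import product


def lora_prompt(base_prompt: str, lora_name: str, weight: float) -> str:
    """Append a LoRA tag for the given checkpoint file name and weight."""
    return f"{base_prompt}, <lora:{lora_name}:{weight}>"


def build_test_grid(base_prompt: str, checkpoints: list[str], weights: list[float]) -> list[str]:
    """One prompt per (checkpoint, weight) combination."""
    return [lora_prompt(base_prompt, name, w) for name, w in product(checkpoints, weights)]


if __name__ == "__main__":
    grid = build_test_grid(
        "portrait photo",  # placeholder base prompt
        ["mycharacter-000005", "mycharacter-000010", "mycharacter"],
        [0.6, 0.8, 1.0],
    )
    for prompt in grid:
        print(prompt)
```

Generating each prompt with the same seed keeps the comparison fair, since the only thing that changes between images is the checkpoint and weight.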
Generate 30-50 test images with varied prompts to stress-test consistency:
Category 1: Different Outfits
Category 2: Different Locations
Category 3: Different Poses
Category 4: Different Expressions
Your LoRA passes quality control if 95%+ of test images match these criteria:
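Cause: Not enough training epochs, a learning rate that is too low, or poor dataset diversity
Solution: Increase epochs to 20-25, raise the learning rate to 2e-4, or add more varied images to the dataset
Cause: Too many epochs, a learning rate that is too high, or a Network Rank that is too high
Solution: Use an earlier saved checkpoint, or reduce the epochs, learning rate, or Network Rank
Cause: The dataset lacks images from the failing angles, so the LoRA never learned them
Solution: Add images from those angles and retrain
Cause: Low-quality dataset images, conflicting training data, or a LoRA weight that is too high
Solution: Replace low-quality or outlier images, or lower the LoRA weight toward 0.6-0.8
Cause: A Network Rank that is too high, or training that overfit to specific styles/outfits
Solution: Lower the Network Rank (32 is the beginner value) or lower the LoRA weight to allow more prompt influence
Cause: Batch size or resolution too high for available VRAM
Solution: Reduce the batch size or resolution, or enable gradient checkpointing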
124K Instagram followers | $18K/month revenue | 97.8% face consistency
• 97.8% facial consistency: only 11 images rejected out of 500 generated (2.2% failure rate)
• Unlimited pose/outfit variations: generated 1,800+ images across 6 months (300/month average)
• Brand trust and credibility: landed 8 brand deals at $2K-$5K per sponsored post
• Follower engagement: 4.2% engagement rate (industry average: 1.2%)
Week 1: Generated 100 seed images with Stable Diffusion, curated to 23 best
Week 2: Trained LoRA (42 minutes), tested checkpoints, selected epoch 18 at weight 0.85
Month 1-2: Posted 3x/day (90 posts), grew from 0 to 12K followers
Month 3-4: Reached 50K followers, first brand deal ($2,500)
Month 5-6: Hit 124K followers, 8 active brand deals, $18K/month revenue
"LoRA training changed everything. My first attempts without LoRA had inconsistent faces - comments were 'why does she look different?' After training LoRA, my character became REAL to my audience. Followers send DMs asking about her skincare routine. Brands trust me because they know exactly what face will appear in sponsored content. The 42 minutes I spent training this LoRA generated $100K+ in revenue over 6 months. Best investment ever."
How long does LoRA training take?
With an RTX 4070 or better: 30-60 minutes for a standard character LoRA (20 images, 15-20 epochs, 512x512 resolution). RTX 3060: 60-90 minutes. RTX 4090: 20-30 minutes. Cloud GPUs (RunPod, Vast.ai) offer similar speeds for $0.50-1.50 per training session. Higher resolution (768) or more epochs (25-30) roughly doubles training time.
Can I train a LoRA without a powerful GPU?
Yes! Use cloud GPU services: RunPod ($0.34/hour RTX 4090), Google Colab Pro ($10/month, A100 access), or Vast.ai ($0.25-0.50/hour). Training one LoRA costs $0.50-1.50, far cheaper than buying hardware. Alternative: Use Kaggle free GPU (30 hours/week) or Paperspace Gradient free tier.
Why is my LoRA's face still inconsistent?
Three main causes: (1) Undertrained: increase epochs to 20-25 or raise the learning rate to 2e-4. (2) Poor dataset: ensure 20+ images with good variety in angles and expressions. (3) Wrong checkpoint: test all saved checkpoints (epoch 5, 10, 15, 20), because earlier epochs sometimes have better consistency than the final one.
Can I add new images to an existing LoRA?
You cannot directly "add" to an existing LoRA. However, you can create a new training dataset combining old images and new images (25-30 total) and retrain from scratch. This takes another 30-60 minutes but gives you a refreshed LoRA. Most AI influencer creators retrain every 2-3 months to refine consistency or update their character's look (new hairstyle, slight age progression, style evolution).
What is the difference between LoRA and Dreambooth?
LoRA: Creates a 100-200MB adapter file, works with checkpoints that share the same base model, trains in 30-90 min, 95-98% consistency, easier to share. Dreambooth: Fine-tunes the entire 6GB model, locked to that specific model, trains in 2-4 hours, 97-99% consistency, harder to share. For AI influencers, LoRA wins due to flexibility and speed. Only use Dreambooth if you need absolute maximum control (99%+ consistency) and don't mind the limitations.
How do I share my LoRA?
Upload to CivitAI (largest LoRA community, 500K+ users), Hugging Face (AI researcher hub), or your own hosting (Google Drive, Dropbox). LoRA files are 100-200MB. Include: example images, recommended weight (0.6-1.0), base model used (Realistic Vision, DreamShaper, etc.), and sample prompts. Warning: Once shared publicly, anyone can generate infinite images of your character. Consider licensing/watermarking.
Can I combine multiple LoRAs?
Yes! Stack LoRAs in Stable Diffusion: Character LoRA (weight 1.0) + Style LoRA (weight 0.6) + Clothing LoRA (weight 0.4). Keep total combined weight under 3.0 to avoid artifacts. This is powerful for creating unique combinations without training new LoRAs. Example: Your character's face + anime style + specific outfit = instant custom variation.
How often should I retrain my LoRA?
The initial LoRA lasts indefinitely if results are good. Most creators retrain every 2-3 months to: (1) Refine consistency with a better curated dataset, (2) Update the character's look (new hairstyle, subtle aging, style evolution), (3) Add new poses/expressions the audience wants to see. Retrain if consistency drops below 90% or followers request visual updates. Otherwise, one well-trained LoRA can generate thousands of images over years.
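The stacking guideline mentioned above (character, style, and clothing LoRAs with a combined weight under 3.0) can be checked before generating. The sketch below builds a prompt with Automatic1111-style `<lora:name:weight>` tags; the LoRA names are hypothetical and the 3.0 limit is the guideline quoted in this guide, not a hard rule of the software.

```python
def stacked_prompt(base_prompt: str, loras: dict[str, float], max_total: float = 3.0) -> str:
    """Append one tag per LoRA, refusing stacks whose combined weight is too high."""
    total = sum(loras.values())
    if total >= max_total:
        raise ValueError(f"combined LoRA weight {total} is not under {max_total}")
    tags = " ".join(f"<lora:{name}:{weight}>" for name, weight in loras.items())
    return f"{base_prompt} {tags}"


if __name__ == "__main__":
    print(stacked_prompt(
        "portrait photo",  # placeholder base prompt
        {"mycharacter": 1.0, "anime_style": 0.6, "outfit": 0.4},
    ))
```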