
caption / dataset lora training tips for illustrious v0.1 for the civitai on-site trainer


Oct 21, 2025


training guide

lots of people keep asking me about datasets since my last article so i'm going to write down here what i know

dataset = image + caption

so training uses 2 things... the image and its caption... and learns to generate that image from that caption. but u don't need to gen that image because u already have it. so what u want from training is to "separate" some "concept" that you can use to gen images that aren't in the dataset. this "concept" can be an artist style, a character, a pose or even something else

so how does this work? idk honestly. my guess is that when you train, the trainer is able to "recognize" things in the dataset that are in the caption, just like an app made to recognize objects from images would. so if the caption says "bottle" and it sees something bottle-shaped, the lora weights are changed to make "bottle" look like the bottle it recognized in the image. but what happens if the caption has something the model doesn't recognize and the image also has something the model doesn't recognize?

have you ever noticed that if u type gibberish in the prompt it usually doesn't generate anything? that's because the text prompt gets "tokenized" into tokens and those tokens go into the neural network. if the tokens are too weak to "activate" any neurons in the neural network it has no effect. so basically the goal of training is usually to put the parts of the image that don't match any of the strong tokens into some weak tokens so you can use them later. these weak tokens are usually the "trigger word" that goes in every caption in ur dataset.

so basically if the whole dataset has an artist style and the same tokens appear with every image, the training will start dumping the style into those tokens. as the training goes on the trigger word gets "stronger" and it starts a cycle of the training being able to recognize the style in the image because the neurons are activated. probably.

but the trigger word isn't the only thing that gets trained! if you have a lora without a trigger word it's just going to adjust the words that do appear in the captions. so for example if you have 5 images with 1boy and 5 images with 1girl, it's just going to train what 1boy and 1girl look like. also if u use the lora to gen but u don't use the trigger word, that's also going to affect generation because every word in the caption was trained when the lora was trained

so what does this mean in practice?

let's say u have an artist who always draws flat color, full body, white background pics. those are all tags in illustrious so u can gen something similar without trying. u want to train a lora from this artist. let's say the trigger word is just trigger_word. u have 3 options

1st option is to use only the trigger_word. since the artist always draws these things that's their style, no? this training takes longer and is also more inflexible, so personally i wouldn't recommend it. remember that on danbooru if there is a post of an artist and it's drawn in flat color the post WILL get tagged with flat color too because nobody cares if the artist always draws that way for tagging purposes. same for characters. if a character has blue hair on danbooru, like xingqiu, it just gets tagged blue hair (almost) every time. but the booru isn't perfect. people often tag his bell sleeves with wide sleeves instead even tho they're completely different things. and even when something is generally tagged, illustrious can still gen a blue-haired xingqiu even if u don't include blue hair in the prompt, which means that if two tags appear together too often in the dataset, one thing gets blended into the other

2nd option is to include flat color, full body, white background in all pics. this will make training go faster and better but on the other hand you'll have to include all those tags when generating to generate the same thing. if u do this, what gets trained in the trigger_word? the anatomy probably plus some things that you haven't tagged but are common. for example some artists draw soft vignettes on the corners of the images, others love to use light particles or other effects that are too subtle for most people to recognize. that's all going to get trained into the trigger and you'll only realize it when you try to gen. if u do this u'll be able to generate non-flat color portraits using the same lora which is pretty awesome if u ask me. that means you can theoretically train a lora on manga pages that are black and white and you'll still be able to generate colored images if you tag the entire dataset monochrome, greyscale.

3rd option is to just not include any trigger word at all. this usually works since you'll usually have at least 1girl or 1boy in all pics, but even though i've done this before, now that i think about it this is probably a bad idea and you should just always use a trigger word

which lora type is harder to train?

style loras are probably the easiest to train since it's very hard to get a lora that doesn't work. basically if it makes the gens look a bit closer to the artist style, that's a success, so u literally can't fail. the problem is u need more images to be able to generate more things in the same style. people normally recommend 50 images or more for a style lora.

character loras are also relatively easy, especially if it's a "normal" character. characters with very complicated clothing or very weird styles are harder. if it's a very simple character with a symmetric face/hairstyle you can do it with just 10 images. just make sure you have both portrait and full body pics so the lora doesn't start leaning into generating portraits all the time.

concept loras are the hardest. a thing lying around is easy, but if it's something characters wear or a pose you'll often end up affecting the base style of the lora with it. especially if u use real photos it tends to make all gens realistic, though there are tricks to help with that.

tip 1: u should be able to gen the image without the lora

this is basically the number 1 most important thing about captioning images. if u can't gen the same image without the lora (or something very close to it) then the caption isn't good enough. the autotagger generally sucks at this btw, but u can still get a good result in most cases anyway and people are always just dumping manga pages into the tagger go brrrrrr so this is something u only have to worry about if u REALLY care about it. also use an app to help u with tagging because doing it by editing txt files manually is too much work.
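speaking of apps, even a tiny script beats editing txt captions by hand. this is just a sketch of the usual "one .txt per image, comma-separated tags" layout (the folder and trigger word here are made up, adjust to ur own setup):

```python
from pathlib import Path

def add_trigger(dataset_dir, trigger):
    """prepend a trigger word to every .txt caption that doesn't already
    have it. returns how many files were changed."""
    changed = 0
    for txt in sorted(Path(dataset_dir).glob("*.txt")):
        # captions are comma-separated tags; strip whitespace around each tag
        tags = [t.strip() for t in txt.read_text(encoding="utf-8").split(",") if t.strip()]
        if trigger not in tags:
            txt.write_text(", ".join([trigger] + tags), encoding="utf-8")
            changed += 1
    return changed

# e.g. add_trigger("my_dataset/", "trigger_word")
```

run it once over the dataset folder and every caption gets the trigger in front without touching captions that already have it.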

so let me give an example of what i mean. let's say i have this in my dataset from an old lora i made

016.png

and i want to train a character lora. but the idea is the same for style loras.

so what i want to do is to extract EVERYTHING that isn't going into the trigger_word. in this case i want the horns, pointy ears, purple hair and purple eyes to go all into the trigger_word, so i need to be able to generate literally everything else

1boy, general, solo, upper body, profile, looking to the side, otoko no ko, black flat cap, white collared shirt, black suit jacket, no necktie, closed mouth, serious, grey background, simple background

00039-1.png

as u can see, this is literally the same picture except it isn't in anime style and it isn't the character. so what happens if i try to generate it with the lora with the exact same seed? btw the trigger word i used was bcranga.

bcranga, 1boy, general, solo, upper body, profile, looking to the side, otoko no ko, black flat cap, white collared shirt, black suit jacket, no necktie, closed mouth, serious, grey background, simple background <lora:Ranga__Langa_The_Dungeon_of_Black_Company__Meikyuu_Black_Company-000010:1>

00041-1.png

we genned literally the same picture but with ranga in it and in anime style

we didn't tag purple hair, short hair, sidelocks, pointy ears, mole under eye, or horns, but because it was all dumped into trigger_word during training we can get it back by using the same trigger word

tip 2: tag the media

btw i didn't know this at the time but if i had tagged it anime screencap it would have turned out even more flexible because the anime style would have been separated. here are some tags u can use depending on the source of the image

anime screencap = screenshots from anime. don't tag it with "screenshot" because if u try to gen u'll notice illustrious only gens anime screenshot style with the tag anime screencap not with anime screenshot

3d = for 3d images (not for real ppl!)

greyscale, monochrome = manga pages

flat color = if there is no shading

traditional media = anything that isn't drawn digitally like colored pencil (medium)

pixel art = u probably know what this is if u have a dataset with it

lineart = only lineart without shading

sketch = sketches

realistic = for realistic artwork

photorealistic = i use this tag with photos because in my tests if ur training an anime style lora from real photos of people this tag works better than realistic

tip 3: use composition/pose tags

if u don't use these tags, the lora will tend to generate only close-up pics or full body pics depending on ur dataset. and worse yet, if u don't use pose tags, especially for hand poses, the characters will end up with their hands up randomly and u won't know why

full body

foot out of frame

feet out of frame

head out of frame

don't tag cowboy shot because illustrious thinks that's an actual cowboy

upper body

portrait

foot focus, close-up

lower body

crotch focus, close-up

breast focus, close-up

hand focus, close-up

mouth focus, close-up

eye focus, close-up

ear focus, close-up

the most important pose tags to watch out for...

hand up

hands up

arm at side

arms at side

arm up

arms up

standing on one leg

knee up

knees up

sitting

standing

lying, on back

lying, on side

lying, on stomach

knees apart feet together

hand to mouth

finger to mouth

hand on own chest

breast suppress

arm under breasts

there are many others, just check danbooru for that. but keep in mind that illustrious was trained on booru data from 2 years ago and a lot of tags have been nuked or renamed since

important angles

from side

profile

from behind

from above

from below

oh and this is probably extremely important as well since otherwise u'll get floating heads

fading background = if the bg is white for example and the borders of the image are faded to white blending with the bg

cropped head = like fading background but the head is cut off by the edge of the image

cropped torso = same but a torso that fades at the bottom around the elbow range

cropped legs = same but for legs

00043-1.png

tip 4: avoid complex pics

if something has 4 characters in it u can probably drop it from your dataset because illustrious literally can't generate 4 characters correctly in a single picture anyway, so it can't really learn from it, especially if you start having to tag clothing and it's like 3 different colors of tops plus bottoms.

most of ur dataset should be

1girl, solo

1boy, solo, male focus

because that's what people will actually generate.

if there is 2 characters side-by-side but one is "cropped"

1girl, 1boy, solo focus, side-by-side

1girl, 1boy, solo focus, male focus, side-by-side

2girls, solo focus, side-by-side

2boys, solo focus, male focus, side-by-side

if both characters are visible side by side...

1girl, 1boy, side-by-side

2boys, side-by-side, male focus

2girls, side-by-side

3 characters is the maximum you should have in a single pic, but only if they are in a position that you can tag

1girl, 2boys, sandwiched, boy sandwich

1boy, 2girls, sandwiched, girl sandwich

3boys, sandwiched, boy sandwich

3girls, lineup

anything else is probably not going to work

i tried to train a lora that had lots of 3boys/4boys full body pics standing around and now when i use the lora even if i don't use 1boy it generates a bunch of ghosts

if ur dataset has too many multi char pics, you should try to crop it and tag it with solo focus instead or if possible remove the other characters if the background is white for example
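if u want to find the offenders quickly, u can count characters from the danbooru count tags in each caption. a rough sketch (the 3-character limit matches the tip above, and the regex only covers the standard 1girl/2boys/6+girls style tags):

```python
import re

# danbooru-style character count tags: 1girl, 2girls, 1boy, 3boys, 6+girls, 1other...
COUNT_TAG = re.compile(r"^(\d+)\+?(?:girls?|boys?|others?)$")

def character_count(caption):
    """rough character count summed from danbooru count tags in a caption."""
    total = 0
    for tag in (t.strip() for t in caption.split(",")):
        m = COUNT_TAG.match(tag)
        if m:
            total += int(m.group(1))
    return total

def too_complex(caption, limit=3):
    """flag captions with more characters than the lora can realistically learn from."""
    return character_count(caption) > limit
```

run `too_complex` over ur caption files and anything flagged is a candidate for cropping or dropping.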

tip 5: remove or tag text

text is something that stable diffusion really sucks at so if ur dataset has any text u should try ur best to get rid of it since it's a lot of details u don't care about. some important tags to tag it away

signature = any written signature

artist name = if the artist's name is written somewhere

twitter username = @artist

watermark = any watermark

translation request, translated = these tags appear on posts with non-english text so if a pic has japanese or non-english text in it u should use it

english text

korean text

sound effects = text for sound effects u see around

speech bubble

spoken heart = if a speech bubble has a heart in it

obviously u should remove the text if u can. if u remove all the text from a speech bubble u can tag it

blank speech bubble

https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/06dfea74-74e2-45bb-ba26-e88b25e05314/original=true,quality=90/00015-897910231.jpeg

tip 6: avoid multiple views/comics

try to avoid putting a whole comic page into the dataset since it's too complex. cropping out individual panels instead works better.

if u can still see the border of the panel even after cropping tag comic. idk if it helps but it probably does

some artists love to draw the same character in multiple views so u can actually use the tag multiple views for that. it's a tag people love to put in the negative but it's a tag that really works if u try to gen it, so it should work for recognizing multiple views during training

00044-1.png

tip 7: include close-up images of important details

if u want details u need to include close-ups.

this is very important for characters that have moles on one side of their face for example since illustrious will get the side wrong if u don't have enough close-ups. (also if they are asymmetrical don't use the flip augmentation during train or it's going to flip it)

for objects/devices with detailed parts u can include a few EXTREMELY close close-ups just to make sure the lora can render it right during inpainting

if the character has some clothing with some weird design/emblem u should include close-ups of the designs as well

8K0CyiHDDDDkvMmRqyiRqZXvpA9E03ls-34-crop2.png

this is an actual image from my heart o-ring choker dataset. it's cropped so close u can't even see the whole choker. but if u don't have images like this, sd tends to gen the part that connects wrong. on the other hand if u don't have enough images of the whole thing, it tends to gen proportions of the object wrong.
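cropping a bunch of detail shots like that is easy to script with Pillow. this is only a sketch, the box coordinates are made up and u'd pick them per image; it also upscales tiny crops because civitai's trainer wants at least 256px per side (see tip 11):

```python
from PIL import Image

def crop_detail(src, dst, box):
    """crop a (left, top, right, bottom) box out of an image and save it.
    tiny crops get upscaled so the trainer accepts them."""
    with Image.open(src) as im:
        detail = im.crop(box)
        if min(detail.size) < 256:
            # LANCZOS is the usual resampling choice for drawings
            scale = 256 / min(detail.size)
            detail = detail.resize(
                (round(detail.width * scale), round(detail.height * scale)),
                Image.LANCZOS,
            )
        detail.save(dst)

# e.g. crop_detail("dataset/034.png", "dataset/034-crop2.png", (400, 600, 700, 900))
```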

tip 8: x1 "upscaling" gets rid of the jpeg

if ur working with very old images of artists that don't draw anything anymore and all you have is a bunch of extremely jpeged jpegs u can use an upscaler to literally get rid of the jpeg if u use 1x for upscale.

up.png

yeah i know. it's magic.

some people say upscaling is bad because it corrupts the original data but so is jpeg, let's at least choose the corruption that doesn't look like garbage

tip 9: x1 upscale photos for anime loras

the same trick works if ur training anime loras with photos. probably because of how "round" real images look the lora tends to start rendering more and more realistic images in general, but if u use an upscaler intended for anime it "flattens out" the photo and reduces this effect on the lora

tip 10: concepts work best with close-ups and known artists/characters

so the problem with training concepts, especially concepts that are held/worn by characters, is that since the lora trains all tags in the caption at once, just having something outside of the concept u want to train in the image is going to influence the lora to render that thing

literally anything as small as imaginable can do this, but there is a couple of tricks u can use to avoid concept blending

the first trick is to crop everything but the concept if u can. this doesn't work very well for things that are held in hand because if you have the hand and shoulder in an image but not the elbow the lora can end up drawing arms without elbows.

if ur training with photos of people NEVER use faces. loras will quickly learn to draw real noses if u let them! if u have a realistic nose in a photo u can tag it with nose but it's best to just avoid noses if u can since after that it starts drawing smaller eyes and more realistic faces and nobody wants that.

the second trick is to vary ur dataset with as many known artists/characters as you can. since the artists styles are recognizable by the trainer it doesn't influence other tags and because u have so many there is no shared element that gets dumped into the trigger word.

so you would have something like

concept, artist1, 1boy, ...

concept, artist2, 1boy, ...

concept, photorealistic, 1boy, ...

so basically the idea is to balance the two

you NEED photos because artists can't draw details right, but if u have too much photo data the lora tends to render real people instead of anime characters

so u balance that with drawings of known artists that aren't as close-up as the photos and things generally work out pretty decently

tip 11: upscale or downscale to balance the dataset

on civitai images need to be at least 256 pixels to be trained in the trainer, so u can't have crops smaller than that, but there is a gotcha

if an image is 256x256 it only has 1/16th of the training power of a 1024x1024 image. i don't know if this is really true or not but based on past experiences that is the vibes i get from it

basically that means if ur entire dataset is made of small images ur learning rate won't work as well as it should

so usually in order to balance a dataset that has a lot of randomly sized images u can use the upscaler to make sure they are all the same size. but obviously low res images will make ur lora uglier, so maybe they SHOULDN'T be upscaled so they don't affect the lora as much as high res images?

but then if u use photos u'll notice that some photos only have the pixels but their actual quality is blurrier than anything you have ever seen, so maybe what u should do instead is make these smaller instead of bigger?

anyways, whatever u choose, keep in mind that the buckets can go above 1024px on a side if the bucket is narrow, so it's a good idea to set the limit for upscaling to 2048 pixels. also make sure u check the upscaled images because sometimes esrgan adds a weird black gradient to one side, and sometimes the image was so lowres to begin with that things just don't look right upscaled at all.

also for artist styles upscaling is probably a bad idea if they aren't too noisy since the upscaler will get rid of details like diagonal blush lines and just make the whole thing flatter in general.
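the arithmetic above can be sketched like this. note this is NOT any trainer's actual bucketing code, just the aspect-ratio math with the 256px floor and 2048px cap mentioned above; real bucket rules vary per trainer:

```python
def scale_to_range(w, h, min_side=256, max_side=2048):
    """scale (w, h) keeping aspect ratio so the short side is at least
    min_side and the long side is at most max_side. for extremely
    elongated images the cap wins, so very narrow crops can still end
    up with a short side below min_side."""
    scale = 1.0
    if min(w, h) < min_side:
        scale = min_side / min(w, h)   # upscale tiny images
    if max(w, h) * scale > max_side:
        scale = max_side / max(w, h)   # cap the long side
    return round(w * scale), round(h * scale)
```

and for the "training power" intuition: a 1024x1024 image has 16 times the pixels of a 256x256 one (1024*1024 / 256*256 = 16), which is where the 1/16th figure comes from.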

tip 12: training multiple concepts/artists generally doesn't work

if ur trying to train a concept, u should use only pictures of that EXACT same concept. the only variation u should have is color (blue trigger_word, green trigger_word) OR size/length (small trigger_word, long trigger_word), pretty much everything else is bad

heart o-ring choker, multicolored choker, yellow choker, blue choker, 1boy, solo, otoko no ko, venti \(genshin impact\), genshin impact, smile, looking at viewer, portrait, straight-on, white background, shirt <lora:Heart_O-Ring_Choker_V1.1:1>

00045-1.png

i didn't have a blue/yellow choker in my dataset but because i had black/red and pink/aqua i'm able to gen something like this now

if 2 concepts are very similar the trainer isn't going to be smart enough to figure out the differences between one and the other and tends to just merge them together in ways u don't want

u can train a lora on multiple artists at once, but if u do this, the effect on the "default" style without artist tags will be minimal, so u can't make a masterpiece lora that always gens masterpieces by default just by training on various pics from great artists. in this case it's probably a better idea to make a separate lora for each artist because that way u can share it on civitai and others can find it easily. nobody is going to find ur lora if u call it "epic masterpiece artist styles"

tip 13: pick a good trigger word

a good trigger word helps training and using the lora a lot

personally i don't use weird triggers like slnslj3ilsdlfk. i did use bcranga in one of my first character loras. but if it was today i'd just use the tag that already exists on danbooru, which is probably ranga (black company) or something like that. in a recent lora i used klen (clevatess) for example. if u use parentheses in the trigger name keep in mind that u need to escape them like klen \(clevatess\) to gen later. i don't know if this is true in civitai's generator but it's true on a1111/forge/reforge.

if you are training concepts it's probably a good idea to try to gen with the trigger word without the lora first just in case it gens something weird. like let's say hypothetically that illustrious didn't know what a "sunflower" is. if u used this as a trigger word, it would probably generate the actual sun and a flower somewhere. so triggers that combine existing words need some extra care.

if ur training a concept/pose/artist style that already exists as a tag in the danbooru, you should just use the exact same tag. basically in a lot of cases the tag is already half-trained but because the booru dataset isn't cleaned up for training that tag specifically it's kinda messy.

for example danbooru has hundreds of contortion pictures but the contortion tag doesn't really work in prompt since there is no common pose. but if u made a lora using only one type of contortion it would probably train pretty easily. likewise danbooru has a lot of artists tagged in it, but a lot of artists have fewer than 100 pics so the style isn't trained strong enough. if u made a lora with those same 100 pics u should just use the same artist tag so ur starting point is a few steps ahead already. so instead of using a trigger word like coolartist_illu just use "cool artist" the same way it's spelled in danbooru with spaces instead of _

if ur training something that looks like a shirt, u should call it something shirt. if ur training something that looks like a dress, something dress. basically the closer u are to the result WITHOUT the lora, the better it works

(same prompt without lora)

00046-1.png

basically what i do is just pick a trigger word that looks close enough and then i name the lora after it, because it makes everyone's lives easier if the lora file has the same name as the trigger word

generally u would think that things like "red choker" and "green choker" could work but what is a "large choker"? i don't think people would think of using those prompts if they used a lora. you wouldn't think "large choker" was part of the training. these tags that people would never use anyway are probably not very useful, so it's probably better to keep something off the dataset if nobody is going to prompt it anyway. btw you're probably going to forget about them too in 2 weeks unless you keep a list of the tags u used to train the lora somewhere.
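to avoid the "forgot my own tags in 2 weeks" problem, u can dump a tag frequency list from the caption files and keep it next to the lora. a sketch assuming the usual one-.txt-per-image layout:

```python
from collections import Counter
from pathlib import Path

def tag_frequencies(dataset_dir):
    """count how often each tag appears across all .txt captions, so u
    have a record of what the lora was actually trained on."""
    counts = Counter()
    for txt in Path(dataset_dir).glob("*.txt"):
        counts.update(
            t.strip()
            for t in txt.read_text(encoding="utf-8").split(",")
            if t.strip()
        )
    return counts

# e.g. tag_frequencies("my_dataset/").most_common(30)
```

the most_common list doubles as documentation u can paste straight into the lora's civitai description.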

tip 14: train anime on good art

so lots of people train on anime and some even on unreleased anime (this is why ppl hate us btw) and if u plan to train a lora on anime screenshots there is a couple of things u need to be careful about

anime as u know is made by taking a talented chinese illustrator and squeezing the art juice out of them. but they're often squeezed too hard so all frames look bad and ugly. avoid including in ur dataset frames with eyes/hands that look too bad because it's going to mess up ur eyes/hands

020.png

so u can see here that even though this picture is pretty tall, the quality is the worst. and this character is probably one of the worst characters in the universe to train a lora from because u can see he wears detached sleeves that have these "cuts" on them and in most frames u can't see it. he also has a clothing cutout on his chest that is covered by the bow of his fur-trimmed cape through pretty much all of the anime. this means that u only have a few frames in the entire anime that have the full features of the character and they all look like garbage. it's a tricky situation. but anyway look at his hands and eyes. they look really bad because he's basically a walking background character for half a second and it's a full body picture. pretty much all full body screencaps will look terrible like this.

what u can do in this case is use pictures from the OP/ED that generally have better quality, or use official art from the official website where they put these large high resolution full body pictures. if ur tagging ur screencaps with anime screencap, u can tag official art with official art (both are booru tags btw)

001.png

at the time i didn't do it but following the previous tip i should probably have taken the official art pic and chopped it in various parts (like just the boots) so i could train only a lower body picture to learn the boot details better. in general when people train characters they don't really train the shoes even though many characters wear very unusual shoes!

another thing that u can do is use a stitch from the anime

basically those scenes where the camera "pans" while the character stays still. u take screenshots of a few frames then use photoshop to "stitch" them together. stitches are usually higher quality even though they are anime screencaps because they were drawn at a much larger height than what fits on the screen at once.

000.png

tip 15: start small

if ur training a lora with a large dataset and u don't know if it's going to work (it often doesn't), u should start with a smaller dataset first

think about it. normally ppl will only gen 1boy and it's going to be front facing (sometimes straight-on) or from behind. normally u don't gen from side, upside down, sideways, from below, from above, dutch angle, monochrome, spot color, etc. unless ur a power user doing gens in comfy instead of in civitai's online generator. so basically if u can make a lora that gens a piece of clothing or concept right front/back, that's good enough for most people and that is much easier to make than a lora that can gen the concept at every angle. for styles it's the same thing. if an artist has lots of complex pics, the lora is probably going to suck if u use all of it, but u can get a lora that is usable for 80% of the use cases if u train only on the solo pics.

so start small. make a lora with the best data that doesn't always work first. and then u can try to improve it with the other data. because if u do it this way, u get a lora that at least works. but if u start with lots of data right from the beginning, u can waste 2000 buzz on a style that still doesn't work and ur going to be like "i should have just used only the solo pics so at least i'd have the lora published by now" (speaking from personal experience btw)

also, for concepts, if u train on very few images the lora becomes able to render that exact angle exceptionally well. so sometimes it's a good thing to have a lora version that does one angle extremely well and a second version that can do more angles but not as well as the first.

tip 16: pick a default color

if you're training a concept that is usually one color but u want the ability to change the color of the thing, u should pick one color as ur "default." this can be the most common color or just the color someone would expect to gen when they use the model without color. picking a default is good because it means u don't have to tag the color in some posts. less tags = less things to learn.

tip 17: invent "secondary" concepts

sometimes u may have multiple pictures in ur dataset that come from the same source so they share a lot of similarities. they can be screencaps from the same anime or photos taken in the same room. or sometimes u have something weird that appears in one photo that u can't generate, like a weird type of shirt.

sometimes there is a tag for it like clothes writing or letter print, but sometimes it's just not something that exists in the booru. if u don't tag it SOMETHING it's going to affect the tags that u do have in the caption so in these cases u should just come up with a name for it

for example illustrious can't gen a heart o-ring choker right by default, so that's not something the model really knows about. but if i had a picture with one, i can't tag it just "choker" or it's going to make all chokers look like that. so in this case u'd just add the tag "heart o-ring choker" to this pic even though illustrious can't render it and the lora probably won't learn to render it right with just one picture because that way it doesn't influence the "choker" tag in general

one concept that keeps happening to me is fluffy carpets/bed sheets. afaik there is no tag for these. i did have a lora with so many of them tagged "fuzzy bed sheet" or something like that that the lora actually learned to render them as a secondary concept

there are also tags u can use for bad quality pictures like lowres, blurry and jpeg artifacts. i'm not very sure if it works tho, but if u want u can use them

tip 18: regularize concepts

in some cases if u train a lora for one thing it changes how a completely different thing looks. for example if u train a lora for a thigh strap it can change how chokers look because they look kind of similar

there is a way to combat this, but u can only do it if u manually tag your trigger word as the first tag in ur txt files instead of using civitai's "trigger word" box.

basically u include an image of a choker in the dataset that doesn't have ur trigger word in it and instead has the "choker" word. now the lora is trained to gen ur thigh strap right while still rendering chokers right

this generally isn't very necessary if u gen locally since u can just inpaint the bugged concepts without the lora later but it's a technique to keep in mind

tip 19: don't use the same tags for the whole dataset

lots of loras have a problem that they use the exact same tags through the whole dataset so it's like "trigger_word, 1girl, solo, white background, black hair, black eyes, looking at viewer" and it's just 10 posts with these tags.

basically the problem is that even tho illustrious uses commas to separate booru tags, under the hood it's just sdxl, so commas are just normal tokens inside sdxl. illustrious can't really tell that trigger_word stops at the comma and can merge it with the tag that comes after it, 1girl. shuffling the tags during training helps because the order won't be the same every time, but u still have the exact same tags every time, and that makes it harder to learn which parts of the images belong to trigger_word and not to the other tags

it's a good idea to add at least one image that doesn't have the same tags so the trainer learns to gen the same concept in different contexts
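for reference, this is roughly what the "shuffle tags, keep first N" option does during training. just a sketch, not any trainer's actual code:

```python
import random

def shuffle_caption(caption, keep_first=1, seed=None):
    """shuffle the comma-separated tags of a caption, keeping the first
    keep_first tags (usually the trigger word) pinned in place."""
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    head, tail = tags[:keep_first], tags[keep_first:]
    random.Random(seed).shuffle(tail)
    return ", ".join(head + tail)
```

each training step sees a different tag order, so no tag is always glued to the same neighbor, but if every caption has the same tag *set* the shuffle can only do so much.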

tip 20: it doesn't actually have to be tags

ur trigger word doesn't actually HAVE to be tags separated by commas. ur allowed to do 2 cool things

u can describe a concept with more tags than danbooru would let u. like u can have "red frilled plaid microskirt" as a single "tag" instead of "red skirt, frilled skirt, plaid skirt, microskirt" which are the actual booru tags for something like that. this means less tokens in the whole caption which is good because it's less tags to train.

u can also have a caption like "xingqiu (genshin impact) on left and venti (genshin impact) on right, indoors, east asian architecture". in this caption the whole thing before the first comma is the "trigger word" in a sense. basically it means if u enable shuffle it's always going to keep this text at the start even though it's not a tag it's a whole phrase. if u have a pose lora with 2+ characters u may be able to train which character goes where doing something complicated like this. usually it's not worth it, though, since you can just inpaint it instead but it's a thing u can do and it can work

tip 21: avoid "numbers" in artist names

so the caption needs to be tokenized to be rendered and usually for normal words like "flower" it turns into a single token because the whole "flower" word appears like that in text a lot. but the tokenizer also tokenizes numbers like 123456789, the "_" thing and parentheses () which always occur standalone so each of these gets a different token. this means something like artist123_(pixiv123456789) is going to have A LOT of tokens and these tokens are shared with everything that has a number like 1boy, 2boys, 3boys, 4boys, 5boys, 6boys, 6+boys.

if ur training an artist that has a name with too many numbers it's probably better to use a different name instead like "artist style". there's actually lots of *_(style) tags in the booru so "style" should work ok as a neutral token (1990s (style), 2000s (style), etc.)
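u can get a feel for this with a rough stand-in splitter. to be clear, this is NOT the real CLIP tokenizer (that one is BPE-based and merges some digit pairs, so real counts differ), it just illustrates the trend that numbers and punctuation break off into their own pieces:

```python
import re

def rough_token_pieces(text):
    """very rough stand-in for CLIP-style tokenization: letter runs stay
    whole, but every digit, underscore, and parenthesis splits off on
    its own. only meant to illustrate why numeric artist names explode
    into lots of tokens."""
    return re.findall(r"[a-zA-Z]+|\d|[_()+]", text)
```

a plain word like "flower" stays one piece, while something like artist123_(pixiv123456789) shatters into well over a dozen, many of them shared with 1boy/2boys/etc.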

tip 99: do the science thing

my last tip for anyone who wants to make loras is to write down what their settings/dataset looked like and what worked/didn't so they can make progress
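the lowest-effort way i know to do the science thing is an append-only log file. a sketch (the field names here are just examples, log whatever u actually change between runs):

```python
import json
import time

def log_run(path, **settings):
    """append one training run's settings/results to a JSON-lines log.
    e.g. log_run("runs.jsonl", lora="ranga_v2", images=34, repeats=10,
                 notes="hands still bad, try more close-ups")"""
    entry = {"date": time.strftime("%Y-%m-%d %H:%M"), **settings}
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
```

one line per run means u can grep it later and actually remember which dataset change fixed what.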
