A practical guide based on personal experience
Last updated: [2025/09/06]
Workflow description
Sample workflow files are included (default directory: \ComfyUI\user\default\workflows).
Introduction
I ran into some problems when reinstalling WebUI, so I simply switched to ComfyUI. It took a little time to install and learn, but overall it was fairly easy to get started. There are plenty of ComfyUI installation tutorials, so I won't go into detail here; instead, I will mainly share strategies and some personal experience.
To keep the series consistent, I have reused the title of the previous article, but this one focuses more on ComfyUI usage strategy.
If you need more information about NoobAI, please consider checking out that article.
Why ComfyUI?
ComfyUI is a node-graph-based Stable Diffusion front-end, more like a "modular programming environment."
Each operation (loading a model, sampling, upscaling, LoRA, ControlNet, etc.) is broken out into a node, and you can combine nodes freely like Lego bricks. This provides a high degree of flexibility and controllability, making it ideal for researchers, tinkerers, and anyone who needs complex workflows (multi-model/multi-branch/multi-stage builds). For me, this was as fun as playing with building blocks: as I experimented with different workflows and tweaked each module, I came to understand the function of each part intuitively.
Compared with WebUI and platforms such as Forge, ComfyUI offers greater flexibility and efficiency, and can produce richer results with fewer resources.
1. CFG Scale, Sampling Steps, and Sampler Strategy
1.1 CFG Scale (Classifier-Free Guidance)
The CFG scale controls how strictly the model follows the prompt. Generally, a larger CFG value gives a more stable composition, but it often produces extra fingers, deformed muscles, and other anatomical errors (lower image quality). For HomoSimile XL, a CFG of 4.5 to 6 is optimal.
Lower CFG (~4) allows more freedom and painterly aesthetics.
Higher CFG (~6–7) sticks more closely to your prompt but may reduce artistic flexibility.
Avoid using CFG above 8 as it tends to overfit and may distort faces or anatomy.
I usually use 4 or 4.5 (smaller values) for txt2img, and 5.5 or 6 (larger values) for img2img or inpainting.
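Under the hood, the CFG scale is a linear extrapolation between the model's unconditional and prompt-conditioned noise predictions. A minimal sketch in plain Python, with flat lists standing in for real tensors:

```python
def cfg_guide(uncond, cond, cfg_scale):
    """Classifier-free guidance: push the unconditional prediction
    toward the prompt-conditioned one by cfg_scale."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]

# cfg_scale = 1.0 returns the conditioned prediction unchanged;
# larger values exaggerate the prompt's influence (and its errors)
guided = cfg_guide([0.2, 0.0], [0.4, 0.1], 4.5)
```

This is why very high CFG distorts anatomy: the extrapolation overshoots whatever the prompt pushed toward.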
1.2 Sampling Steps
25 to 35 steps is ideal for HomoSimile XL.
At 20 steps or lower, you may lose detail or structure.
30 steps offer a good balance between detail and performance.
Use 35–40 steps only when applying heavy LoRA influence or needing precise control.
Lower steps with high CFG often produce worse results; increase steps slightly if using strong LoRA.
Similarly, I usually use 28 (smaller values) for txt2img, and 32 or 35 (larger values) for img2img or inpainting.
1.3 Sampler Choice: Detailed Comparison
Recommended Samplers:
DPM++ 2M SDE Karras
Best balance between structure and freedom.
Performs well in portraits and complex characters.
Good with LoRAs and expressive prompts.
Euler a
Older; tends to introduce artifacts or overfit.
Fast rendering with less fine detail; acceptable for quick drafts.
Samplers to Avoid:
UniPC
Unstable across prompts. May occasionally work but is unreliable.
LMS / Heun
Experimental samplers with limited benefit for this model.
These have been explained in the checkpoint introduction, but it's worth adding:
DPM++ 2M SDE Karras offers smoother noise control, and the Karras scheduler converges more naturally at high step counts. It is well suited to complex, detailed scenes and realistic styles. 20–30 steps yield very stable results; for extremely high detail you can go to 40–50 steps, with diminishing returns beyond that.
Recommended CFG: the 6.5–8.5 range is the most common; 6–7 looks more natural for realistic rendering, while 8–9 is recommended for anime or when strong prompt adherence is required.
Euler a uses "ancestral" sampling, which produces strong randomness and sharp detail but can also introduce artifacts. It is well suited to sketchy or artistic styles, or to scenes where you want to explore varying degrees of randomness. 15–25 steps are generally sufficient; going beyond 30 yields little improvement and may even cause oversharpening. Recommended CFG: the 7–11 range is the most common; low CFG (7–8) looks more natural and soft, while high CFG (10–11) is more "obedient" but may sacrifice naturalness.
1.4 General Strategy
Use DPM++ 2M SDE Karras as your default.
Start with CFG = 5 and Steps = 28.
Raise steps slightly if faces or hands show issues.
Try DPM++ 3M SDE Karras when generating high-detail or large-format compositions.
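The defaults above can be kept as a small settings table. The sampler/scheduler strings below follow ComfyUI's internal identifiers (e.g. dpmpp_2m_sde, karras); check them against your own KSampler node, as the exact names are an assumption here:

```python
# starting points from this guide; tweak per the notes above
TXT2IMG = {"sampler": "dpmpp_2m_sde", "scheduler": "karras", "cfg": 5.0, "steps": 28}
IMG2IMG = {"sampler": "dpmpp_2m_sde", "scheduler": "karras", "cfg": 6.0, "steps": 32}
```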
2. Three Image Upscaling Methods in ComfyUI
Detail enhancement is relatively simple in WebUI: tick the Hires. fix checkbox and choose an appropriate upscaling method. In ComfyUI, however, the process is completely different.
2.1 Real-Space Upscaling
Definition: Directly scales the input or output image in image resolution (pixel space).
Operational Target: RGB images (e.g., 512×512 → 1024×1024).
Applies upscaling to the final image in pixel space (after VAE decoding), using bicubic or Lanczos interpolation or AI super-resolution (e.g., ESRGAN, SwinIR). Use models such as R-ESRGAN 4x+, 4x-UltraSharp, or BSRGAN. This delivers the highest detail quality and is ideal for final artwork, print, or sharing HD results, but it requires more VRAM and can be slow for large images.
Disadvantages: May amplify noise or lose detail; does not participate in the diffusion process and lacks the "redrawing" during generation.
Workflow: Generate low-resolution image → Real-space super-resolution → img2img input → Re-render at high resolution.
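To illustrate what "pixel space" means here, a toy nearest-neighbor upscaler on a nested-list image; real workflows would use Lanczos interpolation or an ESRGAN-family model instead:

```python
def upscale_pixels(img, factor):
    # duplicate each pixel factor x factor times: 2x2 -> 4x4, etc.
    # operates directly on RGB values, after the VAE has decoded them
    out = []
    for row in img:
        wide = [px for px in row for _ in range(factor)]
        out.extend([wide] * factor)
    return out
```

Because this runs after decoding, it can sharpen what is already there but cannot invent new detail; that is what the re-generation methods below add.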
2.2 Latent Space Upscaling
Definition: Scales the feature map in the latent space of the diffusion model (usually 1/8 the size).
Operational Target: for example, a 64×64 latent → 128×128 latent (corresponding to 512×512 → 1024×1024). Performs upscaling inside the latent space before (or between) sampling passes: resize the latent after VAE encoding, then decode or resample.
Resize the latent so the decoded image lands at, e.g., 1024×1536 or 1536×1536. Efficient in memory and computation, and keeps stylistic consistency with a lower risk of distortion. Fast (the latent resolution is low) and preserves the diffusion characteristics during regeneration.
Disadvantages: Scaling may disrupt the statistical properties of the latent, resulting in artifacts and blurring; generation consistency is not as good as native SD support. May lose sharpness around fine edges.
Workflow: Generate latent → Resize latent → Further sampling (e.g., a KSampler pass) → VAE decode → Output.
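The "1/8 size" relation is worth keeping in mind whenever you resize latents. A small helper, assuming the standard SD/SDXL VAE downscale factor of 8:

```python
def latent_dims(width, height, vae_factor=8):
    # a 512x512 image encodes to a 64x64 latent grid;
    # doubling that latent to 128x128 decodes to 1024x1024
    return width // vae_factor, height // vae_factor
```

This is also why latent upscaling is cheap: doubling a 64×64 grid touches far fewer values than doubling a 512×512 RGB image.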
2.3 SD Upscaling (Re-generation-based)
Definition: Scaling via the internal mechanism of Stable Diffusion, leveraging its re-diffusion capabilities to "redraw" the original image at a higher resolution.
Operation target: The diffusion process itself (not just scaling).
Uses Stable Diffusion itself to resample and enhance the image: feed the original image into an SD Upscale node or UltimateSDUpscale. Good for facial fixes, minor sharpness boosts, or stylistic refinement, and usually works best at 1.5× or 2× scaling; it is also an excellent option for low-VRAM setups. Generate at low resolution first → upscale → diffuse again (e.g., Hires. fix). It not only upscales but also adds new detail at the higher resolution, resulting in a more natural image.
Disadvantages: High computational effort, increased generation time; the effect of prompts may be amplified.
Workflow: Generate at low resolution → Upscale → Further diffusion on the high-resolution latent.
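Re-generation upscalers such as UltimateSDUpscale work on overlapping tiles so that each diffusion pass stays near the model's native resolution. An illustrative tile-count calculation (not the node's exact formula, and the default sizes here are assumptions):

```python
import math

def tile_grid(width, height, tile=1024, overlap=64):
    # tiles advance by (tile - overlap) so their seams can be blended
    stride = tile - overlap
    nx = max(1, math.ceil((width - overlap) / stride))
    ny = max(1, math.ceil((height - overlap) / stride))
    return nx, ny
```

More tiles mean more diffusion passes, which is where the extra generation time in this method comes from.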
P.S. Please refer to the attached workflow files I shared.
P.P.S. The method above upscales the entire image. If you're after face/hand correction, I've also shared an SD-based method using SAMLoader and BBOX detection (the workflow is included in the package). In the closing thoughts of the previous article, I shared another approach to face correction (manual cropping → enlarging the crop in latent space to the model's recommended size → scaling back to the original size). Compared to inpainting, this allows finer detail adjustments, but it requires some artistic skill and familiarity with drawing software.
3. LoRA CFG Strategy and Integration Tips
3.1 Loading and Using LoRAs in ComfyUI
Use a LoRA loader node (e.g., Load LoRA) in ComfyUI to attach LoRA files to the checkpoint model.
Always include trigger words from the LoRA in prompts to activate it.
Adjust the weight (influence strength) between 0.5 and 0.8.
For dual-LoRA setups (e.g., style + detailer), keep the combined weight below 1.2.
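What the weight slider does is, conceptually, a low-rank update to the checkpoint's weights: W = W0 + weight · (B·A). A pure-Python sketch on tiny matrices (real LoRAs also carry an alpha/rank scaling factor, omitted here for simplicity):

```python
def apply_lora(w0, down, up, weight):
    # w0: base weight matrix (out x in)
    # down (rank x in) and up (out x rank): the LoRA's low-rank factors
    # weight: the strength slider from the loader node
    rank = len(down)
    return [[w0[i][j] + weight * sum(up[i][r] * down[r][j] for r in range(rank))
             for j in range(len(w0[0]))]
            for i in range(len(w0))]
```

Seen this way, stacking two LoRAs simply sums two updates onto the same base weights, which is why keeping the combined strength moderate matters.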
3.2 CFG and LoRA Interplay
Higher CFG will strengthen LoRA effect, forcing it to follow style more strictly.
Lower CFG may weaken LoRA visibility, allowing more base-model characteristics.
When using a strong LoRA like NoobAI Detailer, consider using CFG 5.5–6.5.
If using a soft artistic LoRA, drop CFG to 4.5–5 to allow creative blending.
3.3 Prompt Placement
Place the LoRA’s trigger keywords early in the prompt.
Combine with related descriptors (e.g., lighting, pose, background) for better results.
For best effect, experiment with prompt formatting: [LoRA trigger], subject, composition, style.
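The suggested ordering can be expressed as a tiny prompt builder (the field names are purely illustrative):

```python
def build_prompt(trigger, subject, composition, style):
    # trigger words go first so the LoRA activates reliably;
    # empty fields are skipped
    return ", ".join(p for p in (trigger, subject, composition, style) if p)
```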
Closing Thoughts
If NoobAI is a canvas, then ComfyUI is a colorful brush.
I have to say that I am not a professional. This article comes from my love for painting, and what I share is just personal experience. Many tutorials and articles by other creators are better than mine. I really hope more people will share their own views so we can improve together through discussion.
Love is everything.
P.S. This article will continue to be updated. Feel free to PM me or comment here.
P.P.S. I would like to introduce an original couple from my work: YS and Ivan (based on reality). I will post more stories about them in the future.

