FLUX.2 [klein]

timudk
2026-01-15 15:12:38 +01:00
parent ab7cca6801
commit b56ac61450
12 changed files with 530 additions and 119 deletions

README.md

@@ -1,10 +1,89 @@
# FLUX.2
**Frontier Visual Intelligence** — State-of-the-art image generation and editing from [Black Forest Labs](https://bfl.ai).
---
<p align="center">
<a href="https://docs.bfl.ai">API Docs</a> •
<a href="https://huggingface.co/black-forest-labs">Hugging Face</a> •
<a href="https://bfl.ai/blog">Blog</a>
</p>
This repo contains minimal inference code to run image generation & editing with our FLUX.2 open-weight models.
## News
- **[15.01.2026]** Today, we release the FLUX.2 [klein] family of models, our fastest models yet. Sub-second generation on consumer GPUs. Read more about it in our [blog post](https://bfl.ai/blog/flux2-klein-towards-interactive-visual-intelligence).
- **[25.11.2025]** We are releasing FLUX.2 [dev], a 32B parameter model for text-to-image generation and image editing with one or multiple reference images.
## Model Overview
| Name | Step-distilled | Guidance-distilled | Text-to-Image | Image Editing (Single reference) | Image Editing (Multi-reference) | License |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: |
| [FLUX.2 [klein] 4B](https://huggingface.co/black-forest-labs/FLUX.2-klein-4B) | ✅ | ✅ | ✅ | ✅ | ✅ | [apache-2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md) |
| [FLUX.2 [klein] 9B](https://huggingface.co/black-forest-labs/FLUX.2-klein-9B) | ✅ | ✅ | ✅ | ✅ | ✅ | [FLUX Non-Commercial License](model_licenses/LICENSE-FLUX-NON-COMMERICAL) |
| [FLUX.2 [klein] 4B Base](https://huggingface.co/black-forest-labs/FLUX.2-klein-base-4B) | ❌ | ❌ | ✅ | ✅ | ✅ | [apache-2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md) |
| [FLUX.2 [klein] 9B Base](https://huggingface.co/black-forest-labs/FLUX.2-klein-base-9B) | ❌ | ❌ | ✅ | ✅ | ✅ | [FLUX Non-Commercial License](model_licenses/LICENSE-FLUX-NON-COMMERICAL) |
| [FLUX.2 [dev]](https://huggingface.co/black-forest-labs/FLUX.2-dev) | ❌ | ✅ | ✅ | ✅ | ✅ | [FLUX.2-dev Non-Commercial License](model_licenses/LICENSE-FLUX-DEV) |
**All models support**: Text-to-Image ✅ | Single-ref Editing ✅ | Multi-ref Editing ✅
## Which Model Should I Use?
| Need | Recommended |
|------|-------------|
| Real-time apps, interactive workflows | [klein] 4B or 9B (distilled) |
| Consumer GPU (e.g. RTX 3090/4070) | [klein] 4B |
| Fine-tuning, LoRA training | [klein] Base or FLUX.2 [dev] |
| Maximum quality, no latency constraints | FLUX.2 [dev] |
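If it helps to make the table concrete, the mapping can be written as a tiny lookup. The shorthand keys are invented for this illustration; only the recommended models come from the table above.

```python
# Shorthand keys are invented for this sketch; the recommended models
# are taken verbatim from the table above.
RECOMMENDATIONS = {
    "real-time": "FLUX.2 [klein] 4B or 9B (distilled)",
    "consumer-gpu": "FLUX.2 [klein] 4B",
    "fine-tuning": "FLUX.2 [klein] Base or FLUX.2 [dev]",
    "max-quality": "FLUX.2 [dev]",
}

def recommend(need: str) -> str:
    # Fall back to the full-size model when a need is not listed.
    return RECOMMENDATIONS.get(need, "FLUX.2 [dev]")
```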
## `FLUX.2 [klein]`
FLUX.2 [klein] is our fastest model family — generating and editing (multiple) images in under a second without sacrificing quality. Built for real-time applications, creative iteration, and deployment on consumer hardware.
### Key Capabilities
- **Sub-second inference** — Generate or edit images in under a second on modern hardware
- **Unified generation & editing** — Text-to-image, image editing, and multi-reference in one model
- **Runs on consumer GPUs** — Klein 4B fits in ~8GB VRAM (RTX 3090/4070 and up)
- **Apache 2.0 on 4B** — Open license that allows commercial use, fine-tuning, and customization
### Performance
Klein models define the Pareto frontier for quality vs. latency and VRAM across text-to-image, single-reference editing, and multi-reference generation:
<p align="center">
<img src="assets/klein_benchmark.jpg" alt="FLUX.2 [klein] vs Baselines — Elo vs Latency and VRAM" width="800"/>
</p>
<sub>Higher Elo + Lower Latency/VRAM = Better.</sub>
### The Klein Family
| Model | Best For |
|:---|:---|
| **[klein] 4B** | Maximum speed, consumer hardware, edge deployment |
| **[klein] 9B** | Best quality-to-latency ratio, production apps |
| **[klein] 4B Base** | Fine-tuning on limited hardware, full customization |
| **[klein] 9B Base** | Research, LoRA training, maximum output diversity |
**Distilled vs Base:**
- Use **Distilled** (4-step) for production apps and real-time generation
- Use **Base** (50-step) for fine-tuning, LoRA training, and maximum flexibility
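The distilled-vs-base split above can be sketched as a small defaults table. The model keys and the dictionary itself are illustrative assumptions, not the repo's actual configuration; only the 4-step (distilled) vs 50-step (base) counts come from this README.

```python
# Illustrative defaults only: keys and structure are assumptions for this
# sketch; the 4-step vs 50-step counts come from the README text above.
MODEL_DEFAULTS = {
    "klein-4b": {"num_steps": 4, "distilled": True},
    "klein-9b": {"num_steps": 4, "distilled": True},
    "klein-base-4b": {"num_steps": 50, "distilled": False},
    "klein-base-9b": {"num_steps": 50, "distilled": False},
}

def default_steps(model: str) -> int:
    # Distilled checkpoints sample in 4 steps; base checkpoints use ~50.
    return MODEL_DEFAULTS[model]["num_steps"]
```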
**Licensing:** 4B models are [Apache 2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md). 9B models use the [FLUX Non-Commercial License](model_licenses/LICENSE-FLUX-NON-COMMERICAL).
### Text-to-image examples
Example focused on realism
![t2i-klein-grid](assets/t2i_klein_realism.jpg)
Example focused on output diversity
![t2i-klein-others](assets/t2i_klein_others.jpg)
### Editing examples
![i2i-klein](assets/i2i_klein.jpg)
## `FLUX.2 [dev]`
`FLUX.2 [dev]` is a 32B parameter flow matching transformer model capable of generating and editing (multiple) images. The model is released under the [FLUX.2-dev Non-Commercial License](model_licenses/LICENSE-FLUX-DEV) and can be found [here](https://huggingface.co/black-forest-labs/FLUX.2-dev).
@@ -31,11 +110,7 @@ The FLUX.2 autoencoder has considerably improved over the [FLUX.1 autoencoder](h
## Local installation
The inference code was tested on GB200 and H100 (with CPU offloading).
### GB200
The inference code was tested on GB200 using CUDA 12.9 and Python 3.12.
```bash
python3.12 -m venv .venv
@@ -43,16 +118,6 @@ source .venv/bin/activate
pip install -e . --extra-index-url https://download.pytorch.org/whl/cu129 --no-cache-dir
```
### H100
On H100, we tested `FLUX.2 [dev]` using CUDA 12.6 and Python 3.10.
```bash
python3.10 -m venv .venv
source .venv/bin/activate
pip install -e . --extra-index-url https://download.pytorch.org/whl/cu126 --no-cache-dir
```
## Run the CLI
Before running the CLI, you may download the weights from [here](https://huggingface.co/black-forest-labs/FLUX.2-dev) and set the following environment variables.
@@ -60,21 +125,20 @@ Before running the CLI, you may download the weights from [here](https://hugging
```bash
export FLUX2_MODEL_PATH="<flux2_path>"
export AE_MODEL_PATH="<ae_path>"
export KLEIN_4B_MODEL_PATH="<klein_4b_path>"
export KLEIN_4B_BASE_MODEL_PATH="<klein_4b_base_path>"
export KLEIN_9B_MODEL_PATH="<klein_9b_path>"
export KLEIN_9B_BASE_MODEL_PATH="<klein_9b_base_path>"
```
If you don't set the environment variables, the weights will be downloaded automatically.
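The fallback behaviour can be sketched roughly as follows. `resolve_weights_path` is a hypothetical helper (the real loaders live in `flux2.util`), and the download step is stubbed out with a placeholder string.

```python
import os

def resolve_weights_path(env_var: str) -> str:
    # Hypothetical helper: an explicitly set environment variable wins;
    # otherwise the weights would be fetched automatically (stubbed here).
    explicit = os.environ.get(env_var)
    if explicit:
        return explicit
    return f"<auto-download for {env_var}>"  # stand-in for the Hugging Face download

os.environ["FLUX2_MODEL_PATH"] = "/weights/flux2"
print(resolve_weights_path("FLUX2_MODEL_PATH"))  # → /weights/flux2
```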
You can start an interactive session to do both text to image generation as well as editing (one or multiple) images with the following command:
```bash
PYTHONPATH=src python scripts/cli.py
```
On H100, we additionally set the flag `--cpu_offloading True`.
## Watermarking
We've added an option to embed invisible watermarks directly into the generated images
@@ -82,52 +146,6 @@ via the [invisible watermark library](https://github.com/ShieldMnt/invisible-wat
Additionally, we are recommending implementing a solution to mark the metadata of your outputs, such as [C2PA](https://c2pa.org/)
## 🧨 Lower VRAM diffusers example
The example below should run on an RTX 4090. For more examples, check the [diffusers quantization guide](docs/flux2_dev_hf.md).
```python
import io

import requests
import torch
from diffusers import Flux2Pipeline
from diffusers.utils import load_image
from huggingface_hub import get_token

repo_id = "diffusers/FLUX.2-dev-bnb-4bit"
device = "cuda:0"
torch_dtype = torch.bfloat16


def remote_text_encoder(prompts):
    # Query the remotely hosted text encoder instead of loading it locally.
    response = requests.post(
        "https://remote-text-encoder-flux-2.huggingface.co/predict",
        json={"prompt": prompts},
        headers={
            "Authorization": f"Bearer {get_token()}",
            "Content-Type": "application/json",
        },
    )
    prompt_embeds = torch.load(io.BytesIO(response.content))
    return prompt_embeds.to(device)


pipe = Flux2Pipeline.from_pretrained(
    repo_id, text_encoder=None, torch_dtype=torch_dtype
).to(device)

prompt = "Realistic macro photograph of a hermit crab using a soda can as its shell, partially emerging from the can, captured with sharp detail and natural colors, on a sunlit beach with soft shadows and a shallow depth of field, with blurred ocean waves in the background. The can has the text `BFL Diffusers` on it and it has a color gradient that starts with #FF5733 at the top and transitions to #33FF57 at the bottom."

image = pipe(
    prompt_embeds=remote_text_encoder(prompt),
    # image=load_image("https://huggingface.co/spaces/zerogpu-aoti/FLUX.1-Kontext-Dev-fp8-dynamic/resolve/main/cat.png"),  # optional image input
    generator=torch.Generator(device=device).manual_seed(42),
    num_inference_steps=50,  # 28 steps can be a good trade-off
    guidance_scale=4,
).images[0]
image.save("flux2_output.png")
```
## Citation
If you find the provided code or models useful for your research, consider citing them as:

BIN assets/i2i_klein.jpg (new file, 2.6 MiB)
BIN assets/klein_benchmark.jpg (new file, 408 KiB)
BIN assets/t2i_klein_others.jpg (new file, 3.2 MiB)
BIN (filename not shown, 2.6 MiB)


@@ -0,0 +1,53 @@
FLUX Non-Commercial License v2.1
Black Forest Labs Inc. (“we” or “our” or “Company”) is pleased to make the weights, parameters, and inference code for the FLUX Model (as defined below) freely available for your non-commercial and non-production use as set forth in this FLUX Non-Commercial License (“License”). “FLUX Model” includes, individually and collectively, the models denoted as FLUX.x [dev], where “.x” denotes the FLUX Model version number, and models made available under the License, as indicated by a license notice that is included in or attached to the work, and their elements which includes algorithms, software, checkpoints, parameters, source code (inference code, evaluation code, and if applicable, fine-tuning code) and any other materials associated with the FLUX AI models made available by Company under this License, including if any, the technical documentation, manuals, and instructions for the use and operation thereof. Note that we may also make available certain elements of what is included in the definition of “FLUX Model” under a separate license, such as the inference code, and nothing in this License will be deemed to restrict or limit any other licenses granted by us in such elements.
By downloading, accessing, using, Distributing (as defined below), or creating a Derivative (as defined below) of the FLUX Model, you agree to the terms of this License. If you do not agree to this License, then you do not have any rights to access, use, Distribute or create a Derivative of the FLUX Model and you must immediately cease using the FLUX Model. If you are agreeing to be bound by the terms of this License on behalf of your employer or other entity, you represent and warrant to us that you have full legal authority to bind your employer or such entity to this License. If you do not have the requisite authority, you may not accept the License or access the FLUX Model on behalf of your employer or other entity.
1. Definitions.
- a. “Derivative” means any (i) modified version of the FLUX Model (including but not limited to any customized or fine-tuned version thereof), (ii) work based on the FLUX Model, or (iii) any other derivative work thereof. For the avoidance of doubt, Outputs are not considered Derivatives under this License.
- b. “Distribution” or “Distribute” or “Distributing” means providing or making available, by any means, a copy of the FLUX Model and/or the Derivatives as the case may be.
- c. “Non-Commercial Purpose” means any of the following uses, but only so far as you do not receive any direct or indirect payment arising from the use of the FLUX Model, Derivatives, or Content Filters (as defined below): (i) personal use for research, experimentation, and testing for the benefit of public knowledge, personal study, private entertainment, hobby projects, or otherwise not directly or indirectly connected to any commercial activities, business operations, or employment responsibilities; (ii) use by commercial or for-profit entities for testing, evaluation, or non-commercial research and development in a non-production environment; and (iii) use by any charitable organization for charitable purposes, or for testing or evaluation. For clarity, use (a) for revenue-generating activity, (b) in direct interactions with or that has impact on end users, or (c) to train, fine tune, or distill other models for commercial use, in each case, is not a Non-Commercial Purpose.
- d. “Outputs” means any content generated by the operation of the FLUX Model or Derivatives from an input (such as an image input) or prompt (i.e., text instructions) provided by users. For the avoidance of doubt, Outputs do not include any components of the FLUX Model, such as any fine-tuned versions of the FLUX Model, the weights, or parameters.
- e. “you” or “your” means the individual or entity entering into this License with Company.
2. License Grant.
- a. License. Subject to your compliance with this License, Company grants you a non-exclusive, worldwide, non-transferable, non-sublicensable, revocable, royalty-free, and limited license to access, use, create Derivatives of, and Distribute the FLUX Model and Derivatives solely for your Non-Commercial Purposes. The foregoing license is personal to you, and you may not assign or sublicense this License or any other rights or obligations under this License without Company's prior written consent; any such assignment or sublicense will be void and will automatically and immediately terminate this License. Any restrictions set forth herein regarding the FLUX Model also apply to any Derivative you create or that are created on your behalf.
- b. Non-Commercial Use Only. You may only access, use, Distribute, or create Derivatives of the FLUX Model or Derivatives for Non-Commercial Purposes. If you want to use a FLUX Model or a Derivative for any purpose that is not expressly authorized under this License, such as for a commercial activity, you must request a license from Company, which Company may grant to you in Company's sole discretion and which additional use may be subject to a fee, royalty or other revenue share. Please see https://bfl.ai/licensing if you would like a commercial license.
- c. Reserved Rights. The grant of rights expressly set forth in this License are the complete grant of rights to use the FLUX Model, and no other licenses are granted, whether by waiver, estoppel, implication, equity, or otherwise. Company and its licensors reserve all rights not expressly granted by this License.
- d. Outputs. We claim no ownership rights in and to the Outputs. You are solely responsible for the Outputs you generate and their subsequent uses in accordance with this License. You may use Output for any purpose (including for commercial purposes), except as expressly prohibited herein. You may not use the Output to train, fine-tune, or distill a model that is competitive with a FLUX Model.
- e. You may access, use, Distribute, or create Output of the FLUX Model or Derivatives if you: (i) (A) implement and maintain content filtering measures (“Content Filters”) for your use of the FLUX Model or Derivatives to prevent the creation, display, transmission, generation, or dissemination of unlawful or infringing content, which may include Content Filters that we may make available for use with the FLUX Model (“Provided Content Filters”), or (B) ensure Output undergoes review for unlawful or infringing content before public or non-public distribution, display, transmission or dissemination; and (ii) ensure Output includes disclosure (or other indication) that the Output was generated or modified using artificial intelligence technologies to the extent required under applicable law.
3. Distribution. Subject to this License, you may Distribute copies of the FLUX Model and/or Derivatives made by you, under the following conditions:
- a. you must make available a copy of this License to third-party recipients of the FLUX Model and/or Derivatives you Distribute, and specify that any rights to use the FLUX Model and/or Derivatives shall be directly granted by Company to said third-party recipients pursuant to this License;
- b. you must prominently display the following notice alongside the Distribution of the FLUX Model or Derivative (such as via a “Notice” text file distributed as part of such FLUX Model or Derivative) (the “Attribution Notice”):
> This FLUX Model is licensed by Black Forest Labs Inc. under the FLUX Non-Commercial License. Copyright Black Forest Labs Inc. IN NO EVENT SHALL BLACK FOREST LABS INC. BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH USE OF THIS MODEL.
- c. in the case of Distribution of Derivatives made by you: (i) you must also include in the Attribution Notice a statement that you have modified the applicable FLUX Model; (ii) any terms and conditions you impose on any third-party recipients relating to Derivatives made by or for you shall neither limit such third-party recipients' use of the FLUX Model or any Derivatives made by or for Company in accordance with this License nor conflict with any of its terms and conditions and must include disclaimer of warranties and limitation of liability provisions that are at least as protective of Company as those set forth herein; and (iii) you must not misrepresent or imply, through any means, that the Derivatives made by or for you and/or any modified version of the FLUX Model you Distribute under your name and responsibility is an official product of the Company or has been endorsed, approved or validated by the Company, unless you are authorized by Company to do so in writing.
4. Restrictions. You will not, and will not permit, assist or cause any third party to:
- a. use, modify, copy, reproduce, create Derivatives of, or Distribute the FLUX Model (or any Derivative thereof, or any data produced by the FLUX Model), in whole or in part, (i) for any commercial or production purposes, (ii) military purposes, (iii) purposes of surveillance, including any research or development relating to surveillance, (iv) biometric processing, (v) in any manner that infringes, misappropriates, or otherwise violates (or is likely to infringe, misappropriate, or otherwise violate) any third party's legal rights, including rights of publicity or “digital replica” rights, (vi) in any unlawful, fraudulent, defamatory, or abusive activity, (vii) to generate unlawful content, including child sexual abuse material, or non-consensual intimate images; or (viii) in any manner that violates any applicable law and any privacy or security laws, rules, regulations, directives, or governmental requirements (including the General Data Privacy Regulation (Regulation (EU) 2016/679), the California Consumer Privacy Act, any and all laws governing the processing of biometric information, and the EU Artificial Intelligence Act (Regulation (EU) 2024/1689), as well as all amendments and successor laws to any of the foregoing);
- b. alter or remove copyright and other proprietary notices which appear on or in any portion of the FLUX Model;
- c. utilize any equipment, device, software, or other means to circumvent or remove any security or protection used by Company in connection with the FLUX Model, or to circumvent or remove any usage restrictions, or to enable functionality disabled by FLUX Model;
- d. offer or impose any terms on the FLUX Model that alter, restrict, or are inconsistent with the terms of this License;
- e. violate any applicable U.S. and non-U.S. export control and trade sanctions laws (“Export Laws”) in connection with your use or Distribution of any FLUX Model;
- f. directly or indirectly Distribute, export, or otherwise transfer FLUX Model (i) to any individual, entity, or country prohibited by Export Laws; (ii) to anyone on U.S. or non-U.S. government restricted parties lists; (iii) for any purpose prohibited by Export Laws, including nuclear, chemical or biological weapons, or missile technology applications; (iv) use or download FLUX Model if you or they are (a) located in a comprehensively sanctioned jurisdiction, (b) currently listed on any U.S. or non-U.S. restricted parties list, or (c) for any purpose prohibited by Export Laws; and (v) will not disguise your location through IP proxying or other methods.
5. DISCLAIMERS. THE FLUX MODEL AND PROVIDED CONTENT FILTERS ARE PROVIDED “AS IS” AND “WITH ALL FAULTS” WITH NO WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. COMPANY EXPRESSLY DISCLAIMS ALL REPRESENTATIONS AND WARRANTIES, EXPRESS OR IMPLIED, WHETHER BY STATUTE, CUSTOM, USAGE OR OTHERWISE AS TO ANY MATTERS RELATED TO THE FLUX MODEL AND PROVIDED CONTENT FILTERS, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE, SATISFACTORY QUALITY, OR NON-INFRINGEMENT. COMPANY MAKES NO WARRANTIES OR REPRESENTATIONS THAT THE FLUX MODEL AND PROVIDED CONTENT FILTERS WILL BE ERROR FREE OR FREE OF VIRUSES OR OTHER HARMFUL COMPONENTS, OR PRODUCE ANY PARTICULAR RESULTS.
6. LIMITATION OF LIABILITY. TO THE FULLEST EXTENT PERMITTED BY LAW, IN NO EVENT WILL COMPANY BE LIABLE TO YOU OR YOUR EMPLOYEES, AFFILIATES, USERS, OFFICERS OR DIRECTORS (A) UNDER ANY THEORY OF LIABILITY, WHETHER BASED IN CONTRACT, TORT, NEGLIGENCE, STRICT LIABILITY, WARRANTY, OR OTHERWISE UNDER THIS LICENSE, OR (B) FOR ANY INDIRECT, CONSEQUENTIAL, EXEMPLARY, INCIDENTAL, PUNITIVE OR SPECIAL DAMAGES OR LOST PROFITS, EVEN IF COMPANY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. THE FLUX MODEL, ITS CONSTITUENT COMPONENTS, PROVIDED CONTENT FILTERS, AND ANY OUTPUT (COLLECTIVELY, “MODEL MATERIALS”) ARE NOT DESIGNED OR INTENDED FOR USE IN ANY APPLICATION OR SITUATION WHERE FAILURE OR FAULT OF THE MODEL MATERIALS COULD REASONABLY BE ANTICIPATED TO LEAD TO SERIOUS INJURY OF ANY PERSON, INCLUDING POTENTIAL DISCRIMINATION OR VIOLATION OF AN INDIVIDUAL'S PRIVACY RIGHTS, OR TO SEVERE PHYSICAL, PROPERTY, OR ENVIRONMENTAL DAMAGE (EACH, A “HIGH-RISK USE”). IF YOU ELECT TO USE ANY OF THE MODEL MATERIALS FOR A HIGH-RISK USE, YOU DO SO AT YOUR OWN RISK. YOU AGREE TO DESIGN AND IMPLEMENT APPROPRIATE DECISION-MAKING AND RISK-MITIGATION PROCEDURES AND POLICIES IN CONNECTION WITH A HIGH-RISK USE SUCH THAT EVEN IF THERE IS A FAILURE OR FAULT IN ANY OF THE MODEL MATERIALS, THE SAFETY OF PERSONS OR PROPERTY AFFECTED BY THE ACTIVITY STAYS AT A LEVEL THAT IS REASONABLE, APPROPRIATE, AND LAWFUL FOR THE FIELD OF THE HIGH-RISK USE.
7. INDEMNIFICATION. You will indemnify, defend and hold harmless Company and our subsidiaries and affiliates, and each of our respective shareholders, directors, officers, employees, agents, successors, and assigns (collectively, the “Company Parties”) from and against any losses, liabilities, damages, fines, penalties, and expenses (including reasonable attorneys' fees) incurred by any Company Party in connection with any claim, demand, allegation, lawsuit, proceeding, or investigation (collectively, “Claims”) arising out of or related to (a) your access to or use of the FLUX Model (including in connection with any Output, results or data generated from such access or use, or from your access or use of any Content Filters), including any High-Risk Use; (b) your Content Filters, including your failure to implement any Content Filters where required by this License such as in Section 2(e); (c) your violation of this License; or (d) your violation, misappropriation or infringement of any rights of another (including intellectual property or other proprietary rights and privacy rights). You will promptly notify the Company Parties of any such Claims, and cooperate with Company Parties in defending such Claims. You will also grant the Company Parties sole control of the defense or settlement, at Company's sole option, of any Claims. This indemnity is in addition to, and not in lieu of, any other indemnities or remedies set forth in a written agreement between you and Company or the other Company Parties.
8. Termination; Survival.
a. This License will automatically terminate upon any breach by you of the terms of this License.
b. We may terminate this License, in whole or in part, at any time upon notice (including electronic) to you.
c. If you initiate any legal action or proceedings against Company or any other entity (including a cross-claim or counterclaim in a lawsuit), alleging that the FLUX Model, any Derivative, or Provided Content Filters, or any part thereof, infringe upon intellectual property or other rights owned or licensable by you, then any licenses granted to you under this License will immediately terminate as of the date such legal action or claim is filed or initiated.
d. Upon termination of this License, you must cease all use, access or Distribution of the FLUX Model, any Derivatives, and any Provided Content Filters. The following sections survive termination of this License: 2(c), 2(d), 4-11.
9. Third Party Materials. The FLUX Model and Provided Content Filters may contain third-party software or other components (including free and open source software) (all of the foregoing, “Third Party Materials”), which are subject to the license terms of the respective third-party licensors. Your dealings or correspondence with third parties and your use of or interaction with any Third Party Materials are solely between you and the third party. Company does not control or endorse, and makes no representations or warranties regarding, any Third Party Materials, and your access to and use of such Third Party Materials are at your own risk.
10. Trademarks. You have not been granted any trademark license as part of this License and may not use any name, logo or trademark associated with Company without the prior written permission of Company, except to the extent necessary to make the reference required in the Attribution Notice as specified above or as is reasonably necessary in describing the FLUX Model and its creators.
11. General. This License will be governed and construed under the laws of the State of Delaware without regard to conflicts of law provisions. If any provision or part of a provision of this License is unlawful, void or unenforceable, that provision or part of the provision is deemed severed from this License, and will not affect the validity and enforceability of any remaining provisions. The failure of Company to exercise or enforce any right or provision of this License will not operate as a waiver of such right or provision. This License does not confer any third-party beneficiary rights upon any other person or entity. This License, together with the documentation, contains the entire understanding between you and Company regarding the subject matter of this License, and supersedes all other written or oral agreements and understandings between you and Company regarding such subject matter.


@@ -17,6 +17,7 @@ dependencies = [
"safetensors==0.4.5",
"fire==0.7.1",
"openai==2.8.1",
"accelerate==1.12.0",
]
[project.optional-dependencies]


@@ -16,11 +16,12 @@ from flux2.sampling import (
    batched_prc_img,
    batched_prc_txt,
    denoise,
    denoise_cfg,
    encode_image_refs,
    get_schedule,
    scatter_ids,
)
from flux2.util import FLUX2_MODEL_INFO, load_ae, load_flow_model, load_text_encoder
# from flux2.watermark import embed_watermark
@@ -167,6 +168,7 @@ def print_help():
print("""
Available commands:
[Enter] - Run generation with current config
<any text> - Set as prompt (then press Enter to generate)
run - Run generation with current config
show - Show current configuration
reset - Reset configuration to defaults
@@ -206,40 +208,114 @@ Parameters:
""")
def validate_model_params(model_name: str, cfg: Config) -> bool:
    """Validate that config parameters match model requirements. Returns True if valid."""
    model_info = FLUX2_MODEL_INFO[model_name]
    defaults = model_info.get("defaults", {})
    fixed_params = model_info.get("fixed_params", set())
    errors = []
    if "num_steps" in fixed_params and cfg.num_steps != defaults["num_steps"]:
        errors.append(
            f"Model '{model_name}' requires num_steps={defaults['num_steps']}, "
            f"but you specified num_steps={cfg.num_steps}"
        )
    if "guidance" in fixed_params and cfg.guidance != defaults["guidance"]:
        errors.append(
            f"Model '{model_name}' requires guidance={defaults['guidance']}, "
            f"but you specified guidance={cfg.guidance}"
        )
    if errors:
        print("\nERROR: Invalid parameters for selected model:", file=sys.stderr)
        for error in errors:
            print(f" - {error}", file=sys.stderr)
        print("\nPlease adjust your parameters and try again.", file=sys.stderr)
        return False
    return True
# ---------- Main Loop ----------
def main(
    model_name: str | None = None,
    single_eval: bool = False,
    prompt: str | None = None,
    debug_mode: bool = False,
    cpu_offloading: bool = False,
    **overwrite,
):
    # Prompt for model selection if not provided
    if model_name is None:
        available_models = list(FLUX2_MODEL_INFO.keys())
        print("Available models:")
        for i, name in enumerate(available_models, 1):
            print(f" {i}. {name}")
        while True:
            try:
                choice = input(f"\nSelect a model [default: {available_models[0]}]: ").strip()
                if choice == "":
                    model_name = available_models[0]
                    break
                elif choice.isdigit():
                    idx = int(choice) - 1
                    if 0 <= idx < len(available_models):
                        model_name = available_models[idx]
                        break
                    print(f"Please enter a number between 1 and {len(available_models)}")
                elif choice.lower() in FLUX2_MODEL_INFO:
                    model_name = choice.lower()
                    break
                else:
                    print(f"Invalid choice. Available models: {', '.join(available_models)}")
            except (EOFError, KeyboardInterrupt):
                print("\nbye!")
                return
    assert (
        model_name.lower() in FLUX2_MODEL_INFO
    ), f"{model_name} is not available, choose from {FLUX2_MODEL_INFO.keys()}"
    model_info = FLUX2_MODEL_INFO[model_name]
    torch_device = torch.device("cuda")
    text_encoder = load_text_encoder(model_name, device=torch_device)
    if "klein" in model_name:
        mod_and_upsampling_model = load_text_encoder("flux.2-dev")
    else:
        mod_and_upsampling_model = text_encoder
    model = load_flow_model(
        model_name, debug_mode=debug_mode, device="cpu" if cpu_offloading else torch_device
    )
    ae = load_ae(model_name)
    ae.eval()
    text_encoder.eval()
    # API client will be initialized lazily when needed
    openrouter_api_client: Optional[OpenRouterAPIClient] = None
    cfg = DEFAULTS.copy()
    # Apply model defaults if not overridden
    defaults = model_info.get("defaults", {})
    if "num_steps" in defaults and "num_steps" not in overwrite:
        cfg.num_steps = defaults["num_steps"]
    if "guidance" in defaults and "guidance" not in overwrite:
        cfg.guidance = defaults["guidance"]
    changes = [f"{key}={value}" for key, value in overwrite.items()]
    updates = parse_key_values(" ".join(changes))
    apply_updates(cfg, updates)
    if prompt is not None:
        cfg.prompt = prompt
    # Validate initial config
    if not validate_model_params(model_name, cfg):
        sys.exit(1)
    print_config(cfg)
    while True:
@@ -255,17 +331,24 @@ def main(
            cmd = "run"
            updates = {}
        else:
            # Check if this is plain text (no key=value pairs and not a known command)
            known_commands = {"run", "show", "reset", "quit", "q", "exit", "help", "h", "?"}
            if "=" not in line and line.lower() not in known_commands:
                # Treat the entire line as a prompt
                updates = {"prompt": line}
                cmd = None
            else:
                try:
                    updates = parse_key_values(line)
                except Exception as e:  # noqa: BLE001
                    print(f" ! Failed to parse command: {type(e).__name__}: {e}", file=sys.stderr)
                    print(
                        " ! Please check your syntax (e.g., matching quotes) and try again.\n",
                        file=sys.stderr,
                    )
                    continue
if "prompt" in updates and mod_and_upsampling_model.test_txt(updates["prompt"]):
print(
"Your prompt has been flagged for potential copyright or public-persona concerns. Please choose another."
)
@@ -274,7 +357,7 @@ def main(
if "input_images" in updates:
flagged = False
for image in updates["input_images"]:
if mod_and_upsampling_model.test_image(image):
print(f"The image {image} has been flagged as unsuitable. Please choose another.")
flagged = True
if flagged:
@@ -294,6 +377,11 @@ def main(
break
elif cmd == "reset":
cfg = DEFAULTS.copy()
# Re-apply model defaults
if "num_steps" in defaults:
cfg.num_steps = defaults["num_steps"]
if "guidance" in defaults:
cfg.guidance = defaults["guidance"]
print_config(cfg)
continue
elif cmd == "show":
@@ -305,7 +393,16 @@ def main(
# Apply key=value changes
if updates:
# Create a temporary copy to test the updates
temp_cfg = cfg.copy()
apply_updates(temp_cfg, updates)
# Validate the temporary config
if not validate_model_params(model_name, temp_cfg):
continue
# Only apply to actual config if validation passed
cfg = temp_cfg
print_config(cfg)
continue
@@ -453,7 +550,7 @@ def main(
prompt = cfg.prompt
elif cfg.upsample_prompt_mode == "local":
# Use local model for upsampling
upsampled_prompts = mod_and_upsampling_model.upsample_prompt(
[cfg.prompt], img=[img_ctx] if img_ctx else None
)
prompt = upsampled_prompts[0] if upsampled_prompts else cfg.prompt
@@ -463,13 +560,20 @@ def main(
print("Generating with prompt: ", prompt)
if model_info["guidance_distilled"]:
ctx = text_encoder([prompt]).to(torch.bfloat16)
else:
ctx_empty = text_encoder([""]).to(torch.bfloat16)
ctx_prompt = text_encoder([prompt]).to(torch.bfloat16)
ctx = torch.cat([ctx_empty, ctx_prompt], dim=0)
ctx, ctx_ids = batched_prc_txt(ctx)
if cpu_offloading:
text_encoder = text_encoder.cpu()
torch.cuda.empty_cache()
model = model.to(torch_device)
if "klein" in model_name:
mod_and_upsampling_model = mod_and_upsampling_model.cpu()
# Create noise
shape = (1, 128, height // 16, width // 16)
@@ -478,17 +582,30 @@ def main(
x, x_ids = batched_prc_img(randn)
timesteps = get_schedule(cfg.num_steps, x.shape[1])
if model_info["guidance_distilled"]:
x = denoise(
model,
x,
x_ids,
ctx,
ctx_ids,
timesteps=timesteps,
guidance=cfg.guidance,
img_cond_seq=ref_tokens,
img_cond_seq_ids=ref_ids,
)
else:
x = denoise_cfg(
model,
x,
x_ids,
ctx,
ctx_ids,
timesteps=timesteps,
guidance=cfg.guidance,
img_cond_seq=ref_tokens,
img_cond_seq_ids=ref_ids,
)
x = torch.cat(scatter_ids(x, x_ids)).squeeze(2)
x = ae.decode(x).float()
# x = embed_watermark(x)
@@ -496,13 +613,17 @@ def main(
if cpu_offloading:
model = model.cpu()
torch.cuda.empty_cache()
text_encoder = text_encoder.to(torch_device)
if "klein" in model_name:
mod_and_upsampling_model = mod_and_upsampling_model.to(torch_device)
x = x.clamp(-1, 1)
x = rearrange(x[0], "c h w -> h w c")
img = Image.fromarray((127.5 * (x + 1.0)).cpu().byte().numpy())
if mod_and_upsampling_model.test_image(img):
print("Your output has been flagged. Please choose another prompt / input image combination")
else:
exif_data = Image.Exif()


@@ -17,6 +17,35 @@ class Flux2Params:
axes_dim: list[int] = field(default_factory=lambda: [32, 32, 32, 32])
theta: int = 2000
mlp_ratio: float = 3.0
use_guidance_embed: bool = True
@dataclass
class Klein9BParams:
in_channels: int = 128
context_in_dim: int = 12288
hidden_size: int = 4096
num_heads: int = 32
depth: int = 8
depth_single_blocks: int = 24
axes_dim: list[int] = field(default_factory=lambda: [32, 32, 32, 32])
theta: int = 2000
mlp_ratio: float = 3.0
use_guidance_embed: bool = False
@dataclass
class Klein4BParams:
in_channels: int = 128
context_in_dim: int = 7680
hidden_size: int = 3072
num_heads: int = 24
depth: int = 5
depth_single_blocks: int = 20
axes_dim: list[int] = field(default_factory=lambda: [32, 32, 32, 32])
theta: int = 2000
mlp_ratio: float = 3.0
use_guidance_embed: bool = False
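The RoPE dimensions in these configs are internally consistent: assuming `pe_dim = hidden_size // num_heads` as in FLUX-style models (the constructor below passes `pe_dim` to `EmbedND`; that relationship is an assumption here, not shown in this diff), the per-head dimension must equal `sum(axes_dim)`. A quick standalone check:

```python
# Hypothetical sanity check: per-head dim == sum of RoPE axes_dim.
# (Assumes pe_dim = hidden_size // num_heads, standard in FLUX-style models.)
configs = {
    "klein-9b": (4096, 32, [32, 32, 32, 32]),
    "klein-4b": (3072, 24, [32, 32, 32, 32]),
}
for name, (hidden, heads, axes) in configs.items():
    assert hidden % heads == 0 and hidden // heads == sum(axes)
    print(name, hidden // heads)  # both yield a head dim of 128
```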
class Flux2(nn.Module):
@@ -37,9 +66,12 @@ class Flux2(nn.Module):
self.pe_embedder = EmbedND(dim=pe_dim, theta=params.theta, axes_dim=params.axes_dim)
self.img_in = nn.Linear(self.in_channels, self.hidden_size, bias=False)
self.time_in = MLPEmbedder(in_dim=256, hidden_dim=self.hidden_size, disable_bias=True)
self.txt_in = nn.Linear(params.context_in_dim, self.hidden_size, bias=False)
self.use_guidance_embed = params.use_guidance_embed
if self.use_guidance_embed:
self.guidance_in = MLPEmbedder(in_dim=256, hidden_dim=self.hidden_size, disable_bias=True)
self.double_blocks = nn.ModuleList(
[
DoubleStreamBlock(
@@ -86,14 +118,15 @@ class Flux2(nn.Module):
timesteps: Tensor,
ctx: Tensor,
ctx_ids: Tensor,
guidance: Tensor | None,
):
num_txt_tokens = ctx.shape[1]
timestep_emb = timestep_embedding(timesteps, 256)
vec = self.time_in(timestep_emb)
if self.use_guidance_embed:
guidance_emb = timestep_embedding(guidance, 256)
vec = vec + self.guidance_in(guidance_emb)
double_block_mod_img = self.double_stream_modulation_img(vec)
double_block_mod_txt = self.double_stream_modulation_txt(vec)


@@ -307,6 +307,61 @@ def denoise(
return img
def vanilla_guidance(x: torch.Tensor, cfg_val: float) -> torch.Tensor:
x_u, x_c = x.chunk(2)
x = x_u + cfg_val * (x_c - x_u)
return x
def denoise_cfg(
model: Flux2,
img: Tensor,
img_ids: Tensor,
txt: Tensor, # Already cat([txt_empty, txt_prompt])
txt_ids: Tensor,
timesteps: list[float],
guidance: float,
img_cond_seq: Tensor | None = None,
img_cond_seq_ids: Tensor | None = None,
):
img = torch.cat([img, img], dim=0)
img_ids = torch.cat([img_ids, img_ids], dim=0)
if img_cond_seq is not None:
assert img_cond_seq_ids is not None
img_cond_seq = torch.cat([img_cond_seq, img_cond_seq], dim=0)
img_cond_seq_ids = torch.cat([img_cond_seq_ids, img_cond_seq_ids], dim=0)
for t_curr, t_prev in zip(timesteps[:-1], timesteps[1:]):
t_vec = torch.full((img.shape[0],), t_curr, dtype=img.dtype, device=img.device)
img_input = img
img_input_ids = img_ids
if img_cond_seq is not None:
img_input = torch.cat((img_input, img_cond_seq), dim=1)
img_input_ids = torch.cat((img_input_ids, img_cond_seq_ids), dim=1)
pred = model(
x=img_input,
x_ids=img_input_ids,
timesteps=t_vec,
ctx=txt,
ctx_ids=txt_ids,
guidance=None,
)
if img_cond_seq is not None:
pred = pred[:, : img.shape[1]]
pred_uncond, pred_cond = pred.chunk(2)
pred = pred_uncond + guidance * (pred_cond - pred_uncond)
pred = torch.cat([pred, pred], dim=0)
img = img + (t_prev - t_curr) * pred
return img.chunk(2)[0]
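The guidance combination inside `denoise_cfg` (and in `vanilla_guidance`) can be illustrated standalone: the batched prediction splits into unconditional and conditional halves, which combine as `x_u + g * (x_c - x_u)`. A minimal sketch with plain Python lists, illustrative only — the real code operates on batched tensors:

```python
def cfg_combine(pred_uncond, pred_cond, guidance):
    """Classifier-free guidance: move the unconditional prediction
    toward the conditional one, scaled by the guidance weight."""
    return [u + guidance * (c - u) for u, c in zip(pred_uncond, pred_cond)]

# guidance = 1.0 reduces to the conditional prediction
print(cfg_combine([0.0, 2.0], [1.0, 4.0], 1.0))  # [1.0, 4.0]
# guidance > 1.0 extrapolates past it
print(cfg_combine([0.0, 2.0], [1.0, 4.0], 4.0))  # [4.0, 10.0]
```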
def concatenate_images(
images: list[Image.Image],
) -> Image.Image:


@@ -4,7 +4,13 @@ import torch
import torch.nn as nn
from einops import rearrange
from PIL import Image
from transformers import (
AutoModelForCausalLM,
AutoProcessor,
AutoTokenizer,
Mistral3ForConditionalGeneration,
pipeline,
)
from .sampling import cap_pixels, concatenate_images
from .system_messages import (
@@ -17,7 +23,8 @@ from .system_messages import (
SYSTEM_PROMPT_CONTENT_FILTER,
)
OUTPUT_LAYERS_MISTRAL = [10, 20, 30]
OUTPUT_LAYERS_QWEN3 = [9, 18, 27]
MAX_LENGTH = 512
NSFW_THRESHOLD = 0.85
UPSAMPLING_MAX_IMAGE_SIZE = 768**2
@@ -237,7 +244,7 @@ class Mistral3SmallEmbedder(nn.Module):
use_cache=False,
)
out = torch.stack([output.hidden_states[k] for k in OUTPUT_LAYERS_MISTRAL], dim=1)
return rearrange(out, "b c l d -> b l (c d)")
def yes_no_logit_processor(
@@ -354,3 +361,76 @@ class Mistral3SmallEmbedder(nn.Module):
do_sample=False,
)
return generate_ids[0, -1].item() == self.yes_token
class Qwen3Embedder(nn.Module):
def __init__(
self,
model_spec: str,
device: str | torch.device = "cuda",
):
super().__init__()
self.model = AutoModelForCausalLM.from_pretrained(
model_spec,
torch_dtype=None,
device_map=str(device),
)
self.tokenizer = AutoTokenizer.from_pretrained(model_spec)
self.max_length = MAX_LENGTH
@torch.no_grad()
def forward(self, txt: list[str]):
all_input_ids = []
all_attention_masks = []
for prompt in txt:
messages = [{"role": "user", "content": prompt}]
text = self.tokenizer.apply_chat_template(
messages,
tokenize=False,
add_generation_prompt=True,
enable_thinking=False,
)
model_inputs = self.tokenizer(
text,
return_tensors="pt",
padding="max_length",
truncation=True,
max_length=self.max_length,
)
all_input_ids.append(model_inputs["input_ids"])
all_attention_masks.append(model_inputs["attention_mask"])
input_ids = torch.cat(all_input_ids, dim=0).to(self.model.device)
attention_mask = torch.cat(all_attention_masks, dim=0).to(self.model.device)
output = self.model(
input_ids=input_ids,
attention_mask=attention_mask,
output_hidden_states=True,
use_cache=False,
)
out = torch.stack([output.hidden_states[k] for k in OUTPUT_LAYERS_QWEN3], dim=1)
return rearrange(out, "b c l d -> b l (c d)")
def test_txt(self, txt: str) -> bool:
raise NotImplementedError("Qwen3Embedder does not support text testing")
def test_image(self, image) -> bool:
raise NotImplementedError("Qwen3Embedder does not support image testing")
def upsample_prompt(self, txt: list[str], img=None, **kwargs) -> list[str]:
raise NotImplementedError("Qwen3Embedder does not support upsampling")
def load_mistral_small_embedder(device: str | torch.device = "cuda") -> Mistral3SmallEmbedder:
return Mistral3SmallEmbedder().to(device)
def load_qwen3_embedder(variant: str, device: str | torch.device = "cuda"):
return Qwen3Embedder(model_spec=f"Qwen/Qwen3-{variant}-FP8", device=device)
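Both embedders stack hidden states from three intermediate layers and flatten them per token (einops' `"b c l d -> b l (c d)"`), tripling the embedding width — hence `context_in_dim = 3 * hidden_dim` in the model configs. A dependency-free sketch of that flattening for a single sequence (plain nested lists standing in for tensors):

```python
def concat_layers(hidden_states, layer_ids):
    """Per token, concatenate hidden vectors from the selected layers,
    mirroring 'b c l d -> b l (c d)' for a single (unbatched) sequence."""
    seq_len = len(hidden_states[0])
    return [
        [v for k in layer_ids for v in hidden_states[k][t]]
        for t in range(seq_len)
    ]

# two tokens, hidden size 2, four layers; select layers 1 and 3
hs = [[[0, 0], [0, 0]], [[1, 2], [3, 4]], [[0, 0], [0, 0]], [[5, 6], [7, 8]]]
print(concat_layers(hs, [1, 3]))  # [[1, 2, 5, 6], [3, 4, 7, 8]]
```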


@@ -9,16 +9,65 @@ from PIL import Image
from safetensors.torch import load_file as load_sft
from .autoencoder import AutoEncoder, AutoEncoderParams
from .model import Flux2, Flux2Params, Klein4BParams, Klein9BParams
from .text_encoder import load_mistral_small_embedder, load_qwen3_embedder
FLUX2_MODEL_INFO = {
"flux.2-klein-4b": {
"repo_id": "black-forest-labs/FLUX.2-klein-4B",
"filename": "flux-2-klein-4b.safetensors",
"filename_ae": "ae.safetensors",
"params": Klein4BParams(),
"text_encoder_load_fn": lambda device="cuda": load_qwen3_embedder(variant="4B", device=device),
"model_path": "KLEIN_4B_MODEL_PATH",
"defaults": {"guidance": 1.0, "num_steps": 4},
"fixed_params": {"guidance", "num_steps"},
"guidance_distilled": True,
},
"flux.2-klein-9b": {
"repo_id": "black-forest-labs/FLUX.2-klein-9B",
"filename": "flux-2-klein-9b.safetensors",
"filename_ae": "ae.safetensors",
"params": Klein9BParams(),
"text_encoder_load_fn": lambda device="cuda": load_qwen3_embedder(variant="8B", device=device),
"model_path": "KLEIN_9B_MODEL_PATH",
"defaults": {"guidance": 1.0, "num_steps": 4},
"fixed_params": {"guidance", "num_steps"},
"guidance_distilled": True,
},
"flux.2-klein-base-4b": {
"repo_id": "black-forest-labs/FLUX.2-klein-base-4B",
"filename": "flux-2-klein-base-4b.safetensors",
"filename_ae": "ae.safetensors",
"params": Klein4BParams(),
"text_encoder_load_fn": lambda device="cuda": load_qwen3_embedder(variant="4B", device=device),
"model_path": "KLEIN_4B_BASE_MODEL_PATH",
"defaults": {"guidance": 4.0, "num_steps": 50},
"fixed_params": {},
"guidance_distilled": False,
},
"flux.2-klein-base-9b": {
"repo_id": "black-forest-labs/FLUX.2-klein-base-9B",
"filename": "flux-2-klein-base-9b.safetensors",
"filename_ae": "ae.safetensors",
"params": Klein9BParams(),
"text_encoder_load_fn": lambda device="cuda": load_qwen3_embedder(variant="8B", device=device),
"model_path": "KLEIN_9B_BASE_MODEL_PATH",
"defaults": {"guidance": 4.0, "num_steps": 50},
"fixed_params": {},
"guidance_distilled": False,
},
"flux.2-dev": {
"repo_id": "black-forest-labs/FLUX.2-dev",
"filename": "flux2-dev.safetensors",
"filename_ae": "ae.safetensors",
"params": Flux2Params(),
"text_encoder_load_fn": load_mistral_small_embedder,
"model_path": "FLUX2_MODEL_PATH",
"defaults": {"guidance": 4.0, "num_steps": 50},
"fixed_params": {},
"guidance_distilled": True,
},
}
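The registry above drives the CLI's behavior: `defaults` are applied unless the user overrides them, and the distilled klein models mark `guidance` and `num_steps` as `fixed_params`. A hypothetical miniature of that pattern (names and the rejection of fixed-param overrides are assumptions for illustration — the repo's own enforcement presumably lives in `validate_model_params`):

```python
# Hypothetical miniature of the model-registry pattern.
REGISTRY = {
    "distilled": {"defaults": {"num_steps": 4, "guidance": 1.0},
                  "fixed_params": {"num_steps", "guidance"}},
    "base": {"defaults": {"num_steps": 50, "guidance": 4.0},
             "fixed_params": set()},
}

def resolve_config(name, overrides):
    info = REGISTRY[name]
    bad = info["fixed_params"] & overrides.keys()
    if bad:
        raise ValueError(f"cannot override fixed params: {sorted(bad)}")
    # defaults first, user overrides win
    return {**info["defaults"], **overrides}

print(resolve_config("base", {"num_steps": 28}))  # {'num_steps': 28, 'guidance': 4.0}
```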
@@ -29,8 +78,8 @@ def load_flow_model(model_name: str, debug_mode: bool = False, device: str | tor
config["params"].depth = 1
config["params"].depth_single_blocks = 1
else:
if config["model_path"] in os.environ:
weight_path = os.environ[config["model_path"]]
assert os.path.exists(weight_path), f"Provided weight path {weight_path} does not exist"
else:
# download from huggingface
@@ -53,15 +102,16 @@ def load_flow_model(model_name: str, debug_mode: bool = False, device: str | tor
model = Flux2(FLUX2_MODEL_INFO[model_name.lower()]["params"]).to(torch.bfloat16)
print(f"Loading {weight_path} for the FLUX.2 weights")
sd = load_sft(weight_path, device=str(device))
model.load_state_dict(sd, strict=True, assign=True)
return model.to(device)
else:
with torch.device(device):
return Flux2(FLUX2_MODEL_INFO[model_name.lower()]["params"]).to(torch.bfloat16)
def load_text_encoder(model_name: str, device: str | torch.device = "cuda"):
config = FLUX2_MODEL_INFO[model_name.lower()]
return config["text_encoder_load_fn"](device=device)
def load_ae(model_name: str, device: str | torch.device = "cuda") -> AutoEncoder: