CLI Configuration

InstructLab's configuration is read from the $XDG_CONFIG_DIR/instructlab/config.yaml file. The configuration is handled and validated by a Pydantic schema.
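
For orientation, here is a minimal sketch of what a config.yaml might look like. It is illustrative only: the model paths are hypothetical placeholders, and any section or key left out falls back to the defaults documented below.

   version: 1.0.0
   general:
     log_level: INFO
   serve:
     # hypothetical path; substitute the model you actually serve
     model_path: models/granite-7b-lab
   chat:
     # hypothetical path
     model: models/granite-7b-lab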

pydantic model instructlab.configuration.Config

Configuration for the InstructLab CLI. Config options are defined by the respective subclasses and are loaded into a single 'Config' object here. Instantiation of this object should be done via 'get_default_config()'. Note that values here can be overridden by a user's 'config.yaml' or command line overrides in some cases.

JSON schema:
{
   "title": "Config",
   "description": "Configuration for the InstructLab CLI.\nConfig options are defined by the respective subclasses and are loaded into a single 'Config' object here\nInstantation of this object should be done via 'get_default_config()'\nNote that values here can be overriden by a users 'config.yaml' or command line overrides in some cases",
   "type": "object",
   "properties": {
      "chat": {
         "$ref": "#/$defs/_chat",
         "description": "Chat configuration section."
      },
      "generate": {
         "$ref": "#/$defs/_generate",
         "description": "Generate configuration section."
      },
      "serve": {
         "$ref": "#/$defs/_serve",
         "description": "Serve configuration section."
      },
      "train": {
         "$ref": "#/$defs/_train",
         "description": "Train configuration section."
      },
      "evaluate": {
         "$ref": "#/$defs/_evaluate",
         "description": "Evaluate configuration section."
      },
      "general": {
         "$ref": "#/$defs/_general",
         "description": "General configuration section."
      },
      "version": {
         "default": "1.0.0",
         "description": "Configuration file structure version.",
         "title": "Version",
         "type": "string"
      },
      "metadata": {
         "$ref": "#/$defs/_metadata",
         "description": "Metadata pertaining to the specifics of the system which the Configuration is meant to be applied to."
      }
   },
   "$defs": {
      "DistributedBackend": {
         "enum": [
            "fsdp",
            "deepspeed"
         ],
         "title": "DistributedBackend",
         "type": "string"
      },
      "_chat": {
         "description": "Class describing configuration of the 'chat' sub-command.",
         "properties": {
            "model": {
               "description": "Model to be used for chatting with.",
               "title": "Model",
               "type": "string"
            },
            "vi_mode": {
               "default": false,
               "description": "Enable vim keybindings for chat.",
               "title": "Vi Mode",
               "type": "boolean"
            },
            "visible_overflow": {
               "default": true,
               "description": "Renders vertical overflow if enabled, displays ellipses otherwise.",
               "title": "Visible Overflow",
               "type": "boolean"
            },
            "context": {
               "default": "default",
               "description": "Predefined setting or environment that influences the behavior and responses of the chat assistant. Each context is associated with a specific prompt that guides the assistant on how to respond to user inputs. Available contexts: default, cli_helper.",
               "title": "Context",
               "type": "string"
            },
            "session": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Filepath of a dialog session file.",
               "title": "Session"
            },
            "logs_dir": {
               "description": "Directory where chat logs are stored.",
               "title": "Logs Dir",
               "type": "string"
            },
            "max_tokens": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "The maximum number of tokens that can be generated in the chat completion. Be aware that larger values use more memory.",
               "title": "Max Tokens"
            },
            "temperature": {
               "default": 1.0,
               "description": "Controls the randomness of the model's responses. Lower values make the output more deterministic, while higher values produce more random results.",
               "title": "Temperature",
               "type": "number"
            }
         },
         "title": "_chat",
         "type": "object"
      },
      "_evaluate": {
         "description": "Class describing configuration of the 'evaluate' sub-command.",
         "properties": {
            "model": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Model to be evaluated",
               "title": "Model"
            },
            "base_model": {
               "default": "instructlab/granite-7b-lab",
               "description": "Base model to compare with 'model' for mt_bench_branch and mmlu_branch.",
               "title": "Base Model",
               "type": "string"
            },
            "branch": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Taxonomy branch containing custom skills/knowledge that should be used for evaluation runs.",
               "title": "Branch"
            },
            "base_branch": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Base taxonomy branch",
               "title": "Base Branch"
            },
            "gpus": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Number of GPUs to use for running evaluation.",
               "title": "Gpus"
            },
            "mmlu": {
               "$ref": "#/$defs/_mmlu",
               "description": "MMLU benchmarking settings"
            },
            "mmlu_branch": {
               "$ref": "#/$defs/_mmlubranch",
               "description": "Settings to run MMLU against a branch of taxonomy containing custom skills/knowledge used for training."
            },
            "mt_bench": {
               "$ref": "#/$defs/_mtbench",
               "description": "Multi-turn benchmarking settings for skills."
            },
            "mt_bench_branch": {
               "$ref": "#/$defs/_mtbenchbranch",
               "description": "Settings to run MT-Bench against a branch of taxonomy containing custom skills/knowledge used for training"
            }
         },
         "title": "_evaluate",
         "type": "object"
      },
      "_general": {
         "description": "Class describing various top-level configuration options for all commands.",
         "properties": {
            "log_level": {
               "default": "INFO",
               "description": "Log level for logging.",
               "title": "Log Level",
               "type": "string"
            },
            "debug_level": {
               "default": 0,
               "description": "Debug level for logging.",
               "title": "Debug Level",
               "type": "integer"
            },
            "log_format": {
               "default": "%(levelname)s %(asctime)s %(name)s:%(lineno)d: %(message)s",
               "description": "Log format. https://docs.python.org/3/library/logging.html#logrecord-attributes",
               "title": "Log Format",
               "type": "string"
            },
            "use_legacy_tmpl": {
               "default": false,
               "description": "Use legacy IBM Granite chat template (default uses 3.0 Instruct template)",
               "title": "Use Legacy Tmpl",
               "type": "boolean"
            }
         },
         "title": "_general",
         "type": "object"
      },
      "_generate": {
         "description": "Class describing configuration of the 'generate' sub-command.",
         "properties": {
            "pipeline": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": "full",
               "description": "Data generation pipeline to use. Available: 'simple', 'full', or a valid path to a directory of pipeline workflow YAML files. Note that 'full' requires a larger teacher model, Mixtral-8x7b.",
               "title": "Pipeline"
            },
            "max_num_tokens": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": 4096,
               "description": "The maximum amount of tokens for the model to generate during knowledge generation. A lower number yields less data but a faster SDG run. It is reccomended to use this on consumer hardware",
               "title": "Max Num Tokens"
            },
            "model": {
               "description": "Teacher model that will be used to synthetically generate training data.",
               "title": "Model",
               "type": "string"
            },
            "taxonomy_path": {
               "description": "Directory where taxonomy is stored and accessed from.",
               "title": "Taxonomy Path",
               "type": "string"
            },
            "taxonomy_base": {
               "default": "origin/main",
               "description": "Branch of taxonomy used to calculate diff against.",
               "title": "Taxonomy Base",
               "type": "string"
            },
            "teacher": {
               "$ref": "#/$defs/_serve",
               "description": "Teacher configuration"
            },
            "num_cpus": {
               "default": 10,
               "description": "Number of CPU cores to use for generation.",
               "exclusiveMinimum": 0,
               "title": "Num Cpus",
               "type": "integer"
            },
            "chunk_word_count": {
               "default": 1000,
               "description": "Maximum number of words per chunk.",
               "exclusiveMinimum": 0,
               "title": "Chunk Word Count",
               "type": "integer"
            },
            "num_instructions": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": -1,
               "deprecated": true,
               "description": "Number of instructions to use",
               "title": "Num Instructions"
            },
            "sdg_scale_factor": {
               "anyOf": [
                  {
                     "exclusiveMinimum": 0,
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": 30,
               "description": "The total number of instructions to be generated.",
               "title": "Sdg Scale Factor"
            },
            "output_dir": {
               "description": "Directory where generated datasets are stored.",
               "title": "Output Dir",
               "type": "string"
            }
         },
         "title": "_generate",
         "type": "object"
      },
      "_metadata": {
         "properties": {
            "cpu_info": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Manufacturer, Family, and SKU of the system CPU, ex: Apple M3 Max",
               "title": "Cpu Info"
            },
            "gpu_manufacturer": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Manufacturer of the system GPU, ex: Nvidia",
               "title": "Gpu Manufacturer"
            },
            "gpu_family": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Family of the system GPU, ex: H100",
               "title": "Gpu Family"
            },
            "gpu_count": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Amount of GPUs on the system, ex: 8",
               "title": "Gpu Count"
            },
            "gpu_sku": {
               "anyOf": [
                  {
                     "items": {
                        "type": "string"
                     },
                     "type": "array"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Specific SKU related information about the given GPU, ex: PCIe, NVL",
               "title": "Gpu Sku"
            }
         },
         "title": "_metadata",
         "type": "object"
      },
      "_mmlu": {
         "description": "Class describing configuration of MMLU evaluation benchmark.",
         "properties": {
            "few_shots": {
               "default": 5,
               "description": "Number of question-answer pairs provided in the context preceding the question used for evaluation.",
               "title": "Few Shots",
               "type": "integer"
            },
            "batch_size": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "integer"
                  }
               ],
               "default": "auto",
               "description": "Batch size for evaluation. Valid values are a positive integer or 'auto' to select the largest batch size that will fit in memory.",
               "title": "Batch Size"
            }
         },
         "title": "_mmlu",
         "type": "object"
      },
      "_mmlubranch": {
         "description": "Class describing configuration of MMLUBranch evaluation benchmark.",
         "properties": {
            "tasks_dir": {
               "description": "Directory where custom MMLU tasks are stored.",
               "title": "Tasks Dir",
               "type": "string"
            }
         },
         "title": "_mmlubranch",
         "type": "object"
      },
      "_mtbench": {
         "description": "Class describing configuration of MTBench evaluation benchmark.",
         "properties": {
            "judge_model": {
               "description": "Judge model for mt_bench and mt_bench_branch.",
               "title": "Judge Model",
               "type": "string"
            },
            "output_dir": {
               "description": "Directory where evaluation results are stored.",
               "title": "Output Dir",
               "type": "string"
            },
            "max_workers": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "integer"
                  }
               ],
               "default": "auto",
               "description": "Number of workers to use for evaluation with mt_bench or mt_bench_branch. Must be a positive integer or 'auto'.",
               "title": "Max Workers"
            }
         },
         "title": "_mtbench",
         "type": "object"
      },
      "_mtbenchbranch": {
         "description": "Class describing configuration of MTBenchBranch evaluation benchmark.",
         "properties": {
            "taxonomy_path": {
               "description": "Path to where base taxonomy is stored.",
               "title": "Taxonomy Path",
               "type": "string"
            }
         },
         "title": "_mtbenchbranch",
         "type": "object"
      },
      "_serve": {
         "description": "Class describing configuration of the 'serve' sub-command.",
         "properties": {
            "vllm": {
               "$ref": "#/$defs/_serve_vllm",
               "description": "vLLM serving settings."
            },
            "llama_cpp": {
               "$ref": "#/$defs/_serve_llama_cpp",
               "description": "llama-cpp serving settings."
            },
            "model_path": {
               "description": "Directory where model to be served is stored.",
               "title": "Model Path",
               "type": "string"
            },
            "server": {
               "$ref": "#/$defs/_serve_server",
               "default": {
                  "host": "127.0.0.1",
                  "port": 8000
               },
               "description": "Server configuration including host and port."
            },
            "chat_template": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Chat template to supply to the model. Possible values: 'auto'(default), 'tokenizer', a path to a jinja2 file.",
               "examples": [
                  "auto",
                  "tokenizer",
                  "A filesystem path expressing the location of a custom template"
               ],
               "title": "Chat Template"
            },
            "backend": {
               "anyOf": [
                  {
                     "pattern": "vllm|llama-cpp",
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Serving backend to use to host the model.",
               "examples": [
                  "vllm",
                  "llama-cpp"
               ],
               "title": "Backend"
            }
         },
         "title": "_serve",
         "type": "object"
      },
      "_serve_llama_cpp": {
         "description": "Class describing configuration of llama-cpp serving backend.",
         "properties": {
            "gpu_layers": {
               "default": -1,
               "description": "Number of model layers to offload to GPU. -1 means all layers.",
               "title": "Gpu Layers",
               "type": "integer"
            },
            "max_ctx_size": {
               "default": 4096,
               "description": "Maximum number of tokens that can be processed by the model.",
               "exclusiveMinimum": 0,
               "title": "Max Ctx Size",
               "type": "integer"
            },
            "llm_family": {
               "default": "",
               "description": "Large Language Model Family",
               "examples": [
                  "granite",
                  "mixtral"
               ],
               "title": "Llm Family",
               "type": "string"
            }
         },
         "title": "_serve_llama_cpp",
         "type": "object"
      },
      "_serve_server": {
         "description": "Class describing configuration of server serving backend.",
         "properties": {
            "host": {
               "default": "127.0.0.1",
               "description": "Host to serve on.",
               "title": "Host",
               "type": "string"
            },
            "port": {
               "default": 8000,
               "description": "Port to serve on.",
               "title": "Port",
               "type": "integer"
            }
         },
         "title": "_serve_server",
         "type": "object"
      },
      "_serve_vllm": {
         "description": "Class describing configuration of vLLM serving backend.",
         "properties": {
            "llm_family": {
               "default": "",
               "description": "Large Language Model Family",
               "examples": [
                  "granite",
                  "mixtral"
               ],
               "title": "Llm Family",
               "type": "string"
            },
            "max_startup_attempts": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": 120,
               "description": "Maximum number of attempts to start the vLLM server.",
               "title": "Max Startup Attempts"
            },
            "gpus": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Number of GPUs to use.",
               "title": "Gpus"
            },
            "vllm_args": {
               "anyOf": [
                  {
                     "items": {
                        "type": "string"
                     },
                     "type": "array"
                  },
                  {
                     "type": "null"
                  }
               ],
               "description": "vLLM specific arguments. All settings can be passed as a list of strings, see: https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html",
               "examples": [
                  [
                     "--dtype",
                     "auto"
                  ],
                  [
                     "--lora-alpha",
                     "32"
                  ]
               ],
               "title": "Vllm Args"
            }
         },
         "title": "_serve_vllm",
         "type": "object"
      },
      "_train": {
         "description": "Class describing configuration of the 'train' sub-command.",
         "properties": {
            "pipeline": {
               "default": "full",
               "description": "Training pipeline to use. Simple is for systems with limited resources, full is for more capable consumer systems (64 GB of RAM), and accelerated is for systems with a dedicated GPU.",
               "examples": [
                  "simple",
                  "full",
                  "accelerated"
               ],
               "pattern": "simple|full|accelerated",
               "title": "Pipeline",
               "type": "string"
            },
            "model_path": {
               "default": "instructlab/granite-7b-lab",
               "description": "Directory where the model to be trained is stored.",
               "title": "Model Path",
               "type": "string"
            },
            "device": {
               "default": "cpu",
               "description": "PyTorch device to use. Use 'cpu' for 'simple' and 'full' training on Linux. Use 'mps' for 'full' training on MacOS Metal Performance Shader. Use 'cuda' for Nvidia CUDA / AMD ROCm GPUs. Use 'hpu' for Intel Gaudi GPUs.",
               "examples": [
                  "cpu",
                  "mps",
                  "cuda",
                  "hpu"
               ],
               "pattern": "cpu|mps|cuda|hpu",
               "title": "Device",
               "type": "string"
            },
            "data_path": {
               "description": "For the training library (primary training method), this specifies the path to the dataset file. For legacy training (MacOS/Linux), this specifies the path to the directory.",
               "title": "Data Path",
               "type": "string"
            },
            "ckpt_output_dir": {
               "description": "Directory where periodic training checkpoints are stored.",
               "title": "Ckpt Output Dir",
               "type": "string"
            },
            "data_output_dir": {
               "description": "Directory where the processed training data is stored (post filtering/tokenization/masking).",
               "title": "Data Output Dir",
               "type": "string"
            },
            "max_seq_len": {
               "default": 4096,
               "description": "Maximum sequence length to be included in the training set. Samples exceeding this length will be dropped.",
               "title": "Max Seq Len",
               "type": "integer"
            },
            "max_batch_len": {
               "default": 5000,
               "description": "Maximum tokens per gpu for each batch that will be handled in a single step. If running into out-of-memory errors, this value can be lowered but not below the `max_seq_len`.",
               "title": "Max Batch Len",
               "type": "integer"
            },
            "num_epochs": {
               "default": 10,
               "description": "Number of epochs to run training for.",
               "title": "Num Epochs",
               "type": "integer"
            },
            "effective_batch_size": {
               "default": 64,
               "description": "The number of samples in a batch that the model should see before its parameters are updated.",
               "title": "Effective Batch Size",
               "type": "integer"
            },
            "save_samples": {
               "default": 250000,
               "description": "Number of samples the model should see before saving a checkpoint.",
               "title": "Save Samples",
               "type": "integer"
            },
            "checkpoint_at_epoch": {
               "default": true,
               "description": "Save a checkpoint at the end of each epoch.",
               "title": "Checkpoint At Epoch",
               "type": "boolean"
            },
            "deepspeed_cpu_offload_optimizer": {
               "default": false,
               "description": "Allow CPU offload for deepspeed optimizer.",
               "title": "Deepspeed Cpu Offload Optimizer",
               "type": "boolean"
            },
            "fsdp_cpu_offload_optimizer": {
               "default": false,
               "description": "Allow CPU offload for FSDP optimizer.",
               "title": "Fsdp Cpu Offload Optimizer",
               "type": "boolean"
            },
            "distributed_backend": {
               "$ref": "#/$defs/DistributedBackend",
               "default": "fsdp",
               "description": "Pick a distributed training backend framework for GPU accelerated full fine-tuning."
            },
            "lora_rank": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": 0,
               "description": "Rank of low rank matrices to be used during training.",
               "title": "Lora Rank"
            },
            "lora_quantize_dtype": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": "nf4",
               "description": "The data type for quantization in LoRA training. Valid options are 'None' and 'nf4'.",
               "examples": [
                  "nf4"
               ],
               "title": "Lora Quantize Dtype"
            },
            "is_padding_free": {
               "default": false,
               "description": "Boolean to indicate if the model being trained is a padding-free transformer model such as Granite.",
               "title": "Is Padding Free",
               "type": "boolean"
            },
            "nproc_per_node": {
               "default": 1,
               "description": "Number of GPUs to use for training. This value is not supported in legacy training or MacOS.",
               "title": "Nproc Per Node",
               "type": "integer"
            },
            "disable_flash_attn": {
               "anyOf": [
                  {
                     "type": "boolean"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": false,
               "description": "Whether or not we should disable the use of flash-attention during training. This is useful when using older GPUs.",
               "title": "Disable Flash Attn"
            },
            "additional_args": {
               "description": "Additional arguments to pass to the training script. These arguments are passed as key-value pairs to the training script.",
               "title": "Additional Args",
               "type": "object"
            },
            "phased_phase1_num_epochs": {
               "anyOf": [
                  {
                     "exclusiveMinimum": 0,
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": 7,
               "description": "Number of epochs to run training for during phase1 (experimentally optimal number is 7).",
               "title": "Phased Phase1 Num Epochs"
            },
            "phased_phase1_samples_per_save": {
               "default": 0,
               "description": "Number of samples the model should see before saving a checkpoint during phase1. Disabled when set to 0.",
               "minimum": 0,
               "title": "Phased Phase1 Samples Per Save",
               "type": "integer"
            },
            "phased_phase1_learning_rate": {
               "default": 2e-05,
               "description": "Learning rate for phase1 knowledge training.",
               "minimum": 0.0,
               "title": "Phased Phase1 Learning Rate",
               "type": "number"
            },
            "phased_phase1_effective_batch_size": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": 128,
               "description": "Phased phase1 effective batch size.",
               "title": "Phased Phase1 Effective Batch Size"
            },
            "phased_phase2_num_epochs": {
               "anyOf": [
                  {
                     "exclusiveMinimum": 0,
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": 10,
               "description": "Number of epochs to run training for during phase2.",
               "title": "Phased Phase2 Num Epochs"
            },
            "phased_phase2_samples_per_save": {
               "default": 0,
               "description": "Number of samples the model should see before saving a checkpoint during phase2. Disabled when set to 0.",
               "minimum": 0,
               "title": "Phased Phase2 Samples Per Save",
               "type": "integer"
            },
            "phased_phase2_learning_rate": {
               "default": 6e-06,
               "description": "Learning rate for phase2 skills training.",
               "minimum": 0.0,
               "title": "Phased Phase2 Learning Rate",
               "type": "number"
            },
            "phased_phase2_effective_batch_size": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": 3840,
               "description": "Phased phase2 effective batch size.",
               "title": "Phased Phase2 Effective Batch Size"
            },
            "phased_mt_bench_judge": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "description": "Judge model path for phased MT-Bench evaluation.",
               "title": "Phased Mt Bench Judge"
            },
            "phased_base_dir": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "description": "Base directory for organization of end-to-end intermediate outputs.",
               "title": "Phased Base Dir"
            },
            "training_journal": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Optional path to a yaml file that tracks the progress of multiphase training.",
               "title": "Training Journal"
            }
         },
         "title": "_train",
         "type": "object"
      }
   }
}

Fields:
field chat: _chat [Optional]

Chat configuration section.

field evaluate: _evaluate [Optional]

Evaluate configuration section.

field general: _general [Optional]

General configuration section.

field generate: _generate [Optional]

Generate configuration section.

field metadata: _metadata [Optional]

Metadata pertaining to the specifics of the system to which the configuration is meant to be applied.

field serve: _serve [Optional]

Serve configuration section.

field train: _train [Optional]

Train configuration section.

field version: str = '1.0.0'

Configuration file structure version.

General

pydantic model instructlab.configuration._general

Class describing various top-level configuration options for all commands.

Fields:
field debug_level: int = 0

Debug level for logging.

Validated by:
  • after_debug_level

field log_format: Annotated[str, Strict(strict=True)] = '%(levelname)s %(asctime)s %(name)s:%(lineno)d: %(message)s'

Log format. https://docs.python.org/3/library/logging.html#logrecord-attributes

Constraints:
  • strict = True

Validated by:
  • after_debug_level

  • validate_log_format

field log_level: Annotated[str, Strict(strict=True)] = 'INFO'

Log level for logging.

Constraints:
  • strict = True

Validated by:
  • after_debug_level

  • validate_log_level

field use_legacy_tmpl: bool = False

Use legacy IBM Granite chat template (default uses 3.0 Instruct template)

Validated by:
  • after_debug_level
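
Putting this together, a general section written out in config.yaml with the documented defaults might look like the following sketch (every key is optional; these are the values used when a key is absent):

   general:
     log_level: INFO
     debug_level: 0
     log_format: '%(levelname)s %(asctime)s %(name)s:%(lineno)d: %(message)s'
     use_legacy_tmpl: false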

Metadata

pydantic model instructlab.configuration._metadata

Fields:
field cpu_info: str | None = None

Manufacturer, Family, and SKU of the system CPU, ex: Apple M3 Max

field gpu_count: int | None = None

Number of GPUs on the system, ex: 8

field gpu_family: str | None = None

Family of the system GPU, ex: H100

field gpu_manufacturer: str | None = None

Manufacturer of the system GPU, ex: Nvidia

field gpu_sku: list[str] | None = None

Specific SKU-related information about the given GPU, ex: PCIe, NVL
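
As a sketch, a filled-in metadata section could look like the following. The values are taken from the example strings in the field descriptions above and do not describe one coherent machine; every field defaults to null when the information is unknown:

   metadata:
     cpu_info: Apple M3 Max
     gpu_manufacturer: Nvidia
     gpu_family: H100
     gpu_count: 8
     gpu_sku:
       - PCIe
       - NVL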

ilab model chat

pydantic model instructlab.configuration._chat

Class describing configuration of the 'chat' sub-command.

Fields:
field context: str = 'default'

Predefined setting or environment that influences the behavior and responses of the chat assistant. Each context is associated with a specific prompt that guides the assistant on how to respond to user inputs. Available contexts: default, cli_helper.

field logs_dir: str [Optional]

Directory where chat logs are stored.

field max_tokens: int | None = None

The maximum number of tokens that can be generated in the chat completion. Be aware that larger values use more memory.

field model: str [Optional]

Model to be used for chatting with.

field session: str | None = None

Filepath of a dialog session file.

field temperature: float = 1.0

Controls the randomness of the model's responses. Lower values make the output more deterministic, while higher values produce more random results.

field vi_mode: bool = False

Enable vim keybindings for chat.

field visible_overflow: bool = True

Renders vertical overflow if enabled, displays ellipses otherwise.
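
A chat section spelled out with the documented defaults might look like this sketch (the model path is a hypothetical placeholder; logs_dir is omitted because its default is generated at runtime):

   chat:
     model: models/granite-7b-lab   # hypothetical path
     vi_mode: false
     visible_overflow: true
     context: default
     session: null
     max_tokens: null
     temperature: 1.0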

ilab model evaluate

pydantic model instructlab.configuration._evaluate

Class describing configuration of the 'evaluate' sub-command.

Fields:
field base_branch: str | None = None

Base taxonomy branch

field base_model: str = 'instructlab/granite-7b-lab'

Base model to compare with 'model' for mt_bench_branch and mmlu_branch.

field branch: str | None = None

Taxonomy branch containing custom skills/knowledge that should be used for evaluation runs.

field gpus: int | None = None

Number of GPUs to use for running evaluation.

field mmlu: _mmlu [Optional]

MMLU benchmarking settings

field mmlu_branch: _mmlubranch [Optional]

Settings to run MMLU against a branch of taxonomy containing custom skills/knowledge used for training.

field model: str | None = None

Model to be evaluated

field mt_bench: _mtbench [Optional]

Multi-turn benchmarking settings for skills.

field mt_bench_branch: _mtbenchbranch [Optional]

Settings to run MT-Bench against a branch of taxonomy containing custom skills/knowledge used for training

pydantic model instructlab.configuration._mmlu

Class describing configuration of MMLU evaluation benchmark.

Fields:
field batch_size: str | int = 'auto'

Batch size for evaluation. Valid values are a positive integer or 'auto' to select the largest batch size that will fit in memory.

field few_shots: int = 5

Number of question-answer pairs provided in the context preceding the question used for evaluation.

pydantic model instructlab.configuration._mmlubranch

Class describing configuration of MMLUBranch evaluation benchmark.

Fields:
field tasks_dir: str [Optional]

Directory where custom MMLU tasks are stored.

pydantic model instructlab.configuration._mtbench

Class describing configuration of MTBench evaluation benchmark.

Fields:
field judge_model: str [Optional]

Judge model for mt_bench and mt_bench_branch.

field max_workers: str | int = 'auto'

Number of workers to use for evaluation with mt_bench or mt_bench_branch. Must be a positive integer or 'auto'.

field output_dir: str [Optional]

Directory where evaluation results are stored.

pydantic model instructlab.configuration._mtbenchbranch

Class describing configuration of MTBenchBranch evaluation benchmark.

Fields:
field taxonomy_path: str [Optional]

Path to where base taxonomy is stored.
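
Combining the models above, an evaluate section with its nested benchmark settings might be sketched as follows. The paths are hypothetical placeholders; unset optional fields stay null:

   evaluate:
     model: null
     base_model: instructlab/granite-7b-lab
     branch: null
     base_branch: null
     gpus: null
     mmlu:
       few_shots: 5
       batch_size: auto
     mmlu_branch:
       tasks_dir: datasets/mmlu_tasks    # hypothetical path
     mt_bench:
       judge_model: models/judge-model   # hypothetical path
       output_dir: eval-results          # hypothetical path
       max_workers: auto
     mt_bench_branch:
       taxonomy_path: taxonomy           # hypothetical path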

ilab data generate

pydantic model instructlab.configuration._generate

Class describing configuration of the 'generate' sub-command.

Fields:
field chunk_word_count: Annotated[int, Gt(gt=0)] = 1000

Maximum number of words per chunk.

Constraints:
  • gt = 0

field max_num_tokens: int | None = 4096

The maximum number of tokens for the model to generate during knowledge generation. A lower number yields less data but a faster SDG run; lowering it is recommended on consumer hardware.

field model: Annotated[str, Strict(strict=True)] [Optional]

Teacher model that will be used to synthetically generate training data.

Constraints:
  • strict = True

field num_cpus: Annotated[int, Gt(gt=0)] = 10

Number of CPU cores to use for generation.

Constraints:
  • gt = 0

field output_dir: Annotated[str, Strict(strict=True)] [Optional]

Directory where generated datasets are stored.

Constraints:
  • strict = True

field pipeline: str | None = 'full'

Data generation pipeline to use. Available: 'simple', 'full', or a valid path to a directory of pipeline workflow YAML files. Note that 'full' requires a larger teacher model, Mixtral-8x7b.

field sdg_scale_factor: Annotated[int, Gt(gt=0)] | None = 30

The total number of instructions to be generated.

field taxonomy_base: Annotated[str, Strict(strict=True)] = 'origin/main'

Branch of taxonomy used to calculate diff against.

Constraints:
  • strict = True

field taxonomy_path: Annotated[str, Strict(strict=True)] [Optional]

Directory where taxonomy is stored and accessed from.

Constraints:
  • strict = True

field teacher: _serve [Optional]

Teacher configuration

num_instructions: int | None

Data descriptor used to emit a runtime deprecation warning before accessing a deprecated field.

msg

The deprecation message to be emitted.

wrapped_property

The property instance if the deprecated field is a computed field, or None.

field_name

The name of the field being deprecated.
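
A generate section using the documented defaults might look like this sketch. The model and directory paths are hypothetical placeholders, and the nested teacher block (omitted here) follows the serve schema described in the next section:

   generate:
     model: models/teacher-model   # hypothetical path
     pipeline: full
     max_num_tokens: 4096
     taxonomy_path: taxonomy       # hypothetical path
     taxonomy_base: origin/main
     num_cpus: 10
     chunk_word_count: 1000
     sdg_scale_factor: 30
     output_dir: datasets          # hypothetical path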

ilab model serve

pydantic model instructlab.configuration._serve

Class describing configuration of the 'serve' sub-command.

Fields:
field backend: str | None = None

Serving backend to use to host the model.

Constraints:
  • pattern = vllm|llama-cpp

field chat_template: str | None = None

Chat template to supply to the model. Possible values: 'auto' (default), 'tokenizer', a path to a jinja2 file.

field llama_cpp: _serve_llama_cpp [Optional]

llama-cpp serving settings.

field model_path: Annotated[str, Strict(strict=True)] [Optional]

Directory where model to be served is stored.

Constraints:
  • strict = True

field server: _serve_server = _serve_server(host='127.0.0.1', port=8000)

Server configuration including host and port.

field vllm: _serve_vllm [Optional]

vLLM serving settings.

api_base()

Returns the server API URL, based on the configured host and port.

pydantic model instructlab.configuration._serve_llama_cpp

Class describing configuration of llama-cpp serving backend.

Fields:
field gpu_layers: int = -1

Number of model layers to offload to GPU. -1 means all layers.

field llm_family: str = ''

Large Language Model Family

field max_ctx_size: Annotated[int, Gt(gt=0)] = 4096

Maximum number of tokens that can be processed by the model.

Constraints:
  • gt = 0

pydantic model instructlab.configuration._serve_vllm

Class describing configuration of vLLM serving backend.

Fields:
field gpus: int | None = None

Number of GPUs to use.

field llm_family: str = ''

Large Language Model Family

field max_startup_attempts: int | None = 120

Maximum number of attempts to start the vLLM server.

field vllm_args: list[str] | None [Optional]

vLLM specific arguments. All settings can be passed as a list of strings, see: https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html

pydantic model instructlab.configuration._serve_server

Class describing configuration of server serving backend.

Fields:
field host: Annotated[str, Strict(strict=True)] = '127.0.0.1'

Host to serve on.

Constraints:
  • strict = True

field port: Annotated[int, Strict(strict=True)] = 8000

Port to serve on.

Constraints:
  • strict = True
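
Taken together, a serve section with both backend blocks filled in might be sketched like this. The model path is a hypothetical placeholder; with the default server values, api_base() builds the URL from 127.0.0.1:8000. The vllm_args values are the example arguments from the schema above:

   serve:
     model_path: models/granite-7b-lab   # hypothetical path
     backend: vllm                       # or llama-cpp
     chat_template: auto
     server:
       host: 127.0.0.1
       port: 8000
     vllm:
       llm_family: ''
       max_startup_attempts: 120
       gpus: null
       vllm_args:
         - --dtype
         - auto
     llama_cpp:
       gpu_layers: -1
       max_ctx_size: 4096
       llm_family: ''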

ilab model train

pydantic model instructlab.configuration._train

Class describing configuration of the 'train' sub-command.

Fields:
field additional_args: dict[str, Any] [Optional]

Additional arguments to pass to the training script. These arguments are passed as key-value pairs to the training script.

field checkpoint_at_epoch: bool = True

Save a checkpoint at the end of each epoch.

field ckpt_output_dir: str [Optional]

Directory where periodic training checkpoints are stored.

field data_output_dir: str [Optional]

Directory where the processed training data is stored (post filtering/tokenization/masking).

field data_path: str [Optional]

For the training library (primary training method), this specifies the path to the dataset file. For legacy training (MacOS/Linux), this specifies the path to the directory.

field deepspeed_cpu_offload_optimizer: bool = False

Allow CPU offload for deepspeed optimizer.

field device: str = 'cpu'

PyTorch device to use. Use 'cpu' for 'simple' and 'full' training on Linux. Use 'mps' for 'full' training on MacOS Metal Performance Shader. Use 'cuda' for Nvidia CUDA / AMD ROCm GPUs. Use 'hpu' for Intel Gaudi GPUs.

Constraints:
  • pattern = cpu|mps|cuda|hpu

field disable_flash_attn: bool | None = False

Whether or not we should disable the use of flash-attention during training. This is useful when using older GPUs.

field distributed_backend: DistributedBackend = DistributedBackend.FSDP

Pick a distributed training backend framework for GPU accelerated full fine-tuning.

field effective_batch_size: int = 64

The number of samples in a batch that the model should see before its parameters are updated.

field fsdp_cpu_offload_optimizer: bool = False

Allow CPU offload for FSDP optimizer.

field is_padding_free: bool = False

Boolean to indicate if the model being trained is a padding-free transformer model such as Granite.

field lora_quantize_dtype: str | None = 'nf4'

The data type for quantization in LoRA training. Valid options are 'None' and 'nf4'.

field lora_rank: int | None = 0

Rank of low rank matrices to be used during training.

field max_batch_len: int = 5000

Maximum tokens per GPU for each batch that will be handled in a single step. If running into out-of-memory errors, this value can be lowered but not below the max_seq_len.

field max_seq_len: int = 4096

Maximum sequence length to be included in the training set. Samples exceeding this length will be dropped.

field model_path: str = 'instructlab/granite-7b-lab'

Directory where the model to be trained is stored.

field nproc_per_node: int = 1

Number of GPUs to use for training. This value is not supported in legacy training or MacOS.

field num_epochs: int = 10

Number of epochs to run training for.

field phased_base_dir: str | None [Optional]

Base directory for organization of end-to-end intermediate outputs.

field phased_mt_bench_judge: str | None [Optional]

Judge model path for phased MT-Bench evaluation.

field phased_phase1_effective_batch_size: int | None = 128

Phased phase1 effective batch size.

field phased_phase1_learning_rate: float = 2e-05

Learning rate for phase1 knowledge training.

Constraints:
  • ge = 0

field phased_phase1_num_epochs: int | None = 7

Number of epochs to run training for during phase1 (experimentally optimal number is 7).

Constraints:
  • gt = 0

field phased_phase1_samples_per_save: int = 0

Number of samples the model should see before saving a checkpoint during phase1. Disabled when set to 0.

Constraints:
  • ge = 0

field phased_phase2_effective_batch_size: int | None = 3840

Phased phase2 effective batch size.

field phased_phase2_learning_rate: float = 6e-06

Learning rate for phase2 skills training.

Constraints:
  • ge = 0

field phased_phase2_num_epochs: int | None = 10

Number of epochs to run training for during phase2.

Constraints:
  • gt = 0

field phased_phase2_samples_per_save: int = 0

Number of samples the model should see before saving a checkpoint during phase2. Disabled when set to 0.

Constraints:
  • ge = 0

field pipeline: str = 'full'

Training pipeline to use. Simple is for systems with limited resources, full is for more capable consumer systems (64 GB of RAM), and accelerated is for systems with a dedicated GPU.

Constraints:
  • pattern = simple|full|accelerated

field save_samples: int = 250000

Number of samples the model should see before saving a checkpoint.

field training_journal: str | None = None

Optional path to a yaml file that tracks the progress of multiphase training.
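
Finally, a train section built from the documented defaults might be sketched as follows. The directory and dataset paths are hypothetical placeholders, and the phased_* keys (omitted here) apply only to multiphase training:

   train:
     pipeline: full
     model_path: instructlab/granite-7b-lab
     device: cpu
     data_path: datasets/train.jsonl   # hypothetical path
     ckpt_output_dir: checkpoints      # hypothetical path
     data_output_dir: processed-data   # hypothetical path
     max_seq_len: 4096
     max_batch_len: 5000
     num_epochs: 10
     effective_batch_size: 64
     save_samples: 250000
     checkpoint_at_epoch: true
     distributed_backend: fsdp
     lora_rank: 0
     lora_quantize_dtype: nf4
     nproc_per_node: 1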