Configuration Reference¶

The CLI is driven by a single JSON configuration file. The mandatory mode field determines which operation runs and which parameters are available. This page documents every mode and its configuration.

Note

A Yes* in the "Required" column marks a conditionally required parameter. For example, checkpoint_filepath is required only if checkpoints is set to "y". The Pydantic models in config_schema.py are the ultimate source of truth for all validations.

Example Configuration¶

Here is an example of a config.json for the train_cnn mode:

{
  "mode": "train_cnn",
  "csv_path": "data/cleaves.csv",
  "img_folder": "data/images/",
  "image_shape": [224, 224, 3],
  "feature_shape": [5],
  "save_model_file": "models/my_cnn_model.keras",
  "max_epochs": 15,
  "cnn_mode": "bad_good",
  "classification_type": "binary",
  "num_classes": 1,

  "learning_rate": 0.005,
  "batch_size": 16,
  "buffer_size": 40,
  "test_size": 0.25,
  "objective": "val_accuracy",

  "brightness": 0.3,
  "height": 0.2,
  "width": 0.6,
  "contrast": 0.7,
  "rotation": 0.45,

  "angle_threshold": 0.5,
  "diameter_threshold": 125.0,
  "dense1": 32,
  "dense2": 16,
  "dropout1": 0.2,
  "dropout2": 0.2,
  "dropout3": 0.2,
  "checkpoints": "y",
  "checkpoint_filepath": "models/checkpoints/best_cnn.keras"
}
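
A file like this can be loaded and sanity-checked before it is handed to the CLI. The sketch below is illustrative only: `load_config` is a hypothetical helper, not part of the project, and it checks just two things, that `mode` is present and that no key appears twice (Python's `json` module otherwise silently keeps the last duplicate).

```python
import json


def load_config(text: str) -> dict:
    """Parse a JSON config string, rejecting duplicate keys and a missing mode.

    Hypothetical helper for illustration; the project's real validation
    lives in its Pydantic models.
    """

    def reject_duplicates(pairs):
        # json.loads calls this hook with the (key, value) pairs of each object.
        result = {}
        for key, value in pairs:
            if key in result:
                raise ValueError(f"duplicate key in config: {key!r}")
            result[key] = value
        return result

    config = json.loads(text, object_pairs_hook=reject_duplicates)
    if "mode" not in config:
        raise ValueError("config is missing the mandatory 'mode' field")
    return config


if __name__ == "__main__":
    cfg = load_config('{"mode": "train_cnn", "max_epochs": 15}')
    print(cfg["mode"])
```

In practice the text would come from reading config.json; a string is used here so the example stays self-contained.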

—

Common Parameters¶

Base Parameters¶

These settings form the foundation for almost every mode.

Parameter	Type	Required	Default	Description
`mode`	string	Yes		The operation to perform. Must be one of the documented modes.
`csv_path`	string (path)	Yes		Path to the input data CSV file.
`img_folder`	string (path)	Yes		Path to the folder containing all images.
`image_shape`	list[int]	Yes		Dimensions of the input image, e.g., `[224, 224, 3]`.
`feature_shape`	list[int]	No	null	Dimensions of the tabular feature vector. Required by some modes.
`set_mask`	string	No	null	Set to `"y"` to apply a circular background mask to the images.

Common Training Parameters¶

These settings are available in all training modes (e.g., train_cnn, train_mlp, train_image_only, train_xgboost).

Parameter	Type	Required	Default	Description
`save_model_file`	string (path)	No	null	Path to save the final trained model file. Recommended for all training modes.
`save_history_file`	string (path)	No	null	Path to save the training history to a CSV file.
`batch_size`	integer	No	8	Number of samples per gradient update.
`max_epochs`	integer	No	null	Maximum number of epochs to train for.
`learning_rate`	float	No	0.001	The learning rate for the optimizer.
`test_size`	float	No	0.2	Proportion of the dataset to use for the validation split (0.0 to 1.0).
`feature_scaler_path`	string (path)	No	null	Path to save the trained feature scaler.
`label_scaler_path`	string (path)	No	null	Path to save the trained label scaler (for regression).
`encoder_path`	string (path)	No	null	Path to save the trained label encoder (for classification).
`brightness`, `rotation`, `height`, `width`, `contrast`	float	No	0.0	Parameters for image data augmentation.
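
The defaults in this table can be pictured as a dictionary that the user's config is layered over. This is a minimal sketch under that assumption; `TRAINING_DEFAULTS` and `with_training_defaults` are hypothetical names, and the actual defaults are applied by the Pydantic models.

```python
# Non-null defaults copied from the table above.
TRAINING_DEFAULTS = {
    "batch_size": 8,
    "learning_rate": 0.001,
    "test_size": 0.2,
    "brightness": 0.0,
    "rotation": 0.0,
    "height": 0.0,
    "width": 0.0,
    "contrast": 0.0,
}


def with_training_defaults(config: dict) -> dict:
    """Return a copy of config with missing training parameters filled in."""
    merged = {**TRAINING_DEFAULTS, **config}  # user values win over defaults
    if not 0.0 <= merged["test_size"] <= 1.0:
        raise ValueError("test_size must be between 0.0 and 1.0")
    return merged


if __name__ == "__main__":
    print(with_training_defaults({"batch_size": 16, "test_size": 0.25}))
```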

Common Callback Parameters¶

These settings control keras.callbacks and are available in all TensorFlow-based training modes.

Early Stopping¶

Parameter	Type	Required	Default	Description
`early_stopping`	string	No	"n"	Set to `"y"` to enable.
`patience`	integer	No	3	Epochs with no improvement before stopping training.
`monitor`	string	No	"val_accuracy"	Metric to monitor (e.g., `val_loss`).
`method`	string	No	"max"	Direction of improvement. Use `max` for accuracy, `min` for loss.
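
One plausible way these settings reach Keras is as keyword arguments to `keras.callbacks.EarlyStopping`, which accepts `monitor`, `patience`, and `mode`. The sketch below only builds that argument dictionary so it runs without TensorFlow installed. The mapping of `method` to Keras's `mode` argument is an assumption, and `early_stopping_kwargs` is a hypothetical helper.

```python
from typing import Optional


def early_stopping_kwargs(config: dict) -> Optional[dict]:
    """Translate config keys into EarlyStopping keyword arguments.

    Returns None when early stopping is disabled.
    """
    if config.get("early_stopping", "n") != "y":
        return None
    return {
        "monitor": config.get("monitor", "val_accuracy"),
        "patience": config.get("patience", 3),
        # Assumption: the config's "method" corresponds to Keras's "mode".
        "mode": config.get("method", "max"),
    }


if __name__ == "__main__":
    print(early_stopping_kwargs({"early_stopping": "y", "monitor": "val_loss", "method": "min"}))
```

With TensorFlow available, the result could be passed on as `keras.callbacks.EarlyStopping(**kwargs)`.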

Model Checkpointing¶

Parameter	Type	Required	Default	Description
`checkpoints`	string	No	"n"	Set to `"y"` to enable model checkpointing.
`checkpoint_filepath`	string (path)	Yes*	null	Path to save the best model checkpoint. Required if `checkpoints` is `"y"`.
`monitor`	string	No	"val_accuracy"	Metric to monitor for saving the best model.
`method`	string	No	"max"	Direction of improvement (`max` for accuracy, `min` for loss).
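
The conditional requirement on `checkpoint_filepath` amounts to a cross-field check. The following is a plain-Python sketch of that rule; `check_checkpoint_settings` is a hypothetical name, and the real check is implemented in the project's Pydantic models.

```python
def check_checkpoint_settings(config: dict) -> None:
    """Raise ValueError if checkpointing is enabled without a target path."""
    enabled = config.get("checkpoints", "n") == "y"
    if enabled and not config.get("checkpoint_filepath"):
        raise ValueError('checkpoint_filepath is required when checkpoints is "y"')


if __name__ == "__main__":
    # Passes: checkpointing is enabled and a path is given.
    check_checkpoint_settings(
        {"checkpoints": "y", "checkpoint_filepath": "models/checkpoints/best_cnn.keras"}
    )
    # Passes: checkpointing is off, so no path is needed.
    check_checkpoint_settings({})
```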

—

Mode-Specific Parameters¶

Training Modes¶

train_cnn¶

Trains a hybrid model on a combination of images and tabular features.

Parameter	Type	Required	Default	Description
`cnn_mode`	string	Yes		The classification task. Can be `bad_good` or `multiclass`.
`classification_type`	string	Yes	binary	Must be `binary` or `multiclass`.
`num_classes`	integer	Yes		Number of output classes (e.g., 1 for binary, 5 for multiclass).
`feature_shape`	list[int]	Yes		Must be `[5]` for this mode.
`angle_threshold`	float	Yes		Threshold for angle-based classification logic.
`diameter_threshold`	float	Yes		Threshold for diameter-based classification logic.
`train_p`	float	Yes		Masking probability for training features.
`test_p`	float	Yes		Masking probability for testing features.
`dense1`, `dense2`	integer	Yes		Number of units in the two dense layers of the model head.
`dropout1`, `dropout2`, `dropout3`	float	Yes		Dropout rates for regularization.
`backbone`	string	No	efficientnet	The pre-trained CNN backbone (`resnet`, `mobilenet`, `efficientnet`).
`unfreeze_from`	integer	No	null	Layer index from which to unfreeze the backbone for fine-tuning.
`reduce_lr`	float	No	null	Factor to reduce learning rate on plateau (e.g. 0.2).
`reduce_lr_patience`	integer	No	null	Epochs to wait before reducing LR.
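
Because train_cnn has many required parameters, a quick pre-flight check can catch omissions before a run starts. This sketch encodes only what the table above states (required names, the allowed `cnn_mode` values, and the `[5]` feature shape); `check_train_cnn` is a hypothetical helper and does not replace the schema validation.

```python
TRAIN_CNN_REQUIRED = (
    "cnn_mode", "classification_type", "num_classes", "feature_shape",
    "angle_threshold", "diameter_threshold", "train_p", "test_p",
    "dense1", "dense2", "dropout1", "dropout2", "dropout3",
)
ALLOWED_CNN_MODES = ("bad_good", "multiclass")


def check_train_cnn(config: dict) -> list:
    """Return a list of problems found; an empty list means the check passed."""
    problems = [
        f"missing required parameter: {name}"
        for name in TRAIN_CNN_REQUIRED
        if name not in config
    ]
    if "feature_shape" in config and config["feature_shape"] != [5]:
        problems.append("feature_shape must be [5] for train_cnn")
    if "cnn_mode" in config and config["cnn_mode"] not in ALLOWED_CNN_MODES:
        problems.append("cnn_mode must be bad_good or multiclass")
    return problems


if __name__ == "__main__":
    for problem in check_train_cnn({"cnn_mode": "bad_good", "feature_shape": [5]}):
        print(problem)
```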

train_mlp¶

Trains an MLP regression model using features extracted from a pre-trained CNN.

Parameter	Type	Required	Default	Description
`model_path`	string (path)	Yes		Path to the pre-trained CNN model used for feature extraction.
`feature_shape`	list[int]	Yes		Must be `[4]` for this mode (the numerical features, excluding tension).
`angle_threshold`, `diameter_threshold`	float	Yes		Thresholds required for the data processing pipeline.
`dense1`, `dense2`, `dropout1`, etc.	float/int	Yes		Architecture parameters for the MLP model.
`reduce_lr`	float	No	null	Factor to reduce learning rate on plateau (e.g. 0.2).
`reduce_lr_patience`	integer	No	null	Epochs to wait before reducing LR.

train_image_only¶

Trains a classification model using only images as input.

Parameter	Type	Required	Default	Description
`backbone`	string	Yes		The pre-trained CNN backbone to use.
`classification_type`	string	Yes		Must be `binary` or `multiclass`.
`num_classes`	integer	Yes		Number of output classes.
`angle_threshold`, `diameter_threshold`	float	Yes		Thresholds for defining labels.
`dense1`, `dropout1`, `dropout2`, `l2_factor`	float/int	No	various	Architecture parameters for the model head.

train_xgboost¶

Trains an XGBoost regression model.

Parameter	Type	Required	Default	Description
`xgb_path`	string (path)	No	null	Path to save the trained XGBoost model (.pkl). Recommended.
`model_path`	string (path)	Yes		Path to the pre-trained CNN used for feature extraction.
`angle_threshold`, `diameter_threshold`	float	Yes		Thresholds for data processing.
`error_type`	string	Yes		The XGBoost objective function (e.g., reg:squarederror).
`n_estimators`	integer	No	200	Number of gradient boosted trees.
`max_depth`	integer	No	4	Maximum tree depth for base learners.
`gamma`, `subsample`, `reg_lambda`	float	No	various	Regularization and subsampling parameters for XGBoost.

Testing & Evaluation Modes¶

test_cnn & test_image_only¶

Tests a saved image-based classifier and generates evaluation reports.

Parameter	Type	Required	Default	Description
`model_path`	string (path)	Yes		Path to the trained classifier model (.keras).
`angle_threshold`, `diameter_threshold`	float	Yes		Thresholds used to generate the ground-truth labels for comparison.
`classification_path`	string (path)	No	null	Path to save the output CSV classification report.
`classification_threshold`	float	No	0.5	The probability threshold for binary classification.
`feature_scaler_path`	string (path)	No	null	Required for `test_cnn` if the model used scaled features.
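
The effect of `classification_threshold` on a binary classifier's output can be sketched in a few lines. The function name is hypothetical, and treating a probability exactly equal to the threshold as the positive class is an assumption; the project may use a strict comparison instead.

```python
def apply_classification_threshold(probabilities, threshold=0.5):
    """Map predicted probabilities to 0/1 labels.

    Assumption: values at or above the threshold count as the positive class.
    """
    return [1 if p >= threshold else 0 for p in probabilities]


if __name__ == "__main__":
    print(apply_classification_threshold([0.2, 0.5, 0.9]))
    print(apply_classification_threshold([0.2, 0.5, 0.9], threshold=0.7))
```

Raising the threshold trades recall for precision on the positive class, which is the usual reason to move it away from 0.5.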

test_mlp & test_xgboost¶

Tests a saved regression model and generates a performance report.

Parameter	Type	Required	Default	Description
`model_path`	string (path)	Yes		Path to the trained regressor (.keras for MLP) or feature extractor (.keras for XGBoost).
`xgb_path`	string (path)	Yes*	null	Required for `test_xgboost` mode. Path to the .pkl file.
`angle_threshold`, `diameter_threshold`	float	Yes		Thresholds for data processing.
`label_scaler_path`	string (path)	Yes*	null	Path to the saved label scaler used during training. Required.

Note

For the test_xgboost mode, the model_path parameter should point to the pre-trained CNN feature extractor model (.keras), not the XGBoost model itself.
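
To make the distinction between the two model paths concrete, here is a sketch of a test_xgboost configuration built in Python and serialized to JSON. All file paths are hypothetical placeholders, and the threshold values are simply reused from the example configuration at the top of this page.

```python
import json

# Hypothetical paths; substitute the files produced by your own training runs.
test_xgboost_config = {
    "mode": "test_xgboost",
    "csv_path": "data/cleaves.csv",
    "img_folder": "data/images/",
    "image_shape": [224, 224, 3],
    # The CNN feature extractor (.keras), not the XGBoost model.
    "model_path": "models/my_cnn_model.keras",
    # The trained XGBoost regressor (.pkl).
    "xgb_path": "models/my_xgb_model.pkl",
    "angle_threshold": 0.5,
    "diameter_threshold": 125.0,
    "label_scaler_path": "models/label_scaler.pkl",
}

if __name__ == "__main__":
    print(json.dumps(test_xgboost_config, indent=2))
```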

Advanced Modes¶

K-Fold Cross-Validation¶

The train_kfold_cnn and train_kfold_mlp modes are used for more robust model evaluation. They accept the exact same parameters as their non-k-fold counterparts (train_cnn and train_mlp respectively), with the addition of n_splits if you want to change the number of folds.
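
As a rough illustration of what `n_splits` controls, the sketch below partitions sample indices into k folds, using each fold once for validation and the rest for training. It assumes contiguous, unshuffled folds; the project's implementation may shuffle or stratify, and `kfold_indices` is a hypothetical name.

```python
def kfold_indices(n_samples: int, n_splits: int):
    """Yield (train_indices, val_indices) pairs for k-fold cross-validation."""
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


if __name__ == "__main__":
    for train, val in kfold_indices(10, 3):
        print(len(train), val)
```

Every sample lands in exactly one validation fold, so each model is evaluated on data it never trained on.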

Hyperparameter Tuning¶

The cnn_hyperparameter, mlp_hyperparameter, and image_hyperparameter modes are used to search for the best model architecture.

cnn_hyperparameter uses the same config as train_cnn.
image_hyperparameter uses the same config as train_image_only.
mlp_hyperparameter requires tuner_directory and project_name.

Visualization (grad_cam)¶

Generates a Grad-CAM heatmap to visualize which parts of an image the CNN is focusing on.

Parameter	Type	Required	Default	Description
`model_path`	string (path)	Yes		Path to the trained CNN model.
`img_path`	string (path)	Yes		Path to the specific image for visualization.
`test_features`	list[float]	Yes*	null	Required if the model takes numerical inputs.
`class_index`	int	Yes		Index of the output class to generate the heatmap for.
`backbone`	string	No	efficientnet	Name of the pre-trained backbone layer in the saved model (e.g., `mobilenet`).
`conv_layer_name`	string	No	null	Name of the target convolutional layer. If null, the last conv layer is used.
`heatmap_file`	string (path)	No	null	Path to save the output heatmap image.
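
The core Grad-CAM computation is small enough to sketch without a deep-learning framework. The example assumes that the activations of the target convolutional layer, and the gradients of the chosen class score with respect to them, are already available as nested lists shaped [height][width][channels]. `grad_cam_heatmap` is a hypothetical name; the project's implementation presumably operates on TensorFlow tensors and may differ in detail.

```python
def grad_cam_heatmap(activations, gradients):
    """Compute a normalized Grad-CAM heatmap from activations and gradients.

    Both inputs are nested lists shaped [height][width][channels].
    """
    height = len(activations)
    width = len(activations[0])
    channels = len(activations[0][0])

    # Channel weights: spatial average of the gradients for each channel.
    weights = [
        sum(gradients[y][x][c] for y in range(height) for x in range(width))
        / (height * width)
        for c in range(channels)
    ]

    # Weighted sum of activation maps, followed by ReLU.
    heatmap = [
        [
            max(0.0, sum(weights[c] * activations[y][x][c] for c in range(channels)))
            for x in range(width)
        ]
        for y in range(height)
    ]

    # Scale to [0, 1] unless the map is all zeros.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[value / peak for value in row] for row in heatmap]
    return heatmap


if __name__ == "__main__":
    acts = [[[1.0], [2.0]], [[3.0], [4.0]]]
    grads = [[[1.0], [1.0]], [[1.0], [1.0]]]
    print(grad_cam_heatmap(acts, grads))
```

The resulting low-resolution map is typically resized to the input image size and overlaid on it.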

Reinforcement Learning¶

train_rl & test_rl¶

Train or test an agent with reinforcement learning to predict optimal tension.

Parameter	Type	Required	Default	Description
`csv_path`	string (path)	Yes		Path to the CSV dataset.
`cnn_path`	string (path)	Yes		Path to the CNN classifier.
`img_folder`	string (path)	Yes		Path to the saved images.
`agent_path`	string (path)	Yes		Path to save (or load) trained agent.
`learning_rate`	float	Yes		Typical learning rate for ML.
`buffer_size`	int	Yes		Size of replay buffer.
`threshold`	float	Yes		Classification threshold.
`max_tension_change`	float	Yes		Maximum tension change per episode.
`batch_size`	int	No	256	Batch size for training.
`tau`	float	No	0.1	Soft-update coefficient for the target network.
`learning_rate`	float	No	0.0001	Size of steps to take during training.
`timesteps`	int	No	5000	Number of training rounds.
`low_range`	float	No	0.7	Low percentage of tension.
`high_range`	float	No	1.4	High percentage of tension.