Configuration Reference

The CLI is driven by a single JSON configuration file. The mode field is mandatory: it determines which operation is performed and which set of parameters is available. This page documents all available modes and their specific configurations.

Note

An asterisk (Yes*) in the "Required" column marks a parameter that is conditionally required. For example, checkpoint_filepath is required only if checkpoints is set to "y". The Pydantic models in config_schema.py are the ultimate source of truth for all validations.
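The conditional rule described in this note can be pictured with a small check. This is a minimal sketch in plain Python and not the contents of config_schema.py; the helper name missing_conditionals and the rule table are illustrative assumptions built only from the checkpoints example.

```python
# Hypothetical rule table: (trigger field, trigger value, field that becomes required).
CONDITIONAL_RULES = [
    ("checkpoints", "y", "checkpoint_filepath"),
]


def missing_conditionals(config):
    """Return the conditionally required fields that are absent from config."""
    missing = []
    for trigger, value, required in CONDITIONAL_RULES:
        # A field only becomes required when its trigger has the given value.
        if config.get(trigger) == value and config.get(required) is None:
            missing.append(required)
    return missing


print(missing_conditionals({"checkpoints": "y"}))
print(missing_conditionals({"checkpoints": "n"}))
```

The real schema may enforce more rules than this one; the Pydantic models remain authoritative.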

Example Configuration

Here is an example of a config.json for the train_cnn mode:

{
  "mode": "train_cnn",
  "csv_path": "data/cleaves.csv",
  "img_folder": "data/images/",
  "image_shape": [224, 224, 3],
  "feature_shape": [5],
  "save_model_file": "models/my_cnn_model.keras",
  "max_epochs": 15,
  "cnn_mode": "bad_good",
  "classification_type": "binary",
  "num_classes": 1,

  "learning_rate": 0.005,
  "batch_size": 16,
  "buffer_size": 40,
  "test_size": 0.25,
  "objective": "val_accuracy",

  "brightness": 0.3,
  "height": 0.2,
  "width": 0.6,
  "contrast": 0.7,
  "rotation": 0.45,

  "angle_threshold": 0.5,
  "diameter_threshold": 125.0,
  "dense1": 32,
  "dense2": 16,
  "dropout1": 0.2,
  "dropout2": 0.2,
  "dropout3": 0.2,
  "checkpoints": "y",
  "checkpoint_filepath": "models/checkpoints/best_cnn.keras"
}
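Python's json module keeps the last value when a key appears twice in the same object, so a repeated key in a long config file can go unnoticed. The following is a minimal sketch for catching that before validation; the helper names are hypothetical and are not part of the CLI.

```python
import json


def reject_duplicate_keys(pairs):
    """object_pairs_hook that raises if a JSON object repeats a key."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key in config: {key!r}")
        seen[key] = value
    return seen


def load_config_text(text):
    # json.loads calls the hook once per JSON object, passing its
    # (key, value) pairs in document order.
    return json.loads(text, object_pairs_hook=reject_duplicate_keys)


config = load_config_text('{"mode": "train_cnn", "max_epochs": 15}')
print(config["mode"])
```

Passing text that repeats a key, such as two max_epochs entries, raises ValueError instead of silently keeping the second value.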

Common Parameters

Base Parameters

These settings form the foundation for almost every mode.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| mode | string | Yes | | The operation to perform. Must be one of the documented modes. |
| csv_path | string (path) | Yes | | Path to the input data CSV file. |
| img_folder | string (path) | Yes | | Path to the folder containing all images. |
| image_shape | list[int] | Yes | | Dimensions of the input image, e.g., [224, 224, 3]. |
| feature_shape | list[int] | No | null | Dimensions of the tabular feature vector. Required by some modes. |
| set_mask | string | No | null | Set to "y" to apply a circular background mask to the images. |

Common Training Parameters

These settings are available in all training modes (e.g., train_cnn, train_mlp, train_image_only, train_xgboost).

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| save_model_file | string (path) | No | null | Path to save the final trained model file. Recommended for all training modes. |
| save_history_file | string (path) | No | null | Path to save the training history to a CSV file. |
| batch_size | integer | No | 8 | Number of samples per gradient update. |
| max_epochs | integer | No | null | Maximum number of epochs to train for. |
| learning_rate | float | No | 0.001 | The learning rate for the optimizer. |
| test_size | float | No | 0.2 | Proportion of the dataset to use for the validation split (0.0 to 1.0). |
| feature_scaler_path | string (path) | No | null | Path to save the trained feature scaler. |
| label_scaler_path | string (path) | No | null | Path to save the trained label scaler (for regression). |
| encoder_path | string (path) | No | null | Path to save the trained label encoder (for classification). |
| brightness, rotation, height, width, contrast | float | No | 0.0 | Parameters for image data augmentation. |
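To illustrate how the defaults listed for the common training parameters combine with a user config, here is a minimal sketch. It is not the CLI's own merging logic: the helper name with_training_defaults is hypothetical, and parameters whose default is null are simply left out.

```python
# Defaults copied from the common training parameters; null defaults are omitted.
TRAINING_DEFAULTS = {
    "batch_size": 8,
    "learning_rate": 0.001,
    "test_size": 0.2,
    "brightness": 0.0,
    "rotation": 0.0,
    "height": 0.0,
    "width": 0.0,
    "contrast": 0.0,
}


def with_training_defaults(config):
    """Return a copy of config with documented defaults filled in."""
    merged = {**TRAINING_DEFAULTS, **config}  # user values win over defaults
    if not 0.0 <= merged["test_size"] <= 1.0:
        raise ValueError("test_size must be between 0.0 and 1.0")
    return merged


merged = with_training_defaults({"mode": "train_cnn", "batch_size": 16})
print(merged["batch_size"], merged["learning_rate"], merged["test_size"])
```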

Common Callback Parameters

These settings control keras.callbacks and are available in all TensorFlow-based training modes.

Early Stopping

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| early_stopping | string | No | "n" | Set to "y" to enable. |
| patience | integer | No | 3 | Epochs with no improvement before stopping training. |
| monitor | string | No | "val_accuracy" | Metric to monitor (e.g., val_loss). |
| method | string | No | "max" | Direction of improvement. Use max for accuracy, min for loss. |
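Because monitor and method are set independently, it is easy to monitor val_loss while leaving method at its default of max. The sketch below flags that kind of mismatch with a name-based heuristic. Both the suffix list and the helper names are assumptions made for illustration; the CLI is not known to perform this check.

```python
# Assumption: metrics whose names end in one of these are "lower is better".
LOWER_IS_BETTER_SUFFIXES = ("loss", "error", "mae", "mse")


def expected_method(monitor):
    """Guess the improvement direction from the metric name (heuristic only)."""
    return "min" if monitor.endswith(LOWER_IS_BETTER_SUFFIXES) else "max"


def direction_matches(config):
    """True when the configured method agrees with the heuristic for monitor."""
    monitor = config.get("monitor", "val_accuracy")
    method = config.get("method", "max")
    return method == expected_method(monitor)


print(direction_matches({"monitor": "val_loss"}))
print(direction_matches({"monitor": "val_loss", "method": "min"}))
```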

Model Checkpointing

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| checkpoints | string | No | "n" | Set to "y" to enable model checkpointing. |
| checkpoint_filepath | string (path) | Yes* | null | Path to save the best model checkpoint. Required if checkpoints="y". |
| monitor | string | No | "val_accuracy" | Metric to monitor for saving the best model. |
| method | string | No | "max" | Direction of improvement (max for accuracy, min for loss). |
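As a rough picture of how these settings might reach Keras, the sketch below turns a config dict into keyword arguments for keras.callbacks.EarlyStopping and keras.callbacks.ModelCheckpoint. The argument names (monitor, mode, patience, filepath, save_best_only) are real Keras parameters, but the mapping itself is an assumption: in particular, it assumes that method is passed as mode, that one monitor/method pair is shared by both callbacks, and that only the best model is saved. The helper name callback_kwargs is hypothetical.

```python
def callback_kwargs(config):
    """Build keyword arguments for the enabled callbacks (illustrative mapping)."""
    monitor = config.get("monitor", "val_accuracy")
    method = config.get("method", "max")
    result = {}
    if config.get("early_stopping", "n") == "y":
        result["EarlyStopping"] = {
            "monitor": monitor,
            "mode": method,
            "patience": config.get("patience", 3),
        }
    if config.get("checkpoints", "n") == "y":
        result["ModelCheckpoint"] = {
            "filepath": config["checkpoint_filepath"],  # required when enabled
            "monitor": monitor,
            "mode": method,
            "save_best_only": True,  # assumption based on "best model checkpoint"
        }
    return result


kwargs = callback_kwargs({
    "early_stopping": "y",
    "checkpoints": "y",
    "checkpoint_filepath": "models/checkpoints/best_cnn.keras",
})
print(sorted(kwargs))
```

With Keras installed, each dict could be unpacked into the matching callback, for example keras.callbacks.EarlyStopping(**kwargs["EarlyStopping"]).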

Mode-Specific Parameters

Training Modes

train_cnn

Trains a hybrid model on a combination of images and tabular features.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| cnn_mode | string | Yes | | The classification task. Can be bad_good or multiclass. |
| classification_type | string | Yes | binary | Must be binary or multiclass. |
| num_classes | integer | Yes | | Number of output classes (e.g., 1 for binary, 5 for multiclass). |
| feature_shape | list[int] | Yes | | Must be [5] for this mode. |
| angle_threshold | float | Yes | | Threshold for angle-based classification logic. |
| diameter_threshold | float | Yes | | Threshold for diameter-based classification logic. |
| train_p | float | Yes | | Masking probability for training features. |
| test_p | float | Yes | | Masking probability for testing features. |
| dense1, dense2 | integer | Yes | | Number of units in the two dense layers of the model head. |
| dropout1, dropout2, dropout3 | float | Yes | | Dropout rates for regularization. |
| backbone | string | No | efficientnet | The pre-trained CNN backbone (resnet, mobilenet, efficientnet). |
| unfreeze_from | integer | No | null | Layer index from which to unfreeze the backbone for fine-tuning. |
| reduce_lr | float | No | null | Factor to reduce learning rate on plateau (e.g., 0.2). |
| reduce_lr_patience | integer | No | null | Epochs to wait before reducing LR. |
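Several train_cnn fields constrain one another. The sketch below checks a few of those relationships before a run. It is illustrative only: the helper name check_train_cnn is hypothetical, the rule that binary implies num_classes of 1 is inferred from the example in the parameter description, and the dropout range follows the usual Keras convention of a rate in [0, 1).

```python
def check_train_cnn(config):
    """Return a list of human-readable problems found in a train_cnn config."""
    errors = []
    ctype = config.get("classification_type", "binary")
    num_classes = config.get("num_classes")
    if ctype not in ("binary", "multiclass"):
        errors.append("classification_type must be binary or multiclass")
    elif ctype == "binary" and num_classes != 1:
        errors.append("binary classification expects num_classes == 1")
    elif ctype == "multiclass" and (num_classes is None or num_classes < 2):
        errors.append("multiclass classification expects num_classes >= 2")
    if config.get("feature_shape") != [5]:
        errors.append("feature_shape must be [5] for train_cnn")
    for name in ("dropout1", "dropout2", "dropout3"):
        rate = config.get(name)
        if rate is None or not 0.0 <= rate < 1.0:
            errors.append(f"{name} must be a rate in [0, 1)")
    return errors


example = {
    "classification_type": "binary", "num_classes": 1, "feature_shape": [5],
    "dropout1": 0.2, "dropout2": 0.2, "dropout3": 0.2,
}
print(check_train_cnn(example))
```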

train_mlp

Trains an MLP regression model using features extracted from a pre-trained CNN.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model_path | string (path) | Yes | | Path to the pre-trained CNN model used for feature extraction. |
| feature_shape | list[int] | Yes | | Must be [4] for this mode (the numerical features, excluding tension). |
| angle_threshold, diameter_threshold | float | Yes | | Thresholds required for the data processing pipeline. |
| dense1, dense2, dropout1, etc. | float/int | Yes | | Architecture parameters for the MLP model. |
| reduce_lr | float | No | null | Factor to reduce learning rate on plateau (e.g., 0.2). |
| reduce_lr_patience | integer | No | null | Epochs to wait before reducing LR. |

train_image_only

Trains a classification model using only images as input.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| backbone | string | Yes | | The pre-trained CNN backbone to use. |
| classification_type | string | Yes | | Must be binary or multiclass. |
| num_classes | integer | Yes | | Number of output classes. |
| angle_threshold, diameter_threshold | float | Yes | | Thresholds for defining labels. |
| dense1, dropout1, dropout2, l2_factor | float/int | No | various | Architecture parameters for the model head. |

train_xgboost

Trains an XGBoost regression model.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| xgb_path | string (path) | No | null | Path to save the trained XGBoost model (.pkl). Recommended. |
| model_path | string (path) | Yes | | Path to the pre-trained CNN used for feature extraction. |
| angle_threshold, diameter_threshold | float | Yes | | Thresholds for data processing. |
| error_type | string | Yes | | The XGBoost objective function (e.g., reg:squarederror). |
| n_estimators | integer | No | 200 | Number of gradient boosted trees. |
| max_depth | integer | No | 4 | Maximum tree depth for base learners. |
| gamma, subsample, reg_lambda | float | No | various | Regularization and subsampling parameters for XGBoost. |
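Most of the train_xgboost parameters share their names with keyword arguments of xgboost.XGBRegressor; error_type is the exception, since it describes the objective. The sketch below builds such keyword arguments from a config dict. That the CLI uses XGBRegressor, and that error_type maps to objective, are assumptions; the helper name xgb_kwargs is hypothetical.

```python
def xgb_kwargs(config):
    """Translate train_xgboost settings into XGBRegressor-style keyword arguments."""
    kwargs = {
        "objective": config["error_type"],  # assumed mapping: error_type -> objective
        "n_estimators": config.get("n_estimators", 200),
        "max_depth": config.get("max_depth", 4),
    }
    # Defaults for these are listed only as "various", so pass them through
    # only when the user sets them explicitly.
    for name in ("gamma", "subsample", "reg_lambda"):
        if name in config:
            kwargs[name] = config[name]
    return kwargs


print(xgb_kwargs({"error_type": "reg:squarederror", "subsample": 0.8}))
```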

Testing & Evaluation Modes

test_cnn & test_image_only

Tests a saved image-based classifier and generates evaluation reports.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model_path | string (path) | Yes | | Path to the trained classifier model (.keras). |
| angle_threshold, diameter_threshold | float | Yes | | Thresholds used to generate the ground-truth labels for comparison. |
| classification_path | string (path) | No | null | Path to save the output CSV classification report. |
| classification_threshold | float | No | 0.5 | The probability threshold for binary classification. |
| feature_scaler_path | string (path) | No | null | Required for test_cnn if the model used scaled features. |
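The role of classification_threshold in binary classification can be shown in a few lines. This is a minimal sketch: the helper name to_labels is hypothetical, and whether the CLI treats a probability exactly equal to the threshold as positive (>= rather than >) is an assumption.

```python
def to_labels(probabilities, classification_threshold=0.5):
    """Convert predicted probabilities into 0/1 labels."""
    # Assumption: a probability equal to the threshold counts as positive.
    return [1 if p >= classification_threshold else 0 for p in probabilities]


print(to_labels([0.2, 0.5, 0.9]))
print(to_labels([0.2, 0.5, 0.9], classification_threshold=0.6))
```

Raising the threshold makes the classifier more conservative about predicting the positive class.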

test_mlp & test_xgboost

Tests a saved regression model and generates a performance report.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model_path | string (path) | Yes | | Path to the trained regressor (.keras for MLP) or feature extractor (.keras for XGBoost). |
| xgb_path | string (path) | Yes* | null | Required for test_xgboost mode. Path to the .pkl file. |
| angle_threshold, diameter_threshold | float | Yes | | Thresholds for data processing. |
| label_scaler_path | string (path) | Yes* | null | Path to the saved label scaler used during training. Required. |

Note

For the test_xgboost mode, the model_path parameter should point to the pre-trained CNN feature extractor model (.keras), not the XGBoost model itself.
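A quick way to catch the mix-up this note warns about is to check file extensions before running. The sketch below is illustrative only: the helper name check_test_xgboost_paths and the example paths are hypothetical.

```python
from pathlib import Path


def check_test_xgboost_paths(config):
    """Return problems with model_path / xgb_path for a test_xgboost config."""
    problems = []
    if Path(config.get("model_path", "")).suffix != ".keras":
        problems.append("model_path should point to the CNN feature extractor (.keras)")
    if Path(config.get("xgb_path", "")).suffix != ".pkl":
        problems.append("xgb_path should point to the XGBoost model (.pkl)")
    return problems


# Hypothetical paths, for illustration only.
config = {
    "mode": "test_xgboost",
    "model_path": "models/my_cnn_model.keras",
    "xgb_path": "models/my_xgb_model.pkl",
}
print(check_test_xgboost_paths(config))
```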

Advanced Modes

K-Fold Cross-Validation

The train_kfold_cnn and train_kfold_mlp modes are used for more robust model evaluation. They accept the exact same parameters as their non-k-fold counterparts (train_cnn and train_mlp respectively), with the addition of n_splits if you want to change the number of folds.
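Since the k-fold modes reuse the parameters of their single-run counterparts, an existing training config can be converted by changing mode and optionally adding n_splits. The sketch below shows that conversion on a plain dict; the helper name to_kfold is hypothetical, and the value 5 is only an example.

```python
# Mode names taken from this page.
KFOLD_MODES = {"train_cnn": "train_kfold_cnn", "train_mlp": "train_kfold_mlp"}


def to_kfold(config, n_splits=None):
    """Return a k-fold version of a train_cnn or train_mlp config."""
    kfold = dict(config)  # copy so the original config is left untouched
    kfold["mode"] = KFOLD_MODES[config["mode"]]
    if n_splits is not None:
        kfold["n_splits"] = n_splits
    return kfold


print(to_kfold({"mode": "train_cnn", "batch_size": 16}, n_splits=5))
```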

Hyperparameter Tuning

The cnn_hyperparameter, mlp_hyperparameter, and image_hyperparameter modes are used to search for the best model architecture.

  • cnn_hyperparameter uses the same config as train_cnn.

  • image_hyperparameter uses the same config as train_image_only.

  • mlp_hyperparameter requires tuner_directory and project_name.

Visualization (grad_cam)

Generates a Grad-CAM heatmap to visualize which parts of an image the CNN is focusing on.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model_path | string (path) | Yes | | Path to the trained CNN model. |
| img_path | string (path) | Yes | | Path to the specific image for visualization. |
| test_features | list[float] | Yes* | null | Required if the model takes numerical inputs. |
| class_index | int | Yes | | Index of the target class to visualize. |
| backbone | string | No | null | The name of the backbone layer in the saved model (e.g., "mobilenet"). |
| conv_layer_name | string | No | null | Name of the target convolutional layer. If null, the last conv layer is used. |
| heatmap_file | string (path) | No | null | Path to save the output heatmap image. |
| backbone | string | No | efficientnet | Name of pre-trained backbone. |

Reinforcement Learning

train_rl & test_rl

Train or test an agent with reinforcement learning to predict optimal tension.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| csv_path | string (path) | Yes | | Path to the CSV dataset. |
| cnn_path | string (path) | Yes | | Path to the CNN classifier. |
| img_folder | string (path) | Yes | | Path to the saved images. |
| agent_path | string (path) | Yes | | Path to save (or load) the trained agent. |
| learning_rate | float | Yes | | Learning rate used to train the agent. |
| buffer_size | int | Yes | | Size of the replay buffer. |
| threshold | float | Yes | | Classification threshold. |
| max_tension_change | float | Yes | | Maximum tension change per episode. |
| batch_size | int | No | 256 | Batch size for training. |
| tau | float | No | 0.1 | |
| learning_rate | float | No | 0.0001 | Size of steps to take during training. |
| timesteps | int | No | 5000 | Number of training timesteps. |
| low_range | float | No | 0.7 | Lower bound of the tension range, as a fraction (0.7 = 70%). |
| high_range | float | No | 1.4 | Upper bound of the tension range, as a fraction (1.4 = 140%). |
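One plausible reading of low_range, high_range, and max_tension_change is that the agent's tension is kept within a band around a reference tension and can move only a limited amount per step. The sketch below illustrates that reading only. The reference tension, the multiplicative interpretation of the two ranges, and both helper names are assumptions and are not confirmed by this page.

```python
def tension_bounds(reference_tension, low_range=0.7, high_range=1.4):
    """Assumed interpretation: bounds are fractions of a reference tension."""
    return reference_tension * low_range, reference_tension * high_range


def apply_change(current, change, reference_tension, max_tension_change,
                 low_range=0.7, high_range=1.4):
    """Limit the size of one change, then keep the result inside the bounds."""
    # Clip the requested change to +/- max_tension_change.
    change = max(-max_tension_change, min(max_tension_change, change))
    low, high = tension_bounds(reference_tension, low_range, high_range)
    # Clip the resulting tension to the allowed band.
    return max(low, min(high, current + change))


print(tension_bounds(100.0))
print(apply_change(current=100.0, change=50.0, reference_tension=100.0,
                   max_tension_change=10.0))
```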