Configuration Reference
The CLI is driven by a single JSON configuration file. The mode field is mandatory: it determines which operation to perform and which set of parameters is available. This page documents all available modes and their specific configurations.
Note
An asterisk (Yes*) in the “Required” column indicates that a parameter is conditionally required. For example, checkpoint_filepath is only required if checkpoints is set to “y”. The Pydantic models in config_schema.py are the ultimate source of truth for all validations.
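The conditional rule in the note can be illustrated with plain Python. This is a minimal sketch, not the Pydantic code in config_schema.py: the helper name `check_checkpoint_fields` is hypothetical, and only the relationship between `checkpoints` and `checkpoint_filepath` is taken from this page.

```python
def check_checkpoint_fields(config: dict) -> None:
    """Raise ValueError if checkpointing is enabled without a target path.

    Hypothetical helper: mirrors the documented rule that
    `checkpoint_filepath` is required only when `checkpoints` is "y".
    """
    enabled = config.get("checkpoints", "n") == "y"  # documented default is "n"
    if enabled and not config.get("checkpoint_filepath"):
        raise ValueError('checkpoint_filepath is required when checkpoints is "y"')


# Checkpointing disabled: the path may be omitted.
check_checkpoint_fields({"mode": "train_cnn", "checkpoints": "n"})

# Checkpointing enabled without a path: rejected.
try:
    check_checkpoint_fields({"mode": "train_cnn", "checkpoints": "y"})
except ValueError as err:
    print(err)
```

In the real schema the same check would most likely live in a model-level validator, so that both fields are visible at once.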
Example Configuration
Here is an example of a config.json for the train_cnn mode:
{
  "mode": "train_cnn",
  "csv_path": "data/cleaves.csv",
  "img_folder": "data/images/",
  "image_shape": [224, 224, 3],
  "feature_shape": [5],
  "save_model_file": "models/my_cnn_model.keras",
  "max_epochs": 15,
  "cnn_mode": "bad_good",
  "classification_type": "binary",
  "num_classes": 1,
  "learning_rate": 0.005,
  "batch_size": 16,
  "buffer_size": 40,
  "test_size": 0.25,
  "objective": "val_accuracy",
  "brightness": 0.3,
  "height": 0.2,
  "width": 0.6,
  "contrast": 0.7,
  "rotation": 0.45,
  "angle_threshold": 0.5,
  "diameter_threshold": 125.0,
  "dense1": 32,
  "dense2": 16,
  "dropout1": 0.2,
  "dropout2": 0.2,
  "dropout3": 0.2,
  "checkpoints": "y",
  "checkpoint_filepath": "models/checkpoints/best_cnn.keras"
}
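Python's `json` module silently keeps the last value when a key appears twice in an object, so a copy-paste slip in a long config can go unnoticed. The following is a minimal sketch of a loader that rejects repeated keys and checks for the mandatory `mode` field. The function names `reject_duplicate_keys` and `load_config` are hypothetical and are not part of the CLI.

```python
import json


def reject_duplicate_keys(pairs):
    """object_pairs_hook that fails on repeated keys instead of keeping the last one."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key in config: {key!r}")
        seen[key] = value
    return seen


def load_config(text: str) -> dict:
    """Parse a JSON config string and require the mandatory `mode` field."""
    config = json.loads(text, object_pairs_hook=reject_duplicate_keys)
    if "mode" not in config:
        raise ValueError("config must contain a 'mode' field")
    return config


config = load_config('{"mode": "train_cnn", "batch_size": 16}')
print(config["mode"])  # train_cnn
```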
Common Parameters
Base Parameters
These settings form the foundation for almost every mode.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
|  | string | Yes |  | The operation to perform. Must be one of the documented modes. |
|  | string (path) | Yes |  | Path to the input data CSV file. |
|  | string (path) | Yes |  | Path to the folder containing all images. |
|  | list[int] | Yes |  | Dimensions of the input image, e.g., |
|  | list[int] | No | null | Dimensions of the tabular feature vector. Required by some modes. |
|  | string | No | null | Set to |
Common Training Parameters
These settings are available in all training modes (e.g., train_cnn, train_mlp, train_image_only, train_xgboost).
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
|  | string (path) | No | null | Path to save the final trained model file. Recommended for all training modes. |
|  | string (path) | No | null | Path to save the training history to a CSV file. |
|  | integer | No | 8 | Number of samples per gradient update. |
|  | integer | No | null | Maximum number of epochs to train for. |
|  | float | No | 0.001 | The learning rate for the optimizer. |
|  | float | No | 0.2 | Proportion of the dataset to use for the validation split (0.0 to 1.0). |
|  | string (path) | No | null | Path to save the trained feature scaler. |
|  | string (path) | No | null | Path to save the trained label scaler (for regression). |
|  | string (path) | No | null | Path to save the trained label encoder (for classification). |
|  | float | No | 0.0 | Parameters for image data augmentation. |
Common Callback Parameters
These settings control keras.callbacks and are available in all TensorFlow-based training modes.
Early Stopping
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
|  | string | No | "n" | Set to |
|  | integer | No | 3 | Epochs with no improvement before stopping training. |
|  | string | No | "val_accuracy" | Metric to monitor (e.g., |
|  | string | No | "max" | Direction of improvement. Use |
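To make the patience and direction settings concrete, here is a minimal pure-Python sketch of the stopping rule. It is not the CLI's code and does not call Keras. The argument names `patience` and `mode` follow `keras.callbacks.EarlyStopping`, and the defaults (3 and "max") are the ones listed in the table; the function name `early_stop_epoch` and the sample metric values are hypothetical.

```python
def early_stop_epoch(history, patience=3, mode="max"):
    """Return the 1-based epoch at which training would stop, or None.

    `history` holds the monitored metric per epoch. Training stops once
    `patience` consecutive epochs pass without beating the best value so far.
    """
    if mode not in ("max", "min"):
        raise ValueError("mode must be 'max' or 'min'")
    best = None
    wait = 0
    for epoch, value in enumerate(history, start=1):
        improved = best is None or (value > best if mode == "max" else value < best)
        if improved:
            best, wait = value, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


val_accuracy = [0.61, 0.70, 0.74, 0.73, 0.74, 0.72, 0.71]
print(early_stop_epoch(val_accuracy, patience=3, mode="max"))  # 6
```

Note that epoch 5 merely ties the best value, so it counts as an epoch without improvement. For a loss-like metric, the direction would be "min" instead.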
Model Checkpointing
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
|  | string | No | "n" | Set to |
|  | string (path) | Yes* | null | Path to save the best model checkpoint. Required if checkpoints="y". |
|  | string | No | "val_accuracy" | Metric to monitor for saving the best model. |
|  | string | No | "max" | Direction of improvement ( |
Mode-Specific Parameters
Training Modes
train_cnn
Trains a hybrid model on a combination of images and tabular features.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
|  | string | Yes |  | The classification task. Can be |
|  | string | Yes | binary | Must be |
|  | integer | Yes |  | Number of output classes (e.g., 1 for binary, 5 for multiclass). |
|  | list[int] | Yes |  | Must be |
|  | float | Yes |  | Threshold for angle-based classification logic. |
|  | float | Yes |  | Threshold for diameter-based classification logic. |
|  | float | Yes |  | Masking probability for training features. |
|  | float | Yes |  | Masking probability for testing features. |
|  | integer | Yes |  | Number of units in the two dense layers of the model head. |
|  | float | Yes |  | Dropout rates for regularization. |
|  | string | No | efficientnet | The pre-trained CNN backbone ( |
|  | integer | No | null | Layer index from which to unfreeze the backbone for fine-tuning. |
|  | float | No | null | Factor to reduce learning rate on plateau (e.g. 0.2). |
|  | integer | No | null | Epochs to wait before reducing LR. |
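The last two rows describe a reduce-on-plateau schedule. As a rough illustration, the sketch below simulates how the learning rate would evolve for a given sequence of validation losses. It is a simplified model (no cooldown, no minimum rate), not the CLI's code. The argument names `factor` and `patience` follow `keras.callbacks.ReduceLROnPlateau`; the function name `lr_schedule_on_plateau`, the patience value of 2, and the loss values are hypothetical.

```python
def lr_schedule_on_plateau(val_losses, initial_lr=0.001, factor=0.2, patience=2):
    """Return the learning rate in effect after each epoch.

    The rate is multiplied by `factor` whenever the monitored loss has not
    improved for `patience` consecutive epochs; the wait counter then resets.
    """
    lr = initial_lr
    best = float("inf")
    wait = 0
    rates = []
    for loss in val_losses:
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                lr *= factor
                wait = 0
        rates.append(lr)
    return rates


losses = [0.90, 0.80, 0.85, 0.82, 0.79, 0.80, 0.81]
for epoch, lr in enumerate(lr_schedule_on_plateau(losses), start=1):
    print(epoch, lr)
```

With these values the rate is cut after epoch 4 and again after epoch 7, because each of those epochs is the second in a row without a new best loss.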
train_mlp
Trains an MLP regression model using features extracted from a pre-trained CNN.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
|  | string (path) | Yes |  | Path to the pre-trained CNN model used for feature extraction. |
|  | list[int] | Yes |  | Must be |
|  | float | Yes |  | Thresholds required for the data processing pipeline. |
|  | float/int | Yes |  | Architecture parameters for the MLP model. |
|  | float | No | null | Factor to reduce learning rate on plateau (e.g. 0.2). |
|  | integer | No | null | Epochs to wait before reducing LR. |
train_image_only
Trains a classification model using only images as input.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
|  | string | Yes |  | The pre-trained CNN backbone to use. |
|  | string | Yes |  | Must be |
|  | integer | Yes |  | Number of output classes. |
|  | float | Yes |  | Thresholds for defining labels. |
|  | float/int | No | various | Architecture parameters for the model head. |
train_xgboost
Trains an XGBoost regression model.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
|  | string (path) | No | null | Path to save the trained XGBoost model (.pkl). Recommended. |
|  | string (path) | Yes |  | Path to the pre-trained CNN used for feature extraction. |
|  | float | Yes |  | Thresholds for data processing. |
|  | string | Yes |  | The XGBoost objective function (e.g., reg:squarederror). |
|  | integer | No | 200 | Number of gradient boosted trees. |
|  | integer | No | 4 | Maximum tree depth for base learners. |
|  | float | No | various | Regularization and subsampling parameters for XGBoost. |
Testing & Evaluation Modes
test_cnn & test_image_only
Tests a saved image-based classifier and generates evaluation reports.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
|  | string (path) | Yes |  | Path to the trained classifier model (.keras). |
|  | float | Yes |  | Thresholds used to generate the ground-truth labels for comparison. |
|  | string (path) | No | null | Path to save the output CSV classification report. |
|  | float | No | 0.5 | The probability threshold for binary classification. |
|  | string (path) | No | null | Required for |
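The probability threshold in the table converts a binary model's output probabilities into class labels. A minimal sketch follows, assuming that a probability equal to the threshold counts as the positive class. Whether the CLI uses `>=` or `>` is not stated on this page, and the function name `apply_threshold` is hypothetical.

```python
def apply_threshold(probabilities, threshold=0.5):
    """Map predicted probabilities to 0/1 labels.

    Assumption: values at or above the threshold become 1.
    """
    return [1 if p >= threshold else 0 for p in probabilities]


print(apply_threshold([0.12, 0.50, 0.93]))       # [0, 1, 1]
print(apply_threshold([0.12, 0.50, 0.93], 0.6))  # [0, 0, 1]
```

Raising the threshold trades recall for precision on the positive class, which is why it is worth tuning separately from training.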
test_mlp & test_xgboost
Tests a saved regression model and generates a performance report.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
|  | string (path) | Yes |  | Path to the trained regressor (.keras for MLP) or feature extractor (.keras for XGBoost). |
|  | string (path) | Yes* | null | Required for |
|  | float | Yes |  | Thresholds for data processing. |
|  | string (path) | Yes* | null | Path to the saved label scaler used during training. Required. |
Note
For the test_xgboost mode, the model_path parameter should point to the pre-trained CNN feature extractor model (.keras), not the XGBoost model itself.
Advanced Modes
K-Fold Cross-Validation
The train_kfold_cnn and train_kfold_mlp modes are used for more robust model evaluation. They accept the exact same parameters as their non-k-fold counterparts (train_cnn and train_mlp respectively), with the addition of n_splits if you want to change the number of folds.
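As a reminder of what `n_splits` controls, the sketch below partitions sample indices into folds so that every sample is used for validation exactly once. It is a plain-Python illustration without shuffling or stratification, not the CLI's splitting code. The function name `kfold_indices` and the default of 5 folds are assumptions; this page does not state the default.

```python
def kfold_indices(n_samples, n_splits=5):
    """Yield (train_indices, val_indices) for each fold, without shuffling.

    Fold sizes differ by at most one sample, and every sample appears in
    exactly one validation fold.
    """
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


for train, val in kfold_indices(7, n_splits=3):
    print("train:", train, "val:", val)
```

A model is trained once per fold, so the total training cost grows roughly linearly with `n_splits`.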
Hyperparameter Tuning
The cnn_hyperparameter, mlp_hyperparameter, and image_hyperparameter modes are used to search for the best model architecture.
- cnn_hyperparameter uses the same config as train_cnn.
- image_hyperparameter uses the same config as train_image_only.
- mlp_hyperparameter requires tuner_directory and project_name.
Visualization (grad_cam)
Generates a Grad-CAM heatmap to visualize which parts of an image the CNN is focusing on.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
|  | string (path) | Yes |  | Path to the trained CNN model. |
|  | string (path) | Yes |  | Path to the specific image for visualization. |
|  | list[float] | Yes* | null | Required if the model takes numerical inputs. |
|  | int | Yes |  | Index of classification problem. |
|  | string | No | null | The name of the backbone layer in the saved model (e.g., 'mobilenet'). |
|  | string | No | null | Name of the target convolutional layer. If null, the last conv layer is used. |
|  | string (path) | No | null | Path to save the output heatmap image. |
|  | string | No | efficientnet | Name of pre-trained backbone. |
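For readers unfamiliar with Grad-CAM, the sketch below shows only its final combination step in plain Python: each channel of the target convolutional layer is weighted by the mean of its gradient, the weighted channels are summed, negative values are clipped, and the result is scaled to [0, 1]. In a real run the activations and gradients come from the trained model. Here they are tiny hand-written lists, and the function name `grad_cam_heatmap` is hypothetical rather than part of the CLI.

```python
def grad_cam_heatmap(activations, gradients):
    """Combine conv-layer activations and gradients into a normalized heatmap.

    Both arguments are lists of K channel maps, each an H x W list of floats.
    Channel weight = mean gradient over the map;
    heatmap = ReLU(sum_k w_k * A_k), scaled to [0, 1].
    """
    height, width = len(activations[0]), len(activations[0][0])
    weights = [
        sum(sum(row) for row in grad) / (height * width) for grad in gradients
    ]
    heatmap = [
        [
            max(0.0, sum(w * act[i][j] for w, act in zip(weights, activations)))
            for j in range(width)
        ]
        for i in range(height)
    ]
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[v / peak for v in row] for row in heatmap]
    return heatmap


# Two 2x2 channels: the first supports the class, the second opposes it.
activations = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
gradients = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
print(grad_cam_heatmap(activations, gradients))  # [[0.5, 0.0], [0.0, 1.0]]
```

The heatmap has the spatial size of the chosen layer, so it is normally resized to the input image before being overlaid.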
Reinforcement Learning
train_rl & test_rl
Train or test an agent with reinforcement learning to predict optimal tension.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
|  | string (path) | Yes |  | Path to the CSV dataset. |
|  | string (path) | Yes |  | Path to the CNN classifier. |
|  | string (path) | Yes |  | Path to the saved images. |
|  | string (path) | Yes |  | Path to save (or load) the trained agent. |
|  | float | Yes |  | Typical learning rate for ML. |
|  | int | Yes |  | Size of the replay buffer. |
|  | float | Yes |  | Classification threshold. |
|  | float | Yes |  | Maximum tension change per episode. |
|  | int | No | 256 | Batch size for training. |
|  | float | No | 0.1 |  |
|  | float | No | 0.0001 | Size of steps to take during training. |
|  | int | No | 5000 | Number of training rounds. |
|  | float | No | 0.7 | Low percentage of tension. |
|  | float | No | 1.4 | High percentage of tension. |