How to train your MHN

If you want to learn a new MHN from mutation data, the optimizers submodule is the place to start. It currently contains Optimizer classes for training a classical MHN (cMHN; see Schill et al. (2020)) or an observation MHN (oMHN; see Schill et al. (2024)).

For an extensive demonstration of a simple MHN training and analysis workflow, have a look at this demo notebook.

Configure the Optimizer

You can learn a new MHN from cross-sectional data with the Optimizer class:

from mhn.optimizers import Optimizer
opt = Optimizer()

By default, this class will train the most recent type of MHN. To train an older type, you can specify it explicitly:

# Example: training a classical MHN (cMHN), which does not account for collider bias
opt = Optimizer(Optimizer.MHNType.cMHN)

We can specify the data that we want our MHN to be trained on:

opt.load_data_matrix(data_matrix)

Here, data_matrix can be either a NumPy array or a pandas DataFrame in which rows represent samples and columns represent events. If it is a NumPy array, set dtype=np.int32; otherwise you might get a warning.
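As a minimal sketch of a suitable input, the following builds a small binary matrix with NumPy. The values are made up for illustration, and the conversion step assumes your existing data already contains only 0s and 1s.

```python
import numpy as np

# Hypothetical toy data: 4 samples (rows) x 3 events (columns).
# An entry of 1 means the event was observed in that sample, 0 means it was not.
data_matrix = np.array(
    [
        [1, 0, 0],
        [1, 1, 0],
        [0, 0, 0],
        [1, 1, 1],
    ],
    dtype=np.int32,  # the dtype recommended above
)

# An existing array with another integer dtype can be converted explicitly.
other_matrix = np.array([[0, 1], [1, 1]], dtype=np.int64)
converted_matrix = other_matrix.astype(np.int32)
```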
Alternatively, if your training data is stored in a CSV file, you can call

opt.load_data_from_csv(filename, delimiter)

where delimiter is the delimiter separating the items in the CSV file (default: ,). Internally, this method uses the pandas read_csv() function to extract the data from the CSV file. All additional keyword arguments given to this method are passed on to that function (see read_csv()), so parameters such as usecols or skiprows can also be used with this method:

# loads data from a CSV file, skipping lines 0 and 10 of the file
# (pandas counts these line numbers from the top of the file, starting at 0)
opt.load_data_from_csv(filename, delimiter, skiprows=[0, 10])
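Because the extra keyword arguments are forwarded to pandas, it can help to check their effect with read_csv() directly before loading the data into the optimizer. The sketch below uses a small in-memory CSV with made-up event names. Note that pandas counts skiprows line numbers from the top of the file, so line 0 is the header line in this example.

```python
import io

import pandas as pd

# Hypothetical CSV content: a header line followed by three samples.
csv_text = "TP53,KRAS,APC\n1,0,1\n0,0,1\n1,1,0\n"

# skiprows=[2] drops file line 2 (the second sample); usecols keeps two events.
df = pd.read_csv(
    io.StringIO(csv_text),
    delimiter=",",
    skiprows=[2],
    usecols=["TP53", "APC"],
)
print(df)
```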

You can access the loaded data matrix with

loaded_matrix = opt.training_data

If you work with a CUDA-capable device, you can choose which device you want to use to train a new MHN:

# uses both CPU and GPU depending on the number of mutations in the individual sample (default)
opt.set_device(Optimizer.Device.AUTO)

# use the CPU to compute log-likelihood and gradient
opt.set_device(Optimizer.Device.CPU)

# use the GPU to compute log-likelihood and gradient
opt.set_device(Optimizer.Device.GPU)

# you can also access the Device enum directly with an Optimizer object
opt.set_device(opt.Device.AUTO)

You can also change the initial theta used as the starting point for training (by default, the independence model used by Schill et al. (2019)) with

opt.set_init_theta(init_theta)
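For orientation, here is a rough sketch of how an independence model could be built by hand for a classical MHN. It assumes that the independence model places the log-odds of each event's observed frequency on the diagonal and zeros everywhere else. The helper name independence_theta is hypothetical, the library's own initialization may differ in detail, and other MHN types may expect a differently shaped matrix.

```python
import numpy as np


def independence_theta(data_matrix: np.ndarray) -> np.ndarray:
    """Hypothetical helper: log-odds of event frequencies on the diagonal."""
    n_events = data_matrix.shape[1]
    # Relative frequency of each event across all samples.
    freqs = data_matrix.mean(axis=0)
    theta = np.zeros((n_events, n_events))
    # Assumes every frequency lies strictly between 0 and 1.
    np.fill_diagonal(theta, np.log(freqs / (1.0 - freqs)))
    return theta


# Toy data: event 0 occurs in 2 of 4 samples, event 1 in 1 of 4 samples.
toy_data = np.array([[1, 1], [1, 0], [0, 0], [0, 0]], dtype=np.int32)
init_theta = independence_theta(toy_data)
```

A matrix built this way could then be passed to opt.set_init_theta(init_theta) as shown above.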

If you want to regularly save the progress during training, you can use the save_progress() method:

# in this example we want to make a backup every 100 iterations
steps = 100
# we want to overwrite the previous backup file
always_new_file = False
# we want our backup file to be named 'mhn_training_backup.npy'
filename = 'mhn_training_backup.npy'

opt.save_progress(steps=steps, always_new_file=always_new_file, filename=filename)
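The .npy extension suggests that a backup can be read back with NumPy. The following sketch assumes that the backup file holds a plain theta array in NumPy's .npy format, which this guide does not confirm. It uses a temporary file and a made-up matrix in place of a real backup.

```python
import os
import tempfile

import numpy as np

# Stand-in for a backup written during training (assumption: a plain .npy array).
fake_theta = np.array([[-1.0, 0.5], [0.0, -2.0]])

with tempfile.TemporaryDirectory() as tmp_dir:
    backup_path = os.path.join(tmp_dir, "mhn_training_backup.npy")
    np.save(backup_path, fake_theta)

    # Read the array back, e.g. to inspect it or to resume from it.
    restored_theta = np.load(backup_path)
```

If that assumption holds, restored_theta could serve as a new starting point via opt.set_init_theta(restored_theta).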

You can also specify a callback function that is called after each training step:

import numpy as np

# In this example, we create a callback function that prints
# the current theta matrix after each training step.
# Ensure that your callback function accepts the theta matrix as a parameter;
# otherwise, an error will be raised.
def our_callback_function(theta: np.ndarray):
    print(theta)

opt.set_callback_func(our_callback_function)
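A callback can also collect information instead of printing it. The sketch below stores a copy of each theta it receives. The names theta_history and record_theta are illustrative, and the loop merely stands in for the optimizer calling the function once per training step.

```python
import numpy as np

# Collected snapshots of theta, one per call of the callback.
theta_history = []


def record_theta(theta: np.ndarray) -> None:
    # Store a copy in case the caller modifies the array in place later.
    theta_history.append(theta.copy())


# Stand-in for three training steps; a real run would instead register the
# function with opt.set_callback_func(record_theta).
for step in range(3):
    record_theta(np.full((2, 2), float(step)))
```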

During training, a regularization penalty is applied to prevent overfitting. The Optimizer class currently supports three types: the L1-penalty (used by default), the L2-penalty, and a custom symmetrical penalty that is further discussed in Schill et al. (2024).
The following code snippet shows how to set a penalty:

# for the L1-penalty, we set
opt.set_penalty(opt.Penalty.L1)
# for the L2-penalty, we set
opt.set_penalty(opt.Penalty.L2)
# for the symmetrical penalty, we set
opt.set_penalty(opt.Penalty.SYM_SPARSE)
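To give a feel for how the first two penalties differ, here is a generic sketch of an L1 and an L2 penalty term. It assumes that only the off-diagonal entries of theta are penalized and that the L2 term is a plain sum of squares. The exact definitions used by the Optimizer may differ, and the symmetrical penalty is not covered here.

```python
import numpy as np


def off_diagonal(theta: np.ndarray) -> np.ndarray:
    """Return the off-diagonal entries of a square matrix as a flat array."""
    mask = ~np.eye(theta.shape[0], dtype=bool)
    return theta[mask]


def l1_penalty(theta: np.ndarray, lam: float) -> float:
    # Sum of absolute values: tends to push small entries to exactly zero.
    return lam * np.abs(off_diagonal(theta)).sum()


def l2_penalty(theta: np.ndarray, lam: float) -> float:
    # Sum of squares: penalizes large entries more strongly than small ones.
    return lam * np.square(off_diagonal(theta)).sum()


theta = np.array([[1.0, -2.0], [3.0, 4.0]])
l1_value = l1_penalty(theta, lam=0.1)  # 0.1 * (2 + 3)
l2_value = l2_penalty(theta, lam=0.1)  # 0.1 * (4 + 9)
```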

Train a new MHN model

Once your optimizer is configured, you can call the lambda_from_cv() method to find the best penalty strength (“lambda”) for training by doing cross-validation.
The lambda_from_cv() method takes either a sequence of lambdas that should be tested or the minimum, maximum and step size for potential lambda values. In the latter case, the method will create a range of possible lambdas with logarithmic grid-spacing, e.g. (0.0001, 0.0010, 0.0100, 0.1000) for lambda_min=0.0001, lambda_max=0.1 and steps=4.
In this example, we opted for the latter option:

import mhn
# use a seed to make the cross-validation results reproducible
mhn.set_seed(0)

cv_lambda = opt.lambda_from_cv(
    lambda_min=1e-4,       # the smallest lambda value evaluated
    lambda_max=1e-1,       # the largest lambda value evaluated
    steps=4,               # total number of lambda values evaluated
    nfolds=5,              # number of cross-validation folds
    show_progressbar=True  # show a progressbar during cross-validation
)
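The logarithmically spaced grid described above can be reproduced with NumPy to preview which values would be tested. This only illustrates the spacing; how lambda_from_cv() builds its grid internally is not shown in this guide.

```python
import numpy as np

lambda_min, lambda_max, steps = 1e-4, 1e-1, 4

# Evenly spaced on a log scale: each value is 10 times the previous one here.
lambda_grid = np.geomspace(lambda_min, lambda_max, num=steps)
print(lambda_grid)
```

A hand-built grid like this corresponds to the first option mentioned above, where a sequence of lambdas is given instead of a minimum, maximum and step count.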

Finally, you can train a new MHN with

opt.train(
    lam=cv_lambda,      # the lambda value used for regularization
    maxit=5000,         # the maximum number of training iterations
    round_result=True,  # round the resulting theta matrix to two decimal places
)

This method returns an MHN object (see here), which contains the learned model and provides additional methods for cancer progression analysis. You can also access the learned model via the result property:

learned_mhn = opt.result

The documentation of all available optimizer classes can be found here.