Quickstart

This quickstart follows the path of least resistance, the command-line interface, for training a PRESCIENT model and running perturbational analyses. To install PRESCIENT, refer to the homepage.

Create PRESCIENT torch object

First, we recommend reading how to prepare inputs for PRESCIENT so that your scRNA-seq data is in an acceptable format. For estimating growth weights, please refer to the notebooks tab.

Run the following to estimate growth rates and create a PRESCIENT training PyTorch object:
prescient process_data -d /path/to/your_data.csv -o /path/for/output/ -m /path/to/metadata.csv --tp_col "timepoint colname" --celltype_col "annotation colname" --growth_path /path/to/growth_weights.pt
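
Before running process_data, it can help to confirm that the metadata file really contains the columns passed to --tp_col and --celltype_col. The following is a minimal sketch, not part of PRESCIENT: the helper name check_metadata_columns is hypothetical, and it assumes the metadata CSV has a header row.

```python
import csv


def check_metadata_columns(metadata_path, tp_col, celltype_col):
    """Confirm both columns exist and return the distinct timepoint labels.

    Hypothetical helper for illustration; assumes the CSV has a header row.
    """
    with open(metadata_path, newline="") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        missing = [col for col in (tp_col, celltype_col) if col not in header]
        if missing:
            raise ValueError(f"metadata is missing column(s): {missing}")
        # Collect distinct timepoint labels so they can be checked by eye.
        timepoints = {row[tp_col] for row in reader}
    # Labels are read as strings, so this sort is lexicographic.
    return sorted(timepoints)
```

Calling check_metadata_columns("/path/to/metadata.csv", "timepoint colname", "annotation colname") with the same names used on the command line should either list the timepoint labels or raise an error naming the missing column.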

Train PRESCIENT model

To train a PRESCIENT model, it is beneficial to use GPU acceleration with CUDA support. PRESCIENT models can be trained on CPUs, but training will take longer. For a demo on running PRESCIENT with free GPU cloud resources on Google Colab, please refer to the notebooks tab.

Next, train a basic PRESCIENT model with default parameters by passing the data.pt file produced by the process_data command to the following command:
prescient train_model -i /path/to/data.pt --out_dir /experiments/ --weight_name 'kegg-growth'

For more options to control model architecture and hyperparameters, please refer to the CLI documentation.

Simulate trajectories

Now, with a trained PRESCIENT model and the original PRESCIENT data object, you can simulate trajectories of cells with arbitrary initializations. To do so, run the simulate_trajectories command.

In the following example, the function will randomly sample 50 cells at the first provided timepoint and simulate forward to the final timepoint:
prescient simulate_trajectories -i /path/to/data.pt --model_path /path/to/trained/model_directory -o /path/to/output_dir --seed 2

This will produce a PRESCIENT simulation object containing the following:

"sims": generated cells of simulated trajectory

For more control over choosing cells, number of steps, etc., please refer to the CLI documentation.
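
To illustrate what seeded random sampling of initial cells at the first timepoint means, here is a minimal sketch. It is not PRESCIENT's sampling code: the function name sample_initial_cells is hypothetical, and the sketch assumes sampling without replacement, which the description above does not specify.

```python
import random


def sample_initial_cells(timepoints, num_cells=50, seed=2):
    """Pick `num_cells` cell indices from the earliest timepoint, reproducibly.

    Hypothetical illustration; assumes sampling without replacement.
    """
    first = min(timepoints)
    # Indices of all cells observed at the earliest timepoint.
    candidates = [i for i, tp in enumerate(timepoints) if tp == first]
    # A local generator keeps the global random state untouched.
    rng = random.Random(seed)
    return rng.sample(candidates, min(num_cells, len(candidates)))
```

Because the generator is seeded, two calls with the same seed return the same cells, which is the practical effect of fixing --seed when comparing runs.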

Run perturbation simulations

One of the advantages of training a PRESCIENT model is the ability to simulate the trajectory of out-of-sample or perturbed initial cells. To do this, individual genes or sets of genes are perturbed by setting their value(s) to a z-score in scaled expression space. The following function induces perturbations and generates simulated trajectories of both unperturbed and perturbed cells for comparison.

In the following example, GENE1, GENE2, and GENE3 are perturbed to a z-score of 5 in 10 random samples of 200 cells, and the cells are simulated forward to the final timepoint with a trained PRESCIENT model:
prescient perturbation_analysis -i /path/to/data.pt -p 'GENE1,GENE2,GENE3' -z 5 --model_path /path/to/trained/model_directory --seed 2 -o /path/to/output_dir

This will produce a PRESCIENT simulation object containing the following:

"perturbed_genes": list of genes perturbed
"unperturbed_sim": PC coordinates of unperturbed simulated trajectory
"perturbed_sim": PC coordinates of perturbed simulated trajectory

For more control over choosing cells, number of steps, etc., please refer to the CLI documentation.
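
To make "setting the value(s) to a z-score in scaled expression space" concrete, the sketch below standardizes each gene across cells and then overwrites the chosen genes with a fixed z-score. It is an illustration under stated assumptions, not PRESCIENT's implementation: the function name scale_and_perturb is hypothetical, scaling uses the population standard deviation, and the projection into PC coordinates that the outputs above refer to is omitted.

```python
from statistics import mean, pstdev


def scale_and_perturb(expr, gene_names, targets, z=5.0):
    """Return (unperturbed, perturbed) cells in z-scored expression space.

    expr: list of cells, each a list of expression values (one per gene).
    targets: gene names whose scaled value is overwritten with `z`.
    Hypothetical illustration only; PCA projection is not included.
    """
    columns = list(zip(*expr))  # one tuple of values per gene
    scaled_cols = []
    for col in columns:
        mu, sigma = mean(col), pstdev(col)
        # Genes with no variance are mapped to 0 to avoid dividing by zero.
        scaled_cols.append([(v - mu) / sigma if sigma else 0.0 for v in col])
    # Overwrite target genes with the requested z-score in every cell.
    perturbed_cols = [
        [z] * len(col) if name in targets else list(col)
        for name, col in zip(gene_names, scaled_cols)
    ]
    unperturbed = [list(row) for row in zip(*scaled_cols)]
    perturbed = [list(row) for row in zip(*perturbed_cols)]
    return unperturbed, perturbed
```

Returning both matrices mirrors the comparison between unperturbed and perturbed cells: only the columns for the targeted genes differ between the two.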