{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Training DeepRUOTv2\n",
    "\n",
    "DeepRUOTv2 uses a flexible configuration system in which users specify the parameters used to train the model. We provide the example configurations used to train on four scRNA-seq datasets: Mouse Blood Hematopoiesis (50D), Embryoid Body (50D), Pancreatic $\\beta$-cell differentiation (30D), and A549 EMT (10D). The configurations are stored in the `config/` folder.\n",
    "\n",
    "To train DeepRUOTv2 on your own dataset, convert it to a CSV file and store it in the `data/` folder. The column `samples` holds the biological time points, starting from time 0; we recommend normalizing the time scale to a reasonable range. The remaining columns, starting from `x1`, hold the gene expression features. Once the dataset is prepared, modify these parts of the configuration file:\n",
    "\n",
    "```yaml\n",
    "device: 'cuda' # device to run the model\n",
    "\n",
    "exp:\n",
    "  name: \"my_experiment\"     # Experiment name\n",
    "\n",
    "data:\n",
    "  file_path: \"data.csv\"     # Path to your dataset (must be a CSV file)\n",
    "  dim: 50                   # Data dimension\n",
    "\n",
    "model:\n",
    "  in_out_dim: 50 # Data dimension\n",
    "```\n",
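    "\n",
    "The dataset layout described above can be produced with a short script. The following is a minimal sketch using only the Python standard library. The function name `write_dataset_csv` and the choice to map the sorted distinct time points to 0, 1, 2, ... are assumptions made for illustration and are not part of DeepRUOTv2; use whichever time normalization suits your data.\n",
    "\n",
    "```python\n",
    "import csv\n",
    "\n",
    "\n",
    "def write_dataset_csv(path, times, features):\n",
    "    # Hypothetical helper: writes one row per observation as\n",
    "    # samples, x1, ..., xD and returns the feature dimension D.\n",
    "    if len(times) != len(features):\n",
    "        raise ValueError('times and features must have the same length')\n",
    "    dim = len(features[0])\n",
    "    if any(len(row) != dim for row in features):\n",
    "        raise ValueError('every row must have the same number of features')\n",
    "    # Assumption: rank the distinct time points so that they start at 0.\n",
    "    rank = {t: i for i, t in enumerate(sorted(set(times)))}\n",
    "    with open(path, 'w', newline='') as f:\n",
    "        writer = csv.writer(f)\n",
    "        writer.writerow(['samples'] + [f'x{j + 1}' for j in range(dim)])\n",
    "        for t, row in zip(times, features):\n",
    "            writer.writerow([rank[t]] + list(row))\n",
    "    return dim\n",
    "```\n",
    "\n",
    "The returned dimension is the value to use for both `data.dim` and `model.in_out_dim`.\n",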
    "\n",
    "For other hyperparameters, we recommend using the same settings as `config/weinreb_config.yaml`. The hyperparameter `use_pinn`, which controls whether the score model is updated in the final training phase, defaults to False. Setting it to True may improve performance but significantly increases training time, so we recommend keeping it False for more efficient training. If you encounter a `CUDA out of memory` error, try setting `sample_size` and `score_batch_size` to smaller values.\n",
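    "\n",
    "As a rough illustration of lowering the memory-related parameters, the sketch below halves `sample_size` and `score_batch_size` wherever they appear in a configuration that has already been loaded into a Python dictionary. The helper name `shrink_keys`, the halving factor, and the recursive search are assumptions for illustration; the search is used because the exact nesting of these keys in the configuration file is not described here.\n",
    "\n",
    "```python\n",
    "def shrink_keys(cfg, keys=('sample_size', 'score_batch_size'), factor=2):\n",
    "    # Hypothetical helper: returns a copy of a nested dict in which\n",
    "    # integer values stored under the given keys are divided by factor.\n",
    "    if not isinstance(cfg, dict):\n",
    "        return cfg\n",
    "    out = {}\n",
    "    for key, value in cfg.items():\n",
    "        is_int = isinstance(value, int) and not isinstance(value, bool)\n",
    "        if key in keys and is_int:\n",
    "            out[key] = max(1, value // factor)  # never drop below 1\n",
    "        else:\n",
    "            out[key] = shrink_keys(value, keys, factor)\n",
    "    return out\n",
    "```\n",
    "\n",
    "The dictionary can be read and written back with any YAML library, for example PyYAML's `yaml.safe_load` and `yaml.safe_dump`.\n",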
    "\n",
    "To train, specify the path to your configuration file and run `train_RUOT.py`:\n",
    "\n",
    "```bash\n",
    "python train_RUOT.py --config config/<config_name>.yaml\n",
    "```\n",
    "\n",
    "For example, to reproduce our results on the Mouse Blood Hematopoiesis dataset, run:\n",
    "\n",
    "```bash\n",
    "python train_RUOT.py --config config/weinreb_config.yaml\n",
    "```\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Evaluation\n",
    "\n",
    "After training, the model checkpoints `model_final` and `score_final` are saved in the `results/` directory; they can then be used to infer trajectories. We provide a Jupyter notebook for plotting the learned results in `evaluation/plot.ipynb`. Downstream analysis can be conducted using the notebook `evaluation/analysis.ipynb`.\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "DeepRUOTv2",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.18"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}