scLinguist documentation

scLinguist is a transformer-based framework for RNA-to-protein prediction that follows a two-step training strategy. First, we pretrain modality-specific models on large-scale single-omics datasets using self-supervised learning to extract informative representations. Second, we fine-tune the model on paired RNA-protein data, enabling accurate cross-modality translation. This training paradigm allows the model to leverage abundant single-omics data while still learning cross-modality relationships from limited paired datasets.

Figure: Overview of the scLinguist workflow (_images/workflow.png)
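
The two-step strategy described above can be illustrated with a toy PyTorch sketch. This is not scLinguist's code or API. The small feed-forward models stand in for the transformer, the data are random, and the names ToyEncoderDecoder, pretrain_masked, RnaToProtein, and finetune are hypothetical. Masked reconstruction is used here as one common self-supervised objective, and joining the pretrained RNA encoder to the pretrained protein decoder is an assumption made for illustration; the documentation above does not specify either detail.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy sizes chosen for illustration only (not scLinguist's dimensions).
N_GENES, N_PROTEINS, D_MODEL = 32, 8, 16


class ToyEncoderDecoder(nn.Module):
    """Hypothetical stand-in for one modality-specific model."""

    def __init__(self, n_features: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, D_MODEL), nn.ReLU())
        self.decoder = nn.Linear(D_MODEL, n_features)

    def forward(self, x):
        return self.decoder(self.encoder(x))


def pretrain_masked(model, data, steps=50, mask_rate=0.3):
    """Step 1: self-supervised pretraining on unpaired single-omics data.

    Assumed objective: zero out random entries and reconstruct them.
    """
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(steps):
        mask = torch.rand_like(data) < mask_rate
        corrupted = data.masked_fill(mask, 0.0)
        # Score the reconstruction only on the masked positions.
        loss = ((model(corrupted) - data)[mask] ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model


class RnaToProtein(nn.Module):
    """Assumed composition: pretrained RNA encoder + pretrained protein decoder."""

    def __init__(self, rna_model, protein_model):
        super().__init__()
        self.rna_encoder = rna_model.encoder
        self.protein_decoder = protein_model.decoder

    def forward(self, rna):
        return self.protein_decoder(self.rna_encoder(rna))


def finetune(model, rna, protein, steps=100):
    """Step 2: supervised fine-tuning on a small paired RNA-protein dataset."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    losses = []
    for _ in range(steps):
        loss = nn.functional.mse_loss(model(rna), protein)
        opt.zero_grad()
        loss.backward()
        opt.step()
        losses.append(loss.item())
    return losses


# Abundant unpaired data for each modality (random placeholders).
rna_unpaired = torch.rand(256, N_GENES)
protein_unpaired = torch.rand(256, N_PROTEINS)

# A much smaller paired dataset with a synthetic RNA-to-protein relationship.
rna_paired = torch.rand(32, N_GENES)
protein_paired = rna_paired[:, :N_PROTEINS] * 2.0

rna_model = pretrain_masked(ToyEncoderDecoder(N_GENES), rna_unpaired)
protein_model = pretrain_masked(ToyEncoderDecoder(N_PROTEINS), protein_unpaired)

translator = RnaToProtein(rna_model, protein_model)
losses = finetune(translator, rna_paired, protein_paired)

with torch.no_grad():
    predicted_protein = translator(rna_paired)
```

The point of the sketch is the ordering: each modality-specific model first sees only its own unpaired data, and the paired data are used only in the second step, where the pretrained weights serve as the starting point.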

Getting started with scLinguist

To begin using scLinguist, please refer to the following sections of the documentation:

The Installation section provides instructions for setting up scLinguist in your environment.
The Tutorials section contains examples of how to use scLinguist.