Garfield: Graph-based Contrastive Learning enable Fast Single-Cell Embedding
API
Import Garfield as:
import Garfield
Configuration for Garfield
|
Set global parameters for figures. |
|
Set Garfield parameters |
|
Set working directory. |
Reading
|
Initialize edge-level and node-level training and validation dataloaders. |
|
Split a PyG Data object into training, validation and test PyG Data objects using an edge-level split. |
|
Split data on node-level into training, validation and test sets by adding node-level masks (train_mask, val_mask, test_mask) to the PyG Data object. |
|
Prepares the dataset for training and evaluation by performing node-level and edge-level splits and returns a dictionary containing the processed data. |
|
Spatially annotated torch dataset class to extract node features, node labels, adjacency matrix and edge indices in a standardized format from an AnnData object. |
See more at anndata
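As an illustration of the node-level split described above, the following is a minimal plain-Python sketch, not Garfield's implementation. The function name `node_level_masks`, its parameters, and the use of Python lists instead of PyG tensors are assumptions made for this example; only the mask names (train_mask, val_mask, test_mask) come from the description.

```python
import random


def node_level_masks(n_nodes, val_ratio=0.1, test_ratio=0.1, seed=0):
    """Return boolean train/val/test masks that partition n_nodes nodes.

    Hypothetical helper: mirrors the idea of attaching node-level masks
    to a graph object, but works on plain lists.
    """
    rng = random.Random(seed)
    order = list(range(n_nodes))
    rng.shuffle(order)  # random assignment of nodes to splits

    n_val = int(n_nodes * val_ratio)
    n_test = int(n_nodes * test_ratio)
    val_idx = set(order[:n_val])
    test_idx = set(order[n_val:n_val + n_test])

    # Every node not drawn for validation or test is a training node.
    train_mask = [i not in val_idx and i not in test_idx for i in range(n_nodes)]
    val_mask = [i in val_idx for i in range(n_nodes)]
    test_mask = [i in test_idx for i in range(n_nodes)]
    return train_mask, val_mask, test_mask


train_mask, val_mask, test_mask = node_level_masks(20, val_ratio=0.2, test_ratio=0.1)
```

Each node ends up in exactly one of the three masks, so the masks can be used to index node features or labels without overlap between splits.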
Preprocessing
|
Calculate gene scores of scATACseq data |
|
For each row in query_arr, compute its nearest neighbor in target_arr. |
|
Preprocessing single-cell RNA-seq data |
|
Preprocess scATAC data matrix. |
|
Preprocessing single-cell RNA-seq data |
|
Preprocessing function for single-cell and multi-modal data. |
|
Processes single or multi-modal data (e.g., RNA, ATAC, ADT, spatial) with optional preprocessing steps such as normalization, feature selection, and dimensionality reduction. |
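The nearest-neighbor search described above ("for each row in query_arr, compute its nearest neighbor in target_arr") can be sketched in plain Python as follows. This is a brute-force illustration, not the library's code: the function name `nearest_neighbors` and the choice of Euclidean distance are assumptions, since the description does not state the metric.

```python
import math


def nearest_neighbors(query_arr, target_arr):
    """For each row of query_arr, return (index, distance) of its closest row in target_arr.

    Assumes Euclidean distance; the actual metric used by Garfield is not
    stated in this reference.
    """
    results = []
    for q in query_arr:
        best_idx, best_dist = -1, math.inf
        for j, t in enumerate(target_arr):
            d = math.dist(q, t)
            if d < best_dist:
                best_idx, best_dist = j, d
        results.append((best_idx, best_dist))
    return results


query_arr = [[0.0, 0.0], [5.0, 5.0]]
target_arr = [[1.0, 0.0], [4.0, 4.0], [10.0, 10.0]]
matches = nearest_neighbors(query_arr, target_arr)
```

The brute-force loop costs one distance computation per query-target pair, which is fine for illustration; large datasets would normally use an indexed or approximate search instead.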
Model
|
Trains a weighted KNN classifier on |
|
Annotates |
|
Garfield: Graph-based Contrastive Learning enable Fast Single-Cell Embedding |
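To make the idea of a weighted KNN classifier concrete, here is a small self-contained sketch. It is not the Garfield API: the function name `weighted_knn_predict`, the inverse-distance weighting, and the toy cell-type labels are all assumptions chosen for illustration, and the library may weight neighbors differently.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train_points, train_labels, query, k=3, eps=1e-12):
    """Predict a label for `query` by inverse-distance-weighted voting over k neighbors.

    Returns the winning label and its share of the total vote weight.
    """
    # Sort all training points by distance to the query and keep the k closest.
    dists = sorted(
        (math.dist(p, query), label) for p, label in zip(train_points, train_labels)
    )
    votes = defaultdict(float)
    for d, label in dists[:k]:
        votes[label] += 1.0 / (d + eps)  # closer neighbors count more

    total = sum(votes.values())
    best = max(votes, key=votes.get)
    return best, votes[best] / total


train_points = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]]
train_labels = ["T cell", "T cell", "B cell", "B cell"]
label, confidence = weighted_knn_predict(train_points, train_labels, [0.2, 0.1], k=3)
```

The vote share returned alongside the label can serve as a rough confidence score, for example to flag query cells whose neighbors disagree.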
Loss
|
Computes MSE loss between reconstructed data and ground truth data. |
|
Compute edge reconstruction weighted binary cross entropy loss with logits using ground truth edge labels and predicted edge logits. |
|
Compute Kullback-Leibler divergence as per Kingma, D. |
|
Compute the contrastive loss given two batches of feature vectors z_i and z_j. |
|
Cluster loss function. |
|
Initializes Maximum Mean Discrepancy (MMD) between source_features and target_features. |
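Two of the losses listed above have standard closed forms that can be written out in a few lines. The sketch below is plain Python and is not taken from Garfield's source: the function names and the `pos_weight` parameter are assumptions (the latter modelled on the usual convention for weighted binary cross entropy with logits), and the KL term assumes a diagonal Gaussian posterior compared against a standard normal prior, as is common in variational autoencoders.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def weighted_bce_with_logits(logits, labels, pos_weight=1.0):
    """Mean binary cross entropy on raw edge logits.

    Positive edges (label 1) are scaled by pos_weight, which helps when
    negative edges heavily outnumber positive ones.
    """
    total = 0.0
    for x, y in zip(logits, labels):
        # softplus(-x) = -log(sigmoid(x)); softplus(x) = -log(1 - sigmoid(x))
        total += pos_weight * y * softplus(-x) + (1 - y) * softplus(x)
    return total / len(logits)


def gaussian_kl(mu, logvar):
    """KL( N(mu, exp(logvar)) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(1 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


edge_loss = weighted_bce_with_logits([2.0, -1.0, 0.5], [1, 0, 1], pos_weight=2.0)
kl_loss = gaussian_kl([0.0, 0.0], [0.0, 0.0])
```

Working on logits rather than probabilities avoids taking the logarithm of values that round to zero, which is why the stable `softplus` form is used here.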
Modules
|
Garfield model class. |
NN
|
The GATEncoder class implements a Graph Attention Network (GAT) encoder with multiple layers, normalization, and optional fully connected (FC) encoder. |
|
The GCNEncoder class implements a Graph Convolutional Network (GCN) encoder with multiple layers, normalization, and optional fully connected (FC) encoder. |
|
Graph Attention Network (GAT) Decoder class. |
|
Graph Convolutional Network (GCN) Decoder class. |
|
Domain-specific Batch Normalization |
Trainer
|
Initializes the GarfieldTrainer class, which handles data preparation, model initialization, and training of the Garfield model. |
|
Get the evaluation metrics for a (balanced) sample of positive and negative edges and a sample of nodes. |
|
Plot evaluation metrics. |
Tools
|
EarlyStopping class for early stopping of Garfield training. |
|
Create message for '_print_progress_bar()' and print it out with a progress bar. |
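Early stopping, as mentioned above, usually means halting training once a monitored metric stops improving. The following is a generic sketch of that pattern, not Garfield's EarlyStopping class: the class name `EarlyStoppingSketch`, the `patience` and `min_delta` parameters, and the choice of validation loss as the monitored metric are assumptions made for illustration.

```python
class EarlyStoppingSketch:
    """Signal a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            # Improvement: remember the new best value and reset the counter.
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


stopper = EarlyStoppingSketch(patience=2)
losses = [1.0, 0.8, 0.81, 0.79, 0.80, 0.82, 0.70]
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
```

In this toy run the loop stops before reaching the final, lower loss value, which illustrates the trade-off behind the patience setting: a small value saves time but can end training just before a later improvement.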
Analysis
|
Calculate marker statistics for grouped data. |
|
Filter marker statistics based on thresholds. |
|
Aggregate top marker genes. |
|
Perform Enrichr analysis on top genes for each niche derived from aggregated marker statistics (non-parallel version). |
|
Perform Enrichr analysis on top genes for each niche derived from aggregated marker statistics. |
|
Perform GSEA analysis for each niche based on the full ranked gene list from marker statistics. |
|
Normalize the cell type abundance based on the nearest neighbors for each batch in the AnnData object. |
Plot
|
Plot taken from cell2location at https://github.com/BayraktarLab/cell2location. |
|
Plot markers for specific groups. |
|
Create a barplot for the enrichment results (either from Enrichr or GSEA). |
|
Create a dotplot for the enrichment results (either from Enrichr or GSEA). |