Hackathon Challenges – Single Cell Multiome Data Analysis
Outlined below are some general ideas around the theme of analyzing single-cell multimodal data that can be tackled during the hackathon. Please feel free to generate your own ideas as well!
Benchmarking of current integration pipelines
- Compare different methods (e.g., Seurat v5, Harmony, LIGER, MOFA+) on integration of unpaired multiome datasets.
- Design integration metrics (biological conservation vs modality mixing)
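As a starting point for the metric design idea above, the two axes can be sketched as neighbourhood statistics on an integrated embedding. This is a minimal illustration, not an established benchmark: the function names, the choice of k, and the toy data are all assumptions made for the example. Biological conservation is scored as the fraction of a cell's nearest neighbours that share its cell-type label, and modality mixing as the fraction of neighbours that come from the other modality.

```python
import numpy as np

def knn_indices(embedding, k):
    """Indices of the k nearest neighbours of each row (self excluded)."""
    sq = np.sum(embedding ** 2, axis=1)
    # Squared Euclidean distances via the expansion |a - b|^2 = |a|^2 + |b|^2 - 2ab
    dist = sq[:, None] + sq[None, :] - 2.0 * embedding @ embedding.T
    np.fill_diagonal(dist, np.inf)
    return np.argsort(dist, axis=1)[:, :k]

def neighbour_agreement(embedding, labels, k=10):
    """Mean fraction of each cell's k neighbours that share its label."""
    labels = np.asarray(labels)
    nbrs = knn_indices(embedding, k)
    return float(np.mean(labels[nbrs] == labels[:, None]))

def integration_scores(embedding, cell_types, modalities, k=10):
    """Hypothetical helper: higher is better for both scores."""
    bio = neighbour_agreement(embedding, cell_types, k)
    mixing = 1.0 - neighbour_agreement(embedding, modalities, k)
    return {"bio_conservation": bio, "modality_mixing": mixing}

# Toy embedding (assumption): two well-separated cell types,
# with modality labels that are unrelated to position.
rng = np.random.default_rng(0)
n = 200
cell_types = np.repeat(["A", "B"], n // 2)
modalities = np.tile(["GEX", "ATAC"], n // 2)
centers = np.where(cell_types[:, None] == "A", -5.0, 5.0)
embedding = centers + rng.normal(size=(n, 2))

scores = integration_scores(embedding, cell_types, modalities)
print(scores)
```

With two equally sized modalities, a well-mixed embedding gives a modality_mixing score near 0.5 rather than 1.0, so any comparison across datasets would need to account for modality proportions.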
Visualization of scMultiomics
Best Methods/Practices for Visualizing Different Modalities
- Visualization techniques for unpaired data (e.g., using MultiVI to align and merge latent spaces)
- Coverage tracks (ATAC) plus violin plots (GEX) are commonly used for paired data; are there equivalent visualizations for unpaired data?
- Dimensionality reduction techniques for visualizing high-dimensional data (e.g., UMAP, t-SNE)
Challenges in Visualizing Unpaired Data
- Creating visual representations that highlight correlations between different modalities
Inferring True Gene Expression from Chromatin Accessibility
Data Simulation + Perturbation Testing
- Simulate unpaired multiome data with known ground truth (e.g., synthetic paired ATAC/GEX splits) to test integration robustness.
- Add noise or dropout in one modality to explore denoising/stability strategies.
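The two simulation ideas above can be sketched with plain NumPy. The example below is a rough illustration under stated assumptions: the count matrices are random Poisson stand-ins rather than real multiome data, and the helper names are hypothetical. Each paired cell is assigned to exactly one modality, so the retained indices serve as ground truth, and dropout is then applied to one modality by zeroing entries at random.

```python
import numpy as np

def split_paired_to_unpaired(gex, atac, rng, frac_gex=0.5):
    """Assign each paired cell to exactly one modality.

    Returns the two sub-matrices plus the original row indices,
    which act as the known ground truth for later evaluation.
    """
    n_cells = gex.shape[0]
    perm = rng.permutation(n_cells)
    n_gex = int(round(frac_gex * n_cells))
    gex_idx = np.sort(perm[:n_gex])
    atac_idx = np.sort(perm[n_gex:])
    return gex[gex_idx], atac[atac_idx], gex_idx, atac_idx

def add_dropout(counts, rate, rng):
    """Zero out each entry independently with probability `rate`."""
    keep = rng.random(counts.shape) >= rate
    return counts * keep

# Toy paired data (assumption): 100 cells, 20 genes, 50 peaks.
rng = np.random.default_rng(0)
gex = rng.poisson(2.0, size=(100, 20))
atac = rng.poisson(0.5, size=(100, 50))

gex_u, atac_u, gex_idx, atac_idx = split_paired_to_unpaired(gex, atac, rng)
atac_noisy = add_dropout(atac_u, rate=0.3, rng=rng)
print(gex_u.shape, atac_u.shape, atac_u.sum(), atac_noisy.sum())
```

An alternative design keeps both modalities for every cell and only hides the pairing by shuffling rows, which preserves cell-level ground truth for pairing-based metrics.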
Predictive Modeling
- Using machine learning models to predict true gene expression from chromatin accessibility data (e.g., BABEL, MAESTRO, bridge integration)
- Developing interpretable models that prioritize transcription factors involved in gene regulation
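Before turning to the dedicated tools named above, a simple linear baseline can help calibrate expectations. The sketch below is not how BABEL, MAESTRO, or bridge integration work. It is a closed-form ridge regression from accessibility features to expression, fitted on synthetic Gaussian data that stands in for normalized features (an assumption). The learned weights are per peak, so ranking them gives only a crude form of interpretability; mapping peaks to transcription factors would need motif information that is not modelled here.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: solve (X^T X + alpha I) W = X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)

def r2_per_gene(y_true, y_pred):
    """Coefficient of determination, computed separately for each gene."""
    ss_res = np.sum((y_true - y_pred) ** 2, axis=0)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2, axis=0)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (assumption): expression is a noisy
# linear function of accessibility.
rng = np.random.default_rng(0)
n_cells, n_peaks, n_genes = 300, 30, 5
atac = rng.normal(size=(n_cells, n_peaks))
true_weights = rng.normal(size=(n_peaks, n_genes))
gex = atac @ true_weights + 0.1 * rng.normal(size=(n_cells, n_genes))

# Hold out the last 100 cells; centre using training means only.
train, test = slice(0, 200), slice(200, 300)
x_mean = atac[train].mean(axis=0)
y_mean = gex[train].mean(axis=0)
weights = fit_ridge(atac[train] - x_mean, gex[train] - y_mean, alpha=1.0)

pred = (atac[test] - x_mean) @ weights + y_mean
r2 = r2_per_gene(gex[test], pred)

# For each gene, the three peaks with the largest absolute weight.
top_peaks = np.argsort(-np.abs(weights), axis=0)[:3]
print(r2.round(3), top_peaks.shape)
```

On real data the relationship is neither linear nor this clean, so held-out performance of a baseline like this is best read as a floor against which more expressive models are compared.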
Enhancer-Promoter Interactions
- Benchmarking methods to infer gene regulatory networks from single-cell multiome data (e.g., LINGER)
Integration of Unpaired Single Cell Multiome Data
- Best practices for dimensionality reduction while preserving biology to integrate unpaired single-cell multiome data (e.g., CCA, WNN, UMAP, totalVI)
- Benchmarking existing integration methods to evaluate their performance (e.g., GLUE, LIGER, MinNet)
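When a benchmark uses paired data with the pairing hidden from the integration method, one commonly used cell-level score is FOSCTTM (fraction of samples closer than the true match). The sketch below is a minimal NumPy version under stated assumptions: both embeddings hold the same cells in the same row order, Euclidean distance is used, and the toy embeddings are random. Lower is better, with values near 0 for good alignment and near 0.5 for unrelated embeddings.

```python
import numpy as np

def foscttm(emb_a, emb_b):
    """Fraction of samples closer than the true match, averaged over
    both directions. Row i of emb_a and row i of emb_b must be the
    same cell measured in two modalities.
    """
    sq_a = np.sum(emb_a ** 2, axis=1)
    sq_b = np.sum(emb_b ** 2, axis=1)
    # Squared Euclidean distance between every cell in A and every cell in B
    dist = sq_a[:, None] + sq_b[None, :] - 2.0 * emb_a @ emb_b.T
    true_dist = np.diag(dist)
    n = dist.shape[0]
    # For each cell in A: share of B cells closer than its true partner
    frac_a = np.sum(dist < true_dist[:, None], axis=1) / (n - 1)
    # For each cell in B: share of A cells closer than its true partner
    frac_b = np.sum(dist < true_dist[None, :], axis=0) / (n - 1)
    return float(np.mean(np.concatenate([frac_a, frac_b])))

# Toy embeddings (assumption): one nearly perfect alignment and one
# embedding that is unrelated to the first.
rng = np.random.default_rng(0)
emb_gex = rng.normal(size=(200, 5))
emb_atac_good = emb_gex + 0.01 * rng.normal(size=(200, 5))
emb_atac_random = rng.normal(size=(200, 5))

good = foscttm(emb_gex, emb_atac_good)
bad = foscttm(emb_gex, emb_atac_random)
print(good, bad)
```

Because it needs true cell pairings, this score only applies to simulated or deliberately unpaired data; for genuinely unpaired datasets, label-based metrics are the fallback.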