Genomic data

We conduct analyses of genotyping array, whole exome sequencing (WES), and whole genome sequencing (WGS) data. Our process includes quality control, data preprocessing, genetic variant calling, and functional annotation. We also perform adjustment for confounders, imputation, and association analyses for both common variants (using genome-wide association studies, or GWAS) and rare variants (employing methods such as burden tests, SKAT, and SKAT-O), and we visualize the results to enhance interpretability. Currently, we also focus on detecting somatic mutations from next-generation sequencing (NGS) data.
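
As an illustration of the rare-variant methods mentioned above, the collapsing step of a simple burden test can be sketched as follows. This is a minimal sketch, not our production pipeline: the function names, the 0/1/2 genotype encoding with `None` for missing calls, the assumption that the alternate allele is the minor allele, and the default frequency threshold are all hypothetical choices made for the example. The resulting per-sample scores would then be tested for association with the phenotype in a regression model, which is not shown here.

```python
def alt_allele_frequency(genotypes):
    """Alternate allele frequency from per-sample allele counts (0, 1, 2).

    Missing calls are encoded as None and ignored (an assumption of this sketch).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    return sum(called) / (2 * len(called))


def burden_scores(genotype_matrix, maf_threshold=0.01):
    """Collapse rare variants in one gene into a per-sample burden score.

    genotype_matrix: list of variants, each a list of per-sample allele counts.
    Variants at or above the (hypothetical) frequency threshold are skipped,
    as are variants with no alternate alleles at all.
    """
    if not genotype_matrix:
        return []
    n_samples = len(genotype_matrix[0])
    scores = [0] * n_samples
    for variant in genotype_matrix:
        freq = alt_allele_frequency(variant)
        if freq == 0.0 or freq >= maf_threshold:
            continue
        for i, g in enumerate(variant):
            if g is not None:
                scores[i] += g  # unweighted sum of rare alternate alleles
    return scores
```

For example, `burden_scores(variants, maf_threshold=0.01)` returns one integer per sample, which can serve as a single predictor for the gene in a downstream association model.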

Transcriptome data

We analyse bulk and single-cell RNA sequencing, microarray, and RT-qPCR data, covering quality control, data filtering, normalisation, and differential expression analysis, followed by functional enrichment analysis (gene ontology (GO), pathway, and phenotype enrichment, and drug target identification). We visualize the results with volcano plots, heatmaps, and t-SNE and PCA plots. Our analyses also extend to small non-coding RNAs (microRNAs) and long non-coding RNAs (lncRNAs). We also investigate alternative splicing events and detect allele-specific expression.
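
To show how differential expression results map onto a volcano plot, the sketch below converts one gene's log2 fold change and p-value into plot coordinates and a category. It is a minimal illustration under stated assumptions: the function name and both cutoffs (an absolute log2 fold change of 1 and a p-value of 0.05) are hypothetical, and in practice the p-value would normally be adjusted for multiple testing beforehand.

```python
import math


def volcano_point(log2_fold_change, p_value, lfc_cutoff=1.0, p_cutoff=0.05):
    """Return (x, y, label) for one gene on a volcano plot.

    x is the log2 fold change, y is -log10(p-value), and label is
    'up', 'down', or 'ns' (not significant) under the given cutoffs.
    The cutoffs are illustrative defaults, not fixed conventions.
    """
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in (0, 1]")
    neg_log10_p = -math.log10(p_value)
    significant = p_value < p_cutoff
    if significant and log2_fold_change >= lfc_cutoff:
        label = "up"
    elif significant and log2_fold_change <= -lfc_cutoff:
        label = "down"
    else:
        label = "ns"
    return log2_fold_change, neg_log10_p, label
```

Applying `volcano_point` to every gene yields the x and y coordinates and the colour groups that a plotting library can then draw.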

Microbiome data

For 16S rRNA amplicon sequencing data, we perform quality control and adapter trimming, pre-processing (including denoising and identification of amplicon sequence variants, ASVs), microbial identification at various taxonomic levels, and predictive functional profiling of the microbial communities. Downstream analyses include alpha diversity metrics (e.g., Shannon diversity, Simpson's index) to evaluate within-sample diversity, beta diversity assessments to compare communities between samples, and tests of their associations with environmental and clinical factors.
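
The two alpha diversity metrics named above can be computed from the ASV counts of a single sample, as in the following minimal sketch. The function names are hypothetical, the Shannon index uses the natural logarithm, and Simpson's index is given in its Gini-Simpson form (one minus the sum of squared proportions); other conventions exist, so these choices are assumptions of the example.

```python
import math


def _proportions(counts):
    """Relative abundances of taxa with non-zero counts in one sample."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return [c / total for c in counts if c > 0]


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over observed taxa."""
    return -sum(p * math.log(p) for p in _proportions(counts))


def gini_simpson_index(counts):
    """Gini-Simpson diversity 1 - sum(p_i ** 2) over observed taxa."""
    return 1.0 - sum(p * p for p in _proportions(counts))
```

Both functions take a list of counts per ASV for one sample. A sample dominated by a single taxon scores close to zero on both metrics, while an even community scores higher.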

Other omics and clinical / phenotype data

We provide comprehensive preprocessing of omics and clinical/phenotypic data to ensure reliable and interpretable analysis, including outlier detection, normality checks, skewness and kurtosis assessment, missingness quantification, filtering, imputation, normalization, transformation, and scaling. We also perform dimensionality reduction, exploratory analyses, and visualization to detect patterns and trends in the data.
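
A few of the preprocessing checks listed above can be sketched for a single numeric variable as follows. This is an illustrative sketch only: the function names are hypothetical, missing values are assumed to be encoded as `None`, skewness and kurtosis use the simple population (biased) moment estimators, and scaling is shown as a z-score, which is just one of several possible choices.

```python
import statistics


def missing_fraction(values):
    """Fraction of entries that are missing (encoded as None in this sketch)."""
    return sum(v is None for v in values) / len(values)


def _central_moment(values, order):
    mean = statistics.fmean(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Population skewness m3 / m2**1.5 (0 for symmetric data)."""
    return _central_moment(values, 3) / _central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Population excess kurtosis m4 / m2**2 - 3 (0 for a normal distribution)."""
    return _central_moment(values, 4) / _central_moment(values, 2) ** 2 - 3.0


def zscore(values):
    """Scale complete data to mean 0 and population standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("cannot scale a constant variable")
    return [(v - mean) / sd for v in values]
```

The moment-based functions expect complete data, so missing values would be filtered or imputed before `skewness`, `excess_kurtosis`, or `zscore` is called.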

Data integration

We apply various data integration methods, including quantitative trait loci (QTL) mapping, which links genetic variation to other omics data; co-expression/co-abundance networks, which capture and visualize relationships between genes or other omics features; and machine learning (ML) approaches, which model complex nonlinear relationships across omics layers.
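
At its core, QTL mapping tests whether a molecular feature changes with genotype. The sketch below shows the simplest possible version, an ordinary least-squares fit of one feature on one variant's allele dosage. It is a minimal illustration under stated assumptions: the function name is hypothetical, dosages are coded 0/1/2, and covariate adjustment, significance testing, and multiple-testing correction, all of which a real analysis needs, are omitted.

```python
def qtl_effect(dosages, feature_values):
    """Least-squares slope and intercept of a feature regressed on allele dosage.

    dosages: per-sample allele counts (0, 1, or 2) for one variant.
    feature_values: per-sample values of one omics feature (e.g. expression).
    The slope estimates the change in the feature per additional allele.
    """
    if len(dosages) != len(feature_values):
        raise ValueError("dosages and feature_values must have the same length")
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(feature_values) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("variant is monomorphic in this sample set")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, feature_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

Repeating this fit for every variant and feature pair of interest, and then assessing significance, gives the basic structure of a QTL scan.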

FAIR computational workflows

We are currently focused on enhancing research software quality and automating all our analysis pipelines by implementing them as FAIR (Findable, Accessible, Interoperable, and Reusable) computational workflows. Our efforts include developing modular pipelines that use containerization and workflow management systems, so that the same workflow runs in both local and cloud environments. These workflows are supplemented with rich metadata, promoting effective access and collaboration within our group and, later, across the broader scientific community.

Systems biology and mathematical modelling

Our systems biology team specializes in investigating complex biological mechanisms by constructing mathematical models that replicate system behaviour based on interactions among the system's elements. Their primary focus is on modelling genome-wide metabolic processes by integrating diverse omics data, and on physiologically based pharmacokinetic (PBPK) processes, which contributes to advancing precision medicine. With these models, the team aims to provide deeper insights into metabolic functions and regulatory mechanisms, facilitating the development of targeted therapeutic strategies and enhancing our understanding of disease dynamics.