Weighted Gene Co-expression Network Analysis (WGCNA) is a network-based approach that constructs weighted gene co-expression networks, groups genes into co-expressed modules, and relates these modules to complex phenotypic traits. Despite its wide use, setting up reproducible and scalable WGCNA pipelines from RNA-Seq data still often requires several manual steps. In this work, we present Auto_WGCNA, an integrated R-based pipeline that, starting from gene count matrices and a table of phenotypic metadata, automatically performs all the main stages of a WGCNA analysis. The pipeline performs import and VST normalization of raw counts with DESeq2, constructs the weighted co-expression network and detects gene modules, computes module eigengenes, correlates them with phenotypic traits to generate module–trait heatmaps, and identifies hub genes and module membership, providing all the output needed for downstream functional analyses. The pipeline is provided as a command-line script that takes as input the count files, the phenotypic metadata, and the main analysis parameters, making it suitable for execution in HPC environments and for studies requiring the systematic analysis of multiple datasets or comparisons. This system reduces the technical burden on the user, ensures consistency between interactive and batch analyses, and facilitates the adoption of WGCNA in gene expression studies integrating complex phenotypic information.
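The downstream stages named above (weighted network construction, module summary, module–trait correlation, hub gene identification) can be illustrated with a small, language-neutral sketch. This is not the Auto_WGCNA code, which is an R script built on DESeq2 and WGCNA; it is a toy Python illustration under stated assumptions. The gene names, expression values, trait vector, module assignment, and the soft-threshold power `beta=6` are all hypothetical, the input is assumed to be already normalized, and the mean of z-scored profiles is used as a simple stand-in for the module eigengene (WGCNA itself uses the first principal component).

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def zscore(x):
    """Center and scale one expression profile (sample standard deviation)."""
    n = len(x)
    m = sum(x) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in x) / (n - 1))
    return [(v - m) / sd for v in x]


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency a_ij = |cor(x_i, x_j)| ** beta."""
    genes = list(expr)
    return {
        g: {h: abs(pearson(expr[g], expr[h])) ** beta for h in genes if h != g}
        for g in genes
    }


def hub_gene(adjacency, module):
    """Return the gene with the highest intramodular connectivity."""
    k = {g: sum(adjacency[g][h] for h in module if h != g) for g in module}
    return max(k, key=k.get), k


def module_profile(expr, module):
    """Mean of z-scored profiles: a stand-in for the module eigengene."""
    z = [zscore(expr[g]) for g in module]
    return [sum(col) / len(module) for col in zip(*z)]


# Hypothetical, already-normalized expression values (5 samples per gene).
expr = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [1, 3, 2, 4, 5],
    "geneC": [1, 2, 3, 5, 4],
    "geneD": [5, 1, 4, 2, 3],
}
trait = [1, 2, 3, 4, 5]               # hypothetical phenotypic trait
module = ["geneA", "geneB", "geneC"]  # assumed module assignment

adjacency = soft_threshold_adjacency(expr, beta=6)
hub, connectivity = hub_gene(adjacency, module)
module_trait_r = pearson(module_profile(expr, module), trait)

print("hub gene:", hub)
print("module-trait correlation:", round(module_trait_r, 3))
```

In an actual analysis, the module assignment would come from clustering the network rather than being supplied by hand, and the power would be chosen from the data rather than fixed in advance.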
Auto_WGCNA: A Reproducible R Pipeline from Gene Counts to Hub Gene Identification / Giannelli, Federico; Arcieri, Manuel; Carrus, Meryam; Bottoni, Paolo; Castrignanò, Tiziana. - (2026), pp. 196-209. (13th International Conference on Big Data Analytics, Aizu, Japan) [10.1007/978-3-032-23241-0_11].
Auto_WGCNA: A Reproducible R Pipeline from Gene Counts to Hub Gene Identification
Manuel Arcieri (second author); Paolo Bottoni (penultimate author)
2026
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


