TY - JOUR
T1 - netMUG: a novel network-guided multi-view clustering workflow for dissecting genetic and facial heterogeneity
AU - Li, Zuqi
AU - Melograna, Federico
AU - Hoskens, Hanne
AU - Duroux, Diane
AU - Marazita, Mary L.
AU - Walsh, Susan
AU - Weinberg, Seth M.
AU - Shriver, Mark D.
AU - Müller-Myhsok, Bertram
AU - Claes, Peter
AU - Van Steen, Kristel
N1 - Publisher Copyright: Copyright © 2023 Li, Melograna, Hoskens, Duroux, Marazita, Walsh, Weinberg, Shriver, Müller-Myhsok, Claes and Van Steen.
PY - 2023
Y1 - 2023
AB - Introduction: Multi-view data offer advantages over single-view data for characterizing individuals, which is crucial in precision medicine toward personalized prevention, diagnosis, or treatment follow-up. Methods: Here, we develop a network-guided multi-view clustering framework named netMUG to identify actionable subgroups of individuals. This pipeline first adopts sparse multiple canonical correlation analysis to select multi-view features possibly informed by extraneous data, which are then used to construct individual-specific networks (ISNs). Finally, the individual subtypes are automatically derived by hierarchical clustering on these network representations. Results: We applied netMUG to a dataset containing genomic data and facial images to obtain BMI-informed multi-view strata and showed how it could be used for a refined obesity characterization. Benchmark analysis of netMUG on synthetic data with known strata of individuals indicated its superior performance compared with both baseline and benchmark methods for multi-view clustering. The clustering derived from netMUG achieved an adjusted Rand index of 1 with respect to the synthesized true labels. In addition, the real-data analysis revealed subgroups strongly linked to BMI and genetic and facial determinants of these subgroups. Discussion: netMUG provides a powerful strategy, exploiting individual-specific networks to identify meaningful and actionable strata. Moreover, the implementation is easy to generalize to accommodate heterogeneous data sources or highlight data structures.
UR - http://www.scopus.com/inward/record.url?scp=85180126437&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85180126437&partnerID=8YFLogxK
U2 - 10.3389/fgene.2023.1286800
DO - 10.3389/fgene.2023.1286800
M3 - Article
C2 - 38125750
AN - SCOPUS:85180126437
SN - 1664-8021
VL - 14
JO - Frontiers in Genetics
JF - Frontiers in Genetics
M1 - 1286800
ER -