We propose a rigorous and efficient method for evaluating homophily and heterophily in edge-weighted networks. In a network whose nodes are partitioned into classes, homophily (resp., heterophily) is the tendency of edges to join nodes in the same class (resp., in different classes). Assuming a suitable null model, we provide a closed formula for the z-score of the total weight of homophilic/heterophilic edges for each class/pair of classes. The z-score directly measures how far this weight deviates from its expected value under the null model. We also propose a global homophily measure that quantifies, in a single score, how strongly the set of all classes, taken as a whole, tends to be homophilic. The proposed statistics can be computed for very large networks because, as we show, they can be evaluated efficiently in a data streaming setting. For a network with n nodes and m edges, our algorithm needs only O(n) internal memory, optimal O(m) worst-case time, and a single scan of the m input edges, in any order. Experimental results are reported on ten Protein-Protein Interaction networks, measuring homophily w.r.t. protein functional classes.
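
The single-scan idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: this record gives neither the null model nor the closed z-score formula, so the sketch only accumulates the observed weight per class pair and the weighted degree of each node in one pass, and leaves the expected value and standard deviation as inputs to a generic z-score. The names `stream_class_weights`, `z_score`, and `node_class` are hypothetical, and the sketch assumes each node belongs to exactly one class and that the number of classes is small.

```python
from collections import defaultdict


def stream_class_weights(edges, node_class):
    """Scan weighted edges once, in any order.

    edges: iterable of (u, v, w) tuples; it is consumed only once.
    node_class: dict mapping each node to its single class (assumption).
    Returns (pair_weight, strength):
      pair_weight[(c1, c2)] = total weight of edges between classes c1 <= c2
      strength[u]           = total weight of edges incident to node u
    """
    pair_weight = defaultdict(float)
    strength = defaultdict(float)
    for u, v, w in edges:
        cu, cv = node_class[u], node_class[v]
        # Store each unordered class pair under one canonical key.
        key = (cu, cv) if cu <= cv else (cv, cu)
        pair_weight[key] += w
        strength[u] += w
        strength[v] += w
    return dict(pair_weight), dict(strength)


def z_score(observed, expected, std):
    """Generic z-score; expected and std must come from the chosen null model."""
    return (observed - expected) / std


def edge_stream():
    # Hypothetical toy stream standing in for a large edge file.
    yield ("a", "b", 2.0)
    yield ("b", "c", 1.0)
    yield ("a", "c", 0.5)


classes = {"a": "X", "b": "X", "c": "Y"}
pair_weight, strength = stream_class_weights(edge_stream(), classes)
```

In this toy run the homophilic weight of class `X` is stored under the key `("X", "X")` and the heterophilic weight between `X` and `Y` under `("X", "Y")`. The edges are never stored, so memory depends on the number of nodes and of distinct class pairs rather than on the number of edges.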

Homophily of Large Weighted Networks in a Data Streaming Setting / Apollonio, Nicola; Franciosa, Paolo G.; Santoni, Daniele. - (2025), pp. 131-142. (Computational Intelligence Methods for Bioinformatics and Biostatistics, Padova, Italy) [10.1007/978-3-031-90714-2_10].

Homophily of Large Weighted Networks in a Data Streaming Setting

Franciosa, Paolo G.
Member of the Collaboration Group
2025

2025
Computational Intelligence Methods for Bioinformatics and Biostatistics
network homophily; z-scores; weighted networks; data streaming algorithms
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this item
File Size Format
Apollonio_homophily_2025.pdf

archive managers only

Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 1.54 MB
Format: Adobe PDF
Apollonio_homophily_2025.pdf.pdf

open access

Type: Post-print (version after peer review, accepted for publication)
License: All rights reserved
Size: 584.05 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1738533