Identification of multivariable Boolean patterns in microbiome and microbial gene composition data

George Golovko, Kamil Khanipov, Victor Reyes, Irina Pinchuk, Yuriy Fofanov

Research output: Contribution to journal › Article › peer-review

Abstract

Virtually every biological system is governed by complex relations among its components. Identifying such relations requires a rigorous or heuristics-based search for patterns among variables/features of a system. Various algorithms have been developed to identify two-dimensional (involving two variables) patterns employing correlation, covariation, mutual information, etc. It seems obvious, however, that comprehensive descriptions of complex biological systems also need to include more complicated multivariable relations, which can only be described using patterns that simultaneously embrace three, four, or more variables. The goal of this manuscript is to (a) introduce a novel type of association (multivariable Boolean patterns) that can be manifested between features of complex systems but cannot be identified (described) by traditional pairwise metrics; (b) propose a pattern classification method; and (c) provide a novel definition of a pattern's strength (the pattern's score) able to accommodate heterogeneous multi-omics data. To demonstrate the presence of such patterns, we performed a search for all possible 2-, 3-, and 4-dimensional patterns in historical data from the Human Microbiome Project (15 body sites) and a collection of H. pylori genomes associated with gastric ulcers, gastritis, and duodenal ulcers. In all datasets under consideration, we were able to identify hundreds of statistically significant multivariable patterns. These results suggest that such patterns can be common in microbial genomics/microbiomics systems.
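
The claim that some relations are invisible to pairwise metrics can be illustrated with a toy example. The sketch below is not the authors' method or their pattern score; the presence/absence table, the feature names A, B, and C, and the helper `pearson` are all hypothetical. It builds three binary features in which C is present exactly when one, but not both, of A and B is present (an exclusive-or relation), then shows that every pairwise Pearson correlation is zero even though the three-variable rule holds in every sample.

```python
from itertools import combinations, product


def pearson(x, y):
    """Plain Pearson correlation for two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical presence/absence table: every combination of A and B,
# repeated five times, with C defined as A XOR B.
rows = [(a, b, a ^ b) for a, b in product([0, 1], repeat=2)] * 5
A, B, C = (list(col) for col in zip(*rows))
columns = {"A": A, "B": B, "C": C}

# Pairwise view: correlation for each of the three feature pairs.
pairwise = {
    (p, q): pearson(columns[p], columns[q])
    for p, q in combinations(columns, 2)
}

# Three-variable view: does the Boolean rule C = A XOR B hold in every sample?
xor_holds = all(c == (a ^ b) for a, b, c in rows)

for pair, r in pairwise.items():
    print(pair, round(r, 3))
print("C == A XOR B in all samples:", xor_holds)
```

In this constructed table no pair of features carries any linear signal, so a search restricted to two-variable metrics would report nothing, while a check over all three features recovers a deterministic relation.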

Original language: English (US)
Article number: 105007
Journal: BioSystems
Volume: 233
State: Published - Nov 2023
Externally published: Yes

Keywords

  • Boolean patterns
  • Co-exclusion
  • Co-presence
  • Microbiome
  • Multidimensional patterns
  • Multiomics
  • Multivariable patterns
  • Regulatory network

ASJC Scopus subject areas

  • Statistics and Probability
  • Modeling and Simulation
  • General Biochemistry, Genetics and Molecular Biology
  • Applied Mathematics
