The microbiome mystery: Gaining insights through data mining and analysis

The recent buzz around the microbiome and its effects on human health has both driven and resulted from exponential growth in research and related publications. In fact, annual publications indexed in PubMed have increased by roughly 5,000% over the last 15 years. A recent search for “microbiome” returned 62,260 results: 12,971 published to date in 2019 and 12,895 in 2018, compared with just 259 published in 2005!


Modifying the gut microbiome in Alzheimer’s disease

An exciting recent development is the approval of GV-971, a marine-derived oligosaccharide (sodium oligomannurarate) developed by Shanghai Green Valley Pharmaceutical for the treatment of Alzheimer’s disease. GV-971 was granted conditional approval in China in October 2019 based on results from a phase III clinical trial that showed consistently improved cognitive function in people with mild-to-moderate Alzheimer’s disease. The researchers state that GV-971 alters the gut microbiome and that, in previously published studies, it reduced inflammation in the brains of mice with Alzheimer’s-like pathology.

How does what is essentially a long-chain sugar restore balance to the gut microbiome and, in turn, reduce inflammation in the brain? That mystery remains unsolved. Despite the tremendous amount of data researchers have generated, we still lack an understanding of the mechanisms underlying many microbiome-host interactions and their effects on human health. Think of what could be accomplished if we uncovered the mechanism of action through which GV-971 modulates the gut microflora and, through that, a neuroinflammatory process. We could extrapolate those findings to the many systemic diseases with underlying proinflammatory processes.


Knowledge of cause-and-effect relationships could lead to biomarker identification, enabling non-invasive, objective, stool-based disease diagnosis.


Centralizing microbiome data as the first step

Part of the challenge in gaining this understanding stems from the fact that the human microbiome includes thousands of microbial strains that vary in presence and abundance across subjects and geographies, as well as between normal and pathological conditions. Finding the relevant information to determine specific correlations can be like finding a needle in a haystack.

In response, we’ve established a pre-competitive consortium with Merck to develop a centralized, comprehensive repository for microbiome data collection, collation and curation – DoMI, or the Database of Microbial-host Interactions. Using our collective knowledge and building from our experience with MetaBase, we aim to build mechanistic insight into microbiome-host relationships, effects of an imbalance, and therapeutic modulation.

DoMI is unique in that it includes 12,500+ host-microbial protein-protein interactions as well as information about host-microbial metabolic interplay and host-microbial network analysis. To date, we’ve manually searched for and curated approximately 500 publications, and in the process we’ve started to determine which search terms, and which combinations of them, return the desired results. This exercise will inform the future use of text-mining algorithms for automatic searches and curation.


Achieving an understanding of microbiome-host interactions

But a database is just the beginning. Beyond a body of knowledge, we need tools to interrogate the interactions and gain insight into how they correlate with disease: network analysis algorithms, machine learning models and data visualization. Then, we can harness the power of data for improved health. Collaborative efforts such as DoMI can help accelerate microbiome research.
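To make the idea of network analysis concrete, here is a minimal Python sketch of one of the simplest possible analyses: ranking host proteins by how many distinct microbial proteins interact with them. It does not reflect DoMI's actual schema or tooling. The interaction list, the placeholder protein names and the function names are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical host-microbial protein-protein interactions.
# Each pair is (microbial protein, host protein); all names are placeholders.
interactions = [
    ("microbe_protein_A", "host_protein_1"),
    ("microbe_protein_A", "host_protein_2"),
    ("microbe_protein_B", "host_protein_1"),
    ("microbe_protein_C", "host_protein_1"),
    ("microbe_protein_C", "host_protein_3"),
]


def host_degree(edges):
    """Count the distinct microbial proteins that interact with each host protein."""
    partners = defaultdict(set)
    for microbe, host in edges:
        partners[host].add(microbe)
    return {host: len(microbes) for host, microbes in partners.items()}


def rank_hubs(edges):
    """Sort host proteins by descending degree, breaking ties by name."""
    degrees = host_degree(edges)
    return sorted(degrees.items(), key=lambda item: (-item[1], item[0]))


ranked = rank_hubs(interactions)
for host, degree in ranked:
    print(host, degree)
```

In this toy network, a host protein targeted by many microbial proteins would be an obvious candidate for closer study. A real analysis would draw on curated interactions and richer measures than degree alone.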


Accelerate your research with help from our Discovery and Translational Services consulting team and achieve advanced analytics and actionable insights.