We are developing a three-pronged, large-scale metagenomic approach to identify bacterial species that can successfully colonize plants under field conditions. Recent years have produced massive amounts of data on microbial colonization of plants. We aim to harness these data, together with bioinformatics tools and machine-learning techniques, to understand which microbial genes provide a colonization advantage. The overarching goal of our research is to make it possible to reliably boost crop production using the microbiome, by applying machine-learning techniques to public metagenomic datasets to understand the genomic determinants of plant microbiome assembly. By employing a variety of tools, including Hidden Markov Models, we can find clear differences between microbial communities near the plant root and those in the soil away from the plant.
Furthermore, once we understand which microbial functions are required to colonize the root, we may then engineer specific soil microbiome interventions to maximize the amount of carbon sequestered by plants.