KBase
An open, collaborative, and FAIR biological data science platform
The Department of Energy’s Systems Biology Knowledgebase (KBase) is an ambitious program to disrupt the way we do biological systems science for prediction and engineering. It is an open-source, extensible platform that runs on DOE high-end infrastructure and supports collaborative, reproducible, reusable, and openly publishable analyses of organisms and their communities to predict and design their functions. Users can form teams, add their own tools and data to the system, share with one another or with everyone, and place their work in the context of public data and other people’s analyses. Our current and continually expanding toolset spans genome and metagenome assembly and annotation, comparative and functional genomics, and sophisticated comparative metabolic modeling, among other things. Two major features are: 1) the ability to publish DOI-linked “Narratives” to the web, which are specially instrumented, Jupyter Notebook-backed, fully provenanced, active, and reusable records of the data, analyses, and thoughts of the researchers who produced them; and 2) a growing knowledge and relation-engine system that performs meta-analysis across public and publicly shared data in the system to automatically find biological and environmental relationships among the data and to produce useful inferences about their identity, function, and interactions. The system strongly supports research such as that in the ENIGMA program and currently has well over ten thousand users.
In the program for which Arkin is the lead PI, we are specifically interested in adding tools for comparative microbial community analysis, modeling, and engineering; the design and implementation of the biological data science environment; and the design of the collaborative, open publishing platform.
For more information visit https://www.kbase.us/
Selected Publications
A bacterial sensor taxonomy across earth ecosystems for machine learning applications Journal Article
In: mSystems, 2023, ISSN: 2379-5077.
The ModelSEED Database for the integration of metabolic annotations and the reconstruction, comparison, and analysis of metabolic models for plants, fungi, and microbes Journal Article
In: 2020.
KBase: The United States Department of Energy Systems Biology Knowledgebase Journal Article
In: Nat. Biotechnol., vol. 36, no. 7, pp. 566–569, 2018.