What is Microsoft Research's BioGPT?
Will Generative A.I. push biotech forwards? AI4Science and beyond.
Image credit: “Alien Biology by Midjourney” via Avataart on DeviantArt.
I have an inkling that BioGPT could become important. I was just reading an update from Alphabet’s Isomorphic Labs, whose mission is to use AI and machine learning methods to accelerate and improve the drug discovery process.
The company is a pioneer in the emerging field of “digital biology” and aims to usher in a new era of biomedical breakthroughs in order to find cures for some of humanity’s most devastating diseases.
Lately I’ve been trying to wrap my head around all of this. I covered a bit of this A.I. and synthetic biology singularity in earlier posts.
Microsoft Research Proposes BioGPT: A Domain-Specific Generative Transformer Language Model Pre-Trained on Large-Scale Biomedical Literature
I’ve been watching Microsoft Research very carefully, as Generative A.I. means BigTech can get even further into healthcare, and quickly. The incentives for BigTech to dominate and acquire emerging tech fields such as Generative A.I., biotech startups and quantum computing startups are very high.
While Microsoft isn’t building its own A.I. products at scale like Alphabet is planning to do, it is doing a lot of solid research. GitHub Copilot is itself a formidable coding productivity tool. Meanwhile, Microsoft Research remains an improving A.I. lab as well (with deep ties to China and Chinese-born researchers, I note).
The abstract reads:
Pre-trained language models have attracted increasing attention in the biomedical domain, inspired by their great success in the general natural language domain. Among the two main branches of pre-trained language models in the general language domain, i.e., BERT (and its variants) and GPT (and its variants), the first one has been extensively studied in the biomedical domain, such as BioBERT and PubMedBERT.
See on GitHub: https://github.com/microsoft/BioGPT
The researchers seem pretty optimistic about BioGPT:
BioGPT, a domain-specific generative Transformer language model pre-trained on large scale biomedical literature. We evaluate BioGPT on six biomedical NLP tasks and demonstrate that our model outperforms previous models on most tasks. Especially, we get 44.98%, 38.42% and 40.76% F1 score on BC5CDR, KD-DTI and DDI end-to-end relation extraction tasks respectively, and 78.2% accuracy on PubMedQA, creating a new record. Our larger model BioGPT-Large achieves 81.0% accuracy on PubMedQA.
This bodes well for how GPT-4 might impact the biological sciences.
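For readers who have not met the F1 metric quoted in the abstract, here is a minimal sketch of how it can be computed for relation extraction. It assumes each relation is a (head, relation, tail) triple and that a prediction only counts when it matches a gold triple exactly. The function name `relation_f1` and the toy triples are made up for illustration, and the paper's own evaluation scripts may well differ in the details.

```python
def relation_f1(predicted, gold):
    """Precision, recall and F1 over sets of (head, relation, tail) triples.

    Assumption: a predicted triple is correct only on an exact match with a
    gold triple. This illustrates the metric; it is not the paper's script.
    """
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data, invented for this example only.
gold = [
    ("aspirin", "interacts_with", "warfarin"),
    ("metformin", "treats", "type 2 diabetes"),
    ("cisplatin", "induces", "nephrotoxicity"),
]
predicted = [
    ("aspirin", "interacts_with", "warfarin"),   # matches a gold triple
    ("metformin", "induces", "nephrotoxicity"),  # wrong relation and tail
]

precision, recall, f1 = relation_f1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case one of two predictions is right and one of three gold triples is found, so precision is 0.50, recall is 0.33 and F1 is 0.40. It also hints at why end-to-end scores in the 40% range are harder to reach than they may look: the model has to get both entities and the relation right at once.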
I agree with the conclusion of many researchers at the intersection of Generative A.I. and biotech:
By witnessing the success of pre-training in general NLP, people explore adapting these techniques into biomedical domain.
The lead author of the paper was Renqian Luo. You will notice he’s part of AI4Science, the new offshoot of Microsoft Research I keep talking about. So what’s exciting about AI4Science? I have covered them in an earlier post.
I recommend you take a look at the paper.
But what does this mean for the future of Generative A.I. and biotech as a whole?