Much of the world has been transfixed in recent months by the appearance of text generation engines such as OpenAI’s ChatGPT, artificial intelligence algorithms capable of producing text that seems as if it were written by a human.
Beyond using chatbots to help produce written materials, such as papers, patents or grant applications, researchers can repurpose them for drug discovery, as a sort of advanced search engine geared to biological science. “We can have a more specific, for example, Bio ChatGPT or Med ChatGPT,” says Lurong Pan, a computational chemist at the University of Alabama at Birmingham and founder and CEO of Ainnocence, a biotech with a platform to aid drug discovery.
If a scientist wants to investigate psoriasis, for example, the chat function can look at the knowledge graph for that disease. It will deliver a text description that includes the major signaling pathways and genes involved in psoriasis and the compounds known to interact with them. The user can then ask any question — for example, “How many genes are in this graph?” — and get an instant response, or look for associations between genes and specific diseases, such as sarcoma.
LLMs are still evolving, with developers adding features at a furious pace. The version of ChatGPT released at the end of November was based on OpenAI’s GPT-3.5. An update, GPT-4, was released in mid-March and vastly outperforms its predecessor. In late March, ChatGPT added a so-called retrieval plug-in that could prove particularly useful to drug discovery.
Bioxcel Therapeutics, a company that uses AI to identify candidates for repurposing among drugs that were shelved in phase 2 or 3 trials, or even after approval, is considering LLMs to pick out potential winners from the different databases. But LLMs will only prove valuable, says Frank Yocca, a neuroscientist and the company’s CSO, if they fit into Bioxcel’s suite of AI tools. “Right now it’s not very accurate in terms of what you get back,” he cautions. “But we’re in the beginning stages of this.”