This article is the first part of a series dedicated to Artificial Intelligence (AI) in drug discovery and development in oncology. It gives an introductory overview of drug discovery (see Fig 1), posing and answering key questions about the application of AI to drug discovery in oncology.
Fig 1. Overview of key components of drug discovery and development. Graphic from DrugBank. The present article focuses on the discovery phase.
What is the current state of AI in drug discovery supported by figures?
AI has become increasingly relevant within the pharmaceutical industry [1].
To date, $13.8B has been invested in companies and partnerships leveraging AI in drug discovery, in a consolidating AI-enabled industry. More than 40 pharma companies and more than 230 start-ups are using AI for drug discovery. Over 205 companies claim to offer AI-based drug discovery services or technologies in lead identification, optimization and generation, and more than 115 drugs have been developed with the aid of AI technology. Its use has also been reported to have a positive economic impact, with estimated cost savings of 25%. The AI-based drug discovery market is projected to grow at an annualized rate of 25% during the period 2022-2035 [2].
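To put the projected growth rate in perspective, the arithmetic can be sketched as follows. This is an illustration only: it assumes the 25% rate compounds once per year over the 13 years from 2022 to 2035, and the function name is hypothetical.

```python
# Growth multiple implied by a constant annualized growth rate.
# Assumption (not stated in the source): annual compounding over
# end_year - start_year periods.

def growth_multiple(annual_rate: float, start_year: int, end_year: int) -> float:
    """Return the factor by which a quantity grows between two years."""
    periods = end_year - start_year
    return (1.0 + annual_rate) ** periods


multiple = growth_multiple(0.25, 2022, 2035)
print(f"Implied growth multiple 2022-2035: {multiple:.1f}x")  # 18.2x
```

Under these assumptions, a market growing at 25% per year would be roughly 18 times larger in 2035 than in 2022.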
What are the general applications and techniques of AI that are being applied to drug discovery?
The subfields of AI, organized around its general applications, include reasoning, knowledge representation, planning, learning, natural language and sequence processing, perception (including artificial vision), and the ability to move and manipulate objects. All these subfields are applicable to drug discovery.
AI techniques in drug discovery are numerous. A few examples, from older to more recent, are:
What differentiates general drug discovery tools from cancer-specific applications?
Oncological drug discovery shares most of the challenges and characteristics of broader drug discovery. Nonetheless, unlike other therapeutic areas, cancer is a more complex disease in which target-based drug discovery built on isolated mechanisms and targets frequently fails in the clinic, in particular because of poor efficacy [3]. For example, in the case of viral infections, it may be enough to target a certain protease required for replication or a receptor required for cell entry [3]. In contrast, cancer therapeutics require the targeting of essential biological capabilities, summarized in the 14 Hallmarks of Cancer (Fig 2), that govern tumor development in humans.
Fig 2. The 14 Hallmarks of Cancer.
Toward this end, newer approaches facilitated by AI-based tools are emerging, although they are still evolving. One is phenotypic screening, which incorporates some of the cellular complexity of biology. It attempts to combine disease-relevant biology, such as the transcriptome, with large numbers of compounds that can be screened [5]. AI is also increasingly used to advance our understanding of functional genomics, helping to decode how gene expression is regulated in normal conditions and in complex diseases like cancer.
What is the expectation of AI on drug discovery in oncology?
In drug discovery, clinical testing in human patients is the most expensive and difficult step. Drug candidates still fail 25% of the time on toxicity in humans and about 50% of the time on efficacy [6]. Estimates are even worse in oncology; hence the first expectation is that AI could help reduce clinical failure, and not only reduce the time and cost of the current drug discovery phases.
Over the last two decades, drug discovery tools have increasingly integrated AI techniques in a stepwise fashion, each focused on a narrow vertical task such as QSAR (quantitative structure-activity relationship) property prediction, molecular dynamics simulations for allosteric modulation analysis, high-throughput virtual screening or retrosynthesis, improving efficiency and reducing costs at those early steps of drug discovery. Of note, all these methods focus mainly on the traditional physicochemical and structural aspects of compound generation, where quality is often encoded by unidimensional metrics of activity and properties. They do not cover the likely biological consequences for the organism’s dynamics, which are only assessed later, in clinical trials. As a result, an “AI-discovered compound” does not currently guarantee success in clinical trials. Hence, the full potential of AI in drug discovery is not yet realized. It will only be reached once the complexity of cancer biology (as recapitulated in the Hallmarks of Cancer) and the efficacy and toxicity of drugs in the human body can be modeled computationally, even if that complexity does not necessarily need to be completely understood by humans.
This will not happen at once; it will be a long, collaborative road, with the availability of foundation models+ trained on multimodal omic data++ as the first milestone to reach. In the meantime, as we approach this goal, we will become increasingly capable of using AI to reduce clinical failure, helping to decide at the drug discovery stage which drug candidates should move to clinical development, based on more accurate predictions of their safety and efficacy. It will also pave the way for better patient selection and predictive biomarkers, towards a more personalized medicine.
+ Foundation models are large-scale deep learning models trained (usually by self-supervised learning) on vast quantities of data, resulting in a model that can be adapted to a wide range of downstream tasks.
++ Multimodal refers to different modalities of data such as image, text, biological sequences, signals and tabular data. In cancer drug development, multimodal omic data refers to different sources and modalities of data, ranging from physicochemical and structural properties of molecules to multi-omic biological data (e.g. genomic, transcriptomic, epigenomic, reactome, metabolomic, microbiome) up to clinical data (e.g. toxicity, tumor response, patient outcomes).
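One of the narrow tasks mentioned in this section, virtual screening, is often scored with a single similarity number, which illustrates what a unidimensional metric looks like in practice. The following is a minimal sketch, not the method of any tool discussed here. It assumes each compound is already represented as a set of fingerprint bit indices, and the compound names and bit sets are hypothetical toy values. It ranks a small library by Tanimoto similarity to a query compound.

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]  # indices of the "on" bits of a molecular fingerprint


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity: shared bits divided by total distinct bits."""
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def rank_library(query: Fingerprint,
                 library: Dict[str, Fingerprint]) -> List[Tuple[str, float]]:
    """Score every library compound against the query, most similar first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy data, for illustration only.
QUERY: Fingerprint = frozenset({1, 4, 7, 9})
LIBRARY: Dict[str, Fingerprint] = {
    "cmpd_A": frozenset({1, 4, 7, 9}),  # identical bits to the query
    "cmpd_B": frozenset({1, 4, 8}),     # shares 2 of 5 distinct bits
    "cmpd_C": frozenset({2, 3}),        # no bits in common
}

ranking = rank_library(QUERY, LIBRARY)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

The score captures structural resemblance only. It says nothing about efficacy or toxicity in the human body, which is the gap this section describes.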
How could we realize the full potential of AI in drug discovery?
AI can be applied to all stages of drug discovery. In the next article, we will review concrete applications of AI and available AI-based resources, including datasets, applicable to the different phases of drug discovery in oncology: target identification, phenotypic screening, target validation, virtual screening, de novo design, retrosynthesis, drug synthesis and optimization, and drug repositioning. We will also discuss the potential of AlphaFold in drug discovery applied to oncology and the emergence of deep generative models in this field.
Dr. Aurelia Bustos, MD, PhD
References: