China through a drug-target interaction deep learning model
Abstract
The infection of a novel coronavirus found in Wuhan of China (2019-nCoV) is rapidly spreading, and the incidence rate is increasing worldwide. Due to the lack of effective treatment options for 2019-nCoV, various strategies are being tested in China, including drug repurposing. In this study, we used our pretrained deep learning-based drug-target interaction model called Molecule Transformer-Drug Target Interaction (MT-DTI) to identify commercially available drugs that could act on viral proteins of 2019-nCoV. The result showed that atazanavir, an antiretroviral medication used to treat and prevent the human immunodeficiency virus (HIV), is the best chemical compound, showing a inhibitory potency with Kd of 94.94 nM against the 2019-nCoV 3C-like proteinase, followed by efavirenz (199.17 nM), ritonavir (204.05 nM), and dolutegravir (336.91 nM). Interestingly, lopinavir, ritonavir, and darunavir are all
designed to target viral proteinases. However, in our prediction, they may also bind to the replication complex components of 2019-nCoV with an inhibitory potency with Kd < 1000 nM. In addition, we also found that several antiviral agents, such as Kaletra, could be used for the treatment of 2019-nCoV, although there is no real-world evidence supporting the prediction. Overall, we suggest that the list of antiviral drugs identified by the MT-DTI model should be considered, when establishing effective treatment strategies for 2019-nCoV.
Coronaviruses (CoVs), belonging to the family Coronaviridae, are positive-sense enveloped RNA viruses and cause infections in birds, mammals, and humans (1) (2) (3) . The family includes four genera, such as Alphacoronavirus, Betacoronavirus, Deltacoronavirus, and Gammacoronavirus (4) . Two infamous infectious coronaviruses in the genus Betacoronavirus are severe acute respiratory syndrome coronavirus (SARS-CoV) (5) and Middle East respiratory syndrome coronavirus (MERS-CoV) (6) , which have infected more than 10,000 people around the world in the past two decades. Unfortunately, the incidence was accompanied by high mortality rates (9.6% for SARS-CoV and 34.4% for MERS-CoV), indicating that there is an urgent need for effective treatment at the beginning of the outbreak to prevent the spread (7, 8) . However, this cannot be achieved with current drug development or an application system, taking several years for newly developed drugs to come to the market.
Unexpectedly, the world is facing the same situation as the previous outbreak due to a recent epidemic of atypical pneumonia caused by a novel coronavirus (2019-nCoV) in Wuhan, China (5, 9) . Thus, a rapid drug application strategy that can be immediately applied to the patient is necessary. Currently, the only way to address this matter is to repurpose commercially available drugs for the pathogen in so-called "drug-repurposing". However, in theory, artificial intelligence (AI)-based architectures must be taken into account in order to accurately predict drug-target interactions (DTIs). This is because of the enormous amount of complex information (e.g. hydrophobic interactions, ionic interactions, hydrogen bonding, and/or van der Waals forces) between molecules. To this end, we previously developed a deep learning-based drug-target interaction prediction model, called Molecule Transformer-Drug Target Interaction (MT-DTI) (10) .
In this study, we applied our pre-trained MT-DTI model to identify commercially available antiviral drugs that could potentially disrupt 2019-nCoV's viral components, such as proteinase, RNA-dependent RNA polymerase, and/or helicase. Since the model utilizes simplified molecular-input line-entry system (SMILES) strings and amino acid (AA) sequences, which are 1D string inputs, it is possible to quickly apply target proteins that do not have experimentally confirmed 3D crystal structures, such as viral proteins of 2019-nCoV. We share a list of top commercially available antiviral drugs that could potentially hinder the multiplication cycle of 2019-nCoV with the hope that effective drugs can be developed based on these AI-proposed drug candidates and act against 2019-nCoV.
. CC-BY-NC-ND 4.0 International license author/funder. It is made available under a The copyright holder for this preprint (which was not peer-reviewed) is the . https://doi.org /10.1101 /10. /2020
Amino acid sequences of 3C-like proteinase (accession YP_009725301.1), RNA-dependent RNA polymerase (accession YP_009725307.1), helicase (accession YP_009725308.1), 3'-to-5' exonuclease (accession YP_009725309.1), endoRNAse (accession YP_009725310.1), and 2'-O-ribose methyltransferase (accession YP_009725311.1) of the 2019-nCoV replication complex were extracted from the 2019-nCoV whole genome sequence (accession NC_045512.2), from the National Center for Biotechnology Information (NCBI) database. The raw prediction results were screened for antiviral drugs that are FDA approved, target viral proteins, and have a Kd value less than 1,000 nM.
Molecule transformer-drug target interaction (MT-DTI) was used to predict binding affinity values between commercially available antiviral drugs and target proteins. Briefly, the natural language processing (NLP) based Bidirectional Encoder Representations from Transformers (BERT) framework is a core algorithm of the model with good performance and robust results in diverse drug-target interaction datasets through pretraining with 'chemical language' SMILES of approximately 1,000,000,000 compounds.
To train the model, the Drug Target Common (DTC) database (11) and BindingDB (12) database were manually curated and combined. Three types of efficacy value, Ki, Kd, and IC50 were integrated by a consistence-score-based averaging algorithm (13) to make the Pearson correlation score over 0.9 in terms of Ki, Kd, and IC50. Since the BindingDB database includes a wide variety of species and target proteins, the MT-DTI model has the potential power to predict interactions between antiviral drugs and 2019-nCoV proteins.
The 2019-nCoV 3C-like proteinase was predicted to bind with atazanavir (Kd 94.94 nM), followed by efavirenz, ritonavir, and other antiviral drugs that have a predicted affinity of Kd > 100 nM potency (Table 1 ). No other protease inhibitor antiviral drug was found in the Kd < 1,000 nM range. Although there is . CC-BY-NC-ND 4.0 International license author/funder. It is made available under a The copyright holder for this preprint (which was not peer-reviewed) is the . https://doi.org /10.1101 /10. /2020 no real-world evidence about whether these drugs will act as predicted against 2019-nCoV yet, some case studies have been identified. For example, a docking study of lopinavir along with other HIV proteinase inhibitors of the CoV proteinase (PDBID 1UK3) suggests atazanavir and ritonavir, which are listed in the present prediction results, may inhibit the CoV proteinase in line with the inhibitory potency of lopinavir (14) . According to the prediction, viral proteinase-targeting drugs were predicted to act more favorably on the viral replication process than viral proteinase through the DTI model (Table 2 -6) . The results include antiviral drugs other than proteinase inhibitors, such as guanosine analogues (e.g., acyclovir, ganciclovir, and penciclovir), reverse transcriptase inhibitors, and integrase inhibitors.
Among the prediction results, atazanavir was predicted to have a potential binding affinity to bind to Table 2 -6) . Also, ganciclovir was predicted to bind to three subunits of the replication complex of the 2019-nCoV: RNA-dependent RNA polymerase (Kd 11.91 nM), 3'-to-5' exonuclease (Kd 56.29 nM), and RNA helicase (Kd 108.21 nM). Lopinavir and ritonavir, active materials of AbbVie's Kaletra, both were predicted to have a potential affinity to 2019-nCoV helicase (Table 3 ) and are suggested as potential MERS therapeutics (15) . Recently, approximately $2 million worth of Kaletra doses were donated to China (16) , and a previous clinical study of SARS by Chu et al. (17) may support this decision (17).
Another anti-HIV drug, Prezcobix of Johnson & Johnson, which consists of darunavir and cobicistat, was to be sent to China (16) , and darunavir is also predicted to have a Kd of 90.38 nM against 2019-nCoV's helicase (Table 3) . However, there was no current supporting literature found for darunavir to be used as a CoV therapeutic.
In many cases, DTI prediction models serve as a tool to repurpose drugs to develop novel usages of existing drugs. However, the application of DTI prediction in the present study may be useful to control The copyright holder for this preprint (which was not peer-reviewed) is the . https://doi.org/10.1101/2020.01.31.929547 doi: bioRxiv preprint Table 1 . Drug-target interaction (DTI) prediction results of FDA approved antviral drugs available on markets against a novel coronavirus (2019-nCoV, NCBI reference sequence NC_045512.2) 3C-like proteinase (accession YP_009725301.1). Ritonavir is expressed in canonical and isomeric forms SMILES, and * indicates isomeric form SMILES of ritonavir. Table 3 . Drug-target interaction (DTI) prediction results of antiviral drugs available on markets against a novel coronavirus (2019-nCoV, NCBI reference sequence NC_045512.2) helicase (accession YP_009725308.1). Ritonavir is expressed in canonical and isomeric form SMILES, and * indicates isomeric form SMILES of ritonavir. The copyright holder for this preprint (which was not peer-reviewed) is the . https://doi.org /10.1101 /10. /2020 The copyright holder for this preprint (which was not peer-reviewed) is the . https://doi.org/10.1101/2020.01.31.929547 doi: bioRxiv preprint Table 6 . Drug-target interaction (DTI) prediction results of anti-viral drugs available on markets against a novel coronavirus (2019-nCoV, NCBI reference sequence NC_045512.2) 2'-O-ribose methyltransferase (accession YP_009725311.1). The copyright holder for this preprint (which was not peer-reviewed) is the . https://doi.org /10.1101 /10. /2020
|