The coronavirus is proving that we have to move faster in identifying and mitigating epidemics before they become pandemics because, in today’s global world, viruses spread much faster, further, and more frequently than ever before.
If COVID-19 has taught us anything, it’s that while our ability to identify and treat pandemics has improved greatly since the outbreak of the Spanish Flu in 1918, there is still a lot of room for improvement. Over the past few decades, we’ve taken huge strides to improve quick detection capabilities. It took a mere 12 days to map the outer “spike” protein of the COVID-19 virus using new techniques. In the 1980s, a similar structural analysis for HIV took four years.
But developing a cure or vaccine still takes a long time and involves such high costs that big pharma doesn’t always have incentive to try.
Drug discovery entrepreneur Prof. Noor Shaker posited that “Whenever a disease is identified, a new journey into the “chemical space” starts seeking a medicine that could become useful in contending diseases. The journey takes approximately 15 years and costs $2.6 billion, and starts with a process to filter millions of molecules to identify the promising hundreds with high potential to become medicines. Around 99% of selected leads fail later in the process due to inaccurate prediction of behavior and the limited pool from which they were sampled.”
Prof. Shaker highlights one of the main problems with our current drug discovery process: The development of pharmaceuticals is highly empirical. Molecules are made and then tested, without being able to accurately predict performance beforehand. The testing process itself is long, tedious, cumbersome, and may not predict future complications that will surface only when the molecule is deployed at scale, further eroding the cost/benefit ratio of the field. And while AI/ML tools are already being developed and implemented to optimize certain processes, there’s a limit to their efficiency at key tasks in the process.
Ideally, a great way to cut down the time and cost would be to transfer the discovery and testing from the expensive and time-inefficient laboratory process (in-vitro) we utilize today, to computer simulations (in-silico). Databases of molecules are already available to us today. If we had infinite computing power we could simply scan these databases and calculate whether each molecule could serve as a cure or vaccine to the COVID-19 virus. We would simply input our factors into the simulation and screen the chemical space for a solution to our problem.
In principle, this is possible. After all, chemical structures can be measured, and the laws of physics governing chemistry are well known. However, as the great British physicist Paul Dirac observed: “The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble.”
In other words, we simply don’t have the computing power to solve the equations, and if we stick to classical computers we never will.
This is a bit of a simplification, but the fundamental problem of chemistry is to figure out where electrons sit inside a molecule and calculate the total energy of such a configuration. With this data, one could calculate the properties of a molecule and predict its behavior. Accurate calculations of these properties will allow the screening of molecular databases for compounds that exhibit particular functions, such as a drug molecule that is able to attach to the coronavirus “spike” and attack it. Essentially, if we could use a computer to accurately calculate the properties of a molecule and predict its behavior in a given situation, it would speed up the process of identifying a cure and improve its efficiency.
Why are quantum computers much better than classical computers at simulating molecules?
Electrons spread out over the molecule in a strongly correlated fashion, and the characteristics of each electron depend greatly on those of its neighbors. These quantum correlations (or entanglement) are at the heart of the quantum theory and make simulating electrons with a classical computer very tricky.
The electrons of the COVID-19 virus, for example, must be treated in general as being part of a single entity having many degrees of freedom, and the description of this ensemble cannot be divided into the sum of its individual, distinguishable electrons. The electrons, due to their strong correlations, have lost their individuality and must be treated as a whole. So to solve the equations, you need to take into account all of the electrons simultaneously. Although classical computers can in principle simulate such molecules, every multi-electron configuration must be stored in memory separately.
Let’s say you have a molecule with only 10 electrons (forget the rest of the atom for now), and each electron can be in two different positions within the molecule. Essentially, you have 2^10=1024 different configurations to keep track of rather just 10 electrons which would have been the case if the electrons were individual, distinguishable entities. You’d need 1024 classical bits to store the state of this molecule. Quantum computers, on the other hand, have quantum bits (qubits), which can be made to strongly correlate with one another in the same way electrons within molecules do. So in principle, you would need only about 10 such qubits to represent the strongly correlated electrons in this model system.
The exponentially large parameter space of electron configurations in molecules is exactly the space qubits naturally occupy. Thus, qubits are much more adapted to the simulation of quantum phenomena. This scaling difference between classical and quantum computation gets very big very quickly. For instance, simulating penicillin, a molecule with 41 atoms (and many more electrons) will require 10^86 classical bits, or more bits than the number of atoms in the universe. With a quantum computer, you would only need about 286 qubits. This is still far more qubits than we have today, but certainly a more reasonable and achievable number.
The COVID-19 virus outer “spike” protein, for comparison, contains many thousands of atoms and is thus completely intractable for classical computation. The size of proteins makes them intractable to classical simulation with any degree of accuracy even on today’s most powerful supercomputers. Chemists and pharma companies do simulate molecules with supercomputers (albeit not as large as the proteins), but they must resort to making very rough molecule models that don’t capture the details a full simulation would, leading to large errors in estimation.
It might take several decades until a sufficiently large quantum computer capable of simulating molecules as large as proteins will emerge. But when such a computer is available, it will mean a complete revolution in the way the pharma and the chemical industries operate.
The holy grail — end-to-end in-silico drug discovery — involves evaluating and breaking down the entire chemical structures of the virus and the cure.
The continued development of quantum computers, if successful, will allow for end-to-end in-silico drug discovery and the discovery of procedures to fabricate the drug. Several decades from now, with the right technology in place, we could move the entire process into a computer simulation, allowing us to reach results with amazing speed. Computer simulations could eliminate 99.9% of false leads in a fraction of the time it now takes with in-vitro methods. With the appearance of a new epidemic, scientists could identify and develop a potential vaccine/drug in a matter of days.
The bottleneck for drug development would then move from drug discovery to the human testing phases including toxicity and other safety tests. Eventually, even these last stage tests could potentially be expedited with the help of a large scale quantum computer, but that would require an even greater level of quantum computing than described here. Tests at this level would require a quantum computer with enough power to contain a simulation of the human body (or part thereof) that will screen candidate compounds and simulate their impact on the human body.
Achieving all of these dreams will demand a continuous investment into the development of quantum computing as a technology. As Prof. Shohini Ghose said in her 2018 Ted Talk: “You cannot build a light bulb by building better and better candles. A light bulb is a different technology based on a deeper scientific understanding.” Today’s computers are marvels of modern technology and will continue to improve as we move forward. However, we will not be able to solve this task with a more powerful classical computer. It requires new technology, more suited for the task.