Recently, I’ve gotten an influx of questions about how we can really be sure that mRNA vaccines will not affect our DNA. In my prior post on the subject I wrote:
Another concern raised has been the idea that mRNA can somehow alter the host’s genome. That would actually be super cool and be huge for gene therapy (and I could finally give myself the giant bat wings I’ve always wanted) but this is not so. This is ordinarily impossible except if there is also a reverse transcriptase enzyme present that produces DNA from the RNA template, which is how retroviruses work. There is no such risk with any mRNA vaccine candidate. mRNA vaccines act entirely within the cytosol of the cell- they do not go near the nucleus where all the DNA is. That’s actually a major advantage of RNA-based vaccines over DNA ones.
I gave this response in large part because I felt that the detailed discussion of reverse transcription, nuclear trafficking, the endocytic pathway, and the other 11 or so advanced cell biology topics that I would have to invoke to give this a rigorous answer was too complex to be of benefit to the average person wanting to know simply whether or not this is possible. However, I had a flurry of questions about “what ifs” relating to retroviruses or hepadnaviruses (hepatitis B), and I can grant that this response doesn’t address them, so here I will attempt to answer as explicitly, and with as little complexity, as I can.
To simplify the discussion so as to avoid having to explain the phases of phospholipid bilayers and the molecular composition of the lipid nanoparticle as it relates to stability (discussed in 1, 2, 3, 4), I will ask readers to take for granted that mRNA vaccines are endocytosed and liberated into the cytoplasm of the cell.
Firstly, for mRNA to affect your DNA, it would at a minimum need to gain access to the DNA in question. There are two subcellular compartments where this could happen. The first is the nucleus, so let’s start with a discussion of the trafficking of cargo into the nucleus. The nucleus of the cell is an isolated compartment with nuclear pore complexes (NPCs) that impose limits on the size of the particles that can freely enter. RNA is readily transported out, because transcription occurs within the nucleus but the ribosomes required to produce proteins are in the cytosol or on the rough endoplasmic reticulum. This export is mediated by several accessory proteins which you can see on the left. Note however that there isn’t any physiological circumstance in which one might need RNA from the cytosol to be transported back to the nucleus. RNA is synthesized within the nucleus. Viruses which have a nuclear phase in their replication cycle need various tricks to allow their RNA payload to enter.
Though RNA is not readily transported into the nucleus, proteins can be. This occurs via a network of proteins called importins (see figure 5–23C above). Proteins containing an amino acid sequence called the nuclear localization sequence (NLS; there are 2 common ones) are able to bind the importins, which can then transport them across the nuclear pore complex as shown on the left. RNA viruses often have replication cycles that do not require access to the nucleus, but there are some exceptions. Influenza viruses, for example, are RNA viruses that have their genomes packaged with proteins as ribonucleoproteins, and these ribonucleoproteins carry nuclear localization signals that facilitate the entry of their genomic RNA into the nucleus. mRNA vaccines, on the other hand, are not associated with any proteins.
Once inside the cytosol, the mRNA is naked and exposed to the harsh environment of ribosomes and exonucleases which destroy the mRNA in a matter of hours (at most). There is no conceivable mechanism by which mRNA can spontaneously be trafficked into the nucleus. Being made of nucleotides rather than amino acids, it cannot contain a nuclear localization sequence.
The other relevant compartment would be the mitochondrion. Mitochondria are actually vestigial bacteria with their own genomes, and it’s thought that billions of years ago an ancient cell tried to consume the ancestor of the mitochondria but lacked the machinery to actually do the digesting, and the two established a symbiotic relationship. Since then, the mitochondria have been an essential feature of our cells’ biology. This allowed the mitochondria to develop an extremely reduced genome containing only 37 genes (most of the genes relevant to mitochondrial function are still in the nucleus). Mitochondria have their own ribosomes and even their own genetic code (sort of). There is also a specialized process for the clearance of diseased mitochondria called mitophagy, which is the subject of many excellent reviews e.g. this, this, and this.
The collective conclusion from our understanding of these biological processes is that a naked mRNA in the cytosol has no potential to end up in a cellular compartment that contains our own DNA. This means that, irrespective of the presence or absence of other factors, there is no chance of harm to the DNA from the mRNA vaccine. But still people wanted to ask me about reverse transcriptases, so let’s discuss those.
The process of going from RNA to DNA (the exact opposite of what the central dogma of molecular biology dictates) is known as reverse transcription, and it is carried out by an enzyme called a reverse transcriptase (these are a really interesting group of enzymes). In general, reverse transcription is performed by a few different genetic entities: retroviruses, hepadnaviruses, telomeres (via telomerase), and retrotransposons. These are worth defining.
- Retroviruses are viruses that have an RNA genome, from which they create a DNA copy through reverse transcription that then integrates into the genome of the host cell (by which I mean, literally inserts itself into the host cell’s genome and becomes a permanent part of it, in the form of a sequence called a provirus). The proviral sequence itself can then be transcribed in the host cell to produce viral proteins and particles that can go on to spread to the next cell. The most famous retrovirus is HIV-1.
- Hepadnaviruses are DNA viruses which have gapped genomes (there is one complete DNA strand and another partial DNA strand which is linked to a pregenomic RNA), and unlike retroviruses, do not integrate into the genome of the host cell they infect. The most famous example is Hepatitis B virus, for which multiple effective vaccines exist.
- Telomeres are structures present at the ends of human chromosomes. They are maintained by a protein complex called telomerase, which uses a reverse transcriptase called TERT. The reasons this is necessary are discussed in Figure 9–12 on the left. Telomeres are normally about 5–15 kilobases long, and shortening results in arrest of cell growth and replication (senescence), or can even trigger cell death by apoptosis.
- Retrotransposons are actually the most abundant component of our genome. The human genome contains about 21,000–27,000 genes (the number you get depends on how precisely you define a gene and which source you consult), which span 40–48 million base pairs, but this accounts for only about 1.5% of the 3.2 billion total base pairs. Retrotransposons account for about 2 billion base pairs. There are several kinds of retroelements, which are worth discussing further:
- SINEs (short interspersed nuclear elements) which encode short transcripts like tRNAs, and cannot function without a LINE-encoded protein.
- LINEs (long interspersed nuclear elements), which encode their own reverse transcriptase (in LINE-1 this is the ORF2 protein, which works together with the ORF1 protein) and can copy themselves and other LINE and SINE elements into other regions of the genome.
- About 5–8% of the human genome is also composed of human endogenous retroviruses, HERVs, which also fall into the category of retrotransposons, more specifically LTR (long terminal repeats) retrotransposons (more on this shortly). HERVs contain 3 genes: gag (“group antigens,” which encodes a polyprotein that is cleaved into the structural proteins of the resultant retrovirus), pol (the reverse transcriptase needed for the virus to replicate), and env (envelope, which encodes the protein that gives the viral particles their shape).
- More broadly, the term retroelement refers to genetic sequences that have moved from one region of the genome to another via reverse transcription, and these include retrotransposons and processed pseudogenes. Processed pseudogenes are sequences of processed mRNA, lacking introns, that have been inserted via reverse transcription (we know they had to be inserted into the genome via reverse transcription in large part because they lack introns). They are incapable of producing any gene product.
- The only retrotransposons that can move through the genome (literally copy their DNA to new sites where it was not initially present) are the LINEs and SINEs, and of these, only a few are able to accomplish this. HERVs are stuck where they are, and processed pseudogenes are as well.
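For a sense of scale, the proportions quoted in the list above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the constant and function names are made up for illustration, and the inputs are the approximate figures from the text, not precise measurements.

```python
# Approximate figures quoted in the text (not precise measurements).
GENOME_BP = 3.2e9                   # total base pairs in the human genome
GENE_BP_RANGE = (40e6, 48e6)        # base pairs spanned by genes
HERV_FRACTION_RANGE = (0.05, 0.08)  # share of the genome made up of HERVs


def fraction_of_genome(bp, genome_bp=GENOME_BP):
    """Return the share of the genome that a stretch of `bp` base pairs represents."""
    return bp / genome_bp


def bp_for_fraction(fraction, genome_bp=GENOME_BP):
    """Return how many base pairs a given share of the genome corresponds to."""
    return fraction * genome_bp


if __name__ == "__main__":
    low, high = (fraction_of_genome(bp) for bp in GENE_BP_RANGE)
    print(f"Genes: {low:.2%} to {high:.2%} of the genome")

    herv_low, herv_high = (bp_for_fraction(f) for f in HERV_FRACTION_RANGE)
    print(f"HERVs: {herv_low / 1e6:.0f} to {herv_high / 1e6:.0f} million base pairs")
```

The upper end of the gene range works out to the 1.5% quoted above, and 5–8% of the genome corresponds to roughly 160–256 million base pairs of HERV sequence.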
Telomerases evolved as a solution to the end replication problem. Nascent (new) DNA strands are synthesized with a leading strand and a lagging strand because the DNA polymerases have a very restricted directionality in that they must travel 3’ to 5’ with respect to the template strand. This creates a problem because the DNA is oriented antiparallel (the strands are parallel but one strand is oriented in the direction opposite to the other), so to make both strands at the same time, a single DNA polymerase would have to manage to concurrently travel what would be a Sisyphean length for it in opposite directions (imagine trying to simultaneously run east and west for 10 miles). To deal with this dilemma, one of the strands is synthesized as a leading strand, with a polymerase traveling down the strand uninterrupted for many nucleotides (formally the term is “processively”), and the other as a lagging strand, in which short fragments of DNA (called Okazaki fragments) are repeatedly generated and then ligated (fused) together. The difficulty is that because our chromosomes are not circular, there will always be a missing fragment once we reach the 3’ end of the chromosome, and thus each replication cycle of the DNA will cause the size of the genome to shrink, eventually with the potential to hit genes important for biological function. This is known as the end replication problem.
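To make the shrinkage concrete, here is a deliberately crude toy model of a telomere acting as a buffer that is eaten away with each division. The telomere lengths are the 5–15 kilobase figures mentioned earlier. The loss per division (100 nucleotides) is purely an illustrative assumption, not a measured value, and the function name is hypothetical.

```python
def divisions_until_exhausted(telomere_nt, loss_per_division_nt):
    """Count how many whole divisions a telomere of `telomere_nt` nucleotides
    can absorb if every division trims `loss_per_division_nt` from the end.

    Toy model: assumes a constant loss per division and no telomerase activity.
    """
    if loss_per_division_nt <= 0:
        raise ValueError("loss per division must be positive")
    return telomere_nt // loss_per_division_nt


if __name__ == "__main__":
    ASSUMED_LOSS_NT = 100  # illustrative assumption only
    for length in (5_000, 15_000):
        n = divisions_until_exhausted(length, ASSUMED_LOSS_NT)
        print(f"{length} nt telomere: about {n} divisions before genes are exposed")
```

The point of the sketch is only that a finite buffer with a steady loss runs out after a finite number of divisions, which is why something has to replenish it in cells that keep dividing.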
To your left you see a telomerase complex with its favorite telomerase RNA. The ends of the chromosome contain structures called telomeres, which are short, repetitive sequences that get copied many times, until the gaps between the strands are filled for a length of about 5,000 to 15,000 nucleotides. The production of telomeric DNA occurs via a large protein complex called telomerase, which makes use of TERT (telomerase reverse transcriptase), a reverse transcriptase that uses an RNA template to make the repeated DNA sequences. Importantly, cells eventually do lose their telomerase function, which is thought to represent a safeguard against cancer: cells that express telomerase at high levels can continue dividing indefinitely (and therefore accumulating mutations, some of which might be harmful), whereas most cells cease to divide after about 50 divisions. Telomerase is notably expressed at high levels in stem cells. In practice, mice which have no functional telomerase will have substantial chromosomal shortening within 3 generations and by the fourth generation end up unable to reproduce.
Here now, I have to shatter all your preconceived ideas about how RNA works. When speaking about DNA and RNA, we have a tendency to use the term “strand” which conjures up an image of a thread. The thread is relatively linear, it may curve, but the structure is relatively boring. This is a reasonable approximation of most DNA, as DNA can have basically one of 3 structures called A, B, and Z (there are rarer ones though such as i-motifs, and DNAzymes can do weird things). RNA, on the other hand, is a much freer spirit when it comes to structure. RNA folds into complex shapes with all sorts of structural motifs in a manner not dissimilar to proteins, and just as the structure of a protein relates directly to its function, so does the structure of an RNA. What this means is: specific RNAs do specific things depending on how they fold, which depends on their sequence.
To your right you can see a detailed diagram of telomerase RNA bound to the telomerase complex. That curvy thing with bars like a ladder and bubbles is the telomerase RNA. TERT, the reverse transcriptase of telomerase, binds the telomerase RNA at the core domain and a region called CR4/CR5. I won’t get into the other components of the complex but you can read in detail about how it works here and here. Immediately below the diagram to your right you can see how telomerase works to extend the 3’ end of the chromosome with the aid of a short, repetitive RNA template sequence: CAAUCCCAAUC, which reproduces on the DNA a repeating “GGGTTA” to form a telomere with a length of about 5,000–15,000 nucleotides. For this to work, a bunch of things have to go right, but solely for TERT to be able to recognize telomerase RNA there needs to be: the template for reverse transcription (the sequence CCCAAU), the pseudoknot domain (the core domain in the diagram), a stem–loop that interacts with TERT (CR4/CR5), and a 3′ element required for RNA stability (CR7). This is a very specific set of constraints and mRNA vaccines would have to be deliberately designed to have them, which they are not (see image above for standard organization of an mRNA vaccine). Ribosomes also have intrinsic mRNA helicase activity that unwinds such structures so that the mRNA can be read and processed for the synthesis of a protein. Additionally, the mature human telomerase RNA is 451 nucleotides in length. The mRNA from these vaccines is approximately 1200–1300 nucleotides long. It is too large to function as a telomerase RNA in humans (there are some animals which have telomerase RNAs of that size but we are not one of them) and given how precisely the telomerase RNA must fold, it is unlikely to assume the required structures for recognition and binding of the telomerase.
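The relationship between the RNA template and the DNA repeat can be checked by simple base pairing. The sketch below assumes the template quoted above, CAAUCCCAAUC, is written in the 3′ to 5′ direction (the direction in which the enzyme reads it), so that pairing it base by base gives the new DNA in the usual 5′ to 3′ direction. The function and variable names are hypothetical, and this models only the base pairing, not the enzyme.

```python
# Base-pairing partners when DNA is copied from an RNA template.
RNA_TO_DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def copy_from_template(rna_3_to_5):
    """Return the DNA (written 5'->3') that pairs with an RNA template
    written 3'->5'. Assumes the input contains only A, U, G and C."""
    return "".join(RNA_TO_DNA_PARTNER[base] for base in rna_3_to_5)


TELOMERASE_TEMPLATE = "CAAUCCCAAUC"  # as quoted in the text, assumed 3'->5'

if __name__ == "__main__":
    product = copy_from_template(TELOMERASE_TEMPLATE)
    print(product)              # GTTAGGGTTAG
    print("GGGTTA" in product)  # True
```

Copying the template once gives GTTAGGGTTAG, which contains the GGGTTA unit; repeating the copy over and over is what builds up the telomere.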
I initially considered discussing in detail the reverse transcriptases of the hepadnaviruses (e.g. hepatitis B virus) and retroviruses (e.g. HIV and HERVs) but the discussion quickly became inaccessible. Suffice it to say, reverse transcriptases are not capable of picking up any random RNA and generating a DNA from it. They require a primer to start the reaction. For retroviruses, this is a tRNA that is stolen from the host cell and packaged into the virion. Furthermore, in the retroviruses, reverse transcription occurs within a nucleocapsid which allows dNTPs (the building blocks of DNA) in, but cannot admit something as large as an entirely separate RNA molecule spanning about 1200 bases. Reverse transcription by hepadnaviruses is similar in principle, requiring a pregenomic RNA segment that is chemically linked to the DNA of the hepadnavirus. Reverse transcription will not occur spontaneously with just any RNA. Even in RT-PCR, the reaction requires a primer, such as an oligodeoxythymidine sequence that binds to the polyA tail of the mRNA in question. Additionally, there’s a secondary requirement here to be able to “change” the DNA of the host: to actually manipulate it in some way. In the case of the hepadnaviruses this doesn’t really happen. The hepadnavirus genome gets into the nucleus and forms a covalently closed circular DNA with its own associated histones, essentially a small, separate chromosome. It doesn’t touch the host’s DNA. In the case of retroviruses, the DNA gets integrated into the host chromosome, and the effect depends on where it gets integrated. HIV for example has a strong bias for inserting itself into genes, which can be problematic if, for example, the gene produces a protein important for the maintenance of genome integrity (which could lead to cancer if left unchecked).
The development of cancer from such a process, however, cannot simply occur without many other things going wrong, like for instance a massive death of helper T cells that critically impairs the ability of the immune system to conduct surveillance of cells for evidence of malignancy and kill them, as happens in HIV. Now, should we choose to ignore everything thus far established about how cell biology works, including the need for a primer to initiate the reverse transcriptase reaction, and allow that a retrovirus readily permits integration of the resultant spike protein RBD or entire spike protein gene into the host, this would simply lead to the insertion of a gene that may be able to make the spike protein or just the RBD (depending on where it inserted and whether it could recruit transcriptional machinery). That would only serve to present to the immune system a foreign protein that it has been primed to respond against, and the immune system would subsequently kill the cell. Also, as the vaccines are delivered by intramuscular injection, the cells in question would most likely be muscle cells (which you can lose without loss of any eloquent function) or dendritic cells (which you could also lose without any loss of significant immunological function).
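Returning to the primer requirement discussed above, the idea that reverse transcription cannot start on just any RNA can be sketched as a toy function. This is a cartoon under loud assumptions: the only thing it models is that a DNA primer must base-pair with the RNA before any copying happens. All names and the example sequences are hypothetical, and real reverse transcriptases have many further requirements that are ignored here.

```python
# Toy model only: real reverse transcriptases are far more selective than this.
DNA_TO_RNA_PARTNER = {"A": "U", "T": "A", "G": "C", "C": "G"}
RNA_TO_DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def binding_site_for(primer_dna):
    """RNA sequence (5'->3') that a DNA primer (5'->3') can base-pair with."""
    return "".join(DNA_TO_RNA_PARTNER[base] for base in reversed(primer_dna))


def reverse_transcribe(rna, primer_dna):
    """Return the cDNA (5'->3') made from `rna` (5'->3'), or None if the
    primer has nowhere to bind. Copying starts at the primer and runs
    toward the 5' end of the RNA."""
    if not primer_dna:
        return None  # no primer, no reaction
    site = binding_site_for(primer_dna)
    start = rna.rfind(site)  # use the binding site closest to the 3' end
    if start == -1:
        return None  # primer cannot anneal, so nothing is copied
    template = rna[: start + len(site)]
    return "".join(RNA_TO_DNA_PARTNER[base] for base in reversed(template))


if __name__ == "__main__":
    message = "AUGGCCAAAAAA"  # made-up RNA with a short polyA tail
    print(reverse_transcribe(message, "TTTTTT"))  # TTTTTTGGCCAT
    print(reverse_transcribe(message, "GGGGGG"))  # None
```

With an oligo(dT)-style primer the made-up message is copied, starting from the primer; with a primer that has no matching site, nothing happens at all.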
To conclude, and I really hope this ends it:
There is no feasible means by which an mRNA vaccine could end up in the nucleus of a cell, nor prime a reverse transcription reaction, nor give you a mitochondrial disease.
There is no reasonable possibility based on the totality of our knowledge of cell biology, reverse transcriptases, human genetics, and the immune system that mRNA vaccines can affect your DNA.
We should await the detailed safety data, but, a priori, a segment of RNA encoding the spike protein RBD or even the whole spike protein of SARS-CoV-2 with no replicative potential, and no ability to form whole virus, nor in some cases even the ability to form an ENTIRE spike protein, should be expected to be a safe vaccine that isn’t going to cause these insane pie-in-the-sky science fiction scenarios.
If you are worried about the mRNA vaccines, then don’t get them. The data suggest that there will soon be other kinds of vaccines with good efficacy as well. I, however, am content to roll up my sleeves for one of them.
- Hartwell L, Goldberg M, Fischer J, Hood L. Genetics. 6th ed. New York: McGraw Hill; 2018.
- Lodish H, Berk A, Kaiser C, Krieger M, Bretscher A, Ploegh H, Amon A, Martin K. Molecular cell biology. 8th ed. New York: W.H. Freeman; 2016
- Flint S, Racaniello V, Rall G, Skalka A, Enquist L. Principles of virology. Washington, DC: ASM Press; 2015.
- Blanco, E., Shen, H. & Ferrari, M. Principles of nanoparticle design for overcoming biological barriers to drug delivery. Nat Biotechnol 33, 941–951 (2015). https://doi.org/10.1038/nbt.3330
- Palikaras, K., Lionaki, E. & Tavernarakis, N. Mechanisms of mitophagy in cellular homeostasis, physiology and pathology. Nat Cell Biol 20, 1013–1022 (2018). https://doi.org/10.1038/s41556-018-0176-2
- Zhang Q, Kim N-K, Feigon J. Architecture of human telomerase RNA. Proceedings of the National Academy of Sciences of the United States of America. 2011;108(51):20325–20332.
- Ackerman DG, Feigenson GW. Lipid bilayers: clusters, domains and phases. Essays in biochemistry. 2015;57:33–42.
- Zhang J, Shrivastava S, Cleveland RO, Rabbitts TH. Lipid-mRNA nanoparticle designed to enhance intracellular delivery mediated by shock waves. ACS applied materials & interfaces. 2019;11(11):10481–10491.
- Hafez IM, Maurer N, Cullis PR. On the mechanism whereby cationic lipids promote intracellular delivery of polynucleic acids. Gene therapy. 2001;8(15):1188–1196.
- Hassett KJ, Benenato KE, Jacquinet E, Lee A, Woods A, Yuzhakov O, Himansu S, Deterling J, Geilich BM, Ketova T, et al. Optimization of lipid nanoparticles for intramuscular administration of mRNA vaccines. Molecular therapy. Nucleic acids. 2019;15:1–11.
- Maruggi G, Zhang C, Li J, Ulmer JB, Yu D. mRNA as a transformative technology for vaccine development to control infectious diseases. Molecular therapy: the journal of the American Society of Gene Therapy. 2019;27(4):757–772.
- Linares-Fernández S, Lacroix C, Exposito J-Y, Verrier B. Tailoring mRNA vaccine to balance innate/adaptive immune response. Trends in molecular medicine. 2020;26(3):311–323.
- Kowalski PS, Rudra A, Miao L, Anderson DG. Delivering the messenger: Advances in technologies for therapeutic mRNA delivery. Molecular therapy: the journal of the American Society of Gene Therapy. 2019;27(4):710–728.
- Rauch J, Janoff AS. for immunorecognition of nonbilayer lipid phases in vivo. Pnas.org. https://www.pnas.org/content/pnas/87/11/4112.full.pdf
- Pickles S, Vigié P, Youle RJ. Mitophagy and quality control mechanisms in mitochondrial maintenance. Current biology: CB. 2018;28(4):R170–R185.
- Gooch JW. Transfer RNA (tRNA). In: Encyclopedic Dictionary of Polymers. New York, NY: Springer New York; 2011. p. 929–929.
- Alston CL, Rocha MC, Lax NZ, Turnbull DM, Taylor RW. The genetics and pathology of mitochondrial disease. The Journal of pathology. 2017;241(2):236–250.
- Huet S, Avilov SV, Ferbitz L, Daigle N, Cusack S, Ellenberg J. Nuclear import and assembly of influenza A virus RNA polymerase studied in live cells by fluorescence cross-correlation spectroscopy. Journal of virology. 2010;84(3):1254–1264.
- Schmidt JC, Cech TR. Human telomerase: biogenesis, trafficking, recruitment, and activation. Genes & development. 2015;29(11):1095–1105.
- The biotechnology revolution: PCR and cloning expressed genes. Nature.com. https://www.nature.com/scitable/topicpage/the-biotechnology-revolution-pcr-and-the-use-553/
- Šponer J, Bussi G, Krepl M, Banáš P, Bottaro S, Cunha RA, Gil-Ley A, Pinamonti G, Poblete S, Jurečka P, et al. RNA structural dynamics as captured by molecular simulations: A comprehensive overview. Chemical reviews. 2018;118(8):4177–4338.
- Takyar S, Hickerson RP, Noller HF. mRNA helicase activity of the ribosome. Cell. 2005;120(1):49–58.