
Review 1: "SARS-CoV-2 variant evolution in the United States: High accumulation of viral mutations over time likely through serial Founder Events and mutational bursts"

Published on Apr 14, 2022
This Pub is a Review of
SARS-CoV-2 variant evolution in the United States: High accumulation of viral mutations over time likely through serial Founder Events and mutational bursts

ABSTRACT

Since the first case of COVID-19 in December 2019 in Wuhan, China, SARS-CoV-2 has spread worldwide and within a year has caused 2.29 million deaths globally. With dramatically increasing infection numbers, and the arrival of new variants with increased infectivity, tracking the evolution of its genome is crucial for effectively controlling the pandemic and informing vaccine platform development. Our study explores evolution of SARS-CoV-2 in a representative cohort of sequences covering the entire genome in the United States, through all of 2020 and early 2021. Strikingly, we detected many accumulating Single Nucleotide Variations (SNVs) encoding amino acid changes in the SARS-CoV-2 genome, with a pattern indicative of RNA editing enzymes as major mutators of SARS-CoV-2 genomes. We report three major variants through October of 2020. These revealed 14 key mutations that were found in various combinations among 14 distinct predominant signatures. These signatures likely represent evolutionary lineages of SARS-CoV-2 in the U.S. and reveal clues to its evolution such as a mutational burst in the summer of 2020 likely leading to a homegrown new variant, and a trend towards higher mutational load among viral isolates, but with occasional mutation loss. The last quartile of 2020 revealed a concerning accumulation of mostly novel low frequency replacement mutations in the Spike protein, and a hypermutable glutamine residue near the putative furin cleavage site. Finally, the end of the year data revealed the presence of known variants of concern including B.1.1.7, which has acquired additional Spike mutations. Overall, our results suggest that predominant viral sequences are dynamically evolving over time, with periods of mutational bursts and unabated mutation accumulation. This high level of existing variation, even at low frequencies and especially in the Spike-encoding region may be become problematic when superspreader events, akin to serial Founder Events in evolution, drive these rare mutations to prominence.

AUTHOR SUMMARY

The pandemic of coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has caused the death of more than 2.29 million people and continues to be a severe threat internationally. Although simple measures such as social distancing, periodic lockdowns and hygiene protocols were immediately put into force, the infection rates were only temporarily minimized. When infection rates exploded again new variants of the virus began to emerge. Our study focuses on a representative set of sequences from the United States throughout 2020 and early 2021. We show that the driving force behind the variants of public health concern, is widespread infection and superspreader events. In particular, we show accumulation of mutations over time with little loss from genetic drift, including in the Spike region, which could be problematic for vaccines and therapies. This lurking accumulated genetic variation may be a superspreader event from becoming more common and lead to variants that can escape the immune protection provided by the existing vaccines.

RR:C19 Evidence Scale rating by reviewer:

  • Reliable. The main study claims are generally justified by its methods and data. The results and conclusions are likely to be similar to the hypothetical ideal study. There are some minor caveats or limitations, but they would/do not change the major claims of the study. The study provides sufficient strength of evidence on its own that its main claims should be considered actionable, with some room for future revision.



Tasakis et al. describe the mutational landscape of the SARS-CoV-2 genome in a subset of US isolates, with a particular focus on the non-synonymous mutations of the Spike protein. The data is well presented and is predominantly descriptive. The discussion is quite long, and its treatment of the significance of superspreader events is quite extended. Some specific comments for consideration are outlined below.

  • Page 3 lines 60-62: Please rephrase this sentence; it is an inaccurate generalization. In some regions, e.g. South East Asia, the infection control measures were highly effective, while in other regions they were not sufficient and infection rates did increase.

  • Page 3 lines 68 and 69: Do you mean “a superspreader event away from…”?

  • Line 807: This sentence doesn’t make sense: “per state in from”.

  • Fig 1A: the y-axis describes the data as the median number of SNVs but the figure legend describes the data differently – please fix this. The same comment applies to line 153 in the results section.

  • Line 153: For clarity, this should say that the number of SNVs increases compared to a reference strain.

  • Line 259: It would be useful to confirm in the main text that these quartiles are defined by the calendar year. This is stated in one of the supplementary legends but should also be clarified in the main text.

  • Line 295 to 297: This is more of a discussion point than a result, although this is a stylistic issue rather than a scientific one.

  • Line 300 to 302: I think this section could be rewritten to be a bit clearer. It should be made clear here that SARS-CoV-2 is more stable than other RNA viruses, such as HIV. It is really the large scale of the infected population and the emergence of selection pressure due to herd immunity that is resulting in the emergence of new SARS-CoV-2 variants.

  • Line 358: I wouldn’t say “besides mutation”. Superspreading alone would not create diversity; rather, it influences how these mutations are dispersed.

  • Line 365: This should be clarified as neutralizing antibody (nAb) immune escape, as T cell immune escape is not limited to the Spike, whereas nAb escape is concentrated in (but not exclusive to) the Spike.

  • Line 388 and 389: This is the same point as at the start of the discussion. Is it really less stable than initially thought? It still has a lower substitution rate than other RNA viruses; what matters more is the extent of the outbreak and the high ongoing transmission.

  • Line 418: The authors state that they did not detect evidence of recombination, but this data is not shown in the results section. Please modify this sentence to reflect what was and wasn’t shown in the results section.

