Id | Subject | Object | Predicate | Lexical cue
T220 | 0-129 | Sentence | denotes | Sequences from the United Kingdom corresponded to nearly half of the sequences (n = 12,157/25,671, 47%) of this filtered dataset.
T221 | 130-431 | Sentence | denotes | To avoid overrepresentation of the UK sequences and bias in subsequent analyses, we investigated the effect of downsampling sequences on the mean Hamming distance and identified the minimum number of sequences required to recover the mean corresponding to the full distribution (SI Appendix, Fig. S1).
T222 | 432-631 | Sentence | denotes | A subsample of 5,000 sequences satisfied these criteria, and also ensured that there were fewer sequences from the United Kingdom than from the United States (n = 5,398), reflecting the epidemiology.
T223 | 632-754 | Sentence | denotes | These 5,000 sequences were sampled randomly, with weight proportional to the number of UK sequences collected on that day.
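
The downsampling procedure described in sentences T221 to T223 can be illustrated with a minimal Python sketch. It is not the authors' code, and several points are assumptions: the function names (`hamming`, `mean_pairwise_hamming`, `weighted_subsample`, `downsampling_curve`) and the toy records are hypothetical; "weight proportional to the number of UK sequences collected on that day" is read here as a per-sequence weight equal to the count of sequences sharing its collection date; sampling is assumed to be without replacement, done with the Efraimidis-Spirakis key method (a swapped-in technique, since the source does not name one); and the mean Hamming distance is taken over all unordered pairs of aligned sequences.

```python
import random
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_hamming(seqs):
    """Mean Hamming distance over all unordered pairs (assumed definition)."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    total = 0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        total += hamming(a, b)
        n_pairs += 1
    return total / n_pairs


def weighted_subsample(records, k, rng):
    """Draw k records without replacement, weighted by same-day count.

    records: list of (collection_date, sequence) tuples.
    Assumption: a record's weight is the number of records collected
    on the same date. Uses Efraimidis-Spirakis keys u ** (1 / w) and
    keeps the k records with the largest keys.
    """
    per_day = Counter(date for date, _ in records)
    keyed = [
        (rng.random() ** (1.0 / per_day[date]), i)
        for i, (date, _) in enumerate(records)
    ]
    keyed.sort(reverse=True)
    return [records[i] for _, i in keyed[:k]]


def downsampling_curve(records, sizes, rng):
    """Mean pairwise Hamming distance of one subsample per size."""
    return {
        k: mean_pairwise_hamming(
            [seq for _, seq in weighted_subsample(records, k, rng)]
        )
        for k in sizes
    }


# Hypothetical toy data: (collection date, aligned sequence).
records = [
    ("2020-03-01", "ACGTACGT"),
    ("2020-03-01", "ACGTACGA"),
    ("2020-03-01", "ACGTACGT"),
    ("2020-03-02", "ACGAACGT"),
    ("2020-03-03", "TCGTACGT"),
]

full_mean = mean_pairwise_hamming([seq for _, seq in records])
curve = downsampling_curve(records, sizes=[2, 3, 4, 5], rng=random.Random(0))
print(full_mean, curve)
```

Comparing each entry of `curve` against `full_mean` mirrors the idea of finding the smallest subsample that recovers the full-distribution mean. In practice one would average over repeated draws per size, and the all-pairs computation is quadratic in the number of sequences, so a dataset of the size reported would likely need a vectorized or approximate distance calculation.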