PubAnnotation

Id	Subject	Object	Predicate	Lexical cue
T823	0-39	Sentence	denotes	Nasal Washes during Influenza Infection
T824	40-123	Sentence	denotes	Sample processing, sequencing, and analysis was performed as in (Cao et al., 2020).
T825	124-206	Sentence	denotes	Reads were aligned to the GRCh37 reference genome combined with influenza genomes.
T826	207-392	Sentence	denotes	Mapped reads from each sample were then corrected for Drop-seq barcode synthesis error using the Drop-seq core computational tools developed by the McCarroll Lab (Macosko et al., 2015).
T827	393-554	Sentence	denotes	Genes were quantified using End Sequence Analysis Toolkit (ESAT, github/garber-lab/ESAT) with parameters -wlen 100 -wOlap 50 -wExt 0 -scPrep (Derr et al., 2016).
T828	555-752	Sentence	denotes	Finally, UMIs that likely result from sequencing errors were corrected by merging any UMIs that were observed only once and have 1 hamming distance from a UMI detected by two or more aligned reads.
T829	753-812	Sentence	denotes	Only cell barcodes with more than 1,000 UMIs were analyzed.
T830	813-881	Sentence	denotes	Cell barcodes with mostly erythrocyte genes (HBA, HBB) were removed.
T831	882-968	Sentence	denotes	From here on, the remaining cell barcodes in the matrix would be referred to as cells.
T832	969-1063	Sentence	denotes	The final gene by cell matrix was normalized using the scran package v3.10 (Lun et al., 2016).
T833	1064-1259	Sentence	denotes	The normalized matrix was used for dimensionality reduction by first selecting variable genes that had a high coefficient of variance (CV) and were expressed (> = 1 UMI) by more than three cells.
T834	1260-1488	Sentence	denotes	Influenza viral genes, interferon stimulated genes, and cell cycle related genes were removed from the variable gene list in order to minimize the impact of viral responses and mitosis on clustering and cell type identification.
T835	1489-1699	Sentence	denotes	This resulted in the selection of 2484 variable genes. t-distributed stochastic neighbor embedding (tSNE) was applied to the first ten principal components (PCs), which explained 95% of the total data variance.
T836	1700-1888	Sentence	denotes	Density clustering (Rodriguez and Laio, 2014) was performed on the resulting tSNE coordinates and identified four major clusters: epithelial cells, neutrophils, macrophages and leukocytes.
T837	1889-2049	Sentence	denotes	The epithelial cell cluster and the leukocyte cluster were then re-clustered independently, as described above, to identify populations within each metacluster.
T838	2050-2244	Sentence	denotes	Specifically, the epithelial cell cluster was re-embedded using 2629 variable genes selected by the same criteria mentioned in the previous section and 13 PCs that explained 95% of the variance.
T839	2245-2318	Sentence	denotes	Density clustering over the epithelial cell subset revealed ten clusters.
T840	2319-2449	Sentence	denotes	Differential gene expression analysis using edgeR (Robinson et al., 2010) was performed to identify marker genes for each cluster.
T841	2450-2759	Sentence	denotes	Influenza-infected and bystander cells were identified after correcting for sample-specific distribution of ambient influenza mRNA contamination and predicted cells most likely to be infected identified using a hurdle zero inflated negative binomial (ZINB) model and a support vector machine (SVM) classifier.
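
The UMI correction described in T828 can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the pipeline used in the annotated source: the function names, the input layout (a mapping from UMI sequence to aligned-read count for one cell barcode and gene), and the tie-breaking rule when a singleton has several eligible neighbours are all hypothetical.

```python
from collections import Counter
from typing import Dict


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("UMIs must have equal length")
    return sum(x != y for x, y in zip(a, b))


def merge_singleton_umis(umi_reads: Dict[str, int]) -> Dict[str, int]:
    """Merge UMIs observed once into a neighbour at Hamming distance 1
    that is supported by two or more reads.

    umi_reads maps UMI sequence -> number of aligned reads (hypothetical
    layout, for a single cell barcode and gene).
    """
    # UMIs supported by two or more reads are kept and act as merge targets.
    anchors = [u for u, n in umi_reads.items() if n >= 2]
    merged = Counter({u: umi_reads[u] for u in anchors})

    for umi, n in umi_reads.items():
        if n >= 2:
            continue
        neighbours = [a for a in anchors if hamming(umi, a) == 1]
        if neighbours:
            # Assumption: on ties, prefer the neighbour with the most reads,
            # then the lexicographically smallest sequence.
            target = min(neighbours, key=lambda a: (-umi_reads[a], a))
            merged[target] += n
        else:
            # No eligible neighbour: the singleton is kept as its own UMI.
            merged[umi] += n
    return dict(merged)


counts = {"AAAA": 5, "AAAT": 1, "CCCC": 1, "GGGG": 2, "GGGA": 1}
corrected = merge_singleton_umis(counts)
```

In this toy input, AAAT and GGGA are each one mismatch away from a UMI with at least two reads and are merged into it, while CCCC has no such neighbour and is kept, so five observed UMIs collapse to three.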

PMC:7252096 / 114990-117749