PMC:7782580 / 42116-46626

Annotations

Id Subject Object Predicate Lexical cue
T313 0-19 Sentence denotes Data sets splitting
T314 20-196 Sentence denotes We used the multi-modal data sets from four public data sets and one hospital (Youan hospital) in our research and split the hybrid data set in the following manner. For X-data:
T315 197-339 Sentence denotes The CXR images of COVID-19 cases collected from the public CCD52 contained 212 patients diagnosed with COVID-19 and were resized to 512 × 512.
T316 340-413 Sentence denotes Each image contained 1–2 suspected areas with inflammatory lesions (SAs).
T317 414-513 Sentence denotes We also collected 5100 normal cases and 3100 pneumonia cases from another public data set (RSNA)53.
T318 514-735 Sentence denotes In addition, the CXR images collected from the Youan hospital contained 45 cases diagnosed with COVID-19, 503 normal cases, 435 cases diagnosed with pneumonia (not COVID-19 patients), and 145 cases diagnosed as influenza.
T319 736-842 Sentence denotes The CXR images collected from the Youan hospital were obtained using the Carestream DRX-Revolution system.
T320 843-960 Sentence denotes All the CXR images of COVID-19 cases were analyzed by the two experienced radiologists to determine the lesion areas.
T321 961-1140 Sentence denotes The X-data of the normal cases (XNPDS), that of the pneumonia cases (XPPDS), and that of the COVID-19 cases (XCPDS) from public data sets constituted the X public data set (XPDS).
T322 1141-1324 Sentence denotes The X-data of the normal cases (XNHDS), that of the pneumonia cases (XPHDS), and that of the COVID-19 cases (XCHDS) from the Youan hospital constituted the X hospital data set (XHDS).
T323 1325-1337 Sentence denotes For CT-data:
T324 1338-1820 Sentence denotes We collected CT-data of 120 normal cases from a public lung CT-data set (LUNA16, a large data set for automatic nodule detection in the lungs54), which is a subset of LIDC-IDRI (the LIDC-IDRI contains a total of 1018 helical thoracic CT scans and was created in collaboration with eight medical imaging companies: AGFA Healthcare, Carestream Health, Inc., Fuji Photo Film Co., GE Healthcare, iCAD, Inc., Philips Healthcare, Riverain Medical, and Siemens Medical Solutions)55.
T325 1821-1986 Sentence denotes It was confirmed by the two experienced radiologists from the Youan Hospital that no lesion areas of COVID-19, pneumonia, or influenza were present in the 120 cases.
T326 1987-2129 Sentence denotes We also collected the CT-data of pneumonia cases from a public data set (images of COVID-19 positive and negative pneumonia patients: ICNP)56.
T327 2130-2302 Sentence denotes The CT-data collected from the Youan hospital contained 95 patients diagnosed with COVID-19, 50 patients diagnosed with influenza and 215 patients diagnosed with pneumonia.
T328 2303-2471 Sentence denotes The images of the CT scans collected from the Youan hospital were obtained using the PHILIPS Brilliance iCT 256 system (which was also used for the LIDC-IDRI data set).
T329 2472-2585 Sentence denotes The slice thickness of the CT scans was 5 mm, and the CT-data images were grayscale images with 512 × 512 pixels.
T330 2586-2776 Sentence denotes Areas with 2–5 SAs were annotated by the two experienced radiologists using a rapid keystroke-entry format in the images for each case, and these areas ranged from 16 × 16 to 64 × 64 pixels.
T331 2777-2928 Sentence denotes The CT-data of the normal cases (CTNPDS) and that of the pneumonia cases (CTPPDS) from the public data sets constituted the CT public data set (CTPDS).
T332 2929-3173 Sentence denotes The CT-data of the COVID-19 cases from the Youan hospital (CTCHDS), the influenza cases from the Youan hospital (CTIHDS), and the normal cases from the Youan hospital (CTNHDS) constituted the CT hospital (clinically-diagnosed) data set (CTHDS).
T333 3174-3202 Sentence denotes For clinical indicator data:
T334 3203-3429 Sentence denotes Five clinical indicators (white blood cell count, neutrophil percentage, lymphocyte percentage, procalcitonin, C-reactive protein) of 95 COVID-19 cases were obtained from the Youan hospital, as shown in Supplementary Table 20.
T335 3430-3686 Sentence denotes A total of 95 data pairs from the 95 COVID-19 cases (369 images of the lesion area and the 95 × 5 clinical indicators) were collected from the Youan hospital for the correlation analysis of the lesion areas of the COVID-19 and the five clinical indicators.
T336 3687-3794 Sentence denotes The images of the SAs and the clinical indicator data constituted the correlation analysis data set (CADS).
T337 3795-3914 Sentence denotes We split the XPDS, XHDS, CTPDS, CTHDS, and CADS into the training-validation (train-val) and test data sets using TTSF.
T338 3915-4021 Sentence denotes The details of the hybrid data sets for the public data sets and Youan hospital data are shown in Table 1.
T339 4022-4109 Sentence denotes The train-val part of CTHDS is referred to as CTHTS, and the test part is called CTHVS.
T340 4110-4251 Sentence denotes The same naming scheme was adopted for XPDS, XHDS, CTPDS, and CADS, i.e., XPTS, XPVS, XHTS, XHVS, CTPTS, CTPVS, CATS, and CAVS, respectively.
T341 4252-4436 Sentence denotes The training-validation parts of the four public data sets and the hospital (Youan Hospital) data set were mixed for X-data and CT-data and named XMTS and CTMTS, respectively.
T342 4437-4510 Sentence denotes The test parts were split in the same way and named XMVS and CTMVS.
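
The naming scheme described in the sentences above (a base data set ending in "DS", with a train-val part ending in "TS" and a test part ending in "VS", plus the mixed sets for X-data and CT-data) can be sketched in Python. This is only an illustration of the naming convention: the helper names `split_names` and `mixed_sets` are hypothetical, and neither the split ratio nor the TTSF procedure is specified in this section, so no actual data splitting is performed.

```python
# Illustrative sketch of the data set naming convention only.
# `split_names` and `mixed_sets` are hypothetical helpers, not part of the source.

BASE_SETS = ["XPDS", "XHDS", "CTPDS", "CTHDS", "CADS"]


def split_names(base: str) -> tuple[str, str]:
    """Return (train-val name, test name) for a base name ending in 'DS'."""
    if not base.endswith("DS"):
        raise ValueError(f"unexpected data set name: {base!r}")
    stem = base[:-2]  # e.g. 'CTH' for 'CTHDS'
    return stem + "TS", stem + "VS"


def mixed_sets() -> dict[str, list[str]]:
    """Map each mixed set name to the public and hospital parts it combines."""
    mixed: dict[str, list[str]] = {}
    for modality in ("X", "CT"):
        public_tv, public_test = split_names(modality + "PDS")
        hospital_tv, hospital_test = split_names(modality + "HDS")
        mixed[modality + "MTS"] = [public_tv, hospital_tv]      # mixed train-val
        mixed[modality + "MVS"] = [public_test, hospital_test]  # mixed test
    return mixed


if __name__ == "__main__":
    for base in BASE_SETS:
        print(base, "->", split_names(base))
    for name, parts in mixed_sets().items():
        print(name, "=", " + ".join(parts))
```

Deriving the names from a single rule keeps the ten split names and four mixed names consistent with the abbreviations listed in the text.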