> top > docs > PMC:7782580 > spans > 33213-34381 > annotations

PMC:7782580 / 33213-34381 JSONTXT

Annnotations TAB JSON ListView MergeView

LitCovid-sentences

Id Subject Object Predicate Lexical cue
T255 0-233 Sentence denotes We adopted a knowledge distillation method in the training phrase; a small model (called a student network) was trained to mimic the ensemble of multiple models (called teacher networks) to obtain a small model with high performance.
T256 234-365 Sentence denotes In the distillation process, knowledge was transferred from the teacher networks to the student network to minimize knowledge loss.
T257 366-455 Sentence denotes The target was the output of the teacher networks; these outputs were called soft labels.
T258 456-645 Sentence denotes The student network also learned from the ground-truth labels (also called hard labels), thereby minimizing the knowledge loss from the student networks, whose targets were the hard labels.
T259 646-792 Sentence denotes Therefore, the overall loss function of the student network incorporated both knowledge distillation and knowledge loss from the student networks.
T260 793-1036 Sentence denotes After the student network had been well-trained, the task of the teacher networks was complete, and the student model could be used on a regular computer with a fast speed, which is suitable for hospitals without extensive computing resources.
T261 1037-1168 Sentence denotes As a result of the knowledge distillation method, the CNNCF achieved high performance with a few parameters in the teacher network.