PMC:7796058 / 56257-59410 JSONTXT


{"target":"https://pubannotation.org/docs/sourcedb/PMC/sourceid/7796058","sourcedb":"PMC","sourceid":"7796058","source_url":"https://www.ncbi.nlm.nih.gov/pmc/7796058","text":"5.6. Audio-Based Risky Behavior Detection\nThis section examines an audio classification algorithm that recognizes coughing and sneezing using an audio sensor with an embedded DL engine. The methodology for audio detection is shown in Figure 13. This figure shows the four main steps of the audio DL process.The recording needs to first be preprocessed for noise before being used for extracting sound features. The most commonly known time-frequency feature is the short-time Fourier transform [67], Mel spectrogram [68], and wavelet spectrogram [69]. The Mel spectrogram was based on a nonlinear frequency scale motivated by human auditory perception and provides a more compact spectral representation of sounds when compared to the STFT [3]. To compute a Mel spectrogram, we first convert the sample audio files into time series. Next, its magnitude spectrogram is computed, and then mapped onto the Mel scale with power 2. The end result would be a Mel spectrogram [70]. The last step in preprocessing would be to convert Mel spectrograms into log Mel spectrograms. Then the image results would be introduced as an input to the deep learning modelling process.\nConvolutional neural network (CNN) architectures use multiple blocks of successive convolution and pooling operations for feature learning and down sampling along the time and feature dimensions, respectively [71]. The VGG16 is a pre-trained CNN [72] used as a base model for transfer learning (Table 6) [73]. VGG16 is a famous CNN architecture that uses multiple stacks of small kernel filters (3 by 3) instead of the shallow architecture of two or three layers with large kernel filters [74]. Using multiple stacks of small kernel filters increases the network’s depth, which results in improving complex feature learning while decreasing computation costs. VGG16 architecture includes 16 convolutional and three fully connected layers. Audio-based risky behavior detection is based on complex features and distinguishable behaviors (e.g., coughing, sneezing, background noise), which requires a deeper CNN model than shallow architecture (i.e., two or three-layer architecture) offers [75]. VGG16 has been adopted for audio event detection and demonstrated significant literature results [71]. The feature maps were flattened to obtain the fully connected layer after the last convolutional layer. For most CNN-based architectures, only the last convolutional layer activations are connected to the final classification layer [76].\nThe ESC-50 [77] and AudioSet [78] datasets were used to extract cough and sneezing training samples. The ESC-50 dataset is a labelled collection of 2000 environmental audio recordings suitable for benchmarking methods of environmental sound classification. AudioSet consists of an expanding ontology of 632 audio event classes and a collection of 2,084,320 human-labelled, 10 s sound clips taken from YouTube videos. Over 5000 samples were extracted for the transfer learning CNN model which was then divided to train and test datasets. We examined the performance of the trained CNN models using coughing and sneezing. 
Annotation tracks attached to this passage in the PubAnnotation record: LitCovid-PubTator annotates the coughing/cough mentions as Disease (MESH:D003371) and the "human" mentions as Species (Tax:9606); LitCovid-PD-HP annotates the cough mentions as Phenotype (HP:0012735); LitCovid-sentences provides sentence-level segmentation of the passage. The record additionally marks the section title and paragraph divisions.