
PMC:7796058 / 39663-63261

Annotations

LitCovid-PubTator

Id Subject Object Predicate Lexical cue tao:has_database_id
269 141-149 Disease denotes COVID-19 MESH:C000657245
275 715-723 Disease denotes COVID-19 MESH:C000657245
276 827-835 Disease denotes coughing MESH:D003371
277 1088-1096 Disease denotes COVID-19 MESH:C000657245
278 1761-1769 Disease denotes COVID-19 MESH:C000657245
279 1870-1878 Disease denotes COVID-19 MESH:C000657245
283 2231-2239 Disease denotes COVID-19 MESH:C000657245
284 2240-2248 Disease denotes infected MESH:D007239
285 2459-2467 Disease denotes infected MESH:D007239
290 3954-3960 Gene denotes beacon Gene:59286
291 3579-3585 Gene denotes beacon Gene:59286
292 3383-3389 Gene denotes beacon Gene:59286
293 4001-4009 Disease denotes COVID-19 MESH:C000657245
299 5367-5373 Species denotes people Tax:9606
300 4175-4183 Disease denotes COVID-19 MESH:C000657245
301 4686-4694 Disease denotes COVID-19 MESH:C000657245
302 5097-5105 Disease denotes COVID-19 MESH:C000657245
303 5264-5272 Disease denotes COVID-19 MESH:C000657245
307 6391-6397 Gene denotes beacon Gene:59286
308 6310-6316 Gene denotes beacon Gene:59286
309 6061-6067 Gene denotes beacon Gene:59286
317 6844-6850 Gene denotes beacon Gene:59286
318 6693-6699 Gene denotes beacon Gene:59286
319 6477-6483 Gene denotes beacon Gene:59286
320 7007-7018 Species denotes coronavirus Tax:11118
321 6900-6908 Disease denotes COVID-19 MESH:C000657245
322 6970-6978 Disease denotes infected MESH:D007239
323 7234-7242 Disease denotes COVID-19 MESH:C000657245
328 8216-8222 Gene denotes beacon Gene:59286
329 8070-8076 Gene denotes beacon Gene:59286
330 8489-8497 Disease denotes COVID-19 MESH:C000657245
331 8673-8681 Disease denotes COVID-19 MESH:C000657245
333 8903-8909 Gene denotes beacon Gene:59286
335 9287-9293 Species denotes People Tax:9606
344 9390-9396 Species denotes people Tax:9606
345 9398-9404 Species denotes People Tax:9606
346 9431-9437 Species denotes people Tax:9606
347 9568-9574 Species denotes people Tax:9606
348 10026-10032 Species denotes people Tax:9606
349 10113-10119 Species denotes people Tax:9606
350 10178-10184 Species denotes people Tax:9606
351 9758-9767 Disease denotes elevators MESH:D006973
360 10736-10742 Species denotes people Tax:9606
361 10788-10794 Species denotes People Tax:9606
362 11005-11011 Species denotes people Tax:9606
363 11208-11214 Species denotes people Tax:9606
364 11323-11329 Species denotes people Tax:9606
365 11388-11394 Species denotes people Tax:9606
366 11561-11567 Species denotes people Tax:9606
367 11686-11692 Species denotes people Tax:9606
375 11871-11877 Species denotes people Tax:9606
376 11976-11982 Species denotes people Tax:9606
377 12284-12290 Species denotes people Tax:9606
378 12599-12605 Species denotes people Tax:9606
379 12656-12662 Species denotes people Tax:9606
380 12820-12826 Species denotes people Tax:9606
381 12952-12958 Species denotes people Tax:9606
387 13241-13246 Species denotes Human Tax:9606
388 13400-13408 Disease denotes coughing MESH:D003371
389 13781-13789 Disease denotes COVID-19 MESH:C000657245
390 13887-13895 Disease denotes coughing MESH:D003371
391 13945-13953 Disease denotes COVID-19 MESH:C000657245
395 14437-14443 Species denotes people Tax:9606
396 14558-14564 Species denotes people Tax:9606
397 14031-14039 Disease denotes coughing MESH:D003371
399 15229-15237 Disease denotes COVID-19 MESH:C000657245
403 15608-15616 Disease denotes coughing MESH:D003371
404 15938-15946 Disease denotes coughing MESH:D003371
405 16018-16026 Disease denotes coughing MESH:D003371
408 17220-17225 Species denotes human Tax:9606
409 16708-16716 Disease denotes coughing MESH:D003371
411 18601-18609 Disease denotes coughing MESH:D003371
415 19451-19456 Species denotes human Tax:9606
416 19158-19163 Disease denotes cough MESH:D003371
417 19691-19699 Disease denotes coughing MESH:D003371
429 19984-19990 Species denotes people Tax:9606
430 20004-20010 Species denotes people Tax:9606
431 20102-20108 Species denotes people Tax:9606
432 20770-20776 Species denotes people Tax:9606
433 20969-20975 Species denotes People Tax:9606
434 21024-21030 Species denotes People Tax:9606
435 21461-21467 Species denotes people Tax:9606
436 20506-20511 Disease denotes Cough MESH:D003371
437 21106-21114 Disease denotes coughing MESH:D003371
438 21239-21244 Disease denotes cough MESH:D003371
439 21399-21407 Disease denotes elevator MESH:D006973
441 23443-23449 Species denotes people Tax:9606

LitCovid-PD-HP

Id Subject Object Predicate Lexical cue hp_id
T4 827-835 Phenotype denotes coughing http://purl.obolibrary.org/obo/HP_0012735
T5 13400-13408 Phenotype denotes coughing http://purl.obolibrary.org/obo/HP_0012735
T6 13887-13895 Phenotype denotes coughing http://purl.obolibrary.org/obo/HP_0012735
T7 14031-14039 Phenotype denotes coughing http://purl.obolibrary.org/obo/HP_0012735
T8 15608-15616 Phenotype denotes coughing http://purl.obolibrary.org/obo/HP_0012735
T9 15938-15946 Phenotype denotes coughing http://purl.obolibrary.org/obo/HP_0012735
T10 16018-16026 Phenotype denotes coughing http://purl.obolibrary.org/obo/HP_0012735
T11 16708-16716 Phenotype denotes coughing http://purl.obolibrary.org/obo/HP_0012735
T12 18601-18609 Phenotype denotes coughing http://purl.obolibrary.org/obo/HP_0012735
T13 19158-19163 Phenotype denotes cough http://purl.obolibrary.org/obo/HP_0012735
T14 19691-19699 Phenotype denotes coughing http://purl.obolibrary.org/obo/HP_0012735
T15 20506-20511 Phenotype denotes Cough http://purl.obolibrary.org/obo/HP_0012735
T16 21106-21114 Phenotype denotes coughing http://purl.obolibrary.org/obo/HP_0012735
T17 21176-21181 Phenotype denotes falls http://purl.obolibrary.org/obo/HP_0002527
T18 21239-21244 Phenotype denotes cough http://purl.obolibrary.org/obo/HP_0012735

LitCovid-sentences

Id Subject Object Predicate Lexical cue
T270 0-2 Sentence denotes 5.
T271 3-26 Sentence denotes Results and Discussions
T272 28-32 Sentence denotes 5.1.
T273 33-56 Sentence denotes Smartphone Cleaning App
T274 57-150 Sentence denotes Cleaning activities play an important role in reducing the risk of being exposed to COVID-19.
T275 151-229 Sentence denotes Three types of user activities were defined for the purposes of this research:
T276 230-437 Sentence denotes Working (i.e., the user is busy working), not working (i.e., the user is either a visitor or having time off), and cleaning (i.e., the user is a staff member who is either cleaning or disinfecting the room).
T277 438-637 Sentence denotes As seen in Figure 8a, these three different activities were taken into consideration by the mobile application and the user-selected types of activities were internally stored in their mobile phones.
T278 638-766 Sentence denotes We assumed that after the cleaning activity was carried out, the risk of any COVID-19 viral load being present returned to zero.
T279 767-921 Sentence denotes Over time, interactions between users and the space such as coughing, talking, and touching surfaces would again increase each room’s risk (Equation (2)).
T280 922-1058 Sentence denotes If a cleaner specifies in the mobile app that cleaning is done, the room will be marked as “cleaned”, and the risk will go down to zero.
T281 1059-1277 Sentence denotes Cleaning staff, based on the COVID-19 disinfecting rules and regulations enforced by the facilities, are trained and clean the room using advanced cleaning equipment (e.g., electrostatic sprayers), which kills 99% of viruses.
T282 1278-1377 Sentence denotes This cleaning activity ensures the virus is killed, and there is no chance for cross-contamination.
T283 1378-1481 Sentence denotes It is reasonable to assume that the facilities will take precautions with cleaning as much as possible.
T284 1482-1649 Sentence denotes However, if this assumption is not valid, the risk will be increased over time, which complicates the calculations and increases virus spread and true-positive alarms.
T285 1650-1790 Sentence denotes Considering cleaning activities resets the risk calculations for the final risk map and reduces false-positive COVID-19 notification alerts.
T286 1791-1913 Sentence denotes In the future, we are going to evaluate standard-level cleaning activities for COVID-19 using smart cameras automatically.
T287 1914-2050 Sentence denotes Furthermore, cleaning should include enhanced space ventilation, as airborne particles are remarkably decreased by adequate ventilation.
T288 2051-2143 Sentence denotes For this research, a virus transmission interval is assumed to be a time interval of 15 min.
T289 2144-2330 Sentence denotes In other words, if user A was interacting with a room that had been used by a positive COVID-19 infected person, user B, the system would notify user A of probable exposure to the virus.
T290 2331-2488 Sentence denotes If we consider the situation in which cleaning activity took place after user B left the room, the risk of being exposed by the infected place would be zero.
T291 2489-2564 Sentence denotes This case can be considered a false positive notification alert for user A.
T292 2565-2690 Sentence denotes As a result, the proposed system can considerably reduce false positive notifications by using different types of activities.
T293 2691-2862 Sentence denotes A demo scenario of a cleaning person is presented in Supplementary Materials, and the trajectories of both building cleaners and visitors are shown in Supplementary Materials.
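The cleaning-reset behavior described in the sentences above (interactions raise a room's risk over time; a completed cleaning activity returns it to zero) can be sketched as follows. This is a minimal illustration, not the paper's Equation (2): the per-event increments in `EVENT_RISK` are hypothetical values chosen for demonstration.

```python
# Hypothetical per-interaction risk increments; the paper's Equation (2)
# is not reproduced in this excerpt.
EVENT_RISK = {"cough": 0.3, "talk": 0.1, "touch": 0.05}

class RoomRisk:
    """Tracks one room's risk: interactions increase it, cleaning resets it."""

    def __init__(self):
        self.risk = 0.0

    def record_event(self, event: str) -> float:
        # Interactions between users and the space raise the room's risk,
        # capped at 1.0.
        self.risk = min(1.0, self.risk + EVENT_RISK.get(event, 0.0))
        return self.risk

    def clean(self) -> float:
        # After the cleaning activity is carried out, the risk of any
        # viral load being present returns to zero.
        self.risk = 0.0
        return self.risk
```

A cough followed by talking would raise the room's risk to 0.4 under these assumed increments, and marking the room "cleaned" resets it to zero, suppressing the false-positive notification discussed above.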
T294 2864-2868 Sentence denotes 5.2.
T295 2869-2900 Sentence denotes Proximity-Based Contact Tracing
T296 2901-3003 Sentence denotes For the purposes of this research, the third floor of the CCIT building was selected for an experiment.
T297 3004-3185 Sentence denotes After extracting the related metadata such as room names for the rooms from the IndoorGML, 12 Estimote Proximity beacons were spatially distributed between 12 different cell spaces.
T298 3186-3291 Sentence denotes The contact tracing technique applied for this research was designed in a way that protects user privacy.
T299 3292-3500 Sentence denotes The application detects the proximal appearance of users within the proximity zone of each beacon by considering the value of the Received Signal Strength Indicator (RSSI) that was broadcasted by the beacons.
T300 3501-3736 Sentence denotes The duration of appearance of the user in the proximity zone defined for each beacon and the corresponding date and time information for this proximal appearance are the only information stored in the internal storage of mobile phones.
T301 3737-3961 Sentence denotes Figure 8b shows a screenshot of the developed mobile application for collecting different types of observations including BeaconID, time, date, and the duration that the target user spent in the proximal zone of each beacon.
T302 3962-4123 Sentence denotes Assuming that the incubation period of COVID-19 is two weeks, the application will work as a background service that saves data internally for a two-week period.
T303 4124-4303 Sentence denotes In situations in which the user becomes a positive COVID-19 case, he/she can voluntarily share data captured within the past two weeks with the backend database management system.
T304 4304-4401 Sentence denotes An AWS product Amazon Cognito was used to control user authentication and access to data storage.
T305 4402-4524 Sentence denotes As shown in Figure 8c, users are required to sign in/up for an Amazon Cognito account in order to share their information.
T306 4525-4654 Sentence denotes After signing in as an authorized client, users can publish their internal information to the Amazon cloud as shown in Figure 8d.
T307 4655-4773 Sentence denotes All of the data related to the COVID-19 cases will be stored and managed in the DynamoDB database in the Amazon cloud.
T308 4774-4870 Sentence denotes Our developed application was connected to the DynamoDB using another AWS product, the IoT Core.
T309 4871-5045 Sentence denotes When new data is added to cloud storage, the contact tracing application will look for any matches between the backend data and the data stored internally in the user device.
T310 5046-5321 Sentence denotes If it finds any matches that show that a confirmed COVID-19 positive case and the target user were close to each other for more than 15 min, the application will then notify the target user about potential exposure to COVID-19 and alert cleaning staff to disinfect the place.
T311 5322-5356 Sentence denotes This process is shown in Figure 9.
T312 5357-5423 Sentence denotes A demo of people trajectories is shown in Supplementary Materials.
T313 5424-5519 Sentence denotes There are various methods for indoor positioning, such as WiFi, BLE beacons, or dead reckoning.
T314 5520-5663 Sentence denotes Using BLE technology is cost-effective compared to other indoor positioning techniques, which incur maintenance, installation, and cabling costs.
T315 5664-5764 Sentence denotes Generally, Bluetooth devices cost roughly 20× less than WiFi devices and offer similar accuracy [60].
T316 5765-5869 Sentence denotes In this paper, we focused on BLE proximity detection for contact tracing instead of precise positioning.
T317 5870-6068 Sentence denotes Three categories of user location are of importance for this paper: immediate (less than 60 cm), near (1–6 m), and far (>10 m) distances of the Bluetooth receiver from an active BLE beacon.
T318 6069-6177 Sentence denotes On the other hand, it was still a challenge working with BLE signals that are interfered with by structures.
T319 6178-6268 Sentence denotes Indoor setting and layout have direct effects on radio waves used in Bluetooth technology.
T320 6269-6447 Sentence denotes Another challenge was that the different beacon types and battery states produce different signal strengths, so using one beacon library for all types of beacons was problematic.
T321 6448-6530 Sentence denotes In this paper, an active BLE beacon is placed in each IndoorGML cell (e.g., room).
T322 6531-6778 Sentence denotes Moreover, we focus on proximity detection (i.e., immediate (within 0.6 m), near (within about 1–8 m), and far (beyond 10 m) distances from the active BLE beacon) to make indoor spatiotemporal trajectories using IndoorGML cell connectivity.
T323 6779-6881 Sentence denotes We avoided having to determine the exact range by way of careful beacon placement to prevent overlaps.
T324 6882-7063 Sentence denotes In the context of COVID-19 spread, locating in the immediate and near distance from the infected host would be dangerous for coronavirus transmission (through droplet transmission).
T325 7064-7164 Sentence denotes Accordingly, different health organizations such as WHO recommended two meters distance from others.
T326 7165-7251 Sentence denotes As a result, proximity detection should be of more importance in the COVID-19 context.
T327 7252-7370 Sentence denotes In other words, considering precise positioning would only increase the computation cost in this specific application.
T328 7371-7452 Sentence denotes Describing an indoor location using IndoorGML graph cell also helps with privacy.
T329 7453-7632 Sentence denotes Considering privacy concerns for individual tracking, especially in indoor environments, we believe that proximity positioning respects user privacy more than precise positioning.
T330 7633-7758 Sentence denotes Depending on the size of the data, type of beacons, and network bandwidth, mobile proximity detection performance may differ.
T331 7759-8015 Sentence denotes In our experiment, various beacons such as Estimote (https://estimote.com/), Accent Systems (https://accent-systems.com/) and Radius Networks (https://www.radiusnetworks.com/) have been evaluated using the developed app on the Samsung Galaxy S9 smartphone.
T332 8016-8155 Sentence denotes Our results demonstrated that the app could capture a beacon’s proximity in fewer than 60 milliseconds, which is enough for our case study.
T333 8156-8308 Sentence denotes The complexity of the position determination depends on the beacon software development kit; however, the complexity is O(n) in the worst-case scenario.
T334 8309-8451 Sentence denotes Concerning the duration spent in a room, we detected and recorded durations of less than five seconds when walking past beacons in a corridor.
T335 8452-8592 Sentence denotes Durations of less than 15 min were not considered significant for COVID-19 risk, in line with standard practice.
T336 8593-8698 Sentence denotes So, our sampling and recording intervals were much better than was required for COVID-19 risk evaluation.
T337 8699-8805 Sentence denotes The mobile application publishes a JSON payload to the AWS IoT Core cloud data management system in which:
T338 8806-8821 Sentence denotes Online service:
T339 8822-8943 Sentence denotes A single record showing the presence of a user in the proximity of an active BLE beacon is published to the AWS IoT core.
T340 8944-8960 Sentence denotes Offline service:
T341 8961-9057 Sentence denotes An array of records showing the user’s presences in a time window is published to the AWS cloud.
T342 9058-9195 Sentence denotes A JSON payload showing a single enriched proximity location captured by the developed smartphone app is shown in Supplementary Materials.
T343 9196-9268 Sentence denotes More information regarding the contact tracing app can be found in [61].
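The match step described in the sentences above (locally stored beacon dwell records compared against backend records of confirmed positive cases, with an exposure flagged when both parties shared a beacon's proximity zone for 15 min or more) can be sketched as below. The record layout `(beacon_id, start_time, duration_minutes)` is an assumption for illustration; the app stores BeaconID, date, time, and dwell duration, but its exact schema is not reproduced here.

```python
from datetime import datetime, timedelta

def overlap_minutes(a_start, a_dur_min, b_start, b_dur_min):
    """Minutes during which two dwell intervals overlap (0 if disjoint)."""
    a_end = a_start + timedelta(minutes=a_dur_min)
    b_end = b_start + timedelta(minutes=b_dur_min)
    delta = min(a_end, b_end) - max(a_start, b_start)
    return max(0.0, delta.total_seconds() / 60.0)

def find_exposures(local_records, positive_records, threshold_min=15):
    """Flag beacons where the target user and a confirmed positive case
    shared a proximity zone for at least threshold_min minutes.

    Each record is an assumed (beacon_id, start_datetime, duration_min) tuple.
    """
    exposures = []
    for beacon, start, dur in local_records:
        for p_beacon, p_start, p_dur in positive_records:
            if (beacon == p_beacon and
                    overlap_minutes(start, dur, p_start, p_dur) >= threshold_min):
                exposures.append(beacon)
    return exposures
```

For example, a 40 min dwell at beacon B3 starting at 10:00 overlaps a positive case's 10:20 arrival by 20 min, so B3 would be flagged and both the user notification and the cleaning alert would fire.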
T344 9270-9274 Sentence denotes 5.3.
T345 9275-9301 Sentence denotes Video-Based People Density
T346 9302-9482 Sentence denotes This section discusses the experimental design for our camera surveillance for counting people, People Density, or the number of people who entered or left a geofence polygon area.
T347 9483-9593 Sentence denotes For indoor spaces, Physical Distancing rules result in restrictions on the number of people occupying a space.
T348 9594-9796 Sentence denotes The input for the DL models was online video feeds of fixed cameras focused on the regions of interest defined as IndoorGML cells (e.g., rooms, corridors, lobbies, elevators, stairs, and coffee places).
T349 9797-9982 Sentence denotes Some cameras might even be able to cover multiple regions of interest (IndoorGML cells), depending on where they are installed and if the spaces are separated by glass walls or windows.
T350 9983-10153 Sentence denotes An alarm can be triggered by the number of people entering or exiting a region (identified in the camera image) if the density of people exceeds the allowed density for that area.
T351 10154-10264 Sentence denotes Moreover, the number of people violating physical distancing rules can be identified and reported to the IoCT.
T352 10265-10412 Sentence denotes For our cleaning use case demo (Supplementary Materials), we considered a meeting room as an IndoorGML node (Room 326) with a four-person capacity.
T353 10413-10494 Sentence denotes For this demo, the OGC indoorGML was used as it offered the following advantages:
T354 10495-10721 Sentence denotes IndoorGML cells were defined as the geofence; the geometry and area of each cell (geofence) were calculated and the location of each indoorGML cell (the centroid of the geofence) was used for the enrichment of the camera data.
T355 10722-10787 Sentence denotes The number of people entering or exiting each cell was monitored.
T356 10788-10968 Sentence denotes People in each frame were detected in real-time using a pre-trained You Only Look Once (YOLO) model [62] and the results were then published as an MQTT message to the AWS IoT Core.
T357 10969-11165 Sentence denotes On the backend, the maximum allowed people in a cell, or cell capacity, was either assigned by the building management, or calculated by dividing the cell area into squares six feet two inches on a side.
T358 11166-11260 Sentence denotes The “Gathering Restriction”—the number of people over each IndoorGML node—was then calculated.
T359 11261-11366 Sentence denotes This value changes over a range of 0–1 based on the number of people divided by the capacity of the room.
T360 11367-11483 Sentence denotes Should the number of people exceed the cell capacity, a Gathering Restriction alarm would be generated for the cell.
T361 11484-11601 Sentence denotes The following figure (Figure 10) shows a frame of the meeting room, detected people, and Gathering Restriction alarm.
T362 11602-11739 Sentence denotes The video demo of this scene is attached in Supplementary Materials which shows the people count online when they enter or exit the room.
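The Gathering Restriction computation described above (a 0–1 value from the detected people count divided by the cell capacity, with an alarm when the count exceeds capacity) can be sketched as follows. Clipping the ratio at 1.0 is an assumption made so the value stays in the stated 0–1 range.

```python
def gathering_restriction(people_count: int, capacity: int) -> float:
    """Ratio of detected people to IndoorGML cell capacity, clipped to the
    0-1 range used for the map layer (clipping is an assumption)."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return min(1.0, people_count / capacity)

def gathering_alarm(people_count: int, capacity: int) -> bool:
    # Should the number of people exceed the cell capacity, a Gathering
    # Restriction alarm is generated for the cell.
    return people_count > capacity
```

With the meeting room's four-person capacity, two detected people yield a value of 0.5 and no alarm; a fifth person triggers the alarm.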
T363 11741-11745 Sentence denotes 5.4.
T364 11746-11777 Sentence denotes Video-Based Physical Distancing
T365 11778-11891 Sentence denotes Physical Distancing was monitored for each cell using a pre-trained YOLO model for detecting people in that cell.
T366 11892-11941 Sentence denotes Relative distance was then calculated as follows:
T367 11942-12053 Sentence denotes The pairwise distance between two people is the distance between the two similar corners of their bounding box.
T368 12054-12175 Sentence denotes In order to minimize the camera’s vanishing point effect, the distance was then compared to their bounding box diameters.
T369 12176-12334 Sentence denotes If the distance was less than the longest diameter, it was assumed that the relative distance between those people was violating the Physical Distancing rule.
T370 12335-12437 Sentence denotes For the following example, the view from a fixed camera was divided into several polygons (geofences).
T371 12438-12584 Sentence denotes This can result in the creation of separate geofences (indicated by the IndoorGML nodes if they were in the building) from the camera’s viewpoint.
T372 12585-12717 Sentence denotes The number of people per geofence polygon and the number of times that people were closer than two metres were reported to the IoCT.
T373 12718-12867 Sentence denotes The following figure (Figure 11) shows a frame of multiple geofences in an outdoor area, the detected people, and the Physical Distancing violations.
T374 12868-13061 Sentence denotes The video demo of this scene is attached in Supplementary Materials which shows the people count online when they entered or exited the geofences, as well as the physical distancing violations.
T375 13062-13126 Sentence denotes Outdoor geofences can be connected to the IndoorGML graph nodes.
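The relative-distance rule described in this section (pairwise distance between matching bounding-box corners, compared against the longest box diagonal to dampen the camera's vanishing-point effect) can be sketched as below. The `(x_min, y_min, x_max, y_max)` pixel box layout and the choice of the top-left corner are assumptions for illustration.

```python
import math

def bbox_diagonal(box):
    """Diagonal length of a person's bounding box (x_min, y_min, x_max, y_max)."""
    return math.hypot(box[2] - box[0], box[3] - box[1])

def violates_distancing(box_a, box_b):
    """Physical Distancing check: the distance between the two boxes'
    matching corners (top-left here, an assumed choice) is compared to the
    longest of the two box diagonals as a rough depth-invariant proxy.
    A corner distance shorter than that diagonal counts as a violation."""
    corner_dist = math.hypot(box_a[0] - box_b[0], box_a[1] - box_b[1])
    return corner_dist < max(bbox_diagonal(box_a), bbox_diagonal(box_b))
```

Because both people's box sizes shrink together with distance from the camera, normalizing by the box diagonal gives a comparable threshold across the frame without needing a full camera calibration.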
T376 13128-13132 Sentence denotes 5.5.
T377 13133-13169 Sentence denotes Video-Based Risky Behavior Detection
T378 13170-13240 Sentence denotes Camera stream processing is a popular and quick way to detect objects.
T379 13241-13354 Sentence denotes Human behaviors and actions can be detected as objects from the video frames using a trained deep learning model.
T380 13355-13605 Sentence denotes For the detection of risky behaviors such as coughing, hugging, handshaking, and doorknob touching, the You Only Look Once version 3 (YOLOv3) model, which is suitable for real-time behavior detection for online video streams, was trained and applied [63,64].
T381 13606-13728 Sentence denotes This library classifies and localizes detected objects in one step at a speed faster than 40 frames per second (FPS).
T382 13729-13810 Sentence denotes We considered two main types of risky behaviors for COVID-19 indoor transmission:
T383 13811-13897 Sentence denotes Group risky behaviors (e.g., hugging) and individual risky behaviors (e.g., coughing).
T384 13898-14005 Sentence denotes Figure 12 illustrates how to train a model for COVID-19 transmission risky behavior detection using YOLOv3.
T385 14006-14216 Sentence denotes In total, 603 images for coughing, 634 images for hugging, 608 images for handshaking, and 623 images for door touching were used from COCO dataset [62] for transfer learning for the pre-trained model (YOLOv3).
T386 14217-14295 Sentence denotes These images were taken from free sources found through Google image searches.
T387 14296-14355 Sentence denotes For labelling objects, a semi-automatic method was applied.
T388 14356-14399 Sentence denotes Darknet library was also used for training.
T389 14400-14592 Sentence denotes For individual behaviors, all of the people in images were detected and labelled in a text file whilst the algorithm aggregated intersected bounding boxes of people into a single bounding box.
T390 14593-14700 Sentence denotes As wrong labels might be generated, the images should be manually checked to correct misclassified objects.
T391 14701-14794 Sentence denotes For this step 80 percent of the images were selected for training and 20 percent for testing.
T392 14795-14873 Sentence denotes To increase the accuracy of this model, the configuration in Table 3 was used.
T393 14874-14955 Sentence denotes To increase training accuracy and speed, a transfer learning process was applied.
T394 14956-15073 Sentence denotes The base layer is a pre-trained YOLOv3 that uses the COCO dataset for all of the layers of our model except the last.
T395 15074-15259 Sentence denotes Transfer learning helps with training by exploiting the knowledge of a pre-trained supervised model to address the problems of small training datasets for COVID-19 risky behaviors [65].
T396 15260-15465 Sentence denotes To evaluate the accuracy of the model, we tried to check the results for different video datasets by exporting all of the frames for detection under various circumstances for the metrics listed in Table 4.
T397 15466-15671 Sentence denotes After studying the outcomes, we found that the “hugging” and “handshaking” classes experienced the highest false-negative results compared to coughing, even as a larger dataset was being prepared for training.
T398 15672-15798 Sentence denotes It appeared that hugging and handshaking (grouping actions) were more varied in terms of the types of handshaking and hugging.
T399 15799-15888 Sentence denotes Therefore, training precision could be improved with the preparation of more varied data.
T400 15889-16075 Sentence denotes Moreover, some of the false positive results for coughing showed that in most cases, moving a hand near the face was detected as coughing, regardless of whether coughing had actually taken place.
T401 16076-16154 Sentence denotes Furthermore, the number of false negatives increased in a more populated area.
T402 16155-16240 Sentence denotes Detected touching behavior results demonstrated high numbers of false negative cases.
T403 16241-16345 Sentence denotes About 75 percent of false negative cases occurred when the predictor incorrectly detected small objects.
T404 16346-16463 Sentence denotes Therefore, specifying limitations for box sizes and level of confidence for the predictor can reduce false negatives.
T405 16464-16592 Sentence denotes The results of evaluating precision, recall, F-score, and the number of samples for each behavior action class are listed in Table 5.
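The per-class precision, recall, and F-score figures evaluated above are standard detection metrics computed from true-positive, false-positive, and false-negative counts; a minimal sketch (the counts in the usage example are invented, not values from Table 5):

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Per-class detection metrics of the kind reported in Table 5.
    precision = TP / (TP + FP); recall = TP / (TP + FN);
    F1 is their harmonic mean."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

This makes the trade-off discussed above concrete: tightening the predictor's box-size and confidence limits removes spurious small-object detections (lowering FP, raising precision) but can also drop genuine touches (raising FN, lowering recall).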
T406 16594-16598 Sentence denotes 5.6.
T407 16599-16635 Sentence denotes Audio-Based Risky Behavior Detection
T408 16636-16779 Sentence denotes This section examines an audio classification algorithm that recognizes coughing and sneezing using an audio sensor with an embedded DL engine.
T409 16780-16838 Sentence denotes The methodology for audio detection is shown in Figure 13.
T410 16839-17004 Sentence denotes This figure shows the four main steps of the audio DL process. The recording first needs to be preprocessed for noise before being used for extracting sound features.
T411 17005-17145 Sentence denotes The most commonly known time-frequency feature is the short-time Fourier transform [67], Mel spectrogram [68], and wavelet spectrogram [69].
T412 17146-17338 Sentence denotes The Mel spectrogram was based on a nonlinear frequency scale motivated by human auditory perception and provides a more compact spectral representation of sounds when compared to the STFT [3].
T413 17339-17426 Sentence denotes To compute a Mel spectrogram, we first convert the sample audio files into time series.
T414 17427-17520 Sentence denotes Next, its magnitude spectrogram is computed, and then mapped onto the Mel scale with power 2.
T415 17521-17568 Sentence denotes The end result would be a Mel spectrogram [70].
T416 17569-17663 Sentence denotes The last step in preprocessing would be to convert Mel spectrograms into log Mel spectrograms.
T417 17664-17758 Sentence denotes Then the image results would be introduced as an input to the deep learning modelling process.
T418 17759-17973 Sentence denotes Convolutional neural network (CNN) architectures use multiple blocks of successive convolution and pooling operations for feature learning and down sampling along the time and feature dimensions, respectively [71].
T419 17974-18068 Sentence denotes The VGG16 is a pre-trained CNN [72] used as a base model for transfer learning (Table 6) [73].
T420 18069-18253 Sentence denotes VGG16 is a famous CNN architecture that uses multiple stacks of small kernel filters (3 by 3) instead of the shallow architecture of two or three layers with large kernel filters [74].
T421 18254-18418 Sentence denotes Using multiple stacks of small kernel filters increases the network’s depth, which results in improving complex feature learning while decreasing computation costs.
T422 18419-18497 Sentence denotes VGG16 architecture includes 16 convolutional and three fully connected layers.
T423 18498-18752 Sentence denotes Audio-based risky behavior detection is based on complex features and distinguishable behaviors (e.g., coughing, sneezing, background noise), which requires a deeper CNN model than shallow architecture (i.e., two or three-layer architecture) offers [75].
T424 18753-18855 Sentence denotes VGG16 has been adopted for audio event detection and has demonstrated strong results in the literature [71].
T425 18856-18959 Sentence denotes The feature maps were flattened to obtain the fully connected layer after the last convolutional layer.
T426 18960-19093 Sentence denotes For most CNN-based architectures, only the last convolutional layer activations are connected to the final classification layer [76].
T427 19094-19194 Sentence denotes The ESC-50 [77] and AudioSet [78] datasets were used to extract cough and sneezing training samples.
T428 19195-19350 Sentence denotes The ESC-50 dataset is a labelled collection of 2000 environmental audio recordings suitable for benchmarking methods of environmental sound classification.
T429 19351-19510 Sentence denotes AudioSet consists of an expanding ontology of 632 audio event classes and a collection of 2,084,320 human-labelled, 10 s sound clips taken from YouTube videos.
T430 19511-19630 Sentence denotes Over 5000 samples were extracted for the transfer learning CNN model, which were then divided into train and test datasets.
T431 19631-19713 Sentence denotes We examined the performance of the trained CNN models using coughing and sneezing.
T432 19714-19747 Sentence denotes The results are shown in Table 7.
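The preprocessing pipeline described above (audio time series → magnitude spectrogram → power 2 → Mel-scale mapping → log-Mel image fed to the CNN) can be sketched in plain NumPy. The FFT size, hop length, and number of Mel bands here are assumed defaults, not the paper's settings; a library such as librosa would normally handle this.

```python
import numpy as np

def hz_to_mel(f):
    # Standard Mel-scale mapping motivated by human auditory perception.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters spaced evenly on the Mel scale, shape (n_mels, n_fft//2 + 1)."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):          # rising edge of the triangle
            if center > left:
                fb[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):         # falling edge
            if right > center:
                fb[i - 1, k] = (right - k) / (right - center)
    return fb

def log_mel_spectrogram(y, sr, n_fft=512, hop=256, n_mels=40):
    """Time series -> magnitude spectrogram -> power (magnitude**2)
    -> Mel scale -> log-Mel image for the CNN input."""
    window = np.hanning(n_fft)
    frames = [np.abs(np.fft.rfft(y[s:s + n_fft] * window)) ** 2
              for s in range(0, len(y) - n_fft + 1, hop)]
    power = np.array(frames).T                 # (freq_bins, time_frames)
    mel = mel_filterbank(n_mels, n_fft, sr) @ power
    return np.log(mel + 1e-10)                 # log-Mel spectrogram
```

The resulting 2-D log-Mel array is what gets treated as an image by the VGG16-based transfer-learning model for the cough/sneeze classes.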
T433 19749-19753 Sentence denotes 5.7.
T434 19754-19788 Sentence denotes Risk Calculation and Visualization
T435 19789-19920 Sentence denotes To demonstrate risk calculation using Equation (2), we evaluated the proposed IoCT using the following cleaning use case scenarios.
T436 19921-20087 Sentence denotes In meeting room number 326 of the CCIT building, the number of people increased as people entered the room, and this event was detected by a smart camera in the room.
T437 20088-20227 Sentence denotes The number of people was shown online in the video frame and map visualization browser in green until the room capacity (five) was reached.
T438 20228-20360 Sentence denotes When the fourth person came in (room capacity is assumed to be three), the alarm notification for “Room exceeded capacity” is shown.
After that, a person coughed in the meeting room, and this event was detected by both the smart camera and audio sensors. A notification showed “Cough detected”. Then, the person who coughed opened the door and this event was detected by the smart camera. A “High-risk behavior detected” notification was shown.
The risk profile at that moment exceeded the threshold of 0.7, and a notification was sent to the people in the room and to a cleaner. The color of the room polygon turned red, indicating high risk, and the polygon was extruded (i.e., its height increased) in proportion to the risk value.
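The visualization behavior described here amounts to a simple mapping from a risk value to a polygon style. The 0.7 threshold comes from the scenario above; the maximum extrusion height of 50 m is an illustrative assumption, not a value from the paper:

```python
RISK_THRESHOLD = 0.7   # notification/high-risk threshold from the scenario
MAX_HEIGHT_M = 50.0    # illustrative maximum extrusion height

def polygon_style(risk):
    """Map a risk value in [0, 1] to a fill color and extrusion height."""
    color = "red" if risk > RISK_THRESHOLD else "green"
    # Extrusion height grows proportionally with the (clamped) risk value
    height = MAX_HEIGHT_M * max(0.0, min(risk, 1.0))
    return {"color": color, "height": height}
```

A map client (e.g., a 3D polygon layer) would re-style the room polygon with this output each time the risk value updates.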
People then started to leave the room, causing the risk contribution from People Density to decrease, although the risk remained higher than at the very beginning because a coughing event had occurred. The total risk value of the meeting room therefore fell but stayed above its level before the risky behavior (i.e., the cough) took place.
The cleaner closest to the room changed his activity status to cleaning (shown by an icon on the map) and moved toward the room (from the elevator to the room). The cleaner's trajectory, along with the trajectories of the other people extracted from the BLE beacons, was also visualized. After the cleaning activity, the room's total risk level went back down to zero and the color of the room polygon changed back to green.
A video demo of this scene, which shows the risk profile of the room, is provided in the Supplementary Materials. A sample screenshot of the Supplementary Materials demo video is presented in Figure 14.
To evaluate the impact of various weights assigned to different map layers, we used two sets of weights for map layer aggregation on the client side: Profile 1: W1=W2=W3=W4=1; and Profile 2: W1=0.1, W2=0.4, W3=0.3, and W4=0.2, as mentioned in Section 4.1.
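Assuming the aggregation in Equation (2) is a weighted sum of the per-layer risk values normalized by the weight sum (a common form for such aggregations; the per-layer risk values below are made up purely for illustration), the two weight profiles can be compared as follows:

```python
def aggregate_risk(layer_risks, weights):
    """Weighted aggregation of per-layer risks, normalized by the weight sum."""
    total_w = sum(weights)
    return sum(w * r for w, r in zip(weights, layer_risks)) / total_w

# Illustrative per-layer risk values for the four map layers
layers = [0.2, 0.9, 0.5, 0.1]

profile1 = aggregate_risk(layers, [1, 1, 1, 1])            # equal weights
profile2 = aggregate_risk(layers, [0.1, 0.4, 0.3, 0.2])    # Section 4.1 weights
```

Profile 1 reduces to the plain mean of the layer risks, while Profile 2 emphasizes whichever layers carry the larger weights, which is why the two risk profiles in Figure 15 diverge.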
Figure 15 shows the two risk profiles for room 326 over 40 min, from 20:00 to 20:40 on 11 June 2020.
The precision, recall, and F-score of video-based and audio-based risky behavior detection are listed in Table 5 and Table 7, respectively. Table 8 lists the time performance of the developed functionalities (e.g., video-based person density, video-based physical distancing, video-based risky behavior detection, and audio-based risky behavior detection) on various platforms, namely a Jetson NX, a laptop, and an Android smartphone.
The performance of a deep learning engine depends heavily on the graphics and computing processors. The functionalities were therefore also evaluated on a laptop with more powerful processing units: an NVIDIA GeForce RTX 2070 GPU (compute capability 7.5) and an Intel Core i7 CPU. Consequently, the performance on the Jetson NX is lower than on the laptop.
Video-based risky behavior detection achieves the best time performance because it involves only the object detection task.
Audio-based risky behavior detection segments the audio into fixed time frames and converts each segment into a spectrogram image. Sound patterns are then detected in these images using the VGG model. Therefore, the processing time for audio is higher than for video object detection.
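The segmentation-and-spectrogram step can be sketched with a plain STFT magnitude spectrogram in NumPy. This is a sketch only: the frame length, hop size, sample rate, and the synthetic input tone are illustrative assumptions, and the actual system feeds the resulting image into the VGG model rather than stopping here.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram: Hann-windowed frames -> one-sided FFT magnitudes."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Result shape: (n_frames, frame_len // 2 + 1), one row per time frame
    return np.abs(np.fft.rfft(frames, axis=1))

sr = 8000                                   # illustrative sample rate
t = np.arange(sr) / sr                      # 1 s of audio
audio = np.sin(2 * np.pi * 440 * t)         # synthetic 440 Hz tone
spec = spectrogram(audio)
```

Each row of `spec` is one time frame; rendering the array as an image yields the spectrogram the classifier consumes, which is why the audio path carries this extra cost compared with direct video object detection.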
Video-based people density and video-based physical distancing perform worse than simple object detection because of the added complexity of their tracking functions.