Id |
Subject |
Object |
Predicate |
Lexical cue |
T456 |
0-147 |
Sentence |
denotes |
The precision, recall, and F-score of video-based and audio-based risky behavior detection are listed in Table 5 and Table 7, respectively. |
T457 |
148-440 |
Sentence |
denotes |
Table 8 includes the time performance of the different developed functionalities (e.g., video-based person density, video-based physical distancing, video-based risky behavior detection, and audio-based risky behavior detection) on various platforms such as Jetson NX, a laptop, and an Android smartphone. |
T458 |
441-546 |
Sentence |
denotes |
The performance of a deep learning engine is highly dependent on the graphics and computing processors. |
T459 |
547-658 |
Sentence |
denotes |
Therefore, the performance of those functionalities is evaluated on a laptop with more robust processing units. |
T460 |
659-746 |
Sentence |
denotes |
The laptop has an NVIDIA GeForce RTX 2070 GPU with compute capability 7.5 and a Core i7 processor. |
T461 |
747-815 |
Sentence |
denotes |
Therefore, the performance on Jetson NX is lower than on the laptop. |
T462 |
816-937 |
Sentence |
denotes |
Video-based risky behavior detection achieves the best performance values because it involves only the object detection task. |
T463 |
938-1060 |
Sentence |
denotes |
Audio-based risky behavior detection segments the voice in specific time frames and converts them into spectrogram images. |
T464 |
1061-1119 |
Sentence |
denotes |
Voice patterns are detected in images using the VGG model. |
T465 |
1120-1202 |
Sentence |
denotes |
Therefore, the processing time for audio is higher than for video object detection. |
T466 |
1203-1370 |
Sentence |
denotes |
Video-based people density and video-based physical distancing give worse performance values than simple object detection because of the complexity of their tracking functions. |