Title: | A reliable anchor regenerative-based transformer model for x-small and dense objects recognition |
Address: | "Vignan's Foundation for Science, Technology, and Research, Guntur, Andhra Pradesh, India. Electronic address: Vasanthi457@gmail.com. Vignan's Foundation for Science, Technology, and Research, Guntur, Andhra Pradesh, India. Electronic address: laavanvijay@gmail.com" |
DOI: | 10.1016/j.neunet.2023.06.020 |
ISSN/ISBN: | 1879-2782 (Electronic) 0893-6080 (Linking) |
Abstract: | "The past decade has witnessed significant progress in object detection using the extensive features of deep learning models. However, most existing models cannot detect x-small and dense objects because of ineffective feature extraction and substantial misalignment between anchor boxes and axis-aligned convolution features, which leads to a discrepancy between the classification score and localization accuracy. This paper introduces an anchor regenerative-based transformer module in a feature refinement network to solve this problem. The anchor-regenerative module generates anchor scales based on the semantic statistics of the objects present in the image, which avoids the inconsistency between the anchor boxes and axis-aligned convolution features. The Multi-Head-Self-Attention (MHSA) based transformer module, in turn, extracts in-depth information from the feature maps using the query, key, and value parameters. The proposed model is experimentally verified on the VisDrone, VOC, and SKU-110K datasets. It generates different anchor scales for each of the three datasets and achieves higher mAP, precision, and recall values on all three. These results show that the proposed model outperforms existing models in detecting x-small objects as well as dense objects. Finally, we evaluated the model's performance on the three datasets using accuracy, kappa coefficient, and ROC metrics. These metrics demonstrate that our model is a good fit for the VOC and SKU-110K datasets." |
Keywords: | *Volatile Organic Compounds; Benchmarking; Mental Recall; Semantics; Visual Perception; Auto-anchor; Multi-head-self-attention; Object detection; Spatial pyramid pooling-faster; YOLOv5 |
Notes: | "Medline. Vasanthi, Ponduri; Mohan, Laavanya. eng. 2023/07/08. Neural Netw. 2023 Aug; 165:809-829. doi: 10.1016/j.neunet.2023.06.020. Epub 2023 Jun 21" |