多尺度注意力与领域自适应的小样本图像识别

陈龙; 张建林; 彭昊; 李美惠; 徐智勇; 魏宇星

doi:doi:10.12086/oee.2023.220232

光电工程, 2023, 50 (4): 220232, 网络出版: 2023-06-15

多尺度注意力与领域自适应的小样本图像识别

Few-shot image classification via multi-scale attention and domain adaptation

陈龙 ^1,2张建林 ^1,*彭昊 ^1,2李美惠 ¹徐智勇 ¹魏宇星 ¹

作者单位

¹ 中国科学院光电技术研究所，四川成都 610209

² 中国科学院大学电子电气与通信工程学院，北京 100049

小样本图像识别注意力机制领域自适应相似性度量 few-shot image classification attention mechanism domain adaptation similarity metric

摘要

To improve the performance of few-shot classification, we present a general and flexible method named Multi-Scale Attention and Domain Adaptation Network (MADA). Firstly, to tackle the problem of limited samples, a masked autoencoder is used to image augmentation. Moreover, it can be inserted as a plug-and-play module into a few-shot classification. Secondly, the multi-scale attention module can adapt feature vectors extracted by embedding function to the current classification task. Multi-scale attention machine strengthens the discriminative image region by focusing on relating samples in both base class and novel class, which makes prototypes more accurate. In addition, the embedding function pays attention to the task-specific feature. Thirdly, the domain adaptation module is used to address the domain shift caused by the difference in data distributions of the two domains. The domain adaptation module consists of the metric module and the margin loss function. The margin loss pushes different prototypes away from each other in the feature space. Sufficient margin space in feature space improves the generalization performance of the method. The experimental results show the classification accuracy of the proposed method is 67.45% for 5-way 1-shot and 82.77% for 5-way 5-shot on the miniImageNet dataset. The classification accuracy is 70.57% for 5-way 1-shot and 85.10% for 5-way 5-shot on the tieredImageNet dataset. The classification accuracy of our method is better than most previous methods. After dimension reduction and visualization of features by using t-SNE, it can be concluded that domain drift is alleviated, and prototypes are more accurate. The multi-scale attention module enhanced feature representations are more discriminative for the target classification task. In addition, the domain adaptation module improves the generalization ability of the model.

Abstract

Learning with limited data is a challenging field for computer visual recognition. Prototypes calculated by the metric learning method are inaccurate when samples are limited. In addition, the generalization ability of the model is poor. To improve the performance of few-shot image classification, the following measures are adopted. Firstly, to tackle the problem of limited samples, the masked autoencoder is used to enhance data. Secondly, prototypes are calculated by task-specific features, which are obtained by the multi-scale attention mechanism. The attention mechanism makes prototypes more accurate. Thirdly, the domain adaptation module is added with a margin loss function. The margin loss pushes different prototypes away from each other in the feature space. Sufficient margin space improves the generalization performance of the method. The experimental results show the proposed method achieves better performance on few-shot classification.

PDF全文

陈龙, 张建林, 彭昊, 李美惠, 徐智勇, 魏宇星. 多尺度注意力与领域自适应的小样本图像识别[J]. 光电工程, 2023, 50(4): 220232. Long Chen, Jianlin Zhang, Hao Peng, Meihui Li, Zhiyong Xu, Yuxing Wei. Few-shot image classification via multi-scale attention and domain adaptation[J]. Opto-Electronic Engineering, 2023, 50(4): 220232.

多尺度注意力与领域自适应的小样本图像识别

关于本站 Cookie 的使用提示

全站搜索

多尺度注意力与领域自适应的小样本图像识别

相关论文

相关资讯

关于本站 Cookie 的使用提示

全站搜索