Data Enhancement Optimization and Class Activation Mapping Quantitative Evaluation for CNN Image Recognition of Multiple Tea Categories

Journal of Tea Science ›› 2023, Vol. 43 ›› Issue (3): 411-423. doi: 10.13305/j.cnki.jts.2023.03.006

ZHANG Zhanyi¹, ZHANG Baoquan¹, WANG Zhouli¹, YANG Yao¹, FAN Dongmei¹, HE Weizhong², MA Junhui³,*, LIN Jie¹,*

1. College of Tea Science and Tea Culture, Zhejiang A&F University, Lin'an 311300, China;
2. Lishui Academy of Agricultural and Forestry Sciences, Lishui 323000, China;
3. Lishui Economic Crop Terminal, Lishui 323000, China

Received: 2023-02-14 Revised: 2023-04-19 Online: 2023-06-15 Published: 2023-06-29
Corresponding authors: *278805795@qq.com; linjie@zafu.edu.cn
About the author: ZHANG Zhanyi, female, undergraduate student majoring in tea science, 2248559187@qq.com.
Funding:
National College Students' Innovation and Entrepreneurship Training Program (202110341061), 2022 Lishui Tea Industry Expert Team Project (202203), and Zhejiang Provincial Major Agricultural Technology Collaborative Extension Program (2022XTTGCY04)

Abstract

Abstract: China has many kinds of tea, and subjective identification is easily confused and depends heavily on professional experience. Convolutional neural network (CNN) image recognition applied to multi-tea identification offers objectivity, adaptability to complex image backgrounds, and portability to mobile devices. However, current CNN-based tea image recognition lacks research on optimizing data augmentation and on objectively evaluating recognition accuracy, which limits the robustness and generalization ability of the models. In this study, 6 123 images of 29 common tea categories were collected to build a dataset, and the training performance of ResNet-18 (Residual network-18) under 10 image data augmentation methods was compared. To objectively evaluate the accuracy of the regions the model relies on for recognition, two quantitative evaluation indexes (IOB and MPI) based on gradient-weighted class activation mapping (Grad-CAM) were constructed. The results show that grid erasure (Ratio=0.3), resolution perturbation, and HSV (Hue, Saturation, Value) color space perturbation were the better data augmentation methods, performing well overall on four indicators: accuracy, loss, IOB, and MPI. Ablation experiments then identified the best combination of methods: horizontal mirror flip + grid erasure (Ratio=0.3) + HSV color space perturbation. With this combination the model reached a test accuracy of 99.82% and a loss of only 0.64, and its IOB and MPI values were also favorable, indicating that the model attends to accurate image regions. This study optimized data augmentation for tea images, produced a highly robust multi-tea CNN image recognition model, and constructed the quantitative indexes IOB and MPI to address the problem of objectively evaluating the accuracy of CAM recognition regions.

Key words: tea recognition, convolutional neural network, image recognition, data augmentation, class activation mapping
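
The best-performing combination named in the abstract (horizontal mirror flip, grid erasure, HSV color space perturbation) can be illustrated with a minimal sketch. This is not the authors' implementation. It uses only the Python standard library and represents an image as nested lists of RGB tuples so that it runs anywhere. The abstract does not define Ratio, so the sketch assumes it is the erased fraction of each grid tile's side length. The tile size `cell`, the jitter ranges `dh`, `ds`, `dv`, and the flip probability are hypothetical parameters chosen for illustration.

```python
import colorsys
import random

# An image is a list of rows; each pixel is an (r, g, b) tuple of floats in [0, 1].


def horizontal_flip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def grid_erase(img, ratio=0.3, cell=10, fill=(0.0, 0.0, 0.0)):
    """Erase a square in the top-left corner of every cell-by-cell tile.

    ASSUMPTION: `ratio` is the erased fraction of each tile's side length.
    The abstract does not define Ratio, and `cell` is a hypothetical parameter.
    """
    side = round(cell * ratio)
    return [
        [fill if (y % cell < side and x % cell < side) else px
         for x, px in enumerate(row)]
        for y, row in enumerate(img)
    ]


def hsv_jitter(img, rng, dh=0.02, ds=0.2, dv=0.2):
    """Shift hue and scale saturation and value, one random draw per image.

    ASSUMPTION: the ranges dh, ds, dv are illustrative, not taken from the paper.
    """
    h_shift = rng.uniform(-dh, dh)
    s_scale = 1.0 + rng.uniform(-ds, ds)
    v_scale = 1.0 + rng.uniform(-dv, dv)

    def jitter(px):
        h, s, v = colorsys.rgb_to_hsv(*px)
        h = (h + h_shift) % 1.0
        s = min(1.0, max(0.0, s * s_scale))
        v = min(1.0, max(0.0, v * v_scale))
        return colorsys.hsv_to_rgb(h, s, v)

    return [[jitter(px) for px in row] for row in img]


def augment(img, rng, flip_prob=0.5):
    """Apply flip, grid erasure, and HSV perturbation in that order."""
    if rng.random() < flip_prob:
        img = horizontal_flip(img)
    img = grid_erase(img, ratio=0.3)
    return hsv_jitter(img, rng)


rng = random.Random(0)
image = [[(0.2, 0.6, 0.3)] * 20 for _ in range(20)]  # uniform green 20x20 test image
out = augment(image, rng)
erased = sum(px == (0.0, 0.0, 0.0) for row in out for px in row)
print(erased, "of", 20 * 20, "pixels erased")
```

With a 20×20 image and 10-pixel tiles, each of the four tiles loses a 3×3 square, so 36 pixels are erased. In practice such a pipeline would more likely be built on an image library and applied per batch during training; the order of the three steps here is one reasonable choice, not something the abstract specifies.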

Cite this article:

ZHANG Zhanyi, ZHANG Baoquan, WANG Zhouli, YANG Yao, FAN Dongmei, HE Weizhong, MA Junhui, LIN Jie. Data Enhancement Optimization and Class Activation Mapping Quantitative Evaluation for CNN Image Recognition of Multiple Tea Categories[J]. Journal of Tea Science, 2023, 43(3): 411-423. doi: 10.13305/j.cnki.jts.2023.03.006.

