
Journal of Tea Science ›› 2019, Vol. 39 ›› Issue (6): 715-722. doi: 10.13305/j.cnki.jts.2019.06.010


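Nonlinear Manifold Dimensionality Reduction Methods for Quick Discrimination of Tea at Different Altitude by Near Infrared Spectroscopy

LIU Peng1, AI Shirong3, YANG Puxiang2, LI Wenjin2, XIONG Aihua1, TONG Yang3, HU Xiao3, WU Ruimei1,*

  1. College of Engineering, Jiangxi Agricultural University, Nanchang 330045, China;
  2. Sericulture and Tea Research Institute of Jiangxi Province, Nanchang 330203, China;
  3. College of Software, Jiangxi Agricultural University, Nanchang 330045, China
  • Received: 2018-10-19 Revised: 2019-06-12 Online: 2019-12-15 Published: 2019-12-24
  • Corresponding author: * aisrong@163.com
  • About the author: LIU Peng, male, master's student, mainly engaged in research on agricultural product quality and safety detection and pattern recognition.
  • Supported by:
    National Natural Science Foundation of China (31460315), Key Research and Development Program of Jiangxi Province (20171ACF60004), Special Fund for the Modern Agricultural Industry Technology System of Jiangxi Province (JXARS-02)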

Abstract: To improve the accuracy of near infrared (NIR) spectroscopy in discriminating tea grown at different altitudes, two nonlinear manifold learning methods, locally linear embedding (LLE) and Laplacian eigenmaps (LE), were used to reduce the dimensionality of NIR spectral data and were compared with a kernel-based nonlinear method, kernel principal component analysis (KPCA), and a linear method, principal component analysis (PCA). Discrimination models for tea from different altitudes were then built by combining each dimensionality reduction method with the least squares support vector machine (LSSVM) algorithm. Visualization of the reduced data showed that the points produced by KPCA and PCA were widely scattered, with considerable overlap between the 400-800 m and 800-1 200 m samples. In contrast, the nonlinear manifold learning methods gathered samples of the same class closely together in three-dimensional space and separated tea from different altitudes well, with LE clustering the samples better than LLE. The LE_LSSVM model performed best, with an overall discrimination rate of 100% and a Kappa coefficient of 1.00 on the prediction set. Compared with the PCA_LSSVM, KPCA_LSSVM and LLE_LSSVM models, its overall discrimination rate on the prediction set was higher by 1.7%, 1.7% and 3.3%, and its Kappa coefficient was higher by 0.025, 0.03 and 0.05, respectively. The results indicate that nonlinear manifold learning methods such as LE are effective in reducing the dimensionality of NIR spectral data, simplifying model complexity, and improving model accuracy, and that they offer a new approach for research on rapid detection of tea quality.
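
As a rough illustration of the LE step described in the abstract, the following NumPy sketch embeds synthetic data with Laplacian eigenmaps. It is not the authors' implementation: the function name `laplacian_eigenmaps`, the dense heat-kernel graph, the median-distance kernel width, and the random stand-in "spectra" are all assumptions made for this example, since the abstract does not state how the neighborhood graph was built or which parameters were used.

```python
import numpy as np


def laplacian_eigenmaps(X, n_components=3, sigma=None):
    """Embed the rows of X with Laplacian eigenmaps on a dense heat-kernel graph."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    sq_norms = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T, 0.0)
    if sigma is None:
        # Assumed heuristic: kernel width = median off-diagonal distance.
        off_diagonal = ~np.eye(n, dtype=bool)
        sigma = np.sqrt(np.median(d2[off_diagonal]))
    # Heat-kernel edge weights; no self-loops.
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    degree = W.sum(axis=1)
    # Solve L y = lambda D y through the symmetric normalized Laplacian.
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    L_sym = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Skip the trivial eigenvector (eigenvalue ~0) and map back with D^(-1/2).
    return d_inv_sqrt[:, None] * eigvecs[:, 1:n_components + 1]


rng = np.random.default_rng(0)
# Synthetic stand-in for spectra: two classes, 30 samples x 200 "wavelengths".
low = rng.normal(0.0, 0.05, size=(30, 200))
high = rng.normal(0.0, 0.05, size=(30, 200)) + 1.0
embedding = laplacian_eigenmaps(np.vstack([low, high]), n_components=3)
print(embedding.shape)
```

In this toy setting the first embedding coordinate takes opposite signs for the two synthetic classes. A classifier such as LSSVM would then be trained on `embedding` rather than on the raw spectra.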

Key words: tea, near infrared spectroscopy, nonlinear manifold dimensionality reduction methods, Laplacian eigenmaps
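
The two figures of merit quoted in the abstract, the overall discrimination rate and the Kappa coefficient, can both be computed from a confusion matrix. The sketch below uses plain Python and the standard definition of Cohen's kappa. The function names and the three-class confusion matrices are hypothetical and are not taken from the paper, whose sample counts are not given here.

```python
def overall_accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def cohen_kappa(confusion):
    """Cohen's kappa: agreement beyond what the class totals alone would give."""
    total = sum(sum(row) for row in confusion)
    p_observed = overall_accuracy(confusion)
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Chance agreement expected from the marginal totals.
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical prediction sets (rows: true class, columns: predicted class).
perfect = [[20, 0, 0], [0, 20, 0], [0, 0, 20]]
one_error = [[20, 0, 0], [0, 19, 1], [0, 0, 20]]

for name, matrix in [("perfect", perfect), ("one_error", one_error)]:
    print(name, round(overall_accuracy(matrix), 4), round(cohen_kappa(matrix), 4))
```

With these hypothetical counts, one misclassified sample out of 60 lowers the overall rate from 100% to about 98.3% and the Kappa coefficient from 1.00 to 0.975.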
