Spectroscopy and Spectral Analysis, 2019, 39(9): 2780; published online: 2019-09-28
Early Identification of Male and Female Embryos Based on UV/Vis Transmission Spectroscopy and Extreme Learning Machine
Abstract
To determine the sex of chicken embryos in hatching eggs, the feasibility of sex identification using UV/Vis/NIR transmission spectroscopy was explored. A transmission spectrum detection system for hatching eggs was built, and spectra of 210 hatching eggs were acquired on incubation days 0~15 over the range 360~1 000 nm with two placements: horizontal, and vertical with the blunt end up. Extreme learning machine (ELM) models for embryo sex identification were constructed. Comparing identification accuracy across placements and incubation days showed that vertical placement on day 7 of incubation gave the best results. The day-7 vertical spectra were first divided into five band ranges for analysis: ultraviolet (360~380 nm), visible (380~780 nm), near-infrared (780~1 000 nm), ultraviolet/visible (360~780 nm), and full band (360~1 000 nm). The prediction-set accuracies were 82.86%, 77.14%, 75.71%, 84.29%, and 81.43%, respectively, so the 360~780 nm ultraviolet/visible band was selected as the effective band. Within this band, multiplicative scatter correction (MSC) was used for denoising, and competitive adaptive reweighted sampling (CARS) and the successive projections algorithm (SPA) were used to select characteristic wavelengths for dimensionality reduction. Three ELM models were built: one without wavelength selection, one with CARS-selected wavelengths, and one with SPA-selected wavelengths. The ELM model without wavelength selection performed best but used the most input variables; with 680 hidden-layer neurons and the sig activation function, its prediction-set accuracy was 84.29%. The ELM model with SPA-selected wavelengths ranked second; with 9 input variables, 840 hidden-layer neurons, and the hardlim activation function, its prediction-set accuracy was 81.43%. The ELM model with CARS-selected wavelengths performed worst; with 27 input variables, 100 hidden-layer neurons, and the sig activation function, its prediction-set accuracy was 78.57%. A genetic algorithm (GA) was then used to optimize the input weights and hidden-layer thresholds of the ELM models. The prediction-set accuracy of the GA-ELM model was 87.14% without wavelength selection, 87.14% with SPA-selected wavelengths, and 81.43% with CARS-selected wavelengths. In the ultraviolet/visible band, the GA-ELM model with SPA-selected wavelengths therefore matched the GA-ELM model without wavelength selection, which indicates that the SPA-selected wavelengths effectively capture the information in the 360~780 nm band while using only 2.14% of the variables in that band. The best model for sex identification is thus the GA-ELM model with SPA-selected wavelengths in the ultraviolet/visible band: its prediction-set accuracy is 87.14%, with a female identification rate of 88.57% and a male identification rate of 85.71%, and the average discrimination time for a single sample is 0.080 ms. The results show that UV/Vis transmission spectroscopy combined with an ELM model provides a feasible method for sex identification of chicken embryos early in incubation.
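The two core processing steps named in the abstract, MSC preprocessing and ELM classification, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function and class names (`msc`, `ELM`), the uniform weight initialization in [-1, 1], the one-hot target encoding, and the pseudoinverse solve are common textbook choices that the abstract does not specify. The activation names `sig` and `hardlim` follow the abstract. The GA step that tunes the input weights and hidden-layer thresholds, and the CARS/SPA wavelength selection, are omitted here.

```python
import numpy as np


def msc(spectra, reference=None):
    """Multiplicative scatter correction (textbook form, assumed).

    Each spectrum is regressed on a reference spectrum (the mean spectrum
    by default); the fitted offset is subtracted and the result is divided
    by the fitted slope.
    """
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, s in enumerate(spectra):
        slope, intercept = np.polyfit(ref, s, 1)  # s ~ slope * ref + intercept
        corrected[i] = (s - intercept) / slope
    return corrected, ref


def _activate(z, kind):
    """Hidden-layer activation: 'sig' (sigmoid) or 'hardlim' (0/1 step)."""
    if kind == "sig":
        return 1.0 / (1.0 + np.exp(-np.clip(z, -50.0, 50.0)))
    if kind == "hardlim":
        return (z >= 0).astype(float)
    raise ValueError(f"unknown activation: {kind}")


class ELM:
    """Single-hidden-layer ELM for two classes labeled 0 and 1 (sketch)."""

    def __init__(self, n_hidden, activation="sig", seed=0):
        self.n_hidden = n_hidden
        self.activation = activation
        self.seed = seed

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=int)
        rng = np.random.default_rng(self.seed)
        # Input weights and thresholds are drawn at random and never trained.
        self.W = rng.uniform(-1.0, 1.0, size=(X.shape[1], self.n_hidden))
        self.b = rng.uniform(-1.0, 1.0, size=self.n_hidden)
        H = _activate(X @ self.W + self.b, self.activation)
        T = np.eye(2)[y]  # one-hot targets
        # Output weights: least-squares solution via the pseudoinverse.
        self.beta = np.linalg.pinv(H) @ T
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        H = _activate(X @ self.W + self.b, self.activation)
        return (H @ self.beta).argmax(axis=1)


if __name__ == "__main__":
    # Synthetic stand-in data only; none of these values come from the paper.
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=200)
    X = rng.normal(size=(200, 9)) + 2.0 * (2 * y[:, None] - 1)
    model = ELM(n_hidden=50, activation="sig").fit(X[:140], y[:140])
    print("toy hold-out accuracy:", (model.predict(X[140:]) == y[140:]).mean())
```

Because only the output weights are solved for, in a single linear least-squares step, training and prediction are cheap. A GA-ELM variant would replace the random draw of `W` and `b` with values searched by a genetic algorithm, which is the part of the method not shown in this sketch.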
ZHU Zhi-hui, HONG Qi, WU Lin-feng, WANG Qiao-hua, MA Mei-hu. Early Identification of Male and Female Embryos Based on UV/Vis Transmission Spectroscopy and Extreme Learning Machine[J]. Spectroscopy and Spectral Analysis, 2019, 39(9): 2780.