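Title: Optimization of environmentally sensitive variables and machine learning prediction of oasis soil salinity (环境敏感变量优选及机器学习算法预测绿洲土壤盐分)
Alternative title: Environmental sensitive variable optimization and machine learning algorithm using in soil salt prediction at oasis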
Wang Fei; Yang Shengtian; Ding Jianli; Wei Yang; Ge Xiangyu; Liang Jing
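Source journal: Transactions of the Chinese Society of Agricultural Engineering (农业工程学报)
ISSN: 1002-6819
Publication year: 2018
Volume: 34; Issue: 22; Pages: 102-110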
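Chinese abstract (translated): Few studies have used machine learning to predict soil salinity in arid regions such as Xinjiang, and the screening of sensitive variables needs further investigation. This study compares the performance of five machine learning algorithms (the Least Absolute Shrinkage and Selection Operator, LASSO; Multivariate Adaptive Regression Splines, MARS; Classification and Regression Trees, CART; Random Forest, RF; and Stochastic Gradient Treeboost, SGT) in three geographic regions (the Qitai oasis, the Wei-Ku oasis and the Yutian oasis). The input variables were divided into six groups: bands, vegetation-related variables, soil-related variables, digital elevation model (DEM) derived variables, the full variable group, and the optimized variable group (the variables retained after algorithmic screening of the full variable group). The screening identifies the salinity-sensitive variables of each study area, and the results for the six groups are used to judge algorithm performance. Based on the R2 and RMSE of the six variable groups, prediction accuracy ranks as follows: optimized variable group > vegetation index variable group > soil-related variable group > bands > DEM-derived variable group. The full variable group was excluded from the ranking because its results were unstable. Among all variables, the vegetation indices (EEVI, ENDVI, EVI2, CSRI, GDVI) and the soil salinity indices (SIT, SI2 and SAIO) correlate more strongly with soil salinity than the other variables. Across the five algorithms, the LASSO and MARS predictions contain extreme outliers but still broadly reproduce the spatial pattern of soil salinity. CART clearly separates the soil salinity of irrigated and non-irrigated areas, but it shows little variation within either and is relatively unstable. RF and SGT yield similar soil salinity ranges and spatial patterns in the three oases, with richer texture information than the other three algorithms. More importantly, both algorithms give relatively stable results in every region. Of the two, SGT has the highest validation accuracy, followed by RF.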
English abstract: Salt-affected cultivated land in Xinjiang accounts for about 37.72% of the irrigated area, which seriously restricts local economic development and ecological stability. To evaluate the distribution and severity of soil salinization, many scholars have built soil salinity prediction models from ground sampling data and environmental variables. Few studies, however, have used machine learning to predict soil salinity in arid areas such as Xinjiang, and the screening of sensitive variables needs further exploration. Sensitive variables help reduce the uncertainty of machine learning algorithms and thus improve prediction accuracy. This study compares the performance of five machine learning algorithms (the Least Absolute Shrinkage and Selection Operator, LASSO; Multivariate Adaptive Regression Splines, MARS; Classification and Regression Trees, CART; Random Forest, RF; and Stochastic Gradient Treeboost, SGT) in three geographic regions (the Qitai oasis, the Kuqa oasis and the Yutian oasis). The variables are divided into six groups: bands, vegetation-related variables, soil-related variables, digital elevation model (DEM) derived variables, the full variable group, and the optimized variable group (obtained by algorithmic screening of the full variable group, to show the salinity-sensitive variables of each study area). The performance of each algorithm is then judged by its results on each group. According to R2 and RMSE, the prediction accuracy of five of the variable groups ranks as follows: optimized variable group > vegetation index variable group > soil-related variable group > bands > DEM-derived variable group. Among all variables, the vegetation indices (EEVI, ENDVI, EVI2, CSRI, GDVI) and the soil salinity indices (SIT, SI2 and SAIO) are more strongly correlated with soil salinity than the other variables. When few variables are involved, the validation accuracy of the algorithms differs little. When the number of variables increases and their correlation with soil salinity is low, as in the DEM-derived variable group, SGT and RF are better than the other algorithms at mining useful information from complex environments. Comparing the algorithms, the predictions of LASSO and MARS contain extreme outliers, although they broadly show the distribution of soil salinity. CART clearly distinguishes the soil salinity of irrigated and non-irrigated areas, but shows little variation within either. RF and SGT yield similar soil salinity ranges and spatial distributions in the three oases, with richer texture information than the other three algorithms. More importantly, the results of these two algorithms are relatively stable in each region. Among the five algorithms, SGT has the highest validation accuracy, followed by RF.
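The abstract ranks variable groups by R2 and RMSE. As a minimal sketch of that kind of comparison, the snippet below computes both metrics and sorts groups by them. The group names, the sample values, and the ordering rule (R2 descending, lower RMSE as tie-breaker) are illustrative assumptions, not data or methods taken from the paper.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rank_groups(observed, predictions_by_group):
    """Rank groups by R2 (descending); ties are broken by lower RMSE.

    This ordering rule is an assumption made for the sketch.
    """
    scores = {
        name: (r2(observed, pred), rmse(observed, pred))
        for name, pred in predictions_by_group.items()
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1][0], kv[1][1]))


# Made-up validation samples, for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
predictions_by_group = {
    "group_a": [2.5, 3.5, 6.5, 7.5],  # hypothetical variable group
    "group_b": [4.0, 4.0, 6.0, 6.0],  # hypothetical variable group
}

ranking = rank_groups(observed, predictions_by_group)
for name, (r2_value, rmse_value) in ranking:
    print(f"{name}: R2={r2_value:.3f}, RMSE={rmse_value:.3f}")
```

In practice the predictions for each group would come from a fitted model (LASSO, MARS, CART, RF or SGT) evaluated on held-out samples; only the scoring step is shown here.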
Chinese keywords (translated): soil salinity; remote sensing; machine learning; oasis; digital elevation model; Xinjiang
English keywords: Landsat OLI; soil salt; remote sensing; machine learning; oasis; digital elevation model; Xinjiang
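Language: Chinese
Country: China
Indexed in: CSCD
WOS category: Agriculture, Multidisciplinary
WOS research area: Agriculture
CSCD record number: CSCD:6367339
Source institution: Xinjiang University
Resource type: Journal article
Item identifier: http://119.78.100.177/qdio/handle/2XILL650/237996
Author affiliations: College of Resources and Environmental Science, Xinjiang University; Xinjiang Autonomous Region Key Laboratory of Smart City and Environmental Modeling (higher education institutions), Xinjiang University; Key Laboratory of Oasis Ecology, Ministry of Education; Urumqi 830046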
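Recommended citation
GB/T 7714: Wang Fei, Yang Shengtian, Ding Jianli, et al. Environmental sensitive variable optimization and machine learning algorithm using in soil salt prediction at oasis[J]. Transactions of the Chinese Society of Agricultural Engineering, 2018, 34(22): 102-110.
APA: Wang Fei, Yang Shengtian, Ding Jianli, Wei Yang, Ge Xiangyu, & Liang Jing. (2018). Environmental sensitive variable optimization and machine learning algorithm using in soil salt prediction at oasis. Transactions of the Chinese Society of Agricultural Engineering, 34(22), 102-110.
MLA: Wang Fei, et al. "Environmental sensitive variable optimization and machine learning algorithm using in soil salt prediction at oasis". Transactions of the Chinese Society of Agricultural Engineering 34.22 (2018): 102-110.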