Knowledge Resource Center for Ecological Environment in Arid Area
DOI | 10.1016/j.geoderma.2023.116604 |
Title | Model averaging of machine learning algorithms for digital soil mapping: A minimum variance framework |
Authors | Bogaert, Patrick; Taghizadeh-Mehrjardi, Ruhollah; Hamzehpour, Nikou |
Corresponding author | Bogaert, P |
Source journal | GEODERMA
ISSN | 0016-7061 |
EISSN | 1872-6259 |
Publication year | 2023 |
Volume | 437 |
Abstract (English) | In the digital soil mapping framework, machine learning (ML) algorithms are currently the most popular methods for the spatial prediction of soil properties. The fast developments of easy-to-use software implementations for a large panel of ML algorithms have encouraged comparison studies between algorithms, with the goal of ranking their performances and identifying the best ones among them. However, as no firm conclusions can be drawn about the best ML algorithm to be used in general, this suggests that combining a set of them could be a better approach. Numerous methods have been proposed to do so, most of them relying on a linear weighting of the individual algorithms. However, there are almost as many methods for linearly weighting ML algorithms as there are ML algorithms, thus leaving the problem unsolved. Moreover, these weighting methods are mostly used out-of-the-box, without paying proper attention to the associated hypotheses. In this paper, we propose to address this issue by setting the problem in a more formal framework. Starting from classical hypotheses, it is shown how the benefit of averaging various ML algorithms can be estimated from their joint performances. Relying afterwards on the most commonly used linear weighting schemes, it is recalled that, as long as the performance metrics are based on mean square errors, the best averaging method is by essence the best linear (unbiased) predictor. Using a more general Bayesian framework, it is also shown that accounting for conditional biases when weighting ML algorithms is a key issue for obtaining improved predictions, and explicit formulas are proposed for that goal. Finally, these theoretical results are illustrated and discussed using a soil data set collected over an arid and semi-arid region in Iran where clay content, calcium carbonate equivalent, soil organic carbon and electrical conductivity were measured in topsoil samples. |
Keywords (English) | Best linear predictor ; BLUP ; Data fusion ; Soil mapping ; Urmia Lake |
Type | Article |
Language | English |
Open access type | hybrid |
Indexed in | SCI-E |
WOS accession number | WOS:001047023800001 |
WOS keywords | PREDICTION ; TEXTURE |
WOS category | Soil Science |
WOS research area | Agriculture |
Resource type | Journal article |
Item identifier | http://119.78.100.177/qdio/handle/2XILL650/396696 |
Recommended citation, GB/T 7714 | Bogaert, Patrick, Taghizadeh-Mehrjardi, Ruhollah, Hamzehpour, Nikou. Model averaging of machine learning algorithms for digital soil mapping: A minimum variance framework[J], 2023, 437. |
APA | Bogaert, Patrick, Taghizadeh-Mehrjardi, Ruhollah, & Hamzehpour, Nikou. (2023). Model averaging of machine learning algorithms for digital soil mapping: A minimum variance framework. GEODERMA, 437. |
MLA | Bogaert, Patrick, et al. "Model averaging of machine learning algorithms for digital soil mapping: A minimum variance framework". GEODERMA 437 (2023). |
Files in this item | There are no files associated with this item. |
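
The abstract states that, when performance is measured by mean square error, the best linear weighting of ML algorithms is the best linear unbiased predictor. As a rough illustration only, the sketch below computes the classical minimum variance weights w = C^-1 1 / (1' C^-1 1), where C is the covariance matrix of the algorithms' validation errors and the weights are constrained to sum to one. This is the textbook formula under the assumption of unbiased predictors, not necessarily the exact formulas of the paper; the function names and the synthetic error data are hypothetical and are not taken from the study.

```python
import numpy as np


def minimum_variance_weights(errors):
    """Sum-to-one weights minimizing the variance of the averaged error.

    errors: array of shape (n_samples, n_models) holding validation errors
    (prediction minus observation) for each model. Assumes unbiased models
    and a non-singular error covariance matrix.
    """
    errors = np.asarray(errors, dtype=float)
    cov = np.cov(errors, rowvar=False)      # (n_models, n_models) covariance
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)        # C^-1 1 without explicit inverse
    return raw / raw.sum()                  # normalize so weights sum to one


def averaged_error_variance(errors, weights):
    """Error variance of the weighted average, w' C w."""
    cov = np.cov(np.asarray(errors, dtype=float), rowvar=False)
    return float(weights @ cov @ weights)


# Synthetic, purely illustrative errors for three hypothetical models that
# share a common error component (so their errors are correlated).
rng = np.random.default_rng(0)
n = 500
shared = rng.normal(0.0, 0.5, n)
errors = np.column_stack([
    shared + rng.normal(0.0, 1.0, n),
    shared + rng.normal(0.0, 1.5, n),
    shared + rng.normal(0.0, 2.0, n),
])

weights = minimum_variance_weights(errors)
best_single = np.cov(errors, rowvar=False).diagonal().min()
combined = averaged_error_variance(errors, weights)

print("weights:", weights)
print("best single-model error variance:", best_single)
print("averaged error variance:", combined)
```

Because putting all weight on a single model is itself an admissible choice, the averaged error variance computed on the same covariance matrix can never exceed that of the best single model. In practice the covariance matrix would be estimated from held-out or cross-validated errors, and the conditional biases discussed in the abstract are not handled by this sketch.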