
Forest Resources Management ›› 2018 ›› Issue (6): 130-137. doi: 10.13466/j.cnki.lyzygl.2018.06.021

• Technical Application •

Filling Method for Missing Data of Forest Resource Sampling Investigation

LIU Fei, LI Mingyang, LIU Yanan, JIANG Yifan, WANG Zi

  1. College of Forestry, Nanjing Forestry University, Nanjing, Jiangsu 210037, China
  • Received: 2018-09-17 Revised: 2018-12-10 Online: 2018-12-28 Published: 2020-09-27
  • Corresponding author: LI Mingyang
  • About the author: LIU Fei (born 1994), female, from Chuzhou, Anhui; master's student working on applications of 3S technology. Email: 121126082@qq.com
  • Funding:
    National Natural Science Foundation of China project "Research on Long-Term Management Planning Methods for Southern Collective Forests Based on Scenario Analysis and Multi-Objective Decision-Making" (31770679)

Abstract:

Missing data are common in forest resource sampling surveys, so methods for filling them need to be studied to improve the accuracy of data analysis. Linan City, Zhejiang Province, was chosen as the study area. A 1996 Landsat-5 TM image and county-level permanent sample plot data from continuous forest resource monitoring in the same period were the main data sources, and the mean diameter at breast height (DBH) of the trees in each sample plot was treated as the missing variable. After a spatial autocorrelation analysis of plot mean DBH, missing values were filled with spatial methods, non-spatial methods, and remote sensing estimation models, and the accuracy of each method was evaluated by 10-fold cross-validation. The results show that: (1) Moran's I for plot mean DBH in the study area is 0.21, indicating relatively strong spatial autocorrelation; (2) K-nearest neighbor, a remote sensing estimation model, has the highest filling accuracy, followed by random forest and then kriging interpolation, a spatial method, while the expectation-maximization algorithm, a non-spatial method, has the lowest; (3) among the four theoretical semivariogram models tested for kriging interpolation, the spherical model is the most accurate, with the highest correlation coefficient (0.6325) and the lowest mean absolute error (2.0493 cm) and root mean square error (3.8093 cm); (4) ranked from highest to lowest filling accuracy, the four best-performing methods are K-nearest neighbor, random forest, kriging interpolation, and inverse distance weighting. In Linan, where the topography is complex and elevation varies widely, K-nearest neighbor is well suited to filling missing values of plot mean DBH.

Key words: forest resource sampling investigation, DBH, missing data, filling methods, Linan
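
The evaluation workflow summarized in the abstract (a spatial autocorrelation check, then 10-fold cross-validation scored by correlation coefficient, mean absolute error, and root mean square error) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not state the software, weighting scheme, or parameters used, so the inverse-distance weights in Moran's I, the power values, the synthetic plot data, and all function names (`morans_i`, `idw_predict`, `ten_fold_cv`) are assumptions. Inverse distance weighting is used as the fill method because it is the simplest of the four methods named.

```python
import numpy as np

def morans_i(coords, values, power=1.0):
    """Global Moran's I using inverse-distance weights (an assumed scheme)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    w = np.zeros_like(d)
    mask = d > 0                      # no self-weights on the diagonal
    w[mask] = 1.0 / d[mask] ** power
    z = values - values.mean()
    return len(values) / w.sum() * (w * np.outer(z, z)).sum() / (z ** 2).sum()

def idw_predict(train_xy, train_v, query_xy, power=2.0):
    """Inverse distance weighting: weighted mean of all training plots."""
    d = np.linalg.norm(query_xy[:, None, :] - train_xy[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)          # avoid division by zero at coincident points
    w = 1.0 / d ** power
    return (w * train_v).sum(axis=1) / w.sum(axis=1)

def ten_fold_cv(coords, values, predict, k=10, seed=0):
    """Hide each fold in turn, fill it from the other folds, then score."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(values))
    pred = np.empty_like(values, dtype=float)
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        pred[fold] = predict(coords[train], values[train], coords[fold])
    err = pred - values
    return {
        "r": np.corrcoef(pred, values)[0, 1],
        "mae": np.abs(err).mean(),
        "rmse": np.sqrt((err ** 2).mean()),
    }

# Synthetic stand-in for plot coordinates and mean DBH (cm); not the paper's data.
rng = np.random.default_rng(42)
coords = rng.uniform(0, 100, size=(200, 2))
dbh = 12 + 0.05 * coords[:, 0] + rng.normal(0, 1, 200)

print("Moran's I:", morans_i(coords, dbh))
print("IDW 10-fold CV:", ten_fold_cv(coords, dbh, idw_predict))
```

Any other fill method with the same `(train_xy, train_v, query_xy)` signature, for example a K-nearest neighbor regressor on spectral features, could be passed to `ten_fold_cv` in place of `idw_predict` so that all methods are scored on identical folds.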

CLC number: