
ISSN 2096-7780 CN 10-1665/P

Research on a spike-interference correction method for 1-second geomagnetic sampling data using XGBoost

  • Abstract: Spike interference is one of the most common types of disturbance in 1-second geomagnetic sampling data, and how well it is handled directly affects the quality of the observations. The existing median-average method requires extensive manual work and adapts poorly to interference lasting longer than 1 s. To correct spike-contaminated data more accurately, this study formulates data correction as a machine-learning regression task and proposes an XGBoost-based correction model for spike interference in 1-second geomagnetic sampling data. From raw Z-component observations recorded by the automated geomagnetic station system at the Manzhouli Seismic Observatory between January 2024 and April 2025, 17720 samples covering the magnetic-field variation at and around spike-interference moments were extracted as the dataset for training the model. On the test set, the model achieved a mean squared error of 0.010971 nT², a root mean squared error of 0.104742 nT, a mean absolute error of 0.074217 nT, and a coefficient of determination of 0.999852. The results show that the XGBoost model performs very well in correcting spike interference in 1-second geomagnetic sampling data and offers an effective new method for automated, high-precision correction of such interference.
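The four test-set metrics quoted in the abstract follow standard definitions, which the short sketch below spells out. It is only an illustration using the Python standard library. The function name `regression_metrics` and the toy values are assumptions made here and are not taken from the paper, whose evaluation code is not shown.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for two equal-length sequences.

    Hypothetical helper for illustration; not the authors' code.
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mse = ss_res / n                               # units: nT^2
    rmse = math.sqrt(mse)                          # units: nT
    mae = sum(abs(e) for e in errors) / n          # units: nT
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot                     # dimensionless
    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


# Toy values (made up): reference field values vs. corrected values, in nT.
reference = [1.0, 2.0, 3.0, 4.0]
corrected = [1.0, 2.0, 3.0, 5.0]
metrics = regression_metrics(reference, corrected)
```

Because RMSE is the square root of MSE, the reported figures can be cross-checked against each other: the square root of 0.010971 nT² is approximately 0.104742 nT, which matches the abstract.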

     
