Abstract:
When extracting saline-alkali land information, machine learning models often suffer from complex feature selection and difficult hyperparameter tuning, which can lead to suboptimal classification accuracy in practical applications. To accurately extract saline-alkali land information in western Jilin and provide a scientific basis for agricultural production and environmental governance, this study used remote sensing and GIS technology to extract spectral features, soil indices, salt indices, and radar features from Sentinel-1 and Sentinel-2 data. Recursive feature elimination (RFE) and random forest (RF) algorithms were used for feature optimization and feature importance ranking. Bayesian optimization was then used to tune the hyperparameters of the RF, support vector machine (SVM), and K-nearest neighbor (KNN) models, and the classification results were compared and analyzed. The results show that feature selection eliminated 11 features while maintaining classification accuracy, greatly reducing redundant information. The feature importance ranking indicates that the features with the greatest effect on model performance were the blue band (B2), green band (B3), and short-wave infrared band (B12). After Bayesian optimization, RF had the highest classification accuracy of the three models, with an overall accuracy, Kappa coefficient, user's accuracy, and recall of 0.884, 0.878, 0.907, and 0.889, respectively. Compared with SVM and KNN, RF better eliminated or reduced the influence of noise on the classification results and showed better classification performance and stability. The RF model, after feature selection and Bayesian optimization, can accurately extract saline-alkali land information in western Jilin.
This study can provide a scientific basis and decision-making reference for the sustainable development of agriculture, improvement of saline-alkali land, and ecological environment protection in western Jilin.
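
The accuracy measures reported above are conventionally derived from a confusion matrix. The abstract does not say how the study computed them, so the following is only a minimal sketch under stated assumptions: rows of the matrix hold reference classes, columns hold predicted classes, and the function name `classification_metrics` and the example counts are hypothetical and not taken from the study.

```python
def classification_metrics(cm):
    """Compute common accuracy measures from a square confusion matrix.

    Assumption: cm[i][j] is the number of samples whose reference class
    is i and whose predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_totals = [sum(row) for row in cm]                             # reference counts per class
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # predicted counts per class
    diagonal = [cm[i][i] for i in range(n)]                           # correctly classified samples

    # Overall accuracy: share of all samples that were classified correctly.
    overall_accuracy = sum(diagonal) / total

    # Cohen's Kappa: agreement corrected for the agreement expected by chance.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall_accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    # User's accuracy (per-class precision): correct / all predicted as that class.
    users_accuracy = [d / c if c else 0.0 for d, c in zip(diagonal, col_totals)]

    # Recall (producer's accuracy): correct / all reference samples of that class.
    recall = [d / r if r else 0.0 for d, r in zip(diagonal, row_totals)]

    return {
        "overall_accuracy": overall_accuracy,
        "kappa": kappa,
        "users_accuracy": users_accuracy,
        "recall": recall,
    }


# Hypothetical two-class example (illustrative counts only, not the study's data).
example = [[45, 5],
           [10, 40]]
print(classification_metrics(example))
```

For a multi-class map, a single user's accuracy or recall value such as those in the abstract would typically be an average over the per-class values returned here; which averaging the study used is not stated.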