Optimizing Sentinel-2 feature space for improved crop biophysical and biochemical variables retrieval using the novel spectral triad feature selection algorithm

This study presents a novel Spectral Triad feature selection (STfs) technique based on music theory and compares it to the entire Sentinel-2 feature space and to Random Forest-Recursive Feature Elimination (RF-RFE). The optimal subsets were evaluated with Random Forest for retrieving Leaf Area Index (LAI), Leaf Chlorophyll Content (LCab), and Canopy Chlorophyll Content (CCC) in a semi-arid agricultural landscape. The results indicated that the proposed STfs algorithm obtained equivalent or better (i.e. by 1-3%) retrieval accuracies for LAI (R²cv: 66%, root mean squared error of cross-validation [RMSEcv]: 0.53 m² m⁻²), LCab (R²cv: 74%, RMSEcv: 7.09 µg cm⁻²) and CCC (R²cv: 77%, RMSEcv: 33.69 µg cm⁻²), using only 5, 7 and 7 variables, respectively, when compared to RF-RFE and the entire Sentinel-2 feature space. Overall, the proposed STfs algorithm has great potential to optimize the spectral feature space of quasi-hyperspectral sensors for rapid crop biophysical and biochemical parameter retrieval.
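For readers unfamiliar with the baseline, the comparison between the entire feature space and RF-RFE can be sketched as follows. This is a minimal illustration with scikit-learn, not the study's implementation: the data are synthetic, the feature count, tree count, fold count and subset size of 5 are assumptions chosen for the example, and the STfs algorithm itself is not reproduced because the abstract does not describe its steps.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for a reflectance feature table (assumption: 10 features).
rng = np.random.default_rng(0)
n_samples, n_features = 120, 10
X = rng.uniform(0.0, 0.6, size=(n_samples, n_features))
# Hypothetical target driven by three of the features plus noise.
y = 4.0 * X[:, 7] - 2.0 * X[:, 3] + 3.0 * X[:, 5] + rng.normal(0.0, 0.2, n_samples)


def make_rf():
    """Random Forest regressor with illustrative (not tuned) settings."""
    return RandomForestRegressor(n_estimators=50, random_state=0)


def cv_scores(estimator, X, y, n_splits=5):
    """Return cross-validated R^2 and RMSE from out-of-fold predictions."""
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    pred = cross_val_predict(estimator, X, y, cv=cv)
    r2 = r2_score(y, pred)
    rmse = float(np.sqrt(mean_squared_error(y, pred)))
    return r2, rmse


# Baseline: Random Forest on the entire feature space.
full_r2, full_rmse = cv_scores(make_rf(), X, y)

# RF-RFE: recursive elimination inside each fold, then a Random Forest
# fitted on the retained subset, so selection does not see the test fold.
rfe_model = make_pipeline(RFE(make_rf(), n_features_to_select=5, step=1), make_rf())
rfe_r2, rfe_rmse = cv_scores(rfe_model, X, y)

# Subset chosen when RFE is fitted on all samples (for reporting only).
selected = np.flatnonzero(RFE(make_rf(), n_features_to_select=5).fit(X, y).support_)

print(f"All features: R2cv={full_r2:.2f}, RMSEcv={full_rmse:.2f}")
print(f"RF-RFE (5):   R2cv={rfe_r2:.2f}, RMSEcv={rfe_rmse:.2f}")
print("Selected feature indices:", selected.tolist())
```

Placing the selector inside the pipeline keeps feature elimination within each training fold, which avoids optimistic cross-validation scores; any alternative selector could be swapped into the same pipeline slot for a like-for-like comparison.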