Abstract: As an important data preprocessing technique in the field of data mining, feature selection can effectively mitigate the “curse of dimensionality” caused by high-dimensional data. Nonetheless, how to perform feature selection on high-dimensional mixed data remains one of the focuses and difficulties of current research. Because it handles mixed data in which categorical and numerical attributes coexist, the neighborhood rough set model has been widely used for feature selection on mixed data in recent years. However, existing measures of the neighborhood relation for mixed data still rely on a simple fusion of the partition of categorical data based on an equivalence relation and the partition of numerical data based on a similarity relation. As a result, when features of high-dimensional mixed data are selected using the partitioned neighborhood space and a predefined evaluation function, adaptability is poor. To this end, an improved method for constructing the neighborhood space is proposed on the basis of the neighborhood rough set model. Considering boundary-overlapping data and the size of the neighborhood space, an evaluation function is designed to characterize the discrimination ability of the neighborhood space. On this basis, a heuristic feature selection algorithm for high-dimensional mixed data is proposed. The validity and superiority of the proposed algorithm are verified on standard UCI datasets.
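
For orientation, the conventional "simple fusion" neighborhood relation that the abstract criticizes can be sketched as follows. This is a minimal illustration of the baseline only, not the improved construction proposed here. The function name `neighborhood`, the radius parameter `delta`, the use of Euclidean distance, and the toy samples are all assumptions made for the example; numerical attributes are assumed to be scaled to [0, 1].

```python
from typing import List, Sequence, Set, Union

Value = Union[str, float]


def neighborhood(
    samples: Sequence[Sequence[Value]],
    categorical: Sequence[bool],
    i: int,
    delta: float,
) -> Set[int]:
    """Return the indices of all samples in the neighborhood of sample i.

    Baseline fusion rule (an assumption for illustration):
    - categorical attributes must be equal (equivalence relation);
    - numerical attributes must lie within Euclidean distance delta
      (similarity relation).
    """
    target = samples[i]
    members: Set[int] = set()
    for j, other in enumerate(samples):
        same_category = True
        squared_distance = 0.0
        for k, (a, b) in enumerate(zip(target, other)):
            if categorical[k]:
                if a != b:
                    same_category = False
                    break
            else:
                squared_distance += (float(a) - float(b)) ** 2
        if same_category and squared_distance ** 0.5 <= delta:
            members.add(j)
    return members


# Hypothetical toy data: one categorical attribute, one numerical attribute.
samples: List[List[Value]] = [
    ["red", 0.10],
    ["red", 0.15],
    ["red", 0.90],
    ["blue", 0.12],
]
categorical = [True, False]

n0 = neighborhood(samples, categorical, 0, delta=0.1)
n2 = neighborhood(samples, categorical, 2, delta=0.1)
n3 = neighborhood(samples, categorical, 3, delta=0.1)
```

In this toy case, the fourth sample is excluded from the neighborhood of the first purely because its categorical value differs, even though its numerical value is close. The rule combines the two attribute types in a fixed way, with a single radius applied to every sample.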