Kunming University of Science and Technology
Multi-granularity formal concept analysis is an important tool for data mining and knowledge discovery. However, the existing theory provides no criterion for selecting an optimal formal context. As a result, knowledge discovery has to be carried out on multiple single-granularity formal contexts one by one, and formal contexts whose attributes span several granularities cannot be handled. This paper combines attribute blocks on the granularity tree of a multi-granularity formal context and uses information entropy as the criterion for judging the quality of a combined formal context, so as to evaluate the performance of optimal granularity selection. First, based on the granularity tree, the notion of a generalized meso-granularity pruning formal context is proposed; it supports both cross-granularity combination within an attribute block and cross-layer combination between attribute blocks. Second, the information entropy of a generalized meso-granularity pruning formal context is defined to evaluate its quality, and an optimal granularity selection algorithm is designed. Third, information entropy is used to measure the importance of multi-granularity pruning class attribute blocks and of granularity trees. Finally, experimental analysis shows that the entropy-based methods for optimal granularity selection and for measuring granularity-tree importance are effective.
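The abstract does not state the entropy definition or the selection algorithm, so the following is only an illustrative sketch under stated assumptions, not the method of the paper. It assumes (i) Shannon entropy of the partition that groups objects with identical attribute rows, (ii) that a candidate context is built by picking one granularity level per attribute block, which is a simplification of generalized meso-granularity pruning, and (iii) that higher entropy is preferred. All names (`partition_entropy`, `combine`, `rank_combinations`, `blocks`) and the toy data are hypothetical.

```python
from collections import Counter
from itertools import product
from math import log2


def partition_entropy(rows):
    """Shannon entropy (in bits) of the partition of objects by identical rows.

    ASSUMPTION: this definition is a stand-in; the paper's own entropy of a
    formal context is not given in the abstract.
    """
    counts = Counter(rows)
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def combine(blocks, choice):
    """Build one combined context: for each object, concatenate the binary
    columns of the chosen level of every attribute block."""
    n_objects = len(blocks[0][0])
    return [
        tuple(v for block, lvl in zip(blocks, choice) for v in block[lvl][i])
        for i in range(n_objects)
    ]


def rank_combinations(blocks):
    """Score every one-level-per-block combination and sort by entropy,
    highest first (ASSUMPTION: higher entropy is better)."""
    candidates = product(*(range(len(b)) for b in blocks))
    scored = [(partition_entropy(combine(blocks, c)), c) for c in candidates]
    return sorted(scored, reverse=True)


# Hypothetical toy data: 4 objects, 2 attribute blocks, 2 levels per block.
# blocks[b][level][i] is the binary row of object i for block b at that level
# (level 0 = coarse, level 1 = fine).
blocks = [
    [  # block A
        [(1,), (1,), (0,), (0,)],
        [(1, 0), (0, 1), (0, 0), (0, 0)],
    ],
    [  # block B
        [(1,), (1,), (1,), (1,)],
        [(1, 0), (1, 0), (0, 1), (1, 0)],
    ],
]

ranking = rank_combinations(blocks)
for entropy, choice in ranking:
    print(choice, round(entropy, 3))
```

In this toy example the exhaustive search over level choices is feasible because there are only four candidates; for larger granularity trees the candidate space grows multiplicatively with the number of blocks, which is one reason a dedicated selection algorithm is needed.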