Lecture title: Granular Clustering: Augmenting Principles, Identifying New Directions and Forming Application-Oriented Implications
Speaker: Witold Pedrycz, Fellow of the Royal Society of Canada
Abstract: Clustering has for decades been a focal point of study, often researched in relation to modeling, pattern classification, and data analysis. With the advent of data analytics, which brings a suite of new problems, clustering has undergone a visible paradigm shift. Granular clustering, a term that has come into use recently, emphasizes the role of clustering as a sound vehicle for constructing information granules: entities aimed at building abstract yet flexible and adjustable views of data, facilitating the processing of masses of data, and subsequently constructing interpretable models.
Granular clustering can be approached from two general points of view; in this talk, those perspectives are carefully analyzed, along with a formulation of their far-reaching ramifications. The first view is concerned with the formation of information granules on the basis of predominantly numeric (non-granular) data. The alternative view stresses the clustering of granular data themselves. Hybrids of these views are also investigated.
In the setting of data analytics, there are several well-articulated and emerging challenges. Objective function-based clustering techniques return a small number of numeric representatives (prototypes) of big data. This raises the question of the representation capabilities of the prototypes. One line of research augments the numeric prototypes with their granular generalizations (viz. granular prototypes) and optimizes their ability to capture the essence of the data. We discuss a direction of research aimed at building optimal granular prototypes and characterizing them. It is shown that some clustering techniques exhibiting a great deal of flexibility (such as DBSCAN) still require a concise characterization of their comprehensive results, which comes in the form of granular prototypes. The impact on ensuing modeling (viz. modeling that exploits granular data) is discussed.
Clustering techniques are commonly concerned with the formation of direction-free (relational) constructs such as those used in association (linkage) analysis. Accommodating the aspect of directionality, which is required in various modeling tasks, entails another wave of pursuits referred to as direction-sensitive clustering.
A distributed data environment, in which various sources of data are to be analyzed en bloc, calls for clustering realized in the space of information granules. We discuss a concept of double-level clustering in which the concept of tensor distance is used to capture the interrelationships between granular data encountered in the distributed environment.