Advances In Categorical Data Clustering PDF Download

Advances in Categorical Data Clustering PDF
Author: Yiqun Zhang
Publisher:
ISBN:
Size: 77.65 MB
Format: PDF, Kindle
Category : Cluster analysis
Languages : en
Pages : 145
View: 5615

Advances In Categorical Data Clustering Book Description:

Categorical data are common in various research areas, and clustering is a prevalent technique used to analyse them. However, categorical data clustering faces two challenging problems. The first is that most categorical data distance metrics were proposed for nominal data (i.e., a categorical data set that comprises only nominal attributes), ignoring the fact that ordinal attributes are also common in categorical data sets. As a result, these nominal data distance metrics cannot account for the order information of ordinal attributes and may thus inappropriately measure the distances for ordinal data (i.e., a categorical data set that comprises only ordinal attributes) and mixed categorical data (i.e., a categorical data set that comprises both ordinal and nominal attributes). The second problem is that most hierarchical clustering approaches were designed for numerical data and have very high computation costs; that is, time complexity O(N^2) for a data set with N data objects. These issues present major obstacles to the clustering analysis of categorical data.

To address the ordinal data distance measurement problem, we study the characteristics of the ordered possible values (also called 'categories' interchangeably in this thesis) of ordinal attributes and propose a novel ordinal data distance metric, the Entropy-Based Distance Metric (EBDM), to quantify the distances between ordinal categories. The EBDM adopts cumulative entropy to indicate the amount of information in the ordinal categories, and simulates the thinking process of changing one's mind between two ordered choices to quantify the distances according to that amount of information. The EBDM thus considers both the order relationship and the statistical information of the ordinal categories for more appropriate distance measurement. Experimental results illustrate the superiority of the proposed EBDM in ordinal data clustering.

In addition to designing an ordinal data distance metric, we further propose a unified categorical data distance metric that is suitable for distance measurement of all three types of categorical data (i.e., ordinal data, nominal data, and mixed categorical data). This extended version uniformly defines distances and attribute weights for both ordinal and nominal attributes, so that the distances measured for the two types of attributes of a mixed categorical data set can be directly combined to obtain the overall distances between data objects with no information loss. Extensive experiments on all three types of categorical data sets demonstrate the effectiveness of the unified distance metric in clustering analysis of categorical data.

To address the hierarchical clustering problem of large-scale categorical data, we propose a fast hierarchical clustering framework called Growing Multi-layer Topology Training (GMTT). The most significant merit of this framework is its ability to reduce the time complexity of most existing hierarchical clustering frameworks (i.e., O(N^2)) to O(N^1.5) without sacrificing the quality (i.e., clustering accuracy and hierarchical details) of the constructed hierarchy. By design, the GMTT framework is applicable to categorical data clustering simply by adopting a categorical data distance metric. To make the GMTT framework suitable for processing streaming categorical data, we also provide an incremental version of GMTT that can dynamically adopt new inputs into the hierarchy via local updating. Theoretical analysis proves that the GMTT frameworks have time complexity O(N^1.5). Extensive experiments show the efficacy of the GMTT frameworks and demonstrate that they achieve more competitive categorical data clustering performance by adopting the proposed unified distance metric.
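As a rough illustration of the entropy-based idea for ordinal attributes mentioned in the description above, here is a minimal Python sketch. It is not the thesis's exact EBDM formula, which the description does not give. It assumes that the distance between two ordinal categories is the sum of the entropy terms -p*log(p) of every category lying between them in the order, endpoints included. The function name ordinal_entropy_distances and the sample data are hypothetical.

```python
import math
from collections import Counter


def ordinal_entropy_distances(values, order):
    """Pairwise distances between the ordered categories of one ordinal attribute.

    ASSUMPTION (not taken from the thesis): the distance between two categories
    is the sum of the entropy terms -p*log(p) of all categories from one to the
    other along `order`, endpoints included; identical categories have distance 0.
    """
    counts = Counter(values)
    n = len(values)

    # Entropy contribution of each category; unseen categories contribute 0.
    contrib = []
    for cat in order:
        p = counts.get(cat, 0) / n
        contrib.append(-p * math.log(p) if p > 0 else 0.0)

    dist = {}
    for i, a in enumerate(order):
        for j, b in enumerate(order):
            if i == j:
                dist[(a, b)] = 0.0
            else:
                lo, hi = min(i, j), max(i, j)
                dist[(a, b)] = sum(contrib[lo:hi + 1])
    return dist


# Hypothetical survey answers on a three-level ordinal scale.
answers = ["low", "low", "mid", "high"]
d = ordinal_entropy_distances(answers, order=["low", "mid", "high"])
print(d[("low", "mid")], d[("low", "high")])
```

Under this assumption, categories that are further apart in the order are never closer than adjacent ones, and the size of each step depends on how the observed values are distributed. A purely nominal measure such as simple matching would instead treat "low" vs "mid" and "low" vs "high" as equally distant.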

Advances In Intelligent Data Analysis VIII PDF Download

Advances in Intelligent Data Analysis VIII PDF
Author: Niall M. Adams
Publisher: Springer Science & Business Media
ISBN: 3642039146
Size: 69.75 MB
Format: PDF, ePub
Category : Computers
Languages : en
Pages : 418
View: 1307

Advances In Intelligent Data Analysis VIII Book Description:

This book constitutes the refereed proceedings of the 8th International Conference on Intelligent Data Analysis, IDA 2009, held in Lyon, France, August 31 – September 2, 2009. The 33 revised papers presented (18 full oral presentations and 15 poster and short oral presentations) were carefully reviewed and selected from almost 80 submissions. All current aspects of this interdisciplinary field are addressed, for example: interactive tools to guide and support data analysis in complex scenarios, the increasing availability of automatically collected data, tools that intelligently support and assist human analysts, how to control clustering results, and isotonic classification trees. In general, the areas covered include statistics, machine learning, data mining, classification and pattern recognition, clustering, applications, modeling, and interactive dynamic data visualization.

Advances In Knowledge Discovery And Data Mining PDF Download

Advances in Knowledge Discovery and Data Mining PDF
Author: Joshua Zhexue Huang
Publisher: Springer
ISBN: 364220841X
Size: 31.54 MB
Format: PDF, Docs
Category : Computers
Languages : en
Pages : 564
View: 6386

Advances In Knowledge Discovery And Data Mining Book Description:

The two-volume set LNAI 6634 and 6635 constitutes the refereed proceedings of the 15th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2011, held in Shenzhen, China in May 2011. The total of 32 revised full papers and 58 revised short papers were carefully reviewed and selected from 331 submissions. The papers present new ideas, original research results, and practical development experiences from all KDD-related areas including data mining, machine learning, artificial intelligence and pattern recognition, data warehousing and databases, statistics, knowledge engineering, behavior sciences, visualization, and emerging areas such as social network analysis.

Advances In Intelligent Data Analysis VI PDF Download

Advances in Intelligent Data Analysis VI PDF
Author: A. Fazel Famili
Publisher: Springer Science & Business Media
ISBN: 3540287957
Size: 35.23 MB
Format: PDF, ePub, Mobi
Category : Business & Economics
Languages : en
Pages : 522
View: 4353

Advances In Intelligent Data Analysis VI Book Description:

This book constitutes the refereed proceedings of the 6th International Conference on Intelligent Data Analysis, IDA 2005, held in Madrid, Spain in September 2005. The 46 revised papers presented together with two tutorials and two invited talks were carefully reviewed and selected from 184 submissions. All current aspects of this interdisciplinary field are addressed; the areas covered include statistics, machine learning, data mining, classification and pattern recognition, clustering, applications, modeling, and interactive dynamic data visualization.

Advances In Knowledge Discovery And Data Mining PDF Download

Advances in Knowledge Discovery and Data Mining PDF
Author: Jian Pei
Publisher: Springer
ISBN: 3642374565
Size: 42.86 MB
Format: PDF, Kindle
Category : Computers
Languages : en
Pages : 588
View: 5472

Advances In Knowledge Discovery And Data Mining Book Description:

The two-volume set LNAI 7818 + LNAI 7819 constitutes the refereed proceedings of the 17th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2013, held in Gold Coast, Australia, in April 2013. The total of 98 papers presented in these proceedings was carefully reviewed and selected from 363 submissions. They cover the general fields of data mining and KDD extensively, including pattern mining, classification, graph mining, applications, machine learning, feature selection and dimensionality reduction, multiple information sources mining, social networks, clustering, text mining, text classification, imbalanced data, privacy-preserving data mining, recommendation, multimedia data mining, stream data mining, data preprocessing and representation.

Advances In Knowledge Discovery And Data Mining PDF Download

Advances in Knowledge Discovery and Data Mining PDF
Author: Hang Li
Publisher: Springer Science & Business Media
ISBN: 3540717005
Size: 47.63 MB
Format: PDF, ePub, Mobi
Category : Computers
Languages : en
Pages : 1161
View: 3320

Advances In Knowledge Discovery And Data Mining Book Description:

This book constitutes the refereed proceedings of the 11th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2007, held in Nanjing, China in May 2007. The 34 revised full papers and 92 revised short papers presented together with 4 keynote talks or extended abstracts thereof were carefully reviewed and selected from 730 submissions. The papers are devoted to new ideas, original research results and practical development experiences from all KDD-related areas including data mining, machine learning, databases, statistics, data warehousing, data visualization, automatic scientific discovery, knowledge acquisition and knowledge-based systems.

Advances In Knowledge Discovery And Data Mining PDF Download

Advances in Knowledge Discovery and Data Mining PDF
Author: Takashi Washio
Publisher: Springer Science & Business Media
ISBN: 3540681248
Size: 42.90 MB
Format: PDF, ePub, Mobi
Category : Computers
Languages : en
Pages : 1102
View: 6705

Advances In Knowledge Discovery And Data Mining Book Description:

This book constitutes the refereed proceedings of the 12th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2008, held in Osaka, Japan, in May 2008. The 37 revised long papers, 40 revised full papers, and 36 revised short papers presented together with 1 keynote talk and 4 invited lectures were carefully reviewed and selected from 312 submissions. The papers present new ideas, original research results, and practical development experiences from all KDD-related areas including data mining, data warehousing, machine learning, databases, statistics, knowledge acquisition, automatic scientific discovery, data visualization, causal induction, and knowledge-based systems.

Advances In Visual Computing PDF Download

Advances in Visual Computing PDF
Author: George Bebis
Publisher: Springer Science & Business Media
ISBN: 3642103308
Size: 19.90 MB
Format: PDF, ePub
Category : Computers
Languages : en
Pages : 1117
View: 5486

Advances In Visual Computing Book Description:

The two-volume set LNCS 5875 and LNCS 5876 constitutes the refereed proceedings of the 5th International Symposium on Visual Computing, ISVC 2009, held in Las Vegas, NV, USA, in November/December 2009. The 97 revised full papers and 63 poster papers presented together with 40 full and 15 poster papers of 7 special tracks were carefully reviewed and selected from more than 320 submissions. The papers are organized in topical sections on computer graphics; visualization; feature extraction and matching; medical imaging; motion; virtual reality; face processing; reconstruction; detection and tracking; applications; and video analysis and event recognition. The 7 additional special tracks address issues such as object recognition; visual computing for robotics; computational bioimaging; 3D mapping, modeling and surface reconstruction; deformable models: theory and applications; visualization enhanced data analysis for health applications; and optimization for vision, graphics and medical imaging: theory and applications.

Advances In Knowledge Discovery And Data Mining PDF Download

Advances in Knowledge Discovery and Data Mining PDF
Author: Ming-Syan Chen
Publisher: Springer Science & Business Media
ISBN: 3540437045
Size: 60.88 MB
Format: PDF, ePub, Docs
Category : Computers
Languages : en
Pages : 568
View: 264

Advances In Knowledge Discovery And Data Mining Book Description:

This book constitutes the refereed proceedings of the 6th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2002, held in Taipei, Taiwan, in May 2002. The 32 revised full papers and 20 short papers presented together with 4 invited contributions were carefully reviewed and selected from a total of 128 submissions. The papers are organized in topical sections on association rules; classification; interestingness; sequence mining; clustering; Web mining; semi-structure and concept mining; data warehouse and data cube; bio-data mining; temporal mining; and outliers, missing data, and causation.