Public Software

The CAU AI Lab makes several data mining software packages available to the public. These are machine learning toolkits for a range of major data mining problems. All of these distributions are licensed under the GNU General Public License for non-commercial and research use. Each package in this section focuses on a specific research goal, and most are described in greater detail in one of our publications.


GK Clustering Method

A Java implementation of the Gustafson-Kessel (GK) algorithm for clustering microarray gene expression data. This program detects gene clusters of different shapes in a data set by exploiting an adaptive distance norm. [Bioinformatics (2005), Download]
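To illustrate what an adaptive distance norm means here, the following is a minimal Python sketch of the squared distance used in the standard Gustafson-Kessel formulation, restricted to two dimensions. It is not taken from the Java distribution; the function name `gk_distance_sq` and the parameter `rho` (the cluster volume constraint) are assumptions made for this example.

```python
import math

def gk_distance_sq(x, center, cov, rho=1.0):
    """Squared GK distance of a 2-D point x from a cluster (illustrative sketch).

    cov is the cluster's 2x2 fuzzy covariance matrix F. The norm-inducing
    matrix is A = (rho * det(F)) ** (1/n) * inverse(F), with n = 2 here,
    so that det(A) == rho regardless of the cluster's shape.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    scale = math.sqrt(rho * det)  # (rho * det F) ** (1/2) for n = 2
    # Closed-form inverse of a 2x2 matrix, multiplied by the scale factor.
    A = [[scale * d / det, -scale * b / det],
         [-scale * c / det, scale * a / det]]
    dx = [x[0] - center[0], x[1] - center[1]]
    return (dx[0] * (A[0][0] * dx[0] + A[0][1] * dx[1])
            + dx[1] * (A[1][0] * dx[0] + A[1][1] * dx[1]))

# A cluster stretched along the first axis: points along that axis count
# as closer than points the same Euclidean distance away on the other axis.
elongated = [[4.0, 0.0], [0.0, 1.0]]
print(gk_distance_sq((2, 0), (0, 0), elongated))
print(gk_distance_sq((0, 2), (0, 0), elongated))
```

With an identity covariance matrix the function reduces to the squared Euclidean distance, which is why a fixed norm can only find spherical clusters.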


CIAO: Clustering Incomplete data using Alternating Optimization

A Java implementation of the CIAO algorithm, which finds clusters in incomplete gene expression data. This program detects clusters of incomplete microarray data without the use of imputation. [Bioinformatics (2007), Download]


SICAGO: SemI-supervised Clustering Analysis using Gene Ontology

A C# implementation of the SICAGO system, which helps to discover groups of genes more effectively by exploiting the semantic distance between gene pairs, using prior knowledge extracted from Gene Ontology. [Bioinformatics (2010), Download]


Classifying Categorical Data using AHD

A Matlab implementation of nearest neighbor classification using Adaptive Hamming Distance (AHD), which finds equivalent categorical value pairs to improve classification performance on a given categorical data set. [IEICE Trans. Inf. & Syst. (2010), Download]
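As a rough illustration of the idea, the Python sketch below runs a 1-nearest-neighbor classifier with Hamming distance on categorical tuples. It is not the Matlab code from the download: the `equivalent` argument is a hypothetical stand-in for a set of value pairs that are treated as matching, and how such pairs are actually found is not reproduced here.

```python
def hamming(a, b, equivalent=frozenset()):
    """Count positions where a and b differ.

    Value pairs listed in `equivalent` (as frozensets of two values) are
    treated as matches. This parameter is an assumption for illustration.
    """
    return sum(
        1
        for u, v in zip(a, b)
        if u != v and frozenset((u, v)) not in equivalent
    )

def nearest_neighbor_label(train, labels, query, equivalent=frozenset()):
    """Return the label of the training sample closest to the query."""
    best = min(range(len(train)),
               key=lambda i: hamming(train[i], query, equivalent))
    return labels[best]

train = [("red", "small", "round"), ("blue", "large", "flat")]
labels = ["A", "B"]
query = ("crimson", "small", "round")

# Plain Hamming distance counts "red" vs "crimson" as a mismatch.
print(hamming(train[0], query))
# Declaring the two values equivalent removes that mismatch.
pairs = {frozenset(("red", "crimson"))}
print(hamming(train[0], query, pairs))
print(nearest_neighbor_label(train, labels, query, pairs))
```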


Classification Based on Predictive Association Rules of Incomplete Data

A Matlab implementation of associative classification that predicts the class label using association rules for incomplete data. [IEICE Trans. Inf. & Syst. (2012), Download]


Efficient Multivariate Feature Filter using Conditional Mutual Information

A Matlab implementation of a fast feature filter that finds a feature subset maximizing relevance to the target class while minimizing redundancy between features. [Elec. Lett. (2012), Download]
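The relevance-versus-redundancy trade-off can be sketched in Python as a greedy filter over discrete features. Note that this example uses a simpler pairwise criterion (relevance minus mean redundancy, in the style of mRMR) rather than the conditional mutual information criterion of the published method; the function names and toy data are assumptions for illustration only.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c * n) / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )

def greedy_filter(features, target, k):
    """Pick k feature names, trading relevance against redundancy."""
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        def score(name):
            relevance = mutual_information(features[name], target)
            if not selected:
                return relevance
            redundancy = sum(
                mutual_information(features[name], features[s])
                for s in selected
            ) / len(selected)
            return relevance - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

target = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "a": [0, 0, 0, 1, 1, 1, 1, 1],  # informative about the target
    "b": [0, 0, 0, 1, 1, 1, 1, 1],  # exact duplicate of "a"
    "c": [0, 1, 0, 1, 0, 1, 0, 1],  # nearly independent of "a"
}
print(greedy_filter(features, target, 2))
```

In this toy example the duplicate feature "b" is as relevant as "a" but fully redundant with it, so the second pick goes to "c" instead.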


Approximating Mutual Information for Multi-label Feature Selection

A Matlab implementation of approximated mutual information for multi-label feature selection that maximizes dependency on the target labels and minimizes inter-dependency among features. [Elec. Lett. (2012), Download]


Toward Multiple Emotion Classification from Musical Audio Signals

Two music data sets of musical features and multiple emotions; each piece of music can be assigned up to four different emotions. [Dataset Download]


Feature Selection for Multi-label Classification using Multivariate Mutual Information

A Matlab implementation of multivariate feature selection for multi-label data sets. [Patt. Recog. Lett. (2013), Download]


Efficient Dynamic Time Warping for 3D Handwriting Recognition using Gyroscope Equipped Smartphones

We are studying efficient dynamic time warping for 3D handwriting recognition using gyroscope-equipped smartphones. The 3D handwriting dataset is publicly available. [Dataset Download]
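For readers unfamiliar with dynamic time warping, here is a minimal Python sketch of the classic quadratic-time algorithm applied to sequences of 3D samples. It is the textbook baseline, not the efficient variant under study, and the sample layout (one 3-tuple per time step) is an assumption rather than the format of the dataset.

```python
import math

def dtw_distance(a, b):
    """Classic DTW cost between two sequences of equal-dimension points."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats a sample
                cost[i][j - 1],      # b advances, a repeats a sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

# The same stroke written at two speeds aligns with zero cost.
slow = [(0, 0, 0), (1, 0, 0), (1, 0, 0), (2, 0, 0)]
fast = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
print(dtw_distance(fast, slow))
```

Because the warping path may repeat samples, two recordings of the same gesture made at different speeds can still be matched, which a point-by-point comparison cannot do.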


Memetic Feature Selection Algorithm for Multi-label Classification

A Matlab implementation of memetic feature selection for multi-label classification. [Inform. Sci. (2015), Download]


Cave: Experimental Maps for Robot Path Planning

Twenty cave-shaped maps for robot path planning; each map is represented as a 500 x 500 grid of binary values. [Dataset Download]
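As an example of how a binary map of this kind might be used, the Python sketch below finds the shortest 4-connected path length on a small grid with breadth-first search. The encoding (0 for free space, 1 for obstacle) and the in-memory list-of-lists layout are assumptions; the actual file format and cell encoding should be checked against the download.

```python
from collections import deque

def shortest_path_length(grid, start, goal):
    """Steps on the shortest 4-connected path, or None if unreachable.

    Assumes grid[r][c] == 0 means free space and 1 means obstacle.
    """
    rows, cols = len(grid), len(grid[0])
    if grid[start[0]][start[1]] != 0 or grid[goal[0]][goal[1]] != 0:
        return None
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if (r, c) == goal:
            return dist[(r, c)]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in dist):
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return None

# A toy 3 x 3 map; a real map from the dataset would be 500 x 500.
toy_map = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
print(shortest_path_length(toy_map, (0, 0), (2, 0)))
```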


Hermes: MIR library for Android

A music information retrieval library for Android (MFCCs, SPL, onset, pitch, etc.). [Download]


Fast Multi-label Feature Selection Algorithm

A Matlab implementation of fast feature selection for multi-label data sets. [Patt. Recog. (2015), Download]