This notebook illustrates the use of decision trees in Shogun for classification and regression. Various decision tree learning algorithms like ID3, C4.5, CART and CHAID are discussed in detail using both intuitive toy datasets and real-world datasets.
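
As a taste of the API, here is a minimal CART sketch, assuming the modshogun Python bindings; the toy data, the attribute-type flags and the parameter choices are illustrative, not the notebook's exact setup.

```python
import numpy as np
from modshogun import RealFeatures, MulticlassLabels, CARTree, PT_MULTICLASS

# toy data: two continuous attributes, three classes (one column per example)
X = np.hstack([np.random.randn(2, 30) + 4 * c for c in range(3)])
y = np.repeat([0.0, 1.0, 2.0], 30)

feats = RealFeatures(X)
labels = MulticlassLabels(y)

# one boolean per attribute: False marks it as continuous, True as nominal
cart = CARTree(np.array([False, False]), PT_MULTICLASS)
cart.set_labels(labels)
cart.train(feats)

predictions = cart.apply_multiclass(feats)
```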

This notebook illustrates the training of a factor graph model using structured SVM in Shogun. We begin by giving a brief outline of factor graphs and structured output learning followed by the corresponding API in Shogun. Finally, we test the scalability by performing an experiment on a real OCR data set for handwritten character recognition.

This notebook is about learning and using Gaussian Mixture Models (GMM) in Shogun. Below, we demonstrate how to use them for sampling, for density estimation via Expectation Maximisation (EM), and for clustering.
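
A minimal sketch of EM training and sampling, assuming the modshogun bindings; the two-blob toy data and the number of components are illustrative.

```python
import numpy as np
from modshogun import RealFeatures, GMM

# two well-separated 2D blobs, one column per point
X = np.hstack((np.random.randn(2, 200) - 3, np.random.randn(2, 200) + 3))

gmm = GMM(2)                      # mixture with two components
gmm.set_features(RealFeatures(X))
gmm.train_em()                    # maximum-likelihood fit via EM (default tolerances)

sample = gmm.sample()             # draw one point from the fitted density
```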

This notebook is about document classification in Shogun. After providing a semi-formal introduction to the Bag of Words model and its limitations, we illustrate the hashing trick. This is consolidated by performing experiments on the large-scale webspam data set.
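
A hedged sketch of the hashing trick, assuming the modshogun bindings; the toy documents, the 8-bit hash size and the choice of SVMOcas as the linear learner are illustrative assumptions, not the notebook's exact setup.

```python
import numpy as np
from modshogun import StringCharFeatures, RAWBYTE, NGramTokenizer
from modshogun import HashedDocDotFeatures, BinaryLabels, SVMOcas

docs = ["buy cheap pills now", "limited offer click here",
        "meeting agenda for tomorrow", "quarterly report attached"]
y = np.array([1.0, 1.0, -1.0, -1.0])   # +1 = spam, -1 = ham

string_feats = StringCharFeatures(docs, RAWBYTE)
# hash character 3-grams into a 2^8-dimensional feature space
hashed_feats = HashedDocDotFeatures(8, string_feats, NGramTokenizer(3), False)

svm = SVMOcas(0.1, hashed_feats, BinaryLabels(y))   # linear SVM on hashed features
svm.train()
```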

In this notebook we will see how machine learning problems are generally represented and solved in Shogun. As a primer to Shogun's many capabilities, we will see how various types of data and their attributes are handled, and also how prediction is done.

This notebook demonstrates clustering with KMeans in Shogun along with its initialization and training. The initialization of cluster centres is shown manually, randomly and using the KMeans++ algorithm. Training is done via the classical Lloyd's and mini-batch KMeans methods. The algorithm is then applied to a real world data set. Furthermore, the effect of dimensionality reduction using PCA on the KMeans algorithm is analysed.
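
A minimal KMeans sketch, assuming the modshogun bindings; the blob data and k = 3 are illustrative.

```python
import numpy as np
from modshogun import RealFeatures, EuclideanDistance, KMeans

# three Gaussian blobs in 2D, one column per point
X = np.hstack([np.random.randn(2, 50) + 5 * c for c in range(3)])
feats = RealFeatures(X)

kmeans = KMeans(3, EuclideanDistance(feats, feats))   # k = 3, Lloyd's algorithm
# KMeans(3, distance, True) would request KMeans++ seeding instead
kmeans.train()

centers = kmeans.get_cluster_centers()
assignments = kmeans.apply()          # cluster index of each training point
```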

This notebook illustrates the K-Nearest Neighbors (KNN) algorithm on the USPS digit recognition dataset in Shogun. Further, the effect of Cover Trees on speed is illustrated by comparing KNN with and without it. Finally, a comparison with Multiclass Support Vector Machines is shown.
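
The basic KNN calls look roughly like this under the modshogun bindings; the random stand-in for the USPS digits and k = 3 are illustrative.

```python
import numpy as np
from modshogun import RealFeatures, MulticlassLabels, EuclideanDistance, KNN

# random stand-in for USPS: 16x16 images flattened into 256-dim columns
Xtrain = np.random.rand(256, 100)
ytrain = np.array(np.arange(100) % 10, dtype=np.float64)

feats = RealFeatures(Xtrain)
labels = MulticlassLabels(ytrain)

knn = KNN(3, EuclideanDistance(feats, feats), labels)   # k = 3 neighbours
knn.train()

predictions = knn.apply_multiclass(RealFeatures(np.random.rand(256, 10)))
```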

This notebook is on using the Shogun Machine Learning Toolbox for kernel density estimation (KDE). We start with a brief overview of KDE. Then we demonstrate the use of Shogun's $KernelDensity$ class on a toy example. Finally, we apply KDE to a real world example, thus demonstrating its prowess as a non-parametric statistical method.
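
A minimal sketch of the $KernelDensity$ class, assuming the modshogun bindings; the 1D data, the bandwidth and the KD-tree evaluation mode are illustrative choices.

```python
import numpy as np
from modshogun import RealFeatures, KernelDensity
from modshogun import K_GAUSSIAN, D_EUCLIDEAN, EM_KDTREE_SINGLE

# 1D samples from a standard normal, one column per sample
X = np.random.randn(1, 200)

# Gaussian kernel, Euclidean distance, single-tree KD-tree evaluation
kde = KernelDensity(0.5, K_GAUSSIAN, D_EUCLIDEAN, EM_KDTREE_SINGLE)
kde.train(RealFeatures(X))

query = np.linspace(-3.0, 3.0, 50).reshape(1, 50)
log_density = kde.get_log_density(RealFeatures(query))
```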

This notebook illustrates classification and feature selection using metric learning in Shogun. To overcome the limitations of KNN with Euclidean distance as the distance measure, Large Margin Nearest Neighbour (LMNN) is discussed. This is consolidated by applying LMNN to the metagenomics data set.
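
The core LMNN calls, as a sketch against the modshogun bindings; the toy data and k = 3 target neighbours are assumptions.

```python
import numpy as np
from modshogun import RealFeatures, MulticlassLabels, LMNN

# two overlapping classes in 2D
X = np.hstack((np.random.randn(2, 50), np.random.randn(2, 50) + 2))
y = np.repeat([0.0, 1.0], 50)

lmnn = LMNN(RealFeatures(X), MulticlassLabels(y), 3)   # k = 3 target neighbours
lmnn.train()

L = lmnn.get_linear_transform()   # the learned metric is M = L^T L
```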

This notebook is about multiple kernel learning (MKL) in Shogun. We will see how to construct a combined kernel, determine optimal kernel weights using MKL, and use it for different types of classification and novelty detection.
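
A minimal MKL sketch, assuming the modshogun bindings; the toy problem and the three candidate bandwidths are illustrative.

```python
import numpy as np
from modshogun import RealFeatures, CombinedFeatures, BinaryLabels
from modshogun import GaussianKernel, CombinedKernel, MKLClassification

X = np.random.randn(2, 100)
y = np.sign(X[0] * X[1])          # a toy problem that is not linearly separable

feats = CombinedFeatures()
kernel = CombinedKernel()
for width in (0.1, 1.0, 10.0):    # one subkernel per candidate bandwidth
    feats.append_feature_obj(RealFeatures(X))
    kernel.append_kernel(GaussianKernel(10, width))
kernel.init(feats, feats)

mkl = MKLClassification()
mkl.set_mkl_norm(1)               # L1-norm regularization gives sparse weights
mkl.set_kernel(kernel)
mkl.set_labels(BinaryLabels(y))
mkl.train()

weights = kernel.get_subkernel_weights()   # learned kernel combination
```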

This notebook demonstrates various regression methods provided in Shogun. Linear models like Least Squares regression, Ridge regression and Least Angle regression, as well as kernel-based methods like Kernel Ridge regression, are discussed and applied to toy and real-life data.
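
For orientation, minimal linear and kernel regression sketches with the modshogun bindings; the noisy sine data and the regularizer tau = 0.1 are illustrative.

```python
import numpy as np
from modshogun import RealFeatures, RegressionLabels
from modshogun import LeastSquaresRegression, LinearRidgeRegression
from modshogun import GaussianKernel, KernelRidgeRegression

X = np.random.rand(1, 60) * 6
y = np.sin(X[0]) + 0.1 * np.random.randn(60)   # noisy sine wave

feats = RealFeatures(X)
labels = RegressionLabels(y)

lsr = LeastSquaresRegression(feats, labels)        # ordinary least squares
lsr.train()

ridge = LinearRidgeRegression(0.1, feats, labels)  # tau = 0.1 regularizer
ridge.train()

krr = KernelRidgeRegression(0.1, GaussianKernel(feats, feats, 1.0), labels)
krr.train()
predictions = krr.apply_regression(feats)
```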

This notebook details how to recognize a sudoku puzzle from a picture. We make use of simple image processing algorithms (edge detection, thresholding, ...) and character recognition using the K-Nearest Neighbors (KNN) algorithm, a very simple but effective algorithm for multi-class classification problems. The puzzle matrix is a 9x9 array containing the known digits 1-9, with 0 marking cells whose value is unknown.

This IPython notebook is divided into two parts: the first is related to computer vision and the second to machine learning. To complete this task we need a computer vision library; in this case we use the OpenCV library.

This notebook illustrates unsupervised learning using the suite of dimensionality reduction algorithms available in Shogun. Shogun provides access to all these algorithms using Tapkee, a C++ library specialized in dimensionality reduction.

This notebook illustrates the use of Random Forests in Shogun for classification and regression. We will understand the functioning of Random Forests, discuss the importance of its various parameters and appreciate the usefulness of this learning method.
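
A rough Random Forest sketch, assuming the modshogun bindings and the bagging-machine setters; constructor signatures have varied between Shogun versions, so treat the parameter plumbing as an assumption.

```python
import numpy as np
from modshogun import RealFeatures, MulticlassLabels, RandomForest, MajorityVote

X = np.random.randn(4, 120)
y = np.array(np.arange(120) % 3, dtype=np.float64)

feats = RealFeatures(X)
labels = MulticlassLabels(y)

forest = RandomForest()
forest.set_num_bags(100)             # number of trees in the ensemble
forest.set_num_random_features(2)    # attributes sampled at each split
forest.set_feature_types(np.array([False] * 4))   # all attributes continuous
forest.set_combination_rule(MajorityVote())       # trees vote on the class
forest.set_labels(labels)
forest.train(feats)

predictions = forest.apply_multiclass(feats)
```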

This notebook illustrates how to train and evaluate a deep autoencoder using Shogun.

This notebook illustrates Blind Source Separation (BSS) on audio signals using Independent Component Analysis (ICA) in Shogun. We generate a mixed signal and try to separate it using JADE, Shogun's implementation of ICA-based BSS.
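
A minimal JADE sketch, assuming the modshogun bindings; the two synthetic sources and the random mixing matrix stand in for the notebook's audio signals.

```python
import numpy as np
from modshogun import RealFeatures, Jade

# two toy sources mixed by a random 2x2 matrix; one observed channel per row
t = np.linspace(0, 10, 2000)
S = np.vstack((np.sin(2 * t), np.sign(np.cos(3 * t))))
X = np.dot(np.random.rand(2, 2), S)

jade = Jade()
separated = jade.apply(RealFeatures(X))

estimated_sources = separated.get_feature_matrix()  # recovered up to scale/order
mixing_matrix = jade.get_mixing_matrix()            # estimated mixing matrix
```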

This notebook illustrates Blind Source Separation (BSS) on images using Independent Component Analysis (ICA) in Shogun. This is very similar to the BSS audio notebook, except that here we use images instead of audio signals.

This notebook illustrates Blind Source Separation (BSS) on several time-synchronised electrocardiogram (ECG) recordings of a pregnant mother using Independent Component Analysis (ICA) in Shogun, with the goal of extracting the baby's ECG from them.

This notebook is about Bayesian regression and classification models with Gaussian Process (GP) priors in Shogun. After providing a semi-formal introduction, we illustrate how to efficiently train them, use them for predictions, and automatically learn parameters.
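
A minimal exact-GP regression sketch, assuming the modshogun bindings; the noisy sine data and the kernel width are illustrative.

```python
import numpy as np
from modshogun import RealFeatures, RegressionLabels
from modshogun import GaussianKernel, ZeroMean, GaussianLikelihood
from modshogun import ExactInferenceMethod, GaussianProcessRegression

X = np.random.rand(1, 40) * 6
y = np.sin(X[0]) + 0.1 * np.random.randn(40)   # noisy sine observations

feats = RealFeatures(X)
labels = RegressionLabels(y)

# exact inference: zero mean, Gaussian kernel, Gaussian observation noise
inf = ExactInferenceMethod(GaussianKernel(10, 1.0), feats, ZeroMean(),
                           labels, GaussianLikelihood())
gp = GaussianProcessRegression(inf)
gp.train()

Xtest = RealFeatures(np.linspace(0, 6, 100).reshape(1, 100))
mean = gp.get_mean_vector(Xtest)          # posterior predictive mean
variance = gp.get_variance_vector(Xtest)  # posterior predictive variance
```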

This notebook illustrates large-scale sparse Gaussian density likelihood estimation. It first introduces the reader to the mathematical background and then shows how one can do the estimation with Shogun on a number of real-world data sets.

This notebook describes Shogun's framework for statistical hypothesis testing. We begin by giving a brief outline of the problem setting and then describe various implemented algorithms. All the algorithms discussed here are for kernel two-sample testing with the Maximum Mean Discrepancy and are based on embedding probability distributions into Reproducing Kernel Hilbert Spaces (RKHS).
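
A rough sketch of a quadratic-time MMD test, assuming the legacy two-sample-testing interface of modshogun; the statistic/p-value call pattern has changed between Shogun versions, so take it as an approximation.

```python
import numpy as np
from modshogun import RealFeatures, GaussianKernel, QuadraticTimeMMD

# samples from p and q; identical distributions, so the null should hold
Xp = np.random.randn(1, 200)
Xq = np.random.randn(1, 200)

mmd = QuadraticTimeMMD(GaussianKernel(10, 1.0),
                       RealFeatures(Xp), RealFeatures(Xq))
statistic = mmd.compute_statistic()
p_value = mmd.compute_p_value(statistic)   # under the configured null approximation
```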

This notebook demonstrates the reduction of a multiclass problem into binary ones using Shogun. Here, we will describe the built-in One-vs-Rest, One-vs-One and Error Correcting Output Codes strategies.
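
For example, One-vs-Rest with a linear binary machine looks roughly like this under the modshogun bindings; the blob data is illustrative.

```python
import numpy as np
from modshogun import RealFeatures, MulticlassLabels
from modshogun import LibLinear, L2R_L2LOSS_SVC
from modshogun import LinearMulticlassMachine, MulticlassOneVsRestStrategy

X = np.hstack([np.random.randn(2, 40) + 4 * c for c in range(3)])
y = np.repeat([0.0, 1.0, 2.0], 40)

feats = RealFeatures(X)
labels = MulticlassLabels(y)

base = LibLinear(L2R_L2LOSS_SVC)     # the binary machine reused per class
base.set_bias_enabled(True)

mc = LinearMulticlassMachine(MulticlassOneVsRestStrategy(), feats, base, labels)
mc.train()

predictions = mc.apply_multiclass(feats)
```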

This notebook presents training for multi-label classification using the structured SVM framework in Shogun. We use the MultilabelModel class for multi-label classification.

We begin with a brief introduction to Multi-Label Structured Prediction [1], followed by the corresponding API in Shogun. Then we implement a toy example (for illustration) before getting to a real one. Finally, we evaluate multi-label classification on well-known datasets [2] and show that Shogun's [3] implementation delivers the same accuracy as scikit-learn with the same or better training time.

This notebook illustrates multiclass learning using Naive Bayes in Shogun. A semi-formal introduction to Logistic Regression is provided at the end.
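
A minimal sketch with the GaussianNaiveBayes class of the modshogun bindings; the two-blob toy data is illustrative.

```python
import numpy as np
from modshogun import RealFeatures, MulticlassLabels, GaussianNaiveBayes

# two Gaussian blobs in 2D, one column per example
X = np.hstack((np.random.randn(2, 50), np.random.randn(2, 50) + 3))
y = np.repeat([0.0, 1.0], 50)

nb = GaussianNaiveBayes(RealFeatures(X), MulticlassLabels(y))
nb.train()

predictions = nb.apply_multiclass(RealFeatures(np.random.randn(2, 10)))
```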

This notebook illustrates how to use the NeuralNets module to teach a neural network to recognize digits. It also explores the different optimization and regularization methods supported by the module.

This notebook is about finding Principal Components (PCA) of data (unsupervised) in Shogun. Its dimensionality reduction capabilities are further utilised to show its application in data compression, image processing and face recognition.
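
A minimal PCA sketch using the preprocessor interface of the modshogun bindings; the 5-dimensional random data and the target dimension of 2 are illustrative.

```python
import numpy as np
from modshogun import RealFeatures, PCA

X = np.random.randn(5, 100)   # 5-dimensional data, one column per example
feats = RealFeatures(X)

pca = PCA()
pca.set_target_dim(2)         # keep the two leading principal components
pca.init(feats)               # computes the data mean and the eigenbasis

projected = pca.apply_to_feature_matrix(feats)   # now a 2 x 100 matrix
```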

This notebook illustrates the evaluation of prediction algorithms in Shogun using cross-validation, and their parameter selection using grid-search. We demonstrate this on a toy example of binary classification using Support Vector Machines.
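
A minimal cross-validation sketch, assuming the modshogun bindings; the toy data, the SVM parameters, 5 folds and 10 runs are illustrative choices.

```python
import numpy as np
from modshogun import RealFeatures, BinaryLabels, GaussianKernel, LibSVM
from modshogun import CrossValidation, StratifiedCrossValidationSplitting
from modshogun import ContingencyTableEvaluation, ACCURACY

X = np.hstack((np.random.randn(2, 50) - 1, np.random.randn(2, 50) + 1))
y = np.repeat([-1.0, 1.0], 50)

feats = RealFeatures(X)
labels = BinaryLabels(y)

svm = LibSVM(1.0, GaussianKernel(10, 2.0), labels)   # C = 1, kernel width = 2

cv = CrossValidation(svm, feats, labels,
                     StratifiedCrossValidationSplitting(labels, 5),  # 5 folds
                     ContingencyTableEvaluation(ACCURACY))
cv.set_num_runs(10)    # average accuracy over 10 shuffled repetitions

result = cv.evaluate()
```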

What's New

Feb. 17, 2014 -> SHOGUN 3.2.0
Jan. 6, 2014 -> SHOGUN 3.1.1
Jan. 5, 2014 -> SHOGUN 3.1.0
Oct. 28, 2013 -> SHOGUN 3.0.0
Mar. 17, 2013 -> SHOGUN 2.1.0
Sep. 1, 2012 -> SHOGUN 2.0.0
Dec. 1, 2011 -> SHOGUN 1.1.0