codemagic

Expanding our palette of default kernels in scikit-learn

1/31/2014


 
I was working on simplified action descriptors for action detection using bounded dense trajectories, where the final step was to predict the action by training an SVM on the action descriptors. Since all of my code was in Python, I wanted a Python-based interface to libsvm, which I found in scikit-learn.
                 
The problem was that scikit-learn's SVM ships with only a few built-in kernels:
  • linear
  • polynomial
  • rbf
  • sigmoid

I wanted the chi-square kernel, which is a very common choice for histogram data, and at first I could not work out how to construct a custom kernel. After some browsing and a read through the scikit-learn documentation, I found that there are several ways to use other kinds of kernels.

The first way is to use scikit-learn's pairwise metrics. Apart from the kernels mentioned above, they provide two chi-squared kernels:

metrics.pairwise.additive_chi2_kernel(X[, Y])   Computes the additive chi-squared kernel between observations in X and Y.
metrics.pairwise.chi2_kernel(X[, Y, gamma])   Computes the exponential chi-squared kernel between X and Y.
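
To make the two definitions concrete, here is a small sketch that recomputes the additive kernel by hand and compares it with scikit-learn's output. The helper manual_additive_chi2 and the toy histograms are my own illustration, not part of scikit-learn, and the check assumes non-negative input, which is what these kernels expect.

```python
import numpy as np
from sklearn.metrics.pairwise import additive_chi2_kernel, chi2_kernel


def manual_additive_chi2(X, Y):
    """Hypothetical helper: k(x, y) = -sum_i (x_i - y_i)^2 / (x_i + y_i)."""
    X = np.asarray(X, dtype=float)[:, None, :]  # shape (n_x, 1, d)
    Y = np.asarray(Y, dtype=float)[None, :, :]  # shape (1, n_y, d)
    num = (X - Y) ** 2
    den = X + Y
    # Bins that are empty in both histograms contribute nothing (avoids 0/0).
    terms = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return -terms.sum(axis=2)


# Three toy histograms with three bins each (made-up values).
hists = np.array([[0.2, 0.3, 0.5],
                  [0.1, 0.1, 0.8],
                  [0.0, 0.5, 0.5]])

K_add = additive_chi2_kernel(hists)      # additive chi-squared kernel
K_exp = chi2_kernel(hists, gamma=0.5)    # exponential chi-squared kernel
K_manual = manual_additive_chi2(hists, hists)
```

The exponential kernel is the exponential of gamma times the additive one, so a histogram compared with itself gives 0 for the additive kernel and 1 for the exponential kernel.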

Usage:

>>> from sklearn import svm
>>> from sklearn.metrics.pairwise import chi2_kernel
>>> # X was missing from the original snippet; a small non-negative toy matrix is assumed
>>> X = [[0], [1], [2], [3]]
>>> Y = [0, 1, 2, 3]
>>> # a callable kernel receives two sample matrices and returns their Gram matrix
>>> clf = svm.SVC(kernel=chi2_kernel)
>>> clf.fit(X, Y)
>>> test = [[0], [1], [1], [2]]
>>> clf.predict(test)
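
Another option along the same lines is to compute the Gram matrix yourself and pass it to an SVC created with kernel="precomputed". The sketch below assumes made-up 8-bin histograms drawn from two Dirichlet distributions; the data, the split and the gamma value are illustrative only.

```python
import numpy as np
from sklearn.metrics.pairwise import chi2_kernel
from sklearn.svm import SVC

rng = np.random.RandomState(0)

# Toy data (assumption): two classes of 8-bin histograms whose mass
# sits in different bins. Each row sums to 1 and is non-negative.
class_a = rng.dirichlet([5, 5, 1, 1, 1, 1, 1, 1], size=20)
class_b = rng.dirichlet([1, 1, 1, 1, 1, 1, 5, 5], size=20)

X_train = np.vstack([class_a[:15], class_b[:15]])
y_train = np.array([0] * 15 + [1] * 15)
X_test = np.vstack([class_a[15:], class_b[15:]])

# Training Gram matrix: train samples against train samples.
K_train = chi2_kernel(X_train, X_train, gamma=1.0)
# Test Gram matrix: rows are test samples, columns are TRAIN samples.
K_test = chi2_kernel(X_test, X_train, gamma=1.0)

clf = SVC(kernel="precomputed")
clf.fit(K_train, y_train)
pred = clf.predict(K_test)
```

Precomputing is handy when the same Gram matrix is reused across several values of C, since the kernel is then evaluated only once.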

The second way is to use kernel approximations:

kernel_approximation.SkewedChi2Sampler([...])   Approximates the feature map of the “skewed chi-squared” kernel by Monte Carlo approximation of its Fourier transform.
 
Usage:

>>> from sklearn.kernel_approximation import SkewedChi2Sampler
>>> from sklearn.linear_model import SGDClassifier
>>> X = [[0, 0], [1, 1], [1, 0], [0, 1]]
>>> y = [0, 0, 1, 1]
>>> # SkewedChi2Sampler takes skewedness and n_components (values here are illustrative);
>>> # sample_steps and sample_interval belong to AdditiveChi2Sampler
>>> chi2_feature = SkewedChi2Sampler(skewedness=0.01, n_components=10, random_state=0)
>>> X_features = chi2_feature.fit_transform(X)
>>> clf = SGDClassifier()
>>> clf.fit(X_features, y)
>>> # new samples must go through the same feature map before prediction
>>> clf.predict(chi2_feature.transform([[1, 0], [1, 1]]))
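
For histogram features, the additive chi-squared kernel has its own approximation, AdditiveChi2Sampler, which is the class that accepts sample_steps and sample_interval. The following is a sketch only: the toy histograms and the SGDClassifier settings are assumptions, and the pipeline is just one convenient way to keep the feature map and the linear classifier together.

```python
import numpy as np
from sklearn.kernel_approximation import AdditiveChi2Sampler
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.RandomState(0)

# Toy non-negative histograms (assumption): two classes, four bins each.
X = np.vstack([rng.dirichlet([5, 5, 1, 1], size=25),
               rng.dirichlet([1, 1, 5, 5], size=25)])
y = np.array([0] * 25 + [1] * 25)

# The sampler maps each histogram to an explicit feature vector whose
# dot products approximate the additive chi-squared kernel; a linear
# classifier is then trained on those features.
model = make_pipeline(
    AdditiveChi2Sampler(sample_steps=2),
    SGDClassifier(max_iter=1000, tol=1e-3, random_state=0),
)
model.fit(X, y)
pred = model.predict(X[:5])
```

Because the pipeline applies the feature map inside predict, new samples can be passed in their original histogram form.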

You can see how these kernels are used in one of my projects here.


.mat files: reading and creating in Python

1/20/2014


 
There are several versions of the .mat file format: v4, v6, v7 and v7.3. The significant difference between the versions is the way the data is stored inside them.
      -v4, -v6 : These versions store the data uncompressed. v4 supports two-dimensional double, character and sparse arrays; v6 extends this to N-dimensional arrays, cell arrays and structure arrays.
      -v7 : From this version on, MATLAB compresses the data. Compression and decompression slow down saving and loading, but the files take much less space on disk.
      -v7.3 : This version uses an HDF5-based format that stores the data in compressed chunks. The time required to load the data depends on how the data is laid out across the chunks.
The MathWorks page on MAT-file versions (see the additional references at the end of this post) has detailed information on the versions and their feature lists.
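
If you do not know which version a file was saved with, one rough way to find out is to look at the text at the start of the file. The sketch below assumes that v6 and v7 files begin with "MATLAB 5.0 MAT-file" and v7.3 files with "MATLAB 7.3 MAT-file", and that v4 files carry no such text header; guess_mat_flavor is a hypothetical helper name, not a library function.

```python
def guess_mat_flavor(path):
    """Guess which reader a .mat file needs from its leading header text.

    Returns "hdf5" for v7.3 files, "scipy" for v6/v7 files, and
    "unknown" otherwise (for example v4 files, which have no text header).
    """
    with open(path, "rb") as fh:
        head = fh.read(19)  # length of the marker strings below
    if head.startswith(b"MATLAB 7.3 MAT-file"):
        return "hdf5"
    if head.startswith(b"MATLAB 5.0 MAT-file"):
        return "scipy"
    return "unknown"
```

A call such as guess_mat_flavor('test.mat') can then decide between the two reading approaches described in this post.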

You can create a mat file of any version using the "save" command in MATLAB:

save(filename,variables,version) saves to the MAT-file version specified by version. The variables argument is optional.

e.g.
A = rand(5); 
B = magic(10); 
save('example.mat','A','B','-v7.3')

Because of these different versions, reading a mat file can become a complicated task.
Here I describe two ways to read and create a mat file in Python.

Matlab -v4, -v6, -v7
These versions can be read and written with scipy.io:

loadmat(file_name[, mdict, appendmat])   Load MATLAB file
savemat(file_name, mdict[, appendmat, ...])  Save a dictionary of names and arrays into a MATLAB-style .mat file.


e.g.
#!/usr/bin/env python
from scipy.io import loadmat, savemat
x = loadmat('test.mat')  # dict mapping variable names to arrays
lon = x['lon']
lat = x['lat']
# one-liner to read a single variable
lon = loadmat('test.mat')['lon']
# change a variable and save everything to a new file
x['lon'] = 'clon'
savemat('changetest.mat', x)
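
A short round trip shows what loadmat actually hands back. This is a sketch with made-up variable values written to a temporary file; the points to notice are the extra "__" metadata keys in the returned dictionary and the fact that a one-dimensional array comes back as a 1-by-N matrix unless squeeze_me=True is passed.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

# Write two made-up variables to a temporary .mat file.
path = os.path.join(tempfile.mkdtemp(), "demo.mat")
savemat(path, {
    "lon": np.array([10.0, 20.0, 30.0]),
    "lat": np.array([[1.0, 2.0], [3.0, 4.0]]),
})

raw = loadmat(path)                    # 1-D data comes back as shape (1, 3)
flat = loadmat(path, squeeze_me=True)  # unit dimensions are squeezed away

# loadmat adds metadata entries such as __header__; filter them out
# to get only the variables that were saved.
user_vars = sorted(k for k in raw if not k.startswith("__"))
```

Filtering the metadata keys before calling savemat again keeps the rewritten file limited to the actual variables.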

Matlab -v7.3
Since the data is stored as HDF5 chunks, we need the python-tables or python-h5py package, which lets Python access HDF5 files. You can install them with apt-get or download them from their websites:
PyTables
h5py

e.g.
#!/usr/bin/env python
import tables
# PyTables 3.x names; releases before 3.0 used openFile / getNode
h5file = tables.open_file('test.mat')
lon = h5file.root.lon[:]
lat = h5file.root.lat[:]
# Alternate syntax if the variable name is in a string
varname = 'lon'
lon = h5file.get_node('/' + varname)[:]
h5file.close()
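
The two approaches can be combined into one loader. The sketch below relies on scipy.io.loadmat raising NotImplementedError for v7.3 files and then falls back to h5py; load_mat_any is a hypothetical helper name, and the fallback only reads plain top-level datasets, so structs and cell arrays inside a v7.3 file would need extra handling.

```python
from scipy.io import loadmat


def load_mat_any(path):
    """Hypothetical helper: return {variable name: array} for a .mat file."""
    try:
        data = loadmat(path)
        # Drop the __header__, __version__ and __globals__ metadata entries.
        return {k: v for k, v in data.items() if not k.startswith("__")}
    except NotImplementedError:
        # scipy cannot read v7.3 files; treat the file as HDF5 instead.
        import h5py
        import numpy as np

        with h5py.File(path, "r") as f:
            # Only plain top-level datasets are copied out here.
            return {k: np.array(v) for k, v in f.items()
                    if isinstance(v, h5py.Dataset)}
```

Arrays read through the HDF5 route usually appear with their dimensions reversed compared with MATLAB, so a transpose may be needed.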

Additional references
http://wiki.scipy.org/Cookbook/Reading_mat_files
http://docs.scipy.org/doc/scipy/reference/tutorial/io.html
http://www.mathworks.com/help/matlab/import_export/mat-file-versions.html



    Author

    Kaushal Bondada
