You are currently browsing the monthly archive for March 2011.
Today the machine learning class began covering the theory of machine learning, so I want to collect some related materials here. I plan to update this post as the class progresses.
1. S. Boucheron, O. Bousquet, and G. Lugosi (2005). Theory of Classification: a Survey of Some Recent Advances. ESAIM: Probability and Statistics, 9:323–375. (PDF, POSTSCRIPT)
This paper surveys much of the theory behind classification, a central part of machine learning.
2. S. Boucheron, O. Bousquet, and G. Lugosi (2004). Concentration Inequalities. (PDF, POSTSCRIPT) In O. Bousquet, U. von Luxburg, and G. Rätsch (editors), Advanced Lectures in Machine Learning, Springer, pp. 208–240, 2004.
This is the content of the first lecture on the theory of machine learning. The author also has the following lecture notes on this topic:
Concentration-of-measure inequalities, presented at:
- the Machine Learning Summer School 2003, Australian National University, Canberra;
- the Workshop on Combinatorics, Probability and Algorithms 2003, held at the CRM in Montreal;
- the Winter School on Probabilistic Methods in High Dimension Phenomena, Toulouse, January 10–14, 2005;
- the Workshop de Combinatória e Concentração de Medida, IMPA, Rio de Janeiro, February 23–25, 2005. (PDF, POSTSCRIPT)
3. O. Bousquet, S. Boucheron, and G. Lugosi (2004). Introduction to Statistical Learning Theory. (PDF, POSTSCRIPT) In O. Bousquet, U. von Luxburg, and G. Rätsch (editors), Advanced Lectures in Machine Learning, Springer, pp. 169–207, 2004.
4. S. Kulkarni, G. Lugosi, and S. Venkatesh (1998). Learning Pattern Classification: A Survey. (POSTSCRIPT) 1948–1998 Special Commemorative Issue of IEEE Transactions on Information Theory, vol. 44, pp. 2178–2206. Reprinted in S. Verdú and S. W. McLaughlin (editors), Information Theory: 50 Years of Discovery, IEEE Press, New York, 1999.
So far all of these materials come from one professor, Gábor Lugosi.
For concentration inequalities, there is also a book chapter, Old and New Concentration Inequalities, by Fan Chung Graham, as well as a survey: Concentration Inequalities and Martingale Inequalities: A Survey.
Videolectures on concentration inequalities with machine learning applications
There is a course, STAT 598Y: Statistical Learning Theory, given in the Statistics Department at Purdue University. It focuses more on machine learning theory.
A useful concentration inequality: the (α, β) Azuma–Hoeffding inequality
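As a quick illustration of why these concentration bounds are useful, here is a small numerical sanity check of the plain Hoeffding inequality (the martingale-free special case of Azuma–Hoeffding) on fair-coin flips; the sample sizes and tolerance are my own illustrative choices.

```python
import math
import random

# Hoeffding's inequality for the mean of n i.i.d. variables in [0, 1]:
#   P(|mean - E[mean]| >= t) <= 2 * exp(-2 * n * t**2).
# (Azuma-Hoeffding extends this to martingales with bounded differences.)

def hoeffding_bound(n, t):
    return 2.0 * math.exp(-2.0 * n * t * t)

random.seed(0)
n, t, trials, p = 500, 0.05, 1000, 0.5
deviations = 0
for _ in range(trials):
    mean = sum(random.random() < p for _ in range(n)) / n
    if abs(mean - p) >= t:
        deviations += 1
empirical = deviations / trials
# The observed deviation frequency should sit below the Hoeffding bound.
print(empirical, "<=", hoeffding_bound(n, t))
```

The bound is distribution-free, so for a fair coin it is quite loose; the empirical frequency typically falls well below it.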
VC entropy:
Metric spaces, VC dimension, and metric entropy (Ken Clarkson's magnificent survey)
Foundations of Statistical Learning Theory: Empirical Inference in High-Dimensional Spaces (video lecture)
The slides on statistical learning theory given by Olivier Bousquet
Rademacher complexity:
Rademacher and Gaussian Complexities: Risk Bounds and Structural Results
Peter L. Bartlett and Shahar Mendelson; JMLR 3(Nov):463–482, 2002.
Rademacher Complexity, by Marcel Lüthi and Thomas Vetter
Topics in Artificial Intelligence (Learning Theory) – Spring 2008
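To make the definition concrete, here is a toy Monte Carlo estimate of the empirical Rademacher complexity; the function class (one-dimensional threshold classifiers) and the data are my own illustrative choices, not taken from the references above.

```python
import random

# Monte Carlo estimate of the empirical Rademacher complexity
#   R_hat(F) = E_sigma [ sup_{f in F} (1/n) * sum_i sigma_i * f(x_i) ],
# here for the class of 1-D thresholds f_theta(x) = +1 if x >= theta else -1.

random.seed(1)
n = 50
xs = sorted(random.random() for _ in range(n))
# Thresholds between consecutive points realize every labeling this class can produce.
thetas = [0.0] + [(xs[i] + xs[i + 1]) / 2 for i in range(n - 1)] + [1.1]

def rademacher_estimate(xs, thetas, rounds=500):
    total = 0.0
    for _ in range(rounds):
        sigma = [random.choice((-1, 1)) for _ in xs]
        # Supremum over the class = max over the finitely many realizable labelings.
        total += max(
            sum(s * (1 if x >= th else -1) for s, x in zip(sigma, xs)) / len(xs)
            for th in thetas
        )
    return total / rounds

# For a VC-dimension-1 class this should scale roughly like sqrt(log n / n).
print(rademacher_estimate(xs, thetas))
```

The key point is that a small class (here, only n+1 distinct labelings) cannot correlate well with random signs, which is exactly what the risk bounds in the Bartlett–Mendelson paper exploit.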
Algebraic Geometry and Statistical Learning Theory (Cambridge Monographs on Applied and Computational Mathematics)
Neural Network Learning: Theoretical Foundations, by Martin Anthony and Peter L. Bartlett
What is the curse of dimensionality?
I highly recommend reading the lecture delivered by David Donoho titled "High-Dimensional Data Analysis: on the Curses and Blessings of Dimensionality". The "blessing" of high-dimensional data is that the information of interest often effectively lies in a lower dimension, which we can then exploit; typically this involves sparse representations or low-dimensional manifolds.
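One of the curses is easy to see numerically: as the dimension grows, pairwise distances between random points concentrate, so nearest and farthest neighbors become nearly indistinguishable. A small simulation (my own, not from Donoho's lecture):

```python
import math
import random

# Relative contrast between the farthest and nearest neighbor of a fixed point,
# for uniform random points in the unit cube. In high dimensions this shrinks
# toward 0: all neighbors look roughly equally far away.

def distance_spread(dim, n_points=200, seed=0):
    rng = random.Random(seed)
    pts = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    origin = pts[0]
    dists = [math.dist(origin, p) for p in pts[1:]]
    return (max(dists) - min(dists)) / min(dists)  # relative contrast

for dim in (2, 20, 200, 2000):
    print(dim, distance_spread(dim))
```

In 2 dimensions the contrast is large (the nearest neighbor is far closer than the farthest), while by dimension 2000 it has collapsed to a small fraction, which is why naive nearest-neighbor methods degrade in high dimensions.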
Here are some papers that greatly advanced the mathematical understanding of dimensionality reduction:
- An Elementary Proof of the Johnson–Lindenstrauss Lemma, by Dasgupta and Gupta [1]
- A Global Geometric Framework for Nonlinear Dimensionality Reduction (Isomap), by Tenenbaum, de Silva, and Langford [2]
- Laplacian Eigenmaps for Dimensionality Reduction and Data Representation, by Belkin and Niyogi [3]
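The Johnson–Lindenstrauss lemma is remarkably easy to try out: a random Gaussian map from R^d down to R^k preserves pairwise distances up to a (1 ± ε) factor with high probability once k = O(log n / ε²). Here is a toy sketch with made-up sizes, not code from the Dasgupta–Gupta paper:

```python
import math
import random

# Random Gaussian projection in the spirit of Johnson-Lindenstrauss.

def random_projection(points, k, seed=0):
    rng = random.Random(seed)
    d = len(points[0])
    # Entries ~ N(0, 1/k) so squared norms are preserved in expectation.
    R = [[rng.gauss(0.0, 1.0 / math.sqrt(k)) for _ in range(d)] for _ in range(k)]
    return [[sum(row[j] * p[j] for j in range(d)) for row in R] for p in points]

rng = random.Random(42)
d, k, n = 500, 200, 10
pts = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
proj = random_projection(pts, k)

# One pairwise distance before and after projection; the ratio should be near 1.
before = math.dist(pts[0], pts[1])
after = math.dist(proj[0], proj[1])
print(round(before, 2), round(after, 2), round(after / before, 3))
```

Notably, the projection is data-oblivious: the same random matrix works for any point set of this size, which is what makes the lemma so useful in streaming and large-scale settings.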
Linear models:
- Bayesian Face Recognition, by Moghaddam, Jebara, and Pentland [1]
- Object Indexing Using an Iconic Sparse Distributed Memory, by Rao and Ballard [2]
Nonlinear dimensionality reduction:
- An Introduction to Locally Linear Embedding, by Saul and Roweis [3]
- Nonlinear Component Analysis as a Kernel Eigenvalue Problem, by Schölkopf, Smola, and Müller [4]
- Learning a Kernel Matrix for Nonlinear Dimensionality Reduction, by Weinberger, Sha, and Saul [5]
- Minimum Volume Embedding, by Shaw and Jebara [6]
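The kernel-eigenvalue idea behind the Schölkopf–Smola–Müller paper is that one diagonalizes a centered kernel matrix instead of the covariance matrix. A minimal numpy sketch on a made-up two-circle dataset (my own illustration, not code from the paper):

```python
import numpy as np

# Kernel PCA: eigen-decompose the centered kernel matrix.

rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, 100)
radius = np.repeat([1.0, 3.0], 50)  # two concentric circles
X = np.column_stack([radius * np.cos(theta), radius * np.sin(theta)])

# RBF kernel matrix K_ij = exp(-||x_i - x_j||^2 / 2).
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq / 2.0)

# Center the kernel in feature space: Kc = K - 1K - K1 + 1K1, with 1 = ones/n.
n = K.shape[0]
one = np.full((n, n), 1.0 / n)
Kc = K - one @ K - K @ one + one @ K @ one

# The top eigenvector of Kc gives the leading nonlinear principal component.
vals, vecs = np.linalg.eigh(Kc)  # eigenvalues in ascending order
first_component = vecs[:, -1] * np.sqrt(vals[-1])
print(first_component[:5])
```

The whole computation lives in the n × n kernel matrix, so the feature space itself (possibly infinite-dimensional for the RBF kernel) is never constructed explicitly.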
Reference: Advanced Machine Learning course materials by Tony Jebara: http://www.cs.columbia.edu/~jeba…
Also see
- The lectures on large-scale machine learning by Sanjiv Kumar: http://www.sanjivk.com/EECS6898/…
- Lv's paper on compact data structures (as well as the rest of her work): http://citeseerx.ist.psu.edu/vie…
- Indyk's and his colleagues' work on LSH: portal.acm.org/citation.cfm?id=6…
- Materials from Michael Jordan's course "Practical Machine Learning": http://www.cs.berkeley.edu/~jord…
- Jolliffe, Principal Component Analysis: http://www.amazon.com/Principal…
- Stone, Independent Component Analysis: A Tutorial Introduction: http://www.amazon.com/Independen…
- Lee & Verleysen, Nonlinear Dimensionality Reduction: http://www.amazon.com/Nonlinear…
- Samet, Foundations of Multidimensional and Metric Data Structures, ISBN 0123694469: http://www.amazon.com/Foundation…
List of books in computational geometry
10 Papers Every Programmer Should Read (At Least Twice)
What are good books for Computation Geometry?
Yann Ollivier, A survey of Ricci curvature for metric spaces and Markov chains, in Probabilistic approach to geometry, Adv. Stud. Pure Math. 57, Math. Soc. Japan (2010), 343–381. In the author's words: "This is a gentle introduction to the context and results of my article Ricci curvature of Markov chains on metric spaces. It begins with a description of classical Ricci curvature in Riemannian geometry, as well as a reminder for discrete Markov chains. A number of open problems are mentioned."
Binary Cumulant Varieties
Authors: Bernd Sturmfels, Piotr Zwiernik
Subjects: Combinatorics (math.CO); Algebraic Geometry (math.AG); Statistics Theory (math.ST)
Algebraic statistics for binary random variables is concerned with highly structured algebraic varieties in the space of 2×2×…×2 tensors. We demonstrate the advantages of representing such varieties in the coordinate system of binary cumulants. Our primary focus lies on hidden subset models. Parametrizations and implicit equations in cumulants are derived for hyperdeterminants, for secant and tangential varieties of Segre varieties, and for certain context-specific independence models. Extending work of Rota and collaborators, we explore the polynomial inequalities satisfied by cumulants.