
Today the machine learning class started on the theory of machine learning, so I want to collect some related materials here. I plan to update this post as the class goes on.

1. S. Boucheron, O. Bousquet, and G. Lugosi (2005). Theory of Classification: a Survey of Some Recent Advances. ESAIM: Probability and Statistics, 9:323–375. (PDF, POSTSCRIPT)

This paper surveys much of the theory behind classification, which is one of the main parts of machine learning.

2. S. Boucheron, O. Bousquet, and G. Lugosi (2004). Concentration Inequalities. In O. Bousquet, U. von Luxburg, and G. Rätsch (editors), Advanced Lectures on Machine Learning, Springer, pp. 208–240. (PDF, POSTSCRIPT)

This is the content of the first lecture on the theory of machine learning. The author also has lecture notes on this topic:

Concentration-of-measure inequalities, presented at:
the Machine Learning Summer School 2003, Australian National University, Canberra;
the Workshop on Combinatorics, Probability and Algorithms 2003, held at the CRM in Montreal;
the Winter School on Probabilistic Methods in High Dimension Phenomena, Toulouse, January 10–14, 2005;
and the Workshop de Combinatória e Concentração de Medida, IMPA, Rio de Janeiro, February 23–25, 2005. (PDF, POSTSCRIPT)

3. O. Bousquet, S. Boucheron, and G. Lugosi (2004). Introduction to Statistical Learning Theory. In O. Bousquet, U. von Luxburg, and G. Rätsch (editors), Advanced Lectures on Machine Learning, Springer, pp. 169–207. (PDF, POSTSCRIPT)

4. S. Kulkarni, G. Lugosi, and S. Venkatesh (1998). Learning Pattern Classification: A Survey. IEEE Transactions on Information Theory (1948–1998 Special Commemorative Issue), vol. 44, pp. 2178–2206. Reprinted in S. Verdú and S.W. McLaughlin (editors), Information Theory: 50 Years of Discovery, IEEE Press, New York, 1999. (POSTSCRIPT)

So far, all of these materials come from one professor, Gábor Lugosi.

For concentration inequalities, there is also a book chapter by Fan Chung Graham: Old and New Concentration Inequalities. In addition, there is a survey on the topic, Concentration Inequalities and Martingale Inequalities: A Survey.
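To make the flavor of these inequalities concrete, here is a minimal Python sketch (a toy example of my own, not taken from any of the references above) that checks Hoeffding's inequality, P(|S_n/n - p| >= t) <= 2 exp(-2 n t^2), for Bernoulli variables by simulation:

```python
import numpy as np

rng = np.random.default_rng(0)

n, p, t = 200, 0.5, 0.1      # sample size, Bernoulli parameter, deviation
trials = 100_000             # Monte Carlo repetitions

# Empirical probability that the sample mean deviates from p by at least t.
means = rng.binomial(n, p, size=trials) / n
empirical = np.mean(np.abs(means - p) >= t)

# Hoeffding's bound for [0, 1]-valued variables.
hoeffding = 2 * np.exp(-2 * n * t**2)

print(f"empirical tail probability: {empirical:.4f}")
print(f"Hoeffding bound:            {hoeffding:.4f}")
```

With these parameters the bound evaluates to 2e^{-4}, about 0.037, which should comfortably dominate the simulated tail probability.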

There are also videolectures on concentration inequalities with machine learning applications.

There is a course, STAT 598Y: Statistical Learning Theory, given in the Statistics Department at Purdue University; it focuses more on machine learning theory.

What is the curse of dimensionality?

I highly recommend reading the lecture by David Donoho titled “High-Dimensional Data Analysis: The Curses and Blessings of Dimensionality”. The “blessing” of high-dimensional data is that the information of interest often effectively lies in a lower-dimensional structure which we can then exploit; typically this involves sparse representations or low-dimensional manifolds.
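To see the “curse” side concretely, here is a small Python sketch (my own toy illustration, not from Donoho's lecture) of the well-known fact that pairwise distances between uniform random points concentrate as the dimension grows, so nearest and farthest neighbors become nearly indistinguishable:

```python
import numpy as np

rng = np.random.default_rng(0)
n_points = 500

for d in [2, 10, 100, 1000]:
    # n_points uniform random points in the unit cube [0, 1]^d.
    x = rng.random((n_points, d))
    # Distances from the first point to all the others.
    dists = np.linalg.norm(x[1:] - x[0], axis=1)
    # Relative contrast: how much farther the farthest point is than the nearest.
    contrast = (dists.max() - dists.min()) / dists.min()
    print(f"d={d:5d}  min={dists.min():.3f}  max={dists.max():.3f}  contrast={contrast:.3f}")
```

The relative contrast between the farthest and the nearest point shrinks steadily as d grows, which is exactly what breaks naive nearest-neighbor reasoning in high dimensions.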

Here are some papers that have greatly enhanced the mathematical understanding of dimensionality reduction:

Linear models:

Non-linear dimensionality reduction:

Reference: Advanced Machine Learning course materials by Tony Jebara: http://www.cs.columbia.edu/~jeba…

Also see

List of books in computational geometry

10 Papers Every Programmer Should Read (At Least Twice)

What are good books for Computational Geometry?

Why Hopf Algebras

Curvature for Markov Chains

In Probabilistic approach to geometry, Adv. Stud. Pure Math. 57, Math. Soc. Japan (2010), 343–381. (PDF, PS.GZ)
From the author's description: this is a gentle introduction to the context and results of his article Ricci curvature of Markov chains on metric spaces. It begins with a description of classical Ricci curvature in Riemannian geometry, as well as a reminder about discrete Markov chains. A number of open problems are mentioned.
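To get a concrete feel for the definition, here is a minimal Python sketch (my own illustration, not code from the article) that computes Ollivier's coarse Ricci curvature kappa(x,y) = 1 - W1(m_x, m_y)/d(x,y) between two adjacent vertices of the cycle graph C_n, where m_x is the one-step measure of a lazy random walk and the Wasserstein distance W1 is solved as a small linear program:

```python
import numpy as np
from scipy.optimize import linprog

n = 8  # cycle C_n

def dist(i, j):
    """Graph distance on the n-cycle."""
    return min(abs(i - j), n - abs(i - j))

def measure(x, laziness=0.5):
    """Lazy random walk step measure m_x: stay put with probability 1/2,
    otherwise move to one of the two neighbors uniformly."""
    m = np.zeros(n)
    m[x] = laziness
    m[(x - 1) % n] = (1 - laziness) / 2
    m[(x + 1) % n] = (1 - laziness) / 2
    return m

def w1(mu, nu):
    """Wasserstein-1 distance between mu and nu via the transport LP."""
    cost = np.array([[dist(i, j) for j in range(n)] for i in range(n)]).ravel()
    # Constraints: rows of the coupling sum to mu, columns sum to nu.
    A_eq = np.zeros((2 * n, n * n))
    for i in range(n):
        A_eq[i, i * n:(i + 1) * n] = 1   # row sums
        A_eq[n + i, i::n] = 1            # column sums
    b_eq = np.concatenate([mu, nu])
    res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.fun

x, y = 0, 1  # adjacent vertices
kappa = 1 - w1(measure(x), measure(y)) / dist(x, y)
print(f"coarse Ricci curvature kappa({x},{y}) = {kappa:.4f}")
```

For the lazy walk on the cycle this prints 0, matching the intuition that the discrete circle is flat; changing dist and measure to describe a different graph gives other values.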

Binary Cumulant Varieties

Authors: Bernd Sturmfels, Piotr Zwiernik

Subjects: Combinatorics (math.CO); Algebraic Geometry (math.AG); Statistics Theory (math.ST)

Algebraic statistics for binary random variables is concerned with highly structured algebraic varieties in the space of 2×2×⋯×2-tensors. We demonstrate the advantages of representing such varieties in the coordinate system of binary cumulants. Our primary focus lies on hidden subset models. Parametrizations and implicit equations in cumulants are derived for hyperdeterminants, for secant and tangential varieties of Segre varieties, and for certain context-specific independence models. Extending work of Rota and collaborators, we explore the polynomial inequalities satisfied by cumulants.
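As a small computational companion (my own sketch, not code from the paper), here is how the moment-to-cumulant change of coordinates works for binary random variables, using the classical formula kappa_A = sum over set partitions pi of A of (-1)^{|pi|-1} (|pi|-1)! prod_{B in pi} mu_B, where mu_B = E[prod_{i in B} X_i]:

```python
import itertools
from math import factorial, prod

def partitions(s):
    """Yield all set partitions of the tuple s."""
    if len(s) == 1:
        yield [list(s)]
        return
    first, rest = s[0], s[1:]
    for part in partitions(rest):
        # Put `first` into each existing block, or into a new singleton block.
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part

def moment(p, subset):
    """mu_B = E[prod_{i in B} X_i] for 0/1 variables with joint pmf p over {0,1}^n."""
    return sum(prob for outcome, prob in p.items()
               if all(outcome[i] == 1 for i in subset))

def cumulant(p, subset):
    """kappa_A via the moment-cumulant (Moebius-type) formula."""
    return sum(
        (-1) ** (len(part) - 1) * factorial(len(part) - 1)
        * prod(moment(p, tuple(block)) for block in part)
        for part in partitions(tuple(subset))
    )

# Example: two correlated binary variables; kappa_{01} is just the covariance.
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
for A in [(0,), (1,), (0, 1)]:
    print(A, cumulant(p, A))
```

For two variables the top cumulant is the covariance; with this toy pmf it prints 0.4 - 0.5 * 0.5 = 0.15.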
