I have two NIPS posters on my office wall, from 2011 and 2012, although I have never attended the conference and I am not a computer scientist either. Still, I like NIPS for no particular reason. Now it is time for me to organize the posts that others have written about it:

- NIPS ruminations I
- NIPS II: Deep Learning and the evolution of data models
- NIPS stuff…
- NIPS 2012
- NIPS 2012 Conference in Lake Tahoe, NV
- Thoughts on NIPS 2012
- The Big NIPS Post
- NIPS 2012 : day one
- NIPS 2012 : day two
- Spectral Methods for Latent Models
- NIPS 2012 Trends

Among all of these posts, there are several things I still have to digest:

- One tutorial on **Random Matrices**, by Joel Tropp. People summarized it in their posts as follows:

  “Basically, break random matrices down into a sum of simpler, independent random matrices, then apply concentration bounds on the sum.” (John Moeller)

  “The basic result is that if you love your Chernoff bounds and Bernstein inequalities for (sums of) scalars, you can get almost exactly the same results for (sums of) matrices.” (hal)

- “This year was definitely all about **Deep Learning**,” John Moeller said. The Geomblog mentioned that, although it has been in the news recently because of the Google untrained search for YouTube cats, the methods of deep learning (basically neural nets without lots of back propagation) have been growing in popularity for a long while. We also have to spend some time reading Deep Learning and the evolution of data models, which is related to manifold learning.

- “Another trend that’s been around for a while, but was striking to me, was the detailed study of **Optimization** methods,” according to The Geomblog. There were at least two different workshops on optimization in machine learning (DISC and OPT), and numerous papers that very carefully examined the structure of optimizations to squeeze out empirical improvements.

- **Kernel distances**: An introduction to the kernel distance from The Geomblog: “Scott Aaronson (at his NIPS invited talk) made this joke about how nature loves ℓ2. The kernel distance is ‘essentially’ the ℓ2 variant of EMD (which makes so many things easier). There’s been a series of papers by Sriperumbudur et al. on this topic, and in a series of works they have shown that (a) the kernel distance captures the notion of ‘distance covariance’ that has become popular in statistics as a way of testing independence of distributions, (b) as an estimator of distance between distributions, the kernel distance has more efficient estimators than (say) the EMD because its estimator can be computed in closed form instead of needing an algorithm that solves a transportation problem, and (c) the kernel that optimizes the efficiency of the two-sample estimator can also be determined (the NIPS paper).”

- **Spectral Methods for Latent Models**: Spectral methods for latent variable models are based upon the method of moments rather than maximum likelihood.
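
To make the “closed form” remark about the kernel distance concrete, here is a minimal sketch of the plug-in estimator of the squared kernel distance between two samples. It is only an illustration: the Gaussian kernel, the bandwidth of 1.0, the synthetic 2-D data, and all function names (`gaussian_kernel`, `mean_kernel`, `kernel_distance_sq`, `sample_gaussian`) are my own assumptions, not taken from the posts or papers above. The estimator averages kernel values within and across the two samples, so no transportation problem has to be solved.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as tuples of floats."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def kernel_distance_sq(xs, ys, bandwidth=1.0):
    """Biased plug-in estimate of the squared kernel distance.

    D^2 = mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)
    """
    return (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )


def sample_gaussian(n, mean, rng):
    """Hypothetical test data: n points from a 2-D unit-variance Gaussian."""
    return [(rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)) for _ in range(n)]


rng = random.Random(0)
same_a = sample_gaussian(100, 0.0, rng)
same_b = sample_gaussian(100, 0.0, rng)
shifted = sample_gaussian(100, 2.0, rng)

# Two samples from the same distribution should be much closer
# than two samples from distributions with different means.
print("same distribution:   ", kernel_distance_sq(same_a, same_b))
print("shifted distribution:", kernel_distance_sq(same_a, shifted))
```

The cost is quadratic in the sample size, which is the price of the closed form; the choice of kernel and bandwidth matters in practice and is exactly what point (c) in the quote is about.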

Besides the papers mentioned under the hot topics above, here are some other papers from Memming’s post:

- **Graphical models via generalized linear models**: Eunho introduced a family of graphical models with GLM marginals and Ising-model-style pairwise interaction. He said the Poisson-Markov-Random-Fields version must have negative coupling, otherwise the log partition function blows up. He showed conditions for which the graph structure can be recovered with high probability in this family.

- **TCA: High dimensional principal component analysis for non-gaussian data**: Using an elliptical copula model (extending the nonparanormal), the eigenvectors of the covariance of the copula variables can be estimated from Kendall’s tau statistic, which is invariant to the nonlinearity of the elliptical distribution and the transformation of the marginals. This estimator achieves close to the parametric convergence rate while being a semi-parametric model.
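
The invariance of Kendall’s tau mentioned for TCA can be checked with a small sketch. This is not the estimator from the paper, only an illustration under my own assumptions: a bivariate Gaussian (one particular elliptical distribution) with correlation 0.6, the monotone warps `exp` and cubing as stand-ins for unknown marginal transformations, and hypothetical helper names (`kendall_tau`, `tau_to_correlation`). It uses the standard identity for elliptical distributions, correlation = sin(π·τ/2), to map tau back to a correlation.

```python
import math
import random


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant pairs) / (number of pairs)."""
    n = len(xs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            score += sign(xs[i] - xs[j]) * sign(ys[i] - ys[j])
    return score / (n * (n - 1) / 2)


def tau_to_correlation(tau):
    """Map Kendall's tau to a correlation via sin(pi * tau / 2)."""
    return math.sin(math.pi * tau / 2.0)


# Hypothetical data: correlated Gaussian pairs with true correlation 0.6.
rng = random.Random(0)
true_rho = 0.6
xs, ys = [], []
for _ in range(500):
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    xs.append(z1)
    ys.append(true_rho * z1 + math.sqrt(1.0 - true_rho ** 2) * z2)

# Strictly increasing transformations of the marginals keep all ranks intact.
xs_warped = [math.exp(x) for x in xs]
ys_warped = [y ** 3 for y in ys]

tau_raw = kendall_tau(xs, ys)
tau_warped = kendall_tau(xs_warped, ys_warped)
rho_hat = tau_to_correlation(tau_raw)

print("tau on raw data:      ", tau_raw)
print("tau on warped data:   ", tau_warped)
print("estimated correlation:", rho_hat)
```

Because tau depends only on the ordering of the observations, the warped data give the same value as the raw data. In the high-dimensional setting one would fill a whole matrix with such transformed pairwise taus and take its eigenvectors, which this sketch leaves out.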

**Update: Make sure to check the lectures from the prominent 26th Annual NIPS Conference filmed @ Lake Tahoe 2012. Also make sure to check the NIPS 2012 Workshops, Oral sessions and Spotlight sessions which were collected for the Video Journal of Machine Learning Abstracts – Volume 3.**

## 2 comments

December 18, 2012 at 7:07 pm

Hans: I have to look into kernel distances and all this about random matrices and matrix norms. I don’t really use that stuff.

I took some time to list the 20 papers that I thought would be the most useful for me here:

http://artent.net/blog/2012/12/18/the-15-most-striking-papers-and-presentations-from-nips/

Hopefully, I will get more time later to explain why I think those particular titles will be the most useful for me.

December 18, 2012 at 8:47 pm

Honglang Wang: Thanks, Hans. I am looking forward to your ideas about these interesting papers and talks.