
Lately I have been working on computation and programming languages, and I want to share a few lessons here.

1. You cannot expect C++ to magically make your code faster. If speed is a concern, you need profiling to find the bottleneck instead of guessing blindly.——Yan Zhou. Thus we have to learn how to profile a program in R, Matlab, C++, and Python.
2. When something complicated does not work, I generally try to restart with something simpler, and make sure it works.——Dirk Eddelbuettel.
3. If you’re calling your function thousands or millions of times, then it might pay to closely examine your memory allocation strategies and figure out what’s temporary.——Christian Gunning.
4. No, your main issue is not thinking about the computation.  As soon as you write something like
arma::vec betahat = arma::inv(Inv)*arma::trans(D)*W*y;
you are in theory land which has very little relationship to practical numerical linear algebra.  If you want to perform linear algebra calculations like weighted least squares you should first take a bit of time to learn about numerical linear algebra as opposed to theoretical linear algebra.  They are very different disciplines.  In theoretical linear algebra you write the solution to a system of linear equations as above, using the inverse of the system matrix.  The first rule of numerical linear algebra is that you never calculate the inverse of a matrix, unless you only plan to do toy examples.  You mentioned sizes of 4000 by 4000 which means that the method you have chosen is doing thousands of times more work than necessary (hint: how do you think that the inverse of a matrix is calculated in practice? – ans: by solving n systems of equations, which you are doing here when you could be solving only one).
Dirk and I wrote about 7 different methods of solving least squares problems in our vignette on RcppEigen.  None of those methods involve taking the inverse of an n by n matrix.
R and Rcpp and whatever other programming technologies come along will never be a “special sauce” that takes the place of thinking about what you are trying to do in a computation.——Douglas Bates.

// [[Rcpp::depends(RcppEigen)]]
#include <RcppEigen.h>

typedef Eigen::MatrixXd           Mat;
typedef Eigen::Map<Mat>           MMat;
typedef Eigen::HouseholderQR<Mat> QR;
typedef Eigen::VectorXd           Vec;
typedef Eigen::Map<Vec>           MVec;

// [[Rcpp::export]]
Rcpp::List wtls(const MMat X, const MVec y, const MVec sqrtwts) {
    return Rcpp::List::create(
        Rcpp::Named("betahat") = QR(sqrtwts.asDiagonal() * X).solve(sqrtwts.asDiagonal() * y));
}
5. Repeatedly calling an R function is probably not the smartest thing to do in an otherwise complex and hard-to-decipher program.——Dirk Eddelbuettel.
6. Computers don’t do random things, unlike human beings. If something worked once, it is very likely to work however many times you repeat it, as long as the input is the same (unless the function has side effects). So repeating it 1000 times is the same as running it once.——Yan Zhou.
7. Yan Zhou: Here are a few things people usually do before asking on a mailing list (not just the Rcpp list, but any such list, like R-help, StackOverflow, etc.):
1. I write a program, and it crashes.
2. I find the site of the crash.
3. I make the program simpler and simpler until it is minimal and the crash is still reproducible.
4. I still cannot figure out what is wrong with the four or five lines that crash in the minimal example.
8. It does not matter how stupid your questions are. We have all asked silly questions before; that is how we learn. But it matters that you put in the effort to ask the right question. The more effort you put in, the more specific your question becomes, and the more helpful the answers you get.——Yan Zhou.

In my office I have two NIPS posters on the wall, 2011 and 2012. But I have never attended, and I am not a computer scientist either. Anyway, I like NIPS for no particular reason. Now it is time for me to organize posts from others:

Among all of the posts, there are several things I need to digest later:

1. A tutorial on random matrices by Joel Tropp. People concluded in their posts that:

Basically, break random matrices down into a sum of simpler, independent random matrices, then apply concentration bounds on the sum. The basic result is that if you love your Chernoff bounds and Bernstein inequalities for (sums of) scalars, you can get almost exactly the same results for (sums of) matrices.
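For reference, one representative result of this type, as I recall it from Tropp's survey (check the original for the exact constants), is the matrix Bernstein inequality:

```latex
% Matrix Bernstein (self-adjoint case):
% X_1, ..., X_n independent random d x d self-adjoint matrices with
% E[X_k] = 0 and \lambda_{\max}(X_k) \le L almost surely.
\mathbb{P}\left\{ \lambda_{\max}\Big(\sum_{k=1}^{n} X_k\Big) \ge t \right\}
\;\le\; d \cdot \exp\!\left( \frac{-t^2/2}{\sigma^2 + Lt/3} \right),
\qquad
\sigma^2 = \Big\| \sum_{k=1}^{n} \mathbb{E}\, X_k^2 \Big\|.
```

Compared with the scalar Bernstein inequality, the only extra price is the dimensional factor d in front of the exponential.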

2. “This year was definitely all about Deep Learning,” one post said. The Geomblog mentioned that while it has been in the news recently because of Google’s unsupervised discovery of YouTube cats, the methods of deep learning (basically neural nets without lots of back propagation) have been growing in popularity for a long while. We also have to spend some time reading “Deep Learning and the evolution of data models”, which is related to manifold learning.
3. “Another trend that’s been around for a while, but was striking to me, was the detailed study of Optimization methods.”—The Geomblog.  There are at least two different workshops on optimization in machine learning (DISC and OPT), and numerous papers that very carefully examined the structure of optimizations to squeeze out empirical improvements.
4. Kernel distances: an introduction to the kernel distance from The Geomblog. “Scott Aaronson (at his NIPS invited talk) made this joke about how nature loves ℓ2. The kernel distance is ‘essentially’ the ℓ2 variant of EMD (which makes so many things easier). There’s been a series of papers by Sriperumbudur et al. on this topic, and in a series of works they have shown that (a) the kernel distance captures the notion of ‘distance covariance’ that has become popular in statistics as a way of testing independence of distributions, (b) as an estimator of distance between distributions, the kernel distance has more efficient estimators than (say) the EMD, because its estimator can be computed in closed form instead of needing an algorithm that solves a transportation problem, and (c) the kernel that optimizes the efficiency of the two-sample estimator can also be determined (the NIPS paper).”
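For concreteness, the squared kernel distance between distributions P and Q, also known as the maximum mean discrepancy, is usually written as follows (a standard definition, not taken from the post itself):

```latex
D_k^2(P, Q)
= \mathbb{E}_{x, x' \sim P}\, k(x, x')
+ \mathbb{E}_{y, y' \sim Q}\, k(y, y')
- 2\, \mathbb{E}_{x \sim P,\; y \sim Q}\, k(x, y).
```

Replacing the expectations with sample averages gives the closed-form plug-in estimator alluded to above, in contrast to the EMD, whose estimation requires solving a transportation problem.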
5. Spectral methods for latent models: spectral methods for latent variable models are based on the method of moments rather than maximum likelihood.
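A typical example of the moment structure these methods exploit (schematically; the exact form depends on the model): for a mixture with component means \mu_i and weights w_i, the low-order moments factor as

```latex
M_2 = \sum_i w_i\, \mu_i \mu_i^{\top},
\qquad
M_3 = \sum_i w_i\, \mu_i \otimes \mu_i \otimes \mu_i,
```

so the parameters can be recovered by decomposing empirical moment matrices and tensors, rather than by iterating a likelihood maximization that may get stuck in local optima.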

Besides the papers mentioned in the hot topics above, there are some other papers from Memming’s post:

1. Graphical models via generalized linear models: Eunho introduced a family of graphical models with GLM marginals and Ising-model-style pairwise interactions. He said the Poisson Markov random field version must have negative coupling; otherwise the log partition function blows up. He showed conditions under which the graph structure can be recovered with high probability in this family.
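As I understand the Poisson case, the joint density has the form below, and the pairwise parameters \theta_{st} must be non-positive for the sum defining the partition function to converge (my paraphrase of the negative-coupling condition):

```latex
P(x) \;\propto\; \exp\Big( \sum_{s} \theta_s x_s
+ \sum_{(s,t)} \theta_{st}\, x_s x_t
- \sum_{s} \log(x_s!) \Big),
\qquad x_s \in \{0, 1, 2, \dots\},\quad \theta_{st} \le 0.
```

Intuitively, a positive \theta_{st} would reward both counts growing together faster than the factorial terms can penalize them, so the normalizing sum diverges.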
2. TCA: high dimensional principal component analysis for non-Gaussian data: using an elliptical copula model (extending the nonparanormal), the eigenvectors of the covariance of the copula variables can be estimated from Kendall’s tau statistic, which is invariant to the nonlinearity of the elliptical distribution and the transformation of the marginals. This estimator achieves close to the parametric convergence rate while being semi-parametric.
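The key identity behind this kind of estimator, standard for elliptical distributions, links Kendall's tau to the underlying correlation:

```latex
\Sigma_{jk} = \sin\!\Big( \frac{\pi}{2}\, \tau_{jk} \Big),
```

where \tau_{jk} is the population Kendall's tau between coordinates j and k. Since the sample \hat{\tau}_{jk} is rank-based, it is unchanged by monotone transformations of the marginals, which is what makes the plug-in covariance estimate robust to the unknown nonlinearities.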

Update: make sure to check the lectures from the 26th Annual NIPS Conference, filmed at Lake Tahoe in 2012, as well as the NIPS 2012 workshops, oral sessions, and spotlight sessions, which were collected for the Video Journal of Machine Learning Abstracts – Volume 3.