p-value and Bayes are two of the hottest words in statistics. I still cannot understand why the debate between frequentist statistics and Bayesian statistics has lasted so long. What is the essential argument behind it? (Can anyone help me with this?) From my point of view, they are just two ways of solving practical problems. Frequentists use a random version of the proof-by-contradiction argument (i.e., a small p-value indicates strong evidence against the null hypothesis), while Bayesians use a learning argument to update their beliefs through data. Likewise, mathematicians use partial differential equations (PDEs) to model the real underlying process in their analyses. These are just different methodologies for dealing with practical problems. What, then, is the point of the long-lasting debate between frequentist statistics and Bayesian statistics?

Although my current research area is mostly in the frequentist statistics domain, I am becoming more and more of a Bayesian, since the approach is so natural. When I was teaching introductory statistics courses for undergraduate students at Michigan State University, I divided the whole course into three parts: Exploratory Data Analysis (EDA) using the R software, Bayesian Reasoning, and Frequentist Statistics. I found that at the end of the semester, the most memorable example in my students’ minds was one from the second part (Bayesian Reasoning): the Monty Hall problem, which was mentioned in an article that just came out in the NYT. (Note the argument from Professor Andrew Gelman about that article, and please also check out his response.) “Mr. Hall, longtime host of the game show “Let’s Make a Deal,” hides a car behind one of three doors and a goat behind each of the other two. The contestant picks Door No. 1, but before opening it, Mr. Hall opens Door No. 2 to reveal a goat. Should the contestant stick with No. 1 or switch to No. 3, or does it matter?” And the Bayesian approach to this problem “would start with one-third odds that any given door hides the car, then update that knowledge with the new data: Door No. 2 had a goat. The odds that the contestant guessed right — that the car is behind No. 1 — remain one in three. Thus, the odds that she guessed wrong are two in three. And if she guessed wrong, the car must be behind Door No. 3. So she should indeed switch.” What a natural argument! Bayesian babies and Google’s unsupervised discovery of YouTube cats (via deep learning) are both excellent examples showing that Bayesian statistics IS a remarkable way of solving problems.

What about p-values? This random version of the proof-by-contradiction argument is also a great way of solving problems, as evidenced by the many problems it has helped solve across scientific areas, especially in the bio-world. Check out today’s post from Simply Statistics, “You think P-values are bad? I say show me the data,” and also an earlier one: On the scalability of statistical procedures: why the p-value bashers just don’t get it.

I collected the following series on applying for faculty positions in 2011, when I was in the second year of my PhD. Now it’s my turn to apply for jobs, so I will share these useful materials with all of you who are applying for jobs this year.

My academic homepage has just been launched. Welcome to visit: Honglang Wang’s Homepage.

This post is for JSM2013. I will collect useful links here and update the post during the meeting.

What I have learned from this meeting (Key words of this meeting):

Big Data, Bayesian, Statistical Efficiency vs Computational Efficiency

I was in Montreal from Aug 1st to Aug 8th for JSM2013 and traveling.

(Traveling in Quebec: Olympic Stadium; Underground City; Quebec City; Montreal City; basilique Nortre-Dame; China Town)

(Talks at JSM2013: Jianqing Fan; Jim Berger; Nate Silver; Tony Cai; Han Liu; Two Statistical Peters)

(My Presentation at JSM2013)

The following is a list of the talks I attended:

JSM

• Aug 4th
• 2:05 PM Analyzing Large Data with R and MonetDB — Thomas Lumley, University of Auckland
• 2:25 PM Empirical Likelihood and U-Statistics in Survival Analysis — Zhigang Zhang, Memorial Sloan-Kettering Cancer Center ; Yichuan Zhao, Georgia State University
• 2:50 PM Joint Unified Confidence Region for the Parameters of Branching Processes with Immigration — Pin Ren ; Anand Vidyashankar, George Mason University
• 3:05 PM Time-Varying Additive Models for Longitudinal Data — Xiaoke Zhang, University of California Davis ; Byeong U. Park, Seoul National University ; Jane-Ling Wang, UC Davis
• 3:20 PM Leveraging as a Paradigm for Statistically Informed Large-Scale Computation — Michael W. Mahoney, Stanford University
• 4:05 PM Joint Estimation of Multiple Dependent Gaussian Graphical Models — Yuying Xie, The University of North Carolina at Chapel Hill ; Yufeng Liu, The University of North Carolina ; William Valdar, UNC-CH Genetics
• 4:30 PM Computational Strategies in Regression of Big Data — Ping Ma, University of Illinois at Urbana-Champaign
• 4:55 PM Programming with Big Data in R — George Ostrouchov, Oak Ridge National Laboratory ; Wei-Chen Chen, Oak Ridge National Laboratory ; Drew Schmidt, University of Tennessee ; Pragneshkumar Patel, University of Tennessee
• 5:20 PM Inference and Optimalities in Estimation of Gaussian Graphical Model — Harrison Zhou, Yale University
• Aug 5th
• 99 Mon, 8/5/2013, 8:30 AM – 10:20 AM CC-710a
• Introductory Overview Lecture: Twenty Years of Gibbs Sampling/MCMC — Other Special Presentation
• 8:35 AM Gibbs Sampling and Markov Chain Monte Carlo: A Modeler’s Perspective — Alan E. Gelfand, Duke University
• 9:25 AM The Theoretical Underpinnings of MCMC — Jeffrey S. Rosenthal, University of Toronto
• 10:15 AM Floor Discussion
• 166 * Mon, 8/5/2013, 10:30 AM – 12:20 PM CC-520c
• Statistical Learning and Data Mining: Winners of Student Paper Competition — Topic Contributed Papers
• 10:35 AM Multicategory Angle-Based Large Margin Classification — Chong Zhang, UNC-CH ; Yufeng Liu, The University of North Carolina
• 10:55 AM Discrepancy Pursuit: A Nonparametric Framework for High-Dimensional Variable Selection — Li Liu, Carnegie Mellon University ; Kathryn Roeder, CMU ; Han Liu, Princeton University
• 11:15 AM PenPC: A Two-Step Approach to Estimate the Skeletons of High-Dimensional Directed Acyclic Graphs — Min Jin Ha ; Wei Sun, UNC Chapel Hill ; Jichun Xie, Temple University
• 11:35 AM An Underdetermined Peaceman-Rachford Splitting Algorithm with Application to Highly Nonsmooth Sparse Learning Problems— Zhaoran Wang, Princeton University ; Han Liu, Princeton University ; Xiaoming Yuan, Hong Kong Baptist University
• 11:55 AM Latent Supervised Learning — Susan Wei, UNC
• 12:15 PM Floor Discussion
• 220 Mon, 8/5/2013, 2:00 PM – 3:50 PM CC-710b
• 2:05 PM Statistics Meets Computation: Efficiency Trade-Offs in High Dimensions — Martin Wainwright, UC Berkeley
• 3:35 PM Floor Discussion
• 267 Mon, 8/5/2013, 4:00 PM – 5:50 PM CC-517ab
• 4:05 PM JSM Welcomes Nate Silver — Nate Silver, FiveThirtyEight.com
• 209305 Mon, 8/5/2013, 6:00 PM – 8:00 PM I-Maisonneuve, JSM Student Mixer, Sponsored by Pfizer — Other Cmte/Business, ASA , Pfizer, Inc.
• 268 Mon, 8/5/2013, 8:00 PM – 9:30 PM CC-517ab
• 8:05 PM Ars Conjectandi: 300 Years Later — Hans Rudolf Kunsch, Seminar fur Statistik, ETH Zurich
• Aug 6th
• 280 * Tue, 8/6/2013, 8:30 AM – 10:20 AM CC-510a
• Statistical Inference for Large Matrices — Invited Papers
• 8:35 AM Conditional Sparsity in Large Covariance Matrix Estimation — Jianqing Fan, Princeton University ; Yuan Liao, University of Maryland ; Martina Mincheva, Princeton University
• 9:05 AM Multivariate Regression with Calibration — Lie Wang, Massachusetts Institute of Technology ; Han Liu, Princeton University ; Tuo Zhao, Johns Hopkins University
• 9:35 AM Principal Component Analysis for High-Dimensional Non-Gaussian Data — Fang Han, Johns Hopkins University ; Han Liu, Princeton University
• 10:05 AM Floor Discussion
• 325 * ! Tue, 8/6/2013, 10:30 AM – 12:20 PM CC-520b
• Modern Nonparametric and High-Dimensional Statistics — Invited Papers
• 10:35 AM Simple Tiered Classifiers — Peter Gavin Hall, University of Melbourne ; Jinghao Xue, University College London ; Yingcun Xia, National University of Singapore
• 11:05 AM Sparse PCA: Optimal Rates and Adaptive Estimation — Tony Cai, University of Pennsylvania
• 11:35 AM Statistical Inference in Compound Functional Models — Alexandre Tsybakov, CREST-ENSAE
• 12:05 PM Floor Discussion
• 392 Tue, 8/6/2013, 2:00 PM – 3:50 PM CC-710a
• Introductory Overview Lecture: Big Data — Other Special Presentation
• 2:05 PM The Relative Size of Big Data — Bin Yu, Univ of California at Berkeley
• 2:55 PM Divide and Recombine (D&R) with RHIPE for Large Complex Data — William S. Cleveland, Purdue University
• 3:45 PM Floor Discussion
• 445 Tue, 8/6/2013, 4:00 PM – 5:50 PM CC-517ab
• Deming Lecture — Invited Papers
• 4:05 PM Industrial Statistics: Research vs. Practice — Vijay Nair, University of Michigan
• Aug 7th
• 10:35 AM Bayesian and Frequentist Issues in Large-Scale Inference — Bradley Efron, Stanford University
• 11:20 AM Criteria for Bayesian Model Choice with Application to Variable Selection — Jim Berger, Duke University ; Susie Bayarri, University of Valencia ; Anabel Forte, Universitat Jaume I ; Gonzalo Garcia-Donato, Universidad de Castilla-La Mancha
• 571 Wed, 8/7/2013, 2:00 PM – 3:50 PM CC-511c
• Statistical Methods for High-Dimensional Sequence Data — Invited Papers
• 2:05 PM Linkage Disequilibrium in Sequencing Data: A Blessing or a Curse? — Alkes L. Price, Harvard School of Public Health
• 2:25 PM Statistical Prioritization of Sequence Variants — Lisa Joanna Strug, The Hospital for Sick Children and University of Toronto ; Weili Li, The Hospital for Sick Children and University of Toronto
• 2:45 PM On Some Statistical Issues in Analyzing Whole-Genome Sequencing Data — Dan Liviu Nicolae, The University of Chicago
• 3:05 PM Statistical Methods for Studying Rare Variant Effects in Next-Generation Sequencing Association Studies — Xihong Lin, Harvard School of Public Health
• 3:25 PM Adjustment for Population Stratification in Association Analysis of Rare Variants — Wei Pan, University of Minnesota ; Yiwei Zhang, University of Minnesota ; Binghui Liu, University of Minnesota ; Xiaotong Shen, University of Minnesota
• 3:45 PM Floor Discussion
• 612 Wed, 8/7/2013, 4:00 PM – 5:50 PM CC-517ab
• COPSS Awards and Fisher Lecture — Invited Papers
• 4:05 PM From Fisher to Big Data: Continuities and Discontinuities — Peter Bickel, University of California – Berkeley
• 5:45 PM Floor Discussion
• Aug 8th
• 621 Thu, 8/8/2013, 8:30 AM – 10:20 AM CC-516d
• Recent Advances in Bayesian Computation — Invited Papers
• 8:35 AM An Adaptive Exchange Algorithm for Sampling from Distribution with Intractable Normalizing Constants — Faming Liang, Texas A&M University
• 9:00 AM Efficiency of Markov Chain Monte Carlo for Bayesian Computation — Dawn B Woodard, Cornell University
• 9:25 AM Scalable Inference for Hierarchical Topic Models — John W. Paisley, University of California, Berkeley
• 9:50 AM Augmented Particle Filters — Yuguo Chen, University of Illinois at Urbana-Champaign
• 10:15 AM Floor Discussion
• 661 * ! Thu, 8/8/2013, 10:30 AM – 12:20 PM CC-710b
• Patterns and Extremes: Developments and Review of Spatial Data Analysis — Invited Papers
• 10:35 AM Multivariate Max-Stable Spatial Processes — Marc G. Genton, KAUST ; Simone Padoan, Bocconi University of Milan ; Huiyan Sang, TAMU
• 10:55 AM Approximate Bayesian Computing for Spatial Extremes — Robert James Erhardt, Wake Forest University ; Richard Smith, The University of North Carolina at Chapel Hill

This is from a post Connected objects and a reconstruction theorem:

A common theme in mathematics is to replace the study of an object with the study of some category that can be built from that object. For example, we can

• replace the study of a group  $G$ with the study of its category $G\text{-Rep}$ of linear representations,
• replace the study of a ring $R$ with the study of its category $R\text{-Mod}$ of $R$-modules,
• replace the study of a topological space $X$ with the study of its category $\text{Sh}(X)$ of sheaves,

and so forth. A general question to ask about this setup is whether or to what extent we can recover the original object from the category. For example, if $G$ is a finite group, then as a category, the only data that can be recovered from $G\text{-Rep}$ is the number of conjugacy classes of $G$, which is not much information about $G$. We get considerably more data if we also have the monoidal structure on $G\text{-Rep}$, which gives us the character table of $G$ (but contains a little more data than that, e.g. in the associators), but this is still not a complete invariant of $G$. It turns out that to recover $G$ we need the symmetric monoidal structure on $G\text{-Rep}$; this is a simple form of Tannaka reconstruction.
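A classic example (my addition, not from the quoted post) of why the character table is not a complete invariant: the dihedral group $D_4$ and the quaternion group $Q_8$ are non-isomorphic groups of order 8 that have identical character tables, so monoidal data alone cannot tell them apart.

```latex
% Both D_4 and Q_8 have five conjugacy classes and irreducible
% representations of dimensions 1, 1, 1, 1, 2
% (note 1^2 + 1^2 + 1^2 + 1^2 + 2^2 = 8 = |G|),
% with the same character table, and yet
D_4 \;\not\cong\; Q_8.
```

Distinguishing them requires finer data, such as the symmetric monoidal structure mentioned above.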

The evidence in large medical data sets is direct, but indirect as well – and there is just too much of the indirect evidence to ignore. If you want to prove that your drug of choice is good or bad your evidence is not just how it does, it is also how all the other drugs do. And that is a crucial point that doesn’t fit easily into the frequentist world, which is a world of direct evidence (very often, but not always); and it also doesn’t fit extremely well into the formal Bayesian world, because the indirect information isn’t actually the prior distribution, it is evidence of a prior distribution, and that in some sense is not as neat. Neatness counts in science. Things that people can understand and really manipulate are terribly important.

“So I have been very interested in massive data sets not because they are massive but because they seem to offer opportunities to think about statistical inferences from the ground up again.”

The Fisher–Pearson–Neyman paradigm dating from around 1900 was, he says, “like a light being switched on. But it is so beautiful and so almost airtight that it is pretty hard to improve on; and that means that it is very hard to rethink what is good or bad about statistics.

“Fisher of course had this wonderful view of how you do what I would call small-sample inference. You tend to get very smart people trying to improve on this kind of area, but you really cannot do that very well because there is a limited amount that is available to work on. But now suddenly there are these problems that have a different flavour. It really is quite different doing ten thousand estimates at once. There is evidence always lurking around the edges. It is hard to say where that evidence is, but it’s there. And if you ignore it you are just not going to do a good job.

“Another way to say it is that a Bayesian prior is an assumption of an infinite amount of past relevant experience. It is an incredibly powerful assumption, and often a very useful assumption for moving forward with complicated data analysis. But you cannot forget that you have just made up a whole bunch of data.

“So of course the trick for Bayesians is to do their ‘making up’ part without really influencing the answer too much. And that is really tricky in these higher-dimensional problems.”

1. Machine Learning, Big Data, Deep Learning, Data Mining, Statistics, Decision & Risk Analysis, Probability, Fuzzy Logic FAQ
2. A Funny Thing Happened on the Way to Academia . . .
3. Perspective: “Why C++ Is Not ‘Back’”
4. Is Fourier analysis a special case of representation theory or an analogue?
5. The Beauty of Bioconductor
6. The State of Statistics in Julia
7. Open Source Misfeasance
8. Book review: The Signal and The Noise
9. Should the Cox Proportional Hazards model get the Nobel Prize in Medicine?
10. The most influential data scientists on Twitter
11. Here is an interesting review of Nate Silver’s book. What is interesting about the review is that it criticizes not the statistical content but the belief that people only use data analysis for good, a theme we’ve seen before. Gelman also reviews the review.——Simply Statistics
12. Video: “Matrices and their singular values” (1976)
13. Beyond Computation: The P vs NP Problem – Michael Sipser. This talk is arguably the very best introduction to computational complexity.
14. What are some of your personal guidelines for writing good, clear code?
15. How do you explain Machine Learning and Data Mining to non-CS people?
16. Suggested New Year’s resolution: start a blog. A blog forces you to articulate your thoughts rather than having vague feelings about issues; you also get much more comfortable with writing, because you’re doing it rather than thinking about doing it; and if other people read your blog, you get to hear what they think too. You learn a lot that way. Set aside time for your blog every day, and keep notes for yourself on bloggy subjects (write a one-line gmail to yourself with the subject “blog ideas”).
17. Tips on job market interviews
18. The age of the essay

These days I have been working on computation and programming languages, and I want to share a few things with you here.

1. You cannot expect C++ to magically make your code faster. If speed is a concern, you need profiling to find the bottleneck instead of blindly guessing.——Yan Zhou. Thus we need to learn how to profile a program in R, MATLAB, C++, or Python.
2. When something complicated does not work, I generally try to restart with something simpler, and make sure it works.——Dirk Eddelbuettel.
3. If you’re calling your function thousands or millions of times, then it might pay to closely examine your memory allocation strategies and figure out what’s temporary.——Christian Gunning.
4. No, your main issue is not thinking about the computation.  As soon as you write something like
arma::vec betahat = arma::inv(Inv)*arma::trans(D)*W*y;
you are in theory land which has very little relationship to practical numerical linear algebra.  If you want to perform linear algebra calculations like weighted least squares you should first take a bit of time to learn about numerical linear algebra as opposed to theoretical linear algebra.  They are very different disciplines.  In theoretical linear algebra you write the solution to a system of linear equations as above, using the inverse of the system matrix.  The first rule of numerical linear algebra is that you never calculate the inverse of a matrix, unless you only plan to do toy examples.  You mentioned sizes of 4000 by 4000 which means that the method you have chosen is doing thousands of times more work than necessary (hint: how do you think that the inverse of a matrix is calculated in practice? – ans: by solving n systems of equations, which you are doing here when you could be solving only one).
Dirk and I wrote about 7 different methods of solving least squares problems in our vignette on RcppEigen.  None of those methods involve taking the inverse of an n by n matrix.
R and Rcpp and whatever other programming technologies come along will never be a “special sauce” that takes the place of thinking about what you are trying to do in a computation.——Douglas Bates.

// [[Rcpp::depends(RcppEigen)]]
#include <RcppEigen.h>

typedef Eigen::MatrixXd           Mat;
typedef Eigen::Map<Mat>           MMat;
typedef Eigen::HouseholderQR<Mat> QR;
typedef Eigen::VectorXd           Vec;
typedef Eigen::Map<Vec>           MVec;

// [[Rcpp::export]]
Rcpp::List wtls(const MMat X, const MVec y, const MVec sqrtwts) {
    return Rcpp::List::create(Rcpp::Named("betahat") =
        QR(sqrtwts.asDiagonal() * X).solve(sqrtwts.asDiagonal() * y));
}
5. Repeatedly calling an R function is probably not the smartest thing to do in an otherwise complex and hard to decipher program.—-Dirk Eddelbuettel.
6. Computers don’t do random things, unlike human beings. Something that worked once is very likely to work however many times you repeat it, as long as the input is the same (unless the function has side effects). So repeating it 1,000 times is the same as running it once.——Yan Zhou
7. Yan Zhou: Here are a few things people usually do before asking on a mailing list (not just the Rcpp list, but any such list, like R-help, StackOverflow, etc.):
1. I write a program, and it crashes.
2. I find the site of the crash.
3. I make the program simpler and simpler until it is minimal and the crash is still reproducible.
4. I still cannot figure out what is wrong with the four or five lines that crash in the minimal example.