You are currently browsing the monthly archive for November 2012.

- Grad Student’s Guide to Good Coffee
- Grad Student’s Guide to Good Tea
- Favorite Apps for Work and Life
- estimating a constant (not really)
- Reinforcement Learning in R: An Introduction to Dynamic Programming
- The Future of Machine Learning (and the End of the World?)
- 10 Papers Every Programmer Should Read (At Least Twice)
- R in the Press
- On Chomsky and the Two Cultures of Statistical Learning
- Speech Recognition Breakthrough for the Spoken, Translated Word
- Frequentist vs Bayesian
- w4s – the awesomeness we’re experiencing
- Why is the Gaussian so pervasive in mathematics?
- C++ Blogs that you Regularly Follow
- An interview with Brad Efron about scientific writing. I haven’t watched the whole interview, but I do know that Efron is one of my favorite writers among statisticians.
- Slidify, another approach for making HTML5 slides directly from R. (1) It is still a little too hard to change the theme/feel of the slides. (2) The placement/insertion of images is still a little clunky; Google Docs has figured this out. If the best features of Slidify, LaTeX, etc. were integrated into that system, it would be great.
- Statistics is still the new hotness. Here is a Business Insider list of 5 statistics problems that will “change the way you think about the world”.
- New Yorker, especially the line, “statisticians are the new sexy vampires, only even more pasty” (via Brooke A.)
- The closed graph theorem in various categories
- Got spare time? Watch some videos about statistics
- About the first Borel-Cantelli lemma
- Yihui Xie: The Setup
- Best Practices for Scientific Computing

Python is great, and I think it will only get greater. Pure mathematics involves a lot of symbolic calculation, since fields like differential geometry, commutative algebra, and algebraic geometry are abstract and powerful. But science is ultimately experiment and computation, so we also need powerful computational software to carry the calculations out. Sage is a good choice, since Sage claims that

Sage is a free open-source mathematics software system licensed under the GPL. It combines the power of many existing open-source packages into a common Python-based interface. Mission: Creating a viable free open source alternative to Magma, Maple, Mathematica and Matlab.

Sage is not only for pure mathematics: today I happened to see a blog post about using Sage to calculate higher moments of the Gaussian distribution:

```python
var('m, s, t')
mgf(t) = exp(m*t + t^2*s^2/2)       # MGF of X ~ N(m, s^2)
for i in range(1, 11):
    print(derivative(mgf, t, i)(t=0))   # i-th raw moment E[X^i]
```

which yields the first ten raw moments:

```
m
m^2 + s^2
m^3 + 3*m*s^2
m^4 + 6*m^2*s^2 + 3*s^4
m^5 + 10*m^3*s^2 + 15*m*s^4
m^6 + 15*m^4*s^2 + 45*m^2*s^4 + 15*s^6
m^7 + 21*m^5*s^2 + 105*m^3*s^4 + 105*m*s^6
m^8 + 28*m^6*s^2 + 210*m^4*s^4 + 420*m^2*s^6 + 105*s^8
m^9 + 36*m^7*s^2 + 378*m^5*s^4 + 1260*m^3*s^6 + 945*m*s^8
m^10 + 45*m^8*s^2 + 630*m^6*s^4 + 3150*m^4*s^6 + 4725*m^2*s^8 + 945*s^10
```
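For readers without a Sage installation, the same trick of differentiating the MGF at zero can be sketched in plain Python with SymPy (a sketch assuming the `sympy` package; this is not the original Sage session):

```python
import sympy as sp

# MGF of X ~ N(m, s^2); its i-th derivative at t = 0 is the raw moment E[X^i]
m, s, t = sp.symbols('m s t')
mgf = sp.exp(m*t + t**2*s**2/2)

moments = [sp.expand(sp.diff(mgf, t, i).subs(t, 0)) for i in range(1, 11)]
for mom in moments:
    print(mom)
```

The printed expressions match the Sage output above, up to SymPy's `**` notation for powers.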

Recently, I have heard a lot about the disadvantages of frequentist statistics, including the complaints about the p-value, which is a hot topic due to the God particle.

Professor J. K. Kruschke gave a talk on *Doing Bayesian Data Analysis* at Michigan State University in September. He mentioned the concept of “**intention**,” covering intended hypotheses, intended experiments, and intended sampling. Basically, he argued that many frequentist statistical procedures are intention-dependent, which is not scientific, since the conclusions depend on the experimenter’s intentions. If you want to know more about this, please refer to the paper.
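The sampling-intention point can be made concrete with a classic textbook example (hypothetical data, not taken from the talk): suppose we observe 9 heads and 3 tails from a possibly biased coin. The one-sided p-value against a fair coin differs depending on whether the experimenter intended to stop after 12 flips or after the 3rd tail, even though the data are identical.

```python
from math import comb

# Hypothetical data: 9 heads, 3 tails; H0: fair coin (theta = 0.5)
z, N = 9, 12

# Intention 1: stop after N = 12 flips (binomial sampling).
# p = P(at least 9 heads in 12 flips)
p_binom = sum(comb(N, k) for k in range(z, N + 1)) / 2**N

# Intention 2: stop after the 3rd tail (negative binomial sampling).
# P(exactly k heads before the 3rd tail) = C(k + 2, 2) / 2^(k + 3);
# p = P(9 or more heads before stopping), summed far into the tail.
p_negbin = sum(comb(k + 2, 2) / 2**(k + 3) for k in range(z, 500))

print(round(p_binom, 4))   # 0.073
print(round(p_negbin, 4))  # 0.0327
```

Same data, different stopping intention, different p-value: that is exactly the dependence Kruschke objects to.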

Today I came across a blog post from Statistical Modeling, Causal Inference, and Social Science, which is also about the intention issue in frequentist statistics:

Sometimes the problem is that the frequentist criterion being used is not of applied relevance. Consider a simple problem such as estimating a proportion p, given y successes out of n trials, where n=100 and y=0. The best estimate of p will be different if I tell you that p is the probability of a rare disease, compared to if I tell you that p is the proportion of African Americans who plan to vote for Mitt Romney.
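Gelman's example can be sketched with a conjugate Beta-Binomial model. The two priors below are hypothetical choices meant to encode "rare disease" versus "vote share" background knowledge; they are not from the original post.

```python
from fractions import Fraction

n, y = 100, 0  # observed: 0 successes in 100 trials

def posterior_mean(a, b):
    """Beta(a, b) prior + y/n binomial data -> Beta(a + y, b + n - y) posterior;
    the posterior mean of p is (a + y) / (a + b + n)."""
    return Fraction(a + y, a + b + n)

# Flat Beta(1, 1) prior, plausible for a vote-share proportion
print(posterior_mean(1, 1))    # 1/102

# Beta(1, 999) prior concentrated near zero, encoding rare-disease knowledge
print(posterior_mean(1, 999))  # 1/1100
```

Both posteriors are consistent with y = 0, but the prior knowledge moves the estimate by an order of magnitude, which is Gelman's point.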

I do need some frequentist people to explain this intention issue, since I think the questioning is fairly reasonable. Any comments?

**Update:**

The following cartoon sparked a fight between frequentists and Bayesians:

- A post from Andrew: I don’t like this cartoon
- A post from Normal Deviate: anti xkcd

And the following really makes the point:

Suppose I had a medical test with a 1/6 false positive rate and a 0% false negative rate. That is, if administered to someone without the disease it has a 1/6 chance of reporting positive. The protocol is to administer the test and, if positive, to administer it again. Assuming independence, the probability of two consecutive false positives is 1/36. Some statisticians would reject the null hypothesis (that the patient is disease free) given 2/2 positive tests. That is ridiculous for the same reason the xkcd example is ridiculous (it ignores prior or base rate information), but it is indeed the practice in some circles, I’m told. (Phil)
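Phil's point is easy to check with Bayes' rule; the 1-in-1000 disease prevalence below is an assumed number for illustration, not from his comment.

```python
from fractions import Fraction

prior = Fraction(1, 1000)             # assumed disease prevalence (hypothetical)
p_pos2_healthy = Fraction(1, 6) ** 2  # two independent false positives: 1/36
p_pos2_disease = Fraction(1, 1)       # 0% false-negative rate

# Bayes' rule: P(disease | two positive tests)
posterior = (p_pos2_disease * prior) / (
    p_pos2_disease * prior + p_pos2_healthy * (1 - prior)
)
print(posterior)  # 4/115, about 0.035
```

So even after two positives the patient is probably healthy, while the frequentist procedure rejects "disease free" at p = 1/36.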

Also refer to the explanation from Andrew:

In the context of probability mathematics, textbooks carefully explain that p(A|B) != p(B|A), and how a test with a low error rate can have a high rate of errors conditional on a positive finding, if the underlying rate of positives is low, but the textbooks typically confine this problem to the probability chapters and don’t explain its relevance to accept/reject decisions in statistical hypothesis testing.

**Update:** (two videos from Professor J. K. Kruschke)

- Bayesian estimation supersedes the *t* test in 14 minutes of video
- Bayesian Methods Interpret Data Better

**Update:**

Examples of Bayesian and frequentist approach giving different answers
