Monthly Archive

You are currently browsing the monthly archive for November 2012.

Useful for referring–11-28-2012

November 28, 2012 in Useful for referring | 1 comment

Grad Student’s Guide to Good Coffee+Grad Student’s Guide to Good Tea
Favorite Apps for Work and Life
estimating a constant (not really)
Reinforcement Learning in R: An Introduction to Dynamic Programming
The Future of Machine Learning (and the End of the World?)
10 Papers Every Programmer Should Read (At Least Twice)
R in the Press
On Chomsky and the Two Cultures of Statistical Learning
Speech Recognition Breakthrough for the Spoken, Translated Word
Frequentist vs Bayesian
w4s – the awesomeness we’re experiencing
Why is the Gaussian so pervasive in mathematics?
C++ Blogs that you Regularly Follow
An interview with Brad Efron about scientific writing. I haven’t watched the whole interview, but I do know that Efron is one of my favorite writers among statisticians.
Slidify, another approach for making HTML5 slides directly from R. (1) It is still just a little too hard to change the theme/feel of the slides. (2) The placement/insertion of images is still a little clunky. Google Docs has figured this out; if they integrated the best features of Slidify, LaTeX, etc. into that system, it would be great.
Statistics is still the new hotness. Here is a Business Insider list about 5 statistics problems that will “change the way you think about the world”.
New Yorker, especially the line, “statisticians are the new sexy vampires, only even more pasty” (via Brooke A.)
The closed graph theorem in various categories
Got spare time? Watch some videos about statistics
About the first Borel-Cantelli lemma
Yihui Xie: The Setup
Best Practices for Scientific Computing

Sage And Python

November 5, 2012 in Academic, Computer Science, Mathematics, Statistics | 1 comment

Python is great, and I think it will stay great. Pure mathematics involves a lot of symbolic calculation, because fields such as differential geometry, commutative algebra, and algebraic geometry are abstract and powerful. However, science is nothing but experiment and computation, so we also need powerful computational software to help us work out results. Sage is your choice! Sage describes itself as follows:

Sage is a free open-source mathematics software system licensed under the GPL. It combines the power of many existing open-source packages into a common Python-based interface.

Mission: Creating a viable free open source alternative to Magma, Maple, Mathematica and Matlab.

Sage is not only for pure mathematics. Today I happened to see a blog post about using Sage to calculate the higher moments of a Gaussian distribution:

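var('m, s, t')
# moment generating function of a Gaussian with mean m and standard deviation s
mgf(t) = exp(m*t + t^2*s^2/2)
for i in range(1, 11):
    # the i-th moment is the i-th derivative of the MGF evaluated at t = 0
    print(derivative(mgf, t, i).subs(t=0))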

which leads to the following result:

m

m^2 + s^2

m^3 + 3*m*s^2

m^4 + 6*m^2*s^2 + 3*s^4

m^5 + 10*m^3*s^2 + 15*m*s^4

m^6 + 15*m^4*s^2 + 45*m^2*s^4 + 15*s^6

m^7 + 21*m^5*s^2 + 105*m^3*s^4 + 105*m*s^6

m^8 + 28*m^6*s^2 + 210*m^4*s^4 + 420*m^2*s^6 + 105*s^8

m^9 + 36*m^7*s^2 + 378*m^5*s^4 + 1260*m^3*s^6 + 945*m*s^8

m^10 + 45*m^8*s^2 + 630*m^6*s^4 + 3150*m^4*s^6 + 4725*m^2*s^8 + 945*s^10
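
The table above can be cross-checked without Sage. Differentiating the moment generating function gives M'(t) = (m + s^2*t)*M(t), which implies the recurrence E[X^n] = m*E[X^(n-1)] + (n-1)*s^2*E[X^(n-2)]. The following is a minimal plain-Python sketch of that recurrence; the helper names gaussian_moments and format_poly are hypothetical and come from neither the original post nor Sage.

```python
from collections import defaultdict


def gaussian_moments(n_max):
    """Return [E[X^0], ..., E[X^n_max]] for X ~ N(m, s^2).

    Each moment is a dict mapping (power of m, power of s) to an
    integer coefficient.
    """
    moments = [{(0, 0): 1}, {(1, 0): 1}]
    for n in range(2, n_max + 1):
        poly = defaultdict(int)
        # Contribution of m * E[X^(n-1)]
        for (a, b), c in moments[n - 1].items():
            poly[(a + 1, b)] += c
        # Contribution of (n - 1) * s^2 * E[X^(n-2)]
        for (a, b), c in moments[n - 2].items():
            poly[(a, b + 2)] += (n - 1) * c
        moments.append(dict(poly))
    return moments[: n_max + 1]


def format_poly(poly):
    """Format a moment in the same style as the output shown above."""
    terms = []
    # Sort terms by descending power of m
    for (a, b), c in sorted(poly.items(), key=lambda kv: -kv[0][0]):
        factors = []
        if c != 1 or (a == 0 and b == 0):
            factors.append(str(c))
        if a:
            factors.append("m" if a == 1 else f"m^{a}")
        if b:
            factors.append("s" if b == 1 else f"s^{b}")
        terms.append("*".join(factors))
    return " + ".join(terms)


if __name__ == "__main__":
    for moment in gaussian_moments(10)[1:]:
        print(format_poly(moment))
```

Run as a script, this should print the same ten polynomials, one per line.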

Go Python! Go Sage!

How Bayesian Challenge Frequentist

November 3, 2012 in Academic, Statistics | 3 comments

Recently, I have heard a lot about the disadvantages of frequentist statistics, including complaints about the p-value, which is a hot topic because of the God particle (the Higgs boson).

Professor J. K. Kruschke gave a talk on Doing Bayesian Data Analysis at Michigan State University in September. He introduced the concept of “intention”, including the intended hypothesis, intended experiments, and intended sampling. Basically, he explained that many frequentist procedures depend on the analyst’s intentions, and he argued that this is not science, since everything then depends on people’s intentions. If you want to know more about this, please refer to the paper.

Today I came across the following blog post from Statistical Modeling, Causal Inference, and Social Science, which is also about the intention issue in frequentist statistics:

Sometimes the problem is that the frequentist criterion being used is not of applied relevance. Consider a simple problem such as estimating a proportion p, given y successes out of n trials, where n=100 and y=0. The best estimate of p will be different if I tell you that p is the probability of a rare disease, compared to if I tell you that p is the proportion of African Americans who plan to vote for Mitt Romney.
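
To make the quoted point concrete, here is a minimal sketch of how prior information changes the estimate when y = 0 and n = 100. It assumes a conjugate Beta(a, b) prior, under which the posterior mean of p is (a + y) / (a + b + n). The two priors below are hypothetical choices made only for illustration; they are not taken from the quoted post.

```python
from fractions import Fraction


def posterior_mean(y, n, a, b):
    """Posterior mean of p under a Beta(a, b) prior, given y successes in n trials."""
    return Fraction(a + y, a + b + n)


y, n = 0, 100

# The maximum-likelihood estimate ignores context entirely.
mle = Fraction(y, n)

# Hypothetical prior for a rare disease: Beta(1, 999), prior mean 0.001.
rare_disease = posterior_mean(y, n, 1, 999)

# Hypothetical prior for a small voting share: Beta(5, 95), prior mean 0.05.
voters = posterior_mean(y, n, 5, 95)

print(mle)           # 0
print(rare_disease)  # 1/1100
print(voters)        # 1/40
```

The same data give estimates that differ by more than an order of magnitude, purely because of what p is assumed to represent.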

I would like some frequentists to explain this intention issue, since I think it is a reasonable question. Any comments?

Update:

The following cartoon caused a fight between frequentists and Bayesians:

A post from Andrew: I don’t like this cartoon
A post from Normal Deviate: anti xkcd

And the following comment really makes a point:

Suppose I had a medical test with a 1/6 false positive rate and a 0% false negative rate. That is, if administered to someone without the disease it has a 1/6 chance of reporting positive. The protocol is to administer the test and, if positive, to administer it again. Assuming independence, the probability of two consecutive false positives is 1/36. Some statisticians would reject the null hypothesis (that the patient is disease free) given 2/2 positive tests. That is ridiculous for the same reason the xkcd example is ridiculous (it ignores prior or base rate information) but it is indeed the practice in some circles, I’m told. (Phil)

Also refer to the explanation from Andrew:

In the context of probability mathematics, textbooks carefully explain that p(A|B) != p(B|A), and how a test with a low error rate can have a high rate of errors conditional on a positive finding, if the underlying rate of positives is low, but the textbooks typically confine this problem to the probability chapters and don’t explain its relevance to accept/reject decisions in statistical hypothesis testing.
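
Phil's medical-test example can be worked through with Bayes' rule. The sketch below uses the error rates from his comment (1/6 false positive rate, 0% false negative rate, two independent positive tests). The base rate of 1 in 1000 is an assumption added here for illustration; the comment does not give one. The function name posterior_after_positives is likewise hypothetical.

```python
from fractions import Fraction


def posterior_after_positives(prior, false_pos, false_neg, k):
    """P(disease | k independent positive tests), by Bayes' rule."""
    sick = prior * (1 - false_neg) ** k
    healthy = (1 - prior) * false_pos ** k
    return sick / (sick + healthy)


prior = Fraction(1, 1000)  # assumed base rate, not from the quoted comment

# Probability of two consecutive false positives for a healthy patient.
p_two_false_pos = Fraction(1, 6) ** 2

# Probability the patient is actually sick after two positive tests.
posterior = posterior_after_positives(prior, Fraction(1, 6), Fraction(0), 2)

print(p_two_false_pos)  # 1/36
print(posterior)        # 4/115
```

Under this assumed base rate, the probability of disease after two positives is 4/115, or about 3.5%, even though the chance of two false positives is only 1/36. This is the gap between p(A|B) and p(B|A) described above.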

Update: (Two videos from Professor Kruschke, J.K.)

Bayesian estimation supersedes the t test in 14 minutes of video + Bayesian Methods Interpret Data Better

Update:

Examples of Bayesian and frequentist approach giving different answers
