My academic homepage has just been launched. You are welcome to visit: Honglang Wang’s Homepage.
A Blog on Statistics, Biostatistics, Mathematics and Machine Learning
This post is for JSM2013. I will put useful links here and I will update this post during the meeting.
What I have learned from this meeting (Key words of this meeting):
Big Data, Bayesian, Statistical Efficiency vs Computational Efficiency
I was in Montreal from Aug 1st to Aug 8th for JSM2013 and traveling.
(Traveling in Quebec: Olympic Stadium; Underground City; Quebec City; Montreal City; Basilique Notre-Dame; Chinatown)
(Talks at JSM2013: Jianqing Fan; Jim Berger; Nate Silver; Tony Cai; Han Liu; Two Statistical Peters)
(My Presentation at JSM2013)
The following is a list of the talks I attended:
This is from a post Connected objects and a reconstruction theorem:
A common theme in mathematics is to replace the study of an object with the study of some category that can be built from that object. For example, we can replace the study of a group G with the study of its category Rep(G) of linear representations, and so forth. A general question to ask about this setup is whether or to what extent we can recover the original object from the category. For example, if G is a finite group, then as a category, the only data that can be recovered from Rep(G) is the number of conjugacy classes of G, which is not much information about G. We get considerably more data if we also have the monoidal structure on Rep(G), which gives us the character table of G (but contains a little more data than that, e.g. in the associators), but this is still not a complete invariant of G. It turns out that to recover G we need the symmetric monoidal structure on Rep(G); this is a simple form of Tannaka reconstruction.
The evidence in large medical data sets is direct, but indirect as well – and there is just too much of the indirect evidence to ignore. If you want to prove that your drug of choice is good or bad your evidence is not just how it does, it is also how all the other drugs do. And that is a crucial point that doesn’t fit easily into the frequentist world, which is a world of direct evidence (very often, but not always); and it also doesn’t fit extremely well into the formal Bayesian world, because the indirect information isn’t actually the prior distribution, it is evidence of a prior distribution, and that in some sense is not as neat. Neatness counts in science. Things that people can understand and really manipulate are terribly important.
“So I have been very interested in massive data sets not because they are massive but because they seem to offer opportunities to think about statistical inferences from the ground up again.”
The Fisher–Pearson–Neyman paradigm dating from around 1900 was, he says, “like a light being switched on. But it is so beautiful and so almost airtight that it is pretty hard to improve on; and that means that it is very hard to rethink what is good or bad about statistics.”
“Fisher of course had this wonderful view of how you do what I would call small-sample inference. You tend to get very smart people trying to improve on this kind of area, but you really cannot do that very well because there is a limited amount that is available to work on. But now suddenly there are these problems that have a different flavour. It really is quite different doing ten thousand estimates at once. There is evidence always lurking around the edges. It is hard to say where that evidence is, but it’s there. And if you ignore it you are just not going to do a good job.
“Another way to say it is that a Bayesian prior is an assumption of an infinite amount of past relevant experience. It is an incredibly powerful assumption, and often a very useful assumption for moving forward with complicated data analysis. But you cannot forget that you have just made up a whole bunch of data.
“So of course the trick for Bayesians is to do their ‘making up’ part without really influencing the answer too much. And that is really
tricky in these higher-dimensional problems.”
From http://www-stat.stanford.edu/~ckirby/brad/other/2010Significance.pdf
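Efron’s point about the indirect evidence lurking in many simultaneous estimates can be illustrated with the classic James–Stein phenomenon. This is my own sketch (not from the interview), assuming numpy is available: when estimating thousands of normal means at once, a data-driven shrinkage factor computed from all the other coordinates beats the coordinate-wise maximum likelihood estimate in total squared error.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 10000                          # number of means estimated simultaneously
theta = rng.normal(0, 1, p)        # true means
x = theta + rng.normal(0, 1, p)    # one noisy observation per mean (sigma = 1)

# Coordinate-wise MLE: each observation estimates its own mean, in isolation.
mle = x

# James-Stein: shrink every coordinate toward zero by a data-driven factor.
# The factor depends on the sum over *all* coordinates -- indirect evidence
# from the other 9999 problems improves each individual estimate.
js = (1 - (p - 2) / np.sum(x**2)) * x

mse_mle = np.mean((mle - theta)**2)
mse_js = np.mean((js - theta)**2)
print(mse_mle, mse_js)  # the shrinkage estimator has smaller average error
```

Here the shrinkage factor plays the role of an estimated prior, which is exactly the “evidence of a prior distribution” Efron describes: it is learned from the data rather than assumed.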
These days I have been working with computation and programming languages. I want to share something with you here.
In my office I have two NIPS posters on the wall, from 2011 and 2012. I have never been to the conference, and I am not a computer scientist either, but I like NIPS anyway. Now it’s time for me to organize posts from others:
Among all of these posts, there are several things I need to digest later:
Basically, break random matrices down into a sum of simpler, independent random matrices, then apply concentration bounds to the sum. —John Moeller. The basic result is that if you love your Chernoff bounds and Bernstein inequalities for (sums of) scalars, you can get almost exactly the same results for (sums of) matrices. —hal.
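A toy numerical illustration of this setup (my own sketch, assuming numpy; not from the quoted posts): average n independent mean-zero random symmetric matrices and watch the operator norm of the average shrink as n grows, which is the qualitative behavior that matrix Chernoff/Bernstein bounds control.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 20  # matrix dimension

def random_sign_symmetric(d):
    """One independent mean-zero random symmetric matrix built from +-1 signs."""
    a = rng.choice([-1.0, 1.0], size=(d, d))
    return (a + a.T) / 2

norms = []
for n in [10, 100, 1000]:
    avg = sum(random_sign_symmetric(d) for _ in range(n)) / n
    # Operator (spectral) norm of the deviation from the zero mean.
    norms.append(np.linalg.norm(avg, 2))
    print(n, norms[-1])
```

The printed norms decay roughly like 1/sqrt(n), the scalar-like concentration rate that the matrix versions of these inequalities guarantee up to dimension-dependent factors.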
Besides the papers mentioned in the above hot topics, there are some other papers from Memming‘s post:
Update: Make sure to check the lectures from the prominent 26th Annual NIPS Conference filmed @ Lake Tahoe 2012. Also make sure to check the NIPS 2012 Workshops, Oral sessions and Spotlight sessions which were collected for the Video Journal of Machine Learning Abstracts – Volume 3.
Python is great, and I think Sage will be great as well. Pure mathematics involves a great deal of symbolic calculation, since it is abstract and powerful, in fields like differential geometry, commutative algebra, and algebraic geometry. Science, however, also runs on experiment and computation, so we need powerful computational software to work out results. Sage is a good choice, since Sage claims that:
Sage is a free open-source mathematics software system licensed under the GPL. It combines the power of many existing open-source packages into a common Python-based interface. Mission: Creating a viable free open source alternative to Magma, Maple, Mathematica and Matlab.
And it is not only for pure mathematics: today I happened to see a blog post about using Sage to calculate high moments of the Gaussian distribution:
var('m, s, t')
mgf(t) = exp(m*t + t^2*s^2/2)  # moment generating function of N(m, s^2)
for i in range(1, 11):
    print(derivative(mgf, t, i).subs(t=0))  # i-th raw moment E[X^i]
m
m^2 + s^2
m^3 + 3*m*s^2
m^4 + 6*m^2*s^2 + 3*s^4
m^5 + 10*m^3*s^2 + 15*m*s^4
m^6 + 15*m^4*s^2 + 45*m^2*s^4 + 15*s^6
m^7 + 21*m^5*s^2 + 105*m^3*s^4 + 105*m*s^6
m^8 + 28*m^6*s^2 + 210*m^4*s^4 + 420*m^2*s^6 + 105*s^8
m^9 + 36*m^7*s^2 + 378*m^5*s^4 + 1260*m^3*s^6 + 945*m*s^8
m^10 + 45*m^8*s^2 + 630*m^6*s^4 + 3150*m^4*s^6 + 4725*m^2*s^8 + 945*s^10
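The same trick works outside Sage with plain sympy (a sketch, assuming sympy is installed): differentiate the moment generating function of N(m, s^2) and evaluate at t = 0 to read off the raw moments.

```python
import sympy as sp

m, s, t = sp.symbols('m s t')
mgf = sp.exp(m*t + t**2 * s**2 / 2)  # MGF of a N(m, s^2) random variable

# i-th raw moment E[X^i] is the i-th derivative of the MGF at t = 0.
# The first few match the Sage output above: m, m^2 + s^2, m^3 + 3*m*s^2, ...
for i in range(1, 5):
    print(sp.expand(sp.diff(mgf, t, i).subs(t, 0)))
```

This is a handy cross-check when Sage is not available, since sympy ships as an ordinary Python package.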