Laurent Jacques wonders about a New class of RIP matrices? What’s a submodular function, you ask? I am glad you did; I had the same question. Here is an answer.

I mentioned Random Matrix Theory a while back. Terry Tao has some new results that he explains in Random matrices: Localization of the eigenvalues and the necessity of four moments. He makes a reference to the book An Introduction to Random Matrices by Greg Anderson, Alice Guionnet and Ofer Zeitouni. Of related interest:

Djalil Chafai:

Spectrum of non-Hermitian heavy tailed random matrices

Terry Tao

Future mini-polymath project: 2010 IMO Q6?

Gonzalo Vazquez Vilar

Testing and certification of Cognitive Radio equipment

John Langford

The Good News on Exploration and Learning

Djalil Chafai

Concentration for empirical spectral distributions

Sofia Dahl and Bob Sturm in

Andrew Gelman in

A Fast Hybrid Algorithm for Large Scale L1-Regularized Logistic Regression

Alex Gittens in

Gonzalo Vazquez Vilar in

Frank Nielsen in

New mailing list: infogeo for information geometry

Andrew McGregor in

ICALP accepts… and a MASSIVE reminder

I also found the following noteworthy papers, enjoy!

Surveying and comparing simultaneous sparse approximation (or group lasso) algorithms by Alain Rakotomamonjy.

First, I agree with Andrew: this essay by Mandelbrot is fascinating.

A maverick’s apprenticeship. The Wolf Prize for Physics. Edited by David Thouless. Singapore: World Scientific, 2004. [ PDF (154.4 KB) ]

Alex: Find a generating function for the Stirling partition numbers and Random matrix sparsification, comparison of current results
Djalil: Back to basics: total variation distance
Bob: CFP: 8th Sound and Music Computing (SMC) Conference 2011, and Sound Quality Seminar, Papers of the Day (Po’D): Finding or Not Finding Rules in Time Series Edition
Muthu: Romance Leads to Insights
Dick: Strong Codes For Weak Channels
Gregory: 2011 MIT IAP course, build a synthetic aperture radar in 4 weeks
Brian: Solving Resistor Networks Using Gaussian Elimination — An Illustration and Inverting A’CA
ISW: IsInvariant Proposes New Sensor Technology (increasing dynamic range, I can see how CS would benefit from that)
Meena: Quantitative white matter fiber analysis: a short history (Part III)
Jason: A Simple and Computationally Efficient Sampling Approach to Covariate Adjustment for Multifactor Dimensionality Reduction Analysis of Epistasis
Frank: Statistical manifold: Dual conjugate connections
Arthur: Mandelbrot, fractals and counterexamples in applied probability, Margin of error, and comparing proportions in the same sample
Gonzalo: Comments on information theory
Terahertz Technology: ConverTec Corp. releases TeraLaz CO2 terahertz laser system

Now on to the sites to check for any news on compressive sensing. Here is the (incomplete) list.

Arxiv
Google (Compressive Sensing / Compressed Sensing) 24 hours, week, month.
Rice University Compressive Sensing repository

Q&As

MathOverflow:

MetaOptimize:

MetaOptimize (Compressed Sensing)

LinkedIn:

LinkedIn Group Discussions

TheoreticalCS: (not yet working)

TheoreticalCS

BioStar

Friendfeed/Twitter

Compressed Sensing and Compressive Sensing in FriendFeed
Compressed Sensing, Compressive Sensing on Twitter.

2011-02-06:

Fast algorithms for nonconvex compressive sensing by Rick Chartrand, LANL

2011-02-12:

Here are blogs/papers to reflect on, enjoy!

Back in December, I asked What was the most interesting paper on Compressive Sensing you read in 2010 ? Here is a compilation of y’alls answers:

T. T. Cai, L. Wang and G. Xu, “New bounds for restricted isometry constants,” IEEE Trans. Inf. Theory, vol. 56, no. 9, pp. 4388–4394, Sept. 2010.
E.J. Candes and M.B. Wakin, “An Introduction To Compressive Sampling,” IEEE Signal Processing Magazine, vol. 25, Mar. 2008, pp. 21-30.
M. Mishali, Y.C. Eldar, O. Dounaevsky, E. Shoshan, “Xampling: Analog to Digital at Sub-Nyquist Rates”, CCIT Report #751 Oct-09, EE Pub No. 1708, EE Dept., Technion – Israel Institute of Technology,
“A probabilistic and RIPless theory of compressed sensing” by Emmanuel Candes and Yaniv Plan
J. T. O’Brien and W. P. Kamp and G. M. Hoover, Sign-bit amplitude recovery with applications to seismic data, Geophysics, 1982
Challenging Restricted Isometry Constants with Greedy Pursuit, with Peyre, G., and Fadili, J., Proc. of ITW’09, pp.475-479, 2009. ISBN: 978-1-4244-4982-8.
Mark Davenport, Jason Laska, Petros Boufounos, and Richard Baraniuk, A simple proof that random matrices are democratic. (Rice University ECE Department Technical Report TREE-0906, November 2009)
T. Blumensath, M. E. Davies, Iterative hard thresholding for compressed sensing. (Preprint, 2008)
Boufounos P. T., “Universal Rate-Efficient Scalar Quantization”
“Dequantizing Compressed Sensing: When Oversampling and Non-Gaussian Constraints Combine.”
Real versus complex null space properties for sparse vector recovery
Davenport, M.A.; Boufounos, P.T.; Wakin, M.B.; Baraniuk, R.G.; , “Signal Processing With Compressive Measurements,” Selected Topics in Signal Processing, IEEE Journal of , vol.4, no.2, pp.445-460, April 2010.

One of you readers recently let me know of the February Fourier Talks (http://www.norbertwiener.umd.edu/FFT/FFT11/index.html) being held on February 17 and 18 at the University of Maryland Norbert Wiener Center for Harmonic Analysis and Applications (http://www.norbertwiener.umd.edu/). Thank you, anonymous reader.

Recent entries I’ll probably be re-reading include:

2011-02-22

“…I found this idea of CS sketchy,…”

2011-03-01

CS: Would you like that entry Supersized? There are dozens of articles on compressive sensing

2011-03-02:

Open Source Software for iPad and iPhone

2011-03-07:

There’s a wonderful interview at the Notices with last year’s Abel Prize winner John Tate (video here). He blames the fact that his name is on so many mathematical results and concepts on Serge Lang. The 2011 Abel Prize winner will be announced on March 23rd.
Sir Michael Atiyah’s February 1 talk at the College de France titled A Geometer Explores the Universe is now on-line.
Matthew Emerton posts really good answers

2011-3-30:

Compressed Sensing: the L1 norm finds sparse solutions

2011-4-3:

So-called Bayesian hypothesis testing is just as bad as regular hypothesis testing

2011-4-17:

videolectures

Physics

Harvard Physics: Quantum Field Theory by Sidney Coleman – 50 videos
University of New Mexico: Physics 524 Quantum Field Theory II -27 videos
University of New Mexico: Physics 521 Quantum Mechanics – 32 videos
UCSD Quantum Physics 130A, 130B, 130C ~ 25 videos each
University of South Carolina PHYS 729 – Applied Group Theory – 22 Videos, The Foundations of Theoretical Physics Using Lie Groups & Algebras
Florida Atlantic University: PHY 6938 General Relativity — Fall 2007 – 28 videos
Brookhaven National Laboratory Streaming Video: Cosmology for Beginners -5 videos
MIT OpenCourseWare | Physics | Video Lectures – Physics I: Classical Mechanics, 8.02 E & M, 8.03 Vibrations and Waves, 8.224 GR & Astrophysics
Oregon State University – Physics 464/564, Computational Physics – 23 videos, based on “A Survey of Computational Physics”, Landau, Paez, Bordeianu
Cambridge University Video – Thermodynamics and Phase Diagrams with Harry Bhadeshia – 7 videos
University of New Mexico: Prof. Ivan H. Deutsch, Short Course in Quantum Information 8 videos
The Vega Science Trust – Astrophysical Chemistry by Harry Kroto – 8 videos
CERN: Introduction to String Theory – W. Lerche – 4 videos
CERN: String Theory – Johnson, C. (University of Southern California) – 5 videos
CERN: String Theory for Pedestrians – Zwiebach, B. (MIT) 3 videos, author of “A First Course in String Theory”
CERN Short Courses in Particle Physics – Accelerators, Detectors, Bubble Chambers, Feynman Diagrams, etc.

Mathematics

Stanford EE364a: Optimization Lecture Videos
Stanford EE263: Linear Dynamical Systems Lecture Videos
MIT Courseware: Godel, Escher, Bach: A Mental Space Odyssey
Constraint Programming Summer School 2007
University of Colorado at Colorado Springs UCCS – Mathematics Video Courses – Requires free registration.. lots of courses
UCCS Math 432 Modern Analysis II | Spring 2008
UCCS Math 311 Number Theory | Spring 2008
UCCS Math 535 Applied Functional Analysis | Spring 2006
Texas A&M University – Math 614 Dynamical Systems and Chaos
MIT OpenCourseWare | Mathematics | Video Lectures– 18.03 Differential Equations, 18.06 Linear Algebra, 18.085 Computational Science and Engineering I, 18.086 Mathematical Methods for Engineers II

Computer Science & Engineering

Machine Learning

Neuroscience & Biology

Finance and Econometrics

University of Toronto ACT 460 / STA2502 – Stochastic Methods for Actuarial Science – S. Jaimungal, Department of Statistics and Mathematical Finance Program
Economics 421 – Econometrics– Mark Thoma: Department of Economics, University of Oregon
Course Video Lectures: Latent Variable Analysis Professor Bengt Muthén of the UCLA Graduate School of Education & Information Studies
INFO 747 – Social and Economic Data – Cornell Record Linkage Course Lecture Videos Prof. John M. Abowd
UC Berkeley Webcasts: Econometrics 244 – Discrete Choice Methods with Simulation

Seminars, Talks, and Conference Videos:

See http://del.icio.us/pskomoroch/talk+video for more links…

Physics

Mathematics

Computer Science & Engineering

Machine Learning

Neuroscience & Biology

Finance and Economics

Open Courseware Directories and Other Video Lecture Roundup Posts

Berkeley Course Webcasts
MIT OpenCourseWare Videos
Stanford University Lecture Videos
Open Yale Courses
VideoLectures – exchange ideas & share knowledge
Free Science and Video Lectures Online!
Lecturefox: free university lectures – computer science, mathematics, physics
Business Intelligence, Data Mining & Machine Learning: Machine Learning OnLine Lectures – Machine Learning OnLine Lectures
Yet Another Machine Learning Blog » Machine learning videos [Pierre Dangauthier]
obousquet – ML Videos – Online videos of talks or lectures about Machine Learning related topics

2011-4-18:

Through Mark Davenport’s twitter stream, I learned about an extensive course entitled “An Introduction to Compressive Sensing” written by Richard Baraniuk, Mark Davenport, Marco Duarte and Chinmay Hegde, now available from cnx.org. Wow, I’ll add it shortly to both the big picture and the Teaching Compressive Sensing page.

2011-5-5:

Ways to prove the fundamental theorem of algebra

2011-5-27:

Distinguished and Plenary Talks

Rothschild Lecture, Isaac Newton Institute for Mathematical Sciences, Cambridge, March 28, 2011
Albert Einstein Memorial Lecture, Israel Academy of Sciences and Humanities, Jerusalem, March 14, 2011
Fields Institute Distinguished Lectures, Toronto, September 14-16, 2010
Distinguished Lecture, University of Rochester, October 30, 2009
Levi L. Conant Lecture, Worcester Polytechnic Institute, September 24, 2009
Sackler Distinguished Lectures in Mathematics, Tel Aviv University, March 9-13, 2009
Distinguished Lecture, Rutgers University, December 4, 2008
Distinguished Lecture Colloquium, Penn State, November 19, 2008
Asprey Distinguished Lecture Series, Vassar College, March 23, 2008
Toyota Technological Institute of Chicago Distinguished Lecture Series, March 6, 2008
UCLA Mathematics Department Distinguished Lecture Series, January 9, 10 and 11, 2008
Gibbs Lecture, Joint AMS-MAA Meeting, San Diego, January 6, 2008
CISE Distinguished Lecture at NSF, Washington, DC, September 27, 2007
Keynote lecture at FCRC, San Diego, CA, June 13, 2007
KAM Mathematical Colloquium, Prague, Czech Republic, April 27, 2007
Distinguished Lecture Series, University of Haifa, February 27 – March 1, 2007
Louis Clark Vanuxem Lectures, Princeton University, February 13, 14 and 15, 2007
Distinguished Lecture Series, University of Wisconsin, Madison, October 18, 2006
IEEE Conference on Computational Complexity, Prague, Czech Republic, July 16-20, 2006
Horizons of Truth Goedel Centenary 2006, University of Vienna, April 29, 2006
Radcliffe Institute for Advanced Study Science Lecture Series, October 9, 2003

The Institute for Advanced Study’s Women and Mathematics series of lectures and seminars featured the following interesting presentations this year:

Rebecca Willett’s 5/17 lecture 1 Methods for sparse analysis of high-dimensional data, I
Rebecca Willett’s 5/18 lecture 2 Sparsity: Correcting Error in Data
Rebecca Willett’s 5/19 lecture 3 Sparsity: Compressed Sensing
Rebecca Willett’s 5/20 lecture 4 Sparsity: Generalized Sparsity Measures and Applications
Sofya Raskhodnikova’s 5/17 lecture 1 Sublinear-Time Algorithms
Sofya Raskhodnikova’s 5/18 lecture 2 Sublinear-Time Algorithms
Sofya Raskhodnikova’s 5/19 lecture 3 Sublinear-Time Algorithms
Sofya Raskhodnikova’s 5/20 lecture 4 Sublinear-Time Algorithms
Rachel Ward’s 5/24 lecture 1 Methods for sparse analysis of high-dimensional data, II
Anna Gilbert’s 5/24 lecture 1 Background on sparse approximation
Anna Gilbert’s 5/25 lecture 2 Hardness results for sparse approximation problems
Anna Gilbert’s 5/26 lecture 3 Dictionary geometry, greedy algorithms, and convex relaxation
Peter Sarnak Mobius Function Lecture Three Lectures on the Mobius Function Randomness and Dynamics
Peter Sarnak Integral Apollonian Packings

2011-5-29:

Geometric Tools for Identifying Structure in Large Social and Information Networks

2011-6-3:

Math

Elementary Applied Topology draft textbook
Introduction to category theory
Mathematical model of walking

Statistics and machine learning

Machine learning demos
On the accuracy of statistical procedures in Excel 2007
R reference card for data mining
Wisdom of statistically manipulated crowds

2011-6-22:

Videos of talks by Friedman and Macintyre

2011-6-23:

Deviance, DIC, AIC, cross-validation, etc

The pervasive twoishness of statistics; in particular, the “sampling distribution” and the “likelihood” are two different models, and that’s a good thing

############################################################################

2012-2-19–2012-2-26:

Active Bayesian Optimization
There are several videos from the IMA meeting on Group Testing Designs, Algorithms, and Applications to Biology. Enjoy!
Emergence of MCMC Bayesian Computation
Two Interesting Short Volumes on the (Graph) Laplacian
So-called Bayesian hypothesis testing is just as bad as regular hypothesis testing
Prediction: the Lasso vs. just using the top 10 predictors
Getting Genetics Done: Golden Helix: A Hitchhiker’s Guide to Next Generation Sequencing

2012-2-27—2012-3-11:

2012-3-12–2012-3-25:

GraphLab v2 @ Big Learning Workshop
Basic Introduction to ggplot2
Bayesian statistics made simple
Courses in CS this spring
A Numerical Tour of Signal Processing
Reading List for Feb and March 2012: this is about the materials on concentration and geometric techniques used in compressed sensing.
simulated annealing for Sudokus
Djalil talks about A random walk on the unitary group, Brownian Motion and From seductive theory to concrete applications (which got Nuit Blanche thinking about writing this entry: Whose heart doesn’t sink at the thought of Dirac being inferior to Theora ?)
Lectures on Gaussian approximations with Malliavin calculus
Useful R snippets
Special Section: Minimax Shrinkage Estimation: A Tribute to Charles Stein
Excellent Papers for 2011
Creating a designer’s CV in LaTeX
Is NGS the Answer?
Sequence Analysis Methods Not Just for Sequence Data
DNA Variant Analysis of Complete Genomics’ Next-Generation Sequencing Data
Infinite Mixture Models with Nonparametric Bayes and the Dirichlet Process
Best Written Paper
Online SVD/PCA resources
Probabilistic Topic Models

2012-3-26–2012-4-15:

2012-4-16–2012-4-30:

LDA explained
Counting the total number of…
Significance Test for Kendall’s Tau-b
dimension reduction in ABC [a review’s review]
9 essential LaTeX packages everyone should use
Linguistic Notation Inside of R Plots! about knitr
knitr Elegant, flexible and fast dynamic report generation with R
knitr Performance Report-Attempt 1
knitr Performance Report-Attempt 2
Question: Why you need perl/python if you know R/Shell [NGS data analysis]
SPAMS (SPArse Modeling Software) now with Python and R
Large-scale Inference and empirical Bayes (both are related to multiple testing)
My setup about some softwares and editors
Fancy HTML5 Slides with knitr and pandoc
John talks about Random is as random does
MCMC at ICMS (1)
MCMC at ICMS (2)
MCMC at ICMS (3)
John Cook: Why and How People Use R
An Introduction to 6 Machine Learning Models
Machine Learning: Algorithms that Produce Clusters
Dirichlet Process for dummies

2012-5-1–5-20

2012-5-21–2012-6-1:

Note: the following 4-7 are from Simply Statistics.

A Personal Perspective on Machine Learning
The differing perspectives of statistics and machine learning
Kernel Methods and Support Vector Machines de-Mystified
I love this article in the WSJ about the crisis at JP Morgan. The key point it highlights is that looking only at the high-level analysis and summaries can be misleading; you have to look at the raw data to see the potential problems. As data become more complex, I think it’s critical we stay in touch with the raw data, regardless of discipline. At least if I miss something in the raw data I don’t lose a couple billion. Spotted by Leonid K.
On the other hand, this article in the Times drives me a little bonkers. It makes it sound like there is one mathematical model that will solve the obesity epidemic. Lines like this are ridiculous: “Because to do this experimentally would take years. You could find out much more quickly if you did the math.” The obesity epidemic is due to a complex interplay of cultural, sociological, economic, and policy factors. The idea you could “figure it out” with a set of simple equations is laughable. If you check out their model this is clearly not the answer to the obesity epidemic. Just another example of why statistics is not math. If you don’t want to hopelessly oversimplify the problem, you need careful data collection, analysis, and interpretation. For a broader look at this problem, check out this article on Science vs. PR. Via Andrew J.
Some cool applications of the raster package in R. This kind of thing is fun for student projects because analyzing images leads to results that are easy to interpret/visualize.
Check out John C.’s really fascinating post on determining when a white-collar worker is great. Inspired by Roger’s post on knowing when someone is good at data analysis.
knitR Performance Report 3 (really with knitr) and dprint
Unix doesn’t follow the Unix philosophy
Advice on writing research articles
knitr Performance Report–Attempt 3
Permutation tests in R
Understanding Bayesian Statistics – By Michael-Paul Agapow
knitr, Slideshows, and Dropbox
Generate LaTeX tables from CSV files (Excel)
The Tomato Genome
Optimization
Sichuan Agricultural University and LC Sciences Uncover the Epigenetics of Obesity
How to Stay Current in Bioinformatics/Genomics
Interactive HTML presentation with R, googleVis, knitr, pandoc and slidy
The R-Podcast Episode 7: Best Practices for Workflow Management
What is the point of statistics and operations research?
Question: C/C++ libraries for bioinformatics?
5 Hidden Skills for Big Data Scientists
Protocol – Computational Analysis of RNA-Seq

2012-6-3–

An easy way to think about priors on linear regression
Combining priors and downweighting in linear regression
Metropolis Hastings MCMC when the proposal and target have differing support
Slidify: Things are coming together fast
How to Convert Sweave LaTeX to knitr R Markdown: Winter Olympic Medals Example
Testing R Markdown with R Studio and posting it on RPubs.com
Announcing The R markdown Package
Announcing RPubs: A New Web Publishing Service for R
Approximate Bayesian computation
Load Packages Automatically in RStudio
Practical advice for machine learning: bias, variance and what to do next
The overview article on “Approximate Computation and Implicit Regularization for Very Large-scale Data Analysis” associated with the invited talk at the upcoming PODS 2012 meeting is on the arXiv here.
The monograph on “Randomized Algorithms for Matrices and Data” is available in NOW’s “Foundations and Trends in Machine Learning” series here, and it is also available on the arXiv here.
Click here for information (including the slides and video!) on the Tutorial on “Geometric Tools for Identifying Structure in Large Social and Information Networks,” given originally at ICML10 and KDD10 and subsequently at many other places. (The slides are also linked to below.)
The overview chapter on “Algorithmic and Statistical Perspectives on Large-Scale Data Analysis” is finally on the arXiv here; the book in which it will appear is in press; and a video of the associated talk is here.
Recent teaching: Fall 2009: CS369M: Algorithms for Massive Data Set Analysis
Confidence distributions
Making a singular matrix non-singular
Statistics Versus Machine Learning
How to post R code on WordPress blogs
Causation
Pro Tips for Grad Students in Statistics/Biostatistics (Part 1)
Pro Tips for Grad Students in Statistics/Biostatistics (Part 2)
Why You Shouldn’t Conclude “No Effect” from Statistically Insignificant Slopes
For those interested in knitr with Rmarkdown to beamer slides
Notes from A Recent Spatial R Class I Gave
Sparse Bayesian Methods for Low-Rank Matrix Estimation and Bayesian Group-Sparse Modeling and Variational Inference – implementation
The Battle of the Bayes
Ockham Workshop, Day 1
Ockham Workshop, Day 2
Ockham Workshop, Day 3
Ockham’s Razor
Occam

2012-6-27–2012/7/15:

2012/7/16—-2012/8/12:

2012/8/13—2012/9/23:

Towards Better PDF Management with the Filesystem
What is life like for PhDs in computer science who go into industry?
Online REPL for 17 programming languages
Logistic regression vs. multiple regression: Many statisticians seem to advise the use of logistic regression over multiple regression by invoking this logic: “A probability value can’t exceed 1 nor can it be less than 0. Since multiple regression often yields values less than 0 and greater than 1, use logistic regression.” While we can understand this argument, our feeling is that, in the applied fields we toil in, that argument is not a very practical one. In fact a seasoned statistics professor we know says (in effect): “What’s the big deal? If multiple regression yields any predicted values less than 0, consider them 0. If multiple regression yields any values greater than 1, consider them 1. End of story.” We agree.
Scientific Python
An everyday essential: the timer+My personal productivity rules
Bill Thurston, by Terence Tao; Bill Thurston, 1946-2012, by Peter Woit; Bill Thurston 1946-2012, by David Speyer.
Surviving a PhD: 10 top tips that show how to survive your PhD
How different PhD’s work: Differences and similarities between departments about PhD process
Countdown Begins: Countdown starts for submission of the thesis
PhD Life is Wonderful: Doing PhD at Warwick University is a wonderful experience
Too Many Emails In Your Inbox: Use Outlook folders to manage your emails
Introduction to REX Facility: Videos for introducing Wolfson Research Exchange and its facilities
Power of Supervisors: Control, inner happiness and optimism
Unorthodox Tools of a Researcher: Reflection and examples of unorthodox tools that help you during your PhD period
Homesickness and Culture Clashes: Homesickness of international students and cultural differences
Choosing Your PhD Examiners: Tips for choosing the relevant examiners for PhD Viva
Effective Research Tools: Examples of useful research tools
PhD, Risks and Murphy’s Law: “Anything that can go wrong will go wrong” according to Murphy’s Law
Will Data Scientists Be Replaced by Tools?
Update: TeX Writer for iPad (+ LaTeX + AMS)
Why physicists like models, and why biologists should
The ENCODE project: lessons for scientific publication
Perspectives From A Postdoc: What is a Postdoc?
Chris Blattman gives advice on PhD students’ NSF applications
ENCODE floods the news networks…
Maybe mostly useful for me, but for other people with Tumblr blogs, here is a way to insert Latex.—From Simply Statistics
Harvard Business school is getting in on the fun, calling the data scientist the sexy profession for the 21st century. Although I am a little worried that by the time it gets into a Harvard Business document, the hype may be outstripping the real promise of the discipline. Still, good news for statisticians! (via Rafa via Francesca D.’s Facebook feed).—From Simply Statistics
The counterpoint is this article which suggests that data scientists might be able to be replaced by tools/software. I think this is also a bit too much hype for my tastes. Certain things will definitely be automated and we may even end up with a deterministic statistical machine or two. But there will continually be new problems to solve which require the expertise of people with data analysis skills and good intuition (link via Samara K.)—From Simply Statistics
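The clamping rule quoted in the logistic regression vs. multiple regression item above ("consider them 0" / "consider them 1") can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `fit_linear_probability` and `predict_clamped` are hypothetical, the six data points are made up, and plain least squares stands in for whatever multiple regression routine the quoted authors had in mind.

```python
import numpy as np


def fit_linear_probability(X, y):
    """Fit ordinary least squares with an intercept (hypothetical helper)."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_clamped(coef, X):
    """Return raw linear predictions and the same values clamped to [0, 1]."""
    design = np.column_stack([np.ones(len(X)), X])
    raw = design @ coef
    # The rule from the quote: below 0 becomes 0, above 1 becomes 1.
    return raw, np.clip(raw, 0.0, 1.0)


# Made-up toy data: a single predictor and a 0/1 outcome.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

coef = fit_linear_probability(x, y)
raw, clamped = predict_clamped(coef, np.array([-3.0, 2.5, 9.0]))
print("raw:    ", raw)
print("clamped:", clamped)
```

The raw predictions at the two extreme inputs fall outside [0, 1], which is exactly the situation the quoted argument is about; the clamped values are what the "end of story" rule would report instead.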

2012/9/24—2012/11/28:

Grad Student’s Guide to Good Coffee+Grad Student’s Guide to Good Tea
Favorite Apps for Work and Life
estimating a constant (not really)
Reinforcement Learning in R: An Introduction to Dynamic Programming
The Future of Machine Learning (and the End of the World?)
10 Papers Every Programmer Should Read (At Least Twice)
R in the Press
On Chomsky and the Two Cultures of Statistical Learning
Speech Recognition Breakthrough for the Spoken, Translated Word
Frequentist vs Bayesian
w4s – the awesomeness we’re experiencing
Why is the Gaussian so pervasive in mathematics?
C++ Blogs that you Regularly Follow
An interview with Brad Efron about scientific writing. I haven’t watched the whole interview, but I do know that Efron is one of my favorite writers among statisticians.
Slidify, another approach for making HTML5 slides directly from R. (1) It is still just a little too hard to change the theme/feel of the slides. (2) The placement/insertion of images is still a little clunky. Google Docs has figured this out; if they integrated the best features of Slidify, LaTeX, etc. into that system, it would be great.
Statistics is still the new hotness. Here is a Business Insider list about 5 statistics problems that will “change the way you think about the world”.
New Yorker, especially the line, “statisticians are the new sexy vampires, only even more pasty” (via Brooke A.)
The closed graph theorem in various categories
Got spare time? Watch some videos about statistics
About the first Borel-Cantelli lemma
Yihui Xie—-The Setup
Best Practices for Scientific Computing

2012/12/5—-2013/1/20:

Machine Learning, Big Data, Deep Learning, Data Mining, Statistics, Decision & Risk Analysis, Probability, Fuzzy Logic FAQ
A Funny Thing Happened on the Way to Academia . . .
Advice for students on the academic job market (2013 edition)
Perspective: “Why C++ Is Not ‘Back’”
Is Fourier analysis a special case of representation theory or an analogue?
The Beauty of Bioconductor
The State of Statistics in Julia
Open Source Misfeasance
Book review: The Signal and The Noise
Should the Cox Proportional Hazards model get the Nobel Prize in Medicine?
The most influential data scientists on Twitter
Here is an interesting review of Nate Silver’s book. The interesting thing about the review is that it doesn’t criticize the statistical content, but criticizes the belief that people only use data analysis for good. This is an interesting theme we’ve seen before. Gelman also reviews the review.—–Simply Statistics
Video : “Matrices and their singular values” (1976)
Beyond Computation: The P vs NP Problem – Michael Sipser. This talk is arguably the very best introduction to computational complexity.
What are some of your personal guidelines for writing good, clear code?
How do you explain Machine learning and Data Mining to non CS people?
Suggested New Year’s resolution: start a blog: A blog forces you to articulate your thoughts rather than having vague feelings about issues; You also get much more comfortable with writing, because you’re doing it rather than thinking about doing it; If other people read your blog you get to hear what they think too. You learn a lot that way. || Set aside time for your blog every day. Keep notes for yourself on bloggy subjects (write a one-line gmail to yourself with the subject “blog ideas”).
Tips on job market interviews
The age of the essay

2013/2/16—-2014/2/25:

2014/2/26—2014/9/11

2014/9/22—2014/12/04:

################## From SimplyStats ##################

Editor’s Note: Last year I made a list off the top of my head of awesome things other people did. I loved doing it so much that I’m doing it again for 2014. Like last year, I have surely missed awesome things people have done. If you know of some, you should make your own list or add it to the comments! The rules remain the same. I have avoided talking about stuff I worked on or that people here at Hopkins are doing because this post is supposed to be about other people’s awesome stuff. I wrote this post because a blog often feels like a place to complain, but we started Simply Stats as a place to be pumped up about the stuff people were doing with data. Update: I missed pipes in R, now added!

I’m copying everything about Jenny Bryan’s amazing Stat 545 class in my data analysis classes. It is one of my absolute favorite open online set of notes on data analysis.
Ben Baumer, Mine Cetinkaya-Rundel, Andrew Bray, Linda Loi, Nicholas J. Horton wrote this awesome paper on integrating R markdown into the curriculum. I love the stuff that Mine and Nick are doing to push data analysis into undergrad stats curricula.
Speaking of those folks, the undergrad guidelines for stats programs put out by the ASA do an impressive job of balancing the advantages of statistics and the excitement of modern data analysis.
Somebody tell Hector Corrada Bravo to stop writing so many awesome papers. He is making us all look bad. His epiviz paper is great and you should go start using the Bioconductor package if you do genomics.
Hilary Mason founded fast forward labs. I love the business model of translating cutting edge academic (and otherwise) knowledge to practice. I am really pulling for this model to work.
As far as I can tell 2014 was the year that causal inference became the new hotness. One example of that is this awesome paper from the Google folks on trying to infer causality from related time series. The R package has some cool features too. I definitely am excited to see all the new innovation in this area.
Hadley was Hadley.
Rafa and Mike taught an awesome class on data analysis for genomics. They also created a book on Github that I think is one of the best introductions to the statistics of genomics that exists so far.
Hilary Parker wrote this amazing introduction to writing R packages that took the twitterverse by storm. It is perfectly written for people who are just at the point of being able to create their own R package. I think it probably generated 100+ R packages just by being so easy to follow.
Oh you’re not reading StatsChat yet? For real?
FiveThirtyEight launched. Despite some early bumps they have done some really cool stuff. Loved the recent piece on the beer mile and I read every piece that Emily Oster writes. She does an amazing job of explaining pretty complicated statistical topics to a really broad audience.
David Robinson’s broom package is one of my absolute favorite R packages that was built this year. One of the most annoying things about R is the variety of outputs different models give and this tidy version makes it really easy to do lots of neat stuff.
Chung and Storey introduced the jackstraw which is both a very clever idea and the perfect name for a method that can be used to identify variables associated with principal components in a statistically rigorous way.
I rarely dig excel-type replacements, but the simplicity of charted.co makes me love it. It does one thing and one thing really well.
The hipsteR package for teaching old R dogs new tricks is one of the many cool things Karl Broman did this year. I read all of his tutorials and never cease to learn stuff. In related news, if I were 1/10th as organized as that dude I’d actually, you know, get stuff done.
Whether or not I agree that tech companies should be allowed to do unregulated human subjects research, statistics at those companies, and randomized experiments in particular, have never been hotter. The boldest of the bunch is OKCupid, which writes blog posts with titles like “We experiment on human beings!”
In related news, I love the PlanOut project by the folks over at Facebook, so cool to see an open source approach to experimentation at web scale.
No wonder Mike Jordan (no, not that Mike Jordan) is such a superstar. His reddit AMA raised my respect for him from already super high levels. First, it’s awesome that he did it, and second, it is amazing how well he articulates the relationship between CS and Stats.
I’m trying to figure out a way to get Matthew Stephens to write more blog posts. He teased us with the Dynamic Statistical Comparisons post and then left us hanging. The people demand more Matthew.
Di Cook also started a new blog in 2014. She was also part of this cool exploratory data analysis event for the UN. They have a monster program going over there at Iowa State, producing some amazing research and a bunch of students that are recognizable by one name (Yihui, Hadley, etc.).
Love this paper on sure screening of graphical models out of Daniela Witten’s group at UW. It is so cool when a simple idea ends up being really well justified theoretically; it makes the world feel right.
I’m sure this actually happened before 2014, but the Bioconductor folks are still the best open source data science project that exists in my opinion. My favorite development I started using in 2014 is the git-subversion bridge that lets me update my Bioc packages with pull requests.
rOpenSci ran an awesome hackathon. The lineup of people they invited was great and I loved the commitment to a diverse group of junior R programmers. I really, really hope they run it again.
Dirk Eddelbuettel and Carl Boettiger continue to make bigtime contributions to R. This time it is Rocker, with Docker containers for R. I think this could be a reproducibility/teaching gamechanger.
Regina Nuzzo brought the p-value debate to the masses. She is also incredible at communicating pretty complicated statistical ideas to a broad audience and I’m looking forward to more stats pieces by her in the top journals.
Barbara Engelhardt keeps rocking out great papers. But she is also one of the best AEs I have ever had handle a paper for me at PeerJ. Super efficient, super fair, and super demanding. People don’t get enough credit for being amazing in the peer review process and she deserves it.
Ben Goldacre and Hans Rosling continue to be two of the best advocates for statistics and the statistical discipline – I’m not sure either claims the title of statistician but they do a great job anyway. This piece about Professor Rosling in Science gives some idea about the impact a statistician can have on the most current problems in public health. Meanwhile, I think Dr. Goldacre does a great job of explaining how personalized medicine is an information science in this piece on statins in the BMJ.
Michael Lopez’s series of posts on graduate school in statistics should be 100% required reading for anyone considering graduate school in statistics. He really nails it.
Trey Causey has an equally awesome Getting Started in Data Science post that I read about 10 times.
Drop everything and go read all of Philip Guo’s posts. Especially this one about industry versus academia or this one on the practical reason to do a PhD.
The top new Twitter feed of 2014 has to be @ResearchMark (incidentally I’m still mourning the disappearance of @STATSHULK).
Stephanie Hicks’ blog combines recipes for delicious treats and statistics. I also thought she had a great summary of the Women in Stats (#WiS2014) conference.
Emma Pierson is a Rhodes Scholar who wrote for 538, 23andMe, and a bunch of other major outlets as an undergrad. Her blog, obsessionwithregression.blogspot.com, is another must read. Here is an example of her awesome work on how different communities ignored each other on Twitter during the Ferguson protests.
The RStudio crowd continues to be on fire. I think they are a huge part of the reason that R is gaining momentum. It wouldn’t be possible to list all their contributions (or it would be an RStudio exclusive list) but I really like Packrat and R Markdown v2.
Another huge reason for the movement with R has been the outreach and development efforts of the Revolution Analytics folks. The Revolutions blog has been a must read this year.
Julian Wolfson and Joe Koopmeiners at University of Minnesota are straight up gamers. They live streamed their recruiting event this year. One way I judge good ideas is by how mad I am I didn’t think of it and this one had me seeing bright red.
This is just an awesome paper comparing lots of machine learning algorithms on lots of data sets. Random forests win, and this is a nice update of one of my favorite papers of all time: Classifier technology and the illusion of progress.
Pipes in R! This stuff is for real. The piping functionality created by Stefan Milton and Hadley is one of the few inventions over the last several years that immediately changed whole workflows for me.

##########################################################################

2014/12/05—2015/2/20:

2015/2/21—2015/7/31

hierarchical models are not Bayesian models
Hey, friend, have you grabbed a red envelope yet?
xgboost: a fast and effective boosting model
Machine Learning for Programming
Deep stuff about deep learning?
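“How to Get Started with Kaggle Competitions, Quick and Dirty,” aka a shortcut to quickly becoming a data scientist who earns a lot, works little, gets chased by headhunters all day, and doubles their salary with every job hop. This post is written for those who want to start doing Kaggle competitions but are afraid they don’t know where to begin. Originally published at http://t.cn/RAqksWV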
Randomized experimentation

2015/8/1—

“Navigating Big Data Careers with a Statistics PhD.”
Great article from Professor Radhika Nagpal (Harvard) on tenure-track life.
Career advice for academics from Robert Sternberg (Cornell).
Installing R on OS X + Installing R on OS X – “100% Homebrew Edition”


From U

2011-02-22

2011-3-30:

Physics

Mathematics

Computer Science & Engineering

Machine Learning

Neuroscience & Biology

Finance and Econometrics

Seminars, Talks, and Conference Videos:

Physics

Mathematics

Computer Science & Engineering

Machine Learning

Neuroscience & Biology

Finance and Economics

Open Courseware Directories and Other Video Lecture Roundup Posts

Distinguished and Plenary Talks
