This page is for the collection of useful posts from others
2011-02-05:
Aaron:
Suresh
- POTD: Reproducing Kernel Banach Spaces with the ℓ1 Norm
- Sample Complexity for eps-approximations of Range Spaces
- IMA Special Year on the mathematics of information
Terry
Djalil
- Concentration for empirical spectral distributions
- Computational Rough Paths [CoRoPa]
- Singular values and rows distances
- The Marchenko-Pastur law
Alex
Frank
Andrew
Meena
Dick
John
Vladimir
Machine Vision 4 Users
From David’s Twitter stream:
- Want a deeper analysis of the Intel Sandy Bridge chipset bug?
- Multispectral Imaging At Up To 1 Billion Frames Per Second
Gonzalo Vazquez-Vilar
Hal Daume III
Alex
Andrew Gelman
David Brady
Greg: MIT IAP ’11 radar course SAR example, imaging with coffee cans, wood, and the audio input from your laptop
Gustavo (Greg’s student): Cleaner Ranging and Multiple Targets
Anand: Privacy and entropy (needs improvement)
Vladimir (ISW): Albert Theuwissen Reports from EI 2011 – Part 1 and 3D Sensing Forum at ISSCC 2011
Suresh
Jordan: Compressed sensing, compressed MDS, compressed clustering, and my talk tomorrow
Bob: Paper of the Day (Po’D): Performance Limits of Matching Pursuit Algorithms Edition
Terry: An introduction to measure theory
Arxiv blog: The Nuclear Camera Designed to Spot Hidden Radiation Sources
Meena:
- White matter fiber analysis: why we need statistical summaries (means)
- Tract-based quantitative white matter fiber analysis: Mathematical frameworks
- Quantitative white matter fiber analysis: a short history (Part III)
- Quantitative white matter fiber analysis: a short history (Part II)
- Quantitative white matter fiber analysis: a short history (Part I)
Bob:
- Tuning frequency determination
- Experiments in tuning frequency determination
- “I’m a math person.” echoes my “I am a safety stewart” 🙂 Safety is our job Number…errr… make that Number 6
- Conference craze 2011
- Some distributions of distances in high-dimensional musical spaces
- Digging deeper into minimum distances in high-dimensional musical spaces, pt. 2
- Digging deeper into minimum distances in high-dimensional musical spaces, pt. 1
- Paper of the Day (Po’D): Music Cover Song Identification Edition, pt. 5
- Paper of the Day (Po’D): Clustering Beat-chroma Patterns in Music Databases Edition
Sarah:
John: Accelerated learning
ISW: Aptina Demos Wafer Level Camera Technology
Alex:
- The operator norm and the decomposition of matrices in positive and negative parts,
- Moments of a product of random variables,
- Integrality of a sum, Stair partition problem
Greg: Paper posted to IEEE explorer: An Ultrawideband (UWB) Switched-Antenna-Array Radar Imaging System
Suresh: All FOCS talks are online
Arthur: Generating a quasi Poisson distribution, version 2
Franck: Polynomial Learning of Distribution Families
Tara N. Sainath wrote in the SLTC Newsletter, November 2010 on Sparse
- Piotr: Geometry @ Barriers
- Image Sensors World: Caeleste on X-Ray Photon Counting Sensors
- Andrew: Data Stream Algorithms slides
- Bob: Post-doc opportunity at INRIA
- Frank: Entropy of exponential families
- Terry: A first draft of a non-technical article on universality
- Hal: Manifold Assumption versus Margin Assumption
Bob Sturm wrote about the Probabilistic Matching Pursuit algorithm featured here recently in:
- Some Experiments with Probabilistic Orthogonal Matching Pursuit
- Paper of the Day (Po’D): The Other Probabilistic Matching Pursuits Edition
I mentioned Random Matrix Theory a while back. Terry Tao has some new results that he explains in Random matrices: Localization of the eigenvalues and the necessity of four moments. He refers to the book An Introduction to Random Matrices by Greg Anderson, Alice Guionnet and Ofer Zeitouni. Of related interest:
- Statistical Mechanics and Random Matrices by Alice Guionnet and
- Mean Field Models for Spin Glasses by Michel Talagrand.
Djalil Chafai:
Terry Tao
Gonzalo Vazquez Vilar
John Langford
Djalil Chafai
Sofia Dahl and Bob Sturm in
- Séminaire à Paris
- Paper of the Day (Po’D): Transients Detection Edition
- A Funny Thing Happened on the Way to the Computer
- Meeting reports: Sonification and urban soundscapes in Stockholm
Alex Gittens in
I also found the following noteworthy papers, enjoy!
- First, I agree with Andrew: this essay by Mandelbrot is fascinating.
A maverick’s apprenticeship. The Wolf Prize for Physics. Edited by David Thouless. Singapore: World Scientific, 2004. [ PDF (154.4 KB) ]
- Alex: Find a generating function for the Stirling partition numbers and Random matrix sparsification, comparison of current results
- Djalil: Back to basics: total variation distance
- Bob: CFP: 8th Sound and Music Computing (SMC) Conference 2011, and Sound Quality Seminar, Papers of the Day (Po’D): Finding or Not Finding Rules in Time Series Edition
- Muthu: Romance Leads to Insights
- Dick: Strong Codes For Weak Channels
- Gregory: 2011 MIT IAP course, build a synthetic aperture radar in 4 weeks
- Brian: Solving Resistor Networks Using Gaussian Elimination — An Illustration and Inverting A’CA
- ISW: IsInvariant Proposes New Sensor Technology (increasing dynamic range, I can see how CS would benefit from that)
- Meena: Quantitative white matter fiber analysis: a short history (Part III)
- Jason: A Simple and Computationally Efficient Sampling Approach to Covariate Adjustment for Multifactor Dimensionality Reduction Analysis of Epistasis
- Frank: Statistical manifold: Dual conjugate connections
- Arthur: Mandelbrot, fractals and counterexamples in applied probability, Margin of error, and comparing proportions in the same sample
- Gonzalo: Comments on information theory
- Terahertz Technology: ConverTec Corp. releases TeraLaz CO2 terahertz laser system
Now on to the sites to check for any news on compressive sensing. Here is the (incomplete) list:
- Arxiv
- Google (Compressive Sensing / Compressed Sensing) 24 hours, week, month.
- Rice University Compressive Sensing repository
Q&As
MathOverflow:
MetaOptimize:
LinkedIn:
TheoreticalCS: (not yet working)
BioStar
Friendfeed/Twitter
- Compressed Sensing and Compressive Sensing in FriendFeed
- Compressed Sensing, Compressive Sensing on Twitter.
2011-02-06:
Fast algorithms for nonconvex compressive sensing by Rick Chartrand, LANL
2011-02-12:
Here are blogs/papers to reflect on. Enjoy!
- Dick Gordon’s blog
- Gigapixel News Journal
- Machine Vision 4 Users
- Quomodocumque
- What’s new
- Image Sensors World
- natural language processing blog
- Xi’an’s Og
- MAKE Magazine
- Freakonometrics
- The Secrets of Consulting
- Hack a Day
- Statistical Modeling, Causal Inference, and Social Science
- KinectHacks.net
- Decision Science News
- Machine Learning, etc
- Gödel’s Lost Letter and P=NP
- The Endeavour
- The Geomblog
- ChapterZero
- Mr. Vacuum Tube
- the polylogblog
- Terahertz Technology
- Epistasis Blog
- Brain Windows
- Harvest Imaging Blog
- An Ergodic Walk
- Collective for Research in Interaction, Sound, and Signal Processing
- CyberGi
- my slice of pizza
- Blog: La vertu d’un la – the virtue of an A, a fortunate hive
- Libres pensées d’un mathématicien ordinaire
- Electron&Holes twitter stream
- Olivier Grisel Twitter stream
- Twitter list of people interested in compressive sensing.
Back in December, I asked: What was the most interesting paper on Compressive Sensing you read in 2010? Here is a compilation of y’all’s answers:
- T. T. Cai, L. Wang and G. Xu, “New bounds for restricted isometry constants,” IEEE Trans. Inf. Theory, vol. 56, no. 9, pp. 4388–4394, Sept. 2010.
- E.J. Candes and M.B. Wakin, “An Introduction To Compressive Sampling,” IEEE Signal Processing Magazine, vol. 25, Mar. 2008, pp. 21-30.
- M. Mishali, Y.C. Eldar, O. Dounaevsky, E. Shoshan, “Xampling: Analog to Digital at Sub-Nyquist Rates”, CCIT Report #751 Oct-09, EE Pub No. 1708, EE Dept., Technion – Israel Institute of Technology,
- “A probabilistic and RIPless theory of compressed sensing” by Emmanuel Candes and Yaniv Plan
- J. T. O’Brien and W. P. Kamp and G. M. Hoover, Sign-bit amplitude recovery with applications to seismic data, Geophysics, 1982
- Challenging Restricted Isometry Constants with Greedy Pursuit, with Peyre, G., and Fadili, J., Proc. of ITW’09, pp.475-479, 2009. ISBN: 978-1-4244-4982-8.
- Mark Davenport, Jason Laska, Petros Boufounos, and Richard Baraniuk, A simple proof that random matrices are democratic. (Rice University ECE Department Technical Report TREE-0906, November 2009)
- T. Blumensath, M. E. Davies, Iterative hard thresholding for compressed sensing. (Preprint, 2008)
- Boufounos P. T., “Universal Rate-Efficient Scalar Quantization”
- “Dequantizing Compressed Sensing: When Oversampling and Non-Gaussian Constraints Combine.”
- Real versus complex null space properties for sparse vector recovery
- Davenport, M.A.; Boufounos, P.T.; Wakin, M.B.; Baraniuk, R.G.; , “Signal Processing With Compressive Measurements,” Selected Topics in Signal Processing, IEEE Journal of , vol.4, no.2, pp.445-460, April 2010.
Recent entries I’ll probably be re-reading include:
- The Dip
- Reading the Donoho-Tanner Diagram
- Compressive Sensing Landscape version 0.2
- “…I found this idea of CS sketchy,…”
- Islands of Knowledge
- CS: Just throw away your lenses .. but not before you perform some calibration.
- CS: “..how come your browser can’t read JPEG-2000 ?..” , Q&As and some papers
- CS: Teaching Compressed Sensing (Part 1)
- CS: SMALL Workshop posters and Videos of the Talks
- CS: SMALL Workshop slides
- CS: Low Rank Compressive Spectral Imaging and a multishot CASSI
- NIPS videos
- Compressive auto-indexing in femtosecond nanocrystallography
- CS: The Long Post of the Week
- Infinity Matters: Generalized Sampling and Infinite Dimensional Compressed Sensing
- CS: Calibration for Ultrasound Breast Tomography Using Matrix Completion
2011-02-22
“…I found this idea of CS sketchy,…”
2011-03-01
CS: Would you like that entry Supersized? There are dozens of articles on compressive sensing.
2011-03-02:
Open Source Software for iPad and iPhone
2011-03-07:
- There’s a wonderful interview at the Notices with last year’s Abel Prize winner John Tate (video here). He blames the fact that his name is on so many mathematical results and concepts on Serge Lang. The 2011 Abel Prize winner will be announced on March 23rd.
- Sir Michael Atiyah’s February 1 talk at the College de France titled A Geometer Explores the Universe is now on-line.
- Matthew Emerton posts really good answers
2011-3-30:
Compressed Sensing: the L1 norm finds sparse solutions
2011-4-3:
So-called Bayesian hypothesis testing is just as bad as regular hypothesis testing
2011-4-17:
videolectures
Physics
- Harvard Physics: Quantum Field Theory by Sidney Coleman – 50 videos
- University of New Mexico: Physics 524 Quantum Field Theory II -27 videos
- University of New Mexico: Physics 521 Quantum Mechanics – 32 videos
- UCSD Quantum Physics 130A, 130B, 130C ~ 25 videos each
- University of South Carolina PHYS 729 – Applied Group Theory – 22 Videos, The Foundations of Theoretical Physics Using Lie Groups & Algebras
- Florida Atlantic University: PHY 6938 General Relativity — Fall 2007 – 28 videos
- Brookhaven National Laboratory Streaming Video: Cosmology for Beginners -5 videos
- MIT OpenCourseWare | Physics | Video Lectures – Physics I: Classical Mechanics, 8.02 E & M, 8.03 Vibrations and Waves, 8.224 GR & Astrophysics
- Oregon State University – Physics 464/564, Computational Physics – 23 videos, based on “A Survey of Computational Physics”, Landau, Paez, Bordeianu
- Cambridge University Video – Thermodynamics and Phase Diagrams with Harry Bhadeshia – 7 videos
- University of New Mexico: Prof. Ivan H. Deutsch, Short Course in Quantum Information 8 videos
- The Vega Science Trust – Astrophysical Chemistry by Harry Kroto – 8 videos
- CERN: Introduction to String Theory – W. Lerche – 4 videos
- CERN: String Theory – Johnson, C. (University of Southern California) – 5 videos
- CERN: String Theory for Pedestrians – Zwiebach, B. (MIT) 3 videos, author of “A First Course in String Theory”
- CERN Short Courses in Particle Physics – Accelerators, Detectors, Bubble Chambers, Feynman Diagrams, etc.
Mathematics
- Stanford EE364a: Optimization Lecture Videos
- Stanford EE263: Linear Dynamical Systems Lecture Videos
- MIT Courseware: Godel, Escher, Bach: A Mental Space Odyssey
- Constraint Programming Summer School 2007
- University of Colorado at Colorado Springs UCCS – Mathematics Video Courses – Requires free registration.. lots of courses
- UCCS Math 432 Modern Analysis II | Spring 2008
- UCCS Math 311 Number Theory | Spring 2008
- UCCS Math 535 Applied Functional Analysis | Spring 2006
- Texas A&M University – Math 614 Dynamical Systems and Chaos
- MIT OpenCourseWare | Mathematics | Video Lectures– 18.03 Differential Equations, 18.06 Linear Algebra, 18.085 Computational Science and Engineering I, 18.086 Mathematical Methods for Engineers II
Computer Science & Engineering
- Information Retrieval / Web Crawling Course – University of Freiburg
- Advanced Topics in Algorithms and Datastructures 2006 – University of Freiburg
- University of Freiburg – Advanced Topics in Algorithms and Datastructures 2005: Parallel Algorithms
- MIT Structure and Interpretation of Computer Programs, Video Lectures
- CS 251: Intermediate Software Design with C++ – Vanderbilt University
- MIT OpenCourseWare | Electrical Engineering and Computer Science | 6.046J Introduction to Algorithms (SMA 5503), Fall 2005 | Lecture Notes
- Algorithms Video Lectures from ArsDigita University
- Theory of Computation Video Lectures from ArsDigita University
- University of Washington CSE 582: Compilers
- University of Washington CSE P505: Programming Languages
- nanoHUB – Scientific Computing with Python
- CSE567M: Computer Systems Analysis (2006) – Washington University in St Louis Comparing systems using measurement, simulation, and queueing models
- NJIT Distance Learning Class Videos for CS 631 Data Management System Design
- NJIT Distance Learning Class Videos for CIS 375_602 Applications Development and Java
- NJIT Distance Learning Class Videos for CS 630 Operating Systems
- Wireless Sensor Networks – University of Freiburg – 2006
- UC Santa Cruz CMPE 118 – Introduction to Mechatronics
- RPI – ECSE-6961: Fundamentals of Wireless Broadband Networks. Spring 2007.
Machine Learning
- UC Berkeley Machine Learning Workshop 11 lectures
- CS 281A / Stat 241A: Statistical Learning Theory
- U Washington Machine Learning Videos
- University of Freiburg – Advanced AI Techniques – Reinforcement Learning, NLP, Bayesian Networks
Neuroscience & Biology
- Graduate Summer School: Probabilistic Models of Cognition: The Mathematics of Mind
- UCSD: Quantitative Molecular Biology – Physics 172/272
- University of Illinois at Urbana-Champaign – NSF Biophysics Summer School Lectures
- nanoHUB – Resources > Courses
- ITP Program on Dynamics of Neural Networks– Dynamics of Neural Networks: From Biophysics to Behavior
- Harvard School of Public Health: Bioinformatics Core
- UC Berkeley Webcasts | Video and Podcasts: MCB 130 Cell Biology
- UC Berkeley Webcasts | Video and Podcasts: MCB 110: General Biochemistry and Molecular Biology
- University of South Carolina – Microbiology and Immunology – Streaming Video
- University of South Carolina – Microbiology Video Index
Finance and Econometrics
- University of Toronto ACT 460 / STA2502 – Stochastic Methods for Actuarial Science – S. Jaimungal, Department of Statistics and Mathematical Finance Program
- Economics 421 – Econometrics– Mark Thoma: Department of Economics, University of Oregon
- Course Video Lectures: Latent Variable Analysis Professor Bengt Muthén of the UCLA Graduate School of Education & Information Studies
- INFO 747 – Social and Economic Data – Cornell Record Linkage Course Lecture Videos Prof. John M. Abowd
- UC Berkeley Webcasts: Econometrics 244 – Discrete Choice Methods with Simulation
Seminars, Talks, and Conference Videos:
See http://del.icio.us/pskomoroch/talk+video for more links…
Physics
- View Past Public Lectures – Perimeter Institute for Theoretical Physics
- African Summer Theory Institute (ASTI): Online Lectures
- Rutgers Physics: NHETC video seminars
- UW Math: Milliman Lectures Archive
- The Vega Science Trust – Richard Feynman Videos
- Kavli Institute for Theoretical Physics (KITP) Online Conferences, Lectures and Seminars
Mathematics
- MSRI Video Archive
- Duke University Mathematics Department Video Archive
- Michigan State University Math Department – Video Lectures
Computer Science & Engineering
Machine Learning
- Deep Learning Workshop, NIPS 2007
- NIPS : Conferences : 2006 : Program : NIPS 2006 Schedule
- NIPS : Conferences : 2006 : Media : NIPS 2006 Media
- NIPS : Conferences : 2005 : Tutorial Videos
- NATO Advanced Study Institute on Mining Massive Data Sets for Security
Neuroscience & Biology
- UC Irvine International Imaging Genetics Conference
- Hebrew University of Jerusalem: Heller Lecture Series in Computational Neuroscience
- NIH VideoCasting: Past Events
- U Texas: Collection of Online Neuroscience Lectures
- Internet Archive Search: 2007+brain+network+dynamics
- Conference on Brain Network Dynamics 2007 – University of California Berkeley
- nanoHUB – Resources > Online Presentations
- Mathematical Biosciences Institute: Workshop on Biophysics and Mathematical Models of Calcium Channels
Finance and Economics
- International Tax Lecture Series – University of Connecticut School of Law
- Daniel Kahneman – Nobel Prize Lecture: Maps of Bounded Rationality
Open Courseware Directories and Other Video Lecture Roundup Posts
- Berkeley Course Webcasts
- MIT OpenCourseWare Videos
- Stanford University Lecture Videos
- Open Yale Courses
- VideoLectures – exchange ideas & share knowledge
- Free Science and Video Lectures Online!
- Lecturefox: free university lectures – computer science, mathematics, physics
- Business Intelligence, Data Mining & Machine Learning: Machine Learning OnLine Lectures
- Yet Another Machine Learning Blog » Machine learning videos [Pierre Dangauthier]
- obousquet – ML Videos – Online videos of talks or lectures about Machine Learning related topics
Ways to prove the fundamental theorem of algebra
2011-5-27:
Distinguished and Plenary Talks
- Rothschild Lecture, Isaac Newton Institute for Mathematical Sciences, Cambridge, March 28, 2011
- Albert Einstein Memorial Lecture, Israel Academy of Sciences and Humanities, Jerusalem, March 14, 2011
- Fields Institute Distinguished Lectures, Toronto, September 14-16, 2010
- Distinguished Lecture, University of Rochester, October 30, 2009
- Levi L. Conant Lecture, Worcester Polytechnic Institute, September 24, 2009
- Sackler Distinguished Lectures in Mathematics, Tel Aviv University, March 9-13, 2009
- Distinguished Lecture, Rutgers University, December 4, 2008
- Distinguished Lecture Colloquium, PennState, November 19, 2008
- Asprey Distinguished Lecture Series, Vassar College, March 23, 2008
- Toyota Technological Institute of Chicago Distinguished Lecture Series, March 6, 2008
- UCLA Mathematics Department Distinguished Lecture Series, January 9,10 and 11, 2008
- Gibbs Lecture, Joint AMS-MAA Meeting, San Diego, January 6, 2008
- CISE Distinguished Lecture at NSF, Washington, DC, September 27, 2007
- Keynote lecture at FCRC, San Diego, CA, June 13, 2007
- KAM Mathematical Colloquium, Prague, Czech Republic, April 27, 2007
- Distinguished Lecture Series, University of Haifa, February 27 – March 1, 2007
- Louis Clark Vanuxem Lectures, Princeton University, February 13, 14 and 15, 2007
- Distinguished Lecture Series, University of Wisconsin, Madison, October 18, 2006
- IEEE Conference on Computational Complexity, Prague, Czech Republic, July 16-20, 2006
- Horizons of Truth Goedel Centenary 2006, University of Vienna, April 29, 2006
- Radcliffe Institute for Advanced Study Science Lecture Series, October 9, 2003
The Institute for Advanced Study’s Women and Mathematics series of lectures and seminars featured the following interesting presentations this year:
- Rebecca Willett’s 5/17 lecture 1 Methods for sparse analysis of high-dimensional data, I
- Rebecca Willett’s 5/18 lecture 2 Sparsity: Correcting Error in Data
- Rebecca Willett’s 5/19 lecture 3 Sparsity: Compressed Sensing
- Rebecca Willett’s 5/20 lecture 4 Sparsity: Generalized Sparsity Measures and Applications
- Sofya Raskhodnikova’s 5/17 lecture 1 Sublinear-Time Algorithms
- Sofya Raskhodnikova’s 5/18 lecture 2 Sublinear-Time Algorithms
- Sofya Raskhodnikova’s 5/19 lecture 3 Sublinear-Time Algorithms
- Sofya Raskhodnikova’s 5/20 lecture 4 Sublinear-Time Algorithms
- Rachel Ward’s 5/24 lecture 1 Methods for sparse analysis of high-dimensional data, II
- Anna Gilbert’s 5/24 lecture 1 Background on sparse approximation
- Anna Gilbert’s 5/25 lecture 2 Hardness results for sparse approximation problems
- Anna Gilbert’s 5/26 lecture 3 Dictionary geometry, greedy algorithms, and convex relaxation
- Peter Sarnak, Mobius Function Lecture: Three Lectures on the Mobius Function Randomness and Dynamics
- Peter Sarnak Integral Apollonian Packings
2011-5-29:
Geometric Tools for Identifying Structure in Large Social and Information Networks
2011-6-3:
Math
Elementary Applied Topology draft textbook
Introduction to category theory
Mathematical model of walking
Statistics and machine learning
Machine learning demos
On the accuracy of statistical procedures in Excel 2007
R reference card for data mining
Wisdom of statistically manipulated crowds
2011-6-22:
Videos of talks by Friedman and Macintyre
2011-6-23:
Deviance, DIC, AIC, cross-validation, etc
The pervasive twoishness of statistics; in particular, the “sampling distribution” and the “likelihood” are two different models, and that’s a good thing
############################################################################
2012-2-19——2012-2-26:
- Active Bayesian Optimization
- There are several videos from the IMA meeting on Group Testing Designs, Algorithms, and Applications to Biology. Enjoy!
- Emergence of MCMC Bayesian Computation
- Two Interesting Short Volumes on the (Graph) Laplacian
- So-called Bayesian hypothesis testing is just as bad as regular hypothesis testing
- Prediction: the Lasso vs. just using the top 10 predictors
- Getting Genetics Done: Golden Helix: A Hitchhiker’s Guide to Next Generation Sequencing
- Stanford Unsupervised Feature Learning and Deep Learning Tutorial
- What does a compressive sensing approach bring to the table ?
- What is Mahalanobis distance?
- Large scale SVM (support vector machine)
- Abstractions
- Monkeying with Bayes’ theorem
- Coming to agreement on philosophy of statistics
- probit posterior mean
- GraphLab v2 @ Big Learning Workshop
- Basic Introduction to ggplot2
- Bayesian statistics made simple
- Courses in CS this spring
- A Numerical Tour of Signal Processing
- Reading List for Feb and March 2012: materials on concentration and geometric techniques used in compressed sensing.
- simulated annealing for Sudokus
- Djalil talks about A random walk on the unitary group, Brownian Motion and From seductive theory to concrete applications (which got Nuit Blanche thinking about writing this entry: Whose heart doesn’t sink at the thought of Dirac being inferior to Theora ?)
- Lectures on Gaussian approximations with Malliavin calculus
- Useful R snippets
- Special Section: Minimax Shrinkage Estimation: A Tribute to Charles Stein
- Excellent Papers for 2011
- Creating a designer’s CV in LaTeX
- Is NGS the Answer?
- Sequence Analysis Methods Not Just for Sequence Data
- DNA Variant Analysis of Complete Genomics’ Next-Generation Sequencing Data
- Infinite Mixture Models with Nonparametric Bayes and the Dirichlet Process
- Best Written Paper
- Online SVD/PCA resources
- Probabilistic Topic Models
- Social Network Analysis with R
- Publicly available large data sets for database research
- Around the blogs in 80 hours and Random Thoughts (some are about sequencing data)
- Change margins of a single page (latex)
- Bootstrap example
- Exciting News on Three Dimensional Manifolds
- Dr. Perou on Next Generation Sequencing Technology
- Analyzing complex plant genomes with the newest next-generation DNA sequencing techniques
- RNA-Seq Methods & March Twitter Roundup
- Introduction to Statistical Thought
- An R programmer looks at Julia
- The slides and video can help you get a flavor of the language Julia.
- Why and How People Use R
- Wang, Landau, Markov, and others…
- Linear mixed models in R
- Least Absolute Gradient Selector: Statistical Regression via Pseudo-Hard Thresholding
- Sparse and Unique Nonnegative Matrix Factorization Through Data Preprocessing
- C++ at Facebook
- Calling C++ from R
- C++ Renaissance
- Why haven’t we cured cancer yet? (Revisited): Personalized medicine versus evolution
- Getting ppt figures into LaTeX
- Latex Allergy Cured by knitr
- Melbourne R Users
- sixty two-minute r twotorials
- LDA explained
- Counting the total number of…
- Significance Test for Kendall’s Tau-b
- dimension reduction in ABC [a review’s review]
- 9 essential LaTeX packages everyone should use
- Linguistic Notation Inside of R Plots! about knitr
- knitr Elegant, flexible and fast dynamic report generation with R
- knitr Performance Report-Attempt 1
- knitr Performance Report-Attempt 2
- Question: Why you need perl/python if you know R/Shell [NGS data analysis]
- SPAMS (SPArse Modeling Software) now with Python and R
- Large-scale Inference and empirical Bayes (both are related to multiple testing)
- My setup of some software and editors
- Fancy HTML5 Slides with knitr and pandoc
- John talks about Random is as random does
- MCMC at ICMS (1)
- MCMC at ICMS (2)
- MCMC at ICMS (3)
- John Cook: Why and How People Use R
- An Introduction to 6 Machine Learning Models
- Machine Learning: Algorithms that Produce Clusters
- Dirichlet Process for dummies
- A Really Nice Talk About PDE, Numerics (and Pyramids)
- Analysis of Boolean Functions
- Next-generation genome sequencers compared
- why noninformative priors?
- Data Scientists Get Ranked
- 90+ Two-Minute Videos on R
- Turing Centennial Celebration – Day 1
- Turing Centennial Celebration – Day 2
- Turing Centennial Celebration – Day 3
- Online resources for handling big data and parallel computing in R
- Source R-Script from Dropbox
- Excel in Statistics and Operations Research
- Dynamic Content with RStudio, Markdown, and Marked.
- Five minute guide to LaTeX
- Interactive reports in R with knitr and RStudio
- What Programming language are they using ?
- Generating reports for different data sets using brew and knitr
- Reproducible research with markdown, knitr and pandoc
- Getting Started with R Markdown, knitr, and Rstudio 0.96
- My experiences with Rcpp
- A Personal Perspective on Machine Learning
- The differing perspectives of statistics and machine learning
- Kernel Methods and Support Vector Machines de-Mystified
- I love this article in the WSJ about the crisis at JP Morgan. The key point it highlights is that looking only at the high-level analysis and summaries can be misleading; you have to look at the raw data to see the potential problems. As data become more complex, I think it’s critical we stay in touch with the raw data, regardless of discipline. At least if I miss something in the raw data I don’t lose a couple billion. Spotted by Leonid K.
- On the other hand, this article in the Times drives me a little bonkers. It makes it sound like there is one mathematical model that will solve the obesity epidemic. Lines like this are ridiculous: “Because to do this experimentally would take years. You could find out much more quickly if you did the math.” The obesity epidemic is due to a complex interplay of cultural, sociological, economic, and policy factors. The idea you could “figure it out” with a set of simple equations is laughable. If you check out their model this is clearly not the answer to the obesity epidemic. Just another example of why statistics is not math. If you don’t want to hopelessly oversimplify the problem, you need careful data collection, analysis, and interpretation. For a broader look at this problem, check out this article on Science vs. PR. Via Andrew J.
- Some cool applications of the raster package in R. This kind of thing is fun for student projects because analyzing images leads to results that are easy to interpret/visualize.
- Check out John C.’s really fascinating post on determining when a white-collar worker is great. Inspired by Roger’s post on knowing when someone is good at data analysis.
- knitR Performance Report 3 (really with knitr) and dprint
- Unix doesn’t follow the Unix philosophy
- Advice on writing research articles
- knitr Performance Report–Attempt 3
- Permutation tests in R
- Understanding Bayesian Statistics – By Michael-Paul Agapow
- knitr, Slideshows, and Dropbox
- Generate LaTeX tables from CSV files (Excel)
- The Tomato Genome
- Optimization
- Sichuan Agricultural University and LC Sciences Uncover the Epigenetics of Obesity
- How to Stay Current in Bioinformatics/Genomics
- Interactive HTML presentation with R, googleVis, knitr, pandoc and slidy
- The R-Podcast Episode 7: Best Practices for Workflow Management
- What is the point of statistics and operations research?
- Question: C/C++ libraries for bioinformatics?
- 5 Hidden Skills for Big Data Scientists
- Protocol – Computational Analysis of RNA-Seq
2012-6-3–
- An easy way to think about priors on linear regression
- Combining priors and downweighting in linear regression
- Metropolis Hastings MCMC when the proposal and target have differing support
- Slidify: Things are coming together fast
- How to Convert Sweave LaTeX to knitr R Markdown: Winter Olympic Medals Example
- Testing R Markdown with R Studio and posting it on RPubs.com
- Announcing The R markdown Package
- Announcing RPubs: A New Web Publishing Service for R
- Approximate Bayesian computation
- Load Packages Automatically in RStudio
- Practical advice for machine learning: bias, variance and what to do next
- The overview article on “Approximate Computation and Implicit Regularization for Very Large-scale Data Analysis” associated with the invited talk at the upcoming PODS 2012 meeting is on the arXiv here.
- The monograph on “Randomized Algorithms for Matrices and Data” is available in NOW’s “Foundations and Trends in Machine Learning” series here, and it is also available on the arXiv here.
- Click here for information (including the slides and video!) on the Tutorial on “Geometric Tools for Identifying Structure in Large Social and Information Networks,” given originally at ICML10 and KDD10 and subsequently at many other places. (The slides are also linked to below.)
- The overview chapter on “Algorithmic and Statistical Perspectives on Large-Scale Data Analysis” is finally on the arXiv here; the book in which it will appear is in press; and a video of the associated talk is here.
- Recent teaching: Fall 2009: CS369M: Algorithms for Massive Data Set Analysis
- Confidence distributions
- Making a singular matrix non-singular
- Statistics Versus Machine Learning
- How to post R code on WordPress blogs
- Causation
- Pro Tips for Grad Students in Statistics/Biostatistics (Part 1)
- Pro Tips for Grad Students in Statistics/Biostatistics (Part 2)
- Why You Shouldn’t Conclude “No Effect” from Statistically Insignificant Slopes
- For those interested in knitr with Rmarkdown to beamer slides
- Notes from A Recent Spatial R Class I Gave
- Sparse Bayesian Methods for Low-Rank Matrix Estimation and Bayesian Group-Sparse Modeling and Variational Inference – implementation
- The Battle of the Bayes
- Ockham Workshop, Day 1
- Ockham Workshop, Day 2
- Ockham Workshop, Day 3
- Ockham’s Razor
- Occam
- Simplicity is hard to sell
- Self-Repairing Bayesian Inference
- Praxis and Ideology in Bayesian Data Analysis
- In-consistent Bayesian inference
- Big Data Generalized Linear Models with Revolution R Enterprise
- Quants, Models, and the Blame Game
- Fun with the googleVis Package for R
- Topological Data Analysis
- The Winners of the LaTeX and Graphics Contest
- Is Machine Learning Losing Impact?
- Machine Learning Doesn’t Matter?
- Components of Statistical Thinking and Implications for Instruction and Assessment
- Xiao-Li Meng and Xianchao Xie rethink asymptotics
- Higgs boson and five sigma
- What is the Statistics Department 25 Years From Now?
- Statistics: Your chance for happiness (or misery)
- Manifolds: motivation and definition
- Why Emacs is important to me? : ESS and org-mode
- Interesting Emacs linkfest
- Devs Love Bacon: Everything you need to know about Machine Learning in 30 minutes or less
- Visualizing Galois Fields
- Visualizing Galois Fields (Follow-up)
- Statistical Reasoning on iTunes U
- Computing log gamma differences
- Where to start if you’re going to revise statistics
- Power laws and the generalized CLT
- Open problems in next-gen sequence analysis
- More equations, less citations?
- Talk: Some Introductory Remarks on Bayesian Inference
2012/7/16—-2012/8/12:
- Getting Started with the WordPress Competition
- Simple Made Easy
- An Education Tsunami—Will on-line courses destroy universities?
- Universities Reshaping Education on the Web
- Explanation or Prediction? An Amazing Quote from Phil Schrodt
- Should you apply PCA to your data?
- Which classifiers are fast enough for exploring medium-sized data?
- Quick classifiers for exploring medium-sized data (redux)
- Is C++ worth it?
- Unbiased estimators can be terrible
- Things You Should Never Do, Part I
- The Joel Test: 12 Steps to Better Code
- Methodologists’ Audience
- Bayesian Methodology in the Genetic Age
- Interview with Michael Hammel, author of The Artist’s Guide to GIMP
- Being Happy in Grad School
- 10 Fresh Tips for Finding Time to Blog
- A Quick Guide to Using Tumblr for Business
- Statistics Done Wrong
- Top N Reasons To Do A Ph.D. or Post-Doc in Bioinformatics/Computational Biology
- Interview(s) with Vladimir Voevodsky with an introduction on motivic homotopy along with the video and transcript.
- Are there examples of non-orientable manifolds in nature?
- Kolmogorov Complexity – A Primer
- Adventures at My First JSM (Joint Statistical Meetings) #JSM2012
- Yes, I was hacked. Hard.
- Does Julia have any hope of sticking in the statistical community?
- How Genome Sequencing is Revolutionizing Clinical Diagnostics, from the ISMB Conference
- Advice for an Undergraduate
- 4 things you should know about choosing examiners for your thesis
- The long tail of free online education: The author also plans to teach a class on graph partitioning, expander graphs, and random walks online in Winter 2013.
- Teaching the World to Search
- Beyond Pinterest and Instagram – ten visual social networks that should be on your radar
- Making Ubuntu 12.04 useable
- Basic Understanding of Compressed Sensing
2012/8/13—2012/9/23:
- Towards Better PDF Management with the Filesystem
- What is life like for PhDs in computer science who go into industry?
- Online REPL for 17 programming languages
- Logistic regression vs. multiple regression: Many statisticians seem to advise the use of logistic regression over multiple regression by invoking this logic: “A probability value can’t exceed 1 nor can it be less than 0. Since multiple regression often yields values less than 0 and greater than 1, use logistic regression.” While we can understand this argument, our feeling is that, in the applied fields we toil in, that argument is not a very practical one. In fact a seasoned statistics professor we know says (in effect): “What’s the big deal? If multiple regression yields any predicted values less than 0, consider them 0. If multiple regression yields any values greater than 1, consider them 1. End of story.” We agree.
- Scientific Python
- An everyday essential: the timer + My personal productivity rules
- Bill Thurston, by Terence Tao; Bill Thurston, 1946-2012, by Peter Woit; Bill Thurston 1946-2012, by David Speyer.
- Surviving a PhD: 10 top tips that show how to survive your PhD
- How different PhD’s work: Differences and similarities between departments in the PhD process
- Countdown Begins: Countdown starts for submission of the thesis
- PhD Life is Wonderful: Doing a PhD at Warwick University is a wonderful experience
- Too Many Emails In Your Inbox: Use Outlook folders to manage your emails
- Introduction to REX Facility: Videos introducing the Wolfson Research Exchange and its facilities
- Power of Supervisors: Control, inner happiness and optimism
- Unorthodox Tools of a Researcher: Reflections on and examples of unorthodox tools that help you during your PhD
- Homesickness and Culture Clashes: Homesickness of international students and cultural differences
- Choosing Your PhD Examiners: Tips for choosing the relevant examiners for PhD Viva
- Effective Research Tools: Examples of useful research tools
- PhD, Risks and Murphy’s Law: “Anything that can go wrong will go wrong” according to Murphy’s Law
- Will Data Scientists Be Replaced by Tools?
- Update: TeX Writer for iPad (+ LaTeX + AMS)
- Why physicists like models, and why biologists should
- The ENCODE project: lessons for scientific publication
- Perspectives From A Postdoc: What is a Postdoc?
- Chris Blattman gives advice on PhD students’ NSF applications
- ENCODE floods the news networks…
- Maybe mostly useful for me, but for other people with Tumblr blogs, here is a way to insert Latex.—From Simply Statistics
- Harvard Business school is getting in on the fun, calling the data scientist the sexy profession for the 21st century. Although I am a little worried that by the time it gets into a Harvard Business document, the hype may be outstripping the real promise of the discipline. Still, good news for statisticians! (via Rafa via Francesca D.’s Facebook feed).—From Simply Statistics
- The counterpoint is this article which suggests that data scientists might be able to be replaced by tools/software. I think this is also a bit too much hype for my tastes. Certain things will definitely be automated and we may even end up with a deterministic statistical machine or two. But there will continually be new problems to solve which require the expertise of people with data analysis skills and good intuition (link via Samara K.)—From Simply Statistics
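The clipping rule quoted in the logistic regression vs. multiple regression item above can be sketched in a few lines. This is only an illustration under assumptions: the toy data, the function names `fit_linear_probability` and `predict_clipped`, and the use of NumPy are not from the linked post, which describes the rule in words only.

```python
import numpy as np


def fit_linear_probability(x, y):
    """Ordinary least squares fit of a 0/1 outcome on a single predictor."""
    # Design matrix with an intercept column and the predictor.
    design = np.column_stack([np.ones_like(x, dtype=float), x])
    coef, *_ = np.linalg.lstsq(design, y.astype(float), rcond=None)
    return coef


def predict_clipped(coef, x):
    """Return raw linear predictions and the same predictions forced into [0, 1]."""
    design = np.column_stack([np.ones_like(x, dtype=float), x])
    raw = design @ coef
    # The rule from the quote: anything below 0 becomes 0, anything above 1 becomes 1.
    return raw, np.clip(raw, 0.0, 1.0)


# Hypothetical toy data: the outcome switches from 0 to 1 as x grows.
x = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = np.array([0, 0, 0, 0, 1, 1, 1])

coef = fit_linear_probability(x, y)
raw, clipped = predict_clipped(coef, x)
print("raw:    ", np.round(raw, 3))
print("clipped:", np.round(clipped, 3))
```

On this toy data the raw fit strays below 0 and above 1 at the extremes of x, and the clipped values stay inside the unit interval. Clipping only changes the out-of-range predictions; it does not change the fitted coefficients.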
2012/9/24—2012/11/28:
- Grad Student’s Guide to Good Coffee + Grad Student’s Guide to Good Tea
- Favorite Apps for Work and Life
- estimating a constant (not really)
- Reinforcement Learning in R: An Introduction to Dynamic Programming
- The Future of Machine Learning (and the End of the World?)
- 10 Papers Every Programmer Should Read (At Least Twice)
- R in the Press
- On Chomsky and the Two Cultures of Statistical Learning
- Speech Recognition Breakthrough for the Spoken, Translated Word
- Frequentist vs Bayesian
- w4s – the awesomeness we’re experiencing
- Why is the Gaussian so pervasive in mathematics?
- C++ Blogs that you Regularly Follow
- An interview with Brad Efron about scientific writing. I haven’t watched the whole interview, but I do know that Efron is one of my favorite writers among statisticians.
- Slidify, another approach for making HTML5 slides directly from R. (1) It is still just a little too hard to change the theme/feel of the slides. (2) The placement/insertion of images is still a little clunky. Google Docs has figured this out; if they integrated the best features of Slidify, LaTeX, etc. into that system, it would be great.
- Statistics is still the new hotness. Here is a Business Insider list about 5 statistics problems that will “change the way you think about the world”.
- New Yorker, especially the line, “statisticians are the new sexy vampires, only even more pasty” (via Brooke A.)
- The closed graph theorem in various categories
- Got spare time? Watch some videos about statistics
- About the first Borel-Cantelli lemma
- Yihui Xie: The Setup
- Best Practices for Scientific Computing
2012/12/5—-2013/1/20:
- Machine Learning, Big Data, Deep Learning, Data Mining, Statistics, Decision & Risk Analysis, Probability, Fuzzy Logic FAQ
- A Funny Thing Happened on the Way to Academia . . .
- Advice for students on the academic job market (2013 edition)
- Perspective: “Why C++ Is Not ‘Back’”
- Is Fourier analysis a special case of representation theory or an analogue?
- The Beauty of Bioconductor
- The State of Statistics in Julia
- Open Source Misfeasance
- Book review: The Signal and The Noise
- Should the Cox Proportional Hazards model get the Nobel Prize in Medicine?
- The most influential data scientists on Twitter
- Here is an interesting review of Nate Silver’s book. The interesting thing about the review is that it doesn’t criticize the statistical content, but criticizes the belief that people only use data analysis for good. This is an interesting theme we’ve seen before. Gelman also reviews the review.—–Simply Statistics
- Video: “Matrices and their singular values” (1976)
- Beyond Computation: The P vs NP Problem – Michael Sipser: This talk is arguably the very best introduction to computational complexity.
- What are some of your personal guidelines for writing good, clear code?
- How do you explain Machine learning and Data Mining to non CS people?
- Suggested New Year’s resolution: start a blog: A blog forces you to articulate your thoughts rather than having vague feelings about issues; You also get much more comfortable with writing, because you’re doing it rather than thinking about doing it; If other people read your blog you get to hear what they think too. You learn a lot that way. || Set aside time for your blog every day. Keep notes for yourself on bloggy subjects (write a one-line gmail to yourself with the subject “blog ideas”).
- Tips on job market interviews
- The age of the essay
2013/2/16—-2014/2/25:
- Interview with Nick Chamandy, statistician at Google
- You and Your Research + video
- Trustworthy Online Controlled Experiments: Five Puzzling Outcomes Explained
- A Survival Guide to Starting and Finishing a PhD
- Six Rules For Wearing Suits For Beginners
- Why I Created C++
- More advice to scientists on blogging
- Software engineering practices for graduate students
- Statistics Matter
- What statistics should do about big data: problem forward not solution backward
- How signals, geometry, and topology are influencing data science
- The Bounded Gaps Between Primes Theorem has been proved
- A non-comprehensive list of awesome things other people did this year.
- Jake VanderPlas writes about the Big Data Brain Drain from academia.
- Tomorrow’s Professor Postings
- Best Practices for Scientific Computing
- Some tips for new research-oriented grad students
- 3 Reasons Every Grad Student Should Learn WordPress
- How to Lie With Statistics (in the Age of Big Data)
- The Geometric View on Sparse Recovery
- The Mathematical Shape of Things to Come
- A Guide to Python Frameworks for Hadoop
- Statistics, geometry and computer science.
- How to Collaborate On GitHub
- Step by step to build my first R Hadoop System
- Open Sourcing a Python Project the Right Way
- Data Science MD July Recap: Python and R Meetup
- Recent reflections on git
- 10 Reasons Python Rocks for Research (And a Few Reasons it Doesn’t)
- Effective Presentations – Part 2 – Preparing Conference Presentations
- Doing Statistical Research
- How to Do Statistical Research
- Learning new skills
- How to Stand Out When Applying for An Academic Job
- Maturing from student to researcher
- False discovery rate regression (cc NSA’s PRISM)
- Job Hunting Advice, Pt. 3: Networking
- Getting Started with Git
2014/2/26—2014/9/11
- Some R Resources for GLMs
- Statistical data analysis in search and rescue after loss of contact
- The gap between data mining and predictive models
- Data Mining, machine learning and statistics.
- useR! 2014 is underway with 16 tutorials
- What is Scalable Machine Learning?
- rlist: processing non-relational data in R based on lists
- The perfect candidate
- The Leek group guide to giving talks
- 38 Seminal Articles Every Data Scientist Should Read
- Deep Learning – important resources for learning and understanding
- Twenty rules for good graphics + Ten Simple Rules for Better Figures
- Git Cookbook
- Making Your Code Citable
- biblatex for statisticians
- Do your “data janitor work” like a boss with dplyr
2014/9/22—2014/12/04:
- Tutorial: How to detect spurious correlations, and how to find the …
- Practical illustration of Map-Reduce (Hadoop-style), on real data
- Jackknife logistic and linear regression for clustering and predict…
- From the trenches: 360-degrees data science
- A synthetic variance designed for Hadoop and big data
- Fast Combinatorial Feature Selection with New Definition of Predict…
- A little known component that should be part of most data science a…
- 11 Features any database, SQL or NoSQL, should have
- Clustering idea for very large datasets
- Hidden decision trees revisited
- Correlation and R-Squared for Big Data
- Marrying computer science, statistics and domain expertize
- New pattern to predict stock prices, multiplies return by factor 5
- What Map Reduce can’t do
- Excel for Big Data
- Fast clustering algorithms for massive datasets
- Source code for our Big Data keyword correlation API
- The curse of big data
- How to detect a pattern? Problem and solution
- Interesting Data Science Application: Steganography
- Easily create documents from R with Rmarkdown
- How to publish R and ggplot2 to the web
- magrittr: Simplifying R code with pipes
- Updated dplyr Examples
- Video introduction to data manipulation with dplyr
- R and Data Science
- jiebaR Chinese word segmentation: the flexibility of R, the efficiency of C
- Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?
- 41 hours of courses given in Iceland this Summer at the Machine Learning Summer School.
- summary of parallel machine learning approaches
- big data and data science talks
################## From SimplyStats ##################
Editor’s Note: Last year I made a list off the top of my head of awesome things other people did. I loved doing it so much that I’m doing it again for 2014. Like last year, I have surely missed awesome things people have done. If you know of some, you should make your own list or add it to the comments! The rules remain the same. I have avoided talking about stuff I worked on or that people here at Hopkins are doing because this post is supposed to be about other people’s awesome stuff. I wrote this post because a blog often feels like a place to complain, but we started Simply Stats as a place to be pumped up about the stuff people were doing with data. Update: I missed pipes in R, now added!
- I’m copying everything about Jenny Bryan’s amazing Stat 545 class in my data analysis classes. It is one of my absolute favorite open online set of notes on data analysis.
- Ben Baumer, Mine Cetinkaya-Rundel, Andrew Bray, Linda Loi, Nicholas J. Horton wrote this awesome paper on integrating R markdown into the curriculum. I love the stuff that Mine and Nick are doing to push data analysis into undergrad stats curricula.
- Speaking of those folks, the undergrad guidelines for stats programs put out by the ASA do an impressive job of balancing the advantages of statistics and the excitement of modern data analysis.
- Somebody tell Hector Corrada Bravo to stop writing so many awesome papers. He is making us all look bad. His epiviz paper is great and you should go start using the Bioconductor package if you do genomics.
- Hilary Mason founded fast forward labs. I love the business model of translating cutting edge academic (and otherwise) knowledge to practice. I am really pulling for this model to work.
- As far as I can tell 2014 was the year that causal inference became the new hotness. One example of that is this awesome paper from the Google folks on trying to infer causality from related time series. The R package has some cool features too. I definitely am excited to see all the new innovation in this area.
- Hadley was Hadley.
- Rafa and Mike taught an awesome class on data analysis for genomics. They also created a book on Github that I think is one of the best introductions to the statistics of genomics that exists so far.
- Hilary Parker wrote this amazing introduction to writing R packages that took the twitterverse by storm. It is perfectly written for people who are just at the point of being able to create their own R package. I think it probably generated 100+ R packages just by being so easy to follow.
- Oh you’re not reading StatsChat yet? For real?
- FiveThirtyEight launched. Despite some early bumps they have done some really cool stuff. Loved the recent piece on the beer mile and I read every piece that Emily Oster writes. She does an amazing job of explaining pretty complicated statistical topics to a really broad audience.
- David Robinson’s broom package is one of my absolute favorite R packages that was built this year. One of the most annoying things about R is the variety of outputs different models give and this tidy version makes it really easy to do lots of neat stuff.
- Chung and Storey introduced the jackstraw which is both a very clever idea and the perfect name for a method that can be used to identify variables associated with principal components in a statistically rigorous way.
- I rarely dig excel-type replacements, but the simplicity of charted.co makes me love it. It does one thing and one thing really well.
- The hipsteR package for teaching old R dogs new tricks is one of the many cool things Karl Broman did this year. I read all of his tutorials and never cease to learn stuff. In related news if I was 1/10th as organized as that dude I’d actually you know, get stuff done.
- Whether I agree with them or not that they should be allowed to do unregulated human subjects research, statistics at tech companies, and in particular randomized experiments have never been hotter. The boldest of the bunch is OKCupid who writes blog posts with titles like, “We experiment on human beings!”
- In related news, I love the PlanOut project by the folks over at Facebook, so cool to see an open source approach to experimentation at web scale.
- No wonder Mike Jordan (no not that Mike Jordan) is such a superstar. His reddit AMA raised my respect for him from already super high levels. First, it’s awesome that he did it, and second it is amazing how well he articulates the relationship between CS and Stats.
- I’m trying to figure out a way to get Matthew Stephens to write more blog posts. He teased us with the Dynamic Statistical Comparisons post and then left us hanging. The people demand more Matthew.
- Di Cook also started a new blog in 2014. She was also part of this cool exploratory data analysis event for the UN. They have a monster program going over there at Iowa State, producing some amazing research and a bunch of students that are recognizable by one name (Yihui, Hadley, etc.).
- Love this paper on sure screening of graphical models out of Daniela Witten’s group at UW. It is so cool when a simple idea ends up being really well justified theoretically, it makes the world feel right.
- I’m sure this actually happened before 2014, but the Bioconductor folks are still the best open source data science project that exists in my opinion. My favorite development I started using in 2014 is the git-subversion bridge that lets me update my Bioc packages with pull requests.
- rOpenSci ran an awesome hackathon. The lineup of people they invited was great and I loved the commitment to a diverse group of junior R programmers. I really, really hope they run it again.
- Dirk Eddelbuettel and Carl Boettiger continue to make bigtime contributions to R. This time it is Rocker, with Docker containers for R. I think this could be a reproducibility/teaching gamechanger.
- Regina Nuzzo brought the p-value debate to the masses. She is also incredible at communicating pretty complicated statistical ideas to a broad audience and I’m looking forward to more stats pieces by her in the top journals.
- Barbara Engelhardt keeps rocking out great papers. But she is also one of the best AE’s I have ever had handle a paper for me at PeerJ. Super efficient, super fair, and super demanding. People don’t get enough credit for being amazing in the peer review process and she deserves it.
- Ben Goldacre and Hans Rosling continue to be two of the best advocates for statistics and the statistical discipline – I’m not sure either claims the title of statistician but they do a great job anyway. This piece about Professor Rosling in Science gives some idea about the impact a statistician can have on the most current problems in public health. Meanwhile, I think Dr. Goldacre does a great job of explaining how personalized medicine is an information science in this piece on statins in the BMJ.
- Michael Lopez’s series of posts on graduate school in statistics should be 100% required reading for anyone considering graduate school in statistics. He really nails it.
- Trey Causey has an equally awesome Getting Started in Data Science post that I read about 10 times.
- Drop everything and go read all of Philip Guo’s posts. Especially this one about industry versus academia or this one on the practical reason to do a PhD.
- The top new Twitter feed of 2014 has to be @ResearchMark (incidentally I’m still mourning the disappearance of @STATSHULK).
- Stephanie Hicks’ blog combines recipes for delicious treats and statistics, also I thought she had a great summary of the Women in Stats (#WiS2014) conference.
- Emma Pierson is a Rhodes Scholar who wrote for 538, 23andMe, and a bunch of other major outlets as an undergrad. Her blog, obsessionwithregression.blogspot.com is another must read. Here is an example of her awesome work on how different communities ignored each other on Twitter during the Ferguson protests.
- The Rstudio crowd continues to be on fire. I think they are a huge part of the reason that R is gaining momentum. It wouldn’t be possible to list all their contributions (or it would be an Rstudio exclusive list) but I really like Packrat and R markdown v2.
- Another huge reason for the movement with R has been the outreach and development efforts of the Revolution Analytics folks. The Revolutions blog has been a must read this year.
- Julian Wolfson and Joe Koopmeiners at University of Minnesota are straight up gamers. They live streamed their recruiting event this year. One way I judge good ideas is by how mad I am I didn’t think of it and this one had me seeing bright red.
- This is just an awesome paper comparing lots of machine learning algorithms on lots of data sets. Random forests wins and this is a nice update of one of my favorite papers of all time: Classifier technology and the illusion of progress.
- Pipes in R! This stuff is for real. The piping functionality created by Stefan Milton and Hadley is one of the few inventions over the last several years that immediately changed whole workflows for me.
##########################################################################
2014/12/05—2015/2/20:
- Deep Learning Master Class
- Advances in Variational Inference
- Numerical Optimization: Understanding L-BFGS
- An exact mapping between the Variational Renormalization Group and Deep Learning
- New ASA Guidelines for Undergraduate Statistics Programs
- Singular value decomposition (We Recommend a Singular Value Decomposition)
- How do you explain what a neural network is in a simple, vivid, and fun way?
- Academic vs. Industry Careers
- Hadley Wickham: Impact the world by being useful
- Statisticians in World War II: They also served
- A Brief Overview of Deep Learning
- Advice for applying Machine Learning
- Deep Learning Tutorial
- Gibbs Sampling in Haskell
- How-to go parallel in R – basics + tips
2015/2/21—2015/7/31
- hierarchical models are not Bayesian models
- Hey, friend, have you grabbed a red envelope yet?
- xgboost: a fast and effective boosting model
- Machine Learning for Programming
- Deep stuff about deep learning?
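- “How to get started with Kaggle competitions, quick and dirty”, aka a shortcut to quickly becoming a data scientist who earns a lot, works little, is chased by headhunters all day, and doubles their salary by changing jobs. This post is written for those who want to start doing Kaggle competitions but are afraid they won’t know where to begin. Originally published at http://t.cn/RAqksWV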
- Randomized experimentation
2015/8/1—
- “Navigating Big Data Careers with a Statistics PhD.”
- Great article from Professor Radhika Nagpal (Harvard) on tenure-track life.
- Career advice for academics from Robert Sternberg (Cornell).
- Installing R on OS X + Installing R on OS X – “100% Homebrew Edition”