Note: items 4-7 below are from Simply Statistics.
  1. A Personal Perspective on Machine Learning
  2. The differing perspectives of statistics and machine learning
  3. Kernel Methods and Support Vector Machines de-Mystified
  4. I love this article in the WSJ about the crisis at JP Morgan. The key point it highlights is that looking only at the high-level analysis and summaries can be misleading; you have to look at the raw data to see the potential problems. As data become more complex, I think it’s critical that we stay in touch with the raw data, regardless of discipline. At least if I miss something in the raw data I don’t lose a couple billion. Spotted by Leonid K.
  5. On the other hand, this article in the Times drives me a little bonkers. It makes it sound like there is one mathematical model that will solve the obesity epidemic. Lines like this are ridiculous: “Because to do this experimentally would take years. You could find out much more quickly if you did the math.” The obesity epidemic is due to a complex interplay of cultural, sociological, economic, and policy factors. The idea that you could “figure it out” with a set of simple equations is laughable. If you check out their model, it is clearly not the answer to the obesity epidemic. Just another example of why statistics is not math. If you don’t want to hopelessly oversimplify the problem, you need careful data collection, analysis, and interpretation. For a broader look at this problem, check out this article on Science vs. PR. Via Andrew J.
  6. Some cool applications of the raster package in R. This kind of thing is fun for student projects because analyzing images leads to results that are easy to interpret/visualize.
  7. Check out John C.’s really fascinating post on determining when a white-collar worker is great. Inspired by Roger’s post on knowing when someone is good at data analysis.
  8. knitR Performance Report 3 (really with knitr) and dprint
  9. Unix doesn’t follow the Unix philosophy
  10. Advice on writing research articles
  11. knitr Performance Report–Attempt 3
  12. Permutation tests in R
  13. Understanding Bayesian Statistics – By Michael-Paul Agapow
  14. knitr, Slideshows, and Dropbox
  15. Generate LaTeX tables from CSV files (Excel)
  16. The Tomato Genome
  17. Optimization
  18. Sichuan Agricultural University and LC Sciences Uncover the Epigenetics of Obesity
  19. How to Stay Current in Bioinformatics/Genomics
  20. Interactive HTML presentation with R, googleVis, knitr, pandoc and slidy
  21. The R-Podcast Episode 7: Best Practices for Workflow Management
  22. What is the point of statistics and operations research?
  23. Question: C/C++ libraries for bioinformatics?
  24. 5 Hidden Skills for Big Data Scientists
  25. Protocol – Computational Analysis of RNA-Seq