You are currently browsing the category archive for the ‘Useful for referring’ category.

- Deep Learning Master Class
- Advances in Variational Inference
- Numerical Optimization: Understanding L-BFGS
- An exact mapping between the Variational Renormalization Group and Deep Learning
- New ASA Guidelines for Undergraduate Statistics Programs
- Singular Value Decomposition (We Recommend a Singular Value Decomposition)
- How can one explain what a neural network is in a simple, vivid, and fun way?
- Academic vs. Industry Careers
- Hadley Wickham: Impact the world by being useful
- Statisticians in World War II: They also served
- A Brief Overview of Deep Learning
- Advice for applying Machine Learning
- Deep Learning Tutorial
- Gibbs Sampling in Haskell
- How-to go parallel in R – basics + tips

- Tutorial: How to detect spurious correlations, and how to find the …
- Practical illustration of Map-Reduce (Hadoop-style), on real data
- Jackknife logistic and linear regression for clustering and predict…
- From the trenches: 360-degrees data science
- A synthetic variance designed for Hadoop and big data
- Fast Combinatorial Feature Selection with New Definition of Predict…
- A little known component that should be part of most data science a…
- 11 Features any database, SQL or NoSQL, should have
- Clustering idea for very large datasets
- Hidden decision trees revisited
- Correlation and R-Squared for Big Data
- Marrying computer science, statistics and domain expertize
- New pattern to predict stock prices, multiplies return by factor 5
- What Map Reduce can’t do
- Excel for Big Data
- Fast clustering algorithms for massive datasets
- Source code for our Big Data keyword correlation API
- The curse of big data
- How to detect a pattern? Problem and solution
- Interesting Data Science Application: Steganography
- Easily create documents from R with Rmarkdown
- How to publish R and ggplot2 to the web
- magrittr: Simplifying R code with pipes
- Updated dplyr Examples
- Video introduction to data manipulation with dplyr
- R and Data Science
- jiebaR Chinese word segmentation: the flexibility of R, the efficiency of C
- Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?
- 41 hours of courses given in Iceland this Summer at the Machine Learning Summer School.
- summary of parallel machine learning approaches
- big data and data science talks

- Some R Resources for GLMs
- Statistical data analysis in search and rescue after loss of contact
- The gap between data mining and predictive models
- Data Mining, machine learning and statistics.
- useR! 2014 is underway with 16 tutorials
- What is Scalable Machine Learning?
- rlist: working with non-relational data in R using lists
- The perfect candidate
- The Leek group guide to giving talks
- 38 Seminal Articles Every Data Scientist Should Read
- Deep Learning – important resources for learning and understanding
- Twenty rules for good graphics + Ten Simple Rules for Better Figures
- Git Cookbook
- Making Your Code Citable
- biblatex for statisticians
- Do your “data janitor work” like a boss with dplyr

- Interview with Nick Chamandy, statistician at Google
- You and Your Research + video
- Trustworthy Online Controlled Experiments: Five Puzzling Outcomes Explained
- A Survival Guide to Starting and Finishing a PhD
- Six Rules For Wearing Suits For Beginners
- Why I Created C++
- More advice to scientists on blogging
- Software engineering practices for graduate students
- Statistics Matter
- What statistics should do about big data: problem forward not solution backward
- How signals, geometry, and topology are influencing data science
- The Bounded Gaps Between Primes Theorem has been proved
- A non-comprehensive list of awesome things other people did this year.
- Jake VanderPlas writes about the Big Data Brain Drain from academia.
- Tomorrow’s Professor Postings
- Best Practices for Scientific Computing
- Some tips for new research-oriented grad students
- 3 Reasons Every Grad Student Should Learn WordPress
- How to Lie With Statistics (in the Age of Big Data)
- The Geometric View on Sparse Recovery
- The Mathematical Shape of Things to Come
- A Guide to Python Frameworks for Hadoop
- Statistics, geometry and computer science.
- How to Collaborate On GitHub
- Step by step to build my first R Hadoop System
- Open Sourcing a Python Project the Right Way
- Data Science MD July Recap: Python and R Meetup
- Recent reflections on git
- 10 Reasons Python Rocks for Research (And a Few Reasons it Doesn’t)
- Effective Presentations – Part 2 – Preparing Conference Presentations
- Doing Statistical Research
- How to Do Statistical Research
- Learning new skills
- How to Stand Out When Applying for An Academic Job
- Maturing from student to researcher
- False discovery rate regression (cc NSA’s PRISM)
- Job Hunting Advice, Pt. 3: Networking
- Getting Started with Git

- Machine Learning, Big Data, Deep Learning, Data Mining, Statistics, Decision & Risk Analysis, Probability, Fuzzy Logic FAQ
- A Funny Thing Happened on the Way to Academia . . .
- Advice for students on the academic job market (2013 edition)
- Perspective: “Why C++ Is Not ‘Back’”
- Is Fourier analysis a special case of representation theory or an analogue?
- The Beauty of Bioconductor
- The State of Statistics in Julia
- Open Source Misfeasance
- Book review: The Signal and The Noise
- Should the Cox Proportional Hazards model get the Nobel Prize in Medicine?
- The most influential data scientists on Twitter
- Here is an interesting review of Nate Silver’s book. The interesting thing about the review is that it doesn’t criticize the statistical content, but criticizes the belief that people only use data analysis for good. This is an interesting theme we’ve seen before. Gelman also reviews the review. (From Simply Statistics)
- Video : “Matrices and their singular values” (1976)
- Beyond Computation: The P vs NP Problem – Michael Sipser: this talk is arguably the very best introduction to computational complexity.
- What are some of your personal guidelines for writing good, clear code?
- How do you explain Machine learning and Data Mining to non CS people?
- Suggested New Year’s resolution: start a blog: A blog forces you to articulate your thoughts rather than having vague feelings about issues; You also get much more comfortable with writing, because you’re doing it rather than thinking about doing it; If other people read your blog you get to hear what they think too. You learn a lot that way. || Set aside time for your blog every day. Keep notes for yourself on bloggy subjects (write a one-line gmail to yourself with the subject “blog ideas”).
- Tips on job market interviews
- The age of the essay

- Grad Student’s Guide to Good Coffee + Grad Student’s Guide to Good Tea
- Favorite Apps for Work and Life
- estimating a constant (not really)
- Reinforcement Learning in R: An Introduction to Dynamic Programming
- The Future of Machine Learning (and the End of the World?)
- 10 Papers Every Programmer Should Read (At Least Twice)
- R in the Press
- On Chomsky and the Two Cultures of Statistical Learning
- Speech Recognition Breakthrough for the Spoken, Translated Word
- Frequentist vs Bayesian
- w4s – the awesomeness we’re experiencing
- Why is the Gaussian so pervasive in mathematics?
- C++ Blogs that you Regularly Follow
- An interview with Brad Efron about scientific writing. I haven’t watched the whole interview, but I do know that Efron is one of my favorite writers among statisticians.
- Slidify, another approach for making HTML5 slides directly from R. (1) It is still just a little too hard to change the theme/feel of the slides. (2) The placement/insertion of images is still a little clunky. Google Docs has figured this out; if they integrated the best features of Slidify, LaTeX, etc. into that system, it would be great.
- Statistics is still the new hotness. Here is a Business Insider list about 5 statistics problems that will “change the way you think about the world”.
- New Yorker, especially the line, “statisticians are the new sexy vampires, only even more pasty” (via Brooke A.)
- The closed graph theorem in various categories
- Got spare time? Watch some videos about statistics
- About the first Borel-Cantelli lemma
- Yihui Xie: The Setup

- Towards Better PDF Management with the Filesystem
- What is life like for PhDs in computer science who go into industry?
- Online REPL for 17 programming languages
- Logistic regression vs. multiple regression: Many statisticians seem to advise the use of logistic regression over multiple regression by invoking this logic: “A probability value can’t exceed 1 nor can it be less than 0. Since multiple regression often yields values less than 0 and greater than 1, use logistic regression.” While we can understand this argument, our feeling is that, in the applied fields we toil in, that argument is not a very practical one. In fact a seasoned statistics professor we know says (in effect): “What’s the big deal? If multiple regression yields any predicted values less than 0, consider them 0. If multiple regression yields any values greater than 1, consider them 1. End of story.” We agree.
- Scientific Python
- An everyday essential: the timer + My personal productivity rules
- Bill Thurston, by Terence Tao; Bill Thurston, 1946-2012, by Peter Woit; Bill Thurston 1946-2012, by *David Speyer*.
- Surviving a PhD: 10 top tips that show how to survive your PhD
- How different PhD’s work: Differences and similarities between departments in the PhD process
- Countdown Begins: Countdown starts for submission of the thesis
- PhD Life is Wonderful: Doing a PhD at Warwick University is a wonderful experience
- Too Many Emails In Your Inbox: Use Outlook folders to manage your emails
- Introduction to REX Facility: Videos introducing the Wolfson Research Exchange and its facilities
- Power of Supervisors: Control, inner happiness and optimism
- Unorthodox Tools of a Researcher: Reflections on and examples of unorthodox tools that help you through your PhD
- Homesickness and Culture Clashes: Homesickness of international students and cultural differences
- Choosing Your PhD Examiners: Tips for choosing the relevant examiners for PhD Viva
- Effective Research Tools: Examples of useful research tools
- PhD, Risks and Murphy’s Law: “Anything that can go wrong will go wrong” according to Murphy’s Law
- Will Data Scientists Be Replaced by Tools?
- Update: TeX Writer for iPad (+ LaTeX + AMS)
- Why physicists like models, and why biologists should
- The ENCODE project: lessons for scientific publication
- Perspectives From A Postdoc: What is a Postdoc?
- Chris Blattman gives advice on PhD students’ NSF applications
- ENCODE floods the news networks…
- Maybe mostly useful for me, but for other people with Tumblr blogs, here is a way to insert Latex.—From Simply Statistics
- Harvard Business school is getting in on the fun, calling the data scientist the sexy profession for the 21st century. Although I am a little worried that by the time it gets into a Harvard Business document, the hype may be outstripping the real promise of the discipline. Still, good news for statisticians! (via Rafa via Francesca D.’s Facebook feed).—From Simply Statistics
- The counterpoint is this article which suggests that data scientists might be able to be replaced by tools/software. I think this is also a bit too much hype for my tastes. Certain things will definitely be automated and we may even end up with a deterministic statistical machine or two. But there will continually be new problems to solve which require the expertise of people with data analysis skills and good intuition (link via Samara K.)—From Simply Statistics

- Getting Started with the WordPress Competition
- Simple Made Easy
- An Education Tsunami: *Will on-line courses destroy universities?*
- Universities Reshaping Education on the Web
- Explanation or Prediction? An Amazing Quote from Phil Schrodt
- Should you apply PCA to your data?
- Which classifiers are fast enough for exploring medium-sized data?
- Quick classifiers for exploring medium-sized data (redux)
- Is C++ worth it?
- Unbiased estimators can be terrible
- Things You Should Never Do, Part I
- The Joel Test: 12 Steps to Better Code
- Methodologists’ Audience
- Bayesian Methodology in the Genetic Age
- Interview with Michael Hammel, author of The Artist’s Guide to GIMP
- Being Happy in Grad School
- 10 Fresh Tips for Finding Time to Blog
- A Quick Guide to Using Tumblr for Business
- Statistics Done Wrong
- Top N Reasons To Do A Ph.D. or Post-Doc in Bioinformatics/Computational Biology
- Interview(s) with Vladimir Voevodsky with an introduction on motivic homotopy along with the video and transcript.
- Are there examples of non-orientable manifolds in nature?
- Kolmogorov Complexity – A Primer
- Adventures at My First JSM (Joint Statistical Meetings) #JSM2012
- Yes, I was hacked. Hard.
- Does Julia have any hope of sticking in the statistical community?
- How Genome Sequencing is Revolutionizing Clinical Diagnostics, from the ISMB Conference
- Advice for an Undergraduate
- 4 things you should know about choosing examiners for your thesis
- The long tail of free online education: The author also plans to teach a class on graph partitioning, expander graphs, and random walks online in Winter 2013.
- Teaching the World to Search
- Beyond Pinterest and Instagram – ten visual social networks that should be on your radar
- Making Ubuntu 12.04 useable
- Basic Understanding of Compressed Sensing

- Simplicity is hard to sell
- Self-Repairing Bayesian Inference
- Praxis and Ideology in Bayesian Data Analysis
- In-consistent Bayesian inference
- Big Data Generalized Linear Models with Revolution R Enterprise
- Quants, Models, and the Blame Game
- Fun with the googleVis Package for R
- Topological Data Analysis
- The Winners of the LaTeX and Graphics Contest
- Is Machine Learning Losing Impact?
- Machine Learning Doesn’t Matter?
- Components of Statistical Thinking and Implications for Instruction and Assessment
- Xiao-Li Meng and Xianchao Xie rethink asymptotics
- Higgs boson and five sigma
- What is the Statistics Department 25 Years From Now?
- Statistics: Your chance for happiness (or misery)
- Manifolds: motivation and definition
- Why Emacs is important to me? : ESS and org-mode
- Interesting Emacs linkfest
- Devs Love Bacon: Everything you need to know about Machine Learning in 30 minutes or less
- Visualizing Galois Fields
- Visualizing Galois Fields (Follow-up)
- Statistical Reasoning on iTunes U
- Computing log gamma differences
- Where to start if you’re going to revise statistics
- Power laws and the generalized CLT
- Open problems in next-gen sequence analysis
- More equations, less citations?
- Talk: Some Introductory Remarks on Bayesian Inference

- An easy way to think about priors on linear regression
- Combining priors and downweighting in linear regression
- Metropolis Hastings MCMC when the proposal and target have differing support
- Slidify: Things are coming together fast
- How to Convert Sweave LaTeX to knitr R Markdown: Winter Olympic Medals Example
- Testing R Markdown with R Studio and posting it on RPubs.com
- Announcing The R markdown Package
- Announcing RPubs: A New Web Publishing Service for R
- Approximate Bayesian computation
- Load Packages Automatically in RStudio
- Practical advice for machine learning: bias, variance and what to do next
- The overview article on “Approximate Computation and Implicit Regularization for Very Large-scale Data Analysis” associated with the invited talk at the upcoming PODS 2012 meeting is on the arXiv here.
- The monograph on *“Randomized Algorithms for Matrices and Data”* is available in NOW’s “Foundations and Trends in Machine Learning” series here, and it is also available on the arXiv here.
- Click here for information (including the slides and video!) on the Tutorial on “Geometric Tools for Identifying Structure in Large Social and Information Networks,” given originally at ICML10 and KDD10 and subsequently at many other places. (The slides are also linked to below.)
- The overview chapter on “Algorithmic and Statistical Perspectives on Large-Scale Data Analysis” is finally on the arXiv here; the book in which it will appear is in press; and a video of the associated talk is here.
- Recent teaching: Fall 2009: CS369M: Algorithms for Massive Data Set Analysis
- Confidence distributions
- Making a singular matrix non-singular
- Statistics Versus Machine Learning
- How to post R code on WordPress blogs
- Causation
- Pro Tips for Grad Students in Statistics/Biostatistics (Part 1)
- Pro Tips for Grad Students in Statistics/Biostatistics (Part 2)
- Why You Shouldn’t Conclude “No Effect” from Statistically Insignificant Slopes
- For those interested in knitr with Rmarkdown to beamer slides
- Notes from A Recent Spatial R Class I Gave
- Sparse Bayesian Methods for Low-Rank Matrix Estimation and Bayesian Group-Sparse Modeling and Variational Inference – implementation
- The Battle of the Bayes
- Ockham Workshop, Day 1
- Ockham Workshop, Day 2
- Ockham Workshop, Day 3
- Ockham’s Razor
- Occam
