There has been a Machine Learning (ML) reading list of books in hacker news for a while, where Professor Michael I. Jordan recommend some books to start on ML for people who are going to devote many decades of their lives to the field, and who want to get to the research frontier fairly quickly. Recently he articulated the relationship between CS and Stats amazingly well in his recent reddit AMA, in which he also added some books that dig still further into foundational topics. I just list them here for people’s convenience and my own reference.

- Frequentist Statistics
- Casella, G. and Berger, R.L. (2001). “Statistical Inference” Duxbury Press.—Intermediate-level statistics book.
- Ferguson, T. (1996). “A Course in Large Sample Theory” Chapman & Hall/CRC.—For a slightly more advanced book that’s quite clear on mathematical techniques.
- Lehmann, E. (2004). “Elements of Large-Sample Theory” Springer.—About asymptotics which is a good starting place.
- Vaart, A.W. van der (1998). “Asymptotic Statistics” Cambridge.—A book that shows how many ideas in inference (M estimation, the bootstrap, semiparametrics, etc) repose on top of empirical process theory.
- Tsybakov, Alexandre B. (2008) “Introduction to Nonparametric Estimation” Springer.—Tools for obtaining lower bounds on estimators.
- B. Efron (2010) “Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction” Cambridge,.—A thought-provoking book.

- Bayesian Statistics
- Gelman, A. et al. (2003). “Bayesian Data Analysis” Chapman & Hall/CRC.—About Bayesian.
- Robert, C. and Casella, G. (2005). “Monte Carlo Statistical Methods” Springer.—about Bayesian computation.

- Probability Theory
- Grimmett, G. and Stirzaker, D. (2001). “Probability and Random Processes” Oxford.—Intermediate-level probability book.
- Pollard, D. (2001). “A User’s Guide to Measure Theoretic Probability” Cambridge.—More advanced level probability book.
- Durrett, R. (2005). “Probability: Theory and Examples” Duxbury.—Standard advanced probability book.

- Optimization
- Bertsimas, D. and Tsitsiklis, J. (1997). “Introduction to Linear Optimization” Athena.—A good starting book on linear optimization that will prepare you for convex optimization.
- Boyd, S. and Vandenberghe, L. (2004). “Convex Optimization” Cambridge.
- Y. Nesterov and Iu E. Nesterov (2003). “Introductory Lectures on Convex Optimization” Springer.—A start to understand lower bounds in optimization.

- Linear Algebra
- Golub, G., and Van Loan, C. (1996). “Matrix Computations” Johns Hopkins.—Getting a full understanding of algorithmic linear algebra is also important.

- Information Theory
- Cover, T. and Thomas, J. “Elements of Information Theory” Wiley.—Classic information theory.

- Functional Analysis
- Kreyszig, E. (1989). “Introductory Functional Analysis with Applications” Wiley.—Functional analysis is essentially linear algebra in infinite dimensions, and it’s necessary for kernel methods, for nonparametric Bayesian methods, and for various other topics.

Remarks from Professor Jordan: “not only do I think that you should eventually read all of these books (or some similar list that reflects your own view of foundations), but I think that you should read all of them three times—**the first time you barely understand, the second time you start to get it, and the third time it all seems obvious**.”

## 6 comments

Comments feed for this article

January 28, 2015 at 4:23 am

Condensed News | Data Analytics & R[…] Machine Learning Books Suggested by Michael I. Jordan from Berkeley There has been a Machine Learning (ML) reading list of books in hacker news for a while, where Professor Michael I. Jordan recommend some books to start on ML for people who are going to devote many decades of their lives to the field, and who want to get to the research frontier fairly quickly. Recently he articulated the relationship between CS and Stats amazingly well in his recent reddit AMA, in which he also added some books that dig still further into foundational topics. I just list them here for people’s convenience and my own reference. […]

April 9, 2015 at 7:56 am

litoughReblogged this on litough and commented:

books for machine learning suggested by michael I jordan from berkeley

September 18, 2015 at 7:16 am

Ismail EleziIs there a recommended order to read the books? For someone who is elementary only with basic statistics/probability, where should I start?

I guess linear algebra might be a good start (considering that I have done it in the past and apply it quite often), but then things start becoming more difficult.

I am not in a hurry, would take me a good few years to read those books multiple times in order to fully understand them.

January 14, 2016 at 9:43 pm

LightRiverReblogged this on Deep thought and commented:

Các cuốn sách về Machine Learning được GS Michael I. Jordan (ĐH Berkeley) khuyến nghị. (Xem bản copy tại: http://www.statsblogs.com/2014/12/30/machine-learning-books-suggested-by-michael-i-jordan-from-berkeley/)

January 15, 2016 at 10:06 pm

PerryZhaoReblogged this on 木秀于林.

April 13, 2018 at 5:32 am

RyanReblogged this on Unfamous network and commented:

NICE JOB!