Category Archive

You are currently browsing the category archive for the ‘Computer Science’ category.

Useful for referring—01-16-2019

January 16, 2019 in Academic, Computer Science, Machine Learning, Mathematics, Probability, Statistics, Useful for referring | Leave a comment

A nice blog on CS, including machine learning: https://blog.acolyer.org/, called “the morning paper”: an interesting/influential/important paper from the world of CS every weekday morning, as selected by Adrian Colyer. I hope there is a similar blog for Statistics that reviews and recommends an interesting/influential/important paper from the world of Statistics.
A wonderful summary of Mathematical Tricks Commonly Used in Machine Learning and Statistics with examples
I just realized that when I teach ridge regression I should have used A Useful Matrix Inverse Equality for Ridge Regression
GANs should gain much more attention in the stats community: Understanding Generative Adversarial Networks. This is a nice post about GANs based on “probably the highest-quality general overview available nowadays: Ian Goodfellow’s tutorial on arXiv, which he then presented in some form at NIPS 2016.”
R or Python? Why not both? Using Anaconda Python within R with {reticulate}
“A heatmap is basically a table that has colors in place of numbers. Colors correspond to the level of the measurement.”
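The matrix inverse equality for ridge regression mentioned in the list above is presumably the push-through identity (X'X + λI)^(-1) X' = X'(XX' + λI)^(-1); that is my assumption, since the linked post is not reproduced here. A minimal sketch that checks the identity exactly on a small made-up design matrix, using only the Python standard library (all helper names are hypothetical):

```python
from fractions import Fraction


def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def add_ridge(a, lam):
    """Return a + lam * I for a square matrix a."""
    return [[v + (lam if i == j else 0) for j, v in enumerate(row)]
            for i, row in enumerate(a)]


def inverse(a):
    """Gauss-Jordan inverse with exact rational arithmetic."""
    n = len(a)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Pick a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Hypothetical toy data: n = 2 observations, p = 3 predictors, lambda = 1/2.
X = [[1, 2, 0],
     [0, 1, 3]]
lam = Fraction(1, 2)
Xt = transpose(X)

# The left side inverts a p x p matrix, the right side an n x n matrix.
lhs = matmul(inverse(add_ridge(matmul(Xt, X), lam)), Xt)
rhs = matmul(Xt, inverse(add_ridge(matmul(X, Xt), lam)))
print(lhs == rhs)
```

The practical point is that when there are many more predictors than observations, the right-hand side only needs the inverse of the smaller n x n matrix.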

Machine Learning Books Suggested by Michael I. Jordan from Berkeley

December 30, 2014 in Academic, Computer Science, Machine Learning, Mathematics, Probability, Statistics | 7 comments

There has been a Machine Learning (ML) reading list of books on Hacker News for a while, in which Professor Michael I. Jordan recommends some books to start on ML for people who are going to devote many decades of their lives to the field, and who want to get to the research frontier fairly quickly. Recently he articulated the relationship between CS and Stats amazingly well in his reddit AMA, in which he also added some books that dig still further into foundational topics. I list them here for people’s convenience and my own reference.

Frequentist Statistics
1. Casella, G. and Berger, R.L. (2001). “Statistical Inference” Duxbury Press.—Intermediate-level statistics book.
2. Ferguson, T. (1996). “A Course in Large Sample Theory” Chapman & Hall/CRC.—For a slightly more advanced book that’s quite clear on mathematical techniques.
3. Lehmann, E. (2004). “Elements of Large-Sample Theory” Springer.—About asymptotics which is a good starting place.
4. Vaart, A.W. van der (1998). “Asymptotic Statistics” Cambridge.—A book that shows how many ideas in inference (M estimation, the bootstrap, semiparametrics, etc) repose on top of empirical process theory.
5. Tsybakov, Alexandre B. (2008) “Introduction to Nonparametric Estimation” Springer.—Tools for obtaining lower bounds on estimators.
6. Efron, B. (2010). “Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction” Cambridge. A thought-provoking book.
Bayesian Statistics
1. Gelman, A. et al. (2003). “Bayesian Data Analysis” Chapman & Hall/CRC.—About Bayesian.
2. Robert, C. and Casella, G. (2005). “Monte Carlo Statistical Methods” Springer.—about Bayesian computation.
Probability Theory
1. Grimmett, G. and Stirzaker, D. (2001). “Probability and Random Processes” Oxford.—Intermediate-level probability book.
2. Pollard, D. (2001). “A User’s Guide to Measure Theoretic Probability” Cambridge.—More advanced level probability book.
3. Durrett, R. (2005). “Probability: Theory and Examples” Duxbury.—Standard advanced probability book.
Optimization
1. Bertsimas, D. and Tsitsiklis, J. (1997). “Introduction to Linear Optimization” Athena.—A good starting book on linear optimization that will prepare you for convex optimization.
2. Boyd, S. and Vandenberghe, L. (2004). “Convex Optimization” Cambridge.
3. Nesterov, Y. (2003). “Introductory Lectures on Convex Optimization” Springer. A starting point for understanding lower bounds in optimization.
Linear Algebra
1. Golub, G., and Van Loan, C. (1996). “Matrix Computations” Johns Hopkins.—Getting a full understanding of algorithmic linear algebra is also important.
Information Theory
1. Cover, T. and Thomas, J. “Elements of Information Theory” Wiley.—Classic information theory.
Functional Analysis
1. Kreyszig, E. (1989). “Introductory Functional Analysis with Applications” Wiley.—Functional analysis is essentially linear algebra in infinite dimensions, and it’s necessary for kernel methods, for nonparametric Bayesian methods, and for various other topics.

Remarks from Professor Jordan: “not only do I think that you should eventually read all of these books (or some similar list that reflects your own view of foundations), but I think that you should read all of them three times—the first time you barely understand, the second time you start to get it, and the third time it all seems obvious.”

Useful for referring—2-25-2014

February 25, 2014 in Academic, Biostatistics, Computer Science, Machine Learning, Mathematics, Probability, Statistics, Useful for referring | Tags: Statistical Research | Leave a comment

NIPS2012 Post Collection

December 15, 2012 in Academic, Computer Science, Machine Learning, Statistics | 2 comments

In my office I have two NIPS posters on the wall, 2011 and 2012. I have never been there, and I am not a computer scientist either, but I like NIPS anyway, for no particular reason. Now it’s time for me to organize posts from others:

And among all of the posts, there are several things I have to digest later on:

One tutorial on Random Matrices, by Joel Tropp. People concluded in their posts that

Basically, break random matrices down into a sum of simpler, independent random matrices, then apply concentration bounds on the sum.—John Moeller. The basic result is that if you love your Chernoff bounds and Bernstein inequalities for (sums of) scalars, you can get almost exactly the same results for (sums of) matrices.—hal .
“This year was definitely all about Deep Learning,” John Moeller said. The Geomblog mentioned that, although it has been in the news recently because of Google’s untrained search for YouTube cats, the methods of deep learning (basically neural nets without lots of back propagation) have been growing in popularity for a long while. We should also spend some time reading Deep Learning and the evolution of data models, which is related to manifold learning.
“Another trend that’s been around for a while, but was striking to me, was the detailed study of Optimization methods.”—The Geomblog. There are at least two different workshops on optimization in machine learning (DISC and OPT), and numerous papers that very carefully examined the structure of optimizations to squeeze out empirical improvements.
Kernel distances: An introduction to kernel distance from The Geomblog. “Scott Aaronson (at his NIPS invited talk) made this joke about how nature loves ℓ2. The kernel distance is “essentially” the ℓ2 variant of EMD (which makes so many things easier). There’s been a series of papers by Sriperumbudur et al. on this topic, and in a series of works they have shown that (a) the kernel distance captures the notion of “distance covariance” that has become popular in statistics as a way of testing independence of distributions, (b) as an estimator of distance between distributions, the kernel distance has more efficient estimators than (say) the EMD because its estimator can be computed in closed form instead of needing an algorithm that solves a transportation problem, and (c) the kernel that optimizes the efficiency of the two-sample estimator can also be determined (the NIPS paper).”
Spectral Methods for Latent Models: Spectral methods for latent variable models are based upon the method of moments rather than maximum likelihood.
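The closed-form point in the kernel distance item above can be sketched as follows. This is a minimal plug-in (biased) estimate of the squared kernel distance between two samples, assuming a Gaussian kernel with bandwidth `sigma`; the kernel choice, the function names and the toy samples are my assumptions, not taken from the cited papers.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two points given as tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mean_kernel(xs, ys, sigma=1.0):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def kernel_distance_sq(xs, ys, sigma=1.0):
    """Plug-in estimate of the squared kernel distance between two samples.

    Only pairwise kernel evaluations are needed, so there is no
    transportation problem to solve, unlike for the EMD.
    """
    return (mean_kernel(xs, xs, sigma)
            + mean_kernel(ys, ys, sigma)
            - 2.0 * mean_kernel(xs, ys, sigma))


# Hypothetical toy samples in the plane.
P = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
Q = [(3.0, 3.0), (4.0, 3.0), (3.0, 4.0)]

same = kernel_distance_sq(P, P)       # a sample compared with itself
different = kernel_distance_sq(P, Q)  # two well-separated samples
print(same, different)
```

The cost is quadratic in the sample sizes, but it is a plain sum of kernel values with no optimization step.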

Besides the papers mentioned in the hot topics above, there are some other papers from Memming’s post:

Graphical models via generalized linear models: Eunho introduced a family of graphical models with GLM marginals and Ising model style pairwise interaction. He said the Poisson-Markov-Random-Fields version must have negative coupling, otherwise the log partition function blows up. He showed conditions for which the graph structure can be recovered with high probability in this family.
TCA: High dimensional principal component analysis for non-gaussian data: Using an elliptical copula model (extending the nonparanormal), the eigenvectors of the covariance of the copula variables can be estimated from Kendall’s tau statistic which is invariant to the nonlinearity of the elliptical distribution and the transformation of the marginals. This estimator achieves close to the parametric convergence rate while being a semi-parametric model.
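The invariance claimed in the TCA item above is easy to see on a toy example: Kendall’s tau depends only on the ordering of the observations, so strictly increasing transformations of the marginals leave it unchanged. A minimal sketch, assuming made-up data without ties; the function name is hypothetical, and the final `sin(pi * tau / 2)` step is the standard relation between Kendall’s tau and the correlation coefficient for elliptical distributions, which I assume is the one the paper builds on.

```python
import math
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau for paired data without ties (O(n^2) pair count)."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(xs)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical paired observations.
x = [0.1, 0.4, 0.35, 0.8, 1.2, 1.0]
y = [1.0, 1.5, 1.1, 2.0, 2.6, 3.1]

tau = kendall_tau(x, y)

# Strictly increasing transformations of each marginal keep every ordering.
tau_transformed = kendall_tau([math.exp(v) for v in x], [v ** 3 for v in y])

# Rank-based correlation estimate under an elliptical model.
rho_hat = math.sin(math.pi * tau / 2)
print(tau, tau_transformed, rho_hat)
```

Because the estimate uses only concordant and discordant pair counts, it never needs the unknown marginal transformations.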

Update: Make sure to check the lectures from the prominent 26th Annual NIPS Conference filmed @ Lake Tahoe 2012. Also make sure to check the NIPS 2012 Workshops, Oral sessions and Spotlight sessions which were collected for the Video Journal of Machine Learning Abstracts – Volume 3.

Sage And Python

November 5, 2012 in Academic, Computer Science, Mathematics, Statistics | 1 comment

Python is great, and I think it will stay great. For pure mathematics, there is a lot of symbolic calculation, since pure mathematics (differential geometry, commutative algebra, algebraic geometry, and so on) is abstract and powerful. However, science is nothing but experiment and computation, so we also need powerful computational software to help us carry out the results. Sage is your choice! Sage claims that

Sage is a free open-source mathematics software system licensed under the GPL. It combines the power of many existing open-source packages into a common Python-based interface.

Mission: Creating a viable free open source alternative to Magma, Maple, Mathematica and Matlab.

Sage is not only for pure mathematics: today I happened to see a blog post about using Sage to calculate the higher moments of a Gaussian:

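var('m, s, t')
mgf(t) = exp(m*t + t^2*s^2/2)  # moment generating function of N(m, s^2)
for i in range(1, 11):
    # the i-th derivative of the MGF at t = 0 is the i-th raw moment
    print(derivative(mgf, t, i).subs(t=0))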

which leads to the following result:

m

m^2 + s^2

m^3 + 3*m*s^2

m^4 + 6*m^2*s^2 + 3*s^4

m^5 + 10*m^3*s^2 + 15*m*s^4

m^6 + 15*m^4*s^2 + 45*m^2*s^4 + 15*s^6

m^7 + 21*m^5*s^2 + 105*m^3*s^4 + 105*m*s^6

m^8 + 28*m^6*s^2 + 210*m^4*s^4 + 420*m^2*s^6 + 105*s^8

m^9 + 36*m^7*s^2 + 378*m^5*s^4 + 1260*m^3*s^6 + 945*m*s^8

m^10 + 45*m^8*s^2 + 630*m^6*s^4 + 3150*m^4*s^6 + 4725*m^2*s^8 + 945*s^10
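The table above can be cross-checked without Sage. A minimal sketch in plain Python, assuming the standard recurrence for Gaussian moments, E[X^n] = m E[X^(n-1)] + (n-1) s^2 E[X^(n-2)], which is a different technique from the MGF differentiation used in the Sage snippet. Polynomials in m and s are stored as dictionaries mapping (power of m, power of s) to a coefficient, and all names are hypothetical.

```python
def gaussian_moments(max_order):
    """Raw moments E[X^n] of X ~ N(m, s^2) for n = 0..max_order.

    Each moment is a dict {(power of m, power of s): coefficient}.
    """
    moments = [{(0, 0): 1}]            # E[X^0] = 1
    if max_order >= 1:
        moments.append({(1, 0): 1})    # E[X^1] = m
    for n in range(2, max_order + 1):
        poly = {}
        # m * E[X^(n-1)]
        for (i, j), c in moments[n - 1].items():
            poly[(i + 1, j)] = poly.get((i + 1, j), 0) + c
        # (n - 1) * s^2 * E[X^(n-2)]
        for (i, j), c in moments[n - 2].items():
            poly[(i, j + 2)] = poly.get((i, j + 2), 0) + (n - 1) * c
        moments.append(poly)
    return moments


def to_text(poly):
    """Format a moment polynomial, highest power of m first."""
    terms = []
    for (i, j), c in sorted(poly.items(), reverse=True):
        factors = []
        if c != 1:
            factors.append(str(c))
        if i > 0:
            factors.append("m" if i == 1 else f"m^{i}")
        if j > 0:
            factors.append("s" if j == 1 else f"s^{j}")
        terms.append("*".join(factors) if factors else "1")
    return " + ".join(terms)


moments = gaussian_moments(10)
for poly in moments[1:]:
    print(to_text(poly))
```

Working with integer coefficients keeps the check exact, so any mismatch with the Sage table would point to a real error rather than rounding.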

Go Python! Go Sage!

C++ and R

July 30, 2012 in Academic, Computer Science, Machine Learning, Statistics | 1 comment

Today I saw a question linked on reddit: How important is Java/C++ vs just using R/Matlab for big data? I learned C++ and Matlab as an undergraduate, and I am now teaching myself R as a PhD student in a Stats department. But in this era of big data, R alone is really not enough for scientific computing, so this question is exactly what I want to know. Here I want to organize some interesting materials, including posts, about programming, especially R and C++.

First, I want to mention the top languages by number of projects on GitHub: JavaScript 20%, Ruby 14%, Python 9%, Shell 8%, Java 8%, PHP 7%, C 7%, C++ 4%, Perl 4%, Objective-C 3%, among lots of other languages including R, Julia and Matlab. Of these top 10 languages, I only know C and C++. For people like me who want to learn about the rest, here is a short description of each:

JavaScript
JavaScript is an object-oriented scripting language that runs in your web browser. It uses a simplified set of commands, is easier to code, and doesn’t require compiling. It’s an important language since it’s embedded into HTML and happens to be used in millions of web pages to validate forms, create cookies, detect browsers and improve page design and formatting. Big plus: it’s easy to learn and use.
Ruby and Ruby on Rails
Ruby is a dynamic, object-oriented, open-source programming language; Ruby on Rails is an open-source Web application framework written in Ruby that closely follows the MVC (Model-View-Controller) architecture. With a focus on simplicity, productivity and letting the computers do the work, in a few years, its usage has spread quickly. Ruby is very similar to Python, but with different syntax and libraries. There’s little reason to learn both, so unless you have a specific reason to choose Ruby (i.e. if this is the language your colleagues all use), I’d go with Python.

Ruby on Rails is one of the most popular web development frameworks out there, so if you’re looking to do primarily web development you should compare Django (Python framework) and RoR first.
Python
Python is an interpreted, dynamically-typed programming language. Python programs stress code readability, so even non-programmers should be able to decipher a Python program with relative ease. This also makes the language one of the easiest to learn and write code in quickly. Python is very popular and has a strong set of libraries for everything from numerical and symbolic computing to data visualization and graphical user interfaces.
Java
Java is an object-oriented programming language developed by James Gosling and colleagues at Sun Microsystems in the early 1990s. Why you should learn it: Hailed by many developers as a “beautiful” language, it is central to the non-.Net programming experience. Learning Java is critical if you are non-Microsoft.
PHP
What is PHP? PHP is an open-source, server side html scripting language well suited for web developers as it can easily be embedded into standard html pages. You can run 100% dynamic pages or hybrid pages, 50% html + 50% php.
C
C is a standardized, general-purpose programming language. It’s one of the most pervasive languages and the basis for several others (such as C++). It’s important to learn C. Once you do, making the jump to Java or C# is fairly easy, because a lot of the syntax is common. C is a low-level, statically typed, compiled language. The main benefit of C is its speed, so it’s useful for tasks that are very computationally intensive. Because it’s compiled into an executable, it’s also easier to distribute C programs than programs written in interpreted languages like Python. The trade-off of increased speed is decreased programmer efficiency. C++ is C with some additional object-oriented features built in. It can be slower than C, but the two are pretty comparable, so it’s up to you whether these additional features are worth it.
Perl
Perl is an open-source, cross-platform, server-side interpreted programming language used extensively to process text through CGI programs. Perl’s power in processing piles of text has made it very popular and widely used to write Web server programs for a range of tasks.

This ranking only covers GitHub users, so it is biased. For me, I think C/C++, R, Julia, Matlab, Java, Python and Perl will be the popular ones in the stats sphere.

Advice on learning C++ from an R background
Integrating C or C++ into R, where to start?
R for testing, C++ for implementation?
Some thoughts on Java—compared with C++
A list of RSS C++ blogs
Get started with C++ AMP
C++11 Concurrency Series
Google’s Python Class and Google’s C++ Class from Google Code University
Integrating R and C++
Learn Python on Codecademy
Learn How to Code Without Leaving Your Browser
Minimal Advice to Undergrads on Programming
Learning R Via Python (or the other way around).
Bloom teaches Python for Scientific Computing at Berkeley (available as a podcast).
What are the three most important programming languages to learn?—The following is from Waleed Kadous, PhD in Computer Science:

I would focus on learning three classes of languages to really understand the nature of programming and to have a decent toolkit. Everything else is basically variants on that.

Learn a low-level language so you understand what goes on at the bare metal and so you can make hardware dance
The obvious choice here is C, but assembly language might also be good.

Learn a language for architecting large systems
If you want to build large code bases, you’re going to need one of the strongly typed languages. Personally, I think Java is the best choice here; but C++, Scala and even Ada are acceptable.

Learn a language for scripting things together quickly
There are a few choices here: shell, Python, Perl, Lua. Any of these will do, but Python is probably the foremost. These are great for gluing existing pieces together.

Now, if you only get three, that’s it. But I’m going to suggest two more categories.

Learn a language that forces you to think differently about programming
These are majorly different world perspectives. Examples here would be functional programming, like Haskell, ML, etc, but also logic programming like Prolog.

Learn a language that lets you build web-based applications quickly
This could be web2py or Javascript — but the ability to quickly hack together a web demo is really useful today.

The R packages in a data scientist’s toolbox

July 18, 2012 in Academic, Computer Science, Machine Learning, Statistics | Leave a comment

The following from Revolutions:

John Myles White, self-described “statistics hacker” and co-author of “Machine Learning for Hackers”, was interviewed recently by The Setup. In the interview, he describes some of his go-to R packages for data science:

Most of my work involves programming, so programming languages and their libraries are the bulk of the software I use. I primarily program in R, but, if the situation calls for it, I’ll use Matlab, Ruby or Python. …

That said, for me the specific language I use is much less important than the libraries available for that language. In R, I do most of my graphics using ggplot2, and I clean my data using plyr, reshape, lubridate and stringr. I do most of my analysis using rjags, which interfaces with JAGS, and I’ll sometimes use glmnet for regression modeling. And, of course, I use ProjectTemplate to organize all of my statistical modeling work. To do text analysis, I’ll use the tm and lda packages.

Also in JMW’s toolbox: Julia, TextMate 2, MySQL, Dropbox and a beefy MacBook. Read the full interview linked below for an insightful look at how he uses these and other tools day to day.

The Setup / Interview: John Myles White

ICML 2011 and COLT 2011

September 1, 2011 in Academic, Computer Science, Machine Learning, Mathematics, Statistics | Leave a comment

I missed the whole summer, and many interesting things happened during it, so I have to catch up. This post will be updated over the next few days. I will collect some posts from others here, and I hope they will be helpful for you.

ICML 2011

COLT 2011

1, ICML 2011 and the future

2, Interesting Neural Network Papers at ICML 2011

3, Interesting papers at COLT 2011

4, The conference(s) post: ACL and ICML

5, 14th International Conference on Artificial Intelligence and Statistics 2011 –Ft.Lauderdale

6, KDD and MUCMD 2011

Python

May 30, 2011 in Academic, Computer Science | 1 comment

I have gradually noticed that Python is becoming more popular, so I want to learn it.

1, Python Programming Language – Official Website

2, Introduction to Computer Science and Programming

3, Python Programming Tutorial

4, http://www.scipy.org/

5, Scientific Python

6, SciPy 2011

Algorithmic information theory

May 30, 2011 in Academic, Computer Science, Machine Learning | 2 comments

Today I talked with a friend from ANU online, who is interested in AIT. The following is the reading list from his blog:

Here is the list of books I’m currently reading; most of them, if not all, are recommended by Prof. Marcus Hutter.

S. J. Russell and P. Norvig. Artificial Intelligence. A Modern Approach
Prentice-Hall, Englewood Cliffs, 3rd Edition (2010) [A very high-level introductory book. Taking comp3620 @ ANU]

M. Li and P. M. B. Vitanyi. An introduction to Kolmogorov complexity and its applications
Springer, 3rd edition (2008) [What I like most about this book is how they mathematically formalise human’s intuition of complexity]

D. P. Bertsekas and J. N. Tsitsiklis. Neuro-Dynamic Programming
Athena Scientific, Belmont, MA (1996) [If you want to know the formal proofs in RL, there is really no replacement for this book]

R. Sutton and A. Barto. Reinforcement learning: An introduction
Cambridge, MA, MIT Press (1998) [An introductory book for RL]

G. Restall. Logic: An Introduction
Fundamentals of Philosophy, Routledge (2006) [Again an introductory book for logic.]

M. Hutter. Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability
Springer, Berlin, 300 pages (2005) [My supervisor’s book. This is a very compact and theoretical book with strong mathematical background assumed]

C. M. Bishop. Pattern Recognition and Machine Learning
Springer (2006) [This book is normally referred to as the Bible in machine learning. This is the must-learn book.]

Peter D. Grunwald. The Minimum Description Length Principle
The MIT Press (2007) [MDL principle based on Occam’s Razor]

I also found the following resources useful:

1, Algorithmic Information Theory Reading Group

2, Algorithmic information theory

3, AIT resources webpage
