Q: We often say that statistics is just about dealing with data. But informatics also extracts knowledge from data through analysis. For example, bioinformatics people can get by entirely without biostatistics. What is the essential difference between statistics and informatics?
A: To answer your question: I agree that, overall, statistics can't do without computers these days. Yet one of the major aspects of statistics is inference, which has nothing to do with computers. Statistical inference is actually what makes statistics a science, because it tells you whether or not your conclusions hold up in other contexts. - From gui11aume.
Statistics infers from data; informatics operates on data. - From stackovergio
Q: Applied probability is an important branch of probability, and it includes computational probability. Since statistics uses probability theory to construct models for dealing with data, as I understand it, I am wondering what the essential difference is between a statistical model and a probability model. Does a probability model not need real data? Thanks.
A: A Probability Model consists of the triplet (Ω, F, P), where Ω is the sample space, F is a σ-algebra of events on Ω, and P is a probability measure on F.
Intuitive explanation. A probability model can be interpreted as a known random variable X. For example, let X be a Normally distributed random variable with mean 0 and variance 1. In this case the probability measure P is determined by the Cumulative Distribution Function (CDF) of X.
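To make the "known random variable" view concrete, here is a minimal Python sketch using the standard library's `statistics.NormalDist`. The variable names are illustrative and not part of the original answer. Because the distribution is fully specified, probabilities of events follow directly from the CDF, with no data involved.

```python
from statistics import NormalDist

# A probability model: X ~ Normal(mean 0, variance 1), fully specified.
X = NormalDist(mu=0.0, sigma=1.0)

# P(X <= 0), read directly from the CDF.
p_le_zero = X.cdf(0.0)

# P(-1.96 < X <= 1.96) = F(1.96) - F(-1.96).
p_interval = X.cdf(1.96) - X.cdf(-1.96)

print(p_le_zero)
print(round(p_interval, 3))
```

Nothing here is estimated: every probability is a consequence of the chosen measure.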
A Statistical Model is a set S of probability models, that is, a set of probability measures/distributions on the sample space Ω.
This set of probability distributions is usually selected for modelling a certain phenomenon from which we have data.
Intuitive explanation. In a Statistical Model, the parameters and the distribution that describe a certain phenomenon are both unknown. An example of this is the family of Normal distributions with mean μ ∈ R and variance σ² ∈ R+; that is, both parameters are unknown and you typically want to use the data set to estimate them (i.e. to select an element of S). This set of distributions can be chosen on any Ω and F, but, if I am not mistaken, in a real example only those defined on the same pair (Ω, F) are reasonable to consider.
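As a rough sketch of "selecting an element of S", the Python snippet below treats the Normal family as the statistical model and uses a data set to pick one member. The observations are made up purely for illustration, and `NormalDist.from_samples` (standard library) is just one possible estimator: it uses the sample mean and the sample standard deviation.

```python
from statistics import NormalDist

# Hypothetical observations, invented for this illustration only.
data = [4.8, 5.1, 5.6, 4.9, 5.3, 5.0, 5.4]

# Statistical model S: all Normal(mu, sigma^2) with mu and sigma unknown.
# Estimating the parameters from the data selects one member of S.
fitted = NormalDist.from_samples(data)

print(fitted.mean, fitted.stdev)

# The selected member is itself a fully specified probability model,
# so probabilities can now be computed from its CDF.
print(fitted.cdf(5.5))
```

A different estimator, or a different data set, would select a different element of the same set S.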
Generalisations. This paper provides a very formal definition of Statistical Model, but the author mentions that “Bayesian model requires an additional component in the form of a prior distribution … Although Bayesian formulations are not the primary focus of this paper”. Therefore the definition of a Statistical Model depends on the kind of model we use: parametric or nonparametric. Also, in the parametric setting, the definition depends on how the parameters are treated (e.g. Classical vs. Bayesian).
The difference is: in a probability model you know the probability measure exactly, for example a Normal(μ0, σ0²), where μ0 and σ0² are known parameters, while in a statistical model you consider sets of distributions, for example Normal(μ, σ²), where μ and σ² are unknown parameters.
Neither of them requires a data set, but I would say that a Statistical Model is usually selected for modelling one. - From Procrastinator.
Update 7/10/2012: If You’re Not A Programmer … You’re Not A Bioinformatician !