You are currently browsing the monthly archive for February 2012.

- Active Bayesian Optimization
- There are several videos from the IMA meeting on Group Testing Designs, Algorithms, and Applications to Biology. Enjoy!
- Emergence of MCMC Bayesian Computation
- Two Interesting Short Volumes on the (Graph) Laplacian
- So-called Bayesian hypothesis testing is just as bad as regular hypothesis testing
- Prediction: the Lasso vs. just using the top 10 predictors
- Getting Genetics Done: Golden Helix: A Hitchhiker’s Guide to Next Generation Sequencing

This Friday, there is a talk on Personalized Medicine and Artificial Intelligence given by Michael Kosorok from the Department of Biostatistics at the University of North Carolina at Chapel Hill. The following materials could be helpful for getting some idea of this area:

- Active Learning for Developing Personalized Treatment, a new paper from arXiv.
- From Statistical Genetics to Predictive Models in Personalized Medicine, a workshop from NIPS 2011.
- Machine Learning for Personalized Medicine: Will This Drug Give Me a Heart Attack?
- A paper written by Michael Kosorok, Penalized Q-Learning for Dynamic Treatment Regimes

Today I watched an interesting video about the indestructibility of information and the nature of black holes, a talk given by Leonard Susskind of the Stanford Institute for Theoretical Physics.

Video: Leonard Susskind on The World As Hologram

And I also looked up some materials online:

World’s Most Precise Clocks Could Reveal Universe Is a Hologram

The World as a Hologram (A paper written by L. Susskind)

UC Berkeley’s Raphael Bousso presents a friendly introduction to the ideas behind the holographic principle, which may be very important in the hunt for a theory of quantum gravity.

Q: What are Bartlett corrections?

A: Strictly speaking, a Bartlett correction is a scalar transformation applied to the likelihood ratio (LR) statistic that yields a new, improved test statistic which has a chi-squared null distribution to order O(1/n). This represents a clear improvement over the original statistic in the sense that LR is distributed as chi-squared under the null hypothesis only to order O(1).
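As a rough sketch of the idea (the symbols $q$, $b$ and $n$ are notation assumed here for illustration, not taken from the answer above): suppose the LR statistic tests $q$ restrictions on a sample of size $n$, and its null expectation admits the expansion on the left. Rescaling by that approximate mean gives the corrected statistic on the right.

$$
E[LR] = q\left(1 + \frac{b}{n}\right) + O(n^{-2}), \qquad LR^{*} = \frac{LR}{1 + b/n}.
$$

The rescaling brings the mean of $LR^{*}$ closer to $q$, the mean of the $\chi^2_q$ reference distribution. In practice $b$ usually depends on unknown parameters and is replaced by a consistent estimate.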

Q: Are there extensions of Bartlett corrections?

A: Yes. Some of them arose in response to Sir David Cox’s 1988 paper, “Some aspects of conditional and asymptotic inference: a review” (Sankhya A). A particularly useful one was proposed by Gauss Cordeiro and Silvia Ferrari in a 1991 Biometrika paper. They have shown how to Bartlett-correct test statistics whose null asymptotic distribution is chi-squared with special emphasis on Rao’s score statistic.

Q: Where can I find a survey paper on Bartlett corrections?

A: There are a few around. Two particularly useful ones are:

- Cribari-Neto, F. and Cordeiro, G.M. (1996) On Bartlett and Bartlett-type corrections. Econometric Reviews, 15, 339-367.
- Jensen, J.L. (1993) A historical sketch and some new results on the improved likelihood statistic. Scandinavian Journal of Statistics, 20, 1-15.

Q: What are the alternatives to Bartlett corrections?

A: There are several alternatives. A closely related one is the Edgeworth expansion, named after the economist/statistician Francis Ysidro Edgeworth. There are also saddlepoint expansions. A computer-intensive alternative is known as the bootstrap and was proposed by Bradley Efron in his 1979 Annals of Statistics paper.

Please refer to Bartlett Corrections Page.

From wiki, we have the following:

In statistics, resampling is any of a variety of methods for doing one of the following:

- Estimating the precision of sample statistics (medians, variances, percentiles) by using subsets of available data (jackknifing) or drawing randomly with replacement from a set of data points (bootstrapping)
- Exchanging labels on data points when performing significance tests (permutation tests, also called exact tests, randomization tests, or re-randomization tests)
- Validating models by using random subsets (bootstrapping, cross validation)

Common resampling techniques include bootstrapping, jackknifing and permutation tests.

- Bootstrapping is a statistical method for estimating the sampling distribution of an estimator by sampling with replacement from the original sample, most often with the purpose of deriving robust estimates of standard errors and confidence intervals of a population parameter like a mean, median, proportion, odds ratio, correlation coefficient or regression coefficient.
- Jackknifing, which is similar to bootstrapping, is used in statistical inference to estimate the bias and standard error (variance) of a statistic, when a random sample of observations is used to calculate it. The basic idea behind the jackknife variance estimator lies in systematically recomputing the statistic estimate leaving out one or more observations at a time from the sample set. From this new set of replicates of the statistic, an estimate for the bias and an estimate for the variance of the statistic can be calculated.
- Cross-validation is a statistical method for validating a predictive model. Subsets of the data are held out for use as validating sets; a model is fit to the remaining data (a training set) and used to predict for the validation set. Averaging the quality of the predictions across the validation sets yields an overall measure of prediction accuracy.
- A permutation test (also called a randomization test, re-randomization test, or an exact test) is a type of statistical significance test in which the distribution of the test statistic under the null hypothesis is obtained by calculating all possible values of the test statistic under rearrangements of the labels on the observed data points.
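To make the definitions above concrete, here is a minimal sketch in plain Python of three of the techniques: a bootstrap standard error, a leave-one-out jackknife standard error, and an exact permutation test for a difference in means. The function names, the sample values and the choice of 2000 bootstrap replicates are my own assumptions for illustration, not something taken from the quoted text.

```python
import itertools
import math
import random
import statistics


def bootstrap_se(data, stat, n_boot=2000, seed=0):
    """Bootstrap standard error: resample with replacement, recompute the statistic."""
    rng = random.Random(seed)
    n = len(data)
    reps = [stat([rng.choice(data) for _ in range(n)]) for _ in range(n_boot)]
    return statistics.stdev(reps)


def jackknife_se(data, stat):
    """Jackknife standard error: recompute the statistic leaving one observation out."""
    n = len(data)
    reps = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    mean_rep = sum(reps) / n
    return math.sqrt((n - 1) / n * sum((r - mean_rep) ** 2 for r in reps))


def permutation_p_value(x, y):
    """Exact two-sided permutation test for a difference in means.

    Enumerates every way of relabelling the pooled data, so it is only
    practical for small samples.
    """
    pooled = x + y
    total = sum(pooled)
    observed = abs(statistics.mean(x) - statistics.mean(y))
    count = hits = 0
    for idx in itertools.combinations(range(len(pooled)), len(x)):
        sum_x = sum(pooled[i] for i in idx)
        diff = abs(sum_x / len(x) - (total - sum_x) / len(y))
        count += 1
        if diff >= observed - 1e-12:  # small slack for floating-point ties
            hits += 1
    return hits / count


if __name__ == "__main__":
    sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9]  # made-up data
    print("bootstrap SE of the mean:", bootstrap_se(sample, statistics.mean))
    print("jackknife SE of the mean:", jackknife_se(sample, statistics.mean))
    print("permutation p-value:", permutation_p_value([1.0, 2.0, 3.0], [7.0, 8.0, 9.0]))
```

For the sample mean, the jackknife standard error coincides with the usual s/sqrt(n), which is a handy sanity check. The bootstrap value should be close to it but varies with the seed and the number of replicates.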

These days I am struggling with sequencing data. What do they look like? What is the essential difference from microarray data, from a statistician's point of view? Discrete versus continuous? So far, I am still not clear about this. Could anyone help me with it? Thanks.

The following is a collection of related material, and maybe something else:

- Talk on RNA-seq Analysis, presented by Wing H. Wong, at the Joint Statistical Meetings on August 3, 2010 in Vancouver, Canada.
- Julia Salzman, Hui Jiang and Wing Hung Wong (2011), Statistical Modeling of RNA-Seq Data. Statistical Science 2011, Vol. 26, No. 1, 62-83. doi: 10.1214/10-STS343.
- Hui Jiang and Wing Hung Wong (2009), Statistical Inferences for Isoform Expression in RNA-Seq.
- Wenxiu Ma and Wing Hung Wong (2011), The Analysis of ChIP-Seq Data.
- Genotype and SNP calling from next-generation sequencing data
- Saran Vardhanabhuti, Mingyao Li and Hongzhe Li (2011), A Hierarchical Bayesian Model for Estimating and Inferring Differential Isoform Expression for Multi-sample RNA-Seq Data
- BM-Map: Bayesian Mapping of Multireads for Next-Generation Sequencing Data, 2011, Yuan Ji, Yanxun Xu, Qiong Zhang, Kam-Wah Tsui, Yuan Yuan, Clift Norris Jr., Shoudan Liang, Han Liang
- A new paper written by Lin Wan, Xiting Yan, Ting Chen, and Fengzhu Sun, Biostat 2012 published 21 February 2012, Modeling RNA degradation for RNA-Seq with applications
- Video 1 and video 2 for Analysis and design of RNA sequencing experiments for identifying isoform regulation

- R and presentations: a basic example of knitr and beamer : Combine the `knitr` package with the LaTeX package `beamer` for presentation slides, instead of the Sweave package, because it basically is a better Sweave.
- Constructing Summary Statistics for Approximate Bayesian Computation: Semi-automatic ABC : a good paper worthy of learning and discussing. Many modern statistical applications involve inference for complex stochastic models, where it is easy to simulate from the models, but impossible to calculate likelihoods. Approximate Bayesian computation (ABC) is a method of inference for such models. It replaces calculation of the likelihood by a step which involves simulating artificial data for different parameter values, and comparing summary statistics of the simulated data to summary statistics of the observed data.
- Elegant & fast data manipulation with data.table : Extension of data.frame for fast indexing, fast ordered joins, fast assignment, fast grouping and list columns.
- Ordinal Measures of Association : I have come across these statistics twice so far, most recently in this paper by Han Liu et al.
- Table design : Almost every research paper and thesis in statistics contains at least some tables, yet students are rarely taught how to make good tables. While the principles of good graphics are slowly becoming part of a statistical education (although not an econometrics education!), the principles of good tables are often ignored.
- Dirichlet distribution
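The ABC recipe described in the list above (simulate artificial data for different parameter values, then compare summary statistics) can be sketched with a plain rejection sampler. Everything specific here is an assumption chosen for illustration: the Normal(mu, 1) model, the Uniform(-5, 5) prior, the sample mean as summary statistic and the tolerance of 0.1. It is not the semi-automatic method of the paper.

```python
import random
import statistics


def abc_rejection(observed, n_draws=20000, tolerance=0.1, seed=1):
    """Rejection ABC for the mean mu of a Normal(mu, 1) model.

    Assumed prior: mu ~ Uniform(-5, 5). Summary statistic: the sample mean.
    Returns the prior draws whose simulated summary lands within
    `tolerance` of the observed summary.
    """
    rng = random.Random(seed)
    n = len(observed)
    s_obs = statistics.mean(observed)
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(-5.0, 5.0)                    # draw a parameter from the prior
        fake = [rng.gauss(mu, 1.0) for _ in range(n)]  # simulate artificial data
        if abs(sum(fake) / n - s_obs) <= tolerance:    # compare summary statistics
            accepted.append(mu)
    return accepted


if __name__ == "__main__":
    data = [1.8, 2.3, 2.9, 1.2, 2.1, 2.6, 1.7, 2.4, 1.9, 2.2]  # made-up observations
    draws = abc_rejection(data)
    print("accepted draws:", len(draws))
    print("approximate posterior mean:", statistics.mean(draws))
```

No likelihood is ever evaluated: the accepted values of `mu` form an approximate posterior sample, and shrinking `tolerance` trades a lower acceptance rate for a better approximation.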

### IMA meeting on Group Testing Designs, Algorithms, and Applications to Biology

**Length Reduction via Polynomials**

February 16, 2012 11:30 am – 12:30 pm

*Keywords of the presentation*: Sparse convolutions, polynomials, finite fields, length reduction.

Both randomized and deterministic algorithms were developed for efficiently computing the sparse FFT. The key operation in all these algorithms was length reduction. The sparse data is mapped into small vectors that preserve the convolution result. The reduction method used to date was the modulo function since it preserves location (of the "1" bits) up to cyclic shift.

In this paper we present a new method for length reduction – polynomials. We show that this method allows a faster deterministic computation of the sparse FFT than currently known in the literature. It also enables the development of an efficient algorithm for computing the binary sparse Walsh Transform. To our knowledge, this is the first such algorithm in the literature.

(Joint work with Oren Kappah, Ely Porat, and Amir Rothschild)
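The modulo-based length reduction mentioned in the abstract can be illustrated with a small sketch. It folds long sparse vectors onto `m` positions by taking indices modulo `m` and checks that convolving the short vectors cyclically agrees with folding the full convolution. This only shows the older modulo reduction, not the polynomial method of the talk, and the function names and the example vectors are assumptions made for illustration.

```python
def linear_convolution(a, b):
    """Plain convolution of two dense lists (quadratic time)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        if x:  # skip zeros, the inputs are sparse
            for j, y in enumerate(b):
                out[i + j] += x * y
    return out


def reduce_mod(v, m):
    """Length reduction: fold index i of v onto index i % m."""
    out = [0] * m
    for i, x in enumerate(v):
        out[i % m] += x
    return out


def cyclic_convolution(a, b):
    """Cyclic convolution of two lists of the same length m."""
    m = len(a)
    out = [0] * m
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[(i + j) % m] += x * y
    return out


if __name__ == "__main__":
    a = [0] * 64
    b = [0] * 64
    a[3] = a[40] = 1           # sparse input with two "1" bits
    b[5] = b[17] = b[60] = 1   # sparse input with three "1" bits
    m = 7
    full_then_fold = reduce_mod(linear_convolution(a, b), m)
    fold_then_convolve = cyclic_convolution(reduce_mod(a, m), reduce_mod(b, m))
    print(full_then_fold == fold_then_convolve)
```

The two results agree because the index of every product term, `i + j`, is only ever needed modulo `m`. Distinct positions can collide after folding, which is why actual algorithms need several reductions to recover the locations.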

**RNA Structure Characterization from High-Throughput Chemical Mapping Experiments**

February 13, 2012 3:45 pm – 4:45 pm

*Keywords of the presentation*: RNA structure characterization, high-throughput sequencing, maximum likelihood estimation

In this talk, I will review recent developments in experimental RNA structure characterization as well as advances in sequencing technologies. I will then describe the SHAPE-Seq technique, focusing on its automated data analysis method, which relies on a novel probabilistic model of a SHAPE-Seq experiment, adjoined by a rigorous maximum likelihood estimation framework. I will demonstrate the accuracy and simplicity of our approach as well as its applicability to a general class of chemical mapping techniques and to more traditional SHAPE experiments that use capillary electrophoresis to identify and quantify primer extension products.

This is joint work with Lior Pachter, Julius Lucks, Stefanie Mortimer, Shujun Luo, Cole Trapnell, Gary Schroth, Jennifer Doudna and Adam Arkin.

**Improved Constructions for Non-adaptive Threshold Group Testing**

February 15, 2012 3:45 pm – 4:15 pm

*Keywords of the presentation*: Group testing, Explicit constructions

**Competitive Testing for Evaluating Priced Functions**

February 15, 2012 2:00 pm – 3:00 pm

*Keywords of the presentation*: function evaluation, competitive analysis, Boolean functions, decision trees, adaptive testing, non-uniform costs

**A Group Testing Approach to Corruption Localizing Hashing**

February 16, 2012 2:00 pm – 3:00 pm

*Keywords of the presentation*: Algorithms, Cryptography, Corruption-Localizing Hashing, Group Testing, Superimposed Codes.

**Poster – Finding one of m defective elements**

February 14, 2012 4:15 pm – 5:30 pm

**Tutorial: Cost effective sequencing of rare genetic variations**

February 13, 2012 10:00 am – 11:00 am

My tutorial will provide background on rare genetic variations and DNA sequencing. I will present our sample prep strategy, called DNA Sudoku, that utilizes a combinatorial pooling/compressed sensing approach to find rare genetic variations. More importantly, I will discuss several major distinctions from the classical combinatorial setting due to sequencing-specific constraints.

**Mining Rare Human Variations using Combinatorial Pooling**

February 13, 2012 3:15 pm – 3:45 pm

*Keywords of the presentation*: high-throughput sequencing, combinatorial pooling, liquid-handling, compressed sensing

We have developed a protocol for quantifying, calibrating, and pooling DNA samples using a liquid-handling robot, which has required a significant amount of testing in order to reduce volume variation. I will discuss our protocol and the steps we have taken to reduce CV. For accurate decoding and to reduce the possibility of specimen dropout, it is important that the DNA samples are accurately quantified and calibrated so that equal amounts can be pooled and sequenced. We can determine the number of carriers in each pool from sequencing output and reconstruct the original identity of individual specimens based on the pooling design, allowing us to identify a small number of carriers in a large cohort.

**Superimposed codes**

February 14, 2012 3:45 pm – 4:15 pm

There are lots of versions and related problems, like Sidon sets, sum-free sets, union-free families, locally thin families, cover-free codes and families, etc. We discuss two cases: cancellative and union-free codes.

A family of sets $\mathcal{F}$ (and the corresponding code of 0-1 vectors) is called union-free if $A \cup B = C \cup D$ and $A, B, C, D \in \mathcal{F}$ imply $\{A, B\} = \{C, D\}$. $\mathcal{F}$ is called $t$-cancellative if for all distinct $t + 2$ members $A_1, \dots, A_t$ and $B, C \in \mathcal{F}$,

$$A_1 \cup \dots \cup A_t \cup B \neq A_1 \cup \dots \cup A_t \cup C.$$

Let $c_t(n)$ be the size of the largest $t$-cancellative code on $n$ elements. We significantly improve the previous upper bounds of Körner and Sinaimeri, e.g., we show $c_2(n) \le 2^{0.322n}$ (for $n > n_0$).
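For small families, the two definitions above can be checked by brute force. The sketch below is my own illustration, not part of the talk. It reads the union-free condition literally, so pairs with A = B are included (an assumption; some authors require distinct sets), and it assumes the members of the family are distinct.

```python
from itertools import combinations


def is_union_free(family):
    """True if every unordered pair {A, B} of members has its own union."""
    sets = [frozenset(s) for s in family]
    pairs = list(combinations(range(len(sets)), 2))
    pairs += [(i, i) for i in range(len(sets))]  # literal reading: A == B allowed
    seen = set()
    for i, j in pairs:
        union = sets[i] | sets[j]
        if union in seen:  # two different pairs share a union
            return False
        seen.add(union)
    return True


def is_t_cancellative(family, t):
    """True if no A_1..A_t, B, C (all distinct) give equal unions with B and with C."""
    sets = [frozenset(s) for s in family]
    for chosen in combinations(range(len(sets)), t):
        base = frozenset().union(*(sets[i] for i in chosen))
        rest = [sets[i] | base for i in range(len(sets)) if i not in chosen]
        if len(set(rest)) != len(rest):  # some B and C collide over this base
            return False
    return True


if __name__ == "__main__":
    singletons = [{1}, {2}, {3}]
    nested = [{1}, {2}, {1, 2}]
    print(is_union_free(singletons), is_t_cancellative(singletons, 1))
    print(is_union_free(nested), is_t_cancellative(nested, 1))
```

The checks are exponential in `t` and quadratic in the family size, so they are only meant for experimenting with toy examples.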

**Streaming algorithms for approximating the length of the longest increasing subsequence**

February 17, 2012 9:00 am – 10:00 am

*Keywords of the presentation*: Data streams, lower bounds, longest increasing subsequence

I will talk about proving lower bounds on how much space (memory) is necessary to still be able to solve the given task. I will focus on the problem of approximating the length of the longest increasing subsequence, which is a measure of how well the data is sorted.

Joint work with Parikshit Gopalan.
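For reference, the quantity being approximated can be computed exactly in one pass with the classical patience-sorting idea. This is a standard textbook algorithm, not the method or the lower bound from the talk. Its memory use grows with the length of the longest increasing subsequence itself, which is exactly the kind of cost that small-space streaming algorithms try to avoid.

```python
from bisect import bisect_left


def lis_length(stream):
    """Length of the longest strictly increasing subsequence, in one pass.

    tails[k] holds the smallest possible last element of an increasing
    subsequence of length k + 1 seen so far.
    """
    tails = []
    for x in stream:
        pos = bisect_left(tails, x)
        if pos == len(tails):
            tails.append(x)   # x extends the longest subsequence found so far
        else:
            tails[pos] = x    # x gives a smaller tail for length pos + 1
    return len(tails)


if __name__ == "__main__":
    print(lis_length([3, 1, 4, 1, 5, 9, 2, 6]))  # prints 4
    print(lis_length([5, 4, 3, 2, 1]))           # prints 1
```

A fully sorted stream of length n gives n, and a reversed one gives 1, which is the sense in which the value measures how well the data is sorted.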

**Tutorial: Sparse signal recovery**

February 13, 2012 11:15 am – 12:15 pm

*Keywords of the presentation*: streaming algorithms, group testing, sparse signal recovery, introduction.

**Weighted Pooling – Simple and Effective Techniques for Pooled High Throughput Sequencing Design**

February 15, 2012 11:30 am – 12:00 pm

We show that one can gain further efficiency and cost reduction by using “weighted” designs, in which different individuals donate different amounts of DNA to the pools. Intuitively, in this situation the number of mutant reads in a pool indicates not only the number of carriers but also their identity.

We describe and study a simple but powerful example of such weighted designs, with non-overlapping pools. We demonstrate that even this naive approach is not only easier to implement and analyze but is also competitive in terms of accuracy with combinatorial designs when identifying very rare variants, and is superior to the combinatorial designs when genotyping more common variants.

We then discuss how weighting can be further incorporated into existing designs to increase their accuracy and demonstrate the resulting improvement in reconstruction efficiency using simulations. Finally, we argue that these weighted designs have enough power to facilitate detection of common alleles, so they can be used as a cornerstone of whole-exome or even whole-genome sequencing projects.
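A toy version of the intuition behind weighted pools: if individuals in one pool donate DNA in amounts 1, 2, 4, 8, ..., then in an idealized noiseless setting the total mutant signal identifies exactly who the carriers are, whereas equal amounts only reveal how many there are. The power-of-two weights, the noiseless signal and the function names are all assumptions made for illustration. The designs in the talk have to cope with sequencing noise and are not reproduced here.

```python
def pool_signal(weights, carriers):
    """Noiseless mutant signal of one pool: total weight donated by carriers."""
    return sum(weights[i] for i in carriers)


def decode_carriers(weights, signal):
    """Recover the carrier set, assuming the weights are distinct powers of two."""
    carriers = set()
    # Greedily peel off the largest weights that still fit into the signal.
    for i in sorted(range(len(weights)), key=lambda k: -weights[k]):
        if weights[i] <= signal:
            carriers.add(i)
            signal -= weights[i]
    if signal != 0:
        raise ValueError("signal is not consistent with the weights")
    return carriers


if __name__ == "__main__":
    weights = [1, 2, 4, 8]                    # individual i donates weights[i] units
    signal = pool_signal(weights, {0, 3})
    print(signal)                             # prints 9
    print(decode_carriers(weights, signal))   # prints {0, 3}
    # With equal weights the signal only counts carriers:
    print(pool_signal([1, 1, 1, 1], {0, 3}), pool_signal([1, 1, 1, 1], {1, 2}))
```

With real read counts the signal is noisy and the dynamic range is limited, so exponentially growing weights stop being practical very quickly. The sketch is only meant to show why unequal contributions carry identity information.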

**Combinatorial Pooling Enables Selective Sequencing of the Barley Gene Space**

February 14, 2012 3:15 pm – 3:45 pm

*Keywords of the presentation*: combinatorial pooling, DNA sequencing and assembly, genomics, next-generation sequencing

Joint work with D. Duma (UCR), M. Alpert (UCR), F. Cordero (U of Torino), M. Beccuti (U of Torino), P. R. Bhat (UCR and Monsanto), Y. Wu (UCR and Google), G. Ciardo (UCR), B. Alsaihati (UCR), Y. Ma (UCR), S. Wanamaker (UCR), J. Resnik (UCR), and T. J. Close (UCR).

Preprint available at http://arxiv.org/abs/1112.4438

**Poster – Upgraded Separate Testing of Inputs in Compressive Sensing**

February 14, 2012 4:15 pm – 5:30 pm

*X* in terms of capacity *C(s)*.

**Olgica Milenkovic** – University of Illinois at Urbana-Champaign

http://www.ece.illinois.edu/directory/profile.asp?milenkov

**Probabilistic and combinatorial models for quantized group testing**

February 15, 2012 3:15 pm – 3:45 pm

*Keywords of the presentation*: group testing, MAC channel, quantization, graphical models

**Sparser Johnson-Lindenstrauss Transforms**

February 16, 2012 10:15 am – 11:15 am

*Keywords of the presentation*: dimensionality reduction, johnson-lindenstrauss, numerical linear algebra, massive data

The original proofs of the JL lemma let the linear mapping be specified by a random dense k x d matrix (e.g. i.i.d. Gaussian entries). Thus, performing an embedding requires dense matrix-vector multiplication. We give the first construction of linear mappings for JL in which only a subconstant fraction of the embedding matrix is non-zero, regardless of how eps and n are related, thus always speeding up the embedding time. Previous constructions only achieved sparse embedding matrices for 1/eps >> log n.

This is joint work with Daniel Kane (Stanford).
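The dense baseline described in the abstract, a random k x d matrix with i.i.d. Gaussian entries, is easy to sketch in plain Python. This is only the classical dense construction, not the sparse one from the talk. The dimensions, the 1/sqrt(k) scaling convention and the function names are assumptions chosen for illustration.

```python
import math
import random


def gaussian_jl_matrix(k, d, seed=0):
    """Dense k x d matrix with i.i.d. N(0, 1/k) entries."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def embed(matrix, x):
    """Dense matrix-vector product: costs k * d multiplications."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in matrix]


def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(c * c for c in v))


if __name__ == "__main__":
    k, d = 200, 100
    A = gaussian_jl_matrix(k, d)
    x = [float(i % 7) - 3.0 for i in range(d)]  # arbitrary test vector
    print("norm ratio after embedding:", norm(embed(A, x)) / norm(x))
```

The ratio should be close to 1, with fluctuations of order 1/sqrt(k). Every entry of the matrix is non-zero, so each embedding costs k * d multiplications, which is the cost that sparse constructions aim to cut.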

**Network topology as a source of biological information**

February 14, 2012 10:15 am – 11:15 am

*Keywords of the presentation*: biological networks, graph algorithms, network alignment

Analogous to sequence alignments, alignments of biological networks will likely impact biomedical understanding. We introduce a family of topology-based network alignment (NA) algorithms, which we call GRAAL algorithms, that produce by far the most complete alignments of biological networks to date: our alignment of yeast and human protein interaction networks (PINs) demonstrates that even distant species share a surprising amount of PIN topology. We show that both species phylogeny and protein function can be extracted from our topological NA. Furthermore, we demonstrate that the NA quality improves with integration of additional data sources (including sequence) into the alignment algorithm: surprisingly, 77.7% of proteins in the baker’s yeast PIN participate in a connected subnetwork that is fully contained in the human PIN, suggesting broad similarities in internal cellular wiring across all life on Earth. Also, we demonstrate that the topology around cancer and non-cancer genes is different and that, when integrated with functional genomics data, it successfully predicts new cancer genes in melanogenesis-related pathways.

**Sylvie Ricard-Blum** – Université Claude-Bernard (Lyon I)

http://www.ibcp.fr/scripts/affiche_detail.php?n_id=174

**A dynamic and quantitative protein interaction network regulating angiogenesis**

February 16, 2012 3:45 pm – 4:15 pm

*Keywords of the presentation*: protein interaction networks, kinetics, affinity, protein arrays, surface plasmon resonance

**Tutorial: Group Testing and Coding Theory**

February 14, 2012 9:00 am – 10:00 am

*Keywords of the presentation*: Code Concatenation, Coding theory, Group Testing, List Decoding, List Recovery

Theory of error-correcting codes, or coding theory, was born in the works of Shannon in 1948 and Hamming in 1950. Codes are ubiquitous in our daily life and have also found numerous applications in theoretical computer science in general and computational complexity in particular.

Kautz and Singleton connected these two areas in their 1964 paper by using “code concatenation” to design good group testing schemes. All of the (asymptotically) best known explicit constructions of group testing schemes use the code concatenation paradigm. In this talk, we will focus on the “decoding” problem for group testing: i.e. given the outcomes of the tests on the pools, identify the infected soldiers. Recent applications of group testing in data stream algorithms require sub-linear time decoding, which is not guaranteed by the traditional constructions.

The talk will first survey the Kautz-Singleton construction and then show how recent developments in list decoding of codes lead in a modular way to sub-linear time decodable group testing schemes.
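To make the decoding problem concrete, here is a sketch of the simplest possible decoder: anyone who appears in a negative pool is cleared, and whoever is left is a candidate. The 3 x 3 row-and-column design is an assumption chosen for illustration, not the Kautz-Singleton construction, and this naive decoder takes time linear in the size of the design rather than the sub-linear time discussed in the talk.

```python
def run_tests(pools, defectives):
    """A pool tests positive iff it contains at least one defective item."""
    return [any(item in defectives for item in pool) for pool in pools]


def naive_decode(n_items, pools, outcomes):
    """Clear every item seen in a negative pool and return the remaining ones."""
    candidates = set(range(n_items))
    for pool, positive in zip(pools, outcomes):
        if not positive:
            candidates -= set(pool)
    return candidates


if __name__ == "__main__":
    # Nine items laid out on a 3 x 3 grid; pools are the rows and the columns.
    rows = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    cols = [[0, 3, 6], [1, 4, 7], [2, 5, 8]]
    pools = rows + cols
    outcomes = run_tests(pools, defectives={4})
    print(naive_decode(9, pools, outcomes))  # prints {4}
```

Six tests locate a single defective among nine items here. With two or more defectives the same design can leave extra candidates, for example defectives {0, 4} also leave items 1 and 3 uncleared, which is why stronger designs are needed.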

**Vyacheslav V. Rykov** – University of Nebraska

http://www.unomaha.edu/~wwwmath/faculty/rykov/index.html

**Superimposed Codes and Designs for Group Testing Models**

February 14, 2012 11:30 am – 12:30 pm

*Keywords of the presentation*: superimposed codes, screening designs, rate of code, threshold designs and codes

**Testing Boolean functions**

February 14, 2012 2:00 pm – 3:00 pm

**Genomic Privacy and the Limits of Individual Detection in a Pool**

February 15, 2012 10:15 am – 11:15 am

*Keywords of the presentation*: Genomewide association studies, Privacy, Pooled designs, Hypothesis testing, Local Asymptotic normality

Until recently, many studies pooled individuals together, making only the allele frequencies of each SNP in the pool publicly available. However, a technique that could be used to detect the presence of individual genotypes from such data prompted organizations such as the NIH to restrict public access to summary data. To again allow public access to data from association studies, we need to determine which set of SNPs can be safely exposed while preserving an acceptable level of privacy.

To answer this question, we provide an upper bound on the power achievable by any detection method as a function of factors such as the number and the allele frequencies of exposed SNPs, the number of individuals in the pool, and the false positive rate of the method. Our approach is based on casting the problem in a statistical hypothesis testing framework for which the likelihood ratio test (LR-test) attains the maximal power achievable.

Our analysis provides quantitative guidelines for researchers to make SNPs public without compromising privacy. We recommend, based on our analysis, that only common independent SNPs be exposed. The final decision regarding the exposed SNPs should be based on the analytical bound in conjunction with empirical estimates of the power of the LR test. To this end, we have implemented a tool, SecureGenome, that determines the set of SNPs that can be safely exposed for a given dataset.

**From screening clone libraries to detecting biological agents**

February 17, 2012 10:15 am – 11:15 am

*Keywords of the presentation*: (generalized) group testing, screening clone libraries, DNA microarrays

Modern molecular biology also contributed to group testing. The problem of generalized group testing (in the combinatorial sense) arises naturally when one uses oligonucleotide probes to identify biological agents present in a sample. In this setting a group testing design cannot be chosen arbitrarily. The possible columns of a group testing design matrix are prescribed by the biology, namely by the hybridization reactions between target DNA and probes.

**Identification of rare alleles and their carriers using compressed se(que)nsing**

February 13, 2012 2:00 pm – 3:00 pm

*Keywords of the presentation*: compressed sensing, group testing, genetics, rare alleles

We will present initial results of two projects that were initiated following publication. The first project concerns identification of de novo SNPs in genetic disorders common among Ashkenazi Jews, based on sequencing 3000 DNA samples. The second project in plant genetics involves identifying SNPs related to water and silica homeostasis in Sorghum bicolor, based on sequencing 3000 DNA samples using 1-2 Illumina lanes.

Joint work with Amnon Amir from the Weizmann Institute of Science, and Or Zuk from the Broad Institute of MIT and Harvard

**Nicolas Thierry-Mieg** – Centre National de la Recherche Scientifique (CNRS)

http://membres-timc.imag.fr/Nicolas.Thierry-Mieg/

**Shifted Transversal Design Smart-pooling: increasing sensitivity, specificity and efficiency in high-throughput biology**

February 15, 2012 9:00 am – 10:00 am

*Keywords of the presentation*: combinatorial group testing, smart-pooling, interactome mapping

**Tutorial: The Data Stream Model**

February 16, 2012 9:00 am – 10:00 am

**Reconstruction of bacterial communities using sparse representation**

February 17, 2012 11:30 am – 12:30 pm

*Keywords of the presentation*: metagenomics, sparse representation, compressed sequencing

A popular approach to the problem is sequencing the Ribosomal 16s RNA gene in the sample using universal primers, and using variation in the gene’s sequence between different species to identify the species present in the sample. We present a novel framework for community reconstruction, based on sparse representation; while millions of microorganisms are present on earth, with known 16s sequences stored in a database, only a small minority (typically a few hundred) are likely to be present in any given sample.

We discuss the statistical framework, algorithms used and results in terms of accuracy and species resolution.

There is a Bayesian Cake Club, which maintains a list of papers on Bayesian statistics:

- B.T. Knapik, A.W. van der Vaart and J.H. van Zanten (2011) Bayesian Inverse problems with Gaussian Priors.
- C. Yau and C. Holmes (2011) Hierarchical Bayesian nonparametric mixture models for clustering with variable relevance determination. Bayesian Analysis 6(2), 329-352.
- Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation, by Fearnhead and Prangle.
- Philosophy and the practice of Bayesian statistics, by Gelman and Shalizi.
- Catching up faster by switching sooner: a predictive approach to adaptive estimation with an application to the Akaike information criterion – Bayesian information criterion dilemma, by van Erven, Grunwald and de Rooij.
- Suboptimal behaviour of Bayes and MDL in classification under misspecification, by Peter Grunwald and John Langford.
- Likelihood-free Estimation of model evidence, by Xavier Didelot, Richard G. Everitt, Adam M. Johansen and Daniel J. Lawson.
- On the use of non-local prior densities in Bayesian hypothesis tests, by Valen E. Johnson and David Rossell.
- Approximate Bayesian Computation: A Nonparametric Perspective, by Michael Blum.
- Inconsistent Bayesian Estimation, by Christensen.
- A Hierarchical Bayesian Framework for Constructing Sparsity-inducing Priors, by Anthony Lee, Francois Caron, Arnaud Doucet and Chris Holmes.
- Dynamics of Bayesian updating with dependent data and misspecified models, by Cosma Rohilla Shalizi.
- Posterior Predictive p-values in Bayesian Hierarchical Models, by G.H. Steinbakk and G.O. Storvik. Scandinavian Journal of Statistics, Vol. 36: 320-336, 2009, doi: 10.1111/j.1467-9469.2008.00630.x
- Bayesian Model Averaging: A Tutorial, by Jennifer A. Hoeting, David Madigan, Adrian E. Raftery and Chris T. Volinsky. Statistical Science, Vol. 14, No. 4 (Nov., 1999), pp. 382-401.
- Optimal Predictive Model Selection, by Barbieri and Berger (2004). The Annals of Statistics, 32, 870-897.
- Use of Exchangeability, by JFC Kingman (1978). The Annals of Probability, 6, 183-197.
- The concept of exchangeability and its applications, by Bernardo (1996).
- Hybrid Dirichlet mixture models for functional data, by Petrone, Guindani and Gelfand. JRSSB, 71, 755-782 (2009). Some notes from Peter to facilitate reading the above: cribsheet
- Reducing the Dimensionality of Data with Neural Networks, by Hinton and Salakhutdinov. Science 313, 504-507, 2006. Supplementary material: tech rep, slides.
- Joint Bayesian Estimation of Alignment and Phylogeny, by Benjamin D. Redelings and Marc A. Suchard. Syst. Biol. 54, 401-418, 2005. (Some introductory background reading on phylogeny can be found in Phylogeny Estimation: Traditional and Bayesian Approaches by M. Holder and P.O. Lewis, Nature Reviews, 2003.)
- Bayesian inference for a discretely observed stochastic kinetic model, by Boys, Wilkinson and Kirkwood. Stat Comput, 18, 125-135 (2008).
- Agreeing to Disagree, by Aumann. The Annals of Statistics, 4, 1236-9 (1976).
- Belief and the Problem of Ulysses and the Sirens, by Van Fraassen. Philosophical Studies, 77, 7-37 (1995).
- Updating Subjective Probability, by Diaconis and Zabell. JASA, 77, 380, 822-830 (1982).
- Objective Bayesian variable selection, by G. Casella and E. Moreno. JASA, 101, 157-167 (2006).
- Separation measures and the geometry of Bayes factor selection for classification, by J.Q. Smith, P.E. Anderson and S. Liverani. JRSSB, 70, 5, 957-980 (2008).
- Examples of Adaptive MCMC, by Roberts, G. O. and Rosenthal, J. S. Preprint (2008).
- Hyper Markov laws in the statistical analysis of decomposable graphical models, by S. Lauritzen and P. Dawid. Annals of Statistics, Vol. 21, pp. 1272-1317 (1993).
- Subjective Bayesian Analysis: Principles and Practice, by M. Goldstein. Bayesian Analysis, Vol. 1, 403-420 (+discussion), 2006.
- Retrospective Markov chain Monte Carlo methods for Dirichlet process hierarchical models, by O. Papaspiliopoulos and G. O. Roberts. Biometrika, Vol. 95, pp. 169-186 (2008).
- Bayesian calibration of computer models, by M.C. Kennedy and A. O’Hagan. Journal of the Royal Statistical Society, Series B, Volume 63, pp. 425-464 (2001).
- Multiple-bias modelling for analysis of observational data, by S. Greenland. Journal of the Royal Statistical Society: Series A (Statistics in Society), Volume 168, Number 2, March 2005, pp. 267-306 (2005).
- Bayesian Prediction of Deterministic Functions, with Applications to the Design and Analysis of Computer Experiments, by C. Currin, T. Mitchell, M. Morris and D. Ylvisaker. Journal of the American Statistical Association, v. 86, pp. 953-963 (1991).
- Causal Inference Without Counterfactuals, by A. P. Dawid. Journal of the American Statistical Association, Vol. 95, pp. 407-424 (2000).
- Extended Ensemble Monte Carlo, by Y. Iba. Int. J. Mod. Phys. C12, 623-656 (2001).
- Sparse graphical models for exploring gene expression data, by A. Dobra, B. Jones, C. Hans, J.R. Nevins and M. West. Journal of Multivariate Analysis, 90 (2004): 196-212.
- P Values for Composite Null Models, by M. J. Bayarri and James O. Berger. JASA, 95 (452), 1127-1142 (2000).
- Gibbs Sampling Methods for Stick-Breaking Priors, by H. Ishwaran and L. F. James. JASA, 96 (453), 161-173 (2001).
- Bayesian density regression, by Dunson, D., Pillai, N., and Park J.-H. JRSS(B) 69(2), 163-183, 2007.
- Bayesian Inference for Causal Effects: The Role of Randomization, by D. B. Rubin. Annals of Statistics, Vol. 6, No. 1, pp 34-58 (1978).

- Articles on the philosophy of Bayesian statistics by Cox, Mayo, Senn, and others!
- Philosophy of Bayesian statistics: my reactions to Cox and Mayo
- The universal solvent of statistics
- Philosophy of Bayesian statistics: my reactions to Senn
- Philosophy of Bayesian statistics: my reactions to Hendry
- Philosophy of Bayesian statistics: my reactions to Wasserman

The reading list given by Professor Xi’an on the topic of ABC (Approximate Bayesian Computation) convergence:

- Blum, M.G.B. (2010) Approximate Bayesian Computation: a nonparametric perspective. *Journal of the American Statistical Association*, 105: 1178-1187
- Dean, T.A., Singh, S.S., Jasra, A. and Peters, G.W. (2010) Parameter Estimation for Hidden Markov Models with Intractable Likelihoods. Cambridge University Engineering Department Technical Report 66, arXiv:1103.5399v1
- Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. *J. Royal Statistical Society (Series B)*
- Marin, J.-M., Pillai, N., Robert, C.P. and Rousseau, J. (2011) Relevant statistics for Bayesian model choice. arXiv:1110.4700
- Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N.S. (2011) Lack of confidence in approximate Bayesian computational (ABC) model choice. *PNAS (Open Access)*, 108(37), 15112-15117, arXiv:1102.4432
- Wilkinson, R. (2008) Approximate Bayesian computation (ABC) gives exact results under the assumption of model error. arXiv:0811.3355
