\documentclass[11pt]{article} \usepackage{indentfirst} \usepackage{amsmath} \usepackage{amssymb} \usepackage{amscd} \usepackage{natbib} \usepackage{url} \usepackage{graphicx} \usepackage[utf8]{inputenc} \DeclareMathOperator{\var}{var} \newcommand{\boldbeta}{{\boldsymbol{\beta}}} \newcommand{\boldbetahat}{{\boldsymbol{\hat{\beta}}}} \newcommand{\boldtheta}{{\boldsymbol{\theta}}} \newcommand{\boldphi}{{\boldsymbol{\varphi}}} \newcommand{\boldxi}{{\boldsymbol{\xi}}} \newcommand{\boldmu}{{\boldsymbol{\mu}}} \newcommand{\boldA}{{\mathbf{A}}} \newcommand{\boldb}{{\mathbf{b}}} \newcommand{\boldw}{{\mathbf{w}}} \newcommand{\boldx}{{\mathbf{x}}} \newcommand{\boldy}{{\mathbf{y}}} \newcommand{\boldz}{{\mathbf{z}}} \hyphenation{Wa-gen-ius} \setlength{\textheight}{\paperheight} \addtolength{\textheight}{- 2 in} \setlength{\topmargin}{0.25 pt} \setlength{\headheight}{0 pt} \setlength{\headsep}{0 pt} \addtolength{\textheight}{- \topmargin} \addtolength{\textheight}{- \topmargin} \addtolength{\textheight}{- \footskip} \setlength{\textwidth}{\paperwidth} \addtolength{\textwidth}{- 2 in} \setlength{\oddsidemargin}{0.25 in} \setlength{\evensidemargin}{0.25 in} \addtolength{\textwidth}{- \oddsidemargin} \addtolength{\textwidth}{- \evensidemargin} \begin{document} \vspace*{0.9375in} \begin{center} {\bfseries Computation for the Introduction to MCMC Chapter of \emph{Handbook of Markov Chain Monte Carlo}} \\ By \\ Charles J. Geyer \\ Technical Report No.~679 \\ School of Statistics \\ University of Minnesota \\ % April 20, 2005 \\ \today \end{center} \thispagestyle{empty} \cleardoublepage \setcounter{page}{1} \thispagestyle{empty} \begin{abstract} This technical report does the computation for the ``Introduction to MCMC'' chapter of \citet*{handbook}. All analyses are done in R \citep{rcore} using the \texttt{Sweave} function so this entire technical report and all of the analyses reported in it are exactly reproducible by anyone who has R with the \texttt{mcmc} package \citep{mcmc} installed and the R noweb file specifying the document. \end{abstract} <>= options(keep.source = TRUE, width = 65) ps.options(pointsize = 15) pdf.options(pointsize = 15) @ \section{R Package MCMC} We use R statistical computing environment \citep{rcore} in our analysis. It is free software and can be obtained from \url{http://cran.r-project.org}. Precompiled binaries are available for Windows, Macintosh, and popular Linux distributions. We use the contributed package \verb@mcmc@. If R has been installed, but this package has not yet been installed, do \begin{verbatim} install.packages("mcmc") \end{verbatim} from the R command line (or do the equivalent using the GUI menus if on Apple Macintosh or Microsoft Windows). This may require root or administrator privileges. Assuming the \verb@mcmc@ package has been installed, we load it <>= library(mcmc) @ <>= baz <- library(help = "mcmc") baz <- baz$info[[1]] baz <- baz[grep("Version", baz)] baz <- sub("^Version: *", "", baz) bazzer <- paste(R.version$major, R.version$minor, sep = ".") @ The version of R used to make this document is \Sexpr{bazzer}. The version of the \texttt{mcmc} package used to make this document is \Sexpr{baz}. <>= rm(baz, bazzer) @ This entire document and all of the calculations shown were made using the R command \texttt{Sweave} and hence are exactly reproducible by anyone who has R and the R noweb (RNW) file from which it was created. Both the RNW file and and the PDF document produced from it are available at \url{http://www.stat.umn.edu/geyer/mcmc}. For further details on the use of Sweave and R see the web site \url{http://www/stat.umn.edu/~charlie/Sweave/}. Not only can one exactly reproduce the results in the printable document, one can also modify the parameters of the simulation and get different results. Finally, we set the seeds of the random number generator so that we obtain the same results every time. To get different results, obtain the RNW file, change this statement, and reprocess using \texttt{Sweave} and \LaTeX. <>= set.seed(42) @ \section{AR(1) Simulation} Simulate an AR(1) time series. <>= rho <- 0.99 nsim <- 1e4 x <- 0 for (i in 2:nsim) x <- c(x, rho * x[length(x)] + rnorm(1)) out <- as.ts(x) @ The Monte Carlo standard error, expressed relative to the standard deviation of the invariant distribution is <>= sqrt((1 + rho) / (1 - rho) / nsim) @ In order to get a 10\% relative error, we would need sample size <>= ceiling((1 + rho) / (1 - rho) / 0.10^2) @ In order to get a 10\% relative error, we would need sample size <>= ceiling((1 + rho) / (1 - rho) / 0.01^2) @ Variance inflation factor (VIF) <>= (1 + rho) / (1 - rho) @ Effective sample size <>= nsim / ((1 + rho) / (1 - rho)) @ Figure~\ref{fig:ar1-1} (page~\pageref{fig:ar1-1}) is made by the following R statements <>= par(mar = c(5, 4, 1, 1) + 0.1) plot(out, ylab = "x") @ \begin{figure} \begin{center} <>= <> @ \end{center} \caption{Time series plot of AR(1) time series.} \label{fig:ar1-1} \end{figure} Figure~\ref{fig:ar1-foo} (page~\pageref{fig:ar1-foo}) is made by the following R statements <>= foo <- cumsum(out) / seq(along = out) par(mar = c(5, 4, 1, 1) + 0.1) plot(as.ts(foo), ylab = expression(hat(mu)[n]), xlab = expression(n)) @ \begin{figure} \begin{center} <>= <> @ \end{center} \caption{Sample means of AR(1) time series.} \label{fig:ar1-foo} \end{figure} Figure~\ref{fig:ar1-2} (page~\pageref{fig:ar1-2}) is made by the following R statements <>= lag.max <- 500 par(mar = c(5, 4, 1, 1) + 0.1) plot(acf(out, lag.max = lag.max), ci.col = "black", main = "") curve(rho^x, from = 0, to = lag.max, add = TRUE, lty = 3, lwd = 2) @ \begin{figure} \begin{center} <>= <> @ \end{center} \caption{Autocorrelation plot of AR(1) time series. Dashed lines: 95\% confidence intervals assuming white noise input. Dotted line: simulation truth autocorrelation function.} \label{fig:ar1-2} \end{figure} \subsection{Nonoverlapping Batch Means} Figure~\ref{fig:ar1-bat} (page~\pageref{fig:ar1-bat}) is made by the following R statements <>= blen <- 500 batch <- matrix(out, nrow = blen) batch <- apply(batch, 2, mean) par(mar = c(5, 4, 1, 1) + 0.1) plot(batch, ylab = "batch means") @ \begin{figure} \begin{center} <>= <> @ \end{center} \caption{Batch mean plot of AR(1) time series.} \label{fig:ar1-bat} \end{figure} Figure~\ref{fig:ar1-cat} (page~\pageref{fig:ar1-cat}) is made by the following R statements <>= par(mar = c(5, 4, 1, 1) + 0.1) plot(acf(batch), ci.col = "black", main = "") @ \begin{figure} \begin{center} <>= <> @ \end{center} \caption{Autocorrelation plot of batch means of AR(1) time series.} \label{fig:ar1-cat} \end{figure} <>= t.test(batch) @ \subsection{Nonoverlapping Batch Means} <>= library(mcmc) bmvar <- olbm(out, batch.length = blen) length(out) * as.numeric(bmvar) blen * var(batch) @ Note that \text{olbm} estimates $\sigma^2 / n$, the asymptotic variance of the sample mean of the whole time series. \subsection{Initial Convex Sequence} Figure~\ref{fig:ar1-conv} (page~\pageref{fig:ar1-conv}) is made by the following R statements <>= par(mar = c(5, 4, 1, 1) + 0.1) iout <- initseq(out) xx <- seq(along = iout$Gamma.con) - 1 yy <- 1 / (1 - rho^2) * (1 + rho) * rho^(2 * xx) ymax <- max(yy[1], iout$Gamma.con[1]) plot(xx, iout$Gamma.con, type = "l", xlab = "Index (half lag)", ylab = expression(tilde(Gamma)), ylim = c(0, ymax)) lines(xx, yy, lty = 3, lwd = 2) @ \begin{figure} \begin{center} <>= <> @ \end{center} \caption{Autocovarince plot for AR(1) time series using initial convex sequence estimator.} \label{fig:ar1-conv} \end{figure} <>= initseq(out)$var.con @ \subsection{Initial Convex Sequence and Batch Means} <>= blen <- 50 batch <- matrix(out, nrow = blen) batch <- apply(batch, 2, mean) blen * var(batch) blen * initseq(batch)$var.con 1 / (1 - rho^2) * (1 + rho) / (1 - rho) @ \section{Burn-In} Simulate an AR(1) time series purporting to show a need for burn-in to be compared with Figure~\ref{fig:ar1-1}. We make all the simulation parameters the same as those for Figure~\ref{fig:ar1-1} except we make the starting point 10 standard deviations of the equilibrium distribution from the mean. <>= rho <- 0.99 nsim <- 1e4 sd.equilib <- sqrt(1 / (1 - rho^2)) sd.equilib x <- 10 * sd.equilib for (i in 2:nsim) x <- c(x, rho * x[length(x)] + rnorm(1)) out <- as.ts(x) @ Figure~\ref{fig:ar1-burn} (page~\pageref{fig:ar1-burn}) is made by the following R statements <>= par(mar = c(5, 4, 1, 1) + 0.1) plot(out, ylab = "x") @ \begin{figure} \begin{center} <>= <> @ \end{center} \caption{Time series plot of AR(1) time series. Compare Figure~\ref{fig:ar1-1}.} \label{fig:ar1-burn} \end{figure} \begin{thebibliography}{} \bibitem[Brooks, et al.(forthcoming)Brooks, Gelman, Jones and Meng]{handbook} Brooks, S., Gelman, A., Jones, G. and Meng X.-L. (forthcoming). \newblock \emph{Handbook of Markov Chain Monte Carlo}. \newblock Chapman \& Hall/CRC Press. \bibitem[Geyer(2005)]{mcmc} \textsc{Geyer, C.~J.} (2005). \newblock \emph{R package \texttt{mcmc} (Markov chain Monte Carlo), version 0.5-1}. \newblock {\url{http://www.stat.umn.edu/geyer/mcmc/}}. \bibitem[R Development Core Team(2008)]{rcore} R Development Core Team (2008). \newblock R: A language and environment for statistical computing. \newblock R Foundation for Statistical Computing, Vienna, Austria. \newblock \url{http://www.R-project.org}. \end{thebibliography} \end{document}