# Asymptotic Confidence Interval

Using a sample size of n=20 from Exemple 3.1.6 from Bolfarine, build an approximate confidence interval for $$\theta$$, where $$f(x|\theta)$$ is given by:

$$f(x | \theta) = \frac{1}{2}(1 + \theta{}x); -1 \leq x \leq 1, \quad -1 \leq \theta \leq 1$$

To build a confidence interval we need in the first place a pivotal quantity. From the asymptotic distribution of maximum likelihood estimators, we know that

$$Q(X,\theta) = \frac{\hat{\theta} – \theta}{\sqrt{(nI_F(\hat{\theta}))^{-1}}}$$

follows an approximate standard normal distribution.

Therefore,

$$P\left(-z_{\alpha/2} \leq \frac{\hat{\theta} -\theta}{\sqrt{(nI_F(\hat{\theta}))^{-1}}} \leq z_{\alpha/2} \right) \approx \gamma = 1 – \alpha$$

Developing this expression:
$$\begin{eqnarray*} \gamma &\approx& P\left( \frac{-z_{\alpha/2}}{\sqrt{(nI_F(\hat{\theta}))^{-1}}} \leq \hat{\theta} – \theta \leq \frac{z_{\alpha/2}}{\sqrt{(nI_F(\hat{\theta}))^{-1}}} \right) \\ &=& P\left( \frac{-z_{\alpha/2}}{\sqrt{(nI_F(\hat{\theta}))^{-1}}} – \hat{\theta} \leq – \theta \leq \frac{z_{\alpha/2}}{\sqrt{(nI_F(\hat{\theta}))^{-1}}} – \hat{\theta} \right) \\ &=& P\left( \hat{\theta} + \frac{z_{\alpha/2}}{\sqrt{(nI_F(\hat{\theta}))^{-1}}} \geq \theta \geq \hat{\theta} – \frac{z_{\alpha/2}}{\sqrt{(nI_F(\hat{\theta}))^{-1}}} \right) \\ &=& P\left( \hat{\theta} – \frac{z_{\alpha/2}}{\sqrt{(nI_F(\hat{\theta}))^{-1}}} \leq \theta \leq \hat{\theta} + \frac{z_{\alpha/2}}{\sqrt{(nI_F(\hat{\theta}))^{-1}}} \right) \end{eqnarray*}$$
And therefore an approximate confidence interval for $$\theta$$ will be given by

$$\left[ \hat{\theta} - \frac{z_{\alpha/2}}{\sqrt{(nI_F(\hat{\theta}))^{-1}}} ; \hat{\theta} + \frac{z_{\alpha/2}}{\sqrt{(nI_F(\hat{\theta}))^{-1}}} \right]$$

We won’t use the sample size of 20 given by the Bolfarine example because the sample has an invalid value. One of the observations is -3.4919, but the random variable is defined only within [-1,1].

Let us implement a set of R functions which will enable us to work with this density function:

View Code RSPLUS
 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24  # This funcion simulates values from this random variable rbolfarine <- function (n, theta) { sample <- runif(n) (-1 + 2 * sqrt(0.25 - theta * (0.5 - 0.25 * theta - sample)))/theta } # Fisher Information calculation inffisher <- function (theta.j) { 1/(2 * theta.j^3) * (log((1 + theta.j)/(1 - theta.j)) - 2 * theta.j) } #Approximation of theta's value by sucessive iterations: mmvapprox2 <- function (x, teta.0 = mean(x), maxeps = 1e-04) { eps <- maxeps + 1 while (eps > maxeps) { teta.i <- teta.0 + sum(x/(1 + teta.0 * x))/sum(x^2/(1 + teta.0 * x)^2) eps <- abs(teta.i - teta.0) teta.0 <- teta.i } teta.i }

Finnaly, the following function generates a confidence interval, based on the expression we found above. The arguments for the function define the confidence level we want and the sample to be taken into account. There is also an additional argument, teta, which defines the functions behaviour: if specified the functions calculates the confidence interval and returns 1 if teta is within this interval, 0 otherwise. If teta is left as NULL, the function returns the confidence interval.

View Code RSPLUS
 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19  IC.complete <- function (x, conf = 0.95, teta = NULL) { teta.hat <- mmvapprox2(x, maxeps = 1e-07) lower <- teta.hat + qnorm((1 - conf)/2)/sqrt(length(x) * inffisher(teta.hat)) upper <- teta.hat - qnorm((1 - conf)/2)/sqrt(length(x) * inffisher(teta.hat)) if (is.null(teta)) { c(lower, upper) } else { if (teta >= lower && teta <= upper) { 1 } else { 0 } } }

We then generate a sample of size $$n=50$$ and $$\theta=0.4$$ and will build a confidence interval with $$\gamma = 0.95$$:

View Code RSPLUS
 1 2 3  > sample <- rbolfarine(50, 0.4) > IC.complete(sample) [1] -0.2796992 0.6694328

We will now run a simulation in R, to check whether these confidence intervals are correct.

View Code RSPLUS
 1 2 3  > sum(apply(matrix(rbolfarine(1000 * 1000, 0.4), ncol = 1000), + 2, IC.complete, conf = 0.25, teta = 0.4))/1000 [1] 0.258

On the first example we simulated 1000 samples of size 1000 and built 1000 confidence intervals, with confidence level of 0.25. The result shown is the observed proportion of confidence intervals which contained the $$\theta$$ parameter.

View Code RSPLUS
 1 2 3  > sum(apply(matrix(rbolfarine(1000 * 1000, 0.4), ncol = 1000), + 2, IC.complete, conf = 0.95, teta = 0.4))/1000 [1] 0.957

Same simulation, but with confidence level of 0.95. As the values are close, we conclude that our confidence intervals were built correctly.