Creating shaded areas in R

Creating shaded areas in R

Sometimes we want to shade areas under a density on a graphic, for instance to illustrate a p-value or a region under the normal curve.

Using the polygon() function from base R we can make graphics like the following one:

Standard normal density. Shaded area shows P(1 < X < 2)

How to do it

First we have to plot the graphic of the function whose areas we are going to shade. We can usually do that using the function curve(). When we have a standard normal density we can use:

View Code RSPLUS
1
curve(dnorm(x,0,1))

The command above will create the graphic over an arbitrary range on the x values though. Since we want to shade a specific area under the curve, it is advisable to specify which interval on the x-axis we want the function plotted on. We can do that using the xlim parameter:

View Code RSPLUS
1
curve(dnorm(x,0,1),xlim=c(-3,3),main='Normal Density')

Now that we have our curve plotted, we are going to use the polygon(x,y) function. Its most important parameters are x and y, which define what kind of polygon R is going to draw. If x and y are vectors with i=1,…n elements, polygon(x,y) draws a polygon with vertices (xi,yi), i=1,…n. The trick now consists of expressing our region of interest as a polygon. The idea might seem stupid at first, as the area of interest is usually over a continuous curve, but if we take a large enough number of vertices the approximation will be very reasonable.

Let’s say we want to shade the region represented by P(-3 < X < -2). The first vertex we want for our polygon is (-3,0). We have then:

View Code RSPLUS
1
2
cord.x <- c(-3)
cord.y <- c(0)

The second vertex will be (-3,f(-3)), where f(-3) is the normal density evaluated at -3. We can readily obtain this value on R using the dnorm() function:

View Code RSPLUS
1
2
cord.x <- c(cord.x,-3) 
cord.y <- c(cord.y,dnorm(-3))

As a third and fourth vertices we can consider (-2,f(-2)) and (-2,0). We have now:

View Code RSPLUS
1
2
cord.x <- c(cord.x,-2,-2)
cord.y <- c(cord.y,dnorm(-2),0)

We then issue:

View Code RSPLUS
1
polygon(cord.x,cord.y,col='skyblue')

Obtaining (if you followed our instructions correctly ):

First attempt at creating a shaded area in R

Not that bad, but we can make it better than that. We just have to create a sequence of vertices between the two extrema of the density (f(-3) and f(-2) here), so that the larger number of vertices gives us a better approximation. In order to accomplish that we are going to use the seq() function, to create a sequence between -3 and 2, with steps of 0.01:

View Code RSPLUS
1
2
3
4
 cord.x <- c(-3,seq(-3,-2,0.01),-2) 
 cord.y <- c(0,dnorm(seq(-3,-2,0.01)),0) 
 curve(dnorm(x,0,1),xlim=c(-3,3),main='Standard Normal') 
 polygon(cord.x,cord.y,col='skyblue')

Which gives us the result:

Final graphic.

Commentaries

Note that we used the standard normal density. If we wanted to use another probability distribution its parameters would have to be properly specified. Try playing with areas under the exponential distribution, for example. To learn more about the uses and misuses of the polygon function, take a look at its documentation (?polygon).


tabs-top

16 Responses to “Creating shaded areas in R”

  1. nichole watson says:

    This is a great explaination! It worked like a charm.

  2. Tijana Amovic says:

    Hi Fernando,

    I recently came across your website as i was trying to figure out how to do a polygon in a plot for a specific range. Your information was very helpful. Could you tell me how do i apply this to a graph where i have to shade say from -1 and 1 only, and leave rest of it blank when given an assignment to graph P(-1.5 ≤ Z ≤ 1) ?
    This is as far as I got:
    x<-seq(-4,4,.2)
    y=-1],-1.5),c(y[x>=-1],y[x==4]),col=”blue”)

    but i need it to only go till 1 and not all the way to 4.

    Please let me know at your earliest convenience.

    I would greatly appreciate it.Thank you.

    Tijana

    • fernandohrosa says:

      Hi Tijana, you’d just need to update my last code snippet in the appropriate places, that is -3 by -1 and -2 by 1:

      cord.x <- c(-1,seq(-1,1,0.01),1)
      cord.y <- c(0,dnorm(seq(-1,1,0.01)),0)
      curve(dnorm(x,0,1),xlim=c(-3,3),main='Standard Normal')
      polygon(cord.x,cord.y,col='blue')

      That should generate what you’re expecting.

  3. Paul Metzner says:

    This really is a great explanation. I do wonder, however, how this works for custom distributions, like the distribution of results. I tried it and it says “Error in xy.coords(x, y) : ‘x’ and ‘y’ lengths differ”. Do you have an idea why this might be?

    • fernandohrosa says:

      Hi Paul,

      Well, in this case I guess you would have to use the empirical distribution function (?ecdf). As an example with results coming from an Exponential distribution:

      x <- rexp(100)
      fnx <- ecdf(x)
      cord.x <- c(0,seq(0,2,0.01),2)
      cord.y <- c(0,fnx(seq(0,2,0.01)),0)
      curve(fnx(x),xlim=c(0,5),main='Estimated Density of Results')
      polygon(cord.x,cord.y,col='skyblue')

      Is that what you were trying to achieve?

  4. Anonymous Reader says:

    Brilliant!! Thank you.

  5. Robert says:

    First result on google for “shaded curve R” and it works great. Thanks!

  6. Muna says:

    Hi

    I have a small question and I will highly appreciate your insight
    my code will plot two normal curves (layout) I need to shade the area between in common under these two curves
    the code is in R

    x=seq(-7,10,length=200)
    y1=dnorm(x,mean=0,sd=1)
    plot(x,y1,type=”l”,lwd=2,col=”red”)
    y2=dnorm(x,mean=3,sd=2)
    lines(x,y2,type=”l”,lwd=2,col=”blue”)

    I will highly appreciate any help

    • fernandohrosa says:

      Hi Muna,

      Very interesting question. The first thing you’ll need to find is where the two curves intersect. This is quite simple to do, just write out the two densities (X ~ N(0,1) and Y ~ N(3,2)) and make them equal. Solve for x, and you’ll end up with a quadratic equation (x^2 + 2x -3 + 8/3 ln(1/2) = 0), where the solutions are x = 1.418345 and x = -3.418345. These are the two points where the two normal curves intersect.

      Now the catch is to actually create two polygons instead of one: you’ll need one to shade from the two points where the two curves intersect (shading it over the ‘fatter’ curve, and then from that second intersection point until the thinner curve goes to zero (I set it arbitrarily at 4). You have to add lty=0 to the polygon to avoid a line dividing them. Here’s the final code:


      x=seq(-7,10,length=200)
      y1=dnorm(x,mean=0,sd=1)
      plot(x,y1,type='l',lwd=2,col='red')
      y2=dnorm(x,mean=3,sd=2)
      lines(x,y2,type='l',lwd=2,col='blue')

      # First block (we shade under the blue curve)
      cord.x <- c(-3.418345,seq(-3.418345,1.418345,0.01),1.418345)
      cord.y <- c(0,dnorm(seq(-3.418345,1.418345,0.01),3,2),0)
      polygon(cord.x,cord.y,col='blue',lty=0)

      # Second block (we shade under the red curve)
      cord.x <- c(1.418345,seq(1.418345,4,0.01),4)
      cord.y <- c(0,dnorm(seq(1.418345,4,0.01)),0)
      polygon(cord.x,cord.y,col='red',lty=0)

  7. Muna says:

    Thank you so much fernandohrosa

    That was great:)

  8. Muna says:

    Dear Fernando

    I have kind of a complicated problem, hope you can be patient with me

    My code simulate one data set, then use this data set to get estimates for my parameters namely beta.t and theta.t.
    What I need help with (if this is possible) is that I need my code to repeat this process as many as 10000 times… in each time it should generate a data set, find theta.t and beta.t and save these values in a vector so I can do all kind of work on them
    like average variance..etc etc
    following is my code
    Thank you in advance
    Muna
    ## Use EM algorithm to estimate parameters of pareto based
    # on progressive censored data

    n=20;m=5;R<-c(3,3,3,3,3)
    theta=0.5; beta=3
    W<-matrix((runif(m,0,1)),m,1)
    E<-matrix(NA,m,1);V<-matrix(NA,m,1);U<-matrix(NA,m,1)
    i<-1
    while (i<=m) {

    E[i]<- 1/( i+ sum(R[(m+1-i):m]))
    i<-i+1
    }
    V<-W^E
    i<-1
    while (i<=m) {

    U[i]<- (1- prod( V[(m+1-i):m]))
    i<-i+1
    }

    x<-( (1-U)^(-1/theta)-1 )/beta
    Yobs<-x
    em.fct <- function(Y){
    r <- length(Yobs)
    ##initial value
    theta.t=theta
    beta.t=beta
    # Define log-likelihood function
    ll <- function(y, k, c){
    n*log(c*k)-(k+1)*( sum(log(1+c*Y))+sum( R*log(1+c*Y)+1/k ) )
    }
    # Compute the log-likelihood for the initial values
    lltm1 <- ll(Yobs, theta.t, beta.t)
    repeat{
    # E-step
    Bbar<- function(theta.t,beta.t){
    sum(R*(1+beta.t*(theta.t+1)*Yobs)/(beta.t*(theta.t+1)*(1+beta.t*Yobs)))
    }
    Abar<- function(theta.t,beta.t){
    sum( R* ( log(1+beta.t*Yobs)+ 1/theta.t ))
    }
    theta.t1<- n/( sum(log(1+beta.t*Yobs)) + Abar(theta.t,beta.t))
    # M-step
    beta.t1<-((theta.t1+ 1)/n *(sum(Yobs/(1+beta.t*Yobs))+Bbar(theta.t,beta.t)))^(-1)
    theta.t1<- n/( sum(log(1+beta.t*Yobs)) + Abar(theta.t,beta.t))
    beta.t <-beta.t1
    theta.t<-theta.t1
    # compute log-likelihood using current estimates
    llt <- ll(Yobs, theta.t, beta.t)
    # Print current parameter values and likelihood
    cat(theta.t, beta.t, llt, "\n")
    # Stop if converged
    if ( abs(lltm1 – llt) < 0.0001) break
    lltm1 <- llt
    }
    return(theta.t,beta.t)
    }

    em.fct(Yobs)

  9. Leon van Mulken says:

    Would like to shade the area under a line.
    > spectrum(W, spans=5,ci.col = “red”)
    used > polygon(W,col=’blue’,lty=0). However, do not get anything no error nor warning. Would you be able to help. Thanks so much Leon van Mulken

  10. Frank L says:

    Please I will like to plot a function with dnorm(x,mean=0.2093,sd=0.444) and shade between -1 and 1.

    Thanks

Leave a Reply