I am using ggplot2 to visualize model results.

My model results are stored in the results object, a data frame. My code to visualize them looks like the following:

results |> 
  mutate(Confidence = if_else((CI_high < 0 & CI_low < 0) | (CI_high > 0 & CI_low > 0),"Significant","Not Significant")) |> 
  ggplot(aes(x=time,y=Coefficient,ymin=CI_low,ymax=CI_high)) + geom_line() + geom_ribbon(alpha=0.3) 


This works fine, and produces the following plot:

plo1

However, I want to make the fill of the geom_ribbon conditional on significance, as the above if_else condition shows. But when I plot it using the following code:

results |> 
  mutate(Confidence = if_else((CI_high < 0 & CI_low < 0) | (CI_high > 0 & CI_low > 0),"Significant","Not Significant")) |> 
  ggplot(aes(x=time,y=Coefficient,ymin=CI_low,ymax=CI_high,color=Confidence,fill=Confidence)) + geom_line() + geom_ribbon(alpha=0.3) 

I get this plot: plot2

This is not what I expect. The same geom_ribbon should simply be shaded in a different color within the bounds it already has (wherever the upper and lower bounds are either both above or both below 0). Instead, it now draws an additional fill over the already shaded region, and the edges of the blue fill do not even match the smooth geom_ribbon from before. I have tried supplying only the fill aesthetic without color, and vice versa. I have also tried setting the aesthetics in geom_ribbon's aes() instead of the overall plot's, but none of these attempts resolved the problem.

How can I fix this to conditionally fill/color the plot only within the actual boundaries of the data?

EDIT (w/ Reprex)

Here is a reproducible example that also demonstrates the issue:

library(modelbased)
library(tidyverse)

gam1 <- mgcv::gam(mpg ~ cyl +
                          s(disp), data = mtcars, method = "REML")


deriv1 <- modelbased::estimate_slopes(gam1,
  trend = "disp",
  at = "disp",
  length = 100) |> 
  ggplot(aes(x=disp,y=Coefficient,ymin=CI_low,ymax=CI_high)) + geom_line() + geom_ribbon(alpha=0.3)

deriv2 <- modelbased::estimate_slopes(gam1,
  trend = "disp",
  at = "disp",
  length = 100) |> 
  mutate(Confidence = if_else(CI_high < 0 & CI_low < 0 | CI_high > 0 & CI_low > 0,"Significant","Not Significant")) |> 
  ggplot(aes(x=disp,y=Coefficient,ymin=CI_low,ymax=CI_high)) +
  geom_line() +
  geom_ribbon(alpha=0.3,aes(color=Confidence,fill=Confidence)) +
  scale_color_manual(values=c("red","grey"),breaks = c("Significant","Not Significant")) +
  scale_fill_manual(values=c("red","grey"),breaks = c("Significant","Not Significant"))

deriv1
deriv2


plot3

deriv1 shows that only the very start of the geom_ribbon and a portion before the very end should be shaded red, as only these sections do not overlap with 0. However, when the shading is made conditional, the output for deriv2 is the following:

plot4

which does not match the desired output at all. The desired output should be that only the first section and a portion before the very end (where the ymin and ymax don't overlap with 0) should be shaded red. It should not be a separate ribbon from the grey ribbon.

  • Not responsive to your ggplot2 question, but relevant consideration for the approach implied here: "The Difference Between “Significant” and “Not Significant” is not Itself Statistically Significant" stat.columbia.edu/~gelman/research/published/signif4.pdf
    – Jon Spring
    Apr 19 at 16:21
  • It's easier to help you if you include a simple reproducible example with sample input and desired output that can be used to test and verify possible solutions.
    – MrFlick
    Apr 19 at 16:36
  • Thank you @JonSpring I appreciate the thought but interpretation is not very relevant here, as this problem could be the case for other situations as well, not pertaining to statistical significance.
    – flâneur
    Apr 19 at 16:37
  • Thanks @MrFlick I'm afraid due to the size of the data that I cannot offer a reproducible example. I haven't been able to reproduce it with a smaller subset either yet, but I will keep trying and update the question if possible.
    – flâneur
    Apr 19 at 16:39
  • @MrFlick I have added a further (fully reproducible) example which also demonstrates the problem. I hope someone might be able to help!
    – flâneur
    Apr 19 at 17:40

1 Answer

It looks odd because ggplot2 doesn't know that the significant part on the left isn't connected to the significant part on the right: all rows with the same Confidence value are drawn as a single ribbon. You can add an explicit grouping variable so that ggplot2 knows which sections should be drawn together:

modelbased::estimate_slopes(gam1,
                                      trend = "disp",
                                      at = "disp",
                                      length = 100) |> 
  mutate(Confidence = if_else(CI_high < 0 & CI_low < 0 | CI_high > 0 & CI_low > 0,"Significant","Not Significant")) |> 
  mutate(Group = consecutive_id(Confidence)) |> 
  ggplot(aes(x=disp,y=Coefficient,ymin=CI_low,ymax=CI_high)) +
  geom_line() + 
  geom_ribbon(alpha=0.3,aes(color=Confidence,fill=Confidence, group=Group)) + 
  scale_color_manual(values=c("red","grey"),breaks = c("Significant","Not Significant")) + 
  scale_fill_manual(values=c("red","grey"),breaks = c("Significant","Not Significant"))
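
To see what the Group column contains, here is a minimal sketch on a made-up vector of labels (the toy tibble is purely illustrative and not taken from the model output). It assumes dplyr 1.1.0 or later, where consecutive_id() is available; the base R line using rle() is one possible equivalent for older versions.

```r
library(dplyr)

# Made-up labels standing in for the Confidence column.
toy <- tibble(
  x = 1:7,
  Confidence = c("Significant", "Significant",
                 "Not Significant", "Not Significant",
                 "Significant", "Significant", "Significant")
)

# consecutive_id() starts a new id every time the value changes,
# so the two separate "Significant" runs get different ids.
toy <- toy |>
  mutate(Group = consecutive_id(Confidence))

toy$Group
#> [1] 1 1 2 2 3 3 3

# A base R alternative built on run-length encoding.
runs <- rle(toy$Confidence)
base_group <- rep(seq_along(runs$lengths), runs$lengths)
```

Because the first and last runs now carry different ids (1 and 3), mapping group = Group makes geom_ribbon draw them as two separate polygons instead of one polygon spanning the whole x range.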

There are some discontinuities when transitioning between the two groups. I guess you'd have to decide how you want to color those regions.
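
One possible way to close those gaps, offered only as a sketch: copy the first row of each run into the run before it, so that every ribbon extends to the x position where the next one starts. The helper name close_ribbon_gaps() and the toy data are hypothetical, and the choice to color each transition interval like the run on its left is an assumption, not something prescribed by ggplot2 or by the data.

```r
library(dplyr)

# Hypothetical helper: expects a Confidence column, adds a Group column,
# and appends one "bridge" row per transition between runs.
close_ribbon_gaps <- function(df) {
  df <- df |>
    mutate(
      Group = consecutive_id(Confidence),
      prev_conf = lag(Confidence)   # label of the row just before
    )

  # Rows that start a new run (every run except the first one).
  bridges <- df |>
    filter(Group != lag(Group, default = 1L)) |>
    # Reassign them to the preceding run, with that run's label.
    mutate(Group = Group - 1L, Confidence = prev_conf)

  bind_rows(df, bridges) |>
    select(-prev_conf)
}

# Made-up intervals: significant, not significant, significant again.
toy <- tibble(
  disp    = 1:6,
  CI_low  = c(1, 1, -1, -1, 1, 1),
  CI_high = c(2, 2,  1,  1, 2, 2)
) |>
  mutate(Confidence = if_else(CI_high < 0 & CI_low < 0 | CI_high > 0 & CI_low > 0,
                              "Significant", "Not Significant"))

bridged <- close_ribbon_gaps(toy)
```

The result can be piped into the same ggplot call as above, keeping group = Group inside geom_ribbon's aes(). Since the bridge rows duplicate x values, geom_line is better given the original, unbridged data.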

  • Fantastic, this works perfectly! Thank you so much for a clear simple solution!
    – flâneur
    Apr 19 at 20:31
