I don’t know for sure what’s happening, but I’ll take a guess and you can tell me if I might be right or if I’m off-base (I haven’t worked with cyclic smooths before).
My guess is that there's some window in late December and/or early January when you don't have any data, and when you pass `bs = "cc"`, what you're getting is a basis that is cyclic not by joining 31 December to 1 January, but rather by joining the latest day-of-year in your data to the earliest day-of-year in your data. That would explain the small overshoot you observe.
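One way to check this guess is to look at where the fitted basis puts its first and last knots. The sketch below uses simulated data with no observations in the first or last two weeks of the year; the data, the variable names, and the use of the `xp` element of the smooth object (which, as far as I can tell, is where mgcv stores the knot locations for `"cc"` smooths) are my assumptions, not something taken from your model.

```r
library(mgcv)

set.seed(1)
# Simulated data (hypothetical): no observations before day 15 or after day 350
dat <- data.frame(doy = sample(15:350, 300, replace = TRUE))
dat$y <- sin(2 * pi * dat$doy / 365) + rnorm(nrow(dat), sd = 0.3)

# Cyclic cubic regression spline with default knot placement
m_default <- gam(y ~ s(doy, bs = "cc"), data = dat, method = "REML")

# First and last knot of the cyclic basis (assumed to live in `xp`)
knot_range <- range(m_default$smooth[[1]]$xp)
knot_range

# Range of the observed covariate, for comparison
range(dat$doy)
```

If the two ranges match, the cycle is being closed at the edges of the data rather than at the turn of the year.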
Perhaps @ucfagls would be willing to chime in about how to do this properly. I note that in his post "Modelling seasonal data with GAMs", the cyclical constraint appears to enforce equality between January and December temperatures and derivatives (if I'm squinting at the figures right), and I'm not exactly sure what the best way would be to tell a model like this to leave a month's worth of wiggle between month 12 and month 1.
Perhaps we should just split our month 1 data and assign some of it to month 13? Or if we only have one month 1 observation, we could assign it to both month 1 and month 13 with weight 0.5? Or maybe we can leave our data alone but manually specify that there should be a knot at 13?
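For what it's worth, here is a rough sketch of the last idea, leaving the data alone and specifying the end points of the cycle by hand. I believe that when `gam()` is given exactly two knots for a `"cc"` smooth it treats them as the end points and places the remaining knots itself, so end points at 0.5 and 12.5 should leave a full month between month 12 and month 1. The simulated data and variable names are hypothetical, and I haven't tried this on a real problem.

```r
library(mgcv)

set.seed(2)
# Simulated monthly data (hypothetical): 20 observations per month
dat <- data.frame(month = rep(1:12, each = 20))
dat$temp <- 10 - 8 * cos(2 * pi * (dat$month - 1) / 12) + rnorm(nrow(dat))

# Close the cycle at 0.5 and 12.5 instead of at months 1 and 12
m_year <- gam(temp ~ s(month, bs = "cc"), data = dat, method = "REML",
              knots = list(month = c(0.5, 12.5)))

# The fitted curve should agree at the two ends of the cycle,
# while months 1 and 12 themselves are free to differ
ends <- predict(m_year, newdata = data.frame(month = c(0.5, 12.5)))
ends
predict(m_year, newdata = data.frame(month = c(1, 12)))
```

`brm()` also has a `knots` argument, so I'd expect the same `knots = list(...)` specification to carry over there, but I haven't verified that it behaves identically.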