Multilevel model with single-observation group levels

Recently I ran into a problem specifying a multilevel model with varying intercepts in which some groups are represented by only a single observation.
I specify the model like this:

brm(Response ~ Predictor + (1|Group), data = mydata)

Response and Predictor are continuous variables and Group is a factor with multiple levels. Some levels of Group contain only a single observation, and as a result the model’s chains do not converge. I have since found out that this is related to how brms deals with varying intercepts: as far as I understand it, brms estimates the standard deviation of the varying intercepts, and this seems to be part of the problem. When I remove the groups that contain only one observation, the model converges normally.
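For reference, the model in the call above can be written out as follows. This is only a sketch that assumes the default Gaussian family; the symbols are notation introduced here for illustration, not something taken from brms output.

```latex
\begin{aligned}
\text{Response}_i &= \alpha + \beta\,\text{Predictor}_i + u_{g[i]} + \varepsilon_i,\\
u_g &\sim \mathcal{N}(0, \sigma_u^2),\\
\varepsilon_i &\sim \mathcal{N}(0, \sigma^2).
\end{aligned}
```

Here \sigma_u is the standard deviation of the varying intercepts (class sd in brms) and \sigma is the residual standard deviation (sigma). For a group with a single observation, that observation alone cannot separate u_g from \varepsilon_i, so information about \sigma_u has to come from the groups with repeated observations and from the prior.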
I tried replicating this problem in ‘rstanarm’, but it doesn’t appear to occur there. Whatever the issue is, it seems to involve (1) single-observation group levels and (2) the specific way brms deals with varying effects.
Therefore my question: How would I need to specify the model or set the priors to solve this convergence issue?

Are you using default priors? If not, please say more about the scale of your Response variable and spell out your priors. Also, how many levels are there for Group, how many of those levels (in terms of n or a proportion) have a single measurement occasion, and how many measurement occasions are typical for the others?
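If it helps, those counts can be pulled out with a quick tabulation. The following is a minimal sketch in base R; the toy data frame is invented purely for illustration and only reuses the names mydata and Group from the original call.

```r
# Toy stand-in for the real data; the values are invented for illustration.
mydata <- data.frame(
  Group = factor(c("a", "b", "b", "c", "d", "d", "d", "e"))
)

# Number of observations in each group level.
obs_per_group <- table(mydata$Group)

# Number of group levels at each group size (size 1 = single-observation groups).
size_distribution <- table(obs_per_group)
print(size_distribution)

# Count and proportion of group levels with a single observation.
n_singletons <- sum(obs_per_group == 1)
prop_singletons <- mean(obs_per_group == 1)
```

With the real data, size_distribution gives the breakdown asked about above in one table.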

Of course: both Response and Predictor are standardized continuous variables (mean = 0), and I use student_t(3, 0, 1) priors for the Intercept, for the Predictor coefficient, and for the group-level effects. Furthermore, I use cauchy(0, 1) for sigma. I have 243 data points. Of the group levels, 115 have a single observation, 14 have two, 2 have three, 1 has four, 3 have five, 3 have six, 1 has twelve, 1 has sixteen, and 1 has nineteen.
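Spelled out as brms code, the priors described above might look like the sketch below. The mapping onto parameter classes is an assumption made for illustration (in particular, that the prior "for the group levels" is set on class sd), and the object name described_priors is made up.

```r
library(brms)

# Assumed reading of the priors described above, for the model
# Response ~ Predictor + (1 | Group) with the default Gaussian family.
described_priors <- c(
  prior(student_t(3, 0, 1), class = Intercept),  # population-level intercept
  prior(student_t(3, 0, 1), class = b),          # coefficient for Predictor
  prior(student_t(3, 0, 1), class = sd),         # SD of the varying intercepts
  prior(cauchy(0, 1), class = sigma)             # residual SD
)

print(described_priors)
```

Such an object would be passed to brm() through its prior argument.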

Okay. You might tighten up your prior for the group-level standard deviation parameter. In my experience, brms has trouble when such a large proportion of your groups have so few occasions nested within them. Given the scale of your data, try something like prior(normal(0, 1), class = sd) or even prior(normal(0, 0.25), class = sd).
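A minimal sketch of what that could look like in a full call is below. The data are simulated stand-ins with many single-observation groups (not the real data), the name tight_priors is made up, and only the sd prior is set here, so all other priors stay at the brms defaults.

```r
library(brms)

set.seed(1)

# Simulated stand-in data: mostly single-observation groups plus a few larger ones.
group_sizes <- c(rep(1, 40), rep(2, 5), 5, 12)
Group <- factor(rep(seq_along(group_sizes), times = group_sizes))
u <- rnorm(nlevels(Group), mean = 0, sd = 0.5)   # true varying intercepts
Predictor <- rnorm(length(Group))
Response <- 0.3 * Predictor + u[as.integer(Group)] +
  rnorm(length(Group), mean = 0, sd = 0.8)
mydata <- data.frame(Response, Predictor, Group)

# Tighter prior on the SD of the varying intercepts, as suggested above.
tight_priors <- c(
  prior(normal(0, 0.25), class = sd)
)

fit <- brm(
  Response ~ Predictor + (1 | Group),
  data = mydata,
  prior = tight_priors
)

# Inspect Rhat and effective sample sizes to judge convergence.
summary(fit)
```

Because brms restricts sd parameters to be non-negative, a normal(0, 0.25) prior here acts as a half-normal. Whether 1 or 0.25 is the better scale is something to check against your own data.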