Improving pp_check results Poisson model

I see, I was way off. I think the only thing I said earlier that I still stand by is relaxing the relationship with duration. I can try to explain my thinking using written language as an example; you be the judge of how much of this transfers to sign language.

I am thinking that, more than sentence length, the potential for mistakes in a sentence depends on things like the difficulty of the grammar or the complexity of the subject. Present tense sentences are easier than past tense ones, and hypotheticals (“We could’ve gone swimming if it hadn’t rained”) are harder still. Likewise, talking about your weekend is easier than talking about the economy.

Conditional on these other sources of difficulty, perhaps the number of mistakes is proportional to sentence length. But if your model does not account for them, the proportionality assumption implied by the offset (or by dividing the response by time) is off. The number of mistakes would be influenced by length but not strictly proportional to it. That is why I think that, in the absence of any terms that account for other sources of difficulty, adding duration as a predictor makes more sense than using it as an offset.
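
To spell out the difference, here is a rough sketch in my own notation (none of these symbols come from your model): y_i is the number of mistakes in sentence i, t_i is its duration, and I am assuming a log link with duration entering on the log scale.

```latex
% Offset: the coefficient on log(t_i) is fixed at 1,
% so the expected count is strictly proportional to duration.
\log \mathbb{E}[y_i] = \beta_0 + \log t_i
\quad\Longrightarrow\quad
\mathbb{E}[y_i] = e^{\beta_0}\, t_i

% Predictor: the coefficient \beta_1 is estimated from the data,
% so the expected count scales as a power of duration.
\log \mathbb{E}[y_i] = \beta_0 + \beta_1 \log t_i
\quad\Longrightarrow\quad
\mathbb{E}[y_i] = e^{\beta_0}\, t_i^{\beta_1}
```

The offset model is the special case \beta_1 = 1. If the estimate of \beta_1 ends up well away from 1, that would suggest strict proportionality is not a good description of the data. If you put duration in on its raw scale instead of the log scale, the second form changes, but the point about relaxing proportionality stays the same.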

How does a simple Poisson model (i.e. without the hurdle) with duration as a predictor compare to the above?

EDIT: Sorry, you probably already know this, but are we on the same page that the last pp_check plot you showed looks pretty good?