Sorry for not being more clear in my previous messages.

Say you have 12 observations. You can divide them as

k=n=12 ((1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12))

or

k=6 ((1,2),(3,4),(5,6),(7,8),(9,10),(11,12))

or

k=4 ((1,2,3),(4,5,6),(7,8,9),(10,11,12))

or

k=3 ((1,2,3,4),(5,6,7,8),(9,10,11,12))

or

k=2 ((1,2,3,4,5,6),(7,8,9,10,11,12))
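
If it helps, the contiguous divisions listed above can be generated with a small NumPy sketch (the labels 1..12 are just observation indices used for illustration):

```python
import numpy as np

# Generate the contiguous fold divisions for n = 12 observations
obs = np.arange(1, 13)  # observation labels 1..12
divisions = {k: [tuple(int(i) for i in fold) for fold in np.array_split(obs, k)]
             for k in (12, 6, 4, 3, 2)}
for k, folds in divisions.items():
    print(f"k={k}: {folds}")
```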

and naturally for k=6,4,3,2 other permutations are possible. In k-fold CV we leave out one group at a time, train with the others, and predict for the left-out set. From your earlier posts, I understood that you would compute RMSE for k sets, e.g. if k=4, you would compute RMSE for 4 sets which each have 3 observations. You would then have 4 statistics (RMSEs) and you would compute the SE of these 4 statistics (RMSEs).

Now you will get different results depending on which k you use, and if that choice is not connected to your actual prediction task then it is an arbitrary value. You may have a justification for choosing a specific k, say you would in the future always predict for groups of 3, or say you have a hierarchical model with 4 groups and you want to know the predictive performance for new groups.

If you don't have a specific reason to choose a particular k, then I suggest computing the cross-validated squared error for each of the n individual observations; then you have n statistics (squared errors) from which you can compute the RMSE and its SE. This way the result is less sensitive to the specific value of k. With k<n there are different possible permutations, and if you repeat the data division with different permutations, you can average the predictions over the different permutations for each observation, then compute the n squared errors, and then continue as in the case of a single permutation of the folds.
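
A minimal sketch of both suggestions, assuming a toy dataset and a simple linear model as a stand-in for your actual model (the data, model, and the delta-method step for the SE of the RMSE are my additions, not from the discussion above):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 12 observations from a hypothetical linear relationship
n = 12
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(scale=0.5, size=n)

def kfold_predictions(x, y, k, rng):
    """Cross-validated prediction for every observation, using one random
    permutation of the observations into k folds."""
    n = len(y)
    idx = rng.permutation(n)
    pred = np.empty(n)
    for test_idx in np.array_split(idx, k):
        train_idx = np.setdiff1d(idx, test_idx)
        # Fit a simple linear model on the k-1 training folds
        slope, intercept = np.polyfit(x[train_idx], y[train_idx], 1)
        pred[test_idx] = slope * x[test_idx] + intercept
    return pred

# One permutation: n squared errors, then RMSE and its SE
sq_err = (y - kfold_predictions(x, y, k=4, rng=rng)) ** 2
rmse = np.sqrt(np.mean(sq_err))
# SE of the mean squared error, mapped to the RMSE scale (delta method)
se_rmse = np.std(sq_err, ddof=1) / np.sqrt(n) / (2 * rmse)
print(f"one permutation: RMSE = {rmse:.3f}, SE = {se_rmse:.3f}")

# Several permutations: average the predictions per observation first,
# then compute the n squared errors as before
n_perm = 20
pred_avg = np.mean([kfold_predictions(x, y, k=4, rng=rng)
                    for _ in range(n_perm)], axis=0)
sq_err_avg = (y - pred_avg) ** 2
rmse_avg = np.sqrt(np.mean(sq_err_avg))
se_rmse_avg = np.std(sq_err_avg, ddof=1) / np.sqrt(n) / (2 * rmse_avg)
print(f"averaged permutations: RMSE = {rmse_avg:.3f}, SE = {se_rmse_avg:.3f}")
```

The per-observation squared errors make the fold size drop out of the statistic itself; the folds only affect which training set produced each prediction, and averaging over permutations reduces that remaining sensitivity.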