Sorry for not being more clear in my previous messages.
Say you have 12 observations. You can divide them as
k=n ((1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12))
or
k=6 ((1,2),(3,4),(5,6),(7,8),(9,10),(11,12))
or
k=4 ((1,2,3),(4,5,6),(7,8,9),(10,11,12))
or
k=3 ((1,2,3,4),(5,6,7,8),(9,10,11,12))
or
k=2 ((1,2,3,4,5,6),(7,8,9,10,11,12))
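The divisions above can be sketched in a few lines of Python (a minimal illustration; `contiguous_folds` is a hypothetical helper name, not from any library):

```python
n = 12

def contiguous_folds(n, k):
    """Split indices 1..n into k equal-sized contiguous folds."""
    size = n // k  # assumes k divides n evenly, as in the examples above
    return [tuple(range(i * size + 1, (i + 1) * size + 1)) for i in range(k)]

for k in (12, 6, 4, 3, 2):
    print(k, contiguous_folds(n, k))
```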
and naturally for k=6,4,3,2 other permutations are possible. In
k-fold CV we leave out one group at a time, train with the others, and
predict for the left-out set. From your earlier posts, I understood
that you would compute RMSE for k sets, e.g. if k=4, you would compute
RMSE for 4 sets, each of which has 3 observations. You would then have
4 statistics (RMSEs) and would compute the SE of these 4 statistics
(RMSEs). Now you will get different results depending on which k you
use, and if that choice is not connected to your actual prediction
task, then it is an arbitrary value. You may have a justification for
choosing a specific k: say you would in the future always predict for
groups of 3, or say you have a hierarchical model with 4 groups and
you want to know the predictive performance for new groups. If you don't have a
specific reason to choose a specific k, then I suggest computing the
cross-validated squared error for each of the n individual
observations; then you have n statistics (squared errors) from which
you can compute the RMSE and its SE. This way the result is less
sensitive to the specific k value. With k<n there are different possible permutations,
and if you repeat the data division with different permutations, you
can average the predictions over the different permutations for each
observation, compute the n squared errors, and then continue as in
the single-permutation case.
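The per-observation procedure can be sketched as follows (a minimal stand-alone example: the "model" here is just the training-fold mean, a stand-in predictor, and `cv_squared_errors` is a name I made up):

```python
import random
import statistics

def cv_squared_errors(y, k, n_perms=10, seed=0):
    """Cross-validated squared error for each observation, with the
    held-out prediction averaged over n_perms random fold permutations.
    The 'model' is just the training-fold mean (a stand-in predictor)."""
    n = len(y)
    rng = random.Random(seed)
    pred_sums = [0.0] * n
    for _ in range(n_perms):
        idx = list(range(n))
        rng.shuffle(idx)                            # one permutation of the data
        folds = [set(idx[i::k]) for i in range(k)]  # k roughly equal folds
        for fold in folds:
            # train on all observations outside this fold
            train_mean = statistics.mean(y[i] for i in range(n) if i not in fold)
            for i in fold:                          # predict for the left-out fold
                pred_sums[i] += train_mean
    preds = [s / n_perms for s in pred_sums]        # average over permutations
    return [(y[i] - preds[i]) ** 2 for i in range(n)]

y = [float(i) for i in range(1, 13)]                # 12 toy observations
sq_errs = cv_squared_errors(y, k=4)                 # n = 12 squared errors
rmse = statistics.mean(sq_errs) ** 0.5              # RMSE over all n observations
se = statistics.stdev(sq_errs) / len(sq_errs) ** 0.5  # SE of the n squared errors
```

With `n_perms=1` this reduces to the single-permutation case; increasing `n_perms` averages out the arbitrariness of any one fold assignment.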