I have googled the relationship between tree depth, leapfrog steps, acceptance rate, and running time, but did not find an answer anywhere, so can anyone help me with my problem below?

The same model with different numbers of items was estimated with a hierarchical model. The same specification was used, with max_treedepth = 10 and adapt_delta = 0.90. One instance ran much faster, with the following running time:
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 630 seconds.
Chain 1: Adjust your expectations accordingly!

Another instance ran very slowly, taking about 20,000 seconds for 1000 transitions.

What can I do to make the second instance run faster? How do I adjust the number of leapfrog steps, besides treedepth, in rstan?

That isn’t atypical, especially for models that are difficult for Stan to fit. If you can lower adapt_delta and still not encounter divergent transitions, that will tend to lower the treedepth, and with it the number of leapfrog steps per transition.
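In rstan these settings are passed through the `control` argument to `stan()` (or `sampling()`); under NUTS you cannot set the number of leapfrog steps directly, only adapt_delta and max_treedepth. A minimal sketch, assuming a hypothetical model file `model.stan` and data list `stan_data`:

```r
library(rstan)

# Lower adapt_delta cautiously; if divergent transitions appear, raise it back.
fit <- stan(
  file  = "model.stan",       # hypothetical model file
  data  = stan_data,          # hypothetical data list
  chains = 4, iter = 2000,
  control = list(
    adapt_delta   = 0.80,     # smaller target acceptance -> larger step size,
                              # fewer leapfrog steps, but risk of divergences
    max_treedepth = 10        # cap on tree doubling
  )
)

# Check how many post-warmup transitions saturated the treedepth cap:
sapply(get_sampler_params(fit, inc_warmup = FALSE),
       function(x) sum(x[, "treedepth__"] >= 10))
```

If many transitions sit at the cap, the model's geometry (rather than the settings) is usually the thing to fix, e.g. via reparameterization.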

Sometimes 0.8 works, but not always. In my observation, after I increased adapt_delta, the program slowed down even with the same tree depth.

max_treedepth is the maximum tree depth the Stan HMC sampler will consider before it stops doubling the trajectory; it is a cap, not the actual treedepth. The default max_treedepth is 10, corresponding to up to 2^{max_treedepth} = 2^10 = 1024 leapfrog steps per transition.
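The cap on leapfrog steps implied by a given tree depth can be computed directly (a sketch; this ignores the small bookkeeping difference between states and steps in the NUTS tree):

```python
def max_leapfrog_steps(max_treedepth: int) -> int:
    """Upper bound on leapfrog steps per transition for a given tree depth.

    At tree depth d the NUTS trajectory holds on the order of 2^d states,
    so roughly 2^d leapfrog steps per transition at the cap.
    """
    return 2 ** max_treedepth

print(max_leapfrog_steps(10))  # 1024, the default cap
```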

Within that range you can have a wide range of speeds. The timing message estimates runtime assuming 10 leapfrog steps per transition. If the sampler actually needs hundreds or a thousand leapfrog steps per transition (near the max_treedepth boundary) because of particularly complex geometry, the actual run can easily take 30x or more of that estimate, which is roughly the ratio you are seeing (20,000 seconds vs. 630).
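The back-of-the-envelope arithmetic looks like this (a sketch; the linear scaling in leapfrog steps is an assumption, and the step counts are illustrative, not measured from your run):

```python
def scaled_runtime(base_seconds: float, base_steps: int, actual_steps: int) -> float:
    """Scale Stan's timing estimate linearly in leapfrog steps per transition."""
    return base_seconds * actual_steps / base_steps

# Stan's message: 1000 transitions at 10 leapfrog steps ~ 630 seconds.
print(scaled_runtime(630, 10, 10))   # 630.0  -> matches the fast instance
print(scaled_runtime(630, 10, 320))  # 20160.0 -> about the observed 20k seconds
```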