Deep copy warning

Hi,
I’m getting the following parser warning:

WARNING: left-hand side variable (name=tau) occurs on right-hand side of assignment, causing inefficient deep copy to avoid aliasing.

I understand why I get it (I’m using a loop with a nested conditional to add values only to successive slices of a larger vector), i.e.:

  for (ifrange in 1:nfrange) {
      for (iline in 1:nline) {
          if ((linefreq[iline] > f[idx_jump_low[ifrange]]) && (linefreq[iline] < f[idx_jump_high[ifrange]])) {
              sigma_vec[idx_jump_low[ifrange]:idx_jump_high[ifrange]] = sigma_vec[idx_jump_low[ifrange]:idx_jump_high[ifrange]] + sigma[idx_spec[iline]];
          }
      }
  }

With simple benchmarks, I found that this way is much faster than updating the full vector, but I’m still wondering:

  • what is the inefficiency mentioned in the warning? (memory? CPU?)
  • is there a way to get around the warning?

Pierre

edit: Actually, I don’t get that warning with the codes below; I could have sworn I got it before… See below.

I was just wondering the same thing. The simplest case where we get this warning is when doing something like

for(t in 1:10) {
  x[t+1] = x[t];
}

If you just want to get rid of the warning, you could use temporaries yourself:

for(t in 1:10) {
  tmp = x[t];
  x[t+1] = tmp;
}

But whether this is clever in terms of efficiency, I don’t know. The warning suggests that this is what happens anyway. Or does Stan make a deep copy of the whole vector x at each iteration, rather than just copying the subset as in the manual copy above?

EDIT: The codes above do not actually trigger the warning, but this does:

for(t in 1:(n-1)) {
  x[, t+1] = x[, t];
}

whereas this does not:

for(t in 1:(n-1)) {
  x[1, t+1] = x[1, t];
}

I made a small test where I simulated 10000 observations, and in the transformed parameters block I did one of the following:

Case 1, deep copy warning:

for(t in 1:(n-1)) {
  x[, t+1] = x[, t];
}

Case 2, avoid the warning by explicit indexing:

for(t in 1:(n-1)) {
  x[1, t+1] = x[1, t];
}

Case 3, avoid the warning by using a temporary variable:

for(t in 1:(n-1)) {
  tmp = x[, t];
  x[, t+1] = tmp;
}

Results: case 1 took 14.8 s for 10000 iterations, case 2 took 4.5 s, and case 3 took 10.5 s. So there are quite big differences, at least in this toy experiment.
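
For reference, the test program was roughly of this form for case 1 (take the sizes and the model block as placeholders; they are not the interesting part):

data {
  int<lower=2> n;
}
parameters {
  real x1;
}
transformed parameters {
  matrix[1, n] x;
  x[1, 1] = x1;
  for(t in 1:(n-1)) {
    x[, t+1] = x[, t];  // case 1: triggers the deep copy warning
  }
}
model {
  x1 ~ normal(0, 1);
}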

But in your case, and in general, the issue is not about explicit indexing; it’s more like

for(t in 1:n) {
  x[1:2, t] = 2.0 * x[1:2, t];
}

Note that for a vector, something like x[t] = 2.0 * x[t]; does not trigger a warning for me.
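
To make the distinction concrete, a small self-contained comparison (arbitrary values) would be:

transformed data {
  vector[3] v = [1, 2, 3]';
  matrix[2, 3] m = rep_matrix(1, 2, 3);
  v[2] = 2.0 * v[2];            // single element: no warning
  m[1:2, 2] = 2.0 * m[1:2, 2];  // multi-index slice: triggers the deep copy warning
}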

That message is from Stan, not from one of our libraries. We print it when we have to make a deep copy to avoid aliasing during assignment.

I don’t understand. Case (2) isn’t computing the same thing, as it’s only copying the first row. I’m not sure why case (3) would be faster than case (1).

Whatever is going on with all this, it’s best to reorganize data structures to avoid a lot of copying anywhere other than in transformed data, where it only gets executed once.

Ah, never mind. I get it. The whole matrix is getting copied in the slower case, whereas your explicit copy only deals with a row. The problem is that we don’t do very good static analysis to figure out when we can rewrite code the way you suggest.
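
In other words, case (1) behaves roughly as if a hidden temporary for the whole matrix were introduced on every iteration (tmp_all is a hypothetical name, just for illustration):

for(t in 1:(n-1)) {
  matrix[rows(x), cols(x)] tmp_all = x;  // deep copy of the entire matrix
  x[, t+1] = tmp_all[, t];
}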

@seantalts wanted to sink his teeth into some of these optimization problems; maybe this one’s solvable.

Sorry, I forgot to write that the matrix in my example is 1 x n, so case (2) computes the same thing as (1), and so does (3).

In some cases restructuring the code is likely the best way to go, but for some recursive computations it’s not that easy.
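
For example, a state space recursion like the sketch below is inherently sequential, and I don’t see how to restructure it to get rid of the loop (A, b, and the sizes are just placeholders):

transformed data {
  int n = 5;
  matrix[2, 2] A = diag_matrix(rep_vector(0.5, 2));  // placeholder transition matrix
  vector[2] b = rep_vector(0.1, 2);                  // placeholder intercept
  matrix[2, n] x = rep_matrix(0, 2, n);
  for(t in 2:n) {
    x[, t] = A * x[, t-1] + b;  // each column depends on the previous one
  }
}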

I’m not sure whether I should be resurrecting this thread, but this is the closest I found to my issue.

The following line triggers the “Deep Copy” warning when I use it in place of a loop.

x[2:K] = w * x[1:(K-1)] + (1-w) * alpha[2:K];

and the loop is

for (i in 2:K) {
  x[i] = w * x[i - 1] + (1 - w) * alpha[i];
}

It is in the transformed parameters block.

x and alpha are K-length vectors, and w is a scalar.

The whole model is here, with the loop version in place:

This is in no way an urgent problem but I’d like to understand better what’s going on.

Thanks.

What you’re trying to do with this:

x[2:K] = w * x[1:(K-1)] + (1-w) * alpha[2:K];

is dangerous as it’s order dependent. If you copy the x on the right-hand side (which is what Stan’s going to do), you get a different result than if you were to write the loop:

for (k in 2:K)
  x[k] = w * x[k - 1] + (1 - w) * alpha[k];

It’s the dependency of x[k] on x[k-1] that’s the real culprit here.

I wrote a little Stan program to illustrate:

transformed data {
  vector[4] x = [ 1, 2, 3, 4 ]';
  vector[4] u = x;
  for (t in 2:4)
    u[t] = u[t - 1] * 3;

  x[2:4] = x[1:3] * 3;
  print("u = ", u);
  print("x = ", x);
}

which produces

u = [1,3,9,27]
x = [1,3,6,9]

I’ll include this discussion in the manual as I think it’s helpful to understand how our vector operations always copy:

https://github.com/stan-dev/stan/issues/2336#issuecomment-334613905

It’s an AR(1)-type model, like the one on page 162 of the Stan reference manual:

model {
y[2:N] ~ normal(alpha + beta * y[1:(N - 1)], sigma);
}

But you’re right, it doesn’t seem to work written this way.

This is different: it’s a sampling statement, not an assignment! In

y[2:N] ~ normal(alpha + beta * y[1:(N - 1)], sigma);

the value of y remains fixed on both the left-hand and right-hand sides. It’s really just syntactic sugar for

target += normal_lpdf(y[2:N] | alpha + ...);

which also drops constant terms if it can.
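
For concreteness, the vectorized sampling statement works fine as part of a complete program; a minimal version of the manual’s AR(1) model (with improper flat priors on the parameters) looks like this:

data {
  int<lower=2> N;
  vector[N] y;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
model {
  // y appears on both sides, but nothing is assigned, so no copy is needed
  y[2:N] ~ normal(alpha + beta * y[1:(N - 1)], sigma);
}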