This is a fairly basic thing, but I can’t quite extract it from Giles or the StanMath paper tonight.

If I know that `y[j] = 0` structurally (i.e. independent of context; it's just true), does that mean `\bar{y}[j] = 0`?

For context, if `A` is a sparse matrix and I want to compute `y = A*x` for vectors `x` and `y`, then is `\bar{A}` sparse?

In the dense case `\bar{A} = \bar{y} * x'`, which is a dense matrix (albeit a low-rank one). This would be inconvenient for me. I would prefer `\bar{A}` to have the same sparsity as `A`. But maths doesn't care about my needs.

It matters what's a parameter and what's data.

Suppose I have two vectors `x` and `y` and they're both parameters. Then

```
d.(x' * y) / d.x[1] = y[1]
```

so what you really need to know is the sparsity of `y[1]`.

In your case, the sparsity of `x` is what will determine the sparsity of `d.(A * x) / d.A`.

So does that mean that if `y = Ac`, even though `A` is sparse, `\bar{A}` is dense if `c` is dense?!

That's very inconvenient. The storage without additional data structures would be n^2 vars (if `A` is n x n), while the minimal storage (the number of unique vars that need to be stored to represent `\bar{A}`) is n.
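
One way around the storage blow-up (just a sketch of the factored idea, not anything Stan Math actually does) is to keep `\bar{A}` in rank-1 form, storing only `\bar{y}` and `c` and producing entries on demand:

```python
# Hypothetical rank-1 representation of \bar{A} = \bar{y} * c':
# O(n) storage instead of O(n^2); entries are computed lazily.
import jax.numpy as jnp

class Rank1Adjoint:
    def __init__(self, ybar, c):
        self.ybar = jnp.asarray(ybar)   # \bar{y}, length n
        self.c = jnp.asarray(c)         # c, length n

    def entry(self, i, j):
        # \bar{A}[i, j] = \bar{y}[i] * c[j]
        return self.ybar[i] * self.c[j]

    def dense(self):
        # Materialise the full n^2 matrix only if it's really needed.
        return jnp.outer(self.ybar, self.c)
```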

Sorry, I don't have the power to change how derivatives work! `d.(u * v) / d.u = v`, even if `u = 0`.
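
A one-line check of that (again an illustrative JAX snippet, values made up): the adjoint of `u` is `v`, no matter what value `u` takes.

```python
# d(u * v)/du = v, independent of the value of u (even u = 0).
import jax

f = lambda u, v: u * v
print(jax.grad(f, argnums=0)(0.0, 3.0))   # prints 3.0, not 0.0
```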


Syntax Q: what does `\bar{x}` mean here?

It's Giles' notation for reverse-mode autodiff.
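
Concretely (standard adjoint notation; `L` here is just my label for whatever scalar objective is being differentiated):

```
\bar{x} = d.L / d.x     (the adjoint of x: the sensitivity of the objective to x)
```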
