This is a fairly basic thing, but I can’t quite extract it from Giles or the Stan Math paper tonight. If I know that `y[j] = 0` structurally (i.e. independent of context, it’s just true), does that mean `\bar{y}[j] = 0`?
For context: if A is a sparse matrix and I want to compute y = A*x for vectors x and y, then is \bar{A} sparse?
In the dense case \bar{A} = \bar{y} * x', which is a dense matrix (albeit a rank-one one). This would be inconvenient for me. I would prefer \bar{A} to have the same sparsity as A. But maths doesn’t care about my needs.
It matters what’s a parameter and what’s data.

Suppose I have two vectors x and y and they’re both parameters. Then d.(x' * y) / d.x[1] = y[1], so what you really need to know is the sparsity of y.

In your case, the sparsity of x is what will determine the sparsity of d.(A * x) / d.A.
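A quick numpy check of the point above (my own sketch, not from the thread): the gradient of x' * y with respect to x is y, so its sparsity is inherited from y regardless of how sparse x itself is:

```python
import numpy as np

x = np.array([0.0, 5.0, 0.0, 0.0])   # x is sparse
y = np.array([1.0, 2.0, 3.0, 4.0])   # y is dense

# d.(x' * y) / d.x[j] = y[j]: verify entry by entry with finite differences
h = 1e-6
grad = np.array([
    ((x + h * np.eye(4)[j]) @ y - x @ y) / h
    for j in range(4)
])

print(np.allclose(grad, y))          # gradient equals y: dense, despite x being sparse
```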
So does that mean that if y = A*c, even though A is sparse, \bar{A} is dense if c is dense?!

That’s very inconvenient. The storage without additional data structures would be n^2 vars (if A is n×n), while the minimal storage (the number of unique vars that need to be stored to represent \bar{A}) is n.
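Since \bar{A} = \bar{y} * c' is rank one, one possible workaround (my own speculation, not anything from Giles or Stan Math) is to keep the factor pair instead of materialising the dense matrix; with c being data, the only new vars are the n entries of \bar{y}:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
c = rng.random(n)                    # dense data vector
y_bar = rng.random(n)                # incoming adjoint of y

# Materialising A_bar = y_bar * c' costs n^2 floats...
A_bar_dense = np.outer(y_bar, c)

# ...but the same information lives in the factor pair (y_bar, c).
# A typical downstream use, e.g. the bilinear form v' * A_bar * w,
# never needs the dense matrix:
v = rng.random(n)
w = rng.random(n)
direct   = v @ A_bar_dense @ w
factored = (v @ y_bar) * (c @ w)

print(np.isclose(direct, factored))
```

This keeps the O(n) storage the post asks for, at the cost of a custom data structure for rank-one adjoints.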
Sorry, I don’t have the power to change how derivatives work! d.(u * v) / d.u = v, even if u = 0.
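A one-line check of that (again my own sketch): at u = 0 the product u * v is zero, but its derivative with respect to u is still v:

```python
import numpy as np

v = 3.0
h = 1e-6

# f(u) = u * v; finite difference at u = 0
deriv_at_zero = ((0.0 + h) * v - 0.0 * v) / h

print(np.isclose(deriv_at_zero, v))  # derivative is v even though u = 0
```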
Syntax Q – what’s \bar{x} mean here?
It’s Giles’ notation for reverse-mode autodiff.