I’d like to implement a logistic regression model (with standard normal priors) that accepts inputs and corresponding non-negative weights w_n (i.e. multiplicities that scale each point’s log-likelihood). Is the following implementation correct?

weighted_logistic_code = """
data {
  int<lower=0> N;               // number of observations
  int<lower=0> d;               // dimensionality of x
  matrix[N, d] x;               // inputs
  int<lower=0, upper=1> y[N];   // outputs in {0, 1}
  vector<lower=0>[N] w;         // non-negative weights
}
parameters {
  real theta0;                  // intercept
  vector[d] theta;              // regression coefficients
}
model {
  theta0 ~ normal(0, 1);
  theta ~ normal(0, 1);
  for (n in 1:N) {
    target += w[n] * bernoulli_logit_lpmf(y[n] | theta0 + x[n] * theta);
  }
}
"""
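For integer weights, the model above should be equivalent to fitting the unweighted model on data where each point is replicated w[n] times. Here is a quick NumPy sanity check of that equivalence on hypothetical data (all names and values below are made up for illustration; `bernoulli_logit_lpmf` is reimplemented in NumPy to mirror Stan's):

```python
import numpy as np

def bernoulli_logit_lpmf(y, logit_p):
    # log p(y | logit_p) = y * logit_p - log(1 + exp(logit_p)), computed stably
    return y * logit_p - np.logaddexp(0.0, logit_p)

rng = np.random.default_rng(0)
N, d = 5, 3
x = rng.normal(size=(N, d))
y = rng.integers(0, 2, size=N)
w = np.array([1, 3, 2, 1, 4])            # integer multiplicities
theta0, theta = 0.5, rng.normal(size=d)  # arbitrary fixed parameter values
logits = theta0 + x @ theta

# weighted log-likelihood, as in the Stan model's target += statement
ll_weighted = np.sum(w * bernoulli_logit_lpmf(y, logits))

# the same quantity from explicitly replicated data
x_rep = np.repeat(x, w, axis=0)
y_rep = np.repeat(y, w)
ll_replicated = np.sum(bernoulli_logit_lpmf(y_rep, theta0 + x_rep @ theta))

assert np.isclose(ll_weighted, ll_replicated)
```

Since the two log-likelihoods agree for any fixed parameter values, the weighted model targets the same posterior as the replicated-data model.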

If you just want to reduce the number of likelihood evaluations, aggregating the data into sufficient statistics is a different, and probably the best, way to go.

If you have the following data block

data {
  int<lower=0> N_unique;          // number of unique rows in x
  int<lower=0> d;
  matrix[N_unique, d] x;
  int<lower=0> U[N_unique];       // number of cases for each unique row of x
  int<lower=0> Y[N_unique];       // number of cases for each unique row of x with value 1
}
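Collapsing raw data into these sufficient statistics can be done with `np.unique`. A sketch, assuming the raw inputs live in hypothetical arrays `x_raw` and `y_raw`:

```python
import numpy as np

# hypothetical raw data: rows of x with duplicates, binary outcomes y
x_raw = np.array([[0., 1.], [1., 0.], [0., 1.], [1., 0.], [0., 1.]])
y_raw = np.array([1, 0, 0, 1, 1])

# collapse to unique rows; inverse maps each raw row to its unique-row index
x_unique, inverse = np.unique(x_raw, axis=0, return_inverse=True)
inverse = inverse.ravel()  # guard against shape differences across NumPy versions
N_unique = x_unique.shape[0]
U = np.bincount(inverse, minlength=N_unique)  # cases per unique row
Y = np.bincount(inverse, weights=y_raw, minlength=N_unique).astype(int)  # successes per row
```

Here `U` counts how often each unique row appears and `Y` counts how many of those cases have outcome 1, which is exactly what the data block above expects.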

You should be able to use the binomial distribution for your likelihood:

model {
  theta0 ~ normal(0, 1);
  theta ~ normal(0, 1);
  target += binomial_logit_lpmf(Y | U, theta0 + x * theta);
}
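Note that the binomial log-pmf differs from the sum of the corresponding Bernoulli terms only by the log binomial coefficient, which is constant in the parameters, so the posterior is unchanged. A small NumPy/`math` check of that identity for a single unique row (the logit `a` and counts below are arbitrary illustrative values):

```python
import math
import numpy as np

def bernoulli_logit_lpmf(y, a):
    return y * a - np.logaddexp(0.0, a)

def binomial_logit_lpmf(Y, U, a):
    # log C(U, Y) + Y * a - U * log(1 + exp(a))
    return math.log(math.comb(U, Y)) + Y * a - U * np.logaddexp(0.0, a)

a = 0.7       # hypothetical logit theta0 + x * theta for one unique row
U, Y = 5, 3   # 5 cases, 3 of them with outcome 1

# sum of the per-case Bernoulli terms: Y ones and U - Y zeros
bern_sum = Y * bernoulli_logit_lpmf(1, a) + (U - Y) * bernoulli_logit_lpmf(0, a)

# the binomial lpmf equals that sum plus the constant log C(U, Y)
assert np.isclose(binomial_logit_lpmf(Y, U, a), bern_sum + math.log(math.comb(U, Y)))
```

Because the extra term does not depend on theta0 or theta, both formulations give the same posterior; the aggregated version just evaluates far fewer likelihood terms.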