Cmdstanr slow to return results and influenced by internet connection

I am running the following code and experiencing a long delay after sampling completes. The sampling completes in less than 2 seconds, but the R call does not return for many seconds after that. How long it takes to return seems to depend on whether the computer is connected to the internet.

Operating System: Windows 11
Interface Version: 2.36 installed using cmdstanr
Compiler/Toolkit: RTools 4.4

library(cmdstanr)
file <- file.path(cmdstan_path(), "examples", "bernoulli", "bernoulli.stan")
mod <- cmdstan_model(file)
system.time({
  data_list <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))
  
  fit <- mod$sample(
    data = data_list,
    seed = 123,
    chains = 4,
    parallel_chains = 4,
    refresh = 500
  )
})

Internet Connection | Run Time
Yes                 | 76 sec
No                  | 6 sec

I work remotely, and my company uses Zscaler for network security and VPN. When I run the same code on my personal (non-corporate) computer, it completes in just 2–3 seconds.

This behavior is quite unusual. Is there any reason why sampling would require an internet connection? Could there be telemetry or background data being sent during execution? If so, is there a way to disable it?

Any insight into what might be happening behind the scenes would be greatly appreciated, especially to help our security team determine how best to allow necessary connections without introducing risk. Thank you!

Stan does not collect telemetry. Is it possible that, when connected to the internet, your computer is sending files to your company to be scanned before allowing them to be opened? I’ve never heard of antivirus protection that works this way, but it seems like something that could exist.
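One rough way to check where the time goes is to compare the wall-clock time R sees for `mod$sample()` with the run time CmdStan reports for the chains, and then to time reading the output CSV files separately. The sketch below assumes the same bernoulli example as in the original post; `time_step` and `check_bernoulli` are hypothetical helper names introduced here, while `fit$time()` and `fit$draws()` are existing cmdstanr methods. If the gap between the two numbers grows when the machine is online, that would point at something outside Stan (such as file scanning) rather than the sampler.

```r
# Hypothetical helper: evaluate a step and report its wall-clock time in seconds.
time_step <- function(label, expr) {
  # `expr` is evaluated lazily, so the work happens inside system.time()
  elapsed <- system.time(value <- expr)[["elapsed"]]
  message(sprintf("%-26s %6.2f s", label, elapsed))
  invisible(list(value = value, elapsed = elapsed))
}

# Hypothetical wrapper around the example from the original post.
# Requires cmdstanr and a working CmdStan installation when called.
check_bernoulli <- function() {
  file <- file.path(cmdstanr::cmdstan_path(),
                    "examples", "bernoulli", "bernoulli.stan")
  mod <- cmdstanr::cmdstan_model(file)
  data_list <- list(N = 10, y = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1))

  # Total time as seen from R, including launching processes and file access
  run <- time_step("mod$sample() seen from R", mod$sample(
    data = data_list, seed = 123,
    chains = 4, parallel_chains = 4, refresh = 500
  ))
  fit <- run$value

  # Slowest chain's total time as reported in the CmdStan output
  chain_time <- max(fit$time()$chains$total)
  message(sprintf("%-26s %6.2f s", "slowest chain (CmdStan)", chain_time))
  message(sprintf("%-26s %6.2f s", "time outside the sampler",
                  run$elapsed - chain_time))

  # Reading the output CSVs; this may be fast if they were already read
  time_step("fit$draws()", fit$draws())
  invisible(fit)
}
```

Calling `check_bernoulli()` once with the network connected and once without should show whether the extra time falls outside the sampler itself.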


This seems likely @WardBrian – just found https://help.zscaler.com/itdr/about-endpoint-agents

Could be the endpoint agent is sending data back to the company, or I suppose that the endpoint agent is only active when there’s an internet connection.
