I hadn’t tried that. When X2 is built in the transformed data block, GPU utilization does pick up (output below). However, at least in my application this won’t work, since X needs to be built from model parameters.
```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.130      Driver Version: 418.130      CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID P100-8Q         On  | 00000000:02:04.0 Off |                  N/A |
| N/A   N/A    P0    N/A /  N/A |    819MiB /  8192MiB |     28%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     24541      C   ./normal_glm                                 291MiB |
+-----------------------------------------------------------------------------+
```
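Roughly, the setup I’m comparing looks like the sketch below. This is not my actual model: the `append_col`/`square` construction of X2 is just a placeholder, and I’m assuming a `normal_id_glm` likelihood based on the `./normal_glm` process name above.

```stan
data {
  int<lower=0> N;
  int<lower=0> K;
  matrix[N, K] X;
  vector[N] y;
}
transformed data {
  // X2 built once, from data only: the case where GPU utilization picks up.
  // The append_col/square transform is a stand-in for the real construction.
  matrix[N, 2 * K] X2 = append_col(X, square(X));
}
parameters {
  real alpha;
  vector[2 * K] beta;
  real<lower=0> sigma;
}
model {
  // GLM likelihood, as in the normal_glm example.
  y ~ normal_id_glm(X2, alpha, beta, sigma);
}
```

In my real model the analogue of X2 depends on parameters, so it would have to be assembled in transformed parameters (or the model block) and rebuilt at every leapfrog step rather than once before sampling.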