Compiling Flatiron Server Tests Locally

Hi,

I’ve switched my compiler to gcc 9, which looks like what’s used on the Flatiron servers, but when I re-run the tests locally I don’t get the same errors, either when compiling or when running them.

Which compiler and settings are we using on the servers? Here’s my gcc -v and lshw output from WSL on Windows 11. I’m asking so that I’m not wasting resources and can compile locally instead of guessing.

What am I doing wrong?

Is there some config I’m missing? I made sure the dependencies were the same as in the stan/math docs, but I can re-check.



andre@compy:~$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-linux-gnu/13/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 13.3.0-6ubuntu2~24.04' --with-bugurl=file:///usr/share/doc/gcc-13/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-13 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/libexec --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-libstdcxx-backtrace --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-13-fG75Ri/gcc-13-13.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-fG75Ri/gcc-13-13.3.0/debian/tmp-gcn/usr --enable-offload-defaulted --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-serialization=2
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.3.0 (Ubuntu 13.3.0-6ubuntu2~24.04)

And then:

andre@compy:~$ lshw
WARNING: you should run this program as super-user.
compy
    description: Computer
    width: 64 bits
    capabilities: smp vsyscall32
  *-core
       description: Motherboard
       physical id: 0
     *-memory
          description: System memory
          physical id: 0
          size: 31GiB
     *-cpu
          product: 13th Gen Intel(R) Core(TM) i9-13900H
          vendor: Intel Corp.
          physical id: 1
          bus info: cpu@0
          version: 6.186.2
          width: 64 bits
          capabilities: fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp x86-64 constant_tsc rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves avx_vnni vnmi umip waitpkg gfni vaes vpclmulqdq rdpid movdiri movdir64b fsrm md_clear serialize flush_l1d arch_capabilities
          configuration: microcode=4294967295
     *-scsi
          description: SCSI storage controller
          product: Virtio 1.0 console
          vendor: Red Hat, Inc.
          physical id: 2
          bus info: pci@5582:00:00.0
          version: 01
          width: 64 bits
          clock: 33MHz
          capabilities: scsi bus_master cap_list
          configuration: driver=virtio-pci latency=64
          resources: iomemory:90-8f iomemory:90-8f iomemory:90-8f irq:0 memory:9ffe00000-9ffe00fff memory:9ffe01000-9ffe01fff memory:9ffe02000-9ffe02fff
        *-virtio0 UNCLAIMED
             description: Virtual I/O device
             physical id: 0
             bus info: virtio@0
             configuration: driver=virtio_console
     *-display:0
          description: 3D controller
          product: Basic Render Driver
          vendor: Microsoft Corporation
          physical id: 3
          bus info: pci@7446:00:00.0
          version: 00
          width: 32 bits
          clock: 33MHz
          capabilities: bus_master cap_list
          configuration: driver=dxgkrnl latency=0
          resources: irq:0
     *-generic
          description: System peripheral
          product: Virtio file system
          vendor: Red Hat, Inc.
          physical id: 4
          bus info: pci@8c7a:00:00.0
          version: 01
          width: 64 bits
          clock: 33MHz
          capabilities: bus_master cap_list
          configuration: driver=virtio-pci latency=64
          resources: iomemory:e0-df iomemory:e0-df iomemory:c0-bf irq:0 memory:e00000000-e00000fff memory:e00001000-e00001fff memory:c00000000-dffffffff
        *-virtio1 UNCLAIMED
             description: Virtual I/O device
             physical id: 0
             bus info: virtio@1
             configuration: driver=virtiofs
     *-display:1
          description: 3D controller
          product: Basic Render Driver
          vendor: Microsoft Corporation
          physical id: 5
          bus info: pci@d768:00:00.0
          version: 00
          width: 32 bits
          clock: 33MHz
          capabilities: bus_master cap_list
          configuration: driver=dxgkrnl latency=0
          resources: irq:0
     *-pnp00:00
          product: PnP device PNP0b00
          physical id: 6
          capabilities: pnp
          configuration: driver=rtc_cmos
  *-network
       description: Ethernet interface
       physical id: 1
       logical name: eth0
       serial: 00:15:5d:9e:0c:12
       size: 10Gbit/s
       capabilities: ethernet physical
       configuration: autonegotiation=off broadcast=yes driver=hv_netvsc driverversion=6.6.87.2-microsoft-standard-WSL duplex=full firmware=N/A ip=172.23.49.202 link=yes multicast=yes speed=10Gbit/s
WARNING: output may be incomplete or inaccurate, you should run this program as super-user.

The Jenkinsfiles we use should all contain Docker image references that you can pull, so you can run in the exact same environment (assuming you aren’t trying to run the GPU tests).

I’m just talking to myself.

I figured this out. The performance can vary drastically depending on how you implement it, and the change also has to compile across the entire library, not just for exp. So sometimes I end up making it slower. For example, if I pass by reference and make the output mutable, it seems to slow down. But I have to check that again.
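To make the "pass by reference with a mutable output" comparison concrete, here is a minimal sketch of how the two calling styles could be timed side by side. It is not Stan Math code: `exp_by_value`, `exp_into`, and `time_mean_seconds` are hypothetical names invented for this example, and a plain `std::vector<double>` stands in for whatever container the real function takes.

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: allocates and returns a fresh result (return by value).
std::vector<double> exp_by_value(const std::vector<double>& x) {
  std::vector<double> out(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    out[i] = std::exp(x[i]);
  }
  return out;
}

// Hypothetical helper: writes into a caller-owned, mutable output.
void exp_into(const std::vector<double>& x, std::vector<double>& out) {
  out.resize(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    out[i] = std::exp(x[i]);
  }
}

// Calls f() reps times and returns the mean wall-clock seconds per call.
template <typename F>
double time_mean_seconds(F&& f, int reps) {
  const auto start = std::chrono::steady_clock::now();
  for (int r = 0; r < reps; ++r) {
    f();
  }
  const std::chrono::duration<double> elapsed =
      std::chrono::steady_clock::now() - start;
  return elapsed.count() / reps;
}
```

Calling `time_mean_seconds([&] { exp_into(x, out); }, reps)` and the same for `exp_by_value` on identical inputs gives two numbers to compare. An optimizing compiler may drop work whose result is never read, so the output should be used afterwards (for example, summed and printed) for the timings to mean anything.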

This is mostly rubber-ducking, I’m talking to myself. Obviously it’s not worth merging if it makes things slower. But the effect may be cumulative if it’s applied to more functions in the library? Not sure, I’ve never tried this before. I need to read a book so my decisions are more informed…

But before trying to get a change to the whole library merged, I may just benchmark some additional functions in Stan math and see how this simple implementation performs when it’s embedded in a composite function. No idea. There seemed to be a sweet spot before, in the number of threads per dataset size, but I made changes and made it slower. A video game developer would probably know. They weren’t doing this much for numerical computing or finance. No idea. Is it lower level than what’s required?
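One way to look for that threads-per-dataset-size sweet spot is to sweep both values over a small grid and time each combination. The sketch below shows only the part being swept: an elementwise exp split into contiguous chunks, one per thread. `parallel_exp` is a hypothetical name, plain `std::thread` is an assumption made for this example, and none of it reflects how Stan Math actually parallelizes.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: applies std::exp elementwise, splitting x into
// contiguous chunks with at most one chunk per thread.
void parallel_exp(const std::vector<double>& x, std::vector<double>& out,
                  unsigned n_threads) {
  out.resize(x.size());
  n_threads = std::max(1u, n_threads);  // treat 0 as "one thread"
  const std::size_t n = x.size();
  const std::size_t chunk = (n + n_threads - 1) / n_threads;  // ceil(n / threads)
  std::vector<std::thread> workers;
  for (unsigned t = 0; t < n_threads; ++t) {
    const std::size_t begin = std::min(n, t * chunk);
    const std::size_t end = std::min(n, begin + chunk);
    if (begin == end) {
      break;  // no elements left for this or any later thread
    }
    // Each worker writes to a disjoint range of out, so no locking is needed.
    workers.emplace_back([&x, &out, begin, end] {
      for (std::size_t i = begin; i < end; ++i) {
        out[i] = std::exp(x[i]);
      }
    });
  }
  for (auto& w : workers) {
    w.join();
  }
}
```

Timing this with `std::chrono::steady_clock` for, say, 1, 2, 4, and 8 threads across several input sizes would show where thread start-up cost stops being paid back. On small inputs the single-thread case often wins, because creating and joining threads costs more than the arithmetic itself.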