Model base class & compile time speedup PRs ready to review

Yes, I had to work around the same issue in the CMakefile monorepo branch. I fixed this by editing generate_model_typedef.hpp, and generate_program_reader_fun.hpp to declare the appropriate variables static (new_model and program_reader). However, I think a clean header/source separation would be best; but I’m not sure if that’s an option.

static or inline?

For now, just static.

It’s not really a permanent solution; to make them actually unique per translation unit you need to wrap it in a template instantiation, or C++17 support for static inline variables.

How about an unnamed namespace?

I don’t remember exactly why I couldn’t use an unnamed namespace, but it had something to do with allocation functions not being allowed to be defined in named scopes.

(I did not spend a ton of time investigating this problem, I tried including everything in an anonymous namespace and that didn’t work; it’s possible that a more fine-grained approach to namespacing would succeed. Adding static was just the quickest way to get unit tests to compile.)

The correct way to force C++11 static inline is with what I’ve done in my fix of https://github.com/alashworth/stan-monorepo/blob/develop/tests/math_unit/math/prim/arr/functor/mock_throwing_ode_functor.hpp#L6

esp wrt mock_throwing_ode_functor_count.

PS: has anyone had any issues running the tests in expl_leapfrog_test.cpp?

How is this new_model function even being used? I don’t see where it is called.

The new_model function is being used in CmdStan. A quick fix would be to just remove that string from the file. It’s always the same and always at the end.

A longer-term solution might be to move the new_model function to its own file, which could be compiled along with the model file for CmdStan. Then RStan could just use the model file, which would not have the new_model function in it.

I don’t think it’s sufficient to declare the new_model function inline, because the definitions won’t be the same across translation units.

The quick fix would still entail doing PRs for 35 packages to strip out the new_model lines from the newly generated C++ code. Adding the inline keyword seems to work, although your reasoning for why it should not work makes sense to me. Alternatively, could we enclose it in

#ifndef USING_R
...
#endif

When we fix the Stan side, it’ll be easy to separate out into a separate file.

Inserting an ifdef would be even easier. Is USING_R always defined for anything on the RStan side?

Also, didn’t mean to imply this would take long. It’s more just an issue of when we could do a patch release or at least something you could work from. I could create the PR to generate the ifdefs in under an hour.

The R.h file has these lines at the top

#ifndef USING_R
# define USING_R
#endif

so I believe it would be defined for anything someone could do in R using Stan.

I still am perplexed by this question. I tried to make a file called model_base.cpp that just contains

#include <stan/model/model_base.hpp>

// forward declaration for function defined in another translation unit
stan::model::model_base& new_model(stan::io::var_context& data_context,
                                   unsigned int seed,
                                   std::ostream* msg_stream);

and I can compile that. But when I do nm libStanHeaders.a | grep -F "stan" to look at what it puts into the shared library, it is only

0000000000000000 W _ZN4stan4math11stack_allocC2Em
0000000000000000 W _ZN4stan4math11stack_allocD2Ev
0000000000000000 b _ZN4stan4math12_GLOBAL__N_126global_stack_instance_initE
0000000000000000 V _ZN4stan4math22AutodiffStackSingletonINS0_4variENS0_15chainable_allocEE9instance_E
0000000000000000 W _ZN4stan4math22AutodiffStackSingletonINS0_4variENS0_15chainable_allocEED2Ev
0000000000000000 W _ZN4stan4math8internal25eight_byte_aligned_mallocEm
0000000000000010 b _ZN4stan4mathL10INV_SQRT_2E
0000000000000070 b _ZN4stan4mathL11LOG_EPSILONE
0000000000000050 b _ZN4stan4mathL11LOG_SQRT_PIE
0000000000000038 b _ZN4stan4mathL11SQRT_TWO_PIE
0000000000000040 b _ZN4stan4mathL15INV_SQRT_TWO_PIE
0000000000000028 b _ZN4stan4mathL16POISSON_MAX_RATEE
0000000000000068 b _ZN4stan4mathL19NEG_LOG_SQRT_TWO_PIE
0000000000000018 b _ZN4stan4mathL5LOG_2E
0000000000000020 b _ZN4stan4mathL6LOG_10E
0000000000000048 b _ZN4stan4mathL6LOG_PIE
0000000000000008 b _ZN4stan4mathL6SQRT_2E
0000000000000030 b _ZN4stan4mathL7SQRT_PIE
0000000000000060 b _ZN4stan4mathL8LOG_HALFE
0000000000000058 b _ZN4stan4mathL8LOG_ZEROE

as compared to CmdStan’s main.o file, which has 1538 lines that are turned up by the above grep. I feel as if more needs to be instantiated but you can’t instantiate from an abstract class.

By itself, model_base.hpp is abstract in the sense that it doesn’t provide an implementation. That’s why the new_model class is code generated to return an instance of the specific compiled model class (defined by typedef). The compiled model class is an extension of model_base_crtp, which is itself an extension of model_base. We need all three in a row to have virtual functions and backwards compatibility—it’s a complicated CRTP pattern to hack around the fact that template methods (template non-static class functions) can’t be virtual.

For reference, the code generated in the top-level namespace in 2.20 and current state of develop is:

#include <stan/model/model_header.hpp>

... model class definition ...

typedef foo_model_namespace::foo_model stan_model;

stan::model::model_base& new_model(
        stan::io::var_context& data_context,
        unsigned int seed,
        std::ostream* msg_stream) {
  stan_model* m = new stan_model(data_context, seed, msg_stream);
  return *m;
}

Great. I just created an issue to put in this simple fix. That’ll be a lot easier than managing a second file. See

I’ll submit the PR.

Was that a complicated way of saying I should #include <stan/model/model_base_crtp.hpp> instead of #include <stan/model/model_base.hpp> before the forward declaration? If so, it does not seem to have helped.

Afraid not. I meant that you need to compile in two translation units:

  • Services translation unit: This will #include <stan/model/model_base.hpp> and compile object code for the services, which can be instantiated with stan::model::model_base and the RNG we’re using. The result is a compiled services object.

  • Model translation unit: This defines a specific class extending stan::model::model_base, which itself will be defined by a new class class foo that extends stan::model::model_base_crtp<foo>). It will also need to provide some generic some way to get a model instance to the services translation unit.

In CmdStan, the services translation unit implements a main() that calls the function ::new_model(). So the services translation unit includes the header for new_model but not the definition. The model translation unit defines the model class foo and defines new_model to return an instance of foo. When they’re linked, you have a whole program with a definition of new_model returning an instance of foo and the command using that to execute.

OK, I have one line of the “services translation unit”, namely

#include <stan/model/model_base.hpp>

I am confused about the remaining lines. I think to instantiate, I would need something like

// forward declaration for function defined in another translation unit
stan::model::model_base& new_model(stan::io::var_context& data_context,
                                   unsigned int seed,
                                   std::ostream* msg_stream);
stan::model::model_base& model = new_model(?, ??, ???);

but I don’t know what to insert for ?, ??, or ??? at the time StanHeaders is compiled (i.e. before there is any actual Stan program). Can I just make ? an empty var_context, ?? be 0, and ??? be nullptr?

Using an empty var_context did not seem to work, since the compiler complains about unimplemented virtual methods

clang++-8 -Wno-unknown-pragmas -std=gnu++14 -I"/home/ben/r-devel/include" -DNDEBUG -DNO_FPRINTF_OUTPUT -I"../inst/include" -I"../inst/include/src" -include stan_sundials_printf_override.hpp -I"/home/ben/r-devel/library/RcppEigen/include" -I/usr/local/include   -fpic  -mtune=native -march=native -O3 -g0 -c model_base.cpp -o model_base.o
model_base.cpp:7:44: error: allocating an object of abstract class type 'stan::io::var_context'
stan::model::model_base& model = new_model(stan::io::var_context(), 0, nullptr);
                                           ^
../inst/include/src/stan/io/var_context.hpp:42:20: note: unimplemented pure virtual method 'contains_r' in 'var_context'
      virtual bool contains_r(const std::string& name) const = 0;
                   ^
../inst/include/src/stan/io/var_context.hpp:56:35: note: unimplemented pure virtual method 'vals_r' in 'var_context'
      virtual std::vector<double> vals_r(const std::string& name) const = 0;
                                  ^
../inst/include/src/stan/io/var_context.hpp:66:35: note: unimplemented pure virtual method 'dims_r' in 'var_context'
      virtual std::vector<size_t> dims_r(const std::string& name) const = 0;
                                  ^
../inst/include/src/stan/io/var_context.hpp:76:20: note: unimplemented pure virtual method 'contains_i' in 'var_context'
      virtual bool contains_i(const std::string& name) const = 0;
                   ^
../inst/include/src/stan/io/var_context.hpp:86:32: note: unimplemented pure virtual method 'vals_i' in 'var_context'
      virtual std::vector<int> vals_i(const std::string& name) const = 0;
                               ^
../inst/include/src/stan/io/var_context.hpp:96:35: note: unimplemented pure virtual method 'dims_i' in 'var_context'
      virtual std::vector<size_t> dims_i(const std::string& name) const = 0;
                                  ^
../inst/include/src/stan/io/var_context.hpp:104:20: note: unimplemented pure virtual method 'names_r' in 'var_context'
      virtual void names_r(std::vector<std::string>& names) const = 0;
                   ^
../inst/include/src/stan/io/var_context.hpp:112:20: note: unimplemented pure virtual method 'names_i' in 'var_context'
      virtual void names_i(std::vector<std::string>& names) const = 0;
                   ^
1 error generated.

OK, apparently we have an actual class called empty_var_context and this compiles but still does not seem to compile in the algorithms and stuff:

#include <stan/io/empty_var_context.hpp>
#include <stan/model/model_base.hpp>

// forward declaration for function defined in another translation unit
stan::model::model_base& new_model(stan::io::var_context& data_context,
                                   unsigned int seed,
                                   std::ostream* msg_stream);
stan::io::empty_var_context evc = stan::io::empty_var_context();
stan::model::model_base& model = new_model(evc, 0, nullptr);