RStan3 and PyStan3 Interface

Hi,

there is The Zen of Python - PEP 20, but it’s probably not what you are looking for.

On classes and such, see https://docs.python.org/3/tutorial/classes.html#class-objects

  • PyStan is generating a class object MyStanProgram.
  • To create an instance of this class object (program) one calls it with parentheses and parameters (optional). In our case data should not be optional? It is quite common to pass parameters to an instance when the instance is created. Class attributes or functions should not be called before an instance of the class is created.
  • Creating a fit object is consistent with many python packages. In scikit-learn everything goes to the same object instance. I would not pollute the program object with results.

If we wanted to add an extra step for .condition, it should come after the program has been created. But this is not really pythonic and is error-prone.

MyStanProgram = pystan.compile("parameters{simplex[3] alpha;}") # create a class object
program = MyStanProgram() # create a class instance
program.condition(data) # insert data
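A minimal sketch of the "compile returns a class object" idea, assuming a hypothetical compile_program function standing in for the proposed pystan.compile (none of these names are actual PyStan API). It makes data a required constructor argument, which is one way to avoid the separate .condition step:

```python
def compile_program(code):
    """Hypothetical stand-in for the proposed pystan.compile.

    Returns a class object, not an instance.
    """
    class MyStanProgram:
        program_code = code  # class attribute, shared by all instances

        def __init__(self, data):
            # Data is required, so an instance cannot exist without it.
            self.data = data

    return MyStanProgram


MyStanProgram = compile_program("parameters{simplex[3] alpha;}")  # class object
program = MyStanProgram(data={})                                  # class instance
```

With this layout, calling MyStanProgram() without data fails immediately with a TypeError instead of leaving a half-initialized object around.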

I was talking to @seantalts about this today and he told me that the reason foo(data) would be preferred over foo.condition(data) is because foo() only did one thing, which is to map data into a posterior. And then the next step is to take that posterior and sample or optimize, but there are multiple such methods, so it’s pythonic to use methods there.

Nobody had pointed that out. Is that also why you think foo.condition(data) is not pythonic? That is, would foo.condition(data) be OK if there were other methods defined for the foo object?

Every post I found on what’s “pythonic” just told me not to write loops and to use tuple returns rather than passing in mutable references. Someone sent me a pointer to Effective Python, but I haven’t had a chance to dive in yet.

Also, I wasn’t saying we should do program.condition(data), which wouldn’t do anything, but rather posterior = program.condition(data). I wasn’t suggesting making the program mutable, but rather treating it like a “factory” (though @seantalts also said Python people don’t like talking about patterns as such).

Now @bgoodri was suggesting exactly making the program mutable in this way (a kind of builder pattern, I might add), with a final foo.build(random_seed) to get the transformed data. He was arguing that would make the most sense to an R user.
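To make the two proposals concrete, here is a rough sketch of the non-mutating factory next to the mutating builder. All class and method names are hypothetical illustrations, not PyStan or RStan API, and the Posterior class is only a placeholder:

```python
class Posterior:
    """Placeholder for the instantiated object (hypothetical)."""
    def __init__(self, code, data, seed=None):
        self.code, self.data, self.seed = code, data, seed


class FactoryProgram:
    """Factory style: condition() returns a new object; the program is unchanged."""
    def __init__(self, code):
        self.code = code

    def condition(self, data):
        return Posterior(self.code, data)


class BuilderProgram:
    """Builder style: condition() stores data on the program; build() finishes."""
    def __init__(self, code):
        self.code = code
        self.data = None

    def condition(self, data):
        self.data = data  # mutates the program
        return self       # returning self allows chaining

    def build(self, random_seed):
        if self.data is None:
            raise ValueError("call condition(data) before build()")
        return Posterior(self.code, self.data, seed=random_seed)


program = FactoryProgram("model code")
posterior = program.condition({"N": 3})

builder = BuilderProgram("model code")
built = builder.condition({"N": 3}).build(random_seed=12345)
```

The builder needs the runtime check in build() because nothing stops a caller from skipping condition(); the factory version has no such intermediate state.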

So it looks like there’s no way to make both R and Python users happy with an interface.

Among a list of proposals that would not make very much sense to an R user, I don’t think I am that opposed to

joint <- stan_compile("*.stan")
posterior <- joint(data = list(...), seed = 12345)
output <- posterior$sample()

But joint being a class definition is not going to be too familiar to R users, who will wonder why you can’t cut directly to posterior with

posterior <- stan_compile("*.stan", data = list(...), seed = 12345)
output <- posterior$sample()

I think Python users might have exactly the same response.

What is the argument against something roughly like what Ben mentions:

posterior <- stan_compile("*.stan", data = list(...))

I thought the foo.build in R would have also returned a new object (and not mutated foo)?

I’m not opposed to shortcuts, including the all-in-one

posterior <- stan("foo.stan", data = ..., seed= ...);

as we have now.

The advantage of breaking it down is that we have the right object for the posterior that lets you compute log densities, derivatives and transforms. What I don’t like is trying to do that through the output object. Instead, I’d rather extract that object out of the output.

But again, I’m totally clueless about R conventions, which I think are more important than my object-oriented scruples inherited from C++ and Java.

Hi!

Are we also planning to add features like executing the transformed data block separately?

I.e. it would be really useful if I could compile the Stan model, load my data, then transform the data and obtain the transformed data in R.

The intended use case here is to be able to use the exported Stan functions with the transformed data.

I am happy to move this to a new thread if this is more of a new feature thing and deemed off topic here.

Sebastian

I don’t think this needs to be a Stan3 feature b/c it doesn’t break anything. It’s a cool feature but we can add it anytime we want.

I don’t mean to dismiss your question, I just want to keep the Stan3 idea from growing into a blob of “stuff we want”.

Yes, we absolutely need to support running generated quantities after the fact. What’s needed is a round-trip mechanism to read in parameters from the output, so the design is relevant for this discussion.

This has sort of devolved into a PyStan and RStan version 3 discussion, so I’ll chime in with comments from @seantalts on pythonicism.

The reason he thought the compiled model should behave as a functor over the parameters is that there’s nothing else the compiled model does. Therefore, it should act like a simple function. I was thinking it was something deeper about when to use functors, constructors, or factory methods, etc. So here are the alternatives, none of which is more or less pythonic in general, just in the context of the functionality of that posterior object (really the compiled object code in some linkable form).

joint = StanModel("foo.stan")
1.  posterior = joint(data)             # functor
2.  posterior = Posterior(joint, data)  # constructor
3.  posterior = joint.condition(data)   # factory method
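For comparison, the three alternatives can all be supported by one small pair of classes. This is only a sketch: StanModel and Posterior here are hypothetical stand-ins that do no compilation or sampling, and sample() just returns a description string.

```python
class Posterior:
    """Pairs a compiled program with data (hypothetical placeholder)."""
    def __init__(self, joint, data):  # alternative 2: constructor
        self.joint = joint
        self.data = dict(data)  # copy, so later edits to the caller's dict do not leak in

    def sample(self):
        # Placeholder for a real sampler call.
        return f"draws from {self.joint.path} given {sorted(self.data)}"


class StanModel:
    """Stand-in for the compiled program (hypothetical placeholder)."""
    def __init__(self, path):
        self.path = path

    def condition(self, data):  # alternative 3: factory method
        return Posterior(self, data)

    def __call__(self, data):  # alternative 1: functor
        return self.condition(data)


joint = StanModel("foo.stan")
data = {"N": 3, "y": [1, 2, 3]}

p1 = joint(data)             # functor
p2 = Posterior(joint, data)  # constructor
p3 = joint.condition(data)   # factory method
```

All three calls produce equivalent Posterior objects and leave joint untouched, so the choice is about how the call reads rather than what it can do.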

And I have to say that even though @seantalts told me that pattern-speak wasn’t considered acceptable in Python, every discussion from the senior Python devs (like the ones who maintain the language) is riddled with pattern speak. I just watched this great video which is all about why you want to build a pythonic adapter pattern rather than accessing a non-pythonic lib in this

My favorite quote (and takeaway message) was “PEP-8 unto thyself, do not PEP-8 unto others.” I’m going to try to take that more to heart in code reviews in C++. It also implicitly went through a number of Python idioms without getting all preachy about it (OK, it was super preachy, but not in a condescending sort of way); it drove me crazy that nobody saw the typo on "commit" he introduced and apparently I wasn’t the only one judging from the comments on YouTube.

More that patterns are often making up for language deficiencies and if you try to translate them directly into Python they don’t always make for good Python. This was also 2008-2012 at small startups, so our priorities were probably a little different than this consultant’s :p

The consultant in question, Raymond Hettinger, has also been a core Python language maintainer and developer for over a decade, so it’s not just some guy who walked in off the street.

In the talk, his goal was to wrap a non-pythonic Python interface to make it more pythonic. The whole point was that you didn’t want to follow the try/open/catch pattern of Java in writing the same interface in Python. But given he didn’t write the underlying code, he had to adapt the Java-like interface to make it more pythonic.

So I think his goal was very much like our goal in wrapping Stan in a Python interface.

Do you think the Python scolds would be more forgiving if he hadn’t said “adapter”?

P.S. The Effective Python book I was glancing through to try to learn what “pythonic” means is also full of pattern speak.

I just meant that he has different business objectives as a (long-term) consultant than the startups that were trying to ship products. I know who he is.

I’ll watch the talk so we can be on the same page.

Can you give me some examples of pattern-speak? I got the book and don’t see anything in the TOC but I’m now wondering if we just have different ideas about what constitutes “pattern-speak.” Which item(s)?

This might be our central misunderstanding re: pattern-speak - “adapter” is a completely marvelous English word that immediately conveys what the code is trying to do. I think to the extent that GoF improved upon its English definition it was perhaps to add an intermediate interface, which Python doesn’t have a mechanism for expressing. So while it might be completely Pythonic to write an adapter (English usage) in situations that call for them, I would say that referring to the GoF capital-A Adapter pattern is not likely to help much.

I just watched the talk and it all seems like the Python advice I would have given, so I take back what I said about him being a consultant vs. a startup mentality! I quite liked the talk and think all of the points were great. I also didn’t see what looks to me like “pattern-speak!” So maybe it’s just what I wrote about above; using “Adapter” in its English-language version and Pythonistas not having much use for GoF on top of that.

The way he talks about it is also the way I think about it; e.g. with the iterable part of the API - it has a getSize, which we call length, and it has an item by index, so it’s indexable, and anything with a __len__ and a __getitem__ can be iterated over - “walk like a duck, talk like a duck, treat it like a duck.” So maybe that’s what GoF would call “Iterator” and Pythonistas just call something you can iterate over (or “iterable” for short - though there’s no explicit interface it’s inheriting from - you even have your choice of which special __ methods you want to implement in order to be able to use it wherever iterables are desired).
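The duck-typing point can be shown in a few lines. The Draws class below is a made-up example, not taken from the talk or from PyStan; it inherits from nothing and implements only __len__ and __getitem__:

```python
class Draws:
    """Hypothetical container exposing only __len__ and __getitem__."""
    def __init__(self, values):
        self._values = list(values)

    def __len__(self):
        return len(self._values)

    def __getitem__(self, index):
        # An IndexError past the end is what tells a for loop to stop.
        return self._values[index]


draws = Draws([0.1, 0.5, 0.4])

total = sum(d for d in draws)      # plain iteration works
as_list = list(draws)              # so does anything that consumes an iterable
backwards = list(reversed(draws))  # reversed() uses __len__ and __getitem__
```

Python falls back to calling __getitem__ with 0, 1, 2, ... when no __iter__ is defined, which is why no explicit interface is needed here.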

So maybe what I was trying to express is more that Python is about duck-typing and using what I would consider to be mostly English names for these ideas, rather than subscribing to the GoF recipes that often don’t make sense as written in Python? In any case, soon you will be indoctrinated enough by these Python talks and books to be able to understand my poorly-worded explanations :P

Speaking of languages, I am not sure about the English either. Bob is referring to the uninstantiated C++ thing as joint and the instantiated C++ as posterior, but it is not as if you can draw from joint or anything like that. Also, it seems that in order to think of the first thing as a joint distribution, you have to stop distinguishing between what is in the data block and what is in the parameters block (a la BUGS).

It sounds like Python integrates patterns into the basic data structures in the language to facilitate their use even by those who don’t understand patterns. This would also explain the awkward learning curve for Python as those patterns aren’t described so new users have to, punnily enough, “pattern match” to figure out the flow of the language.

I don’t know what we should call those variables. I was just trying to be clear on what they did. But a Stan program doesn’t even necessarily code a joint log density—it just has to be proportional to the posterior (or really any density—it doesn’t even have to be Bayesian). I think “model” and “program” are both too confusible. Any suggestions?

I think “program” is a reasonable name for the file on disk. And posterior is a good name for the instantiated object. The difficulty with naming the pre-instantiated object is that it does not really do anything except wait to be instantiated, although in the future it should at least be able to return the types of the things in the data and transformed data blocks.