Proposal: Explicit integer division

There are multiple posts on trying to resolve the "Warning: integer division implicitly rounds to integer. " message, e.g.:


@Bob_Carpenter for example wrote:

Agreed. We want to support a way of turning them off in the future, either on a case by case basis, or wholesale. It’s not there yet. We haven’t even been able to turn off all the Stan output.

I just stumbled upon this myself - I am trying to use a middle element of an ordered data array to use as a reference level.

A proposal: couldn’t we just have an explicit operator and/or function for integer division with well defined (not platform dependent) rounding behavior? If I use such a function, I am taking the responsibility that I know what I’m doing, so the warning can stay (and just refer the user to the new function/operator).

For example Python has the // operator for integer division, rounding down. But maybe function int_divide or floor_divide would be more legible in code…

“Rounding down” as in “towards zero” or “towards negative infinity”?

I think int_divide(.,.) is a bit too verbose for a basic arithmetic operation, and Stan cannot use // because that starts a comment. Maybe /% would be a good name (because integer division is related to the modulus operator %).

In math, if I wanted to do integer divide, I’d do int(x/y). That is, taking x/y automatically promotes to real, and int takes the integer part.

Just to clarify for people who don’t know how Stan currently works, if we have integer variables a and b, by which I mean variables declared as type int in Stan, then dividing them rounds. But Stan always provides a warning whether the user did integer division on purpose or not.

The question is really how to silence the integer division warning when you want it.

The easiest way to do that is just to remove the warning.

Then it can get a new home in our linter (the thing @rybern is coding that @andrewgelman calls “pedantic mode”). Then it won’t be a problem for general programs, but it’ll show up as a warning otherwise. We’re going to get this problem in spades with the linter, which tries to warn about all sorts of things. (The other case where we get false positive warnings is for variable transforms.)

We could also add a new function real_divide(a, b) that returns the result of casting a and b to real types and dividing. And a second new function int_divide(a, b) that does integer division.
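As a rough illustration of what the two proposed functions would do, here is a minimal C++ sketch. The names real_divide and int_divide are taken from the proposal; the bodies are only an assumption about the intended semantics, not Stan's implementation.

```cpp
#include <cassert>

// Hypothetical C++ analogues of the proposed Stan functions. The names come
// from the proposal; the bodies are an assumption about the intended behavior.

// Promote both operands to double before dividing, so nothing is discarded.
double real_divide(int a, int b) {
  return static_cast<double>(a) / static_cast<double>(b);
}

// Built-in integer division: the fractional part is dropped (toward zero).
int int_divide(int a, int b) {
  assert(b != 0);  // integer division by zero is undefined behavior in C++
  return a / b;
}
```

Under these assumptions, real_divide(1, 2) gives 0.5 while int_divide(1, 2) gives 0, so the call site states which behavior was intended.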

Mathematically, the integers (\mathbb{Z}) are not closed under division (/), so integer division isn’t well defined. This matters for non-physics math like algebra and number theory. But it’s moot in that we’re defining a computer language.

“Rounding down” as in “towards zero” or “towards negative infinity”?

Unlike Python (which uses the floor), C++ truncates toward zero by throwing away the fractional part (when in doubt, guess that C++ does the efficient thing when things are not well-defined mathematically). For example,

#include <iostream>

int main() {
  std::cout << "1 / 3 = " << (1 / 3) << std::endl;
  std::cout << "1 / 2 = " << (1 / 2) << std::endl;
  std::cout << "2 / 3 = " << (2 / 3) << std::endl;
  std::cout << "4 / 3 = " << (4 / 3) << std::endl;
  std::cout << "-1 / 3 = " << (-1 / 3) << std::endl;
  std::cout << "-1 / 2 = " << (-1 / 2) << std::endl;
  std::cout << "-2 / 3 = " << (-2 / 3) << std::endl;
  std::cout << "-4 / 3 = " << (-4 / 3) << std::endl;
}

prints

$  ./a.out
1 / 3 = 0
1 / 2 = 0
2 / 3 = 0
4 / 3 = 1
-1 / 3 = 0
-1 / 2 = 0
-2 / 3 = 0
-4 / 3 = -1
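For comparison, rounding toward negative infinity (the Python // behavior) can be built on top of the truncating division. This is only a sketch; floor_divide is a hypothetical helper name, not an existing Stan or C++ function.

```cpp
#include <cassert>

// Hypothetical helper: integer division rounding toward negative infinity,
// i.e. the behavior of Python's // operator.
int floor_divide(int a, int b) {
  assert(b != 0);  // division by zero is undefined behavior in C++
  int q = a / b;   // truncates toward zero
  int r = a % b;   // remainder takes the sign of a (C++11 and later)
  // Truncation and floor differ only when the division is inexact and the
  // operands have opposite signs; step the quotient down by one in that case.
  if (r != 0 && ((r < 0) != (b < 0))) {
    --q;
  }
  return q;
}
```

With this helper, floor_divide(-1, 3) is -1 and floor_divide(-4, 3) is -2, whereas the built-in operator gives 0 and -1. The two conventions agree whenever the division is exact or the operands have the same sign.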

Let me preface this by saying that this is not a huge deal; it’s just something that happens in code sometimes that is currently annoying. But it only happens every once in a while.

Just to clarify: The “annoying thing” that I’m talking about is not that there are warnings sometimes. The annoying thing to me is that if I’m not careful I could be getting the wrong answer if Stan is rounding or truncating things.

The original place where this tripped me up was when I had this code:

L = sqrt(1/sigma[n]^2 + 1/tau^2);

and I got a warning in compilation.

Seeing a warning in compilation is not such a big deal; in this case I changed the "1"s to "1."s and then it worked. The code is now slightly uglier, but no big deal.

Even better in my opinion would be for Stan to convert to reals. I’d like 1/sigma^2 not to truncate or round or whatever it does. Similarly, if I type 1/3 in Stan I’d like it to be the number 0.3333333..., not 0. That is, doing x/y would return a real number whether x and y are integers or reals. If you want integer division, you can do round(x/y) or floor(x/y) or int(x/y) or whatever.

But I guess that’s a matter of taste. If other people want 1/sigma^2 to truncate and 1/3 to be 0, then that’s the way it is. In that case, I would like Stan to return that warning, as otherwise I think I won’t be the only user to enter code like 1/sigma^2 and 1/3 and get the wrong answer.

I think it would be a problem for Stan to not give a warning when it truncates or rounds things like 1/sigma^2 or 1/3.

Also, as a user, I don’t recommend adding a real_divide function: (a) that’s one more function, and we have a lot of functions to keep track of already, (b) I don’t think this helps for users. From my “user” perspective, the problem is that sometimes in Stan you need to convert an integer to a real number, just like sometimes you have to convert a vector to an array or an array to a vector. If I had two integers, i and j, and I want i/j, and if Stan won’t convert these to reals automatically, then I’d just as soon do real(i)/real(j), as that would be clearer than a real_divide function.

@andrewgelman:

Ack. That’s another stanc3 bug. There should only be a warning when dividing integers by integers.

It does. If there is mixed integer and real arithmetic, the integer values are promoted. This was just a bug in the warning messages in stanc3.

If you have integers i and j then i / j will round.

We could introduce a function real() that converts integer to real values. I’m not sure how that’d work with the data type being real. to_real would work. And only one would need to be converted.

Oh, that’s a relief! I don’t feel so bad that 1/3 rounds to 0, because I’m guessing it’s pretty rare that people will type things like 1/3 in their Stan code!

Here’s the issue report for stanc3:

The problem is that our users aren’t so familiar with integer vs. double typing in programming languages and are prone to write things like y ~ normal(0, 1/2) when they want a standard deviation of 0.5.

Can I bump this again? I just had to help someone with the equivalent of

data {
  int y[10];
  int n[10];
}
transformed data {
  real div;
  div = sum(y) / sum(n);
}

This is really annoying and comes up a lot when you’re modelling proportions or count data. Adding a 0.0 makes the code functional but ugly. Explicit integer/real division would be a real blessing here.

Couldn’t the outcome – ie whether we are computing a real or integer – be used to determine rounding?

@nhuurre is proposing adding this here but needs feedback: https://github.com/stan-dev/stanc3/issues/533

But as Bob mentions in the thread we can’t really just switch / to mean real division even for int types as this will break backward compatibility.

There is other looming stuff around the project that will require bumping the major version so this might come soon.

In the meantime, the stanc3 compiler does emit a warning, which at least helps debugging if you are unaware that you wrote an integer division. But that obviously does not help with the ugliness of the workaround. For this exact code, you would get:

Info: Found int division at 'examples/bugs/intdiv.stan', line 7, column 8 to column 14:
  sum(y) / sum(n)
Values will be rounded towards zero.

Hopefully stanc3 hits rstan soon; I am guessing this would save some debugging time.

It’d be great if the warning explained how to fix the problem. I suspect a lot of our users don’t have the right programming background to work it out themselves.

Would this work?

I can fix this, but am bad at texts and English and all so if someone can copy-edit this that would be great.

Ignore the above, Niko’s suggestion is much better:

Values will be rounded towards zero. If rounding is not desired you can write
the division as
  d_int * 1.0 / d_int
If rounding is intended please use the integer division operator %/%.
Info: Found int division at 'operators_int.stan', line 26, column 27 to column 32:

See https://github.com/stan-dev/stanc3/pull/573

We don’t know if we’re computing a real or integer. Plus, we want the language to be compositional so that the value of f(x1, ..., xN) only depends on the values of f and x1...xN, not the context in which it’s used. The way languages do this is allowing promotion of outputs. So you can assign an integer to a real (or vice-versa in C++ in what I consider a language flaw).
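The same rule can be seen in C++, which is a reasonable stand-in here: the type of a / b is fixed by the operand types, and assigning the result to a real only promotes a value that has already been truncated. The function names below are hypothetical and exist only for illustration.

```cpp
#include <type_traits>

// The type of a division is determined by its operands alone.
static_assert(std::is_same<decltype(1 / 2), int>::value,
              "int / int has type int");
static_assert(std::is_same<decltype(1.0 / 2), double>::value,
              "double / int has type double");

// Assigning to a double does not change how the division is evaluated:
// 1 / 2 is computed as the int 0, then promoted to 0.0.
double half_from_ints() {
  double d = 1 / 2;
  return d;
}

// With one real operand, the int operand is promoted before dividing.
double half_from_mixed() {
  double d = 1.0 / 2;
  return d;
}
```

Here half_from_ints() returns 0.0 and half_from_mixed() returns 0.5, even though both assign to a double. Letting the assignment target choose the rounding would break this context independence.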

+1—we should always strive for this, especially for behavior that’s going to be deprecated. I like Niko’s suggestion.