Assumptions Galore!

In [11]:
# Setup cell (reconstructed; the original contents were not shown): imports used below.
from IPython.display import Image
import numpy as np

IMG_WIDTH = 600  # assumed display width; the original value is not shown

Is this a dome or a crater?

In [2]:
Image(filename="./barringer-crater-1.jpg", width=IMG_WIDTH)
Out[2]:

What if we turn this anti-clockwise by $90^\circ$?

In [4]:
Image(filename="./barringer-crater-90.jpg", width=IMG_WIDTH/2)
Out[4]:

Turn anti-clockwise again by $90^\circ$?

In [6]:
Image(filename="./barringer-crater-correct.jpg", width=IMG_WIDTH)
Out[6]:

These images show the Barringer meteor crater in Arizona [1]. When viewed in the right orientation (the last image above), it is indeed a crater! We see a dome in the first image because our brains assume that the sun usually shines from above, so hills should be light on top and concave areas light at the bottom [2].

Software Engineering

What's wrong with this code?

In [113]:
def square_root(number):
    return np.sqrt(number)
In [116]:
square_root(20)
Out[116]:
4.47213595499958

Did you assume that the input will always be $\geq 0$? What happens with a bad input?

In [117]:
square_root(-20)
Out[117]:
nan

Use preconditions to test whether the input conforms to your assumptions.

In [118]:
def square_root(number):
    assert number >= 0
    return np.sqrt(number)
In [119]:
square_root(-20)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-119-ea2acbcfa354> in <module>
----> 1 square_root(-20)

<ipython-input-118-a709e71a3167> in square_root(number)
      1 def square_root(number):
----> 2     assert number >= 0
      3     return np.sqrt(number)

AssertionError: 
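
A side note on assert: assertions are stripped when Python runs with the -O flag, so for preconditions that must always hold you might raise an explicit exception instead. A minimal sketch (the name safe_square_root is mine, not from the notebook):

In [ ]:
def safe_square_root(number):
    # Explicit precondition check: unlike assert, this is not removed by
    # `python -O`, and it gives the caller a descriptive error message.
    if number < 0:
        raise ValueError(f"expected a non-negative number, got {number}")
    return np.sqrt(number)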

What about real-world programs?

In our OOPSLA paper, "Enforcing object protocols by combining static and runtime analysis", we show how to build tools that can automatically check assumptions about object interactions.

Statistics

Principle of Maximum Entropy

The probability distribution which best represents the current state of information, without any additional assumptions, is the one with the largest entropy.
E. T. Jaynes

Suppose you know the mean $\mu$ and variance $\sigma^2$ of a collection of continuous values, say $\mu = 0$ and $\sigma^2 = 1$. Let's compare the entropies of several distributions with this mean and variance by varying the shape parameter $\beta$ of a generalized normal distribution. As we see below, when $\beta = 2$ the generalized normal matches the standard normal distribution (red curve in the left plot), and it has the highest entropy, $1.42$ (right plot). Can we increase the entropy further? We can make the distribution flatter by moving probability mass from the center to the tails, but if we have to respect the variance constraint, then the distribution with the highest entropy is the standard normal. The takeaway from this illustration is that if all we're willing to assume about a collection of observations is that they have a finite variance, then the Gaussian distribution is the most conservative probability distribution to assign to those measurements. With other assumptions, the principle of maximum entropy leads to other distributions [3].

In [166]:
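
The plotting code is not reproduced above; the following is a rough sketch of how the two panels could be generated (the layout, variable names, and styling are my guesses, not the original cell): fix the variance at $1$, plot a few unit-variance generalized normal densities, and trace the differential entropy as a function of $\beta$.

In [ ]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gennorm, norm
from scipy.special import gamma

def unit_variance_scale(beta):
    # Scale alpha that gives the generalized normal unit variance.
    return np.sqrt(gamma(1.0 / beta) / gamma(3.0 / beta))

x = np.linspace(-4, 4, 400)
betas = np.linspace(0.5, 8, 60)

fig, (ax_pdf, ax_ent) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: unit-variance generalized normal densities for a few betas;
# beta = 2 (red) coincides with the standard normal.
for beta in [1, 2, 4, 8]:
    color = "red" if beta == 2 else None
    ax_pdf.plot(x, gennorm(beta, scale=unit_variance_scale(beta)).pdf(x),
                color=color, label=f"beta = {beta}")
ax_pdf.set_xlabel("x")
ax_pdf.set_ylabel("density")
ax_pdf.legend()

# Right panel: differential entropy as a function of beta, which peaks at
# beta = 2, the standard normal (entropy ~ 1.42 nats).
ax_ent.plot(betas, [gennorm(b, scale=unit_variance_scale(b)).entropy() for b in betas])
ax_ent.axhline(norm().entropy(), linestyle="--")
ax_ent.set_xlabel("beta")
ax_ent.set_ylabel("entropy")
plt.show()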

Entropy of standard normal

The entropy of the normal distribution is $H = \frac{1}{2} \ln (2\pi e \sigma^2)$ [4]. With $\sigma^2 = 1$, $H = 1.42$.
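
A quick numerical check of that value (my own snippet; the entropy is in nats):

In [ ]:
import numpy as np

sigma2 = 1.0
# Differential entropy of a normal distribution: H = 0.5 * ln(2 * pi * e * sigma^2).
H = 0.5 * np.log(2 * np.pi * np.e * sigma2)
print(round(H, 2))  # 1.42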

Generalized normal distribution

There are three parameters: location $\mu$, scale $\alpha$, and shape $\beta$ [5].

\begin{align}
\text{Pr}(X=x \vert \mu,\alpha,\beta) &= \frac{\beta}{2~\alpha~\Gamma(1/\beta)} \exp \left\{ - \left( \frac{\vert x - \mu \vert}{\alpha} \right)^\beta \right\} \\
\text{Var}(X) &= \frac{\alpha^2 ~\Gamma \left( 3/\beta \right)}{\Gamma \left( 1/\beta \right)}
\end{align}

By setting the variance $\sigma^2 = 1$, we can derive $\alpha = \sqrt{\frac{ \Gamma \left(1 / \beta \right) } {\Gamma \left( 3/ \beta \right) }}$, where $\Gamma$ is the gamma function. We can then use scipy.stats.gennorm to compute the entropy for various values of $\beta$.
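
A minimal sketch of that computation (my code, not an original cell): derive $\alpha$ from the unit-variance constraint and ask scipy.stats.gennorm for the differential entropy at a few values of $\beta$.

In [ ]:
import numpy as np
from scipy.stats import gennorm
from scipy.special import gamma

for beta in [0.5, 1.0, 2.0, 4.0, 8.0]:
    # Choose alpha so that the generalized normal has unit variance.
    alpha = np.sqrt(gamma(1.0 / beta) / gamma(3.0 / beta))
    # entropy() returns the differential entropy in nats; it peaks at beta = 2.
    H = float(gennorm(beta, scale=alpha).entropy())
    print(f"beta = {beta}: entropy = {H:.3f}")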

In [ ]: