始めに

の第3章の3.1を写経した記録をまとめます。前回はこちらPythonで体験するベイズ推論 PyMCによるMCMC入門第2章 (2.2.6から2章の終わりまで) 写経。実行はJupyter Labにて実行しています。

　写経は以下のようにまとめました。

写経ではありますが、関数を作成し、コードの再実行をやりやすいようにした部分があります。
より良いと思われるコードを考えた場合は書き換えてコメントを添えるようにし、変更点をなるべく明示するようにしてあります。
個人的に気になった点のメモを残すようにしてあります。同じような疑問を持った方の助けになれば幸いです。
以前書いたコードと同じようなコード（例えばグラフの描写等）は効率化のために飛ばしているところもあります。

　Pythonで体験するベイズ推論 PyMCによるMCMC入門を読んでいている際のちょっとした正誤表代わりになればと思います。

この記事で使用する主なモジュール、設定

　この記事では、主に以下のモジュールや設定を使用しています。

import scipy.stats as stats
import matplotlib as mpl
from matplotlib import pyplot as plt
%matplotlib inline
from IPython.core.pylabtools import figsize
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import pprint
from os import makedirs
from urllib.request import urlretrieve
import pymc as pm

3 MCMCの中をのぞいてみよう

3.1 山あり谷あり、分布の地形

figsize(25, 8)
plt.rcParams['font.size'] = 10

jet = plt.cm.jet
fig = plt.figure()

x = y = np.linspace(0, 5, 100)
X, Y = np.meshgrid(x, y)

plt.subplot(121)
uni_x = stats.uniform.pdf(x, loc=0, scale=5)
uni_y = stats.uniform.pdf(y, loc=0, scale=5)

M = np.dot(uni_x[:, None], uni_y[None, :])

im = plt.imshow(M, interpolation='none', origin='lower',
                cmap=jet, vmax=1, vmin=-.15, extent=(0, 5, 0, 5))
plt.xlim(0, 5)
plt.ylim(0, 5)
plt.title("Overhead view of landscape formed by "
          "Uniform priors")

ax = fig.add_subplot(122, projection='3d')
ax.plot_surface(X, Y, M, cmap=plt.cm.jet, vmax=1, vmin=-0.15)
ax.view_init(azim=390)
ax.set_xlabel('Value of $p_1$')
ax.set_ylabel('Value of $p_2$')
ax.set_zlabel('Dencity')
plt.title("Alternate view of landscape formed by "
          "Uniform priors")

Output:

    Text(0.5, 0.92, 'Alternate view of landscape formed by Uniform priors')

f:id:koheitsutsumi223:20190331174841p:plain

メモ

　np.meshgridは以下のような二つの行列を返しています。

np.meshgrid(x, y)

Output:

    [array([[ 0.        ,  0.05050505,  0.1010101 , ...,  4.8989899 ,
              4.94949495,  5.        ],
            [ 0.        ,  0.05050505,  0.1010101 , ...,  4.8989899 ,
              4.94949495,  5.        ],
            [ 0.        ,  0.05050505,  0.1010101 , ...,  4.8989899 ,
              4.94949495,  5.        ],
            ..., 
            [ 0.        ,  0.05050505,  0.1010101 , ...,  4.8989899 ,
              4.94949495,  5.        ],
            [ 0.        ,  0.05050505,  0.1010101 , ...,  4.8989899 ,
              4.94949495,  5.        ],
            [ 0.        ,  0.05050505,  0.1010101 , ...,  4.8989899 ,
              4.94949495,  5.        ]]),
     array([[ 0.        ,  0.        ,  0.        , ...,  0.        ,
              0.        ,  0.        ],
            [ 0.05050505,  0.05050505,  0.05050505, ...,  0.05050505,
              0.05050505,  0.05050505],
            [ 0.1010101 ,  0.1010101 ,  0.1010101 , ...,  0.1010101 ,
              0.1010101 ,  0.1010101 ],
            ..., 
            [ 4.8989899 ,  4.8989899 ,  4.8989899 , ...,  4.8989899 ,
              4.8989899 ,  4.8989899 ],
            [ 4.94949495,  4.94949495,  4.94949495, ...,  4.94949495,
              4.94949495,  4.94949495],
            [ 5.        ,  5.        ,  5.        , ...,  5.        ,
              5.        ,  5.        ]])]

　一つ目は、各行がnp.linspace(0, 5, 100)となっている100行100列の行列で、二つ目はその転置となっています。np.meshgridについては、等高線の描写などと相性が良いとdocstringに書いてありました。

print(np.meshgrid.__doc__) (クリックすると展開されます)

        Return coordinate matrices from coordinate vectors.
    
        Make N-D coordinate arrays for vectorized evaluations of
        N-D scalar/vector fields over N-D grids, given
        one-dimensional coordinate arrays x1, x2,..., xn.
    
        .. versionchanged:: 1.9
           1-D and 0-D cases are allowed.
    
        Parameters
        ----------
        x1, x2,..., xn : array_like
            1-D arrays representing the coordinates of a grid.
        indexing : {'xy', 'ij'}, optional
            Cartesian ('xy', default) or matrix ('ij') indexing of output.
            See Notes for more details.
    
            .. versionadded:: 1.7.0
        sparse : bool, optional
            If True a sparse grid is returned in order to conserve memory.
            Default is False.
    
            .. versionadded:: 1.7.0
        copy : bool, optional
            If False, a view into the original arrays are returned in order to
            conserve memory.  Default is True.  Please note that
            ``sparse=False, copy=False`` will likely return non-contiguous
            arrays.  Furthermore, more than one element of a broadcast array
            may refer to a single memory location.  If you need to write to the
            arrays, make copies first.
    
            .. versionadded:: 1.7.0
    
        Returns
        -------
        X1, X2,..., XN : ndarray
            For vectors `x1`, `x2`,..., 'xn' with lengths ``Ni=len(xi)`` ,
            return ``(N1, N2, N3,...Nn)`` shaped arrays if indexing='ij'
            or ``(N2, N1, N3,...Nn)`` shaped arrays if indexing='xy'
            with the elements of `xi` repeated to fill the matrix along
            the first dimension for `x1`, the second for `x2` and so on.
    
        Notes
        -----
        This function supports both indexing conventions through the indexing
        keyword argument.  Giving the string 'ij' returns a meshgrid with
        matrix indexing, while 'xy' returns a meshgrid with Cartesian indexing.
        In the 2-D case with inputs of length M and N, the outputs are of shape
        (N, M) for 'xy' indexing and (M, N) for 'ij' indexing.  In the 3-D case
        with inputs of length M, N and P, outputs are of shape (N, M, P) for
        'xy' indexing and (M, N, P) for 'ij' indexing.  The difference is
        illustrated by the following code snippet::
    
            xv, yv = meshgrid(x, y, sparse=False, indexing='ij')
            for i in range(nx):
                for j in range(ny):
                    # treat xv[i,j], yv[i,j]
    
            xv, yv = meshgrid(x, y, sparse=False, indexing='xy')
            for i in range(nx):
                for j in range(ny):
                    # treat xv[j,i], yv[j,i]
    
        In the 1-D and 0-D case, the indexing and sparse keywords have no effect.
    
        See Also
        --------
        index_tricks.mgrid : Construct a multi-dimensional "meshgrid"
                         using indexing notation.
        index_tricks.ogrid : Construct an open multi-dimensional "meshgrid"
                         using indexing notation.
    
        Examples
        --------
        >>> nx, ny = (3, 2)
        >>> x = np.linspace(0, 1, nx)
        >>> y = np.linspace(0, 1, ny)
        >>> xv, yv = meshgrid(x, y)
        >>> xv
        array([[ 0. ,  0.5,  1. ],
               [ 0. ,  0.5,  1. ]])
        >>> yv
        array([[ 0.,  0.,  0.],
               [ 1.,  1.,  1.]])
        >>> xv, yv = meshgrid(x, y, sparse=True)  # make sparse output arrays
        >>> xv
        array([[ 0. ,  0.5,  1. ]])
        >>> yv
        array([[ 0.],
               [ 1.]])
    
        `meshgrid` is very useful to evaluate functions on a grid.
    
        >>> x = np.arange(-5, 5, 0.1)
        >>> y = np.arange(-5, 5, 0.1)
        >>> xx, yy = meshgrid(x, y, sparse=True)
        >>> z = np.sin(xx**2 + yy**2) / (xx**2 + yy**2)
        >>> h = plt.contourf(x,y,z)

　そして、ax.plot_surface(X, Y, M, cmap=plt.cm.jet, vmax=1, vmin=-0.15)では、XとYを座標としてMをプロットしているはずです。以下に、簡単な例を載せておきます。np.meshgridについては、等高線の描写などと相性が良いとdocstringに書いてありました。

figsize(12, 12)
fig = plt.figure()
ax_ = fig.add_subplot(111, projection='3d')
x_ = np.array([0, 1])
y_ = np.array([0, 1])
X_, Y_ = np.meshgrid(x_, y_)
M_ = np.array([[0, 1],
               [1, 2]])
print(f"X_:\n {X_}")
print(f"Y_:\n {Y_}")
ax_.plot_surface(X_, Y_, M_, cmap=plt.cm.jet, vmax=1, vmin=-0.15)

Output:

    X_:
     [[0 1]
     [0 1]]
    Y_:
     [[0 0]
     [1 1]]

    <mpl_toolkits.mplot3d.art3d.Poly3DCollection at 0x1600073d710>

f:id:koheitsutsumi223:20190331174846p:plain

メモ

　uni_x[:, None]は100行1列の行列、uni_y[None, :]は1行100列のベクトルになっています。従って、Mは100行100列の行列になります。

　また、plt.imshow、plt.cm.jet、ax.plot_surface、ax.view_initの説明は以下のようになっています。plt.cm.jetはスカラーを色に変換する関数のような印象ですが、あまり理解できないですね・・・

print(plt.imshow.__doc__)（クリックすると展開されます）

    Display an image, i.e. data on a 2D regular raster.
    
    Parameters
    ----------
    X : array-like or PIL image
        The image data. Supported array shapes are:
    
        - (M, N): an image with scalar data. The data is visualized
          using a colormap.
        - (M, N, 3): an image with RGB values (float or uint8).
        - (M, N, 4): an image with RGBA values (float or uint8), i.e.
          including transparency.
    
        The first two dimensions (M, N) define the rows and columns of
        the image.
    
        The RGB(A) values should be in the range [0 .. 1] for floats or
        [0 .. 255] for integers.  Out-of-range values will be clipped to
        these bounds.
    
    cmap : str or `~matplotlib.colors.Colormap`, optional
        A Colormap instance or registered colormap name. The colormap
        maps scalar data to colors. It is ignored for RGB(A) data.
        Defaults to :rc:`image.cmap`.
    
    aspect : {'equal', 'auto'} or float, optional
        Controls the aspect ratio of the axes. The aspect is of particular
        relevance for images since it may distort the image, i.e. pixel
        will not be square.
    
        This parameter is a shortcut for explicitly calling
        `.Axes.set_aspect`. See there for further details.
    
        - 'equal': Ensures an aspect ratio of 1. Pixels will be square
          (unless pixel sizes are explicitly made non-square in data
          coordinates using *extent*).
        - 'auto': The axes is kept fixed and the aspect is adjusted so
          that the data fit in the axes. In general, this will result in
          non-square pixels.
    
        If not given, use :rc:`image.aspect` (default: 'equal').
    
    interpolation : str, optional
        The interpolation method used. If *None*
        :rc:`image.interpolation` is used, which defaults to 'nearest'.
    
        Supported values are 'none', 'nearest', 'bilinear', 'bicubic',
        'spline16', 'spline36', 'hanning', 'hamming', 'hermite', 'kaiser',
        'quadric', 'catrom', 'gaussian', 'bessel', 'mitchell', 'sinc',
        'lanczos'.
    
        If *interpolation* is 'none', then no interpolation is performed
        on the Agg, ps and pdf backends. Other backends will fall back to
        'nearest'.
    
        See
        :doc:`/gallery/images_contours_and_fields/interpolation_methods`
        for an overview of the supported interpolation methods.
    
        Some interpolation methods require an additional radius parameter,
        which can be set by *filterrad*. Additionally, the antigrain image
        resize filter is controlled by the parameter *filternorm*.
    
    norm : `~matplotlib.colors.Normalize`, optional
        If scalar data are used, the Normalize instance scales the
        data values to the canonical colormap range [0,1] for mapping
        to colors. By default, the data range is mapped to the
        colorbar range using linear scaling. This parameter is ignored for
        RGB(A) data.
    
    vmin, vmax : scalar, optional
        When using scalar data and no explicit *norm*, *vmin* and *vmax*
        define the data range that the colormap covers. By default,
        the colormap covers the complete value range of the supplied
        data. *vmin*, *vmax* are ignored if the *norm* parameter is used.
    
    alpha : scalar, optional
        The alpha blending value, between 0 (transparent) and 1 (opaque).
        This parameter is ignored for RGBA input data.
    
    origin : {'upper', 'lower'}, optional
        Place the [0,0] index of the array in the upper left or lower left
        corner of the axes. The convention 'upper' is typically used for
        matrices and images.
        If not given, :rc:`image.origin` is used, defaulting to 'upper'.
    
        Note that the vertical axes points upward for 'lower'
        but downward for 'upper'.
    
    extent : scalars (left, right, bottom, top), optional
        The bounding box in data coordinates that the image will fill.
        The image is stretched individually along x and y to fill the box.
    
        The default extent is determined by the following conditions.
        Pixels have unit size in data coordinates. Their centers are on
        integer coordinates, and their center coordinates range from 0 to
        columns-1 horizontally and from 0 to rows-1 vertically.
    
        Note that the direction of the vertical axis and thus the default
        values for top and bottom depend on *origin*:
    
        - For ``origin == 'upper'`` the default is
          ``(-0.5, numcols-0.5, numrows-0.5, -0.5)``.
        - For ``origin == 'lower'`` the default is
          ``(-0.5, numcols-0.5, -0.5, numrows-0.5)``.
    
        See the example :doc:`/tutorials/intermediate/imshow_extent` for a
        more detailed description.
    
    shape : scalars (columns, rows), optional, default: None
        For raw buffer images.
    
    filternorm : bool, optional, default: True
        A parameter for the antigrain image resize filter (see the
        antigrain documentation).  If *filternorm* is set, the filter
        normalizes integer values and corrects the rounding errors. It
        doesn't do anything with the source floating point values, it
        corrects only integers according to the rule of 1.0 which means
        that any sum of pixel weights must be equal to 1.0.  So, the
        filter function must produce a graph of the proper shape.
    
    filterrad : float > 0, optional, default: 4.0
        The filter radius for filters that have a radius parameter, i.e.
        when interpolation is one of: 'sinc', 'lanczos' or 'blackman'.
    
    resample : bool, optional
        When *True*, use a full resampling method.  When *False*, only
        resample when the output image is larger than the input image.
    
    url : str, optional
        Set the url of the created `.AxesImage`. See `.Artist.set_url`.
    
    Returns
    -------
    image : `~matplotlib.image.AxesImage`
    
    Other Parameters
    ----------------
    **kwargs : `~matplotlib.artist.Artist` properties
        These parameters are passed on to the constructor of the
        `.AxesImage` artist.
    
    See also
    --------
    matshow : Plot a matrix or an array as an image.
    
    Notes
    -----
    Unless *extent* is used, pixel centers will be located at integer
    coordinates. In other words: the origin will coincide with the center
    of pixel (0, 0).
    
    There are two common representations for RGB images with an alpha
    channel:
    
    -   Straight (unassociated) alpha: R, G, and B channels represent the
        color of the pixel, disregarding its opacity.
    -   Premultiplied (associated) alpha: R, G, and B channels represent
        the color of the pixel, adjusted for its opacity by multiplication.
    
    `~matplotlib.pyplot.imshow` expects RGB images adopting the straight
    (unassociated) alpha representation.
    
    .. note::
        In addition to the above described arguments, this function can take a
        **data** keyword argument. If such a **data** argument is given, the
        following arguments are replaced by **data[<arg>]**:
    
        * All positional and all keyword arguments.
    
        Objects passed as **data** must support item access (``data[<arg>]``) and
        membership test (``<arg> in data``).

print(plt.cm.jet.__doc__)（クリックすると展開されます）

    Colormap objects based on lookup tables using linear segments.
    
        The lookup table is generated using linear interpolation for each
        primary color, with the 0-1 domain divided into any number of
        segments.

print(ax.plot_surface.__doc__)（クリックすると展開されます）

            Create a surface plot.
    
            By default it will be colored in shades of a solid color, but it also
            supports color mapping by supplying the *cmap* argument.
    
            .. note::
    
               The *rcount* and *ccount* kwargs, which both default to 50,
               determine the maximum number of samples used in each direction.  If
               the input data is larger, it will be downsampled (by slicing) to
               these numbers of points.
    
            Parameters
            ----------
            X, Y, Z : 2d arrays
                Data values.
    
            rcount, ccount : int
                Maximum number of samples used in each direction.  If the input
                data is larger, it will be downsampled (by slicing) to these
                numbers of points.  Defaults to 50.
    
                .. versionadded:: 2.0
    
            rstride, cstride : int
                Downsampling stride in each direction.  These arguments are
                mutually exclusive with *rcount* and *ccount*.  If only one of
                *rstride* or *cstride* is set, the other defaults to 10.
    
                'classic' mode uses a default of ``rstride = cstride = 10`` instead
                of the new default of ``rcount = ccount = 50``.
    
            color : color-like
                Color of the surface patches.
    
            cmap : Colormap
                Colormap of the surface patches.
    
            facecolors : array-like of colors.
                Colors of each individual patch.
    
            norm : Normalize
                Normalization for the colormap.
    
            vmin, vmax : float
                Bounds for the normalization.
    
            shade : bool
                Whether to shade the face colors.
    
            **kwargs :
                Other arguments are forwarded to `.Poly3DCollection`.

ax.view_init??（クリックすると展開されます）

Signature: ax.view_init(elev=None, azim=None)
Source:   
    def view_init(self, elev=None, azim=None):
        """
        Set the elevation and azimuth of the axes.

        This can be used to rotate the axes programmatically.

        'elev' stores the elevation angle in the z plane.
        'azim' stores the azimuth angle in the x,y plane.

        if elev or azim are None (default), then the initial value
        is used which was specified in the :class:`Axes3D` constructor.
        """

        self.dist = 10

        if elev is None:
            self.elev = self.initial_elev
        else:
            self.elev = elev

        if azim is None:
            self.azim = self.initial_azim
        else:
            self.azim = azim
File:      c:\users\hoge\anaconda3\lib\site-packages\mpl_toolkits\mplot3d\axes3d.py
Type:      method

figsize(25, 5)
plt.rcParams['font.size'] = 13
fig = plt.figure()
plt.subplot(121)

exp_x = stats.expon.pdf(x, scale=3)
exp_y = stats.expon.pdf(x, scale=10)
M = np.dot(exp_x[:, None], exp_y[None, :])
CS = plt.contour(X, Y, M) # 等高線をプロットするようだ。
                          # ここについても、座標の取り方はax.plot_surface(X, Y, M, cmap=plt.cm.jet, vmax=1, vmin=-0.15)
                          # と同じように理解できる。
im = plt.imshow(M, interpolation='none', origin='lower',
                cmap=jet, extent=(0, 5, 0, 5))
plt.title("Overhead view of landscape formed by "
          "$Exp(3), Exp(10)$ priors")

ax = fig.add_subplot(122, projection='3d')
ax.plot_surface(X, Y, M, cmap=jet)
ax.view_init(azim=390)
ax.set_xlabel('Value of $p_1$')
ax.set_ylabel('Value of $p_2$')
ax.set_zlabel('Density')
plt.title("Alternate view of landscape\nformed by "
          "Exp(3), Exp(10) priors")

Output:

    Text(0.5, 0.92, 'Alternate view of landscape\nformed by Exp(3), Exp(10) priors')

f:id:koheitsutsumi223:20190331174852p:plain

def plot_landscape_warped_by_n_data_obs_Exp_pri_dist(N=1):
    lambda_1_true = 1
    lambda_2_true = 3
    plt.rcParams['font.size'] = 17

    
    data = np.concatenate([
        stats.poisson.rvs(lambda_1_true, size=(N, 1)),
        stats.poisson.rvs(lambda_2_true, size=(N, 1))
    ], axis=1)
    print(f"observed (2-dimensional, sample size = {N}):", data)
    x = y = np.linspace(.01, 5, 100)
    X, Y = np.meshgrid(x, y)
    likelihood_x = np.array([stats.poisson.pmf(data[:, 0], _x) for _x in x]).prod(axis=1)
    likelihood_y = np.array([stats.poisson.pmf(data[:, 1], _y) for _y in y]).prod(axis=1)
    L = np.dot(likelihood_x[:, None], likelihood_y[None, :])
    
    figsize(25, 24)
    
    plt.subplot(221)
    uni_x = stats.uniform.pdf(x, loc=0, scale=5)
    uni_y = stats.uniform.pdf(y, loc=0, scale=5)
    M = np.dot(uni_x[:, None], uni_y[None, :])
    plt.imshow(M, interpolation='none', origin='lower',
               cmap=jet, vmax=1, vmin=-.15, extent=(0, 5, 0, 5))
    plt.scatter(lambda_2_true, lambda_1_true, c='k', s=50, edgecolor="none")
    plt.xlim(0, 5)
    plt.ylim(0, 5)
    plt.title("Landscape formed by "
              "Uniform priors on $p_1, p_2$")
    plt.xlabel("Value of $p_1$")
    plt.ylabel("Value of $p_2$")
    
    plt.subplot(223)
    plt.contour(X, Y, M * L)
    plt.imshow(M * L, interpolation='none', origin='lower', 
               cmap=jet, extent=(0, 5, 0, 5))
    plt.scatter(lambda_2_true, lambda_1_true, 
                c='k', s=50, edgecolor='none')
    plt.xlim(0, 5)
    plt.ylim(0, 5)
    plt.title(f"Lnadscape warped by {N} data observation;\n"
               "Uniform priors on $p_1, p_2$")
    plt.xlabel("Value of $p_1$")
    plt.ylabel("Value of $p_2$")
    
    plt.subplot(222)
    exp_x = stats.expon.pdf(x, loc=0, scale=3)
    exp_y = stats.expon.pdf(y, loc=0, scale=10)
    M = np.dot(exp_x[:, None], exp_y[None, :])
    plt.contour(x, y, M)
    plt.imshow(M, interpolation='none', origin='lower',
               cmap=jet, extent=(0, 5, 0, 5))
    plt.scatter(lambda_2_true, lambda_1_true, 
                c="k", s=50, edgecolor="none")
    plt.xlim(0, 5)
    plt.ylim(0, 5)
    plt.title("Landscape formed by "
              "Exponential priors on $p_1, p_2$")
    plt.xlabel("Value of $p_1$")
    plt.ylabel("Value of $p_2$")
    
    plt.subplot(224)
    plt.contour(x, y, M * L)
    plt.imshow(M * L, interpolation='none', origin='lower', 
               cmap=jet, extent=(0, 5, 0, 5))
    plt.scatter(lambda_2_true, lambda_1_true, 
                c="k", s=50, edgecolor="none")
    plt.xlim(0, 5)
    plt.ylim(0, 5)
    plt.xlabel('Value of $p_1$')
    plt.ylabel('Value of $p_2$')
    plt.title(f"Landscape warped by {N} data observation;\n"
               "Exponential priors on $p_1, p_2$")
    plt.xlabel("Value of $p_1$")
    plt.ylabel("Value of $p_2$")

plot_landscape_warped_by_n_data_obs_Exp_pri_dist()

Output:

    observed (2-dimensional, sample size = 1): [[2 1]]

f:id:koheitsutsumi223:20190331174856p:plain

plot_landscape_warped_by_n_data_obs_Exp_pri_dist(N=2)

Output:

    observed (2-dimensional, sample size = 2): [[2 2]
     [1 4]]

f:id:koheitsutsumi223:20190331174904p:plain

メモ

　関数plot_landscape_warped_by_n_data_obs_Exp_pri_distの以下の部分でどのようなarrayが生成されているのかをメモしておきます。

    data = np.concatenate([
        stats.poisson.rvs(lambda_1_true, size=(N, 1)),
        stats.poisson.rvs(lambda_2_true, size=(N, 1))
    ], axis=1)
    print(f"observed (2-dimensional, sample size = {N}):", data)
    x = y = np.linspace(.01, 5, 100)
    X, Y = np.meshgrid(x, y)
    likelihood_x = np.array([stats.poisson.pmf(data[:, 0], _x) for _x in x]).prod(axis=1)
    likelihood_y = np.array([stats.poisson.pmf(data[:, 1], _y) for _y in y]).prod(axis=1)

N=2
lambda_1_true = 1
lambda_2_true = 3
data = np.concatenate([
        stats.poisson.rvs(lambda_1_true, size=(N, 1)),
        stats.poisson.rvs(lambda_2_true, size=(N, 1))
    ], axis=1)
print('data:')
pprint.pprint(data)
x = y = np.linspace(.01, 5, 100)
print('\ndata[:, 0]:')
pprint.pprint(data[:, 0])
likelihood_x = np.array([stats.poisson.pmf(data[:, 0], _x) for _x in x]).prod(axis=1)
print('\nnp.array([stats.poisson.pmf(data[:, 0], _x) for _x in x]):')
pprint.pprint(np.array([stats.poisson.pmf(data[:, 0], _x) for _x in x]))
print('\nnp.array([stats.poisson.pmf(data[:, 0], _x) for _x in x]).prod(axis=1):')
pprint.pprint(np.array([stats.poisson.pmf(data[:, 0], _x) for _x in x]).prod(axis=1))

Output:（クリックすると展開されます）

    data:
    array([[ 2, 10],
           [ 0,  4]])
    
    data[:, 0]:
    array([2, 0])
    
    np.array([stats.poisson.pmf(data[:, 0], _x) for _x in x]):
    array([[  4.95024917e-05,   9.90049834e-01],
           [  1.71738965e-03,   9.41384100e-01],
           [  5.49527628e-03,   8.95110521e-01],
           [  1.10599167e-02,   8.51111513e-01],
           [  1.81202397e-02,   8.09275269e-01],
           [  2.64146968e-02,   7.69495478e-01],
           [  3.57088061e-02,   7.31671056e-01],
           [  4.57928792e-02,   6.95705888e-01],
           [  5.64799179e-02,   6.61508581e-01],
           [  6.76036698e-02,   6.28992237e-01],
           [  7.90168306e-02,   5.98074229e-01],
           [  9.05893832e-02,   5.68675990e-01],
           [  1.02207064e-01,   5.40722817e-01],
           [  1.13769950e-01,   5.14143677e-01],
           [  1.25191149e-01,   4.88871030e-01],
           [  1.36395603e-01,   4.64840657e-01],
           [  1.47318973e-01,   4.41991492e-01],
           [  1.57906624e-01,   4.20265474e-01],
           [  1.68112684e-01,   3.99607395e-01],
           [  1.77899183e-01,   3.79964760e-01],
           [  1.87235256e-01,   3.61287656e-01],
           [  1.96096421e-01,   3.43528621e-01],
           [  2.04463907e-01,   3.26642529e-01],
           [  2.12324047e-01,   3.10586470e-01],
           [  2.19667714e-01,   2.95319644e-01],
           [  2.26489810e-01,   2.80803256e-01],
           [  2.32788800e-01,   2.67000419e-01],
           [  2.38566283e-01,   2.53876058e-01],
           [  2.43826605e-01,   2.41396823e-01],
           [  2.48576502e-01,   2.29531002e-01],
           [  2.52824780e-01,   2.18248444e-01],
           [  2.56582021e-01,   2.07520479e-01],
           [  2.59860318e-01,   1.97319844e-01],
           [  2.62673037e-01,   1.87620620e-01],
           [  2.65034598e-01,   1.78398160e-01],
           [  2.66960278e-01,   1.69629028e-01],
           [  2.68466040e-01,   1.61290942e-01],
           [  2.69568370e-01,   1.53362712e-01],
           [  2.70284140e-01,   1.45824194e-01],
           [  2.70630477e-01,   1.38656230e-01],
           [  2.70624657e-01,   1.31840606e-01],
           [  2.70283997e-01,   1.25360003e-01],
           [  2.69625775e-01,   1.19197953e-01],
           [  2.68667148e-01,   1.13338798e-01],
           [  2.67425086e-01,   1.07767648e-01],
           [  2.65916310e-01,   1.02470347e-01],
           [  2.64157245e-01,   9.74334344e-02],
           [  2.62163976e-01,   9.26441102e-02],
           [  2.59952207e-01,   8.80902044e-02],
           [  2.57537235e-01,   8.37601451e-02],
           [  2.54933921e-01,   7.96429292e-02],
           [  2.52156669e-01,   7.57280943e-02],
           [  2.49219412e-01,   7.20056925e-02],
           [  2.46135598e-01,   6.84662647e-02],
           [  2.42918183e-01,   6.51008169e-02],
           [  2.39579622e-01,   6.19007971e-02],
           [  2.36131869e-01,   5.88580738e-02],
           [  2.32586373e-01,   5.59649150e-02],
           [  2.28954087e-01,   5.32139689e-02],
           [  2.25245463e-01,   5.05982452e-02],
           [  2.21470464e-01,   4.81110969e-02],
           [  2.17638567e-01,   4.57462039e-02],
           [  2.13758775e-01,   4.34975569e-02],
           [  2.09839623e-01,   4.13594418e-02],
           [  2.05889189e-01,   3.93264253e-02],
           [  2.01915108e-01,   3.73933415e-02],
           [  1.97924578e-01,   3.55552780e-02],
           [  1.93924380e-01,   3.38075643e-02],
           [  1.89920884e-01,   3.21457591e-02],
           [  1.85920064e-01,   3.05656397e-02],
           [  1.81927514e-01,   2.90631907e-02],
           [  1.77948459e-01,   2.76345945e-02],
           [  1.73987767e-01,   2.62762206e-02],
           [  1.70049965e-01,   2.49846173e-02],
           [  1.66139254e-01,   2.37565027e-02],
           [  1.62259517e-01,   2.25887557e-02],
           [  1.58414338e-01,   2.14784092e-02],
           [  1.54607011e-01,   2.04226416e-02],
           [  1.50840556e-01,   1.94187701e-02],
           [  1.47117728e-01,   1.84642437e-02],
           [  1.43441034e-01,   1.75566369e-02],
           [  1.39812739e-01,   1.66936433e-02],
           [  1.36234886e-01,   1.58730701e-02],
           [  1.32709300e-01,   1.50928319e-02],
           [  1.29237602e-01,   1.43509463e-02],
           [  1.25821221e-01,   1.36455279e-02],
           [  1.22461403e-01,   1.29747843e-02],
           [  1.19159221e-01,   1.23370109e-02],
           [  1.15915588e-01,   1.17305872e-02],
           [  1.12731260e-01,   1.11539722e-02],
           [  1.09606852e-01,   1.06057005e-02],
           [  1.06542842e-01,   1.00843791e-02],
           [  1.03539583e-01,   9.58868312e-03],
           [  1.00597308e-01,   9.11735301e-03],
           [  9.77161399e-02,   8.66919104e-03],
           [  9.48960974e-02,   8.24305840e-03],
           [  9.21371035e-02,   7.83787224e-03],
           [  8.94389915e-02,   7.45260294e-03],
           [  8.68015116e-02,   7.08627149e-03],
           [  8.42243375e-02,   6.73794700e-03]])
    
    np.array([stats.poisson.pmf(data[:, 0], _x) for _x in x]).prod(axis=1):
    array([  4.90099337e-05,   1.61672331e-03,   4.91887962e-03,
             9.41322241e-03,   1.46642619e-02,   2.03259898e-02,
             2.61270999e-02,   3.18583757e-02,   3.73619503e-02,
             4.25221835e-02,   4.72579300e-02,   5.15160072e-02,
             5.52656917e-02,   5.84941004e-02,   6.12023262e-02,
             6.34022216e-02,   6.51137326e-02,   6.63627021e-02,
             6.71790718e-02,   6.75954202e-02,   6.76457866e-02,
             6.73647330e-02,   6.67866079e-02,   6.59449765e-02,
             6.48721911e-02,   6.35990761e-02,   6.21547071e-02,
             6.05662675e-02,   5.88589677e-02,   5.70560137e-02,
             5.51786149e-02,   5.32460237e-02,   5.12755975e-02,
             4.92828782e-02,   4.72816845e-02,   4.52842125e-02,
             4.33011403e-02,   4.13417363e-02,   3.94139667e-02,
             3.75246016e-02,   3.56793187e-02,   3.38828027e-02,
             3.21388405e-02,   3.04504115e-02,   2.88197725e-02,
             2.72485366e-02,   2.57377476e-02,   2.42879483e-02,
             2.28992431e-02,   2.15713562e-02,   2.03036842e-02,
             1.90953440e-02,   1.79452163e-02,   1.68519850e-02,
             1.58141722e-02,   1.48301696e-02,   1.38982669e-02,
             1.30166766e-02,   1.21835557e-02,   1.13970252e-02,
             1.06551869e-02,   9.95613829e-03,   9.29798450e-03,
             8.67884968e-03,   8.09688583e-03,   7.55028057e-03,
             7.03726341e-03,   6.55611095e-03,   6.10515098e-03,
             5.68276569e-03,   5.28739405e-03,   4.91753350e-03,
             4.57174093e-03,   4.24863331e-03,   3.94688762e-03,
             3.66524060e-03,   3.40248798e-03,   3.15748358e-03,
             2.92913807e-03,   2.71641758e-03,   2.51834214e-03,
             2.33398400e-03,   2.16246589e-03,   2.00295916e-03,
             1.85468189e-03,   1.71689698e-03,   1.58891028e-03,
             1.47006862e-03,   1.35975791e-03,   1.25740133e-03,
             1.16245744e-03,   1.07441841e-03,   9.92808253e-04,
             9.17181170e-04,   8.47119885e-04,   7.82234073e-04,
             7.22158846e-04,   6.66553291e-04,   6.15099077e-04,
             5.67499122e-04])

3.1.4 例題：混合モデルの教師無しクラスタリング

makedirs("data", exist_ok=True)
urlretrieve("https://git.io/vXt6b", "data/mixture_data.csv")

Output:

    ('data/mixture_data.csv', <http.client.HTTPMessage at 0x1606930ab38>)

figsize(25, 8)
plt.rcParams['font.size'] = 17
data = np.loadtxt("data/mixture_data.csv", delimiter=",")
plt.hist(data, bins=20, color="k", 
         histtype="stepfilled", alpha=0.8)
plt.title("Histogram of the dataset")
plt.ylim([0, None])
plt.xlabel('Value')
plt.ylabel('Count')
print(data[:10], "...")

Output:

    [ 115.85679142  152.26153716  178.87449059  162.93500815  107.02820697
      105.19141146  118.38288501  125.3769803   102.88054011  206.71326136] ...

f:id:koheitsutsumi223:20190331174910p:plain

p = pm.Uniform("p", 0., 1.)
assignment = pm.Categorical("assignment", [p, 1-p], 
                            size=data.shape[0])

taus = 1.0 / pm.Uniform("stds", lower=0, upper=100, size=2) ** 2 # テキストを読んだ感じ、ここはupper=100なんじゃないんかな？
centers = pm.Normal("centers", [120, 190], [0.01, 0.01], size=2)

@pm.deterministic
def center_i(assignment=assignment, centers=centers):
    return centers[assignment]

@pm.deterministic
def tau_i(assignment=assignment, taus=taus):
    return taus[assignment]

　どうやら、centers[assignment]のように書くと、assignmentと同じサイズのサンプルを返してくれるらしい。しかも、各平均はちゃんとassignmentでサンプルされたクラスタからサンプルされるらしい。すごい・・・

print(f"Random assignment: {assignment.value[:8]}")
print(f"Assigned center: {center_i.value[:8]}")
print(f"Assigned precision: {tau_i.value[:8]}")

Output:

    Random assignment: [0 1 1 0 0 0 1 1]
    Assigned center: [ 128.32964549  191.13553451  191.13553451  128.32964549  128.32964549
      128.32964549  191.13553451  191.13553451]
    Assigned precision: [ 0.0001068   0.00012591  0.00012591  0.0001068   0.0001068   0.0001068
      0.00012591  0.00012591]

observations = pm.Normal("obs", center_i, tau_i, value=data, observed=True) # これ、どこで使うんだ？
model = pm.Model([p, assignment, taus, centers])

mcmc = pm.MCMC(model)
mcmc.sample(50000)

Output:

     [-----------------100%-----------------] 50000 of 50000 complete in 20.9 sec

figsize(25, 18)
plt.rcParams['font.size'] = 17
line_width = 1

colors = ["#348ABD", "#A60628"]

center_trace = mcmc.trace("centers")[:]
if center_trace[-1, 0] < center_trace[-1, 1]:
    colors = ["#A60628", "#348ABD"]
    
plt.subplot(311)
plt.plot(center_trace[:, 0], 
         label="trace of center 0", 
         c=colors[0], lw=line_width)
plt.plot(center_trace[:, 1], 
         label="trace of center 1", 
         c=colors[1], lw=line_width)
plt.title("Traces of unknown parameters")
leg = plt.legend(loc="upper right")
leg.get_frame().set_alpha(0.7)

plt.subplot(312)
std_trace = mcmc.trace("stds")[:]
plt.plot(std_trace[:, 0], 
         label="trace of standard deviation "
               "of cluster 0", 
         c=colors[0], lw=line_width)
plt.plot(std_trace[:, 1], 
         label="trace of standard deviation "
               "of cluster 1", 
         c=colors[1], lw=line_width)
plt.legend(loc="upper right")

plt.subplot(313)
p_trace = mcmc.trace("p")[:]
plt.plot(p_trace, 
         label="$p$: frequency of assignment to cluster 0", 
         color="#467821", lw=line_width)
plt.xlabel("Steps")
plt.ylabel("Value")
plt.ylim(0, 1)
plt.legend()

Output:

    <matplotlib.legend.Legend at 0x16020cd9828>

f:id:koheitsutsumi223:20190331174916p:plain

mcmc.sample(100000)

Output:

     [-----------------100%-----------------] 100000 of 100000 complete in 41.7 sec

figsize(25, 8)
plt.rcParams['font.size'] = 17

center_trace = mcmc.trace("centers", chain=1)[:]
prev_center_trace = mcmc.trace("centers", chain=0)[:]

x = np.arange(50000)
plt.plot(x, prev_center_trace[:, 0],
         label="previous trace of center 0", 
         lw=line_width, alpha=0.4, c=colors[0])
plt.plot(x, prev_center_trace[:, 1], 
         label="previous trace of center 1", 
         lw=line_width, alpha=0.4, c=colors[1])

x = np.arange(50000, 150000)
plt.plot(x, center_trace[:, 0],
         label="new trace of center 0",
         lw=line_width, c=colors[0])
plt.plot(x, center_trace[:, 1],
         label="new trace of center 1",
         lw=line_width, c=colors[1])

plt.title("Trace of unknown center parameters "
          "after sampling 1000,000 more times")
leg = plt.legend(loc="upper right")
leg.get_frame().set_alpha(0.8)
plt.ylabel("Value")
plt.xlabel("Steps")

Output:

    Text(0.5, 0, 'Steps')

f:id:koheitsutsumi223:20190331174922p:plain

figsize(25, 9)
plt.rcParams['font.size'] = 17

plt.cmap = mpl.colors.ListedColormap(colors)
plt.imshow(mcmc.trace("assignment")[::400, np.argsort(data)], # np.argsort(data)を引数に持たせることで、各サンプルを観測データが小さい順に並べ替えてくれる。
           cmap=plt.cmap, aspect=.4, alpha=.9)

plt.xticks(np.arange(0, data.shape[0], 40),
           [f"{round(s, 2)}" for s in np.sort(data)[::40]])
plt.ylabel("Posterior sample")
plt.xlabel("Value of $i$th data point")
plt.title("Posterior labels fo data points")

Output:

    Text(0.5, 1.0, 'Posterior labels fo data points')

f:id:koheitsutsumi223:20190331174927p:plain

cmap = mpl.colors.LinearSegmentedColormap.from_list("BMH", colors)

assign_trace = mcmc.trace("assignment")[:]
plt.scatter(data, 1 - assign_trace.mean(axis=0), cmap=cmap,
            c=assign_trace.mean(axis=0), s=50)

plt.ylim(-0.05, 1.05)
plt.xlim(35, 300)
plt.title("Probability of data point belonging to cluster 0")
plt.ylabel("Probability")
plt.xlabel("Value of data point")

Output:

    Text(0.5, 0, 'Value of data point')

f:id:koheitsutsumi223:20190331174932p:plain

norm = stats.norm

x = np.linspace(20, 300, 500)
posterior_center_means = center_trace.mean(axis=0)
posterior_std_means = std_trace.mean(axis=0)
posterior_p_mean = mcmc.trace("p")[:].mean()

plt.hist(data, bins=20, histtype="step", density=True, color="k",
         lw=2, label="histogram of data")

y = posterior_p_mean * norm.pdf(x,
                                loc=posterior_center_means[0],
                                scale=posterior_std_means[0])
plt.plot(x, y, lw=3, color=colors[0],
         label="cluster 0"
         "(using posterior-mean parameters)")
plt.fill_between(x, y, color=colors[0], alpha=0.3)

y = (1 - posterior_p_mean) * norm.pdf(x,
                                loc=posterior_center_means[1],
                                scale=posterior_std_means[1])
plt.plot(x, y, lw=3, color=colors[1],
         label="cluster 1"
         "(using posterior-mean parameters)")
plt.fill_between(x, y, color=colors[1], alpha=0.3)

plt.legend(loc="upper left")
plt.title("Visualizing clusters using posterior-mean parameters")

Output:

    Text(0.5, 1.0, 'Visualizing clusters using posterior-mean parameters')

f:id:koheitsutsumi223:20190331174936p:plain

3.1.5 事後サンプルを混ぜないで

p_trace=mcmc.trace("p")[:]
std_trace = mcmc.trace("stds")[:]

def prob_of_belong_cluster(x, p_trace=p_trace, std_trace=std_trace):
    norm_pdf = stats.norm.pdf

    v = p_trace * norm_pdf(x,
                           loc=center_trace[:, 0],
                           scale=std_trace[:, 0]) < \
        (1 - p_trace) * norm_pdf(x,
                                 loc=center_trace[:, 1],
                                 scale=std_trace[:, 1]) # テキストは不等号が逆なんじゃないか？
    return v.mean()

X = np.linspace(20, 300, 500)
Y = np.array([prob_of_belong_cluster(x=x) for x in X])

　各xの値について、クラスタ1に属する確率をプロットしてみた。

figsize(25, 9)
fig, ax1 = plt.subplots()
ax1.hist(data, bins=20, histtype="step", density=True, color="k",
         lw=2, label="histogram of data")
ax2 = ax1.twinx()  # 2つのプロットを関連付ける
ax2.plot(X, Y, label="Probability of belonging to cluster 1")
plt.ylim(0, 1.01)
plt.legend(loc="best")

Output:

    <matplotlib.legend.Legend at 0x160037e7b00>

f:id:koheitsutsumi223:20190331174945p:plain

3.1.6 MAPを使って収束を改善

　pm.MAPを使用してみました。だけど、使い方が間違えてるのか、バーンインが短くなっているかどうかはぱっと見わかりませんね・・・ pの初期値なんてほぼ0になってるし・・・

p = pm.Uniform("p", 0., 1.)
assignment = pm.Categorical("assignment", [p, 1-p], 
                            size=data.shape[0])

taus = 1.0 / pm.Uniform("stds", lower=0, upper=100, size=2) ** 2 # テキストを読んだ感じ、ここはupper=100なんじゃないんかな？
centers = pm.Normal("centers", [120, 190], [0.01, 0.01], size=2)

@pm.deterministic
def center_i(assignment=assignment, centers=centers):
    return centers[assignment]

@pm.deterministic
def tau_i(assignment=assignment, taus=taus):
    return taus[assignment]

observations = pm.Normal("obs", center_i, tau_i, value=data, observed=True) # これ、どこで使うんだ？
model = pm.Model([p, assignment, taus, centers])

map_ = pm.MAP(model)
map_.fit()

mcmc = pm.MCMC(model)
mcmc.sample(50000)

figsize(25, 18)
plt.rcParams['font.size'] = 17
line_width = 1

colors = ["#348ABD", "#A60628"]

center_trace = mcmc.trace("centers")[:]
if center_trace[-1, 0] < center_trace[-1, 1]:
    colors = ["#A60628", "#348ABD"]
    
plt.subplot(311)
plt.plot(center_trace[:, 0], 
         label="trace of center 0", 
         c=colors[0], lw=line_width)
plt.plot(center_trace[:, 1], 
         label="trace of center 1", 
         c=colors[1], lw=line_width)
plt.title("Traces of unknown parameters")
leg = plt.legend(loc="upper right")
leg.get_frame().set_alpha(0.7)

plt.subplot(312)
std_trace = mcmc.trace("stds")[:]
plt.plot(std_trace[:, 0], 
         label="trace of standard deviation "
               "of cluster 0", 
         c=colors[0], lw=line_width)
plt.plot(std_trace[:, 1], 
         label="trace of standard deviation "
               "of cluster 1", 
         c=colors[1], lw=line_width)
plt.legend(loc="upper right")

plt.subplot(313)
p_trace = mcmc.trace("p")[:]
plt.plot(p_trace, 
         label="$p$: frequency of assignment to cluster 0", 
         color="#467821", lw=line_width)
plt.xlabel("Steps")
plt.ylabel("Value")
plt.ylim(0, 1)
plt.legend()

Output:

    Warning: Stochastic assignment's value is neither numerical nor array with floating-point dtype. Recommend fitting method fmin (default).
     [-----------------100%-----------------] 50000 of 50000 complete in 19.7 sec




    <matplotlib.legend.Legend at 0x160049da240>

f:id:koheitsutsumi223:20190331174952p:plain