Pythonで体験するベイズ推論 PyMCによるMCMC入門 第3章 (3.1全部) 写経
目次
始めに
この記事では、Pythonで体験するベイズ推論 PyMCによるMCMC入門
の第3章の3.1を写経した記録をまとめます。前回はこちらPythonで体験するベイズ推論 PyMCによるMCMC入門 第2章 (2.2.6から2章の終わりまで) 写経。 実行はJupyter Labにて実行しています。
写経は以下のようにまとめました。
写経ではありますが、関数を作成し、コードの再実行をやりやすいようにした部分があります。
より良いと思われるコードを考えた場合は書き換えてコメントを添えるようにし、変更点をなるべく明示するようにしてあります。
個人的に気になった点のメモを残すようにしてあります。同じような疑問を持った方の助けになれば幸いです。
以前書いたコードと同じようなコード(例えばグラフの描写等)は効率化のために飛ばしているところもあります。
Pythonで体験するベイズ推論 PyMCによるMCMC入門を読んでいている際のちょっとした正誤表代わりになればと思います。
この記事で使用する主なモジュール、設定
この記事では、主に以下のモジュールや設定を使用しています。
import scipy.stats as stats import matplotlib as mpl from matplotlib import pyplot as plt %matplotlib inline from IPython.core.pylabtools import figsize import numpy as np from mpl_toolkits.mplot3d import Axes3D import pprint from os import makedirs from urllib.request import urlretrieve import pymc as pm
3 MCMCの中をのぞいてみよう
3.1 山あり谷あり、分布の地形
figsize(25, 8) plt.rcParams['font.size'] = 10 jet = plt.cm.jet fig = plt.figure() x = y = np.linspace(0, 5, 100) X, Y = np.meshgrid(x, y) plt.subplot(121) uni_x = stats.uniform.pdf(x, loc=0, scale=5) uni_y = stats.uniform.pdf(y, loc=0, scale=5) M = np.dot(uni_x[:, None], uni_y[None, :]) im = plt.imshow(M, interpolation='none', origin='lower', cmap=jet, vmax=1, vmin=-.15, extent=(0, 5, 0, 5)) plt.xlim(0, 5) plt.ylim(0, 5) plt.title("Overhead view of landscape formed by " "Uniform priors") ax = fig.add_subplot(122, projection='3d') ax.plot_surface(X, Y, M, cmap=plt.cm.jet, vmax=1, vmin=-0.15) ax.view_init(azim=390) ax.set_xlabel('Value of $p_1$') ax.set_ylabel('Value of $p_2$') ax.set_zlabel('Dencity') plt.title("Alternate view of landscape formed by " "Uniform priors")
Output:
Text(0.5, 0.92, 'Alternate view of landscape formed by Uniform priors')
メモ
np.meshgrid
は以下のような二つの行列を返しています。
np.meshgrid(x, y)
Output:
[array([[ 0. , 0.05050505, 0.1010101 , ..., 4.8989899 , 4.94949495, 5. ], [ 0. , 0.05050505, 0.1010101 , ..., 4.8989899 , 4.94949495, 5. ], [ 0. , 0.05050505, 0.1010101 , ..., 4.8989899 , 4.94949495, 5. ], ..., [ 0. , 0.05050505, 0.1010101 , ..., 4.8989899 , 4.94949495, 5. ], [ 0. , 0.05050505, 0.1010101 , ..., 4.8989899 , 4.94949495, 5. ], [ 0. , 0.05050505, 0.1010101 , ..., 4.8989899 , 4.94949495, 5. ]]), array([[ 0. , 0. , 0. , ..., 0. , 0. , 0. ], [ 0.05050505, 0.05050505, 0.05050505, ..., 0.05050505, 0.05050505, 0.05050505], [ 0.1010101 , 0.1010101 , 0.1010101 , ..., 0.1010101 , 0.1010101 , 0.1010101 ], ..., [ 4.8989899 , 4.8989899 , 4.8989899 , ..., 4.8989899 , 4.8989899 , 4.8989899 ], [ 4.94949495, 4.94949495, 4.94949495, ..., 4.94949495, 4.94949495, 4.94949495], [ 5. , 5. , 5. , ..., 5. , 5. , 5. ]])]
一つ目は、各行がnp.linspace(0, 5, 100)
となっている100行100列の行列で、二つ目はその転置となっています。np.meshgrid
については、等高線の描写などと相性が良いとdocstringに書いてありました。
print(np.meshgrid.__doc__)
(クリックすると展開されます)
Return coordinate matrices from coordinate vectors. Make N-D coordinate arrays for vectorized evaluations of N-D scalar/vector fields over N-D grids, given one-dimensional coordinate arrays x1, x2,..., xn. .. versionchanged:: 1.9 1-D and 0-D cases are allowed. Parameters ---------- x1, x2,..., xn : array_like 1-D arrays representing the coordinates of a grid. indexing : {'xy', 'ij'}, optional Cartesian ('xy', default) or matrix ('ij') indexing of output. See Notes for more details. .. versionadded:: 1.7.0 sparse : bool, optional If True a sparse grid is returned in order to conserve memory. Default is False. .. versionadded:: 1.7.0 copy : bool, optional If False, a view into the original arrays are returned in order to conserve memory. Default is True. Please note that ``sparse=False, copy=False`` will likely return non-contiguous arrays. Furthermore, more than one element of a broadcast array may refer to a single memory location. If you need to write to the arrays, make copies first. .. versionadded:: 1.7.0 Returns ------- X1, X2,..., XN : ndarray For vectors `x1`, `x2`,..., 'xn' with lengths ``Ni=len(xi)`` , return ``(N1, N2, N3,...Nn)`` shaped arrays if indexing='ij' or ``(N2, N1, N3,...Nn)`` shaped arrays if indexing='xy' with the elements of `xi` repeated to fill the matrix along the first dimension for `x1`, the second for `x2` and so on. Notes ----- This function supports both indexing conventions through the indexing keyword argument. Giving the string 'ij' returns a meshgrid with matrix indexing, while 'xy' returns a meshgrid with Cartesian indexing. In the 2-D case with inputs of length M and N, the outputs are of shape (N, M) for 'xy' indexing and (M, N) for 'ij' indexing. In the 3-D case with inputs of length M, N and P, outputs are of shape (N, M, P) for 'xy' indexing and (M, N, P) for 'ij' indexing. The difference is illustrated by the following code snippet:: xv, yv = meshgrid(x, y, sparse=False, indexing='ij') for i in range(nx): for j in range(ny): # treat xv[i,j], yv[i,j] xv, yv = meshgrid(x, y, sparse=False, indexing='xy') for i in range(nx): for j in range(ny): # treat xv[j,i], yv[j,i] In the 1-D and 0-D case, the indexing and sparse keywords have no effect. See Also -------- index_tricks.mgrid : Construct a multi-dimensional "meshgrid" using indexing notation. index_tricks.ogrid : Construct an open multi-dimensional "meshgrid" using indexing notation. Examples -------- >>> nx, ny = (3, 2) >>> x = np.linspace(0, 1, nx) >>> y = np.linspace(0, 1, ny) >>> xv, yv = meshgrid(x, y) >>> xv array([[ 0. , 0.5, 1. ], [ 0. , 0.5, 1. ]]) >>> yv array([[ 0., 0., 0.], [ 1., 1., 1.]]) >>> xv, yv = meshgrid(x, y, sparse=True) # make sparse output arrays >>> xv array([[ 0. , 0.5, 1. ]]) >>> yv array([[ 0.], [ 1.]]) `meshgrid` is very useful to evaluate functions on a grid. >>> x = np.arange(-5, 5, 0.1) >>> y = np.arange(-5, 5, 0.1) >>> xx, yy = meshgrid(x, y, sparse=True) >>> z = np.sin(xx**2 + yy**2) / (xx**2 + yy**2) >>> h = plt.contourf(x,y,z)
そして、ax.plot_surface(X, Y, M, cmap=plt.cm.jet, vmax=1, vmin=-0.15)
では、X
とY
を座標としてM
をプロットしているはずです。以下に、簡単な例を載せておきます。np.meshgrid
については、等高線の描写などと相性が良いとdocstringに書いてありました。
figsize(12, 12) fig = plt.figure() ax_ = fig.add_subplot(111, projection='3d') x_ = np.array([0, 1]) y_ = np.array([0, 1]) X_, Y_ = np.meshgrid(x_, y_) M_ = np.array([[0, 1], [1, 2]]) print(f"X_:\n {X_}") print(f"Y_:\n {Y_}") ax_.plot_surface(X_, Y_, M_, cmap=plt.cm.jet, vmax=1, vmin=-0.15)
Output:
X_: [[0 1] [0 1]] Y_: [[0 0] [1 1]] <mpl_toolkits.mplot3d.art3d.Poly3DCollection at 0x1600073d710>
メモ
uni_x[:, None]
は100行1列の行列、uni_y[None, :]
は1行100列のベクトルになっています。従って、M
は100行100列の行列になります。
また、plt.imshow
、plt.cm.jet
、ax.plot_surface
、ax.view_init
の説明は以下のようになっています。plt.cm.jet
はスカラーを色に変換する関数のような印象ですが、あまり理解できないですね・・・
print(plt.imshow.__doc__)
(クリックすると展開されます)
Display an image, i.e. data on a 2D regular raster. Parameters ---------- X : array-like or PIL image The image data. Supported array shapes are: - (M, N): an image with scalar data. The data is visualized using a colormap. - (M, N, 3): an image with RGB values (float or uint8). - (M, N, 4): an image with RGBA values (float or uint8), i.e. including transparency. The first two dimensions (M, N) define the rows and columns of the image. The RGB(A) values should be in the range [0 .. 1] for floats or [0 .. 255] for integers. Out-of-range values will be clipped to these bounds. cmap : str or `~matplotlib.colors.Colormap`, optional A Colormap instance or registered colormap name. The colormap maps scalar data to colors. It is ignored for RGB(A) data. Defaults to :rc:`image.cmap`. aspect : {'equal', 'auto'} or float, optional Controls the aspect ratio of the axes. The aspect is of particular relevance for images since it may distort the image, i.e. pixel will not be square. This parameter is a shortcut for explicitly calling `.Axes.set_aspect`. See there for further details. - 'equal': Ensures an aspect ratio of 1. Pixels will be square (unless pixel sizes are explicitly made non-square in data coordinates using *extent*). - 'auto': The axes is kept fixed and the aspect is adjusted so that the data fit in the axes. In general, this will result in non-square pixels. If not given, use :rc:`image.aspect` (default: 'equal'). interpolation : str, optional The interpolation method used. If *None* :rc:`image.interpolation` is used, which defaults to 'nearest'. Supported values are 'none', 'nearest', 'bilinear', 'bicubic', 'spline16', 'spline36', 'hanning', 'hamming', 'hermite', 'kaiser', 'quadric', 'catrom', 'gaussian', 'bessel', 'mitchell', 'sinc', 'lanczos'. If *interpolation* is 'none', then no interpolation is performed on the Agg, ps and pdf backends. Other backends will fall back to 'nearest'. See :doc:`/gallery/images_contours_and_fields/interpolation_methods` for an overview of the supported interpolation methods. Some interpolation methods require an additional radius parameter, which can be set by *filterrad*. Additionally, the antigrain image resize filter is controlled by the parameter *filternorm*. norm : `~matplotlib.colors.Normalize`, optional If scalar data are used, the Normalize instance scales the data values to the canonical colormap range [0,1] for mapping to colors. By default, the data range is mapped to the colorbar range using linear scaling. This parameter is ignored for RGB(A) data. vmin, vmax : scalar, optional When using scalar data and no explicit *norm*, *vmin* and *vmax* define the data range that the colormap covers. By default, the colormap covers the complete value range of the supplied data. *vmin*, *vmax* are ignored if the *norm* parameter is used. alpha : scalar, optional The alpha blending value, between 0 (transparent) and 1 (opaque). This parameter is ignored for RGBA input data. origin : {'upper', 'lower'}, optional Place the [0,0] index of the array in the upper left or lower left corner of the axes. The convention 'upper' is typically used for matrices and images. If not given, :rc:`image.origin` is used, defaulting to 'upper'. Note that the vertical axes points upward for 'lower' but downward for 'upper'. extent : scalars (left, right, bottom, top), optional The bounding box in data coordinates that the image will fill. The image is stretched individually along x and y to fill the box. The default extent is determined by the following conditions. Pixels have unit size in data coordinates. Their centers are on integer coordinates, and their center coordinates range from 0 to columns-1 horizontally and from 0 to rows-1 vertically. Note that the direction of the vertical axis and thus the default values for top and bottom depend on *origin*: - For ``origin == 'upper'`` the default is ``(-0.5, numcols-0.5, numrows-0.5, -0.5)``. - For ``origin == 'lower'`` the default is ``(-0.5, numcols-0.5, -0.5, numrows-0.5)``. See the example :doc:`/tutorials/intermediate/imshow_extent` for a more detailed description. shape : scalars (columns, rows), optional, default: None For raw buffer images. filternorm : bool, optional, default: True A parameter for the antigrain image resize filter (see the antigrain documentation). If *filternorm* is set, the filter normalizes integer values and corrects the rounding errors. It doesn't do anything with the source floating point values, it corrects only integers according to the rule of 1.0 which means that any sum of pixel weights must be equal to 1.0. So, the filter function must produce a graph of the proper shape. filterrad : float > 0, optional, default: 4.0 The filter radius for filters that have a radius parameter, i.e. when interpolation is one of: 'sinc', 'lanczos' or 'blackman'. resample : bool, optional When *True*, use a full resampling method. When *False*, only resample when the output image is larger than the input image. url : str, optional Set the url of the created `.AxesImage`. See `.Artist.set_url`. Returns ------- image : `~matplotlib.image.AxesImage` Other Parameters ---------------- **kwargs : `~matplotlib.artist.Artist` properties These parameters are passed on to the constructor of the `.AxesImage` artist. See also -------- matshow : Plot a matrix or an array as an image. Notes ----- Unless *extent* is used, pixel centers will be located at integer coordinates. In other words: the origin will coincide with the center of pixel (0, 0). There are two common representations for RGB images with an alpha channel: - Straight (unassociated) alpha: R, G, and B channels represent the color of the pixel, disregarding its opacity. - Premultiplied (associated) alpha: R, G, and B channels represent the color of the pixel, adjusted for its opacity by multiplication. `~matplotlib.pyplot.imshow` expects RGB images adopting the straight (unassociated) alpha representation. .. note:: In addition to the above described arguments, this function can take a **data** keyword argument. If such a **data** argument is given, the following arguments are replaced by **data[<arg>]**: * All positional and all keyword arguments. Objects passed as **data** must support item access (``data[<arg>]``) and membership test (``<arg> in data``).
print(plt.cm.jet.__doc__)
(クリックすると展開されます)
Colormap objects based on lookup tables using linear segments. The lookup table is generated using linear interpolation for each primary color, with the 0-1 domain divided into any number of segments.
print(ax.plot_surface.__doc__)
(クリックすると展開されます)
Create a surface plot. By default it will be colored in shades of a solid color, but it also supports color mapping by supplying the *cmap* argument. .. note:: The *rcount* and *ccount* kwargs, which both default to 50, determine the maximum number of samples used in each direction. If the input data is larger, it will be downsampled (by slicing) to these numbers of points. Parameters ---------- X, Y, Z : 2d arrays Data values. rcount, ccount : int Maximum number of samples used in each direction. If the input data is larger, it will be downsampled (by slicing) to these numbers of points. Defaults to 50. .. versionadded:: 2.0 rstride, cstride : int Downsampling stride in each direction. These arguments are mutually exclusive with *rcount* and *ccount*. If only one of *rstride* or *cstride* is set, the other defaults to 10. 'classic' mode uses a default of ``rstride = cstride = 10`` instead of the new default of ``rcount = ccount = 50``. color : color-like Color of the surface patches. cmap : Colormap Colormap of the surface patches. facecolors : array-like of colors. Colors of each individual patch. norm : Normalize Normalization for the colormap. vmin, vmax : float Bounds for the normalization. shade : bool Whether to shade the face colors. **kwargs : Other arguments are forwarded to `.Poly3DCollection`.
ax.view_init??
(クリックすると展開されます)
Signature: ax.view_init(elev=None, azim=None) Source: def view_init(self, elev=None, azim=None): """ Set the elevation and azimuth of the axes. This can be used to rotate the axes programmatically. 'elev' stores the elevation angle in the z plane. 'azim' stores the azimuth angle in the x,y plane. if elev or azim are None (default), then the initial value is used which was specified in the :class:`Axes3D` constructor. """ self.dist = 10 if elev is None: self.elev = self.initial_elev else: self.elev = elev if azim is None: self.azim = self.initial_azim else: self.azim = azim File: c:\users\hoge\anaconda3\lib\site-packages\mpl_toolkits\mplot3d\axes3d.py Type: method
figsize(25, 5) plt.rcParams['font.size'] = 13 fig = plt.figure() plt.subplot(121) exp_x = stats.expon.pdf(x, scale=3) exp_y = stats.expon.pdf(x, scale=10) M = np.dot(exp_x[:, None], exp_y[None, :]) CS = plt.contour(X, Y, M) # 等高線をプロットするようだ。 # ここについても、座標の取り方はax.plot_surface(X, Y, M, cmap=plt.cm.jet, vmax=1, vmin=-0.15) # と同じように理解できる。 im = plt.imshow(M, interpolation='none', origin='lower', cmap=jet, extent=(0, 5, 0, 5)) plt.title("Overhead view of landscape formed by " "$Exp(3), Exp(10)$ priors") ax = fig.add_subplot(122, projection='3d') ax.plot_surface(X, Y, M, cmap=jet) ax.view_init(azim=390) ax.set_xlabel('Value of $p_1$') ax.set_ylabel('Value of $p_2$') ax.set_zlabel('Density') plt.title("Alternate view of landscape\nformed by " "Exp(3), Exp(10) priors")
Output:
Text(0.5, 0.92, 'Alternate view of landscape\nformed by Exp(3), Exp(10) priors')
def plot_landscape_warped_by_n_data_obs_Exp_pri_dist(N=1): lambda_1_true = 1 lambda_2_true = 3 plt.rcParams['font.size'] = 17 data = np.concatenate([ stats.poisson.rvs(lambda_1_true, size=(N, 1)), stats.poisson.rvs(lambda_2_true, size=(N, 1)) ], axis=1) print(f"observed (2-dimensional, sample size = {N}):", data) x = y = np.linspace(.01, 5, 100) X, Y = np.meshgrid(x, y) likelihood_x = np.array([stats.poisson.pmf(data[:, 0], _x) for _x in x]).prod(axis=1) likelihood_y = np.array([stats.poisson.pmf(data[:, 1], _y) for _y in y]).prod(axis=1) L = np.dot(likelihood_x[:, None], likelihood_y[None, :]) figsize(25, 24) plt.subplot(221) uni_x = stats.uniform.pdf(x, loc=0, scale=5) uni_y = stats.uniform.pdf(y, loc=0, scale=5) M = np.dot(uni_x[:, None], uni_y[None, :]) plt.imshow(M, interpolation='none', origin='lower', cmap=jet, vmax=1, vmin=-.15, extent=(0, 5, 0, 5)) plt.scatter(lambda_2_true, lambda_1_true, c='k', s=50, edgecolor="none") plt.xlim(0, 5) plt.ylim(0, 5) plt.title("Landscape formed by " "Uniform priors on $p_1, p_2$") plt.xlabel("Value of $p_1$") plt.ylabel("Value of $p_2$") plt.subplot(223) plt.contour(X, Y, M * L) plt.imshow(M * L, interpolation='none', origin='lower', cmap=jet, extent=(0, 5, 0, 5)) plt.scatter(lambda_2_true, lambda_1_true, c='k', s=50, edgecolor='none') plt.xlim(0, 5) plt.ylim(0, 5) plt.title(f"Lnadscape warped by {N} data observation;\n" "Uniform priors on $p_1, p_2$") plt.xlabel("Value of $p_1$") plt.ylabel("Value of $p_2$") plt.subplot(222) exp_x = stats.expon.pdf(x, loc=0, scale=3) exp_y = stats.expon.pdf(y, loc=0, scale=10) M = np.dot(exp_x[:, None], exp_y[None, :]) plt.contour(x, y, M) plt.imshow(M, interpolation='none', origin='lower', cmap=jet, extent=(0, 5, 0, 5)) plt.scatter(lambda_2_true, lambda_1_true, c="k", s=50, edgecolor="none") plt.xlim(0, 5) plt.ylim(0, 5) plt.title("Landscape formed by " "Exponential priors on $p_1, p_2$") plt.xlabel("Value of $p_1$") plt.ylabel("Value of $p_2$") plt.subplot(224) plt.contour(x, y, M * L) plt.imshow(M * L, interpolation='none', origin='lower', cmap=jet, extent=(0, 5, 0, 5)) plt.scatter(lambda_2_true, lambda_1_true, c="k", s=50, edgecolor="none") plt.xlim(0, 5) plt.ylim(0, 5) plt.xlabel('Value of $p_1$') plt.ylabel('Value of $p_2$') plt.title(f"Landscape warped by {N} data observation;\n" "Exponential priors on $p_1, p_2$") plt.xlabel("Value of $p_1$") plt.ylabel("Value of $p_2$")
plot_landscape_warped_by_n_data_obs_Exp_pri_dist()
Output:
observed (2-dimensional, sample size = 1): [[2 1]]
plot_landscape_warped_by_n_data_obs_Exp_pri_dist(N=2)
Output:
observed (2-dimensional, sample size = 2): [[2 2] [1 4]]
メモ
関数plot_landscape_warped_by_n_data_obs_Exp_pri_dist
の以下の部分でどのようなarray
が生成されているのかをメモしておきます。
data = np.concatenate([ stats.poisson.rvs(lambda_1_true, size=(N, 1)), stats.poisson.rvs(lambda_2_true, size=(N, 1)) ], axis=1) print(f"observed (2-dimensional, sample size = {N}):", data) x = y = np.linspace(.01, 5, 100) X, Y = np.meshgrid(x, y) likelihood_x = np.array([stats.poisson.pmf(data[:, 0], _x) for _x in x]).prod(axis=1) likelihood_y = np.array([stats.poisson.pmf(data[:, 1], _y) for _y in y]).prod(axis=1)
N=2 lambda_1_true = 1 lambda_2_true = 3 data = np.concatenate([ stats.poisson.rvs(lambda_1_true, size=(N, 1)), stats.poisson.rvs(lambda_2_true, size=(N, 1)) ], axis=1) print('data:') pprint.pprint(data) x = y = np.linspace(.01, 5, 100) print('\ndata[:, 0]:') pprint.pprint(data[:, 0]) likelihood_x = np.array([stats.poisson.pmf(data[:, 0], _x) for _x in x]).prod(axis=1) print('\nnp.array([stats.poisson.pmf(data[:, 0], _x) for _x in x]):') pprint.pprint(np.array([stats.poisson.pmf(data[:, 0], _x) for _x in x])) print('\nnp.array([stats.poisson.pmf(data[:, 0], _x) for _x in x]).prod(axis=1):') pprint.pprint(np.array([stats.poisson.pmf(data[:, 0], _x) for _x in x]).prod(axis=1))
Output:(クリックすると展開されます)
data: array([[ 2, 10], [ 0, 4]]) data[:, 0]: array([2, 0]) np.array([stats.poisson.pmf(data[:, 0], _x) for _x in x]): array([[ 4.95024917e-05, 9.90049834e-01], [ 1.71738965e-03, 9.41384100e-01], [ 5.49527628e-03, 8.95110521e-01], [ 1.10599167e-02, 8.51111513e-01], [ 1.81202397e-02, 8.09275269e-01], [ 2.64146968e-02, 7.69495478e-01], [ 3.57088061e-02, 7.31671056e-01], [ 4.57928792e-02, 6.95705888e-01], [ 5.64799179e-02, 6.61508581e-01], [ 6.76036698e-02, 6.28992237e-01], [ 7.90168306e-02, 5.98074229e-01], [ 9.05893832e-02, 5.68675990e-01], [ 1.02207064e-01, 5.40722817e-01], [ 1.13769950e-01, 5.14143677e-01], [ 1.25191149e-01, 4.88871030e-01], [ 1.36395603e-01, 4.64840657e-01], [ 1.47318973e-01, 4.41991492e-01], [ 1.57906624e-01, 4.20265474e-01], [ 1.68112684e-01, 3.99607395e-01], [ 1.77899183e-01, 3.79964760e-01], [ 1.87235256e-01, 3.61287656e-01], [ 1.96096421e-01, 3.43528621e-01], [ 2.04463907e-01, 3.26642529e-01], [ 2.12324047e-01, 3.10586470e-01], [ 2.19667714e-01, 2.95319644e-01], [ 2.26489810e-01, 2.80803256e-01], [ 2.32788800e-01, 2.67000419e-01], [ 2.38566283e-01, 2.53876058e-01], [ 2.43826605e-01, 2.41396823e-01], [ 2.48576502e-01, 2.29531002e-01], [ 2.52824780e-01, 2.18248444e-01], [ 2.56582021e-01, 2.07520479e-01], [ 2.59860318e-01, 1.97319844e-01], [ 2.62673037e-01, 1.87620620e-01], [ 2.65034598e-01, 1.78398160e-01], [ 2.66960278e-01, 1.69629028e-01], [ 2.68466040e-01, 1.61290942e-01], [ 2.69568370e-01, 1.53362712e-01], [ 2.70284140e-01, 1.45824194e-01], [ 2.70630477e-01, 1.38656230e-01], [ 2.70624657e-01, 1.31840606e-01], [ 2.70283997e-01, 1.25360003e-01], [ 2.69625775e-01, 1.19197953e-01], [ 2.68667148e-01, 1.13338798e-01], [ 2.67425086e-01, 1.07767648e-01], [ 2.65916310e-01, 1.02470347e-01], [ 2.64157245e-01, 9.74334344e-02], [ 2.62163976e-01, 9.26441102e-02], [ 2.59952207e-01, 8.80902044e-02], [ 2.57537235e-01, 8.37601451e-02], [ 2.54933921e-01, 7.96429292e-02], [ 2.52156669e-01, 7.57280943e-02], [ 2.49219412e-01, 7.20056925e-02], [ 2.46135598e-01, 6.84662647e-02], [ 2.42918183e-01, 6.51008169e-02], [ 2.39579622e-01, 6.19007971e-02], [ 2.36131869e-01, 5.88580738e-02], [ 2.32586373e-01, 5.59649150e-02], [ 2.28954087e-01, 5.32139689e-02], [ 2.25245463e-01, 5.05982452e-02], [ 2.21470464e-01, 4.81110969e-02], [ 2.17638567e-01, 4.57462039e-02], [ 2.13758775e-01, 4.34975569e-02], [ 2.09839623e-01, 4.13594418e-02], [ 2.05889189e-01, 3.93264253e-02], [ 2.01915108e-01, 3.73933415e-02], [ 1.97924578e-01, 3.55552780e-02], [ 1.93924380e-01, 3.38075643e-02], [ 1.89920884e-01, 3.21457591e-02], [ 1.85920064e-01, 3.05656397e-02], [ 1.81927514e-01, 2.90631907e-02], [ 1.77948459e-01, 2.76345945e-02], [ 1.73987767e-01, 2.62762206e-02], [ 1.70049965e-01, 2.49846173e-02], [ 1.66139254e-01, 2.37565027e-02], [ 1.62259517e-01, 2.25887557e-02], [ 1.58414338e-01, 2.14784092e-02], [ 1.54607011e-01, 2.04226416e-02], [ 1.50840556e-01, 1.94187701e-02], [ 1.47117728e-01, 1.84642437e-02], [ 1.43441034e-01, 1.75566369e-02], [ 1.39812739e-01, 1.66936433e-02], [ 1.36234886e-01, 1.58730701e-02], [ 1.32709300e-01, 1.50928319e-02], [ 1.29237602e-01, 1.43509463e-02], [ 1.25821221e-01, 1.36455279e-02], [ 1.22461403e-01, 1.29747843e-02], [ 1.19159221e-01, 1.23370109e-02], [ 1.15915588e-01, 1.17305872e-02], [ 1.12731260e-01, 1.11539722e-02], [ 1.09606852e-01, 1.06057005e-02], [ 1.06542842e-01, 1.00843791e-02], [ 1.03539583e-01, 9.58868312e-03], [ 1.00597308e-01, 9.11735301e-03], [ 9.77161399e-02, 8.66919104e-03], [ 9.48960974e-02, 8.24305840e-03], [ 9.21371035e-02, 7.83787224e-03], [ 8.94389915e-02, 7.45260294e-03], [ 8.68015116e-02, 7.08627149e-03], [ 8.42243375e-02, 6.73794700e-03]]) np.array([stats.poisson.pmf(data[:, 0], _x) for _x in x]).prod(axis=1): array([ 4.90099337e-05, 1.61672331e-03, 4.91887962e-03, 9.41322241e-03, 1.46642619e-02, 2.03259898e-02, 2.61270999e-02, 3.18583757e-02, 3.73619503e-02, 4.25221835e-02, 4.72579300e-02, 5.15160072e-02, 5.52656917e-02, 5.84941004e-02, 6.12023262e-02, 6.34022216e-02, 6.51137326e-02, 6.63627021e-02, 6.71790718e-02, 6.75954202e-02, 6.76457866e-02, 6.73647330e-02, 6.67866079e-02, 6.59449765e-02, 6.48721911e-02, 6.35990761e-02, 6.21547071e-02, 6.05662675e-02, 5.88589677e-02, 5.70560137e-02, 5.51786149e-02, 5.32460237e-02, 5.12755975e-02, 4.92828782e-02, 4.72816845e-02, 4.52842125e-02, 4.33011403e-02, 4.13417363e-02, 3.94139667e-02, 3.75246016e-02, 3.56793187e-02, 3.38828027e-02, 3.21388405e-02, 3.04504115e-02, 2.88197725e-02, 2.72485366e-02, 2.57377476e-02, 2.42879483e-02, 2.28992431e-02, 2.15713562e-02, 2.03036842e-02, 1.90953440e-02, 1.79452163e-02, 1.68519850e-02, 1.58141722e-02, 1.48301696e-02, 1.38982669e-02, 1.30166766e-02, 1.21835557e-02, 1.13970252e-02, 1.06551869e-02, 9.95613829e-03, 9.29798450e-03, 8.67884968e-03, 8.09688583e-03, 7.55028057e-03, 7.03726341e-03, 6.55611095e-03, 6.10515098e-03, 5.68276569e-03, 5.28739405e-03, 4.91753350e-03, 4.57174093e-03, 4.24863331e-03, 3.94688762e-03, 3.66524060e-03, 3.40248798e-03, 3.15748358e-03, 2.92913807e-03, 2.71641758e-03, 2.51834214e-03, 2.33398400e-03, 2.16246589e-03, 2.00295916e-03, 1.85468189e-03, 1.71689698e-03, 1.58891028e-03, 1.47006862e-03, 1.35975791e-03, 1.25740133e-03, 1.16245744e-03, 1.07441841e-03, 9.92808253e-04, 9.17181170e-04, 8.47119885e-04, 7.82234073e-04, 7.22158846e-04, 6.66553291e-04, 6.15099077e-04, 5.67499122e-04])
3.1.4 例題:混合モデルの教師無しクラスタリング
makedirs("data", exist_ok=True) urlretrieve("https://git.io/vXt6b", "data/mixture_data.csv")
Output:
('data/mixture_data.csv', <http.client.HTTPMessage at 0x1606930ab38>)
figsize(25, 8) plt.rcParams['font.size'] = 17 data = np.loadtxt("data/mixture_data.csv", delimiter=",") plt.hist(data, bins=20, color="k", histtype="stepfilled", alpha=0.8) plt.title("Histogram of the dataset") plt.ylim([0, None]) plt.xlabel('Value') plt.ylabel('Count') print(data[:10], "...")
Output:
[ 115.85679142 152.26153716 178.87449059 162.93500815 107.02820697 105.19141146 118.38288501 125.3769803 102.88054011 206.71326136] ...
p = pm.Uniform("p", 0., 1.) assignment = pm.Categorical("assignment", [p, 1-p], size=data.shape[0]) taus = 1.0 / pm.Uniform("stds", lower=0, upper=100, size=2) ** 2 # テキストを読んだ感じ、ここはupper=100なんじゃないんかな? centers = pm.Normal("centers", [120, 190], [0.01, 0.01], size=2) @pm.deterministic def center_i(assignment=assignment, centers=centers): return centers[assignment] @pm.deterministic def tau_i(assignment=assignment, taus=taus): return taus[assignment]
どうやら、centers[assignment]
のように書くと、assignment
と同じサイズのサンプルを返してくれるらしい。しかも、各平均はちゃんとassignment
でサンプルされたクラスタからサンプルされるらしい。すごい・・・
print(f"Random assignment: {assignment.value[:8]}") print(f"Assigned center: {center_i.value[:8]}") print(f"Assigned precision: {tau_i.value[:8]}")
Output:
Random assignment: [0 1 1 0 0 0 1 1] Assigned center: [ 128.32964549 191.13553451 191.13553451 128.32964549 128.32964549 128.32964549 191.13553451 191.13553451] Assigned precision: [ 0.0001068 0.00012591 0.00012591 0.0001068 0.0001068 0.0001068 0.00012591 0.00012591]
observations = pm.Normal("obs", center_i, tau_i, value=data, observed=True) # これ、どこで使うんだ? model = pm.Model([p, assignment, taus, centers])
mcmc = pm.MCMC(model)
mcmc.sample(50000)
Output:
[-----------------100%-----------------] 50000 of 50000 complete in 20.9 sec
figsize(25, 18) plt.rcParams['font.size'] = 17 line_width = 1 colors = ["#348ABD", "#A60628"] center_trace = mcmc.trace("centers")[:] if center_trace[-1, 0] < center_trace[-1, 1]: colors = ["#A60628", "#348ABD"] plt.subplot(311) plt.plot(center_trace[:, 0], label="trace of center 0", c=colors[0], lw=line_width) plt.plot(center_trace[:, 1], label="trace of center 1", c=colors[1], lw=line_width) plt.title("Traces of unknown parameters") leg = plt.legend(loc="upper right") leg.get_frame().set_alpha(0.7) plt.subplot(312) std_trace = mcmc.trace("stds")[:] plt.plot(std_trace[:, 0], label="trace of standard deviation " "of cluster 0", c=colors[0], lw=line_width) plt.plot(std_trace[:, 1], label="trace of standard deviation " "of cluster 1", c=colors[1], lw=line_width) plt.legend(loc="upper right") plt.subplot(313) p_trace = mcmc.trace("p")[:] plt.plot(p_trace, label="$p$: frequency of assignment to cluster 0", color="#467821", lw=line_width) plt.xlabel("Steps") plt.ylabel("Value") plt.ylim(0, 1) plt.legend()
Output:
<matplotlib.legend.Legend at 0x16020cd9828>
mcmc.sample(100000)
Output:
[-----------------100%-----------------] 100000 of 100000 complete in 41.7 sec
figsize(25, 8) plt.rcParams['font.size'] = 17 center_trace = mcmc.trace("centers", chain=1)[:] prev_center_trace = mcmc.trace("centers", chain=0)[:] x = np.arange(50000) plt.plot(x, prev_center_trace[:, 0], label="previous trace of center 0", lw=line_width, alpha=0.4, c=colors[0]) plt.plot(x, prev_center_trace[:, 1], label="previous trace of center 1", lw=line_width, alpha=0.4, c=colors[1]) x = np.arange(50000, 150000) plt.plot(x, center_trace[:, 0], label="new trace of center 0", lw=line_width, c=colors[0]) plt.plot(x, center_trace[:, 1], label="new trace of center 1", lw=line_width, c=colors[1]) plt.title("Trace of unknown center parameters " "after sampling 1000,000 more times") leg = plt.legend(loc="upper right") leg.get_frame().set_alpha(0.8) plt.ylabel("Value") plt.xlabel("Steps")
Output:
Text(0.5, 0, 'Steps')
figsize(25, 9) plt.rcParams['font.size'] = 17 plt.cmap = mpl.colors.ListedColormap(colors) plt.imshow(mcmc.trace("assignment")[::400, np.argsort(data)], # np.argsort(data)を引数に持たせることで、各サンプルを観測データが小さい順に並べ替えてくれる。 cmap=plt.cmap, aspect=.4, alpha=.9) plt.xticks(np.arange(0, data.shape[0], 40), [f"{round(s, 2)}" for s in np.sort(data)[::40]]) plt.ylabel("Posterior sample") plt.xlabel("Value of $i$th data point") plt.title("Posterior labels fo data points")
Output:
Text(0.5, 1.0, 'Posterior labels fo data points')
cmap = mpl.colors.LinearSegmentedColormap.from_list("BMH", colors) assign_trace = mcmc.trace("assignment")[:] plt.scatter(data, 1 - assign_trace.mean(axis=0), cmap=cmap, c=assign_trace.mean(axis=0), s=50) plt.ylim(-0.05, 1.05) plt.xlim(35, 300) plt.title("Probability of data point belonging to cluster 0") plt.ylabel("Probability") plt.xlabel("Value of data point")
Output:
Text(0.5, 0, 'Value of data point')
norm = stats.norm x = np.linspace(20, 300, 500) posterior_center_means = center_trace.mean(axis=0) posterior_std_means = std_trace.mean(axis=0) posterior_p_mean = mcmc.trace("p")[:].mean() plt.hist(data, bins=20, histtype="step", density=True, color="k", lw=2, label="histogram of data") y = posterior_p_mean * norm.pdf(x, loc=posterior_center_means[0], scale=posterior_std_means[0]) plt.plot(x, y, lw=3, color=colors[0], label="cluster 0" "(using posterior-mean parameters)") plt.fill_between(x, y, color=colors[0], alpha=0.3) y = (1 - posterior_p_mean) * norm.pdf(x, loc=posterior_center_means[1], scale=posterior_std_means[1]) plt.plot(x, y, lw=3, color=colors[1], label="cluster 1" "(using posterior-mean parameters)") plt.fill_between(x, y, color=colors[1], alpha=0.3) plt.legend(loc="upper left") plt.title("Visualizing clusters using posterior-mean parameters")
Output:
Text(0.5, 1.0, 'Visualizing clusters using posterior-mean parameters')
3.1.5 事後サンプルを混ぜないで
p_trace=mcmc.trace("p")[:] std_trace = mcmc.trace("stds")[:]
def prob_of_belong_cluster(x, p_trace=p_trace, std_trace=std_trace): norm_pdf = stats.norm.pdf v = p_trace * norm_pdf(x, loc=center_trace[:, 0], scale=std_trace[:, 0]) < \ (1 - p_trace) * norm_pdf(x, loc=center_trace[:, 1], scale=std_trace[:, 1]) # テキストは不等号が逆なんじゃないか? return v.mean()
X = np.linspace(20, 300, 500) Y = np.array([prob_of_belong_cluster(x=x) for x in X])
各xの値について、クラスタ1に属する確率をプロットしてみた。
figsize(25, 9) fig, ax1 = plt.subplots() ax1.hist(data, bins=20, histtype="step", density=True, color="k", lw=2, label="histogram of data") ax2 = ax1.twinx() # 2つのプロットを関連付ける ax2.plot(X, Y, label="Probability of belonging to cluster 1") plt.ylim(0, 1.01) plt.legend(loc="best")
Output:
<matplotlib.legend.Legend at 0x160037e7b00>
3.1.6 MAPを使って収束を改善
pm.MAP
を使用してみました。だけど、使い方が間違えてるのか、バーンインが短くなっているかどうかはぱっと見わかりませんね・・・
p
の初期値なんてほぼ0になってるし・・・
p = pm.Uniform("p", 0., 1.) assignment = pm.Categorical("assignment", [p, 1-p], size=data.shape[0]) taus = 1.0 / pm.Uniform("stds", lower=0, upper=100, size=2) ** 2 # テキストを読んだ感じ、ここはupper=100なんじゃないんかな? centers = pm.Normal("centers", [120, 190], [0.01, 0.01], size=2) @pm.deterministic def center_i(assignment=assignment, centers=centers): return centers[assignment] @pm.deterministic def tau_i(assignment=assignment, taus=taus): return taus[assignment] observations = pm.Normal("obs", center_i, tau_i, value=data, observed=True) # これ、どこで使うんだ? model = pm.Model([p, assignment, taus, centers]) map_ = pm.MAP(model) map_.fit() mcmc = pm.MCMC(model) mcmc.sample(50000) figsize(25, 18) plt.rcParams['font.size'] = 17 line_width = 1 colors = ["#348ABD", "#A60628"] center_trace = mcmc.trace("centers")[:] if center_trace[-1, 0] < center_trace[-1, 1]: colors = ["#A60628", "#348ABD"] plt.subplot(311) plt.plot(center_trace[:, 0], label="trace of center 0", c=colors[0], lw=line_width) plt.plot(center_trace[:, 1], label="trace of center 1", c=colors[1], lw=line_width) plt.title("Traces of unknown parameters") leg = plt.legend(loc="upper right") leg.get_frame().set_alpha(0.7) plt.subplot(312) std_trace = mcmc.trace("stds")[:] plt.plot(std_trace[:, 0], label="trace of standard deviation " "of cluster 0", c=colors[0], lw=line_width) plt.plot(std_trace[:, 1], label="trace of standard deviation " "of cluster 1", c=colors[1], lw=line_width) plt.legend(loc="upper right") plt.subplot(313) p_trace = mcmc.trace("p")[:] plt.plot(p_trace, label="$p$: frequency of assignment to cluster 0", color="#467821", lw=line_width) plt.xlabel("Steps") plt.ylabel("Value") plt.ylim(0, 1) plt.legend()
Output:
Warning: Stochastic assignment's value is neither numerical nor array with floating-point dtype. Recommend fitting method fmin (default). [-----------------100%-----------------] 50000 of 50000 complete in 19.7 sec <matplotlib.legend.Legend at 0x160049da240>
今回の写経は以上です。 ここまで読んでいただきありがとうございました。