始めに

の第5章を写経した記録をまとめます。前回はこちらPythonで体験するベイズ推論 PyMCによるMCMC入門第4章写経。実行はJupyter Labにて実行しています。

　写経は以下のようにまとめました。

写経ではありますが、関数を作成し、コードの再実行をやりやすいようにした部分があります。
より良いと思われるコードを考えた場合は書き換えてコメントを添えるようにし、変更点をなるべく明示するようにしてあります。
個人的に気になった点のメモを残すようにしてあります。同じような疑問を持った方の助けになれば幸いです。
以前書いたコードと同じようなコード（例えばグラフの描写等）は効率化のために飛ばしているところもあります。
記事内で使用するモジュールなどは一番最初に宣言するようにしてあります。

　Pythonで体験するベイズ推論 PyMCによるMCMC入門を読んでいている際のちょっとした正誤表代わりになればと思います。

この記事で使用する主なモジュール、設定

　この記事では、主に以下のモジュールや設定を使用しています。

import pymc as pm
from IPython.core.pylabtools import figsize
import scipy.stats as stats
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as sop
from os import makedirs
from urllib.request import urlretrieve
from pymc.Matplot import plot as mcplot

5 損失はおいくら?

5.2 損失関数

5.2.2 例題 : テレビ番組"The Price Is Right"の最適化

data_mu = [3e3, 12e3]
data_std = [5e2, 3e3]
mu_prior = 35e3
std_prior = 75e2

true_price = pm.Normal("true_price",
                       mu_prior, 1.0 / std_prior**2)
prize_1 = pm.Normal("first_prize",
                    data_mu[0], 1.0 / data_std[0]**2)
prize_2 = pm.Normal("second_prize",
                    data_mu[1], 1.0 / data_std[1]**2)
price_estimate = prize_1 + prize_2

@pm.potential
def error(true_price=true_price,
          price_estimate=price_estimate):
    return pm.normal_like(true_price,
                          price_estimate, 1 / (3e3)**2)

mcmc = pm.MCMC([true_price, prize_1, prize_2,
                price_estimate, error])
mcmc.sample(50000, 10000)

price_trace = mcmc.trace("true_price")[:]

Output:

     [-----------------100%-----------------] 50000 of 50000 complete in 18.2 sec

メモ

　pm.potentialデコレータは公式ドキュメント(4.5. The Potential class)を読んでみたが理解できなかった。前提知識が足りて無いかもしれない・・・

　それと、pm.normal_like関数は正規分布の対数尤度を返す関数のようです。

参考 : pymc.distributions.normal_like(x, mu, tau)

figsize(25, 8)
plt.rcParams['font.size'] = 17

x = np.linspace(5000, 40000)
plt.plot(x, stats.norm.pdf(x, 35000, 7500), c="k", lw=2,
         label="prior distribution\n"
         "of suite price")

_hist = plt.hist(price_trace, bins=35,
                 density=True, histtype="stepfilled")
plt.vlines(mu_prior, 0, 1.1 * np.max(_hist[0]),
           linestyles="--",
           label="prior's mean")
plt.vlines(price_trace.mean(), 0, 1.1 * np.max(_hist[0]),
           linestyles="--",
           label="posterior's mean")

plt.title("Posterior of the true price estimate")
plt.legend(loc="upper left")

Output:

    <matplotlib.legend.Legend at 0x200f387cc18>

f:id:koheitsutsumi223:20190405142422p:plain

figsize(25, 14)

def showdown_loss(guess, true_price, risk=80000):
    loss = np.zeros_like(true_price)
    ix = true_price < guess
    loss[~ix] = true_price[~ix] - guess # テキストのnp.absは不要
    close_mask = true_price - guess <= 250 # テキストの[]は不要。さらに、np.absも不要。
    loss[close_mask] = -2 * true_price[close_mask]
    loss[ix] = risk
    return loss

guesses = np.linspace(5000, 50000, 70)
risks = np.linspace(30000, 150000, 6)
expected_loss = lambda guess, risk: \
        showdown_loss(guess, price_trace, risk).mean()

for _p in risks:
    results = [expected_loss(_g, _p) for _g in guesses]
    plt.plot(guesses, results, label=f"{_p}")
    
plt.title("Expected loss of different guess,\n"
          "various risk levels of overestimating")
plt.legend(loc="upper left",
           title="risk parameter")
plt.xlabel("Price bid")
plt.ylabel("Expected loss")
plt.xlim(5000, 30000)

Output:

    (5000, 30000)

f:id:koheitsutsumi223:20190405142426p:plain

メモ

　showdown_loss関数の定義がわかりにくい書き方でした。以下のような定義の方が少しわかりやすいかもしれません。np.logical_andはデータ型がbool型である二つのarrayの各要素の論理積から成るarrayを返します。

参考 : numpy.logical_and

その他、arrayに関するbool型の関数は以下にあります。

参考 : Logic functions

figsize(25, 14)

def showdown_loss(guess, true_price, risk=80000):
    loss = np.zeros_like(true_price)
    ix = guess - true_price < - 250 
    loss[ix] = true_price[ix] - guess
    ix = np.logical_and(- 250 <= guess - true_price, guess - true_price <= 0)
    loss[ix] = -2 * true_price[ix]
    ix = guess - true_price > 0
    loss[ix] = risk
    return loss

guesses = np.linspace(5000, 50000, 70)
risks = np.linspace(30000, 150000, 6)
expected_loss = lambda guess, risk: \
        showdown_loss(guess, price_trace, risk).mean()

for _p in risks:
    results = [expected_loss(_g, _p) for _g in guesses]
    plt.plot(guesses, results, label=f"{_p}")
    
plt.title("Expected loss of different guess,\n"
          "various risk levels of overestimating")
plt.legend(loc="upper left",
           title="risk parameter")
plt.xlabel("Price bid")
plt.ylabel("Expected loss")
plt.xlim(5000, 30000)

Output:

    (5000, 30000)

f:id:koheitsutsumi223:20190405142430p:plain

ax = plt.subplot(111)

for _p in risks:
    _color = next(ax._get_lines.prop_cycler)
    _min_results = sop.fmin(expected_loss, 15000,
                            args=(_p,), disp=False)
    _results = [expected_loss(_g, _p) for _g in guesses]
    plt.plot(guesses, _results, color=_color['color'])
    plt.scatter(_min_results, 0, s=60,
                color=_color['color'], label=f"{_p}")
    plt.vlines(_min_results, 0, 120000,
               color=_color['color'], linestyles="--")
    print(f"minimum at risk {_p}: {_min_results}")
    
plt.title("Expected loss and Bayes actions of different guesses,\n"
          "various risk levels of overestimating")
plt.legend(loc="upper left", scatterpoints=1,
           title="Bayes action at risk:")
plt.xlabel("Price guess")
plt.ylabel("Expected loss")
plt.xlim(7000, 30000)
plt.ylim(-1000, 80000)

Output:

    minimum at risk 30000.0: [ 13835.10500193]
    minimum at risk 54000.0: [ 12936.69855595]
    minimum at risk 78000.0: [ 11813.4368062]
    minimum at risk 102000.0: [ 11813.4368062]
    minimum at risk 126000.0: [ 11813.4368062]
    minimum at risk 150000.0: [ 11813.4368062]





    (-1000, 80000)

f:id:koheitsutsumi223:20190405142436p:plain

　極小値の計算は、riskが78000.0以上の関数では一緒の値になってしまいましたね。fmin関数のxtol引数で誤差の許容値を調整してみたのですが、うまくいきませんでした。

5.3 ベイズ手法を用いた機械学習

5.3.1 例題 : 株価の予測

figsize(25, 8)

def stock_loss(true_return, yhat, alpha=100.):
    if true_return * yhat < 0:
        return alpha * yhat**2 \
            - np.sign(true_return) * yhat + abs(true_return)
    else:
        return abs(true_return - yhat)
    
pred = np.linspace(-.04, .12, 75)

true_values = [.05, -.02]

for t in true_values:
    plt.plot(pred, [stock_loss(t, _p) for _p in pred],
             lw=3,
             label="loss associated with\n"
                   f"prediction if true value={t}")
    
plt.vlines(0, 0, .26, linestyles="--")

plt.xlabel("Prediction")
plt.ylabel("Loss")
plt.xlim(-0.04, .12)
plt.ylim(0, 0.25)
plt.legend()
plt.title("Stock returns loss if true value = 0.05, -0.02")

Output:

    Text(0.5, 1.0, 'Stock returns loss if true value = 0.05, -0.02')

f:id:koheitsutsumi223:20190405142440p:plain

N = 100
X = 0.025 * np.random.randn(N)
Y = 0.5 * X + 0.01 * np.random.randn(N)

ls_coef_ = np.cov(X, Y)[0, 1] / np.var(X)
ls_intercept = Y.mean() - ls_coef_ * X.mean()

plt.scatter(X, Y, c="k")
plt.plot(X, ls_coef_ * X + ls_intercept,
         label="least-squares line")

plt.xlim(X.min(), X.max())
plt.ylim(Y.min(), Y.max())
plt.xlabel("Trading signal")
plt.ylabel("Returns")
plt.title("Empirical returns versus trading signal")
plt.legend(loc="upper left")

Output:

    <matplotlib.legend.Legend at 0x200f3da2940>

f:id:koheitsutsumi223:20190405142446p:plain

std = pm.Uniform("std", 0, 100, trace=False)

@pm.deterministic
def prec(U=std):
    return 1.0 / U**2

beta = pm.Normal("beta", 0, 0.0001)
alpha = pm.Normal("alpha", 0, 0.0001)

@pm.deterministic
def mean(X=X, alpha=alpha, beta=beta):
    return alpha + beta * X

obs = pm.Normal("obs", mean, prec, value=Y, observed=True)
mcmc = pm.MCMC([obs, beta, alpha, std, prec])

mcmc.sample(100000, 80000)

Output:

     [-----------------100%-----------------] 100000 of 100000 complete in 40.1 sec

figsize(25, 12)

def stock_loss(price, pred, coef=500):
    sol = np.zeros_like(price)
    ix = price * pred < 0
    sol[ix] = coef * pred ** 2 \
        - np.sign(price[ix]) * pred + abs(price[ix])
    sol[~ix] = abs(price[~ix] - pred)
    return sol

tau_samples = mcmc.trace("prec")[:]
alpha_samples = mcmc.trace("alpha")[:]
beta_samples = mcmc.trace("beta")[:]

N = tau_samples.shape[0]
noise = 1. / np.sqrt(tau_samples) * np.random.randn(N)
possible_outcomes = lambda signal: \
    alpha_samples + beta_samples * signal + noise

opt_predictions = np.zeros(50)
trading_signals = np.linspace(X.min(), X.max(), 50)

for i, _signal in enumerate(trading_signals):
    _possible_outcomes = possible_outcomes(_signal)
    tomin = lambda pred: stock_loss(_possible_outcomes, pred).mean()
    opt_predictions[i] = sop.fmin(tomin, 0, disp=False)
    
plt.plot(X, ls_coef_ * X + ls_intercept,
         label="less-squares prediction")
plt.plot(trading_signals, opt_predictions,
         label="Bayes action prediction")

plt.xlim(X.min(), X.max())
plt.xlabel("Trading signal")
plt.ylabel("Prediction")
plt.title("Least-squares prediction versus Bayes action prediction")
plt.legend(loc="upper left")

Output:

    <matplotlib.legend.Legend at 0x200f3e4add8>

f:id:koheitsutsumi223:20190405142450p:plain

5.3.3 観測データ

makedirs("data/Train_Skies/Train_Skies",
         exist_ok=True)

urlretrieve("https://git.io/vXLqk",
            "data/Train_Skies/Train_Skies/Training_Sky3.csv")
urlretrieve("https://git.io/vXLqU",
            "draw_sky2.py")

Output:

    ('draw_sky2.py', <http.client.HTTPMessage at 0x200f3eb92b0>)

from draw_sky2 import draw_sky

n_sky = 3
data = np.genfromtxt("data/Train_Skies/Train_Skies/"
                     f"Training_Sky{n_sky}.csv",
                     dtype=None, skip_header=1,
                     delimiter=",", usecols=[1, 2, 3, 4])
print(f"Data on galaxies in sky {n_sky}")
print("position_x, position_y, e_1, e_2")
print(data[:3])

fig = draw_sky(data)
plt.xlabel("$x$ position")
plt.ylabel("$y$ position")
plt.title(f"Galaxy positions and ellipticities of sky {n_sky}")

Output:

    Data on galaxies in sky 3
    position_x, position_y, e_1, e_2
    [[  1.62690000e+02   1.60006000e+03   1.14664000e-01  -1.90326000e-01]
     [  2.27228000e+03   5.40040000e+02   6.23555000e-01   2.14979000e-01]
     [  3.55364000e+03   2.69771000e+03   2.83527000e-01  -3.01870000e-01]]





    Text(0.5, 1.0, 'Galaxy positions and ellipticities of sky 3')

f:id:koheitsutsumi223:20190405142455p:plain

5.3.5 PyMCで実装する

def euclidean_distance(x, y):
    return np.sqrt(((x - y)**2).sum(axis=1))

def f_distance(gxy_pos, halo_pos, c):
    return np.maximum(euclidean_distance(gxy_pos, halo_pos),
                      c)[:, None]

def tangential_distance(glxy_position, halo_position):
    delta = glxy_position - halo_position
    t = (2 * np.arctan(delta[:, 1] / delta[:, 0]))[:, None]
    return np.concatenate([-np.cos(t), -np.sin(t)], axis=1)

mass_large = pm.Uniform("mass_large", 40, 180, trace=False)
halo_position = pm.Uniform("halo_position", 0, 4200, size=(1, 2))

@pm.deterministic
def mean(mass=mass_large,
         h_pos=halo_position,
         glx_pos=data[:, :2]):
    return mass / f_distance(glx_pos, h_pos, 240) \
        * tangential_distance(glx_pos, h_pos)

ellpty = pm.Normal("ellipticity", mean, 1. / 0.05,
                   observed=True, value=data[:, 2:])

model = pm.Model([ellpty, mean, halo_position, mass_large])
map_ = pm.MAP(model)
map_.fit()
mcmc = pm.MCMC(model)
mcmc.sample(200000, 140000, 3)

Output:

     [-----------------100%-----------------] 200000 of 200000 complete in 102.0 sec

メモ

　上で作成したモデルは小さいハロー（質量がlog(20)）を考慮していないものになっているようです。小さいハローを考慮したモデルもこの章で紹介されます。

メモ（`f_distance`について）

　f_distance関数の定義ではnp.maximumを使用していますが、テキストの式では最小値を返す関数（min）が使用されていました。正直ここはどちらが正しいかわかりませんでした。

fig = draw_sky(data)

t = mcmc.trace("halo_position")[:].reshape(20000, 2)
plt.scatter(t[:, 0], t[:, 1], alpha=0.015, c="r")

plt.xlim(0, 4200)
plt.ylim(0, 4200)
plt.xlabel("$x$ position")
plt.ylabel("$y$ position")
plt.title(f"Galaxy positions and ellipticities of sky {n_sky}")

Output:

    Text(0.5, 1.0, 'Galaxy positions and ellipticities of sky 3')

f:id:koheitsutsumi223:20190405142459p:plain

メモ（`mcmc.trace("halo_position")[:].reshape(20000, 2)`の補足）

　mcmc.trace("halo_position")[:]は20000×1×2の形のarrayです。（ちなみに、サンプルが20000点なのは、mcmc.sample(200000, 140000, 3)で、200000 - 140000 = 60000点のサンプルを3点ずつ間引いてるからです。）

mcmc.trace("halo_position")[:].shape

Output:

    (20000, 1, 2)

mcmc.trace("halo_position")[:]

Output:

    array([[[ 2326.43413861,  1161.90819072]],
    
           [[ 2324.78949176,  1154.90955342]],
    
           [[ 2324.78949176,  1154.90955342]],
    
           ..., 
           [[ 2354.75525359,  1137.24675332]],
    
           [[ 2359.74013063,  1141.07791609]],
    
           [[ 2359.74013063,  1141.07791609]]])

これをreshape(20000, 2)メソッドにより、20000×2の形のarrayに変換しています。

mcmc.trace("halo_position")[:].reshape(20000, 2).shape

Output:

    (20000, 2)

mcmc.trace("halo_position")[:].reshape(20000, 2)

Output:

    array([[ 2326.43413861,  1161.90819072],
           [ 2324.78949176,  1154.90955342],
           [ 2324.78949176,  1154.90955342],
           ..., 
           [ 2354.75525359,  1137.24675332],
           [ 2359.74013063,  1141.07791609],
           [ 2359.74013063,  1141.07791609]])

　reshapeメソッドを使わずに、以下の様なコードでも同じ操作ができます。インデックスを操作します。

fig = draw_sky(data)

t = mcmc.trace("halo_position")[:]
plt.scatter(t[:, 0, 0], t[:, 0, 1], alpha=0.015, c="r") # インデックスを操作して同じ結果を得る。

plt.xlim(0, 4200)
plt.ylim(0, 4200)
plt.xlabel("$x$ position")
plt.ylabel("$y$ position")
plt.title(f"Galaxy positions and ellipticities of sky {n_sky}")

Output:

    Text(0.5, 1.0, 'Galaxy positions and ellipticities of sky 3')

f:id:koheitsutsumi223:20190405142503p:plain

makedirs("data", exist_ok=True)
urlretrieve("https://git.io/vXLm6",
            "data/Training_halos.csv")

Output:

    ('data/Training_halos.csv', <http.client.HTTPMessage at 0x200f48819b0>)

halo_data = np.genfromtxt("data/Training_halos.csv",
                          delimiter=",", skip_header=1,
                          usecols=[i for i in range(1, 10)])
print(halo_data[n_sky])

Output:

    [  1.00000000e+00   1.40861000e+03   1.68586000e+03   1.40861000e+03
       1.68586000e+03   0.00000000e+00   0.00000000e+00   0.00000000e+00
       0.00000000e+00]

fig = draw_sky(data)

t = mcmc.trace("halo_position")[:].reshape(20000, 2)
plt.scatter(t[:, 0], t[:, 1], alpha=0.015, c="r")
plt.scatter(halo_data[n_sky -1][3],
            halo_data[n_sky -1][4],
            label="true halo position",
            c="k", s=70)

plt.legend(scatterpoints=1, loc="lower left")
plt.xlim(0, 4200)
plt.ylim(0, 4200)
plt.xlabel("$x$ position")
plt.ylabel("$y$ position")
plt.title(f"Galaxy positions and ellipticities of sky {n_sky}")

print("True halo location: ", halo_data[n_sky][3], halo_data[n_sky][4])

Output:

    True halo location:  1408.61 1685.86

f:id:koheitsutsumi223:20190405142508p:plain

urlretrieve("https://git.io/vXLmx", "DarkWorldMetrics.py")

Output:

    ('DarkWorldMetrics.py', <http.client.HTTPMessage at 0x200f61a6ba8>)

makedirs("data/Train_Skies/Train_Skies",
         exist_ok=True)
urlretrieve("https://git.io/vXqk4",
            "data/Train_Skies/Train_Skies/Training_Sky215.csv")

Output:

    ('data/Train_Skies/Train_Skies/Training_Sky215.csv',
     <http.client.HTTPMessage at 0x200f5efdc18>)

def halo_posteriors(n_halos_in_sky, galaxy_data):
    
    mass_large = pm.Uniform("mass_large", 40, 180)
    mass_small_1 = 20
    mass_small_2 = 20
    masses = np.array([mass_large, mass_small_1, mass_small_2],
                      dtype=object)
    
    halo_positions = pm.Uniform("halo_positions", 0, 4200,
                                size=(n_halos_in_sky, 2))
    
    fdist_constants = np.array([240, 70, 70])
    
    @pm.deterministic
    def mean(mass=masses,
             h_pos=halo_positions,
             glx_pos=data[:, :2],
             n_halos_in_sky=n_halos_in_sky):
        _sum = 0
        for i in range(n_halos_in_sky):
            _sum += mass[i] / f_distance(glx_pos, h_pos[i, :],
                                         fdist_constants[i]) * \
                tangential_distance(glx_pos, h_pos[i, :])
        return _sum
    
    ellpty = pm.Normal("ellipticity", mean, 1. / 0.05,
                       observed=True, value=data[:, 2:])
    model = pm.Model([ellpty, mean, halo_positions, mass_large])
    map_ = pm.MAP(model)
    map_.fit(method="fmin_powell")
    mcmc = pm.MCMC(model)
    return mcmc

n_sky = 215
data = np.genfromtxt("data/Train_Skies/Train_Skies/"
                     f"Training_Sky{n_sky}.csv",
                     dtype=None, skip_header=1,
                     delimiter=",", usecols=[i for i in range(1, 5)])


mcmc = halo_posteriors(3, data)

C:\Users\ktsutsum\Anaconda3\lib\site-packages\pymc\Node.py:403: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  self.__name__ = input['__name__']
C:\Users\ktsutsum\Anaconda3\lib\site-packages\pymc\Node.py:403: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  self.__name__ = input['__name__']

メモ（`halo_posteriors`関数について）

　halo_posteriors関数をMCMCオブジェクトを返すように書き変えました。理由は、sampleメソッドを繰り返し実行したかったからです。一度のsampleメソッドの実行では収束しなかったので、何回か実行できるようにしてみました。ただ、4回やってもあんまりよい結果は得られませんでした。

4回sampleした結果です（クリックすると展開されます）

mcmc.sample(10.5e5, 9.5e5, 10)

Output:

     [-----------------100%-----------------] 1050000 of 1050000 complete in 1553.2 sec

figsize(25, 8)
plt.rcParams['font.size'] = 10
mcplot(mcmc.trace("halo_positions"), common_scale=False)

Output:

    Plotting halo_positions_0_0
    Plotting halo_positions_0_1
    Plotting halo_positions_1_0
    Plotting halo_positions_1_1
    Plotting halo_positions_2_0
    Plotting halo_positions_2_1

f:id:koheitsutsumi223:20190405142512p:plain

f:id:koheitsutsumi223:20190405142515p:plain

f:id:koheitsutsumi223:20190405142518p:plain

fig = draw_sky(data)
traces = mcmc.trace("halo_positions")[:]

colors = ["#467821", "#A60628", "#7A68A6"]

for i in range(traces.shape[1]):
    plt.scatter(traces[:, i, 0], traces[:, i, 1],
                c=colors[i], alpha=0.02)
    
for i in range(traces.shape[1]):
    plt.scatter(halo_data[n_sky - 1][3 + 2 * i],
                halo_data[n_sky - 1][4 + 2 * i],
                c="k", s=90,
                label="true halo position")
    
plt.xlim(0, 4200)
plt.ylim(0, 4200)
plt.xlabel("$x$ position")
plt.ylabel("$y$ position")
plt.title("Galaxy positions, ellipticities, "
          f"and halos of sky {n_sky}")

Output:

    Text(0.5, 1.0, 'Galaxy positions, ellipticities, and halos of sky 215')

f:id:koheitsutsumi223:20190405142522p:plain

mcmc.sample(10.5e5, 9.5e5, 10)

Output:

     [-----------------100%-----------------] 1050000 of 1050000 complete in 980.3 sec

figsize(25, 8)
plt.rcParams['font.size'] = 10
mcplot(mcmc.trace("halo_positions"), common_scale=False)

Output:

    Plotting halo_positions_0_0
    Plotting halo_positions_0_1
    Plotting halo_positions_1_0
    Plotting halo_positions_1_1
    Plotting halo_positions_2_0
    Plotting halo_positions_2_1

f:id:koheitsutsumi223:20190405142526p:plain

f:id:koheitsutsumi223:20190405142533p:plain

f:id:koheitsutsumi223:20190405142536p:plain

fig = draw_sky(data)
traces = mcmc.trace("halo_positions")[:]

colors = ["#467821", "#A60628", "#7A68A6"]

for i in range(traces.shape[1]):
    plt.scatter(traces[:, i, 0], traces[:, i, 1],
                c=colors[i], alpha=0.02)
    
for i in range(traces.shape[1]):
    plt.scatter(halo_data[n_sky - 1][3 + 2 * i],
                halo_data[n_sky - 1][4 + 2 * i],
                c="k", s=90,
                label="true halo position")
    
plt.xlim(0, 4200)
plt.ylim(0, 4200)
plt.xlabel("$x$ position")
plt.ylabel("$y$ position")
plt.title("Galaxy positions, ellipticities, "
          f"and halos of sky {n_sky}")

Output:

    Text(0.5, 1.0, 'Galaxy positions, ellipticities, and halos of sky 215')

f:id:koheitsutsumi223:20190405142541p:plain

mcmc.sample(10.5e5, 9.5e5, 10)

Output:

     [-----------------100%-----------------] 1050000 of 1050000 complete in 712.8 sec

figsize(25, 8)
plt.rcParams['font.size'] = 10
mcplot(mcmc.trace("halo_positions"), common_scale=False)

Output:

    Plotting halo_positions_0_0
    Plotting halo_positions_0_1
    Plotting halo_positions_1_0
    Plotting halo_positions_1_1
    Plotting halo_positions_2_0
    Plotting halo_positions_2_1

f:id:koheitsutsumi223:20190405142545p:plain

f:id:koheitsutsumi223:20190405142549p:plain

f:id:koheitsutsumi223:20190405142553p:plain

fig = draw_sky(data)
traces = mcmc.trace("halo_positions")[:]

colors = ["#467821", "#A60628", "#7A68A6"]

for i in range(traces.shape[1]):
    plt.scatter(traces[:, i, 0], traces[:, i, 1],
                c=colors[i], alpha=0.02)
    
for i in range(traces.shape[1]):
    plt.scatter(halo_data[n_sky - 1][3 + 2 * i],
                halo_data[n_sky - 1][4 + 2 * i],
                c="k", s=90,
                label="true halo position")
    
plt.xlim(0, 4200)
plt.ylim(0, 4200)
plt.xlabel("$x$ position")
plt.ylabel("$y$ position")
plt.title("Galaxy positions, ellipticities, "
          f"and halos of sky {n_sky}")

Output:

    Text(0.5, 1.0, 'Galaxy positions, ellipticities, and halos of sky 215')

f:id:koheitsutsumi223:20190405142557p:plain

mcmc.sample(10.5e5, 9.5e5, 10)

Output:

     [-----------------100%-----------------] 1050000 of 1050000 complete in 943.5 sec

figsize(25, 8)
plt.rcParams['font.size'] = 10
mcplot(mcmc.trace("halo_positions"), common_scale=False)

Output:

    Plotting halo_positions_0_0
    Plotting halo_positions_0_1
    Plotting halo_positions_1_0
    Plotting halo_positions_1_1
    Plotting halo_positions_2_0
    Plotting halo_positions_2_1

f:id:koheitsutsumi223:20190405142601p:plain

f:id:koheitsutsumi223:20190405142605p:plain

f:id:koheitsutsumi223:20190405142608p:plain

fig = draw_sky(data)
traces = mcmc.trace("halo_positions")[:]

colors = ["#467821", "#A60628", "#7A68A6"]

for i in range(traces.shape[1]):
    plt.scatter(traces[:, i, 0], traces[:, i, 1],
                c=colors[i], alpha=0.02)
    
for i in range(traces.shape[1]):
    plt.scatter(halo_data[n_sky - 1][3 + 2 * i],
                halo_data[n_sky - 1][4 + 2 * i],
                c="k", s=90,
                label="true halo position")
    
plt.xlim(0, 4200)
plt.ylim(0, 4200)
plt.xlabel("$x$ position")
plt.ylabel("$y$ position")
plt.title("Galaxy positions, ellipticities, "
          f"and halos of sky {n_sky}")

Output:

    Text(0.5, 1.0, 'Galaxy positions, ellipticities, and halos of sky 215')

f:id:koheitsutsumi223:20190405142611p:plain

　今回の写経は以上です。ここまで読んでいただきありがとうございました。

一日一万字の感謝の写経

一日一万字の感謝の写経をして強くなります。そのうち本を置き去りにして何かを作り始める予定です。

Pythonで体験するベイズ推論 PyMCによるMCMC入門第5章写経

目次

始めに

この記事で使用する主なモジュール、設定

5 損失はおいくら?

5.2 損失関数

5.2.2 例題 : テレビ番組"The Price Is Right"の最適化

メモ

メモ

5.3 ベイズ手法を用いた機械学習

5.3.1 例題 : 株価の予測

5.3.3 観測データ

5.3.5 PyMCで実装する

メモ

メモ（`f_distance`について）

メモ（`mcmc.trace("halo_position")[:].reshape(20000, 2)`の補足）

メモ（`halo_posteriors`関数について）

目次

始めに

この記事で使用する主なモジュール、設定

5 損失はおいくら?

5.2 損失関数

5.2.2 例題 : テレビ番組"The Price Is Right"の最適化

メモ

メモ

5.3 ベイズ手法を用いた機械学習

5.3.1 例題 : 株価の予測

5.3.3 観測データ

5.3.5 PyMCで実装する

メモ

メモ（f_distanceについて）

メモ （mcmc.trace("halo_position")[:].reshape(20000, 2)の補足）

メモ（halo_posteriors関数について）

メモ（`f_distance`について）

メモ（`mcmc.trace("halo_position")[:].reshape(20000, 2)`の補足）

メモ（`halo_posteriors`関数について）