HMM EXAMPLE with Continuous Densities¶

This example is very similar to the example in the course notes. However there is no numerical exact match

In [1]:
#!pip install git+https://github.com/compi1234/pyspch.git
try:
    import pyspch
except ModuleNotFoundError:
    try:
        print(
        """
        To enable this notebook on platforms as Google Colab, 
        install the pyspch package and dependencies by running following code:

        !pip install git+https://github.com/compi1234/pyspch.git
        """
        )
    except ModuleNotFoundError:
        raise
In [2]:
%matplotlib inline    
import sys, os,time
import numpy as np
import pandas as pd
from IPython.display import display, HTML, clear_output
import ipywidgets as widgets
import matplotlib.pyplot as plt
import matplotlib as mpl
import seaborn as sns
import copy

import pyspch.stats.probdist as Densities
from pyspch.stats import libhmm
import pyspch.display as Spd
import pyspch.core as Spch

# print all variable statements
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
# graphical and print preferences
cmap_1 = sns.light_palette("caramel",50,input="xkcd")
cmap_2 = sns.light_palette("caramel",50,input="xkcd")[0:25]
cmap_3 = sns.light_palette("caramel",50,input="xkcd")[25:50]
cmap = cmap_2
pd.options.display.float_format = '{:,.3f}'.format

# catch future warnings -- occurring in matplotlib
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
In [3]:
mpl.rcParams['figure.figsize'] = [12.0, 6.0]
#mpl.rcParams['ps.papersize'] = 'A4'
mpl.rcParams['xtick.labelsize'] = 14
mpl.rcParams['ytick.labelsize'] = 14
mpl.rcParams['axes.titlepad'] = 15
mpl.rcParams['axes.titlesize'] = 'large'
mpl.rcParams['axes.linewidth'] = 2
mpl.rc('lines', linewidth=3, color='k')
In [4]:
%%HTML
<style type="text/css">
table.dataframe td, table.dataframe th {
    border: 2px  black solid !important;
    column-width: 60px;
    color: black !important;
}
prior mean var
0 0.200 [1.0] [1.0]
1 0.200 [10.0] [1.0]
2 0.200 [15.0] [1.0]
3 0.200 [6.0] [1.0]
4 0.200 [2.0] [1.0]
No description has been provided for this image
In [6]:
Gauss1.__dict__
Out[6]:
{'priors': None,
 'var_smoothing': 1e-09,
 'n_classes': 5,
 'n_features_in_': 1,
 'theta_': array([[ 1.],
        [10.],
        [15.],
        [ 6.],
        [ 2.]]),
 'var_': array([[1.],
        [1.],
        [1.],
        [1.],
        [1.]]),
 'classes_': array([1, 2, 3, 4, 5]),
 'class_prior_': array([0.2, 0.2, 0.2, 0.2, 0.2])}

HMM with a Gaussian Observation Model¶

The frame classifiers above did not make use of the inherent sequence property of our data. We will now exploit the fact that we expect the data to be drawn in sequence from states 0-1-2. Though we will not enforce this 100% as occasionally there are random switches between states.

We make two hmm-models with the same init and transition probabilities.
The model has a tendency left-to-right, though a very small probability allows for breaking this topology

  • hmm1 : uses Gauss1 (intuitive) as observation model
  • hmm2 : uses Gauss2 (trained) as observation model

We use log-prob's throughout to accomodate for small probs and long sequences

In [7]:
hmm1 = libhmm.HMM(prob_style="log",obs_model=Gauss1,n_states=5,prob_floor=1.e-20)
hmm1.init_topology(type="lr",selfprob=.8)
hmm1.print_model()
HMM STATE MODEL
=================

0 1 2 3 4
Pinit(.) 0.000 -46.052 -46.052 -46.052 -46.052
0 1 2 3 4
P(0|.) -0.223 -46.052 -46.052 -46.052 -46.052
P(1|.) -1.609 -0.223 -46.052 -46.052 -46.052
P(2|.) -46.052 -1.609 -0.223 -46.052 -46.052
P(3|.) -46.052 -46.052 -1.609 -0.223 -46.052
P(4|.) -46.052 -46.052 -46.052 -1.609 0.000
OBSERVATION MODEL
=================

prior mean var
0 0.200 [1.0] [1.0]
1 0.200 [10.0] [1.0]
2 0.200 [15.0] [1.0]
3 0.200 [6.0] [1.0]
4 0.200 [2.0] [1.0]

Trellis Computatitons¶

In the next cell

  • Select model hmm1 or hmm2
  • Select Datastream and reference alignment (X1,X2,X3,X4 or X2s,X3s,X4s)
  • Compute the Viterbi Trellis (Normalize=True: max value =1.0 for prob or 0.0 for logprob)
In [ ]:
 
In [8]:
X = X2
hmm = hmm1
tr = libhmm.Trellis(hmm)
tr.viterbi_pass(X)
tr.print_trellis(what=['obs_probs','probs'])
Observation Probabilities

0 1 2 3 4 5 6 7 8 9 10 11
0 -0.995 -0.919 -37.933 -32.571 -63.745 -6.134 -5.580 -2.755 -5.668 -7.350 -3.553 -0.969
1 -45.010 -41.405 -0.997 -1.464 -3.360 -17.568 -18.602 -26.007 -18.431 -15.572 -23.396 -44.317
2 -104.462 -98.897 -15.477 -19.182 -4.813 -58.921 -60.837 -73.925 -60.521 -55.140 -69.420 -103.400
3 -15.448 -13.411 -7.413 -5.289 -20.198 -2.486 -2.814 -5.673 -2.758 -1.918 -4.577 -15.051
4 -1.886 -1.417 -29.829 -25.114 -53.035 -3.404 -3.026 -1.339 -3.086 -4.264 -1.758 -1.785
Trellis Probabilities (Viterbi)

0 1 2 3 4 5 6 7 8 9 10 11
0 -0.995 -2.137 -40.294 -73.088 -116.227 -62.199 -68.002 -68.793 -74.643 -79.411 -79.878 -79.052
1 -91.061 -44.010 -4.744 -6.431 -10.014 -27.805 -46.630 -72.861 -87.406 -87.633 -99.721 -122.400
2 -150.514 -145.944 -61.096 -25.535 -12.853 -70.544 -90.251 -122.165 -129.496 -127.201 -145.745 -181.482
3 -61.499 -60.458 -55.602 -56.085 -47.342 -16.948 -19.986 -25.882 -28.864 -31.005 -35.805 -51.079
4 -47.937 -48.464 -78.018 -75.910 -105.518 -52.356 -21.584 -22.923 -26.009 -30.273 -32.031 -33.816
In [9]:
cmap="OrRd_r"
# column by column normalization is applied for the heatmap plotting of the trellis, 
tr.plot_trellis(xticks=X1,plot_norm=True,plot_obs_probs=True,plot_values=True,fmt=".1f",vmin=-220,vmax=0,
                 cmap=cmap,plot_backptrs=True,plot_alignment=True,figsize=(16,8))
Out[9]:
No description has been provided for this image