Stochastic matrix
From Wikipedia, the free encyclopedia
- For a matrix whose elements are stochastic, see Random matrix
In
mathematics, a
stochastic matrix (also termed
probability matrix,
transition matrix,
substitution matrix, or
Markov matrix) is a
matrix used to describe the transitions of a
Markov chain. It has found use in
probability theory,
statistics and
linear algebra, as well as
computer science. There are several different definitions and types of stochastic matrices;
- A right stochastic matrix is a square matrix each of whose rows consists of nonnegative real numbers, with each row summing to 1.
- A left stochastic matrix is a square matrix each of whose columns consist of nonnegative real numbers, with each column summing to 1.
- A doubly stochastic matrix is a square matrix where all entries are nonnegative and all rows and all columns sum to 1.
In the same vein, one may define a
stochastic vector as a
vector whose elements consist of nonnegative
real numbers which sum to 1. Thus, each row (or column) of a stochastic matrix is a
probability vector, which are sometimes called stochastic vectors.
A common convention in English language mathematics literature is to
use row vectors of probabilities and right stochastic matrices rather
than column vectors of probabilities and left stochastic matrices; this
article follows that convention.
Definition and properties
A stochastic matrix describes a Markov chain
![\boldsymbol{X}_{t}](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_uz4VjS1MjrOluF9k-KZEDz8U2jwoU0CkCldSmt-QYgaHUPoQRpEzs-vdc48EI0nr0OVpyGY9J2Hn3DqHZfH1TTVE6STfETqBpt8083PnMynj1qZL6O76haBi_lEewSPJYrO-VTQHGbx5eMhOxeoMDkDQNaUUYIajrdZXo=s0-d)
over a
finite state space S.
If the
probability of moving from
![i](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_tfthufZnsm144-sML_su4HNm-ss_dxqXPXvVPiplWiXkfM5fveavgfqPSQzfRmikExCQGN2XD16KAoifZk5UwCRQ--PZqZ7-g5A2mqM3bLAE6FpG-QmuMPGafcfLrM9VxopAlHpogi-Nlvc27aREvdYthqzjXXmSeKPv0=s0-d)
to
![j](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_tog8OUfv8z0qsYvN7X_s7XSE04zJB3QKLF8w8meCfIG-W1B4jzkejanNHpjgKjIJ6eKpRhm194vmbzJwbn-Yu4-1Pfm0I79H2xSY6AIQoe8I9ic0I5cC3UJSoGWA7dvnZ8jTnu0TrZWgT7OTvRDCvneggU6XGaHaHGjvI=s0-d)
in one time step is
![Pr(j|i)=P_{i,j}](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_sucgT0w9oE1oKuOPc-5lr1BSU6q_ADxE5iVnMbKUb25bqr35JFmKAaLNfw28GhwxR-AR0dKmI9oUx15GYEh-FHC4uRhRDo-160XmRszpf0DD66NuUHPS-Or79OwVGic2d_K1c2GUOzWKu_TEnPjKAts9rFONdntmxp1zA=s0-d)
, the stochastic matrix
P is given by using
![P_{i,j}](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_txIPUAnb_zQFo1c-UUcp2oTCdDwArIppk3ixq_NOB-3C8iaSozhWTBhnP2e3tqLJvI5uRKA4tHxuLaZkaqfBk9_sKJs9sTmmjIz9uM2m9rYEyx2BdNNrTi85k4YVs9QRj5eu6LVgISHGJRAua_G7J9MMMvbyEEGCFR3ig=s0-d)
as the
![i^{th}](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_sZAF6fAqK7qeDsAWsfsVxD_FzHov1Ntdfqi0XbpFUVU4d_R2C8nA4S-nw5Y1Ue6Zv1Jou3Oa6znFOi8nstqEZPlrYQG2W2HkbQUwh62J3ehRSu-rz8Bao17kl6xqno6-PPQvp-8g1XPYee5DQuq66bHVvVPDsvJCHbbQ=s0-d)
row and
![j^{th}](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_tz02DtHDzpw7DVpOl5vXNxVxDIFUAfKUltMwplNoYeUtAHIoD6Fn2l79xLm89G9LyvcozN_XfYzBmo5Y4lZklfxqbSPe2YaY-spR4YqaIMMrPVcTJpy_iuFIiqcE_yb54xgAaQ3kzwFtcIjQANQjyREbTsMdFsZNPVq7k=s0-d)
column element, e.g.,
![P=\left(\begin{matrix}p_{1,1}&p_{1,2}&\dots&p_{1,j}&\dots\\
p_{2,1}&p_{2,2}&\dots&p_{2,j}&\dots\\
\vdots&\vdots&\ddots&\vdots&\ddots\\
p_{i,1}&p_{i,2}&\dots&p_{i,j}&\dots\\
\vdots&\vdots&\ddots&\vdots&\ddots
\end{matrix}\right).](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_sHYhUNgpl9TWvxnGrpDY9b23dJ1TpBGiKvT2IiZc0GOwk3cXhT6p9yGY__vVgrnNzdYDPoMd7d4ojEIQDwrJFQwACzHWYn0k5T2xkm4Non81gIPz4aOhtwKf3yEpp-EBJrB2ko3b3rhZ-w5KJMrBMWGsza4VDkBKu2KVI=s0-d)
Since the probability of transitioning from state
![i](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_tfthufZnsm144-sML_su4HNm-ss_dxqXPXvVPiplWiXkfM5fveavgfqPSQzfRmikExCQGN2XD16KAoifZk5UwCRQ--PZqZ7-g5A2mqM3bLAE6FpG-QmuMPGafcfLrM9VxopAlHpogi-Nlvc27aREvdYthqzjXXmSeKPv0=s0-d)
to some state must be 1, this matrix is a right stochastic matrix, so that
![\sum_j P_{i,j}=1.\,](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_tpAHDNFvzcbQbfHQo-oWjalAQBYxioMPRdJS-49zv0odM2OQPIlewF0QYCPqPxLF2SDmU_yzSdyEHLGyOsW2B6FZY5SIThJ476_kwanWFZT3vCDKkrAi2wDcmSJRjEHs-ddrT5i7bnIIHbd7K2RAD68xX4ZBhZkMCXRA=s0-d)
The probability of transitioning from
![i](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_tfthufZnsm144-sML_su4HNm-ss_dxqXPXvVPiplWiXkfM5fveavgfqPSQzfRmikExCQGN2XD16KAoifZk5UwCRQ--PZqZ7-g5A2mqM3bLAE6FpG-QmuMPGafcfLrM9VxopAlHpogi-Nlvc27aREvdYthqzjXXmSeKPv0=s0-d)
to
![j](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_tog8OUfv8z0qsYvN7X_s7XSE04zJB3QKLF8w8meCfIG-W1B4jzkejanNHpjgKjIJ6eKpRhm194vmbzJwbn-Yu4-1Pfm0I79H2xSY6AIQoe8I9ic0I5cC3UJSoGWA7dvnZ8jTnu0TrZWgT7OTvRDCvneggU6XGaHaHGjvI=s0-d)
in two steps is then given by the
![(i,j)^{th}](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_sdAJqusKY4R7OLwdlRphh3FkR_rHzKy2ktlpQq-Ki5mH80nufKcztUH9CSTuwFNLsFdaJ7iPyfSdW1nNhQt1SuVADpjRAN3rQlDkug6JLNIXrN2_BiAc-V5wC0RIXg7TmtnXeQZbCbGfh3vt5QDvrm0vt9H8GwOPL3pDY=s0-d)
element of the square of
![P](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_tmtL2XU6zYEIMV8EJ-XC9U27cRoYZbn7-TxFns-zxuWaWTuHEWTnQxjYQCYhEMemgskkP5eLkga23ZNLPSBRNsH1wEUCnSS3jwjlNSI9x8jIynEV2OgcA4k9r-1TzaFfIxjyn7BdKS_OQ2-fuxZ9PlSOYjDy9S2XIYDqM=s0-d)
:
![\left(P ^{2}\right)_{i,j}.](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_tBduhe71RVWIMBSChjEzR4SlH0UPHy1qFlD0c1kAop0AHYC_T1annUWPZziPvb1GuEvaxnX2OFqzATf0Kb3ZUw8_wQTE-WZQkJaj_6_4FviU1hJ46qS7JUAaLhZOubAnCryDn0gA3pMq1nNnkp78QuAT1mtWi5ZunVWi0=s0-d)
In general the probability transition of going from any state to another state in a finite Markov chain given by the matrix
![P](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_tmtL2XU6zYEIMV8EJ-XC9U27cRoYZbn7-TxFns-zxuWaWTuHEWTnQxjYQCYhEMemgskkP5eLkga23ZNLPSBRNsH1wEUCnSS3jwjlNSI9x8jIynEV2OgcA4k9r-1TzaFfIxjyn7BdKS_OQ2-fuxZ9PlSOYjDy9S2XIYDqM=s0-d)
in
k steps is given by
![P^k](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_uv00-0E6ssP-o3qZlsERupVms0Z-Uwz6oG8XvxcbfTJv9em-M1NHJCNzF-X6jSzLzhgvC5627DSNRche7mLLP_qu_oTJwAXxVPo-1O7hdNK_kFUwXoB1USU4QD-NwR3qQb5zJnD2gRyuwH4VEOxhuqtBCOh47kDSV3Rw=s0-d)
.
An initial distribution is given as a
row vector.
A stationary probability vector
![\boldsymbol{\pi}](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_tWyaigJyfVRU9tC4otmicq3PJ8ngKWsejV_7VSolPdpmBPs5j0kBMxrjYSJY5O1Rfas7fFjTsu7f2yXLumTCxFOmXTZFRxxAnUKli_fSJTFWpa1WmDzlisNXvJgVUeCXVd6DzSifzH3KIcltkCb9WAg6jFQuuuDTXn_KM=s0-d)
is defined as a vector that does not change under application of the transition matrix; that is, it is defined as a left
eigenvector of the probability matrix, associated with
eigenvalue 1:
![\boldsymbol{\pi}P=\boldsymbol{\pi}.](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_tyQ_OoB2DuJdEym5PhheIWW88Vmip1sWQwO47k0r50MQg42S4PEMI37v5PNi7ZKlT8XjNR2VERBwI1jKQi6gI1bEfc0fIfLJNJkXvwasnCOF6PwpdXdR7YQ3yFKPhJbTanIWS0LOQzbD5SXtI0QcyPsu8csCyBF6yvCSY=s0-d)
The
Perron–Frobenius theorem
ensures that every stochastic matrix has such a vector, and that the
largest absolute value of an eigenvalue is always 1. In general, there
may be several such vectors. However, for a matrix with strictly
positive entries, this vector is unique and can be computed by observing
that for any
![i](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_tfthufZnsm144-sML_su4HNm-ss_dxqXPXvVPiplWiXkfM5fveavgfqPSQzfRmikExCQGN2XD16KAoifZk5UwCRQ--PZqZ7-g5A2mqM3bLAE6FpG-QmuMPGafcfLrM9VxopAlHpogi-Nlvc27aREvdYthqzjXXmSeKPv0=s0-d)
we have the following limit,
![\lim_{k\rightarrow\infty}\left(P^k \right)_{i,j}=\boldsymbol{\pi}_j,](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_sM88HWmjNGY_b0zsk4NyUiUMD8DNoVWH6XVk5xjzWhu01XZJ-jRtDfAYawt_j0JwBBkx4lEokH6R-sTgB63wHENDkTWT9742HZmeXT0yDk_J5_7Mo2ia4JsW7IWuoi4MhJFzB-3Kvo9-igQpRyKAwdmbQ6WfquS_oN3zQ=s0-d)
where
![\boldsymbol{\pi}_{j}](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_vxxiSwQfY0n05WWovn0HQNyopVNX2JlFSXDQo5KqEWteW1XO7UP487p5ZACFuGuqQT_uAXDEGkKGcf6o3MYXnz2m1eYaFcc2Fjz2TJr7_FeM0fGKFFLhCDdRSP3Zt332TNydz49wsoIsiJuZXxRK4nHi2-S1yziV_uWYc=s0-d)
is the
![j^{th}](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_tz02DtHDzpw7DVpOl5vXNxVxDIFUAfKUltMwplNoYeUtAHIoD6Fn2l79xLm89G9LyvcozN_XfYzBmo5Y4lZklfxqbSPe2YaY-spR4YqaIMMrPVcTJpy_iuFIiqcE_yb54xgAaQ3kzwFtcIjQANQjyREbTsMdFsZNPVq7k=s0-d)
element of the row vector
![\boldsymbol{\pi}](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_tWyaigJyfVRU9tC4otmicq3PJ8ngKWsejV_7VSolPdpmBPs5j0kBMxrjYSJY5O1Rfas7fFjTsu7f2yXLumTCxFOmXTZFRxxAnUKli_fSJTFWpa1WmDzlisNXvJgVUeCXVd6DzSifzH3KIcltkCb9WAg6jFQuuuDTXn_KM=s0-d)
. This implies that the long-term probability of being in a state
![j](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_tog8OUfv8z0qsYvN7X_s7XSE04zJB3QKLF8w8meCfIG-W1B4jzkejanNHpjgKjIJ6eKpRhm194vmbzJwbn-Yu4-1Pfm0I79H2xSY6AIQoe8I9ic0I5cC3UJSoGWA7dvnZ8jTnu0TrZWgT7OTvRDCvneggU6XGaHaHGjvI=s0-d)
is independent of the initial state
![i](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_tfthufZnsm144-sML_su4HNm-ss_dxqXPXvVPiplWiXkfM5fveavgfqPSQzfRmikExCQGN2XD16KAoifZk5UwCRQ--PZqZ7-g5A2mqM3bLAE6FpG-QmuMPGafcfLrM9VxopAlHpogi-Nlvc27aREvdYthqzjXXmSeKPv0=s0-d)
. That either of these two computations give one and the same stationary vector is a form of an
ergodic theorem, which is generally true in a wide variety of
dissipative dynamical systems: the system evolves, over time, to a
stationary state.
Intuitively, a stochastic matrix represents a Markov chain with no sink
states, this implies that the application of the stochastic matrix to a
probability distribution would redistribute the probability mass of the
original distribution while preserving its total mass. If this process
is applied repeatedly the distribution converges to a stationary
distribution for the Markov chain.
Example: the cat and mouse
Suppose you have a timer and a row of five adjacent boxes, with a cat
in the first box and a mouse in the fifth box at time zero. The cat and
the mouse both jump to a random adjacent box when the timer advances.
E.g. if the cat is in the second box and the mouse in the fourth one,
the probability is one fourth that the cat will be in the first box and
the mouse in the fifth after the timer advances. If the cat is in the
first box and the mouse in the fifth one, the probability is one that
the cat will be in box two and the mouse will be in box four after the
timer advances. The cat eats the mouse if both end up in the same box,
at which time the game ends. The
random variable K gives the number of time steps the mouse stays in the game.
The
Markov chain that represents this game contains the following five states:
- State 1: cat in the first box, mouse in the third box: (1, 3)
- State 2: cat in the first box, mouse in the fifth box: (1, 5)
- State 3: cat in the second box, mouse in the fourth box: (2, 4)
- State 4: cat in the third box, mouse in the fifth box: (3, 5)
- State 5: the cat ate the mouse and the game ended: F.
We use a stochastic matrix to represent the
transition probabilities of this system,
![P =
\begin{bmatrix}
0 & 0 & 1/2 & 0 & 1/2 \\
0 & 0 & 1 & 0 & 0 \\
1/4 & 1/4 & 0 & 1/4 & 1/4 \\
0 & 0 & 1/2 & 0 & 1/2 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}.](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_uJnX49gxMgXB1gME51HAEjieSu-RBg_jX1XL_OCb6eAr7HV-Effs-RV9N9jtrfkWwgF38P3Cyp8g1_O35U8Cb9TsHlajB1Lw3S5C5s6qONcf4WRtr-N5C4OfhOQ6-XB1sIYXtrlnXHuedkVc-qceXJLU7du7nVOcYq8g=s0-d)
Long-term averages
As state 5 is an absorbing state the long-term average vector
![\boldsymbol{\pi}=(0,0,0,0,1)](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_uZfEtAzaQ0BggR_BBfa_CN4TmrXRdme9R_CajsauoePNkUcjb1J5t0Exms7Yih9bzhJhp74MEzo-ZpbRBaYuE8GWoWbHvtBd4X_K5s1GESyAtV12O-dkul6LApe_Ydpqcemc3fqeBoOoM78tFEtnINhfa574ZMUKrjjw=s0-d)
. Regardless of the initial conditions the cat will eventually catch the mouse.
Phase-type representation
The survival function of the mouse. Note the mouse will survive at least the first time step.
As the game has an absorbing state 5 the distribution of time to absorption is
discrete phase-type distributed. Suppose the system starts in state 2, represented by the vector
![[0,1,0,0,0]](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_vu-Ll_VQbG1m9dFSpJ1PI6axf0_YQ86INk_-hlwn1SNTtpANi9bNeU3xGAWHuF0kCUjGZi8u6_UcjisCselWgPuGWeXU7j60xGXcNeP1puzgFbJAXNsfx7JshliYFlqLRVRtAHUNy8xLy7m3QUmHiZd8oqo379YgCSTw=s0-d)
. To simplify the calculations, state five can be ignored. Let
![\boldsymbol{\tau}=[0,1,0,0]](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_tgwSDUXy_b9pcbipEYcfo0IgWsEjVDX_zuYUZOwnK1lafVX7xbtgmqL5tp302xjSTlUxaFWZ5PMmn80-llhzf_SKWOuG0FDACPBaWBypx6WgTJrh2G7I01WBdwOorsI5d44ml0nixdSDIPoZe3TGQvliwD6X70RpyEzLY=s0-d)
and remove state five to make a
sub-stochastic matrix,
with ![(I-T)^{-1}\boldsymbol{1}
=\begin{bmatrix}2.75 \\ 4.5 \\ 3.5 \\ 2.75\end{bmatrix}\,,](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_s2pmLy53N_r5YJdoKkJtScyHrY7ascmOkR9PPY_xtL0XeL611t-6It_N3m9J6Ar_yM_jl-Fs9Tia1dl9Zh9-wobrELTbKYB181JQR9bZLaVElLabORP3FgogLxLdZ2sxPv54wIEWlj2sxa-NDJUJ6fztxXU5tv9CQdv9Y=s0-d)
where
![I](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_vygwKuzW0u4_EgIoVzjbX5XVcvLZt4lheU-zhlYqtq2mANkFLUltmj0wrzYQM1AUhbU0A4cxR1qKjHW5yqEfa2mbQ9afZKbYVQv3fcqfOMhi6r6eoiTlYe1DZQUD92zfXB08qU7hJqTNDIgbxQXZCIFwyTz9dgSzWMww=s0-d)
is the
identity matrix, and
![\mathbf{1}](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_u6Id3q9JSNnxVM7EFvk1jOw1YujLfxSvbHHv7cpzGWd5pNan4juB3Cv10VN5l9-QOjsV2IERAvRg9QYGmh5kHfHm5UkajLgFqugrCeEW15Luu-MXpQZ5XfW2GwD5yQg13OakEIKOCuak5sVYAk6F3Q7AEpvnfGwC5MSL4=s0-d)
represents a column matrix of all ones. The expected time of the mouse's survival is given by
![E[K]=\boldsymbol{\tau}(I-T)^{-1}\boldsymbol{1}=4.5.](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_u-GtU8pySFhuj3IcMD7mLlBUYWY2FUQ9ArssdHxbnftJ5uawbBilGiO_TOszjn2onNAeaWW3acVlqFNeYrJXczXGTqL8SXOu281hlxZGu7YJqTYnhqyf6oz9JXWpWxCHgCr4CXDi0lYCpRYBCDCwmlynQUTNzMKXBA3Q=s0-d)
Higher order moments are given by
![E[K(K-1)\dots(K-n+1)]=n!\boldsymbol{\tau}(I-{T})^{-n}{T}^{n-1}\mathbf{1}\,.](https://lh3.googleusercontent.com/blogger_img_proxy/AEn0k_uAbfOkmr_6rvbtiGQOlWp-laBl4q_8Ln5Ov-Jk9J5D-cCqBmsXF7VSeagCfkWk_2uQJKi9c_G58h_qsVamaSfnecfb7AumeepXCNyBBZ4QH0FdXwW9JMObQZ8RmOmiw7L-dNuAQctNyXIsp2xkDlWXLpcPtGZUH98F47s=s0-d)
See also
References
- G. Latouche, V. Ramaswami. Introduction to Matrix Analytic Methods in Stochastic Modeling, 1st edition. Chapter 2: PH Distributions; ASA SIAM, 1999.
No comments:
Post a Comment
Thank you