|
"Exploring evolutionary heterogeneity with change-point models, Gaussian Markov random fields, and Markov chain induced counting processes"
Volodymyr Minin
M. A. Suchard
Document Type
|
:
|
Latin Dissertation
|
Language of Document
|
:
|
English
|
Record Number
|
:
|
53001
|
Doc. No
|
:
|
TL22955
|
Call number
|
:
|
3257221
|
Main Entry
|
:
|
Volodymyr Minin
|
Title & Author
|
:
|
Exploring evolutionary heterogeneity with change-point models, Gaussian Markov random fields, and Markov chain induced counting processes / Volodymyr Minin
|
College
|
:
|
University of California, Los Angeles
|
Date
|
:
|
2007
|
Degree
|
:
|
Ph.D.
|
student score
|
:
|
2007
|
Page No
|
:
|
195
|
Abstract
|
:
|
Signatures of spatial variation, left by evolutionary processes in genomic sequences, provide important information about the function and structure of genomic regions. I discuss statistical methods for detection of such signatures in a Bayesian framework. I start with phylogenetic analysis of recombination. I present a recombination detection method that simultaneously incorporates discrepancies in phylogenies, caused by recombination, and spatial variation in evolutionary pressure across the alignment using a dual multiple change-point (DMCP) model. Next, I turn to mapping recombination hot-spots. Based on the DMCP model, I build a hierarchical framework for simultaneous inference of break-point locations and spatial variation in recombination frequency from multiple putative recombinant sequences. To overcome the sparseness of break-point data, dictated by the modest number of available recombinant sequences, I a priori impose a biologically relevant correlation structure on recombination location log-odds via a Gaussian Markov random field (GMRF). Applied to HIV sequences, this approach reveals a previously unknown recombination hot-spot. I show that GMRF smoothing can also be successfully combined with Kingman's coalescent to estimate temporal variation of the population demographic history. GMRF temporal smoothing does not require strong prior decisions and is robust to prior perturbations. I apply GMRF smoothing to hepatitis C sequences, contemporaneously sampled in Egypt, and human influenza A hemagglutinin sequences, serially sampled throughout three flu seasons. I conclude with posterior predictive model diagnostics for locating spatial patterns of variation in genomic sequences. The evolutionary counting processes that keep track of a priori labeled mutations provide very useful discrepancy measures for detecting model inadequacies.
I take an algorithmic probability approach that allows for an exact and efficient computation of certain properties of evolutionary counting processes. I demonstrate that these properties allow detection of periodic patterns of mutation rate variation in nucleotide sequence alignments.
|
Subject
|
:
|
Biological sciences; Change-point models; Induced counting; Markov random fields; Biostatistics; Genetics; 0369:Genetics; 0308:Biostatistics
|
Added Entry
|
:
|
M. A. Suchard
|
Added Entry
|
:
|
University of California, Los Angeles
|
| |