Fitting DNA sequences through log-linear modelling with linear constraints
Publication date
2011
Publisher
Taylor & Francis
Abstract
For some discrete-state series, such as DNA sequences, it can often be postulated that the probabilistic behaviour is governed by a Markov chain. To decide whether an uncharacterized piece of DNA is part of the coding region of a gene, two statistical tools are essential under the Markovian assumption: hypothesis tests for the order of a Markov chain and estimators of the transition probabilities. To improve the traditional statistical procedures for both when the stationarity assumption can be made, a new way of understanding the homogeneity hypothesis is proposed, so that log-linear modelling is applied for conditional independence jointly with homogeneity restrictions on the expected means of the transition counts in the sequence. In addition, a variety of test statistics and estimators can be considered by using phi-divergence measures. As special cases, the well-known likelihood ratio test statistics and maximum-likelihood estimators are obtained.
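As a rough illustration of the two classical tools named in the abstract, the sketch below computes maximum-likelihood transition probabilities for a first-order chain and the likelihood ratio statistic for testing order 0 (independence) against order 1. It is not the log-linear or phi-divergence procedure proposed in the paper, only the familiar special case the abstract mentions; the function names and the toy sequence are assumptions made for illustration.

```python
from collections import Counter
from math import log


def transition_counts(seq):
    """Count adjacent pairs (a, b) observed in the sequence."""
    return Counter(zip(seq, seq[1:]))


def mle_transition_probs(seq):
    """Maximum-likelihood estimates p(b | a) = n_ab / n_a. for a first-order chain."""
    counts = transition_counts(seq)
    row_totals = Counter()
    for (a, _b), n_ab in counts.items():
        row_totals[a] += n_ab
    probs = {}
    for (a, b), n_ab in counts.items():
        probs.setdefault(a, {})[b] = n_ab / row_totals[a]
    return probs


def lrt_order0_vs_order1(seq):
    """Likelihood ratio statistic G^2 = 2 * sum n_ab * log(n_ab / e_ab),
    where e_ab = n_a. * n_.b / n is the expected count under independence."""
    counts = transition_counts(seq)
    n = sum(counts.values())
    row_totals, col_totals = Counter(), Counter()
    for (a, b), n_ab in counts.items():
        row_totals[a] += n_ab
        col_totals[b] += n_ab
    g2 = 0.0
    for (a, b), n_ab in counts.items():  # only non-zero cells contribute
        expected = row_totals[a] * col_totals[b] / n
        g2 += 2.0 * n_ab * log(n_ab / expected)
    return g2


if __name__ == "__main__":
    toy = "ACGTACGTACGT"  # hypothetical toy sequence, not real data
    print(mle_transition_probs(toy))
    print(lrt_order0_vs_order1(toy))
```

With an alphabet of s states, this statistic is usually compared with a chi-square distribution on (s - 1)^2 degrees of freedom, which gives 9 for the four nucleotides; that approximation is asymptotic and is unreliable for sequences as short as the toy example.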