Sequence motifs are words of nucleotides in DNA with biological functions such as gene regulation. Identification of such words proceeds through rejection of Markov models on the expected motif frequency along the genome. Additional biological information can be extracted from the correlation structure among patterns of motif occurrences. In this paper, a log-linear multivariate intensity Poisson model is estimated via expectation maximization on a set of motifs along the genome of E. coli K12. The proposed approach allows for excitatory as well as inhibitory interactions among motifs and between motifs and other genomic features such as gene occurrences. Our findings confirm previous stylized facts about such types of interactions and shed new light on genome-maintenance functions of some particular motifs. We expect these methods to be applicable to a wider set of genomic features.
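
The Markov-model step mentioned in the abstract can be illustrated with a minimal sketch. It is not the procedure used in the paper: it only shows the general idea of comparing the observed count of a word with the count expected under a first-order Markov background model fitted to the same sequence. The function names `count_occurrences` and `expected_count_markov1` and the toy sequence are assumptions made for illustration.

```python
from collections import Counter


def count_occurrences(seq, word):
    """Count overlapping occurrences of `word` in `seq`."""
    last_start = len(seq) - len(word)
    return sum(1 for i in range(last_start + 1) if seq.startswith(word, i))


def expected_count_markov1(seq, word):
    """Expected count of `word` under a first-order Markov model fitted to `seq`.

    Illustrative assumption: letter and transition probabilities are the
    plain empirical frequencies, with no smoothing or pseudocounts.
    """
    n = len(seq)
    if n == 0 or len(word) == 0 or n < len(word):
        return 0.0
    singles = Counter(seq)                                # letter counts
    pairs = Counter(seq[i:i + 2] for i in range(n - 1))   # dinucleotide counts
    firsts = Counter(seq[:-1])                            # letters that start a pair
    prob = singles[word[0]] / n                           # P(first letter)
    for a, b in zip(word, word[1:]):
        if firsts[a] == 0:
            return 0.0
        prob *= pairs[a + b] / firsts[a]                  # P(b | a)
    return prob * (n - len(word) + 1)                     # number of start positions


# Toy sequence, not real genomic data.
toy = "ACGT" * 50
observed = count_occurrences(toy, "ACGT")
expected = expected_count_markov1(toy, "ACGT")
```

A word whose observed count departs strongly from the expected count is a candidate motif under this kind of screen. A real analysis would also need a variance estimate for the count and a correction for multiple testing, both of which are omitted here.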

Original publication

DOI

10.1103/physreve.85.066124

Type

Journal article

Journal

Physical Review E: Statistical, Nonlinear, and Soft Matter Physics

Publication Date

06/2012

Volume

85

Addresses

LEM, Scuola Superiore Sant'Anna, 56127 Pisa, Italy. d.pirino@sssup.it

Keywords

DNA; Models, Statistical; Markov Chains; Chromosome Mapping; Sequence Analysis, DNA; Structure-Activity Relationship; Algorithms; Models, Genetic; Computer Simulation