Data augmentation algorithms for detecting conserved domains in protein sequences: a comparative study

Journal of Proteome Research 7(1): 192-201

Protein conserved domains are distinct units of molecular structure, usually associated with particular aspects of molecular function such as catalysis or binding. These conserved subsequences are often unobserved and must therefore be detected. Motif discovery methods can find these unobserved domains given a set of sequences. This paper presents a data augmentation (DA) framework that unifies a suite of motif-finding algorithms, all of which maximize the same likelihood function by imputing the unobserved data. Data augmentation refers to methods that formulate iterative optimization by exploiting the unobserved data. Two categories of maximum-likelihood motif-finding algorithms are illustrated under the DA framework. The first comprises deterministic algorithms, which maximize the likelihood function by iteratively performing an optimal local search in the alignment space. The second comprises stochastic algorithms, which iteratively draw motif location samples via Monte Carlo simulation while keeping track of the solution with the best likelihood so far. Four DA motif discovery algorithms are described, evaluated, and compared by aligning real and simulated protein sequences.
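The two categories described in the abstract can be illustrated with a small Python sketch. It is not the paper's implementation: it assumes exactly one motif occurrence per sequence, a fixed motif width, a uniform background distribution, and a log-odds score as a stand-in for the likelihood. The names `find_motif`, `build_pwm`, `site_log_odds`, `total_score`, and `uniform_background` are hypothetical. With `stochastic=False` each sequence's motif start is moved to its best-scoring position (a greedy local search); with `stochastic=True` the start is sampled in proportion to its score (a Gibbs-style site sampler), and the best-scoring alignment seen so far is retained.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def uniform_background():
    """Assumed background model: every residue equally likely."""
    return {a: 1.0 / len(ALPHABET) for a in ALPHABET}


def build_pwm(seqs, starts, width, pseudo=1.0, skip=None):
    """Per-column residue probabilities from the current motif sites.

    `skip` leaves one sequence out, as is usual when updating its site.
    """
    counts = [{a: pseudo for a in ALPHABET} for _ in range(width)]
    for i, (seq, pos) in enumerate(zip(seqs, starts)):
        if i == skip:
            continue
        for j in range(width):
            counts[j][seq[pos + j]] += 1.0
    pwm = []
    for col in counts:
        total = sum(col.values())
        pwm.append({a: c / total for a, c in col.items()})
    return pwm


def site_log_odds(seq, pos, pwm, bg):
    """Log-odds of the window at `pos` under the motif versus background."""
    return sum(math.log(pwm[j][seq[pos + j]] / bg[seq[pos + j]])
               for j in range(len(pwm)))


def total_score(seqs, starts, width, bg):
    """Score of a full alignment; used here in place of the likelihood."""
    pwm = build_pwm(seqs, starts, width)
    return sum(site_log_odds(s, p, pwm, bg) for s, p in zip(seqs, starts))


def find_motif(seqs, width, stochastic, iters=50, seed=0):
    """Impute motif start positions (the unobserved data) iteratively."""
    rng = random.Random(seed)
    bg = uniform_background()
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    best_starts = list(starts)
    best = total_score(seqs, starts, width, bg)
    for _ in range(iters):
        for i, seq in enumerate(seqs):
            pwm = build_pwm(seqs, starts, width, skip=i)
            scores = [site_log_odds(seq, p, pwm, bg)
                      for p in range(len(seq) - width + 1)]
            if stochastic:
                # Monte Carlo step: sample a start proportional to its odds.
                top = max(scores)
                weights = [math.exp(x - top) for x in scores]
                starts[i] = rng.choices(range(len(scores)), weights=weights)[0]
            else:
                # Deterministic step: move to the locally best start.
                starts[i] = max(range(len(scores)), key=scores.__getitem__)
            score = total_score(seqs, starts, width, bg)
            if score > best:  # keep track of the best alignment so far
                best, best_starts = score, list(starts)
    return best_starts, best


if __name__ == "__main__":
    # Made-up toy sequences with the planted window "WCHKW"; not real data.
    toy = ["ACDEWCHKWGHIK", "LMNWCHKWPQRST", "VWYWCHKWACDEF", "GHWCHKWIKLMNP"]
    for mode in (False, True):
        print(mode, find_motif(toy, 5, stochastic=mode))
```

Both modes share the same scoring function and differ only in how the unobserved start positions are imputed, which is the sense in which a single framework can cover deterministic and stochastic variants. Neither mode is guaranteed to recover a planted motif from a single random start, so restarts with different seeds are a common precaution.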


PMID: 18081244

DOI: 10.1021/pr070475q
