Evaluating feature-selection stability in next-generation proteomics

Evaluating feature-selection stability in next-generation proteomics. Journal of Bioinformatics and Computational Biology 14(5): 1650029

Identifying reproducible yet relevant features is a major challenge in biological research. This is well documented in genomics data. Using a proposed set of three reliability benchmarks, we find that this issue also exists in proteomics for commonly used feature-selection methods, e.g. the t-test and recursive feature elimination. Moreover, due to high test variability, selecting the top proteins based on p-value ranks - even when restricted to high-abundance proteins - does not improve reproducibility. Statistical testing based on networks is believed to be more robust, but this does not always hold true: the commonly used hypergeometric enrichment, which tests for enrichment of protein subnets, performs abysmally due to its dependence on unstable protein pre-selection steps. We demonstrate here for the first time the utility on proteomics of a novel suite of network-based algorithms called rank-based network algorithms (RBNAs). These were originally introduced and tested extensively on genomics data. We show here that they are highly stable and reproducible, and that they select relevant features when applied to proteomics data. It is also evident from these results that statistical feature testing on protein expression data should be carried out with due caution. Careless use of networks does not resolve poor-performance issues, and can even mislead. We recommend augmenting statistical feature-selection methods with concurrent analysis of stability and reproducibility to improve the quality of the selected features prior to experimental validation.
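To make the notion of feature-selection stability concrete, the following is a minimal Python sketch of one common way to quantify it: repeatedly subsample both classes, select the top-k features by absolute t-statistic, and average the pairwise Jaccard overlap of the selected sets. This is an illustration only and is not claimed to be one of the three benchmarks used in the paper. All function names, the 80% subsampling fraction, the number of resamples, and the use of |t| as a stand-in for p-value ranking are assumptions made for the example.

```python
import random
import statistics
from itertools import combinations


def welch_t(a, b):
    """Welch's t-statistic for two samples (unequal variances allowed).

    Returns 0.0 when the standard error is zero, to avoid dividing by zero.
    """
    va, vb = statistics.variance(a), statistics.variance(b)
    se = (va / len(a) + vb / len(b)) ** 0.5
    return 0.0 if se == 0 else (statistics.mean(a) - statistics.mean(b)) / se


def top_k_features(group_a, group_b, k):
    """Return the indices of the k features with the largest |t|.

    Each group is a list of samples; each sample is a list of feature values.
    """
    n_features = len(group_a[0])
    scores = []
    for j in range(n_features):
        a = [row[j] for row in group_a]
        b = [row[j] for row in group_b]
        scores.append((abs(welch_t(a, b)), j))
    scores.sort(reverse=True)
    return {j for _, j in scores[:k]}


def jaccard(s, t):
    """Jaccard index of two sets; 1.0 means identical selections."""
    union = s | t
    return len(s & t) / len(union) if union else 1.0


def selection_stability(group_a, group_b, k, n_resamples=20, subsample=0.8, seed=0):
    """Mean pairwise Jaccard overlap of top-k selections across subsamples.

    Assumes each group has at least two samples (needed for a variance).
    """
    rng = random.Random(seed)
    selections = []
    for _ in range(n_resamples):
        sub_a = rng.sample(group_a, max(2, int(subsample * len(group_a))))
        sub_b = rng.sample(group_b, max(2, int(subsample * len(group_b))))
        selections.append(top_k_features(sub_a, sub_b, k))
    pairs = list(combinations(selections, 2))
    return sum(jaccard(s, t) for s, t in pairs) / len(pairs)


if __name__ == "__main__":
    # Synthetic toy data (hypothetical): 10 samples per class, 50 features,
    # with only the first 5 features weakly shifted in the second class.
    rng = random.Random(1)
    control = [[rng.gauss(0, 1) for _ in range(50)] for _ in range(10)]
    disease = [[rng.gauss(0.5 if j < 5 else 0, 1) for j in range(50)]
               for _ in range(10)]
    print(selection_stability(control, disease, k=5))
```

A score near 1 indicates that nearly the same features are selected regardless of which samples are drawn; a score near 0 indicates that the selection changes almost entirely between subsamples, which is the kind of instability the abstract warns about for p-value-ranked protein lists.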


PMID: 27640811

DOI: 10.1142/S0219720016500293
