Generalized and Heuristic-Free Feature Construction for Improved Accuracy


Generalized and Heuristic-Free Feature Construction for Improved Accuracy. Proceedings of the ... SIAM International Conference on Data Mining. SIAM International Conference on Data Mining 2010: 629-640

State-of-the-art learning algorithms accept data in feature vector format as input. Examples belonging to different classes may not always be easy to separate in the original feature space. One may ask: can transforming existing features into a new space reveal significant discriminative information that is not obvious in the original space? Three problems stand in the way. First, since there can be an infinite number of ways to extend features, it is impractical to enumerate them all and then perform feature selection. Second, evaluating discriminative power on the complete dataset is not always optimal, because features that are highly discriminative on a subset of examples may not be significant when evaluated on the entire dataset. Third, feature construction ought to be automated and general, so that it requires no domain knowledge and its accuracy improvement holds across a large number of classification algorithms. In this paper, we propose a framework that addresses these problems through the following steps: (1) divide-and-conquer to avoid exhaustive enumeration; (2) local feature construction and evaluation within subspaces of examples where the local error is still high and the features constructed so far still do not predict well; (3) a weighting-rule-based search that is free of domain knowledge and has a provable performance guarantee. Empirical studies indicate that the newly constructed features yield significant improvement (as much as 9% in accuracy and 28% in AUC) over a variety of inductive learners, evaluated on a number of balanced, skewed, and high-dimensional datasets. Software and datasets are available from the authors.
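
The general idea of constructing features and scoring them on a subset of examples can be illustrated with a minimal Python sketch. This is not the authors' framework: the operator set `OPS`, the single-threshold scorer `stump_accuracy`, the helper `best_constructed_feature`, and the toy data are all assumptions made for illustration, and the sketch enumerates pairs exhaustively rather than using the paper's divide-and-conquer or weighting-rule search.

```python
from itertools import combinations
import operator

# Hypothetical candidate operators (an assumption; the abstract does not list any).
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def stump_accuracy(values, labels):
    """Best accuracy of a single-threshold rule on one feature (either polarity)."""
    n = len(labels)
    best = 0.0
    for t in sorted(set(values)):
        # Predict class 0 when value <= t, class 1 otherwise.
        hits = sum((v <= t) == (y == 0) for v, y in zip(values, labels))
        best = max(best, hits / n, 1 - hits / n)
    return best


def best_constructed_feature(rows, labels, subset=None):
    """Score every pairwise combination, optionally only on a subset of examples."""
    idx = range(len(rows)) if subset is None else subset
    local_rows = [rows[k] for k in idx]
    local_labels = [labels[k] for k in idx]
    best_score, best_name = 0.0, None
    for i, j in combinations(range(len(rows[0])), 2):
        for name, op in OPS.items():
            values = [op(r[i], r[j]) for r in local_rows]
            score = stump_accuracy(values, local_labels)
            if score > best_score:
                best_score, best_name = score, f"{name}(x{i},x{j})"
    return best_score, best_name


# Toy XOR-like data: class 1 when both features share a sign.
rows = [(1, 1), (2, 3), (-1, -2), (-3, -1), (1, -1), (2, -3), (-1, 2), (-3, 1)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Best single original feature versus best constructed feature on all examples.
baseline = max(stump_accuracy([r[i] for r in rows], labels) for i in range(2))
global_score, global_name = best_constructed_feature(rows, labels)

# The same search restricted to the examples with a positive first feature.
local_score, local_name = best_constructed_feature(rows, labels, subset=[0, 1, 4, 5])

print(baseline, global_score, global_name, local_score, local_name)
```

On this toy data no single original feature separates the classes with one threshold, while the product of the two features does. Restricting the search to a subset of examples selects a different feature (the sum), which echoes the abstract's point that a feature can be discriminative locally without being the best choice on the whole dataset.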


PMID: 21544257

