Learning from imbalanced data: a comparative study for colon CAD
Vol. 6915 (2008)
Abstract
Classification plays an important role in reducing false positives in many computer-aided detection and diagnosis (CAD) methods. The difficulty of classifying polyps lies in the variation of possible polyp shapes and sizes and in the imbalance between the number of polyp and non-polyp regions available in the training data. CAD schemes for medical applications demand high sensitivity, even at the cost of retaining a certain number of false positives. In this paper, we investigate two state-of-the-art solutions to the imbalanced data problem: the Synthetic Minority Over-sampling Technique (SMOTE) and weighted Support Vector Machines (SVMs). We tested these methods on a diverse CT colonography database that included a wide spectrum of cases in which polyps are difficult to detect. We performed several experiments with different combinations of over-sampling techniques on the training data. The results demonstrated that SVMs achieved much better performance than C4.5 across the different over-sampling techniques. The results also show that a weighted SVM without over-sampling can achieve sensitivity and specificity comparable to those of a conventional SVM combined with over-sampling.
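
The two approaches named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give their parameters, feature set, or weighting scheme. The function names `smote` and `balanced_class_weights`, the default `k=5`, and the toy data are assumptions made for illustration. The SMOTE variant here picks base samples at random rather than iterating over every minority sample, and the weights use the common inverse-frequency heuristic, which may differ from the weighting used in the paper.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Simplified SMOTE: create synthetic minority samples by interpolating
    between a minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours, excluding the base itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def balanced_class_weights(labels):
    """Per-class weights inversely proportional to class frequency
    (an assumed heuristic, not necessarily the paper's scheme)."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n_samples, n_classes = len(labels), len(counts)
    return {y: n_samples / (n_classes * c) for y, c in counts.items()}


if __name__ == "__main__":
    # Toy 2-D features standing in for polyp candidates (hypothetical data).
    polyps = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    print(len(smote(polyps, n_synthetic=10, k=2)))
    print(balanced_class_weights([1] * 2 + [0] * 8))
```

Over-sampling changes the training set itself, whereas class weights leave the data untouched and raise the misclassification penalty for the rare class. In a weighted SVM such weights would typically scale the penalty parameter per class.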