Self-training cost-sensitive support vector machine with uncertainty-based sampling

Source journal: Journal of Central South University (Science and Technology) (中南大学学报(自然科学版)), 2012, No. 2

Authors: JIANG Tong (江彤), TANG Mingzhu (唐明珠), YANG Chunhua (阳春华)

Pages: 561-566

Key words: active learning; cost-sensitive support vector machine; self-training approach; uncertainty-based sampling; support vector data description

Abstract: A self-training cost-sensitive support vector machine with uncertainty-based sampling (SCU) is proposed to address two problems: class imbalance in the sample set and the high cost of labeling samples. Uncertainty-based sampling uses support vector data description to evaluate the uncertainty of each unlabeled sample, and the unlabeled samples with high uncertainty are selected for labeling. At the same time, the cost-sensitive support vector machine is trained with a self-training approach and uses its cost parameters and kernel parameters to predict class labels for unlabeled samples. Experimental results show that SCU effectively reduces the average expected misclassification cost and the number of times samples need to be labeled.
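The sample-selection step and the cost measure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`split_pool`, `average_misclassification_cost`), the `confident_margin` threshold, and the use of a signed distance to a boundary as the uncertainty score are all assumptions made for illustration. The abstract does not give the paper's exact uncertainty measure or its exact definition of average expected misclassification cost, so a common per-sample average is used here.

```python
from typing import List, Sequence, Tuple


def split_pool(scores: Sequence[float], n_query: int,
               confident_margin: float) -> Tuple[List[int], List[int]]:
    """Split unlabeled-sample indices into 'query' and 'self-label' groups.

    Assumption: scores[i] is a signed distance of sample i to a decision
    boundary (for example, an SVDD hypersphere). A value near 0 means the
    sample is highly uncertain.
    """
    # Rank indices from most to least uncertain (smallest |score| first).
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i]))
    # The most uncertain samples are sent to a human annotator.
    to_query = ranked[:n_query]
    # Remaining samples far enough from the boundary keep the model's own
    # predicted label (the self-training part).
    to_self_label = [i for i in ranked[n_query:]
                     if abs(scores[i]) >= confident_margin]
    return to_query, to_self_label


def average_misclassification_cost(y_true: Sequence[int],
                                   y_pred: Sequence[int],
                                   cost_fn: float,
                                   cost_fp: float) -> float:
    """Mean per-sample cost, with labels +1 (minority) and -1 (majority)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == -1:
            total += cost_fn  # a minority-class sample was missed
        elif t == -1 and p == 1:
            total += cost_fp  # a majority-class sample raised a false alarm
    return total / len(y_true)
```

In a class-imbalanced setting, `cost_fn` would typically be set higher than `cost_fp`, so that missing a minority-class sample is penalized more heavily than a false alarm.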

Funding: Project supported by the National Natural Science Foundation of China
