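Optimizing multi-decision trees with identically distributed reinforcement learning and its application to imbalanced data sets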

Source journal: Journal of Central South University (Science and Technology), 2019, No. 5

Authors: ZHANG Xueying (张雪英), JIAO Jiangli (焦江丽), LI Fenglian (李凤莲), NIU Zhuang (牛壮)

Pages: 1112-1119

Key words: imbalanced data sets; multi-decision tree; attribute selection strategy based on a cumulative reward mechanism; identically distributed random sampling; reinforcement learning

Abstract: Conventional decision trees tend to be biased against the minority class when classifying imbalanced data sets. To address this, an improved identically distributed multi-decision tree approach is proposed, in which the attribute selection strategy is optimized using the reinforcement learning cumulative reward. First, the majority class samples of the imbalanced data set are randomly sampled using identically distributed random sampling. A single decision tree is then built on each resulting subset, and together these trees form a multi-decision tree. Each single tree is constructed with the classification and regression tree (CART) algorithm, and the reinforcement learning cumulative reward mechanism is used to optimize its attribute selection strategy. The results show that the proposed attribute optimization strategy effectively improves the probability that minority class samples are correctly classified. The identically distributed multi-decision tree method effectively improves overall prediction performance on imbalanced data sets; the positive rate and the geometric mean of the positive and negative rates both improve.
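The sampling step and the evaluation metric mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact sampling procedure, so `make_balanced_subsets` is only one plausible reading (an assumption): the majority class is shuffled and split into disjoint chunks of roughly minority size, and each chunk is paired with all minority samples. The function names, the `seed` parameter, and the convention that label 1 marks the minority (positive) class are likewise hypothetical. CART tree construction and the reinforcement learning cumulative reward attribute selection are not shown, because the abstract does not specify them in enough detail.

```python
import math
import random


def make_balanced_subsets(majority, minority, seed=0):
    """Split the majority class into random, disjoint chunks of roughly
    minority size and pair each chunk with all minority samples.

    Assumption: one plausible reading of "identically distributed random
    sampling"; the paper's exact procedure may differ.
    """
    rng = random.Random(seed)
    shuffled = list(majority)
    rng.shuffle(shuffled)
    chunk = max(1, len(minority))
    return [
        shuffled[i:i + chunk] + list(minority)
        for i in range(0, len(shuffled), chunk)
    ]


def g_mean(y_true, y_pred):
    """Geometric mean of the positive rate (TPR) and negative rate (TNR).

    Assumption: label 1 is the positive (minority) class, label 0 the
    negative (majority) class.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("y_true must contain both classes")
    return math.sqrt((tp / pos) * (tn / neg))
```

Each subset returned by `make_balanced_subsets` would serve as the training set for one single decision tree. The geometric mean is useful on imbalanced data because it drops to zero whenever either class is never predicted correctly, which plain accuracy does not reveal.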

