Journal article

【Abstract】

A bioassay measures the potency of a chemical substance by its effect on living animal or plant tissue. Bioassay data and chemical structures from pharmacokinetic and drug metabolism screening are mined from, and housed in, multiple databases. Bioassay predictions are calculated from these data to decide whether further advancement is warranted. This paper proposes a four-step preprocessing of datasets to improve bioassay predictions. The first step is instance selection, in which the dataset is divided into training, testing, and validation sets. The second step is discretization, which partitions the data with regard to the trade-off between accuracy and precision. The third step is normalization, in which data are scaled to the range 0 to 1 for subsequent machine learning processing. The fourth step is feature selection, in which key chemical properties and attributes are generated. The streamlined results are then analyzed for prediction effectiveness using machine learning algorithms in various tools, including Pipeline Pilot, R, Weka, and Excel. Experiments and evaluations reveal how effectively various combinations of preprocessing steps and machine learning algorithms produce more consistent and accurate predictions.
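
The four preprocessing steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the 60/20/20 split, equal-width binning, and the variance-based feature ranking are all assumptions chosen for illustration, since the abstract does not state which concrete methods were used.

```python
import random

def split_instances(rows, train=0.6, test=0.2, seed=0):
    """Step 1 (instance selection): shuffle, then split into
    training, testing, and validation sets. Ratios are assumed."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = int(len(rows) * train)
    n_test = int(len(rows) * test)
    return (rows[:n_train],
            rows[n_train:n_train + n_test],
            rows[n_train + n_test:])

def discretize(values, n_bins=4):
    """Step 2 (discretization): equal-width binning (an assumed scheme).
    Returns one bin index per value; the maximum falls in the last bin."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def min_max_normalize(values):
    """Step 3 (normalization): rescale values to the range 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def select_features(columns, k=2):
    """Step 4 (feature selection): keep the k columns with the highest
    variance. Variance is a placeholder criterion, not the paper's."""
    def variance(vals):
        mean = sum(vals) / len(vals)
        return sum((v - mean) ** 2 for v in vals) / len(vals)
    ranked = sorted(columns, key=lambda name: variance(columns[name]),
                    reverse=True)
    return ranked[:k]
```

Each function takes plain Python lists (or, for `select_features`, a dict mapping a hypothetical descriptor name to its column of values), so the steps can be combined in different orders, which mirrors the abstract's point that different combinations of preprocessing steps are evaluated.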

【License】

Unknown

International Journal of Information Technology
Optimized Preprocessing for Accurate and Efficient Bioassay Prediction with Machine Learning Algorithms

Jeff Clarine; Chang-Shyh Peng; Daisy Sang
Keywords: Bioassay; machine learning; preprocessing; virtual screen
DOI: 10.1999/1307-6892/10008681
Subject classification: Computer applications
Source: World Academy of Science, Engineering and Technology (WASET)


