Thesis Details
Efficient In-Database Analytics through Embedding MySQL into R
Boggaram Gopinath, Chandra Mohan; Dr. Steffen Heber, Committee Member; Dr. Nagiza F. Samatova, Committee Chair; Dr. Kemafor Anyanwu, Committee Member
University: North Carolina State University
Keywords: MySQL; R; BridgeR; In-database Analytics technology
Others: https://repository.lib.ncsu.edu/bitstream/handle/1840.16/2781/etd.pdf?sequence=1&isAllowed=y
United States | English
【 Abstract 】

High-performance analytics of data at extreme scales is a challenge well recognized by both the scientific and business communities. The goal of this Master’s thesis is to explore effective and efficient ways of performing statistical analysis of the data stored in large-scale relational databases (DB). The underlying hypothesis is that in-database analytics offers a plausible solution to this challenge by coupling analytical and database capabilities together. Such a coupling may let analytical workflows be executed without moving the data out of the databases and therefore avoid transferring the data over the network. As a result, in-database analytics may reduce the overall latency, assure better data governance and security, and scale analytical solutions to larger data sets with more efficient resource utilization.

In-database analytics can be realized through the following two complementary approaches: (a) analytics-in-DB places analytical workflows inside a DB server and (b) DB-in-analytics embeds the DB server into the memory space of analytical routines. The former has been primarily driven by the database community through various mechanisms, such as user-defined functions, stored procedures, compiled codes, etc. The latter is an emerging approach dominated by open source, robust, and scalable solutions in their infancy.

The focus of this Master’s thesis is on developing an open source and efficient in-database analytics solution via embedding a MySQL server into an R statistical data analysis environment. To the best of our knowledge, this is the first study that integrates analytical capabilities of R with a MySQL database management system in an embedded manner. Specifically, three novel ways for embedded DB-in-analytics are proposed and systematically evaluated. In contrast to existing wrapper-based approaches that provide wrapper APIs to MySQL functions, the proposed embedded solutions improve the time efficiency of R’s access to the MySQL DB by 650% to 1900%.
