Document summary
Week 12

Legal notice: the 【】 and slides are teaching materials for the 炼数成金 online course. All materials may be used only within the course and must not be distributed outside it; violators may be held legally and financially liable.
Course details: 炼数成金 training, http:

Outline
    SparkR introduction
    SparkR examples
    Spark MLlib

SparkR introduction

SparkR started as a standalone project in September 2013. In January 2014 the project was open-sourced on GitHub (/amplab-extras/SparkR-pkg).

[Figure: the SparkR package and the JVM backend]
[Figure: SparkR architecture]

R-3.1.1: build and install

Check the operating system:

    [root@feng03 R-3.1.1]# lsb_release -a
    LSB Version:    :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
    Distributor ID: CentOS
    Description:    CentOS release 6.6 (Final)
    Release:        6.6
    Codename:       Final

Required packages:

1. yum install gcc

    [root@feng03 ~]# yum install gcc
    Loaded plugins: fastestmirror, security
    Setting up Install Process
    Dependencies Resolved
    ================================================================
     Package        Arch      Version           Repository    Size
    ================================================================
    Installing:
     gcc            x86_64    4.4.7-16.el6      base          10 M
    Installing for dependencies:
     cloog-ppl      x86_64    0.15.7-1.2.el6    base          93 k
    Updating for dependencies:
     libgcc         x86_64    4.4.7-16.el6      base         103 k

    Transaction Summary
    ================================================================
    Install       5 Package(s)
    Upgrade       2 Package(s)

    Installed:
      gcc.x86_64 0:4.4.7-16.el6
    Dependency Installed:
      cloog-ppl.x86_64 0:0.15.7-1.2.el6    cpp.x86_64 0:4.4.7-16.el6
      mpfr.x86_64 0:2.4.1-6.el6            ppl.x86_64 0:0.10.2-11.el6
    Complete!

2. yum install gcc-c++
3. yum install gcc-gfortran
4. yum install pcre-devel
5. yum install tcl-devel
6. yum install zlib-devel
7. yum install bzip2-devel
8. yum install libX11-devel
9. yum install readline-devel
10. yum install libXt-devel
11. yum install tk-devel
12. yum install tetex-latex

13. Download and unpack

    [jifeng@feng03 r]$ wget http:/m/cran/src/base/R-3/R-3.1.1.tar.gz
    [jifeng@feng03 r]$ tar -zxf R-3.1.1.tar.gz

14. Build and install. The configure option takes two leading dashes, and the two make steps are chained with && so that the install runs only after a successful build.

    [root@feng03 R-3.1.1]$ ./configure --enable-R-shlib
    [root@feng03 R-3.1.1]# make && make install

15. Start an R session

    [root@feng03 R-3.1.1]# R
    R version 3.1.1 (2014-07-10) -- "Sock it to Me"
    Copyright (C) 2014 The R Foundation for Statistical Computing
    Platform: x86_64-unknown-linux-gnu (64-bit)

SparkR installation

1. Start an R session.
2. Install rJava and choose 21 at the prompt:

    install.packages("rJava")

3. Install devtools:

    install.packages("devtools")

The devtools installation fails with the following output:

    ------------------------- ANTICONF ERROR ---------------------------
    Configuration failed because libcurl was not found. Try installing:
     * deb: libcurl4-openssl-dev (Debian, Ubuntu, etc)
     * rpm: libcurl-devel (Fedora, CentOS, RHEL)
     * csw: libcurl_dev (Solaris)
    If libcurl is already installed, check that 'pkg-config' is in your
    PATH and PKG_CONFIG_PATH contains a libcurl.pc file. If pkg-config
    is unavailable you can set INCLUDE_DIR and LIB_DIR manually via:
    R CMD INSTALL --configure-vars='INCLUDE_DIR=... LIB_DIR=...'

    configure: error:
      OpenSSL library required
      Please install:
      libssl-dev (deb)
      or openssl-devel (rpm)
    See `config.log' for more details
    ERROR: configuration failed for package 'git2r'
    * removing '/home/jifeng/R/x86_64-unknown-linux-gnu-library/3.1/git2r'
    ERROR: dependency 'curl' is not available for package 'httr'
    * removing '/home/jifeng/R/x86_64-unknown-linux-gnu-library/3.1/httr'
    ERROR: dependencies 'curl', 'xml2' are not available for package 'rversions'
    * removing '/home/jifeng/R/x86_64-unknown-linux-gnu-library/3.1/rversions'
    ERROR: dependencies 'httr', 'curl', 'rversions', 'git2r' are not available for package 'devtools'
    * removing '/home/jifeng/R/x86_64-unknown-linux-gnu-library/3.1/devtools'

    The downloaded source packages are in
        '/tmp/Rtmp1A16li/downloaded_packages'
    Warning messages:
    1: In install.packages("devtools") :
      installation of package 'xml2' had non-zero exit status
    2: In install.packages("devtools") :
      installation of package 'curl' had non-zero exit status
    3: In install.packages("devtools") :
      installation of package 'git2r' had non-zero exit status
    4: In install.packages("devtools") :
      installation of package 'httr' had non-zero exit status
    5: In install.packages("devtools") :
      installation of package 'rversions' had non-zero exit status
    6: In install.packages("devtools") :

Following the hints in the error output, exit the R session and install the missing development libraries:

    [root@feng03 ~]# yum install libcurl-devel
    [root@feng03 ~]# yum install openssl-devel
    [root@feng03 ~]# yum install libxml2-devel

Then start an R session again and install the packages that failed:

    install.packages("git2r")
    install.packages("xml2")
    install.packages("rversions")

Install SparkR through devtools:

    library(devtools)
    install_github("amplab-extras/SparkR-pkg", subdir="pkg")

The download does not complete and the installation fails.

Installing SparkR from the source tarball

Download address: /amplab-extras/SparkR-pkg/tarball/master

    [jifeng@feng03 r]$ ls
    master  R-3.1.1  R-3.1.1.tar.gz
    [jifeng@feng03 r]$ mv master SparkR-pkg.gz
    [jifeng@feng03 r]$ ls
    R-3.1.1  R-3.1.1.tar.gz  SparkR-pkg.gz
    [jifeng@feng03 r]$ tar zxf SparkR-pkg.gz
    [jifeng@feng03 r]$ ls
    amplab-extras-SparkR-pkg-e532627  R-3.1.1  R-3.1.1.tar.gz  SparkR-pkg.gz
    [jifeng@feng03 r]$ cd amplab-extras-SparkR-pkg-e532627/
    [jifeng@feng03 amplab-extras-SparkR-pkg-e532627]$ ls
    BUILDING.md     create-docs.sh   examples  DOCUMENTATION.md  install-dev.bat      LICENSE  README.md
    sparkR          SparkR_prep-0.1.sh  install-dev.sh  pkg  run-tests.sh  SparkR_IDE_Setup.sh  TODO.md

Run ./install-dev.sh. The first attempt fails because the sbt launcher that the build script downloads is corrupt:

    [jifeng@feng03 amplab-extras-SparkR-pkg-e532627]$ ./install-dev.sh
    * installing *source* package 'SparkR' ...
    ** libs
    ** arch -
    ./sbt/sbt assembly
    Attempting to fetch sbt
    ######################################## 100.0%
    Launching sbt from sbt/sbt-launch-0.13.6.jar
    Error: Invalid or corrupt jarfile sbt/sbt-launch-0.13.6.jar
    make: *** [/scala-2.10/sparkr-assembly-0.1.jar] Error 1
    ERROR: compilation failed for package 'SparkR'
    * removing '/home/jifeng/r/amplab-extras-SparkR-pkg-e532627/lib/SparkR'

The relevant parts of the two scripts:

    [jifeng@feng03 amplab-extras-SparkR-pkg-e532627]$ cat ./install-dev.sh
    # Install R
    R CMD INSTALL --library=$LIB_DIR pkg/

    [jifeng@feng03 sbt]$ cat sbt
    SBT_VERSION=`awk -F "=" '/sbt\.version/ {print $2}' ./project/build.properties`
    URL1=http:/typesafe/ivy-releases/.scala-sbt/sbt-launch/${SBT_VERSION}/sbt-launch.jar
    URL2=http:/typesafe/ivy-releases/.scala-sbt/sbt-launch/${SBT_VERSION}/sbt-launch.jar
    JAR=sbt/sbt-launch-${SBT_VERSION}.jar
    printf "Launching sbt from ${JAR}\n"
    java -Xmx1200m -XX:MaxPermSize=350m -XX:ReservedCodeCacheSize=256m -jar ${JAR} "$@"

Change JAR to the path of the sbt launcher already installed on this machine:

    JAR=/home/jifeng/sbt/bin/sbt-launch.jar

Run ./install-dev.sh again:

    [jifeng@feng03 amplab-extras-SparkR-pkg-e532627]$ ./install-dev.sh
    * installing *source* package 'SparkR' ...
    ** libs
    ** arch -
    ./sbt/sbt assembly
    Launching sbt from /home/jifeng/sbt/bin/sbt-launch.jar
    Getting org.scala-sbt sbt 0.13.6 ...

sbt then downloads its own artifacts, each reported as SUCCESSFUL: sbt.jar (14210ms), main.jar (56723ms), compiler-interface-bin.jar (33548ms) and compiler-interface-src.jar (17777ms). The build continues:

    [success] Total time: 592 s, completed Sep 30, 2015 10:15:33 PM
    cp -f /scala-2.10/sparkr-assembly-0.1.jar ./inst/
    R CMD SHLIB -o SparkR.so string_hash_code.c
    make[1]: Entering directory `/home/jifeng/r/amplab-extras-SparkR-pkg-e532627/pkg/src'
    gcc -std=gnu99 -I/usr/local/lib64/R/include -DNDEBUG -I/usr/local/include -fpic -g -O2 -c string_hash_code.c -o string_hash_code.o
    gcc -std=gnu99 -shared -L/usr/local/lib64 -o SparkR.so string_hash_code.o -L/usr/local/lib64/R/lib -lR
    make[1]: Leaving directory `/home/jifeng/r/amplab-extras-SparkR-pkg-e532627/pkg/src'
    installing to /home/jifeng/r/amplab-extras-SparkR-pkg-e532627/lib/SparkR/libs
    ** R
    ** inst
    ** preparing package for lazy loading
    Creating a generic function for 'lapply' from package 'base' in package 'SparkR'
    Creating a generic function for 'Filter' from package 'base' in package 'SparkR'
    ** help
    *** installing help indices
    ** building package indices
    ** testing if installed package can be loaded
    * DONE (SparkR)

SparkR example: pi

    ./sparkR examples/pi.R local

SparkR run commands

Running sparkR

If you have installed it directly from GitHub, you can include the SparkR package and then initialize a SparkContext. For example, to run with a local Spark master you can launch R and then run:

    library(SparkR)
    sc <- sparkR.init(master="local")

If you have cloned and built SparkR, you can start using it by launching the SparkR shell with:

    ./sparkR

SparkR also comes with several sample programs in the examples directory. To run one of them, use ./sparkR <filename> <args>. For example:

    ./sparkR examples/pi.R local[2]

You can also run the unit tests for SparkR with ./run-tests.sh. The tests need the testthat package:

    install.packages("testthat")

SparkR DataFrames

Start the SparkR shell against the cluster master:

    ./bin/sparkR --master spark://feng03:7077

Start from a SparkContext and a SQLContext:

    sc <- sparkR.init()
    sqlContext <- sparkRSQL.init(sc)

Create a DataFrame from a local R data frame:

    df <- createDataFrame(sqlContext, faithful)
    head(df)

Create a DataFrame from a data source:

    people <- read.df(sqlContext, "file:/home/jifeng/spark-1.4.0-bin-hadoop2.6/examples/src/main/resources/people.json", "json")
    head(people)
    printSchema(people)

Run SQL queries in SparkR:

    registerTempTable(people, "people")
    teenagers <- sql(sqlContext, "... >= 13 AND age ...")

Spark MLlib example

    ...
        val features = Array[Double](allnum.toString.toDouble, allamount.toString.toDouble)
        Vectors.dense(features)
    }
    parsedData.collect().foreach(println)

    // Cluster the data set into 3 clusters with 20 iterations to build the model.
    // Note: the configured partition count of 20 is used here.
    val numClusters = 3
    val numIterations = 20
    val m = KMeans.train(parsedData, numClusters, numIterations)

    // Classify the input data with the model and print the result.
    // The separator strings are lost in the source; a single space is assumed.
    val result1 = sqldata.map {
      case Row(locationid, allnum, allamount) =>
        val features = Array[Double](allnum.toString.toDouble, allamount.toString.toDouble)
        val linevectore = Vectors.dense(features)
        val prediction = m.predict(linevectore)
        locationid + " " + allnum + " " + allamount + " " + prediction
    }.collect().foreach(println)

    // Save to a file
    val result2 = sqldata.map {
      case Row(locationid, allnum, allamount) =>
        val features = Array[Double](allnum.toString.toDouble, allamount.toString.toDouble)
        val linevectore = Vectors.dense(
        ...
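Returning to the sbt launcher fix from the SparkR build: the manual edit that points JAR at a locally installed sbt-launch.jar can also be scripted. The following is a minimal sketch, assuming the sbt wrapper script assigns JAR on a line of its own starting at column 0. The function name use_local_sbt_launcher is hypothetical and is not part of SparkR or sbt.

```shell
#!/bin/sh
# Hypothetical helper (not part of SparkR or sbt): rewrite the JAR=...
# assignment in an sbt wrapper script so it uses a launcher jar that is
# already on this machine instead of the downloaded one.
#
# Usage: use_local_sbt_launcher <path-to-sbt-script> <path-to-sbt-launch.jar>
use_local_sbt_launcher() {
    sbt_script="$1"
    launcher_jar="$2"

    # Refuse to patch if either file is missing, so a typo cannot
    # leave the wrapper pointing at a launcher that does not exist.
    if [ ! -f "$sbt_script" ]; then
        echo "sbt script not found: $sbt_script" >&2
        return 1
    fi
    if [ ! -f "$launcher_jar" ]; then
        echo "launcher jar not found: $launcher_jar" >&2
        return 1
    fi

    # Assumption: the assignment starts at column 0 as "JAR=...".
    # The original script is kept next to it with a .bak suffix.
    # Paths containing '|' or '&' would need escaping and are not handled.
    sed -i.bak "s|^JAR=.*|JAR=$launcher_jar|" "$sbt_script"
}
```

With the paths used in this document, the call would be `use_local_sbt_launcher sbt /home/jifeng/sbt/bin/sbt-launch.jar`, run from the directory that contains the sbt script. Keeping the .bak copy makes it easy to restore the original download behaviour later.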