such as
Spark has its own example: https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/HBaseTest.scala
MapR also has a nice sample: http://www.mapr.com/developercentral/code/loading-hbase-tables-spark
and here is a more detailed code snippet: http://www.vidyasource.com/blog/Programming/Scala/Java/Data/Hadoop/Analytics/2014/01/25/lighting-a-spark-with-hbase
But none of them says anything about:
- which jar libraries are needed, i.e. the dependency problem
- how to set the classpath when starting a Spark job/application that connects to HBase
- sc.newAPIHadoopRDD returns values of this holy class org.apache.hadoop.hbase.client.Result, but the objects inside a Result are org.apache.hadoop.hbase.KeyValue, which is part of HBase's core client-side Java API. Sometimes calling getColumn("columnFamily".getBytes(), "columnQualifier".getBytes()) is simply not enough, and, more importantly, working with KeyValue objects from Scala is even more complicated.
I assume you have already read the samples above, so I will go straight to solving these three problems.
If you only want to see some code, jump to the next part of this doc: http://www.abcn.net/2014/07/spark-hbase-result-keyvalue-bytearray.html
1. dependency problem
It is similar to an HBase client program. For Maven:
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>1.0.1</version>
</dependency>
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase</artifactId>
  <version>0.98.2-hadoop2</version>
</dependency>
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>0.98.2-hadoop2</version>
</dependency>
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-common</artifactId>
  <version>0.98.2-hadoop2</version>
</dependency>
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-server</artifactId>
  <version>0.98.2-hadoop2</version>
</dependency>
sbt:
libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.10" % "1.0.1",
  "org.apache.hbase" % "hbase" % "0.98.2-hadoop2",
  "org.apache.hbase" % "hbase-client" % "0.98.2-hadoop2",
  "org.apache.hbase" % "hbase-common" % "0.98.2-hadoop2",
  "org.apache.hbase" % "hbase-server" % "0.98.2-hadoop2"
)
Change the Spark and HBase versions to match yours.
2. classpath
Back in Spark 0.9.x, you only needed to set the SPARK_CLASSPATH environment variable to HBase's jars. For example, to start spark-shell in local mode on the CDH5 Hadoop distribution:
export SPARK_CLASSPATH=/opt/cloudera/parcels/CDH/lib/hbase/hbase-server.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-protocol.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-hadoop2-compat.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-client.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-common.jar:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core.jar
and then
./bin/spark-shell --master local[2]
or just
SPARK_CLASSPATH=/opt/cloudera/parcels/CDH/lib/hbase/hbase-server.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-protocol.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-hadoop2-compat.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-client.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-common.jar:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core.jar ./bin/spark-shell --master local[2]
On your cluster, change the paths of those jars to wherever your HBase is installed. On other Hadoop distributions this will be something like /usr/lib/xxx (Hortonworks HDP) or /opt/mapr/hbase-xxx (MapR).
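To avoid retyping the full path of every jar when moving between distributions, the classpath can be assembled from a single base directory. The following is only a minimal sketch: the function name build_hbase_classpath and the HBASE_HOME variable are assumptions made up for illustration, and the jar list simply mirrors the CDH5 example above, so adjust it to what your distribution actually ships.

```shell
# Hypothetical helper (not part of Spark or HBase): print a colon-separated
# classpath for the HBase jars found under the given base directory.
build_hbase_classpath() {
  hbase_base="$1"
  hbase_cp=""
  # Jar names taken from the CDH5 example; other distributions may differ.
  for jar in hbase-server.jar hbase-protocol.jar hbase-hadoop2-compat.jar \
             hbase-client.jar hbase-common.jar lib/htrace-core.jar; do
    # Warn on stderr about missing jars, but keep going so every problem is reported.
    [ -f "$hbase_base/$jar" ] || echo "warning: $hbase_base/$jar not found" >&2
    # Append with a ':' separator, except before the first entry.
    hbase_cp="${hbase_cp:+$hbase_cp:}$hbase_base/$jar"
  done
  printf '%s\n' "$hbase_cp"
}

# Example usage (HBASE_HOME is an assumed variable, defaulting to the CDH5 path):
#   export SPARK_CLASSPATH="$(build_hbase_classpath "${HBASE_HOME:-/opt/cloudera/parcels/CDH/lib/hbase}")"
#   ./bin/spark-shell --master local[2]
```

The warnings go to stderr, so the command substitution still captures only the classpath string itself.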
But, but... this lovely SPARK_CLASSPATH is deprecated in the new era of Spark 1.x!!! -_-
So, in Spark 1.x
there is one conf property and one command line argument for this:
spark.executor.extraClassPath
and
--driver-class-path
WTF... but yes, you must give the whole list of jar paths twice! And spark.executor.extraClassPath must be set in a conf file; it cannot be set via the command line...
So, you need to do this:
edit conf/spark-defaults.conf
and add this:
spark.executor.extraClassPath /opt/cloudera/parcels/CDH/lib/hive/lib/hive-hbase-handler.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-server.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-protocol.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-hadoop2-compat.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-client.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-common.jar:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core.jar
and then start the Spark shell, or submit your Spark job, with the --driver-class-path command line argument for the driver:
./bin/spark-shell --master local[2] --driver-class-path /opt/cloudera/parcels/CDH/lib/hbase/hbase-server.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-protocol.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-hadoop2-compat.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-client.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-common.jar:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core.jar
Unbelievable, but that is how it is in Spark 1.x...
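Since the same jar list has to appear in two places, one way to keep them in sync is to hold the list in a single shell variable and derive both the conf line and the command line flag from it. This is only a sketch under stated assumptions: the function names executor_conf_line and driver_flag and the HBASE_CP variable are invented for illustration, and the example list is shortened to two jars for brevity.

```shell
# Hypothetical helpers (not part of Spark): derive both settings from one jar list.

# Print the line that goes into conf/spark-defaults.conf.
executor_conf_line() {
  printf 'spark.executor.extraClassPath %s\n' "$1"
}

# Print the flag to pass to spark-shell or spark-submit.
driver_flag() {
  printf '%s %s\n' '--driver-class-path' "$1"
}

# Shortened example list; replace it with your full colon-separated jar list.
HBASE_CP="/opt/cloudera/parcels/CDH/lib/hbase/hbase-client.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-common.jar"

# Show what would be configured.
executor_conf_line "$HBASE_CP"
driver_flag "$HBASE_CP"

# To apply them (left commented out so the sketch has no side effects):
#   executor_conf_line "$HBASE_CP" >> conf/spark-defaults.conf
#   ./bin/spark-shell --master local[2] $(driver_flag "$HBASE_CP")
```

Note that the executor list in the conf file above also contains hive-hbase-handler.jar, which the driver list does not; if you need that difference, pass a different list to each helper.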
3. How to use org.apache.hadoop.hbase.KeyValue in Scala with Spark
This post is already long enough, so let us take a break here. To see code from real-world examples, go to the next part of this doc: http://www.abcn.net/2014/07/spark-hbase-result-keyvalue-bytearray.html