Tuesday, July 22, 2014

Lighting a Spark With HBase, Full Edition, with real-world examples ~ dependencies, classpaths, handling ByteArray in HBase KeyValue objects

First of all, there are many resources on the internet about integrating HBase and Spark, such as:

Spark has its own example: https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/HBaseTest.scala

MapR also has a cool sample: http://www.mapr.com/developercentral/code/loading-hbase-tables-spark

And here is a more detailed code snippet: http://www.vidyasource.com/blog/Programming/Scala/Java/Data/Hadoop/Analytics/2014/01/25/lighting-a-spark-with-hbase

But none of them says anything about:
  • which jar libraries are needed, i.e. the dependency problem
  • how to set the classpath when starting a Spark job/application that connects to HBase
  • sc.newAPIHadoopRDD uses this holy class org.apache.hadoop.hbase.client.Result as its return value type, but the objects inside a Result are org.apache.hadoop.hbase.KeyValue, which is part of HBase's core client-side Java API. Sometimes it is really not enough to use it with just getColumn("columnFamily".getBytes(), "columnQualifier".getBytes()), and, more importantly, using this KeyValue object from Scala is even more complicated.
Therefore this post aims to be a "Full" version...

Assuming you have already read the samples above, I will go straight to solving these three problems.

If you only want to see some code, jump to the next part of this doc: http://www.abcn.net/2014/07/spark-hbase-result-keyvalue-bytearray.html

1. dependency problem

The dependencies are similar to those of an HBase client program.

For Maven:

<dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>1.0.1</version>
</dependency>

<dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase</artifactId>
        <version>0.98.2-hadoop2</version>
</dependency>

<dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-client</artifactId>
        <version>0.98.2-hadoop2</version>
</dependency>

<dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-common</artifactId>
        <version>0.98.2-hadoop2</version>
</dependency>

<dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-server</artifactId>
        <version>0.98.2-hadoop2</version>
</dependency>

For sbt:

libraryDependencies ++= Seq(
        "org.apache.spark" % "spark-core_2.10" % "1.0.1",
        "org.apache.hbase" % "hbase" % "0.98.2-hadoop2",
        "org.apache.hbase" % "hbase-client" % "0.98.2-hadoop2",
        "org.apache.hbase" % "hbase-common" % "0.98.2-hadoop2",
        "org.apache.hbase" % "hbase-server" % "0.98.2-hadoop2"
)

Change the Spark and HBase versions to match yours.

2. classpath

In the days of Spark 0.9.x, you just needed to set the SPARK_CLASSPATH environment variable to HBase's jars. For example, to start spark-shell in local mode on the CDH5 Hadoop distribution:
export SPARK_CLASSPATH=/opt/cloudera/parcels/CDH/lib/hbase/hbase-server.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-protocol.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-hadoop2-compat.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-client.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-common.jar:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core.jar
and then
./bin/spark-shell --master local[2]
or just
SPARK_CLASSPATH=/opt/cloudera/parcels/CDH/lib/hbase/hbase-server.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-protocol.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-hadoop2-compat.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-client.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-common.jar:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core.jar ./bin/spark-shell --master local[2]

On your cluster, change the paths of those jars to your own HBase installation path. In other Hadoop distributions it will be something like /usr/lib/xxx (Hortonworks HDP) or /opt/mapr/hbase-xxx (MapR).
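Typing the same directory prefix six times is easy to get wrong. Here is a minimal sketch of building the jar list from one base directory. It assumes all the jars live under a single HBase directory; the names HBASE_LIB, HBASE_CP and build_hbase_classpath are made up for this example and are not part of Spark or HBase.

```shell
#!/bin/sh
# Assumption: HBASE_LIB is a hypothetical variable; point it at your
# distribution's HBase directory (the default below is the CDH5 path from this post).
HBASE_LIB="${HBASE_LIB:-/opt/cloudera/parcels/CDH/lib/hbase}"

# Hypothetical helper: join jar names (relative to a base directory)
# into one colon-separated classpath string.
# The body runs in a subshell, so its variables do not leak out.
build_hbase_classpath() (
  base="$1"
  shift
  cp=""
  for jar in "$@"; do
    # Prefix a ':' separator only when cp already holds an entry
    cp="${cp:+$cp:}$base/$jar"
  done
  printf '%s\n' "$cp"
)

HBASE_CP="$(build_hbase_classpath "$HBASE_LIB" \
  hbase-server.jar hbase-protocol.jar hbase-hadoop2-compat.jar \
  hbase-client.jar hbase-common.jar lib/htrace-core.jar)"

echo "$HBASE_CP"

# Spark 0.9.x usage (not executed here):
#   SPARK_CLASSPATH="$HBASE_CP" ./bin/spark-shell --master local[2]
```

For another distribution, only HBASE_LIB (and possibly the jar names) should need to change.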

But, but... this lovely SPARK_CLASSPATH is deprecated in the new era of Spark 1.x !!! -_-

So, in Spark 1.x, there is one conf property and one command-line argument for this:
spark.executor.extraClassPath
and
--driver-class-path

WTF... but, yes, you must give the whole list of jar paths twice!... And spark.executor.extraClassPath must be set in a conf file; it cannot be set via the command line...

So, you need to do this:

Edit conf/spark-defaults.conf

and add this:
spark.executor.extraClassPath  /opt/cloudera/parcels/CDH/lib/hive/lib/hive-hbase-handler.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-server.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-protocol.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-hadoop2-compat.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-client.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-common.jar:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core.jar
And then start the Spark shell, or submit your Spark job, with the driver's command-line argument --driver-class-path:
./bin/spark-shell --master local[2]  --driver-class-path  /opt/cloudera/parcels/CDH/lib/hbase/hbase-server.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-protocol.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-hadoop2-compat.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-client.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-common.jar:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core.jar
Unbelievable, but that is how it is in Spark 1.x ...
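Since the same jar list has to appear in two places, it helps to keep it in one variable and generate both pieces from it. This is a rough sketch, not an official Spark tool: HBASE_LIB, HBASE_CP, executor_conf_line and driver_command are hypothetical names for this example, and the launch command is only printed, not executed.

```shell
#!/bin/sh
# Assumption: HBASE_LIB is a hypothetical variable pointing at your HBase directory.
HBASE_LIB="${HBASE_LIB:-/opt/cloudera/parcels/CDH/lib/hbase}"
HBASE_CP="$HBASE_LIB/hbase-server.jar:$HBASE_LIB/hbase-protocol.jar:$HBASE_LIB/hbase-hadoop2-compat.jar:$HBASE_LIB/hbase-client.jar:$HBASE_LIB/hbase-common.jar:$HBASE_LIB/lib/htrace-core.jar"

# Executor side: print the line that goes into conf/spark-defaults.conf.
executor_conf_line() {
  printf 'spark.executor.extraClassPath  %s\n' "$1"
}

# Driver side: print the launch command with --driver-class-path.
# It is printed rather than run, so you can inspect it first.
driver_command() {
  printf './bin/spark-shell --master local[2] --driver-class-path %s\n' "$1"
}

executor_conf_line "$HBASE_CP"
driver_command "$HBASE_CP"
```

Copy the first printed line into conf/spark-defaults.conf and run the second one from your Spark directory. The conf line in this post also lists hive-hbase-handler.jar for the executors; if you need an extra jar like that, prepend it to the value passed to executor_conf_line.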

3. how to use org.apache.hadoop.hbase.KeyValue in Scala with Spark

It seems this post is already long enough, so let us take a break. To see the code of the real-world examples, go to the next part of this doc: http://www.abcn.net/2014/07/spark-hbase-result-keyvalue-bytearray.html

© Chutium / Teng Qiu @ ABC Netz Group