Friday, February 28, 2014

HiveServer2 does not return ResultSets in UTF-8 encoding: fixing garbled UTF-8 text over HiveServer2 JDBC

Add the following environment variables to the Hive startup script ($HIVE_HOME/bin/hive):

export LANG=en_US.UTF-8
export HADOOP_OPTS="$HADOOP_OPTS -Dfile.encoding=UTF-8"

MapR cluster: /opt/mapr/hive/hive-0.11/bin/hive
Cloudera cluster: /opt/cloudera/parcels/CDH/lib/hive/bin/hive
Other Hadoop distributions: probably /usr/lib/hive/bin/hive
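
If you are not sure which of these paths applies to your cluster, a small helper can probe them. This is only a sketch: the function name find_hive_script is made up for illustration, and the candidate list is just the locations mentioned in this post, so your installation may use a different path.

```shell
#!/bin/sh
# find_hive_script (hypothetical helper): print the first existing Hive
# startup script among the candidate locations and return 0; return 1 if
# none of them exists.
find_hive_script() {
    for candidate in \
        "${HIVE_HOME:+$HIVE_HOME/bin/hive}" \
        /opt/mapr/hive/hive-0.11/bin/hive \
        /opt/cloudera/parcels/CDH/lib/hive/bin/hive \
        /usr/lib/hive/bin/hive
    do
        # The first entry is empty when HIVE_HOME is not set, so skip it.
        if [ -n "$candidate" ] && [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

if script=$(find_hive_script); then
    echo "Hive startup script: $script"
    # Show any encoding-related settings that are already in the script.
    grep -n -e 'LANG=' -e 'file.encoding' "$script" \
        || echo "no encoding settings found yet"
else
    echo "no Hive startup script found in the usual places" >&2
fi
```

The script only reads files, so it is safe to run on a cluster node before you edit anything.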

Make sure your data in HDFS is encoded in UTF-8. If it is not, set LANG and file.encoding in HADOOP_OPTS to match the encoding actually used for the files in HDFS.
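
One way to check whether a file really is UTF-8 is to let iconv convert it from UTF-8 to UTF-8 and look at the exit status, since iconv fails on byte sequences that are not valid in the source encoding. The following is a rough sketch, not a complete detector: the name is_utf8 is made up here, the HDFS path in the comment is hypothetical, and pure-ASCII data always passes, so the check cannot tell you which non-UTF-8 encoding a file uses.

```shell
#!/bin/sh
# is_utf8 (hypothetical helper): read bytes from stdin and succeed only if
# they form valid UTF-8. iconv exits non-zero on an invalid input sequence.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Local demonstration with the two bytes sequences for the same Chinese
# word: first encoded as UTF-8, then encoded as GBK.
if printf '\344\270\255\346\226\207' | is_utf8; then
    echo "UTF-8 sample: valid UTF-8"
fi
if ! printf '\326\320\316\304' | is_utf8; then
    echo "GBK sample: not valid UTF-8"
fi

# Against a file in HDFS (the path is only an example):
#   hadoop fs -cat /user/hive/warehouse/mytable/part-00000 | is_utf8 \
#       && echo "looks like UTF-8"
```

If the check fails for your data, the files are in some other encoding, and LANG and file.encoding should be set to that encoding instead.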

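When reading data through Hive JDBC, non-ASCII characters such as Chinese or other CJK text are quite likely to be read incorrectly by default: they come back either as garbled text or as question marks.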

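Adding the two environment variables above to the Hive startup script fixes this. The location of the startup script differs between Hadoop distributions; the usual locations are listed above.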

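The key point is that the encoding of the files you store in HDFS must match the encoding set in the environment of the services that read them, such as hive, hiveserver2 and sqoop. You can ensure this by setting LANG in the startup script and specifying file.encoding in HADOOP_OPTS.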

If the files you store in HDFS are UTF-8, use the settings above. If they are in another encoding such as GBK, use:

export LANG="zh_CN.GBK"
export HADOOP_OPTS="$HADOOP_OPTS -Dfile.encoding=GBK"


© Chutium / Teng Qiu @ ABC Netz Group