Friday, December 30, 2016

RDB 2 Hbase Using Sqoop

  • This document describes how to use Sqoop to migrate table data from SQL Server into HBase.
  • Sqoop version
    • 1.4.6.2.4 (HDP)
  • Sqoop command
    • MDTB_USER_PATTERN_INFO_YYYYMMDD
      • sqoop import --connect 'jdbc:sqlserver://10.10.15.13:31051' --username dev --password 'oqkf!23' --query "select top 100 jobDay, psId, patCd from TEST.dbo.INFO_20150401 WHERE \$CONDITIONS" --target-dir /tmp/sqoop/INFO_20150401 --outdir sqoop_codes/INFO_20150401 --hbase-table INFO --column-family d --hbase-row-key psId,patCd -m 1
  • Limitations
    • The row key separator cannot be specified
      • the row key for HBase row will be generated by combining values of composite key attributes using underscore as a separator
      • e.g., 00000d7aa7ad4350b7eef9a80580d9a8_tm_10db7b84cdfa4c2981933a9f5e3b8c41
    • Values are converted to strings
      • Sqoop currently serializes all values to HBase by converting each field to its string representation, and then inserts the UTF-8 bytes of this string in the target cell
      • e.g., the cell is stored as column=d:regDt, value=1464680430 rather than column=d:regDt, value=\x00\x00\x01T\x94\xA4\xC5\xF6
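
The two limitations above can be sketched in a few lines of Python. This is only an illustration of the resulting bytes, not Sqoop's actual code: the function names are hypothetical, the split of the example row key into psId and patCd is an assumption, and the 8-byte big-endian encoding stands in for what a binary long cell (as written by HBase's Bytes.toBytes(long)) would look like.

```python
import struct


def composite_row_key(*key_values):
    """Join composite key attributes with '_' (the fixed separator quoted above)."""
    return "_".join(str(v) for v in key_values)


def string_cell_bytes(value):
    """Mimic the described behavior: string representation, then UTF-8 bytes."""
    return str(value).encode("utf-8")


def binary_long_bytes(value):
    """8-byte big-endian signed long, for comparison with the string form."""
    return struct.pack(">q", value)


# Assumed split of the example row key into its two attributes.
ps_id = "00000d7aa7ad4350b7eef9a80580d9a8"
pat_cd = "tm_10db7b84cdfa4c2981933a9f5e3b8c41"
row_key = composite_row_key(ps_id, pat_cd)

reg_dt = 1464680430
as_string = string_cell_bytes(reg_dt)   # what ends up in the HBase cell
as_binary = binary_long_bytes(reg_dt)   # what a binary long would look like

print(row_key)
print(as_string, len(as_string))
print(as_binary, len(as_binary))
```

If patCd really does contain an underscore, as assumed here, the row key cannot be split back into its parts unambiguously, which is one practical consequence of the fixed separator. Readers of the table also have to parse the cell as text (for example int(as_string)) instead of decoding a binary long.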
