Good day,

In my situation, my table has a billion rows and does not have an
integer column as its key, which means that if I use Sqoop to do the
import (into Hive), I would not be able to use multiple mappers.
Since the table is so large, it is not realistic to add a new integer
column to it.

I did come across a post on the Hortonworks Community which seems to
suggest that splitting on a string/varchar column is possible, but the
comments warn that:

1. There are no guarantees that Sqoop will split your records evenly
across your mappers.
2. For a huge number of rows, the above options can cause duplicates
in the result set.

https://community.hortonworks.com/questions/26961/sqoop-split-by-on-a-string-varchar-column.html
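
For reference, this is roughly the command I have in mind. The JDBC
URL, table, and column names below are just placeholders, and I am
assuming the allow_text_splitter property that newer Sqoop versions
require before they will split on a text column:

    sqoop import \
        -Dorg.apache.sqoop.splitter.allow_text_splitter=true \
        --connect jdbc:mysql://dbhost/mydb \
        --username myuser -P \
        --table mytable \
        --split-by some_varchar_column \
        --hive-import \
        --num-mappers 8

My understanding is that the split column should have reasonably
evenly distributed values, otherwise some mappers end up with most of
the work. Is that the right approach, or is there a better option?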
Any thoughts?
Thank you very much.

------------------------------------------------
Sincerely yours,
Raymond