Pig >> mail # user >> Duplicate rows when using regular expression

Duplicate rows when using regular expression
I am running a Pig script that loads data into a database. When the input
path contains the pattern [0-4], two rows are created for every record
processed. When I run the script on each file individually, every record
produces a single row, as expected. Could someone please help me understand
or troubleshoot this behaviour?
pig -f script6.pig -p in="/examples/2/part-m-0000[0-4]"  # creates 2 rows

pig -f script6.pig -p in="/examples/2/part-m-00000"  # works

pig -f script6.pig -p in="/examples/2/part-m-00001"  # works

pig -f script6.pig -p in="/examples/2/part-m-00002"  # works

pig -f script6.pig -p in="/examples/2/part-m-00003"  # works

pig -f script6.pig -p in="/examples/2/part-m-00004"  # works
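
As a sanity check on the pattern itself, here is a minimal local sketch of which file names [0-4] matches. The temporary directory and the extra part-m-00005 file are stand-ins I made up for illustration; they are not my real data. Because the path is quoted in the commands above, the shell leaves the pattern alone and Hadoop does the matching, so this only illustrates the character class and does not explain the duplicate rows.

```shell
# Local illustration only: a temp directory stands in for /examples/2 (assumption).
tmpdir=$(mktemp -d)

# Create six empty files; part-m-00005 is there to show it is NOT matched.
for i in 0 1 2 3 4 5; do
  : > "$tmpdir/part-m-0000$i"
done

# Left unquoted on purpose so the shell expands the bracket expression.
set -- "$tmpdir"/part-m-0000[0-4]
count=$#

# Print each matched path, then the number of matches.
printf '%s\n' "$@"
echo "matched: $count"

# Clean up the temporary files.
rm -rf "$tmpdir"
```

The pattern matches each of the five files exactly once, so the pattern on its own does not list any file twice. On the cluster, `hadoop fs -ls '/examples/2/part-m-0000[0-4]'` should show what Hadoop actually matches for the real path.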