Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Pig, mail # user - Specifying multiple input paths

Copy link to this message
Specifying multiple input paths
Tom White 2008-06-04, 14:55
I;m having a problem loading data from multiple paths in Pig. What I'm
trying to do is to load data from a range of dates, so I would like to
specify an input of two globbed paths:

x = LOAD '2008/05/{26,27,28,29,30,31},2008/06/{1,2}'

Pig doesn't seem to like this though as it's trying to interpret it as
a single path. The best I can do it to use UNION:

x1 = LOAD '2008/05/{26,27,28,29,30,31}'
x2 = LOAD '2008/06/{1,2}'
x = UNION x1, x2

The downside to this is that I want to parameterize my paths, and
having separate script for each number of paths in the input is

Is there a better way of doing this? Are there any plans to support
multiple paths, and/or PathFilters?