Pig, mail # user - Question about myImageStorageFunc and PigStore


Re: Question about myImageStorageFunc and PigStore
Sameer Tilak 2009-04-21, 19:09
Hi Santhosh,

Thanks for the info. I'll let you know if I have any further questions.

On Tue, Apr 21, 2009 at 11:47 AM, Santhosh Srinivasan <[EMAIL PROTECTED]> wrote:

>  Sameer,
>
> You can find the documentation for writing User Defined Functions (UDFs) at
> the following location: http://hadoop.apache.org/pig/docs/r0.2.0/udf.html
> In particular, the load and store function documentation is at:
>
> http://hadoop.apache.org/pig/docs/r0.2.0/udf.html#Load%2FStore+Functions
>
>
> Let us know if you have more questions.
>
> Thanks,
> Santhosh
>
>  ------------------------------
> *From:* Sameer Tilak [mailto:[EMAIL PROTECTED]]
> *Sent:* Tuesday, April 21, 2009 11:43 AM
> *To:* Santhosh Srinivasan
> *Cc:* [EMAIL PROTECTED]
> *Subject:* Re: Question about myImageStorageFunc and PigStore
>
> Santhosh,
>
> Thanks for your reply.  I was planning to use the following code to read
> each individual image.
>
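> // Needs java.awt.image.BufferedImage, java.io.File, java.io.IOException, javax.imageio.ImageIO
> BufferedImage img = null;
> try {
>     img = ImageIO.read(new File(filename));
> } catch (IOException e) {
>     // Report the failure instead of swallowing it; img stays null here.
>     e.printStackTrace();
> }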
>
> However, if I remember correctly, the load statement doesn't actually load
> the data; it just sets up a handle. So in the first statement:
>
> imagein = load '/myimages' using myImageStorageFunc();
>
> myImageStorageFunc can't load the files, correct? Or does myImageStorageFunc
> need to load all the files in the myimages directory, so that imagein is
> basically an array of file handles?
>
> If this is correct, will myImageFilter take the file handles one by one and
> then work directly on those open files?
>
> imageop = foreach imagein generate myImageFilter(*);
>
>
> Many thanks.
>
>
> On Tue, Apr 21, 2009 at 11:28 AM, Santhosh Srinivasan <[EMAIL PROTECTED]> wrote:
>
>> Sameer,
>>
>> You need to write your own UDF to read and write image files to the file
>> system.
>> Have a look at the following built-in load and store functions supported
>> by Pig:
>>
>> PigStorage:
>> http://svn.apache.org/viewvc/hadoop/pig/trunk/src/org/apache/pig/builtin/PigStorage.java?view=log
>> BinStorage:
>> http://svn.apache.org/viewvc/hadoop/pig/trunk/src/org/apache/pig/builtin/BinStorage.java?view=log
>>
>> PigStorage handles read/write of UTF-8 text data.
>> BinStorage handles read/write of binary data. The binary data format is
>> internal to Pig.
>>
>> Thanks,
>> Santhosh
>>
>> -----Original Message-----
>> From: Sameer Tilak [mailto:[EMAIL PROTECTED]]
>> Sent: Tuesday, April 21, 2009 10:53 AM
>> To: [EMAIL PROTECTED]
>> Subject: Question about myImageStorageFunc and PigStore
>>
>> Hi everyone,
>>
>> We're working on an image analysis project using Pig. I wrote my UDF,
>> myImageFilter. However, can someone please point me to information about
>> the UDF myImageStorageFunc? My images will be in a directory in HDFS, so
>> does this function need to implement reading image files from and writing
>> them to HDFS? Is there any existing functionality within Pig to do this, or
>> do I need to write my own? If the latter, is there any example code for it?
>>
>> imagein = load '/myimages' using myImageStorageFunc();
>> imageop = foreach imagein generate myImageFilter(*);
>> store imageop into '/mythumbnails' using myImageStorageFunc();
>>
>> I have a similar question about the PigStore function.
>>
>> Regards,
>> --ST.
>>
>
>
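
A note on the ImageIO snippet quoted above: it reads from a local java.io.File, whereas images stored in HDFS would more likely reach a UDF as a stream or a byte array. The following is only a minimal sketch of that decoding and thumbnailing step, using nothing but the standard Java library. The class and method names (ThumbnailSketch, decode, thumbnail, encodePng) are hypothetical, and how the bytes get from HDFS into the function through Pig's load/store interface is an assumption left out of the sketch entirely.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper class; not part of Pig or of the original thread.
public class ThumbnailSketch {

    /** Decodes an encoded image (PNG, JPEG, ...) held in memory. */
    static BufferedImage decode(byte[] encoded) throws IOException {
        BufferedImage img = ImageIO.read(new ByteArrayInputStream(encoded));
        if (img == null) {
            // ImageIO.read returns null when no registered reader understands the data.
            throw new IOException("No ImageIO reader could decode the data");
        }
        return img;
    }

    /** Scales the source image to the given size using bilinear interpolation. */
    static BufferedImage thumbnail(BufferedImage src, int width, int height) {
        BufferedImage dst = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                               RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(src, 0, 0, width, height, null);
        } finally {
            g.dispose();
        }
        return dst;
    }

    /** Encodes an image as PNG bytes, e.g. for a store function to write out. */
    static byte[] encodePng(BufferedImage img) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(img, "png", out)) {
            throw new IOException("No PNG writer available");
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for bytes that would come from a file in HDFS.
        byte[] encoded = encodePng(new BufferedImage(64, 48, BufferedImage.TYPE_INT_RGB));
        BufferedImage thumb = thumbnail(decode(encoded), 16, 12);
        System.out.println(thumb.getWidth() + "x" + thumb.getHeight()); // prints 16x12
    }
}
```

Working on byte arrays rather than File objects keeps the image logic independent of where the data lives, so the same helpers could be called from a load function, a filter UDF, or a store function once the Pig side is written.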