Pig >> mail # user >> Question about myImageStorageFunc and PigStore


Sameer Tilak 2009-04-21, 17:53
Santhosh Srinivasan 2009-04-21, 18:28
Sameer Tilak 2009-04-21, 18:42
Santhosh Srinivasan 2009-04-21, 18:47
Re: Question about myImageStorageFunc and PigStore
Hi Santhosh,

Thanks for the info. I'll let you know if I have any further questions.

On Tue, Apr 21, 2009 at 11:47 AM, Santhosh Srinivasan <[EMAIL PROTECTED]> wrote:

>  Sameer,
>
> You can find the documentation for writing User Defined Functions (UDFs) at
> the following location: http://hadoop.apache.org/pig/docs/r0.2.0/udf.html
> In particular, the load and store function documentation is at:
>
> http://hadoop.apache.org/pig/docs/r0.2.0/udf.html#Load%2FStore+Functions
>
>
> Let us know if you have more questions.
>
> Thanks,
> Santhosh
>
>  ------------------------------
> *From:* Sameer Tilak [mailto:[EMAIL PROTECTED]]
> *Sent:* Tuesday, April 21, 2009 11:43 AM
> *To:* Santhosh Srinivasan
> *Cc:* [EMAIL PROTECTED]
> *Subject:* Re: Question about myImageStorageFunc and PigStore
>
> Santhosh,
>
> Thanks for your reply.  I was planning to use the following code to read
> each individual image.
>
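> // Needs java.awt.image.BufferedImage, java.io.File, java.io.IOException
> // and javax.imageio.ImageIO.
> BufferedImage img = null;
> try {
>     img = ImageIO.read(new File(filename));
> } catch (IOException e) {
>     // Report the failure instead of silently leaving img as null.
>     System.err.println("Could not read " + filename + ": " + e.getMessage());
> }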
>
> However, if I remember correctly, the load statement doesn't actually load the
> data; it just sets up a handle. So in the first statement:
>
> imagein = load '/myimages' using myImageStorageFunc();
>
> myImageStorageFunc can't load the files, correct? Or does myImageStorageFunc
> need to load all the files in the myimages directory, so that imagein is
> basically an array of file handles?
>
> If this is correct, will myImageFilter take the file handles one by one and
> then work directly on those open files?
>
> imageop = foreach imagein generate myImageFilter(*);
>
>
> Many thanks.
>
>
> On Tue, Apr 21, 2009 at 11:28 AM, Santhosh Srinivasan <[EMAIL PROTECTED]> wrote:
>
>> Sameer,
>>
>> You need to write your own UDF to read and write image files to the file
>> system.
>> Have a look at the following built-in load and store functions supported
>> by Pig:
>>
>> PigStorage:
>> http://svn.apache.org/viewvc/hadoop/pig/trunk/src/org/apache/pig/builtin/PigStorage.java?view=log
>> BinStorage:
>> http://svn.apache.org/viewvc/hadoop/pig/trunk/src/org/apache/pig/builtin/BinStorage.java?view=log
>>
>> PigStorage handles read/write of UTF-8 text data.
>> BinStorage handles read/write of binary data. The binary data format is
>> internal to Pig.
>>
>> Thanks,
>> Santhosh
>>
>> -----Original Message-----
>> From: Sameer Tilak [mailto:[EMAIL PROTECTED]]
>> Sent: Tuesday, April 21, 2009 10:53 AM
>> To: [EMAIL PROTECTED]
>> Subject: Question about myImageStorageFunc and PigStore
>>
>> Hi everyone,
>>
>> We're working on an image analysis project using Pig. I wrote my UDF,
>> myImageFilter. However, can someone please point me to info about the UDF
>> myImageStorageFunc? My images will be in a directory in HDFS. So does this
>> function need to implement reading image files from, and writing them to,
>> HDFS? Is there any existing functionality within Pig to do the same, or do
>> I need to write my own? If the latter, is there any example code to do this?
>>
>> imagein = load '/myimages' using myImageStorageFunc();
>> imageop = foreach imagein generate myImageFilter(*);
>> store imageop into '/mythumbnails' using myImageStorageFunc();
>>
>> I have a similar question about the PigStore function.
>>
>> Regards,
>> --ST.
>>
>
>
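The snippet quoted above reads an image with `new File(filename)`, which only works for paths on the local file system. A minimal sketch of the stream-based alternative follows, assuming the custom storage function is handed an input stream for each image and an output stream for each thumbnail. It uses only the JDK (`javax.imageio.ImageIO` accepts streams as well as files). The class `ImageStreamSketch` and its methods are hypothetical names for illustration, and the Pig load/store interface itself is deliberately left out because the thread does not show it.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import javax.imageio.ImageIO;

/** Hypothetical helper, not part of Pig: stream-based image read, resize, write. */
class ImageStreamSketch {

    /** Decodes one image from a stream instead of a local File. */
    static BufferedImage decode(InputStream in) throws IOException {
        BufferedImage img = ImageIO.read(in);
        if (img == null) {
            // ImageIO.read returns null when no registered reader recognizes the data.
            throw new IOException("stream does not contain a decodable image");
        }
        return img;
    }

    /** Scales the image to targetWidth pixels wide, keeping the aspect ratio. */
    static BufferedImage thumbnail(BufferedImage src, int targetWidth) {
        int targetHeight = Math.max(1, src.getHeight() * targetWidth / src.getWidth());
        BufferedImage out =
                new BufferedImage(targetWidth, targetHeight, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(src, 0, 0, targetWidth, targetHeight, null);
        } finally {
            g.dispose();
        }
        return out;
    }

    /** Encodes the image as PNG onto the given stream. */
    static void encodePng(BufferedImage img, OutputStream out) throws IOException {
        if (!ImageIO.write(img, "png", out)) {
            throw new IOException("no PNG writer available");
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for an image stored in the file system: a blank 200x100 PNG in memory.
        BufferedImage original = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream stored = new ByteArrayOutputStream();
        encodePng(original, stored);

        BufferedImage loaded = decode(new ByteArrayInputStream(stored.toByteArray()));
        BufferedImage thumb = thumbnail(loaded, 50);
        System.out.println(thumb.getWidth() + "x" + thumb.getHeight()); // prints 50x25
    }
}
```

In this sketch the decode and encode steps correspond to what a load function and a store function would do per image, while the resize step stands in for a filter such as myImageFilter. How the streams are obtained, and how a decoded image is passed between the two as a tuple field, depends on the Pig version's UDF interfaces described in the documentation linked earlier in the thread.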