Pig >> mail # user >> Question about myImageStorageFunc and PigStore


RE: Question about myImageStorageFunc and PigStore
Sameer,
 
You can find the documentation for writing User Defined Functions (UDFs)
at the following location:
http://hadoop.apache.org/pig/docs/r0.2.0/udf.html
In particular, the load and store function documentation is at:
 
http://hadoop.apache.org/pig/docs/r0.2.0/udf.html#Load%2FStore+Functions
 
 
Let us know if you have more questions.
 
Thanks,
Santhosh
 

________________________________

From: Sameer Tilak [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, April 21, 2009 11:43 AM
To: Santhosh Srinivasan
Cc: [EMAIL PROTECTED]
Subject: Re: Question about myImageStorageFunc and PigStore
Santhosh,

Thanks for your reply. I was planning to use the following code to read
an individual image.

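import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

BufferedImage img = null;
try {
    img = ImageIO.read(new File(filename));
} catch (IOException e) {
    System.err.println("Could not read " + filename + ": " + e.getMessage());
}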
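
One caveat with the snippet above: new File(filename) resolves against
the local file system of the machine running the code, so it will not
open a path that lives in HDFS. A load function is more likely to be
handed an open stream than a file name (an assumption here; the UDF
documentation is the authority on the actual interface). A minimal
sketch of decoding from a stream, using only the JDK, might look like
this. The names ImageStreamCodec, decode, and encodePng are hypothetical
and are not part of Pig.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of Pig: decodes and encodes images via streams.
public class ImageStreamCodec {

    /** Decodes one image from an already-open stream. */
    public static BufferedImage decode(InputStream in) throws IOException {
        // ImageIO.read returns null when no registered reader recognizes the data.
        BufferedImage img = ImageIO.read(in);
        if (img == null) {
            throw new IOException("No registered ImageIO reader could decode the stream");
        }
        return img;
    }

    /** Encodes an image as PNG bytes, for example so a store function can write them out. */
    public static byte[] encodePng(BufferedImage img) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // ImageIO.write returns false when no writer for the format is available.
        if (!ImageIO.write(img, "png", out)) {
            throw new IOException("No PNG writer available");
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Round trip: encode a blank 4x3 image, then decode it from a stream.
        BufferedImage original = new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB);
        byte[] bytes = encodePng(original);
        BufferedImage roundTripped = decode(new ByteArrayInputStream(bytes));
        System.out.println(roundTripped.getWidth() + "x" + roundTripped.getHeight());
    }
}
```

Working on streams and byte arrays keeps the decoding logic independent
of where the bytes come from, so the same code could be exercised
locally before it is wired into a load or store function.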

However, if I remember correctly, the load statement doesn't actually
load the data; it just sets a handle. So in the first statement:

imagein = load '/myimages' using myImageStorageFunc();

myImageStorageFunc can't load the files, correct? Or does
myImageStorageFunc need to load all the files in the myimages directory,
so that imagein is basically an array of file handles?

If this is correct, myImageFilter will take the file handles one by one
and then work directly on those open files?

imageop = foreach imagein generate myImageFilter(*);

Many thanks.

On Tue, Apr 21, 2009 at 11:28 AM, Santhosh Srinivasan
<[EMAIL PROTECTED]> wrote:
Sameer,

You need to write your own UDF to read and write image files to the
file system. Have a look at the following built-in load and store
functions supported by Pig:

PigStorage:

http://svn.apache.org/viewvc/hadoop/pig/trunk/src/org/apache/pig/builtin/PigStorage.java?view=log

BinStorage:

http://svn.apache.org/viewvc/hadoop/pig/trunk/src/org/apache/pig/builtin/BinStorage.java?view=log

PigStorage handles read/write of UTF-8 text data.
BinStorage handles read/write of binary data. The binary data format is
internal to Pig.

Thanks,
Santhosh


-----Original Message-----
From: Sameer Tilak [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, April 21, 2009 10:53 AM
To: [EMAIL PROTECTED]
Subject: Question about myImageStorageFunc and PigStore

Hi everyone,

We're working on an image analysis project using Pig. I wrote my UDF,
myImageFilter. However, can someone please point me to information
about the other UDF, myImageStorageFunc? My images will be in a
directory in HDFS, so does this function need to implement reading from
and writing to image files in HDFS? Is there any existing functionality
within Pig to do the same, or do I need to write my own? If the latter,
is there any example code to do this?

imagein = load '/myimages' using myImageStorageFunc();
imageop = foreach imagein generate myImageFilter(*);
store imageop into '/mythumbnails' using myImageStorageFunc();
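
The output path '/mythumbnails' suggests that myImageFilter produces
thumbnails, although the thread never shows its body. Assuming that is
the intent, the per-image work could be sketched in plain Java as below.
ThumbnailSketch, thumbnail, and maxSide are hypothetical names, and
nothing here depends on the Pig API.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of Pig: the kind of work myImageFilter might do.
public class ThumbnailSketch {

    /** Scales src down to fit inside a maxSide x maxSide box, keeping the aspect ratio. */
    public static BufferedImage thumbnail(BufferedImage src, int maxSide) {
        if (maxSide <= 0) {
            throw new IllegalArgumentException("maxSide must be positive");
        }
        // Never scale up: images already smaller than the box keep their size.
        int longest = Math.max(src.getWidth(), src.getHeight());
        double scale = Math.min(1.0, (double) maxSide / longest);
        int w = Math.max(1, (int) Math.round(src.getWidth() * scale));
        int h = Math.max(1, (int) Math.round(src.getHeight() * scale));

        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            // Drawing into the smaller canvas performs the resampling.
            g.drawImage(src, 0, 0, w, h, null);
        } finally {
            g.dispose();
        }
        return dst;
    }

    public static void main(String[] args) {
        BufferedImage src = new BufferedImage(640, 480, BufferedImage.TYPE_INT_RGB);
        BufferedImage thumb = thumbnail(src, 128);
        System.out.println(thumb.getWidth() + "x" + thumb.getHeight());
    }
}
```

Keeping the scaling in a plain static method means it can be tested on
its own before being wrapped in a UDF.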

I have a similar question about the PigStore function.

Regards,
--ST.
