I second what Todd said. Even with FuseHDFS, which mounts HDFS as a regular
file system, you won't get the immediate response about file status that
you need. I believe Google built Gmail on Bigtable, the system HBase is
modeled on. Here is an example of implementing a mail store with Cassandra:
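
Separately, to make the HBase suggestion a bit more concrete, here is a toy
sketch of the row-per-message idea. It is not an HBase or Cassandra client and
is not taken from any existing mail store: ToyMailTable, row_key, the "|"
separator and the reversed-timestamp layout are all hypothetical names and
choices made for illustration. It only shows why a sorted key-value table suits
mail better than one small file per message: each message is a row, a flag such
as "seen" can be updated in place, and a mailbox listing is a prefix scan.

```python
import bisect

MAX_TS = 2**63 - 1  # assumed upper bound for millisecond timestamps


def row_key(user, ts_millis, msg_id):
    """Hypothetical key layout: user, reversed timestamp, message id."""
    reversed_ts = MAX_TS - ts_millis  # newer mail gets a smaller value, so it sorts first
    return f"{user}|{reversed_ts:019d}|{msg_id}"


class ToyMailTable:
    """In-memory stand-in for a sorted key-value table (not a real client)."""

    def __init__(self):
        self._keys = []  # row keys, kept in sorted order
        self._rows = {}  # row key -> dict of column values

    def put(self, user, ts_millis, msg_id, body):
        key = row_key(user, ts_millis, msg_id)
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = {"body": body, "seen": False}

    def mark_seen(self, user, ts_millis, msg_id):
        # Update one cell of one row in place: the kind of random
        # update that plain HDFS files do not support.
        self._rows[row_key(user, ts_millis, msg_id)]["seen"] = True

    def scan_user(self, user, limit=10):
        """Return up to `limit` rows for one user, newest first."""
        prefix = user + "|"
        start = bisect.bisect_left(self._keys, prefix)
        out = []
        for key in self._keys[start:]:
            if not key.startswith(prefix) or len(out) >= limit:
                break
            out.append((key, self._rows[key]))
        return out
```

The sketch assumes user names never contain "|". A real schema would also
have to deal with folders, large attachments and hot-spotting on busy users,
none of which is covered here.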
On Wed, May 18, 2011 at 5:05 PM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
> Hi Ioan,
> I would encourage you to look at a system like HBase for your mail
> backend. HDFS doesn't work well with lots of little files, and also
> doesn't support random update, so existing formats like Maildir
> wouldn't be a good fit.
> On Wed, May 18, 2011 at 4:02 PM, Ioan Eugen Stan <[EMAIL PROTECTED]> wrote:
> > Hello everybody,
> > I'm a GSoC student this year and I will be working on James.
> > My project is to implement email storage over HDFS. I am quite new to
> > Hadoop and related projects and I am looking for some hints on how to get
> > started on the right track.
> > I have installed a single node Hadoop instance on my machine and
> > played around with it (ran some examples) but I am interested in
> > what you (more experienced people) think is the best way to approach
> > my problem.
> > I am a little puzzled by the fact that I read Hadoop is best
> > used for large files, and emails aren't that large from what I know.
> > Another thing that crossed my mind is that since HDFS is a file
> > system, wouldn't it be possible to set it as a back-end for the
> > (existing) maildir and mailbox storage formats? (I think this question
> > is better suited to the James mailing list, but if you have some ideas
> > please speak your mind).
> > Also, any development resources to get me started are welcome.
> >  http://james.apache.org/mailbox/
> >  https://issues.apache.org/jira/browse/MAILBOX-44
> > Regards,
> > --
> > Ioan Eugen Stan
> Todd Lipcon
> Software Engineer, Cloudera