Hadoop >> mail # dev >> Do we support concatenated/splittable bzip2 files in branch-1?


Re: Do we support concatenated/splittable bzip2 files in branch-1?
Hi Harsh,

Sorry for the slightly late response; I have been busy with other work
these days. I have posted my test steps and results on HADOOP-7386, and if
my testing approach is correct, I think concatenated BZip2 file support is
implemented and already in branch-1. I also did some sanity testing and
confirmed that BZip2 splitting support is also in branch-1. Please let me
know if you have any comments. Thanks.
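
For readers unfamiliar with the term, here is a minimal sketch of what a
"concatenated" bzip2 file is and why a reader needs explicit support for it.
It uses Python's standard bz2 module rather than the Hadoop codec, so it
only illustrates the file format, not the branch-1 code path or the test
steps posted on HADOOP-7386. The helper names are made up for this example.

```python
import bz2


def make_concatenated(parts):
    """Compress each part as its own bzip2 stream and join the streams.

    This is what `cat a.bz2 b.bz2 > c.bz2` produces on the command line.
    """
    return b"".join(bz2.compress(p) for p in parts)


def read_first_stream_only(data):
    """Decode like a reader WITHOUT concatenation support.

    A single BZ2Decompressor stops at the end of the first stream and
    leaves the remaining bytes in `unused_data`.
    """
    d = bz2.BZ2Decompressor()
    out = d.decompress(data)
    return out, d.unused_data


def read_all_streams(data):
    """Decode like a reader WITH concatenation support.

    Start a fresh decompressor for every stream until no bytes remain.
    """
    out = []
    while data:
        d = bz2.BZ2Decompressor()
        out.append(d.decompress(data))
        data = d.unused_data
    return b"".join(out)
```

Calling read_first_stream_only on the output of make_concatenated returns
only the first part plus the undecoded remainder, while read_all_streams
returns every part. A reader without concatenation support therefore drops
everything after the first stream without raising an error, which is why
this is worth checking on both the write and the read side.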

On 4 December 2012 12:07, Harsh J <[EMAIL PROTECTED]> wrote:

> Thanks Yu, I would appreciate it if you could post your observations on
> https://issues.apache.org/jira/browse/HADOOP-7386.
>
> On Mon, Dec 3, 2012 at 9:22 PM, Yu Li <[EMAIL PROTECTED]> wrote:
> > Hi Harsh,
> >
> > Thanks a lot for the information!
> >
> > My fault for not looking into HADOOP-4012 carefully. I will try to verify
> > whether HADOOP-7823 has resolved the issue on both the write and read
> > sides, and report back.
> >
> > On 3 December 2012 19:42, Harsh J <[EMAIL PROTECTED]> wrote:
> >
> >> Hi Yu Li,
> >>
> >> The JIRA HADOOP-7823 backported support for splitting Bzip2 files plus
> >> MR support for it, into branch-1, and it is already available in the
> >> 1.1.x releases out currently.
> >>
> >> Support for concatenated Bzip2 files, i.e., HADOOP-7386, is not
> >> implemented yet (AFAIK), but Chris, over on HADOOP-6335, suggests that
> >> HADOOP-4012 may have fixed it. Can you try it and report back?
> >>
> >> On Mon, Dec 3, 2012 at 3:19 PM, Yu Li <[EMAIL PROTECTED]> wrote:
> >> > Dear all,
> >> >
> >> > About splitting support for bzip2, I checked the JIRA list and found
> >> > HADOOP-7386 marked as "Won't fix". I also found some work done in
> >> > branch-0.21 (also in trunk), namely HADOOP-4012 and MAPREDUCE-830, but
> >> > not integrated/migrated into branch-1, so I guess we don't support
> >> > concatenated bzip2 in branch-1, correct? If so, is there any special
> >> > reason? Many thanks!
> >> >
> >> > --
> >> > Best Regards,
> >> > Li Yu
> >>
> >>
> >>
> >> --
> >> Harsh J
> >>
> >
> >
> >
> > --
> > Best Regards,
> > Li Yu
>
>
>
> --
> Harsh J
>

--
Best Regards,
Li Yu