Do we support concatenated/splittable bzip2 files in branch-1?
Yu Li 2012-12-03, 09:49
Dear all,
About splitting support for bzip2, I checked the JIRA list and found HADOOP-7386 marked as "Won't fix". I also found some work done in branch-0.21 (also in trunk), such as HADOOP-4012 and MAPREDUCE-830, that has not been integrated/migrated into branch-1, so I guess we don't support concatenated bzip2 in branch-1, correct? If so, is there any special reason? Many thanks!
-- Best Regards, Li Yu
Re: Do we support concatenated/splittable bzip2 files in branch-1?
Harsh J 2012-12-03, 11:42
Hi Yu Li,
HADOOP-7823 backported support for splitting Bzip2 files, plus the MR support for it, into branch-1, and it is already available in the current 1.1.x releases.
Support for concatenated Bzip2 files, i.e., HADOOP-7386, is not implemented yet (AFAIK), but Chris, over on HADOOP-6335, suggests that HADOOP-4012 may have fixed it, so can you try it and report back?
-- Harsh J
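For anyone wanting to try this, a "concatenated" bzip2 file is simply several complete bzip2 streams appended back to back, as produced by `cat a.bz2 b.bz2 > both.bz2`. The sketch below is only an illustration using Python's standard `bz2` module, not Hadoop's codec; the sample data and variable names are hypothetical. It builds such an input in memory and decodes it with a reader that is documented to handle multiple streams (Python 3.3 and later).

```python
import bz2

# Two independently compressed bzip2 streams (hypothetical sample data).
part_a = bz2.compress(b"first stream\n")
part_b = bz2.compress(b"second stream\n")

# Byte-level concatenation, equivalent to `cat a.bz2 b.bz2 > both.bz2`.
concatenated = part_a + part_b

# bz2.decompress (Python 3.3+) reads every stream in the input,
# so the result contains the data from both parts.
decoded = bz2.decompress(concatenated)
print(decoded)
```

A file built this way could serve as test input when checking whether a given reader returns the data from all streams or only from the first one.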
Re: Do we support concatenated/splittable bzip2 files in branch-1?
Yu Li 2012-12-03, 15:52
Hi Harsh,
Thanks a lot for the information!
My fault for not looking into HADOOP-4012 carefully. I will try to verify whether HADOOP-7823 has resolved the issue on both the write and read sides, and report back.
-- Best Regards, Li Yu
Re: Do we support concatenated/splittable bzip2 files in branch-1?
Harsh J 2012-12-04, 04:07
Thanks Yu, I would appreciate it if you could post your observations at https://issues.apache.org/jira/browse/HADOOP-7386.
-- Harsh J
Re: Do we support concatenated/splittable bzip2 files in branch-1?
Yu Li 2012-12-10, 14:17
Hi Harsh,
Sorry for the slightly late response; I have been busy with other work these days.
I have pasted my test steps and results onto HADOOP-7386, and if my testing method is correct, I think concatenated BZip2 file support is implemented and already in branch-1. I also did some sanity testing and confirmed that BZip2 splitting support is also in branch-1.
Please let me know if you have any comments, thanks.
-- Best Regards, Li Yu
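As a rough illustration of what a concatenated-bzip2 read test checks, the sketch below contrasts a decoder that stops at the end of the first stream with one that keeps going until the input is exhausted. It uses Python's standard `bz2` module rather than Hadoop's codec, and the function names are hypothetical; it is not the test that was posted to HADOOP-7386.

```python
import bz2


def decode_first_stream_only(data: bytes) -> bytes:
    """Decode just the first bzip2 stream and ignore anything after it."""
    decompressor = bz2.BZ2Decompressor()
    return decompressor.decompress(data)


def decode_all_streams(data: bytes) -> bytes:
    """Decode every bzip2 stream in the input, one after another."""
    chunks = []
    while data:
        # A BZ2Decompressor handles exactly one stream; bytes that follow
        # the end of that stream are exposed through `unused_data`.
        decompressor = bz2.BZ2Decompressor()
        chunks.append(decompressor.decompress(data))
        if not decompressor.eof:
            raise ValueError("truncated bzip2 stream")
        data = decompressor.unused_data
    return b"".join(chunks)
```

Feeding both functions the same concatenated input and comparing the results shows the difference: the first returns only the leading stream's data, while the second returns the data from every stream.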