HBase >> mail # user >> Why so many unexpected files like partitions_xxxx are created?


Re: Why so many unexpected files like partitions_xxxx are created?
Tao:
Can you run jstack on one such process the next time you see them hanging?

Thanks
On Tue, Dec 17, 2013 at 6:31 PM, Tao Xiao <[EMAIL PROTECTED]> wrote:

> BTW, I noticed another problem. I bulk load data into HBase every five
> minutes, and I found that whenever the following command was executed
>     hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles HFiles-Dir MyTable
>
> a new process called "LoadIncrementalHFiles" appears.
>
> I can see many processes called "LoadIncrementalHFiles" when I run "jps"
> in the terminal. Why are these processes still there even after the
> command that bulk loads HFiles into HBase has finished executing? I have to
> kill them myself.
>
>
> 2013/12/17 Bijieshan <[EMAIL PROTECTED]>
>
> > Yes, it should be cleaned up, but as far as I understand, the current
> > code does not do that.
> >
> > Jieshan.
> > -----Original Message-----
> > From: Ted Yu [mailto:[EMAIL PROTECTED]]
> > Sent: Tuesday, December 17, 2013 10:55 AM
> > To: [EMAIL PROTECTED]
> > Subject: Re: Why so many unexpected files like partitions_xxxx are created?
> >
> > Should the bulk load task clean up partitions_xxxx upon completion?
> >
> > Cheers
> >
> >
> > On Mon, Dec 16, 2013 at 6:53 PM, Bijieshan <[EMAIL PROTECTED]> wrote:
> >
> > > > I think I should delete these files immediately after I have
> > > > finished bulk loading data into HBase since they are useless at
> > > > that time, right?
> > >
> > > Yes, I think so. They are useless once the bulk load task has finished.
> > >
> > > Jieshan.
> > > -----Original Message-----
> > > From: Tao Xiao [mailto:[EMAIL PROTECTED]]
> > > Sent: Tuesday, December 17, 2013 9:34 AM
> > > To: [EMAIL PROTECTED]
> > > Subject: Re: Why so many unexpected files like partitions_xxxx are created?
> > >
> > > Indeed, these files are produced by
> > > org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles in the directory
> > > returned by job.getWorkingDirectory(), and I think I should delete these
> > > files immediately after I have finished bulk loading data into HBase,
> > > since they are useless at that time, right?
> > >
> > >
> > >
> > >
> > > 2013/12/16 Bijieshan <[EMAIL PROTECTED]>
> > >
> > > > The reduce partition information is stored in this partitions_XXXX file.
> > > > See the code below:
> > > >
> > > > HFileOutputFormat#configureIncrementalLoad:
> > > >         .....................
> > > >     Path partitionsPath = new Path(job.getWorkingDirectory(),
> > > >                                    "partitions_" + UUID.randomUUID());
> > > >     LOG.info("Writing partition information to " + partitionsPath);
> > > >
> > > >     FileSystem fs = partitionsPath.getFileSystem(conf);
> > > >     writePartitions(conf, partitionsPath, startKeys);
> > > >         .....................
> > > >
> > > > Hope it helps.
> > > >
> > > > Jieshan
> > > > -----Original Message-----
> > > > From: Tao Xiao [mailto:[EMAIL PROTECTED]]
> > > > Sent: Monday, December 16, 2013 6:48 PM
> > > > To: [EMAIL PROTECTED]
> > > > Subject: Why so many unexpected files like partitions_xxxx are created?
> > > >
> > > > I imported data into HBase using bulk load, but afterwards I found
> > > > that many unexpected files had been created in the HDFS directory
> > > > /user/root/. They look like this:
> > > >
> > > > /user/root/partitions_fd74866b-6588-468d-8463-474e202db070
> > > > /user/root/partitions_fd867cd2-d9c9-48f5-9eec-185b2e57788d
> > > > /user/root/partitions_fda37b8a-a882-4787-babc-8310a969f85c
> > > > /user/root/partitions_fdaca2f4-2792-41f6-b7e8-61a8a5677dea
> > > > /user/root/partitions_fdd55baa-3a12-493e-8844-a23ae83209c5
> > > > /user/root/partitions_fdd85a3c-9abe-45d4-a0c6-76d2bed88ea5
> > > > /user/root/partitions_fe133460-5f3f-4c6a-9fff-ff6c62410cc1
> > > > /user/root/partitions_fe29a2b0-b281-465f-8d4a-6044822d960a
> > > > /user/root/partitions_fe2fa6fa-9066-484c-bc91-ec412e48d008
> > > > /user/root/partitions_fe31667b-2d5a-452e-baf7-a81982fe954a
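
Since the thread concludes that the partitions_xxxx files are useless once the bulk load task has finished and that the current code does not remove them, a manual cleanup step needs a way to pick out exactly those files. The following is a minimal sketch of that selection step only, assuming the names always follow the "partitions_" + UUID.randomUUID() pattern from the quoted code. The class and method names (PartitionsFileFilter, isPartitionsFile, selectForCleanup) are hypothetical, and the sketch works on plain path strings: it does not talk to HDFS, where the actual listing and deletion would have to go through Hadoop's FileSystem API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.UUID;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase or Hadoop.
class PartitionsFileFilter {
    // "partitions_" followed by the 36-character text form produced by
    // UUID.toString(): lowercase hex in 8-4-4-4-12 groups.
    private static final Pattern PARTITIONS_NAME = Pattern.compile(
        "partitions_[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}");

    // True if the last component of the path looks like a partitions file.
    static boolean isPartitionsFile(String path) {
        String name = path.substring(path.lastIndexOf('/') + 1);
        return PARTITIONS_NAME.matcher(name).matches();
    }

    // Keeps only the paths whose file name matches the partitions pattern.
    static List<String> selectForCleanup(List<String> paths) {
        List<String> selected = new ArrayList<>();
        for (String path : paths) {
            if (isPartitionsFile(path)) {
                selected.add(path);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Sample listing: one name from the thread, one freshly generated
        // name, and two made-up entries that must not be selected.
        List<String> listing = Arrays.asList(
            "/user/root/partitions_fd74866b-6588-468d-8463-474e202db070",
            "/user/root/partitions_" + UUID.randomUUID(),
            "/user/root/partitions_backup",
            "/user/root/HFiles-Dir");
        for (String path : selectForCleanup(listing)) {
            System.out.println("would delete: " + path);
        }
    }
}
```

Matching the full UUID form rather than just the "partitions_" prefix keeps unrelated files with similar names out of the cleanup. Because a running job still needs its own partitions file, such a cleanup should only be applied to files belonging to bulk loads that have already completed.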
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB