Re: queues in haddop
Hemanth Yamijala 2013-01-11, 10:30
Queues in the capacity scheduler are logical data structures into which
MapReduce jobs are placed to be picked up by the JobTracker / Scheduler
framework, according to some capacity constraints that can be defined for
a queue.

So, given your use case, I don't think Capacity Scheduler is going to
directly help you (since you only spoke about data-in, and not processing).
So, yes something like Flume or Scribe

Thanks
Hemanth

On Fri, Jan 11, 2013 at 11:34 AM, Harsh J <[EMAIL PROTECTED]> wrote:

> Your question is unclear: HDFS has no queues for ingesting data (it is
> a simple, distributed FileSystem). The Hadoop M/R and Hadoop YARN
> components have queues for processing data purposes.
>
> On Fri, Jan 11, 2013 at 8:42 AM, Panshul Whisper <[EMAIL PROTECTED]> wrote:
> > Hello,
> >
> > I have a hadoop cluster setup of 10 nodes and I am in need of implementing
> > queues in the cluster for receiving high volumes of data.
> > Please suggest what will be more efficient to use in the case of receiving
> > 24 Million Json files.. approx 5 KB each in every 24 hours :
> > 1. Using Capacity Scheduler
> > 2. Implementing RabbitMQ and receive data from them using Spring Integration
> > Data pipe lines.
> >
> > I cannot afford to lose any of the JSON files received.
> >
> > Thanking You,
> >
> > --
> > Regards,
> > Ouch Whisper
> > 010101010101
>
> --
> Harsh J
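For a sense of scale, the figures in the original question (24 million JSON files of roughly 5 KB each per 24 hours) can be turned into an average ingest rate. This is only a back-of-the-envelope sketch: the function name `ingest_rate` is hypothetical, 5 KB is assumed to mean 5 * 1024 bytes, and arrivals are assumed to be spread evenly over the day, which real traffic probably is not.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def ingest_rate(files_per_day, avg_file_bytes):
    """Return (files per second, bytes per second, bytes per day).

    Hypothetical helper: assumes arrivals are spread evenly over the day.
    """
    bytes_per_day = files_per_day * avg_file_bytes
    return (
        files_per_day / SECONDS_PER_DAY,
        bytes_per_day / SECONDS_PER_DAY,
        bytes_per_day,
    )


# Figures from the question; 5 KB is taken as 5 * 1024 bytes (an assumption).
files_per_sec, bytes_per_sec, bytes_per_day = ingest_rate(24_000_000, 5 * 1024)

print(
    f"{files_per_sec:.0f} files/s, "
    f"{bytes_per_sec / 1e6:.2f} MB/s, "
    f"{bytes_per_day / 1e9:.1f} GB/day"
)
```

Under those assumptions the average works out to roughly 278 files per second and about 1.4 MB/s, so the sustained byte volume is modest and the number of individual small files is the larger concern for whichever ingestion tool is chosen.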
Michael Segel 2013-01-11, 15:06