Results 31 to 40 of 225 (0.135s).
Re: Text files vs. SequenceFiles - Hadoop - [mail # user]
...David,  I think you've more-or-less outlined the pros and cons of each format (though do see Alex's important point regarding SequenceFiles and compression). If everyone who worked with...
   Author: Aaron Kimball, 2010-07-05, 07:47
Re: Can we modify files in HDFS? - Hadoop - [mail # general]
...On Tue, Jun 29, 2010 at 2:57 AM, Steve Loughran  wrote:   It's my understanding that HBase stores datasets in reasonably small files (a few hundred MB each?) where deltas to a sect...
   Author: Aaron Kimball, 2010-07-05, 07:38
Re: Displaying Map output in MapReduce - Hadoop - [mail # general]
...If you set the number of reduce tasks to zero, the outputs of the mappers will be sent directly to the OutputFormat. You can debug your map phase of a job by disabling reduce and inspe...
   Author: Aaron Kimball, 2010-07-05, 07:34
Re: How many records will be passed to a map function?? - Hadoop - [mail # general]
...Short answer: FileInputFormat & friends generate splits based on byte ranges.  Assuming your records are all equally sized, you'll get half your records in each mapper. If your records ...
   Author: Aaron Kimball, 2010-06-19, 00:13
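The byte-range behavior described in the snippet above can be illustrated with a small sketch. This is not Hadoop code: the function names `byte_range_splits` and `records_in_split` are hypothetical, and the rule "a record belongs to the split that contains its first byte" is a simplifying assumption about how newline-delimited records are assigned to splits.

```python
def byte_range_splits(total_bytes, num_splits):
    """Divide [0, total_bytes) into contiguous (start, end) byte ranges."""
    base, extra = divmod(total_bytes, num_splits)
    splits, start = [], 0
    for i in range(num_splits):
        # Spread any remainder bytes over the first `extra` splits.
        end = start + base + (1 if i < extra else 0)
        splits.append((start, end))
        start = end
    return splits


def records_in_split(data, start, end):
    """Return the newline-terminated records whose first byte lies in [start, end).

    Assumption (simplified): a record is owned by the split in which it begins,
    even if its last bytes fall past the end of that split.
    """
    records, pos = [], 0
    for line in data.splitlines(keepends=True):
        if start <= pos < end:
            records.append(line)
        pos += len(line)
    return records


# Ten equally sized 6-byte records, 60 bytes in total.
data = b"".join(b"rec%02d\n" % i for i in range(10))
splits = byte_range_splits(len(data), 2)
per_split = [len(records_in_split(data, s, e)) for s, e in splits]
```

With equally sized records and two splits, `per_split` comes out as five records each, which matches the "half your records in each mapper" case in the snippet. With records of varying size the counts would generally differ, because the split boundaries are chosen by byte offset rather than by record count.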
Re: Is it possible ....!!! - Hadoop - [mail # user]
...Hadoop has some classes for controlling how sockets are used. See org.apache.hadoop.net.StandardSocketFactory, SocksSocketFactory.  The socket factory implementation chosen is controlle...
   Author: Aaron Kimball, 2010-06-10, 15:09
Re: Mapper Reducer : Unit Test and mocking with static variables - Hadoop - [mail # general]
...Varene,  You might want to check out MRUnit. It's a unit test harness that contains mock objects for the context & other associated classes, and works with JUnit.  It's included in...
   Author: Aaron Kimball, 2010-05-28, 00:12
Re: Hadoop Data Sharing - Hadoop - [mail # general]
...Perhaps this is guidance in the area you were hoping for: If your data is in objects that implement the interface 'Writable', then you can use the SequenceFileOutputFormat and SequenceFileI...
   Author: Aaron Kimball, 2010-05-11, 17:34
Re: help on CombineFileInputFormat - Hadoop - [mail # user]
...Zhenyu,  It's a bit complicated and involves some layers of indirection. CombineFileRecordReader is a sort of shell RecordReader that passes the actual work of reading records to anothe...
   Author: Aaron Kimball, 2010-05-10, 09:12
Re: Different exception handling on corrupt GZip file reading - Hadoop - [mail # general]
...If you ever wonder "why doesn't Hadoop do _REASONABLE_THING_X_", the answer is usually one of:  * Somebody made a mistake the first time it got written * Nobody needed quite that corner...
   Author: Aaron Kimball, 2010-04-15, 16:28
Re: DBInputFormat number of mappers - Hadoop - [mail # general]
...Hi Dan,  It's also worth pointing out that DBInputFormat's queries are written in such a way as to make parallelism more likely to hurt than to help. Each mapper submits a query to the ...
   Author: Aaron Kimball, 2010-04-15, 16:20