Results 1 to 10 of 38
Re: Difference between different tar - Hadoop - [mail # user]
...The '-bin' file does not have the source code (bin for binaries) while the other does. You can check and see the major difference in the 'src' folders under the top-level directory after unzip...
   Author: Shahab Yunus, 2014-07-21, 13:00
Re: Data cleansing in modern data architecture - Hadoop - [mail # user]
...I am assuming you meant the batch jobs that are/were used in the old world for data cleansing. As far as I understand, there is no hard and fast rule for it and it depends on functional and system requ...
   Author: Shahab Yunus, 2014-07-20, 21:20
Re: Merging small files - Hadoop - [mail # user]
...Why it isn't appropriate to discuss too much vendor specific topics on a vendor-neutral Apache mailing list? Check out this thread: http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-use...
   Author: Shahab Yunus, 2014-07-20, 16:32
Re: what exactly does data in HDFS look like? - Hadoop - [mail # user]
...The data itself is eventually stored in the form of files. Each block of the file and its replicas are stored in files and directories on different nodes. The Namenode that keeps the information an...
   Author: Shahab Yunus, 2014-07-19, 05:03
Re: Providing a file instead of a directory to a M/R job - Hadoop - [mail # user]
...That is what I thought too, but when I give the parent directory of that same file as the input path, it works. Perhaps I am messing something up. I am using Cloudera 4.6 btw. Meanwhile I hav...
   Author: Shahab Yunus, 2014-07-17, 15:27
Re: How to recover reducer task data on a different data node? - Hadoop - [mail # user]
...Adding to what Jungi Jeong said, if you can get your hands on the book *Hadoop: The Definitive Guide* by Tom White, then that would help as well, as it explains this in significant detail. Re...
   Author: Shahab Yunus, 2014-07-03, 11:40
Re: The future of MapReduce - Hadoop - [mail # user]
...My personal thoughts on this. I approach this problem in a different way. Map/Reduce is not a framework or a technology. It was a paradigm for distributed and parallel processing which can be i...
   Author: Shahab Yunus, 2014-07-02, 20:12
Re: Spark vs. Storm - Hadoop - [mail # user]
...Not exactly. There are of course major implementation differences and then some subtle and high-level ones too. My 2 cents: Spark is in-memory M/R and it simulates streaming or real-time ...
   Author: Shahab Yunus, 2014-07-02, 19:59
Re: job.setOutputFormatClass(NullOutputFormat.class); - Hadoop - [mail # user]
...To get rid of empty *part files while using MultipleOutputs in the new API, the LazyOutputFormat class's static method should be used to set the output format. Details are here at the official Java ...
   Author: Shahab Yunus, 2014-07-02, 02:19
Re: WholeFileInputFormat in hadoop - Hadoop - [mail # user]
...I think it takes the entire file as input. Otherwise it won't be any different from the normal line/record-based input format. Regards, Shahab. On Jun 28, 2014 3:28 AM, "unmesha sreeveni" w...
   Author: Shahab Yunus, 2014-06-28, 12:38