Search results 11 to 20 of 47 (0.078s).
Re: Data cleansing in modern data architecture - Hadoop - [mail # user]
...I am assuming you meant the batch jobs that are/were used in the old world for data cleansing. As far as I understand, there is no hard and fast rule for it and it depends on functional and system requ...
   Author: Shahab Yunus, 2014-07-20, 21:20
Re: Merging small files - Hadoop - [mail # user]
...Why it isn't appropriate to discuss too many vendor-specific topics on a vendor-neutral Apache mailing list? Check out this thread: http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-use...
   Author: Shahab Yunus, 2014-07-20, 16:32
Re: what exactly does data in HDFS look like? - Hadoop - [mail # user]
...The data itself is eventually stored in the form of a file. Each block of the file and its replicas are stored in files and directories on different nodes. The NameNode that keeps the information an...
   Author: Shahab Yunus, 2014-07-19, 05:03
Re: Providing a file instead of a directory to a M/R job - Hadoop - [mail # user]
...That is what I thought too, but when I give the parent directory as the input path of that same file, it works. Perhaps I am messing something up. I am using Cloudera 4.6 btw. Meanwhile I hav...
   Author: Shahab Yunus, 2014-07-17, 15:27
Re: How to recover reducer task data on a different data node? - Hadoop - [mail # user]
...Adding to what Jungi Jeong said, if you can get your hands on the book *Hadoop: The Definitive Guide* by Tom White, then that would help as well, as it explains this in significant detail. Re...
   Author: Shahab Yunus, 2014-07-03, 11:40
Re: The future of MapReduce - Hadoop - [mail # user]
...My personal thoughts on this. I approach this problem in a different way. Map/Reduce is not a framework or a technology. It was a paradigm for distributed and parallel processing which can be i...
   Author: Shahab Yunus, 2014-07-02, 20:12
Re: Spark vs. Storm - Hadoop - [mail # user]
...Not exactly. There are of course major implementation differences and then some subtle and high-level ones too. My 2-cents: Spark is in-memory M/R and it simulated streaming or real-time ...
   Author: Shahab Yunus, 2014-07-02, 19:59
Re: job.setOutputFormatClass(NullOutputFormat.class); - Hadoop - [mail # user]
...To get rid of empty *part files while using MultipleOutputs in the new API, the LazyOutputFormat class' static method should be used to set the output format. Details are here at the official Java ...
   Author: Shahab Yunus, 2014-07-02, 02:19
Re: WholeFileInputFormat in hadoop - Hadoop - [mail # user]
...I think it takes the entire file as input. Otherwise it won't be any different from the normal line/record-based input format. Regards, Shahab On Jun 28, 2014 3:28 AM, "unmesha sreeveni" w...
   Author: Shahab Yunus, 2014-06-28, 12:38
Re: Practical examples - Hadoop - [mail # user]
...For Machine Learning based applications of Hadoop you can check out the Mahout framework. Regards, Shahab On Mon, Apr 28, 2014 at 10:02 PM, Mohan Radhakrishnan <[EMAIL PROTECTED]> wrote: ...
   Author: Shahab Yunus, 2014-04-29, 02:08