Results 1 to 10 of 20.
Re: Unable to use transfer data using distcp between EC2-classic cluster and VPC cluster - Hadoop - [mail # user]
...It may be easier to copy the data to S3 and then from S3 to the new cluster. On Fri, Sep 19, 2014 at 8:45 PM, Jameel Al-Aziz wrote: ...
   Author: Peyman Mohajerian, 2014-09-20, 15:06
Re: Data cleansing in modern data architecture - Hadoop - [mail # user]
...If your data is in different partitions in HDFS, you can simply use tools like Hive or Pig to read the data in a given partition, filter out the bad data, and overwrite the partition. This data c...
   Author: Peyman Mohajerian, 2014-08-24, 22:13
Re: Multiple Part files - Hadoop - [mail # user]
...Hadoop has a getmerge command (http://hadoop.apache.org/docs/r0.19.1/hdfs_shell.html#getmerge). I'm not certain if it works with RC files; I think it should. So maybe you don't have to ...
   Author: Peyman Mohajerian, 2014-07-17, 13:43
Re: The future of MapReduce - Hadoop - [mail # user]
...This statement is inaccurate. Not all machine learning involves iterative computation, and not all datasets can fit in memory. I'm not an expert in machine learning, but I know enough to know that ...
   Author: Peyman Mohajerian, 2014-07-16, 16:36
Re: Gathering connection information - Hadoop - [mail # user]
...In my experience you build a node called an Edge Node, which has all the libraries and configuration settings in XML to connect to the cluster; it just doesn't have any of the Hadoop daemons runnin...
   Author: Peyman Mohajerian, 2014-06-07, 14:12
Re: How to make sure data blocks are shared between 2 datanodes - Hadoop - [mail # user]
...Block sizes are typically 64 MB or 128 MB, so in your case only a single block is involved, which means that if you have a single replica then only a single data node will be used. The default replicati...
   Author: Peyman Mohajerian, 2014-05-25, 19:47
Re: Realtime sensor's tcpip data to hadoop - Hadoop - [mail # user]
...Whether or not you use Storm/Kafka or any other realtime processing, you may still need to persist the data, which can be done directly to HBase from any of these realtime systems or from the so...
   Author: Peyman Mohajerian, 2014-05-16, 16:32
I stopped receiving any email from this group! - Hadoop - [mail # general]
... ...
   Author: Peyman Mohajerian, 2014-05-11, 03:56
Re: hadoop+python+text mining - Hadoop - [mail # user]
...At a high level I think you have these choices and more: 1) Hadoop Streaming: leverage some of your Python code, but not all, because you have to deal with map/reduce. 2) Use Mahout. 3) Use a dist...
   Author: Peyman Mohajerian, 2014-04-25, 00:59
Re: hdfs - get file block path for a datanode - Hadoop - [mail # user]
...hadoop fsck  -files -blocks -locations On Mon, Apr 14, 2014 at 4:43 PM, Alexandros Papadopoulos <[EMAIL PROTECTED]> wrote: ...
   Author: Peyman Mohajerian, 2014-04-14, 21:05