Results 1 to 10 of 26 (0.159s).
Re: hdfs disk usage - Hadoop - [mail # user]
...Take the default 3x replication into account too. On Fri, Apr 10, 2015 at 6:50 AM, Nitin Pawar wrote: ...
   Author: Peyman Mohajerian, 2015-04-10, 13:59
Re: Hadoop or spark - Hadoop - [mail # user]
...There actually is such a discussion, e.g.: http://www.slideshare.net/sbaltagi/spark-or-hadoop-is-it-an-eitheror-proposition-by-slim-baltagi You can have a standalone Spark cluster with no depe...
   Author: Peyman Mohajerian, 2015-04-10, 13:11
Re: Hadoop and HttpFs - Hadoop - [mail # user]
...Maybe this helps: https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+Pig On Fri, Apr 3, 2015 at 5:56 AM, Remy Dubois wrote: ...
   Author: Peyman Mohajerian, 2015-04-03, 21:04
Re: Why Only Mongodb suits MEAN techstack - Hadoop - [mail # user]
...Hive is certainly not a good option because it is not designed to be transactional; it is purely an analytical tool. There are many, many conversations comparing HBase, Cassandra and MongoDB an...
   Author: Peyman Mohajerian, 2015-02-26, 13:16
Re: XML files in Hadoop - Hadoop - [mail # user]
...I would recommend as the first step not to use Flume, but rather to land the data in HDFS in the source format, XML, and use Hive to convert the format from XML to Parquet. That is much simpler to...
   Author: Peyman Mohajerian, 2015-01-03, 17:24
Re: Split files into 80% and 20% for building model and prediction - Hadoop - [mail # user]
...You don't have to copy the data to local to do a count. % hdfs dfs -cat file1 | wc -l will do the job. On Fri, Dec 12, 2014 at 1:58 AM, Susheel Kumar Gadalay wrote: ...
   Author: Peyman Mohajerian, 2014-12-21, 03:08
Re: Unable to use transfer data using distcp between EC2-classic cluster and VPC cluster - Hadoop - [mail # user]
...It may be easier to copy the data to S3 and then from S3 to the new cluster. On Fri, Sep 19, 2014 at 8:45 PM, Jameel Al-Aziz wrote: ...
   Author: Peyman Mohajerian, 2014-09-20, 15:06
Re: Data cleansing in modern data architecture - Hadoop - [mail # user]
...If your data is in different partitions in HDFS, you can simply use tools like Hive or Pig to read the data in a given partition, filter out the bad data and overwrite the partition. This data c...
   Author: Peyman Mohajerian, 2014-08-24, 22:13
Re: Multiple Part files - Hadoop - [mail # user]
...Hadoop has a getmerge command (http://hadoop.apache.org/docs/r0.19.1/hdfs_shell.html#getmerge). I'm not certain if it works with RC files; I think it should. So maybe you don't have to ...
   Author: Peyman Mohajerian, 2014-07-17, 13:43
Re: The future of MapReduce - Hadoop - [mail # user]
...This statement is inaccurate. Not all machine learning involves iterative computation, and not all datasets can fit in memory. I'm not an expert in machine learning, but I know enough to know that ...
   Author: Peyman Mohajerian, 2014-07-16, 16:36