Results 1 to 10 of 23 (0.137s).
Re: Why Only Mongodb suits MEAN techstack - Hadoop - [mail # user]
...Hive is certainly not a good option because it is not designed to be transactional; it is purely an analytical tool. There are many, many conversations comparing HBase, Cassandra and MongoDB an...
   Author: Peyman Mohajerian, 2015-02-26, 13:16
Re: XML files in Hadoop - Hadoop - [mail # user]
...I would recommend, as the first step, not to use Flume, but rather to land the data in HDFS in the source format, XML, and use Hive to convert the format from XML to Parquet. That is much simpler to...
   Author: Peyman Mohajerian, 2015-01-03, 17:24
Re: Split files into 80% and 20% for building model and prediction - Hadoop - [mail # user]
...you don't have to copy the data to local to do a count. % hdfs dfs -cat file1 | wc -l will do the job. On Fri, Dec 12, 2014 at 1:58 AM, Susheel Kumar Gadalay wrote: ...
   Author: Peyman Mohajerian, 2014-12-21, 03:08
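Once the line count is known (for example from the wc -l pipeline quoted above), the 80%/20% boundary is plain integer arithmetic. A minimal sketch, assuming the split is by line count; the function name split_sizes and its train_percent parameter are illustrative and do not come from the thread.

```python
from typing import Tuple


def split_sizes(total_lines: int, train_percent: int = 80) -> Tuple[int, int]:
    """Return (train_count, test_count) for a file of total_lines lines.

    Integer arithmetic avoids floating-point rounding surprises; any
    remainder goes to the test portion.
    """
    if total_lines < 0:
        raise ValueError("total_lines must be non-negative")
    if not 0 <= train_percent <= 100:
        raise ValueError("train_percent must be between 0 and 100")
    train = total_lines * train_percent // 100
    return train, total_lines - train


if __name__ == "__main__":
    train, test = split_sizes(1000)
    print(f"model lines: {train}, prediction lines: {test}")
```

The first count tells you how many leading lines go to the model-building file; the rest go to the prediction file.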
Re: Unable to use transfer data using distcp between EC2-classic cluster and VPC cluster - Hadoop - [mail # user]
...It may be easier to copy the data to S3 and then from S3 to the new cluster. On Fri, Sep 19, 2014 at 8:45 PM, Jameel Al-Aziz wrote: ...
   Author: Peyman Mohajerian, 2014-09-20, 15:06
Re: Data cleansing in modern data architecture - Hadoop - [mail # user]
...If your data is in different partitions in HDFS, you can simply use tools like Hive or Pig to read the data in a given partition, filter out the bad data and overwrite the partition. This data c...
   Author: Peyman Mohajerian, 2014-08-24, 22:13
Re: Multiple Part files - Hadoop - [mail # user]
...Hadoop has a getmerge command (http://hadoop.apache.org/docs/r0.19.1/hdfs_shell.html#getmerge). I'm not certain if it works with RC files; I think it should. So maybe you don't have to ...
   Author: Peyman Mohajerian, 2014-07-17, 13:43
Re: The future of MapReduce - Hadoop - [mail # user]
...This statement is inaccurate. Not all machine learning involves iterative computation, and not all datasets can fit in memory. I'm not an expert in machine learning, but I know enough to know that ...
   Author: Peyman Mohajerian, 2014-07-16, 16:36
Re: Gathering connection information - Hadoop - [mail # user]
...In my experience you build a node called an Edge Node, which has all the libraries and configuration settings in XML to connect to the cluster; it just doesn't have any of the Hadoop daemons runnin...
   Author: Peyman Mohajerian, 2014-06-07, 14:12
Re: How to make sure data blocks are shared between 2 datanodes - Hadoop - [mail # user]
...Block sizes are typically 64 MB or 128 MB, so in your case only a single block is involved, which means that if you have a single replica then only a single data node will be used. The default replicati...
   Author: Peyman Mohajerian, 2014-05-25, 19:47
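The reasoning in the snippet above (a file smaller than one block occupies a single block) can be checked with a little arithmetic. A minimal sketch, assuming a 128 MB block size, which is a common HDFS default; the function name block_count and the example file sizes are illustrative and do not come from the thread.

```python
MB = 1024 * 1024


def block_count(file_size_bytes: int, block_size_bytes: int = 128 * MB) -> int:
    """Return how many HDFS blocks a file of the given size occupies.

    Uses integer ceiling division; an empty file occupies no blocks.
    """
    if file_size_bytes < 0:
        raise ValueError("file_size_bytes must be non-negative")
    if block_size_bytes <= 0:
        raise ValueError("block_size_bytes must be positive")
    return (file_size_bytes + block_size_bytes - 1) // block_size_bytes


if __name__ == "__main__":
    for size_mb in (10, 128, 200):
        print(f"{size_mb} MB file -> {block_count(size_mb * MB)} block(s)")
```

With a single block and a replication factor of one, only one data node holds the data; each additional replica of that block is placed on a different data node.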
Re: Realtime sensor's tcpip data to hadoop - Hadoop - [mail # user]
...Whether or not you use Storm/Kafka or any other realtime processing, you may still need to persist the data, which can be done directly to HBase from any of these realtime systems or from the so...
   Author: Peyman Mohajerian, 2014-05-16, 16:32