MapReduce >> mail # user >> Re: Application of Cloudera Hadoop for Dataset analysis


Re: Application of Cloudera Hadoop for Dataset analysis
You can use the Hortonworks Data Platform, which already integrates HDFS,
MapReduce, and Hive well.
http://hortonworks.com/products/hortonworksdataplatform/

I came across this new solution recently. They claim it is a Hadoop-based,
standard SQL solution for data analytics.
http://queryio.com/hadoop-big-data-product/hadoop-hive.html

I have not tried it yet, but you can explore it.

-Richard

On Tue, Feb 5, 2013 at 10:07 AM, Preethi Vinayak Ponangi <
[EMAIL PROTECTED]> wrote:

> *From: *Preethi Vinayak Ponangi <[EMAIL PROTECTED]>
> *Subject: **Re: Application of Cloudera Hadoop for Dataset analysis*
> *Date: *February 5, 2013 8:07:47 AM PST
> *To: *[EMAIL PROTECTED]
> *Reply-To: *[EMAIL PROTECTED]
>
> It depends on which component of the Hadoop ecosystem you would like
> to use.
>
> You can do it in several ways:
>
> 1) You could write a basic MapReduce job to do joins.
> This link could help, or a basic search on Google would give you
> several more.
>
> http://chamibuddhika.wordpress.com/2012/02/26/joins-with-map-reduce/
>
> 2) You could use a higher-level language like Pig to do these joins using
> simple Pig scripts.
> http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html
>
> 3) The simplest of all: you could write SQL-like queries to do this join
> using Hive.
> http://hive.apache.org/
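
To make option 1 concrete, here is a minimal sketch of a reduce-side join, written in plain Python that only imitates the map, shuffle, and reduce steps in memory. It is not Hadoop API code and is not taken from the linked article; the function names, the "L"/"R" tags, and the sample records are all hypothetical, chosen only to show how records from two datasets that share a key end up paired.

```python
from collections import defaultdict
from itertools import product


def map_phase(records, tag, key_field):
    """Emit (join_key, (tag, record)) pairs, as a mapper would."""
    for record in records:
        yield record[key_field], (tag, record)


def shuffle(pairs):
    """Group mapper output by key, imitating the shuffle/sort step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, left_tag, right_tag):
    """For each key, pair every left record with every right record (inner join)."""
    joined = []
    for key in sorted(groups):
        left = [rec for tag, rec in groups[key] if tag == left_tag]
        right = [rec for tag, rec in groups[key] if tag == right_tag]
        for left_rec, right_rec in product(left, right):
            joined.append({**left_rec, **right_rec})
    return joined


def reduce_side_join(left_records, right_records, key_field):
    """Run the three steps in sequence over two in-memory datasets."""
    pairs = list(map_phase(left_records, "L", key_field))
    pairs += list(map_phase(right_records, "R", key_field))
    return reduce_phase(shuffle(pairs), "L", "R")


# Hypothetical sample data: two small datasets sharing the key "id".
customers = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]
orders = [
    {"id": 1, "year": 2011},
    {"id": 1, "year": 2012},
    {"id": 3, "year": 2012},
]

result = reduce_side_join(customers, orders, "id")
for row in result:
    print(row)
```

Keys that appear in only one dataset (ids 2 and 3 here) produce no output, which is inner-join behaviour. Pig and Hive (options 2 and 3) let you state the same join declaratively instead of writing these steps by hand.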
>
> Hope this helps.
>
> Regards,
> Vinayak.
>
>
> On Tue, Feb 5, 2013 at 10:00 AM, Suresh Srinivas <[EMAIL PROTECTED]> wrote:
>
>> Please take this thread to CDH mailing list.
>>
>>
>> On Tue, Feb 5, 2013 at 2:43 AM, Sharath Chandra Guntuku <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> I am Sharath Chandra, an undergraduate student at BITS-Pilani, India. I
>>> would like to get the following clarifications regarding the Cloudera Hadoop
>>> distribution. I am using a CDH4 Demo VM for now.
>>>
>>> 1. After I upload the files into the file browser, if I have to link
>>> two or three datasets using a key in those files, what should I do? Do I have
>>> to run a query over them?
>>>
>>> 2. My objective is that I have some data collected over a few years and
>>> now, I would like to link all of them, as in a database using keys and then
>>> run queries over them to find out particular patterns. Later I would like
>>> to implement some machine learning algorithms on them for predictive
>>> analysis. Will this be possible on the demo VM?
>>>
>>> I am totally new to this. Can I get some help on this? I would be very
>>> grateful for the same.
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Thanks and Regards,
>>> *Sharath Chandra Guntuku*
>>> Undergraduate Student (Final Year)
>>> *Computer Science Department*
>>> *Email*: [EMAIL PROTECTED]
>>>
>>> *BITS-Pilani*, Hyderabad Campus
>>> Jawahar Nagar, Shameerpet, RR Dist,
>>> Hyderabad - 500078, Andhra Pradesh
>>>
>>
>>
>>
>> --
>> http://hortonworks.com/download/
>>
>
>
>
Sharath Chandra Guntuku 2013-02-05, 10:43
Suresh Srinivas 2013-02-05, 16:00