Hive, mail # user - DATA UPLOADTION


yogesh.kumar13@... 2012-07-16, 03:40
Debarshi Basak 2012-07-16, 05:55
yogesh.kumar13@... 2012-07-16, 06:21
Bejoy KS 2012-07-16, 07:50
Gesli, Nicole 2012-07-16, 18:00
yogesh.kumar13@... 2012-07-17, 06:33
Re: DATA UPLOADTION
Bejoy KS 2012-07-17, 06:39
Hi Yogesh

You can connect reporting tools like Tableau, MicroStrategy, etc. directly to Hive.

If you are looking for static reports based on aggregate data, you can process the data in Hive, move the resulting data into an RDBMS, and use common reporting tools on top of it. I know quite a few projects following this model.
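
A minimal HiveQL sketch of the "aggregate in Hive, report from an RDBMS" model described above. The table and column names (web_logs, log_date, status, daily_status_summary) and the output path are hypothetical, chosen only for illustration. The final load into the RDBMS is not shown.

```sql
-- Hypothetical source table: web_logs(log_date STRING, status INT, ...).
-- Build a small aggregate table that a reporting tool can consume.
CREATE TABLE daily_status_summary AS
SELECT log_date, status, COUNT(*) AS hits
FROM web_logs
GROUP BY log_date, status;

-- Write the aggregate to an HDFS directory (hypothetical path) so it can
-- be loaded into an RDBMS. Fields in the output files are separated by
-- Hive's default delimiter, Ctrl-A (\001).
INSERT OVERWRITE DIRECTORY '/tmp/daily_status_summary_export'
SELECT log_date, status, hits
FROM daily_status_summary;
```

Keeping the aggregate in its own table means the heavy scan runs once, and the exported files stay small enough for a conventional database.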

Regards
Bejoy KS

Sent from handheld, please excuse typos.

-----Original Message-----
From: <[EMAIL PROTECTED]>
Date: Tue, 17 Jul 2012 06:33:43
To: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Subject: RE: DATA UPLOADTION

Thanks Gesli and Bejoy,

I have created tables in Hive and uploaded data into them. I can run queries on them; please suggest how to generate reports from those tables.

Mr. Gesli,
If I create tables with a single string column, like ( create table Log_table( Data STRING); ), then how can I perform condition-based queries over the data in Log_table?
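
One possible way to run condition-based queries on such a table, sketched in HiveQL. The search strings and the assumption that each line starts with a date such as 2012-07-16 are hypothetical; adjust the patterns to the actual data.

```sql
-- Log_table has one STRING column holding the whole line.
-- Backticks avoid any clash between the column name and the DATA keyword.

-- Substring match with LIKE (% matches any sequence of characters).
SELECT `Data`
FROM Log_table
WHERE `Data` LIKE '%ERROR%';

-- Regular-expression match with RLIKE (Java regex syntax).
SELECT `Data`
FROM Log_table
WHERE `Data` RLIKE 'ERROR|FATAL';

-- Extract a field from each line and aggregate on it.
-- Assumes (hypothetically) that lines begin with a yyyy-MM-dd date.
SELECT regexp_extract(`Data`, '^(\\d{4}-\\d{2}-\\d{2})', 1) AS log_day,
       COUNT(*) AS error_lines
FROM Log_table
WHERE `Data` LIKE '%ERROR%'
GROUP BY regexp_extract(`Data`, '^(\\d{4}-\\d{2}-\\d{2})', 1);
```

The last query shows that a single-column table can still feed a report: the structure is recovered at query time with regexp_extract instead of being declared in the table definition.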
Thanks & Regards :-)
Yogesh Kumar

________________________________
From: Gesli, Nicole [[EMAIL PROTECTED]]
Sent: Monday, July 16, 2012 11:30 PM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: DATA UPLOADTION

If you are just trying to find certain text in the data files, you want to do bulk processing to create reports once a day or so, and you prefer to use Hive, you can create a table with a single string column. You need to either pre-process your data to replace any occurrences of the default column delimiter, or define a column delimiter that your data does not contain. That makes sure the entire line is assigned to the column rather than being cut where the column delimiter occurs. If your query will be different for each file type (flat files, logs, xls, …), you can create a separate partition for each file type. Dump your files into the table (or table partition) folder(s), or create external table(s) if your data is already in HDFS. You can then do a "like" (faster) or "rlike" search on the table.
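
The setup described above might look roughly like this in HiveQL. All table names, column names, and HDFS paths here are hypothetical, and Ctrl-B (\002) is only an assumed example of a delimiter that does not occur in the data.

```sql
-- One STRING column per line. The field delimiter is a character assumed
-- NOT to occur in the data, so each whole line lands in the single column.
CREATE TABLE raw_lines (line STRING)
PARTITIONED BY (file_type STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\002'
STORED AS TEXTFILE;

-- Move files that are already in HDFS into the partition for their type.
LOAD DATA INPATH '/user/example/logs' INTO TABLE raw_lines
PARTITION (file_type = 'log');

-- Alternative: leave the files where they are and point an external
-- table at the existing HDFS directory.
CREATE EXTERNAL TABLE raw_flat_files (line STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\002'
STORED AS TEXTFILE
LOCATION '/user/example/flat_files';

-- Text search restricted to one file type.
SELECT line
FROM raw_lines
WHERE file_type = 'log'
  AND line LIKE '%timeout%';
```

Filtering on the partition column lets Hive read only the folder for that file type instead of scanning every file.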

-Nicole

From: Bejoy KS <[EMAIL PROTECTED]>
Reply-To: <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>
Date: Monday, July 16, 2012 12:50 AM
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Subject: Re: DATA UPLOADTION

Hi Yogesh

If you are looking at indexing and search kinds of operations, you can take a look at Lucene.

Whether you are using Hive or HBase, you cannot do any operation without having a table structure defined for the data. So you need to create tables for each dataset, and only then can you issue queries and generate reports on that data.
Regards
Bejoy KS

Sent from handheld, please excuse typos.
________________________________
From: <[EMAIL PROTECTED]>
Date: Mon, 16 Jul 2012 06:21:15 +0000
To: <[EMAIL PROTECTED]>
ReplyTo: [EMAIL PROTECTED]
Cc: <[EMAIL PROTECTED]>
Subject: RE: DATA UPLOADTION

Hello Debarshi,

Please suggest what tool I should use for these operations over Hadoop DFS.

Regards
Yogesh Kumar

________________________________
From: Debarshi Basak [[EMAIL PROTECTED]]
Sent: Monday, July 16, 2012 11:25 AM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: DATA UPLOADTION

Hive is not the right way to go about it if you are planning to do search-type operations.
Debarshi Basak
Tata Consultancy Services
Mailto: [EMAIL PROTECTED]
Website: http://www.tcs.com

To: <[EMAIL PROTECTED]>
From: <[EMAIL PROTECTED]>
Date: 07/16/2012 09:11AM
cc: <[EMAIL PROTECTED]>
Subject: DATA UPLOADTION

Hi all,

I have flat files, log files, images, and .xls files totalling many GB.

I need to run operations such as searching and querying over that raw data, and to generate reports.
It is impossible to create tables manually for all of them in order to manage them. Is there another way, or a way to manage them using Hive or HBase?

Please suggest how I can perform these operations. I want to use Hadoop DFS, and the files have already been uploaded to HDFS (single user).
Thanks & Regards
Yogesh Kumar

Gesli, Nicole 2012-07-17, 19:20