Hive >> mail # user >> How to delete Specific date data using hive QL?


Re: How to delete Specific date data using hive QL?
1- Does partitioning improve performance?
Only if your queries actually make use of the partitions, usually by
filtering on the partition column in the where clause so that Hive reads
only the partitions matching that value.

2- Do i have to create partition table new or i can create partition on
existing table by renaming that date column and add partition column
event_date (the actual column name) ?
You cannot create partitions over already existing data unless that data
is already laid out in partitioned directories on hdfs.
I would recommend the following:
create a new table with the partition column,
load the data from the old table into the partitioned table,
then drop the old table.
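
A rough sketch of those three steps, written as a small Python helper that only builds the HiveQL text. The table names (events_old, events), the column list and the STRING type of the partition column are assumptions for illustration, not taken from your setup. The last function shows the payoff for the original question: once the table is partitioned by date, removing one date is a single DROP PARTITION.

```python
# Sketch: build the HiveQL for moving an unpartitioned table into a
# partitioned one. All table and column names here are hypothetical.

def migration_statements(old_table, new_table, columns, partition_col="event_date"):
    """Return HiveQL statements, in order, for the three steps above.

    columns: list of (name, hive_type) pairs, excluding the partition column.
    """
    col_defs = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    col_names = ", ".join(name for name, _ in columns)
    return [
        # 1. New table; the partition column is declared separately
        #    (STRING is an assumption about how the date is stored).
        f"CREATE TABLE {new_table} ({col_defs}) "
        f"PARTITIONED BY ({partition_col} STRING)",
        # Let Hive derive each row's partition from the last selected column.
        "SET hive.exec.dynamic.partition=true",
        "SET hive.exec.dynamic.partition.mode=nonstrict",
        # 2. Copy the data; the partition column must come last in the SELECT.
        f"INSERT OVERWRITE TABLE {new_table} PARTITION ({partition_col}) "
        f"SELECT {col_names}, {partition_col} FROM {old_table}",
        # 3. Drop the old table once the copy has been verified.
        f"DROP TABLE {old_table}",
    ]


def drop_date_statement(table, date, partition_col="event_date"):
    """Return the HiveQL that removes one date's data from a partitioned table."""
    return f"ALTER TABLE {table} DROP IF EXISTS PARTITION ({partition_col}='{date}')"


if __name__ == "__main__":
    cols = [("user_id", "BIGINT"), ("event_name", "STRING")]
    for stmt in migration_statements("events_old", "events", cols):
        print(stmt + ";")
    print(drop_date_statement("events", "2013-06-03") + ";")
```

Check the row counts of the new table before running the final DROP TABLE, since that step cannot be undone for a managed table.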

3- can i import data directly into partition table using sqoop command?
you can import data directly into a partition.
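
As a sketch only, this is roughly what a single-partition import could look like with Sqoop 1, again built from Python so the pieces are easy to see. The JDBC URL, table names, column list and date are hypothetical, and credentials (--username and so on) are left out. As far as I know the partition column should not also be imported as a regular column, so it is excluded through --columns; please verify that against your Sqoop version.

```python
# Sketch: assemble a Sqoop 1 import that loads one day of rows into one
# Hive partition. Connection string, table names and the date are hypothetical.

def sqoop_partition_import(jdbc_url, source_table, hive_table, columns, date,
                           partition_col="event_date"):
    """Return the argument list for a single-partition `sqoop import`.

    columns: source columns to import, excluding the partition column.
    """
    if partition_col in columns:
        raise ValueError("partition column must not be in the imported columns")
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", source_table,
        "--columns", ",".join(columns),
        # Only pull the rows that belong in this partition.
        "--where", f"{partition_col} = '{date}'",
        "--hive-import",
        "--hive-table", hive_table,
        # Static partition: every imported row lands in this one partition.
        "--hive-partition-key", partition_col,
        "--hive-partition-value", date,
    ]


if __name__ == "__main__":
    args = sqoop_partition_import(
        "jdbc:mysql://dbhost/analytics",   # hypothetical source database
        "events", "events",
        ["user_id", "event_name"],
        "2013-06-03",
    )
    print(" ".join(args))
```

An import like this fills one partition per run, so loading several dates means one run per date.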

As for the exported data, you don't have to worry; it remains as it is.

On Tue, Jun 4, 2013 at 12:41 PM, Hamza Asad <[EMAIL PROTECTED]> wrote:

> No i don't want to change my queries. I want that my queries work on same
> table and partition does not change its schema.
> and from schema i means schema on mysql (exported data).
>
> Few more things
> 1- Does partitioning improve performance?
> 2- Do i have to create partition table new or i can create partition on
> existing table by renaming that date column and add partition column
> event_date (the actual column name) ?
> 3- can i import data directly into partition table using sqoop command?
>
>
>
>
> On Tue, Jun 4, 2013 at 11:40 AM, Nitin Pawar <[EMAIL PROTECTED]>wrote:
>
>> partitioning of data in hive is more for the reasons on how you layout
>> data in a well defined manner so that when you access your data , you
>> request only for specific data by specifying the partition columns in where
>> clause.
>>
>> to answer your question,
>> do you have to change your queries? out of the box the queries should
>> work as it is unless and until you are changing the table schema by
>> removing/adding new columns.
>> does the format change when you export data? if your select statement is
>> not changing it will not change
>> will table schema change? do you mean schema on hive or mysql ?
>>
>>
>> On Tue, Jun 4, 2013 at 11:37 AM, Hamza Asad <[EMAIL PROTECTED]>wrote:
>>
>>> thats far more better :) ..
>>> Please tell me few more things. Do i have to change my query if i create
>>> table with partition on date? rest of the columns would be same as it is?
>>> Also if i export that partitioned table to mysql, does schema of that table
>>> would same as it was before partition?
>>>
>>>
>>> On Tue, Jun 4, 2013 at 12:09 AM, Stephen Sprague <[EMAIL PROTECTED]>wrote:
>>>
>>>> there is no delete semantic.
>>>>
>>>> you either partition on the data you want to drop and use drop
>>>> partition (or drop table for the whole shebang) or you can do as Nitin
>>>> suggests by selecting the inverse of the data you want to delete and store
>>>> it back into the table itself.  Not ideal but maybe it could work for your
>>>> situation.
>>>>
>>>> Now here's another idea.  This was just _recently_ discussed on this
>>>> group as coincidence would have it.  if you were to have scanned just a
>>>> little of the groups messages you would have seen that and could then have
>>>> added to the discussion! :)
>>>>
>>>>
>>>> On Mon, Jun 3, 2013 at 2:19 AM, Hamza Asad <[EMAIL PROTECTED]>wrote:
>>>>
>>>>> Thanx for your response nitin. Anybody else have any better solution?
>>>>>
>>>>>
>>>>> On Mon, Jun 3, 2013 at 1:27 PM, Nitin Pawar <[EMAIL PROTECTED]>wrote:
>>>>>
>>>>>> hive does not give you a record level deletion as of now.
>>>>>>
>>>>>> so unless you have partitioned, other option is you overwrite the
>>>>>> table with data which you want
>>>>>> please wait for others to suggest you more options. this one is just
>>>>>> mine and can be costly too
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 3, 2013 at 12:36 PM, Hamza Asad <[EMAIL PROTECTED]>wrote:
>>>>>>
>>>>>>> no, its not partitioned by date.
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jun 3, 2013 at 11:19 AM, Nitin Pawar <[EMAIL PROTECTED]> wrote:

Nitin Pawar