Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Hive >> mail # user >> dynamic Partition not splitting properly


+
Hamza Asad 2013-06-12, 13:47
+
Nitin Pawar 2013-06-12, 13:50
+
Hamza Asad 2013-06-13, 11:10
+
Nitin Pawar 2013-06-13, 11:12
+
Hamza Asad 2013-06-13, 11:49
+
Nitin Pawar 2013-06-13, 11:52
+
Hamza Asad 2013-06-13, 11:54
+
Nitin Pawar 2013-06-13, 12:07
+
Hamza Asad 2013-06-14, 06:39
Copy link to this message
-
Re: dynamic Partition not splitting properly
use already existing UDFs to split or transform your values the way you
want
On Fri, Jun 14, 2013 at 12:09 PM, Hamza Asad <[EMAIL PROTECTED]> wrote:

> OIC. I got it. Thanx alot nitin :). One more thing i want to ask related
> this issue, if old table contains event_date in format "2012-06-24
> 06:04:11.9" then how can i partition it according to date part only? As
> partition column does not accepts to_date(event_date) form.
>
>
> On Thu, Jun 13, 2013 at 5:07 PM, Nitin Pawar <[EMAIL PROTECTED]>wrote:
>
>> If the input column value is NULL or empty string, the row will be put into a special partition, whose name is controlled by the hive parameter hive.exec.default.dynamic.partition.name. The default value is `__HIVE_DEFAULT_PARTITION__`. Basically this partition will contain all
>> "bad" rows whose value are not valid partition names.
>>
>> so basically you do following things
>>
>> when you create a partitioned table, your partitioned column is normally at the end of the table, so when you are inserting data into this partitioned table, I would recommend using the column names in place select * from
>>
>> so your insert query should look like
>>
>> set hive.exec.dynamic.partition=true;
>>
>> set hive.exec.dynamic.partition.mode=nonstrict;
>>
>>
>> insert overwrite table new_table partition(event_date) select col1, col2 .... coln, event_date from old_table;
>>
>>
>>
>> On Thu, Jun 13, 2013 at 5:24 PM, Hamza Asad <[EMAIL PROTECTED]>wrote:
>>
>>> when i browse it in browser, all the data is in *
>>> event_date=__HIVE_DEFAULT_PARTITION__<http://10.0.0.14:50075/browseDirectory.jsp?dir=%2Fvar%2Flog%2Fpring%2Fhive%2Fwarehouse%2Fnydus.db%2Fnew_rc_partition_cluster_table%2Fevent_date%3D__HIVE_DEFAULT_PARTITION__&namenodeInfoPort=50070>
>>> *, rest of the files does not contains data
>>>
>>>
>>> On Thu, Jun 13, 2013 at 4:52 PM, Nitin Pawar <[EMAIL PROTECTED]>wrote:
>>>
>>>> what do you mean when you say "it wont split correctly" ?
>>>>
>>>>
>>>> On Thu, Jun 13, 2013 at 5:19 PM, Hamza Asad <[EMAIL PROTECTED]>wrote:
>>>>
>>>>> what if i have data of more then 500 days then how can i create
>>>>> partition on date column by specifying each and every date? (i knw that
>>>>> does not happens in dynamic partition but on dynamic partition, it wont
>>>>> splits correctly).
>>>>>
>>>>>
>>>>> On Thu, Jun 13, 2013 at 4:12 PM, Nitin Pawar <[EMAIL PROTECTED]>wrote:
>>>>>
>>>>>> you can partition existing table unless the hdfs data is laid out in
>>>>>> partitioned fashion.
>>>>>> your best bet is create a new partitioned table
>>>>>> enable dynamic paritionining
>>>>>> read from old table and write into new table
>>>>>>
>>>>>> you can verify the new partitions by using command "show partitions"
>>>>>>
>>>>>>
>>>>>> On Thu, Jun 13, 2013 at 4:40 PM, Hamza Asad <[EMAIL PROTECTED]>wrote:
>>>>>>
>>>>>>> now i created partition table like
>>>>>>> *CREATE TABLE new_rc_partition_cluster_table(
>>>>>>>
>>>>>>>   id int,
>>>>>>>   event_id int,
>>>>>>>   user_id BIGINT,
>>>>>>>
>>>>>>>   intval_1 int ,
>>>>>>>   intval_2 int,
>>>>>>>   intval_3 int,
>>>>>>>   intval_4 int,
>>>>>>>   intval_5 int,
>>>>>>>   intval_6 int,
>>>>>>>   intval_7 int,
>>>>>>>   intval_8 int,
>>>>>>>   intval_9 int,
>>>>>>>   intval_10 int,
>>>>>>>   intval_11 int,
>>>>>>>   intval_12 int,
>>>>>>>   intval_13 int,
>>>>>>>   intval_14 int,
>>>>>>>   intval_15 int,
>>>>>>>   intval_16 int,
>>>>>>>   intval_17 int,
>>>>>>>   intval_18 int,
>>>>>>>   intval_19 int,
>>>>>>>   intval_20 int,
>>>>>>>   intval_21 int,
>>>>>>>   intval_22 int,
>>>>>>>   intval_23 int,
>>>>>>>   intval_24 int,
>>>>>>>   intval_25 int,
>>>>>>>   intval_26 int)
>>>>>>>   PARTITIONED BY (event_date string)
>>>>>>>
>>>>>>> CLUSTERED BY(id) INTO 256 BUCKETS
>>>>>>> ROW FORMAT DELIMITED
>>>>>>>  FIELDS TERMINATED BY ','
>>>>>>> STORED AS RCFile;
>>>>>>> *
>>>>>>>
>>>>>>> rest of the commands are same. But this time INSERT OVERWRITE query
>>>>>>> executed and took time but when i queries from that table, it results none

Nitin Pawar
+
Hamza Asad 2013-06-14, 07:01
+
Nitin Pawar 2013-06-14, 07:27
+
Hamza Asad 2013-06-14, 07:34
+
Nitin Pawar 2013-06-14, 07:41
+
Hamza Asad 2013-06-14, 07:51
+
Nitin Pawar 2013-06-14, 08:30
+
Hamza Asad 2013-06-14, 08:38
+
Hamza Asad 2013-06-16, 13:08