Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Hive >> mail # user >> Re:Partitioning confusion


+
Sai Sai 2013-05-27, 07:38
+
Nitin Pawar 2013-05-27, 08:38
Copy link to this message
-
Re: Partitioning confusion
Nitin
I am still confused, from the below data that  i have given should the file which sits in the folder Country=USA and state=IL have only the rows where Country=USA and state=IL or will it have rows of other countries also.
The reason i ask is because if we have a 250GB file and would like to create 10 partitions that would end up in 2.5 TB * 3 = 7.5TB. Is this expected.
Thanks
S
________________________________
 From: Nitin Pawar <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]; Sai Sai <[EMAIL PROTECTED]>
Sent: Monday, 27 May 2013 2:08 PM
Subject: Re: Partitioning confusion
 
when you specify the load data query with specific partition, it will put the entire data into that partition. 
On Mon, May 27, 2013 at 1:08 PM, Sai Sai <[EMAIL PROTECTED]> wrote:
>
>After creating a partition for a country (USA) and state (IL) and when we go to the the hdfs site to look at the partition in the browser we r seeing  all the records for all the countries and states rather than just for the partition created for US and IL given below, is this correct behavior:
>********************
>Here is my commands:
>********************
>
>
>
>CREATE TABLE employees (name STRING, salary FLOAT, subordinates ARRAY<STRING>, deductions MAP<STRING, FLOAT>, address STRUCT<street:STRING, city:STRING, state:STRING, zip:INT, country:STRING> ) PARTITIONED BY (country STRING, state STRING);
>
>
>LOAD DATA LOCAL INPATH '/home/satish/data/employees/input/employees-country.txt' INTO TABLE employees PARTITION (country='USA',state='IL');
>
>
>********************
>
>Here is my original data file, where i have a few countries data such as USA, INDIA, UK, AUS:
>********************
>
>
>
>John Doe100000.0Mary SmithTodd JonesFederal Taxes.2State Taxes.05Insurance.11 Michigan Ave.ChicagoIL60600USA
>Mary Smith80000.0Bill KingFederal Taxes.2State Taxes.05Insurance.1100 Ontario St.ChicagoIL60601USA
>Todd Jones70000.0Federal Taxes.15State Taxes.03Insurance.1200 Chicago Ave.Oak ParkIL60700USA
>Bill King60000.0Federal Taxes.15State Taxes.03Insurance.1300 Obscure Dr.ObscuriaIL60100USA
>Boss Man200000.0John DoeFred FinanceFederal Taxes.3State Taxes.07Insurance.051 Pretentious Drive.ChicagoIL60500USA
>Fred Finance150000.0Stacy AccountantFederal Taxes.3State Taxes.07Insurance.052 Pretentious Drive.ChicagoIL60500USA
>Stacy Accountant60000.0Federal Taxes.15State Taxes.03Insurance.1300 Main St.NapervilleIL60563USA
>John Doe 2100000.0Mary SmithTodd JonesFederal Taxes.2State Taxes.05Insurance.11 Michigan Ave.ChicagoIL60600INDIA
>Mary Smith 280000.0Bill KingFederal Taxes.2State Taxes.05Insurance.1100 Ontario St.ChicagoIL60601INDIA
>Todd Jones 270000.0Federal Taxes.15State Taxes.03Insurance.1200 Chicago Ave.Oak ParkIL60700AUSTRALIA
>Bill King 260000.0Federal Taxes.15State Taxes.03Insurance.1300 Obscure Dr.ObscuriaIL60100AUSTRALIA
>Boss Man2 200000.0John DoeFred FinanceFederal Taxes.3State Taxes.07Insurance.051 Pretentious Drive.ChicagoIL60500UK
>Fred Finance 2150000.0Stacy AccountantFederal Taxes.3State Taxes.07Insurance.052 Pretentious Drive.ChicagoIL60500UK
>Stacy Accountant 260000.0Federal Taxes.15State Taxes.03Insurance.1300 Main St.NapervilleIL60563UK
>********************
>
>Now when i navigate to:
>Contents of directory /user/hive/warehouse/db1.db/employees/country=USA/state=IL
>
>********************
>
>I see all the records and was wondering if it should have only USA & IL records.
>Please help.
--
Nitin Pawar
+
Nitin Pawar 2013-05-27, 09:59