Hive, mail # user - cmd for to know the buckets


shaik ahamed 2012-07-18, 09:54
Navis류승우 2012-07-18, 10:10
shaik ahamed 2012-07-18, 10:31
Bejoy Ks 2012-07-18, 11:23

Re: cmd for to know the buckets
Edward Capriolo 2012-07-18, 14:15
There is a command to view a bucket:

SELECT * FROM <TABLE> TABLESAMPLE (BUCKET 4 OUT OF 10 ON <BUCKETED COLUMN>) s;
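
To make the sampling clause concrete, here is a minimal Python sketch of how a row could be matched to BUCKET x OUT OF y. It assumes an INT column hashes to its own value, as in Hive's original (pre-3.0) bucketing scheme; other column types and newer Hive versions hash differently. The function names and the user_id values 1..10 are hypothetical. For the 4-bucket table in this thread, choosing y equal to the table's bucket count (4) should make each sample line up with one stored bucket.

```python
# Sketch of how a row is matched to TABLESAMPLE (BUCKET x OUT OF y ON col).
# Assumption: an INT column hashes to its own value, as in Hive's original
# (pre-3.0) bucketing scheme; other types and newer versions hash differently.

JAVA_INT_MAX = 0x7FFFFFFF  # mask that keeps the hash non-negative


def bucket_number(int_value: int, total_buckets: int) -> int:
    """Return the 1-based bucket (the 'x' in BUCKET x OUT OF y) for an INT value."""
    return (int_value & JAVA_INT_MAX) % total_buckets + 1


def rows_in_bucket(values, x, y):
    """Return the values that BUCKET x OUT OF y would select under this model."""
    return [v for v in values if bucket_number(v, y) == x]


if __name__ == "__main__":
    user_ids = list(range(1, 11))  # hypothetical user_id values 1..10
    for x in range(1, 5):
        # One line per bucket of a 4-bucket table
        print(x, rows_in_bucket(user_ids, x, 4))
```

Under this model TABLESAMPLE numbers buckets from 1, so bucket x holds the rows whose hash modulo y equals x - 1.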

Each bucket also corresponds to an HDFS file, which you can read with dfs
-cat or dfs -text as well.
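
As a rough illustration of the HDFS route, the Python sketch below lists a table directory and prints each file in it. It assumes a Hadoop client is on the PATH and that 'hadoop fs -ls' prints the path as the last column of each entry; the helper names are hypothetical, and the location is the one shown in the 'desc formatted' output quoted in this thread. The files only correspond to buckets if the data was written through Hive with bucketing applied; files copied in with LOAD DATA are left as they are.

```python
import subprocess

# Table location taken from the 'desc formatted' output quoted in this thread.
TABLE_LOCATION = "hdfs://md-trngpoc1:54310/user/hive/warehouse/user_info_bucketed"


def dfs_command(action: str, path: str) -> list:
    """Build a 'hadoop fs' command line; action is 'ls', 'cat' or 'text'."""
    if action not in ("ls", "cat", "text"):
        raise ValueError("unsupported action: " + action)
    return ["hadoop", "fs", "-" + action, path]


def paths_from_ls(ls_output: str) -> list:
    """Extract file paths from 'hadoop fs -ls' output.

    Assumes the usual layout: one entry per line with the path as the last
    column, plus a leading 'Found N items' summary line that is skipped.
    Paths containing spaces are not handled.
    """
    paths = []
    for line in ls_output.splitlines():
        fields = line.split()
        if len(fields) < 8 or line.startswith("Found"):
            continue
        paths.append(fields[-1])
    return paths


def print_bucket_files(location: str = TABLE_LOCATION) -> None:
    """List the table directory and print every file in it."""
    listing = subprocess.run(dfs_command("ls", location), capture_output=True,
                             text=True, check=True).stdout
    for path in sorted(paths_from_ls(listing)):
        print("==", path)
        subprocess.run(dfs_command("cat", path), check=True)
```

Calling print_bucket_files() on a machine with cluster access would print one block per file; swapping "cat" for "text" in that call is the variant to try for compressed or SequenceFile data.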

Edward

On Wed, Jul 18, 2012 at 7:23 AM, Bejoy Ks <[EMAIL PROTECTED]> wrote:
> Hi Shaik
>
> AFAIK, there is no command in Hive to view the data in a particular bucket.
> If you really want to view it, you can do so at the HDFS level. Go to the
> corresponding table/partition location in HDFS; if you have n buckets, there
> will be n files, each corresponding to a bucket.
>
> Regards
> Bejoy KS
>
> ________________________________
> From: shaik ahamed <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Sent: Wednesday, July 18, 2012 4:01 PM
> Subject: Re: cmd for to know the buckets
>
> Hi Navis,
>
> Thanks for the reply.
> I created a bucketed table and loaded data into it, and I would like to
> see the data in each of the 4 buckets.
>
> The command below shows the bucket details, but I can't view the data by
> bucket column the way we can see the partition column for partitions.
>
> I have created 4 buckets for 10 rows on the user_id column.
>
> With your command I get the output below:
>
> hive> desc formatted user_info_bucketed;
> OK
> # col_name              data_type               comment
> user_id                 int                     None
> firstname               string                  None
> lastname                string                  None
> # Detailed Table Information
> Database:               default
> Owner:                  root
> CreateTime:             Wed Jul 18 12:51:53 IST 2012
> LastAccessTime:         UNKNOWN
> Protect Mode:           None
> Retention:              0
> Location:               hdfs://md-trngpoc1:54310/user/hive/warehouse/user_info_bucketed
> Table Type:             MANAGED_TABLE
> Table Parameters:
>         numFiles                4
>         numPartitions           0
>         numRows                 0
>         rawDataSize             0
>         totalSize               177
>         transient_lastDdlTime   1342598173
> # Storage Information
> SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> InputFormat:            org.apache.hadoop.mapred.TextInputFormat
> OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> Compressed:             No
> Num Buckets:            4
> Bucket Columns:         [user_id]
> Sort Columns:           []
> Storage Desc Params:
>         serialization.format    1
> Time taken: 0.045 seconds
>
> I would like to view the data in the 4 buckets I have created. Is there any
> command or syntax to view it? Please reply.
>
> Regards
> shaik.
>
>
> On Wed, Jul 18, 2012 at 3:40 PM, Navis류승우 <[EMAIL PROTECTED]> wrote:
>
> Currently, configuring the number of buckets per partition is not allowed.
>
> If you want to know the number of buckets of a table, use 'desc extended' or
> 'desc formatted'.
>
>
> 2012/7/18 shaik ahamed <[EMAIL PROTECTED]>
>
> Hi users,
>
> I would like to know the syntax or the command to list the buckets of a
> table.
>
> For example, for partitions we use the command below to list the
> partitions of a table:
>
>  show partitions xyz;      (xyz is the table name)
>
> Please tell me the command to view the buckets created.
>
>
>
> Regards,
> shaik.
>
>
>
>
>