Hive >> mail # user >> cmd for to know the buckets


shaik ahamed 2012-07-18, 09:54
Navis류승우 2012-07-18, 10:10
shaik ahamed 2012-07-18, 10:31
Bejoy Ks 2012-07-18, 11:23
Re: cmd for to know the buckets
There is a command to view a bucket:

SELECT * FROM <TABLE> TABLESAMPLE (BUCKET 4 OUT OF 10 ON <BUCKETED COLUMN>) s;
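
As a minimal sketch of how this could be applied to the user_info_bucketed
table from the quoted message (4 buckets on user_id), the Python helper below
generates one TABLESAMPLE query per bucket. The helper name is hypothetical.
It assumes that TABLESAMPLE bucket numbers start at 1, and that sampling
"OUT OF" the table's own bucket count on the clustering column selects
exactly one bucket.

```python
def bucket_sample_queries(table: str, column: str, num_buckets: int) -> list[str]:
    """Build one HiveQL TABLESAMPLE query per bucket (hypothetical helper).

    Assumes the table is bucketed on `column` into `num_buckets` buckets,
    so that BUCKET i OUT OF num_buckets maps to a single bucket.
    """
    return [
        f"SELECT * FROM {table} TABLESAMPLE(BUCKET {i} OUT OF {num_buckets} ON {column}) s;"
        for i in range(1, num_buckets + 1)  # TABLESAMPLE buckets are 1-based
    ]


# Table name, column, and bucket count taken from the desc formatted output
# quoted in this thread.
queries = bucket_sample_queries("user_info_bucketed", "user_id", 4)
for q in queries:
    print(q)
```

The printed statements can be pasted into the hive> prompt one at a time.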

Each bucket is also an HDFS file, which you can read with dfs -cat or dfs -text
as well.
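
To illustrate how rows might end up in those files, here is a rough Python
sketch. It rests on several assumptions not stated in this thread: that the
table was populated through a bucketing-aware INSERT (not a plain file load),
that an int column hashes to its own value and the bucket is
(hash & Integer.MAX_VALUE) % numBuckets, and that bucket files follow the
000000_0, 000001_0, ... naming pattern. The user_id values 1 to 10 are made
up for illustration.

```python
INT_MAX = 0x7FFFFFFF  # Java's Integer.MAX_VALUE


def bucket_index(user_id: int, num_buckets: int) -> int:
    """Bucket for an int key, ASSUMING the key hashes to its own value."""
    return (user_id & INT_MAX) % num_buckets


def bucket_file(index: int) -> str:
    """File name for a bucket, ASSUMING the 000000_0 naming pattern."""
    return f"{index:06d}_0"


# Group ten hypothetical user_id values by the file they would land in.
rows_by_file: dict[str, list[int]] = {}
for user_id in range(1, 11):
    name = bucket_file(bucket_index(user_id, 4))
    rows_by_file.setdefault(name, []).append(user_id)

for name in sorted(rows_by_file):
    print(name, rows_by_file[name])
```

Running dfs -ls on the table location shows the actual file names, which is
the reliable way to check whether these assumptions hold for a given table.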

Edward

On Wed, Jul 18, 2012 at 7:23 AM, Bejoy Ks <[EMAIL PROTECTED]> wrote:
> Hi Shaik
>
> AFAIK, there is no command in Hive to view the data in a particular bucket.
> If you really want to view it, you can do so at the HDFS level. Go to the
> corresponding table/partition location in HDFS; if you have n buckets, there
> will be n files, each corresponding to a bucket.
>
> Regards
> Bejoy KS
>
> ________________________________
> From: shaik ahamed <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Sent: Wednesday, July 18, 2012 4:01 PM
> Subject: Re: cmd for to know the buckets
>
> Hi Navis,
>
> Thanks for the reply.
> I created a bucketed table and loaded data into it, and I would like to
> see the data in the 4 buckets.
>
> With the command below I can see the bucket details, but I can't view the
> bucket column data the way we can see the partition column for partitions.
>
> I created 4 buckets over 10 rows on the user_id column.
>
> With your command I get the output below:
>
> hive> desc formatted user_info_bucketed;
> OK
> # col_name              data_type               comment
> user_id                 int                     None
> firstname               string                  None
> lastname                string                  None
> # Detailed Table Information
> Database:               default
> Owner:                  root
> CreateTime:             Wed Jul 18 12:51:53 IST 2012
> LastAccessTime:         UNKNOWN
> Protect Mode:           None
> Retention:              0
> Location:               hdfs://md-trngpoc1:54310/user/hive/warehouse/user_info_bucketed
> Table Type:             MANAGED_TABLE
> Table Parameters:
>         numFiles                4
>         numPartitions           0
>         numRows                 0
>         rawDataSize             0
>         totalSize               177
>         transient_lastDdlTime   1342598173
> # Storage Information
> SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> InputFormat:            org.apache.hadoop.mapred.TextInputFormat
> OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> Compressed:             No
> Num Buckets:            4
> Bucket Columns:         [user_id]
> Sort Columns:           []
> Storage Desc Params:
>         serialization.format    1
> Time taken: 0.045 seconds
>
> I would like to view the data in each of the 4 buckets I have created. Is
> there any command or syntax to view it? Please reply.
>
> Regards
> shaik.
>
>
> On Wed, Jul 18, 2012 at 3:40 PM, Navis류승우 <[EMAIL PROTECTED]> wrote:
>
> Currently, configuring the number of buckets per partition is not allowed.
>
> If you want to know the number of buckets of a table, use 'desc extended' or
> 'desc formatted'.
>
>
> 2012/7/18 shaik ahamed <[EMAIL PROTECTED]>
>
> Hi users,
>
> I would like to know the syntax or the command to list the buckets.
>
> For example, for partitions we use the command below to list the
> partitions of a table:
>
>  show partitions xyz;      (xyz is the table name)
>
> Please tell me the command to view the buckets created.
>
>
>
> Regards,
> shaik.