Hive >> mail # user >> What does ROW__OFFSET__INSIDE__BLOCK FROM mean?


Re: What does ROW__OFFSET__INSIDE__BLOCK FROM mean?
Make sure virtual column support is turned on in your hive-site.xml. I
have a feeling that this field is only supported by certain input
formats, because I was unable to get a non-zero number out of it. (I
think it only works with index files.)
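
For reference, here is a minimal sketch of turning the setting on for a
single session instead of editing hive-site.xml. It assumes the property
in question is hive.exec.rowoffset (the message above does not name it),
and it reuses the table and URL from the quoted query below. Whether the
column then returns anything other than 0 may still depend on the input
format.

```sql
-- Assumption: hive.exec.rowoffset is the property that enables the
-- ROW__OFFSET__INSIDE__BLOCK virtual column (it defaults to false).
SET hive.exec.rowoffset=true;

-- Same query as in the original question, with the three virtual columns.
SELECT `url`,
       INPUT__FILE__NAME,
       BLOCK__OFFSET__INSIDE__FILE,
       ROW__OFFSET__INSIDE__BLOCK
FROM `testresult`
WHERE url = 'http://www.domain022.tl04/page035.html';
```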

On Wed, Oct 3, 2012 at 4:20 AM, afancy <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Could anybody explain to me what ROW__OFFSET__INSIDE__BLOCK means?
> For example, I run the following query, and it returns two rows. But why
> does the ROW__OFFSET__INSIDE__BLOCK column show 0?
> From the name of the column, I understood that it should return the line
> number of the record within the block file, but both are 0. So, what are
> the BLOCK, the BLOCK offset, and the row offset in a block?
> The Hive bitmap document is very confusing.
>
>
> hive> SELECT  `url`,  INPUT__FILE__NAME,BLOCK__OFFSET__INSIDE__FILE,
> ROW__OFFSET__INSIDE__BLOCK FROM `testresult` WHERE
> url='http://www.domain022.tl04/page035.html';
>
> http://www.domain022.tl04/page035.html
> hdfs://pc01:54310/user/hive/warehouse/testresult/testresults.csv 0 0
> http://www.domain022.tl04/page035.html
> hdfs://pc01:54310/user/hive/warehouse/testresult/testresults.csv 3200250 0
> Time taken: 19.653 seconds
> hive>
>
>
> Regards,
> afancy
>