Hive, mail # user - What does ROW__OFFSET__INSIDE__BLOCK FROM mean?

Re: What does ROW__OFFSET__INSIDE__BLOCK FROM mean?
Edward Capriolo 2012-10-03, 14:21
Make sure virtual column support is turned on in your hive-site.xml. I
have a feeling that this field is only supported by certain input
formats, because I was unable to get a non-zero number out of it. (I
think it only works with index files.)
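
For reference, a minimal sketch of turning the setting on for a single
session rather than in hive-site.xml. It assumes the property meant
above is hive.exec.rowoffset (the message does not name it, so treat
that as an assumption and check it against your Hive version). The
query is the one from the original question:

```sql
-- Assumption: hive.exec.rowoffset is the switch for the row offset virtual column.
SET hive.exec.rowoffset=true;

-- Same query as in the question, rerun in the same session.
SELECT `url`, INPUT__FILE__NAME, BLOCK__OFFSET__INSIDE__FILE,
       ROW__OFFSET__INSIDE__BLOCK
FROM `testresult`
WHERE url = 'http://www.domain022.tl04/page035.html';
```

Even with the setting on, the value may still come back as 0 for a
plain text file like the CSV here, if the field really is limited to
certain input formats.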

On Wed, Oct 3, 2012 at 4:20 AM, afancy <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Could anybody explain to me what ROW__OFFSET__INSIDE__BLOCK means?
> For example, I ran the following query, which returned two rows. But why
> does the ROW__OFFSET__INSIDE__BLOCK column show 0?
> As I understand it from the column name, it should return the line
> number of the record within the block file, but both values are 0. So what
> are the BLOCK, the BLOCK offset, and the row offset in a block?
> The Hive bitmap document is very confusing.
>
>
> hive> SELECT  `url`,  INPUT__FILE__NAME,BLOCK__OFFSET__INSIDE__FILE,
> ROW__OFFSET__INSIDE__BLOCK FROM `testresult` WHERE
> url='http://www.domain022.tl04/page035.html';
>
> http://www.domain022.tl04/page035.html
> hdfs://pc01:54310/user/hive/warehouse/testresult/testresults.csv 0 0
> http://www.domain022.tl04/page035.html
> hdfs://pc01:54310/user/hive/warehouse/testresult/testresults.csv 3200250 0
> Time taken: 19.653 seconds
> hive>
>
>
> Regards,
> afancy
>