Hive >> mail # user >> Storing data in TSV with changing headers


Re: Storing data in TSV with changing headers
You'll have to define separate tables for the different schemas. You can
"unify" them in a query with UNION ALL. You should also remove the
header lines from the files, if you still have them, because Hive does not
ignore them; it treats them as data.
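
One possible way to prepare the files, sketched in Python rather than taken from any Hive tooling: read each file's header line, drop it from the data, and group the files by header so that each group can be loaded into its own table. The function names, file names, and sample rows below are hypothetical, and the sketch assumes every file starts with exactly one header line.

```python
from collections import defaultdict


def split_header(tsv_text):
    """Return (column_names, data_text) for TSV text whose first line is a header."""
    header, _, body = tsv_text.partition("\n")
    # Tolerate Windows line endings on the header line.
    return tuple(header.rstrip("\r").split("\t")), body


def group_by_schema(files):
    """Map each distinct header to the header-free data of the files that use it.

    `files` is a dict of {file_name: tsv_text}; both are made up for this sketch.
    """
    groups = defaultdict(list)
    for name, text in sorted(files.items()):
        columns, data = split_header(text)
        groups[columns].append((name, data))
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical hourly logs: the second file has an extra "status" column.
    logs = {
        "2012-11-30-00.tsv": "ts\turl\n1\t/a\n2\t/b\n",
        "2012-11-30-01.tsv": "ts\turl\tstatus\n3\t/c\t200\n",
    }
    for columns, members in group_by_schema(logs).items():
        print(columns, [name for name, _ in members])
```

Each key in the result corresponds to one schema, and therefore to one table; the data strings no longer contain the header line, so nothing in them would be read as a row.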

dean

On Fri, Nov 30, 2012 at 2:59 AM, Marc Canaleta <[EMAIL PROTECTED]> wrote:

> Hi all!
>
> We want to use hive to analyze our logs. Our logs will be TSV files, one
> per hour, and as it is possible that we add/remove more columns in the
> future, we will include headers (column names) in each file.
>
> So it is possible that two TSV files for different days/hours have
> different headers.
>
> Is it possible to do this with Hive?
>
> Thanks!
>

--
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330