Hive, mail # user - Schema less hive table. Is it possible?


Re: Schema less hive table. Is it possible?
Nitin Pawar 2013-11-07, 12:59
This is my understanding; I may be wrong, so wait for others to reply as
well and correct me.

In Hive you cannot access data without creating a table, and to create
a table you need at least one column.
Does that qualify as a schema? Yes.

So in short, a schemaless table is not possible.
At most, you can create a table with a single column of type string.

Then write a custom UDF which parses the string and returns the data you
are looking for.
The problem with this is that you cannot use any columnar, compressed storage
file formats, you have to read all the data of each record whenever you
query it, and you always need a subquery for more granular access.
On the other hand, creating the table schema is a one-time job, so it is
worth the effort, because data validation is then offloaded to Hive.
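
To make the single-column approach above concrete, here is a minimal sketch
of the kind of parsing such a UDF would have to do, assuming each record is
stored as one raw JSON string. It is written in Python only to show the logic;
the function name, the dotted-path convention and the sample rows are all
made up for illustration and are not part of Hive.

```python
import json


def extract(raw, path):
    """Parse one raw JSON record and walk a dotted path like 'user.address.city'.

    Returns None when the record is malformed or the path is missing,
    loosely mirroring a UDF that returns NULL. (Hypothetical helper.)
    """
    try:
        node = json.loads(raw)  # the whole record is parsed on every call
    except (TypeError, ValueError):
        return None
    for key in path.split("."):
        if isinstance(node, dict) and key in node:
            node = node[key]
        else:
            return None
    return node


# Each element stands in for one row of a single-column string table.
# The values are invented sample data.
rows = [
    '{"user": {"name": "a", "address": {"city": "Pune"}}}',
    '{"user": {"name": "b"}}',
    'not json',
]

cities = [extract(r, "user.address.city") for r in rows]
print(cities)
```

Note that every lookup parses the full record again, which is the cost
described above. For JSON strings specifically, Hive also ships a built-in
get_json_object function, so a custom UDF may not be needed in that case.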

In the past I have created tables where the data was in JSON format, with
8 nested documents inside a single column.
On Thu, Nov 7, 2013 at 5:10 PM, Miljan Markovic <[EMAIL PROTECTED]> wrote:

>  Hello.
>
> I have a set of complex structured objects that I would like to put into
> Hive and use its SQL to query data from those objects. Now, since they are
> complex structures, defining a schema for every single field is a hard job.
>
> Instead I'm thinking of making a custom SerDe with a custom ObjectInspector
> that uses something like OGNL to get the values of object fields and
> convert them to Hive's data types. So when writing a query it would be
> something like:
>
> select <ognl_expression1>,<ognl_expression2>... from etc...
>
> If expressions can be transferred verbatim to the ObjectInspector without Hive
> checking for their validity as column names, the ObjectInspector itself would
> know what to do with them. This doesn't need any explicit schema as far as
> the SerDe and ObjectInspector are concerned. But can Hive cope with that? Is
> this possible to do?
>
>
--
Nitin Pawar