Hive >> mail # user >> Table schema size limit to 4000 chars ?


Alexandre Fouche 2012-12-17, 13:24
Alexandre Fouche 2012-12-17, 13:58
Re: Table schema size limit to 4000 chars ?
There shouldn't be any problems with comments in Avro schemas.  You
just need to make sure they're escaped properly.  We did run into a
problem with schema.literal values longer than 4k (the size of the
backing MySQL varchar field), so internally we just bump this value
for our Hive installs:

ALTER TABLE SERDE_PARAMS MODIFY PARAM_VALUE varchar(20000);
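
Before widening the column, it can help to check how long the schema literal actually is. The following is a minimal sketch, not part of Hive: the helper names compact_schema and fits_in_column are hypothetical, the 4000 default is simply the column width mentioned in this thread, and the schema is assumed to be strict JSON (no comments).

```python
import json

# Assumption: 4000 is the PARAM_VALUE width discussed in this thread;
# check the column definition in your own metastore.
DEFAULT_PARAM_VALUE_LIMIT = 4000


def compact_schema(schema_text: str) -> str:
    """Re-serialize a strict-JSON Avro schema without optional whitespace."""
    parsed = json.loads(schema_text)
    # ensure_ascii=False avoids expanding non-ASCII characters into \uXXXX.
    return json.dumps(parsed, separators=(",", ":"), ensure_ascii=False)


def fits_in_column(schema_text: str, limit: int = DEFAULT_PARAM_VALUE_LIMIT) -> bool:
    """Return True if the literal is no longer than a varchar(limit) column."""
    return len(schema_text) <= limit


if __name__ == "__main__":
    schema = """
    {
      "type": "record",
      "name": "Example",
      "fields": [
        {"name": "id", "type": "long"},
        {"name": "label", "type": ["null", "string"]}
      ]
    }
    """
    compact = compact_schema(schema)
    print(len(schema), "->", len(compact), "fits:", fits_in_column(compact))
```

Compacting only removes whitespace, so a schema that is still over the limit afterwards needs the wider column shown above.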

On 17 December 2012 05:58, Alexandre Fouche
<[EMAIL PROTECTED]> wrote:
> Ah, it seems the JSON parser issue was due to my Avro schema having //
> comments. I have seen some remarks on the web saying that this parser can be
> configured to accept comments.
>
> Is there a Hive property that can be passed to the JSON parser to allow
> comments in Avro schemas?
>
> --
> Alexandre Fouche
>
> On Monday 17 December 2012 at 14:24, Alexandre Fouche wrote:
>
> Hi,
>
> I have an Avro table with a schema that is around 8000 chars, and I cannot
> query it.
>
> First I had an issue when creating the table: Hive threw an exception
> because the field in MySQL (varchar(4000)) is too small. So I altered the
> column to varchar(10000), and that fixed this part.
>
> But when querying the table, Hive throws an exception saying that the
> JsonParser cannot find the end of the Avro schema array. It is basically the
> same issue as above: the Avro schema string is too long to be parsed by the
> third-party JSON parser org.codehaus.jackson.JsonParser in Hive/Avro. I do
> not really know whether this parser cannot parse arbitrary-length JSON
> strings or whether it has a hard-coded allocated string size.
>
> Note that I am using Cloudera Hive 0.9, which has the Avro SerDe bundled.
>
> Here is the thrown exception. org.codehaus.jackson.JsonParser is mentioned
> at the end.
>
> (…)
> 12/12/17 10:49:55 WARN avro.AvroSerdeUtils: Encountered exception determining schema. Returning signal schema to indicate problem
> org.apache.avro.SchemaParseException: org.codehaus.jackson.JsonParseException: Unexpected end-of-input: expected close marker for ARRAY (from [Source: java.io.StringReader@a750bb9; line: 1, column: 37])
>  at [Source: java.io.StringReader@a750bb9; line: 1, column: 13980]
> at org.apache.avro.Schema$Parser.parse(Schema.java:983)
> at org.apache.avro.Schema$Parser.parse(Schema.java:971)
> at org.apache.avro.Schema.parse(Schema.java:1020)
> at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:61)
> at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrReturnErrorSchema(AvroSerdeUtils.java:87)
> at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:59)
> at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:203)
> at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:260)
> at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:253)
> at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:490)
> at org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:162)
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:930)
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:831)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:959)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7532)
> at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:246)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:432)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:906)
> at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:94)
> at org.apache.hive.service.cli.session.Session.executeStatement(Session.java:141)
> at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:120)
> at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:169)
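
For the // comments mentioned in the quoted message, one option that does not depend on any parser setting is to strip them from the schema file before using it as the schema literal. The following is a minimal sketch under that assumption: strip_line_comments is a hypothetical helper, not a Hive or Avro API, and it handles only // line comments (not /* */ blocks), leaving any // inside string values untouched.

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments that appear outside JSON string literals."""
    out = []
    in_string = False  # currently inside a "..." literal
    escaped = False    # previous character was a backslash inside a string
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line, keeping the newline itself.
            end = text.find("\n", i)
            i = len(text) if end == -1 else end
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    commented = """{
      // record-level comment
      "type": "record",
      "name": "Example", // trailing comment
      "fields": [{"name": "url", "type": "string", "doc": "e.g. http://example.com"}]
    }"""
    cleaned = strip_line_comments(commented)
    # json.loads raises ValueError if the result is still not strict JSON.
    print(json.loads(cleaned)["name"])
```

Running the cleaned text through a strict JSON parser first, as in the usage above, gives an early check that nothing other than comments was making the schema unparseable.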