Hive >> mail # user >> Issue with Hive and table with lots of columns


Re: Issue with Hive and table with lots of columns
Thanks Ed. And on a separate tack, let's look at HiveServer2.
@OP:

*I've tried to look around on how I can change the thrift heap size but
haven't found anything.*

Looking at my HiveServer2 process, I find this:

   $ ps -ef | grep -i hiveserver2
   dwr       9824 20479  0 12:11 pts/1    00:00:00 grep -i hiveserver2
   dwr      28410     1  0 00:05 ?        00:01:04
/usr/lib/jvm/java-6-sun/jre/bin/java
-Xmx256m -Dhadoop.log.dir=/usr/lib/hadoop/logs
-Dhadoop.log.file=hadoop.log
-Dhadoop.home.dir=/usr/lib/hadoop -Dhadoop.id.str=
-Dhadoop.root.logger=INFO,console
-Djava.library.path=/usr/lib/hadoop/lib/native
-Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true
-Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar
/usr/lib/hive/lib/hive-service-0.12.0.jar
org.apache.hive.service.server.HiveServer2
Questions:

   1. What is the output of "ps -ef | grep -i hiveserver2" on your system?
In particular, what is the value of -Xmx?

   2. Can you restart your HiveServer2 with -Xmx1g, or some value that makes
sense for your system?

Lots of questions now. We await your answers! :)
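If it helps, here is a quick sketch for pulling the -Xmx value out of a
process listing. The sample command line below is hard-coded from the output
above for illustration; on a live system you would pipe
`ps -ef | grep -i '[h]iveserver2'` into the same grep.

```shell
# Extract the -Xmx flag from a HiveServer2 java command line.
# The sample line is hard-coded here so the snippet is self-contained.
line='/usr/lib/jvm/java-6-sun/jre/bin/java -Xmx256m -Dhadoop.log.dir=/usr/lib/hadoop/logs org.apache.hive.service.server.HiveServer2'
echo "$line" | grep -o '\-Xmx[0-9]*[mgMG]'   # prints: -Xmx256m
```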

On Fri, Jan 31, 2014 at 11:51 AM, Edward Capriolo <[EMAIL PROTECTED]>wrote:

> Final table compression should not affect the deserialized size of the
> data over the wire.
>
>
> On Fri, Jan 31, 2014 at 2:49 PM, Stephen Sprague <[EMAIL PROTECTED]>wrote:
>
>> Excellent progress, David. So the most important thing we learned here is
>> that it works (!) when running Hive in local mode, and that this error is a
>> limitation in HiveServer2. That's important.
>>
>> So: textfile storage handler, and issues converting it to ORC. Hmmm.
>>
>> follow-ups.
>>
>> 1. what is your query that fails?
>>
>> 2. can you add a "limit 1" to the end of your query and tell us if that
>> works? This'll tell us whether it's column-bound or row-bound.
>>
>> 3. bonus points. run these in local mode:
>>       > set hive.exec.compress.output=true;
>>       > set mapred.output.compression.type=BLOCK;
>>       > set
>> mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
>>       > create table blah stored as ORC as select * from <your table>;
>> #i'm curious if this'll work.
>>       > show create table blah;  #send output back if previous step
>> worked.
>>
>> 4. extra bonus: change ORC to SEQUENCEFILE in #3 and see if that works any
>> differently.
>>
>>
>>
>> I'm wondering if compression would have any effect on the size of the
>> internal ArrayList the thrift server uses.
>>
>>
>>
>> On Fri, Jan 31, 2014 at 9:21 AM, David Gayou <[EMAIL PROTECTED]>wrote:
>>
>>> Ok, so here are some news :
>>>
>>> I tried boosting HADOOP_HEAPSIZE to 8192,
>>> and I also set mapred.child.java.opts to 512M.
>>>
>>> Neither seems to have any effect.
>>>  ------
>>>
>>> I tried it using an ODBC driver => it fails after a few minutes.
>>> Using a local JDBC client (beeline) => it runs forever without any error.
>>>
>>> Both go through HiveServer2.
>>>
>>> If I use local mode, it works!   (But that's not really what I need, as I
>>> don't really know how to access it from my software.)
>>>
>>> ------
>>> I use a text file as storage.
>>> I tried to use ORC, but I can't populate it with LOAD DATA (it returns
>>> a file format error).
>>>
>>> Using "ALTER TABLE orange_large_train_3 SET FILEFORMAT ORC" after
>>> populating the table, I get a file format error on SELECT.
>>>
>>> ------
>>>
>>> @Edward :
>>>
>>> I've tried to look around for how I can change the thrift heap size but
>>> haven't found anything.
>>> Same for my client (I haven't found how to change its heap size either).
>>>
>>> My use case really does call for as many columns as possible.
>>>
>>>
>>> Thanks a lot for your help
>>>
>>>
>>> Regards
>>>
>>> David
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jan 31, 2014 at 1:12 AM, Edward Capriolo <[EMAIL PROTECTED]>wrote:
>>>
>>>> OK, here are the problem(s): Thrift has frame-size limits, and Thrift
>>>> has to buffer rows into memory.
>>>>
>>>> The Hive Thrift server has a heap size; it needs to be big in this case.
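
On the Thrift/Hive server heap point: one common way to raise it is through
the launcher's environment. This is a sketch under assumptions -- the exact
variable names honored and the hive-env.sh location vary by distribution, so
verify against your install before relying on it.

```shell
# In hive-env.sh (or the shell that launches the service):
export HADOOP_HEAPSIZE=4096               # MB; read by the hive launcher script
# or pass an explicit JVM flag through the Hadoop client opts:
export HADOOP_OPTS="$HADOOP_OPTS -Xmx4g"

hive --service hiveserver2 &
```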