Hive, mail # user - Hive Metastore Server 0.9 Connection Reset and Connection Timeout errors


Re: Hive Metastore Server 0.9 Connection Reset and Connection Timeout errors
Nitin Pawar 2013-07-30, 07:49
The flow you mention is used when the Thrift metastore client-server
connection runs in unsecured mode, so one way to avoid it is to use a
secure connection.

<code>
public boolean process(final TProtocol in, final TProtocol out)
    throws TException {
  setIpAddress(in);
  ...
  ...
  ...
@Override
protected void setIpAddress(final TProtocol in) {
  TUGIContainingTransport ugiTrans = (TUGIContainingTransport) in.getTransport();
  Socket socket = ugiTrans.getSocket();
  if (socket != null) {
    setIpAddress(socket);
  }
}
</code>
From the above code snippet, it looks like the null pointer exception is
not handled if getSocket returns null.

Can you check what the ulimit setting is on the server? If it is set to the
default, can you set it to unlimited and restart the hcat server? (This is
just a wild guess.)
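
Every open socket, including one stuck in CLOSE_WAIT, counts against the per-process open-file limit that ulimit -n controls, so it can help to compare current usage with that limit. Below is a minimal Java sketch, assuming a HotSpot/OpenJDK JVM on a Unix-like OS where com.sun.management.UnixOperatingSystemMXBean is available. The class name FdLimitCheck is made up for illustration, and the numbers describe whichever JVM runs the code, so to say anything about the metastore it would have to run inside that process or read the same bean over JMX.

```java
import com.sun.management.UnixOperatingSystemMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper, not part of Hive or HCatalog.
final class FdLimitCheck {
    private FdLimitCheck() {}

    // Returns {open, max} file descriptor counts for the current JVM,
    // or null when the Unix-specific platform bean is not available.
    static long[] openAndMaxFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] counts = openAndMaxFileDescriptors();
        if (counts == null) {
            System.out.println("File descriptor counts are not available on this platform");
        } else {
            System.out.println("open=" + counts[0] + " max=" + counts[1]);
        }
    }
}
```

If the open count sits close to the max while the errors occur, the limit is a plausible suspect. If it stays far below, the wild guess above is probably wrong.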

Also, the documentation for the getSocket method says: "If the underlying
TTransport is an instance of TSocket, it returns the Socket object which it
contains. Otherwise it returns null."
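
To illustrate the kind of defensive lookup that would tolerate a null at each step, here is a minimal, self-contained Java sketch. It is not the Hive or Thrift code: SocketCarrier is a hypothetical stand-in for a transport whose getSocket() may return null, and SafeIpLookup is a made-up name. Only java.net.Socket and java.net.InetAddress are real APIs here.

```java
import java.net.InetAddress;
import java.net.Socket;

// Hypothetical stand-in for a transport that may or may not wrap a socket.
// This is NOT the Thrift or Hive class.
interface SocketCarrier {
    Socket getSocket(); // may return null, as the quoted documentation describes
}

final class SafeIpLookup {
    private SafeIpLookup() {}

    // Returns the peer address as text, or null when any link in the chain is missing.
    static String remoteAddressOrNull(Object transport) {
        if (!(transport instanceof SocketCarrier)) {
            return null; // null or unexpected transport type: skip instead of throwing
        }
        Socket socket = ((SocketCarrier) transport).getSocket();
        if (socket == null) {
            return null; // transport does not wrap a socket
        }
        InetAddress address = socket.getInetAddress(); // null if the socket is not connected
        return address == null ? null : address.getHostAddress();
    }

    public static void main(String[] args) {
        SocketCarrier noSocket = () -> null;
        SocketCarrier unconnected = () -> new Socket();
        System.out.println(remoteAddressOrNull(null));
        System.out.println(remoteAddressOrNull(noSocket));
        System.out.println(remoteAddressOrNull(unconnected));
    }
}
```

The sketch checks three places where a null can appear: the transport itself, the socket it returns, and the address of a socket that is not connected. Which of these, if any, triggers the exception in the metastore is not something this sketch can tell.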

So one of the Thrift gurus needs to tell us what is happening. I do not
have knowledge at this depth.

Maybe Ashutosh or Thejas will be able to help on this.
From the netstat CLOSE_WAIT output, it looks like the Hive metastore server
has not closed the connection (I do not know why yet); maybe the Hive dev
guys can help. Are there too many connections in CLOSE_WAIT state?
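
To put a number on that question, the CLOSE_WAIT lines for the metastore port can be counted from saved netstat output. A minimal Java sketch, assuming the usual Linux `netstat -ntp` column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State, PID/Program name). CloseWaitCounter is a made-up name and the sample lines are invented for illustration, not taken from the affected server.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for counting CLOSE_WAIT connections in netstat output.
final class CloseWaitCounter {
    private CloseWaitCounter() {}

    // Counts TCP lines whose local address ends in ":<localPort>" and whose state is CLOSE_WAIT.
    static long countCloseWait(List<String> netstatLines, int localPort) {
        String suffix = ":" + localPort;
        long count = 0;
        for (String line : netstatLines) {
            String[] cols = line.trim().split("\\s+");
            if (cols.length >= 6
                    && cols[0].startsWith("tcp")      // skips header lines
                    && cols[3].endsWith(suffix)       // local address column
                    && cols[5].equals("CLOSE_WAIT")) { // state column
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Invented sample lines in the assumed column layout.
        List<String> sample = Arrays.asList(
            "Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name",
            "tcp 0 0 10.0.0.1:10000 10.0.0.2:51234 CLOSE_WAIT 1234/java",
            "tcp 0 0 10.0.0.1:10000 10.0.0.3:51235 ESTABLISHED 1234/java",
            "tcp 0 0 10.0.0.1:40000 10.0.0.4:10000 CLOSE_WAIT 999/java");
        System.out.println(countCloseWait(sample, 10000));
    }
}
```

Matching on the local address column keeps the count to connections the server side has not closed, and ignores clients on the same host that merely talk to a remote port 10000.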

On Tue, Jul 30, 2013 at 5:52 AM, agateaaa <[EMAIL PROTECTED]> wrote:

> Looking at the Hive metastore server logs, I see errors like these:
>
> 2013-07-26 06:34:52,853 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run(182)) - Error occurred during processing of message.
> java.lang.NullPointerException
>         at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.setIpAddress(TUGIBasedProcessor.java:183)
>         at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:79)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
>
> at approximately the same time as we see the timeout or connection reset
> errors.
>
> I don't know if this is the cause or a side effect of the connection
> timeout/connection reset errors. Does anybody have any pointers or
> suggestions?
>
> Thanks
>
>
> On Mon, Jul 29, 2013 at 11:29 AM, agateaaa <[EMAIL PROTECTED]> wrote:
>
> > Thanks Nitin!
> >
> > We have a similar setup (identical HCatalog and Hive server versions) in
> > another production environment and don't see any errors (it has been
> > running OK for a few months).
> >
> > Unfortunately, we won't be able to move to hcat 0.5 and hive 0.11 or hive
> > 0.10 soon.
> >
> > The last time we ran into this problem, I ran netstat -ntp | grep ":10000"
> > and saw that the server was holding on to one socket connection in
> > CLOSE_WAIT state for a long time (the Hive metastore server is running on
> > port 10000). I don't know if that's relevant here or not.
> >
> > Can you suggest any Hive configuration settings we can tweak, or
> > networking tools/tips we can use, to narrow this down?
> >
> > Thanks
> > Agateaaa
> >
> >
> >
> >
> > On Mon, Jul 29, 2013 at 11:02 AM, Nitin Pawar <[EMAIL PROTECTED]> wrote:
> >
> >> Is there any chance you can do an update on a test environment with
> >> hcat-0.5 and hive-0.11 or hive-0.10 and see if you can reproduce the issue?
> >>
> >> We used to see this error when there was load on the hcat server or some
> >> network issue connecting to the server (the second one was a rare
> >> occurrence).
> >>
> >>
> >> On Mon, Jul 29, 2013 at 11:13 PM, agateaaa <[EMAIL PROTECTED]> wrote:
> >>
> >>> Hi All:
> >>>
> >>> We are running into a frequent problem using HCatalog 0.4.1 (Hive
> >>> Metastore Server 0.9) where we get connection reset or connection
> >>> timeout errors.
> >>>
> >>> The hive metastore server has been allocated enough (12G) memory.

Nitin Pawar