Hive >> mail # dev >> Re: Hive Metastore Server 0.9 Connection Reset and Connection Timeout errors


Re: Hive Metastore Server 0.9 Connection Reset and Connection Timeout errors
Uploaded a patch for HIVE-5172. Can someone please review?

Do I have to be a contributor before I submit a patch?

I ran a test with the patch in our test environment (about 1000 pig jobs
that read from and insert into a hive table via hcatalog, along with an
equal number of alter table statements) over the past 4 days and haven't
seen any errors on the client or the server.

On Thu, Aug 29, 2013 at 2:39 PM, agateaaa <[EMAIL PROTECTED]> wrote:

> Thanks Ashutosh.
>
> Filed https://issues.apache.org/jira/browse/HIVE-5172
>
>
>
> On Thu, Aug 29, 2013 at 11:53 AM, Ashutosh Chauhan <[EMAIL PROTECTED]> wrote:
>
>> Thanks Agatea for digging in. Seems like you have hit a bug. Would you
>> mind opening a jira and adding your findings to it?
>>
>> Thanks,
>> Ashutosh
>>
>>
>> On Thu, Aug 29, 2013 at 11:22 AM, agateaaa <[EMAIL PROTECTED]> wrote:
>>
>>> Sorry hit send too soon ...
>>>
>>> Hi All:
>>>
>>> I put some debugging code in TUGIContainingTransport.getTransport() and
>>> tracked it down to:
>>>
>>> @Override
>>> public TUGIContainingTransport getTransport(TTransport trans) {
>>>
>>>   // UGI information is not available at connection setup time, it will be
>>>   // set later via set_ugi() rpc.
>>>   transMap.putIfAbsent(trans, new TUGIContainingTransport(trans));
>>>
>>>   // return transMap.get(trans); // <- original line, replaced for debugging
>>>   TUGIContainingTransport retTrans = transMap.get(trans);
>>>
>>>   if (retTrans == null) {
>>>     LOGGER.error(" cannot find transport that was in map !!");
>>>   } else {
>>>     LOGGER.debug(" found transport in map");
>>>   }
>>>   return retTrans;
>>> }
>>>
>>> When we run this in our test environment, we see that we run into the
>>> problem just after GC runs, and the "cannot find transport that was in
>>> map !!" message gets logged.
>>>
>>> Could the GC be collecting entries from transMap just before we get them?
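
The suspicion above can be illustrated with plain java.lang.ref.WeakReference. This is only a minimal sketch: WeakReference stands in here for the weak values of the MapMaker map, and the class and method names (WeakValueDemo, stillPresentAfterGc, lookupWithoutStrongReference) are hypothetical and not part of Hive or Thrift. It shows that a value held only weakly may be cleared by a GC, while a value that the caller still references strongly is not.

```java
import java.lang.ref.WeakReference;

// Hypothetical demo class; WeakReference stands in for a weak-valued map entry.
final class WeakValueDemo {

    // The caller keeps a strong reference, so the weak entry cannot be cleared.
    static boolean stillPresentAfterGc() {
        Object value = new Object();                  // strong reference in this frame
        WeakReference<Object> entry = new WeakReference<>(value);
        System.gc();                                  // only a request to the JVM
        return entry.get() == value;                  // value is still in use here
    }

    // Nothing but the weak entry refers to the value, which mirrors
    // putIfAbsent(trans, new TUGIContainingTransport(trans)) followed by get().
    // The result may be null or non-null depending on whether a GC ran.
    static Object lookupWithoutStrongReference() {
        WeakReference<Object> entry = new WeakReference<>(new Object());
        System.gc();
        return entry.get();
    }
}
```

Whether lookupWithoutStrongReference() actually returns null depends on the JVM and on GC timing, which would fit the intermittent nature of the errors described in this thread.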
>>>
>>> Tried a minor change which seems to work:
>>>
>>> public TUGIContainingTransport getTransport(TTransport trans) {
>>>
>>>   TUGIContainingTransport retTrans = transMap.get(trans);
>>>
>>>   if (retTrans == null) {
>>>     // UGI information is not available at connection setup time, it will
>>>     // be set later via set_ugi() rpc.
>>>     // Keep a strong reference in retTrans so the new value cannot be
>>>     // collected before it is returned.
>>>     retTrans = new TUGIContainingTransport(trans);
>>>     transMap.putIfAbsent(trans, retTrans);
>>>   }
>>>   return retTrans;
>>> }
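
The idea behind the change above (look up first, and keep a strong local reference to any newly created value) can be sketched in a self-contained form. This is not the Hive code and not the patch attached to the jira: Wrapper and WeakValueCache are hypothetical stand-ins, and a ConcurrentHashMap holding WeakReference values is only an approximation of a weakValues() map (its keys are strong and compared with equals()).

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for TUGIContainingTransport.
final class Wrapper {
    final Object wrapped;
    Wrapper(Object wrapped) { this.wrapped = wrapped; }
}

// Hypothetical cache whose values are only weakly reachable through the map.
final class WeakValueCache {
    private final ConcurrentMap<Object, WeakReference<Wrapper>> map =
        new ConcurrentHashMap<>();

    Wrapper getOrCreate(Object key) {
        while (true) {
            WeakReference<Wrapper> ref = map.get(key);
            Wrapper existing = (ref == null) ? null : ref.get();
            if (existing != null) {
                return existing;                 // live entry found
            }
            // The local variable keeps the new value strongly reachable
            // until it has been returned to the caller.
            Wrapper created = new Wrapper(key);
            WeakReference<Wrapper> fresh = new WeakReference<>(created);
            boolean installed = (ref == null)
                ? map.putIfAbsent(key, fresh) == null   // no entry yet
                : map.replace(key, ref, fresh);         // swap out a cleared entry
            if (installed) {
                return created;
            }
            // Another thread changed the entry first; retry the lookup.
        }
    }
}
```

Unlike the snippet above, the sketch also handles a concurrent insert for the same key by retrying, which may be unnecessary if each connection really does pass in its own TTransport.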
>>>
>>>
>>> My questions for Hive and Thrift experts:
>>>
>>> 1.) Do we need to use a ConcurrentMap?
>>> ConcurrentMap<TTransport, TUGIContainingTransport> transMap = new
>>> MapMaker().weakKeys().weakValues().makeMap();
>>> It uses == to compare keys (which might be the problem). Also, in this
>>> case we can't rely on trans always being in transMap, even right after a
>>> put, so the change above probably makes sense.
>>>
>>>
>>> 2.) Is it a better idea to use a WeakHashMap with WeakReference instead?
>>> (I was looking at org.apache.thrift.transport.TSaslServerTransport, esp.
>>> the change made by THRIFT-1468.)
>>>
>>> e.g.
>>> private static Map<TTransport, WeakReference<TUGIContainingTransport>>
>>> transMap3 = Collections.synchronizedMap(new WeakHashMap<TTransport,
>>> WeakReference<TUGIContainingTransport>>());
>>>
>>> getTransport() would be something like
>>>
>>> public TUGIContainingTransport getTransport(TTransport trans) {
>>>   WeakReference<TUGIContainingTransport> ret = transMap3.get(trans);
>>>   // Hold the value strongly so it cannot be collected before we return.
>>>   TUGIContainingTransport tugiTrans = (ret == null) ? null : ret.get();
>>>   if (tugiTrans == null) {
>>>     tugiTrans = new TUGIContainingTransport(trans);
>>>     // No need for putIfAbsent(): concurrent calls to getTransport()
>>>     // will pass in different TTransports.
>>>     transMap3.put(trans,
>>>         new WeakReference<TUGIContainingTransport>(tugiTrans));
>>>   }
>>>   return tugiTrans;
>>> }
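
On the remark in question 1 that the MapMaker map compares keys with ==: the practical difference between identity and equals() comparison can be seen with the JDK's own IdentityHashMap, used here purely as an analogy for weakKeys(). KeyComparisonDemo and countEntries are hypothetical names for this sketch. Two keys that are equal but are distinct objects share one entry in a HashMap and get separate entries in an identity-based map.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical demo class; IdentityHashMap is only an analogy for weakKeys().
final class KeyComparisonDemo {

    // Inserts two equal-but-distinct keys and reports how many entries result.
    static int countEntries(Map<String, String> map) {
        String a = new String("transport");
        String b = new String("transport");   // a.equals(b), but a != b
        map.put(a, "first");
        map.put(b, "second");
        return map.size();
    }
}
```

With identity comparison a lookup only succeeds when exactly the same object is passed in again. If getTransport() always receives the same TTransport instance for a given connection, the == comparison by itself would not explain the missing entry.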
>>>
>>>
>>> I did try 1.) above in our test environment and it does seem to resolve
>>> the problem, though I am not sure if I am introducing any other problem.
>>>
>>>
>>> Can someone help?
>>>
>>>
>>> Thanks
>>> Agatea
>>>