Drill >> mail # user >> Distributed mode troubles: ZK/Curator connection time out


Michael Hausenblas 2013-10-27, 21:00
Steven Phillips 2013-10-27, 21:32
Michael Hausenblas 2013-10-27, 21:57
Steven Phillips 2013-10-27, 22:17
Michael Hausenblas 2013-10-27, 22:39
Steven Phillips 2013-10-27, 22:44
Steven Phillips 2013-10-27, 22:48
Michael Hausenblas 2013-10-28, 09:42
Jacques Nadeau 2013-10-28, 20:15
Michael Hausenblas 2013-10-28, 09:26
Steven Phillips 2013-10-27, 22:35
Michael Hausenblas 2013-10-27, 22:42

Re: Distributed mode troubles: ZK/Curator connection time out
In my AWS deployment I haven't seen this problem so far, but as Steven said, I override the Drill conf (drill-override.conf) to point to the ZK host and port.
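
To rule out a plain connectivity problem before touching the Drill config, something like the sketch below may help. It only checks that each host:port in a ZooKeeper connection string accepts a TCP connection; it does not speak the ZooKeeper protocol. The function names parse_zk_connect and zk_reachable are made up for this example and are not part of Drill or Curator, the 2181 fallback port is an assumption, and IPv6 literals are not handled.

```python
import socket


def parse_zk_connect(connect):
    """Split a connection string such as 'host1:2181,host2:2181/chroot'
    into (host, port) pairs. A missing port falls back to 2181 (assumed default)."""
    # Drop an optional chroot suffix like '/drill' before splitting the hosts.
    hosts_part = connect.split("/", 1)[0]
    pairs = []
    for entry in hosts_part.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.partition(":")
        pairs.append((host, int(port) if port else 2181))
    return pairs


def zk_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # 'localhost:2181' is the connection string from the error in the quoted log.
    for host, port in parse_zk_connect("localhost:2181"):
        status = "reachable" if zk_reachable(host, port) else "NOT reachable"
        print(f"{host}:{port} {status}")
```

If a host shows up as not reachable from the machine running the Drillbit, the timeout is more likely a network or ZK setup issue than a Drill one.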

Tim

Sent from my iPhone

> On Oct 27, 2013, at 2:32 PM, Steven Phillips <[EMAIL PROTECTED]> wrote:
>
> One thing to add to the diagram is that all of the drill java processes
> will look at what is in drill-override.conf. You must set zk.connect to the
> correct zk host:port.
>
>
> On Sun, Oct 27, 2013 at 2:00 PM, Michael Hausenblas <
> [EMAIL PROTECTED]> wrote:
>
>>
>> Folks,
>>
>> I’m trying to set up Drill in distributed mode. Here’s what I have so far:
>> when I launch the first Drillbit with bin/drillbit.sh I get the following
>> in log/drillbit.out:
>>
>> [[
>> 20:47:20.963 [main] ERROR com.netflix.curator.ConnectionState - Connection timed out for connection string (localhost:2181) and timeout (5000) / elapsed (5045)
>> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
>>        at com.netflix.curator.ConnectionState.getZooKeeper(ConnectionState.java:94) ~[curator-client-1.1.9.jar:na]
>>        at com.netflix.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:106) [curator-client-1.1.9.jar:na]
>>        at com.netflix.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:393) [curator-framework-1.1.9.jar:na]
>>        at com.netflix.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:184) [curator-framework-1.1.9.jar:na]
>>        at com.netflix.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:173) [curator-framework-1.1.9.jar:na]
>>        at com.netflix.curator.RetryLoop.callWithRetry(RetryLoop.java:85) [curator-client-1.1.9.jar:na]
>>        at com.netflix.curator.framework.imps.GetChildrenBuilderImpl.pathInForeground(GetChildrenBuilderImpl.java:169) [curator-framework-1.1.9.jar:na]
>>        at com.netflix.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:161) [curator-framework-1.1.9.jar:na]
>>        at com.netflix.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:36) [curator-framework-1.1.9.jar:na]
>>        at com.netflix.curator.x.discovery.details.ServiceDiscoveryImpl.getChildrenWatched(ServiceDiscoveryImpl.java:306) [curator-x-discovery-1.1.9.jar:na]
>>        at com.netflix.curator.x.discovery.details.ServiceDiscoveryImpl.queryForInstances(ServiceDiscoveryImpl.java:276) [curator-x-discovery-1.1.9.jar:na]
>>        at com.netflix.curator.x.discovery.details.ServiceCache.refresh(ServiceCache.java:193) [curator-x-discovery-1.1.9.jar:na]
>>        at com.netflix.curator.x.discovery.details.ServiceCache.start(ServiceCache.java:116) [curator-x-discovery-1.1.9.jar:na]
>>        at org.apache.drill.exec.coord.ZKClusterCoordinator.start(ZKClusterCoordinator.java:89) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
>>        at org.apache.drill.exec.server.Drillbit.run(Drillbit.java:94) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
>>        at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:56) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
>>        at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:43) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
>>        at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:65) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
>> ]]
>>
>> This seems to be a known issue? See
>> http://stackoverflow.com/questions/16056751/curator-zookeeper-client-keeps-throw-out-connectionlossexception-per-connection
>>
>> Any ideas? Did anyone actually run Drill in distributed mode already and
>> if so, how did you overcome the above issue?
>>
>> What is next? How do I make other Drillbits point to the same ZK cluster?
>> And does anyone have an example of the call parameters for bin/submit_plan
>> as well?
>>
>>
>> BTW, in the process of trying to figure what’s going on behind the scene I