Hive >> mail # user >> Fetching Results from Hive Select (JDBC ResultSet.next() vs HiveClient.fetchN())


Christian Schneider 2013-06-26, 12:45
Christian Schneider 2013-06-26, 15:24
Christian Schneider 2013-07-03, 11:59
Re: Fetching Results from Hive Select (JDBC ResultSet.next() vs HiveClient.fetchN())
It seems stmt.setFetchSize(10000); can also be called on the Statement
before execution (so no cast of the ResultSet is needed).
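
For reference, a minimal sketch of that variant. The class and method names (FetchSizeSketch, countRows, roundTrips) are made up for illustration and are not part of Hive or JDBC. The round-trip arithmetic assumes one client/server round trip per fetch, which I have not verified against the driver, and whether the Hive JDBC driver honours the Statement-level fetch size may depend on the driver version.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class FetchSizeSketch {

    // Rough estimate of fetch calls needed for a result set.
    // Assumption: every fetch of up to fetchSize rows costs one round trip.
    static long roundTrips(long rows, int fetchSize) {
        return (rows + fetchSize - 1) / fetchSize; // ceiling division
    }

    // Runs the query with the fetch size set on the Statement before
    // execution (no cast to a driver-specific ResultSet class) and
    // returns the number of rows read via ResultSet.next().
    static long countRows(Connection con, String sql, int fetchSize) throws SQLException {
        try (Statement stmt = con.createStatement()) {
            stmt.setFetchSize(fetchSize); // standard JDBC hint to the driver
            try (ResultSet res = stmt.executeQuery(sql)) {
                long n = 0;
                while (res.next()) {
                    n++;
                }
                return n;
            }
        }
    }

    public static void main(String[] args) {
        // No server needed for this part: compare the default of 50
        // with 10000 for the 150000-row query from the thread.
        System.out.println(roundTrips(150000, 50));
        System.out.println(roundTrips(150000, 10000));
    }
}
```

Under that assumption, the 150000-row query needs 3000 fetches at the default size of 50 but only 15 at 10000, which would go some way towards explaining the difference. countRows would be called with a Connection obtained from DriverManager.getConnection(...) as in the quoted message below.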

2013/7/3 Christian Schneider <[EMAIL PROTECTED]>:
> Hi, I browsed through the sources and found a way to tune the JDBC
> ResultSet.next() performance.
>
> final Connection con =
>     DriverManager.getConnection("jdbc:hive2://carolin:10000/default", "hive", "");
> final Statement stmt = con.createStatement();
> final String tableName = "bigdata";
>
> final String sql = "select * from " + tableName + " limit 150000";
> System.out.println("Running: " + sql);
> final ResultSet res = stmt.executeQuery(sql);
>
> // enlarge the FetchSize (default is just 50!)
> ((HiveQueryResultSet) res).setFetchSize(10000);
>
> Best Regards,
> Christian.
>
>
> 2013/6/26 Christian Schneider <[EMAIL PROTECTED]>
>>
>> I just tested the same statement with beeline and got the same bad
>> performance.
>>
>> Any ideas?
>>
>> Best Regards,
>> Christian.
>>
>>
>> 2013/6/26 Christian Schneider <[EMAIL PROTECTED]>
>>>
>>> Hi,
>>> currently we are using HiveServer1 with the native HiveClient interface.
>>> Our application design looks horrible because (for whatever reason) it
>>> spawns a dedicated HiveServer for every query.
>>>
>>> We thought it would be a good idea to switch to HiveServer2 (because the
>>> MetaStore is used by many different applications).
>>>
>>> The JDBC setup was straightforward, but the performance is not what we
>>> expected.
>>>
>>> If we fetch a large result set (with fetchN() over HiveClient) we read
>>> at around 10MB/s.
>>>
>>> If I use JDBC (with resultSet.next()) I get a throughput of 1MB/min.
>>>
>>> Any chance to speed this up (like bulk fetching)?
>>>
>>> Best Regards,
>>> Christian.
>>
>>
>