Re: Fetching Results from Hive Select (JDBC ResultSet.next() vs HiveClient.fetchN())
It seems that stmt.setFetchSize(10000) can also be called on the Statement
before execution, which avoids the cast to HiveQueryResultSet.
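
A minimal sketch of that variant, using only the standard java.sql API. The
class name FetchSizeSketch and the helpers roundTrips and countRows are
made up for illustration, and the sketch assumes the driver actually honours
the fetch size hint set on the Statement. roundTrips only shows the
arithmetic: with the default of 50 rows per fetch mentioned below, 150000
rows need 3000 fetch requests, versus 15 with a fetch size of 10000.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class FetchSizeSketch {

    // Number of fetch requests needed to read `rows` rows when the driver
    // pulls `fetchSize` rows per request (ceiling division).
    static long roundTrips(long rows, int fetchSize) {
        return (rows + fetchSize - 1) / fetchSize;
    }

    // Hypothetical helper: runs `sql` with the given fetch size hint and
    // returns the number of rows read.
    static long countRows(Connection con, String sql, int fetchSize) throws SQLException {
        try (Statement stmt = con.createStatement()) {
            // Set the hint on the Statement before executeQuery; no cast needed.
            stmt.setFetchSize(fetchSize);
            try (ResultSet res = stmt.executeQuery(sql)) {
                long n = 0;
                while (res.next()) {
                    n++;
                }
                return n;
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        System.out.println(roundTrips(150000, 50));
        System.out.println(roundTrips(150000, 10000));

        // Only connects when a JDBC URL is passed as the first argument,
        // e.g. the jdbc:hive2:// URL from the quoted mail below.
        if (args.length > 0) {
            try (Connection con = DriverManager.getConnection(args[0], "hive", "")) {
                long rows = countRows(con, "select * from bigdata limit 150000", 10000);
                System.out.println("rows read: " + rows);
            }
        }
    }
}
```

Setting the hint on the Statement keeps the code free of Hive-specific
classes, so the same loop should work against any JDBC driver.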

2013/7/3 Christian Schneider <[EMAIL PROTECTED]>:
> Hi, I browsed through the sources and found a way to tune the JDBC
> ResultSet.next() performance.
>
> final Connection con = DriverManager.getConnection(
>     "jdbc:hive2://carolin:10000/default", "hive", "");
> final Statement stmt = con.createStatement();
> final String tableName = "bigdata";
>
> final String sql = "select * from " + tableName + " limit 150000";
> System.out.println("Running: " + sql);
> final ResultSet res = stmt.executeQuery(sql);
>
> // enlarge the FetchSize (default is just 50!)
> ((HiveQueryResultSet) res).setFetchSize(10000);
>
> Best Regards,
> Christian.
>
>
> 2013/6/26 Christian Schneider <[EMAIL PROTECTED]>
>>
>> I just tested the same statement with Beeline and got the same poor
>> performance.
>>
>> Any ideas?
>>
>> Best Regards,
>> Christian.
>>
>>
>> 2013/6/26 Christian Schneider <[EMAIL PROTECTED]>
>>>
>>> Hi,
>>> Currently we are using HiveServer1 with the native HiveClient interface.
>>> Our application design looks horrible because (for whatever reason) it
>>> spawns a dedicated HiveServer for every query.
>>>
>>> We thought it would be a good idea to switch to HiveServer2 (because the
>>> MetaStore is used by many different applications).
>>>
>>> The JDBC setup was straightforward, but the performance is not what we
>>> expected.
>>>
>>> If we fetch a large result set (with fetchN() over HiveClient), we read
>>> at around 10MB/s.
>>>
>>> If I use JDBC (with resultSet.next()), I get a throughput of 1MB/min.
>>>
>>> Any chance to speed this up (like bulk fetching)?
>>>
>>> Best Regards,
>>> Christian.
>>
>>
>