Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> Tablesample doubling


Copy link to this message
-
Tablesample doubling
Hello All,

Why does TABLESAMPLE(N rows) produce ouptut with 2*N rows?
I have the following script:

DROP TABLE IF EXISTS sparse_features_small;

CREATE TABLE sparse_features_small ROW FORMAT DELIMITED FIELDS TERMINATED
BY ',' LINES TERMINATED BY '\n' as

SELECT
        *
FROM
        sparse_features
TABLESAMPLE(50000 ROWS)
After I execute this by sourcing the file, I can then execute :

--
https://github.com/bearrito
@deepbearrito
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB