Re: Query HDFS
Steven Phillips 2013-10-21, 21:10
This is configured as part of the storage engine. For example, if you are
submitting a physical plan directly, you would set the dfsName property to:
If submitting a SQL query through sqlline, you should modify the
storage-engines.json in the conf directory. For example, modify the
"parquet" config to this:
"dfsName" : "hdfs://<namenode host>:<port>/"
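
For anyone who would rather script that change than edit the file by hand, here is a minimal Python sketch. It assumes the engines sit under a top-level "storage" object keyed by engine name, which may not match the layout of your storage-engines.json, so check the file first. The helper names, the sample JSON, the host name, and port 8020 are all hypothetical placeholders, not values from Drill.

```python
import json


def hdfs_uri(host, port):
    """Build a namenode URI of the form hdfs://host:port/."""
    return f"hdfs://{host}:{port}/"


def set_dfs_name(config_text, engine, host, port):
    """Return the config JSON text with the given engine's dfsName set to HDFS.

    ASSUMPTION: engines live under a top-level "storage" object keyed by
    engine name. Adjust the lookup if your storage-engines.json differs.
    """
    config = json.loads(config_text)
    config["storage"][engine]["dfsName"] = hdfs_uri(host, port)
    return json.dumps(config, indent=2)


# Hypothetical sample input; a real storage-engines.json may look different.
sample = '{"storage": {"parquet": {"type": "parquet", "dfsName": "file:///"}}}'

# Host and port are placeholders for your own namenode address.
updated = set_dfs_name(sample, "parquet", "namenode.example.com", 8020)
print(updated)
```

To apply this for real, you would read conf/storage-engines.json into a string, pass it through set_dfs_name, and write the result back. Other keys in the engine entry are left untouched.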
On Sat, Oct 19, 2013 at 8:20 AM, Tom Seddon <[EMAIL PROTECTED]> wrote:
> I'm also interested in querying data residing in HDFS. Grateful for any
> advice on how to achieve this.
> On 18 October 2013 00:10, Timothy Chen <[EMAIL PROTECTED]> wrote:
>> Hey Steven/Jacques,
>> If I want to query data that resides in HDFS, how do I query it in sqlline?
>> And how do I specify which HDFS namenode it should connect to for data?
>> Since I got Drill deployable to EC2, I'm thinking of hooking up the
>> AMPLab benchmark dataset to see how we perform. The dataset needs to be
>> copied from S3 to a distributed file system first, as one node won't be
>> able to hold it.