Re: hd fs -head?
On Mon, Sep 27, 2010 at 11:13 AM, Keith Wiley <[EMAIL PROTECTED]> wrote:
> On 2010, Sep 27, at 7:02 AM, Edward Capriolo wrote:
>
>> On Mon, Sep 27, 2010 at 3:23 AM, Keith Wiley <[EMAIL PROTECTED]>
>> wrote:
>>>
>>> Is there a particularly good reason for why the "hadoop fs" command
>>> supports
>>> -cat and -tail, but not -head?
>>>
>>
>> Tail is needed to be done efficiently but head you can just do
>> yourself. Most people probably use
>>
>> hadoop dfs -cat file | head -5.
>
>
> I disagree with your use of the word "efficiently".  :-)  To my
> understanding (and perhaps that's the source of my error), the approach you
> suggested reads the entire file over the net from the cluster to your client
> machine.  That file could conceivably be of HDFS scales (100s of GBs, even
> TBs wouldn't be uncommon).
>
> What do you think?  Am I wrong in my interpretation of how
> hadoopCat-pipe-head would work?
>
> Cheers!
>
> ________________________________________________________________________________
> Keith Wiley     [EMAIL PROTECTED]     keithwiley.com
>  music.keithwiley.com
>
> "And what if we picked the wrong religion?  Every week, we're just making
> God
> madder and madder!"
>                                           --  Homer Simpson
> ________________________________________________________________________________
>
>

'hadoop dfs -cat' writes the file out as it is read. head -5 exits
after 5 lines, which closes the pipe and stops the cat side on its next
write. With buffering, more than 5 lines might physically be read, but
this invocation does not read the entire HDFS file before piping it to
head.
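
You can convince yourself of the pipe behaviour without a cluster. The
sketch below is only an illustration: it uses 'yes' as a stand-in for
'hadoop dfs -cat' (my assumption, not anything Hadoop-specific), because
'yes' produces output forever. If head had to wait for the whole stream,
the pipeline would never return.

```shell
#!/bin/sh
# Stand-in for 'hadoop dfs -cat file': 'yes' prints the same line forever.
# head -n 5 (same as 'head -5') exits after five lines and closes its end
# of the pipe; the producer is then stopped on its next write (SIGPIPE),
# so the pipeline finishes instead of consuming the endless stream.
first_five=$(yes "a line from a very large file" | head -n 5)

# Show what made it through the pipe.
printf '%s\n' "$first_five"
```

The producer may get a few buffers ahead of head before it is stopped,
which is the "more than 5 lines might physically be read" part, but that
overshoot is bounded by buffer sizes, not by the size of the file.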