Hadoop user mailing list: Hadoop 101


Stupid question for the day…

I have a file created by a mahout job of the form:

0 [356:0.3481597,359:0.3481597,358:0.3481597,361:0.3481597,360:0.3481597]
8 [356:0.34786037,359:0.34786037,358:0.34786037,361:0.34786037,360:0.34786037]
25 [284:0.34821576,286:0.34821576,287:0.34821576,288:0.34821576,289:0.34821576]
28 [452:0.34802154,454:0.34802154,453:0.34802154,456:0.34802154,455:0.34802154]


If this were a SequenceFile I could read it and be merrily on my way, but it's a text file. The job writes key/value pairs of type <LongWritable, VectorWritable>, but the output file is tab-delimited text.

I was hoping to do something like:

SequenceFile.Reader reader = new SequenceFile.Reader(fs, inputFile, conf);
Writable userId = new LongWritable();
VectorWritable recommendations = new VectorWritable();
while (reader.next(userId, recommendations)) {
    // do something with each pair
}

But alas, Google fails me. How do you read key/value pairs from a text file outside of a map or reduce task?
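
To make the question concrete, here is a minimal sketch of the kind of thing I'm after, assuming each line is exactly "userId<TAB>[item:score,item:score,...]" as in the sample above. The class and method names (RecommendationTextReader, parseVector, readAll) are made up for illustration, and it yields plain Java maps rather than a real VectorWritable. On HDFS the reader could presumably be built with new InputStreamReader(fs.open(inputFile), StandardCharsets.UTF_8), reusing fs and inputFile from the snippet above; the sketch reads from a string so it runs without a cluster.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop or Mahout.
public class RecommendationTextReader {

    /** Parses "[356:0.34,359:0.34]" into an ordered itemId -> score map. */
    public static Map<Integer, Double> parseVector(String text) {
        Map<Integer, Double> scores = new LinkedHashMap<>();
        String body = text.trim();
        // Strip the surrounding brackets if present.
        if (body.startsWith("[") && body.endsWith("]")) {
            body = body.substring(1, body.length() - 1);
        }
        if (body.isEmpty()) {
            return scores;
        }
        for (String entry : body.split(",")) {
            // Each entry is assumed to look like "itemId:score".
            String[] parts = entry.split(":", 2);
            scores.put(Integer.parseInt(parts[0].trim()),
                       Double.parseDouble(parts[1].trim()));
        }
        return scores;
    }

    /** Reads every "userId<TAB>[item:score,...]" line from the reader. */
    public static Map<Long, Map<Integer, Double>> readAll(BufferedReader reader)
            throws IOException {
        Map<Long, Map<Integer, Double>> result = new LinkedHashMap<>();
        String line;
        while ((line = reader.readLine()) != null) {
            // Assumption: key and value are separated by the first tab.
            String[] keyValue = line.split("\t", 2);
            if (keyValue.length < 2) {
                continue; // skip blank or malformed lines
            }
            result.put(Long.parseLong(keyValue[0].trim()), parseVector(keyValue[1]));
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        String sample = "0\t[356:0.3481597,359:0.3481597]\n"
                      + "8\t[356:0.34786037,359:0.34786037]\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            // do something with each pair
            readAll(reader).forEach((userId, recommendations) ->
                    System.out.println(userId + " -> " + recommendations));
        }
    }
}
```

This only handles the bracketed sparse form shown in the sample; if the job can emit other vector layouts, the parsing would need to change.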