Accumulo >> mail # user >> List of unique qualifiers [SEC=UNOFFICIAL]


Dickson, Matt MR 2014-01-14, 22:30
David Medinets 2014-01-14, 22:36
Dickson, Matt MR 2014-01-14, 23:06
Keith Turner 2014-01-15, 01:05
RE: List of unique qualifiers [SEC=UNOFFICIAL]
UNOFFICIAL

Thanks Keith.  I've run a simple MR job based on the UniqueColumns example, but due to the size of the table it is taking a very long time.  Is it possible to pre-filter the data that goes to the MR job based on family, e.g. only run the MR job on columns with a specific column family of 'cityofbirth'?  I am currently going through every column in the table and checking the column family in the mapper, which is slow.
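
The cost difference described above can be sketched without any Accumulo code. This is a minimal in-memory illustration, not the UniqueColumns example itself: `read_entries`, `unique_qualifiers_job`, the `surname` family and the sample rows are all hypothetical names invented here. In a real job the family restriction would be applied by the input format before entries reach the mapper; Accumulo's `AccumuloInputFormat` has a `fetchColumns` setting for this, though its exact signature differs between releases and should be checked against the version in use.

```python
from collections import defaultdict

# Hypothetical sample data: (row, family, qualifier) entries.
TABLE = [
    ("123", "cityofbirth", "LosAngeles"),
    ("123", "surname", "Smith"),
    ("133", "cityofbirth", "Brisbane"),
    ("222", "cityofbirth", "London"),
    ("124", "cityofbirth", "London"),
    ("124", "surname", "Jones"),
]


def read_entries(table, fetch_family=None):
    """Stand-in for the job's input: yields entries, optionally one family only."""
    for row, family, qualifier in table:
        if fetch_family is None or family == fetch_family:
            yield row, family, qualifier


def unique_qualifiers_job(entries):
    """Map each entry to its qualifier, then reduce to the distinct keys."""
    counts = defaultdict(int)
    mapper_calls = 0
    for _row, _family, qualifier in entries:
        mapper_calls += 1        # one mapper call per entry delivered
        counts[qualifier] += 1   # stands in for shuffle and reduce: group by qualifier
    return sorted(counts), mapper_calls


# Filtering at the input means the mapper only ever sees 'cityofbirth' entries.
cities, calls = unique_qualifiers_job(read_entries(TABLE, "cityofbirth"))
print(cities, calls)
```

With the family restriction applied at the input, the mapper no longer needs its own family check, and the number of entries it handles drops to those in the requested family.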

________________________________
From: Keith Turner [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, 15 January 2014 12:06
To: [EMAIL PROTECTED]
Subject: Re: List of unique qualifiers [SEC=UNOFFICIAL]
On Tue, Jan 14, 2014 at 6:06 PM, Dickson, Matt MR <[EMAIL PROTECTED]> wrote:

UNOFFICIAL

Just for simplicity: this is a one-off request for management, so I was hoping to just scan via the shell and output to a file.

If I need to do it via an MR job I can do it that way, and I would be keen to hear any suggestions.

You could modify the following example in 1.4 to suit your needs.

src/examples/simple/src/main/java/org/apache/accumulo/examples/simple/mapreduce/UniqueColumns.java
________________________________
From: David Medinets [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, 15 January 2014 09:36
To: accumulo-user
Subject: Re: List of unique qualifiers [SEC=UNOFFICIAL]

Why the restriction to the shell environment? A nice map-reduce job would be ideal for this task.
On Tue, Jan 14, 2014 at 5:30 PM, Dickson, Matt MR <[EMAIL PROTECTED]> wrote:

UNOFFICIAL

Hi,

I need to extract a list of unique qualifier values on a table from the Accumulo shell.  For every column there is a column family that identifies a specific qualifier, e.g. 'cityofbirth'.  I would like to get a unique list of all cities that are listed in the qualifier against 'cityofbirth' for all rows.

E.g., if I had a table with

Rowid                Family            Qual
123                   cityofbirth         LosAngeles
133                   cityofbirth         Brisbane
222                   cityofbirth         London
124                   cityofbirth         London
124                   cityofbirth         London

I want a list that is just:
LosAngeles
London
Brisbane

Any suggestions on how to achieve this from the shell would be great.

Thanks in advance.
Matt
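
For a one-off run, the request above amounts to de-duplicating the qualifiers of one family in saved scan output. Here is a minimal sketch, assuming the output was written to a file with one entry per line shaped like `row family:qualifier [visibility] value`. That line layout is an assumption here, as are the function name `unique_qualifiers` and the extra `surname` entry in the sample; row ids or qualifiers containing spaces would need a more careful parser.

```python
def unique_qualifiers(lines, family):
    """Return the distinct qualifiers seen under `family`, in first-seen order."""
    seen = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2 or ":" not in parts[1]:
            continue  # skip blank or unexpected lines
        fam, qual = parts[1].split(":", 1)
        if fam == family:
            seen.setdefault(qual, None)  # dict keys keep insertion order
    return list(seen)


# Sample lines in the assumed layout; the 'surname' entry is hypothetical.
sample = [
    "123 cityofbirth:LosAngeles []",
    "133 cityofbirth:Brisbane []",
    "222 cityofbirth:London []",
    "124 cityofbirth:London []",
    "124 surname:Smith []",
]
print("\n".join(unique_qualifiers(sample, "cityofbirth")))
```

To run it on a real file, pass the open file object instead of `sample`, since iterating over a file yields its lines. This holds every distinct qualifier in memory, which is reasonable for a list of cities but not for a column with a very large number of distinct values.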

Corey Nolet 2014-01-16, 01:27
David Medinets 2014-01-16, 21:54
Sean Busbey 2014-01-16, 22:26
Ott, Charles H. 2014-01-17, 14:31
Eric Newton 2014-01-16, 01:27
Josh Elser 2014-01-14, 23:11