Results 1 to 10 of 165 (0.232s).
Distributing Computation across slaves - Spark - [mail # user]
... I have a job involving two sets of data indexed with the same type of key. I have an expensive operation that I want to run on pairs sharing the same key. The following code works BUT al...
   Author: Steve Lewis, 2015-01-15, 21:18
Is there a way to read a parquet database without generating an RDD - Spark - [mail # user]
...I have an application where a function needs access to the results of a select from a parquet database. Creating a JavaSQLContext and from it a JavaSchemaRDD as shown below works but the ...
   Author: Steve Lewis, 2015-01-06, 19:36
Developing with Pycharm - Spark - [mail # user]
...I am trying to use PyCharm for Spark development on a Windows 8.1 machine - I have installed py4j, added Spark python as a content root and have Cygwin in my path. Also using IntelliJ works for ...
   Author: Steve Lewis, 2015-01-05, 18:18
Is there a way (in Java) to turn Java Iterable into a JavaRDD? - Spark - [mail # user]
...I notice new methods such as JavaSparkContext makeRDD (with few useful examples) - It takes a Seq but while there are ways to turn a list into a Seq I see nothing that uses an Iterable ...
   Author: Steve Lewis, 2014-12-19, 18:26
Who is using Spark and related technologies for bioinformatics applications? - Spark - [mail # user]
...I am aware of the ADAM project in Berkeley and I am working on Proteomic searches - anyone else working in this space ...
   Author: Steve Lewis, 2014-12-17, 16:28
Re: how to convert an rdd to a single output file - Spark - [mail # user]
...what would good spill settings be? ...
   Author: Steve Lewis, 2014-12-12, 23:07
In Java how can I create an RDD with a large number of elements - Spark - [mail # user]
...assume I don't care about values which may be created in a later map - in scala I can say val rdd = sc.parallelize(1 to 1000000000, numSlices = 1000) but in Java JavaSparkContext can only paral...
   Author: Steve Lewis, 2014-12-09, 02:18
Re: How can I create an RDD with millions of entries created programmatically - Spark - [mail # user]
...looks good but how do I say that in Java? As far as I can see sc.parallelize (in Java) has only one implementation which takes a List - requiring an in memory representation. On Mon, Dec 8,...
   Author: Steve Lewis, 2014-12-08, 21:12
Problems creating and reading a large test file - Spark - [mail # user]
...I am trying to look at problems reading a data file over 4G. In my testing I am trying to create such a file. My plan is to create a fasta file (a simple format used in biology) looking like TCC...
   Author: Steve Lewis, 2014-12-06, 01:21
I am having problems reading files in the 4GB range - Spark - [mail # user]
...I am using a custom hadoop input format which works well on smaller files but fails with a file at about 4GB size - the format is generating about 800 splits and all variables in my code are l...
   Author: Steve Lewis, 2014-12-05, 18:53