Home | About | Sematext search-lucene.com search-hadoop.com
Results 1 to 10 of 30 (0.325s).
[SPARK-5392] Shuffle spill size is shown as negative - Spark - [issue]
...The "Shuffle Spill (Memory)" metric on the Stage Detail Web UI shows as negative for some executors (e.g. "-2097152.0 B"), see attached screenshot....
http://issues.apache.org/jira/browse/SPARK-5392    Author: Sven Krasser, 2015-05-03, 18:36
Re: indexing an RDD [Python] - Spark - [mail # user]
...Hey Roberto, You will likely want to use a cogroup() then, but it hinges all on how your data looks, i.e. if you have the index in the key. Here's an example: http://homepage.cs.latrobe.edu.au/...
   Author: Sven Krasser, 2015-04-29, 20:17
Re: Slower performance when bigger memory? - Spark - [mail # user]
...On Mon, Apr 27, 2015 at 7:36 AM, Shuai Zheng wrote: We're currently using 16 executors with 2 cores each per instance. This is mainly due to the S3 throughput observations I mentioned (a...
   Author: Sven Krasser, 2015-04-29, 20:09
Re: How to debug Spark on Yarn? - Spark - [mail # user]
...On Fri, Apr 24, 2015 at 11:31 AM, Marcelo Vanzin wrote: You're absolutely correct -- didn't notice it until now. This is a great addition! www.skrasser.com ...
   Author: Sven Krasser, 2015-04-24, 18:37
[SPARK-5395] Large number of Python workers causing resource depletion - Spark - [issue]
...During job execution a large number of Python workers accumulate, eventually causing YARN to kill containers for being over their memory allocation (in the case below that is about 8G for exe...
http://issues.apache.org/jira/browse/SPARK-5395    Author: Sven Krasser, 2015-02-17, 04:36
Re: ephemeral-hdfs vs persistent-hdfs - performance - Spark - [mail # user]
...Hey Joe, With the ephemeral HDFS, you get the instance store of your worker nodes. For m3.xlarge that will be two 40 GB SSDs local to each instance, which are very fast. For the persistent HDFS,...
   Author: Sven Krasser, 2015-02-03, 21:31
Re: Programmatic Spark 1.2.0 on EMR | S3 filesystem is not working when using - Spark - [mail # user]
...From your stacktrace it appears that the S3 writer tries to write the data to a temp file on the local file system first. Taking a guess, that local directory doesn't exist or you don't have p...
   Author: Sven Krasser, 2015-01-30, 17:55
Re: Define size partitions - Spark - [mail # user]
...You can also use your InputFormat/RecordReader in Spark, e.g. using newAPIHadoopFile. See here: https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.SparkContext. -Sven On ...
   Author: Sven Krasser, 2015-01-30, 17:48
Snappy Crash - Spark - [mail # user]
...I'm running into a new issue with Snappy causing a crash (using Spark 1.2.0). Did anyone see this before? -Sven 2015-01-28 16:09:35,448 WARN  [shuffle-server-1] storage.MemoryStore(Logging...
   Author: Sven Krasser, 2015-01-28, 17:43
Re: Large number of pyspark.daemon processes - Spark - [mail # user]
...After slimming down the job quite a bit, it looks like a call to coalesce() on a larger RDD can cause these Python worker spikes (additional details in Jira: https://issues.apache.org/jira/brow...
   Author: Sven Krasser, 2015-01-28, 01:59