Home | About | Sematext search-lucene.com search-hadoop.com
Results 1 to 10 of 26 (0.371s).
Re: Programmatic Spark 1.2.0 on EMR | S3 filesystem is not working when using - Spark - [mail # user]
...From your stacktrace it appears that the S3 writer tries to write the data to a temp file on the local file system first. Taking a guess, that local directory doesn't exist or you don't have p...
   Author: Sven Krasser, 2015-01-30, 17:55
Re: Define size partitions - Spark - [mail # user]
...You can also use your InputFormat/RecordReader in Spark, e.g. using newAPIHadoopFile. See here: https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.SparkContext. -Sven On ...
   Author: Sven Krasser, 2015-01-30, 17:48
[SPARK-5395] Large number of Python workers causing resource depletion - Spark - [issue]
...During job execution a large number of Python workers accumulate, eventually causing YARN to kill containers for being over their memory allocation (in the case below that is about 8G for exe...
http://issues.apache.org/jira/browse/SPARK-5395    Author: Sven Krasser, 2015-01-30, 01:30
Snappy Crash - Spark - [mail # user]
...I'm running into a new issue with Snappy causing a crash (using Spark 1.2.0). Did anyone see this before? -Sven 2015-01-28 16:09:35,448 WARN  [shuffle-server-1] storage.MemoryStore(Logging...
   Author: Sven Krasser, 2015-01-28, 17:43
Re: Large number of pyspark.daemon processes - Spark - [mail # user]
...After slimming down the job quite a bit, it looks like a call to coalesce() on a larger RDD can cause these Python worker spikes (additional details in Jira: https://issues.apache.org/jira/brow...
   Author: Sven Krasser, 2015-01-28, 01:59
Re: java.lang.OutOfMemoryError: GC overhead limit exceeded - Spark - [mail # user]
...Since it's an executor running OOM it doesn't look like a container being killed by YARN to me. As a starting point, can you repartition your job into smaller tasks? -Sven On Tue, Jan 27, 2015 a...
   Author: Sven Krasser, 2015-01-27, 22:51
Re: Index wise most frequently occuring element - Spark - [mail # user]
...Use combineByKey. For top 10 as an example (bottom 10 works similarly): add the element to a list. If the list is larger than 10, delete the smallest elements until size is back to 10. -Sven On T...
   Author: Sven Krasser, 2015-01-27, 20:22
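
The combineByKey suggestion in the snippet above can be sketched in plain Python. This is only a minimal sketch under stated assumptions: values are comparable numbers, the list size is 10 as in the example, and the names `create_combiner`, `merge_value`, `merge_combiners`, and the local driver `simulate_combine_by_key` are hypothetical, not taken from the thread. The driver merely imitates what Spark would do per partition and then across partitions. The list is kept as a min-heap, which is one possible way to "delete the smallest elements"; the thread does not say which structure was intended.

```python
import heapq

TOP_N = 10  # assumed list size, taken from the "top 10" example


def create_combiner(value):
    """Start a new per-key list holding the first value seen."""
    return [value]  # a one-element list is already a valid min-heap


def merge_value(acc, value):
    """Add the element, then delete the smallest until size is back to TOP_N."""
    heapq.heappush(acc, value)
    while len(acc) > TOP_N:
        heapq.heappop(acc)  # pops the current smallest element
    return acc


def merge_combiners(acc, other):
    """Merge two per-key lists built on different partitions."""
    for value in other:
        merge_value(acc, value)
    return acc


def simulate_combine_by_key(partitions):
    """Hypothetical local stand-in for combineByKey over a list of partitions.

    Each partition is a list of (key, value) pairs.
    """
    merged = {}
    for partition in partitions:
        local = {}
        # Combine within the partition first.
        for key, value in partition:
            if key in local:
                local[key] = merge_value(local[key], value)
            else:
                local[key] = create_combiner(value)
        # Then merge the partition's result into the global result.
        for key, acc in local.items():
            if key in merged:
                merged[key] = merge_combiners(merged[key], acc)
            else:
                merged[key] = acc
    return merged
```

In PySpark the same three functions could be passed as `rdd.combineByKey(create_combiner, merge_value, merge_combiners)` on an RDD of (key, value) pairs. The resulting lists are in heap order, so sort them if a ranked output is needed. For bottom 10, one option is to negate the values before pushing.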
[SPARK-5392] Shuffle spill size is shown as negative - Spark - [issue]
...The "Shuffle Spill (Memory)" metric on the Stage Detail Web UI shows as negative for some executors (e.g. "-2097152.0 B"), see attached screenshot....
http://issues.apache.org/jira/browse/SPARK-5392    Author: Sven Krasser, 2015-01-24, 01:21
[SPARK-5209] Jobs fail with "unexpected value" exception in certain environments - Spark - [issue]
...Jobs fail consistently and reproducibly with exceptions of the following type in PySpark using Spark 1.2.0: 2015-01-13 00:14:05,898 ERROR [Executor task launch worker-1] executor.Executor (Lo...
http://issues.apache.org/jira/browse/SPARK-5209    Author: Sven Krasser, 2015-01-23, 23:46
Re: Problems saving a large RDD (1 TB) to S3 as a sequence file - Spark - [mail # user]
...Hey Darin, Are you running this over EMR or as a standalone cluster? I've had occasional success in similar cases by digging through all executor logs and trying to find exceptions that are not...
   Author: Sven Krasser, 2015-01-23, 22:14