How does Spark set task indexes? - Spark - [mail # user]
...Yes I've noticed this one and its related cousin, but not sure this is the same issue there; our job "properly" ends after 6 attempts. We'll try with disabled speculative mode anyway! On 25 May...
   Author: Adrien Mogenet , 2016-05-25, 08:49
  
How does Spark set task indexes? - Spark - [mail # user]
...Hi, I'm wondering how Spark is setting the "index" of task? I'm asking this question because we have a job that constantly fails at task index = 421. When increasing number of partitions, this t...
   Author: Adrien Mogenet , 2016-05-24, 20:00
How to add an accumulator for a Set in Spark - Spark - [mail # user]
...Btw, here is a great article about accumulators and all their related traps! http://imranrashid.com/posts/Spark-Accumulators/ (I'm not the author) On 16 March 2016 at 18:24, swetha kasireddy wr...
   Author: Adrien Mogenet , 2016-03-17, 07:32
  
df.partitionBy().parquet() java.lang.OutOfMemoryError: GC overhead limit exceeded - Spark - [mail # user]
...Very interested in that topic too, thanks Cheng for the direction! We'll give it a try as well. On 3 December 2015 at 01:40, Cheng Lian wrote: > You may try to set Hadoop conf "parquet...
   Author: Adrien Mogenet , 2015-12-03, 07:39
  
[POWERED BY] Please add our organization - Spark - [mail # user]
...Oh, right! I think it was user@ at the time I wrote my first message but it's clear now! Thanks Sean, On 2 December 2015 at 11:56, Sean Owen wrote: > Same, not sure if anyone handles th...
   Author: Adrien Mogenet , 2015-12-02, 11:04
  
[POWERED BY] Please add our organization - Spark - [mail # user]
...Hi folks, You're probably busy, but any update on this? :) On 16 November 2015 at 16:04, Adrien Mogenet <[EMAIL PROTECTED]> wrote: > Name: Content Square > URL: http://www.contentsqu...
   Author: Adrien Mogenet , 2015-12-02, 10:54
[POWERED BY] Please add our organization - Spark - [mail # user]
...Name: Content Square URL: http://www.contentsquare.com Description: We use Spark to regularly read raw data, convert them into Parquet, and process them to create advanced analytics dashboards: ...
   Author: Adrien Mogenet , 2015-11-16, 15:04
[HBASE-9260] Timestamp Compactions - HBase - [issue]
...TSCompactions The issue One of the biggest issues I currently deal with is compacting big stores, i.e. when HBase cluster is 80% full on 4 TB nodes (let's say with a single big table), compactions ...
http://issues.apache.org/jira/browse/HBASE-9260    Author: Adrien Mogenet , 2015-11-10, 03:40
  
Split content into multiple Parquet files - Spark - [mail # user]
...My bad, I realized my question was unclear. I did a partitionBy when using saveAsHadoopFile. My question was about doing the same thing for Parquet file. We were using Spark 1.3.x, but now that...
   Author: Adrien Mogenet , 2015-09-08, 17:21
  
Split content into multiple Parquet files - Spark - [mail # user]
...Hi there, We've spent several hours to split our input data into several parquet files (or several folders, i.e. /datasink/output-parquets//foobar.parquet), based on a low-cardinality key. This ...
   Author: Adrien Mogenet , 2015-09-08, 06:35
High iowait in idle hbase cluster - Hadoop - [mail # user]
...What is your disk configuration? JBOD? If RAID, possibly a dysfunctional RAID controller, or a constantly-rebuilding array. Do you have any idea at which files are linked the read blocks? On 4 ...
   Author: Adrien Mogenet , 2015-09-04, 10:08
  
High iowait in idle hbase cluster - Hadoop - [mail # user]
...Is the uptime of RS "normal"? No quick and global reboot that could lead into a region-reallocation-storm? On 3 September 2015 at 18:42, Akmal Abbasov wrote: > Hi Adrien, > I’ve tried to...
   Author: Adrien Mogenet , 2015-09-03, 16:58
High iowait in idle hbase cluster - Hadoop - [mail # user]
...Is your HDFS healthy (fsck /)? Same for hbase hbck? What's your replication level? Can you see constant network use as well? Anything that might be triggered by the hbase master? (something like ...
   Author: Adrien Mogenet , 2015-09-03, 15:46
How to determine the value for spark.sql.shuffle.partitions? - Spark - [mail # user]
...Not sure it would help and answer your question at 100%, but number of partitions is supposed to be at least roughly double of your number of cores (surprised to not see this point in your lis...
   Author: Adrien Mogenet , 2015-09-04, 06:04
  
Parquet partitioning for unique identifier - Spark - [mail # user]
...Any code / Parquet schema to provide? I'm not sure to understand which step fails right there... On 3 September 2015 at 04:12, Raghavendra Pandey <[EMAIL PROTECTED]> wrote: > Did you s...
   Author: Adrien Mogenet , 2015-09-03, 06:16
  
Unable to understand error “SparkListenerBus has already stopped! Dropping event …” - Spark - [mail # user]
...Hi there, I'd like to know if anyone has a magic method to avoid such messages in Spark logs: 2015-08-30 19:30:44 ERROR LiveListenerBus:75 - SparkListenerBus has already stopped! Dropping event S...
   Author: Adrien Mogenet , 2015-09-02, 12:23
  