Results 1 to 10 of 36 (0.291s).
Re: sparse x sparse matrix multiplication - Spark - [mail # user]
...I think Xiangrui's ALS code implement certain aspect of it. You may want to check it out. Best regards, Wei Wei Tan, PhD Research Staff Member IBM T. J. Watson Research Center From: Xiangrui Meng T...
   Author: Wei Tan, 2014-11-06, 07:50
Re: CUDA in spark, especially in MLlib? - Spark - [mail # user]
...Thank you Debasish. I am fine with either Scala or Java. I would like to get a quick evaluation on the performance gain, e.g., ALS on GPU. I would like to try whichever library does the busin...
   Author: Wei Tan, 2014-08-28, 18:34
Re: MLLib: implementing ALS with distributed matrix - Spark - [mail # user]
...Hi Deb, thanks for sharing your result. Please find my comments inline in blue. Best regards, Wei From: Debasish Das To: Wei Tan/Watson/IBM@IBMUS, Cc: Xiangru...
   Author: Wei Tan, 2014-08-18, 02:38
RE: executor-cores vs. num-executors - Spark - [mail # user]
...Thanks for sharing your experience. I got the same experience -- multiple moderate JVMs beat a single huge JVM. Besides the minor JVM starting overhead, is it always better to have multiple J...
   Author: Wei Tan, 2014-07-16, 18:31
Re: parallel stages? - Spark - [mail # user]
...Thanks Sean. In Oozie you can use fork-join, however using Oozie to drive Spark jobs, jobs will not be able to share RDD (Am I right? I think multiple jobs submitted by Oozie will have diffe...
   Author: Wei Tan, 2014-07-16, 04:01
Re: Recommended pipeline automation tool? Oozie? - Spark - [mail # user]
...Just curious: how about using scala to drive the workflow? I guess if you use other tools (oozie, etc) you lose the advantage of reading from RDD -- you have to read from HDFS. Best regards, W...
   Author: Wei Tan, 2014-07-11, 19:07
Re: rdd.cache() is not faster? - Spark - [mail # user]
...Hi Gaurav, thanks for your pointer. The observation in the link is (at least qualitatively) similar to mine. Now the question is, if I do have big data (40GB, cached size is 60GB) and even bi...
   Author: Wei Tan, 2014-06-18, 14:40
Re: long GC pause during file.cache() - Spark - [mail # user]
...BTW: nowadays a single machine with huge RAM (200G to 1T) is really common. With virtualization you lose some performance. It would be ideal to see some "best practice" on how to use Spark i...
   Author: Wei Tan, 2014-06-16, 14:56
Re: How to compile a Spark project in Scala IDE for Eclipse? - Spark - [mail # user]
...This will make the compilation pass but you may not be able to run it correctly. I used maven adding these two jars (I use Hadoop 1), maven added their dependent jars (a lot) for me. ...
   Author: Wei Tan, 2014-06-08, 16:02
Re: best practice: write and debug Spark application in scala-ide and maven - Spark - [mail # user]
...Thank you all, Madhu, Gerard and Ryan. All your suggestions work. Personally I prefer running Spark locally in Eclipse for debugging purpose. Best regards, Wei Wei Tan, PhD Research Staff Member...
   Author: Wei Tan, 2014-06-08, 06:03