Results 1 to 10 of 22 (0.29s).
Re: Data locality across jobs - Spark - [mail # user]
...You can read same partition from every hour's output, union these RDDs and then repartition them as a single partition. This will be done for all partitions one by one. It may not necessaril...
   Author: Ajay Srivastava, 2015-04-03, 10:04
Re: Some tasks are taking long time - Spark - [mail # user]
...Thanks Nicos. GC does not contribute much to the execution time of the task. I will debug it further today. Regards, Ajay On Thursday, January 15, 2015 11:55 PM, Nicos ...
   Author: Ajay Srivastava, 2015-01-16, 04:45
Re: Creating RDD from only few columns of a Parquet file - Spark - [mail # user]
...Setting spark.sql.hive.convertMetastoreParquet to true has fixed this. Regards, Ajay On Tuesday, January 13, 2015 11:50 AM, Ajay Srivastava wrote: H...
   Author: Ajay Srivastava, 2015-01-13, 08:53
Spark summit 2014 videos ? - Spark - [mail # user]
...Hi, I did not find any videos on the Apache Spark channel on YouTube yet. Any idea when these will be made available? Regards, Ajay ...
   Author: Ajay Srivastava, 2014-07-11, 05:13
Re: OFF_HEAP storage level - Spark - [mail # user]
...Thanks Jerry. It looks like a good option, will try it. Regards, Ajay On Friday, July 4, 2014 2:18 PM, "Shao, Saisai" wrote: Hi Ajay, StorageLevel OFF_HEAP means for can cache your RD...
   Author: Ajay Srivastava, 2014-07-04, 09:16
Re: Join : Giving incorrect result - Spark - [mail # user]
...Thanks Matei. We have tested the fix and it's working perfectly. Andrew, we set spark.shuffle.spill=false but the application goes out of memory. I think that is expected. Regards, Ajay On Frid...
   Author: Ajay Srivastava, 2014-06-06, 12:58
Re: Only log.index - MapReduce - [mail # user]
...Yes. That explains it and confirms my guess too :-)  stderr:156 0 syslog:995 166247  What are these numbers ? Byte offset in corresponding files from where logs of this task starts...
   Author: Ajay Srivastava, 2013-07-24, 06:52
Re: Unexpected problem in creating temporary file - MapReduce - [mail # user]
...Any suggestion ? I am stuck.   Regards, Ajay Srivastava   On 19-Jul-2013, at 5:54 PM, Ajay Srivastava wrote:  ...
   Author: Ajay Srivastava, 2013-07-20, 01:20
Re: Cartesian product in hadoop - MapReduce - [mail # user]
...The approach which I proposed will have m+n i/o for reading datasets, not (m + n + m*n), but further i/o due to spills and reading mapper output by reducer will be more as number of tu...
   Author: Ajay Srivastava, 2013-04-18, 15:18
Re: How to balance reduce job - HDFS - [mail # user]
...Tariq probably meant distribution of keys from  pair emitted by mapper. Partitioner distributes these pairs to different reducers based on key. If data is such that keys are skewed then...
   Author: Ajay Srivastava, 2013-04-17, 06:02