-
Running Accumulo straight from Memory
Moore, Matthew J. 2012-09-11, 16:02
Has anyone run Accumulo on a single server straight from memory, probably using something like a Fusion IO drive? We are trying to use it without an SSD or any spinning disks.
Matthew Moore
Systems Engineer
SAIC, ISBU
Columbia, MD
410-312-2542
-
Re: Running Accumulo straight from Memory
Adam Fuchs 2012-09-11, 21:29
Matthew,
I don't know of anyone who has done this, but I believe you could:
1. Mount a RAM disk.
2. Point the fs.default.name property in Hadoop's core-site.xml to file:///.
3. Point the instance.dfs.dir property in accumulo-site.xml to a directory on the RAM disk.
4. Disable the WAL for all tables by setting table.walog.enabled to false in accumulo-site.xml.
5. Initialize and start up Accumulo as you regularly would, and cross your fingers.
Of course, the "you may lose data" and "this is not an officially supported configuration" caveats apply. Out of curiosity, what would you be trying to accomplish with this configuration?
Adam
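For illustration only, steps 2 to 4 of the RAM-disk recipe above could be written out as Hadoop-style site files. The sketch below is a rough guess at what those files would contain, not a tested or supported setup. The property names are the ones named in the thread; the mount point /mnt/ramdisk/accumulo and the helper render_site_xml are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical RAM disk mount point; this path is an assumption, not from the thread.
RAMDISK_DIR = "/mnt/ramdisk/accumulo"


def render_site_xml(properties):
    """Render a Hadoop-style *-site.xml document from a dict of properties."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Step 2: point Hadoop at the local file system instead of HDFS.
core_site = render_site_xml({"fs.default.name": "file:///"})

# Steps 3 and 4: keep Accumulo's files on the RAM disk and disable the WAL.
accumulo_site = render_site_xml({
    "instance.dfs.dir": RAMDISK_DIR,
    "table.walog.enabled": "false",
})

print(core_site)
print(accumulo_site)
```

The printed fragments would still need to be merged into the existing core-site.xml and accumulo-site.xml by hand, and everything on the RAM disk is lost on reboot.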
-
RE: Running Accumulo straight from Memory
Moore, Matthew J. 2012-09-12, 16:32
Adam,
It does look like we are the first to try this. We are trying to keep everything in memory and as a result there are no minor compactions, and probably major compactions to make tables larger. We tried this on SSDs using a file system and we were not getting the processing speeds that we had wanted.
Matt
-
Re: Running Accumulo straight from Memory
dlmarion@... 2012-09-12, 16:56
Matt,
Did you see Eric Newton's response yesterday? Running on a RAM disk has been done; however, minor and major compactions will still occur.
- Dave
-
Re: Running Accumulo straight from Memory
David Medinets 2012-09-12, 16:57
Have you looked at in-memory specific software like Lucene, VoltDB, MySQL? Doesn't Accumulo have too much overhead to act primarily as an in-memory datastore? What advantage do you see Accumulo having?
-
RE: Running Accumulo straight from Memory
Adam Fuchs 2012-09-12, 20:53
Even if you are just using memory, minor and major compactions are important to get compression, handle deletes, get sequential access (cache line efficiency), use iterators, and introduce locality groups.
Adam
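To make the point about deletes and sequential access concrete, here is a toy model of what a compaction over several sorted files does. It is a simplified sketch, not Accumulo's implementation: entries are plain (key, timestamp, value) tuples, DELETE is a made-up stand-in for a delete marker, and the compaction is assumed to cover every run for the table.

```python
import heapq

DELETE = object()  # hypothetical stand-in for a delete marker


def compact(*runs):
    """Merge sorted runs of (key, timestamp, value) into one sorted run.

    Each run must already be sorted by key, newest timestamp first.
    Only the newest entry per key survives, and a key whose newest
    entry is a delete marker is dropped entirely.
    """
    merged = heapq.merge(*runs, key=lambda e: (e[0], -e[1]))
    result = []
    last_key = None
    for key, ts, value in merged:
        if key == last_key:
            continue  # an older version of a key we already handled
        last_key = key
        if value is not DELETE:
            result.append((key, ts, value))
    return result


older = [("a", 1, "v1"), ("b", 1, "v1"), ("c", 1, "v1")]
newer = [("a", 2, "v2"), ("b", 2, DELETE)]

compacted = compact(older, newer)
print(compacted)  # [('a', 2, 'v2'), ('c', 1, 'v1')]
```

The output is a single sorted run with the stale version of "a" and the deleted "b" gone, which is why the merge still pays off even when the files live in memory.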
-
Re: Running Accumulo straight from Memory
David Medinets 2012-09-12, 21:20
Why would locality groups be useful in an in-memory system?
-
Re: Running Accumulo straight from Memory
Adam Fuchs 2012-09-12, 21:27
Yes, the effect of locality groups would be about the same in an in-memory system. The only exception is when you are not using locality groups and are fetching a particular column: the automatic seeking behavior of the column filtering iterator would be more efficient with in-memory RFiles.
Adam
-
RE: Running Accumulo straight from Memory
Moore, Matthew J. 2012-09-13, 12:47
You guys are confirming what we predicted. This was a "like to have" from our customer and we wanted to see if anyone else had tried this. Thanks.
Matt
-
Re: Running Accumulo straight from Memory
Keith Turner 2012-09-13, 16:32
On Wed, Sep 12, 2012 at 5:20 PM, David Medinets <[EMAIL PROTECTED]> wrote:
> Why would locality groups be useful in an in-memory system?
Memory is fast, yet we still organize data in memory to make it really fast (e.g. hash maps, sorted maps, bloom filters, etc.). Locality groups are no different. If using that data organization will make what you are attempting to do faster, then you would probably use it. Assume you have two locality groups, and one contains 1% of your data by volume and the other 99%. Scanning just the locality group with 1% of the data will be faster than not having locality groups. It cuts down on the amount of data you have to read and process from memory.
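The 1% versus 99% example can be put into numbers with a small model. This is a sketch under loud assumptions: the table size, the group names, the column family names and the function entries_read are all hypothetical, and the model counts the worst case in which a scan reads every entry of each group it touches (it ignores any seeking the iterators may do).

```python
def entries_read(groups, wanted_families):
    """Count entries a scan must read to fetch the wanted column families.

    `groups` maps a locality group name to (families, entry_count).
    A scan only reads groups holding at least one wanted family.
    """
    return sum(
        count
        for families, count in groups.values()
        if families & wanted_families
    )


# Hypothetical table of 1,000,000 entries: 1% in one group, 99% in the other.
with_groups = {
    "small": ({"meta"}, 10_000),
    "large": ({"content"}, 990_000),
}
# Without locality groups, everything sits in a single default group.
without_groups = {
    "default": ({"meta", "content"}, 1_000_000),
}

print(entries_read(with_groups, {"meta"}))     # 10000
print(entries_read(without_groups, {"meta"}))  # 1000000
```

Under these assumptions the scan of the small group touches one hundredth of the data, whether that data sits on disk or in RAM.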
-
Re: Running Accumulo straight from Memory
Eric Newton 2012-09-11, 16:19
I have run a small cluster with HDFS writing only to a RAM disk. Is that the sort of thing you are interested in?
-Eric
-
RE: Running Accumulo straight from Memory
Moore, Matthew J. 2012-09-11, 16:55
Have you tried it where you're writing to straight block memory? Not using any file system or SATA controller.
Matt
-
Re: Running Accumulo straight from Memory
Eric Newton 2012-09-11, 17:59
Accumulo needs something that provides the FileSystem interface. It also needs to be distributed, replicated, and provide for a write-ahead log. HDFS on a RAM disk pretty much gets you that.
-
Re: Running Accumulo straight from Memory
William Slacum 2012-09-11, 16:30
You could mount a RAM disk and point HDFS to it.
-
Re: Running Accumulo straight from Memory
William Slacum 2012-09-11, 16:30
Woops- slow innurnet and didn't notice Eric's response.