Cardon, Tejay E 2012-09-20, 21:05
I'm seeing some strange behavior on a moderate (30 node) cluster. I've got 27 tablet servers on large Dell servers with 30GB of memory each. I've set the TServer_OPTS to give them each 10G of memory. I'm running an ingest process that uses AccumuloInputFormat in a MapReduce job to write 1,000 rows, with each row containing ~1,000,000 columns in 160,000 families. The MapReduce job initially runs quite quickly and I can see the ingest rate peak on the monitor page. However, after about 30 seconds of high ingest, the ingest rate falls to 0. The job then stalls and my map tasks are eventually killed. In the end, the MapReduce job fails and I usually end up with between 3 and 7 of my tservers dead.
Inspecting the tserver.err logs shows nothing, even on the nodes that fail. The tserver.out log shows a Java OutOfMemoryError, and nothing else. I've included a zip with the logs from one of the failed tservers and a second one with the logs from the master. Other than the OutOfMemoryError, I'm not seeing anything that stands out to me.
If I reduce the data size to only 100,000 columns, rather than 1,000,000, the process takes about 4 minutes and completes without incident.
Am I just ingesting too quickly?
Thanks, Tejay Cardon
Re: Failing Tablet Servers
Eric Newton 2012-09-21, 14:03
A few items noted from your logs:
tserver.memory.maps.max = 1G
If you are giving your processes 10G, you might want to make the map larger, say 6G, and then reduce the JVM by 6G.

Write-Ahead Log recovery complete for rz<;zw== (8 mutations applied, > 8000000 entries created)
You are creating rows with 1M columns. This is ok, but you might want to write them out more incrementally.

WARN : Running low on memory
That's pretty self-explanatory. I'm guessing that the very large mutations are causing the tablet servers to run out of memory before they are held waiting for minor compactions.

Finished gathering information from 24 servers in 27.45 seconds
Something is running slow, probably due to GC thrashing.

WARN : Lost servers [10.1.24.69:9997[139d46130344b98]]
And there's a server crashing, probably due to an OOM condition.

Send smaller mutations. Maybe keep it to 200K column updates. You can still have 1M wide rows, just send 5 mutations.
-Eric
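The advice above (about 200K column updates per mutation, five mutations per 1M-column row) can be sketched as a simple batching step. This is only an illustration in plain Java, not code from the thread: the class name `MutationChunker` and the method `chunk` are hypothetical, and the integers stand in for real column updates. In an actual ingest job, each batch would presumably be turned into one Accumulo `Mutation` for the same row ID and handed to the writer before the next batch is built.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits the column updates of one wide row into
// smaller batches so that no single mutation carries the whole row.
public class MutationChunker {

    /** Returns consecutive batches of at most maxPerBatch updates each. */
    public static <T> List<List<T>> chunk(List<T> updates, int maxPerBatch) {
        if (maxPerBatch <= 0) {
            throw new IllegalArgumentException("maxPerBatch must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < updates.size(); start += maxPerBatch) {
            int end = Math.min(start + maxPerBatch, updates.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(updates.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Stand-in for the ~1,000,000 column updates of a single row.
        List<Integer> columns = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
            columns.add(i);
        }
        // 200,000 updates per batch, as suggested in the reply above.
        List<List<Integer>> batches = chunk(columns, 200_000);
        System.out.println(batches.size() + " batches"); // prints "5 batches"
    }
}
```

The row is still 1M columns wide once all batches are written; only the size of each individual write, and so the memory a tablet server must hold for any one mutation, goes down.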
Re: Failing Tablet Servers
John Vines 2012-09-20, 21:20
Okay, so we know that you're killing servers. We know that when you drop the amount of data down, you have no issues. Two immediate issues come to mind:
1. You modified the tserver opts to give them 10G of memory. Did you up the in-memory map size in accumulo-site.xml, or did you leave it alone? Or did you up it to match the 10G? If you upped it and aren't using the native maps, that would be problematic, as you need space for other purposes as well.
2. You seem to be making giant rows. Depending on your Key/Value size, it's possible for you to write a row that you cannot send (especially if using a WholeRowIterator), which can cause a cascading error when doing log recovery. Are you seeing any sort of errors in your loggers' logs?
John