Essbase fragmentation head-scratcher

With the release of Essbase, there was an interesting entry in the readme that caused some chatter on Network54.  The readme states:

The INPLACEDATAWRITE setting in essbase.cfg enables or disables in-place data writing.

Prior to this release, each time a data block was updated, it was written to a new disk location. With this release, for Exalytics, Essbase enables in-place data writing.

In-place data writing means that when updates occur, the data block can be written to the same location, as long as the compressed size of the data block fits in its original location on the disk.

In-place data writing can help reduce data fragmentation and lower the need for frequent restructuring of database. It also reduces the need for frequent index updates, resulting in improved performance.

Merely updating a single value of a cell in a single block causes that block to be written to the end of the page file? This made long-time developers scratch their heads because many of us believed that updating values in a block did not cause fragmentation. If submitting data causes fragmentation, imagine how bloated our Planning databases must be. No wonder we hear recommendations to restructure nightly during a planning cycle.

Glenn Schwartzberg said that yes, that’s how he understood it to work.  Edward Roske said there was no way submitting data cause fragmentation.  Cameron Lackpour didn’t believe it either until he did some testing and found that the ESSCMD GETDBSTATS command reported fragmentation after updating a single cell of Sample.Basic. Dan Pressman commented that he had seen the same behavior with compression turned off – BSO writes to the end of the .pag file when blocks are updated.

Test 1:

I tested this out on an, non-Exalytics sandbox. In Excel, I retrieved a single block of one of our sample cubes (not Sample.Basic, this was a larger demo cube). The page dimensions of my worksheet listed a single level-0 member from each Sparse dimension of the cube. Accounts were on my rows and January was my column dimension (both were level-0 Dense members).

With that sheet, I wrote a macro to do the following:

  • Update each data cell of the grid to a randomly generated number
  • Submit the data for the block back to Essbase
  • Call a MaxL script to stop and restart the Essbase application to flush the cache
  • Loop back to the top, iterating 250 times

While I ran the macro, I (im)patiently watched the database directory to see if the database’s essn.ind or essn.pag files would grow.  As we all know, if there is fragmentation that happens each time the block is submitted back to Essbase, the .pag file should see some growth – especially over 250 iterations. For complete transparency, this database is using the bitmap compression type that is the default for BSO applications.

After 250 iterations of changing the same block and stopping/restarting the application there was no essn.pag file growth. The GETDBSTATS results show the following:

  • Average Clustering Ratio: 0.4431846
  • Average Fragmentation Quotient: 0.8091218

That leads me to believe there is a little bit of fragmentation, so I cleared the cube and loaded in a level-0 extract of the data and ran a calculation.  After that, I ran GETDBSTATS again and found the following:

  • Average Clustering Ratio: 0.5117247
  • Average Fragmentation Quotient: 0.6745301

Test 2:

My next test was to do the same loop except I added in a quick aggregation step after the data was submitted. This time, as the macro ran, I could plainly see the essn.pag file growing every 5-6 loops of the code. The Average Clustering Ratio and Average Fragmentation Quotient also grew as the calculations ran.

What does all of this mean?


My conclusion is that the documentation on INPLACEDATAWRITE is a little misleading. Essbase without the INPLACEDATAWRITE does NOT write a block to the end of the .pag file if a block is updated. Blocks may be updated all day long by submitting data to them without expanding your .pag files. Fragmentation is caused when a data block is brought into the calculator engine. This is where INPLACEDATAWRITE does its magic on Exalytics machines and allows those blocks to be compressed and returned to their original location in the .pag file, thus preventing fragmentation.

What about the Fragmentation Quotient? Well, in the DBAG, it mentions that the Fragmentation quotient can be high and not indicate a need to defragment since free space in the .pag is created in 8MB chunks. I observed this to be true during Test 2 where excess free space was added to the .pag file and the file would only grow after 5-6 loops of the submit data/calc/stop/start routine. My thought is that an Average Fragmentation Quotient less than 1 (<1%) is essentially no fragmentation at all. After a single calculation script I began to see that Average Fragmentation Quotient above 1 (1.884844). By nature Essbase has some degree of fragmentation all of the time as it’s .pag file is always a little bit bigger than the actual size of the data contained therein. As long as the .pag file isn’t growing and the Average Fragmentation Quotient is less than 1%, I’m counting that as no fragmentation by submitting data to a block.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s