HBase Adds MultiWAL Support
Written by Kay Ewbank   
Thursday, 02 February 2017

A new version of Apache HBase is available with date-based tiered compactions and a CoDel-based RPC scheduler.

Apache HBase is Hadoop's open-source, distributed, versioned, non-relational database, modeled after Google's BigTable.

Many of the changes in HBase 1.3.0 are bug fixes, but there are also a number of new features.

Date-based tiered compaction has been added to address workloads where data is written mostly in order of arrival at the back end and is read mostly through time-range scans of particular column families. In previous versions, the store file layout couldn't take full advantage of the scan API's ability to skip store files whose data falls outside the requested time range. Date-based tiered compaction overcomes this problem by grouping data into store files by age. It also means that records old enough to have 'expired' can be dropped when the store is compacted.
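
To make the benefit concrete, here is a minimal Python sketch of time-range file skipping. It is not HBase code: the `StoreFile` class, the helper functions and the timestamps are all hypothetical, and the only assumption taken from the text is that a scan can skip any store file whose timestamp range lies wholly outside the requested range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreFile:
    """Hypothetical stand-in for a store file and its timestamp range."""
    name: str
    min_ts: int  # oldest cell timestamp in the file
    max_ts: int  # newest cell timestamp in the file


def files_for_scan(files, start_ts, end_ts):
    """Return the files whose timestamp range overlaps [start_ts, end_ts)."""
    return [f for f in files if f.max_ts >= start_ts and f.min_ts < end_ts]


def expired_files(files, now, ttl):
    """Return the files whose newest cell is already older than the TTL."""
    return [f for f in files if f.max_ts < now - ttl]


# Layout where every file mixes old and new data, so ranges overlap heavily.
mixed = [StoreFile("a", 0, 900), StoreFile("b", 100, 1000), StoreFile("c", 50, 950)]

# Date-tiered layout: each file covers its own time window.
tiered = [StoreFile("w0", 0, 299), StoreFile("w1", 300, 599), StoreFile("w2", 600, 1000)]

mixed_hits = files_for_scan(mixed, 700, 800)    # every file must be read
tiered_hits = files_for_scan(tiered, 700, 800)  # only the newest window is read
droppable = expired_files(tiered, now=1000, ttl=500)  # whole file is past its TTL
```

In this toy layout the same scan touches three files when data is mixed but only one when files are split by time window, and the oldest window can be discarded as a unit once everything in it has expired.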

Support for Multi WAL (multiple Write Ahead Logs) has also been improved. Without Multi WAL support, every region on a RegionServer writes to the same WAL. Because writes to a WAL are serial, a busy RegionServer that hosts several regions can find that the single WAL degrades overall performance. With Multi WAL, a RegionServer can write multiple WAL streams in parallel. Tests of the new feature show improvements of 20 percent in average latency when running on pure SATA disks, and 40 percent on SATA-SSD disks.
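
The effect of splitting one serial log into several can be illustrated with a small Python model. This is only a sketch under stated assumptions: the hash-based `wal_group` mapping, the region names and the unit write costs are hypothetical, and HBase's actual region grouping strategy may differ.

```python
import zlib
from collections import defaultdict


def wal_group(region_name: str, num_groups: int) -> int:
    """Map a region to one of num_groups WALs (hypothetical hash-based grouping)."""
    return zlib.crc32(region_name.encode("utf-8")) % num_groups


def serial_cost_per_wal(writes, num_groups):
    """Sum the write cost queued on each WAL; writes within one WAL are serial."""
    totals = defaultdict(float)
    for region, cost in writes:
        totals[wal_group(region, num_groups)] += cost
    return dict(totals)


# Eight regions, each issuing one write of unit cost (made-up numbers).
writes = [(f"region-{i}", 1.0) for i in range(8)]

single = serial_cost_per_wal(writes, 1)  # all writes queue behind one log
multi = serial_cost_per_wal(writes, 4)   # writes are spread over up to four logs

# With parallel WAL streams, the busiest log bounds the elapsed time.
single_elapsed = max(single.values())
multi_elapsed = max(multi.values())
```

The total work is unchanged, but because the WAL streams proceed in parallel the elapsed time is set by the busiest log rather than by the sum of all writes. How even the split is depends on the grouping function.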

The RPC request scheduler has also been improved. The previous version could operate in two modes: simple FIFO, and "partial" deadline, where deadline constraints were imposed only on long-running scan requests. The updated version adds support for scheduling based on the controlled delay (CoDel) algorithm, which is used to combat bufferbloat. This prevents long-standing call queues caused by a discrepancy between the request rate and the available throughput. CoDel provides active queue management: a delay threshold is defined, and when the minimum delay rises above it, calls are dropped to bring the delay back under the threshold.
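
The threshold rule described above can be sketched in a few lines of Python. This is a deliberately simplified model, not HBase's scheduler and not the full CoDel algorithm: the `CoDelQueue` class, its parameters and the timings are hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
from collections import deque


class CoDelQueue:
    """Simplified CoDel-style call queue (illustrative only)."""

    def __init__(self, target_delay, interval):
        self.target_delay = target_delay  # acceptable queueing delay
        self.interval = interval          # window over which min delay is tracked
        self.calls = deque()              # entries are (enqueue_time, payload)
        self.min_delay = None             # smallest delay seen in current window
        self.interval_end = interval
        self.overloaded = False
        self.dropped = 0

    def put(self, now, payload):
        self.calls.append((now, payload))

    def take(self, now):
        """Return the next call to serve, dropping stale calls while overloaded."""
        while self.calls:
            enqueued_at, payload = self.calls.popleft()
            delay = now - enqueued_at
            self._observe(now, delay)
            if self.overloaded and delay > self.target_delay:
                self.dropped += 1  # shed the call instead of serving it late
                continue
            return payload
        return None

    def _observe(self, now, delay):
        if now >= self.interval_end:
            # Window finished: overloaded if even its minimum delay beat the target.
            self.overloaded = (
                self.min_delay is not None and self.min_delay > self.target_delay
            )
            self.min_delay = delay
            self.interval_end = now + self.interval
        elif self.min_delay is None or delay < self.min_delay:
            self.min_delay = delay


queue = CoDelQueue(target_delay=5, interval=100)
for name in "abcd":
    queue.put(0, name)

first = queue.take(50)         # served: no overload has been detected yet
after_backlog = queue.take(120)  # window ended with min delay 50, backlog is shed
queue.put(120, "e")
fresh = queue.take(121)        # a fresh call is still served promptly
```

Using the minimum delay over a window, rather than any single delay, is what lets the queue tolerate short bursts while reacting to a standing backlog.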

Other improvements include Maven archetypes for HBase client applications; a throughput controller for flushes; bulk-loaded HFile replication; and reduced memory allocation in the RPC layer.


 


Last Updated ( Wednesday, 01 February 2017 )
 
 

   