
Distributed Decompression

We recently optimized the restore performance of the Windows Restore Tool. As you may have seen with the latest press release, we managed to improve throughput by 300%.

In most cases, there is a bottleneck in the process limiting throughput. It’s almost as though the process and its bottlenecks are a calculus limit problem: as the bottlenecks approach zero, throughput approaches infinity. One of the larger bottlenecks in the restore process ended up being the decompression of the data.

On a leapserv, data is broken down into pieces for de-duplication before being compressed with bzip2. Bzip2 compresses better than the average LZ* algorithm, with the tradeoff being performance. Given that the leapserv provides a platform with dedicated processing power, sacrificing CPU cycles in an effort to save bandwidth and disk space is a good tradeoff.
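The de-duplicate-then-compress flow described above can be sketched roughly as follows. This is only an illustration, not the leapserv implementation: the fixed piece size, the use of SHA-256 to identify pieces, and the in-memory dictionary standing in for storage are all assumptions that the post does not state.

```python
import bz2
import hashlib

CHUNK_SIZE = 64 * 1024  # assumed piece size; the post does not give one


def store(data: bytes, chunk_store: dict) -> list:
    """Split data into pieces and keep one bzip2-compressed copy per unique piece.

    Returns the ordered list of piece digests needed to rebuild the data.
    """
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        piece = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(piece).hexdigest()  # assumed piece identifier
        if digest not in chunk_store:
            # Only pieces not seen before cost CPU time and disk space.
            chunk_store[digest] = bz2.compress(piece)
        recipe.append(digest)
    return recipe


def restore(recipe: list, chunk_store: dict) -> bytes:
    """Rebuild the original data by decompressing each piece in order."""
    return b"".join(bz2.decompress(chunk_store[d]) for d in recipe)
```

Note that `restore` spends all of its time in `bz2.decompress`, which is where the restore-side bottleneck discussed in this post shows up.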

However, as mentioned, when optimizing the restore performance of a leapserv we looked for bottlenecks one by one. As each was solved, throughput increased until it ran up against the next bottleneck. The decompression of the data quickly became our biggest bottleneck. Given that one of the developers had already ported bzip2 to assembly, we determined there wasn’t much more we could do with the bzip2 algorithm itself.

The solution to our decompression problem was distributed decompression. Since a restore already involves a computer pulling the data off the leapserv, why not use the CPU resources of that target computer to help decompress the data? By doing so, we achieved nearly 200 percentage points of the 300% throughput gain we obtained through the whole optimization process.

The distributed decompression engine is rather slick; it’ll adjust the decompression percentages between the host computer and the leapserv to compensate for different processing speeds. In other words, the engine distributes decompression work based on the available CPU resources on each machine.
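One simple way to picture that balancing is to split the pending pieces in proportion to how fast each side has recently been decompressing. The sketch below is a guess at the general idea, not the engine's actual algorithm; the function names, the rate units (pieces per second), and the moving-average smoothing are all hypothetical.

```python
def split_work(total_chunks: int, server_rate: float, client_rate: float) -> tuple:
    """Return (server_chunks, client_chunks) proportional to the measured rates."""
    combined = server_rate + client_rate
    if combined <= 0:
        # No measurements yet: leave everything on the leapserv.
        return total_chunks, 0
    client_chunks = round(total_chunks * client_rate / combined)
    return total_chunks - client_chunks, client_chunks


def update_rate(previous: float, measured: float, weight: float = 0.3) -> float:
    """Smooth a newly measured rate with an exponential moving average."""
    return (1 - weight) * previous + weight * measured
```

For example, if the leapserv is decompressing 30 pieces per second and the host computer 10, `split_work(100, 30.0, 10.0)` returns `(75, 25)`. Re-measuring after each batch and feeding the result through `update_rate` would let the split follow changes in available CPU on either machine.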

After adding distributed decompression to the restore process, we quickly found the decompression to be the bottleneck once more. The throughput was now nearly twice as fast but the bottleneck remained the same.

Our goal is to max out the write speed of a customer’s hard drives or the bandwidth limit of a customer’s network. At that point, the bottleneck is out of our hands, and the customer is getting the best possible service for their environment. Are we there yet? For some customers, I’m pretty sure we are. For our enterprise customers though, I believe there is more we can do.

The next step would be to get more machines involved. The host computer’s processor was a logical choice since the restoring user has already taken the time to install the Windows Restore Tool software on that computer. The question, though, is whether it’s worth the user’s time to install distributed decompression software on other machines on their network. And if it were, how long would it be before the decompression bottleneck was overcome only to be replaced with a bandwidth bottleneck to the decompression machines? We’ll find out…

  • Given the amount of bzip2 compression and decompression BitLeap performs, one of our engineers translated the algorithm into assembly. I noticed the other day that the source code had not yet been posted on the open source page, when it should in fact be there. If you are interested in the assembly version of bzip2 before we get it posted, shoot us an email at support@bitleap.com.

Cap Tower Destruction

We recently moved our offices, which created a great excuse for us to demolish the cap tower. One of our developers, Ken, built the tower out of bottle caps when we first moved into the old office about a year ago. It was too fragile to move, so we crushed it with a bat! Enjoy.

Feel free to guess how many caps were on the tower and we will give the closest guess a $50 gift card to The Apple Store, Barnes & Noble, Starbucks, or Amazon.com. Guesses will be accepted via blog comments until February 1, 2008.

Mending PHP Sockets with OpenSSL

Recently I began working on an FTP server that will allow our clients to restore files from any date and time previously backed up with our software. Currently we offer an FTP server at the local LeapServ with the same capabilities. However, it was desirable to offer an FTP service to our clients that would be accessible from anywhere in the world, so that a large number of files could be restored in a disaster scenario. FTP as originally specified (RFC 959) is an insecure protocol, but more recent extensions (RFC 2228, RFC 4217) provide guidelines for securing it.

The backbone of most of our software is implemented in PHP, which poses a bit of a dilemma when the aim is to create an FTP server secured using SSL/TLS. PHP currently supports some OpenSSL functionality, but nothing along the lines of using the library with a socket. Currently, to keep our off-site data transmissions highly secure, we implement our own handshake and key exchange and use robust encryption libraries. However, using our own proprietary encryption back end in this situation would make it impossible for a standard FTP client to exchange information with the FTPS server.

The Zend engine is the backbone of the PHP language and allows modules to be developed that extend the functionality of the language. The solution to the PHP/SSL dilemma was to develop an intermediate C library layer that handles tasks such as creating SSL contexts, binding a socket to an SSL object, and reading from and writing to an abstract SSL socket. Above this layer is our own PHP module, which wraps the intermediate C library and provides the gateway for our software to access this extended functionality from PHP.

The next hurdle was to mend the layer between PHP sockets and the new functionality of our PHP module. Fortunately, the Zend engine provides an interface to discover information about other modules that have been loaded or compiled into PHP. The information obtained about a module can then be used to fetch a particular resource type, which allows our PHP module to interoperate with the current implementation of sockets in PHP. Being able to work with PHP’s own socket implementation allowed us to mend PHP sockets, the OpenSSL library, and our own backend C library together into a robust framework for developing PHP applications that use SSL/TLS over a socket.

The new FTPS server has recently been completed and is undergoing testing before deployment in the near future.

Delay While Creating X509Certificate2 Objects

While developing in Visual Studio 2005 we ran into a problem where sometimes there would be a long pause (about 10 seconds) during the creation of X509Certificate2 objects. Yesterday we decided to dedicate some time to figure out the root cause of the issue.

The first thing we did was set some breakpoints around the code so we could step through it, and then we started filemon to monitor the file system activity. Right away we noticed something odd: there were a bunch of log entries for files named C:\Documents and Settings\[Your User Name]\Local Settings\Temp\Tmp[0000-ffff].tmp. We opened the folder that contained the files and found that there were over 66,000 files in the directory. We also noticed that there were files named Tmp0000.tmp all the way up to Tmpffff.tmp. We decided to remove all of these files to see if that would fix the issue. Sure enough, after removing the files the long delay was gone.

Even though the delay was gone, we wanted to find out why there were so many temporary files in the first place. So we stopped the application in Visual Studio and switched back over to Explorer to see if there were any new temporary files. There were two new files that had not been cleaned up. We removed the files, started the application again, and stepped through the code in the suspect area. Sure enough, when the new X509Certificate2 object was created, two temporary files were created and not removed. We went over to the MSDN site to make sure we were not missing a cleanup call or something else that would remove the files, but we could not find any such method.

However, we did notice that you can pass a file path rather than a byte array when constructing the object. This led us to our workaround for the issue. We simply created a temporary file and wrote the bytes to it. We then passed the temporary file path when constructing the new object. After the object was created we removed the temporary file.
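The shape of that workaround (write the bytes to a temporary file you control, hand over the path, then delete the file yourself) can be sketched in a language-neutral way. Our actual code is .NET; the Python below is only an illustration of the pattern, and `load_from_path` is a hypothetical stand-in for whatever constructor or loader accepts a file path.

```python
import os
import tempfile


def load_via_temp_file(raw: bytes, load_from_path):
    """Write raw to a temp file we own, load from its path, then delete it."""
    fd, path = tempfile.mkstemp(suffix=".cer")  # suffix is an arbitrary choice
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(raw)
        # The file is closed before the loader opens it by path.
        return load_from_path(path)
    finally:
        # Clean up ourselves, whether or not the loader succeeded.
        os.remove(path)
```

Because the caller owns the temporary file, its cleanup no longer depends on the library doing it.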

With all this information we thought it would be nice to provide it to Microsoft so that they could research it. We did just that by heading over to Microsoft Connect and creating issue #284551 and issue #284553.

New Feature: Critical File Watch Rules

Last week we released a new feature that allows our customers to be notified when a file’s size or modified time does not change within a customer-specified threshold.

For instance, someone might set up a rule to watch a Microsoft Exchange backup dump file. Let’s say the system that backs up Exchange is set up to back up to the NAS portion of the LeapServ device on a nightly basis. This new feature allows that person to set up a watch and be notified if the Exchange backup file has not changed within the past day.

The critical file watch rules aim to keep the user informed, and prevent a situation where a third party backup utility is not providing our system with the data that it should.
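Conceptually, a watch rule boils down to remembering what the file looked like the last time it changed and raising an alert once too much time has passed. The sketch below is a rough illustration and not the LeapServ implementation: the `WatchRule` class is hypothetical, and it assumes an alert should fire only when neither the size nor the modified time has changed within the threshold.

```python
import os


class WatchRule:
    """Hypothetical rule: alert when a file has not changed for too long."""

    def __init__(self, path: str, threshold_seconds: float):
        self.path = path
        self.threshold = threshold_seconds
        self.last_signature = None  # (size, modified time) at last change
        self.last_change = None     # time at which that change was seen

    def check(self, now: float) -> bool:
        """Return True if an alert should be raised at time `now`."""
        try:
            st = os.stat(self.path)
            signature = (st.st_size, st.st_mtime)
        except FileNotFoundError:
            signature = None  # a missing file counts as "not changing"
        if self.last_change is None or signature != self.last_signature:
            # First look at the file, or it changed: restart the clock.
            self.last_signature = signature
            self.last_change = now
            return False
        return (now - self.last_change) > self.threshold
```

For the Exchange example above, a rule created with a threshold of 86400 seconds and checked periodically would start returning `True` once the dump file has gone more than a day without changing.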



 

BitLeap Devblog

Welcome to the BitLeap developer blog! Some posts are longer than others, but they all seem to make use of links and code and stuff. Feel free to read, or not to read, as you so desire and prefer.