Co-Authored by Mark George
Last September (2011) we added Scrumworks Pro (SWP), an Agile Project Management solution by CollabNet, to our Cloud. We thought it would be interesting to explain in technical detail how our engineering team took a single-tenant Java application and transformed it into a semi-multitenant application.
SWP was originally built specifically for the Scrum process; more recently, it also supports the Kanban methodology. CollabNet Cloud Services (CloudForge) is glad to bring you this Agile tool as part of our cloud development platform.
SWP is based on a 3-tier Java architecture. It uses JBoss as the Java application server, and it supports MySQL as the backend database. SWP has a web client that uses the Spring web framework, and it also has a Java-based desktop client, which talks to SWP securely over HTTP.
Natively, SWP is a single-tenant Java application running on top of JBoss. The engineering challenge was this: how can we provision thousands of SWP instances in a way that is reliable (99.9%), manageable, automated, and scalable?
Cloud Architecture for SWP
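As we thought through the problem, we saw basically four options for cloud-enabling a single-tenant Java application:
- Native – rework the application to be natively multi-tenant, running on a single JBoss App Server instance.
- Single application server – run multiple SWP instances within a single JBoss App Server instance.
- Multiple application servers – run multiple JBoss App Server instances on a single dedicated server.
- Virtualization – run each SWP instance, with its own JBoss, in a separate virtual machine.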
We immediately ruled out the native and single application server options. The native option requires major surgery on the SWP application to make it natively multi-tenant, running on a single JBoss App Server instance. It would therefore be both costly and slow to bring to market. We also thought about running multiple SWP instances within a single JBoss App Server instance. This would avoid the need for multiple JBoss App Server instances and would allow more memory to be shared. This option does not work either, because SWP is not designed to share a JBoss instance with another copy of itself. Changing that behavior in SWP would also be costly and slow to bring to market.
We ultimately decided to proceed with running multiple JBoss App Server instances on a single dedicated server. Now, before I go into details on how we did that, I want to first address reasons why we did not go with the virtualization route.
The initial thought was that giving each SWP instance its own virtual machine and its own JBoss would be an easy solution. However, there are several key issues with this approach:
- Cost – with virtualization, you incur memory overhead in the hypervisor, in each virtualized OS, and in managing multiple virtual machines. Let’s say that you were able to optimize your VM, Linux kernel, Apache, Java, and OS setup down to a total overhead of 256 MB per instance (we did not actually go through this exercise, so this is just an assumption). At 1000 SWP instances, that is 1000 x 256 MB / 1024 = 250 GB of overhead. If 1 GB of memory costs $10 per month, that comes to 250 x $10 x 12 = $30,000 in additional cost annually. (We already have 150+ live instances of SWP trials from the beta period alone, and the number is growing.) On top of that comes the cost of buying VMware or Citrix Xen licenses and the management software.
- Complexity – If we went with virtualization, we would have 1000+ virtualized operating systems to manage, and the number would keep growing. We are talking about security upgrades, package upgrades, monitoring, and the list goes on. Of course, there are automation tools like Puppet and package management systems that can help, in addition to virtualization tools like vSphere, but even with the best available administration tools, this would add significant complexity and fragility to managing these systems.
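The memory estimate in the Cost point above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the 256 MB overhead and the $10 per GB per month are the assumptions already stated above, not measurements, and the function name is made up for illustration.

```python
# Figures taken from the text; they are assumptions, not measured values.
OVERHEAD_MB_PER_VM = 256   # assumed per-instance virtualization overhead
INSTANCES = 1000           # number of SWP instances
USD_PER_GB_MONTH = 10      # assumed monthly price of 1 GB of memory


def virtualization_overhead(instances, overhead_mb, usd_per_gb_month):
    """Return (total overhead in GB, extra cost per year in USD)."""
    overhead_gb = instances * overhead_mb / 1024
    annual_cost = overhead_gb * usd_per_gb_month * 12
    return overhead_gb, annual_cost


overhead_gb, annual_cost = virtualization_overhead(
    INSTANCES, OVERHEAD_MB_PER_VM, USD_PER_GB_MONTH
)
print(f"{overhead_gb:.0f} GB overhead, ${annual_cost:,.0f} per year")
# prints "250 GB overhead, $30,000 per year"
```

License and management software costs would come on top of this figure.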
Multi-App Server for SWP
The final solution that we decided on looks like the following:
Here are the details of the architecture:
- MySQL DB – customers share a MySQL server. Each SWP instance uses a separate database on the MySQL server, so each customer's data is kept separate from the others.
- Dedicated Server – customers share a dedicated server. Each SWP instance runs in its own JBoss instance.
There were several engineering challenges that we had to overcome:
- Port Management – Every JBoss instance requires 24 ports on the dedicated server. Therefore, we had to build an auto-port management algorithm that can keep track of all open ports within a specific port range.
- Memory Optimization – In order to optimize for performance, availability, and cost, we built an auto-hibernating feature. If an SWP instance has had no activity in the last 30 days, we shut down the SWP / JBoss instance. This returns both ports and memory to the server. If a customer comes back to that instance, we immediately start the JBoss instance. Typically, it takes less than 45 seconds for the instance to come back.
- Monitoring – In order to maintain our 99.9% SLA, we wrote a script that monitors every SWP instance that is live. The script also takes into account whether an account is cancelled; if it is, we shut the SWP instance down and make sure we don’t monitor it. We also check whether the instance is hibernated.
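To make the port management, hibernation, and monitoring points above more concrete, here is a minimal Python sketch. It is not our production code. The class and function names, the port range, and the choice to hand out contiguous blocks of ports are assumptions made for illustration; only the 24 ports per JBoss instance and the 30-day inactivity window come from the text.

```python
from datetime import datetime, timedelta

PORTS_PER_INSTANCE = 24               # from the text
HIBERNATE_AFTER = timedelta(days=30)  # from the text


class PortAllocator:
    """Hands out blocks of ports from a fixed range (hypothetical design)."""

    def __init__(self, first_port, last_port, block_size=PORTS_PER_INSTANCE):
        # Base port of every block that fits entirely inside the range.
        self.free = list(range(first_port, last_port - block_size + 2, block_size))
        self.in_use = {}  # instance name -> base port of its block

    def allocate(self, instance):
        if instance in self.in_use:
            return self.in_use[instance]
        if not self.free:
            raise RuntimeError("no free port block left in range")
        base = self.free.pop(0)
        self.in_use[instance] = base
        return base

    def release(self, instance):
        # Called when an instance is hibernated or its account is cancelled.
        self.free.append(self.in_use.pop(instance))
        self.free.sort()


def next_action(cancelled, hibernated, last_activity, now):
    """Decide what the monitoring pass does with one instance."""
    if cancelled:
        return "shutdown"   # stop the instance and stop monitoring it
    if hibernated:
        return "skip"       # asleep on purpose, so not an outage
    if now - last_activity >= HIBERNATE_AFTER:
        return "hibernate"  # frees the instance's ports and memory
    return "monitor"


ports = PortAllocator(20000, 20071)   # room for three 24-port blocks
first = ports.allocate("acme")        # 20000
second = ports.allocate("globex")     # 20024
ports.release("acme")                 # "acme" is hibernated
third = ports.allocate("initech")     # reuses 20000
```

Handing out fixed-size blocks keeps the bookkeeping to a single base port per instance, which is one simple way to return ports to the pool when an instance hibernates.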
Of course, any CloudForge integrated product comes with the following platform services that we had to provide:
- Enterprise Backup – Each SWP instance database and SWP attachments are backed up with our multi-layered Enterprise Backup system.
- High Availability – We have a live standby SWP server so that in case the primary production SWP server goes down, we can quickly switch to the standby server to maintain our 99.9% SLA.
- Enterprise Security – We provide a single unified security administration console for our services. For example, you can create a role and grant permissions for Git, SVN, and SWP all to the same role; our security system will sync the permissions down to the respective service. You will not need to go into each service (Git, SVN, SWP) to configure security.
- Support and 99.9% SLA – Our 24/7 live system watch process, guaranteed SLA for support (4 hour response time) and 99.9% uptime guarantee are standard as part of our business line of services.
SWP GA Results
Since SWP Beta rolled out in September 2011, we have been carefully optimizing, monitoring, and testing our SWP Cloud Architecture. Now that we are GA, we are seeing great results. In our latest analysis, each SWP instance uses between 350 MB and 600 MB of resident (physical) memory in Linux. Conservatively, we can easily accommodate 500+ SWP instances on a server with 512 GB of memory. The number of instances that we can accommodate depends on many factors, including how many of them we auto-hibernate. Currently, it takes about 5 minutes to fully provision an SWP instance. This is a bit slower than we would like, and we will improve this in the future.
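As a rough sanity check on the capacity figure, the short Python sketch below divides server memory by the per-instance resident memory range quoted above. The function name and the `reserved_gb` parameter are hypothetical, and the calculation ignores hibernated instances and everything else running on the server, so treat it as an upper bound rather than a sizing tool.

```python
def max_instances(server_gb, per_instance_mb, reserved_gb=0):
    """Upper bound on instances that fit in memory (ignores all other usage)."""
    usable_mb = (server_gb - reserved_gb) * 1024
    return usable_mb // per_instance_mb


for rss_mb in (350, 600):
    print(rss_mb, "MB per instance ->", max_instances(512, rss_mb), "instances")
# prints:
# 350 MB per instance -> 1497 instances
# 600 MB per instance -> 873 instances
```

Even at the high end of 600 MB per instance, the bound is well above 500, which is why 500+ instances per server is a conservative number.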
We are glad to have found an approach that brings SWP into the Cloud, provisions it in minutes, and can scale up to hundreds or thousands of instances if needed. More importantly, by proving out this architecture with SWP, we now have a repeatable process that we can apply to any Java-based application.