It’s 2015. Traditional software development decided long ago that the only sane way of developing code is to manage it through a version control system. Today there are many version control systems to choose from, both centralized and distributed, and all of them offer the same advantages: traceability, the ability to coordinate work from multiple developers, instant checkout of old versions, etc. They are all in wide use throughout the industry and nobody disputes the benefits they bring to the table.
Data Virtualization projects are no different from those traditional software projects. They have the same characteristics that make version control not only desirable but an essential part of the development process:
- They tend to be complex projects, involving several IT assets and multiple consumers of data.
- They are almost invariably multi-developer, with a team of engineers working on the same code base in parallel.
- They evolve with time, with new features being added in parallel with bug fixing on the current code base.
- They need to have a managed process for promotion through the different environments (development, testing, production, etc.).
- They offer a service to consumers of data, either internal or external, and have to fulfill the requirements (both functional and nonfunctional) that were agreed with those consumers.
- They may need to revert changes and go back to a previous version, and the team must be able to determine when a defect was introduced and which changes in the code base caused it.
These are the typical reasons that push everybody to use version control for traditional software projects, and they apply equally to Data Virtualization. The reverse also holds: if you don’t use version control with Data Virtualization, you’re exposing your team to unnecessary risks:
- What happens when a defect pops up during development?
- How do you diagnose the issue?
- How do you determine when the problem appeared, and how to correct it?
- How do you apply the fixes to the right version of the project?
- How do you carry the fixes to future versions of the project?
Take a look at your Data Virtualization objects. Seeing a list of views named like:
- criticalview
- criticalview_2
- criticalview_2_old
- criticalview_new
- criticalview_new_new_final
- etc.
should be a wake-up call. When a problem appears, is your team confident that they can find which view or views are used and which ones are old copies, which view is the one causing the problem, and how to fix it, all under pressure? Don’t shoot yourself in the foot – version control is your most precious ally in these stressful situations.
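If you want a quick way to spot this pattern in your own catalog, the check is easy to script. The following is only a rough sketch: it assumes you have already exported your view names into a plain Python list, and both the function name `group_suspected_copies` and the suffix list are made up for illustration and are not part of any Denodo API.

```python
import re
from collections import defaultdict

# Suffixes that usually signal a hand-made copy rather than a managed version.
# This list is an assumption for illustration; adapt it to your team's habits.
_COPY_SUFFIX = re.compile(r"(_(old|new|final|copy|bak|v?\d+))+$", re.IGNORECASE)


def group_suspected_copies(view_names):
    """Group view names that differ only by trailing copy-style suffixes.

    Returns a dict mapping each base name to the sorted list of names that
    share it, keeping only the groups with more than one member.
    """
    groups = defaultdict(list)
    for name in view_names:
        base = _COPY_SUFFIX.sub("", name)  # strip every trailing copy suffix
        groups[base].append(name)
    return {base: sorted(names) for base, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    views = [
        "criticalview",
        "criticalview_2",
        "criticalview_2_old",
        "criticalview_new",
        "criticalview_new_new_final",
        "orders",
    ]
    for base, names in group_suspected_copies(views).items():
        print(base, "->", names)
```

Every group this reports is a candidate for cleanup: one view that lives in version control, with its history, instead of several copies that live only in people's memory.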
Once you’ve decided to use version control with Data Virtualization, you need to define your branching and versioning policies. These are typical decisions for the software development process and will help you ensure a high quality of deliverables to your consumers. Among the decisions to make are:
- How many branches to keep? One per version of the product? One per patch? One per developed feature?
- How is the product version assigned? Is it a traditional major.minor? major.minor.update? Or is the version just the date of publication?
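Whichever scheme you pick, make sure your tooling compares versions as structured values and not as plain text. Here is a minimal sketch of the two styles mentioned above. The helper names are hypothetical, and the ISO `YYYY-MM-DD` format for date-based versions is an assumption; your team may prefer a different date format.

```python
from datetime import date


def parse_numeric_version(text):
    """Parse 'major.minor' or 'major.minor.update' into a comparable tuple.

    A missing update component is treated as 0, so '2.10' becomes (2, 10, 0).
    """
    parts = text.split(".")
    if len(parts) not in (2, 3):
        raise ValueError(f"expected major.minor[.update], got {text!r}")
    numbers = [int(part) for part in parts]  # raises ValueError on non-numeric parts
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def parse_date_version(text):
    """Parse a date-of-publication version, assumed to be in ISO YYYY-MM-DD form."""
    return date.fromisoformat(text)


if __name__ == "__main__":
    # Plain string comparison gets this wrong: "2.10" sorts before "2.9".
    print(parse_numeric_version("2.10") > parse_numeric_version("2.9"))
    print(parse_date_version("2015-05-20") < parse_date_version("2016-03-22"))
```

Tuples and dates compare component by component, which is why release 2.10 correctly comes after 2.9 here even though the strings sort the other way.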
Ideally, the Version Control System (VCS) should be used in tandem with a defect tracking system, so your clients can file tickets with your team and your changes in the code have those tickets associated in the VCS for traceability purposes. But even if you are not using a defect tracker, version control still gives you enough advantages to be used in a standalone fashion.
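One lightweight way to get that traceability is to require a ticket reference in every commit message and check for it automatically. The sketch below assumes ticket IDs look like `DV-42` (an uppercase project key, a hyphen, and a number). That format and the function name `ticket_ids` are illustrative assumptions, so match the pattern to whatever your defect tracker actually generates.

```python
import re

# Assumed ticket format: uppercase project key, hyphen, number (e.g. "DV-42").
TICKET_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def ticket_ids(commit_message):
    """Return the ticket IDs referenced in a commit message.

    Duplicates are dropped and the order of first appearance is kept.
    """
    return list(dict.fromkeys(TICKET_PATTERN.findall(commit_message)))


if __name__ == "__main__":
    message = "DV-42: fix join condition in criticalview (see also DV-7)"
    print(ticket_ids(message))
    print(ticket_ids("quick fix") or "no ticket referenced")
```

A check like this could run in a commit hook or in your build pipeline, rejecting or flagging commits that reference no ticket at all.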
The Denodo Platform integrates seamlessly with both centralized version control systems (such as Subversion or TFS) and distributed systems (such as Git), so it does not matter which style your company prefers. The platform accommodates your preferred methodology for branching, checking code in and out, resolving merge conflicts, patching, etc. And if you don’t have a process in place today, there is a lot of documentation available to help you choose one that suits your team.
You have no excuse not to use version control with Data Virtualization – do not wait until it’s too late.
- Version Control is Non-negotiable - May 20, 2015