Bridging The Data Gap Between Legacy And Modern Data Systems

Building a hybrid out of the old and new


Just about any company that has been around for more than 10 years has legacy systems in place. Some of those systems serve their functions well and are not likely to be undone any time soon. There is a time and place for a relational database to do its job, but before long, pulling data from existing CRM and ERP systems and merging it with newer types of data stores becomes necessary. The challenge is building a data bridge between the old and the new.

We must first consider the dimensions of the legacy systems. The first dimension is the data itself. The data from legacy systems is typically relational data from CRM, ERP, and other enterprise applications. The second dimension we must consider is the infrastructure. Legacy systems tend to be on-premises, behind firewalls in a bounded and constrained infrastructure.

The modern data warehouse is made of both structured and unstructured data coming from multiple sources into hybrid systems that may be off-premises in a cloud. The data and the infrastructure are unbounded, the opposite of a legacy system. Therefore, the bridge between the old world and the new world can be complicated and intimidating, especially when most IT teams are already very busy with maintenance, bug fixes, and just keeping things running. Tackling such a comprehensive shift can be daunting.

First, the good news. Don’t rip out and replace what you have. Instead, look for ways to build a hybrid out of the old and the new. My advice is to think cloud first. It’s the key to moving forward, so the cloud should be your default placement for new applications when feasible.

Second, start with non-mission-critical applications. A good way to start blending legacy data with newer sources is to embrace social and mobile data. You don’t have to disrupt your old CRM systems to start capturing input from social media and mobile applications. Social can give you extra insights into customer feedback and sentiment. Perhaps use a mobile application for employee directories and benefit information. These are options that will not impact your revenue-generating systems, but will help you start integrating new information that can please customers and employees.

When it comes to more advanced and modern data systems, one must consider that big data stresses the three Vs: volume, velocity, and variety of data. As we collect more data from more sources, tools built to handle that scale become a necessity. And the cloud is a must-have for most companies dealing with big data. Twenty years from now, every application will most likely be in the cloud and on-premises systems will fade. Therefore, the only reasonable long-term option for new applications is the cloud. All applications have a shelf life, so as new applications replace the old, put them in the cloud, be it public, private, or hybrid. That will provide a solid migration path for your legacy systems over time and prevent you from being locked into a solution that isn't compatible with future technologies.

As you start integrating old and new data sources, there are a couple of options to consider:

  • 1) Pull data from one system and move it to another so the data can be more easily combined, or
  • 2) Use data federation and virtualization to pull several sources together into a new system.

Each option has pros and cons, but in the end, there’s no free lunch. You’ll have to choose which option has the optimal balance for your organisation.

Option #1 can provide fast analytics once you've moved the data. However, the process of moving the data can be painfully slow. Even with an ETL tool, moving large data sets can take weeks. Data sets can behave like “mass with gravity”: large data sets are hard to move, and the bigger they get, the longer it takes and the harder it is to budge them. But if your data set is smaller, an ETL transfer is a viable option.
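To make option #1 concrete, here is a minimal sketch of a batched copy, assuming two in-memory SQLite databases stand in for the legacy system and the new store. The `customers` table, its columns, and the `copy_in_batches` helper are hypothetical names chosen for illustration; a real ETL tool would add transformation, retries, and incremental loads.

```python
import sqlite3


def copy_in_batches(source, target, batch_size=2):
    """Copy the hypothetical `customers` table from source to target in batches."""
    target.execute(
        "CREATE TABLE IF NOT EXISTS customers (id INTEGER PRIMARY KEY, name TEXT)"
    )
    cursor = source.execute("SELECT id, name FROM customers ORDER BY id")
    copied = 0
    while True:
        # Extract: read a bounded batch so memory use stays flat as the data grows.
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        # Load: write the batch and commit before fetching the next one.
        target.executemany("INSERT INTO customers (id, name) VALUES (?, ?)", rows)
        target.commit()
        copied += len(rows)
    return copied


# Stand-in for the legacy relational system (assumption: a tiny sample table).
legacy = sqlite3.connect(":memory:")
legacy.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
legacy.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex"), (3, "Initech")],
)
legacy.commit()

# Stand-in for the new data store.
warehouse = sqlite3.connect(":memory:")
copied = copy_in_batches(legacy, warehouse)
```

The cost here is paid up front: every row crosses the wire once, which is why the transfer time grows with the size of the data set, but queries against `warehouse` afterward are purely local.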

With option #2, the problem is that the data is often not compatible, and the set of tools that can interact with non-relational databases is quite limited. Option #2 also creates a lot of network traffic at query time because every time you ask a question, you’re querying databases in multiple locations, possibly all over the world. This can cause significant latency issues.
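Option #2 can be sketched in the same spirit. The example below assumes an in-memory SQLite table plays the legacy CRM and a list of JSON strings plays a newer, non-relational social feed; the `mentions_per_customer` function and all field names are hypothetical. Nothing is copied ahead of time: both sources are read each time the question is asked, and the join happens in the querying process.

```python
import json
import sqlite3
from collections import Counter


def mentions_per_customer(legacy_db, social_feed):
    """Join relational customer rows with JSON social mentions at query time."""
    # Source 1: the relational system, queried live (id -> name).
    names = dict(legacy_db.execute("SELECT id, name FROM customers"))
    # Source 2: the non-relational feed, parsed live.
    counts = Counter(json.loads(doc)["customer_id"] for doc in social_feed)
    # Join in memory; mentions of unknown customers are dropped.
    return {names[cid]: n for cid, n in counts.items() if cid in names}


# Stand-in for the legacy CRM (assumption: a tiny sample table).
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
crm.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])
crm.commit()

# Stand-in for a social media feed delivered as JSON documents.
feed = [
    '{"customer_id": 1, "text": "Great support"}',
    '{"customer_id": 2, "text": "Slow delivery"}',
    '{"customer_id": 1, "text": "Love the new app"}',
    '{"customer_id": 9, "text": "Unknown account"}',
]

result = mentions_per_customer(crm, feed)
```

In a real deployment each of those two reads would be a network round trip to a remote system, repeated on every query, which is where the traffic and latency described above come from.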

There are many business intelligence systems on the market today that can help you manage your data transition. You’ll find that there are some very simple systems that look good cosmetically, but they may not be enterprise-grade solutions. These systems can seem pretty and quick, but security, scalability, and sharing across the enterprise are often sacrificed in the name of simplicity. A system like TIBCO Analytics provides a hybrid solution on demand. It can access data from the larger, legacy databases as needed and process it locally for faster interactivity. Further, it can be deployed on-premises or in the cloud. That suits both the experimentation phase and broader enterprise implementation, which helps you bridge both the data and the infrastructure.

Looking ahead, you may want to incorporate capabilities like predictive analytics. While historical analytics can tell you what happened in the past, predictive analytics lets you understand what might occur and take appropriate action before there’s a problem. Furthermore, advanced capabilities like streaming analytics can provide insights into data in motion, enabling effective action in real time before data even lands in a data store. As you plan for the future, your selection of IT products and architectures should help you manage your existing data as well as these future visions. As with any good bridge, your IT bridge should have a carefully designed architecture, solid support, and the ability to handle all kinds of traffic, from vintage loads to the newest innovations. With a well-planned bridge strategy, you can navigate the inevitable obstacles much more easily.
