When done correctly, DevOps provides solid business benefits such as shorter delivery cycles, better quality, rapid innovation and increased collaboration. However, some organizations may not be aware that the DevOps process is often not carried through to the data layer. The common refrain is that "dealing with data is hard!" Yet in most organizations, the data produced or processed by applications is much more important than the applications themselves. Does it not deserve the same DevOps benefits that we get for our applications?
But if this is so critical, why is DevOps not being adopted for data? As some organizations are becoming aware, adopting DevOps for data can be concerning for application developers, who might not be as comfortable managing data and databases, and for data professionals, who are not comfortable with the fast pace of DevOps. They have valid concerns. Agility is not a quality associated with most data platforms. Updating data schemas, or the actual values, can be slow, especially if you are dealing with large amounts of data. There is also significant risk associated with updates: if something goes wrong while updating the schema, there is potential for data loss. Depending on the application, that might be an unacceptable risk, and extraordinary steps may be needed to mitigate it.
Incorporating data into DevOps has become a significant enough challenge that a community of practice has developed around it. These practices are often referred to as Data DevOps, and they describe how to apply DevOps tools and techniques to data. However, data and DevOps really should not be separate disciplines. If you do not make data a principal consideration in your DevOps efforts, you will not be able to realize all the benefits associated with a DevOps-oriented organization.
Let us drill into some of the benefits and examine the repercussions associated with not factoring data into your DevOps implementation.
Shorter delivery cycles
DevOps, with its focus on continuous integration and delivery, promotes faster delivery of features to the end user, resulting in a shorter time to value. But organizations that leave data out of their DevOps implementation will see significant slowdowns any time a new feature involves data changes or schema updates, or otherwise impacts the data layer. To get the benefit of shorter delivery cycles, you must be able to speed delivery of all parts of your applications, including the data layer. If you do not, development efforts and releases will be delayed whenever it is time to deploy data and schema changes.
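One common way to let schema changes travel through the same pipeline as application code is to keep them as ordered, version-controlled migrations. The following is a minimal sketch of that idea, not a recommended tool: it uses Python's built-in sqlite3 module with an in-memory database, and the `MIGRATIONS` list, the `schema_version` table and the `customers` table are all hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical ordered migrations. In a real project these would typically
# live in version-controlled files next to the application code.
MIGRATIONS = [
    (1, "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE customers ADD COLUMN email TEXT"),
]


def current_version(conn):
    """Return the highest applied migration version, or 0 if none."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)"
    )
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(conn):
    """Apply every migration newer than the recorded version, in order."""
    applied = []
    start = current_version(conn)
    for version, statement in MIGRATIONS:
        if version > start:
            conn.execute(statement)
            conn.execute(
                "INSERT INTO schema_version (version) VALUES (?)", (version,)
            )
            conn.commit()
            applied.append(version)
    return applied


conn = sqlite3.connect(":memory:")
first_run = migrate(conn)   # applies both migrations
second_run = migrate(conn)  # nothing left to apply
columns = [row[1] for row in conn.execute("PRAGMA table_info(customers)")]
```

Because each run applies only the migrations that have not yet been recorded, the same step can run on every deployment without special handling for releases that happen to include schema changes.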
Better quality
The improved quality of features delivered through DevOps is tightly tied to the increased amount of test automation in continuous integration and continuous delivery (CI/CD) pipelines. Test automation for applications has become a standard part of these processes. However, automated data testing and validation is just as critical. Data-centric processes such as data integration, reporting and analytics are essential to modern applications, and they need to be covered by automated tests just like the application functionality.
Automated data testing is also critical because data is the most volatile part of your system. Your DevOps pipeline controls the application code but data flows into your organization from many channels. Partners, legacy applications, third-party applications and direct user input are sources of data for most organizations. Your teams must validate that your applications can handle the data imported from a wide variety of sources.
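As an illustration of what an automated data check might look like, here is a minimal sketch of a record validator that a pipeline could run against incoming data. The field names (`customer_id`, `email`, `signup_date`) and the rules themselves are assumptions made up for this example; real rules would come from your own data contracts.

```python
from datetime import date

# Hypothetical required fields for an incoming customer record.
REQUIRED_FIELDS = {"customer_id", "email", "signup_date"}


def validate_record(record):
    """Return a list of human-readable problems; an empty list means valid."""
    missing = REQUIRED_FIELDS - set(record)
    if missing:
        # Stop early: the remaining checks need every field to be present.
        return ["missing fields: " + ", ".join(sorted(missing))]

    errors = []
    customer_id = record["customer_id"]
    if not isinstance(customer_id, int) or customer_id <= 0:
        errors.append("customer_id must be a positive integer")
    if "@" not in str(record["email"]):
        errors.append("email must contain '@'")
    try:
        date.fromisoformat(str(record["signup_date"]))
    except ValueError:
        errors.append("signup_date must be an ISO date (YYYY-MM-DD)")
    return errors


good = {"customer_id": 7, "email": "a@example.com", "signup_date": "2023-01-15"}
bad = {"customer_id": -1, "email": "not-an-email", "signup_date": "15/01/2023"}
partial = {"customer_id": 1}

good_errors = validate_record(good)
bad_errors = validate_record(bad)
partial_errors = validate_record(partial)
```

Returning a list of problems rather than raising on the first one lets a pipeline report everything wrong with a batch from a partner or legacy system in a single run.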
Rapid innovation
Another benefit of DevOps is the ability to innovate much more rapidly. Organizations recognize the need for the business to introduce new features and services quickly. By releasing incremental updates more frequently, it is easier to pivot and adjust to deliver the most compelling features to users. However, this benefit depends on two underlying factors: quality and recoverability. Without them, DevOps teams will not have the confidence to try new things.
Quality, as mentioned above, can be confirmed through data testing and validation. It provides a safety net that lets developers make rapid changes and confirm that they have not inadvertently affected another part of the system. Recoverability is the ability to recover from bad deployments or features that do not work as expected. For application code, recovery is often simple: deploy an updated version of the application code to replace the code that is not working as desired.
For data, recoverability can be more complicated – it may mean rolling back any data changes made, restoring schemas, etc. It is critical to make sure your teams have a plan in place for how to address recoverability for your data in a DevOps scenario. Addressing recoverability also entails knowing when something is broken. This requires having monitoring and visibility into systems all the way through the delivery pipeline.
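One way to build recoverability into a data change is to run the change and a verification step inside a single transaction, so that a failed check leaves the data untouched. The sketch below assumes SQLite through Python's built-in sqlite3 module; the `orders` table, the `apply_change_safely` helper and the row-count check are hypothetical. Not every database engine can roll back schema changes inside a transaction, so treat this as an illustration of the pattern rather than a universal recipe.

```python
import sqlite3


def apply_change_safely(conn, statements, check):
    """Run statements in one transaction; commit only if check(conn) passes."""
    conn.execute("BEGIN")
    try:
        for statement in statements:
            conn.execute(statement)
        if not check(conn):
            raise ValueError("post-change verification failed")
        conn.execute("COMMIT")
        return True
    except Exception:
        # Undo everything done since BEGIN, if the transaction is still open.
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        return False


def row_count_unchanged(conn):
    """Hypothetical check: the change must not drop any of the 3 orders."""
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0] == 3


# isolation_level=None puts the connection in autocommit mode, so the
# explicit BEGIN / COMMIT / ROLLBACK statements above control transactions.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)", [(50.0,), (150.0,), (250.0,)]
)

# A faulty change that deletes rows: the check fails and it is rolled back.
bad_ok = apply_change_safely(
    conn, ["DELETE FROM orders WHERE amount < 100"], row_count_unchanged
)
count_after_bad = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

# A safe schema change: the check passes and it is committed.
good_ok = apply_change_safely(
    conn,
    ["ALTER TABLE orders ADD COLUMN currency TEXT DEFAULT 'USD'"],
    row_count_unchanged,
)
columns = [row[1] for row in conn.execute("PRAGMA table_info(orders)")]
```

For changes that cannot be wrapped in a transaction, the same idea usually takes the form of a tested backup or a paired "undo" migration, prepared before the change is deployed.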
Increased collaboration
A final benefit that many DevOps practitioners realize is increased collaboration among the people responsible for delivering applications to production, which is a core tenet of what makes DevOps work.
This enhanced collaboration ensures that facets of the delivery process are not left out or forgotten. It helps ensure that everyone is considering all aspects of successful application delivery. The data professionals who create and maintain the data play a key role in the process. They help others understand how the applications interact with the data, and how the data flows through various parts of the organization. A clear understanding of the data lineage ensures that everyone is aware of the impacts of potential changes.
Make data a priority
To truly realize the benefits of DevOps, organizations must give data the same priority in the DevOps process as applications. Do not let data be the blind spot in your DevOps implementation: you need the same visibility, automation and transparency for your data that you expect for your applications.
As the best practices, expertise, and community around data in DevOps continue to evolve, more organizations will realize the full value of their DevOps investments.