A decent (and fun!) chunk of our business is assessing organizations’ business intelligence and data warehousing capabilities. In assessments we meet with business executives and IT staff to discuss what the business needs, and will need, from BI. We then contrast that with what the organization actually has at its disposal. The result is a findings report and plan for moving forward. Assessments are usually very well received and it’s always gratifying to visit a client and see your report from three or four years ago on managers’ desks, with highlighting and sticky notes, still helping set a path forward.
Assessments give us, as consultants, a great, hands-on view of what organizations are doing in BI. One thing we see time and again is organizations that have complex infrastructures and yet no ‘master plan’ or vision for how their systems should fit together. They haven’t envisioned and designed what I call a data topology, and they suffer because of it.
Symptoms of Data Topology Issues
So, what kinds of things indicate that you need to put a bit more thought into topology? Here are a few:
- DATA QUALITY ISSUES: You have difficulty maintaining data quality because the same kinds of data exist in multiple places and, once data changes in one place, the databases are out of sync.
- DATA CONSISTENCY ISSUES: Different systems return different answers to the same question. For example, a query asking for a customer’s address might return different results depending on which database is queried.
- EXCESSIVE COST: You spend more than necessary to build and maintain your data warehouse as your ETL jobs must grab data from all of your sources and these jobs need logic to adjudicate differences in data across them.
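To make the consistency symptom concrete, here is a minimal sketch of the kind of check you could run against two systems that hold the same customer. The system names (crm, billing), the field names, and the sample records are all hypothetical, and real adjudication logic would be far more involved.

```python
def find_conflicts(records_by_system, fields):
    """Return {field: {system: value}} for fields whose values differ across systems.

    Values are compared after trimming whitespace and lowercasing, so purely
    cosmetic differences are not reported. Missing values are ignored.
    """
    conflicts = {}
    for field_name in fields:
        values = {
            system: record.get(field_name)
            for system, record in records_by_system.items()
        }
        normalized = {
            str(value).strip().lower()
            for value in values.values()
            if value is not None
        }
        if len(normalized) > 1:
            conflicts[field_name] = values
    return conflicts


# Hypothetical records for the same customer, pulled from two systems.
customer_1001 = {
    "crm": {"address": "12 Elm St", "email": "pat@example.com"},
    "billing": {"address": "48 Oak Ave", "email": "Pat@Example.com "},
}

conflicts = find_conflicts(customer_1001, ["address", "email"])
for field_name, values in conflicts.items():
    print(field_name, values)
```

A check like this only tells you that the systems disagree. It says nothing about why, which is the reason to look at the topology before reaching for tools.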
Many organizations with problems like these react in knee-jerk fashion: building a data warehouse, buying data quality tools, jumping on the data governance bandwagon, or implementing master data management (MDM) solutions. Indeed, those things might be necessary. However, I recommend that you first step back.
The conditions I noted aren’t root problems but, rather, symptoms of some root problem. For example, data quality issues can come from a number of places – from having multiple systems capturing similar data, to poor validation on data entry, to systems truncating data as it’s passed between them.
So, before making a big investment in tools and technology, I strongly recommend that you make a small investment in architecting your long term data topology.
An Architected Data Topology
What Is an Architected Data Topology?
An architected data topology is a high level design of your data infrastructure. It documents the data stores you have and how they interact with each other. It’s frequently shown simply as a diagram containing all of your data stores and the relationships / flows between them. Supporting documentation can describe the role of each data store and the subject areas of data it collects. That’s it! You can certainly add more detail and should fit things to your specific situation. And, as we’ll see, it is important to step beyond documenting your current situation – you need to design your target topology.
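If a diagram feels too informal, the same information can be kept as a small data structure. The sketch below is one possible way to do it, not a standard: the class names, the store names, and the subject areas are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataStore:
    name: str
    role: str                 # e.g. "operational system", "data warehouse"
    subject_areas: frozenset  # subject areas of data the store collects


@dataclass(frozen=True)
class Flow:
    source: str
    target: str
    description: str = ""


@dataclass
class Topology:
    stores: dict = field(default_factory=dict)
    flows: list = field(default_factory=list)

    def add_store(self, name, role, subject_areas):
        self.stores[name] = DataStore(name, role, frozenset(subject_areas))

    def add_flow(self, source, target, description=""):
        # Refuse flows that reference a store that has not been documented.
        for end in (source, target):
            if end not in self.stores:
                raise KeyError(f"unknown data store: {end}")
        self.flows.append(Flow(source, target, description))

    def overlapping_subject_areas(self):
        """Map each subject area held by more than one store to those stores."""
        by_area = {}
        for store in self.stores.values():
            for area in store.subject_areas:
                by_area.setdefault(area, []).append(store.name)
        return {
            area: sorted(names)
            for area, names in by_area.items()
            if len(names) > 1
        }


# Hypothetical current-state topology for an insurer.
current = Topology()
current.add_store("policy_admin", "operational system", {"customer", "policy"})
current.add_store("broker_portal", "operational system", {"customer", "broker"})
current.add_store("warehouse", "data warehouse", {"policy"})
current.add_flow("policy_admin", "warehouse", "nightly ETL")

overlaps = current.overlapping_subject_areas()
for area, names in sorted(overlaps.items()):
    print(area, names)
```

An overlap is not automatically a problem (a warehouse is supposed to hold copies of operational data), but every overlap is a place where you should be able to say which store is authoritative.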
Why Developing a Data Topology Architecture Is Worth the Effort
Without a plan for how your data will fit together, it’s likely that project teams will attack problems with haphazard, unintegrated, point solutions that solve the immediate pain but cause bigger, long term headaches. I’ve seen it happen; I’ve been to clients where the answer to every problem seems to be ‘build a new system.’ In a few years there are multiple, unintegrated, mission-critical systems stepping on each other.
For example, in one case, at an insurance company, customer data was captured in multiple places. Each produced different results for things like contact information and even transaction volume. Even worse, each system defined ‘customer’ differently – in some cases a customer was a policyholder but, in others, the customer was the broker who sold the policy.
How to Develop Your Data Topology Architecture
It’s really not all that hard to develop a data topology architecture, especially when compared to the effort required to implement an MDM or DW system. Just follow these steps…
1) Set Your Data Management Goals and Principles
When developing your data topology architecture you really need to consider and set your goals for data management. This is important because, as future systems are built and kept in line with the architecture, they will, by definition, be kept in line with your organization’s data management goals.
Start by developing a list of your goals for data management (DM) and the principles to which these goals must adhere. I believe that most organizations will have between five and ten DM principles. My last post discussed how to set these.
2) Document what you have today
Create a picture of where you currently stand. Later, you’ll compare this to your future (i.e. target) design to determine what tasks you’ll need to perform to attain that future (see step four, below).
3) Design where you need to be in order to meet the goals & support the business in the future
Now comes the fun part (well, actually, for the right person the whole project is a fun part) – designing your data future. To do this, combine your data management principles, your knowledge of the business and its plans, and an understanding of data management technologies to plan out your future topology.
Let’s be honest: moving to a new topology might be an expensive proposition. Thus, you might actually need two diagrams: a ‘perfect’ case and a more attainable ‘reasonable target’ case that will work and has a better chance of getting funded (remember, the perfect is the enemy of the good).
4) Develop a plan for attaining your future topology
While the picture is helpful, it’s useless if you don’t have a plan to implement it. So, the final step in this topology effort is to compare your current and future pictures to identify the projects and tasks in your implementation plan. A side benefit of this comparison is that it will also serve as a tool for explaining, to project sponsors, why this move to a new topology is important.
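The comparison itself can be quite mechanical. As a rough sketch, assuming each picture is reduced to a set of store names and a set of (source, target) flows, a simple set difference yields a first-cut task list. The store names here, including customer_hub, are hypothetical, and a real plan would still need sequencing, sizing, and judgment.

```python
def plan_tasks(current_stores, current_flows, target_stores, target_flows):
    """Derive a first-cut task list from the gap between two topologies.

    Stores are sets of names; flows are sets of (source, target) tuples.
    """
    tasks = []
    for store in sorted(target_stores - current_stores):
        tasks.append(f"build store: {store}")
    for store in sorted(current_stores - target_stores):
        tasks.append(f"retire store: {store}")
    for source, target in sorted(target_flows - current_flows):
        tasks.append(f"add flow: {source} -> {target}")
    for source, target in sorted(current_flows - target_flows):
        tasks.append(f"remove flow: {source} -> {target}")
    return tasks


# Hypothetical current and target pictures.
current_stores = {"policy_admin", "broker_portal", "warehouse"}
current_flows = {
    ("policy_admin", "warehouse"),
    ("broker_portal", "warehouse"),
}
target_stores = current_stores | {"customer_hub"}
target_flows = {
    ("policy_admin", "warehouse"),
    ("policy_admin", "customer_hub"),
    ("broker_portal", "customer_hub"),
    ("customer_hub", "warehouse"),
}

tasks = plan_tasks(current_stores, current_flows, target_stores, target_flows)
for task in tasks:
    print(task)
```

The output is only a starting list, but it is also a handy one-page summary for sponsors of what changes and what stays.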
Keeping it Fresh: The Role of the Architect
Don’t stop with the plan, or even with its implementation. Long term success demands that you employ and assign an architect to participate in IT projects and planning. This person is the ‘voice of the future’, making sure that project teams don’t go ‘off the rails’, developing systems that aren’t in line with your architectural vision. This architect is key to ensuring that projects adhere to the principles and designs you’ve worked hard to attain.
OK, this post ran way longer than I expected (even after I separated out the part on setting your principles). Nonetheless, documenting your current topology and designing your future one isn’t as hard as it seems. I suspect that, with concerted effort, most medium-sized organizations can complete the task in about a month. The benefits will last for years. So, get started!