The big data and predictive analytics landscape continues to shift rapidly as old practices are phased out and new technologies enter the mainstream on a virtually constant basis.  As these changes play out, business and IT managers face immense challenges in staffing their teams with the right people: people with the skills to solve today’s problems who can also adapt to the future that lies ahead.


The Traditional IT Staffing Model

The current model assumes that business knowledge and technical knowledge are mutually exclusive qualities for the majority of employees.  Technical and managerial responsibilities are clearly demarcated and assigned to those with the respective skills.


Business analysts are those employees whose core competency lies in understanding the challenges and needs of the business itself.  While they are familiar with what technology can do, they are generally not sufficiently trained or interested in making the technology serve their purposes.  These are the people who might identify, for example, that the organization could improve if it found commonalities across customers that the company has lost over time. They aren’t, however, the people who can find those commonalities.


Developers, on the other hand, are (hopefully) strong technicians, but they often don’t understand the business value of what they’re creating. They respond to requests from BAs and implement them in software.


Good architects understand both what the business needs and the technologies required to satisfy those needs.  They know where to get the data from, how to manage it and how the technology will play a role in making that data accessible and actionable for the business users. Further, while they may not do the work themselves, architects have the skills necessary to 1) direct developers in implementation and 2) ask BAs the right questions to make sure the finished software will meet the business need.


The Future Paradigm

In contrast to this model, the future sees the architect role taking a back seat as companies move away from custom designs running on local hardware toward packaged systems run and managed in the cloud. You needn’t, for example, architect a new CRM system if you’re starting with a packaged CRM system.


The International Institute for Analytics predicts that successful companies will take the necessary steps to break down the divide between technology and business teams.  They identify business acumen as a highly sought-after quality in analytics professionals and believe that companies will do what is necessary to bring developers into the strategy fold.  Similarly, business analysts will need to know more about the technologies that will allow them to unlock strategic insights hidden in data.


The idea, as we have mentioned in previous posts, is to develop as many citizen data scientists as possible.


What does this mean for your staffing today?


Given these trends, here are our recommendations:

  • The bar for new business hires is rising dramatically as data analysis becomes a more integral component of strategy.  You shouldn’t be satisfied with just Excel or Access skills.  Demand that entry-level employees be able to perform basic queries in SQL, and perhaps even know a vendor-specific BI system. Alternatively, be prepared to train them in data access skills.
  • As the responsibilities of IT continue to shift to business management, it will become increasingly impractical to maintain a large technology staff.  As a result, IT departments will have to consider carefully whether their best option is to hire or to contract talent to achieve near-term objectives.  As we have noted before, the IT department as we know it stands a strong chance of disappearing completely in the near future. Your staffing plans should account for this reality.
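To make the “basic queries in SQL” bar concrete, here is a minimal sketch of the kind of query an entry-level hire should be able to write. It uses Python’s built-in sqlite3 module and an invented customer table; the names and numbers are purely illustrative:

```python
import sqlite3

# Hypothetical in-memory customer table for illustration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, region TEXT, churned INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "East", 1), (2, "West", 0), (3, "East", 1), (4, "West", 1)],
)

# The kind of basic query an entry-level hire should be able to write:
# lost (churned) customers counted by region
rows = conn.execute("""
    SELECT region, COUNT(*) AS lost
    FROM customers
    WHERE churned = 1
    GROUP BY region
    ORDER BY lost DESC
""").fetchall()
print(rows)  # [('East', 2), ('West', 1)]
```

Anyone comfortable with SELECT, WHERE, GROUP BY, and ORDER BY can already answer questions that would take considerable manual effort in a spreadsheet.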



Is something lacking in your technology team?  Dataspace offers a variety of big data, analytic, and BI staffing solutions to help you realize the value of your data assets.  Unlike traditional tech recruiters, we know how to identify top data analysts, data architects and data scientists for our clients.  We work with some of the largest companies in the US, and many tell us that their success rate with Dataspace staff is far higher than from their other staffing vendors.



People in the tech world love to speculate on the future structure of the corporate IT department.  While we cannot know precisely how this change will manifest, there are undeniable trends towards a more engaged and decentralized IT presence in organizations.  This may very likely end in the elimination of the IT department as we know it.


IT departments may disappear completely


For the most part, IT departments do two things: 1) They build new systems and 2) they keep systems running. But, look at recent trends:


The rise of the cloud: Every day we hear about organizations that are building new systems or moving existing ones onto Amazon’s AWS or Microsoft’s Azure. Ten years ago, IT departments were literally plugging cables into machines and replacing hard drives by hand. Now Amazon and Microsoft, not your IT department, do those jobs.


The rise of packaged applications: Almost everything a business needs now comes as a packaged application. Companies don’t build CRM, ERP, sales, timekeeping, or virtually any other kind of system anymore – they buy them. Of course, these generic packages aren’t perfect and they need to be configured. However, configuration is frequently far less technically intensive than writing code and can often be handled by moderately technical business users.


The rise of data skill distribution: When a manager has a pressing question, it is more efficient to get answers from the people in his department who really understand the problem.  This explains the rise of what has been called the citizen data scientist.  In the coming years, business employees across the organization will be expected to know how to manipulate data, and the population of citizen data scientists will increase sharply. This is a trend we at Dataspace recognized a few years ago; we now see evidence emerging firsthand.  For example, one of our clients, a Fortune 100 company, has started to pare back its IT department while simultaneously requiring that its business users actively develop their personal technology skills.


Given these trends, one has to ask, “Is there really a future for the IT department?”

Regardless of whether or not IT departments disappear from the face of the Earth, we can be sure that a decentralization of technology expertise will occur.  The line between business operations and IT will keep getting blurrier.


How should IT and business people plan for the future?


How do you build and protect your career given the coming shift of technology responsibilities from IT to the business? The answers are, actually, pretty straightforward. In both cases, however, the future is one of skill diversification. It won’t, for example, be enough to be a great claims analyst. You’ll need to be a claims analyst who also knows how to find, manipulate, and analyze data.


If you’re in IT, you need to develop an understanding of your company’s business, what it does, and where it’s heading. In the future, business managers won’t toss technical problems over the wall to IT and wait for answers. They’ll toss those problems to technically savvy people in their own departments. To really understand the need, you need to really understand the business. Also, you need to strengthen your grasp of cloud technologies and how to manage them.


If you’re in a business department, perhaps start by learning basic data science concepts.  There are plenty of resources on the web, many of them free.


Want Help Planning for Your Company’s Future?


If you’d like to discuss what the future holds for your company and department, drop us a line. We’d love to talk!

big data

I know I have data, but is it big data?


Companies often find themselves wondering whether or not they have big data and questioning whether they even need business intelligence and predictive analytics to improve their business strategy.  The reality is that you probably have more data than you think you do, but that the systems are not in place to make it useful.  It could also be that you know fully how powerful your data assets could be, but find the task of cataloging, categorizing and processing all of that data too daunting.


What does Big Data mean these days, anyway?


Yes, it’s true that Big Data is often just that – big.  These days it’s impossible to think of an action or process that isn’t being recorded by a program or sensor and transformed into digital data.  The resulting data sets are huge, ever-growing and often quite messy.  IT research giant Gartner defines big data as “high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation.”  Or as Wikipedia describes it, “Big data is a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them.”


The moniker itself has come to refer less to the size of the data than to the use of predictive analytics and other methods of analysis and extraction that give the data their sought-after value.


Data in 2017 can come from anywhere, and it isn’t always easy to categorize, much less analyze.  Data is pouring into your company from Instagram photos, customer service tweets, phone calls with clients and elsewhere.  As the means of collection expand, data becomes increasingly unstructured, and adopting advanced modeling and analytical capabilities becomes paramount to business success.  New technologies are advancing rapidly to deal with these less structured forms of data, with trends such as artificial intelligence and machine learning already in the mainstream.  Staying abreast of these developments can help you solidify your competitive edge.


What this means for your organization


At this point you might be thinking: “Is my business really this complex?  Are there really actionable insights hidden in this data that I am not seeing?”  You aren’t sure whether or not you need these advanced processes to draw value from your data.  And to be fair, it really could be that off-the-shelf solutions like Excel and Access are giving you all of the insights you need.


As we have noted in previous blogs, big data to one organization may not be big data to another.  However, if you feel that you need your data to do more, then you probably suffer from technology or skill set deficiencies that prevent you from unleashing the power of your data assets.


That is not to say that you should just run out and start comparison shopping solution vendors.  First, carefully consider the business problem(s) you face.   Perhaps start by asking: what are the top three questions that, if I could answer them, would provide me with the biggest impact?  Then assess what the best technologies are to answer those questions and make sure that you have the people resources available to use them.


Do you feel like you aren’t getting enough value out of your data assets? Drop us a line and let us know where you need help.  

Remember that making data work for you is as much about people as it is about technology.  Dataspace has learned a lot about end-user engagement with predictive software and business intelligence infrastructures over the past two decades.  After all, it is not the products but the users of those products, the people who make the day-to-day decisions, who ultimately determine whether or not your organization uses data effectively.


Create a culture that celebrates data exploration


Unlocking the value of analytics lies in turning insights into action, and that responsibility ultimately falls on the end-user.  Oftentimes, this means skill development is required so that reporting end-users don’t have to wait for IT to weed out problems, define the available data, or answer questions about the platform. More importantly, you must foster a data-driven culture to encourage users to make data an integral part of their jobs and to discourage the typical reluctance that comes with adopting new technology.  In most companies transitioning towards a data-driven strategy, surveys show that adoption rates hover only around 22% (Gartner, 2013).  This resistance can mean valuable insights never translate into action.


What does a data-driven company culture look like?


Strong, data-driven cultures:


  • Value and reward employees making data-supported contributions in meetings
  • Recognize employees who make the effort to use technology investments by tracking usage information
  • Follow suit; if management isn’t using data, nobody else will feel that they have to, either
  • Minimize the role of strict processes to incentivize employee exploration and innovation


When Excel is all you know


The main responsibility of end-users is to understand and address the business problems at hand, not to design data warehouses or carry out the other responsibilities of the technical team.  That is not to say they aren’t intelligent and capable – it simply isn’t their role in the organization.


So when these users have a problem, they normally resort to the tools they have the technical aptitude to use: Excel and Access.  In the absence of well-structured, integrated and complete data stores and analytic tools, these workarounds may not be the best solution.  But when it comes to developing citizen data scientists among your employees, Excel should not be underestimated as an incredibly powerful tool.  With Excel, you don’t need to know any programming languages to analyze data; regression, pivot tables, what-if analyses, A/B analysis and many other statistical and analytical tools are right at your employees’ fingertips.
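As a rough illustration of the same kind of analysis outside Excel, here is a pivot-table-style summary written in plain Python. The sales figures are invented for the example; the point is that grouping and averaging, the bread and butter of pivot tables, takes only a few lines:

```python
from collections import defaultdict
from statistics import mean

# Toy sales data standing in for a spreadsheet (hypothetical numbers)
rows = [
    {"region": "East", "month": "Jan", "sales": 100},
    {"region": "East", "month": "Feb", "sales": 120},
    {"region": "West", "month": "Jan", "sales": 80},
    {"region": "West", "month": "Feb", "sales": 90},
]

# Pivot-table-style summary: average sales by region
pivot = defaultdict(list)
for r in rows:
    pivot[r["region"]].append(r["sales"])
summary = {region: mean(vals) for region, vals in pivot.items()}
print(summary)  # {'East': 110, 'West': 85}
```

A citizen data scientist who outgrows Excel can carry the same mental model (rows in, grouped summary out) straight into SQL or Python.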



Do you wish you had more employees making data-supported decisions?  Drop us a line and let us know where you need help.  

data driven

By now, we all know that big data and predictive modeling are at the heart of any successful and competitive corporate strategy.  But making data-driven decisions means more than just selecting software or implementing a BI system; it means making sure the people in your organization understand their roles in applying these systems effectively when it comes to solving problems.


Teamwork is key


A classic problem in analytical system implementation is bridging the disconnect between the people designing the system and the people using it.  The business side has the data and understands the problem, but doesn’t know how to prepare data for analysis, much less build the analytics infrastructure necessary to properly model and analyze the data.  Conversely, the developers who do have these technical capabilities usually don’t have a grasp of the nature of the problem and what is really needed by the people on the business side. So cross-functional collaboration becomes indispensable.


Develop subject matter experts


It is essential to centralize best practices with respect to how data is used within the organization.  This ensures that common methodologies and standards are adhered to when different parties use the tools you have at your disposal.  One way to do this is by designating subject matter experts in every department that uses your BI systems.  An SME is traditionally a person with a solid understanding of the challenges in their department. Now, that SME must also possess a strong aptitude for data manipulation.  They are tasked with creating tailored reports and templates within their department to give the data the desired local context.  Additionally, they can troubleshoot issues without having to consult IT. With this structure, management can more easily ensure that data and processes are handled uniformly across the organization.  And while every employee should have a basic knowledge of how to use the systems at their disposal, don’t waste time training everyone in functionality that isn’t frequently relevant to their responsibilities.


At the end of the day, local departments are where data-driven decisions will make the most impact on your organization’s bottom line.


Enter the citizen data scientist


We are well aware by now that enduring the rigor to become a data scientist is almost as challenging as finding one to work for you.  However, finding people with basic statistics and modeling skills is as easy as looking at the engineers and analysts you already employ.  You probably already have a number of what we call “citizen data scientists” among your staff.


The reality today is that many of the tools and technologies needed to do the work of a data scientist are cheaper and more accessible to the average employee.  Today’s workers grew up with computers and the internet; they’re frequently more comfortable with technology than previous generations and are more inclined to use free tools such as Codecademy to learn new skills.  Obviously, you are still going to need real data scientists to handle the mathematical heavy lifting.  But the more citizen data scientists you create within your organization, the stronger your chances of unlocking the predictive potency of your data.


Increase chances of success with an executive sponsor


The likelihood of an implementation succeeding and weathering any growing pains increases dramatically with executive support.  Of course, the CEO might understand that big data is crucial to success, but that doesn’t mean he or she can remain engaged and visible as the project takes shape.  In addition to all the boots on the ground, a senior manager at the executive level who truly believes in the vision should be designated to advocate for the project.  This means providing visibility to the BI team’s actions and supporting them with the necessary resources.  It is their responsibility over the long term to remain cognizant of the details and benefits of the project, address challenges head-on and share this awareness with the other stakeholders involved in the project.   A survey conducted by Forbes and EY found that “89% of organizations agree that change management is a barrier to realizing value.” A good executive sponsor will guide a project through this period of change and manage any unexpected circumstances that follow.



If you’re like me, you’re an exceedingly attractive, 50-something data geek with a love for aviation history who’s an amazing ice hockey player. You also spend a lot of lunches eating a sandwich while studying new technologies. This kind of on-the-cheap, self-service training helps you keep up with changes in business intelligence / data warehousing / analytics / whatever the kids are calling it these days.


So, what should you study (on the cheap) in 2017? Here are some suggestions…


Try a BI Tool You Know You Won’t Buy


IT is the classic “when you have a hammer, every problem looks like a nail” industry. We invest in learning a technology and then become that technology’s biggest proponent. If you know Business Objects, it’s automatically the best BI tool for all problems. But, if you know Tableau, that’s automatically the best tool.


Here’s the thing; there is no best tool. Each BI tool has a sweet spot, and a group of users for whom it excels. Even Excel excels at some things (sorry, couldn’t resist the word play). Knowing a bit about other tools may open your eyes to other, perhaps better, ways to work with what you have.


Even if you have no budget or intention to adopt a new BI tool, I urge you to try one from time to time. All the major vendors have some sort of ‘free’ offering nowadays. So, here’s your first lunch time task: pick up a demo copy of a different BI package, go through its tutorial, and apply the tool to some of your data. Who knows, maybe you’ll like it so much that you’ll find the budget.


Explore a NoSQL / Big Data Technology


It’s very likely that soon, some or all of the data you work with will be stored in something other than a traditional, relational database. Right now my largest clients are working with Hadoop, MongoDB, and CouchDB. If you’re not there yet, start investigating these, and similar, technologies. I generally recommend that folks start with Hadoop as it gets the most mindshare.


Rather than planning a big project to apply the technology, start by just learning about it – perhaps with a basic business problem in mind. After a bit of study, if you think the technology will fit, try it out on a small prototype. While success would be nice, it’s not essential. What you’re really after is knowledge.
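If Hadoop is your starting point, its map/reduce core can be understood in miniature before you ever stand up a cluster. This sketch mimics the pattern in plain Python over a couple of invented “documents” – map each record to key/value pairs, then reduce by key:

```python
from collections import Counter
from itertools import chain

# Hadoop's core idea in miniature: a word count over a few "documents".
docs = ["big data big value", "data drives value"]

def map_phase(doc):
    # Map step: emit a (word, 1) pair for every word in the document
    return [(word, 1) for word in doc.split()]

def reduce_phase(pairs):
    # Reduce step: sum the counts for each word (key)
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

result = reduce_phase(chain.from_iterable(map_phase(d) for d in docs))
print(result)  # {'big': 2, 'data': 2, 'value': 2, 'drives': 1}
```

What Hadoop adds is distributing the map and reduce steps across many machines and handling the data shuffling and failures in between – but the programming model is exactly this.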


To start learning online during your lunch breaks, try IBM’s free Big Data University. Full disclosure: I haven’t worked with it much (yet) but it looks like a good place to learn the basics.


Learn Data Vault Modeling


I am convinced that within a few years all the cool kids are going to be using data vault modeling techniques for their data warehouses / integration layers. Data vault is a data modeling approach that builds on some of the best aspects of normalized and denormalized / star schema data modeling. Practitioners using it have only great things to say.


Among data vault modeling’s benefits, it:


  • Really does support rapid, agile data warehouse development (really!)
  • Provides a logical way to easily integrate data from multiple sources
  • Greatly eases history record management
  • Is sufficiently flexible that source system changes don’t result in huge amounts of rework


Let me warn you, when you first look at a model done with data vault it’s going to be confusing. Remember, those of us who predated star schemas thought the same about the first stars we saw – they looked very different from the traditional, relational models that we knew and loved. However, over time, we got it – we saw the value and jumped on board, using each approach where it made sense. As a preview, just like star schema models use their own kinds of tables (mainly fact and dimension), data vault models use their own kinds (mainly hub, satellite, and link).
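To make the hub/satellite/link idea concrete, here is a minimal, illustrative sketch in SQLite via Python. The table and column names are invented for the example, not a formal data vault standard: hubs hold business keys, satellites hold descriptive attributes with load history, and links record relationships between hubs.

```python
import sqlite3

# A minimal data vault sketch (illustrative names, not a formal standard)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hub_customer (
        customer_hk  TEXT PRIMARY KEY,   -- hash of the business key
        customer_id  TEXT NOT NULL,      -- the business key itself
        load_date    TEXT NOT NULL
    );
    CREATE TABLE sat_customer (
        customer_hk  TEXT NOT NULL REFERENCES hub_customer,
        load_date    TEXT NOT NULL,      -- history: one row per change
        name         TEXT,
        city         TEXT,
        PRIMARY KEY (customer_hk, load_date)
    );
    CREATE TABLE hub_order (
        order_hk   TEXT PRIMARY KEY,
        order_id   TEXT NOT NULL,
        load_date  TEXT NOT NULL
    );
    CREATE TABLE link_customer_order (
        link_hk      TEXT PRIMARY KEY,
        customer_hk  TEXT NOT NULL REFERENCES hub_customer,
        order_hk     TEXT NOT NULL REFERENCES hub_order,
        load_date    TEXT NOT NULL
    );
""")
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)
```

Notice how history lives in the satellite (one row per change, keyed by load date) while the hub stays stable – that separation is what makes source system changes and history management so much easier.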


Now’s the time to step into data vault modeling. In fact, as my clients look to upgrade their data warehouses, I’m recommending that they strongly consider using data vault modeling techniques. Everyone in business intelligence and data warehousing should learn at least a little about the concepts.


Next Up…


So, I hope these were some valuable thought starters. I’d love to hear your thoughts, too.


My next post will talk about some bigger-issue items that should be on your 2017 to-do list. Thanks for reading!





Need BI and DW Staff?


Over the past few years our business has expanded to include BI and data warehouse staffing. I, not recruiters, personally screen candidates. Having been in BI / DW for over 25 years, I protect our clients by seeing past candidate BS. If you think you might have a need, let’s talk about it. You can call my direct line at 734.585.3503.





The Lamest BI Requirement – And it Won’t Go Away!


Want a sure path to failure in BI and data warehousing? Don’t solicit requirements from your users.


Want a second path? Settle for the requirement, “Just give me everything (JGME) and I’ll figure out how to use it when you’re done.”


I’m over 25 years into this field and I still see this requirement – and I still see BI / DW teams falling for it! It’s totally lame (am I allowed to use my kid’s vernacular in a business post?) and leads to systems that just don’t get used.


JGME: What Your User is Really Saying


You do realize that when your user gives you that requirement he’s just taking your time for granted, right? What he’s really saying is, “I don’t have the time for you right now, so make a big investment, because I may have the time to figure it out myself somewhere down the road.” Why is ‘down the road’ better than today? Why can’t your user spend a few hours with you, before you make big investments, to consider what he really needs? To consider what will actually benefit the business?


How Do You Design JGME?


In a perfect world we might provide our users with some thoughts about what’s possible, but they are really responsible for telling us what they need. We, in turn, are responsible for implementing it. But, without real requirements, how are you going to design things like data models? How will you know the things on which you should be capturing history? How will you configure BI tools to properly present the data? And so on.


Your users said they need everything! So, why not just give them copies of their source systems?


Why Do We Fall For it?


Building systems is fun. Playing with data is fun. BI and DW has been on the cool technologies list for years now. And, frequently we’re just told to build these systems and we don’t want to push back.


What BI Requirements Look Like


While it used to be sold as, “build it and they will come”, the truth is that effective BI is no different than any other type of business system. Business systems (e.g. sales systems, ERP systems, inventory systems, etc.) force business users into defined business processes – sets of steps for doing their jobs. Whether users like those processes or not, they follow them – it’s the only way for them to do their jobs!


Somehow, though, in BI, it’s OK to drop the idea of business process? Rather than understanding how the system will be used and how the business will change, it’s OK to just puke data all over our users and assume that everything will work out?


Think about this for a second. Don’t your users already have jobs that require at least 40 hours (35 in France) of work each week? How will simply giving them a new source of data help them? In most cases it won’t and, as a result, you’ll have built a system that no one wants, needs, or uses.


Real BI use is tied to business process. Understand how the system will fit into the work flow and why it’s necessary for improving that work flow and you’re on your way to good requirements. Ignore workflow, ignore business process and you’re on the path to failure. Simply accepting the JGME requirement doesn’t touch on business process.


Push Back


So, yes, we want to say, “Sure, we’ll deliver it” to every request but I suggest that you push back when faced with the JGME requirement. Tell your sponsor that his money will be wasted if he doesn’t invest a little into requirements. Then describe what you mean by requirements…


An Outline for BI Requirements


So, what do decent BI requirements look like?


Beneficiary / Sponsor and Target User


You can’t tell what someone needs until you know who that someone is. So, your requirements should start by indicating who will use the system and who will benefit from it.


What is his job / his business process today?

The next step is to briefly document how the business process is accomplished today. What do people have to go through to complete it? Where is it awkward and inefficient?


What does he think it should be?


Then, design out the improved business process. How will the new system enable that process? Why can’t the new process be implemented without the system?


What analyses does he need to get to this new process?


So, the new process needs data. That data will be presented in one or more formats consisting of charts and tables. What will these look like? At this stage you should, perhaps with pencil and paper (remember those?), draw them out. Work with your user to understand what data he needs to see, how he needs it presented, how he’ll need it filtered, how he’ll need it sorted, etc.


No, I’m not naive – of course his needs will change once you’ve delivered the analyses. So, when developing, you will likely collect more data than his screen designs require. You might also build a few more screen objects than he asks for. However, what you build will be based on a thought-out business need, not on an ill-defined, cop-out requirement of ‘give me data.’


What are the expected benefits and how will they pay for the investment?


Finally, it’s important to question if the investment will be valuable. Yes, perhaps a new BI system will enable a new business process but how will this benefit the organization? Why is it worth doing?


Quick example: suppose an automotive warranty group reviews claims every six months to see if particular parts are creating outsized warranty costs. Further, suppose a sponsor in warranty realizes that changing this business process to a monthly or weekly process will catch these issues sooner, saving on warranty costs. With some simple math you can estimate the value of this new business process. (IMPORTANT POINT: The ROI of a BI system is zero. The system is the ‘I’, the investment. The return, the ‘R’, is a result of the new business process enabled by that system).
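The “simple math” might look like this; every number below is invented purely for illustration:

```python
# Back-of-the-envelope ROI for the warranty example (all figures invented)
avg_cost_per_late_catch = 250_000   # warranty cost of an issue found at 6 months
issues_per_year = 4
# Suppose a monthly review catches issues months sooner, cutting the
# accumulated warranty cost of each issue by 40%.
savings_rate = 0.40

annual_benefit = avg_cost_per_late_catch * issues_per_year * savings_rate
bi_investment = 300_000             # one-time cost of the enabling BI system

print(annual_benefit)                  # 400000.0
print(bi_investment / annual_benefit)  # payback period in years: 0.75
```

Note that the benefit comes entirely from the faster review process; the BI system is just the investment that makes that process possible.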


Two Exceptions


This JGME rule has two, limited exceptions.


Tiny Scale Demonstration Systems


Sometimes it’s important to show sponsors what can be done. In these cases you might create small demonstration systems based on what you already know about the business problem. However, once the business buys in (if it does), then you should capture real requirements for your production system.


Data Scientists


BI tool vendors will have you believe that every user in your company will be clicking around, making new insights. That’s not going to happen. Almost everyone in a company has defined goals and things they need to get done. They execute business processes to complete these tasks. The one exception is your data scientists. This is a very small group of experts whose job it is to find and evaluate new data, looking for new insights. These people may, validly, have the JGME requirement. However, these people usually only need this data temporarily – they do their analyses and then go on to the next topic. Tools like data lakes and unstructured data storage technologies (e.g. Hadoop) are usually best for needs like these.



Any thoughts? I’d love to hear them. Just comment on this post. Thanks for reading!




Need BI and DW Staff?


Over the past few years our business has expanded to include BI and data warehouse staffing. Client reviews have been exceptional. I’d love to discuss your situation and how our staffing and consulting offerings might help. Call me at 734.585.3503.




Assessing BI & DW Systems


A decent (and fun!) chunk of our business is assessing organizations’ business intelligence and data warehousing capabilities. In assessments we meet with business executives and IT staff to discuss what the business needs, and will need, from BI. We then contrast that with what the organization actually has at its disposal. The result is a findings report and plan for moving forward. Assessments are usually very well received and it’s always gratifying to visit a client and see your report from three or four years ago on managers’ desks, with highlighting and sticky notes, still helping set a path forward.


As consultants, assessments provide us with a great, hands-on view of what organizations are doing in BI. One thing we see time and again is organizations that have complex infrastructures and, yet, no ‘master plan’ or vision for how their systems should fit together – they haven’t envisioned and designed what I call a data topology. They suffer because of it.


Symptoms of Data Topology Issues


So, what kinds of things indicate that you need to put a bit more thought into topology? Here are a few:


  • DATA QUALITY ISSUES: You have difficulty maintaining data quality because the same kinds of data exist in multiple places and, once the data changes in one place, the databases fall out of sync.
  • DATA CONSISTENCY ISSUES: Different systems return different answers to the same question. For example, a query asking for a customer’s address might return different results depending on which database is queried.
  • EXCESSIVE COST: You spend more than necessary to build and maintain your data warehouse as your ETL jobs must grab data from all of your sources and these jobs need logic to adjudicate differences in data across them.
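The consistency symptom in particular is easy to demonstrate. As a minimal sketch (the system names, fields, and records below are hypothetical), you can merge records from each system by customer ID and flag any field whose values disagree:

```python
# A minimal sketch of a cross-system consistency check. The system
# names, fields, and records here are hypothetical examples.

def find_inconsistencies(systems):
    """Compare records keyed by customer ID across source systems and
    report fields whose values disagree between systems."""
    # Collect every (customer_id, field) -> {system: value}
    merged = {}
    for system, records in systems.items():
        for cust_id, fields in records.items():
            for field, value in fields.items():
                merged.setdefault((cust_id, field), {})[system] = value
    # Any key with more than one distinct value is a conflict.
    return {key: by_system for key, by_system in merged.items()
            if len(set(by_system.values())) > 1}

crm = {"C001": {"address": "12 Elm St"}}
billing = {"C001": {"address": "12 Elm Street"}}
print(find_inconsistencies({"crm": crm, "billing": billing}))
# Flags ('C001', 'address') because the two systems disagree.
```

Even a toy check like this makes the point: when the same fact lives in two places, you need reconciliation logic, and that logic is pure cost.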


Many organizations with problems like these react with knee-jerk solutions: building a data warehouse, buying data quality tools, jumping on the data governance bandwagon, or implementing master data management (MDM) solutions. Indeed, those things might be necessary. However, I recommend that you first step back.


The conditions I noted aren’t root problems but, rather, symptoms of some root problem. For example, data quality issues can come from a number of places – from having multiple systems capturing similar data, to poor validation on data entry, to systems truncating data as it’s passed between them.


So, before making a big investment in tools and technology, I strongly recommend that you make a small investment in architecting your long term data topology.


An Architected Data Topology


What is an Architected Data Topology?


An architected data topology is a high-level design of your data infrastructure. It documents the data stores you have and how they interact with each other. It’s frequently shown simply as a diagram containing all of your data stores and the relationships / flows between them. Supporting documentation can describe the role of each data store and the subject areas of data it collects. That’s it! You can certainly add more detail and should fit things to your specific situation. And, as we’ll see, it is important to step beyond documenting your current situation – you need to design your target topology.
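To make the idea concrete, a topology is really just stores plus flows. Here is a minimal sketch of that structure (the store names, roles, and subject areas are invented for illustration):

```python
# A hypothetical, minimal representation of a data topology: the data
# stores you have and the flows between them. All names are invented
# for illustration.

topology = {
    "stores": {
        "crm":        {"role": "operational system", "subjects": ["customers"]},
        "erp":        {"role": "operational system", "subjects": ["orders", "products"]},
        "warehouse":  {"role": "integration and history", "subjects": ["customers", "orders"]},
        "sales_mart": {"role": "data distribution", "subjects": ["orders"]},
    },
    # Each flow is (source, target) - data moves from source to target.
    "flows": [
        ("crm", "warehouse"),
        ("erp", "warehouse"),
        ("warehouse", "sales_mart"),
    ],
}

def stores_feeding(topology, target):
    """List the stores that feed a given data store."""
    return [src for src, dst in topology["flows"] if dst == target]

print(stores_feeding(topology, "warehouse"))  # ['crm', 'erp']
```

A diagram on a whiteboard carries the same information; the point is simply that the whole artifact is small enough to capture and maintain.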


Why Developing a Data Topology Architecture is Worth the Effort


Without a plan for how your data will fit together, it’s likely that project teams will attack problems with haphazard, unintegrated, point solutions that solve the immediate pain but cause bigger, long term headaches. I’ve seen it happen; I’ve been to clients where the answer to every problem seems to be ‘build a new system.’ In a few years there are multiple, unintegrated, mission-critical systems stepping on each other.


For example, in one case, at an insurance company, customer data was captured in multiple places. Each produced different results for things like contact information and even transaction volume. Even worse, each system defined ‘customer’ differently – in some cases a customer was a policyholder but, in others, the customer was the broker who sold the policy.


How to Develop Your Data Topology Architecture


It’s really not all that hard to develop a data topology architecture, especially when compared to the effort required to implement an MDM or DW system. Just follow these steps…



1) Set Your Data Management Goals and Principles


When developing your data topology architecture you really need to consider and set your goals for data management. This is important because, as future systems are built and kept in line with the architecture, they will, by definition, be kept in line with your organization’s data management goals.


Start by developing a list of your goals for data management (DM) and the principles to which these goals must adhere. I believe that most organizations will have between five and ten DM principles. My last post discussed how to set these.


2) Document what you have today


Create a picture of where you currently stand. Later, you’ll compare this to your future (i.e. target) design to determine what tasks you’ll need to perform to attain that future (see step four, below).


3) Design where you need to be in order to meet the goals & support the business in the future


Now comes the fun part (well, actually, for the right person the whole project is a fun part) – designing your data future. To do this, combine your data management principles, your knowledge of the business and its plans, and an understanding of data management technologies to plan out your future topology.


Let’s be honest, moving to a new topology might be an expensive proposition. Thus, you might actually need two diagrams: a ‘perfect’ case and a more attainable ‘reasonable target’ case that will work and has a better chance of getting funded (remember, the perfect is the enemy of the good).


4) Develop a plan for attaining your future topology


While the picture is helpful, it’s useless if you don’t have a plan to implement it. So, the final step in this topology effort is to compare your current and future pictures to identify the projects and tasks in your implementation plan. A side benefit of this comparison is that it also serves as a tool for explaining, to project sponsors, why the move to a new topology is important.
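The comparison itself is mechanical. As a sketch (store and feed names are hypothetical), treat each topology as a set of stores and a set of flows, and the differences fall out as tasks:

```python
# A sketch of step four: compare current and target topologies (as sets
# of stores and flows) to derive a task list. Names are hypothetical.

def plan_tasks(current, target):
    """Derive build/retire tasks from the difference between the
    current and target topologies."""
    tasks = []
    for store in target["stores"] - current["stores"]:
        tasks.append(f"Build data store: {store}")
    for store in current["stores"] - target["stores"]:
        tasks.append(f"Retire data store: {store}")
    for src, dst in target["flows"] - current["flows"]:
        tasks.append(f"Create feed: {src} -> {dst}")
    for src, dst in current["flows"] - target["flows"]:
        tasks.append(f"Decommission feed: {src} -> {dst}")
    return tasks

current = {"stores": {"crm", "erp"}, "flows": {("crm", "erp")}}
target = {"stores": {"crm", "erp", "warehouse"},
          "flows": {("crm", "warehouse"), ("erp", "warehouse")}}
for task in plan_tasks(current, target):
    print(task)
```

The real plan needs sequencing, estimates, and dependencies, but the raw task list comes straight from the two pictures.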


Keeping it Fresh: The Role of the Architect


Don’t stop with the plan, or even with its implementation. Long-term success demands that you employ an architect to participate in IT projects and planning. This person is the ‘voice of the future’, making sure that project teams don’t go ‘off the rails’ by developing systems that aren’t in line with your architectural vision. This architect is key to ensuring that projects adhere to the principles and designs you’ve worked hard to establish.



OK, this post ran way longer than I expected (even after I separated out the part on setting your principles). Nonetheless, documenting your current and setting your future topologies isn’t as hard as it seems. I suspect that, with concerted effort, most medium-sized organizations can complete the task in about a month. The benefits will last for years. So, get started!


Any thoughts? I’d love to hear them. Just comment on this post. Thanks for reading!






Could Your Organization Benefit From an Outside Set of Eyes on Your BI and Data Management Efforts?


If you think a BI or data management assessment might be helpful in your organization, give me a call! I’d love to discuss your situation and how an assessment might be useful. I can be reached at 734.585.3503.



Looking back on the work we’ve done over the past few years I’ve come across an interesting point: while companies may have mature data management organizations (DMOs), few of these DMOs have a set of principles behind their data management visions. In other words, they’re building data management infrastructures and tools without a clear sense of what they’re ultimately trying to accomplish.


Important Endeavors Should Have Guiding Principles


I’ve recently been listening to an audiobook about the American Revolution and it’s led me to an interesting conclusion: great undertakings have a set of guiding principles. Take the United States, for example: we have a constitution, and our constitution underlies all of our other laws. We’re, of course, not alone. Consider, for example, the Magna Carta and even the Ten Commandments.


Data is valuable, and building and maintaining an effective data management infrastructure is an important endeavor, at least for you, your company, and the parties relying on you. (To get a better handle on how valuable your data is and what it might be worth, see Doug Laney’s work on Infonomics.) Thus, as an important endeavor, your data management efforts should have a set of guiding principles.


Once you’ve established guiding principles, you can compare every data management decision to them to ensure that those decisions fit with your overall, long term data strategy. This will help keep your organization from straying toward costly, short-sighted distractions.


What Principles Should You Have?


Some data principles are universal, and apply to all organizations. Others may vary organization by organization. I believe that most organizations should have between 5 and 15 of these principles. Further, I believe they all flow from a ‘master principle’:



“Data is a valuable asset and it must be accorded the same care as other valuable, corporate assets.”



Here are some additional principles I believe apply to most organizations:


  • Data Should be Shared: Data captured anywhere in the organization should be available to all who can use it to derive business value. Without sharing, you’ll expend extra effort and your business will miss opportunities to capitalize on the data it’s paying to capture.
  • Operational Data Should Not be Duplicated: Data should be captured only once and that version of the data should be available to the people and systems that need it. Duplication provides opportunities for error, user confusion, and technical complexity.
  • Data Should be Accessible: Business users and systems should be able to easily get the data they need. The organization can miss out on opportunities and ‘signals’ if it captures data but does not make it available to people and systems that can capitalize on it.
  • Data Quality Should be Fit for Purpose and Substantially Reflect Reality: This means that data should be correct enough for its intended uses in operations, decision making and planning. Decisions and actions that users make based on bad data are, of course, potentially flawed. This does not mean that data is perfectly clean, only that it is at least clean enough for your business purposes. For example, do correct middle initials warrant the same level of effort and investment as correct addresses?
  • Data Should be Compliant With Applicable Laws and Regulations: This rule, and the reasoning behind it, should be fairly self-explanatory.
  • Data Should be Protected From Unauthorized Access: In a time of hackers and spies, this, too, should be self-explanatory. Recent, high-profile security failures have cost millions (billions?) of dollars and led to a loss of customer confidence. Can your organization afford this?
  • Data Should be Protected From Inadvertent Destruction: Systems fail. If your data suddenly disappears, can you get it back quickly, with minimal impact to the business?
  • Data Should be Described by a Common Vocabulary and Definition: To avoid miscommunication and ensure that all data consumers properly interpret the data they receive, the organization should provide tools that both novice and expert data users can use to understand what data is captured by the company, what it means, and other key details about it.


Using the Principles


It’s relatively simple to apply these principles to your systems efforts. Whenever you’re planning new development or major modifications, compare the plan to your list of principles. For example, if you’re implementing a new warranty system, ask, “Will it have its own set of customer addresses or will it share addresses with all of our other, operational systems?” (i.e. Is it shared? Is it not duplicating existing data?) Further, ask, “Are we planning mechanisms for the business to analyze the data?” (i.e. is it accessible?) etc.
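That review can even be run as a literal checklist. Here is a minimal sketch (the principles included and the warranty-system answers are illustrative only):

```python
# A minimal sketch of a principle checklist for reviewing new project
# plans. The principles listed and the example answers are
# illustrative, not a complete set.

PRINCIPLES = [
    ("shared", "Will the data be available to others who can use it?"),
    ("not_duplicated", "Does it avoid duplicating existing operational data?"),
    ("accessible", "Are mechanisms planned for the business to analyze the data?"),
]

def review_plan(answers):
    """Return the principle questions a plan violates (answered False
    or left unanswered)."""
    return [question for key, question in PRINCIPLES
            if not answers.get(key, False)]

# Hypothetical warranty-system plan: it keeps its own customer addresses.
warranty_plan = {"shared": True, "not_duplicated": False, "accessible": True}
print(review_plan(warranty_plan))  # The duplication question is flagged.
```

The mechanics don’t matter; what matters is that every new plan gets measured against the same short list.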


If your plans violate any of your core principles, change your plans!


How to Develop Your Own List of Guiding Principles


In my mind, the DMO simply implements the will and desires of the user community. So, when it comes to data management guiding principles, developing the list shouldn’t be just an IT task. Success demands input from your constituents. In fact, your data governance team should develop, and periodically review, the list. The business ultimately owns the data so shouldn’t they also own the rules governing that data?


Perhaps you start with the list provided here. Then, think through your organization and its data situation. Have issues arisen in the past for which a solid principle would have provided needed guidance? If so, consider developing a principle to avert the situation in the future.


Remember, these are high-level, guiding principles. They don’t refer to individual systems or efforts. They should, instead, guide all systems and efforts in the long term.


Your Thoughts?


Do I have a point or have I gotten this terribly wrong? I’d love to hear your thoughts! Just add a comment to the blog. Thanks!







Does Your Organization Need SAS Consultants?


Over the past year a few of our clients have asked us to provide them with SAS experts, and the results have been great! We’ve been placing consultants in areas including SAS Enterprise Guide (EG), Base SAS, SAS Administration, and SAS/ACCESS to Hadoop. We can usually have highly-vetted SAS experts at your site within three weeks. If anyone in your organization is looking for SAS help, please have them contact me directly at 734.585.3503. Thanks!


Is your terminology causing your BI / DW team to head in the wrong direction?


As a business intelligence and data warehousing consultant I constantly work with IT teams that, even internally, can’t agree on whether they have a data warehouse, a data mart, a reporting database, or something else. As a result they’re not sure what BI / DW components they need and the proper roles for those components that they do have.


What common BI & DW terms mean to normal people


How many times have you told someone that you work in data warehousing, had them suggest that they knew about data warehousing, and then had them follow up with questions about how you manage to keep server farms running? I constantly run into folks who think that data warehousing is about storing huge amounts of data. Let’s get it straight – we’re not about storing data, we’re about getting meaning from data and getting that meaning to the right people at the right time! The storage, while necessary, is secondary.


You can’t blame folks, though – at best, the term data warehousing is misleading and, at worst, stupid. Warehousing implies storage, not use. But “data warehousing” isn’t our only loaded term. What about “data mart” – what exactly does that imply? Or, a personal favorite, “operational data store” (ODS) – I can’t tell you how many times folks have told me about their ODS by describing the Oracle database underlying their SAP implementation. An operational database is not an ODS! But, once again, our terminology is so imprecise that it leads people in the wrong direction.


Well, everyone else is doing it


Somewhere along the line people grabbed onto terminology like data warehouse, data mart, and ODS because everyone else was grabbing onto it. Let’s do what the cool kids are doing, right?


Sadly, though, the terminology just doesn’t really describe what these key architectural components do. And, while there are true, academic definitions for them (anyone remember “subject-oriented, integrated, time-variant, non-volatile collection of data in support of management decisions”?), many (most?) practitioners don’t know these definitions.


This whole situation leads to confusion, waste, and in some cases, failure. So, want to get better at BI and data warehousing? Start by changing your terminology.


Why not try these terms?


I propose that we clear up some confusion by changing our terminology, to use terms that focus on the goals of the components in our BI architecture. Many of the terms I propose have both an architectural focus (generally those terms called ‘layers’) and a physical focus (generally those terms called ‘databases’).


Instead of data warehouse


A traditional, Bill Inmon architecture is built around a data warehouse – a very cryptic term. So, why not call it what it is – a database where data from various sources is integrated and where we capture history. In other words, let’s just call this our Integration and History Layer or our Integration and History Database.


Instead of data mart


Data marts? What are they there for? They really exist to distribute data to users in highly-performant, user-friendly formats. So, why not call this our Data Distribution Layer? This layer can be composed of any number of databases or data stores, all focused on the same job – rapidly distributing data to users. These databases can be star schemas but they can also be multidimensional databases (MDDB), flat files, specialized databases (like Qlik’s Associative Database), or any other format.


In the end, all viable data stores in a Data Distribution Layer have one thing in common – they provide high performance response to user queries. So, let’s group all of these technologies together and call them High Performance Query Technologies. (Maybe High Performance Data Distribution Technologies is even better? – I haven’t worked through that one yet).


Instead of operational data store


And finally, let’s think about that operational data store – it’s supposed to be about reporting on operational data, not running operational transactions. So, let’s call it what it is: an Operational Reporting Database. Our Operational Reporting Layer comprises one or more of these databases.


The “new” architecture


Putting this all together, we come up with a new picture of the modern reporting environment – a picture in which each component is explicitly named by the role it performs.


The informational / business intelligence architecture – with explicit terminology

What you’ll gain


So, why buck the terminology trend? Because using this new terminology will allow you to shift focus away from what ‘everyone else is doing’ and onto the jobs you need done and the right way to do those jobs. It, then, opens up your mind, and that of your team, to the possibility of things like alternative high performance query technologies, virtual data warehouses, data lakes…


Focus on the job you need done. The ‘traditional’ jobs are pretty much all covered by this new architecture. If, however, what you need doesn’t fall into it, it’s fine to come up with new layers and components – just make sure you understand what new functionality you need from those components and then name them what they are (BTW, already, in these early days, there’s a ton of confusion around the term ‘data lake’ – have a better name for it?)


I believe this revised approach will reduce confusion and get your whole team – sponsor to designer to developer to tester to user – on the same page about what needs to be done.


Thanks for reading. Have any input? I’d love to hear it. Share by leaving a comment.