Posts Tagged ‘Master Data Services’

It’s Part of SQL Server 2008 R2?

Friday, March 5th, 2010

SQL Server 2008 R2 includes some impressive new features and functions. But when you run setup, they are nowhere to be found. The list includes the new StreamInsight, PowerPivot (sort of), and Master Data Services. Where are these features, and why are they not included in the setup wizard?

Except for PowerPivot for SharePoint, the features are in their own distribution folders on the setup DVD. StreamInsight is in the StreamInsight folder and Master Data Services is in the MasterDataServices folder. PowerPivot for SharePoint is actually included in the setup wizard, but you need to know where to look. More on that later.

To answer the question as to why StreamInsight and Master Data Services are not part of the setup we need to look at the big picture as Microsoft defines it. Microsoft has decided to migrate all data management and analysis applications under a single umbrella and that umbrella is their flagship database, SQL Server 2008 R2. This is much like what they are doing with SharePoint by including PerformancePoint as a feature beginning with SharePoint 2010.

The thinking is that creating a comprehensive data management suite is simpler if the components are marketed as a single platform. Not only does this make sense logistically, it makes sense financially. Instead of socking companies with more fees as they continue to build their data management infrastructure, Microsoft has rolled many of their previous offerings into the SQL Server 2008 R2 platform. Rather than forking out thousands of dollars for each feature they want to implement, companies can purchase the appropriate edition of SQL Server and find everything they need.

I mention this because StreamInsight, in particular, is not a SQL Server based product. StreamInsight sits outside of the SQL Server resource pool and performs Complex Event Processing on incoming data streams. Designed to handle massive volumes of data in memory, StreamInsight enables a company to create processes that scan the incoming data streams and discard or redirect the data based on criteria written in a .NET-compliant language using LINQ.

StreamInsight can use SQL Server based data tables to hold static data used for comparison purposes. It can also pass selected data through an output adapter to SQL Server for storage. Because StreamInsight runs against memory-based data, it can process the queries without the I/O overhead required by a traditional database server.
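
The pattern described above can be sketched in a few lines. This is not StreamInsight code (StreamInsight queries are written in a .NET language with LINQ); it is only a plain Python illustration of scanning an in-memory event stream, comparing each event against static reference data, and passing on the selected events. The function `select_events`, the `limits` lookup, and the event fields are all hypothetical names made up for this example.

```python
from typing import Any, Dict, Iterable, Iterator

Event = Dict[str, Any]


def select_events(events: Iterable[Event], limits: Dict[str, float]) -> Iterator[Event]:
    """Yield only the events whose value exceeds the limit for their sensor.

    `limits` stands in for static comparison data that a real system
    might load once from a SQL Server table (an assumption for this sketch).
    """
    for event in events:
        limit = limits.get(event["sensor"])
        # Discard events from unknown sensors and events within the limit.
        if limit is not None and event["value"] > limit:
            yield event


# Hypothetical static reference data and incoming stream.
limits = {"pump-1": 80.0, "pump-2": 95.0}
stream = [
    {"sensor": "pump-1", "value": 72.5},
    {"sensor": "pump-1", "value": 91.0},
    {"sensor": "pump-2", "value": 97.2},
    {"sensor": "pump-9", "value": 500.0},
]

# A plain list plays the role of the output adapter's destination here.
stored = list(select_events(stream, limits))
```

Because the generator handles one event at a time and never writes intermediate results to disk, it mirrors, in a very small way, the memory-based processing described above.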

Master Data Services is another application included with SQL Server 2008 R2. Master Data Services does store data in SQL Server. However, the processing it does is not a pure database or data warehouse function. Master Data Services (MDS) enables an organization to gather multiple copies of significant master data together, merge and standardize the data, and then send it back out to the original applications. Those applications then contain a consistent and accurate representation of the common data. A previous blog contains a more detailed discussion of what constitutes master data so I will not go into that here.
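
To make the gather, standardize, and merge idea concrete, here is a minimal Python sketch. It does not use or represent the Master Data Services API. The record layout, the `standardize` and `merge` helpers, and the rule that the most recently updated non-empty value wins are all assumptions chosen for illustration.

```python
from datetime import date


def standardize(record):
    """Normalize spacing and case in the name and keep only digits in the phone."""
    return {
        "name": " ".join(record.get("name", "").split()).title(),
        "phone": "".join(ch for ch in record.get("phone", "") if ch.isdigit()),
        "updated": record["updated"],
    }


def merge(copies):
    """Build one master record from several copies of the same customer.

    Assumed rule: for each field, the newest non-empty value wins.
    """
    master = {"name": "", "phone": ""}
    newest_first = sorted(map(standardize, copies),
                          key=lambda r: r["updated"], reverse=True)
    for copy in newest_first:
        for field in master:
            if not master[field] and copy[field]:
                master[field] = copy[field]
    return master


# Two hypothetical copies of one customer held by different applications.
copies = [
    {"system": "billing", "name": "  acme   corp ", "phone": "(555) 010-2000",
     "updated": date(2009, 6, 1)},
    {"system": "crm", "name": "ACME CORP", "phone": "",
     "updated": date(2010, 2, 15)},
]

master = merge(copies)
# master == {"name": "Acme Corp", "phone": "5550102000"}
```

In a real deployment, the merged record would then be sent back to the billing and CRM systems so that both hold the same values.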

Finally, we have the new PowerPivot for SharePoint. PowerPivot for SharePoint is a new addition to SQL Server Analysis Services in SQL Server 2008 R2. Microsoft created the SharePoint add-in to enable users to create, share, and manipulate PowerPivot workbooks in concert with PowerPivot for Excel 2010. The only way to install PowerPivot for SharePoint is to perform a SQL Server Analysis Services installation with SharePoint integration. After selecting SharePoint integration, the wizard walks through the essential configuration tasks for PowerPivot. A standard Analysis Services installation does not include PowerPivot. For a more in-depth discussion of PowerPivot for SharePoint and PowerPivot for Excel, please see my previous blogs.

By combining all of these features into SQL Server 2008 R2, Microsoft is showing once again that they are committed to improving the way businesses handle data without requiring excessive investments. I am sure that there is even more to come and that the next release of SQL Server will continue this trend of managing data wherever it is so that companies can continue to gain ground in managing and analyzing business-critical data.

SQL Server 2008 R2 Master Data Services

Friday, March 5th, 2010

In the past few weeks, I have had the opportunity to look at Microsoft’s new Master Data Management offering. This being a new area for me, I was very interested in the concept and did a bit of research to try to understand the ins and outs of Master Data Management, or MDM.

My understanding of the purpose of Master Data Management is the creation of a central repository for the most important data. This repository and its stewards are then responsible for maintaining the data in a consistent and current state for use by the originating systems. Essentially, MDM is the process of creating a single version of the truth for vital data and making sure that all systems using that data share that single version.

What I found was that managing master data is a much more detailed and evolutionary process than I would have imagined. For starters, each organization must determine what they consider master data. For some organizations, customer or product data will play a major role in MDM. These may not be as important to include for other organizations. The basic criteria for master data are that the data must be relatively static in nature, common to multiple systems, and it should not include transactional data.

Customer data is one of the most common collections of data in many MDM solutions. However, managing this data outside of the originating application may not be as important to an organization that has a transient customer base, or where the data resides in a single system and location.

Product data is another common element in MDM systems. Most companies deal with a static collection of products or services and may need to track and manage that data for use within several systems. However, product data is of no use as a master data element if the organization is an auction house where the products change rapidly based on what is available at any given point in time.

Sales data is one of the least common data collections included in master data. Most of the time this data is considered transactional in nature. This is true for most organizations that produce invoices and collect payment in a short timeframe. For organizations that carry long-term sales contracts, this data becomes a good candidate for master data. The overall state of the entities remains static with periodic changes to balances.

So how do you decide what to include in an MDM solution? Here are some basic questions to ask about each candidate data set for master data.

  1. Is the data spread across multiple systems and/or locations? If so, it is a sure bet that each data set holds some common data with its own unique twist on how it is maintained and ultimately looks. Remember to look in places outside the mainstream applications for additional sets of this data. That includes checking individual computers for special purpose databases, spreadsheets and lists used for marketing campaigns and other analysis.
  2. Does the data remain reasonably static? Reasonably static data is data that experiences little to no change over a pre-determined period of time. Again, customer data where the address, phone number, or name may occasionally change is a good candidate.
  3. Does the volume of data justify the effort? If you sell no more than 10 products or services, or have only three customers, don’t bother with that data. The overall benefit of maintaining it out of the originating system, or systems, is very low when compared to the effort involved.
  4. Most importantly, is the data significant to the operation? If the data is frequently referenced for reporting or other operations, it may benefit the company to create and maintain the data in a consistent format to push back into the originating systems.
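
The four questions above can be turned into a rough screening function. The sketch below is only one way to do it: the `Candidate` fields, the `failed_checks` helper, and especially the numeric thresholds (`min_records`, `max_change_rate`) are hypothetical placeholders that each organization would need to set for itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    name: str
    systems: int          # systems or locations holding a copy of the data
    change_rate: float    # fraction of records changed per review period
    records: int          # approximate number of records
    significant: bool     # frequently referenced for reporting or operations


def failed_checks(c: Candidate, min_records: int = 100,
                  max_change_rate: float = 0.10) -> List[str]:
    """Return the questions a candidate fails; an empty list means it passes."""
    failures = []
    if c.systems < 2:
        failures.append("not spread across multiple systems")
    if c.change_rate > max_change_rate:
        failures.append("not reasonably static")
    if c.records < min_records:
        failures.append("volume too low to justify the effort")
    if not c.significant:
        failures.append("not significant to the operation")
    return failures


# Hypothetical candidates for illustration only.
customers = Candidate("customers", systems=4, change_rate=0.05,
                      records=12000, significant=True)
auction_lots = Candidate("auction lots", systems=1, change_rate=0.90,
                         records=5000, significant=True)

customer_failures = failed_checks(customers)   # []
lot_failures = failed_checks(auction_lots)
```

Returning the list of failed questions, rather than a single yes or no, shows why a data set was ruled out, which helps when the decision is revisited later.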

This blog only scratches the surface of what to include in an MDM solution. With all of this considered, it is most important to start small and grow from there. Select at least two related data sets to include at the beginning and grow your solution as you learn what does and does not work.