Automation of Data Mining Using Integration Services

Yüklə 145,11 Kb.
ölçüsü145,11 Kb.
  1   2   3   4   5   6   7   8   9



SQL Server Technical Article

Writer: Jeannine Takaki

Technical Reviewer: Raman Iyer

Published: July 2011

Applies to: SQL Server 2008 R2, SQL Server 2008, SQL Server 2005

Summary: This article is a walkthrough that illustrates how to build multiple related data models by using the tools that are provided with Microsoft SQL Server Integration Services. In this walkthrough, you will learn how to automatically build and process multiple data mining models based on a single mining structure, how to create predictions from all related models, and how to save the results to a relational database for further analysis. Finally, you view and compare the predictions, historical trends, and model statistics in SQL Server Reporting Services reports.


This document is provided “as-is”. Information and views expressed in this document, including URL and other Internet Web site references, may change without notice. You bear the risk of using it.

Some examples depicted herein are provided for illustration only and are fictitious.  No real association or connection is intended or should be inferred.

This document does not provide you with any legal rights to any intellectual property in any Microsoft product. You may copy and use this document for your internal, reference purposes.

© 2011 Microsoft. All rights reserved.



Introduction 4

Automating the Creation of Data Mining Models 4

Solution Walkthrough 5

Scope 5

Overall Process 6

Phase 1 - Preparation 6

Create the Forecasting mining structure and default time series mining model 7

Extract and edit the XMLA statement 7

Prepare the replacement parameters 9

Phase 2 - Model Creation 12

Create package and variables (CreateParameterizedModels.dtsx) 12

Configure the Execute SQL task (Get Model Parameters) 13

Configure the Foreach Loop container (Foreach Model Definition) 13

Phase 3 - Process the Models 15

Create package and variables (ProcessEmptyModels.dtsx) 16

Add a Data Mining Query task (Execute DMX Query) 16

Create Execute SQL task (List Unprocessed Models) 17

Create a Foreach Loop container (Foreach Model in Variable) 17

Add an Analysis Services Processing task to the Foreach Loop (Process Current Model) 18

Add a Data Mining Query task after the Foreach Loop (Update Processing Status) 19

Phase 4 - Create Predictions for All Models 19

Create package and variables (PredictionsAllModels.dtsx) 20

Create Execute SQL task (Get Processed Models) 20

Create Execute-SQL task (Get Series Names) 21

Create Foreach Loop container (Predict Foreach Model) 22

Create variables for the Foreach Loop container 22

Create the Data Mining Query task (Predict Amt) 24

Create the second Data Mining Query task (Predict Qty) 25

Create Data Flow tasks to Archive the Results 26

Create Data Flow tasks (Archive Results Qty, Archive Results Amt) 27

Run, Debug, and Audit Packages 28

Phase 5 - Analyze and Report 28

Using the Data Mining Viewer 28

Using Reporting Services for Data Mining Results 29

Interpreting the Results 32

Discussion 34

The Case for Ensemble Models 35

Closing the Loop: Interpreting and Getting Feedback on Models 35

Conclusion 36

Resources 36

Acknowledgements 37

Code for Script Task 37


This article is a walkthrough that illustrates how to use the data mining tools that are provided with Microsoft SQL Server Integration Services. If you are an experienced data miner, you probably already use the tools provided in Business Intelligence Development Studio or the Data Mining Client Add-in for Microsoft Excel for building or browsing mining models. However, Integration Services helps you automate many processes.

This solution also introduces the concept of ensemble models for data mining, which are sets of multiple related models. For most data mining projects, you need to create several models, analyze the differences, and compare outputs before you can select a best model to use operationally. Integration Services provides a framework within which you can easily generate and manage ensemble models.

In this series, you will learn how to:

• Configure the Integration Services components that are provided for data mining.

• Automatically build and update mining models by using Integration Services.

• Store mining model parameters and prediction results in the database engine.

• Integrate reporting requirements in the model design workflow.

Note that these are just a few of the ways that you can use Integration Services to incorporate data mining into analytic and data handling workflows. Hopefully these examples will help you get more mileage out of existing installations of Integration Services and SQL Server Analysis Services.

Automating the Creation of Data Mining Models

This scenario positions you as an analyst who has been tasked with creating some projections based on past sales data. You are unsure about how to configure the time series algorithm for best results (ARIMA? ARTXP? What hints to provide?). Moreover, you know that the modeling process typically involves building several models and testing different scenarios.

Rather than build variations on the model ad hoc, you decide to automatically generate multiple related models, varying the parameters systematically for each model. This way you can easily create many models, each using a different combination of periodicity hints and algorithm type. After you have created and processed all the models, you will put the historical data plus the predictions for each model into a series of reports to see which models provide good results.

This walkthrough demonstrates these features:

• Automatically building multiple mining models using parameters stored in SQL Server

• Generating bulk predictions for each model and storing them in SQL Server

• Comparing trends from the models by putting them side by side in reports

Dostları ilə paylaş:
  1   2   3   4   5   6   7   8   9

Verilənlər bazası müəlliflik hüququ ilə müdafiə olunur © 2019
rəhbərliyinə müraciət

    Ana səhifə