Automation of Data Mining Using Integration Services



Yüklə 145,11 Kb.
səhifə2/9
tarix08.10.2017
ölçüsü145,11 Kb.
#3798
1   2   3   4   5   6   7   8   9

Solution Walkthrough


This section describes the complete solution that builds multiple models and creates queries that return predictions from each model. It contains these parts:

[1] Analysis Services project: To create this project, follow the instructions in the Data Mining tutorial on MSDN (http://msdn.microsoft.com/en-us/library/ms169846.aspx) to create a Forecasting mining structure and default time series mining model.



[2] Integration Services project: You will create a new project, containing multiple packages:

Scope


The following Integration Services tasks and components are used in this walkthrough. For more information from SQL Server Books Online, click on the link in the Task or component column.

Task or component

Used for

Execute SQL task

Gets variable values, and creates tables to store results

Analysis Services Execute DDL task

Creates individual models

Analysis Services Processing task

Populates the models with data

Foreach Loop Container

Builds and processes multiple data mining models

Script task

Builds the required XMLA commands

Data Mining Query task

Creates predictions from each model

Data Flow task

Manages and merges prediction results

OLE DB source

Gets data from temporary prediction table

OLE DB destination

Writes predictions to permanent table

Derived Column transformation

Adds metadata about predictions

Even though the following Integration Services components are also very useful for data mining, they are not used in this walkthrough—look for examples in a later paper:

  • Data Profiling task

  • Conditional Split transformation

  • Percentage Sampling transformation

  • Lookup and Fuzzy Lookup transformation

  • Data Mining Training destination

Note: The SQL Server Reporting Services project containing the reports that compare models is not included here, even though this project generates all the data required for the reports. That is because the report creation process is somewhat lengthy to describe, especially if you are not familiar with Reporting Services. Moreover, since all the prediction data is stored in the relational database, there are other reporting clients you can use, including Microsoft PowerPivot for Excel and Project Crescent. However, we hope to describe the process in a separate article later on the TechNet Wiki (http://social.technet.microsoft.com/wiki/contents/articles/default.aspx).

Overall Process


Phase 1 - Preparation: The definition of the models you want to create is stored in SQL Server as a set of parameters, values, and model names.

Phase 2 – Model creation: Integration Services retrieves the model definitions and passes the parameter values to a Foreach Loop that builds and then executes the XML for Analysis (XMLA) statement for each model.

Phase 3 – Model processing: Integration Services retrieves a list of available models, and then it processes each model by populating it with data.

Phase 4 - Prediction: Integration Services issues a prediction query to each processed model. Each set of predictions is saved to a SQL Server table.

Phase 5 – Reporting and analysis: The prediction trends for each model are compared by using reports (created by Reporting Services, PowerPivot, or your favorite reporting client) using the data in the relational table.

Phase 1 - Preparation


In this phase, you set up the structure, sample data, and parameters your packages will use. Before you build the working packages, you need to complete the following tasks:

  • Create the Forecasting mining structure used by all mining models.

  • Generate the sample XMLA that represents the default time series mining model, to use as a template.

  • Create a table that stores replacement parameters for the new models, and then insert the parameter values.

The following section describes how to perform these tasks.

Create the Forecasting mining structure and default time series mining model


To create multiple models based on a single mining structure, you need to create the Forecasting mining structure first. Based on that mining structure, you also need to create a time series model that can be used as the template for generating other models. If you do not already have a mining structure capable of supporting a time series model, you can build one by following the steps described in the Microsoft Time Series tutorial (http://msdn.microsoft.com/en-us/library/ms169846.aspx) in SQL Server Books Online.

Yüklə 145,11 Kb.

Dostları ilə paylaş:
1   2   3   4   5   6   7   8   9




Verilənlər bazası müəlliflik hüququ ilə müdafiə olunur ©genderi.org 2024
rəhbərliyinə müraciət

    Ana səhifə