This section describes the complete solution that builds multiple models and creates queries that return predictions from each model. It contains these parts:
[1] Analysis Services project: To create this project, follow the instructions in the Data Mining tutorial on MSDN (http://msdn.microsoft.com/en-us/library/ms169846.aspx) to create a Forecasting mining structure and default time series mining model.
[2] Integration Services project: You will create a new project, containing multiple packages:
Scope
The following Integration Services tasks and components are used in this walkthrough. For more information from SQL Server Books Online, click on the link in the Task or component column.
Task or component
|
Used for
|
Execute SQL task
|
Gets variable values, and creates tables to store results
|
Analysis Services Execute DDL task
|
Creates individual models
|
Analysis Services Processing task
|
Populates the models with data
|
Foreach Loop Container
|
Builds and processes multiple data mining models
|
Script task
|
Builds the required XMLA commands
|
Data Mining Query task
|
Creates predictions from each model
|
Data Flow task
|
Manages and merges prediction results
|
OLE DB source
|
Gets data from temporary prediction table
|
OLE DB destination
|
Writes predictions to permanent table
|
Derived Column transformation
|
Adds metadata about predictions
|
Even though the following Integration Services components are also very useful for data mining, they are not used in this walkthrough—look for examples in a later paper:
-
Data Profiling task
-
Conditional Split transformation
-
Percentage Sampling transformation
-
Lookup and Fuzzy Lookup transformation
-
Data Mining Training destination
Note: The SQL Server Reporting Services project containing the reports that compare models is not included here, even though this project generates all the data required for the reports. That is because the report creation process is somewhat lengthy to describe, especially if you are not familiar with Reporting Services. Moreover, since all the prediction data is stored in the relational database, there are other reporting clients you can use, including Microsoft PowerPivot for Excel and Project Crescent. However, we hope to describe the process in a separate article later on the TechNet Wiki (http://social.technet.microsoft.com/wiki/contents/articles/default.aspx).
Phase 1 - Preparation: The definition of the models you want to create is stored in SQL Server as a set of parameters, values, and model names.
Phase 2 – Model creation: Integration Services retrieves the model definitions and passes the parameter values to a Foreach Loop that builds and then executes the XML for Analysis (XMLA) statement for each model.
Phase 3 – Model processing: Integration Services retrieves a list of available models, and then it processes each model by populating it with data.
Phase 4 - Prediction: Integration Services issues a prediction query to each processed model. Each set of predictions is saved to a SQL Server table.
Phase 5 – Reporting and analysis: The prediction trends for each model are compared by using reports (created by Reporting Services, PowerPivot, or your favorite reporting client) using the data in the relational table.
Phase 1 - Preparation
In this phase, you set up the structure, sample data, and parameters your packages will use. Before you build the working packages, you need to complete the following tasks:
-
Create the Forecasting mining structure used by all mining models.
-
Generate the sample XMLA that represents the default time series mining model, to use as a template.
-
Create a table that stores replacement parameters for the new models, and then insert the parameter values.
The following section describes how to perform these tasks.
Create the Forecasting mining structure and default time series mining model
To create multiple models based on a single mining structure, you need to create the Forecasting mining structure first. Based on that mining structure, you also need to create a time series model that can be used as the template for generating other models. If you do not already have a mining structure capable of supporting a time series model, you can build one by following the steps described in the Microsoft Time Series tutorial (http://msdn.microsoft.com/en-us/library/ms169846.aspx) in SQL Server Books Online.
Dostları ilə paylaş: |