Automation of Data Mining Using Integration Services



Yüklə 145,11 Kb.
səhifə4/9
tarix08.10.2017
ölçüsü145,11 Kb.
#3798
1   2   3   4   5   6   7   8   9

Phase 2 - Model Creation


In Phase 2, you build a package that creates many mining models, using the templates and parameters you prepared in Phase 1. From among these models, you can choose an analysis that best suits your needs. This package includes the following tasks:

  • The initial Execute SQL task, which gets the model names and model parameters from the relational database and then stores the results of that query in an ADO.NET rowset variable

  • A Foreach Loop container, which iterates over the values stored in the ADO.NET rowset variable and then passes the new model names and model parameters, one at a time, to the Script task and the Analysis Services Execute DDL task inside the Foreach Loop

  • The Script task, which loads the base XMLA statement from a variable, inserts the new parameters, and then writes out the new XMLA to another variable

  • The Analysis Services Execute DDL task, which executes the updated XMLA statement contained in the variable


Create package and variables (CreateParameterizedModels.dtsx)


  1. Create a new Integration Services package and name it CreateParameterizedModels.dtsx.

  2. Set up the variable that stores the model parameters. The variable name should be User:objAllModelParameters, and the variable should have the type Object. Make sure that the variable has package scope, because it passes values from the query results, returned by the Get Model Parameters (Execute SQL task), to a Foreach loop.

Configure the Execute SQL task (Get Model Parameters)


  1. Add an Execute SQL task to the package you just created and name it Get Model Parameters.

  2. In the Execute SQL Task Editor, specify the database that stores the model parameters. This walkthrough uses an OLE DB connection manager.

  3. For the SQLSourceType property, specify Direct input, and then paste in the following query.

SELECT ModelID AS MODEL_ID,

ModelID AS MODEL_NAME,

ForecastMethod AS FORECAST_METHOD,

PeriodicityHint AS PERIODICITY_HINT

FROM dbo.ModelParameters


  1. For the ResultSet property, choose Full row set. This enables you to store a multi-row result in a variable.

  2. In the Result Set pane, for Result Name, type 0 (zero), and then assign the variable User:objAllModelParameters.

Configure the Foreach Loop container (Foreach Model Definition)


Now that you have loaded a set of parameters into an object variable, you pass the variable to a Foreach Loop container, which then performs an operation for each row of data:

  1. Create a Foreach Loop container, and name it Foreach Model Definition.

  2. Select the Foreach Loop container and then open the Variables window, to create the following variables with the data types specified, scoped to the Foreach Loop container:

User::strBaseXMLA String

User::strModelID String

User::strModelName String

User::strForecastMethod String

User::strPeriodicityHint String

User::strModelXMLA String



  1. Open the Foreach Loop Editor. For Collection, choose ADO.NET enumerator.

  2. In the Foreach Loop Editor, configure the enumerator by choosing the variable User:objAllModelParameters in the ADO object source variable dropdown list. Do not change the default enumeration mode – you are only passing in one table, so the default setting, Rows in the first table, is correct.

  3. Click Variable mappings, and then assign columns from the parameters to the indexes of columns in variable data as follows:

User::strModelID 0

User::strModelName 1

User::strForecastMethod 2

User::strPeriodicityHint 3



  1. Add a Script task inside the Foreach Loop container, and name it Update Model XMLA.

  2. In the Script Task Editor, specify the properties of the variables as follows:

User::strBaseXMLA Read-only

User::strModelID Read-only

User::strModelName Read-only

User::strForecastMethod Read-only

User::strPeriodicityHint Read-only

User::strModelXMLA Read/write



  1. Click Edit Script to add the code that replaces the string for each variable value.

Note: The code for this task is included in the Appendix. The Script task performs a simple operation: it finds the default values in the basic XMLA and inserts the new parameter values by doing string replacement. You could also do this by using regular expressions or XML methods, of course.

  1. Add an Analysis Services Processing task inside the Foreach Loop container and name it Execute Model XMLA.

  2. For the Connection property, specify the instance of Analysis Services where your models are stored.

  3. On the DDL tab, for SourceType, choose Variable, and then select the variable User::strModelXMLA.

This completes the package that creates the models. You can now execute just this package by right-clicking the package in Solution Explorer and then clicking Execute Now.

After the package runs, if there are no errors, you can connect to your Analysis Services database by using SQL Server Management Studio and see the list of new models that were created. However, you cannot browse the models or build prediction queries yet. That is because the models are just metadata until they are processed, and they contain no data or patterns. In the next package, you will process the models.


Phase 3 - Process the Models


This package gets a list of valid models from the Analysis Services server, and then it processes the models using the Analysis Services Processing task.

Until processed, the model that you created by using XMLA is just a definition: a collection of metadata that defines parameters and data source bindings. Processing gets the data from the Forecasting data source, and then it generates patterns based on the algorithm you specified. (For more information about the architecture of mining models, and about processing, see Mining Structures (http://msdn.microsoft.com/en-us/library/ms174757.aspx) in the MSDN Library.)

In summary, this is how the package handles processing:


  • The first Execute Data Mining Query task issues a query to get a list of valid models. That list is written to a temporary table in the relational engine.

  • The next task, an Execute SQL task, retrieves the model names from that table and then puts them in an ADO.NET rowset variable.

  • The Foreach Loop container takes the ADO.NET rowset variable contents as input, and then it processes each model serially, using the embedded Analysis Services Processing task.

  • Finally you update the status of the models.



Yüklə 145,11 Kb.

Dostları ilə paylaş:
1   2   3   4   5   6   7   8   9




Verilənlər bazası müəlliflik hüququ ilə müdafiə olunur ©genderi.org 2024
rəhbərliyinə müraciət

    Ana səhifə