The ASSOC Procedure


Procedure Syntax

PROC ASSOC Statement

CUSTOMER Statement

TARGET Statement




The ASSOC Procedure


Association discovery is the identification of items that occur together in a given event or record. This
technique is also known as market basket analysis. Online transaction processing systems often provide
the data sources for association discovery. Association rules are based on frequency counts of the
number of times items occur alone and in combination in the database. The rules are expressed as "if
item A is part of an event, then item B is also part of the event X percent of the time." The rules should
not be interpreted as direct causation but as an association between two or more items. Identifying
credible associations can help the business technologist make decisions such as when to distribute
coupons, when to put a product on sale, or how to lay out items in a store.
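
The frequency counts described above can be illustrated with a small sketch. It is not part of the ASSOC procedure; it assumes a hypothetical list of (customer, item) records, groups them into one basket per customer, and then counts how many baskets contain each item and each pair of items. The data and the names `records`, `baskets`, `item_counts`, and `pair_counts` are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical (customer, item) records, one row per purchased item.
records = [
    ("c1", "chips"), ("c1", "salsa"),
    ("c2", "chips"), ("c2", "salsa"), ("c2", "soda"),
    ("c3", "chips"),
    ("c4", "soda"),
]

# Group the rows into one basket (a set of items) per customer.
baskets = {}
for customer, item in records:
    baskets.setdefault(customer, set()).add(item)

# Count the baskets that contain each single item and each pair of items.
item_counts = Counter()
pair_counts = Counter()
for items in baskets.values():
    item_counts.update(items)
    pair_counts.update(combinations(sorted(items), 2))

print(item_counts["chips"])              # baskets containing chips
print(pair_counts[("chips", "salsa")])   # baskets containing both chips and salsa
```

Sorting each basket before forming pairs keeps every pair in a single canonical order, so ("chips", "salsa") and ("salsa", "chips") are not counted separately.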

Hypothetical association discovery rules include the following: If a customer buys shoes, then 10% of the
time he also buys socks. A grocery chain may find that 80% of all shoppers are apt to buy a jar of salsa
when they also purchase a bag of tortilla chips. When "do-it-yourselfers" buy latex paint, they also buy
rollers 85% of the time. Forty percent of investors holding an equity index fund will have a growth fund in their


An association rule has a left side (antecedent) and a right side (consequent). Both sides of the rule can
contain more than one item. The confidence factor, level of support, and lift are three important
evaluation criteria of association discovery. The strength of an association is defined by its confidence
factor, which is the percentage of cases in which a consequent appears with a given antecedent. The level
of support is how frequently the combination occurs in the market basket (database). Lift is equal to the
confidence factor divided by the expected confidence. A credible rule has a relatively large confidence
factor, a relatively large level of support, and a value of lift greater than 1. Rules having a high level of
confidence but little support should be interpreted with caution.
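
As a rough worked example of the three criteria, the sketch below computes support, confidence, and lift for a single rule over a few hypothetical baskets. It is an illustration and not the procedure's implementation. The text does not define expected confidence, so the sketch assumes the common definition: the fraction of all baskets that contain the consequent. The function name `rule_metrics` and the sample data are invented.

```python
def rule_metrics(baskets, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent => consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(baskets)
    n_ante = sum(1 for b in baskets if antecedent <= b)
    n_cons = sum(1 for b in baskets if consequent <= b)
    n_both = sum(1 for b in baskets if (antecedent | consequent) <= b)
    if n == 0 or n_ante == 0 or n_cons == 0:
        raise ValueError("antecedent and consequent must each occur at least once")

    support = n_both / n              # how often the whole combination occurs
    confidence = n_both / n_ante      # consequent present, given the antecedent
    expected_confidence = n_cons / n  # assumed definition: consequent frequency
    lift = confidence / expected_confidence
    return support, confidence, lift


# Hypothetical baskets, one set of items per customer.
baskets = [
    {"chips", "salsa"},
    {"chips", "salsa", "soda"},
    {"chips"},
    {"soda"},
]

support, confidence, lift = rule_metrics(baskets, {"chips"}, {"salsa"})
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

In this toy data, two of the four baskets contain both items, two of the three baskets with chips also contain salsa, and salsa appears in half of all baskets, so the lift is above 1. With only four baskets, the numbers are exactly the kind of thinly supported result that should be read with caution.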

The maximum number of items in an association determines the maximum size of the item set to be
considered. For example, the default of 4 items indicates that associations of up to four items (4-way
associations) are evaluated.
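
To show what the item limit means in practice, the following sketch enumerates the candidate item sets of two or more items that a single basket could contribute under a given maximum. It only illustrates the size bound and makes no claim about how the procedure searches internally; the function `item_sets` and its `max_items` parameter are hypothetical names.

```python
from itertools import combinations


def item_sets(basket, max_items=4):
    """Yield every item set of size 2..max_items drawn from one basket."""
    items = sorted(basket)
    for size in range(2, min(max_items, len(items)) + 1):
        yield from combinations(items, size)


# Hypothetical basket with five distinct items.
basket = {"a", "b", "c", "d", "e"}

up_to_four = list(item_sets(basket))               # pairs, triples, and 4-item sets
up_to_two = list(item_sets(basket, max_items=2))   # pairs only

print(len(up_to_four), len(up_to_two))
```

The number of candidate sets grows quickly with the limit, which is one reason to keep the maximum small.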

Procedure Syntax

PROC ASSOC <option(s)>;

CUSTOMER variable-list;

TARGET variable;

