ASSOCIATION RULE MINING

OVERVIEW

Association Rule Mining (ARM) is a data mining technique for discovering relationships, or associations, among items in large datasets. These relationships are usually expressed as rules that describe which items tend to occur together. In text summarization, ARM can be applied to find associations between words or phrases in a document: each term is treated as an item, and each rule describes a co-occurrence pattern between terms.
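As a rough illustration of treating terms as items, the sketch below turns a short text into transactions. It assumes that each sentence is one transaction and uses a tiny, hypothetical stop-word list; neither choice comes from the original text, and other units (paragraphs, sliding windows) would work just as well.

```python
import re

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "is", "to", "on"}

def sentences_to_transactions(text):
    """Return one set of lower-cased terms per sentence (assumed transaction unit)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    transactions = []
    for sentence in sentences:
        terms = re.findall(r"[a-z]+", sentence.lower())
        transactions.append({t for t in terms if t not in STOP_WORDS})
    return transactions

text = "The model learns rules. Rules link the terms in a model."
transactions = sentences_to_transactions(text)
for t in transactions:
    print(sorted(t))
```

Each resulting set can then be fed to an ARM algorithm in the same way as a shopping basket in the classic market-basket setting.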

Measures in ARM:

  1. Support: The fraction of transactions in the dataset that contain a given itemset. Higher support means the itemset occurs more often. In text summarization, support can represent how frequently a term or phrase appears in the document, and higher support may indicate that the term is important in the context of the document.

  2. Confidence: The reliability of a rule, defined as the conditional probability that a transaction contains the consequent given that it contains the antecedent. In a text summarization dataset, it indicates how often one term appears when another term is present.

  3. Lift: How much more likely the consequent is to appear in transactions containing the antecedent than would be expected if the two were independent. A lift greater than 1 indicates a positive correlation, a lift of 1 indicates independence, and a lift below 1 indicates a negative correlation. In summarization, lift can show whether the co-occurrence of two terms is more meaningful than expected by chance.
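The three measures can be computed directly from their definitions. The following is a minimal sketch on a made-up set of term transactions; the data and the function names are illustrative assumptions, not part of any particular library.

```python
# Toy transactions (invented for illustration): each set holds the terms of one unit of text.
transactions = [
    {"data", "mining", "rules"},
    {"data", "mining"},
    {"data", "rules"},
    {"text", "summary"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(A and B) / support(A); assumes the antecedent occurs at least once."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

def lift(antecedent, consequent, transactions):
    """confidence(A -> B) / support(B); assumes the consequent occurs at least once."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)

# Rule under test: if {mining} then {data}
sup = support({"mining", "data"}, transactions)
conf = confidence({"mining"}, {"data"}, transactions)
lft = lift({"mining"}, {"data"}, transactions)
print(f"support={sup:.2f} confidence={conf:.2f} lift={lft:.2f}")
```

In this toy data every transaction containing "mining" also contains "data", so the confidence of the rule is 1, and because "data" appears in only three of the four transactions the lift is above 1.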

Rules: In ARM, a rule asserts a relationship between two sets of items. It typically has the form "If {A} then {B}", where A is the antecedent and B is the consequent (the outcome).

Apriori Algorithm:

The Apriori algorithm is a popular and classic algorithm for Association Rule Mining. It works in two main steps:

  1. Generate frequent itemsets:

    • Start with individual items as 1-itemsets.

    • Iteratively generate candidate k-itemsets by joining the frequent (k-1)-itemsets.

    • Prune the candidate itemsets that do not meet the minimum support threshold.

  2. Generate association rules:

    • Create rules from the frequent itemsets.

    • Prune rules based on the minimum confidence threshold.
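The two steps above can be sketched in a few lines of Python. This is a simplified, unoptimized illustration on invented toy data, not a production implementation; the function names `apriori` and `generate_rules` and the thresholds are assumptions made for the example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Step 1: return {frozenset(itemset): support} for all frequent itemsets."""
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    def keep_frequent(candidates):
        scored = {c: support(c) for c in candidates}
        return {c: s for c, s in scored.items() if s >= min_support}

    # Start with individual items as 1-itemsets.
    items = {item for t in transactions for item in t}
    current = keep_frequent({frozenset([i]) for i in items})
    frequent = {}
    k = 2
    while current:
        frequent.update(current)
        # Join frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a, b in combinations(list(current), 2) if len(a | b) == k}
        # Drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Prune candidates below the minimum support threshold.
        current = keep_frequent(candidates)
        k += 1
    return frequent

def generate_rules(frequent, min_confidence):
    """Step 2: return (antecedent, consequent, confidence) for rules that pass the threshold."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # Every subset of a frequent itemset is itself frequent, so the lookup is safe.
                conf = sup / frequent[antecedent]
                if conf >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, conf))
    return rules

# Toy transactions, invented for illustration.
transactions = [
    {"data", "mining", "rules"},
    {"data", "mining"},
    {"data", "rules"},
    {"text", "summary"},
]
frequent = apriori(transactions, min_support=0.5)
rules = generate_rules(frequent, min_confidence=0.8)
for antecedent, consequent, conf in rules:
    print(f"If {set(antecedent)} then {set(consequent)} (confidence {conf:.2f})")
```

The subset check before counting support relies on the property that any superset of an infrequent itemset is also infrequent, which is what lets Apriori discard candidates without scanning the data for them.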

 
