So What Are Patterns? Why Should I Care?
April 3, 2015
All mathematics is a language that is well tuned, finely honed, to describe patterns; be it patterns in a star, which has five points that are regularly arranged, be it patterns in numbers like 2, 4, 6, 8, 10 that follow very regular progression.
Brian Greene

Identifying the rules that describe specific patterns within “data” is an essential skill. To understand why pattern discovery matters, let’s first explore what “patterns” are.

Patterns are sets of items, subsequences, or substructures that occur together frequently in a data set; in other words, they are strongly correlated. Patterns usually represent intrinsic and important properties of the data.

Pattern discovery is the process of uncovering and mining patterns from massive data sets. For example:

  1. You may want to understand what kinds of products are often purchased together
  2. You may want to understand unexpected associations
  3. You may want to understand the sequences of warnings that precede an equipment failure to schedule preventative maintenance

Pattern mining forms the foundation for many other kinds of analysis: association, correlation, and causality analysis; mining sequential and structural patterns; and pattern analysis in spatiotemporal, multimedia, and stream data.

Even classification can become more accurate when it uses discriminative pattern-based analysis. And for cluster analysis, pattern-based subspace clustering could be an important direction.

Let’s look at frequent patterns and association rules. For example, suppose you have five transactions:

Transaction A: Eggs, bread, watermelon, beer

Transaction B: Beer, peanuts, bread

Transaction C: Diapers, wipes, apple sauce

Transaction D: Beer, bread, butter, toilet paper

Transaction E: Bread, cheese, apples

Transaction A contains eggs, bread, watermelon, and beer. Together they form an itemset, which is simply a set of items. This particular one is a 4-itemset because it contains four items. Every itemset has a support: the number of transactions in the data set that contain it. Take the 1-itemset “beer”. It occurs in three of the five transactions (A, B, and D), so its absolute support is 3 and its relative support is 3 over 5, or 60%.
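The support calculation can be reproduced in a few lines of Python. This is only a minimal sketch: the `transactions` dictionary and the `support` helper are names made up for this illustration, not part of any library.

```python
# The five example transactions, each stored as a set of items.
transactions = {
    "A": {"eggs", "bread", "watermelon", "beer"},
    "B": {"beer", "peanuts", "bread"},
    "C": {"diapers", "wipes", "apple sauce"},
    "D": {"beer", "bread", "butter", "toilet paper"},
    "E": {"bread", "cheese", "apples"},
}


def support(itemset, transactions):
    """Return (absolute, relative) support of `itemset` (hypothetical helper)."""
    itemset = set(itemset)
    # A transaction counts if it contains every item of the itemset.
    absolute = sum(1 for items in transactions.values() if itemset <= items)
    return absolute, absolute / len(transactions)


beer_abs, beer_rel = support({"beer"}, transactions)
print(beer_abs, beer_rel)  # 3 0.6
```

The same helper works for larger itemsets, for example `support({"beer", "bread"}, transactions)`.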

An itemset X is frequent if the support of X meets a minimum support threshold. For example, if we set the minimum support threshold to 50%, there are two frequent 1-itemsets in this data set: beer (absolute support 3, relative support 3 over 5, or 60%) and bread (absolute support 4, relative support 4 over 5, or 80%). Every other item appears in only one transaction. But you will also note that in all transactions with “beer” we also find “bread”. Can we assume that people who buy beer also buy bread?
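To check these numbers, here is a rough sketch that counts every item, keeps those at or above a 50% threshold, and then measures how often bread shows up in the beer transactions. The names `MIN_SUPPORT`, `frequent`, and `beer_then_bread` are assumptions made for this example. That last fraction is commonly called the confidence of the rule “beer → bread”, a term this post has not introduced yet.

```python
from collections import Counter

# The five example transactions as a list of item sets.
transactions = [
    {"eggs", "bread", "watermelon", "beer"},       # A
    {"beer", "peanuts", "bread"},                  # B
    {"diapers", "wipes", "apple sauce"},           # C
    {"beer", "bread", "butter", "toilet paper"},   # D
    {"bread", "cheese", "apples"},                 # E
]

MIN_SUPPORT = 0.5  # assumed threshold: 50% of transactions

# Absolute support of every single item (each 1-itemset).
counts = Counter(item for items in transactions for item in items)

# Keep only the items whose relative support meets the threshold.
frequent = {
    item: n / len(transactions)
    for item, n in counts.items()
    if n / len(transactions) >= MIN_SUPPORT
}
print(sorted(frequent.items()))  # [('beer', 0.6), ('bread', 0.8)]

# Of the transactions that contain beer, what fraction also contain bread?
with_beer = [items for items in transactions if "beer" in items]
beer_then_bread = sum(1 for items in with_beer if "bread" in items) / len(with_beer)
print(beer_then_bread)  # 1.0
```

A value of 1.0 only says that bread appeared in every beer transaction in this tiny sample. Bread is also in 80% of all transactions, so it takes more than this one number to decide whether the association is meaningful.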

More about association rules in the next blog…

 

 

About author

Ibrahim Sajid Malick
