Mining Local Periodic Patterns in Temporal Databases

PAMI is a Python library containing 100+ algorithms to discover useful patterns in various databases across multiple computing platforms.

What is local periodic pattern mining?

Local periodic pattern mining aims to discover all interesting patterns in a temporal database that satisfy three user-specified constraints: the periodicity of the pattern is no greater than the maximum periodicity (maxPer), the spillover of periods within a time interval of occurrence is no greater than the maximum spillover of periods (maxSoPer), and the duration of that time interval is no less than the minimum duration (minDur). In other words, minDur controls the minimum length of time over which a pattern must keep recurring.
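To make the three constraints concrete, the following is a minimal sketch of one possible reading of the definition, applied to the sorted timestamps of a single pattern. The function name `periodic_intervals`, the spillover update rule `max(0, soPer + period - maxPer)`, and the sample timestamps are assumptions made for illustration. This is not PAMI's implementation, and its handling of interval boundaries may differ.

```python
def periodic_intervals(timestamps, max_per, max_so_per, min_dur):
    """Return (start, end) intervals in which a pattern recurs periodically.

    Assumed rule: an interval opens at the first period that is <= max_per.
    Every later period adds (period - max_per) to a running spillover that
    never drops below 0, and the interval closes once the spillover exceeds
    max_so_per. Only intervals lasting at least min_dur are kept.
    """
    if not timestamps:
        return []
    intervals = []
    start = None   # start timestamp of the currently open interval, if any
    so_per = 0     # accumulated spillover inside the open interval
    prev = timestamps[0]
    for ts in timestamps[1:]:
        per = ts - prev
        if start is None:
            if per <= max_per:
                start, so_per = prev, 0
        else:
            so_per = max(0, so_per + per - max_per)
            if so_per > max_so_per:
                if prev - start >= min_dur:
                    intervals.append((start, prev))
                start = None
        prev = ts
    # Close the interval that is still open at the last occurrence.
    if start is not None and prev - start >= min_dur:
        intervals.append((start, prev))
    return intervals


# Hypothetical timestamps of one pattern (not taken from PAMI output).
ts_list = [1, 2, 3, 10, 20, 21, 22, 23]
print(periodic_intervals(ts_list, max_per=2, max_so_per=3, min_dur=2))
# prints [(1, 3), (20, 23)]
```

In this sketch the long gap between timestamps 3 and 10 pushes the spillover above max_so_per, so the first interval ends at 3 and a new one opens at 20.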

Research paper: Fournier Viger, Philippe & Yang, Peng & Rage, Uday & Ventura, Sebastian & Luna, José María. (2020). Mining Local Periodic Patterns in a Discrete Sequence. Information Sciences. 544. 10.1016/j.ins.2020.09.044.

What is a temporal database?

A temporal database is a collection of transactions ordered by timestamp, where each transaction contains a timestamp and a set of items.
A hypothetical temporal database containing the items a, b, c, d, e, f, and g is shown below:

TS Transactions
1 a b c g
2 b c d e
3 a b c d
4 a c d f
5 a b c d g
6 c d e f
7 a b c d
8 a e f
9 a b c d
10 b c d e

Note: Duplicate items must not exist in a transaction.
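As a small illustration, the sample database above can be held in a plain Python dictionary, checked for duplicate items, and turned into one timestamp list per item. The helper names `check_no_duplicates` and `item_timestamps` are hypothetical and are not part of PAMI.

```python
from collections import defaultdict

# The sample temporal database: timestamp -> items of the transaction.
database = {
    1: ['a', 'b', 'c', 'g'],
    2: ['b', 'c', 'd', 'e'],
    3: ['a', 'b', 'c', 'd'],
    4: ['a', 'c', 'd', 'f'],
    5: ['a', 'b', 'c', 'd', 'g'],
    6: ['c', 'd', 'e', 'f'],
    7: ['a', 'b', 'c', 'd'],
    8: ['a', 'e', 'f'],
    9: ['a', 'b', 'c', 'd'],
    10: ['b', 'c', 'd', 'e'],
}


def check_no_duplicates(db):
    """Raise ValueError if any transaction lists the same item twice."""
    for ts, items in db.items():
        if len(items) != len(set(items)):
            raise ValueError(f"duplicate item in transaction at timestamp {ts}")


def item_timestamps(db):
    """Map each item to the sorted list of timestamps at which it occurs."""
    ts_lists = defaultdict(list)
    for ts in sorted(db):
        for item in db[ts]:
            ts_lists[item].append(ts)
    return dict(ts_lists)


check_no_duplicates(database)
ts_lists = item_timestamps(database)
print(ts_lists['f'])  # prints [4, 6, 8]
```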

Acceptable format of temporal databases in PAMI

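Each row in a temporal database must contain a timestamp followed by the items of the transaction, with all fields separated by the same separator (a space in the example below).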

1 a b c g
2 b c d e
3 a b c d
4 a c d f
5 a b c d g
6 c d e f
7 a b c d
8 a e f
9 a b c d
10 b c d e
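The row format above can be read with a few lines of standard Python. The sketch below is only an illustration of the format: the function name `read_temporal_database` is hypothetical, and it assumes integer timestamps, as in the sample rows.

```python
import io


def read_temporal_database(handle, sep=' '):
    """Parse rows of the form '<timestamp><sep><item><sep><item>...'."""
    database = {}
    for line_no, line in enumerate(handle, start=1):
        fields = [field for field in line.strip().split(sep) if field]
        if not fields:
            continue  # skip blank lines
        if len(fields) < 2:
            raise ValueError(f"line {line_no}: expected a timestamp and at least one item")
        timestamp = int(fields[0])  # assumption: timestamps are integers
        database[timestamp] = fields[1:]
    return database


# The first three rows of the sample database, read from an in-memory file.
sample = io.StringIO("1 a b c g\n2 b c d e\n3 a b c d\n")
db = read_temporal_database(sample)
print(db[1])  # prints ['a', 'b', 'c', 'g']
```

The same function works on a real file opened with `open(path)`, provided the file uses the separator passed as `sep`.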

Understanding the statistics of the database

The code below prints summary statistics of the temporal database, such as its size, the number of items, and the transaction sizes.

The sample code

import PAMI.extras.dbStats.TemporalDatabase as stats

obj = stats.TemporalDatabase('sampleTemporalDatabase.txt', ' ')
obj.run()
obj.printStats() 
Database size : 10
Number of items : 7
Minimum Transaction Size : 3
Average Transaction Size : 4.0
Maximum Transaction Size : 5
Minimum period : 1
Average period : 1.0
Maximum period : 1
Standard Deviation Transaction Size : 0.4472135954999579
Variance : 0.2222222222222222
Sparsity : 0.42857142857142855
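To see where these numbers come from, several of them can be recomputed by hand for the sample database. The formulas below (period as the difference between consecutive timestamps, sparsity as one minus the fraction of filled item slots, population standard deviation, sample variance) are inferred from the printed output and are assumptions about how the statistics are defined, not a copy of PAMI's code.

```python
import statistics

raw = """1 a b c g
2 b c d e
3 a b c d
4 a c d f
5 a b c d g
6 c d e f
7 a b c d
8 a e f
9 a b c d
10 b c d e"""

# timestamp -> items, parsed from the space-separated rows above
database = {int(row.split()[0]): row.split()[1:] for row in raw.splitlines()}

sizes = [len(items) for items in database.values()]
all_items = set().union(*database.values())
timestamps = sorted(database)
periods = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]

stats = {
    'Database size': len(database),
    'Number of items': len(all_items),
    'Minimum Transaction Size': min(sizes),
    'Average Transaction Size': sum(sizes) / len(sizes),
    'Maximum Transaction Size': max(sizes),
    'Minimum period': min(periods),
    'Average period': sum(periods) / len(periods),
    'Maximum period': max(periods),
    'Standard Deviation Transaction Size': statistics.pstdev(sizes),
    'Variance': statistics.variance(sizes),
    'Sparsity': 1 - sum(sizes) / (len(database) * len(all_items)),
}
for name, value in stats.items():
    print(f"{name} : {value}")
```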

What is the input to local periodic pattern mining algorithms?

Algorithms that mine local periodic patterns require a temporal database and three user-specified parameters: maxPer, maxSoPer, and minDur.

How to run the local periodic pattern algorithm in a terminal

syntax: python3 algorithmName.py <path to the input file> <path to the output file> <maxPer> <maxSoPer> <minDur> <separator>

Sample command to execute the LPPGrowth algorithm in the localPeriodicPattern/basic folder

python3 LPPGrowth.py inputFile.txt outputFile.txt 3 4 2 ' '

How to implement the LPPGrowth algorithm by importing the PAMI package

Install the PAMI package by executing: pip3 install PAMI

import PAMI.localPeriodicPattern.basic.LPPGrowth as alg

iFile = 'sampleTemporalDatabase.txt'  # input temporal database

maxPer = 3  # maxPer value
maxSoPer = 5  # maxSoPer value
minDur = 5  # minDur value
separator = ' '  # separator used in the input file; the default separator is a tab
oFile = 'localPeriodicPatterns.txt'  # output file name

obj = alg.LPPGrowth(iFile, maxPer, maxSoPer, minDur, separator)  # initialize the algorithm
obj.startMine()  # start the mining process
obj.save(oFile)  # store the patterns in a file
df = obj.getPatternsAsDataFrame()  # get the discovered patterns as a dataframe
#obj.printStats()  # print the stats of the mining process

The localPeriodicPatterns.txt file contains the following patterns (format: pattern : set of (start, end) time intervals in which the pattern is locally periodic):

!cat localPeriodicPatterns.txt
f : {(4, 8)}
('f', 'd') : {(4, 10)}
('f', 'd', 'c') : {(4, 10)}
('f', 'c') : {(4, 10)}
d : {(2, 10)}
('d', 'c') : {(2, 10)}
('d', 'c', 'b') : {(2, 10)}
('d', 'c', 'b', 'a') : {(3, 10)}
('d', 'c', 'a') : {(3, 10)}
('d', 'b') : {(2, 10)}
('d', 'b', 'a') : {(3, 10)}
('d', 'a') : {(3, 10)}
c : {(1, 10)}
('c', 'b') : {(1, 10)}
('c', 'b', 'a') : {(1, 10)}
('c', 'a') : {(1, 10)}
b : {(1, 10)}
('b', 'a') : {(1, 10)}
a : {(1, 9)}
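Lines in this output format can be loaded back into Python with the standard library. The sketch below assumes that every line has the form `pattern : {(start, end), ...}` exactly as printed above, and that each pair is the start and end timestamp of an interval. The function name `parse_pattern_line` is hypothetical.

```python
import ast


def parse_pattern_line(line):
    """Split 'pattern : {(start, end), ...}' into a tuple of items and a sorted interval list."""
    pattern_text, intervals_text = line.split(' : ', 1)
    pattern_text = pattern_text.strip()
    if pattern_text.startswith('('):
        # Multi-item patterns are printed as Python tuples, e.g. ('f', 'd').
        pattern = tuple(ast.literal_eval(pattern_text))
    else:
        # Single-item patterns are printed as the bare item, e.g. f.
        pattern = (pattern_text,)
    intervals = sorted(ast.literal_eval(intervals_text.strip()))
    return pattern, intervals


lines = ["f : {(4, 8)}", "('d', 'c', 'b', 'a') : {(3, 10)}"]
for line in lines:
    pattern, intervals = parse_pattern_line(line)
    durations = [end - start for start, end in intervals]
    print(pattern, intervals, durations)
# prints ('f',) [(4, 8)] [4]
# prints ('d', 'c', 'b', 'a') [(3, 10)] [7]
```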

The dataframe containing the patterns is shown below:

df
Patterns PTL
0 f {(4, 8)}
1 (f, d) {(4, 10)}
2 (f, d, c) {(4, 10)}
3 (f, c) {(4, 10)}
4 d {(2, 10)}
5 (d, c) {(2, 10)}
6 (d, c, b) {(2, 10)}
7 (d, c, b, a) {(3, 10)}
8 (d, c, a) {(3, 10)}
9 (d, b) {(2, 10)}
10 (d, b, a) {(3, 10)}
11 (d, a) {(3, 10)}
12 c {(1, 10)}
13 (c, b) {(1, 10)}
14 (c, b, a) {(1, 10)}
15 (c, a) {(1, 10)}
16 b {(1, 10)}
17 (b, a) {(1, 10)}
18 a {(1, 9)}