PAMI is a Python library containing 100+ algorithms to discover useful patterns in various databases across multiple computing platforms.
Local periodic pattern mining aims to discover all interesting patterns in a temporal database that satisfy three user-specified constraints: the periodicity is no greater than the maximum periodicity (maxPer), the time interval of occurrence is no greater than the maximum period of spillovers (maxSoPer), and the duration is no less than the minimum duration (minDur). The minDur constraint controls the minimum duration over which a pattern must keep reoccurring.
Research paper: Fournier Viger, Philippe & Yang, Peng & Rage, Uday & Ventura, Sebastian & Luna, José María. (2020). Mining Local Periodic Patterns in a Discrete Sequence. Information Sciences. 544. 10.1016/j.ins.2020.09.044.
A temporal database is a collection of transactions, where each transaction consists of a timestamp and a set of items.
A hypothetical temporal database containing the items a, b, c, d, e, f, and g is shown below:
TS | Transactions |
---|---|
1 | a b c g |
2 | b c d e |
3 | a b c d |
4 | a c d f |
5 | a b c d g |
6 | c d e f |
7 | a b c d |
8 | a e f |
9 | a b c d |
10 | b c d e |
Note: Duplicate items must not exist in a transaction.
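To make the notion of periodicity concrete, the following minimal sketch lists the timestamps at which a pattern occurs in the sample database and the gaps between consecutive occurrences. The helper names `occurrences` and `gaps` are hypothetical and not part of PAMI, and the sketch deliberately ignores the maxSoPer and minDur constraints, so it illustrates only the idea behind maxPer rather than the full algorithm.

```python
# Sample temporal database from the table above: timestamp -> set of items.
database = {
    1: {"a", "b", "c", "g"},
    2: {"b", "c", "d", "e"},
    3: {"a", "b", "c", "d"},
    4: {"a", "c", "d", "f"},
    5: {"a", "b", "c", "d", "g"},
    6: {"c", "d", "e", "f"},
    7: {"a", "b", "c", "d"},
    8: {"a", "e", "f"},
    9: {"a", "b", "c", "d"},
    10: {"b", "c", "d", "e"},
}


def occurrences(pattern, db):
    """Return the sorted timestamps whose transaction contains every item of pattern."""
    wanted = set(pattern)
    return sorted(ts for ts, items in db.items() if wanted <= items)


def gaps(timestamps):
    """Return the differences between consecutive occurrence timestamps."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


f_timestamps = occurrences(["f"], database)
print(f_timestamps)        # [4, 6, 8]
print(gaps(f_timestamps))  # [2, 2]
```

Here the item f reappears every 2 timestamps between timestamps 4 and 8, so its gaps stay below a maxPer of 3 within that stretch.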
Each row in a temporal database must contain a timestamp and items, as in the following example:
1 a b c g
2 b c d e
3 a b c d
4 a c d f
5 a b c d g
6 c d e f
7 a b c d
8 a e f
9 a b c d
10 b c d e
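If you want to follow along, the rows above can be written to a file with a few lines of Python. This is only a convenience sketch: it assumes a space as the separator and uses the file name `sampleTemporalDatabase.txt` that appears in the later examples.

```python
from pathlib import Path

# Rows of the sample temporal database: a timestamp followed by space-separated items.
rows = [
    "1 a b c g",
    "2 b c d e",
    "3 a b c d",
    "4 a c d f",
    "5 a b c d g",
    "6 c d e f",
    "7 a b c d",
    "8 a e f",
    "9 a b c d",
    "10 b c d e",
]

# Check the rule stated above: no duplicate items within a transaction.
for row in rows:
    items = row.split()[1:]
    assert len(items) == len(set(items)), f"duplicate item in row: {row}"

Path("sampleTemporalDatabase.txt").write_text("\n".join(rows) + "\n")
```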
To understand the database, the code below prints the statistics of the temporal database.
import PAMI.extras.dbStats.TemporalDatabase as stats
obj = stats.TemporalDatabase('sampleTemporalDatabase.txt', ' ')
obj.run()
obj.printStats()
Database size : 10
Number of items : 7
Minimum Transaction Size : 3
Average Transaction Size : 4.0
Maximum Transaction Size : 5
Minimum period : 1
Average period : 1.0
Maximum period : 1
Standard Deviation Transaction Size : 0.4472135954999579
Variance : 0.2222222222222222
Sparsity : 0.42857142857142855
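Several of these figures can be cross-checked by hand. The sketch below recomputes the size-related statistics in plain Python, without PAMI. The sparsity formula used here (one minus the share of filled cells in a transactions-by-items grid) is an assumption; it happens to reproduce the reported value for this database.

```python
# Item lists of the ten sample transactions, in timestamp order.
transactions = [
    "a b c g", "b c d e", "a b c d", "a c d f", "a b c d g",
    "c d e f", "a b c d", "a e f", "a b c d", "b c d e",
]

sizes = [len(t.split()) for t in transactions]
items = {item for t in transactions for item in t.split()}

database_size = len(transactions)
average_size = sum(sizes) / database_size
# Assumed definition: fraction of empty cells in a transactions x items grid.
sparsity = 1 - sum(sizes) / (database_size * len(items))

print(database_size, len(items))            # 10 7
print(min(sizes), average_size, max(sizes))  # 3 4.0 5
print(round(sparsity, 4))                   # 0.4286
```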
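Algorithms that mine local periodic patterns require a temporal database and the user-specified maxPer, maxSoPer and minDur values.
The temporal database can be provided as one of the following:
- String : E.g., ‘temporalDatabase.txt’
- URL : E.g., https://u-aizu.ac.jp/~udayrage/datasets/transactionalDatabases/transactional_T10I4D100K.csv
- DataFrame. Please note that the dataframe must contain a header with the titles ‘TS’ and ‘Transactions’
maxPer can be specified as one of the following:
- count (between 0 and the length of the database)
- value in the range [0, 1]
maxSoPer can be specified as one of the following:
- count (between 0 and the length of the database)
- value in the range [0, 1]
minDur can be specified as one of the following:
- count (between 0 and the length of the database)
- value in the range [0, 1]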
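The two ways of specifying a threshold can be reconciled by converting a value in [0, 1] into a count. The sketch below assumes that such a value is read as a fraction of the database length; the helper `to_count` is hypothetical and is not PAMI's own conversion routine, whose exact rules may differ.

```python
def to_count(value, database_length):
    """Interpret an int as a count and a float in [0, 1] as a fraction of the database.

    Hypothetical helper for illustration; not part of PAMI.
    """
    if isinstance(value, bool):
        raise TypeError("threshold must be an int or a float")
    if isinstance(value, int):
        if not 0 <= value <= database_length:
            raise ValueError("count must lie between 0 and the database length")
        return value
    if isinstance(value, float):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fraction must lie in [0, 1]")
        return value * database_length
    raise TypeError("threshold must be an int or a float")


print(to_count(3, 10))    # 3
print(to_count(0.5, 10))  # 5.0
```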
syntax: python3 algorithmName.py <path to the input file> <path to the output file> <maxPer> <maxSoPer> <minDur> <separator>
python3 LPPGrowth.py inputFile.txt outputFile.txt 3 4 2 ' '
Install the PAMI package by executing: pip3 install PAMI
import PAMI.localPeriodicPattern.basic.LPPGrowth as alg
iFile = 'sampleTemporalDatabase.txt' #specify the input temporal database
maxPer = 3 #specify the maxPer value
maxSoPer = 5 #specify the maxSoPer value
minDur = 5 #specify the minDur value
separator = ' ' #specify the separator. The default separator is a tab.
oFile = 'localPeriodicPatterns.txt' #specify the output file name
obj = alg.LPPGrowth(iFile, maxPer, maxSoPer, minDur, separator) #initialize the algorithm
obj.mine() #start the mining process
obj.save(oFile) #store the patterns in file
df = obj.getPatternsAsDataFrame() #get the discovered patterns as a dataframe
#obj.printStats() #print the stats of the mining process
The localPeriodicPatterns.txt file contains the following patterns (format: pattern : set of (start, end) time intervals):
!cat localPeriodicPatterns.txt
f : {(4, 8)}
('f', 'd') : {(4, 10)}
('f', 'd', 'c') : {(4, 10)}
('f', 'c') : {(4, 10)}
d : {(2, 10)}
('d', 'c') : {(2, 10)}
('d', 'c', 'b') : {(2, 10)}
('d', 'c', 'b', 'a') : {(3, 10)}
('d', 'c', 'a') : {(3, 10)}
('d', 'b') : {(2, 10)}
('d', 'b', 'a') : {(3, 10)}
('d', 'a') : {(3, 10)}
c : {(1, 10)}
('c', 'b') : {(1, 10)}
('c', 'b', 'a') : {(1, 10)}
('c', 'a') : {(1, 10)}
b : {(1, 10)}
('b', 'a') : {(1, 10)}
a : {(1, 9)}
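For further processing, each line of the output can be turned back into Python objects. The sketch below assumes the file uses exactly the layout shown above (a bare item or a tuple of items, then ` : `, then a set of `(start, end)` tuples); `parse_line` is a hypothetical helper, not a PAMI function.

```python
import ast


def parse_line(line):
    """Split one output line into (pattern tuple, sorted list of (start, end) intervals)."""
    pattern_text, intervals_text = line.split(" : ", 1)
    pattern_text = pattern_text.strip()
    if pattern_text.startswith("("):
        # Multi-item patterns are written as a Python tuple literal.
        pattern = tuple(ast.literal_eval(pattern_text))
    else:
        # Single-item patterns are written as a bare item name.
        pattern = (pattern_text,)
    intervals = sorted(ast.literal_eval(intervals_text.strip()))
    return pattern, intervals


print(parse_line("('f', 'd') : {(4, 10)}"))  # (('f', 'd'), [(4, 10)])
print(parse_line("f : {(4, 8)}"))            # (('f',), [(4, 8)])
```

Using `ast.literal_eval` rather than `eval` keeps the parsing restricted to plain Python literals.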
The dataframe containing the patterns is shown below:
df
| | Patterns | PTL |
---|---|---|
0 | f | {(4, 8)} |
1 | (f, d) | {(4, 10)} |
2 | (f, d, c) | {(4, 10)} |
3 | (f, c) | {(4, 10)} |
4 | d | {(2, 10)} |
5 | (d, c) | {(2, 10)} |
6 | (d, c, b) | {(2, 10)} |
7 | (d, c, b, a) | {(3, 10)} |
8 | (d, c, a) | {(3, 10)} |
9 | (d, b) | {(2, 10)} |
10 | (d, b, a) | {(3, 10)} |
11 | (d, a) | {(3, 10)} |
12 | c | {(1, 10)} |
13 | (c, b) | {(1, 10)} |
14 | (c, b, a) | {(1, 10)} |
15 | (c, a) | {(1, 10)} |
16 | b | {(1, 10)} |
17 | (b, a) | {(1, 10)} |
18 | a | {(1, 9)} |