Mining Partial Periodic Spatial Patterns in Temporal Databases

PAMI is a Python library containing 100+ algorithms to discover useful patterns in various databases across multiple computing platforms.

1. What is partial periodic spatial pattern mining?

Partial periodic spatial pattern mining aims to discover all interesting patterns in a geo-referenced temporal database whose periodic support is no less than the user-specified minimum periodic support (minPS) and in which the distance between any two items is no greater than the user-specified maximum distance (maxDist). The minPS constraint controls the minimum number of periodic occurrences of a pattern in the database.

2. What is the temporal database?

A temporal database is a collection of transactions ordered by timestamp, where each transaction contains a timestamp and a set of items.
A hypothetical temporal database containing the items a, b, c, d, e, f, and g is shown below.

TS Transactions
1 a b c g
2 b c d e
3 a b c d
4 a c d f
5 a b c d g
6 c d e f
7 a b c d
8 a e f
9 a b c d
10 b c d e

Note: Duplicate items must not exist in a transaction.

3. What is the acceptable format of a temporal database in PAMI?

Each row in a temporal database must contain a timestamp followed by the items of the transaction. A sample temporal database, say sampleTemporalDatabase.txt, is shown below.

1 a b c g
2 b c d e
3 a b c d
4 a c d f
5 a b c d g
6 c d e f
7 a b c d
8 a e f
9 a b c d
10 b c d e
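As a rough illustration of this row format, the sketch below parses the sample rows into a map from each item to the timestamps at which it occurs. The function name `parse_temporal_database` is hypothetical and not part of PAMI. The sketch assumes a single-space separator, as in the sample file.

```python
from collections import defaultdict

# Contents of sampleTemporalDatabase.txt, inlined so the sketch is self-contained.
SAMPLE = """\
1 a b c g
2 b c d e
3 a b c d
4 a c d f
5 a b c d g
6 c d e f
7 a b c d
8 a e f
9 a b c d
10 b c d e
"""


def parse_temporal_database(text, sep=" "):
    """Map each item to the sorted list of timestamps at which it occurs.

    Hypothetical helper for illustration only; it is not a PAMI function.
    """
    occurrences = defaultdict(list)
    for line in text.splitlines():
        fields = [field for field in line.strip().split(sep) if field]
        if not fields:
            continue  # skip blank lines
        timestamp, items = int(fields[0]), fields[1:]
        # The format forbids duplicate items within a transaction.
        if len(items) != len(set(items)):
            raise ValueError(f"duplicate item at timestamp {timestamp}")
        for item in items:
            occurrences[item].append(timestamp)
    return {item: sorted(ts) for item, ts in occurrences.items()}


item_timestamps = parse_temporal_database(SAMPLE)
print(item_timestamps["a"])  # [1, 3, 4, 5, 7, 8, 9]
```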

4. What is the neighborhood database?

A neighborhood database records, for each item, the items that are its neighbours.

Items neighbours
a b, c, d
b a, e, g
c a, d
d a, c
e b, f
f e, g
g b, f
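The neighbourhood table above can be pictured as a dictionary from each item to its set of neighbours. The sketch below is only an illustration: the helper `is_spatial_pattern` is hypothetical, and it assumes a pattern qualifies as spatial only when every pair of its items are neighbours of each other. That is one plausible reading of the distance constraint, not necessarily the exact rule the library applies.

```python
from itertools import combinations

# The neighbourhood table from this section, written as item -> set of neighbours.
NEIGHBOURS = {
    "a": {"b", "c", "d"},
    "b": {"a", "e", "g"},
    "c": {"a", "d"},
    "d": {"a", "c"},
    "e": {"b", "f"},
    "f": {"e", "g"},
    "g": {"b", "f"},
}


def is_spatial_pattern(pattern, neighbours):
    """Return True if every pair of items in the pattern are mutual neighbours.

    Hypothetical helper; the pairwise rule is an assumption made for illustration.
    """
    return all(
        y in neighbours.get(x, set()) and x in neighbours.get(y, set())
        for x, y in combinations(pattern, 2)
    )


print(is_spatial_pattern(("c", "d"), NEIGHBOURS))  # True
print(is_spatial_pattern(("b", "c"), NEIGHBOURS))  # False
```

Under this assumption, the pair (b, c) is never considered as a candidate pattern, however often b and c occur together, because neither item lists the other as a neighbour.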

5. Why is it useful to understand the statistics of a database?

The statistics of a database, such as its size, the number of distinct items, and the transaction lengths, help in choosing suitable values for the mining parameters.

The sample code below prints the statistical details of the temporal database.

import PAMI.extras.dbStats.TemporalDatabase as stats

obj = stats.TemporalDatabase('sampleTemporalDatabase.txt', ' ')
obj.run()
obj.printStats()
Database size : 10
Number of items : 7
Minimum Transaction Size : 3
Average Transaction Size : 4.0
Maximum Transaction Size : 5
Minimum period : 1
Average period : 1.0
Maximum period : 1
Standard Deviation Transaction Size : 0.4472135954999579
Variance : 0.2222222222222222
Sparsity : 0.42857142857142855

6. What are the input parameters?

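The input parameters to a partial periodic spatial pattern mining algorithm are the temporal database (input file), the neighbourhood database (neighbour file), the minimum periodic support (minPS), the maximum inter-arrival time (maxIAT), and the separator used in the input files.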

7. How to store the output of a partial periodic spatial pattern mining algorithm?

The patterns discovered by a partial periodic spatial pattern mining algorithm can be saved into a file or a data frame.

8. How to run the partial periodic spatial pattern mining algorithms in a terminal?

Syntax: python3 algorithmName.py <path to the input file> <path to the output file> <path to the neighbour file> <minPS> <maxIAT> <separator>

Example: python3 STEclat.py inputFile.txt outputFile.txt neighbourFile.txt 3 4 ' '

9. How to execute a partial periodic spatial pattern mining algorithm in a Jupyter Notebook?

import PAMI.georeferencedPartialPeriodicPattern.basic.STEclat as alg

iFile = 'sampleTemporalDatabase.txt'  # specify the input temporal database
nFile = 'sampleNeighbourFile.txt'  # specify the input neighbourhood database
minPS = 5  # specify the minPS value
maxIAT = 3  # specify the maxIAT value
seperator = ' '  # specify the separator. The default separator is a tab.
oFile = 'partialSpatialPatterns.txt'  # specify the output file name

obj = alg.STEclat(iFile, nFile, minPS, maxIAT, seperator)  # initialize the algorithm
obj.startMine()  # start the mining process
obj.save(oFile)  # store the patterns in a file
df = obj.getPatternsAsDataFrame()  # get the discovered patterns as a dataframe
obj.printResults()  # print the statistics of the mining process
Spatial Periodic Frequent patterns were generated successfully using SpatialEclat algorithm
Total number of  Spatial Partial Periodic Patterns: 6
Total Memory in USS: 129249280
Total Memory in RSS 170835968
Total ExecutionTime in ms: 0.0015385150909423828

The partialSpatialPatterns.txt file contains the following patterns (format: pattern:periodicSupport):

!cat partialSpatialPatterns.txt
c	d: 7 
c	a: 5 
c: 8 
d: 7 
a: 6 
b: 6 
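If the saved patterns need to be read back without PAMI, a line in the `pattern:periodicSupport` format can be split as in the sketch below. The helper `parse_pattern_line` is hypothetical. It assumes that items within a pattern are separated by tabs and that the support follows the last colon, as in the listing above.

```python
def parse_pattern_line(line):
    """Split one saved line into (tuple of items, periodic support).

    Hypothetical helper; assumes tab-separated items and a trailing ': <count>'.
    """
    pattern_part, support_part = line.rsplit(":", 1)
    # Drop empty fields so a trailing tab after the last item is harmless.
    items = tuple(item for item in pattern_part.split("\t") if item)
    return items, int(support_part)  # int() ignores surrounding whitespace


lines = ["c\td: 7 ", "c\ta: 5 ", "c: 8 "]
patterns = dict(parse_pattern_line(line) for line in lines)
print(patterns[("c", "d")])  # 7
```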

The dataframe containing the patterns is shown below:

df
Patterns periodicSupport
0 c\td\t 7
1 c\ta\t 5
2 c\t 8
3 d\t 7
4 a\t 6
5 b\t 6
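The periodic support values reported above can be cross-checked by hand. The sketch below is not PAMI's implementation. It assumes that the periodic support of a pattern is the number of gaps between consecutive occurrences (inter-arrival times) that are no greater than maxIAT. With maxIAT = 3, this reading reproduces the values listed for the sample database.

```python
# The sample temporal database: timestamp -> set of items.
DATABASE = {
    1: {"a", "b", "c", "g"},
    2: {"b", "c", "d", "e"},
    3: {"a", "b", "c", "d"},
    4: {"a", "c", "d", "f"},
    5: {"a", "b", "c", "d", "g"},
    6: {"c", "d", "e", "f"},
    7: {"a", "b", "c", "d"},
    8: {"a", "e", "f"},
    9: {"a", "b", "c", "d"},
    10: {"b", "c", "d", "e"},
}


def periodic_support(pattern, database, max_iat):
    """Count inter-arrival times of the pattern that do not exceed max_iat.

    Hypothetical helper; the definition used here is an assumption.
    """
    wanted = set(pattern)
    timestamps = sorted(ts for ts, items in database.items() if wanted <= items)
    return sum(
        1 for prev, cur in zip(timestamps, timestamps[1:]) if cur - prev <= max_iat
    )


for pattern in [("c", "d"), ("c", "a"), ("c",), ("d",), ("a",), ("b",)]:
    print(pattern, periodic_support(pattern, DATABASE, max_iat=3))
```

For example, the pattern (c, a) occurs at timestamps 1, 3, 4, 5, 7, and 9, giving the inter-arrival times 2, 1, 1, 2, and 2. All five are within maxIAT = 3, so its periodic support is 5, which just meets minPS = 5.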