PAMI is a Python library containing 100+ algorithms to discover useful patterns in various databases across multiple computing platforms. (Active)
Geo-referenced fuzzy periodic-frequent pattern mining aims to discover all spatial fuzzy periodic patterns in a fuzzy temporal database that satisfy three constraints: support no less than the user-specified minimum support (minSup), periodicity no greater than the user-specified maximum periodicity (maxPer), and distance between any two items no greater than the user-specified maximum distance (maxDist). minSup controls the minimum number of transactions in which a pattern must appear, and maxPer controls the maximum time interval within which a pattern must reappear in the database.
Reference: Veena et al., Mining Geo-referenced Fuzzy Periodic-Frequent Patterns in Geo-Referenced Fuzzy Temporal Databases, to appear in IEEE FUZZ 2022.
A fuzzy temporal database is a collection of transactions ordered by timestamp, where each transaction contains a timestamp, a set of fuzzy items, and the fuzzy value of each item.
A hypothetical fuzzy temporal database with the items a, b, c, d, e, f and g and their fuzzy values is shown below:
| TS | Transactions |
|---|---|
| 1 | (a.L,0.2) (b.M,0.3) (c.H,0.1) (g.M,0.1) |
| 2 | (b.M,0.3) (c.H,0.2) (d.L,0.3) (e.H,0.2) |
| 3 | (a.L,0.2) (b.M,0.1) (c.H,0.3) (d.L,0.4) |
| 4 | (a.L,0.3) (c.H,0.2) (d.L,0.1) (f.M,0.2) |
| 5 | (a.L,0.3) (b.M,0.1) (c.H,0.2) (d.L,0.1) (g.M,0.2) |
| 6 | (c.H,0.2) (d.L,0.2) (e.H,0.3) (f.M,0.1) |
| 7 | (a.L,0.2) (b.M,0.1) (c.H,0.1) (d.L,0.2) |
| 8 | (a.L,0.1) (e.H,0.2) (f.M,0.2) |
| 9 | (a.L,0.2) (b.M,0.2) (c.H,0.4) (d.L,0.2) |
| 10 | (b.M,0.3) (c.H,0.2) (d.L,0.2) (e.H,0.2) |
Note: Duplicate items must not exist in a transaction.
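To make the two thresholds concrete, the minimal sketch below computes the support and the periodicity of a single fuzzy item over the hypothetical table above. It assumes that support is the sum of an item's fuzzy values and that periodicity is the largest gap between consecutive occurrences, counting the gap from time 0 to the first occurrence and from the last occurrence to the final timestamp. These conventions are assumptions made for illustration, not a description of PAMI's internals, and the function names are hypothetical.

```python
# Rows of the hypothetical table: timestamp -> {fuzzy item: fuzzy value}
database = {
    1: {"a.L": 0.2, "b.M": 0.3, "c.H": 0.1, "g.M": 0.1},
    2: {"b.M": 0.3, "c.H": 0.2, "d.L": 0.3, "e.H": 0.2},
    3: {"a.L": 0.2, "b.M": 0.1, "c.H": 0.3, "d.L": 0.4},
    4: {"a.L": 0.3, "c.H": 0.2, "d.L": 0.1, "f.M": 0.2},
    5: {"a.L": 0.3, "b.M": 0.1, "c.H": 0.2, "d.L": 0.1, "g.M": 0.2},
    6: {"c.H": 0.2, "d.L": 0.2, "e.H": 0.3, "f.M": 0.1},
    7: {"a.L": 0.2, "b.M": 0.1, "c.H": 0.1, "d.L": 0.2},
    8: {"a.L": 0.1, "e.H": 0.2, "f.M": 0.2},
    9: {"a.L": 0.2, "b.M": 0.2, "c.H": 0.4, "d.L": 0.2},
    10: {"b.M": 0.3, "c.H": 0.2, "d.L": 0.2, "e.H": 0.2},
}


def fuzzy_support(db, item):
    """Assumed definition: sum of the item's fuzzy values over all rows."""
    return sum(row.get(item, 0.0) for row in db.values())


def periodicity(db, item):
    """Assumed definition: largest gap between consecutive occurrences,
    including the gaps from time 0 and up to the final timestamp."""
    occurrences = sorted(ts for ts, row in db.items() if item in row)
    points = [0] + occurrences + [max(db)]
    return max(later - earlier for earlier, later in zip(points, points[1:]))


for item in ("a.L", "e.H"):
    print(item, round(fuzzy_support(database, item), 2), periodicity(database, item))
```

Under these assumptions, an item qualifies when its support is at least minSup and its periodicity is at most maxPer.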
Each row in a fuzzy temporal database file must contain a timestamp, the list of fuzzy items, a colon as the separator, and the list of their fuzzy values.
A sample fuzzy temporal database file, say fuzzyTemporalDatabase.txt, is provided below:
1 a.L b.M c.H g.M:0.2 0.3 0.1 0.1
2 b.M c.H d.L e.H:0.13 0.2 0.3 0.2
3 a.L b.M c.H d.L:0.2 0.1 0.3 0.4
4 a.L c.H d.L f.M:0.3 0.2 0.1 0.2
5 a.L b.M c.H d.L g.M:0.3 0.1 0.2 0.1 0.2
6 c.H d.L e.H f.M:0.2 0.2 0.3 0.1
7 a.L b.M c.H d.L g.M:0.3 0.1 0.2 0.1 0.2
8 b.M c.H d.L:0.2 0.1 0.1 0.2
9 a.L b.M c.H d.L g.M:0.3 0.1 0.2 0.1 0.2
10 b.M c.H d.L e.H:0.3 0.2 0.2 0.2
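The row format can be checked with a small parser before mining. The sketch below is illustrative only: `parse_row` is a hypothetical helper, not part of PAMI, and it assumes a single space as the delimiter on both sides of the colon. It rejects rows with duplicate items and rows whose item and value counts differ, such as a row with three items and four values.

```python
def parse_row(line, sep=" "):
    """Split one row into (timestamp, {fuzzy item: fuzzy value})."""
    left, right = line.strip().split(":")
    timestamp, *items = left.split(sep)
    values = [float(v) for v in right.split(sep)]
    if len(items) != len(values):
        raise ValueError(
            f"timestamp {timestamp}: {len(items)} items but {len(values)} values"
        )
    if len(set(items)) != len(items):
        raise ValueError(f"timestamp {timestamp}: duplicate items")
    return int(timestamp), dict(zip(items, values))


ts, row = parse_row("1 a.L b.M c.H g.M:0.2 0.3 0.1 0.1")
print(ts, row)
```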
For more information on how to create a fuzzy transactional database from a quantitative (or utility) transactional database, please refer to the manual utility2FuzzyDB.pdf
A spatial database contains the spatial (neighbourhood) information of items: each item together with its nearest neighbours that satisfy the maxDist constraint.
A hypothetical spatial database containing the items a, b, c, d, e, f and g and their neighbours is shown below.
| Item | Neighbours |
|---|---|
| a | b, c, d |
| b | a, e, g |
| c | a, d |
| d | a, c |
| e | b, f |
| f | e, g |
| g | b, f |
In a spatial database file, each row lists an item followed by its neighbours, separated by a delimiter (a space in this example).
The hypothetical spatial database shown above is stored in a file as follows:
a b c d
b a e g
c a d
d a c
e b f
f e g
g b f
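A neighbour file in this layout can be loaded into a dictionary and used to test whether the items of a candidate pattern are close to one another. The sketch below is a hedged illustration: both function names are hypothetical, and the rule that every pair of items in a pattern must be neighbours is an assumption about how the maxDist constraint is applied, not a statement of PAMI's implementation.

```python
def parse_neighbours(lines, sep=" "):
    """Map each item to the set of neighbours listed on its row."""
    neighbours = {}
    for line in lines:
        parts = line.strip().split(sep)
        if parts == [""]:
            continue  # skip blank lines
        item, *near = parts
        neighbours[item] = set(near)
    return neighbours


def are_mutual_neighbours(pattern, neighbours):
    """Assumed rule: every item lists every other item as a neighbour."""
    return all(
        b in neighbours.get(a, set())
        for a in pattern
        for b in pattern
        if a != b
    )


rows = ["a b c d", "b a e g", "c a d", "d a c", "e b f", "f e g", "g b f"]
neighbours = parse_neighbours(rows)
print(are_mutual_neighbours(["a", "c", "d"], neighbours))
print(are_mutual_neighbours(["a", "b", "c"], neighbours))
```

With the sample rows, the first pattern passes because a, c and d all list each other, while the second fails because b does not list c.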
For more information on how to create a neighborhood file for a given dataset, please refer to the manual of creating neighborhood file.
Before mining, it helps to understand the database. The sample code below prints the statistical details of a fuzzy database.
import PAMI.extras.dbStats.FuzzyDatabase as stats
obj = stats.FuzzyDatabase('sampleInputFile.txt', ' ')  # input file and its separator
obj.run()  # compute the statistics
obj.printStats()  # print the statistics
The input parameters to a geo-referenced fuzzy periodic-frequent pattern mining algorithm are:
- Fuzzy temporal database (input file), given as one of:
  - String: e.g., 'fuzzyDatabase.txt'
  - URL: e.g., https://u-aizu.ac.jp/~udayrage/datasets/fuzzyDatabases/fuzzy_T10I4D100K.csv
  - DataFrame with the headers titled 'Transactions' and 'fuzzyValues'
- Spatial (neighbour) database, given as one of:
  - String: e.g., 'spatialDatabase.txt'
  - URL: e.g., https://u-aizu.ac.jp/~udayrage/datasets/fuzzyDatabases/neighbour_T10I4D100K.csv
  - DataFrame with the headers titled 'item' and 'Neighbours'
- minSup, specified as either:
  - count (between 0 and the length of the database)
  - [0, 1]
- maxPer, specified as either:
  - count (between 0 and the length of the database)
  - [0, 1]
The patterns discovered by a geo-referenced fuzzy periodic frequent pattern mining algorithm can be saved into a file or a data frame.
foo@bar: cd PAMI/fuzzyGeoreferencedPeriodicFrequentPattern/basic
foo@bar: python3 algorithmName.py inputFile outputFile neighbourFile minSup maxPer separator
Example: python3 FGPFPMiner.py inputFile.txt outputFile.txt neighbourFile.txt 5 3 ' '
import PAMI.fuzzyGeoreferencedPeriodicFrequentPattern.basic.FGPFPMiner as alg
iFile = 'sampleFuzzyTemporal.txt'  # specify the input fuzzy temporal database
minSup = 0.8  # specify the minSup value
maxPer = 4  # specify the maxPer value
separator = ' '  # specify the separator used in the input files
oFile = 'fuzzySpatialPeriodicFrequentPatterns.txt'  # specify the output file name
nFile = 'sampleNeighbourFile.txt'  # specify the neighbour file of the database
obj = alg.FGPFPMiner(iFile, nFile, minSup, maxPer, separator)  # initialize the algorithm
obj.mine()  # start the mining process
obj.save(oFile)  # store the patterns in a file
df = obj.getPatternsAsDataFrame()  # get the discovered patterns as a dataframe
obj.printResults()  # print the stats of the mining process
Total number of Spatial Fuzzy Periodic-Frequent Patterns: 5
Total Memory in USS: 98078720
Total Memory in RSS 137539584
Total ExecutionTime in ms: 0.0010845661163330078
!cat fuzzySpatialPeriodicFrequentPatterns.txt
#format: fuzzyGeoreferencedPeriodicFrequentPattern:support
e.H:0.8
b.M:1.0999999999999999
d.L:1.63
a.L:1.5
c.H:1.9999999999999998
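The saved file can be read back for further processing without any dependency on PAMI. The sketch below is a minimal, hypothetical reader: `read_patterns` is not a PAMI function, and it assumes only what the listing above shows, namely a `#format` comment line followed by `pattern:support` rows. The pattern is kept as a raw string, since the delimiter used between items of longer patterns is not shown here.

```python
def read_patterns(lines):
    """Return {pattern: support} from rows of the form 'pattern:support'."""
    patterns = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the '#format' header
        pattern, support = line.rsplit(":", 1)
        patterns[pattern] = float(support)
    return patterns


saved = [
    "#format: fuzzyGeoreferencedPeriodicFrequentPattern:support",
    "e.H:0.8",
    "b.M:1.0999999999999999",
    "d.L:1.63",
    "a.L:1.5",
    "c.H:1.9999999999999998",
]

# Rank the patterns from highest to lowest support.
ranked = sorted(read_patterns(saved).items(), key=lambda kv: kv[1], reverse=True)
for pattern, support in ranked:
    print(pattern, round(support, 2))
```

To read the real file instead of the inline list, pass an open file handle, for example `read_patterns(open(oFile))`.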
The dataframe containing the patterns is shown below:
df
| | Patterns | Support |
|---|---|---|
| 0 | e.H | 0.8 |
| 1 | b.M | 1.0999999999999999 |
| 2 | d.L | 1.63 |
| 3 | a.L | 1.5 |
| 4 | c.H | 1.9999999999999998 |