GPFgrowth
- class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.GPFgrowth(iFile, minSup, maxPer, minPR, sep='\t')[source]
Bases:
partialPeriodicPatterns
- Description:
GPFgrowth is an algorithm for mining partial periodic-frequent patterns in a temporal database.
- Reference:
R. Uday Kiran, J.N. Venkatesh, Masashi Toyoda, Masaru Kitsuregawa, P. Krishna Reddy, Discovering partial periodic-frequent patterns in a transactional database, Journal of Systems and Software, Volume 125, 2017, Pages 170-182, ISSN 0164-1212, https://doi.org/10.1016/j.jss.2016.11.035.
- Parameters:
iFile – str : Name of the input file from which the complete set of patterns is mined
oFile – str : Name of the output file in which the complete set of patterns is stored
minSup – str : Minimum support threshold. The user can specify minSup either as a count or as a proportion of the database size.
minPR – str : Minimum periodic-ratio threshold. Elsewhere on this page the check is written as IP / (minSup+1) >= minPR.
maxPer – str : Controls the maximum number of transactions in which any two items within a pattern can reappear.
sep – str : Separator used to distinguish items from one another in a transaction. The default separator is the tab character; users can override it with their own separator.
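The count-or-proportion convention for minSup can be illustrated with a small helper. This is only a sketch of one plausible reading, not the library's conversion code: the name `to_count` is hypothetical, and the assumed rule is that integer values are absolute counts while fractional values are multiplied by the number of transactions.

```python
def to_count(value, database_size):
    """Interpret a threshold as an absolute count (hypothetical helper).

    Assumption: ints (or integer-like strings) are counts; floats (or
    strings containing a '.') are proportions of the database size.
    """
    if isinstance(value, str):
        value = float(value) if "." in value else int(value)
    if isinstance(value, float):
        return database_size * value
    return value


print(to_count(10, 200))     # count given directly
print(to_count(0.05, 200))   # proportion of 200 transactions
print(to_count("0.5", 200))  # string form of a proportion
```

Under this reading, passing `10` and passing `0.05` describe the same threshold for a database of 200 transactions.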
- Attributes:
- inputFile (file)
Name of the input file from which the complete set of patterns is mined
- minSup (float)
The user-defined minSup
- maxPer (float)
The user-defined maxPer
- minPR (float)
The user-defined minPR
- finalPatterns (dict)
Stores the discovered patterns
- runTime (float)
Total runtime of the mining process
- memoryUSS (float)
Total amount of USS memory consumed by the program
- memoryRSS (float)
Total amount of RSS memory consumed by the program
- Methods:
- mine()
Mining process will start from here
- getPatterns()
Complete set of patterns will be retrieved with this function
- storePatternsInFile(outputFile)
Complete set of frequent patterns will be written to an output file
- getPatternsAsDataFrame()
Complete set of frequent patterns will be loaded into a data frame
- getMemoryUSS()
Total amount of USS memory consumed by the mining process will be retrieved from this function
- getMemoryRSS()
Total amount of RSS memory consumed by the mining process will be retrieved from this function
- getRuntime()
Total amount of runtime taken by the mining process will be retrieved from this function
Executing code on Terminal:
- Format:
>>> python3 GPFgrowth.py <inputFile> <outputFile> <minSup> <maxPer> <minPR>
- Examples:
>>> python3 GPFgrowth.py sampleDB.txt patterns.txt 10 10 0.5
Sample run of the importing code:
.. code-block:: python

    from PAMI.partialPeriodicFrequentPattern.basic import GPFgrowth as alg

    # iFile, oFile, minSup, maxPer and minPR are placeholders set by the caller
    obj = alg.GPFgrowth(iFile, minSup, maxPer, minPR)
    obj.mine()
    partialPeriodicFrequentPatterns = obj.getPatterns()
    print("Total number of partial periodic Patterns:", len(partialPeriodicFrequentPatterns))
    obj.save(oFile)
    Df = obj.getPatternsAsDataFrame()
    memUSS = obj.getMemoryUSS()
    print("Total Memory in USS:", memUSS)
    memRSS = obj.getMemoryRSS()
    print("Total Memory in RSS", memRSS)
    run = obj.getRuntime()
    print("Total ExecutionTime in seconds:", run)
Credits:
The complete program was written by Nakamura under the supervision of Professor Rage Uday Kiran.
- getMemoryRSS()[source]
Total amount of RSS memory consumed by the mining process will be retrieved from this function
- Returns:
RSS memory consumed by the mining process
- Return type:
float
- getMemoryUSS()[source]
Total amount of USS memory consumed by the mining process will be retrieved from this function
- Returns:
USS memory consumed by the mining process
- Return type:
float
- getPatterns()[source]
Function to send the set of frequent patterns after completion of the mining process
- Returns:
the frequent patterns
- Return type:
dict
- getPatternsAsDataFrame()[source]
Storing final frequent patterns in a dataframe
- Returns:
the frequent patterns in a dataframe
- Return type:
pd.DataFrame
- getRuntime()[source]
Calculating the total amount of runtime taken by the mining process
- Returns:
total amount of runtime taken by the mining process
- Return type:
float
- runTime = 0
- class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.Node[source]
Bases:
object
A class used to represent the node of frequentPatternTree
- Attributes:
- item (int)
storing item of a node
- parent (node)
To maintain the parent of every node
- child (list)
To maintain the children of node
- nodeLink (node)
To maintain the next node of node
- tidList (set)
To maintain timestamps of node
- Methods:
- getChild(itemName)
storing the children to their respective parent nodes
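The attributes listed above suggest a node shaped roughly as follows. This is a minimal sketch rather than the library's implementation; in particular, the lookup behaviour of `getChild` (return the matching child, or `None`) and the constructor arguments are assumptions.

```python
class Node:
    """Illustrative tree node; not the library's actual class."""

    def __init__(self, item=None, parent=None):
        self.item = item        # item stored at this node
        self.parent = parent    # parent node (None for the root)
        self.child = []         # child nodes
        self.nodeLink = None    # next node of the same item
        self.tidList = set()    # timestamps recorded at this node

    def getChild(self, itemName):
        """Return the child holding itemName, or None (assumed behaviour)."""
        for node in self.child:
            if node.item == itemName:
                return node
        return None


root = Node()
a = Node("a", parent=root)
root.child.append(a)
print(root.getChild("a") is a)
print(root.getChild("b"))
```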
- class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.PFgrowth(tree, prefix, PFList, minSup, maxPer, minPR, last)[source]
Bases:
object
This class implements the pattern-growth algorithm
- Attributes:
- tree (Node)
represents the root node of the prefix tree
- prefix (list)
list of prefix items
- PFList (dict)
stores the timestamps of each item
- minSup (float)
user-defined minimum support
- maxPer (float)
user-defined maximum periodicity
- minPR (float)
user-defined minPR
- last (int)
represents the last timestamp in the database
- Methods:
- run
runs the pattern-growth algorithm
- class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.Tree[source]
Bases:
object
A class used to represent the frequentPatternGrowth tree structure
- Attributes:
- root (node)
Represents the root node of the tree
- nodeLinks (dictionary)
stores the last node of each item
- firstNodeLink (dictionary)
stores the first node of each item
- Methods:
- addTransaction(transaction,timeStamp)
adds a transaction as a branch of the frequentPatternTree
- fixNodeLinks(itemName, newNode)
links newNode after the last node of the item
- deleteNode(itemName)
deletes all nodes of the item
- createPrefixTree(path,timeStampList)
creates a prefix tree from a path
- createConditionalTree(PFList, minSup, maxPer, minPR, last)
creates a conditional tree whose nodes satisfy IP / (minSup+1) >= minPR
- addTransaction(transaction, tid)[source]
Add a transaction to the tree
- Parameters:
transaction (list) – one transaction of the database
tid (list) – the timestamp of the transaction
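Adding a transaction as a branch can be sketched as an ordinary prefix-tree insertion. The classes `SketchNode` and `SketchTree` below are hypothetical stand-ins, not the library's `Node` and `Tree`; the sketch also assumes that `tid` is a single timestamp and that it is recorded only at the last node of the branch.

```python
class SketchNode:
    def __init__(self, item=None, parent=None):
        self.item = item
        self.parent = parent
        self.children = {}      # item -> SketchNode (a dict is used for brevity)
        self.tidList = set()


class SketchTree:
    def __init__(self):
        self.root = SketchNode()

    def addTransaction(self, transaction, tid):
        """Walk or create one branch per transaction; record tid at its last node."""
        current = self.root
        for item in transaction:
            if item not in current.children:
                current.children[item] = SketchNode(item, current)
            current = current.children[item]
        current.tidList.add(tid)  # assumption: tid is a single timestamp


tree = SketchTree()
tree.addTransaction(["a", "b"], 1)
tree.addTransaction(["a", "b"], 4)
tree.addTransaction(["a", "c"], 2)
b = tree.root.children["a"].children["b"]
print(sorted(b.tidList))
```

Transactions that share a prefix share nodes, so the two `["a", "b"]` transactions end at the same node and only its timestamp set grows.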
- createConditionalTree(PFList, minSup, maxPer, minPR, last)[source]
Create a conditional tree from PFList
- Parameters:
PFList (dict) – the timestamps of each item
minSup – the minSup threshold
maxPer – the maxPer threshold
minPR – the minPR threshold
last – the last timestamp in the database
- Returns:
the PFList entries that satisfy ip / (minSup+1) >= minPR
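The condition ip / (minSup+1) >= minPR can be shown on its own as a simple filter. The function name and the `ip_counts` dictionary (item to number of interesting periods) are hypothetical and stand in for whatever structure the library keeps internally.

```python
def filter_by_periodic_ratio(ip_counts, minSup, minPR):
    """Keep the items whose ip / (minSup + 1) reaches minPR.

    ip_counts is a hypothetical dict mapping item -> ip value.
    """
    return {item: ip for item, ip in ip_counts.items()
            if ip / (minSup + 1) >= minPR}


kept = filter_by_periodic_ratio({"a": 4, "b": 1, "c": 3}, minSup=3, minPR=0.75)
print(kept)
```

With minSup=3 the denominator is 4, so item "b" (ratio 0.25) is dropped while "a" (1.0) and "c" (0.75) are kept.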
- createPrefixTree(path, tidList)[source]
Create a prefix tree from a path
- Parameters:
path (list) – the path from the prefix node to the root
tidList (list) – the tids of each item
- class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.calculateIP(maxPer, timeStamp, timeStampFinal)[source]
Bases:
object
This class calculates ip from a timestamp list
- Attributes:
- maxPer (float)
the user-defined maxPer value
- timeStamp (list)
the timestamps of an item
- timeStampFinal (int)
the last timestamp of the database
- Methods:
- run
calculates ip from the timestamp list
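One way to read this computation is as a count of the gaps between successive occurrences that do not exceed maxPer. The sketch below rests on assumptions that this page does not confirm: the first gap is measured from timestamp 0, the last gap runs to timeStampFinal, and the function name `count_interesting_periods` is hypothetical.

```python
def count_interesting_periods(maxPer, timeStamp, timeStampFinal):
    """Count gaps between successive occurrences that do not exceed maxPer.

    Assumptions: the first gap starts at 0 and the last gap ends at
    timeStampFinal.
    """
    ip = 0
    previous = 0
    for ts in sorted(timeStamp):
        if ts - previous <= maxPer:
            ip += 1
        previous = ts
    if timeStampFinal - previous <= maxPer:
        ip += 1
    return ip


print(count_interesting_periods(2, [1, 3, 4, 8], 10))
```

For the timestamps 1, 3, 4, 8 with a final timestamp of 10, the gaps are 1, 2, 1, 4 and 2; only the gap of 4 exceeds maxPer=2, so four gaps are counted.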
- class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.generatePFListver2(Database, minSup, maxPer, minPR)[source]
Bases:
object
Generates the timestamp list of each item from the input file
- Attributes:
- inputFile (str)
the input file name
- minSup (float)
user-defined minimum support value
- maxPer (float)
user-defined maximum periodicity value
- minPR (float)
user-defined minPR value
- PFList (dict)
stores the timestamps of each item
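Collecting the timestamps of each item can be sketched as a single pass over the database. The in-memory form used here, an iterable of `(timestamp, items)` pairs, and the name `build_timestamp_lists` are assumptions for illustration; the library class reads its data from the input it is given.

```python
from collections import defaultdict


def build_timestamp_lists(database):
    """Map each item to the timestamps of the transactions containing it.

    database is a hypothetical iterable of (timestamp, items) pairs.
    """
    pf_list = defaultdict(list)
    for timestamp, items in database:
        for item in set(items):  # ignore duplicates within a transaction
            pf_list[item].append(timestamp)
    return dict(pf_list)


db = [(1, ["a", "b"]), (2, ["a"]), (4, ["b", "c"])]
print(build_timestamp_lists(db))
```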
- class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.generatePFTreever2(Database, tidList)[source]
Bases:
object
Creates the tree from tidList and the input file
- Attributes:
- inputFile (str)
the input file name
- tidList (dict)
stores the tids of each item
- root (Node)
the root node of the tree
- Methods:
- run
creates the tree
- find separator(line)
finds the separator in a line of the database