GPFgrowth

class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.GPFgrowth(iFile, minSup, maxPer, minPR, sep='\t')[source]

Bases: partialPeriodicPatterns

Description:

GPFgrowth is an algorithm for mining partial periodic-frequent patterns in a temporal database.

Reference:

R. Uday Kiran, J.N. Venkatesh, Masashi Toyoda, Masaru Kitsuregawa, P. Krishna Reddy, Discovering partial periodic-frequent patterns in a transactional database, Journal of Systems and Software, Volume 125, 2017, Pages 170-182, ISSN 0164-1212, https://doi.org/10.1016/j.jss.2016.11.035.

Parameters:

iFile – str : Name of the input file from which the complete set of patterns is mined
oFile – str : Name of the output file in which the complete set of patterns is stored
minSup – str: Minimum support. The user can specify minSup either as a count or as a proportion of the database size.
minPR – str: Minimum periodic ratio. Controls the minimum proportion of a pattern's periods that must not exceed maxPer.
maxPer – str: Maximum periodicity. Controls the maximum number of transactions in which any two items within a pattern can reappear.
sep – str : Separator used to distinguish items from one another in a transaction. The default separator is a tab; users can override it.
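The two ways of specifying minSup can be illustrated with a minimal sketch. The helper name `to_absolute_support` is hypothetical and is not part of PAMI; it assumes that integers (or integer strings) are counts and that floats (or strings containing a decimal point) are proportions of the database size. How the library itself converts the value is not shown on this page.

```python
def to_absolute_support(min_sup, database_size):
    """Hypothetical helper: turn a count or a proportion into a transaction count."""
    if isinstance(min_sup, str):
        # Strings such as "10" or "0.1" are parsed first (assumed convention).
        min_sup = float(min_sup) if "." in min_sup else int(min_sup)
    if isinstance(min_sup, float):
        # A float is read as a proportion of the database size.
        return min_sup * database_size
    # An integer is already a count.
    return min_sup


print(to_absolute_support(10, 200))     # count is returned unchanged
print(to_absolute_support("0.1", 200))  # proportion of 200 transactions
```

Under these assumptions, a count of 10 and a proportion of 0.1 describe the same threshold for a database of 100 transactions.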

Attributes:

inputFile (file): Name of the input file from which the complete set of patterns is mined
minSup (float): The user-defined minSup
maxPer (float): The user-defined maxPer
minPR (float): The user-defined minPR
finalPatterns (dict): Stores the mined patterns
runTime (float): Total runtime of the mining process
memoryUSS (float): Total amount of USS memory consumed by the program
memoryRSS (float): Total amount of RSS memory consumed by the program

Methods:

mine(): Mining process will start from here
getPatterns(): Complete set of patterns will be retrieved with this function
save(outFile): Complete set of frequent patterns will be written to an output file
getPatternsAsDataFrame(): Complete set of frequent patterns will be loaded into a dataframe
getMemoryUSS(): Total amount of USS memory consumed by the mining process will be retrieved from this function
getMemoryRSS(): Total amount of RSS memory consumed by the mining process will be retrieved from this function
getRuntime(): Total amount of runtime taken by the mining process will be retrieved from this function

Executing code on Terminal:

Format:

>>> python3 GPFgrowth.py <inputFile> <outputFile> <minSup> <maxPer> <minPR>

Examples:

>>> python3 GPFgrowth.py sampleDB.txt patterns.txt 10 10 0.5

Sample run of the importing code:

.. code-block:: python

    from PAMI.partialPeriodicFrequentPattern.basic import GPFgrowth as alg

    # iFile, minSup, maxPer, minPR and oFile are supplied by the user
    obj = alg.GPFgrowth(iFile, minSup, maxPer, minPR)

    obj.mine()

    partialPeriodicFrequentPatterns = obj.getPatterns()
    print("Total number of partial periodic Patterns:", len(partialPeriodicFrequentPatterns))

    # Write the patterns to the output file
    obj.save(oFile)

    Df = obj.getPatternsAsDataFrame()

    memUSS = obj.getMemoryUSS()
    print("Total Memory in USS:", memUSS)

    memRSS = obj.getMemoryRSS()
    print("Total Memory in RSS", memRSS)

    run = obj.getRuntime()
    print("Total ExecutionTime in seconds:", run)

Credits:

The complete program was written by Nakamura under the supervision of Professor Rage Uday Kiran.

getMemoryRSS()[source]

Total amount of RSS memory consumed by the mining process will be retrieved from this function

Returns: RSS memory consumed by the mining process
Return type: float

getMemoryUSS()[source]

Total amount of USS memory consumed by the mining process will be retrieved from this function

Returns: USS memory consumed by the mining process
Return type: float

getPatterns()[source]

Function to send the set of frequent patterns after completion of the mining process

Returns: frequent patterns
Return type: dict

getPatternsAsDataFrame()[source]

Storing final frequent patterns in a dataframe

Returns: frequent patterns in a dataframe
Return type: pd.DataFrame

getRuntime()[source]

Calculating the total amount of runtime taken by the mining process

Returns: total amount of runtime taken by the mining process
Return type: float

printResults()[source]: this function is used to print the results

runTime = 0

save(outFile)[source]

Complete set of frequent patterns will be written to an output file

Parameters: outFile (csv file) – name of the output file

startMine()[source]: Code for the mining process will start from this function

class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.Node[source]

Bases: object

A class used to represent the node of frequentPatternTree

Attributes:

item (int): The item stored in the node
parent (node): The parent of the node
child (list): The children of the node
nodeLink (node): The next node holding the same item
tidList (set): The timestamps of the node

Methods:

getChild(itemName): Returns the child node that holds itemName

getChild(item)[source]

Parameters: item
Returns: the child node that holds item if one exists; otherwise []
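The lookup behaviour described above can be sketched as follows. The class name `SketchNode` is hypothetical; only the attribute names and the "return the child or []" contract are taken from this page, and the library's own implementation may differ.

```python
class SketchNode:
    """Illustrative stand-in for Node; not the library class."""

    def __init__(self, item=None, parent=None):
        self.item = item        # item stored in this node
        self.parent = parent    # parent node, None for the root
        self.child = []         # list of child nodes
        self.nodeLink = None    # next node holding the same item
        self.tidList = set()    # timestamps kept at this node

    def getChild(self, item):
        # Return the child holding `item`, or an empty list if there is none.
        for node in self.child:
            if node.item == item:
                return node
        return []


root = SketchNode()
a = SketchNode("a", parent=root)
root.child.append(a)

print(root.getChild("a") is a)  # the existing child is returned
print(root.getChild("b"))       # an empty list signals "no such child"
```

Because an empty list is falsy, a caller can test the result with `if not node.getChild(item):` before creating a new child.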

class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.PFgrowth(tree, prefix, PFList, minSup, maxPer, minPR, last)[source]

Bases: object

This class implements the pattern-growth algorithm

Attributes:

tree (Node): The root node of the prefix tree
prefix (list): The list of prefix items
PFList (dict): Stores the timestamps of each item
minSup (float): User-defined minimum support
maxPer (float): User-defined maximum periodicity
minPR (float): User-defined minimum PR
last (int): The last timestamp in the database

Methods:

run: Runs the pattern-growth algorithm

run()[source]

Run the pattern-growth algorithm

Returns: the partial periodic frequent patterns in the conditional pattern

class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.Tree[source]

Bases: object

A class used to represent the frequentPatternGrowth tree structure

Attributes:

root (node): The root node of the tree
nodeLinks (dictionary): Stores the last node of each item
firstNodeLink (dictionary): Stores the first node of each item

Methods:

addTransaction(transaction, timeStamp): Adds a transaction as a branch of the frequentPatternTree
fixNodeLinks(itemName, newNode): Links newNode after the last node of the item
deleteNode(itemName): Deletes all nodes of the item
createPrefixTree(path, timeStampList): Creates a prefix tree from a path
createConditionalTree(PFList, minSup, maxPer, minPR, last): Creates a conditional tree whose nodes satisfy IP / (minSup+1) >= minPR

addTransaction(transaction, tid)[source]

Add a transaction to the tree

Parameters:

transaction (list) – one transaction of the database
tid (list) – the timestamp of the transaction

createConditionalTree(PFList, minSup, maxPer, minPR, last)[source]

Create a conditional tree from PFList

Parameters:

PFList (dict) – the timestamps of each item
minSup – the minimum support
maxPer – the maximum periodicity
minPR – the minimum PR
last – the last timestamp in the database

Returns:

the PFList entries that satisfy ip / (minSup+1) >= minPR
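The pruning condition ip / (minSup+1) >= minPR can be illustrated in isolation. In this minimal sketch the function name `keep_by_periodic_ratio` is hypothetical, and it assumes the ip value of each item has already been computed; the library method works on timestamp lists and also removes nodes from the tree, which is not modelled here.

```python
def keep_by_periodic_ratio(ip_counts, min_sup, min_pr):
    """Keep the items whose ip / (minSup + 1) reaches minPR (illustrative only)."""
    return {
        item: ip
        for item, ip in ip_counts.items()
        if ip / (min_sup + 1) >= min_pr
    }


# Assumed, made-up ip values for three items.
ip_counts = {"a": 5, "b": 2, "c": 3}

# With minSup = 3 the denominator is 4, so an item needs ip >= 3 when minPR = 0.75.
print(keep_by_periodic_ratio(ip_counts, min_sup=3, min_pr=0.75))
```

Item "b" is dropped because 2 / 4 = 0.5 falls below 0.75, while "a" and "c" are kept.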

createPrefixTree(path, tidList)[source]

Create a prefix tree from a path

Parameters:

path (list) – the path from the prefix node to the root
tidList (list) – the tids of each item

deleteNode(item)[source]

Delete the nodes of an item from the tree

Parameters: item (str) – the item name of the nodes to delete

fixNodeLinks(item, newNode)[source]

Fix the node links after a node is added

Parameters:

item (string) – the item name of newNode
newNode (Node) – the node that was added
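One way to read the node-link bookkeeping is as a per-item singly linked chain, with firstNodeLink pointing at its head and nodeLinks at its tail. The sketch below follows that reading; `LinkNode`, `LinkTable` and `nodes_of` are hypothetical names introduced for illustration, and the library's own code may be organised differently.

```python
class LinkNode:
    """Illustrative node that only carries what the link chain needs."""

    def __init__(self, item):
        self.item = item
        self.nodeLink = None  # next node holding the same item


class LinkTable:
    """Illustrative holder for the two dictionaries described for Tree."""

    def __init__(self):
        self.nodeLinks = {}      # item -> last node of that item
        self.firstNodeLink = {}  # item -> first node of that item

    def fixNodeLinks(self, item, newNode):
        if item in self.nodeLinks:
            # Append newNode after the current last node of the item.
            self.nodeLinks[item].nodeLink = newNode
        else:
            # First node seen for this item: it starts the chain.
            self.firstNodeLink[item] = newNode
        self.nodeLinks[item] = newNode

    def nodes_of(self, item):
        # Hypothetical helper: walk the chain from first to last node.
        node = self.firstNodeLink.get(item)
        while node is not None:
            yield node
            node = node.nodeLink


table = LinkTable()
for _ in range(3):
    table.fixNodeLinks("a", LinkNode("a"))

print(len(list(table.nodes_of("a"))))
```

Keeping the tail in nodeLinks makes each append constant time, and keeping the head in firstNodeLink lets all nodes of an item be visited without searching the tree.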

class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.calculateIP(maxPer, timeStamp, timeStampFinal)[source]

Bases: object

This class calculates ip from the timestamps

Attributes:

maxPer (float): The user-defined maxPer value
timeStamp (list): The timestamps of an item
timeStampFinal (int): The last timestamp of the database

Methods:

run: Calculates ip from the timestamp list

run()[source]

Calculate ip from the timeStamp list

Returns: the ip value

class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.generatePFListver2(Database, minSup, maxPer, minPR)[source]

Bases: object

Generates the timestamp list from the input file

Attributes:

inputFile (str): The input file name
minSup (float): User-defined minimum support value
maxPer (float): User-defined maximum periodicity value
minPR (float): User-defined minimum PR value
PFList (dict): Stores the timestamps of each item

findSeparator(line)[source]

Find the separator of a line in the database

Parameters: line (list) – one line of the database
Returns: the separator

run()[source]

Generate the PFList

Returns: the timestamps and the last timestamp

class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.generatePFTreever2(Database, tidList)[source]

Bases: object

Creates the tree from tidList and the input file

Attributes:

inputFile (str): The input file name
tidList (dict): Stores the tids of each item
root (Node): The root node of the tree

Methods:

run: Creates the tree
findSeparator(line): Finds the separator in a line of the database

findSeparator(line)[source]

Find the separator of a line in the database

Parameters: line (list) – one line of the database
Returns: the separator

run()[source]

Create the tree from the database and tidList

Returns: the root node of the tree