GPFgrowth

class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.GPFgrowth(iFile, minSup, maxPer, minPR, sep='\t')[source]

Bases: partialPeriodicPatterns

Description:

GPFgrowth is an algorithm for mining partial periodic-frequent patterns in a temporal database.

Reference:

R. Uday Kiran, J.N. Venkatesh, Masashi Toyoda, Masaru Kitsuregawa, P. Krishna Reddy, Discovering partial periodic-frequent patterns in a transactional database, Journal of Systems and Software, Volume 125, 2017, Pages 170-182, ISSN 0164-1212, https://doi.org/10.1016/j.jss.2016.11.035.

Parameters:
  • iFile – str : Name of the input file from which the complete set of patterns is mined

  • oFile – str : Name of the output file in which the complete set of patterns is stored

  • minSup – str : Minimum support threshold. It can be specified either as a count or as a proportion of the database size.

  • minPR – str : Minimum periodic-ratio threshold that a pattern must satisfy.

  • maxPer – str : Controls the maximum number of transactions in which any two items within a pattern can reappear.

  • sep – str : Separator used to distinguish items from one another in a transaction. The default separator is the tab character; users can override it.
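The "count or proportion" convention for minSup can be illustrated with a minimal sketch. The helper `to_count` is hypothetical and not part of PAMI; treating a float (or a string containing a decimal point) as a proportion of the database size is an assumption made for illustration.

```python
def to_count(threshold, db_size):
    """Normalize a threshold given as a count or a proportion (assumed convention)."""
    if isinstance(threshold, str):
        # Assumption: a decimal point marks a proportion, otherwise a count
        threshold = float(threshold) if "." in threshold else int(threshold)
    if isinstance(threshold, float):
        # Proportion of the database size
        return threshold * db_size
    # Integer thresholds are taken as absolute counts
    return threshold


count_from_int = to_count(10, 200)
count_from_float = to_count(0.05, 200)
count_from_str = to_count("0.5", 200)
print(count_from_int, count_from_float, count_from_str)
```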

Attributes:
inputFile : file

Name of the input file from which the complete set of patterns is mined

minSup : float

The user-defined minSup

maxPer : float

The user-defined maxPer

minPR : float

The user-defined minPR

finalPatterns : dict

Stores the discovered patterns

runTime : float

Total runtime of the mining process

memoryUSS : float

Total amount of USS memory consumed by the program

memoryRSS : float

Total amount of RSS memory consumed by the program

Methods:
mine()

Mining process will start from here

getPatterns()

Complete set of patterns will be retrieved with this function

save(outFile)

Complete set of frequent patterns will be written to an output file

getPatternsAsDataFrame()

Complete set of frequent patterns will be loaded into a DataFrame

getMemoryUSS()

Total amount of USS memory consumed by the mining process will be retrieved from this function

getMemoryRSS()

Total amount of RSS memory consumed by the mining process will be retrieved from this function

getRuntime()

Total amount of runtime taken by the mining process will be retrieved from this function

Executing code on Terminal:

Format:
>>> python3 GPFgrowth.py <inputFile> <outputFile> <minSup> <maxPer> <minPR>
Examples:
>>> python3 GPFgrowth.py sampleDB.txt patterns.txt 10 10 0.5

Sample run of the importing code:

.. code-block:: python

    from PAMI.partialPeriodicFrequentPattern.basic import GPFgrowth as alg

    # Placeholder inputs; replace with your own file names and thresholds
    iFile, oFile = 'sampleDB.txt', 'patterns.txt'
    minSup, maxPer, minPR = 10, 10, 0.5

    # The constructor takes the input file and the thresholds;
    # the output file is passed to save() afterwards
    obj = alg.GPFgrowth(iFile, minSup, maxPer, minPR)
    obj.mine()

    partialPeriodicFrequentPatterns = obj.getPatterns()
    print("Total number of partial periodic Patterns:", len(partialPeriodicFrequentPatterns))

    obj.save(oFile)
    Df = obj.getPatternsAsDataFrame()

    memUSS = obj.getMemoryUSS()
    print("Total Memory in USS:", memUSS)

    memRSS = obj.getMemoryRSS()
    print("Total Memory in RSS", memRSS)

    run = obj.getRuntime()
    print("Total ExecutionTime in seconds:", run)

Credits:

The complete program was written by Nakamura under the supervision of Professor Rage Uday Kiran.

getMemoryRSS()[source]

Total amount of RSS memory consumed by the mining process will be retrieved from this function

Returns:

RSS memory consumed by the mining process

Return type:

float

getMemoryUSS()[source]

Total amount of USS memory consumed by the mining process will be retrieved from this function

Returns:

USS memory consumed by the mining process

Return type:

float

getPatterns()[source]

Function to send the set of frequent patterns after completion of the mining process

Returns:

frequent patterns

Return type:

dict

getPatternsAsDataFrame()[source]

Storing final frequent patterns in a dataframe

Returns:

frequent patterns in a dataframe

Return type:

pd.DataFrame

getRuntime()[source]

Calculating the total amount of runtime taken by the mining process

Returns:

total amount of runtime taken by the mining process

Return type:

float

printResults()[source]

this function is used to print the results

runTime = 0
save(outFile)[source]

Complete set of frequent patterns will be written to an output file

Parameters:

outFile (csv file) – name of the output file

startMine()[source]

Code for the mining process will start from this function

class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.Node[source]

Bases: object

A class used to represent the node of frequentPatternTree

Attributes:
item : int

storing item of a node

parent : node

To maintain the parent of every node

child : list

To maintain the children of node

nodeLink : node

To maintain the next node of node

tidList : set

To maintain timestamps of node

Methods:
getChild(itemName)

storing the children to their respective parent nodes

getChild(item)[source]
Parameters:

item

Returns:

The child node holding item, if this node has one; otherwise []

class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.PFgrowth(tree, prefix, PFList, minSup, maxPer, minPR, last)[source]

Bases: object

This class implements the pattern-growth algorithm

Attributes:
tree : Node

represents the root node of the prefix tree

prefix : list

list of prefix items

PFList : dict

stores the timestamps of each item

minSup : float

user-defined minimum support

maxPer : float

user-defined maximum periodicity

minPR : float

user-defined minimum PR

last : int

represents the last timestamp in the database

Methods:
run()

runs the pattern-growth algorithm

run()[source]

run the pattern-growth algorithm

Returns:

partial periodic-frequent patterns found in the conditional pattern

class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.Tree[source]

Bases: object

A class used to represent the frequentPatternGrowth tree structure

Attributes:
root : node

Represents the root node of the tree

nodeLinks : dictionary

storing last node of each item

firstNodeLink : dictionary

storing first node of each item

Methods:
addTransaction(transaction,timeStamp)

creating transaction as a branch in frequentPatternTree

fixNodeLinks(itemName, newNode)

link newNode after the last node of item

deleteNode(itemName)

delete all nodes of item

createPrefixTree(path,timeStampList)

create the prefix tree from a path

createConditionalTree(PFList, minSup, maxPer, minPR, last)

create the conditional tree. Its nodes satisfy IP / (minSup+1) >= minPR

addTransaction(transaction, tid)[source]

add transaction into tree

Parameters:
  • transaction (list) – a single transaction of the database

  • tid (list) – the timestamp of the transaction

createConditionalTree(PFList, minSup, maxPer, minPR, last)[source]

create conditional tree by PFlist

Parameters:
  • PFList (dict) – the timestamps of each item

  • minSup – the minSup threshold

  • maxPer – the maxPer threshold

  • minPR – the minPR threshold

  • last – the last timestamp in the database

Returns:

the PFList entries that satisfy IP / (minSup+1) >= minPR
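The filter IP / (minSup+1) >= minPR stated above can be sketched as follows. This is only an illustration: `prune_by_ip` and the dictionary of per-item IP values are hypothetical names, and the real method also rebuilds the tree, which is not shown here.

```python
def prune_by_ip(ip_by_item, min_sup, min_pr):
    """Keep only the items whose IP value passes IP / (minSup + 1) >= minPR."""
    return {
        item: ip
        for item, ip in ip_by_item.items()
        if ip / (min_sup + 1) >= min_pr
    }


# Hypothetical IP values for three items
kept = prune_by_ip({"a": 5, "b": 2, "c": 4}, min_sup=3, min_pr=0.75)
print(kept)
```

With minSup = 3 the denominator is 4, so item b (2 / 4 = 0.5) falls below minPR = 0.75 and is dropped, while a and c are kept.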

createPrefixTree(path, tidList)[source]

create prefix tree by path

Parameters:
  • path (list) – it represents path to root from prefix node

  • tidList (list) – it represents tid of each item

deleteNode(item)[source]

delete the node from tree

Parameters:

item (str) – it represents the item name of node

fixNodeLinks(item, newNode)[source]

fix node link

Parameters:
  • item (string) – it represents item name of newNode

  • newNode (Node) – it represents node which is added
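To make the roles of Node and Tree concrete, here is a minimal sketch of a prefix tree with node links, using the attribute and method names documented above. It is not the PAMI implementation: the class names `SketchNode` and `SketchTree` are hypothetical, and storing a transaction's timestamp only on the last node of its branch is an assumption.

```python
class SketchNode:
    def __init__(self, item=None, parent=None):
        self.item = item
        self.parent = parent
        self.child = []        # children of this node
        self.nodeLink = None   # next node holding the same item
        self.tidList = set()   # timestamps stored at this node

    def getChild(self, item):
        for node in self.child:
            if node.item == item:
                return node
        return []              # documented "not found" value


class SketchTree:
    def __init__(self):
        self.root = SketchNode()
        self.firstNodeLink = {}  # first node of each item
        self.nodeLinks = {}      # last node of each item

    def fixNodeLinks(self, item, newNode):
        # Append newNode to the chain of nodes holding this item
        if item in self.nodeLinks:
            self.nodeLinks[item].nodeLink = newNode
        else:
            self.firstNodeLink[item] = newNode
        self.nodeLinks[item] = newNode

    def addTransaction(self, transaction, tid):
        current = self.root
        for item in transaction:
            child = current.getChild(item)
            if not child:  # [] is falsy, a node object is truthy
                child = SketchNode(item, current)
                current.child.append(child)
                self.fixNodeLinks(item, child)
            current = child
        # Assumption: the timestamp is kept on the tail node of the branch
        current.tidList.add(tid)


tree = SketchTree()
tree.addTransaction(["a", "b"], 1)
tree.addTransaction(["a", "c"], 2)
tree.addTransaction(["a", "b"], 4)
a = tree.root.getChild("a")
b = a.getChild("b")
print(sorted(b.tidList))
```

Transactions sharing a prefix reuse the same branch, so the two occurrences of a, b end up on a single node whose tidList holds both timestamps.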

class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.calculateIP(maxPer, timeStamp, timeStampFinal)[source]

Bases: object

This class calculates IP from a timestamp list

Attributes:

maxPer : float

the user-defined maxPer value

timeStamp : list

the timestamps of an item

timeStampFinal : int

the last timestamp of the database

Methods:

run()

calculates IP from the timestamp list

run()[source]

calculate IP from the timeStamp list

Returns:

the IP value
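One plausible reading of the IP computation is sketched below: IP is taken as the number of gaps between consecutive timestamps that do not exceed maxPer. Counting the gap from 0 to the first timestamp and from the last timestamp to timeStampFinal is an assumption, as are the function names; the PAMI source may differ in these details.

```python
def count_interesting_periods(time_stamps, max_per, ts_final):
    """Count the gaps between consecutive occurrences that are <= max_per."""
    ts = sorted(time_stamps)
    if not ts:
        return 0
    # Assumption: the database is taken to start at 0 and end at ts_final
    boundaries = [0] + ts + [ts_final]
    return sum(1 for prev, cur in zip(boundaries, boundaries[1:])
               if cur - prev <= max_per)


def periodic_ratio(time_stamps, max_per, ts_final):
    """IP divided by the number of periods (support + 1)."""
    ip = count_interesting_periods(time_stamps, max_per, ts_final)
    return ip / (len(time_stamps) + 1)


ip = count_interesting_periods([1, 3, 4, 9, 10], max_per=2, ts_final=10)
ratio = periodic_ratio([1, 3, 4, 9, 10], max_per=2, ts_final=10)
print(ip, ratio)
```

For the timestamps 1, 3, 4, 9, 10 the gaps are 1, 2, 1, 5, 1, 0; all but the gap of 5 are within maxPer = 2, so five of the six periods count as interesting.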

class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.generatePFListver2(Database, minSup, maxPer, minPR)[source]

Bases: object

generate the timestamp list of each item from the input file

Attributes:
inputFile : str

the input file name

minSup : float

user-defined minimum support value

maxPer : float

user-defined maximum periodicity value

minPR : float

user-defined minimum PR value

PFList : dict

stores the timestamps of each item

findSeparator(line)[source]

find separator of line in database

Parameters:

line (list) – it represents one line in database

Returns:

return separator

run()[source]

generate the PFList

Returns:

the timestamps of each item and the last timestamp
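The first scan over the database can be sketched as below. The input layout is an assumption: each line is taken to hold a timestamp followed by the items of the transaction, all separated by `sep`. The function name `build_timestamp_lists` is hypothetical, and the threshold-based filtering that the class presumably applies afterwards is omitted.

```python
from collections import defaultdict


def build_timestamp_lists(lines, sep="\t"):
    """Collect the timestamps of every item and track the last timestamp."""
    ts_lists = defaultdict(list)
    last = 0
    for line in lines:
        fields = [f for f in line.strip().split(sep) if f]
        if not fields:
            continue  # skip blank lines
        ts = int(fields[0])  # assumption: first field is the timestamp
        last = max(last, ts)
        for item in fields[1:]:
            ts_lists[item].append(ts)
    return dict(ts_lists), last


sample = ["1\ta\tb", "2\ta", "4\tb\tc"]
pf_list, last_ts = build_timestamp_lists(sample)
print(pf_list, last_ts)
```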

class PAMI.partialPeriodicFrequentPattern.basic.GPFgrowth.generatePFTreever2(Database, tidList)[source]

Bases: object

create the tree from tidList and the input file

Attributes:
inputFile : str

the input file name

tidList : dict

stores the tids of each item

root : Node

the root node of the tree

Methods:
run()

creates the tree

findSeparator(line)

finds the separator in a line of the database

findSeparator(line)[source]

find separator of line in database

Parameters:

line (list) – it represents one line in database

Returns:

return separator

run()[source]

create the tree from the database and tidList

Returns:

the root node of the tree