PAMI.relativeFrequentPattern.basic package

Submodules

PAMI.relativeFrequentPattern.basic.RSFPGrowth module

class PAMI.relativeFrequentPattern.basic.RSFPGrowth.RSFPGrowth(iFile: str | DataFrame, minSup: int | float | str, minRS: float, sep: str = '\t')

Bases: _frequentPatterns

Description:

Algorithm to discover all frequent patterns that satisfy a relative support constraint in a given dataset

Reference:

‘Towards Efficient Discovery of Frequent Patterns with Relative Support’ R. Uday Kiran and Masaru Kitsuregawa, http://comad.in/comad2012/pdf/kiran.pdf

Parameters:

iFile – str or DataFrame : Name of the input file (or a DataFrame) from which to mine the complete set of relative frequent patterns
oFile – str : Name of the output file in which to store the complete set of relative frequent patterns
minSup – int, float, or str : Controls the minimum number of transactions in which every item must appear in a database.
minRS – float : Minimum relative support threshold. Controls the minimum number of transactions in which at least one item within a pattern must appear in a database.
sep – str : Separator used to distinguish items from one another in a transaction. The default separator is a tab. Users can override this default.
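To make the interplay of minSup and minRS more concrete, the brute-force sketch below enumerates itemsets in a tiny in-memory database and keeps those that pass both thresholds. It assumes that the relative support of a pattern is its support divided by the smallest support among its items; that definition should be checked against the reference above. The function name `relative_frequent_patterns` and the toy data are hypothetical, and this is not the FP-growth-based procedure that RSFPGrowth itself follows.

```python
from itertools import combinations


def relative_frequent_patterns(transactions, min_sup, min_rs):
    """Brute-force illustration; min_sup is an absolute count, min_rs a ratio."""
    # Support of every single item (counted once per transaction).
    item_support = {}
    for transaction in transactions:
        for item in set(transaction):
            item_support[item] = item_support.get(item, 0) + 1

    items = sorted(item_support)
    patterns = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            # Support: number of transactions containing every item of the candidate.
            support = sum(1 for t in transactions if set(candidate) <= set(t))
            if support < min_sup:
                continue
            # Assumed definition: support divided by the rarest item's support.
            ratio = support / min(item_support[i] for i in candidate)
            if ratio >= min_rs:
                patterns[candidate] = (support, ratio)
    return patterns


toy_db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["a"], ["b", "c"]]
result = relative_frequent_patterns(toy_db, min_sup=2, min_rs=0.6)
for pattern, (support, ratio) in sorted(result.items()):
    print(pattern, support, round(ratio, 2))
```

Raising min_rs in this sketch removes patterns whose support is small compared with the support of their rarest item, even when they still satisfy min_sup.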

Attributes:

iFile – file : Name of the input file from which to mine the complete set of frequent patterns
oFile – file : Name of the output file in which to store the complete set of frequent patterns
memoryUSS – float : Total amount of USS memory consumed by the program
memoryRSS – float : Total amount of RSS memory consumed by the program
startTime – float : Start time of the mining process
endTime – float : Completion time of the mining process
minSup – float : The user-given minSup
minRS – float : The user-given minRS
Database – list : Transactions of the database, stored as a list
mapSupport – Dictionary : Maps each item to its frequency
lno – int : Total number of transactions
tree – class : The Tree class
itemSetCount – int : Total number of patterns
finalPatterns – dict : Stores the discovered patterns
itemSetBuffer – list : Stores the items during mining
maxPatternLength – int : Constraint on the pattern length

Methods:

mine(): Starts the mining process
getPatterns(): Returns the complete set of patterns
save(oFile): Writes the complete set of frequent patterns to an output file
getPatternsAsDataFrame(): Returns the complete set of frequent patterns as a dataframe
getMemoryUSS(): Returns the total amount of USS memory consumed by the mining process
getMemoryRSS(): Returns the total amount of RSS memory consumed by the mining process
getRuntime(): Returns the total runtime taken by the mining process
check(line): Checks the delimiter used in the user input file
creatingItemSets(fileName): Scans the dataset or dataframe and stores the transactions in list format
frequentOneItem(): Extracts the frequent one-length patterns from the transactions
saveAllCombination(tempBuffer, s, position, prefix, prefixLength): Forms all combinations between the prefix and tempBuffer lists with support s
saveItemSet(pattern, support): Stores all frequent patterns with their respective support
frequentPatternGrowthGenerate(frequentPatternTree, prefix, port): Mines the frequent patterns by forming conditional frequentPatternTrees for a particular prefix item. __mapSupport represents the 1-length items with their respective support
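The first two scanning steps named in the list (reading transactions into a list and extracting the frequent one-length items) can be illustrated with the standalone sketch below. The function names `scan_transactions` and `frequent_one_items` are hypothetical and the logic is a simplified assumption of what such steps typically do, not PAMI's internal implementation.

```python
from collections import Counter


def scan_transactions(lines, sep="\t"):
    """Split each non-empty line into a list of items."""
    database = []
    for line in lines:
        items = [item.strip() for item in line.strip().split(sep) if item.strip()]
        if items:
            database.append(items)
    return database


def frequent_one_items(database, min_sup):
    """Count each item once per transaction and keep those with count >= min_sup."""
    counts = Counter()
    for transaction in database:
        counts.update(set(transaction))
    kept = {item: count for item, count in counts.items() if count >= min_sup}
    # Order by descending support, a common ordering before building a prefix tree.
    return dict(sorted(kept.items(), key=lambda pair: (-pair[1], pair[0])))


lines = ["a\tb\tc\n", "a\tb\n", "a\td\n", "\n"]
database = scan_transactions(lines)
print(frequent_one_items(database, min_sup=2))
```

Here min_sup is treated as an absolute transaction count; blank lines are skipped so that they do not count as empty transactions.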

Methods to execute code on terminal

Format:

(.venv) $ python3 RSFPGrowth.py <inputFile> <outputFile> <minSup> <minRS>

Example Usage :

(.venv) $ python3 RSFPGrowth.py sampleDB.txt patterns.txt 0.23 0.2

Note: minSup, when given as a fraction such as 0.23, is considered relative to the number of database transactions.
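The constructor accepts minSup as an int, a float, or a str. One plausible way to normalize such a value into an absolute transaction count is sketched below. The rule that floats (and strings containing a decimal point) denote a fraction of the database size is an assumption made for illustration, and `to_absolute_support` is a hypothetical helper rather than part of PAMI.

```python
def to_absolute_support(value, database_size):
    """Convert an int, float, or numeric string threshold to a transaction count.

    Assumed rule: ints are absolute counts, floats are fractions of the database.
    """
    if isinstance(value, str):
        # A string with a decimal point is treated as a fraction, otherwise as a count.
        value = float(value) if "." in value else int(value)
    if isinstance(value, bool):
        raise TypeError("threshold must be a number, not a bool")
    if isinstance(value, int):
        return value
    if isinstance(value, float):
        return value * database_size
    raise TypeError(f"unsupported threshold type: {type(value).__name__}")


print(to_absolute_support(0.23, 100))
print(to_absolute_support("5", 100))
```

Under this assumed rule, a command-line argument such as 0.23 arrives as a string, is parsed as a float, and is then scaled by the number of transactions.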

Importing this algorithm into a python program

from PAMI.relativeFrequentPattern.basic import RSFPGrowth as alg

obj = alg.RSFPGrowth(iFile, minSup, minRS)

obj.startMine()

frequentPatterns = obj.getPatterns()

print("Total number of Frequent Patterns:", len(frequentPatterns))

obj.save(oFile)

Df = obj.getPatternsAsDataFrame()

memUSS = obj.getMemoryUSS()

print("Total Memory in USS:", memUSS)

memRSS = obj.getMemoryRSS()

print("Total Memory in RSS:", memRSS)

run = obj.getRuntime()

print("Total ExecutionTime in seconds:", run)
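The snippet above leaves its input variables undefined. A minimal way to prepare a small input file for experimentation is sketched below, assuming one transaction per line with items separated by the default tab separator. The file location and the items are hypothetical.

```python
import os
import tempfile

# Hypothetical toy transactions; each inner list is one transaction.
transactions = [
    ["bread", "milk", "butter"],
    ["bread", "milk"],
    ["milk", "butter"],
    ["bread", "jam"],
]

# Write one transaction per line, items separated by a tab (the documented default sep).
input_path = os.path.join(tempfile.mkdtemp(), "sampleDB.txt")
with open(input_path, "w", encoding="utf-8") as handle:
    for transaction in transactions:
        handle.write("\t".join(transaction) + "\n")

# Read the file back to confirm the layout.
with open(input_path, encoding="utf-8") as handle:
    written_lines = handle.read().splitlines()
print(written_lines)
```

The resulting path could then be passed as iFile to the constructor.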

Credits:

The complete program was written by Sai Chitra.B under the supervision of Professor Rage Uday Kiran.

Mine() → None

Main program to start the operation

Returns: None

getMemoryRSS() → float

Returns the total amount of RSS memory consumed by the mining process

Returns: RSS memory consumed by the mining process
Return type: float

getMemoryUSS() → float

Returns the total amount of USS memory consumed by the mining process

Returns: USS memory consumed by the mining process
Return type: float

getPatterns() → Dict[str, str]

Returns the set of frequent patterns after the mining process completes

Returns: frequent patterns
Return type: dict

getPatternsAsDataFrame() → DataFrame

Stores the final frequent patterns in a dataframe

Returns: frequent patterns in a dataframe
Return type: pd.DataFrame

getRuntime() → float

Calculates the total runtime taken by the mining process

Returns: total runtime taken by the mining process
Return type: float

printResults() → None

Prints the results

Returns: None

save(outFile: str) → None

Writes the complete set of frequent patterns to an output file

Parameters: outFile (str) – name of the output file.
Returns: None
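As an illustration of what writing a pattern dictionary to a file can look like, the sketch below stores each entry on its own line. The `pattern:value` line layout, the helper name `write_patterns`, and the sample dictionary are assumptions for illustration; the exact layout produced by save() is not specified here.

```python
import os
import tempfile


def write_patterns(patterns, out_path):
    """Write each pattern and its stored value on one line as 'pattern:value'."""
    with open(out_path, "w", encoding="utf-8") as writer:
        for pattern, value in patterns.items():
            writer.write(f"{pattern}:{value}\n")


# Hypothetical mining result: pattern string mapped to a value string.
patterns = {"a": "4", "a\tb": "2"}
out_path = os.path.join(tempfile.mkdtemp(), "patterns.txt")
write_patterns(patterns, out_path)
with open(out_path, encoding="utf-8") as reader:
    saved_lines = reader.read().splitlines()
print(saved_lines)
```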

startMine() → None

Main program to start the operation

Returns: None