UPFPGrowthPlus
- class PAMI.uncertainPeriodicFrequentPattern.basic.UPFPGrowthPlus.UPFPGrowthPlus(iFile, minSup, maxPer, sep='\t')[source]
Bases:
_periodicFrequentPatterns
- Description:
UPFPGrowthPlus discovers periodic-frequent patterns in an uncertain temporal database.
- Reference:
Palla Likhitha, Rage Veena, Rage Uday Kiran, Koji Zettsu, Masashi Toyoda, Philippe Fournier-Viger (2023). UPFP-growth++: An Efficient Algorithm to Find Periodic-Frequent Patterns in Uncertain Temporal Databases. ICONIP 2022. Communications in Computer and Information Science, vol 1792. Springer, Singapore. https://doi.org/10.1007/978-981-99-1642-9_16
- Parameters:
iFile – str : Name of the input file from which the complete set of uncertain periodic-frequent patterns is mined
oFile – str : Name of the output file in which the complete set of uncertain periodic-frequent patterns is stored
minSup – str : Minimum support threshold specified by the user. It can be given as a count or as a proportion of the database size.
sep – str : Separator used to distinguish items from one another in a transaction. The default separator is a tab. Users can override the default.
maxPer – float : Maximum periodicity threshold specified by the user.
- Attributes:
- iFile: file
Name of the Input file or path of input file
- oFile: file
Name of the output file or path of output file
- minSup: int or float or str
The user can specify minSup either as a count or as a proportion of the database size. If minSup is an integer, it is treated as a count. Otherwise, it is treated as a proportion. Example: minSup=10 is treated as an integer (count), while minSup=10.0 is treated as a float (proportion).
- maxPer: int or float or str
The user can specify maxPer either as a count or as a proportion of the database size. If maxPer is an integer, it is treated as a count. Otherwise, it is treated as a proportion. Example: maxPer=10 is treated as an integer (count), while maxPer=10.0 is treated as a float (proportion).
- sep: str
Separator used to distinguish items from one another in a transaction. The default separator is a tab. Users can override the default.
- memoryUSS: float
To store the total amount of USS memory consumed by the program
- memoryRSS: float
To store the total amount of RSS memory consumed by the program
- startTime: float
To record the start time of the mining process
- endTime: float
To record the completion time of the mining process
- Database: list
To store the transactions of a database in a list
- mapSupport: Dictionary
To maintain each item and its frequency
- lno: int
To represent the total number of transactions
- tree: class
To represent the Tree class
- itemSetCount: int
To represent the total number of patterns
- finalPatterns: dict
To store the complete patterns
- Methods:
- mine()
Mining process will start from here
- getPatterns()
Complete set of patterns will be retrieved with this function
- savePatterns(oFile)
Complete set of periodic-frequent patterns will be written to an output file
- getPatternsAsDataFrame()
Complete set of periodic-frequent patterns will be loaded into a dataframe
- getMemoryUSS()
Total amount of USS memory consumed by the mining process will be retrieved from this function
- getMemoryRSS()
Total amount of RSS memory consumed by the mining process will be retrieved from this function
- getRuntime()
Total amount of runtime taken by the mining process will be retrieved from this function
- creatingItemSets(fileName)
Scans the dataset and stores in a list format
- updateDatabases()
Updates the database by removing aperiodic items and sorting the items of each transaction in decreasing order of support
- buildTree()
After the database is updated, the remaining items are added to the tree, whose root node is set to null
- convert()
Converts the user-specified threshold value into the form used by the algorithm
- PeriodicFrequentOneItems()
To extract the one-length periodic-frequent items
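The count-versus-proportion rule described for minSup and maxPer, and the role of convert(), can be illustrated with a small sketch. This is not the library's code: the function name `convert_threshold` and its exact handling of strings are assumptions made for illustration, based only on the attribute notes above.

```python
def convert_threshold(value, database_size):
    """Turn a user-specified threshold into an absolute value.

    Assumed rule (hypothetical, modelled on the minSup/maxPer notes):
      int   -> used directly as a count
      float -> treated as a proportion of the database size
      str   -> parsed as a float if it contains '.', otherwise as an int
    """
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it explicitly
        raise TypeError("threshold must be int, float or str")
    if isinstance(value, int):
        return value
    if isinstance(value, float):
        return value * database_size
    if isinstance(value, str):
        if "." in value:
            return float(value) * database_size
        return int(value)
    raise TypeError("threshold must be int, float or str")


# With 100 transactions: 10 stays a count, 0.25 becomes a quarter of the database
count_threshold = convert_threshold(10, 100)
ratio_threshold = convert_threshold(0.25, 100)
```

Under this assumed rule, passing 10.0 instead of 10 changes the meaning from "10 transactions" to "10 times the database size", which is why the integer/float distinction matters.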
Executing the code on terminal:
Format:
(.venv) $ python3 UPFPGrowthPlus.py <inputFile> <outputFile> <minSup> <maxPer>
Example usage:
(.venv) $ python3 UPFPGrowthPlus.py sampleTDB.txt patterns.txt 0.3 4
Note: minSup and maxPer are each interpreted either as a count or as a proportion of the database size.
Importing this algorithm into a Python program
from PAMI.uncertainPeriodicFrequentPattern.basic import UPFPGrowthPlus as alg
obj = alg.UPFPGrowthPlus(iFile, minSup, maxPer)
obj.startMine()
periodicFrequentPatterns = obj.getPatterns()
print("Total number of uncertain Periodic Frequent Patterns:", len(periodicFrequentPatterns))
obj.save(oFile)
Df = obj.getPatternsAsDataFrame()
memUSS = obj.getMemoryUSS()
print("Total Memory in USS:", memUSS)
memRSS = obj.getMemoryRSS()
print("Total Memory in RSS:", memRSS)
run = obj.getRuntime()
print("Total ExecutionTime in seconds:", run)
Credits:
The complete program was written by P.Likhitha under the supervision of Professor Rage Uday Kiran.
- Mine()[source]
Main method in which the patterns are mined by constructing the tree. False patterns are then removed by computing the actual support of each candidate pattern.
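To make the final check concrete, the sketch below computes a pattern's expected support and periodicity directly from the database. It follows the definitions commonly used for uncertain periodic-frequent patterns (support as the sum of products of item probabilities, periodicity as the largest gap between consecutive occurrences). The data layout, the function name, and the `last_ts` argument are assumptions for illustration; the library's tree-based procedure is not reproduced here.

```python
from math import prod


def expected_support_and_periodicity(database, pattern, last_ts):
    """Scan the database once and return (expected support, periodicity).

    database: list of (timestamp, {item: probability}) pairs sorted by
              timestamp (hypothetical layout, not the library's format).
    pattern:  iterable of items.
    last_ts:  timestamp of the final transaction in the database.
    """
    support = 0.0
    previous = 0          # periods are measured from timestamp 0
    periodicity = 0
    for ts, items in database:
        if all(item in items for item in pattern):
            # probability that every item of the pattern occurs together
            support += prod(items[item] for item in pattern)
            periodicity = max(periodicity, ts - previous)
            previous = ts
    # the gap between the last occurrence and the end of the database counts too
    periodicity = max(periodicity, last_ts - previous)
    return support, periodicity


db = [
    (1, {"a": 0.5, "b": 0.5}),
    (2, {"a": 1.0}),
    (4, {"a": 0.5, "b": 1.0}),
    (5, {"b": 0.5}),
]
sup_ab, per_ab = expected_support_and_periodicity(db, ("a", "b"), last_ts=5)
# A pattern is kept only if sup >= minSup and per <= maxPer
keep_ab = sup_ab >= 0.5 and per_ab <= 3
```

In this toy database the pattern (a, b) occurs at timestamps 1 and 4, so its expected support is 0.25 + 0.5 and its largest gap is 3.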
- getMemoryRSS()[source]
Total amount of RSS memory consumed by the mining process will be retrieved from this function
- Returns:
returning RSS memory consumed by the mining process
- Return type:
float
- getMemoryUSS()[source]
Total amount of USS memory consumed by the mining process will be retrieved from this function
- Returns:
returning USS memory consumed by the mining process
- Return type:
float
- getPatterns()[source]
Function to send the set of frequent patterns after completion of the mining process
- Returns:
returning frequent patterns
- Return type:
dict
- getPatternsAsDataFrame()[source]
Storing final frequent patterns in a dataframe
- Returns:
returning frequent patterns in a dataframe
- Return type:
pd.DataFrame
- getRuntime()[source]
Calculating the total amount of runtime taken by the mining process
- Returns:
returning total amount of runtime taken by the mining process
- Return type:
float
- PAMI.uncertainPeriodicFrequentPattern.basic.UPFPGrowthPlus.printTree(root)[source]
Prints the tree, showing for each node its item name, probability, timestamps, and second probability.
- Parameters:
root – Node
- Returns:
Prints every node of the tree with its item, probability, parent item, timestamps, and second probability.
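A traversal of this kind might look like the sketch below. The `Node` class, its field names, and `format_tree` are hypothetical stand-ins chosen to match the fields listed above; they are not the library's classes. The function returns the lines instead of printing them so the result is easy to inspect.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class Node:
    # Hypothetical node layout mirroring the fields named in the description
    item: Optional[str] = None
    probability: float = 0.0
    second_probability: float = 0.0
    timestamps: List[int] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)
    children: Dict[str, "Node"] = field(default_factory=dict, repr=False)

    def add_child(self, child):
        """Attach child under this node and return it."""
        child.parent = self
        self.children[child.item] = child
        return child


def format_tree(root, depth=0):
    """Depth-first walk; one indented line per node below the root."""
    lines = []
    for child in root.children.values():
        parent_item = child.parent.item if child.parent else None
        lines.append(
            f"{'  ' * depth}{child.item} {child.probability} "
            f"{parent_item} {child.timestamps} {child.second_probability}"
        )
        lines.extend(format_tree(child, depth + 1))
    return lines


root = Node()  # the root carries no item
a = root.add_child(Node("a", 0.5, 0.5, [1, 4]))
b = a.add_child(Node("b", 0.25, 0.5, [4]))
for line in format_tree(root):
    print(line)
```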