PAMI.highUtilityPattern.parallel package
Submodules
PAMI.highUtilityPattern.parallel.abstract module
PAMI.highUtilityPattern.parallel.efimparallel module
- class PAMI.highUtilityPattern.parallel.efimparallel.efimParallel(iFile, minUtil, sep='\t', threads=1)[source]
Bases:
_utilityPatterns
- Description:
EFIM is one of the fastest algorithms for mining high-utility itemsets from transactional databases.
- Reference:
Zida, S., Fournier-Viger, P., Lin, J.CW. et al. EFIM: a fast and memory efficient algorithm for high-utility itemset mining. Knowl Inf Syst 51, 595–625 (2017). https://doi.org/10.1007/s10115-016-0986-0
- Parameters:
iFile – str : Name of the input file from which to mine the complete set of high-utility patterns.
oFile – str : Name of the output file in which to store the complete set of high-utility patterns.
minUtil – int : The user-specified minimum utility threshold.
maxMemory – int : Maximum memory used by this program while running.
sep – str : The separator that distinguishes items from one another in a transaction. The default separator is a tab character; users can override it with their own separator.
- Attributes:
- inputFile (str):
The input file path.
- minUtil (int):
The minimum utility threshold.
- sep (str):
The separator used in the input file.
- threads (int):
The number of threads to use.
- Patterns (dict):
A dictionary containing the discovered patterns.
- rename (dict):
A dictionary containing the mapping between the item IDs and their names.
- runtime (float):
The runtime of the algorithm in seconds.
- memoryRSS (int):
The Resident Set Size (RSS) memory usage of the algorithm in bytes.
- memoryUSS (int):
The Unique Set Size (USS) memory usage of the algorithm in bytes.
- Methods:
- read_file():
Read the input file and return the filtered transactions, primary items, and secondary items.
- binarySearch(arr, item):
Perform a binary search on the given array to find the given item.
- project(beta, file_data, secondary):
Project the given beta itemset on the given database.
- search(collections):
Search for high utility itemsets in the given collections.
- mine():
Start the EFIM algorithm.
- savePatterns(outputFile):
Save the patterns discovered by the algorithm to an output file.
- getPatterns():
Get the patterns discovered by the algorithm.
- getRuntime():
Get the runtime of the algorithm.
- getMemoryRSS():
Get the Resident Set Size (RSS) memory usage of the algorithm.
- getMemoryUSS():
Get the Unique Set Size (USS) memory usage of the algorithm.
- printResults():
Print the results of the algorithm.
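To make the role of the minUtil threshold and the Patterns dictionary above more concrete, here is a minimal sketch of what "high utility" means. It is a brute-force enumeration for illustration only, not the EFIM algorithm and not PAMI code. The toy database, the helper names, and the use of tuples as dictionary keys are all assumptions made for this example.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to its utility
# (for example, quantity times unit profit) in that transaction.
transactions = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3},
]


def itemset_utility(itemset, database):
    """Sum the utility of `itemset` over the transactions that contain all of its items."""
    total = 0
    for transaction in database:
        if all(item in transaction for item in itemset):
            total += sum(transaction[item] for item in itemset)
    return total


def brute_force_high_utility(database, min_util):
    """Enumerate every itemset and keep those whose utility is at least `min_util`.

    This is exponential in the number of items; EFIM exists precisely to
    avoid this exhaustive search.
    """
    items = sorted({item for transaction in database for item in transaction})
    patterns = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, database)
            if utility >= min_util:
                patterns[itemset] = utility
    return patterns


patterns = brute_force_high_utility(transactions, min_util=15)
print(patterns)
```

With a threshold of 15, only the itemsets ("a",) and ("a", "c") qualify in this toy database. Note that ("a", "c") has a higher utility than ("a",) alone, which is why utility, unlike support, cannot be pruned with a simple downward-closure rule.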
- getMemoryRSS()[source]
Get the Resident Set Size (RSS) memory usage of the algorithm.
- Returns:
The RSS memory usage in bytes.
- Return type:
int
- getMemoryUSS()[source]
Get the Unique Set Size (USS) memory usage of the algorithm.
- Returns:
The USS memory usage in bytes.
- Return type:
int
- getPatterns()[source]
Get the patterns discovered by the algorithm.
- Returns:
A dictionary containing the discovered patterns.
- Return type:
dict
- getPatternsAsDataFrame()[source]
Store the final patterns in a dataframe.
- Returns:
The patterns in a dataframe.
- Return type:
pd.DataFrame
- getRuntime()[source]
Get the runtime of the algorithm.
- Returns:
The runtime in seconds.
- Return type:
float
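The getters documented above are typically called after mine() has finished. The following sketch shows one possible way to collect their results in a single dictionary. The helper name mine_and_summarise and the keys of the returned dictionary are hypothetical; the miner class is passed in as an argument so that the helper relies only on the constructor signature and method names listed on this page.

```python
def mine_and_summarise(miner_cls, input_file, min_util, sep="\t", threads=1):
    """Run a miner exposing the interface documented above and collect its results.

    `miner_cls` is expected to accept (iFile, minUtil, sep, threads) and to
    provide mine(), getPatterns(), getRuntime(), getMemoryRSS() and getMemoryUSS().
    """
    miner = miner_cls(input_file, min_util, sep=sep, threads=threads)
    miner.mine()
    return {
        "patterns": miner.getPatterns(),            # dict of discovered patterns
        "runtime_seconds": miner.getRuntime(),      # float
        "memory_rss_bytes": miner.getMemoryRSS(),   # int
        "memory_uss_bytes": miner.getMemoryUSS(),   # int
    }


# Possible usage, assuming PAMI is installed. The file name and threshold
# below are placeholders, not values taken from the documentation.
# from PAMI.highUtilityPattern.parallel.efimparallel import efimParallel
# summary = mine_and_summarise(efimParallel, "utility_db.txt", 100, threads=4)
```

Passing the class in, instead of importing it inside the helper, keeps the sketch independent of the installed PAMI version and makes it easy to exercise with a stand-in class.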