PAMI.highUtilityPattern.parallel package

Submodules

PAMI.highUtilityPattern.parallel.abstract module

PAMI.highUtilityPattern.parallel.efimparallel module

class PAMI.highUtilityPattern.parallel.efimparallel.efimParallel(iFile, minUtil, sep='\t', threads=1)[source]

Bases: _utilityPatterns

Description:

EFIM is one of the fastest algorithms for mining high-utility itemsets from transactional databases.
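To make the mining goal concrete, here is a minimal brute-force sketch of what a high-utility itemset is. It is not EFIM and does not use PAMI: it simply enumerates every itemset in a toy database and keeps those whose total utility reaches the threshold. The toy transactions and the helper names `itemset_utility` and `brute_force_high_utility` are assumptions made for illustration only.

```python
from itertools import combinations

# Toy database (assumed data): each transaction maps an item to its utility
# in that transaction.
transactions = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3},
]


def itemset_utility(itemset, database):
    """Sum the utility of `itemset` over every transaction that contains all of it."""
    total = 0
    for transaction in database:
        if all(item in transaction for item in itemset):
            total += sum(transaction[item] for item in itemset)
    return total


def brute_force_high_utility(database, min_util):
    """Enumerate every itemset and keep those whose utility reaches min_util."""
    items = sorted({item for transaction in database for item in transaction})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, database)
            if utility >= min_util:
                result[itemset] = utility
    return result


high_utility = brute_force_high_utility(transactions, min_util=15)
```

Exhaustive enumeration grows exponentially with the number of items, which is the cost that pruning-based algorithms such as EFIM are designed to avoid.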

Reference:

Zida, S., Fournier-Viger, P., Lin, J.CW. et al. EFIM: a fast and memory efficient algorithm for high-utility itemset mining. Knowl Inf Syst 51, 595–625 (2017). https://doi.org/10.1007/s10115-016-0986-0

Parameters:

iFile – str : Name of the input file from which to mine the complete set of high-utility patterns
oFile – str : Name of the output file in which to store the complete set of high-utility patterns
minUtil – int : The user-specified minimum utility threshold
maxMemory – int : Maximum memory used by this program while running
sep – str : The separator that distinguishes items from one another in a transaction. The default separator is a tab character; users can override it.
threads – int : The number of threads to use
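The constructor arguments and accessor methods documented on this page might be combined roughly as follows. This is a sketch rather than an official example: the wrapper `mine_high_utility_patterns` is a hypothetical helper, and the import is done lazily inside it so that the function can be defined even where PAMI is not installed.

```python
def mine_high_utility_patterns(input_path, min_util, sep="\t", threads=1):
    """Run efimParallel on a utility database and return the discovered patterns."""
    # Lazy import: PAMI is only required when the helper is actually called.
    from PAMI.highUtilityPattern.parallel.efimparallel import efimParallel

    # Argument order follows the class signature: iFile, minUtil, sep, threads.
    miner = efimParallel(input_path, min_util, sep=sep, threads=threads)
    miner.mine()

    # Accessors documented on this page.
    print("Runtime (s):", miner.getRuntime())
    print("RSS (bytes):", miner.getMemoryRSS())
    print("USS (bytes):", miner.getMemoryUSS())
    return miner.getPatterns()
```

Calling the helper requires a real input file; the path and threshold you pass depend on your own data.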

Attributes:

inputFile (str):: The input file path.
minUtil (int):: The minimum utility threshold.
sep (str):: The separator used in the input file.
threads (int):: The number of threads to use.
Patterns (dict):: A dictionary containing the discovered patterns.
rename (dict):: A dictionary containing the mapping between the item IDs and their names.
runtime (float):: The runtime of the algorithm in seconds.
memoryRSS (int):: The Resident Set Size (RSS) memory usage of the algorithm in bytes.
memoryUSS (int):: The Unique Set Size (USS) memory usage of the algorithm in bytes.

Methods:

read_file():: Read the input file and return the filtered transactions, primary items, and secondary items.
binarySearch(arr, item):: Perform a binary search on the given array to find the given item.
project(beta, file_data, secondary):: Project the given beta itemset on the given database.
search(collections):: Search for high utility itemsets in the given collections.
mine():: Start the EFIM algorithm.
savePatterns(outputFile):: Save the patterns discovered by the algorithm to an output file.
getPatterns():: Get the patterns discovered by the algorithm.
getRuntime():: Get the runtime of the algorithm.
getMemoryRSS():: Get the Resident Set Size (RSS) memory usage of the algorithm.
getMemoryUSS():: Get the Unique Set Size (USS) memory usage of the algorithm.
printResults():: Print the results of the algorithm.
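As an illustration of the kind of lookup that `binarySearch(arr, item)` describes, the sketch below performs a binary search over a sorted list with the standard-library `bisect` module. The function name `binary_search` and the convention of returning `-1` for a missing item are assumptions; the library's own method may return something different.

```python
from bisect import bisect_left


def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if it is absent."""
    # bisect_left finds the leftmost position where target could be inserted
    # while keeping the list sorted.
    index = bisect_left(sorted_items, target)
    if index < len(sorted_items) and sorted_items[index] == target:
        return index
    return -1
```

The input must already be sorted; on unsorted data the result is not meaningful.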

getMemoryRSS()[source]

Get the Resident Set Size (RSS) memory usage of the algorithm.

Returns:: The RSS memory usage in bytes.
Return type:: int

getMemoryUSS()[source]

Get the Unique Set Size (USS) memory usage of the algorithm.

Returns:: The USS memory usage in bytes.
Return type:: int

getPatterns()[source]

Get the patterns discovered by the algorithm.

Returns:: A dictionary containing the discovered patterns.
Return type:: dict

getPatternsAsDataFrame()[source]

Store the final patterns in a dataframe.

Returns:: The discovered patterns in a dataframe.
Return type:: pd.DataFrame

getRuntime()[source]

Get the runtime of the algorithm.

Returns:: The runtime in seconds.
Return type:: float

mine()[source]: Start the EFIM algorithm.

printResults()[source]: This function is used to print the results

save(outFile)[source]: Complete set of frequent patterns will be loaded in to an output file :param outFile: name of the output file :type outFile: csv file

startMine()[source]: Start the EFIM algorithm.

Module contents