PAMI is a Python library containing 100+ algorithms to discover useful patterns in various databases across multiple computing platforms. (Active)
To save your time, the key concepts behind each link are summarized briefly below. Follow the relevant link to learn more.
About PAMI: Motivation and the people who supported it
PAMI is a PAttern MIning Python library to discover hidden patterns in Big Data.
Installing, updating, and uninstalling PAMI
Install: pip install pami
Update: pip install --upgrade pami
Uninstall: pip uninstall pami
Organization of algorithms in PAMI
The algorithms in PAMI are organized in a hierarchical structure as follows:
PAMI.theoreticalModel.basic/maximal/closed/topk.algorithmName
Format to create different databases
Transactional database format: item1<sep>item2<sep>...<sep>itemN
Temporal database format: timestamp<sep>item1<sep>item2<sep>...<sep>itemN
Spatial database format: spatialItem1<sep>spatialItem3<sep>spatialItem10<sep>...
Utility database format: item1<sep>...<sep>itemN:totalUtility:utilityItem1<sep>...<sep>utilityItemN
The default separator in PAMI is the tab character. Users can override it with a separator of their choice.
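The formats above can be illustrated with a minimal sketch that writes and reads such files using only the Python standard library. The helper names (`write_database`, `read_database`, `format_utility_row`) and the toy data are hypothetical and not part of PAMI. The sketch also assumes that `totalUtility` is the sum of the per-item utilities.

```python
import os
import tempfile

SEP = "\t"  # tab is the default separator according to the text; any string can be substituted

def write_database(path, rows, sep=SEP):
    """Write one row per line, joining the fields of each row with `sep`."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(sep.join(str(field) for field in row) + "\n")

def read_database(path, sep=SEP):
    """Read a file back into a list of rows (each row is a list of strings)."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n").split(sep) for line in f if line.strip()]

def format_utility_row(items, utilities, sep=SEP):
    """Build 'items:totalUtility:utilities' (assumption: totalUtility = sum of utilities)."""
    return ":".join([
        sep.join(items),
        str(sum(utilities)),
        sep.join(str(u) for u in utilities),
    ])

# Hypothetical toy data
transactional_rows = [["bread", "milk"], ["bread", "butter", "jam"], ["milk"]]
temporal_rows = [[1, "bread", "milk"], [2, "bread", "butter", "jam"], [4, "milk"]]

tmp_dir = tempfile.mkdtemp()
transactional_path = os.path.join(tmp_dir, "transactional.txt")
temporal_path = os.path.join(tmp_dir, "temporal.txt")
write_database(transactional_path, transactional_rows)
write_database(temporal_path, temporal_rows)

transactional_back = read_database(transactional_path)
temporal_back = read_database(temporal_path)  # timestamps come back as strings
utility_line = format_utility_row(["bread", "milk"], [3, 2])

print(transactional_back)
print(temporal_back)
print(utility_line)
```

Passing a different `sep` (for example `","`) to both helpers is enough to switch the separator.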
Statistics of a transactional database
This program outputs the statistical details of a transactional database. It also outputs the distribution of items' frequencies and transaction lengths.
from dbStats import TransactionalDatabase as tds
obj=tds.TransactionalDatabase(inputFile,sep)
obj.run()
#Getting basic stats of a database
print("Database size: " + str(obj.getDatabaseSize()))  # str() converts each numeric result before concatenation
print("Minimum transaction length: " + str(obj.getMinimumTransactionLength()))
print("Maximum transaction length: " + str(obj.getMaximumTransactionLength()))
print("Average transaction length: " + str(obj.getAverageTransactionLength()))
print("Standard deviation of transaction lengths: " + str(obj.getStandardDeviationTransactionLength()))
#Distribution of items' frequencies and transactional lengths
itemFrequencies = obj.getItemFrequencies() #format: <item: freq>
tranLenDistribution = obj.getTransanctionalLengthDistribution() #format: <tranLength: freq>
obj.storeInFile(itemFrequencies,outputFileName)
obj.storeInFile(tranLenDistribution,outputFileName)
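To make the reported quantities concrete, here is a minimal sketch that computes the same kinds of statistics for an in-memory list of transactions. It is not PAMI's implementation: the function name `transactional_stats` and the dictionary keys are hypothetical, and the use of the population standard deviation is an assumption.

```python
import statistics
from collections import Counter

def transactional_stats(transactions):
    """Compute basic statistics for a list of transactions (lists of items)."""
    lengths = [len(t) for t in transactions]
    return {
        "databaseSize": len(transactions),
        "minLength": min(lengths),
        "maxLength": max(lengths),
        "avgLength": statistics.mean(lengths),
        # Assumption: population standard deviation; the library may use another estimator
        "stdLength": statistics.pstdev(lengths),
        # Number of transactions containing each item (duplicates within a transaction count once)
        "itemFrequencies": Counter(item for t in transactions for item in set(t)),
        # Number of transactions of each length
        "lengthDistribution": Counter(lengths),
    }

# Hypothetical toy database
toy_transactions = [["a", "b"], ["a", "b", "c"], ["a"]]
stats = transactional_stats(toy_transactions)
for name, value in stats.items():
    print(name, value)
```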
Statistics of a temporal database
This program outputs the statistical details of a temporal database. It also outputs the distribution of items' frequencies, the distribution of transaction lengths, and the number of transactions occurring at each timestamp.
from dbStats import TemporalDatabase as tds
obj=tds.TemporalDatabase(inputFile,sep)
obj.run()
#Getting basic stats of a database
print("Database size: " + str(obj.getDatabaseSize()))  # str() converts each numeric result before concatenation
print("Minimum transaction length: " + str(obj.getMinimumTransactionLength()))
print("Maximum transaction length: " + str(obj.getMaximumTransactionLength()))
print("Average transaction length: " + str(obj.getAverageTransactionLength()))
print("Standard deviation of transaction lengths: " + str(obj.getStandardDeviationTransactionLength()))
print("Minimum period: " + str(obj.getMinimumPeriod()))
print("Maximum period: " + str(obj.getMaximumPeriod()))
print("Average period: " + str(obj.getAveragePeriod()))
#Distribution of items' frequencies, transactional lengths, and distribution of transactions per timestamp
itemFrequencies = obj.getItemFrequencies() #format: <item: freq>
tranLenDistribution = obj.getTransanctionalLengthDistribution() #format: <tranLength: freq>
transactionsPerTimestamp = obj.getNumberOfTransactionsPerTimestamp() #format: <timestamp: freq>
obj.storeInFile(itemFrequencies,outputFileName)
obj.storeInFile(tranLenDistribution,outputFileName)
obj.storeInFile(transactionsPerTimestamp,outputFileName)
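The temporal quantities can be sketched in the same spirit. The example below is only an illustration under a stated assumption: a period is taken to be the gap between consecutive distinct transaction timestamps. The library may define periods differently (for example, per item), so treat the numbers as an aid to intuition. The function name `temporal_stats` and the dictionary keys are hypothetical.

```python
from collections import Counter

def temporal_stats(rows):
    """rows: list of (timestamp, items) pairs with numeric timestamps."""
    per_timestamp = Counter(ts for ts, _ in rows)
    distinct = sorted(per_timestamp)
    # Assumption: a period is the gap between consecutive distinct timestamps
    periods = [later - earlier for earlier, later in zip(distinct, distinct[1:])]
    return {
        "transactionsPerTimestamp": per_timestamp,
        "minPeriod": min(periods),
        "maxPeriod": max(periods),
        "avgPeriod": sum(periods) / len(periods),
    }

# Hypothetical toy temporal database: two transactions share timestamp 2
toy_rows = [(1, ["a", "b"]), (2, ["a"]), (2, ["b", "c"]), (5, ["a"])]
temporal = temporal_stats(toy_rows)
for name, value in temporal.items():
    print(name, value)
```

The sketch needs at least two distinct timestamps; with fewer, there is no gap to measure.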
Converting dataframes to databases
Dense dataframe format: tid/timestamp<sep>item1<sep>item2<sep>...<sep>itemN
Sparse dataframe format: tid/timestamp<sep>item<sep>value
Dataframe to database conversion
This program creates a database by applying a single condition and a single threshold value to all items in the dataframe. Code to convert a dataframe into a transactional database:
from PAMI.DF2DB import DF2DB as pro
db = pro.DF2DB(inputDataFrame, thresholdValue, condition, DFtype)
# DFtype='sparse' or 'dense'. Default type of an input dataframe is sparse
db.createTransactional(outputFile)
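The idea behind the conversion can be sketched without any dependency. In the example below, a dense dataframe is modeled as a list of dictionaries (one per row), and an item enters a transaction when its value satisfies the condition against the threshold. This is an assumption-based illustration, not the library's code: `dense_rows_to_transactions`, the `tid` column name, the set of condition strings, and the choice to drop empty transactions are all hypothetical.

```python
import operator

# Assumed condition strings mapped to comparison functions
CONDITIONS = {
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
    "==": operator.eq, "!=": operator.ne,
}

def dense_rows_to_transactions(rows, threshold, condition, id_column="tid"):
    """Keep, for each row, the items whose value satisfies `condition` against `threshold`."""
    compare = CONDITIONS[condition]
    transactions = []
    for row in rows:
        items = [item for item, value in row.items()
                 if item != id_column and compare(value, threshold)]
        if items:  # assumption: rows in which no item qualifies are dropped
            transactions.append(items)
    return transactions

# Hypothetical dense data: one column per item
dense_rows = [
    {"tid": 1, "a": 5, "b": 0, "c": 7},
    {"tid": 2, "a": 1, "b": 9, "c": 2},
    {"tid": 3, "a": 0, "b": 0, "c": 1},
]
dense_transactions = dense_rows_to_transactions(dense_rows, threshold=3, condition=">=")
print(dense_transactions)
```

Each resulting list can then be written as one line of a transactional database, with items joined by the chosen separator.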
Dataframe to database conversion (advanced)
With this program, the user can specify a different condition and threshold value for each item in the dataframe. Code to convert a dataframe into a transactional database:
from PAMI.DF2DB import DF2DBPlus as pro
db = pro.DF2DBPlus(inputDataFrame, itemConditionValueDataFrame, DFtype)
# DFtype='sparse' or 'dense'. Default type of an input dataframe is sparse
db.createTransactional(outputFile)
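A per-item variant can be sketched in the same way, this time starting from sparse rows of the form (tid, item, value). The names below (`sparse_rows_to_transactions`, `item_rules`) and the rule layout are hypothetical. How the library expects `itemConditionValueDataFrame` to be structured is not stated here, so the dictionary is only a stand-in for it.

```python
import operator
from collections import defaultdict

# Assumed condition strings mapped to comparison functions
CONDITIONS = {
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
    "==": operator.eq, "!=": operator.ne,
}

def sparse_rows_to_transactions(rows, item_rules):
    """rows: (tid, item, value) triples; item_rules: item -> (condition, threshold)."""
    transactions = defaultdict(list)
    for tid, item, value in rows:
        rule = item_rules.get(item)
        if rule is None:
            continue  # assumption: items without a rule are ignored
        condition, threshold = rule
        if CONDITIONS[condition](value, threshold):
            transactions[tid].append(item)
    return dict(transactions)

# Hypothetical sparse data and per-item rules
sparse_rows = [
    (1, "temp", 35), (1, "humidity", 20),
    (2, "temp", 10), (2, "humidity", 25),
]
item_rules = {"temp": (">=", 30), "humidity": ("<", 30)}
sparse_transactions = sparse_rows_to_transactions(sparse_rows, item_rules)
print(sparse_transactions)
```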
Importing PAMI algorithms into your program
from PAMI.frequentPattern.basic import fpGrowth as alg
obj = alg.fpGrowth(inputFile,minSup,sep)
obj.startMine()
obj.save('patterns.txt')
df = obj.getPatternsAsDataFrame()
print('Runtime: ' + str(obj.getRuntime()))
print('Memory: ' + str(obj.getMemoryRSS()))
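For readers new to frequent pattern mining, the following sketch shows what a minimum support threshold selects. It is a brute-force enumeration for tiny inputs, not FP-growth and not PAMI's implementation. The function name `frequent_itemsets` is hypothetical, and `min_sup` is assumed to be an absolute count of transactions.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_sup):
    """Count every itemset in every transaction and keep those with support >= min_sup.

    Exponential in transaction length, so suitable for toy data only.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset: support for itemset, support in counts.items() if support >= min_sup}

# Hypothetical toy database
toy_db = [["a", "b"], ["a", "b", "c"], ["a"]]
patterns = frequent_itemsets(toy_db, min_sup=2)
for itemset, support in sorted(patterns.items()):
    print(itemset, support)
```

Algorithms such as FP-growth aim to produce the same set of frequent patterns without enumerating every subset of every transaction.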
Executing PAMI algorithms directly on the terminal
python PAMI/patternModel/patternType/algorithm.py inputFile outputFile parameters
E.g., python PAMI/frequentPattern/basic/fpGrowth.py inputFile outputFile minSup
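The command pattern above can also be assembled programmatically, for instance when launching several runs from a script. The helper below is a hypothetical sketch that only builds the argument list from the path convention shown above. It does not check that the script exists or which parameters an algorithm accepts.

```python
def build_command(pattern_model, pattern_type, algorithm, input_file, output_file, *parameters):
    """Return the argument list for: python PAMI/<model>/<type>/<algorithm>.py in out params..."""
    script = "/".join(["PAMI", pattern_model, pattern_type, algorithm + ".py"])
    return ["python", script, input_file, output_file, *[str(p) for p in parameters]]

# Hypothetical file names and threshold
command = build_command("frequentPattern", "basic", "fpGrowth", "inputFile", "outputFile", 10)
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the directory that contains the `PAMI` folder.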