PAMI.extras.dbStats package

Submodules

PAMI.extras.dbStats.FuzzyDatabase module

class PAMI.extras.dbStats.FuzzyDatabase.FuzzyDatabase(inputFile: str, sep: str = '\t')[source]

Bases: object

Description:

FuzzyDatabase is a class for computing statistics of a fuzzy database.

Attributes:
inputFile : str

input file path

sep : str

separator used in the file. Default is tab.

Methods:
run()

execute readDatabase function

readDatabase()

read database from input file

getDatabaseSize()

get the size of database

getMinimumTransactionLength()

get the minimum transaction length

getAverageTransactionLength()

get the average transaction length. It is sum of all transaction length divided by database length.

getMaximumTransactionLength()

get the maximum transaction length

getStandardDeviationTransactionLength()

get the standard deviation of transaction length

getSortedListOfItemFrequencies()

get sorted list of item frequencies

getSortedListOfTransactionLength()

get sorted list of transaction length

save(data, outputFile)

store data into outputFile

getMinimumUtility()

get the minimum utility

getAverageUtility()

get the average utility

getMaximumUtility()

get the maximum utility

getSortedUtilityValuesOfItem()

get sorted utility values each item

Importing this algorithm into a python program

from PAMI.extras.dbStats import FuzzyDatabase as db

obj = db.FuzzyDatabase(iFile, "\t")

obj.run()

obj.printStats()

obj.save(obj.getSortedListOfItemFrequencies(), oFile)

creatingItemSets() None[source]

Storing the complete transactions of the database/input file in a database variable

getAverageTransactionLength() float[source]

get the average transaction length. It is sum of all transaction length divided by database length. :return: average transaction length :rtype: float
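The definition above can be illustrated with a small in-memory example. This is a minimal sketch, not PAMI's implementation: the `transactions` list and the `length_summary` helper are hypothetical names introduced only for illustration.

```python
# Hypothetical toy database: each inner list is one transaction.
transactions = [["a", "b", "c"], ["a", "d"], ["b", "c", "d", "e"], ["a"]]


def length_summary(db):
    """Return (minimum, average, maximum) transaction length of db."""
    lengths = [len(transaction) for transaction in db]
    # Average = sum of all transaction lengths / number of transactions.
    return min(lengths), sum(lengths) / len(lengths), max(lengths)


minimum, average, maximum = length_summary(transactions)
print(minimum, average, maximum)  # 1 2.5 4
```

The four transactions have lengths 3, 2, 4 and 1, so the average is 10 / 4 = 2.5.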

getAverageUtility() float[source]

get the average utility :return: average utility :rtype: float

getDatabaseSize() int[source]

get the size of database :return: dataset size :rtype: int

getFrequenciesInRange() dict[source]
getMaximumTransactionLength() int[source]

get the maximum transaction length :return: maximum transaction length :rtype: int

getMaximumUtility() int[source]

get the maximum utility :return: max utility :rtype: int

getMinimumTransactionLength() int[source]

get the minimum transaction length :return: minimum transaction length :rtype: int

getMinimumUtility() int[source]

get the minimum utility :return: min utility :rtype: int

getNumberOfItems() int[source]

get the number of items in database. :return: number of items :rtype: int

getSortedListOfItemFrequencies() dict[source]

get sorted list of item frequencies :return: item frequencies :rtype: dict

getSortedUtilityValuesOfItem() dict[source]

get sorted utility value each item. key is item and value is utility of item :return: sorted dictionary utility value of item :rtype: dict
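A possible reading of "sorted utility value of each item" is sketched below. The data layout (lists of `(item, utility)` pairs), the helper name `sorted_item_utilities`, and the descending sort order are all assumptions made for illustration; the class may store and order utilities differently.

```python
from collections import defaultdict

# Hypothetical utility transactions: (item, utility) pairs per transaction.
utility_db = [
    [("a", 5), ("b", 2)],
    [("a", 1), ("c", 7)],
    [("b", 4), ("c", 3)],
]


def sorted_item_utilities(db):
    """Sum the utility of each item, then sort by total utility.

    Descending order is an assumption of this sketch.
    """
    totals = defaultdict(int)
    for transaction in db:
        for item, utility in transaction:
            totals[item] += utility
    # key is item, value is the total utility of that item
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


item_utilities = sorted_item_utilities(utility_db)
print(item_utilities)  # {'c': 10, 'a': 6, 'b': 6}
```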

getSparsity() float[source]

get the sparsity of database :return: dataset sparsity :rtype: float

getStandardDeviationTransactionLength() float[source]

get the standard deviation transaction length :return: standard deviation transaction length :rtype: float

getTotalNumberOfItems() int[source]

get the number of items in database. :return: number of items :rtype: int

getTotalUtility() int[source]

get sum of utility :return: total utility :rtype: int

getTransanctionalLengthDistribution() dict[source]

get the distribution of transaction lengths :return: dictionary mapping each transaction length to the number of transactions of that length :rtype: dict

getVarianceTransactionLength() float[source]

get the variance transaction length :return: variance transaction length :rtype: float
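The variance and standard deviation of transaction length can be sketched with the standard library. The `lengths` list is hypothetical, and the use of the population form (dividing by N rather than N - 1) is an assumption of this sketch, not a statement about the class.

```python
import statistics

# Hypothetical transaction lengths taken from a toy database.
lengths = [3, 2, 4, 1]

# Population variance: mean of squared deviations from the mean (assumed form).
variance = statistics.pvariance(lengths)
# Population standard deviation: square root of the population variance.
std_dev = statistics.pstdev(lengths)

print(variance)  # 1.25
```

With a mean of 2.5, the squared deviations are 0.25, 0.25, 2.25 and 2.25, which sum to 5 and give a variance of 5 / 4 = 1.25.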

plotGraphs() None[source]
printStats() None[source]
readDatabase() None[source]

read database from input file and store into database and size of each transaction.

run() None[source]
save(data: dict, outputFile: str) None[source]

store data into outputFile :param data: input data :type data: dict :param outputFile: output file name or path to store :type outputFile: str :return: None

PAMI.extras.dbStats.MultipleTimeSeriesFuzzyDatabaseStats module

class PAMI.extras.dbStats.MultipleTimeSeriesFuzzyDatabaseStats.MultipleTimeSeriesFuzzyDatabaseStats(inputFile: str, sep: str = '\t')[source]

Bases: object

Description:

MultipleTimeSeriesFuzzyDatabaseStats is a class for computing statistics of a multiple time series fuzzy database.

Attributes:
param inputFile:

file : input file path

param sep:

str : separator used in the file. Default is tab.

Methods:
run()

execute readDatabase function

readDatabase()

read database from input file

getDatabaseSize()

get the size of database

getTotalNumberOfItems()

get the total number of items in a database

getMinimumTransactionLength()

get the minimum transaction length

getAverageTransactionLength()

get the average transaction length. It is sum of all transaction length divided by database length.

getMaximumTransactionLength()

get the maximum transaction length

getStandardDeviationTransactionLength()

get the standard deviation of transaction length

convertDataIntoMatrix()

Convert the database into matrix form to calculate the sparsity and density of a database

getSparsity()

get sparsity value of database

getDensity()

get density value of database

getSortedListOfItemFrequencies()

get sorted list of item frequencies

getSortedListOfTransactionLength()

get sorted list of transaction length

save(data, outputFile)

store data into outputFile

printStats()

To print all the stats of the database

plotGraphs()

To plot the frequency distribution of items and the transaction length distribution of the database

Importing this algorithm into a python program

from PAMI.extras.dbStats import MultipleTimeSeriesFuzzyDatabaseStats as db

obj = db.MultipleTimeSeriesFuzzyDatabaseStats(iFile, "\t")

obj.run()

obj.printStats()

obj.save(obj.getSortedListOfItemFrequencies(), oFile)

convertDataIntoMatrix() ndarray[source]
getAverageTransactionLength() float[source]

get the average transaction length. It is sum of all transaction length divided by database length. :return: average transaction length :rtype: float

getDatabaseSize() int[source]

get the size of database :return: dataset size :rtype: int

getDensity() float[source]

get the density of database. Density is the proportion of non-zero entries in the database matrix. :return: database density :rtype: float

getFrequenciesInRange() dict[source]
getMaximumTransactionLength() int[source]

get the maximum transaction length :return: maximum transaction length :rtype: int

getMinimumTransactionLength() int[source]

get the minimum transaction length :return: minimum transaction length :rtype: int

getNumberOfItems() int[source]

get the number of items in database. :return: number of items :rtype: int

getSortedListOfItemFrequencies() dict[source]

get sorted list of item frequencies :return: item frequencies :rtype: dict

getSparsity() float[source]

get the sparsity of database. sparsity is percentage of 0 of database. :return: database sparsity :rtype: float
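Sparsity and its complement, density, can be sketched on a binary transaction-by-item matrix. The helpers `to_matrix`, `sparsity` and `density` are hypothetical names, the matrix layout is an assumption, and the values are returned as fractions in [0, 1] rather than percentages.

```python
# Hypothetical toy database.
transactions = [["a", "b"], ["b", "c", "d"], ["a"]]


def to_matrix(db):
    """Build a binary matrix: one row per transaction, one column per item."""
    items = sorted({item for transaction in db for item in transaction})
    return [[1 if item in transaction else 0 for item in items]
            for transaction in db]


def sparsity(matrix):
    """Fraction of cells that are 0."""
    cells = [value for row in matrix for value in row]
    return cells.count(0) / len(cells)


def density(matrix):
    """Fraction of cells that are non-zero (complement of sparsity)."""
    return 1.0 - sparsity(matrix)


matrix = to_matrix(transactions)
print(sparsity(matrix), density(matrix))  # 0.5 0.5
```

The toy matrix has 3 rows and 4 columns; 6 of its 12 cells are 0, so both values are 0.5.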

getStandardDeviationTransactionLength() float[source]

get the standard deviation transaction length :return: standard deviation transaction length :rtype: float

getTotalNumberOfItems() int[source]

get the number of items in database. :return: number of items :rtype: int

getTransanctionalLengthDistribution() dict[source]

get the distribution of transaction lengths :return: dictionary mapping each transaction length to the number of transactions of that length :rtype: dict

getVarianceTransactionLength() float[source]

get the variance transaction length :return: variance transaction length :rtype: float

plotGraphs() None[source]
printStats() None[source]
readDatabase() None[source]

read database from input file and store into database and size of each transaction.

run() None[source]
save(data: dict, outputFile: str) None[source]

store data into outputFile :param data: input data :type data: dict :param outputFile: output file name or path to store :type outputFile: str :return: None

PAMI.extras.dbStats.SequentialDatabase module

class PAMI.extras.dbStats.SequentialDatabase.SequentialDatabase(inputFile: str, sep: str = '\t')[source]

Bases: object

SequentialDatabase is a class to get statistics of a sequential database, such as the average, minimum, and maximum lengths.

Attributes:
param inputFile:

file : input file path

param sep:

str : separator used in the file. Default is tab.

Methods:
readDatabase():

read sequential database from input file and store into database and size of each sequence and subsequences.

getDatabaseSize(self):

get the size of database

getTotalNumberOfItems(self):

get the number of items in database.

getMinimumSequenceLength(self):

get the minimum sequence length

getAverageSubsequencePerSequenceLength(self):

get the average subsequence length per sequence length. It is sum of all subsequence length divided by sequence length.

getAverageItemPerSubsequenceLength(self):

get the average Item length per subsequence. It is sum of all item length divided by subsequence length.

getMaximumSequenceLength(self):

get the maximum sequence length

getStandardDeviationSequenceLength(self):

get the standard deviation sequence length

getVarianceSequenceLength(self):

get the variance Sequence length

getSequenceSize(self):

get the size of sequence

getMinimumSubsequenceLength(self):

get the minimum subsequence length

getAverageItemPerSequenceLength(self):

get the average item length per sequence. It is sum of all item length divided by sequence length.

getMaximumSubsequenceLength(self):

get the maximum subsequence length

getStandardDeviationSubsequenceLength(self):

get the standard deviation subsequence length

getVarianceSubsequenceLength(self):

get the variance subSequence length

getSortedListOfItemFrequencies(self):

get sorted list of item frequencies

getFrequenciesInRange(self):

get sorted list of item frequencies in some range

getSequencialLengthDistribution(self):

get Sequence length Distribution

getSubsequencialLengthDistribution(self):

get subSequence length distribution

printStats(self):

to print all the statistics of the sequence database

plotGraphs(self):

to plot the distributions of items, of subsequences per sequence, and of items per subsequence

Importing this algorithm into a python program

from PAMI.extras.dbStats import SequentialDatabase as db

obj = db.SequentialDatabase(iFile, "\t")

obj.run()

obj.printStats()

Executing the code on terminal:

Format:

(.venv) $ python3 SequentialDatabase.py <inputFile>

Example Usage:

(.venv) $ python3 SequentialDatabase.py sampleDB.txt

Sample run of the importing code:

import PAMI.extras.dbStats.SequentialDatabase as alg

_ap = alg.SequentialDatabase(inputfile, sep)

_ap.readDatabase()

_ap.printStats()

_ap.plotGraphs()

Credits:

The complete program was written by Shota Suzuki under the supervision of Professor Rage Uday Kiran.

getAverageItemPerSequenceLength() float[source]

get the average item length per sequence. It is sum of all item length divided by sequence length. :return: average item length per sequence :rtype: float

getAverageItemPerSubsequenceLength() float[source]

get the average Item length per subsequence. It is sum of all item length divided by subsequence length. :return: average Item length per subsequence :rtype: float

getAverageSubsequencePerSequenceLength() float[source]

get the average subsequence length per sequence length. It is sum of all subsequence length divided by sequence length. :return: average subsequence length per sequence length :rtype: float
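The three averages above can be illustrated as follows. This sketch assumes that a sequence is a list of subsequences and that each subsequence is a list of items; it also reads "divided by sequence length" as "divided by the number of sequences". Both the data layout and the helper `average_lengths` are hypothetical.

```python
# Hypothetical sequential database: sequence -> subsequences -> items.
sequences = [
    [["a", "b"], ["c"]],           # 2 subsequences, 3 items
    [["a"], ["b", "c"], ["d"]],    # 3 subsequences, 4 items
]


def average_lengths(db):
    """Return the three averages under the interpretation stated above."""
    n_sequences = len(db)
    n_subsequences = sum(len(sequence) for sequence in db)
    n_items = sum(len(sub) for sequence in db for sub in sequence)
    return {
        "subsequences_per_sequence": n_subsequences / n_sequences,
        "items_per_subsequence": n_items / n_subsequences,
        "items_per_sequence": n_items / n_sequences,
    }


averages = average_lengths(sequences)
```

For the toy data there are 2 sequences, 5 subsequences and 7 items, giving averages of 2.5, 1.4 and 3.5 respectively.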

getDatabaseSize() int[source]

get the size of database :return: dataset size :rtype: int

getFrequenciesInRange() Dict[int, int][source]

get sorted list of item frequencies in some range :return: item separated by its frequencies :rtype: dict

getMaximumSequenceLength() int[source]

get the maximum sequence length :return: maximum sequence length :rtype: int

getMaximumSubsequenceLength() int[source]

get the maximum subsequence length :return: maximum subsequence length :rtype: int

getMinimumSequenceLength() int[source]

get the minimum sequence length :return: minimum sequence length :rtype: int

getMinimumSubsequenceLength() int[source]

get the minimum subsequence length :return: minimum subsequence length :rtype: int

getSequenceSize() int[source]

get the size of sequence :return: sequences size :rtype: int

getSequencialLengthDistribution() Dict[int, int][source]

get Sequence length Distribution :return: Sequence length :rtype: dict

getSortedListOfItemFrequencies() Dict[str, int][source]

get sorted list of item frequencies :return: item frequencies :rtype: dict

getStandardDeviationSequenceLength() float[source]

get the standard deviation sequence length :return: standard deviation sequence length :rtype: float

getStandardDeviationSubsequenceLength() float[source]

get the standard deviation subsequence length :return: standard deviation subsequence length :rtype: float

getSubsequencialLengthDistribution() Dict[int, int][source]

get subSequence length distribution :return: subSequence length :rtype: dict

getTotalNumberOfItems() int[source]

get the number of items in database. :return: number of items :rtype: int

getVarianceSequenceLength() float[source]

get the variance Sequence length :return: variance Sequence length :rtype: float

getVarianceSubsequenceLength() float[source]

get the variance subSequence length :return: variance subSequence length :rtype: float

plotGraphs() None[source]

To plot the distributions of items, of subsequences per sequence, and of items per subsequence

printStats() None[source]

To print all the statistics of the sequence database

readDatabase() None[source]

read sequential database from input file and store into database and size of each sequence and subsequences.

run() None[source]

PAMI.extras.dbStats.TemporalDatabase module

class PAMI.extras.dbStats.TemporalDatabase.TemporalDatabase(inputFile: str | DataFrame, sep: str = '\t')[source]

Bases: object

Description:

TemporalDatabase is a class for computing statistics of a temporal database.

Attributes:
inputFile : str or DataFrame

input file path or DataFrame

sep : str

separator used in the file. Default is tab.

Methods:
run()

execute readDatabase function

readDatabase()

read database from input file

getDatabaseSize()

get the size of database

getMinimumTransactionLength()

get the minimum transaction length

getAverageTransactionLength()

get the average transaction length. It is sum of all transaction length divided by database length.

getMaximumTransactionLength()

get the maximum transaction length

getStandardDeviationTransactionLength()

get the standard deviation of transaction length

getSortedListOfItemFrequencies()

get sorted list of item frequencies

getSortedListOfTransactionLength()

get sorted list of transaction length

save(data, outputFile)

store data into outputFile

getMinimumInterArrivalPeriod()

get the minimum inter arrival period

getAverageInterArrivalPeriod()

get the average inter arrival period

getMaximumInterArrivalPeriod()

get the maximum inter arrival period

getStandardDeviationPeriod()

get the standard deviation period

getNumberOfTransactionsPerTimestamp()

get number of transactions per time stamp. This time stamp range is 1 to max period.

Importing this algorithm into a python program

from PAMI.extras.dbStats import TemporalDatabase as db

obj = db.TemporalDatabase(iFile, "\t")

obj.run()

obj.printStats()

obj.save(obj.getSortedListOfItemFrequencies(), oFile)

convertDataIntoMatrix() ndarray[source]
getAverageInterArrivalPeriod() float[source]

get the average inter arrival period. It is sum of all period divided by number of period. :return: average inter arrival period :rtype: float
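Inter arrival periods can be sketched as the gaps between consecutive transaction timestamps. The `timestamps` list and the helper `inter_arrival_periods` are hypothetical, and this sketch ignores any boundary periods (before the first or after the last timestamp) that an implementation might also count.

```python
# Hypothetical sorted transaction timestamps.
timestamps = [1, 3, 4, 8]


def inter_arrival_periods(ts):
    """Differences between each pair of consecutive timestamps."""
    return [later - earlier for earlier, later in zip(ts, ts[1:])]


periods = inter_arrival_periods(timestamps)
# Average = sum of all periods / number of periods.
average_period = sum(periods) / len(periods)

print(periods)                     # [2, 1, 4]
print(min(periods), max(periods))  # 1 4
```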

getAveragePeriodOfItem() float[source]

get the average period of the item :return: average period :rtype: float

getAverageTransactionLength() float[source]

get the average transaction length. It is sum of all transaction length divided by database length. :return: average transaction length :rtype: float

getDatabaseSize() int[source]

get the size of database :return: dataset size :rtype: int

getDensity() float[source]

get the density of database. Density is the proportion of non-zero entries in the database matrix. :return: database density :rtype: float

getFrequenciesInRange() Dict[int, int][source]
getMaximumInterArrivalPeriod() int[source]

get the maximum inter arrival period :return: maximum inter arrival period :rtype: int

getMaximumPeriodOfItem() int[source]

get the maximum period of the item :return: maximum period :rtype: int

getMaximumTransactionLength() int[source]

get the maximum transaction length :return: maximum transaction length :rtype: int

getMinimumInterArrivalPeriod() int[source]

get the minimum inter arrival period :return: minimum inter arrival period :rtype: int

getMinimumPeriodOfItem() int[source]

get the minimum period of the item :return: minimum period :rtype: int

getMinimumTransactionLength() int[source]

get the minimum transaction length :return: minimum transaction length :rtype: int

getNumberOfTransactionsPerTimestamp() Dict[int, int][source]

get number of transactions per time stamp :return: number of transactions per time stamp as dict :rtype: dict
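Counting transactions per timestamp can be sketched as below. The row layout (timestamp first, then items, separated by `sep`) and the helper `transactions_per_timestamp` are assumptions for illustration; this sketch only reports timestamps that actually occur in the data.

```python
from collections import Counter

# Hypothetical temporal rows: timestamp, then the items of the transaction.
rows = ["1\ta\tb", "1\tc", "2\ta", "4\tb\td"]


def transactions_per_timestamp(lines, sep="\t"):
    """Map each observed timestamp to its number of transactions."""
    counts = Counter(int(line.split(sep)[0]) for line in lines)
    return dict(sorted(counts.items()))


per_timestamp = transactions_per_timestamp(rows)
print(per_timestamp)  # {1: 2, 2: 1, 4: 1}
```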

getPeriodsInRange() Dict[int, int][source]
getSortedListOfItemFrequencies() Dict[str, int][source]

get sorted list of item frequencies :return: item frequencies :rtype: dict

getSparsity() float[source]

get the sparsity of database. sparsity is percentage of 0 of database. :return: database sparsity :rtype: float

getStandardDeviationPeriod() float[source]

get the standard deviation period :return: standard deviation period :rtype: float

getStandardDeviationTransactionLength() float[source]

get the standard deviation transaction length :return: standard deviation transaction length :rtype: float

getTotalNumberOfItems() int[source]

get the number of items in database. :return: number of items :rtype: int

getTransanctionalLengthDistribution() Dict[int, int][source]

get the distribution of transaction lengths :return: dictionary mapping each transaction length to the number of transactions of that length :rtype: dict

getVarianceTransactionLength() float[source]

get the variance transaction length :return: variance transaction length :rtype: float

plotGraphs() None[source]
printStats() None[source]
readDatabase() None[source]

read database from input file and store into database and size of each transaction. And store the period between transactions as list

run() None[source]
save(data: dict, outputFile: str) None[source]

store data into outputFile :param data: input data :type data: dict :param outputFile: output file name or path to store :type outputFile: str :return: None

PAMI.extras.dbStats.TransactionalDatabase module

class PAMI.extras.dbStats.TransactionalDatabase.TransactionalDatabase(inputFile: str | DataFrame, sep: str = '\t')[source]

Bases: object

Description:

TransactionalDatabase is a class for computing statistics of a transactional database.

Attributes:
param inputFile:

file : input file path

param sep:

str : separator used in the file. Default is tab.

Methods:
run()

execute readDatabase function

readDatabase()

read database from input file

getDatabaseSize()

get the size of database

getMinimumTransactionLength()

get the minimum transaction length

getAverageTransactionLength()

get the average transaction length. It is sum of all transaction length divided by database length.

getMaximumTransactionLength()

get the maximum transaction length

getStandardDeviationTransactionLength()

get the standard deviation of transaction length

getSortedListOfItemFrequencies()

get sorted list of item frequencies

getSortedListOfTransactionLength()

get sorted list of transaction length

save(data, outputFile)

store data into outputFile

getVarianceTransactionLength()

get the variance of transaction length

getSparsity()

get the sparsity of database

getDensity()

get the density of database

Importing this algorithm into a python program

from PAMI.extras.dbStats import TransactionalDatabase as db

obj = db.TransactionalDatabase(iFile, "\t")

obj.run()

obj.printStats()

obj.save(obj.getSortedListOfItemFrequencies(), oFile)

convertDataIntoMatrix() ndarray[source]
getAverageTransactionLength() float[source]

get the average transaction length. It is sum of all transaction length divided by database length. :return: average transaction length :rtype: float

getDatabaseSize() int[source]

get the size of database :return: dataset size :rtype: int

getDensity() float[source]

get the density of database. Density is the proportion of non-zero entries in the database matrix. :return: database density :rtype: float

getFrequenciesInRange() dict[source]
getMaximumTransactionLength() int[source]

get the maximum transaction length :return: maximum transaction length :rtype: int

getMinimumTransactionLength() int[source]

get the minimum transaction length :return: minimum transaction length :rtype: int

getNumberOfItems() int[source]

get the number of items in database. :return: number of items :rtype: int

getSortedListOfItemFrequencies() dict[source]

get sorted list of item frequencies :return: item frequencies :rtype: dict

getSparsity() float[source]

get the sparsity of database. sparsity is percentage of 0 of database. :return: database sparsity :rtype: float

getStandardDeviationTransactionLength() float[source]

get the standard deviation transaction length :return: standard deviation transaction length :rtype: float

getTotalNumberOfItems() int[source]

get the number of items in database. :return: number of items :rtype: int

getTransanctionalLengthDistribution() dict[source]

get the distribution of transaction lengths :return: a dictionary with transaction lengths as keys and the number of transactions of each length as values :rtype: dict
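A transaction length distribution of this kind can be sketched with a counter. The `transactions` list and the helper `length_distribution` are hypothetical, and sorting the result by length is an assumption of this sketch.

```python
from collections import Counter

# Hypothetical toy database.
transactions = [["a", "b"], ["c"], ["a", "d"], ["b", "c", "d"]]


def length_distribution(db):
    """Map each transaction length to the number of transactions with it."""
    counts = Counter(len(transaction) for transaction in db)
    return dict(sorted(counts.items()))


distribution = length_distribution(transactions)
print(distribution)  # {1: 1, 2: 2, 3: 1}
```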

getVarianceTransactionLength() float[source]

get the variance transaction length :return: variance transaction length :rtype: float

plotGraphs() None[source]
printStats() None[source]
readDatabase() None[source]

read database from input file and store into database and size of each transaction.
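The reading step described above can be sketched as follows, assuming one transaction per line with items separated by `sep`. The helper `read_transactions` is a hypothetical name and works on any iterable of lines (for example an open file); it is not the class's own reader.

```python
def read_transactions(lines, sep="\t"):
    """Split each non-empty line into items; return (database, lengths)."""
    database = []
    for line in lines:
        # Drop the trailing newline and ignore empty fields.
        items = [item for item in line.strip().split(sep) if item]
        if items:
            database.append(items)
    # Size of each transaction, in the same order as the database.
    lengths = [len(transaction) for transaction in database]
    return database, lengths


sample_lines = ["a\tb\tc\n", "b\td\n", "\n", "a\n"]
database, lengths = read_transactions(sample_lines)
print(lengths)  # [3, 2, 1]
```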

run() None[source]
save(data: dict, outputFile: str) None[source]

store data into outputFile :param data: input data :type data: dict :param outputFile: output file name or path to store :type outputFile: str :return: None

PAMI.extras.dbStats.UncertainTemporalDatabase module

class PAMI.extras.dbStats.UncertainTemporalDatabase.UncertainTemporalDatabase(inputFile: str, sep: str = '\t')[source]

Bases: object

Description:

UncertainTemporalDatabase is a class for computing statistics of an uncertain temporal database.

Attributes:
inputFile : str

input file path

sep : str

separator used in the file. Default is tab.

Methods:
run()

execute readDatabase function

readDatabase()

read database from input file

getDatabaseSize()

get the size of database

getMinimumTransactionLength()

get the minimum transaction length

getAverageTransactionLength()

get the average transaction length. It is sum of all transaction length divided by database length.

getMaximumTransactionLength()

get the maximum transaction length

getStandardDeviationTransactionLength()

get the standard deviation of transaction length

getSortedListOfItemFrequencies()

get sorted list of item frequencies

getSortedListOfTransactionLength()

get sorted list of transaction length

save(data, outputFile)

store data into outputFile

getMinimumPeriod()

get the minimum period

getAveragePeriod()

get the average period

getMaximumPeriod()

get the maximum period

getStandardDeviationPeriod()

get the standard deviation period

getNumberOfTransactionsPerTimestamp()

get number of transactions per time stamp. This time stamp range is 1 to max period.

Importing this algorithm into a python program

from PAMI.extras.dbStats import UncertainTemporalDatabase as db

obj = db.UncertainTemporalDatabase(iFile, "\t")

obj.run()

obj.printStats()

obj.save(obj.getSortedListOfItemFrequencies(), oFile)

convertDataIntoMatrix() ndarray[source]
getAveragePeriod() float[source]

get the average period. It is sum of all period divided by number of period. :return: average period :rtype: float

getAverageTransactionLength() float[source]

get the average transaction length. It is sum of all transaction length divided by database length. :return: average transaction length :rtype: float

getDatabaseSize() int[source]

get the size of database :return: dataset size :rtype: int

getDensity() float[source]

get the density of database. Density is the proportion of non-zero entries in the database matrix. :return: database density :rtype: float

getFrequenciesInRange() dict[source]
getMaximumPeriod() int[source]

get the maximum period :return: maximum period :rtype: int

getMaximumTransactionLength() int[source]

get the maximum transaction length :return: maximum transaction length :rtype: int

getMinimumPeriod() int[source]

get the minimum period :return: minimum period :rtype: int

getMinimumTransactionLength() int[source]

get the minimum transaction length :return: minimum transaction length :rtype: int

getNumberOfTransactionsPerTimestamp() dict[source]

get number of transactions per time stamp :return: number of transactions per time stamp as dict :rtype: dict

getSortedListOfItemFrequencies() dict[source]

get sorted list of item frequencies :return: item frequencies :rtype: dict

getSparsity() float[source]

get the sparsity of database. sparsity is percentage of 0 of database. :return: database sparsity :rtype: float

getStandardDeviationPeriod() float[source]

get the standard deviation period :return: standard deviation period :rtype: float

getStandardDeviationTransactionLength() float[source]

get the standard deviation transaction length :return: standard deviation transaction length :rtype: float

getTotalNumberOfItems() int[source]

get the number of items in database. :return: number of items :rtype: int

getTransanctionalLengthDistribution() dict[source]

get the distribution of transaction lengths :return: dictionary mapping each transaction length to the number of transactions of that length :rtype: dict

getVarianceTransactionLength() float[source]

get the variance transaction length :return: variance transaction length :rtype: float

plotGraphs() None[source]
printStats() None[source]
readDatabase() None[source]

read database from input file and store into database and size of each transaction. And store the period between transactions as list

run() None[source]
save(data: dict, outputFile: str) None[source]

store data into outputFile :param data: input data :type data: dict :param outputFile: output file name or path to store :type outputFile: str :return: None
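Storing a dictionary of statistics can be sketched as writing one line per entry. The output layout (key, separator, value), the helper `save_dict`, and the file name are assumptions made for illustration and may differ from what save() actually writes.

```python
import os
import tempfile


def save_dict(data, output_file, sep="\t"):
    """Write one 'key<sep>value' line per dictionary entry."""
    with open(output_file, "w", encoding="utf-8") as handle:
        for key, value in data.items():
            handle.write(f"{key}{sep}{value}\n")


# Write hypothetical item frequencies to a temporary location.
out_path = os.path.join(tempfile.mkdtemp(), "itemFrequencies.txt")
save_dict({"b": 3, "a": 2}, out_path)
```

Entries are written in the dictionary's iteration order, so a dictionary that is already sorted by frequency stays sorted in the file.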

PAMI.extras.dbStats.UncertainTransactionalDatabase module

class PAMI.extras.dbStats.UncertainTransactionalDatabase.UncertainTransactionalDatabase(inputFile: str, sep: str = '\t')[source]

Bases: object

Description:

UncertainTransactionalDatabase is a class for computing statistics of an uncertain transactional database.

Attributes:
inputFile : str

input file path

sep : str

separator used in the file. Default is tab.

Methods:
run()

execute readDatabase function

readDatabase()

read database from input file

getDatabaseSize()

get the size of database

getMinimumTransactionLength()

get the minimum transaction length

getAverageTransactionLength()

get the average transaction length. It is sum of all transaction length divided by database length.

getMaximumTransactionLength()

get the maximum transaction length

getStandardDeviationTransactionLength()

get the standard deviation of transaction length

getVarianceTransactionLength()

get the variance of transaction length

getSparsity()

get the sparsity of database

getSortedListOfItemFrequencies()

get sorted list of item frequencies

getSortedListOfTransactionLength()

get sorted list of transaction length

save(data, outputFile)

store data into outputFile

Importing this algorithm into a python program

from PAMI.extras.dbStats import UncertainTransactionalDatabase as db

obj = db.UncertainTransactionalDatabase(iFile, "\t")

obj.run()

obj.printStats()

obj.save(obj.getSortedListOfItemFrequencies(), oFile)

convertDataIntoMatrix() ndarray[source]
getAverageTransactionLength() float[source]

get the average transaction length. It is sum of all transaction length divided by database length. :return: average transaction length :rtype: float

getDatabaseSize() int[source]

get the size of database :return: dataset size :rtype: int

getDensity() float[source]

get the density of database. Density is the proportion of non-zero entries in the database matrix. :return: database density :rtype: float

getFrequenciesInRange() dict[source]
getMaximumTransactionLength() int[source]

get the maximum transaction length :return: maximum transaction length :rtype: int

getMinimumTransactionLength() int[source]

get the minimum transaction length :return: minimum transaction length :rtype: int

getNumberOfItems() int[source]

get the number of items in database. :return: number of items :rtype: int

getSortedListOfItemFrequencies() dict[source]

get sorted list of item frequencies :return: item frequencies :rtype: dict
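A sorted dictionary of item frequencies can be sketched as below. The `transactions` list and the helper `sorted_item_frequencies` are hypothetical; descending order by frequency is an assumption, and any probability values attached to items in an uncertain database are left out of this sketch.

```python
from collections import Counter

# Hypothetical toy database (probability values omitted).
transactions = [["a", "b"], ["b", "c"], ["a", "b", "d"]]


def sorted_item_frequencies(db):
    """Count the transactions containing each item, most frequent first."""
    counts = Counter(item for transaction in db for item in transaction)
    return dict(sorted(counts.items(), key=lambda kv: kv[1], reverse=True))


frequencies = sorted_item_frequencies(transactions)
print(frequencies)  # {'b': 3, 'a': 2, 'c': 1, 'd': 1}
```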

getSparsity() float[source]

get the sparsity of database. sparsity is percentage of 0 of database. :return: database sparsity :rtype: float

getStandardDeviationTransactionLength() float[source]

get the standard deviation transaction length :return: standard deviation transaction length :rtype: float

getTotalNumberOfItems() int[source]

get the number of items in database. :return: number of items :rtype: int

getTransanctionalLengthDistribution() dict[source]

get the distribution of transaction lengths :return: dictionary mapping each transaction length to the number of transactions of that length :rtype: dict

getVarianceTransactionLength() float[source]

get the variance transaction length :return: variance transaction length :rtype: float

plotGraphs() None[source]
printStats() None[source]
readDatabase() None[source]

read database from input file and store into database and size of each transaction.

run() None[source]
save(data: dict, outputFile: str) None[source]

store data into outputFile :param data: input data :type data: dict :param outputFile: output file name or path to store :type outputFile: str :return: None

PAMI.extras.dbStats.UtilityDatabase module

class PAMI.extras.dbStats.UtilityDatabase.UtilityDatabase(inputFile: str | DataFrame, sep: str = '\t')[source]

Bases: object

Description:

UtilityDatabase is a class for computing statistics of a utility database.

Attributes:
param inputFile:

file : input file path

param sep:

str : separator used in the file. Default is tab.

Importing this algorithm into a python program

from PAMI.extras.dbStats import UtilityDatabase as db

obj = db.UtilityDatabase(iFile, "\t")

obj.run()

obj.printStats()

obj.save(obj.getSortedListOfItemFrequencies(), oFile)

creatingItemSets() None[source]

Storing the complete transactions of the database/input file in a database variable

getAverageTransactionLength() float[source]

get the average transaction length. It is sum of all transaction length divided by database length. :return: average transaction length :rtype: float

getAverageUtility() float[source]

get the average utility :return: average utility :rtype: float

getDatabaseSize() int[source]

get the size of database :return: size of database :rtype: int

getFrequenciesInRange() dict[source]

This function is used to get the Frequencies in range :return: Frequencies In Range :rtype: dict

getMaximumTransactionLength() int[source]

get the maximum transaction length :return: maximum transaction length :rtype: int

getMaximumUtility() int[source]

get the maximum utility :return: integer value of maximum utility :rtype: int

getMinimumTransactionLength() int[source]

get the minimum transaction length :return: minimum transaction length :rtype: int

getMinimumUtility() int[source]

get the minimum utility :return: integer value of minimum utility :rtype: int

getNumberOfItems() int[source]

get the number of items in database. :return: number of items :rtype: int

getSortedListOfItemFrequencies() dict[source]

get sorted list of item frequencies :return: item frequencies :rtype: dict

getSortedUtilityValuesOfItem() dict[source]

get sorted utility value each item. key is item and value is utility of item :return: sorted dictionary utility value of item :rtype: dict

getSparsity() float[source]

get the sparsity of database :return: sparsity of database in floating values :rtype: float

getStandardDeviationTransactionLength() float[source]

get the standard deviation transaction length :return: standard deviation transaction length :rtype: float

getTotalNumberOfItems() int[source]

get the number of items in database. :return: number of items :rtype: int

getTotalUtility() int[source]

get sum of utility :return: total utility :rtype: int
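The utility summaries (total, minimum, average, maximum) can be sketched from a dictionary of per-item utilities. Taking the statistics over per-item totals is an assumption of this sketch, and the `item_utility` dictionary is hypothetical data.

```python
# Hypothetical total utility of each item in a toy database.
item_utility = {"a": 6, "b": 6, "c": 10}

values = list(item_utility.values())

total_utility = sum(values)                  # sum of utility
minimum_utility = min(values)                # smallest per-item utility
maximum_utility = max(values)                # largest per-item utility
average_utility = total_utility / len(values)

print(total_utility, minimum_utility, maximum_utility)  # 22 6 10
```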

getTransanctionalLengthDistribution() dict[source]

get the distribution of transaction lengths :return: a dictionary mapping each transaction length to the number of transactions of that length :rtype: dict

getVarianceTransactionLength() float[source]

get the variance transaction length :return: variance transaction length :rtype: float

plotGraphs() None[source]
printStats() None[source]

This function is used to print the results

readDatabase() None[source]

read database from input file and store into database and size of each transaction.

run() None[source]
save(data, outputFile) None[source]

store data into outputFile :param data: input data :type data: dict :param outputFile: output file name or path to store :type outputFile: str :return: None

Module contents