**Assistant Professor (Tenure Track), Department of Data Informatics, (National) Korea Maritime and Ocean University, Busan 49112, Republic of Korea.
Abstract. In recent years, the information technology industry has grown strongly around the world. At the same time, we face a new challenge: an explosion in the amount of information. Although a huge amount of data is available, the information we actually obtain from it is very limited, and the implications behind the data have not yet been fully exploited. Scientists have therefore researched new ways to fully exploit the information contained in databases. The concept of knowledge discovery in databases was first mentioned in the late 1980s; it is the process of detecting latent, previously unknown, and useful knowledge in large databases [1] [2]. It overcomes the limitation of traditional database models, whose query tools alone cannot find new or hidden information in the database. In the early 1980s, Z. Pawlak proposed rough set theory [3], which has a very solid mathematical basis. The theory has been adopted by many research groups working in information technology in general and in knowledge discovery in databases in particular. Rough set theory is increasingly widely applied in knowledge discovery; it is very useful for data classification and association rule discovery, and especially for problems involving ambiguous and uncertain data. Specifically, in rough set theory, data is represented through information systems or tables. In practice, data tables are often large and contain imperfect, redundant, continuous, or symbolic data; rough set theory allows knowledge exploration in such databases in order to detect hidden knowledge from these “raw” blocks of data. The knowledge found is expressed in the form of rules and patterns.
After finding the most general rules for data representation, one can calculate the strength of, and the dependence between, attributes in the information system. In this paper, the author studies recommendation systems [12], rough set theory and the theory of approximation, and fuzzy rough set theory [13], and on that basis builds a partial model. The software enables users to mine association rules from their own database, helping them make appropriate purchase or import decisions. The system supports user-defined options for database features, can load data from SQL Server, and can export statistics to a website and to reports.
Keywords: big data; data mining; knowledge discovery; rough set; machine learning; fuzzy rough set; recommendation systems; association rules.
The SApriori Engine predicts the seasonal consumption behavior of consumers based on an Object Relational Mapping (ORM) model and the S-Apriori algorithm. It loads datasets from GitHub/cloud or local storage and supports exporting the rules to Excel.
This research is from KMOU (Korea Maritime & Ocean University), Data Science Lab, Room 407.
Example ORM mapping SAprioriEmployee with Employee.json:
Model class diagram for SApriori engine:
The SApriori algorithm's processing center is the SAprioriEngine class. All data is loaded into a SAprioriDatabase object, where it can be filtered by season (Spring, Summer, Autumn, Winter, Thanksgiving, Christmas…), queried by time range, or selected according to the number of accesses. The model provides many data query methods for preparing the input to the algorithm. The SAprioriEngine object provides the runSAprioriModel function to find association rules; it returns a SAprioriResult object. The SAprioriResult object contains the set of rules, each stored as a SAprioriRule object. A SAprioriRule stores the detailed results of each component once the SApriori algorithm completes, and the data is presented through this class.
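The step performed by runSAprioriModel, finding association rules that pass a minimum support and a minimum confidence, can be illustrated with a minimal sketch. It assumes the standard Apriori definitions of support and confidence and thresholds expressed in percent; the class RuleSketch and its methods are hypothetical names for illustration and are not part of the SApriori engine.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of the SApriori engine.
public static class RuleSketch
{
    // Fraction of transactions that contain every item in `items`.
    public static double Support(IReadOnlyList<HashSet<string>> transactions,
                                 IEnumerable<string> items)
    {
        if (transactions.Count == 0) return 0.0;
        List<string> wanted = items.ToList();
        int hits = transactions.Count(t => wanted.All(item => t.Contains(item)));
        return (double)hits / transactions.Count;
    }

    // Confidence of the rule X --> Y: support(X and Y together) / support(X).
    public static double Confidence(IReadOnlyList<HashSet<string>> transactions,
                                    IEnumerable<string> x,
                                    IEnumerable<string> y)
    {
        List<string> xs = x.ToList();
        double supportX = Support(transactions, xs);
        if (supportX == 0.0) return 0.0; // X never occurs, so the rule is undefined
        return Support(transactions, xs.Concat(y)) / supportX;
    }

    // True when the rule X --> Y meets both thresholds (assumed to be in percent).
    public static bool IsStrong(IReadOnlyList<HashSet<string>> transactions,
                                IEnumerable<string> x,
                                IEnumerable<string> y,
                                double minSupportPercent,
                                double minConfidencePercent)
    {
        List<string> xs = x.ToList();
        List<string> ys = y.ToList();
        double supportPercent = Support(transactions, xs.Concat(ys)) * 100;
        double confidencePercent = Confidence(transactions, xs, ys) * 100;
        return supportPercent >= minSupportPercent
            && confidencePercent >= minConfidencePercent;
    }
}
```

With five transactions in which item A appears four times and A and B appear together three times, Confidence for A --> B is 3/4, so the rule would pass a 70% confidence threshold but not an 80% one.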
Call LoadDatabase("largedataset") as shown in the code below:
// Load the dataset from the "largedataset" folder
SAprioriDatabase database = new SAprioriDatabase();
database.LoadDatabase("largedataset");
// Filter orders by date range (1 May 2011 to 31 May 2011)
DateTime from = new DateTime(2011, 5, 1);
DateTime to = new DateTime(2011, 5, 31);
database.FilterOrders(from, to, true);
// Run SApriori with the minimum support and confidence thresholds
SAprioriEngine sApriori = new SAprioriEngine();
double minSupport = 20;
double minConfident = 80;
SAprioriResult result = sApriori.runSAprioriModel(database, minSupport, minConfident);
// Print each strong rule as [X --> Y confidence%]
foreach (SAprioriRule arule in result.StrongRules)
{
    string s = "[" + arule.X_Results_Description + " --> " + arule.Y_Results_Description + " " + String.Format("{0:0.00}", (arule.Confidence * 100)) + "%] " + "\r\n";
    Console.WriteLine(s);
}
Result:
[Sport-100 Helmet, Red --> Sport-100 Helmet, Black,Long-Sleeve Logo Jersey, L 81.82%]
[Sport-100 Helmet, Red --> Sport-100 Helmet, Black,AWC Logo Cap,Long-Sleeve Logo Jersey, L 81.82%]
[Sport-100 Helmet, Red --> AWC Logo Cap,Long-Sleeve Logo Jersey, L 90.91%]
[Sport-100 Helmet, Red --> Sport-100 Helmet, Black,AWC Logo Cap 81.82%]
[Sport-100 Helmet, Red --> AWC Logo Cap 90.91%]
[Sport-100 Helmet, Red --> Sport-100 Helmet, Blue,AWC Logo Cap,Long-Sleeve Logo Jersey, L 81.82%]
[Sport-100 Helmet, Red --> Sport-100 Helmet, Blue,Long-Sleeve Logo Jersey, L 81.82%]
[Sport-100 Helmet, Red --> Long-Sleeve Logo Jersey, L 90.91%]
[Sport-100 Helmet, Red --> Sport-100 Helmet, Blue 81.82%]
[Sport-100 Helmet, Red --> Sport-100 Helmet, Blue,AWC Logo Cap 81.82%]
[Sport-100 Helmet, Red --> Sport-100 Helmet, Black 90.91%]
[Sport-100 Helmet, Red,Sport-100 Helmet, Black --> AWC Logo Cap 90.00%]
[Sport-100 Helmet, Red,Sport-100 Helmet, Black --> Long-Sleeve Logo Jersey, L 90.00%]
[Sport-100 Helmet, Red,Sport-100 Helmet, Black --> AWC Logo Cap,Long-Sleeve Logo Jersey, L 90.00%]
[Sport-100 Helmet, Red,Sport-100 Helmet, Black --> AWC Logo Cap,Long-Sleeve Logo Jersey, L 90.00%]
[Sport-100 Helmet, Red,Sport-100 Helmet, Black,AWC Logo Cap --> Long-Sleeve Logo Jersey, L 100.00%]
[Sport-100 Helmet, Red,Sport-100 Helmet, Black,Long-Sleeve Logo Jersey, L --> AWC Logo Cap 100.00%]
[Sport-100 Helmet, Red,Sport-100 Helmet, Blue --> AWC Logo Cap 100.00%]
[Sport-100 Helmet, Red,Sport-100 Helmet, Blue --> AWC Logo Cap,Long-Sleeve Logo Jersey, L 100.00%]
[Sport-100 Helmet, Red,Sport-100 Helmet, Blue --> AWC Logo Cap,Long-Sleeve Logo Jersey, L 100.00%]
[Sport-100 Helmet, Red,Sport-100 Helmet, Blue --> Long-Sleeve Logo Jersey, L 100.00%]
[Sport-100 Helmet, Red,Sport-100 Helmet, Blue,AWC Logo Cap --> Long-Sleeve Logo Jersey, L 100.00%]
[Sport-100 Helmet, Red,Sport-100 Helmet, Blue,Long-Sleeve Logo Jersey, L --> AWC Logo Cap 100.00%]
[Sport-100 Helmet, Red,AWC Logo Cap --> Sport-100 Helmet, Blue 90.00%]
[Sport-100 Helmet, Red,AWC Logo Cap --> Sport-100 Helmet, Blue,Long-Sleeve Logo Jersey, L 90.00%]
[Sport-100 Helmet, Red,AWC Logo Cap --> Sport-100 Helmet, Black,Long-Sleeve Logo Jersey, L 90.00%]
[Sport-100 Helmet, Red,AWC Logo Cap --> Sport-100 Helmet, Black 90.00%]
[Sport-100 Helmet, Red,AWC Logo Cap --> Sport-100 Helmet, Black,Long-Sleeve Logo Jersey, L 90.00%]
[Sport-100 Helmet, Red,AWC Logo Cap --> Sport-100 Helmet, Blue,Long-Sleeve Logo Jersey, L 90.00%]
[Sport-100 Helmet, Red,AWC Logo Cap --> Long-Sleeve Logo Jersey, L 100.00%]
[Sport-100 Helmet, Red,AWC Logo Cap,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Black 90.00%]
[Sport-100 Helmet, Red,AWC Logo Cap,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Blue 90.00%]
[Sport-100 Helmet, Red,Long-Sleeve Logo Jersey, L --> AWC Logo Cap 100.00%]
[Sport-100 Helmet, Red,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Blue 90.00%]
[Sport-100 Helmet, Red,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Black,AWC Logo Cap 90.00%]
[Sport-100 Helmet, Red,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Black 90.00%]
[Sport-100 Helmet, Red,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Black,AWC Logo Cap 90.00%]
[Sport-100 Helmet, Red,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Blue,AWC Logo Cap 90.00%]
[Sport-100 Helmet, Red,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Blue,AWC Logo Cap 90.00%]
[Sport-100 Helmet, Black --> AWC Logo Cap 84.62%]
[Sport-100 Helmet, Black --> Sport-100 Helmet, Blue 84.62%]
[Sport-100 Helmet, Black --> Long-Sleeve Logo Jersey, L 84.62%]
[Sport-100 Helmet, Black --> AWC Logo Cap,Long-Sleeve Logo Jersey, L 84.62%]
[Sport-100 Helmet, Black,Sport-100 Helmet, Blue --> AWC Logo Cap,Long-Sleeve Logo Jersey, L 90.91%]
[Sport-100 Helmet, Black,Sport-100 Helmet, Blue --> AWC Logo Cap,Long-Sleeve Logo Jersey, L 90.91%]
[Sport-100 Helmet, Black,Sport-100 Helmet, Blue --> AWC Logo Cap 90.91%]
[Sport-100 Helmet, Black,Sport-100 Helmet, Blue --> Long-Sleeve Logo Jersey, L 90.91%]
[Sport-100 Helmet, Black,Sport-100 Helmet, Blue,AWC Logo Cap --> Long-Sleeve Logo Jersey, L 100.00%]
[Sport-100 Helmet, Black,Sport-100 Helmet, Blue,Long-Sleeve Logo Jersey, L --> AWC Logo Cap 100.00%]
[Sport-100 Helmet, Black,AWC Logo Cap --> Sport-100 Helmet, Red,Long-Sleeve Logo Jersey, L 81.82%]
[Sport-100 Helmet, Black,AWC Logo Cap --> Sport-100 Helmet, Red,Long-Sleeve Logo Jersey, L 81.82%]
[Sport-100 Helmet, Black,AWC Logo Cap --> Sport-100 Helmet, Blue 90.91%]
[Sport-100 Helmet, Black,AWC Logo Cap --> Long-Sleeve Logo Jersey, L 100.00%]
[Sport-100 Helmet, Black,AWC Logo Cap --> Sport-100 Helmet, Blue,Long-Sleeve Logo Jersey, L 90.91%]
[Sport-100 Helmet, Black,AWC Logo Cap --> Sport-100 Helmet, Red 81.82%]
[Sport-100 Helmet, Black,AWC Logo Cap --> Sport-100 Helmet, Blue,Long-Sleeve Logo Jersey, L 90.91%]
[Sport-100 Helmet, Black,AWC Logo Cap,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Blue 90.91%]
[Sport-100 Helmet, Black,AWC Logo Cap,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Red 81.82%]
[Sport-100 Helmet, Black,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Red,AWC Logo Cap 81.82%]
[Sport-100 Helmet, Black,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Blue,AWC Logo Cap 90.91%]
[Sport-100 Helmet, Black,Long-Sleeve Logo Jersey, L --> AWC Logo Cap 100.00%]
[Sport-100 Helmet, Black,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Red 81.82%]
[Sport-100 Helmet, Black,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Blue 90.91%]
[Sport-100 Helmet, Black,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Red,AWC Logo Cap 81.82%]
[Sport-100 Helmet, Black,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Blue,AWC Logo Cap 90.91%]
[Sport-100 Helmet, Blue --> Long-Sleeve Logo Jersey, L 84.62%]
[Sport-100 Helmet, Blue --> Sport-100 Helmet, Black 84.62%]
[Sport-100 Helmet, Blue --> AWC Logo Cap,Long-Sleeve Logo Jersey, L 84.62%]
[Sport-100 Helmet, Blue --> AWC Logo Cap 92.31%]
[Sport-100 Helmet, Blue,AWC Logo Cap --> Sport-100 Helmet, Black,Long-Sleeve Logo Jersey, L 83.33%]
[Sport-100 Helmet, Blue,AWC Logo Cap --> Sport-100 Helmet, Black 83.33%]
[Sport-100 Helmet, Blue,AWC Logo Cap --> Sport-100 Helmet, Black,Long-Sleeve Logo Jersey, L 83.33%]
[Sport-100 Helmet, Blue,AWC Logo Cap --> Long-Sleeve Logo Jersey, L 91.67%]
[Sport-100 Helmet, Blue,AWC Logo Cap,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Black 90.91%]
[Sport-100 Helmet, Blue,AWC Logo Cap,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Red 81.82%]
[Sport-100 Helmet, Blue,Long-Sleeve Logo Jersey, L --> AWC Logo Cap 100.00%]
[Sport-100 Helmet, Blue,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Red,AWC Logo Cap 81.82%]
[Sport-100 Helmet, Blue,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Red,AWC Logo Cap 81.82%]
[Sport-100 Helmet, Blue,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Black,AWC Logo Cap 90.91%]
[Sport-100 Helmet, Blue,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Red 81.82%]
[Sport-100 Helmet, Blue,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Black,AWC Logo Cap 90.91%]
[Sport-100 Helmet, Blue,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Black 90.91%]
[AWC Logo Cap --> Sport-100 Helmet, Blue 85.71%]
[AWC Logo Cap --> Long-Sleeve Logo Jersey, L 92.86%]
[AWC Logo Cap,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Black 84.62%]
[AWC Logo Cap,Long-Sleeve Logo Jersey, L --> Sport-100 Helmet, Blue 84.62%]
[Long-Sleeve Logo Jersey, L --> AWC Logo Cap 86.67%]
[LL Road Frame - Red, 60 --> Road-650 Black, 52 90.00%]
[LL Road Frame - Red, 60 --> Road-650 Red, 60 90.00%]
[LL Road Frame - Red, 60 --> Road-450 Red, 52 90.00%]
[LL Road Frame - Red, 60 --> LL Road Frame - Black, 52 90.00%]
[LL Road Frame - Red, 60 --> Road-450 Red, 52,Road-650 Red, 60 90.00%]
[LL Road Frame - Red, 60,Road-450 Red, 52 --> Road-650 Red, 60 100.00%]
[LL Road Frame - Red, 60,Road-650 Red, 60 --> Road-450 Red, 52 100.00%]
[LL Road Frame - Black, 52 --> Road-450 Red, 52 90.00%]
[LL Road Frame - Black, 52 --> LL Road Frame - Red, 60 90.00%]
[LL Road Frame - Black, 52 --> Road-650 Red, 44 90.00%]
[Road-450 Red, 58 --> Road-650 Black, 52 81.82%]
[Road-450 Red, 58 --> Road-650 Red, 44 81.82%]
[Road-450 Red, 58 --> Road-450 Red, 52 81.82%]
[Road-450 Red, 52,Road-650 Red, 60 --> Road-650 Red, 44 81.82%]
[Road-450 Red, 52,Road-650 Red, 60 --> Road-650 Red, 44,Road-650 Black, 52 81.82%]
[Road-450 Red, 52,Road-650 Red, 60 --> Road-650 Red, 44,Road-650 Black, 52 81.82%]
[Road-450 Red, 52,Road-650 Red, 60 --> Road-650 Black, 52 90.91%]
[Road-450 Red, 52,Road-650 Red, 60 --> LL Road Frame - Red, 60 81.82%]
[Road-450 Red, 52,Road-650 Red, 60,Road-650 Red, 44 --> Road-650 Black, 52 100.00%]
[Road-450 Red, 52,Road-650 Red, 60,Road-650 Black, 52 --> Road-650 Red, 44 90.00%]
[Road-450 Red, 52,Road-650 Red, 44 --> Road-650 Black, 52 83.33%]
[Road-450 Red, 52,Road-650 Red, 44,Road-650 Black, 52 --> Road-650 Red, 60 90.00%]
[Road-450 Red, 52,Road-650 Black, 52 --> Road-650 Red, 60 90.91%]
[Road-450 Red, 52,Road-650 Black, 52 --> Road-650 Red, 60,Road-650 Red, 44 81.82%]
[Road-450 Red, 52,Road-650 Black, 52 --> Road-650 Red, 60,Road-650 Red, 44 81.82%]
[Road-450 Red, 52,Road-650 Black, 52 --> Road-650 Red, 44 90.91%]
[Road-650 Red, 60 --> Road-650 Black, 52 85.71%]
[Road-650 Red, 60,Road-650 Red, 44 --> Road-450 Red, 52 100.00%]
[Road-650 Red, 60,Road-650 Red, 44 --> Road-450 Red, 52,Road-650 Black, 52 100.00%]
[Road-650 Red, 60,Road-650 Red, 44 --> Road-650 Black, 52 100.00%]
[Road-650 Red, 60,Road-650 Red, 44 --> Road-450 Red, 52,Road-650 Black, 52 100.00%]
[Road-650 Red, 60,Road-650 Red, 44,Road-650 Black, 52 --> Road-450 Red, 52 100.00%]
[Road-650 Red, 60,Road-650 Black, 52 --> Road-450 Red, 52 83.33%]
[Road-650 Red, 44 --> Road-450 Red, 52 85.71%]
[Road-650 Red, 44,Road-650 Black, 52 --> Road-450 Red, 52 90.91%]
[Road-650 Red, 44,Road-650 Black, 52 --> Road-450 Red, 52,Road-650 Red, 60 81.82%]
[Road-650 Red, 44,Road-650 Black, 52 --> Road-450 Red, 52,Road-650 Red, 60 81.82%]
[Road-650 Red, 44,Road-650 Black, 52 --> Road-650 Red, 60 81.82%]
[Road-650 Black, 52 --> Road-650 Red, 60 85.71%]
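The percentages in the listing above are the confidence values printed by the loop. As a reading aid, and assuming the engine uses the standard Apriori definitions of support and confidence (the text does not spell them out), for a set of transactions D and itemsets X and Y:

```latex
\[
\mathrm{supp}(X) = \frac{\left|\{\, t \in D : X \subseteq t \,\}\right|}{|D|},
\qquad
\mathrm{conf}(X \Rightarrow Y) = \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X)}
\]
```

Under that assumption, a printed value such as 81.82% is what a ratio of 9/11 would give (9/11 ≈ 0.8182), and 90.91% is what 10/11 would give. The underlying transaction counts are not shown in the output, so these counts are illustrative only.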
Example with largedataset (C# code): filter by season
static void Main(string[] args)
{
    // Create the SAprioriDatabase object
    SAprioriDatabase database = new SAprioriDatabase();
    // Define the seasons; the months depend on the region where the dataset was collected
    database.addSeason(SAprioriSeason.Spring, new List<int>() { 3, 4, 5 });
    database.addSeason(SAprioriSeason.Summer, new List<int>() { 6, 7, 8 });
    database.addSeason(SAprioriSeason.Autumn, new List<int>() { 9, 10, 11 });
    database.addSeason(SAprioriSeason.Winter, new List<int>() { 12, 1, 2 });
    // Call the LoadDatabase method; largedataset is a folder that stores 6 JSON files
    database.LoadDatabase("largedataset");
    int year = 2011;
    // Filter the dataset by season
    database.FilterOrders(SAprioriSeason.Spring, year, true);
    // Run SApriori
    SAprioriEngine sApriori = new SAprioriEngine();
    double minSupport = 20;
    double minConfident = 80;
    SAprioriResult result = sApriori.runSAprioriModel(database, minSupport, minConfident);
    foreach (SAprioriRule arule in result.StrongRules)
    {
        string s = "[" + arule.X_Results_Description + " --> " + arule.Y_Results_Description + " " + String.Format("{0:0.00}", (arule.Confidence * 100)) + "%] " + "\r\n";
        Console.WriteLine(s);
    }
    Console.ReadLine();
}
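How a season defined as a list of months could translate into a date filter is sketched below. The enum Season and the class SeasonFilterSketch are hypothetical stand-ins, not the engine's SAprioriSeason or FilterOrders. In particular, the sketch matches months within a single calendar year, so a Winter defined as { 12, 1, 2 } selects January, February, and December of that same year, which may differ from how the engine treats a season that spans the year boundary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the engine's season type.
public enum Season { Spring, Summer, Autumn, Winter }

// Hypothetical helper for illustration; not the engine's FilterOrders.
public static class SeasonFilterSketch
{
    // Keep only the dates in the given calendar year whose month belongs
    // to the month list registered for the requested season.
    public static List<DateTime> Filter(IEnumerable<DateTime> orderDates,
                                        IReadOnlyDictionary<Season, List<int>> seasons,
                                        Season season,
                                        int year)
    {
        if (!seasons.TryGetValue(season, out List<int> months))
        {
            return new List<DateTime>(); // the season was never defined
        }
        return orderDates
            .Where(d => d.Year == year && months.Contains(d.Month))
            .ToList();
    }
}
```

Passing the month lists in as a dictionary keeps the season definitions configurable per region, in the same spirit as the addSeason calls above.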