Mining
frequent closed itemsets (FCIs) is important in mining non-redundant (minimal)
association rules.
Therefore,
many algorithms have been developed for mining FCIs with reduced mining time
and memory usage.
For
mining FCIs, algorithms use the minimum support threshold, minSup, to prune
itemsets.
However,
using a fixed minSup is not suitable for mining top-rank-k FCIs. A large
threshold will lead to a small number of generated FCIs, leading to
insufficient FCIs to query when k is large.
On
the other hand, a small minSup will generate a huge number of generated FCIs,
leading to large runtimes and high memory usage.
In
this paper, we propose a method for mining top-rank-k FCIs without using a
fixed minimum support threshold.
A
strategy is first used to eliminate 1-items that cannot generate FCIs belonging
to top-rank-k FCIs.
Next,
based on the set of candidate 1-items, we propose TRK-FCI, a DCI-Plus-based
algorithm, for mining top-rank-k FCIs.
In
the process of mining top-rank-k FCIs, TRK-FCI automatically increases minSup
according to the mined FCIs, efficiently pruning itemsets that cannot belong to
top-rank-k FCIs.
We
also modify the dynamic bit vector (DBV) structure and apply it to reduce
memory usage and runtime in the TRK-FCI-DBV algorithm.
Experimental
results show that TRK-FCI-DBV is more efficient than TRK-FCI for various
databases.
Title: | A method for mining top-rank-k frequent closed itemsets |
Authors: | Nguyen, Loan T. T. Trinh, Truc Nguyen Ngoc Thanh |
Keywords: | DCI-Plus dynamic bit vectors frequent closed itemsets top-rank-k frequent closed itemsets |
Issue Date: | 2017 |
Publisher: | IOS PRESS, NIEUWE HEMWEG 6B, 1013 BG AMSTERDAM, NETHERLANDS |
Citation: | ISIKNOWLEDGE |
Abstract: | Mining frequent closed itemsets (FCIs) is important in mining non-redundant (minimal) association rules. Therefore, many algorithms have been developed for mining FCIs with reduced mining time and memory usage. For mining FCIs, algorithms use the minimum support threshold, minSup, to prune itemsets. However, using a fixed minSup is not suitable for mining top-rank-k FCIs. A large threshold will lead to a small number of generated FCIs, leading to insufficient FCIs to query when k is large. On the other hand, a small minSup will generate a huge number of generated FCIs, leading to large runtimes and high memory usage. In this paper, we propose a method for mining top-rank-k FCIs without using a fixed minimum support threshold. A strategy is first used to eliminate 1-items that cannot generate FCIs belonging to top-rank-k FCIs. Next, based on the set of candidate 1-items, we propose TRK-FCI, a DCI-Plus-based algorithm, for mining top-rank-k FCIs. In the process of mining top-rank-k FCIs, TRK-FCI automatically increases minSup according to the mined FCIs, efficiently pruning itemsets that cannot belong to top-rank-k FCIs. We also modify the dynamic bit vector (DBV) structure and apply it to reduce memory usage and runtime in the TRK-FCI-DBV algorithm. Experimental results show that TRK-FCI-DBV is more efficient than TRK-FCI for various databases. |
Description: | TNS07012 ; JOURNAL OF INTELLIGENT & FUZZY SYSTEMS Volume: 32 Issue: 2 Pages: 1297-1305 Published: 2017 |
URI: | http://repository.vnu.edu.vn/handle/VNU_123/28816 |
ISSN: | 1064-1246 1875-8967 |
Appears in Collections: | Bài báo của ĐHQGHN trong Web of Science |
Nhận xét
Đăng nhận xét