

Spotlight Poster

Instance Correlation Graph-based Naive Bayes

Chengyuan Li · Liangxiao Jiang · Wenjun Zhang · Liangjun Yu · Huan Zhang

East Exhibition Hall A-B #E-1600
Tue 15 Jul 11 a.m. PDT — 1:30 p.m. PDT

Abstract:

Due to its simplicity, effectiveness, and robustness, naive Bayes (NB) remains one of the top 10 data mining algorithms. To improve its performance, a large number of improved algorithms have been proposed over the last few decades. However, apart from Gaussian naive Bayes (GNB), there is little work on numerical attributes, and none of the existing algorithms takes into account the correlations among instances. To fill this gap, we propose a novel algorithm called instance correlation graph-based naive Bayes (ICGNB). Specifically, it first uses the original attributes to construct an instance correlation graph (ICG) that represents the correlations among instances. Then, it employs a variational graph auto-encoder (VGAE) to generate new attributes from the constructed ICG and uses them to augment the original attributes. Finally, it weights each augmented attribute to alleviate attribute redundancy and builds GNB on the weighted attributes. Experimental results on tens of datasets show that ICGNB significantly outperforms its competitors. Our code and datasets are available at https://github.com/jiangliangxiao/ICGNB.
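The three steps in the abstract (build a graph over instances, derive new attributes from it, then fit a weighted GNB on the augmented attributes) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and rests on several assumptions: the graph is a Gaussian-similarity graph, a spectral embedding of that graph stands in for the VGAE, the redundancy weights come from a hypothetical correlation-based rule, and the graph is built over all instances (labelled and unlabelled) without using labels. All function names are illustrative.

```python
import numpy as np

def similarity_graph(X):
    """Dense Gaussian-similarity graph over instances (assumed ICG construction)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    dist = np.sqrt(sq)
    # Bandwidth: median pairwise distance (a common heuristic, assumed here).
    sigma = np.median(dist[np.triu_indices(len(X), k=1)])
    A = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)  # no self-loops
    return A

def spectral_attributes(A, dim=2):
    """Stand-in for the VGAE: leading eigenvectors of the normalized adjacency."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    S = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    return vecs[:, -dim:]        # eigenvectors of the `dim` largest eigenvalues

def redundancy_weights(Z):
    """Hypothetical rule: down-weight attributes correlated with the others."""
    C = np.abs(np.nan_to_num(np.corrcoef(Z, rowvar=False)))
    mean_corr = (C.sum(axis=1) - np.diag(C)) / (Z.shape[1] - 1)
    return 1.0 / (1.0 + mean_corr)

def fit_gnb(X, y):
    """Per-class priors, means, and variances of a Gaussian naive Bayes model."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-9
    return classes, priors, means, variances

def predict_weighted_gnb(model, X, w):
    """Each attribute's log-likelihood is scaled by its weight before summing."""
    classes, priors, means, variances = model
    diff = X[:, None, :] - means[None]
    log_lik = -0.5 * (np.log(2 * np.pi * variances)[None] + diff ** 2 / variances[None])
    scores = np.log(priors)[None] + (log_lik * w[None, None, :]).sum(axis=2)
    return classes[np.argmax(scores, axis=1)]

# Toy data: two Gaussian blobs with two numerical attributes each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (40, 2)), rng.normal(4.0, 1.0, (40, 2))])
y = np.repeat([0, 1], 40)

A = similarity_graph(X)                    # step 1: instance graph
Z = np.hstack([X, spectral_attributes(A)]) # step 2: augment original attributes
w = redundancy_weights(Z)                  # step 3a: attribute weights
train, test = np.arange(0, 80, 2), np.arange(1, 80, 2)
model = fit_gnb(Z[train], y[train])        # step 3b: GNB on augmented attributes
pred = predict_weighted_gnb(model, Z[test], w)
acc = np.mean(pred == y[test])
print("held-out accuracy:", acc)
```

Scaling each attribute's log-likelihood by a weight is one common way to realize attribute-weighted naive Bayes; whether ICGNB applies its weights in this form is not stated in the abstract.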

Lay Summary:

Naive Bayes (NB) is a simple and widely used method that predicts the classes of unknown items from existing items whose classes are known. The original information about existing items is usually limited, so we wanted to obtain more information by representing and using the correlations among items. Specifically, we construct a graph to represent these correlations and then use a powerful information generation program to mine new information from the constructed graph. Next, we combine the original information with the newly generated information and reduce the redundancy in it. Surprisingly, we found that this combined information works really well and improves NB’s predictive ability, outperforming existing related methods in most cases. Our method has implications for how to predict the classes of unknown items by leveraging correlations among items. To help other researchers explore this idea, we have released our method, called ICGNB, along with its settings.
