2015 IEEE International Conference on Data Mining Workshop (ICDMW), Atlantic City, NJ, USA, 14-17 November 2015
Link prediction is an important and well-studied problem in network analysis, with a broad range of applications including recommender systems, anomaly detection, and denoising. The general principle in link prediction is to use the topological characteristics of the nodes in the network to predict edges that might be added to or removed from the network. While early research used the local network neighborhood to characterize the topological relationship between pairs of nodes, recent studies increasingly show that using global network information improves prediction performance. Meanwhile, in the context of disease gene prioritization and functional annotation in computational biology, methods based on "global topological similarity" have been shown to be effective and robust to noise and ascertainment bias. These methods compute topological profiles that represent the global view of the network from the perspective of each node, and compare these profiles to assess the topological similarity between nodes. Here, we show that, in the context of link prediction in large networks, the performance of these global-view based methods can be adversely affected by high dimensionality. Motivated by this observation, we propose two dimensionality reduction techniques that exploit the sparsity and modularity of networks encountered in practical applications. Our experimental results on predicting future collaborations in a comprehensive co-authorship network show that dimensionality reduction renders global-view based link prediction highly effective, and that the resulting algorithms significantly outperform state-of-the-art link prediction methods.
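As a rough illustration of the profile-and-compare idea, the following is a minimal sketch only. The abstract does not say how the topological profiles are built or compared, so this sketch assumes random walk with restart for the profiles and Pearson correlation for the comparison. The function names `rwr_profiles` and `profile_similarity`, the restart probability of 0.3, and the toy graph are all hypothetical. It covers only the profile-and-compare step, not the dimensionality reduction techniques.

```python
import numpy as np


def rwr_profiles(adj, restart=0.3):
    """Column i of the result is an assumed random-walk-with-restart profile of node i."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # avoid division by zero for isolated nodes
    walk = adj / col_sums  # column-stochastic transition matrix
    # Closed form of p = (1 - restart) * walk @ p + restart * e_i for all i at once.
    return restart * np.linalg.inv(np.eye(n) - (1.0 - restart) * walk)


def profile_similarity(profiles):
    """Pearson correlation between every pair of node profiles (columns)."""
    return np.corrcoef(profiles, rowvar=False)


# Toy graph (hypothetical): two triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0

profiles = rwr_profiles(adj)
sim = profile_similarity(profiles)

# Rank node pairs that are not yet linked by profile similarity, highest first.
candidates = [(u, v) for u in range(6) for v in range(u + 1, 6) if adj[u, v] == 0]
ranked = sorted(candidates, key=lambda pair: sim[pair], reverse=True)
```

Each profile here has one entry per node, so its length grows with the size of the network. This is the dimensionality that the abstract identifies as a problem for large networks.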