Abstract
Graph contrastive learning (GCL) has gained significant attention for its strong performance in unsupervised node and graph representation learning. Although it excels at node-level and graph-level classification, GCL often underperforms other unsupervised methods in link prediction. Current GCL approaches rely primarily on stochastic augmentation (e.g., edge masking, node dropping) and emphasize the similarity of corresponding nodes across views in the embedding space, neglecting crucial topological and semantic information in the graph. This limitation hinders their potential for link prediction. Moreover, stochastic augmentation may introduce distributional discrepancies in node representations, which makes consistent graph learning more difficult. To address these issues, we propose a novel framework, Topology and Semantic Contrastive Learning with Joint Distribution Alignment (TSCLA). TSCLA integrates the topological and semantic relationships between nodes into the GCL framework while aligning the topological and semantic distributions of the representations of the masked and original graphs. Specifically, we first assign weights to positive nodes using two tailored metrics: topological mutual information and semantic soft clustering similarity. In this way, both topological and semantic information are integrated into GCL. Furthermore, we design targeted decoding strategies to reconstruct the masked information, which helps the model capture topological and semantic information. Finally, we propose a joint distribution alignment module to strengthen the consistency between corresponding node representations at the topological and semantic levels. Extensive experiments on seven public datasets demonstrate that TSCLA achieves significantly better performance than state-of-the-art methods on most downstream tasks, especially link prediction.