Aligning Multiple Knowledge Graphs in a Single Pass (2024)

Yaming Yang, Zhe Wang, Ziyu Guan, Wei Zhao, Weigang Lu, Xinyan Huang (* Corresponding author)

Y. Yang, Z. Wang, Z. Guan, W. Zhao, and W. Lu are with the State Key Laboratory of Integrated Services Networks, School of Computer Science and Technology, Xidian University, Xi'an, China 710071. E-mail: {yym@, zwang@stu., zyguan@, ywzhao@mail., wglu@stu.}xidian.edu.cn

X. Huang is with the Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, Xi'an, China 710071. E-mail: xinyanh@stu.xidian.edu.cn

Abstract

Entity alignment (EA) aims to identify equivalent entities across different knowledge graphs (KGs), which helps fuse these KGs into a more comprehensive one. Previous EA methods mainly focus on aligning a pair of KGs, and to the best of our knowledge, no existing EA method considers aligning multiple (more than two) KGs. To fill this research gap, in this work, we study the novel problem of aligning multiple KGs and propose an effective framework named MultiEA to solve it. First, we embed the entities of all the candidate KGs into a common feature space by a shared KG encoder. Then, we explore three alignment strategies to minimize the distances among pre-aligned entities. In particular, we propose an innovative inference enhancement technique that improves alignment performance by incorporating high-order similarities. Finally, to verify the effectiveness of MultiEA, we construct two new real-world benchmark datasets and conduct extensive experiments on them. The results show that MultiEA can effectively and efficiently align multiple KGs in a single pass.

Index Terms:

Knowledge Graphs, Entity Alignment, Graph Neural Networks

1 Introduction

Knowledge graphs (KGs) are a special kind of graph that stores a wealth of structural facts (i.e., knowledge) about the real world. Each fact is usually structured as a triple $(h, r, t)$, representing that head (subject) entity $h$ and tail (object) entity $t$ hold relation $r$ between them. In recent years, KGs have successfully supported many web applications such as search engines [1], question-answering systems [2, 3, 4], recommender systems [5, 6, 4], etc.

In practice, different KGs are constructed based on diverse data sources and different extraction approaches, and a single KG can usually cover only a specific aspect of structural facts. For example, an English KG usually contains more facts about the English-speaking society, while a Chinese KG contains more facts about the Chinese-speaking society. Considering that more KGs together can provide more comprehensive structural facts from various aspects, researchers have proposed many methods (Cf. surveys[7, 8]) to fuse a pair of KGs into a unified one. The general process is to first identify equivalent entities between the candidate KGs, and then let them serve as the “bridge entities” to link the candidate KGs into a unified one.

Technically, existing mainstream methods typically solve this problem by embedding entities into a common latent representation space and minimizing the distance between pre-aligned entity pairs. This process is also referred to as entity alignment (EA). Depending on how the entity embeddings are obtained, these methods fall into two main categories: (1) Trans-based EA methods [9, 10, 11, 12, 13, 14] learn entity embeddings by letting each triple satisfy specific geometric properties in the embedding space. Most of these methods adopt TransE [15] as the translation module to preserve the property $h + r \approx t$. (2) GNN-based EA methods [16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29] adopt graph neural network (GNN) models such as the graph convolutional network (GCN) [30] to learn entity embeddings by iteratively aggregating the embeddings of neighbor entities.

[Figure 1: An illustrative example where four candidate KGs need to be aligned.]

However, all the existing EA methods are specifically designed for aligning only a pair of candidate KGs, which covers limited knowledge for downstream applications. To the best of our knowledge, none of the existing EA methods consider aligning multiple (more than two) KGs. For example, Figure 1 shows a case where four candidate KGs need to be aligned, which poses a more challenging research problem.

There are three obstacles that prevent traditional pair-wise KG alignment methods from being applied to the multiple KG alignment problem studied in this work. First, existing EA datasets are constructed specifically for pair-wise KG alignment; at present, there is no benchmark dataset that can support the study of aligning multiple KGs. Second, if we trivially adapt existing methods to this task, they need to be executed separately multiple times, since they can only utilize the local (pair-wise) alignment information in a single pass, which is inefficient. Last and most importantly, they cannot capture useful global (beyond pair-wise) alignment information, which affects the final alignment performance. For the example in Figure 1, when they align KG-en with KG-fr and align KG-en with KG-zh independently, the entities of the three KGs are projected into separate embedding spaces. Consequently, it is easy to yield inconsistent results like $e_1^{(en)} \equiv e_1^{(fr)}$, $e_1^{(en)} \equiv e_1^{(zh)}$, and $e_1^{(fr)} \equiv e_2^{(zh)}$ (wrong), violating the transitivity of the alignment relationship.

To fill the research gap in this field, in this work, we study the novel problem of aligning multiple KGs in a single pass. First of all, to facilitate the study of this problem, we construct two new benchmark datasets that contain multiple KGs to be aligned. Then, we formulate the research problem of aligning multiple KGs concurrently. Finally, we propose an effective framework named MultiEA to solve the problem. As illustrated in Figure 1, MultiEA concurrently aligns the four candidate KGs in a single pass, which is more practical. In addition, it embeds the entities of all the candidate KGs into a unified space and can thus capture useful global alignment information, which helps achieve better alignment performance. Recalling the example above, if MultiEA pulls $e_1^{(en)}$ and $e_1^{(fr)}$ closer and pulls $e_1^{(en)}$ and $e_1^{(zh)}$ closer, then $e_1^{(fr)}$ and $e_1^{(zh)}$ will also be pulled closer, leading to the consistent result: $e_1^{(en)} \equiv e_1^{(fr)}$, $e_1^{(en)} \equiv e_1^{(zh)}$, and $e_1^{(fr)} \equiv e_1^{(zh)}$ (right).

The main contributions of this work are summarized as follows.

  • To the best of our knowledge, in the EA research community, we are the first to study the problem of concurrently aligning multiple (more than two) KGs. We are also the first to construct the benchmark datasets and define the evaluation metric for this problem.

  • We propose an effective framework named MultiEA to solve the problem. Three alignment strategies are explored to capture the global alignment information. An innovative inference enhancement technique is proposed to significantly boost the alignment performance.

  • We conduct extensive experiments on the two constructed benchmark datasets. The results demonstrate that MultiEA can effectively and efficiently align multiple KGs in a single pass. We will release our source codes and datasets to facilitate further research on this problem.

2 Related Work

In this section, we comprehensively review existing methods that are related to ours.

Trans-based EA Methods. Early EA methods exploit KG embedding methods, e.g., TransE [15], to project a pair of candidate KGs into a common low-dimensional Euclidean space by letting the embeddings of head entity $h$, relation $r$, and tail entity $t$ satisfy the triangular relationship $h + r \approx t$. Given a set of pre-aligned entity pairs (i.e., seed alignment labels), they minimize the distance between these equivalent entity pairs in the embedding space, so that more potential equivalent entity pairs can be discovered after model optimization. Representative Trans-based EA methods include JEwP [9], JAPE [12], MTransE [11], IPTransE [10], BootEA [13], TransEdge [31], MultiKE [14], etc.

GNN-based EA Methods. Motivated by the extraordinary success of graph neural networks (GNNs) in extracting the structural features of graphs, GCN-Align [16] first introduces GCN [30] into EA and shows superior performance to Trans-based EA methods such as JEwP [9], MTransE [11], and JAPE [12]. Further, AVR-GCN [17] deftly combines the graph convolutional operator in GNNs and the translation operator in Trans-based methods to extract better entity embeddings. Afterward, a series of follow-up GNN-based EA methods, e.g., [18, 19, 20, 21, 22, 23, 26, 27, 29, 28, 24, 25, 32, 33, 34], are proposed. Recently, RREA [24] insightfully summarizes KG embedding-based EA methods and GNN-based EA methods into a unified framework consisting of two modules: shape-builder and alignment. The former constrains KGs into a specific distribution in the embedding space, and the latter minimizes the distance between the embeddings of pre-aligned entity pairs.

EA Enhancement. In recent years, several works further enhanced the EA performance by exploiting attribute information and entity name information. HMAN[18], BERT-INT[35], and ACK-MMEA[36] can leverage the rich attributes of entities. Besides, PSR[25], GMNN[20], NMN[22], HGCN-JE[19], RDGCN[37], LightEA[38] and SelfKG[39] can use the word embeddings of the entity names to effectively initialize entity embeddings, greatly improving the performance of EA.

Alignment Methods in Other Fields. In the social network analysis field, several methods have made efforts to align multiple social networks. COSNET [40] leverages multiple networks to enhance the alignment performance of two networks; it requires exponential time complexity to build a matching graph, which is impractical for many scenarios. ULink [41] learns a latent user space based on user attributes; it cannot exploit structural information and thus is not a strict social network alignment approach. MC2 [42] combines both structural and attribute information to infer a common base by matrix factorization; it requires all social networks to have attributes, and its computational complexity is quadratic in the number of users. MASTER [43] embeds multiple social networks in a common latent space through collaborative matrix factorization, which also requires quadratic computational complexity. In the data integration field, a recent work [44] aligns multiple tables based on Sentence-BERT [45], table-wise hierarchical merging, and density-based pruning. These methods are specially designed for other fields and generally require sophisticated optimization.

In summary, different from existing methods, in this work, we study the problem of aligning multiple (more than two) KGs based on the structural information. Accordingly, we develop a GNN-based neural network framework named MultiEA to solve the problem in an end-to-end optimization manner.

3 Preliminaries

In this section, we first give the formal definition of knowledge graphs. Then, we formulate the novel problem of aligning multiple KGs.

Knowledge Graph (KG). A knowledge graph is defined as $\mathcal{G} = (\mathcal{E}, \mathcal{R}, \mathcal{T})$, where $\mathcal{E}$ is the set of entities (nodes) and $\mathcal{R}$ is the set of relations (edges). $\mathcal{T} \subseteq \mathcal{E} \times \mathcal{R} \times \mathcal{E}$ is the set of triples, and each triple $t \in \mathcal{T}$ is represented as $<e_i, r_k, e_j>$, which means that head entity $e_i \in \mathcal{E}$ and tail entity $e_j \in \mathcal{E}$ hold relation $r_k \in \mathcal{R}$ between them. We use $\mathbf{h}_i \in \mathbb{R}^d$ to denote the representation of entity $e_i$ in one model layer, where $d$ is the dimensionality of embeddings. We use $\mathbf{g}_k \in \mathbb{R}^d$ to denote the embedding of relation $r_k$, which is a randomly initialized learnable parameter vector.

Multiple KG Alignment. The input is a set of KGs $\{\mathcal{G}^{(1)}, \mathcal{G}^{(2)}, ..., \mathcal{G}^{(m)}, ..., \mathcal{G}^{(M)}\}$, where $M$ is the number of input KGs and $M > 2$. The $m$-th KG is denoted by $\mathcal{G}^{(m)} = (\mathcal{E}^{(m)}, \mathcal{R}^{(m)}, \mathcal{T}^{(m)})$. The problem is to identify equivalent entities that refer to the same real-world thing across all the $M$ KGs, based on a set of seed alignment labels (i.e., pre-aligned entities), denoted as $\mathcal{S} = \{l_1, l_2, ..., l_n, ..., l_N\}$, where $N$ is the number of labels. The $n$-th label is denoted as $l_n = (e_n^{(1)}, e_n^{(2)}, ..., e_n^{(m)}, ..., e_n^{(M)})$, where entity $e_n^{(m)} \in \mathcal{E}^{(m)}$, and all the $M$ entities associated with $l_n$ are equivalent.
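To make this setup concrete, the following minimal sketch shows one possible in-memory representation of the input; the class and field names are illustrative assumptions, not part of our released implementation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class KG:
    """A knowledge graph G = (E, R, T) with integer ids for entities and relations."""
    num_entities: int
    num_relations: int
    triples: List[Tuple[int, int, int]] = field(default_factory=list)  # (head, relation, tail)

# A seed alignment label l_n lists one equivalent entity id per KG, in KG order:
# l_n = (e_n^(1), e_n^(2), ..., e_n^(M)).
SeedLabel = Tuple[int, ...]

# Toy example with M = 3 KGs and a single label stating that entity 0 in KG 1,
# entity 4 in KG 2, and entity 2 in KG 3 denote the same real-world object.
kgs = [KG(5, 2, [(0, 0, 1)]), KG(6, 3, [(4, 1, 5)]), KG(4, 1, [(2, 0, 3)])]
seed_labels: List[SeedLabel] = [(0, 4, 2)]
```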

As shown in Figure 1, KG-en, KG-zh, KG-fr, and KG-ja are four candidate KGs that need to be aligned, where the four entities marked with red dashed lines refer to the same object, i.e., "Sweden". We treat these four equivalent entities as one of the seed alignment labels, to help our EA algorithm find more potential equivalent entities. Finally, we can link (merge) the four KGs based on the detected equivalent entities across them.

4 Framework

In this section, we describe our proposed MultiEA framework in detail. Firstly, we design a GNN-like KG encoder to embed the entities of all the candidate KGs into a common vector space. Then, we propose three alignment strategies to measure the distances among pre-aligned entities, which guide the training of EA models. Finally, we develop an innovative inference enhancement technique to boost the alignment performance by incorporating high-order similarities.

[Figure 2: Illustration of the three alignment strategies: (a) moving toward mean, (b) moving toward anchor, and (c) moving toward each other.]

4.1 KG Encoding

Given a set of KGs $\{\mathcal{G}^{(1)}, \mathcal{G}^{(2)}, ..., \mathcal{G}^{(m)}, ..., \mathcal{G}^{(M)}\}$, we use a shared KG-oriented GNN encoder to compute their entity embeddings. In the following, we take the $m$-th KG, i.e., $\mathcal{G}^{(m)} = (\mathcal{E}^{(m)}, \mathcal{R}^{(m)}, \mathcal{T}^{(m)})$, as an example to elaborate on the details of our encoder. The superscript $m$ is omitted for the sake of notational simplicity.

Firstly, we augment the original KG data by adding a virtual self-relation $r_{self}$ into the relation set $\mathcal{R}$, formally described as follows:

$$\mathcal{R} = \mathcal{R} \cup \{r_{self}\}. \tag{1}$$

Then, for each entity $e_i$, we add a virtual triple describing that $e_i$ holds the self-relation $r_{self}$ with itself, which is added to the triple set as follows:

$$\mathcal{T} = \mathcal{T} \cup \{<e_i, r_{self}, e_i> \,|\, e_i \in \mathcal{E}\}. \tag{2}$$

To facilitate the aggregation of the GNN encoder, for entity $e_i$, we define the set of its neighbor relation-entity tuples as follows:

$$\mathcal{N}_i = \{<r_k, e_j> \,|\, <e_i, r_k, e_j> \in \mathcal{T} \ \text{or} \ <e_j, r_k, e_i> \in \mathcal{T}\}. \tag{3}$$

Although the relations in KGs are usually directed, following most previous works, we treat them as bi-directed (or undirected). This is reasonable because the inverse of each relation is usually intuitive as well. For instance, the triple <Obama, wife, Michelle> can also be rewritten as <Michelle, husband, Obama>. Note that due to the addition of the self-relation, entity $e_i$ has a special neighbor relation-entity tuple $<r_{self}, e_i> \in \mathcal{N}_i$.
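The construction of these neighbor tuples is straightforward; below is a minimal sketch under the toy data structures above (the helper name and the way inverse edges are handled are our illustrative assumptions).

```python
from collections import defaultdict

def build_neighbor_tuples(num_entities, triples, self_rel_id):
    """Build N_i = {<r_k, e_j>} for every entity e_i (Eqs. (1)-(3)).

    Relations are treated as bi-directed, and a virtual self-relation
    r_self connects every entity to itself.
    """
    neighbors = defaultdict(list)
    for h, r, t in triples:
        neighbors[h].append((r, t))  # <e_i, r_k, e_j> in T
        neighbors[t].append((r, h))  # the inverse direction
    for e in range(num_entities):
        neighbors[e].append((self_rel_id, e))  # <r_self, e_i>
    return neighbors

# Usage: reserve a fresh relation id for r_self, e.g. self_rel_id = kg.num_relations.
neighbors = build_neighbor_tuples(kgs[0].num_entities, kgs[0].triples, kgs[0].num_relations)
```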

In a layer of our KG encoder, the output representation of entity $e_i$ is computed as follows:

$$\mathbf{h}_i' = \sigma\Big(\sum_{<r_k, e_j> \in \mathcal{N}_i} \alpha_{i,k,j} \cdot \mathbf{W}_k \cdot \mathbf{h}_j\Big), \tag{4}$$

where $\mathbf{h}_j$ is the feature vector of neighbor entity $e_j$ output by the previous layer, $\mathbf{W}_k$ is a relation-specific projection matrix, $\alpha_{i,k,j}$ is the attention coefficient for aggregation, and $\sigma$ is a non-linear activation function.

There are often thousands of relations in common KGs, and thus the relation-specific projection matrices $\mathbf{W}_k, \forall r_k \in \mathcal{R}$, may introduce too many trainable parameters, risking overfitting. Therefore, following previous studies [24, 25], we leverage the relation embedding vector $\mathbf{g}_k$ to define its related projection matrix, as follows:

$$\mathbf{W}_k = \mathbf{I} - 2 \cdot \mathbf{g}_k \cdot \mathbf{g}_k^T. \tag{5}$$

In this way, we no longer need to introduce additional parameters. In particular, the normalization constraint $\|\mathbf{g}_k\|_2 = 1$ is imposed, so that the orthogonality of $\mathbf{W}_k$ is naturally guaranteed, since we can easily derive:

$$\mathbf{W}_k^T \cdot \mathbf{W}_k = (\mathbf{I} - 2 \cdot \mathbf{g}_k \cdot \mathbf{g}_k^T)^T (\mathbf{I} - 2 \cdot \mathbf{g}_k \cdot \mathbf{g}_k^T) = \mathbf{I}. \tag{6}$$

The orthogonality of $\mathbf{W}_k$ has been shown to be very beneficial for the EA task [24, 25].
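Eq. (5) is a Householder-style reflection, so the identity in Eq. (6) can be checked numerically; a minimal sketch (PyTorch is used here purely for illustration):

```python
import torch

d = 16
g = torch.randn(d)
g = g / g.norm()                           # enforce ||g_k||_2 = 1
W = torch.eye(d) - 2 * torch.outer(g, g)   # Eq. (5): W_k = I - 2 * g_k * g_k^T

# Eq. (6): W_k^T W_k should equal the identity up to floating-point error.
print(torch.allclose(W.T @ W, torch.eye(d), atol=1e-5))  # True
```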

The attention coefficient is computed as follows. Given a neighbor tuple $<r_k, e_j> \in \mathcal{N}_i$ of entity $e_i$, we first compute the proximity among the head entity $e_i$, the relation $r_k$, and the tail entity $e_j$:

$$\beta_{i,k,j} = \sigma\big(\mathbf{a}_h^T \cdot \mathbf{h}_i + \mathbf{a}_r^T \cdot \mathbf{g}_k + \mathbf{a}_t^T \cdot \mathbf{W}_k \cdot \mathbf{h}_j\big), \tag{7}$$

where $\mathbf{a}_h$, $\mathbf{a}_r$, and $\mathbf{a}_t$ are learnable parameter vectors for head entities, relations, and tail entities, respectively. Then, we compute the attention coefficient $\alpha_{i,k,j}$ by normalizing $\beta_{i,k,j}$ across all the elements of $\mathcal{N}_i$:

$$\alpha_{i,k,j} = \frac{\exp(\beta_{i,k,j})}{\sum_{<r_x, e_y> \in \mathcal{N}_i} \exp(\beta_{i,x,y})}. \tag{8}$$

Note that by adding the self-relation $r_{self}$, each entity $e_i$ acts as one of its own neighbors (cf. Eqs. (1, 2)). In addition, we use the attention mechanism to aggregate all the neighbors (cf. Eqs. (4, 8)). Thus, as proven by [46], our encoder is able to automatically learn the importance of arbitrary hops of neighborhood information. This is more flexible than [24, 25], which directly concatenate all hops of neighborhood information.

Recall that the encoder is shared among all the KGs, and thus we can obtain the embeddings of the entities from every KG.
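For concreteness, the following is a deliberately simple, loop-based sketch of one encoder layer implementing Eqs. (4)-(8); an efficient implementation would batch these operations, the tanh activation stands in for the unspecified $\sigma$, and all function and variable names are illustrative.

```python
import torch

def encoder_layer(h, g, a_h, a_r, a_t, neighbors):
    """One attention layer over relation-entity neighbor tuples (Eqs. (4)-(8)).

    h: [num_entities, d] entity features from the previous layer
    g: [num_relations + 1, d] relation embeddings (unit-norm rows, incl. r_self)
    a_h, a_r, a_t: [d] attention parameter vectors
    neighbors: dict mapping e_i -> list of (r_k, e_j) tuples (incl. <r_self, e_i>)
    """
    d = h.size(1)
    eye = torch.eye(d)
    rows = []
    for i in range(h.size(0)):
        betas, msgs = [], []
        for k, j in neighbors[i]:
            W_k = eye - 2 * torch.outer(g[k], g[k])                        # Eq. (5)
            msg = W_k @ h[j]                                               # W_k h_j
            betas.append(torch.tanh(a_h @ h[i] + a_r @ g[k] + a_t @ msg))  # Eq. (7)
            msgs.append(msg)
        alpha = torch.softmax(torch.stack(betas), dim=0)                   # Eq. (8)
        rows.append(torch.tanh((alpha.unsqueeze(1) * torch.stack(msgs)).sum(0)))  # Eq. (4)
    return torch.stack(rows)
```

Because the same relation embeddings and attention vectors are reused for every KG, this single layer already realizes the parameter sharing described above.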

4.2 Training for Alignment

In order to pull equivalent entities together in a unified embedding space, we adopt the following margin ranking loss as the training loss of our MultiEA model:

$$L = \sum_{l \in \mathcal{S}} \sum_{l' \in \mathcal{S}'} \max\big(d(l) - d(l') + \lambda, \, 0\big), \tag{9}$$

where $\mathcal{S}$ is the set of positive examples, i.e., ground-truth alignment labels, $\mathcal{S}'$ is the set of negative examples, i.e., randomly generated false alignment labels, $d(\cdot)$ is a distance function, and $\lambda > 0$ is a margin hyperparameter for separating positive and negative examples.

The negative examples in $\mathcal{S}'$ are generated as follows. For each positive example $l = (e^{(1)}, e^{(2)}, ..., e^{(m)}, ..., e^{(M)}) \in \mathcal{S}$, we fix one entity and replace the other entities by randomly sampling from their corresponding KGs, formally described as follows:

$$\eta \times \left\{ \begin{array}{l} (e^{(1)}, \tilde{e}^{(2)}, ..., \tilde{e}^{(m)}, ..., \tilde{e}^{(M)}), \\ (\tilde{e}^{(1)}, e^{(2)}, ..., \tilde{e}^{(m)}, ..., \tilde{e}^{(M)}), \\ ..., \\ (\tilde{e}^{(1)}, \tilde{e}^{(2)}, ..., e^{(m)}, ..., \tilde{e}^{(M)}), \\ ..., \\ (\tilde{e}^{(1)}, \tilde{e}^{(2)}, ..., \tilde{e}^{(m)}, ..., e^{(M)}) \end{array} \right\}, \tag{10}$$

where $\tilde{e}^{(m)}$ is an entity randomly sampled from $\mathcal{E}^{(m)}$ to replace the original entity $e^{(m)}$. Here, $\eta$ is an integer, indicating that for each positive example, we generate $\eta$ groups of negative examples in this way. Thus, each positive example corresponds to $\eta \times M$ negative examples.
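A minimal sketch of this corruption scheme, reusing the toy KG structures from Section 3 (the helper name is an illustrative assumption):

```python
import random

def sample_negatives(label, kgs, eta):
    """Corrupt a positive label (e^(1), ..., e^(M)) into eta * M negatives (Eq. (10)).

    In each of the eta groups, the m-th negative keeps e^(m) fixed and replaces
    every other entity with a random entity drawn from its own KG.
    """
    M = len(label)
    negatives = []
    for _ in range(eta):
        for kept in range(M):
            negatives.append(tuple(
                label[m] if m == kept else random.randrange(kgs[m].num_entities)
                for m in range(M)
            ))
    return negatives

# Example: 2 groups of negatives for one positive label over M = 3 KGs.
print(sample_negatives((0, 4, 2), kgs, eta=2))  # 6 corrupted tuples
```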

In this work, we propose three strategies for minimizing the distance among equivalent entities; the ideas are intuitively shown in Figure 2. For a positive example $l = (e^{(1)}, e^{(2)}, ..., e^{(m)}, ..., e^{(M)}) \in \mathcal{S}$, we first obtain its associated entity embeddings from the KG encoder and denote them as $(\mathbf{h}^{(1)}, \mathbf{h}^{(2)}, ..., \mathbf{h}^{(m)}, ..., \mathbf{h}^{(M)})$. Then, the distance $d(l)$ is defined by one of the following three strategies.

(1) Moving toward Mean. This strategy lets all the equivalent entities approach their mean, as illustrated in Figure 2(a). We first compute the mean as follows:

$$\mathbf{u} = \frac{1}{M} \cdot \sum_{m=1}^{M} \mathbf{h}^{(m)}. \tag{11}$$

Then, we sum the Euclidean distances between each entity embedding and the mean:

$$d(l) = \sum_{m=1}^{M} \|\mathbf{h}^{(m)} - \mathbf{u}\|_2. \tag{12}$$

(2) Moving toward Anchor. In practice, the candidate KGs are usually unbalanced due to various realistic factors. For instance, on our constructed multi-lingual KG dataset DBP-4, the English KG contains many more triples than the other three KGs (cf. Table I). Therefore, as shown in Figure 2(b), we designate the entities from the English KG (DBP-4: en) as anchor entities and let the equivalent entities from the other KGs approach these anchor entities. This idea is formally described as follows:

$$d(l) = \sum_{m=1}^{M} \|\mathbf{h}^{(m)} - \mathbf{h}^{(a)}\|_2, \tag{13}$$

where $\mathbf{h}^{(a)}$ is the embedding of the anchor entity. (Note that when $\mathbf{h}^{(m)}$ itself is designated as the anchor, $\|\mathbf{h}^{(m)} - \mathbf{h}^{(a)}\|_2 = 0$.)

(3) Moving toward Each Other. This strategy lets equivalent entities approach each other. As illustrated in Figure 2(c), we minimize the distances between all possible pairs of equivalent entities, formally described as follows:

$$d(l) = \sum_{m_1 \neq m_2} \|\mathbf{h}^{(m_1)} - \mathbf{h}^{(m_2)}\|_2, \tag{14}$$

where $m_1$ and $m_2$ range over the set $\{1, 2, ..., M\}$.
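The three distance strategies and the margin ranking loss of Eq. (9) can be written compactly; the sketch below assumes the embeddings of one label are stacked into an [M, d] tensor, and the function names are ours for illustration.

```python
import torch

def label_distance(embs, strategy="mean", anchor=0):
    """Distance d(l) for one label; embs is an [M, d] tensor of the aligned embeddings."""
    if strategy == "mean":                               # Eqs. (11)-(12)
        u = embs.mean(dim=0)
        return (embs - u).norm(dim=1).sum()
    if strategy == "anchor":                             # Eq. (13)
        return (embs - embs[anchor]).norm(dim=1).sum()
    # "each other", Eq. (14): all ordered pairs m1 != m2 (the diagonal contributes 0)
    diff = embs.unsqueeze(0) - embs.unsqueeze(1)         # [M, M, d]
    return diff.norm(dim=2).sum()

def margin_loss(pos_dists, neg_dists, margin):
    """Eq. (9): pos_dists has shape [P]; neg_dists has shape [P, K] (K negatives per positive)."""
    return torch.clamp(pos_dists.unsqueeze(1) - neg_dists + margin, min=0).sum()
```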

Input: $M$ KGs $\{\mathcal{G}^{(1)}, \mathcal{G}^{(2)}, ..., \mathcal{G}^{(m)}, ..., \mathcal{G}^{(M)}\}$, seed alignment labels $\mathcal{S}$.

1. Randomly initialize model parameters;
2. Select one distance function from Eqs. (11, 12), Eq. (13), or Eq. (14);
3. Augment the KG data by Eqs. (1, 2);
4. Get neighbor relation-entity tuples by Eq. (3);
5. while not converged do
6.   Compute relation-specific projection matrices by Eq. (5);
7.   Compute attention coefficients by Eqs. (7, 8);
8.   Perform multiple layers of aggregation by Eq. (4);
9.   Randomly sample negative examples by Eq. (10);
10.  Compute the loss by Eq. (9);
11.  Update model parameters by gradient descent;
12. end while
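Putting the pieces together, one training iteration could look like the sketch below; it reuses the illustrative helpers from the previous snippets, assumes the triples of all KGs have been merged into one globally indexed neighbor dictionary backed by a single embedding matrix, and all hyperparameter values are placeholders.

```python
import torch

def train_step(entity_emb, rel_emb, attn_params, neighbors, labels, kgs,
               optimizer, eta=5, margin=1.0, strategy="mean"):
    """One gradient step: encode all KGs, sample negatives, apply Eq. (9), update."""
    # Entities of all KGs share one embedding matrix; per-KG local ids are shifted
    # by these offsets to index it.
    offsets = [0]
    for kg in kgs[:-1]:
        offsets.append(offsets[-1] + kg.num_entities)
    to_global = lambda tup: [offsets[m] + e for m, e in enumerate(tup)]

    optimizer.zero_grad()
    g = torch.nn.functional.normalize(rel_emb, dim=1)          # keep ||g_k||_2 = 1
    h = encoder_layer(entity_emb, g, *attn_params, neighbors)  # one layer for brevity

    pos, neg = [], []
    for label in labels:                                       # label holds per-KG local ids
        pos.append(label_distance(h[to_global(label)], strategy))
        neg.append(torch.stack([
            label_distance(h[to_global(n)], strategy)
            for n in sample_negatives(label, kgs, eta)
        ]))
    loss = margin_loss(torch.stack(pos), torch.stack(neg), margin)  # Eq. (9)
    loss.backward()
    optimizer.step()
    return loss.item()
```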

4.3 Training Time Complexity

Let $|\mathcal{E}|$, $|\mathcal{R}|$, and $|\mathcal{T}|$ denote the total numbers of entities, relations, and triples of all the $M$ candidate KGs, respectively. Let $d$ denote the dimensionality of entity embeddings and relation embeddings. Let $|\mathcal{S}|$ denote the number of alignment labels. Recall that each label corresponds to $\eta \times M$ negative examples.

In the KG encoding phase (cf. Section 4.1), the time complexity involves three main computational operations. First, the aggregation operation described by Eq. (4) has time complexity $\mathcal{O}(|\mathcal{E}| \cdot d^2)$. Second, the computation of relation-specific projection matrices described by Eq. (5) has time complexity $\mathcal{O}(|\mathcal{R}| \cdot d^2)$. Finally, as described by Eqs. (7, 8), the computation of attention coefficients has time complexity $\mathcal{O}(|\mathcal{T}| \cdot (d + d + d^2))$. Considering that in practice the embedding dimensionality $d$ is a relatively small constant, the time complexity of KG encoding is $\mathcal{O}(|\mathcal{E}| + |\mathcal{R}| + |\mathcal{T}|)$.

In the alignment phase (cf. Section 4.2), as described by Eq. (12) and Eq. (13), both the mean strategy and the anchor strategy require time complexity $\mathcal{O}(M)$ per example, while the each-other strategy of Eq. (14) requires $\mathcal{O}(M^2)$. Taking the numbers of labels and negative examples into consideration (cf. Eq. (9)), the mean strategy and the anchor strategy have time complexity $\mathcal{O}(|\mathcal{S}| \cdot \eta \cdot M^2)$, and the each-other strategy has time complexity $\mathcal{O}(|\mathcal{S}| \cdot \eta \cdot M^3)$.

In practice, $|\mathcal{S}|$, $\eta$, and $M$ are much smaller than $|\mathcal{E}|$, $|\mathcal{R}|$, and $|\mathcal{T}|$. Therefore, the overall training time complexity of MultiEA is $\mathcal{O}(|\mathcal{E}| + |\mathcal{R}| + |\mathcal{T}|)$.

4.4 Inference Enhancement

After training the model, we obtain the embeddings of the entities from all the KGs. Based on these embeddings, we can compute the similarity between any two entities from different KGs. For example, given entity $e_i^{(m_1)} \in \mathcal{E}^{(m_1)}$ and entity $e_j^{(m_2)} \in \mathcal{E}^{(m_2)}$, we compute the similarity between them as follows (since entity embeddings are normalized, i.e., $\|\mathbf{h}_i\|_2 = 1, \forall i$, the computed similarities lie in the range $[0, 1]$, so high-order similarities can be composed by matrix products later):

$$\mathbf{S}_{i,j}^{m_1-m_2} = 1 - \frac{\|\mathbf{h}_i^{(m_1)} - \mathbf{h}_j^{(m_2)}\|_2}{2}. \tag{15}$$

Thus, we can obtain a matrix $\mathbf{S}^{m_1-m_2}$ describing the entity similarities between $\mathcal{G}^{(m_1)}$ and $\mathcal{G}^{(m_2)}$. However, Eq. (15) only captures the first-order similarity, which may not be sufficient since higher-order similarity information is ignored. Therefore, in this work, we propose to further enhance the similarity matrix by incorporating higher-order similarities, which can be composed by matrix products. For example, if there are three KGs, $\mathcal{G}^{(m_1)}$, $\mathcal{G}^{(m_2)}$, and $\mathcal{G}^{(m_3)}$, we can enhance the similarity matrix $\mathbf{S}^{m_1-m_2}$ as follows:

$$\widetilde{\mathbf{S}}^{m_1-m_2} = \gamma_1 \cdot \mathbf{S}^{m_1-m_2} + \gamma_2 \cdot \mathbf{S}^{m_1-m_3} \cdot \mathbf{S}^{m_3-m_2}, \tag{16}$$

where $\gamma_1$ and $\gamma_2$ are hyperparameters that balance the two terms. They are real numbers in the range $[0, 1]$ and satisfy $\gamma_1 + \gamma_2 = 1$. Similarly, if there are four KGs, we can enhance $\mathbf{S}^{m_1-m_2}$ as follows:

$$\widetilde{\mathbf{S}}^{m_1-m_2} = \gamma_1 \cdot \mathbf{S}^{m_1-m_2} + \gamma_2 \cdot \mathbf{S}^{m_1-m_3} \cdot \mathbf{S}^{m_3-m_2} + \gamma_3 \cdot \mathbf{S}^{m_1-m_4} \cdot \mathbf{S}^{m_4-m_2}, \tag{17}$$

where $\gamma_1$, $\gamma_2$, and $\gamma_3$ are hyperparameters in the range $[0, 1]$ that satisfy $\gamma_1 + \gamma_2 + \gamma_3 = 1$.

Here, we only incorporate the second-order similarities composed by a single matrix product. In practice, higher-order similarities can be incorporated similarly through more matrix products. Finally, we identify equivalent entities according to the enhanced similarity matrix. In this way, the model can utilize more comprehensive information, which helps improve the inference performance.
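A minimal sketch of this inference-enhancement step (Eqs. (15)-(17)); the dictionary-based bookkeeping of pairwise matrices is our illustrative choice, not a fixed interface.

```python
import torch

def similarity_matrix(H1, H2):
    """First-order similarity (Eq. (15)) between two row-normalized embedding matrices."""
    return 1.0 - torch.cdist(H1, H2, p=2) / 2.0

def enhanced_similarity(S, m1, m2, num_kgs, gammas):
    """High-order enhancement (Eqs. (16)-(17)).

    S: dict keyed by (a, b) holding the first-order matrix between KG a and KG b.
    gammas: weights summing to 1; gammas[0] is for the first-order term, the rest
            weight the second-order paths routed through each intermediate KG.
    """
    intermediates = [m for m in range(num_kgs) if m not in (m1, m2)]
    out = gammas[0] * S[(m1, m2)]
    for gamma, mid in zip(gammas[1:], intermediates):
        out = out + gamma * (S[(m1, mid)] @ S[(mid, m2)])
    return out
```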

5 Experiment

In this section, we construct the benchmark datasets and define the evaluation metric for the new task. Then, we conduct extensive experiments to verify the effectiveness of MultiEA.

5.1 Benchmark Datasets

As far as we know, there is no existing EA benchmark dataset that can support the task of aligning multiple KGs. To this end, we construct two new benchmark datasets based on previously widely used real-world datasets [12, 13]. One dataset, DBP-4, contains four candidate KGs to be aligned, and the other, DWY-3, contains three. The key statistics of the constructed datasets are listed in Table I.

TABLE I: Statistics of the constructed benchmark datasets.

Datasets  KGs    $|\mathcal{E}|$   $|\mathcal{R}|$   $|\mathcal{T}|$   $|\mathcal{S}|$
DBP-4     en     8901     1034    77483    2539
          fr     3545      774    15843
          ja     4326      519    32427
          zh     3893      619    21497
DWY-3     dbp    23784     246    94985    20729
          wiki   22839     153    92019
          yago   22063      30    77457

DBP-4. This dataset contains four KGs from different language sources: an English (en) KG, a Chinese (zh) KG, a Japanese (ja) KG, and a French (fr) KG. It is constructed based on the DBP15K dataset, which was originally released by [12]. The original alignment labels describe the correspondence from the English KG to each of the other three KGs, i.e., en-fr, en-zh, and en-ja. We treat the English KG as an intermediary and complete the alignment information among all four KGs, i.e., en-fr-zh-ja. Specifically, we filter entities from different KGs that correspond to the same English entity. Intuitively, if we have $e^{(en)} = e^{(fr)}$, $e^{(en)} = e^{(zh)}$, and $e^{(en)} = e^{(ja)}$, then we can infer $e^{(en)} = e^{(fr)} = e^{(zh)} = e^{(ja)}$, which means that all four entities from the different KGs are equivalent.

DWY-3. This dataset contains three KGs, i.e., DBpedia (dbp), Wikidata (wiki), and YAGO (yago). It is constructed based on the DWY100K dataset, which was originally released by [13]. Its original alignment labels describe the correspondence from DBpedia to the other two KGs, i.e., dbp-wiki and dbp-yago. Similar to the construction of DBP-4, we treat the DBpedia KG as an intermediary to complete the alignment information among all three KGs, i.e., dbp-wiki-yago.

It is worth noting that the condition under which we construct labels is very strict. As a result, the number of labels is small compared to the number of entities and triples. To address this issue, we further induce smaller KGs. Specifically, inspired by previous work [12], for each entity involved in the labels, we select its popular neighbor entities whose degrees exceed a threshold (set to 15 in this work). The target entity and its high-degree neighbor entities are added to a new entity set, and the involved relations are added to a new relation set. Then, we induce a new set of triples based on the selected entities and relations.
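As an illustration only (the function and variable names below are our assumptions, not the released construction scripts), the induction step can be sketched as follows:

```python
from collections import Counter

def induce_subgraph(triples, labeled_entities, degree_threshold=15):
    """Sketch of the subgraph induction described above.
    triples: list of (h, r, t) tuples; labeled_entities: set of entity ids."""
    # Entity degree = number of triples the entity participates in.
    degree = Counter()
    for h, _, t in triples:
        degree[h] += 1
        degree[t] += 1

    # Keep each labeled entity plus its high-degree neighbors.
    kept_entities = set(labeled_entities)
    for h, _, t in triples:
        if h in labeled_entities and degree[t] > degree_threshold:
            kept_entities.add(t)
        if t in labeled_entities and degree[h] > degree_threshold:
            kept_entities.add(h)

    # Induce the new triple and relation sets over the kept entities.
    kept_triples = [(h, r, t) for h, r, t in triples
                    if h in kept_entities and t in kept_entities]
    kept_relations = {r for _, r, _ in kept_triples}
    return kept_entities, kept_relations, kept_triples
```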

TABLE II: Multiple KG alignment performance (in %) on DBP-4 and DWY-3.

Methods               | DBP-4                           | DWY-3
                      | M-Hits@1  M-Hits@10  M-Hits@20  | M-Hits@1  M-Hits@10  M-Hits@20
MTransE               | 3.40      19.72      32.64      | 25.83     55.94      65.31
IPTransE              | 4.11      23.36      39.55      | 28.74     60.02      68.84
GCN-Align             | 4.48      27.62      44.94      | 32.46     63.18      73.91
KECG                  | 5.62      33.35      49.61      | 34.08     67.51      77.52
NAEA                  | 6.14      34.13      48.07      | 35.17     69.88      79.43
PSR                   | 7.84      38.17      50.97      | 37.55     71.85      80.13
MultiEA+mean-infer    | 5.95      40.48      51.02      | 36.68     69.33      79.65
MultiEA+anchor-infer  | 6.32      40.33      52.08      | 37.54     70.21      79.19
MultiEA+each-infer    | 6.17      41.70      52.0       | 36.76     71.52      79.18
MultiEA+mean+infer    | 10.03     48.18      59.89      | 40.54     73.90      81.24
MultiEA+anchor+infer  | 10.58     48.80      59.95      | 42.33     74.13      83.54
MultiEA+each+infer    | 10.25     50.38      61.75      | 41.19     75.44      85.42

5.2 Evaluation Metric

Traditional pair-wise KG alignment methods generally use $Hits@K$ to evaluate the alignment performance. Specifically, given two KGs $\mathcal{G}^{(l)}$ and $\mathcal{G}^{(r)}$, we first consider aligning $\mathcal{G}^{(r)}$ to $\mathcal{G}^{(l)}$, and define a temporary metric $l\_Hits@K$ that measures the proportion of entities in $\mathcal{G}^{(l)}$ whose counterparts in $\mathcal{G}^{(r)}$ rank in the top-$K$:

l\_Hits@K = \frac{\sum_{n=1}^{N}\mathbb{I}(e_{n}^{(r)}\ \text{ranks in top-}K)}{N},   (18)

where $\mathbb{I}(\cdot)$ is the indicator function that outputs 1 if the input fact is true and 0 otherwise. Then, we consider aligning $\mathcal{G}^{(l)}$ to $\mathcal{G}^{(r)}$, and similarly define a temporary metric $r\_Hits@K$ in the opposite direction. The final metric $Hits@K$ is computed as the average of the two temporary metrics:

Hits@K = \frac{1}{2}\cdot(l\_Hits@K + r\_Hits@K).   (19)

Unfortunately, $Hits@K$ does not work for our task of multiple KG alignment. Hence, in the spirit of $Hits@K$, we define a novel evaluation metric. Given $M$ KGs $\{\mathcal{G}^{(1)},\mathcal{G}^{(2)},\ldots,\mathcal{G}^{(m)},\ldots,\mathcal{G}^{(M)}\}$ where $M>2$, we treat each KG $\mathcal{G}^{(m)}$ as the target KG and consider aligning the other $M-1$ KGs to it. First, we count the number of entities in $\mathcal{G}^{(m)}$ whose $M-1$ counterparts from the other $M-1$ KGs all rank in the top-$K$:

C = \sum_{n=1}^{N}\widetilde{\mathbb{I}}(e_{n}^{(1)},\ldots,e_{n}^{(m-1)},e_{n}^{(m+1)},\ldots,e_{n}^{(M)}\ \text{rank in top-}K),   (20)

where $\widetilde{\mathbb{I}}(\cdot)$ is an indicator function that outputs 1 if all the input entities rank in the top-$K$, and 0 otherwise. Then, a temporary metric $m\_Hits@K$ is defined as the corresponding proportion:

m\_Hits@K = \frac{C}{N}.   (21)

The final metric is computed as the average of the $M$ temporary metrics:

M\text{-}Hits@K = \frac{1}{M}\cdot\sum_{m=1}^{M}m\_Hits@K.   (22)

Obviously, our newly defined metric $M\text{-}Hits@K$ is more challenging than the traditional metric $Hits@K$, since the condition for counting an entity as correct is much stricter. It can therefore better reflect a model's ability to align multiple KGs.
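For clarity, a minimal sketch of $M\text{-}Hits@K$ under an assumed input layout (one rank matrix per target KG; this layout is our choice, not prescribed by the paper) is:

```python
import numpy as np

def m_hits_at_k(rank_lists, k=10):
    """rank_lists[m] is an [N x (M-1)] array: entry (n, j) is the rank of the
    n-th test entity's counterpart in the j-th non-target KG when KG m is
    treated as the target KG."""
    per_target = []
    for ranks in rank_lists:                     # one pass per target KG
        all_in_top_k = (ranks <= k).all(axis=1)  # all M-1 counterparts in top-K
        per_target.append(all_in_top_k.mean())   # m_Hits@K = C / N
    return float(np.mean(per_target))            # average over the M target KGs
```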

5.3 Baselines and Variants

For baselines, we select six representative supervised EA methods that only leverage the structural information of the candidate KGs: two Trans-based EA methods (1)-(2) and four GNN-based EA methods (3)-(6). They are listed as follows:

(1) MTransE [11] is a Trans-based EA method that uses TransE [15] to encode the entities of each KG in a separate embedding space;

(2) IPTransE [10] further uses PTransE [47] to encode the KGs and proposes an iterative alignment strategy;

(3) GCN-Align [16] is the first GNN-based EA method, which uses GCN [30] to encode the two KGs;

(4) NAEA [29] further uses GAT [48] to encode the structural information of the KGs; it therefore learns entity embeddings by aggregating neighbor entities with different importance;

(5) KECG [28] also uses GAT [48] to encode the KGs, and additionally restricts the projection matrix to be diagonal, reducing the number of parameters and computations;

(6) PSR [25] encodes the KGs with an orthogonal projection matrix and attention aggregation like ours. Differently, (1) it does not add self-connections, and (2) it directly concatenates all hops of neighborhood information.

Recall that in Section 4.2 we introduced three different alignment strategies; here, we use +mean, +anchor, and +each to denote them, respectively. In Section 4.4 we proposed the inference enhancement module; here, we use +infer or -infer to indicate whether the module is equipped. Considering all combinations, we obtain the following six variants of MultiEA:

(1) MultiEA+mean-infer;

(2) MultiEA+anchor-infer;

(3) MultiEA+each-infer;

(4) MultiEA+mean+infer;

(5) MultiEA+anchor+infer;

(6) MultiEA+each+infer.

For anchor-based variants, we treat DBP-4: en and DWY-3: dbp as the anchor KGs on the two datasets, respectively.


5.4 Implementation Details

For the constructed benchmark datasets DBP-4 and DWY-3, we follow the convention of randomly selecting 30% of the elements of the seed alignment set $\mathcal{S}$ to form the training set; the rest are treated as the test set. We implement our MultiEA based on EAkit [49], a PyTorch-based open-source EA toolkit. For the baselines, we use the implementations provided in the EAkit package. All the experiments are conducted on our constructed benchmark datasets, and all the methods use the same dataset partitioning.

For our MultiEA, we use the same hyperparameter settings on both datasets. This is a challenging configuration, since it better reflects how sensitive the hyperparameters are to the datasets. All the model parameters are randomly initialized with the Xavier uniform distribution [50]. We adopt the Adam optimizer with a learning rate of 0.01. The embedding dimensionality of both entities and relations is set to 256. The non-linear activation function σ is the ELU function [51]. The margin hyperparameter λ is set to 1. The number of negative example groups, i.e., the hyperparameter η, is set to 10. We adopt early stopping for model training with a patience of 10 steps.
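For reference, these settings can be summarized as a plain configuration dictionary (the key names below are ours, not EAkit's):

```python
# Hedged summary of the hyperparameters reported above.
config = {
    "optimizer": "Adam",
    "learning_rate": 0.01,
    "embedding_dim": 256,           # both entities and relations
    "activation": "ELU",
    "margin_lambda": 1.0,           # margin hyperparameter in the loss
    "neg_groups_eta": 10,           # number of negative example groups
    "early_stopping_patience": 10,
    "train_ratio": 0.3,             # 30% of the seed alignment set S
    "init": "xavier_uniform",
}
```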

All the experiments are conducted on an NVIDIA TITAN RTX GPU with 24GB GPU memory and 128GB main memory.

TABLE III: Efficiency study on DBP-4.

Methods              | M-Hits@1 | M-Hits@10 | M-Hits@20 | Parameter | Memory  | Time
GCN-Align            | 4.48%    | 27.62%    | 44.94%    | 10.04M    | 38.30MB | 382.77s
MultiEA (GCN-Align)  | 7.32%    | 43.75%    | 54.14%    | 5.35M     | 20.43MB | 145.49s
KECG                 | 5.62%    | 33.35%    | 49.61%    | 10.24M    | 39.07MB | 478.30s
MultiEA (KECG)       | 8.42%    | 45.93%    | 57.60%    | 5.42M     | 20.68MB | 185.66s
NAEA                 | 6.14%    | 34.14%    | 48.07%    | 12.31M    | 46.95MB | 522.60s
MultiEA (NAEA)       | 8.95%    | 46.15%    | 57.82%    | 6.43M     | 24.56MB | 206.13s
PSR                  | 7.84%    | 38.17%    | 50.97%    | 11.13M    | 42.47MB | 1336.05s
MultiEA (PSR)        | 9.98%    | 49.13%    | 61.30%    | 6.05M     | 23.06MB | 576.19s

TABLE IV: Efficiency study on DWY-3.

Methods              | M-Hits@1 | M-Hits@10 | M-Hits@20 | Parameter | Memory  | Time
GCN-Align            | 32.46%   | 63.18%    | 73.91%    | 23.80M    | 90.80MB | 566.43s
MultiEA (GCN-Align)  | 36.75%   | 69.75%    | 80.26%    | 17.78M    | 67.83MB | 315.44s
KECG                 | 34.08%   | 67.51%    | 77.52%    | 24.04M    | 91.66MB | 780.29s
MultiEA (KECG)       | 38.52%   | 71.85%    | 82.97%    | 17.78M    | 67.83MB | 407.16s
NAEA                 | 35.17%   | 69.88%    | 79.43%    | 25.63M    | 93.96MB | 868.15s
MultiEA (NAEA)       | 40.66%   | 74.69%    | 83.66%    | 18.31M    | 69.84MB | 480.61s
PSR                  | 37.55%   | 71.85%    | 80.13%    | 23.85M    | 90.98MB | 1974.62s
MultiEA (PSR)        | 41.68%   | 74.76%    | 84.72%    | 17.58M    | 67.07MB | 804.37s

5.5 Multiple KG Alignment

We first quantitatively evaluate the effectiveness of our MultiEA in aligning multiple (more than two) KGs. Specifically, we apply all six baseline methods as well as all six variants of MultiEA to the two constructed benchmark datasets, and use the metric M-Hits@K newly defined in Eq. (22) to compare their accuracy. The results are reported in Table II.

Comparison with Baselines. It is worth noting that the baseline EA methods cannot be directly applied to the multiple KG alignment task, since they are specifically designed for the traditional pair-wise KG alignment task. To make them compatible, we split the task into multiple pair-wise sub-tasks and align each pair of KGs separately; these baselines are marked accordingly in the table. We can see that our MultiEA (especially the three variants with the inference enhancement module at the bottom of the table) significantly outperforms all the baselines on both datasets. This is likely because MultiEA effectively captures the global alignment information by projecting the entities of all the candidate KGs into a unified embedding space. Besides, the GNN-based EA baselines perform better than the Trans-based EA baselines, which is consistent with the findings of many previous studies [16, 28, 29, 25].

Ablation Studies. As we can see, MultiEA+each+infer achieves the best overall results in most cases, showing its superior effectiveness in aligning multiple KGs. Overall, MultiEA+anchor+infer and MultiEA+mean+infer achieve the second-best and third-best results, respectively, and the ranking is the same for the three variants without the inference enhancement module. Therefore, we recommend the "each other" strategy as the default in most cases; if users prefer higher efficiency and there is an obvious anchor KG, we recommend the "anchor" strategy, and otherwise the "mean" strategy. From another dimension, all three variants with the inference enhancement module significantly outperform their counterparts without it, indicating that the inference enhancement module is highly beneficial for this task.

5.6 Case Study

To intuitively show the effectiveness of our MultiEA, in Figure 3 and Figure 4 we list ten groups of equivalent entities discovered by the best-performing variant MultiEA+each+infer on the two datasets. As we can see, MultiEA can effectively discover equivalent entities across all the candidate KGs.

Further, we visualize the embeddings of these entities. Specifically, we utilize the t-SNE algorithm [52] to project them into a 2-dimensional Euclidean space. Figure 5(a) and Figure 5(b) show the results on DBP-4 and DWY-3, respectively, where different colors and shapes mark entities from different KGs, and equivalent entities are connected by black lines. On both datasets, the ten groups of equivalent entities are clearly clustered together, indicating the effectiveness of the alignment mechanism of MultiEA. Meanwhile, a proper margin is maintained between different groups, reducing the risk of overfitting. The result on DWY-3 shows a relatively better distribution, which is consistent with the higher accuracy on DWY-3 reported in Table II.
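The visualization step can be reproduced with a few lines of scikit-learn (an illustrative sketch; the perplexity value is our assumption, as it is not reported above):

```python
import numpy as np
from sklearn.manifold import TSNE

def project_embeddings(embeddings):
    # Project entity embeddings (one row per entity, gathered from all
    # candidate KGs) into 2D for plotting.
    tsne = TSNE(n_components=2, perplexity=5, random_state=0)
    return tsne.fit_transform(np.asarray(embeddings))
```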

5.7 Efficiency Study

In this experiment, we demonstrate the efficiency advantage of our MultiEA in aligning multiple KGs. As shown in Table III and Table IV, we first report the results of our re-implemented GNN-based EA baselines, which, as described in Section 5.5, align multiple candidate KGs by splitting the task into separate pair-wise sub-tasks (marked accordingly in the tables). Then, we apply the "each other" alignment strategy and the inference enhancement module of MultiEA+each+infer to each baseline X, denoted MultiEA (X) in the tables.

In addition to the M-Hits@K scores, we report the number of model parameters, the memory, and the time required by each method during training. As we can see, MultiEA helps each baseline achieve significantly better performance, while also significantly reducing its parameter, memory, and time requirements. This demonstrates the superior efficiency of our MultiEA.

5.8 Pair-wise KG Alignment

We also apply our MultiEA (the variant MultiEA+each+infer) to the traditional pair-wise KG alignment task. First, based on the two constructed benchmark datasets, we induce several pair-wise KG alignment sub-datasets: for DBP-4, three sub-datasets, i.e., DBP-4: en-fr, DBP-4: en-ja, and DBP-4: en-zh; for DWY-3, two sub-datasets, i.e., DWY-3: dbp-wiki and DWY-3: dbp-yago. For the baselines, we reproduce their results on the induced datasets. Then, we re-implement the encoder of MultiEA as the one used in a baseline X, denoted MultiEA (X). This eliminates the encoder's influence on the alignment results, allowing us to investigate whether our proposed "each other" strategy and inference enhancement module help improve the baselines' performance on this task. By convention, we use the widely used metric Hits@K described by Eq. (19) to evaluate the alignment accuracy.

Table V and Table VI show the final results. In addition to the original scores, we also compute the relative gains to show the improvements more intuitively. As we can see, our MultiEA consistently helps improve the baselines' performance on the traditional pair-wise KG alignment task. Besides, we have some other findings. PSR, KECG, and NAEA perform better than GCN-Align, indicating the effectiveness of the attention mechanisms in their KG encoders. Moreover, MultiEA (PSR), MultiEA (KECG), and MultiEA (NAEA) outperform MultiEA (GCN-Align), demonstrating that our MultiEA framework can effectively exploit the structural information captured by graph attention. The two Trans-based methods, MTransE and IPTransE, perform worse than the GNN-based methods, which is consistent with the conclusions of previous works [16, 28, 29].

The experimental results reported here are slightly different from those reported in previous studies. The main reason is that our constructed datasets differ from existing ones. For example, the label ratio of our DBP-4 is $\frac{2539\times 4}{8901+3545+4326+3893}\approx 0.4915$, while the label ratio of the DBP15K dataset used in previous studies is $(\frac{15000\times 2}{19388+19572}+\frac{15000\times 2}{19780+19814}+\frac{15000\times 2}{19993+19661})\times\frac{1}{3}\approx 0.7614$, which is much larger than ours.

TABLE V: Pair-wise KG alignment performance (Hits@K, %) and relative gains (%).

Methods               | DBP-4                    | DWY-3
                      | en-zh   en-fr   en-ja    | dbp-wiki  dbp-yago
MTransE               | 14.63   15.91   13.54    | 35.16     44.35
MultiEA (MTransE)     | 15.29   16.05   15.68    | 36.05     46.17
Relative Gain ↑       | +4.51   +0.88   +15.81   | +2.53     +4.10
IPTransE              | 19.84   20.95   21.38    | 43.51     55.19
MultiEA (IPTransE)    | 20.26   21.01   23.29    | 43.68     57.06
Relative Gain ↑       | +0.11   +0.33   +8.93    | +0.39     +3.38
GCN-Align             | 21.48   24.01   21.54    | 44.75     58.63
MultiEA (GCN-Align)   | 21.57   25.71   22.19    | 45.48     60.08
Relative Gain ↑       | +0.42   +7.08   +3.02    | +1.63     +2.47
KECG                  | 25.71   27.44   26.77    | 47.15     61.97
MultiEA (KECG)        | 26.14   29.48   27.49    | 50.02     62.44
Relative Gain ↑       | +1.67   +7.43   +2.69    | +6.09     +0.76
NAEA                  | 24.69   26.92   26.35    | 46.51     61.37
MultiEA (NAEA)        | 25.75   28.16   26.95    | 49.18     61.52
Relative Gain ↑       | +4.29   +4.61   +2.28    | +5.74     +0.24
PSR                   | 25.59   28.20   27.87    | 48.23     60.52
MultiEA (PSR)         | 26.27   30.18   28.03    | 51.36     62.14
Relative Gain ↑       | +2.65   +7.02   +0.57    | +6.49     +3.12

TABLE VI: Pair-wise KG alignment performance (Hits@K, %) and relative gains (%).

Methods               | DBP-4                    | DWY-3
                      | en-zh   en-fr   en-ja    | dbp-wiki  dbp-yago
MTransE               | 45.52   50.13   47.93    | 60.54     67.09
MultiEA (MTransE)     | 45.93   52.26   50.19    | 60.96     67.36
Relative Gain ↑       | +0.90   +4.25   +4.72    | +0.69     +0.40
IPTransE              | 56.23   57.79   60.66    | 75.96     79.32
MultiEA (IPTransE)    | 58.52   59.35   62.43    | 77.16     82.17
Relative Gain ↑       | +4.07   +2.70   +2.92    | +1.58     +3.59
GCN-Align             | 62.03   68.05   63.96    | 77.58     86.00
MultiEA (GCN-Align)   | 62.41   70.37   65.22    | 78.56     88.02
Relative Gain ↑       | +0.61   +3.41   +1.97    | +1.26     +2.35
KECG                  | 67.19   71.29   67.51    | 80.40     88.65
MultiEA (KECG)        | 68.03   73.10   69.38    | 80.73     89.78
Relative Gain ↑       | +1.25   +2.54   +2.77    | +0.41     +1.27
NAEA                  | 66.41   71.43   67.83    | 80.67     88.76
MultiEA (NAEA)        | 68.18   74.17   70.95    | 81.35     89.37
Relative Gain ↑       | +2.67   +3.84   +4.60    | +0.84     +0.69
PSR                   | 67.83   73.04   69.93    | 82.41     90.26
MultiEA (PSR)         | 70.04   75.93   71.81    | 83.97     90.79
Relative Gain ↑       | +3.25   +3.96   +2.69    | +1.90     +0.59

5.9 Hyperparameter Study

Figure 6: Sensitivity of MultiEA's key hyperparameters on DBP-4 (panels (a)-(d)) and DWY-3 (panels (e)-(h)).

In this subsection, we study the impact of MultiEA’s four key hyperparameters on multiple KG alignment accuracy.

First, we study the balance hyperparameters introduced in the inference enhancement module, as described by Eqs. (16) and (17). In practice, we only search the weight of the first term and set the weights of the other terms to the same value, which compresses the hyperparameter search space. For example, on the DBP-4 dataset, Eq. (17) can be rewritten as:

\widetilde{\mathbf{S}}^{en\text{-}zh} = \gamma\cdot\mathbf{S}^{en\text{-}zh} + \frac{1-\gamma}{2}\cdot\mathbf{S}^{en\text{-}fr}\cdot\mathbf{S}^{fr\text{-}zh} + \frac{1-\gamma}{2}\cdot\mathbf{S}^{en\text{-}ja}\cdot\mathbf{S}^{ja\text{-}zh}.

Similarly, on the DWY-3 dataset, Eq. (16) can be rewritten as:

\widetilde{\mathbf{S}}^{dbp\text{-}wiki} = \gamma\cdot\mathbf{S}^{dbp\text{-}wiki} + (1-\gamma)\cdot\mathbf{S}^{dbp\text{-}yago}\cdot\mathbf{S}^{yago\text{-}wiki}.

The sensitivity of γ on the two datasets is shown in Figure 6(a) and Figure 6(e), respectively. On both datasets, MultiEA achieves the best performance when γ = 0.2, i.e., the first-order similarity receives a weight of 0.2 while the higher-order similarities together receive 0.8. This indicates that incorporating high-order similarities is very beneficial for the multiple KG alignment task, demonstrating the effectiveness of our proposed inference enhancement module. This finding is also consistent with the observation in Section 5.5.

Second, we study the sensitivity of the margin hyperparameter λ introduced in the loss function described by Eq. (9). The results are shown in Figure 6(b) and Figure 6(f). The performance is very poor when λ = 0, because the model cannot separate the positive examples from the negative examples, causing underfitting. There is a clear inflection point at λ = 1, so we set λ to 1 by default in the other experiments. Beyond that point, the performance gradually declines, probably because the model overfits when the margin is too large.

Third, we investigate the sensitivity of η, the number of negative example groups. As shown in Figure 6(c) and Figure 6(g), our model is not sensitive to this hyperparameter. Therefore, in practice, we set η to 10 to save computational resources.

Finally, we show the model performance w.r.t. the training ratio in Figure6(d) and Figure6(h). As expected, the training ratio shows a clear positive effect on improving the model performance, which is consistent with the findings of many previous studies, e.g.,[12, 19, 28, 13, 37, 25].

6 Conclusion

In this work, we note that existing methods mainly focus on aligning only a pair of candidate KGs, ignoring the multiplicity of the candidate KGs to be aligned. To fill the research gap in this field, we formulate a novel problem of aligning multiple (more than two) KGs and propose the MultiEA framework to effectively solve this problem. To enhance the inference performance, we innovatively propose to incorporate higher-order similarities. Last but not least, we construct two new benchmark datasets and define the evaluation metric for the problem. The experimental results show that our MultiEA framework can effectively and efficiently align multiple KGs in a single pass. We will make our source codes and the constructed benchmark datasets publicly available to facilitate further research on this problem, which we believe will inspire more interesting works in the future.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grants 62133012, 61936006, 62303366, and 62073255, in part by the Innovation Capability Support Program of Shaanxi under Grant 2021TD-05, and in part by the Fundamental Research Funds for the Central Universities under Grants QTZX23039, XJSJ23030.

References

  • [1]X.Zhao, H.Chen, Z.Xing, and C.Miao, “Brain-inspired search engine assistant based on knowledge graph,” IEEE Transactions on Neural Networks and Learning Systems, 2021.
  • [2]A.Saxena, A.Tripathi, and P.Talukdar, “Improving multi-hop question answering over knowledge graphs using knowledge base embeddings,” in Proceedings of the 58th annual meeting of the association for computational linguistics, 2020, pp. 4498–4507.
  • [3]Y.Hua, Y.-F. Li, G.Haffari, G.Qi, and T.Wu, “Few-shot complex knowledge base question answering via meta reinforcement learning,” in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020, pp. 5827–5837.
  • [4]Q.Wang, Z.Mao, B.Wang, and L.Guo, “Knowledge graph embedding: A survey of approaches and applications,” IEEE transactions on knowledge and data engineering, vol.29, no.12, pp. 2724–2743, 2017.
  • [5]H.Wang, M.Zhao, X.Xie, W.Li, and M.Guo, “Knowledge graph convolutional networks for recommender systems,” in The world wide web conference, 2019, pp. 3307–3313.
  • [6]X.Wang, X.He, Y.Cao, M.Liu, and T.-S. Chua, “Kgat: Knowledge graph attention network for recommendation,” in Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, 2019, pp. 950–958.
  • [7]Z.Sun, Q.Zhang, W.Hu, C.Wang, M.Chen, F.Akrami, and C.Li, “A benchmarking study of embedding-based entity alignment for knowledge graphs,” Proceedings of the VLDB Endowment, vol.13, no.11, 2020.
  • [8]X.Zhao, W.Zeng, J.Tang, W.Wang, and F.M. Suchanek, “An experimental study of state-of-the-art entity alignment approaches,” IEEE Transactions on Knowledge and Data Engineering, vol.34, no.6, pp. 2610–2625, 2020.
  • [9]Y.Hao, Y.Zhang, S.He, K.Liu, and J.Zhao, “A joint embedding method for entity alignment of knowledge bases,” in Knowledge Graph and Semantic Computing: Semantic, Knowledge, and Linked Big Data: First China Conference, CCKS 2016, Beijing, China, September 19-22, 2016, Revised Selected Papers 1.Springer, 2016, pp. 3–14.
  • [10]H.Zhu, R.Xie, Z.Liu, and M.Sun, “Iterative entity alignment via joint knowledge embeddings.” in IJCAI, vol.17, 2017, pp. 4258–4264.
  • [11]M.Chen, Y.Tian, M.Yang, and C.Zaniolo, “Multilingual knowledge graph embeddings for cross-lingual knowledge alignment,” arXiv preprint arXiv:1611.03954, 2016.
  • [12]Z.Sun, W.Hu, and C.Li, “Cross-lingual entity alignment via joint attribute-preserving embedding,” in The Semantic Web–ISWC 2017: 16th International Semantic Web Conference, Vienna, Austria, October 21–25, 2017, Proceedings, Part I 16.Springer, 2017, pp. 628–644.
  • [13]Z.Sun, W.Hu, Q.Zhang, and Y.Qu, “Bootstrapping entity alignment with knowledge graph embedding.” in IJCAI, vol.18, no. 2018, 2018.
  • [14]Q.Zhang, Z.Sun, W.Hu, M.Chen, L.Guo, and Y.Qu, “Multi-view knowledge graph embedding for entity alignment,” arXiv preprint arXiv:1906.02390, 2019.
  • [15]A.Bordes, N.Usunier, A.Garcia-Duran, J.Weston, and O.Yakhnenko, “Translating embeddings for modeling multi-relational data,” Advances in neural information processing systems, vol.26, 2013.
  • [16]Z.Wang, Q.Lv, X.Lan, and Y.Zhang, “Cross-lingual knowledge graph alignment via graph convolutional networks,” in Proceedings of the 2018 conference on empirical methods in natural language processing, 2018, pp. 349–357.
  • [17]R.Ye, X.Li, Y.Fang, H.Zang, and M.Wang, “A vectorized relational graph convolutional network for multi-relational network alignment.” in IJCAI, 2019, pp. 4135–4141.
  • [18]H.-W. Yang, Y.Zou, P.Shi, W.Lu, J.Lin, and X.Sun, “Aligning cross-lingual entities with multi-aspect information,” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019, pp. 4431–4441.
  • [19]Y.Wu, X.Liu, Y.Feng, Z.Wang, and D.Zhao, “Jointly learning entity and relation representations for entity alignment,” arXiv preprint arXiv:1909.09317, 2019.
  • [20]K.Xu, L.Wang, M.Yu, Y.Feng, Y.Song, Z.Wang, and D.Yu, “Cross-lingual knowledge graph alignment via graph matching neural network,” in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019, pp. 3156–3161.
  • [21]Y.Cao, Z.Liu, C.Li, J.Li, and T.-S. Chua, “Multi-channel graph neural network for entity alignment,” in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019, pp. 1452–1461.
  • [22]Y.Wu, X.Liu, Y.Feng, Z.Wang, and D.Zhao, “Neighborhood matching network for entity alignment,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020, pp. 6477–6487.
  • [23]H.Nie, X.Han, L.Sun, C.M. Wong, Q.Chen, S.Wu, and W.Zhang, “Global structure and local semantics-preserved embeddings for entity alignment,” in Proceedings of the Twenty-Ninth International Conference on International Joint Conferences on Artificial Intelligence, 2021, pp. 3658–3664.
  • [24]X.Mao, W.Wang, H.Xu, Y.Wu, and M.Lan, “Relational reflection entity alignment,” in Proceedings of the 29th ACM International Conference on Information & Knowledge Management, 2020, pp. 1095–1104.
  • [25]X.Mao, W.Wang, Y.Wu, and M.Lan, “Are negative samples necessary in entity alignment? an approach with high performance, scalability and robustness,” in Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 2021, pp. 1263–1273.
  • [26]Z.Sun, C.Wang, W.Hu, M.Chen, J.Dai, W.Zhang, and Y.Qu, “Knowledge graph alignment network with gated multi-hop neighborhood aggregation,” in Proceedings of the AAAI conference on artificial intelligence, vol.34, no.01, 2020, pp. 222–229.
  • [27]X.Mao, W.Wang, H.Xu, M.Lan, and Y.Wu, “Mraea: an efficient and robust entity alignment approach for cross-lingual knowledge graph,” in Proceedings of the 13th International Conference on Web Search and Data Mining, 2020, pp. 420–428.
  • [28]C.Li, Y.Cao, L.Hou, J.Shi, J.Li, and T.-S. Chua, “Semi-supervised entity alignment via joint knowledge embedding model and cross-graph model,” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).Association for Computational Linguistics, 2019.
  • [29]Q.Zhu, X.Zhou, J.Wu, J.Tan, and L.Guo, “Neighborhood-aware attentional representation for multilingual knowledge graphs.” in IJCAI, 2019, pp. 1943–1949.
  • [30]T.N. Kipf and M.Welling, “Semi-supervised classification with graph convolutional networks,” arXiv preprint arXiv:1609.02907, 2016.
  • [31]Z.Sun, J.Huang, W.Hu, M.Chen, L.Guo, and Y.Qu, “Transedge: Translating relation-contextualized embeddings for knowledge graphs,” in The Semantic Web–ISWC 2019: 18th International Semantic Web Conference, Auckland, New Zealand, October 26–30, 2019, Proceedings, Part I 18.Springer, 2019, pp. 612–629.
  • [32]C.Ge, X.Liu, L.Chen, Y.Gao, and B.Zheng, “Largeea: aligning entities for large-scale knowledge graphs,” Proceedings of the VLDB Endowment, vol.15, no.2, pp. 237–245, 2021.
  • [33]N.T. Tam, H.T. Trung, H.Yin, T.VanVinh, D.Sakong, B.Zheng, and N.Q.V. Hung, “Entity alignment for knowledge graphs with multi-order convolutional networks,” IEEE Transactions on Knowledge and Data Engineering, vol.34, no.9, pp. 4201–4214, 2020.
  • [34]Z.Sun, W.Hu, C.Wang, Y.Wang, and Y.Qu, “Revisiting embedding-based entity alignment: A robust and adaptive method,” IEEE Transactions on Knowledge and Data Engineering, vol.35, no.8, pp. 8461–8475, 2022.
  • [35]X.Tang, J.Zhang, B.Chen, Y.Yang, H.Chen, and C.Li, “Bert-int: a bert-based interaction model for knowledge graph alignment,” interactions, vol. 100, p.e1, 2020.
  • [36]Q.Li, S.Guo, Y.Luo, C.Ji, L.Wang, J.Sheng, and J.Li, “Attribute-consistent knowledge graph representation learning for multi-modal entity alignment,” in Proceedings of the ACM Web Conference 2023, 2023, pp. 2499–2508.
  • [37]Y.Wu, X.Liu, Y.Feng, Z.Wang, R.Yan, and D.Zhao, “Relation-aware entity alignment for heterogeneous knowledge graphs,” arXiv preprint arXiv:1908.08210, 2019.
  • [38]X.Mao, W.Wang, Y.Wu, and M.Lan, “Lightea: A scalable, robust, and interpretable entity alignment framework via three-view label propagation,” arXiv preprint arXiv:2210.10436, 2022.
  • [39]X.Liu, H.Hong, X.Wang, Z.Chen, E.Kharlamov, Y.Dong, and J.Tang, “Selfkg: Self-supervised entity alignment in knowledge graphs,” in Proceedings of the ACM Web Conference 2022, 2022, pp. 860–870.
  • [40]Y.Zhang, J.Tang, Z.Yang, J.Pei, and P.S. Yu, “Cosnet: Connecting heterogeneous social networks with local and global consistency,” in Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining, 2015, pp. 1485–1494.
  • [41]X.Mu, F.Zhu, E.-P. Lim, J.Xiao, J.Wang, and Z.-H. Zhou, “User identity linkage by latent user space modelling,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1775–1784.
  • [42]L.Sun, Z.Zhang, G.Li, P.Ji, S.Su, and P.S. Yu, “Mc2: Unsupervised multiple social network alignment,” ACM Transactions on Intelligent Systems and Technology, 2023.
  • [43]S.Su, L.Sun, Z.Zhang, G.Li, and J.Qu, “Master: across multiple social networks, integrate attribute and structure embedding for reconciliation.” in IJCAI, vol.18, 2018, pp. 3863–3869.
  • [44]X.Zeng, P.Wang, Y.Mao, L.Chen, X.Liu, and Y.Gao, “Multiem: Efficient and effective unsupervised multi-table entity matching,” in 2024 IEEE 40th International Conference on Data Engineering (ICDE).IEEE, 2024, pp. 3421–3434.
  • [45]N.Reimers and I.Gurevych, “Sentence-bert: Sentence embeddings using siamese bert-networks,” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019, pp. 3982–3992.
  • [46]Y.Yang, Z.Guan, J.Li, W.Zhao, J.Cui, and Q.Wang, “Interpretable and efficient heterogeneous graph convolutional network,” TKDE, 2021.
  • [47]Y.Lin, Z.Liu, H.Luan, M.Sun, S.Rao, and S.Liu, “Modeling relation paths for representation learning of knowledge bases,” arXiv preprint arXiv:1506.00379, 2015.
  • [48]P.Veličković, G.Cucurull, A.Casanova, A.Romero, P.Lio, and Y.Bengio, “Graph attention networks,” arXiv preprint arXiv:1710.10903, 2017.
  • [49]K.Zeng, C.Li, L.Hou, J.Li, and L.Feng, “A comprehensive survey of entity alignment for knowledge graphs,” AI Open, vol.2, pp. 1–13, 2021.
  • [50]X.Glorot and Y.Bengio, “Understanding the difficulty of training deep feedforward neural networks,” in AISTATS, 2010, pp. 249–256.
  • [51]J.T. Barron, “Continuously differentiable exponential linear units,” arXiv preprint arXiv:1704.07483, 2017.
  • [52]L.v.d. Maaten and G.Hinton, “Visualizing data using t-sne,” JMLR, vol.9, no. Nov, pp. 2579–2605, 2008.
Yaming Yang received the B.S. and Ph.D. degrees in Computer Science and Technology from Xidian University, China, in 2015 and 2022, respectively. He is currently a lecturer with the School of Computer Science and Technology at Xidian University. His research interests include data mining and machine learning on graph data.
Zhe Wang received the B.S. degree in Management from Hefei University of Technology, China, in 2020. He is currently working towards a Ph.D. degree with the School of Computer Science and Technology, Xidian University, China. His research interests include data mining and machine learning on knowledge graph data.
Ziyu Guan received the B.S. and Ph.D. degrees in Computer Science from Zhejiang University, Hangzhou, China, in 2004 and 2010, respectively. He had worked as a research scientist in the University of California at Santa Barbara from 2010 to 2012, and as a professor in the School of Information and Technology of Northwest University, China from 2012 to 2018. He is currently a professor with the School of Computer Science and Technology, Xidian University. His research interests include attributed graph mining and search, machine learning, expertise modeling and retrieval, and recommender systems.
Wei Zhao received the B.S., M.S. and Ph.D. degrees from Xidian University, Xi'an, China, in 2002, 2005 and 2015, respectively. He is currently a professor in the School of Computer Science and Technology at Xidian University. His research direction is pattern recognition and intelligent systems, with specific interests in attributed graph mining and search, machine learning, signal processing and precision guiding technology.
Weigang Lu received the B.S. degree in Internet of Things from Anhui Polytechnic University, China, in 2019. He is currently working towards a Ph.D. degree with the School of Computer Science and Technology, Xidian University, China. His research interests include data mining and machine learning on graph data.
Xinyan Huang received the B.S. degree from the School of Electronic Science and Technology, Shaanxi University of Science and Technology, Xi'an, China, in 2015, and the M.S. degree from the School of Physics and Information Technology, Shaanxi Normal University, Xi'an, in 2017. She is currently working towards a Ph.D. degree with the School of Artificial Intelligence, Xidian University, China. Her research interests include machine learning, computer vision, and pattern recognition.
