Main

No man is an island. Human life is a process of seeking, sustaining, repairing, judging, adjusting and sometimes dissolving relationships1. The quality and quantity of relationships are integral not only to our survival but also to our capacity to thrive2,3. Social isolation and poor relationships affect an individual’s cognition, behaviour, development and well-being4,5.

Understanding the nature of human relationships lies at the heart of the social sciences. However, studying relationships is challenging for several reasons. First, human relationships are characterized by their diversity and complexity. Social structure in non-human primates is largely dominated by hierarchy and affiliation6. Human society, in contrast, is governed by far more diverse and complex types of relationships (for example, frenemies, godparents and online friends). Human relationships are also context-dependent and multifaceted, involving numerous factors such as time, space, emotions, communication and cultural norms7. These factors interact with one another in intricate ways, making it difficult to isolate and study individual components. Unravelling the underlying elements and organizational structures of such a complex relationship system thus remains a vexing problem.

Second, human relationships are subjective beliefs, experiences and practices shaped by the unique perspectives, attitudes and personalities of the individuals involved and maintained by dynamic, unwritten rules over time and across societies8. This subjectivity makes it difficult to establish objective and uniform measures of relationships and to compare them across individuals. The degree to which people around the world (and across the generations) share the same set of cognitive, behavioural and cultural principles of relationships has yet to be fully evaluated.

Third, human relationships are widely studied in the social sciences. A wave of enthusiasm from multiple disciplines in the 1970s–1990s led to the exploration of the internal organization of relationships, each with their own theoretical perspectives and methodological approaches. This interdisciplinary nature can make it challenging to establish a unified understanding of relationships and to compare findings across disciplines. For example, sociologists were interested in the formation and organization of social relationships and discovered a three-factor model for role-based relationships (that is, intimacy, visibility and regulation)9, anthropologists attempted to understand the foundations of social coordination across cultures and proposed four elementary forms of social bonds (that is, communal sharing, authority ranking, expected reciprocity and market pricing)1, cognitive psychologists studied the perception of relationships and revealed a four-dimensional framework (that is, valence, equality, activeness and formality)10, and communication scholars focused on the communication quality in personal relationships and proposed three factors for effective relational dialogues (that is, positiveness, intimacy and control)11. All theories have proved insightful, as attested by their endurance in the field. However, no consensus has been reached because researchers in different disciplines have approached the problem of human relationships using their unique features of interest and thus have tapped into distinct feature spaces and relationship types.

To address the challenging questions above, here we focus on the common sense of human relationships—how ordinary people mentally conceptualize and understand human relationships (that is, relationship concepts). By building up a unified framework across multiple disciplines, we aim to clarify the underlying elements and organizational structures of the relationship concept system and reveal the similarities and differences in relationship conceptualization across different cultures and time periods.

Results

Study 1: a unified representational space across disciplines

In Study 1, we attempted to synthesize this cross-field literature and build a unified representational space across disciplines. On the basis of an extensive literature review (Extended Data Fig. 1), we collected and summarized 30 conceptual features of relationships from 15 prominent existing theories to encompass a composite feature space, including activeness, communality, concreteness, equality, endurance, formality, intensity, intimacy, reciprocity, societal importance, socio-emotionality, uniqueness, valence and visibility, among others (see the full list in Supplementary Table 1). These theoretical features were originally derived from dimensionality reduction or clustering techniques in each discipline, but here they were assessed together in a dimensional survey and prepared to be further reduced into higher-order components. To capture the diversity of all possible relationships, we used a naïve natural language processing (NLP) model (Methods) to generate a comprehensive list of 159 typical relationships in English, including both common (for example, siblings, friends and enemies) and uncommon ones (for example, master–servant and friends with benefits) (see the full list in Supplementary Table 2).

A diverse group of native English speakers from the USA (n = 1,065) were recruited via MTurk and completed an online survey where they rated 159 relationships on 30 theoretical features. For example, the participants were asked to rate the equality of ‘between friends’. For details on feature selection and their definitions, please see Extended Data Fig. 1 and Supplementary Table 1. The next processing step involved reducing this high-dimensional feature set into a smaller number of orthogonalized latent factors via principal component analysis (PCA). The PCA extracted five latent dimensions, accounting for 82.14% of the variance of dimensionality ratings (see Methods and statistics on how to determine the optimal PCA component number). On the basis of close examination of the PCA loadings and relationship scores (Fig. 1), the first dimension was identified as ‘formality’. This dimension contrasts formal, occupational and publicly visible relationships that adhere to rules and regulations (for example, co-workers and officer–soldier) with informal, socio-emotional and private relationships that exhibit a looser, more casual style (for example, parent–infant and wife–husband). The second dimension, which we termed ‘activeness’, loaded highly on activeness, synchronicity and spatial distance. Close relationships (for example, wife–husband and siblings) and distant relationships (for example, distant relatives and strangers) occupied the poles of this dimension. The third dimension was described as ‘valence’, with friendly, harmonious and high-solidarity relationships at one pole (for example, church members and writer–reader) and conflictual, hostile and antagonistic relationships at the other (for example, bully–victim and slave–master). 
We named the fourth dimension ‘exchange’, as it distinguishes between dyads exchanging concrete resources such as money, goods and services (for example, dealer–buyer and prostitute–customer) and dyads exchanging symbolic, intangible resources such as information, love and identity (for example, celebrity–haters and brother–sister). The fifth dimension was labelled ‘equality’, as it differentiated dyads with equal powers (for example, sports rivals and pen friends) from dyads with unequal powers (for example, man–god and politician–supporter). Other dimensionality reduction techniques (that is, independent component analysis, exploratory factor analysis and multidimensional scaling) were also evaluated to examine the robustness of the latent factor solution to different statistical algorithms, and they all yielded the same five-factor solution (Supplementary Fig. 3). We hereafter refer to this five-dimensional solution as the FAVEE model (an abbreviation for formality, activeness, valence, exchange and equality).
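
The reduction step described above can be illustrated with a minimal sketch. It assumes plain, unrotated PCA on standardized features computed via singular value decomposition, and it uses synthetic ratings in place of the survey data. The function name `pca` and all variable names are hypothetical; the authors' actual pipeline (see Methods) may differ, for example by rotating the components.

```python
import numpy as np

def pca(ratings, n_components):
    """Return (scores, loadings, explained_ratio) for a relationships x features matrix."""
    # Standardize each feature column so that all features contribute equally.
    z = (ratings - ratings.mean(axis=0)) / ratings.std(axis=0, ddof=1)
    # SVD of the standardized matrix: rows of vt are the principal axes.
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    variances = s ** 2 / (z.shape[0] - 1)
    explained_ratio = variances / variances.sum()
    # Scores locate each relationship on each component.
    scores = u[:, :n_components] * s[:n_components]
    # Loadings are the correlations between features and components.
    loadings = vt[:n_components].T * np.sqrt(variances[:n_components])
    return scores, loadings, explained_ratio[:n_components]

rng = np.random.default_rng(0)
# Synthetic stand-in (an assumption, not the study data): 159 relationships
# rated on 30 features, generated from 5 latent factors plus noise.
latent = rng.normal(size=(159, 5))
mixing = rng.normal(size=(5, 30))
ratings = latent @ mixing + 0.3 * rng.normal(size=(159, 30))

scores, loadings, explained = pca(ratings, n_components=5)
print(scores.shape, loadings.shape, round(float(explained.sum()), 3))
```

In this sketch, `loadings` plays the role of the feature-by-dimension matrix in Fig. 1a and `scores` the relationship coordinates in Fig. 1b.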

Fig. 1: A five-dimensional model of human relationships (FAVEE model).

a, PCA loadings on 30 theoretical features derived from multidisciplinary literature. Dark colours on the colour bar represent larger values, with blue indicating negative values and red indicating positive values. b, 159 relationships were plotted in the 5D space on the basis of their scores on each dimension. PCA loadings and relationship scores jointly suggested that RC1 corresponded to formality (formal versus informal), RC2 to activeness (close versus distant), RC3 to valence (harmony versus conflict), RC4 to exchange (concrete versus symbolic) and RC5 to equality (unequal versus equal). Each axis is labelled with the variance explained for the corresponding dimension. For more details about the spatial location of each relationship, please see the dynamic figures at https://bnu-wang-msn-lab.github.io/FAVEE-HPP/.

Categorical thinking (for example, family, friends and colleagues) is pervasive when people define and manage their social connections. We next studied how people sort relationships and how categorical representations relate to the FAVEE dimensions. Two cognitive paradigms were employed: in the multi-arrangement task12, participants judged the similarity between the 159 relationships by arranging them on a 2D computer screen in such a way that the distance between any two relationships reflected their conceptual dissimilarity (that is, the more conceptually similar, the closer together the relationships were); in the free sorting task13, participants classified the same set of relationships into labelled categories of their choosing.

Using a within-subject design, we recruited 60 US participants to complete three tasks in the laboratory (that is, one dimensional survey and two cognitive tasks) (Fig. 2a). Categorical representations were derived from each task by applying clustering algorithms to the dissimilarity matrix of relationship concepts. Three clusters were found in the dimensional survey, which can be labelled ‘hostile, private and public’ (abbreviated as the HPP model) (for optimal cluster details, see Methods). Relationships in the ‘hostile’ cluster featured people who are antagonistic or have negative feelings towards each other, such as ‘divorced spouses’ and ‘business rivals’. Relationships in the ‘private’ cluster were personal and family ties, such as ‘siblings’ and ‘close friends’. Relationships in the ‘public’ cluster were formal, occupational and impersonal ties, such as ‘driver–passenger’ and ‘employer–employee’. In contrast, clustering on the two cognitive tasks revealed six canonical relationship types: hostile, familial, romantic, affiliative, transactional and power. Text analysis of the labels produced during the free sorting task further revealed that the six canonical types emerged from the three HPP categories (Fig. 2b): while the hostile cluster in the HPP model remained, the private cluster was divided into three distinct classes (familial, romantic and affiliative relationships), and the public cluster was split into two classes (transactional and power relationships). To further clarify the associations between the FAVEE and HPP models, a dimension–category hybrid representation was evaluated where clustering techniques were applied to the relationships in the FAVEE embedding space. Again, three HPP clusters were identified (Fig. 2c), and each was embedded in a unique location in the 5D space: the private and public clusters were located separately at the two ends of the formality dimension, and the relationships that were low on the valence dimension formed the hostile cluster. This implied that HPP categories could originate from the FAVEE dimensions.
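
The hybrid analysis, in which clustering is applied to points in the 5D embedding, can be sketched as follows. This is an illustrative k-means implementation with a deterministic farthest-first initialization, run on synthetic points. The prototype coordinates, the helper names and the initialization scheme are assumptions made for the example and are not taken from the study.

```python
import numpy as np

def farthest_first_centers(points, k):
    """Pick k well-spread starting centers (deterministic farthest-first traversal)."""
    centers = [points[0]]
    for _ in range(k - 1):
        # Distance from every point to its nearest already-chosen center.
        nearest = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[nearest.argmax()])
    return np.array(centers)

def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm: alternate nearest-center assignment and center updates."""
    centers = farthest_first_centers(points, k)
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Keep the old center if a cluster happens to be empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

rng = np.random.default_rng(1)
# Hypothetical prototypes; columns are formality, activeness, valence, exchange, equality.
prototypes = {
    "private": [-3.0, 0.0, 1.0, 0.0, 0.0],   # informal end of the formality axis
    "public": [3.0, 0.0, 1.0, 0.0, 0.0],     # formal end of the formality axis
    "hostile": [0.0, 0.0, -4.0, 0.0, 0.0],   # low valence
}
points = np.vstack([
    np.array(p) + 0.3 * rng.normal(size=(53, 5)) for p in prototypes.values()
])

labels, centers = kmeans(points, k=3)
print(np.bincount(labels))
```

Because the synthetic groups are separated along the formality and valence axes, the recovered clusters mirror the layout described in the text; real ratings would be far less cleanly separated.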

Fig. 2: Categorical and dimensional representations of human relationships.

a, Three behavioural tasks and their corresponding categories (via UMAP and k-means clustering). The dimensional survey indicated three high-order categories (hostile, private and public, abbreviated as HPP), whereas both cognitive tasks identified six canonical relationship types. b, Text analysis on categorical labels in the free sorting task revealed the label hierarchy: the hierarchical clustering algorithm first derived three HPP categories; next, the ‘public’ cluster was further split into ‘transactional’ and ‘power’, and the ‘private’ cluster was subdivided into ‘family’, ‘romantic’ and ‘affiliative’. The word clouds in the treemap display label names in each cluster. The radar plots illustrate the average scores on the FAVEE dimensions across relationships in each category. c, A dimension–category hybrid model was evaluated by applying k-means clustering to the 5D FAVEE embedding space. Three HPP clusters were found, suggesting that HPP could emerge from the FAVEE space. d, The three building blocks of the relationship concept system and their internal associations. The flow chart illustrates how people represent relationships via the joint framework of dimensions and categories. Five FAVEE dimensions can be understood as filters (for example, all relationships can be filtered into hostile and solidarity relationships via the valence dimension), and relationship categories are formed by uneven distributions projected on dimensions (see the details in Supplementary Fig. 14). Our data suggest that six canonical relationship types (in the grey circles) originate from three HPP categories (in the dashed circles), which inherently emerge from the 5D FAVEE space (arrows).

In sum, Study 1 revealed that when people think about social relationships, they attend to five key features. We demonstrated that all relationship concepts are mentally represented in a high-order FAVEE space, and the conceptual similarity between each pair of relationships can be represented as the distance in the 5D space. Once the spatial proximity among relationships is close enough along certain featural dimensions, they can be self-clustered into meaningful categories (for example, three HPP clusters or six canonical types). Relationship categories thus emerge from uneven distributions along the FAVEE dimensions, and relationship taxonomies can be understood as discrete sets of categories living in a continuous multidimensional space (see an illustrative flow chart in Fig. 2d).

Study 2: universality and variability across modern cultures

All human cultures have rich vocabularies devoted to describing human relationships. Translation dictionaries, for example, suggest that the English word ‘neighbours’ can be equated with the Chinese word ‘邻居’ and the Hebrew word ‘שכנים’. However, does this mean that the concept of ‘neighbours’ is the same in the USA, China and Israel? In Study 2, we explored this question by examining representations of relationship concepts across 19 global regions and 10 languages. We aimed to reveal the cross-cultural similarity and differences and their underlying cultural mechanisms.

Study 2 was preregistered on the Open Science Framework (https://osf.io/swr2c) on 13 June 2022. We report deviations from preregistration in Supplementary Method 3. A large sample of online participants (n = 17,686) were recruited from 19 global regions with diverse ecological (geography, climate and subsistence), biological (genetics and disease prevalence) and sociocultural backgrounds (language, ethnicity, education, religion, politics, wealth and urbanization) (see Supplementary Fig. 13 for the details). The dimensional survey approach was adopted due to its higher within-culture stability over cognitive tasks (Supplementary Fig. 2). For each region, three types of representational geometries were generated on the basis of representational dissimilarity matrices (RDMs)14,15: the full-feature model (that is, RDMs based on the original data on all evaluative features without applying any dimensionality reduction or clustering techniques), a dimensional model (that is, RDMs based on FAVEE) and a categorical model (that is, RDMs based on HPP). The degree of cross-cultural concordance in relationship concepts was assessed on the basis of these region-specific representational geometry models.
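
As a rough illustration of the three representational geometries, the sketch below builds one RDM per model from synthetic inputs. Euclidean distance for the feature-based models and a binary same/different coding for the categorical model are assumptions chosen for the example, since the distance measures are not specified in this passage. All function and variable names are hypothetical.

```python
import numpy as np

def feature_rdm(x):
    """Pairwise Euclidean distances between rows (relationships) of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def category_rdm(labels):
    """0 where two relationships share a category, 1 where they differ."""
    labels = np.asarray(labels)
    return (labels[:, None] != labels[None, :]).astype(float)

def upper_triangle(m):
    """Vectorize the unique off-diagonal entries of a symmetric RDM."""
    rows, cols = np.triu_indices(m.shape[0], k=1)
    return m[rows, cols]

rng = np.random.default_rng(2)
n_rel = 12  # small synthetic example; the study used 159 relationships
full_features = rng.normal(size=(n_rel, 30))   # stand-in for all evaluative features
favee_scores = rng.normal(size=(n_rel, 5))     # stand-in for FAVEE dimension scores
hpp_labels = rng.integers(0, 3, size=n_rel)    # stand-in for HPP category labels

rdms = {
    "full": feature_rdm(full_features),
    "dimensional": feature_rdm(favee_scores),
    "categorical": category_rdm(hpp_labels),
}
for name, m in rdms.items():
    print(name, m.shape, upper_triangle(m).size)
```

Only the upper triangle of each RDM is used in later comparisons because the matrices are symmetric with a zero diagonal.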

Consistent with Study 1, we identified the 5D FAVEE space and three HPP categories in both globally aggregated data (Extended Data Fig. 2) and regional data (Supplementary Figs. 4 and 5). Using leave-one-region-out cross-validation, each held-out region’s representational geometries were accurately predicted by the data aggregated across the remaining regions (Fig. 3a). The ability of the FAVEE and HPP models to consistently predict relationship representations across regions suggests that they might be universal structures of relationship concepts that can be generalized across the world. In addition, to examine how well the five FAVEE dimensions represent all theoretical relationship features, we performed model comparison analysis between the FAVEE model and other existing theories. We found that the FAVEE model outperformed 15 other theories in data fitting and explained variance across global regions (Extended Data Fig. 3). Therefore, although past theories all attempted to reduce numerous relationship features into fewer components, FAVEE was the most representative, parsimonious and cross-culturally consistent of the models compared.
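
A leave-one-region-out comparison of this kind can be sketched as below. The example correlates each held-out region's RDM with the mean RDM of all other regions. Pearson correlation, Euclidean RDMs, the synthetic data and every name in the code are assumptions for illustration rather than the authors' exact procedure.

```python
import numpy as np

def rdm(x):
    """Pairwise Euclidean distances between rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def upper_triangle(m):
    rows, cols = np.triu_indices(m.shape[0], k=1)
    return m[rows, cols]

def leave_one_region_out(region_rdms):
    """Correlate each region's RDM with the average RDM of the remaining regions."""
    names = list(region_rdms)
    result = {}
    for held_out in names:
        others = np.mean([region_rdms[r] for r in names if r != held_out], axis=0)
        r = np.corrcoef(upper_triangle(region_rdms[held_out]),
                        upper_triangle(others))[0, 1]
        result[held_out] = float(r)
    return result

rng = np.random.default_rng(3)
# Synthetic assumption: every region shares one 5D layout of 20 relationships,
# perturbed by region-specific noise.
shared = rng.normal(size=(20, 5))
region_rdms = {
    f"region_{i}": rdm(shared + 0.2 * rng.normal(size=shared.shape))
    for i in range(6)
}

scores = leave_one_region_out(region_rdms)
for name, r in scores.items():
    print(name, round(r, 3))
```

High correlations for every held-out region are what a shared structure would produce; region-specific structure would lower them.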

Fig. 3: Universality and cultural variability of relationship representational geometries.

a, Three representational geometries were derived from the full-feature model (all evaluative features), the dimensional model (FAVEE) and the categorical model (HPP). Using leave-one-region-out cross-validation, high similarities of representational geometries (all r > 0.687) were observed across 19 global regions (black areas on the world map), suggesting that FAVEE dimensions and HPP categories are commonly shared across the world. The box plots show the median (horizontal line) and the interquartile range (IQR) ± 1.5 × IQR. Here the full-feature model represents the noise ceiling of cross-region representational similarity. b, Computational modelling of representational geometries using RSA multiple regression. A set of ecological, biological and sociocultural RDMs were fit to predict the cross-regional relationship RDM, and religion and modernization level were the only two factors that could significantly predict the cross-regional variability of representational geometries. The asterisks indicate significant regression coefficients according to the Mantel test, and the dashed lines indicate noise ceilings (Methods). c, Significant correlations between cross-region pairwise dissimilarity of representational geometry and cross-region pairwise dissimilarity in religion and modernization level. The shaded area represents the 95% confidence interval. *P < 0.05; **P < 0.01; ***P < 0.001; one-sided permutation tests.

Although the basic organization of relationship concepts was found to be globally shared, there was also rich cultural variation. For example, people around the world seemed to have a different understanding of public relationships but held similar views on familial and romantic relationships (Extended Data Fig. 4). To further explore these findings, we implemented representational similarity analysis (RSA) to quantitatively model the cross-region variability of representational geometries on the basis of regressions of a variety of ecological, biological and sociocultural variables (Fig. 3b). Religion and modernization were the only two factors that significantly predicted cross-region variability in representational geometries (see the detailed statistics in Extended Data Table 1), and regions with similar religions and modernization levels were found to have similar representational geometries of relationships (Fig. 3c). Here modernization refers to a composite metric based on the education, urbanization and wealth of a country16, and religion estimates the percentages of adherents of 21 religious denominations (Supplementary Table 3). Follow-up RSAs revealed that the two factors exerted predictive power on distinct dimensions and categories (Supplementary Fig. 7).
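
The RSA regression with a Mantel-style permutation test can be sketched as follows. The example regresses a vectorized cross-region RDM on several predictor RDMs and builds a one-sided null distribution by jointly permuting the region labels of the target. The synthetic predictors, the effect sizes, the number of permutations and all function names are assumptions; the study's exact settings are given in its Methods.

```python
import numpy as np

def pairwise(x):
    """Euclidean distance matrix between regions described by x."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def upper_triangle(m):
    rows, cols = np.triu_indices(m.shape[0], k=1)
    return m[rows, cols]

def zscore(v):
    return (v - v.mean()) / v.std()

def rsa_betas(target, predictors):
    """Standardized regression coefficients of the target RDM on predictor RDMs."""
    y = zscore(upper_triangle(target))
    cols = [zscore(upper_triangle(p)) for p in predictors]
    design = np.column_stack([np.ones_like(y)] + cols)
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1:]  # drop the intercept

def permutation_pvalues(target, predictors, n_perm=499, seed=0):
    """One-sided p-values from permuting rows and columns of the target together."""
    rng = np.random.default_rng(seed)
    observed = rsa_betas(target, predictors)
    count = np.ones_like(observed)  # the observed statistic counts once
    n = target.shape[0]
    for _ in range(n_perm):
        perm = rng.permutation(n)
        shuffled = target[np.ix_(perm, perm)]
        count += rsa_betas(shuffled, predictors) >= observed
    return observed, count / (n_perm + 1)

rng = np.random.default_rng(4)
n_regions = 19
religion = pairwise(rng.normal(size=(n_regions, 3)))   # synthetic religion profiles
modernization = pairwise(rng.normal(size=n_regions))   # synthetic composite index
climate = pairwise(rng.normal(size=n_regions))         # synthetic nuisance predictor
noise = rng.normal(size=(n_regions, n_regions))
noise = (noise + noise.T) / 2
np.fill_diagonal(noise, 0.0)
# By construction, only religion and modernization drive the target RDM.
target = religion + modernization + 0.1 * noise

betas, pvals = permutation_pvalues(target, [religion, modernization, climate])
for name, b, p in zip(["religion", "modernization", "climate"], betas, pvals):
    print(name, round(float(b), 3), round(float(p), 3))
```

Permuting rows and columns together preserves the dependence among entries that share a region, which is why a permutation test is preferred over a standard regression test for distance matrices.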

To further delineate the fine-grained cultural differences, we collected additional data in China (n = 6,128) (Supplementary Fig. 8) and directly compared it with the USA at a finer scale (Fig. 4). To rule out the effects of language and translation, two rounds of data collection were conducted. In the first round, 159 relationships directly translated from the US relationship list were adopted. In the second round, a new list of 258 relationships was generated by Chinese NLP algorithms (see the details in Supplementary Method 1), which included numerous Chinese-unique relationships (that is, some cannot be translated linguistically, and others are culture-specific; see the full list in Supplementary Table 4). Our analysis revealed high concordance between the dataset of directly translated relationships and the dataset generated via Chinese NLP algorithms (all r > 0.622, all P < 0.001; Supplementary Fig. 8), suggesting that our results were not driven by language or translation. The direct comparisons between the USA and China yielded further findings. We found that, when judging closeness in human relationships, Americans seemed to focus more on physical distance, whereas Chinese participants focused on psychological distance (Fig. 4c). For example, Americans considered ancestor–descendant to be a distant relationship because the two sides are separated by an effectively infinite physical distance. Chinese participants rated this relationship as less distant, possibly owing to ancestor veneration in the foundational philosophy of Confucianism (for example, high spiritual intimacy with ancestors). When judging power in human relationships, individuals in China held stronger stereotypes of inequality among family members (for example, uncle–nephew; Fig. 4d), which is consistent with the Confucian ideal of filial piety. When evaluating social exchange in private relationships, Americans seemed to experience more concrete resource exchanges than Chinese participants, which could be associated with their higher modernization level or foundational values linked to capitalism (Fig. 4e). For example, long-distance lovers in the USA often buy gifts such as flowers for each other, whereas symbolic exchanges, such as long telephone calls, were typically observed in Chinese long-distance partners. Together, these subtle cultural differences in relationship conceptualization appeared to be closely tied to USA‒China differences in religion and modernization level.

Fig. 4: Comparisons between the USA and China.

a, Similar PCA loadings for the two countries. b, Correlation of relationship scores between the two countries. The high concordance of PCA loadings (all r > 0.707) and relationship scores (all r > 0.704) suggests a common FAVEE space in the two countries. c, USA versus China on the conceptualization of close relationships. PCA scores on psychological distance dimensions (for example, attachment, love expression and intimacy) and physical distance dimensions (for example, spatial distance and synchronicity) were computed for the 30 most distant relationships in each country. Americans weighted physical distance more when judging close relationships, whereas the Chinese considered both psychological and physical distance. For example, the Chinese rated ancestor–descendant as less distant than Americans did, possibly due to ancestor worship (that is, close psychological distance). d, USA versus China on the conceptualization of power relationships. While both groups shared similar views on equality in occupational relationships, the Chinese judged familial relationships as more unequal than Americans did (for example, uncle–nephew). e, USA versus China on the conceptualization of social exchange in private and public relationships. While both countries believed that public relationships mainly exchange concrete resources (for example, money, goods and services) and private relationships exchange symbolic resources (for example, love, advice and information), Americans experienced more concrete resource exchanges in private relationships than the Chinese. For example, long-distance lovers in the USA send gifts to each other regularly, whereas long-distance lovers in China spend long hours on telephone chats instead. All box plots show the median (horizontal line) and the 25th and 75th quantiles (box edges), and the whiskers extend to the most extreme data points within 1.5 times the IQR. Cultural differences were compared using analysis of variance and two-tailed t-tests.

Finally, as all 19 global regions were industrial societies, we validated the FAVEE-HPP model in a non-industrial society: the Chinese Mosuo tribe, a small-scale matrilineal society living near Lugu Lake in the Tibetan Himalayas. As a traditional agrarian society, the Mosuo society is distinct from industrialized societies in social organization, economic system, language, beliefs and lifestyle (see key features of the Mosuo society in Extended Data Fig. 5). Field research data from 229 native Mosuo people indicated that the Mosuo understanding of relationships still conforms to the FAVEE-HPP structures. Highly similar representational geometries were observed between the Mosuo, Chinese Han and other industrial societies in the world (all r > 0.600, all P < 0.001; Extended Data Fig. 5b,d). This confirmed that people from non-industrial and industrial societies share the same set of conceptual structures for relationships.

Study 2 demonstrated that relationship concepts reside in a universal, low-dimensional space shared by people around the world. While many concepts are similarly positioned in FAVEE-HPP space regardless of culture, other concepts exhibit significant cultural variability (see Extended Data Fig. 6 for conceptual differences of ‘neighbours’ in the USA, China and Israel as a vivid example). This variation was found to be tied to religion and modernization differences between regions.

Study 3: relationship representations in ancient cultures

Study 1 investigated how human relationships are mentally represented and discovered the FAVEE-HPP structures. Study 2 examined where in the world the FAVEE-HPP model applies and showed its generalizability to diverse global regions. In Study 3, we explored when in history this model applies. Studies 1 and 2 examined only contemporary societies, which are far from representative of all cultures. An investigation of ancient cultures helps verify the persistence of the FAVEE-HPP model through time.

We employed state-of-the-art NLP techniques to capture ancient people’s perception and comprehension of human relationships. This involved analysing large-scale text corpora sourced from historical archives, enabling us to gain insights into their conceptualization of relationships. Analysing texts can offer a unique window into human psychology17. Prior research has suggested that word embeddings (representations) reflect the ways people understand concepts such as object knowledge18, personality traits19 and mental states20. The advent of pretrained language models (PLMs) and large language models (LLMs) has revolutionized the tools available for analysing texts on a massive scale21. Probing language models pretrained on Chinese historical text corpora thus allows us to query relationship understanding from people in ancient China (for example, the Qin Dynasty, 221 BCE), populations otherwise inaccessible to modern researchers.

We conducted an initial investigation to examine whether language models can generate human-like relationship understanding (Fig. 5a). This was achieved by employing an approach that combines PLMs, as proposed by Cutler and Condon19, with the use of LLMs such as GPT-4 (see the details in Methods). Specifically, we designed the following query (in Chinese) as the input for the pretrained model:

‘[DESC] The most salient feature of the relationship [TERM] is [MASK].’

where the [TERM] token was substituted with one of the 258 Chinese relationship terms, while the [MASK] token represented the conceptualization of the target relationship. During the pretraining, the language model used contextualized embeddings of the [MASK] token to predict the most probable words to occur in that position, given the contexts. To enrich the contextual information, we incorporated [DESC], which denotes relationship-specific descriptions generated by a state-of-the-art LLM, GPT-4. These descriptions played a pivotal role in establishing a contextual framework for the subsequent representations of relationships by the language model. After systematic testing with different query types, token positions and embedding layers of the language model (Supplementary Fig. 9), we were able to identify optimal PLM representations highly resembling human relationship representations (r = 0.553, P < 0.001; Fig. 5a). Critically, PCA on PLM representations generated components (Fig. 5b) corresponding well with the FAVEE structures (all r > 0.470, all P < 0.001; Fig. 5c in purple).
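
The query construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the example relationship terms and the example descriptions are all hypothetical, the template is given in English rather than Chinese, and the step of feeding the query to a masked language model and reading out the hidden state at the [MASK] position is deliberately left out.

```python
# Hypothetical sketch of the query template; names and example texts are
# illustrative assumptions and do not come from the paper's materials.
QUERY_TEMPLATE = "{desc} The most salient feature of the relationship {term} is [MASK]."


def build_query(term: str, desc: str) -> str:
    """Fill the [DESC] and [TERM] slots, leaving [MASK] for the language model."""
    return QUERY_TEMPLATE.format(desc=desc.strip(), term=term.strip())


def build_queries(descriptions: dict[str, str]) -> dict[str, str]:
    """Build one query per relationship term from its description."""
    return {term: build_query(term, desc) for term, desc in descriptions.items()}


def mask_token_index(tokens: list[str], mask: str = "[MASK]") -> int:
    """Position at which a masked language model's hidden state would be read out."""
    return tokens.index(mask)


# Toy stand-ins for the LLM-generated relationship descriptions.
example_descriptions = {
    "teacher-student": "A teacher guides a student's learning.",
    "seller-buyer": "A seller trades goods with a buyer for money.",
}
queries = build_queries(example_descriptions)
```

In a full pipeline, each query would be tokenized by the model's own tokenizer, and the vector at the [MASK] position would be stacked across relationship terms to form the embedding matrix.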

Fig. 5: Validation of the FAVEE-HPP model in modern and ancient China using language models.

a, Pipeline for generating PLM embeddings. The query (dashed box) was formulated as ‘[DESC] The most salient feature of the relationship [TERM] is [MASK]’, where the [DESC] component was generated by GPT-4 (with the prompt in the grey box) and the [TERM] component was replaced with one of the 258 relationships. The [MASK] component was filled by the PLM with words that are most likely to occur there, given the context. The last layer vector (768 dimensions) was extracted on the position of [MASK] for each of 258 relationships, resulting in a 258 × 768 matrix. The resemblance of human representations (that is, the similarity matrix of 258 relationships from Chinese populations in Study 2) and PLM representations (pretrained on a modern Chinese corpus) was 0.553. b, PCA on modern PLM embeddings. By correlating human ratings on 33 dimensional features with the first seven principal components of PLM embeddings, we found that PLM representations (V1–V4 and V6) had captured corresponding features of FAVEE structures. c, Correspondence between PCA on human ratings and PCA on modern PLM embeddings (in purple) and ancient PLM embeddings (in green). d, Generalizability of the FAVEE-HPP model in ancient and modern times. RSA correlation on RDMs suggested that the FAVEE dimensions and HPP categories can significantly predict relationship representations in historical (as reflected by ancient PLM embeddings) and modern times (as reflected by modern PLM embeddings). The box plots indicate RSA correlation distributions at chance level by permutation, and the dashed lines indicate noise ceilings. e, Model comparison analysis suggested that the FAVEE model outperformed 15 other existing theories in predicting ancient and modern PLM embeddings. Broadly, FAVEE was significantly better than any random combination of theoretical features (the null distribution is indicated by the box plots and density plots; modern: P < 0.001, ancient: P = 0.006). f, Ancient–modern differences. 
Using linear combinations of the five FAVEE components as regressors to predict ancient and modern PLM embeddings, we found substantial changes on ‘formality’ and ‘equality’ for their explained variance in ancient and modern PLM representations, suggesting that these two dimensions may contribute differently to relationship conceptualization across time. ***P < 0.001; one-sided permutation tests. All box plots show the median (horizontal line) and the interquartile range (IQR) ± 1.5 × IQR.

Since we had confirmed that PLM embeddings could reflect human-like relationship understanding, we harnessed the ancient PLM as a proxy of the ancient human mind and sought evidence of FAVEE-HPP structures in an ancient language model pretrained on a comprehensive compilation of historical Chinese texts ranging from the Zhou Dynasty (~1046 BCE) to the Qing Dynasty (1912 CE)22. For more accurate historical context, we first prompted GPT-4 to describe the relationships within the context of ancient China. We then recruited human experts in ancient Chinese language, literature and history to manually refine the descriptions and express them in Classical Chinese. This ensured that the [DESC] component effectively matched the linguistic features and relationship characteristics of the ancient era (Supplementary Method 2). These experts also carefully selected 120 relationships that existed in ancient China (Supplementary Table 6). As expected, FAVEE structures could be identified after applying PCA on ancient PLM embeddings (all r > 0.287, all P < 0.001; Fig. 5c in green). Next, if the FAVEE-HPP model can capture relationship representations in history, then the relationships that are closer to each other within FAVEE-HPP space should be represented by vectors that are closer to each other in ancient PLM embeddings. Indeed, for both FAVEE dimensions and HPP categories, we found significant correlations between RDMs in human ratings and RDMs in ancient PLM embeddings (Fig. 5d). Model comparison analysis suggested that the FAVEE model outperformed other theoretical models in predicting ancient and modern PLM embeddings (Fig. 5e). To further reveal the difference between ancient and modern China, we evaluated the relative contribution of each FAVEE dimension when predicting relationship representations in ancient and modern PLMs (Fig. 5f). 
We found that ‘formality’ explained more variance in modern than in ancient times (modern, 0.279; ancient, 0.178), whereas ‘equality’ accounted for more variance in ancient than in modern times (modern, 0.148; ancient, 0.243). This suggests that, compared with modern Chinese, ancient Chinese might put more weight on equality features (for example, social hierarchy) but less on formality features (for example, occupations) when understanding relationships.
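
The RDM comparison used in this analysis can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the Euclidean distance metric, the Pearson correlation on the upper triangle, the label-shuffling permutation scheme and the toy data are all assumptions made for illustration, and the paper's Methods should be consulted for the actual choices.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rdm(vectors):
    """Representational dissimilarity matrix (assumed metric: Euclidean distance)."""
    n = len(vectors)
    return [[math.dist(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]


def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries, flattened row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def rsa_correlation(rdm_a, rdm_b):
    """Similarity of two representational geometries."""
    return pearson(upper_triangle(rdm_a), upper_triangle(rdm_b))


def permutation_p_value(rdm_a, rdm_b, n_perm=200, seed=0):
    """One-sided p: share of item-label shuffles scoring at least the observed value."""
    rng = random.Random(seed)
    observed = rsa_correlation(rdm_a, rdm_b)
    n = len(rdm_a)
    count = 0
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)
        shuffled = [[rdm_a[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if rsa_correlation(shuffled, rdm_b) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)


# Toy data: five items in a 2-D "model" space and a scaled copy in a 3-D "embedding" space.
model_space = [[0, 0], [1, 0], [0, 1], [3, 3], [4, 3]]
embedding_space = [[0, 0, 0], [2, 0, 0], [0, 2, 0], [6, 6, 0], [8, 6, 0]]
model_rdm = rdm(model_space)
embedding_rdm = rdm(embedding_space)
similarity = rsa_correlation(model_rdm, embedding_rdm)
p_value = permutation_p_value(model_rdm, embedding_rdm)
```

Because the toy embedding space is an exact rescaling of the model space, the two RDMs are proportional and their correlation is at ceiling; real embeddings would give lower values that are then compared against the permutation distribution.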

We also performed expert validation on the ancient PLM to check whether it had expert-like knowledge on ancient relationships. A group of university scholars (n = 44) were asked to rate all 120 relationships in the context of ancient Chinese culture, and FAVEE-HPP structures can be reliably identified from their ratings (Supplementary Fig. 10). Critically, ancient PLM embeddings showed higher agreement with expert ratings than with non-expert ratings, suggesting that our PLM embeddings did capture scholarly knowledge and insights on how ancient Chinese conceptualized relationships.

Study 3 demonstrated that language models can generate human-like relationship understanding and have expertise in historical contexts. By decomposing PLM embeddings, we can identify FAVEE-HPP structures in ancient and modern representations of relationships. Furthermore, the FAVEE-HPP model outperformed other models across different time periods. These findings highlight the broad and effective generalization of the FAVEE-HPP model from contemporary society to the ancient world, spanning a history of 3,000 years.

Discussion

In the past 50 years, social scientists have sought to understand the nature of human sociality, but there is still no consensus on the elemental forms and overarching organization of human relationships. To help address this long-standing question, the present study examined how conceptual knowledge of relationships is mentally represented and organized. We created a generalized framework that unifies existing theories across multiple disciplines and discovered a set of five dimensions (FAVEE) and three categories (HPP) that scaffold the conceptual space of human relationships (Study 1). Converging evidence suggests that the FAVEE-HPP framework is commonly shared across modern societies (Study 2) and historical cultures (Study 3), and it exceeds existing theories in model performance (Supplementary Fig. 6), consistency across global regions (Extended Data Fig. 3) and endurance over time (Fig. 5e). We also extended the FAVEE framework to non-dyadic relationships (Extended Data Fig. 7) and confirmed its generalizability to triadic relations (for example, love triangle) and group relations (for example, rich–poor, Democrats–Republicans).

As a parsimonious model of human relationships, the FAVEE-HPP framework will inspire theory, experimental design, hypothesis testing and reinterpretations of empirical data in the social sciences. For example, since socialization is hypothesized to be one of the major drivers behind the evolution of cognitive abilities, the FAVEE-HPP framework can be applied to study the link between sociality and cognition4. Specifically, as the human mind has culturally informed, motivationally powered, emotionally imbued and morally guided models of how people think, feel and behave in relationships, the framework provides implications for how human cognitive processes (for example, affects, motives and decisions) are adaptively configured and operated for different interpersonal contexts (see our discussions on the functionality of each FAVEE dimension and HPP category in Extended Data Fig. 8). This could help us understand why humans are able to form and maintain relationships that extend beyond immediate family members and how humans evolved from ‘animal’ (that is, no cooperation, but hostility towards others) to ‘social animal’ (that is, small-scale cooperation based on private relationships) and finally to ‘cultural animal’ (that is, large-scale cooperation based on public relationships)23. From a practical point of view, the FAVEE-HPP model builds a computational framework that can objectively and quantitatively measure human relationships at a high level of granularity. It provides a standard frame of reference that can be used to optimally design, manipulate, control and model interpersonal factors in relationship science, similar to the role of the ‘Big Five’ framework for personality science.

Our research has provided concrete evidence that relationship understanding is both universal and culturally variable. We demonstrated that the global architecture (backbone) of relationship representations (that is, FAVEE and HPP) is universally shared across cultures, but local fine-grained representational geometries (for example, the concept of ‘neighbour’ in Extended Data Fig. 6b), which are malleable by culture, could be quantitatively different. Computational modelling further suggests that the variations among modern societies are associated with religion and modernization, and the major difference between ancient and modern relationship concepts might come from changes in formality (for example, public/private boundaries) and equality (for example, social stratification). The universality and cultural variability of relationship conceptualization have wide-ranging implications for science and society. For example, a detailed delineation and elaboration of cross-cultural similarities and differences in relationships could inform whether human relations in languages, laws, social policies, moral codes and ideologies are equivalent in different countries and eras, whether relationships are expected to have the same impact on health and happiness across cultural groups and generations3, or whether the same artificial intelligence algorithms can be applied to decode interpersonal relations via daily conversations and videos around the globe24. Understanding the role of culture in relationships could also contribute to cross-cultural adaptations of communications, literature, film, art, social networking, dating and marriage25 and could facilitate efforts at cross-cultural diplomacy and commerce in a rapidly globalizing society26.

Since the FAVEE-HPP model situates relationships as a fluid construct that is permitted to freely vary within and between people, future research could investigate how relationship representations are constructed during human development and how we form idiosyncratic impressions on relationships. It has been suggested that humans begin to accrue cognitive heuristics and stereotypes about interpersonal relations at birth27 (for example, stranger danger and respect for elders), and thus relationship dimensions and categories could be gradually built via a combination of explicit instruction, indirect observation and personal experience. For example, infants’ caregivers may introduce and transmit information about human relationships through bedtime stories, and preverbal infants learn basic dimensions such as valence (friends versus foes) and activeness (family versus outsiders). Later, they have social learning opportunities through indirect observation and direct experiences with others and understand new dimensions such as activeness (old versus new friends), equality (for example, teachers–peers) and exchange (for example, seller–buyer). In adulthood, acculturation to a new society involves learning the host culture’s social norms and rules when interacting with local people. In addition, the present work investigated relationship conceptualization at only the cultural and population levels. It is apparent that cognition about relationships is subjective, varied and dynamic at the individual level, and how people think about relationships might vary depending on salient features in the contexts. The FAVEE dimensions and HPP categories could function as cognitive maps to help individuals navigate social environments (such as a ‘relationship compass’) and set standards to determine the satisfaction and stability of a relationship28,29. 
For example, individuals who grew up in a family with challenges and had chronic peer rejection might form negative impressions about familial and affiliative relationships (for example, with negative scores in valence and activeness). Likewise, individuals who had harmonious experiences with employers, clients or co-workers might adopt more positive views on public relationships (for example, with positive scores in valence and equality). The FAVEE-HPP framework establishes relatively objective and quantitative measures of relationships that can be compared across contexts, individuals and groups. Future research could use the framework to develop psychometric tests to measure where an individual lies on the spectrum of each of the five dimensions (like the Big Five personality test) and quantitatively examine how individual differences in relationship representations are linked to interpersonal difficulties in daily life30 and whether relationship representations are abnormal in clinical populations (for example, those with autism or sociopathy).

The present work features replication and generalization. We attempted to extend and improve on prior work by being more comprehensive in several aspects, including preregistering our studies, using high-powered samples, including diverse types of relationships, analysing data with different tools and algorithms, and replicating representational models across different cultures (contemporary industrial societies, ancient societies and matrilineal tribes) and interpersonal contexts (dyadic, triadic and group relations). We also quantified the robustness of all results and showed that a subset of 40 relationships was good enough to replicate all findings based on 159 relationships (Supplementary Fig. 11).

However, our work also has several limitations. First, the mental representations of relationships are an organized body of information that reflects values, rules, concepts, scripts, affects, motives, expectations and memories associated with a relationship. The present work only taps the lay theory (that is, vernacular beliefs), which may differ from the actual organization of relationships in human society31. Future work needs to examine the social acts and interactions across relationships. Second, FAVEE-HPP as the universal structure of relationships is far from conclusive. The present work primarily used online populations and data-driven approaches, which was a double-edged sword. More data and investigations are needed to explore factors or boundary conditions that could influence the stability, validity, representativeness and generalizability of the FAVEE-HPP model. For simplicity and convenience, we chose the acronym FAVEE as the name for our model, but the global data showed that formality is not always the most important dimension. The different ordering of dimensions in different regions requires further investigation as it could reveal interesting cultural differences. Third, the FAVEE-HPP model was decomposed from many theoretical features originating from layperson languages. A more scientifically rigorous approach is needed to create a valid and reliable taxonomy of human relationships. Fourth, due to limited resources of ancient culture experts and high-quality PLMs, Study 3 examined relationship representations only in ancient China. Future research is encouraged to validate the FAVEE-HPP model in other historical contexts (for example, in Hebrew, Greek, Tamil and Old English).

Methods

Participants

All studies in this report were approved by the Institutional Review Board of Beijing Normal University (IRB_A_0024_2021002), and informed consent was obtained from all participants. Study 1 recruited 1,065 online US participants via MTurk and 60 offline US participants. Study 2 was preregistered (https://osf.io/swr2c) and recruited 17,686 online participants across 19 global regions via MTurk, CloudResearch, Credamo and the NaoDao platform32,33. In addition, 229 native Mosuo people were recruited from Yongning Township (Yunnan Province, China), using a field research data collection style (that is, through face-to-face interviews and door-to-door paper surveys). Study 3 recruited 44 scholars specialized in ancient Chinese culture for expert evaluation of the NLP method. Moreover, to test the FAVEE-HPP model in non-dyadic relationships, we recruited 380 online US participants (via MTurk) and 242 online Chinese participants (via the NaoDao platform). Participants across all studies were native speakers who grew up or lived for the longest period of their life in the targeted regions, with diverse demographics (Supplementary Fig. 13). The survey was translated into the local written language, and detailed guidelines for translation can be found at the Open Science Framework website. All participants received monetary compensation after completing the tasks.

Power analysis was performed to predetermine the sample size. To establish a design with adequate statistical power, we conducted a pilot study (n = 721, recruited from MTurk) using the dimensional survey from Wish et al.10. We collected at least 80 participant responses for each relationship on each evaluative feature, and the results of Wish et al.10 were completely replicated (Supplementary Fig. 12). We ran a Monte Carlo simulation test to derive the minimally required responses in each condition to maintain a stable and consistent PCA result. PCA was performed on each subsample (from 2 to 40, with 1,000 iterations for each subsample), and loading scores and relationship scores were compared with the overall dataset using Pearson’s correlation. The simulation results (Supplementary Fig. 12c) indicated that subsamples with ten responses were almost identical to the entire dataset (rating correlation r > 0.95) and thus should be adequate to ensure highly similar derived PCA components (loading score correlation r > 0.90; relationship score correlation r > 0.95).
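
The logic of the subsampling simulation can be sketched as follows. This is a simplified illustration rather than the authors' procedure: it uses synthetic ratings on a single feature, checks only how closely subsample means track full-sample means (the rating correlation), and omits the PCA step in which loadings and relationship scores were compared. All names, the noise level and the iteration count are assumptions.

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def subsample_stability(ratings, k, n_iter=200, seed=0):
    """Mean correlation between item means from k sampled responses and from all responses."""
    rng = random.Random(seed)
    full_means = [statistics.fmean(item) for item in ratings]
    correlations = []
    for _ in range(n_iter):
        # Draw k responses per item without replacement and average them.
        sub_means = [statistics.fmean(rng.sample(item, k)) for item in ratings]
        correlations.append(pearson(sub_means, full_means))
    return statistics.fmean(correlations)


# Synthetic data: 30 relationships, 80 noisy responses each (assumed values).
data_rng = random.Random(42)
true_scores = [data_rng.uniform(-1, 1) for _ in range(30)]
ratings = [[score + data_rng.gauss(0, 0.5) for _ in range(80)] for score in true_scores]

stability = {k: subsample_stability(ratings, k) for k in (2, 10, 40)}
```

Plotting `stability` against the subsample size gives a curve of the kind used to choose the minimum number of responses per condition; the threshold at which the curve is judged "stable" is a design decision.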

Sampling of human relationships

A data-driven approach based on NLP was used to generate a comprehensive list of human relationships (see Supplementary Method 1 for the details). Seed words were created via brainstorming and social media searches by a set of participants (n = 15 for the USA and n = 27 for China). Text embedding was used to find high-co-occurrence words relating to seed words by calculating the cosine distance between word vectors. The list of words was filtered to leave only nouns. Next, the list was filtered for frequency and was manually checked to keep only words related to human relationships. Finally, we paired the words on the basis of the meaning of relationships and added relationships that were pulled from the literature, resulting in the final relationships word list (159 for the USA and 258 for China). See further methodological details in Supplementary Figs. 1 and 8 and the full list of 159 English relationships and 258 Chinese relationships in Supplementary Tables 2 and 4.
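
The seed-word expansion step can be illustrated with a toy cosine-similarity search. The three-dimensional vectors and the vocabulary below are invented for illustration; the actual pipeline would use pretrained word embeddings and the additional noun, frequency and manual filters described above.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def nearest_words(seed, embeddings, top_k=3):
    """Return the top_k words whose vectors are most similar to the seed word's vector."""
    seed_vec = embeddings[seed]
    scored = [
        (word, cosine_similarity(seed_vec, vec))
        for word, vec in embeddings.items()
        if word != seed
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in scored[:top_k]]


# Toy embeddings (made-up values); real word vectors have hundreds of dimensions.
toy_embeddings = {
    "friend": [0.9, 0.1, 0.0],
    "buddy": [0.8, 0.2, 0.1],
    "colleague": [0.5, 0.6, 0.1],
    "table": [0.0, 0.1, 0.9],
}
candidates = nearest_words("friend", toy_embeddings, top_k=2)
```

Ranking by cosine similarity is equivalent to ranking by ascending cosine distance, since the distance is one minus the similarity.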

Evaluative features

A comprehensive literature search was performed to find all relevant theories and models that were proposed to explore the basic forms of human relationships. Thirty conceptual features were summarized and extracted from 15 prominent theories in Study 1. Redundant features were combined across theories (see Extended Data Fig. 1 and Supplementary Table 1 for the details). Note that many of these theoretical features were originally derived from dimensionality reduction or clustering techniques, but here they were prepared to be further reduced into higher-order components. Study 2 added three extra theoretical features (morality, trust and generation gap) from the cross-cultural literature34,35,36, so a total of 33 features were evaluated.

Dimensional survey

The participants completed an online survey where they rated human relationships on bipolar Likert scales. At the top of each page, the participants were cued to rate relationships on a given evaluative feature (for example, activeness), along with two phrases on opposite ends of a presented slider bar (for example, passive versus active). These two phrases represented the opposite ends of the bipolar features. Participants moved the slider towards the phrase that they felt best related to the presented relationships. Since certain features were quite obscure (for example, communality and reciprocity), we presented each feature with a detailed definition plus an exemplary relationship in the survey (Supplementary Table 1). Once the participants confirmed their understanding of each feature, they moved to the rating part. The participants were asked to consider all aspects of the relationships, including the way the individuals in each relationship typically think and feel about each other, how they act and react towards each other, how they talk and listen to each other, and any other characteristics of the relationships that occurred to them. The participants were instructed to focus not on their personal experiences with a specific relationship but rather on their general knowledge (that is, common sense or stereotypical understanding) about such relationships. Attention-check questions were used to ensure that the online participants were actively engaged in the survey and not answering questions in specific patterns or answering randomly. To avoid potential fatigue and inattentiveness, a between-subject design was used for all online participants to keep the survey short and effective (~20 min). Each participant was randomly assigned to a subset of relationships (for example, five to eight relationships) and had to rate them on a subset of evaluative features (for example, 10–11 features). 
To replicate the results from the between-subject design, a within-subject design was adopted for offline participants in Study 1, where each participant was asked to rate all relationships on all features in the laboratory (which took them three hours to complete). To rule out the effects of cross-cultural variations in online data quality and general semantic knowledge, the participants were asked additional questions on the size and colour of common objects (for example, animals, fruits, vehicles, tools and outdoor scenes). We found very low cross-regional variations in this object knowledge (pairwise correlations were >0.991), and there is no evidence that it can predict the cross-regional variation in relationship understanding (all P > 0.352; Extended Data Table 1). The cultural variability reported in Study 2 thus seems to be unique to relationship concepts, not merely arising from the variability of general semantic knowledge or data quality differences across global regions.
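The between-subject assignment can be sketched as a random draw per participant. This is only an illustration: the function name `assign_subset`, the placeholder item names and the subset sizes (six relationships, ten features) are assumptions chosen from within the ranges quoted above, not the survey platform's actual logic.

```python
import random

def assign_subset(relationships, features, n_rel, n_feat, rng):
    """Draw one participant's random subset of relationships and features."""
    return rng.sample(relationships, n_rel), rng.sample(features, n_feat)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
relationships = [f"relationship_{i}" for i in range(159)]  # placeholder names
features = [f"feature_{j}" for j in range(30)]             # placeholder names
rels, feats = assign_subset(relationships, features, n_rel=6, n_feat=10, rng=rng)
```

Because `random.Random.sample` draws without replacement, no participant rates the same relationship or feature twice.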

Cognitive tasks

Along with the dimensional survey, two laboratory cognitive tasks were implemented to measure the categorization of relationship concepts. The multi-arrangement task is a behavioural paradigm to collect intuitive similarity judgements on semantic concepts12. The participants were asked to ‘arrange the 159 relationships according to their similarity’ in a 2D circle on a computer screen via mouse drag-and-drop so that similar relationships were placed close together, and dissimilar ones were placed further apart. The free sorting task asks participants to deliberately classify the 159 relationships into labelled categories13. They were allowed to make as many groupings as they liked (up to eight). Both tasks were conducted via the Meadows platform and the Naodao platform.

Text analysis was performed on the category labels assigned by participants in the free sorting task. Initially, 444 labels were obtained, and each label was coded against the 159 relationships, yielding a 444 × 159 binary matrix. For example, the ‘family’ label was assigned to ‘wife–husband’ but not ‘doctor–patient’, so the former was coded as 1 and the latter as 0. Hierarchical clustering (the Ward method) was performed on the label × relationship matrix. After a noisy cluster containing miscellaneous labels was excluded, three and six clusters were observed on the remaining 292 labels.
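The binary coding of labels against relationships can be sketched as follows. The label sets and the helper `code_labels` are hypothetical; only the 1/0 coding rule comes from the text.

```python
def code_labels(label_assignments, relationships):
    """Build a binary label x relationship matrix (1 = label applied)."""
    labels = sorted(label_assignments)
    matrix = [
        [1 if rel in label_assignments[label] else 0 for rel in relationships]
        for label in labels
    ]
    return labels, matrix

relationships = ["wife-husband", "doctor-patient", "siblings"]
label_assignments = {  # hypothetical participant labels
    "family": {"wife-husband", "siblings"},
    "work": {"doctor-patient"},
}
labels, matrix = code_labels(label_assignments, relationships)
```

Each row of `matrix` is one label's profile across relationships. Rows of this kind could then be passed to a Ward-linkage routine, for example `scipy.cluster.hierarchy.linkage` with `method="ward"`, although the software used for that step is not stated here.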

Dimensionality reduction and clustering

Python (v.3.9.1) was used to clean and organize all data. Participants who failed the attention check were excluded from the analysis (the data exclusion criteria can be found in the preregistration at https://osf.io/swr2c). On the basis of this criterion, 129 participants (out of 721; 17.89%) were excluded in the pilot study, 248 participants (out of 1,065; 23.29%) were excluded in Study 1 and 2,441 participants (out of 18,537; 13.17%) were excluded in Study 2. Before applying any dimensionality reduction or clustering, we created a relationship × feature matrix by averaging, across participants, the ratings of each relationship on each evaluative feature. This matrix was normalized using the preprocessing module of the scikit-learn package (v.1.4.2).
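A minimal sketch of the aggregation and normalization step is given below. It assumes that normalization means column-wise z-scoring with the population standard deviation (the behaviour of scikit-learn's `StandardScaler`); the text does not say which preprocessing routine was used, so this choice, the helper names and the toy ratings are all assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev

def mean_rating_matrix(ratings, relationships, features):
    """Average ratings per (relationship, feature) cell across participants."""
    cells = defaultdict(list)
    for relationship, feature, value in ratings:
        cells[(relationship, feature)].append(value)
    return [[mean(cells[(r, f)]) for f in features] for r in relationships]

def zscore_columns(matrix):
    """Standardize each feature column to mean 0 and unit (population) SD."""
    columns = list(zip(*matrix))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return [[(x - m) / s for x, (m, s) in zip(row, stats)] for row in matrix]

ratings = [  # (relationship, feature, rating) rows; hypothetical values
    ("friends", "activeness", 80), ("friends", "activeness", 60),
    ("strangers", "activeness", 20), ("strangers", "activeness", 40),
    ("friends", "formality", 10), ("strangers", "formality", 50),
]
matrix = mean_rating_matrix(ratings, ["friends", "strangers"],
                            ["activeness", "formality"])
normalized = zscore_columns(matrix)
```

Standardizing by column puts every evaluative feature on the same scale, so that no single feature dominates the distances used in later steps.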

PCA was adopted as the primary dimensionality reduction technique to derive all dimensional models (using the prcomp function from R (v.4.3.3)). A varimax rotation was used for individual evaluative features to load maximally onto the components. Since the PCA does not provide labels for the components, we named the components by considering both the top five highest loadings (absolute value) and the distribution of relationship scores. To determine the optimal number of PCA components, we checked four data-driven metrics (that is, parallel analysis, the Kaiser–Guttman rule, Cattell’s scree test and optimal coordinates) and examined the interpretability of each component (Extended Data Fig. 2a). Solutions with cross-metrics agreement and high interpretability were chosen. We also implemented other dimensionality reduction techniques to validate the PCA results (see Supplementary Fig. 3 for the details), and five identical components were observed using independent component analysis, exploratory factor analysis, multidimensional scaling and network analysis.
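The core of this pipeline (PCA, the Kaiser–Guttman rule and a varimax rotation) can be sketched in NumPy. This is an illustrative re-implementation under stated assumptions, not the R `prcomp` workflow used in the study: the data are synthetic, PCA is run on the correlation matrix, and the function names are hypothetical.

```python
import numpy as np

def pca_loadings(X, n_components):
    """PCA on the correlation matrix; returns eigenvalues and loadings."""
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]            # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :n_components] * np.sqrt(eigvals[:n_components])
    return eigvals, loadings

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a features x components loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.diag((rotated ** 2).sum(axis=0))
        gradient = loadings.T @ (rotated ** 3 - rotated @ column_ss / p)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt                        # nearest orthogonal matrix
        if s.sum() < criterion * (1 + tol):      # criterion stopped improving
            break
        criterion = s.sum()
    return loadings @ rotation

rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 2))                # two hypothetical latent dimensions
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.3 * rng.normal(size=(50, 6))  # 50 relationships x 6 features
eigvals, loadings = pca_loadings(X, n_components=2)
n_kaiser = int((eigvals > 1).sum())              # Kaiser-Guttman rule
rotated = varimax(loadings)
```

Because the rotation is orthogonal, each feature's communality (its row sum of squared loadings) is unchanged; only the distribution of loadings across components shifts, which is what makes rotated components easier to name.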

We adopted k-means clustering as the primary clustering technique to derive categorical models, although other clustering techniques (such as hierarchical clustering and HDBSCAN) were also conducted to validate the k-means results. A dissimilarity matrix of 159 relationships was prepared as input. For the dimensional survey, the Euclidean distance matrix was calculated using relationships’ ratings on all evaluative features. For the multi-arrangement task, the distance matrix was retrieved from the original data, in which the value indicated two relationships’ distances in the 2D circle. For the free sorting task, the distance matrix was calculated on the basis of the probability that two relationships were classified in the same category. Uniform manifold approximation and projection (UMAP) was used as a preprocessing step to boost the performance of k-means clustering, given that this method is flexible and powerful in finding and balancing the local and global structure of the data. Two UMAP parameters were manually set: the nearest neighbour parameter (which determines how much of the local versus global structure to consider) and the minimum distance value (which determines how closely together the data points should be in the final solution). A low-to-medium value (15) for the nearest neighbour and a low value for the minimum distance (0.01) were selected, as they can effectively produce tighter clusters that are easier to process for the subsequent clustering algorithms37. To determine the optimal number of clusters, the silhouette score was considered (Supplementary Fig. 15), and the stability and interpretability of the output clusters were also examined. Solutions that were insensitive to algorithm/parameter choice, were consistent across different clustering algorithms, and had high interpretability and high silhouette scores were chosen.
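For the free sorting task, one plausible way to turn co-classification probabilities into distances is sketched below. The specific formula (one minus the proportion of participants who placed a pair in the same category) is an assumption, as are the helper name and the toy sortings; the text states only that the distance was based on this probability.

```python
from itertools import combinations

def cosorting_distance(sortings, items):
    """Distance = 1 - proportion of participants who grouped a pair together."""
    index = {item: i for i, item in enumerate(items)}
    n = len(items)
    together = [[0] * n for _ in range(n)]
    for groups in sortings:                  # one participant's categories
        for group in groups:
            for a, b in combinations(group, 2):
                together[index[a]][index[b]] += 1
                together[index[b]][index[a]] += 1
    return [
        [0.0 if i == j else 1 - together[i][j] / len(sortings) for j in range(n)]
        for i in range(n)
    ]

items = ["siblings", "spouses", "colleagues"]
sortings = [  # hypothetical data from two participants
    [["siblings", "spouses"], ["colleagues"]],
    [["siblings", "spouses", "colleagues"]],
]
distance = cosorting_distance(sortings, items)
```

A matrix of this form could serve as precomputed input to the UMAP step. In the umap-learn package, the two manually set parameters correspond to the `n_neighbors` and `min_dist` arguments (here 15 and 0.01), assuming that package was the implementation used.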

Language models and embeddings

We used PLMs and LLMs to probe ancient people’s perception and comprehension of human relationships. For the modern Chinese PLM, we employed the word-based Chinese-RoBERTa-Base model from UER-py Modelzoo38. We selected this model because its pretraining focuses on the masked language modelling task. Moreover, it takes into account the characteristics of the Chinese language by using words rather than characters as units, and it has been trained on a large-scale, publicly available corpus of modern Chinese text. For the ancient Chinese PLM, we used BERT-ancient-Chinese22, which was trained on a large-scale ancient Chinese corpus including historical texts from 1046 BCE to 1912 CE.

We adopted an approach to generate human-like PLM embeddings (Fig. 5a), which was previously proposed by Cutler and Condon19 to identify Big Five personality structures in language models. We compared different queries and layers of embeddings (Supplementary Fig. 9). The [DESC] component in the query was generated by GPT-4 in October 2023 with the temperature parameter set to zero to ensure reproducibility (see exemplar prompts in Supplementary Method 2). Details of the labels and descriptions for ancient and modern Chinese relationships can be accessed via the Open Science Framework website.

RSA and model comparison

To uncover which cultural variables account for the cross-cultural variance in relationship representations, we performed RSA multiple regression39 (Fig. 3b). For each global region, cultural variables of language, personality, socio-ecology (that is, subsistence style, historical disease prevalence and climates), modernization, genetics, religion, politics and the Hofstede 6D culture model were collected from multiple open databases, such as the World Values Survey, Timeanddate and the World Bank (see Supplementary Table 3 for the details). For each cultural variable (for example, modernization), an RDM was computed where each cell represents the dissimilarity of two regions on this variable (for example, the dissimilarity of China and Portugal according to their modernization level). For each representational geometry (that is, full-feature, dimensional or categorical), we also created an RDM to represent the dissimilarity of relationship representations across regions. We then fitted a linear regression model in which the cultural variable RDMs were the predictors and the relationship representational geometry RDM was the outcome variable. The noise ceiling was estimated using the mean relationship RDMs of n − 1 regions to predict the relationship RDM of the remaining region, which reflected the inherent heterogeneity of the relationship RDMs. The Mantel test was used to assess the statistical significance of each RSA40,41. We permuted the order of RDMs of cultural variables while holding the representational geometries constant, recalculated the regression and repeated the process 10,000 times. This test allowed us to compute a P value for the representational geometries based on the F statistic of the multiple regression. We performed a one-sided test since a negative value is not meaningful and only positive similarities are expected20,42.
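The RSA regression and its permutation test can be sketched as follows. Several choices here are assumptions made for illustration: each cultural RDM is built from absolute differences on a single score, the data are synthetic, and the test statistic is R² rather than F (for a fixed number of predictors and observations, F increases monotonically with R², so the permutation ranking is the same). The function names are hypothetical.

```python
import numpy as np

def rdm_from_values(values):
    """Region x region RDM: absolute difference on one cultural variable."""
    values = np.asarray(values, dtype=float)
    return np.abs(values[:, None] - values[None, :])

def lower_triangle(rdm):
    """Vectorize the entries below the diagonal."""
    return rdm[np.tril_indices_from(rdm, k=-1)]

def r_squared(predictor_rdms, outcome_rdm):
    """R^2 of an OLS regression of the outcome RDM on the predictor RDMs."""
    y = lower_triangle(outcome_rdm)
    X = np.column_stack([np.ones_like(y)] +
                        [lower_triangle(r) for r in predictor_rdms])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residual = y - X @ beta
    centred = y - y.mean()
    return 1 - (residual @ residual) / (centred @ centred)

def permutation_p(predictor_rdms, outcome_rdm, n_perm=1000, seed=0):
    """One-sided Mantel-style P value: shuffle region order of the predictors."""
    rng = np.random.default_rng(seed)
    observed = r_squared(predictor_rdms, outcome_rdm)
    n = outcome_rdm.shape[0]
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        # Permute rows and columns together so each RDM stays symmetric.
        shuffled = [r[np.ix_(perm, perm)] for r in predictor_rdms]
        count += int(r_squared(shuffled, outcome_rdm) >= observed)
    return observed, (count + 1) / (n_perm + 1)

rng = np.random.default_rng(1)
modernization = rng.normal(size=8)          # hypothetical scores for 8 regions
religion = rng.normal(size=8)               # hypothetical scores for 8 regions
predictors = [rdm_from_values(modernization), rdm_from_values(religion)]
outcome = predictors[0] + 0.1 * rdm_from_values(rng.normal(size=8))
observed_r2, p_value = permutation_p(predictors, outcome)
```

Permuting region labels, rather than shuffling individual cells, respects the dependence among RDM entries that share a region, which is the reason a Mantel-type test is preferred over the parametric P value of the regression.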

Study 3 implemented RSA correlations between language models and the human-rating FAVEE-HPP model. Specifically, we transformed PLM embeddings (258 × 768 or 120 × 768 matrix) into a cosine similarity matrix (258 × 258 or 120 × 120). This matrix was then correlated with the lower triangle of the RDMs derived from the FAVEE dimensions (which represents the distances between pairs of relationships in 5D FAVEE space) or RDMs from the HPP categories using Spearman correlation. The noise ceiling was estimated by correlating human-rating RDMs derived from the FAVEE-HPP model with human-rating RDMs from 33 dimensional features (Fig. 5d).
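The comparison between embeddings and human ratings can be sketched as below. The sketch makes three assumptions: the embeddings and human scores are synthetic, cosine similarity is converted to cosine distance so that both matrices are dissimilarities (otherwise the sign of the correlation flips), and Spearman's rho is computed as a Pearson correlation on ranks without tie correction. The function names are hypothetical.

```python
import numpy as np

def cosine_distance_matrix(embeddings):
    """Pairwise cosine distance (1 - cosine similarity) between rows."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return 1 - unit @ unit.T

def ranks(values):
    """Ordinal ranks of a 1-D array (assumes no tied values)."""
    return np.argsort(np.argsort(values))

def rsa_correlation(rdm_a, rdm_b):
    """Spearman correlation between the lower triangles of two RDMs."""
    idx = np.tril_indices_from(rdm_a, k=-1)
    return np.corrcoef(ranks(rdm_a[idx]), ranks(rdm_b[idx]))[0, 1]

rng = np.random.default_rng(0)
scores = rng.normal(size=(20, 5))                  # hypothetical 5D human scores
embeddings = scores @ rng.normal(size=(5, 768))    # hypothetical 768-d embeddings
human_rdm = np.linalg.norm(scores[:, None] - scores[None, :], axis=-1)
model_rdm = cosine_distance_matrix(embeddings)
rho = rsa_correlation(model_rdm, human_rdm)
```

Only the lower triangle is used because an RDM is symmetric with a zero diagonal, so the remaining entries carry no additional information.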

Robustness test

The robustness of the results across different numbers of relationships was quantified using the same method as Lin et al.43. We removed human relationships one by one and reperformed all analyses (for example, PCA, clustering and cross-cultural RSA). The removal sequence was determined as follows: all pairs of relationships were ranked from most to least similar in the multi-arrangement task, and the relationship with the lower familiarity rating was removed first from each pair. Pearson correlations were calculated between metrics from the full set and from the subsets to determine the robustness of the results (see Supplementary Fig. 11 for the details).
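The removal sequence can be sketched as follows. The helper `removal_order` and the toy values are hypothetical, and one detail is an assumption not stated in the text: a pair is skipped when one of its members has already been removed.

```python
def removal_order(similar_pairs, familiarity):
    """Order in which relationships are dropped.

    similar_pairs: (similarity, a, b) tuples; familiarity: name -> rating.
    """
    removed = []
    for _, a, b in sorted(similar_pairs, key=lambda p: p[0], reverse=True):
        if a in removed or b in removed:
            continue  # assumption: skip pairs that have already lost a member
        removed.append(a if familiarity[a] < familiarity[b] else b)
    return removed

pairs = [  # hypothetical similarities from the multi-arrangement task
    (0.9, "friends", "buddies"),
    (0.7, "buddies", "colleagues"),
    (0.4, "colleagues", "strangers"),
]
familiarity = {"friends": 5.0, "buddies": 3.5, "colleagues": 4.0, "strangers": 4.5}
order = removal_order(pairs, familiarity)
```

Removing the less familiar member of the most similar pairs first means that the earliest deletions discard near-duplicates, so the retained subsets continue to span the relationship space.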

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.