Social divide and polarization have become significant societal issues. To understand the mechanisms behind these phenomena, social media analysis offers research opportunities in computational social science, where developing effective user embedding methods is essential for subsequent analysis. Traditionally, researchers have used predefined network-based user features (e.g., network size, degree, and centrality measures). However, because such measures may not capture the complex characteristics of social media users, in our study we developed a method for embedding users based on a URL domain co-occurrence network. This approach effectively represents social media users involved in competing events such as political campaigns and public health crises. We assessed the method's performance using binary classification tasks and datasets that covered topics associated with the COVID-19 infodemic, such as QAnon, Biden, and Ivermectin, among Twitter users. Our results revealed that user embeddings generated directly from the retweet network and/or based on language performed below expectations, whereas our domain-based embeddings outperformed those methods while reducing computation time. Therefore, domain-based embedding offers an accessible and effective method for characterizing social media users in competing events.
翻译:暂无翻译