A fundamental theoretical question in network analysis is to determine under which conditions community recovery is possible in polynomial time in the Stochastic Block Model (SBM). When the number $K$ of communities remains smaller than $\sqrt{n}$ --where $n$ denotes the number of nodes--, non-trivial community recovery is possible in polynomial time above, and only above, the Kesten--Stigum (KS) threshold, originally postulated using arguments from statistical physics. When $K \geq \sqrt{n}$, Chin, Mossel, Sohn, and Wein recently proved that, in the \emph{sparse regime}, community recovery in polynomial time is achievable below the KS threshold by counting non-backtracking paths. This finding led them to postulate a new threshold for the many-communities regime $K \geq \sqrt{n}$. Subsequently, Carpentier, Giraud, and Verzelen established the failure of low-degree polynomials below this new threshold across all density regimes, and demonstrated successful recovery above the threshold in certain moderately sparse settings. While these results provide strong evidence that, in the many community setting, the computational barrier lies at the threshold proposed in~Chin et al., the question of achieving recovery above this threshold still remains open in most density regimes. The present work is a follow-up to~Carpentier et al., in which we prove Conjecture~1.4 stated therein by: \\ 1- Constructing a family of motifs satisfying specific structural properties; and\\ 2- Proving that community recovery is possible above the proposed threshold by counting such motifs.\\ Our results complete the picture of the computational barrier for community recovery in the SBM with $K \geq \sqrt{n}$ communities. They also indicate that, in moderately sparse regimes, the optimal algorithms appear to be fundamentally different from spectral methods.
翻译:网络分析中的一个基础理论问题是确定随机块模型(SBM)中社区恢复在多项式时间内可行的条件。当社区数量$K$小于$\\sqrt{n}$(其中$n$表示节点数)时,非平凡社区恢复在多项式时间内仅当且仅当高于Kesten–Stigum(KS)阈值时可能实现,该阈值最初基于统计物理学的论证提出。当$K \\geq \\sqrt{n}$时,Chin、Mossel、Sohn和Wein近期证明,在\\emph{稀疏区域}中,通过计数非回溯路径,多项式时间的社区恢复可在KS阈值以下实现。这一发现促使他们为多社区区域$K \\geq \\sqrt{n}$提出了一个新的阈值。随后,Carpentier、Giraud和Verzelen在所有密度区域中建立了低于该新阈值的低次多项式方法的失效性,并在某些中等稀疏设置中证明了阈值以上恢复的成功。尽管这些结果强有力地表明,在多社区设置中,计算障碍位于Chin等人提出的阈值处,但在大多数密度区域中,实现该阈值以上恢复的问题仍然悬而未决。本研究是Carpentier等人工作的后续,我们通过以下方式证明了其中所述的猜想1.4:\\ 1- 构建满足特定结构性质的基元族;以及\\ 2- 证明通过计数此类基元,社区恢复在提出的阈值以上是可能的。\\ 我们的结果完善了SBM中$K \\geq \\sqrt{n}$社区情况下社区恢复计算障碍的全貌。它们还表明,在中等稀疏区域中,最优算法似乎与谱方法存在根本性差异。