The design of crystal materials plays a critical role in areas such as new energy development, biomedical engineering, and semiconductors. Recent advances in data-driven methods have enabled the generation of diverse crystal structures. However, most existing approaches still rely on random sampling without strict constraints and therefore require multiple post-processing steps to identify stable candidates with the desired physical and chemical properties. In this work, we present a new constrained generation framework that takes multiple constraints as input and generates crystal structures with specific chemical and physical properties. In this framework, a constraint generator based on large language models (LLMs) produces intermediate constraints, such as symmetry information and composition ratio, conditioned on the target properties. A subsequent crystal structure generator then uses these constraints to keep the structure generation process under control. Crystal structures generated by our method are more than twice as likely to meet the target properties as those from existing approaches. Furthermore, nearly 100% of the generated crystals strictly adhere to the predefined chemical composition, eliminating supply chain risks during production.
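The composition-adherence claim can be made concrete with a small check. The sketch below is illustrative only and is not the authors' evaluation code: the function names, and the choice to represent a generated structure as a plain list of element symbols, are assumptions made for this example. It tests whether a structure contains exactly the target elements in the target ratio, and reports the fraction of a batch that does.

```python
from collections import Counter
from functools import reduce
from math import gcd


def reduced_ratio(counts):
    """Divide element counts by their GCD, e.g. {"Ti": 2, "O": 4} -> {"Ti": 1, "O": 2}."""
    divisor = reduce(gcd, counts.values())
    return {element: n // divisor for element, n in counts.items()}


def adheres_to_composition(species, target_ratio):
    """Return True if the atom list matches the target composition ratio.

    `species` is a hypothetical representation: one element symbol per atom
    in the generated unit cell. `target_ratio` maps element symbol to its
    integer share in the formula (the composition constraint).
    """
    if not species or not target_ratio:
        return False
    return reduced_ratio(Counter(species)) == reduced_ratio(target_ratio)


def adherence_rate(structures, target_ratio):
    """Fraction of generated structures that satisfy the composition constraint."""
    if not structures:
        return 0.0
    hits = sum(adheres_to_composition(s, target_ratio) for s in structures)
    return hits / len(structures)


if __name__ == "__main__":
    target = {"Ti": 1, "O": 2}  # composition constraint for TiO2
    batch = [
        ["Ti", "O", "O", "Ti", "O", "O"],  # Ti2O4, reduces to TiO2
        ["Ti", "O"],                       # TiO, violates the constraint
    ]
    print(adherence_rate(batch, target))
```

Comparing GCD-reduced counts means a cell with several formula units still counts as adhering, which is the usual reading of a composition ratio; a stricter check on absolute atom counts would be a different constraint.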