Patterns vary substantially across financial data sources, and investors accordingly exhibit heterogeneous cognitive behavior when processing information from them. To capture these different patterns, we propose a two-stage dynamic stacking ensemble model based on investor knowledge representations, which aims to effectively extract and integrate features from multi-source financial data. In the first stage, we identify the distinct properties of global stock market indices, industrial indices, and financial news from the perspective of investors, and then design neural network architectures tailored to these properties to generate effective feature representations. In the second stage, based on the learned feature representations, we design multiple meta-classifiers and dynamically select the optimal one for each time window, enabling the model to capture the distinct patterns that emerge in different time periods. To evaluate the proposed model, we apply it to predicting the daily movement of the Shanghai Securities Composite Index (SSEC), the SZSE Component Index (SZEC), and the Growth Enterprise Index (GEI) in the Chinese stock market. The experimental results demonstrate that our model improves prediction performance: in terms of accuracy, our approach outperforms the best competing models by 1.42%, 7.94%, and 7.73% on the SSEC, SZEC, and GEI indices, respectively. In addition, we design a trading strategy based on the proposed model. The economic results show that our strategy delivers a higher accumulated return and Sharpe ratio than the competing trading strategies.
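To illustrate the idea of selecting a meta-classifier per time window, the following is a minimal sketch and not the model described above. It assumes that the selection criterion is accuracy on the immediately preceding window, which the abstract does not specify. The names `accuracy` and `dynamic_select_predict`, the two rule-based stand-in classifiers, and the toy data are all hypothetical; in the proposed model the meta-classifiers would be trained on the learned feature representations.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A classifier maps a feature vector to 0 (down) or 1 (up).
Classifier = Callable[[Sequence[float]], int]


def accuracy(clf: Classifier, X: Sequence[Sequence[float]], y: Sequence[int]) -> float:
    """Fraction of samples in (X, y) that clf labels correctly."""
    if not X:
        return 0.0
    hits = sum(1 for x, target in zip(X, y) if clf(x) == target)
    return hits / len(X)


def dynamic_select_predict(
    classifiers: Dict[str, Classifier],
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    window: int,
) -> Tuple[List[str], List[int]]:
    """For every window after the first, pick the classifier that was most
    accurate on the previous window (an assumed criterion) and use it to
    predict the current window."""
    chosen: List[str] = []
    preds: List[int] = []
    for start in range(window, len(X), window):
        prev_X, prev_y = X[start - window:start], y[start - window:start]
        best = max(
            classifiers,
            key=lambda name: accuracy(classifiers[name], prev_X, prev_y),
        )
        chosen.append(best)
        preds.extend(classifiers[best](x) for x in X[start:start + window])
    return chosen, preds


# Hypothetical stand-ins for trained meta-classifiers.
classifiers: Dict[str, Classifier] = {
    "momentum": lambda x: 1 if x[0] > 0 else 0,
    "contrarian": lambda x: 1 if x[0] <= 0 else 0,
}

# Toy data: one feature per day; the regime flips halfway through.
X = [[0.5], [-0.2], [0.1], [-0.4], [0.3], [-0.1], [0.2], [-0.3],
     [0.4], [-0.5], [0.6], [-0.2], [0.1], [-0.3], [0.2], [-0.6]]
y = [1, 0, 1, 0, 1, 0, 1, 0,
     0, 1, 0, 1, 0, 1, 0, 1]

chosen, preds = dynamic_select_predict(classifiers, X, y, window=4)
print(chosen)
```

In this toy run the selector keeps the momentum rule for one window after the regime flips and only then switches to the contrarian rule, which shows the one-window lag inherent in selecting on past performance.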