Federated learning (FL) enables collaborative training over distributed multimedia data but suffers acutely from statistical heterogeneity and communication constraints, especially when clients deploy large models. Classic parameter-averaging methods such as FedAvg transmit full model weights and can diverge under non-independent and identically distributed (non-IID) data. We propose KTA v2, a prediction-space knowledge trading market for FL. Each round, clients train locally on their private data, then share only logits on a small public reference set. The server constructs a client-client similarity graph in prediction space, combines it with reference-set accuracy to form per-client teacher ensembles, and sends back personalized soft targets for a second-stage distillation update. This two-stage procedure can be interpreted as approximate block-coordinate descent on a unified objective with prediction-space regularization. Experiments on FEMNIST, CIFAR-10, and AG News show that, under comparable or much lower communication budgets, KTA v2 consistently outperforms a local-only baseline and strong parameter-based methods (FedAvg, FedProx), and substantially improves over a FedMD-style global teacher. On CIFAR-10 with ResNet-18, KTA v2 reaches 57.7% test accuracy using approximately 1/1100 of FedAvg's communication, while on AG News it attains 89.3% accuracy with approximately 1/300 of FedAvg's traffic.
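The server-side aggregation step sketched above can be illustrated in a few lines of NumPy. This is a minimal sketch under assumptions of our own: the function name `personalized_soft_targets`, the cosine-similarity graph, and the specific similarity/accuracy weighting are illustrative choices, not the paper's exact formulation.

```python
# Minimal sketch of the server-side step: build a client-client similarity
# graph from shared logits on the public reference set, weight it by
# reference-set accuracy, and return personalized soft targets per client.
# All names and the weighting scheme below are illustrative assumptions.
import numpy as np

def softmax(z, tau=1.0):
    z = z / tau
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def personalized_soft_targets(client_logits, ref_labels, tau=3.0, alpha=0.5):
    """client_logits: list of (N_ref, C) logit arrays, one per client.
    ref_labels: (N_ref,) integer labels of the public reference set.
    Returns a list of (N_ref, C) soft-target matrices, one per client."""
    K = len(client_logits)
    probs = [softmax(l, tau) for l in client_logits]           # softened predictions
    flat = np.stack([p.reshape(-1) for p in probs])            # (K, N_ref * C)
    flat = flat / np.linalg.norm(flat, axis=1, keepdims=True)
    sim = flat @ flat.T                                        # client-client similarity graph
    acc = np.array([(l.argmax(1) == ref_labels).mean() for l in client_logits])
    targets = []
    for i in range(K):
        w = alpha * sim[i] + (1 - alpha) * acc                 # combine similarity and accuracy
        w[i] = 0.0                                             # exclude the client's own predictions
        w = np.maximum(w, 0)
        w = w / (w.sum() + 1e-12)
        teacher = sum(w[j] * probs[j] for j in range(K))       # per-client teacher ensemble
        targets.append(teacher)
    return targets
```

Each client would then use its returned soft-target matrix as the distillation signal in the second-stage update on the reference set.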