Federated Learning for Medical Image Segmentation with Domain Generalization

Federated medical image segmentation with domain generalization under cross-institution distribution shift

Overview

This project explores federated learning for medical image segmentation under domain shift, combining distributed training with domain generalization techniques.

In real-world medical settings, data from different institutions vary significantly because of differences in:

  • imaging devices
  • acquisition protocols
  • patient populations

At the same time, strict privacy constraints prevent direct data sharing.

This project studies how to enable collaborative training without sharing raw data, while improving generalization across heterogeneous domains.


System Architecture

The system follows a standard federated learning pipeline:

  • multiple clients (institutions) train local models on their own data
  • only model parameters, never images, are sent to a central server
  • the server aggregates these parameters into an updated global model

[Figure: FedDG framework pipeline]

Key characteristics:

  • no raw data sharing
  • distributed optimization
  • iterative global aggregation
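
The round structure described above can be sketched in a few lines. This is a framework-free illustration under stated assumptions, not the project's code: a two-parameter linear model stands in for the segmentation network, every client shares the same underlying relationship, and `local_update`, `run_round`, and `make_client` are hypothetical names. Aggregation is a plain mean here for brevity.

```python
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient steps of 1-D least squares (y ~ w*x + b) on local data."""
    w, b = weights
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def run_round(global_weights, client_datasets):
    """One round: broadcast the global weights, train locally, average the results."""
    # Only parameter tuples are returned to the server; the datasets stay local.
    updates = [local_update(global_weights, data) for data in client_datasets]
    n = len(updates)
    return tuple(sum(u[i] for u in updates) / n for i in range(2))


def make_client(seed, n=20, slope=2.0, offset=1.0):
    """Toy private dataset (hypothetical): noise-free samples of y = slope*x + offset."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, slope * x + offset) for x in xs]


clients = [make_client(seed) for seed in range(3)]
weights = (0.0, 0.0)
for _ in range(100):
    weights = run_round(weights, clients)
print(weights)  # approaches the shared slope and offset (2.0, 1.0)
```

The server-side function never reads the samples themselves, which is the property the "no raw data sharing" bullet refers to.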

Segmentation Model

  • Base model: U-Net encoder–decoder architecture
  • Task: pixel-wise medical image segmentation
  • Training performed locally on each client

The model learns to segment anatomical structures while adapting to varying domain distributions.


Federated Optimization

  • Implemented Federated Averaging (FedAvg)
  • Clients perform local updates before synchronization
  • Server aggregates model parameters across clients

This enables privacy-preserving distributed training across multiple institutions.
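
In its standard form, FedAvg sets the global parameters to a sample-size-weighted average of the client parameters, w = Σ_k (n_k / n) · w_k, where n_k is the number of training samples on client k and n is the total. The sketch below shows that aggregation step on plain Python lists. It is an illustration only: the function name `fedavg`, the parameter names, and the client sizes are assumptions, and the project's actual weighting scheme is not stated here.

```python
def fedavg(client_states, client_sizes):
    """Sample-size-weighted average of per-client parameter dicts.

    client_states: list of dicts mapping a parameter name to a list of floats.
    client_sizes:  number of local training samples for each client.
    """
    if not client_states or len(client_states) != len(client_sizes):
        raise ValueError("need exactly one sample count per client state")
    total = sum(client_sizes)
    global_state = {}
    for key, reference in client_states[0].items():
        global_state[key] = [
            sum(n / total * state[key][i]
                for state, n in zip(client_states, client_sizes))
            for i in range(len(reference))
        ]
    return global_state


# Two hypothetical clients with 100 and 300 local samples.
states = [
    {"conv.weight": [1.0, 2.0], "conv.bias": [0.0]},
    {"conv.weight": [3.0, 4.0], "conv.bias": [1.0]},
]
global_state = fedavg(states, [100, 300])
print(global_state)  # {'conv.weight': [2.5, 3.5], 'conv.bias': [0.75]}
```

With PyTorch, the same loop would run over the tensors in each client's `state_dict()` instead of over lists.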


Domain Generalization (Core Contribution)

To address domain shift across hospitals, two strategies were introduced:

1. Style-Based Data Augmentation

  • simulate appearance variations across domains
  • encourage learning of domain-invariant features
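
One simple way to simulate appearance variation is to shift an image's intensity statistics toward those of another domain while leaving its spatial content untouched. The sketch below does this with per-image mean and standard deviation. It is a simplified stand-in, not the project's augmentation: the functions `intensity_stats`, `restyle`, and `random_restyle`, and the `style_bank` of (mean, std) pairs, are hypothetical.

```python
import random
import statistics


def intensity_stats(image):
    """Return (mean, std) over all pixels of a 2-D list-of-lists image."""
    pixels = [p for row in image for p in row]
    return statistics.fmean(pixels), statistics.pstdev(pixels)


def restyle(image, style_mean, style_std, alpha=1.0, eps=1e-6):
    """Move the image's intensity statistics toward a target style.

    alpha=0 keeps the original appearance; alpha=1 adopts the target
    mean and std. The relative ordering of pixel values is unchanged.
    """
    mean, std = intensity_stats(image)
    new_mean = (1 - alpha) * mean + alpha * style_mean
    new_std = (1 - alpha) * std + alpha * style_std
    return [[(p - mean) / (std + eps) * new_std + new_mean for p in row]
            for row in image]


def random_restyle(image, style_bank, rng=random):
    """Pick a random style from the bank and a random mixing strength."""
    style_mean, style_std = rng.choice(style_bank)
    return restyle(image, style_mean, style_std, alpha=rng.random())


image = [[0.1, 0.2], [0.3, 0.4]]
style_bank = [(0.6, 0.05), (0.3, 0.2)]  # hypothetical (mean, std) per domain
augmented = random_restyle(image, style_bank, random.Random(0))
```

Because only intensities change, the segmentation mask paired with `image` can be reused for `augmented` as is.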

2. Feature Distribution Alignment

  • reduce discrepancies between feature distributions across clients
  • improve consistency of learned representations
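
A common way to express feature alignment is a moment-matching penalty: compare the per-channel mean and variance of a client's features with reference statistics and penalize the gap. The sketch below shows that idea only. Whether the project matches moments or uses another alignment loss is not stated, and `channel_moments`, `alignment_penalty`, and the reference statistics are hypothetical.

```python
import statistics


def channel_moments(features):
    """Per-channel (means, variances) for a batch of feature vectors.

    features: list of samples, each a list with one value per channel.
    """
    channels = list(zip(*features))
    means = [statistics.fmean(c) for c in channels]
    variances = [statistics.pvariance(c) for c in channels]
    return means, variances


def alignment_penalty(local_features, ref_means, ref_variances):
    """Squared gap between local and reference moments, averaged over channels."""
    means, variances = channel_moments(local_features)
    mean_gap = sum((m - r) ** 2 for m, r in zip(means, ref_means))
    var_gap = sum((v - r) ** 2 for v, r in zip(variances, ref_variances))
    return (mean_gap + var_gap) / len(means)


# Two samples with two feature channels each.
local_features = [[0.0, 1.0], [2.0, 3.0]]
aligned = alignment_penalty(local_features, [1.0, 2.0], [1.0, 1.0])
shifted = alignment_penalty(local_features, [0.0, 2.0], [1.0, 1.0])
print(aligned, shifted)  # 0.0 0.5
```

In training, such a term would typically be added to the segmentation loss with a small weight, and only summary statistics rather than features or images would need to leave a client.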

Experimental Observations

Experiments on multi-domain medical datasets show that:

  • federated learning enables effective collaborative training without data sharing
  • domain generalization improves robustness on unseen domains
  • the combined approach generalizes better than standard federated learning baselines

Key Challenges

  • data privacy constraints in medical environments
  • distribution shift across institutions
  • communication overhead in federated training

Technical Stack

Frameworks

  • PyTorch

Methods

  • federated learning (FedAvg)
  • U-Net segmentation
  • style-based data augmentation
  • feature distribution alignment

Key Takeaways

  • Built a complete federated learning pipeline for medical image segmentation
  • Gained experience in distributed training and domain generalization
  • Developed an understanding of robust perception under domain shift

This project complements my later work in multi-agent perception, where similar challenges arise in heterogeneous environments and communication-constrained systems.