Deploy an exchange-centered, multi-party federated learning network with KubeFATE

Federated learning is a machine learning framework that protects data privacy. It helps multiple institutions use their data and build machine learning models jointly while meeting requirements for user privacy protection, data security, and government regulation. This lets them extract value from data without giving up its privacy or security.

Problems encountered

Deploying a federated learning application requires multiple institutions to participate and join a federated network. As the number of participants grows, the network becomes difficult to manage.

In the traditional model, institutions are linked to each other directly. This works for a few institutions, but as the federated network grows, managing the connection configuration between all of them becomes very complicated.

Solution

The exchange deployment model makes it easy to build a network with many member organizations. Members discover each other through the central exchange, which makes the federated learning network much simpler to manage.
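
To make the difference concrete, here is a rough count of how many connection entries have to be maintained in each model. It is only a sketch under a simplifying assumption: in the direct model every party keeps one entry for every other party, and in the exchange model the exchange keeps one entry per party while each party keeps one entry for the exchange. The function names are illustrative and not part of KubeFATE.

```python
def direct_entries(n_parties: int) -> int:
    """Entries to maintain when every party lists every other party."""
    return n_parties * (n_parties - 1)


def exchange_entries(n_parties: int) -> int:
    """Entries to maintain with a central exchange: one per party on the
    exchange, plus one exchange address on each party."""
    return 2 * n_parties


for n in (3, 10, 50):
    print(f"{n} parties: direct={direct_entries(n)}, exchange={exchange_entries(n)}")
```

Under the same assumption, adding one party to a direct network of n parties touches the configuration of all n existing parties, while with an exchange only the exchange and the new party change.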

KubeFATE support for exchange

KubeFATE v1.6.0 supports the exchange deployment mode for both the Spark and eggroll computing engines.

Federated learning organization building

Organization

Use KubeFATE to deploy a federated learning network with exchange as the central node. This network contains an exchange and several parties.

Deployment structure diagram

[Figure: exchange structure diagram]

Introduction to deployment environment

The deployment here consists of three parties and one exchange. Each of them runs in its own k8s cluster, and KubeFATE is already deployed in every cluster (see Deploy KubeFATE).

| role | party ID | k8s version | k8s node IP | KubeFATE version | FATE version |
| --- | --- | --- | --- | --- | --- |
| exchange | 1 | v1.19.9 | 192.168.100.1 | v1.4.1 | v1.6.0 |
| party-9999 | 9999 | v1.19.9 | 192.168.100.9 | v1.4.1 | v1.6.0 |
| party-10000 | 10000 | v1.19.9 | 192.168.100.10 | v1.4.1 | v1.6.0 |
| party-8888 | 8888 | v1.19.9 | 192.168.100.8 | v1.4.1 | v1.6.0 |

Deploy exchange

Let's use the FATE-Exchange chart to deploy an exchange cluster.

Configuration

There are two types of exchange, which correspond to FATE's two computing engines (eggroll, Spark).

eggroll (rollsite)

The core component of an eggroll-type exchange cluster is rollsite.

$ cat cluster-exchange.yaml
name: fate-exchange
namespace: fate-exchange
chartName: fate-exchange
chartVersion: v1.6.0
partyId: 1
registry: ""
imageTag: "1.6.0-release"
pullPolicy: 
imagePullSecrets: 
- name: myregistrykey
persistence: false
istio:
  enabled: false
modules:
  - rollsite

rollsite: 
  type: NodePort
  nodePort: 30000
  partyList:
  - partyId: 10000
    partyIp: 192.168.100.10
    partyPort: 30101
  - partyId: 9999
    partyIp: 192.168.100.9
    partyPort: 30091
  - partyId: 8888
    partyIp: 192.168.100.8
    partyPort: 30081
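
When the network has many parties, the partyList block can be generated instead of typed by hand. The sketch below renders the same entries as the file above from a list of (party ID, node IP, rollsite NodePort) tuples. The helper name render_party_list is hypothetical, and the output is plain text that you would paste into cluster-exchange.yaml yourself.

```python
def render_party_list(parties):
    """Return the rollsite partyList block for (party_id, ip, port) tuples."""
    lines = ["  partyList:"]
    for party_id, ip, port in parties:
        lines.append(f"  - partyId: {party_id}")
        lines.append(f"    partyIp: {ip}")
        lines.append(f"    partyPort: {port}")
    return "\n".join(lines)


# Values taken from the deployment environment table and the file above.
parties = [
    (10000, "192.168.100.10", 30101),
    (9999, "192.168.100.9", 30091),
    (8888, "192.168.100.8", 30081),
]
print(render_party_list(parties))
```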

Spark (ATS)

When deploying an exchange for the Spark computing engine, you first need to set up the certificates used between each party and the exchange. Refer to the "pulsar and certificate generation of ATS" document to generate the exchange certificate, then import it into k8s:

kubectl create secret generic traffic-server-cert -n fate-exchange \
	--from-file=proxy.cert.pem=proxy.fate.org/proxy.cert.pem \
	--from-file=proxy.key.pem=proxy.fate.org/proxy.key.pem \
	--from-file=ca.cert.pem=certs/ca.cert.pem

Then configure the YAML file. For the Spark computing engine, the core of the exchange consists of two components: nginx and trafficServer.

$ cat cluster-exchange.yaml
name: fate-exchange
namespace: fate-exchange
chartName: fate-exchange
chartVersion: v1.6.0
partyId: 1
registry: ""
imageTag: "1.6.0-release"
pullPolicy: 
imagePullSecrets: 
- name: myregistrykey
persistence: false
istio:
  enabled: false
modules:
  - trafficServer
  - nginx

trafficServer:
  type: NodePort
  nodePort: 30001
  route_table: 
    sni:
    - fqdn: 10000.fate.org
      tunnelRoute: 192.168.100.10:30109
    - fqdn: 9999.fate.org
      tunnelRoute: 192.168.100.9:30099
    - fqdn: 8888.fate.org
      tunnelRoute: 192.168.100.8:30089

nginx:
  nodeSelector: 
  type: NodePort
  httpNodePort: 30003
  grpcNodePort: 30008
  route_table: 
    8888: 
      proxy: 
        - host: 192.168.100.8
          http_port: 30083
          grpc_port: 30088 
      fateflow: 
        - host: 192.168.100.8
          http_port: 30087
          grpc_port: 30082
    9999: 
      proxy: 
        - host: 192.168.100.9
          http_port: 30093
          grpc_port: 30098 
      fateflow: 
        - host: 192.168.100.9
          http_port: 30097
          grpc_port: 30092
    10000: 
      proxy: 
        - host: 192.168.100.10
          http_port: 30103
          grpc_port: 30108 
      fateflow: 
        - host: 192.168.100.10
          http_port: 30107
          grpc_port: 30102
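
In this file every party appears twice: once in the trafficServer sni list and once in the nginx route_table. A quick check that the two tables list the same parties can catch a forgotten entry before installing. The sketch below assumes that each fqdn has the form <partyId>.fate.org, as in the example above. The function name is illustrative, and the data is copied in by hand rather than parsed from the YAML file.

```python
def unmatched_parties(sni, nginx_party_ids):
    """Return party IDs that appear in only one of the two route tables."""
    sni_ids = {entry["fqdn"].split(".", 1)[0] for entry in sni}
    nginx_ids = {str(party_id) for party_id in nginx_party_ids}
    return sorted(sni_ids ^ nginx_ids)


# Values copied from trafficServer.route_table.sni above.
sni = [
    {"fqdn": "10000.fate.org", "tunnelRoute": "192.168.100.10:30109"},
    {"fqdn": "9999.fate.org", "tunnelRoute": "192.168.100.9:30099"},
    {"fqdn": "8888.fate.org", "tunnelRoute": "192.168.100.8:30089"},
]
# Keys of nginx.route_table above.
nginx_party_ids = [8888, 9999, 10000]

print(unmatched_parties(sni, nginx_party_ids))
```

An empty list means both tables cover the same parties.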

Deploy

After configuring the YAML file, deploy it with the exchange's kubefate:

(exchange)$ kubefate cluster install -f ./cluster-exchange.yaml

Check that the cluster status is Running to confirm that the deployment succeeded.

(exchange)$ kubefate cluster ls
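
If you script the deployment, the status check can be repeated until the cluster reports Running. The following is a minimal polling sketch, not an official KubeFATE tool. It assumes that each row printed by kubefate cluster ls contains both the cluster name and its status, which you should verify against the output of your KubeFATE version. Both function names are hypothetical.

```python
import subprocess
import time


def wait_until(check, timeout_s=600.0, interval_s=10.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() repeatedly until it returns True or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def cluster_is_running(name="fate-exchange"):
    """Assumption: a row of `kubefate cluster ls` holds the name and the status."""
    result = subprocess.run(["kubefate", "cluster", "ls"],
                            capture_output=True, text=True)
    return any(name in line and "Running" in line
               for line in result.stdout.splitlines())
```

Calling wait_until(cluster_is_running) on the exchange host would then block until the status row shows Running, or return False after ten minutes.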

Update configuration

When a new party wants to join an already running exchange cluster, its information must be added to the exchange. Modify cluster-exchange.yaml to add the new party, then use the KubeFATE update command to apply the change to the exchange cluster.

(exchange)$ kubefate cluster update -f ./cluster-exchange.yaml

The change takes effect after a short wait, because the exchange reloads party information periodically.
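
As an illustration of the edit to cluster-exchange.yaml for the eggroll type, the sketch below appends one entry to the rollsite partyList and refuses an ID that is already registered. Party 7777, its IP and its port are hypothetical values used only for this example, and add_party is not a KubeFATE function.

```python
def add_party(party_list, party_id, party_ip, party_port):
    """Return a new partyList with the party appended; reject duplicate IDs."""
    if any(p["partyId"] == party_id for p in party_list):
        raise ValueError(f"party {party_id} is already registered")
    new_entry = {"partyId": party_id, "partyIp": party_ip, "partyPort": party_port}
    return party_list + [new_entry]


# Current entries, copied from the eggroll exchange configuration.
party_list = [
    {"partyId": 10000, "partyIp": "192.168.100.10", "partyPort": 30101},
    {"partyId": 9999, "partyIp": "192.168.100.9", "partyPort": 30091},
    {"partyId": 8888, "partyIp": "192.168.100.8", "partyPort": 30081},
]

# Hypothetical new party, for illustration only.
updated = add_party(party_list, 7777, "192.168.100.7", 30071)
print(len(party_list), len(updated))
```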

Add participants to exchange

In a federated network that already has an exchange, adding a new party is simple: you only need to configure the connection between the party and the exchange for the party to join the network.

For the exchange-side configuration, refer to the [exchange update configuration](#update-configuration).

Configure Party

How a party connects to the exchange depends on its computing engine. Party-9999 is used as the example below.

Eggroll (rollsite)

Configure the exchange field of rollsite to connect to the exchange cluster.

$ cat cluster.yaml
name: fate-9999
namespace: fate-9999
chartName: fate
chartVersion: v1.6.0
partyId: 9999
registry: ""
imageTag: "1.6.0-release"
pullPolicy: 
persistence: false
istio:
  enabled: false
modules:
  - rollsite
  - clustermanager
  - nodemanager
  - mysql
  - python
  - fateboard
  - client

backend: eggroll

rollsite: 
  type: NodePort
  nodePort: 30091
  exchange:
    ip: 192.168.100.1
    port: 30000
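
The party file and the exchange file have to agree in two places: the party's rollsite.exchange must point at the exchange's node IP and rollsite NodePort, and the exchange's partyList must contain the party's node IP and rollsite NodePort. The sketch below checks both, using values copied by hand from the two example files. The function name and the dictionary layout are assumptions made for this illustration.

```python
def rollsite_link_problems(exchange_cfg, exchange_ip, party_cfg, party_ip):
    """Return a list of mismatches between an exchange and a party config."""
    problems = []
    ex_rollsite = exchange_cfg["rollsite"]
    party_rollsite = party_cfg["rollsite"]

    # Party side: must point at the exchange's node IP and NodePort.
    target = party_rollsite["exchange"]
    if (target["ip"], target["port"]) != (exchange_ip, ex_rollsite["nodePort"]):
        problems.append("party does not point at the exchange rollsite")

    # Exchange side: must list the party with its node IP and NodePort.
    expected = {
        "partyId": party_cfg["partyId"],
        "partyIp": party_ip,
        "partyPort": party_rollsite["nodePort"],
    }
    if expected not in ex_rollsite["partyList"]:
        problems.append("exchange partyList has no matching entry for the party")
    return problems


exchange_cfg = {"rollsite": {"nodePort": 30000, "partyList": [
    {"partyId": 9999, "partyIp": "192.168.100.9", "partyPort": 30091},
]}}
party_cfg = {"partyId": 9999, "rollsite": {
    "nodePort": 30091,
    "exchange": {"ip": "192.168.100.1", "port": 30000},
}}

print(rollsite_link_problems(exchange_cfg, "192.168.100.1", party_cfg, "192.168.100.9"))
```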

Spark (Pulsar)

When deploying a FATE cluster that uses Spark as the computing engine, you first need to set up the certificates used to communicate with the exchange. Refer to the "pulsar and certificate generation of ATS" document to generate the party's Pulsar certificate, then import it into k8s:

kubectl create secret generic pulsar-cert \
	--from-file=broker.cert.pem=9999.fate.org/broker.cert.pem \
	--from-file=broker.key-pk8.pem=9999.fate.org/broker.key-pk8.pem \
	--from-file=ca.cert.pem=certs/ca.cert.pem

With the Spark engine, the python, nginx and pulsar components of FATE each need to be configured so that the party can link with the exchange.

$ cat cluster.yaml
name: fate-9999
namespace: fate-9999
chartName: fate
chartVersion: v1.6.0
partyId: 9999
registry: ""
imageTag: "1.6.0-release"
pullPolicy: 
imagePullSecrets: 
- name: myregistrykey
persistence: false
istio:
  enabled: false
modules:
  - python
  - mysql
  - fateboard
  - client
  - spark
  - hdfs
  - nginx
  - pulsar

backend: spark

python:
  type: NodePort
  httpNodePort: 30097
  grpcNodePort: 30092

nginx:
  type: NodePort
  httpNodePort: 30093
  grpcNodePort: 30098
  exchange:
    ip: 192.168.100.1
    httpPort: 30003
    grpcPort: 30008

pulsar:
  type: NodePort
  httpNodePort: 30094
  httpsNodePort: 30099
  exchange:
    ip: 192.168.100.1
    port: 30001
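
With the Spark engine, several port numbers in the party file mirror values in the Spark exchange file. The sketch below lists the pairs for Party-9999 and reports any that disagree. Which value mirrors which is inferred from the matching numbers in the two example files, so treat the pairing as an assumption rather than a documented contract. The function name is hypothetical and the values are copied by hand from the examples.

```python
def spark_link_mismatches(exchange_cfg, party_cfg):
    """Return a description of every exchange/party value pair that disagrees."""
    pid = party_cfg["partyId"]
    ex_nginx = exchange_cfg["nginx"]
    ex_ats = exchange_cfg["trafficServer"]
    route = ex_nginx["route_table"][pid]
    sni = {e["fqdn"]: e["tunnelRoute"] for e in ex_ats["route_table"]["sni"]}
    tunnel_port = int(sni[f"{pid}.fate.org"].rsplit(":", 1)[1])
    p_nginx, p_pulsar, p_python = party_cfg["nginx"], party_cfg["pulsar"], party_cfg["python"]

    pairs = [
        # (what is compared, exchange-side value, party-side value)
        ("exchange nginx http port", ex_nginx["httpNodePort"], p_nginx["exchange"]["httpPort"]),
        ("exchange nginx grpc port", ex_nginx["grpcNodePort"], p_nginx["exchange"]["grpcPort"]),
        ("exchange trafficServer port", ex_ats["nodePort"], p_pulsar["exchange"]["port"]),
        ("party pulsar https port", tunnel_port, p_pulsar["httpsNodePort"]),
        ("party nginx http port", route["proxy"][0]["http_port"], p_nginx["httpNodePort"]),
        ("party nginx grpc port", route["proxy"][0]["grpc_port"], p_nginx["grpcNodePort"]),
        ("party fateflow http port", route["fateflow"][0]["http_port"], p_python["httpNodePort"]),
        ("party fateflow grpc port", route["fateflow"][0]["grpc_port"], p_python["grpcNodePort"]),
    ]
    return [name for name, exchange_side, party_side in pairs if exchange_side != party_side]


exchange_cfg = {
    "trafficServer": {"nodePort": 30001, "route_table": {"sni": [
        {"fqdn": "9999.fate.org", "tunnelRoute": "192.168.100.9:30099"},
    ]}},
    "nginx": {"httpNodePort": 30003, "grpcNodePort": 30008, "route_table": {9999: {
        "proxy": [{"host": "192.168.100.9", "http_port": 30093, "grpc_port": 30098}],
        "fateflow": [{"host": "192.168.100.9", "http_port": 30097, "grpc_port": 30092}],
    }}},
}
party_cfg = {
    "partyId": 9999,
    "python": {"httpNodePort": 30097, "grpcNodePort": 30092},
    "nginx": {"httpNodePort": 30093, "grpcNodePort": 30098,
              "exchange": {"ip": "192.168.100.1", "httpPort": 30003, "grpcPort": 30008}},
    "pulsar": {"httpNodePort": 30094, "httpsNodePort": 30099,
               "exchange": {"ip": "192.168.100.1", "port": 30001}},
}

print(spark_link_mismatches(exchange_cfg, party_cfg))
```

An empty list means every mirrored value agrees between the two files.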

Deploy

After configuring the YAML file, use the party's own kubefate to deploy the FATE cluster:

(party-9999)$ kubefate cluster install -f ./cluster.yaml

Check that the cluster status is Running to confirm that the deployment succeeded.

(party-9999)$ kubefate cluster ls

Add multi-parties in turn

Following the [Party configuration](#configure-party) above, configure Party-8888 and Party-10000 in the same way, then deploy the corresponding FATE clusters so that they join the federated network.

Test

With the steps above, we have deployed a federated learning network that is interconnected through an exchange. It contains three parties and uses eggroll as the computing engine. The tests below check that the network is usable.

Multi-party connection test

Run the toy_example test between pairs of parties to confirm that each pair can communicate.
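
The pairs tested in this section form a ring: each party tests against the next one and the last wraps around to the first. With three parties this covers every pair. For a larger network a ring only covers neighbouring pairs, so you may want to enumerate more combinations. The sketch below prints the command to run inside each party's python container. The helper ring_pairs is illustrative only.

```python
def ring_pairs(party_ids):
    """Pair each party with the next one, wrapping around at the end."""
    count = len(party_ids)
    return [(party_ids[i], party_ids[(i + 1) % count]) for i in range(count)]


for first, second in ring_pairs([9999, 10000, 8888]):
    print(f"# inside the python container of fate-{first}")
    print(f"python run_toy_example.py {first} {second} 1")
```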

Party-9999 and Party-10000

Enter the python container of Party-9999 through the command line. Then run the toy command.

kubectl -n fate-9999 exec -it svc/fateflow -c python -- bash
cd ../examples/toy_example/
python run_toy_example.py 9999 10000 1

Party-10000 and Party-8888

kubectl -n fate-10000 exec -it svc/fateflow -c python -- bash
cd ../examples/toy_example/
python run_toy_example.py 10000 8888 1

Party-8888 and Party-9999

kubectl -n fate-8888 exec -it svc/fateflow -c python -- bash
cd ../examples/toy_example/
python run_toy_example.py 8888 9999 1

Finally, if the log contains a line similar to "success to calculate secure_sum, it is 2000.0000000000002", the toy_example communication test has succeeded.
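
If you capture the test output in a script, the success line can be detected with a regular expression. The reported sum is a floating-point value, so it should be compared with a tolerance rather than for exact equality. In the sketch below the default expected value of 2000.0 is taken from the sample line above and is an assumption: other data sizes may give a different sum. Both function names are hypothetical.

```python
import math
import re

SECURE_SUM = re.compile(
    r"success to calculate secure_sum, it is (-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
)


def secure_sum_from_log(text):
    """Return the reported secure_sum as a float, or None if no such line exists."""
    match = SECURE_SUM.search(text)
    return float(match.group(1)) if match else None


def toy_example_passed(text, expected=2000.0):
    """True if a success line exists and its value is close to `expected`."""
    value = secure_sum_from_log(text)
    return value is not None and math.isclose(value, expected, rel_tol=1e-9)


sample = "success to calculate secure_sum, it is 2000.0000000000002"
print(toy_example_passed(sample))
```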

Multi-party training test

Once the communication tests between pairs of parties pass, we can run a three-party min_test to check multi-party training.

The min_test task requires three participants: Guest, Host and Arbiter. Here Party-10000 is the Guest, Party-9999 is the Host, and Party-8888 is the Arbiter.

First, upload the min_test data set to the FATE cluster of each party.

Run the following commands on the k8s master of each party:
```
kubectl -n fate-<partyID> exec -it svc/fateflow -c python -- bash
cd ../examples/scripts; python upload_default_data.py -m 1
```
Here <partyID> is the ID of the party deployed in that k8s cluster.

Next, launch a training task from one of the parties.

Here we launch it from Party-10000, the Guest:

kubectl -n fate-10000 exec -it svc/fateflow -c python -- bash
cd ../examples/min_test_task; python run_task.py -m 1 -gid 10000 -hid 9999 -aid 8888
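
The role assignment maps directly onto the -gid, -hid and -aid arguments of run_task.py shown above. The sketch below builds the argument list from the three roles. It rejects a party that is given two roles, which is an assumption of this sketch based on the statement that the task involves three parties. The function name is hypothetical.

```python
def run_task_args(guest, host, arbiter, mode=1):
    """Build the run_task.py command line for one Guest, Host and Arbiter."""
    if len({guest, host, arbiter}) != 3:
        raise ValueError("Guest, Host and Arbiter must be three different parties")
    return ["python", "run_task.py", "-m", str(mode),
            "-gid", str(guest), "-hid", str(host), "-aid", str(arbiter)]


print(" ".join(run_task_args(guest=10000, host=9999, arbiter=8888)))
```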

View the task results.

The min_test task takes some time to run. Wait for it to finish; the result is shown in the log on the command line.

You can also open the FATE-Board web page, http://10000.fateboard.example.com, for more information about the task.

Finally, you can use the federated learning network to train your own models.

Next step

Use FATE Client to Build Jobs in Jupyter Notebook
