wiki:ThirdPartyDataTransfer

Version 3 (modified by spascoe, 14 years ago)

--

The issue of whether we need 3rd-party data transfer, and how it could be implemented in the DeliveryService, has been raised in ticket #107.

Assuming we need some sort of 3rd-party transfer, I see 3 options:

  1. Live without dedicated 3rd-party transfer support in the DeliveryService. Server-to-server data transfer can be implemented on a per-service basis using the DeliveryService in the standard client-server manner.
  1. Implement 3rd-party transfer as a high-level service which calls the low-level data transfer (presumably bbFTP) underneath. Such a service would act as the server to a controlling client and as a client to the other party in the data transfer.
  1. Implement 3rd-party transfer at the xFTP protocol level. See below for details about this.

Option 3.

Since GridFTP is supposed to do 3rd-party data transfer, I decided to take a look at the GridFTP specification and was surprised to discover that the FTP protocol already supports 3rd-party transfer in theory; it is just rarely implemented.

Here is the relevant extract from RFC 959:

      When data is to be transferred between two servers, A and B (refer
      to Figure 2), the user-PI, C, sets up control connections with
      both server-PI's.  One of the servers, say A, is then sent a PASV
      command telling him to "listen" on his data port rather than
      initiate a connection when he receives a transfer service command.
      When the user-PI receives an acknowledgment to the PASV command,
      which includes the identity of the host and port being listened
      on, the user-PI then sends A's port, a, to B in a PORT command; a
      reply is returned.  The user-PI may then send the corresponding
      service commands to A and B.  Server B initiates the connection
      and the transfer proceeds.  The command-reply sequence is listed
      below where the messages are vertically synchronous but
      horizontally asynchronous:

         User-PI - Server A                User-PI - Server B
         ------------------                ------------------

         C->A : Connect                    C->B : Connect
         C->A : PASV
         A->C : 227 Entering Passive Mode. A1,A2,A3,A4,a1,a2
                                           C->B : PORT A1,A2,A3,A4,a1,a2
                                           B->C : 200 Okay
         C->A : STOR                       C->B : RETR
                    B->A : Connect to HOST-A, PORT-a

That is, one server is put in passive mode and the other in active mode. A good explanation of the difference can be found here. The GridFTP specification states that it supports 3rd-party transfer. In practice this means that both the passive and the active FTP commands are extended to support striping across multiple data ports.

On a casual read of the specifications it appears that, provided a server supports active mode, the main issue is with the client. It must support making control connections to 2 servers simultaneously and issuing the necessary PASV, PORT, STOR and RETR commands.
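
To make the client-side work more concrete, here is a minimal Python sketch of the command sequence from the RFC 959 extract above. It is an illustration under assumptions, not a tested client: `ControlConnection`, `parse_pasv_reply`, `endpoint`, `request`, `read_reply` and `third_party_transfer` are all hypothetical names invented for this example, and login, `TYPE` negotiation and error recovery are left out. STOR and RETR are both sent before either reply is read, since a server may hold back its preliminary reply until the data connection is up.

```python
import re
from typing import Protocol, Tuple

PasvFields = Tuple[int, int, int, int, int, int]


class ControlConnection(Protocol):
    """Hypothetical minimal interface to one FTP control connection."""

    def putcmd(self, line: str) -> None: ...  # send one command line
    def getresp(self) -> str: ...             # read one complete reply


def parse_pasv_reply(reply: str) -> PasvFields:
    """Extract h1,h2,h3,h4,p1,p2 from a '227 Entering Passive Mode' reply."""
    if not reply.startswith("227"):
        raise ValueError(f"not a 227 reply: {reply!r}")
    match = re.search(r"(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)", reply)
    if match is None:
        raise ValueError(f"no host/port fields in: {reply!r}")
    fields = tuple(int(group) for group in match.groups())
    if max(fields) > 255:
        raise ValueError(f"field out of range in: {reply!r}")
    return fields


def endpoint(fields: PasvFields) -> Tuple[str, int]:
    """Turn the six fields into (dotted host, port); port = p1 * 256 + p2."""
    host = ".".join(str(n) for n in fields[:4])
    return host, fields[4] * 256 + fields[5]


def read_reply(conn: ControlConnection, expected: str) -> str:
    """Read one reply and check the first digit of its reply code."""
    reply = conn.getresp()
    if reply[:1] != expected:
        raise RuntimeError(f"expected a {expected}xx reply, got {reply!r}")
    return reply


def request(conn: ControlConnection, line: str, expected: str) -> str:
    """Send a command and wait for its reply."""
    conn.putcmd(line)
    return read_reply(conn, expected)


def third_party_transfer(receiver, sender, source_path, dest_path):
    """Copy source_path on `sender` (server B) to dest_path on `receiver` (server A)."""
    # C->A: PASV, A->C: 227 ... A listens on its data port.
    fields = parse_pasv_reply(request(receiver, "PASV", "2"))
    # C->B: PORT with A's address, B->C: 200.
    request(sender, "PORT " + ",".join(str(n) for n in fields), "2")
    # Issue both service commands before waiting on either server.
    receiver.putcmd("STOR " + dest_path)
    sender.putcmd("RETR " + source_path)
    for conn in (receiver, sender):
        read_reply(conn, "1")  # preliminary reply, e.g. 150
        read_reply(conn, "2")  # completion reply, e.g. 226
    return endpoint(fields)
```

Python's standard `ftplib.FTP` objects provide `putcmd` and `getresp` methods, so two logged-in `FTP` instances should fit this interface, but whether a given pair of servers accepts a PORT command naming a third host is a separate question that this sketch does not answer.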

Unlike GridFTP, bbFTP is not an extension of RFC 959. However, it claims to support both passive and "non-passive" mode. Therefore, it may be possible to adapt the client to do 3rd-party transfer along the lines shown above. This may be a significant amount of work, and any consideration of it should be balanced against the work required to make GridFTP support NDG Security.