SFTP via Cloud Connector Python Operator in SAP Data Intelligence

The Secure File Transfer Protocol (SFTP) is still a very common way to integrate files from different sources. SAP Data Intelligence supports many source systems for file operations out of the box. To allow for even more flexibility when connecting to SFTP servers, this blog post shows how to use the Python library Paramiko to read, write, list, or delete files on remote sources – even through the infamous Cloud Connector.

Connect Python Operator to SFTP via Cloud Connector:

Scenario: This blog post aims at establishing a connection to an on-premises SFTP server. We will show how to establish a TCP socket and use Paramiko to read, list, write, or delete files. The TCP connection socket is the basis for this blog post, so I recommend my previous blog post about the basics of establishing a TCP socket from a Python custom operator, including a detailed explanation of the role of the Connectivity Service: https://blogs.sap.com/2023/04/14/sap-data-intelligence-python-operators-and-cloud-connector-tcp/


Scenario Architecture


All prerequisites from the previous blog post apply.

For this case, we additionally need an SFTP server running on the same host as the Cloud Connector. To establish a local test setup on my Windows machine, I use a very simple dockerized SFTP server. It exposes an SFTP server on localhost on port 22. With the user foo and the password pass, we are allowed to interact with the contents of the folder /upload.

docker run -p 22:22 -d atmoz/sftp foo:pass:::upload
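The colon-separated argument to the atmoz/sftp image encodes user, password, UID, GID, and home folder; in our case the UID and GID are left empty. A tiny sketch (the helper function is ours, purely for illustration) shows how the spec decomposes:

```python
def parse_sftp_user_spec(spec: str) -> dict:
    # atmoz/sftp user spec: user:pass:uid:gid:dir (uid/gid may be empty)
    parts = (spec.split(":") + [""] * 5)[:5]
    user, password, uid, gid, directory = parts
    return {"user": user, "password": password, "uid": uid, "gid": gid, "dir": directory}

print(parse_sftp_user_spec("foo:pass:::upload"))
# {'user': 'foo', 'password': 'pass', 'uid': '', 'gid': '', 'dir': 'upload'}
```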

1. Configuration:

The very first step in the integration is to configure the Cloud Connector to expose the SFTP server to the respective BTP subaccount. The configuration looks as follows:

Cloud To On-premises Configuration

The localhost:22 is exposed to a virtual host that we can see in the BTP Cockpit.


BTP Cockpit Cloud Connector Resources

2. Creating a Data Intelligence Connection:

For our purposes, we do not want to hard-code the connection details, and we need a little help from the connection management to access the Connectivity Service of BTP. In the Connection Management application of SAP Data Intelligence we can create connections of all kinds of types. We create a connection of type HTTP with our host and port, and we specify SAP Cloud Connector as the gateway.
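Inside the operator, this connection is later resolved into a dictionary of connection properties. As a rough sketch (the exact field names depend on your Data Intelligence version; the values below are illustrative, matching the setup in this post), the configured connection resolves to something like:

```json
{
  "connectionID": "CC_SFTP",
  "connectionProperties": {
    "host": "virtualhost",
    "port": "22",
    "gateway": "Cloud Connector"
  }
}
```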


Data Intelligence Connection Management

Note: Not all connection types allow access through the Cloud Connector. See the official product documentation for details.

3. Developing a Custom Operator:

We use Paramiko and the sapcloudconnectorpythonsocket library. Both need to be installed in a Dockerfile and made available to the custom operator. Paramiko is a Python library that handles the communication with the SFTP server; the sapcloudconnectorpythonsocket library helps us open a socket via the Cloud Connector.

FROM $com.sap.sles.base

RUN python3 -m pip --no-cache-dir install 'sapcloudconnectorpythonsocket' --user
RUN python3 -m pip --no-cache-dir install 'paramiko' --user
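The Dockerfile alone is not enough: Data Intelligence matches operators to Docker images via tags. Assuming we tag the image with the two installed libraries (the tag names below are our own choice and must match the tags on the custom operator), the Dockerfile's tag configuration could look like this:

```json
{
  "sapcloudconnectorpythonsocket": "",
  "paramiko": ""
}
```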


Finally, let’s have a look at the actual custom script. The first important part is opening the TCP socket as described in the previous blog post. This TCP socket can then be passed to Paramiko’s SSHClient to establish the connection. Since we keep this really simple, we set a Paramiko policy that ignores missing host keys. With the open SFTP client, we can interact with the file server.

For the purpose of demonstration, we create a file called test_create_file.txt in the /upload directory. We then create a new directory, list the contents of the /upload directory, read the content of the created file, and remove it again.

Paramiko can do a lot more than the file operations shown here: it implements the SSHv2 protocol and can also be used to execute shell commands remotely. That opens a big door for hacky integration scenarios – even fully on-premises 🙂

Check out the Paramiko documentation for more details.

import io
import paramiko
from sapcloudconnectorpythonsocket import CloudConnectorSocket

# HTTP connection created in step 2, selected in the operator configuration
sftp_connection = api.config.http_connection


def gen():
    api.logger.info("Generator Start")
    # Open the TCP socket via the Cloud Connector as described in the previous
    # blog post; proxy_host, proxy_port and token are obtained exactly as shown there
    cc_socket = CloudConnectorSocket()
    cc_socket.connect(
        dest_host=sftp_connection["connectionProperties"]["host"],
        dest_port=int(sftp_connection["connectionProperties"]["port"]),
        proxy_host=proxy_host,
        proxy_port=proxy_port,
        token=token,
    )

    # Pass the open socket to Paramiko; for simplicity, ignore missing host keys
    ssh_client = paramiko.SSHClient()
    ssh_client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh_client.connect(hostname=sftp_connection["connectionProperties"]["host"],
                       username="foo", password="pass", sock=cc_socket)

    sftp = ssh_client.open_sftp()
    api.logger.info("SFTP Client open")

    # Create a file in /upload and write to it
    file = sftp.file("/upload/test_create_file.txt", "a", -1)
    file.write("Hello SFTP.!\n")
    file.close()

    # Create a directory and list the contents of /upload
    sftp.mkdir("/upload/testdir")
    dir_objects = sftp.listdir_attr("/upload")
    file_names = [e.filename for e in dir_objects]
    api.logger.info(str(file_names))

    # Read the created file into memory, then remove it again
    file_obj = io.BytesIO()
    sftp.getfo("/upload/test_create_file.txt", file_obj)
    api.logger.info(str(file_obj.getvalue()))
    sftp.remove("/upload/test_create_file.txt")
    ssh_client.close()

    api.logger.info("Generator End")
    api.send("file", api.Message(file_obj.getvalue(), {"file": {"path": "/shared/pythoncloudconnector/test_create_file.txt"}, "file_name": "test_create_file.txt"}))


api.add_generator(gen)
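The attributes on the outgoing message are what the downstream standard operator evaluates when writing the file. As a small, self-contained sketch (the helper function is ours, not part of the Data Intelligence API), the body/attribute pair the script emits can be built and inspected like this:

```python
def build_file_message(content: bytes, target_path: str):
    # Body carries the file bytes; the attributes tell the downstream
    # file-writing operator the target path and file name.
    attributes = {
        "file": {"path": target_path},
        "file_name": target_path.rsplit("/", 1)[-1],
    }
    return content, attributes


body, attrs = build_file_message(
    b"Hello SFTP.!\n",
    "/shared/pythoncloudconnector/test_create_file.txt",
)
print(attrs["file_name"])  # test_create_file.txt
```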



4. Testing the Custom Operator:

To test the new custom SFTP operator, we create a Data Intelligence graph. In the example, I added an output port of type message.file to the custom operator. It connects to the Write File operator, which is configured to write to DI_DATA_LAKE. For the SFTP operator, we add the ID of the connection we created before as a parameter.


Data Intelligence Graph

In the logs of the Graph execution, we can see how the file and directory are created on the SFTP server. Additionally, we can also retrieve the file content “Hello SFTP.!”.

Generator Start 
SFTP Client open 
['testdir', 'test_create_file.txt'] 
b'Hello SFTP.!\n' 
Generator End 

Finally, we can check the Metadata Explorer and see the newly imported file.

Looking at the file content, we can see it worked well! We are able to connect to the SFTP server and integrate files with the standard operators.

Hello SFTP.!

Hope you find the content of this blog helpful. Feel free to comment for further clarifications.
