SFTP Summary

We support both push and pull methods of file transfer for large batches. The following file formats are supported:

  • Parquet: This is the recommended format, as it is faster to process, incurs lower storage costs, and embeds schema information.
  • JSON Lines (.jsonl): Each line in the file is a separate JSON object representing a single transaction.
  • CSV: A standard comma-separated values file where the first line must be a header row.
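
To illustrate how the two text formats relate, here is a minimal sketch that converts CSV text with a header row into JSON Lines, one object per transaction. The column names (transaction_id, amount, currency) are hypothetical placeholders; the real fields come from the agreed transaction schema.

```python
import csv
import io
import json


def csv_to_jsonl(csv_text: str) -> str:
    """Convert CSV text (first line = header row) to JSON Lines text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # One compact JSON object per line, one line per transaction.
    lines = [json.dumps(row, separators=(",", ":")) for row in reader]
    return "\n".join(lines) + "\n" if lines else ""


# Hypothetical sample data, for illustration only.
sample_csv = "transaction_id,amount,currency\nt-001,12.50,EUR\nt-002,7.00,GBP\n"
sample_jsonl = csv_to_jsonl(sample_csv)
```

Note that csv.DictReader yields every value as a string, so numeric fields would still need to be cast according to the schema before sending.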

We then stream the contents of the file onto our internal Kafka broker, and the resulting output stream is written to a file in the agreed location.
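
As a rough illustration of the record-by-record flow, the sketch below reads a JSON Lines input and hands each transaction to a publish callback. It is not our actual ingestion code: stream_records, iter_jsonl_records and the publish callable are hypothetical names, and the callback stands in for whatever producer writes to the broker.

```python
import json
from typing import Callable, Iterable, Iterator


def iter_jsonl_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one transaction per non-empty line of a JSON Lines input."""
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number} is not valid JSON") from exc
        if not isinstance(record, dict):
            raise ValueError(f"line {line_number} is not a JSON object")
        yield record


def stream_records(lines: Iterable[str], publish: Callable[[dict], None]) -> int:
    """Pass each record to `publish` (a stand-in for a broker producer)."""
    count = 0
    for record in iter_jsonl_records(lines):
        publish(record)
        count += 1
    return count
```

Because the input is consumed line by line, an open file object can be passed directly as lines, so a large batch never has to be loaded into memory at once.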

For the schema, please use the one defined in "Transaction schema for Ingest & Egress process". The normalisation remarks also apply here.