Usually, Drupal migrations get run at the beginning of a project to get all your content into Drupal. But sometimes, you need to create a migration that runs on a regular basis. For example, you have an external database of courses and programs maintained by a university that needs to be displayed on a Drupal site. Another example: you have a database of book data that needs to be pulled into Drupal nightly.

When working with migrations where the source files are updated every day, it can get really tedious to download the updated source files manually each time the migration runs.

In this tutorial, we'll write a source plugin based on the CSV source plugin which will allow us to automatically download CSV files from a remote server via SFTP before running migrations. This article was co-authored by my colleague Jigar Mehta - gracias Jigar for your contribution.

The Problem

In a project we worked on recently, we had the following situation:

  • CSV files are updated by a PowerShell script every night on the client's server.
  • These CSV files are accessible via SFTP.

Our task is to download the CSV source files over SFTP and to use them as our migration source.

Before We Start

The Plan

The goal is to avoid downloading the files manually every time we run our migrations, so we need a way to do this automatically whenever a migration is executed. To achieve this, we create a custom source plugin that extends the CSV plugin provided by the Migrate Source CSV module. Our plugin downloads the CSV file from the remote server and hands its local path to the CSV plugin, which processes it as usual.

The Source Migrate Plugin

To start, let's create a custom module called migrate_example_source and implement a custom migrate source plugin by creating a PHP class inside it at src/Plugin/migrate/source/MigrateExampleSourceRemoteCSV.php.

We start implementing the class by simply extending the CSV plugin provided by the migrate_source_csv module:

namespace Drupal\migrate_example_source\Plugin\migrate\source;

use Drupal\Core\Site\Settings;
use Drupal\migrate\MigrateException;
use Drupal\migrate\Plugin\MigrationInterface;
use Drupal\migrate_source_csv\Plugin\migrate\source\CSV as SourceCSV;
use phpseclib\Net\SFTP;

/**
 * Source plugin that downloads a CSV file over SFTP before reading it.
 *
 * @MigrateSource(
 *   id = "migrate_example_source_remote_csv"
 * )
 */
class MigrateExampleSourceRemoteCSV extends SourceCSV {}

If you are building a source plugin from scratch, extend the SourcePluginBase class instead of the CSV class used in this example. The @MigrateSource annotation is essential: it is what lets the migrate module discover our source plugin. The plugin uses the phpseclib/phpseclib library to make SFTP connections. The phpseclib\Net\SFTP class imported above belongs to the 2.x series of that library, so we add that version to the project by running the following command in the Drupal root:

composer require "phpseclib/phpseclib:^2.0"

Our plugin will download the source CSV file and will simply pass it to the CSV plugin to do the rest. We do the download when the plugin is being instantiated like this:

/**
 * {@inheritdoc}
 */
public function __construct(array $configuration, $plugin_id, $plugin_definition, MigrationInterface $migration) {
  // If SFTP connection parameters are present.
  if (!empty($configuration['sftp'])) {
    // A settings key must be specified.
    // We use the settings key to get SFTP configuration from $settings.
    if (!isset($configuration['sftp']['settings'])) {
      throw new MigrateException('Parameter "sftp/settings" not defined for Remote CSV source plugin.');
    }
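    // Merge plugin settings with the global settings stored under that key.
    // The union operator keeps any value already set on the plugin.
    $configuration['sftp'] += Settings::get($configuration['sftp']['settings'], []);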
    // We simply download the remote CSV file to a temporary path and set
    // the temporary path to the parent CSV plugin.
    $configuration['path'] = $this->downloadFile($configuration['sftp']);
  }
  // Having downloaded the remote CSV, we simply pass the call to the parent plugin.
  parent::__construct($configuration, $plugin_id, $plugin_definition, $migration);
}
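
The merge in the constructor relies on PHP's array union operator (+=), which only adds keys that are missing on the left-hand side. The following standalone sketch illustrates that behaviour with two hypothetical flat arrays; the variable names and the overridden port are assumptions made for the illustration and are not taken from the actual plugin:

```php
<?php

// Hypothetical values standing in for the "sftp" section of a migration.
$plugin_sftp = [
  'settings' => 'sftp',
  'path' => '/path/to/file/example_content.csv',
  // Hypothetical per-migration override.
  'port' => '2222',
];

// Hypothetical values standing in for the global connection settings.
$global_sftp = [
  'server' => 'ftp.example.com',
  'username' => 'username',
  'password' => 'password',
  'port' => '22',
];

// Array union: keys already present on the left are kept as they are,
// keys that exist only on the right are added.
$plugin_sftp += $global_sftp;

print $plugin_sftp['server'] . PHP_EOL; // ftp.example.com
print $plugin_sftp['port'] . PHP_EOL;   // 2222
```

In other words, anything set in the migration's source configuration takes precedence, and the global settings only fill in what is missing.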

In the constructor we are using global SFTP credentials with Settings::get(). We need to define the credentials in settings.php like this:

$settings['sftp'] = [
  'default' => [
    'server' => 'ftp.example.com',
    'username' => 'username',
    'password' => 'password',
    'port' => '22',
  ],
];

Once we have the credentials for the SFTP server, we use a downloadFile() method to download the remote CSV file. Here's an extract of the relevant code:

protected function downloadFile(array $conn_config) {
  ...
  // Prepare to download file to a temporary directory.
  $path_remote = $conn_config['path'];
  $basename = basename($path_remote);
  $path_local = file_directory_temp() . '/' . $basename;
  ...
  // Download file by SFTP and place it in temporary directory.
  $sftp = static::getSFTPConnection($conn_config);
  if (!$sftp->get($path_remote, $path_local)) {
    throw new MigrateException('Cannot download remote file ' . $basename . ' by SFTP.');
  }
  ...
  // Return the local path of the downloaded file.
  // This will in turn be passed to the parent CSV plugin.
  return $path_local;
}

Note: The code block above has been simplified a bit. If you look at the actual source plugin, you will find a few extra lines of code that make it more compatible with the migration_lookup plugin.
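
The extract calls getSFTPConnection() without showing it. As a rough idea of what such a helper might look like, here is a minimal sketch written as a standalone function against the phpseclib 2.x API. The function name, the required keys and the exception types are assumptions made for this illustration; the real plugin may differ, and inside a migrate plugin you would more likely throw a MigrateException:

```php
<?php

use phpseclib\Net\SFTP;

/**
 * Hypothetical helper: opens an SFTP connection from a config array.
 *
 * Assumes the keys 'server', 'username', 'password' and an optional 'port'.
 */
function example_get_sftp_connection(array $conn_config): SFTP {
  // Fail early, before any network access, if a required key is missing.
  foreach (['server', 'username', 'password'] as $key) {
    if (empty($conn_config[$key])) {
      throw new \InvalidArgumentException('Missing SFTP parameter "' . $key . '".');
    }
  }

  // The port may be stored as a string in settings.php, so cast it.
  $port = (int) ($conn_config['port'] ?? 22);

  // Connect and authenticate with a username and password.
  $sftp = new SFTP($conn_config['server'], $port);
  if (!$sftp->login($conn_config['username'], $conn_config['password'])) {
    throw new \RuntimeException('SFTP login failed for ' . $conn_config['server'] . '.');
  }

  return $sftp;
}
```

Back in downloadFile(), the object returned by a helper like this is what get() is called on to fetch the remote file.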

The downloadFile() method creates an SFTP connection, downloads the file to a temporary location and returns the path to the downloaded file. That temporary path is then passed to the Migrate Source CSV plugin, and that's it! Finally, to use the plugin in our migration, we just set our plugin as the source/plugin:

id: migrate_example_content
label: 'Example content'
...
source:
  plugin: migrate_example_source_remote_csv
  # Settings for our custom Remote CSV plugin.
  sftp:
    settings: sftp
    path: "/path/to/file/example_content.csv"
  # Settings for the contrib CSV plugin.
  header_row_count: 1
  keys:
    - id
...

The code for this plugin and the example module is available at migrate_example_source. Great!