We just upgraded our site to Drupal 8, and a big part of that was migrating content. Most content was in JSON files or SQL dumps, which are supported by Drupal's migrate module. But what about images and other files? How could we bring those along?

Since other people are also contending with this, we decided to share our solutions with the community.

Migrate background

Let's start with a brief introduction to migrations in Drupal 8. Suppose we want to import TV shows from TVmaze's JSON API into Article nodes. In Drupal 7, we would have written a class extending Migration, but in Drupal 8 we define most migrations with YAML files. We'll create migrate.migration.shows.yml:

id: shows
label: 'TV shows'
source:
  plugin: json_source
  path: 'http://api.tvmaze.com/search/shows?q=family'
  identifier: id
  identifierDepth: 1
  headers: {  }
  fields: {  }
  constants:
    type: article
    format: restricted_html
destination:
  plugin: 'entity:node'
process:
  title: name
  body/0/value: summary
  body/0/format: constants/format
  type: constants/type

There are three important sections here:

  1. source tells migrate where our content comes from, in this case a URL to a JSON file. We're using the contrib module migrate_source_json, and giving it a bunch of parameters it needs—don't worry too much about them.
  2. destination tells migrate what we want to generate, in this case nodes.
  3. process tells migrate how to turn our JSON into Drupal fields. We're taking the name from our JSON, and making it the node title. Similarly, summary becomes the node body, and we use constants from the source section to define the node type (article) and the body format (restricted_html).
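
To make the process mapping more concrete, here is a small plain-PHP sketch of the idea: each destination key is filled by looking up a slash-separated path in the source row. This is only an illustration, not Drupal's actual implementation. The helpers resolve_source() and apply_mapping() are hypothetical names, and the sample row is made up rather than real TVmaze data.

```php
<?php

/**
 * Look up a slash-separated path such as "constants/format" in a nested array.
 *
 * Hypothetical helper for illustration only; returns NULL if the path is missing.
 */
function resolve_source(array $row, $path) {
  $value = $row;
  foreach (explode('/', $path) as $key) {
    if (!is_array($value) || !array_key_exists($key, $value)) {
      return NULL;
    }
    $value = $value[$key];
  }
  return $value;
}

/**
 * Build destination values from a "destination => source path" mapping.
 *
 * Destination keys are kept as flat strings here to keep the sketch short.
 */
function apply_mapping(array $row, array $mapping) {
  $destination = [];
  foreach ($mapping as $field => $source_path) {
    $destination[$field] = resolve_source($row, $source_path);
  }
  return $destination;
}

// Made-up sample row; the constants mirror the source section of the YAML.
$row = [
  'name' => 'Example Show',
  'summary' => '<p>An example summary.</p>',
  'constants' => ['type' => 'article', 'format' => 'restricted_html'],
];

// Same shape as the process section of the YAML file.
$mapping = [
  'title' => 'name',
  'body/0/value' => 'summary',
  'body/0/format' => 'constants/format',
  'type' => 'constants/type',
];

$node_values = apply_mapping($row, $mapping);
```

In this sketch, $node_values['title'] ends up as 'Example Show' and $node_values['body/0/format'] as 'restricted_html', which mirrors what the YAML asks migrate to do for each row.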

Now we want to add images. For each show in the JSON file, image/original is the URL of its poster image. But we don't want to migrate the URL itself; we want to migrate the file that the URL points to! We'll need to transform the URL into something else, which is what process plugins are for.

Writing a process plugin

Drupal 8 migrate already has a bunch of different plugins, which can do things like concatenate values. But none of them will download a URL, so we'll need to write our own. Luckily, that's not too difficult! We just have to inherit from ProcessPluginBase, and implement the transform() method:

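// Namespace assumes the plugin lives in the migrate_process_file module.
namespace Drupal\migrate_process_file\Plugin\migrate\process;

use Drupal\migrate\MigrateException;
use Drupal\migrate\MigrateExecutableInterface;
use Drupal\migrate\MigrateSkipProcessException;
use Drupal\migrate\ProcessPluginBase;
use Drupal\migrate\Row;

/**
 * Import a file as a side-effect of a migration.
 *
 * @MigrateProcessPlugin(
 *   id = "file_import"
 * )
 */
class FileImport extends ProcessPluginBase {
  public function transform($value, MigrateExecutableInterface $migrate_executable, Row $row, $destination_property) {
    if (empty($value)) {
      // Skip this item if there's no URL.
      throw new MigrateSkipProcessException();
    }

    // Download the URL into public:// as a managed file, replacing any existing copy.
    $file = system_retrieve_file($value, 'public://', TRUE, FILE_EXISTS_REPLACE);
    if (!$file) {
      // system_retrieve_file() returns FALSE when the download or save fails.
      throw new MigrateException("Could not import file from $value");
    }

    // Return the file ID, which is what an image field expects.
    return $file->id();
  }
}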

Our method treats $value as a URL, and saves it as a Drupal managed file. Then it returns the file ID, which is suitable for image fields. That's all we need! (If your file is local instead of on the net, you'll have to use file_copy instead of system_retrieve_file.)
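
For the local-file case, a variant of the plugin might look roughly like the sketch below. It is untested and makes a few assumptions: the plugin ID local_file_import and the class name LocalFileImport are made up for illustration, the namespace assumes the migrate_process_file module, and $value is assumed to be a path that the web server can read. Since file_copy() expects a file entity rather than a plain path, the sketch wraps the path in an unsaved File entity first.

```php
<?php

namespace Drupal\migrate_process_file\Plugin\migrate\process;

use Drupal\file\Entity\File;
use Drupal\migrate\MigrateException;
use Drupal\migrate\MigrateExecutableInterface;
use Drupal\migrate\MigrateSkipProcessException;
use Drupal\migrate\ProcessPluginBase;
use Drupal\migrate\Row;

/**
 * Copy a local file into public:// as a side-effect of a migration.
 *
 * Hypothetical plugin: the ID "local_file_import" is made up for this sketch.
 *
 * @MigrateProcessPlugin(
 *   id = "local_file_import"
 * )
 */
class LocalFileImport extends ProcessPluginBase {
  public function transform($value, MigrateExecutableInterface $migrate_executable, Row $row, $destination_property) {
    if (empty($value)) {
      // Skip this item if there's no path.
      throw new MigrateSkipProcessException();
    }

    // Assumption: $value is a readable local path. Wrap it in an unsaved
    // file entity, because file_copy() takes a file entity as its source.
    $source = File::create(['uri' => $value]);

    // Copy into public:// and save the copy as a managed file.
    $file = file_copy($source, 'public://', FILE_EXISTS_REPLACE);
    if (!$file) {
      // file_copy() returns FALSE when the copy fails.
      throw new MigrateException("Could not copy file from $value");
    }

    return $file->id();
  }
}
```

The shape is the same as FileImport; only the call that fetches the file changes, so the process section of the YAML would just name this plugin instead.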

The @MigrateProcessPlugin annotation says that our plugin's name is file_import, so we can now update our process section:

process:
  # snip
  field_image:
    - source: image/original
      plugin: file_import

This means: To get the value of field_image, take the value of image/original and process it with the file_import plugin.

Running the migration

Now we have our completed migration definition YAML file. So let's run the migration!

First, we need to make sure we have all the right modules installed: migrate_source_json, our custom module with the process plugin, and the contrib module migrate_tools which contains drush commands to run migrations.

Next, we need to import our YAML file into our site's configuration. Finally, we can run the migration: drush migrate-import shows.

Imported TV shows

Now we have a list of TV shows on our site. Perfect!

I've put the code for our plugin in the module migrate_process_file. It also contains an example module, which allows the user to input a search term, and then imports some TV shows containing that term. Feel free to use any of this in your own projects!

What's next

There are many improvements that could be made to this module; I've listed a few in the README. We're already working on some of these for our site. Let us know if you have any other improvements in mind, and especially if you want to contribute. There's also a related issue for migrate_plus.

Want to hear more about our work with migrations? Check out our talk from MidCamp 2016 for more details about how we upgraded our site!

Do you have more questions about this? You can tweet to @djvasi on Twitter, or consider hiring Evolving Web to work on your migration project.