
Tag1 Consulting: Migrating Your Data from D7 to D10: Debugging tips, performance considerations, Drupal CMS, AI-assisted migrations and more!

Series Overview & ToC | Previous Article | Final Article

Welcome to the last article in the series. Today, we'll wrap up by covering how to debug migrations, performance considerations, and the importance of checking your site for broken links before launch. We'll also briefly discuss migrating into Drupal CMS and using AI to assist with the migration process. Before doing so, let's talk about other migrations that you might need to implement in your project.

More migrations

We wrote 25 migrations throughout the series, but there's no need to stop here. Below is a list of common — and not so common — migrations that might need to be implemented:

- URL aliases: By default, content entity migrations do not import their URL aliases. A dedicated d7_url_alias migration exists for that purpose. It queries Drupal 7's url_alias table and creates Drupal 10 URL alias (path_alias) entities.
- Pathauto: The pathauto_settings and pathauto_patterns migrations import the Pathauto module's configuration and URL patterns, respectively.
- Redirects: The d7_path_redirect migration queries Drupal 7's redirect table and creates Drupal 10 redirect entities.
- Menus: The menu and menu_links migrations import menu (configuration) and menu item (content) entities, respectively.
- Metatags: The Metatag module provides some configuration-related migrations. They include adding a metatags field to entities, which gets populated when importing the content. For example, if the Metatag module is enabled in Drupal 10 before starting the automated upgrade process, the generated upgrade_d7_node_page migration will account for importing metatag data for nodes of type basic page.
- Blocks: The d7_custom_block migration takes care of importing block content entities into the basic block type. That block type comes out of the box with the standard installation profile or can be added by applying the basic_block_type recipe. Note that blocks have two related configuration entities: block_content_type and block. The first is the block type. The second stores block placement data within a theme. The block_content_type and d7_block migrations, included in the block_content and block modules respectively, account for importing those configuration entities.
- Comments: The Comment module provides multiple migrations to import comment-related data. Most of them handle importing configuration. The d7_comment migration is the one responsible for importing content entities. Note that the comment entity in Drupal 10 is fieldable and can have multiple bundles. That is, different comment types might exist, similar to having multiple content types, each with a different set of attached fields. Unless you are customizing your migration, you might end up with multiple comment migrations after an automated migration. Refer to this documentation page to learn more about some caveats associated with migrating comments.

In all cases above, we link to the migration plugins included with each module. When running the automated migration process with the Migrate Upgrade module, you will get derived migrations prefixed, by default, with upgrade_. For example, the d7_url_alias migration will be generated as upgrade_d7_url_alias. In some cases, migration derivers are implemented, as with nodes, where a separate migration is created for each content type. For instance, upgrade_d7_node_article and upgrade_d7_node_page.

Some upgrade paths are not fully automated, but there are modules that can help get you started.
Below are some examples:

- For webforms, use the Webform Migrate module.
- For views, use the Views Migration module.
- For media, use the Media Migration or Migrate Media Handler module. The Tag1 Team Talks podcast on file and media migrations contains lots of useful information on migrating media.

Debugging migrations

Migrations have a lot of moving pieces and, as noted in previous articles, they interact with other APIs in Drupal. For example, while running the migration for a content type, you will interact with the Entity and Field APIs. If content validation is enabled, the entity validation API will also come into play. Another important factor to consider is which modules you have enabled in Drupal 10 at the time the migrations are executed. Hook implementations and event subscribers will be called either by the Migrate API itself or by other subsystems.

Debugging is an essential tool for development in general. To learn more about this topic, check out this presentation by Randy Ray. For migration-specific tips, review this documentation page. When it comes to debugging Drupal migrations, I often perform the following four actions.

1. Check if the migration's definition is what you expect

What do I mean by this? That the migration YAML file you see in your editor accurately represents what the Migrate API is executing during import operations. Migrations are plugins and, as such, they can be modified dynamically. Altering a migration plugin allows you to replace part of the migration definition with values read from settings, configuration, or environment variables. An example would be injecting API keys into migrations that interact with third-party services. This prevents exposing the API keys in plain text as part of the migration file and committing them to the code repository.

Modifying migrations dynamically is usually done by implementing hook_migration_plugins_alter, a technique used by modules like Migrate Plus, Migrate Tools, Paragraphs, Media Migration, and Commerce Migrate, among others. There is nothing wrong with altering migrations dynamically. In fact, it's to be expected, given there is an API for it. Problems could arise when you are altering migrations in your custom code without considering that other enabled modules might also be doing the same.

You can use the migration plugin manager to check the final state of a migration after all dynamic modifications. That is what ultimately gets called when the import operation is triggered. Back in article 16, we explained how to obtain the definition of a migration plugin, along with other plugin types used by the Migrate API. As a reminder, refer to the following snippet:

```bash
# List of migration plugin ids.
ddev drush php:eval "print_r(array_keys(\Drupal::service('plugin.manager.migration')->getDefinitions()));"

# Details on a specific migration plugin id.
ddev drush php:eval "print_r(\Drupal::service('plugin.manager.migration')->getDefinitions()['MIGRATION_ID']);"
```
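As an illustration, here is a minimal sketch of such an alter hook. It injects an API key from settings.php into a hypothetical migration with the ID example_api_content; both that ID and the tag1_migration_api_key setting name are placeholders, not code from the series:

```php
<?php

/**
 * Implements hook_migration_plugins_alter().
 */
function tag1_migration_migration_plugins_alter(array &$migrations) {
  // 'example_api_content' is a hypothetical migration that pulls data from a
  // third-party service and expects an 'api_key' source configuration value.
  if (isset($migrations['example_api_content'])) {
    // Read the key from settings.php so it never lands in the repository.
    $migrations['example_api_content']['source']['api_key'] =
      \Drupal\Core\Site\Settings::get('tag1_migration_api_key');
  }
}
```

After adding the hook, rebuild caches so the new implementation is picked up. The php:eval snippets above will then show the altered definition.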
2. Check the query executed against Drupal 7

All the migrations presented in this series use source plugins that fetch data from a Drupal 7 database. If you suspect that something is not right with the data being retrieved, it's a good idea to check the query the migration executes against the Drupal 7 site. You can do this by implementing hook_query_TAG_alter. In our custom tag1_migration module, the following code will log the query executed by our migrations:

```php
<?php

use Drupal\Core\Database\Query\AlterableInterface;

/**
 * Implements hook_query_TAG_alter().
 *
 * @see \Drupal\migrate\Plugin\migrate\source\SqlBase::prepareQuery()
 */
function tag1_migration_query_migrate_alter(AlterableInterface $query) {
  /** @var \Drupal\migrate\Plugin\MigrationInterface $migration */
  $migration = $query->getMetaData('migration');
  $migration_tags = $migration->getMigrationTags();
  // Only act when a migration uses a tag related to our project.
  if (count(array_intersect(['tag1_content', 'tag1_configuration'], $migration_tags)) === 0) {
    return;
  }
  // Replace the query placeholders with their actual values.
  $placeholder_ids = array_keys($query->getArguments());
  $placeholder_values = [];
  foreach ($query->getArguments() as $arg) {
    $placeholder_values[] = "'$arg'";
  }
  $query_string = str_replace($placeholder_ids, $placeholder_values, (string) $query);
  $query_string = preg_replace('/["\[\]{}]/', '', $query_string);
  \Drupal::logger('tag1_migration')->debug('The @migration_id migration executed the following query: @query', [
    '@migration_id' => $migration->getPluginId(),
    '@query' => $query_string,
  ]);
}
```

A detailed review of the PHP code for the hook implementation will be left as an exercise to the curious reader. The one thing I want to highlight is that the snippet above only logs the query assembled by the query method of the source plugin. Most notably, it would not log any queries that could be executed in the prepareRow method of the source plugin. In fact, executing a single migration can trigger multiple queries against the source site. The snippet above focuses on the one that is arguably most important. If you need to inspect extra data fetched by the prepareRow method or implementations of hook_migrate_prepare_row, it's best to resort to a debugger.
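That said, if you want a quick look at what a row contains after prepareRow runs before reaching for a debugger, a temporary logging hook can help. This is a minimal sketch, reusing the upgrade_d7_node_article migration mentioned earlier; remove it once you are done troubleshooting:

```php
<?php

use Drupal\migrate\Plugin\MigrateSourceInterface;
use Drupal\migrate\Plugin\MigrationInterface;
use Drupal\migrate\Row;

/**
 * Implements hook_migrate_prepare_row().
 */
function tag1_migration_migrate_prepare_row(Row $row, MigrateSourceInterface $source, MigrationInterface $migration) {
  // Only inspect rows of the migration we are interested in.
  if ($migration->getPluginId() === 'upgrade_d7_node_article') {
    // Log the raw source values, including anything added by prepareRow().
    \Drupal::logger('tag1_migration')->debug('<pre>@data</pre>', [
      '@data' => print_r($row->getSource(), TRUE),
    ]);
  }
}
```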
3. Use a debugger like Xdebug

Using a proper debugger like Xdebug is imperative to debug Drupal migrations effectively. In addition to identifying and fixing errors, debuggers are a fantastic tool to study and understand how a system works. Over the years, I have dedicated a good amount of time step-debugging migration-related code to understand how the Migrate API works and how it interacts with other Drupal subsystems. When debugging migrations, the most common locations where I add breakpoints are:

- The import method of the \Drupal\migrate\MigrateExecutable class. In this method, the source plugin retrieves data, the process pipeline is executed, and the processed data is sent to the destination plugin, which in our examples creates Drupal entities. My preferred location to add a breakpoint is the line where the import method is invoked on the destination plugin. At this point, you can inspect the $row variable to review all data retrieved by the source plugin, including alterations made while preparing the row, and the result of applying the process pipeline.
- The transform method in process plugins. Step-debugging this method will let you see which data is sent to it, how it is manipulated, and what the returned value is. This is particularly useful when a process plugin exposes configuration options that affect how it behaves. This technique also lets you investigate why you do not get the expected results after chaining multiple plugins. A common cause of confusion in process plugin chains is knowing whether a process plugin can process multiple values or not (as explained in article 27). A sketch showing where transform fits in a process plugin appears at the end of this section.

Technical note: It's possible for a process plugin to not include a transform method. When this is the case, you need to specify a value for the method configuration option. The base class for process plugins then checks if the indicated method exists in the class that defined the process plugin and calls it. An example of this is skip_on_empty, which needs to be configured with either method: row or method: process.

When using a debugger to troubleshoot code, make sure to turn it off when you finish reviewing what needs to be debugged. Xdebug and similar tools add overhead to the execution of the application. In the case of migrations, having a debugger on while running a full migration can slow down the process significantly. That is why Drush disables Xdebug by default. If you are running your migration from the command line using Drush, you will need to use the --xdebug flag like this:

```bash
drush migrate:import upgrade_d7_user --xdebug
```

I also want to highlight two modules that can assist with debugging migrations:

- Migrate Devel helps with inspecting retrieved source data, the result of the process pipeline, and the destination ID values.
- Migrate Sandbox helps assemble and check complex process pipelines with either test data or real records retrieved by a migration source plugin.
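To make the transform discussion concrete, below is a minimal sketch of a custom process plugin. The tag1_uppercase plugin ID and its uppercasing behavior are hypothetical, chosen only to show where a breakpoint in transform would sit:

```php
<?php

namespace Drupal\tag1_migration\Plugin\migrate\process;

use Drupal\migrate\MigrateExecutableInterface;
use Drupal\migrate\ProcessPluginBase;
use Drupal\migrate\Row;

/**
 * Uppercases a value. A hypothetical plugin for illustration only.
 *
 * @MigrateProcessPlugin(
 *   id = "tag1_uppercase"
 * )
 */
class Tag1Uppercase extends ProcessPluginBase {

  /**
   * {@inheritdoc}
   */
  public function transform($value, MigrateExecutableInterface $migrate_executable, Row $row, $destination_property) {
    // A breakpoint on the next line shows the incoming $value and lets you
    // step through the manipulation before the result moves down the pipeline.
    return mb_strtoupper((string) $value);
  }

}
```

In a process pipeline, it would be referenced as plugin: tag1_uppercase, and every value flowing through that step would stop at the breakpoint.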
4. Inspecting migration ID map tables

Sometimes you do not notice issues in your migrations until later on. In these cases, migration ID map tables are available to help troubleshoot. These tables can be used to understand why an entity was not migrated or why a field was not populated as expected.

When a migration is executed, the Migrate API creates two tables automatically: an ID map table and a messages table. We discussed migration messages in article 14. As for ID map tables, they are used to keep track of which entities from the source have already been processed. For each source record, the system tracks whether the import process was successful or not. In cases where the import succeeds, destination identifiers are also stored for each source record in the map table.

To better explain the above, consider the upgrade_d7_user migration from article 25. Assume we modify it slightly so that entity IDs are not preserved. In its process pipeline, the migration skips Drupal 7 users that were blocked. Below is a sample of how the ID map table for this migration could look:

```sql
SELECT sourceid1, destid1, source_row_status, rollback_action, last_imported
FROM migrate_map_upgrade_d7_user;

+-----------+---------+-------------------+-----------------+---------------+
| sourceid1 | destid1 | source_row_status | rollback_action | last_imported |
+-----------+---------+-------------------+-----------------+---------------+
|        13 |      17 |                 0 |               0 |    1736882479 |
|        19 |    NULL |                 2 |               0 |    1736882479 |
+-----------+---------+-------------------+-----------------+---------------+
```

To get started, the default name for the ID map table is the migration ID prefixed by migrate_map_. For upgrade_d7_user, the corresponding table is migrate_map_upgrade_d7_user. If a migration ID is too long, the ID map and messages table names will be truncated. Check this core issue if you stumble into issues with map lookup operations.

The source plugin for the user migration defines a single unique identifier: Drupal 7's user ID. The destination plugin that creates Drupal 10 users also leverages the user ID as its unique identifier. The table above indicates that Drupal 7's user with uid 13 was imported into Drupal 10 as the user with uid 17, while Drupal 7's user with uid 19 was not migrated. While not enforced by the API, it's possible to have a note in the messages table indicating why the record was skipped (as discussed in article 14).

The values of the source_row_status column correspond to constants defined in MigrateIdMapInterface:

- 0: the import of the row was successful.
- 1: the row needs to be updated.
- 2: the import of the row was ignored.
- 3: the import of the row failed.

The values of the rollback_action column also correspond to constants defined in MigrateIdMapInterface:

- 0: the data for the row is to be deleted upon rollback of its migration.
- 1: the data for the row is to be preserved upon rollback of its migration.

The last_imported column contains a UNIX timestamp of the last time the row was imported. The ID map table also contains a source_ids_hash column, which stores a hash of the source IDs. Another column named hash might contain a hash of all source row data for detecting changes. This column is populated when the track_changes configuration option is set to TRUE in your source plugin.

Stepping back to the topic of troubleshooting migrations, the migration ID map and messages tables are very useful. They store key information about all the records processed by the migrations and let you know whether they were successfully imported or not. Many times I have been asked why an entity was not migrated or why a field was not populated. This does not necessarily mean a problem in the migration. Some elements might be excluded on purpose, and the client just needs to be reminded of the conditions that led to such an outcome.

Once the migration is finished, you could delete the migration tables. However, I highly recommend keeping them for a while after launch. They are the paper trail of your migration efforts. If you decide to delete these tables, take a look at this documentation page for information on how to do it.
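You can also interrogate the ID map through the API instead of writing SQL by hand. Here is a minimal sketch, runnable as a drush php:script file, based on the upgrade_d7_user example above:

```php
<?php

use Drupal\migrate\Plugin\MigrateIdMapInterface;

// Load the migration plugin and its ID map.
$migration = \Drupal::service('plugin.manager.migration')
  ->createInstance('upgrade_d7_user');
$id_map = $migration->getIdMap();

// Destination IDs for a source record; empty if it was never imported.
$destination_ids = $id_map->lookupDestinationIds(['uid' => 19]);

// The full map row includes source_row_status, which can be compared
// against the constants on MigrateIdMapInterface.
$map_row = $id_map->getRowBySource(['uid' => 19]);
if ((int) $map_row['source_row_status'] === MigrateIdMapInterface::STATUS_IGNORED) {
  // The row was deliberately skipped, for example by skip_on_empty.
}
```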
Migration performance

Over the years, I have had the opportunity to work on very large migration projects. In some cases, the sites have been running for more than a decade. The volume of content that has accumulated is so large that running a full migration can take multiple days to complete. The tips below will yield greater performance gains for large migrations, but they can also be applied in most projects.

Let's start our discussion on performance by considering how we execute the migrations. The migration runners present in Drush core and in modules like Migrate Tools allow you to execute multiple migrations in a single command. Some ways to do this are using the --tag, --group, or --execute-dependencies flags, depending on the runner of choice. Instead of doing this, it's better to execute each migration individually. I usually create a bash script where I list the migrations that need to be executed, like in the following snippet:

```bash
#!/usr/bin/env bash
echo "Migration started at $(date)"

drush migrate:import upgrade_d7_file
drush migrate:import upgrade_d7_user
drush migrate:import upgrade_d7_taxonomy_term_tags
drush migrate:import upgrade_d7_node_article
drush migrate:import upgrade_d7_node_page
drush migrate:import upgrade_d7_url_alias

echo "Migration completed at $(date)"
```

Make sure to list the migrations in the proper order in which they need to be executed. If you want to take it a step further, you can adjust the script to run some migrations in parallel. Running migrations for the same destination entity type at the same time is generally fine. If you go this route, you need to be mindful of dependencies among migrations, especially when there are lookup operations in the process pipeline.

Before going further, let's pause for a bit to discuss a topic that we have not given much attention to: migration runners. By now, executing drush migrate:import might be second nature to you. But where does this command come from? And are there alternatives? The import command, and other migration-related ones, have been part of Drush core since version 10.4. Before that, a common way to run migrations was using the Migrate Tools module. In addition to providing Drush commands, this module also offers a user interface for running migrations. Other runners include the Migrate Upgrade, Migrate Scheduler, Migrate Manifest, Migrate Cron, and Migrate Source UI modules. Reading about the features of each of these modules will be left as an exercise to the curious reader.

The reason I bring up this topic now is to share an anecdote related to migration runners and performance. Some time ago I was working on a very large project. A few content types had millions of nodes associated with them. Executing those migrations would take hours — and that was expected. What was strange is that after the last node was processed, the migration seemed to halt for a few extra hours. With the help of a colleague, the Xdebug profiler, and the profile data visualization tool KCachegrind, we were able to identify the bottleneck. In short, the migration runner we were using had a post-import operation that looped over all the records written to the migration's ID map table. In practice, this meant performing extra operations on millions of records. After reviewing the code, it was determined that the extra operation was not needed in our project. By changing from one migration runner to another, we were able to save a significant amount of time when running a full migration. The moral of this story: take time to understand the tools you use, both for executing the migration and for debugging issues when they inevitably arise. Nowadays, if you use DDEV (which we highly recommend), you can visually profile your application using XHGui — a graphical interface for XHProf.

In the above example, we saw how skipping a post-import operation saved us a lot of time. Are there other things we can bypass to speed up the migration execution time? Yes. In most cases, it will depend on your specific setup, which modules are enabled at the time the migration is executed, and how they are configured. Remember that the Migrate API will interact with other Drupal APIs. This means enabled modules can react to entity CRUD operations. As an example, consider a site using the Search API module. You should delay the indexing of content until after the migration has fully completed. Some of the ways to accomplish this are:

- When adding an index, configure it so that elements are not indexed immediately.
- Disable the index via a configuration override (see the sketch after this list).
- Disable the index using Drush commands. This can be incorporated into the same script that is used to execute the migrations as follows:

```bash
#!/usr/bin/env bash
echo "Migration started at $(date)"

drush search-api:disable-all
# Execute migrations with drush migrate:import ...
drush search-api:enable-all

echo "Migration completed at $(date)"
```

Delaying the indexing of migrated content will have a greater performance impact when you are using external search backends like Apache Solr.
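Here is a minimal sketch of the configuration override approach, assuming a hypothetical index with the machine name default_index; adjust it to match your site:

```php
<?php

// In settings.php or settings.local.php. "default_index" is a hypothetical
// Search API index machine name.
$config['search_api.index.default_index']['options']['index_directly'] = FALSE;
```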
You can do something similar with any operation that talks to an external API when entities are being created in Drupal. Some internal Drupal operations can be quite expensive too. For example, modules that define access controls might require node access records to be rebuilt. Other operations might not be as expensive, but deferring them can still improve performance, like deferring the generation of thumbnails for media entities (as described in article 28). But do not take any of these examples as universal truths. Assess the requirements of your project and act accordingly. While creating the migration scripts, you might want to focus on the integrity of the data. Once the data is migrated, you can then proceed to implement and test other aspects of the project, like search.

In the snippet above, we took advantage of Drush commands provided by the Search API module to disable and enable indexing at the appropriate time during the migration process. Not every module provides commands that can be used to toggle certain features on demand. One option is to disable and re-enable modules that perform expensive operations. However, this is far from ideal, because we are changing the site's configuration and its overall behaviour. A better alternative is to disable some hooks by leveraging hook_module_implements_alter. If you follow this path, consider using Migrate Boost, which temporarily disables hooks only while running migrations.
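Here is a minimal sketch of that technique. The expensive_module name and the targeted hook are hypothetical; in a real project you would guard this so it only applies while migrations run, or simply use Migrate Boost instead:

```php
<?php

/**
 * Implements hook_module_implements_alter().
 */
function tag1_migration_module_implements_alter(array &$implementations, $hook) {
  // Skip a hypothetical module's expensive reaction to node creation.
  if ($hook === 'node_insert' && isset($implementations['expensive_module'])) {
    unset($implementations['expensive_module']);
  }
}
```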
Is there anything else we can disable? Yes. Before covering that, let me reiterate: do not disable or remove things blindly. Carefully consider the implications of the actions you take. We are providing general advice here. At the very minimum, you should measure the impact of these changes, one at a time, to determine whether each one makes a real difference in the performance of your migration project.

That said, you can disable entity caching and migration discovery caching. The first can lead to performance improvements because Drupal will skip caching the recently migrated entities. Caches can be warmed in bulk after the migration is complete. Disabling the migration discovery cache is mostly a convenience while developing migrations. By doing so, you do not need to constantly rebuild caches or sync configuration when using Migrate Plus's migration configuration entities. To disable these cache bins, copy web/sites/example.settings.local.php to web/sites/default/settings.local.php. Make sure this file is included by web/sites/default/settings.php. Then, in the newly created settings.local.php file, add this code:

```php
$settings['container_yamls'][] = DRUPAL_ROOT . '/sites/development.services.yml';
$settings['cache']['bins']['entity'] = 'cache.backend.null';
$settings['cache']['bins']['discovery_migration'] = 'cache.backend.memory';
```

At the risk of stating the obvious, the above should only be done during local development. Never disable the entity cache in a production environment unless you are completely certain of the implications of doing so.

The last thing I want to discuss on the topic of performance is where to execute the migrations. Ideally, the Drupal 7 database is located on the same server where the Drupal 10 site is hosted. If possible, the Drupal 7 uploaded files should also be located on the same server (as discussed in article 24). Depending on the size of the project and data privacy requirements, it's possible to run the migrations in a local environment on one of the developers' machines. Alternatively, you can provision a temporary server where the migration can be executed. If possible, avoid running the migration directly on your hosting provider. This is not about avoiding migrations in a production environment. Rather, do not have two projects hosted on different servers and run a migration directly between them. Only twice have I seen migrations executed directly on staging servers within the client's infrastructure. In both cases, this was a hard requirement derived from very strict data protection policies.

Before launch, check your site for broken links

Launching a site requires thoughtful consideration on multiple fronts: accessibility, infrastructure, security, performance, search engine optimization, marketing automation, uptime monitoring, third-party service integrations, and more. Covering all of them is outside the scope of this series. We recommend taking a look at the checklist for launching a site included in the Drupal.org documentation guide on administering sites to get you started. That said, there is one thing you must do before going live with the new site: account for URL changes.

Once I read about the case of an ecommerce platform losing over 50% of its revenue after updating their site. They attributed the loss to a change in the URL structure for products. After years of being highly ranked in search engines, the URLs that were indexed had changed with no redirects in place. This might seem like an extreme, unlikely example until things go wrong. In the context of a custom Drupal 7 to 10 migration, changes in URL structure are common. This can be caused by:

- Changes in entity types. For example, creating users from nodes. A redirect would need to be added from the old node to the new user. Consider both the node's canonical URL (e.g. /node/7) and the node's path alias (e.g. /hello-world).
- Changes in Pathauto patterns. For example, a content type whose Pathauto pattern changed. When the node migration happens, if the Pathauto module is enabled in the new site, the node will receive a path alias based on the new patterns.

To mitigate the impact of broken links, consider the following:

- Pay close attention to the URL alias, Pathauto, and redirect migrations. It's very likely that you will have to customize them. It's also likely that you will have to create new, custom migrations altogether. In a recent project, I used a custom migration to create Drupal 11 redirects out of Drupal 7 path aliases for content types whose URL pattern had changed. This involved a custom source plugin that joined data from Drupal 7's url_alias and node tables. The data retrieved was used by a custom migration to create redirect entities in the new site (a sketch of such a source plugin appears at the end of this section).
- Involve members from multiple teams. While this series has been focused on the technical aspects of migrations, upgrading a site to a new platform does not happen in a vacuum. Large projects tend to have dedicated teams focused on analytics and marketing efforts. Whether or not that is the case for your project, try to get a report of the most visited pages and sections of the site so that the technical team can pay extra attention to them. Involve the quality assurance team to assist with manual and automated reviews. Speaking of which...
- Leverage tools for automated checks. Within Drupal, the Link checker module can find broken links in content entities. There is also a report available at /admin/reports/page-not-found that lists the top requested URLs that do not exist on the site.
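To give you an idea of what such a source plugin could look like, below is a hypothetical sketch, not the exact code from that project. It joins Drupal 7's url_alias and node tables and filters by a content type, here article, whose URL pattern changed:

```php
<?php

namespace Drupal\tag1_migration\Plugin\migrate\source;

use Drupal\migrate_drupal\Plugin\migrate\source\DrupalSqlBase;

/**
 * Drupal 7 path aliases joined to their nodes. A hypothetical sketch.
 *
 * @MigrateSource(
 *   id = "tag1_d7_node_alias",
 *   source_module = "path"
 * )
 */
class Tag1NodeAlias extends DrupalSqlBase {

  /**
   * {@inheritdoc}
   */
  public function query() {
    $query = $this->select('url_alias', 'ua')
      ->fields('ua', ['pid', 'source', 'alias']);
    // The url_alias.source column holds paths like "node/123"; join them
    // back to the node table so rows can be filtered by content type.
    $query->innerJoin('node', 'n', "ua.source = CONCAT('node/', n.nid)");
    $query->fields('n', ['nid', 'type'])
      ->condition('n.type', 'article');
    return $query;
  }

  /**
   * {@inheritdoc}
   */
  public function fields() {
    return [
      'pid' => $this->t('URL alias ID'),
      'source' => $this->t('Drupal 7 system path'),
      'alias' => $this->t('Drupal 7 path alias'),
      'nid' => $this->t('Node ID'),
      'type' => $this->t('Content type'),
    ];
  }

  /**
   * {@inheritdoc}
   */
  public function getIds() {
    return ['pid' => ['type' => 'integer', 'alias' => 'ua']];
  }

}
```

Each row then carries the old alias alongside the node it pointed to, ready to be mapped to a redirect entity in the destination site.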
Outside of Drupal, you can use reports from your analytics software to identify broken links. Also, website crawling software can flag other types of errors. For instance, it can identify duplicate content being served from multiple URLs. Some tools will assist you in taking preventive action, while others will help in correcting oversights in your migration. As a general recommendation, have tools at your disposal to periodically check for broken links before and after the migration.

Drupal CMS and AI-assisted migrations

The series has focused on creating custom migrations for projects built on top of Drupal core. By now, it's likely that you have heard about Drupal CMS, a ready-to-use platform created to empower marketing teams, content creators, and site builders. Dries Buytaert, Drupal founder and project lead, says the goal of the new platform is to be the gold standard for no-code website building. When you install Drupal CMS, you get features such as advanced media management, SEO tools, AI-driven website building, consent management, analytics, search, automatic updates, and more. To learn more about Drupal CMS as a product, read its official documentation and join the conversation in the multiple Drupal Slack channels: drupal-cms-development, drupal-cms-support, drupal-cms-marketplace, and drupal-cms-templates.

One of the landmark features of Drupal CMS is the use of recipes to provide an out-of-the-box experience following community best practices. Among other things, a newly installed Drupal CMS site will have a known and consistent content model. While there are no promises that the content model will remain unchanged, by the time features are included in Drupal CMS they are very polished. The benefit of having a stable content model is that it's easier to build tools to migrate content into it. At DrupalCon Atlanta 2025, a new initiative was launched to provide automated migrations into Drupal CMS and Experience Builder — from paragraphs, layout builder, and more. Some of these automated migrations will not even use the Migrate API! Instead, they will leverage artificial intelligence (AI) to help with the task.

Going back in time to DrupalCon Barcelona 2024, the DriesNote included a demo on migrating content into Drupal using AI. By DrupalCon Atlanta 2025, AI integration into Drupal CMS had evolved significantly beyond content migration. Remember that Drupal CMS is built on top of Drupal core. As such, these AI features can be leveraged in any Drupal project. AI-assisted migrations, as well as AI-assisted Drupal development, are a reality today. Anything that I write about artificial intelligence in Drupal is likely to be outdated in the near future. If leveraging AI for Drupal development is something that interests you, I recommend this presentation by Fabian Franz and Marco Molinari. Also, follow the development of the AI module, check its official documentation, and join the conversation in the #ai channel in Drupal Slack.

To assess how AI can help today with migrations that leverage the Migrate API, I asked one of the most popular AI assistants some of the questions we answered during this migration series: how to migrate Drupal 7 nodes as Drupal 10 users, how to create a custom source plugin that excludes unpublished nodes, how to create a custom process plugin that combines data from multiple Drupal 7 fields into a single Drupal 10 field, and more. The results were mixed.
Some answers could achieve the desired results with a little tweaking, while others were objectively wrong. AI tools can provide a lot of help, but do not trust them blindly. While they can guide you in the right direction, at times their output is not the most efficient solution. AI is a promising technology, and it will continue improving. Check it out for yourself. As of today, though, these tools are not a substitute for taking the time to learn Drupal's Migrate API with a guide like ours.

Thank you!

And that concludes the Drupal 7 to Drupal 10 migration guide. Thank you very much for joining us in this learning experience! We hope you learned a lot and gained the knowledge you need to migrate your next project with confidence. If you need personalized help with your project, the Tag1 Team is ready to assist. Contact us to learn how we can help you with your migration.

Image by Daniel Kirsch from Pixabay


