Data migrations

Data migration, understood as the process of transferring data from one system to another, often newer one, seems simple. Nonetheless, it takes quite a lot of time to prepare, and more time to complete and properly validate.
Data migration is a business process, but also a set of technical applications and people's knowledge, usually supported by ETL (extract-transform-load) tools.

Data migration strategies

Even though data migration can seem problematic and difficult to carry out, no one really doubts its necessity. Almost every company that operates on data has to update its systems, applications and platforms more or less regularly. This is natural: not only computer systems change, but business does too. The need for data migration may therefore result not only from a data storage system becoming outdated, but also from changing business conditions. Ensuring the best data quality, also through efficient data migration, is crucial to responsible management in the 21st century.
Data migration, a method of making data that originates in one source compatible with the system it is going to be loaded into, is simple only at first sight. The better one knows the problem, the more questions arise, beginning with the most important one: how to make the company suffer as little as possible from the migration. The answer depends on the chosen strategy. Basically, there are two approaches to data migration, and they differ mainly in the way the migration proceeds.

Big bang migration

The first strategy, called big bang migration, is somewhat uncompromising. In a word, it means shutting down all applications and databases at once, stopping the work and putting all effort into the data migration. It can seem a good option, because it keeps the migration as short as possible and so shortens the window in which something unexpected can happen during the process. On the other hand, a big bang migration can be destructive to the organization's work, especially when business continuity depends on real-time or near-real-time data.
There is, of course, a way to minimize the negative impact of big bang migrations. Most companies that choose this strategy start the process outside working hours, e.g. on weekends or during the holiday season. This way, cutting off access to data causes the fewest problems.

Big bang migrations

In favour:
  • short time of migration
  • can be run during weekends, holidays, etc.

Against:
  • obligatory organization systems' downtime
  • risky
  • costly
  • in most scenarios there is no way back

Trickle migration

    In fact, most companies operate in 24/7 mode, and so do modern Business Intelligence platforms (even if employees have a day off, systems can't enjoy a break), so most companies cannot afford to turn their systems off even during holidays. When a data migration still needs to be carried out, trickle migration might be the right approach.
    The idea behind trickle data migration is not to shut down the whole system at once, but to work only on chosen areas of it, so that all the others stay accessible while the migration runs. This way employees keep continuous access to the data, even though the migration is done bit by bit and might last weeks or even months. This approach usually leaves time to make improvements and implement new features, which makes it easier to justify and to get funding for.

    Trickle migrations

    In favour:
  • no interruption in employees' work, gradual move
  • no system downtime
  • potential for implementing new features or technologies
  • easier business justification

    Against:
  • long time of migration; practice shows that it might take months or even years to finish
  • more complex to organize
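
The bit-by-bit idea behind trickle migration can be illustrated with a minimal sketch. It is not a production tool: it assumes two in-memory SQLite databases standing in for the old and new systems, and a hypothetical `customers` table with `id` and `name` columns. Each small batch is committed on its own, so the source is never locked for the whole run.

```python
import sqlite3

def migrate_in_batches(source, target, table, batch_size=2):
    """Copy `table` from source to target in small batches keyed on id.

    Returns the number of batches committed. The columns (id, name) are an
    assumption made for this illustration only.
    """
    last_id = 0  # watermark: highest id that has already been copied
    batches = 0
    while True:
        rows = source.execute(
            f"SELECT id, name FROM {table} WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return batches
        target.executemany(f"INSERT INTO {table} (id, name) VALUES (?, ?)", rows)
        target.commit()  # each batch becomes durable independently
        last_id = rows[-1][0]
        batches += 1

def demo_trickle():
    """Build a tiny source and an empty target, then migrate in batches of 2."""
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    for conn in (source, target):
        conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    source.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(i, f"customer-{i}") for i in range(1, 6)],
    )
    source.commit()
    batches = migrate_in_batches(source, target, "customers", batch_size=2)
    copied = target.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
    return batches, copied
```

With five source rows and a batch size of two, `demo_trickle()` needs three batches to copy all five rows. Because the watermark is persisted only in a local variable here, a real migration would have to store it somewhere durable to be able to resume after an interruption.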


Sample data migration methodology

    1. Analysis of business impact

    Every migration means a lot of inconvenience for a company. People need to be prepared to lose access to the data they usually work on, so it's crucial to ensure that the migration won't interrupt them too much. This is not easy: it requires compiling a list of the processes and operations that touch the data to be migrated, and letting users know early enough that they won't be surprised by system downtime.

    2. Information gathering

    Knowing what impact the migration can have on business users is the first step, but gathering information about the software and hardware aspects of the migration is no less important. During this second stage, it is important to discover as many details as possible about the complexity of the upcoming migration. In practice this means tracing the origins of the data that has to be transferred, its location and its volume.
    This can be done either manually or automatically.
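
As an illustration of the automatic route, the sketch below builds a small inventory of a source database: which tables exist, which columns they have and how many rows they hold. It assumes the source is an SQLite database; the function name `profile_source` and the demo tables are hypothetical, and other database engines expose the same information through their own catalog views.

```python
import sqlite3

def profile_source(conn):
    """Return {table_name: {"columns": [...], "rows": n}} for every user table."""
    inventory = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' "
        "AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        rows = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        inventory[table] = {"columns": columns, "rows": rows}
    return inventory

def demo_profile():
    """Profile a tiny in-memory database with two illustrative tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])
    return profile_source(conn)
```

An inventory like this gives a first estimate of data volume per table and can feed directly into the mapping and planning steps.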

    3. Mapping, designing

    Knowing where the migration tools will take data from is one side of the question; the other is determining where it is going to be stored. There are two possibilities: either the source and destination layouts are the same (one-to-one mapping layout), or they differ (relayout).
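
The difference between the two mappings can be shown on a single record. This is only a sketch: all field names (`id`, `name`, `city`, `customer_id` and so on) are made up for the example, and a real mapping would be derived from the actual source and destination schemas.

```python
# Hypothetical source-to-target field mapping for the one-to-one case:
# every field keeps its name and position.
ONE_TO_ONE = {"id": "id", "name": "name", "city": "city"}

def one_to_one(record, mapping=ONE_TO_ONE):
    """Copy each source field to the target field named in `mapping`."""
    return {target: record[source] for source, target in mapping.items()}

def relayout(record):
    """Map a source record onto a different target layout.

    Here the name is split into two fields and the city moves into a
    nested address structure (an assumed target layout).
    """
    first, _, last = record["name"].partition(" ")
    return {
        "customer_id": record["id"],
        "first_name": first,
        "last_name": last,
        "address": {"city": record["city"]},
    }

record = {"id": 7, "name": "Ada Lovelace", "city": "London"}
same_layout = one_to_one(record)   # identical to the source record
new_layout = relayout(record)      # restructured for the new system
```

In the one-to-one case the mapping is pure data and can be checked mechanically; a relayout needs transformation logic, which is where most mapping errors tend to appear.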

    4. Plan of migration

    The first three steps of the methodology are all about planning the process. The planning stage has to end with a proper plan that covers the business and operational constraints on the migration, the data and its attributes, the tools to be used during the process, and best practices. Making a plan, even if it seems superfluous for small migrations, is almost a must. It lets the people responsible for the migration keep an eye on the process and be sure that they don't miss a small but very important detail.

    5. Provisioning

    The actual provisioning depends on the migration layout chosen a few steps earlier. For a one-to-one layout, provisioning is nothing more than copying the former structure of files, data volumes and attributes, so that the new environment is ready to receive the actual streams of data. For a relayout migration, on the other hand, preparing the new environment might be much more difficult and require plenty of additional steps before any data is moved. The good news is that many of these tasks can be automated.
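
For the one-to-one case, automated provisioning can be as simple as replaying the source structure in the new environment. The sketch below assumes both systems are SQLite databases and uses a hypothetical helper, `provision_one_to_one`, that copies table and index definitions but no data.

```python
import sqlite3

def provision_one_to_one(source, target):
    """Recreate the user-defined schema objects of `source` in an empty `target`.

    Returns the number of DDL statements replayed. Tables are created first
    so that indexes defined on them can follow.
    """
    ddl = source.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE sql IS NOT NULL AND name NOT LIKE 'sqlite_%' "
        "ORDER BY CASE type WHEN 'table' THEN 0 ELSE 1 END"
    ).fetchall()
    for (statement,) in ddl:
        target.execute(statement)
    target.commit()
    return len(ddl)

def demo_provision():
    """Provision an empty target from a source with one table and one index."""
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    source.execute("CREATE INDEX idx_customers_name ON customers (name)")
    created = provision_one_to_one(source, target)
    names = [
        row[0]
        for row in target.execute("SELECT name FROM sqlite_master ORDER BY name")
    ]
    return created, names
```

After `demo_provision()` the target contains the same table and index as the source, still empty and ready to receive data. A relayout migration cannot reuse the source definitions this way and needs the target schema written out separately.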

    6. Test migration

    A badly prepared migration can have tragic consequences for company data systems and for the company data itself. To minimize the risk of such losses, and of the wasted time and money that follow, a test migration should always be run before the real one starts. It's the only sure way to check in practice whether all the assumptions were correct and the tools were chosen properly. The test migration should cover a small but representative part of the data.
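
One possible way to pick a small but representative part of the data is stratified sampling: take the same fraction from every group of records, and at least one record from each group, so that rare cases are not left out of the test. The sketch below is an assumption about how this could be done, not a prescribed method; the `segment` field and the group names are invented for the example.

```python
import random
from collections import defaultdict

def representative_sample(records, key, fraction, seed=0):
    """Pick roughly `fraction` of the records from each group defined by `key`.

    At least one record is taken from every group, so small groups are
    still covered by the test migration.
    """
    rng = random.Random(seed)  # fixed seed keeps the test run repeatable
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    sample = []
    for members in groups.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample

# Illustrative data: 20 retail, 80 corporate and 1 government customer.
records = (
    [{"id": i, "segment": "retail"} for i in range(20)]
    + [{"id": 100 + i, "segment": "corporate"} for i in range(80)]
    + [{"id": 999, "segment": "government"}]
)
test_set = representative_sample(records, key="segment", fraction=0.1)
```

Here a 10% sample yields 2 retail, 8 corporate and 1 government record, 11 in total. A plain random 10% sample would most likely miss the single government customer altogether.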

    7. Migration

    In a word, this is what data migration is all about: moving data from its original source to a new destination. Here, too, there are two possibilities: moving data out of the data path or moving it in the data path.

    8. Validation

    Once all migration processes have finished, it has to be checked whether everything went correctly. Even when the migration seems to have finished successfully, there may be errors hidden deeper, so they need to be identified and removed as soon as possible, before they can disrupt database operation.
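
A common way to look for such hidden errors is to compare row counts and content checksums between the source and the destination. The sketch below assumes both sides are SQLite databases with identical layouts (the one-to-one case); the function names and the `customers` table are hypothetical. A relayout migration would need the comparison to run on transformed values instead.

```python
import hashlib
import sqlite3

def table_fingerprint(conn, table):
    """Return (row_count, sha256 hex digest) of a table read in first-column order."""
    digest = hashlib.sha256()
    count = 0
    for row in conn.execute(f'SELECT * FROM "{table}" ORDER BY 1'):
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()

def validate(source, target, tables):
    """Return the tables whose fingerprints differ between source and target."""
    return [
        table
        for table in tables
        if table_fingerprint(source, table) != table_fingerprint(target, table)
    ]

def demo_validation():
    """Validate a correct copy, then a copy with one silently damaged value."""
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    for conn in (source, target):
        conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
        conn.executemany(
            "INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")]
        )
    clean = validate(source, target, ["customers"])
    # Simulate a hidden error: the row count still matches, the content does not.
    target.execute("UPDATE customers SET name = 'Grac' WHERE id = 2")
    damaged = validate(source, target, ["customers"])
    return clean, damaged
```

In the demo the first check reports no differences, while the second flags `customers` even though both tables still hold two rows, which is exactly the kind of error a row count alone would miss.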