Copy data from Azure Data Lake to another Data Lake with AdlCopy

The last months it’s been a bit quiet on my blog. I started working on some new stuff, and couldn’t find the inspiration when it came to finding new subjects to blog about. I started working with Azure Data Lake a few months back, and I decided to share my (limited) knowledge here again, hoping it saves you time somewhere down the line.

Migrating data from one Data Lake to the other
We started out with a test version of a Data Lake, and this week I needed to migrate data to the production version of our Data Lake. After a lot of trial and error I couldn’t find a good way to migrate data. In the end I found a tool called AdlCopy. This is a command-line tool that copies files for you. Let me show you how easy it is.

 
Download & Install
AdlCopy needs to be installed on your machine. You can find the download here. By default the tool will install the files in “C:\Users\\Documents\AdlCopy\”, but this can be changed in the setup wizard.

Once you installed the tool, you can open a command prompt to use the tool:

 
Now you need to find the file or directory you want to copy. You can do this by opening the file location in the Azure portal, and click on “Folder properties”:

 
This URL will be the input for AdlCopy:

 
You should also find the destination URL for the other data lake, since this will be the target.

 
Linking it to your Azure subscription
With AdlCopy it’s not needed to link anything directly to your subscription, or configure anything. The first time you run a copy-command, a login box will pop up. If you login with the account you use to login to the Azure portal, the tool will be able to access your resources.

 
Copying data
The input for AdlCopy are “/Source” and “/Dest”. These represent the source data and the destination to copy the data to.

There are 2 options when you want to copy files: single file or entire directory:

Copy a single file:

AdlCopy /Source adl://<DATA LAKE NAME>.azuredatalakestore.net/<DIRECTORY>/<FILENAME>.json /Dest adl://<DATA LAKE NAME>.azuredatalakestore.net/<DIRECTORY>/<FILENAME>.json

 
Copy an entire dirctory:

AdlCopy /Source adl://<DATA LAKE NAME>.azuredatalakestore.net/<DIRECTORY>/ /Dest adl://<DATA LAKE NAME>.azuredatalakestore.net/<DIRECTORY>/

 
When you want to copy an entire directory, make sure you add the trailing “/” (slash) to the path. If you don’t do that, the copy will fail (you can’t copy a directory into a file).

 
Conclusion
After trying out some stuff with Data Factory, manually copying files and considering building a small C# tool, this was the quickest option. It works out of the box, and you don’t have to be a rocket scientist to get this to work. So: The perfect tool for the job!

Advertisements

3 Responses to Copy data from Azure Data Lake to another Data Lake with AdlCopy

  1. Thanks for your post! I have a question. Does the computer you are using to invoke AdlCopy act a s a bridge between the two data lakes (data comes from source to your computer and it goes from there to the destination)? Or does it act as a coordinator, and the data travels directly from one Data Lake to the other?

    • DevJef says:

      To be honest Marco, I’m not really sure. I didn’t dig into that.
      I can remember that the documentation mention something that it would only work for a specific ADL generation, and some other restrictions. So that makes me to believe that your computer acts as a middle man.
      So if you find the answer, please let me know! 🙂

      • Thank you very much, Jeffrey.
        I’ll keep trying to find the answer. If I find it, I’ll let you know. It will be important in order to determine whether it’s a good method when you need to copy massive quantities of data.
        Regards!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: