Enriching your dataset – The SSIS way…

In my last post I talked about the technical options you have to enrich your dataset with spatial data. After writing that post, I decided to try and build the option like mentioned by Donabel Santos (Blog | @sqlbelle). Once I got that working, I tried to extend it a little, so it would fetch and store the data that I wanted. And like I mentioned to Koen Verbeeck (Blog | @Ko_Ver), it’s also easy to build for someone with rusty SSIS-skills (like me).

Creating the resources
In order to start with the SSIS solution, you’ll first need to create a source and a destination table for your data. I’ve created a GIS_Test and a GIS_Results table, with a few samples for you. You can download the file here.

Creating a new SSIS project
Once you’ve created the database resources, start Visual Studio, and Create a new SSIS project. You can find this under “Business Intelligence” -> “Integration Services”, like shown below:

Once you’ve created your project, add a new “OLE DB Connection” to your SSIS Package, by right-clicking in the “Connection Managers” at the bottom of your screen:

In the window that shows up, choose a connection you used earlier, of click “New…” to make a new one. Fill in your SQL Server information, and click “OK”.

Adding components
In order to actually do something with data, you need to add a “Data Flow” into your SSIS package. You can do this by manually dragging one into your “Control Flow”, or you can just the “Data Flow” tab, and click the message “No Data Flow tasks have been added to this package. Click here to add a new Data Flow task.”, and SSIS will add one for you:

Now drag in a “Source Assistant” into your “Data Flow”. If the screen below shows, click the OLE DB Connection you made earlier, and click “OK”:

Now double-click the “OLE DB Source”, and select the source table you’ve created earlier:

Drag in a “Script Component” and click “Transformation” on the next screen:

Now connect both components, by clicking on the “OLE DB Source” component, and drag the green arrow on the “Script Component”:

Double-click on the “Script Component” and choose “Input Columns”. Now select all columns (except the ID column), and set the “Usage Type” to “ReadWrite”, and click “OK”:

Drag in a “Destination Assistent”, and choose the OLE DB Connection you’ve created earlier. Now connect the “Script Component” to the “Destination Assistent” by dragging the green arrow like before.

Open the “OLE DB Destination” and choose the GIS_Results table as destination:

Now click on “Mapping” to auto-map your columns. Check them if the auto-mapping actually worked!

Start the coding!
With a completed package, now let’s start coding! Double-click the “Script Component” and click “Edit Script…”. A new Visual Studio window will open, where you can edit the script that will fetch your data from the Google Geocoding API.

First of all, add a reference to the System.Web namespace:

Add a new class named “Geocoding.cs”, and overwrite the content with the code from this Geocoding.cs

Go to the main.cs file, and copy the method called “CreateURL” from this stripped version of main.cs.

Add the code from the downloaded “Input0_ProcessInputRow” to your own “Input0_ProcessInputRow” method.

Go back to your original Visual Studio with the SSIS package, and press “OK” on the open “Script Component” window. This will close the Visual Studio with the actual script, and stores it in your SSIS package.

Can I finally run it now?!
Yes! Yes, you can! And if you run the SSIS package, you’ll see that it saves the records into the GIS_Results table. It adds the complete URL that was used to fetch data from the Geocoding API, so you can debug it in case of errors. Also, I’ve added a Geography object, so it’s easy for you to visualize the data retrieved from the API. And the last column I’ve added is the status code returned by the API, so it’s easy to spot weird results.

A word of thanks…
And as a final remark, I’d like to thank Donabel Santos for letting me use her post as a guideline, and her code as a basis for my version.

Downloads
I’ve also made the project available for download. You can download the full solution here

7 Responses to Enriching your dataset – The SSIS way…

  1. Pingback: SQL Server & BI blog by Andreas Wolter (Microsoft Certified Master): Database Engine - Reporting - Integration - Analysis Services

  2. ferreirafe says:

    Very useful post.
    Would you know how do I implemente in the code my Client ID and Crypto Key from Google Maps API? With that, I won’t have problems of geocoding limitation.

    • DevJef says:

      If I remember correctly, you can add your API key to the URL. I’m not sure how you can do that, but perhaps you can find the answer on: developers.google.com. There’s some really good documentation you can use.

    • Migo says:

      Hi ferreirafe, This is done in Geocoding.cs where it says string uri. Anything typed after {1} will be appended to the end of the url. I removed &sensor=false since it seems to be depreciated and then added my key.

  3. Migo says:

    Thank you for a great write up. I’ve been looking for this for a while now. I’m noticing that I’ll get a OVER_QUERY_LIMIT when I go over query per second (5 for free api and 10 for Google Maps API for Work customers). It would be nice if you’re code included a throttle.

    • Migo says:

      Throttling seemed to improve the success rate.
      namespaces System.Threading;
      right before return URLaddress; I added Thread.Sleep(150);
      There are suggestions on the web to also check the status value if it’s OVER_QUERY_LIMIT then retry.

      • DevJef says:

        Thanks for your reply Migo, and glad you liked my post!

        It looks like a really good (and easy) solution to solve this problem. Thanks for your contribution. I think a lot of people will appreciate it!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: