How To Configure Data Management Gateways In Office 365 To Allow The Refresh Of On Premise (aka in your local environment) Data

I recently started playing around with Office 365 and PowerBI. Now that you can connect your organizations on premise data with Office 365, Office 365 becomes a much more viable solution for delivering Business Intelligence. I did meet my fair share of roadblocks when trying to figure out how to set up a connection to my on prem data though, so I figured I’d write it all down to save others (or myself) some time in the future. It’s worth mentioning that you have to have PowerBI enabled in order to get all of this to work.

The examples below were created using a dev environment and the AdventureWorks DW2012 db, so you’ll need to alter some of the connection info to suit your particular scenario. At the end of the tutorial we’ll have a PowerPivot workbook hosted in Office 365 that is configured to refresh data from our on premise server. Cool eh? So let’s get started.

Connecting Office 365 to your on prem data can be boiled down to 4 main tasks:

  1. Creating and configuring a “Management Gateway” in Office 365 and your computer.
  2. Creating a data source in Office 365.
  3. Create a PowerPivot workbook that’s connected to your on prem data.
  4. Scheduling/testing the refresh.

Creating a “Management Gateway” in Office 365

  1. In office 365 go to the admin dropdown (located in the upper right hand corner) then go to “PowerBI”.
    clip_image001
  2. Click on “gateways”.
  3. Click the “+” next to “new gateway”
    clip_image002
  4. Fill out the name and description section of the new gateway. In this example we will not be using the “cloud credential store”, so ensure that the enable box for that option is not checked.
    clip_image004
  5. Click the “Create” button.
  6. If you haven’t done so previously, use the download button to download and install the gateway software.
    clip_image006

    The gateway software should be installed on a machine that has access to the data you’re planning to import into Office 365. It should also be noted that the machine you install the gateway software on will need to be running/on in order for any scheduled refreshes to actually work.

  7. Highlight and copy the “Gateway Key” (we’re going to use it in just a minute)
  8. Open the Gateway software once it’s installed.
  9. Click the “Start Service” button if it’s not already running.
  10. Click the “register gateway” button and paste in the gateway key you copied in step 7
    clip_image008
  11. Click finish on the “install & register” screen in your browser.

After completing the steps above your Power Bi admin center, and Microsoft Data management Gateway Configuration Manager should look like the screens below if everything has gone well.

clip_image010
Power BI admin center

clip_image012
Microsoft Data Management Gateway Configuration Manager

 

Creating a data source in Office 365

Now that we’ve configured our gateway, we need to create a data source to the database our workbooks will use to refresh data. Follow these steps:

  1. Once again in office 365 go to the admin dropdown (located in the upper right hand corner) then go to “PowerBI”.
  2. Click on the “data sources” link on the left hand side.
    clip_image014
  3. Click the “+” next to “new data source” and select “SQL Server” from the resulting menu.
    clip_image015

    At the time I originally did this, Power Query didn’t actually work for data refresh. You can see my forum question and resolution on it here: http://social.technet.microsoft.com/Forums/en-US/3fe3ac74-5954-475b-8849-6602573522e7/power-querydata-management-gateway-data-refresh-not-working?forum=powerbiforoffice365#bcc8af38-22da-43aa-b378-7757196f2829

  4. Ensure “Enable Cloud Access” is enabled and then click next.
    clip_image017
  5. Name your data source.
  6. Select the gateway you wish to use (the one we just created in the previous section).
  7. Select a Data Source Type of “Sql Server”.
  8. For “Connect Using” select “Connection String”. (at the time of this writing, I was told this was the only way it currently worked, but that other options should be resolved shortly)
  9. For Connection provider select “.NET Framework Data Provider for OLE DB”.
  10. Paste the exact same connection string used in your PowerPivot workbook into the Connection String window. (this needs to be the exact same string, so you should open up your workbook, open up the connection, and copy the string. If you don’t have a workbook yet, you might want to skip to the section about creating a workbook (ie the next section) and then come back to this step)
  11. Click the Credentials button at the bottom of the screen.
  12. Select the credentials type of your choosing (I used windows creds)
  13. Enter the user name and pwd of the user you want to use to connect to your data source when the workbook refreshes.
  14. Click test connection. Once complete your window should look something similar to the image below
    clip_image018
  15. Click OK on the data source settings window
  16. After entering your credentials, click next on the connection info screen. (See completed screen below)
    clip_image019
  17. On the “Users and Groups” screen, add any users or groups who you want to have access to use the data source. (add users by using the “+” button)
  18. Click Finish.

Once finished, your data sources screen should look similar to the screenshot below:

clip_image021

 

Create a PowerPivot workbook that’s connected to your on prem data.

Now that we’ve got our gateway and data source ready to roll, it’s time to create a workbook that we can upload and schedule to refresh. So…to do that:

  1. Open up Excel 2013 and create a blank workbook
  2. Click on the PowerPivot tab in the ribbon
  3. Click the Manage button on the PowerPivot Ribbon
    clip_image023
  4. In the PowerPivot management window select “From Database”, then “From SQL Server”.
    clip_image024
  5. Enter the “Server name” and “Database name” into the resulting wizard (the same server, db, etc that you normally use to connect to on prem data in your organization).
  6. Click the “Advanced Button”.
    clip_image025
  7. On the advanced screen, in the section labeled “Connection string”…this is the exact connection string you should have entered when creating your data source in Office 365. If this connection string and the connection string in your data source in Office 365 do not match exactly, your refresh will not work.
    clip_image026
  8. Click OK on the Advanced Screen.
  9. Select “Select from a list of tables and views to choose the data to import”.
  10. Click Next on the Table Import Wizard screen.
  11. Select the tables you want to import by placing a check in their box. (in my case I am only selecting the DimProductCategory table from the AdventureWorksDW2012 example database)
    clip_image027
  12. Click Finish
  13. Once the data has imported click close on the Table Import Wizard screen.
  14. While still on the PowerPivot Management window select “PivotTable” and then “PivotTable” from the resulting dropdown men
    clip_image029
  15. Select “Existing Worksheet” and then click OK to finish creating the pivot table.
  16. Select “EnglishProductCategoryName” from the PivotTable Fields on the right.
  17. Save the workbook to your desktop as “O365RefreshTest” the completed workbook should look similar to the image below
    clip_image031
  18. Navigate to your Office 365 environment
  19. Navigate to site contents within the site
  20. Select the “Power BI” app
  21. In the Documents section of the Power BI app, click Add, then “upload workbook” from the resulting menu.
    clip_image033

Congrats your test workbook should now show up in your Power BI app like so:

clip_image035

Scheduling/testing the refresh

All that’s left to do now is configure our data refresh! Follow these steps:

  1. Navigate to your Power BI Application.
  2. Click the ellipsis in the lower right hand corner of your workbook and select “Schedule Data Refresh”.
    clip_image036
  3. Click the toggle switch for “Refresh Schedule” to turn it on.
  4. Select the data connection that you want to refresh in the workbook (remember, this connection must have the exact same connection string as the data source you configured in Office 365)
  5. Configure when you want the data refresh to happen using the menus available.
  6. Enter an email address for your failure notification.
  7. Click “save and refresh report”.

Once completed your screen should looks similar to the following

clip_image038

To view the progress of the refresh click on the “History” link at the top of the page. As you can see, my refresh is still running (it usually takes a minute or so, depending on the amount of data)

clip_image039

Once completed, you should see a row show up under “refresh history” with a status of success or failure like so:

clip_image041

Now your workbook should continue to refresh on whatever schedule you set. Mighty powerful stuff if you’re looking to do BI with on prem data using O365. Hopefully this was useful, let me know in the comments.

Access Services apologizes – “Sorry, something went wrong”

I was recently working with Access Services in SharePoint 2013 in my dev environment, and every time I created a new App for access I was immediately confronted with the following error “Sorry, something went wrong. An unexpected error has occurred.

clip_image001

Unexpected indeed! Thanks for that very informative error message, your politeness however is of little practical use to me. After spending more time than I care to admit publicly, we were able to eventually track this problem down to the fact that the account running the Access Services service application didn’t have the appropriate permissions to the SharePoint content database. I was using a domain account of CONTOSO\sp_accsvc to run the service application. To fix the problem, we ultimately needed to grant the CONTOSO\sp_accsvc account access to the “SPDataAccess” role in the SharePoint configuration database (in SQL Server). After doing that (and running an iisreset on the SharePoint box) POOF! I’m creating Access Apps error free!

clip_image003

Hopefully this will save someone some time (like myself in the future once I forget).

SSRS Shared vs Embedded Data Sets

In my last post we talked about shared vs embedded data sources, in this post we’ll be talking about shared vs embedded when using data sets.

In SQL Server Reporting Services (SSRS) you create “data sources” to tell your reports where you want to pull your data from, but you then need to create “data sets” to tell the report what data you actually want. Data sets store the query that will be issued against the data source. Just like when you go to create a data source, one of the first decisions you’ll need to make when creating a new data set is if it will be “Shared” or “Embedded”. And just as I mentioned in the last post, choosing wisely and creating the right type from the beginning can definitely save you some aggravation in the long run and make things much easier to manage overall. The difference between a shared and an embedded data sets is pretty straight forward, but let’s go ahead and review them real quick.

clip_image001Embedded data sets: When you create an embedded data set, the query, and which data source the query should be issued against is stored inside of the report itself, and can not be referenced by outside reports.

clip_image002Shared data sets: When you create a shared data set, the query and which data source it references is saved as its own object, independent of any report. When you create shared data sets, they must reference shared data sources.

So that sounds pretty straight forward, but which is best? Again, it sort of depends, but interestingly enough my advice is exactly the opposite of what I gave for Shared Data sources. In general I tend to use embedded data sets almost exclusively. (Opposed to data sources, where I used shared almost exclusively)

My reasoning for using primarily embedded data sources is this: While you’ll probably only have a handful of data sources for a report project, you’ll likely have WAY more data sets. (One for every query your report uses). It’s not uncommon to have 5-8 data sets per report. Lets say you have 20 reports, that turns into a ton of management. Some of the questions you might find yourself asking are:

  • How do you know if a shared data set already exists or if you have to create a new one?
  • If you change one of the data sets, what reports are impacted?

My experience is that shared data sets are seldom reused, and instead data sets are re-created anyway. The one place where I think they can come in handy though is if you’re populating parameter dropdown values with data sets, and several of the reports need to have the same parameter values on/in them. In this case, using shared data sets can be a nice was to drive consistency between reports and save yourself a little work having to re-create the queries for each report.

Have you leveraged shared data sets in your environment? Tell me about its success/failure it in the comments.

SSRS Shared vs Embedded Data Sources

In SQL Server Reporting Services (SSRS) you create “data sources” to tell your reports where you want to pull your data from, and how authentication to the data source should work. One of the first decisions you’ll need to make when creating a new data source is if it will be “Shared” or “Embedded”. Choosing wisely and creating the right type of data source from the beginning can definitely save you some aggravation in the long run and make things much easier to manage overall. The difference between shared and embedded is pretty straight forward, but let’s go ahead and review them real quick.

EmbeddedDSEmbedded data sources: When you create an embedded data source, all of the connection/authentication information for the data source you’re connecting to is stored in the report itself and can’t be referenced by other reports.

SharedDSShared data sources: When you create a shared data source, all of the connection information for the data source you’re connecting to is stored outside of the report, and deployed as its own object. Other reports are able to reference the shared data source.

So that sounds pretty straight forward, but which is best? Well, it sort of depends, but in general I tend to use the following guidelines

1. If I have multiple reports that are going to be referencing the same data source, a shared data source makes perfect sense

2. If I have a data source that I know is only going to be used by one report, and never by any others, then an embedded data source might make more sense, although this is kind of a rare case in my experience. If you do created an embedded data source and start getting requests for other reports using the same data source…you should definitely convert it to a shared data source.

The advantage of using shared data sources starts to become evident once you start to have a large number of reports to manage. If you have to point your reports to different environments like dev / uat / prod, you can change all the reports at once by simply changing the information in the shared data source. Had you created all your reports with embedded data sources however, you’d have to edit each report individually, and if you have a lot of reports, that can translate into a lot of work.

Next post we’ll take a look at shared vs embedded when using data sets.