Auto-Refreshing File-Based Data Sources in Tableau Server
Before jumping into this blog, I wanted to give a shout out
to Michael Perillo
for his help on this blog. Michael is a Tableau Ambassador, co-leader of the Phoenix Tableau User Group, and all-round
great guy. My background is mostly in Windows so I reached out to Michael for
some help ensuring that I didn’t gloss over the Linux world. While this blog is
still has a Windows-bent, with Michael’s help, we’ve provided some resources to
help with Linux environments as well. Michael also gave some great feedback that
definitely helped to increase the quality of the blog. Thanks Michael!
From my time answering questions on the Tableau Community
Forums, one question I have seen time and again is how to set up file-based
data sources (Excel, csv, etc.) to automatically refresh on Tableau Server.
This is particularly difficult for a number of reasons. So, in this blog, I’m
going to discuss the basic steps required to auto-refresh your file-based data
sources and, along the way, I’ll try to explain why all this is necessary.
File-Based Sources
Before we start, let’s define file-based data sources. These
differ from database data sources in that they are physical files that reside on
your computer (or some shared network-based drive). Since these are physical
files, they can be emailed, moved, copied, etc. just like any other file on
your computer. To connect to them in Tableau, you simply choose from the list
of file connectors, then select your file.
Note: Selecting “More” will provide you with some additional
options:
Databases are different as you connect to them using some
sort of virtual “connection string” and, once connected, you can use any of the
tables, views, etc. that are contained in the database. While databases are
ultimately made up of physical files, you neither know nor care where those
files are located.
In my personal opinion, it’s always best to get your data
from a real database whenever possible, as those databases tend to be the
“system of record” for the data. Files that have been exported from the database
are out-of-date almost immediately. However, sometimes you have no choice but
to work with a file-based data source. Unfortunately, as noted above, there are
a number of steps required to ensure Tableau Server can automatically update
these data sources. So let’s talk about those now.
For more on data sources, I'd recommend reading my series on Tableau Data Sources:
The 3 Keys of Auto-Refresh
There are three main things you must do when attempting to
set up Tableau Server to auto-refresh file-based data sources:
1) Store the files on a network-based file share.
2) Ensure Tableau Server has access/permissions to the files.
3) Leverage the Universal Naming Convention (UNC) path of the
files when connecting via Tableau.
Let’s discuss each of these separately.
Network File Share
Note: This section is focused on Windows-based file servers.
If your company uses Linux-based file, these steps will be a bit different.
Most companies set up a file server which allows you to
store files. One big advantage of using these file servers is that they are
typically backed up frequently so that, if a file is lost or damaged, you can
easily restore it. While it has also grown common for personal computers to be
backed up, file server backups are generally much more resilient and reliable.
Another advantage is that, with the correct permissions, file servers allow you
to share files with others. And that’s the critical reason why we need to
leverage them when auto-refreshing Tableau Server. Your files need to reside in
a location that is both accessible to you and Tableau Server. If a file resides
on your local computer then, generally, Tableau Server won’t be able to
communicate with it.
Note: Strictly speaking, you can set up your computer so
that Tableau Server can see it by sharing that specific file or the directory in
which it resides (a shared directory is called a “file share” and it’s pretty
much the same concept as is used on file servers). The problem with this is
that your computer isn’t a server so it won’t be as fast, reliable, or
resilient as a file server. And, in the case of laptops, we shut down our
computers and take them with us at the end of the day. Tableau Server certainly
cannot communicate with your computer if it’s powered down. So, while this is
technically possible, I would suggest that you avoid this option at all costs
as there are simply better ways to do this.
If you are going to be refreshing multiple files, then I
generally recommend setting up a single location for each department or client
you’re working with. You don’t want to set up one folder for everyone as
those people will be able to access all of those files. But you also don’t want
to set up a separate folder for each individual file, as that will require a
lot of maintenance. So, work to find the right balance between manageability
and security.
Note: Before moving on, I should note that there are some
other alternatives to storing your files on a file share, including various
cloud-based services for which Tableau has pre-built connectors.
Tableau Server Access
Note: For this section, you’ll need to know whether your
Tableau Server runs on Windows or Linux, as the steps are slightly different. We’ll
start with Windows-based Tableau Servers.
Now that your file resides on the file server, we need to
make sure Tableau Server can access it. That’s where the Run As service account comes into play. This is
basically a network user that runs the Tableau Server service. Since it runs
the service, Tableau Server has the same permissions as the Run As account. So,
if this account can access a file, then so can Tableau Server. Thus, we just
need to ensure that the Run As account has access to the file share on which
you’ve placed your file. Generally speaking, you’re going to need some help
with this. Your Tableau Server administrator will need to provide the Run As
username and your file server administrator will need to grant the access. Read-only
access should be sufficient.
I don’t have much experience with Linux-based Tableau
Servers, so that’s where I called in the help of Michael Perillo. The setup is a bit different and pretty technical, so you’re
going to need to get your Tableau Server administrator on board. Be sure to
clearly explain what you are trying to do. If you’re lucky, you administrator
will already know the solution. If not, then the following document on provides
detailed, step-by-step instructions on how to configure Tableau Server on Linux to connect to shared directories.
Use the UNC Path
Note: The following is Windows specific, but the same basic
concepts should also apply to Macs.
Your computer will generally have a number of drive letters.
For instance, C: is typically your hard drive and D: is usually a DVD-ROM drive.
But many of your drive letters are “mapped network drives” which essentially
point to a share on a network file server.
So, let’s assume that you’ve placed your files in the
following folder:
N:\Departments\Sales\Tableau
Server\
You open Tableau and connect to your files using this path.
You build your data model then publish the data source and set up an automatic
refresh. But, the first time it attempts to refresh, it fails. You’ve placed
your files on a file share and you’ve ensured that Tableau Server has access to
the share—so why did it fail?
The problem is that this N drive is just a sort of a
pointer to another location on the network. It’s a shortcut that makes it
easier for you to access your files. Generally, your co-workers will have the
same set of mapped drives, so they’ll be able to access files using this path.
But servers are different for a couple of reasons. First, mapped drives are
user-specific, so to map those drives on the Tableau Server, someone would need
to log on as the Run As account, then map those drives. Second, mapping drives
simply isn’t something that is typically done on a server. That being the case,
when Tableau Server tries to connect to N:\Departments\Sales\Tableau Server\,
it will fail because, from its perspective, no such drive exists.
To correct this problem, we need to make sure to use the
full Universal Naming Convention (UNC). A UNC path will generally start with \\
followed by the server name, then the folders.
There are a couple of methods for translating a drive letter to the UNC. One option is to find the drive letter (in Windows, use Windows Explorer). It should look something like this:
There are a couple of methods for translating a drive letter to the UNC. One option is to find the drive letter (in Windows, use Windows Explorer). It should look something like this:
Notice the N drive at the bottom. The first part of this,
“Share”, is the name of the file share on the server. The second part, which
falls within the parentheses, is the server name. To turn this into a UNC path,
we start with \\, then add the server name, then the share, followed by the rest
of our file path. So…
N:\Departments\Sales\Tableau Server\
becomes
\\DESKTOP-C3FKLQO\Share\Departments\Sales\Tableau
Server
Another option is to open a command prompt (Windows - hold the Windows Key and press R, then type cmd and click OK). In the command prompt, simply type net use. This should provide an output that looks something like this:
Local Remote
----------------------------------------------
N: \\DESKTOP-C3FKLQO\Share
From here, follow the same steps I documented above to replace the drive letter on your path.
To make sure you have the right path, I generally recommend that you paste this final path into Windows Explorer and make sure you see your files.
Local Remote
----------------------------------------------
N: \\DESKTOP-C3FKLQO\Share
To make sure you have the right path, I generally recommend that you paste this final path into Windows Explorer and make sure you see your files.
Now, in Tableau Desktop, connect to your files using the UNC
path instead of the drive letter. By doing this, you’ll be ensuring that
Tableau Server is able to communicate with the files when it attempts the
refresh. The rest of the process is pretty much the same as any other published
data source. You’ll just specify a refresh interval and Tableau Server will do
the rest. And, if you’ve done everything correctly, it should refresh without
any errors!
IMPORTANT
IMPORTANT
When
publishing, Tableau will give you the option to "Include External
Files". This option will cause Tableau to push a copy of the file up to
a temporary folder on Tableau Server, which it will then use in the
future. Since we're setting up Tableau Server to automatically refresh
the data source from the UNC path, we do not want to do this so we need to leave this option unchecked.
Unfortunately, if your data source is embedded in your workbook (as opposed to being published separately), this could cause some problems since "Exclude External Files" is used for other purposes, such as images. So, on one hand, you need to check the box (to include the images), but
on the other, you need to uncheck it (to ensure the data source remains
connected to the UNC path). Fortunately, there are two solutions to this problem. One solution is to publish the data source as its own Published Data Source, leaving "Include External Files" unchecked. This will separate the data source from the workbook. When you then publish the workbook, you'll check "Include External Files" so that the images are included. Another solution is to use a UNC path when adding your images. Like our source data file, Tableau Server will then attempt to load that image from the UNC path rather than a local copy. But be sure that Tableau Server has access to that path, as detailed previously.
Windows vs Linux
As I said earlier, that this tutorial is definitely more
focused on Windows environments. If you work in a pure Linux environment or a
hybrid environment (Tableau Server and File Servers are a mix of Windows and
Linux), then you may require a slightly different solution.
Accessing files from Windows and Linux is akin to speaking
two different but similar languages. In some cases, the words and
references are the same; in other cases, they are completely different. Being
bi-lingual can be a great advantage, but if you only speak one language
(Windows or Linux), seek help from those who are fluent. In such cases, I
highly recommend communicating with your Tableau Server and File Server
administrators. If you clearly explain what you need to do, then they should be
able to help you find a viable solution. And, when you do find a solution, be
sure to clearly document it for the next time you need to do something similar.
In addition to involving your system admins, there are many
other great resources to help you succeed, including the Tableau Help docs, Community Forums and Knowledgebase. You can
also learn a lot about other types of environments by networking with other
Tableau users. A great way to do this is to join your local Tableau User Group. Attending these
meeting can put you in touch with local experts who work in a variety of industries
and environments, which can really help you to up your Tableau game.
I hope this blog has been valuable. If you have any
questions, feel free to leave them in the comments section. Thanks again to
Michael Perillo for his help.
Ken Flerlage, June 22, 2020
I have 8 MS SQL Server database that i needed to connect to, run the same query on each and then union the results. I couldn't do that with Tableau so I used MS Access to run 8 pass through queries to retrieve the data and one native Access query to union the 8 pass through queries.
ReplyDeleteThen I had Tableau attach to the Access database and made an extract.
No matter what I did I could not get the dashboard to refresh.
I had to manually open the Tableau workbook, refresh the extract and republish the workbook.
Finally I checked "include external files" and now it works. It refreshes on schedule.
Why couldn't you union the SQL tables directly in Tableau?
Delete8 different sql servers each with their own database ; so 8 different databases ; each has the same tables and table structures, but each database is for a different region --
Deleteif its possible to union them then I'd love to hear how, but everything I've found through google says its not possible (unless you do some crazy outer join work around)
Ah yes, you can't do cross-database unions. Your best option is likely to be Tableau Prep or some other type of data prep before you get to Tableau.
Delete