To be able to use python and its tools to automate the reporting in word
and excel you need to be able to get all your data together. Quite simply you
will have a variety of data sources. There are many more, but I have personally
used and will be covering most of the following:
- Excel
- CSV Files
- Outlook Attachments
- Databases (Oracle / SQL Server / MySQL)
- Outlook attachments
Connecting
to Data Sources
I am sure there are many others that people use frequently,
I personally like to use databases as my primary data source, but I was
originally a databases developer after all. You need to be able to get the data
from these sources and into python, probably a pandas dataframe. Luckily there
are 2 really good modules for doing this:
- Pyodbc
- SQL Alchemy
I like to use both, I find PyODBC is great for kicking off
stored procedures etc. but (my understanding at least) is that SQL-Alchemy is
the tool to use to use for a wider variety of connections and uses. I have
muddled around different sources on the internet to get these to connect
correctly.
Importing the required Libraries |
Setting up the connection strings |
Importing to datafarme using PyODBC |
Importing to dataframe, from file, using SQL Alchemy. |
Connecting
to Office:
You then need to be able to connect from this dataframe
into Excel and or Word and ideally then send an e-mail either telling people
that the files are ready or attaching the relevant file.
There are many ways to connect to Excel, fewer for Word and
for Outlook there are again a few ways but the win32com client is the most popular
way I have seen of connecting and fairly easy too. My favourite way of automating office is using XLwings, it
is great for automating most things in Excel and unlike some other packages it
doesn’t remove all the graphs! As a side note the biggest issue I still have
with it is that I have not been able to create password protected Excel documents,
but you can get round this (using an Excel personal macro).
With word I use Python-Docx. Again, it does have some limitations
but there are word arounds for most of the issues. For example, I do not think
that you can currently create contents pages however you can get around this.
At this point I won’t go into any more detail on this however
you can easily see how you can connect to a database, load data, collect
information or run a stored procedure and then you can push this into word and
Excel
Anyway so that is the basic process and modules I use.
Outline of basic flows to get data into and out of a data source and into office using python. |
No comments:
Post a Comment