Wednesday, 27 December 2017

Getting Started: Joining the dots

To be able to use python and its tools to automate the reporting in word and excel you need to be able to get all your data together. Quite simply you will have a variety of data sources. There are many more, but I have personally used and will be covering most of the following:

  • Excel
  • CSV Files
  • Outlook Attachments
  • Databases (Oracle / SQL Server / MySQL)
  • Outlook attachments

Connecting to Data Sources
I am sure there are many others that people use frequently, I personally like to use databases as my primary data source, but I was originally a databases developer after all. You need to be able to get the data from these sources and into python, probably a pandas dataframe. Luckily there are 2 really good modules for doing this:
  1. Pyodbc
  2. SQL Alchemy

I like to use both, I find PyODBC is great for kicking off stored procedures etc. but (my understanding at least) is that SQL-Alchemy is the tool to use to use for a wider variety of connections and uses. I have muddled around different sources on the internet to get these to connect correctly.
Importing the required Libraries 

Setting up the connection strings 

Importing to datafarme using PyODBC

Importing to dataframe, from file, using SQL Alchemy.



Connecting to Office:
You then need to be able to connect from this dataframe into Excel and or Word and ideally then send an e-mail either telling people that the files are ready or attaching the relevant file.

There are many ways to connect to Excel, fewer for Word and for Outlook there are again a few ways but the win32com client is the most popular way I have seen of connecting and fairly easy too. My favourite way of automating office is using XLwings, it is great for automating most things in Excel and unlike some other packages it doesn’t remove all the graphs! As a side note the biggest issue I still have with it is that I have not been able to create password protected Excel documents, but you can get round this (using an Excel personal macro).

With word I use Python-Docx. Again, it does have some limitations but there are word arounds for most of the issues. For example, I do not think that you can currently create contents pages however you can get around this.

At this point I won’t go into any more detail on this however you can easily see how you can connect to a database, load data, collect information or run a stored procedure and then you can push this into word and Excel


Anyway so that is the basic process and modules I use.  

Outline of basic flows to get data into and out of a data source and into office using python.



No comments:

Post a Comment