Saturday, 30 December 2017

Putting a pandas dataframe into word

One of the main things you will see in a report is a bunch of tables. The best way of getting this data in my world is by putting it into a dataframe and then putting this dataframe into word. Again you will probably find solutions on stack-overflow as to the best way to do this and below you will find some code that I have used for this purpose.

The basic steps to do this are:
  1. Get the data, whether from a CSV or your database, in this case I have decided to use my database to get the information. This goes into a dataframe.
  2. Prepare the word document
  3. Insert the data
The code to do this, of course with comments, is shown below and the document this produces is shown here.

The code:

# -*- coding: utf-8 -*-
"""
Created on Sat Dec 30 13:03:12 2017
@author: garym
"""
### run imports
import pandas as pd
import pyodbc as py
from docx import Document
from docx.shared import Cm , Pt
## Set up a column widths function
def set_column_width(column, width):
for cell in column.cells:
cell.width = width
## Set up connection string and connection
conn_str = (
r'Driver={SQL Server};'
r'Server=localhost\SQLEXPRESS;'
r'Database=RUNNING;'
r'Trusted_Connection=yes;'
)
cnxn = py.connect(conn_str)
cursor = cnxn.cursor()
## Run query into a dataframe
dftop = pd.read_sql("select top 10 [user], len_min, track_name, track_length from v_history order by track_length desc" , cnxn )
## Create new document
document = Document()
## set up font
font = document.styles['Normal'].font
font.name = 'Arial'
font.size = Pt(10)
## set up margins
sections = document.sections
for section in sections:
section.top_margin = Cm(1.25)
section.bottom_margin = Cm(1.25)
section.left_margin = Cm(1.75)
section.right_margin = Cm(1.75)
## Add new Title
p = document.add_paragraph()
p.add_run('Top 10 Runs by Distance').bold = True
### Add new table to document with shapte of the dataframe
table = document.add_table(dftop.shape[0]+1, dftop.shape[1])
## set the style of the table
table.style = 'Light List Accent 4'
# add the header rows.
for j in range(dftop.shape[-1]):
table.cell(0,j).text = dftop.columns[j]
# add the rest of the data frame
for i in range(dftop.shape[0]):
for j in range(dftop.shape[-1]):
table.cell(i+1,j).text = str(dftop.values[i,j])
## Adjust the column widths using function set up earlier
set_column_width(table.columns[0], Cm(2.75))
set_column_width(table.columns[1], Cm(1.75))
set_column_width(table.columns[2], Cm(8.75))
set_column_width(table.columns[0], Cm(3.75))
## set up filename
filenamedocx = r'C:\Users\garym\Documents\PyWinAutoBlog\File.docx'
## save file
document.save(filenamedocx)
view raw Pandasdocx.py hosted with ❤ by GitHub

1 comment: