Python and Excel Working Together

Having the best of both worlds with Python and Excel working together.

When we’re exploring using data science tools for actuarial modelling, we’d often like to keep using existing Excel workbooks, which can contain valuable and trusted models and data.

Fortunately, using Python doesn't mean abandoning Excel as there are some very powerful tools available that allow closely coupled interaction between the two.

These tools allow us to have the "best of both worlds" combining the ease of use of Excel and the power of Python.

We're going to start with the simplest options and lead through to ways of building workflows that can contain both Excel workbooks and Python code.

If you're an R user then don't feel left out as we'll soon be following up with a post on how R and Excel can be used together!

Pandas

If we just want to transfer tabular data from Python into Excel or from Excel to Python, then the Python 'pandas' library already has the ability to read and write dataframes in modern Excel (xlsx) format.

import pandas as pd

df = pd.read_excel('old_book.xlsx', index_col=0)

df.to_excel('new_book.xlsx') 

The documentation for read_excel and to_excel highlight lots of options to specify the format of the data to read or save, including number and date formats.

If you're using Python at all for data science then its very likely that you have pandas installed already.

Pandas is limited though in that it can only read and write complete dataframes. If we want more granular interaction between Excel and Python then we have to reach for a number of alternatives.

openpyxl

openpyxl provides the ability both to read and write complete Excel xlsx workbooks and to perform operations at the cell level on those workbooks.

Reading a file is as simple as two lines of Python:

from openpyxl import load_workbook

wb = load_workbook(filename = 'old_book.xlsx')

Once a file has been read then it's easy to make changes to individual cells:

ws = wb['sheet name']

ws['A1'] = 1000

To add new worksheets:

ws2 = wb.create_sheet(title='Pi')

ws2['F5'] = 3.14

And then to save the modified workbook:

wb.save(filename = 'new_book.xlsx')

If you want to read and write older xls files then the package [xlrd] should be used. xlrd works in a very similar way to openpyxl.

XLWings

XLWings is a package that allows users to both build Python scripts that work interactively with Excel and to embed Python code into Excel.

XLWings allows Python to import data from an Excel workbook and then to programmatically update that Excel workbook.

In this example we see Python code being executed directly from within an Excel worksheet.

The basic version of XLWings is open source and free to download.

There is a paid PRO version starting at $990 per developer per year which comes with some useful extra features and includes email support.

PyCel

What if you have an Excel workbook containing some calculations and you want to include that as part of a Python based workflow?  Remarkably, there is a package - PyCel - which does just that.

It's really easy to use this package in practice. In our very simple example below we have created a workbook with only two cells. B1 is set equal to A1 * 2.

We can change the value of A1 in a copy of the workbook loaded into Python with PyCel and then read the revised value of B1 without needing to access Excel at all.

from pycel import ExcelCompiler

# Load Excel workbook
wb = ExcelCompiler(filename="test.xlsx")

# Print initial value
print(f"Initial value in B1 {wb.evaluate('Sheet1!B1')}")

# Set A1 to new value in loaded workbook
wb.set_value('Sheet1!A1', 3)

# Print recalculated value
print(f"Revised value in B1 {wb.evaluate('Sheet1!B1')}")

This prints out the updated value of B1: 6 in this case.

One word of warning here though. This doesn’t allow for the transfer of Excel VBA based scripts into Python and there are other Excel features that are not supported. As with any third party package, we recommend that it is tested extensively for your particular application before using it in production.

However, this does open up several interesting opportunities including automation with Excel based workflows and Python working together.

Best of both worlds

With the range of tools available, it should be possible to have the ‘best of both worlds’: the familiarity of existing Excel workbooks and the power of the Python ecosystem working together.

Finally, if you'd like to learn  more, the author of XLWings, Felix Zumstein, has written an excellent book "Python for Excel", which covers these topics in more detail. Highly recommended.

Python for Excel Cover

Please subscribe to receive email updates of future posts, using the button on the bottom right.

Please also get in touch if you’d like to discuss any topic we cover or share how you're using data science tools in your work: contact details are on the 'Contact' page accessible through the menu at the top of this page. Or alternatively share your thoughts in the comments below.