Automating Windows Applications Using COM

Python

Automating Windows Applications Using COM Source – PbPython.com

Introduction

Python has many options for natively creating common Microsoft Office file
types including Excel, Word and PowerPoint. In some cases, however, it may be too difficult
to use the pure python approach to solve a problem. Fortunately, python has
the “Python for Windows Extensions” package known as pywin32 that allows us to easily
access Window’s Component Object Model (COM) and control Microsoft applications
via python. This article will cover some basic use cases for this type of automation
and how to get up and running with some useful scripts.

What is COM?

From the Microsoft Website, the Component Object Model (COM) is:

a Platform-independent, distributed, object-oriented system for creating binary
software components that can interact. COM is the foundation technology for
Microsoft’s OLE (compound documents) and ActiveX (Internet-enabled components)
technologies. COM objects can be created with a variety of programming languages.

This technology allows us to control Windows applications from another program.
Many of the readers of this blog have probably seen or used VBA for some level of automation
of an Excel task. COM is the foundational technology that supports VBA.

pywin32

The pywin32 package has been around for a very long time. In fact, the book
that covers this topic was published in 2000 by Mark Hammond and Andy Robinson.
Despite being 18 years old (which make me feel really old :), the underlying
technology and concepts still work today. Pywin32 is basically a very thin wrapper of
python that allows us to interact with COM objects and automate Windows applications
with python. The power of this approach is that you can pretty much do anything
that a Microsoft Application can do through python. The downside is that you have
to run this on a Windows system with Microsoft Office installed. Before we go
through some examples, make sure you have pywin32 installed on your system using

pip

or
conda

One other recommendation I would make is that you keep a link to Tim Golden’s page handy.
This resource has many more details on how to use python on Windows for automation
and other administration tasks.

Getting Started

All of these applications start with similar imports and process for activating an application.
Here is a very short example of opening up Excel:

import win32com.client as win32
excel = win32.gencache.EnsureDispatch('Excel.Application')

excel.visible = True
_ = input("Press ENTER to quit:")

excel.Application.quit()

Once you run this from the command line, you should see Excel open up. When you press ENTER,
the application will close. There are a few key concepts to go through before we
actually make this a more useful application.

The first step is to import the win32 client. I’ve used the convention of importing
it as
win32

to make the actual dispatch code a little shorter.

The magic of this code is using
EnsureDispatch

to launch Excel. In this example,
I use
gencache.EnsureDispatch

to create a static proxy. I recommend reading
this article if you want to know more details about static vs. dynamic proxies.
I have had good luck using this approach for the types of examples included in this article
but will be honest – I have not widely experimented with the various dispatch approaches.

Now that the excel object is launched, we need to explicitly make it visible by
setting
excel.visible = True

The win32 code is pretty smart and will close down excel once the program is done
running. This means that if we just leave the code to run on its own, you probably
won’t see Excel. I include the dummy prompt to keep Excel visible on the screen
until the user presses ENTER.

I include the final line of
excel.Application.quit()

as a bit of a belt and
suspenders approach. Strictly speaking win32 should close out Excel when the program
is done but I decided to include
excel.Application.quit()

to show how to
force the application to close.

This is the most basic approach to using COM. We can extend this in a
number of more useful ways. The rest of this article will go through some examples
that might be useful for your own needs.

Open a File in Excel

In my day-to-day work, I frequently use pandas to analyze and manipulate data,
then output the results in Excel. The next step in the process is to open up the
Excel and review the results. In this example, we can automate the file opening
process which can make it simpler than trying to navigate to the right directory
and open a file.

Here’s the full example:

import win32com.client as win32
import pandas as pd
from pathlib import Path

# Read in the remote data file
df = pd.read_csv("https://github.com/chris1610/pbpython/blob/master/data/sample-sales-tax.csv?raw=True")

# Define the full path for the output file
out_file = Path.cwd() / "tax_summary.xlsx"

# Do some summary calcs
# In the real world, this would likely be much more involved
df_summary = df.groupby('category')['ext price', 'Tax amount'].sum()

# Save the file as Excel
df_summary.to_excel(out_file)

# Open up Excel and make it visible
excel = win32.gencache.EnsureDispatch('Excel.Application')
excel.Visible = True

# Open up the file
excel.Workbooks.Open(out_file)

# Wait before closing it
_ = input("Press enter to close Excel")
excel.Application.Quit()

Here’s the resulting Excel output:

Excel Image

This simple example expands on the earlier one by showing how to use the
Workbooks

object to open up a file.

Attach an Excel file to Outlook

Another simple scenario where COM is helpful is when you want to attach a file
to an email and send to a distribution list. This example shows how to do some data
manipulation, open up a Outlook email, attach a file and leave it open for additional
text before sending.

Here’s the full example:

import win32com.client as win32
import pandas as pd
from pathlib import Path
from datetime import date


to_email = """
Lincoln, Abraham ; chris@example.com
"""

cc_email = """
Franklin, Benjamin 
"""

# Read in the remote data file
df = pd.read_csv("https://github.com/chris1610/pbpython/blob/master/data/sample-sales-tax.csv?raw=True")

# Define the full path for the output file
out_file = Path.cwd() / "tax_summary.xlsx"

# Do some summary calcs
# In the real world, this would likely be much more involved
df_summary = df.groupby('category')['ext price', 'Tax amount'].sum()

# Save the file as Excel
df_summary.to_excel(out_file)

# Open up an outlook email
outlook = win32.gencache.EnsureDispatch('Outlook.Application')
new_mail = outlook.CreateItem(0)

# Label the subject
new_mail.Subject = "{:%m/%d} Report Update".format(date.today())

# Add the to and cc list
new_mail.To = to_email
new_mail.CC = cc_email

# Attach the file
attachment1 = out_file

# The file needs to be a string not a path object
new_mail.Attachments.Add(Source=str(attachment1))

# Display the email
new_mail.Display(True)

This example gets a little more involved but the basic concepts are the same.
We need to create our object (Outlook in this case) and create a new email.
One of the challenging aspects of working with COM is that there is not a very consistent
API. It is not intuitive that you create an email like this:
new_mail = outlook.CreateItem(0)

It generally takes a little searching to figure out the exact API for the specific problem.
Google and stackoverflow are your friends.

Once the email object is created, you can add the recipient and CC list as well as attach
the file. When it is all said and done, it looks like this:

Outlook email image

The email is open and you can add additional information and send it. In this example,
I chose not to close out Outlook and let python handle those details.

Copying Data into Excel

The final example is the most involved but illustrates a powerful approach for blending
the data analysis of python with the user interface of Excel.

It is possible to build complex excel with pandas but that approach can be very
laborious. An alternative approach would be to build up the complex
file in Excel, then do the data manipulation and copy the data tab to the final
Excel output.

Here is an example of the Excel dashboard we want to create:

Dashboard example

Yes, I know that pie charts are awful but I can almost guarantee that someone is
going to ask you to put one in the dashboard at some point in time! Also, this template
had a pie chart and I decided to keep it in the final output instead of trying to figure
out another chart.

It might be helpful to take a step back and look at the basic process the code will follow:

Process flow

Let’s get started with the code.

import win32com.client as win32
import pandas as pd
from pathlib import Path

# Read in the remote data file
df = pd.read_csv("https://github.com/chris1610/pbpython/blob/master/data/sample-sales-tax.csv?raw=True")

# Define the full path for the data file file
data_file = Path.cwd() / "sales_summary.xlsx"

# Define the full path for the final output file
save_file = Path.cwd() / "sales_dashboard.xlsx"

# Define the template file
template_file = Path.cwd() / "sample_dashboard_template.xlsx"

In the section we performed our imports, read in the data and defined all
three files. Of note is that this process includes the step of summarizing the data
with pandas and saving the data in an Excel file. We then re-open that file
and copy the data into the template. It is a bit convoluted but this is the best
approach I could figure out for this scenario.

Next we perform the analysis and save the temp Excel file:

# Do some summary calcs
# In the real world, this would likely be much more involved
df_summary = df.groupby('category')['quantity', 'ext price', 'Tax amount'].sum()

# Save the file as Excel
df_summary.to_excel(data_file, sheet_name="Data")

Now we use COM to merge the temp output file into our Excel dashboard tab and save
a new copy:

# Use com to copy the files around
excel = win32.gencache.EnsureDispatch('Excel.Application')
excel.Visible = False
excel.DisplayAlerts = False

# Template file
wb_template = excel.Workbooks.Open(template_file)

# Open up the data file
wb_data = excel.Workbooks.Open(data_file)

# Copy from the data file (select all data in A:D columns)
wb_data.Worksheets("Data").Range("A:D").Select()

# Paste into the template file
excel.Selection.Copy(Destination=wb_template.Worksheets("Data").Range("A1"))

# Must convert the path file object to a string for the save to work
wb_template.SaveAs(str(save_file))

The code opens up Excel and makes sure it is not visible. Then it opens up the
dashboard template and data files. It uses the
Range("A:D").Select()

to
select all the data and then copies it into the template file.

The final step is to save the template as a new file.

This approach can be a very convenient shortcut when you have a situation where you
want to use python for data manipulation but need a complex Excel output. You may
not have an apparent need for it now but if you ever build up a complex Excel report,
this approach is much simpler than trying to code the spreadsheet by hand with python.

Conclusion

My preference is to try to stick with python as much as possible for my day-to-day
data analysis. However, it is important to know when other technologies can streamline
the process or make the results have a bigger impact. Microsoft’s COM technology is a mature technology
and can be used effectively through python to do tasks that might be too difficult to
do otherwise. Hopefully this article has given you some ideas on how to incorporate this
technique into your own workflow. If you have any tasks you like to use pywin32 for,
let us know in the comments.