Translating Excel Sheets

17 Jun 2018

When you get tasked with translating an Excel spreadsheet with 100k words for Monday

At first, when I was asked to help translate an Excel spreadsheet of English testing instructions, I thought it would be quite a manageable task. There are three of us, and one is a native German speaker. After discovering that the word count was nearly 100,000, though, I was pretty sure it would take a very long time. Not to mention that Excel is quite awful for editing large chunks of text... That's when I started Googling for Python libraries that could read and write Excel documents, and for one that could talk to Google Translate.

To read and write the spreadsheet I chose OpenPyXL. I wouldn't call its documentation extensive or great, but the library is well rounded and simple enough that it was quite intuitive to figure out. The Google Translate library was a trickier find. Being an intern, I knew it would be pretty hard to get funding for Google's own Translate API, which is priced rather steeply for people just getting started with it. Luckily, a stranger on the internet has created a very simple Python library, which I assume just uses the actual Google Translate website to do translations. It works quite well and can be found here.
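To give a rough idea of how the two libraries fit together, here is a minimal sketch, not the actual script. The function names, the default column "A", and the default language codes are made up for illustration. It assumes the translation library is mtranslate, whose translate function takes the text, the target language, and the source language. Repeated strings are cached so the same cell text is only sent off once.

```python
def translate_cells(values, translate_fn, cache=None):
    """Translate every non-empty string in values; leave other values alone.

    translate_fn is any callable taking a string and returning a string.
    Repeated strings are looked up in cache instead of being re-translated.
    """
    cache = {} if cache is None else cache
    result = []
    for value in values:
        if isinstance(value, str) and value.strip():
            if value not in cache:
                cache[value] = translate_fn(value)
            result.append(cache[value])
        else:
            # Numbers, empty cells (None) and whitespace-only strings pass through.
            result.append(value)
    return result


def translate_column(xlsx_path, out_path, column="A", source="en", target="de"):
    """Hypothetical helper: translate one column of the active sheet in place."""
    # Third-party imports live here so translate_cells has no dependencies.
    from openpyxl import load_workbook
    from mtranslate import translate

    workbook = load_workbook(xlsx_path)
    sheet = workbook.active
    cells = list(sheet[column])  # every cell in the chosen column
    translated = translate_cells(
        [cell.value for cell in cells],
        lambda text: translate(text, target, source),
    )
    for cell, text in zip(cells, translated):
        cell.value = text
    workbook.save(out_path)
```

Calling translate_column("instructions.xlsx", "instructions_de.xlsx", column="B") would then write a translated copy, leaving the original file untouched. Both file names are placeholders.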

Overall, it's a pretty simple project: only 150 lines of Python let us meet the Monday deadline, with the flexibility to choose source and target languages and columns.
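The language and column choices live in a YAML config file. The real layout of that file isn't shown in this post, so the key names below (source_language, target_language, columns) are assumptions for the sake of the sketch. The idea is simply to load the file with PyYAML and check it before any translating starts.

```python
REQUIRED_KEYS = ("source_language", "target_language", "columns")  # assumed key names


def validate_config(config):
    """Check a loaded config mapping and return it in a normalised form."""
    if not isinstance(config, dict):
        raise ValueError("config must be a mapping")
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError("missing config keys: " + ", ".join(missing))
    columns = config["columns"]
    if isinstance(columns, str):
        columns = [columns]  # allow a single column letter as shorthand
    return {
        "source_language": str(config["source_language"]),
        "target_language": str(config["target_language"]),
        "columns": [str(column).upper() for column in columns],
    }


def load_config(path):
    """Read a YAML file and validate it (PyYAML is imported only when needed)."""
    import yaml

    with open(path, encoding="utf-8") as handle:
        return validate_config(yaml.safe_load(handle))
```

Under those assumed keys, a config.yaml could be as short as three lines: source_language: en, target_language: de, and columns: [B, C].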


You can download my code by clicking here. You have to use Python 3 for its native Unicode support, or you'll run into some bugs. You will also need the following dependencies installed for your Python 3: PyYAML, OpenPyXL, and MTranslate.

It should be possible to install most of these via pip.

In Terminal: python3 /pathTo/ [Excel Document] /pathTo/config.yaml

This will translate the document as set up in config.yaml and output a file in the same directory as the source file, with the name suffixed by: "translated LanguageCode.xlsx"
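Building that output name is a one-liner with pathlib. This is only a sketch: the function name is made up, and the underscore separators are my assumption here, since the exact separator depends on how the script joins the pieces.

```python
from pathlib import Path


def output_path(source_path, language_code):
    """Return a path next to the source file, suffixed with the language code."""
    source = Path(source_path)
    # with_name keeps the directory and swaps only the file name.
    # The "_translated_" separator is an assumption made for this sketch.
    return source.with_name(f"{source.stem}_translated_{language_code}.xlsx")
```

Because the result stays in the source file's directory and gets a new name, the original spreadsheet is never overwritten.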

Published on 17 Jun 2018 by Clement Hathaway