How to translate AutoCAD files

Published on 12/02/2016

This article describes how to process AutoCAD files so that they can be translated with computer-assisted translation tools such as SDL Trados Studio, memoQ, Wordfast, STAR Transit, Déjà Vu, etc. We also offer some pointers for avoiding problems that can arise during the export/import process or when typesetting AutoCAD files after translation.

TranslateCAD: a programme to extract text from AutoCAD files that actually works!

I don't know of any programme that never fails, and TranslateCAD is no exception. Still, from the day we discovered it, our AutoCAD typesetting projects became more predictable and more profitable, and our quotes much more competitive. A great find.

TranslateCAD extracts text from AutoCAD files

The programme is intuitive and easy to use, so I won't explain in detail how it works; you can find that information on the developer's website. I will mention, however, that it does not work directly with native AutoCAD files (DWG extension) but with the DXF exchange format. So the first thing you have to do is convert the AutoCAD files to DXF. You can do this by opening the files in AutoCAD and exporting them to this format. There are also several programmes, some of them free, that carry out this conversion. Later I'll talk about the one that we use and that has worked well for us.

TranslateCAD extracting text

The programme creates two TXT files from the DXF files: 1) one contains the extracted text (with the suffix –trans1) which will be used for translation; 2) the other contains the code from the DXF file (–trans2).

When you receive the translated TXT files, you will have to reverse the process: import the text and create DXF files containing the translations. You will then be able to open them in AutoCAD and save them in the native DWG format, in the same AutoCAD version in which you received them. This can also be done with a conversion programme.

You can buy TranslateCAD here. At the time of publishing this article it was priced at 29 USD, so it is a small investment and will allow you to translate AutoCAD files more efficiently and professionally. After only a few AutoCAD files it will have paid for itself.

What are DXF files?

DXF Exchange Format

DXF stands for Drawing Exchange Format. It is a format used to exchange technical drawing and industrial design files between different design programmes. It was mainly created to exchange files between the leading technical drawing software, AutoCAD, and the other programmes on the market. For more information about the DXF format, see the Wikipedia entry.

How to convert native AutoCAD files (.DWG) to the exchange format (.DXF)

A conversion programme is the quickest way to convert AutoCAD files (.DWG) to the exchange format (.DXF). If you have tens or hundreds of files, it could save you hours of work.

We've achieved good results with Teigha, a free programme by the Open Design Alliance that converts DWG files to the DXF format and vice versa, and is an essential tool for working efficiently with TranslateCAD. You can get the programme here for different platforms. For Windows, it can be downloaded directly from the Open Design Alliance website by clicking here.

Teigha File Converter

Tip 1: check that the DXF conversion has worked correctly

One problem that you might come across is that the conversion to DXF format has failed, whether you have done it with a converter or with AutoCAD. Usually when this happens you will get an error when trying to extract text with TranslateCAD.
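When a batch contains tens or hundreds of drawings, a failed conversion is easy to miss. The following Python sketch lists the DWG files that have no non-empty DXF counterpart after conversion. The function name missing_dxf and the two-folder layout (one folder of DWG originals, one of converted DXF files with the same base names) are assumptions for illustration; they are not part of TranslateCAD or Teigha.

```python
from pathlib import Path


def missing_dxf(dwg_dir, dxf_dir):
    """Return the names of DWG files with no non-empty DXF of the same base name."""
    # Base names (case-insensitive) of every DXF file that is not empty.
    converted = {
        p.stem.lower()
        for p in Path(dxf_dir).iterdir()
        if p.suffix.lower() == ".dxf" and p.stat().st_size > 0
    }
    # DWG files whose base name does not appear among the converted files.
    return sorted(
        p.name
        for p in Path(dwg_dir).iterdir()
        if p.suffix.lower() == ".dwg" and p.stem.lower() not in converted
    )
```

For example, missing_dxf("dwg_originals", "dxf_converted") (hypothetical folder names) returns the list of drawings that still need to be converted or checked by hand.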

Conversion from DWG to DXF

It is also advisable to open the resulting DXF file with AutoCAD to check that the file is like the original and that no information has been lost. That way you can avoid any surprises before it's too late.
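Before opening every file in AutoCAD, a rough automated check can catch files that were cut off during conversion. The Python sketch below assumes an ASCII (text) DXF file, in which lines alternate between a group code and its value, the drawing content sits in a section named ENTITIES, and the file ends with the pair 0 / EOF. The function name is hypothetical, and passing this check does not replace the visual comparison with the original.

```python
def looks_complete(dxf_text):
    """Rough check that ASCII DXF text has an ENTITIES section and ends with EOF."""
    lines = [line.strip() for line in dxf_text.splitlines()]
    # Ignore blank lines at the very end of the file.
    while lines and lines[-1] == "":
        lines.pop()
    # ASCII DXF alternates: group code on one line, value on the next.
    pairs = list(zip(lines[0::2], lines[1::2]))
    if not pairs or len(lines) % 2 != 0:
        return False
    has_entities = ("2", "ENTITIES") in pairs
    ends_with_eof = pairs[-1] == ("0", "EOF")
    return has_entities and ends_with_eof
```

The text can be read with Path(name).read_text(encoding="utf-8", errors="replace") before it is passed to the function; a binary DXF file would need a different check.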

In some cases the conversion from DWG to DXF may be impossible. The fast and complex development of the DWG format means that certain features cannot be converted to the DXF format. This happens with AutoCAD files that use advanced and complex functions of the programme. When it does, you won't be able to work with TranslateCAD. In this case, you can try TransTools for AutoCAD, a programme that, according to my sources, also seems to work well (http://www.translatortools.net/autocad-about.html).

Tip 2: prepare the text so that the translation process runs smoothly

In AutoCAD you will often find sentences that have been split with paragraph marks. This makes the translator's job more difficult: they will need the PDF file to check the context and work out the correct order of the segments. In translation, this sometimes means having to reverse the order of the translated text across the segments.

Depending on the number of files, the budget and the time available, you should ideally reformat the text so that each sentence appears in a single segment, which makes the translator's work easier.

Include Coordinates

The programme includes an option to help the translator locate text in the drawing when context is needed: if you select it, the X and Y coordinates of each piece of text are included.

Include Coordinates with TranslateCAD

Tip 3: check if there is outlined or bitmap text

As in any other job involving text extraction with filters or typesetting, you may find text that has been outlined (vector graphics) or embedded as a bitmap (raster graphics). As this text is not editable in its image format, TranslateCAD will not be able to extract it for translation. Generally, you will have to extract the text manually and make it editable.
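Bitmap content can often be spotted in the converted file before text extraction. As a sketch, assuming an ASCII DXF file in which every object starts with group code 0 followed by its type name, the Python function below counts how often each type occurs. Raster images are stored as IMAGE entities, while editable text is stored as TEXT or MTEXT. Outlined text cannot be detected this way, because it is typically stored as ordinary lines and curves. The function name is hypothetical.

```python
from collections import Counter


def count_object_types(dxf_text):
    """Count the type names that follow group code 0 in ASCII DXF text."""
    lines = [line.strip() for line in dxf_text.splitlines()]
    # Pair each group code with the value on the following line.
    pairs = zip(lines[0::2], lines[1::2])
    return Counter(value.upper() for code, value in pairs if code == "0")
```

A drawing whose count for IMAGE is greater than zero, or whose counts for TEXT and MTEXT are unexpectedly low, is worth opening in AutoCAD before sending anything to the translator.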

Check if there is outlined text or bitmap

Text in image format is bad practice when files need to be handled and translated. If you work regularly with the creator of the files, it is advisable to let them know, to make sure that the process is as affordable and quick as possible. This advice is also valid for other types of files such as Word, QuarkXPress, Illustrator, FrameMaker or InDesign, amongst others.

Tip 4: delete the ##number## codes

TranslateCAD adds a line of code, marked with “##”, to each segment in the TXT file that is sent to the translator. While some computer-assisted translation software will not count these lines (such as the versions of Trados and Wordfast that work from a Word interface), it is advisable to delete them before doing the word count and analysis. SDL Trados Studio, for example, treats these lines of code as text for translation, which markedly increases the word count and the translation invoice.

Example of text extracted with TranslateCAD

Although you can create a Word document from the TXT files and make these lines invisible with the help of a macro, this process can be quite time-consuming. We use Notepad++, an advanced open source text editor, which allows us to delete these sequences quickly with a regular expression and so create the files to be analysed against our translation databases (translation memories, as they are known in our industry). See the previous screenshot to see how. The translators will have to work with the files that include the lines of code. These lines will not be a big problem for the translator when working with a translation tool, because once the first segment has been inserted it will be auto-propagated to all of the segments that contain lines of code.
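As an illustration of the kind of regular expression involved, here is a minimal Python sketch. It assumes that each code occupies a line of its own in the form ##number## (for example ##12##); if the lines in your TXT files look different, the pattern needs adjusting. The names are hypothetical, and this shows the idea rather than the exact expression used in Notepad++.

```python
import re

# A whole line made up of ##digits##, optional trailing spaces and its line break.
CODE_LINE = re.compile(r"^##\d+##[ \t]*(?:\r?\n|\Z)", re.MULTILINE)


def strip_code_lines(text):
    """Return the text without the ##number## lines, for word counts only."""
    return CODE_LINE.sub("", text)
```

Run this on a copy of the file used for analysis only; the files sent to the translator must keep their lines of code so that the text can be imported back into the DXF file.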

Tip 5: make sure that the translated TXT files have UTF encoding

In our experience, some computer-assisted translation programmes don't return the TXT files in Unicode format. This can be a problem when converting back to DXF format after receiving the translated files. You can check this by opening the file with Windows Notepad and choosing "Save as"; the dialogue shows the file's encoding. In Notepad++, which I mentioned in the previous section, this information can be consulted directly in the main menu.
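For a whole batch of translated files, the same check can be scripted. The Python sketch below classifies the raw bytes of a file by looking for a Unicode byte order mark and, failing that, by testing whether the content decodes as UTF-8. The function name and the returned labels are hypothetical, and which Unicode variant your import step expects is something to confirm with your own files.

```python
import codecs


def describe_encoding(data):
    """Classify raw file bytes as a Unicode encoding or as not Unicode."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8 with BOM"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    try:
        # Plain ASCII also passes this test, since it is valid UTF-8.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "not unicode"
    return "utf-8"
```

Calling describe_encoding(Path(name).read_bytes()) on each translated TXT file shows which ones need to be re-saved in Unicode before the import step.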

Windows Notepad encoding


TranslateCAD in combination with Teigha is a solution that works for processing batches of AutoCAD files that need to be translated and typeset. When working in Unicode the process is compatible with almost all languages. The most important restriction of the process is the problem with converting certain DWG files to DXF.

I hope this entry has been useful for those searching for a solution for AutoCAD files.

Josh Gambin

Josh Gambin holds a 5-year degree in Biology from the University of Valencia (Spain) and a 4-year degree in Translation and Interpreting from the University of Granada (Spain). He has worked as a freelance translator, in-house translator, desktop publisher and project manager. He has been a founding member of AbroadLink since 2002 and currently works as Marketing and Sales Manager.
