Edit Word

This example demonstrates the standard open-edit-save cycle for WordProcessing documents, using different options at every step.

How to edit a Word document?

WordProcessing is the most widely used and best-known family of document formats; it includes DOC, DOT, DOCX, DOCM, DOTX, ODT, RTF, and many more. GroupDocs.Editor supports all of these formats. There are two processing modes for WordProcessing documents:

  • float (default)
  • paged (also known as paginal).

In float mode, when a document is opened for editing and loaded into a client-side WYSIWYG HTML editor, it is represented without pages, as a single continuous page of text.
In contrast, when paged mode is enabled, the document content is divided into pages. For paged mode to work properly, it should be enabled in both the edit options and the save options (this is described below).
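
The requirement to keep the two settings in step can be checked before saving. The following is a minimal sketch and not part of GroupDocs.Editor: pagination_modes_match is a hypothetical helper, and the SimpleNamespace objects are stand-ins for the real option objects, assuming only that both expose an enable_pagination attribute as in the examples later on this page.

```python
from types import SimpleNamespace


def pagination_modes_match(edit_options, save_options):
    """Return True when both option objects use the same processing mode.

    Hypothetical helper: relies only on an `enable_pagination` attribute.
    """
    return bool(edit_options.enable_pagination) == bool(save_options.enable_pagination)


# Stand-ins for the real option objects (assumption, for illustration only).
paged_edit = SimpleNamespace(enable_pagination=True)
paged_save = SimpleNamespace(enable_pagination=True)
float_save = SimpleNamespace(enable_pagination=False)

print(pagination_modes_match(paged_edit, paged_save))  # True
print(pagination_modes_match(paged_edit, float_save))  # False
```

Because the helper reads only one attribute, it should work equally well with the actual edit and save option instances.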

Load Word file for edit

First, the user must open a document by loading it into an instance of the Editor class. This example demonstrates how to load a password-protected document. Suppose we have a password-protected DOCX and the user knows its password. Start by creating the load options, an instance of the WordProcessingLoadOptions class.

from groupdocs.editor.options import WordProcessingLoadOptions

load_options = WordProcessingLoadOptions()
load_options.password = "some_password_to_open_a_document"

Note that if the document is not protected, the password is ignored. However, if the document is protected and no password is specified, a PasswordRequiredException is thrown during document editing.

The next step is to load the document into an Editor instance, passing the load options along with the file path.

from groupdocs.editor import Editor

editor = Editor("document.docx", load_options)

Edit the document

Once the document is loaded, it can be edited (converted to an EditableDocument instance), and this process can be adjusted with edit options. Let’s create them:

from groupdocs.editor.options import WordProcessingEditOptions

edit_options = WordProcessingEditOptions()                 # #1
edit_options.enable_language_information = True             # #2
edit_options.enable_pagination = True                      # #3

The code above works as follows. Line #1: every supported document family has its own options class, so all WordProcessing formats use WordProcessingEditOptions; likewise, spreadsheet formats (XLS, ODS, and so on) use SpreadsheetEditOptions. Line #2 enables extraction of language information, which improves subsequent spell-checking on the client side. Line #3 switches the document processing mode from float (the default) to paged.

With the options prepared, the previously loaded document can be opened for editing:

before_edit = editor.edit(edit_options)

Unlike the previous example, let’s extract the HTML markup and the resources separately:

original_content = before_edit.get_content()
all_resources = before_edit.all_resources

The first variable is a string containing all of the HTML markup without resources, while the second is a collection containing all resources (images, fonts, and stylesheets).
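
Since the markup and the resources travel separately, it can be useful to see which resources the markup actually refers to before handing both to the client. The sketch below uses only the Python standard library and is not part of GroupDocs.Editor; list_resource_references and the sample markup are hypothetical, and it assumes the markup refers to resources through ordinary src and href attributes.

```python
from html.parser import HTMLParser


class ResourceReferenceCollector(HTMLParser):
    """Collects URLs referenced by <img src="..."> and <link href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and attributes.get("src"):
            self.references.append(attributes["src"])
        elif tag == "link" and attributes.get("href"):
            self.references.append(attributes["href"])


def list_resource_references(markup):
    """Return resource URLs in the order they appear in the markup."""
    collector = ResourceReferenceCollector()
    collector.feed(markup)
    collector.close()
    return collector.references


# Hypothetical markup, standing in for the string returned by get_content().
sample_markup = (
    '<html><head><link rel="stylesheet" href="styles.css"></head>'
    '<body><p>Hello</p><img src="image1.png"></body></html>'
)
print(list_resource_references(sample_markup))  # ['styles.css', 'image1.png']
```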

Modifying document content

Suppose the user passed the HTML markup and resources obtained from the EditableDocument instance to a WYSIWYG editor, edited the document on the client side, and received the modified HTML markup back.
Now the user needs to create a new EditableDocument instance from this modified markup.

from groupdocs.editor import EditableDocument

edited_content = original_content.replace("document", "edited document")
after_edit = EditableDocument.from_markup(edited_content)
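
Note that str.replace, as used above to simulate an edit, substitutes every occurrence of the search string, including any that happen to sit inside attribute values. When only visible text should change, one option is to rewrite text nodes alone. The following is a minimal sketch using the standard library; replace_in_text_nodes is a hypothetical helper, not a GroupDocs.Editor API, and it assumes reasonably well-formed markup.

```python
from html.parser import HTMLParser


class TextNodeReplacer(HTMLParser):
    """Re-emits markup unchanged except for replacements inside text nodes."""

    RAW_TEXT_TAGS = ("script", "style")

    def __init__(self, old, new):
        # Keep character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.old = old
        self.new = new
        self.parts = []
        self._in_raw_text = False

    def handle_starttag(self, tag, attrs):
        self.parts.append(self.get_starttag_text())  # original tag text
        if tag in self.RAW_TEXT_TAGS:
            self._in_raw_text = True

    def handle_startendtag(self, tag, attrs):
        self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")
        if tag in self.RAW_TEXT_TAGS:
            self._in_raw_text = False

    def handle_data(self, data):
        # Text may arrive in several chunks (for example around entities),
        # so a match spanning two chunks is not replaced.
        if self._in_raw_text:
            self.parts.append(data)
        else:
            self.parts.append(data.replace(self.old, self.new))

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")

    def handle_comment(self, data):
        self.parts.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.parts.append(f"<!{decl}>")


def replace_in_text_nodes(markup, old, new):
    """Replace `old` with `new` in text nodes only, leaving tags untouched."""
    replacer = TextNodeReplacer(old, new)
    replacer.feed(markup)
    replacer.close()
    return "".join(replacer.parts)


sample = '<p class="document">My document</p>'
print(replace_in_text_nodes(sample, "document", "edited document"))
# <p class="document">My edited document</p>
```

The result of such a helper could then be passed to EditableDocument.from_markup in place of edited_content.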

Save Word file after edit

Before saving the document, the user must create the save options.

from groupdocs.editor.formats import WordProcessingFormats
from groupdocs.editor.options import WordProcessingSaveOptions

save_options = WordProcessingSaveOptions(WordProcessingFormats.DOCM)   # #1, #2
save_options.password = "password"                                    # #3
save_options.enable_pagination = True                                 # #4
save_options.optimize_memory_usage = True                             # #5

The numbered comments in the code above mark the following steps:

  1. The output document format is specified; in this case it is DOCM.
  2. An instance of WordProcessingSaveOptions is created with that format.
  3. When the user specifies a password, GroupDocs.Editor encrypts the document with it, so the saved document can be opened only with that password.
  4. Because pagination was enabled earlier in WordProcessingEditOptions (the edit_options variable), it is highly recommended to enable it in WordProcessingSaveOptions as well for a better output result.
  5. If the document is very large and causes an out-of-memory error, you can enable the memory optimization option.

Finally, the user saves the edited document with the prepared save options.

editor.save(after_edit, "edited-document.docm", save_options)

Conclusion

This tutorial demonstrates the basic scenario of opening a document, editing it, and saving it, with detailed options at every step. The subsequent articles in this section describe each of these options in detail.