Edit TXT

This demonstration shows how to load, open for editing, and save text documents, and explains the edit and save options and their purpose.

How to edit a TXT file?

Text documents are plain-text flat files (TXT) with no built-in notion of images, pages, paragraphs, lists, tables, and so on. Users can still imitate simple formatting: lists with leading markers, left indents made of whitespace, tables drawn with pseudo-graphics, paragraphs separated by line breaks, and so on. GroupDocs.Editor can recognize some of these structures. It can also save an edited TXT document not only back to TXT but also to a WordProcessing format.
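
To experiment with the structures mentioned above, a small input file can be generated with the standard library alone. This is only a sketch: the file name primitive_formatting.txt, the helper write_sample, and the sample content are made up for illustration, and nothing here calls GroupDocs.Editor.

```python
import tempfile
from pathlib import Path

# Hypothetical sample content: each group imitates one kind of primitive formatting.
SAMPLE_LINES = [
    "Shopping list",                            # plain paragraph
    "",                                         # empty line separating paragraphs
    "1. Apples",                                # numbered item, dot delimiter
    "2) Pears",                                 # numbered item, right-bracket delimiter
    "* Plums",                                  # bulleted item
    "",
    "    Indented with four leading spaces.",   # left indent made of whitespace
    "",
    "+------+-------+",                         # pseudo-graphics table
    "| Item | Count |",
    "+------+-------+",
    "| Eggs |    12 |",
    "+------+-------+",
]


def write_sample(directory: Path) -> Path:
    """Write the sample lines to a UTF-8 text file and return its path."""
    path = directory / "primitive_formatting.txt"
    path.write_text("\n".join(SAMPLE_LINES) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = write_sample(Path(tmp))
        print(sample.read_text(encoding="utf-8"))
```

A file like this can then serve as input when trying out the edit options described later on this page.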

Loading a text file for edit

Unlike WordProcessing and Spreadsheet documents, text documents are loaded into the Editor class without any loading options. They are simple text files by their nature, so there is nothing to adjust:

from groupdocs.editor import Editor

editor = Editor("file.txt")

Edit a text file

To open a text document for editing and obtain an EditableDocument instance, use the TextEditOptions class. It exposes several options (properties), described below:

  1. encoding: sets the character encoding of the input text document. If not specified, UTF-8 is used.
  2. recognize_lists: a boolean flag that controls how numbered list items are recognized. If set to False (the default), the list recognition algorithm detects list paragraphs only when list numbers end with a dot, a right bracket, or a bullet symbol. If set to True, whitespace is also accepted as a list number delimiter.
  3. leading_spaces: a flag that indicates whether consecutive leading spaces are converted to a left indent (the default), preserved as consecutive spaces, or trimmed.
  4. trailing_spaces: a flag that indicates whether consecutive trailing spaces are truncated (the default) or trimmed.
  5. enable_pagination: a boolean flag that enables the paged view of the document. Text documents are pageless by nature, but GroupDocs.Editor can split them into pages, as MS Word does.
  6. direction: a flag that specifies the direction of text flow in the input plain text document. The default is left-to-right.
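
The difference between the two recognize_lists modes, and between the three leading_spaces modes, can be illustrated with a few lines of standard-library Python. This is a rough approximation written for this page, not the algorithm GroupDocs.Editor actually uses: the regular expressions, the function names, and the mode strings "indent", "preserve", and "trim" are all assumptions.

```python
import re

# Assumption: a list number is a run of digits, and bullets are a small arbitrary set.
# Strict mode: the number must end with a dot or a right bracket.
STRICT_ITEM = re.compile(r"^\s*(?:\d+[.)]|[*\-\u2022])\s+\S")
# Loose mode: whitespace alone may also separate the number from the text.
LOOSE_ITEM = re.compile(r"^\s*(?:\d+[.)]?|[*\-\u2022])\s+\S")


def looks_like_list_item(line: str, recognize_lists: bool = False) -> bool:
    """Approximate the two list-detection modes described above."""
    pattern = LOOSE_ITEM if recognize_lists else STRICT_ITEM
    return pattern.match(line) is not None


def handle_leading_spaces(line: str, mode: str = "indent"):
    """Return (indent_width, text) for one line under a hypothetical mode name."""
    stripped = line.lstrip(" ")
    count = len(line) - len(stripped)
    if mode == "indent":      # spaces become a left indent of `count` units
        return count, stripped
    if mode == "preserve":    # spaces stay in the text, no indent is produced
        return 0, line
    if mode == "trim":        # spaces are dropped entirely
        return 0, stripped
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    for text in ["1. Apples", "2) Pears", "* Plums", "1 Apples", "2024 was a good year"]:
        print(repr(text), looks_like_list_item(text), looks_like_list_item(text, True))
    for mode in ("indent", "preserve", "trim"):
        print(mode, handle_leading_spaces("    text", mode))
```

In this sketch the loose mode also matches ordinary sentences that happen to start with a number, which shows the kind of trade-off a whitespace delimiter introduces.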

The runnable example below demonstrates using this options class to open the input text document for editing, getting the HTML content, editing it programmatically, and then saving the edited content back to a TXT file.

import os
from groupdocs.editor import Editor, EditableDocument, License
from groupdocs.editor.options import TextEditOptions, TextSaveOptions

def edit_txt():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    # Load an input text file into the Editor
    with Editor("./sample.txt") as editor:
        # Create and adjust the text edit options
        edit_options = TextEditOptions()
        edit_options.enable_pagination = True
        edit_options.recognize_lists = True

        # Edit the text file and obtain an EditableDocument
        editable = editor.edit(edit_options)

        # Edit the content programmatically (in practice this is done in a WYSIWYG-editor)
        html = editable.get_content()
        edited_html = html.replace("This is a sample plain text file", "This is an edited sample plain text file")
        edited = EditableDocument.from_markup(edited_html)

        # Create text save options and tune them
        save_options = TextSaveOptions()
        save_options.preserve_table_layout = True

        # Save the edited content back to the TXT format
        editor.save(edited, "./edited.txt", save_options)

        editable.dispose()
        edited.dispose()

if __name__ == "__main__":
    edit_txt()

sample.txt is the sample file used in this example. The beginning of the resulting edited.txt is shown below.

This is an edited sample plain text file
 
There is one empty line above this text.  Two spaces.   Three spaces.
New line, which is preceded by 5 consecutive spaces. External link: https://rozetka.com.ua/final_pm_2d_black/p34087151/#tab=characteristics
New line again, but at this time 1 tab char and 1 space char.
One tab.
Two tabs.
Three tabs.
 
 
[TRUNCATED]


Save a text file after edit

After being edited, a text document can be saved back to TXT or to a WordProcessing format. To save back to TXT, use the TextSaveOptions class, which has the following properties:

  1. encoding: the character encoding applied when saving the text document. If not specified, UTF-8 is used.
  2. add_bidi_marks: a boolean flag that determines whether bi-directional marks are added before each BiDi run when saving in plain text format.
  3. preserve_table_layout: a boolean flag that specifies whether GroupDocs.Editor should try to preserve the layout of tables when saving in plain text format. The default value is False.
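
To give an intuition for what preserving a table layout in plain text can mean, the sketch below renders the same rows in two ways. It is an illustration only: the function render_table, the two-space column gap, and the one-cell-per-line fallback are assumptions made for this page and do not describe the actual output of GroupDocs.Editor.

```python
from typing import List, Sequence


def render_table(rows: Sequence[Sequence[str]], preserve_layout: bool) -> List[str]:
    """Render rectangular table rows as plain-text lines."""
    if not preserve_layout:
        # Assumed fallback: every cell goes on its own line, so the grid is lost.
        return [cell for row in rows for cell in row]
    # Pad every column to the width of its longest cell so columns line up.
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]
    return [
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in rows
    ]


if __name__ == "__main__":
    table = [["Item", "Count"], ["Eggs", "12"], ["Apples", "3"]]
    print("\n".join(render_table(table, preserve_layout=True)))
    print("---")
    print("\n".join(render_table(table, preserve_layout=False)))
```

With the layout preserved, a reader can still tell which cells belong to the same row and column; without it, only the cell order survives.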

The edited content can also be saved to a WordProcessing format. For this, a WordProcessingSaveOptions instance with the desired output format is used:

from groupdocs.editor.formats import WordProcessingFormats
from groupdocs.editor.options import WordProcessingSaveOptions

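# `editor` and `edited` are the objects created in the edit example above;
# run this inside the same `with Editor(...)` block, before `edited.dispose()`
word_save_options = WordProcessingSaveOptions(WordProcessingFormats.DOCM)
editor.save(edited, "./edited.docm", word_save_options)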

After running these examples, the edited document is available in both the TXT and DOCM formats.