How to edit e-Book files

Introduction

As of version 22.9, GroupDocs.Editor for .NET supports three formats from the e-Book family:

  1. MOBI (MobiPocket),
  2. AZW3, also known as Kindle Format 8 (KF8),
  3. ePub (Electronic Publication).

As of version 22.9, the AZW3 and ePub formats are supported for both import (load) and export (save), while MOBI is supported for import only (support for MOBI export is planned for the next release).

Load e-Book files for editing

GroupDocs.Editor for .NET has no loading options for e-Books, neither for the e-Book format family as a whole nor for any specific e-Book format. Users specify an e-Book by file path or as a byte stream, without any loading options.

The code example below loads three e-Books in different formats into three different instances of the Editor class, each from a different source:

// Requires: using System.IO; using GroupDocs.Editor;
string mobiPath = "path/to/A-Room-with-a-View-morrison.mobi";
string azw3Path = "path/to/Around the World in 28 Languages.azw3";
string epubPath = "path/to/Alices Adventures in Wonderland.epub";

GroupDocs.Editor.Editor editorMobi = new Editor(mobiPath);

FileStream azw3Stream = File.OpenRead(azw3Path);
GroupDocs.Editor.Editor editorAzw3 = new Editor(delegate() { return azw3Stream; });

byte[] epubBytes = File.ReadAllBytes(epubPath);
MemoryStream epubStream = new MemoryStream(epubBytes);
GroupDocs.Editor.Editor editorEpub = new Editor(delegate () { return epubStream; });


// ...
// Don't forget to dispose Editors when work is done
editorMobi.Dispose();
editorAzw3.Dispose();
editorEpub.Dispose();

Edit e-Book files

There is a single edit options class for the whole e-Book format family: EbookEditOptions. Its content resembles that of the WordProcessingEditOptions class, because EbookEditOptions contains a subset of the options from WordProcessingEditOptions: EnablePagination and EnableLanguageInformation. As in WordProcessingEditOptions, both are disabled (false) by default.

  • EnablePagination: enables or disables pagination in the resultant HTML document. Disabled (false) by default. This option controls how the content of the e-Book is converted to the EditableDocument representation for editing: in float (false) or paged (true) mode. Ultimately, it affects the structure and representation of the HTML/CSS document that the end user edits in the WYSIWYG editor.

  • EnableLanguageInformation: controls whether language information is exported (true) or not exported (false) to the resultant HTML markup. Disabled (false) by default. This is useful when an e-Book contains text in different languages and you want to preserve this language-specific meta-information while editing the document in the WYSIWYG editor.

As with all supported document formats, to edit a document the user should first load it into the Editor class (this step is described in the section above) and then call the Edit(IEditOptions) method. EbookEditOptions is optional: the user may call the parameterless Edit() method, in which case the default EbookEditOptions are applied implicitly.

The code example below loads a single ePub file into an Editor instance and edits it twice with two different sets of edit options (default and custom), producing two different EditableDocument instances from the single input ePub file. Two different pieces of HTML markup are then generated from these two EditableDocument instances.

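// Requires: using GroupDocs.Editor; ("Options." below stands for the GroupDocs.Editor.Options namespace)
string epubPath = "path/to/Alices Adventures in Wonderland.epub";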

GroupDocs.Editor.Editor editorEpub = new Editor(epubPath);

Options.EbookEditOptions defaultEditOptions = new Options.EbookEditOptions();

Options.EbookEditOptions customEditOptions = new Options.EbookEditOptions();
customEditOptions.EnablePagination = true;
customEditOptions.EnableLanguageInformation = true;

EditableDocument defaultEdited = editorEpub.Edit(defaultEditOptions);
EditableDocument customEdited = editorEpub.Edit(customEditOptions);

string embeddedHtmlDefaultEdited = defaultEdited.GetEmbeddedHtml();
string embeddedHtmlCustomEdited = customEdited.GetEmbeddedHtml();

// ...
// Don't forget to dispose Editor and EditableDocuments when work is done
defaultEdited.Dispose();
customEdited.Dispose();
editorEpub.Dispose();
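
As a minimal sketch of the parameterless Edit() call described above, the following assumes a console application that references the GroupDocs.Editor package and uses a hypothetical input path. It relies on using statements instead of explicit Dispose() calls, which is a stylistic choice of this sketch rather than a requirement of the library.

```csharp
using System;
using GroupDocs.Editor;

public static class DefaultEditSketch
{
    public static void Main()
    {
        // Hypothetical input path; replace with a real ePub file.
        string epubPath = "path/to/Alices Adventures in Wonderland.epub";

        // The using statements dispose the EditableDocument and then the Editor,
        // even if an exception is thrown in between.
        using (Editor editor = new Editor(epubPath))
        using (EditableDocument edited = editor.Edit()) // default EbookEditOptions are applied implicitly
        {
            string html = edited.GetEmbeddedHtml();
            Console.WriteLine("Generated HTML length: " + html.Length);
        }
    }
}
```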

Save e-Book files after editing

e-Books are saved in the same way as all other formats. Once the e-Book content has been edited by the client in the WYSIWYG editor and sent back to the server side, it should be passed to an EditableDocument, and that instance should then be passed to the GroupDocs.Editor.Editor.Save() method.

Unlike other format families, and unlike the single EbookEditOptions class that is common to all e-Book formats, there is no common save options class for e-Books. As of version 22.9, GroupDocs.Editor for .NET supports saving to AZW3 and ePub, and it provides a distinct save options class for each of these two formats: Azw3SaveOptions for AZW3 and EpubSaveOptions for ePub.

These classes have one property in common: SplitHeadingLevel, of the System.Int32 type. This property controls whether and how the content of the AZW3 or ePub e-Book is split into packages in the resultant file. It does not affect how the file looks when opened in an e-Book reader; rather, it concerns the internal structure of the e-Book file. If the internal structure of the ePub or AZW3 file does not matter to you, you may leave this property at its default value.

EpubSaveOptions also has a boolean ExportDocumentProperties property, which controls whether built-in and custom document properties are exported into the resultant IDPF ePub e-Book. If you do not plan to convert the resultant ePub to another format, you may leave it unchanged: the default value, false, disables the export of document properties, so the resultant document will be slightly smaller.

The code example below loads a single ePub file into an Editor instance, edits it with default options, and saves it to ePub and AZW3 with different options for each.

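// Requires: using GroupDocs.Editor; ("Options." below stands for the GroupDocs.Editor.Options namespace)
string epubPath = "path/to/Alices Adventures in Wonderland.epub";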
string epubOutputPath = "Output_ePub.epub";
string azw3OutputPath = "Output_AZW3.azw3";

GroupDocs.Editor.Editor editor = new Editor(epubPath);

//edit with default EbookEditOptions
EditableDocument edited = editor.Edit();

Options.EpubSaveOptions epubSaveOptions = new Options.EpubSaveOptions();
epubSaveOptions.ExportDocumentProperties = true;
epubSaveOptions.SplitHeadingLevel = 5;

Options.Azw3SaveOptions azw3SaveOptions = new Options.Azw3SaveOptions();
azw3SaveOptions.SplitHeadingLevel = 1;

editor.Save(edited, epubOutputPath, epubSaveOptions);
editor.Save(edited, azw3OutputPath, azw3SaveOptions);

// ...
// Don't forget to dispose Editor and EditableDocument when work is done
edited.Dispose();
editor.Dispose();
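
The example above saves the EditableDocument returned by Edit() directly. For the round trip described at the start of this section, where edited HTML comes back from the client, a rough sketch could look as follows. It assumes the static EditableDocument.FromMarkup(string, IEnumerable<IHtmlResource>) method of GroupDocs.Editor for .NET, passes null for the resources on the assumption that all resources are embedded in the markup, and simulates the client-side edit with a simple string replacement. The paths and the replaced text are hypothetical.

```csharp
using GroupDocs.Editor;
using GroupDocs.Editor.Options;

public static class SaveAfterClientEditSketch
{
    public static void Main()
    {
        string epubPath = "path/to/Alices Adventures in Wonderland.epub"; // hypothetical input
        string outputPath = "Output_edited.epub";                          // hypothetical output

        using (Editor editor = new Editor(epubPath))
        using (EditableDocument original = editor.Edit())
        {
            // HTML that would be sent to the WYSIWYG editor on the client.
            string html = original.GetEmbeddedHtml();

            // Stand-in for the client-side edit: in a real application
            // this string comes back from the browser.
            string editedHtml = html.Replace("Alice", "ALICE");

            // Wrap the returned markup in a new EditableDocument. Passing null as
            // resources assumes images, fonts and styles are embedded in the markup.
            using (EditableDocument afterEdit = EditableDocument.FromMarkup(editedHtml, null))
            {
                editor.Save(afterEdit, outputPath, new EpubSaveOptions());
            }
        }
    }
}
```

Creating a separate EditableDocument for the returned markup keeps the original instance untouched, so both can be disposed independently.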

Extracting metainfo from e-Book files

As with all supported formats, GroupDocs.Editor for .NET can detect document meta-information for all supported e-Book formats through the GetDocumentInfo() method of the Editor class. When a valid e-Book is loaded into the Editor instance, GetDocumentInfo() returns an instance of the Metadata.EbookDocumentInfo class, which implements the IDocumentInfo interface, which in turn defines four properties: Format, PageCount, Size, and IsEncrypted.

  • Format returns a Formats.EBookFormats struct, which for e-Books can have the value Mobi, Azw3, or Epub.
  • PageCount returns an approximate number of pages for MOBI or AZW3, or the number of chapters for ePub. For MOBI and AZW3 the number is approximate because these formats are internally a set of HTML documents (chapters) that are not divided into pages and have no strict page dimensions that would allow the content to be split into page blocks and the pages to be counted. The designers of the MOBI/AZW3 format made this decision intentionally, to allow a variable page size (and count) on different devices, from Full HD displays to smartphones. To return a page count for a MOBI/AZW3 document, GroupDocs.Editor therefore assumes a standard A4 page in portrait orientation, splits the existing document content into such pages, and counts them. The returned number should be treated as a rough estimate; users should not rely on it.
  • Size returns the size of the e-Book file in bytes.
  • IsEncrypted always returns false, because e-Books, unlike PDF or Office Open XML documents, cannot be encrypted with a password.
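
To illustrate why a page count for reflowable content depends on the assumed page size, here is a simplified sketch. It is not the algorithm GroupDocs.Editor uses internally; the EstimatePageCount helper, the content height, and the absence of margins are all assumptions made for illustration. Only the A4 height of 297 mm is a standard value.

```csharp
using System;

public static class PageEstimateSketch
{
    // Height of an A4 page in portrait orientation, in millimetres (ISO 216).
    public const double A4HeightMm = 297.0;

    // Hypothetical helper: number of pages needed to hold content of the given
    // height when it is cut into pages of the given height (margins ignored).
    public static int EstimatePageCount(double contentHeightMm, double pageHeightMm)
    {
        if (pageHeightMm <= 0)
            throw new ArgumentOutOfRangeException(nameof(pageHeightMm));
        if (contentHeightMm <= 0)
            return 0;
        return (int)Math.Ceiling(contentHeightMm / pageHeightMm);
    }

    public static void Main()
    {
        // The same content yields different counts for different page heights.
        Console.WriteLine(EstimatePageCount(1000.0, A4HeightMm)); // prints 4
        Console.WriteLine(EstimatePageCount(1000.0, 150.0));      // prints 7
    }
}
```

The same 1000 mm of flowed content occupies 4 pages at A4 height but 7 pages on a 150 mm page, which is why a single reported number can only be an estimate.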

The code example below loads three e-Books in different formats (MOBI, AZW3, and ePub) into three different instances of the Editor class, then extracts information about them and checks it with NUnit.

// Requires: using GroupDocs.Editor; using NUnit.Framework;
string mobiPath = "Ebook.mobi";
string azw3Path = "Ebook.azw3";
string epubPath = "Ebook.epub";

GroupDocs.Editor.Editor editorMobi = new Editor(mobiPath);
GroupDocs.Editor.Editor editorAzw3 = new Editor(azw3Path);
GroupDocs.Editor.Editor editorEpub = new Editor(epubPath);

GroupDocs.Editor.Metadata.IDocumentInfo mobiInfo = editorMobi.GetDocumentInfo(null);
GroupDocs.Editor.Metadata.IDocumentInfo azw3Info = editorAzw3.GetDocumentInfo(null);
GroupDocs.Editor.Metadata.IDocumentInfo epubInfo = editorEpub.GetDocumentInfo(null);

Assert.AreEqual(Formats.EBookFormats.Mobi, mobiInfo.Format);
Assert.AreEqual(Formats.EBookFormats.Azw3, azw3Info.Format);
Assert.AreEqual(Formats.EBookFormats.Epub, epubInfo.Format);

// ...
// Don't forget to dispose Editors when work is done
editorMobi.Dispose();
editorAzw3.Dispose();
editorEpub.Dispose();
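
The example above checks only the Format property. A short sketch of reading the remaining properties might look like this; it assumes a console application that references the GroupDocs.Editor package, a hypothetical file name, and that the argument of GetDocumentInfo is a password, which is why null is passed for e-Books.

```csharp
using System;
using GroupDocs.Editor;
using GroupDocs.Editor.Metadata;

public static class DocumentInfoSketch
{
    public static void Main()
    {
        // Hypothetical file name; replace with a real e-Book.
        using (Editor editor = new Editor("Ebook.azw3"))
        {
            // null is passed as in the example above (assumed to be the password argument).
            IDocumentInfo info = editor.GetDocumentInfo(null);

            Console.WriteLine("Is e-Book info: " + (info is EbookDocumentInfo));
            Console.WriteLine("Pages (approximate for MOBI/AZW3): " + info.PageCount);
            Console.WriteLine("Size in bytes: " + info.Size);
            Console.WriteLine("Encrypted: " + info.IsEncrypted);
        }
    }
}
```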