Extract hyperlinks from document page area
GroupDocs.Parser lets you extract hyperlinks from a rectangular area of a document page with the GetHyperlinks(PageAreaOptions) and GetHyperlinks(Int32, PageAreaOptions) methods:
IEnumerable<PageHyperlinkArea> GetHyperlinks(int pageIndex, PageAreaOptions options);
IEnumerable<PageHyperlinkArea> GetHyperlinks(PageAreaOptions options);
These methods return a collection of PageHyperlinkArea objects with the following members:
| Member | Description |
|---|---|
| Page | The page that contains the hyperlink area. |
| Rectangle | The rectangular area on the page that contains the hyperlink. |
| Text | The hyperlink text. |
| Url | The hyperlink URL. |
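As a minimal sketch of how these members might be read together, the snippet below prints the page index, position, and size of each hyperlink area alongside its text and URL. It assumes that Page exposes an Index property and that Rectangle exposes Left, Top, and Size (with Width and Height), none of which is spelled out in the table above. The file path sample.pdf and the class name are placeholders.

```csharp
using System;
using System.Collections.Generic;
using GroupDocs.Parser;
using GroupDocs.Parser.Data;
using GroupDocs.Parser.Options;

class HyperlinkAreaMembers
{
    static void Main()
    {
        // "sample.pdf" is a placeholder path, not a file shipped with the library
        using (Parser parser = new Parser("sample.pdf"))
        {
            // Stop early if the format does not support hyperlink extraction
            if (!parser.Features.Hyperlinks)
            {
                Console.WriteLine("Document doesn't support hyperlink extraction.");
                return;
            }

            // Same rectangle as in the main example of this article
            PageAreaOptions options = new PageAreaOptions(
                new Rectangle(new Point(380, 90), new Size(150, 50)));

            IEnumerable<PageHyperlinkArea> hyperlinks = parser.GetHyperlinks(options);

            foreach (PageHyperlinkArea h in hyperlinks)
            {
                // Assumption: Page.Index is the page index of the area
                Console.WriteLine($"Page {h.Page.Index}: \"{h.Text}\" -> {h.Url}");

                // Assumption: Rectangle exposes Left, Top and Size (Width, Height)
                Console.WriteLine(
                    $"  at ({h.Rectangle.Left}, {h.Rectangle.Top}), " +
                    $"size {h.Rectangle.Size.Width} x {h.Rectangle.Size.Height}");
            }
        }
    }
}
```

Printing the rectangle next to the URL can help when tuning the coordinates passed to PageAreaOptions, since it shows where each returned hyperlink actually sits on the page.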
Here are the steps to extract hyperlinks from the document page area:
- Instantiate a Parser object for the initial document;
- Check if the document supports hyperlink extraction;
- Instantiate PageAreaOptions with the rectangular area;
- Call the GetHyperlinks(PageAreaOptions) method and obtain a collection of PageHyperlinkArea objects;
- Iterate through the collection and read the text and URL of each hyperlink.
The following example shows how to extract hyperlinks from the document page area:
// Create an instance of Parser class
using (Parser parser = new Parser(filePath))
{
    // Check if the document supports hyperlink extraction
    if (!parser.Features.Hyperlinks)
    {
        Console.WriteLine("Document doesn't support hyperlink extraction.");
        return;
    }
    // Create the options which are used for hyperlink extraction
    PageAreaOptions options = new PageAreaOptions(new Rectangle(new Point(380, 90), new Size(150, 50)));
    // Extract hyperlinks from the document page area
    IEnumerable<PageHyperlinkArea> hyperlinks = parser.GetHyperlinks(options);
    // Iterate over hyperlinks
    foreach (PageHyperlinkArea h in hyperlinks)
    {
        // Print the hyperlink text
        Console.WriteLine(h.Text);
        // Print the hyperlink URL
        Console.WriteLine(h.Url);
        Console.WriteLine();
    }
}
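The GetHyperlinks(Int32, PageAreaOptions) overload listed earlier restricts the search to a single page. The sketch below applies the same rectangle to each page in turn. It assumes that GetDocumentInfo() returns an object with a PageCount property and that page indexes are zero-based. The file path sample.pdf and the class name are placeholders.

```csharp
using System;
using GroupDocs.Parser;
using GroupDocs.Parser.Data;
using GroupDocs.Parser.Options;

class HyperlinksPerPage
{
    static void Main()
    {
        // "sample.pdf" is a placeholder path
        using (Parser parser = new Parser("sample.pdf"))
        {
            // Check if the document supports hyperlink extraction
            if (!parser.Features.Hyperlinks)
            {
                Console.WriteLine("Document doesn't support hyperlink extraction.");
                return;
            }

            // Assumption: IDocumentInfo.PageCount reports the number of pages
            IDocumentInfo info = parser.GetDocumentInfo();

            // The same rectangular area is searched on every page
            PageAreaOptions options = new PageAreaOptions(
                new Rectangle(new Point(380, 90), new Size(150, 50)));

            // Assumption: page indexes start at 0
            for (int pageIndex = 0; pageIndex < info.PageCount; pageIndex++)
            {
                // Extract hyperlinks from the area on this page only
                foreach (PageHyperlinkArea h in parser.GetHyperlinks(pageIndex, options))
                {
                    Console.WriteLine($"Page {pageIndex}: {h.Text} ({h.Url})");
                }
            }
        }
    }
}
```

Calling the per-page overload can be useful when only certain pages are of interest, for example a header region that holds links on the first page only.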
More resources
GitHub examples
You can run the code above and see the feature in action in our GitHub examples.
Free online image extractor App
Along with the full-featured .NET library, we provide simple but powerful free apps.
You are welcome to extract images from PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails, and more with our free online GroupDocs Parser App.