The easiest way to remove metadata properties from a file is to use the corresponding tags that let you locate the desired properties across all metadata packages. But sometimes it’s necessary to remove metadata entries that have a particular value. With a Python predicate you can find and remove properties that satisfy a condition as complex as you need.
The following example demonstrates how to remove specific metadata properties using a combination of criteria.
1. Load a file to update.
2. Pass a predicate to the remove_properties method to find and remove the desired properties.
3. Check the number of properties that were actually removed (the return value of remove_properties).
4. Save the changes.
    from groupdocs.metadata import Metadata
    from groupdocs.metadata.common import MetadataPropertyType
    from groupdocs.metadata.tagging import Tags

    def remove_metadata_properties():
        # Remove all the properties satisfying the predicate:
        # the property carries the "author" tag, OR
        # the property carries the "last editor" tag, OR
        # the property is a string whose value equals "John"
        # (to wipe any mention of John from the detected metadata)
        with Metadata("input.docx") as metadata:
            affected = metadata.remove_properties(
                lambda p: Tags.person.creator in list(p.tags)
                or Tags.person.editor in list(p.tags)
                or (p.value.type == MetadataPropertyType.STRING and str(p.value) == "John")
            )
            print(f"Properties removed: {affected}")
            metadata.save("output.docx")

    if __name__ == "__main__":
        remove_metadata_properties()
input.docx is the sample file used in this example.
Running the code snippet above removes every property tagged as the document author or last editor, along with any other string metadata property whose value is “John”.
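When the condition grows beyond a single line, it can help to write the predicate as a named function and try it on sample data before running it against a real file. The sketch below mirrors the three criteria from the example but does not use GroupDocs.Metadata at all: `FakeValue`, `FakeProperty`, the tag strings, and the type strings are hypothetical stand-ins invented for illustration, and they do not reproduce the library's actual classes.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for the library's tag constants and value types.
CREATOR_TAG = "person.creator"
EDITOR_TAG = "person.editor"
STRING_TYPE = "string"
INTEGER_TYPE = "integer"


@dataclass
class FakeValue:
    """Illustrative value wrapper exposing a type and a string form."""
    type: str
    raw: object

    def __str__(self):
        return str(self.raw)


@dataclass
class FakeProperty:
    """Illustrative property with a name, a value, and a list of tags."""
    name: str
    value: FakeValue
    tags: list = field(default_factory=list)


def should_remove(prop, forbidden="John"):
    """Return True for author/editor properties or strings equal to `forbidden`."""
    tags = list(prop.tags)
    if CREATOR_TAG in tags or EDITOR_TAG in tags:
        return True
    return prop.value.type == STRING_TYPE and str(prop.value) == forbidden


sample = [
    FakeProperty("Author", FakeValue(STRING_TYPE, "Alice"), [CREATOR_TAG]),
    FakeProperty("LastModifiedBy", FakeValue(STRING_TYPE, "Bob"), [EDITOR_TAG]),
    FakeProperty("Manager", FakeValue(STRING_TYPE, "John")),
    FakeProperty("Title", FakeValue(STRING_TYPE, "Quarterly report")),
    FakeProperty("Revision", FakeValue(INTEGER_TYPE, 4)),
]

removed = [p.name for p in sample if should_remove(p)]
kept = [p.name for p in sample if not should_remove(p)]
print("Would remove:", removed)
print("Would keep:", kept)
```

With this sample data, the first three properties match the predicate and the last two are kept. A function shaped like `should_remove` could presumably be passed to remove_properties in place of the lambda once the stand-in constants are swapped for the real tags and property types, though that is an assumption to verify against the library.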
For more information on searching metadata, please refer to the following articles: