The current script at `MarkdownTools/extract-names-ner.py` runs NER with `spacy`, then dumps the named entities to a CSV. It would be more helpful to the user if the CSV export were an option rather than the only output.
By default, the script should print the named entities as found, or grouped by name with links to the specific files that contain them.
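
The output modes described above could be sketched roughly as follows. This is only an illustration, not the existing script: the flag names `--group` and `--export`, the helper names, and the `(name, path)` tuple shape are all assumptions made for the example.

```python
import argparse
import csv
import io
from collections import defaultdict


def build_parser():
    """Hypothetical CLI: plain listing by default, --group or --export on request."""
    parser = argparse.ArgumentParser(
        description="Report named entities found in Markdown files."
    )
    parser.add_argument("paths", nargs="*", default=["."],
                        help="files or directories to scan")
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--group", action="store_true",
                      help="group each name with links to the files that mention it")
    mode.add_argument("--export", metavar="CSV_PATH",
                      help="write (name, file) rows to a CSV file")
    return parser


def format_report(hits, group=False):
    """Render hits, a list of (name, path) tuples, as text."""
    if not group:
        # Default mode: each name once, in the order it was first found.
        return "\n".join(dict.fromkeys(name for name, _ in hits))
    by_name = defaultdict(set)
    for name, path in hits:
        by_name[name].add(path)
    lines = []
    for name in sorted(by_name):
        # Markdown-style links back to the specific files.
        links = ", ".join(f"[{p}]({p})" for p in sorted(by_name[name]))
        lines.append(f"{name}: {links}")
    return "\n".join(lines)


def export_csv_text(hits):
    """Return the CSV export as a string; the caller decides where to write it."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(["name", "file"])
    writer.writerows(hits)
    return buffer.getvalue()
```

With flags like these, an invocation such as `python extract-names-ner.py --group notes/` would print one line per name, and omitting both flags would fall back to the plain listing.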
Use cases:
- User wants to see how commonly a specific name is used. They run the script with the default (vanilla) option to check whether a name exists, with a grouping option to see which files a name appears in, or with an export option to export a specific name's matches to CSV.
- User can ask whether a specific name exists, via any part of that name, and get back the full name plus all the files where it appears. An alternative would be `git grep -i "{NAME}" *.md`, which is limited. The script would scan all files likely to contain names, parse them into an index, then report partial matches.
- The reporting variations support a workflow that moves from "Does this name exist?" or "What names exist?", to "Where does a specific name appear?", to "Export findings."
- Apply the script to a podcast transcript to find "Mentions", then extract a list for writing show notes. The same approach could be used in SEO to contextualize mentions.
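
The "scan, index, then report partial matches" idea from the use cases above might look something like this. It is a minimal sketch under stated assumptions: the function names are hypothetical, the extractor is passed in as a callable so the index logic does not depend on any one NER backend, and the spaCy helper assumes a pipeline that has already been loaded (for example with `spacy.load("en_core_web_sm")`, where the model name is itself an assumption).

```python
from collections import defaultdict


def spacy_person_names(text, nlp):
    """Yield PERSON entity texts from an already-loaded spaCy pipeline `nlp`."""
    for ent in nlp(text).ents:
        if ent.label_ == "PERSON":
            yield ent.text


def build_index(documents, extract):
    """Build a name -> set of paths index.

    documents: mapping of file path -> file text
    extract:   callable taking text and returning an iterable of names
    """
    index = defaultdict(set)
    for path, text in documents.items():
        for name in extract(text):
            index[name.strip()].add(path)
    return index


def find_name(index, query):
    """Return every indexed name containing `query` (case-insensitive),
    mapped to the sorted list of files where that name was found."""
    needle = query.casefold()
    return {
        name: sorted(paths)
        for name, paths in index.items()
        if needle in name.casefold()
    }
```

In use, the extractor could be wired up as `build_index(docs, lambda text: spacy_person_names(text, nlp))`, and a query on any part of a name, such as a surname alone, would return the full names that contain it together with their files. Unlike `git grep`, the match runs against extracted entity names rather than raw lines.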