Improve usability #2

Open
cthoyt wants to merge 1 commit into ArshSekhon:main from cthoyt:add-easy-parsing

cthoyt commented Aug 3, 2021

This PR does two things:

  1. It adds three utility functions that make it easier to load corpus lists from arbitrary data structures (instead of relying on the internal state of a file path):
     • pubtator_loader.from_lines for an arbitrary iterable of strings
     • pubtator_loader.from_path for any path or path-like object
     • pubtator_loader.from_gz for a path or path-like object pointing to a file that needs to be gunzipped

     It also updates the example in the README to reflect the new usage.

  2. It updates the imports and type annotations for spacy in pubtator_document.py so users can use this code even if spacy isn't installed.
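To illustrate how the three loaders in the first item can be layered, here is a minimal standalone sketch. It is not the code from this PR: the function names mirror the ones listed above, but the signatures, the dict-based return shape, and the parsing details are assumptions made for illustration only. The idea is that `from_lines` does all the parsing, while `from_path` and `from_gz` only open a file and delegate.

```python
import gzip
from pathlib import Path
from typing import Dict, Iterable, List, Union

PathLike = Union[str, Path]


def from_lines(lines: Iterable[str]) -> List[Dict]:
    """Parse PubTator-formatted lines into plain dicts (hypothetical return shape)."""
    documents: List[Dict] = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the document currently being built.
            if current is not None:
                documents.append(current)
                current = None
            continue
        if current is None:
            current = {"id": None, "title": "", "abstract": "", "entities": []}
        parts = line.split("|", 2)
        if len(parts) == 3 and parts[0].isdigit() and parts[1] in ("t", "a"):
            # Title and abstract lines look like "PMID|t|text" and "PMID|a|text".
            current["id"] = parts[0]
            current["title" if parts[1] == "t" else "abstract"] = parts[2]
        else:
            # Annotation lines are tab-separated: PMID, start, end, text, type, id.
            fields = line.split("\t")
            if len(fields) >= 5:
                current["entities"].append({
                    "start": int(fields[1]),
                    "end": int(fields[2]),
                    "text": fields[3],
                    "type": fields[4],
                    "id": fields[5] if len(fields) > 5 else None,
                })
    if current is not None:
        # The last document may not be followed by a blank line.
        documents.append(current)
    return documents


def from_path(path: PathLike) -> List[Dict]:
    """Open a plain-text file and delegate to from_lines."""
    with open(path, encoding="utf-8") as file:
        return from_lines(file)


def from_gz(path: PathLike) -> List[Dict]:
    """Open a gzipped file in text mode and delegate to from_lines."""
    with gzip.open(path, mode="rt", encoding="utf-8") as file:
        return from_lines(file)
```

Because a file object is itself an iterable of strings, the two path-based helpers stay a few lines long, and any other source (a list, a generator, an HTTP response split into lines) can go straight into `from_lines`.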

If you're able to merge this, it would be great to make a new release too so I could clean up the dependencies in my code. Thanks for making a nice library for us!
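For reference, one common way to make a dependency such as spacy optional, as described in the second item, is to import it only for type checking and defer the runtime import to the code path that needs it. The sketch below is an assumption about the general pattern, not the actual change to pubtator_document.py; the `Document` class and its `to_spacy` method are hypothetical names.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated only by static type checkers, never at runtime,
    # so importing this module does not require spacy.
    from spacy.tokens import Doc


class Document:
    """Hypothetical stand-in for a document class with an optional spacy feature."""

    def __init__(self, text: str) -> None:
        self.text = text

    def to_spacy(self) -> "Doc":
        """Tokenize the text with a blank English pipeline; needs spacy installed."""
        try:
            import spacy  # deferred import: only needed when this method is called
        except ImportError as exc:
            raise ImportError(
                "spacy is required for to_spacy(); install it with `pip install spacy`"
            ) from exc
        nlp = spacy.blank("en")
        return nlp(self.text)
```

With this layout, users who never call `to_spacy` can import and use `Document` without spacy, and those who do call it without spacy get an explicit error rather than a failure at import time.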
