An internal client library to access the new Mediacloud news archive search.

mediacloud/mediacloud-news-client

 
 

Mediacloud News Archive Client

🚧 under construction 🚧

A simple client library to access the Wayback Machine news archive search.

Installation

NB: installation from PyPI is still TBD: pip install mediacloud-news-client

Basic Usage

Counting matching stories:

from mcnews.searchapi import SearchApiClient
import datetime as dt

api = SearchApiClient("mediacloud_search_text_*")
api.count("coronavirus", dt.datetime(2023, 11, 1), dt.datetime(2023, 12, 1))
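
To get counts per month rather than one total, the date range can be split into windows and each window passed to count. The following is a minimal sketch, assuming count accepts any start/end datetime pair as in the example above. month_windows is a hypothetical helper, not part of mcnews. Whether the API treats the end date as inclusive is not stated here, so adjacent windows may need adjusting.

```python
import datetime as dt
from typing import Iterator, Tuple


def month_windows(
    start: dt.datetime, end: dt.datetime
) -> Iterator[Tuple[dt.datetime, dt.datetime]]:
    """Yield (window_start, window_end) pairs covering start..end,
    split at calendar-month boundaries.

    Hypothetical helper for illustration; not part of mcnews.
    """
    cursor = start
    while cursor < end:
        # First instant of the month following `cursor`
        # (tzinfo is carried over so aware datetimes still compare cleanly).
        if cursor.month == 12:
            next_month = dt.datetime(cursor.year + 1, 1, 1, tzinfo=cursor.tzinfo)
        else:
            next_month = dt.datetime(cursor.year, cursor.month + 1, 1, tzinfo=cursor.tzinfo)
        window_end = min(next_month, end)
        yield cursor, window_end
        cursor = window_end


# Possible usage with the client shown above (not executed here):
# for window_start, window_end in month_windows(dt.datetime(2023, 10, 15), dt.datetime(2023, 12, 1)):
#     print(window_start.date(), api.count("coronavirus", window_start, window_end))
```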

Paging over all matching results:

from mcnews.searchapi import SearchApiClient
import datetime as dt

api = SearchApiClient("mediacloud_search_text_*")
for page in api.all_articles("coronavirus", dt.datetime(2023, 11, 1), dt.datetime(2023, 12, 1)):
    do_something(page)
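
When it is more convenient to handle one article at a time, the pages can be flattened into a single stream. This is a minimal sketch, assuming each page yielded by all_articles is an iterable (such as a list) of article records; that shape is an assumption here, and iter_articles is a hypothetical helper, not part of mcnews. The "id" key in the stand-in data is made up for illustration.

```python
from itertools import chain
from typing import Any, Iterable, Iterator


def iter_articles(pages: Iterable[Iterable[Any]]) -> Iterator[Any]:
    """Lazily flatten an iterable of pages into one stream of articles.

    Hypothetical helper for illustration; not part of mcnews.
    """
    return chain.from_iterable(pages)


# Stand-in pages for illustration. With the client this would be
# api.all_articles("coronavirus", start, end) instead of fake_pages.
fake_pages = [[{"id": 1}, {"id": 2}], [], [{"id": 3}]]
for article in iter_articles(fake_pages):
    print(article["id"])
```

Because chain.from_iterable is lazy, pages are only requested from the underlying generator as the loop reaches them.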

Dev Installation

Install the dependencies for dev: pip install -e .[dev]

Distribution

  1. Run pytest to make sure all the tests pass
  2. Update the version number in mcnews/__init__.py
  3. Make a brief note in the version history section below about the changes
  4. Commit the changes
  5. Tag the commit with a semantic version number - 'v*.*.*'
  6. Push the repo to GitHub
  7. Run python setup.py sdist to create an installation package
  8. Run twine upload --repository-url https://test.pypi.org/legacy/ dist/* to upload it to PyPI's test platform
  9. Run twine upload dist/* to upload it to PyPI
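
Steps 2 and 5 above are easy to get out of sync. A small check like the following can confirm that a tag has the 'vMAJOR.MINOR.PATCH' shape and matches the package version before uploading. This is only a sketch: tag_matches_version is a hypothetical helper, and it assumes the version in mcnews/__init__.py is a plain "X.Y.Z" string, which is not confirmed here.

```python
import re

# 'v' followed by three dot-separated integers, e.g. 'v2.0.0'
SEMVER_TAG = re.compile(r"v\d+\.\d+\.\d+")


def tag_matches_version(tag: str, version: str) -> bool:
    """Return True if `tag` looks like 'vX.Y.Z' and equals 'v' + version.

    Hypothetical helper for illustration; not part of mcnews.
    """
    return SEMVER_TAG.fullmatch(tag) is not None and tag[1:] == version


print(tag_matches_version("v2.0.0", "2.0.0"))
print(tag_matches_version("v2.0.1", "2.0.0"))
```

The first call prints True and the second prints False, since the tag and version differ in the patch number.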

Version History

  • v2.0.0 - Fresh start as mediacloud-news-client
  • v1.2.1 - fix paging bug triggered by no results
  • v1.2.0 - add support for new expanded results, and more integration testing
  • v1.1.0 - add new paged_articles method to allow paging over all results
  • v1.0.3 - add 30 sec timeout, remove extra params mcproviders library might be adding
  • v1.0.2 - fix to article endpoint
  • v1.0.1 - automatically escape '/' in query strings, test case for url field search
  • v1.0.0 - update to public API endpoint
  • v0.1.5 - simpler return for top terms
  • v0.1.4 - better error handling
  • v0.1.3 - allow overriding base api URL
  • v0.1.2 - fix article endpoint, test case for fetching content (snippet) via article_url property
  • v0.1.1 - more consistent method names
  • v0.1.0 - initial test-only release
