Using business-level retrieval system (BM25) with Python in just a few lines.

kwang2049/easy-elasticsearch

Easy Elasticsearch

This repository provides a high-level wrapper for using Elasticsearch from Python in just a few lines.

Installation

Via pip:

pip install easy-elasticsearch

Via git repo:

git clone https://github.com/kwang2049/easy-elasticsearch
cd easy-elasticsearch
pip install -e .

Usage

There are three ways to connect to an Elasticsearch (ES) service:

  • (1) Start an ES service manually, then pass its host and port_http (please refer to download_and_run.sh);
  • (2) Leave host=None (the default) to have the library start a Docker container itself;
  • (3) Leave host=None and set service_type=executable to download an ES executable and start it in the background.
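The three options above can be summarized as keyword arguments. The following is only a sketch: the helper connection_kwargs is hypothetical and not part of easy-elasticsearch, the names host, port_http and service_type are taken from the list above, and 'localhost' with port '9200' (the standard ES HTTP port) are assumed example values.

```python
def connection_kwargs(way, host='localhost', port_http='9200'):
    """Hypothetical helper: map option (1), (2) or (3) to keyword arguments."""
    if way == 1:
        # (1) An ES service is already running: point at it explicitly.
        return {'host': host, 'port_http': port_http}
    if way == 2:
        # (2) No host given: a Docker container is started by default.
        return {'host': None}
    if way == 3:
        # (3) No host given, and an ES executable is downloaded instead.
        return {'host': None, 'service_type': 'executable'}
    raise ValueError('way must be 1, 2 or 3')


kwargs_manual = connection_kwargs(1)
kwargs_docker = connection_kwargs(2)
kwargs_executable = connection_kwargs(3)
```

The resulting dictionary would then presumably be passed on as extra keyword arguments when constructing the retriever.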

Then call its query function to retrieve the top-ranked documents, or its score function to compute BM25 scores for specific documents.

from easy_elasticsearch import ElasticSearchBM25

pool = {
    'id1': 'What is Python? Is it a programming language',
    'id2': 'Which Python version is the best?',
    'id3': 'Using easy-elasticsearch in Python is really convenient!'
}
# With the defaults (`host=None` and `mode="docker"`), an ES Docker container is started on localhost.
bm25 = ElasticSearchBM25(pool, port_http='9222', port_tcp='9333')

query = "What is Python?"
rank = bm25.query(query, topk=10)  # topk should be <= 10000
scores = bm25.score(query, document_ids=['id2', 'id3'])

print(query, rank, scores)
bm25.delete_index()  # delete the one-trial index named 'one_trial'
bm25.delete_container()  # remove the Docker container
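To give a feel for what the scores in the example above represent, here is a minimal pure-Python sketch of the textbook Okapi BM25 formula. It is not the code Elasticsearch or this package runs: the tokenizer is a naive assumption, the function names tokenize and bm25_scores are made up for illustration, and the numbers will generally differ from what ES returns. The values k1=1.2 and b=0.75 are the commonly used defaults.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase, split on non-alphanumerics.
    cleaned = ''.join(c.lower() if c.isalnum() else ' ' for c in text)
    return cleaned.split()


def bm25_scores(query, pool, k1=1.2, b=0.75):
    """Score every document in `pool` against `query` with Okapi BM25."""
    docs = {doc_id: tokenize(text) for doc_id, text in pool.items()}
    n_docs = len(docs)
    avg_len = sum(len(tokens) for tokens in docs.values()) / n_docs

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    scores = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Non-negative IDF variant: log(1 + (N - df + 0.5) / (df + 0.5)).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = score
    return scores


pool = {
    'id1': 'What is Python? Is it a programming language',
    'id2': 'Which Python version is the best?',
    'id3': 'Using easy-elasticsearch in Python is really convenient!'
}
scores = bm25_scores('What is Python?', pool)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, scores)
```

In this sketch, id1 ranks first because it is the only document containing "what" and it mentions "is" twice; id2 outranks id3 only because it is shorter.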

Another example for retrieving Quora questions can be found in easy_elasticsearch/examples/quora.py:

python -m easy_elasticsearch.examples.quora --mode docker

or

python -m easy_elasticsearch.examples.quora --mode executable
