Implementation of "BERT for Joint Intent Classification and Slot Filling" in TensorFlow

Intent-slot detection is a primary model used in artificial assistants

This is an implementation of "BERT for Joint Intent Classification and Slot Filling" (arXiv:1902.10909v1 [cs.CL], 28 Feb 2019): https://arxiv.org/pdf/1902.10909.pdf

This work targets a GCP installation, and a TPU was used for training. It can be run on AWS or Azure, but some migration work will be needed. The model was implemented in the context of the financial industry, but it can be used for any domain.

Training is joint: BERT is fine-tuned with a loss calculated as the total loss from intent and slots. The objective function is calculated as in equation (3) in the article.
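For reference, a transcription of that objective as I read it from the paper (my LaTeX rendering, not taken from this repository): the joint conditional probability of the intent label y^i and the slot labels y^s_1..y^s_N given the input sub-token sequence x is maximized, which here corresponds to minimizing the sum of the intent and slot cross-entropy losses.

    p\bigl(y^{i}, y^{s} \mid \boldsymbol{x}\bigr)
      = p\bigl(y^{i} \mid \boldsymbol{x}\bigr)\,
        \prod_{n=1}^{N} p\bigl(y^{s}_{n} \mid \boldsymbol{x}\bigr)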

Please make yourself familiar with "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" - arXiv:1810.04805v2 [cs.CL], 24 May 2019.

BERT is extended to train a model jointly for intent and slots. In BERT, the first output token ([CLS]) is used for classification, while the remaining output tokens can be used for sequence problems such as translation or speech recognition, where many input tokens are mapped to output tokens. So the first token is used here for intent classification and the other tokens for slot labels. The challenge is that BERT uses sub-tokens, so slot labels are aligned to sub-tokens as described in the article.
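A minimal sketch of how the two heads and the total loss can sit on top of the BERT encoder outputs (assumed function and variable names, TF 1.x style; this is a sketch, not the actual code in run_intent_slot.py). Here pooled_output is the representation of the first ([CLS]) token and sequence_output holds one vector per sub-token, as produced by bert/modeling.py:

    # Sketch only: joint intent/slot heads on top of BERT encoder outputs.
    import tensorflow as tf  # TF 1.15, as used by this repository

    def joint_logits(sequence_output, pooled_output, num_intents, num_slot_labels):
        # Intent head: classify the whole utterance from the [CLS] token.
        intent_logits = tf.layers.dense(pooled_output, num_intents, name="intent_classifier")
        # Slot head: classify every sub-token position independently.
        slot_logits = tf.layers.dense(sequence_output, num_slot_labels, name="slot_classifier")
        return intent_logits, slot_logits

    def joint_loss(intent_logits, slot_logits, intent_labels, slot_labels):
        # Total loss = intent cross-entropy + per-token slot cross-entropy.
        intent_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=intent_labels, logits=intent_logits))
        slot_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=slot_labels, logits=slot_logits))
        return intent_loss + slot_loss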

Data used in this work:

  • SNIPS Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces - arXiv:1805.10190v3 [cs.CL] 6 Dec 2018
  • An Evaluation Dataset for Intent Classification and Out-of-Scope Prediction - arXiv:1909.02027v1 [cs.CL] 4 Sep 2019

The SNIPS dataset contains labels for both intents and slots. The "Out-of-Scope Prediction" dataset has data related to the financial domain but without slot labels, so its slots were labeled manually and merged with SNIPS. This work is intended for a financial domain, which is why "Out-of-Scope Prediction" was used: it has a variety of financial intents.

Contents of the repository:

  • run_intent_slot.py script in the bert folder is used the same way as BERT's run_classifier.py. This script contains the actual TensorFlow implementation of this joint model

  • data folder: train.tsv, dev.tsv, test.tsv. Each file contains a sentence, an intent label, and slot labels separated by tabs; sentence tokens and slot label tokens are separated by spaces (a parsing sketch follows this list).
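A minimal parsing sketch for this layout (assuming no header row; this is not the repository's actual loader):

    # Sketch only: parse one line of train.tsv/dev.tsv/test.tsv assuming the
    # layout described above (sentence <TAB> intent_label <TAB> slot labels,
    # with tokens and slot labels split on spaces).
    def parse_example(line):
        sentence, intent_label, slot_labels = line.rstrip("\n").split("\t")
        tokens = sentence.split(" ")
        slots = slot_labels.split(" ")
        assert len(tokens) == len(slots), "one slot label per sentence token"
        return tokens, intent_label, slots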

Intents and slot labels came from SNIPS for:

  • AddToPlaylist
  • BookRestaurant
  • GetWeather
  • PlayMusic
  • RateBook
  • SearchCreativeWork
  • SearchScreeningEvent

The intents above are not from the financial domain; they are included just to have more intents to work with, and they can also be considered out-of-scope intents. Remove them once you have plenty of your own domain intents.

Slots are manually labeled for:

  • bill_due

  • report_fraud

  • transfer

  • train folder has three scripts: INTENT_SLOT.deployment, INTENT_SLOT.evaluate, INTENT_SLOT.predict:

    • evaluate is used for both training and evaluation using train.tsv and dev.tsv data files
    • predict uses test.tsv
    • deployment creates the saved_model.pb file and variables, which are used to deploy the model so it can serve online predictions
  • assistant/deploy/deploy_intent_slot.sh deploys the SavedModel (saved_model.pb with variables) to GCP AI Platform

  • assistant/functions/intent_slot contains a function that fronts the model and provides a better interface for the consuming application (a sketch of this call path is shown after this list)

  • assistant/functions/deploy_function.sh deploys the function to Google Cloud

  • assistant/appengine/intent_slot contains a sample application that invokes the function to run intent_slot prediction
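A minimal sketch of what such a fronting function roughly does, using the standard AI Platform online prediction client; the model name, version, and instance fields are assumptions and depend on the serving signature exported by INTENT_SLOT.deployment:

    # Hedged sketch, not the repository's actual function: call AI Platform
    # online prediction for the deployed intent_slot model. The instance
    # fields are assumptions; check the exported SavedModel's serving signature.
    import googleapiclient.discovery

    def predict_intent_slot(project, features, model="intent_slot", version=None):
        service = googleapiclient.discovery.build("ml", "v1")
        name = "projects/{}/models/{}".format(project, model)
        if version:
            name += "/versions/{}".format(version)
        response = service.projects().predict(
            name=name,
            body={"instances": [features]},  # e.g. input_ids / input_mask / segment_ids
        ).execute()
        if "error" in response:
            raise RuntimeError(response["error"])
        return response["predictions"]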

Steps to train, deploy to GCP facilities, and run the test web page:

  1. Create a GCP project and a simple VM instance (standard 2-4 CPU, Debian 9 Deep Learning image, TF 1.15). Create a TPU instance, TF 1.15, preemptible (but keep it off until you start training!)
  2. Create the AI Platform model (the actual model version will be deployed later): gcloud beta ai-platform models create --enable-console-logging --enable-logging --description "intent slots" intent_slot. This can be done from the GCP console as well
  3. Get BERT; it should be in the bert folder. Put run_intent_slot.py into this folder as well
  4. Get a GCP storage bucket, create the folder test/pretrained/uncased_L-12_H-768_A-12/ and put the unzipped content of the pretrained BERT uncased_L-12_H-768_A-12 model there. Use gsutil cp
  5. Create a service account key (see https://cloud.google.com/docs/authentication/getting-started). The key is stored in a JSON file; move it to the assistant/functions/intent_slot folder. The function uses this key to authenticate to AI Platform
  6. Set the variables: GCP_PROJECT_ID to your GCP project id, TPU_NAME to the name of your TPU instance, INTENT_SLOT_FOLDER to the root folder of this software, and GOOGLE_APPLICATION_CREDENTIALS to the name of the key file created in Step 5 (the file name, not the path)
  7. Start the TPU. Go to the train folder and run: sh INTENT_SLOT.evaluate. Shut down the TPU afterwards; it is not needed any more. The output will look like the log below, with accuracy similar to what is reported in the article:
     I0421 18:07:35.874321 140401829115648 estimator.py:2039] Saving dict for global step 1255: global_step = 1255, intent_accuracy = 0.9894737, intent_loss = 0.062348537, loss = 3.2194374, slot_accuracy = 0.96710986, slot_lm_loss = 0.19705378
     I0421 18:07:38.343356 140401829115648 estimator.py:2099] Saving 'checkpoint_path' summary for global step 1255: gs://__PROJECT_ID__.appspot.com/test/intent_slot_output/uncased_L-12_H-768_A-12/model.ckpt-1255
     I0421 18:07:38.692619 140401829115648 error_handling.py:96] evaluation_loop marked as finished
     I0421 18:07:38.692969 140401829115648 run_intent_slot.py:988] ***** Eval results *****
     I0421 18:07:38.693127 140401829115648 run_intent_slot.py:990] global_step = 1255
     I0421 18:07:38.693519 140401829115648 run_intent_slot.py:990] intent_accuracy = 0.9894737
     I0421 18:07:38.693676 140401829115648 run_intent_slot.py:990] intent_loss = 0.062348537
     I0421 18:07:38.693834 140401829115648 run_intent_slot.py:990] loss = 3.2194374
     I0421 18:07:38.693990 140401829115648 run_intent_slot.py:990] slot_accuracy = 0.96710986
     I0421 18:07:38.694137 140401829115648 run_intent_slot.py:990] slot_lm_loss = 0.19705378
  8. (Optional) From the train folder, run: sh INTENT_SLOT.predict. This predicts intents and slots for the test.tsv file. Output will be in the GCP storage folder test/intent_slot_output/uncased_L-12_H-768_A-12/predict
  9. From the train folder, run: sh INTENT_SLOT.deployment. The output is a model in SavedModel format (saved_model.pb plus a variables folder), ready to be deployed to GCP AI Platform for online predictions. The output folder name is a number; it will be used in Step 10:
     I0421 18:59:58.103994 139887238223616 builder_impl.py:421] SavedModel written to: gs://__PROJECT_ID__.appspot.com/test/intent_slot_output/uncased_L-12_H-768_A-12/deployment/temp-b'1587495558'/saved_model.pb
  10. Go to assistant/deploy/ and run: sh deploy_intent_slot.sh 1587495558. Again, the number parameter is taken from the previous step. This deploys your model to GCP AI Platform.
  11. Go to assistant/functions/intent_slot and run: sh ../deploy_function.sh intent_slot. The output will contain an httpsTrigger URL; it will be used in Step 12. Allow unauthenticated invocations ("Allow unauthenticated") for this function from the GCP console; see the "Managing Access via IAM" section to do it in a few steps. For a production deployment, secure this!
  12. Go to assistant/appengine/intent_slot. Edit app.yaml and update INTENT_SLOT_URL with the httpsTrigger URL from the previous step. The application has to be enabled on GCP App Engine (it is enabled in the Settings section). After this, run: gcloud beta app deploy. To see application logging, use: gcloud app logs tail -s default
  13. Use the target URL to access the deployed application. Submit the form with a sentence like the following to see intents and slots: send $10.00 from savings to checking. A hedged client-side sketch of this request is shown after this list.
  14. After all is done:
  • Disable the application in App Engine
  • Delete the model in AI Platform, or delete the model version; this is what is billed
  • Delete the function
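As a quick client-side check of the deployed function (a hedged sketch: the URL placeholder and the JSON payload field are assumptions; check the actual interface implemented in assistant/functions/intent_slot):

    # Hedged sketch: POST the sample sentence to the deployed Cloud Function.
    # The payload shape ({"sentence": ...}) is an assumption; adjust it to the
    # interface actually implemented in assistant/functions/intent_slot.
    import requests

    FUNCTION_URL = "https://REGION-PROJECT_ID.cloudfunctions.net/intent_slot"  # httpsTrigger from Step 11

    response = requests.post(FUNCTION_URL, json={"sentence": "send $10.00 from savings to checking"})
    response.raise_for_status()
    print(response.json())  # expected to contain the predicted intent and slot labels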
