Is there a tutorial on the DIIN model? #822

Open · ouyaya opened this issue Mar 21, 2020 · 13 comments

ouyaya commented Mar 21, 2020

When I built the model with the hyperparameters given for it, the following error occurred:

ValueError: Layer weight shape (10000, 300) not compatible with provided weight shape (33905, 300)

So I changed the hyperparameter 'embedding_input_dim' to 33905, but then another error appeared:

tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[1200,0] = 184 is not in [0, 100)
[[{{node time_distributed_1_1/embedding_1/embedding_lookup}}]]

uduse (Member) commented Mar 21, 2020

#821 Does this help?

ouyaya (Author) commented Mar 21, 2020

I used those hyperparameters, but found that this one raises an error: model.params['embedding_input_dim'] = 10000
The error message says this hyperparameter should be set to 33905. However, after I changed it to 33905, another error occurred, so I wonder whether the code in some other part of the call is wrong. Do you have tutorial code that calls this model correctly?
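My guess is that the value should come from the fitted preprocessor rather than being hard-coded. A minimal sketch of what I mean, assuming MatchZoo 2.x's context layout:

# Sketch: size the word embedding from the fitted preprocessor's vocabulary
# instead of hard-coding 10000 (assumes MatchZoo 2.x context layout).
term_index = diin_preprocessor.context['vocab_unit'].state['term_index']
model.params['embedding_input_dim'] = len(term_index)  # 33905 in my case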

uduse (Member) commented Mar 21, 2020

@ouyaya Did you use the correct preprocessor?

ouyaya (Author) commented Mar 21, 2020

Yes, I used the preprocessing method in diin_preprocessor.py

uduse (Member) commented Mar 21, 2020

I need more information. Can you provide a Minimal, Reproducible Example?

ouyaya (Author) commented Mar 21, 2020

# -*- coding: UTF-8 -*-
import keras
import pandas as pd
import numpy as np
import matchzoo as mz
import json
print('matchzoo version', mz.__version__)
print()

print('data loading ...')
train_pack_raw = mz.datasets.wiki_qa.load_data('train', task='ranking')
dev_pack_raw = mz.datasets.wiki_qa.load_data('dev', task='ranking', filtered=True)
test_pack_raw = mz.datasets.wiki_qa.load_data('test', task='ranking', filtered=True)
print('data loaded as `train_pack_raw` `dev_pack_raw` `test_pack_raw`')

ranking_task = mz.tasks.Ranking(loss=mz.losses.RankHingeLoss())
ranking_task.metrics = [
    mz.metrics.NormalizedDiscountedCumulativeGain(k=3),
    mz.metrics.NormalizedDiscountedCumulativeGain(k=5),
    mz.metrics.MeanAveragePrecision()
]
print("`ranking_task` initialized with metrics", ranking_task.metrics)

print("loading embedding ...")
glove_embedding = mz.datasets.embeddings.load_glove_embedding(dimension=300)
print("embedding loaded as `glove_embedding`")


diin_preprocessor = mz.preprocessors.DIINPreprocessor(
    fixed_length_left=32,
    fixed_length_right=32,
    fixed_length_word=16
)
diin_preprocessor = diin_preprocessor.fit(train_pack_raw, verbose=0)
train_pack_processed = diin_preprocessor.transform(train_pack_raw, verbose=0)
dev_pack_processed = diin_preprocessor.transform(dev_pack_raw, verbose=0)
test_pack_processed = diin_preprocessor.transform(test_pack_raw, verbose=0)

model = mz.contrib.models.DIIN()
model.guess_and_fill_missing_params()
model.params['embedding_input_dim'] = 10000
model.params['embedding_output_dim'] = 300
model.params['embedding_trainable'] = True
model.params['optimizer'] = 'adam'
model.params['dropout_initial_keep_rate'] = 1.0
model.params['dropout_decay_interval'] = 10000
model.params['dropout_decay_rate'] = 0.977
model.params['char_embedding_input_dim'] = 100
model.params['char_embedding_output_dim'] = 8
model.params['char_conv_filters'] = 100
model.params['char_conv_kernel_size'] = 5
model.params['first_scale_down_ratio'] = 0.3
model.params['nb_dense_blocks'] = 3
model.params['layers_per_dense_block'] = 8
model.params['growth_rate'] = 20
model.params['transition_scale_down_ratio'] = 0.5
model.build()
model.compile()
model.backend.summary()

embedding_matrix = glove_embedding.build_matrix(
    diin_preprocessor.context['vocab_unit'].state['term_index']
)
model.load_embedding_matrix(embedding_matrix)  # this line raises the ValueError below

pred_x, pred_y = test_pack_processed[:].unpack()
evaluate = mz.callbacks.EvaluateAllMetrics(model, x=pred_x, y=pred_y, batch_size=len(pred_y))

train_generator = mz.DataGenerator(
    train_pack_processed,
    mode='pair',
    num_dup=2,
    num_neg=1,
    batch_size=20
)
print('num batches:', len(train_generator))

history = model.fit_generator(
    train_generator,
    epochs=30,
    callbacks=[evaluate],
    workers=0,
    use_multiprocessing=True
)

(full script attached as matchzoo_diin.txt)
When I run this calling code, I get the following error:

Traceback (most recent call last):
  File "E:/Paper/keras_wiki/matchzoo_diin.py", line 60, in <module>
    model.load_embedding_matrix(embedding_matrix)
  File "F:\Python\lib\site-packages\matchzoo\engine\base_model.py", line 469, in load_embedding_matrix
    self.get_embedding_layer(name).set_weights([embedding_matrix])
  File "F:\Python\lib\site-packages\keras\engine\base_layer.py", line 1126, in set_weights
    'provided weight shape ' + str(w.shape))
ValueError: Layer weight shape (10000, 300) not compatible with provided weight shape (33905, 300)

uduse (Member) commented Mar 22, 2020

In addition to the other change you made, also set the embedding input dimension for char embedding:

char_input_dim = len(diin_preprocessor.context['char_unit'].state['term_index'])
model.params['char_embedding_input_dim'] = char_input_dim
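Putting both fixes together, set the two input dimensions from the fitted preprocessor before building the model. A sketch based on your script above (both keys exist in the preprocessor's context after fit):

# Sketch: derive both embedding vocabulary sizes from the fitted preprocessor;
# these params must be set before model.build().
word_input_dim = len(diin_preprocessor.context['vocab_unit'].state['term_index'])
char_input_dim = len(diin_preprocessor.context['char_unit'].state['term_index'])
model.params['embedding_input_dim'] = word_input_dim
model.params['char_embedding_input_dim'] = char_input_dim
model.build()
model.compile()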

ouyaya (Author) commented Mar 26, 2020

Thank you for your answer; the problem is solved following your hint. But when I train this model on the WikiQA dataset, the loss keeps getting higher and higher. I don't know whether it is a problem with the model code or with the task I am working on. The task is 'ranking'.

uduse (Member) commented Mar 27, 2020

The model is in contrib precisely because it is not fully tested, so I am not sure why this is happening. @caiyinqiong is the author; maybe she has something to say?

ouyaya (Author) commented Apr 2, 2020

For a new dataset, do I need to tune the hyperparameters in the given model that don't have a hyperspace?

uduse (Member) commented Apr 2, 2020

If a parameter doesn't have a hyperspace, it won't be tuned and will just use its default value. It's always a good idea to tune your models on new datasets. You can do this either by adjusting things manually or by using the auto-tuner, as sketched below.
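For example, the auto-tuner can be driven roughly like this. This is a sketch only; the exact mz.auto.Tuner arguments may differ across MatchZoo versions, so check the docs for your install:

# Sketch of MatchZoo's auto-tuner; the constructor arguments shown here
# (params, train_data, test_data, num_runs) are illustrative and may vary
# by version.
tuner = mz.auto.Tuner(
    params=model.params,
    train_data=train_pack_processed,
    test_data=dev_pack_processed,
    num_runs=10
)
results = tuner.tune()
print(results['best'])  # best hyperparameter assignment found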

ouyaya (Author) commented Apr 2, 2020

I mean, do I need to add a hyperspace to the hyperparameters that don't have one, and then use the auto-tuner to adjust all the parameters? If I tune all the parameters, will that change the original model described in the paper? I want to use it for comparison experiments against my own paper's model.

uduse (Member) commented Apr 2, 2020

@ouyaya If you just want a baseline, then yes: don't add extra hyperspaces, just use the default ones. You don't even need to tune the model; just run it as it is. Alternatively, you can tune the model and then report it as a "fine-tuned" baseline.
