Is there a tutorial on the DIIN model? #822

Open · ouyaya opened this issue Mar 21, 2020 · 13 comments

ouyaya commented Mar 21, 2020

When I built the model with the hyperparameters given for it, the following error occurred:

ValueError: Layer weight shape (10000, 300) not compatible with provided weight shape (33905, 300)

So I changed the hyperparameter 'embedding_input_dim' to 33905, but then another error appeared:

tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[1200,0] = 184 is not in [0, 100)
[[{{node time_distributed_1_1/embedding_1/embedding_lookup}}]]

uduse (Member) commented Mar 21, 2020

#821 Does this help?

ouyaya (Author) commented Mar 21, 2020

I used those hyperparameters, but found that this one raises an error: model.params['embedding_input_dim'] = 10000
The error message says this hyperparameter should be set to 33905. However, after I changed it to 33905, another error occurred, so I wonder whether the code in some other part of the call is wrong. Do you have tutorial code that calls this model correctly?
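My guess is that the value should come from the fitted preprocessor rather than being hard-coded. A minimal sketch of what I mean, assuming MatchZoo 2.x's context layout:

# Sketch: size the word embedding from the fitted preprocessor's vocabulary
# instead of hard-coding 10000 (assumes MatchZoo 2.x context layout).
term_index = diin_preprocessor.context['vocab_unit'].state['term_index']
model.params['embedding_input_dim'] = len(term_index)  # 33905 in my case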

uduse (Member) commented Mar 21, 2020

@ouyaya Did you use the correct preprocessor?

ouyaya (Author) commented Mar 21, 2020

Yes, I used the preprocessing method in diin_preprocessor.py

uduse (Member) commented Mar 21, 2020

I need more information. Can you provide a Minimal, Reproducible Example?

ouyaya (Author) commented Mar 21, 2020

# -*- coding: UTF-8 -*-
import keras
import pandas as pd
import numpy as np
import matchzoo as mz
import json
print('matchzoo version', mz.__version__)
print()

print('data loading ...')
train_pack_raw = mz.datasets.wiki_qa.load_data('train', task='ranking')
dev_pack_raw = mz.datasets.wiki_qa.load_data('dev', task='ranking', filtered=True)
test_pack_raw = mz.datasets.wiki_qa.load_data('test', task='ranking', filtered=True)
print('data loaded as `train_pack_raw` `dev_pack_raw` `test_pack_raw`')

ranking_task = mz.tasks.Ranking(loss=mz.losses.RankHingeLoss())
ranking_task.metrics = [
    mz.metrics.NormalizedDiscountedCumulativeGain(k=3),
    mz.metrics.NormalizedDiscountedCumulativeGain(k=5),
    mz.metrics.MeanAveragePrecision()
]
print("`ranking_task` initialized with metrics", ranking_task.metrics)

print("loading embedding ...")
glove_embedding = mz.datasets.embeddings.load_glove_embedding(dimension=300)
print("embedding loaded as `glove_embedding`")


diin_preprocessor = mz.preprocessors.DIINPreprocessor(
    fixed_length_left=32,
    fixed_length_right=32,
    fixed_length_word=16
)
diin_preprocessor = diin_preprocessor.fit(train_pack_raw, verbose=0)
train_pack_processed = diin_preprocessor.transform(train_pack_raw, verbose=0)
dev_pack_processed = diin_preprocessor.transform(dev_pack_raw, verbose=0)
test_pack_processed = diin_preprocessor.transform(test_pack_raw, verbose=0)

model = mz.contrib.models.DIIN()
model.guess_and_fill_missing_params()
model.params['embedding_input_dim'] = 10000
model.params['embedding_output_dim'] = 300
model.params['embedding_trainable'] = True
model.params['optimizer'] = 'adam'
model.params['dropout_initial_keep_rate'] = 1.0
model.params['dropout_decay_interval'] = 10000
model.params['dropout_decay_rate'] = 0.977
model.params['char_embedding_input_dim'] = 100
model.params['char_embedding_output_dim'] = 8
model.params['char_conv_filters'] = 100
model.params['char_conv_kernel_size'] = 5
model.params['first_scale_down_ratio'] = 0.3
model.params['nb_dense_blocks'] = 3
model.params['layers_per_dense_block'] = 8
model.params['growth_rate'] = 20
model.params['transition_scale_down_ratio'] = 0.5
model.build()
model.compile()
model.backend.summary()

embedding_matrix = glove_embedding.build_matrix(
    diin_preprocessor.context['vocab_unit'].state['term_index']
)
model.load_embedding_matrix(embedding_matrix)  # this line raises the ValueError below

pred_x, pred_y = test_pack_processed[:].unpack()
evaluate = mz.callbacks.EvaluateAllMetrics(model, x=pred_x, y=pred_y, batch_size=len(pred_y))

train_generator = mz.DataGenerator(
    train_pack_processed,
    mode='pair',
    num_dup=2,
    num_neg=1,
    batch_size=20
)
print('num batches:', len(train_generator))

history = model.fit_generator(
    train_generator,
    epochs=30,
    callbacks=[evaluate],
    workers=0,
    use_multiprocessing=True
)

(full script attached as matchzoo_diin.txt)
When I run this calling code, I get the following error:

Traceback (most recent call last):
  File "E:/Paper/keras_wiki/matchzoo_diin.py", line 60, in <module>
    model.load_embedding_matrix(embedding_matrix)
  File "F:\Python\lib\site-packages\matchzoo\engine\base_model.py", line 469, in load_embedding_matrix
    self.get_embedding_layer(name).set_weights([embedding_matrix])
  File "F:\Python\lib\site-packages\keras\engine\base_layer.py", line 1126, in set_weights
    'provided weight shape ' + str(w.shape))
ValueError: Layer weight shape (10000, 300) not compatible with provided weight shape (33905, 300)

uduse (Member) commented Mar 22, 2020

In addition to the other change you made, also set the embedding input dimension for char embedding:

char_input_dim = len(diin_preprocessor.context['char_unit'].state['term_index'])
model.params['char_embedding_input_dim'] = char_input_dim
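Putting both fixes together, set the two input dimensions from the fitted preprocessor before building the model. A sketch based on your script above (both keys exist in the preprocessor's context after fit):

# Sketch: derive both embedding vocabulary sizes from the fitted preprocessor;
# these params must be set before model.build().
word_input_dim = len(diin_preprocessor.context['vocab_unit'].state['term_index'])
char_input_dim = len(diin_preprocessor.context['char_unit'].state['term_index'])
model.params['embedding_input_dim'] = word_input_dim
model.params['char_embedding_input_dim'] = char_input_dim
model.build()
model.compile()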

ouyaya (Author) commented Mar 26, 2020

Thank you for your answer; the problem is solved following your hint. But when I train this model on the WikiQA dataset, the loss keeps getting higher and higher. I don't know whether it is a problem with the model code or with the task I am working on. The task is 'ranking'.

uduse (Member) commented Mar 27, 2020

The model is in contrib precisely because it is not fully tested, so I am not sure why this is happening. @caiyinqiong is the author; maybe she has something to say?

ouyaya (Author) commented Apr 2, 2020

For a new dataset, do I need to tune the hyperparameters in the given model that don't have a hyperspace?

uduse (Member) commented Apr 2, 2020

If a parameter doesn't have a hyperspace, it won't be tuned and will just use its default value. It's always a good idea to tune your models on new datasets. You can do this either by adjusting things manually or by using the auto-tuner, as sketched below.
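For example, the auto-tuner can be driven roughly like this. This is a sketch only; the exact mz.auto.Tuner arguments may differ across MatchZoo versions, so check the docs for your install:

# Sketch of MatchZoo's auto-tuner; the constructor arguments shown here
# (params, train_data, test_data, num_runs) are illustrative and may vary
# by version.
tuner = mz.auto.Tuner(
    params=model.params,
    train_data=train_pack_processed,
    test_data=dev_pack_processed,
    num_runs=10
)
results = tuner.tune()
print(results['best'])  # best hyperparameter assignment found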

ouyaya (Author) commented Apr 2, 2020

I mean, do I need to add a hyperspace to the hyperparameters that don't have one, and then use the auto-tuner to adjust all the parameters? If I tune all the parameters, will that change the original model described in the paper? I want to use it for comparison experiments against my own paper's model.

uduse (Member) commented Apr 2, 2020

@ouyaya If you just want a baseline, then yes: don't add extra hyperspaces, just use the default ones. You don't even need to tune the model; just run it as it is. Alternatively, you can tune the model and then report it as a "fine-tuned" baseline.
