PORL-HG (The paper has been published in AAAI 2020 as a long paper.)

Code implementation of "Attractive or Faithful? Popularity-Reinforced Learning for Inspired Headline Generation".

“A good basic selling idea, involvement and relevancy, of course, are as important as ever,
but in the advertising din of today, unless you make yourself noticed and believed, you ain’t got nothing”

— Leo Burnett (1891-1971)

Generation Examples

Dataset

Download link for the CNNDM-DH and DM-DHC datasets: PORLHG

You can follow the instructions to download and preprocess the CNN/DailyMail dataset to obtain the articles.

The dataset was collected from the URLs provided by Nallapati et al. (2016) and Hermann et al. (2015).

The DH and DHC datasets can be linked to CNNDM through the id.
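As a minimal sketch of that association, the snippet below joins headline records to preprocessed articles on a shared id. The record layout (dicts with `id` and `headline` keys), the `articles_by_id` mapping, and the function name `attach_articles` are assumptions made for illustration; the actual file format of the released datasets may differ.

```python
from typing import Dict, Iterable, List


def attach_articles(
    headline_records: Iterable[dict],
    articles_by_id: Dict[str, str],
) -> List[dict]:
    """Join headline records to CNN/DailyMail articles on a shared id.

    Records whose id has no matching article are skipped.
    """
    joined = []
    for record in headline_records:
        article = articles_by_id.get(record["id"])
        if article is None:
            continue  # no preprocessed article for this id
        joined.append({**record, "article": article})
    return joined


# Toy data: the ids and texts below are made up for illustration only.
dh_records = [
    {"id": "a1", "headline": "Example headline one"},
    {"id": "b2", "headline": "Example headline two"},
]
cnndm_articles = {"a1": "Full text of article a1 ..."}

pairs = attach_articles(dh_records, cnndm_articles)
print(len(pairs), "headline/article pairs")
```

Skipping unmatched ids keeps the join safe when only part of CNN/DailyMail has been preprocessed; raising an error instead would be a reasonable alternative if every id is expected to match.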

The dataset information:

Dataset	train	val	test
DH	281208	12727	10577
DHC	138787	11862	10130

More Experiment Results

Table1. Correlation analysis of CTR, comments, and shares. The table lists the hypotheses and the corresponding p-values of the significance test; a hypothesis is significant when its p-value is below 0.05 (marked in bold in the original table). Note that the p-values for CTR are taken from Kuiken et al. 2017.

Hypothesis	CTR	Comment	Share
H1 Longer headlines (> 50 characters) are preferred over shorter headlines	0.297	0	0
H2 Headlines with short words (< 8 characters per word) are preferred	0.024	0	0
H3 Headlines containing a question are preferred	0.019	0	0
H4 Headlines containing a partial quote are preferred over not containing any quote	0.239	0.996	0.971
H5 Headlines not containing any quote are preferred over containing full quote	0.03	0.848	0.111
H6 Headlines that contain one or more signal words are preferred	0.002	0	0.001
H7 Headlines that contain one or more personal or possessive pronouns are preferred	0	0	0
H8 Headlines that contain one or more sentimental words are preferred	0.018	0	0
H9 Headlines that contain one or more negative sentimental words are preferred	0.001	0.001	0.015
H10 Headlines that contain a number are preferred over headlines that do not	0.202	0	0.06
H11 Headlines that start with a personal or possessive pronoun are preferred	0.002	0	0.429

Table2. The popularity features. The following 11 features are derived from the hypotheses stated in Table1. GT denotes the ground-truth headlines, and Chen et al. (2018) is one of our baselines.

Hypothesis	Significance	GT	PORL	Chen et al.
H1 The average character length of a headline	False	70.55	96.21	73.92
H2 The average of token lengths in a headline (lower is better)	True	4.97	4.78	4.89
H3 The percentage of headlines containing a question mark	True	2.52%	0.90%	1.19%
H4 The percentage of headlines containing a partial quote	True	11.81%	15.80%	13.85%
H5 The percentage of headlines containing a full quote (lower is better)	False	0.01%	0.00%	0.00%
H6 The percentage of headlines containing signal words	True	9.90%	19.83%	15.00%
H7 The percentage of headlines containing a personal or possessive pronoun	True	28.82%	48.67%	40.35%
H8 The percentage of headlines containing sentimental words	True	68.82%	77.40%	69.37%
H9 The percentage of headlines containing negative words	True	45.09%	52.29%	44.83%
H10 The percentage of headlines containing numbers	False	20.58%	25.22%	21.06%
H11 The percentage of headlines starting with a personal or possessive pronoun	True	0.64%	1.07%	0.38%
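To make the feature definitions in Table2 concrete, here is a rough sketch that computes a few of them (H1, H2, H3, H7, H10, H11) for a list of headlines. It is not the repository's own implementation: the whitespace tokenization, the pooling of token lengths over all headlines for H2, the punctuation stripping, and the short `PRONOUNS` list are all assumptions, and the lexicon-based features (signal, sentimental, and negative words) are left out because their word lists are not given here.

```python
import re
from typing import Dict, List

# Illustrative pronoun list (an assumption); the lexicon behind Table2 is not given here.
PRONOUNS = {
    "i", "me", "my", "mine", "we", "us", "our", "ours",
    "you", "your", "yours", "he", "him", "his", "she", "her", "hers",
    "it", "its", "they", "them", "their", "theirs",
}


def _is_pronoun(token: str) -> bool:
    # Strip common punctuation so that "my," or "you?" still match.
    return token.strip(".,!?:;'\"") in PRONOUNS


def popularity_features(headlines: List[str]) -> Dict[str, float]:
    """Compute a subset of the Table2 features over a list of headlines."""
    n = len(headlines)
    if n == 0:
        raise ValueError("need at least one headline")
    # Assumption: lowercase whitespace tokenization.
    tokenized = [h.lower().split() for h in headlines]
    all_tokens = [t for toks in tokenized for t in toks]

    def pct(flags) -> float:
        # Percentage of headlines for which the flag is true.
        return 100.0 * sum(flags) / n

    return {
        "H1_avg_char_length": sum(len(h) for h in headlines) / n,
        "H2_avg_token_length": sum(len(t) for t in all_tokens) / max(len(all_tokens), 1),
        "H3_pct_question": pct("?" in h for h in headlines),
        "H7_pct_pronoun": pct(any(_is_pronoun(t) for t in toks) for toks in tokenized),
        "H10_pct_number": pct(bool(re.search(r"\d", h)) for h in headlines),
        "H11_pct_starts_with_pronoun": pct(
            bool(toks) and _is_pronoun(toks[0]) for toks in tokenized
        ),
    }


# Made-up headlines for illustration only.
sample = ["Why is my phone so slow?", "10 killed in storm", "She wins the race"]
features = popularity_features(sample)
for name, value in features.items():
    print(f"{name}: {value:.2f}")
```

Running the same function over ground-truth and generated headlines gives directly comparable columns, which is how the GT, PORL, and Chen et al. figures in Table2 are laid out.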

Cite

@article{Song_Shuai_Yeh_Wu_Ku_Peng_2020,
  title={Attractive or Faithful? Popularity-Reinforced Learning for Inspired Headline Generation},
  volume={34},
  url={https://ojs.aaai.org/index.php/AAAI/article/view/6421},
  DOI={10.1609/aaai.v34i05.6421},
  author={Song, Yun-Zhu and Shuai, Hong-Han and Yeh, Sung-Lin and Wu, Yi-Lun and Ku, Lun-Wei and Peng, Wen-Chih},
  year={2020},
  month={Apr.},
  pages={8910-8917}
}

yunzhusong/AAAI20-PORLHG
