Split title documentation (for 1.1) #110

Draft
wants to merge 4 commits into
base: master

denismaier
Member

This adds documentation for the title-splitting feature to be added in 1.1. I'm not sure it's in the right place. (I actually am quite sure that it isn't, but I don't really know where to put it. Should, e.g. the attributes be added to the sections on style behavior? Only there? Or as well?)

One thing that still needs to be discussed: the proposal had this paragraph: "As some locales prefer en to em dashes, citeprocs should check against both if the "full" options are selected on normalize-title-delimiters and/or title-split." Should I add that somewhere, or should I just add "–" (an en dash) to the relevant options in the schema and in the documentation?

@bdarcus
Member

bdarcus commented Jul 12, 2020 via email

@bwiernik
Member

I was thinking we might want to add a new section for input data, and this could primarily go there, along with the rich text stuff, and dates?

Yes, this. We also probably need a whole new section in the spec to describe how title processing works, similar to the section on names and dates being split out.

@denismaier
Member Author

I had better convert this to a draft...

@denismaier denismaier marked this pull request as draft July 12, 2020 19:48
@bwiernik
Member

bwiernik commented Jul 18, 2020

@bdarcus If we want to simplify these rules, we could cut the delimiter sets down to simple (. , : , :: , ? , ! ) and extended (simple plus ; ). Em dash separation of subtitles versus colons is maybe rare enough that we could require users to split that manually. Chicago style is another rare case that could be left to users to split.

We could even eliminate extended if we want. ; is mostly used to delimit multiple subtitles. We could ignore that case or bake in that logic and ignore the rare styles that don't respect that convention.

That could eliminate the title-split attribute and just leave one of these options for spec instructions:

Ignore the ; split for subtitles:

Citeprocs split title variables into "main" and "sub" forms.
Split-points can be explicitly provided in title variables by separating
chunks of a title with two vertical bars:
Main Title:|| first subtitle:|| second subtitle. The "main" form is
the text before the first delimiter, the "sub" form is an array of the
text following each delimiter. If no split-points are supplied in the
data, the citeproc will derive them by splitting the title on the following
patterns: . , : , :: , ! , ?
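
A rough sketch of this first option in Python, for illustration only. The function name `split_title` and the dict shape are hypothetical, and two things are assumptions rather than part of the proposal: a delimiter only counts when it is followed by whitespace, and the punctuation stays attached to the chunk before it. A lone `:` check also covers `::`, since the last colon is what precedes the whitespace.

```python
import re

# Assumption: a delimiter (. : :: ! ?) counts only when followed by whitespace.
# Matching the last character is enough, so "::" needs no separate entry.
DERIVED_SPLIT = re.compile(r"(?<=[.:!?])\s+")


def split_title(title: str) -> dict:
    """Split a flat title into a "main" string and a list of "sub" strings."""
    title = title.strip()
    if "||" in title:
        # Explicit split-points in the data take precedence over derived ones.
        parts = [chunk.strip() for chunk in title.split("||")]
    else:
        # No explicit split-points: derive them from the punctuation patterns.
        parts = DERIVED_SPLIT.split(title)
    return {"main": parts[0], "sub": parts[1:]}


print(split_title("Main Title:|| first subtitle:|| second subtitle"))
# {'main': 'Main Title:', 'sub': ['first subtitle:', 'second subtitle']}
```

Note that a naive pattern like this would also split after abbreviations such as "Dr. ", which is one reason explicit split-points are useful.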

Split subtitles on ; :

Citeprocs split title variables into "main" and "sub" forms.
Split-points can be explicitly provided in title variables by separating
chunks of a title with two vertical bars:
Main Title:|| first subtitle:|| second subtitle. The "main" form is
the text before the first delimiter, the "sub" form is an array of the
text following each delimiter. If no split-points are supplied in the
data, the citeproc will derive them. The main title is separated from any
subtitles by splitting the title on the first of the following patterns: . , : , :: , ! , ? .
Multiple subtitles are separated by splitting the remaining title string
on the following patterns: . , : , :: , ! , ? , ; .
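
The two-stage variant could look roughly like this in Python. It only covers the derived path (no `||` handling); the name `split_title_two_stage` is hypothetical, and as an assumption a delimiter only counts when followed by whitespace. The point is that `;` is ignored when locating the main title and only recognized inside the remainder.

```python
import re

# Assumption: delimiters count only when followed by whitespace.
MAIN_SPLIT = re.compile(r"(?<=[.:!?])\s+")   # separates main title from the rest
SUB_SPLIT = re.compile(r"(?<=[.:!?;])\s+")   # separates subtitles; adds ";"


def split_title_two_stage(title: str) -> dict:
    """Derive "main" and "sub" forms, treating ";" as a delimiter only
    between subtitles."""
    # Stage 1: split once, on the first main-title delimiter.
    head = MAIN_SPLIT.split(title.strip(), maxsplit=1)
    main = head[0]
    # Stage 2: split whatever follows on the extended delimiter set.
    sub = SUB_SPLIT.split(head[1]) if len(head) > 1 else []
    return {"main": main, "sub": sub}


print(split_title_two_stage("Main title: first subtitle; second subtitle"))
# {'main': 'Main title:', 'sub': ['first subtitle;', 'second subtitle']}
```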

@denismaier
Member Author

We could even eliminate extended if we want. ; is mostly used to delimit multiple subtitles.

Couldn't we just add ; to the list of delimiters and be done? Like:

If no split-points are supplied in the
data, the citeproc will derive them by splitting the title on the following
patterns: . , : , ; , :: , ! , ?

A semicolon in a title that doesn't serve as a delimiter should be rare enough anyway, right?

@bwiernik
Member

Probably.

@denismaier
Member Author

Chicago style splits and punctuation are now also endorsed by MLA:

For an alternative or double title in English beginning with or, we follow the first example given in section 8.165 of The Chicago Manual of Style and punctuate as follows:
England’s Monitor; or, The History of the Separation (452)
But no semicolon is needed for a title in English that ends with a question mark or exclamation point:
“Getting Calliope through Graduate School? Can Chomsky Help? or, The Role of Linguistics in Graduate Education in Foreign Languages”

https://style.mla.org/punctuation-with-titles/

@bwiernik
Member

Let's change the relevant pattern to ; or, and recognize it always.

@denismaier
Member Author

By the way, why do we have :: in the pattern list? What's that for?

@bwiernik
Member

bwiernik commented Jul 18, 2020

It was in Frank's list, I think because a lot of library catalog data explicitly use :: to separate main and subtitles. IIRC, he normalizes that to just one colon. We could drop it, I think (: will still match that, obviously).

@denismaier
Member Author

It was in Frank's list, I think because a lot of library catalog data explicitly use :: to separate main and subtitles. IIRC, he normalizes that to just one colon. We could drop it, I think (: will still match that, obviously).

Hmmm, we, at least, use : to separate main and subtitle.

Concerning ; or,: Couldn't we just include the complete Chicago pattern here? I don't think that will be too problematic, as this should only affect subtitle casing. (And you wouldn't usually replace such a delimiter.)
How does APA deal with this?

@denismaier
Member Author

An edge case regarding Chicago style splits.
Let's say you have:
"A very important title; or: This book is important"

Even if you normalize colons to periods in other cases, here you will not want
"A very important title; or. This book is important"
but to keep the colon, so:
"A very important title; or: This book is important"
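
To make the edge case concrete, here is a small Python sketch of a colon-to-period normalization that leaves the colon alone when it directly follows "; or" or ", or". The function name `colons_to_periods` is hypothetical, and the exact set of protected prefixes is an assumption for illustration.

```python
import re

# Match ": " unless the colon closes "; or" or ", or".
# The negative lookbehind is fixed-width (4 characters), as Python requires.
PLAIN_COLON = re.compile(r"(?<![;,] or): ")


def colons_to_periods(title: str) -> str:
    """Replace subtitle colons with periods, keeping the colon after "or"."""
    return PLAIN_COLON.sub(". ", title)


print(colons_to_periods("A very important title: This book is important"))
# A very important title. This book is important
print(colons_to_periods("A very important title; or: This book is important"))
# A very important title; or: This book is important
```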

@bwiernik
Member

Here are the three examples from the current Chicago manual, with the first being preferred nowadays:

The Tempest, or The Enchanted Island
Moby-Dick; or, The Whale
Dr. Strangelove, or: How I Learned to Stop Worrying and Love the Bomb

The first we don't bother with. That would need to be entered as The tempest, or ||the enchanted island or as The tempest, or The enchanted island.

The third is handled by the normal colon rules.

The second is the only one that needs to be handled. This regex pattern should work for that: /; or,?/ (the ? here indicates that the comma is optional). With ; being in the main split list, the only thing that needs to be accommodated is to not capitalize "or".
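
A minimal Python sketch of that accommodation: find the "; or," delimiter and capitalize the word after it instead of "or" itself. The name `capitalize_after_or` is hypothetical, and requiring a space after the optional comma is an added assumption so that words like "; origin" do not match.

```python
import re

# The /; or,?/ pattern from above, plus a required trailing space (assumption).
CHICAGO_OR = re.compile(r"; or,? ")


def capitalize_after_or(title: str) -> str:
    """Capitalize the first word after "; or," and leave "or" lowercase."""
    match = CHICAGO_OR.search(title)
    if match is None:
        return title  # no Chicago-style "; or," delimiter present
    end = match.end()
    return title[:end] + title[end:end + 1].upper() + title[end + 1:]


print(capitalize_after_or("Moby-Dick; or, the whale"))
# Moby-Dick; or, The whale
```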

@bdarcus
Member

bdarcus commented Jul 19, 2020 via email

@bwiernik
Member

I think that’s more a semantic description. In most data they are going to be entered as a flat title (e.g., especially the Dr. Strangelove title).

@denismaier
Member Author

The third is handled by the normal colon rules.

But shouldn't we make sure the colon does not get replaced here?

@bdarcus
Member

bdarcus commented Jul 19, 2020

I think that’s more a semantic description. In most data they are going to be entered as a flat title (e.g., especially the Dr. Strangelove title).

I don't think the latter point is relevant, but I'll save that for another thread.

On the first point, it's exactly how it's described in the style guides.

From here:

For an alternative or double title in English beginning with or, we follow the first example given in section 8.165 of The Chicago Manual of Style and punctuate as follows ...

@bwiernik
Member

Yes, I think that's more a semantic description of the type of subtitle, rather than a description of how to expect these to appear in item data.

Here is the full description from the Chicago manual:

14.91 Use of "or" with double titles. Old-fashioned double titles (or titles and subtitles) connected by or have traditionally been separated by a semicolon (or sometimes a colon), with a comma following or, or more simply by a single comma preceding or. (Various other combinations have also been used.) When referring to such titles, prefer the punctuation on the title page or at the head of the original source. In the absence of such punctuation (e.g., when the title is distinguished from the subtitle by typography alone), or when the original source is not available to consult, use the simpler form shown in the first example. This departure from earlier editions recognizes the importance of balancing editorial expediency with fidelity to original sources. The second example preserves the usage on the original title pages of the American and British editions of Melville’s classic novel (and assumes one of those editions, or a later edition that preserves such punctuation, was in fact consulted). The third example (of a modern film) preserves the colon of the original title sequence but adds a comma to separate the main title from the secondary title (distinguished only graphically in the original). In all cases, the first word of the subtitle (following or) should be capitalized. See also 14.87, 14.88.

The Tempest, or The Enchanted Island
but
Moby-Dick; or, The Whale
Dr. Strangelove, or: How I Learned to Stop Worrying and Love the Bomb

@bwiernik
Member

But shouldn't we make sure the colon does not get replaced here?

We could. In that case, listing these separately would be best: /; or,? / and /, or: /

@bdarcus
Member

bdarcus commented Jul 19, 2020 via email

@bwiernik bwiernik added the 1.1 label Nov 4, 2020
3 participants