Merge changes from Josh Juneau's copy #14

Open
jeff5 opened this issue Jun 4, 2019 · 6 comments
Comments

@jeff5
Member

jeff5 commented Jun 4, 2019

In a repo belonging to Josh Juneau, I have found a later version of the book than the one we are building on. :( This version is much closer to the paper edition.

It will take some careful work to merge Josh's into this repository, because of fixes made here that that version lacks. (I think that's the right thing to do, but only trying will tell us if it's a lost cause.)

Believe it or not, I actually found this via the unfinished Korean translation on RTD, which I spotted contained stuff we didn't have.

@jeff5
Member Author

jeff5 commented Jun 11, 2019

I started on this at a branch of my own repo but it is quite a bit of work. At least in the early chapters, we gain a lot of material.

@adamburkegh
Contributor

@jeff5 would you consider merging this into trunk? I know it's not finished, but they are mostly incremental improvements, right? I can take a look; I presume you have your hands pretty full with the beta.

@jeff5
Member Author

jeff5 commented Dec 28, 2019

@adamburkegh: Sorry, I'm terrible for starting things I don't have time to finish. I thought it would be mechanical to do the whole book, but each chapter brings its own twists, and I ran out of time.

The problem here is that it's mostly a routine job, if a bit fiddly, and I'd got into a sort of rhythm of what to do ... I'd be happy to share, except by the time you asked I'd mostly lost the process. It involved a lot of supervised global replacement using regexes, and a fair amount of by-hand drudgery.

I've spent some time recovering my technique. Firstly, the target state for each chapter is:

  • As close to the printed book as possible (plus improvements we've already taken on).
  • One sentence per line. (Because re-flowing a paragraph hides what really changed.)
  • No archaic Sphinx syntax. (I found a few, e.g. old table syntax.)
  • Builds without error messages.

I was working between (my local copy of) the book repo and a clone of Josh's repo. Basically my technique is to bring corresponding sources in both places to the one-sentence-per-line state and, by differencing (kdiff3), figure out what bulk changes can be made to either to minimise the red. There is some word-processor damage to spot.

When I've got the differences down to proper editorial content, I merge sentence by sentence (kdiff3 again), generally preferring Josh's copy.
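
The normalise-then-diff step described above can be sketched in Python. This is only an illustration under stated assumptions: `difflib` from the standard library stands in for kdiff3, the function names `one_sentence_per_line` and `sentence_diff` are hypothetical, and the splitter is a naive pattern that suits plain prose paragraphs only (it would mangle indented blocks, lists and code in a real chapter).

```python
import difflib
import re

# Naive sentence boundary: a word ending in . ? or !, spaces, then a capital.
# Assumption: good enough for plain prose; it gives false hits on "e.g. Java".
SENTENCE_BOUNDARY = re.compile(r"(\w+[.?!])[ ]+([A-Z])")


def one_sentence_per_line(text):
    """Join each paragraph into one line, then split it at sentence boundaries."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    result = []
    for para in paragraphs:
        joined = " ".join(line.strip() for line in para.splitlines())
        result.append(SENTENCE_BOUNDARY.sub(r"\1\n\2", joined))
    return "\n\n".join(result)


def sentence_diff(ours, theirs):
    """Diff two sources after bringing both to one sentence per line."""
    a = one_sentence_per_line(ours).splitlines()
    b = one_sentence_per_line(theirs).splitlines()
    return list(difflib.unified_diff(a, b, "ours", "theirs", lineterm=""))


if __name__ == "__main__":
    ours = "The cat sat. It was\nhappy.\n\nSecond para."
    theirs = "The cat sat.   It was happy.\n\nSecond para here."
    # Differences in wrapping and spacing vanish; only the edited sentence shows.
    print("\n".join(sentence_diff(ours, theirs)))
```

Because both sides are normalised first, differences in line wrapping drop out of the diff and what remains should be the editorial changes.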

@jeff5 jeff5 closed this as completed in 2405c36 Dec 29, 2019
@jeff5
Member Author

jeff5 commented Dec 29, 2019

Grrr! GitHub has closed this because I referenced it in the commit (I think). However, I've only updated chapters 1-5.

As each file seems to take a good few hours, it makes sense to do them as separate commits or PRs. It is right they should reference this, so let's keep re-opening it.

@jeff5 jeff5 reopened this Dec 29, 2019
@adamburkegh
Contributor

@jeff5 All fair enough. Offer's still open for the other chapters if you like.

@jeff5
Member Author

jeff5 commented Dec 31, 2019

Offer gratefully accepted. And a happy New Year to you.

Please announce your picks here (just in case I get the urge too). I think with that last commit (and the tricky merge I gave myself) I have stopped being a blockage to contribution. I think we should do this (i.e. get roughly level with the paper book) before trying to advance the content into 2.7/3 territory.

I made a note of useful regexes. None is safe for "replace all":

Split lines at sentence boundaries (not for indented lines):
(\w+[.?!])[ ]+([A-Z])
$1\n$2

Split line at first sentence boundary (copes with indenting):
^([ ]*)([^.\n]+[.?!])[ ]+([A-Z])
$1$2\n$1$3

Strip trailing space:
[ \t\r]+$
<nothing>

Join lines into one-line paragraph:
(\w+[,;']?)[ ]*\n(\w+)
$1 $2

Join lines into one-line paragraph (more inclusive - use with care):
(\w+[,;']?)[ ]*\n[  ]*(\w+)
$1 $2

Variable and function name code-quoting (timer2, tip_value, __doc__, next(), etc.: hits on code all the time):
(_*[a-z]+(_[a-z]*|[0-9]|\(\))+)
``$1``

Apostrophe processing. (On balance, \u2019 is the better character as long as we're not in code.)
'(s|l|t|m)
’$1
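
For anyone who prefers a script to an editor, here is a minimal sketch of how some of these patterns might be used for supervised (not blanket) replacement in Python. The patterns are transcribed from the list above, with `\1` in place of `$1` as Python's `re` module requires; the helper names `candidates` and `apply_selected`, and the use of `re.MULTILINE` for the trailing-space pattern, are assumptions of this sketch rather than anything used in the actual merge.

```python
import re

# (pattern, replacement) pairs transcribed from the list above.
SPLIT_SENTENCES = (re.compile(r"(\w+[.?!])[ ]+([A-Z])"), r"\1\n\2")
STRIP_TRAILING = (re.compile(r"[ \t\r]+$", re.MULTILINE), "")
JOIN_LINES = (re.compile(r"(\w+[,;']?)[ ]*\n(\w+)"), r"\1 \2")
CURLY_APOSTROPHE = (re.compile(r"'(s|l|t|m)"), "\u2019\\1")


def candidates(rule, text):
    """List (start, end, matched, replacement) for review; changes nothing."""
    pattern, repl = rule
    return [(m.start(), m.end(), m.group(0), m.expand(repl))
            for m in pattern.finditer(text)]


def apply_selected(rule, text, accept):
    """Replace only those matches for which accept(match) returns True."""
    pattern, repl = rule
    return pattern.sub(
        lambda m: m.expand(repl) if accept(m) else m.group(0), text)


if __name__ == "__main__":
    text = "Use e.g. Java here. Then run it."
    # Review first: the pattern also hits the abbreviation "e.g. Java".
    for start, end, old, new in candidates(SPLIT_SENTENCES, text):
        print(start, end, repr(old), "->", repr(new))
    # Then apply, rejecting the false hit by hand (here, by a simple rule).
    print(apply_selected(SPLIT_SENTENCES, text, lambda m: m.group(1) != "g."))
```

Listing the candidates before replacing is the scripted equivalent of stepping through matches in an editor, which is why none of the patterns is run as a plain "replace all".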
