Script unzipping too; add instructions to README

2025-01-23 07:20:20 +08:00 · 2016-10-14 23:32:37 -04:00
commit 82f5928b89
parent 7031318d01
2 changed files with 27 additions and 5 deletions
--- a/README.md
+++ b/README.md
@@ -72,5 +72,15 @@ There are a number of labels on Issues:
 As the book will be published by No Starch, we first iterate here, then ship the
 text off to No Starch. Then they do editing, and we fold it back in.

-As such, there’s a directory, `nostarch`, which corresponds to the text in No
+As such, there’s a directory, *nostarch*, which corresponds to the text in No
 Starch’s system.
+
+When we start working with No Starch in Word docs, we will also check those
+into the repo in the *nostarch/odt* directory. To extract the text from a Word
+doc as markdown in order to backport changes to the online book:
+
+1. Open the doc file in LibreOffice
+1. Accept all tracked changes
+1. Save as Microsoft Word 2007-2013 XML (.docx) in the *tmp* directory
+1. Run `./doc-to-md.sh`
+1. Inspect changes made to the markdown file in the *nostarch* directory and copy the changes to the *src* directory as appropriate.
--- a/doc-to-md.sh
+++ b/doc-to-md.sh
@@ -2,7 +2,19 @@

 set -eu

-xsltproc tools/docx-to-md.xsl tmp/word/document.xml | \
-fold -w 80 -s | \
-sed -e "s/ *$//" \
-> nostarch/chapter02.md
+# Get all the docx files in the tmp dir,
+ls tmp/*.docx | \
+# Extract just the filename so we can reuse it easily.
+xargs -n 1 basename -s .docx | \
+while IFS= read -r filename; do
+  # Make a directory to put the XML in
+  mkdir -p "tmp/$filename"
+  # Unzip the docx to get at the xml
+  unzip -o "tmp/$filename.docx" -d "tmp/$filename"
+  # Convert to markdown with XSL
+  xsltproc tools/docx-to-md.xsl "tmp/$filename/word/document.xml" | \
+  # Hard wrap at 80 chars at word boundaries
+  fold -w 80 -s | \
+  # Remove trailing whitespace & save in the nostarch dir for comparison
+  sed -e "s/ *$//" > "nostarch/$filename.md"
+done
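
Note on the file-listing step: the loop above gets its filenames from `ls tmp/*.docx | xargs -n 1 basename -s .docx`. With `xargs`' default splitting, a name containing whitespace would be broken into several arguments, and `ls` exits non-zero when nothing matches. A minimal sketch of an alternative, assuming bash and a hypothetical helper named `docx_stems` (it is not part of this commit), iterates over the glob directly:

```shell
#!/bin/bash
set -eu

# Hypothetical helper, not part of the commit: print the name of every
# .docx file in the given directory with the extension stripped, one per line.
docx_stems() {
  local dir="$1" path
  for path in "$dir"/*.docx; do
    # When nothing matches, the glob is left as a literal string; skip it.
    [ -e "$path" ] || continue
    # The two-argument form of basename strips the suffix.
    basename "$path" .docx
  done
}
```

It could feed the existing loop unchanged, for example `docx_stems tmp | while IFS= read -r filename; do ...; done`. Names containing newlines would still be split by `read`, so this is a sketch rather than a fully general solution.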