forked from mwilliamson/mammoth.js
NEWS
# 1.5.2
* Add transformDocument to the TypeScript declarations.
# 1.5.1
* Fix: npm 7 changed the behaviour of prepublish, causing the browser build not
to be updated before publishing to npm. We now use prepare instead of
prepublish, which has the same behaviour that prepublish previously had.
# 1.5.0
* Only use the alt text of image elements as a fallback. If an alt attribute is
returned from the function passed to mammoth.images.imgElement, that value
will now be preferred to the alt text of the image element.
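The fallback rule in 1.5.0 can be modelled in a few lines. This is only an illustrative sketch, not mammoth's source: `resolveImgAttributes` and its parameter names are hypothetical. `imageAltText` stands for the alt text stored in the document, and `attributes` for the object returned by the function passed to mammoth.images.imgElement.

```javascript
// Illustrative model only: NOT mammoth's source code.
// resolveImgAttributes, imageAltText and attributes are hypothetical names.
function resolveImgAttributes(imageAltText, attributes) {
    const result = Object.assign({}, attributes);
    // Use the document's alt text only when the callback
    // did not supply an alt attribute itself.
    if (result.alt === undefined && imageAltText) {
        result.alt = imageAltText;
    }
    return result;
}

// The callback supplied its own alt, so it is preferred.
console.log(resolveImgAttributes("A cat", {src: "cat.png", alt: "Custom"}).alt);
// No alt from the callback, so the document's alt text is the fallback.
console.log(resolveImgAttributes("A cat", {src: "cat.png"}).alt);
```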
# 1.4.21
* Ignore w:u elements when w:val is missing.
# 1.4.20
* Emit warning instead of throwing exception when image file cannot be found for
a:blip elements.
# 1.4.19
* Add TypeScript declarations.
# 1.4.18
* When extracting raw text, convert tab elements to tab characters.
* Handle internal hyperlinks created with complex fields.
* Update JSZip to 3.2.0. This addresses CVE-2021-23413 in JSZip.
# 1.4.17
* Handle w:num with invalid w:abstractNumId.
* Update underscore to 1.13.1.
# 1.4.16
* Convert symbols in supported fonts to corresponding Unicode characters.
# 1.4.15
* Support numbering defined by paragraph style.
# 1.4.14
* Add style mapping for all caps.
# 1.4.13
* Use package-lock.json instead of npm-shrinkwrap.json.
# 1.4.12
* Handle underline elements where w:val is "none".
# 1.4.11
* Re-publishing to remove superfluous files.
# 1.4.10
* Read font size for runs.
* Support soft hyphens.
# 1.4.9
* Allow hyperlinks to be collapsed.
# 1.4.8
* Improve list support by following w:numStyleLink in w:abstractNum.
# 1.4.7
* Update xmlbuilder dependency.
# 1.4.6
* Fix: default style mappings caused footnotes, endnotes and comments
containing multiple paragraphs to be converted into a single paragraph.
# 1.4.5
* When using the pretty HTML writer, don't format HTML inside pre elements.
* Read the children of v:rect elements.
# 1.4.4
* Parse paragraph indents.
* Read part paths using relationships. This improves support for documents
created by Word Online.
# 1.4.3
* Add style mapping for small caps.
* Add style mapping for tables.
# 1.4.2
* Read children of v:group elements.
# 1.4.1
* Read w:noBreakHyphen elements as non-breaking hyphen characters.
# 1.4.0
* Extract the default data URI image converter to the images module.
* Add anchor on hyperlinks as fragment if present.
* Convert target frames on hyperlinks to targets on anchors.
* Detect header rows in tables and convert to thead > tr > th.
# 1.3.6
* Handle complex fields that do not have a "separate" fldChar.
# 1.3.5
* Add transforms.run.
# 1.3.4
* Read children of w:object elements.
# 1.3.3
* Handle hyperlinks created with complex fields.
# 1.3.2
* Handle absolute paths within zip files. This should fix an issue where some
images within a document couldn't be found.
# 1.3.1
* Add getDescendants() and getDescendantsOfType() to transforms module.
* Add font property to runs.
# 1.3.0
* Allow style names to be mapped by prefix. For instance:
`r[style-name^='Code '] => code`
* Add default style mappings for Heading 5 and Heading 6.
* Allow escape sequences in style IDs, style names and CSS class names.
* Allow a separator to be specified when HTML elements are collapsed.
* Add includeEmbeddedStyleMap option to allow embedded style maps to be
disabled.
* Include embedded styles when explicit style map is passed.
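As a rough illustration of the prefix matching introduced in 1.3.0, the sketch below contrasts an exact match (`=`) with a prefix match (`^=`). The helper `matchesStyleName` and the matcher object shape are hypothetical and not part of mammoth's API. Comparing without regard to case is an assumption based on the 0.3.18 entry.

```javascript
// Hypothetical helper, not part of mammoth's API.
// matcher.operator is "=" for an exact match or "^=" for a prefix match.
function matchesStyleName(matcher, styleName) {
    // Assumption: style names compare case-insensitively (see 0.3.18).
    const expected = matcher.value.toLowerCase();
    const actual = styleName.toLowerCase();
    return matcher.operator === "^="
        ? actual.startsWith(expected)
        : actual === expected;
}

// Models the matcher in: r[style-name^='Code '] => code
const codePrefix = {operator: "^=", value: "Code "};

console.log(matchesStyleName(codePrefix, "Code Keyword")); // true
console.log(matchesStyleName(codePrefix, "Codex"));        // false
```

The trailing space in `'Code '` matters: it keeps a style such as "Codex" from matching the prefix.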
# 1.2.5
* Ignore bold, italic, underline and strikethrough elements that have a value of
false or 0.
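A minimal sketch of the rule in 1.2.5, assuming the value is read from the `w:val` attribute of a toggle element such as `w:b`. The helper name `isToggleEnabled` is hypothetical, and the sketch only covers the two values named in the entry.

```javascript
// Hypothetical helper, not mammoth's actual reader.
// val is the w:val attribute of an element such as w:b,
// or undefined when the attribute is absent.
function isToggleEnabled(val) {
    // Only the values "false" and "0" switch the formatting off.
    return val !== "false" && val !== "0";
}

console.log(isToggleEnabled(undefined)); // true
console.log(isToggleEnabled("0"));       // false
console.log(isToggleEnabled("false"));   // false
```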
# 1.2.4
* Ignore v:imagedata elements without relationship ID with warning.
# 1.2.3
* Expect end token when parsing style mappings. This causes warnings to be
emitted instead of silently ignoring unparsed tokens.
# 1.2.2
* Ignore blank lines in style map.
# 1.2.1
* Use alt text title as alt text for images when the alt text description is
missing entirely.
# 1.2.0
* Throw more informative error when word/document.xml cannot be found.
* Generate messages of type "error" instead of "warning" when image conversion
throws an exception.
* Use alt text title as alt text for images when the alt text description is
blank.
# 1.1.0
* Add support for comments.
# 1.0.4
* Add support for w:sdt elements. This allows the bodies of content controls,
such as bibliographies, to be converted.
* Avoid stack overflows when elements have many children.
# 1.0.3
* Actually add support for table cells spanning multiple rows.
# 1.0.2
* Add support for table cells spanning multiple rows.
# 1.0.1
* Add support for table cells spanning multiple columns.
# 1.0.0
* Remove deprecated convertUnderline option.
* Officially support ID prefixes.
* Generated IDs no longer insert a hyphen after the ID prefix.
* The default ID prefix is now the empty string rather than a random number
followed by a hyphen.
* Rename mammoth.images.inline to mammoth.images.imgElement to better reflect
its behaviour.
# 0.3.33
* Improve collapsing of elements when there are empty elements.
# 0.3.32
* Allow bold and italic style mappings to be configured.
# 0.3.31
* Handle references to missing styles when reading documents.
# 0.3.30
* Improve support for lists made in LibreOffice. Specifically, this changes the
default style mapping for paragraphs with a style of "Normal" to have the
lowest precedence.
* Replace nomnom with argparse for CLI argument parsing. Usage should be the
same, but unknown arguments will now cause an exit with failure rather than
being ignored.
# 0.3.29
* Always use mc:Fallback when reading mc:AlternateContent elements.
# 0.3.28
* Avoid inserting newlines around inline elements when pretty-printing HTML.
* CLI: print warnings to stderr.
* Read v:imagedata with r:id attribute.
* Read children of v:roundrect.
* Ignore office-word:wrap, v:shadow and v:shapetype.
# 0.3.27
* Fix recursive collapsing of elements.
# 0.3.26
* Keep HTML elements open correctly when the HTML path contains no fresh
element.
* Collapse similar non-fresh HTML elements together.
* Don't escape double quotes outside of attributes.
# 0.3.25
* When the styleMap option is passed as a string, ignore lines beginning with #.
* Generate warnings for not-understood style mappings and continue, rather than
stopping with an error.
* Add support for embedded style maps.
* Add support for linked images.
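The comment handling for string style maps might look roughly like the following. This is a sketch under assumptions, not mammoth's parser: `styleMapLines` is a hypothetical helper, lines are trimmed before the `#` check, and blank lines are dropped as well (behaviour the 1.2.2 entry describes).

```javascript
// Hypothetical pre-processing step, not mammoth's actual parser.
function styleMapLines(styleMap) {
    return styleMap
        .split("\n")
        .map(line => line.trim())
        // Drop blank lines and comment lines that begin with #.
        .filter(line => line !== "" && !line.startsWith("#"));
}

const styleMap = [
    "# Headings",
    "p[style-name='Section Title'] => h1:fresh",
    "",
    "# Inline code",
    "r[style-name^='Code '] => code"
].join("\n");

// Only the two mapping lines remain.
console.log(styleMapLines(styleMap));
```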
# 0.3.24
* Ignore w:numPr elements without w:numId or w:ilvl children.
# 0.3.23
* Support links and images in footnotes and endnotes.
# 0.3.22
* Add support for underlines in style map.
* Add support for strikethrough.
# 0.3.21
* Add basic support for text boxes. The contents of the text box are treated as
a separate paragraph that appears after the paragraph containing the text box.
# 0.3.20
* Support styles defined without a name.
# 0.3.19
* Add ignoreEmptyParagraphs option, which defaults to true.
# 0.3.18
* Make style names case-insensitive in style mappings. This should make style
mappings easier to write, especially since Microsoft Word sometimes represents
style names in the UI differently from in the style definition. For instance,
the style displayed in Word as "Heading 1" has a style name of "heading 1".
This hopefully shouldn't cause an issue for anyone, but if you were relying
on case-sensitivity, please do get in touch.
# 0.3.17
* Add support for hyperlinks to bookmarks in the same document.
# 0.3.16
* Add basic support for Markdown. Not all features are currently supported.
# 0.3.15
* Add default style mappings for builtin footnote and endnote styles in
Microsoft Word and LibreOffice.
* Allow style mappings with a zero-element HTML path.
* Emit warnings when image types are unlikely to be supported by web browsers.
* Update jszip to 2.4.0.
# 0.3.14
* Add support for endnotes.
# 0.3.13
* Update license-sniffer to fix license comments in mammoth.browser.js.
# 0.3.12
* Update bluebird dependency to fix operation in web workers.
# 0.3.11
* Add support for superscript and subscript text.
# 0.3.10
* Handle XML files that start with the UTF-8 BOM.
# 0.3.9
* Add support for reading Buffer instances.
# 0.3.8
* Add support for line breaks.
# 0.3.7
* Add optional underline conversion.
# 0.3.6
* Add helper `mammoth.transforms.paragraph`.
# 0.3.5
* Add `mammoth.images.inline`, and document custom image conversion.
# 0.3.4
* Add the function `mammoth.extractRawText`.
# 0.3.3
* Add support for tables.
# 0.3.2
* Add support for footnotes.
# 0.3.1
* Fix: all non-paragraph styles are treated as character styles when reading
styles from a docx document.
# 0.3.0
* Rename the existing concept of a "style name" to "style ID" to correspond to
the terminology used internally in docx files. Any document transforms that
get or set the `styleName` property should use the `styleId` property instead.
* Allow paragraphs and runs to be matched by style name. For instance, to match
a paragraph with the style name `Heading 1`:
`p[style-name='Heading 1']`
* Use an array of raw strings rather than an array of style mappings when
passing options to convertToHtml.
* Add support for node 0.11 (tested on node 0.11.13).
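To illustrate the options shape described in 0.3.0, a style map can be written as a plain array of raw mapping strings. The specific mappings below are examples only. The conversion call is left as a comment because it needs the mammoth package and a .docx file, and its input argument is an assumption about the API rather than something this changelog states.

```javascript
// Each entry is a raw style mapping string rather than a mapping object.
const options = {
    styleMap: [
        "p[style-name='Heading 1'] => h1",
        "r[style-name='Emphasis'] => em"
    ]
};

// Assumed usage (requires mammoth and a real document):
// mammoth.convertToHtml({path: "document.docx"}, options)
//     .then(result => console.log(result.value));

console.log(options.styleMap.length);
```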
# 0.2.2
* Fix: exception when bad arguments are passed into unzip.openZip.
# 0.2.1
* Fix browserify support.
# 0.2.0
* Rename mammoth.style() to mammoth.styleMapping().
* Add support for document transforms.
* Append standard style map to custom style map unless explicitly excluded.