Releases: prohippo/pyelly
Fix Unicode Recognition Bugs
Handle Chinese Unicode Input
Generalize code to work optionally with Chinese Unicode text input. Before, all text input had to be in a Latin alphabet.
Small Change in English Suffix Rules
This is mostly a resync of GitHub code after moving to a new computer. After 40 years, suffix table rules are still incomplete.
More Adjustments for Handling Greek Letters
The use of Greek letters in chemical nomenclature requires changes in how the $ wildcard is matched in the PyElly FSA for syntactic typing of tokens. Update documentation.
Minor Bug Fix
A comma followed immediately by a Greek letter has to be recognized as a special case for breaking a token in ellyBuffer. Extend "chemic" rules and integration test. Update documentation.
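As an illustration of this special case, here is a minimal sketch that breaks a token wherever a comma is immediately followed by a Greek letter. It is not the ellyBuffer code: the function names `is_greek_letter` and `split_at_comma_greek` are hypothetical, and treating the comma as its own piece is an assumption about how the break is made.

```python
import unicodedata

def is_greek_letter(ch):
    """True for an alphabetic character whose Unicode name starts with GREEK."""
    return ch.isalpha() and unicodedata.name(ch, '').startswith('GREEK')

def split_at_comma_greek(token):
    """Break a token at each comma that is immediately followed by a Greek letter.

    Hypothetical helper: the comma is emitted as a separate piece, and
    commas followed by anything else are left inside the token.
    """
    parts = []
    start = 0
    for i in range(len(token) - 1):
        if token[i] == ',' and is_greek_letter(token[i + 1]):
            parts.append(token[start:i])   # text before the comma
            parts.append(',')              # the comma itself
            start = i + 1                  # resume at the Greek letter
    parts.append(token[start:])
    return [p for p in parts if p]         # drop empty pieces

print(split_at_comma_greek('α,β-unsaturated'))
print(split_at_comma_greek('1,2-diol'))
```

In this sketch a comma followed by a digit, as in `1,2-diol`, does not trigger a break; only the comma-plus-Greek-letter case does.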
Minor Bug Fix
Fix recognition of the Unicode prime character when setting bounds in the patternTable module for FSA matching. Extend "chemic" rules and integration testing. Update documentation.
Minor Fix in Morphological Matching
Suffix removal needs to be aware of previous prefix removal from a text input token for analyzing and rewriting. More rules for "chemic" example application and more examples for "chemic" integration testing. Update documentation.
Improve Handling of Commas, Error Reporting
This fixes problems discovered in identification of locants in chemical names. This requires that embedded commas be taken as a separate token. The "chemic" integration test was expanded. Documentation was revised.
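To show why separate comma tokens help with locants, here is a small sketch under stated assumptions. The helpers `tokenize_with_commas` and `leading_locants` are hypothetical and are not part of PyElly; the sketch also assumes the locants are the comma-separated items before the first hyphen of a name.

```python
import re

def tokenize_with_commas(text):
    """Split text so that every embedded comma becomes its own token."""
    # The capturing group keeps the commas in the result of re.split.
    return [t for t in re.split(r'(,)', text) if t]

def leading_locants(name):
    """Collect the locants that precede the first hyphen of a chemical name."""
    head, _, _ = name.partition('-')
    return [t for t in tokenize_with_commas(head) if t != ',']

print(tokenize_with_commas('1,2-dichloroethane'))
print(leading_locants('1,2,3-trichloropropane'))
```

Once the commas are separate tokens, each locant can be examined on its own instead of being buried inside one longer token.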
Fix Handling of Prefixes and Suffixes
Improvement and debugging of various PyElly modules to support a new example application to recognize structural chemical nomenclature in text ("chemic"). This requires using PyElly prefix and suffix analysis in new ways. The "chemic" rules and integration test continue to grow.
Reorganize Input Processing To Handle Prefix Morphological Rules
PyElly tokenization of input was improperly handling the recognition and splitting off of prefixes. This was mainly due to problems with the '+' character used to mark prefixes and separated roots. The problems showed up in the "chemic" example application, which has to find chemical nomenclature in various styles of text. The names are too numerous to list out fully, so recognizing them requires some analysis.
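The idea of splitting off a prefix and marking it with '+' can be sketched as follows. This is an illustration only: the prefix list, the minimum root length, and the function `split_prefix` are hypothetical, and placing the '+' at the end of the prefix is an assumption about the marking convention, not a description of the actual PyElly tokenizer.

```python
# Hypothetical multiplying prefixes, chosen only for this example.
PREFIXES = ('di', 'tri', 'tetra')

def split_prefix(token, min_root=3):
    """Split a known prefix off a token and mark it with a trailing '+'.

    Longer prefixes are tried first, and the split is skipped when the
    remaining root would be shorter than min_root characters.
    """
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        root = token[len(prefix):]
        if token.startswith(prefix) and len(root) >= min_root:
            return [prefix + '+', root]
    return [token]  # no prefix recognized; token is left whole

print(split_prefix('dichloro'))
print(split_prefix('tetrachloro'))
print(split_prefix('diet'))
```

The minimum root length is one simple guard against over-splitting ordinary words such as `diet`; a real rule set would need finer conditions.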