Fix Raw-Text Nodes: Support Alternative Syntaxes #40
Comments
Other Nodes too!
Remember that the four nodes mentioned above (i.e. the "raw nodes") are not the only ones that need to support these alternative syntaxes; as mentioned in #39:
So this also includes the [table_data (halign="C,L,R")
~~~
Position, Product, Price
-
1, Organic food, 12.50
2, Meditation lessons, 150.00
-
,,Total: 162.50
~~~
] So, right now I don't know exactly how many other nodes support these three alternative syntaxes beyond those explicitly mentioned in #39, but I only need to check the JSON tags file to find out. By the time I've finished with those four nodes there will probably be newer raw nodes introduced in the syntax.
Top-Priority Fix!
This is going to be a serious setback for the syntax development, because unless all three syntaxes are correctly supported and handled the editor will fail whenever it encounters them, breaking highlighting for the rest of the document. I've already experienced this when working on the source files of the official PML documentation, which freely use all the different syntaxes, and the results are disastrous: you simply can't work on the document, because all editing functionality breaks.
So even though "in theory" only the "official variation" should be used, in practice it doesn't work like that, since we encounter all variations in the official docs as well as in their examples.
Implementing all these nodes correctly should take precedence over implementing other, more common nodes right now, because unimplemented "simpler" nodes don't break the document: they are just highlighted as "unknown" nodes, which doesn't affect editing functionality. On the other hand, until these alternative syntaxes are supported there will always be the (quite real) risk that a valid PML document will cause Sublime PML to break its internal state, rendering it useless (especially when working with the official PML docs). |
Please note that the Text Block Syntax has been deprecated and will be removed in an upcoming major version. Hence, you only have to support the two remaining syntaxes: Delimited Text Syntax and Standard Text Syntax.
Yes, the JSON tags file is and will always be a reliable way to find out which PML nodes are of type
The Text Block Syntax will no longer be used in the next major version of the PML documentation. |
Thanks for pointing it out! It had slipped by me, so it's good to know. Also, it's good to have fewer variations in terms of editor support, because it somewhat reduces their implementation complexity. NOTE: The currently implemented raw nodes are all in the Text Block Syntax variation.
Any estimate on when the next MAJOR version will be? If we're talking about a couple of months then it makes sense to skip the deprecated syntax, but if it's longer than that it might be worth implementing and keeping it until the next MAJOR because of the document-wrecking consequences of its absence, but only if it's not too much work to keep them all! It's hard to estimate how complex their branching implementation is going to be until I get to work on them. The devil is always in the detail in these cases.
If the documentation is correct (i.e. if there are no parsing edge cases that contradict it) then the main branching point is between the Standard Text Syntax and the other two, since the former differs from the latter two in its inception line:
The Text Block Syntax is/was then identified by the absence of the fence delimiters, but other than that it is/was a close variation of the Delimited Text Syntax. In practical terms, when it comes to editor syntaxes with one-line RE-based definitions, the three syntaxes branched out as:
whereas by dropping the Text Block Syntax we'll now be left with just one branching point:
I think this can be done without backtracking, since the branching criterion is whether the opening tag is immediately followed by contents or by a new line, which means that the branching decision can be resolved within a single source line. If this is the case, then most TextMate-like editors should be able to support this feature, not just ST4, unless there are complications involved (e.g. due to edge cases, attributes, etc.). |
Today I posted the list of planned breaking changes in the next major version (and sent you an email too).
The branching criteria implemented in the PDML parser is as follows: If the node name is followed by (optional) spaces and/or tabs, followed by a new line character, then the 'Delimited Text Syntax' is used, otherwise it's the 'Standard Text Syntax'. |
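As an illustration of the rule just stated, here is a minimal Python sketch. It is not the PDML parser's code: the function name `classify_text_syntax` is hypothetical, and it assumes the caller has already consumed the node name and passes in the remainder of the opening line, including its line terminator. Treating `\r\n` as a new line, and treating end of input without a new line as Standard Text Syntax, are also assumptions.

```python
def classify_text_syntax(rest_of_line: str) -> str:
    """Classify what follows a raw-text node name on its opening line.

    `rest_of_line` is everything after the node name, up to and
    including the line terminator, if there is one (assumption).
    """
    # Skip the optional spaces and tabs that may follow the node name.
    remainder = rest_of_line.lstrip(" \t")
    # A new line right after the optional blanks selects the delimited form.
    if remainder.startswith("\n") or remainder.startswith("\r\n"):
        return "delimited"
    # Anything else (including end of input) is treated as standard text.
    return "standard"
```

For example, `classify_text_syntax("  \t\n")` selects the Delimited Text Syntax, while `classify_text_syntax(" some text]\n")` selects the Standard Text Syntax. The decision only ever looks at one source line.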
That's what I had in mind with "edge cases" here. But what about attributes, don't some of these nodes also support attribute groups? If yes, these need to be added to the equation too. I think that the best approach to handle these is a lookahead RegEx, just to ensure that a single RE is able to discern between the two possible branches at once. Once you know exactly how to branch, the rest can be handled in a fairly straightforward way.
But, as always, things are easier said than done, because of the pervasive nature of some nodes, e.g. comments and constants, which can basically occur anywhere since they are handled by the preprocessor. So it's important that the branching lookahead RE doesn't mismatch due to a comment or a constant.
Pre-processor nodes are tricky to handle because it's hard to pinpoint and foresee all their possible occurrences. During my side tests, I noticed that there's a wide margin in their real use, leaving us with 100% valid PML sources that are tricky for the editor to highlight correctly. Surely, in most cases end users won't resort to such exotic uses of comments and constants, but it's still a possibility within the realm of valid documents. |
Yes. I should have said: If the node name (and optional attributes) is followed by (optional) spaces and/or tabs, followed by a new line character |
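The corrected criterion, combined with the lookahead idea discussed earlier, could be sketched in Python as below. Everything here is an assumption made for illustration, not the PDML parser's or Sublime PML's actual pattern: the name `DELIMITED_OPENING`, the helper `uses_delimited_syntax`, the characters allowed in a node name, and the shape of the attribute group (a parenthesised group as in `[table_data (halign="C,L,R")`, with double-quoted values). It deliberately ignores comments and constants, which would need extra handling.

```python
import re

# Hypothetical pattern for the opening line of a raw-text node.
DELIMITED_OPENING = re.compile(
    r"""
    \[(?P<name>[A-Za-z_][A-Za-z0-9_]*)      # opening bracket and node name
    (?:[ \t]*\((?:"[^"\n]*"|[^)"\n])*\))?   # optional attribute group
    (?=[ \t]*\r?\n)                         # lookahead: blanks, then new line
    """,
    re.VERBOSE,
)


def uses_delimited_syntax(line: str) -> bool:
    """Return True if `line` opens a node in the delimited form.

    `line` must include its line terminator; otherwise the lookahead
    cannot see the new line and the result is False.
    """
    return DELIMITED_OPENING.match(line) is not None
```

Because the lookahead consumes nothing, a single expression decides the branch without backtracking across lines: `uses_delimited_syntax('[table_data (halign="C,L,R")\n')` is true, while `uses_delimited_syntax('[code (lang="py") x = 1]\n')` is false because text follows the attribute group on the same line.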
Done in version 4.0.0. |
Almost Done!
@pml-lang, I've almost finished implementing both Standard and Delimited Text Syntax for the raw-nodes currently supported by Sublime PML (i.e. only Support for Delimited Text Syntax has already been committed to As for the Standard Text Syntax, I've completed its implementation for the Anyhow, just wanted to let you know that handling both syntaxes turned out to be easier than expected, and didn't require context branching (so I assume it should be doable on any TextMate-like editor too). |
GREAT!
Yes, I just tried it out and the |
Unfortunately until I merge the fixes for the Standard Syntax all non-fenced raw blocks will fail. I was hoping to finish it tonight, but it's too late — the problem isn't tweaking the syntax, really, but updating the syntax tests, which need to cover all possible combinations, in order to catch any bugs before merging (if the syntax fix takes an hour, updating the tests takes two or three hours 😢). |
Done!
Now Sublime PML correctly supports alternative Text Syntaxes in raw-text nodes. I've had a chance to further polish and optimize the contexts to handle the dual syntax support, which I now only need to replicate on the remaining raw-nodes which are missing in Sublime PML (i.e. |
Implement `[input` and `[output` raw-text block nodes (see #40).
As it turned out (see Discussion #39) all PML raw text block-nodes accept multiple syntaxes, which are not documented in the PML Reference Guide (only in the PDML documentation).
So I'll have to fix all currently implemented raw-text nodes to ensure they correctly support all three variations:
[code
[html
As for the remaining raw-text nodes, I'll implement them with all three variations right away when I add them to the syntax:
[input
[output
References
raw_text