This tool performs quick mutations inside large XML files (2000 MB+). It uses streams and pipelines for the best performance. Performance could be explored further; however, the current results are satisfactory.
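As a rough illustration of the stream-and-pipeline approach, here is a minimal sketch built on Node's `stream` module. It is not the tool's actual code: `createMutation` and `run` are hypothetical names, and the upper-casing transform is only a placeholder for a real XML mutation.

```javascript
const { Readable, Transform, Writable } = require('node:stream');
const { pipeline } = require('node:stream/promises');

// Hypothetical transform: upper-cases each chunk as a stand-in for a real mutation.
function createMutation() {
  return new Transform({
    transform(chunk, _encoding, callback) {
      callback(null, chunk.toString('utf8').toUpperCase());
    },
  });
}

// Runs source -> mutation -> sink and resolves with the collected output.
async function run(chunks) {
  const collected = [];
  const sink = new Writable({
    write(chunk, _encoding, callback) {
      collected.push(chunk.toString('utf8'));
      callback();
    },
  });
  // pipeline() wires the streams together and propagates errors and backpressure.
  await pipeline(Readable.from(chunks), createMutation(), sink);
  return collected.join('');
}
```

For real files, the in-memory source and sink would typically be replaced with `fs.createReadStream` and `fs.createWriteStream`, so that only a small part of the file is held in memory at any time.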
Possible improvements:
- Finding and using the right XML parser,
- Unit tests,
- Concurrent processing,
- Caching `opening_times` values and returning results immediately, without parsing the same values into JSON again,
- Reducing string creation and Buffer-to-string conversions,
- Fixing issues and limitations,
- Rewriting in Rust?
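The caching idea from the list above could be sketched roughly as follows. This assumes the content of `opening_times` is a JSON string. `parseOpeningTimes`, `parsedCache` and `parseCount` are hypothetical names, not part of the tool.

```javascript
// Cache keyed by the raw opening_times text, so identical strings are parsed only once.
const parsedCache = new Map();
let parseCount = 0; // Only here to make the effect of the cache observable.

function parseOpeningTimes(raw) {
  const cached = parsedCache.get(raw);
  if (cached !== undefined) return cached; // cache hit: skip JSON.parse
  parseCount += 1;
  const parsed = JSON.parse(raw);
  parsedCache.set(raw, parsed);
  return parsed;
}
```

Only the parsed value is cached, not the final `is_active` result, because that result depends on the current time. The `Map` in this sketch is unbounded, so a feed with many distinct values may need a size limit.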
Requires Node.js version 18+
git clone https://github.com/kamilkodzi/xml-task.git
cd xml-task
npm install
Run `npm start` to process the example file `data/feed.xml`.
- Time zones are not taken into account; all times are converted to UTC,
- If the `<opening_times>` node does not exist inside its parent node, then `is_active = false` is populated,
- If `<opening_times>` contains `{"opening":"00:00","closing":"00:00"}`, then it is considered active all day long.
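The assumptions above can be expressed as a small function. This is a sketch only: it assumes `opening_times` holds a single `{ opening, closing }` pair in `HH:MM` format, which may be simpler than the real feed. `toMinutes` and `isActive` are hypothetical names.

```javascript
// Converts "HH:MM" to minutes since midnight.
function toMinutes(hhmm) {
  const [hours, minutes] = hhmm.split(':').map(Number);
  return hours * 60 + minutes;
}

// openingTimes: assumed shape { opening: "HH:MM", closing: "HH:MM" },
// or undefined when the <opening_times> node is missing.
function isActive(openingTimes, now = new Date()) {
  if (!openingTimes) return false; // missing node -> is_active = false
  const opening = toMinutes(openingTimes.opening);
  const closing = toMinutes(openingTimes.closing);
  if (opening === 0 && closing === 0) return true; // 00:00-00:00 -> active all day
  const current = now.getUTCHours() * 60 + now.getUTCMinutes(); // UTC only
  return current >= opening && current < closing;
}
```

Ranges that cross midnight (for example 22:00 to 02:00) are not handled in this sketch.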
Could be fixed in further iterations:
- Whenever CDATA contains tag-like data, the data will be split incorrectly,
- Whenever there is CDATA inside a node that contains `<opening_times>`, the data may be misinterpreted.
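To illustrate why CDATA can cause incorrect splitting, the sketch below assumes a naive splitter that cuts the input at every occurrence of a closing tag. The tool's real splitting logic may differ. `splitOnClosingTag` and the `<item>` and `<name>` tags are hypothetical.

```javascript
// Naive splitter: cuts the input after every occurrence of the closing tag text,
// without checking whether that text sits inside a CDATA section.
function splitOnClosingTag(xml, closingTag) {
  const parts = [];
  let start = 0;
  let index = xml.indexOf(closingTag, start);
  while (index !== -1) {
    const end = index + closingTag.length;
    parts.push(xml.slice(start, end));
    start = end;
    index = xml.indexOf(closingTag, start);
  }
  if (start < xml.length) parts.push(xml.slice(start)); // trailing remainder
  return parts;
}

// Two real items: split correctly into two parts.
const plain = '<item><name>A</name></item><item><name>B</name></item>';
// One real item whose CDATA contains "</item>": wrongly split into two parts.
const withCdata = '<item><name><![CDATA[x </item> y]]></name></item>';

const plainParts = splitOnClosingTag(plain, '</item>');
const cdataParts = splitOnClosingTag(withCdata, '</item>');
```

A fix would need to track whether the current position is between `<![CDATA[` and `]]>` and ignore tag-like text found there.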