add support for file-like object parsing #82
Hey @rAndrewNichol, thank you for your pull request! Greetings,
There are some issues with this pull request:

1. is_xbrl flag

I tried your code with the same submission

2. Fundamental problem with SEC submissions parsed via StringIO

In XBRL reporting, a distinction is made between two different disclosure strategies, the open reporting cycle and the closed reporting cycle:

Closed Reporting Cycle:

Open Reporting Cycle:

Here comes the problem:

    <link:schemaRef xlink:href="aapl-20180929.xsd" xlink:type="simple"/>

But if you only give the instance document as StringIO to the parser, it won't be able to find this taxonomy. (Where should it search?)

    elif isinstance(instance_path, IOBase):
        taxonomy: TaxonomySchema = parse_taxonomy(instance_path, cache)

So you are passing the

    if concept_name in tax.name_id_map:

This means that the concepts of the extension taxonomy, and thus all facts that were tagged with these concepts, are not parsed and are ignored. You can also see this by comparing the output of the current existing function

Possible solutions
@rAndrewNichol Best regards,
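To make the "where should it search?" point concrete, here is a minimal sketch of how a relative `schemaRef` href depends on knowing where the instance document lives. The helper `resolve_schema_ref` and the `example.com` URL are hypothetical and are not part of this package; only `urljoin` and `urlparse` from the standard library are real. It is an illustration of the problem, not the parser's actual resolution logic.

```python
from typing import Optional
from urllib.parse import urljoin, urlparse


def resolve_schema_ref(href: str, instance_url: Optional[str]) -> Optional[str]:
    """Hypothetical helper: resolve a schemaRef href against the instance location."""
    if urlparse(href).scheme:
        # The href is already absolute (e.g. a standard taxonomy URL).
        return href
    if instance_url is None:
        # No base location is known, as with a bare StringIO: nothing to resolve against.
        return None
    # A relative href is resolved against the instance document's own URL.
    return urljoin(instance_url, href)


href = "aapl-20180929.xsd"
instance_url = "https://example.com/filings/aapl-20180929.xml"  # assumed location

resolved_with_base = resolve_schema_ref(href, instance_url)
resolved_without_base = resolve_schema_ref(href, None)

print(resolved_with_base)     # full URL of the extension taxonomy
print(resolved_without_base)  # None: the extension taxonomy cannot be located
```

With a path or URL the extension taxonomy can be located next to the instance document; with only the text content, the relative href has nothing to be resolved against.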
In my application it doesn't make sense to store the data locally since my disk is ephemeral. I also don't pull directly from the web, for concurrency and other reasons. Rather, I pull it directly from cloud object storage (S3) and parse the results.
For that reason I wanted to be able to use StringIO to parse the text content directly. The package did not provide any such support.
usage:
Since there is no simple way to infer the file format explicitly from a StringIO object (from a file object you could simply use file_obj.name), I decided it would be best as a separate method with an is_xbrl parameter for the user to explicitly specify whether it is XBRL or other (iXBRL).

This had a lot more places to change than I had originally expected, but in the end it works pretty seamlessly.
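As a small illustration of why the format has to be passed explicitly, the sketch below compares a `StringIO` buffer with a regular file object using only the standard library. The file name `filing.htm` and the sample contents are made up for the example.

```python
import io
import os
import tempfile

# An in-memory text buffer carries no file name, so there is no extension to inspect.
buffer = io.StringIO("<xbrl/>")
buffer_name = getattr(buffer, "name", None)

# A regular file object exposes the path it was opened with via .name.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "filing.htm")  # made-up file name
    with open(path, "w+") as file_obj:
        file_obj.write("<html/>")
        file_ext = os.path.splitext(file_obj.name)[1]

print(buffer_name)  # None
print(file_ext)     # .htm
```

Because the buffer has no name to inspect, the caller is the only one who knows which format it holds, which is what the explicit flag is for.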