-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fragment locations via # in private repos have 1MB limit #156
Comments
Did the I did a fix related to this for the latest 0.2.10 release, so I also suggest making sure you have the latest volo to try that command (type |
When I run the add command, it pulls in the entire contents of the repo - so the app.js, the package.json, the lib files, everything. I'll check that I have the latest version and report back, |
Yes, I'm running 0.2.10. This only appears to be an issue with our private repos - with a public repo, it does exactly as you'd expect. Is there a simple check to verify that the download URL for the zip file is resolving to what I'd expect, or would a network trace be the best way? Or a debug flag that I could turn on to verify what URL is being returned from github for the download? |
volo should print out what zip file will be used. I just tried this, and I see volo saying:
with a long strings of characters after that sso part. I was able to successfully install something I put in that private repo as dist/test.zip. So I did In your case, is it just a private repo on github.com or is this something served from a github enterprise installation? Mine was just on regular github.com. |
Looks like I've had a fundamental misunderstanding of what volo is doing when I give it a repo name. I expected that by setting the volo.url property in package.json, that URL would be returned and used when doing a volo search or volo add. Now that I see that is not used at all, I'm unclear as to what that property is there for, unless it's just metadata? So there is no bug here, it's just been my misunderstanding of what is happening. I can point directly to github:Org/repo#path/to/file.zip and have that pulled correctly. I had assumed that by setting the volo.url property in package.json, I was avoiding the need to tell consumers of our repo they needed that direct url. Is it possible to have search check for this property, so that we can make use of this metadata in this way? (edit - these are on github.com). |
Now further confused - we maintain a public github repo (fuelux) that does exactly what I'm trying to do - specifies volo.url in the package.json that points directly to a specific zip file inside the repo. When I run volo add fuelux, it grabs that specific file. So it appears this is being read somewhere? The package.json volo.url directive that works: When I run volo add fuelux, I get exactly what I expect, the zip file in dist/. But trying the private repo we have, it can't grab the file specified in package.json unless I give it the exact file to get (volo add private-repo#dist/file.zip) I'm assuming it's because of it being a private repo, but is there something I'm missing? |
Dug around some more and I believe I now know why private repos are being treated differently. See here: The check for a package.json file is being done against the raw github URL, which will not properly resolve for a private repository. This drops the fetchBrowserPackageJson function into the error handler, where it's silently ignored. For a public repository, this will resolve, and the package.json file is fetched, parsed, and the specific URL is correctly used for downloading the dependency. I'm not as familiar with the code base as I would like to be (or with the Github API) but it seems like it should be possible to fallback to a second check that would attempt a direct download the package.json file if the raw attempt failed. I can't tell if the raw.github urls are only allowed for private repos if you pass the username and pw in the query parameters - a quick test with the token volo was using still failed. Not sure where something like this falls on the priority list, but will do a pull request to give an error message indicating that the volo.url property cannot be used in this case. Would be happy to contribute some more code along the lines of a fallback attempt if you have any info about Github API and authentication that might apply here. |
Thanks for digging into this! I am going on a trip and will not have good turn-around on communication for about a week or so, but I am hoping to do some volo work at some point on the trip, and will be looking into this more, and your pull request. |
I did some preliminary work on this, and I'm going to do a pull request so you can see it. However, the only part that works is getting the contents of package.json using the API. Where things will fail is on getting the actual specified file, since the API only allows files up to 1 MB. What may be preferred is to use the first part of what I did but then download the entire repo, and use volo commands to delete all but the file specified in the volo.url property. Thanks for your work creating volo, and appreciate your time on this issue. Doing pull request now. |
So following up with from our in person conversation: Private repos have a 1MB direct file access limit. So, what we might need instead: Do the normal work we do now. If the path fragment is for a file that is bigger than 1MB, then download the full zip of the repo, then extract the file that way. If I mis-remembered the conversation, please let me know. Since I think we still have some discussion on this to do, and I believe you wanted to try the pull request, I am going to push this to a 0.3.1 release, as there are some fixes I want to get out more immediately for 0.3.0. We can do a follow-on 0.3.1 release fairly quickly if we work out the details in the near future. |
Not a problem at all. I didn't get to do any coding on this yet, and was going to reply today. I would stil like to do the pull request. I think there was a slight difference in the flow that we talked about:
How does that sound? |
It should be doing this already:
First, we should confirm we need to do this special path by checking the size of the file. Right now the code does a HEAD request to confirm the file really exists at the location around here: I am hopeful that we just need to also check a Content-Length to know if we need the special path. Although, looking at that code segment right now, it assumes that there is either a fragment (a #something.zip) or a volo.url, but not the possibility that the volo.url is for a fragment in the repo. So perhaps do some normalization around here: that inspects the volo.url to see if it really is just a fragment inside the repo? Hmm, I am also assuming the use case we are trying to fix is "volo.url for a fragment in the current repo". If I missed something with that definition of the use case, give a holler. So after determining if the file will be too big, I think it may be enough then to modify the results of the resolve operation to url: 'zip of repo', isArchive: true and set the fragment. Then hopefully the rest sorts itself out? I'm going on incomplete memory of the use case though, and unfortunately will be relying more on you to remind me than to do the research myself.
I wonder if it should just keep doing the volofile work even in this case. In particular, I think the only thing it will do is look for an "onAdd" command in the volofile. That still may be useful in this case too?
Hopefully if we just readjust the results from the resolve module, this will fall out without any further changes. |
For the first point about making the authenticated call, the code at I think you are correct in identifying the case here - it's really that we want a fragment, but the volo.url isn't in that form. Also think you're correct in saying that we should normalize to make that the case - this give more flexibility and allows the code to follow the correct path vs the hackish one I was having to do. Since from experience I know that we can download the entire repo already with no issues, it should only be the initial call to get package.json that needs to change to be against the API. Later calls should function correctly as they do now. And it means less changes if we do as you suggest - just change to a fragment if that's what we really have and the rest should sort itself out. I do see where you already check for the onAdd command and run that if needed So I think the steps would be as below:
This is much simpler than what I had planned. Thanks for putting the time into the discussion. If this sounds good, I'll work getting it done. |
Did some work on this, but going to walk back on the plan. What this really boils down to is that Github doesn't expose a raw URL for private repos in the same manner it does for public repos. So I can't put a volo.url in place for a private repo that actually works, because there isn't any such thing - the only thing that appears to work is using the zipball/tarball API url to get the entire repo, and then cleaning up afterwards with a volofile. I'm digging around more to try and find out if there is any sort of direct URL for downloading individual files from a private repo. A few tests show that cookie based authentication works using a url like: https://github.com/org/repo/blob/master/dist/file.zip?raw=true but not oAuth tokens. So GH effectively limits the download to web-based UAs only. Best short term plan might be to update docs to indicate volo won't work correctly for private URLs because of the way GH treats them, but note that using a volofile will work because volo falls back to pulling the zipball/tarball, and from there you can clean up. I'm looking at whether the GH API for downloads might be a workaround here - if you can create a download for a repo, then perhaps that file can be specified for the volo URL. That would let me point to a specific file in the exact same way a public URL works. |
Just found this out - you can get a direct download for any file using this syntax: http(s)://[token]@[hostname]/[owner]/[repo]/raw/[branch]/[path_to_file] Tested via curl with the token from volo and it works. So private repos can specify the normal format and volo should be able to work successfully. Only change needed would be the authenticated call to get package.json as well as the URL for any of the HEAD requests. Since the authenticated call to package.json returns a field indicating private, this can be the flag to change subsequent URLs to this format. Will take another crack at this soon. |
Now it appears that I misunderstood the use of this sytax. It may only work for actual git operations, not pure http downloads. If that is verified, then I'm going to close this entire issue and just work up a grunt plugin that will create the volofile for me, so that I can automate removing any extra directories. |
I got back confirmation from GH support that only git operations over https with this format using oauth tokens are supported. The last support person did recommend using the blobs API - http://developer.github.com/v3/git/blobs/. For the moment, I'm going to pursue a workaround route of a grunt task to update a volofile which will remove all contents but the directory I want. I'll revisit this again as time permits. |
Thanks for diving into this more. I wonder, to get around the issue of not being able to construct a private URL, what if volo.url allowed relative URLs? So, if volo.url = 'dist/something.zip', then it would resolve it relative to whatever volo calculates as the repo URL, and if a private repo, then just download the whole repo zip and extract that file (treat it like a fragment). Sounds like you have a workaround for now, but when I get back to a volo dev cycle, I would like to be able to sort it out. |
That might be the best method. Since there is no real URL that is a valid URL for private repos. Would still need the change to the initial call to get package.json so that it uses the API to pull that specific file, but then you'd also be able to check the repo for being private at that point. So you could support relative URLs either way, but if the private flag is on, you'd probably be able to skip some steps since the only path is a full zip download. I will probably do a grunt workaround for now to generate a volofile and keep my temporary workaround in place, but will try to get back to this again as well. |
Our team has several private Github repos that follow this pattern:
It appears that volo and/or the GitHub API don't allow a deeplink into the project to get that zip file, even when it's listed as the volo url in package.json. No matter what syntax I try on both the consuming app (where I run volo add) and the linked app (where I set the volo url property), it always pulls in the entire repo contents.
However, I did find your comments on using a volofile with onAdd to run some actions after install. I was able to use that to clean up all extra files that I didn't want, effectively mimicing the direct download.
If this is something that is a known issue and you are ok with documenting this as workaround, I'll do a fork and pull request to update the docs.
The text was updated successfully, but these errors were encountered: