File name too long #4

Open
epicfaace opened this issue May 16, 2019 · 1 comment

@epicfaace

CI script fails here:

Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.7.2/x64/lib/python3.7/shutil.py", line 563, in move
    os.rename(src, real_dst)
OSError: [Errno 36] File name too long: '1984/05/24/MODSMD_ARTICLE28.article.txt' -> '19xx/198x/1984y/05m/24d/MODSMD_ARTICLE28.RVZFTlRTIEVWRU5UUyBFVkVOVFMgRVZFTlRTIEVWRU5UUyBFVkVOVFMgRVZFTlRTIEVWRU5UUyBFVkVOVFMgRVZFTlRTIEVWRU5UUyBFVkVOVFMgRVZFTlRTIEVWRU5UUyBFVkVOVFMgRVZFTlRTIEVWRU5UUyBFVkVOVFMgRVZFTlRTIEVWRU5UUyBFVkVOVFMgRVZFTlRTIEVWRU5UUyBFVkVOVFMgRVZFTlRTIEVWRU5UUyBFVkVOVFMgRVZFTlRTIEVWRU5UUyBFVkVOVFMgRVZFTlRTIEVWRU5UUyBFVkVOVFMgRVZFTlRTIEVWRU5UUyBFVkVOVFM=.article.txt'

Possible solutions:

  • encode the title in a folder name instead of the file name, e.g. 19xx/198x/1984y/05m/24d/{title in Base64}/{file name}.txt
  • truncate the file name to its first 50 characters
  • "Linux has a maximum filename length of 255 characters for most filesystems (including EXT4), and a maximum path of 4096 characters." (the file name in the example above is 365 characters long)
  • The above file name, base-64-decoded, is actually "EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS EVENTS" -- let's fix the parsing haha
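
A minimal sketch of the truncation option, in case it helps. The helper name `safe_article_name`, its parameters, and the use of URL-safe Base64 are assumptions for illustration and are not taken from the CI script. One caveat: the 255-character limit applies to each path component, so moving the full Base64 title (336 characters in the example above) into a folder name would probably hit the same error unless the title is shortened first.

```python
import base64

# Typical per-component limit on Linux filesystems such as ext4 (in bytes).
NAME_MAX = 255


def safe_article_name(article_id, title, suffix=".article.txt", max_title_chars=50):
    """Build '<article_id>.<Base64 title><suffix>' from a truncated title.

    Hypothetical helper: the title is cut to `max_title_chars` characters
    before encoding, so the encoded part stays short.
    """
    short_title = title[:max_title_chars]
    # URL-safe alphabet avoids '/' in the encoded text (an assumption here).
    encoded = base64.urlsafe_b64encode(short_title.encode("utf-8")).decode("ascii")
    name = f"{article_id}.{encoded}{suffix}"
    # Multi-byte titles or a large max_title_chars can still exceed the limit.
    if len(name.encode("utf-8")) > NAME_MAX:
        raise ValueError(f"file name is {len(name)} characters, limit is {NAME_MAX}")
    return name


if __name__ == "__main__":
    title = " ".join(["EVENTS"] * 36)  # the mis-parsed title from the traceback
    print(safe_article_name("MODSMD_ARTICLE28", title))
```

Truncating before encoding keeps the name decodable; cutting the Base64 string itself could leave an invalid encoding. Truncated titles can collide, so a real fix might also need a disambiguating suffix.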
@epicfaace

It turns out that the corresponding METS file (https://s3.amazonaws.com/stanforddailyarchive/data.2013-oct/data/stanford/1984/05/24_01/Stanford_Daily_19840524_0001-METS.xml) has the wrong title here:

image
