Parse Apple HEIC files, including EXIF and other metadata. #249
@DrJohnMelville this looks great, thanks very much. I'll take a look now. Hopefully I can get through it all tonight.
@DrJohnMelville I played around with this for a while and have made some changes on top of it, so please don't push any more commits as it'll be likely to conflict. I'm close to merging this but need to fix an issue in It's late here now. I hope to pick this up again tomorrow. |
The Available method is used to detect when I am in the last box of the file. Usually, if the parser for a box does not consume all of its space, the parser jumps to the end of the box. This may be less necessary now that all the box parsers are written, but it still makes the code more resilient against boxes that we do not know how to parse. The last box in the file is the media stream. We cannot skip over the last box in the file when we parse it, because some resources (most notably the thumbnail and the EXIF metadata) are stored in the media stream, and the previous boxes just give us offsets into the media stream for those items. Thus, if we skip over the final box, 1) we read in a lot of data we do not care about, which consumes time and memory, and 2) we have to "back up" the stream to get the thumbnail or EXIF metadata, and not all streams are seekable. Previously I made that decision by not reading in the 'mdat' box. Some of the Nokia conformance images, however, store their media data in boxes other than mdat, but always in the last box of the stream. If the total length of the stream is not known, I could probably come up with a small list of box types that hold the media data in the Nokia conformance images, which would probably work for most of the real images out there.
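The forward-only walk described above can be sketched roughly as follows. This is an illustrative Python sketch, not the code from this pull request: the names walk_boxes, MEDIA_BOX_TYPES and make_box are hypothetical, the parser callbacks are assumed to return the number of bytes they consumed, and only plain 32-bit box sizes are handled.

```python
import io
import struct

# Assumption for this sketch: media data lives only in 'mdat' boxes.
MEDIA_BOX_TYPES = {b"mdat"}


def make_box(box_type, payload):
    """Build a box with a 32-bit size field (hypothetical helper for the demo)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def walk_boxes(stream, parsers):
    """Read top-level boxes from a forward-only stream.

    Each parser is called as parser(stream, payload_size) and must return
    how many payload bytes it consumed. Unconsumed bytes are read and
    discarded, so unknown or partially parsed boxes are skipped safely.
    The walk stops at the media box and leaves the stream positioned at
    the start of its payload, so nothing has to be re-read later.
    """
    seen = []
    while True:
        header = stream.read(8)
        if len(header) < 8:
            break  # end of stream, or a truncated header
        size, box_type = struct.unpack(">I4s", header)
        seen.append(box_type)
        if box_type in MEDIA_BOX_TYPES:
            break  # do not skip over the media data
        if size < 8:
            # size 0 and size 1 have special meanings; not handled here
            raise ValueError("unsupported box size: %d" % size)
        remaining = size - 8
        parser = parsers.get(box_type)
        if parser is not None:
            remaining -= parser(stream, remaining)
        while remaining > 0:  # discard by reading; no seek required
            chunk = stream.read(min(remaining, 65536))
            if not chunk:
                raise EOFError("stream ended inside a box")
            remaining -= len(chunk)
    return seen


def parse_ftyp(stream, payload_size):
    """Read only the 4-byte major brand and leave the rest unconsumed."""
    parse_ftyp.brand = stream.read(4)
    return 4


data = (make_box(b"ftyp", b"heic" + b"\x00" * 4)
        + make_box(b"free", b"\x00" * 4)
        + make_box(b"mdat", b"hello"))
stream = io.BytesIO(data)
boxes = walk_boxes(stream, {b"ftyp": parse_ftyp})
media = stream.read()
```

After the walk, the stream sits at the first byte of the media payload, so offsets recorded by the earlier boxes can be resolved by reading forward from there.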
Thanks for the extra context here. The Instead, some knowledge of the length is needed. However, for certain kinds of streams (e.g. network streams) knowing the length is not possible without consuming and buffering the stream's entire contents, which is obviously not great either. In these cases we generally try to terminate processing based on other information at hand. I'll take a look at this and the other usage of |
So I lied. All of the files in the Nokia conformance set end with the media stream in an mdat box. It may have been some other QuickTime files out of the image set that I was incorrectly recognizing as HEIC files. I think that if you replace:
with
it would probably still work. (Note that I still had the old code in a comment. My bad.)
I also lied :) I spouted what the docs said, not what the code says. Still, Thank you for the updated code. I'll investigate further. |
@DrJohnMelville thanks for your patience here. This is now merged and I'll get a new release out with this support shortly. Thank you again for a great contribution!
Awesome - looking forward to the new NuGet package :)
@tipa could you help test this code please? See #231 (comment).
[NB. This is my first pull request ever, on any project. My apologies in advance if I have done it incorrectly.]
This is a large pull request, but the vast majority of the code is new code to parse the ISOBMFF box format (including a variety of minor modifications Apple made in the HEIC format).
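For readers unfamiliar with the box format, a box header can be decoded roughly as in the sketch below. It is an illustrative Python sketch rather than the parser added by this pull request, and the function name parse_box_header is hypothetical. It relies on the ISO base media file format rules that a size of 1 means a 64-bit size follows the type, and a size of 0 means the box runs to the end of the file.

```python
import struct


def parse_box_header(data, offset=0):
    """Decode one box header from a bytes object.

    Returns (box_type, header_length, total_size). total_size is None
    when the size field is 0, meaning the box extends to the end of
    the file.
    """
    # 32-bit big-endian size followed by a 4-byte type code
    size, box_type = struct.unpack_from(">I4s", data, offset)
    header_length = 8
    if size == 1:
        # 64-bit "largesize" follows the type field
        (size,) = struct.unpack_from(">Q", data, offset + 8)
        header_length = 16
    elif size == 0:
        size = None
    return box_type, header_length, size


small = parse_box_header(struct.pack(">I4s", 24, b"ftyp"))
large = parse_box_header(struct.pack(">I4sQ", 1, b"mdat", 2 ** 33))
to_end = parse_box_header(struct.pack(">I4s", 0, b"mdat"))
```

The size-0 case is worth noting in the context of this discussion, since a final media box written that way cannot be skipped without knowing the stream length.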
I ran the regression test against the image database, and the only regressions occur in HEIC files (which used to be incorrectly identified as QuickTime files). All of the diff outputs appear to be appropriate changes incidental to parsing HEIC files. Notably, a few of the files which cause errors in the Java version now parse to errors instead of being incorrectly recognized as valid QuickTime files.
I also downloaded the Nokia HEIC conformance images, which I did not add to the repository because of copyright concerns. The software parses all of the images in the Nokia conformance set with no errors and reasonable-looking output for each file.
Here is the diff from the regression test: