RL Vision Knowledge Base

Support questions and answers for software by RL Vision.

Note: This is an archived discussion. Any bug, problem or suggestion mentioned here is likely to have been fixed since it was written.

Subject: Re: Info on Google's PDF Book Format

Date: Tue, 07 Apr 2009 19:14:32 +0200

Very interesting. That might very well explain why it is not working. I will have to dig deeper into this to find out more. Depending on where the problem lies, I might be able to fix it for future versions!

thanks!

Dan


Bob at Spiced Art wrote:
> Hi Dan,
> FYI, here is some info on how Google puts its PDF book files together. I am not sure, but it seems they encode the images with JPEG 2000 or JBIG2 compression, which some PDF Image Extraction Wizards may not be capable of retrieving. As I understand it, Adobe supports those image types in its PDF viewers.
>
> http://www.imperialviolet.org/binary/google-books-pdf.pdf
>
> I may have misunderstood the article, but thought you might be interested.
> Regards, Bob
>
> ----- Original Message -----
> From: Dan
> To: Bob
> Sent: Sunday, April 05, 2009 8:03 AM
> Subject: Re: extracting images from pdf problem
>
>
> I have tried to analyze the pdf but I can't seem to figure out what's wrong. Tell me, has this happened to you with any files other than this one? If it hasn't, it might be that the image was saved in a corrupt state, so that my program skips it.
>
> // Dan
>
>
>
> Bob at Spiced Art wrote:
>> Here is the file. Thank you.
>>
>> ----- Original Message -----
>> From: Dan
>> To: Bob
>> Sent: Sunday, April 05, 2009 12:06 AM
>> Subject: Re: extracting images from pdf problem
>>
>> Hi,
>>
>> There is a lot of difference in how different pdf writers save their files, and sometimes this causes trouble when trying to extract data from them. I could tell you more if you send me the pdf or a link to it so I can see myself what is happening.
>>
>> // Dan
>>
>>
>>
>> Bob at Spiced Art wrote:
>>> Hi Dan,
>>>
>>> I am testing your PDF Image Extraction Wizard program. (not yet registered)
>>> I downloaded a book in pdf format from Google Books.
>>>
>>> When I view the file with Adobe Reader, I am looking at a page that Adobe says is page 7. There is a large color image on the page and a caption below the image in black and white.
>>>
>>> When I use your program and specify a page range of 7 to 20 pages, one of the three bmp images extracted shows only the full page 7 with the caption in black and white but the color image is not on the page.
>>>
>>> I am not familiar with how pdf files store information, but I am wondering if there is a problem with the software or if the color image is actually stored on another page of the pdf file?
>>>
>>> Any help is appreciated.
>>> Best wishes and regards, Bob
>>> Spiced Art Studio
>>>
>>>
>>>
>>
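Bob's guess about JPEG 2000 or JBIG2 compression can be checked roughly by looking at which stream filters a PDF declares. The following is a minimal sketch in Python, not the method used by PDF Image Extraction Wizard. It assumes the image dictionaries are stored uncompressed in the file (PDFs that pack their objects into compressed object streams will hide them), and the function name count_image_filters is hypothetical. The filter names themselves (DCTDecode, JPXDecode, JBIG2Decode, CCITTFaxDecode) are the standard ones from the PDF specification.

```python
import re
from collections import Counter

# Standard PDF filter names used for image streams:
#   DCTDecode      = JPEG
#   JPXDecode      = JPEG 2000
#   JBIG2Decode    = JBIG2 (bi-level images, e.g. scanned text)
#   CCITTFaxDecode = CCITT fax (bi-level images)
IMAGE_FILTERS = (b"DCTDecode", b"JPXDecode", b"JBIG2Decode", b"CCITTFaxDecode")


def count_image_filters(pdf_bytes: bytes) -> Counter:
    """Count how often each image filter name appears in raw PDF bytes.

    This is a heuristic byte scan, not a PDF parser: it only sees
    filter names that are stored uncompressed in the file.
    """
    counts = Counter()
    for name in IMAGE_FILTERS:
        # Match the PDF name token, e.g. "/JPXDecode", ending at a word boundary.
        pattern = rb"/" + name + rb"\b"
        counts[name.decode("ascii")] = len(re.findall(pattern, pdf_bytes))
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_filters.py book.pdf
    with open(sys.argv[1], "rb") as f:
        for filter_name, n in count_image_filters(f.read()).items():
            print(f"{filter_name}: {n}")
```

If a file reports JPXDecode or JBIG2Decode entries but no DCTDecode ones, an extractor that only understands JPEG streams would plausibly skip those images, which would be consistent with the missing color image described above.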

