python - Am I correctly extracting JPEG binary data from this mysqldump? -


I have an old .sql backup of a vBulletin site that I ran around 8 years ago. I am trying to see the file attachments that are stored in the DB. The script below extracts them all, and I verified that they are JPEG by hex dumping them and checking the SOI (start of image) and EOI (end of image) bytes (FFD8 and FFD9, respectively) according to the JPEG wiki page.

But when I try to open them with evince, I get this message: "Error interpreting JPEG image file (JPEG datastream contains no image)"

What could be going on here?
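For reference, the marker check described above amounts to roughly the following. This is a minimal sketch; the function name `has_jpeg_framing` is made up for illustration and is not part of the script below.

```python
SOI = b"\xff\xd8"  # start-of-image marker
EOI = b"\xff\xd9"  # end-of-image marker


def has_jpeg_framing(data):
    """Return True if the bytes begin with SOI and end with EOI."""
    return data[:2] == SOI and data[-2:] == EOI
```

Passing this check only shows that the first and last two bytes are in place. It says nothing about the bytes in between, which is consistent with a viewer still rejecting the file.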

Some background info:

  • The sqldump is around 8 years old
  • vBulletin 2.x was the software that stored the info
  • Most likely PHP 4 was used
  • Most likely MySQL 4.0, possibly 3.x
  • The column datatype these attachments are stored in is MEDIUMTEXT

My Python 3.1 script:

    #!/usr/bin/env python3.1

    import re

    # Strip the leading part of each INSERT statement, up to the filename.
    trim_l = re.compile(br"""^INSERT INTO attachment VALUES\('\d+', '\d+', '\d+', '(.+)""")
    # Strip the two trailing numeric columns and the closing parenthesis.
    trim_r = re.compile(br"""(.+)', '\d+', '\d+'\);$""")
    # Split what remains into (filename, file data).
    extractor = re.compile(br"""^(.*(?:\.jpe?g|\.gif|\.bmp))', '(.+)$""")

    with open('attachments.sql', 'rb') as fh:
        for line in fh:
            data = trim_l.findall(line)[0]
            data = trim_r.findall(data)[0]
            data = extractor.findall(data)
            if data:
                name, data = data[0]
                try:
                    filename = 'files/%s' % str(name, 'utf-8')
                except UnicodeDecodeError:
                    # Skip attachments whose filename is not valid UTF-8.
                    continue
                with open(filename, 'wb') as ah:
                    ah.write(data)

Update: The JPEG wiki page says FF bytes are section markers, with the next byte indicating the section type. I see some that are not listed in the wiki page (specifically, I see a lot of 5C bytes, so FF5C). But that list is of "common markers", so I'm trying to find a more complete list. Any guidance here would be appreciated.
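One thing worth checking, offered as an assumption rather than a diagnosis: 0x5C is the ASCII backslash, and mysqldump normally writes string values using MySQL's backslash escapes (`\0`, `\n`, `\r`, `\'`, `\"`, `\Z`, `\\`). If this dump was written that way, the 5C bytes would be escape characters rather than image data. The sketch below reverses those escapes; `unescape_mysql` is a hypothetical helper, and the escape table assumes the standard MySQL string-literal rules.

```python
import re

# Escape letters that map to a different byte (assumed standard MySQL rules).
_ESCAPES = {
    b"0": b"\x00",  # NUL
    b"n": b"\n",
    b"r": b"\r",
    b"t": b"\t",
    b"b": b"\x08",  # backspace
    b"Z": b"\x1a",  # Ctrl-Z
}

# A backslash followed by any single byte, including a newline byte.
_ESCAPE_RE = re.compile(br"\\(.)", re.DOTALL)


def unescape_mysql(data):
    """Reverse MySQL backslash escapes in a bytes value (hypothetical helper)."""
    def replace(match):
        char = match.group(1)
        # Unknown escapes, and \\ \' \", reduce to the escaped byte itself.
        return _ESCAPES.get(char, char)
    return _ESCAPE_RE.sub(replace, data)
```

If the assumption holds, the extracted value would need to pass through something like this before being written to disk. It would also explain why FF5C shows up so often: inside JPEG compressed data an FF byte is normally followed by 00, which this escaping would dump as FF 5C 30.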

Update your question with a sample SQL statement, including a few lines/bytes of the JPEG string value. Perhaps the data is base64 encoded, or stored as straight hex values. We'll take it further from there.

Also, it's easier to see the type of a file's contents by issuing a:
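A quick way to tell these cases apart is to look at how the value begins, since a JPEG starts with the bytes FF D8 FF. The sketch below is a rough heuristic under that assumption; `guess_jpeg_encoding` is a made-up name, and it only distinguishes raw bytes, hex text, and base64 text.

```python
import base64
import binascii

JPEG_MAGIC = b"\xff\xd8\xff"  # first three bytes of a JPEG file


def guess_jpeg_encoding(value):
    """Guess how a JPEG is encoded from the start of a bytes value."""
    if value.startswith(JPEG_MAGIC):
        return "raw"
    # Hex text would begin with "ffd8ff" in either case.
    if value[:6].lower() == binascii.hexlify(JPEG_MAGIC):
        return "hex"
    # Base64 text of the same three bytes is "/9j/".
    if value.startswith(base64.b64encode(JPEG_MAGIC)):
        return "base64"
    return "unknown"
```

For example, `guess_jpeg_encoding(b"/9j/4AAQ")` returns `"base64"`. A result of `"unknown"` does not rule out a JPEG; it only means the value does not start with any of the three forms checked here.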

file yourfile.jpg 
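If `file` is not available, a similar check can be approximated in Python by comparing the first bytes of a file against known signatures. This is only a sketch covering the three extensions the script handles; `sniff_header` and `sniff_file` are hypothetical names, and the real `file` utility knows far more formats.

```python
# (signature, label) pairs for the formats the extraction script looks for.
SIGNATURES = [
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"BM", "BMP"),
]


def sniff_header(header):
    """Return a format label for the leading bytes, or None if unrecognized."""
    for signature, label in SIGNATURES:
        if header.startswith(signature):
            return label
    return None


def sniff_file(path):
    """Read the first few bytes of a file and classify them."""
    with open(path, "rb") as fh:
        return sniff_header(fh.read(8))
```

Running `sniff_file` over everything in the output directory would show which extracted files at least begin with the signature their extension implies.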
