1 Overview

This document summarizes the results of our efforts to solve the Digital Forensic Research Workshop (DFRWS) 2006 File Carving Challenge. Details regarding this challenge may be found here:

  http://www.dfrws.org/2006/challenge/

The Overview section of that page is repeated here:

  Data carving is the process of extracting a collection of data from a larger data set. Data carving techniques frequently occur during a digital investigation when the unallocated file system space is analyzed to extract files. The files are "carved" from the unallocated space using file type-specific header and footer values. File system structures are not used during the process.

  The results of existing file carving tools typically contain many false positives. An investigator must test each of the extracted files by opening them in an application that supports the file type. The goal of this challenge is to design and develop file carving algorithms that identify more files and reduce the number of false positives.

1.1 Revision

  $Id: README.ANSWERS,v 1.101 2006/07/18 03:06:00 klm Exp $

1.2 Table of Contents

  Section 1 .................... Overview
  Section 1.1 .................. Revision
  Section 1.2 .................. Table of Contents
  Section 1.3 .................. Sponsors and Contributors
  Section 2 .................... Background
  Section 2.1 .................. Goals
  Section 2.2 .................. Summary
  Section 2.3 .................. Analysis Environment
  Section 2.3.1 ................ Analysis Platforms
  Section 2.3.2 ................ Analysis Tools
  Section 2.3.3 ................ Notes and Caveats
  Section 2.4 .................. Terminology
  Section 2.4.1 ................ Block Sizes
  Section 2.4.2 ................ Acronyms
  Section 2.5 .................. Methodology
  Section 2.5.1 ................ Work-Flow
  Section 2.5.1.1 .............. Compute and Plot Sliding Entropy/Average Statistics
  Section 2.5.1.2 .............. Compute and Load Sliding Statistics (Percent, Entropy, Average, MD5, and SHA1)
  Section 2.5.1.3 .............. Dig for Heads, Tails, and Well-Known Structures
  Section 2.5.1.2 .............. Analyze Data and Determine Block Alignment
  Section 2.5.1.3 .............. Perform Stage 1 Carves
  Section 2.5.1.4 .............. Feed Carved Files to a Validator
  Section 2.5.1.5 .............. Analyze Data
  Section 2.5.1.6 .............. Perform Stage 2 Carves
  Section 3 .................... Answers
  Section 3.1 .................. Answers for ZIP Files
  Section 3.1.1 ................ ZIP File at Offset 14560768
  Section 3.1.1.1 .............. Final Results
  Section 3.1.1.2 .............. Stage 1 Results
  Section 3.1.1.3 .............. Stage 2 Results
  Section 3.1.2 ................ ZIP File at Offset 14709248
  Section 3.1.2.1 .............. Final Results
  Section 3.1.2.2 .............. Stage 1 Results
  Section 3.1.2.3 .............. Stage 2 Results
  Section 3.1.3 ................ ZIP File at Offset 23047680
  Section 3.1.3.1 .............. Final Results
  Section 3.1.3.2 .............. Stage 1 Results
  Section 3.1.3.3 .............. Stage 2 Results
  Section 3.2 .................. Answers for JPEG Files
  Section 3.2.1 ................ JPEG File at Offset 1980416
  Section 3.2.1.1 .............. Final Results
  Section 3.2.1.2 .............. Stage 1 Results
  Section 3.2.1.3 .............. Stage 2 Results
  Section 3.2.2 ................ JPEG File at Offset 1980748
  Section 3.2.2.1 .............. Final Results
  Section 3.2.2.2 .............. Stage 1 Results
  Section 3.2.2.3 .............. Stage 2 Results
  Section 3.2.3 ................ JPEG File at Offset 1995443
  Section 3.2.3.1 .............. Final Results
  Section 3.2.3.2 .............. Stage 1 Results
  Section 3.2.3.3 .............. Stage 2 Results
  Section 3.2.4 ................ JPEG File at Offset 4241920
  Section 3.2.4.1 .............. Final Results
  Section 3.2.4.2 .............. Stage 1 Results
  Section 3.2.4.3 .............. Stage 2 Results
  Section 3.2.5 ................ JPEG File at Offset 5948928
  Section 3.2.5.1 .............. Final Results
  Section 3.2.5.2 .............. Stage 1 Results
  Section 3.2.5.3 .............. Stage 2 Results
  Section 3.2.6 ................ JPEG File at Offset 5949358
  Section 3.2.6.1 .............. Final Results
  Section 3.2.6.2 .............. Stage 1 Results
  Section 3.2.6.3 .............. Stage 2 Results
  Section 3.2.7 ................ JPEG File at Offset 6257664
  Section 3.2.7.1 .............. Final Results
  Section 3.2.7.2 .............. Stage 1 Results
  Section 3.2.7.3 .............. Stage 2 Results
  Section 3.2.8 ................ JPEG File at Offset 14134784
  Section 3.2.8.1 .............. Final Results
  Section 3.2.8.2 .............. Stage 1 Results
  Section 3.2.8.3 .............. Stage 2 Results
  Section 3.2.9 ................ JPEG File at Offset 16115200
  Section 3.2.9.1 .............. Final Results
  Section 3.2.9.2 .............. Stage 1 Results
  Section 3.2.9.3 .............. Stage 2 Results
  Section 3.2.10 ............... JPEG File at Offset 16144896
  Section 3.2.10.1 ............. Final Results
  Section 3.2.10.2 ............. Stage 1 Results
  Section 3.2.10.3 ............. Stage 2 Results
  Section 3.2.11 ............... JPEG File at Offset 18581504
  Section 3.2.11.1 ............. Final Results
  Section 3.2.11.2 ............. Stage 1 Results
  Section 3.2.11.3 ............. Stage 2 Results
  Section 3.2.12 ............... JPEG File at Offset 20806656
  Section 3.2.12.1 ............. Final Results
  Section 3.2.12.2 ............. Stage 1 Results
  Section 3.2.12.3 ............. Stage 2 Results
  Section 3.2.13 ............... JPEG File at Offset 21304832
  Section 3.2.13.1 ............. Final Results
  Section 3.2.13.2 ............. Stage 1 Results
  Section 3.2.13.3 ............. Stage 2 Results
  Section 3.2.14 ............... JPEG File at Offset 22238208
  Section 3.2.14.1 ............. Final Results
  Section 3.2.14.2 ............. Stage 1 Results
  Section 3.2.14.3 ............. Stage 2 Results
  Section 3.2.15 ............... JPEG File at Offset 23329792
  Section 3.2.15.1 ............. Final Results
  Section 3.2.15.2 ............. Stage 1 Results
  Section 3.2.15.3 ............. Stage 2 Results
  Section 3.2.16 ............... JPEG File at Offset 23330000
  Section 3.2.16.1 ............. Final Results
  Section 3.2.16.2 ............. Stage 1 Results
  Section 3.2.16.3 ............. Stage 2 Results
  Section 3.2.17 ............... JPEG File at Offset 24017920
  Section 3.2.17.1 ............. Final Results
  Section 3.2.17.2 ............. Stage 1 Results
  Section 3.2.17.3 ............. Stage 2 Results
  Section 3.2.18 ............... JPEG File at Offset 48561152
  Section 3.2.18.1 ............. Final Results
  Section 3.2.18.2 ............. Stage 1 Results
  Section 3.2.18.3 ............. Stage 2 Results
  Section 3.2.19 ............... JPEG File at Offset 48561970
  Section 3.2.19.1 ............. Final Results
  Section 3.2.19.2 ............. Stage 1 Results
  Section 3.2.19.3 ............. Stage 2 Results
  Section 3.3 .................. Answers for PNG Files
  Section 3.3.1 ................ PNG File at Offset 4086865
  Section 3.3.1.1 .............. Final Results
  Section 3.3.1.2 .............. Stage 1 Results
  Section 3.3.2 ................ PNG File at Offset 16902215
  Section 3.3.2.1 .............. Final Results
  Section 3.3.2.2 .............. Stage 1 Results
  Section 3.3.3 ................ PNG File at Offset 18120696
  Section 3.3.3.1 .............. Final Results
  Section 3.3.3.2 .............. Stage 1 Results
  Section 3.3.4 ................ PNG File at Offset 18140936
  Section 3.3.4.1 .............. Final Results
  Section 3.3.4.2 .............. Stage 1 Results
  Section 3.4 .................. Answers for OLE Files
  Section 3.4.1 ................ OLE File at Offset 1050112
  Section 3.4.1.1 .............. Final Results
  Section 3.4.1.2 .............. Stage 1 Results
  Section 3.4.1.3 .............. Stage 2 Results
  Section 3.4.2 ................ OLE File at Offset 4077568
  Section 3.4.2.1 .............. Final Results
  Section 3.4.2.2 .............. Stage 1 Results
  Section 3.4.2.3 .............. Stage 2 Results
  Section 3.4.3 ................ OLE File at Offset 16812544
  Section 3.4.3.1 .............. Final Results
  Section 3.4.3.2 .............. Stage 1 Results
  Section 3.4.3.3 .............. Stage 2 Results
  Section 3.4.4 ................ OLE File at Offset 17555456
  Section 3.4.4.1 .............. Final Results
  Section 3.4.4.2 .............. Stage 1 Results
  Section 3.4.4.3 .............. Stage 2 Results
  Section 3.4.5 ................ OLE File at Offset 18942976
  Section 3.4.5.1 .............. Final Results
  Section 3.4.5.2 .............. Stage 1 Results
  Section 3.4.5.3 .............. Stage 2 Results
  Section 3.4.6 ................ OLE File at Offset 23533568
  Section 3.4.6.1 .............. Final Results
  Section 3.4.6.2 .............. Stage 1 Results
  Section 3.4.6.3 .............. Stage 2 Results
  Section 3.5 .................. Answers for HTML Files
  Section 3.5.1 ................ HTML File at Offset 4607 (really 4608)
  Section 3.5.1.1 .............. Final Results
  Section 3.5.1.2 .............. Stage 1 Results
  Section 3.5.1.3 .............. Stage 2 Results
  Section 3.5.2 ................ HTML File at Offset 2271232
  Section 3.5.2.1 .............. Final Results
  Section 3.5.2.2 .............. Stage 1 Results
  Section 3.5.2.3 .............. Stage 2 Results
  Section 3.5.3 ................ HTML File at Offset 2281472
  Section 3.5.3.1 .............. Final Results
  Section 3.5.3.2 .............. Stage 1 Results
  Section 3.5.3.3 .............. Stage 2 Results
  Section 3.5.4 ................ HTML File at Offset 14077952
  Section 3.5.4.1 .............. Final Results
  Section 3.5.4.2 .............. Stage 1 Results
  Section 3.5.4.3 .............. Stage 2 Results
  Section 3.5.5 ................ HTML File at Offset 14460928
  Section 3.5.5.1 .............. Final Results
  Section 3.5.5.2 .............. Stage 1 Results
  Section 3.5.5.3 .............. Stage 2 Results
  Section 3.5.6 ................ HTML File at Offset 15118848
  Section 3.5.6.1 .............. Final Results
  Section 3.5.6.2 .............. Stage 1 Results
  Section 3.5.6.3 .............. Stage 2 Results
  Section 3.6 .................. Answers for Text Files
  Section 3.6.1 ................ Text File at Offset 6049792 (really 6053376)
  Section 3.6.1.1 .............. Final Results
  Section 3.6.1.2 .............. Stage 1 Results
  Section 3.6.1.3 .............. Stage 2 Results
  Section 3.6.2 ................ Text File at Offset 14458880 (really 14461952)
  Section 3.6.2.1 .............. Final Results
  Section 3.6.2.2 .............. Stage 1 Results
  Section 3.6.2.3 .............. Stage 2 Results
  Section 3.6.3 ................ Text File at Offset 16814080 (really 16815104)
  Section 3.6.3.1 .............. Final Results
  Section 3.6.3.2 .............. Stage 1 Results
  Section 3.6.3.3 .............. Stage 2 Results
  Section 3.6.4 ................ Text File at Offset 17555456 (really 17565184)
  Section 3.6.4.1 .............. Final Results
  Section 3.6.4.2 .............. Stage 1 Results
  Section 3.6.4.3 .............. Stage 2 Results
  Section 3.6.5 ................ Text File at Offset 18939904 (really 18944256)
  Section 3.6.5.1 .............. Final Results
  Section 3.6.5.2 .............. Stage 1 Results
  Section 3.6.5.3 .............. Stage 2 Results
  Section 3.6.6 ................ Text File at Offset 19312640 (really 19316224)
  Section 3.6.7 ................ Text File at Offset 20209664 (really 20212224)

1.3 Sponsors and Contributors

Thanks to KoreLogic (www.korelogic.com) for allowing several of the contributors to participate in and work on solving this year's DFRWS challenge.
The main contributors are listed here in alphabetical order:

  Bair, Andy
  Monroe, Klayton
  Smith, Jason

with additional contributions from:

  Leininger, Hank
  Segreti, Joe

2 Background

2.1 Goals

Our primary goal in answering this challenge was to find out more about the current state of file carving and to contribute back to the workshop whatever we learned in the process. Another one of our goals was to use existing tools from The FTimes Project as our base and extend from there as needed. In particular, we wanted to see how good the existing dig tools/techniques are at identifying and enumerating well-known SOFs (Start of File), EOFs (End of File), and various other relevant file structures or landmarks.

2.2 Summary

We extracted 43 files from the challenge image. Some files are contained within other documents, for example the PNG files. Other files are container files, such as the ZIP files. The table below summarizes the number of files of each type extracted from the challenge image.

  Count  File Type
  -----  ---------
      3  zip
      4  png
      5  text
      6  html
      6  ole
     19  jpeg

The number of false positives was reduced via the methodology, tools, and techniques used. The methodology overview is shown in the PNG image file methodology.png.

Our methodology first involved mapping the image topography using file landmarks, statistics, and graphs. Next we took an initial carve of each file from the challenge image. We called this the stage 1 carve. Next we validated each stage 1 carve. If the validation was successful for a file, then we were done processing that file. Otherwise, we performed more detailed data analysis. The data analysis stage used the file topography information and manual techniques to identify inserted data fragments, interleaved or encompassed files, etc. The result of the data analysis stage was a new set of carve ranges for the carver. The file was then re-carved and validated again. This carve-validate-analyze process is how we reduced false positives.
2.3 Analysis Environment

2.3.1 Analysis Platforms

The main OSes used by the team were: FreeBSD 6.[01] and Slackware Linux 10.2.0. The final results, however, were produced on a FreeBSD 6.0 system. Anyone who wants to replicate our findings should keep that in mind in case they run into any technical difficulties.

2.3.2 Analysis Tools

The following lists are not intended to be exhaustive. Rather, they are intended to show the most heavily used tools.

The following native tools (i.e., tools that usually come with the OS) were used by the team:

  bc (calculations, hex/decimal conversions)
  dd (data carving and general manipulation)
  file (data typing)
  gcc (C programming)
  hexdump (data viewing)
  md5 (or md5sum)
  perl (scripting)
  sh (scripting)

The following add-on tools (or tool-sets), libraries, and modules were used by the team:

  Digest-1.10 (MD5)
  Digest-SHA1-2.10 (SHA1)
  Image-TestJPG-0.9 (JPEG verifier)
  bvi-1.3.1 (data viewing and occasionally editing)
  foremost-1.1 (benchmark and 2nd opinion)
  ftimes-3.7.0 (mapping, digging, XMagic, and carving)
  gnuplot-4.0.0 (plotting entropy and averages)
  mysql-5.0.9-beta (analysis queries based on ftimes output)
  libOle (contains source for ole-dump)
  ole-dump (OLE verifier)
  pcre-6.6 (regular expression engine for ftimes)
  scalpel-1.54 (benchmark and 2nd opinion)
  stegdetect-0.5 (potential image verifier)
  tidy (HTML verifier)
  unzip552 (ZIP verifier and general extraction tool)
  xv-3.10a (image viewer)
  OpenOffice-1.1.3 (document viewer)
  gqview-2.0.1 (image viewer)

The following tools were developed or improved by the team:

  do_carve
  do_ctype
  do_itrim
  do_plots
  do_stats
  ftimes (XMagic and Dig Mode)
  ftimes-group-blocks.pl
  ftimes-crv2raw.pl
  ftimes-sofeof.pl
  ole-dig2crv
  test_jpeg.pl
  test_html.sh

2.3.3 Notes and Caveats

All tools referenced in this document are assumed to be in your PATH. In particular, make sure that ftimes and all ftimes-related tools are in your PATH.
The Makefile targets in this project expect to find the subject image in the following location relative to the project root:

  evidence_locker/dfrws-2006-challenge.raw

FTimes must be built with XMagic enabled. This can be done using the following configure command:

  ./configure --prefix=/usr/local --enable-xmagic --with-all-tools

2.4 Terminology

2.4.1 Block Sizes

Unless otherwise specified, the standard size of a block will be 512 bytes.

2.4.2 Acronyms

The following acronyms are used throughout the document:

  SOF - start of file
  EOF - end of file
  FAT - file allocation table
  OLE - object linking and embedding, Microsoft's framework for compound documents

The following terms are used throughout the document:

XMagic

  Extended Magic; This is a line of magic that was inspired by the original file(1) magic. XMagic is part of FTimes.

Sliding Entropy

  The process of calculating entropy values for each sequential block of data in a file. Block sizes are user-defined (e.g., 512, 1024, etc.). For example, a 1-gram row entropy calculation for a 512-byte block computes the overall entropy of the block by folding in one byte at a time moving from the first offset in the block to the last. A 2-gram entropy calculation would fold in two bytes at a time.

Sliding Average

  The process of calculating average values for each sequential block of data in a file. Block sizes are user-defined (e.g., 512, 1024, etc.). For example, a 1-gram row average calculation for a 512-byte block computes the overall average of the block by folding in one byte at a time moving from the first offset in the block to the last. A 2-gram average calculation would fold in two bytes at a time.

Sliding Hash (MD5 and SHA1)

  The process of calculating message digests for each sequential block of data in a file. Block sizes and hash algorithms are user-defined (e.g., 512, 1024, etc.). These hashes can then be analyzed/manipulated using HashDig utilities. For example, it would be possible to bash these hashes against one or more other (possibly unrelated) subject images. The practitioner could also use the hash information to locate duplicate blocks within the subject image.

2.5 Methodology

Please view the PNG image file methodology.png for an overview of our methodology.

The following hypotheses helped form the basis of our methodology:

  - Generally, home-grown parsers will not perform better than parsers that were specifically designed for a given file type or application. Therefore, the practitioner should first look for tools and libraries that can be driven programmatically and try to use them as file validators.

  - Legitimate files will start on a sector boundary -- i.e., they will be sector-aligned. Therefore, files that begin on non-sector boundaries are likely to be embedded (as opposed to interleaved or encompassed). In other words, they are more likely to be part of a larger file such as an OLE document or ZIP archive.

  - If the blocks for one file are completely encompassed by the blocks of another file, slack space, entropy tests, and byte distributions may help reveal the edges.

  - When carving, carve the most well-defined (well-understood) file types first. This allows you to use their boundary information as potential SOF/EOF edges for other file types that have not yet been carved.

2.5.1 Work-Flow

Please view the PNG image file methodology.png for an overview of our methodology. Some of the work-flow is embodied in the methodology.

2.5.1.1 Compute and Plot Sliding Entropy/Average Statistics

We used FTimes, running in dig mode, with the help of XMagic to harvest various statistics and topographical information. One of our techniques was to compute sliding entropy and average values over the subject image and plot them with gnuplot. This can be done as follows:

  % make plots

All output produced by this target will be placed in work/plots.
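
As a rough illustration of the sliding 1-gram entropy and average values described above, the following Python sketch walks a byte string in fixed-size blocks. It is not the FTimes implementation; the function names (block_entropy, block_average, sliding_stats) and the tuple layout of the results are assumptions made for this example only.

```python
import math
from collections import Counter


def block_entropy(block: bytes) -> float:
    """Return the 1-gram Shannon entropy of a block in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)  # frequency of each byte value in the block
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def block_average(block: bytes) -> float:
    """Return the arithmetic mean of the byte values in a block (0.0 to 255.0)."""
    if not block:
        return 0.0
    return sum(block) / len(block)


def sliding_stats(data: bytes, block_size: int = 512):
    """Yield (block_number, entropy, average) for each sequential block.

    The final block may be shorter than block_size if the data length is not
    a multiple of the block size.
    """
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        yield offset // block_size, block_entropy(block), block_average(block)
```

Feeding the contents of the subject image to sliding_stats and writing one line per block would give data suitable for plotting; an abrupt change in either column between adjacent blocks is the kind of edge the plots are meant to reveal.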
Sliding entropy and average values are good for detecting edges in the data stream. They can also be used to classify different types of data. For example, TEXT- and HTML-based blocks typically have entropy values between 4 and 6, whereas ZIP- and JPEG-based blocks typically have entropy values between 7 and 8.

2.5.1.2 Compute and Load Sliding Statistics (Percent, Entropy, Average, MD5, and SHA1)

We used FTimes, running in dig mode, with the help of XMagic to harvest various statistics and topographical information. One of our techniques was to compute sliding percent values for various ctype(3) character classes over the subject image. Another technique was to compute sliding hashes and sliding {1,2}-gram entropy/average values. All of this information was then loaded into MySQL so that we could run various analysis queries. This step can be done as follows:

  % make stats load-stats

If you don't have MySQL available, you can omit the 'load-stats' target. All output produced by these targets will be placed in work/stats.

Sliding percent values are good for revealing the makeup of each block of data. For example, TEXT- and HTML-based blocks are typically biased toward high percentages of alpha and numeric characters, whereas ZIP- and JPEG-based blocks will also contain these characters, but their distribution will be closer to flat.

2.5.1.3 Dig for Heads, Tails, and Well-Known Structures

The following dig strings, taken from do_carve, form the basis of our search methodology -- i.e., how we located and/or enumerated heads, tails, and well-known structures:

  DigStringXMagic=xmagic/xmagic.ole.enumerate-header-fat sof.ole
  DigStringXMagic=xmagic/xmagic.ole.enumerate-fat-blocks fat.ole
  DigStringNormal=%47%49%46%38%37%61 sof.gif
  DigStringRegExp=(?is)(\s*<(?:!DOCTYPE[\x20\t]+html[^>]*>\s*(?:\n+|(?:\r\n)+)?) eof.html
  DigStringRegExp=(?s)(\xff\xd8....JFIF) sof.jpeg
  DigStringNormal=%ff%d9 eof.jpeg
  DigStringRegExp=(?s)(PDF-1\.[0-2]) sof.pdf
  DigStringRegExp=(?s)(\x25\x25EOF) eof.pdf
  DigStringNormal=%89PNG%0d%0a%1a%0a sof.png
  DigStringNormal=IEND%ae%42%60%82 eof.png
  DigStringNormal=PK%03%04 sof.zip
  DigStringRegExp=PK\x05\x06.{18} eof.zip

This stage of the work-flow is built into do_carve, and it can be activated as follows:

  % make carve

2.5.1.2 Analyze Data and Determine Block Alignment

2.5.1.3 Perform Stage 1 Carves

Stage 1 carves are considered rough cuts, although in many cases the files carved in stage 1 validate successfully without further work.

2.5.1.4 Feed Carved Files to a Validator

One or more validators are run against each carved file to determine whether it was extracted successfully and is a proper file of its type. Validation is the key to reducing false positives. If one or more validators fail, we proceed to analyze the data to develop new carve ranges, and then re-carve.

2.5.1.5 Analyze Data

A variety of techniques are used to reduce false positives, including validator output, file landmarks, file statistics, and manual techniques. All of these techniques are used to produce a new set of carve ranges.

2.5.1.6 Perform Stage 2 Carves

Data analysis produces a new set of carve ranges. At this point, the file is re-carved and revalidated. The carve-validate-analyze process is repeated until the file is successfully extracted or it is determined that there is no reasonable file to extract.
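
The carve-validate-analyze cycle outlined in the steps above can be sketched as a small loop. This is only an illustration under stated assumptions: carve, carve_until_valid, and the validate and analyze callbacks are hypothetical names and are not part of do_carve or FTimes. Carve ranges are treated as inclusive byte offsets, matching the CarveRanges fields used in the answers.

```python
from typing import Callable, List, Optional, Tuple

Range = Tuple[int, int]  # inclusive (first_byte, last_byte) offsets into the image


def carve(image: bytes, ranges: List[Range]) -> bytes:
    """Concatenate the inclusive byte ranges into one candidate file."""
    return b"".join(image[first:last + 1] for first, last in ranges)


def carve_until_valid(
    image: bytes,
    ranges: List[Range],
    validate: Callable[[bytes], bool],
    analyze: Callable[[bytes, List[Range]], Optional[List[Range]]],
    max_rounds: int = 10,
) -> Optional[Tuple[List[Range], bytes]]:
    """Repeat carve -> validate -> analyze until a candidate validates.

    validate: hypothetical callback that returns True for a proper file.
    analyze:  hypothetical callback that proposes new carve ranges, or None
              when there is no reasonable file to extract.
    Returns (ranges, carved_bytes) on success, or None on failure.
    """
    for _ in range(max_rounds):
        candidate = carve(image, ranges)
        if validate(candidate):
            return ranges, candidate      # stage 1 (or a later stage) succeeded
        new_ranges = analyze(candidate, ranges)
        if new_ranges is None:
            return None                   # analysis found nothing more to try
        ranges = new_ranges               # re-carve with the refined ranges
    return None
```

In this sketch the first pass through the loop corresponds to the stage 1 carve, and any later pass corresponds to a stage 2 carve. A real validator would wrap a type-specific tool rather than a Python predicate.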
3 Answers

3.1 Answers for ZIP Files

The subject image contains the following end of central directory structures:

  Offset: 14707896 Structure: 0x06054b50,0,0,5,5,409,146719,0,""
  Offset: 16060875 Structure: 0x06054b50,0,0,2,2,134,1163589,0,""
  Offset: 23319375 Structure: 0x06054b50,0,0,6,6,486,269673,0,""

Modified dig output:

  "dfrws-2006-challenge.raw"|normal|sof.zip|14560768|28439|0|PK%03%04
  "dfrws-2006-challenge.raw"|normal|sof.zip|14623387|28561|155|PK%03%04
  "dfrws-2006-challenge.raw"|normal|sof.zip|14623442|28561|210|PK%03%04
  "dfrws-2006-challenge.raw"|normal|sof.zip|14623572|28561|340|PK%03%04
  "dfrws-2006-challenge.raw"|normal|sof.zip|14707357|28725|157|PK%03%04
  "dfrws-2006-challenge.raw"|regexp|eof.zip|14707896|28726|184|PK%05%06%00%00%00%00%05%00%05%00%99%01%00%00%1f=%02%00%00%00
  "dfrws-2006-challenge.raw"|normal|sof.zip|14709248|28729|0|PK%03%04
  "dfrws-2006-challenge.raw"|normal|sof.zip|15385752|30050|152|PK%03%04
  "dfrws-2006-challenge.raw"|regexp|eof.zip|16060875|31368|459|PK%05%06%00%00%00%00%02%00%02%00%86%00%00%00E%c1%11%00%00%00
  "dfrws-2006-challenge.raw"|normal|sof.zip|23047680|45015|0|PK%03%04
  "dfrws-2006-challenge.raw"|normal|sof.zip|23109869|45136|237|PK%03%04
  "dfrws-2006-challenge.raw"|normal|sof.zip|23146999|45208|503|PK%03%04
  "dfrws-2006-challenge.raw"|normal|sof.zip|23191845|45296|293|PK%03%04
  "dfrws-2006-challenge.raw"|normal|sof.zip|23219088|45349|400|PK%03%04
  "dfrws-2006-challenge.raw"|normal|sof.zip|23263287|45436|55|PK%03%04
  "dfrws-2006-challenge.raw"|regexp|eof.zip|23319375|45545|335|PK%05%06%00%00%00%00%06%00%06%00%e6%01%00%00i%1d%04%00%00%00

3.1.1 ZIP File at Offset 14560768

3.1.1.1 Final Results

  FileType: Zip
  Offset: 14560768
  Block: 28439
  BlockOffset: 0
  Embedded: No
  CarveRanges: 14560768-14707917
  Size: 147150
  MD5: ebabde39ba44d38888dd82606980498a
  SHA1: 9e4a1097d320e37d135a4653bb10cf58a27f8ed3

3.1.1.2 Stage 1 Results

  NumberOfFilesInArchive: 5
  LocalFileHeaderOffsets: 14560768,14623387,14623442,14623572,14707357
  CarveSize: 14707896 + 22 - 14560768 = 147150
  CleanCarve: Yes
  Explanation:

This file was considered a clean carve because the archive can be verified using unzip in test mode. Additionally, its contents can be extracted with unzip or WinZip with no reported errors. The results of running unzip in test mode are provided here:

  --- UNZIP_TEST_RESULTS ---
  % unzip -t carved/dfrws-2006-challenge.raw.14560768.zip
  Archive: carved/dfrws-2006-challenge.raw.14560768.zip
  testing: 4n6rodeo3-fix copy.jpg OK
  testing: __MACOSX/ OK
  testing: __MACOSX/._4n6rodeo3-fix copy.jpg OK
  testing: 4n6rodeo4-fix copy.jpg OK
  testing: __MACOSX/._4n6rodeo4-fix copy.jpg OK
  No errors detected in compressed data of carved/dfrws-2006-challenge.raw.14560768.zip.
  --- UNZIP_TEST_RESULTS ---

3.1.1.3 Stage 2 Results

Stage 2 analysis was not required.

3.1.2 ZIP File at Offset 14709248

3.1.2.1 Final Results

  FileType: Zip
  Offset: 14709248
  Block: 28729
  BlockOffset: 0
  Embedded: No
  CarveRanges: 14709248-15118847,15306752-16060896
  Size: 1163745
  MD5: 9a4c2d3a9bd203eb39c9f954a3c997e4
  SHA1: 75969c7c2cd21c6a2cf7edf590b082aa465be8f6

3.1.2.2 Stage 1 Results

  NumberOfFilesInArchive: 2
  LocalFileHeaderOffsets: 14709248,15385752
  CarveSize: 16060875 + 22 - 14709248 = 1351649
  CleanCarve: No
  Explanation:

This file was not considered a clean carve because the archive cannot be completely verified using unzip in test mode.
The results of running unzip in test mode are provided here:

  --- UNZIP_TEST_RESULTS ---
  % unzip -t carved/dfrws-2006-challenge.raw.14709248.zip
  Archive: carved/dfrws-2006-challenge.raw.14709248.zip
  warning [carved/dfrws-2006-challenge.raw.14709248.zip]: 187904 extra bytes at beginning or within zipfile
    (attempting to process anyway)
  file #1: bad zipfile offset (local header sig): 187904
    (attempting to re-compensate)
  testing: file1.jpg
    error: invalid compressed data to inflate
  file #2: bad zipfile offset (local header sig): 488600
    (attempting to re-compensate)
  testing: file2.jpg OK
  At least one error was detected in carved/dfrws-2006-challenge.raw.14709248.zip.
  --- UNZIP_TEST_RESULTS ---

3.1.2.3 Stage 2 Results

  TrimSize: 187904
  TrimOffset: 409600
  CleanTrim: Yes
  Explanation:

This file was trimmed with the following command:

  do_itrim -e zip -l 407552 -s 512 -t 187904 -f dfrws-2006-challenge.raw.14709248.zip -- unzip -t %subject

The lower bound, 407552 (or block 796), was chosen based on the results of viewing the file's sliding entropy. Our analysis revealed that there is an abrupt drop in entropy between blocks 799 and 800 and a corresponding increase between 1166 and 1167. Between these endpoints, the entropy drops from approximately 7.5 to 5.1. This run of lower entropy consumes 367 blocks (or 187904 bytes), and that corresponds exactly with the error message produced by unzip, which states that there are 187904 extra bytes in the file. Therefore the trim size was set to 187904. The step size was set to 512 to keep the trimmer block-aligned.
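
The trimming procedure described above can be approximated with the following Python sketch. It is not the do_itrim source: the function name iterative_trim, its parameters, and the callback-based validator are assumptions made for illustration. The sketch removes trim_size bytes at successive offsets, starting at the lower bound and advancing by the step size, until the validator accepts the result. The constants at the end reproduce the arithmetic from the entropy analysis.

```python
from typing import Callable, Optional, Tuple


def iterative_trim(
    data: bytes,
    lower: int,
    step: int,
    trim_size: int,
    validate: Callable[[bytes], bool],
) -> Optional[Tuple[int, bytes]]:
    """Cut trim_size bytes out of data at lower, lower + step, ... until valid.

    validate is a hypothetical callback that returns True when the trimmed
    candidate is a proper file. Returns (trim_offset, trimmed_bytes) for the
    first passing offset, or None if no offset passes.
    """
    offset = lower
    while offset + trim_size <= len(data):
        # Drop the bytes in [offset, offset + trim_size) and keep the rest.
        candidate = data[:offset] + data[offset + trim_size:]
        if validate(candidate):
            return offset, candidate
        offset += step
    return None


# Worked arithmetic for this archive, using the block numbers given above.
BLOCK_SIZE = 512
low_entropy_blocks = 1166 - 799                 # blocks 800 through 1166
trim_size = low_entropy_blocks * BLOCK_SIZE     # 367 * 512 = 187904 bytes
trim_offset = 800 * BLOCK_SIZE                  # first low-entropy block = 409600
lower_bound = 796 * BLOCK_SIZE                  # starting point for the search = 407552
```

In practice the validator would write each candidate to a temporary file and run a type-specific tool against it (unzip in test mode in this case); that wiring is omitted here to keep the sketch self-contained.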
The relevant results of do_itrim are provided here:

  --- DO_ITRIM_RESULTS ---
  dfrws-2006-challenge.raw.14709248.zip trimmed 187904 @ 407552 ---> fail
  dfrws-2006-challenge.raw.14709248.zip trimmed 187904 @ 408064 ---> fail
  dfrws-2006-challenge.raw.14709248.zip trimmed 187904 @ 408576 ---> fail
  dfrws-2006-challenge.raw.14709248.zip trimmed 187904 @ 409088 ---> fail
  dfrws-2006-challenge.raw.14709248.zip trimmed 187904 @ 409600 ---> pass
  --- DO_ITRIM_RESULTS ---

This file was considered a clean trim because the archive can be verified using unzip in test mode. Additionally, its contents can be extracted with unzip or WinZip with no reported errors. The results of running unzip in test mode are provided here:

  --- UNZIP_TEST_RESULTS ---
  % unzip -t dfrws-2006-challenge.raw.14709248.zip
  Archive: dfrws-2006-challenge.raw.14709248.zip
  testing: file1.jpg OK
  testing: file2.jpg OK
  No errors detected in compressed data of dfrws-2006-challenge.raw.14709248.zip.
  --- UNZIP_TEST_RESULTS ---

3.1.3 ZIP File at Offset 23047680

3.1.3.1 Final Results

  FileType: Zip
  Offset: 23047680
  Block: 45015
  BlockOffset: 0
  Embedded: No
  CarveRanges: 23047680-23238143,23239680-23319396
  Size: 270181
  MD5: f940fcc37c82e8ff1431e5c3c061611e
  SHA1: 2f7fe032a402313d33082b61d3cbd9bebd32f983

3.1.3.2 Stage 1 Results

  NumberOfFilesInArchive: 6
  LocalFileHeaderOffsets: 23047680,23109869,23146999,23191845,23219088,23263287
  CarveSize: 23319375 + 22 - 23047680 = 271717
  CleanCarve: No
  Explanation:

This file was not considered a clean carve because the archive cannot be completely verified using unzip in test mode.
The results of running unzip in test mode are provided here:

  --- UNZIP_TEST_RESULTS ---
  % unzip -t carved/dfrws-2006-challenge.raw.23047680.zip
  Archive: carved/dfrws-2006-challenge.raw.23047680.zip
  warning [carved/dfrws-2006-challenge.raw.23047680.zip]: 1536 extra bytes at beginning or within zipfile
    (attempting to process anyway)
  file #1: bad zipfile offset (local header sig): 1536
    (attempting to re-compensate)
  testing: 1993-01-a-large_web.jpg OK
  testing: 1995-45-a-large_web.jpg OK
  testing: 2000-06-a-large_web.jpg OK
  testing: 2000-07-a-large_web.jpg OK
  testing: 2002-29-a-large_web.jpg
    error: invalid compressed data to inflate
  file #6: bad zipfile offset (local header sig): 214071
    (attempting to re-compensate)
  testing: 2005-02-e-large_web.jpg OK
  At least one error was detected in carved/dfrws-2006-challenge.raw.23047680.zip.
  --- UNZIP_TEST_RESULTS ---

3.1.3.3 Stage 2 Results

  TrimSize: 1536
  TrimOffset: 190464
  CleanTrim: Yes
  Explanation:

This file was trimmed with the following command:

  do_itrim -e zip -l 171520 -s 512 -t 1536 -f dfrws-2006-challenge.raw.23047680.zip -- unzip -t %subject

A nonzero lower bound was chosen because the initial unzip test indicated that the first four files in the archive were good. Therefore, trimming could begin at or near file five. We chose to start trimming at the first 512-byte block after the beginning of file five (i.e., 171520). The trim size, 1536 bytes, is based on the original unzip output, which indicated that there were 1536 extra bytes in the file. The step size was set to 512 to keep the trimmer block-aligned. The relevant results of do_itrim are provided here:

  --- DO_ITRIM_RESULTS ---
  dfrws-2006-challenge.raw.23047680.zip trimmed 1536 @ 171520 ---> fail
  dfrws-2006-challenge.raw.23047680.zip trimmed 1536 @ 172032 ---> fail
  dfrws-2006-challenge.raw.23047680.zip trimmed 1536 @ 172544 ---> fail
  ...
  dfrws-2006-challenge.raw.23047680.zip trimmed 1536 @ 189440 ---> fail
  dfrws-2006-challenge.raw.23047680.zip trimmed 1536 @ 189952 ---> fail
  dfrws-2006-challenge.raw.23047680.zip trimmed 1536 @ 190464 ---> pass
  --- DO_ITRIM_RESULTS ---

This file was considered a clean trim because the archive can be verified using unzip in test mode. Additionally, its contents can be extracted with unzip or WinZip with no reported errors. The results of running unzip in test mode are provided here:

  --- UNZIP_TEST_RESULTS ---
  % unzip -t dfrws-2006-challenge.raw.23047680.zip
  Archive: dfrws-2006-challenge.raw.23047680.zip
  testing: 1993-01-a-large_web.jpg OK
  testing: 1995-45-a-large_web.jpg OK
  testing: 2000-06-a-large_web.jpg OK
  testing: 2000-07-a-large_web.jpg OK
  testing: 2002-29-a-large_web.jpg OK
  testing: 2005-02-e-large_web.jpg OK
  No errors detected in compressed data of dfrws-2006-challenge.raw.23047680.zip.
  --- UNZIP_TEST_RESULTS ---

3.2 Answers for JPEG Files

The subject image contains the following JPEG images and thumbnails.
dfrws-2006-challenge.raw.14134784.sof.jpeg dfrws-2006-challenge.raw.16115200.sof.jpeg dfrws-2006-challenge.raw.16144896.sof.jpeg dfrws-2006-challenge.raw.18581504.sof.jpeg dfrws-2006-challenge.raw.1980416.sof.jpeg dfrws-2006-challenge.raw.1980748.sof.jpeg dfrws-2006-challenge.raw.1995443.sof.jpeg dfrws-2006-challenge.raw.20806656.sof.jpeg dfrws-2006-challenge.raw.21304832.sof.jpeg dfrws-2006-challenge.raw.22238208.sof.jpeg dfrws-2006-challenge.raw.23329792.sof.jpeg dfrws-2006-challenge.raw.23330000.sof.jpeg dfrws-2006-challenge.raw.24017920.sof.jpeg dfrws-2006-challenge.raw.4241920.sof.jpeg dfrws-2006-challenge.raw.48561152.sof.jpeg dfrws-2006-challenge.raw.48561970.sof.jpeg dfrws-2006-challenge.raw.5948928.sof.jpeg dfrws-2006-challenge.raw.5949358.sof.jpeg dfrws-2006-challenge.raw.6257664.sof.jpeg Sample of modified dig output: --- snip --- "dfrws-2006-challenge.raw"|normal|eof.jpeg|20780956|40587|412|%ff%d9 "dfrws-2006-challenge.raw"|regexp|sof.jpeg|20806656|40638|0|%ff%d8%ff%e0%00%10JFIF "dfrws-2006-challenge.raw"|normal|eof.jpeg|21106133|41222|469|%ff%d9 "dfrws-2006-challenge.raw"|normal|eof.jpeg|21303855|41609|47|%ff%d9 "dfrws-2006-challenge.raw"|regexp|sof.jpeg|21304832|41611|0|%ff%d8%ff%e0%00%10JFIF "dfrws-2006-challenge.raw"|regexp|sof.jpeg|22238208|43434|0|%ff%d8%ff%e0%00%10JFIF "dfrws-2006-challenge.raw"|normal|eof.jpeg|22542619|44028|283|%ff%d9 "dfrws-2006-challenge.raw"|normal|eof.jpeg|22630555|44200|155|%ff%d9 "dfrws-2006-challenge.raw"|normal|eof.jpeg|22642523|44223|347|%ff%d9 "dfrws-2006-challenge.raw"|normal|eof.jpeg|22856910|44642|206|%ff%d9 "dfrws-2006-challenge.raw"|normal|eof.jpeg|22892093|44711|61|%ff%d9 "dfrws-2006-challenge.raw"|normal|eof.jpeg|22990381|44903|45|%ff%d9 "dfrws-2006-challenge.raw"|normal|eof.jpeg|23019117|44959|109|%ff%d9 "dfrws-2006-challenge.raw"|normal|eof.jpeg|23109851|45136|219|%ff%d9 "dfrws-2006-challenge.raw"|normal|eof.jpeg|23146981|45208|485|%ff%d9 "dfrws-2006-challenge.raw"|normal|eof.jpeg|23191827|45296|275|%ff%d9 
"dfrws-2006-challenge.raw"|normal|eof.jpeg|23195808|45304|160|%ff%d9 "dfrws-2006-challenge.raw"|normal|eof.jpeg|23263269|45436|37|%ff%d9 "dfrws-2006-challenge.raw"|regexp|sof.jpeg|23329792|45566|0|%ff%d8%ff%e0%00%10JFIF "dfrws-2006-challenge.raw"|regexp|sof.jpeg|23330000|45566|208|%ff%d8%ff%e0%00%10JFIF --- snip --- 3.2.1 JPEG File at Offset 1980416 3.2.1.1 Final Results FileType: JPEG Offset: 1980416 Block: 3868 BlockOffset: 0 Embedded: No CarveRanges: 1980416-2267601 Size: 287186 MD5: daf4205574abd6919b10ca8be92d17a3 SHA1: da76887523bbbf0f3c3dd19b6ab493d035bd8c3f 3.2.1.2 Stage 1 Results CleanCarve: Yes Explanation: This file was considered a clean carve because the image can be verified using the JPEG validator. Additionally, the image can be opened and viewed in xv without any errors or warnings. 3.2.1.3 Stage 2 Results Stage 2 analysis was not required. 3.2.2 JPEG File at Offset 1980748 3.2.2.1 Final Results FileType: JPEG Offset: 1980748 Block: 3868 BlockOffset: 332 Embedded: Yes (JPEG thumbnail) CarveRanges: 1980748-1986298 Size: 5551 MD5: c1d7c67b0e81f902a2c7ed56651fee2b SHA1: 1a89862c52612015400e7418583693c803c0037a 3.2.2.2 Stage 1 Results CleanCarve: Yes Explanation: This file was considered a clean carve because the image can be verified using the JPEG validator. Additionally, the image can be opened and viewed in xv without any errors or warnings. 3.2.2.3 Stage 2 Results Stage 2 analysis was not required. 3.2.3 JPEG File at Offset 1995443 3.2.3.1 Final Results FileType: JPEG Offset: 1995443 Block: 3897 BlockOffset: 179 Embedded: Yes (JPEG thumbnail) CarveRanges: 1995443-2000993 Size: 5551 MD5: c1d7c67b0e81f902a2c7ed56651fee2b SHA1: 1a89862c52612015400e7418583693c803c0037a 3.2.3.2 Stage 1 Results CleanCarve: Yes Explanation: This file was considered a clean carve because the image can be verified using the JPEG validator. Additionally, the image can be opened and viewed in xv without any errors or warnings. 
3.2.3.3 Stage 2 Results

Stage 2 analysis was not required.

3.2.4 JPEG File at Offset 4241920

3.2.4.1 Final Results

FileType: JPEG
Offset: 4241920
Block: 8285
BlockOffset: 0
Embedded: No
CarveRanges: 4241920-4850622
Size: 608703
MD5: 4efc6c572683878efd8f3404ddaded7b
SHA1: e997a28e3d6438c777e7122166c64a3a51541e12

3.2.4.2 Stage 1 Results

CleanCarve: Yes
Explanation: This file was considered a clean carve because the image can be verified using the JPEG validator. Additionally, the image can be opened and viewed in xv without any errors or warnings.

3.2.4.3 Stage 2 Results

Stage 2 analysis was not required.

3.2.5 JPEG File at Offset 5948928

3.2.5.1 Final Results

FileType: JPEG
Offset: 5948928
Block: 11619
BlockOffset: 0
Embedded: No
CarveRanges: 5948928-6053375,6066688-6152959
Size: 190720
MD5: 7b07320709e0caa947663f5df3a0a390
SHA1: 5b9843476f54c0268e0de64eca8de095cbad31f7

3.2.5.2 Stage 1 Results

CleanCarve: No
Explanation: This file did not carve cleanly. The test_jpeg.pl tool indicated that the carved file was not valid. xv was also used to try to open the file for viewing.

# tools/test_jpeg.pl -v -f dfrws-2006-challenge.raw.5948928.sof.jpeg
dfrws-2006-challenge.raw.5948928.sof.jpeg JPEG file is not valid.

3.2.5.3 Stage 2 Results

TrimSize: 13312
TrimOffset: 104448
CleanTrim: Yes
Explanation: This file was trimmed with the following command:

do_itrim -e jpeg -l 103936 -r 1 -s 512 -f dfrws-2006-challenge.raw.5948928.sof.jpeg -t 13312 -- tools/test_jpeg.pl -f %subject

A nonzero lower bound was chosen because the entropy plots showed that the entropy was consistent until block 203, where it dropped. The entropy stayed low until block 230. The step size was set to 512 to keep the trimmer block-aligned.
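
The trimming procedure described above can be sketched in Python. This is only an illustration of the idea, not the do_itrim implementation: the function name iterative_trim and the toy validator are hypothetical, and the parameters lower, step, and trim_size merely appear to correspond to the -l, -s, and -t options in the command shown. In practice the validator would be an external tool such as test_jpeg.pl or unzip -t.

```python
from typing import Callable, Optional


def iterative_trim(data: bytes, trim_size: int, lower: int, step: int,
                   is_valid: Callable[[bytes], bool]) -> Optional[int]:
    """Return the first offset >= lower at which removing trim_size bytes
    produces data accepted by is_valid, or None if no offset works."""
    offset = lower
    while offset + trim_size <= len(data):
        # Cut out the suspected foreign chunk and test what remains.
        candidate = data[:offset] + data[offset + trim_size:]
        if is_valid(candidate):
            return offset
        offset += step  # stay block-aligned by stepping one block at a time
    return None


# Toy demonstration: three blocks of "file" data with two foreign blocks
# inserted after the first block. The validator simply compares against
# the known-good data (a stand-in for a real format validator).
BLOCK = 512
good = b"A" * (3 * BLOCK)
carved = good[:BLOCK] + b"X" * (2 * BLOCK) + good[BLOCK:]
found = iterative_trim(carved, 2 * BLOCK, 0, BLOCK, lambda d: d == good)
print(found)
```

Stepping in whole blocks keeps the search space small, which matters when every candidate has to be passed to an external validator.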
The relevant results of do_itrim are provided here: --- DO_ITRIM_RESULTS --- dfrws-2006-challenge.raw.5948928.sof.jpeg trimmed 13312 @ 104448 ---> pass --- DO_ITRIM_RESULTS --- This file was considered a clean trim because the image can be verified using the JPEG validator. Additionally, the image can be opened and viewed in xv without any errors or warnings. 3.2.6 JPEG File at Offset 5949358 3.2.6.1 Final Results FileType: JPEG Offset: 5949358 Block: 11619 BlockOffset: 430 Embedded: Yes (JPEG thumbnail) CarveRanges: 5949358-5951146 Size: 1789 MD5: b3f4ccfb61790ab745b39d3945c5fe3f SHA1: 37acc8ba24213e0277de4da0578c993307087320 3.2.6.2 Stage 1 Results CleanCarve: Yes Explanation: This file was considered a clean carve because the image can be verified using the JPEG validator. Additionally, the image can be opened and viewed in xv without any errors or warnings. 3.2.6.3 Stage 2 Results Stage 2 analysis was not required. 3.2.7 JPEG File at Offset 6257664 3.2.7.1 Final Results FileType: JPEG Offset: 6257664 Block: 12222 BlockOffset: 0 Embedded: No CarveRanges: 6257664-13371631 Size: 7113968 MD5: b070beae1606f67a342bc5f78c29c743 SHA1: 34541a233bb7b7ffcce81e483d3ed9fe0020ffe8 3.2.7.2 Stage 1 Results CleanCarve: Yes Explanation: This file was considered a clean carve because the image can be verified using the JPEG validator. Additionally, the image can be opened and viewed in xv without any errors or warnings. 3.2.7.3 Stage 2 Results Stage 2 analysis was not required. 3.2.8 JPEG File at Offset 14134784 3.2.8.1 Final Results FileType: JPEG Offset: 14134784 Block: 27607 BlockOffset: 0 Embedded: No CarveRanges: 14134784-14324317 Size: 189534 MD5: fe7e7ac67709f2d9c2483aa98c681b99 SHA1: d51c07c71cfd764a1c63420b9566a22742abb8dc 3.2.8.2 Stage 1 Results CleanCarve: Yes Explanation: This file was considered a clean carve because the image can be verified using the JPEG validator. Additionally, the image can be opened and viewed in xv without any errors or warnings. 
3.2.8.3 Stage 2 Results Stage 2 analysis was not required. 3.2.9 JPEG File at Offset 16115200 3.2.9.1 Final Results FileType: JPEG Offset: 16115200 Block: 31475 BlockOffset: 0 Embedded: No CarveRanges: Size: MD5: SHA1: 3.2.9.2 Stage 1 Results CleanCarve: Explanation: 3.2.9.3 Stage 2 Results Not completed. 3.2.10 JPEG File at Offset 16144896 3.2.10.1 Final Results FileType: JPEG Offset: 16144896 Block: 31533 BlockOffset: 0 Embedded: No CarveRanges: Size: MD5: SHA1: 3.2.10.2 Stage 1 Results CleanCarve: Explanation: 3.2.10.3 Stage 2 Results Not completed. 3.2.11 JPEG File at Offset 18581504 3.2.11.1 Final Results FileType: JPEG Offset: 18581504 Block: 36292 BlockOffset: 0 Embedded: No CarveRanges: 18581504-18760162 Size: 178659 MD5: 2fae8770cc013d22e9ea1c070f2f509b SHA1: a683adda2942c18431e6cc719008450957697788 3.2.11.2 Stage 1 Results CleanCarve: Yes Explanation: This file was considered a clean carve because the image can be verified using the JPEG validator. Additionally, the image can be opened and viewed in xv without any errors or warnings. 3.2.11.3 Stage 2 Results Stage 2 analysis was not required. 3.2.12 JPEG File at Offset 20806656 3.2.12.1 Final Results FileType: JPEG Offset: 20806656 Block: 40638 BlockOffset: 0 Embedded: No CarveRanges: Size: MD5: SHA1: 3.2.12.2 Stage 1 Results CleanCarve: Explanation: 3.2.12.3 Stage 2 Results Not completed. 3.2.13 JPEG File at Offset 21304832 3.2.13.1 Final Results FileType: JPEG Offset: 21304832 Block: 41611 BlockOffset: 0 Embedded: No CarveRanges: 21304832-22238207,22542848-22630556 Size: 1021085 MD5: 7cce072e518fd72484c97adb1b4be08e SHA1: 238c09d20331676bb7644126d87cc434e5089c80 3.2.13.2 Stage 1 Results CleanCarve: No Explanation: This file was not considered a clean carve because the image could not be verified using the JPEG validator. 
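
The Size field in each Final Results block should equal the number of bytes covered by CarveRanges, counting both endpoints of every range. The following Python sketch (the helper names parse_ranges and carved_size are hypothetical) cross-checks the values reported for the file at offset 21304832.

```python
def parse_ranges(spec):
    """Parse a CarveRanges string such as 'a-b,c-d' into (low, high) tuples."""
    ranges = []
    for part in spec.split(","):
        low, high = part.split("-")
        ranges.append((int(low), int(high)))
    return ranges


def carved_size(spec):
    """Total bytes covered by the ranges; both endpoints are inclusive."""
    return sum(high - low + 1 for low, high in parse_ranges(spec))


size = carved_size("21304832-22238207,22542848-22630556")
print(size)
```

The result agrees with the Size value of 1021085 listed in section 3.2.13.1.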
3.2.13.3 Stage 2 Results

CleanCarve: Yes
Explanation: The correction required to make this file whole was to recognize that the file at offset 22238208 is completely encompassed by this file. Once that observation was made, we adjusted the carve ranges to exclude the inner file, and the file then carved cleanly. This file was considered a clean carve because the image can be verified using the JPEG validator. Additionally, the image can be opened and viewed in xv without any errors or warnings.

3.2.14 JPEG File at Offset 22238208

3.2.14.1 Final Results

FileType: JPEG
Offset: 22238208
Block: 43434
BlockOffset: 0
Embedded: No
CarveRanges: 22238208-22542620
Size: 304413
MD5: c0da37b3f1a07af790e6e9171cedc4d2
SHA1: 562148fc9f9eed8105d9b03b88a69ae8cc638129

3.2.14.2 Stage 1 Results

CleanCarve: Yes
Explanation: This file was considered a clean carve because the image can be verified using the JPEG validator. Additionally, the image can be opened and viewed in xv without any errors or warnings.

3.2.14.3 Stage 2 Results

Stage 2 analysis was not required.

3.2.15 JPEG File at Offset 23329792

3.2.15.1 Final Results

FileType: JPEG
Offset: 23329792
Block: 45566
BlockOffset: 0
Embedded: No
CarveRanges: 23329792-23533567,23605248-23974970
Size: 573499
MD5: 2320fe9c41eaddb864a56c2ddc4dd186
SHA1: eab7362c938673cd7d7d63757bd6927bc5648ef9

3.2.15.2 Stage 1 Results

CleanCarve: No
Explanation: This file did not carve cleanly. The test_jpeg.pl tool indicated that the carved file was not valid. xv was also used to try to open the file for viewing.

# tools/test_jpeg.pl -v -f dfrws-2006-challenge.raw.23329792.sof.jpeg
dfrws-2006-challenge.raw.23329792.sof.jpeg JPEG file is not valid.
3.2.15.3 Stage 2 Results TrimSize: 71680 TrimOffset: 203776 CleanTrim: Yes Explanation: This file was trimmed with the following command: do_itrim -e jpeg -l 203776 -r 1 -s 512 -f dfrws-2006-challenge.raw.23329792.sof.jpeg -t 71680 -- tools/test_jpeg.pl -f %subject A nonzero lower bound was chosen because looking at the entropy plot graphs, the entropy was consistent until block 398 where the entropy dropped. The entropy stayed low until block 538. The step size was set to 512 to keep the trimmer block-aligned. The relevant results of do_itrim are provided here: --- DO_ITRIM_RESULTS --- dfrws-2006-challenge.raw.23329792.sof.jpeg trimmed 71680 @ 203776 ---> pass --- DO_ITRIM_RESULTS --- This file was considered a clean trim because the image can be verified using the JPEG validator. Additionally, the image can be opened and viewed in xv without any errors or warnings. 3.2.16 JPEG File at Offset 23330000 3.2.16.1 Final Results FileType: JPEG Offset: 23330000 Block: 45566 BlockOffset: 208 Embedded: Yes (JPEG thumbnail) CarveRanges: 23330000-23336599 Size: 6600 MD5: 14f703660bdd2798cb08c79ee8597212 SHA1: 4d5a9d8c237d73c98a828a2b5aff6b0f13962e6d 3.2.16.2 Stage 1 Results CleanCarve: Yes Explanation: This file was considered a clean carve because the image can be verified using the JPEG validator. Additionally, the image can be opened and viewed in xv without any errors or warnings. 3.2.16.3 Stage 2 Results Stage 2 analysis was not required. 3.2.17 JPEG File at Offset 24017920 3.2.17.1 Final Results FileType: JPEG Offset: 24017920 Block: 46910 BlockOffset: 0 Embedded: No CarveRanges: 24017920-48556459 Size: 24538540 MD5: db32b271506b2f4974791957627c61cc SHA1: 752b62db6459712628d5b69b986d2dd50f381345 3.2.17.2 Stage 1 Results CleanCarve: Yes Explanation: This file was considered a clean carve because the image can be verified using the JPEG validator. Additionally, the image can be opened and viewed in xv without any errors or warnings. 
3.2.17.3 Stage 2 Results

Stage 2 analysis was not required.

3.2.18 JPEG File at Offset 48561152

3.2.18.1 Final Results

FileType: JPEG
Offset: 48561152
Block: 94846
BlockOffset: 0
Embedded: No
CarveRanges: 48561152-48962047,48962560-49486540
Size: 924877
MD5: 1a5a843000ef617af93a9cad645e3cdf
SHA1: f8d81c8785150a2346936c3cf8fe0cc81c7501ad

3.2.18.2 Stage 1 Results

CleanCarve: No
Explanation: This file did not carve cleanly. The test_jpeg.pl tool indicated that the carved file was not valid. xv was also used to try to open the file for viewing.

# tools/test_jpeg.pl -v -f dfrws-2006-challenge.raw.48561152.sof.jpeg
dfrws-2006-challenge.raw.48561152.sof.jpeg JPEG file is not valid.

This file carved to the second JPEG EOF. This second EOF was at the beginning of a block, and it did not appear to be the correct EOF. Analysis of the carve.log showed there were additional dangling EOFs for JPEGs. A larger carve was taken to the third EOF at offset 49486539, and the block containing the second EOF was removed. Once this was done, the image tested as clean.

3.2.18.3 Stage 2 Results

TrimSize: 512
TrimOffset: 400896
CleanTrim: Yes
Explanation: First, a larger carve was required because the first carve from the subject file was too small -- this was due to the fact that the second EOF was a false positive. The actual EOF marker was located at offset 49486539.

dd if=dfrws-2006-challenge.raw of=test.jpeg bs=1 skip=48561152 count=925389

Once the larger carve had been performed, we were able to remove the extra block at offset 400896 using do_itrim as shown here:

do_itrim -e jpeg -l 400896 -r 1 -s 512 -f test.jpeg -t 512 -- test_jpeg.pl -f %subject

The lower bound, 400896 (or block 783), was chosen to coincide with the location of the offending EOF, which is listed in the dig output. The trim size was set to 512 because that is the smallest block of data we could carve -- values smaller than 512 were not considered because they would break block alignments.
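
The relationship between the dd carve, the trim, and the final carve ranges can be worked through in a few lines of Python. This is a sketch of the arithmetic only; the function name ranges_after_trim is hypothetical, and the inputs are the skip and count values from the dd command and the trim offset and size from the do_itrim command above.

```python
def ranges_after_trim(start, count, trim_offset, trim_size):
    """Absolute, inclusive byte ranges that remain after removing
    trim_size bytes at trim_offset from a carve of count bytes."""
    first = (start, start + trim_offset - 1)
    second = (start + trim_offset + trim_size, start + count - 1)
    return [first, second]


ranges = ranges_after_trim(48561152, 925389, 400896, 512)
size = 925389 - 512  # carved bytes minus the removed block
print(ranges, size)
```

The remaining size works out to 924877 bytes, which agrees with the Size field in section 3.2.18.1.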
The relevant results of do_itrim are provided here:

--- DO_ITRIM_RESULTS ---
new.dfrws-2006-challenge.raw.48561152.sof.jpeg trimmed 512 @ 400896 ---> pass
--- DO_ITRIM_RESULTS ---

This file was considered a clean trim because the image can be verified using the JPEG validator. Additionally, the image can be opened and viewed in xv without any errors or warnings.

3.2.19 JPEG File at Offset 48561970

3.2.19.1 Final Results

FileType: JPEG
Offset: 48561970
Block: 94847
BlockOffset: 306
Embedded: Yes (JPEG thumbnail)
CarveRanges: 48561970-48564602
Size: 2633
MD5: 11dd149a694df0f1ad47bffed35fb038
SHA1: 95e5fa174a45efb391386718dac7cd159634dd8e

3.2.19.2 Stage 1 Results

CleanCarve: Yes
Explanation: This file was considered a clean carve because the image can be verified using the JPEG validator. Additionally, the image can be opened and viewed in xv without any errors or warnings.

3.2.19.3 Stage 2 Results

Stage 2 analysis was not required.

3.3 Answers for PNG Files

The subject image contains the following PNG images:

dfrws-2006-challenge.raw.4086865.png
dfrws-2006-challenge.raw.16902215.png
dfrws-2006-challenge.raw.18120696.png
dfrws-2006-challenge.raw.18140936.png

Note: Each of these image files is actually part of a larger Microsoft Office document. This is evidenced by the fact that the SOF/EOF offsets for each PNG image fall within the limits of a valid Microsoft Office document. Another strong indicator that these files are actually embedded (as opposed to interleaved or encompassed) is the fact that none of them starts on a block boundary.
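
The block-boundary observation is easy to check. The Python sketch below assumes the 512-byte block size used throughout this document and derives the Block and BlockOffset values for each PNG start offset; the helper name locate is hypothetical.

```python
BLOCK_SIZE = 512  # block size used throughout this analysis

png_offsets = [4086865, 16902215, 18120696, 18140936]


def locate(offset, block_size=BLOCK_SIZE):
    """Return (block, block_offset) for an absolute byte offset."""
    return divmod(offset, block_size)


for offset in png_offsets:
    block, block_offset = locate(offset)
    aligned = block_offset == 0
    print(offset, block, block_offset, "aligned" if aligned else "not aligned")
```

A nonzero block offset for every PNG start is consistent with the files being embedded in another document rather than stored as files of their own.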
Modified dig output: "dfrws-2006-challenge.raw"|normal|sof.png|4086865|7982|81|%89PNG%0d%0a%1a%0a "dfrws-2006-challenge.raw"|normal|eof.png|4093285|7994|357|IEND%aeB%60%82 "dfrws-2006-challenge.raw"|normal|sof.png|16902215|33012|71|%89PNG%0d%0a%1a%0a "dfrws-2006-challenge.raw"|normal|eof.png|16973227|33150|427|IEND%aeB%60%82 "dfrws-2006-challenge.raw"|normal|sof.png|18120696|35391|504|%89PNG%0d%0a%1a%0a "dfrws-2006-challenge.raw"|normal|eof.png|18140581|35430|421|IEND%aeB%60%82 "dfrws-2006-challenge.raw"|normal|sof.png|18140936|35431|264|%89PNG%0d%0a%1a%0a "dfrws-2006-challenge.raw"|normal|eof.png|18215416|35576|504|IEND%aeB%60%82 3.3.1 PNG File at Offset 4086865 3.3.1.1 Final Results Offset: 4086865 Block: 7982 BlockOffset: 81 Embedded: Yes (office document) CarveRanges: 4086865-4093292 Size: 6428 MD5: 13bf31f3f270e9755ba27c96a07fe615 SHA1: 9107f79ded0962be062ec858990cf65533a2156c 3.3.1.2 Stage 1 Results This file was considered a clean carve because the image can be opened and viewed in xv without any errors or warnings. Note: We did not pursue a scriptable PNG validator since PNGs were not on the list of target file types, and our time to work on this challenge was limited. 3.3.2 PNG File at Offset 16902215 3.3.2.1 Final Results Offset: 16902215 Block: 33012 BlockOffset: 71 Embedded: Yes (office document) CarveRanges: 16902215-16973234 Size: 71020 MD5: 204d04b27e673b5bb7ed63885b819b9f SHA1: a48c1ccd48b21ae78c3cfa8099f3f488dd260594 3.3.2.2 Stage 1 Results This file was considered a clean carve because the image can be opened and viewed in xv without any errors or warnings. Note: We did not pursue a scriptable PNG validator since PNGs were not on the list of target file types, and our time to work on this challenge was limited. 
3.3.3 PNG File at Offset 18120696

3.3.3.1 Final Results

Offset: 18120696
Block: 35391
BlockOffset: 504
Embedded: Yes (office document)
CarveRanges: 18120696-18140588
Size: 19893
MD5: 1b97489741b99a34b19ba2f8f23be0d4
SHA1: 1562ea962a7c1c3f091097fdfadcf76aa6d8ba8c

3.3.3.2 Stage 1 Results

This file was considered a clean carve because the image can be opened and viewed in xv without any errors or warnings.

Note: We did not pursue a scriptable PNG validator since PNGs were not on the list of target file types, and our time to work on this challenge was limited.

3.3.4 PNG File at Offset 18140936

3.3.4.1 Final Results

Offset: 18140936
Block: 35431
BlockOffset: 264
Embedded: Yes (office document)
CarveRanges: 18140936-18215423
Size: 74488
MD5: 6e33b2341a38a7936f2b5243ade0ef5e
SHA1: 6240481dce6cf79ed57a542d254ca4884a2f2282

3.3.4.2 Stage 1 Results

This file was considered a clean carve because the image can be opened and viewed in xv without any errors or warnings.

Note: We did not pursue a scriptable PNG validator since PNGs were not on the list of target file types, and our time to work on this challenge was limited.

3.4 Answers for OLE Files

The subject image contains the following OLE files:

dfrws-2006-challenge.raw.1050112.ole
dfrws-2006-challenge.raw.4077568.ole
dfrws-2006-challenge.raw.16812544.ole
dfrws-2006-challenge.raw.17555456.ole
dfrws-2006-challenge.raw.18942976.ole
dfrws-2006-challenge.raw.23533568.ole

The modified dig output below shows the start of file for each OLE document; the FAT block pointers, in hex, are listed in the last field.

"dfrws-2006-challenge.raw"|xmagic|sof.ole|1050112|689,68A,68B,68C,68D,68E,68F,690,691,692,693,694,695,699
"dfrws-2006-challenge.raw"|xmagic|sof.ole|4077568|38,79,D2,15C,1D0,25A,2CE
"dfrws-2006-challenge.raw"|xmagic|sof.ole|16812544|220,221,222,223,22A
"dfrws-2006-challenge.raw"|xmagic|sof.ole|17555456|71F,720,721,722,723,724,725,726,727,728,729,72A,72B,72C,72D
"dfrws-2006-challenge.raw"|xmagic|sof.ole|18942976|C92,C93,C94,C95,C96,C97,C98,C99,C9A,C9B,C9C,C9D,C9E,C9F,CA0,CA1,CA2,CA3,CA4,CA5,CA6,CA7,CA8,CA9,CAA,CAB
"dfrws-2006-challenge.raw"|xmagic|sof.ole|23533568|85,86

False positives were reduced primarily via XMagic and the validator ole-dig2crv. XMagic was developed to identify OLE document heads, enumerate FAT block offsets, and identify FAT blocks. The ole-dig2crv script analyzes the FTimes dig output using XMagic to verify the location of FAT blocks, and it reports FAT blocks that are misplaced.

3.4.1 OLE File at Offset 1050112

3.4.1.1 Final Results

Offset: 1050112
Block: 2051
BlockOffset: 0
Embedded: No
CarveRanges: 1050112-1562111,1572864-1930751
Size: 869888
MD5: 03d1deff4774c932358d3580a3bbae66
SHA1: 7960c3a03bc01c2018d101ec16e8c0cec8b4f580

3.4.1.2 Stage 1 Results

CleanCarve: No
FATOffsets: 00000689,0000068a,0000068b,0000068c,
            0000068d,0000068e,0000068f,00000690,
            00000691,00000692,00000693,00000694,
            00000695,00000699
Explanation: This file was not considered a clean carve because the file cannot be completely verified with ole-dump.
The results of running ole-dump are provided here:

--- OLE_DUMP_RESULTS ---
% ole-dump carved/dfrws-2006-challenge.raw.1050112.ole
Table of Contents for work/carve/opt/data/dfrws-2006-challenge/sandbox/dfrw_challenge_2006/evidence_locker/dfrws-2006-challenge.raw.1050112.ole:
Segmentation fault (core dumped)
--- OLE_DUMP_RESULTS ---

3.4.1.3 Stage 2 Results

CarveRanges: 1050112-1562111,1572864-1930751
CleanCarve: Yes
Explanation: Analysis of output from the ole-dig2crv tool in the log file work/carve/carve.log shows that the first FAT block, 0x689, is not located at the correct file offset. The first FAT block is expected to start at offset 1907200 (block 3725), but it actually starts at offset 1917952 (block 3746). This implies that there are 10752 bytes (21 blocks) of extra data located somewhere between offset 1050112 (block 2051) and 1907200 (block 3725). The relevant output from ole-dig2crv is shown below.

--- snip ---
ole-dig2crv: file at 1050112, offset 1050112, block 2051 -- header FAT block pointers (in hex): 689,68A,68B,68C,68D,68E,68F,690,691,692,693,694,695,699
ole-dig2crv: file at 1050112, offset 1907200, block 3725 -- missing FAT block, slipped forward 10752 bytes, 21 blocks
ole-dig2crv: file at 1050112, offset 1917952, block 3746 -- valid FAT block #1, 0x689
ole-dig2crv: file at 1050112, offset 1918464, block 3747 -- valid FAT block #2, 0x68A
ole-dig2crv: file at 1050112, offset 1918976, block 3748 -- valid FAT block #3, 0x68B
--- snip ---

Examination of the combined dig output file, work/carve/combined.dig, did not reveal anything significant.

Analysis of the entropy plots (see section titled "Compute and Plot Sliding Entropy/Average Statistics") located in the directory work/plots resulted in the observations shown in the table below. Observations started with block 2051 in plot file sliding.entropy.out.512.plot.ae.png.
In plot file sliding.entropy.out.512.plot.ag.png there is a distinct entropy shift for roughly 20 blocks starting at or near block 3050. approx chunk position block entropy shift ----- -------- ----- --------------- 1 lower 3050 from approx 6.5 to 7.5 1 upper 3070 from 7.5 to approx 6.5 Using the graphs as a hint, we examined the block entropy data computed by FTimes in MySQL (see section titled "Compute and Load Sliding Statistics (Percent, Entropy, Average, MD5, and SHA1)"). The MySQL query and output (slightly altered) below shows the start and end of the entropy change. Notice that entropy jumps up from about 6.5 to 7.5 starting at offset 1562112 (block 3051) and drops from about 7.6 to 6.5 at offset 1572864 (block 3072). This is 21 extra blocks which is consistent with the ole-dig2crv output. --- MYSQL_ENTROPY_QUERY --- mysql> select offset, block, block_offset, blocksize, rent1 from stats where blocksize = 512 and ( (offset >= 1560576 and offset <= 1563136) or (offset >= 1571328 and offset <= 1573888) ); +---------+-------+--------------+-----------+----------+ | offset | block | block_offset | blocksize | rent1 | +---------+-------+--------------+-----------+----------+ | 1560576 | 3048 | 0 | 512 | 6.555392 | | 1561088 | 3049 | 0 | 512 | 6.58527 | | 1561600 | 3050 | 0 | 512 | 6.568371 | | 1562112 | 3051 | 0 | 512 | 7.533653 | | 1562624 | 3052 | 0 | 512 | 7.529483 | | 1563136 | 3053 | 0 | 512 | 7.586172 | | 1571328 | 3069 | 0 | 512 | 7.630048 | | 1571840 | 3070 | 0 | 512 | 7.630446 | | 1572352 | 3071 | 0 | 512 | 7.616982 | | 1572864 | 3072 | 0 | 512 | 6.627545 | | 1573376 | 3073 | 0 | 512 | 6.511695 | | 1573888 | 3074 | 0 | 512 | 6.569913 | +---------+-------+--------------+-----------+----------+ 12 rows in set (0.07 sec) --- MYSQL_ENTROPY_QUERY --- We then computed the final carve ranges as shown here: 1050112 = start of file 869888 = predicted file size from ole-dig2crv 10752 = size of extra data = 21 blocks * 512 range 1 lower = 1050112 range 1 upper = 
1562111 = 1562112 - 1
range 2 lower = 1572864 = 1562112 + 10752
range 2 upper = 1930751 = 1050112 + (869888 + 10752) - 1

The file was then re-carved using the final carve ranges and ftimes-crv2raw.pl as follows:

--- FTIMES_CRV2RAW_RESULTS ---
% echo '"evidence_locker/dfrws-2006-challenge.raw"|ole|1050112|1|1050112-1562111,1572864-1930751' | ftimes-crv2raw.pl -F -U -m -f -
"carve_tree/evidence_locker/dfrws-2006-challenge.raw.1050112.ole"|869888|7960c3a03bc01c2018d101ec16e8c0cec8b4f580|03d1deff4774c932358d3580a3bbae66
--- FTIMES_CRV2RAW_RESULTS ---

The relevant results of ole-dump are provided here:

--- OLE_DUMP_RESULTS ---
% ole-dump carve_tree/evidence_locker/dfrws-2006-challenge.raw.1050112.ole
Table of Contents for carve_tree/evidence_locker/dfrws-2006-challenge.raw.1050112.ole:
Root Entry
Workbook 848333
SummaryInformation 4096
DocumentSummaryInformation 4096
--- OLE_DUMP_RESULTS ---

This file was considered a clean carve because the document can be verified using the ole-dump validator. Additionally, the file can be opened and viewed in Microsoft Excel without any errors or warnings.

3.4.2 OLE File at Offset 4077568

3.4.2.1 Final Results

Offset: 4077568
Block: 7964
BlockOffset: 0
Embedded: No
CarveRanges: 4077568-4241919,4850688-5136383
Size: 450048
MD5: 8d2a9a284e078805ada47db191f35244
SHA1: 40ed556bdfa3b6dac2641d4d1a8f6d474eb15835

3.4.2.2 Stage 1 Results

CleanCarve: No
FATOffsets: 00000038,00000079,000000d2,0000015c,
            000001d0,0000025a,000002ce
Explanation: This file was not considered a clean carve because the file cannot be completely verified with ole-dump.
The results of running ole-dump are provided here:

--- OLE_DUMP_RESULTS ---
% ole-dump carved/dfrws-2006-challenge.raw.4077568.ole
Can't open carved/dfrws-2006-challenge.raw.4077568.ole
--- OLE_DUMP_RESULTS ---

3.4.2.3 Stage 2 Results

CarveRanges: 4077568-4241919,4850688-5136383
CleanCarve: Yes
Explanation: Analysis of output from the ole-dig2crv tool in the log file work/carve/carve.log shows that the fourth FAT block, 0x15C, is not located at the correct file offset. The fourth FAT block is expected to start at offset 4256256 (block 8313), but it actually starts at offset 4865024 (block 9502). This indicates that there are 608768 bytes (1189 blocks) of extra data located somewhere between offset 4185600 (block 8175) and 4256256 (block 8313). The relevant output from ole-dig2crv is shown below.

--- snip ---
ole-dig2crv: file at 4077568, offset 4077568, block 7964 -- header FAT block pointers (in hex): 38,79,D2,15C,1D0,25A,2CE
ole-dig2crv: file at 4077568, offset 4106752, block 8021 -- valid FAT block #1, 0x38
ole-dig2crv: file at 4077568, offset 4140032, block 8086 -- valid FAT block #2, 0x79
ole-dig2crv: file at 4077568, offset 4185600, block 8175 -- valid FAT block #3, 0xD2
ole-dig2crv: file at 4077568, offset 4256256, block 8313 -- missing FAT block, slipped forward 608768 bytes, 1189 blocks
ole-dig2crv: file at 4077568, offset 4865024, block 9502 -- valid FAT block #4, 0x15C
ole-dig2crv: file at 4077568, offset 4924416, block 9618 -- valid FAT block #5, 0x1D0
ole-dig2crv: file at 4077568, offset 4995072, block 9756 -- valid FAT block #6, 0x25A
--- snip ---

Examination of the combined dig output file, work/carve/combined.dig, identified a JPEG image between FAT blocks 3 and 4. The JPEG begins at offset 4241920 and its two-byte EOF marker begins at offset 4850621, so its last byte is at offset 4850622 -- that's 608703 bytes, or 608768 bytes rounded up to the next block.
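
The slip arithmetic above can be reproduced with a short Python sketch. The function names slip and round_up are hypothetical; the inputs are the expected and actual offsets of the fourth FAT block from the ole-dig2crv output and the JPEG Size reported in section 3.2.4.1, with the 512-byte block size used throughout this document.

```python
BLOCK_SIZE = 512  # block size used throughout this analysis


def slip(expected, actual, block_size=BLOCK_SIZE):
    """Return (bytes, blocks) by which a FAT block slipped forward."""
    delta = actual - expected
    blocks, remainder = divmod(delta, block_size)
    if remainder:
        raise ValueError("slip is not a whole number of blocks")
    return delta, blocks


def round_up(size, block_size=BLOCK_SIZE):
    """Round a byte count up to the next multiple of block_size."""
    return -(-size // block_size) * block_size


slip_bytes, slip_blocks = slip(4256256, 4865024)
jpeg_size = 608703  # Size reported for the JPEG in section 3.2.4.1
print(slip_bytes, slip_blocks, round_up(jpeg_size))
```

The slipped distance and the block-rounded JPEG size come out equal, which supports the conclusion that the JPEG accounts for all of the extra data.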
--- COMBINED_DIG ---
name|type|tag|offset|block|block_offset|string
"dfrws-2006-challenge.raw"|xmagic|sof.ole|4077568|7964|0|38,79,D2,15C,1D0,25A,2CE
"dfrws-2006-challenge.raw"|normal|sof.png|4086865|7982|81|%89PNG%0d%0a%1a%0a
"dfrws-2006-challenge.raw"|normal|eof.png|4093285|7994|357|IEND%aeB%60%82
"dfrws-2006-challenge.raw"|xmagic|fat.ole|4106752|8021|0|00000001:aaa...
"dfrws-2006-challenge.raw"|normal|eof.jpeg|4135398|8076|486|%ff%d9
"dfrws-2006-challenge.raw"|xmagic|fat.ole|4140032|8086|0|00000081:aaa...
"dfrws-2006-challenge.raw"|xmagic|fat.ole|4185600|8175|0|00000101:aaa...
"dfrws-2006-challenge.raw"|normal|eof.jpeg|4209069|8220|429|%ff%d9
"dfrws-2006-challenge.raw"|regexp|sof.jpeg|4241920|8285|0|%ff%d8%ff%e0%00%10JFIF
"dfrws-2006-challenge.raw"|normal|eof.jpeg|4850621|9473|445|%ff%d9
"dfrws-2006-challenge.raw"|xmagic|fat.ole|4865024|9502|0|00000181:aaa...
"dfrws-2006-challenge.raw"|xmagic|fat.ole|4924416|9618|0|00000201:aaa...
--- COMBINED_DIG ---

Analysis of the entropy plots (see section titled "Compute and Plot Sliding Entropy/Average Statistics") located in the directory work/plots reveals a slight entropy change around block 8285 in file sliding.entropy.out.512.plot.aq.png, and another change around block 9473. Between these two blocks, the entropy appears to be a little tighter.

Using the graphs as a hint, we examined the block entropy data computed by FTimes in MySQL (see section titled "Compute and Load Sliding Statistics (Percent, Entropy, Average, MD5, and SHA1)"). The MySQL query and output (slightly altered) below show the slight entropy changes observed in the plots.
--- MYSQL_ENTROPY_QUERY ---
mysql> select offset, block, block_offset, blocksize, rent1 from stats where blocksize = 512 and ( (offset >= 4239872 and offset <= 4244480) or (offset >= 4848640 and offset <= 4853760) );
+---------+-------+--------------+-----------+----------+
| offset  | block | block_offset | blocksize | rent1    |
+---------+-------+--------------+-----------+----------+
| 4239872 |  8281 |            0 |       512 | 7.585183 |
| 4240384 |  8282 |            0 |       512 | 7.515887 |
| 4240896 |  8283 |            0 |       512 |  7.58541 |
| 4241408 |  8284 |            0 |       512 | 7.538749 |
| 4241920 |  8285 |            0 |       512 |  6.39738 |
| 4242432 |  8286 |            0 |       512 | 7.527157 |
| 4242944 |  8287 |            0 |       512 | 7.379564 |
| 4243456 |  8288 |            0 |       512 | 7.428707 |
| 4243968 |  8289 |            0 |       512 |  7.34585 |
| 4244480 |  8290 |            0 |       512 | 7.392842 |
| 4848640 |  9470 |            0 |       512 | 7.510298 |
| 4849152 |  9471 |            0 |       512 | 7.502472 |
| 4849664 |  9472 |            0 |       512 | 7.428907 |
| 4850176 |  9473 |            0 |       512 | 7.592864 |
| 4850688 |  9474 |            0 |       512 | 7.538113 |
| 4851200 |  9475 |            0 |       512 | 7.575682 |
| 4851712 |  9476 |            0 |       512 |  7.62032 |
| 4852224 |  9477 |            0 |       512 | 7.484087 |
| 4852736 |  9478 |            0 |       512 | 7.582602 |
| 4853248 |  9479 |            0 |       512 | 7.538266 |
| 4853760 |  9480 |            0 |       512 | 7.548107 |
+---------+-------+--------------+-----------+----------+
21 rows in set (0.11 sec)
--- MYSQL_ENTROPY_QUERY ---

We computed the final carve ranges as shown below. Note that the size of the extra data, 608768 bytes, is slightly larger than the actual JPEG file size, 608703 bytes (see section 3.2.4.1). This difference, 65 bytes, represents slack space at the end of the JPEG file. That is to say, the extra 65 bytes are needed to maintain block alignment.
4077568 = start of file 450048 = predicted file size from ole-dig2crv 608768 = size of extra data = 1189 blocks * 512 range 1 lower = 4077568 range 1 upper = 4241919 = JPEG beginning minus 1 = 4241920 - 1 range 2 lower = 4850688 = 4241920 + 608768 range 2 upper = 5136383 = 4077568 + (450048 + 608768) - 1 The file was then re-carved using the final carve ranges and ftimes-crv2raw.pl as follows: --- FTIMES_CRV2RAW_RESULTS --- % echo '"evidence_locker/dfrws-2006-challenge.raw"|ole|4077568|1|4077568-4241919,4850688-5136383' | ftimes-crv2raw.pl -F -U -m -f - "carve_tree/evidence_locker/dfrws-2006-challenge.raw.4077568.ole"|450048|40ed556bdfa3b6dac2641d4d1a8f6d474eb15835|8d2a9a284e078805ada47db191f35244 --- FTIMES_CRV2RAW_RESULTS --- The relevant results of ole-dump are provided here: --- OLE_DUMP_RESULTS --- % ole-dump carve_tree/evidence_locker/dfrws-2006-challenge.raw.4077568.ole Root Entry Data 4096 1Table 9893 WordDocument 428594 SummaryInformation 464 DocumentSummaryInformation 312 CompObj 106 --- OLE_DUMP_RESULTS --- This file was considered a clean carve because the document can be verified using the ole-dump validator. Additionally, the file can be opened and viewed in OpenOffice without any errors or warnings. 3.4.3 OLE File at Offset 16812544 3.4.3.1 Final Results Offset: 16812544 Block: 32837 BlockOffset: 0 Embedded: No CarveRanges: 16812544-17099775 Size: 287232 MD5: 0e52e75029e99cd2e9dcd0af271cf4a2 SHA1: d1ebd7c0bdb767d9a0c9ff9ea922686e38ae0480 3.4.3.2 Stage 1 Results CleanCarve: Yes FATOffsets: 00000220,00000221,00000222,00000223, 0000022a Explanation: This file was considered a clean carve because the document can be verified using the ole-dump validator. Additionally, the file can be opened and viewed in OpenOffice without any errors or warnings. 
The relevant results of ole-dump are provided here: --- OLE_DUMP_RESULTS --- % ole-dump carve_tree/evidence_locker/dfrws-2006-challenge.raw.16812544.ole Root Entry Data 22037 1Table 87249 WordDocument 160179 SummaryInformation 4096 DocumentSummaryInformation 1292 CompObj 106 --- OLE_DUMP_RESULTS --- 3.4.3.3 Stage 2 Results Stage 2 analysis was not required. 3.4.4 OLE File at Offset 17555456 3.4.4.1 Final Results Offset: 17555456 Block: 34288 BlockOffset: 0 Embedded: No CarveRanges: 17555456-17565183,17619456-18553343 Size: 943616 MD5: d7ff92b8cc1c89c46a78288b9c673152 SHA1: c7397412bf3f10571c246d2d2a9ed608dddc1c76 3.4.4.2 Stage 1 Results CleanCarve: No FATOffsets: 0000071f,00000720,00000721,00000722, 00000723,00000724,00000725,00000726, 00000727,00000728,00000729,0000072a, 0000072b,0000072c,0000072d Explanation: This file was not considered a clean carve because the file can not be completely verified with ole-dump. The results of running ole-dump are provided here: --- OLE_DUMP_RESULTS --- % ole-dump work/carve/opt/data/dfrws-2006-challenge/sandbox/dfrw_challenge_2006/evidence_locker/dfrws-2006-challenge.raw.17555456.ole Table of Contents for work/carve/opt/data/dfrws-2006-challenge/sandbox/dfrw_challenge_2006/evidence_locker/dfrws-2006-challenge.raw.17555456.ole: Segmentation fault (core dumped) --- OLE_DUMP_RESULTS --- 3.4.4.3 Stage 2 Results CarveRanges: 17555456-17565183,17619456-18553343 CleanCarve: Yes Explanation: Analysis of output from the ole-dig2crv tool in the log file work/carve/carve.log shows that the first FAT block, 0x71F, is not located at the correct file offset. The first FAT block is expected to start at offset 18489344 (block 36112), but it actually starts at offset 18543616 (block 36218). This implies that there are 54272 bytes (106 blocks) of extra data located somewhere between offset 17555456 (block 34288) and 18543616 (block 36218). The relevant output from ole-dig2crv is shown below. 
--- snip --- ole-dig2crv: file at 17555456, offset 17555456, block 34288 -- header FAT block pointers (in hex): 71F,720,721,722,723,724,725,726,727,728,729,72A,72B,72C,72D ole-dig2crv: file at 17555456, offset 17097216, block 33393 -- premature FAT block, skipping ole-dig2crv: file at 17555456, offset 18231808, block 35609 -- premature FAT block, skipping ole-dig2crv: file at 17555456, offset 18343936, block 35828 -- premature FAT block, skipping ole-dig2crv: file at 17555456, offset 18489344, block 36112 -- missing FAT block, slipped forward 54272 bytes, 106 blocks ole-dig2crv: file at 17555456, offset 18543616, block 36218 -- valid FAT block #1, 0x71F ole-dig2crv: file at 17555456, offset 18544128, block 36219 -- valid FAT block #2, 0x720 ole-dig2crv: file at 17555456, offset 18544640, block 36220 -- valid FAT block #3, 0x721 --- snip --- Examination of the combined dig output file, work/carve/combined.dig, did not reveal anything significant. Analysis of the entropy plots (see section titled "Compute and Plot Sliding Entropy/Average Statistics") located in the directory work/plots, resulted in no significant observations. Next, we manually inspected data in the raw image file using bvi and observed a notable shift away from the technical language at offset 17565184 (block 34307) -- that offset in hex is 0x010C0600. 
--- snip ---
010C0500  20 31 29 20 61 73 73 75 72 65 20 74 68 61 74 20   1) assure that 
010C0510  73 79 73 74 65 6D 73 20 61 6E 64 20 61 70 70 6C  systems and appl
010C0520  69 63 61 74 69 6F 6E 73 20 6F 70 65 72 61 74 65  ications operate
010C0530  20 65 66 66 65 63 74 69 76 65 6C 79 20 61 6E 64   effectively and
010C0540  20 70 72 6F 76 69 64 65 20 61 70 70 72 6F 70 72   provide appropr
010C0550  69 61 74 65 20 63 6F 6E 66 69 64 65 6E 74 69 61  iate confidentia
010C0560  6C 69 74 79 2C 20 69 6E 74 65 67 72 69 74 79 2C  lity, integrity,
010C0570  20 61 6E 64 20 61 76 61 69 6C 61 62 69 6C 69 74   and availabilit
010C0580  79 3B 20 61 6E 64 20 32 29 20 70 72 6F 74 65 63  y; and 2) protec
010C0590  74 20 69 6E 66 6F 72 6D 61 74 69 6F 6E 20 63 6F  t information co
010C05A0  6D 6D 65 6E 73 75 72 61 74 65 20 77 69 74 68 20  mmensurate with 
010C05B0  74 68 65 20 6C 65 76 65 6C 20 6F 66 20 72 69 73  the level of ris
010C05C0  6B 20 61 6E 64 20 6D 61 67 6E 69 74 75 64 65 20  k and magnitude 
010C05D0  6F 66 20 68 61 72 6D 20 72 65 73 75 6C 74 69 6E  of harm resultin
010C05E0  67 20 66 72 6F 6D 20 6C 6F 73 73 2C 20 6D 69 73  g from loss, mis
010C05F0  75 73 65 2C 20 75 6E 61 75 74 68 6F 72 69 7A 65  use, unauthorize
010C0600  20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 54                 T
010C0610  68 65 20 41 64 76 65 6E 74 75 72 65 20 6F 66 20  he Adventure of 
010C0620  74 68 65 20 43 6F 70 70 65 72 20 42 65 65 63 68  the Copper Beech
010C0630  65 73 0A 0A 20 20 22 54 6F 20 74 68 65 20 6D 61  es..  "To the ma
010C0640  6E 20 77 68 6F 20 6C 6F 76 65 73 20 61 72 74 20  n who loves art 
010C0650  66 6F 72 20 69 74 73 20 6F 77 6E 20 73 61 6B 65  for its own sake
010C0660  2C 22 20 72 65 6D 61 72 6B 65 64 20 53 68 65 72  ," remarked Sher
010C0670  2D 0A 6C 6F 63 6B 20 48 6F 6C 6D 65 73 2C 20 74  -.lock Holmes, t
010C0680  6F 73 73 69 6E 67 20 61 73 69 64 65 20 74 68 65  ossing aside the
010C0690  20 61 64 76 65 72 74 69 73 65 6D 65 6E 74 20 73   advertisement s
010C06A0  68 65 65 74 20 6F 66 20 74 68 65 20 44 61 69 6C  heet of the Dail
010C06B0  79 0A 54 65 6C 65 67 72 61 70 68 2C 20 22 69 74  y.Telegraph, "it
010C06C0  20 69 73 20 66 72 65 71 75 65 6E 74 6C 79 20 69   is frequently i
010C06D0  6E 20 69 74 73 20 6C 65 61 73 74 20 69 6D 70 6F  n its least impo
--- snip ---

Then, we manually inspected data in the raw image file using bvi 106 blocks away from the initial content shift, and we observed a notable shift back to the technical discussion beginning at offset 17619456 (block 34413) -- that offset in hex is 0x010CDA00.

--- snip ---
010CD960  29 55 67 3F 51 54 CE 94 A6 CC 4C 08 E1 0B A2 6E  )Ug?QT....L....n
010CD970  47 E0 A7 1D A4 2D 71 89 53 A3 44 23 93 EA AA 64  G....-q.S.D#...d
010CD980  50 A3 70 E3 28 06 D3 66 32 BE 8F FC 0B 79 20 08  P.p.(..f2....y .
010CD990  D0 ED B4 2A 28 53 26 D9 CB 0B 26 E1 46 D7 FE 7A  ...*(S&...&.F..z
010CD9A0  35 B8 8D 31 74 A7 3F FA 16 30 29 35 F3 59 19 CB  5..1t.?..0)5.Y..
010CD9B0  A5 07 9E 41 3B 65 F4 ED 09 65 C6 D6 DA 57 3F DF  ...A;e...e...W?.
010CD9C0  C4 48 2F 17 01 2B C3 7A C9 0C 3B BA 3D 71 F9 BF  .H/..+.z..;.=q..
010CD9D0  E5 05 19 51 47 20 1E 1C 89 5B D0 44 3E 97 1C A4  ...QG ...[.D>...
010CD9E0  1D DA 29 74 EF 77 0C E6 B3 B0 60 A3 B0 89 EB E4  ..)t.w....`.....
010CD9F0 BE 6E E9 2C DB 06 EA BB 47 9A 54 22 2A 92 29 3C .n.,....G.T"*.)< 010CDA00 64 20 61 63 63 65 73 73 2C 20 6F 72 20 6D 6F 64 d access, or mod 010CDA10 69 66 69 63 61 74 69 6F 6E 2E 0D 0D 41 67 65 6E ification...Agen 010CDA20 63 69 65 73 20 6D 75 73 74 20 70 6C 61 6E 20 66 cies must plan f 010CDA30 6F 72 20 73 65 63 75 72 69 74 79 2C 20 65 6E 73 or security, ens 010CDA40 75 72 65 20 74 68 61 74 20 74 68 65 20 61 70 70 ure that the app 010CDA50 72 6F 70 72 69 61 74 65 20 6F 66 66 69 63 69 61 ropriate officia 010CDA60 6C 73 20 61 72 65 20 61 73 73 69 67 6E 65 64 20 ls are assigned 010CDA70 73 65 63 75 72 69 74 79 20 72 65 73 70 6F 6E 73 security respons 010CDA80 69 62 69 6C 69 74 79 2C 20 61 6E 64 20 61 75 74 ibility, and aut --- snip --- We computed the final carve ranges as shown below. 17555456 = start of file 943616 = predicted file size from ole-dig2crv 54272 = size of extra data = 106 blocks * 512 range 1 lower = 17555456 range 1 upper = 17565183 = Text shift beginning minus 1 = 17565184 - 1 range 2 lower = 17619456 = 17565184 + 54272 range 2 upper = 18553343 = 17555456 + (943616 + 54272) - 1 The file was then re-carved using the final carve ranges and ftimes-crv2raw.pl as follows: --- FTIMES_CRV2RAW_RESULTS --- % echo '"evidence_locker/dfrws-2006-challenge.raw"|ole|17555456|1|17555456-17565183,17619456-18553343' | ftimes-crv2raw.pl -F -U -m -f - "carve_tree/evidence_locker/dfrws-2006-challenge.raw.17555456.ole"|943616|c7397412bf3f10571c246d2d2a9ed608dddc1c76|d7ff92b8cc1c89c46a78288b9c673152 --- FTIMES_CRV2RAW_RESULTS --- The relevant results of ole-dump are provided here: --- OLE_DUMP_RESULTS --- % ole-dump carve_tree/evidence_locker/dfrws-2006-challenge.raw.17555456.ole Table of Contents for carve_tree/evidence_locker/dfrws-2006-challenge.raw.17555456.ole: Root Entry Data 325997 1Table 303154 WordDocument 292704 SummaryInformation 4096 DocumentSummaryInformation 6212 CompObj 106 ObjectPool --- OLE_DUMP_RESULTS --- This file was considered a clean carve 
because the document can be verified using the ole-dump validator. Additionally, the file can be opened and viewed in OpenOffice without any errors or warnings. 3.4.5 OLE File at Offset 18942976 3.4.5.1 Final Results Offset: 18942976 Block: 36998 BlockOffset: 0 Embedded: No CarveRanges: 18942976-19276799,19316224-20187135,20212224-20675071 Size: 1667584 MD5: 4a22f04b097920d11fff4e192e0667a4 SHA1: a717cff358e12903a45042aed85b085b40d98928 3.4.5.2 Stage 1 Results CleanCarve: No FATOffsets: 00000c92,00000c93,00000c94,00000c95, 00000c96,00000c97,00000c98,00000c99, 00000c9a,00000c9b,00000c9c,00000c9d, 00000c9e,00000c9f,00000ca0,00000ca1, 00000ca2,00000ca3,00000ca4,00000ca5, 00000ca6,00000ca7,00000ca8,00000ca9, 00000caa,00000cab Explanation: This file was not considered a clean carve because the file can not be completely verified with ole-dump. The results of running ole-dump are provided here: --- OLE_DUMP_RESULTS --- % ole-dump work/carve/opt/data/dfrws-2006-challenge/sandbox/dfrw_challenge_2006/evidence_locker/dfrws-2006-challenge.raw.18942976.ole Table of Contents for work/carve/opt/data/dfrws-2006-challenge/sandbox/dfrw_challenge_2006/evidence_locker/dfrws-2006-challenge.raw.18942976.ole: Bad read of 512 bytes --- OLE_DUMP_RESULTS --- 3.4.5.3 Stage 2 Results CarveRanges: 18942976-19276799,19316224-20187135,20212224-20675071 CleanCarve: Yes Explanation: Analysis of output from the ole-dig2crv tool in the log file work/carve/carve.log shows that the first FAT block, 0xC92, is not located at the correct file offset. The first FAT block is expected to start at offset 20591104 (block 40217), but it actually starts at offset 20655616 (block 40343). This indicates that there are 64512 bytes (126 blocks) of extra data located somewhere between offset 18942976 (block 36998) and 20655616 (block 40343). The relevant output from ole-dig2crv is shown below. 
--- snip ---
ole-dig2crv: file at 18942976, offset 18942976, block 36998 -- header FAT block pointers (in hex): C92,C93,C94,C95,C96,C97,C98,C99,C9A,C9B,C9C,C9D,C9E,C9F,CA0,CA1,CA2,CA3,CA4,CA5,CA6,CA7,CA8,CA9,CAA,CAB
ole-dig2crv: file at 18942976, offset 18552320, block 36235 -- premature FAT block, skipping
ole-dig2crv: file at 18942976, offset 20591104, block 40217 -- missing FAT block, slipped forward 64512 bytes, 126 blocks
ole-dig2crv: file at 18942976, offset 20655616, block 40343 -- valid FAT block #1, 0xC92
ole-dig2crv: file at 18942976, offset 20656128, block 40344 -- valid FAT block #2, 0xC93
ole-dig2crv: file at 18942976, offset 20656640, block 40345 -- valid FAT block #3, 0xC94
--- snip ---

Examination of the combined dig output file, work/carve/combined.dig, did not reveal anything significant.

Analysis of the entropy plots (see section titled "Compute and Plot Sliding Entropy/Average Statistics") located in the directory work/plots resulted in the observations shown in the table below. Observations started with block 36998 in file sliding.entropy.out.512.plot.cv.png. There are two chunks of data that do not belong in the file.

                 approx
chunk  position  block   entropy shift
-----  --------  ------  ---------------
    1  lower     37650   from 4.5 to 7.5
    1  upper     37730   from 7.5 to 4.5
    2  lower     39430   from 5.0 to 8.0
    2  upper     39480   from 8.0 to 4.5

Using the graphs as a hint, we examined the block entropy data computed by FTimes in MySQL (see section titled "Compute and Load Sliding Statistics (Percent, Entropy, Average, MD5, and SHA1)"). The MySQL query and output (slightly altered) below show the start and end blocks of the first chunk.
--- MYSQL_ENTROPY_QUERY --- mysql> select offset, block, block_offset, blocksize, rent1 from stats where blocksize = 512 and ( (offset >= 19275264 and offset <= 19277824) or (offset >= 19314688 and offset <= 19317248) ); +----------+-------+--------------+-----------+----------+ | offset | block | block_offset | blocksize | rent1 | +----------+-------+--------------+-----------+----------+ | 19275264 | 37647 | 0 | 512 | 4.222399 | | 19275776 | 37648 | 0 | 512 | 4.011732 | | 19276288 | 37649 | 0 | 512 | 4.685635 | | 19276800 | 37650 | 0 | 512 | 7.553442 | | 19277312 | 37651 | 0 | 512 | 7.639979 | | 19277824 | 37652 | 0 | 512 | 7.549877 | | 19314688 | 37724 | 0 | 512 | 7.572733 | | 19315200 | 37725 | 0 | 512 | 7.560171 | | 19315712 | 37726 | 0 | 512 | 7.659451 | | 19316224 | 37727 | 0 | 512 | 4.650552 | | 19316736 | 37728 | 0 | 512 | 4.450562 | | 19317248 | 37729 | 0 | 512 | 4.252666 | +----------+-------+--------------+-----------+----------+ 12 rows in set (0.05 sec) --- MYSQL_ENTROPY_QUERY --- The abbreviated FTimes dig output below reveals the start and end blocks of the second chunk. --- snip --- name|type|tag|offset|block_size|row_entropy_1|... "dfrws-2006-challenge.raw"|xmagic|stats-512|20186112|512|4.980921|... "dfrws-2006-challenge.raw"|xmagic|stats-512|20186624|512|4.912459|... "dfrws-2006-challenge.raw"|xmagic|stats-512|20187136|512|0.000000|... "dfrws-2006-challenge.raw"|xmagic|stats-512|20187648|512|7.609169|... "dfrws-2006-challenge.raw"|xmagic|stats-512|20188160|512|7.542474|... "dfrws-2006-challenge.raw"|xmagic|stats-512|20188672|512|7.608570|... ... "dfrws-2006-challenge.raw"|xmagic|stats-512|20210688|512|7.575555|... "dfrws-2006-challenge.raw"|xmagic|stats-512|20211200|512|7.568488|... "dfrws-2006-challenge.raw"|xmagic|stats-512|20211712|512|7.647910|... "dfrws-2006-challenge.raw"|xmagic|stats-512|20212224|512|4.911417|... "dfrws-2006-challenge.raw"|xmagic|stats-512|20212736|512|5.057074|... 
"dfrws-2006-challenge.raw"|xmagic|stats-512|20213248|512|4.604382|... --- snip --- Based on the observations of the 512-byte statistics, we developed the following table which shows the total number of extra data blocks is 126, matching the output from the ole-dig2crv tool. entropy block chunk position block offset shift count ----- -------- ------ -------- ---------- ----- 1 lower 37650 19276800 4.5 to 7.5 1 upper 37726 19315712 7.5 to 4.5 77 2 lower 39428 20187136 5.0 to 8.0 2 upper 39476 20211712 8.0 to 4.5 49 ----- total number of blocks: 126 We then computed the final carve ranges as shown here: 18942976 = start of file 1667584 = predicted file size from ole-dig2crv 64512 = size of extra data = 126 blocks * 512 range 1 lower = 18942976 range 1 upper = 19276799 = slot 1 lower - 1 = 19276800 - 1 range 2 lower = 19316224 = slot 1 upper + 512 = 19315712 + 512 range 2 upper = 20187135 = slot 2 lower - 1 = 20187136 - 1 range 3 lower = 20212224 = slot 2 upper + 512 = 20211712 + 512 range 3 upper = 20675071 = 18942976 + (1667584 + 64512) - 1 The file was then re-carved using the final carve ranges and ftimes-crv2raw.pl as follows: --- FTIMES_CRV2RAW_RESULTS --- % echo '"evidence_locker/dfrws-2006-challenge.raw"|ole|18942976|1|18942976-19276799,19316224-20187135,20212224-20675071' | ftimes-crv2raw.pl -F -U -m -f - "carve_tree/evidence_locker/dfrws-2006-challenge.raw.18942976.ole"|1667584|a717cff358e12903a45042aed85b085b40d98928|4a22f04b097920d11fff4e192e0667a4 --- FTIMES_CRV2RAW_RESULTS --- The relevant results of ole-dump are provided here: --- OLE_DUMP_RESULTS --- % ole-dump carve_tree/evidence_locker/dfrws-2006-challenge.raw.18942976.ole Table of Contents for carve_tree/evidence_locker/dfrws-2006-challenge.raw.18942976.ole: Root Entry WordDocument 1647271 CompObj 106 SummaryInformation 468 DocumentSummaryInformation 236 --- OLE_DUMP_RESULTS --- This file was considered a clean carve because the document can be verified using the ole-dump validator. 
Additionally, the file can be opened and viewed in OpenOffice without any errors or warnings. 3.4.6 OLE File at Offset 23533568 3.4.6.1 Final Results Offset: 23533568 Block: 45964 BlockOffset: 0 Embedded: No CarveRanges: 23533568-23605247 Size: 71680 MD5: 109284cc5abddc83879a29785795fd75 SHA1: bb6611fd8f2d296dc061189334917dd2cb47cc02 3.4.6.2 Stage 1 Results CleanCarve: Yes FATOffsets: 00000085,00000086 This file was considered a clean carve because the document can be verified using the ole-dump validator. Additionally, the file can be opened and viewed in OpenOffice without any errors or warnings. The relevant results of ole-dump are provided here: --- OLE_DUMP_RESULTS --- % ole-dump work/carve/opt/data/dfrws-2006-challenge/sandbox/dfrw_challenge_2006/evidence_locker/dfrws-2006-challenge.raw.23533568.ole Table of Contents for work/carve/opt/data/dfrws-2006-challenge/sandbox/dfrw_challenge_2006/evidence_locker/dfrws-2006-challenge.raw.23533568.ole: Root Entry Data 7975 1Table 18134 WordDocument 32805 SummaryInformation 4096 DocumentSummaryInformation 4096 CompObj 106 --- OLE_DUMP_RESULTS --- 3.4.6.3 Stage 2 Results Stage 2 analysis was not required. 
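The carve-range arithmetic applied to the fragmented OLE files in this section (offsets 4077568, 17555456, and 18942976) follows one pattern: keep everything up to the first foreign chunk, skip each chunk, and extend the final upper bound by the total size of the skipped data. The Python sketch below restates that arithmetic for illustration only. The names carve_ranges, format_ranges, and BLOCK_SIZE are hypothetical and are not part of FTimes, ole-dig2crv, or ftimes-crv2raw.pl; the sketch also assumes that every foreign chunk is block-aligned and a whole number of 512-byte blocks long, which held for the files discussed here.

```python
BLOCK_SIZE = 512  # block size assumed throughout this section


def carve_ranges(start, size, extra_chunks):
    """Return inclusive (lower, upper) byte ranges for a fragmented file.

    start        -- offset of the first byte of the file in the image
    size         -- predicted file size in bytes (e.g., from ole-dig2crv)
    extra_chunks -- (first_offset, block_count) pairs, sorted by offset,
                    describing runs of foreign data inside the file
    """
    ranges = []
    lower = start
    extra_bytes = 0
    for first_offset, block_count in extra_chunks:
        # Keep everything up to the byte before the foreign chunk.
        ranges.append((lower, first_offset - 1))
        # Resume at the first block after the foreign chunk.
        lower = first_offset + block_count * BLOCK_SIZE
        extra_bytes += block_count * BLOCK_SIZE
    # The last byte is pushed out by the total amount of foreign data.
    ranges.append((lower, start + size + extra_bytes - 1))
    return ranges


def format_ranges(ranges):
    """Render ranges in the lower-upper,lower-upper form used above."""
    return ",".join(f"{lo}-{hi}" for lo, hi in ranges)


if __name__ == "__main__":
    # OLE file at offset 18942976: chunks of 77 and 49 foreign blocks.
    chunks = [(19276800, 77), (20187136, 49)]
    print(format_ranges(carve_ranges(18942976, 1667584, chunks)))
```

Run as a script, this prints 18942976-19276799,19316224-20187135,20212224-20675071, which matches the CarveRanges value reported for the file at offset 18942976. Passing an empty chunk list reduces to the single contiguous range of a clean carve.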
3.5 Answers for HTML Files The subject image contains the following HTML files: dfrws-2006-challenge.raw.4607.html (really 4608) dfrws-2006-challenge.raw.2271232.html dfrws-2006-challenge.raw.2281472.html dfrws-2006-challenge.raw.14077952.html dfrws-2006-challenge.raw.14460928.html dfrws-2006-challenge.raw.15118848.html Modified dig output: "dfrws-2006-challenge.raw"|regexp|sof.html|4607|+%0a%0a "dfrws-2006-challenge.raw"|regexp|sof.html|2271232|%0d%0a%0a%0d%0a "dfrws-2006-challenge.raw"|regexp|eof.html|2332892|%0a "dfrws-2006-challenge.raw"|regexp|sof.html|14077952|%0a+%0a%0a "dfrws-2006-challenge.raw"|regexp|sof.html|14460928| "dfrws-2006-challenge.raw"|regexp|sof.html|15118848|%0a+%0a%0a 3.5.1 HTML File at Offset 4607 (really 4608) 3.5.1.1 Final Results FileType: HTML Offset: 4608 Block: 9 BlockOffset: 0 Embedded: No CarveRanges: 4608-22754 Size: 18147 MD5: eec87931b03e5a4a4ef8fd51109a1227 SHA1: 4887ae8d38be4e062a16049a6d72d852fef8227f 3.5.1.2 Stage 1 Results CleanCarve: Mostly Explanation: This file was initially identified as being located at offset 4607 (block 8, block_offset 511). Going on the assumption that legitimate SOFs are block-aligned, we concluded that offset must be incorrect. Manual inspection revealed that an extra space character was caught in the dig string match. Recall that the dig string used to identify HTML SOFs is: (?is)(\s*<(?:!DOCTYPE[\x20\t]+html[^>]*>\s* * line 154 column 1 - Warning: discarding unexpected * line 5 column 1 - Warning: isn't allowed in elements * line 5 column 1 - Warning: isn't allowed in <body> elements * line 5 column 1 - Warning: <meta> isn't allowed in <body> elements * line 5 column 1 - Warning: <meta> isn't allowed in <body> elements * line 5 column 1 - Warning: <meta> isn't allowed in <body> elements * line 5 column 1 - Warning: </head> isn't allowed in <body> elements * line 162 column 1 - Warning: discarding unexpected <body> ? 
line 162 column 223 - Warning: missing </font> before <p> line 165 column 4 - Warning: inserting implicit <font> line 676 column 47 - Warning: inserting implicit <p> line 1260 column 15 - Warning: discarding unexpected </font> ? line 1260 column 182 - Warning: missing </font> before <p> line 1260 column 244 - Warning: inserting implicit <font> line 1262 column 17 - Warning: discarding unexpected </font> ? line 1272 column 64 - Warning: missing </font> before <p> line 1273 column 27 - Warning: inserting implicit <font> ? line 1276 column 1 - Warning: missing </font> before <p> line 1277 column 19 - Warning: inserting implicit <font> line 1327 column 1 - Warning: discarding unexpected </font> line 1329 column 21 - Warning: discarding unexpected </font> ? line 1331 column 25 - Warning: missing </font> before </div> line 1334 column 15 - Warning: discarding unexpected </font> line 8 column 1 - Warning: <img> lacks "alt" attribute line 9 column 25 - Warning: <img> lacks "alt" attribute line 10 column 25 - Warning: <img> lacks "alt" attribute line 11 column 25 - Warning: <img> lacks "alt" attribute line 12 column 25 - Warning: <img> lacks "alt" attribute line 13 column 1 - Warning: <img> lacks "alt" attribute line 14 column 39 - Warning: <img> lacks "alt" attribute line 162 column 106 - Warning: <table> lacks "summary" attribute line 169 column 42 - Warning: <img> lacks "alt" attribute line 651 column 1 - Warning: <img> lacks "alt" attribute line 652 column 25 - Warning: <img> lacks "alt" attribute line 653 column 25 - Warning: <img> lacks "alt" attribute line 654 column 25 - Warning: <img> lacks "alt" attribute line 655 column 25 - Warning: <img> lacks "alt" attribute line 656 column 1 - Warning: <img> lacks "alt" attribute line 657 column 39 - Warning: <img> lacks "alt" attribute line 1253 column 42 - Warning: <img> lacks "alt" attribute line 1260 column 97 - Warning: <table> lacks "summary" attribute line 1260 column 244 - Warning: <img> lacks "alt" attribute line 
1262 column 24 - Warning: <table> lacks "summary" attribute
line 1269 column 7 - Warning: <img> lacks "alt" attribute
line 676 column 47 - Warning: trimming empty <p>
line 1276 column 1 - Warning: trimming empty <font>
line 1331 column 25 - Warning: trimming empty <font>
--- TIDY_WARNINGS ---

3.5.2.3 Stage 2 Results

CleanCarve: Yes
Explanation: After manually reviewing the file, dig output, and validator output, we concluded that the only issue was that the carved file was actually interleaved with the HTML file at offset 2281472. Therefore, the only change required was to adjust the carve ranges and recarve the two files.

Note: We were able to locate an exact match of this file on the Internet. Therefore, we used the contents of that file to help us verify the one we carved. We also observed (visual) differences in the HTML layout of the two files. These changes were quite noticeable in a text editor. Unfortunately, we did not have enough time to come up with a better validation algorithm.

3.5.3 HTML File at Offset 2281472

3.5.3.1 Final Results

FileType: HTML
Offset: 2281472
Block: 4456
BlockOffset: 0
Embedded: No
CarveRanges: 2281472-2296831,2305024-2332899
Size: 43236
MD5: 80e6b9221ef308ff1639fd19df036e34
SHA1: 31c7cf7e60663d319505312322c5b2742a3f646c

3.5.3.2 Stage 1 Results

CleanCarve: No
Explanation: The HTML validator gave mixed results. The ctype test passed, but the tidy test produced several (12) warnings, and two of those warnings were indicative of trouble. These trouble spots have been highlighted with '*' and '?' -- where '*' is most likely a problem and '?' may be a problem. The relevant validator output is shown below.

--- TEST_HTML_RESULTS ---
pass - ctype test: ascii < 99.9%
warn - tidy test: 12 warnings, 0 errors were found!
--- TEST_HTML_RESULTS ---

--- TIDY_WARNINGS ---
?
line 10 column 223 - Warning: missing </font> before <p>
line 13 column 4 - Warning: inserting implicit <font>
* line 10 column 106 - Warning: missing </table>
line 10 column 106 - Warning: <table> lacks "summary" attribute
line 17 column 42 - Warning: <img> lacks "alt" attribute
line 499 column 1 - Warning: <img> lacks "alt" attribute
line 500 column 25 - Warning: <img> lacks "alt" attribute
line 501 column 25 - Warning: <img> lacks "alt" attribute
line 502 column 25 - Warning: <img> lacks "alt" attribute
line 503 column 25 - Warning: <img> lacks "alt" attribute
line 504 column 1 - Warning: <img> lacks "alt" attribute
line 505 column 39 - Warning: <img> lacks "alt" attribute
Info: Doctype given is "-//W3C//DTD HTML 4.0 Transitional//EN"
Info: Document content looks like HTML 4.01 Transitional
--- TIDY_WARNINGS ---

3.5.3.3 Stage 2 Results

CleanCarve: Yes
Explanation: After manually reviewing the file, dig output, and validator output, we concluded that the only issue was that the carved file was actually interleaved with the HTML file at offset 2271232. Therefore, the only change required was to adjust the carve ranges and recarve the two files.

Note: We were able to locate a close match of this file on the Internet. Therefore, we used the contents of that file to help us verify the one we carved. We also observed (visual) differences in the HTML layout of the two files. These changes were quite noticeable in a text editor. Unfortunately, we did not have enough time to come up with a better validation algorithm.

3.5.4 HTML File at Offset 14077952

3.5.4.1 Final Results

FileType: HTML
Offset: 14077952
Block: 27496
BlockOffset: 0
Embedded: No
CarveRanges: 14077952-14134783,14324736-14436430
Size: 168527
MD5: a80ee062aed8279304faae8f20f6d48e
SHA1: b784ce356e3edb425ac062e4f025e3c96066d8ed

3.5.4.2 Stage 1 Results

CleanCarve: No
Explanation:

--- TEST_HTML_RESULTS ---
fail - ctype test: ascii < 99.9%
fail - tidy test: 26403 warnings, 73 errors were found!
--- TEST_HTML_RESULTS ---

--- TIDY_WARNINGS ---
too many to show here
--- TIDY_WARNINGS ---

This file was not considered a clean carve because the HTML validator failed both of its tests. The fact that the ctype test failed added weight to the 26403 warnings.

We analyzed the file's sliding entropy and observed that there is an abrupt increase in entropy between blocks 111 and 112 and a corresponding decrease between 481 and 482. Between these endpoints, the entropy increases from approximately 5.3 to 7.5. This run of higher entropy consumes 370 blocks (or 189440 bytes).

3.5.4.3 Stage 2 Results

CleanCarve: Yes
Explanation: The correction required to make this file whole was to recognize that the file at offset 14134784 is completely encompassed by this file. Once that observation was made, we simply adjusted the carve ranges to exclude the inner file plus enough padding to reach the next block boundary. After that, the file carved cleanly.

Note: We were able to locate an exact match of this file on the Internet. Therefore, we used the contents of that file to help us verify the one we carved. Additionally, the file can be opened and viewed in a text editor or web browser without any visible anomalies.

3.5.5 HTML File at Offset 14460928

3.5.5.1 Final Results

FileType: HTML
Offset: 14460928
Block: 28244
BlockOffset: 0
Embedded: No
CarveRanges: 14460928-14461951,14493184-14512178
Size: 20019
MD5: 045798407b927321326a547704e67831
SHA1: 4c952614dd27843772a013afc7d1627ba1e19506

3.5.5.2 Stage 1 Results

CleanCarve: No
Explanation: This file was not considered a clean carve because the HTML validator failed both of its tests. The fact that the ctype test failed added weight to tidy's errors and warnings.

--- TEST_HTML_RESULTS ---
fail - ctype test: ascii < 99.9%
fail - tidy test: 52 warnings, 1 error were found!
--- TEST_HTML_RESULTS --- --- TIDY_WARNINGS --- line 1 column 1 - Warning: missing <!DOCTYPE> declaration line 570 column 2 - Warning: replacing invalid character code 132 line 570 column 3 - Warning: replacing invalid character code 148 line 570 column 8 - Warning: discarding invalid character code 144 line 570 column 13 - Warning: replacing invalid character code 145 line 570 column 17 - Warning: replacing invalid character code 153 line 570 column 21 - Warning: replacing invalid character code 152 line 570 column 29 - Warning: replacing invalid character code 146 line 570 column 33 - Warning: replacing invalid character code 158 line 572 column 2 - Warning: replacing invalid character code 156 line 572 column 8 - Warning: replacing invalid character code 135 line 572 column 28 - Warning: replacing invalid character code 156 line 572 column 30 - Warning: replacing invalid character code 145 * line 572 column 34 - Warning: <e> attribute "²}Ò»aí" lacks value * line 572 column 34 - Error: <e> is not recognized! 
* line 572 column 34 - Warning: discarding unexpected <e>
line 572 column 52 - Warning: replacing invalid character code 147
line 572 column 81 - Warning: replacing invalid character code 149
line 572 column 82 - Warning: replacing invalid character code 140
line 572 column 101 - Warning: replacing invalid character code 131
line 572 column 120 - Warning: replacing invalid character code 148
line 572 column 127 - Warning: replacing invalid character code 148
line 572 column 129 - Warning: replacing invalid character code 155
line 572 column 132 - Warning: replacing invalid character code 136
line 572 column 137 - Warning: replacing invalid character code 158
line 572 column 139 - Warning: replacing invalid character code 133
line 572 column 141 - Warning: discarding invalid character code 141
line 572 column 145 - Warning: replacing invalid character code 156
line 572 column 146 - Warning: replacing invalid character code 148
line 572 column 155 - Warning: replacing invalid character code 146
line 572 column 171 - Warning: replacing invalid character code 159
line 572 column 174 - Warning: replacing invalid character code 155
line 572 column 179 - Warning: discarding invalid character code 144
line 572 column 181 - Warning: replacing invalid character code 132
line 572 column 186 - Warning: replacing invalid character code 131
line 572 column 197 - Warning: unescaped & which should be written as &amp;
line 572 column 204 - Warning: replacing invalid character code 134
line 572 column 228 - Warning: replacing invalid character code 139
line 572 column 229 - Warning: replacing invalid character code 154
line 572 column 232 - Warning: replacing invalid character code 153
line 572 column 235 - Warning: replacing invalid character code 133
line 572 column 237 - Warning: replacing invalid character code 142
line 573 column 4 - Warning: replacing invalid character code 140
line 573 column 5 - Warning: replacing invalid character code 159
line 573 column 8 - Warning: unescaped
& which should be written as &amp;
line 573 column 10 - Warning: discarding invalid character code 129
line 573 column 13 - Warning: replacing invalid character code 136
line 573 column 19 - Warning: replacing invalid character code 152
line 573 column 21 - Warning: discarding invalid character code 157
line 573 column 30 - Warning: replacing invalid character code 146
line 574 column 6 - Warning: discarding invalid character code 157
line 574 column 16 - Warning: replacing invalid character code 153
line 574 column 18 - Warning: replacing invalid character code 147
--- TIDY_WARNINGS ---

We analyzed the file's sliding entropy and observed that there is an abrupt increase in entropy between blocks 111 and 112 and a corresponding decrease between 481 and 482. Between these endpoints, the entropy increases from approximately 5.3 to 7.5. This run of higher entropy consumes 370 blocks (or 189440 bytes).

3.5.5.3 Stage 2 Results

CleanCarve: Yes
Explanation: After manually reviewing the file, dig output, sliding statistics, and validator output, we concluded that the only issue was that the carved file contained blocks of French text. The correction required to make this file whole was to locate all the blocks that contained French and adjust the carve ranges accordingly.
To locate these ranges, we ran the following query:

--- MYSQL_ENTROPY_QUERY ---
mysql> select offset, block, block_offset, blocksize, ascii from stats where blocksize = 1024 and ascii < 99.9 and ((offset >= 14460928 and offset <= 14512178));
+----------+-------+--------------+-----------+-----------+
| offset   | block | block_offset | blocksize | ascii     |
+----------+-------+--------------+-----------+-----------+
| 14461952 | 14123 |            0 |      1024 | 97.363281 |
| 14462976 | 14124 |            0 |      1024 | 97.753906 |
| 14464000 | 14125 |            0 |      1024 | 97.558594 |
| 14465024 | 14126 |            0 |      1024 | 97.363281 |
| 14466048 | 14127 |            0 |      1024 | 97.070312 |
| 14467072 | 14128 |            0 |      1024 | 97.265625 |
| 14468096 | 14129 |            0 |      1024 | 97.265625 |
| 14469120 | 14130 |            0 |      1024 | 96.582031 |
| 14470144 | 14131 |            0 |      1024 | 96.386719 |
| 14471168 | 14132 |            0 |      1024 | 96.972656 |
| 14472192 | 14133 |            0 |      1024 | 97.363281 |
| 14473216 | 14134 |            0 |      1024 | 97.363281 |
| 14474240 | 14135 |            0 |      1024 | 97.949219 |
| 14475264 | 14136 |            0 |      1024 | 97.753906 |
| 14476288 | 14137 |            0 |      1024 | 96.582031 |
| 14477312 | 14138 |            0 |      1024 | 96.679688 |
| 14478336 | 14139 |            0 |      1024 | 96.972656 |
| 14479360 | 14140 |            0 |      1024 | 97.265625 |
| 14480384 | 14141 |            0 |      1024 |    96.875 |
| 14481408 | 14142 |            0 |      1024 | 96.972656 |
| 14482432 | 14143 |            0 |      1024 | 97.460938 |
| 14483456 | 14144 |            0 |      1024 | 97.363281 |
| 14484480 | 14145 |            0 |      1024 | 97.851562 |
| 14485504 | 14146 |            0 |      1024 | 97.753906 |
| 14486528 | 14147 |            0 |      1024 | 97.460938 |
| 14487552 | 14148 |            0 |      1024 | 97.070312 |
| 14488576 | 14149 |            0 |      1024 | 97.753906 |
| 14489600 | 14150 |            0 |      1024 |    96.875 |
| 14490624 | 14151 |            0 |      1024 | 96.679688 |
| 14491648 | 14152 |            0 |      1024 | 97.070312 |
| 14492672 | 14153 |            0 |      1024 | 79.296875 |
| 14512128 | 14172 |            0 |      1024 | 51.953125 |
+----------+-------+--------------+-----------+-----------+
32 rows in set (0.00 sec)
--- MYSQL_ENTROPY_QUERY ---

All the 1024-byte blocks listed in the output except for the last
block (14172) are contiguous. Manual inspection of the raw data confirms that the French text is contained between blocks 14123 and 14153 (note that these are 1024-byte blocks). The last block in the above output corresponds to the last block of the HTML file. Once the French text blocks were identified, we simply adjusted the carve ranges to exclude the French text plus enough padding to reach the next 512-byte block boundary. After that, the file carved cleanly.

Note: We were able to locate a close match of this file on the Internet. Therefore, we used the contents of that file to help us verify the one we carved. Additionally, the file can be opened and viewed in a text editor or web browser without any visible anomalies.

3.5.6 HTML File at Offset 15118848

3.5.6.1 Final Results

FileType: HTML
Offset: 15118848
Block: 29529
BlockOffset: 0
Embedded: No
CarveRanges: 15118848-15306642
Size: 187795
MD5: 643d159339bd2c9e7604e6fac8e7facc
SHA1: 072b5ade245edc828e1cabd8d26532b09a6815d8

3.5.6.2 Stage 1 Results

CleanCarve: Yes

Explanation: The HTML validator gave mixed results. The ctype test passed, but the tidy test produced many (2777) warnings. The relevant validator output is shown here:

--- TEST_HTML_RESULTS ---
pass - ctype test: ascii < 99.9%
warn - tidy test: 2777 warnings, 0 errors were found!
--- TEST_HTML_RESULTS ---

--- TIDY_WARNINGS ---
too many to show here
--- TIDY_WARNINGS ---

While there were many tidy warnings, most of them appeared to be benign.

Note: We were able to locate an exact match of this file on the Internet. Therefore, we used the contents of that file to help us verify the one we carved. Additionally, the file can be opened and viewed in a text editor or web browser without any visible anomalies.

3.5.6.3 Stage 2 Results

Stage 2 analysis was not required.

3.6 Answers for Text Files

Our answers for text files are based on the analysis of the sliding statistics harvested by FTimes.
The following query, in particular, allowed us to locate contiguous blocks of text that did not contain HTML:

--- MYSQL_TEXT_QUERY ---
select block from stats where blocksize = 4096 and print >= 80 and html_tags = 'no' and rent1 > 3 and rent1 < 6 order by block;
--- MYSQL_TEXT_QUERY ---

Note: We specifically added an HTML discriminator because we knew the subject image contained HTML files. Other filters could have been added with relative ease (see the internals of do_stats to see how it all works).

We took the output of the above query and fed it to the following script:

ftimes-group-blocks.pl

which produced carver output. This process formed the basis of our Stage 1 carves. This script is also configured to add 1 extra block of data (we used 4096-byte blocks) on each side of the text ranges. We did that to ensure that we picked up the leading and trailing bytes that were filtered out by the query. These extra blocks were expected to contain data from the adjacent files as well as the text file.

One last note about the text carving process: the query shown above was designed to find text, including text that may reside in other file types (e.g., OLE documents). We did not concern ourselves with where the text originated in Stage 1.

3.6.1 Text File at Offset 6049792 (really 6053376)

3.6.1.1 Final Results

FileType: Text
Offset: 6053376
Block: 11823
BlockOffset: 0
Embedded: No
CarveRanges: 6053376-6066201
Size: 12826
MD5: f800a46e18fafd309825c5ee84a654a2
SHA1: dd0bbffcbd56a99f58a5684aa6bc24b868ba7cb6

3.6.1.2 Stage 1 Results

CleanCarve: No

Explanation: The initial carve range encompassed too much data, as it was designed to do.

3.6.1.3 Stage 2 Results

CleanCarve: Yes

Explanation: A hex editor/viewer (bvi) was used to find the actual start of this file. If we had more time, we would have attempted to automate the trimming process to obtain a closer approximation. This file appears to be complete English text and valid. This answer is based on manual review.
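The Stage 1 range above was deliberately too wide because of the grouping and padding step described in Section 3.6. The following is a minimal sketch of that step, written in Python for illustration. It is not the ftimes-group-blocks.pl script, whose internals are not shown in this document. The function names and the sample block numbers are assumptions. The first sample run was chosen so that the padded range starts at 6049792, the Stage 1 offset given in the heading of Section 3.6.1. The actual query output is not reproduced here.

```python
from typing import Iterable, List, Tuple

BLOCK_SIZE = 4096  # block size used by the text query in Section 3.6
PAD_BLOCKS = 1     # extra blocks added on each side of every group


def group_blocks(blocks: Iterable[int]) -> List[Tuple[int, int]]:
    """Collapse block numbers into inclusive (first, last) runs of consecutive blocks."""
    runs: List[Tuple[int, int]] = []
    for block in sorted(set(blocks)):
        if runs and block == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], block)  # extend the current run
        else:
            runs.append((block, block))      # start a new run
    return runs


def to_carve_ranges(runs: List[Tuple[int, int]],
                    block_size: int = BLOCK_SIZE,
                    pad: int = PAD_BLOCKS) -> List[Tuple[int, int]]:
    """Convert block runs into inclusive byte ranges, padded by `pad` blocks per side."""
    ranges = []
    for first, last in runs:
        start = max(first - pad, 0) * block_size     # never go below offset 0
        end = (last + pad + 1) * block_size - 1      # last byte of the trailing pad block
        ranges.append((start, end))
    return ranges


# Hypothetical query output: block numbers chosen for illustration only.
sample_blocks = [1478, 1479, 1480, 2000, 2001]
stage1_ranges = to_carve_ranges(group_blocks(sample_blocks))
for start, end in stage1_ranges:
    print(f"{start}-{end}")
```

With the sample input, the first printed range is 6049792-6070271, which contains the final carve range 6053376-6066201 reported in Section 3.6.1.1. The sketch does not clip ranges to the image size and does not merge padded ranges that end up overlapping. Both would need handling in real use.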
3.6.2 Text File at Offset 14458880 (really 14461952)

3.6.2.1 Final Results

FileType: Text
Offset: 14461952
Block: 28246
BlockOffset: 0
Embedded: No
CarveRanges: 14461952-14492767
Size: 30816
MD5: 616a6bbe915c3dbf51014fd76f55b0e3
SHA1: 444102966d768086c63c7194122c807abf62efd7

3.6.2.2 Stage 1 Results

CleanCarve: No

Explanation: The initial carve range encompassed too much data -- as it was designed to do.

3.6.2.3 Stage 2 Results

CleanCarve: Yes

Explanation: A hex editor/viewer (bvi) was used to find the actual start of this file. If we had more time, we would have attempted to automate the trimming process to obtain a closer approximation. This file appears to be incomplete French text, but otherwise valid -- both the head and tail appear to be truncated. This answer is based on manual review.

3.6.3 Text File at Offset 16814080 (really 16815104)

3.6.3.1 Final Results

FileType: Text (really part of an OLE document)
Offset: 16815104
Block: 32842
BlockOffset: 0
Embedded: Yes
CarveRanges: 16815104-16826155
Size: 11052
MD5: c92b2e3f6d8ffa7665ffa1f492ee5280
SHA1: 291e58e43e5aa3b072f8f7354600e75b021f42cb

3.6.3.2 Stage 1 Results

CleanCarve: No

Explanation: The initial carve range encompassed too much data -- as it was designed to do.

3.6.3.3 Stage 2 Results

CleanCarve: Yes

Explanation: The carve appears to be clean, but this text belongs to the OLE document located at offset 16812544, so it can't be considered a legitimate file.

3.6.4 Text File at Offset 17555456 (really 17565184)

3.6.4.1 Final Results

FileType: Text
Offset: 17565184
Block: 34307
BlockOffset: 0
Embedded: No
CarveRanges: 17565184-17619054
Size: 53871
MD5: 81a394a3b8d3cd03a4f2069d5084866c
SHA1: 43cf40c9509b46ad1415f5ffd7adb3932bf0709d

3.6.4.2 Stage 1 Results

CleanCarve: No

Explanation: The initial carve range encompassed too much data -- as it was designed to do.

3.6.4.3 Stage 2 Results

CleanCarve: Yes

Explanation: A hex editor/viewer (bvi) was used to find the actual start of this file.
If we had more time, we would have attempted to automate the trimming process to obtain a closer approximation. This file appears to be complete English text and valid. This answer is based on manual review.

3.6.5 Text File at Offset 18939904 (really 18944256)

3.6.5.1 Final Results

FileType: Text (really part of an OLE document)
Offset: 18944256
Block: 37000
BlockOffset: 0
Embedded: Yes
CarveRanges: 18944256-19276799,19316224-20187135,20212224-20336221
Size: 1327454
MD5: 0df83c5b4a424f3ee0560814485bd83e
SHA1: 51ed5581d71d13c3878033790745fcf2aef5247e

3.6.5.2 Stage 1 Results

CleanCarve: No

Explanation: The initial carve range encompassed too much data -- as it was designed to do. However, manual analysis of the text files at offsets 19316224 and 20212224 revealed that they were related to this text, so all three ranges were combined under this section.

3.6.5.3 Stage 2 Results

CleanCarve: Yes

Explanation: The carve appears to be clean, but this text belongs to the OLE document located at offset 18942976, so it can't be considered a legitimate file.

3.6.6 Text File at Offset 19312640 (really 19316224)

3.6.6.1 Final Results

This file was actually part of the file located at offset 18944256, which, in turn, was part of the OLE document located at offset 18942976.

3.6.7 Text File at Offset 20209664 (really 20212224)

3.6.7.1 Final Results

This file was actually part of the file located at offset 18944256, which, in turn, was part of the OLE document located at offset 18942976.
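Because Sections 3.6.5 through 3.6.7 combine three byte ranges into one answer, it can help to check that the reported Size matches the CarveRanges. The following Python sketch shows one way to do that check, and how a multi-range carve could be assembled. It is an illustration under assumptions, not the carver used for these answers. The function names are hypothetical, and the in-memory "image" is a tiny stand-in for the real challenge image.

```python
import io
from typing import BinaryIO, List, Tuple


def parse_carve_ranges(spec: str) -> List[Tuple[int, int]]:
    """Parse 'start-end,start-end,...' into inclusive (start, end) byte ranges."""
    ranges = []
    for part in spec.split(","):
        start_text, end_text = part.strip().split("-")
        start, end = int(start_text), int(end_text)
        if end < start:
            raise ValueError(f"range ends before it starts: {part}")
        ranges.append((start, end))
    return ranges


def carve_size(ranges: List[Tuple[int, int]]) -> int:
    """Total number of bytes covered by the inclusive ranges."""
    return sum(end - start + 1 for start, end in ranges)


def carve(image: BinaryIO, ranges: List[Tuple[int, int]]) -> bytes:
    """Concatenate the bytes of each inclusive range, in the order given."""
    chunks = []
    for start, end in ranges:
        image.seek(start)
        chunks.append(image.read(end - start + 1))
    return b"".join(chunks)


# CarveRanges value copied from Section 3.6.5.1.
ranges = parse_carve_ranges("18944256-19276799,19316224-20187135,20212224-20336221")
print(carve_size(ranges))  # 1327454, the Size reported in Section 3.6.5.1

# Tiny in-memory stand-in for the image, used only to exercise carve().
demo_image = io.BytesIO(b"AAAAtextBBBBmoreCCCC")
print(carve(demo_image, [(4, 7), (12, 15)]))  # b'textmore'
```

The three ranges contribute 332544, 870912, and 123998 bytes respectively, which sum to the reported 1327454. Against the real image, the bytes returned by carve() could then be hashed and compared with the MD5 and SHA1 values listed in Section 3.6.5.1.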