The file starts with the string FWS or CWS, followed by an 8-bit version number and a 32-bit file length field. In the case of CWS, all the remaining file contents are zlib-compressed:
[FWS] [Version] [Length] [Data] or [CWS] [Version] [Length] [Zlib Data]
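To make the layout above concrete, here is a minimal Python sketch that splits a file into its parts and inflates the body where needed. The function name read_swf_container is made up for illustration, and reading the length field as little-endian is taken from the SWF specification rather than from the description above.

```python
import struct
import zlib


def read_swf_container(raw: bytes):
    """Split raw SWF bytes into (signature, version, declared_length, body).

    `body` is everything after the 8-byte prefix, inflated if the file
    is a CWS file.
    """
    if len(raw) < 8:
        raise ValueError("too short to be an SWF file")
    signature = raw[:3]
    version = raw[3]  # 8-bit version number
    # Assumption: the 32-bit length field is stored little-endian.
    (declared_length,) = struct.unpack_from("<I", raw, 4)
    if signature == b"FWS":
        body = raw[8:]
    elif signature == b"CWS":
        body = zlib.decompress(raw[8:])
    else:
        raise ValueError("not an FWS/CWS file: %r" % signature)
    return signature, version, declared_length, body
```

The declared length is returned untouched so that a caller can compare it with the real size and treat a mismatch as a first hint of a damaged or manipulated file.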
The complete SWF specification can be found on Adobe's site (registration required), or here.
Now, the uncompressed data part starts with a header followed by a list of tags.
[Header] [Tag] [Tag] ...
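The header has to be skipped before the first tag can be read. The following sketch computes where the tag list begins. It assumes the header layout given in the SWF specification (a bit-packed frame-size rectangle, then a 16-bit frame rate and a 16-bit frame count), which is not spelled out in this text, and the name first_tag_offset is hypothetical.

```python
def first_tag_offset(body: bytes) -> int:
    """Return the offset of the first tag inside the uncompressed data part."""
    if not body:
        raise ValueError("empty SWF body")
    nbits = body[0] >> 3               # top 5 bits: bit width of each rectangle field
    rect_bits = 5 + 4 * nbits          # the width field plus Xmin, Xmax, Ymin, Ymax
    rect_bytes = (rect_bits + 7) // 8  # the rectangle is padded to a whole byte
    offset = rect_bytes + 2 + 2        # plus frame rate and frame count
    if offset > len(body):
        raise ValueError("header runs past the end of the data")
    return offset
```

The offset is relative to the uncompressed data part, i.e. to the bytes that follow the 8-byte signature, version and length prefix.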
Each tag acts as a container for a datatype, e.g. for a JPEG image, an RGB color or ActionScript bytecode. A tag starts with a tag type identifier and the tag's length, followed by arbitrary data.
[tag code and length (16 bits)] [data (length bytes)]
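A single tag could be decoded roughly like this. The split of the 16-bit field into a 10-bit code and a 6-bit length, and the 32-bit length that follows when the short length is 0x3F, come from the SWF specification and are not described above. The function name read_tag is made up for this sketch.

```python
import struct


def read_tag(buf: bytes, pos: int):
    """Decode the tag at `pos`; return (code, length, data, next_pos)."""
    (code_and_length,) = struct.unpack_from("<H", buf, pos)
    code = code_and_length >> 6       # upper 10 bits: tag type identifier
    length = code_and_length & 0x3F   # lower 6 bits: short length
    pos += 2
    if length == 0x3F:
        # Long form: the real length follows as a 32-bit value.
        (length,) = struct.unpack_from("<I", buf, pos)
        pos += 4
    data = buf[pos:pos + length]
    if len(data) != length:
        raise ValueError("tag data runs past the end of the buffer")
    return code, length, data, pos + length
```

Calling read_tag repeatedly with the returned next_pos walks the whole tag list. A declared length that runs past the buffer raises an error instead of being trusted.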
The complete SWF looks like this:
[FWS/CWS] [Version] [Length] [ [Header] [[Tag Code + Length] [Tag Contents]] ...  ]
The last tag is a tag with tag type 0 and length 0, so its combined code and length field is a 16-bit zero.
If we wanted to analyze an SWF file, it would be best to uncompress where needed, parse the header and then break down each tag by its code first. When doing so with real-world data we may encounter undocumented or unknown codes. There can be several reasons for these mysterious tag codes: the file could be corrupted, or our parser could be incomplete. More likely, however, either a commonly used yet undocumented tag was used correctly from the programmer's point of view (tag type IDs 16, 29, 31, 38, 40, 42, 47, 49, 50, 51, 52, 63, 72), or the tag was deliberately marked with an unknown code in order to hide bytecode or other data.
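Sorting tags by their code could be sketched as follows. The list of commonly used yet undocumented IDs is copied from the paragraph above. The set of documented codes is deliberately left as a parameter, since it has to be taken from the specification. The names classify_tags and COMMON_UNDOCUMENTED are hypothetical.

```python
# Tag type IDs listed above as commonly used yet undocumented.
COMMON_UNDOCUMENTED = frozenset(
    {16, 29, 31, 38, 40, 42, 47, 49, 50, 51, 52, 63, 72}
)


def classify_tags(tags, documented):
    """Sort (offset, code, length) triples into three buckets.

    `documented` is the set of tag codes the caller accepts as documented.
    """
    report = {"documented": [], "common_undocumented": [], "suspicious": []}
    for offset, code, length in tags:
        if code in documented:
            bucket = "documented"
        elif code in COMMON_UNDOCUMENTED:
            bucket = "common_undocumented"
        else:
            bucket = "suspicious"
        report[bucket].append((offset, code, length))
    return report
```

A tag in the "suspicious" bucket is not proof of anything. It only marks the places where a corrupted file, an incomplete parser or deliberately hidden data would show up.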
We'll go along with the last possibility, a tag deliberately marked with an unknown code, so let's assume - just for a moment - that we are programming a malware Flash file. As such, our code needs to avoid detection and should be obfuscated as well. The ActionScript 2 bytecode located inside doAction tags can issue a branch action (a.k.a. jump or goto), which is ordinarily used for loops and conditions. Each branch action carries the relative address of the next action to execute. Example:
0x00: action 1
0x01: some actions...
0x10: jump -0x10
Ominously, the branch offset (here -0x10) is not restricted to the current code block, but could jump to an entirely different tag instead, where the data is executed as if it were a code block. Example:
0x100: tag1 header with unknown code
0x104: code in tag 1
0x200: doAction tag
0x204: jump -0x100
This way the code inside tag1 is hidden from ordinary SWF analyzer tools and can still be executed. To make the hidden code even harder to find, random bytecode could be inserted in between the actual bytecode, or dormant bytecode (which is never executed) could be used as a distraction. Fortunately, this technique is also really easy to detect, since a checker only needs to look for uncommon branch offsets. Most disassemblers (such as Flare), however, can be fooled.
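Such a branch offset check might look like the sketch below, which walks the actions of one doAction tag and reports every branch whose target lies outside the block. Several details are taken from the SWF specification rather than from this text: the action codes 0x99 (jump) and 0x9D (conditional jump), the rule that actions with a code of 0x80 or above carry a 16-bit length, and the signed 16-bit offset counted from the end of the branch action. The name find_escaping_branches is made up.

```python
import struct

ACTION_END = 0x00
ACTION_JUMP = 0x99  # unconditional branch
ACTION_IF = 0x9D    # conditional branch


def find_escaping_branches(actions: bytes):
    """Return (branch_offset, target) pairs for branches leaving the block.

    Both values are relative to the start of `actions`, the data of a
    single doAction tag.
    """
    suspicious = []
    pos = 0
    while pos < len(actions):
        start = pos
        code = actions[pos]
        pos += 1
        if code == ACTION_END:
            break
        if code < 0x80:
            continue  # action without payload
        if pos + 2 > len(actions):
            break     # truncated action record
        (length,) = struct.unpack_from("<H", actions, pos)
        payload = pos + 2
        pos = payload + length
        if code in (ACTION_JUMP, ACTION_IF) and length == 2 and pos <= len(actions):
            (offset,) = struct.unpack_from("<h", actions, payload)
            target = pos + offset  # offset counts from the end of the branch action
            if not 0 <= target < len(actions):
                suspicious.append((start, target))
    return suspicious
```

This is a plain linear sweep, so junk bytes placed between real actions can still throw it off. It also does not verify that a target inside the block lands on an action boundary.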
Another interesting way to hide code, and by far not the last one, would be a base64-encoded SWF file embedded in an image of another SWF file, such as
In the end it does not really matter how your code is protected, or even whether it is hidden at all, because there is no security or malware check anywhere within a Flash advertising deployment process. An evil attacker could simply buy ad space from an ad broker; the delivered ad is then quickly checked (possibly manually) against style guidelines such as size or close buttons, and finally delivered to the broker's ad servers. That's the end of the (slightly simplified) deployment process.
Let's explore a few technical possibilities for protecting yourself from Flash malware. (Non-technical solutions such as contract fines or national law are not applicable to the anonymous evil hacker.) Java applets, for example, can have signatures. Since there is no specified way to embed cryptographic signatures in SWF files, and since only a few people would grasp the signature's relevance anyway, this is not a viable option here. Then there is a sort of capability whitelisting: the SWF file could be checked against a list of allowed capabilities, which would not include obfuscated code hidden in unknown tags as described above. The check could be done automatically on the client side (e.g. by a browser plugin) or by a proxy, either in between or on the server side. But such a capability filter is yet to be written.
related URLs: https://www.flashsec.org/ http://osflash.org/