1 Star Blazer disassembly notes
3 - The original file I worked with was cracked by Mr. Xerox and obtained from:
4 https://archive.org/download/a2_Star_Blazer_1981_Star_Craft/Star_Blazer_1981_Star_Craft.do
6 - The crack is a bit crude: basically it's a memory dump of how things looked
7 when the original loader was going to transfer control to the game code.
9 - Since I did not want to disassemble a lot of junk, I created a different
10 crack loader. It is similar to the original but loads only the essential parts.
12 - For details about this process see the files in /loader:
14 - The program /loader/dejunk.py overwrites parts of the original binary that
15 I think are junk, see the notes in the code about address range meanings.
17 - The files /loader/star_blazer_dejunked*.bin are dejunked with different
18 fill values; I run both to check that the dejunking did not break anything.
20 - The program /loader/hires_loader.py extracts the non-junk parts of the
21 dejunked binary (it works with the original or dejunked binary, but I
22 prefer the latter, since it makes the disassembly cleaner given that there
23 is one region where the alignment padding was not zeroed out originally),
24 concatenates them, and then splits them into parts for my crack loader.
25 It then installs my crack loader and composes the parts ready for loading.
27 - The binary /loader/hires_loader.bin is my short machine code program which
28 lives in the hires screen at $2000 and its function is to copy the tail of
29 the program from hires screen memory to the tail location at approx $7e00.
30 It also contains some initializations of zero page, registers and so forth.
31 I haven't included a separate source file as it gets disassembled later on.
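As a rough illustration of the dejunking check described above, here is a minimal Python sketch. The junk ranges and helper names are made up for the example; the real ranges are the ones documented in /loader/dejunk.py.

```python
# Hypothetical junk ranges as (start, end) file offsets, end exclusive.
# The real ranges live in /loader/dejunk.py and are not reproduced here.
JUNK_RANGES = [(0x10, 0x20), (0x40, 0x48)]


def dejunk(image: bytes, fill: int, ranges=JUNK_RANGES) -> bytes:
    """Return a copy of image with every junk range overwritten by fill."""
    out = bytearray(image)
    for start, end in ranges:
        out[start:end] = bytes([fill]) * (end - start)
    return bytes(out)


def differing_offsets(a: bytes, b: bytes) -> list[int]:
    """List the offsets at which two equal-length images differ."""
    assert len(a) == len(b)
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def inside_junk(offset: int, ranges=JUNK_RANGES) -> bool:
    """True if the offset falls inside one of the junk ranges."""
    return any(start <= offset < end for start, end in ranges)


original = bytes(range(0x60))          # stand-in for the cracked binary
img_00 = dejunk(original, 0x00)
img_ff = dejunk(original, 0xFF)
# The two dejunked images may differ only inside the junk ranges.
diffs = differing_offsets(img_00, img_ff)
```

If both images still play correctly, and every entry in diffs lies inside a junk range, that is reasonable evidence that the game does not depend on the overwritten bytes.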
33 - The result of the re-cracking is /loader/star_blazer_hires_loader.bin which
34 can be played the same as the original binary but is significantly smaller.
36 - Then see the disassembly in /disasm, in particular /disasm/star_blazer.asm
37 is an ASxxxx source that assembles to /loader/star_blazer_hires_loader.bin.
38 It contains switches at the top of the file to control my game modifications
39 and if left alone (ALIGN = 1, SHAPE = 1) it will produce the original game.
41 - The ASxxxx assembler used for this project is by Alan R. Baldwin and has a
42 home page at https://shop-pdp.net/ashtml/asxxxx.php. Use "as6500" for 6502.
44 - This assembler does not use the most conventional syntax since constants
45 are C-style, e.g. "0x2000" not "$2000", and addressing modes use square
46 brackets, e.g. "lda [0x2000],y" not "lda ($2000),y". The C-style constants
47 are good for projects that combine C and assembly code, since you can use
48 common include files. The square bracket syntax is not good and I contacted
49 the author who told me it is for historical reasons and promised to fix it.
51 - Zero-page references are indicated by "*", e.g. "lda *0x20,x" produces the
52 2-byte instruction whereas "lda 0x20,x" should produce the 3-byte version.
53 (I say *should* because there is an inconsistency in this process that I
54 discovered recently and I will investigate it later and fix the assembler).
55 The assembler *can* generate zero-page references automatically if you use
56 the ".dpage" pseudo-op in the ".area zpage" section, but I haven't done so
57 in case there are places where the reassembled binary doesn't match the
58 original. Probably there aren't, but having control via "*" is quite good.
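To make the syntax differences concrete, here is a small Python sketch that rewrites a conventional 6502 operand into the style described above. It covers only the two differences mentioned ("$" constants and parentheses); it does not add the "*" zero-page marker, and the function name is made up for this example.

```python
import re


def to_asxxxx(line: str) -> str:
    """Convert one instruction from conventional to ASxxxx-style syntax."""
    # "$2000" -> "0x2000" (hex digits lowercased for consistency)
    line = re.sub(r"\$([0-9A-Fa-f]+)",
                  lambda m: "0x" + m.group(1).lower(), line)
    # "(...)" -> "[...]" for the indirect addressing modes
    return line.replace("(", "[").replace(")", "]")


examples = [to_asxxxx(s) for s in ("lda ($2000),y", "lda #$7F", "jmp ($FFFC)")]
```

The sketch swaps every parenthesis blindly, so a line with parentheses in a comment would need more care.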
60 - I have used a procedure like this to produce the disassembly:
62 - Run /disasm/load.py to perform the relocation that is normally done by the
63 hires loader and output mem.bin which is a straight memory dump (no DOS 3.3
64 header) which gets loaded at 0x9fd. This gives the disassembler a clearer
65 picture of what's where, but is not runnable, and does not remove the need
66 for the loader (the loader is also responsible for other initialization).
68 - Run my disassembler, which is not included here as it's beyond the scope
69 of this document, passing it a runtime trace file (also not included here),
70 and a manual text file that gives areas and names/sizes of known symbols.
72 - The manual text file /disasm/star_blazer.txt is included and could form the
73 basis of a SourceGen or similar project, however, it is pretty terse and
74 does not include all of the information inferred by the disassembler from
75 the trace file. I am working on a way to make the disassembler output this.
76 It would be relatively easy to make a SourceGen project from the asm output
77 of the disassembler, but it would be easier if the process was automated.
79 - Run my shape extractor and compiler, this is an optional process since the
80 original .db statements for the shapes are still in /disasm/star_blazer.asm
81 (if you compile with SHAPE = 0) but extracting and recompiling the shapes
82 gives you the opportunity to edit them. I haven't included sources for this
83 process, which is complex, but I do include /disasm/shape0.png for viewing.
85 - To regenerate the game, use steps like this:
87 as6500 -l -o star_blazer.asm
88 aslink -n -m -u -i -b zpage=0 -b udata0=0x200 -b udata1=0x400 -b text=0x9fd -b loader=0x2000 -b data0=0x4000 star_blazer.ihx star_blazer.rel
89 ./pack.py star_blazer.ihx star_blazer_hires_loader.bin
91 - The /disasm/pack.py is similar to /loader/hires_loader.py and it moves the
92 sections around for loading. Basically the idea is to move the last 0x2000
93 bytes of the game binary (actually 0x2000 less the loader size) into the
94 hires screen where it will be loaded by BLOAD, and then relocate it at run-
95 time. This prevents BLOAD from having to load 8 kbytes of "gap" at 0x2000.
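The packing idea can be sketched in Python as follows. This is not the code of /disasm/pack.py: the function names, the end address and the placeholder loader are assumptions for the example, and only the 0x9fd and 0x2000 bases come from the aslink command above.

```python
TEXT_BASE = 0x09FD    # from "-b text=0x9fd"
HIRES_BASE = 0x2000   # from "-b loader=0x2000"
HIRES_SIZE = 0x2000   # hires page 1 occupies 0x2000..0x3fff


def pack(mem: bytes, end: int, loader: bytes) -> tuple[bytes, int]:
    """Build a contiguous image to be loaded at TEXT_BASE.

    mem is a 64K memory image with the game at TEXT_BASE..end, in which
    the hires page is an unused gap.  Returns (image, tail_addr)."""
    tail_len = HIRES_SIZE - len(loader)
    tail_addr = end - tail_len
    hires = loader + mem[tail_addr:end]       # exactly fills the gap
    image = (mem[TEXT_BASE:HIRES_BASE] + hires +
             mem[HIRES_BASE + HIRES_SIZE:tail_addr])
    return image, tail_addr


def relocate(image: bytes, tail_addr: int, loader_len: int) -> bytearray:
    """Emulate the load plus the run-time copy done by the loader."""
    mem = bytearray(0x10000)
    mem[TEXT_BASE:TEXT_BASE + len(image)] = image
    src = HIRES_BASE + loader_len
    tail_len = HIRES_SIZE - loader_len
    mem[tail_addr:tail_addr + tail_len] = mem[src:src + tail_len]
    return mem


END = 0x9E00                       # made-up end address for the demo
demo = bytearray(0x10000)
for addr in range(TEXT_BASE, END):
    demo[addr] = (addr * 7 + 3) & 0xFF
loader = bytes([0xEA]) * 0x100     # placeholder loader body (NOPs)
image, tail_addr = pack(bytes(demo), END, loader)
restored = relocate(image, tail_addr, len(loader))
```

The packed image ends where the tail begins, so the file is 0x2000 bytes shorter than a straight dump of the same address range.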
97 - The disassembly is far from complete, as I have not figured out all of the
98 game logic, and there may be issues with identifying all relocatable symbols.
100 - Basically it is relocatable, but I noticed that it will not always proceed
101 to the next level, i.e. it sometimes gets stuck in a limbo mode in between
102 missions, where you can fly around and shoot, but there are no baddies.
104 - I fixed a similar problem that turned out to be a table referenced only by
105 its high address -- I changed something like "lda #0xNN" to "lda #>SYMBOL".
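One way to hunt for further cases like this is to scan the code for immediate loads whose operand equals the high byte of a known table. The Python sketch below is only a heuristic and is not part of my disassembler; the symbol name and addresses are invented for the example.

```python
LDA_IMM = 0xA9  # 6502 opcode for "lda #imm"


def find_hi_byte_candidates(code: bytes, base: int, tables: dict[str, int]):
    """Yield (address, symbol) for each "lda #NN" whose NN equals the high
    byte of a known table address.

    This is a naive byte scan: it does not track instruction boundaries,
    so false positives are to be expected."""
    hi_to_name = {addr >> 8: name for name, addr in tables.items()}
    for i in range(len(code) - 1):
        if code[i] == LDA_IMM and code[i + 1] in hi_to_name:
            yield base + i, hi_to_name[code[i + 1]]


# lda #0x60 / sta *0x01 / lda #0x00 / sta *0x00
code = bytes([0xA9, 0x60, 0x85, 0x01, 0xA9, 0x00, 0x85, 0x00])
tables = {"shape_table": 0x6000}   # hypothetical symbol and address
hits = list(find_hi_byte_candidates(code, 0x0A00, tables))
```

Each hit is only a candidate for rewriting as "lda #>SYMBOL" and still needs to be checked by hand.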
107 - I have a reasonably good understanding of the game's data structures, its
108 graphics package and its mathematics routines. I do not fully understand the
109 game physics (which was a major reason to do the disassembly) but I am quite
110 close to it, as I located things like the position and velocity of objects,
111 the angle of a missile, and even routines that look like homing the missile.
113 - I discovered the following general principles about the engine:
115 - There are 0x100 shapes and (I think) 0x70 objects. I only have a tentative
116 understanding of the objects (see discussion of game microcode further on)
117 so I haven't included this yet. But essentially each object has a purpose,
118 e.g. I think 0x20..0x27 are stars in the star-field background, 0x41..0x43
119 are trees or cactuses, etc. The mapping of objects to shapes varies, but
120 within some limits, e.g. object 0x41 is shape 0x78 (tree) or 0x79 (cactus).
121 The maximum number of an object onscreen is dictated by its assigned slots
122 in the 0x70 objects, e.g. I think there can only be up to 3 trees/cactuses.
124 - Animation works by drawing shapes in "or" mode to make them appear, then
125 drawing them in "and-not" mode to make them disappear. This leaves a hole
126 in the screen, where any underlying objects are not visible after erasure.
128 - When drawing, the engine computes a number of things, such as the shape
129 address to use, the x coordinate mod 7 and so on. These are stored in the
130 object array, and reused (i.e. not recomputed) when it erases the shape.
132 - It seems to move the objects one at a time, relying on the fact that it
133 will soon re-draw an underlying object that was unintentionally erased.
134 You can see an artifact of this where a pair of objects scrolls along
135 the ground in a fixed relationship to each other, as one of them might be
136 animated with a "bite" out of it corresponding to the previous position of
137 the other. The game uses only hires screen page 1, i.e. no double buffering.
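The draw/erase scheme and the resulting "bite" can be reproduced with a few lines of Python. The screen here is a flat run of bytes and the shapes are arbitrary bit patterns, so this is a simplified model and not the game's actual routine.

```python
def draw(screen: bytearray, shape: bytes, pos: int) -> None:
    """Make a shape appear by or-ing its bits into the screen."""
    for i, b in enumerate(shape):
        screen[pos + i] |= b


def erase(screen: bytearray, shape: bytes, pos: int) -> None:
    """Make a shape disappear by and-ing with the inverted shape bits."""
    for i, b in enumerate(shape):
        screen[pos + i] &= ~b & 0xFF


screen = bytearray(4)
tree = bytes([0b0011100, 0b0011100])
ship = bytes([0b0001111])
draw(screen, tree, 0)
draw(screen, ship, 1)    # overlaps the second byte of the tree
erase(screen, ship, 1)   # leaves a "bite" in the tree
bitten = screen[1]
draw(screen, tree, 0)    # the next redraw of the tree repairs it
```

After the erase, the tree byte that the ship overlapped has lost the bits they shared; the later redraw of the tree restores them.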
139 - The playfield is logically 140 units wide (each is a pair of HGR pixels)
140 and (from memory) 160 units high, the remaining 32 lines being used for
141 the score, the ground, and the various displays like the current mission.
143 - Coordinates are kept as (x, y) bytes where 0x80 is the centre of the
144 screen (I think) and so the playfield extends somewhat in each direction
145 beyond the screen. Shapes are clipped if they are drawn partially off the
146 screen. Action can happen off-screen, e.g. a bomb hits a target and you
147 complete the mission, or a missile curves off-screen and comes back on.
149 - Velocities are kept with more precision, possibly 16 or 24 bits. There is
150 some complicated logic with 4-bit shifts, which may be to save on storage.
152 - I would like to understand the logic of how it generates terrain better,
153 but I suspect that it's somewhat controlled by the extended playfield,
154 e.g. a tree scrolls off the visible screen and continues to scroll to the
155 left invisibly, until it hits the left edge of the playfield, whereupon
156 it is immediately regenerated at a new invisible location somewhere in
157 the invisible right portion of the playfield. This theory is supported
158 by fields in "struct object" and routines that I found whose job is to
159 randomize the position and velocity of an object within given x and y
160 bounds per object. Quite a bit of the object's personality and generation
161 behaviour can be controlled with just these fields, e.g. ground-based
162 objects have the y-limits set the same, so that their y isn't randomized.
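The regeneration theory above can be written down as a short Python model. Everything here is hypothetical: the field names, the left-edge value and the bounds are placeholders for whatever "struct object" really contains.

```python
import random
from dataclasses import dataclass


@dataclass
class Obj:
    # All field names are hypothetical stand-ins for "struct object".
    x: int
    y: int
    vx: int
    x_min: int   # bounds used when the position is re-randomized
    x_max: int
    y_min: int
    y_max: int


LEFT_EDGE = 0x00  # assumed left edge of the extended playfield


def step(obj: Obj, rng: random.Random) -> None:
    """Move one object; regenerate it when it reaches the left edge."""
    obj.x += obj.vx
    if obj.x <= LEFT_EDGE:
        obj.x = rng.randint(obj.x_min, obj.x_max)
        obj.y = rng.randint(obj.y_min, obj.y_max)


# A ground-based object: y_min == y_max, so y is never really randomized.
tree = Obj(x=2, y=0xA0, vx=-1, x_min=0xD0, x_max=0xF0, y_min=0xA0, y_max=0xA0)
rng = random.Random(1981)
step(tree, rng)   # x becomes 1, still scrolling
step(tree, rng)   # x reaches 0, so the tree is regenerated
```

With the screen centred on 0x80, the example bounds 0xd0..0xf0 put the regenerated tree in the invisible right portion of the playfield.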
164 - There are two basic kinds of shapes, the shiftable kind which can be drawn
165 at any x-position on the screen, and the non-shiftable kind which can only
166 be drawn at byte-aligned positions. The shiftable kind is stored with 7
167 pre-shifted shapes, and the "struct shape" contains a pointer to the middle
168 one of these shapes, with indexing by up to +/- 3 * the shape size. This
169 is done to save time and code since a multiply by +/- 3 is cheaper than 7.
171 - Shiftable shapes are drawn at only even positions, due to the 140-pixel
172 logical screen width, but still require 7 pre-shifted shapes since even
173 screen-positions can be even or odd with respect to 7-bit screen bytes.
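The arithmetic behind the 7 pre-shifted copies can be checked with a short Python sketch. It assumes x is already an on-screen logical coordinate in 0..139 (i.e. after any origin offset has been removed), and the mapping from shift to the signed index is a guess at the direction used by the game.

```python
def locate(x: int) -> tuple[int, int]:
    """Map a logical on-screen x (0..139) to (byte column, bit shift)."""
    pixel = 2 * x                   # each logical unit is a pair of HGR pixels
    return pixel // 7, pixel % 7    # 7 pixels per screen byte


def preshift_index(shift: int) -> int:
    """Map a shift of 0..6 to a signed index -3..+3 around the middle copy.
    The direction of this mapping is an assumption."""
    return shift - 3


def shape_address(middle: int, size: int, x: int) -> int:
    """Address of the pre-shifted copy to use for logical x."""
    _, shift = locate(x)
    return middle + preshift_index(shift) * size


# Every shift 0..6 occurs even though only even pixel positions are used,
# because 2 and 7 have no common factor.
shifts = sorted({locate(x)[1] for x in range(140)})
```

The last logical column, x = 139, lands in byte column 39, which is the last of the 40 bytes in a hires scan line.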
175 - The first 8 shapes correspond to individual pixels (really pixel pairs)
176 in different colours, i.e. the 8 HGR colours. These are drawn with a
177 special routine. As an experiment, I tried replacing these with ordinary
178 shapes and commenting out the special routine (PIXEL_SHAPE = 1), which worked.
180 - Each scan line of each shape is either hi-bit clear (uses black, white,
181 green and purple) or set (uses black, white, blue and orange). The hi-bit
182 never changes within a scan line. When rendering the hi-bit is "ored" into
183 the screen like any other bit, so blue and orange will take precedence.
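The effect of or-ing the hi-bit can be shown in a couple of lines of Python. The bit patterns are arbitrary, and the helper name is made up for the example.

```python
PALETTES = {
    0: ("black", "white", "green", "purple"),   # hi-bit clear
    1: ("black", "white", "blue", "orange"),    # hi-bit set
}


def palette(screen_byte: int) -> tuple[str, ...]:
    """Return the colours available to the 7 pixels of one screen byte."""
    return PALETTES[screen_byte >> 7]


background = 0b0_0101010   # hi-bit clear
shape_row = 0b1_0000101    # hi-bit set
combined = background | shape_row
```

Once any shape with the hi-bit set has been or-ed into a byte, the whole byte switches to the blue/orange palette, including pixels that were drawn earlier as green or purple.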
185 - Each shape has a dimension in logical units which defines its collision
186 rectangle, as well as a dimension in physical units (bytes) which defines
187 its drawn rectangle. In general the physical dimension is predictable from
188 the logical dimension, and this means that in some cases padding is drawn
189 on-screen in order that the logical dimension be as the designer intended.
191 - Non-shiftable shapes are drawn in general by replacing the previous screen
192 memory contents rather than "oring". The routine that does this is called
193 "draw_misc" in the disassembly, as it does miscellaneous parts like titles,
194 scoring etc. There is a data table which I haven't fully decoded but which
195 contains 16-byte entries, describing (I think) strings to draw and where.
196 The physical (byte) width and drawn rectangle are critical to this process,
197 as I discovered that if you change these, then the text gets all messed up.
199 - The "draw_misc" routine also has the ability to mask the shape data with
200 an alternating pair of masks, and this is used to draw the "STAR BLAZER"
201 title screen with colour cycling. Such shape data is stored with hi-bit
202 set, in order that the hi-bit can be selectively masked off as required.
204 - The shape table contains some blank rectangles, which obviously would not
205 have any effect when drawn in "or" mode. I discovered that these are used
206 for erasing parts of the screen when drawn with "draw_misc". Their width
207 is important, e.g. a blank rectangle erases "HIGH" before drawing "SCORE"
208 and it then seems to advance by the width of this before drawing "SCORE".
210 - I discovered the following general principles about the data structures:
212 - There is essentially a "struct shape" and "struct object" containing the
213 variables that control a shape or an object, but they are implemented as a
214 separate array (indexed by shape or object number) for each field of the
215 struct, including lo and hi of pointers. This is quite normal in 6502 code.
217 - Interestingly the "struct object" seems to have what are essentially sub-
218 classes, because some of the arrays do not implement the entire 0x70-entry
219 range -- so you see code like "lda object_XXX - 0x40,x" which means the
220 XXX field of "struct object" is only stored for e.g. objects 0x40..0x6f.
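In Python terms, the layout described above looks roughly like the sketch below. The field names are placeholders, and only the sizes 0x100, 0x70 and the 0x40 offset come from the notes.

```python
NUM_SHAPES = 0x100
NUM_OBJECTS = 0x70

# One array per field, indexed by shape or object number.
shape_ptr_lo = [0] * NUM_SHAPES
shape_ptr_hi = [0] * NUM_SHAPES
object_x = [0] * NUM_OBJECTS
object_y = [0] * NUM_OBJECTS

# A "sub-class" field that is stored only for objects 0x40..0x6f.
XXX_FIRST = 0x40
object_xxx = [0] * (NUM_OBJECTS - XXX_FIRST)


def shape_ptr(n: int) -> int:
    """Reassemble a 16-bit pointer from its separate lo and hi arrays."""
    return shape_ptr_lo[n] | (shape_ptr_hi[n] << 8)


def get_xxx(obj: int) -> int:
    """Equivalent of indexing from "object_xxx - 0x40" with x = obj."""
    if not XXX_FIRST <= obj < NUM_OBJECTS:
        raise IndexError(f"object {obj:#x} has no xxx field")
    return object_xxx[obj - XXX_FIRST]


shape_ptr_lo[0x78], shape_ptr_hi[0x78] = 0x34, 0x62
object_xxx[0x41 - XXX_FIRST] = 0x78
```

The 6502 code has no bounds check like the one in get_xxx, of course; an out-of-range index simply reads whatever array happens to sit below.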
222 - In the disassembly, most lines which use an addressing mode like "NNNN,x"
223 are annotated in the comment field with a range like "x=40..6f". My custom
224 disassembler has extracted this information from a trace of the running game,
225 in which I attempted to exercise at least some of the game levels/features.
226 However, the range is only what it has *seen* and might be larger in reality.
228 - The disassembler uses the information about locations accessed by indexing
229 instructions, to build a partition of the data space into separate arrays.
230 It is independent of the base address that happened to be used for access.
231 Overlapping regions are merged, as it assumes they must be the same array.
233 - Interestingly, I found a few cases of what I think are game bugs, where the
234 author did not anticipate his access overrunning into a neighbouring array,
235 although it's possible that it was intentional and I didn't understand it.
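The merging of overlapping indexed accesses into arrays amounts to a standard interval merge. The Python sketch below shows the idea under that assumption; it is not the disassembler's actual code, and the addresses are invented.

```python
def touched(base: int, x_min: int, x_max: int) -> tuple[int, int]:
    """Addresses one "base,x" instruction was seen to touch (inclusive)."""
    return base + x_min, base + x_max


def partition(accesses: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Merge overlapping (first, last) address ranges, both inclusive.

    Only the touched addresses matter, so the base address that an
    instruction happened to use plays no part in the result."""
    merged: list[tuple[int, int]] = []
    for first, last in sorted(accesses):
        if merged and first <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], last))
        else:
            merged.append((first, last))
    return merged


accesses = [
    touched(0x6000, 0x40, 0x6F),   # "lda 0x6000,x" seen with x=40..6f
    touched(0x6040, 0x00, 0x1F),   # "lda 0x6040,x" seen with x=00..1f
    touched(0x6100, 0x00, 0x07),
]
arrays = partition(accesses)
```

An access that overruns into a neighbouring array would cause the two arrays to be merged in this scheme, which is one way such overruns show up.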
237 - I do not well understand the scoring and how the game proceeds through the
238 missions, but I did locate the important variables, so it would be easy to
239 figure it out. This hasn't been my major priority, which is why I haven't yet.
241 - The part I am presently attacking is to understand the gameplay at a finer
242 level, in particular the collision detection, and what happens for various
243 kinds of collisions (figuring this out will also provide insights into the
244 scoring and how it proceeds through the missions, hence I tackle this first).
246 - In an earlier version of the disassembly I had located routines for things
247 like intersection of collision rectangles, but I deleted these symbols from
248 the present disassembly as my naming was based on a slightly earlier way of
249 thinking. It would be relatively easy to re-find and re-annotate this code.
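For reference, a rectangle intersection test of the kind mentioned above looks like this in Python. It is the generic axis-aligned test and not a transcription of the game's routine; the field layout is an assumption.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: int   # left edge, logical units
    y: int   # top edge, logical units
    w: int   # width, logical units
    h: int   # height, logical units


def intersects(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share at least one logical unit."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


bomb = Rect(0x80, 0x80, 4, 4)
target = Rect(0x82, 0x83, 4, 4)
neighbour = Rect(0x84, 0x80, 4, 4)   # touches the bomb's edge, no overlap
```

In 6502 code this typically appears as a chain of byte compares and branches, which is a useful pattern to search for when re-finding the routine.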
251 - It turns out that the collision code is quite integrated into the rest of
252 the game logic. The game seems to use kind of an internal microcode which
253 consists of zero-terminated lists of bytes, and basically each game object
254 has various microcode routines, and many of the microcode bytes seem to be
255 the indices of other game objects that it needs to collision-test against.
257 - There are also other bytes such as 0xf0, which I think are not indices of
258 game objects, but rather, microcode commands, and you can see "cmp #0xf0"
259 and similar comparison chains throughout the code to implement these. I am
260 not sure if different kinds of collisions are implemented by different 0xfN
261 commands or by multiple object-indexed microcode tables or a combination.
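A first-pass decoder for such lists might look like the Python sketch below. The split between commands and object indices (0xf0 and above versus everything else) is only my working assumption, and the example bytes are invented.

```python
def decode(table: bytes, start: int):
    """Split one zero-terminated list into (kind, value) pairs.

    Returns the pairs plus the offset just past the terminator."""
    out = []
    i = start
    while table[i] != 0x00:
        b = table[i]
        # Assumption: 0xf0..0xff are commands, anything else is an
        # object index.
        out.append(("command" if b >= 0xF0 else "object", b))
        i += 1
    return out, i + 1


table = bytes([0x41, 0x42, 0xF0, 0x20, 0x00, 0x10, 0x00])
first, nxt = decode(table, 0)
second, end = decode(table, nxt)
```

If some commands turn out to take operand bytes, the loop would need to skip them, so the output should be treated as a starting point for annotation only.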
263 - I am making the disassembly available in its present state (warts and all)
264 so that others can pick it up if they want to and progress things. In the
265 present .zip I have omitted a lot of my files to keep it to a manageable
266 package, and even then, there is still a lot to take in (hires loader, etc).
268 - My complete work directory with my emulator, shape extractor and compiler,
269 etc, is available at the following git repository on one of my servers:
271 https://git.ndcode.org/public/star_disasm.git
273 - Unfortunately the gitweb viewer on my server is broken. I think people are
274 hitting it from the Internet and causing it to crash and restart, and then
275 systemd is not letting it restart after a while. A git client still works.
277 - As I am not really ready to make a proper release, I haven't bothered with
278 LICENSE files and such. However, I intend to release my part of the work
279 (not the copyrighted material obviously) under an MIT license. This includes
280 the disassembler, the tracing infrastructure, etc. It's quite sophisticated
281 so I also considered GPLv2, but overall I prefer a more permissive license.
283 - It is a work in progress, and I am not all that happy with how it handles
284 the shape editing and various other things. The disassembler also does not
285 have good support for the control file or the ability to add comments or
286 override the operand fields in instructions, and so whilst it can do a lot
287 automatically, it's hard to deal with the case where it gets things wrong.
288 It also does not have good support for immediate operands yet, although it
289 handles all the other addressing modes intelligently. I plan to add an enum
290 feature so that it can more readably decompile constants and the microcode.
292 - I will not be able to work on the project for a bit, so please enjoy for now.