in reply to Foone🏳️‍⚧️

I started reverse engineering Where in the World is Carmen Sandiego (Enhanced DOS edition) and I'm trying to find how it generates its random seeds so I search on int 1a and the first thing I find is it's doing TANDY SOUNDS?
in reply to Foone🏳️‍⚧️

Funny fact: I was trying to get an online assembler to spit out the machine code for "int 1a" but couldn't get it to, so I just went "fuck it, I can probably just do that in my head!"

Turns out I can. My brain is weird.

in reply to Foone🏳️‍⚧️

Here's something I didn't know existed until just now: Where in the World is Carmen Sandiego checks your name against the dossier list and rejects you if you use any of those names.
in reply to Foone🏳️‍⚧️

PRONOUNS DETECTED: THIS GAME IS WOKE

sadly they don't have they/them on here. What about the non-binary criminals, huh?

in reply to Foone🏳️‍⚧️

stretch goal: hack in at least one enby criminal with appropriate pronouns. maybe I'll just put myself in the game as one of the criminals you can apprehend
in reply to Foone🏳️‍⚧️

I think I might be able to do the hack I want by changing one byte.

I'm trying to change it so it has "daily challenges", and I think I can do that by just switching an INT 1A from subfunction 00 to 04, making it seed the random function with the date instead of the ticks-since-midnight

in reply to Foone🏳️‍⚧️

NORMAL CODE

random(*(byte *)*(undefined2 *)
(*(int *)(*(int *)0x39a6 * 0xe + local_c * 2 + 0x1d02) * 2 +
*(int *)(local_c * 2 + 0x24b)) - 1);

in reply to Foone🏳️‍⚧️

I haven't figured out how this game stores gender, but I'm gonna go out on a limb and say it's like this:
male: 0
female: 4
in reply to Foone🏳️‍⚧️

why? because they have strings like:
char* HE="He\0\0She\0";
char* HIS="His\0Hers\0";
char* HIM="Him\0Her\0";

so they can do like:

printf("Follow %s to %s lair, and capture %s alive!", badguy->name, HIS+badguy->gender, HIM+badguy->gender);
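
A minimal Python sketch of that guess, assuming the gender value really is a byte offset (0 or 4) into each packed table. The names `c_string_at` and `pronouns` are made up for illustration. Nothing here is confirmed from the game's code.

```python
# Hypothetical reconstruction of the guessed layout: each table packs the
# male form at offset 0 and the female form at offset 4.
HE = b"He\0\0She\0"
HIS = b"His\0Hers\0"
HIM = b"Him\0Her\0"

MALE, FEMALE = 0, 4  # guessed gender values, used directly as byte offsets


def c_string_at(table: bytes, offset: int) -> str:
    """Read a NUL-terminated string starting at offset, like C's `ptr + n`."""
    end = table.index(b"\0", offset)
    return table[offset:end].decode("ascii")


def pronouns(gender: int) -> tuple[str, str, str]:
    """Return the (HE, HIS, HIM) forms for the given gender offset."""
    return (
        c_string_at(HE, gender),
        c_string_at(HIS, gender),
        c_string_at(HIM, gender),
    )
```

The extra NUL after "He" is what makes the trick work: it pads the short entry so every second form starts at the same offset.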

in reply to Foone🏳️‍⚧️

I like how the game only asks your name, not your gender.
Players don't have genders. Only thieves have genders.
in reply to Foone🏳️‍⚧️

why does ghidra's "search by instruction pattern" default to BINARY?
what kind of a freak remembers the machine code for INT 21 on x86 in BINARY?
it's CD21h, not 1100110100100001!

what are you, some kind of nerd?
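
For reference, a small Python sketch of the encoding in question: `INT imm8` is the opcode byte CD followed by the interrupt number. The helper names `encode_int` and `find_all` are made up here, and a raw byte search like this will also match data that merely looks like the instruction.

```python
def encode_int(vector: int) -> bytes:
    """Machine code for x86 `INT imm8`: opcode CD, then the vector byte."""
    if not 0 <= vector <= 0xFF:
        raise ValueError("interrupt vector must fit in one byte")
    return bytes([0xCD, vector])


def find_all(haystack: bytes, needle: bytes) -> list[int]:
    """Return every offset where needle occurs in haystack."""
    hits, start = [], 0
    while (pos := haystack.find(needle, start)) != -1:
        hits.append(pos)
        start = pos + 1
    return hits


code = encode_int(0x21)
print(code.hex().upper())                 # CD21
print("".join(f"{b:08b}" for b in code))  # 1100110100100001
```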

in reply to Foone🏳️‍⚧️

I love reversing a string and it's:

void printString(char* str, int length);

and I go look what calls it, reverse that function, and it's:

void printStringSimple(char *str){
printString(str, strlen(str));
}

in reply to Foone🏳️‍⚧️

it's like "aww, did someone have second thoughts about making PRINT always take a length, and got tired of having to manually calculate lengths so you just wrapped it?

and your compiler didn't inline SHIT?"

in reply to Foone🏳️‍⚧️

okay so when you start a game (well, technically when you restart), the game rolls 3 dice:
0-31: where the shit was stolen from
0-2: which item it is from that location
0-8: whodunnit
in reply to Foone🏳️‍⚧️

like if you roll 0 on the first, you get Athens.
For the second one, it's:
0: mask of Priam
1: Achilles's heel
2: sibyl's secret.
in reply to Foone🏳️‍⚧️

The last die is used as a lookup table into the dossier's list.
It's got 1 added to it so you won't get Carmen Sandiego, as a rookie at least.
in reply to Foone🏳️‍⚧️

so the game uses a pattern like this:
char * RANKS="Rookie\0Sleuth\0Private Eye\0Investigator\0Ace Detective\0";

and then later they do:

char* your_rank = select_string(RANKS, player->rank);

and select_string is a confusing function to reverse engineer, but knowing the name I gave it gives it away: it advances through the list until it's on the nth string and returns it
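
Roughly what that amounts to, as a Python sketch rather than the game's actual code: skip n NUL terminators, then read up to the next one. `select_string` is my own name for the function, and unlike the original this version raises `ValueError` if n runs past the end of the table.

```python
RANKS = b"Rookie\0Sleuth\0Private Eye\0Investigator\0Ace Detective\0"


def select_string(table: bytes, n: int) -> str:
    """Return the nth NUL-terminated string packed into table (0-based)."""
    pos = 0
    for _ in range(n):
        # Jump to just past the next terminator.
        pos = table.index(b"\0", pos) + 1
    end = table.index(b"\0", pos)
    return table[pos:end].decode("ascii")
```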

in reply to Foone🏳️‍⚧️

so probably it uses the same trick for pronouns. The string I'm seeing is probably like: "He\0Him\0She\0Her\0"
in reply to Foone🏳️‍⚧️

Ghidra is officially sexist. It'll automatically detect the word "Female" and mark it as a string, but not the word "Male"!

Why? SEXISM!

or the fact the default minimum length for strings is 5 characters, so "female" is long enough but "male" isn't.

in reply to Foone🏳️‍⚧️

correction: there IS a check for going over the end, it's just not used in every place select_string is called. so it's sometimes-safe
in reply to Foone🏳️‍⚧️

they have invented a Pronoun Markup Language.
It's \x80 for He/She
It's \x81 for he/she
It's \x82 for his/her

so a string will be "\x80 mentioned \x81 liked seafood and offered me a ride in \x82 motorcycle"
and it'll fill it out based on the pronouns of the suspect
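
A sketch of how such an expansion could work, in Python. Only the three codes listed above are handled, and `expand` plus its male/female flag are illustrative assumptions, not the game's real interface.

```python
# Pronoun codes observed so far: (male form, female form).
PRONOUNS = {
    0x80: ("He", "She"),
    0x81: ("he", "she"),
    0x82: ("his", "her"),
}


def expand(template: bytes, female: bool) -> str:
    """Replace each pronoun code byte with the matching pronoun."""
    out = []
    for byte in template:
        if byte in PRONOUNS:
            out.append(PRONOUNS[byte][1 if female else 0])
        else:
            out.append(chr(byte))
    return "".join(out)


line = b"\x80 mentioned \x81 liked seafood and offered me a ride in \x82 motorcycle"
print(expand(line, female=True))
```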

in reply to Foone🏳️‍⚧️

in trying to hack myself into the game, it glitched and said I had "Hobby: Male"

no... I haven't done that in ages!

in reply to Foone🏳️‍⚧️

I modified the game's NUM_GENDERS and found where it stores the database of criminals, so now you can find me if you search SEX=NB.
in reply to Foone🏳️‍⚧️

so in addition to the 5 listed attributes (and their name), the game tracks one hidden attribute:

food preference.
There are only two options:
00=Mexican
01=Seafood

what an odd binary

in reply to Foone🏳️‍⚧️

I'm thinking I might do a "full"(ish) disassembly of this game. I've thought for a long while (basically ever since I knew Where In North Dakota is Carmen Sandiego? existed) that there should be an SDK for making your own version of this game, for whatever arbitrary geographical area you want.
in reply to Foone🏳️‍⚧️

and of course there's no reason you would have to limit yourself to reality.
You could always do, like, "Where in Middle Earth is Carmen Sandiego?"
in reply to Foone🏳️‍⚧️

you go to Rivendell and talk to an Elf who says the perp was talking about how he wanted to collect "his precious"
in reply to Foone🏳️‍⚧️

I say "full" in quotes because I don't think I need to reverse the whole game to make it customizable, just enough to let you customize the locations, bad guys, hints, search types, etc.
in reply to Foone🏳️‍⚧️

sadly they didn't design the game as a completely empty husk that just loads datafiles. That would have been the smart thing to do, since they could then trivially make new versions.
in reply to Foone🏳️‍⚧️

maybe instead of fully decompiling it, I just hack it to grab data from external files, then make a tool for making those files
in reply to Foone🏳️‍⚧️

turns out this version of the game has impressive support for older video cards. Here's Hercules support, which looks horrible without aspect ratio correction!
in reply to Foone🏳️‍⚧️

wow, this is actually the first game I've seen actually use the VGA bios call to set the VGA palette. (int 10h, AX=1012h)
in reply to Foone🏳️‍⚧️

so when the game starts, it loads:
ACME.DAT
CARMEN.DAT
MIDISND.DAT
DIGISND.DAT
CITIES.DAT

Interestingly, it uses the same code to load the last three, suggesting they're some kind of basic container format

in reply to Foone🏳️‍⚧️

started writing code to generate a JSON file of all the various switchable info in the EXE. Things like hobbies, hair colors, locations, etc.
in reply to Foone🏳️‍⚧️

this blit function seems to take a useless first argument, a second argument that's the height, a third argument that's the width, and a fourth argument that doesn't seem to do anything.

notice anything missing? like... a lot of things?

in reply to Foone🏳️‍⚧️

I think this game might be doing something weird where blit-source positions and destination positions are all globals, for some fucking reason
in reply to Foone🏳️‍⚧️

the game internally has 5 drivers (as of 2.2, I have other versions here and they're different): CGA, Hercules, EGA, Tandy, VGA.
in reply to Foone🏳️‍⚧️

I've been working on cities.dat. I can now confirm that this game (Where in the World is Carmen Sandiego Enhanced (DOS, 1990)) has 30 cities, and they're the same 30 cities as the 1985 original.
in reply to Foone🏳️‍⚧️

hmm. I could reuse my readString code between these two formats, but it would technically enable world cities to have pronouns.
in reply to Foone🏳️‍⚧️

this game uses a fun text encoding method: both-ended null terminated!

It stores city names with a nul at the beginning because it reads them backwards. For some fucking reason.

in reply to Foone🏳️‍⚧️

why in the fuck is loading the data for Paris suddenly grabbing some random data out of Kigali? this implies some weird things about the compression, or the data normalization
in reply to Foone🏳️‍⚧️

they seek to position X
read 1 byte
read 99 more bytes
then seek to position X+100

now if you know how both math and random access files work, you'll realize something the programmers of Where in the World is Carmen Sandiego? Enhanced (1990, DOS) did not:

THEY'RE SEEKING TO THE POSITION THEY'RE ALREADY AT

in reply to Foone🏳️‍⚧️

I tried to corrupt the image to see if that'd tell me anything about how it was encoded, and it told me to put my hard drive back in.
in reply to Foone🏳️‍⚧️

the way this game does the investigations is interesting.
so the basic gameplay is that you're in location X, you get 3 hints, which lead you to location Y, where the whole process repeats.

But if you savescum to experience the same pursuit again, they'll always go through the same places... but if you don't get the hints, they won't be there.

in reply to Foone🏳️‍⚧️

like the hints will always tell you to go to sri lanka, but if you go there without first having heard those hints, then he won't be in sri lanka
in reply to Foone🏳️‍⚧️

Hah! the game apparently calculates some info ahead of time, but only a few steps. I changed who the suspect was by memory editing, and it didn't take effect... until I got to the third location.

Since I went from a robbery by Fast Eddie B to one by Merey LaRoc, it means the pronouns changed when I got to London.

Congrats on coming out as a trans woman, Merey.

in reply to Foone🏳️‍⚧️

ok I ran my dosspin tool to gibberish every byte of the save game file (it's only 102 bytes, so this is easy!) and none of them change where you start. very interesting... I'm guessing either the values are spread out too much for my gibberishing to reach, or you need to modify multiple bytes at once
in reply to Foone🏳️‍⚧️

huh, I found a hidden(?) key: if you hold down either shift, it skips all the pauses in the printing. so it goes at MAX CPU SPEED
in reply to Foone🏳️‍⚧️

ahh good. it's always fun to find code that looks like:
do{

} while(variable!=0);

someone has a custom tick handler that's mutating a global!

in reply to Foone🏳️‍⚧️

looking at interrupts, and I think I found a bug.
they set handlers for various CPU errors, but they accidentally set 10 (COPROCESSOR ERROR) twice, instead of the 05 (BOUND check)/10 (COPROCESSOR) interrupts they save

someone copy-pasted and missed a bit

in reply to Foone🏳️‍⚧️

I finally found the two helper functions they use to get and set vectors!

all the 30 other places I've seen them set/get vectors, they do it manually, but hey, maybe they use the helpers too

in reply to Foone🏳️‍⚧️

could also be that this is a compiler-provided bit of code, which is left in because the runtime needs it, or they just didn't eliminate dead code
in reply to Foone🏳️‍⚧️

okay I've figured out there's a shared format they're using here. it chunks the file into chunks, which have a 16-bit ID (unique per file, but not globally), an offset, and 16-bit length
in reply to Foone🏳️‍⚧️

so like, midisnd.dat will have 12 entries, and the first 11 are 200-500 bytes each, and then the last is 3k.
presumably it's each song and then some config info?
in reply to Foone🏳️‍⚧️

cities.dat is very interesting. There's 30 cities in total, but 491 entries in it!

So they must be doing something odd there, that doesn't divide equally. Maybe one city-chunk gives IDs of the others?

in reply to Foone🏳️‍⚧️

idea for a test: it's easy to spot which chunk in a city is the image, because it's the biggest. Here's a way to determine if it's looking up by IDs or offsets/indices: swap the IDs of two images
in reply to Foone🏳️‍⚧️

darn. turns out you can't just renumber the chunks, because they have to be in increasing order.

so maybe I just need to leave the chunk indexes as is, and instead of moving the entries around, I move where they're pointing?

in reply to Foone🏳️‍⚧️

Bingo! I'm in Athens, but I'm seeing the image for Baghdad, and apparently with the Baghdad palette?

So one of these other chunks must be the palette for a city. Or it selects from a selection of palettes? Maybe they've just got a couple defined.

in reply to Foone🏳️‍⚧️

okay I figured out the cities.dat IDs:

They're all 1XXYY (in decimal):
XX is the city number (0-29), YY is the sub-chunk-id.

So like:
YY=0: City name
YY=2: City image.

They go between 00 and 22, and not all numbers need to be present.
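
The ID scheme above is just decimal digit packing, so it can be sketched in a couple of Python helpers. The function names are made up for illustration.

```python
def split_chunk_id(chunk_id: int) -> tuple[int, int]:
    """Split a cities.dat chunk ID of the form 1XXYY into (city, sub_chunk)."""
    if not 10000 <= chunk_id <= 12999:
        raise ValueError("expected an ID of the form 1XXYY with XX in 0-29")
    return divmod(chunk_id - 10000, 100)


def make_chunk_id(city: int, sub_chunk: int) -> int:
    """Build the 1XXYY ID for a city number (0-29) and sub-chunk ID."""
    return 10000 + city * 100 + sub_chunk
```

For example, `split_chunk_id(10002)` gives `(0, 2)`: the image chunk for city 0.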

in reply to Foone🏳️‍⚧️

okay I think it has a very simple 1-byte CRC check on the chunks, which is optionally not run.
I can't make the math work but I'm reasonably sure that's what it is
in reply to Foone🏳️‍⚧️

ugh. TODO for my eventual Good DOS Debugger:
Instant Video display.
I don't know exactly how DOSBox-X is doing it, but while single-stepping the debugger, the display never updates. I can dump the ram at A000:0000 and see what updated, but not on the screen in DOSBox
in reply to Foone🏳️‍⚧️

found a suspicious array, which goes:
[
(-1,0),
(-1,1),
(0,1),
(1,1),
(1,0),
(1,-1),
(0, -1),
(-1,-1),
(0,0)
]

POP QUIZ: why does the font renderer need this array? how are they being "lazy" with this array?

in reply to Foone🏳️‍⚧️

hint

Sensitive content

in reply to Foone🏳️‍⚧️

answer

Sensitive content

in reply to Foone🏳️‍⚧️

The answers to the DRM questions for Where in the World is Carmen Sandiego? Enhanced (DOS, 1990) are, in no particular order:

23
Kent
dragon
calcium
1796
Warren
revenue
1792
Willard
1937
Crater
Tanzania
Hartford
Duluth
London
Gem
Silent
squeaker

in reply to Foone🏳️‍⚧️

if ((0x80 >> ((byte)local_4 & 7) &
(int)(char)*(byte *)((int)((int *)param_1 + 1) + (local_4 >> 3))) != 0) {

COULD YOU USE SOME MORE CASTS MAYBE?
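
Stripped of the casts, that expression is a plain MSB-first bit test. Here's the same idea as a Python sketch, assuming the caller has already skipped the leading 16-bit int that the `(int *)param_1 + 1` steps over. `bit_is_set` is a made-up name.

```python
def bit_is_set(bitmap: bytes, index: int) -> bool:
    """Test bit `index` of a packed bitmap, most significant bit first."""
    mask = 0x80 >> (index & 7)      # bit position within the byte
    return (bitmap[index >> 3] & mask) != 0  # index >> 3 selects the byte
```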

in reply to Foone🏳️‍⚧️

oh it's because ghidra's near/far pointer support is shit.

I had param2 defined as a byte*32 and it was casting it to a byte* before using it

in reply to Foone🏳️‍⚧️

if I define it as byte* and let the calling convention implicitly define it as 32bit, it doesn't do the cast
in reply to Foone🏳️‍⚧️

well I found the decompression method.

as always, I hate it. decompression routines are probably my least favorite thing to reverse engineer

in reply to Foone🏳️‍⚧️

I think this compression is specifically designed for ASCII text, which is annoying because they've also got compressed images... which probably use a DIFFERENT COMPRESSION!
in reply to Foone🏳️‍⚧️

it looks like this chunk has length 256, which means 253 usable bytes, and it expands to 374 bytes.

Not the greatest compression. a little better than just doing 6-bit ASCII.
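
The arithmetic behind that comparison, as a quick Python check. It uses only the two sizes quoted above and assumes "6-bit ASCII" means packing every character into exactly 6 bits.

```python
import math

compressed = 253   # usable bytes in the chunk
expanded = 374     # bytes after decompression

# Plain 6-bit packing would need 6 bits per output character.
six_bit = math.ceil(expanded * 6 / 8)

print(six_bit)                          # 281
print(round(compressed / expanded, 3))  # ratio achieved by the game
print(round(six_bit / expanded, 3))     # ratio for 6-bit packing
```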

in reply to Foone🏳️‍⚧️

it's some kind of shifting bit mask but it starts at encoding values in 4 bits, then it can increase (or decrease, I guess) based on the input stream.

then it has an output filter, where if the number specified wasn't 8 bits, it's actually an index into a predefined text table

in reply to Foone🏳️‍⚧️

the predefined table starts with NUL, space, then:
aetonisrdlhugfcwypbmk,vSA.T'PMxBCIRGDWHqE-zNFKL0j:51YJ8\U?73Q;2!469
\r\nOVXZ()*+"#$%&<=>/@[]^_`
in reply to Foone🏳️‍⚧️

given that the most common symbols are near the beginning, this is presumably a sort of lazy huffman coding
in reply to Foone🏳️‍⚧️

but I've got the predefined table, an input file, an output file, and now I need to write some python code to replicate this, hopefully without crying
in reply to Foone🏳️‍⚧️

"vs ses oa is isgit's tc eital and largest t u anhtA ttggh os nnotosnhrdsmarosogdn ss drte tishoth's isdhsceohtsnthminder of isgit's t nuorhdhtpast\x00 geru is slightltsn oaller than ndhd na and is o nnsgtgstbtst oa dotlalssaaolootbiaoht Sal gh, sonuhvia and sl ghh\x00isgit, ontvdn ss nhsiaalgarsnadlfnaatawlarst oadrlhrs i is a rugged land dooousr'casrbhe nrdsgs fountainsnht iah"
in reply to Foone🏳️‍⚧️

that's supposed to read:
"\x03Lima is Peru's capital and largest city. A well-known landmark is the Archbishop's Palace, a reminder of Peru's colonial past\x00Peru is slightly smaller than Alaska and is bordered by Ecuador, Colombia, Brazil, Bolivia and Chile\x00Peru, once the center of the mighty Incan Empire, is a rugged land dominated by the Andes Mountains. Forests and jungles cover half its land area\x00"
in reply to Foone🏳️‍⚧️

it was a trivial off-by-one error.
I was doing saved_byte=input[3], but while I needed the 3rd byte, that's at input[2]

in reply to Foone🏳️‍⚧️

yess!

C:\DOSBox-X\drive_c\carmen\py>python datfile.py cities.dat --dump=12803 --decompress
"\x03Sydney, with a population of more than 3.3 million people, is Australia's largest city. A well-known sight is Sydney's distinctively designed Opera House\x00An island continent, Australia is nearly as large as the United States but has only one-fifteenth the population\x00The capital of Australia is Canberra, located in the southeast corner of the country between Sydney and Melbourne\x00"

in reply to Foone🏳️‍⚧️

It starts with \x03 to indicate there are three strings, then it describes the city three times. at runtime it uses the select_string function with a random input to select one of the three strings
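
A Python sketch of reading that layout: one count byte, then that many NUL-terminated strings. The helper names are invented, and decoding as latin-1 is my choice so any high bytes (like the \x80-\x82 pronoun codes) pass through untouched.

```python
import random


def parse_string_list(chunk: bytes) -> list[str]:
    """Parse a count byte followed by that many NUL-terminated strings."""
    count = chunk[0]
    parts = chunk[1:].split(b"\0")
    # A well-formed chunk yields count strings plus whatever follows the
    # final terminator, so we need at least count + 1 pieces.
    if len(parts) < count + 1:
        raise ValueError("chunk is missing strings or a terminator")
    return [p.decode("latin-1") for p in parts[:count]]


def pick_one(chunk: bytes, rng: random.Random) -> str:
    """Pick one of the alternatives at random, like the game does."""
    return rng.choice(parse_string_list(chunk))
```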
in reply to Foone🏳️‍⚧️

okay now that I can decode the chunks (well, most of them) I can identify a lot more of them:

00 Name and (some other info)
01 ???
02 Image
03 City descriptions
04 Items to steal
10 ???
11&up: Hints leading here

in reply to Foone🏳️‍⚧️

So like, the 12 chunk for Tokyo says:

b'\x05asked about the exchange rate for yen\x00was practicing Japanese characters\x00said\x81planned to take photographs of Mount Fuji\x00asked about tours of the Imperial Palace\x00was interested in visiting Shinto shrines\x00'

So it picks from one of those 5 options

in reply to Foone🏳️‍⚧️

and then 13 will be:
b'\x02asked questions about Shinto rituals\x00said\x81was researching an archipelago\x00'
in reply to Foone🏳️‍⚧️

so when it sets up a city that has hints to lead to Tokyo, it picks 3 of these sets of questions, then picks a question in each set.
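
That two-level pick could look something like this in Python. This is only a sketch: I'm assuming the three sets are distinct (sampling without replacement), which I haven't confirmed, and `pick_hints` is a made-up name.

```python
import random


def pick_hints(hint_sets: list[list[str]], rng: random.Random, how_many: int = 3) -> list[str]:
    """Choose `how_many` distinct hint sets, then one hint from each."""
    chosen_sets = rng.sample(hint_sets, how_many)  # assumption: no repeats
    return [rng.choice(hints) for hints in chosen_sets]


tokyo_sets = [
    ["asked about the exchange rate for yen", "was practicing Japanese characters"],
    ["asked questions about Shinto rituals"],
    ["asked about tours of the Imperial Palace"],
    ["was interested in visiting Shinto shrines"],
]
print(pick_hints(tokyo_sets, random.Random()))
```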
in reply to Foone🏳️‍⚧️

tool that'd really be handy right now:
a "live" version of binxelview, so I can step through the DOSBox-x debugger and see how memory is changing in real time, as an image.
in reply to Foone🏳️‍⚧️

I'm stepping through a high-level loading routine I don't understand yet, trying to figure out when it decompresses an image by watching the RAM it uses for file loading and decompression and spotting when the image appears
in reply to Foone🏳️‍⚧️

sadly DOSBox-X's memory breakpoints don't let you set up a breakpoint that covers a whole 64k. you only get one byte. A shame.
in reply to Foone🏳️‍⚧️

ooh, I'd also need to be able to watch multiple address ranges at once. that'd be sweet, multiple windows of visibility into RAM
in reply to Foone🏳️‍⚧️

I'm in Paris, I look at work ram, I see the image of the Eiffel Tower. I head to Rome, and before I load the next image, I can see that the Eiffel Tower in workram now has the wrong stride.
That's odd, because it means it had to rewrite the image in memory, the image it's about to unload.
in reply to Foone🏳️‍⚧️

I think this might be the GUI system doing a screenshot of the image under a window, so it can restore it at the end. And it still does that here, even though we'll never need to restore that image: we're about to overwrite it
in reply to Foone🏳️‍⚧️

Here's what I want a tool to do:
I hit a breakpoint in the debugger, I turn it on, set another breakpoint, and hit go.
between those two breakpoints, every time a CALL instruction is hit, it dumps my selected memory region. If it's identical to the last dump, it's ignored.
At the end, each dump is rendered as an image, and the combined set are an animation I can scroll through.
in reply to Foone🏳️‍⚧️

it's in a function I already found, temporarily named "blit_related".

I guess they don't decode the image until RIGHT before it needs to go up on the screen!

in reply to Foone🏳️‍⚧️

it definitely decompresses and then blits the image as two parts, which aren't evenly sized, and it starts from the bottom
in reply to Foone🏳️‍⚧️

I think they're just trying to keep their RAM usage down by not having both halves in memory at once
in reply to Foone🏳️‍⚧️

It loads the half-width version, then a few functions later, it's been replaced with a full-width version.
Strange!
in reply to Foone🏳️‍⚧️

wait no, the colors are wrong... I bet I'm seeing it decompress the binary, but that's using the full width of the bytes. it then gets expanded out to a 16-color image.
in reply to Foone🏳️‍⚧️

well the good news is that I think I've found the decompress_image function. the bad news is that now I have to reverse engineer it 🙁
in reply to Foone🏳️‍⚧️

it's currently doing the obvious thing for a decompressor to do:
write the byte 04 every 69 bytes
in reply to Foone🏳️‍⚧️

oh sweet jesus, that's the left two pixels of the image.
it's loading the image vertically!

at least it's top to bottom.

in reply to Foone🏳️‍⚧️

yeah, doom did that too, but Doom was a 2.5D image that had to do pseudo-raycasting.

THIS GAME DOES NOT

in reply to Foone🏳️‍⚧️

it allocates a 1024 byte buffer, then makes a pointer to the end of it, minus -0x42?

why would you need a link to the end of a new, freshly cleared buffer, minus 62?

in reply to Foone🏳️‍⚧️

I think the memory allocation system here is that every malloc returns 2 extra bytes, which is a pointer to the previous block.
unless it's an odd number, in which case it's a free block. and pointer to the previous block, once you make it even again
in reply to Foone🏳️‍⚧️

I hate dealing with the internals of memory allocation systems. I prefer to leave that to smarter people than me
in reply to Foone🏳️‍⚧️

You see this little About dialog box? Guess how many times the DrawText function is called?

Once! and just to draw "Where in the World is Carmen Sandiego?".
The rest of the text is drawn elsewhere, and I have no idea why.

in reply to Foone🏳️‍⚧️

correction: it calls it once to draw "Where in the World is Carmen Sandiego?" but that's unrelated to the one on screen WHAT?
in reply to Foone🏳️‍⚧️

the only problem with using Ghidra to hack children's games instead of, like, Serious Things like firmwares or malware or whatever, is sometimes you have to make a label named NUM_MOUNTAIN_CLIMBING_HINTS
in reply to Foone🏳️‍⚧️

It has a surprisingly robust UI engine. I swapped from BoldFont to SmallFont and the menu adapted perfectly.
in reply to Foone🏳️‍⚧️

The game loads the BoldFont first, then the SmallFont, then the NormalFont.

Annoyingly this isn't how they're laid out in memory:
It's SmallFont, then BoldFont, then NormalFont

in reply to Foone🏳️‍⚧️

Weirdly, swapping the NormalFont for the SmallFont causes the printer text to be VERTICAL, for reasons I do not remotely understand!
in reply to Foone🏳️‍⚧️

font_alloc = malloc(local_a);
if (font_alloc == (void *)0x0) {
font_alloc = (void *)0x0;
}

Ahh yes. remember, if you get a null pointer back from malloc(), make sure to set that variable to NULL so it won't be left as... NULL?

in reply to Foone🏳️‍⚧️

man, running on 4 hours of sleep is killing me.
I can't even remember the MS-DOS interrupt to open a file!

I know reading it is int 21 ah=3f, closing it is int 21 ah=3d, and I'll never forget that seeking is int 21 ah=42, but how do you open a file?
I mean, not the int 21 ax=6c00 way, that one is only for DOS 4.0+, and obviously a game released in 1990 isn't gonna use that.

in reply to Foone🏳️‍⚧️

ahh, now that I've looked it up, it seems I was wrong!
closing isn't 3D, that's 3E! 3D is open!

no wonder I couldn't remember it, I had it confused with another call
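
For next time, the file-related INT 21h calls mentioned above as a Python cheat sheet, keyed by the value in AH. The dict and helper are just an illustration.

```python
# DOS INT 21h file functions, keyed by AH.
DOS_FILE_CALLS = {
    0x3D: "open existing file",
    0x3E: "close file",
    0x3F: "read from file or device",
    0x42: "move file pointer (seek)",
    0x6C: "extended open/create (DOS 4.0+, called with AX=6C00h)",
}


def describe(ah: int) -> str:
    """Name the INT 21h file function selected by AH, if it's in the table."""
    return DOS_FILE_CALLS.get(ah, "not in this table")
```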

in reply to Foone🏳️‍⚧️

what the fuck do you mean that carmen.dat is opened on the first call to finish_draw_maybe()?

like, I know there's a "maybe" in that name, but it's not THAT big of a maybe.

in reply to Foone🏳️‍⚧️

oh thank god, that was a bit of confusion from manually tracking stack frames.
it's actually LoadDatFile, which makes a HELL of a lot more sense
in reply to Foone🏳️‍⚧️

darn. Compiler Explorer doesn't support MS C Compiler 5.1 from 1988. Guess I gotta spin up an emulator again
in reply to Foone🏳️‍⚧️

the annoying thing is that MS C Compiler 5.1 is the most mundane-ass DOS application. If I had a 32bit windows install rather than 64bit, it would probably just run natively on my system
in reply to Foone🏳️‍⚧️

I'm gonna build an m.2 addon that's just a drop in x86 coprocessor. I know a lot of computers that could use an x86 processor these days.
in reply to Foone🏳️‍⚧️

it's like a Super Game Boy, but for your PC! Plug in this extra hardware, and now your system is compatible with a ton more software!
in reply to Foone🏳️‍⚧️

note to self: figure out how Ghidra fidb works, so I can apply it to MSC5.1 (which was sadly overlooked by the developers of ghidra)
in reply to Foone🏳️‍⚧️

okay don't change that byte, GOT IT.
I think I failed to load the cursor, which caused it to corrupt the mouse cursor catastrophically
in reply to Foone🏳️‍⚧️

a fun kind of reverse engineering tactic that I practice probably more than I should is a version of The Scream Test (which is the principle that the easiest way to find who "owns" a server is to turn it off and see who screams): if you don't know what some code does, break it. and see what screams.
in reply to Foone🏳️‍⚧️

I think I may have found unused graphics for a feature that'd change the Acme Detective Agency at the beginning to be season-specific. There's summer, fall, winter, and spring variants, but the game seems to be hardcoded to summer
in reply to Foone🏳️‍⚧️

I did a little looking into the contents of MIDISND.DAT

It's got 12 small tracks, and each of them is a valid MIDI file if you remove the first byte.
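
A quick Python sketch of that check: drop the first byte and make sure what's left begins with the standard MIDI header chunk tag, `MThd`. I don't know what the leading byte means, so this just discards it. `strip_to_midi` is a made-up name.

```python
def strip_to_midi(track: bytes) -> bytes:
    """Drop the leading byte of a MIDISND.DAT track and verify a MIDI header."""
    body = track[1:]
    if not body.startswith(b"MThd"):
        raise ValueError("no MThd header after the first byte")
    return body
```

Each result can then be written out as a .mid file and opened in any MIDI player.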

in reply to Foone🏳️‍⚧️

heh. I was checking different near-death animations by overriding the randomness, so I had to tell my debugger to set AX to 0

guess which animation that is? The one with the AXe.

in reply to Foone🏳️‍⚧️

why do they store the day of the week as a 16bit int?

future proofing in case the calendar gets updated and has more than 256 days in the week?

in reply to Foone🏳️‍⚧️

I accidentally applied a patch backwards and put the detective to sleep, forever.
They're in Rome and they've just slept through about two months of nothing
in reply to Foone🏳️‍⚧️

patching 0x148C9 in the EXE to 90 90 will stop the clock advancing, so you now have Infinite Time to catch the culprit
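
As a Python sketch, the patch is two NOP bytes (90 90) written at that file offset. The offset only applies to the exact EXE I'm working on, and the file names in the usage comment are placeholders.

```python
PATCH_OFFSET = 0x148C9           # file offset in this specific EXE
NOP_NOP = bytes([0x90, 0x90])    # two x86 NOP instructions


def patch_clock(exe: bytes) -> bytes:
    """Return a copy of the EXE with the clock-advance instruction NOPed out."""
    if len(exe) < PATCH_OFFSET + len(NOP_NOP):
        raise ValueError("file is too small to be the expected EXE")
    patched = bytearray(exe)
    patched[PATCH_OFFSET:PATCH_OFFSET + len(NOP_NOP)] = NOP_NOP
    return bytes(patched)


# Usage (placeholder file names):
# data = open("CARMEN.EXE", "rb").read()
# open("CARMEN_PATCHED.EXE", "wb").write(patch_clock(data))
```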
in reply to Foone🏳️‍⚧️

I finally figured out how it calculates travel times.
It's the difference in X coordinate between the two cities, plus the difference between the Y coordinate, plus one.
that quantity divided by 40, then has 2 added. if the result is over 7, it's set to 7.

Weird! that's not how you measure distance, Carmen.
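
The formula above as a Python sketch. I'm assuming "difference" means absolute difference and that the divide is an integer divide. `travel_hours` and the coordinate tuples are illustrative.

```python
def travel_hours(a: tuple[int, int], b: tuple[int, int]) -> int:
    """Travel time between two map positions: Manhattan distance, scaled and capped."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    hours = (dx + dy + 1) // 40 + 2
    return min(hours, 7)
```

So the minimum trip is 2 hours, and anything with a combined X+Y difference of 199 or more saturates at 7.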

in reply to Foone🏳️‍⚧️

also, it's the 90s, I can afford a sqrt().
I should fix it up for my version.

or use a squared lookup table. you could do this REAL easy by making it a table search: there's only 6 possible results: 2,3,4,5,6,7. each entry in the lookup table contains the maximum squared distance that can generate that number of hours
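
Here's what that table search could look like in Python. This is a sketch of my proposed replacement, not anything in the game: it assumes I keep the same divide-by-40, add-2, cap-at-7 scaling but feed it true Euclidean distance, and it stores the squared distance at which each hour count gives way to the next so no sqrt() is needed.

```python
import bisect

# Squared distances where the hour count steps up: 40, 80, 120, 160, 200 units.
# Below LIMITS[0] is 2 hours; at or above LIMITS[-1] is the 7-hour cap.
LIMITS = [(40 * (hours - 1)) ** 2 for hours in range(2, 7)]


def travel_hours_euclid(a: tuple[int, int], b: tuple[int, int]) -> int:
    """Hours from squared Euclidean distance via a table search."""
    dx = a[0] - b[0]
    dy = a[1] - b[1]
    squared = dx * dx + dy * dy
    # Count how many thresholds the squared distance has reached.
    return 2 + bisect.bisect_right(LIMITS, squared)
```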

in reply to Foone🏳️‍⚧️

here's all 30 city locations: gist.github.com/foone/09925178…

it's currently way too 6am to do more calculations, though. I'll do that tomorrow

in reply to Foone🏳️‍⚧️

Good news: @modulusshift did the calculations for me!

digipres.club/@modulusshift/11…

in reply to Foone🏳️‍⚧️

I think that says that it doesn't matter much. The biggest error is in the biggest distances, which are all saturated to the max of 7-hours anyway.
in reply to Foone🏳️‍⚧️

I'm confused by the graphics detection routines. I thought it was returning 0 for "no graphics" or something, but it turns out 0 means MCGA.
So the GraphicsMode enum goes:
0: MCGA
1: CGA
2: Hercules
3: EGA
4: Tandy
5: VGA
6: ???
in reply to Foone🏳️‍⚧️

I don't think there's any reason why this would support SVGA. It always uses 320x200 at a maximum of 256 colors. VGA is more than enough to handle it
in reply to Foone🏳️‍⚧️

I find-replaced the background from palette entry 0 to palette entry C:

Now I can confirm how big this image is. Previously it was set into a black background, which made it harder

in reply to Foone🏳️‍⚧️

worst thing that could happen just happened:

I just realized the portable Where in the World is Carmen Sandiego? is based on the same version I'm hacking, meaning it's in-scope for me to get this, dump the ROM, and compare.

That just increased the cost and complexity of this project by a bunch

in reply to Foone🏳️‍⚧️

WAIT HOLD ALL THE PHONES.
Here's a photo from a MS-DOS version. It does that thing some companies (like Sierra) did back in the day, and included both 3.5" and 5.25" disks in the package.

BUT WHY ARE THERE SO MANY DISKS?

in reply to Foone🏳️‍⚧️

ARG, they mislabeled this.
Admittedly, this isn't really their fault, this is confusing shit.

This is the 1992 Where in the World Is Carmen Sandiego? Deluxe, not the 1990 Where in the World Is Carmen Sandiego? Enhanced.

in reply to Foone🏳️‍⚧️

okay I finally found a boxed copy of the Enhanced 1990 DOS edition. (confusingly labeled the 1993 edition)

It comes on two 5.25" disks: presumably double-density, so that's 720kb in total.

Floppy Disk Pop Quiz: What's weird about these floppies, specifically given that this is the MS-DOS version?

in reply to Foone🏳️‍⚧️

I happened to look at mobygames, and noticed two interesting things.

First, the Mac version is very similar to the DOS version, other than the expected changes you'd get from it being on a monochrome system with a GUI.

But wow, that's a completely different font! Is that built into macs or something? (EDIT: @amr confirms it is)

(also, the dialogue box is top-aligned. DOS bottom-aligns them)

in reply to Foone🏳️‍⚧️

I don't want to go through a million platforms but all the other ports of this game tweaked some art here and there or put in different location-photos, but all of them have the same basic tall-window-on-the-left, smaller-window-in-the-top-right, four-buttons-in-lower-right design
in reply to Foone🏳️‍⚧️

The answer for "what's wrong with these floppies?" is that they're double-notched. That's needed for double-sided disks... on systems which have single-sided drives!
The PC has basically always been double-sided, so they only need one notch, on the top/a side.
in reply to Foone🏳️‍⚧️

here's why they shipped it on a double-notched disk anyway:
Broderbund was releasing games on a bunch of other systems that DID have single-sided drives. For simplicity they just bought Xty-thousand double-notched disks
in reply to Foone🏳️‍⚧️

is it gonna matter? not in the slightest (assuming there's no format-mismatching, which there shouldn't be: these are all the same density of disks, I think).

The PC doesn't check for a notch there, so it won't notice either.

in reply to Foone🏳️‍⚧️

It's just funny because this is, like, technically wrong? These aren't PC disks, but the difference doesn't matter, so why not?

It probably saved them a decent amount of money because of bulk discounts and inventory simplicity.

in reply to Foone🏳️‍⚧️

also, all this wondering about "how many disks does Carmen Sandiego Enhanced (1990, DOS) come on?" is even sillier because I ALREADY KNEW THE ANSWER, I JUST FORGOT I KNEW IT
in reply to Foone🏳️‍⚧️

I am currently, as in this very thread, reverse engineering Carmen Sandiego Enhanced (1990, DOS)!

I've seen the code that asks for you to put in the other disk! And it only asks for DISK1 and DISK2!

in reply to Foone🏳️‍⚧️

just looking at the files, not the code (and not having seen original disk images yet that I can recall), I bet the answer is that they put CITIES.DAT on DISK2.
the whole game minus cities.dat is ~300kb, with cities.dat being 168kb.

They could do the whole game minus carmen.dat and cities.dat in only 200kb, which'd give them 160kb (luxury!) for a fancy installer.

in reply to Foone🏳️‍⚧️

This game autodetects everything (video and audio modes) and you can install it by just doing "copy A:*.* C:\CARMEN" on each disk, so I don't think they would have needed a fancy installer.
in reply to Foone🏳️‍⚧️

I should just check. I'm sure disk images can be tracked down in places.

the video and audio detection seems to be excellent, by the way. it just silently figures it out, without asking questions or requiring special arguments or configuration.
Perfect for a game aimed at the little childrens.

in reply to Foone🏳️‍⚧️

I found two different copies of the disk images, in different places.

both are imaged off a 3.5" disk version, which of course comes on only one (double density, 720kb) disk!

in reply to Foone🏳️‍⚧️

That version has no installer. Just the usual files (and a "DESKTOPD.CFG" file that I don't understand)
in reply to Foone🏳️‍⚧️

I did not realize they implemented a file browser in this program! I only found it by hiding all the DAT files from the EXE, to see if it'd ask me to put floppies in.
in reply to Foone🏳️‍⚧️

So I've got code at 17DA:08AA, which is E8 5D F7. DOSBox decodes that as CALL 000A.

Manually decoding it myself, it should be a relative jump, and it's a jump to $-0x8a3. following the jump it ends up at 17DA:000A.

BUT GHIDRA thinks this code is at 1fb7:08aa, and it decodes it as call SUB_2000_fb7a, which doesn't exist.

I'm not sure how (0x08aa+3)-0x8a3 = 2000:fb7a. Something weird is going on. Why is the number BIGGER?
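A quick sanity check of that arithmetic, as a Python sketch. The helper names are made up for illustration. The "add the displacement as an unsigned number and never wrap to 16 bits" idea is only a guess at what might be going wrong; it just happens to reproduce the number Ghidra shows.

```python
def decode_near_call(ip: int, code: bytes) -> int:
    """Target offset of an `E8 rel16` near CALL whose first byte sits at `ip`."""
    if len(code) != 3 or code[0] != 0xE8:
        raise ValueError("not an E8 rel16 call")
    disp = int.from_bytes(code[1:3], "little", signed=True)  # 5D F7 -> -0x8A3
    return (ip + 3 + disp) & 0xFFFF  # the offset wraps inside the 64 KB segment


def linear(seg: int, off: int) -> int:
    """Real-mode linear address: segment * 16 + offset."""
    return (seg << 4) + off


target = decode_near_call(0x08AA, bytes.fromhex("E85DF7"))
print(f"wrapped target offset: {target:04X}")  # 000A, matching DOSBox

# Guess: treat the displacement as unsigned 0xF75D and skip the 16-bit wrap.
unwrapped = 0x08AA + 3 + 0xF75D      # 0x1000A, too big for a 16-bit offset
bogus = linear(0x1FB7, unwrapped)    # 0x2FB7A
print(f"unwrapped linear address: {bogus:05X}")
print(bogus == linear(0x2000, 0xFB7A))  # True
```

If that guess is right, the number is bigger because the carry out of the 16-bit offset leaks into the linear address instead of being thrown away.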

in reply to Foone🏳️‍⚧️

eww. They're using the NEAR version of CALL to call a FAR procedure.

You might say "wait, won't that break when it tries to do RETF?" and yes, it would, unless they manually do PUSH CS before they call it!

in reply to Foone🏳️‍⚧️

I think this saves one byte?
a call FAR absolute would be 5 bytes for the call, whereas push CS + call NEAR is 3+1 bytes
in reply to Foone🏳️‍⚧️

I might have to make a NASM test case. This could be Ghidra fucking up at decoding this one instruction
in reply to Foone🏳️‍⚧️

similar things in the test.com file. I moved stuff around in the memory map and it's not erroring now. I've probably created endless glitches elsewhere though
in reply to Foone🏳️‍⚧️

Anyway it seems it doesn't have a VideoDetect function, it's a DriverDetect function, since it's used for sound too.

First it goes through the video drivers in the following order:
VGA, TGA, EGA, HGA, HERC, and CGA.
Then it goes into the audio drivers:

stdsnd, adlib, covox, gblast, ibmg, sblast, tandy.

in reply to Foone🏳️‍⚧️

stdsnd is pc speaker,
adlib is adlib, covox is the speech thing, gblast is game blaster, most likely, ibmg is... I'm not sure. The PS-1 Audio card?

sblast is soundblaster and tandy is tandy 3-voice

in reply to Foone🏳️‍⚧️

I'm an idiot, this isn't a driver check... it's an argv check!

you can pass "ega" or "vga" or whatever to carmen.exe to select those types.

in reply to Foone🏳️‍⚧️

the other argument you can pass is ROSTER=$FILENAME

This lets you reset which file it uses for the list of registered players, setting it to something other than the default ACME.DAT

Not mentioned in the manual, but I can see how that might be useful for schools and such

in reply to Foone🏳️‍⚧️

I would say "especially if they're on a network!" but... this program is from 1990. Not many schools had networks in '90.
in reply to Foone🏳️‍⚧️

looks like GameBlaster (GBLAST) has extra options, so you can do like GBLAST260 to set the IO addr
This entry was edited (4 months ago)
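Here is a rough Python sketch of what that argv check amounts to. The driver names, `ROSTER=` and the default `ACME.DAT` come from the posts above. Case-insensitive matching, reading the `GBLAST` suffix as hex, and the shape of the returned dict are all assumptions made for illustration, not what the EXE necessarily does.

```python
VIDEO = {"VGA", "TGA", "EGA", "HGA", "HERC", "CGA"}
AUDIO = ("STDSND", "ADLIB", "COVOX", "GBLAST", "IBMG", "SBLAST", "TANDY")


def parse_args(argv):
    """Sort command-line words into video mode, audio driver and roster file."""
    opts = {"video": None, "audio": None, "audio_port": None, "roster": "ACME.DAT"}
    for arg in argv:
        up = arg.upper()  # assumption: matching ignores case
        if up.startswith("ROSTER="):
            opts["roster"] = arg[len("ROSTER="):]
        elif up in VIDEO:
            opts["video"] = up
        else:
            for name in AUDIO:
                if up.startswith(name):
                    suffix = up[len(name):]
                    if not suffix:
                        opts["audio"] = name
                    elif name == "GBLAST":
                        # assumption: digits after GBLAST are a hex I/O address
                        try:
                            port = int(suffix, 16)
                        except ValueError:
                            break
                        opts["audio"], opts["audio_port"] = name, port
                    break
    return opts


print(parse_args(["ega", "GBLAST260", "ROSTER=CLASS3.DAT"]))
```

`CLASS3.DAT` is just a made-up roster name for the example.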
in reply to Foone🏳️‍⚧️

ugh. I pulled the thread to try and remap the memory to avoid ghidra disassembling it wrong, and it keeps getting worse. this is a mess.
in reply to Foone🏳️‍⚧️

okay I reverted back to my old mapping, then created a new memory mapping: I made up some bytes at 2000:xxxx where it incorrectly thinks it's going, and set up a JMP $CORRECT_ADDRESS there by editing the bytes, then telling Ghidra it's a thunk.
in reply to Foone🏳️‍⚧️

so the program has three main code segments, as it has approximately 111kb of code
The problem is that ghidra gets confused when the relative addresses are too big.
in reply to Foone🏳️‍⚧️

so the first one is at 1000:0000 and the second was at 1fb7:0009. I moved it to 5000:7000, and the second segment seems to be working fine now.

the problem is that I was only able to do that because the segment is only 82a7h long. the first segment, the 1000:0000 one, is FB79 long. So I can't just move it so it's in the middle of a segment, since it'll end up spanning into the next 64k chunk, which is where ghidra fucks up

in reply to Foone🏳️‍⚧️

9000:8006 9a d7 05 b7 1f CALLF SUB_2000_0147

Hey ghidra I can read the machine code. That's CALL FAR 1fb7:05d7, not CALL FAR 2000:0147! WHY ARE YOU CONFUSED BY THIS?
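For what it's worth, the two addresses are not unrelated. In real mode a segment:offset pair maps to segment * 16 + offset, so several pairs can name the same byte. A small Python sketch (helper names are mine) that decodes the quoted bytes and compares the two spellings; whether Ghidra is really renormalizing the address into its own memory block like this is an assumption.

```python
def decode_far_call(code: bytes):
    """Decode `9A off16 seg16` (CALL FAR ptr16:16) into (segment, offset)."""
    if len(code) != 5 or code[0] != 0x9A:
        raise ValueError("not a 9A far call")
    off = int.from_bytes(code[1:3], "little")
    seg = int.from_bytes(code[3:5], "little")
    return seg, off


def linear(seg: int, off: int) -> int:
    """Real-mode linear address: segment * 16 + offset."""
    return (seg << 4) + off


seg, off = decode_far_call(bytes.fromhex("9ad705b71f"))
print(f"machine code says {seg:04x}:{off:04x}")       # 1fb7:05d7
print(f"linear address    {linear(seg, off):05x}")    # 20147
print(linear(seg, off) == linear(0x2000, 0x0147))     # True
```

So `SUB_2000_0147` looks like the same physical location, just labeled relative to a block that starts at 2000:0000.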

in reply to Foone🏳️‍⚧️

well, if nothing else, I think this has caused it to stop thinking there's jumps into the middle of functions.
so now I can just manually thunk every cross-segment call, by creating the 2000:0000 segment that ghidra is imagining exists
in reply to Foone🏳️‍⚧️

I was extracting the portraits of the people you talk to, and it turns out they're numbered 1-36. naturally I checked all 256 possible options.

but it turns out everything above 37 either:
1. crashes
2. shows nothing
3. shows pixel gibberish.

EXCEPT 238. 238 renders a bellhop perfectly, just like 5 does

in reply to Foone🏳️‍⚧️

I'm kinda surprised they're so dithered. with the support for EGA/MCGA/VGA monitors, they could have pulled something like sierra did and encoded the dithering into their compression. Then when they're displaying on higher-colordepth displays they could swap it out for an intermediate color.
in reply to Foone🏳️‍⚧️

it has been zero days since Ghidra has done something I can't understand and seems to be obviously wrong.

I've got B8 B0 26: this decodes to mov ax, 0x26b0. a 16bit immediate, moving into a 16bit register.

in reply to Foone🏳️‍⚧️

ghidra disassembles this as:
b8 b0 26 MOV uVar1 ,0x26b0

uVar1 is defined as a ushort: a 16bit type.

in reply to Foone🏳️‍⚧️

the most annoying thing?

this is picking between two strings to display, and those strings are "he" and "she".

EVEN IN 35 YEAR OLD COMPUTER GAMES I CANNOT ESCAPE GENDER PROBLEMS!

This entry was edited (4 months ago)
in reply to Foone🏳️‍⚧️

Since they devoted an entire word to gender, we can truthfully state that Where in the World is Carmen Sandiego? (enhanced, DOS, 1990) believes there are 65536 genders.
in reply to Foone🏳️‍⚧️

unfortunately due to an oversight it believes those 65536 genders are allocated as:

0: He/he/Him/him
1-65535: She/she/Her/her

in reply to Foone🏳️‍⚧️

BTW, my plan for expanding the program is simple: I'm gonna bypass a lot of code/data, by stuffing my own allocation into the memory space of carmen, which'll load extra data off the disk, in a CUSTOM.DAT file
in reply to Foone🏳️‍⚧️

this'll be (relatively) easy to do, since it turns out this program only needs 432 KB, since it targets a 512 KB RAM machine.
Since it's no longer 1990, I think I can safely bump that up a bit? I won't need more than another 64 KB, which means I'll just bump the game up to 496 KB memory required. Completely doable in any 640 KB or more machine!
in reply to Foone🏳️‍⚧️

my added code will just load the CUSTOM.DAT file off the disk, and then inject pointers to it in the rest of the program.
in reply to Foone🏳️‍⚧️

the applyPronouns function lets you adjust how it's encoded dynamically. Fancy!
So how it works is you do something like this:

applyPronouns("\x80 was bald", 0x80, "he\0him")

and it'll return "he was bald", right? But it's more than just a simple find-replace...

This entry was edited (4 months ago)
in reply to Foone🏳️‍⚧️

Because you can do:

applyPronouns("I saw \x81. \x80 was bald!", 0x80, "he\0him")

and it'll return "I saw him. he was bald!".

See, you can specify multiple replacements at once, by using \x80, \x81, \x82 and so on.

in reply to Foone🏳️‍⚧️

The way it actually works is the game uses "He/he/Him/him" for the pronouns, so \x80 is uppercase "He", \x81 is lowercase, \x82 is uppercase "Him", and \x83 is lowercase.
This entry was edited (4 months ago)
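Going by the examples above, the substitution can be sketched in Python like this. This is a reimplementation from the described behavior, not the game's code: the function name, the use of `bytes`, and passing unknown bytes through unchanged are my assumptions.

```python
def apply_pronouns(template: bytes, base: int, pronouns: bytes) -> bytes:
    """Replace byte `base + i` in `template` with the i-th NUL-separated form."""
    forms = pronouns.split(b"\0")
    out = bytearray()
    for byte in template:
        index = byte - base
        if 0 <= index < len(forms):
            out += forms[index]   # placeholder byte: splice in the pronoun
        else:
            out.append(byte)      # ordinary text: copy through
    return bytes(out)


print(apply_pronouns(b"I saw \x81. \x80 was bald!", 0x80, b"he\0him"))
# b'I saw him. he was bald!'
```

With the game's four forms the table would be `b"He\0he\0Him\0him"`, so `\x80` through `\x83` select them in that order.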
in reply to Foone🏳️‍⚧️

trying to figure out how to properly decode the fonts in this game is REALLY reminding me why I constantly cheat with The Death Generator. Staring at a decompilation/disassembly and hex editor is no fun
in reply to Foone🏳️‍⚧️

I got my floppy copy in the mail, I just need to image it.

Fun fact from the box: It has a letter from the player character to their cousin, and I believe this is the only place in the game and associated media that they name your character.

It's Dale.

This entry was edited (3 months ago)
in reply to Foone🏳️‍⚧️

I also discovered that in the Amiga port, they redrew the crime computer to make it clearly an amiga. Cute!
in reply to Foone🏳️‍⚧️

Imaged my original disks. Two 360kb 5.25" disks.
They're laid out like this:
Disk 1:
CARMEN.EXE
CARMEN.DAT
Disk 2:
CITIES.DAT
MIDISND.DAT
DIGISND.DAT
in reply to Foone🏳️‍⚧️

Finally, we know the answer to the age-old question of Where in the World is Carmen Sandiego?

The answer is "My floppy drive"

in reply to Foone🏳️‍⚧️

well my "ignore the problem" solution of using bochscpu to embed a 16bit x86 emulator has failed. it's somehow broken and it's broken in the rust library or C core, not the python, and I really don't want to have to deal with debugging this.

time to switch to a completely different x86 emulator? PROBABLY!

in reply to Foone🏳️‍⚧️

I'm implementing unicorn as an x86 emulator to do the decompression, but I'm single-stepping the processor and I'm aiding debugging by showing what instruction I'm on.

but instead of having to set up an x86 disassembly engine, I'm just parsing a plain text ghidra dump of the disassembly. I'm parsing it with regexes
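A minimal sketch of that kind of regex parsing in Python. The line layout (address, hex bytes, mnemonic, operands) is assumed from the single Ghidra line quoted earlier in the thread; a real dump may have extra columns, labels, or comments that this pattern does not handle.

```python
import re

# Assumed layout: "SSSS:OOOO b1 b2 ... MNEMONIC operands"
LINE_RE = re.compile(
    r"^\s*(?P<seg>[0-9a-fA-F]{4}):(?P<off>[0-9a-fA-F]{4})\s+"
    r"(?P<bytes>(?:[0-9a-fA-F]{2}\s+)+)"
    r"(?P<mnemonic>[A-Z][A-Z0-9.]*)\s*(?P<operands>.*)$"
)


def parse_listing(text: str) -> dict:
    """Map (segment, offset) to (raw bytes, mnemonic, operands); skip other lines."""
    listing = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # headers, blank lines, anything that isn't an instruction
        key = (int(m["seg"], 16), int(m["off"], 16))
        listing[key] = (bytes.fromhex(m["bytes"]), m["mnemonic"], m["operands"].strip())
    return listing


dump = """
some header line that is not an instruction
9000:8006 9a d7 05 b7 1f CALLF SUB_2000_0147
"""
listing = parse_listing(dump)
print(listing[(0x9000, 0x8006)])
```

While single-stepping, looking up the current CS:IP in that dict gives the text to show.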

in reply to Foone🏳️‍⚧️

my latest bad idea: DUMBPATCH.

to avoid the complexity of generating functions and mapping them into the address space of the emulated PC, I instead designed a simple syntax:

a 16bit segmented address plus a number. that function is emulated as if it returned that number in AX. There are no other options. I suspect I'll be able to emulate up to 80% of complex subfunctions with this one bit of functionality

This entry was edited (3 months ago)
in reply to Foone🏳️‍⚧️

I need this because the decompression routine I'm emulating isn't entirely standalone: it calls malloc() at the beginning and free() at the end

so I'm replacing malloc() with a static value and free() with a return value no one will check

in reply to Foone🏳️‍⚧️

ideally I should be able to patch arbitrary python in there and do some kind of interop to return values to python

but that's hard. and this way easier, inflexible thing is 80% of what I need that for

in reply to Foone🏳️‍⚧️

I forgot about callee cleanup. fucking stdcall is callee cleanup. I can't have a generic int blah(){return 0x1234;} function because it needs to know how many words of arguments were pushed.
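To make the cleanup problem concrete, here is a toy Python model of what a "return this number" stub has to do to the machine state. Everything here is hypothetical: the `Cpu` class, the `dumb_return` helper, and the optional third field for the argument word count are illustrations, not the actual DUMBPATCH format or any emulator's API.

```python
from dataclasses import dataclass, field


@dataclass
class Cpu:
    ax: int = 0
    cs: int = 0
    ip: int = 0
    ss: int = 0
    sp: int = 0
    mem: bytearray = field(default_factory=lambda: bytearray(0x110000))

    def push16(self, value: int) -> None:
        self.sp = (self.sp - 2) & 0xFFFF
        addr = (self.ss << 4) + self.sp
        self.mem[addr:addr + 2] = value.to_bytes(2, "little")

    def pop16(self) -> int:
        addr = (self.ss << 4) + self.sp
        self.sp = (self.sp + 2) & 0xFFFF
        return int.from_bytes(self.mem[addr:addr + 2], "little")


def parse_patch(line: str):
    """Parse 'seg:off value [arg_words]' (the third field is a made-up extension)."""
    addr, value, *rest = line.split()
    seg, off = (int(part, 16) for part in addr.split(":"))
    return seg, off, int(value, 0), int(rest[0], 0) if rest else 0


def dumb_return(cpu: Cpu, value: int, arg_words: int = 0, far: bool = True) -> None:
    """Pretend the function ran: set AX, pop the return address, drop the args."""
    cpu.ax = value & 0xFFFF
    cpu.ip = cpu.pop16()
    if far:
        cpu.cs = cpu.pop16()
    cpu.sp = (cpu.sp + 2 * arg_words) & 0xFFFF  # callee cleanup, like RETF n


cpu = Cpu(ss=0x3000, sp=0x0100)
cpu.push16(0x0040)   # one word of arguments
cpu.push16(0x1FB7)   # return CS
cpu.push16(0x0123)   # return IP
seg, off, value, words = parse_patch("1fb7:05d7 0x1234 1")
dumb_return(cpu, value, arg_words=words)
print(hex(cpu.ax), hex(cpu.cs), hex(cpu.ip), hex(cpu.sp))
```

Without the word count the stack pointer ends up two bytes low for every pushed argument, which is exactly why one generic stub is not enough.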
in reply to Foone🏳️‍⚧️

I took a look at the 1985 version to see if it had any other graphics command line options (it doesn't), but I did discover in passing that it uses a different pronoun system than the 1990 Enhanced version!
in reply to Foone🏳️‍⚧️

hacking a computer system by changing my pronouns to they/them so that it'll use up more memory composing strings referring to me and overflow the buffer
in reply to Foone🏳️‍⚧️

Where in the World is Carmen Sandiego? (1985) has an invert-y-axis option for the joystick, just in case you want to use flight simulator controls to navigate a menu
in reply to Foone🏳️‍⚧️

back on hacking Enhanced, DOS, 1990.

My best guess is that this game has between 4-6 compression algorithms, depending on how you count them. Possibly more are hidden in the bowels of this program.

in reply to Foone🏳️‍⚧️

that may be only the IMAGE compression algorithms, and they use a separate compression algorithm for text.
in reply to Foone🏳️‍⚧️

this is not the game to do it with, but I really wanna try swapping out the drawing routines for one of these games once. go into a VESA mode where I can run at 1024x768 or something, and just make the drawing write to that buffer instead. Could I make BIGSCREEN DOS GAMES?
in reply to Foone🏳️‍⚧️

maybe I'll try it with railroad tycoon sometime. that game has loadable graphics modules. if I figure out enough of how it works, I could write my own driver for VESA Railroads
in reply to Foone🏳️‍⚧️

hah, I love DOS programmers.
This code mallocs 65516 bytes in a loop until malloc returns zero.
in reply to Foone🏳️‍⚧️

totally normal part of starting a program: allocate all the RAM in the system.

I mean, it's DOS. There's nothing else running that could possibly call malloc. So why not?

in reply to Foone🏳️‍⚧️

You also have to remember that it's not going to succeed more than, like, 6-8 times?
There's just not that much memory in the system that this can touch, since it's not supporting any of the endless varieties of breaking the 640k barrier
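A back-of-the-envelope version of that loop in Python, to see how many times it can succeed. It ignores allocator overhead and whatever DOS and the program itself already occupy, so the counts are upper bounds under those assumptions; the function name is made up.

```python
BLOCK = 65516  # the size the game asks for on each call


def grab_all_memory(free_bytes: int, block: int = BLOCK) -> list:
    """Keep 'allocating' until the pretend malloc would return zero."""
    handles = []
    while free_bytes >= block:        # stand-in for malloc() succeeding
        handles.append(len(handles))  # remember the block somewhere
        free_bytes -= block
    return handles


print(len(grab_all_memory(432 * 1024)))  # 6
print(len(grab_all_memory(640 * 1024)))  # 10
```

Even with all 640 KB free it tops out at ten blocks, so a small fixed table of pointers is plenty on the machines this was written for.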
in reply to Foone🏳️‍⚧️

there's a story on Old New Thing somewhere about Windows 95 accidentally breaking a DOS game, because it did this same trick of allocating all the memory, but since Win95 was running as the DPMI host, it meant it had access to all of windows 95's virtual memory. including the swap.

So instead of mallocing all 8mb or whatever your 486 had, it malloced all that and then tried to use up YOUR ENTIRE HARD DRIVE, slowly.

in reply to Foone🏳️‍⚧️

And then it crashed because it didn't expect to succeed that many times. It had a fixed array of handles to memory, and it overflowed because it was run on a system with HUNDREDS OF MEGABYTES OF RAM, which is clearly impossible and unthinkable
in reply to Foone🏳️‍⚧️

I think the solution was that win95 just defaults DOS programs to maxing out at 16mb. It won't let them allocate more than that unless you adjust the EXE options
in reply to Foone🏳️‍⚧️

this game runs mostly in a 16 color mode, with some high-end modes being basically 16-colors within 64 or 256 colors, right?

SO WHY DOES IT USE 16-BIT INTEGERS FOR COLOR INDICES?

in reply to Foone🏳️‍⚧️

I'm not sure if anyone has ever designed a paletted graphics system that uses more than 256 colors. Probably at some point someone thought it was a good idea.
in reply to Foone🏳️‍⚧️

note to self: patch out the Romani slur in one of the hints for Budapest

EDIT: both of them

This entry was edited (2 months ago)
in reply to Foone🏳️‍⚧️

This game was released in 1990 but it has a hint that refers to the currency of Brazil as the "cruzado". But in 1989, it had been replaced by the cruzado novo. Clearly someone was using an out of date encyclopedia!
in reply to Foone🏳️‍⚧️

I'm thinking of dumping a list of all the hints in this game and calculating how many of them are wrong by now.
in reply to Foone🏳️‍⚧️

like, most of the flag clues. do you know how many countries have changed their flags since 1989? LOTS OF THEM
in reply to Foone🏳️‍⚧️

The description of Hungary says it's bordered by Czechoslovakia, Austria, Yugoslavia, Romania, and the Soviet Union.

Two of those are still right!

in reply to Foone🏳️‍⚧️

so I fly into Reykjavík, and immediately sleep for 8 hours. In the morning, I can go to either the airport or the hotel, but it'll take 3 hours to get to either.

Question: where am I right now, if I'm not at the hotel or the airport?

in reply to Foone🏳️‍⚧️

I mean, if your hotel is near Reykjavik or Keflavik, then you could be in one of the "wild" hot springs out in rural Iceland, like the one near Hella. That's where I'd go, anyway!
in reply to Foone🏳️‍⚧️

in a bed at the hotel attached to the airport. You could theoretically walk to the hotel lobby or the airport terminal in, at most, 15 minutes. You just have ADHD and are attempting to be realistic in your expectations. (don’t worry, me too)
in reply to Foone🏳️‍⚧️

3 hours huh? I guess that'd be one of the 10 pubs in Reykjavik because besides volcanoes and harbors there isn't much to see
in reply to Foone🏳️‍⚧️

Ah, now, that one I can answer from personal experience - interviewing for my first job, the company put me up in a hotel overnight beforehand. Problem: they accidentally booked me into the hotel chain's branch in [suburb], and were then confused when they arrived at the airport hotel to interview me. The shuttlebus back to the airport had just left when we twigged it, too, so it took a good hour+ to make my way over there...
in reply to Foone🏳️‍⚧️

Well, they never said you actually made it into the terminal, so I'm guessing you're passed out under the Air Stairs.
in reply to C.

@cazabon then why does it take me three hours to get to the airport?
@C.
in reply to Foone🏳️‍⚧️

They initialized the SoundBlaster DSP backwards.
You're supposed to send a 0 to the reset port, wait 3 microseconds, send a 1, then wait up to 100 microseconds for an 0xAA to show up on the data port.

They instead send a 1, then a 0, then immediately start trying to read the data port.

in reply to Foone🏳️‍⚧️

they read from the ports instead of measuring time, because that'll take a certain amount of time on x86. I'm too tired to confirm if their timing logic is sound. It's possible they're just assuming the PC is slow enough that it'll wait long enough
in reply to Foone🏳️‍⚧️

I appreciate your lack of judgment. Back in those times it might have actually been a totally reasonable assumption. I mean, better than spinning the CPU, the totally next level timing delay mechanism of the times.
in reply to tekhedd

@tekhedd oh yeah, it made sense back in the day. that's a valid method of timing, I'm just not 100% sure they're actually matching the docs on how long they should wait, especially on later and faster systems
in reply to Foone🏳️‍⚧️

The problem with byzantine systems is that everything *could* make sense 😛
in reply to Foone🏳️‍⚧️

I bet broderbund bought this sound code. It feels completely different: This was either compiled on a different compiler or was written in assembly.
in reply to Foone🏳️‍⚧️

yeah a compiler doesn't just start using CF to return bools instead of AX. This is assembly.
in reply to Foone🏳️‍⚧️

they're passing single bytes to functions! C widens integer parameters to a word, so on a 16bit system, they're passed in 16-bit registers.
This entry was edited (1 month ago)
in reply to Foone🏳️‍⚧️

they're sending an... internal soundblaster test command?
(DSP 0xF0)

I dunno why this code is like this.

in reply to Foone🏳️‍⚧️

I suspect there may be an issue here: I identified a variable as containing the Soundblaster IO port, right? and I'm assuming everything that uses it is Soundblaster code.

But it may just be "soundcard IO port" and there's other sound device code mixed in here. So that's why some of it doesn't make sense as soundblaster, it's actually tandy 3voice or something

in reply to Foone🏳️‍⚧️

I just found a function (inside another function!) that's a fixed delay. How long is it?
it's a loop that runs 256 times!
in reply to Foone🏳️‍⚧️

that's so cute that this code considers "256 instructions" to be a meaningful length of time.
in reply to Foone🏳️‍⚧️

I hope this isn't a super-early version of the Miles library! He'd be so embarrassed...
in reply to Foone🏳️‍⚧️

the Z80 considers 128 instructions to be a good DRAM refresh interval. With more complexity than that, of course.
in reply to Foone🏳️‍⚧️

That reminds me of a game (I think it was Space Crusade) which was very glitchy on my family’s 486 DX2 66 and eventually led to me discovering a use for the Turbo button that dropped it to 7Mhz (or so the seven segment display on the front claimed).
in reply to Foone🏳️‍⚧️

there's code in here specifically to detect if it's running on an IBM PS/1 by looking at the CMOS area?

WHAT THE

in reply to Foone🏳️‍⚧️

the menu system limits menus to having a maximum of 32 items.

which is weird because ONLY 17 WILL FIT ON SCREEN

in reply to Foone🏳️‍⚧️

well, look, having 1 bit less to index the menu would limit it to a max of 16, and then we'd have the "THERE WOULD BE SPACE FOR ONE MORE!!!" drama.
in reply to Foone🏳️‍⚧️

I did some experimenting with MSC 5.1, and it's weird. I get the same strings in the exe as carmen.exe has, but the code itself looks completely different.

either I set up my compiler wrong, or this game is full of assembly even for very simple functions

in reply to Foone🏳️‍⚧️

I don't know exactly what this function does (I know it sets some flags based on something in the graphics context) but I DO know one important thing about it:

they included it in the final binary FOUR TIMES.

in reply to Foone🏳️‍⚧️

byte-identical.

this is a compiler & linker from 1988, it doesn't understand how to merge identical copies of functions apparently

in reply to Foone🏳️‍⚧️

I found another function which has 4 copies.

I'm starting to suspect this program originally had 4 C source files and the linker wasn't optimizing this

in reply to Foone🏳️‍⚧️

But you can't have 4 symbols with the same name. Maybe it's some sort of inlining?
in reply to Foone🏳️‍⚧️

wait I bet it's drivers!
like, one version of this function is called by VGA_DrawFuncUnknown and nothing else.
Another one? CGA/Hercules.
the third? EGA
The last? Tandy.

They compiled the 4 video drivers separately, and then linked them into the EXE, with no deduplication across compile units

in reply to Foone🏳️‍⚧️

yeah. Found another: VGAMalloc is the same as CGAMalloc (and Hercules doesn't have its own HerculesMalloc, because it's in the same code unit as CGA: So it just uses CGAMalloc)
Tandy has TandyMalloc.

But not EGAMalloc. That one is completely different.

in reply to Foone🏳️‍⚧️

That is odd, given the differences in video memory layout between VGA and CGA
in reply to Walter van Holst

@whvholst the function just mallocs the param passed in, so it doesn't care about layout.

except for EGA. which I don't understand yet

in reply to Foone🏳️‍⚧️

the DrawLine API is weird.
To draw the horizontal underline for the hotkeys in the menu, it calls DrawLine(0, -width).

It's DrawLine(int y, int x), and yeah you pass negative numbers

in reply to Foone🏳️‍⚧️

it's also off by one.
because 0,0 is silly, you're always drawing at least one pixel. So DrawLine(0, -5) draws a six pixel wide horizontal line to the left
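Here is how I read those semantics, as a Python sketch. It assumes the line is drawn relative to some current pen position and only handles the axis-aligned case; both are guesses for illustration, as is the function name.

```python
def draw_line_pixels(pen, dy, dx):
    """Pixels covered by DrawLine(dy, dx) starting at pen=(x, y), endpoints included."""
    x, y = pen
    if dx and dy:
        raise NotImplementedError("sketch only covers horizontal/vertical lines")
    if dx == 0:
        step = 1 if dy >= 0 else -1
        return [(x, y + i * step) for i in range(abs(dy) + 1)]
    step = 1 if dx > 0 else -1
    return [(x + i * step, y) for i in range(abs(dx) + 1)]


pixels = draw_line_pixels((100, 50), 0, -5)
print(len(pixels), pixels[0], pixels[-1])  # 6 (100, 50) (95, 50)
```

The `+ 1` is the off-by-one: a delta of 0 still plots the starting pixel.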
in reply to Foone🏳️‍⚧️

That sounds more like (int startPoint, int endPoint) than (int startPoint, int length).
in reply to Foone🏳️‍⚧️

PUSH BX
PUSH ES
PUSH SI
CALL StartPlayingSound
POP BX
POP ES
POP SI

since when has the x86 stack been FIFO instead of LIFO?
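To spell out what that sequence does, here is a tiny Python model of the stack. It assumes StartPlayingSound leaves the three pushed words untouched; the register values are arbitrary placeholders.

```python
def push_pop_same_order(regs: dict) -> dict:
    """PUSH BX, ES, SI then POP BX, ES, SI on a LIFO stack."""
    stack = []
    for name in ("BX", "ES", "SI"):
        stack.append(regs[name])     # PUSH
    # CALL StartPlayingSound would happen here
    result = dict(regs)
    for name in ("BX", "ES", "SI"):
        result[name] = stack.pop()   # POP takes the most recent push first
    return result


before = {"BX": 0x1111, "ES": 0x2222, "SI": 0x3333}
after = push_pop_same_order(before)
print({k: hex(v) for k, v in after.items()})
# {'BX': '0x3333', 'ES': '0x2222', 'SI': '0x1111'}
```

So BX and SI come back swapped, and ES only survives because it sits in the middle.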

in reply to Foone🏳️‍⚧️

the internal audio API used by this game is interesting.
LoadAndPlaySoundChunk is called with a chunk name from digisnd.dat, but you can also pass -1 or 0. I'm not sure what -1 does yet (maybe silence a currently playing sound?) but 0 means "wait until the sound finishes"
in reply to Foone🏳️‍⚧️

I'm not really sure why it works that way, especially because calling LoadAndPlaySoundChunk(0) is equivalent to calling WaitUntilSoundFinishes().

So why not just do that instead?

in reply to Foone🏳️‍⚧️

uh oh. the computer noise is triggered with:
LoadAndPlaySoundChunk(217)

but I look in the DIGISND.DAT file and it has chunks 200-216.

So either my DAT file parsing is wrong or it's loading sounds from elsewhere, somehow? because the sound DOES play, so it's not just an error

in reply to Foone🏳️‍⚧️

I thought it might just be playing from MIDISND.DAT instead (since the computer noise is very beepy, maybe it's just a synth sound?) but MIDISND.DAT starts at chunk id 218 and goes up.

WHERE IS 217?

in reply to Foone🏳️‍⚧️

huh. weird. when you try to backspace too far in the name entry screen, it goes "duh-nuh" at you, but that isn't connected to a LoadAndPlaySoundChunk call.

So it's using a different function for this ONE NOISE?

in reply to Foone🏳️‍⚧️

maybe it's hardcoded to pc speaker and I can't tell the difference between soundblaster and pc speaker because they're both coming out of the same laptop
in reply to Foone🏳️‍⚧️

YEP. muted my soundblaster (MIXER SB 0:0) and it's still duh-nuhing at me.

why would you do this to me, brøderbund?

in reply to Foone🏳️‍⚧️

ah-ha! I found 217.

DIGISND.DAT has PCM sound effects for 200-216.
But there's also chunks in CARMEN.DAT for 200-229.

I didn't think the ones in CARMEN.DAT were sound files because they're so small... but they're just the right size to be PC speaker sound effects!

in reply to Foone🏳️‍⚧️

the way the game works is that it loads CARMEN.DAT always, then if you have a sound card it supports, it loads DIGISND.DAT which replaces chunks 200-216 in memory with the DIGISND.DAT ones, which are PCM. But if you don't have a sound card, it still has the CARMEN.DAT ones loaded, and they're all pc speaker sound effects.
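The override behavior can be sketched in Python as a dictionary merge. The chunk ranges come from the posts above; the tuple payloads and function name are placeholders, and the real loader works on in-memory chunk tables rather than dicts.

```python
def load_sound_chunks(carmen_chunks: dict, digisnd_chunks: dict, have_sound_card: bool) -> dict:
    """CARMEN.DAT is always loaded; DIGISND.DAT replaces matching IDs if a card exists."""
    chunks = dict(carmen_chunks)
    if have_sound_card:
        chunks.update(digisnd_chunks)  # same IDs, PCM data wins
    return chunks


carmen = {cid: ("pc_speaker", cid) for cid in range(200, 230)}   # 200-229
digisnd = {cid: ("pcm", cid) for cid in range(200, 217)}         # 200-216

with_card = load_sound_chunks(carmen, digisnd, have_sound_card=True)
without_card = load_sound_chunks(carmen, digisnd, have_sound_card=False)
print(with_card[216][0], with_card[217][0], without_card[200][0])
# pcm pc_speaker pc_speaker
```

That is why chunk 217 plays even though DIGISND.DAT stops at 216: nothing ever replaced the PC speaker version.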
in reply to Foone🏳️‍⚧️

they hardcoded two sound effects into the EXE and the rest are loaded from the DAT files.

eww. Someone hacked something in at the last moment!

in reply to Foone🏳️‍⚧️

did some stats:
there's 729 functions in the EXE.
I've named (in some way, counting placeholders) 355 of them, or 49%
in reply to Foone🏳️‍⚧️

by placeholders I mean things like "pcjr_sound_related" or "VGAFunc8"

and 13 of those function names include the word "maybe"

in reply to Foone🏳️‍⚧️

I think they generated their hints wrong.
The *22 chunk for a city says something like "$SUSPECT was going to an opera with the president" or "$SUSPECT would be having tea with the Emperor", right?

but it's also got "drove away in a vehicle flying a green, blue, and yellow flag". which'd be fine, except that hint is also in *19!

I think they accidentally duplicated it when they generated the cities.dat file

in reply to Foone🏳️‍⚧️

this causes a glitch in the game where you can have 2 of your 3 informants give you the same flag-color hint, which is less than useful
in reply to Foone🏳️‍⚧️

ugh. ghidra really doesn't understand that you can call far functions using near calls.

and the compiler for this LOVES using them.

in reply to Foone🏳️‍⚧️

I might have explained this before, but normally a near call to a far function will break, because it'll pop 4 bytes off the stack for the return address, when the near call only pushed 2.

So you fix this by doing push CS first, so it'll pop the 2 from the call, and then the 2 you placed before.
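A toy Python model of the stack for that trick. The addresses are borrowed from earlier in the thread, and the `0xBEEF` word stands in for whatever unrelated data the caller already had on the stack; the function name is made up.

```python
def near_call_then_retf(cs: int, return_ip: int, push_cs_first: bool):
    """Return the (CS, IP) that RETF would jump back to."""
    stack = [0xBEEF]              # something unrelated already on the stack
    if push_cs_first:
        stack.append(cs)          # PUSH CS (1 byte: 0E)
    stack.append(return_ip)       # near CALL pushes only the return IP (3 bytes: E8 rel16)
    ret_ip = stack.pop()          # RETF pops IP first...
    ret_cs = stack.pop()          # ...then CS, whatever happens to be there
    return ret_cs, ret_ip


good = near_call_then_retf(0x17DA, 0x08AD, push_cs_first=True)
bad = near_call_then_retf(0x17DA, 0x08AD, push_cs_first=False)
print(f"with PUSH CS:    {good[0]:04X}:{good[1]:04X}")   # 17DA:08AD
print(f"without PUSH CS: {bad[0]:04X}:{bad[1]:04X}")     # BEEF:08AD
```

The far call encoding (`9A` plus a 4-byte pointer) is 5 bytes, so the 1 + 3 byte pair saves one byte per call site when the target is in the same segment.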

in reply to Foone🏳️‍⚧️

but ghidra doesn't understand that this is what's happening, so it hallucinates it as a parameter to the function that's CS.
in reply to Foone🏳️‍⚧️

so you'll see, for example, it decompiles a strlen as:

uint1 = strlen(0x1000, some_String);

which is less than useful

in reply to Foone🏳️‍⚧️

in 32bit we do 32bit calls and 32bit returns.
in 64bit we do 64bit calls and 64bit returns.

in 16bit we can do 16bit calls and 16bit returns, 32bit calls and 32bit returns, and sometimes we do a 16bit call to a 32bit return because it's slightly fewer bytes

in reply to Foone🏳️‍⚧️

one of my favorite stupid methods of reversing is "break it"

what's this function do? well, lemme disable it, and see what breaks.

Apparently this is the "restore the image under the cursor" function.

in reply to Foone🏳️‍⚧️

TIL, thanks, filing that trick away. (JK about to go use it on a problem)
in reply to Foone🏳️‍⚧️

DOSBox needs logging breakpoints. yet another thing to stick on my list of stuff-it'd-be-neat-to-have in my dosbox-debugger
in reply to Foone🏳️‍⚧️

i will convert you someday. mark my words

just as soon as i implement, like twenty years of hardware

in reply to gloriouscow

@gloriouscow oh I'm excitedly looking forward to it. I'm gonna use MartyPC as soon as a game I'm hacking on is supported in it
in reply to gloriouscow

@gloriouscow I think so, yeah.

Actually, it supports EGA/CGA/Hercules/MCGA as well, so I could definitely try running Carmen in MartyPC.
Maybe next session, I'm already halfway through a lot of nasty stuff

in reply to Foone🏳️‍⚧️

i can add breakpoint logging for you definitely tho

what do you imagine that looking like, a timestamp with cs:ip and breakpoint name when hit?

in reply to gloriouscow

@gloriouscow I'm specifically thinking of logging breakpoints of the sorts where you can attach an expression. So it's not just 0823:A35C DrawString, it's something like 0823:A35C DrawString(25, 60, "Foobar"), because you can define it with some kind of simple expression language. like I tell it when it hits 0823:A35C, it treats stack[4:6],stack[6:8] as ints, and stack[8:10] as a string. OllyDBG and X64dbg do this, and it's very handy for understanding more complex code
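Something like this Python sketch is what I mean by formatting a hit. The stack layout follows the example above (two ints at stack[4:6] and stack[6:8], a string pointer at stack[8:10]); treating the pointer as a near offset into a data block, the helper names, and the CP437 decoding are assumptions for illustration, not an existing debugger API.

```python
import struct


def read_cstring(memory: bytes, addr: int) -> str:
    """Read a NUL-terminated string starting at `addr`."""
    end = memory.index(b"\0", addr)
    return bytes(memory[addr:end]).decode("cp437")


def log_drawstring(cs: int, ip: int, stack: bytes, data: bytes) -> str:
    """Format one breakpoint hit; the first 4 stack bytes are the far return address."""
    a, b, ptr = struct.unpack_from("<HHH", stack, 4)
    return f'{cs:04X}:{ip:04X} DrawString({a}, {b}, "{read_cstring(data, ptr)}")'


# Fake machine state for the example
stack = struct.pack("<HHHHH", 0x0123, 0x0823, 25, 60, 0x0010)
data = bytearray(64)
data[0x10:0x17] = b"Foobar\0"

print(log_drawstring(0x0823, 0xA35C, stack, data))
# 0823:A35C DrawString(25, 60, "Foobar")
```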
in reply to Foone🏳️‍⚧️

yeah i've got something similar planned when i add Rhai scripting. you'll be able to attach a script that is evaluated when the breakpoint is hit

the script interpreter will have access to logging output facilities and the entire machine state

but that's a little ways off still

in reply to Foone🏳️‍⚧️

it'll really kick things up a notch.

i'm still waffling about using lua instead but i wrinkle my nose when i look at lua code

in reply to Foone🏳️‍⚧️

it's a good thing ghidra has both enums and equates, because the equates function only sometimes works.
in reply to Foone🏳️‍⚧️

1. This is a flat-earth-ass flight path. Apparently Where in the World is Carmen Sandiego? takes place on a rectangle-planet.
2. the world doesn't wrap. This path is longer than "just" crossing the pacific, which is how this flight actually goes.
in reply to Troldann Arothin

@troldann of course!

(there's only 435 routes, I can precalculate them offline and just embed the answers in the code)
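A sketch of that offline precalculation in Python, using the haversine formula for great-circle distance. The function names are mine, and the two coordinates are rough illustrative values, not data pulled from the game.

```python
import math
from itertools import combinations


def great_circle_km(a, b, radius_km=6371.0):
    """Haversine distance between two (latitude, longitude) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(h))


def route_table(cities: dict) -> dict:
    """Distance for every unordered pair of cities: n * (n - 1) / 2 entries."""
    return {
        (a, b): great_circle_km(cities[a], cities[b])
        for a, b in combinations(sorted(cities), 2)
    }


# Approximate coordinates, for illustration only
cities = {"New York": (40.7, -74.0), "Sydney": (-33.9, 151.2)}
table = route_table(cities)
print(round(table[("New York", "Sydney")]), "km")

# With 30 cities the table has 30 * 29 / 2 = 435 entries
dummy = {f"city{i:02d}": (i, 2 * i) for i in range(30)}
print(len(route_table(dummy)))  # 435
```

Because the haversine formula works on the sphere, the dateline wrap comes for free and the Pacific route wins without any special casing.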

in reply to Foone🏳️‍⚧️

@troldann a direct JFK-SYD flight? Even with Qantas’s Project Sunrise, it’s still a bit too far…
in reply to Foone🏳️‍⚧️

@troldann
Only 30 destinations? I guess that was just enough to keep each play through somewhat different without switching floppies 💾

30×(30−1)÷2

in reply to Bill Ricker

@BRicker @troldann

yeah, there's only 30. They had room for more, but I guess they had to stop somewhere.

The second disk has the CITIES.DAT file, which is 168kb. There's still 102kb free on that disk, so it would be doable to add another 20 or so cities

in reply to Foone🏳️‍⚧️

I wonder if an easy fix would be to have 3 maps, centered on the US / Europe / Asia, compute the linear distance on each and use the map with the shorter distance and most centered.
in reply to Foone🏳️‍⚧️

arg, this function is saving and storing part of its own return address.

It's a farcall (in 16bit mode), and it is looking at the stack to read the segment portion of the address, so it can save it away in a struct.

in reply to Foone🏳️‍⚧️

this is a SaveRegisters function, which also saves the cs:ip of the calling function.

but there's also a RestoreRegisters function, which ALSO restores the cs:ip of the calling function.
and then it returns.
to the restored cs:ip

THIS FUNCTION IS A DYNAMIC GOTO

in reply to Foone🏳️‍⚧️

gonna have to dig out the appropriate compiler and check if setjmp/longjmp compiles the same.

then cry

in reply to Foone🏳️‍⚧️

okay it is not the same setjmp/longjmp as the one MSC5.0 generates. But it's close enough that yeah, this is a setjmp
in reply to Foone🏳️‍⚧️

this also tells me that I'm probably right about this game being primarily written in C, but I'm wrong about which compiler was used. MSC5.0 doesn't match: it saves DX, when the setjmp used in carmen doesn't.
in reply to Foone🏳️‍⚧️

something that has some structure, rather than just slowly filling my hard drive with ancient compilers
in reply to Foone🏳️‍⚧️

okay so, the way main works is like this:

it calls initGame(), then setjmp.
if setjmp returns 0, it initializes the game.
if it returns 2, it goes into the main game loop.

except it's not really a loop? because the functions longjmp back to main(). it's a distributed dynamic goto loop

in reply to Foone🏳️‍⚧️

like if you do File->New, it longjmps(&env, 1).

which causes the game to reload from the beginning.

in reply to Foone🏳️‍⚧️

but the handler for file->new is inside main itself. so this global goto ends up being local
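
A rough model of that control flow, for anyone who hasn't seen setjmp used this way. Python has no setjmp/longjmp, so an exception stands in for longjmp here, and the names (LongJmp, file_new, the log strings) are invented for illustration. The return values 0, 1 and 2 follow the description above; how the real main() gets from "initialized" to "in the game loop" is a guess.

```python
class LongJmp(Exception):
    """Stand-in for longjmp(env, value); not how the C version works internally."""
    def __init__(self, value):
        super().__init__(value)
        self.value = value

def file_new():
    # Hypothetical handler somewhere deep in the call tree: jump back to main.
    raise LongJmp(1)

def main(events):
    log = []
    state = 0                      # what "setjmp" returned this time around
    pending = list(events)
    while True:
        try:
            if state == 0:
                log.append("init")
                state = 2          # assumption: fall through into the game
            elif state == 1:
                log.append("reload")
                state = 2
            else:
                if not pending:
                    return log
                event = pending.pop(0)
                log.append(event)
                if event == "File->New":
                    file_new()
        except LongJmp as jump:
            # Control resumes at the top of main, as if setjmp returned again.
            state = jump.value

print(main(["play", "File->New", "play"]))
```

The point of the sketch is that there's no ordinary loop condition driving the restart: any function can send control back to the top of main by jumping.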
in reply to Foone🏳️‍⚧️

MSC4.0's installer is "go read the manual, it'll tell you which directories to make and which files to copy there"
in reply to Foone🏳️‍⚧️

*smacks forehead*

of course this compiler doesn't take ANSI C. it's from three years BEFORE ANSI C

in reply to Foone🏳️‍⚧️

okay it looks closer, but it doesn't exactly match. I think it's just that I'm in the wrong model.
I only have Small installed. lemme find the floppy disk to install Compact/Medium/Large
in reply to Foone🏳️‍⚧️

and I think Large is the same, which is probably more likely to be what carmen is using.
in reply to Foone🏳️‍⚧️

it's not large... because large saves DX.

uh-oh. was MSC5 right all along, I just had the wrong model?

in reply to Foone🏳️‍⚧️

yeah. msc5 matches as well, if I set it to the right model.

WELL THAT WAS A WASTE OF TIME

in reply to Foone🏳️‍⚧️

I have a sneaking suspicion it was built with MSC5.1, not 5.0.

it's going to take an annoyingly long amount of time to verify that theory

in reply to Foone🏳️‍⚧️

I'm surprised that you don't have a decision tree with some automation to simplify the compiler detection process by now.
in reply to Foone🏳️‍⚧️

it turns out what I thought was my MSC5.0 install WAS 5.1

so I need to install MSC5.0, not MSC5.1

in reply to Foone🏳️‍⚧️

also I managed to get my include and lib directories backwards. \lib was full of .h files, and \include was full of .lib files
in reply to Foone🏳️‍⚧️

this is the kind of installation error that hasn't been possible since, like, 1991
in reply to Foone🏳️‍⚧️

yeah it's definitely not 5.0.

ugh. it's not 5.1 either. there may be some minor patch that I don't have access to

in reply to Foone🏳️‍⚧️

after extensive cross-referencing with the msc5.0 manual and the msc5.1 libraries being opened in a parallel copy of ghidra, I have finally been able to determine that the function I named sprintf_maybe is, in fact, _sprintf.

my hard work, as always, pays amazing dividends

in reply to Foone🏳️‍⚧️

I'm currently figuring out functions through the amazing insight of "the linker is simple and linear"

which means when I have _memmove, FUNC_1fb7_6db0, and _strcmp in the EXE, FUNC_1fb7_6db0 is probably not going to be an adlib sound driver. it's going to be something from the libc.
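
A minimal sketch of that heuristic, assuming you can export a function list sorted by address. Apart from FUNC_1fb7_6db0, the addresses and the non-libc names are made up for the example.

```python
def guess_from_neighbours(funcs, is_libc):
    """funcs: list of (address, name) sorted by address.
    Returns unnamed functions that sit directly between two libc functions."""
    guesses = []
    for prev, cur, nxt in zip(funcs, funcs[1:], funcs[2:]):
        if cur[1].startswith("FUNC_") and is_libc(prev[1]) and is_libc(nxt[1]):
            guesses.append(cur[1])
    return guesses

# Illustrative data only: offsets are invented, play_adlib is hypothetical.
funcs = [
    (0x6D40, "_memmove"),
    (0x6DB0, "FUNC_1fb7_6db0"),
    (0x6E10, "_strcmp"),
    (0x9000, "FUNC_1fb7_9000"),
    (0x9100, "play_adlib"),
]
libc_names = {"_memmove", "_strcmp"}
likely_libc = guess_from_neighbours(funcs, lambda name: name in libc_names)
print(likely_libc)
```

It only narrows the search: the flagged function still has to be matched against the library by hand.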

in reply to Foone🏳️‍⚧️

I've done this a lot especially in embedded stuff where you have an open source peripheral library or RTOS bolted onto closed source application code.

Find one xref to a SFR, match to the corresponding vendor HAL function, then you probably get 30 functions with minimal effort that are right before/after in the same order as the .c

in reply to Foone🏳️‍⚧️

okay I've got all the libc stuff named, other than some internal functions (which I don't have names for), and one weird memmove-ish function that I just named "memmoveish"

it looks very similar to memmove, but with an extra check or two, but I can't match it to anything in the library

in reply to Foone🏳️‍⚧️

Total funcs: 758
Unnamed funcs: 332
% named: 56.2%

pretty good for a day's work: nearly 4% done
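
For reference, the percentage is just named over total. A tiny helper (the function name is mine) reproduces the figure:

```python
def percent_named(total, unnamed):
    """Share of functions that have a name, as a percentage."""
    return 100.0 * (total - unnamed) / total

print(f"{percent_named(758, 332):.1f}%")   # prints 56.2%
```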

in reply to Foone🏳️‍⚧️

made a discovery:
Galleons of Glory: The Secret Voyage of Magellan, released by Brøderbund in 1990, uses the same DAT format for its game files.

I haven't looked into the EXE yet, but that definitely sounds like they're sharing code

in reply to Foone🏳️‍⚧️

the programmer credited for Galleons is Louis Ewens, who did work on several of the Carmen Sandiego games, but not the DOS-enhanced one.
in reply to Foone🏳️‍⚧️

oh wow, it looks like Prince of Persia (DOS) also uses this DAT format!

Sadly, while the source for Prince of Persia is available... it's for the Apple II version. The DOS version is a complete reimplementation

in reply to Foone🏳️‍⚧️

1991's The Treehouse uses DAT files, with some of the same names as carmen... but my parser fails on it. I think it's a variation in the format, so I'm a byte off or something
in reply to Foone🏳️‍⚧️

prince of persia 2 shows the same behavior. I think this is a different version of the Brøderbund Chunk Format
in reply to Foone🏳️‍⚧️

SDLPoP is based on reverse engineering of the DOS PoP, maybe I can see how they implement DAT file reading.

github.com/NagyD/SDLPoP

in reply to Foone🏳️‍⚧️

I've got my own code but it's not fully complete. I can't decompress all chunks yet
in reply to Foone🏳️‍⚧️

yeah from looking at the SDLPoP code, they've got some very familiar looking decompression code. Awesome.
in reply to Foone🏳️‍⚧️

I'm now doing some manual comparison of functions in PRINCE.EXE, and yep. they're byte-for-byte identical. There's shared code here! Awesome.
in reply to Foone🏳️‍⚧️

I wonder if it'd be worth automating this. I don't currently have any tools to let me find functions in binary A that are also in binary B
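
The simplest version of such a tool would hash each function's bytes and look for collisions. A sketch, assuming the function bodies have already been exported from each binary as name-to-bytes mappings (that export step is not shown, and the sample names and bytes are invented):

```python
import hashlib
from collections import defaultdict

def match_identical(funcs_a, funcs_b):
    """Each argument maps function name -> raw bytes of that function.
    Returns (name_in_a, name_in_b) pairs whose bytes are identical."""
    by_digest = defaultdict(list)
    for name, body in funcs_b.items():
        by_digest[hashlib.sha1(body).digest()].append(name)
    matches = []
    for name, body in funcs_a.items():
        for other in by_digest.get(hashlib.sha1(body).digest(), []):
            matches.append((name, other))
    return matches

# Made-up bytes standing in for two exported function tables.
carmen = {"FUNC_1000_0010": b"\x55\x8b\xec\x5d\xcb", "FUNC_1000_0020": b"\x90\xcb"}
prince = {"some_prince_func": b"\x55\x8b\xec\x5d\xcb", "other": b"\x33\xc0\xcb"}
print(match_identical(carmen, prince))
```

This only catches byte-for-byte matches. Functions containing absolute addresses or relocated segments would differ between binaries, so a real tool would probably need to mask those out first.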
in reply to Foone🏳️‍⚧️

You know what time it is, then? That's right! It's time to spend ten times as much time as you wanted to spend on this project making your own tools :'D.
in reply to Foone🏳️‍⚧️

It would be nice if the time spent was closer to 5 times or less. But no, in practice, it's an order of magnitude more time spent :'D.
in reply to Foone🏳️‍⚧️

hey look, Prince of Persia uses the same setjmp/longjmp mainloop design!

github.com/NagyD/SDLPoP/blob/7…

in reply to Foone🏳️‍⚧️

they did modify the random function though: the PoP one checks if the seed has been initialized. Carmen never does
in reply to Foone🏳️‍⚧️

the compression has a fun quirk: images can be compressed either top to bottom or left to right.

and the game switches between the two compression formats on a per-image basis.

So the developers just compressed each image both ways and used the smaller one. clever.

in reply to Foone🏳️‍⚧️

their compression algorithm is 87 bytes long. as long as supporting two algorithms saved at least 87 bytes, it was worth it
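
To illustrate why scan order matters, here is a toy version of "compress both ways, keep the smaller". It uses a trivial run-length encoder, not the game's actual algorithm, and the "LR"/"UD" labels for row order and column order are my own shorthand.

```python
def rle(data):
    """Toy run-length encoder: (count, value) byte pairs, runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)

def compress_best(rows):
    """rows: list of equal-length bytes objects, one per scanline."""
    row_major = b"".join(rows)
    col_major = bytes(row[x] for x in range(len(rows[0])) for row in rows)
    candidates = {"LR": rle(row_major), "UD": rle(col_major)}
    order = min(candidates, key=lambda k: len(candidates[k]))
    return order, candidates[order]

# Vertical stripes: every column is constant, so column order compresses better.
stripes = [bytes([0, 1, 0, 1])] * 4
order, packed = compress_best(stripes)
print(order, len(packed))
```

The decoder then needs one flag per image to know which order to write the pixels back in.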
in reply to Foone🏳️‍⚧️

I've now got a boolean that has three values (true, false, and 'image')

but it's okay, I have a permit: I'm non-binary.

in reply to Foone🏳️‍⚧️

I have successfully extracted the first image from the game, using the ported SDLPoP compression code!

1 compression method down, 3 to go.

in reply to gloriouscow

@gloriouscow everyone* knows canada only has one city, and it's Montreal.

* the 1990 game Where in the World is Carmen Sandiego? Enhanced

in reply to Foone🏳️‍⚧️

Rome is the first city in that list that uses the LZG_UD compression format, rather than the LZG_LR format. that's why it's crashing.
in reply to Foone🏳️‍⚧️

I'm getting some crashes. I think I'm gonna switch away from CFFI to just making a C wrapper around the code, and subprocessing that. That'll make it easier to debug why it's crashing
in reply to Foone🏳️‍⚧️

I can now extract every image in every DAT for Where in the World is Carmen Sandiego? (1990, Enhanced)!
in reply to Foone🏳️‍⚧️

d...do you mean this aka the probably longest thread in the world (lol got it?) came to an end?
in reply to Foone🏳️‍⚧️

working on a full dat exporter, to build a JSON of all the hints.

and I'm running into pronoun issues. Story of my fucking life.

in reply to Foone🏳️‍⚧️

yeah looks good.
gist.github.com/foone/82de72a0…

The misplaced entries (like Cairo having a leader hint of "left in a vehicle flying a red, white and black flag") are like that in the original data files. Brøderbund just got their hints miscategorized sometimes.

in reply to Foone🏳️‍⚧️

I'm like 90% sure that this game actually matches building types to what sorts of hints it gives you, and I'm also like 90% sure that this should have been obvious to me long ago
in reply to Foone🏳️‍⚧️

Idly playing Where in the USA is Carmen Sandiego, and found an unexpected example of "things that have changed since 1990": The IMAGE for New Hampshire!
It's the Old Man of the Mountain, which collapsed in 2004.
in reply to Foone🏳️‍⚧️

NH still uses the image all over everything. It's such a funny metaphor for NH in general: clinging on to the glory of the past despite its complete irrelevance today. I live there....
in reply to Foone🏳️‍⚧️

While the Old Man is no more, it is still fondly remembered. It is still used on highway signs, there's still plenty of memorabilia around, etc.
in reply to Foone🏳️‍⚧️

There could be a “Where in the timeline is Carmen Sandiego?” version
in reply to Foone🏳️‍⚧️

arg the way this game does travel can be really annoying
if you are in New Delhi and need to go to the USSR, but misclick on Oslo instead of Moscow, you can't just fly to Moscow from Oslo. You have to go back to New Delhi first
in reply to Foone🏳️‍⚧️

I got halfway to googling this hint before remembering I'M FROM THERE (that state, at least. I'm from the other end of it)
in reply to Foone🏳️‍⚧️

I should still have access to the Fodor's USA Travel Guide, if that is somehow useful in this effort
in reply to Foone🏳️‍⚧️

fun fact about Prince of Persia (which I am doing research on because of how it reuses code from Carmen or vice versa):

A copy of it leaked with symbols included, but it's not the most normal version you can imagine... it's the mac port recompiled for MIPS.

in reply to Foone🏳️‍⚧️

I think all I'd be able to get from it is some canonical names of library functions
in reply to Foone🏳️‍⚧️

tried bindiff: it doesn't like carmen.exe and binexport really doesn't like PRINCE.EXE, so that's a dead end for now
in reply to Foone🏳️‍⚧️

idea for debugging feature for dosbox:
press a button, then for the next X seconds, all modifications to the display memory are recorded along with the backtrace of what code changed it. So you could see a button get drawn, and check what code did that.
in reply to Foone🏳️‍⚧️

right now I'm doing this sorta manually by running dosbox with cycles=30 and watching it draw in real time
in reply to Foone🏳️‍⚧️

the original PC ran an 8088 at 4.77 MHz, which DOSBox emulates as 240 cycles.

so this is approximately equivalent to a half-megahertz PC
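
The arithmetic behind that estimate, as a sketch. It assumes emulated speed scales linearly with the cycles setting, which is a simplification:

```python
PC_MHZ = 4.77       # original IBM PC clock
PC_CYCLES = 240     # DOSBox cycles figure quoted for that machine

def approx_mhz(cycles):
    # Linear scaling is an assumption, good enough for a ballpark.
    return PC_MHZ * cycles / PC_CYCLES

print(approx_mhz(30))
```

At cycles=30 that comes out a little under 0.6 MHz, hence "half-megahertz".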

in reply to Foone🏳️‍⚧️

it has been zero days since MSC5's little "push cs;CALL (not CALLF) farfunction" trick has confused ghidra
in reply to Foone🏳️‍⚧️

PUSH DS
PUSH peel_ptr
PUSH DS
PUSH peel_ptr

the pointer so great they pushed it twice!

in reply to Foone🏳️‍⚧️

I assume first one is a context save and the second one is a parameter? Compilers were not the smartest back then...
in reply to Foone🏳️‍⚧️

running this software at 15 cycles/second, I can confirm that the creators of it definitely didn't do that.

their general approach is "I KNOW PROGRAMMERS WHO TRY TO AVOID OVERDRAW AND THEY'RE ALL COWARDS"

in reply to Foone🏳️‍⚧️

when it's trying to un-show a dialog box, it fills in the dialog box with black.
then white.
then it starts redrawing the background.
in reply to Foone🏳️‍⚧️

this only happens with movable dialogs. unmovable dialogs don't flash black+white.

which makes me think it's a bug rather than an intentional decision

in reply to Foone🏳️‍⚧️

oh good lord. when you open the Hall of Fame window, it paints the background light blue, then loads the background image which overwrites the light blue with dark blue
in reply to Foone🏳️‍⚧️

b8 13 29 MOV AX ,0x2913
50 PUSH AX
b8 00 00 MOV AX ,0x0
50 PUSH AX

POP QUIZ: The usual way to zero out a register on x86 is XOR AX,AX. This'd be only 2 bytes (31 C0). The compiler knows this. Why didn't it use XOR AX, AX here, instead of the bigger MOV AX, 0x0?

(It's not because optimizations were off!)

in reply to Foone🏳️‍⚧️

here's a hint: that disassembly is from the EXE, not from the memory of a running program.

(why would that matter?)

in reply to Foone🏳️‍⚧️

I named this variable SoundBlasterPort but now, thanks to crossreferencing with the Prince of Persia disassembly, I know it's actually sound_blaster_port
in reply to Foone🏳️‍⚧️

Total funcs: 762
Unnamed funcs: 293
% named: 61.5%

118 of those named functions have been marked as identical to ones from Prince Of Persia (or vice versa... I have no idea which game had this code first)

in reply to Foone🏳️‍⚧️

my initial theory of how the code sharing went:

Prince of Persia ->
Where in the World is Carmen Sandiego (enhanced) ->
Where in the USA is Carmen Sandiego (enhanced) ->
Galleons of Glory: The Secret Voyage of Magellan

in reply to Foone🏳️‍⚧️

1000:700b MOV CX,0x20
TimingLoop:
1000:700e LOOP TimingLoop

ahh, the good ol' days when "32 instructions" was a meaningful unit of time.

in reply to Foone🏳️‍⚧️

1. why does the PS/1 sound card use the gameport IO range?
2. WHY DID I HAVE TO READ THE DOSBOX-X SOURCE CODE TO FIND THIS OUT?
in reply to Foone🏳️‍⚧️

the game picks between "they flew off to X" and "they drove off to X" and "they rowed off to X" and "they sailed off to X" but it doesn't seem to do this with any smarts.
or if it does, the database is incorrect.

carmen apparently drove off to nepal from canada

in reply to Foone🏳️‍⚧️

The game also refers to the capital of china as Peking, which is weird considering it's been Beijing since 1945. I know it took a long while for everywhere to catch up, but by 1990 pretty much everyone was using Beijing. I guess they used an old atlas?
in reply to Foone🏳️‍⚧️

only a few years ago I told my mother that Beijing and Peking were the same city and she was surprised. I imagine thinking they’re two different cities is quite common.
in reply to Foone🏳️‍⚧️

another way in which this game shows that it's from 1990 is that the librarians will tell you anything about their patrons.

that shit stopped after 2001

in reply to Foone🏳️‍⚧️

what do you mean he changed his money to rupees?
You're in Sri Lanka! YOUR currency is rupees!
in reply to Foone🏳️‍⚧️

I'm experimenting with a way to show how DOS games render themselves.
Basically I'm recording a lossless video of the game running on a very slow CPU, then removing all the frames where nothing happens, and I'm playing it back sped up a lot.

The highlight of this video is how terrible the handling of the mouse cursor is! it's getting peeled and restored constantly
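
The frame-dropping step could look something like this. It's a sketch that treats each frame as an opaque bytes object (for example, the raw contents of one extracted image); decoding and re-encoding the video is left out.

```python
def drop_static_frames(frames):
    """Keep a frame only if it differs from the previously kept frame."""
    kept = []
    for frame in frames:
        if not kept or frame != kept[-1]:
            kept.append(frame)
    return kept

# Stand-in data: each letter represents one distinct screen image.
recording = [b"A", b"A", b"A", b"B", b"B", b"A", b"C", b"C"]
print(drop_static_frames(recording))
```

Comparing against the last kept frame, not against everything seen so far, means a screen that returns to an earlier state still shows up as a change.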

in reply to Foone🏳️‍⚧️

the mouse cursor appearing and disappearing is because they don't have multiple frame buffers: they have to hide the mouse cursor before they can draw anything, or the cursor would corrupt the newly drawn stuff if it happened to be over it.
so they solve this by hiding the cursor before every drawing command and showing it afterwards.

but instead of doing it once per screen, they're doing it once per command.

in reply to Foone🏳️‍⚧️

This normally would be invisible because all this happens over a single frame (or a couple), but running this slow makes it visible.
the GUI system they're using (I'm just calling it the broderbund UI in my reverse engineering work) DOES support avoiding this mess: you can tell it to hide the cursor, then when each sub-command tries to hide/restore it, it stays hidden, but they're not using it here.
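
One common way to build that kind of "stay hidden" feature is a hide counter, sketched below. Whether the Brøderbund UI actually uses a counter is an assumption on my part, and all names here are invented.

```python
class Cursor:
    def __init__(self):
        self.hide_depth = 0
        self.redraws = 0          # times the cursor was really erased or repainted

    def hide(self):
        if self.hide_depth == 0:
            self.redraws += 1     # actually erase it from the screen
        self.hide_depth += 1

    def show(self):
        self.hide_depth -= 1
        if self.hide_depth == 0:
            self.redraws += 1     # actually paint it again

def draw_command(cursor):
    cursor.hide()
    # ... draw something here ...
    cursor.show()

def draw_screen(cursor, commands, batch):
    """Run several drawing commands, optionally wrapped in one outer hide/show."""
    if batch:
        cursor.hide()
    for _ in range(commands):
        draw_command(cursor)
    if batch:
        cursor.show()
    return cursor.redraws

print(draw_screen(Cursor(), 10, batch=False), draw_screen(Cursor(), 10, batch=True))
```

With ten commands, the unbatched path touches the cursor twenty times and the batched path twice, which is the difference that becomes visible at very low cycle counts.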
in reply to Foone🏳️‍⚧️

ghidra (at least in x86 16-bit mode) has a really annoying bug where it decides that, instead of just passing a pointer-to-struct as an argument, the code is passing a pointer to the first member of the struct, just cast back to a pointer.
in reply to Foone🏳️‍⚧️

which is of course equivalent, but it means you get this code:

offset2_rect(-y - param_3->bottom,-x - param_3->right,
(Rect *)CONCAT22((char *)ds,&param_3->bottom),
(Rect *)CONCAT22((char *)ds,&param_3->bottom));

instead of:

offset2_rect(-y - param_3->bottom,-x - param_3->right, param_3, param_3);

in reply to Foone🏳️‍⚧️

broderbund::hide_cursor();
broderbund::show_cursor();

WERE YOU PUNKS GETTING PAID BY THE CYCLE?

in reply to Foone🏳️‍⚧️

right after this it checks if the mouse is even enabled (hey, it's 1990, not everyone has a mouse!)

I'd think that you would check that before you try to hide and redraw the cursor, but maybe this is exactly why I'm not employed writing educational games in 1990?

in reply to Foone🏳️‍⚧️

I smell a kludge addressing a poorly-understood bug/undocumented behavior.
in reply to Foone🏳️‍⚧️

MousePos is a struct with 2 shorts, x & y. they need MousePos.x & MousePos.y into CX and DX. BUT HOW?

LES DX, [MousePos.x]
MOV CX, ES

in reply to Foone🏳️‍⚧️

LES loads the far pointer at MousePos.x into the segment selector ES and the register DX. This is a far pointer in segmented mode, a 16bit segment selector plus a 16bit offset, like ES:DX or SS:BP or DS:1234
in reply to Foone🏳️‍⚧️

but there's no far pointer here! it's just using the code to load two 16-bit values at the same time.
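
A small model of what the instruction pair does to the data, using struct to mimic the little-endian memory layout. The coordinate values are made up, and the field order (x first, then y) is taken from the description above.

```python
import struct

def les(memory, addr):
    """Model of LES reg, [addr]: the first 16-bit word goes to the register,
    the second 16-bit word goes to ES."""
    reg, es = struct.unpack_from("<HH", memory, addr)
    return reg, es

# MousePos as two consecutive shorts; 320 and 100 are example values.
memory = struct.pack("<HH", 320, 100)

dx, es = les(memory, 0)   # LES DX, [MousePos.x]
cx = es                   # MOV CX, ES
print(dx, cx)
```

Nothing is ever dereferenced through ES:DX; the "far pointer" load is only being used as a two-word read.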
in reply to Foone🏳️‍⚧️

the only downside is that it clobbers ES, but if they already know they aren't using ES, this... theoretically could be faster?
in reply to Foone🏳️‍⚧️

okay on an original 8086, LES + MOV REG,REG is 29+2=31 cycles.

MOV REG, MEM*2 is 18*2=36 cycles.

I GUESS?

in reply to Foone🏳️‍⚧️

I wonder if this compiler is smart enough to do this or this is the ghostly hand of the most dreaded adversary of reverse engineers: HUMAN WRITTEN ASSEMBLY
in reply to Foone🏳️‍⚧️

compilers are programs. programs are predictable (with enough effort. if it was easy, we wouldn't need people like me)

humans are not predictable. reverse engineering what a human is doing is much, much harder.

in reply to Foone🏳️‍⚧️

so I think digipres.club/@foone/114611650… does make some sense: it's hiding and redrawing the cursor because it already moved the mouse position, and wants the on-screen cursor to match up.


in reply to Foone🏳️‍⚧️

it's just you'd think this would be like:
hide_cursor();
cursor.pos=newpos;
show_cursor();

but apparently the way the cursor hiding/showing works, hiding uses a saved position, while show_cursor uses the new global position.
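
A sketch of that behaviour, under the assumption that the cursor code remembers where it last painted. The class and field names are invented.

```python
class TrackedCursor:
    def __init__(self, pos):
        self.drawn_at = pos       # where the cursor is currently painted
        self.erased = []          # positions restored by hide()
        self.painted = []         # positions painted by show()

    def hide(self):
        # Restore the pixels under the spot where it was last drawn.
        self.erased.append(self.drawn_at)

    def show(self, mouse_pos):
        # Paint at wherever the mouse is now, and remember it for the next hide.
        self.drawn_at = mouse_pos
        self.painted.append(mouse_pos)

cursor = TrackedCursor((10, 10))
mouse_pos = (12, 15)              # the global position has already moved
cursor.hide()
cursor.show(mouse_pos)
print(cursor.erased, cursor.painted)
```

Under this model, a back-to-back hide and show is exactly how you move the on-screen cursor, so the pair isn't wasted work.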

in reply to Foone🏳️‍⚧️

so the code that uses the soundblaster auto-detects the IRQ in use by simply setting up a handler for every possible SB interrupt, then asking the SB to fire an interrupt. then it sees which one triggered.

Makes sense, but I wonder why this wasn't universal? why were programs always asking me which IRQ to use? maybe this isn't compatible with less-accurate SB clones?
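
The detection idea, simulated. There is no real hardware access here: FakeSoundCard is a stand-in, and the candidate IRQ list is an assumption, since the post only says "every possible SB interrupt".

```python
CANDIDATE_IRQS = (2, 5, 7, 10)    # assumed list of lines to probe

class FakeSoundCard:
    """Pretend card wired to a single IRQ line."""
    def __init__(self, irq):
        self.irq = irq

    def trigger_interrupt(self, handlers):
        handler = handlers.get(self.irq)
        if handler:
            handler()

def detect_irq(card):
    fired = []
    # Install a tiny handler on every candidate line...
    handlers = {irq: (lambda irq=irq: fired.append(irq)) for irq in CANDIDATE_IRQS}
    # ...ask the card to interrupt, then see which handler ran.
    card.trigger_interrupt(handlers)
    return fired[0] if len(fired) == 1 else None

print(detect_irq(FakeSoundCard(7)))
```

The sketch returns None when zero or several handlers fire, which is the failure case on real hardware: another device raising one of the probed lines at the wrong moment.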

in reply to Foone🏳️‍⚧️

at a guess, in case something else was using one of the “regular” SB IRQs that happened to fire at the same time?

I do not miss assigning non-conflicting IRQs from usually one of 3 options on every card.

in reply to Foone🏳️‍⚧️

currently in destructive-debugging mode.
I've a bunch of functions I tagged as "draw_relatedNN". Currently I'm down to 1, 2, 7, 9, 11, and 20.

So I'm running the game with them disabled (one at a time, natch) to see what doesn't render properly.

for example, when the first instruction of draw_related1 is a RET, suddenly animations don't play
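
If you wanted to script the "make the first instruction a RET" step, it's a one-byte patch. A sketch working on an in-memory copy of the code; finding the function's offset is not shown, and the sample bytes are invented. 0xC3 is a near RET and 0xCB is a far RET, so the right one depends on how the function gets called.

```python
RET_NEAR = 0xC3
RET_FAR = 0xCB

def stub_out(image, offset, far=True):
    """Return a copy of image with the byte at offset replaced by a return."""
    patched = bytearray(image)
    patched[offset] = RET_FAR if far else RET_NEAR
    return bytes(patched)

# Example bytes: PUSH BP; MOV BP,SP; POP BP; RETF
code = bytes([0x55, 0x8B, 0xEC, 0x5D, 0xCB])
print(stub_out(code, 0).hex())
```

This is only safe for functions where the caller cleans up the stack and nobody depends on the return value, so expect the occasional crash.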

in reply to Foone🏳️‍⚧️

ahh yes, nothing more suspicious than someone practicing their french in... Montreal?
in reply to Foone🏳️‍⚧️

Disabling draw_related2 produces blackout mode, as none of the static UI will render.
in reply to Foone🏳️‍⚧️

I was gonna give up on draw_related7 since disabling it didn't seem to change anything.

then I tried to quit...

in reply to Foone🏳️‍⚧️

the menu items are 1-indexed, sort of. it treats 0 as "the whole menu itself"

so like:
set_menu_enabled(TRUE, 1, Menu_Game);
sets the first item in Game to enabled, but
set_menu_enabled(TRUE, 0, Menu_Game);
sets the whole Game menu to enabled.
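
A guess at what that function is doing, written out. The argument order is copied from the calls above; the Menu class and the body are my own reconstruction, not the decompiled code.

```python
class Menu:
    def __init__(self, item_count):
        self.enabled = True
        self.items_enabled = [True] * item_count

def set_menu_enabled(enabled, index, menu):
    if index == 0:
        menu.enabled = enabled                    # 0 means the menu as a whole
    else:
        menu.items_enabled[index - 1] = enabled   # items count from 1

menu_game = Menu(item_count=3)
set_menu_enabled(False, 1, menu_game)   # disables only the first item
set_menu_enabled(False, 0, menu_game)   # disables the Game menu itself
print(menu_game.enabled, menu_game.items_enabled)
```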

in reply to Foone🏳️‍⚧️

the second thing the main() does (after setjmp) is try to unload the game.

this is a side-effect of how they're using setjmp to make main() a sort of event handler, so when they need to load the game's resources they don't know they're not already loaded, so it first tries to unload them, fails because they're not loaded, and THEN loads them

in reply to wyatt

@wyatt I think that'd still work? since it only sends a "hey SBcard, IRQ me!" message to one of them
in reply to Foone🏳️‍⚧️

Probably the other way round - it was likely too prone to false positives.

Printing something in the background (one of the few multi-tasking things DOS could do)? Instant IRQ7 misdetection.

Receiving a packet from a network card that's configured to an IRQ line which is also typical for a sound card? Well ...

in reply to KeyJ

@KeyJ this code does specifically turn off the printer/serial port IRQs while it's testing, but I don't doubt they could have easily missed one possible source of IRQs
in reply to Foone🏳️‍⚧️

The logical explanation is that the mouse position is updated outside of the drawing loop, for VBL reasons. So clear must refer to the last drawn position. (Ask me how I know!)
in reply to Foone🏳️‍⚧️

Hardware mouse cursors were such a tiny thing that was such a huge improvement for game developers. No more XOR and dirty mouse rectangles.
in reply to Foone🏳️‍⚧️

I remember mouse cursors flickering a ton back in the day.

I was just happy to have a mouse at the time

in reply to Foone🏳️‍⚧️

this strongly reminds me of badly written react.js apps (which is so easy to do, mind you)
in reply to Foone🏳️‍⚧️

ohh I wonder if I could do this for my research for my video on blobbers 🤔 there's actually very little information out there on how to optimize the drawing

Are you just setting the emulator to very low cycles?

in reply to Eniko Fox

@eniko yeah! I'm setting the cycles down to 15-50 (depending on the game) while recording, then I ffmpeg it out to individual PNGs so I can deal with it as frames
in reply to Foone🏳️‍⚧️

oh that'd be super useful, thanks! Let me figure out what titles I actually need this for and get back to you
in reply to Eniko Fox

oh hm this probably won't work for anything that double buffers will it?
in reply to Eniko Fox

@eniko not directly. I can make it work, though: I'll just record the back buffer instead
in reply to Foone🏳️‍⚧️

😮

well i know one of the big ones is lands of lore, because it doesn't just draw the scene but it also intersperses critters into it

i spent hours looking at the source for the scummvm engine but it's almost entirely uncommented and they do stuff like bitwise ops using raw numbers everywhere so it was just too hard to follow and figure out what was going on :/

in reply to Foone🏳️‍⚧️

@eniko spotted it in ram and I can see it drawing the graphics. I'll write some automation to make this an animation tomorrow
in reply to Foone🏳️‍⚧️

awesome! thank you so much 😁 if you can catch some enemies or other entity sprites in it that'd be even better
in reply to Foone🏳️‍⚧️

@eniko Is the back buffer of double buffered rendering code always in the same place?

I would've presumed it would depend on the given software's compilation or freed memory, though I'm more of a higher level programmer in that it's kind of hidden from the drawing cycles I've dealt with.

in reply to Alexander The 1st

@AT1ST @eniko yeah it depends on the game and what graphics hardware it targets. but it's usually easy to figure out where it is:

set a write breakpoint on the visible screen, and when it's hit, you now know what code writes to the screen. probably that's just a memcpy from main ram or vram, depending

in reply to Foone🏳️‍⚧️

One of the bonkers things to me is that games _still_ have "disable hardware cursor" as an option in this day and age.
in reply to Foone🏳️‍⚧️

ähm our gov still uses Peking 🤔😅. Maybe it has to do with the german language
bmeia.gv.at/oeb-peking
in reply to Foone🏳️‍⚧️

It's still "Peking" in German, for example.
Reading the German Wikipedia etymology section, that follows a Chinese postal romanization, which, apparently, fell out of use between the 1980s and the early 2000s.
en.wikipedia.org/wiki/Names_of… suggests that, yes, mid-80s would be a typical switch time.
Maybe the editor was German?
in reply to Foone🏳️‍⚧️

is this related to why sound cards have the game port/midi port? -someone who has only seen io jumpers
in reply to Foone🏳️‍⚧️

do you think you would be into (one day) a deep dive on one of the old BASIC softwares?
in reply to Foone🏳️‍⚧️

Heh - reminds me of the encoding trick with the PAUSE instruction on modern(ish) x86.
in reply to Marcel Waldvogel

@marcel No, because it doesn't look like any code got shared from that carmen to this one. I think they did a complete rewrite.
in reply to Foone🏳️‍⚧️

Ah, one is "World", one is "USA". I mistook the two "Carmen" entries in your inheritance tree to be identical.
All's well!
in reply to rf

@rf I'm gonna start assigning taxonomic names next. I'll have to figure out how to translate "carmen sandiego" into latin
in reply to Foone🏳️‍⚧️

SMC! But make sure you know if you're running on an 8088 versus an 8086 😀
in reply to Foone🏳️‍⚧️

fuck yeah, that was specs of my first PC. A Siemens telewriter that came with an onboard Hercules gfx card.

For a wild reason when I added a CGA card it jumped to 4.78 MHz. I still miss that old box

in reply to Foone🏳️‍⚧️

a write breakpoint on the memory location of the button on the screen would do that, right?
in reply to Mayday! Mayday! Robot

@StompyRobot yeah and that's the sort of thing I'm doing now. but it's not quick to calculate the address of the screen and then set a breakpoint. I'm thinking of an overkill tool that'd make it super quick to do this across many places in a game
in reply to Foone🏳️‍⚧️

do the symbols not include compilation unit names for all the non-inlined functions at least?
in reply to Foone🏳️‍⚧️

I wonder how it even came to be, maybe one of the 37 assorted cancelled PCs Apple engineers were working on was one with a MIPS CPU or something, maybe something related to Rhapsody
in reply to Devourer

@Devourer_ITA It's from a PS2 port! PoP Sands of Time included the original PoP as a bonus, and an early leaked version included the symbols
in reply to Foone🏳️‍⚧️

this figures because one irregularity might lead to another. Well known processes in turn tend to be the same.
in reply to Foone🏳️‍⚧️

Right. If it were in any way realistic, you'd have to fly to ATL first.
in reply to Foone🏳️‍⚧️

TBQF, some decades that's true IRL too.*

When m-i-l & darling went to see the Schliemann Gold of Troy during Glasnost, flights to Moscow meant changing planes in Finland or Poland.

*(for that SPECIFIC example. I do realize you gave an example of a rigid game mechanic. I just find it a peculiarly interesting example.)

in reply to Foone🏳️‍⚧️

I irritate my family by referring to it as "The Old Man in A Pile At The Bottom Of The Mountain" whenever we go past those signs in New Hampshire.
in reply to Foone🏳️‍⚧️

why would they change formats like that?? Are they antagonizing u from the past?
in reply to GhostInCerulean

@CutInBismuth I think it makes some images marginally smaller to compress them in a different way, so they swapped back and forth depending on which compressor was better for a given image
in reply to Foone🏳️‍⚧️

Transistors operated by a non-binary person automatically become trinary? Neat
in reply to Foone🏳️‍⚧️

I mean, I've written a couple of threading systems on top of setjmp, but using it for "the main loop" seems exciting...
in reply to Foone🏳️‍⚧️

I had the original Prince of Persia on my then top of the line Amstrad PPC. It was a groundbreaker. I think the first to use a form of motion capture to create the character movement animations.
in reply to Foone🏳️‍⚧️

I nearly named my kid ??2@YAPAXI@Z. never say we don't hold tight the scars of these compilers; authors of the many, many tomes of our chosen misfortunes
in reply to Foone🏳️‍⚧️

It is not that much work to add a compiler to Godbolt. github.com/compiler-explorer/c…
in reply to Foone🏳️‍⚧️

Wait, doesn't C have a goto command that you can use with text labels? I seem to recall my professor discussing when it was okay to use goto, as well as its dangers, in my first-year C class many years ago.
in reply to Canageek

@Canageek yeah, but this isn't a static goto, it's a dynamic one. the destination changes at runtime, which isn't something that goto can do
in reply to Foone🏳️‍⚧️

This means Carmen Sandiego lives in the Cobra Kai universe.
instagram.com/cobrakaiseries/r…
in reply to Foone🏳️‍⚧️

for a game ostensibly teaching geography that's horrendous!
(Otoh for intending to run on home PCs with 8088, 386, 486sx (emulated floating point only), that might be excusable? CORDIC integer trig is amazing but avoiding it might be forgivable.)
in reply to Foone🏳️‍⚧️

idk what they’re talking about but if the sky is red here on Monday at 9am something real bad has happened
in reply to Foone🏳️‍⚧️

I honestly expected this game to be written in ASM and not follow any calling convention? Or maybe things were less chaotic than I really knew back then?
in reply to MontyOnTheRun

@montyontherun it's partially in ASM (although possibly just some included libraries) but it's mainly C, since I can tell it was (mostly) compiled with Microsoft C 5.0
in reply to Foone🏳️‍⚧️

well, given that it is not graphically intensive, it makes sense. I just assumed people went the ASM way by default.

Thanks for the clarification!

in reply to Foone🏳️‍⚧️

based on your past experiences with this kind of code: is it only using the lower seven bits, so sound 217 is actually 89?
in reply to BetaRays

@BetaRays they're not ordinals, the data file specifies which chunk id the chunk is. there's a lot of gaps.
in reply to Foone🏳️‍⚧️

StartPlayingSound:

POP AX
POP BX
POP ES
POP SI
CALL EnableSound
PUSH BX
PUSH ES
PUSH SI
PUSH AX

🤣

in reply to Foone🏳️‍⚧️

that reminds me of my Amiga BASIC days. As it had no sleep or some such, it actually required:
FOR I = 1 TO 1000: NEXT

IIRC that was ~3s or something.
And yes, that thing was in the manual for some of the example programs.

in reply to Foone🏳️‍⚧️

Working on some compression tricks, I am assuming 10 bits per symbol, in order to be able to preprocess 8bpp graphics with delta compression in both x and y. And I'll have 3 symbols left just in case (for end of stream markers or other tricks).
in reply to Foone🏳️‍⚧️

I had a client that used the computer mouse inverted.

But not in software. I mean physically turning the mouse around 180° and having the cord come out where, uh, the tail would be, I guess.

(Now that I put it like that, it makes sense why someone might do that.)

The end result being both X/Y axes are inverted, and the buttons are in the wrong place.

But it *worked*. That's the way the brain learnt it. It's like spending your whole career in ETAOIN SHRDLU only to discover the rest of the world is on QWERTY. You'd probably just stick with what works.

in reply to Jeremy Visser

@jez yeah!
I knew a kid back in the 90s who played NES games with the controller backwards
in reply to Foone🏳️‍⚧️

i simply MUST know how you connected an internal 5.25" floppy drive to a modern x86 system
Unknown parent

Foone🏳️‍⚧️
@onfy yeah. I think because the mac version redrew a bunch of sprites and raised the resolution
in reply to Foone🏳️‍⚧️

@onfy Oh, you mean the Sony one, not the IBM one with the slash. It took me 5 minutes, including a quick search for “cancelled ibm pc mips”.
in reply to Jonah

@vjon @onfy yeah, sadly IBM never tried to move the PC to a different cpu architecture.
in reply to Foone🏳️‍⚧️

@onfy They did make the RS/6000, including that one weird PowerPC ThinkPad, so there's that…

All the RISC computer manufacturers just couldn't get rid of the idea that *their* workstation will sell despite being so expensive and they'll get rich despite lacking any plan for Step 2 in their 3-step scheme. Shame.
