You may know Stack Overflow and Stack Exchange, a network of question-and-answer sites covering almost every imaginable topic. All content generated on the sites is licensed under Creative Commons BY-SA 3.0 and can be downloaded as a complete database dump.
I wanted to build something that works with a huge amount of data, so these dumps were an ideal source.
Stackexchange Simulator uses Markov chains built from the database dumps to generate nonsense questions and corresponding answers. Visitors can then vote on them, so the most entertaining ones make it to the front page.
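
To give a rough idea of the technique, here is a minimal Python sketch of a word-level Markov chain text generator. It is not the site's actual code: the function names `build_chain` and `generate`, the `order` parameter, and the sample titles are all made up for illustration, and a real run would read its input from the database dump instead of a hard-coded list.

```python
import random
from collections import defaultdict


def build_chain(texts, order=2):
    """Map every `order`-word prefix to the words observed right after it.

    Also returns the opening prefix of each text, so generated output
    can begin the way a real text begins.
    """
    chain = defaultdict(list)
    starts = []
    for text in texts:
        words = text.split()
        if len(words) <= order:
            continue  # too short to contain a single transition
        starts.append(tuple(words[:order]))
        for i in range(len(words) - order):
            prefix = tuple(words[i:i + order])
            chain[prefix].append(words[i + order])
    return dict(chain), starts


def generate(chain, starts, max_words=30, rng=None):
    """Walk the chain from a random opening prefix.

    Stops at a prefix with no known follower or after `max_words` words.
    """
    if not starts:
        raise ValueError("no usable input texts")
    rng = rng or random.Random()
    prefix = rng.choice(starts)
    order = len(prefix)
    out = list(prefix)
    while len(out) < max_words:
        followers = chain.get(tuple(out[-order:]))
        if not followers:
            break
        # Duplicates in the list make frequent followers more likely.
        out.append(rng.choice(followers))
    return " ".join(out)


if __name__ == "__main__":
    # Hypothetical question titles standing in for the real dump.
    titles = [
        "how do I sort a list in python",
        "how do I parse a date in java",
        "why does my list in python keep growing",
    ]
    chain, starts = build_chain(titles, order=2)
    print(generate(chain, starts))
```

A higher `order` makes the output read more like the source text but also copies longer runs of it verbatim. A lower `order` produces more surprising, and usually sillier, combinations.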