Markov's Fanfiction writeup, part 2: Making the fics (published 2019-11-07)

This is part 2 of a series of writeups on my progress with my NaNoGenMo project, Markov's Fanfiction.

Once the stories have been collected, the next step is to use a Markov chain library to create a collection of words vaguely resembling a novel. My Markov library of choice for this project is markovify, because it does its best to split the text into sentences, meaning I don't have to do that work myself.

Once markovify has been given the text input, sentence generation needs to be configured. There's a fine balance to strike between not waiting 4 hours for output and not spitting out big chunks of the original stories. For this, I decided to set it up like so:

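# text is the markovify model built from the collected stories
next_sentence = text.make_sentence(
    tries=1000,               # attempts before giving up (returns None)
    max_overlap_ratio=0.8,    # allowed verbatim overlap, as a fraction of the sentence
    max_overlap_total=2**64,  # effectively disables the absolute word cap
)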

Specifically, markovify will try up to 1000 times to produce a sentence whose longest word-for-word run copied from the original stories is at most (roughly) 80% of the sentence's length. If every attempt fails, it returns None. Setting max_overlap_total to 2^64 was a simple way to take the absolute overlap word count out of the picture, since markovify uses the smaller of the two limits.
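To make the overlap rule more concrete, here is a rough sketch of the kind of check involved. It's a simplified pure-Python illustration of my understanding, not markovify's actual code: the function name passes_overlap_check, the whitespace word splitting, and the plain substring test against the source are all assumptions made for the example.

```python
def passes_overlap_check(sentence, source_text,
                         max_overlap_ratio=0.8,
                         max_overlap_total=2**64):
    """Return True if no overly long run of words is copied from source_text.

    Hypothetical helper for illustration only.
    """
    words = sentence.split()
    # The effective limit is the smaller of the ratio-based cap and the
    # absolute cap. With max_overlap_total=2**64, the ratio always wins.
    overlap_max = min(max_overlap_total, round(max_overlap_ratio * len(words)))
    window = overlap_max + 1
    # Slide a window one word longer than the limit across the sentence.
    gram_count = max(len(words) - overlap_max, 1)
    for i in range(gram_count):
        gram = " ".join(words[i:i + window])
        if gram in source_text:
            # A run longer than the limit appears verbatim in the source.
            return False
    return True
```

With a 5-word sentence and a ratio of 0.8, a run of 4 copied words is allowed, but copying all 5 is rejected. Lowering max_overlap_total to something small like 3 would make the absolute cap the stricter of the two instead.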

Combine that with a loop that keeps collecting sentences until it reaches NaNoGenMo's 50,000-word minimum, and you have yourself a NaNoGenMo project.
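A loop like that could look something like the sketch below. This is a minimal version under a few assumptions rather than the project's exact code: generate_novel and its parameters are hypothetical names, make_sentence stands for any zero-argument callable (for example, a lambda wrapping the text.make_sentence call above), and words are counted by splitting on whitespace. The None check is there because a sentence generator may give up without producing anything.

```python
def generate_novel(make_sentence, word_limit=50_000, max_failures=100):
    """Collect generated sentences until at least word_limit words exist.

    make_sentence: zero-argument callable returning a string, or None
    when no acceptable sentence could be produced.
    """
    sentences = []
    word_count = 0
    failures = 0  # consecutive failed attempts
    while word_count < word_limit:
        sentence = make_sentence()
        if sentence is None:
            failures += 1
            if failures >= max_failures:
                # Avoid looping forever if generation keeps failing.
                raise RuntimeError("too many failed attempts in a row")
            continue
        failures = 0
        sentences.append(sentence)
        word_count += len(sentence.split())
    return " ".join(sentences)
```

Because the loop only stops after the target is crossed, the result will usually overshoot 50,000 words by part of a sentence, which is fine for a word-count minimum.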

The only thing I might add past this is the ability to split the book into chapters. But for now, it's good enough.