FictionPress: Publishing 6 Million Works of Fiction on the Web

FictionPress operates both and and is home to over 6 million works of fiction, with millions of writers/readers participating from around the world in over 30 languages.

Issues addressed:

  • Support complex and efficient indexes at 100+ million rows.
  • Predicable and consistent performance regardless of data size growth.
  • Fast recovery.
  • Ensuring Predictable Performance at Scale

The Challenge:

FictionPress offers a number of interactive features to its large user base. These include discussion forums, in-site messaging and user reviews. FictionPress made the decision to build its own discussion forums to meet its strict security and performance requirements. Xing Li, CTO of FictionPress, noted that the site “needs to host hundreds of thousands of forums. Existing forum software doesn’t do this while meeting our performance and security targets.”

To ensure the real-time responsiveness of the forums, FictionPress needs the ability to create and efficiently maintain complex indexes and be able to support millions of small rows. In addition, it needs the ability to index them with minimal impact to resource costs and performance. “The only way to make this all work and provide a good customer experience is to guarantee that we can deliver a flat predictable performance with our database back-end even as the number of rows crosses the 100 million mark,” according to Li.

FictionPress considered InnoDB, the default storage engine for MySQL, but it did not offer predictable performance at scale. Indexes became dramatically slower as the number of rows increased, causing a reduction of both read and write performance. InnoDB also did not offer the performance-enhancing feature of multiple clustering indexes.

The Solution:

FictionPress uses MariaDB and TokuDB to manage its discussion forums, reviews, and in-site messaging systems.

FictionPress installed TokuDB in a Linux environment with dedicated hardware. Each configuration has a single master with multiple read slaves. “TokuDB’s high write concurrency and support for multiple clustering indexes gave us the freedom to design and deploy better performing queries at scale,” according to Li. This was important to FictionPress as its environment is continually expanding.

Predictable Performance: “While raw performance is important, the predictability of response time as one scales the system was our focal point” according to Li. “InnoDB can only have one clustering index, but TokuDB gives you basically an unlimited number. In addition, both MyISAM and InnoDB slow down with many indexes on databases of our size. MyISAM also causes replication lag due to concurrency. In the end, TokuDB gives us predictability, performance at scale, and more flexible indexing without the limitations found in other MySQL options.”

Cost: “To get additional performance, one can always throw hardware at the problem,” according to Li. “By utilizing TokuDB instead we improved scalability and at the same time saved on costs for additional server hardware that would have been required if TokuDB was not in the picture. In addition, we saw an 8x size reduction in disk space compared to MyISAM due to improved compression. The hardware cost saving made moving to TokuDB an easy decision.”

Crash Recovery: FictionPress had been using MyISAM initially. “We needed a replacement for MyISAM for small BLOB data,” according to Li. “In fact, we wanted to move away from MyISAM whenever possible to shorten its long crash recovery. InnoDB was an option but TokuDB offered better compression and a smaller storage footprint for both core data and index data for our own data sets.”

Hot Schema Changes: “For performance reasons we need a lot of indexes but also need to add and maintain these indexes quickly,” according to Li. “TokuDB is the only MySQL solution I found that offers Hot Schema changes such as Hot Indexing. Hot Schema changes are a powerful capability which we use to minimize downtime during system-wide upgrades and shorten our application/schema development cycle.”