First Draft at The New York Times + WordPress

Screen Shot 2014-09-29 at 4.00.56 PM

Last week, we launched a new vertical for politics at the Times: First Draft. First Draft is a morning newsletter/politics briefing and a web destination that is updated throughout the day by reporters at the Times’ Washington bureau.

First Draft is powered by WordPress. As I have noted previously, the Times runs on a variety of systems, but WordPress powers all of its blogs. “Blogs” have a bad connotation these days – not all things that have an informal tone and render posts in reverse-chronological order are necessarily a blog, but anything that does that should probably be using WordPress. WordPress has the ability to run any website, which is why it powers 23% of the entire internet. We are always looking for projects to power with WordPress at the Times, and First Draft was a perfect fit.

The features I am going to highlight that First Draft takes advantage of are things we take for granted in WordPress. Because we get so much for free, we can easily and powerfully build products and features.

Day Archives

First Draft is made up of Day archives. The idea is: you open the site in a tab at the beginning of the day, and you receive updates throughout the day when new content is available. A lot of the code that powers the New York Times proper is proprietary PHP. For another system to power First Draft, someone would have to re-invent day archives. And they might not think to also re-invent month and year archives. You may laugh, but those conversations happen. Even if the URL structure is present in another app, your API for retrieving data needs to be able to handle the parsing of these requests.

Day archives in WP “just work” – same with a lot of other types of archives. We think nothing of the fact that pretty URLs for hierarchical taxonomies “just work.” Proprietary frameworks tend to be missing these features out of the box.

Date Query

When WP_Date_Query landed in WordPress 3.7, it came with little fanfare. “Hey, sounds cool, maybe I’ll use it someday…” It’s one of those features that, when you DO need it, you can’t imagine a time when it didn’t exist. First Draft had some quirky requirements.

There is no “home page” per se, the home page needs to redirect to the most current day … unless it’s the weekend: posts made on Saturdays and Sundays are rolled up into Friday’s stream.

Home page requests are captured in 'parse_request', we need to find a post what was made on a day Monday-Friday (not the weekend):

$q = $wp->query_vars;

// This is a home request
// send it to the most recent weekday
if ( empty( $q ) ) {
    $q = new WP_Query( array(
        'ignore_sticky_posts' => true,
        'post_type' => 'post',
        'posts_per_page' => 1,
        'date_query' => array(
            'relation' => 'OR',
            array( 'dayofweek' => 2 ),
            array( 'dayofweek' => 3 ),
            array( 'dayofweek' => 4 ),
            array( 'dayofweek' => 5 ),
            array( 'dayofweek' => 6 ),
        )
    ) );

    if ( empty( $q->posts ) ) {
        return;
    }

    $time = strtotime( reset( $q->posts )->post_date );
    $day = date( 'd', $time );
    $monthnum = date( 'm', $time );
    $year = date( 'Y', $time );

    $url = get_day_link( $year, $monthnum, $day );
    wp_redirect( $url );
    exit();
}

If the request is indeed a day archive, we need to make sure we aren’t on Saturday or Sunday:

$vars = array_keys( $q );
$keys = array( 'year', 'day', 'monthnum' );

// this is not a day query
if ( array_diff( $keys, $vars ) ) {
    return;
}

$time = $this->vars_to_time(
    $q['monthnum'],
    $q['day'],
    $q['year']
);
$day = date( 'l', $time );

// Redirect Saturday and Sunday to Friday
$new_time = false;
switch ( $day ) {
case 'Saturday':
    $new_time = strtotime( '-1 day', $time );
    break;
case 'Sunday':
    $new_time = strtotime( '-2 day', $time );
    break;
}

// this is a Saturday/Sunday query, redirect to Friday
if ( $new_time ) {
    $day = date( 'd', $new_time );
    $monthnum = date( 'm', $new_time );
    $year = date( 'Y', $new_time );

    $url = get_day_link( $year, $monthnum, $day );
    wp_redirect( $url );
    exit();
}

In 'pre_get_posts', we need to figure out if we are on a Friday, and subsequently get Saturday and Sundays posts as well, assuming that Friday is not Today:

$query->set( 'posts_per_page', -1 );

$time = $this->vars_to_time(
    $query->get( 'monthnum' ),
    $query->get( 'day' ),
    $query->get( 'year' )
);
$day = date( 'l', $time );

if ( 'Friday' === $day ) {
    $before_time = strtotime( '+3 day', $time );

    $query->set( '_day', $query->get( 'day' ) );
    $query->set( '_monthnum', $query->get( 'monthnum' ) );
    $query->set( '_year', $query->get( 'year' ) );
    $query->set( 'day', '' );
    $query->set( 'monthnum', '' );
    $query->set( 'year', '' );

    $query->set( 'date_query', array(
        array(
            'compare' => 'BETWEEN',
            'before' => date( 'Y-m-d', $before_time ),
            'after' => date( 'Y-m-d', $time ),
            'inclusive' => true
        )
    ) );
}

Screen Shot 2014-09-29 at 4.25.06 PM

Adjacent day navigation links have to be aware of weekend rollups:

function first_draft_adjacent_day( $which = 'prev' ) {
    global $wp;

    $fd = FirstDraft_Theme::get_instance();

    if ( ! is_day() ) {
        return;
    }

    $archive_date = sprintf(
        '%s-%s-%s',
        $wp->query_vars[ 'year' ],
        $wp->query_vars[ 'monthnum' ],
        $wp->query_vars[ 'day' ]
    );

    if ( $archive_date === date( 'Y-m-d' ) && 'next' === $which ) {
        return;
    }

    $archive_time = strtotime( $archive_date );
    $day_name = date( 'l', $archive_time );

    if ( 'Thursday' === $day_name && 'next' === $which ) {
        $time = strtotime( '+1 day', $archive_time );
        $day = date( 'd', $time );
        $monthnum = date( 'm', $time );
        $year = date( 'Y', $time );

        $before_time = strtotime( '+3 day', $time );

        $ids = new WP_Query( array(
            'ignore_sticky_posts' => true,
            'fields' => 'ids',
            'posts_per_page' => -1,
            'date_query' => array(
                array(
                    'compare' => 'BETWEEN',
                    'before' => date( 'Y-m-d', $before_time ),
                    'after' => date( 'Y-m-d', $time ),
                    'inclusive' => true
                )
            )
        ) );

        if ( empty( $ids->posts ) ) {
            return;
        }

        $count = count( $ids->posts );

    } elseif ( 'Friday' === $day_name && 'next' === $which ) {
        $after_time = strtotime( '+3 days', $archive_time );

        $q = new WP_Query( array(
            'ignore_sticky_posts' => true,
            'post_type' => 'post',
            'posts_per_page' => 1,
            'order' => 'ASC',
            'date_query' => array(
                array(
                    'after' => date( 'Y-m-d', $after_time )
                )
            )
        ) );

        if ( empty( $q->posts ) ) {
            return;
        }

        $date = reset( $q->posts )->post_date;
        $time = strtotime( $date );

        $day = date( 'd', $time );
        $monthnum = date( 'm', $time );
        $year = date( 'Y', $time );

        $ids = new WP_Query( array(
            'ignore_sticky_posts' => true,
            'fields' => 'ids',
            'posts_per_page' => -1,
            'year' => $year,
            'month' => $monthnum,
            'day' => $day
        ) );

        $count = count( $ids->posts );

    } else {
        // find a post with an adjacent date
        $q = new WP_Query( array(
            'ignore_sticky_posts' => true,
            'post_type' => 'post',
            'posts_per_page' => 1,
            'order' => 'prev' === $which ? 'DESC' : 'ASC',
            'date_query' => array(
                'relation' => 'AND',
                array(
                    'prev' === $which ? 'before' : 'after' => array(
                        'year' => $fd->get_query_var( 'year' ),
                        'month' => (int) $fd->get_query_var( 'monthnum' ),
                        'day' => (int) $fd->get_query_var( 'day' )
                    )
                ),
                array(
                    'compare' => '!=',
                    'dayofweek' => 1
                ),
                array(
                    'compare' => '!=',
                    'dayofweek' => 7
                )
            )
        ) );

        if ( empty( $q->posts ) ) {
            return;
        }

        $date = reset( $q->posts )->post_date;
        $time = strtotime( $date );
        $name = date( 'l', $time );

        $day = date( 'd', $time );
        $monthnum = date( 'm', $time );
        $year = date( 'Y', $time );

        if ( 'Friday' === $name ) {
            $before_time = strtotime( '+3 days', $time );

            $ids = new WP_Query( array(
                'ignore_sticky_posts' => true,
                'fields' => 'ids',
                'posts_per_page' => -1,
                'date_query' => array(
                    array(
                        'compare' => 'BETWEEN',
                        'before' => date( 'Y-m-d', $before_time ),
                        'after' => date( 'Y-m-d', $time ),
                        'inclusive' => true
                    )
                )
            ) );
        } else {
            $ids = new WP_Query( array(
                'ignore_sticky_posts' => true,
                'fields' => 'ids',
                'posts_per_page' => -1,
                'year' => $year,
                'month' => $monthnum,
                'day' => $day
            ) );
        }

        $count = count( $ids->posts );
    }

    if ( 'prev' === $which && $time === strtotime( '-1 day' ) ) {
        $text = 'Yesterday';
    } else {
        $text = first_draft_month_format( 'D. M. d', $time );
    }

    $url = get_day_link( $year, $monthnum, $day );
    return compact( 'text', 'url', 'count' );
}

WP-API (the JSON API)

The New York Times is already using the new JSON API. When we needed to provide a stream for live updates, the WP-API was a far better solution (even in its alpha state) than XML-RPC. I implore you to look another developer in the face and tell them you want to build a cool new app, and you want to share data via XML-RPC. I’ve done it, they will not like you.

We needed to make some tweaks – date_query needs to be an allowed query var:

public function __construct() {
    ...
    add_filter( 'json_query_vars', array( $this, 'json_query_vars' ) );
    ...
}

public function json_query_vars( $vars ) {
    $vars[] = 'date_query';
    return $vars;
}

This will allow us to produce urls like so:


<meta name="live_stream_endpoint" content="http://www.nytimes.com/politics/first-draft/json/posts?filter[posts_per_page]=-1&filter[date_query][0][compare]=BETWEEN&filter[date_query][0][before]=2014-09-29&filter[date_query][0][after]=2014-09-26&filter[date_query][0][inclusive]=true"/>

Good times.

oEmbed

We take for granted: oEmbed is magic. Our reporters wanted to be able to “quick-publish” everything. Done and done. oEmbed also has the power of mostly being responsive, with minimal tweaks needed to ensure this.

How many proprietary systems have an oEmbed system like this? Probably none. Being able to paste a URL on a line by itself and, voila, you get a YouTube video or Tweet is pretty insane. TinyMCE previews inline while editing is even crazier.

Conclusion

There isn’t a lot of excitement about new “blogs” at the Times, but that distaste should not be confused with WordPress as a platform. WordPress is still a powerful tool, and is often a better solution than reinventing existing technologies in a proprietary system. First Draft is proof of that.

WordPress 4.0: Under the Hood

Today was the launch of WordPress 4.0, led by my friend, Helen Hou-Sandí. Helen was a great lead and easy to collaborate with. She had her hands in everything – I was hiding in the shadows dealing with tickets and architecture.

It seems like just 4.5 months ago I was celebrating the release of WordPress 3.9, probably because I was … “Some people, when they ship code, this is how they celebrate”:

[OLD INSTAGRAM POST]

WordPress 4.0 has an ominous-sounding name, but it was really just like any other release. I had one secret ambition: sweep out as many cobwebs as I could from the codebase and make some changes for the future. LOTS of people contribute to WordPress, so me committing changes to WordPress isn’t a solo tour, but there were a few things I was singularly focused on architecturally. I also contributed to 2 of the banner features in the release.

Cleanup from 3.9

  • Fixed RTL for playlists
  • Fixed <track>s when used as the body of a video shortcode
  • You can now upload .dfxp and .srt files
  • Added code to allow the loop attribute to work when MediaElement is playing your audio/video with Flash
  • MediaElement players now have the flat aesthetic and new offical colors
    You can now overide all MediaElement instance settings instead of just pluginPath
  • Bring the list of upload_filetypes for multisite into modernity based on .com upgrades and supported extensions for audio and video.
  • In the media modal, you can now set artist and album for your audio files inline
  • Gallery JS defaults are easier to override now: https://core.trac.wordpress.org/changeset/29284

Scrutinizer

Scrutinizer CI is a tool that analyzes your codebase for mistakes, sloppiness, complexity, duplication, and test coverage. If you work at a big fancy company that employs continous delivery methodologies, you may already have tools and instrumentation running on your code each time you commit, or scheduled throughout the day. I set up Scrutinizer to run everytime I updated my fork of WordPress on GitHub.

Scrutinizer is especially great at identifying unused/dead code cruft. My first tornado of commits in 4.0 were removing dead code all over WordPress core. Start here: https://core.trac.wordpress.org/changeset/28263

A good example of dead code: https://core.trac.wordpress.org/changeset/28292

extract()

extract() sucks. Here is a post about it: https://josephscott.org/archives/2009/02/i-dont-like-phps-extract-function/

Long story short: I eliminated all* of them in core. A good example: https://core.trac.wordpress.org/changeset/28469

* There is one left: here

Hack and HHVM

Facebook has done a lot of work to make PHP magically fast through HipHop, Hack, and HHVM. HHVM ships with a tool called hackificator that will convert your .php files to .hh files, unless you have some code that doesn’t jive with Hack’s stricter requirements.

While I see no path forward to be 100% compatible with the requirements for .hh files, we can get close. So I combed the hackificator output for things we COULD change and did.

Access Modifiers

WordPress has always dipped its toes in the OOP waters, but for a long time didn’t go all the way, because for a long time it had to support aspects of PHP4. There are still some files that need to be PHP4-compatible for install. For a lot of the classes in WordPress, now was a great time to upgrade them to PHP5 and use proper access modifiers (Hack also requires them).

I broke list tables several times along the way, but we cleaned them up and eventually got there.

wp_insert_post()/wp_insert_attachment() are now one

These two functions were very similar, but it was hard to see where they diverged, ESPECIALLY because of extract(). Once I removed the extract() code from them, it was easier to annotate their differences, however esoteric.

Funny timeline:

wp_handle_upload() and wp_handle_sideload() are now one

Similar to the above, these functions were almost identical. They diverged in esoteric ways. A new function exists, _wp_handle_upload(), that they both now wrap.

Done here: https://core.trac.wordpress.org/changeset/29209

wp_script_is() now recurses properly

Dependencies are a tree, not flat, so checking if a script is enqueued should recurse its tree and its tree dependencies. This previously only checked its immediate dependencies and didn’t recurse. I added a new method, recurse_deps(), to WP_Dependencies.

https://core.trac.wordpress.org/changeset/29252
And oops: https://core.trac.wordpress.org/changeset/29253

ORDER BY

Commit: https://core.trac.wordpress.org/changeset/29027
Make/Core post about it: A more powerful ORDER BY in WordPress 4.0

LIKE escape sanity

I helped @miqrogroove shepherd this: https://core.trac.wordpress.org/changeset/28711 and https://core.trac.wordpress.org/changeset/28712

Make/Core post: like_escape() is Deprecated in WordPress 4.0

wptexturize() overhaul

This was all @miqrogroove, I just supported him. He crushed a lot of tickets and made the function a whole lot faster. We need people like him to dig deep in areas like this.

Taxonomy Roadmap

Potential roadmap for taxonomy meta and post relationships: there is one
We cleared one of the major tickets: #17689. Here: https://core.trac.wordpress.org/changeset/28733

Variable variables

Variable variables are weird and are disallowed by Hack. Allows you to do this:

$woo = 'hoo';
$$woo = 'yeah';
echo $hoo;
// yeah!

I removed all(?) of these from core … some 3rd-party libraries might still contain them.

Started that tornado here: https://core.trac.wordpress.org/changeset/28734
One of my favorite commits: https://core.trac.wordpress.org/changeset/28743

Unit Tests

I did a lot of cleanup to Unit Tests along the way. Unit tests are our only way to stay sane while committing mountains of new code to WordPress.

Some highlights:

Embeds

In 3.9, I worked with Gregory Cornelius and Andrew Ozz to implement TinyMCE previews for audio, video, playlists, and galleries. We didn’t get to previews for 3rd-party embeds (like YouTube) in time, so I threw them into my Audio/Video Bonus Pack plugin as an extra feature. Once 4.0 started, Janneke Van Dorpe suggested we throw that code into core, so we did. From there, she and Andrew Ozz did many iterations of making embed previews of YouTube, Twitter, and the like possible.

Other improvements I worked on:

  • When using Insert From URL in the media modal, your embeds will appear as a preview inline.
  • Added oEmbed support for Issuu, Mixcloud, Animoto, and YouTube Playlist URLs.
  • You can use a src attribute for embed shortcodes now, instead of using the shortcode’s body (still works, but you can’t use both at the same time)
  • I added an embed handler for YouTube URLs like: http://youtube.com/embed/acb1233 (the YouTube iframe embed URLs) – those are now converted into proper urls like: http://youtube.com/watch?v=abc1233
  • This is fucking insane, I forgot I did this – if you select a poster image for a video shortcode in the media modal, and the video is confirmed to be an attachment that doesn’t have a poster image (videos are URLs so that external URLs don’t have to be attachments), the association will be made in the background via AJAX: https://core.trac.wordpress.org/changeset/29029

While we were working on embeds, we/I decided to COMPLETELY CHANGE MCE VIEWS FOR AUDIO/VIDEO:

https://core.trac.wordpress.org/changeset/29178
Wins: https://core.trac.wordpress.org/changeset/29179

In 3.9, we were checking the browser to see what audio/video files could be played natively and only showed those in the TinyMCE previews. Now we show all of them. This is due to the great work that @avryl and @azaozz did with implementing iframe sandboxes for embeds. I took their work and ran with it – completely changed the way audio/video/playlists render in the editor. The advantage is that the code that generates the shortcode only has to be in PHP, needs no JS equivalent.

Media Grid

@ericandrewlewis drove this train for most of the release. I came in towards beta to make sure all of the code-churn was up to Koop-like standards and dealt with some esoteric issues as they arose. It was great having my co-worker dive so deep into media, another asset to the core team who greatly needs people what knowledge of that domain.

As an example of the things I worked on – I stayed up til 6am one night chatting with Koop to figure out how to attack media-models.js. Once I filled my brain, I got to places like:
https://core.trac.wordpress.org/changeset/29490

So many contributors

There are too many people, features, and commits to mention in one blog post. This is just a journal of my time spent. You all rocked. Let’s keep going.

WordPress 4.0 “Benny”