First Draft at The New York Times + WordPress

Last week, we launched a new vertical for politics at the Times: First Draft. First Draft is a morning newsletter/politics briefing and a web destination that is updated throughout the day by reporters at the Times’ Washington bureau.

First Draft is powered by WordPress. As I have noted previously, the Times runs on a variety of systems, but WordPress powers all of its blogs. “Blogs” have a bad connotation these days – not all things that have an informal tone and render posts in reverse-chronological order are necessarily a blog, but anything that does that should probably be using WordPress. WordPress has the ability to run any website, which is why it powers 23% of the entire internet. We are always looking for projects to power with WordPress at the Times, and First Draft was a perfect fit.

The features I am going to highlight that First Draft takes advantage of are things we take for granted in WordPress. Because we get so much for free, we can easily and powerfully build products and features.

Day Archives

First Draft is made up of Day archives. The idea is: you open the site in a tab at the beginning of the day, and you receive updates throughout the day when new content is available. A lot of the code that powers the New York Times proper is proprietary PHP. For another system to power First Draft, someone would have to re-invent day archives. And they might not think to also re-invent month and year archives. You may laugh, but those conversations happen. Even if the URL structure is present in another app, your API for retrieving data needs to be able to handle the parsing of these requests.

Day archives in WP “just work” – same with a lot of other types of archives. We think nothing of the fact that pretty URLs for hierarchical taxonomies “just work.” Proprietary frameworks tend to be missing these features out of the box.

Date Query

When WP_Date_Query landed in WordPress 3.7, it came with little fanfare. “Hey, sounds cool, maybe I’ll use it someday…” It’s one of those features that, when you DO need it, you can’t imagine a time when it didn’t exist. First Draft had some quirky requirements.

There is no “home page” per se: the home page needs to redirect to the most current day … unless it’s the weekend, because posts made on Saturdays and Sundays are rolled up into Friday’s stream.

Home page requests are captured in 'parse_request'. We need to find the most recent post that was made on a weekday (Monday through Friday, not the weekend):

$q = $wp->query_vars;

// This is a home request
// send it to the most recent weekday
if ( empty( $q ) ) {
    $q = new WP_Query( array(
        'ignore_sticky_posts' => true,
        'post_type' => 'post',
        'posts_per_page' => 1,
        'date_query' => array(
            'relation' => 'OR',
            array( 'dayofweek' => 2 ),
            array( 'dayofweek' => 3 ),
            array( 'dayofweek' => 4 ),
            array( 'dayofweek' => 5 ),
            array( 'dayofweek' => 6 ),
        )
    ) );

    if ( empty( $q->posts ) ) {
        return;
    }

    $time = strtotime( reset( $q->posts )->post_date );
    $day = date( 'd', $time );
    $monthnum = date( 'm', $time );
    $year = date( 'Y', $time );

    $url = get_day_link( $year, $monthnum, $day );
    wp_redirect( $url );
    exit();
}

If the request is indeed a day archive, we need to make sure we aren’t on Saturday or Sunday:

$vars = array_keys( $q );
$keys = array( 'year', 'day', 'monthnum' );

// this is not a day query
if ( array_diff( $keys, $vars ) ) {
    return;
}

$time = $this->vars_to_time(
    $q['monthnum'],
    $q['day'],
    $q['year']
);
$day = date( 'l', $time );

// Redirect Saturday and Sunday to Friday
$new_time = false;
switch ( $day ) {
case 'Saturday':
    $new_time = strtotime( '-1 day', $time );
    break;
case 'Sunday':
    $new_time = strtotime( '-2 day', $time );
    break;
}

// this is a Saturday/Sunday query, redirect to Friday
if ( $new_time ) {
    $day = date( 'd', $new_time );
    $monthnum = date( 'm', $new_time );
    $year = date( 'Y', $new_time );

    $url = get_day_link( $year, $monthnum, $day );
    wp_redirect( $url );
    exit();
}
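
The snippets above call $this->vars_to_time(), which isn't shown in this post. Here is a minimal sketch of what such a helper and the weekend rollup might look like in plain PHP. The function names fd_vars_to_time() and fd_rollup_to_friday() are hypothetical, and the timezone is pinned to UTC only to keep the example deterministic.

```php
<?php
// Hypothetical helpers, not the actual First Draft code.
date_default_timezone_set( 'UTC' );

// Turn the month/day/year query vars into a Unix timestamp (midnight).
function fd_vars_to_time( $monthnum, $day, $year ) {
    return mktime( 0, 0, 0, (int) $monthnum, (int) $day, (int) $year );
}

// Roll Saturday and Sunday back to the preceding Friday.
function fd_rollup_to_friday( $time ) {
    switch ( date( 'l', $time ) ) {
    case 'Saturday':
        return strtotime( '-1 day', $time );
    case 'Sunday':
        return strtotime( '-2 day', $time );
    }

    // Weekdays are left alone.
    return $time;
}

// September 27, 2014 was a Saturday.
$saturday = fd_vars_to_time( '09', '27', '2014' );
echo date( 'Y-m-d', fd_rollup_to_friday( $saturday ) ), "\n"; // 2014-09-26
```

Keeping the rollup in one small function like this would make it easy to test without loading WordPress at all.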

In 'pre_get_posts', we need to figure out if we are on a Friday and, if so, get Saturday’s and Sunday’s posts as well, assuming that Friday is not today:

$query->set( 'posts_per_page', -1 );

$time = $this->vars_to_time(
    $query->get( 'monthnum' ),
    $query->get( 'day' ),
    $query->get( 'year' )
);
$day = date( 'l', $time );

if ( 'Friday' === $day ) {
    $before_time = strtotime( '+3 day', $time );

    $query->set( '_day', $query->get( 'day' ) );
    $query->set( '_monthnum', $query->get( 'monthnum' ) );
    $query->set( '_year', $query->get( 'year' ) );
    $query->set( 'day', '' );
    $query->set( 'monthnum', '' );
    $query->set( 'year', '' );

    $query->set( 'date_query', array(
        array(
            'compare' => 'BETWEEN',
            'before' => date( 'Y-m-d', $before_time ),
            'after' => date( 'Y-m-d', $time ),
            'inclusive' => true
        )
    ) );
}
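
To make the Friday window concrete, here is a small sketch in plain PHP that builds the same date_query array for a given Friday. The helper name fd_friday_date_query() is hypothetical, and UTC is set only so the output is predictable.

```php
<?php
date_default_timezone_set( 'UTC' );

// Hypothetical helper: build the date_query used for a Friday stream.
function fd_friday_date_query( $friday_time ) {
    // Same arithmetic as above: the window closes three days after Friday.
    $before_time = strtotime( '+3 day', $friday_time );

    return array(
        array(
            'compare'   => 'BETWEEN',
            'before'    => date( 'Y-m-d', $before_time ),
            'after'     => date( 'Y-m-d', $friday_time ),
            'inclusive' => true,
        ),
    );
}

// September 26, 2014 was a Friday.
$date_query = fd_friday_date_query( strtotime( '2014-09-26' ) );
echo $date_query[0]['after'], ' to ', $date_query[0]['before'], "\n"; // 2014-09-26 to 2014-09-29
```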

Adjacent day navigation links have to be aware of weekend rollups:

function first_draft_adjacent_day( $which = 'prev' ) {
    global $wp;

    $fd = FirstDraft_Theme::get_instance();

    if ( ! is_day() ) {
        return;
    }

    $archive_date = sprintf(
        '%s-%s-%s',
        $wp->query_vars[ 'year' ],
        $wp->query_vars[ 'monthnum' ],
        $wp->query_vars[ 'day' ]
    );

    if ( $archive_date === date( 'Y-m-d' ) && 'next' === $which ) {
        return;
    }

    $archive_time = strtotime( $archive_date );
    $day_name = date( 'l', $archive_time );

    if ( 'Thursday' === $day_name && 'next' === $which ) {
        $time = strtotime( '+1 day', $archive_time );
        $day = date( 'd', $time );
        $monthnum = date( 'm', $time );
        $year = date( 'Y', $time );

        $before_time = strtotime( '+3 day', $time );

        $ids = new WP_Query( array(
            'ignore_sticky_posts' => true,
            'fields' => 'ids',
            'posts_per_page' => -1,
            'date_query' => array(
                array(
                    'compare' => 'BETWEEN',
                    'before' => date( 'Y-m-d', $before_time ),
                    'after' => date( 'Y-m-d', $time ),
                    'inclusive' => true
                )
            )
        ) );

        if ( empty( $ids->posts ) ) {
            return;
        }

        $count = count( $ids->posts );

    } elseif ( 'Friday' === $day_name && 'next' === $which ) {
        $after_time = strtotime( '+3 days', $archive_time );

        $q = new WP_Query( array(
            'ignore_sticky_posts' => true,
            'post_type' => 'post',
            'posts_per_page' => 1,
            'order' => 'ASC',
            'date_query' => array(
                array(
                    'after' => date( 'Y-m-d', $after_time )
                )
            )
        ) );

        if ( empty( $q->posts ) ) {
            return;
        }

        $date = reset( $q->posts )->post_date;
        $time = strtotime( $date );

        $day = date( 'd', $time );
        $monthnum = date( 'm', $time );
        $year = date( 'Y', $time );

        $ids = new WP_Query( array(
            'ignore_sticky_posts' => true,
            'fields' => 'ids',
            'posts_per_page' => -1,
            'year' => $year,
            'monthnum' => $monthnum,
            'day' => $day
        ) );

        $count = count( $ids->posts );

    } else {
        // find a post with an adjacent date
        $q = new WP_Query( array(
            'ignore_sticky_posts' => true,
            'post_type' => 'post',
            'posts_per_page' => 1,
            'order' => 'prev' === $which ? 'DESC' : 'ASC',
            'date_query' => array(
                'relation' => 'AND',
                array(
                    'prev' === $which ? 'before' : 'after' => array(
                        'year' => $fd->get_query_var( 'year' ),
                        'month' => (int) $fd->get_query_var( 'monthnum' ),
                        'day' => (int) $fd->get_query_var( 'day' )
                    )
                ),
                array(
                    'compare' => '!=',
                    'dayofweek' => 1
                ),
                array(
                    'compare' => '!=',
                    'dayofweek' => 7
                )
            )
        ) );

        if ( empty( $q->posts ) ) {
            return;
        }

        $date = reset( $q->posts )->post_date;
        $time = strtotime( $date );
        $name = date( 'l', $time );

        $day = date( 'd', $time );
        $monthnum = date( 'm', $time );
        $year = date( 'Y', $time );

        if ( 'Friday' === $name ) {
            $before_time = strtotime( '+3 days', $time );

            $ids = new WP_Query( array(
                'ignore_sticky_posts' => true,
                'fields' => 'ids',
                'posts_per_page' => -1,
                'date_query' => array(
                    array(
                        'compare' => 'BETWEEN',
                        'before' => date( 'Y-m-d', $before_time ),
                        'after' => date( 'Y-m-d', $time ),
                        'inclusive' => true
                    )
                )
            ) );
        } else {
            $ids = new WP_Query( array(
                'ignore_sticky_posts' => true,
                'fields' => 'ids',
                'posts_per_page' => -1,
                'year' => $year,
                'monthnum' => $monthnum,
                'day' => $day
            ) );
        }

        $count = count( $ids->posts );
    }

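    // compare calendar dates, since $time carries the post's time of day
    if ( 'prev' === $which && date( 'Y-m-d', $time ) === date( 'Y-m-d', strtotime( '-1 day' ) ) ) {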
        $text = 'Yesterday';
    } else {
        $text = first_draft_month_format( 'D. M. d', $time );
    }

    $url = get_day_link( $year, $monthnum, $day );
    return compact( 'text', 'url', 'count' );
}

WP-API (the JSON API)

The New York Times is already using the new JSON API. When we needed to provide a stream for live updates, the WP-API was a far better solution (even in its alpha state) than XML-RPC. I implore you to look another developer in the face and tell them you want to build a cool new app, and you want to share data via XML-RPC. I’ve done it; they will not like you.

We needed to make some tweaks – date_query needs to be an allowed query var:

public function __construct() {
    ...
    add_filter( 'json_query_vars', array( $this, 'json_query_vars' ) );
    ...
}

public function json_query_vars( $vars ) {
    $vars[] = 'date_query';
    return $vars;
}

This will allow us to produce URLs like so:


<meta name="live_stream_endpoint" content="http://www.nytimes.com/politics/first-draft/json/posts?filter[posts_per_page]=-1&filter[date_query][0][compare]=BETWEEN&filter[date_query][0][before]=2014-09-29&filter[date_query][0][after]=2014-09-26&filter[date_query][0][inclusive]=true"/>

Good times.
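
If you are wondering how a query string like that gets assembled, PHP's http_build_query() does the bracket nesting for you. This is only a sketch: the post doesn't show how the endpoint URL was actually generated, and the values are copied from the meta tag above. The brackets are URL-decoded at the end purely for readability.

```php
<?php
// Values copied from the example endpoint; 'true' is a string on purpose,
// because http_build_query() would serialize a boolean true as 1.
$filter = array(
    'posts_per_page' => -1,
    'date_query'     => array(
        array(
            'compare'   => 'BETWEEN',
            'before'    => '2014-09-29',
            'after'     => '2014-09-26',
            'inclusive' => 'true',
        ),
    ),
);

$base  = 'http://www.nytimes.com/politics/first-draft/json/posts';
$query = http_build_query( array( 'filter' => $filter ), '', '&' );

// http_build_query() percent-encodes the brackets; decode them for display.
$url = $base . '?' . urldecode( $query );
echo $url, "\n";
```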

oEmbed

We take it for granted, but oEmbed is magic. Our reporters wanted to be able to “quick-publish” everything. Done and done. oEmbed output is also mostly responsive out of the box, with only minimal tweaks needed to ensure this.

How many proprietary systems have an oEmbed system like this? Probably none. Being able to paste a URL on a line by itself and, voila, get a YouTube video or Tweet is pretty insane. Getting TinyMCE previews inline while editing is even crazier.

Conclusion

There isn’t a lot of excitement about new “blogs” at the Times, but that distaste should not be confused with distaste for WordPress as a platform. WordPress is still a powerful tool, and is often a better solution than reinventing existing technologies in a proprietary system. First Draft is proof of that.

WordPress 4.0: Under the Hood

Today was the launch of WordPress 4.0, led by my friend, Helen Hou-Sandí. Helen was a great lead and easy to collaborate with. She had her hands in everything – I was hiding in the shadows dealing with tickets and architecture.

It seems like just 4.5 months ago I was celebrating the release of WordPress 3.9, probably because I was … “Some people, when they ship code, this is how they celebrate”:

WordPress 4.0 has an ominous-sounding name, but it was really just like any other release. I had one secret ambition: sweep out as many cobwebs as I could from the codebase and make some changes for the future. LOTS of people contribute to WordPress, so me committing changes to WordPress isn’t a solo tour, but there were a few things I was singularly focused on architecturally. I also contributed to 2 of the banner features in the release.

Cleanup from 3.9

  • Fixed RTL for playlists
  • Fixed <track>s when used as the body of a video shortcode
  • You can now upload .dfxp and .srt files
  • Added code to allow the loop attribute to work when MediaElement is playing your audio/video with Flash
  • MediaElement players now have the flat aesthetic and new official colors
  • You can now override all MediaElement instance settings instead of just pluginPath
  • Bring the list of upload_filetypes for multisite into modernity based on .com upgrades and supported extensions for audio and video.
  • In the media modal, you can now set artist and album for your audio files inline
  • Gallery JS defaults are easier to override now: https://core.trac.wordpress.org/changeset/29284

Scrutinizer

Scrutinizer CI is a tool that analyzes your codebase for mistakes, sloppiness, complexity, duplication, and test coverage. If you work at a big fancy company that employs continuous delivery methodologies, you may already have tools and instrumentation running on your code each time you commit, or scheduled throughout the day. I set up Scrutinizer to run every time I updated my fork of WordPress on GitHub.

Scrutinizer is especially great at identifying unused/dead code cruft. My first tornado of commits in 4.0 was removing dead code all over WordPress core. Start here: https://core.trac.wordpress.org/changeset/28263

A good example of dead code: https://core.trac.wordpress.org/changeset/28292

extract()

extract() sucks. Here is a post about it: https://josephscott.org/archives/2009/02/i-dont-like-phps-extract-function/

Long story short: I eliminated all* of them in core. A good example: https://core.trac.wordpress.org/changeset/28469

* There is one left: here
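
If you have never been bitten by extract(), here is a small illustration of the problem in plain PHP. The function names are made up for the example and are not from core.

```php
<?php
// Hypothetical function names, for illustration only.

// With extract(): every key in $args silently becomes a local variable,
// so a caller can clobber $limit without the function ever naming it.
function fd_with_extract( $args ) {
    $limit = 10;
    extract( $args );
    return $limit;
}

// Without extract(): each value is read explicitly, with a default,
// and nothing else in $args can touch the local scope.
function fd_without_extract( $args ) {
    $limit = 10;
    $title = isset( $args['title'] ) ? $args['title'] : '';
    return $limit;
}

$args = array( 'title' => 'Hello', 'limit' => 9999 );
var_dump( fd_with_extract( $args ) );    // int(9999)
var_dump( fd_without_extract( $args ) ); // int(10)
```

The second version is more typing, but you can see where every variable comes from, which is exactly what static analysis tools need too.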

Hack and HHVM

Facebook has done a lot of work to make PHP magically fast through HipHop, Hack, and HHVM. HHVM ships with a tool called hackificator that will convert your .php files to .hh files, unless you have some code that doesn’t jibe with Hack’s stricter requirements.

While I see no path forward to be 100% compatible with the requirements for .hh files, we can get close. So I combed the hackificator output for things we COULD change and did.

Access Modifiers

WordPress has always dipped its toes in the OOP waters, but for a long time didn’t go all the way, because for a long time it had to support aspects of PHP4. There are still some files that need to be PHP4-compatible for install. For a lot of the classes in WordPress, now was a great time to upgrade them to PHP5 and use proper access modifiers (Hack also requires them).

I broke list tables several times along the way, but we cleaned them up and eventually got there.
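
For anyone who hasn't seen the difference, here is a toy before-and-after in plain PHP. The class names are hypothetical and are not taken from core.

```php
<?php
// PHP4-style: "var" makes the property public, and methods carry no
// visibility keyword (so they default to public).
class FD_Old_Counter {
    var $count = 0;

    function increment() {
        $this->count++;
    }
}

// PHP5-style: state is protected and only reachable through methods.
class FD_New_Counter {
    protected $count = 0;

    public function increment() {
        $this->count++;
    }

    public function get_count() {
        return $this->count;
    }
}

$old = new FD_Old_Counter();
$old->count = 100; // Nothing stops outside code from doing this.

$new = new FD_New_Counter();
$new->increment();
echo $new->get_count(), "\n"; // 1
```

This is also a hint at why such a change can break things: any code that reached into a property directly stops working once that property is no longer public.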

wp_insert_post()/wp_insert_attachment() are now one

These two functions were very similar, but it was hard to see where they diverged, ESPECIALLY because of extract(). Once I removed the extract() code from them, it was easier to annotate their differences, however esoteric.

Funny timeline:

wp_handle_upload() and wp_handle_sideload() are now one

Similar to the above, these functions were almost identical. They diverged in esoteric ways. A new function exists, _wp_handle_upload(), that they both now wrap.

Done here: https://core.trac.wordpress.org/changeset/29209

wp_script_is() now recurses properly

Dependencies are a tree, not flat, so checking if a script is enqueued should recurse its tree and its tree dependencies. This previously only checked its immediate dependencies and didn’t recurse. I added a new method, recurse_deps(), to WP_Dependencies.

https://core.trac.wordpress.org/changeset/29252
And oops: https://core.trac.wordpress.org/changeset/29253
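
The idea is easier to see with a toy example. This is a sketch of recursing a dependency tree in plain PHP, not the recurse_deps() implementation from core; the registry array and the function name fd_depends_on() are hypothetical.

```php
<?php
// Hypothetical registry: handle => list of direct dependencies.
$registered = array(
    'my-app'     => array( 'backbone' ),
    'backbone'   => array( 'underscore', 'jquery' ),
    'underscore' => array(),
    'jquery'     => array(),
);

// Does $handle depend on $needle, directly or through its dependencies?
function fd_depends_on( array $registered, $handle, $needle, array $seen = array() ) {
    // Unknown handles have no dependencies; $seen guards against cycles.
    if ( isset( $seen[ $handle ] ) || ! isset( $registered[ $handle ] ) ) {
        return false;
    }
    $seen[ $handle ] = true;

    foreach ( $registered[ $handle ] as $dep ) {
        if ( $dep === $needle || fd_depends_on( $registered, $dep, $needle, $seen ) ) {
            return true;
        }
    }

    return false;
}

// A flat check only looks at immediate dependencies and misses jquery.
var_dump( in_array( 'jquery', $registered['my-app'], true ) );    // bool(false)
// The recursive check finds it through backbone.
var_dump( fd_depends_on( $registered, 'my-app', 'jquery' ) );     // bool(true)
```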

ORDER BY

Commit: https://core.trac.wordpress.org/changeset/29027
Make/Core post about it: A more powerful ORDER BY in WordPress 4.0

LIKE escape sanity

I helped @miqrogroove shepherd this: https://core.trac.wordpress.org/changeset/28711 and https://core.trac.wordpress.org/changeset/28712

Make/Core post: like_escape() is Deprecated in WordPress 4.0

wptexturize() overhaul

This was all @miqrogroove, I just supported him. He crushed a lot of tickets and made the function a whole lot faster. We need people like him to dig deep in areas like this.

Taxonomy Roadmap

Potential roadmap for taxonomy meta and post relationships: there is one
We cleared one of the major tickets: #17689. Here: https://core.trac.wordpress.org/changeset/28733

Variable variables

Variable variables are weird and are disallowed by Hack. They allow you to do this:

$woo = 'hoo';
$$woo = 'yeah';
echo $hoo;
// yeah!

I removed all(?) of these from core … some 3rd-party libraries might still contain them.

Started that tornado here: https://core.trac.wordpress.org/changeset/28734
One of my favorite commits: https://core.trac.wordpress.org/changeset/28743
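
A common replacement, shown here only as a generic sketch and not as what any particular commit did, is to keep the dynamic names as keys in an array:

```php
<?php
// Instead of creating a variable whose name is decided at runtime,
// store the value under that name in an array.
$values = array();

$woo = 'hoo';
$values[ $woo ] = 'yeah';

echo $values['hoo'], "\n"; // yeah
```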

Unit Tests

I did a lot of cleanup to Unit Tests along the way. Unit tests are our only way to stay sane while committing mountains of new code to WordPress.

Some highlights:

Embeds

In 3.9, I worked with Gregory Cornelius and Andrew Ozz to implement TinyMCE previews for audio, video, playlists, and galleries. We didn’t get to previews for 3rd-party embeds (like YouTube) in time, so I threw them into my Audio/Video Bonus Pack plugin as an extra feature. Once 4.0 started, Janneke Van Dorpe suggested we throw that code into core, so we did. From there, she and Andrew Ozz did many iterations of making embed previews of YouTube, Twitter, and the like possible.

Other improvements I worked on:

  • When using Insert From URL in the media modal, your embeds will appear as a preview inline.
  • Added oEmbed support for Issuu, Mixcloud, Animoto, and YouTube Playlist URLs.
  • You can use a src attribute for embed shortcodes now, instead of using the shortcode’s body (still works, but you can’t use both at the same time)
  • I added an embed handler for YouTube URLs like: http://youtube.com/embed/abc1233 (the YouTube iframe embed URLs) – those are now converted into proper URLs like: http://youtube.com/watch?v=abc1233
  • This is fucking insane, I forgot I did this – if you select a poster image for a video shortcode in the media modal, and the video is confirmed to be an attachment that doesn’t have a poster image (videos are URLs so that external URLs don’t have to be attachments), the association will be made in the background via AJAX: https://core.trac.wordpress.org/changeset/29029
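
As a rough illustration of the YouTube URL conversion mentioned in the list above, here is a standalone sketch in plain PHP. It is not the handler that shipped in core; the function name and the regular expression are my own assumptions for the example.

```php
<?php
// Hypothetical helper, not the embed handler from core.
function fd_youtube_embed_to_watch_url( $url ) {
    // Capture the video ID that follows /embed/.
    $pattern = '#^https?://(?:www\.)?youtube\.com/embed/([^/?&]+)#i';

    if ( preg_match( $pattern, $url, $matches ) ) {
        return 'http://youtube.com/watch?v=' . $matches[1];
    }

    // Anything else is returned untouched.
    return $url;
}

echo fd_youtube_embed_to_watch_url( 'http://youtube.com/embed/abc1233' ), "\n";
// http://youtube.com/watch?v=abc1233
```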

While we were working on embeds, we/I decided to COMPLETELY CHANGE MCE VIEWS FOR AUDIO/VIDEO:

https://core.trac.wordpress.org/changeset/29178
Wins: https://core.trac.wordpress.org/changeset/29179

In 3.9, we were checking the browser to see what audio/video files could be played natively and only showed those in the TinyMCE previews. Now we show all of them. This is due to the great work that @avryl and @azaozz did with implementing iframe sandboxes for embeds. I took their work and ran with it – completely changed the way audio/video/playlists render in the editor. The advantage is that the code that generates the shortcode only has to be in PHP, needs no JS equivalent.

Media Grid

@ericandrewlewis drove this train for most of the release. I came in towards beta to make sure all of the code-churn was up to Koop-like standards and dealt with some esoteric issues as they arose. It was great having my co-worker dive so deep into media, another asset to the core team, which greatly needs people with knowledge of that domain.

As an example of the things I worked on – I stayed up til 6am one night chatting with Koop to figure out how to attack media-models.js. Once I filled my brain, I got to places like:
https://core.trac.wordpress.org/changeset/29490

So many contributors

There are too many people, features, and commits to mention in one blog post. This is just a journal of my time spent. You all rocked. Let’s keep going.

WordPress 4.0 “Benny”

WordPress 3.9 + Audio/Video

Previous posts on Make/Core:
Audio / Video 2.0 Update – Media Modal 
Audio / Video 2.0 Update – Playlists 
Audio / Video 2.0 Update 
Audio / Video 2.0 – codename “Disco Fries”

If you remember WordPress 3.6, we were scrambling to make Post Formats work. They did not, so they were dropped. What remained in the aftermath was rudimentary support for audio and video. You could display one audio file at a time and/or one video file at a time using a shortcode. Good, but not good enough. WordPress 3.9 has a TON of improvements, several related to visual editing, media, and a second pass at defining what audio and video can do in WordPress.

HTML5 audio and video on the web are still the Wild Wild West; I viewed 3.9 as a way to help tame the beast.

Media code from 3.5

Koop wrote an astonishing amount of beautiful Backbone-driven code in WordPress 3.5 related to overhauling and rethinking Media in WordPress. Gregory Cornelius, Andrew Ozz, and I spent the better part of 3.9 swimming around it and its relationship to TinyMCE. While there isn’t a ton of written documentation for media, I did fall on the sword and added JSDoc blocks to every class in media-views, media-model, and media-editor JS files. It is now possible to follow the chain of inheritance for every class, which is 7 levels deep at times. We’ve also built some new features, and learned how to interact with these existing APIs.

TinyMCE Views – Visual previews of your media

TinyMCE is the visual editor in WordPress. Behind the scenes, the visual editor is an iframe that contains markup. In 3.9, gcorne and azaozz did the mind-bending work of making it easier to render “MCE views” – content that has a connection to the world outside of the visual editor’s iframe, via a TinyMCE plugin and mce-view.js. A lot of the work I did in building previews for audio and video inside of the editor was implementing the features and APIs they created. gcorne showed us the possibilities by making galleries appear in the visual editor. Everything else followed his lead.

Themes now have proper CSS

We went back in time to the last 5 default themes and added the basic styles necessary for audio and video to behave in a unified way. Meaning, if you switch from the Twenty Eleven theme to Twenty Fourteen, videos should always have the same aspect ratio. The same goes for the admin: the video should always appear with predictable dimensions.

<audio> and <video> are now responsive

Because of the above CSS changes, audio and video are responsive throughout WordPress and on mobile. Win.

Attachment Pages

If I asked you the question – do players automatically appear for audio and video files on their respective attachment pages? You might answer, of course they do! … they did not, they do now!

Chromeless YouTube

MediaElement supports the playback of YouTube videos without the look and feel of a YouTube player. This is great because the style of the video player will match the style of your other players.

MediaElement updated

MediaElement.js has been updated to the latest and greatest version. HUGE thanks to John Dyer for working so closely with us and accepting pull requests when we badger him on random Saturday afternoons.

Playlists

Turning mp3 URLs into players is awesome and happens automagically in WordPress now. But what if you are sharing an entire album of your band’s tunes, or sharing your music recital on your website? Rendering 10 separate players is visually weird. We already have “galleries” for images; can we reuse the admin UI for those and make it work for playlists of audio or video files? We can (after some sweat and tears), so we did. I remember staying up all night in 2006 trying to figure out how to put my band’s music on our website. If even a niche user base of musicians is able to publish their music because of this feature, it will have been worth it.

Manage Shortcodes

Your audio and video shortcodes now have live previews in the editor, but that’s not it… you can now click the preview to pop open the media modal and edit your content. Once there you can:

  • Add alternate playback formats for maximum native HTML5 playback
  • Add a poster image for your video, if it wasn’t done automatically on upload
  • Add subtitles to your video

It’s pretty slick.

Core Changes

Some other cool little treats:

  • Featured Image is turned on for attachment:audio and attachment:video = when you upload your audio and video files, if the files contain cover images, they are automatically slurped for you, uploaded, and associated as the featured image for the media file. Meaning: you will automatically have a video poster image, or your audio playlist will display the album cover along with the track.
  • Images in ID3 tags are stored via hash to prevent re-uploading = if you upload 10 tracks from an album that all have the same album cover, only one cover will be uploaded and associated with all of the tracks.
  • Artist and Album are editable = your media item’s title is always used as the “song title,” but now, if your item did not contain metadata for artist and album, you can set it on the Edit Media screen.
  • The old “crystal” icon set for media items has been updated and MP6ified. They look WAY better.
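
The hash trick for cover images in the list above is simple enough to sketch in a few lines of plain PHP. This is an illustration only: the function name is hypothetical, an in-memory array stands in for real uploads, and md5() is my assumption, since the list doesn't say which hash core uses.

```php
<?php
// Hypothetical stand-in for uploaded cover images, keyed by content hash.
function fd_store_cover( array &$covers_by_hash, $image_bytes ) {
    $hash = md5( $image_bytes );

    if ( ! isset( $covers_by_hash[ $hash ] ) ) {
        // First time this image is seen: "upload" it and remember its ID.
        $covers_by_hash[ $hash ] = count( $covers_by_hash ) + 1;
    }

    // Every later track with the same cover reuses the stored ID.
    return $covers_by_hash[ $hash ];
}

$covers      = array();
$album_cover = 'fake-binary-image-data';
$ids         = array();

// Ten tracks from one album, all carrying the same embedded cover.
for ( $track = 1; $track <= 10; $track++ ) {
    $ids[] = fd_store_cover( $covers, $album_cover );
}

echo count( $covers ), "\n"; // 1
```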

Have fun with WordPress 3.9 :)

xoxo

Rethinking Blogs at The New York Times

The New York Times

See Also: The Technology Behind the NYTimes.com Redesign

The Blogs at the Times have always run on WordPress. The New York Times, as an ecosystem, does not run on one platform or one technology. It runs on several. There are over 150 developers at the Times split across numerous teams: Web Products, Search, Blogs, iOS, Android, Mobile Web, Crosswords, Ads, BI, CMS, Video, APIs, Interactive News, and the list goes on. While PHP is frequently used, Elasticsearch and Node make an appearance, and the Newspaper CMS, “Scoop,” is written in Java. Interactive likes Ruby/Rails.

The “redesign,” which launched last week, was really a re-platform: a statement of where Times development needs to head, and a rethinking of our development processes and tools. The customer-facing redesign had 2 main pieces:

  • a new Article “app” that runs inside of our new platform
  • the “reskinning” of our homepage and section fronts

What is launching today is the re-platform of Blogs from a WordPress-only service to Blogs via WordPress as an app inside of our new platform.

The Redesign

Most people who use the internet have visited an NYTimes article page –

the old design:
http://www.nytimes.com/2013/12/29/arts/music/lordes-royals-is-class-conscious.html

Lorde

the new:
http://www.nytimes.com/2014/01/15/arts/music/jay-z-offers-a-view-of-his-legacy-at-barclays-center.html?ref=music

Jay-Z at Barclay's

What is not immediately obvious to the reader is how all of this works behind the scenes.

Non-Technical

To skip past all of the technical details, click here:

How Things Used to Work

For many years at the Times, article pages were generated into static HTML files when published. This was good and bad. Good because: static files are lightning fast to serve. Bad because: those files point at static assets (CSS, JavaScript files) that can only change when the pages are re-generated and re-published. One way around this was to load a CSS file that had a bunch of @import statements (eek), with a similar loading scheme for JS (even worse).

Blogs used to load like any custom WordPress project:

  • configured as a Multisite install (amassing ~200 blogs over time)
  • lots of custom plugins and widgets
  • custom themes + a few child themes

A lot of front-end developers also write PHP and vice versa. At the Times, in many instances, the team working on the Blogs “theme” was not the same team working on the CSS/JS. So, we would have different Subversion repos for global CSS, blogs CSS; different repos for global JS, blogs JS; and a different repo for WordPress proper. When I first started working at the Times, I had to create a symlink farm of 7 different repos that would represent all of the JS and CSS that blogs were using. Good times.

On top of that, all blogs would inherit NYTimes “global” styles and scripts. A theme would end up inheriting global styles for the whole project, global styles for all blogs, and then sometimes, a specific stylesheet for the individual blog. For CSS, this would sometimes result in 40-50 (sometimes 80!) stylesheets loading. Not good.

WordPress would load jQuery, Prototype, and Scriptaculous with every request (I’m pretty sure some flavor of jQuery UI was in there too). As a result, every module within the page would just assume that our flavor of the jQuery global variable, NYTD.jQuery, was available anywhere, and would assume that Prototype.js code could be called at will. (Spoiler alert: that was a bad idea.)

Our WordPress blogs do not use native WP comments. There is an entire service at the Times called CRNR (Comments, Ratings, and Reviews) that has its own user management, taxonomy management, and community moderation tools. Modules like “CRNR” would provide us with code to “drop onto the page.” Sometimes this code included its own copy of jQuery, different version and all.

Widgets on blogs could be tightly coupled with the WordPress codebase, or they could be some code that was pasted into a freeform textarea from some other team. The Interactive News team at the Times would sometimes supply us code to “drop into the C-Column” – translation: add a widget to the sidebar. These “interactives” would sometimes include their own copy of jQuery (what version…? who knows!).

How Things Work Now

The new platform has 2 main technologies at its center: the homegrown Madison Framework (PHP as MVC), and Grunt, the popular task runner that runs on Node. Our NYT codebase is a collection of several Git repos that get built into apps via Grunt and deployed by RPMs/Puppet. For any app that wants to live inside of the new shell (inherit the masthead, “ribbon,” navigation automatically), they must register their existence. After they do, they can “inherit” from other projects. I’ll explain.

Foundation

Foundation is the base application. Foundation contains the Madison PHP framework, the Magnum CSS/Responsive framework, and our base JavaScript framework. Our CSS is no longer a billion disparate files – it is LESS manifests, with plenty of custom mixins, that compile into a few CSS files. At the heart of our JS approach is RequireJS, Hammer, SockJS and Backbone (authored by Times alum Jeremy Ashkenas).

Madison is an MVC framework that utilizes the newest and shiniest OO features of PHP and is built around 2 main software design patterns: the Service Locator pattern (via Pimple), and Dependency Injection. The main “front” of any request to the new stack goes through Foundation, as it contains the main controller files for the framework. Apps register their main route via Apache rewrite rules; Madison knows which app to launch by convention based on the code that was deployed via the Grunt build.
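
If those two pattern names are unfamiliar, here is a toy version in plain PHP. Madison's code is not public in this post, so everything below is a hypothetical sketch: FD_Container is a bare-bones stand-in for what a container like Pimple provides, and the logger and controller classes are invented for the example.

```php
<?php
// Hypothetical minimal container; Pimple itself is more capable.
class FD_Container {
    private $factories = array();
    private $instances = array();

    // Register a factory closure under a service name.
    public function set( $name, $factory ) {
        $this->factories[ $name ] = $factory;
    }

    // Service Locator: look a service up by name, building it once.
    public function get( $name ) {
        if ( ! isset( $this->instances[ $name ] ) ) {
            $this->instances[ $name ] = call_user_func( $this->factories[ $name ], $this );
        }
        return $this->instances[ $name ];
    }
}

class FD_Logger {
    public $lines = array();

    public function log( $line ) {
        $this->lines[] = $line;
    }
}

// Dependency Injection: the controller receives its logger
// through the constructor instead of creating it.
class FD_Controller {
    private $logger;

    public function __construct( FD_Logger $logger ) {
        $this->logger = $logger;
    }

    public function run() {
        $this->logger->log( 'handled request' );
    }
}

$c = new FD_Container();
$c->set( 'logger', function ( $c ) {
    return new FD_Logger();
} );
$c->set( 'controller', function ( $c ) {
    return new FD_Controller( $c->get( 'logger' ) );
} );

$c->get( 'controller' )->run();
echo count( $c->get( 'logger' )->lines ), "\n"; // 1
```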

Shared

Shared is a collection of reusable modules. Write a module once, and then allow apps to include it at will. Shared is where Madison’s “base” modules exist. Modules are just PHP template fragments which can include other PHP templates. Think of a “Page” module like so:

Page
- load Top module
- load Content module
- load Bottom module

Top (included in Page)
- load Styles module
- load Scripts module
- load Meta module

...

In your app code, if you try to embed a module by name, and it isn’t in your app’s codebase, the framework will automatically look for it in Shared. This is similar to how parent and child themes work in WordPress. This means: if you want to use ALL of the default modules, only overriding a few, you need to only specify the overriding modules in your app. Let’s say the main content of the page is a module called “PageContent/Thing” – you would include the following in your app to override what is displayed:

// page layout
$layout = array(
    'type' => 'Page',
    'name' => 'Page',
    'modules' => array(
        array(
            'type' => 'PageContent',
            'name' => 'Thing'
        ),
        .....
    )
);

// will first look in:
//   nyt5-app-blogs/Modules/PageContent/Thing.tpl.php
// if it doesn't find it there:
//   nyt5-shared/PageContent/php/src/Thing.tpl.php
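The lookup order above can be sketched as a small resolver. This is only an illustration of the fallback idea, not Madison's actual loader: resolve_module_template() and its $exists parameter are hypothetical, and the two path patterns are taken from the example paths above.

```php
<?php
// Return the first candidate path that exists, or null if none does.
// $exists is injectable so the lookup can be exercised without touching disk.
function resolve_module_template( $type, $name, array $roots, $exists = 'file_exists' ) {
    foreach ( $roots as $pattern ) {
        $path = sprintf( $pattern, $type, $name );
        if ( call_user_func( $exists, $path ) ) {
            return $path;
        }
    }
    return null;
}

// App-specific location first, Shared second.
$roots = array(
    'nyt5-app-blogs/Modules/%1$s/%2$s.tpl.php',
    'nyt5-shared/%1$s/php/src/%2$s.tpl.php',
);

// Pretend only the Shared copy of the template exists.
$on_disk = array( 'nyt5-shared/PageContent/php/src/Thing.tpl.php' );
$exists  = function ( $path ) use ( $on_disk ) {
    return in_array( $path, $on_disk, true );
};

$found = resolve_module_template( 'PageContent', 'Thing', $roots, $exists );
```

Here $found is the Shared path; adding an app-level Thing.tpl.php would make the first pattern win, which is the same override behavior a child theme has over its parent.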

So there’s a lot happening, before we even get to our Blogs app, and we haven’t even really mentioned WordPress yet!

App-specific

Each app contains a build.json file that explains how to turn our app into a codebase that can be deployed as an application. Each app might also have the following folder structure:

js/
js/src
js/tests
less/
php/
php/src
php/tests

Our build.json file lists our LESS manifests (the files to build via Grunt) and our JS manifests (the files to parse using r.js/Require). Our php/src directory contains the following crucial pieces:

Module/ <-- contains our Madison override templates
WordPress/ <-- contains our entire WP codebase
ApplicationConfiguration.php <-- optional configuration
ApplicationController.php <-- the main Controller for our app
wp-bootstrap.php <-- loads in global scope to load/parse WordPress

The wp-bootstrap.php file is the most interesting portion of our WordPress app, and where we do the most unconventional work to get these 2 disparate frameworks to work together. Before we even load our app in Madison proper, we have already loaded all of WordPress in an output buffer and stored the result. We can then access that result in our Madison code without any knowledge of WordPress. Alternately, we can use any WP code inside of Madison. Madison eschews procedural programming and enforces namespace-ing for all classes, so collisions haven’t happened (yet?).
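The “load it in an output buffer and store the result” step can be sketched with PHP's output buffering functions. This is not the real wp-bootstrap.php: capture_output() is a hypothetical helper, and the closure stands in for loading WordPress and letting the theme print its markup.

```php
<?php
// Run a callable that echoes markup and return what it printed as a string.
function capture_output( $render ) {
    ob_start();
    try {
        call_user_func( $render );
    } catch ( Exception $e ) {
        // Discard the partial buffer before re-throwing.
        ob_end_clean();
        throw $e;
    }
    return ob_get_clean();
}

// Stand-in for "bootstrap WordPress and render the theme's content".
$wp_content = capture_output( function () {
    echo '<article>Hello from the theme</article>';
} );

// Later, the wrapping layout treats the stored string as opaque module content.
$page = '<div id="main">' . $wp_content . '</div>';
```

The wrapper only sees a string, which is why the surrounding framework needs no knowledge of what produced it.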

Because we are turning WP content into Module content, we no longer want our themes to produce complete HTML documents: we only want to produce the “content” of the page. Our Madison page layout gives us a wrapper and loads our app-specific scripts and styles. We have enough opportunities to override default template stubs to inject Blog-specific content where necessary.

In the previous incarnation of Blogs, we had to include tons of global scripts and styles. Using RequireJS, which leans on Dependency Injection, we ask for jQuery in any module and ensure that it only loads once. If we in fact do need a separate version somewhere, we can be assured that we aren’t stomping global scope, since we aren’t relying on global scope.

Using LESS imports instead of CSS file imports, we can modularize our code (even using 80 files if we want!) and combine/minify on build.

Loading WordPress in our new unconventional way lets us work with other teams and other code seamlessly. I don’t need to include the masthead/navigation markup in my theme. I don’t even need to know how it works. We can focus on making blogs work, and inherit the rest.

What I Did

For the first few months of the project, I was able to work in isolation and move the Blogs codebase from SVN to Git. I was happy that we were moving the CSS to LESS and the JS to Require/Backbone, so I took all of the old files and converted them into those modern frameworks. The Times had 3 themes that I was given free rein to rewrite and squish into one lighter, more flexible theme. Since the Times has been using WordPress since 2005, there was code from the dark ages of the internet that I was able to look at with fresh eyes and transition. Once a lot of the brute force initial work was done, I worked with a talented team of people to integrate some of the Shared components and make sure we had stylistic parity between the new Article pages and Blogs.

To see some examples in action, a sampling:

Dealbook

Bits

Well

The Lede

City Room

ArtsBeat

Public Editor’s Journal

Paul Krugman

WordPress: Autowiring Custom Post Type Metadata

The New York Times Co.

Write Less Code

I recently did a project at the New York Times, a corporate website that was highly dynamic. A lot of the front-end work was done ahead of time with dummy content. I was brought in at the end to rewrite the core logic and set up all of the dynamic pieces. EVERYTHING had to be dynamic. There were several times that I had to quickly replace a dummy HTML list with content from a collection of objects belonging to a custom post type. I didn’t want to re-invent the wheel every time I added a new post type. I wanted to write one register_post_type() call with a helper as the value for 'register_meta_box_cb'.

Here’s How

Custom post types in WordPress are really object types, much like a blog post is an instance of the Post object type. When you register a custom post type, you are really registering a new “thing” that isn’t really a “post,” it’s something else. Once you have registered this thing, you will probably use the same API as Post to interact with your data: WordPress core functions to retrieve and save your data.

By far the most annoying things about custom post types are how much code it takes to register one and how much duplicate code it takes to save arbitrary metadata. An example:

I want to create a new object called “nyt_partner” – I am going to use the title, the content, and featured image, but I also need to associate some arbitrary data with each instance of “nyt_partner”: phone number, address, twitter account, etc. I am only going to read the data (not search for it), so object (post) metadata works just fine.

Here is some example code for how one currently registers the post type, then registers the metabox to display a form for new fields, and then saves the data when the post is saved:

All that code, and all we are doing is saving a twitter field. Gross. What if our site is very custom and we are using objects all over the place? What if everything on the site needs to be editable? This code is going to bloat almost immediately, so we need to find more ways to reuse.

The first thing we need to do is use a class to contain our logic, and ditch all of the procedural code from the last example. We are going to seriously optimize this code later, but here it is as a class:

This object is better, but it can still bloat very quickly. For each post type that has custom data, you have to add a meta box in one callback, and then register the UI in another. Every time your new object is saved, you have to run it through your own save logic, which adds even more bloat. For objects that are really complex, you actually might want to create a class per type, but most of the time, the data you are saving are attributes or simple fields. It would be great if we could create a few methods to autowire the creation and saving of a field.

In the next example, we will use closures and parent scope to dramatically decrease the necessary code to register a field:

For the time being, if I need to add another post type that has one field, I can just add these lines and be done with it:

All of the magic is rolled up into the NYT_Post_Types::create_field_box() method. So, if you need to add a bunch of post types at once that only save a field, you only have to edit the init method. This works if I have only one field. If I have several, I need to add a method:

To specify the fields while registering the post type:

Another piece of magic that we wired up – you can autowire a save method for a post type (that does not use autowiring for the UI) by adding a save_{post_type} method to your class. If you create a post type called balloon, all you have to do is add a method called save_balloon to your class. Our one registered save_post callback is smart enough to call it. This is great because you don’t have to duplicate the logic to determine if the post is eligible for save.
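The save_{post_type} convention boils down to building a method name from the post type and checking whether it exists. A minimal sketch, assuming nothing about the real class: Post_Type_Saver is a hypothetical name, the posts are plain objects rather than WP_Post instances, and the eligibility checks (autosave, permissions, nonces) that the real callback would perform are left out.

```php
<?php
class Post_Type_Saver {
    public $saved = array();

    // Stand-in for the one registered save_post callback.
    public function save_post( $post_id, $post ) {
        $method = 'save_' . $post->post_type;
        if ( ! method_exists( $this, $method ) ) {
            return false; // nothing autowired for this type
        }
        $this->$method( $post_id, $post );
        return true;
    }

    // Picked up automatically for the "balloon" post type.
    public function save_balloon( $post_id, $post ) {
        $this->saved[] = "balloon:$post_id";
    }
}

$saver   = new Post_Type_Saver();
$balloon = (object) array( 'post_type' => 'balloon' );
$page    = (object) array( 'post_type' => 'page' );

$handled_balloon = $saver->save_post( 7, $balloon );
$handled_page    = $saver->save_post( 8, $page );
```

Adding support for another type is then just a matter of adding one more save_* method; the dispatching callback never changes.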

The autowiring methods (create_field_box() and create_fields_box()) dynamically create class properties with closures, but first look for an existing method. You can’t have both. Closures actually create properties on the class, not new class methods. This makes sense because you are really decorating your object with instances of the Closure class, which is what closures are. Closures should look very familiar to you if you write JavaScript with jQuery.
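A small sketch of the property-versus-method distinction, under stated assumptions: Demo_Boxes and its markup are hypothetical, and the property is declared up front here (dynamically created properties are deprecated as of PHP 8.2), whereas the approach described above creates it on the fly.

```php
<?php
class Demo_Boxes {
    // Declared for clarity; it will hold a Closure once a box is created.
    public $twitter_box;

    public function create_field_box( $field ) {
        $prop = $field . '_box';

        // An existing method with that name wins over an autowired closure.
        if ( method_exists( $this, $prop ) ) {
            return array( $this, $prop );
        }

        // $field is captured from the parent scope via "use".
        $this->$prop = function ( array $meta ) use ( $field ) {
            $value = isset( $meta[ $field ] ) ? $meta[ $field ] : '';
            return sprintf(
                '<input type="text" name="%s" value="%s" />',
                $field,
                htmlspecialchars( $value )
            );
        };

        return $this->$prop;
    }
}

$boxes    = new Demo_Boxes();
$callback = $boxes->create_field_box( 'twitter' );

// The closure is a property holding a Closure instance, not a method.
$is_method  = method_exists( $boxes, 'twitter_box' );
$is_closure = $boxes->twitter_box instanceof Closure;

// So it is invoked as a callable value rather than as $boxes->twitter_box().
$html = call_user_func( $callback, array( 'twitter' => '@nyt' ) );
```

Calling $boxes->twitter_box( ... ) directly would make PHP look for a method of that name and fail, which is why the closure is passed around and invoked as a callback instead.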

Some of your custom post types will need unique method callbacks for 'register_meta_box_cb', but my bet is that MOST of them can share logic similar to what I have demonstrated above. At eMusic, we had 56 custom post types powering various parts of the site. I used similar techniques to cut down the amount of duplicated logic across the codebase.

You may not need to use these techniques if your site is simple. And note: you can’t use closures in any version of PHP before 5.3.

WordPress 3.5 + Me

WordPress 3.5 dropped today. This is a special release for me because my picture made it to the Credits and I had 30-40 of my patches committed. Here’s the full list: https://core.trac.wordpress.org/search?q=wonderboymusic&noquickjump=1&changeset=on

The highlights:

I have 55 patches on deck for 3.6 already, excited to see what makes it! If anyone out there is thinking about contributing to core and is hesitant, don’t be. 90% of success is showing up. Be There. Subscribe to Trac. Comment on tickets. Test patches. Occasionally check in on IRC. The people who are making WordPress are there. You could be one of them.

I was just a little lad with a dream 2 years ago at my first WordCamp in NYC when I grilled Nacin and Koop about using IDs instead of classes in the CSS selectors for Twenty Ten. Koop talked to me afterward and suggested I contribute to core. My first patch was at the after-party for WordCamp San Francisco 2011 at 2am at the old Automattic space at the Pier on the Embarcadero. I got 1 patch into 3.2. 1 patch into 3.3. Zero into 3.4. And here we are.

WordPress + Regionalization

Regionalized merch, regionalized promos, regionalized blog posts, and the list goes on and on


One of the coolest things about WordPress is that it is built from the ground up with translation tools. Many blogs want to vibe in a language other than English, and many blogs want to get their international game on and present content to people in several languages.

But what if you want to detect a user’s geographic location and display content based on their country code (“US” for USA, “FR” for France, et al) or your own regional site code (“EU” for European Union countries, for example)? This is not built-in but can be built by you if you know what you are doing. Let’s learn!

Btdubs, eMusic makes extensive use of regionalization. We have 4 regional sites: US, UK, EU, CA. We regionalize everything, so almost every database query on our site has to be subject to some filtering. We also have to intersect WP with our regionalized catalog data and in various places link them together magically.

Custom Taxonomy: Region

Creating new taxonomies in WP requires code, but adding terms to that taxonomy can be done by anyone in the admin. Let’s get the code out of the way:

/**
 * Shortcut function for assigning Labels to a custom taxonomy
 *
 * @param string $str Singular label for the taxonomy
 * @param string $plural If not specified, "s" is added to the end of $term
 * @return array Labels for use by the custom taxonomy
 *
 */
function emusic_tax_inflection( $str = '', $plural = '' ) {
    $p = strlen( $plural ) ? $plural : $str . 's';

    return array(
        'name'              => _x( $p, 'taxonomy general name' ),
        'singular_name'     => _x( $str, 'taxonomy singular name' ),
        'search_items'      => __( 'Search ' . $p ),
        'all_items'         => __( 'All ' . $p ),
        'parent_item'       => __( 'Parent ' . $str ),
        'parent_item_colon' => __( 'Parent ' . $str . ':' ),
        'edit_item'         => __( 'Edit ' . $str ),
        'update_item'       => __( 'Update ' . $str ),
        'add_new_item'      => __( 'Add New ' . $str ),
        'new_item_name'     => __( 'New ' . $str . ' Name' ),
        'menu_name'         => __( $p ),
    );
}

$post_types = get_post_types( array( 'exclude_from_search' => false, '_builtin' => false ) );

$defaults = array(
    'hierarchical'      => true,
    'public'            => true,
    'show_ui'           => true,
    '_builtin'          => true,
    'show_in_nav_menus' => false,
    'query_var'         => true,
    'rewrite'           => false
);

register_taxonomy( 'region', $post_types, wp_parse_args( array(
    'labels' => emusic_tax_inflection( 'Region' )
), $defaults ) );

Because we load a bunch of custom taxonomies, this code helps us stay modular. But yes, this is the amount of code required if you only have one taxonomy!

All you *really* need to know is: we made a hierarchical taxonomy. It’s called region. We are now going to use it EVERYWHERE.

Boom

So now that we have a taxonomy for region, we want to be able to assign region(s) to posts.

Sorting

We also want to view the region in the posts list table. We can add custom columns to the posts table and make the region column sortable. We need some more code for that:

class CustomClassNamedWhatever {

.......

/**
 * Filters for Admin
 *
 */
function admin() {
    add_filter( 'manage_posts_columns', array( $this, 'manage_columns' ) );
    add_action( 'manage_posts_custom_column', array( $this, 'manage_custom_column' ), 10, 2 );
    add_filter( 'posts_clauses', array( $this, 'clauses' ), 10, 2 );

	// this is an internal method that gives me an array of relevant post types
    foreach ( $this->get_post_types( false ) as $t ) {
        add_filter( "manage_edit-{$t}_sortable_columns", array( $this, 'sortables' ) );
    }
}

/**
 * Register sortable columns
 *
 * @param array $columns
 * @return array
 */
function sortables( $columns ) {
    $post_type_obj = get_post_type();

    if ( is_object_in_taxonomy( $post_type_obj, 'region' ) )
	$columns['region'] = 'region';

    return $columns;
}

/**
 * Add custom column headers
 *
 * @param array $columns
 * @return array
 */
function manage_columns( $columns ) {
    $columns['region'] = __( 'Region' );
    return $columns;
}

/**
 * Output terms for post in tax
 *
 * @param int $id
 * @param string $tax
 */
function _list_terms( $id, $tax ) {
    $terms = wp_get_object_terms( $id, $tax, array( 'fields' => 'names' ) );
    if ( ! empty( $terms ) )
        echo join( ', ', $terms );
}
/**
 * Output custom HTML for custom column row
 *
 * @param string $column
 * @param int $post_id
 */
function manage_custom_column( $column, $post_id ) {
    if ( 'region' === $column )
        $this->_list_terms( $post_id, $column );
}

/**
 * Filter SQL with this monstrosity for sorting
 *
 * @global hyperdb $wpdb
 * @param array $clauses
 * @param WP_Query $wp_query
 * @return array
 *
 * TODO: fix this
 */
function clauses( $clauses, $wp_query ) {
    global $wpdb;

    if ( isset( $wp_query->query['orderby'] ) &&
        'region' === $wp_query->query['orderby'] ) {
        $tax = $wp_query->query['orderby'];

        $clauses['join'] .= <<<SQL
LEFT OUTER JOIN {$wpdb->term_relationships} ON {$wpdb->posts}.ID={$wpdb->term_relationships}.object_id
LEFT OUTER JOIN {$wpdb->term_taxonomy} USING (term_taxonomy_id)
LEFT OUTER JOIN {$wpdb->terms} USING (term_id)
SQL;

        $clauses['where'] .= " AND (taxonomy = '{$tax}' OR taxonomy IS NULL)";
        $clauses['groupby'] = "object_id";
        $clauses['orderby']  = "GROUP_CONCAT({$wpdb->terms}.name ORDER BY name ASC) ";
        $clauses['orderby'] .= ( 'ASC' == strtoupper( $wp_query->get('order') ) ) ? 'ASC' : 'DESC';
    }

    return $clauses;
}

........

}

After all of that, we can sort our posts by region:

Sorting

If you’re new to WordPress, you may have just barfed. Just know that we added relevant code to make the Region column sortable. Meanwhile, we still haven’t done anything to make our site regionalized. If we let WordPress just do its thing, you would still get content from every region, not just the one you are specifically targeting.

Geolocation

We have very specific values we check to regionalize users. If you’re anonymous, we get an X-Akamai-Edgescape HTTP header that can be parsed, and it contains a value for “country_code.” Based on that country code, we can assign you to a region. Anybody with a debug console can probably view this header in their eMusic requests. If you’re logged-in or “cookied” – we pin you to your original country code.
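As a rough sketch of the parsing step, assuming the header value is a comma-separated list of key=value pairs: the sample string below is made up, parse_edgescape() and country_code_from_header() are hypothetical helpers, and falling back to “US” is an assumption rather than eMusic's actual behavior.

```php
<?php
// Split "a=1,b=2" into array( 'a' => '1', 'b' => '2' ).
function parse_edgescape( $header ) {
    $fields = array();
    foreach ( explode( ',', $header ) as $pair ) {
        $parts = explode( '=', $pair, 2 );
        if ( 2 === count( $parts ) ) {
            $fields[ trim( $parts[0] ) ] = trim( $parts[1] );
        }
    }
    return $fields;
}

// Pull out country_code, normalized to uppercase, with a fallback.
function country_code_from_header( $header, $default = 'US' ) {
    $fields = parse_edgescape( $header );
    if ( empty( $fields['country_code'] ) ) {
        return $default;
    }
    return strtoupper( $fields['country_code'] );
}

// Made-up sample value, for illustration only.
$sample  = 'georegion=263,country_code=fr,city=PARIS';
$country = country_code_from_header( $sample );
```

In a real request the value would come from the server environment (for example $_SERVER) rather than a literal string.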

At the end of the day, we want to be able to set some constants:

/**
 * Register default constants
 *
 */
if ( ! defined( 'THE_COUNTRY' ) )
    define( 'THE_COUNTRY', get_country_code() );

if ( ! defined( 'THE_REGION' ) )
    define( 'THE_REGION', get_region() );

if ( ! defined( 'CURRENCY_SYMBOL' ) ) {
    global $currency_symbol_map;
    if ( isset( $currency_symbol_map[THE_REGION] ) ) {
	define( 'CURRENCY_SYMBOL', $currency_symbol_map[THE_REGION] );
    } else {
	define( 'CURRENCY_SYMBOL', $currency_symbol_map['US'] );
    }
}

get_country_code() and get_region() are very specific to eMusic, so I won’t bore you with them, but they both return a 2-character-uppercase value for country or region, “US” and “US” for example. Now, anywhere in our code where we need to refer to region or country, THE_REGION and THE_COUNTRY will do it.

Linking Region to Taxonomy

Ok cool, we have a taxonomy, and we have some vague representation of region / country defined as constants. We still don’t have regionalized content. One way we could link region to taxonomy terms would be to make a map and refer to it when necessary:

$regions_map = array(
    'US' => 3,
    'UK' => 6,
    'CA' => 9,
    'EU' => 11,
);

This method sucks. Why? Are those IDs term_ids or term_taxonomy_ids? What if they change? Also, grabbing a PHP global every time you need to translate region code to term_id is weird. How else can we get the terms belonging to the region taxonomy? Let’s try this database call which is cached in memory after the first time it is retrieved:

get_terms( 'region' );

That’s great, but what do I do with it? Because your Region codes are also the names of your region terms, you could dynamically create your map like this:

$map = array();
foreach ( $terms as $term )
    $map[$term->name] = $term->term_taxonomy_id;

Next question: where do I set this, and how do I account for crazy stuff like switch_to_blog()? Try this:

/**
 * Automatically resets regions_tax_map on switch_to_blog()
 * Because we switch_to_blog() in sunrise.php, this function is
 * called when plugins_loaded to create initial values
 *
 * @param int $blog_id
 *
 */
function set_regions_tax_map( $blog_id = 0 ) {
     if ( ! isset( get_emusic()->regions_tax_maps ) )
	get_emusic()->regions_tax_maps = array();

     if ( ! taxonomy_exists( 'region' ) )
	return;

     if ( empty( $blog_id ) )
	$blog_id = get_current_blog_id();

     if ( isset( get_emusic()->regions_tax_maps[$blog_id] ) ) {
	get_emusic()->regions_tax_map = get_emusic()->regions_tax_maps[$blog_id];
	return;
    }

    $terms = get_terms( 'region' );

    $map = array();
    foreach ( $terms as $term )
	$map[$term->name] = $term->term_taxonomy_id;

    get_emusic()->regions_tax_maps[$blog_id] = $map;
    get_emusic()->regions_tax_map = get_emusic()->regions_tax_maps[$blog_id];
}

/**
 * The initial setting of $emusic->regions_tax_map on load
 *
 */
add_action( 'init', 'set_regions_tax_map', 20 );
/**
 * Resets context on switch to blog
 *
 */
add_action( 'switch_blog', 'set_regions_tax_map' );

STILL, we don’t have regionalized content, we just have a standard way of mapping region code to term_taxonomy_ids. So why do I care about the IDs? Because we have to use them to alter our default WP queries so that posts only intersect the region of the current user.

Tax Query

Tax Query is an advanced way of altering WP_Query. If you load a page of WP posts, you ran WP_Query. A developer can make their own instances of WP_Query or alter the global one. We need to alter the global one and we don’t want to ever use query_posts to do that. The first thing we need to do is retrieve our map of Term Name => Term Taxonomy ID:

/**
 * Gets a map of region name => term_taxonomy_id
 *
 * @return array
 */
function get_current_regions_tax_map() {
    return get_emusic()->regions_tax_map;
}

Now that we have that, we want to use it in the tax query that we are going to inject into WP_Query. One step at a time, here’s the tax_query portion, encapsulated in a function that always returns the proper context:

/**
 * Encapsulates common code to create a regionalized tax_query
 *
 * e.g. $tax_query = array( get_region_tax_query() );
 *
 * @param string $region Region code; defaults to THE_REGION when empty
 * @param bool $all Whether to also include the "ALL" region term
 * @return array
 */
function get_region_tax_query( $region = '', $all = true ) {
	$regions_tax_map = get_current_regions_tax_map();
	if ( empty( $regions_tax_map ) )
		return;

	$terms = array();
	$terms[] = $regions_tax_map[ empty( $region ) ? THE_REGION : strtoupper( $region ) ];
	if ( true === $all )
		$terms[] = $regions_tax_map['ALL'];

	$tax_query = array(
		'operator'		=> 'IN',
		'taxonomy'		=> 'region',
		'field'			=> 'term_taxonomy_id',
		'terms'			=> $terms,
		'include_children'	=> false
	);

	return $tax_query;
}
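To see what the function produces without a running WordPress install, here is a dependency-free variant that takes the map and the current region as arguments. build_region_tax_query() is a hypothetical name, the IDs in $map are placeholders (including the one for “ALL”), and unlike the function above it skips region codes that are missing from the map.

```php
<?php
// Same shape as the tax_query clause above, but with no globals or constants.
function build_region_tax_query( array $map, $current_region, $region = '', $all = true ) {
    if ( empty( $map ) ) {
        return array();
    }

    $code  = empty( $region ) ? $current_region : strtoupper( $region );
    $terms = array();

    if ( isset( $map[ $code ] ) ) {
        $terms[] = $map[ $code ];
    }
    if ( true === $all && isset( $map['ALL'] ) ) {
        $terms[] = $map['ALL'];
    }

    return array(
        'operator'         => 'IN',
        'taxonomy'         => 'region',
        'field'            => 'term_taxonomy_id',
        'terms'            => $terms,
        'include_children' => false,
    );
}

// Placeholder term_taxonomy_ids.
$map = array( 'US' => 3, 'UK' => 6, 'CA' => 9, 'EU' => 11, 'ALL' => 14 );

// Explicit region, case-insensitive, plus the "ALL" term.
$clause = build_region_tax_query( $map, 'US', 'eu' );

// Current region only.
$us_only = build_region_tax_query( $map, 'US', '', false );
```

With these placeholder IDs, $clause['terms'] holds the EU and ALL IDs, and $us_only['terms'] holds only the US ID.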

Being able to query by term_taxonomy_id is important, because it is the primary key for the term / taxonomy relationship. If I had passed term_ids, they would need to be translated into term_taxonomy_ids by WP_Tax_Query before WP_Query could complete its logic. I helped nurse this new functionality along in WordPress core (3.5): https://core.trac.wordpress.org/ticket/21228

Ok cool, we have the tax_query portion, but we haven’t applied it to the global WP_Query yet, let’s do that next.

“pre_get_posts”

Here is what you may use in your theme or plugin to apply regionalization to WP_Query. The best place to hook in is “pre_get_posts.” In this hook, we can directly alter the query:

class MyAwesomeExampleThemeClass {

.....

protected function regionalize() {
    add_filter( 'pre_get_posts', array( $this, 'pre_posts' ) );
}

/**
 *
 * @param WP_Query $query
 * @return WP_Query
 */
function pre_posts( $query ) {
    if ( $query->is_main_query() && ! is_admin() && ( is_archive() || is_front_page() || is_search() ) ) {
        if ( ! $query->is_post_type_archive() )
            $query->set( 'post_type', $this->post_types );

        $query->set( 'tax_query', array( get_region_tax_query() ) );

        $query = apply_filters( 'emusic_pre_get_posts', $query );
    }
    return $query;
}

....

}

The above will regionalize posts on archive pages, on the front page, and in search results. If you want to alter a query inline, now all you have to do is something like this:

$results = new WP_Query( array(
    'tax_query'	=> array( get_region_tax_query() ),
    'orderby' => 'comment_count',
    'order' => 'DESC',
    'posts_per_page'=> $limit
) );
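One thing to watch: setting 'tax_query' replaces whatever taxonomy clauses were already there. If a query also needs its own taxonomy conditions, the region clause can be appended instead. A minimal sketch, assuming the existing clauses are meant to be ANDed together; with_region_clause() is a hypothetical helper and the clause contents are placeholders.

```php
<?php
// Append a region clause to existing tax_query clauses, joined with AND.
function with_region_clause( array $existing, array $region_clause ) {
    $merged = array( 'relation' => 'AND' );

    foreach ( $existing as $key => $clause ) {
        // Drop any previous top-level relation; this sketch always uses AND.
        if ( 'relation' === $key ) {
            continue;
        }
        $merged[] = $clause;
    }

    $merged[] = $region_clause;
    return $merged;
}

// Placeholder clauses, shaped like the ones used above.
$genre_clause = array(
    'taxonomy' => 'genre',
    'field'    => 'slug',
    'terms'    => array( 'jazz' ),
);
$region_clause = array(
    'taxonomy' => 'region',
    'field'    => 'term_taxonomy_id',
    'terms'    => array( 3, 14 ),
    'operator' => 'IN',
);

$tax_query = with_region_clause( array( $genre_clause ), $region_clause );
```

The resulting array can be passed as the 'tax_query' argument in the same way as the single-clause version.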

In Action

View eMusic in multiple regions:

From the US
From France in the EU
From Canada (CA)
From Great Britain, mate (UK)