Another week of reading, lectures, and confused Googling and video-skimming is in the bag. This week was all about NP-completeness. It has been fun to experience an “aha!” moment in each course I’ve taken so far, and this was an especially fun one, in large part because finding answers to these problems is seemingly beyond the reach of computers (in any reasonable amount of time).

We’re accustomed to computers being incredibly fast. So accustomed that we forget just how fast they are. They’re really fast. I’m writing this on a 4-year-old computer with a 2.5 GHz Intel Core i7 processor. That number is how many cycles this computer’s system clock completes in a second: 2,500,000,000 cycles every second. The CPU in this machine is quite powerful, too; it should execute something like 9 instructions per cycle on a single core, and there are 4 cores in total.

Multiply that out and you get something on the order of 90 billion instructions per second, all representing commands that we expect the computer to execute for us so that we can… I dunno… watch videos of cute cats.

All of which is to say: when you hear that a computer can’t find the answer to a problem, that’s a big deal. These are the sorts of problems that a reasonably programmed (or so we think) computer with enough processors to require fans that sound like this:

just can’t solve if the domain is large enough.

The full list of these problems is here, and I suppose there are two ways to think about them. One is very abstract; the other is through a concrete example. I think it’s important to understand the core essence of these problems, so I’ll begin with the abstraction.

Imagine that you have a problem whose possible solutions are all of the possible combinations of values of its variables. If each of n variables can take 2 possible values, that’s 2^n possible solutions. Generally, you also have to do a bit of work to verify each candidate; let’s call that m. The total time to find the solution to the problem is then proportional to 2^n * m. If n is 5 and m is 10, then 2^(5) * 10 = 320. Let’s say that represents seconds, so we’re at 5 minutes, 20 seconds. If n is 10 and m is 100, then we’re at 102,400 seconds, or roughly 1,706 minutes (a bit over a day). Push n to just 30 and we’re waiting over 3,000 years.

So small changes in those numbers cause big shifts in our ability to solve a problem in a time that makes sense. All essentially because the problem’s answer could be this or this or this or this. If we had a computer that could simply evaluate all of the this’s in parallel, with as many processors as there are or‘s (i.e., a single processor dedicated to exploring each possible solution), then we could do this in polynomial time. But you can see just how quickly that becomes a problem. Too many processors equals fires (or really loud fans).
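To make the blow-up concrete, here’s a toy sketch (my own illustration, not anything from the coursework) that brute-forces every true/false assignment of n variables and counts how many candidates it has to look at:

```js
// Enumerate every true/false assignment of n variables and "verify" each one.
// Each integer from 0 to 2^n - 1 encodes one assignment of the n variables.
function bruteForce(n, verify) {
  let candidates = 0;
  for (let bits = 0; bits < 2 ** n; bits++) {
    const assignment = Array.from({ length: n }, (_, i) => Boolean(bits & (1 << i)));
    candidates++;
    if (verify(assignment)) return { assignment, candidates };
  }
  return { assignment: null, candidates };
}

// A toy "problem": find the assignment where every variable is true.
// With n = 20 that assignment is the very last candidate checked.
console.log(bruteForce(20, (a) => a.every(Boolean)).candidates); // 1048576
```

With n = 20 the loop already churns through more than a million candidates; by n = 50 you’re past a quadrillion, which is where the loud fans come in.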

The other way to think of this is through a concrete problem. One of the easier ones (in my opinion) to understand is the traveling salesperson problem:

The problem is essentially a question of permutations: given a list of cities and the distances between those cities, what is the shortest possible route that visits each city and returns to the origin city? Again, to get to the answer we have to look at one path or another path or another path or another path. There is no known way to find the shortest path without, in effect, computing the lengths of all the paths. You can imagine that with more cities there are more paths, and with a large enough set the problem is so complex that we cannot reasonably answer it.
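Here’s a brute-force sketch of that idea in JavaScript. The four cities and their distance matrix are made up purely for illustration; the point is the pile of permutations, not the data:

```js
// Brute-force traveling salesperson: try every permutation of the cities
// and keep the shortest round trip that starts and ends at city 0.
const distances = [
  [0, 10, 15, 20],
  [10, 0, 35, 25],
  [15, 35, 0, 30],
  [20, 25, 30, 0],
];

function shortestTour(dist) {
  const n = dist.length;
  const cities = [...Array(n - 1).keys()].map((i) => i + 1); // cities 1..n-1; city 0 is the origin
  let best = { length: Infinity, tour: null };

  // Generate every permutation of the non-origin cities.
  function permute(remaining, path) {
    if (remaining.length === 0) {
      // Close the loop back to the origin and total up the distance.
      const tour = [0, ...path, 0];
      let length = 0;
      for (let i = 0; i < tour.length - 1; i++) {
        length += dist[tour[i]][tour[i + 1]];
      }
      if (length < best.length) best = { length, tour };
      return;
    }
    for (let i = 0; i < remaining.length; i++) {
      permute(
        [...remaining.slice(0, i), ...remaining.slice(i + 1)],
        [...path, remaining[i]]
      );
    }
  }

  permute(cities, []);
  return best;
}

console.log(shortestTour(distances)); // { length: 80, tour: [0, 1, 3, 2, 0] }
```

Four cities means only 3! = 6 round trips to check; 20 cities means 19!, or roughly 1.2 × 10^17 of them.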

In short, these are the problems that baffle us with or without computers. That they all share some key characteristics says something about the nature of these problems and about problems in general. In fact, one of the key notions of NP-complete problems is that if you can solve one in polynomial time (i.e., efficiently, or non-exponentially), then you can solve all such problems in polynomial time. But perhaps some problems are just too complex for a person or machine to solve.

I should note that I remain positive about the direction of the Gutenberg project (the new WordPress editor, coming to you as of WordPress 5.0). My feelings on this are numerous and expansive, but the long and short of it is that I believe WordPress core needs a major shakeup to help the community (re)develop focus and draw in engaged and effective technical participants. Gutenberg represents a wonderful opportunity to do that, as it brings a new paradigm to the core editor (and likely elsewhere, as the foundational technology expands into other areas of site management) and has the potential to draw in a new wave of web developers.

That said, the introduction of Gutenberg into core has been an interesting thing to watch. From afar, the concerns of the Accessibility Team seem to clearly show the divisions between WordPress as an open source project (WordPress.org) and as a commercial one (WordPress.com):

Continue Reading Trials and Tribulations with Gutenberg

Having just finished my midterm in Analysis of Algorithms (yes, the class is as dry as it sounds), my brain is still sharp on a few topics, one of them being dynamic programming, which I mentioned in my last post. In that post, wherein I tried to find motivation for forcing myself to relearn calculus, I used the classic example of calculating the nth term of the Fibonacci sequence.

I thought it would be helpful to see this example running with some real code. Below, we have a JavaScript function – fibRecursive – that takes an integer as a parameter. This integer represents the term that we want from the Fibonacci sequence. For example, calling fibRecursive(6) would return 8.
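The actual snippet is in the full post, but a minimal sketch of what fibRecursive looks like (assuming the convention where the sequence runs 1, 1, 2, 3, 5, 8, so fibRecursive(6) is 8), plus a memoized variant I’ve added to show the dynamic programming idea, goes something like this:

```js
// A naive recursive Fibonacci; fibRecursive(6) returns 8.
function fibRecursive(n) {
  if (n <= 2) {
    return 1;
  }
  // Each call spawns two more calls, so the same terms get recomputed
  // again and again. That repeated work is what dynamic programming removes.
  return fibRecursive(n - 1) + fibRecursive(n - 2);
}

// The memoized (dynamic programming) version caches each term so it is
// only ever computed once.
function fibMemo(n, cache = new Map()) {
  if (n <= 2) {
    return 1;
  }
  if (!cache.has(n)) {
    cache.set(n, fibMemo(n - 1, cache) + fibMemo(n - 2, cache));
  }
  return cache.get(n);
}

console.log(fibRecursive(6)); // 8
console.log(fibMemo(50));     // 12586269025, returned instantly; the naive version crawls
```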

Continue Reading A Short Story About Dynamic Programming

This is a question that is plaguing me at the moment as I force myself to relearn calculus for Analysis of Algorithms at Oregon State University. In moments like this, where the concepts are abstract and I need to learn even more abstract concepts so I can understand the first class of abstract ideas, motivation is key.

Why does this matter? I’ve been working in the industry for over five years and have never needed this knowledge before, so why now?

Well, the short answer is that you don’t need this body of knowledge to develop a wide range of applications and features. In my world, many of the concerns that common sorting, searching, and general optimization algorithms address are not real concerns, because they’ve been abstracted into the language or framework. I’m able to do my job because someone else has already figured out the parts that would otherwise need to be built from scratch. So while learning merge sort and analyzing its complexity is a fun exercise, I’ll not be writing it from scratch anytime soon.
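That abstraction looks something like this in day-to-day JavaScript: sorting is one built-in call, and the algorithm behind it (a merge-sort variant or Timsort, depending on the engine) is someone else’s problem.

```js
// The built-in sort hides the algorithm entirely; I just supply a comparator.
const caseLoads = [42, 7, 19, 3, 88];
caseLoads.sort((a, b) => a - b);
console.log(caseLoads); // [3, 7, 19, 42, 88]
```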

Continue Reading Why Study Algorithms?

Updating LexBlog.com’s aggregation engine was no small feat. Scott Fennell and I spent months testing all of the various components of our new aggregation engine that powers the vast majority of the site, but something that was hard to prepare for was the sheer scale of the site. Now that it’s up and running, we’re learning a lot about how to manage a site like this, and what sorts of features are necessary for it to be a successful publication from the perspective of an editor or reader.

One thing that I’ve recently keyed in on is search. Normally, I would tell a client that on-site search is not important. Most visitors are coming to a site from a much better search engine (Google), and are more apt to click around the site once there. LexBlog has layered some nice features onto the standard WordPress search, but most of those are about making sure that readers can search by an author’s name when they’re on a blog or website. This seems like something WordPress should do by default, but the generic WordPress search is “dumb” in the sense that it only looks at the post content and post title when running a search. Authors appear in neither, so some work had to be done to support searching an author’s name and getting their posts.

In any case, the on-site search is “good enough” for most readers, and most sites aren’t the size of LexBlog.com. However, LexBlog.com is big. Very big. There are nearly 400,000 posts and 20,000 users on the site. The results that WordPress returns with what is essentially a LIKE %query% SQL statement do a disservice to anyone who waits around for the page to load (a search on LexBlog.com right now can take anywhere from 10 to 15 seconds to return a page).

While a very small percentage of visitors to LexBlog.com use the on-site search feature (only about 2% of all page views are to a search results page), the relationships that we’ve layered into each post and user make search a potentially very attractive feature on the site. We could support advanced searches by organization, site, author, and date as opposed to a generic text search against all of the content. Moreover, the speed issues alone make me long for a better solution on the site.

Enter Elasticsearch. Elasticsearch is a product from Elastic.co; if you haven’t heard their story before, it’s worth the read.

I have looked at a variety of alternative search technologies for WordPress before this, including:

But I continue to come back to Elasticsearch for a number of reasons:

  • It’s cost effective
  • It’s easy to scale and manage
  • It has a wonderful, developer-friendly WordPress plugin – ElasticPress – built by a team – 10up – that I trust to continue maintaining and iterating on said plugin

This weekend, I took Elasticsearch/ElasticPress for a spin on LexBlog.com’s staging environment, and the results were surprising. Most keyword searches were 3-5 times faster (again, that speed difference alone is enough for me to favor Elasticsearch over WordPress’s search). Here’s a view of how long it takes for someone to search for “Kavanaugh” on LexBlog.com’s production environment:

Over 13 seconds!

and here’s what it looks like in staging:

Just over 4 seconds – much better 🙂

More than speed, though, Elasticsearch’s queries are optimized for searches in a way that WordPress is not. As I mentioned, WordPress searches post content and titles, but Elasticsearch/ElasticPress expands that to include taxonomies (tags, categories, and custom taxonomies) and bylines.

WordPress also has a very weak algorithm for the keyword search itself. Without going into too much detail, it performs a relatively exact search of the query, so misspellings or typos that may occur when you’re on a mobile device (or, if you’re like me, whenever you’ve been staring at a screen for more than 10 hours) are treated as though you meant to search for that exact phrase. Elasticsearch performs “fuzzy matching”, which looks for variations on the keyword that you’ve searched. For example, if you’re interested in the Stop Online Piracy Act – SOPA – you might search “SOPA’s enforcement.” However, maybe you’re feeling lazy that day and don’t want to type the apostrophe, so you search “SOPAs enforcement.” Elasticsearch is smart enough to return results for the Stop Online Piracy Act/SOPA, whereas WordPress returns only results where the text was literally “SOPAs enforcement”; that is, only instances where the author made the same “typo” that you did!
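For the curious, the fuzzy side of that looks roughly like the following query against an Elasticsearch index. This is a hand-rolled sketch, not the exact query ElasticPress builds; the index name, field names, and localhost endpoint are all illustrative.

```js
// A sketch of a fuzzy keyword search against Elasticsearch's _search endpoint.
// "fuzziness": "AUTO" lets results match within a small edit distance of the
// query, which is what forgives a missing apostrophe or a fat-fingered letter.
const query = {
  query: {
    multi_match: {
      query: "SOPAs enforcement",
      fields: ["post_title", "post_content", "post_author.display_name"], // illustrative field names
      fuzziness: "AUTO",
    },
  },
};

fetch("http://localhost:9200/lexblog-posts/_search", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(query),
})
  .then((response) => response.json())
  .then((results) =>
    console.log(results.hits.hits.map((hit) => hit._source.post_title))
  );
```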

While it seems like Elasticsearch may be winning the day at LexBlog, it’s still something for us to explore in more depth. As with all updates to a site, many people have a voice (including the readers) and we’re still waiting to see how they (and we) value search.

This was one of the most eventful summers in my life both personally and professionally. In July, Garry (LexBlog’s COO) and I had a chance to go to Chicago and spend some time talking about LexBlog’s future product line and general opportunities for integrating with our platform. It’s not often that I get an opportunity to do face-to-face meetings of this sort, and it was nice to get back in the saddle. It was also my first time visiting Chicago, and Garry seemed more than happy to drag me around.

The Chicago Riverwalk where I forced Garry to walk – he was incredibly pleased to be outdoors, walking for hours.
Cloud Gate – aka “The Bean” – which actually looks pretty cool close up.

Continue Reading A LexBlogger’s Summer in Review

I’m about halfway through Oregon State University’s (Go Beavers!) post-bacc program for computer science, but feel like I’ve just entered the belly of the beast. On the docket for the summer is CS 271 – Computer Architecture and Assembly Language: a fine, relaxing course to take in the months before and during my wedding, right? Not so much.

The material is dense as we learn how to move memory around on a computer and perform basic operations on the contents of said memory. The class is focused on IA-32, the 32-bit version of the x86 instruction set architecture found in early IBM workstations and personal computers, and later in embedded systems for phones, aerospace tech, and electronic musical instruments. I’m only a few weeks in, but already it’s painfully obvious to me that assembly is not like any other language I’ve used.

Continue Reading Oh The Places You’ll Go! …… with Assembly

Like many technical specifications on the web, RSS (which stands for Rich Site Summary or Really Simple Syndication, depending on who you talk to) has a confusing history that seems to only get more confusing as time goes on. The format became popular in the late 1990s, as the rise of blogging and dynamic websites made standardizing the information held on websites a pressing concern. The influx of information and content, all organized in different ways, was exciting, but without a standard way to consume the content, you were left with just a few options:

  • Bookmarks, and lots of them
  • Memorize a handful of URLs and visit only those sites
  • Build a custom web scraper

The goal of RSS (as I see it) was to give every site that publishes dynamic content a specification to follow so that the content is available at a known address, which the rest of the internet community can easily monitor for updates. For example, this blog’s RSS feed is available at https://www.jsulz.com/feed/. You can take this URL and drop it into Feedly or your RSS reader of choice, and every new post I publish will end up there alongside any other blogs you regularly read.
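Because the format is standardized, consuming a feed takes only a few lines. Here’s a quick browser-side sketch using this blog’s feed as the example; any feed URL works the same way, though cross-origin requests require the feed’s server to allow them.

```js
// Fetch an RSS feed and list each item's title and link.
const FEED_URL = "https://www.jsulz.com/feed/";

fetch(FEED_URL)
  .then((response) => response.text())
  .then((xml) => {
    // RSS is plain XML, so the browser's DOMParser can read it directly.
    const feed = new DOMParser().parseFromString(xml, "application/xml");
    // Every <item> element is one post.
    for (const item of feed.querySelectorAll("item")) {
      console.log(
        item.querySelector("title").textContent,
        item.querySelector("link").textContent
      );
    }
  });
```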

Continue Reading RSS Doesn’t Stand for “Really Should be Standard”, but Maybe It Should

Engineers make hardware and software for humans. It should go without saying, but remembering and staying true to that axiom is complicated depending on where you’re standing. With each passing year, it seems that things get more complicated, more random, more uncertain. This year was no different, especially in the realm of technology.

Facebook and Twitter are defending their platforms amidst allegations that they were used for interfering in America’s 2016 Presidential elections. Net neutrality seems to be going by the wayside with nary a peep from the so-called “Big N”, many of whom participated in protests in 2014 when the issue first came to the public’s attention. Uber dug itself into a hole as scandal after scandal rocked the company; the first of which was a female engineer lifting the veil and exposing a misogynistic and Darwinian culture, followed by revelations that the company had written software to avoid local law enforcement agents in areas where Uber was prohibited from operating. Meanwhile, the threat of automation and the looming specter of artificial intelligence have every working professional worried about the future of employment in this new economy.

The list could go on and on, and it neither began nor ended with last year. As long as corporate greed and bad company culture are not only allowed but praised, problems of this ilk will continue. The problem, as I see it, is most troubling in the context of computers.

Continue Reading Make Technology for Humans

Over the past few days, the Product team at LexBlog has been busy launching a few bodies of work that have been a long time coming. While our Success team launches sites and solutions on an hourly basis, the product side of LexBlog has the luxury of spending weeks, sometimes months, working on new features (what luxury!). It’s a truly fortunate situation, and one that we don’t take for granted.

This week, our Lead Developer, Scott Fennell – who blogs over at Code in the Cold – and our Director of Design, Brian Biddle, made their quarterly visit to the LexBlog Mothership (now at WeWork!), and we made sure to capitalize on having the team in the same offices. This Thursday and Friday we celebrated our team’s geographical unity, short-lived though it may be, by launching a new admin color scheme, a redesigned LXBN – named The LexBlog Network from here on out – and LexBlog Network subscription options for each author on LexBlog’s publishing platform.

While our authors may not find the new admin color scheme groundbreaking, this update was the source of some headaches for yours truly, and served as a great technical opportunity for Mr. Biddle and Angelo Carosio, LexBlog’s in-house DJ and developer extraordinaire.

Continue Reading LXBN as the New LexBlog