Monday, March 20, 2017

JavaScript vs. Python in 2017

I may be one of the last people you would expect to write an essay criticizing JavaScript, but here we are.

Two of my primary areas of professional interest are JavaScript and “programming in the large.” I gave a presentation back in 2013 at mloc.js where I argued that static typing is an essential feature when picking a language for a large software project. For this reason, among others, I historically limited my use of Python to small projects with no more than a handful of files.

Recently, I needed to build a command-line tool for work that could speak Thrift. I have been enjoying the Nuclide/Flow/Babel/ESLint toolchain a lot recently, so my first instinct was to use JavaScript for this new project, as well. However, it quickly became clear that if I went that route, I would have to spend a lot of time up front on getting the Thrift bindings to work properly. I couldn't convince myself that my personal preference for JavaScript would justify such an investment, so I decided to take a look at Python.

I was vaguely aware that there was an effort to add support for static typing in Python, so I Googled to find out what the state of the art was. It turns out that it is a tool named Mypy, and it provides gradual typing, much like Flow does for JavaScript or Hack does for PHP. Fortunately, Mypy is more like Hack+HHVM than it is like Flow in that a Python 3 runtime accepts the type annotations natively whereas Flow type annotations must be stripped by a separate tool before passing the code to a JavaScript runtime. (To use Mypy in Python 2, you have to put your type annotations in comments, operating in the style of the Google Closure toolchain.) Although Mypy does not appear to be as mature as Flow (support for incremental type checking is still in the works, for example), simply being able to succinctly document type information was enough to renew my interest in Python.

In researching how to use Thrift from Python, a Google search turned up some sample Python code that spoke to Thrift using asynchronous abstractions. After gradual typing, async/await is the other feature in JavaScript that I cannot live without, so this code sample caught my attention! As we recently added support for building projects in Python 3.6 at work, it was trivial for me to get up and running with the latest and greatest features in Python. (Incidentally, I also learned that you really want Python 3.6, not 3.5, as 3.6 has some important improvements to Mypy, fixes to the asyncio API, literal string interpolation like you have in ES6, and more!)

Coming from the era of “modern” JavaScript, one thing that was particularly refreshing was rediscovering how Python is an edit/refresh language out of the box whereas JavaScript used to be that way, but is no more. Let me explain what I mean by that:

  • In Python 3.6, I can create a new example.py file in my text editor, write Python code that uses async/await and type annotations, switch to my terminal, and run python example.py to execute my code.
  • In JavaScript, I can create a new example.js file in my text editor, write JavaScript code that uses async/await and type annotations, switch to my terminal, run node example.js, see it fail because it does not understand my type annotations, run npm install -g babel-node, run babel-node example.js, see it fail again because I don't have a .babelrc that declares the babel-plugin-transform-flow-strip-types plugin, rummage around on my machine and find a .babelrc I used on another project, copy that .babelrc to my current directory, run babel-node example.js again, watch it fail because it doesn't know where to find babel-plugin-transform-flow-strip-types, go back to the directory from which I took the .babelrc file and now copy its package.json file as well, remove the junk from package.json that example.js doesn't need, run npm install, get impatient, kill npm install, run yarn install, and run babel-node example.js to execute my code. For bonus points, babel-node example.js runs considerably slower than node example.js (with the type annotations stripped) because it re-transpiles example.js every time I run it.
Indeed, the JavaScript ecosystem offers all sorts of tools to speed up this process using daemons and caching, but you or someone on your team has to invest quite a bit of time researching, assembling, and maintaining a JavaScript toolchain for your project before anyone can write a line of “modern” JavaScript. Although you may ultimately achieve a nice edit/refresh workflow, you certainly will not have one out of the box as you would in Python.


“JavaScript is no longer an edit/refresh language.”

Another refreshing difference between JavaScript and Python is the “batteries included” nature of Python. If you look at the standard library that comes with JavaScript, it is fairly minimal. The Node environment does a modest job of augmenting what is provided by the standard library, but the majority of the functionality you need inevitably has to be fetched from npm. Specifically, consider the following functionality that is included in Python's standard library, but must be fetched from npm for a Node project:

As you can see, for many of these features, there are multiple third-party libraries that provide overlapping functionality. (For example, if you were looking for a JSON parser, would you choose parse-json, safe-json-parse, fast-json-parse, jsonparse, or json-parser?) To make matters worse, npm module names are doled out on a first-come, first-serve basis. Much like domain names, this means that great names often go to undeserving projects. (For example, judging from its download count, the npm module named logging makes it one of the least popular logging packages for JavaScript.) This makes the comparison of third-party modules all the more time-consuming since the quality of the name is not a useful signal for the quality of the library.

It might be possible that Python's third-party ecosystem is just as bad as npm's. What is impressive is that I have no idea whether that is the case because it is so rare that I have to look to a third-party Python package to get the functionality that I need. I am aware that data scientists rely heavily on third-party packages like NumPy, but unlike the Node ecosystem, there is one NumPy package that everyone uses rather than a litany of competitors named numpy-fast, numpy-plus, simple-numpy, etc.


“We should stop holding up npm as a testament to the diversity of the JavaScript ecosystem, but instead cite it as a failure of JavaScript's standard library.”

For me, one of the great ironies in all this is that, arguably, the presence of a strong standard library in JavaScript would be the most highly leveraged when compared to other programming languages. Why? Because today, every non-trivial web site requires you to download underscore.js or whatever its authors have chosen to use to compensate for JavaScript's weak core. When you consider the aggregate adverse impact this has on network traffic and page load times, the numbers are frightening.

So...Are You Saying JavaScript is Dead?

No, no I am not. If you are building UI using web technologies (which is a lot of developers), then I still think that JavaScript is your best bet. Modulo the emergence of Web Assembly (which is worth paying attention to), JavaScript is still the only language that runs natively in the browser. There have been many attempts to take an existing programming language and compile it to JavaScript to avoid “having to” write JavaScript. There are cases where the results were good, but they never seemed to be great. Maybe some transpilation toolchain will ultimately succeed in unseating JavaScript as the language to write in for the web, but I suspect we'll still have the majority of web developers writing JavaScript for quite some time.

Additionally, the browser is not the only place where developers are building UI using web technologies. Two other prominent examples are Electron and React Native. Electron is attractive because it lets you write once for Windows, Mac, and Linux while React Native is attractive because it lets you write once for Android and iOS. Both are also empowering because the edit/refresh cycles using those tools is much faster than their native equivalents. From a hiring perspective, it seems like developers who know web technologies (1) are available in greater numbers than native developers, and (2) can support more platforms with smaller teams compared to native developers.

Indeed, I could envision ways in which these platforms could be modified to support Python as the scripting language instead of JavaScript, which could change the calculus. However, one thing that all of the crazy tooling that exists in the JavaScript community has given way to is transpiler toolchains like Babel, which make it easier for ordinary developers (who do not have to be compiler hackers) to experiment with new JavaScript syntax. In particular, this tooling has paved the way for things like JSX, which I contend is one of the key features that makes React such an enjoyable technology for building UI. (Note that you can use React without JSX, but it is tedious.)

To the best of my knowledge, the Python community does not have an equivalent, popular mechanism for experimenting with DSLs within Python. So although it might be straightforward to add Python bindings in these existing environments, I do not think that would be sufficient to get product developers to switch to Python unless changes were also made to the language that made it as easy to express UI code in Python as it is in JavaScript+JSX today.

Key Takeaways

Python 3.6 has built-in support for gradual typing and async/await. Unlike JavaScript, this means that you can write Python code that uses these language features and run that file directly without any additional tooling. Further, its rich standard library means that you have to spend little time fishing around and evaluating third-party libraries to fill in missing gaps. It is very much a “get stuff done” server-side scripting language, requiring far less scaffolding than JavaScript to get a project off the ground. Although Mypy may not be as featureful or performant as Flow or TypeScript is today, the velocity of that project is certainly something that I am going to start paying attention to.

By comparison, I expect that JavaScript will remain strong among product developers, but those who use Node today for server-side code or command-line tools would probably be better off using Python. If the Node community wants to resist this change, then I think they would benefit from (1) expanding the Node API to make it more comprehensive, and (2) reducing the startup time for Node. It would be even better if they could modify their runtime to recognize things like type annotations and JSX natively, though that would require changes to V8 (or Chakra, on Windows), which I expect would be difficult to maintain and/or upstream. Getting TC39 to standardize either of those language features (which would force Google/Microsoft's hand to add native support in their JS runtimes) also seems like a tough sell.

Overall, I am excited to see how things play out in both of these communities. You never know when someone will release a new technology that obsoletes your entire toolchain overnight. For all I know, we might wake up tomorrow and all decide that we should be writing OCaml. Better set your alarm clock.

(This post is also available on Medium.)

Sunday, March 19, 2017

Python: Airing of Grievances

Although this list focuses on the negative aspects of Python, I am publishing it so I can use it as a reference in an upcoming essay about my positive experiences with Python 3.6. Update: This is referenced by my post, "JavaScript vs. Python in 2017."

I am excited by many of the recent improvements in Python, but here are some of my outstanding issues with Python (particularly when compared to JavaScript):

PEP-8

What do you get when you combine 79-column lines, four space indents, and snake case? Incessant line-wrapping, that's what! I prefer 100 cols and 2-space indents for my personal Python projects, but every time a personal project becomes a group project and  true Pythonista joins the team, they inevitably turn on PEP-8 linting with its defaults and reformat everything.

Having to declare self as the first argument of a method

As someone who is not a maintainer of Python, there is not much I could do to fix this, but I did the next best thing: I complained about it on Facebook. In this case, it worked! That is, I provoked Łukasz Langa to add a check for this (it now exists as B902 in flake8-bugbear), so at least when I inevitably forget to declare self, Nuclide warns me inline as I'm authoring the code.

.pyc files

These things are turds. I get why they are important for production, but I'm tired of seeing them appear in my filetree, adding *.pyc to .gitignore every time I create a new project, etc.

Lack of a standard docblock

Historically, Python has not had a formal type system, so I was eager to document my parameter and return types consistently. If you look at the top-voted answer to “What is the standard Python docstring format?” on StackOverflow, the lack of consensus extremely dissatisfying. Although Javadoc has its flaws, Java's documentation story is orders of magnitude better than that of Python's.

Lack of a Good, free graphical debugger

Under duress, I use pdb. I frequently edit Python on a remote machine, so most of the free tools do not work for me. I have not had the energy to try to set up remote debugging in PyCharm, though admittedly that appears to be the best commercial solution. Instead, I tried to lay the groundwork for a Python debugger in Nuclide, which would support remote development by default. I need to find someone who wants to continue that effort.

Whitespace-delimited

I'm still not 100% sold on this. In practice, this begets all sorts of subtle annoyances. Code snippets that you copy/paste from the Web or an email have to be reformatted before you can run them. Tools that generate Python code incur additional complexity because they have to keep track of the indent level. Editors frequently guess the wrong place to put your cursor when you introduce a new line whereas typing } gives it the signal it needs to un-indent without further interaction. Expressions that span a line frequently have to be wrapped in parentheses in order to parse. The list goes on and on.

Bungled rollout of Python 3

Honestly, this is one of the primary reasons I didn't look at Python seriously for awhile. When Python 3 came out, the rhetoric I remember was: “Nothing substantially new here. Python 2.7 works great for me and the 2to3 migration tool is reportedly buggy. No thanks.” It's rare to see such a large community move forward with such a poorly executed backwards-compatibility story. (Presumably this has contributed to the lack of a default Python 3 installation on OS X, which is another reason that some software shops stick with Python 2.) Angular 2 is the only other similar rage-inducing upgrade in recent history that comes to mind.

I don't understand how imports are resolved

Arguably, this is a personal failure rather than a failure of Python. That said, compared to Java or Node, whatever Python is doing seems incredibly complicated. Do I need an __init__.py file? Do I need to put something in it? If I do have to put something in it, does it have to define __all__? For a language whose mantra is “There's only one way to do it,” it is frustrating that the answer to all of these questions is “it depends.”

Busted Scoping Rules

For a language with built-in map() and filter() functions that underscore the support for first-order functions, I am perplexed with how bad the support for closures is in Python. Apparently your options are (1) lambda (which is extremely limited), or (2) a bastardized closure. Borrowing an example from an article that explains the new nonlocal keyword in Python 3, consider the following code in JavaScript:
function outside() {
  let msg = 'Outside!';
  console.log(msg);
  let inside = () => {
    msg = 'Inside!';
    console.log(msg);
  };
  inside();
  console.log(msg);
}
If you ran outside(), you would get the following output (which you would expect from a lexically scoped language like JavaScript):
Outside!
Inside!
Inside!
Whereas if you wrote this Python code:
def outside():
    msg = 'Outside!'
    print(msg)
    def inside():
        msg = 'Inside!'
        print(msg)
    inside()
    print(msg)
and ran outside(), you would get the following:
Outside!
Inside!
Outside!
In Python 2, the only way to emulate the JavaScript behavior is to use a mutable container type (which is gross):
def outside():
    msg_holder = ['Outside!']
    print(msg_holder[0])
    def inside():
        msg_holder[0] = 'Inside!'
        print(msg_holder[0])
    inside()
    print(msg_holder[0])
Though apparently Python 3 has a new nonlocal keyword that makes this less distasteful:
def outside():
    msg = 'Outside!'
    print(msg)
    def inside():
        nonlocal msg
        msg = 'Inside!'
        print(msg)
    inside()
    print(msg)
I realize that JavaScript generally forces you to type more with its var, let and const qualifiers, but I find this to be a small price to pay in exchange for unambiguous variable scopes.