How I wrote a hybrid Elm/Javascript Node.js application

Edit2: With the changes in Elm 0.17, it's no longer possible to use a signal-based approach to write applications. As a result, some of the core logic used in the original version is no longer possible. You might be interested in my other post about Node applications written in PureScript (here) or my slides on converting this Elm app into a PureScript app (here).

Edit: I realize the code blocks can be annoying to read here, so here's a gist version: https://gist.github.com/justinwoo/4fe645643886948cd896

This isn't necessarily how you should go about writing an Elm application, or even that great of an idea, but it's one you could take ideas from and transform into something actually viable or useful to you.

This won't even really be a tutorial, though the parts are simple enough that you could just rip a lot of what I have and reassemble it. (Kind of like some kind of Danish toy or something)

Why?

Elm is pretty fun, but there are a lot of things that are just more comfortable to do in Javascript with the huge ecosystem that you can take advantage of. Plus, you can just run Elm as a worker that processes inputs, so why not? Not to mention, if your Elm program compiles, then even if it produces the wrong result, you know it's not going to have any runtime errors (from the Elm side) and that all your types are respected.

Also because the Node program that I had written before was a mess, so I was going to have to rewrite it anyway.

How?

Setup

I went with my same old tooling with webpack and elm-simple-loader (which is a loader for webpack that I wrote to simplify putting my built Elm JS into my bundle), but just slightly modified it for what I needed to work with Node. I really did copy my webpack config over from a browser Elm/JS project: webpack.config.js

Elm side source

I figured an Elm program is always valid if it compiles, so I might as well write this part first. For the most part, it was easy (with some warty stupid logic):

  • I defined what a "File" was (just a String representing the filename)
  • I defined what a "Target" was (a record of File and url where that File existed remotely)
  • I tried to think of what features exist now that I wanted to keep:
    • I wanted to ban some files on partial name matches and have a list for these. List File, I guess.
    • I didn't want to download files that I already had. Another List File, I guess.
  • I then tried to boil down what my crappy app even does:
    • I needed to get a list of targets from my favorite tracker website. List Target!
    • I needed to get back this list of targets that actually needed to be downloaded. Another List Target!
  • And then I worked out what ports I needed for IO with JS:
    • inputs:
      • bannedWordsSignal
      • downloadedFilesSignal
      • fetchedTargetsSignal
    • outputs:
      • requestDownloadsSignal

So then the main bit of my program is quite simple:

isBlacklisted : BannedWords -> File -> Bool
isBlacklisted bannedFiles file =
  List.any (\x -> String.contains x file) bannedFiles

isDownloaded : DownloadedFiles -> File -> Bool
isDownloaded downloadedFiles file =
  List.any (\x -> String.contains file x) downloadedFiles

processFile : BannedWords -> DownloadedFiles -> Target -> List Target -> List Target
processFile bannedWords downloadedFiles target targets =
  let
    name = target.name
    blacklisted = isBlacklisted bannedWords name
    downloaded = isDownloaded downloadedFiles name
  in
    if | blacklisted || downloaded -> targets
       | otherwise -> target :: targets

...and then I fold over a list of fetched targets, so I return either targets or targets with a new target.
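For comparison, here is a rough JavaScript sketch of the same filter-and-fold logic. It is a hypothetical mirror of the Elm functions above, not code from the project, and the sample file names and URLs are made up for illustration.

```javascript
// Hypothetical JavaScript mirror of the Elm functions above, for comparison only.

// A file is banned if any banned word appears in its name.
function isBlacklisted(bannedWords, file) {
  return bannedWords.some(function (word) {
    return file.indexOf(word) !== -1;
  });
}

// A file counts as downloaded if some downloaded file name contains its name.
function isDownloaded(downloadedFiles, file) {
  return downloadedFiles.some(function (downloaded) {
    return downloaded.indexOf(file) !== -1;
  });
}

// Fold step: prepend the target unless it is banned or already downloaded.
function processFile(bannedWords, downloadedFiles, target, targets) {
  var name = target.name;
  if (isBlacklisted(bannedWords, name) || isDownloaded(downloadedFiles, name)) {
    return targets;
  }
  return [target].concat(targets);
}

// Left fold over the fetched targets, starting from an empty list.
function getDownloadRequests(bannedWords, downloadedFiles, fetchedTargets) {
  return fetchedTargets.reduce(function (targets, target) {
    return processFile(bannedWords, downloadedFiles, target, targets);
  }, []);
}

// Made-up sample data.
var sampleRequests = getDownloadRequests(
  ['banned'],
  ['already-have.mkv'],
  [
    { name: 'already-have', url: 'http://example.invalid/1' },
    { name: 'some-banned-file', url: 'http://example.invalid/2' },
    { name: 'new-a', url: 'http://example.invalid/3' },
    { name: 'new-b', url: 'http://example.invalid/4' }
  ]
);
// sampleRequests keeps only new-b and new-a, in that order.
```

Because each kept target is prepended during a left fold, the result comes out in reverse order of the input. That doesn't matter here, since every requested target gets downloaded anyway.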

Javascript side source

Okay, here things are easy enough, I guess. The program flow is like so:

  • I read my config file to figure out where I'm getting information and what files are banned.
  • Then, in parallel, I do the following:
    • I check my downloaded files to fill up my list of what I already have.
    • I go fetch my potential targets for downloading.
  • I combine the values retrieved and then I instantiate an Elm worker with the banned words from the config, the downloaded files, and the fetched targets.
  • I then subscribe to the requestDownloadsSignal port from the worker and download away at the requested targets.
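The flow above could be sketched roughly like this. The names readConfig, listDownloadedFiles, fetchTargets, makeWorker and download are hypothetical stand-ins passed in as dependencies, and config.bannedWords is an assumed field name. Only the port names come from the article.

```javascript
// Sketch of the program flow. Every function on `deps` is a hypothetical
// stand-in, not a function from the real project.
function run(deps) {
  var config = deps.readConfig();

  // Check the local files and fetch the remote targets in parallel.
  return Promise.all([
    deps.listDownloadedFiles(config),
    deps.fetchTargets(config)
  ]).then(function (results) {
    // Instantiate the worker with initial values for the three input ports.
    // In the real app, makeWorker would presumably wrap the Elm worker
    // constructor from the compiled Elm bundle.
    var worker = deps.makeWorker({
      bannedWordsSignal: config.bannedWords, // assumed config field name
      downloadedFilesSignal: results[0],
      fetchedTargetsSignal: results[1]
    });

    // Download whatever the Elm side asks for.
    worker.ports.requestDownloadsSignal.subscribe(function (targets) {
      targets.forEach(function (target) {
        deps.download(target);
      });
    });

    return worker;
  });
}
```

Passing the side-effecting pieces in as `deps` is only a choice made for this sketch, so that it can be run against fakes. The real code doesn't need to be structured this way.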

Sounds good right? Yeah, it was pretty good, for the most part...

Ugly stuff

So if you looked at the code and compared it to the description I gave, you might've noticed that I didn't mention this ugly getDownloadsSignal thing. I sure didn't! That's something I spent a while looking into, and being annoyed by.

You know how I'm kind of a RxJS guy? What happens when you have a stream with an initial value and you subscribe to it in Rx? Yeah, you get the initial value sent to you! Well, in Elm Signal land, this is not the case. An initial value can be pulled from and consumed, but is not considered an event to be emitted. I posted about this on the mailing list and didn't really get much of a response: https://groups.google.com/forum/#!topic/elm-discuss/T3PLvJs4ZTo (though, thanks to Max for providing an explanation that people might find useful).
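The difference can be shown with two toy models. Neither is the real RxJS or Elm implementation; replayingStream and signalLike are hypothetical helpers written only to contrast "the initial value is delivered on subscribe" with "the initial value is stored but never emitted".

```javascript
// Toy models only: neither is the real RxJS or Elm implementation.

// Rx-style stream with an initial value: a new subscriber is handed the
// current value immediately.
function replayingStream(initial) {
  var current = initial;
  var listeners = [];
  return {
    subscribe: function (fn) {
      listeners.push(fn);
      fn(current);
    },
    send: function (value) {
      current = value;
      listeners.forEach(function (fn) { fn(value); });
    }
  };
}

// Elm-Signal-style: the initial value can be sampled, but only later
// updates count as events for subscribers.
function signalLike(initial) {
  var current = initial;
  var listeners = [];
  return {
    subscribe: function (fn) {
      listeners.push(fn);
    },
    send: function (value) {
      current = value;
      listeners.forEach(function (fn) { fn(value); });
    },
    sample: function () {
      return current;
    }
  };
}

var seenByRx = [];
var seenBySignal = [];

var rxStyle = replayingStream('initial');
rxStyle.subscribe(function (value) { seenByRx.push(value); });

var elmStyle = signalLike('initial');
elmStyle.subscribe(function (value) { seenBySignal.push(value); });

// At this point seenByRx holds 'initial', while seenBySignal is still empty,
// even though elmStyle.sample() returns 'initial'.
```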

So what happens when I instantiate a worker with initial signal values and subscribe to a signal that maps over those values? ...nothing! What does this mean for me? Well, either I need to delay sending one of those values until the other two have their initial values set (eh...), send all three afterwards and have my subscription get sent three different items (eh...), or add another signal that I send an undefined to purely as a trigger, using Json.Decode.Value as the signal type so that I can ignore the actual value (eh...). Well, you can see that I chose the third option, and so my code contains bits like this:

worker.ports.getDownloadsSignal.send();

import Json.Decode exposing (Value)

port getDownloadsSignal : Signal Value

getDownloadRequests : BannedWords -> DownloadedFiles -> FetchedTargets -> a -> DownloadRequest
getDownloadRequests bannedWords downloadedFiles fetchedTargets _ =
  List.foldl
    (processFile bannedWords downloadedFiles)
    []
    fetchedTargets

port requestDownloadsSignal : Signal DownloadRequest
port requestDownloadsSignal =
  Signal.map4
    getDownloadRequests
    bannedWordsSignal
    downloadedFilesSignal
    fetchedTargetsSignal
    getDownloadsSignal

Check out that last bit where I map over 4 signals, which means that on a new value from any of these signals, I take the latest (or initial) value of every signal and call the function given as the first argument with all of them applied in order. So when I do worker.ports.getDownloadsSignal.send() from JS, it kicks off this mapping and sends a new value to requestDownloadsSignal. Works?
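To make the trigger trick concrete, here is a toy JavaScript model of mapping over several signals. It is not Elm's actual runtime: makeSignal and mapSignals are hypothetical helpers, and the sample names are invented. It only illustrates that nothing is emitted until some input receives a new value, at which point the latest value of every input is used.

```javascript
// Toy model of mapping over several signals; not Elm's actual runtime.

function makeSignal(initial) {
  var signal = {
    value: initial,
    listeners: [],
    send: function (next) {
      signal.value = next;
      signal.listeners.forEach(function (fn) { fn(); });
    }
  };
  return signal;
}

// Derived signal: recomputed from the latest value of every input
// whenever any one of the inputs receives a new value.
function mapSignals(fn, inputs) {
  function compute() {
    return fn.apply(null, inputs.map(function (s) { return s.value; }));
  }
  var output = makeSignal(compute());
  inputs.forEach(function (input) {
    input.listeners.push(function () { output.send(compute()); });
  });
  return output;
}

// Three inputs with initial values, plus a trigger whose value is ignored.
var banned = makeSignal(['bad']);
var downloaded = makeSignal(['old']);
var fetched = makeSignal(['old', 'bad-thing', 'new']);
var trigger = makeSignal(null);

var requests = mapSignals(function (bannedWords, downloadedFiles, names, _ignored) {
  return names.filter(function (name) {
    var isBanned = bannedWords.some(function (w) { return name.indexOf(w) !== -1; });
    var isOld = downloadedFiles.some(function (d) { return d.indexOf(name) !== -1; });
    return !isBanned && !isOld;
  });
}, [banned, downloaded, fetched, trigger]);

var emitted = [];
requests.listeners.push(function () { emitted.push(requests.value); });
// Nothing has been emitted yet: the initial values alone are not an event.

trigger.send(undefined);
// Now emitted holds one entry, computed from the latest values of all inputs.
```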

Repo

Repository is here on Github: https://github.com/justinwoo/torscraper

Conclusion

So I've written a hybrid Elm/JS Node app that I actually use every day now. Was it fun? Yeah, pretty fun. Would I do it again? I'm not really sure. What a stupid thing to say in an article that's supposed to show how this kind of thing can be done, right? Yeah, probably. Sorry.

Anyway, if you made it this far, thanks! Please let me know if you thought this was amusing or useful or just terrible or something on twitter (@jusrin00) or something!
