Language Transformations and You: Transpiling

Hey everyone!

It’s time for our 4th and final post on Language Transformations, this time taking a look at transpiling. Transpiling is the process of taking one source language and transforming it into another source language. If that sounds a lot like compiling, that’s because, for the most part, it is. The main difference between transpiling and compiling is the output. Compiling generally takes source code and transforms it into a more machine-interpretable set of instructions. Transpiling, on the other hand, outputs another human-readable and usable language.

So why use transpiling?

Getting Awesome New Features

I’m sure by now you have figured out from my posts that I love JavaScript and web development. 🙂

The unfortunate problem is that JavaScript is still in its adolescence. It has some great features, does some awesome things and has some powerful frameworks and libraries. That being said, it doesn’t have comprehensions, it’s weakly typed (a strength and a weakness) and it doesn’t handle text manipulation very well. It has a laundry list of problems…as well as a massive list of incredible solutions. Yet if I want some or even any of these issues to be natively resolved, I have to wait for browser support for ES6 and ES7.

Enter transpiling. With transpiling, you can get solutions for all of these problems, and even some problems you didn’t know existed, like gaining tuples! You can use CoffeeScript for text manipulation, value assertions and comprehensions, or you can use TypeScript (or eventually AtScript) for value assertions and strongly typed scripts. It doesn’t stop with JavaScript; you can get variables, calculations and mixins by using Sass or Less in place of CSS. You can also get better readability and templating with Jade instead of HTML.

Generally, this is the main reason that transpiling comes about. Someone says “I wish X language did Y,” and then they build a transpiler.

Familiarity

Let’s face it. Most developers have a go-to language that they use for a good share of their projects. Of course, there are generalists, newer and better languages, and some projects that simply don’t allow you to use your favorite language. What transpiling can allow, then, is the use of language features and syntax from your favorite language that is transformed into the language that generalists, early adopters and projects love!

Take CoffeeScript, for example. It reads, looks and feels a lot like Ruby and even Python. It takes their strengths and applies them to JavaScript, giving you the familiarity of a more mature programming language. Suddenly your client code looks a lot like your server code and has quite a few more features. The best part is that you get some ES6 functionality by using CoffeeScript, like arrow functions, comprehensions and string templating.
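As a rough illustration, here is a hand-written sketch of the kind of JavaScript the CoffeeScript compiler produces. The CoffeeScript source appears as comments; the output below each one mirrors typical compiler output rather than being actual compiler output:

```javascript
// CoffeeScript: square = (x) -> x * x
var square = function (x) { return x * x; };

// CoffeeScript: message = "Result: #{square(3)}"
// String interpolation compiles down to plain concatenation.
var message = "Result: " + square(3);

// CoffeeScript: doubled = (n * 2 for n in [1, 2, 3])
// A comprehension becomes an explicit loop that builds an array.
var doubled = [];
var list = [1, 2, 3];
for (var i = 0; i < list.length; i++) {
  doubled.push(list[i] * 2);
}
```

The point isn’t that you couldn’t write the output by hand; it’s that the input reads like Ruby or Python while still running everywhere JavaScript runs.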

Keeping it DRY

DRY, or as we all know it, Don’t Repeat Yourself, is a fundamental paradigm of software engineering. There may be situations where you need redundancy and repetition, but in most cases, keeping everything DRY is the best way to increase maintainability by reducing your points of maintenance (and failure). So then, how does DRY apply to transpiling?

Transpiling allows you to build your application in one language and then port it to many different platforms. You can build websites and mobile apps at the same time with PhoneGap, or you can build those same apps using C# with Xamarin. When you are done, these services will transform your source code from one platform into something native that your mobile devices can understand.

I know, I know, this post was a lot shorter and not as rife with examples as my other Language Transformation posts. But here’s the best part about transpiling…it’s simple enough that I don’t have to talk on and on about it. If you need more features, familiarity or a single code base, then it is time to look into transpiling.

As for me and my development? I’m sticking with pure JS and macros. Call me a hypocrite, but I just love me some curly braces, polyfills and creating my own domain language. But hey…I’m a bit crazy.

Language Transformations and You: Syntactic Preprocessing

Alright, time for another exciting post about language transformations and believe me, today’s post is incredibly exciting! You see, as I explained in Language Transformations and You: Lexical Preprocessing, Lexical Preprocessing is great for including source code…but doesn’t really cut it in many other situations. Syntactic Preprocessing, however, is always exciting! You see, Syntactic Preprocessing goes further than lexical preprocessing. Instead of parsing for compiler-specific preprocessor directives, it actually parses the code in search of text that matches patterns that you yourself define within macros.

What this means is that depending on the language, you can add new syntax and operators, change existing syntax and operators and even add new keywords! This kind of customization is huge with languages like Nim, Lisp or Scheme, but unfortunately hasn’t quite caught on with many of today’s main languages. Sure, you could use the C preprocessor as a lexical preprocessor to substitute bits of text as stated here, but I would have to agree with that answer…a simple code substitution isn’t needed, since you can use functions to do the same thing. Syntactic Preprocessing isn’t just a simple swap of a keyword for some code. Instead, it allows you to completely change how you write the language through fully descriptive patterns that can compile into whatever bit of code you need.

Why is Syntactic Preprocessing really important? Mainly, it allows you to create your own domain specific language, which can result in cleaner code that is easier to maintain. Remember our object hierarchy fun? Instead of a mess of multiple for loops that you constantly have to write in the exact same format, you can write a macro that takes cleaner code like parent#child#child#>callback. The compiler transforms that pattern into the messier actual code that your application needs in order to function (pardon the pun).

Here’s another example. JavaScript will soon have the spread operator with the arrival of ES6. Unfortunately, it could take months or even years before browsers implement all of ES6 (looking at you, Safari and Opera…you’re behind IE’s tech preview!). In the meantime, you can create macros to ease the pain and give you access to some of the features of the spread operator:

macro (..el) {
  // Flatten arrays.
  rule {
    // We're going to use recursion here to ensure that
    // elements that are arrays are also spread.
    [$x:invoke(..el) (,) ...]
  } => {
    $x (,) ...
  }
  // Simply return anything that isn't an array.
  rule {
    $x
  } => {
    $x
  }
}

macro (..>) {
  // For an array, iterate over each element in the
  // array and use the ..el macro on it.
  rule {
    [$x:invoke(..el) (,) ...]
  } => {
    [$x (,) ...]
  }
}

macro (~) {
  // For a function call after the ~ macro, allow a
  // single array to be spread as the function's parameters.
  rule {
    $x(..>$y)
  } => {
    $x.apply(null, $y)
  }
}

// var num = ~some(..>app)
// Outputs: var num = some.apply(null, app);

// var arr = ..>[1,2,3,[1,[2, [3, [foo]]]]]
// Outputs: var arr = [1,2,3,1,2,3,foo];

Here’s a gist of the above macro!

Right? Macros are awesome! They give you so much control over the language. Some developers may not like having a domain language or even using macros, but the way I see it, you need to use the best tool for the job. A domain language that fits your application’s needs, enforces conformance to proper design patterns and still compiles down to the code that you want run in production is a highly effective tool.

Macros sound pretty great so far…but what if you don’t want to use Scheme or Lisp? Luckily there are options for many of the modern languages out there:

  • C, C++, C#, Java and Objective-C – Unfortunately, these only have the #define preprocessor (or, in Java’s case, no preprocessor at all), which does a simple code substitution and cannot provide as much customization. However, you can change their compilers.
  • Go – No macros here either, but they have a way to generate code, which is just as awesome!
  • Javascript – sweet.js – This is what I used in the above spread operator example, and it can be incorporated with Grunt or Gulp.
  • Perl – macro
  • PHP – No macros or even a #define directive. You’re on your own here…
  • Python – macropy
  • Rust – included in the language – In fact, the macros in Rust are very similar to sweet.js, since both come from Mozilla.

I know what you’re thinking…you could go even further than a domain language and create an entirely new language that transpiles into another language…or even several languages, allowing your team to develop in one language and get native code for desktops, mobile OSes and even web applications. Of course, you would be right…but I’m not going to discuss that further until my next post, Language Transformations and You: Transpiling!

Language Transformations and You: Lexical Preprocessing

As I stated in Language Transformations and You!, I will be spending the next few weeks discussing how you can improve your skill set with various Language Transformation techniques. This week’s big topic will be Lexical Preprocessing, which is a method of changing your source code based on specific compiler recognized tokens. Is it worth your time? Let’s take a deeper look at it and I’ll let you decide.

Source Code Importing

The most pervasive example of Lexical Preprocessing is importing source code from one file into another. This is one of the most important tools in a developer’s repertoire because it allows you to organize your source code. As the old saying goes, “A place for everything, and everything in its place.” Now, I could probably rant about the importance of organizing your files into small reusable modules until I’m blue in the face, but I will give you the benefit of the doubt and assume you yourself rant about the very same thing. 🙂

Anyways, when source code is being interpreted and compiled, the compiler needs to be able to find the references for all of the code that you used within your files. It handles this during Lexical Preprocessing by loading source from any file that you have imported with an import, include or using statement (or whatever other keyword token your language of choice uses).

Unfortunately…there is a downside (there always is with technology): mocking out imported files for unit testing is a huge pain with normal Source Code Importing. In fact, with many languages, it can’t be done at all without some real hacky and evil code. The solution? Dependency Injection. Since this is a topic about Lexical Preprocessing, I won’t get into the gory details…suffice it to say, you definitely need to learn about Dependency Injection. I would start here, here and also in a TLDR post here.
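To give you a flavor of why it helps, here is a minimal sketch (the InvoiceService and logger shape are hypothetical names of my own): instead of require()-ing its logger directly, which is painful to mock, the service receives it as a constructor argument, so a test can hand in a fake:

```javascript
// Constructor-based dependency injection: the collaborator is passed
// in rather than imported, so tests can substitute a fake.
function InvoiceService(logger) {
  this.logger = logger;
}

InvoiceService.prototype.process = function (invoice) {
  this.logger.log('Processing invoice ' + invoice.id);
  return invoice.total;
};

// Production code injects a real logger; a unit test injects a fake
// and can inspect what was logged.
var messages = [];
var fakeLogger = { log: function (msg) { messages.push(msg); } };

var service = new InvoiceService(fakeLogger);
var total = service.process({ id: 42, total: 100 });
```

No import needed to be intercepted, and the test never touched a real logging backend.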

That one con aside, you’ve probably used this technique so often that you barely notice it anymore. It’s become a part of your routine. Add a file, include a file, use the imported contents within your new file. You’ve already begun transforming your source code into a more readable and well organized project through the use of Lexical Preprocessing. Take a look in the mirror and high five yourself. Yeah, you’re pretty awesome. Now clean off your hand print before anyone realizes you just high fived your reflection. Don’t worry, your secret is safe with me.

“So”, you ask as you grab that bottle of window cleaner, “what else can Lexical Preprocessing do?”

I’m glad you asked.

Source Code Substitution

You see, Lexical Preprocessing isn’t just about importing source code. It’s also about changing your source code based on static, compiler-recognized tokens. Let’s say you have a utility or application base code file that you use to define various settings, objects and functions that you use throughout your application. You might have database connection objects, file paths and boolean settings that change the behavior of your application. What if you need those to change based on whether or not the application is running in production? One common approach is to use Source Code Substitution in order to change the compiler’s output based on various compiler and environmental states.

Remember our database connection logging example in the previous article?

using(SqlConnection con = new SqlConnection(connectionString)){
	// Set logging
#if DEBUG
	loggingLevel = LoggingLevels.Debug;
#else
	loggingLevel = LoggingLevels.Production;
#endif

	//Do some SQL Stuff
}

// Output for Debug:
using(SqlConnection con = new SqlConnection(connectionString)){
	// Set logging
	loggingLevel = LoggingLevels.Debug;

	//Do some SQL Stuff
}

//Output for Prod:
using(SqlConnection con = new SqlConnection(connectionString)){
	// Set logging
	loggingLevel = LoggingLevels.Production;

	//Do some SQL Stuff
}

With the use of the #if, #else and #endif preprocessor directives, we can change the output of our source code to fit our application’s environment. This tells the compiler that if DEBUG is true, it should only include the line where we set the loggingLevel to Debug. For anything else, it should be set to Production. That’s a pretty nifty trick to use in order to ensure that our application runs exactly as we need it to in a developer environment or a production environment.
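JavaScript has no preprocessor directives, but build tools achieve a similar effect: tools like webpack can replace process.env.NODE_ENV with a string literal at build time, after which a minifier removes the dead branch, giving the same end result as #if/#else. A rough sketch of what the source looks like before that substitution:

```javascript
// Before the build step: both branches exist in the source.
var loggingLevel;
if (process.env.NODE_ENV === 'production') {
  loggingLevel = 'Production';
} else {
  loggingLevel = 'Debug';
}

// After the tool substitutes the literal, the condition becomes
// ('production' === 'production'), and the minifier keeps only:
//   var loggingLevel = 'Production';
```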

The only problem…is that it can clutter up your code if you use it too heavily. Remember your utility file? What if most of the settings in there change depending on the application’s environment? It would be a sloppy mess of preprocessor directives mixed in with your code, and determining the finished output of a source file would be a nightmare. Of course, if kept in one file, it can allow your entire application to behave differently as needed and keep the mess contained.

A better solution? Put your settings into separate configuration files based on your different application environments. Then you can load whichever one you need based upon the environment. It’s cleaner, more organized and easier to maintain. In fact, I would argue that in any situation where you feel you need to use preprocessor directives, you probably don’t. By writing well organized, modular, functional code that utilizes configuration files, proper source code importing and even Dependency Injection, you can make your application behave differently in almost any environment that you care to configure.
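A sketch of that approach in JavaScript. In a real project each environment’s settings would live in its own file (say, config.development.json and config.production.json, both hypothetical names); they’re inlined here so the example stands on its own:

```javascript
// Per-environment settings kept as data, not as preprocessor directives.
var configs = {
  development: { dbHost: 'localhost', loggingLevel: 'Debug' },
  production: { dbHost: 'db.example.com', loggingLevel: 'Production' }
};

// Pick the configuration once at startup based on the environment,
// falling back to development when nothing is specified.
function loadConfig(env) {
  return configs[env] || configs.development;
}

var config = loadConfig(process.env.NODE_ENV);
```

The rest of the application just reads from config; no directive clutter, and the finished behavior of any file is obvious at a glance.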

There are even some build systems that you can configure or script for source code importing (like Grunt or Gulp with JavaScript), so you can tell the build to import entirely different files based on your environment. Having tooling that powerful allows you not only to separate behavior into different files, but also to use the right behavior for your environment without adding a bunch of compiler jargon to your code files. How awesome is that?

So, to sum up, you most likely already use Lexical Preprocessing often through source code importing. It has a fairly minor con, seeing as you can still write incredible, well organized code. Some languages even allow you to write meaningful unit tests alongside source code importing, without the use of Dependency Injection.

The other main use, source code substitution, should be used sparingly, as there are better ways to make your application behave differently in various environments than by substituting code.

Stay tuned next week for my favorite part of Language Transformations: Syntactic Preprocessing!

Language Transformations and You!

Language Transformations have long been a huge and important part of the everyday life of many developers. What are Language Transformations? I’m glad you asked. You see, Language Transformations are what I’m calling the collection of various terms, technologies and concepts that transform one programming language into another, for lack of a better phrase. L.T. covers everything from using preprocessor directives to transpiling to language macros. Most importantly, it can change the way that you develop websites or applications for the better.

Let’s start with an example. You’re working on a large enterprise application that processes invoices. These invoices have billable items and each item has sub-items. This object hierarchy is a very common occurrence in various Object Oriented and Object Based languages. What happens if you have a collection of invoices, and on each you need to total up the quantities of the sub-items on each item?

// For the purposes of this example, let's ignore some
// of the more functional awesomeness that we can use
// in javascript and any other way we could improve the 
// performance of this bit of code.
totalQuantity = function (invoices) {
    var quantity = 0;

    for (var i = 0; i < invoices.length; i++) {
        var invoice = invoices[i];
        for (var j = 0; j < invoice.items.length; j++) {
            var item = invoice.items[j];
            for (var k = 0; k < item.subItems.length; k++) {
                quantity += item.subItems[k].quantity;
            }
        }
    }

    return quantity;
};

Alright, quite a bit of an eyesore…it could use some improvements, but it gets the job done. Now, what if you need to get the cost of each item? It would require almost exactly the same code:


totalCost = function (invoices) {
    var cost = 0;

    for (var i = 0; i < invoices.length; i++) {
        var invoice = invoices[i];
        for (var j = 0; j < invoice.items.length; j++) {
            var item = invoice.items[j];
            for (var k = 0; k < item.subItems.length; k++) {
                cost += item.subItems[k].cost;
            }
        }
    }

    return cost;
};

The only key difference here is that we’ve replaced the word quantity with the word cost. While there are many ways to refactor this code, the problem still persists. What if the same thing happens for printing out a list of invoices of a sub account on a main account? Will your original solution cover it? Or will you have a second solution for the Account -> subAccount -> Orders conundrum? Plus, what if you need to optimize one of your hierarchy traversal solutions…and realize that all of them need to be optimized as well?
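One such refactor is to extract the traversal into a single helper and inject the per-subItem logic as a callback (the helper name and sample data here are my own invention). It deduplicates these two functions, but the moment a differently-shaped hierarchy shows up, you’re writing the traversal all over again:

```javascript
// A generic traversal helper: the nesting lives in one place and the
// aggregation is injected as a reducer callback.
function aggregateSubItems(invoices, reducer, initial) {
  var total = initial;
  invoices.forEach(function (invoice) {
    invoice.items.forEach(function (item) {
      item.subItems.forEach(function (subItem) {
        total = reducer(total, subItem);
      });
    });
  });
  return total;
}

var invoices = [
  { items: [{ subItems: [{ quantity: 2, cost: 5 }, { quantity: 1, cost: 3 }] }] }
];

var totalQuantity = aggregateSubItems(invoices, function (t, s) {
  return t + s.quantity;
}, 0);

var totalCost = aggregateSubItems(invoices, function (t, s) {
  return t + s.cost;
}, 0);
```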

Enter Language Transformations. By using Language Transformations, you can improve the readability and reuse of your code, while also simplifying how your application is written. These transformations come at a cost of course: either you have to spend time with another compilation and build step or you take a performance hit during run time. These cons need not scare you away from writing your own domain specific language through the use of Language Transformations. The pros of readability, convention, reuse, a single point of maintenance, code specialization and code simplicity far outweigh waiting a few more moments while compiling. In the case of any run time transformations, you will need to weigh the performance cost against the benefits for your own application.

Now that I’ve shown you why you need to transform your languages, I’m going to explain all of the pieces in a nice little teaser for you. Don’t worry, I will explain each of these in its own article over the next few weeks.

Lexical Preprocessing

Lexical Preprocessing is the idea of adding semantic fluff to your source code that the compiler then interprets in order to take various actions upon the source code. The fluff, or preprocessing directives, is never included in the final compiled output of the source code. As such, it can be a fairly useful tool for developers to change the source code for different environments, applications, sites and uses. Its main disadvantage, however, is that it can also clutter your code if it isn’t handled appropriately. Let’s look at an example in C#:

using(SqlConnection con = new SqlConnection(connectionString)){
	// Set logging
#if DEBUG
	loggingLevel = LoggingLevels.Debug;
#else
	loggingLevel = LoggingLevels.Production;
#endif

	//Do some SQL Stuff
}

// Output for Debug:
using(SqlConnection con = new SqlConnection(connectionString)){
	// Set logging
	loggingLevel = LoggingLevels.Debug;

	//Do some SQL Stuff
}

//Output for Prod:
using(SqlConnection con = new SqlConnection(connectionString)){
	// Set logging
	loggingLevel = LoggingLevels.Production;

	//Do some SQL Stuff
}

The compiler reads through the code, checks the status of DEBUG and then outputs the proper line for the directive.

As you can see, this allows you to change the behavior of the application before runtime, by telling the compiler to transform the language based on specific rules. This can be useful when properly applied; however, if you use it fairly often, it can easily clutter up your code.

Syntactic Preprocessing (Macros)

Some compilers include an awesome feature known as Syntactic Preprocessing, or “Macros” for short. These aren’t the same scripts that you use to automate MS Office or your computer; these are predefined rules that tell a compiler how to transform your code. While similar to Lexical Preprocessing, the key difference is that instead of including, excluding or substituting blocks of code based on the environment or compiler state, Macros can change the syntactical behavior of a language by performing the same substitution operations based on the code itself. This is what is done by various preprocessor languages like CoffeeScript, Less and Sass, or by preprocessor libraries like Sweet.js.

Remember our fun object hierarchy dilemma? By using macros you can create your own domain specific syntax that allows you to handle the solution fairly elegantly:


aggCost = function(total, data){
    return total + data.cost;
}

totalCost = invoices#items#subItems#>aggCost;

// Output:

totalCost = function (invoices) {
    var cost = 0;

    for (var i = 0; i < invoices.length; i++) {
        var invoice = invoices[i];
        for (var j = 0; j < invoice.items.length; j++) {
            var item = invoice.items[j];
            for (var k = 0; k < item.subItems.length; k++) {
                cost = aggCost(cost, item.subItems[k]);
            }
        }
    }

    return cost;
};

The compiler looks at the # operator and uses it to build out the for loops. Once it sees the #> operator, it knows that it should place a callback within the previously made for loop. You can define the rules to create your cost variable based on the name. In fact, you can have it substitute whatever you need based on rules; just remember that since this is a compile step, it cannot interpret run time properties like the length of an array or the return of a function.

Note: There are some languages that allow syntax manipulation at run time. In that case, your macros can check the state of a variable or the return of a function.

Code highlighting issues aside, your source code is now a lot easier to read and is much cleaner. Just like any domain specific language, you will have to share its meaning with anyone else developing in the code base. To me, that’s not a terrible cost when you see how much more maintainable this methodology is in certain scenarios. Need to change how all of your object hierarchies are traversed? Simply update your macro, compile and test.

Transpiling

Transpiling is a relatively new term that means compiling from one language into another language at or near the same level of abstraction. Basically, this is more of an escalated approach to language transformation, where you code in one language like C# and the output is Objective-C. This approach generally isn’t used for customization, but rather for familiarity. Perhaps your company has an application that is written in C# or JavaScript and you want to put that application on Google’s Play Store or Apple’s App Store. You could rewrite the entire application…or you could transpile it from C# into Java and Objective-C.

It’s not just limited to that specific instance either. Google has their own transpiler called Traceur that allows you to transpile ECMAScript 6 (future JavaScript) to ECMAScript 5 (the JavaScript of current modern browsers). Transpiling is also how TypeScript and AtScript are transformed into JavaScript.
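To make that concrete, here is a hand-written sketch of the kind of transformation an ES6-to-ES5 transpiler performs. The ES6 source appears as comments, and the ES5 below mirrors typical output rather than being actual Traceur output:

```javascript
// ES6 source:
//   const double = (n) => n * 2;
//   const greet = (name) => `Hello, ${name}!`;

// ES5 equivalent the transpiler emits:
// arrow functions become function expressions, and
// template literals become string concatenation.
var double = function (n) { return n * 2; };
var greet = function (name) { return 'Hello, ' + name + '!'; };
```

You write tomorrow’s syntax; today’s browsers receive code they already understand.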

With transpiling, you can get a whole different set of language features that are compliant with your production environment’s language needs.

That’s all I have on Language Transformations this week. As I said before, this will be a 4 part series, where I go in depth into various styles of Language Transformation that will allow you to revolutionize how you handle development for your domain’s specific needs. I hope you are as excited as I am to take a closer look at Lexical Preprocessing, Syntactic Preprocessing and Transpiling.