Difference between revisions of "Yhc/Javascript"

From HaskellWiki

Revision as of 20:14, 8 November 2006

This page is work in progress. If you see a section that is empty revisit the page later: it will appear. Or let the developer know what you would be interested to see on this page.

Brief Overview

An experimental sub-project, Yhc Core to Javascript Converter (ycr2js), is aimed to create a tool that generates Javascript out of a binary Yhc core file.

The project was started as an experimental patch to nhc98 in an attempt to convert its internal PosLambda constructs into Javascript expressions. After some initial success, the project was switched to use the Yhc Core as the source for transformation. Recently, with a great amount of help from the Yhc Team, the project has been integrated into the main Yhc source tree and is moving towards closer integration with the compiler.

Ability to convert an arbitrary Haskell source into Javascript makes it possible to execute Haskell programs in a Web browser. This, in turn, allows for development of both client and server sides of an Internet application entirely in Haskell.

Server side solutions in Haskell have been around for a while, such as HAppS -- Haskell Application Server, Haskell Server Pages, and others. For the client side, HSPClientSide has been recently introduced, which is a close analog to ycr2js. HSPClientSide provides a domain-specific language to define the client side Web page structure (static HTML and Javascript). In contrast, ycr2js helps convert any compilable Haskell source into Javascript.

Principles of Operation

The Yhc compiler generally produces a binary bytecode file (usually named with .hbc extension) for each Haskell module compiled. These bytecode files are to be interpreted by yhi, a command-line bytecode interpreter.

The compiler is also capable of producing a binary core file (usually named with .ycr extension), and also its human-readable representation for each Haskell module compiled. The internal structure of core is based on significantly simplified nhc98's PosLambda constructs (Yhc is derived from nhc98 code). Core consists of definitions for compiled Haskell functions and data objects.

The feature of core linking was added recently to Yhc. This allows for merging core files from several modules together, removing functions that are not used (similar to static linking performed by a traditional Unix or Windows executable linker). The resulting file (usually named with .yca extension) has the same format as per-module core files.

Binary core files may be read back into computer memory using the Yhc Core API functions.

The ycr2js program reads the binary core file specified (.yca or .ycr), and performs conversion of Haskell functions compiled into Core to their Javascript representation storing the generated Javascript code in a file. Resulting Javascript may be embedded on a (X)HTML page to be loaded into a Web browser.

Users Guide

Note: ycr2js is currently a standalone program distributed within the Yhc source tree. The usage guidelines below are relevant only for the standalone version. If ycr2js gets integrated more tightly into Yhc, some or all of the following statements may no longer be applicable and will be updated accordingly.

Downloading

The ycr2js program along with additional tools is distributed within the Yhc source tree. Download and install Yhc as recommended here and here. Include the core=1 option on the scons command line when building Yhc to generate Core for all Haskell library modules. The yhc and ghc executables should be on the PATH after the installation is complete.

Building and Installation on Unix

Change from the toplevel directory of the Yhc source tree to the src/translator/js directory. Execute commands:

make all
make install
make test

All three commands should finish without error.

Building and Installation on Windows

This currently does not work, but will do once the Scons build system that Yhc uses has been integrated.

What Is Installed

The Makefile in the ycr2js directory instructs make to place all necessary files relative to the yhc executable location. After the installation of ycr2js, the directory structure will be as follows (assuming that prefix is the base path of the Yhc installation):

prefix/bin
  ycr2js Core to Javascript Converter
  pgbuild XHTML Page Building Utility
prefix/lib
  haskell
    StdOverlay.hs Standard Javascript Core Overlay
    UnsafeJS.hs Contains unsafeJS, a pseudo-function to inline Javascript code in Haskell functions
  javascript
    Runtime.js Runtime Support Javascript Library, needs to be included with every XHTML page generated
  xhtml
    emptyx.html An Empty XHTML Page template

The structure shown above represents a "bare" installation essential for ycr2js to function properly. Actual installation may include more files than these.

The base location of the Yhc installation is referred to below as prefix unless otherwise stated.

Compiling Haskell into Core

In order to obtain a linked Core file for a Haskell source(s), the Yhc compiler should be run like this:

yhc -includes prefix/lib/haskell \
   [-includes other include directories] \
    -linkcore Main.hs [other Haskell source files]

Other yhc options may be used as needed, but specifying the prefix/lib/haskell directory with the -includes option is important. The -linkcore option instructs Yhc to link all generated core files together, along with the library modules used. It is assumed that the user program's main module is named Main, and its entry point is main. The type signature of Main.main does not matter. The linked core will be output as the file Main.yca.

Converting Core into Javascript

In order to generate Javascript out of a linked core file (.yca), run the Core to Javascript converter:

ycr2js Main.yca Main.main >Main.js [2>Main.core]

The converter writes the generated Javascript to its standard output, and a visual representation of the Core (after the overlay and reachability check are applied, see details below) to its standard error. If redirection of standard error is omitted, the visual representation of Core will not be saved.

The converter takes the name of the linked Core file as its first command line parameter, and root names as the remaining parameters. Root names specify the functions from which reachability is computed: the roots themselves and every function they refer to, directly or indirectly, are kept. For example, a module may contain functions like this:

module Main where

import UnsafeJS

factorial :: Int -> Int

factorial 0 = 1
factorial 1 = 1
factorial n | n > 0 = n * (factorial (n - 1))
            | n <= 0 = error "Factorial of a negative value"

anyStatus :: a -> a

anyStatus a = unsafeJS "window.status = exprEval(a).toString(); return a;"

str2 = 'a':'c':'d':"abyrvalg"

s5 :: String
s5 = 'g':[]

s6 :: String
s6 = 'd':""

s7 :: String
s7 = "ertyu"

main = anyStatus (["aaa","bbb",s7 ++ (head s7):(head s6):str2])

It can be easily seen that Main.main uses (of this module):

  • anyStatus
  • s7
  • s6
  • str2

and does not use:

  • factorial
  • s5

This means that the generated Javascript file will not contain factorial and s5 just for the reason of size optimization.

It is possible that a script for a Web page contains several parts not linked symbolically, so don't forget what functions are "roots" and specify them all on the command line.
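The reachability check described above can be pictured as a graph traversal starting from the roots. The following is only an illustrative sketch, not the code ycr2js uses (the converter itself works on Yhc Core): the callGraph table and the reachableFrom helper are hypothetical names, and the table was filled in by hand from the example module.

```javascript
// Hypothetical call graph for the example module: each function name maps
// to the functions of this module that its definition refers to directly.
const callGraph = {
  "Main.main": ["Main.anyStatus", "Main.s7", "Main.s6", "Main.str2"],
  "Main.anyStatus": [],
  "Main.factorial": ["Main.factorial"],
  "Main.str2": [],
  "Main.s5": [],
  "Main.s6": [],
  "Main.s7": [],
};

// Collect every function reachable from the given roots (depth-first).
function reachableFrom(graph, roots) {
  const seen = new Set();
  const stack = [...roots];
  while (stack.length > 0) {
    const name = stack.pop();
    if (seen.has(name)) continue;   // already visited
    seen.add(name);
    for (const callee of graph[name] || []) stack.push(callee);
  }
  return seen;
}

// Only Main.main is given as a root, as on the ycr2js command line above.
const kept = reachableFrom(callGraph, ["Main.main"]);
console.log([...kept].sort().join(", "));
```

With Main.main as the only root, factorial and s5 are absent from the result. Passing several roots simply seeds the traversal with several starting points.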

Another important thing to know about Javascript generation is Core Overlay. An overlay is a fake Haskell module containing declarations of functions to be replaced in the output Javascript file.

A good example of this is the Prelude.error function. It is defined in the Prelude as:

error :: String -> a
error s = primError s

primError :: String -> a
primError xs = trace (xs++"\n") (unsafePerformIO (exitWith (ExitFailure (negate 1))))

Not only is this definition irrelevant to Javascript execution in a Web browser (indeed, there is no traditional I/O), but it also pulls in many other functions that are equally irrelevant. At the same time, the Prelude.error function is referred to from every case or if expression translated to Javascript.

The ycr2js program uses the overlay file, prefix/lib/javascript/StdOverlay.hs. Currently, only this overlay file may be used by ycr2js and its path is hardcoded into the program (although in the future, this limitation may be lifted). In an overlay file, the names of functions to be replaced (always qualified) are encoded: for example, a dot is replaced with a prime-underscore sequence (the full set of decoding rules is defined in [1] module in the function decodeString).

So, the Standard Overlay module defines Prelude.error to substitute one from the Prelude:

module StdOverlay where

import UnsafeJS

-- To substitute Prelude.error

global_Prelude'_error :: String -> a

global_Prelude'_error a = unsafeJS "alert(a); return undefined;"

which means that in the case of an error, an alert window will appear, and return of an undefined (in Javascript meaning) value will stop further script execution, most likely causing other error messages.

As an alternative, Prelude.error might throw an exception which unless caught would propagate to the top level and cause an error message to appear in Javascript Console, or in the browser's status line.

At this point, the generated Javascript is saved in the file Main.js.

Building a XHTML Page

In order to build a XHTML page ready to be loaded in a Web browser, another program that comes together with ycr2js is to be used:

pgbuild -t prefix/lib/xhtml/emptyx.html -o Main.html -T "Main" \
   -e prefix/lib/javascript/Runtime.js -e Main.js \
   --onload="exprEval(Main_46main)"

The pgbuild program, given a XHTML page template, parses it (using the Haskell XML Toolbox package functionality), and embeds the Javascript files (or script URLs) provided into the <head> section of the template. Additionally, the page title and the onload attribute of the page body may be changed.

The -t command line option is used to specify the path to the empty page template. A sample template (usable in many cases) comes with ycr2js and is located in prefix/lib/xhtml/emptyx.html. The -o option specifies where to place the output file. The -T option specifies the page title to override the one stored in the template. The -e option instructs that the following file name points to a script to be embedded, and the order of embedding is the same as the order of appearance of file names on the command line. The --onload option results in the <body onload="..."> set to the value specified. The latter should be the name of a function encoded as follows:

  • all alphanumeric characters remain the same
  • all other characters are replaced with an underscore followed by character's numeric code

so Main.main becomes Main_46main
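The two encoding rules above can be tried out with a few lines of Javascript. This is a minimal sketch under stated assumptions: mangleName is a hypothetical helper (it is not part of pgbuild or Runtime.js), "alphanumeric" is taken to mean ASCII letters and digits, and the numeric code is written in decimal, which agrees with the Main_46main example.

```javascript
// Hypothetical helper: keep ASCII letters and digits as they are, replace
// every other character with an underscore followed by its decimal code.
function mangleName(name) {
  return name.replace(/[^A-Za-z0-9]/g, function (ch) {
    return "_" + ch.charCodeAt(0);
  });
}

console.log(mangleName("Main.main"));   // Main_46main
console.log(mangleName("Prelude.:"));   // Prelude_46_58
console.log(mangleName("Prelude.[]"));  // Prelude_46_91_93
```

The last two results match the constructor names for list cells that appear in the generated code.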

would this be better to have pgbuild call fixStr itself and accept unmangled function names?

The XHTML Web page is written into the file Main.html and is ready to be loaded into a Web browser.

Tools Summary

All Together: A Simple Makefile

Here is an example of a simple Makefile containing rules sufficient to compile and build Web pages from Haskell sources. The yhc executable must be on the PATH.

When cutting and pasting this Makefile, remember that the command lines of each rule must begin with a tab character, not spaces.

#============ Begin Simple Makefile for Standalone ycr2js ============
YHC = yhc
YHCBASE = $$(dirname $$(dirname `which $(YHC)`))
YCR2JS =  $(YHCBASE)/bin/ycr2js
PGBUILD = $(YHCBASE)/bin/pgbuild
XMLTMPL = $(YHCBASE)/lib/xhtml/emptyx.html
RUNTIME = $(YHCBASE)/lib/javascript/Runtime.js
HSJSLIB = $(YHCBASE)/lib/haskell
# Insert names of Web pages to build here. General naming rule:
# if the page is built out of a Haskell file Foo.hs which contains
# the main function, the Web page file name will be Foo.html
all: Foo.html
# If Foo.hs imports modules from any location other than current directory,
# add more -includes options to this rule.
%.yca: %.hs
	$(YHC) -includes $(HSJSLIB) -linkcore $<
%.js: %.yca
	$(YCR2JS) $< $*.main > $@ 2> $*.coreovr
# These two rules generate visual representation of Core.
# Include files with names ending with .yc[ra]txt in the list
# of `all' dependencies to build these files: .yca and .js files
# will be deleted by make in the very end.
%.ycatxt: %.yca
	$(YHC) -viewcore $< > $@
%.ycrtxt: %.ycr
	$(YHC) -viewcore $< > $@
# Don't forget that your "root" module Foo must contain the
# `main' function. Its type signature does not matter.
%.html: %.js
	$(PGBUILD) -t $(XMLTMPL) -o $@ -T "$*" -e $(RUNTIME) -e $< \
               --onload="exprEval($*_46main)"
#============= End Simple Makefile for Standalone ycr2js =============

Programmers Guide

The main function

Calling Javascript from Haskell: unsafeJS

Calling Haskell from Javascript

Passing Primitive Values

Passing Strings

Passing Arrays

Passing Objects

Example: a Simple Monad

Inner Workings

In this section, internal structure of Javascript objects and runtime support algorithms is reviewed.

Javascript Objects

The table below summarizes the types of Javascript objects used in the ycr2js-generated Javascript code.

Javascript Object Types and Their Methods and Properties

Each entry gives the member or constructor name, its kind (C: constructor, P: property, M: method), the object types that have it, and its description and arguments.

HSCons (C; HSCons)
Builds a list CONS cell.
  • head: head element
  • tail: remainder of the list

HSEOL (C; HSEOL)
Final element of a list or an empty list.

HSFun (C; HSFun)
Creates a function thunk with no arguments applied to.
  • name: function name to be used for debugging/exception tracing
  • arity: arity of the function known by the compiler
  • body: expression to apply to function's arguments and evaluate

HSDly (C; HSDly)
A special object to wrap around a saturated function call.
  • thunk: saturated function call that is a HSFun object with number of arguments applied to (_a) equal to the function arity (_x); evaluation of this thunk will be delayed until it is applied to an argument which would have oversaturated the call in the absence of HSDly

HSData (C; HSData)
Builds a data object other than a CONS cell or an Empty List.
  • con: constructor name (with non-alphanumeric characters replaced with underscored character codes)
  • arrs: a Javascript Array containing constructor arguments

_r (P; HSCons, HSEOL, HSFun, HSDly, HSData)
Boolean: true when a thunk may be evaluated.
  • true in HSFun when the call is saturated (_a.length == _x)
  • Always true in HSDly
  • Always false in HSCons, HSEOL, HSData

_c (M; HSCons, HSEOL, HSFun, HSDly, HSData)
Evaluate a thunk. If this method is said to have "no action", this means that it just returns this and does nothing else.
  • No action in HSFun unless the call is saturated (_a.length == _x) in which case it applies the function body (_b) to the accumulated arguments array (_a) and returns whatever the body returns
  • In HSDly, evaluates the delayed function call first and then applies the oversaturating arguments to the result (_ap), and returns whatever results from this (a non-function value or unsaturated function call or saturated function call wrapped in another HSDly object)
  • No action in HSCons, HSEOL, HSData

_a (P; HSFun, HSDly)
Array:
  • in HSFun, holds accumulated arguments
  • in HSDly, holds oversaturating arguments

_ap (M; HSFun, HSDly)
Apply function call/delayed saturated call to argument(s).
  • In HSFun, clones the HSFun object, and concatenates its argument to the accumulated arguments array (_a) of the copy. If the copy becomes a saturated call, sets its _r property to true and returns the copy wrapped into HSDly, otherwise just returns the copy
  • In HSDly, clones the HSDly object and concatenates its argument to the oversaturating arguments array (_a) of the copy. No evaluation is done at this time; it will be performed by the _c method
  • targs: Array containing the arguments to be applied to

_b (P; HSFun)
Holds the expression to apply to function's arguments and evaluate: the third argument of the HSFun constructor is copied here.

_x (P; HSFun)
Holds the function arity: the second argument of the HSFun constructor is copied here.

_n (P; HSFun)
Holds the function name: the first argument of the HSFun constructor is copied here.

_d (P; HSDly)
Holds the saturated function call (HSFun object with _a.length == _x): the first argument of the HSDly constructor is copied here.

_t (P; HSCons, HSEOL, HSData)
Constructor name for a Data or a CONS/Empty list cell to be used for pattern matching.
  • In HSData, contains qualified name of the constructor that created the Data object with all non-alphanumeric characters replaced with underscored character codes
  • In HSCons, always contains "Prelude_46_58" (Prelude.:)
  • In HSEOL, always contains "Prelude_46_91_93" (Prelude.[])

_f (P; HSCons, HSEOL, HSData)
Constructor arguments (may be empty).
  • In HSData, the second argument of the HSData constructor is copied here
  • In HSCons, this is an array of two elements: copies of the first (head) and the second (tail) arguments of the HSCons constructor
  • In HSEOL, always empty

toString (M; HSCons)
Method of Object overridden by HSCons. Used for unmarshalling of Haskell lists (including Strings) into Javascript as Strings. The method evaluates all elements of the list (therefore it should be finite) and if the list contains characters, they are concatenated into a Javascript String, otherwise the _toArray method is called, and the toString method is called upon _toArray's result.

_toArray (M; HSCons)
Method used for unmarshalling of Haskell lists (including Strings) into Javascript as Arrays. The method evaluates all elements of the list (therefore it should be finite) and concatenates them into a Javascript Array. Internal representation of Haskell type Char is its numeric value, so a Haskell String will be converted into a Javascript Array of Numbers.
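To make the layout of list cells more concrete, here is a simplified sketch of HSCons and HSEOL with only the members listed above, plus a reduced _toArray. It is written from the descriptions in the table and is not the code of Runtime.js; in particular, it assumes that all list elements are already evaluated, whereas the real method evaluates them itself.

```javascript
// Simplified stand-ins for the runtime's list objects; the real Runtime.js
// may differ in details.
function HSEOL() {
  this._r = false;                        // never a thunk
  this._c = function () { return this; }; // "no action"
  this._t = "Prelude_46_91_93";           // Prelude.[]
  this._f = [];                           // no constructor arguments
}

function HSCons(head, tail) {
  this._r = false;
  this._c = function () { return this; };
  this._t = "Prelude_46_58";              // Prelude.:
  this._f = [head, tail];                 // head element, remainder of list
}

// Walk the cells until the end-of-list marker, collecting the heads.
// Assumption: elements and tails are already evaluated values.
HSCons.prototype._toArray = function () {
  var out = [];
  var cell = this;
  while (cell._t === "Prelude_46_58") {
    out.push(cell._f[0]);
    cell = cell._f[1];
  }
  return out;
};

// The Haskell String "hi": a Char is represented by its numeric value.
var hi = new HSCons(104, new HSCons(105, new HSEOL()));
var codes = hi._toArray();                          // the codes 104 and 105
console.log(codes);
console.log(String.fromCharCode.apply(null, codes)); // hi
```

The last line shows why a separate toString is useful: _toArray alone yields character codes, which still have to be turned into a Javascript String.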

Evaluation of Expressions

The Javascript runtime support library provides a function exprEval which is used to evaluate all expressions starting with the toplevel expression (starting point).

In essence, this function looks like this:

function exprEval (e) {
  for (var ex = e; ex != undefined && ex._r ; ex = ex._c())
    ;
  return ex;
};

This is a loop that checks whether a given expression exists (is not undefined) and can be evaluated (_r == true). In this case, it calls the expression's _c method and analyzes its return value. If the returned expression may also be evaluated, the function loops and evaluates it. This is repeated until an expression that can no longer be evaluated is returned (the normal situation, e. g. a primitive value or a Data object), or an undefined value is returned (an abnormal situation).

While evaluating an expression, exprEval may be recursively called to evaluate nested expressions.
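The loop can be observed on toy objects that carry nothing but the _r and _c members. In this sketch exprEval is copied from above, while the thunk helper and the three-step chain are made up purely for illustration and do not correspond to anything in the runtime library.

```javascript
// exprEval as shown above.
function exprEval(e) {
  for (var ex = e; ex != undefined && ex._r; ex = ex._c())
    ;
  return ex;
}

// Hypothetical helper: an object that is ready to be evaluated (_r is
// true) and whose _c method runs the given function.
function thunk(compute) {
  return { _r: true, _c: compute };
}

var steps = 0;
// A chain of three thunks: each evaluation step yields the next thunk,
// and the last one yields a primitive value.
var chain = thunk(function () {
  steps++;
  return thunk(function () {
    steps++;
    return thunk(function () { steps++; return 42; });
  });
});

var result = exprEval(chain);
console.log(result, steps); // 42 3
```

The loop stops at 42 because a primitive value has no _r member, which is the "normal situation" described above.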

Special Notes

Oversaturation

Oversaturation happens when a function thunk (HSFun object) is applied to more arguments than function arity (_x).

For example, this piece of Haskell code:

let g = fst tup
    x = g (5::Int) (6::Int)

converts into the following Javascript (indentation manual):

var Bug_46Bug_46Prelude_46217_46g=
  new HSFun("Bug.Bug.Prelude.217.g", 0,  function(){
    var v285=new HSFun("v285", 0, function(){
      return  Prelude_46Prelude_46Num_46Prelude_46Integer;}
    );
    return (Prelude_46fst)._ap([(Bug_46tup)._ap([v285])]);
  });
var Bug_46Bug_46Prelude_46218_46x=
  new HSFun("Bug.Bug.Prelude.218.x",  0,  function(){
    return (Bug_46Bug_46Prelude_46217_46g)._ap([5, 6]);
  });

As it can be clearly seen, g is defined with arity = 0 which is reasonable: its definition does not have any formal arguments. But when computing x, g is called with two arguments: 5 and 6.

To preserve laziness, the thunk of g should not be evaluated earlier than it is actually needed, i. e. when the value of x is needed. Calling the _ap method of g would have resulted in oversaturation and inevitable error in computations.

To work around this, the logic of the HSFun._ap method was changed, and a special object type, HSDly, was introduced. HSFun._ap may accept any number of arguments in an array. The length of the array of arguments is compared with the number of arguments needed to saturate the call. If the call is undersaturated or exactly saturated after the arguments have been applied, no special action is taken: an undersaturated call remains the same, and a saturated call is wrapped in HSDly. If, however, oversaturation is about to happen, the portion of the arguments necessary to saturate the call is absorbed into the accumulated arguments array, and the rest of the arguments are carried over to the HSDly._ap method.

The behavior of HSDly objects is as follows: the _ap method accepts as many arguments as provided, and accumulates them inside. The _c method evaluates the delayed thunk first, and then calls its _ap method with all the arguments accumulated up to that moment. In this case it is expected that the delayed thunk will evaluate into another function call (as can be seen in the example above). These actions may lead to either a value computed as the result of application, or another function call that is undersaturated, exactly saturated, or oversaturated. In the two latter cases, the result will be wrapped into another HSDly object, with the remaining arguments carried over.
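The interplay of HSFun._ap and HSDly described above can be sketched in a few dozen lines. This is a simplified model written from the description in this section, not the code of Runtime.js: names of members follow the table of object types, but details such as error handling are omitted, and the add and g functions are hypothetical stand-ins for fst tup and g from the example.

```javascript
function exprEval(e) {
  for (var ex = e; ex != undefined && ex._r; ex = ex._c())
    ;
  return ex;
}

// Function thunk: arguments accumulate in _a until arity _x is reached.
function HSFun(name, arity, body) {
  this._n = name;
  this._x = arity;
  this._b = body;
  this._a = [];
  this._r = (arity === 0); // a 0-arity thunk is saturated from the start
}
HSFun.prototype._c = function () {
  if (this._a.length !== this._x) return this; // unsaturated: no action
  return this._b.apply(this, this._a);
};
HSFun.prototype._ap = function (targs) {
  var need = this._x - this._a.length;      // arguments still missing
  var copy = new HSFun(this._n, this._x, this._b);
  copy._a = this._a.concat(targs.slice(0, need));
  copy._r = (copy._a.length === copy._x);
  if (!copy._r) return copy;                // still undersaturated
  var dly = new HSDly(copy);                // saturated: wrap in HSDly
  var rest = targs.slice(need);             // oversaturating arguments
  return rest.length > 0 ? dly._ap(rest) : dly;
};

// Delayed saturated call plus the arguments that would oversaturate it.
function HSDly(thunk) {
  this._d = thunk;
  this._a = [];
  this._r = true;
}
HSDly.prototype._ap = function (targs) {
  var copy = new HSDly(this._d);
  copy._a = this._a.concat(targs);          // accumulate, do not evaluate
  return copy;
};
HSDly.prototype._c = function () {
  var res = exprEval(this._d);              // evaluate the delayed call
  return this._a.length > 0 ? res._ap(this._a) : res;
};

// g has arity 0 and evaluates to a two-argument function, like fst tup.
var add = new HSFun("add", 2, function (a, b) {
  return exprEval(a) + exprEval(b);
});
var gEvaluated = 0;
var g = new HSFun("g", 0, function () { gEvaluated++; return add; });

var x = g._ap([5, 6]);      // oversaturated application of g
var before = gEvaluated;    // still 0: g has not been evaluated yet
var value = exprEval(x);    // forces g, then applies add to 5 and 6
console.log(before, value); // 0 11
```

Applying g to two arguments evaluates nothing; the body of g runs only when x is forced by exprEval, which preserves laziness.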

Examples and Demos


--DimitryGolubovsky 19:02, 6 November 2006 (UTC)