https://wiki.haskell.org/api.php?action=feedcontributions&user=Tbh&feedformat=atomHaskellWiki - User contributions [en]2024-03-29T00:34:34ZUser contributionsMediaWiki 1.35.5https://wiki.haskell.org/index.php?title=Haskell_in_industry&diff=54359Haskell in industry2012-10-15T17:26:28Z<p>Tbh: Added fortytools gmbh</p>
<hr />
<div>__NOTOC__<br />
<br />
Haskell is used commercially in a diverse range of industries, from aerospace and defense to finance, web startups, hardware design firms, and even a lawnmower manufacturer. This page collects resources on the industrial use of Haskell.<br />
<br />
* The main user conference for industrial Haskell use is CUFP - the [http://cufp.org/ Commercial Users of Functional Programming Workshop].<br />
* The [http://industry.haskell.org Industrial Haskell Group] supports commercial users.<br />
* [http://fpcomplete.com/ FP Complete] is dedicated to the widespread adoption of modern Functional Programming technology, with a focus on the Haskell system. <br />
<br />
== Haskell in Industry ==<br />
<br />
Many companies have used Haskell for a range of projects, including:<br />
<br />
* [http://cufp.galois.com/2007/abstracts.html#CyrilSchmidt ABN AMRO] Amsterdam, The Netherlands<br />
<blockquote><br />
ABN AMRO is an international bank headquartered in Amsterdam. For its<br />
investment banking activities it needs to measure the counterparty risk<br />
on portfolios of financial derivatives. </blockquote><br />
::ABN AMRO's [http://cufp.galois.com/2007/abstracts.html#CyrilSchmidt CUFP talk].<br />
<br />
* Aetion Technologies LLC, Columbus, Ohio<br />
<blockquote><br />
Aetion was a defense contractor in operation from 1999 to 2011, whose applications used artificial intelligence. Rapidly changing priorities made it important to minimize the code impact of changes, which suited Haskell well. Aetion developed three main projects in<br />
Haskell, all successful. Haskell's concise code was perhaps most important for<br />
rewriting: it made it practicable to throw away old code occasionally. DSELs<br />
allowed the AI to be specified very declaratively. <br />
</blockquote><br />
::Aetion's [http://cufp.galois.com/2006/slides/GaryMorris.pdf CUFP talk].<br />
<br />
* Alcatel-Lucent<br />
<blockquote><br />
A consortium of groups, including Alcatel-Lucent, has used Haskell to prototype narrowband software radio systems, running in (soft) real-time.<br />
</blockquote><br />
::Alcatel-Lucent's [http://cufp.org/conference/sessions/2011/fourteen-days-haskell-real-time-programming-projec CUFP talk]<br />
<br />
* [http://www.allstontrading.com/ Allston Trading]<br />
<blockquote><br />
Headquartered in Chicago, Illinois, Allston Trading, LLC is a premier high frequency market maker in over 40 financial exchanges, in 20 countries, and in nearly every conceivable product class. Allston makes some use of Haskell for their trading infrastructure.<br />
</blockquote><br />
<br />
* [http://www.alphaheavy.com/ Alpha Heavy Industries]<br />
<blockquote><br />
Alpha Heavy Industries is an alternative asset manager dedicated to producing superior returns through quantitative methods. They use Haskell as their primary implementation language.<br />
</blockquote><br />
<br />
* [http://www.amgen.com/ Amgen] Thousand Oaks, California<br />
<blockquote><br />
Amgen is a human therapeutics company in the biotechnology industry. Amgen pioneered the development of novel products based on advances in recombinant DNA and molecular biology and launched the biotechnology industry’s first blockbuster medicines.<br />
<br />
Amgen uses Haskell:<br />
<br />
* To rapidly build software to implement mathematical models and other complex, mathematically oriented applications<br />
* To provide a more mathematically rigorous validation of software<br />
* To break developers out of their software development rut by giving them a new way to think about software.<br />
</blockquote><br />
::Amgen's [http://cufp.galois.com/2008/abstracts.html#BalabanDavid CUFP talk].<br />
<br />
* [http://www.ansemond.com/ Ansemond LLC]<br />
<blockquote><br />
"Find It! Keep It! is a Mac Web Browser that lets you keep the pages you<br />
visit in a database. A list of these pages is shown in the 'database<br />
view'. "<br />
</blockquote><br />
<br />
* [http://antiope.com/ Antiope] Fair Haven, New Jersey<br />
<blockquote><br />
Antiope Associates provides custom solutions for wireless communication<br />
and networking problems. Our team has expertise in all aspects of<br />
wireless system design, from the physical and protocol layers to complex<br />
networked applications. Antiope Associates relies on a number of<br />
advanced techniques to ensure that the communication systems we design<br />
are reliable and free from error. We use custom simulation tools,<br />
developed in Haskell, to model our hardware designs.<br />
</blockquote><br />
::Antiope's [http://cufp.galois.com/2008/slides/WrightGregory.pdf CUFP talk].<br />
<br />
* [http://www.att.com AT&amp;T]<br />
<blockquote><br />
Haskell is being used in the Network Security division to automate processing of internet abuse complaints. Haskell has allowed us to easily meet very tight deadlines with reliable results.<br />
</blockquote><br />
<br />
* [http://www.baml.com/ Bank of America Merrill Lynch]<br />
<blockquote>Haskell is being used for backend data transformation and loading.</blockquote><br />
<br />
* [http://www.haskell.org/communities/12-2007/html/report.html#sect7.1.2 Barclays Capital Quantitative Analytics Group]<br />
<blockquote><br />
Barclays Capital's Quantitative Analytics group is using Haskell to<br />
develop an embedded domain-specific functional language (called FPF)<br />
which is used to specify exotic equity derivatives. These derivatives,<br />
which are naturally best described in terms of mathematical functions,<br />
and constructed compositionally, map well to being expressed in an<br />
embedded functional language. This language is now regularly being used<br />
by people who had no previous functional language experience.<br />
</blockquote><br />
::[http://lambda-the-ultimate.org/node/3331 Simon Frankau et al's JFP paper on their use of Haskell]<br />
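::The compositional style described above can be illustrated with a toy payoff language (a hypothetical sketch with invented names, not Barclays' actual FPF): contracts are ordinary Haskell values built from a few combinators and evaluated against a price observation.<br />

```haskell
module Main where

-- Hypothetical payoff language: a contract's payoff is a function from
-- the underlying's price at expiry to a cash flow.
newtype Payoff = Payoff { runPayoff :: Double -> Double }

-- Primitive option payoffs
call, put :: Double -> Payoff
call strike = Payoff (\s -> max 0 (s - strike))
put  strike = Payoff (\s -> max 0 (strike - s))

-- Combinators: scale a position, or hold two contracts together
scale :: Double -> Payoff -> Payoff
scale k (Payoff f) = Payoff ((k *) . f)

both :: Payoff -> Payoff -> Payoff
both (Payoff f) (Payoff g) = Payoff (\s -> f s + g s)

-- A straddle is simply a call plus a put at the same strike
straddle :: Double -> Payoff
straddle k = call k `both` put k

main :: IO ()
main = print (runPayoff (straddle 100) 90)  -- the put pays 10.0
```

::Because contracts are ordinary values, new products are defined by composing existing combinators rather than by writing new pricing code, which is what makes such a language usable by non-programmers.<br />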
<br />
* [http://www.bcode.com/ bCODE Pty Ltd] Sydney Australia<br />
<blockquote><br />
bCode Pty Ltd is a small venture capital-funded startup in Sydney, Australia, using OCaml and a bit of Haskell.<br />
</blockquote><br />
<br />
* [http://www.bluespec.com/ Bluespec, Inc.] Waltham, Massachusetts<br />
<br />
<blockquote><br />
Developing a modern integrated circuit (ASIC or FPGA) is an enormously<br />
expensive process involving specification, modeling (to choose and fix the<br />
architecture), design (to describe what will become silicon) and verification<br />
(to ensure that it meets the specs), all before actually committing anything to<br />
silicon (where the cost of a failure can be tens of millions of dollars).<br />
Bluespec, Inc. is a three-year-old company that provides language facilities,<br />
methodologies, and tools for this purpose, within the framework of the IEEE<br />
standard languages SystemVerilog and SystemC, but borrowing ideas heavily from<br />
Term Rewriting Systems and functional programming languages like Haskell. In<br />
this talk, after a brief technical overview to set the context, we will<br />
describe our tactics and strategies, and the challenges we face, in introducing<br />
declarative programming ideas into this field, both externally (convincing<br />
customers about the value of these ideas) and internally (using Haskell for our<br />
tool implementation). <br />
</blockquote><br />
<br />
::Bluespec's [http://cufp.galois.com/2006/abstracts.html#RishiyurNikhil CUFP talk].<br />
<br />
* [http://bu.mp/ Bump]<br />
<blockquote><br />
Bump uses a Haskell-based process supervisor, [http://github.com/jamwt/Angel Angel], for all their backend systems, and uses Haskell [http://devblog.bu.mp/haskell-at-bump for other infrastructure tasks].<br />
</blockquote><br />
<br />
* [http://www.circos.com Circos Brand Karma] Singapore<br />
<blockquote><br />
Brand Karma provides services to brand owners to measure online sentiments towards their brands.<br />
Haskell is used in building parts of the product, specifically for back-end job scheduling and brand matching.<br />
</blockquote><br />
<br />
* [http://www.credit-suisse.com/ Credit Suisse Global Modelling and Analytics Group] London, UK; New York City, New York<br />
<br />
<blockquote><br />
GMAG, the quantitative modelling group at Credit Suisse, has been using Haskell<br />
for various projects since the beginning of 2006, with the twin aims of<br />
improving the productivity of modellers and making it easier for other people<br />
within the bank to use GMAG models. Current projects include: Further work on<br />
tools for checking, manipulating and transforming spreadsheets; a<br />
domain-specific language embedded in Haskell for implementing reusable<br />
components that can be compiled into various target forms (see the video presentation: [http://www.londonhug.net/2008/08/11/video-paradise-a-dsel-for-derivatives-pricing/ Paradise, a DSEL for Derivatives Pricing]).<br />
</blockquote><br />
<br />
::Credit Suisse's [http://cufp.galois.com/2006/abstracts.html#HowardMansell CUFP talk].<br />
<br />
* [http://detexify.kirelabs.org/classify.html Detexify]<br />
<br />
<blockquote><br />
Detexify is an online handwriting recognition system, whose backend is written in Haskell. <br />
</blockquote><br />
<br />
* [http://www.db.com/ Deutsche Bank Equity Proprietary Trading, Directional Credit Trading]<br />
<br />
<blockquote><br />
The Directional Credit Trading group uses Haskell as the primary<br />
implementation language for all its software infrastructure.<br />
</blockquote><br />
<br />
::Deutsche Bank's [http://cufp.galois.com/2008/abstracts.html#PolakowJeff CUFP talk].<br />
<br />
* [http://article.gmane.org/gmane.comp.lang.haskell.cafe/37093 Eaton] Cleveland, Ohio<br />
<br />
<blockquote><br />
Design and verification of hydraulic hybrid vehicle systems<br />
</blockquote><br />
<br />
::Eaton's [http://cufp.galois.com/2008/abstracts.html#HawkinsTom CUFP talk]<br />
::Eaton's [http://www.haskell.org/pipermail/haskell-cafe/2009-April/060602.html experiences using a Haskell DSL]<br />
<br />
* Ericsson AB<br />
<blockquote><br />
Ericsson uses Haskell for the implementation of Feldspar, an EDSL for digital signal processing algorithms.<br />
</blockquote><br />
<br />
::Ericsson's [http://hackage.haskell.org/package/feldspar-compiler Feldspar compiler]<br />
<br />
* [http://facebook.com Facebook]<br />
<br />
<blockquote><br />
Facebook uses some Haskell internally for tools. [http://github.com/facebook/lex-pass/tree/master lex-pass] is a tool for programmatically manipulating a PHP code base via Haskell.<br />
</blockquote><br />
<br />
:: Facebook's [http://cufp.galois.com/2009/abstracts.html#ChristopherPiroEugeneLetuchy CUFP talk]<br />
<br />
* [http://www.factisresearch.com/ Factis Research]<br />
<blockquote><br />
Factis Research, located in Freiburg, Germany, develops reliable and user-friendly mobile solutions. Our client software runs under J2ME, Symbian, iPhone OS, Android, and Blackberry. The server components are implemented in Python and Haskell. We are actively using Haskell for a number of projects, most of which are released under an open-source license.<br />
</blockquote><br />
<br />
:: Factis' [http://haskell.org/communities/05-2010/html/report.html#factisresearch HCAR submission]<br />
<br />
* [http://fortytools.com fortytools gmbh]<br />
<blockquote><br />
Located in Hamburg, Germany, we develop web-based productivity tools for invoicing, customer management, resource scheduling and time tracking. While we use Javascript to build rich frontend applications in the browser, we use Haskell to implement the REST backends. Additionally, we do occasional project/client work as well.<br />
</blockquote><br />
<br />
:: Oh, and of course we develop and maintain [http://hayoo.info Hayoo!] :)<br />
<br />
* [http://www.funktional.info/index.php?id=7&L=1 Funktionale Programmierung Dr. Heinrich Hördegen], Munich, Germany<br />
<blockquote><br />
We develop software prototypes according to the Pareto principle: after spending only 20 percent of the budget, we aim to deliver 80 percent of the software's functionality. We realize this by constructing a 20/80 software prototype that we can further develop into a full-fledged solution...<br />
</blockquote><br />
<br />
* [http://www.galois.com/ Galois, Inc] Portland, Oregon<br />
<br />
<blockquote><br />
Galois designs and develops high confidence software for critical applications.<br />
Our innovative approach to software development provides high levels of<br />
assurance, yet its scalability enables us to address the most complex problems.<br />
We have successfully engineered projects under contract for corporations and<br />
government clients in the demanding application areas of security, information<br />
assurance and cryptography. <br />
</blockquote><br />
<br />
::Galois' [http://cufp.galois.com/2007/abstracts.html#JohnLaunchbury 2007 CUFP talk]<br />
::Galois' [http://cufp.org/conference/sessions/2011/theorem-based-derivation-aes-implementation 2011 CUFP talk]<br />
::Galois' [http://corp.galois.com/blog/2009/4/27/engineering-large-projects-in-haskell-a-decade-of-fp-at-galo.html retrospective on 10 years of industrial Haskell use]<br />
<br />
* [http://google.com Google]<br />
<br />
<blockquote><br />
Haskell is used on a small number of internal projects in Google, for internal IT infrastructure support. <br />
</blockquote><br />
<br />
::Google's [http://k1024.org/~iusty/papers/icfp10-haskell-reagent.pdf ICFP 2010 experience report on Haskell]<br />
<br />
* [http://glyde.com/ Glyde]<br />
<br />
<blockquote><br />
Glyde uses OCaml and Haskell for a few projects. Glyde uses Haskell for our client-side template source-to-source translator, which converts HAML-like view templates into JS code.<br />
</blockquote><br />
<br />
* [http://groupcommerce.com Group Commerce]<br />
<blockquote><br />
Group Commerce uses Haskell to drive the main component of their advertising infrastructure: a Snap Framework based web server. Haskell enabled quicker development, higher reliability, and better maintainability than other languages, without having to sacrifice performance.<br />
</blockquote><br />
<br />
* [http://humane-software.com Humane Software]<br />
<blockquote>We develop enterprise systems with de-coupled, asynchronous Haskell backends and Javascript UIs.<br><br />
For our current customer, an Internet connectivity provider, we wrote a solution for monitoring multiple remote machines and analyzing gigabytes of traffic samples. Haskell proved an excellent tool for the job. <br />
We were able to replace legacy systems in a granular, piece-by-piece manner, while delivering new features.</blockquote><br />
<br />
* [http://hustlerturf.com Hustler Turf Equipment] Hesston, Kansas<br />
<blockquote><br />
Designs, builds, and sells lawn mowers. We use quite a bit of Haskell, especially as a "glue language" for tying together data from different manufacturing-related systems. We also use it for some web apps that are deployed to our dealer network. There are also some uses for it in sysadmin<br />
automation, such as adding/removing people from LDAP servers and the like.<br />
</blockquote><br />
<br />
* [http://iba-cg.de/haskell.html iba Consulting Gesellschaft] - Intelligent business architecture for you. Leipzig, Germany<br />
<br />
<blockquote><br />
iba CG develops software for large companies: <br />
* a risk analysis and reporting solution for a power supply company; <br />
* contract management, asset management, booking and budgeting software for one of the world's leading accounting firms.<br />
</blockquote><br />
<br />
* [http://www.ics-ag.de/ Informatik Consulting Systems AG]<br />
<br />
<blockquote><br />
ICS AG developed a simulation and testing tool based on a DSL (Domain Specific Language). The DSL is used to describe the architecture and behavior of distributed system components (event/message based, reactive). The compiler was written in Haskell (with Ada as the target language). The test system is used in several industrial projects.<br />
</blockquote><br />
<br />
* [http://ipwnstudios.com/ iPwn Studios]<br />
<blockquote><br />
iPwn Studios is a video game studio founded in 2009 and based in the greater Boston area. They're developing a game engine in Haskell, and a number of games built on that engine, including an action RPG for touchscreen devices called Bloodknight.<br />
</blockquote><br />
<br />
* [http://www.ivu.de/uk/products/public-transport/ IVU Traffic Technologies AG]<br />
<blockquote><br />
The rostering group at IVU Traffic Technologies AG has been using Haskell to check rosters for compliance with EC regulations.<br />
<br />
Our implementation is based on an embedded DSL that combines the regulation’s individual rules into a solver that not only decides on instances but, in the case of a faulty roster, finds an interpretation of the roster that is “favorable” in the sense that the error messages it entails are “helpful” in leading the dispatcher to the resolution of the issue at hand.<br />
<br />
The solver is both reliable (due to strong static typing and referential transparency — we have not experienced a failure in three years) and efficient (due to constraint propagation, a custom search strategy, and lazy evaluation).<br />
<br />
Our EC 561/2006 component is part of the IVU.crew software suite and as such is in wide-spread use all over Europe, both in planning and dispatch. So the next time you enter a regional bus, chances are that the driver’s roster was checked by Haskell.<br />
</blockquote><br />
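::The rule-combining idea can be sketched as follows (a hypothetical illustration; the rule names and limits here are simplified stand-ins, far cruder than IVU's actual EC 561/2006 solver): each rule maps a roster to a list of helpful messages, and rules compose by concatenating their findings.<br />

```haskell
module Main where

-- A roster is a week of daily driving times (hours); a rule inspects a
-- roster and yields zero or more human-readable violation messages.
type Roster = [Double]
type Rule   = Roster -> [String]

-- Illustrative rules, loosely modelled on EC 561/2006 limits
maxDailyDriving :: Rule
maxDailyDriving r =
  [ "day " ++ show i ++ ": " ++ show h ++ "h exceeds the 9h daily limit"
  | (i, h) <- zip [1 :: Int ..] r, h > 9 ]

maxWeeklyDriving :: Rule
maxWeeklyDriving r
  | total > 56 = ["weekly driving of " ++ show total ++ "h exceeds the 56h limit"]
  | otherwise  = []
  where total = sum r

-- Rules compose by concatenating their findings: a compliant roster is
-- one for which the combined rule returns no messages.
checkAll :: [Rule] -> Rule
checkAll rules roster = concatMap ($ roster) rules

main :: IO ()
main = mapM_ putStrLn (checkAll [maxDailyDriving, maxWeeklyDriving]
                                [8, 10, 9, 9, 9, 9, 9])
```

::Keeping each rule as a small pure function is what makes the combined checker easy to test and extend; the real system additionally searches for a favorable interpretation of the roster before reporting.<br />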
<br />
* [http://www.janrain.com JanRain]<br />
<blockquote><br />
JanRain uses Haskell for network and web software. Read more about [http://www.janrain.com/blogs/haskell-janrain Haskell at JanRain] and in their [http://corp.galois.com/blog/2011/3/8/tech-talk-haskell-and-the-social-web.html tech talk at Galois]. JanRain's "[http://www.janrain.com/products/capture Capture]" user API product is built on Haskell's Snap web framework.<br />
</blockquote><br />
<br />
:: See Janrain's [http://corp.galois.com/blog/2011/4/22/tech-talk-video-haskell-and-the-social-web.html technical talk about their use of Snap]<br />
<br />
* [http://joyridelabs.de/game/ Joyride Laboratories]<br />
<br />
<blockquote><br />
Joyride Laboratories is an independent game development studio, founded in 2009 by Florian Hofer and Sönke Hahn. Their first game, "Nikki and the Robots" was released in 2011.<br />
</blockquote><br />
<br />
* [http://www.linspire.com/ Linspire]<br />
<br />
<blockquote><br />
Linspire, Inc. has used functional programming since its inception in 2001,<br />
beginning with extensive use of OCaml, with a steady shift to Haskell as its<br />
implementations and libraries have matured. Hardware detection, software<br />
packaging and CGI web page generation are all areas where we have used<br />
functional programming extensively. Haskell's feature set lets us replace much<br />
of our use of little languages (e.g., bash or awk) and two-level languages (C<br />
or C++ bound to an interpreted language), allowing for faster development,<br />
better code sharing and ultimately faster implementations. Above all, we value<br />
static type checking for minimizing runtime errors in applications that run in<br />
unknown environments and for wrapping legacy programs in strongly typed<br />
functions to ensure that we pass valid arguments. <br />
</blockquote><br />
<br />
::Linspire's [http://cufp.galois.com/2006/abstracts.html#CliffordBeshers CUFP talk]<br />
::Linspire's experience report on using [http://portal.acm.org/citation.cfm?doid=1291151.1291184 functional programming to manage a Linux distribution]<br />
<br />
* [http://www.mitre.org/ MITRE]<br />
<blockquote><br />
MITRE uses Haskell for, amongst other things, the [http://hackage.haskell.org/package/cpsa analysis of cryptographic protocols].<br />
</blockquote><br />
<br />
* [http://ertos.nicta.com.au/research/sel4/ NICTA]<br />
<blockquote><br />
NICTA has used Haskell as part of a project to verify the L4 microkernel.<br />
</blockquote><br />
::[http://www.drdobbs.com/embedded/222400553 Read the Dr. Dobbs article on using Haskell and formal methods to verify a kernel]<br />
<br />
* [http://www.gb.nrao.edu NRAO]<br />
<blockquote><br />
NRAO has used Haskell to implement the core science algorithms for the Robert C. Byrd Green Bank Telescope (GBT) Dynamic Scheduling System ([http://www.gb.nrao.edu/dss DSS]).<br />
::Source code available on [https://github.com/nrao/antioch GitHub].<br />
</blockquote><br />
<br />
* [http://www.ns-sol.co.jp NS Solutions(NSSOL)] Tokyo, Japan<br />
<blockquote><br />
NS Solutions has employed Haskell since 2008 to develop its software<br />
packages including "BancMeasure", a mark-to-market accounting software<br />
package for financial institutions, "BancMeasure for IFRS" and<br />
"Mamecif", a data analysis package.<br />
"BancMeasure" and "Mamecif" are registered trademarks of NS Solutions Corporation in Japan.<br />
</blockquote><br />
<br />
* [http://www.nvidia.com/content/global/global.php NVIDIA]<br />
<blockquote><br />
At NVIDIA, we have a handful of in-house tools that are written in Haskell.<br />
</blockquote><br />
<br />
* [http://blog.openomy.com/2008/01/case-study-using-haskell-and-happs-for.html Openomy]<br />
<br />
<blockquote><br />
Openomy's API v2.0 is developed in Haskell, using the<br />
[http://www.happs.org/ HAppS] web platform.<br />
</blockquote><br />
<br />
* [http://www.oblomov.com Oblomov]<br />
<br />
<blockquote><br />
Oblomov Systems is a one-person software company based in Utrecht, The Netherlands. Founded in 2009, Oblomov has since then been working on a number of Haskell-related projects. The main focus lies on web-applications and (web-based) editors. Haskell has turned out to be extremely useful for implementing web servers that communicate with JavaScript clients or iPhone apps.<br />
</blockquote><br />
<br />
:: [http://haskell.org/communities/05-2010/html/report.html#oblomov Oblomov's HCAR submission].<br />
<br />
* [http://www.patch-tag.com Patch-Tag: hosting for darcs]<br />
<blockquote><br />
Need somewhere to put your darcs code? Try us.<br />
<br />
Patch-Tag is built with [http://happstack.com happstack], the continuation of the project formerly known as HAppS.<br />
</blockquote><br />
<br />
* [http://www.peerium.com Peerium, Inc] Cambridge, Massachusetts<br />
<blockquote><br />
At Peerium, we're striving to bring a new level of quality and efficiency to online communication and collaboration within virtual communities, social networks, and business environments. We believe that a new environment that supports the effortless sharing of both information and software will enable a level of online cooperation far beyond current Web-based technologies -- modern programming techniques will enable the creation of more robust and more powerful programs within these environments. To this end, we're building a new software platform for direct, real-time communication and collaboration within graphically rich environments. Peerium is located in the heart of Harvard Square in Cambridge, Massachusetts.<br />
</blockquote><br />
<br />
* [http://www.qualcomm.com/ Qualcomm, Inc]<br />
<br />
<blockquote><br />
Qualcomm uses Haskell to generate Lua bindings to the BREW platform <br />
</blockquote><br />
<br />
* [http://www.renci.org/ Renaissance Computing Institute], Chapel Hill, North Carolina<br />
<blockquote><br />
The Renaissance Computing Institute (RENCI), a multi-institutional organization, brings together multidisciplinary experts and advanced technological capabilities to address pressing research issues and to find solutions to complex problems that affect the quality of life in North Carolina, our nation and the world.<br />
<br />
Research scientists at RENCI have used Haskell for a number of projects, including [http://vis.renci.org/jeff/2009/08/26/open-sourcing-the-big-board/ The Big Board].<br />
</blockquote><br />
<br />
::RENCI's [http://cufp.galois.com/2009/abstracts.html#JeffersonHeard CUFP talk].<br />
<br />
* [https://scrive.com/gb/en Scrive] <br />
<br />
<blockquote><br />
Scrive is a service for e-signing tenders, contracts, and other documents. We help our clients close deals faster, decrease their administrative burden, and improve their customers’ experience.<br />
</blockquote><br />
<br />
* [http://sankelsoftware.com Sankel Software] Albuquerque, New Mexico<br />
<br />
<blockquote><br />
Sankel Software has been using Haskell since 2002 for both prototyping and deployment for technologies ranging from CAD/CAM to gaming and computer animation. We specialize in the development of user-friendly, large, long-term applications that solve difficult and conceptually intricate problems.<br />
</blockquote><br />
<br />
* [http://www.signalicorp.com/index.htm Signali] Portland, Oregon<br />
<br />
<blockquote><br />
Signali Corp is a new custom hardware design company. Our chief products<br />
are custom IP cores targeted for embedded DSP and cryptographic<br />
applications. Our specialty is the design and implementation of<br />
computationally intensive, complex algorithms. The interfaces to each<br />
core are modular and can be very efficiently modified for your specific<br />
application. System-level integration and validation is crucial and<br />
accounts for the majority of the investment in a product.<br />
</blockquote><br />
<br />
* [http://www.standardchartered.com/home/en/index.html Standard Chartered]<br />
<br />
<blockquote><br />
Standard Chartered has a large group using Haskell for all aspects of its wholesale banking business.<br />
</blockquote><br />
<br />
* [http://www.starling-software.com/en/index.html Starling Software] Tokyo, Japan<br />
<blockquote><br />
Starling Software are developing a commercial automated options trading system <br />
in Haskell, and are migrating other parts of their software suite to<br />
Haskell.<br />
</blockquote><br />
<br />
::Starling Software's [http://www.starling-software.com/misc/icfp-2009-cjs.pdf experience building real time trading systems in Haskell] <br />
<br />
* [http://www.tabula.com/ Tabula.com]<br />
<blockquote><br />
Tabula is a privately held fabless semiconductor company developing 3-D Programmable Logic Devices. Haskell is used for internal compiler toolchains related to hardware design.<br />
</blockquote><br />
<br />
* [http://tsurucapital.com Tsuru Capital] Tokyo, Japan<br />
<blockquote><br />
Tsuru Capital is operating an automated options trading system written in Haskell.<br />
</blockquote><br />
<br />
::[http://haskell.org/communities/05-2010/html/report.html#sect7.6 Tsuru Capital's HCAR submission]<br />
<br />
* [http://tupil.com/ Tupil] Utrecht, The Netherlands<br />
<br />
<blockquote><br />
Tupil is a Dutch company that built software for clients, written in Haskell. Tupil used Haskell for its speed of development and the resulting software quality. The company was founded by Chris Eidhof and Eelco Lempsink. Currently they build iPhone/iPad applications in Objective-C.<br />
</blockquote><br />
<br />
:: Tupil's experience building [http://blog.tupil.com/building-commercial-haskell-applications/ commercial web apps in Haskell]<br />
<br />
* [http://typlab.com TypLAB] Amsterdam, The Netherlands<br />
<br />
<blockquote><br />
TypLAB investigates and develops new ways of creating and consuming online content. Their [http://www.silkapp.com/ Silk] application makes it easy to filter and visualize large amounts of information.<br />
</blockquote><br />
<br />
:: TypLAB's blog on [http://blog.typlab.com/2009/09/why-we-use-haskell/ why they use Haskell]<br />
:: A [http://thenextweb.com/eu/2011/04/28/filter-and-visualize-data-in-seconds-with-silk/ review of Silk]<br />
<br />
* [http://www.sensor-sense.nl Sensor Sense] Nijmegen, The Netherlands<br />
<br />
<blockquote><br />
Sensor Sense is offering high technology systems for gas measurements in the ''ppbv'' down to ''pptv'' range. We use Haskell for the embedded control software of our trace gas detectors.<br />
</blockquote><br />
<br />
If you're using Haskell commercially, please add your details here.<br />
<br />
== The Industrial Haskell Group ==<br />
<br />
The [http://industry.haskell.org/ Industrial Haskell Group (IHG)] is an organisation to support the needs of commercial users of the Haskell programming language. <br />
<br />
== Jobs and recruitment ==<br />
<br />
[[Jobs|Haskell jobs]] on the HaskellWiki.<br />
<br />
[http://www.haskellers.com/jobs Jobs at Haskellers.com].<br />
<br />
== Consultants ==<br />
<br />
[[Consultants]]<br />
<br />
== Commercial Users of Functional Programming Workshop ==<br />
<br />
[http://www.galois.com/cufp/ Commercial Users of Functional Programming]<br />
<br />
The goal of [http://www.galois.com/cufp/ CUFP] is to build a community<br />
for users of functional programming languages and technology, be they<br />
using functional languages in their professional lives, in an open<br />
source project (other than implementation of functional languages), as a<br />
hobby, or any combination thereof. In short: anyone who uses functional<br />
programming as a means, but not an end.<br />
<br />
[[Category:Community]]</div>Tbhhttps://wiki.haskell.org/index.php?title=Haskell_mode_for_Emacs&diff=26088Haskell mode for Emacs2009-01-25T00:56:35Z<p>Tbh: </p>
<hr />
<div>haskell-mode is a major mode for Emacs and XEmacs specifically for writing Haskell code. You can get HaskellMode from the web page: http://www.haskell.org/haskell-mode/ or on Debian you can type <code>apt-get install haskell-mode</code> (although this currently doesn't have the latest version). <br />
<br />
==Obtaining the CVS version==<br />
<br />
<code>cvs -d :pserver:anoncvs@cvs.haskell.org:/cvs login # password 'cvs' </code><br />
<br />
<code>cvs -d :pserver:anoncvs@cvs.haskell.org:/cvs co fptools/CONTRIB/haskell-modes/emacs</code><br />
<br />
==Minimal setup==<br />
<br />
Insert in your ~/.emacs or other appropriate file:<br />
<br />
<code>(load "/path/to/haskell-mode/haskell-site-file")</code><br />
<br />
==Tips and use==<br />
Handy keybindings in haskell-mode. See the documentation <code>C-h m</code> for more information:<br />
*<code>C-c C-=</code> inserts an = sign and lines up type signatures and other pattern matches nicely.<br />
*<code>C-c C-|</code> inserts a guard<br />
*<code>C-c C-o</code> inserts a guard <hask>| otherwise =</hask> and lines up existing guards<br />
*<code>C-c C-w</code> inserts a where keyword<br />
*<code>C-c C-.</code> aligns code over a region in a "sensible" fashion.<br />
<br />
Now in version 2.2:<br />
*<code>C-c C-t</code> gets :type for symbol at point, and remembers it<br />
*<code>C-u C-c C-t</code> inserts a type annotation, for symbol at point, on the line above<br />
*<code>C-c C-i</code> gets :info for symbol at point<br />
*<code>C-c M-.</code> find definition of (interpreted) symbol at point<br />
<br />
(See the section below on [[#inf-haskell.el:_the_best_thing_since_the_breadknife|inf-haskell]].)<br />
<br />
Here's an example for <code>C-c C-=</code>. Put your cursor after myInt and hit <code>C-c C-=</code><br />
<haskell><br />
blah :: Int -> Int<br />
blah myInt<br />
</haskell><br />
note how the function signature is reindented to match the column of the = sign.<br />
<haskell><br />
blah :: Int -> Int<br />
blah myInt =<br />
</haskell><br />
<br />
You could also achieve the same effect by selecting the region and typing <code>C-c C-.</code><br />
<br />
You can also use haskell-mode to load Emacs buffers with Haskell code into either Hugs or GHCi. Type <code>C-c C-l</code> to load the file. Then, you can type <code>C-c C-r</code> (or simply <code>C-c C-l</code> again) to reload the current module when you have made a change.<br />
<br />
=== Indentation ===<br />
Indentation is one thing that nearly all programming modes provide. However, indenting Haskell code is very hard: for a given line, there is nearly always more than one column for which indentation makes sense. For example, imagine the following is open in a haskell-mode buffer, where <code>!</code> represents the point:<br />
<br />
<haskell><br />
foo :: Int -> String<br />
foo 0 = f 4 ++ s<br />
    where f 4 = "hello" ++ <br />
!<br />
</haskell><br />
<br />
If you ask haskell-mode to indent for you, where should it indent to? There are four basic options:<br />
<br />
<ol><br />
<li><br />
You want to finish off the expression you were writing in the last line. Haskell-mode indents to be underneath the <code>"</code> character at the beginning of <code>"hello"</code>:<br />
<br />
<haskell><br />
    where f 4 = "hello" ++<br />
                !<br />
</haskell><br />
<br />
This is debatably a bad choice as you'd probably want to indent a bit further in to make it clear that you were carrying on an expression, but the layout rule would accept something like the following:<br />
<br />
<haskell><br />
    where f 4 = "hello" ++<br />
                "world"<br />
</haskell><br />
<br />
</li><br />
<li><br />
You want to add a second equation for <code>f</code>. Haskell-mode will indent to line up with the first argument, and fill in the <code>f</code> in the equation:<br />
<br />
<haskell><br />
    where f 4 = "hello" ++<br />
          f !<br />
</haskell><br />
<br />
This is an unlikely choice as the expression in the previous line isn't complete, but haskell-mode isn't smart enough to know that. (If <code>f</code> had been something without arguments, like <hask>where f = "hello"</hask>, then it's impossible to have more than one equation and haskell-mode won't offer this indentation level.)<br />
</li><br />
<li><br />
You want to add a second binding to the <code>where</code>-block. Haskell-mode indents to line up with the <code>f</code>:<br />
<br />
<haskell><br />
    where f 4 = "hello" ++<br />
          !<br />
</haskell><br />
<br />
</li><br />
<li>You want to start an entirely new top-level binding. Haskell-mode indents to the first column:<br />
<br />
<haskell><br />
foo :: Int -> String<br />
foo 0 = f 4 ++ s<br />
    where f 4 = "hello" ++<br />
!<br />
</haskell><br />
<br />
</li><br />
</ol><br />
<br />
These four locations can be reached by repeatedly pressing <code>TAB</code>. This is what's known as the tab-cycle. The innermost location is offered first, then cycling progresses outwards. Although this may seem like an inefficient system (and it is indeed a shame that Haskell's design didn't result in an unambiguous indentation system), you do quickly get used to the tab-cycle and indenting Haskell code.<br />
<br />
==== indent-region ====<br />
Using indent-region is generally a bad idea on Haskell code, because it would need to know which of the tab-cycle stops you wish to choose for each line. The innermost one is chosen in each case, which often results in unusable code. Moral: just don't use indent-region with haskell-mode.<br />
<br />
==== Unicodifying symbols (Pretty Lambda for Haskell-mode) ====<br />
In Haskell code, you can end up using a lot of mathematical symbols. It is possible to hack the fontifying features of Emacs to change the ASCII textual representations of arrows and operators into the nice-looking real symbols, much like you could with TeX. The following code is a compilation of Emacs lisp code found on the Emacs wiki on the [http://www.emacswiki.org/cgi-bin/wiki/PrettyLambda#toc4 Pretty Lambda] page (that page also has examples of how to apply the general Unicode defuns to other languages):<br />
<br />
HOWEVER: due to the symbols taking up less space, this has the unfortunate side effect of changing the indentation stops that the indent key offers. This will mean that your code may not look properly aligned to those who do not have this feature in their editor, or could even mean that your code means something different to how it looks. (It is possible to contrive an example that looks correct in emacs, but actually fails to compile). The following is left for interest, but probably should NOT be used.<br />
<br />
<code><br />
(defun unicode-symbol (name)<br />
"Translate a symbolic name for a Unicode character -- e.g., LEFT-ARROW <br />
or GREATER-THAN into an actual Unicode character code. "<br />
(decode-char 'ucs (case name <br />
('left-arrow 8592)<br />
('up-arrow 8593)<br />
('right-arrow 8594)<br />
('down-arrow 8595) <br />
('double-vertical-bar #X2551) <br />
('equal #X003d)<br />
('not-equal #X2260)<br />
('identical #X2261)<br />
('not-identical #X2262)<br />
('less-than #X003c)<br />
('greater-than #X003e)<br />
('less-than-or-equal-to #X2264)<br />
('greater-than-or-equal-to #X2265) <br />
('logical-and #X2227)<br />
('logical-or #X2228)<br />
('logical-neg #X00AC) <br />
('nil #X2205)<br />
('horizontal-ellipsis #X2026)<br />
('double-exclamation #X203C)<br />
('prime #X2032)<br />
('double-prime #X2033)<br />
('for-all #X2200)<br />
('there-exists #X2203)<br />
('element-of #X2208) <br />
('square-root #X221A)<br />
('squared #X00B2)<br />
('cubed #X00B3) <br />
('lambda #X03BB)<br />
('alpha #X03B1)<br />
('beta #X03B2)<br />
('gamma #X03B3)<br />
('delta #X03B4))))<br />
<br />
(defun substitute-pattern-with-unicode (pattern symbol)<br />
"Add a font lock hook to replace the matched part of PATTERN with the <br />
Unicode symbol SYMBOL looked up with UNICODE-SYMBOL."<br />
(interactive)<br />
(font-lock-add-keywords<br />
nil `((,pattern <br />
(0 (progn (compose-region (match-beginning 1) (match-end 1)<br />
,(unicode-symbol symbol)<br />
'decompose-region)<br />
nil))))))<br />
<br />
(defun substitute-patterns-with-unicode (patterns)<br />
"Call SUBSTITUTE-PATTERN-WITH-UNICODE repeatedly."<br />
(mapcar #'(lambda (x)<br />
(substitute-pattern-with-unicode (car x)<br />
(cdr x)))<br />
patterns))<br />
<br />
(defun haskell-unicode ()<br />
(interactive)<br />
(substitute-patterns-with-unicode<br />
(list (cons "\\(<-\\)" 'left-arrow)<br />
(cons "\\(->\\)" 'right-arrow)<br />
(cons "\\(==\\)" 'identical)<br />
(cons "\\(/=\\)" 'not-identical)<br />
(cons "\\(()\\)" 'nil)<br />
(cons "\\<\\(sqrt\\)\\>" 'square-root)<br />
(cons "\\(&&\\)" 'logical-and)<br />
(cons "\\(||\\)" 'logical-or)<br />
(cons "\\<\\(not\\)\\>" 'logical-neg)<br />
(cons "\\(>\\)[^=]" 'greater-than)<br />
(cons "\\(<\\)[^=]" 'less-than)<br />
(cons "\\(>=\\)" 'greater-than-or-equal-to)<br />
(cons "\\(<=\\)" 'less-than-or-equal-to)<br />
(cons "\\<\\(alpha\\)\\>" 'alpha)<br />
(cons "\\<\\(beta\\)\\>" 'beta)<br />
(cons "\\<\\(gamma\\)\\>" 'gamma)<br />
(cons "\\<\\(delta\\)\\>" 'delta)<br />
(cons "\\(<nowiki>''</nowiki>\\)" 'double-prime)<br />
(cons "\\('\\)" 'prime)<br />
(cons "\\(!!\\)" 'double-exclamation)<br />
(cons "\\(\\.\\.\\)" 'horizontal-ellipsis))))<br />
<br />
(add-hook 'haskell-mode-hook 'haskell-unicode)</code><br />
<br />
== Bugs ==<br />
Bugs and feature requests should be sent to the maintainer [mailto:monnier@iro.umontreal.ca Stefan Monnier].<br />
For people using the Debian package, Debian maintains a [http://bugs.debian.org/cgi-bin/pkgreport.cgi?pkg=haskell-mode list of bugs] for haskell-mode, they should be reported there.<br />
<br />
===XEmacs===<br />
On some GNU/Linux systems with XEmacs (admittedly only verified on Ubuntu and Debian with haskell-mode 2.2), a system function is missing, which interferes with automatic indenting. Secondly, there seems to be an issue with setting the <code-lisp>haskell-default-face</code-lisp> to <code-lisp>nil</code-lisp>.<br />
<br />
====line-end-position====<br />
<br />
To fix this, find where the haskell mode package is installed on your system. (Usually <code>/usr/share/emacs/site-lisp/haskell-mode</code>). Edit the file <code>haskell-indent.el</code> and add the lines:<br />
<pre-lisp><br />
(eval-and-compile<br />
<br />
  ;; If `line-end-position' isn't available provide one.<br />
  (unless (fboundp 'line-end-position)<br />
    (defun line-end-position (&optional n)<br />
      "Return the `point' of the end of the current line."<br />
      (save-excursion<br />
        (end-of-line n)<br />
        (point)))))<br />
</pre-lisp><br />
right after the comments at the top. That should fix the issue.<br />
<br />
====haskell-default-face====<br />
<br />
This one shows up when typing in code (at various spots - most often when typing a qualified function, such as <hask>List.map</hask>.)<br />
<br />
To fix this one, edit the file <code>haskell-font-lock.el</code>. Look for the line that says:<br />
<pre-lisp><br />
(defvar haskell-default-face nil)<br />
</pre-lisp><br />
and change this to <br />
<pre-lisp><br />
(defvar haskell-default-face 'default)<br />
</pre-lisp><br />
In my version, this is line 168.<br />
<br />
Then, look for the line that says:<br />
<pre-lisp><br />
(,qvarid 0 haskell-default-face)<br />
</pre-lisp><br />
and change it to<br />
<pre-lisp><br />
(,qvarid 0 (symbol-value 'haskell-default-face))<br />
</pre-lisp><br />
<br />
For me, this is line 326 of the file.<br />
YMMV - hope this helps.<br />
<br />
<!--<br />
=== GNU Emacs ===<br />
<br />
==== ghci buffer infested with "^J"s, C-c C-t doesn't work ====<br />
<br />
(only happens after loading a haskell file into ghci)<br />
<br />
--><br />
<br />
<br />
== inf-haskell.el: the best thing since the breadknife ==<br />
inf-haskell.el is _awesome_. At one point I decided to sit down and write a list of functions I'd love to have in haskell-mode, intending to write them myself. I thought I'd check to see whether the key shortcuts I'd chosen were free but I was surprised to find that ''every one'' of these functions is already provided by inf-haskell.el! Here's a selection of the highlights:<br />
<br />
<br />
=== Getting set up ===<br />
inf-haskell.el is usually already set up as part of the haskell-mode package, so there is nothing special to do for it. On some systems, you may need this in your .emacs:<br />
<br />
<pre-lisp><br />
(require 'inf-haskell)<br />
</pre-lisp><br />
<br />
To use the following functions, first find a .hs file, then hit <code>C-c C-l</code> (inferior-haskell-load-file). This fires up Hugs or GHCi (you can change this by customising haskell-program-name) on your file. Don't worry if it's not an isolated module; GHCi will load all the modules it imports as normal. You can even load entire programs this way by using <code>C-c C-l</code> on the Main.hs file. If everything loads without errors, you'll be able to use the functions below.<br />
<br />
=== inferior-haskell-type (C-c C-t) ===<br />
Say you have the following code:<br />
<br />
<haskell><br />
foo = foldr (+) 0 [1..20]<br />
</haskell><br />
<br />
Perhaps you've forgotten the order of arguments to foldr. It's easily done; I can never remember whether the operation or final value comes first. That's easy to check: just put your point between the 'f' and 'r' of 'foldr' and hit C-c C-t RET. The type of foldr will be revealed in the echo area. This isn't particularly impressive; haskell-doc.el already did this. However, this will work for ''any'' function in the module in question ''or'' in those modules imported by the current module (including the standard libs)!<br />
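<br />
For reference, here is a tiny standalone snippet (invented for this illustration, not haskell-mode output) showing the type that <code>C-c C-t</code> reports for foldr, and confirming that the operation indeed comes before the final value:<br />

```haskell
-- foldr :: (a -> b -> b) -> b -> [a] -> b
-- The combining operation comes first, then the final value, then the list,
-- so foldr (+) 0 [1..20] == 1 + (2 + (... + (20 + 0))).

foo :: Int
foo = foldr (+) 0 [1..20]

main :: IO ()
main = print foo   -- prints 210
```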
<br />
If you find that the type shown in the echo area is overwritten after a short amount of time (or any other such problem, of course), please report it as a bug. We know of no such bug, but someone apparently bumped into some such problem which he says he worked around by disabling doc-mode and decl-scan:<br />
<br />
To turn off haskell-doc-mode, add the following to your .emacs:<br />
<pre-lisp><br />
(remove-hook 'haskell-mode-hook 'turn-on-haskell-doc-mode)<br />
</pre-lisp><br />
To turn off haskell-decl-scan, just refrain from turning it on (it's not enabled by default).<br />
<br />
(P.S. I re-use haskell-doc-mode to save queried type info, and re-display it in the minibuffer. Disabling doc mode would disable that. -- mrd)<br />
<br />
Another nice feature of this function is the ability to automatically insert type signatures for the function at point on the line above. For example, suppose you have the below open in Emacs, with the point represented by -!-:<br />
<br />
<pre><br />
-!-map _ [] = []<br />
map f (x:xs) = f x : map f xs<br />
</pre><br />
<br />
Pressing <code>C-u C-c C-t</code> (note the prefix argument) will result in the following:<br />
<br />
<pre><br />
map :: (a -> b) -> [a] -> [b]<br />
-!-map _ [] = []<br />
map f (x:xs) = f x : map f xs<br />
</pre><br />
<br />
=== inferior-haskell-info (C-c C-i) ===<br />
The :info command in GHCi/Hugs is extremely useful; it'll tell you:<br />
<br />
* The definition of an algebraic datatype given its name. E.g. try <code>:info Bool</code>. The output will contain something like <code>data Bool = True | False</code>.<br />
* The classes a type instantiates given the type's name. <code>:info Bool</code> will also give you the classes Bool instantiates. If you can't see an instance you think should be there, make sure the module where that instance is declared is loaded.<br />
* The type of a function, given its name.<br />
* The types of the methods of a class, and the number of arguments of that class, given the class name.<br />
* The expansion of a type synonym given that synonym's name.<br />
<br />
And for all of the above, :info will also tell you the filename and line where that thing is defined. inferior-haskell-info lets you hook into this power. Use it with C-c C-i on anything within a Haskell file.<br />
<br />
=== inferior-haskell-find-definition (C-c M-.) ===<br />
This one needs little explanation. Sometimes you just need to find the source of a function, or datatype, or class, or type synonym etc. to see how it works, and this function lets you do just that. Unfortunately, it won't work on the standard lib modules or anything that isn't 'local' to your project. This is one of the most useful functions inf-haskell.el provides.<br />
<br />
(Basically, it only works on interpreted code, for which ghci has location information. If you want a more general find-definition, use hasktags to create a TAGS file and then use the normal emacs <code>M-.</code> with that. -- mrd)<br />
: Note that you can also create a TAGS file using GHCi's :etags command. [[User:DavidHouse|DavidHouse]] 14:38, 29 April 2007 (UTC)<br />
<br />
: Again, :etags/:ctags only works for interpreted code.<br />
<br />
== Tricks and tweaks ==<br />
<br />
=== Automatic unit testing ===<br />
Here's a cute trick I've evolved:<br />
<br />
I'm a great fan of [[unit test first]], as described by eXtremeProgramming on TheOriginalWiki.<br />
<br />
With the code below, I can press F12 to run all of my unit tests and immediately see whether they all passed.<br />
I've put all of my unit tests into their own file with a main function that runs the tests and gives an exitcode according to the test results. I've specified that the compile-command for that file compiles and runs the file.<br />
<br />
This elisp code will run the <code>compile</code> command from the F12 key in Emacs. The output will pop up in a new window twelve lines tall. If the compilation is successful (exit code zero) the window goes away. If the exit code is 1 or greater, the window stays so you can see the output.<br />
<pre-lisp><br />
(require 'compile)<br />
<br />
;; this means hitting the compile button always saves the buffer<br />
;; having to separately hit C-x C-s is a waste of time<br />
(setq mode-compile-always-save-buffer-p t)<br />
;; make the compile window stick at 12 lines tall<br />
(setq compilation-window-height 12)<br />
<br />
;; from enberg on #emacs<br />
;; if the compilation has a zero exit code, <br />
;; the windows disappears after two seconds<br />
;; otherwise it stays<br />
(setq compilation-finish-function<br />
      (lambda (buf str)<br />
        (unless (string-match "exited abnormally" str)<br />
          ;; no errors, make the compilation window go away in a few seconds<br />
          (run-at-time<br />
           "2 sec" nil 'delete-windows-on<br />
           (get-buffer-create "*compilation*"))<br />
          (message "No Compilation Errors!"))))<br />
<br />
<br />
;; one-button testing, tada!<br />
(global-set-key [f12] 'compile)<br />
</pre-lisp><br />
<br />
<br />
This Haskell code has some Emacs local variable settings at the bottom specifying what the compile-command should be for this buffer.<br />
<haskell><br />
import Test.HUnit<br />
import System.Exit<br />
<br />
myTestList =<br />
    TestList [ "add numbers" ~: 5 ~=? (3 + 2)<br />
             , "add numbers" ~: 5 ~=? (3 + 3)<br />
             ]<br />
<br />
h = runTestTT myTestList<br />
<br />
main = do c <- h<br />
          putStr $ show c<br />
          let errs  = errors c<br />
              fails = failures c<br />
          exitWith (codeGet errs fails)<br />
<br />
codeGet errs fails<br />
    | fails > 0 = ExitFailure 2<br />
    | errs > 0  = ExitFailure 1<br />
    | otherwise = ExitSuccess<br />
<br />
-- Local Variables:<br />
-- compile-command: "ghc --make -o Test_Demo -i/home/shae/src/haskell/libraries/ HUnitDemo.hs && ./Test_Demo"<br />
-- End:<br />
</haskell><br />
<br />
<br />
If you have any questions, ideas, or suggestions for this code, the maintainer would love to hear them.<br />
<br />
=== Hoogle integration ===<br />
From haskell-mode version 2.4 onwards, the built-in function haskell-hoogle will hoogle the identifier at point.<br />
<br />
=== Using rectangular region commands ===<br />
<br />
Emacs has a set of commands which operate on the region as if it were rectangular. This turns out to be extremely useful when dealing with whitespace-sensitive languages.<br />
<br />
<code>C-x r o</code> is "Open Rectangle". It will shift any text within the rectangle to the right side. Also see:<br />
<br />
<code>C-x r t</code> is "String Rectangle". It will shift any text within the rectangle over to the right, and insert a given string prefixing all the lines in the region. If comment-region didn't already exist, you could use this instead, for example.<br />
<br />
<code>C-x r d</code> is "Delete Rectangle". It will delete the contents of the rectangle and move anything on the right over.<br />
<br />
<code>C-x r r</code> is "Copy Rectangle to Register". It will prompt you for a register number so it can save it for later.<br />
<br />
<code>C-x r g</code> is "Insert register". This will insert the contents of the given register, overwriting whatever happens to be within the target rectangle. (So make room)<br />
<br />
<code>C-x r k</code> is "Kill rectangle". Delete rectangle and save contents for:<br />
<br />
<code>C-x r y</code> is "Yank rectangle". This will insert the contents of<br />
the last killed rectangle.<br />
<br />
As with all Emacs modifier combos, you can type <code>C-x r C-h</code> to find out what keys are bound beginning with the <code>C-x r</code> prefix.<br />
<br />
=== Aligning code ===<br />
<br />
Emacs 22 has a neat tool called <code>align-regexp</code>. Select a region you want to align text within, type <code>M-x align-regexp</code>, and enter a regexp representing the alignment delimiter.<br />
<br />
For example, I often line up my Haddock comments:<br />
<br />
<haskell><br />
f :: a -- ^ does a<br />
  -> Foo b -- ^ and b<br />
  -> c -- ^ to c<br />
</haskell><br />
<br />
Select the region, and let the regexp be: <code>--</code><br />
<br />
<haskell><br />
f :: a     -- ^ does a<br />
  -> Foo b -- ^ and b<br />
  -> c     -- ^ to c<br />
</haskell><br />
<br />
Of course, this works for just about anything. Personally, I've globally bound it to <code>C-x a r</code>:<br />
<br />
<code>(global-set-key (kbd "C-x a r") 'align-regexp)</code>.<br />
<br />
=== Automatically building ===<br />
Emacs 21 has a package that can be installed (included by default in 22 and up) called 'FlyMake'; the idea is that as you are editing away, it occasionally calls the interpreter/compiler automatically and keeps track of whether the code works or not. You can fairly easily get it to work for Haskell as well; see [http://www.emacswiki.org/cgi-bin/wiki/FlymakeHaskell FlymakeHaskell] on the Emacs wiki.<br />
<br />
=== Emacs Integration with Hayoo ===<br />
<br />
My newly installed system would not allow me to hoogle what I wanted (no xmonad or xmonadcontrib in hoogle) so someone suggested Hayoo.<br />
<br />
<code><br />
(define-key haskell-mode-map (kbd "<f3>") (lambda()(interactive)(browse-url (format "http://holumbus.fh-wedel.de/hayoo/results/hayoo.html?query=%s&start" (region-or-word-at-point)))))<br />
</code><br />
<br />
region-or-word-at-point is available in the thing-at-pt+.el library.<br />
<br />
Added 22-12-2008 - Prompt for Hayoo word<br />
<br />
<code><br />
(defun rgr/hayoo()<br />
(interactive)<br />
(let* ((default (region-or-word-at-point))<br />
(term (read-string (format "Hayoo for the following phrase (%s): "<br />
default))))<br />
(let ((term (if (zerop(length term)) default term)))<br />
(browse-url (format "http://holumbus.fh-wedel.de/hayoo/results/hayoo.html?query=%s&start" term)))))<br />
<br />
<br />
(define-key haskell-mode-map (kbd "<f3>") (lambda()(interactive)(rgr/hayoo)))<br />
</code><br />
<br />
Alternatively, use the excellent browse-apropos-url stuff:<br />
<br />
http://www.emacswiki.org/emacs/BrowseAproposURL#toc6<br />
<br />
[http://richardriley.net Richard]<br />
<br />
Note: Using a URL like this should work too and will give better results (not yet tested, as I'm not an Emacs user):<br />
<br />
<code><br />
http://holumbus.fh-wedel.de/hayoo/hayoo.html?query=%s<br />
</code><br />
<br />
[[User:Tbh|Tbh]] 00:56, 25 January 2009 (UTC)<br />
<br />
[[Category:Development tools]]</div>Tbhhttps://wiki.haskell.org/index.php?title=HXT/Conversion_of_Haskell_data_from/to_XML&diff=20666HXT/Conversion of Haskell data from/to XML2008-04-23T08:33:39Z<p>Tbh: </p>
<hr />
<div>[[Category:Tools]] [[Category:Tutorials]]<br />
<br />
== Serializing and deserializing Haskell data to/from XML ==<br />
<br />
With so-called pickler functions and arrows, it becomes rather easy<br />
and straightforward to convert native Haskell values to XML and vice<br />
versa. The module ''Text.XML.HXT.Arrow.Pickle'' and submodules<br />
contain a set of picklers (conversion functions) for simple data types<br />
and pickler combinators for complex types.<br />
<br />
__TOC__<br />
<br />
== The idea: XML pickler ==<br />
<br />
For conversion of native Haskell data from and to an external<br />
representation, two functions are necessary: one for generating the external<br />
representation and one for reading/parsing that representation. The<br />
''read''/''show'' pair often forms such a pair of functions.<br />
<br />
A so-called pickler is a value with two such conversion<br />
functions.<br />
Because a whole sequence of<br />
conversion functions has to be applied at once, there is a state holding the external data<br />
that has to be updated during encoding and<br />
decoding. So the simplest form of a<br />
pickler converting between a value of type ''a'' and a sequence of Chars looks like<br />
this.<br />
<br />
<haskell><br />
type St = [Char]<br />
<br />
data PU a = PU { appPickle :: (a, St) -> St<br />
               , appUnPickle :: St -> (a, St)<br />
               }<br />
</haskell><br />
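<br />
To make the combinator idea concrete before looking at HXT's real types, here is a small self-contained sketch in the style of Kennedy's paper (the names <code>xpChar</code>, <code>pickle</code> and <code>unpickle</code> are invented for this illustration; they are not HXT functions). Note how pickling prepends to the state while unpickling consumes it, so the two functions mirror each other:<br />

```haskell
type St = [Char]

data PU a = PU { appPickle   :: (a, St) -> St
               , appUnPickle :: St -> (a, St)
               }

-- run a pickler against an empty/full external state
pickle :: PU a -> a -> St
pickle p v = appPickle p (v, [])

unpickle :: PU a -> St -> a
unpickle p = fst . appUnPickle p

-- primitive pickler: a single character
xpChar :: PU Char
xpChar = PU { appPickle   = \(c, s) -> c : s
            , appUnPickle = \(c:s) -> (c, s)
            }

-- combinator: pickle an 'a' followed by a 'b'
xpPair :: PU a -> PU b -> PU (a, b)
xpPair pa pb =
    PU { appPickle   = \((a, b), s) -> appPickle pa (a, appPickle pb (b, s))
       , appUnPickle = \s -> let (a, s')  = appUnPickle pa s
                                 (b, s'') = appUnPickle pb s'
                             in ((a, b), s'')
       }

main :: IO ()
main = do
    print (pickle (xpPair xpChar xpChar) ('h', 'i'))   -- prints "hi"
    print (unpickle (xpPair xpChar xpChar) "hi")       -- prints ('h','i')
```

The real HXT ''PU'' below differs in that the state holds ''XmlTree''s (split into attributes and contents) and the unpickler returns a ''Maybe'' to signal failure.<br />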
<br />
Andrew Kennedy has described, in a functional pearl paper<br />
[http://research.microsoft.com/~akenn/fun/picklercombinators.pdf],<br />
how to define primitive picklers and<br />
a set of pickler combinators to de-/serialize from/to (byte) strings.<br />
<br />
The HXT picklers are an adaptation of these pickler combinators.<br />
The difference from Kennedy's approach is<br />
that the external representation is not a list of Chars but a list of XmlTrees.<br />
The basic picklers for the primitive types (''Int, Bool, ...'') will convert simple values into XML text nodes.<br />
New are the picklers for creating XML element and attribute nodes.<br />
<br />
The HXT pickler type is defined as follows<br />
<br />
<haskell><br />
data St = St { attributes :: [XmlTree]<br />
             , contents :: [XmlTree]<br />
             }<br />
<br />
data PU a = PU { appPickle :: (a, St) -> St<br />
               , appUnPickle :: St -> (Maybe a, St)<br />
               , theSchema :: Schema<br />
               }<br />
</haskell><br />
<br />
In XML there are two places for storing information:<br />
the attributes and the element contents.<br />
Furthermore the pickler contains a third component for<br />
type information. This enables the derivation of a DTD<br />
from a set of picklers. In the following examples we do not need this component.<br />
<br />
We will see that, with the predefined picklers<br />
and pickler combinators, we don't have to look very much<br />
into these internals. Let's start with an example.<br />
<br />
== Example: Processing baseball league data ==<br />
<br />
=== The XML data structure ===<br />
<br />
From the set of [[HXT/Practical]] example we'll take the data<br />
structure<br />
from [[HXT/Practical/Simple2]] dealing with baseball league data.<br />
First let's have an idea about the structure of the XML data.<br />
The structure is not defined by a DTD or schema, so we have to guess<br />
at some details.<br />
Here is a part of the example XML file:<br />
<br />
<pre><br />
<SEASON YEAR="1998"><br />
<LEAGUE NAME="National League"><br />
<DIVISION NAME="East"><br />
<TEAM CITY="Atlanta" NAME="Braves"><br />
<PLAYER GIVEN_NAME="Marty" SURNAME="Malloy"<br />
POSITION="Second Base" GAMES="11"<br />
GAMES_STARTED="8" AT_BATS="28" RUNS="3"<br />
HITS="5" DOUBLES="1" TRIPLES="0"<br />
HOME_RUNS="1" RBI="1" STEALS="0"<br />
CAUGHT_STEALING="0" SACRIFICE_HITS="0"<br />
SACRIFICE_FLIES="0" ERRORS="0" WALKS="2" STRUCK_OUT="2" HIT_BY_PITCH="0"><br />
</PLAYER><br />
<PLAYER GIVEN_NAME="Ozzie" SURNAME="Guillen"<br />
POSITION="Shortstop" GAMES="83"<br />
GAMES_STARTED="59" AT_BATS="264" RUNS="35"<br />
HITS="73" DOUBLES="15" TRIPLES="1"<br />
HOME_RUNS="1" RBI="22" STEALS="1"<br />
CAUGHT_STEALING="4" SACRIFICE_HITS="4"<br />
SACRIFICE_FLIES="2" ERRORS="6" WALKS="24" STRUCK_OUT="25" HIT_BY_PITCH="1"><br />
</PLAYER><br />
<PLAYER GIVEN_NAME="Danny" ... HIT_BY_PITCH="0"><br />
</PLAYER><br />
<PLAYER GIVEN_NAME="Gerald" ...><br />
</PLAYER><br />
...<br />
</TEAM><br />
<TEAM CITY="Florida" NAME="Marlins"><br />
</TEAM><br />
<TEAM CITY="Montreal" NAME="Expos"><br />
</TEAM><br />
<TEAM CITY="New York" NAME="Mets"><br />
</TEAM><br />
<TEAM CITY="Philadelphia" NAME="Phillies"><br />
</TEAM><br />
</DIVISION><br />
...<br />
</LEAGUE><br />
<LEAGUE NAME="American League"><br />
<DIVISION NAME="East"><br />
...<br />
</DIVISION><br />
<DIVISION NAME="Central"><br />
...<br />
</DIVISION><br />
...<br />
</LEAGUE><br />
</SEASON><br />
</pre><br />
<br />
=== The Haskell data model ===<br />
<br />
Let's first analyze the underlying data model and then define an<br />
appropriate set of Haskell data type for the internal representation.<br />
<br />
* The root type is a ''Season'', consisting of a ''year'' an a set of ''League''s<br />
* The ''League''s are all identified by a ''String'' and consist of a set of ''Division''s, so it's a ''Map''.<br />
* The ''Division''s are also identified by a ''String'' and consist of a list of ''Team''s, so it's again a ''Map''<br />
* A ''Team'' has three components, a ''teamName'', a ''city'', and a list of ''Player''s<br />
* A ''Player'' has a lot of attributes; for simplicity, the internal model will not take all fields into account. Just six fields are included: the ''firstName'', the ''lastName'', the ''position'', ''atBats'', ''hits'' and ''era''. All others will be ignored.<br />
<br />
So the Haskell data model looks like this:<br />
<br />
<haskell><br />
import Data.Map (Map, fromList, toList)<br />
<br />
data Season = Season<br />
    { sYear :: Int<br />
    , sLeagues :: Leagues<br />
    }<br />
    deriving (Show, Eq)<br />
<br />
type Leagues = Map String Divisions<br />
<br />
type Divisions = Map String [Team]<br />
<br />
data Team = Team<br />
    { teamName :: String<br />
    , city :: String<br />
    , players :: [Player]<br />
    }<br />
    deriving (Show, Eq)<br />
<br />
data Player = Player<br />
    { firstName :: String<br />
    , lastName :: String<br />
    , position :: String<br />
    , atBats :: Maybe Int<br />
    , hits :: Maybe Int<br />
    , era :: Maybe Float<br />
    }<br />
    deriving (Show, Eq)<br />
</haskell><br />
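<br />
As a quick sanity check of the model, the Marty Malloy record from the XML above can be written down as a Haskell value (the ''Player'' declaration is repeated here so the snippet stands alone; ''ERA'' is absent in the XML for this player, hence <code>Nothing</code>):<br />

```haskell
data Player = Player
    { firstName :: String
    , lastName  :: String
    , position  :: String
    , atBats    :: Maybe Int
    , hits      :: Maybe Int
    , era       :: Maybe Float
    }
    deriving (Show, Eq)

-- From the XML: GIVEN_NAME="Marty" SURNAME="Malloy" POSITION="Second Base"
-- AT_BATS="28" HITS="5", and no ERA attribute.
malloy :: Player
malloy = Player
    { firstName = "Marty"
    , lastName  = "Malloy"
    , position  = "Second Base"
    , atBats    = Just 28
    , hits      = Just 5
    , era       = Nothing
    }

main :: IO ()
main = print (atBats malloy, era malloy)
```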
<br />
=== The predefined picklers ===<br />
<br />
In HXT there is a class ''XmlPickler'' with a single function ''xpickle'',<br />
used to overload the pickler name for different types.<br />
<br />
<haskell><br />
class XmlPickler a where<br />
    xpickle :: PU a<br />
</haskell><br />
<br />
For the simple data types there is an instance for XmlPickler,<br />
which uses the primitive pickler ''xpPrim'' for conversion<br />
from and to XML text nodes. This primitive pickler is available<br />
for all types supporting ''read'' and ''show''.<br />
<br />
<haskell><br />
instance XmlPickler Int where<br />
    xpickle = xpPrim<br />
<br />
instance XmlPickler Integer where<br />
    xpickle = xpPrim<br />
<br />
...<br />
</haskell><br />
<br />
For composite data there are predefined pickler combinators<br />
for tuples, lists and Maybe types.<br />
<br />
<haskell><br />
instance (XmlPickler a, XmlPickler b) => XmlPickler (a,b) where<br />
    xpickle = xpPair xpickle xpickle<br />
<br />
-- similar instances for (,,), (,,,), ...<br />
<br />
instance XmlPickler a => XmlPickler [a] where<br />
    xpickle = xpList xpickle<br />
<br />
instance XmlPickler a => XmlPickler (Maybe a) where<br />
    xpickle = xpOption xpickle<br />
</haskell><br />
<br />
* ''xpPair'' takes two picklers and builds up a pickler for a tuple type. There are also pickler combinators for triples, 4- and 5-tuples.<br />
* ''xpList'' takes a pickler for an element type and gives a list pickler<br />
* ''xpOption'' takes a pickler and returns a pickler for optional values.<br />
<br />
Furthermore we need picklers for generating/reading element and attribute nodes:<br />
<br />
* ''xpElem'' generates/parses an XML element node<br />
* ''xpAttr'' generates/parses an attribute node<br />
<br />
Most of the other structured data is pickled/unpickled by converting the data to/from<br />
tuples, lists and options. This is done by a wrapper pickler ''xpWrap''.<br />
<br />
=== Constructing the example picklers ===<br />
<br />
For every Haskell type we will define a pickler.<br />
<br />
For our own data types we will declare instances of the ''XmlPickler'' class.<br />
<br />
<haskell><br />
instance XmlPickler Season where<br />
    xpickle = xpSeason<br />
<br />
instance XmlPickler Team where<br />
    xpickle = xpTeam<br />
<br />
instance XmlPickler Player where<br />
    xpickle = xpPlayer<br />
</haskell><br />
<br />
<br />
Then the picklers are developed top-down, starting with ''xpSeason''.<br />
<br />
<haskell><br />
xpSeason :: PU Season<br />
xpSeason<br />
    = xpElem "SEASON" $<br />
      xpWrap ( uncurry Season<br />
             , \ s -> (sYear s, sLeagues s)) $<br />
      xpPair (xpAttr "YEAR" xpickle) xpLeagues<br />
</haskell><br />
<br />
A ''Season'' value is mapped onto an element ''SEASON'' with ''xpElem''.<br />
This constructs/reads the XML ''SEASON'' element. The two components of ''Season'' are wrapped into a pair with ''xpWrap''. ''xpWrap'' needs a pair of functions for a 1-1 mapping between ''Season'' and ''(Int, Leagues)''.<br />
The first component of the pair, the year, is mapped onto an attribute ''YEAR''.<br />
The attribute value is handled with the predefined pickler for ''Int''.<br />
The second component, the ''League''s, is handled by ''xpLeagues''.<br />
<br />
<haskell><br />
xpLeagues :: PU Leagues<br />
xpLeagues<br />
    = xpWrap ( fromList<br />
             , toList ) $<br />
      xpList $<br />
      xpElem "LEAGUE" $<br />
      xpPair (xpAttr "NAME" xpText) xpDivisions<br />
</haskell><br />
<br />
''xpLeagues'' has to deal with a Map value. This can't be done directly, but the<br />
Map value is converted to/from a list of pairs with ''xpWrap'' and ''(fromList, toList)''.<br />
Then ''xpList'' is applied to the list of pairs. Each pair is represented by a ''LEAGUE''<br />
element; the name is mapped to an attribute ''NAME'', the divisions are handled by ''xpDivisions''.<br />
<br />
<haskell><br />
xpDivisions :: PU Divisions<br />
xpDivisions<br />
= xpWrap ( fromList<br />
, toList<br />
) $<br />
xpList $<br />
xpElem "DIVISION" $<br />
xpPair (xpAttr "NAME" xpText) xpickle<br />
</haskell><br />
<br />
The divisions are pickled by the same pattern as the leagues.<br />
<br />
<haskell><br />
xpTeam :: PU Team<br />
xpTeam<br />
= xpElem "TEAM" $<br />
xpWrap ( uncurry3 Team<br />
, \ t -> (teamName t, city t, players t)) $<br />
xpTriple (xpAttr "NAME" xpText) (xpAttr "CITY" xpText) (xpList xpickle)<br />
</haskell><br />
<br />
With the teams we have to wrap the three components into a 3-tuple with ''xpWrap'' and then pickle a triple of two attributes and a list of players.<br />
<br />
<haskell><br />
xpPlayer :: PU Player<br />
xpPlayer<br />
= xpElem "PLAYER" $<br />
xpWrap ( \ ((f,l,p),(a,h,e)) -> Player f l p a h e<br />
, \ t -> ((firstName t, lastName t, position t),(atBats t, hits t, era t))) $<br />
xpPair (xpTriple (xpAttr "GIVEN_NAME" xpText)<br />
(xpAttr "SURNAME" xpText)<br />
(xpAttr "POSITION" xpText))<br />
(xpTriple (xpOption (xpAttr "AT_BATS" xpickle))<br />
(xpOption (xpAttr "HITS" xpickle))<br />
(xpOption (xpAttr "ERA" xpPrim )))<br />
</haskell><br />
<br />
The ''Player'' pickler looks a bit clumsy, because of the six fields.<br />
A Player is mapped to an element ''PLAYER''.<br />
But because of the many components, we wrap a ''Player'' value<br />
in a pair of triples to use the predefined picklers ''xpPair'' and ''xpTriple''.<br />
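The 1-1 mapping that ''xpWrap'' needs can be checked independently of any XML machinery. The following sketch spells out the two conversion functions for the ''Player'' record from the example above; the ''Double'' type for the ''era'' field is an assumption, since the tutorial never shows the record declaration.<br />

```haskell
-- The Player record as used in the example; the Double type of
-- the era field is an assumption of this sketch.
data Player = Player
  { firstName :: String
  , lastName  :: String
  , position  :: String
  , atBats    :: Maybe Int
  , hits      :: Maybe Int
  , era       :: Maybe Double
  } deriving (Eq, Show)

-- The two functions handed to xpWrap: a 1-1 mapping between
-- Player and a pair of triples.
toPairOfTriples :: Player
                -> ((String, String, String), (Maybe Int, Maybe Int, Maybe Double))
toPairOfTriples t = ((firstName t, lastName t, position t), (atBats t, hits t, era t))

fromPairOfTriples :: ((String, String, String), (Maybe Int, Maybe Int, Maybe Double))
                  -> Player
fromPairOfTriples ((f, l, p), (a, h, e)) = Player f l p a h e
```

Composing the two functions in either order must be the identity, otherwise pickling followed by unpickling would not reproduce the original value.<br />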
<br />
When picklers for more than five components are needed in various places, it is straightforward to derive e.g. an ''xp10Tuple'' from the HXT sources of ''xpTriple'' and the others.<br />
<br />
New in this case is the use of ''xpOption'' for mapping Maybe values onto optional attributes.<br />
<br />
The other attributes occurring in the input are ignored during unpickling<br />
of the XML; this is the only place where the pickler is tolerant of<br />
''wrong'' XML.<br />
<br />
=== A simple application ===<br />
<br />
<haskell><br />
import Text.XML.HXT.Arrow<br />
<br />
-- ...<br />
<br />
main :: IO ()<br />
main<br />
= do<br />
runX ( xunpickleDocument xpSeason [ (a_validate,v_0)<br />
, (a_trace, v_1)<br />
, (a_remove_whitespace,v_1)<br />
, (a_preserve_comment, v_0)<br />
] "simple2.xml"<br />
>>><br />
processSeason<br />
>>><br />
xpickleDocument xpSeason [ (a_indent, v_1)<br />
] "new-simple2.xml"<br />
)<br />
return ()<br />
<br />
-- the dummy for processing the unpickled data<br />
<br />
processSeason :: IOSArrow Season Season<br />
processSeason<br />
= arrIO ( \ x -> do {print x ; return x})<br />
<br />
</haskell><br />
<br />
This application reads in the complete data used in [[HXT/Practical/Simple2]] from file ''simple2.xml''<br />
and unpickles it into a ''Season'' value.<br />
This value is processed (here: just printed) by ''processSeason''<br />
and pickled again into ''new-simple2.xml''.<br />
<br />
The unpickled value, when formatted a bit, looks like this:<br />
<br />
<haskell><br />
Season<br />
{ sYear = 1998<br />
, sLeagues = fromList<br />
[ ( "American League"<br />
, fromList<br />
[ ( "Central"<br />
, [ Team { teamName = "White Sox"<br />
, city = "Chicago"<br />
, players = []}<br />
, ...<br />
])<br />
, ( "East"<br />
, [ Team { teamName = "Orioles"<br />
, city = "Baltimore"<br />
, players = []}<br />
, ...<br />
])<br />
, ( "West"<br />
, [ Team { teamName = "Angels"<br />
, city = "Anaheim"<br />
, players = []}<br />
, ...<br />
])<br />
])<br />
, ( "National League"<br />
, fromList<br />
[ ( "Central"<br />
, [ Team { teamName = "Cubs"<br />
, city = "Chicago"<br />
, players = []}<br />
, ...<br />
])<br />
, ( "East"<br />
, [ Team { teamName = "Braves"<br />
, city = "Atlanta"<br />
, players =<br />
[ Player { firstName = "Marty"<br />
, lastName = "Malloy"<br />
, position = "Second Base"<br />
, atBats = Just 28<br />
, hits = Just 5<br />
, era = Nothing}<br />
, Player { firstName = "Ozzie"<br />
, lastName = "Guillen"<br />
, position = "Shortstop"<br />
, atBats = Just 264<br />
, hits = Just 73<br />
, era = Nothing}<br />
, ...<br />
]}<br />
, ...<br />
])<br />
, ( "West"<br />
, [ Team { teamName = "Diamondbacks"<br />
, city = "Arizona"<br />
, players = []}<br />
, ...<br />
])<br />
])<br />
]<br />
}<br />
</haskell><br />
<br />
== 2. Example: A toy programming language ==<br />
<br />
In this second example we will develop the picklers the other way round.<br />
We start with a given data model and derive an XML document structure.<br />
<br />
The complete source of this example is included in the HXT distribution.<br />
<br />
=== The abstract syntax for the programming language ===<br />
<br />
<haskell><br />
type Program = Stmt<br />
<br />
type StmtList = [Stmt]<br />
<br />
data Stmt<br />
= Assign Ident Expr<br />
| Stmts StmtList <br />
| If Expr Stmt (Maybe Stmt)<br />
| While Expr Stmt<br />
deriving (Eq, Show)<br />
<br />
type Ident = String<br />
<br />
data Expr<br />
= IntConst Int<br />
| BoolConst Bool<br />
| Var Ident<br />
| UnExpr UnOp Expr<br />
| BinExpr Op Expr Expr<br />
deriving (Eq, Show)<br />
<br />
data Op<br />
= Add | Sub | Mul | Div | Mod | Eq | Neq<br />
deriving (Eq, Ord, Enum, Show)<br />
<br />
data UnOp<br />
= UPlus | UMinus | Neg<br />
deriving (Eq, Ord, Read, Show)<br />
</haskell><br />
<br />
A program is a statement; four variants of statements are defined: assignments, sequences, branches and loops. The expressions come in five variants: integer and boolean constants, variables, unary and binary expressions.<br />
The operators are realized as enumeration types.<br />
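Because ''Op'' derives ''Enum'', the numeric codes that later appear in the serialized XML (e.g. <tt>op="6"</tt> for ''Neq'') come straight from ''fromEnum'', which numbers the constructors from 0 in declaration order. This can be checked in isolation:<br />

```haskell
-- Op exactly as declared in the abstract syntax above.
data Op
  = Add | Sub | Mul | Div | Mod | Eq | Neq
  deriving (Eq, Ord, Enum, Show)

-- fromEnum numbers constructors from 0 in declaration order;
-- this is the encoding the Op pickler writes into the XML.
opToCode :: Op -> Int
opToCode = fromEnum

codeToOp :: Int -> Op
codeToOp = toEnum
```

So ''Add'' maps to 0, ''Mod'' to 4 and ''Neq'' to 6, which matches the attribute values in the serialized program shown further below.<br />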
<br />
For developing the picklers there are two new aspects: this example contains sum data types, and the structure is recursive.<br />
<br />
=== The pickler definitions ===<br />
<br />
<haskell><br />
xpProgram :: PU Program<br />
xpProgram = xpElem "program" $<br />
xpAddFixedAttr "xmlns" "program42" $<br />
xpickle<br />
<br />
instance XmlPickler UnOp where<br />
xpickle = xpPrim<br />
<br />
instance XmlPickler Op where<br />
xpickle = xpWrap (toEnum, fromEnum) xpPrim<br />
<br />
instance XmlPickler Expr where<br />
xpickle = xpAlt tag ps<br />
where<br />
tag (IntConst _ ) = 0<br />
tag (BoolConst _ ) = 1<br />
tag (Var _ ) = 2<br />
tag (UnExpr _ _ ) = 3<br />
tag (BinExpr _ _ _ ) = 4<br />
ps = [ xpWrap ( IntConst<br />
, \ (IntConst i ) -> i ) ( xpElem "int" $<br />
xpAttr "value" $<br />
xpickle )<br />
<br />
, xpWrap ( BoolConst<br />
, \ (BoolConst b) -> b) ( xpElem "bool" $<br />
xpAttr "value" $<br />
xpWrap (toEnum, fromEnum) xpickle )<br />
<br />
, xpWrap ( Var<br />
, \ (Var n) -> n) ( xpElem "var" $<br />
xpAttr "name" $<br />
xpText )<br />
<br />
, xpWrap ( uncurry UnExpr<br />
, \ (UnExpr op e) -> (op, e))<br />
( xpElem "unex" $<br />
xpPair (xpAttr "op" xpickle) xpickle )<br />
<br />
, xpWrap ( uncurry3 $ BinExpr<br />
, \ (BinExpr op e1 e2) -> (op, e1, e2))<br />
( xpElem "binex" $<br />
xpTriple (xpAttr "op" xpickle) xpickle xpickle )<br />
]<br />
<br />
instance XmlPickler Stmt where<br />
xpickle = xpAlt tag ps<br />
where<br />
tag ( Assign _ _ ) = 0<br />
tag ( Stmts _ ) = 1<br />
tag ( If _ _ _ ) = 2<br />
tag ( While _ _ ) = 3<br />
ps = [ xpWrap ( uncurry Assign<br />
, \ (Assign n v) -> (n, v))<br />
( xpElem "assign" $<br />
xpPair (xpAttr "name" xpText) xpickle )<br />
, xpWrap ( Stmts<br />
, \ (Stmts sl) -> sl) ( xpElem "block" $<br />
xpList xpickle )<br />
, xpWrap ( uncurry3 If<br />
, \ (If c t e) -> (c, t, e))<br />
( xpElem "if" $<br />
xpTriple xpickle xpickle xpickle )<br />
, xpWrap ( uncurry While<br />
, \ (While c b) -> (c, b))<br />
( xpElem "while" $<br />
xpPair xpickle xpickle )<br />
]<br />
</haskell><br />
<br />
The root pickler is ''xpProgram'' which wraps the main statement in a ''program'' element.<br />
The program element is decorated with a fixed attribute, defining a name space declaration,<br />
just for demonstrating the use of ''xpAddFixedAttr''.<br />
<br />
For the operators two variants are shown: the ''UnOp'' values are converted with read/show (''xpPrim''),<br />
while ''Op'' is represented in XML by a number (''xpWrap (toEnum, fromEnum)'').<br />
<br />
The ''Expr'' and ''Stmt'' picklers are a bit more interesting. We have to select a pickler for every<br />
constructor of the data type. This is done by mapping each variant to a number and then indexing a list of picklers<br />
with this number. For all variants the values are converted with ''xpWrap'' into simple values or tuples,<br />
and then these values are mapped to XML elements. The simple fields are encoded as attributes, the complex<br />
(and recursive) ones as child elements.<br />
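The tag-and-index scheme of ''xpAlt'' can be illustrated without HXT. In the following sketch plain serializer functions stand in for the real picklers; the element and attribute names follow the ''Expr'' instance above, but the dispatch mechanism is the point.<br />

```haskell
-- A cut-down Expr and a model of xpAlt's selection scheme:
-- 'tag' maps a value to an index, which selects the matching
-- serializer from a list (serializers stand in for picklers).
data Expr = IntConst Int | BoolConst Bool | Var String
  deriving (Eq, Show)

tag :: Expr -> Int
tag (IntConst _)  = 0
tag (BoolConst _) = 1
tag (Var _)       = 2

-- one serializer per constructor, in tag order
serializers :: [Expr -> String]
serializers =
  [ \(IntConst i)  -> "<int value=\"" ++ show i ++ "\"/>"
  , \(BoolConst b) -> "<bool value=\"" ++ show (fromEnum b) ++ "\"/>"
  , \(Var n)       -> "<var name=\"" ++ n ++ "\"/>"
  ]

-- xpAlt-style dispatch: index the serializer list with the tag
serialize :: Expr -> String
serialize e = (serializers !! tag e) e
```

The crucial invariant, in HXT as in this toy, is that the ''tag'' function and the order of the pickler list stay in sync.<br />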
<br />
The complete pickler definitions consist of about 60 lines of code.<br />
<br />
=== A simple program as Haskell value ===<br />
<br />
<haskell><br />
p2 :: Program<br />
p2 = Stmts <br />
[ Assign x (IntConst 6)<br />
, Assign y (IntConst 7)<br />
, Assign p (IntConst 0)<br />
, While<br />
( BinExpr Neq (Var x) (IntConst 0) )<br />
( If ( BinExpr Neq ( BinExpr Mod (Var x) (IntConst 2) ) (IntConst 0) )<br />
( Stmts<br />
[ Assign x ( BinExpr Sub (Var x) (IntConst 1) )<br />
, Assign p ( BinExpr Add (Var p) (Var y) )<br />
]<br />
)<br />
( Just ( Stmts<br />
[ Assign x ( BinExpr Div (Var x) (IntConst 2) )<br />
, Assign y ( BinExpr Mul (Var y) (IntConst 2) )<br />
]<br />
)<br />
)<br />
)<br />
]<br />
where<br />
x = "x"<br />
y = "y"<br />
p = "p"<br />
</haskell><br />
<br />
An example program exercising nearly all variants of statements and expressions.<br />
<br />
=== The serialized program as XML ===<br />
<br />
<pre><br />
<program xmlns="program42"><br />
<block><br />
<assign name="x"><br />
<int value="6"/><br />
</assign><br />
<assign name="y"><br />
<int value="7"/><br />
</assign><br />
<assign name="p"><br />
<int value="0"/><br />
</assign><br />
<while><br />
<binex op="6"><br />
<var name="x"/><br />
<int value="0"/><br />
</binex><br />
<if><br />
<binex op="6"><br />
<binex op="4"><br />
<var name="x"/><br />
<int value="2"/><br />
</binex><br />
<int value="0"/><br />
</binex><br />
<block><br />
<assign name="x"><br />
<binex op="1"><br />
<var name="x"/><br />
<int value="1"/><br />
</binex><br />
</assign><br />
<assign name="p"><br />
<binex op="0"><br />
<var name="p"/><br />
<var name="y"/><br />
</binex><br />
</assign><br />
</block><br />
<block><br />
<assign name="x"><br />
<binex op="3"><br />
<var name="x"/><br />
<int value="2"/><br />
</binex><br />
</assign><br />
<assign name="y"><br />
<binex op="2"><br />
<var name="y"/><br />
<int value="2"/><br />
</binex><br />
</assign><br />
</block><br />
</if><br />
</while><br />
</block><br />
</program><br />
</pre><br />
<br />
This document is generated by executing the following piece of code<br />
<br />
<haskell><br />
storeProgram :: IO ()<br />
storeProgram<br />
= do<br />
runX ( constA p2<br />
>>><br />
xpickleDocument xpProgram<br />
[ (a_indent, v_1)<br />
] "pickle.xml"<br />
)<br />
return ()<br />
</haskell><br />
<br />
It's loaded from a file with<br />
<br />
<haskell><br />
loadProgram :: IO Program<br />
loadProgram<br />
= do<br />
[p2] <- runX ( xunpickleDocument xpProgram<br />
[ (a_remove_whitespace, v_1)<br />
, (a_validate, v_0)<br />
] "pickle.xml"<br />
)<br />
return p2<br />
</haskell><br />
<br />
The ''(a_remove_whitespace, v_1)'' option is necessary because<br />
the XML document is indented when written.<br />
<br />
<br />
== A few words of advice ==<br />
<br />
These picklers are a powerful tool for de-/serializing from/to XML.<br />
Only a few lines of code are needed for serializing as well as for<br />
deserializing.<br />
But they are absolutely intolerant when dealing with invalid XML.<br />
They are intended to read machine-generated XML, ideally generated by the same pickler.<br />
When unpickling hand-written XML, or XML generated by foreign tools, please validate the XML<br />
before reading, preferably with RelaxNG or XML Schema, because their type systems are more<br />
powerful than those of DTDs.<br />
<br />
When designing picklers, one must be careful to put enough markup<br />
into the XML structure to read the XML back without the need<br />
for a lookahead and without any ambiguities. The simplest case of a pickler that does not work is a pair of primitive picklers, e.g. for some text. In this case<br />
the two texts are written out and concatenated into a single string; when parsing the XML, there is only a single text node, and the pickler fails because of a missing value for the second component. So at least every primitive pickler must be combined with an ''xpElem'' or ''xpAttr''.<br />
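This failure mode can be seen without any XML machinery: pickling a pair of plain texts just concatenates them, and the boundary between the components cannot be recovered. The function names below are inventions of this sketch.<br />

```haskell
-- Two adjacent text picklers, modelled as a plain function:
-- pickling concatenates, so unpickling has no way to find the
-- boundary between the two components again.
pickleTextPair :: (String, String) -> String
pickleTextPair (a, b) = a ++ b

-- Every split of the concatenated text is an equally valid parse,
-- so the original pair is not recoverable:
possibleParses :: String -> [(String, String)]
possibleParses s = [ splitAt n s | n <- [0 .. length s] ]
```

A string of length n has n+1 possible splits, which is exactly the ambiguity that an intervening ''xpElem'' or ''xpAttr'' removes.<br />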
<br />
It's possible to define various picklers per data type,<br />
and picklers can be used one way, just for serializing into XML/HTML.<br />
So this approach can also be used to easily generate parts of an HTML document.<br />
Examples can be found in the Holumbus search engine project [http://holumbus.fh-wedel.de/] and the Haskell api search engine Hayoo! [http://holumbus.fh-wedel.de/hayoo/]. There the HTML code for the search results is generated with picklers.<br />
<br />
Please do not try to convert a whole large database into a single XML file<br />
with this approach. This will run into memory problems when reading the data,<br />
because of the DOM approach used in HXT. In the HXT distribution, there is<br />
a test case in the examples dir ''performance'', where pickling and unpickling are done<br />
with XML documents containing 2 million elements. This was the limit for a 1GB Intel box (tested with GHC 6.8).<br />
<br />
There are two strategies to overcome these limitations. The first is a SAX like<br />
approach, reading in simple tags and text elements and not building a tree structure,<br />
but writing the data instantly into a database.<br />
For this approach the Tagsoup package can be useful. The disadvantage is the programming<br />
effort for collecting and converting the data.<br />
<br />
The second and recommended way is to split the data into smaller pieces, pickle these separately, and<br />
link the resulting documents together by the use of ''href''s.<br />
<br />
== Reading/writing between XML and Haskell data types without XML picklers ==<br />
<br />
This is an example for reading and writing XML without the use of<br />
picklers. It was developed before the picklers were added to HXT.<br />
The code shows that it's much more effort to implement a conversion<br />
this way than with the technique described above.<br />
<br />
=== Serializing to Xml ===<br />
<br />
We can create an HXT tree from a single-layer Haskell data type as follows:<br />
<br />
<haskell><br />
import IO<br />
import Char<br />
import Data.Maybe (fromMaybe) -- needed by gshow'<br />
import Text.XML.HXT.Arrow<br />
import Data.Generics<br />
<br />
-- our data type we'll convert into XML<br />
data Config = <br />
Config { username :: String,<br />
logNumDays :: Int,<br />
oleDbString :: String }<br />
deriving (Show, Typeable,Data)<br />
<br />
-- helper function adapted from http://www.defmacro.org/ramblings/haskell-web.html<br />
-- (gshow replaced by gshow')<br />
introspectData :: Data a => a -> [(String, String)]<br />
introspectData a = zip fields (gmapQ gshow' a)<br />
where fields = constrFields $ toConstr a<br />
<br />
gshow' :: Data a => a -> String<br />
gshow' t = fromMaybe (showConstr(toConstr t)) (cast t)<br />
<br />
-- function to create xml string from single-layer Haskell data type<br />
xmlSerialize object = "<" ++ show(toConstr object) ++ ">" ++ <br />
foldr (\(a,b) x -> x ++ "<" ++ a ++ ">" ++ b ++ "</" ++ a ++ ">") "" ( introspectData object )<br />
++ "</" ++ show(toConstr object) ++ ">"<br />
<br />
-- function to create HXT tree arrow from single-layer Haskell data type:<br />
createHxtArrow object = runLA( constA ( xmlSerialize object ) >>> xread)<br />
<br />
-- create a config object to serialize:<br />
<br />
createConfig = Config { username = "test", logNumDays = 3, oleDbString = "qsdf" }<br />
<br />
-- test function, using our Config data type<br />
testConversion = createHxtArrow( createConfig ) ()<br />
</haskell><br />
<br />
-- hughperkins<br />
<br />
=== Deserializing from Xml ===<br />
<br />
Here's a solution to deserialize a simple Haskell data type containing Strings and Ints.<br />
<br />
It's not really pretty, but it works.<br />
<br />
Basically, we just convert the incoming xml into gread-compatible format, then use gread :-D<br />
<br />
Currently it works for a simple single-layer Haskell data type containing Ints and Strings. You can add new child data types by adding to the case statement in xmlToGShowFormat.<br />
<br />
If someone has a more elegant solution, please let me know ( hughperkins@gmail.com )<br />
<br />
<haskell><br />
module ParseXml<br />
where<br />
<br />
import IO<br />
import Char<br />
import List<br />
import Maybe<br />
import Data.Generics hiding (Unit)<br />
import Text.XML.HXT.Arrow hiding (when)<br />
<br />
data Config = Config{ name :: String, age :: Int } <br />
--data Config = Config{ age :: Int } <br />
deriving( Data, Show, Typeable, Ord, Eq, Read )<br />
<br />
createConfig = Config "qsdfqsdf" 3<br />
--createConfig = Config 3<br />
gshow' :: Data a => a -> String<br />
gshow' t = fromMaybe (showConstr(toConstr t)) (cast t)<br />
<br />
-- helper function from http://www.defmacro.org/ramblings/haskell-web.html<br />
introspectData :: Data a => a -> [(String, String)]<br />
introspectData a = zip fields (gmapQ gshow' a)<br />
where fields = constrFields $ toConstr a<br />
<br />
-- function to create xml string from single-layer Haskell data type<br />
xmlSerialize object = "<" ++ show(toConstr object) ++ ">" ++ <br />
foldr (\(a,b) x -> x ++ "<" ++ a ++ ">" ++ b ++ "</" ++ a ++ ">") "" ( introspectData object )<br />
++ "</" ++ show(toConstr object) ++ ">"<br />
<br />
-- parse xml to HXT tree, and obtain the value of node "fieldname"<br />
-- returns a string<br />
getValue xml fieldname | length(resultlist) > 0 = Just (head resultlist)<br />
| otherwise = Nothing<br />
where resultlist = (runLA ( constA xml >>> xread >>> deep ( hasName fieldname ) >>> getChildren >>> getText ))[]<br />
<br />
-- parse templateobject to get list of field names<br />
-- apply these to xml to get list of values<br />
-- return (fieldnames list, value list)<br />
xmlToGShowFormat :: Data a => String -> a -> String<br />
xmlToGShowFormat xml templateobject = <br />
go<br />
where mainconstructorname = (showConstr $ toConstr templateobject)<br />
fields = constrFields $ toConstr templateobject<br />
values = map ( \fieldname -> getValue xml fieldname ) fields<br />
datatypes = gmapQ (dataTypeOf) templateobject<br />
constrs = gmapQ (toConstr) templateobject<br />
datatypereps = gmapQ (dataTypeRep . dataTypeOf) templateobject<br />
fieldtogshowformat (value,datatyperep) = case datatyperep of<br />
IntRep -> "(" ++ fromJust value ++ ")"<br />
_ -> show(fromJust value)<br />
formattedfieldlist = map (fieldtogshowformat) (zip values datatypereps)<br />
go = "(" ++ mainconstructorname ++ " " ++ (concat $ intersperse " " formattedfieldlist ) ++ ")"<br />
<br />
xmlDeserialize xml templateobject = fst $ head $ gread( xmlToGShowFormat xml templateobject)<br />
<br />
dotest = xmlDeserialize (xmlSerialize createConfig) createConfig :: Config<br />
dotest' = xmlDeserialize ("<Config><age>12</age><name>test name!</name></Config>") createConfig :: Config<br />
</haskell><br />
<br />
<!-- Uwe Schmidt: This code moved from main HXT --></div>
<hr />
<div>[[Category:Tools]] [[Category:Tutorials]]<br />
<br />
== A gentle introduction to the Haskell XML Toolbox ==<br />
<br />
The [http://www.fh-wedel.de/~si/HXmlToolbox/index.html Haskell XML Toolbox (HXT)] is a collection of tools for processing XML with Haskell. The core component of the Haskell XML Toolbox is a domain-specific language consisting of a set of combinators for processing XML trees in a simple and elegant way. The combinator library is based on the concept of arrows. The main component is a validating and namespace-aware XML parser that almost fully supports the XML 1.0 standard. Extensions are a validator for RelaxNG and an XPath evaluator.<br />
<br />
__TOC__<br />
<br />
== Background ==<br />
<br />
The Haskell XML Toolbox is based on the ideas of [http://www.cs.york.ac.uk/fp/HaXml/ HaXml] and [http://www.flightlab.com/~joe/hxml/ HXML] but introduces a more general approach for processing XML with Haskell. HXT uses a generic data model for representing XML documents, including the DTD subset, entity references, CData parts and processing instructions. This data model makes it possible to use tree transformation functions as a uniform design for all XML processing steps: parsing, DTD processing, entity processing, validation, namespace propagation, content processing and output.<br />
<br />
== Resources ==<br />
<br />
; [http://www.fh-wedel.de/~si/HXmlToolbox/index.html HXT Home] :<br />
; [http://www.fh-wedel.de/~si/HXmlToolbox/HXT-7.0.tar.gz HXT-7.0.tar.gz] : latest release<br />
; [http://darcs.fh-wedel.de/hxt/ darcs.fh-wedel.de/hxt] : darcs repository with head revision of HXT<br />
; [http://darcs.fh-wedel.de/hxt/doc/hdoc_arrow/ Arrow API] : Haddock documentation of head revision with links to source files<br />
; [http://darcs.fh-wedel.de/hxt/doc/hdoc/ Complete API] : Haddock documentation with arrows and old API based on filters<br />
<br />
== The basic concepts ==<br />
<br />
=== The basic data structures ===<br />
<br />
Processing of XML is a task of processing tree structures. This can be done in Haskell in a very elegant way by defining an appropriate tree data type, a Haskell DOM (document object model) structure. The tree structure in HXT is a rose tree with a special XNode data type for storing the XML node information.<br />
<br />
The generally useful tree structure (NTree) is separated from the node type (XNode). This allows for reusing the tree structure and the tree traversal and manipulation functions in other applications.<br />
<br />
<haskell><br />
data NTree a = NTree a [NTree a] -- rose tree<br />
<br />
data XNode = XText String -- plain text node<br />
| ...<br />
| XTag QName XmlTrees -- element name and list of attributes<br />
| XAttr QName -- attribute name<br />
| ...<br />
<br />
type QName = ... -- qualified name<br />
<br />
type XmlTree = NTree XNode<br />
<br />
type XmlTrees = [XmlTree]<br />
</haskell><br />
<br />
=== The concept of filters ===<br />
<br />
Selecting, transforming and generating trees often requires routines, which compute not only a single result tree, but a (possibly empty) list of (sub-)trees. This leads to the idea of XML filters like in HaXml. Filters are functions, which take an XML tree as input and compute a list of result trees.<br />
<br />
<haskell><br />
type XmlFilter = XmlTree -> [XmlTree]<br />
</haskell><br />
<br />
More generally we can define a filter as<br />
<br />
<haskell><br />
type Filter a b = a -> [b]<br />
</haskell><br />
<br />
We will do this abstraction later, when introducing arrows. Many of the functions in the following motivating examples can be generalised this way. But for getting the idea, the <hask>XmlFilter</hask> is sufficient.<br />
<br />
The filter functions are used so frequently that the idea comes up of defining a domain specific language with filters as the basic processing units. In such a DSL the basic filters are predicates, selectors, constructors and transformers, all working on the HXT DOM tree structure. For a DSL it becomes necessary to define an appropriate set of combinators for building more complex functions from simpler ones. Of course filter composition, like (.), becomes one of the most frequently used combinators. There are more complex filters for traversal of a whole tree and selection or transformation of several nodes. We will see a few first examples in the following part.<br />
<br />
The first task is to build filters from pure functions, i.e. to define lift operators. Pure functions are lifted to filters in the following way:<br />
<br />
Predicates are lifted by mapping False to the empty list and True to the single element list, containing the input tree.<br />
<br />
<haskell><br />
p :: XmlTree -> Bool -- pure function<br />
p t = ...<br />
<br />
pf :: XmlTree -> [XmlTree] -- or XmlFilter<br />
pf t<br />
| p t = [t]<br />
| otherwise = []<br />
</haskell><br />
<br />
The combinator for this type of lifting is called <hask>isA</hask>, it works on any type and is defined as<br />
<br />
<haskell><br />
isA :: (a -> Bool) -> (a -> [a])<br />
isA p x<br />
| p x = [x]<br />
| otherwise = []<br />
</haskell><br />
<br />
A predicate for filtering text nodes looks like this<br />
<br />
<haskell><br />
isXText :: XmlFilter -- XmlTree -> [XmlTree]<br />
isXText t@(NTree (XText _) _) = [t]<br />
isXText _ = []<br />
</haskell><br />
<br />
Transformers -- functions that map a tree into another tree -- are lifted in a trivial way:<br />
<br />
<haskell><br />
f :: XmlTree -> XmlTree<br />
f t = exp(t)<br />
<br />
ff :: XmlTree -> [XmlTree]<br />
ff t = [exp(t)]<br />
</haskell><br />
<br />
This basic function is called <hask>arr</hask>, it comes from the Control.Arrow module of the basic library package of ghc.<br />
<br />
Partial functions, functions that can't always compute a result, are usually lifted to totally defined filters:<br />
<br />
<haskell><br />
f :: XmlTree -> XmlTree<br />
f t<br />
| p t = expr(t)<br />
| otherwise = error "f not defined"<br />
<br />
ff :: XmlFilter<br />
ff t<br />
| p t = [expr(t)]<br />
| otherwise = []<br />
</haskell><br />
<br />
This is a rather comfortable situation: with these filters we don't have to deal with illegal argument errors. Illegal arguments are just mapped to the empty list.<br />
<br />
When processing trees, it is often the case that no result, exactly one result, or more than one result is possible. Such functions, returning a set of results, are often, a bit imprecisely, called ''nondeterministic'' functions. These functions, e.g. selecting all children of a node or all grandchildren, are exactly our filters. In this context lists instead of sets of values are the appropriate result type, because the ordering in XML is important and duplicates are possible.<br />
<br />
Working with filters is rather similar to working with binary relations, and working with relations is rather natural and comfortable; database people know this very well.<br />
<br />
Two first examples of ''nondeterministic'' functions are selecting the children and the grandchildren of an XmlTree, which can be implemented by<br />
<br />
<haskell><br />
getChildren :: XmlFilter<br />
getChildren (NTree n cs)<br />
= cs<br />
<br />
getGrandChildren :: XmlFilter<br />
getGrandChildren (NTree n cs)<br />
= concat [ getChildren c | c <- cs ]<br />
</haskell><br />
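These two filters can be tried out without HXT on a toy rose tree. The following self-contained sketch uses ''Int'' labels instead of XNode values and adds a hypothetical ''label'' helper for inspecting results; everything else mirrors the definitions above.<br />

```haskell
-- a toy rose tree with Int labels standing in for XNode
data NTree a = NTree a [NTree a]
  deriving (Eq, Show)

getChildren :: NTree a -> [NTree a]
getChildren (NTree _ cs) = cs

getGrandChildren :: NTree a -> [NTree a]
getGrandChildren (NTree _ cs) = concat [ getChildren c | c <- cs ]

-- helper (not part of the tutorial) to inspect results
label :: NTree a -> a
label (NTree n _) = n

-- a small sample tree:      1
--                          / \
--                         2   3
--                        / \   \
--                       4   5   6
sample :: NTree Int
sample = NTree 1 [NTree 2 [NTree 4 [], NTree 5 []], NTree 3 [NTree 6 []]]
```

Applied to ''sample'', ''getChildren'' yields the nodes labelled 2 and 3, and ''getGrandChildren'' the nodes labelled 4, 5 and 6, in left-to-right order.<br />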
<br />
=== Filter combinators ===<br />
<br />
Composition of filters (like function composition) is the most important combinator. We will use the infix operator <hask>(>>>)</hask> for filter composition and reverse the arguments, so we can read composition sequences from left to right, like with pipes in Unix. Composition is defined as follows:<br />
<br />
<haskell><br />
(>>>) :: XmlFilter -> XmlFilter -> XmlFilter<br />
<br />
(f >>> g) t = concat [g t' | t' <- f t]<br />
</haskell><br />
<br />
This definition corresponds 1-1 to the composition of binary relations. With help of the <hask>(>>>)</hask> operator the definition of <hask>getGrandChildren</hask> becomes rather simple:<br />
<br />
<haskell><br />
getGrandChildren :: XmlFilter<br />
getGrandChildren = getChildren >>> getChildren<br />
</haskell><br />
<br />
Selecting all text nodes of the children of an element can also be formulated very easily with the help of <hask>(>>>)</hask><br />
<br />
<haskell><br />
getTextChildren :: XmlFilter<br />
getTextChildren = getChildren >>> isXText<br />
</haskell><br />
<br />
When used to combine predicate filters, the <hask>(>>>)</hask> serves as a logical "and" operator or, from the relational view, as an intersection operator: <hask>isA p1 >>> isA p2</hask> selects all values for which p1 and p2 both hold.<br />
<br />
The dual operator to <hask>(>>>)</hask> is the logical or, (thinking in sets: The union operator). For this we define a sum operator <hask>(<+>)</hask>. The sum of two filters is defined as follows:<br />
<br />
<haskell><br />
(<+>) :: XmlFilter -> XmlFilter -> XmlFilter<br />
<br />
(f <+> g) t = f t ++ g t<br />
</haskell><br />
<br />
Example: <hask>isA p1 <+> isA p2</hask> is the logical or for filters.<br />
<br />
Combining elementary filters with (>>>) and (<+>) leads to more complex functionality. For example, selecting all text nodes within two levels of depth (in left to right order) can be formulated with:<br />
<br />
<haskell><br />
getTextChildren2 :: XmlFilter<br />
getTextChildren2 = getChildren >>> ( isXText <+> ( getChildren >>> isXText ) )<br />
</haskell><br />
<br />
'''Exercise:''' Are these filters equivalent or what's the difference between the two filters?<br />
<br />
<haskell><br />
getChildren >>> ( isXText <+> ( getChildren >>> isXText ) )<br />
<br />
( getChildren >>> isXText ) <+> ( getChildren >>> getChildren >>> isXText )<br />
</haskell><br />
<br />
Of course we need choice combinators. The first idea is an if-then-else filter,<br />
built up from three simpler filters. But often it's easier and more elegant to work with simpler binary combinators for choice, so we will introduce the simpler ones first.<br />
<br />
One of these choice combinators is called <hask>orElse</hask> and is defined as<br />
follows:<br />
<br />
<haskell><br />
orElse :: XmlFilter -> XmlFilter -> XmlFilter<br />
orElse f g t<br />
| null res1 = g t<br />
| otherwise = res1<br />
where<br />
res1 = f t<br />
</haskell><br />
<br />
The meaning is the following: if f computes a non-empty list as its result, f succeeds and this list is the result; otherwise g is applied to the input and yields the result. There are two other simple choice combinators, usually written in infix notation: <hask>g `guards` f</hask> and <hask>f `when` g</hask>:<br />
<br />
<haskell><br />
guards :: XmlFilter -> XmlFilter -> XmlFilter<br />
guards g f t<br />
| null (g t) = []<br />
| otherwise = f t<br />
<br />
when :: XmlFilter -> XmlFilter -> XmlFilter<br />
when f g t<br />
| null (g t) = [t]<br />
| otherwise = f t<br />
</haskell><br />
<br />
These choice operators become useful when transforming and manipulating trees.<br />
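None of these combinators is tied to XmlTree. The following self-contained sketch restates them over the generalised filter type <hask>a -> [a]</hask> and exercises them on plain Ints; the ''double'' filter is an invention of the sketch.<br />

```haskell
-- the generalised filter type from the tutorial
type Filter a = a -> [a]

isA :: (a -> Bool) -> Filter a
isA p x | p x       = [x]
        | otherwise = []

(>>>) :: Filter a -> Filter a -> Filter a
(f >>> g) t = concat [ g t' | t' <- f t ]

orElse :: Filter a -> Filter a -> Filter a
orElse f g t | null res1 = g t
             | otherwise = res1
  where res1 = f t

guards :: Filter a -> Filter a -> Filter a
guards g f t | null (g t) = []
             | otherwise  = f t

when :: Filter a -> Filter a -> Filter a
when f g t | null (g t) = [t]
           | otherwise  = f t

-- a sample transformer filter for the demonstration
double :: Filter Int
double x = [2 * x]
```

So <hask>double `when` isA even</hask> doubles even numbers and passes odd ones through unchanged, while <hask>isA even `guards` double</hask> doubles even numbers and drops odd ones entirely.<br />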
<br />
=== Tree traversal filter ===<br />
<br />
A very basic operation on tree structures is the traversal of all nodes and the selection and/or transformation of nodes. These traversal filters serve as control structures for processing whole trees. They correspond to the map and fold combinators for lists.<br />
<br />
The simplest traversal filter does a top-down search for all nodes on which a given filter succeeds. This filter, called <hask>deep</hask>, is defined as follows:<br />
<br />
<haskell><br />
deep :: XmlFilter -> XmlFilter<br />
deep f = f `orElse` (getChildren >>> deep f)<br />
</haskell><br />
<br />
When a predicate filter is applied with <hask>deep</hask>, a top-down search is done and all subtrees satisfying the predicate are collected. The descent into a subtree stops as soon as the subtree itself satisfies the predicate, because of the use of <hask>orElse</hask>.<br />
<br />
'''Example:''' Selecting all plain text nodes of a document can be formulated with:<br />
<br />
<haskell><br />
deep isXText<br />
</haskell><br />
<br />
'''Example:''' Selecting all "top level" tables in an HTML document looks like<br />
this:<br />
<br />
<haskell><br />
deep (isElem >>> hasName "table")<br />
</haskell><br />
<br />
A variant of <hask>deep</hask>, called <hask>multi</hask>, performs a complete search, where the tree traversal does not stop when a node is found.<br />
<br />
<haskell><br />
multi :: XmlFilter -> XmlFilter<br />
multi f = f <+> (getChildren >>> multi f)<br />
</haskell><br />
<br />
'''Example:''' To select all tables in an HTML document, even nested ones, <hask>multi</hask> has to be used instead of <hask>deep</hask>:<br />
<br />
<haskell><br />
multi (isElem >>> hasName "table")<br />
</haskell><br />
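The different behaviour of <hask>deep</hask> and <hask>multi</hask> can be demonstrated outside HXT with a toy model: a plain rose tree plays the role of <hask>XmlTree</hask>, and the combinators used above get hand-rolled list-filter definitions (the names mirror the HXT ones, but these are local stand-ins, not HXT's code):<br />

```haskell
-- a plain rose tree as a stand-in for XmlTree
data Tree = Node String [Tree] deriving (Eq, Show)

type Filter = Tree -> [Tree]

-- local models of the combinators used by deep and multi
(>>>) :: Filter -> Filter -> Filter
f >>> g = concatMap g . f

(<+>) :: Filter -> Filter -> Filter
(f <+> g) t = f t ++ g t

getChildren :: Filter
getChildren (Node _ cs) = cs

hasName :: String -> Filter
hasName n t@(Node n' _) = if n == n' then [t] else []

orElse :: Filter -> Filter -> Filter
orElse f g t = if null (f t) then g t else f t

deep, multi :: Filter -> Filter
deep  f = f `orElse` (getChildren >>> deep f)
multi f = f <+> (getChildren >>> multi f)

-- a document with a table nested inside another table
innerTable, outerTable, doc :: Tree
innerTable = Node "table" []
outerTable = Node "table" [ Node "tr" [ innerTable ] ]
doc        = Node "html"  [ outerTable ]
```

Here <hask>deep (hasName "table") doc</hask> finds only the outer table, because the descent stops there, while <hask>multi (hasName "table") doc</hask> finds both tables.<br />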
<br />
=== Arrows ===<br />
<br />
We've already seen that the filters <hask>a -> [b]</hask> are a very<br />
powerful and often a more elegant way to process XML than pure<br />
functions. This is the good news. The bad news is that filters are not<br />
general enough. Of course we sometimes want to do some I/O while<br />
staying at the filter level. So we need something like<br />
<br />
<haskell><br />
type XmlIOFilter = XmlTree -> IO [XmlTree]<br />
</haskell><br />
<br />
for working in the IO monad.<br />
<br />
Sometimes it's appropriate to thread some state through the computation<br />
like in state monads. This leads to a type like<br />
<br />
<haskell><br />
type XmlStateFilter state = state -> XmlTree -> (state, [XmlTree])<br />
</haskell><br />
<br />
And in real world applications we need both extensions at the same<br />
time. Of course I/O is necessary, but usually there are also some<br />
global options and variables for controlling the computations. In HXT,<br />
for instance, there are variables for controlling trace output, options<br />
for setting the default encoding scheme for input data, and a base URI<br />
for accessing documents that are addressed by relative URIs in the content<br />
or in a DTD part. So we need something like<br />
<br />
<haskell><br />
type XmlIOStateFilter state = state -> XmlTree -> IO (state, [XmlTree])<br />
</haskell><br />
<br />
We want to work with all four filter variants, and in the future<br />
perhaps with even more general filters, but of course not with four<br />
sets of filter names, e.g. <hask>deep, deepST, deepIO, deepIOST</hask>.<br />
<br />
This is the point where <hask>newtype</hask>s and <hask>class</hask>es<br />
come in. Classes are needed for overloading names, and<br />
<hask>newtype</hask>s are needed to declare instances. Furthermore, the<br />
restriction to <hask>XmlTree</hask> as argument and result type is<br />
not necessary and hinders reuse in many cases.<br />
<br />
The filters discussed above have all the features of an arrow. Arrows were<br />
introduced to generalise the concept of functions and function<br />
combination to more general kinds of computation than pure functions.<br />
<br />
A basic set of combinators for arrows is defined in the classes in the<br />
<hask>Control.Arrow</hask> module, containing the above mentioned <hask>(>>>), (<+>), arr</hask>.<br />
<br />
In HXT the additional classes for filters working with lists as result type are<br />
defined in <hask>Control.Arrow.ArrowList</hask>. The choice operators are<br />
in <hask>Control.Arrow.ArrowIf</hask>, tree filters, like <hask>getChildren, deep, multi, ...</hask> in<br />
<hask>Control.Arrow.ArrowTree</hask> and the elementary XML specific<br />
filters in <hask>Text.XML.HXT.XmlArrow</hask>.<br />
<br />
In HXT there are four types instantiated with these classes for<br />
pure list arrows, list arrows with a state, list arrows with IO<br />
and list arrows with a state and IO.<br />
<br />
<haskell><br />
newtype LA a b = LA { runLA :: (a -> [b]) }<br />
<br />
newtype SLA s a b = SLA { runSLA :: (s -> a -> (s, [b])) }<br />
<br />
newtype IOLA a b = IOLA { runIOLA :: (a -> IO [b]) }<br />
<br />
newtype IOSLA s a b = IOSLA { runIOSLA :: (s -> a -> IO (s, [b])) }<br />
</haskell><br />
<br />
The first one and the last one are those used most frequently in the<br />
toolbox, and of course there are lifting functions for converting the<br />
special arrows into the more general ones.<br />
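The idea behind these lifting functions can be sketched directly from the newtype definitions. The following <hask>fromLA</hask> is only an illustration of the principle (the state is passed through unchanged and no real I/O is performed), not necessarily HXT's actual implementation:<br />

```haskell
-- the simplest and the most general of the four arrow types
newtype LA      a b = LA    { runLA    :: a -> [b] }
newtype IOSLA s a b = IOSLA { runIOSLA :: s -> a -> IO (s, [b]) }

-- lift a pure list arrow into the most general type:
-- thread the state through unchanged, do no real IO
fromLA :: LA a b -> IOSLA s a b
fromLA (LA f) = IOSLA (\s x -> return (s, f x))
```

Running the lifted arrow returns the state unchanged together with the results of the pure filter.<br />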
<br />
Don't worry about all these conceptual details. Let's have a look at some<br />
''Hello world'' examples.<br />
<br />
== Getting started: Hello world examples ==<br />
<br />
=== copyXML ===<br />
<br />
The first complete example is a program for<br />
copying an XML document<br />
<br />
<haskell><br />
module Main<br />
where<br />
<br />
import Text.XML.HXT.Arrow<br />
import System.Environment<br />
<br />
main :: IO ()<br />
main<br />
= do<br />
[src, dst] <- getArgs<br />
runX ( readDocument [(a_validate, v_0)] src<br />
>>><br />
writeDocument [] dst<br />
)<br />
return ()<br />
</haskell><br />
<br />
The interesting part of this example is<br />
the call of <hask>runX</hask>. <hask>runX</hask> executes an<br />
arrow. This arrow is one of the more powerful list arrows with IO and<br />
a HXT system state.<br />
<br />
The arrow itself is a composition of <hask>readDocument</hask> and<br />
<hask>writeDocument</hask>.<br />
<hask>readDocument</hask> is an arrow for reading, DTD processing and<br />
validation of documents. Its behaviour can be controlled by a list of<br />
options. Here we turn off the validation step. The <hask>src</hask>, a file<br />
name or a URI, is read and parsed, and a document tree is built. This<br />
tree is ''piped'' into the output arrow, which is also<br />
controlled by a set of options. Here all the defaults are used.<br />
<hask>writeDocument</hask> converts the tree into a string and writes<br />
it to the <hask>dst</hask>.<br />
<br />
We've omitted here the boring stuff of option parsing and error<br />
handling.<br />
<br />
Compilation and a test run looks like this:<br />
<br />
<pre><br />
hobel > ghc -o copyXml -package hxt CopyXML.hs<br />
hobel > cat hello.xml<br />
<hello>world</hello><br />
hobel > copyXml hello.xml -<br />
<?xml version="1.0" encoding="UTF-8"?><br />
<hello>world</hello><br />
hobel ><br />
</pre><br />
<br />
The mini XML document in file <tt>hello.xml</tt> is read and<br />
a document tree is built. Then this tree is converted into a string<br />
and written to standard output (filename: <tt>-</tt>). It is decorated<br />
with an XML declaration containing the version and the output<br />
encoding.<br />
<br />
For processing HTML documents there is an HTML parser, which tries to<br />
parse and interpret almost anything as HTML. The HTML parser can be<br />
selected by calling<br />
<br />
<hask>readDocument [(a_parse_html, v_1), ...]</hask><br />
<br />
with the appropriate option.<br />
<br />
=== Pattern for a main program ===<br />
<br />
A more realistic pattern for a simple Unix-filter-like program has<br />
the following structure:<br />
<br />
<haskell><br />
module Main<br />
where<br />
<br />
import Text.XML.HXT.Arrow<br />
<br />
import System.IO<br />
import System.Environment<br />
import System.Console.GetOpt<br />
import System.Exit<br />
<br />
main :: IO ()<br />
main<br />
= do<br />
argv <- getArgs<br />
(al, src, dst) <- cmdlineOpts argv<br />
[rc] <- runX (application al src dst)<br />
if rc >= c_err<br />
then exitWith (ExitFailure (-1))<br />
else exitWith ExitSuccess<br />
<br />
-- | the dummy for the boring stuff of option evaluation,<br />
-- usually done with 'System.Console.GetOpt'<br />
<br />
cmdlineOpts :: [String] -> IO (Attributes, String, String)<br />
cmdlineOpts argv<br />
= return ([(a_validate, v_0)], argv!!0, argv!!1)<br />
<br />
-- | the main arrow<br />
<br />
application :: Attributes -> String -> String -> IOSArrow b Int<br />
application al src dst<br />
= readDocument al src<br />
>>><br />
processChildren (processDocumentRootElement `when` isElem) -- (1)<br />
>>><br />
writeDocument al dst<br />
>>><br />
getErrStatus<br />
<br />
<br />
-- | the dummy for the real processing: the identity filter<br />
<br />
processDocumentRootElement :: IOSArrow XmlTree XmlTree<br />
processDocumentRootElement<br />
= this -- substitute this by the real application<br />
</haskell><br />
<br />
This program has the same functionality as our first example,<br />
but it separates the arrow from the boring option evaluation and<br />
return code computation.<br />
<br />
The interesting line is (1).<br />
<hask>readDocument</hask> generates a tree structure with a so-called extra<br />
root node. This root node is a node above the XML document root<br />
element. It is necessary<br />
because there may be other nodes on the same tree level as the XML<br />
root, for instance comments, processing instructions or whitespace.<br />
<br />
Furthermore the artificial root node serves for storing meta<br />
information about the document in the attribute list, like the<br />
document name, the encoding scheme, the HTTP transfer headers and<br />
other information.<br />
<br />
To process the real XML root element, we have to take the children of<br />
the root node, select the XML root element and process this, but<br />
leave all other children unchanged. This is done with<br />
<hask>processChildren</hask> and the <hask>when</hask> choice<br />
operator. <hask>processChildren</hask> applies a filter elementwise to<br />
all children of a node. The results of processing the list of children form<br />
the children of the result node.<br />
<br />
The structure of the internal document tree can be made visible,<br />
e.g. by adding the option pair <hask>(a_show_tree, v_1)</hask> to the<br />
<hask>writeDocument</hask> arrow. This will emit the tree in a readable<br />
text representation instead of the real document.<br />
<br />
In the next section we will give examples for the<br />
<hask>processDocumentRootElement</hask> arrow.<br />
<br />
== Selection examples ==<br />
<br />
=== Selecting text from an HTML document ===<br />
<br />
Selecting all the plain text of an XML/HTML document<br />
can be formulated with<br />
<br />
<haskell><br />
selectAllText :: ArrowXml a => a XmlTree XmlTree<br />
selectAllText<br />
= deep isXText<br />
</haskell><br />
<br />
<hask>deep</hask> traverses the whole tree, stops the traversal when<br />
a node is a text node (<hask>isXText</hask>) and returns all the text nodes.<br />
There are two other traversal operators, <hask>deepest</hask> and <hask>multi</hask>.<br />
In this case, where the selected nodes are all leaves, they would give the same result.<br />
<br />
=== Selecting text and ALT attribute values ===<br />
<br />
Let's take on a slightly more complex task: we want to select all text, but also the values of the <tt>alt</tt> attributes<br />
of image tags.<br />
<br />
<haskell><br />
selectAllTextAndAltValues :: ArrowXml a => a XmlTree XmlTree<br />
selectAllTextAndAltValues<br />
= deep<br />
( isXText -- (1)<br />
<+><br />
( isElem >>> hasName "img" -- (2)<br />
>>><br />
getAttrValue "alt" -- (3)<br />
>>><br />
mkText -- (4)<br />
)<br />
)<br />
</haskell><br />
<br />
The whole tree is searched for text nodes (1) and for image elements (2); from the image elements<br />
the alt attribute values are selected as plain text (3), and this text is transformed into a text node (4).<br />
<br />
=== Selecting text and ALT attribute values (2) ===<br />
<br />
Let's refine the above filter one step further. The text from the alt attributes shall be marked in the output<br />
by surrounding double square brackets. Empty alt values shall be ignored.<br />
<br />
<haskell><br />
selectAllTextAndRealAltValues :: ArrowXml a => a XmlTree XmlTree<br />
selectAllTextAndRealAltValues<br />
= deep<br />
( isXText<br />
<+><br />
( isElem >>> hasName "img"<br />
>>><br />
getAttrValue "alt"<br />
>>><br />
isA significant -- (1)<br />
>>><br />
arr addBrackets -- (2)<br />
>>><br />
mkText<br />
)<br />
)<br />
where<br />
significant :: String -> Bool<br />
significant = not . all (`elem` " \n\r\t")<br />
<br />
addBrackets :: String -> String<br />
addBrackets s<br />
= " [[ " ++ s ++ " ]] "<br />
</haskell><br />
<br />
This example shows two combinators for building arrows from pure functions.<br />
The first one, <hask>isA</hask>, removes all empty or whitespace-only values from alt attributes (1);<br />
the other, <hask>arr</hask>, lifts the editing function to the arrow level (2).<br />
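Both combinators can be modelled as plain list filters to see their behaviour in isolation. These definitions are stand-ins for HXT's class methods, combined with the two helper functions from the example:<br />

```haskell
-- list filters with differing input and output types
type Filter a b = a -> [b]

-- isA turns a predicate into a guard: pass the input through or fail
isA :: (a -> Bool) -> Filter a a
isA p x = if p x then [x] else []

-- arr lifts a pure function: exactly one result per input
arr :: (a -> b) -> Filter a b
arr f x = [f x]

(>>>) :: Filter a b -> Filter b c -> Filter a c
f >>> g = concatMap g . f

-- the two helper functions from the example above
significant :: String -> Bool
significant = not . all (`elem` " \n\r\t")

addBrackets :: String -> String
addBrackets s = " [[ " ++ s ++ " ]] "
```

Applied to <hask>"logo"</hask>, the composition <hask>isA significant >>> arr addBrackets</hask> yields <hask>[" [[ logo ]] "]</hask>; applied to a whitespace-only or empty string, it yields <hask>[]</hask>, so such values are dropped.<br />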
<br />
== Document construction examples ==<br />
<br />
=== The ''Hello World'' document ===<br />
<br />
The first document, of course, is a ''Hello World'' document:<br />
<br />
<haskell><br />
helloWorld :: ArrowXml a => a XmlTree XmlTree<br />
helloWorld<br />
= mkelem "html" [] -- (1)<br />
[ mkelem "head" []<br />
[ mkelem "title" []<br />
[ txt "Hello World" ] -- (2)<br />
]<br />
, mkelem "body"<br />
[ sattr "class" "haskell" ] -- (3)<br />
[ mkelem "h1" []<br />
[ txt "Hello World" ] -- (4)<br />
]<br />
]<br />
</haskell><br />
<br />
The main arrows for document construction are <hask>mkelem</hask><br />
and its variants (<hask>selem, aelem, eelem</hask>) for element creation, <hask>attr</hask> and <hask>sattr</hask> for attributes, and <hask>mkText</hask><br />
and <hask>txt</hask> for text nodes. <hask>mkelem</hask> takes three arguments: the element name (or tag name), a list of arrows for the construction of attributes, non-empty in (3), and a list of arrows for the contents. Text content is generated in (2) and (4).<br />
<br />
To write this document to a file, use the following arrow:<br />
<br />
<haskell><br />
root [] [helloWorld] -- (1)<br />
>>><br />
writeDocument [(a_indent, v_1)] "hello.xml" -- (2)<br />
</haskell><br />
<br />
When this arrow is executed, the <hask>helloWorld</hask><br />
document is wrapped into a so-called root node (1). This complete<br />
document is written to "hello.xml" (2).<br />
<hask>writeDocument</hask> and its variants always expect<br />
a whole document tree with such a root node. Before writing, the document is<br />
indented (<hask>(a_indent, v_1)</hask>) by inserting extra whitespace<br />
text nodes, and an XML declaration with version and encoding is added. If the indent option is not given, the whole document appears on a single line:<br />
<br />
<pre><br />
<?xml version="1.0" encoding="UTF-8"?><br />
<html><br />
<head><br />
<title>Hello World</title><br />
</head><br />
<body class="haskell"><br />
<h1>Hello World</h1><br />
</body><br />
</html><br />
</pre><br />
<br />
The code can be shortened a bit by using some of the<br />
convenience functions:<br />
<br />
<haskell><br />
helloWorld2 :: ArrowXml a => a XmlTree XmlTree<br />
helloWorld2<br />
= selem "html"<br />
[ selem "head"<br />
[ selem "title"<br />
[ txt "Hello World" ]<br />
]<br />
, mkelem "body"<br />
[ sattr "class" "haskell" ]<br />
[ selem "h1"<br />
[ txt "Hello World" ]<br />
]<br />
]<br />
</haskell><br />
<br />
In the above two examples the arrow input is totally ignored, because<br />
of the use of the constant arrow <hask>txt "..."</hask>.<br />
<br />
=== A page about all images within an HTML page ===<br />
<br />
A bit more interesting task is the construction of a page<br />
containing a table of all images within a page, including image URLs, geometry and ALT attributes.<br />
<br />
The program for this has a frame similar to the <hask>helloWorld</hask> program,<br />
but the rows of the table must be filled in from the input document.<br />
In the first step we will generate a table with a single column containing<br />
the URL of the image.<br />
<br />
<haskell><br />
imageTable :: ArrowXml a => a XmlTree XmlTree<br />
imageTable<br />
= selem "html"<br />
[ selem "head"<br />
[ selem "title"<br />
[ txt "Images in Page" ]<br />
]<br />
, selem "body"<br />
[ selem "h1"<br />
[ txt "Images in Page" ]<br />
, selem "table"<br />
[ collectImages -- (1)<br />
>>><br />
genTableRows -- (2)<br />
]<br />
]<br />
]<br />
where<br />
collectImages -- (1)<br />
= deep ( isElem<br />
>>><br />
hasName "img"<br />
)<br />
genTableRows -- (2)<br />
= selem "tr"<br />
[ selem "td"<br />
[ getAttrValue "src" >>> mkText ]<br />
]<br />
</haskell><br />
<br />
With (1) the image elements are collected, and with (2)<br />
a table row is generated for each image element.<br />
<br />
Applied to <tt>http://www.haskell.org/</tt> we get the following result<br />
(at the time of writing this page):<br />
<br />
<pre><br />
<html><br />
<head><br />
<title>Images in Page</title><br />
</head><br />
<body><br />
<h1>Images in Page</h1><br />
<table><br />
<tr><br />
<td>/haskellwiki_logo.png</td><br />
</tr><br />
<tr><br />
<td>/sitewiki/images/1/10/Haskelllogo-small.jpg</td><br />
</tr><br />
<tr><br />
<td>/haskellwiki_logo_small.png</td><br />
</tr><br />
</table><br />
</body><br />
</html><br />
</pre><br />
<br />
When generating HTML, there are often constant parts within the page,<br />
e.g. the page header in the example. It's possible to write these<br />
parts as a string containing plain HTML and then read this with<br />
a simple XML content parser called <hask>xread</hask>.<br />
<br />
The example above could then be rewritten as<br />
<br />
<haskell><br />
imageTable<br />
= selem "html"<br />
[ pageHeader<br />
, ...<br />
]<br />
where<br />
pageHeader<br />
= constA "<head><title>Images in Page</title></head>"<br />
>>><br />
xread<br />
...<br />
</haskell><br />
<br />
<hask>xread</hask> is a very primitive arrow. It does not run in the<br />
IO monad, so it can be used in any context, but consequently its error handling<br />
is very limited. <hask>xread</hask> parses XML element content.<br />
<br />
=== A page about all images within an HTML page: 1. Refinement ===<br />
<br />
The next refinement step is the extension of the table such that<br />
it contains four columns: one for the image itself, one for the URL,<br />
one for the geometry and one for the ALT text. The extended <hask>genTableRows</hask><br />
has the following form:<br />
<br />
<haskell><br />
genTableRows<br />
= selem "tr"<br />
[ selem "td" -- (1)<br />
[ this -- (1.1)<br />
]<br />
, selem "td" -- (2)<br />
[ getAttrValue "src"<br />
>>><br />
mkText<br />
>>><br />
mkelem "a" -- (2.1)<br />
[ attr "href" this ]<br />
[ this ]<br />
]<br />
, selem "td" -- (3)<br />
[ ( getAttrValue "width"<br />
&&& -- (3.1)<br />
getAttrValue "height"<br />
)<br />
>>><br />
arr2 geometry -- (3.2)<br />
>>><br />
mkText<br />
]<br />
, selem "td" -- (4)<br />
[ getAttrValue "alt"<br />
>>><br />
mkText<br />
]<br />
]<br />
where<br />
geometry :: String -> String -> String<br />
geometry "" ""<br />
= ""<br />
geometry w h<br />
= w ++ "x" ++ h<br />
</haskell><br />
<br />
In (1) the identity arrow <hask>this</hask> is used for<br />
inserting the whole image element (the <hask>this</hask> value) into the first column.<br />
(2) is the column from the previous example, but the URL has been made active<br />
by embedding it in an A-element (2.1). In (3) there are two<br />
new combinators: <hask>(&&&)</hask> (3.1) is an arrow combinator for applying two<br />
arrows to the same input and combining the results into a pair. <hask>arr2</hask><br />
works like <hask>arr</hask>, but it lifts a binary function into an arrow<br />
accepting a pair of values. <hask>arr2 f</hask> is a shortcut for<br />
<hask>arr (uncurry f)</hask>. So width and height are combined into an X11-like<br />
geometry spec. (4) adds the ALT text.<br />
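A list-filter model makes the behaviour of <hask>(&&&)</hask> and <hask>arr2</hask> easy to check. These are stand-in definitions mirroring the HXT names, not HXT's code, together with the <hask>geometry</hask> function from the example:<br />

```haskell
type Filter a b = a -> [b]

-- apply two filters to the same input and pair all results
(&&&) :: Filter a b -> Filter a c -> Filter a (b, c)
(f &&& g) x = [ (y, z) | y <- f x, z <- g x ]

(>>>) :: Filter a b -> Filter b c -> Filter a c
f >>> g = concatMap g . f

arr :: (a -> b) -> Filter a b
arr f x = [f x]

-- arr2 lifts a binary function: arr2 f = arr (uncurry f)
arr2 :: (b -> c -> d) -> Filter (b, c) d
arr2 = arr . uncurry

-- the geometry function from the example
geometry :: String -> String -> String
geometry "" "" = ""
geometry w  h  = w ++ "x" ++ h
```

For instance, <hask>(const ["400"] &&& const ["300"]) >>> arr2 geometry</hask> yields <hask>["400x300"]</hask> for any input.<br />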
<br />
=== A page about all images within an HTML page: 2. Refinement ===<br />
<br />
The generated HTML page is not yet very useful, because it usually<br />
contains relative HREFs to the images, so the links do not work.<br />
We have to transform the SRC attribute values into absolute URLs.<br />
This can be done with the following code:<br />
<br />
<haskell><br />
imageTable2 :: IOStateArrow s XmlTree XmlTree<br />
imageTable2<br />
= ...<br />
...<br />
, selem "table"<br />
[ collectImages<br />
>>><br />
mkAbsImageRef -- (1)<br />
>>><br />
genTableRows<br />
]<br />
...<br />
<br />
mkAbsImageRef :: IOStateArrow s XmlTree XmlTree -- (1)<br />
mkAbsImageRef<br />
= processAttrl ( mkAbsRef -- (2)<br />
`when`<br />
hasName "src" -- (3)<br />
)<br />
where<br />
mkAbsRef -- (4)<br />
= replaceChildren<br />
( xshow getChildren -- (5)<br />
>>><br />
( mkAbsURI `orElse` this ) -- (6)<br />
>>><br />
mkText -- (7)<br />
)<br />
</haskell><br />
<br />
The <hask>imageTable2</hask> is extended by an arrow <hask>mkAbsImageRef</hask><br />
(1). This arrow uses the global system state of HXT, in which the base URL<br />
of a document is stored. For editing the SRC attribute value, the attribute list<br />
of the image elements is processed with <hask>processAttrl</hask>.<br />
Because of the <hask>`when` hasName "src"</hask>, only SRC attributes are manipulated (3). The real work is done in (4): the URL, a text node, is selected with <hask>getChildren</hask> and converted into a string with <hask>xshow</hask> (5); then the URL is transformed into an absolute URL<br />
with <hask>mkAbsURI</hask> (6). This arrow may fail, e.g. in the case of an illegal<br />
URL, in which case the URL remains unchanged (<hask>`orElse` this</hask>).<br />
The resulting String value is converted into a text node forming the new<br />
attribute value node (7).<br />
<br />
Because of the use of the global HXT state in <hask>mkAbsURI</hask>,<br />
<hask>mkAbsRef</hask> and <hask>imageTable2</hask> need the more specialized signature <hask>IOStateArrow s XmlTree XmlTree</hask>.<br />
<br />
== Transformation examples ==<br />
<br />
=== Decorating external references of an HTML document ===<br />
<br />
In the following example, we want to decorate the external references<br />
in an HTML page with a small icon, as is done in many wikis.<br />
For this task the document tree has to be traversed; all parts<br />
except the interesting A-elements remain unchanged. At the end of the list of children of an A-element we add an image element.<br />
<br />
Here is the first version:<br />
<br />
<haskell><br />
addRefIcon :: ArrowXml a => a XmlTree XmlTree<br />
addRefIcon<br />
= processTopDown -- (1)<br />
( addImg -- (2)<br />
`when`<br />
isExternalRef -- (3)<br />
)<br />
where<br />
isExternalRef -- (4)<br />
= isElem<br />
>>><br />
hasName "a"<br />
>>><br />
hasAttr "href"<br />
>>><br />
getAttrValue "href"<br />
>>><br />
isA isExtRef<br />
where<br />
isExtRef -- (4.1)<br />
= isPrefixOf "http:" -- or something more precise<br />
<br />
addImg<br />
= replaceChildren -- (5)<br />
( getChildren -- (6)<br />
<+><br />
imgElement -- (7)<br />
)<br />
<br />
imgElement<br />
= mkelem "img" -- (8)<br />
[ sattr "src" "/icons/ref.png" -- (9)<br />
, sattr "alt" "external ref"<br />
] [] -- (10)<br />
</haskell><br />
<br />
The traversal is done with <hask>processTopDown</hask> (1).<br />
This arrow applies an arrow to all nodes of the whole document tree.<br />
The transformation arrow applies the <hask>addImg</hask> (2) to<br />
all A-elements (3), (4). This arrow uses a slightly simplified test (4.1)<br />
for external URLs.<br />
<hask>addImg</hask> manipulates all children (5) of the A-elements by<br />
selecting the current children (6) and adding an image element (7).<br />
The image element is constructed with <hask>mkelem</hask> (8). This takes<br />
an element name, a list of arrows for computing the attributes and a<br />
list of arrows for computing the contents. The content of the image element is<br />
empty (10). The attributes are constructed with <hask>sattr</hask> (9).<br />
<hask>sattr</hask> ignores the arrow input and builds an attribute from<br />
the name-value pair of arguments.<br />
<br />
=== Transform external references into absolute references ===<br />
<br />
In the following example we will develop a program for<br />
editing an HTML page such that all references to external documents<br />
(images, hypertext refs, style refs, ...) become absolute references.<br />
We will see some new, but very useful combinators in the solution.<br />
<br />
The task seems to be rather trivial: in a tree traversal<br />
all references are edited with respect to the document base.<br />
But in HTML there is a BASE element, allowed in the content of HEAD,<br />
with an HREF attribute, which defines the document base. Again this<br />
HREF can be a relative URL.<br />
<br />
We start the development with the editing arrow. This gets<br />
the real document base as argument.<br />
<br />
<haskell><br />
mkAbsHRefs :: ArrowXml a => String -> a XmlTree XmlTree<br />
mkAbsHRefs base<br />
= processTopDown editHRef -- (1)<br />
where<br />
editHRef<br />
= processAttrl -- (3)<br />
( changeAttrValue (absHRef base) -- (5)<br />
`when`<br />
hasName "href" -- (4)<br />
)<br />
`when`<br />
( isElem >>> hasName "a" ) -- (2)<br />
where<br />
<br />
absHRef :: String -> String -> String -- (5)<br />
absHRef base url<br />
= fromMaybe url . expandURIString url $ base<br />
</haskell><br />
<br />
The tree is traversed (1) and for every A element (2) the attribute<br />
list is processed (3). All HREF attribute values (4) are manipulated<br />
by <hask>changeAttrValue</hask>, called with a string function (5).<br />
<hask>expandURIString</hask> is a pure function defined in HXT for computing<br />
an absolute URI.<br />
In this first step we only edit A-HREF attribute values. We will refine this<br />
later.<br />
<br />
The second step is the complete computation of the base URL.<br />
<br />
<haskell><br />
computeBaseRef :: IOStateArrow s XmlTree String<br />
computeBaseRef<br />
= ( ( ( isElem >>> hasName "html" -- (0)<br />
>>><br />
getChildren -- (1)<br />
>>><br />
isElem >>> hasName "head" -- (2)<br />
>>><br />
getChildren -- (3)<br />
>>><br />
isElem >>> hasName "base" -- (4)<br />
>>><br />
getAttrValue "href" -- (5)<br />
)<br />
&&&<br />
getBaseURI -- (6)<br />
)<br />
>>> expandURI -- (7)<br />
)<br />
`orElse` getBaseURI -- (8)<br />
</haskell><br />
<br />
Input to this arrow is the HTML element; (0) to (5) is the arrow for selecting<br />
the BASE element's HREF value. In parallel to this, the system base URL is read<br />
with <hask>getBaseURI</hask> (6), as in the examples above. The resulting<br />
pair of strings is piped into <hask>expandURI</hask> (7), the arrow version of<br />
<hask>expandURIString</hask>. This arrow ((1) to (7)) fails in the absence<br />
of a BASE element. In this case we take the plain document base (8).<br />
The selection of the BASE elements is not yet very handy. We will define<br />
a more general and elegant function later, allowing an element path as selection argument.<br />
<br />
In the third step, we will combine the two arrows. For this we will use<br />
a new combinator, <hask>($<)</hask>. The need for this new combinator<br />
is the following: we need the arrow input (the document) twice,<br />
once for computing the document base, and a second time for editing the<br />
whole document, and we want to compute the extra string parameter<br />
for editing with the arrow defined above.<br />
<br />
The combined arrow, our main arrow, looks like this<br />
<br />
<haskell><br />
toAbsRefs :: IOStateArrow s XmlTree XmlTree<br />
toAbsRefs<br />
= mkAbsHRefs $< computeBaseRef -- (1)<br />
</haskell><br />
<br />
In (1) the arrow input is first piped into <hask>computeBaseRef</hask>;<br />
this result is used in <hask>mkAbsHRefs</hask> as the extra string parameter<br />
when processing the document. Internally the <hask>($<)</hask> combinator<br />
is defined by the basic combinators <hask>(&&&), (>>>)</hask> and <hask>app</hask>. In more complex computations<br />
this pattern occurs rather frequently, so <hask>($<)</hask> becomes very useful.<br />
<br />
Programming with arrows is one style of point-free programming. Point-free<br />
programming often becomes unwieldy when values are used more than once.<br />
One solution is the special arrow syntax supported by GHC and others, similar to the do notation for monads. But for many simple cases the <hask>($<)</hask> combinator and its variants <hask>($<<), ($<<<), ($<<<<), ($<$)</hask><br />
are sufficient.<br />
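For plain functions the behaviour of <hask>($<)</hask> can be written down directly. This is only a model of the combinator (HXT's version works on its own arrow types); the helper <hask>tagWithLength</hask> is made up for the demonstration:<br />

```haskell
-- a model of ($<) for plain functions: run g on the input first,
-- then use its result as an extra parameter for f on the same input
($<) :: (c -> a -> b) -> (a -> c) -> (a -> b)
f $< g = \x -> f (g x) x

-- a parameterized editing function, standing in for mkAbsHRefs:
-- tag a string with a number computed from the string itself
tagWithLength :: Int -> String -> String
tagWithLength n s = show n ++ ":" ++ s
```

Here <hask>(tagWithLength $< length) "abc"</hask> yields <hask>"3:abc"</hask>: the input is consumed once to compute the extra parameter and once more by the editing function.<br />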
<br />
To complete the development of the example, a last step is necessary:<br />
The removal of the redundant BASE element.<br />
<br />
<haskell><br />
toAbsRefs :: IOStateArrow s XmlTree XmlTree<br />
toAbsRefs<br />
= ( mkAbsHRefs $< computeBaseRef )<br />
>>><br />
removeBaseElement<br />
<br />
removeBaseElement :: ArrowXml a => a XmlTree XmlTree<br />
removeBaseElement<br />
= processChildren<br />
( processChildren<br />
( none -- (1)<br />
`when`<br />
( isElem >>> hasName "base" )<br />
)<br />
`when`<br />
( isElem >>> hasName "head" )<br />
)<br />
</haskell><br />
<br />
In this function the children of the HEAD element are searched for<br />
a BASE element. This is removed by applying the null arrow <hask>none</hask><br />
to the input, which always returns the empty list.<br />
<hask>none `when` ...</hask> is the pattern for deleting nodes from a tree.<br />
<br />
The <hask>computeBaseRef</hask> function defined above contains an arrow pattern<br />
for selecting the right subtree, which is rather common in HXT applications:<br />
<haskell><br />
isElem >>> hasName n1<br />
>>><br />
getChildren<br />
>>><br />
isElem >>> hasName n2<br />
...<br />
>>><br />
getChildren<br />
>>><br />
isElem >>> hasName nm<br />
</haskell><br />
<br />
For this pattern we will define a convenient function creating the<br />
arrow for selection:<br />
<br />
<haskell><br />
getDescendents :: ArrowXml a => [String] -> a XmlTree XmlTree<br />
getDescendents<br />
= foldl1 (\ x y -> x >>> getChildren >>> y) -- (1)<br />
.<br />
map (\ n -> isElem >>> hasName n) -- (2)<br />
</haskell><br />
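The effect of <hask>getDescendents</hask> can be checked on the same kind of toy model as before: a plain rose tree with hand-rolled filter combinators standing in for the HXT ones (the <hask>isElem</hask> test is dropped, since all toy nodes are elements):<br />

```haskell
-- a plain rose tree as a stand-in for XmlTree
data Tree = Node String [Tree] deriving (Eq, Show)

type Filter = Tree -> [Tree]

(>>>) :: Filter -> Filter -> Filter
f >>> g = concatMap g . f

getChildren :: Filter
getChildren (Node _ cs) = cs

hasName :: String -> Filter
hasName n t@(Node n' _) = if n == n' then [t] else []

-- the pattern from above, folded over a path of element names
getDescendents :: [String] -> Filter
getDescendents
    = foldl1 (\ x y -> x >>> getChildren >>> y)
      .
      map hasName

doc :: Tree
doc = Node "html" [ Node "head" [ Node "base" [] ]
                  , Node "body" []
                  ]
```

For example, <hask>getDescendents ["html","head","base"] doc</hask> selects exactly the BASE node, descending one level per name in the path.<br />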
<br />
The name list is mapped to element-checking arrows (2), and<br />
the resulting list of arrows is folded with <hask>getChildren</hask><br />
into a single arrow (1). <hask>computeBaseRef</hask> can then be simplified<br />
and becomes more readable:<br />
<br />
<haskell><br />
computeBaseRef :: IOStateArrow s XmlTree String<br />
computeBaseRef<br />
= ( ( ( getDescendents ["html","head","base"] -- (1)<br />
>>><br />
getAttrValue "href" -- (2)<br />
)<br />
...<br />
...<br />
</haskell><br />
<br />
An even more general and flexible technique is the use of XPath expressions<br />
for selecting document parts, defined in the module<br />
<hask>Text.XML.HXT.Arrow.XmlNodeSet</hask>.<br />
<br />
With XPath <hask>computeBaseRef</hask> can be simplified to<br />
<br />
<haskell><br />
computeBaseRef<br />
= ( ( ( getXPathTrees "/html/head/base" -- (1)<br />
>>><br />
getAttrValue "href" -- (2)<br />
)<br />
...<br />
</haskell><br />
<br />
Even the attribute selection can be expressed by XPath,<br />
so (1) and (2) can be combined into<br />
<br />
<haskell><br />
computeBaseRef<br />
= ( ( xshow (getXPathTrees "/html/head/base@href")<br />
...<br />
</haskell><br />
<br />
The extra <hask>xshow</hask> is required here to convert the<br />
XPath result, an XmlTree, into a string.<br />
<br />
XPath defines a<br />
full language for selecting parts of an XML document.<br />
Sometimes it's rather convenient to make selections of this<br />
kind, but XPath evaluation is in general more expensive<br />
in time and space than a simple combination of arrows, as we've<br />
seen in <hask>getDescendents</hask>.<br />
<br />
=== Transform external references into absolute references: Refinement ===<br />
<br />
In the above example only A-HREF URLs are edited. Now we extend this<br />
to other element-attribute combinations.<br />
<br />
<haskell><br />
mkAbsRefs :: ArrowXml a => String -> a XmlTree XmlTree<br />
mkAbsRefs base<br />
    = processTopDown ( editRef "a"      "href"      -- (2)<br />
                       >>><br />
                       editRef "img"    "src"       -- (3)<br />
                       >>><br />
                       editRef "link"   "href"      -- (4)<br />
                       >>><br />
                       editRef "script" "src"       -- (5)<br />
                     )<br />
    where<br />
    editRef en an                                   -- (1)<br />
        = processAttrl ( changeAttrValue (absHRef base)<br />
                         `when`<br />
                         hasName an<br />
                       )<br />
          `when`<br />
          ( isElem >>> hasName en )<br />
        where<br />
        absHRef :: String -> String -> String<br />
        absHRef base url<br />
            = fromMaybe url . expandURIString url $ base<br />
</haskell><br />
<br />
<hask>editRef</hask> is parameterized by the element and attribute names.<br />
The arrow applied to every element is extended to a sequence of<br />
<hask>editRef</hask>s ((2)-(5)). Notice that the document is still traversed only once.<br />
To process all possible HTML elements,<br />
this sequence would have to be extended with further element-attribute pairs.<br />
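That the edits share a single traversal can be seen in a small toy model. The <hask>Tree</hask> type and all names below are ours, not HXT's, and real URI resolution is replaced by string concatenation:<br />

```haskell
-- toy model of an element: name, attribute list, children
data Tree = Elem String [(String, String)] [Tree] deriving (Show, Eq)

-- rewrite one attribute of elements with a given name
editRef :: String -> String -> (String -> String) -> Tree -> Tree
editRef en an f t@(Elem n as cs)
    | n == en   = Elem n [ (a, if a == an then f v else v) | (a, v) <- as ] cs
    | otherwise = t

-- apply an edit at every node, top-down, in one traversal
processTopDown :: (Tree -> Tree) -> Tree -> Tree
processTopDown f t =
    case f t of
      Elem n as cs -> Elem n as (map (processTopDown f) cs)

-- several edits sequenced into the single traversal
mkAbsRefs :: String -> Tree -> Tree
mkAbsRefs base =
    processTopDown (editRef "img" "src" abs' . editRef "a" "href" abs')
  where
    abs' u = base ++ u   -- toy "absolutize", not real URI resolution
```

The HXT <hask>mkAbsRefs</hask> above does the same over <hask>XmlTree</hask>: adding more element-attribute pairs extends the edit arrow, not the number of passes over the document.<br />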
<br />
<hask>mkAbsRefs</hask> can be simplified further:<br />
<br />
<haskell><br />
mkAbsRefs :: ArrowXml a => String -> a XmlTree XmlTree<br />
mkAbsRefs base<br />
    = processTopDown editRefs<br />
    where<br />
    editRefs<br />
        = foldl (>>>) this<br />
          .<br />
          map (\ (en, an) -> editRef en an)<br />
          $<br />
          [ ("a",      "href")<br />
          , ("img",    "src")<br />
          , ("link",   "href")<br />
          , ("script", "src")                       -- and more<br />
          ]<br />
    editRef<br />
        = ...<br />
</haskell><br />
<br />
The pattern <hask>foldl (>>>) this</hask> is already defined in HXT as <hask>seqA</hask>,<br />
so the above code can be simplified to<br />
<br />
<haskell><br />
mkAbsRefs :: ArrowXml a => String -> a XmlTree XmlTree<br />
mkAbsRefs base<br />
    = processTopDown editRefs<br />
    where<br />
    editRefs<br />
        = seqA . map (uncurry editRef)<br />
          $<br />
          ...<br />
</haskell><br />
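The <hask>seqA</hask>-style fold can be tried out with ordinary function arrows from <hask>Control.Arrow</hask> in base; <hask>seqA'</hask> below is our stand-in name, with <hask>id</hask> playing the role of HXT's <hask>this</hask>:<br />

```haskell
import Control.Arrow ((>>>))
import Data.Char (toUpper)

-- sequence a list of arrows (here plain functions) into one,
-- exactly the foldl (>>>) pattern that seqA captures
seqA' :: [a -> a] -> a -> a
seqA' = foldl (>>>) id

-- the edits are applied left to right in a single pass
example :: String
example = seqA' [map toUpper, (++ "!")] "abc"   -- "ABC!"
```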
<br />
== More complex examples ==<br />
<br />
''' to be done '''<br />
<br />
Some practical examples of HXT in action can be found here: [[HXT/Practical]]<br />
<br />
=== Automatic read/writing between xml and Haskell data types ===<br />
<br />
'''Question''': is there any way to write/read Haskell types to/from XML in HXT? HaXml has readXml and showXml, but I can't find any similar mechanism in HXT. Help! -- AlsonKemp<br />
<br />
==== Serializing to Xml ====<br />
<br />
We can create an HXT tree from a single-layer Haskell data type as follows:<br />
<br />
<haskell><br />
import IO<br />
import Char<br />
import Maybe (fromMaybe)<br />
import Text.XML.HXT.Arrow<br />
import Data.Generics<br />
<br />
-- the data type we'll convert into xml<br />
data Config =<br />
    Config { username    :: String,<br />
             logNumDays  :: Int,<br />
             oleDbString :: String }<br />
    deriving (Show, Typeable, Data)<br />
<br />
-- helper function adapted from http://www.defmacro.org/ramblings/haskell-web.html<br />
-- (gshow replaced by gshow')<br />
introspectData :: Data a => a -> [(String, String)]<br />
introspectData a = zip fields (gmapQ gshow' a)<br />
    where fields = constrFields $ toConstr a<br />
<br />
gshow' :: Data a => a -> String<br />
gshow' t = fromMaybe (showConstr (toConstr t)) (cast t)<br />
<br />
-- create an xml string from a single-layer Haskell data type<br />
-- (the foldr emits the fields in reverse order, which is harmless here)<br />
xmlSerialize object =<br />
    "<" ++ show (toConstr object) ++ ">" ++<br />
    foldr (\(a,b) x -> x ++ "<" ++ a ++ ">" ++ b ++ "</" ++ a ++ ">") "" (introspectData object) ++<br />
    "</" ++ show (toConstr object) ++ ">"<br />
<br />
-- create an HXT tree arrow from a single-layer Haskell data type:<br />
createHxtArrow object = runLA (constA (xmlSerialize object) >>> xread)<br />
<br />
-- a config object to serialize:<br />
createConfig = Config { username = "test", logNumDays = 3, oleDbString = "qsdf" }<br />
<br />
-- test function, using our Config data type<br />
testConversion = createHxtArrow createConfig ()<br />
</haskell><br />
<br />
-- hughperkins<br />
<br />
==== Deserializing from Xml ====<br />
<br />
Here's a solution for deserializing a simple Haskell data type containing Strings and Ints.<br />
<br />
It's not really pretty, but it works.<br />
<br />
Basically, we just convert the incoming XML into a <hask>gread</hask>-compatible format, then use <hask>gread</hask> :-D<br />
<br />
Currently it works for a simple single-layer Haskell data type containing Ints and Strings. Support for further field types can be added by extending the case statement in <hask>xmlToGShowFormat</hask>.<br />
<br />
If someone has a more elegant solution, please let me know ( hughperkins@gmail.com )<br />
<br />
<haskell><br />
module ParseXml<br />
where<br />
<br />
import IO<br />
import Char<br />
import List<br />
import Maybe<br />
import Data.Generics hiding (Unit)<br />
import Text.XML.HXT.Arrow hiding (when)<br />
<br />
data Config = Config { name :: String, age :: Int }<br />
-- data Config = Config { age :: Int }<br />
    deriving (Data, Show, Typeable, Ord, Eq, Read)<br />
<br />
createConfig = Config "qsdfqsdf" 3<br />
-- createConfig = Config 3<br />
<br />
gshow' :: Data a => a -> String<br />
gshow' t = fromMaybe (showConstr (toConstr t)) (cast t)<br />
<br />
-- helper function from http://www.defmacro.org/ramblings/haskell-web.html<br />
introspectData :: Data a => a -> [(String, String)]<br />
introspectData a = zip fields (gmapQ gshow' a)<br />
    where fields = constrFields $ toConstr a<br />
<br />
-- create an xml string from a single-layer Haskell data type<br />
xmlSerialize object =<br />
    "<" ++ show (toConstr object) ++ ">" ++<br />
    foldr (\(a,b) x -> x ++ "<" ++ a ++ ">" ++ b ++ "</" ++ a ++ ">") "" (introspectData object) ++<br />
    "</" ++ show (toConstr object) ++ ">"<br />
<br />
-- parse xml to an HXT tree and obtain the text content of node "fieldname"<br />
getValue :: String -> String -> Maybe String<br />
getValue xml fieldname<br />
    | null resultlist = Nothing<br />
    | otherwise       = Just (head resultlist)<br />
    where<br />
    resultlist = runLA ( constA xml >>> xread<br />
                         >>> deep (hasName fieldname)<br />
                         >>> getChildren >>> getText ) []<br />
<br />
-- parse templateobject to get the list of field names,<br />
-- look each field up in the xml,<br />
-- and assemble the values into a gread-compatible string<br />
xmlToGShowFormat :: Data a => String -> a -> String<br />
xmlToGShowFormat xml templateobject =<br />
    "(" ++ mainconstructorname ++ " " ++ unwords formattedfieldlist ++ ")"<br />
    where<br />
    mainconstructorname = showConstr $ toConstr templateobject<br />
    fields              = constrFields $ toConstr templateobject<br />
    values              = map (getValue xml) fields<br />
    datatypereps        = gmapQ (dataTypeRep . dataTypeOf) templateobject<br />
    fieldtogshowformat (value, datatyperep) =<br />
        case datatyperep of<br />
            IntRep -> "(" ++ fromJust value ++ ")"<br />
            _      -> show (fromJust value)<br />
    formattedfieldlist = map fieldtogshowformat (zip values datatypereps)<br />
<br />
xmlDeserialize xml templateobject = fst $ head $ gread (xmlToGShowFormat xml templateobject)<br />
<br />
dotest  = xmlDeserialize (xmlSerialize createConfig) createConfig :: Config<br />
dotest' = xmlDeserialize "<Config><age>12</age><name>test name!</name></Config>" createConfig :: Config<br />
</haskell><br />
<br />
-- hughperkins</div>