The latest updates on the HaRe with GHC API project seem to be posted on the Google+ community page:<div>https://plus.google.com/communities/116266567145785623821</div><div><br></div><div><br><br>On Monday, April 29, 2013 5:09:56 AM UTC-5, Simon Hengel wrote:<blockquote class="gmail_quote" style="margin: 0;margin-left: 0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">Hi Niklas,<br>I haven't read the whole proposal as I'm short of time. &nbsp;But Alan<br>Zimmerman is doing a lot of work on integrating HaRe with the GHC API<br>[1]. &nbsp;He is alanz on freenode and a regular in #hspec.<p>I haven't looked at the code, but maybe it's of interest to you.</p><p>Cheers,<br>Simon</p><p>[1] <a href="https://github.com/alanz/HaRe/tree/ghc-api" target="_blank">https://github.com/alanz/HaRe/<wbr>tree/ghc-api</a></p><p>On Mon, Apr 29, 2013 at 02:00:23PM +0800, Niklas Hambüchen wrote:<br>&gt; I would like to propose the development of a source code refactoring tool<br>&gt; that operates on Haskell source code ASTs and lets you formulate rewrite<br>&gt; rules written in Haskell.<br>&gt; <br>&gt; Objective<br>&gt; ---------<br>&gt; <br>&gt; The goal is to make refactorings easier and allow global code changes<br>&gt; that might be incredibly tedious to do in a non-automated way.<br>&gt; By making these transformations convenient, we can make it easier to<br>&gt; maintain clean code, add new features or clean up leftovers faster, and<br>&gt; reduce the fear and effort of upgrading to newer versions of packages and<br>&gt; APIs.<br>&gt; <br>&gt; <br>&gt; Transformations<br>&gt; ---------------<br>&gt; <br>&gt; First, here are a few operations you would use this tool for. 
Some of<br>&gt; them are common operations you would also do in other programming<br>&gt; languages; some are more specific to Haskell.<br>&gt; <br>&gt; * Changing all occurrences of "import Prelude hiding (catch)" to "import<br>&gt; qualified Control.Exception as E"<br>&gt; <br>&gt; * Replacing all uses of a function with that function being imported<br>&gt; qualified or the other way around<br>&gt; <br>&gt; * Adding a field to a data constructor or record, setting user-supplied<br>&gt; defaults for construction and destruction:<br>&gt; <br>&gt; &nbsp; &nbsp; -- Suppose you want to change one of these<br>&gt; &nbsp; &nbsp; data User = User { name :: String, age :: Int }<br>&gt; &nbsp; &nbsp; data User = User String Int<br>&gt; <br>&gt; &nbsp; &nbsp; -- into one of these<br>&gt; &nbsp; &nbsp; data User = User { name :: String, age :: Int, active :: Bool }<br>&gt; &nbsp; &nbsp; data User = User String Int Bool<br>&gt; <br>&gt; &nbsp; &nbsp; -- the refactoring tool could perform, in all relevant locations:<br>&gt; &nbsp; &nbsp; show (User name age) = ...<br>&gt; &nbsp; &nbsp; show (User name age _) = ...<br>&gt; <br>&gt; &nbsp; &nbsp; -- and also this transformation:<br>&gt; &nbsp; &nbsp; ... u { name = "deleted" } ...<br>&gt; &nbsp; &nbsp; ... 
u { name = "deleted", active = False } ...<br>&gt; <br>&gt; &nbsp; &nbsp; -- or equivalently with records.<br>&gt; <br>&gt; &nbsp; &nbsp; -- Special cases could be taken care of as specified, such as<br>&gt; &nbsp; &nbsp; -- &nbsp; "whenever an object of [this User type] has one<br>&gt; &nbsp; &nbsp; -- &nbsp; &nbsp;of its record fields passed into some function 'email', do this<br>&gt; &nbsp; &nbsp; -- &nbsp; &nbsp;now only if the user is active, so modify all relevant code<br>&gt; &nbsp; &nbsp; -- &nbsp; &nbsp; &nbsp; &nbsp;email (name u)<br>&gt; &nbsp; &nbsp; -- &nbsp; &nbsp;to<br>&gt; &nbsp; &nbsp; -- &nbsp; &nbsp; &nbsp; &nbsp;if (active u) then email (name u) else return ()<br>&gt; <br>&gt; &nbsp; &nbsp; -- Other examples include adding a position counter to attoparsec.<br>&gt; <br>&gt; * Adding a type parameter to a type<br>&gt; <br>&gt; &nbsp; &nbsp; -- This happens a lot on monad transformer stacks, e.g.<br>&gt; &nbsp; &nbsp; newtype MyMonad a b c = MyMonad (ReaderT a (WriterT b ...<br>&gt; <br>&gt; &nbsp; &nbsp; -- and as you would probably agree, this is not the most<br>&gt; &nbsp; &nbsp; -- comfortable change to make; in a big project this can mean<br>&gt; &nbsp; &nbsp; -- hour-long grinding.<br>&gt; <br>&gt; &nbsp; &nbsp; -- It has also recently happened in the basic underlying types<br>&gt; &nbsp; &nbsp; -- of packages like conduit and pipes.<br>&gt; <br>&gt; * Adding a new transformer around a monad<br>&gt; <br>&gt; * Addressing problems like those mentioned in<br>&gt; <a href="http://blog.ezyang.com/2012/01/modelling-io/" target="_blank">http://blog.ezyang.com/2012/<wbr>01/modelling-io/</a>:<br>&gt; &nbsp; "There is one last problem with this approach: once the primitives<br>&gt; have been selected, huge swaths of the standard library have to be<br>&gt; redefined by “copy pasting” their definitions ..."<br>&gt; <br>&gt; * Extracting a value into a let or where clause<br>&gt; <br>&gt; * Renaming a variable, and all its occurrences that are semantically the<br>&gt; same 
variable (based on its scope)<br>&gt; <br>&gt; * Changing the way things are done, such as:<br>&gt; <br>&gt; &nbsp; &nbsp; * Replacing uses of fmap with &lt;$&gt;, also taking care of the<br>&gt; &nbsp; &nbsp; &nbsp; corresponding import, and of cases where partial application<br>&gt; &nbsp; &nbsp; &nbsp; is involved<br>&gt; <br>&gt; &nbsp; &nbsp; * Replacing uses of "when (isJust)" with "forM_"<br>&gt; <br>&gt; * Making imports clearer by adding all functions used in the file to the<br>&gt; import list of the module that gets them in scope<br>&gt; <br>&gt; * Finding all places where an exported function does not have all its<br>&gt; arguments haddock-documented.<br>&gt; <br>&gt; * Performing whole-project refactorings instead of operating on single<br>&gt; files only, allowing operations like<br>&gt; <br>&gt; &nbsp; &nbsp; "Find me all functions of this type, e.g.<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;Maybe a -&gt; (a -&gt; m a) -&gt; m a<br>&gt; &nbsp; &nbsp; &nbsp;in the project and extract them into this new module,<br>&gt; &nbsp; &nbsp; &nbsp;with the name 'onJust'."<br>&gt; <br>&gt; <br>&gt; One can try to address some of the problems above using regex-based<br>&gt; search and replace, but this already fails in the simplest case of<br>&gt; "import Prelude hiding (catch)" in case there is more than that imported<br>&gt; from Prelude or newlines involved in the import list.<br>&gt; <br>&gt; Transformations on the AST are much more powerful, and can guarantee that<br>&gt; the result is, at least syntactically, valid. 
No text-based tool can do that.<br>&gt; <br>&gt; <br>&gt; Other uses<br>&gt; ----------<br>&gt; <br>&gt; In addition to being able to perform transformations as mentioned above,<br>&gt; the refactoring tool as a library can be leveraged to:<br>&gt; <br>&gt; * Support or be the base of code formatting tools such as<br>&gt; stylish-haskell, linters, style/convention checkers, static analyzers,<br>&gt; test coverage tools etc.<br>&gt; <br>&gt; * Implement automatic API upgrades.<br>&gt; <br>&gt; &nbsp; Imagine the author of a library you use deprecates some functions,<br>&gt; introduces replacements, adds type parameters. In these cases, it is<br>&gt; very clear and often well-documented which code has to be replaced by<br>&gt; what. The library author could, along with the new release, publish<br>&gt; "upgrade transformations" that you can apply to your code base to save<br>&gt; most of the manual work.<br>&gt; <br>&gt; &nbsp; These upgrade transformations could be either parts of the packages<br>&gt; themselves or be separately maintained and refined based on the feedback of<br>&gt; users applying them to their code bases.<br>&gt; &nbsp; This could allow the Haskell community to keep up its fast pace<br>&gt; while making API breakage a non-problem.<br>&gt; <br>&gt; &nbsp; Automation is what makes us deal well with test suites, source<br>&gt; control, error checking. 
Taking the pain out of this will increase<br>&gt; people's incentive to upgrade their code and to keep it well-maintained<br>&gt; by easy refactorings.<br>&gt; <br>&gt; &nbsp; This concept of automatic code upgrades would, to my knowledge, be<br>&gt; quite unique in the programming language world, and the possibility to<br>&gt; have this is, like many things, the result of Haskell's excellent type system.<br>&gt; &nbsp; In comparison to dynamic and "unsafe" languages, we actually have a<br>&gt; lot of information around that we can use to write powerful and<br>&gt; expressive tools.<br>&gt; <br>&gt; <br>&gt; Splitting into GSoC tasks<br>&gt; -------------------------<br>&gt; <br>&gt; I think that the project is too large for one summer and should be split<br>&gt; into two parts:<br>&gt; <br>&gt; 1. Implementation of a "full-source" transformation-enabling AST<br>&gt; 2. The transformation engine and API / DSL<br>&gt; <br>&gt; <br>&gt; About 1: Creating a full-source AST<br>&gt; <br>&gt; Existing parsers for Haskell tend to throw away information that is not<br>&gt; necessary for compiling the code. Of course this is not good for a<br>&gt; code-to-code transformation tool; stripping comments or whitespace the<br>&gt; programmer cares about is not an option.<br>&gt; <br>&gt; haskell-src-exts has gained support for comments around version 1.1 (as<br>&gt; another GSoC project?), but still comments are treated as foreigners,<br>&gt; given to you in a list detached from the AST; their positions are<br>&gt; specified with integers, which means that when you modify the AST, you<br>&gt; have to take care to adjust the comments as well.<br>&gt; <br>&gt; The problem seems to be that haskell-src-exts is made for parsing;<br>&gt; its AST is not made for transformations. 
I have yet to see a tool that<br>&gt; actually *modifies* haskell-src-exts's AST - most of them use it for<br>&gt; parsing, and then apply some form of pretty-printing to get back to code.<br>&gt; <br>&gt; We need an AST that contains *all* information about the original source<br>&gt; code, which means render . parse = id; this is why I called it "full<br>&gt; source" AST.<br>&gt; Also, this AST should have all elements available in a way that<br>&gt; encourages modification and does not discriminate against operations<br>&gt; like re-indentation or alignment.<br>&gt; <br>&gt; I like to think of this as something that could eventually be part of<br>&gt; ghc, in a pipeline like<br>&gt; <br>&gt; &nbsp; &nbsp; GHC's parser<br>&gt; &nbsp;-&gt; full-source AST<br>&gt; &nbsp;-&gt; reduced AST that strips parts irrelevant for compilation<br>&gt; &nbsp;-&gt; usual compilation pipeline<br>&gt; <br>&gt; I am thinking of this being in a GHC environment because it would<br>&gt; guarantee that it is up-to-date with the currently supported language<br>&gt; extensions etc.<br>&gt; However, I also see that this might slow down development and make<br>&gt; things harder, so at least for the beginning it might be better to do it<br>&gt; as a normal library; I would love to get feedback on this.<br>&gt; <br>&gt; It might be a good idea to start with GHC's or haskell-src-exts's<br>&gt; parser and modify them to obtain the full-source AST. I believe though<br>&gt; that it would be beneficial if the AST were to a certain extent<br>&gt; decoupled from the parser, such that other people could write other<br>&gt; parsers (e.g. 
using other parsing libraries like Parsec, attoparsec,<br>&gt; uu-parsinglib, trifecta) that produce the same compatible AST type.<br>&gt; <br>&gt; I believe that creating this AST makes a GSoC project in itself, and<br>&gt; that it would be beneficial for all efforts processing Haskell as a<br>&gt; source language.<br>&gt; <br>&gt; <br>&gt; About 2: Transformation engine and how to write transformations<br>&gt; <br>&gt; Transformations should be Haskell code that operates on the AST.<br>&gt; <br>&gt; They would be similar to how you write TemplateHaskell in a way, yet<br>&gt; much easier and more intuitive to use in most cases. I suggest the<br>&gt; creation of a monadic DSL that allows you to select and match those<br>&gt; parts of the code you are interested in, perform some case analysis and<br>&gt; computation to determine how you want to transform it, and then give you<br>&gt; convenient ways to express your changes in that DSL.<br>&gt; <br>&gt; Functions in this DSL would roughly belong to one of the kinds<br>&gt; &nbsp; &nbsp; * finding/matching<br>&gt; &nbsp; &nbsp; * transforming/rewriting.<br>&gt; <br>&gt; I have not thought about what exactly this DSL would look like, but I<br>&gt; could imagine myself writing a high-level transformation like this:<br>&gt; <br>&gt; &nbsp; &nbsp; is &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &lt;- getImports<br>&gt; &nbsp; &nbsp; let matching = filter (importsFunction "catch")<br>&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;. filter (hasModuleName "Prelude") $ is<br>&gt; &nbsp; &nbsp; mapM_ (rewrite . 
removeImportFunction "catch") matching<br>&gt; <br>&gt; One of the priority goals of this GSoC project is making this API<br>&gt; convenient; transformations should read more naturally and be much<br>&gt; shorter than equivalent TemplateHaskell.<br>&gt; <br>&gt; This extensive, convenience-oriented monadic DSL should aim to have a<br>&gt; reasonably large set of functions, and "do things for you" even if you<br>&gt; could build them yourself with three or four combinators.<br>&gt; It should base itself on a minimal set of core transformations.<br>&gt; I think that the Haskell community has a lot of expertise, especially<br>&gt; from parsing libraries, in creating this core API + convenience<br>&gt; combinators duo, that could aid the GSoC student to get this right.<br>&gt; <br>&gt; Along with this way to specify transformations, a transformation engine<br>&gt; has to be constructed which can perform them somewhat efficiently over<br>&gt; large code-bases.<br>&gt; <br>&gt; While speed should not be a main target for this project and<br>&gt; functionality and expressiveness are the main goals, the transformation<br>&gt; engine should be designed at least with speed in mind, which can be<br>&gt; worked on at a later stage outside of the GSoC project.<br>&gt; <br>&gt; This second part of the proposal should result in a library to perform<br>&gt; transformations and an executable that can apply them to a folder of<br>&gt; source files.<br>&gt; <br>&gt; <br>&gt; Possible extensions and follow-up projects<br>&gt; ------------------------------<wbr>------------<br>&gt; <br>&gt; * Interactive refactoring tool<br>&gt; <br>&gt; The refactoring tool could be driven by an interactive application that<br>&gt; lets you write some transformations, apply them to your code base, view<br>&gt; the diffs and build the project; on build failures or otherwise<br>&gt; not-complete transformations it would allow you to refine your<br>&gt; transformations, or write custom special cases 
for selected files, line<br>&gt; ranges or functions.<br>&gt; This way you could quickly upgrade your code base in a fast<br>&gt; write-check-repeat cycle.<br>&gt; <br>&gt; * An API upgrade infrastructure<br>&gt; <br>&gt; As mentioned above, transformations could be published along with new<br>&gt; software packages that allow their dependants to easily upgrade their<br>&gt; code to the latest API version with minimal manual effort.<br>&gt; An infrastructure or standard way of doing this could be established.<br>&gt; For now, I think that package maintainers would publish their upgrade<br>&gt; transformations in their projects' or separate code repositories, and<br>&gt; users would apply them with the refactoring tool as needed.<br>&gt; I can also imagine a more dedicated infrastructure though, where API<br>&gt; upgrades are stored on Hackage or a similar database.<br>&gt; <br>&gt; <br>&gt; Summary<br>&gt; -------<br>&gt; <br>&gt; This proposal contains two GSoC projects that I believe to be reasonably<br>&gt; sized for one summer each.<br>&gt; <br>&gt; I think that they make good projects according to GSoC standards as they<br>&gt; are focused, feasible, and aimed at creating real-world code that brings<br>&gt; a clearly visible benefit.<br>&gt; <br>&gt; The entry barrier is not too high since knowledge of compiler or runtime<br>&gt; internals is not required. 
However, applicants should probably have a<br>&gt; fairly good understanding of parsers and how they are dealt with in<br>&gt; Haskell, and be somewhat familiar with the Haskell community in order to<br>&gt; find sources of good feedback and for being able to judge whether the<br>&gt; tools being created would be convenient for the community to use.<br>&gt; <br>&gt; <br>&gt; Discussion<br>&gt; ----------<br>&gt; <br>&gt; Now I would be glad to get some responses on this proposal; I have<br>&gt; written a bit of text, but it is still a very rough idea and I would<br>&gt; love to hear your thoughts about it.<br>&gt; <br>&gt; ______________________________<wbr>_________________<br>&gt; Haskell-Cafe mailing list<br>&gt; <a href="javascript:" target="_blank" gdf-obfuscated-mailto="pK0OvcQw3i0J">Haskel...@haskell.org</a><br>&gt; <a href="http://www.haskell.org/mailman/listinfo/haskell-cafe" target="_blank">http://www.haskell.org/<wbr>mailman/listinfo/haskell-cafe</a></p></blockquote></div>