We’ve open-sourced Retrie, a code refactoring tool for Haskell that makes codemodding faster, easier, and safer. Using Retrie, developers can efficiently rewrite large codebases (exceeding 1 million lines), express rewrites as equations in Haskell syntax instead of regular expressions, and avoid large classes of codemodding errors.
Retrie’s features include the ability to rewrite expressions, types, and patterns; the ability to script rewrites and add side conditions; and a library for scripting more advanced rewrites. Retrie also respects and maintains local scoping, preserves white space, and does not rewrite code comments.
Why it matters:
Refactoring improves the overall design of a codebase, but it can be a meticulous, time-consuming task. To avoid errors, refactoring is often done by hand, in small increments. As the size of a codebase grows, this approach becomes infeasible, and tool support is critical.
In the spectrum of refactoring tools, there are two extremes. At one end, there are find-and-replace string manipulation tools, such as sed. These tools are fast, but it is difficult to express complicated rewrites. At the other end of the spectrum are tools for parsing and manipulating an abstract syntax tree (AST). AST manipulation tools are very powerful but require extensive domain knowledge and are typically slow compared with string replacement.
Retrie occupies a comfortable middle ground. Expressing rewrites as equations in the syntax of the target language (in this case, Haskell) is easier than defining a complex regular expression or AST traversal. Since equations are more powerful than regular expressions and rewrites can be scripted, Retrie is more powerful than string replacement alone. Retrie also uses several techniques to narrow the search space before parsing and to find matches efficiently, which makes it faster than typical AST manipulation tools.
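To make the idea of rewriting with equations concrete, here is a toy sketch of equational rewriting over a tiny expression tree. It is not Retrie's implementation: the `Expr` type, the `Equation` record, and the `match`, `subst`, and `rewrite` functions are all hypothetical names invented for illustration, and the sketch ignores local scoping, which Retrie does handle.

```haskell
-- A tiny expression language: names and function application.
data Expr
  = Var String      -- a name such as "map", or a quantified variable
  | App Expr Expr   -- function application
  deriving (Eq, Show)

-- Bindings for quantified variables, kept as a simple association list.
type Subst = [(String, Expr)]

-- An equation of the form "forall vars. lhs = rhs" (hypothetical type).
data Equation = Equation
  { quantified :: [String]
  , lhs        :: Expr
  , rhs        :: Expr
  }

-- Match a pattern against an expression, binding quantified variables.
-- A variable that occurs twice must be bound to equal expressions.
match :: [String] -> Expr -> Expr -> Subst -> Maybe Subst
match vars (Var v) e s
  | v `elem` vars = case lookup v s of
      Nothing -> Just ((v, e) : s)
      Just e' -> if e' == e then Just s else Nothing
  | otherwise = if e == Var v then Just s else Nothing
match vars (App pf pa) (App ef ea) s = match vars pf ef s >>= match vars pa ea
match _ _ _ _ = Nothing

-- Replace bound variables in an expression with their bindings.
subst :: Subst -> Expr -> Expr
subst s (Var v)   = maybe (Var v) id (lookup v s)
subst s (App f a) = App (subst s f) (subst s a)

-- Apply the equation bottom-up, at most once at each node.
rewrite :: Equation -> Expr -> Expr
rewrite eq expr =
  let expr' = case expr of
        App f a -> App (rewrite eq f) (rewrite eq a)
        _       -> expr
  in case match (quantified eq) (lhs eq) expr' [] of
       Just s  -> subst s (rhs eq)
       Nothing -> expr'

-- Helper: apply a function to two arguments.
app2 :: Expr -> Expr -> Expr -> Expr
app2 f a b = App (App f a) b

-- forall f g xs. map f (map g xs) = map (f . g) xs
mapFusion :: Equation
mapFusion = Equation
  { quantified = ["f", "g", "xs"]
  , lhs = app2 (Var "map") (Var "f") (app2 (Var "map") (Var "g") (Var "xs"))
  , rhs = app2 (Var "map") (app2 (Var ".") (Var "f") (Var "g")) (Var "xs")
  }

-- map bar (map baz ints)
example :: Expr
example = app2 (Var "map") (Var "bar") (app2 (Var "map") (Var "baz") (Var "ints"))
```

Evaluating `rewrite mapFusion example` yields the tree for `map (bar . baz) ints`. The pattern matches on structure rather than on characters, so white space and line breaks in the source cannot affect the result the way they can with a regular expression.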
As an example of using Retrie, let’s say you have some code, including functions like
module MyModule where

foo :: [Int] -> [Int]
foo ints = map bar (map baz ints)
You realize that traversing the list ints twice is slower than doing it just once. Instead of fixing the code manually, you can express the transformation as an equation and apply it everywhere in the current directory:
retrie --adhoc "forall f g xs. map f (map g xs) = map (f . g) xs"
In this example, Retrie will make the following edit:
 module MyModule where

 foo :: [Int] -> [Int]
-foo ints = map bar (map baz ints)
+foo ints = map (bar . baz) ints
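The edit preserves behavior because both sides of the equation compute the same list. The following minimal sketch checks this for one input; `bar` and `baz` are given hypothetical definitions here, since the example above does not say what they do, and `fooBefore`, `fooAfter`, and `agree` are names introduced only for this illustration.

```haskell
-- Hypothetical definitions; the original example leaves bar and baz unspecified.
bar :: Int -> Int
bar = (+ 1)

baz :: Int -> Int
baz = (* 2)

-- The function before the rewrite: written as two passes over the list.
fooBefore :: [Int] -> [Int]
fooBefore ints = map bar (map baz ints)

-- The function after the rewrite: a single pass with the composed function.
fooAfter :: [Int] -> [Int]
fooAfter ints = map (bar . baz) ints

-- True when both versions produce the same result for the given input.
agree :: [Int] -> Bool
agree xs = fooBefore xs == fooAfter xs
```

With these definitions, `agree [1, 2, 3]` evaluates to `True`. A property-based testing library could check the same equality over many generated inputs.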
Use it for:
Haskell powers our anti-abuse rule engine, Sigma. To manage the growing scale and complexity of rules, we migrated Sigma to Haskell in 2015. Sigma blocks spam, phishing attacks, and malware before they can affect people on Facebook. Retrie has allowed us to quickly and safely migrate Sigma’s rules to new APIs and libraries. We’re releasing it to the wider community so that other developers can take advantage of its speed, ease of use, and power for their own codemodding tasks.