How Revise works

Revise is based on the fact that you can change functions even when they are defined in other modules. Here's an example showing how you do that manually (without using Revise):

julia> convert(Float64, π)
3.141592653589793

julia> # That's too hard, let's make life easier for students

julia> @eval Base convert(::Type{Float64}, x::Irrational{:π}) = 3.0
convert (generic function with 714 methods)

julia> convert(Float64, π)
3.0

Revise removes some of the tedium of manually copying and pasting code into @eval statements. To decrease the amount of re-JITting required, Revise avoids reloading entire modules; instead, it takes care to eval only the changes in your package(s), much as you would if you were doing it manually. Importantly, changes are detected in a manner that is independent of the specific line numbers in your code, so that you don't have to re-evaluate just because code moves around within the same file. (One unfortunate side effect is that line numbers may become inaccurate in backtraces, but Revise takes pains to correct these; see below.)
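The line-number independence described above can be illustrated with a minimal sketch. It assumes that comparing expressions after discarding their LineNumberNodes is a reasonable stand-in for what Revise does; strip_lines is a hypothetical helper written for this example, not part of Revise's API.

```julia
# Hypothetical helper: rebuild an expression without its LineNumberNodes.
strip_lines(x) = x
function strip_lines(ex::Expr)
    kept = Any[strip_lines(a) for a in ex.args if !(a isa LineNumberNode)]
    return Expr(ex.head, kept...)
end

# The same function before and after a blank line and a comment are inserted.
v1 = "function f(x)\n    x + 1\nend"
v2 = "function f(x)\n\n    # moved down\n    x + 1\nend"

ex1 = Meta.parse(v1)
ex2 = Meta.parse(v2)

raw_equal      = ex1 == ex2                            # line numbers differ
stripped_equal = strip_lines(ex1) == strip_lines(ex2)  # code itself is identical
```

Here raw_equal is false while stripped_equal is true, so an edit of this kind would not call for re-evaluation.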

To accomplish this, Revise uses the following overall strategy:

The structure of Revise's internal representation

[Diagram: the structure of Revise's internal representation]

Figure notes: Nodes represent primary objects in Julia's compilation pipeline. Arrows and their labels represent functions or data structures that allow you to move from one node to another. Red ("destructive") paths force recompilation of dependent functions.

Revise bridges between text files (your source code) and compiled code. Revise consequently maintains data structures that parallel Julia's own internal processing of code. When dealing with a source-code file, you start with strings, parse them to obtain Julia expressions, evaluate them to obtain Julia objects, and (where appropriate, e.g., for methods) compile them to machine code. This will be called the forward workflow. Revise sets up a few key structures that allow it to progress from files to modules to Julia expressions and types.
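The forward workflow can be walked through by hand in a few lines. This is only a sketch using standard Julia calls; the module name Sandbox and the function double are made up for the example.

```julia
src = "double(x) = 2x"            # source text, as it would appear in a file

ex  = Meta.parse(src)             # string -> Julia expression
mod = Module(:Sandbox)            # hypothetical module to evaluate into
f   = Core.eval(mod, ex)          # expression -> Julia object (a function)

# Calling the function triggers compilation to machine code.
# invokelatest keeps the call valid even if this runs inside another function.
result = Base.invokelatest(f, 21)
```

Each assignment corresponds to one arrow of the forward workflow: text, expression, object, compiled call.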

Revise also sets up a backward workflow, proceeding from compiled code to Julia types back to Julia expressions. This workflow is useful, for example, when dealing with errors: the stack traces displayed by Julia link from the compiled code back to the source files. To make this possible, Julia builds "breadcrumbs" into compiled code that store the filename and line number at which each expression was found. However, these links are static, meaning they are set up once (when the code is compiled) and are not updated when the source file changes. Because trivial manipulations to source files (e.g., the insertion of blank lines and/or comments) can change the line number of an expression without necessitating its recompilation, Revise implements a way of correcting these line numbers before they are displayed to the user. This capability requires that Revise proceed backward from the compiled objects to something resembling the original text file.
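The "breadcrumbs" can be inspected directly on a method object. The sketch below uses a made-up function, triple, and an assumed line offset of 2 to show the kind of correction described above; the way Revise actually stores and applies offsets may differ.

```julia
# Hypothetical function used only for illustration.
triple(x) = 3x

m = which(triple, (Int,))     # the method a stack frame would refer to
recorded_file = m.file        # Symbol naming the file of the definition
recorded_line = m.line        # line number recorded when it was defined

# Assumption: two blank lines were later inserted above the definition,
# so the displayed line should be shifted without recompiling anything.
lineoffset = 2
corrected_line = recorded_line + lineoffset
```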

Terminology

A few convenience terms are used throughout: definition, signature-expression, and signature-type. These terms are illustrated using the following example:

function print_item(io::IO, item, ntimes::Integer=1, pre::String="")
    print(io, pre)
    for i = 1:ntimes
        print(io, item)
    end
end

This represents the definition of a method. Definitions are stored as expressions, using a Revise.RelocatableExpr. The signature-expression is the call-like portion of the first line, print_item(io::IO, item, ntimes::Integer=1, pre::String=""); it specifies the name, argument names and their types, and (if applicable) type-parameters of the method.

From the signature-expression we can generate one or more signature-types. Since this function has two default arguments, this signature-expression generates three signature-types, each corresponding to a different valid way of calling this method:

Tuple{typeof(print_item),IO,Any}                    # print_item(io, item)
Tuple{typeof(print_item),IO,Any,Integer}            # print_item(io, item, 2)
Tuple{typeof(print_item),IO,Any,Integer,String}     # print_item(io, item, 2, "  ")

In Revise's internal code, a definition is often represented with a variable def, a signature-expression with sigex, and a signature-type with sigt.
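The three signature-types listed above can be checked against Julia itself, since each default argument produces an additional method. The following sketch relies only on methods and the sig field of each method; it says nothing about how Revise derives signature-types internally.

```julia
function print_item(io::IO, item, ntimes::Integer=1, pre::String="")
    print(io, pre)
    for i = 1:ntimes
        print(io, item)
    end
end

# One entry per valid way of calling the method (order is not guaranteed).
sigts = [m.sig for m in methods(print_item)]

# Exercise the full four-argument form, writing into a buffer.
buf = IOBuffer()
print_item(buf, "ab", 2, "> ")
printed = String(take!(buf))
```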

Core data structures and representations

Two "maps" are central to Revise's inner workings: the DefMap links definition=>signature-types (the forward workflow), while the SigtMap links from signature-type=>definition (the backward workflow). Concretely, SigtMap is just a Dict mapping sigt=>def. Of note, a stack frame typically contains a link to a method, which stores the equivalent of sigt; consequently, this information allows one to look up the corresponding def.

The DefMap is a bit more complex and has important constraints:

Importantly, modules can be "reconstructed" from the keys of DefMap (or collection of DefMaps, if the module involves multiple files or has sub-modules), since they hold the complete ordered set of expressions that would be evaled to define the module.

Each DefMap and its SigtMap are grouped in a Revise.FMMaps; these are then organized by the file in which they occur and their module of evaluation.
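The relationship between the two maps can be sketched with plain Base containers. This is an illustration under assumptions, not Revise's implementation: defmap and sigtmap are hypothetical stand-ins, and a Vector of Pairs is used only to keep definitions in order.

```julia
indent(::UInt16) = 2
indent(::UInt8)  = 4

def16 = :(indent(::UInt16) = 2)
def8  = :(indent(::UInt8) = 4)

# Forward direction: definition => (signature-types, line offset), in source order.
defmap = [def16 => ([Tuple{typeof(indent),UInt16}], 0),
          def8  => ([Tuple{typeof(indent),UInt8}], 0)]

# Backward direction: signature-type => definition, built from the forward map.
sigtmap = Dict{Any,Expr}()
for (def, (sigts, _)) in defmap
    for sigt in sigts
        sigtmap[sigt] = def
    end
end

# A stack frame leads to a method; its signature-type finds the definition.
m = which(indent, (UInt8,))
found = sigtmap[m.sig]
```

The final lookup mirrors the backward workflow: from a method, through its signature-type, back to the expression that defined it.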

An example

Consider a module, Items, defined by the following two source files:

Items.jl:

__precompile__(false)

module Items

include("indents.jl")

function print_item(io::IO, item, ntimes::Integer=1, pre::String=" "^indent(item))
    print(io, pre)
    for i = 1:ntimes
        print(io, item)
    end
end

end

indents.jl:

indent(::UInt16) = 2
indent(::UInt8)  = 4

indents.jl is particularly simple: Revise represents it as "indents.jl"=>Dict(Items=>fmm1), specifying the filename, module(s) into which its code is evaled, and corresponding FMMaps. Because indents.jl only contains code from a single module (Items), the Dict has just one entry. fmm1 looks like this:

fmm1 = FMMaps(DefMap(:(indent(::UInt16) = 2) => ([Tuple{typeof(indent),UInt16}], 0),
                     :(indent(::UInt8) = 4)  => ([Tuple{typeof(indent),UInt8}], 0)
                     ),
              SigtMap(Tuple{typeof(indent),UInt16} => :(indent(::UInt16) = 2),
                      Tuple{typeof(indent),UInt8}  => :(indent(::UInt8) = 4)
                      ))

In each DefMap value, the second element (0 here) is the line offset of the definition. Line offsets are set to 0 when the code is first compiled, but they may be updated if the source file is changed.

Items.jl is represented with a bit more complexity: "Items.jl"=>Dict(Main=>fmm2, Main.Items=>fmm3). This is because Items.jl contains one expression (the __precompile__ statement) that is evaled in Main, and other expressions that are evaled in Items. Concretely,

fmm2 = FMMaps(DefMap(:(__precompile__(false)) => nothing),
              SigtMap())
fmm3 = FMMaps(DefMap(:(include("indents.jl")) => nothing,
                     def => ([Tuple{typeof(print_item),IO,Any},
                              Tuple{typeof(print_item),IO,Any,Integer},
                              Tuple{typeof(print_item),IO,Any,Integer,String}], 0)),
              SigtMap(Tuple{typeof(print_item),IO,Any} => def,
                      Tuple{typeof(print_item),IO,Any,Integer} => def,
                      Tuple{typeof(print_item),IO,Any,Integer,String} => def))

where def is the expression defining print_item.

Revisions and computing diffs

When the file system notifies Revise that a file has been modified, Revise re-parses the file and assigns the expressions to the appropriate modules, creating a Revise.FileModules fmnew. It then compares fmnew against fmref, the reference object that is synchronized to code as it was evaled. The following actions are taken:

Technically, a new fmref is generated every time to ensure that the expressions are ordered as in fmnew; however, conceptually this is better thought of as an updating of fmref, after which fmnew is discarded.
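As a rough illustration of comparing a re-parsed file against a reference, the sketch below treats the comparison as a set difference over definitions with line numbers removed. This is an assumption made for the example; parse_defs and strip_lines are hypothetical helpers, and Revise's actual comparison operates on its own data structures.

```julia
# Hypothetical helper: rebuild an expression without its LineNumberNodes.
strip_lines(x) = x
function strip_lines(ex::Expr)
    kept = Any[strip_lines(a) for a in ex.args if !(a isa LineNumberNode)]
    return Expr(ex.head, kept...)
end

# Hypothetical helper: parse every top-level expression in a source string.
function parse_defs(src::AbstractString)
    defs = Any[]
    pos = 1
    while pos <= ncodeunits(src)
        ex, pos = Meta.parse(src, pos)
        ex === nothing || push!(defs, strip_lines(ex))
    end
    return defs
end

old_src = "indent(::UInt16) = 2\nindent(::UInt8)  = 4\n"
new_src = "# a new comment\nindent(::UInt16) = 2\n\nindent(::UInt8)  = 8\n"

refdefs = parse_defs(old_src)   # plays the role of the reference
newdefs = parse_defs(new_src)   # plays the role of the re-parsed file

unchanged = [d for d in newdefs if d in refdefs]   # nothing to do
removed   = [d for d in refdefs if d ∉ newdefs]    # candidates for deletion
added     = [d for d in newdefs if d ∉ refdefs]    # candidates for eval
```

In this toy case the UInt16 definition is unchanged despite having moved, while the UInt8 definition appears once in removed (the old body) and once in added (the new body).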

Internal API

You can find more detail about Revise's inner workings in the Developer reference.