The next interesting variant on graphs is *directed graphs*, or *digraphs* for

short. A digraph is a graph where each edge distinguished between its source and its target - so an edge is *from* one node, and *to* another node. Unlike a simple graph, where if A is adjacent to B, then you can follow the edge either from A to B, or from B to A, in a directed graph, A to B and B to A are different edges.

A lot of things change when you're using digraphs instead of simple graphs. For example, the shortest

path: the shortest path from A to B is *different* than the shortest path from B to A. Look at the

following example of a weighted digraph. The shortest path from A to B is 2 (A→B); the shortest path

from B to A is 9 (B→D→C→E→A).

Digraphs come up in many places - basically anywhere where direction matters. Think, for example, of

New York City streets. Every intersection is a vertex; streets between intersections are edges. How do you

get from 96th street and Madison to 18th street and 4th ave? Since most of the streets in Manhattan are

one-way, you don't want to try to do this with an undirected graph - you'll get directions that you can't

follow. You need to know the directions of the edges. (An important point, which this illustrates, is

that in a digraph, you can have *both* an edge from A to B *and* an edge from B to A.

Two way streets get modeled as two edges - one in each direction.)

For a less obvious example of this, we can look at something very close to home for me. There's

a very famous software tool, first implemented at Bell Labs, called "make". The purpose of make

is to provide an automated way of compiling software, which ensures that no more work is done compiling than necessary.

This is important, because there are often many steps to compiling a software system. For example, a program that I wrote to generate some fractals consists of:

- a graph description file (GDF).
- A python program which processes the GDF, and generates a C++ program.
- A library - that is, a set of other C++ sources that aren't specific to one particular

image, but which know how to generate "pdf" image files - that is used by the C++ program to

generate the image. - The C++ standard library - which is used both by the program generated from my GDF file,

and by the PDF library. I don't have to compile this - but it's a component that gets

included to turn anything into an executable.

So, to generate an image, I write a GDF. Then the python program runs on the GDF, generating

a C++ program. Then the C++ program is compiled. The library is compiled; and then the compiled

C++ program and the two libraries are linked and run, to generate the image.

Suppose I write a GDF file which generates a nice picture of the mandelbrot set. Then I run

make to compile it. It generates C++, compiles the C++ and the library, and runs the program

to generate the image.

Then I want to use a different GDF file - this time, I want to generate a Julia fractal. There's no

reason to recompile the C++ library. I already did that when I was generating a mandelbrot, and

the library hasn't changed. I can just re-use the library from last time.

I look at my Julia image, and it's no good - it's very pixelated. So I add some kind of

scaling factor to the library. But I'm not changing my gdf file, and I'm not changing the python script. I shouldn't need to rerun the python script on the unchanged GDF; and that means that

the generated C++ program can't change either. I need to recompile the library, and I need to

relink the compiled C++.

That's what make is for. It figures out what needs to be redone, based an what changed. The way

that it does that is by using a *dependency graph*. A dependency graph is a digraph which

has a vertex for every input to, and every output from, any step of a build process. There is an edge

from a vertex A to a vertex B if changing A means that B needs to be recompiled. For example, below is

the digraph for the scenario I described above. In it, I've made C++ files red/orange, libraries and executables green, python source blue, and my gdf file purple. Source files are rounded boxes, libraries are ovals, object files are diamonds, and executables are rectangles.

To build the executable for my fractal generator

("foo.exe"), I need to have the generated source file "foo.cc", the pdf library ("libpdf.a"), and

the C++ standard library ("libstdc++.a"); to build "foo.cc", I need

both an input file, "foo.gdf", and the python script "gen.py", and so on.

So, now, suppose I edit "foo.gdf". That means I need to rebuild everything on any path

from "foo.gdf" up to "foo.exe": so I need to run "gen.py" to build "foo.cc"; then I need to compile

"foo.cc", and link it against the two libraries to build "foo.exe".

If I updated the version of my C++ library, then I would just need to recompile and link "foo.cc".

My favorite subset of Digraphs are: (1) Functional graphs; (2) The functional graphs of prime endofunctions.

To clarify:

(1) A functional graph is a digraph in which each vertex has outdegree one, and can therefore be specified by a function mapping {1,...,n} onto itself.

(2) By a prime endofunction, I mean a digraph from {1,...,n} to {1,..,n}, hence an endomorphism, which is neither the direct sum nor the categorical product of other prime endofunctions. The enumeration of these turns out to be rather tricky. My son and I have been working on a paper about this for a year, drafts of which were final projects for his last Math course at Cal State L.A., where he earned his double B.S. in Math and computer science. Yesterday, still just eighteen years old, he moved into the graduate dorms at USC and began Law School.

(3) The number of different weakly connected digraphs (which may have loops) on n unlabeled nodes which have at least one node with outdegree greater than 1. A "loop" is an edge from a point to itself. "Simple" means that there cannot be two or more edges of the same direction connecting any given pair of points. A weakly connected digraph is [Weisstein]: "A directed graph in which it is possible to reach any node starting from any other node by traversing edges in some direction (i.e., not necessarily in the direction they point). The nodes in a weakly connected digraph therefore must all have either outdegree or indegree of at least 1." A003085 is Number of connected [simple] digraphs with n nodes [and no loops]. Simple digraphs with loops = binary relations, so this sequence enumerates binary relations which are neither partial functions nor functions.

(4) On OEIS, we find the enumeration

A003085 Number of connected digraphs with n nodes.

There are also partial endofunctions, and many other sets of some interest. But underlying it all, Digraphs. They rock!

Great post, combining two of my favorite functional facilities, streets of Manhattan and make, both very slick.

Of course, the almost regular digraph of Manhattan is still dynamic - as a frakking furreigner I initially didn't know that "parallel parking" has another meaning in NY, but got stuck behind a parked 2nd lane in front of a truck. Those truck drivers sure know to honk their horns...

Hi Mark,

Your graphs are approaching Petri Nets.

See slides 24, 36, 37, 47, 57, 68 of 71 in Geert Jan Olsder talk on Max Plus Algebra, discrete event systems [train timetables].

Text in English [title slide not].

http://webserv.nhl.nl/~kamminga/wintersymposium/Olsder2005.pdf

Have a great vacation!

Hi Mark,

may I ask which program you're using to generate these nice diagrams?

Guy:

All of the graph diagrams, and most of the diagrams used on the blog are drawn using OmniGraffle on the mac. It's one of the best pieces of software that I've ever used: it's extremely powerful and flexible, and it's got a beautiful and easy to work with UI.

The connection between digraphs and biogenesis and evolution is suggested (I love the word "eigendynamics" which did not exist when I de facto used it in my PhD research mid 1970s) by:

arXiv:0708.4212

Title: Aggregate Dynamics in an Evolutionary Network Model

Authors: Adrian M. Seufert, Frank Schweitzer

Subjects: Populations and Evolution (q-bio.PE); Adaptation and Self-Organizing Systems (nlin.AO); Molecular Networks (q-bio.MN); Quantitative Methods (q-bio.QM)

We analyze a model of interacting agents (e.g. prebiotic chemical species) which are represended by nodes of a network, whereas their interactions are mapped onto directed links between these nodes. On a fast time scale, each agent follows an eigendynamics based on catalytic support from other nodes, whereas on a much slower time scale the network evolves through selection and mutation of its nodes-agent. In the first part of the paper, we explain the dynamics of the model by means of characteristic snapshots of the network evolution and confirm earlier findings on crashes an recoveries in the network structure. In the second part, we focus on the aggregate behavior of the network dynamics. We show that the disruptions in the network structure are smoothed out, so that the average evolution can be described by a growth regime followed by a saturation regime, without an initial random regime. For the saturation regime, we obtain a logarithmic scaling between the average connectivity per node $mean{l}_{s}$ and a parameter $m$, describing the average incoming connectivity, which is independent of the system size $N$.