= Hacking

== Get, install

Basic use of the package is just `go get`, or `git clone` followed by `go install`. There are no dependencies outside the standard library.

== Build

CI is currently on travis-ci.org.

The build runs `go vet` with a few exceptions for things I'm not a big fan of.

https://github.com/client9/misspell has been valuable.

Also I wrote https://github.com/soniakeys/vetc to validate that each source file has a copyright/license statement.

Then, it’s not in the CI script, but I wrote https://github.com/soniakeys/rcv to put coverage stats in the readme. Maybe it could be a commit hook or something, but for now I’ll try just running it manually now and then.

Go fmt is not in the CI script, but I have at least one editor set up to run it on save, so code should stay formatted pretty well.

== Examples with random output

The math/rand generators with constant seeds used to give consistent numbers across Go versions, and so some examples relied on this. Sometime after Go 1.9, though, the numbers changed. The technique for now is to go ahead and write those examples, get them working, then change the `// Output:` line to `// Random output:`. This keeps them showing in go doc but keeps them from being run by go test. This works for now. It might be revisited at some point.

== Plans

The primary to-do list is the issue tracker on GitHub.

== Direction, focus, features

The project started with no real goal or purpose, just as a place for some code that might be useful. Here are some elements that characterize the direction.

* The focus has been on algorithms on adjacency lists. That is, the adjacency list is the fundamental representation for most implemented algorithms. There are many other interesting representations, and many reasons to use them, but the adjacency list is common in literature and practice. It has been useful to focus on this data representation, at first anyway.
* The focus has been on single-threaded algorithms.
Again, there is much new and interesting work being done with concurrent, parallel, and distributed graph algorithms, and Go might be an excellent language to implement some of these algorithms. But as a preliminary step, more traditional single-threaded algorithms are implemented.
* The focus has been on static finite graphs. Again, there is much interesting work in online algorithms, dynamic graphs, and infinite graphs, but these are not generally considered here.
* Algorithms selected for implementation are generally ones commonly appearing in beginning graph theory discussions and in general purpose graph libraries in other programming languages. With these as drivers, there's a big risk of developing a library of curiosities and academic exercises rather than a library of practical utility. But well, it's a start. The hope is that there are some practical drivers behind graph theory and behind other graph libraries.
* There is active current research going on in graph algorithm development. One goal for this library is to implement newer and faster algorithms. In some cases where it seems not too much work, older/classic/traditional algorithms may be implemented for comparison. These generally go in the alt subdirectory.

== General principles

* The API is rather low level.
* Slices instead of maps. Maps are pretty efficient, and the property of unique keys can be useful, but slices are still faster and more efficient, and the unique key property is not always needed or wanted. The adjacency list implementation of this library is all done in slices. Slices are used in algorithms where possible, in preference to maps. Maps are still used in some cases where uniqueness is needed.
* Interfaces are not generally used. Algorithms are implemented directly on concrete data types and not on interfaces describing the capabilities of the data types.
The abstraction of interfaces is a nice match to graph theory, and the convenience of running graph algorithms on any type that implements an interface is appealing, but the costs seem too high to me. Slices are rich with capabilities that get hidden behind interfaces, and direct slice manipulation is always faster than going through interfaces. An impediment for programs using the library is that they will generally have to implement a mapping from slice indexes to their application data, often including, for example, some other form of node ID. It seems fair to push this burden outside the graph library; the library cannot know the needs of this mapping.
* Bitsets are widely used, particularly to store one bit of information per node of a graph. I used math/big at first but then moved to a dense bitset of my own. Yes, I considered other third-party bitsets but had my own feature set I wanted. A slice of bools is another alternative. Bools will be faster in almost all cases but the bitset will use less memory. I'm choosing size over speed for now.
* Code generation is used to provide methods that work on both labeled and unlabeled graphs. Code is written to labeled types, then transformations generate the unlabeled equivalents.
* Methods are named for what they return rather than what they do, where reasonable anyway.
* Consistency in method signature and behavior across corresponding methods, for example directed/undirected, labeled/unlabeled, again, as long as it's reasonable.
* Sometimes in tension with the consistency principle, methods are lazy about datatypes of parameters and return values. Sometimes a value might have different reasonable representations; a set might be a bitset, map, slice of bools, or slice of set members, for example. Methods will take and return whatever is convenient for them and not convert the form just for consistency or to try to guess what a caller would prefer.
* Methods return multiple results for whatever the algorithm produces that might be of interest. Sometimes an algorithm will have a primary result but then some secondary values that also might be of interest. If they are already computed as a byproduct of the algorithm, or can be computed at negligible cost, return them.
* Sometimes in conflict with the multiple result principle, methods will not speculatively compute secondary results if there is any significant cost and if the secondary result can be just as easily computed later.

== Code Maintenance

There are tons of cut and paste variants. There's the basic AdjacencyList, then Directed and Undirected variants, then Labeled variants of each of those. Code gen helps avoid some cut and paste, but there's a bunch that doesn't code gen very well and so is duplicated with cut and paste. In particular, the testable examples in the `_test` files don't code gen well and so are pretty much all duplicated by hand. If you change code, think about where there should be variants and go look to see if the variants need similar changes.
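
To make the technique from "Examples with random output" concrete, here is a minimal sketch. The helper `randomPerm` and the example name are made up for illustration and are not part of the library. In a real package the example function would sit in a `_test` file; it is shown here with a `main` only so the sketch runs on its own.

```go
package main

import (
	"fmt"
	"math/rand"
)

// randomPerm is a hypothetical helper, not a library function. It returns a
// permutation of 0..n-1 drawn from a generator with a constant seed.
func randomPerm(seed int64, n int) []int {
	r := rand.New(rand.NewSource(seed))
	return r.Perm(n)
}

// Example_randomPerm would normally live in a _test file. The trailing
// comment starts with "Random output:" rather than "Output:", so go test
// compiles the function but does not run it or compare what it prints.
func Example_randomPerm() {
	fmt.Println(randomPerm(3, 5))
	// Random output:
	// (a permutation of 0..4; the order depends on the Go version)
}

func main() {
	// Called directly here only because this sketch is not a test file.
	Example_randomPerm()
}
```

Changing the comment back to `// Output:` would make go test run the example again and compare the printed text, which is what broke when the generated numbers changed.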
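
The "slices instead of maps" and "interfaces not generally used" principles push the mapping between node numbers and application data onto the caller. The following is a rough sketch of what that caller-side mapping might look like. It assumes an adjacency list is a slice of neighbor slices indexed by node number; the `index` type and the `build` function are hypothetical and only for illustration.

```go
package main

import "fmt"

// AdjacencyList is assumed here to be a slice of neighbor slices, indexed by
// node number. The library's actual type definitions may differ.
type AdjacencyList [][]int

// index is a hypothetical application-side mapping between the
// application's own node IDs (strings here) and slice indexes.
type index struct {
	toNode map[string]int // application ID -> slice index
	toID   []string       // slice index -> application ID
}

// node returns the slice index for id, assigning the next free index the
// first time an ID is seen.
func (x *index) node(id string) int {
	if n, ok := x.toNode[id]; ok {
		return n
	}
	if x.toNode == nil {
		x.toNode = map[string]int{}
	}
	n := len(x.toID)
	x.toNode[id] = n
	x.toID = append(x.toID, id)
	return n
}

// build converts arcs given as pairs of application IDs into a slice-based
// adjacency list plus the mapping needed to translate results back.
func build(arcs [][2]string) (AdjacencyList, *index) {
	x := &index{}
	var g AdjacencyList
	for _, a := range arcs {
		fr, to := x.node(a[0]), x.node(a[1])
		for len(g) < len(x.toID) {
			g = append(g, nil) // grow so every known node has an entry
		}
		g[fr] = append(g[fr], to)
	}
	return g, x
}

func main() {
	g, x := build([][2]string{{"a", "b"}, {"a", "c"}, {"c", "b"}})
	for n, nbs := range g {
		for _, nb := range nbs {
			fmt.Println(x.toID[n], "->", x.toID[nb])
		}
	}
	// Prints:
	// a -> b
	// a -> c
	// c -> b
}
```

The map is only used while translating IDs at the edges of the program. Anything that walks the graph works on plain slice indexes.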
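
For the bitset principle, here is a small sketch of storing one bit per node, packed 64 to a word, where a slice of bools would use a byte per node. The `bits` type and its methods are invented for this illustration and are not the library's bitset; `reachable` is likewise a made-up function that returns its result in whatever form it already has, in the spirit of the "lazy about datatypes" principle.

```go
package main

import "fmt"

// bits is a hypothetical dense bitset, one bit per node, 64 nodes per word.
type bits []uint64

// newBits returns a bitset with room for n nodes, all bits zero.
func newBits(n int) bits { return make(bits, (n+63)/64) }

// set sets the bit for node n.
func (b bits) set(n int) { b[n/64] |= 1 << (uint(n) % 64) }

// has reports whether the bit for node n is set.
func (b bits) has(n int) bool { return b[n/64]&(1<<(uint(n)%64)) != 0 }

// reachable does a depth-first traversal of g from start and returns the
// visited set as a bitset, without converting it to some other form.
func reachable(g [][]int, start int) bits {
	visited := newBits(len(g))
	visited.set(start)
	stack := []int{start}
	for len(stack) > 0 {
		n := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		for _, nb := range g[n] {
			if !visited.has(nb) {
				visited.set(nb)
				stack = append(stack, nb)
			}
		}
	}
	return visited
}

func main() {
	// 0 -> 1 -> 2, and 3 -> 0. Node 3 is not reachable from node 0.
	g := [][]int{{1}, {2}, {}, {0}}
	r := reachable(g, 0)
	for n := range g {
		fmt.Println(n, r.has(n))
	}
}
```

A caller that wants a slice of bools or a list of members can convert the bitset afterward, paying that cost only when it is needed.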