When I look at a corpus of law being worked by a legislature/parliament I see...
- text, lots and lots of text, organized into units of various sizes: sections, bills, titles, chapters, volumes, codes, re-statements etc.
- The units of text are highly stylized, idiomatized, structured forms of natural language.
- The units of text are highly inter-linked : both explicitly and implicitly. Sections are assembled to produce statute volumes, bills are assembled to produce session laws etc. Bills cite to statutes. Journals cite bills. Bills cite bills...
- The units of text have rigorous temporal constraints. I.e. a bill that refers to a statute is referring to a statute as it was at a point in time. An explanation of a vote on a bill is an explanation of a vote as it looked at a particular point in time.
- The law making process consists of taking the corpus of law as it looked at some time T, making some modifications and promulgating a new corpus of law at some future time T+1. That new corpus is then the basis for the next iteration of modifcations.
When I look at a corpus of source code I see...
- text, lots and lots of text, organized into units of various sizes: modules, components, libraries, objects, services etc.
- The units of text are highly stylized, idiomatized, structured forms of natural language.
- The units of text are highly inter-linked : both explicitly and implicitly. Modules are assembled to produce components, components are assembled to produce libraries etc. Source files cite (import and cross-link to) other source files. Header files cite (import and cross-link to) header files. Components cite(instantiate) other components...
- The units of text have rigorous temporal constraints. I.e. a module that refers to a library is referring to a library as it was at a point in time e.g. version 8.2. A source code comment explaining an API call is written with respect to how the API looked at a particular point in time.
- The software making process consists of taking the corpus of source as it looked at some time T, making some modifications and promulgating a new corpus - a build - at some future time T+1. That new corpus (build) is then the basis for the next iteration of modifications to the source code.
What we have here are two communities that work with insanely large, complex corpora of text that must be rigorously managed and changed with the utmost care, precision and transparency of intent. Yet, the software community has a much greater set of tools at its disposal to help out.
Standards and curriculum are in the same boat. Too bad the self-appointed leaders on these issues -- I'm looking at you Achieve -- are such feckless asshats. But what would you expect from an institution that would have the idiot Carcieri on its board.
No comments:
Post a Comment