tig2aprs: Convert US Census TIGER/Line(R) to DOS APRS
Copyright (c) 1996,1997 E. Alan Crosswell
N2YGK
20 June 1996
Introduction
tig2aprs is a utility to convert US Census TIGER/Line(R) 1994
map data into maps suitable for use with the Automatic Packet/Position
Reporting System (APRS) by Bob Bruninga, WB4APR. In the documentation
that follows, you are assumed to be very familiar with APRS, which can be
found at the TAPR FTP site.
tig2aprs is a C program that was developed and has only been
tested under Linux, a free Unix system. If you port it to other platforms,
or otherwise improve upon it, please send me your changes so that I can
incorporate them.
tig2aprs reads US Census TIGER/Line(R) 1994 files and generates DOS/APRS
maps. tig2aprs reads the collection of TIGER/Line files as a single
standard input stream and creates one or more output maps plus one or more
maplists. This is well suited to dealing with the TIGER/Line
data on CD-ROM or even across the network since you can "unzip -p" the
files directly from the CD and pipe them into tig2aprs.
About the US Census TIGER/Line(R) 1994 Files
What Are They
TIGER/Line files consist of point, line, and polygon data that are useful
to graph geographic areas of interest within the USA and its territories.
These graphical elements are supplemented by data useful for
geographically linked statistical analyses, including location information
such as street addresses, ZIP codes, Census blocks, municipalities, etc.
The best thing about TIGER/Line files is that they are in the public domain,
so you can redistribute APRS maps made from them (unlike copyrighted maps
found in commercial packages such as DeLorme Street Atlas,
which are generally derived from TIGER/Line in the first place!).
The map data in TIGER/Line come from US Geological Survey 1:100K Digital
Line Graph (DLG) maps, supplemented with USGS 1:24K maps, US
Census GBF/DIME files, and other sources.
Unlike USGS DLGs, which are organized by 7.5- or 15-minute quadrangles,
TIGER/Line's basic file unit is the county (or county-equivalent).
A new set of TIGER/Line maps appears to come out from the US Census every
couple of years (sets were published in 1992 and 1994, and the next is
expected in the third quarter of 1996). Besides being used for the
decennial census, other US government agencies have collaborated with the
Census Bureau to augment the maps. For example, the US Postal Service
added geographic linking of ZIP+4 codes.
Documentation
Thorough documentation of the TIGER/Line(R) 1994 Files can be found
at the Census Bureau's TIGER/Line home page. It is also included on the
TIGER/Line CD-ROMs. You will need to be familiar with that documentation
to understand what I am babbling about when I talk about "record types" below.
Where to find TIGER/Line files
TIGER/Line files are published by the US Census Bureau periodically.
They are not, as of this writing, available on the Internet. The 1994
set is available in a series of six CD-ROMs that cover the United
States of America, including all its territories. The Census Bureau
sells these CDs for $250 apiece or $1500 for the set!
But, before you change channels, there's hope! Many libraries are
official depositories of US government documents and data. There is likely
a library near you that has a set of TIGER/Line CDs. Whether or not
access to them is practical is another question. An example of a library
that does it "right" is Columbia University's Electronic Data Service,
which provides access to the CDs to members of the general public on
Internet-connected PCs. The CDs are not available to be borrowed, but
you can copy the files you care about onto floppies or FTP them somewhere.
Check your local large public or research university library to see if
they have a similar program in place.
Choosing TIGER/Line files of interest
TIGER/Line files are organized into ZIP archives arranged by FIPS
state and county code. For example, on the appropriate CD-ROM that includes
New York State (FIPS state code 36), Westchester County (FIPS state/county
code 36/119) is located in the file
36/tgr36119.zip
The ZIP archive contains:
TGR36119.F61
TGR36119.F62
TGR36119.F63
TGR36119.F64
TGR36119.F65
TGR36119.F66
TGR36119.F67
TGR36119.F68
TGR36119.F69
TGR36119.F6A
TGR36119.F6C
TGR36119.F6H
TGR36119.F6I
TGR36119.F6P
TGR36119.F6R
TGR36119.F6S
TGR36119.F6Z
The last character of the file extension is the TIGER/Line record type,
so TGR36119.F61 contains the type 1 records for Westchester County, NY.
The complete list of state and county codes is in Appendix A of the
TIGER/Line 1994 documentation file, app_a.asc.
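The naming scheme described above can be sketched in a few lines of shell. This is only an illustration and not part of tig2aprs: the function names tiger_zip_path and record_type are made up for this example, and the sketch assumes a five-digit FIPS code (two state digits followed by three county digits).

```sh
# Hypothetical helpers (not part of tig2aprs) illustrating the file naming.

# tiger_zip_path 36119  prints  36/tgr36119.zip
# Assumes a 5-digit code: 2-digit state followed by 3-digit county.
tiger_zip_path() {
    fips=$1
    state=${fips%???}            # drop the three county digits
    printf '%s/tgr%s.zip\n' "$state" "$fips"
}

# record_type TGR36119.F6P  prints  P
# The record type is the last character of the file name.
record_type() {
    name=$1
    last=${name#"${name%?}"}     # strip everything except the final character
    printf '%s\n' "$last"
}
```

For Westchester County, `tiger_zip_path 36119` gives the archive path shown earlier, and `record_type TGR36119.F61` gives `1`.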
Using tig2aprs
Order of Input Data Matters
tig2aprs reads a stream of TIGER/Line records from stdin, builds
a bunch of internal data structures, crunches for a long time, trying to
reduce the data into something manageable by DOS APRS, and then spits out
a series of gridded, overlapping maps. The order in which the program
reads records is important. If type 1 records are being read in
and a type 2 record shows up, then no further type 1 records will be
accepted. Make sure you concatenate the input files together in the right
order!
The tig2aprs file input ordering requirements are:
- All type 1 records must be read first. These records are the
key to deciding whether to ignore or save selected information from the
other record types.
- All type 2 records are expected to be grouped by TLID and, where a TLID
has multiple type 2 records, sorted by seqno. The
TIGER/Line data is documented as being supplied with this ordering on
the CDs.
- After type 2, the expected order of record types is
'P', 'C', 'S', '7', '8', and '9'. Don't worry if you get the order wrong:
tig2aprs will complain (or at least dump core :-).
You can omit some of the types. For example, type 2 records fill in the
way points between the ends of road "segments" defined by type 1 records.
You can make a more jagged map by not supplying type 2 records.
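If you want to catch ordering mistakes before a long run, the rules above can be checked up front. The following is a rough sketch, not something tig2aprs provides: check_order is a hypothetical helper that takes the record types in the order you plan to feed them and verifies only their relative order, allowing omitted types and repeats (several counties supplying the same type back to back).

```sh
# check_order TYPE...
# Hypothetical pre-flight check (not part of tig2aprs): succeeds only if
# the given record types never go backwards in the sequence 1 2 P C S 7 8 9.
check_order() {
    expected="1 2 P C S 7 8 9"
    last=0
    for t in "$@"; do
        pos=0
        found=0
        for e in $expected; do
            pos=$((pos + 1))
            if [ "$t" = "$e" ]; then
                found=$pos
                break
            fi
        done
        if [ "$found" -eq 0 ]; then
            echo "unknown record type: $t" >&2
            return 1
        fi
        if [ "$found" -lt "$last" ]; then
            echo "record type $t is out of order" >&2
            return 1
        fi
        last=$found
    done
    return 0
}
```

For example, `check_order 1 P C S 7 8 9` succeeds (type 2 omitted), while `check_order 1 2 1` fails because type 1 records reappear after type 2.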
Example
For example, here's how I do this for Westchester County, NY,
which is bordered by something like five or six counties in the
tri-state NY-NJ-CT area. I bring in the other counties' data and then
proceed to ignore most of it 'cuz the map is rectangular and the County
isn't and I want to fill in the edges of the map.
$ cat domap
#!/bin/sh
#36119: Westchester (= ../*1, etc.)
#36087: Rockland
#36071: Orange
#36079: Putnam
#36027: Dutchess
#36111: Ulster
#36061: New York (Manhattan)
#36081: Queens
#36047: Kings (Brooklyn)
#36085: Richmond (Staten Island)
#36005: Bronx
#36059: Nassau
#36103: Suffolk
#09001: Fairfield
#09005: Litchfield
#09009: New Haven
#34003: Bergen
#34031: Passaic
#34013: Essex
#34023: Middlesex
#34039: Union
#more=${more:-""}
two=${two:-"2"}
counties="36005 36087 36071 36079 36061 09001 34003 36059 36081 $more"
for t in 1 $two P C S 7 8 9
do
    # I have Westchester already unzipped in the .. directory
    if [ $t = 1 ]; then
        ../src/fixup <../*1
    else
        cat ../*$t
    fi
    # the other Counties' zip files are copied off CD into /other/tmp.
    for c in $counties
    do
        st=`echo $c|sed -e 's/^\(..\).*$/\1/'`
        if [ $st = 34 -a $t = C ]; then
            :
        else
            unzip -p /other/tmp/$st/tgr$c TGR$c.F6$t
        fi
    done
done | $*
$ domap tig2aprs -ao -r 16 -d 4 -t 4 -p nywc4 -l maplist.wc4 2>log.wc4
Data Reduction
DOS APRS maps are constrained by design to contain no more than 2999
data points per map. This permits the program to work even on
small PCs. About 180 maps can be in a MAPLIST, so this limit isn't really
all that much of a problem for detailed mapping, which is what tig2aprs
is all about. Most of what tig2aprs does is figure out how to stay
within the points limit. [Note for WinAPRS users: tig2aprs can generate
unlimited-size DOS APRS-format maps which WinAPRS can read, so it is possible
to generate single large, detailed maps for use with WinAPRS.]
The data reduction techniques tig2aprs uses include:
- Feature Cutoffs:
TIGER/Line identifies all features with a Census Feature Class Code (CFCC),
which distinguishes Interstate highways from dirt roads, for example.
tig2aprs has a table of feature cutoffs to use based on the level of
detail requested.
- Water Filtering:
There are many "trivial" water features such as small lakes and ponds that
are unfortunately not properly identified as such with a CFCC. These features
lead to noisy maps and waste precious data points.
The tig2aprs water filter will drop all water features that cover less
than a specified percentage of the map area.
- Joining Segments:
Pieces of roads or segments (TIGER/Line jargon for these is Complete
Chains) are joined together, eliminating unnecessary duplicate data points
at each intersection.
- Line Smoothing:
tig2aprs includes an algorithm to "straighten" out wiggly lines,
thereby eliminating intermediate wiggles and their associated data points.
- Point Fuzzing:
tig2aprs allows you to control the "focus" such that data points that
are "near" each other are considered equivalent, allowing for more point
elimination.
- Tiling:
If the above-mentioned point reduction techniques still fail to get down
below 3000 points in a map, tig2aprs is able to recursively
split a map into quadrants and so on until each resulting map covers a
small enough area that the maximum points limit is achieved.
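To get a feel for how many maps tiling can produce, here is a back-of-the-envelope sketch. It is not taken from the tig2aprs source: worst_case_tiles is a hypothetical helper, and it assumes whole-mile radii, that each split halves the radius, and that every tile needs splitting all the way down to the minimum tile radius. In practice only tiles that exceed the point limit would be split, so the real count should be lower.

```sh
# worst_case_tiles RANGE MIN_TILE
# Hypothetical estimate (not part of tig2aprs): prints the worst-case
# number of leaf tiles and their radius, assuming every tile is split
# into quadrants until halving again would go below MIN_TILE.
worst_case_tiles() {
    range=$1
    min=$2
    tiles=1
    while [ "$range" -ge $((2 * min)) ]; do
        range=$((range / 2))     # each quadrant has half the radius
        tiles=$((tiles * 4))     # one map becomes four quadrants
    done
    echo "$tiles $range"
}
```

With the values from the Westchester example (-r 16 -t 4), `worst_case_tiles 16 4` prints `16 4`: at most sixteen leaf tiles of radius 4 miles, before counting any overlap maps.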
Seamless Map Transitions
When DOS APRS maps are tiled, it is necessary to overlap them as well
since APRS will only display a map if its borders entirely cover the
screen at the current zoom. In practice, this means that a map of
range N will only display when APRS is zoomed in to N/2 and the display
"window" is completely within the map borders. By overlapping each map
with three other maps by 50% each, it is possible to have seamless
panning across an area; APRS automatically reloads the next overlapping
map at the same level of detail. tig2aprs will automatically
generate the three overlap maps for each base map.
Labels
The main advantage of TIGER/Line over standard USGS DLG maps is the
added value of place and landmark labels. tig2aprs generates
a variety of types of labels, using the DOS APRS special symbols,
for places, hospitals, schools, airports, cemeteries, parks, etc.
Labels for areas (towns, villages, parks, etc.) are automatically
centered within the visible portion of an area on a given map.
Multiword labels (names longer than 12 characters) are broken up and stacked
to get around the DOS APRS limit.
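The stacking idea can be sketched as a greedy word wrap. This is only an illustration under assumptions: stack_label is a hypothetical helper, the exact splitting rules tig2aprs uses are not documented here, and a single word longer than the width is simply left on its own line.

```sh
# stack_label NAME [WIDTH]
# Hypothetical sketch (not the tig2aprs algorithm): print NAME as one or
# more lines, packing whole words greedily so each line is at most WIDTH
# characters (default 12). Words are split on whitespace; names containing
# shell glob characters are not handled.
stack_label() {
    width=${2:-12}
    line=
    for word in $1; do
        if [ -z "$line" ]; then
            line=$word
        elif [ $(( ${#line} + 1 + ${#word} )) -le "$width" ]; then
            line="$line $word"
        else
            printf '%s\n' "$line"
            line=$word
        fi
    done
    if [ -n "$line" ]; then
        printf '%s\n' "$line"
    fi
}
```

For instance, `stack_label 'White Plains High School'` prints "White Plains" and "High School" on separate lines, each within the 12-character limit.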
Runtime Options
Here's some documentation of the runtime options:
Usage: tig2aprs [-vjnoaCTD] -c lat,lon -r range -d detail -t min_tile
-f fuzz -F maxfuzz -w filt% -W maxfilt% -p map_prefix
-s slopefuzz -S maxslope -l maplist -m maxmaplist -M maxAPRSpoints
-R resolution -L places|landmarks|kgls
-v = verbose (more -v's for more verbose)
-j = flip and join road segments
-n = match segments by name as well as CFCC
-o = make three overlapping maps
-a = all the usual (same as -vjnC -w .05 -W .1 -s .05 -S .2)
-T = print all a road's TLIDs in comment
-D = create a *.dat file before fuzzing a map
-c = map center lat,lon in decimal degrees
-r = map radius in miles
-d = level of map detail in miles
-t = make tiles no smaller than this radius in miles
-f = initial map fuzziness (reducing # of points)
-F = worst map fuzziness
-w = toss lakes smaller than x% of map
-W = worst lake fuzz%
-s = line smoothing factor
-S = worst line smoothing factor
-p = filename prefix for map files (default is 'map')
-l = maplist filename
-m = max map names per maplist before splitting up
-M = max APRS/DOS map points (default 2999)
-R = resolution (a/k/a pixels per degree)
-L = label places, landmarks, or key geographic locations (kgls)
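Since -a is shorthand for several other options, it can help to see the Westchester command line from the earlier example with -a spelled out. The sketch below only prints the expanded command and does not run tig2aprs; expanded_cmd is a hypothetical name, and the expansion is taken directly from the usage text above.

```sh
# expanded_cmd
# Hypothetical helper: prints (does not run) the Westchester example
# command with "-ao" written out as -vjnC -w .05 -W .1 -s .05 -S .2 -o.
expanded_cmd() {
    echo tig2aprs -vjnC -w .05 -W .1 -s .05 -S .2 -o \
        -r 16 -d 4 -t 4 -p nywc4 -l maplist.wc4
}
```

Writing the options out this way makes it easier to change one setting, such as the water filter percentage, while keeping the rest of the usual defaults.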
Customization
Unfortunately, a bit of customization of tig2aprs requires modifying
some source code. Making this driven by a config file is on the "to do" list,
but, for now, take a look at the comments in the source, especially
defcuts, cfccrange, and fipsrange.
aprs.tk
aprs.tk is a simple Tcl/Tk script that allows one to view an
APRS map without having to run DOS APRS. It is a very rudimentary tool
and could use a lot of improvement.