MaNIS Georeferencing Discussion Archive

 

Following are extracts of the Georeferencing Listserv discussions accumulated during the MaNIS georeferencing project. Missing postings were not relevant to georeferencing in perpetuity. Messages have been edited to protect the guilty by masking names of individuals with XXXXXX.

 

>>> Posting number 1, dated 17 Jul 1999 14:12:50

 

-----------------------------------------------------------------------------

 

>>> Posting number 2, dated 17 Jul 1999 14:15:23

 

 

-----------------------------------------------------------------------------

 

>>> Posting number 3, dated 17 Jul 1999 14:16:03

 

 

 

-----------------------------------------------------------------------------

 

>>> Posting number 4, dated 17 Jul 1999 14:19:25

 

 

-----------------------------------------------------------------------------

 

>>> Posting number 5, dated 17 Jul 1999 14:19:59

 

 

-----------------------------------------------------------------------------

 

>>> Posting number 6, dated 17 Jul 1999 14:26:41

 

 

 

-----------------------------------------------------------------------------

 

>>> Posting number 7, dated 17 Jul 1999 14:22:50

 

 

-----------------------------------------------------------------------------

 

>>> Posting number 8, dated 17 Jul 1999 14:23:12

 

-----------------------------------------------------------------------------

 

>>> Posting number 9, dated 19 Jul 1999 09:29:01

-----------------------------------------------------------------------------

 

>>> Posting number 10, dated 23 Jul 1999 16:35:41

 

>>> Posting number 11, dated 3 Sep 1999 16:17:55

 

>>> Posting number 12, dated 17 Sep 1999 15:19:38

 

>>> Posting number 13, dated 17 Sep 1999 13:13:14

 

>>> Posting number 14, dated 17 Sep 1999 14:57:30

 

>>> Posting number 15, dated 20 Sep 1999 09:04:17

 

>>> Posting number 16, dated 24 Sep 1999 17:01:21

 

>>> Posting number 17, dated 28 Sep 1999 12:50:27

 

>>> Posting number 18, dated 15 Oct 1999 19:37:37

 

>>> Posting number 19, dated 17 Oct 1999 16:37:27

 

>>> Posting number 20, dated 18 Oct 1999 16:50:30

 

>>> Posting number 21, dated 19 Oct 1999 11:15:26

 

>>> Posting number 22, dated 19 Oct 1999 16:35:19

 

>>> Posting number 23, dated 20 Oct 1999 15:51:18

 

>>> Posting number 24, dated 20 Oct 1999 11:34:55

 

>>> Posting number 25, dated 20 Oct 1999 16:00:18

 

>>> Posting number 26, dated 10 Nov 1999 10:52:01

 

>>> Posting number 27, dated 10 Nov 1999 13:54:04

 

>>> Posting number 28, dated 17 Nov 1999 15:12:19

 

>>> Posting number 29, dated 18 Nov 1999 12:38:15

 

>>> Posting number 30, dated 18 Nov 1999 10:08:56

 

>>> Posting number 31, dated 18 Nov 1999 13:22:25

 

>>> Posting number 32, dated 19 Nov 1999 14:35:52

 

>>> Posting number 33, dated 3 Dec 1999 10:21:24

 

>>> Posting number 34, dated 3 Jan 2000 11:48:10

 

>>> Posting number 35, dated 3 Jan 2000 16:24:25

 

>>> Posting number 36, dated 18 May 2000 16:51:23

 

>>> Posting number 37, dated 18 May 2000 19:49:29

 

>>> Posting number 38, dated 23 May 2000 18:41:45

 

>>> Posting number 39, dated 24 May 2000 09:38:19

 

-----------------------------------------------------------------------------

 

>>> Posting number 40, dated 24 May 2000 12:15:39

 

>>> Posting number 41, dated 12 Jun 2000 15:45:50

 

>>> Posting number 42, dated 13 Jun 2000 09:31:26

 

>>> Posting number 43, dated 13 Jun 2000 09:59:02

 

>>> Posting number 44, dated 13 Jun 2000 09:17:08

 

>>> Posting number 45, dated 13 Jun 2000 07:49:43

 

>>> Posting number 46, dated 13 Jun 2000 09:04:22

 

>>> Posting number 47, dated 13 Jun 2000 08:54:22

 

>>> Posting number 48, dated 13 Jun 2000 11:11:31

 

>>> Posting number 49, dated 13 Jun 2000 13:23:46

 

>>> Posting number 50, dated 30 Jun 2000 16:25:38

 

>>> Posting number 51, dated 30 Jun 2000 17:14:31

 

>>> Posting number 52, dated 30 Jun 2000 23:29:35

 

>>> Posting number 53, dated 1 Jul 2000 07:35:15

 

>>> Posting number 54, dated 4 Jul 2000 11:04:23

 

>>> Posting number 55, dated 4 Jul 2000 10:07:33

 

>>> Posting number 56, dated 6 Jul 2000 00:00:0/

 

>>> Posting number 57, dated 5 Jul 2000 19:40:11

 

>>> Posting number 58, dated 5 Aug 2000 09:24:55

 

>>> Posting number 59, dated 5 Aug 2000 12:31:07

 

>>> Posting number 60, dated 7 Aug 2000 13:45:33

 

>>> Posting number 61, dated 15 Aug 2000 21:54:23

 

>>> Posting number 62, dated 23 Aug 2000 16:24:48

 

>>> Posting number 63, dated 30 Aug 2000 11:20:17

 

>>> Posting number 64, dated 22 Sep 2000 09:36:34

 

>>> Posting number 65, dated 29 Sep 2000 08:51:23

 

>>> Posting number 66, dated 2 Oct 2000 10:35:12

 

>>> Posting number 67, dated 5 Oct 2000 09:40:24

 

>>> Posting number 68, dated 17 Oct 2000 18:13:33

 

>>> Posting number 69, dated 1 Nov 2000 07:48:24

 

>>> Posting number 70, dated 1 Nov 2000 08:06:24

 

>>> Posting number 71, dated 28 Nov 2000 18:26:18

 

>>> Posting number 72, dated 29 Nov 2000 21:09:35

 

>>> Posting number 73, dated 30 Nov 2000 08:31:10

 

>>> Posting number 74, dated 30 Nov 2000 11:33:07

 

>>> Posting number 75, dated 14 Dec 2000 20:41:28

 

>>> Posting number 76, dated 15 Dec 2000 07:59:04

 

>>> Posting number 77, dated 26 Apr 2001 09:00:01

 

>>> Posting number 78, dated 16 May 2001 18:29:45

 

>>> Posting number 79, dated 16 May 2001 17:36:59

 

>>> Posting number 80, dated 18 May 2001 08:29:49

 

>>> Posting number 81, dated 24 May 2001 10:19:20

 

>>> Posting number 82, dated 25 May 2001 09:43:37

 

>>> Posting number 83, dated 11 Jun 2001 12:01:03

 

>>> Posting number 84, dated 11 Jun 2001 15:02:51

 

>>> Posting number 85, dated 11 Jun 2001 15:44:56

 

>>> Posting number 86, dated 29 Jun 2001 21:12:37

 

>>> Posting number 87, dated 4 Jul 2001 14:24:24

Date:         Wed, 4 Jul 2001 14:24:24 -0700

Reply-To:     "Mammalogy Z39.50 Network (Private)" <MAMMAL-Z-NET@USOBI.ORG>

Sender:       "Mammalogy Z39.50 Network (Private)" <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: ROM higher geography

In-Reply-To:  <sb433743.076@romfs7.rom.on.ca>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

I'm posting the following exchange to the list because there is information

contained herein that is relevant to everyone. The basic concepts of data

cleanliness, the gazetteer, and data updates are addressed in brief.

 

 

>Once I began working on the Bukedi inconsistency (2nd in your list) I saw

>that your methodology is missing many more errors/inconsistencies that

>exist in County and Province data.

 

Understood.  My analysis reveals only the duplicates of

ORCT+ORCRY+ORPR+ORCY

 

I understand that there may be many other errors and inconsistencies in the

original data, but that is not a concern for the gazetteer.  In fact, the

duplicates I pointed out aren't a problem either. I just wanted to alert

you to them since they came out in my analysis.

 

>   The errors and inconsistencies are a direct reflection of the state of

> documentation on field catalogues or specimen cards, depending on the

> source of the automated record.  We did not have the resources at the

> time of automation (nor do we now for that matter) to resolve what is a

> "Province" term and what is a "County" term for all

> countries.  Additionally, we are looking at historical data that may no

> longer be reflected in the current political reality of our little world

> (e.g., USSR, Northwest Territories).  I have cleaned up data fields that

> are used routinely to manage the collection and retrieve data.  Continent

> and Country should be clean.  The Province field should be clean for

> Canada (I haven't had the time to tackle NWT yet), USA, and Mexico.  I

> just finished cleaning up the Province field for Guyana as well.  The

> County field should be clean for Ontario.  I now periodically print out

> frequency listings for Country etc. for these priority sections of the db

> (and collection) in an effort to maintain the consistency of our

> data.  For all other geographic locations, Province and County are not

> used for managing the collection, so the data clean up or enhancement has

> been a low priority.  This is an ongoing situation that I have discussed

> with Judith with regard to the Manis Project.  My understanding is that

> funding for documentational and staffing resources will be part of this

> "mission".  I am afraid your listing of 13 inconsistencies barely

> scratches the surface of the data cleaning that is required and even more

> importantly, misses all kinds of erroneous or missing data.  I currently

> do not have the maps, atlases, or gazetteers nor the staff/time to

> undertake this project which from a collections' perspective is of low

> priority.  To do a proper job I cannot resolve all of the problems that

> you have identified without undertaking a full review of the entire

> country's data.

 

There is no requirement for any standard of cleanliness. It is my hope that

errors and inconsistencies will be noted during georeferencing and

forwarded to the attention of the institutions as a part of that

process.  The tools are meant to identify the inconsistencies, not to

remedy them. What the institutions do with these notes is entirely up to them.

 

>I am not sure what you are currently attempting to do with the data so we

>may need to further discuss our respective needs to ensure that we are not

>working at cross purposes.  If work is to be globally undertaken, I would

>like our data to be the db of record - making long lists of changes for

>you to then repeat is a waste of effort and time; you will see the work

>generated by having two dbs of record by the simple changes that I have

>made this afternoon.  Also, errors in interpretation or typos that are

>bound to occur should be avoided.  Finally, the data you have is already

>out of date, since changes are made by me on a daily basis as errors etc.

>are encountered during the normal activities of managing the collection,

>fulfilling data requests, etc.

 

The institutional databases will always be the database of record.  The

data I have from all of the institutions is just a snapshot, to be used for

georeferencing. I will not ask for these data again during the project, nor

will I make changes to the data I have received.  When we have a network,

the gazetteer will be created and updated automatically whenever data

change and the snapshot will be obsolete.  I've only created the snapshot

so that we have combined data to work with. When people begin to do

georeferencing using the gazetteer they will not change the data - they

will only make commentaries.  Even the latitude and longitude are

commentaries in a sense. It is up to each institution to accept or reject

the commentaries and make changes based on them in its database.

 

 

>Regards,

 

 

> 

> >>> John Wieczorek <tuco@socrates.Berkeley.EDU> 07/02/01 08:50PM >>>

>Attached is a tab-delimited file with the first row containing column

>headings. The contents of the file are combinations of higher geographic

>fields for which you have more than one interpretation in your

>database.  The first field (highergeog) is a concatenation of the fields of

>higher geography that reveal duplication. The second field (geogid) is an

>identifier unique to the ROM higher geography data with one row for every

>unique combination of ORCT, ORCRY, ORPR, and ORCY.  As you can see by the

>rows in the table, there are 13 places for which there are inconsistent

>placements of county vs. province, for example.  It is not critical for my

>purposes to have these resolved, but since I noticed them I thought I might

>as well tell you.  If you do make changes to these combinations, let me

>know which are correct and I'll do so on this end as well.
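
The duplicate check described in the quoted message above can be sketched roughly as follows. This is only an illustration, assuming that "highergeog" is built by concatenating the non-empty values of ORCT, ORCRY, ORPR, and ORCY in that order; the function name and the sample rows are made up and are not taken from the actual file.

```python
from collections import defaultdict


def find_inconsistent_placements(rows):
    """Group (ORCT, ORCRY, ORPR, ORCY) rows by their concatenated non-empty values.

    Two different rows that concatenate to the same string put the same
    names into different fields (county vs. province, for example).
    """
    groups = defaultdict(set)
    for row in rows:
        cleaned = tuple(value.strip() for value in row)
        highergeog = " ".join(value for value in cleaned if value)
        groups[highergeog].add(cleaned)
    # Keep only the concatenations that have more than one interpretation.
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


# Made-up sample rows, for illustration only.
sample_rows = [
    ("Africa", "Uganda", "Bukedi", ""),
    ("Africa", "Uganda", "", "Bukedi"),
    ("North America", "Canada", "Ontario", "Algoma"),
]
inconsistent = find_inconsistent_placements(sample_rows)
```

With these sample rows, only the "Africa Uganda Bukedi" combination is reported, because the same name appears once in the third field and once in the fourth.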

 

>>> Posting number 88, dated 10 Jul 2001 12:01:24

Date:         Tue, 10 Jul 2001 12:01:24 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      cave localities

Mime-version: 1.0

Content-type: text/plain; charset="US-ASCII"

Content-transfer-encoding: 7bit

 

I've noticed that the USGS GNIS web site does not give information on cave

sites.  (It does give locations of variants such as Boulder Cave

Campground.)  Is this a protocol we wish to follow?  Are there other web

sites that do list cave localities?  What do you think?

 

Cheers,

 

>>> Posting number 89, dated 10 Jul 2001 13:40:25

Date:         Tue, 10 Jul 2001 13:40:25 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Filtering data

In-Reply-To:  <sb4b0d4a.070@romfs7.rom.on.ca>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

This message is in reply to a comment about

records for captive animals.

 

>I would recommend that you do not use any captive records for a

>gazetteer.  Does that make sense?

 

In a restricted view of the utility of a gazetteer it does make sense to

exclude them. However, it is actually easier to include them, yet have them

flagged. This has the benefit that one can filter on the captive attribute.

This could be useful if you wanted to do a quick query of only captive

animals as well as for a query in which you want to leave them out.  The

philosophy in general will be to have a home for all data that anyone deems

useful, yet to allow each institution to decide which data it will provide

through the filters implemented during migration.

 

A filter might do any one of the following:

1) exclude attributes altogether (e.g., not show a "CaptiveFlag" field)

2) exclude records based on the value of an attribute (e.g., not show

records of endangered species)

3) exclude certain values of an attribute (e.g., not show localities for

endangered species)

4) substitute a surrogate value for an attribute of a certain value (e.g.,

instead of showing locality with lat-long, show only county-level and

higher geography for endangered species)
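
The four kinds of filter listed above could be sketched as below. This is a minimal illustration, not the migration code: records are assumed to be plain dictionaries, and every field name other than "CaptiveFlag" (as well as the sample values) is hypothetical.

```python
def exclude_attribute(record, attribute):
    """Filter 1: do not show an attribute at all."""
    return {key: value for key, value in record.items() if key != attribute}


def exclude_records(records, predicate):
    """Filter 2: do not show records for which the predicate is true."""
    return [record for record in records if not predicate(record)]


def withhold_value(record, attribute, predicate):
    """Filter 3: blank out one attribute's value when the predicate is true."""
    if predicate(record):
        return {**record, attribute: None}
    return dict(record)


def substitute_value(record, attribute, predicate, surrogate):
    """Filter 4: replace one attribute's value with a coarser surrogate."""
    if predicate(record):
        return {**record, attribute: surrogate(record)}
    return dict(record)


def is_endangered(record):
    return record.get("Endangered", False)


# Made-up sample records, for illustration only.
records = [
    {"Species": "species A", "Endangered": True, "CaptiveFlag": False,
     "Locality": "10 mi E of Bakersfield", "County": "Kern"},
    {"Species": "species B", "Endangered": False, "CaptiveFlag": True,
     "Locality": "Bakersfield", "County": "Kern"},
]

without_flag = [exclude_attribute(r, "CaptiveFlag") for r in records]
not_endangered = exclude_records(records, is_endangered)
hidden = [withhold_value(r, "Locality", is_endangered) for r in records]
coarse = [substitute_value(r, "Locality", is_endangered,
                           lambda r: r["County"] + " County") for r in records]
```

Each function returns new data and leaves the input untouched, which fits the idea that filters only shape what an institution provides and never change its database of record.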

 

These are just a few examples of what might be done at one institution, and

may vary between institutions.  I encourage the participants to discuss

these issues, and to begin to make institutional decisions about filtering

rules when it comes time to set up the migration.  The rules must be

clearly defined before I begin to create the creation scripts - I can't

afford to stay at any given institution (except maybe Hawaii, heh heh),

while the rules are being hashed out.

 

>>> Posting number 90, dated 8 Aug 2001 13:10:05

 

>>> Posting number 91, dated 14 Sep 2001 08:48:17

 

>>> Posting number 92, dated 23 Sep 2001 17:24:24

 

>>> Posting number 93, dated 24 Sep 2001 20:07:31

Date:         Mon, 24 Sep 2001 20:07:31 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing Guidelines

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

Now that we are officially up and running I would like to provide the first

of two documents on the MaNIS collaborative georeferencing effort.  This

first document is meant to open for discussion the issues associated with

turning specific locality descriptions into well-documented latitudes and

longitudes.  The document does not explain what tools to use, or how to use

any of them - that will be in a forthcoming document. Instead, this

document focuses on the "theoretical aspects" of the task, our methods and

assumptions, upon which it would be helpful for us all to agree.  To that

end, please read the Georeferencing Guidelines page, accessible from the

Documents page on the MaNIS website (see below).  Comment by sending

messages to MAMMAL-Z-NET@USOBI.ORG. Let's try to get through this

discussion by 6 Oct.

 

http://dlp.cs.berkeley.edu/manis/Documents.html

 

Anticipating your enthusiastic participation,

 

John Wieczorek

 

>>> Posting number 94, dated 25 Sep 2001 18:30:16

Date:         Tue, 25 Sep 2001 18:30:16 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing text, for reference

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

It was pointed out to me that it might be prudent to have a text-only copy

of the document, with line numbers, to which everyone can refer in

discussions.  I am including the full text of the GeorefGuide.html file

below for that purpose.  The page itself can be found at the following URL:

 

http://dlp.cs.berkeley.edu/manis/GeorefGuide.html

 

 

   1 MaNIS

   2 The Mammal Networked Information System

   3

   4 John Wieczorek

   5 24 September 2001

   6 _________________________________________________

   7

   8 Georeferencing Guidelines

   9

  10 This document contains information about assigning geographic

  11 coordinates and maximum errors for those coordinates to specific

  12 locality descriptions. This document does not attempt to

  13 describe the tools and methods for finding named places on maps

  14 or gazetteers. The process of assigning coordinates and errors,

  15 called georeferencing, can be rather complicated. The complexity

  16 of the process can be greatly reduced and the consistency of the

  17 results can be greatly increased by establishing simple

  18 guidelines that cover most commonly encountered locality

  19 descriptions. The guidelines for assigning coordinates for named

  20 places are presented with examples in the section Determining

  21 Latitude & Longitude.

  22

  23 There are several fundamental sources of error for specific

  24 locality descriptions, and these vary in magnitude. It is

  25 essential during georeferencing to determine and record the

  26 greatest source of error among all possible sources. There are

  27 numerous ways in which the maximum error of a geographic

  28 coordinate might be expressed, but the most convenient is as a

  29 distance, because its size and shape are constant over any

  30 geodetic surface model. The sources of error and their

  31 magnitudes are discussed primarily in the section Determining

  32 Error.

  33

  34 An Appendix containing a description of the data that should be

  35 captured for each georeferenced locality, a glossary, and

  36 references are appended for the convenience of the reader.

  37

  38 Determining Latitude & Longitude

  39

  40 Geographic coordinates can be expressed in a number of different

  41 coordinate systems (e.g. decimal degrees, degrees minutes

  42 seconds, degrees decimal minutes, UTM, etc.). Conversions can be

  43 made readily between coordinate systems, but decimal degrees

  44 provide the most convenient coordinates to use for

  45 georeferencing for no more profound a reason than that a

  46 specific locality can be described with only two attributes:

  47 decimal latitude and decimal longitude.

  48

  49 Named Places

  50

  51 The simplest of specific locality descriptions consist of only a

  52 named place. Use the geographic center of a named place for the

  53 latitude and longitude, and use the distance from that point to

  54 the furthest point within that named place for the maximum error

  55 distance. If the geographic center of the named place is not

  56 within the confines of the shape of the named place, use the

  57 point nearest to the geographic center that lies within the

  58 shape.

  59

  60 Example: "Bakersfield"
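
As a rough sketch of the named-place rule, the code below takes the geographic center as the mean of the latitude and longitude extremes (the Glossary definition) and uses the distance to the farthest boundary vertex as the maximum error. It assumes a spherical Earth, a boundary given as a list of (latitude, longitude) vertices in decimal degrees, and a shape that does not cross the 180th meridian. It does not handle the case where the center falls outside the shape. The boundary coordinates are made up.

```python
import math

EARTH_RADIUS_MI = 3958.8  # mean Earth radius in miles (spherical approximation)


def haversine_mi(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MI * math.asin(math.sqrt(a))


def geographic_center(boundary):
    """Mean of the latitude extremes and mean of the longitude extremes."""
    lats = [lat for lat, _ in boundary]
    lons = [lon for _, lon in boundary]
    return (min(lats) + max(lats)) / 2, (min(lons) + max(lons)) / 2


def named_place_point_and_error(boundary):
    """Return the center and the distance (mi) to the farthest boundary vertex."""
    center_lat, center_lon = geographic_center(boundary)
    max_error = max(haversine_mi(center_lat, center_lon, lat, lon) for lat, lon in boundary)
    return (center_lat, center_lon), max_error


# Made-up rectangular outline, for illustration only.
outline = [(35.30, -119.10), (35.30, -118.90), (35.45, -118.90), (35.45, -119.10)]
center, max_error_mi = named_place_point_and_error(outline)
```

Only the vertices are checked because, for a small polygon, the point farthest from the center lies at a vertex.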

  61

  62 Township Range Section (TRS) descriptions are essentially no

  63 different from that of any other named place. It is necessary to

  64 understand how TRS descriptions work and how they describe a

  65 place. See the References section, below, for links to TRS

  66 information.

  67

  68 Example: "E of Bakersfield, T29S R29E Sec. 34 NE 1/4"

  69

  70 Offsets

  71

  72 Offsets generally consist of combinations of distances and

  73 directions from a named place. Use the geographic center of the

  74 named place in the direction of the offset as a starting point.

  75 Unless there is contrary information in the locality

  76 description, measure the distance in the offset direction to

  77 find the spot for the geographic coordinates. Offsets that do

  78 not explicitly say that they were measured by air or by some

  79 contour (e.g., by road, river, valley, etc.) should be

  80 determined as if by air in a straight line.

  81

  82 Example: "10 mi E (by air) Bakersfield"

  83

  84 Example: "10 mi E of Bakersfield"
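
A straight-line ("by air") offset can be sketched as follows, assuming a spherical Earth and the standard great-circle destination formula. The starting coordinates are hypothetical stand-ins for the geographic center of the named place, and the function and table names are illustrative.

```python
import math

EARTH_RADIUS_MI = 3958.8  # mean Earth radius in miles (spherical approximation)

# Compass bearings in degrees clockwise from north.
BEARINGS = {"N": 0.0, "NE": 45.0, "E": 90.0, "SE": 135.0,
            "S": 180.0, "SW": 225.0, "W": 270.0, "NW": 315.0}


def offset_point(lat, lon, distance_mi, direction):
    """Return (lat, lon) reached by travelling distance_mi in the given direction."""
    bearing = math.radians(BEARINGS[direction])
    delta = distance_mi / EARTH_RADIUS_MI  # angular distance in radians
    phi1 = math.radians(lat)
    lam1 = math.radians(lon)
    phi2 = math.asin(math.sin(phi1) * math.cos(delta)
                     + math.cos(phi1) * math.sin(delta) * math.cos(bearing))
    lam2 = lam1 + math.atan2(math.sin(bearing) * math.sin(delta) * math.cos(phi1),
                             math.cos(delta) - math.sin(phi1) * math.sin(phi2))
    return math.degrees(phi2), math.degrees(lam2)


# Hypothetical center coordinates, for illustration only.
start_lat, start_lon = 35.37, -119.02
east_lat, east_lon = offset_point(start_lat, start_lon, 10, "E")
north_lat, north_lon = offset_point(start_lat, start_lon, 10, "N")
```

Over offsets of a few miles the difference between a great circle and a line of constant compass heading is negligible compared with the other sources of error.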

  85

  86 However, if there is no mention of the mode of measurement in

  87 the locality description, but the measurement includes fractions

  88 (e.g., 10.2 miles) and there is a road in the vicinity, use road

  89 miles. Offsets that were described in the specific locality as

  90 being measured by road should be determined using the contours

  91 of the road rather than using a straight line. The methods for

  92 determining the maximum error distances for these types of

  93 specific locality descriptions are given in the Determining

  94 Error section, below.

  95

  96 Example: "10.2 mi E of Bakersfield"

  97

  98 Example: "13 mi E (by road) Bakersfield"

  99

100 Vagueness

101

102 At times, specific locality descriptions are fraught with

103 vagueness. It is not the purpose here to belittle localities of

104 this type; in fact, an honest admission of the unknown is

105 preferable to masking it with unwarranted precision.

106

107 The most important type of vagueness in a specific locality

108 description is one in which the locality is in question. No such

109 locality should be georeferenced.

110

111 Example: "Bakersfield?"

112

113 Many locality descriptions imply an offset from a named place

114 without definitive directions or distances. Use the geographic

115 center of the named place for the geographic coordinates. For

116 the maximum error distance, use the greatest distance that is

117 not likely to be considered in the area of another named place.

118 Clearly there is a measure of subjectivity involved here. Let

119 common sense prevail and document the assumptions made.

120

121 Example: "near Bakersfield"

122

123 Sometimes offset information is vague either in its direction or

124 in its distance. If the direction information is vague, record

125 the geographic coordinates of the center of the named place and

126 add the offset distance to the greatest extent of the named

127 place to get the maximum error distance.

128

129 Example: "5 mi from Bakersfield"

130

131 Uncertainty in the offset distance is a fact of the business.

132 Almost no localities are recorded with error estimates,

133 therefore every offset distance is inherently uncertain. The

134 addition of a modifier in the locality description, while an

135 honest observation, should not change the determination of the

136 geographic coordinates or of the maximum error.

137

138 Example: "about 3 mi E of Bakersfield"

139

140 The worst of situations arises when a specific locality

141 description is internally inconsistent. There are numerous

142 possible causes for inconsistencies. It is the task of those

143 georeferencing to determine the part of the description most

144 likely to be in error, ignore it for the purpose of the

145 determination, and document the decision to do so. The most

146 common source of inconsistency in a locality description comes

147 from trying to match elevation information with the rest of the

148 description. If there is no reasonable way to reconcile the

149 discrepancy, ignore the elevation.

150

151 Example: "10 mi W of Bakersfield, 6000 ft"

152

153 Determining Error

154

155 The process of georeferencing includes an assessment of the

156 possible sources of error in a geographic coordinate

157 determination. Errors may arise due to the extent of a locality,

158 due to unspecified precision in original measurements (distance

159 precision and directional precision), or due to not knowing the

160 datum under which coordinates were determined. It is essential

161 to determine which of these yields the greatest error and record

162 that value as the maximum error distance. Potential error

163 sources and guidelines for determining the magnitude of each for

164 a given specific locality are given in the paragraphs below.
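
The rule of recording the greatest of the error sources can be sketched as below. The source names and the numbers are hypothetical, and all values are assumed to be already expressed in the same units.

```python
def maximum_error_distance(**source_errors):
    """Return (largest error, name of the source that produced it)."""
    if not source_errors:
        raise ValueError("at least one error source is required")
    name = max(source_errors, key=source_errors.get)
    return source_errors[name], name


# Hypothetical error estimates in miles, for illustration only.
error_mi, dominant_source = maximum_error_distance(
    extent=3.0,
    distance_precision=1.0,
    directional_precision=4.142,
    datum=0.05,
)
```

Returning the name of the dominant source alongside the value makes it easy to note in the Remarks field which source determined the recorded error.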

165

166 Error due to the shape of a locality

167

168 Named places are not single points; they have extents. If a

169 locality description is no more specific than to describe a

170 named place or an offset from a named place, then the size of

171 the named place is a source of error. The treatment of error due

172 to the extent of a locality is described under the examples of

173 determining latitude and longitude, above.

174

175 Error due to an unknown datum

176

177 Seldom have geographic coordinates been recorded for a locality

178 in a natural history collection in which the underlying datum of

179 the coordinate system was given. Even now, when GPS coordinates

180 are being taken as definitive evidence of a location, the

181 geodetic datum is being ignored. Without recording the datum

182 with the coordinates, potential accuracy is being lost. Figure 1

183 shows the magnitude of error (in meters) over North America

184 based on not knowing the datum from which the coordinates were

185 taken.

186

187 [datumerror.jpg]

188

189 Figure 1. Map of North America showing the magnitude of

190 potential error from not knowing whether coordinates were taken

191 from NAD27, NAD83, or WGS84.

192

193 This map can be used as a rough guide for determining the

194 magnitude of error due to not knowing the datum from which the

195 geographic coordinates were recorded.

196

197 Precision

198

199 Precision is difficult to gauge from specific locality

200 descriptions; it may be reflected in the locality description,

201 but it is seldom, if ever, explicitly recorded. Furthermore, a

202 database record may not reflect, or may reflect incorrectly, the

203 precision inherent in the original measurement, especially if

204 the locality description has undergone interpretation from the

205 verbatim original description. Precision issues arise from both

206 distance measurements and directions in a locality description.

207 Potential errors from each of these sources are discussed in the

208 paragraphs below.

209

210 Error associated with distance precision

211

212 Distance may be recorded in a specific locality description with

213 or without significant digits, and those digits may or may not

214 be warranted. A conservative way to ensure that distance

215 precision is not inflated is to treat distance measurements as

216 integers with fractional remainders. Thus 10.25 becomes 10 1/4,

217 10.5 becomes 10 1/2, etc. Calculate the error for these distances

218 based on the fractional part of the distance, using 1 divided by

219 the denominator of the fraction.

220

221 Example: "10.5 mi N of Bakersfield" Fraction is 1/2, error should

222 be 0.5 mi.

223

224 Example: "10.6 mi N of Bakersfield" Fraction is 1/10, error

225 should be 0.1 mi.

226

227 Example: "10.75 mi N of Bakersfield" Fraction is 3/4, error should

228 be 0.25 mi.

229

230 If the distance is an integer, use an error of one unit.

231

232 Example: "10 mi N of Bakersfield" Error should be 1 mi.
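
The distance-precision rule can be sketched as follows. The text does not say which fractions to try, so this sketch assumes a short list of candidate denominators (1, 2, 4, 10, 100, 1000) and takes the smallest one that expresses the fractional part exactly. That assumption reproduces the four examples above, but it is only one possible reading of the rule.

```python
from fractions import Fraction

# Assumed candidate denominators, tried from coarsest to finest.
CANDIDATE_DENOMINATORS = (1, 2, 4, 10, 100, 1000)


def distance_precision_error(distance_text):
    """Return the error (in the distance's own units) for a distance written as text."""
    fractional_part = Fraction(distance_text) % 1
    for denominator in CANDIDATE_DENOMINATORS:
        # The fraction fits this denominator if scaling by it gives a whole number.
        if (fractional_part * denominator).denominator == 1:
            return Fraction(1, denominator)
    raise ValueError("distance has more decimal places than this sketch handles")


errors = {text: float(distance_precision_error(text))
          for text in ("10", "10.5", "10.6", "10.75")}
```

The distance is passed as text rather than as a float so that "10.6" is read as exactly 10 6/10 and not as its nearest binary approximation.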

233

234 Error associated with directional precision

235

236 Direction is almost always expressed in specific locality

237 descriptions using cardinal and intercardinal directions rather

238 than degree headings. A conservative interpretation of these

239 directions allows for an error of 22.5 degrees to either side of

240 the recorded direction. Thus, ENE can be any direction between E

241 and NE, while NE can be any direction between ENE and NNE.

242

243 [directionerror.jpg]

244

245 The error distance resulting from imprecision in direction

246 increases with increasing offset distance. In fact the error

247 distance due to directional imprecision is 0.4142 times the

248 offset. Note, however, that when a locality description uses two

249 offsets based on cardinal directions (e.g., 1 mi N and 3 mi E of

250 Bakersfield), the distances and directions are likely to have

251 been measured on a map. In this case, directional imprecision

252 should be ignored.
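
The factor 0.4142 matches the tangent of 22.5 degrees, so the directional error can be sketched as below, on the assumption that this is where the factor comes from. The function and parameter names are illustrative.

```python
import math

HALF_WIDTH_DEGREES = 22.5  # allowed deviation to either side of the recorded direction


def directional_error(offset_distance, two_cardinal_offsets=False):
    """Error distance, in the same units as the offset, due to directional imprecision."""
    if two_cardinal_offsets:
        # e.g., "1 mi N and 3 mi E": likely measured on a map, so ignored.
        return 0.0
    return offset_distance * math.tan(math.radians(HALF_WIDTH_DEGREES))


factor = directional_error(1.0)           # approximately 0.4142
ten_mile_error = directional_error(10.0)  # approximately 4.142 mi
```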

253

254 Appendix

255

256 Geographic Coordinate Data

257

258 Following are the essential attributes to be captured for each

259 locality while georeferencing.

260

261 Decimal_Latitude - the latitude coordinate (in decimal degrees) at

262 the center of a circle encompassing the whole of a specific

263 locality. Convention holds that decimal latitudes north of the

264 equator are positive numbers less than or equal to 90, while

265 those south are negative numbers greater than or equal to -90.

266 Example: -42.51 degrees (which is the same as 42d 30' 36" S).

267

268 Decimal_Longitude - the longitude coordinate (in decimal degrees)

269 at the center of a circle encompassing the whole of a specific

270 locality. Decimal longitudes west of the Greenwich Meridian are

271 considered negative and must be greater than or equal to -180,

272 while eastern longitudes are positive and less than or equal to

273 180. Example: -122.49 degrees (which is the same as 122d 29' 24"

274 W).
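
The sign conventions and the two worked examples above can be checked with a short conversion sketch. The function name is illustrative, and the hemisphere is assumed to be given as a single letter (N, S, E, or W).

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes, seconds plus a hemisphere letter to decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = degrees + minutes / 60 + seconds / 3600
    limit = 90 if hemisphere in ("N", "S") else 180
    if value > limit:
        raise ValueError("coordinate is out of range")
    # South and west are negative by convention.
    return -value if hemisphere in ("S", "W") else value


latitude = dms_to_decimal(42, 30, 36, "S")    # about -42.51
longitude = dms_to_decimal(122, 29, 24, "W")  # about -122.49
```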

275

276 Maximum_Error_Distance - the upper limit of the distance from the

277 given latitude and longitude within which the described locality

278 must lie.

279

280 Maximum_Error_Units - the units of length in which the maximum

281 error is recorded (e.g., mi, km, m, and ft). Express maximum

282 error distance in the same units as the distance measurement in

283 the specific locality description.

284

285 Datum - the geometric description of a geodetic surface model

286 (e.g., NAD27, NAD83, WGS84). Datums are often recorded on maps

287 and in gazetteers, and can be specifically set for most GPS

288 devices. Use "not recorded" when the datum is not known.

289

290 Original_Coord_System - the coordinate system in which the raw

291 data are being entered. For the purpose of collaborative

292 georeferencing this value will be "decimal degrees." However,

293 existing geographic coordinates may be entered in degrees

294 minutes seconds, degrees decimal minutes, or UTM coordinates.

295

296 Reference - the reference source (e.g., map, gazetteer, or

297 software) used to determine the coordinates. Such information

298 should provide enough detail so that anyone can locate the

299 actual reference that was used (e.g., name, edition or version,

300 year). Lat_Long_Determined_By - the person or organization by

301 which the determination was made.

302

303 Lat_Long_Determined_Date - the date on which the determination was

304 made.

305

306 Remarks - comments on methods and assumptions used in determining

307 coordinates or errors when those methods or assumptions differ

308 from or expand upon the accepted guidelines.
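
The attributes listed above could be held in a record like the one sketched here. This is an illustration only: the field names follow the list, but the types, the defaults, and the validation are assumptions, and the sample values are made up.

```python
from dataclasses import dataclass


@dataclass
class GeoreferenceRecord:
    decimal_latitude: float
    decimal_longitude: float
    maximum_error_distance: float
    maximum_error_units: str                      # e.g., "mi", "km", "m", "ft"
    datum: str = "not recorded"                   # used when the datum is not known
    original_coord_system: str = "decimal degrees"
    reference: str = ""
    lat_long_determined_by: str = ""
    lat_long_determined_date: str = ""
    remarks: str = ""

    def __post_init__(self):
        if not -90 <= self.decimal_latitude <= 90:
            raise ValueError("decimal_latitude must be between -90 and 90")
        if not -180 <= self.decimal_longitude <= 180:
            raise ValueError("decimal_longitude must be between -180 and 180")
        if self.maximum_error_distance < 0:
            raise ValueError("maximum_error_distance cannot be negative")


# Made-up values, for illustration only.
record = GeoreferenceRecord(
    decimal_latitude=35.37,
    decimal_longitude=-119.02,
    maximum_error_distance=1.0,
    maximum_error_units="mi",
    remarks="Offset measured by air in a straight line.",
)
```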

309

310 Glossary

311

312 Datum - A geodetic datum describes the size, shape, origin, and

313 orientation of a coordinate system for mapping the surface of

314 the earth.

315

316 Decimal degrees - degrees expressed as a single real number (e.g.,

317 -22.343456) rather than as a composite of degrees, minutes,

318 seconds, and direction (e.g., 7d 54' 18.32" E).

319

320 Geodetic surface model - a geometric description of the surface of

321 the earth.

322

323 Geographic coordinates - latitude and longitude, measured in any

324 of various coordinate systems.

325

326 Geographic center - To find the geographic center of a shape,

327 first find the extremes of both latitude and longitude within

328 the shape and then take their respective means.

329

330 UTM - Universal Transverse Mercator. A grid coordinate system

331 specifying a datum, zone, and offsets from the equator and from

332 the meridian of the zone. See the References section, below, for

333 more information.

334

335 References

336

337 Township, Range, Section Information:

338

339 http://www.esg.montana.edu/gl/trs-data.html

340

341 Datum Information:

342

343 http://www.colorado.edu/geography/gcraft/notes/datum/datum_f.html

344 http://164.214.2.59/GandG/tm83581/tr83581a.htm

345 http://biology.usgs.gov/geotech/documents/datum.html

346

347 UTM Information:

348

349 http://www.nps.gov/prwi/readutm.htm

350 http://www.dmap.co.uk/ll2tm.htm

351

352 Note

353

354 Specific locality descriptions are inexact and seldom give

355 estimates of error. An ideal description of a specific locality

356 has no error. One way to achieve this ideal is to describe the

357 locality by a shape within which the exact locality must

358 certainly lie. The capture of shape data is certainly possible

359 with current GIS technology, and is even demonstrably more

360 efficient than the methods described above. However, there are

361 technical challenges yet to be met in order to make the capture

362 of shape data feasible in a collaborative Internet-based

363 georeferencing environment.

364

365 An alternative to using a shape to describe a locality is to use

366 a definitive point of arbitrarily high precision with an

367 attendant maximum error. This method, described in the foregoing

368 document, is a conservative expression of the locality which

369 satisfies the requirement that the exact locality must lie

370 within the space described.

371

372

373 _________________________________________________

374

375 Rev. 24 September 2001, JRW

376

377 University of California, Berkeley, CA 94720, Copyright 2001,

378 The Regents of the University of California.

 

>>> Posting number 95, dated 27 Sep 2001 10:45:45

Date:         Thu, 27 Sep 2001 10:45:45 -1000

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Georeferencing document

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

John,

 

I went through your document this morning and find most of it clear and in

agreement with my own practices of georeferencing.  I have some

observations and questions as follows:

 

A.

140 The worst of situations arises when a specific locality

141 description is internally inconsistent. There are numerous

142 possible causes for inconsistencies. It is the task of those

143 georeferencing to determine the part of the description most

144 likely to be in error, ignore it for the purpose of the

145 determination, and document the decision to do so. The most

146 common source of inconsistency in a locality description comes

147 from trying to match elevation information with the rest of the

148 description. If there is no reasonable way to reconcile the

149 discrepancy, ignore the elevation.

150

151 Example: "10 mi W of Bakersfield, 6000 ft"

 

I have recently been through a georeferencing exercise in the herp

collection for which obtaining coordinates that agreed with the elevations

was critical.  It was only through trying to match the description of the

location (distance and direction from X village) with the elevation given,

and finding that the given elevation at the described site was impossible,

that I uncovered major problems in the locality data provided for a large

number of herps on one particular collecting trip.  In this case I was able

to contact the collector to ask about the inconsistencies and he determined

that his original distances were totally off because he was using miles on

a metric map.  In this case the elevations were the correct piece of

information.  I therefore caution against ignoring elevations out of hand.

 

B.

Section on Determining Latitude and Longitude does not include an example

for when coordinates are provided.  For the sake of completeness, should

such an example be included, or, since they are being provided and not

determined, should this be taken up in another section?  For example, when

coordinates are provided in degrees, minutes and seconds, do we translate

into decimals?  how many decimal places do we go for minutes?  for

seconds?  Does it matter who provided the

coordinates?  collector?  previous museum person?  someone else?  Under

what circumstances, if any, should we recalculate coordinates when they are

provided by some previous source?

 

 

C.

210 Error associated with distance precision

211

212 Distance may be recorded in a specific locality description with

213 or without significant digits, and those digits may or may not

214 be warranted. A conservative way to insure that distance

215 precision is not inflated is to treat distance measurements as

216 integers with fractional remainders. Thus 10.25 becomes 10 1/4,

217 10.5 becomes 10 1/2, etc. Calculate the error for these distances

218 based on the fractional part of the distance, using 1 divided by

219 the denominator of the fraction.

 

Lines 217-219.  Does this mean to "replace" the numerator  with 1, and

divide by the denominator?

 

221 Example: "10.5 mi N of Bakersfield" Fraction is 1/2, error should

222 be 0.5 mi.

 

numerator is 1 to begin with, so doesn't answer the question.

 

224 Example: "10.6 mi N of Bakersfield" Fraction is 1/10, error

225 should be 0.1 mi.

 

Isn't the fraction of .6,  6/10?   Did you replace the 6 with a 1 in order

to calculate the error?

 

227 Example: "10.75 mi N of Bakersfield" Fraction is 3/4, error should

228 be 0.25 mi.

 

Fraction this time is given as 3/4, not 1/4, but you could only get an

error of 0.25 by replacing the 3 with a 1 before dividing by 4.

 

As you can see, the examples are confusing.

 

 

All in all, it's a sound document.  Thanks much.

 

 

>>> Posting number 96, dated 27 Sep 2001 20:34:47

Date:         Thu, 27 Sep 2001 20:34:47 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         Gordon Jarrell <fnghj@AURORA.UAF.EDU>

Subject:      Re: Georeferencing document

In-Reply-To:  <5.0.2.1.1.20010927104434.00a2f7e0@mail.bishopmuseum.org>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

Some good points.  I've inserted my comments.

 

On Thu, 27 Sep 2001, XXXXXXX wrote:

 

> A.

> 140 The worst of situations arises when a specific locality

> 141 description is internally inconsistent. There are numerous

> 142 possible causes for inconsistencies. It is the task of those

> 143 georeferencing to determine the part of the description most

> 144 likely to be in error, ignore it for the purpose of the

> 145 determination, and document the decision to do so. The most

> 146 common source of inconsistency in a locality description comes

> 147 from trying to match elevation information with the rest of the

> 148 description. If there is no reasonable way to reconcile the

> 149 discrepancy, ignore the elevation.

> 150

> 151 Example: "10 mi W of Bakersfield, 6000 ft"

> 

> I have recently been through a georeferencing exercise in the herp

> collection for which obtaining coordinates that agreed with the elevations

> was critical.  It was only through trying to match the description of the

> location (distance and direction from X village) with the elevation given,

> and finding that the given elevation at the described site was impossible,

> that I uncovered major problems in the locality data provided for a large

> number of herps on one particular collecting trip.  In this case I was able

> to contact the collector to ask about the inconsistencies and he determined

> that his original distances were totally off because he was using miles on

> a metric map.  In this case the elevations were the correct piece of

> information.  I therefore caution against ignoring elevations out of hand.

> 

 

The key words here are, "IF there is no way to reconcile the

discrepancy..."  A possible resolution of the discrepancy might be to

treat it as "specific locality unknown."  This might best be left to the

discretion of the individual collections.  We have to judge individually

how bad our bad data are, i.e., whether or not we can reconcile them.

 

> B.

> Section on Determining Latitude and Longitude does not include an example

> for when coordinates are provided.  For the sake of completeness, should

> such an example be included, or, since they are being provided and not

> determined, should this be taken up in another section?  For example, when

> coordinates are provided in degrees, minutes and seconds, do we translate

> into decimals?  how many decimal places do we go for minutes?  for

> seconds?  Does it matter who provided the

> coordinates?  collector?  previous museum person?  someone else?  Under

> what circumstances, if any, should we recalculate coordinates when they are

> provided by some previous source?

> 

 

(I know John's answer to some of this one.)  The coordinates define an

infinitely small point, no matter what the format.  Precision is measured

with max_error, not the number of significant figures.

 

Nevertheless, we will have coordinates in which precision was implied by

the recorded format.  We have to convert this implied imprecision into a

measure of max_error.  At UAM we are using 2 km, a little over a nautical

mile, for coordinates that were recorded to the nearest whole minutes.

 

There are other examples, similar to the problems with distance precision:

        64D 28' 30" N -  What they meant to say, in terms of significant

figures, was probably 64D 28.5' N.  I suppose in this example we would use

max_error= 1 km

 

We probably do need to develop a standard here.  And yes, I'll bet we want

to be able to keep track of various determinations, re-determinations, who

did it, when, and how.

 

 

> C.

> 210 Error associated with distance precision

> 211

> 212 Distance may be recorded in a specific locality description with

> 213 or without significant digits, and those digits may or may not

> 214 be warranted. A conservative way to insure that distance

> 215 precision is not inflated is to treat distance measurements as

> 216 integers with fractional remainders. Thus 10.25 becomes 10 1/4,

> 217 10.5 becomes 10 1/2, etc. Calculate the error for these distances

> 218 based on the fractional part of the distance, using 1 divided by

> 219 the denominator of the fraction.

> 

> Lines 217-219.  Does this mean to "replace" the numerator  with 1, and

> divide by the denominator?

> 

> 221 Example: "10.5 mi N of Bakersfield" Fraction is 1/2, error should

> 222 be 0.5 mi.

> 

> numerator is 1 to begin with, so doesn't answer the question.

> 

> 224 Example: "10.6 mi N of Bakersfield" Fraction is 1/10, error

> 225 should be 0.1 mi.

> 

> Isn't the fraction of .6,  6/10?   Did you replace the 6 with a 1 in order

> to calculate the error?

> 

> 227 Example: "10.75 mi N of Bakersfield" Fraction is 3/4, error should

> 228 be 0.25 mi.

> 

> Fraction this time is given as 3/4, not 1/4, but you could only get an

> error of 0.25 by replacing the 3 with a 1 before dividing by 4.

> 

> As you can see, the examples are confusing.

> 

> 

 

Looks like a typo in line 224.

 

I suggest replacing the sentence beginning in line 217 with:

 

The error is the resolution implied by the denominator.  It can be

calculated as a distance by dividing one unit of distance by the

denominator.

 

Is that better?  Or worse?
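
The rule under discussion (one unit of distance divided by the denominator of the implied fraction) can be sketched in code. This is only an illustration, not part of the Guidelines: the function name and the list of candidate denominators (1, 2, 4, 10, 100) are assumptions, as is the choice to treat a whole-number distance as having denominator 1.

```python
from fractions import Fraction

def distance_precision_error(distance_text, denominators=(1, 2, 4, 10, 100)):
    """Return the error implied by the fractional part of a distance.

    distance_text is the distance as written in the locality, e.g. "10.75".
    The candidate denominators are an assumption made for this sketch.
    """
    value = Fraction(distance_text)   # exact: "10.75" becomes 43/4
    fractional_part = value % 1       # 3/4 for 10.75
    for d in denominators:
        # The first denominator that expresses the fractional part as a
        # whole number of 1/d steps gives the implied resolution.
        if (fractional_part * d).denominator == 1:
            return Fraction(1, d)     # one unit of distance divided by d
    raise ValueError("no candidate denominator fits " + distance_text)

for text in ("10.5", "10.6", "10.75"):
    print(text, "mi -> error", float(distance_precision_error(text)), "mi")
```

Note that the numerator of the fraction never enters the result; only the denominator does, which is the point the three examples in the document were trying to make.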

 

 

>>> Posting number 97, dated 28 Sep 2001 12:53:09

Date:         Fri, 28 Sep 2001 12:53:09 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Georeferencing guidelines

Mime-version: 1.0

Content-type: multipart/alternative;

              boundary="MS_Mac_OE_3084526390_196216_MIME_Part"

 

 

John et al.,

 

The georeferencing guidelines look great to me.  The only (minor) quibble I

have would be with the second item under the subheading "Offsets" (lines

86-89). Here, you suggest that a locality that contains distance fractions

(such as "10.2 mi E Bakersfield") should be assumed to be road miles rather

than air miles. I see it the other way around. Most field workers I know are

careful to state "by road" if their mileage was actually measured along a

road.  Otherwise, the mileage is assumed to be taken directly from a map

(i.e., air miles).  I don't see that the inclusion of fractions in the

mileage should automatically signal that the mileage was read from an

odometer...it's easy to get that level of precision using the distance scale

printed on the map.

 

Let's see what the others think.  Well done.

 

 

>>> Posting number 98, dated 28 Sep 2001 11:33:22

Date:         Fri, 28 Sep 2001 11:33:22 -0700

Reply-To:     Peter Rauch <peterr@socrates.Berkeley.EDU>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: Georeferencing guidelines

In-Reply-To:  <OF482A362E.E38FA255-ON86256AD5.00621E6D@lsu.edu>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

On Fri, 28 Sep 2001, XXXXXXXX wrote:

 

> The georeferencing guidelines look great to me.  The only

> (minor) quibble I have would be with the second item under

> the subheading "Offsets" (lines 86-89). Here, you suggest

> that a locality that contains distance fractions (such as

> "10.2 mi E Bakersfield") should be assumed to be road miles

> rather than air miles. I see it the other way around. Most

> field workers I know are careful to state "by road" if their

> mileage was actually measured along a road.

 

On insect labels ;>)  "by road" is just that much more text to

cram onto tiny labels. Maybe things are different with

vertebrate folks, especially for those who keep detailed field

notebooks. I think lots of folks keep careful track of their

odometers, and record road/track miles quite often. I suspect

that *either* assumption is likely to be wrong too often (i.e.,

when no explicit indication is given of which type of

measurement is done). Perhaps the classification should be

"Basis of measure not indicated" and let the "buyer beware"?

(I.e., the geographic analyst can then choose how she wishes to

interpret the distances --perhaps choosing to measure both ways

if a locality seems out of place under one or the other

measurement scheme.)

 

 

 

>  Otherwise, the

> mileage is assumed to be taken directly from a map (i.e.,

> air miles).  I don't see that the inclusion of fractions in

> the mileage should automatically signal that the mileage was

> read from an odometer...it's easy to get that level of

> precision using the distance scale printed on the map.

 

>>> Posting number 99, dated 30 Sep 2001 13:35:49

Date:         Sun, 30 Sep 2001 13:35:49 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      FW: Locality comment

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

 

John et al.:

With regard to assigning coordinates to localities, there is a convention

that has been used here at KU for at least 50 years that will help with

localities that are given with reference to towns in the US.  When the town

(e.g. Lawrence) was a county seat, distances were measured from the

courthouse.  Frequently this was near the center of town, but it reduces the

error in estimating the distance from town because we don't need to worry

about the distance being measured from the city limits.  If the locality is

3.5 mi NW of Lawrence, we still have the uncertainty associated with the angular

component.  If the town is not a county seat, the Post Office is frequently

specified as the point of reference.  We think this system was exported to

several other collections that are part of MANIS. In general, your

suggestions look quite reasonable (and conservative).

 

 

>>> Posting number 100, dated 12 Oct 2001 16:22:06

Date:         Fri, 12 Oct 2001 16:22:06 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing Commentary synopsis

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Hi folks,

 

I've been ruminating over the responses to the Georeferencing Guidelines

document, which was posted on the MaNIS website on 24 Sep 2001. That

document has generated interest in a wider community, including the

Alexandria Digital Library Project, so I feel it worthwhile to spend a

little extra effort to fill in some omissions.  Below I will address the

points brought up in discussion and try to provide satisfactory solutions.

I would like to know if there are any objections to these solutions.  My

next step will be to incorporate this information into the Guidelines

document and then announce the existence of that document to NHCOLL.

 

XXXXXXXX mentioned a convention to use the courthouse for a

point of reference for a county seat and to use a post office as a point of

reference for other towns.  Since the Board on Geographic Names GNIS data

often follows this convention as well I see no conflict. Of course, this

convention applies only to the US, and only to those towns where there is a

single identifiable post office or a courthouse.  For all other

determinations the current geographic center of the town, or the

coordinates given in a gazetteer, should be used. In either case it is best

to note something akin to "measured from the post office" or "measured from

the geographic center of Bakersfield" in the determination remarks.

 

XXXXXXXX brought up the topic of elevations as a critical part of the

determination criteria. I agree with her assessment and I propose that we

follow XXXXXXXX's advice, namely, that localities for which there are

internal inconsistencies should be deferred to the parent institution for

further investigation.  I have designed the collaborative gazetteer to

allow annotations to both localities and higher geography. Through the

annotations, georeferencers can note inconsistencies for follow-up work.

Collaborators will be able to check the gazetteer for annotations that

apply to the data from their institution.

 

XXXXX also noted that there was no example of how to deal with existing

geographic coordinates. My original thought was that we should count these

localities as finished.   Yet, there is merit in revisiting existing data,

both for validation and for edification, especially since none of the

existing coordinates have associated error. Nevertheless, we must remain

cognizant of our budgetary constraints. We were given funds to georeference

localities for which we didn't already have coordinates. All that aside,

XXXXX's point is well-taken. I will provide guidelines for existing

geographic coordinates in the forthcoming revised Georeferencing Guideline

document.

 

XXXXX asked whether we should translate coordinates from other coordinate

systems into decimal degrees for data entry. The gazetteer currently

accommodates the following coordinate systems:

decimal degrees

degrees, decimal minutes

degrees, minutes, decimal seconds

UTM

 

But that doesn't answer the question. I will endeavor to create an

interface in which the user will select the original coordinate system and

provide the data in that system. Behind the scenes the data will be stored

in that system AND will be translated to decimal degrees. There will be

decimal degrees and the original coordinates for every determination.
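
For anyone who wants to check a translation by hand, the conversion from degrees, minutes, and seconds to decimal degrees might look like the following. This is a minimal sketch and not the gazetteer's actual code; the function name and argument layout are hypothetical.

```python
def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, hemisphere="N"):
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees.

    Southern latitudes and western longitudes come back negative, matching
    the sign convention in the Guidelines.
    """
    hemisphere = hemisphere.upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    magnitude = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    return -magnitude if hemisphere in ("S", "W") else magnitude

# The longitude example from the Guidelines: 122d 29' 24" W
print(round(dms_to_decimal(122, 29, 24, "W"), 6))
```

Degrees with decimal minutes fit the same function by leaving seconds at zero, e.g. dms_to_decimal(64, 28.5, 0, "N").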

 

XXXXX's next topic was with respect to the precision stored in the

coordinate fields. There is no reason to truncate the values of coordinates

to conform to a predefined level of precision.  For reasons described under

the section on Precision in the Georeferencing Guidelines document, it is

inappropriate to try to store precision information in the coordinate data.

Since the values of the coordinates do not make a statement about the

precision of the determination, keeping as many digits as your source

provides is the preferred method. Discarding digits may have an effect on

accuracy, so it is not recommended.  Just for edification, a decimal degree

that records five digits to the right of the decimal can distinguish

between two places on the earth roughly one meter apart. Similarly, if you

want to maintain accuracy down to one meter, degrees and decimal minutes

should be recorded with 4 decimal places in the decimal minutes, and

degrees minutes seconds should be recorded with 2 decimal places in the

decimal seconds. Conversely, degrees minutes seconds measured to whole

seconds can introduce inaccuracies of up to 31 meters. Those measured to

whole minutes can introduce inaccuracies of up to 1.85 km. I'll make a

chart of this information for the document revision.
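
Until that chart exists, the figures above can be reproduced with a few lines of code. The sketch below assumes one minute of arc is about one nautical mile (1852 m), which holds for latitude and for longitude only near the equator; the names are hypothetical.

```python
METERS_PER_ARC_MINUTE = 1852.0             # assumption: one nautical mile
METERS_PER_UNIT = {
    "degrees": 60 * METERS_PER_ARC_MINUTE,
    "minutes": METERS_PER_ARC_MINUTE,
    "seconds": METERS_PER_ARC_MINUTE / 60,
}

def resolution_m(smallest_unit, decimal_places):
    """Ground distance represented by the last recorded digit.

    smallest_unit is the unit carrying the decimals ("degrees", "minutes",
    or "seconds"); decimal_places is how many digits follow the decimal point.
    """
    return METERS_PER_UNIT[smallest_unit] * 10 ** -decimal_places

for unit, places in (("degrees", 5), ("minutes", 4), ("seconds", 2),
                     ("seconds", 0), ("minutes", 0)):
    print(unit, places, "decimal places:", round(resolution_m(unit, places), 2), "m")
```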

 

XXXXX's final question has to do with recording the information about who

determined the coordinates.  This should certainly be among the best

practices within museums.  At the MVZ these data are recorded by making a

reference to the actual person who made the determination.  Since the data

are internal to the museum we can tell whether that person was also the

collector or another person on staff. Another possibility is to record the

role of the person who made the determination (e.g., 'collector',

'curatorial assistant', 'Joe's specific locality munger', etc.). Or, if you

only care whether the collector was the one to provide the coordinates, you

could include a DeterminedByCollector field. For MaNIS I intend to use the

name of the person who determines the coordinates, this name being

determined from a login to the online georeferencing interface.

 

A point of clarification is in order. When determinations are made, I

intend to treat them as opinions. They will not be stored directly with the

locality record; rather, they will refer to it.  This allows any number of

lat/long opinions to be registered. The individual institutions will be

able to decide which one (if there are multiple opinions) will be the

"accepted" determination when they put the data back in their databases.

All of the coordinates that were provided in the data sent to me have been

turned into opinions and are already in the gazetteer.
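
To make the "opinion" idea concrete, here is one way the relationship could be modeled. It is a sketch under my own assumptions, not the gazetteer's schema: every class, field, and method name below is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Determination:
    """One lat/long opinion; it refers to a locality rather than living in it."""
    locality_id: int
    latitude: float
    longitude: float
    determined_by: str
    max_error_distance: Optional[float] = None   # existing coordinates have none
    max_error_units: Optional[str] = None
    remarks: str = ""

@dataclass
class OpinionStore:
    opinions: List[Determination] = field(default_factory=list)
    accepted: Dict[int, Determination] = field(default_factory=dict)

    def add_opinion(self, determination):
        self.opinions.append(determination)

    def opinions_for(self, locality_id):
        return [d for d in self.opinions if d.locality_id == locality_id]

    def accept(self, determination):
        # The owning institution chooses which opinion becomes "accepted".
        self.accepted[determination.locality_id] = determination

store = OpinionStore()
store.add_opinion(Determination(1, 35.37, -119.19, "collector"))
store.add_opinion(Determination(1, 35.373, -119.187, "georeferencer", 2.0, "km"))
store.accept(store.opinions_for(1)[1])
```

The coordinates in the usage lines are made-up values for illustration only.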

 

XXXXXX made the following observation:

"There are other examples, similar to the problems with distance precision:

         64D 28' 30" N -  What they meant to say, in terms of significant

figures, was probably 64D 28.5' N.  I suppose in this example we would use

max_error= 1 km"

 

I agree with XXXXXX's assessment of significance; however, the

determination of error is more complicated.  Not all degrees are created

equal. Contrary to popular opinion, the distance between 64 degrees N and

65 degrees N is not the same as the distance between 10 degrees N and 11

degrees N. This is due to the oblateness (flattening from a perfect sphere)

of the earth. This may be a minor point, but longitudinal degrees vary

greatly, being roughly 110 km at the equator and 0 km at the poles. My

point is that I need to provide an interface in which one can enter

coordinates and the digits of precision and get back an error distance

based on those criteria.
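
As a rough illustration of why the latitude matters, the sketch below treats the earth as a sphere, so it ignores the oblateness mentioned above. The constant of 111.32 km per degree, the function names, and the way the two components are combined (as the diagonal of the box they define) are all assumptions of this sketch, not the interface I intend to build.

```python
import math

KM_PER_DEGREE = 111.32   # assumption: spherical earth, length of one degree of arc

def km_per_degree_longitude(latitude_deg):
    """Length of one degree of longitude at the given latitude."""
    return KM_PER_DEGREE * math.cos(math.radians(latitude_deg))

def coordinate_precision_error_km(latitude_deg, precision_minutes):
    """Error distance implied by coordinates recorded to a given precision.

    precision_minutes is the recorded resolution in minutes of arc (1.0 for
    whole minutes, 0.5 for half minutes). The latitude and longitude
    components are combined as the diagonal of the box they define.
    """
    lat_km = precision_minutes / 60.0 * KM_PER_DEGREE
    lon_km = precision_minutes / 60.0 * km_per_degree_longitude(latitude_deg)
    return math.hypot(lat_km, lon_km)

# Half-minute precision at 64d 28.5' N versus the same precision at 10 degrees N
print(coordinate_precision_error_km(64.475, 0.5))
print(coordinate_precision_error_km(10.0, 0.5))
```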

 

I will amend my wording and typos with respect to using fractions in the

distance precision error section.

 

XXXXXXXXX brought up a reasonable alternative view of how offsets should

be handled. The judgement of whether measurements are "by road" or "by air"

can be a tricky one.  I want to propose a solution and see if I can get a

consensus.

 

Specific localities that actually say what the measurement method is (e.g.,

"2.8 mi (by road) E of Marysville") should use that method for determining

coordinates and errors. No special remark is necessary in these cases.

 

Specific localities that have two orthogonal measurements in them (e.g.,

"2.5 mi E and 1.5 mi N of Bakersfield") are always assumed to be "by

air."  No special remark is necessary in these cases either. Furthermore,

no error due to direction imprecision should be used.

 

So much for the easy stuff.

 

Specific localities that have one linear offset measurement from a named

place, but that do not specify how that measurement was taken (e.g., "10.2

mi E of Yuma") are open for a case-by-case judgment. I propose that the

judgement itself always be documented in the remarks for the determination

(e.g., "Assumed 'by air' - no roads E out of Yuma", or "Assumed 'by road'

on Hwy. 80"). If there is no clear best choice, then use the midpoint

between the two possibilities as the geographic coordinate and assign an

error large enough to encompass the coordinates and errors of both methods.

In this case I would remark something like "Error encompasses both distance

by air and distance by road (Hwy. 80)". This is a conservative solution,

but it is relatively simple to do and to remember.  This method is also

never "wrong," if by "wrong" we mean that the actual place is certainly

within our error distance from the given coordinates.
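
The midpoint solution can be written down compactly. The sketch below works in flat map coordinates (for example, km east and km north of the named place) rather than latitude and longitude, and the function name is hypothetical.

```python
import math

def encompassing_determination(point_a, error_a, point_b, error_b):
    """Midpoint of two candidate points and an error covering both.

    Points are (east, north) offsets in a flat local coordinate system, with
    errors in the same units. The returned radius reaches the far edge of
    whichever candidate's error circle extends farther from the midpoint.
    """
    (ax, ay), (bx, by) = point_a, point_b
    midpoint = ((ax + bx) / 2.0, (ay + by) / 2.0)
    half_gap = math.hypot(bx - ax, by - ay) / 2.0
    return midpoint, half_gap + max(error_a, error_b)

# "By air" candidate versus "by road" candidate, in km from the named place.
by_air = (10.2, 0.0)
by_road = (9.1, 1.3)
print(encompassing_determination(by_air, 0.1, by_road, 0.1))
```

The two candidate positions in the usage lines are invented for illustration; in practice they would come from measuring the same distance both ways on a map.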

 

XXXXXXXXX brought up a question about what units should be used

for maximum error distance. I have set up the gazetteer so that the units

are entered (chosen actually) from a list of possible values (m, km, ft,

yds, mi). The distance and units should be chosen to make sense in the

context of the locality description. My conservative stance on translation

and recalculation issues is to "never adulterate data that can be

adulterated later." If you decide to put these data back into your

databases (and I certainly hope that you will), you can decide at that time

whether to normalize to a single unit of measure.
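
If you do normalize later, the conversion itself is simple. The sketch below uses the unit list from the gazetteer and standard conversion factors; the function name is hypothetical, and it returns the original value and units alongside the converted one so nothing is lost.

```python
METERS_PER_UNIT = {
    "m": 1.0,
    "km": 1000.0,
    "ft": 0.3048,
    "yds": 0.9144,
    "mi": 1609.344,
}

def normalize_error(distance, units):
    """Return (meters, original_distance, original_units) for a maximum error."""
    return distance * METERS_PER_UNIT[units], distance, units

print(normalize_error(0.5, "mi"))
```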

 

XXXXXXX also brought up an essential issue of whether errors propagate and

should therefore be summed rather than simply choosing the greatest single

source of error.  The answer is not a simple one, so bear with me.

 

XXXXXXX's specific example, "3 km N + 2 km W Bakersfield" is an instance

of a type of locality description for which I did not provide an example. A

proper description of the error for this example would be a bounding box

centered on the point 3 km N and 2 km W of Bakersfield. Each side of the

box would be 2 km in length (1 km error in any direction). Since we're

using a point and radius to characterize the error, we need a circle that

will circumscribe the above-mentioned bounding box. To do this, the radius

has to be the distance from the center coordinate to a corner. This could

either be calculated by the geometry of the bounding box (in the above

example it would be the distance to a side, 1 km, times the square root of 2)

or measured on a map.
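
For the calculation route, the radius is just the diagonal from the center of the box to a corner. A minimal sketch, with a hypothetical function name:

```python
import math

def circumscribing_radius(east_west_error, north_south_error):
    """Radius of the circle through the corners of the error bounding box.

    Each argument is the error in one direction from the center point,
    i.e. half the length of the corresponding side of the box.
    """
    return math.hypot(east_west_error, north_south_error)

# "3 km N + 2 km W Bakersfield": 1 km of error in each direction
print(circumscribing_radius(1.0, 1.0))
```

The result is larger than either single error (1 km) and smaller than their sum (2 km).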

 

There remains the more general question of whether errors propagate. They

do, and they are non-linear, so to sum them is a mistake. The paragraph

above shows how a sum is not a satisfactory method of accommodating

multiple sources of error. As more sources of error come to bear, the

propagation gets even more "interesting." I'll spare you the details here,

but I'll make a point of explaining these sources and how they should be

dealt with in the Guidelines revision.

 

In addition to the issues brought up so far in discussion, I have a few to

add independently. First, I got the calculation for directional error

wrong. I'll update that in the revision. Second, it is probably obvious,

but I still need to state that the directional error can be ignored when

the distance is measured either "by road" or when the description gives two

orthogonal offsets (e.g., "2 mi E and 4 mi N"). Third, there is another

source of error inherent to reading maps. This error is based on the scale

and it reflects inherent errors in the maps themselves. I will quantify

these errors in the revision.

 

Aside from the revised georeferencing document, I'm currently working on

interfaces to do the georeferencing online. I'll send out a how-to guide

when the interface is ready to use.  It is too soon to know when that will be.

 

So that everyone knows, my field season is about to begin. Eileen and I are

scheduled to leave for Argentina on 3 Nov and to return around New Year's day.

 

That's it for my update. Feel free to discourse on my proposed amendments

and thanks to everyone for the comments thus far.

 

John

 

>>> Posting number 101, dated 16 Oct 2001 12:43:55

 

>>> Posting number 102, dated 18 Oct 2001 19:30:33

Date:         Thu, 18 Oct 2001 19:30:33 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing Guideline Document Updated

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

It took almost two weeks, but the eagerly-awaited revision to the

Georeferencing Guidelines Document is finally complete. I have replaced the

original document, so the following URL now points to the revision:

 

http://dlp.cs.berkeley.edu/manis/GeorefGuide.html

 

I'm not including the line-numbered text of the document here since we are

presumably past the heated debates.  Nevertheless, commentary is

always  welcome.

 

When you read the revised document you are likely to be struck by the

complexities of determining error properly. Don't despair. My next task is

to create an error calculator. The idea is to have a web page on which you

can enter the relevant parameters and get a maximum error distance. This

tool will be a supplement to the georeferencing tool itself, the

development of which is underway.

 

John

 

>>> Posting number 103, dated 19 Oct 2001 12:29:38

 

>>> Posting number 104, dated 4 Nov 2001 21:44:44

Date:         Sun, 4 Nov 2001 21:44:44 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         "Barbara R. Stein" <bstein@OZ.NET>

Subject:      MaNIS--ready, set, georeference!

MIME-Version: 1.0

Content-Type: multipart/alternative;

              boundary="------------24FB9C29A003860042ABE8C3"

 

 

Dear All,

 

This is the moment I know you have all been waiting for!  You will

notice a new Gazetteer link at the bottom of the MaNIS home page

(http://dlp.cs.berkeley.edu/manis).  This is your gateway to hours of

georeferencing fun.  But before starting to work, please read this

message in its entirety, print it out and post it next to the computer

that will be used for georeferencing.  You’ll see why you need to print

it when you get near the bottom.

 

To begin, please review the updated Georeferencing Guidelines.

 

Next, you will want to read the Georeferencing Steps document.  A hot

link to it appears at the top of the gazetteer page.

 

You will also want to read the text below the query screen on the

gazetteer main page.

 

After reading all of the above, you will query the gazetteer for a

locality of interest.  The "Search" button returns a list of all higher

geographies containing the term entered and indicates how many unique