MaNIS Georeferencing Discussion Archive

 

Following are extracts of the Georeferencing Listserv discussions accumulated during the MaNIS georeferencing project. Postings whose text is omitted were judged to have no lasting relevance to georeferencing. Messages have been edited to protect the guilty by masking the names of individuals with XXXXXX.

 

>>> Posting number 1, dated 17 Jul 1999 14:12:50

 

-----------------------------------------------------------------------------

 

>>> Posting number 2, dated 17 Jul 1999 14:15:23

 

 

-----------------------------------------------------------------------------

 

>>> Posting number 3, dated 17 Jul 1999 14:16:03

 

 

 

-----------------------------------------------------------------------------

 

>>> Posting number 4, dated 17 Jul 1999 14:19:25

 

 

-----------------------------------------------------------------------------

 

>>> Posting number 5, dated 17 Jul 1999 14:19:59

 

 

-----------------------------------------------------------------------------

 

>>> Posting number 6, dated 17 Jul 1999 14:26:41

 

 

 

-----------------------------------------------------------------------------

 

>>> Posting number 7, dated 17 Jul 1999 14:22:50

 

 

-----------------------------------------------------------------------------

 

>>> Posting number 8, dated 17 Jul 1999 14:23:12

 

-----------------------------------------------------------------------------

 

>>> Posting number 9, dated 19 Jul 1999 09:29:01

-----------------------------------------------------------------------------

 

>>> Posting number 10, dated 23 Jul 1999 16:35:41

 

>>> Posting number 11, dated 3 Sep 1999 16:17:55

 

>>> Posting number 12, dated 17 Sep 1999 15:19:38

 

>>> Posting number 13, dated 17 Sep 1999 13:13:14

 

>>> Posting number 14, dated 17 Sep 1999 14:57:30

 

>>> Posting number 15, dated 20 Sep 1999 09:04:17

 

>>> Posting number 16, dated 24 Sep 1999 17:01:21

 

>>> Posting number 17, dated 28 Sep 1999 12:50:27

 

>>> Posting number 18, dated 15 Oct 1999 19:37:37

 

>>> Posting number 19, dated 17 Oct 1999 16:37:27

 

>>> Posting number 20, dated 18 Oct 1999 16:50:30

 

>>> Posting number 21, dated 19 Oct 1999 11:15:26

 

>>> Posting number 22, dated 19 Oct 1999 16:35:19

 

>>> Posting number 23, dated 20 Oct 1999 15:51:18

 

>>> Posting number 24, dated 20 Oct 1999 11:34:55

 

>>> Posting number 25, dated 20 Oct 1999 16:00:18

 

>>> Posting number 26, dated 10 Nov 1999 10:52:01

 

>>> Posting number 27, dated 10 Nov 1999 13:54:04

 

>>> Posting number 28, dated 17 Nov 1999 15:12:19

 

>>> Posting number 29, dated 18 Nov 1999 12:38:15

 

>>> Posting number 30, dated 18 Nov 1999 10:08:56

 

>>> Posting number 31, dated 18 Nov 1999 13:22:25

 

>>> Posting number 32, dated 19 Nov 1999 14:35:52

 

>>> Posting number 33, dated 3 Dec 1999 10:21:24

 

>>> Posting number 34, dated 3 Jan 2000 11:48:10

 

>>> Posting number 35, dated 3 Jan 2000 16:24:25

 

>>> Posting number 36, dated 18 May 2000 16:51:23

 

>>> Posting number 37, dated 18 May 2000 19:49:29

 

>>> Posting number 38, dated 23 May 2000 18:41:45

 

>>> Posting number 39, dated 24 May 2000 09:38:19

 

-----------------------------------------------------------------------------

 

>>> Posting number 40, dated 24 May 2000 12:15:39

 

>>> Posting number 41, dated 12 Jun 2000 15:45:50

 

>>> Posting number 42, dated 13 Jun 2000 09:31:26

 

>>> Posting number 43, dated 13 Jun 2000 09:59:02

 

>>> Posting number 44, dated 13 Jun 2000 09:17:08

 

>>> Posting number 45, dated 13 Jun 2000 07:49:43

 

>>> Posting number 46, dated 13 Jun 2000 09:04:22

 

>>> Posting number 47, dated 13 Jun 2000 08:54:22

 

>>> Posting number 48, dated 13 Jun 2000 11:11:31

 

>>> Posting number 49, dated 13 Jun 2000 13:23:46

 

>>> Posting number 50, dated 30 Jun 2000 16:25:38

 

>>> Posting number 51, dated 30 Jun 2000 17:14:31

 

>>> Posting number 52, dated 30 Jun 2000 23:29:35

 

>>> Posting number 53, dated 1 Jul 2000 07:35:15

 

>>> Posting number 54, dated 4 Jul 2000 11:04:23

 

>>> Posting number 55, dated 4 Jul 2000 10:07:33

 

>>> Posting number 56, dated 6 Jul 2000 00:00:0/

 

>>> Posting number 57, dated 5 Jul 2000 19:40:11

 

>>> Posting number 58, dated 5 Aug 2000 09:24:55

 

>>> Posting number 59, dated 5 Aug 2000 12:31:07

 

>>> Posting number 60, dated 7 Aug 2000 13:45:33

 

>>> Posting number 61, dated 15 Aug 2000 21:54:23

 

>>> Posting number 62, dated 23 Aug 2000 16:24:48

 

>>> Posting number 63, dated 30 Aug 2000 11:20:17

 

>>> Posting number 64, dated 22 Sep 2000 09:36:34

 

>>> Posting number 65, dated 29 Sep 2000 08:51:23

 

>>> Posting number 66, dated 2 Oct 2000 10:35:12

 

>>> Posting number 67, dated 5 Oct 2000 09:40:24

 

>>> Posting number 68, dated 17 Oct 2000 18:13:33

 

>>> Posting number 69, dated 1 Nov 2000 07:48:24

 

>>> Posting number 70, dated 1 Nov 2000 08:06:24

 

>>> Posting number 71, dated 28 Nov 2000 18:26:18

 

>>> Posting number 72, dated 29 Nov 2000 21:09:35

 

>>> Posting number 73, dated 30 Nov 2000 08:31:10

 

>>> Posting number 74, dated 30 Nov 2000 11:33:07

 

>>> Posting number 75, dated 14 Dec 2000 20:41:28

 

>>> Posting number 76, dated 15 Dec 2000 07:59:04

 

>>> Posting number 77, dated 26 Apr 2001 09:00:01

 

>>> Posting number 78, dated 16 May 2001 18:29:45

 

>>> Posting number 79, dated 16 May 2001 17:36:59

 

>>> Posting number 80, dated 18 May 2001 08:29:49

 

>>> Posting number 81, dated 24 May 2001 10:19:20

 

>>> Posting number 82, dated 25 May 2001 09:43:37

 

>>> Posting number 83, dated 11 Jun 2001 12:01:03

 

>>> Posting number 84, dated 11 Jun 2001 15:02:51

 

>>> Posting number 85, dated 11 Jun 2001 15:44:56

 

>>> Posting number 86, dated 29 Jun 2001 21:12:37

 

>>> Posting number 87, dated 4 Jul 2001 14:24:24

Date:         Wed, 4 Jul 2001 14:24:24 -0700

Reply-To:     "Mammalogy Z39.50 Network (Private)" <MAMMAL-Z-NET@USOBI.ORG>

Sender:       "Mammalogy Z39.50 Network (Private)" <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: ROM higher geography

In-Reply-To:  <sb433743.076@romfs7.rom.on.ca>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

I'm posting the following exchange to the list because there is information

contained herein that is relevant to everyone. The basic concepts of data

cleanliness, the gazetteer, and data updates are addressed in brief.

 

 

>Once I began working on the Bukedi inconsistency (2nd in your list) I saw

>that your methodology is missing many more errors/inconsistencies that

>exist in County and Province data.

 

Understood.  My analysis reveals only the duplicates of

ORCT+ORCRY+ORPR+ORCY

 

I understand that there may be many other errors and inconsistencies in the

original data, but that is not a concern for the gazetteer.  In fact, the

duplicates I pointed out aren't a problem either. I just wanted to alert

you to them since they came out in my analysis.

 

>   The errors and inconsistencies are a direct reflection of the state of

> documentation on field catalogues or specimen cards, depending on the

> source of the automated record.  We did not have the resources at the

> time of automation (nor do we now for that matter) to resolve what is a

> "Province" term and what is a "County" term for all

> countries.  Additionally, we are looking at historical data that may no

> longer be reflected in the current political reality of our little world

> (e.g., USSR, Northwest Territories).  I have cleaned up data fields that

> are used routinely to manage the collection and retrieve data.  Continent

> and Country should be clean.  The Province field should be clean for

> Canada (I haven't had the time to tackle NWT yet), USA, and Mexico.  I

> just finished cleaning up the Province field for Guyana as well.  The

> County field should be clean for Ontario.  I now periodically print out

> frequency listings for Country etc. for these priority sections of the db

> (and collection) in an effort to maintain the consistency of our

> data.  For all other geographic locations, Province and County are not

> used for managing the collection, so the data clean up or enhancement has

> been a low priority.  This is an ongoing situation that I have discussed

> with Judith with regard to the Manis Project.  My understanding is that

> funding for documentational and staffing resources will be part of this

> "mission".  I am afraid your listing of 13 inconsistencies barely

> scratches the surface of the data cleaning that is required and even more

> importantly, misses all kinds of erroneous or missing data.  I currently

> do not have the maps, atlases, or gazetteers nor the staff/time to

> undertake this project which from a collections' perspective is of low

> priority.  To do a proper job I cannot resolve all of the problems that

> you have identified without undertaking a full review of the entire

> country's data.

 

There is no requirement for any standard of cleanliness. It is my hope that

errors and inconsistencies will be noted during georeferencing and

forwarded to the attention of the institutions as a part of that

process.  The tools are meant to identify the inconsistencies, not to

remedy them. What the institutions do with these notes is entirely up to them.

 

>I am not sure what you are currently attempting to do with the data so we

>may need to further discuss our respective needs to ensure that we are not

>working at cross purposes.  If work is to be globally undertaken, I would

>like our data to be the db of record - making long lists of changes for

>you to then repeat is a waste of effort and time; you will see the work

>generated by having two dbs of record by the simple changes that I have

>made this afternoon.  Also, errors in interpretation or typos that are

>bound to occur should be avoided.  Finally, the data you have is already

>out of date, since changes are made by me on a daily basis as errors etc.

>are encountered during the normal activities of managing the collection,

>fulfilling data requests, etc.

 

The institutional databases will always be the database of record.  The

data I have from all of the institutions is just a snapshot, to be used for

georeferencing. I will not ask for these data again during the project, nor

will I make changes to the data I have received.  When we have a network,

the gazetteer will be created and updated automatically whenever data

change and the snapshot will be obsolete.  I've only created the snapshot

so that we have combined data to work with. When people begin to do

georeferencing using the gazetteer they will not change the data - they

will only make commentaries.  Even the latitude and longitude are

commentaries in a sense. It is up to each institution to accept or reject

the commentaries and make changes based on them in its database.

 

 

>Regards,

 

 

> 

> >>> John Wieczorek <tuco@socrates.Berkeley.EDU> 07/02/01 08:50PM >>>

>Attached is a tab-delimited file with the first row containing column

>headings. The contents of the file are combinations of higher geographic

>fields for which you have more than one interpretation in your

>database.  The first field (highergeog) is a concatenation of the fields of

>higher geography that reveal duplication. The second field (geogid) is an

>identifier unique to the ROM higher geography data with one row for every

>unique combination of ORCT, ORCRY, ORPR, and ORCY.  As you can see by the

>rows in the table, there are 13 places for which there are inconsistent

>placements of county vs. province, for example.  It is not critical for my

>purposes to have these resolved, but since I noticed them I thought I might

>as well tell you.  If you do make changes to these combinations, let me

>know which are correct and I'll do so on this end as well.
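
A check of this kind might be sketched as follows. This is only an
illustration under stated assumptions, not the analysis that was actually
run: the field names ORCT, ORCRY, ORPR, and ORCY come from the message
above, while the sample rows, the find_inconsistent helper, and the rule of
concatenating the non-empty values in order are hypothetical.

```python
from collections import defaultdict

FIELDS = ("ORCT", "ORCRY", "ORPR", "ORCY")  # higher geography fields named in the message


def find_inconsistent(rows):
    """Group rows by the concatenation of their non-empty values.

    Assumption: two rows describe the same place when the same names appear
    in the same order, regardless of which field holds each name.
    """
    groups = defaultdict(set)
    for row in rows:
        values = tuple(row.get(f, "").strip() for f in FIELDS)
        highergeog = ", ".join(v for v in values if v)
        groups[highergeog].add(values)
    # keep only the places that are recorded in more than one way
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


# Hypothetical sample rows; the placement of "Bukedi" is illustrative only.
rows = [
    {"ORCT": "Africa", "ORCRY": "Uganda", "ORPR": "Bukedi", "ORCY": ""},
    {"ORCT": "Africa", "ORCRY": "Uganda", "ORPR": "", "ORCY": "Bukedi"},
    {"ORCT": "North America", "ORCRY": "Canada", "ORPR": "Ontario", "ORCY": ""},
]
inconsistent = find_inconsistent(rows)
```

Each key in the result plays the role of the highergeog column, and the
list under it holds the competing field placements for that place.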

 

>>> Posting number 88, dated 10 Jul 2001 12:01:24

Date:         Tue, 10 Jul 2001 12:01:24 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      cave localities

Mime-version: 1.0

Content-type: text/plain; charset="US-ASCII"

Content-transfer-encoding: 7bit

 

I've noticed that the USGS GNIS web site does not give information on cave

sites.  (It does give locations of variants such as Boulder Cave

Campground.)  Is this a protocol we wish to follow?  Are there other web

sites that do list cave localities?  What do you think?

 

Cheers,

 

>>> Posting number 89, dated 10 Jul 2001 13:40:25

Date:         Tue, 10 Jul 2001 13:40:25 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Filtering data

In-Reply-To:  <sb4b0d4a.070@romfs7.rom.on.ca>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

This message is in reply to a comment about

records for captive animals.

 

>I would recommend that you do not use any captive records for a

>gazetteer.  Does that make sense?

 

In a restricted view of the utility of a gazetteer it does make sense to

exclude them. However, it is actually easier to include them, yet have them

flagged. This has the benefit that one can filter on the captive attribute.

This could be useful if you wanted to do a quick query of only captive

animals as well as for a query in which you want to leave them out.  The

philosophy in general will be to have a home for all data that anyone deems

useful, yet to allow each institution to decide which data it will provide

through the filters implemented during migration.

 

A filter might do any one of the following:

1) exclude attributes altogether (e.g., not show a "CaptiveFlag" field)

2) exclude records based on the value of an attribute (e.g., not show

records of endangered species)

3) exclude certain values of an attribute (e.g., not show localities for

endangered species)

4) substitute a surrogate value for an attribute of a certain value (e.g.,

instead of showing locality with lat-long, show only county-level and

higher geography for endangered species)
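
The four kinds of filter listed above could be sketched roughly as below.
This is a minimal illustration, not the migration code: the record layout,
every field name other than "CaptiveFlag", and all of the helper functions
are assumptions made up for the example.

```python
# Hypothetical record; the field names and values are invented for illustration.
record = {
    "Species": "Example species",
    "Endangered": True,
    "CaptiveFlag": False,
    "County": "Kern",
    "Locality": "10 mi E of Bakersfield",
    "Decimal_Latitude": 35.37,
    "Decimal_Longitude": -118.84,
}


def drop_attributes(rec, names):
    """Filter type 1: exclude attributes altogether."""
    return {k: v for k, v in rec.items() if k not in names}


def exclude_record(rec, predicate):
    """Filter type 2: drop the whole record when the predicate holds."""
    return None if predicate(rec) else rec


def blank_values(rec, names, predicate):
    """Filter type 3: hide certain attribute values when the predicate holds."""
    if not predicate(rec):
        return rec
    return {k: (None if k in names else v) for k, v in rec.items()}


def substitute(rec, replacements, predicate):
    """Filter type 4: replace values with surrogates when the predicate holds."""
    if not predicate(rec):
        return rec
    return {**rec, **replacements}


def is_endangered(rec):
    return rec.get("Endangered", False)


no_flag = drop_attributes(record, {"CaptiveFlag"})
hidden = exclude_record(record, is_endangered)
no_locality = blank_values(
    record, {"Locality", "Decimal_Latitude", "Decimal_Longitude"}, is_endangered
)
county_only = substitute(
    record,
    {"Locality": "Kern County", "Decimal_Latitude": None, "Decimal_Longitude": None},
    is_endangered,
)
```

Each helper returns a new dictionary and leaves the source record alone,
which fits the idea that the institutional database stays the database of
record and the filter only shapes what is provided.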

 

These are just a few examples of what might be done at one institution, and

may vary between institutions.  I encourage the participants to discuss

these issues, and to begin to make institutional decisions about filtering

rules when it comes time to set up the migration.  The rules must be

clearly defined before I begin to write the creation scripts - I can't

afford to stay at any given institution (except maybe Hawaii, heh heh),

while the rules are being hashed out.

 

>>> Posting number 90, dated 8 Aug 2001 13:10:05

 

>>> Posting number 91, dated 14 Sep 2001 08:48:17

 

>>> Posting number 92, dated 23 Sep 2001 17:24:24

 

>>> Posting number 93, dated 24 Sep 2001 20:07:31

Date:         Mon, 24 Sep 2001 20:07:31 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing Guidelines

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

Now that we are officially up and running I would like to provide the first

of two documents on the MaNIS collaborative georeferencing effort.  This

first document is meant to open for discussion the issues associated with

turning specific locality descriptions into well-documented latitudes and

longitudes.  The document does not explain what tools to use, or how to use

any of them - that will be in a forthcoming document. Instead, this

document focuses on the "theoretical aspects" of the task, our methods and

assumptions, upon which it would be helpful for us all to agree.  To that

end, please read the Georeferencing Guidelines page, accessible from the

Documents page on the MaNIS website (see below).  Comment by sending

messages to MAMMAL-Z-NET@USOBI.ORG. Let's try to get through this

discussion by 6 Oct.

 

http://dlp.cs.berkeley.edu/manis/Documents.html

 

Anticipating your enthusiastic participation,

 

John Wieczorek

 

>>> Posting number 94, dated 25 Sep 2001 18:30:16

Date:         Tue, 25 Sep 2001 18:30:16 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing text, for reference

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

It was pointed out to me that it might be prudent to have a text-only copy

of the document, with line numbers, to which everyone can refer in

discussions.  I am including the full text of the GeorefGuide.html file

below for that purpose.  The page itself can be found at the following URL:

 

http://dlp.cs.berkeley.edu/manis/GeorefGuide.html

 

 

   1 MaNIS

   2 The Mammal Networked Information System

   3

   4 John Wieczorek

   5 24 September 2001

   6 _________________________________________________

   7

   8 Georeferencing Guidelines

   9

  10 This document contains information about assigning geographic

  11 coordinates and maximum errors for those coordinates to specific

  12 locality descriptions. This document does not attempt to

  13 describe the tools and methods for finding named places on maps

  14 or gazetteers. The process of assigning coordinates and errors,

  15 called georeferencing, can be rather complicated. The complexity

  16 of the process can be greatly reduced and the consistency of the

  17 results can be greatly increased by establishing simple

  18 guidelines that cover most commonly encountered locality

  19 descriptions. The guidelines for assigning coordinates for named

  20 places are presented with examples in the section Determining

  21 Latitude & Longitude.

  22

  23 There are several fundamental sources of error for specific

  24 locality descriptions, and these vary in magnitude. It is

  25 essential during georeferencing to determine and record the

  26 greatest source of error among all possible sources. There are

  27 numerous ways in which the maximum error of a geographic

  28 coordinate might be expressed, but the most convenient is as a

  29 distance, because its size and shape are constant over any

  30 geodetic surface model. The sources of error and their

  31 magnitudes are discussed primarily in the section Determining

  32 Error.

  33

  34 An Appendix containing a description of the data that should be

  35 captured for each georeferenced locality, a glossary, and

  36 references are appended for the convenience of the reader.

  37

  38 Determining Latitude & Longitude

  39

  40 Geographic coordinates can be expressed in a number of different

  41 coordinate systems (e.g. decimal degrees, degrees minutes

  42 seconds, degrees decimal minutes, UTM, etc.). Conversions can be

  43 made readily between coordinate systems, but decimal degrees

  44 provide the most convenient coordinates to use for

  45 georeferencing for no more profound a reason than that a

  46 specific locality can be described with only two attributes:

  47 decimal latitude and decimal longitude.

  48
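
As an illustration of how readily such conversions can be made, here is a
minimal sketch of converting degrees minutes seconds to decimal degrees.
The function name dms_to_decimal and its arguments are assumptions for the
example; the two sample values are the ones used in the Appendix.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes, seconds plus N/S/E/W to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = degrees + minutes / 60 + seconds / 3600
    # south and west are negative by the usual convention
    return -value if hemisphere in ("S", "W") else value


lat = dms_to_decimal(42, 30, 36, "S")    # 42d 30' 36" S
lon = dms_to_decimal(122, 29, 24, "W")   # 122d 29' 24" W
```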

  49 Named Places

  50

  51 The simplest of specific locality descriptions consist of only a

  52 named place. Use the geographic center of a named place for the

  53 latitude and longitude, and use the distance from that point to

  54 the furthest point within that named place for the maximum error

  55 distance. If the geographic center of the named place is not

  56 within the confines of the shape of the named place, use the

  57 point nearest to the geographic center that lies within the

  58 shape.

  59

  60 Example: "Bakersfield"

  61

  62 Township Range Section (TRS) descriptions are essentially no

  63 different from that of any other named place. It is necessary to

  64 understand how TRS descriptions work and how they describe a

  65 place. See the References section, below, for links to TRS

  66 information.

  67

  68 Example: "E of Bakersfield, T29S R29E Sec. 34 NE 1/4"

  69

  70 Offsets

  71

  72 Offsets generally consist of combinations of distances and

  73 directions from a named place. Use the geographic center of the

  74 named place in the direction of the offset as a starting point.

  75 Unless there is contrary information in the locality

  76 description, measure the distance in the offset direction to

  77 find the spot for the geographic coordinates. Offsets that do

  78 not explicitly say that they were measured by air or by some

  79 contour (e.g., by road, river, valley, etc.) should be

  80 determined as if by air in a straight line.

  81

  82 Example: "10 mi E (by air) Bakersfield"

  83

  84 Example: "10 mi E of Bakersfield"

  85
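
One way to measure a straight-line ("by air") offset is sketched below.
It is an illustration under stated assumptions rather than a prescribed
method: it treats the earth as a sphere with a mean radius of 3958.8 mi and
uses the standard great-circle destination formula. The function name and
the starting coordinates are hypothetical.

```python
import math

EARTH_RADIUS_MI = 3958.8  # mean Earth radius in miles (spherical approximation)


def offset_point(lat, lon, distance_mi, bearing_deg):
    """Return the point reached by travelling distance_mi along bearing_deg.

    The bearing is measured clockwise from north (E = 90, S = 180, W = 270).
    """
    phi1 = math.radians(lat)
    lam1 = math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_mi / EARTH_RADIUS_MI  # angular distance in radians

    phi2 = math.asin(
        math.sin(phi1) * math.cos(delta)
        + math.cos(phi1) * math.sin(delta) * math.cos(theta)
    )
    lam2 = lam1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2),
    )
    return math.degrees(phi2), math.degrees(lam2)


# Hypothetical coordinates standing in for the center of a named place.
start_lat, start_lon = 35.0, -119.0
lat, lon = offset_point(start_lat, start_lon, 10, 90)  # "10 mi E"
north_lat, north_lon = offset_point(start_lat, start_lon, 10, 0)  # "10 mi N"
```

Offsets measured by road cannot be handled this way; they have to follow
the contours of the road on a map.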

  86 However, if there is no mention of the mode of measurement in

  87 the locality description, but the measurement includes fractions

  88 (e.g., 10.2 miles) and there is a road in the vicinity, use road

  89 miles. Offsets that were described in the specific locality as

  90 being measured by road should be determined using the contours

  91 of the road rather than using a straight line. The methods for

  92 determining the maximum error distances for these types of

  93 specific locality descriptions are given in the Determining

  94 Error section, below.

  95

  96 Example: "10.2 mi E of Bakersfield"

  97

  98 Example: "13 mi E (by road) Bakersfield"

  99

100 Vagueness

101

102 At times, specific locality descriptions are fraught with

103 vagueness. It is not the purpose here to belittle localities of

104 this type; in fact, an honest admission of the unknown is

105 preferable to masking it with unwarranted precision.

106

107 The most important type of vagueness in a specific locality

108 description is one in which the locality is in question. No such

109 locality should be georeferenced.

110

111 Example: "Bakersfield?"

112

113 Many locality descriptions imply an offset from a named place

114 without definitive directions or distances. Use the geographic

115 center of the named place for the geographic coordinates. For

116 the maximum error distance, use the greatest distance that is

117 not likely to be considered in the area of another named place.

118 Clearly there is a measure of subjectivity involved here. Let

119 common sense prevail and document the assumptions made.

120

121 Example: "near Bakersfield"

122

123 Sometimes offset information is vague either in its direction or

124 in its distance. If the direction information is vague, record

125 the geographic coordinates of the center of the named place and

126 add the offset distance to the greatest extent of the named

127 place to get the maximum error distance.

128

129 Example: "5 mi from Bakersfield"

130

131 Uncertainty in the offset distance is a fact of the business.

132 Almost no localities are recorded with error estimates,

133 therefore every offset distance is inherently uncertain. The

134 addition of a modifier in the locality description, while an

135 honest observation, should not change the determination of the

136 geographic coordinates or of the maximum error.

137

138 Example: "about 3 mi E of Bakersfield"

139

140 The worst of situations arises when a specific locality

141 description is internally inconsistent. There are numerous

142 possible causes for inconsistencies. It is the task of those

143 georeferencing to determine the part of the description most

144 likely to be in error, ignore it for the purpose of the

145 determination, and document the decision to do so. The most

146 common source of inconsistency in a locality description comes

147 from trying to match elevation information with the rest of the

148 description. If there is no reasonable way to reconcile the

149 discrepancy, ignore the elevation.

150

151 Example: "10 mi W of Bakersfield, 6000 ft"

152

153 Determining Error

154

155 The process of georeferencing includes an assessment of the

156 possible sources of error in a geographic coordinate

157 determination. Errors may arise due to the extent of a locality,

158 due to unspecified precision in original measurements (distance

159 precision and directional precision), or due to not knowing the

160 datum under which coordinates were determined. It is essential

161 to determine which of these yields the greatest error and record

162 that value as the maximum error distance. Potential error

163 sources and guidelines for determining the magnitude of each for

164 a given specific locality are given in the paragraphs below.

165

166 Error due to the shape of a locality

167

168 Named places are not single points; they have extents. If a

169 locality description is no more specific than to describe a

170 named place or an offset from a named place, then the size of

171 the named place is a source of error. The treatment of error due

172 to the extent of a locality is described under the examples of

173 determining latitude and longitude, above.

174

175 Error due to an unknown datum

176

177 Seldom have geographic coordinates been recorded for a locality

178 in a natural history collection in which the underlying datum of

179 the coordinate system was given. Even now, when GPS coordinates

180 are being taken as definitive evidence of a location, the

181 geodetic datum is being ignored. Without recording the datum

182 with the coordinates, potential accuracy is being lost. Figure 1

183 shows the magnitude of error (in meters) over North America

184 based on not knowing the datum from which the coordinates were

185 taken.

186

187 [datumerror.jpg]

188

189 Figure 1. Map of North America showing the magnitude of

190 potential error from not knowing whether coordinates were taken

191 from NAD27, NAD83, or WGS84.

192

193 This map can be used as a rough guide for determining the

194 magnitude of error due to not knowing the datum from which the

195 geographic coordinates were recorded.

196

197 Precision

198

199 Precision is difficult to gauge from specific locality

200 descriptions; it may be reflected in the locality description,

201 but it is seldom, if ever, explicitly recorded. Furthermore, a

202 database record may not reflect, or may reflect incorrectly, the

203 precision inherent in the original measurement, especially if

204 the locality description has undergone interpretation from the

205 verbatim original description. Precision issues arise from both

206 distance measurements and directions in a locality description.

207 Potential errors from each of these sources are discussed in the

208 paragraphs below.

209

210 Error associated with distance precision

211

212 Distance may be recorded in a specific locality description with

213 or without significant digits, and those digits may or may not

214 be warranted. A conservative way to ensure that distance

215 precision is not inflated is to treat distance measurements as

216 integers with fractional remainders. Thus 10.25 becomes 10 1/4,

217 10.5 becomes 10 1/2, etc. Calculate the error for these distances

218 based on the fractional part of the distance, using 1 divided by

219 the denominator of the fraction.

220

221 Example: "10.5 mi N of Bakersfield" Fraction is 1/2, error should

222 be 0.5 mi.

223

224 Example: "10.6 mi N of Bakersfield" Fraction is 1/10, error

225 should be 0.1 mi.

226

227 Example: "10.75 mi N of Bakersfield" Fraction is 3/4, error should

228 be 0.25 mi.

229

230 If the distance is an integer, use an error of one unit.

231

232 Example: "10 mi N of Bakersfield" Error should be 1 mi.

233
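
The distance precision rule can be sketched as follows. This is a rough
illustration, not a definitive implementation: the guidelines give examples
for 1/2, 1/4, and 1/10 but no complete list of fractions, so the choice to
try only halves and quarters before falling back to the written decimal
place is an assumption, as is the function name.

```python
from fractions import Fraction

# Assumption: only halves and quarters are treated as natural fractions.
COMMON_DENOMINATORS = (2, 4)


def distance_error(distance_text):
    """Return the error implied by the precision of a decimal distance string."""
    text = distance_text.strip()
    value = Fraction(text)
    fractional = value % 1
    if fractional == 0:
        return Fraction(1)  # integer distance: error of one unit
    for denominator in COMMON_DENOMINATORS:
        if (fractional * denominator).denominator == 1:
            return Fraction(1, denominator)
    # otherwise fall back to the last decimal place that was written down
    decimals = len(text.split(".")[1])
    return Fraction(1, 10 ** decimals)


errors = {d: float(distance_error(d)) for d in ("10", "10.5", "10.6", "10.75")}
```

The distance is read as text rather than as a floating-point number so
that the digits actually written in the locality description are kept.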

234 Error associated with directional precision

235

236 Direction is almost always expressed in specific locality

237 descriptions using cardinal and intercardinal directions rather

238 than degree headings. A conservative interpretation of these

239 directions allows for an error of 22.5 degrees to either side of

240 the recorded direction. Thus, ENE can be any direction between E

241 and NE, while NE can be any direction between ENE and NNE.

242

243 [directionerror.jpg]

244

245 The error distance resulting from imprecision in direction

246 increases with increasing offset distance. In fact the error

247 distance due to directional imprecision is 0.4142 times the

248 offset. Note, however, that when a locality description uses two

249 offsets based on cardinal directions (e.g., 1 mi N and 3 mi E of

250 Bakersfield), the distances and directions are likely to have

251 been measured on a map. In this case, directional imprecision

252 should be ignored.

253
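
The factor 0.4142 agrees with the tangent of 22.5 degrees; that reading is
an observation about the number, not something the text states. Under that
assumption the directional error can be sketched as:

```python
import math

HALF_SECTOR_DEG = 22.5  # allowed error to either side of the recorded direction


def direction_error(offset_distance):
    """Error distance from directional imprecision, in the units of the offset."""
    return offset_distance * math.tan(math.radians(HALF_SECTOR_DEG))


factor = direction_error(1.0)      # error per unit of offset
ten_miles = direction_error(10.0)  # error for a 10 mi offset
```

For a 10 mi offset this gives a little over 4 mi, which is larger than
the 1 mi distance precision error for the same description.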

254 Appendix

255

256 Geographic Coordinate Data

257

258 Following are the essential attributes to be captured for each

259 locality while georeferencing.

260

261 Decimal_Latitude - the latitude coordinate (in decimal degrees) at

262 the center of a circle encompassing the whole of a specific

263 locality. Convention holds that decimal latitudes north of the

264 equator are positive numbers less than or equal to 90, while

265 those south are negative numbers greater than or equal to -90.

266 Example: -42.51 degrees (which is the same as 42d 30' 36" S).

267

268 Decimal_Longitude - the longitude coordinate (in decimal degrees)

269 at the center of a circle encompassing the whole of a specific

270 locality. Decimal longitudes west of the Greenwich Meridian are

271 considered negative and must be greater than or equal to -180,

272 while eastern longitudes are positive and less than or equal to

273 180. Example: -122.49 degrees (which is the same as 122d 29' 24"

274 W).

275

276 Maximum_Error_Distance - the upper limit of the distance from the

277 given latitude and longitude within which the described locality

278 must lie.

279

280 Maximum_Error_Units - the units of length in which the maximum

281 error is recorded (e.g., mi, km, m, and ft). Express maximum

282 error distance in the same units as the distance measurement in

283 the specific locality description.

284

285 Datum - the geometric description of a geodetic surface model

286 (e.g., NAD27, NAD83, WGS84). Datums are often recorded on maps

287 and in gazetteers, and can be specifically set for most GPS

288 devices. Use "not recorded" when the datum is not known.

289

290 Original_Coord_System - the coordinate system in which the raw

291 data are being entered. For the purpose of collaborative

292 georeferencing this value will be "decimal degrees." However,

293 existing geographic coordinates may be entered in degrees

294 minutes seconds, degrees decimal minutes, or UTM coordinates.

295

296 Reference - the reference source (e.g., map, gazetteer, or

297 software) used to determine the coordinates. Such information

298 should provide enough detail so that anyone can locate the

299 actual reference that was used (e.g., name, edition or version,

300 year). Lat_Long_Determined_By - the person or organization by

301 which the determination was made.

302

303 Lat_Long_Determined_Date - the date on which the determination was

304 made.

305

306 Remarks - comments on methods and assumptions used in determining

307 coordinates or errors when those methods or assumptions differ

308 from or expand upon the accepted guidelines.

309
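
The attributes above could be held in a simple record type such as the
one sketched here. The class name, the lower-case field names, the default
values, and the choice to store the date as text are all assumptions made
for illustration; only the attribute list itself comes from the Appendix.
The range checks use the usual convention of latitudes from -90 to 90 and
longitudes from -180 to 180.

```python
from dataclasses import dataclass


@dataclass
class GeorefRecord:
    """One georeferenced locality, using the attribute names from the Appendix."""

    decimal_latitude: float
    decimal_longitude: float
    maximum_error_distance: float
    maximum_error_units: str
    datum: str = "not recorded"
    original_coord_system: str = "decimal degrees"
    reference: str = ""
    lat_long_determined_by: str = ""
    lat_long_determined_date: str = ""
    remarks: str = ""

    def __post_init__(self):
        # reject coordinates outside the valid decimal degree ranges
        if not -90 <= self.decimal_latitude <= 90:
            raise ValueError("latitude must lie between -90 and 90")
        if not -180 <= self.decimal_longitude <= 180:
            raise ValueError("longitude must lie between -180 and 180")
        if self.maximum_error_distance < 0:
            raise ValueError("maximum error distance cannot be negative")


record = GeorefRecord(-42.51, -122.49, 1.0, "mi")
```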

310 Glossary

311

312 Datum - A geodetic datum describes the size, shape, origin, and

313 orientation of a coordinate system for mapping the surface of

314 the earth.

315

316 Decimal degrees - degrees expressed as a single real number (e.g.,

317 -22.343456) rather than as a composite of degrees, minutes,

318 seconds, and direction (e.g., 7d 54' 18.32" E).

319

320 Geodetic surface model - a geometric description of the surface of

321 the earth.

322

323 Geographic coordinates - latitude and longitude, measured in any

324 of various coordinate systems.

325

326 Geographic center - To find the geographic center of a shape,

327 first find the extremes of both latitude and longitude within

328 the shape and then take their respective means.

329

330 UTM - Universal Transverse Mercator. A grid coordinate system

331 specifying a datum, zone, and offsets from the equator and from

332 the meridian of the zone. See the References section, below, for

333 more information.

334

335 References

336

337 Township, Range, Section Information:

338

339 http://www.esg.montana.edu/gl/trs-data.html

340

341 Datum Information:

342

343 http://www.colorado.edu/geography/gcraft/notes/datum/datum_f.html

344 http://164.214.2.59/GandG/tm83581/tr83581a.htm

345 http://biology.usgs.gov/geotech/documents/datum.html

346

347 UTM Information:

348

349 http://www.nps.gov/prwi/readutm.htm

350 http://www.dmap.co.uk/ll2tm.htm

351

352 Note

353

354 Specific locality descriptions are inexact and seldom give

355 estimates of error. An ideal description of a specific locality

356 has no error. One way to achieve this ideal is to describe the

357 locality by a shape within which the exact locality must

358 certainly lie. The capture of shape data is certainly possible

359 with current GIS technology, and is even demonstrably more

360 efficient than the methods described above. However, there are

361 technical challenges yet to be met in order to make the capture

362 of shape data feasible in a collaborative Internet-based

363 georeferencing environment.

364

365 An alternative to using a shape to describe a locality is to use

366 a definitive point of arbitrarily high precision with an

367 attendant maximum error. This method, described in the foregoing

368 document, is a conservative expression of the locality which

369 satisfies the requirement that the exact locality must lie

370 within the space described.

371

372

373 _________________________________________________

374

375 Rev. 24 September 2001, JRW

376

377 University of California, Berkeley, CA 94720, Copyright 2001,

378 The Regents of the University of California.

 

>>> Posting number 95, dated 27 Sep 2001 10:45:45

Date:         Thu, 27 Sep 2001 10:45:45 -1000

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Georeferencing document

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

John,

 

I went through your document this morning and find most of it clear and in

agreement with my own practices of georeferencing.  I have some

observations and questions as follows:

 

A.

140 The worst of situations arises when a specific locality

141 description is internally inconsistent. There are numerous

142 possible causes for inconsistencies. It is the task of those

143 georeferencing to determine the part of the description most

144 likely to be in error, ignore it for the purpose of the

145 determination, and document the decision to do so. The most

146 common source of inconsistency in a locality description comes

147 from trying to match elevation information with the rest of the

148 description. If there is no reasonable way to reconcile the

149 discrepancy, ignore the elevation.

150

151 Example: "10 mi W of Bakersfield, 6000 ft"

 

I have recently been through a georeferencing exercise in the herp

collection for which obtaining coordinates that agreed with the elevations

was critical.  It was only through trying to match the description of the

location (distance and direction from X village) with the elevation given,

and finding that the given elevation at the described site was impossible,

that I uncovered major problems in the locality data provided for a large

number of herps on one particular collecting trip.  In this case I was able

to contact the collector to ask about the inconsistencies and he determined

that his original distances were totally off because he was using miles on

a metric map.  In this case the elevations were the correct piece of

information.  I therefore caution against ignoring elevations out of hand.

 

B.

Section on Determining Latitude and Longitude does not include an example

for when coordinates are provided.  For the sake of completeness, should

such an example be included, or, since they are being provided and not

determined, should this be taken up in another section?  For example, when

coordinates are provided in degrees, minutes and seconds, do we translate

into decimals?  How many decimal places do we go for minutes?  For

seconds?  Does it matter who provided the coordinates?  Collector?

Previous museum person?  Someone else?  Under what circumstances, if any,

should we recalculate coordinates when they are provided by some previous

source?

 

 

C.

210 Error associated with distance precision

211

212 Distance may be recorded in a specific locality description with

213 or without significant digits, and those digits may or may not

214 be warranted. A conservative way to insure that distance

215 precision is not inflated is to treat distance measurements as

216 integers with fractional remainders. Thus 10.25 becomes 10 1/4,

217 10.5 becomes 10 1/2, etc. Calculate the error for these distances

218 based on the fractional part of the distance, using 1 divided by

219 the denominator of the fraction.

 

Lines 217-219.  Does this mean to "replace" the numerator  with 1, and

divide by the denominator?

 

221 Example: "10.5 mi N of Bakersfield" Fraction is 1/2, error should

222 be 0.5 mi.

 

numerator is 1 to begin with, so doesn't answer the question.

 

224 Example: "10.6 mi N of Bakersfield" Fraction is 1/10, error

225 should be 0.1 mi.

 

Isn't the fraction of .6,  6/10?   Did you replace the 6 with a 1 in order

to calculate the error?

 

227 Example: "10.75 mi N of Bakersfield" Fraction is 3/4, error should

228 be 0.25 mi.

 

Fraction this time is given as 3/4, not 1/4, but you could only get an

error of 0.25 by replacing the 3 with a 1 before dividing by 4.

 

As you can see, the examples are confusing.

 

 

All in all, it's a sound document.  Thanks much.

 

 

>>> Posting number 96, dated 27 Sep 2001 20:34:47

Date:         Thu, 27 Sep 2001 20:34:47 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         Gordon Jarrell <fnghj@AURORA.UAF.EDU>

Subject:      Re: Georeferencing document

In-Reply-To:  <5.0.2.1.1.20010927104434.00a2f7e0@mail.bishopmuseum.org>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

Some good points.  I've inserted my comments.

 

On Thu, 27 Sep 2001, XXXXXXX wrote:

 

> A.

> 140 The worst of situations arises when a specific locality

> 141 description is internally inconsistent. There are numerous

> 142 possible causes for inconsistencies. It is the task of those

> 143 georeferencing to determine the part of the description most

> 144 likely to be in error, ignore it for the purpose of the

> 145 determination, and document the decision to do so. The most

> 146 common source of inconsistency in a locality description comes

> 147 from trying to match elevation information with the rest of the

> 148 description. If there is no reasonable way to reconcile the

> 149 discrepancy, ignore the elevation.

> 150

> 151 Example: "10 mi W of Bakersfield, 6000 ft"

> 

> I have recently been through a georeferencing exercise in the herp

> collection for which obtaining coordinates that agreed with the elevations

> was critical.  It was only through trying to match the description of the

> location (distance and direction from X village) with the elevation given,

> and finding that the given elevation at the described site was impossible,

> that I uncovered major problems in the locality data provided for a large

> number of herps on one particular collecting trip.  In this case I was able

> to contact the collector to ask about the inconsistencies and he determined

> that his original distances were totally off because he was using miles on

> a metric map.  In this case the elevations were the correct piece of

> information.  I therefore caution against ignoring elevations out of hand.

> 

 

The key words here are, "IF there is no way to reconcile the

discrepancy..."  A possible resolution of the discrepancy might be to

treat it as "specific locality unknown."  This might best be left to the

discretion of the individual collections.  We have to judge individually

how bad our bad data are, i.e., whether or not we can reconcile them.

 

> B.

> Section on Determining Latitude and Longitude does not include an example

> for when coordinates are provided.  For the sake of completeness, should

> such an example be included, or, since they are being provided and not

> determined, should this be taken up in another section?  For example, when

> coordinates are provided in degrees, minutes and seconds, do we translate

> into decimals?  how many decimal places do we go for minutes?  for

> seconds?  Does it matter who provided the

> coordinates?  collector?  previous museum person?  someone else?  Under

> what circumstances, if any, should we recalculate coordinates when they are

> provided by some previous source?

> 

 

(I know John's answer to some of this one.)  The coordinates define an

infinitely small point, no matter what the format.  Precision is measured

with max_error, not the number of significant figures.

 

Nevertheless, we will have coordinates in which precision was implied by

the recorded format.  We have to convert this implied imprecision into a

measure of max_error.  At UAM we are using 2 km, a little over a nautical

mile, for coordinates that were recorded to the nearest whole minutes.

 

There are other examples, similar to the problems with distance precision:

        64D 28' 30" N -  What they meant to say, in terms of significant

figures, was probably 64D 28.5' N.  I suppose in this example we would use

max_error= 1 km

 

We probably do need to develop a standard here.  And yes, I'll bet we want

to be able to keep track of various determinations, re-determinations, who

did it, when, and how.

 

 

> C.

> 210 Error associated with distance precision

> 211

> 212 Distance may be recorded in a specific locality description with

> 213 or without significant digits, and those digits may or may not

> 214 be warranted. A conservative way to insure that distance

> 215 precision is not inflated is to treat distance measurements as

> 216 integers with fractional remainders. Thus 10.25 becomes 10 1/4,

> 217 10.5 becomes 10 1/2, etc. Calculate the error for these distances

> 218 based on the fractional part of the distance, using 1 divided by

> 219 the denominator of the fraction.

> 

> Lines 217-219.  Does this mean to "replace" the numerator  with 1, and

> divide by the denominator?

> 

> 221 Example: "10.5 mi N of Bakersfield" Fraction is 1/2, error should

> 222 be 0.5 mi.

> 

> numerator is 1 to begin with, so doesn't answer the question.

> 

> 224 Example: "10.6 mi N of Bakersfield" Fraction is 1/10, error

> 225 should be 0.1 mi.

> 

> Isn't the fraction of .6,  6/10?   Did you replace the 6 with a 1 in order

> to calculate the error?

> 

> 227 Example: "10.75 mi N of Bakersfield" Fraction is 3/4, error should

> 228 be 0.25 mi.

> 

> Fraction this time is given as 3/4, not 1/4, but you could only get an

> error of 0.25 by replacing the 3 with a 1 before dividing by 4.

> 

> As you can see, the examples are confusing.

> 

> 

 

Looks like a typo in line 224.

 

I suggest replacing the sentence beginning in line 217 with:

 

The error is the resolution implied by the denominator.  It can be

calculated as a distance by dividing one unit of distance by the

denominator.

 

Is that better?  Or worse?
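
One possible reading of the rule under discussion is sketched below in Python. It is an illustration only, not the Guidelines' own procedure. The function name and the list of candidate denominators (halves, quarters, eighths, tenths, hundredths, thousandths) are assumptions, chosen so that 10.6 is read as 6/10 rather than reduced to 3/5. A whole-number distance is assumed to carry an error of one unit.

```python
from decimal import Decimal
from fractions import Fraction

# Assumed "natural" denominators a collector might have had in mind,
# tried from coarsest to finest. This list is not from the Guidelines.
CANDIDATE_DENOMINATORS = (2, 4, 8, 10, 100, 1000)


def distance_precision_error(distance_text):
    """Return the error implied by the fractional part of a distance.

    distance_text is the distance exactly as written, e.g. "10.75".
    The result is 1 / denominator, in the same units as the distance.
    """
    value = Fraction(Decimal(distance_text))      # exact, no float rounding
    whole = value.numerator // value.denominator  # integer part (positive distances)
    fractional = value - whole
    if fractional == 0:
        return Fraction(1)                        # assumption: whole numbers -> 1 unit
    for denominator in CANDIDATE_DENOMINATORS:
        # The coarsest denominator that expresses the fraction exactly wins.
        if (fractional * denominator).denominator == 1:
            return Fraction(1, denominator)
    raise ValueError("no candidate denominator fits " + distance_text)


for text in ("10.5", "10.6", "10.75"):
    print(text, "mi -> error", float(distance_precision_error(text)), "mi")
```

Under these assumptions the three Bakersfield examples give 0.5, 0.1, and 0.25 mi, which matches the errors stated in the quoted lines.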

 

 

>>> Posting number 97, dated 28 Sep 2001 12:53:09

Date:         Fri, 28 Sep 2001 12:53:09 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Georeferencing guidelines

Mime-version: 1.0

Content-type: multipart/alternative;

              boundary="MS_Mac_OE_3084526390_196216_MIME_Part"

 

 

John et al.,

 

The georeferencing guidelines look great to me.  The only (minor) quibble I

have would be with the second item under the subheading "Offsets" (lines

86-89).  Here, you suggest that a locality that contains distance fractions

(such as "10.2 mi E Bakersfield") should be assumed to be road miles rather

than air miles.  I see it the other way around.  Most field workers I know

are careful to state "by road" if their mileage was actually measured along

a road.  Otherwise, the mileage is assumed to be taken directly from a map

(i.e., air miles).  I don't see that the inclusion of fractions in the

mileage should automatically signal that the mileage was read from an

odometer...it's easy to get that level of precision using the distance

scale printed on the map.

 

Let's see what the others think.  Well done.

 

 

>>> Posting number 98, dated 28 Sep 2001 11:33:22

Date:         Fri, 28 Sep 2001 11:33:22 -0700

Reply-To:     Peter Rauch <peterr@socrates.Berkeley.EDU>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: Georeferencing guidelines

In-Reply-To:  <OF482A362E.E38FA255-ON86256AD5.00621E6D@lsu.edu>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

On Fri, 28 Sep 2001, XXXXXXXX wrote:

 

> The georeferencing guidelines look great to me.  The only

> (minor) quibble I have would be with the second item under

> the subheading "Offsets" (lines 86-89). Here, you suggest

> that a locality that contains distance fractions (such as

> "10.2 mi E Bakersfield") should be assumed to be road miles

> rather than air miles. I see it the other way around. Most

> field workers I know are careful to state "by road" if their

> mileage was actually measured along a road.

 

On insect labels ;>)  "by road" is just that much more text to

cram onto tiny labels. Maybe things are different with

vertebrate folks, especially for those who keep detailed field

notebooks. I think lots of folks keep careful track of their

odometers, and record road/track miles quite often. I suspect

that *either* assumption is likely to be wrong too often (i.e.,

when no explicit indication is given of which type of

measurement is done). Perhaps the classification should be

"Basis of measure not indicated" and let the "buyer beware"?

(I.e., the geographic analyst can then choose how she wishes to

interpret the distances --perhaps choosing to measure both ways

if a locality seems out of place under one or the other

measurement scheme.)

 

 

 

>  Otherwise, the

> mileage is assumed to be taken directly from a map (i.e.,

> air miles).  I don't see that the inclusion of fractions in

> the mileage should automatically signal that the mileage was

> read from an odometer...it's easy to get that level of

> precision using the distance scale printed on the map.

 

>>> Posting number 99, dated 30 Sep 2001 13:35:49

Date:         Sun, 30 Sep 2001 13:35:49 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      FW: Locality comment

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

 

John et al.:

With regard to assigning coordinates to localities, there is a convention

that has been used here at KU for at least 50 years that will help with

localities that are given with reference to towns in the US.  When the town

(e.g. Lawrence) was a county seat, distances were measured from the

courthouse.  Frequently this was near the center of town, but it reduces the

error in estimating the distance from town because we don't need to worry

about the distance being measured from the city limits.  If the locality is

3.5 mi NW of Lawrence, we still have the uncertainty associated with the angular

component.  If the town is not a county seat, the Post Office is frequently

specified as the point of reference.  We think this system was exported to

several other collections that are part of MaNIS. In general, your

suggestions look quite reasonable (and conservative).

 

 

>>> Posting number 100, dated 12 Oct 2001 16:22:06

Date:         Fri, 12 Oct 2001 16:22:06 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing Commentary synopsis

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Hi folks,

 

I've been ruminating over the responses to the Georeferencing Guidelines

document, which was posted on the MaNIS website on 24 Sep 2001. That

document has generated interest in a wider community, including the

Alexandria Digital Library Project, so I feel it worthwhile to spend a

little extra effort to fill in some omissions.  Below I will address the

points brought up in discussion and try to provide satisfactory solutions.

I would like to know if there are any objections to these solutions.  My

next step will be to incorporate this information into the Guidelines

document and then announce the existence of that document to NHCOLL.

 

XXXXXXXX mentioned a convention to use the courthouse for a

point of reference for a county seat and to use a post office as a point of

reference for other towns.  Since the Board on Geographic Names GNIS data

often follows this convention as well I see no conflict. Of course, this

convention applies only to the US, and only to those towns where there is a

single identifiable post office or a courthouse.  For all other

determinations the current geographic center of the town, or the

coordinates given in a gazetteer, should be used. In either case it is best

to note something akin to "measured from the post office" or "measured from

the geographic center of Bakersfield" in the determination remarks.
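
The Guidelines glossary defines the geographic center as the mean of the latitude extremes and the mean of the longitude extremes of a shape. A minimal Python sketch of that definition follows. The function name and the sample outline coordinates are hypothetical, and the sketch assumes the shape does not cross the 180th meridian.

```python
def geographic_center(outline):
    """Return (latitude, longitude) of the geographic center of a shape.

    outline is a sequence of (latitude, longitude) pairs in decimal
    degrees that trace the shape, e.g. a town boundary read off a map.
    """
    latitudes = [lat for lat, _ in outline]
    longitudes = [lon for _, lon in outline]
    # Mean of the extremes in each dimension, per the glossary definition.
    center_lat = (min(latitudes) + max(latitudes)) / 2.0
    center_lon = (min(longitudes) + max(longitudes)) / 2.0
    return center_lat, center_lon


# Hypothetical outline points, for illustration only.
town_outline = [(35.30, -119.10), (35.44, -118.90), (35.35, -119.00)]
print(geographic_center(town_outline))
```

Note that this is the center of the bounding box of the shape, not its centroid, so an irregular town outline can put the two some distance apart.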

 

XXXXXXXX brought up the topic of elevations as a critical part of the

determination criteria. I agree with her assessment and I propose that we

follow XXXXXXXX's advice, namely, that localities for which there are

internal inconsistencies should be deferred to the parent institution for

further investigation.  I have designed the collaborative gazetteer to

allow annotations to both localities and higher geography. Through the

annotations, georeferencers can note inconsistencies for follow-up work.

Collaborators will be able to check the gazetteer for annotations that

apply to the data from their institution.

 

XXXXX also noted that there was no example of how to deal with existing

geographic coordinates. My original thought was that we should count these

localities as finished.   Yet, there is merit in revisiting existing data,

both for validation and for edification, especially since none of the

existing coordinates have associated error. Nevertheless, we must remain

cognizant of our budgetary constraints. We were given funds to georeference

localities for which we didn't already have coordinates. All that aside,

XXXXX's point is well-taken. I will provide guidelines for existing

geographic coordinates in the forthcoming revised Georeferencing Guidelines

document.

 

XXXXX asked whether we should translate coordinates from other coordinate

systems into decimal degrees for data entry. The gazetteer currently

accommodates the following coordinate systems:

decimal degrees

degrees, decimal minutes

degrees, minutes, decimal seconds

UTM

 

But that doesn't answer the question. I will endeavor to create an

interface in which the user will select the original coordinate system and

provide the data in that system. Behind the scenes the data will be stored

in that system AND will be translated to decimal degrees. There will be

decimal degrees and the original coordinates for every determination.
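
The translation step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the gazetteer's actual code: the function name and argument layout are hypothetical, and UTM is left out because it requires a datum-aware projection.

```python
def to_decimal_degrees(degrees, minutes=0.0, seconds=0.0, hemisphere="N"):
    """Convert degrees/minutes/seconds (or degrees and decimal minutes)
    to signed decimal degrees. South and west are negative."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    magnitude = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    sign = -1.0 if hemisphere in ("S", "W") else 1.0
    return sign * magnitude


# The example from the Guidelines: 122d 29' 24" W is -122.49 degrees.
print(round(to_decimal_degrees(122, 29, 24, "W"), 6))

# Degrees and decimal minutes use the same function with seconds omitted.
print(round(to_decimal_degrees(64, 28.5, hemisphere="N"), 6))
```

Storing the original values alongside the result, as proposed above, means the conversion can always be repeated or checked later.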

 

XXXXX's next topic was with respect to the precision stored in the

coordinate fields. There is no reason to truncate the values of coordinates

to conform to a predefined level of precision.  For reasons described under

the section on Precision in the Georeferencing Guidelines document, it is

inappropriate to try to store precision information in the coordinate data.

Since the values of the coordinates do not make a statement about the

precision of the determination, keeping as many digits as your source

provides is the preferred method. Discarding digits may have an effect on

accuracy, so it is not recommended.  Just for edification, a decimal degree

that records five digits to the right of the decimal can distinguish

between two places on the earth roughly one meter apart. Similarly, if you

want to maintain accuracy down to one meter, degrees and decimal minutes

should be recorded with 4 decimal places in the decimal minutes, and

degrees minutes seconds should be recorded with 2 decimal places in the

decimal seconds. Conversely, degrees minutes seconds measured to whole

seconds can introduce inaccuracies of up to 31 meters. Those measured to

whole minutes can introduce inaccuracies of up to 1.85 km. I'll make a

chart of this information for the document revision.
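
The figures in the paragraph above can be reproduced with a rough calculation. The Python sketch below assumes one minute of arc is one nautical mile (1852 m), which holds approximately for latitude everywhere and for longitude only near the equator. The function name and the short codes for the coordinate systems are hypothetical.

```python
METERS_PER_MINUTE = 1852.0  # assumption: one arc-minute ~ one nautical mile

# How many of each system's smallest named unit fit in one degree.
UNITS_PER_DEGREE = {
    "dd": 1,      # decimal degrees
    "dm": 60,     # degrees, decimal minutes
    "dms": 3600,  # degrees, minutes, decimal seconds
}


def coordinate_resolution_m(system, decimal_places):
    """Approximate ground distance (m) spanned by one unit in the last
    recorded decimal place of a coordinate in the given system."""
    meters_per_degree = 60 * METERS_PER_MINUTE
    smallest_unit_m = meters_per_degree / UNITS_PER_DEGREE[system]
    return smallest_unit_m / (10 ** decimal_places)


for system, places in (("dd", 5), ("dm", 4), ("dms", 2), ("dms", 0), ("dm", 0)):
    print(system, places, round(coordinate_resolution_m(system, places), 2), "m")
```

With these assumptions, five decimal places of a degree come to about 1.1 m, whole seconds to about 31 m, and whole minutes to about 1.85 km, in line with the values quoted above.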

 

XXXXX's final question has to do with recording the information about who

determined the coordinates.  This should certainly be among the best

practices within museums.  At the MVZ these data are recorded by making a

reference to the actual person who made the determination.  Since the data

are internal to the museum we can tell whether that person was also the

collector or another person on staff. Another possibility is to record the

role of the person who made the determination (e.g., 'collector',

'curatorial assistant', 'Joe's specific locality munger', etc.). Or, if you

only care whether the collector was the one to provide the coordinates, you

could include a DeterminedByCollector field. For MaNIS I intend to use the

name of the person who determines the coordinates, this name being

determined from a login to the online georeferencing interface.

 

A point of clarification is in order. When determinations are made, I

intend to treat them as opinions. They will not be stored directly with the

locality record, rather, they will refer to it.  This allows any number of

lat/long opinions to be registered. The individual institutions will be

able to decide which one (if there are multiple opinions) will be the

"accepted" determination when they put the data back in their databases.

All of the coordinates that were provided in the data sent to me have been

turned into opinions and are already in the gazetteer.
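
As a rough picture of what "determinations as opinions" could look like, here is a small Python sketch. It is not the gazetteer schema. The class names and the locality_id key are hypothetical; the remaining field names follow the field definitions in the Guidelines document.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Determination:
    """One lat/long opinion. It refers to a locality by id rather than
    being stored inside the locality record."""
    locality_id: int                 # hypothetical key into the locality table
    latitude: float                  # decimal degrees
    longitude: float                 # decimal degrees
    maximum_error_distance: float
    maximum_error_units: str         # e.g. "mi", "km", "m", "ft"
    datum: str = "not recorded"
    original_coord_system: str = "decimal degrees"
    reference: str = ""
    determined_by: str = ""
    determined_date: str = ""
    remarks: str = ""


class OpinionStore:
    """Holds any number of determinations per locality."""

    def __init__(self):
        self._opinions: List[Determination] = []

    def register(self, determination):
        self._opinions.append(determination)

    def opinions_for(self, locality_id):
        return [d for d in self._opinions if d.locality_id == locality_id]


store = OpinionStore()
store.register(Determination(1, 35.37, -119.19, 2.0, "mi", determined_by="collector"))
store.register(Determination(1, 35.38, -119.20, 1.0, "mi", determined_by="staff"))
print(len(store.opinions_for(1)), "opinions registered for locality 1")
```

Because the opinions only point at the locality, each institution can later pick the one it accepts without any of the others being lost.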

 

XXXXXX made the following observation:

"There are other examples, similar to the problems with distance precision:

         64D 28' 30" N -  What they meant to say, in terms of significant

figures, was probably 64D 28.5' N.  I suppose in this example we would use

max_error= 1 km"

 

I agree with XXXXXX's assessment of significance, however, the

determination of error is more complicated.  Not all degrees are created

equal. Contrary to popular opinion, the distance between 64 degrees N and

65 degrees N is not the same as the distance between 10 degrees N and 11

degrees N. This is due to the oblateness (flattening from a perfect sphere)

of the earth. This may be a minor point, but longitudinal degrees vary

greatly, being roughly 110 km at the equator and 0 km at the poles. My

point is that I need to provide an interface in which one can enter

coordinates and the digits of precision and get back an error distance

based on those criteria.
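
To make the point concrete, the local length of one degree can be computed from an ellipsoid. The Python sketch below assumes the WGS84 ellipsoid; for other datums the constants would differ. The function names are hypothetical, and the formulas are the standard radii of curvature in the meridian and in the prime vertical.

```python
import math

# WGS84 defining parameters (assumed datum for this sketch).
WGS84_A = 6378137.0                   # semi-major axis, meters
WGS84_F = 1.0 / 298.257223563         # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)  # first eccentricity squared


def meters_per_degree_latitude(latitude_deg):
    """Local north-south length of one degree of latitude."""
    phi = math.radians(latitude_deg)
    w = 1.0 - WGS84_E2 * math.sin(phi) ** 2
    meridian_radius = WGS84_A * (1.0 - WGS84_E2) / w ** 1.5
    return math.radians(1.0) * meridian_radius


def meters_per_degree_longitude(latitude_deg):
    """Local east-west length of one degree of longitude."""
    phi = math.radians(latitude_deg)
    w = 1.0 - WGS84_E2 * math.sin(phi) ** 2
    prime_vertical_radius = WGS84_A / math.sqrt(w)
    return math.radians(1.0) * prime_vertical_radius * math.cos(phi)


for lat in (0.0, 10.5, 64.5):
    print(lat,
          round(meters_per_degree_latitude(lat)),
          round(meters_per_degree_longitude(lat)))
```

The latitude figure changes by roughly one percent between the equator and the poles, while the longitude figure shrinks with the cosine of the latitude, so at 64 degrees N a degree of longitude is less than half its equatorial length.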

 

I will amend my wording and typos with respect to using fractions in the

distance precision error section.

 

XXXXXXXXX brought up a reasonable alternative view of how offsets should

be handled. The judgement of whether measurements are "by road" or "by air"

can be a tricky one.  I want to propose a solution and see if I can get a

consensus.

 

Specific localities that actually say what the measurement method is (e.g.,

"2.8 mi (by road) E of Marysville") should use that method for determining

coordinates and errors. No special remark is necessary in these cases.

 

Specific localities that have two orthogonal measurements in them (e.g.,

"2.5 mi E and 1.5 mi N of Bakersfield") are always assumed to be "by

air."  No special remark is necessary in these cases either. Furthermore,

no error due to direction imprecision should be used.

 

So much for the easy stuff.

 

Specific localities that have one linear offset measurement from a named

place, but that do not specify how that measurement was taken (e.g., "10.2

mi E of Yuma") are open for a case-by-case judgment. I propose that the

judgement itself always be documented in the remarks for the determination

(e.g., "Assumed 'by air' - no roads E out of Yuma", or "Assumed 'by road'

on Hwy. 80"). If there is no clear best choice, then use the midpoint

between the two possibilities as the geographic coordinate and assign an

error large enough to encompass the coordinates and errors of both methods.

In this case I would remark something like "Error encompasses both distance

by air and distance by road (Hwy. 80)". This is a conservative solution,

but it is relatively simple to do and to remember.  This method is also

never "wrong," if by "wrong" we mean that the actual place is certainly

within our error distance from the given coordinates.
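
The "no clear best choice" case can be sketched as follows in Python. This is one conservative way to do it, under the assumption that both candidate points are expressed as planar offsets (east, north) from the named place in the same units. The function name and the sample numbers are hypothetical.

```python
import math


def combine_candidates(point_a, error_a, point_b, error_b):
    """Return (midpoint, radius) such that a circle of that radius about
    the midpoint contains both candidate error circles.

    point_a, point_b: (east, north) offsets from the named place.
    error_a, error_b: maximum error of each candidate, same units.
    """
    (xa, ya), (xb, yb) = point_a, point_b
    midpoint = ((xa + xb) / 2.0, (ya + yb) / 2.0)
    half_separation = math.hypot(xb - xa, yb - ya) / 2.0
    # Each circle reaches at most half_separation + its own error from
    # the midpoint, so the larger error sets the combined radius.
    radius = half_separation + max(error_a, error_b)
    return midpoint, radius


# Hypothetical: "by air" puts the site 10.2 mi due east; "by road" the
# highway bends, ending 9.0 mi east and 1.5 mi north. Errors in miles.
by_air = ((10.2, 0.0), 0.5)
by_road = ((9.0, 1.5), 0.5)
print(combine_candidates(by_air[0], by_air[1], by_road[0], by_road[1]))
```

The resulting circle is larger than either candidate's own, which is the price of never being "wrong" in the sense described above.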

 

XXXXXXXXX brought up a question about what units should be used

for maximum error distance. I have set up the gazetteer so that the units

are entered (chosen actually) from a list of possible values (m, km, ft,

yds, mi). The distance and units should be chosen to make sense in the

context of the locality description. My conservative stance on translation

and recalculation issues is to "never adulterate data that can be

adulterated later." If you decide to put these data back into your

databases (and I certainly hope that you will), you can decide at that time

whether to normalize to a single unit of measure.
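
If an institution does choose to normalize later, the conversion is simple. The Python sketch below assumes the international foot, yard, and mile; the function name is hypothetical, and the unit codes are the ones listed above.

```python
# Meters per unit, using the international definitions of ft, yd and mi.
METERS_PER_UNIT = {
    "m": 1.0,
    "km": 1000.0,
    "ft": 0.3048,
    "yds": 0.9144,
    "mi": 1609.344,
}


def error_in_meters(distance, units):
    """Convert a maximum error distance to meters, leaving the recorded
    distance and units untouched."""
    try:
        return distance * METERS_PER_UNIT[units]
    except KeyError:
        raise ValueError("unknown units: " + repr(units))


print(error_in_meters(0.25, "mi"), "m")
```

Keeping the original distance and units and deriving the normalized value on demand is consistent with the stance of not adulterating data before it is necessary.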

 

XXXXXXX also brought up an essential issue of whether errors propagate and

should therefore be summed rather than simply choosing the greatest single

source of error.  The answer is not a simple one, so bear with me.

 

XXXXXXX's specific example, "3 km N + 2 km W Bakersfield" is an instance

of a type of locality description for which I did not provide an example. A

proper description of the error for this example would be a bounding box

centered on the point 3 km N and 2 km W of Bakersfield. Each side of the

box would be 2 km in length (1 km error in any direction). Since we're

using a point and radius to characterize the error, we need a circle that

will circumscribe the above-mentioned bounding box. To do this, the radius

has to be the distance from the center coordinate to a corner. This could

either be calculated by the geometry of the bounding box (in the above

example it would be the 1 km distance to a side times the square root of 2,

or about 1.4 km) or measured on a map.
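
The geometry of the bounding-box example can be checked with a short Python sketch. The function name is hypothetical, and the sketch treats the two offset errors as the half-widths of a rectangle on a flat plane.

```python
import math


def circumscribed_radius(error_east_west, error_north_south):
    """Radius of the circle that passes through the corners of a box
    extending the given error in each direction from its center."""
    return math.hypot(error_east_west, error_north_south)


# "3 km N + 2 km W Bakersfield": 1 km of error in each direction.
print(round(circumscribed_radius(1.0, 1.0), 3), "km")
```

For equal errors this is the per-axis error times the square root of 2, so the point-and-radius error is larger than either axis error but smaller than their sum.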

 

There remains the more general question of whether errors propagate. They

do, and they are non-linear, so to sum them is a mistake. The paragraph

above shows how a sum is not a satisfactory method of accommodating

multiple sources of error. As more sources of error come to bear, the

propagation gets even more "interesting." I'll spare you the details here,

but I'll make a point of explaining these sources and how they should be

dealt with in the Guidelines revision.

 

In addition to the issues brought up so far in discussion, I have a few to

add independently. First, I got the calculation for directional error

wrong. I'll update that in the revision. Second, it is probably obvious,

but I still need to state that the directional error can be ignored when

the distance is measured either "by road" or when the description gives two

orthogonal offsets (e.g., "2 mi E and 4 mi N"). Third, there is another

source of error inherent in reading maps. This error is based on the scale

and it reflects inherent errors in the maps themselves. I will quantify

these errors in the revision.

 

Aside from the revised georeferencing document, I'm currently working on

interfaces to do the georeferencing online. I'll send out a how-to guide

when the interface is ready to use.  It is too soon to know when that will be.

 

So that everyone knows, my field season is about to begin. Eileen and I are

scheduled to leave for Argentina on 3 Nov and to return around New Year's day.

 

That's it for my update. Feel free to discourse on my proposed amendments

and thanks to everyone for the comments thus far.

 

John

 

>>> Posting number 101, dated 16 Oct 2001 12:43:55

 

>>> Posting number 102, dated 18 Oct 2001 19:30:33

Date:         Thu, 18 Oct 2001 19:30:33 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing Guideline Document Updated

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

It took almost two weeks, but the eagerly-awaited revision to the

Georeferencing Guidelines Document is finally complete. I have replaced the

original document, so the following URL now points to the revision:

 

http://dlp.cs.berkeley.edu/manis/GeorefGuide.html

 

I'm not including the line-numbered text of the document here since we are

presumably past the heated debates.  Nevertheless, commentary is

always welcome.

 

When you read the revised document you are likely to be struck by the

complexities of determining error properly. Don't despair. My next task is

to create an error calculator. The idea is to have a web page on which you

can enter the relevant parameters and get a maximum error distance. This

tool will be a supplement to the georeferencing tool itself, the

development of which is underway.

 

John

 

>>> Posting number 103, dated 19 Oct 2001 12:29:38

 

>>> Posting number 104, dated 4 Nov 2001 21:44:44

Date:         Sun, 4 Nov 2001 21:44:44 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         "Barbara R. Stein" <bstein@OZ.NET>

Subject:      MaNIS--ready, set, georeference!

MIME-Version: 1.0

Content-Type: multipart/alternative;

              boundary="------------24FB9C29A003860042ABE8C3"

 

 

Dear All,

 

This is the moment I know you have all been waiting for!  You will

notice a new Gazetteer link at the bottom of the MaNIS home page

(http://dlp.cs.berkeley.edu/manis).  This is your gateway to hours of

georeferencing fun.  But before starting to work, please read this

message in its entirety, print it out and post it next to the computer

that will be used for georeferencing.  You’ll see why you need to print

it when you get near the bottom.

 

To begin, please review the updated Georeferencing Guidelines.

 

Next, you will want to read the Georeferencing Steps document.  A hot

link to it appears at the top of the gazetteer page.

 

You will also want to read the text below the query screen on the

gazetteer main page.

 

After reading all of the above, you will query the gazetteer for a

locality of interest.  The "Search" button returns a list of all higher

geographies containing the term entered and indicates how many unique

localities are contained in the result set.  The list will not tell you

how many of those localities are already georeferenced.  You will see

those data once you download the localities.

 

You may choose to “View” the queried localities either before or after

downloading BUT this function will not aid you in assigning lat/long

coordinates.  Only those localities for which coordinates have already

been assigned get plotted using the GIS viewer (this is the same tool we

showed you at the ASM meeting, courtesy of the Berkeley Digital Library

Project).

 

Where the GIS viewer is most helpful is in pointing out erroneous

coordinates (e.g., if you view the georeferenced localities from

Algeria, 3 specimens appear in the Atlantic Ocean).  By clicking on that

point on the map, you can see the locality record(s) for that point and

correct it/them or, if the locality is not yours, you can contact the

appropriate institution.  The viewer also allows you to see how much

work you have accomplished!
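
The same kind of sanity check the viewer gives by eye can be sketched in a few lines of Python. This is only an illustration: the bounding box for Algeria is a rough, hand-entered approximation, the record layout (`id`, `lat`, `lon`) is hypothetical, and a rectangle is a much cruder test than a real country outline.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float


# Rough extent of Algeria in decimal degrees. Approximate values entered by
# hand for illustration only; not an authoritative boundary.
ALGERIA = BoundingBox(min_lat=18.9, max_lat=37.2, min_lon=-8.7, max_lon=12.0)


def outside_box(lat, lon, box):
    """Return True when the point falls outside the rectangle."""
    inside = (box.min_lat <= lat <= box.max_lat
              and box.min_lon <= lon <= box.max_lon)
    return not inside


def flag_suspect(records, box):
    """Return the records whose coordinates fall outside the box."""
    return [r for r in records if outside_box(r["lat"], r["lon"], box)]


# Hypothetical records: the second has a misplaced decimal in the longitude,
# which puts it far out in the Atlantic.
records = [
    {"id": "A1", "lat": 36.75, "lon": 3.06},
    {"id": "A2", "lat": 36.75, "lon": -30.6},
]
suspects = flag_suspect(records, ALGERIA)
```

A record flagged this way is only a candidate for review; localities near a border or coast still need a look on the map.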

 

Notes about the viewer:  This is a java applet and takes time to load.

Do not attempt to use it on older machines with inadequate memory.

Also, not all map layers exist for all parts of the world (e.g., you

will only get USGS 7.5-minute topo maps for the U.S.).  How far you can zoom

and the level of resolution you see will depend on the map layers

available.

 

Additional notes:  1) This gazetteer is a static snapshot of your data

compiled for the sole purpose of georeferencing unique localities.

Corrections to specific localities should be made directly in

institutional databases.  They will not be made in the gazetteer so

don't spend time fixing them in the downloaded files.  2) Below the

georeferencing steps you will see the complete list of fields that will

appear in your downloaded files.  Those that are in bold are fields you

will fill.  Those not in bold are needed by John to reassociate the data

in the gazetteer with the data in your institutional databases.  DO NOT

alter the values in these fields!

 

For security purposes, we are not posting instructions on how to upload

georeferenced localities on the web site.  Below is the complete text

for Step Eight of the Georeferencing Steps document.  These instructions

are also being archived on the listserv should you forget to print out

this message.  Follow the instructions below for uploading completed

files:

 

Step Eight - Upload Finished Localities

    Upload the finished file of georeferenced localities by anonymous

FTP to galaxy.cs.berkeley.edu in the directory incoming/mvz/manis. Use

your favorite FTP client to connect to galaxy.cs.berkeley.edu. Log in as

anonymous, providing your email address as a password. Set the file type

to text. Change to the incoming/mvz/manis directory on galaxy. Transfer

your file.
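
For anyone who prefers a script to an FTP client, the steps in Step Eight can be sketched with Python's standard `ftplib`. This is a minimal sketch, not a supported tool: the host and directory are the ones given above, while the function name, the `ftp_factory` parameter (there only so the function can be tried against a stand-in object) and the example file name are assumptions.

```python
import os
from ftplib import FTP

HOST = "galaxy.cs.berkeley.edu"
REMOTE_DIR = "incoming/mvz/manis"


def upload_localities(local_path, email, remote_name=None, ftp_factory=FTP):
    """Upload a finished text file by anonymous FTP.

    ftp_factory is a hypothetical hook; by default it is ftplib.FTP, which
    connects to the host as soon as it is constructed.
    """
    remote_name = remote_name or os.path.basename(local_path)
    ftp = ftp_factory(HOST)
    try:
        # Log in as anonymous, giving the email address as the password.
        ftp.login(user="anonymous", passwd=email)
        ftp.cwd(REMOTE_DIR)
        # storlines sends the file in text (ASCII) mode; ftplib expects the
        # local file to be opened in binary mode.
        with open(local_path, "rb") as handle:
            ftp.storlines(f"STOR {remote_name}", handle)
    finally:
        ftp.quit()


# Example call (not run here because it needs network access):
# upload_localities("nebraska_georef.txt", "you@example.org")
```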

 

Notice that the MVZ has already laid claim to all California localities

(see MaNIS Georef. Checklist in Step 2).  Try as you might, we will not

relinquish this claim!  It is therefore incumbent upon each of you to

lay claim to an equally prestigious set of localities.

 

Those of you paying attention will realize that John is now in Argentina

for two months.  He hoped to have the Error Calculator completed before

leaving.  He did not.  However, once completed, you will simply enter

your lat/long coordinates and it will do all the work of calculating the

error in those values for you-- so it is worth the wait.  Go ahead and

start georeferencing now.  You will soon be able to go back and fill in

the errors needed as he will post the calculator from the field.

 

I wish I had more to report on the status of your subcontracts, but I do

not.  Some of you will be able to begin work regardless.  The

bureaucracy has a timeline of its own. We simply have to proceed as best

we can in the meantime.

 

Please continue to address any questions or comments to the list.

Ready, set, georeference!

 

Best,

Barbara

 

 

>>> Posting number 105, dated 6 Nov 2001 09:51:19

 

>>> Posting number 106, dated 6 Nov 2001 09:00:24

 

>>> Posting number 107, dated 6 Nov 2001 12:24:23

 

>>> Posting number 108, dated 6 Nov 2001 14:29:22

 

>>> Posting number 109, dated 6 Nov 2001 16:52:12

 

>>> Posting number 110, dated 6 Nov 2001 16:06:24

Date:         Tue, 6 Nov 2001 16:06:24 -0600

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         "Patricia W. Freeman" <pfreeman1@UNL.EDU>

Subject:      Re: MaNIS--ready, set, georeference!

Comments: cc: hgenoways1@unl.edu

In-Reply-To:  <4.2.2.20011106122240.00abdfb8@packrat.musm.ttu.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

Dear members of MaNIS-

 

I am actually out of your official MaNIS loop, but I have already

georeferenced Nebraska for mammals, birds, herps, and fish (over 60,000

specimens) and will probably do South Dakota as well.  I salvaged 8,000

herps and about 1,500 mammals from USD about two years ago.

 

All four vertebrate groups are on our web page and searchable to county.

Although we have already georeferenced all four collections, the complete

localities will not be put on the webpage until next semester (I hope).  My

computer expert, using the Texas Tech georeferencing idea, modified and

wrote a conversion program changing all our geographic localities to

georeferenced localities.

 

 We now have a large NT server that has the USGS maps and gazetteers on it.

Since Hugh Genoways is rewriting the Mammals of Nebraska and has already

started gathering specimens for that purpose, all mammals and mammal data

used for that study will be automatically georeferenced and those data will

accompany the loaned materials on return to their home institution.  I

expect that he has or will contact most of you who have Nebraska material.

 

Regards-

Trish Freeman

 

PS. Can any of you direct me to FISHNET or BIRDNET if there are such

things?  I am already involved with HERPNET, although I do not know what is

happening with it.  Maybe someday we will have VERTNET.

 

 

 

 

 

 

 

Patricia W. Freeman

Professor/ Curator of Zoology

University of Nebraska State Museum

Lincoln NE 68588-0514

402-472-6606

402-472-8949 (fax)

Natural history museums archive biological diversity.

http://www-museum.unl.edu/research/zoology/zoology.html

 

>>> Posting number 111, dated 7 Nov 2001 09:09:31

 

>>> Posting number 112, dated 7 Nov 2001 08:32:12

 

>>> Posting number 113, dated 8 Nov 2001 14:03:13

 

>>> Posting number 114, dated 8 Nov 2001 14:39:28

Date:         Thu, 8 Nov 2001 14:39:28 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: MaNIS--ready, set, georeference!

In-Reply-To:  <3BE6274C.F9AC2E10@oz.net>

Mime-version: 1.0

Content-type: multipart/alternative;

              boundary="MS_Mac_OE_3088075168_258732_MIME_Part"

 

--MS_Mac_OE_3088075168_258732_MIME_Part

Content-type: text/plain; charset="US-ASCII"

Content-transfer-encoding: 7bit

 

Dear all,

 

1.  I have Internet Explorer 5 for Macintosh on a G4.  I haven't been able

to download records from the Manis website.

 

2.  Our grant submission allotted funds to each institution based on their

records to be geo-referenced.  Does committing to a state/province or region

change all of this?

 

3.  The process has changed considerably between when our records were

downloaded for John and the ASM meeting.  I thought that our  records were

being submitted so that John would have a snapshot of what the different

databases looked like in order to design the  Manis database.  I had planned

to clear up any inconsistencies, spelling errors, etc in our localities

before we geo-referenced and downloaded to the Manis database.  This seems

to make sense, since many errors in locality records can be cleared up only

with the use of in-house resources such as field notes and catalogs.  Now we

are committing to a region and giving our best opinion on perceived errors

(to be noted in the Locality Annotation) to other institutions (and

ourselves!) for them to rectify (or not) at their leisure.  Since I  haven't

been able to download records,  I don't know how much this new scheme will

save time overall or be more time consuming!

 

4.  There are many localities that are designated unique that simply differ

in syntax, spelling, etc.  They are not necessarily next to each other.

Would editing our own version of the database first for these errors and

then downloading them into the Manis database work?

 

Cheers,

 

XXXXXXXXX

 

--MS_Mac_OE_3088075168_258732_MIME_Part--

 

>>> Posting number 115, dated 8 Nov 2001 21:20:18

Date:         Thu, 8 Nov 2001 21:20:18 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      permutations on "unique" localities in the gazetteer

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

Dear All:  I was wondering about many of the same points that XXXX

XXXXXXXXX mentioned in his email of 8 Nov.  Especially after perusing the

gazetteer and seeing many permutations on "unique" localities.  E.g.,

localities like Seattle, 20 mi N, 20 mi N of Seattle, Seattle, 20 mi north,

and north of Seattle 20 miles, have to be allowed because of institutional

style or preference.  However, an entry such as Seatle, 20 mi N could be

corrected.  Each is a unique record to the computer and will receive the

same lat/long by georeferencers?   Once georeferenced, the permutations can

be identified, but if  localities are entered differently, how much

efficiency is gained by having one institution georeference all records for

a region vs having each georeference their own records?   In addition when

a typo like Seatle is corrected, it no longer is unique but of the same set

as the correct spelling.  The typos will be deleted from the static

gazetteer after determining that they were corrected in the institutional

database (see comment from Barbara below)?   It is unclear to me how

corrections in institutional databases will be mirrored in the static

gazetteer.
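
One way to see how the permutations described above might be recognized mechanically is to reduce each locality string to an order-independent key. The sketch below is an illustration only: the synonym table and ignored words are hypothetical and far from complete, and sorting the tokens discards word order, so compound localities with two offsets could be merged wrongly.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny synonym table.
SYNONYMS = {
    "north": "n", "south": "s", "east": "e", "west": "w",
    "mile": "mi", "miles": "mi",
}
IGNORED = {"of"}


def locality_key(text):
    """Reduce a locality string to a sorted tuple of normalized tokens."""
    tokens = re.findall(r"\d+(?:\.\d+)?|[a-z]+", text.lower())
    return tuple(sorted(SYNONYMS.get(t, t) for t in tokens if t not in IGNORED))


def group_permutations(localities):
    """Group locality strings that share the same normalized key."""
    groups = defaultdict(list)
    for locality in localities:
        groups[locality_key(locality)].append(locality)
    return dict(groups)


variants = [
    "Seattle, 20 mi N",
    "20 mi N of Seattle",
    "Seattle, 20 mi north",
    "north of Seattle 20 miles",
    "Seatle, 20 mi N",  # the typo still ends up in a group of its own
]
groups = group_permutations(variants)
```

The four stylistic variants share one key, while the misspelled "Seatle" does not, which matches the distinction drawn above between permutations of style and genuine errors.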

 

Although the idea of compiling a static gazetteer of unique localities

seemed like a good idea at the beginning, it does not seem doable at this

point.  I would prefer to go back to the original plan of each institution

dealing with their own records and offering assistance to others as needed.

Once georeferencing is started  and we get $ for the servers, the

gazetteer could be produced dynamically, or at least by frequent uploads -

rather than statically - and can be consulted, updated, corrected, winnowed

as needed.

 

From 4 Nov email of Barbara:

...

Additional notes:  1) This gazetteer is a static snapshot of your data

compiled for the sole purpose of georeferencing unique localities.

Corrections to specific localities should be made directly in

institutional databases.  They will not be made in the gazetteer so

don't spend time fixing them in the downloaded files.

...

 

 

 

 

 

>>> Posting number 116, dated 9 Nov 2001 08:57:34

Date:         Fri, 9 Nov 2001 08:57:34 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         "Barbara R. Stein" <bstein@OZ.NET>

Subject:      Re: MaNIS--ready, set, georeference!

MIME-Version: 1.0

Content-Type: text/plain; charset=iso-8859-1

Content-Transfer-Encoding: quoted-printable

 

> 1.  I have Internet Explorer 5 for Macintosh on a G4.  I haven't been able

> to download records from the Manis website.

 

XXXX et al.,

 

We are checking this out and, with luck, will have a fix today.  In the

meantime, you can download from a Mac using Netscape.

 

> 2.  Our grant submission allotted funds to each institution based on their

> records to be geo-referenced.  Does committing to a state/province or

> region change all of this?

 

No, it does not.  It was presumed that, in most instances, the majority of

localities for a given state, and the geographic expertise and resources to

untangle geographic problems, would reside with the institution in that

state.  Therefore, it made sense that we should work cooperatively to

georeference.  Each institution naturally will have many other specimens

collected outside that state.  Each can choose to do only its own

localities, thereby encouraging duplicate effort, or we can attempt a more

altruistic approach and gain economies of scale.  If, after georeferencing

all of California, the MVZ looks at its remaining collections and sees that

it has a tremendous amount of material from Brazil, Peru and Argentina, and

recognizes that it also has more geographic expertise in these regions than

any of the other institutions (and presumably more maps, gazetteers, etc.),

then we are going to offer to do all localities from those countries for

the sake of efficiency and making the money go as far as possible.  In

return, we know we will benefit from the Bishop Museum doing our PNG

material, of which we have a fair number of specimens.  We could do it,

yes.  But they can probably do it more quickly and easily.  This approach

also allows those with an interest in a particular region of the world to

get a good handle on what exists in our joint collections and, I suspect,

reach some very interesting summaries about those regions and the state of

our knowledge of their mammalian fauna.

 

> 3.  The process has changed considerably between when our records were

> downloaded for John and the ASM meeting.

 

No, it has not.  All of this was discussed online during the proposal

preparation process beginning more than a year ago.

 

> I thought that our records were being submitted so that John would have a

> snapshot of what the different databases looked like in order to design

> the Manis database.

 

That is also true.  There were always two objectives in giving John your

data.

 

> I had planned to clear up any inconsistencies, spelling errors, etc in our

> localities before we geo-referenced and downloaded to the Manis database.

 

The time to have cleared up those problems was before the data were sent to

John.  Since this approach was outlined in the first proposal submission

over a year ago, it should not have come as a surprise.  The money we

receive from NSF was never intended to pay institutions to clean up their

locality records.  It is to georeference those records.

 

> This seems to make sense, since many errors in locality records can be

> cleared up only with the use of in-house resources such as field notes and

> catalogs.  Now we are committing to a region and giving our best opinion

> on perceived errors (to be noted in the Locality Annotation) to other

> institutions (and ourselves!) for them to rectify (or not) at their

> leisure.

 

Since you haven't started to georeference, you will have to take my word

that your fears are probably worse than reality.  Truly erroneous

localities become obvious quite quickly and if they are not your own,

simply email a query to the institution to which that locality belongs.

 

Multiple versions of the same locality also jump out quickly.  The

advantage of using a single individual to georeference a region is that

s/he quickly becomes familiar with the localities in that place.  My own

personal suggestion is that each PI sit down with the data and try this

process him- or herself before hiring a student to really get going on it.

It will give you confidence and a much better feel for how it all works.

And, if you love maps like I do, it can actually be quite a seductive

exercise.  Your problem will be to keep working and not to get distracted

by the geography and all the places you would like to collect, have

collected, etc.  Perhaps the most difficult aspect is recognizing place

names that are no longer in use.  Again, review the georeferencing

guidelines which remind you not to dwell on any single seemingly

intractable locality.

 

> 4.  There are many localities that are designated unique that simply

> differ in syntax, spelling, etc.  They are not necessarily next to each

> other.  Would editing our own version of the database first for these

> errors and then downloading them into the Manis database work?

 

I don't believe so.  As mentioned above, each institution has known about

this approach for more than a year and could have, in that time, chosen to

direct part of its routine curatorial effort to cleaning up localities in

its db.  The final distributed db will have whatever corrected specific

localities get made during the georeferencing process.  We were not given

money to clean up our localities.  We received this money to georeference.

You are under no obligation to correct localities for other institutions.

You are merely being asked to georeference them.  Even if related

localities do not fall out in line with one another in your downloaded

files, if one individual works on all the localities for a given region,

s/he will not have trouble recalling that a lat/long for a similar place

was assigned just two days ago and one can scroll up the list to find it.

 

 

I am sure John will want to add his own comments to what I have written.

He generally has access to email about once a week.  In the meantime, I

will let you know as soon as we solve the download problem.  That does not

have to wait for him.

 

Best, Barbara

 

>>> Posting number 117, dated 9 Nov 2001 09:28:19

Date:         Fri, 9 Nov 2001 09:28:19 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         "Barbara R. Stein" <bstein@OZ.NET>

Subject:      Re: permutations on "unique" localities in the gazetteer

MIME-Version: 1.0

Content-Type: text/plain; charset=us-ascii

Content-Transfer-Encoding: 7bit

 

> Each is a unique record to the computer and will receive the

> same lat/long by georeferencers?

 

Yes.

 

> Once georeferenced, the permutations can

> be identified, but if  localities are entered differently, how much

> efficiency is gained by having one institution georeference all records for

> a region vs having each georeference their own records?

 

Please refer to my reply to XXXXXX.'s previous message on this issue.  Having a

fair amount of experience doing georeferencing, the MVZ and other instigators

of this proposal believe strongly that much efficiency can be gained by a

cooperative approach.  Proof of our commitment is that the MVZ has agreed to do

all California localities for this project even though we have completed

georeferencing our own localities for many counties in the state more than a

year ago.  We believe we can just do it more efficiently and more painlessly

than any of you folks can.  Even LACM didn't fight us on this point.  I can

change the oil in my car but...

 

> In addition when

> a typo like Seatle is corrected, it no longer is unique but of the same set

> as the correct spelling.  The typos will be deleted from the static

> gazetteer after determining that they were corrected in the institutional

> database (see comment from Barbara below)?

 

No, the typos will not be deleted from the static gazetteer.  The static

gazetteer exists simply as a way to unite all localities from our respective

dbs for georeferencing and then return the georeferenced locs to their

respective dbs.

 

> It is unclear to me how

> corrections in institutional databases will be mirrored in the static

> gazetteer.

 

I repeat-- corrections in institutional dbs will not be mirrored in the static

gazetteer.  Rather, your efforts will be mirrored in the final product--a

geographic dictionary coupled with the distributed db network and GIS viewer.

Please review our NSF proposal.

 

> Although the idea of compiling a static gazetteer of unique localities

> seemed like a good idea at the beginning, it does not seem doable at this

> point.

 

It has been done, for the purpose it was designed to carry out.

 

> I would prefer to go back to the original plan of each institution

> dealing with their own records and offering assistance to others as needed.

 

That is not what was agreed to or specified in the proposal.

 

> Once georeferencing is started  and we get $ for the servers, the

> gazetteer could be produced dynamically, or at least by frequent uploads -

> rather than statically - and can be consulted, updated, corrected, winnowed

> as needed.

 

And it will be.  You are exactly right.

 

Best,

Barbara

 

>>> Posting number 118, dated 9 Nov 2001 14:20:26

 

 

>>> Posting number 119, dated 9 Nov 2001 14:57:01

Date:         Fri, 9 Nov 2001 14:57:01 -0600

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Static Gazetteer

MIME-version: 1.0

Content-type: multipart/alternative;

              boundary="Boundary_(ID_WoSRKrESJwWTVCyCL0UPxw)"

 

--Boundary_(ID_WoSRKrESJwWTVCyCL0UPxw)

Content-type: text/plain; format=flowed; charset=us-ascii

Content-transfer-encoding: 7BIT

 

Dear All,

 

To add to my last message, I don't think the static gazetteer was a

surprise, rather the timing of it was.  When I sent the TTU site data to

John early in the summer, I told him that we are in the middle of verifying

and correcting our database.  (We have been working on checking and

correcting our database for nearly three years; I happily report that we

are all but done now.)  At the time, I told John that the corrected data

were NOT what was being sent to him.  He implied that this was okay and

that the static gazetteer would be created at a later time.  However, I may

have misunderstood him.  Now, it seems that several of us have data that we

are not comfortable with in the already compiled gazetteer.

 

I did understand that the NSF money was to meant to cover database

corrections, but I thought we'd begin georeferencing only after the data

had been corrected.  I think we're all looking for ways to simplify the

process and having the idiosyncrasies of years of data entry already fixed

would greatly facilitate the process.  Is there some way to address this

problem (uncorrected data in the gazetteer)?  Or do we push ahead with the

gazetteer as it is.  In my mind, going ahead with it as it is will create

some additional work for those doing the georeferencing (because of the

duplications), but it will create a great deal of additional work  for each

institution as errors are corrected.  In our case at TTU, we will have to

go through the gazetteer (once we get the georeferenced records back),

compare all those records to the file we just spent three years updating

and update the whole thing all over again.  Remember that not all of the

corrections will be simple typos or punctuation problems.  We're correcting

incorrect data as well (e.g., wrong county names entered).  If we could

have the opportunity to update the gazetteer with corrected data before the

process is too far along, it would help considerably.

 

 

>>> Posting number 120, dated 9 Nov 2001 15:09:14

 

>>> Posting number 121, dated 9 Nov 2001 15:59:31

Date:         Fri, 9 Nov 2001 15:59:31 -0600

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Correction

MIME-version: 1.0

Content-type: multipart/alternative;

              boundary="Boundary_(ID_spZx6thUFA8HhEMCCdkxcQ)"

 

--Boundary_(ID_spZx6thUFA8HhEMCCdkxcQ)

Content-type: text/plain; format=flowed; charset=us-ascii

Content-transfer-encoding: 7BIT

 

Correction to my last note:  I did understand that NSF money was NOT to be

used to make corrections to the databases.

 

Sorry for the slip.

 

XXXX.

 

 

 

 

>>> Posting number 122, dated 9 Nov 2001 15:13:09

Date:         Fri, 9 Nov 2001 15:13:09 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         "Barbara R. Stein" <bstein@OZ.NET>

Subject:      Re: Static Gazetteer

MIME-Version: 1.0

Content-Type: text/plain; charset=us-ascii

Content-Transfer-Encoding: 7bit

 

> ... We're correcting incorrect data as well (e.g., wrong county names entered).  If we could have the

> opportunity to update the gazetteer with corrected data before the process is too far along, it would

> help considerably.

 

XXXXXX,

 

I am very sympathetic to the argument you put forth and am quite sure I would be operating out of my

league if I were to speak for John on this issue.  However, I would like to offer several thoughts--

 

First, an encouraging thought, with the caveat that John will surely correct me if I am wrong--  The

locality ID field in your downloaded files (the one you have been warned not to alter!) will be used to

reassociate the georeferenced data with the records in your dbs--regardless of the content of those

records.  So do not despair if you have corrected some of your localities since you sent John the data.

This was to be hoped for and should not present a problem.  If records did have erroneous data (like a

wrong county), these will likely be difficult to georeference on the first pass and may be skipped, but

they should be easy to deal with by the home institutions once all the data are returned and we each

look for remaining unreferenced localities in our own dbs.
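
The reassociation described above amounts to a join on the locality ID alone. The following Python sketch shows the idea under stated assumptions: the field names (`locality_id`, `latitude`, `longitude`, `specific_locality`) and the coordinate values are invented for illustration, and the actual return process may differ.

```python
def reassociate(institution_rows, georeferenced_rows, id_field="locality_id"):
    """Copy coordinates onto institutional rows by matching the ID only.

    The locality text is never compared, so a record corrected at home
    after the snapshot was taken still receives its coordinates.
    """
    coords_by_id = {
        row[id_field]: (row["latitude"], row["longitude"])
        for row in georeferenced_rows
        if row.get("latitude") is not None and row.get("longitude") is not None
    }
    updated, still_unreferenced = [], []
    for row in institution_rows:
        merged = dict(row)
        coords = coords_by_id.get(row[id_field])
        if coords is None:
            # Skipped on the first pass; left for the home institution.
            still_unreferenced.append(row[id_field])
        else:
            merged["latitude"], merged["longitude"] = coords
        updated.append(merged)
    return updated, still_unreferenced


# Hypothetical data: record 101 was corrected at home after the snapshot,
# and record 102 (wrong county) was skipped by the georeferencer.
institution = [
    {"locality_id": 101, "specific_locality": "Seattle, 20 mi N"},
    {"locality_id": 102, "specific_locality": "Lubbock, 5 mi W"},
]
returned = [
    {"locality_id": 101, "specific_locality": "Seatle, 20 mi N",
     "latitude": 47.9, "longitude": -122.33},
    {"locality_id": 102, "specific_locality": "Lubbock, 5 mi W",
     "latitude": None, "longitude": None},
]
updated, still_unreferenced = reassociate(institution, returned)
```

This is also why altering the ID values in a downloaded file would break the return trip: nothing else ties a georeferenced row back to its source record.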

 

Second, we have committed to quite a large project over the course of three years and it is imperative

that we start working ASAP.  It is simply not possible to delay georeferencing while each collection

takes time to verify and correct its locality data.  Have the majority of collections made substantive

changes/corrections to their locality data since those data were sent to John?  I don't know, but I

suspect the majority has not, even though we are all continually cleaning up our data on a daily basis.

So how long do we wait?  Despite the fact that you have not received your money, we are already two

months into this project.  We need to begin work.  It could also be argued that we should delay because

of all the new specimens that have been entered into our dbs since the data were sent to John....  At

some point we must draw the line.

 

What I ask is that each institution lay claim to a set of localities, that they download those data, and

then spend a bit of time examining what's really there.  Begin georeferencing.  Become familiar with the

process we've outlined.  It may be slow going initially, but as with all new techniques, it will become

quicker and easier with practice.

 

I sincerely regret any misunderstandings that may have occurred.  It is important to keep communicating

and I thank you for your contributions.

 

Best, Barbara

 

>>> Posting number 123, dated 9 Nov 2001 16:10:16

Date:         Fri, 9 Nov 2001 16:10:16 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         "Barbara R. Stein" <bstein@OZ.NET>

Subject:      alternative download method

MIME-Version: 1.0

Content-Type: text/plain; charset=us-ascii

Content-Transfer-Encoding: 7bit

 

XXXX et al.,

 

Beneath the "Download" button there is now an alternative option for

those who may have experienced problems.  Click on the link that says

"Alternate download method is here."  A text file with the data should

display in the browser window. Go to the "File" menu and select "Save

As..." to save the file on your computer.  Then open Excel and import

the file.

 

Best,

Barbara

 

>>> Posting number 124, dated 15 Nov 2001 08:18:59

Date:         Thu, 15 Nov 2001 08:18:59 -0800

Reply-To:     bstein@oz.net

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         Barbara Stein <bstein@OZ.NET>

Subject:      downloading problems solved

 

Dear All,

 

I believe that the problems some individuals were having with downloading

locality data are now solved.

 

For those using IE on a Mac, an alternative download button has been added with

instructions.  Click to download after viewing the list of specific localities

that result from your search and you will see the alternative option beneath

the original download button.

 

There is also no longer a problem with downloading large numbers of records

(e.g., >8500) so I hope you will feel emboldened.

 

Remember, the downloaded files need to be imported into your spreadsheet of

choice before you will see the headers and the data lined up in a way that

makes sense to you.  Do not attempt to simply work with the downloaded files as

is.

 

Lastly, the subcontract budgets have been set up and are in the hands of

Berkeley's SPO.  It is up to that office to notify your SPOs that the money is

available.  It is out of the MVZ's control at this point.

 

Best,

Barbara

 

>>> Posting number 125, dated 15 Nov 2001 11:09:32

 

>>> Posting number 126, dated 16 Nov 2001 07:38:49

Date:         Fri, 16 Nov 2001 07:38:49 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Collaborative Georeferencing Theory II

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

Dear all,

In this message I am responding to the discussion begun by XXXXXXXXX

on 8 Nov and continued by XXXXXXXXX. I will refer to both of their

messages herein. I realize that Barbara has already answered these points

while I was out contracting chilblains in the Patagonian wind, but it may

be a comfort to some to see the extent to which we are in agreement

without having had the benefit of communicating.

 

XXXXXXXX said...

[

2.  Our grant submission allotted funds to each institution based on their

records to be geo-referenced.  Does committing to a state/province or

region change all of this?

]

-------------------

No. Funding was based on the number (and difficulty) of the localities in

your collection that need to be georeferenced. In theory, if everyone does

the amount of georeferencing for which they were funded at the speeds we

deduced from experience, then all of the localities without coordinates

will be georeferenced under the funding we were given. In order to take

advantage of the pooling of like localities (i.e., those in the same area

on the map regardless of their source institution) we need to have people

commit to geographic areas that best suit them. Suitability includes not

only geographic areas of interest and of expertise, but also of scope. For

example, if I am institution X, given funding for 10 weeks of

georeferencing, then committing to a geographic area that will take 20

weeks to georeference may be good citizenship, but it is not good

finance. Basically, spend as many weeks on georeferencing as you are

listed for in the NSF Project Description. Details on georeferencing rates

(i.e., localities per hour for different classes of geography) were given

in the Project Implementation section of the NSF Project Description. If

you need to estimate what you are committing to in terms of time, read

that section. It will probably be worthwhile for everyone to monitor

his/her georeferencing rates. If your rates are significantly different

from those projected, send a message to the list. If you are going a lot

faster, we want to know how you're doing it. If you're going a lot slower,

maybe we can help increase your efficiency.
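
The time estimate described above is simple arithmetic: hours per class are localities divided by localities per hour, summed and converted to weeks. The following is a minimal sketch only. The class names, counts, rates, and the 40-hour week are placeholder assumptions, not the figures from the NSF Project Description.

```python
def weeks_needed(locality_counts, rates_per_hour, hours_per_week=40):
    """Estimate weeks of georeferencing for a set of localities.

    locality_counts: class of geography -> number of localities
    rates_per_hour:  class of geography -> localities georeferenced per hour
    """
    total_hours = sum(
        count / rates_per_hour[cls] for cls, count in locality_counts.items()
    )
    return total_hours / hours_per_week


def observed_rate(localities_done, hours_spent):
    """Localities per hour actually achieved, for comparison with the projection."""
    return localities_done / hours_spent


# Placeholder numbers for illustration only.
counts = {"US": 3000, "non-US": 600}
rates = {"US": 15, "non-US": 6}

weeks = weeks_needed(counts, rates)
print(f"Estimated commitment: {weeks:.1f} weeks")
```

Comparing `observed_rate(...)` against the projected rate for the same class is one way to notice early that a committed region is larger than the funded number of weeks.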

-------------------

XXXXXXXX said...

[

3.  The process has changed considerably between when our records were

downloaded for John [W.] and the ASM meeting.  I thought that our records

were being submitted so that John [W.] would have a snapshot of what the

different databases looked like in order to design the Manis database.

]

-------------------

The last point is true, but it is not the only reason I gathered the

data. Following is an excerpt from the original message from Barbara Stein

asking that data be sent to John W.:

 

"NOTE:  The data you send him will not be distributed in any way, shape,

or form; he will do nothing more than examine it and compare the structure

and general content of the files and then use this data to make the

initial global locality file that will be available for general

reference.  This is extra work that is being done on MVZ's nickel, but

something we feel will keep this project on track and give you the most

bang for your buck."

 

At that point we already knew we would use a combined locality

gazetteer; it just wasn't clearly stated how we would use

it. By the time of the ASM meeting I had almost finished the gazetteer and

its purpose was more definitively stated. Following is a quote from the

ASM 2001 meeting notes:

 

"While John [W.] begins work on developing the network, participants will

begin georeferencing. This is why John [W.] asked for your data. From those

data he will create a combined snapshot of unique localities, which will be

used for georeferencing."

-------------------

XXXXXXX said...

[

I had planned to clear up any inconsistencies, spelling errors, etc in our

localities before we geo-referenced and downloaded to the Manis

database.  This seems to make sense, since many errors in locality records

can be cleared up only with the use of in-house resources such as field

notes and catalogs.  Now we are committing to a region and giving our best

opinion on perceived errors (to be noted in the Locality Annotation) to

other institutions (and ourselves!) for them to rectify (or not) at their

leisure.  Since I haven't been able to download records,  I don't know how

much this new scheme will save time overall or be more time consuming!

]

and XXXXXXX said...

[

Dear All:  I was wondering about many of the same points that XXXX

XXXXXX mentioned in his email of 8 Nov.  Especially after perusing the

gazetteer and seeing many permutations on "unique" localities.  E.g.,

localities like Seattle, 20 mi N, 20 mi N of Seattle, Seattle, 20 mi

north, and north of Seattle 20 miles, have to be allowed because of

institutional style or preference.  However, an entry such as Seatle, 20

mi N could be corrected.  Each is a unique record to the computer and will

receive the same lat/long by georeferencers?   Once georeferenced, the

permutations can be identified, but if  localities are entered

differently, how much efficiency is gained by having one institution

georeference all records for a region vs having each georeference their

own records?

]

-------------------

First, it would be nice if we each had clean and consistent data in our

databases. We don't. We vary greatly in how close we are to achieving that

aim, not only in terms of the raw amount of cleaning to do, but especially in

how long it would take each of us to do it. For this reason we cannot wait

for localities to be cleaned up before we start georeferencing.

 

Second, NSF provided funds to georeference localities, not to clean up

existing data. Nor did our methods and time estimates in the NSF proposal

depend on "clean" localities. I agree that it would be more efficient to

georeference ALREADY clean localities, but it is faster to georeference

them as they are than it is to clean them up and then georeference them.

 

Third, in answer to XXXX's last question, the methods presented in our

proposal have been tested and shown to be much more efficient than the

alternative of having each institution georeference only its own

localities. Forgive my digression into a lengthy answer, but this is an

extremely important matter.

 

The concept of uniqueness is, as XXXX points out, defined by the

computer's ability to distinguish one locality from another. Thus, "20 mi

N of Seattle" is a different record from "Seattle, 20 mi N." Furthermore,

there might be two localities "20 mi N of Seattle", one for UWBM and one

for PSM. There are several reasons for keeping these separate, the most

obvious and important of which is to be able to identify from which

institution a locality description came. So, with the MaNIS gazetteer I've

basically given everyone a list of their unique localities, but you could

each have done that yourselves. The real purpose behind the gazetteer is

to combine localities for all institutions by geographic regions. By far

the most time-consuming aspect of georeferencing is finding places on a

map. Thus, it behooves you to assemble localities that are likely to be in

roughly the same place and then find them on a map all at once. Once you

are on the right map you can get coordinates for all of the localities in

that area. So, suppose I have downloaded localities for which the county

is "Kern." At the top of my list of localities for Kern County is one from

UWBM that says "Bakersfield, 10 mi E; Rattlesnake Grade." I see that the

named place is Bakersfield, so I filter my Kern County records to show me

only those which contain the word "Bakersfield." It turns out that in Kern

County there are 117 localities from 10 institutions that mention

"Bakersfield." I get out my map of the Bakersfield area and start looking

for "Rattlesnake Grade." I can't find it on my map right away so I'm going

to skip this locality for the moment. The next twelve localities on my

list are from six different institutions, but they all have some variation

on "3 mi E of Bakersfield." I find this location on my map once, get the

coordinates and copy them to all twelve localities that match this

place. The next locality on my list is from MVZ and it says "Bakersfield,

6 mi N, 9 mi E; Rancheria Road (Rattlesnake Grade)." Oh, so that's where

Rattlesnake Grade is - on Rancheria Road. Now I can go figure out that

first locality, which I skipped at first.
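
The workflow in the paragraph above (filter by county, filter by named place, find the spot once, copy the coordinates to every matching locality) can be sketched as follows. This is an illustration under assumptions, not the MaNIS tooling: the column names, the sample rows other than the two Rattlesnake Grade localities quoted above, and the crude `place_key` matching are all hypothetical, and the coordinates are placeholders rather than measured values.

```python
import re
from collections import defaultdict

# Hypothetical download rows. The two Rattlesnake Grade localities are quoted
# from the message; the other rows are invented for illustration.
records = [
    {"institution": "UWBM", "county": "Kern",
     "locality": "Bakersfield, 10 mi E; Rattlesnake Grade"},
    {"institution": "MVZ", "county": "Kern",
     "locality": "3 mi E of Bakersfield"},
    {"institution": "PSM", "county": "Kern",
     "locality": "Bakersfield, 3 mi E"},
    {"institution": "MVZ", "county": "Kern",
     "locality": "Bakersfield, 6 mi N, 9 mi E; Rancheria Road (Rattlesnake Grade)"},
    {"institution": "MVZ", "county": "Inyo",
     "locality": "Bishop, 2 mi W"},
]


def filter_records(rows, county, named_place):
    """Keep rows in the given county whose locality mentions the named place."""
    return [
        r for r in rows
        if r["county"] == county and named_place.lower() in r["locality"].lower()
    ]


def place_key(locality):
    """Crude grouping key: the set of lowercase tokens, ignoring 'of'.

    Word order and punctuation are discarded, so a person must still confirm
    that grouped localities really describe the same place.
    """
    tokens = re.findall(r"[a-z0-9]+", locality.lower())
    return frozenset(t for t in tokens if t != "of")


def group_by_place(rows):
    """Collect rows that share the same crude place key."""
    groups = defaultdict(list)
    for r in rows:
        groups[place_key(r["locality"])].append(r)
    return groups


def copy_coordinates(group, lat, lon):
    """Copy one determination to every locality in the group."""
    for r in group:
        r["lat"], r["lon"] = lat, lon


kern = filter_records(records, "Kern", "Bakersfield")
groups = group_by_place(kern)
same_place = groups[place_key("3 mi E of Bakersfield")]
copy_coordinates(same_place, lat=0.0, lon=0.0)  # placeholder values, not real coordinates
```

The key deliberately ignores word order, so it would not match "north" with "N" and could wrongly merge localities whose offsets are swapped. It is meant only to bring likely matches next to each other for a human to check.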

 

So, to answer XXXX's last question again, there are multiple ways in which

the combined localities aid in the overall efficiency of the

georeferencing process. From the illustrative example above, only the MVZ

had to possess the Kern County map; nobody had to go out and buy one. Only

one person had to find Bakersfield on a map, rather than one person from

each of the ten institutions that had localities from that area. It was

possible to find Rattlesnake Grade for all localities that mentioned it,

not just for the one that also happened to locate it on Rancheria Road. It

might not otherwise have been possible to georeference this locality or

maybe the error would have been much greater than it needed to be. The

single locality 3 mi E of Bakersfield could be found and measured once and

the results copied to all twelve localities that were really the same

place. While the foregoing is all well and good in theory, empirical

testing at the MVZ backs it up with hard numbers. Georeferencing rates

doubled when localities from three collections were combined versus when

they were done separately. Further increasing the number of collections

will result in even greater efficiency.

 

Now let me go back and address part of XXXX's comment that I have

neglected thus far.

 

XXXXXXXX said...

[

"Now we are committing to a region and giving our best opinion on

perceived errors (to be noted in the Locality Annotation) to other

institutions (and ourselves!) for them to rectify (or not) at their

leisure."

]

-------------------

I'm not sure what XXXX's point is here, but I'll try to explain the

Locality Annotation again. Locality Annotation is one of the fields in the

downloaded locality data. This field is provided as a courtesy to alert

the institution that provided a locality that there is something

inconsistent about it. It's not meant to be filled with opinions on

perceived errors, it is meant to note definitive inconsistencies. For

example, if I get a locality in the downloaded file for Inyo County that

says "Bakersfield", then there is a problem with the locality. It's not an

opinion, and it isn't a perceived error; it is simply true that

Bakersfield is not in Inyo County. It's up to me as the georeferencer to

decide whether this is enough of a problem to not georeference the

locality. In this particular case I could either choose to georeference

the locality, because I know that Bakersfield is in Kern County, or I

could choose not to georeference it simply because I'm doing Inyo County

and Bakersfield is out of my "jurisdiction." I would take the latter

option not because I'm necessarily a stickler for boundaries, but because

I'd have to go get another map and that would waste time. It might be

better to leave some inconsistent localities until later. Nevertheless,

since I've spent the energy to figure out that there is a problem with the

locality, I might as well extend the courtesy of noting what the problem

is. It'll save time for someone else later on. It is this philosophy that

led me to include the NoGeorefBecause field in the download as well. If

I'm able to determine that a locality cannot be georeferenced, I might as

well say so, and why, so that the next person who sees that this locality

doesn't have coordinates will not bother to try to determine them.
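
The Bakersfield-in-Inyo-County case above is the kind of definitive inconsistency that can be checked mechanically. The sketch below is hypothetical: `LocalityAnnotation` and `NoGeorefBecause` are the fields named in this message, while the `County` and `Locality` column names and the tiny place-to-county lookup are assumptions made for the example.

```python
# Hypothetical lookup of named places to the county they are actually in.
PLACE_TO_COUNTY = {"bakersfield": "Kern"}


def annotate(record, place_to_county=PLACE_TO_COUNTY):
    """Note a definitive county mismatch in LocalityAnnotation.

    Only facts are recorded (the place is in another county); whether to
    georeference the locality anyway is left to the georeferencer.
    """
    text = record["Locality"].lower()
    for place, county in place_to_county.items():
        if place in text and county != record["County"]:
            record["LocalityAnnotation"] = (
                f"{place.title()} is in {county} County, "
                f"not {record['County']} County"
            )
    return record


inyo = annotate({"County": "Inyo", "Locality": "Bakersfield",
                 "LocalityAnnotation": "", "NoGeorefBecause": ""})
kern = annotate({"County": "Kern", "Locality": "Bakersfield, 3 mi E",
                 "LocalityAnnotation": "", "NoGeorefBecause": ""})
print(inyo["LocalityAnnotation"])
```

A consistent record is left untouched, so the annotation field stays empty unless there is something concrete to report.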

-------------------

XXXXXXXX said...

[

4.  There are many localities that are designated unique that simply

differ in syntax, spelling, etc.  They are not necessarily next to each

other.  Would editing our own version of the database first for these

errors and then downloading them into the Manis database work?

]

-------------------

Yes. In theory it could work, but it is not practical. In addition to the

reasons I gave above, this kind of activity would take a great deal of my

time, which I hope you would agree could be better spent on other things.

-------------------

XXXXXXX said...

[

In addition when a typo like Seatle is corrected, it no longer is unique

but of the same set as the correct spelling.  The typos will be deleted

from the static gazetteer after determining that they were corrected in

the institutional database (see comment from Barbara below)? It is unclear

to me how corrections in institutional databases will be mirrored in the

static gazetteer.

 

The comment from Barbara was...

[

...

Additional notes:  1) This gazetteer is a static snapshot of your data

compiled for the sole purpose of georeferencing unique localities.

Corrections to specific localities should be made directly in

institutional databases.  They will not be made in the gazetteer so

don't spend time fixing them in the downloaded files.

...

]

-------------------

XXXX's question is well founded. I have nowhere yet described what will

happen to the georeferenced localities. I'll try now to clear up this part

of the grand scheme. I've already explained that I would like the

georeferenced localities to be sent back to me so that I can proof them,

load them back into the gazetteer, and keep a running status of the

georeferencing aspect of the project. In principle, you could download

sets of georeferenced localities for your institution at any time and load

them into your own database. But that isn't the most efficient way to go

about the problem. It would be better to wait until all georeferencing is

done, then download all localities for your institution and create the

lat_long records for them all at once, with my help, if necessary. Note

that I am not explaining how to create the lat_long records or how to

incorporate them in your database. The reason is that (almost) everyone's

database structure is different from everyone else's, so there is no one

single solution to fit all. That's why I offer my help to get these data

back into your databases, but I can only afford to do it one time for each

institution that needs it.

 

Now back to XXXX's question. Changes in your databases will not be

mirrored in the static gazetteer. There will be no changes whatsoever to

localities in the static gazetteer, as per Barbara's additional notes. If

you correct typographical errors in your database it will not affect the

georeferencing process. If you make a substantive change to a locality

(one that would affect how the locality is georeferenced), then there will

be an easily discernible discrepancy that can be resolved at the time when

lat_longs are incorporated into your database. Nevertheless, the more

changes you make to your localities during the georeferencing period, the

more work you will potentially create for yourself later.
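
One way to picture the "easily discernible discrepancy" mentioned above is a comparison of the snapshot text against the current text at import time. This is a sketch under a strong assumption: that each locality carries a stable identifier shared by the snapshot and the institutional database. The identifiers and sample strings here are hypothetical.

```python
import re


def normalize(text):
    """Lowercase and collapse whitespace so trivial spacing edits are ignored."""
    return re.sub(r"\s+", " ", text.strip().lower())


def find_discrepancies(snapshot, current):
    """Return {locality_id: (snapshot_text, current_text)} for changed localities.

    snapshot, current: dict of locality_id -> locality description.
    A locality missing from the current database is reported with None.
    """
    changed = {}
    for loc_id, old in snapshot.items():
        new = current.get(loc_id)
        if new is None or normalize(new) != normalize(old):
            changed[loc_id] = (old, new)
    return changed


snapshot = {1: "Seatle, 20 mi N", 2: "Bakersfield, 3 mi E"}
current = {1: "Seattle, 20 mi N", 2: "Bakersfield,  3 mi E "}

changed = find_discrepancies(snapshot, current)
print(changed)
```

The comparison cannot tell a typo fix from a substantive change. It only produces the list a person would review before loading lat_longs, which is why fewer edits during the georeferencing period means a shorter list later.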

-------------------

XXXXXX said...

[

Although the idea of compiling a static gazetteer of unique localities

seemed like a good idea at the beginning, it does not seem doable at this

point.  I would prefer to go back to the original plan of each institution

dealing with their own records and offering assistance to others as

needed.  Once georeferencing is started  and we get $ for the servers, the

gazetteer could be produced dynamically, or at least by frequent uploads -

rather than statically - and can be consulted, updated, corrected,

winnowed as needed.

]

I hope I've done something to counter the above sentiment. Let me add

another note about the static gazetteer. It is an interim tool intended to

help us divide up the georeferencing responsibilities and to monitor

georeferencing progress. Your databases are not static. Yet, to function

effectively, we need a fixed target. The real end product of this endeavor

will include a dynamic gazetteer that will be drawn from the

continually-updated locality data contained in the participating

databases. At that point, when you add new data, or change existing data,

it will be reflected in the dynamic gazetteer without intervention.

I hope this clarifies the reasoning behind our approach to

georeferencing. Considerable thought and effort have gone into

establishing and testing the methods set forth here and elsewhere in the

MaNIS documents. Barbara and I remain convinced that this is the most

reasonable approach to an otherwise daunting task.

 

John W.

 

>>> Posting number 127, dated 16 Nov 2001 11:51:55

Date:         Fri, 16 Nov 2001 11:51:55 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: Collaborative Georeferencing Theory II

In-Reply-To:  <Pine.GSO.4.21.0111160737280.29268-100000@socrates.Berkeley.EDU>

Mime-version: 1.0

Content-type: text/plain; charset="US-ASCII"

Content-transfer-encoding: 7bit

 

John,

 

My intention was never to clean up our locality data with geo-referencing

funds!  I was operating on the assumption that we would be responsible for

our own data and therefore it would have been worthwhile to clean it up on

our own dime before geo-referencing.  Which gets to another question.  I

have cleaned up localities in our database since downloading it to you.  Is

this going to cause problems in downloading the newly geo-referenced

localities from MANIS into our current database?  Can I continue to clean up

our own database?  Did I understand you correctly when you said to leave

localities that have lat/long alone?  The reason I ask is that I noticed

that when you transferred our lat/long to the Manis database, the minutes

were incorrectly interpreted as decimal degrees.  Should I worry about this?

Will we have to change our database to accept decimal degrees?  I appreciate

your thorough responses.  I am trying to clarify and simplify our tasks.

That is my bottom line.

 

Cheers, XXXX

 

PS I didn't put this on the site, because I am seeking clarity not a debate.

 

>>> Posting number 128, dated 16 Nov 2001 15:59:44

 

>>> Posting number 129, dated 17 Nov 2001 12:55:06

Date:         Sat, 17 Nov 2001 12:55:06 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: Collaborative Georeferencing Theory II

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

Dear All:  Thanks to John W. for the overview and examples.  In summary, we

are georeferencing unique geographical entries rather than unique

localities.  Unique can be a function of geography, institutional acronym,

syntax, typos, punctuation and errors.   The goal is clearer.

 

 

XXXXXX

 

 

>>> Posting number 130, dated 17 Nov 2001 12:55:48

Date:         Sat, 17 Nov 2001 12:55:48 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         "Barbara R. Stein" <bstein@OZ.NET>

Subject:      Re: Questions about Georeferencing

MIME-Version: 1.0

Content-Type: text/plain; charset=us-ascii

Content-Transfer-Encoding: 7bit

 

XXXXXXXX wrote:

 

> Thanks for all of the great georeferencing information, steps, and

> guidelines! XXXXXXX and I have been familiarizing ourselves with the

> guidelines, steps and very helpful weblinks.  We downloaded the Ingham

> County (Michigan) records into the Access template, and I feel that this

> county is a comfortable starting place for us (it is our institution's

> county).

 

Go for it!  Starting is half the battle.

 

> Before we begin, I would appreciate clarification on a couple of items.

> Thank you for your time.

 

As always, I will provide my thoughts and John will weigh in when he's next

online.

 

> 1)  Is it okay to use available "online" latitude and longitude

> coordinates, as long as Datum information, etc. are available?

 

Yes.  Just make sure you specify the source of those coordinates in the

designated field on your spreadsheet.

 

> For example, the Township, Range, Section Information website

> (http://www.esg.montana.edu/gl/trs-data.html.) that is listed in the MaNis

> Georeferencing Guidelines has links whereby one can search for a named

> place, and the decimal degrees coordinates (to four decimal places) come up

> for that place (example, City of Mason, Michigan).  Is it okay to use such

> on-line coordinates for georeferencing place names, or should all

> georeferencing be done with "hard copy" references?

 

We encourage you to take advantage of all available tools, that's why we

provided those URLs.  There may be others as well.  Just make sure your sources

are credible.

 

> 2)  If the answer to the above question is that all georeferencing should

> be done with "hard copy" references, then ignore this one.

> 

> A related question to 1): from the same website mentioned above, one can

> link to "TerraServer" and get (really interesting) aerial photos of places.

>  With the aid of a labelled map, one can zoom in and find specific

> buildings (such as the Michigan State University Swine Barn - a real Ingham

> County example).  From a zoomed aerial image, you can click on "Image Info"

> and get lat and long (non-decimal) coordinates for "tiles" (corners of

> squares) surrounding the image.  Datum information is included in "Image

> Info".

> 

> So my question is, is it okay for us to use these types of on-line aerial

> images for georeferencing?

 

I'm including this question just for completeness.  The answer is, of course,

yes.  And remember, do not worry about the type of coordinate data you record.

The error calculator will be able to convert data provided in any format (e.g.,

deg, min, sec; dec. degrees; etc.) into any other format.  Knowing the datum,

providing the source of your coordinates, and noting any assumptions you have

made in assigning those coordinates are what's crucial.

 

> 3)  With regard to the "DeterminedDate" data field in the download file -

> is there a specific format for the date data(i.e. MM/DD/YYYY or DD Month

> Spelled Out YYYY) that you would like us to use?

 

No, because most spreadsheet programs will dictate a format.  It seemed

worthless for us to specify one.  John will have to deal with that variety

later.
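
Dealing with that variety later might look something like the sketch below, which tries a short list of candidate formats in turn. The list of formats is an assumption based on the two examples in the question plus ISO dates. Month names parse only in an English locale, and a value such as 03/04/2001 is read as MM/DD/YYYY even if the sender meant DD/MM/YYYY.

```python
from datetime import datetime

# Candidate formats, tried in order. Hypothetical list, not a MaNIS standard.
FORMATS = ("%m/%d/%Y", "%d %B %Y", "%Y-%m-%d")


def parse_determined_date(text):
    """Return a date for the first format that matches, or None if none do."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


for raw in ("11/17/2001", "17 November 2001", "2001-11-17", "someday"):
    print(raw, "->", parse_determined_date(raw))
```

Values that match no format come back as `None` so they can be set aside for a person to check rather than guessed at.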

 

Best,

Barbara

 

>>> Posting number 131, dated 17 Nov 2001 14:01:10

Date:         Sat, 17 Nov 2001 14:01:10 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      download of GNIS dataset

Comments:

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

GNIS locality datasets for states can be downloaded from:

 

http://mapping.usgs.gov/www/gnis/gnisftp.html

 

The dataset for Washington consisted of 32K+ localities and Oregon had

50K+.  Both loaded into Excel without problems (after unzipping), and

provide a good start on an authority file for locations + lat/longs.  I

wish I had it back when we originally entered our data.  Locations can be

found with a search or scrolling in Excel, or by loading into a database

program.  As long as you don't need a map, lookup on the downloaded file is

faster than via the GNIS webpage.   The downloaded file also has lat/longs

as decimals, which don't appear to be accessible on the GNIS webpage.

These can be entered into two fields of MaNIS with a copy/paste rather than

parsing or typing the dddmmss + direction string into the eight fields

required for ddd, mm, entry.
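
For anyone who does need to parse the packed string rather than copy the decimal columns, the conversion is short. The sketch assumes values shaped like DDMMSS or DDDMMSS followed by a hemisphere letter (for example 473000N or 1221500W). That layout is an assumption about the file, so check it against the actual download before relying on it.

```python
def packed_dms_to_decimal(value):
    """Convert 'DDMMSSN' or 'DDDMMSSW' to signed decimal degrees.

    South and west are returned as negative numbers.
    """
    value = value.strip().upper()
    if len(value) not in (7, 8):
        raise ValueError(f"unexpected length: {value!r}")
    digits, hemisphere = value[:-1], value[-1]
    if hemisphere not in "NSEW" or not digits.isdigit():
        raise ValueError(f"not a packed DMS value: {value!r}")
    degrees = int(digits[:-4])
    minutes = int(digits[-4:-2])
    seconds = int(digits[-2:])
    if minutes >= 60 or seconds >= 60:
        raise ValueError(f"minutes or seconds out of range: {value!r}")
    decimal = degrees + minutes / 60 + seconds / 3600
    return -decimal if hemisphere in "SW" else decimal


print(packed_dms_to_decimal("473000N"))   # 47 deg 30 min 00 sec north
print(packed_dms_to_decimal("1221500W"))  # 122 deg 15 min 00 sec west
```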

 

 

 

>>> Posting number 132, dated 19 Nov 2001 07:53:03

 

>>> Posting number 133, dated 20 Nov 2001 10:41:10

 

>>> Posting number 134, dated 20 Nov 2001 10:57:38

 

>>> Posting number 135, dated 20 Nov 2001 18:52:31

Date:         Tue, 20 Nov 2001 18:52:31 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Vertical Datum?

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

Dear Barbara,

 

Thanks for your reply to my earlier message.  I have another question for

both you and John:

 

Do we need to note the "Vertical Datum" if one is provided on a map source?

 One of the Michigan USGS maps that I looked at this week had the following:

Horizontal Datum:  NAD1927

Vertical Datum:  NGVD 1929

 

Also, it looks like we'll be using Topozone

(www.topozone.com/findplace.asp) for georeferencing some of the Michigan

localities (just point the cursor anywhere on the map and the coordinates

of choice [UTM, lat-long decimal, or lat-long degrees and minutes] appear

on the lower part of the screen).

 

XXXXXXXX

 

 

 

>>> Posting number 136, dated 23 Nov 2001 10:20:39

Date:         Fri, 23 Nov 2001 10:20:39 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Collaborative Georeferencing Theory II

In-Reply-To:  <B81AAE5B.EB7%jrozdil@u.washington.edu>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

On Fri, 16 Nov 2001, John Rozdilsky wrote:

 

> John,

> 

> My intention was never to clean up our locality data with geo-referencing

> funds!  I was operating on the assumption that we would be responsible for

> our own data and therefore it would have been worthwhile to clean it up on

> our own dime before geo-referencing.  Which gets to another question.  I

> have cleaned up localities in our database since downloading it to you.  Is

> this going to cause problems in downloading the newly geo-referenced

> localities from MANIS into our current database?  Can I continue to clean up

> our own database?

 

 

XXXX and all,

There has been some confusion with respect to localities,

lat_longs, higher geographies, and the means by which data get back into

your local databases. I have neglected the discussion so far in favor of

getting people working, but clearly there is a great deal of anticipation

on the subject.  I'll explain this stuff in detail on my trip into town

next week.

 

In the meantime, continue as you were. If you are in the midst of cleaning

up locality data and have a good reason to continue doing so at the

moment, go ahead. If you weren't cleaning up locality data, don't do so

for the sake of MaNIS.

 

>Did I understand you correctly when you said to leave

> localities that have lat/long alone?  The reason I ask is that I noticed

> that when you transferred our lat/long to the Manis database, the minutes

> were incorrectly interpreted as decimal degrees.  Should I worry about this?

 

It seems I have misinterpreted your latitude and longitude data, is that

correct? The original data should be ddmmss, not dd.dddd? Is this true of

all lat_long entries? If so, then I need to update the gazetteer with the

correct data. I can do this from here in Argentina, but I'll have to do it

the next time I come to town. You were right to worry about this. Even

though we don't have to georeference those localities that already have

coordinates (at least not in the first pass), we do want to be able to use

them for reference, so they should be made correct. It's probably a good

idea if every institution that provided some lat_long data do a little bit

of double checking to see if I've made the correct interpretation of your

data. If I made one mistake, I certainly am capable of making others.
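
The double checking suggested above can be partly automated. The sketch below assumes the stored value looks like dd.mmss (degrees, then two digits of minutes and two of seconds after the point) and shows how far off it is when read as dd.dddd instead. The function names and the sample values are hypothetical.

```python
def dms_to_decimal(degrees, minutes, seconds=0.0):
    """Degrees, minutes, seconds to decimal degrees."""
    return degrees + minutes / 60 + seconds / 3600


def reading_as_ddmmss(value):
    """Re-read a number such as 47.3015 as 47 deg 30 min 15 sec.

    Returns decimal degrees, or None when the digits cannot be minutes and
    seconds (either part is 60 or more), in which case the value was
    probably decimal degrees all along.
    """
    whole, frac = f"{abs(value):.4f}".split(".")
    minutes, seconds = int(frac[:2]), int(frac[2:])
    if minutes >= 60 or seconds >= 60:
        return None
    result = dms_to_decimal(int(whole), minutes, seconds)
    return -result if value < 0 else result


stored = 47.3015                      # hypothetical value from a gazetteer row
corrected = reading_as_ddmmss(stored)
print(f"as decimal degrees: {stored}")
print(f"as dd.mmss:         {corrected:.4f}")
print(f"difference:         {corrected - stored:.4f} degrees")
```

If every value in a column passes the minutes-and-seconds test, that is a hint (not proof) that the column holds ddmmss data. A single value with 60 or more in either position shows that it cannot be.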

 

> Will we have to change our database to accept decimal degrees?  I appreciate

> your thorough responses.  I am trying to clarify and simplify our tasks.

> That is my bottom line.

 

You will not have to make changes in your database to accept decimal

degrees. You can use whatever coordinate system you like locally, and I

can give you your data in that format when it comes time to download data

from the gazetteer into your database.

 

For better or worse, I have been trying to simplify explanations -

sometimes at the expense of explaining the complete plan. I guess it's

turning out OK though, because all of the right questions are being asked.

 

John W.

 

>>> Posting number 137, dated 23 Nov 2001 10:24:34

Date:         Fri, 23 Nov 2001 10:24:34 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Questions about Georeferencing

In-Reply-To:  <3BF6CED4.5296BDB@oz.net>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

Dear All,

Barbara has answered everything below perfectly well. I'm just "weighing

in" to say so.

 

On Sat, 17 Nov 2001, Barbara R. Stein wrote:

 

> XXXXXXXX wrote:

> 

> > Thanks for all of the great georeferencing information, steps, and

> > guidelines! Robin Bolig and I have been familiarizing ourselves with the

> > guidelines, steps and very helpful weblinks.  We downloaded the Ingham

> > County (Michigan) records into the Access template, and I feel that this

> > county is a comfortable starting place for us (it is our institution's

> > county).

> 

> Go for it!  Starting is half the battle.

> 

> > Before we begin, I would appreciate clarification on a couple of items.

> > Thank you for your time.

> 

> As always, I will provide my thoughts and John will weigh in when he's next

> online.

> 

> > 1)  Is it okay to use available "online" latitude and longitude

> > coordinates, as long as Datum information, etc. are available?

> 

> Yes.  Just make sure you specify the source of those coordinates in the

> designated field on your spreadsheet.

> 

> > For example, the Township, Range, Section Information website

> > (http://www.esg.montana.edu/gl/trs-data.html.) that is listed in the MaNis

> > Georeferencing Guidelines has links whereby one can search for a named

> > place, and the decimal degrees coordinates (to four decimal places) come up

> > for that place (example, City of Mason, Michigan).  Is it okay to use such

> > on-line coordinates for georeferencing place names, or should all

> > georeferencing be done with "hard copy" references?

> 

> We encourage you to take advantage of all available tools, that's why we

> provided those URLs.  There may be others as well.  Just make sure your sources

> are credible.

> 

> > 2)  If the answer to the above question is that all georeferencing should

> > be done with "hard copy" references, then ignore this one.

> >

> > A related question to 1): from the same website mentioned above, one can

> > link to "TerraServer" and get (really interesting) aerial photos of places.

> >  With the aid of a labelled map, one can zoom in and find specific

> > buildings (such as the Michigan State University Swine Barn - a real Ingham

> > County example).  From a zoomed aerial image, you can click on "Image Info"

> > and get lat and long (non-decimal) coordinates for "tiles" (corners of

> > squares) surrounding the image.  Datum information is included in "Image

> > Info".

> >

> > So my question is, is it okay for us to use these types of on-line aerial

> > images for georeferencing?

> 

> I'm including this question just for completeness.  The answer is, of course,

> yes.  And remember, do not worry about the type of coordinate data you record.

> The error calculator will be able to convert data provided in any format (e.g.,

> deg, min, sec; dec. degrees; etc.) into any other format.  Knowing the datum,

> providing the source of your coordinates, and noting any assumptions you have

> made in assigning those coordinates are what's crucial.

> 

> > 3)  With regard to the "DeterminedDate" data field in the download file -

> > is there a specific format for the date data(i.e. MM/DD/YYYY or DD Month

> > Spelled Out YYYY) that you would like us to use?

> 

> No, because most spreadsheet programs will dictate a format.  It seemed

> worthless for us to specify one.  John will have to deal with that variety

> later.

> 

> Best,

> Barbara

> 

 

>>> Posting number 138, dated 23 Nov 2001 10:27:35

Date:         Fri, 23 Nov 2001 10:27:35 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Vieglias routine (fwd)

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

XXXX and all,

Don't confuse the lat_long determination with the error determination. You

can get the lat_long without the extents, but you need to use the extents

as one of the sources of uncertainty - which contributes to the maximum

error distance, but does not affect the lat_long itself.

 

The guidelines do allow the distance bearing computation to be made from

GNIS coordinates, and I agree, it would be a crime not to use those data.

I would very much like to provide the tool that can parse the localities

and calculate the lat_longs from any gazetteer. In February I'll likely be

collaborating with the Alexandria Digital Library Project to do just that.

I am currently awaiting the development of a protocol to communicate with

their Digital Gazetteer.
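
As a rough illustration of the distance-and-bearing computation discussed here, the sketch below projects a point from a starting coordinate along a compass bearing. It assumes a spherical Earth and ignores datum and the extent of the named place; the function name, the radius constant, and the sample starting point are illustrative placeholders, not part of the MaNIS tools or of any GNIS record.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation
MILES_TO_KM = 1.609344


def offset_point(lat_deg, lon_deg, bearing_deg, distance_km):
    """Return (lat, lon) in decimal degrees reached by travelling
    distance_km from the start point along an initial bearing
    (degrees clockwise from north) on a sphere."""
    lat1 = math.radians(lat_deg)
    lon1 = math.radians(lon_deg)
    bearing = math.radians(bearing_deg)
    angular = distance_km / EARTH_RADIUS_KM  # distance expressed as an angle

    lat2 = math.asin(
        math.sin(lat1) * math.cos(angular)
        + math.cos(lat1) * math.sin(angular) * math.cos(bearing)
    )
    lon2 = lon1 + math.atan2(
        math.sin(bearing) * math.sin(angular) * math.cos(lat1),
        math.cos(angular) - math.sin(lat1) * math.sin(lat2),
    )
    # Normalise longitude to the range [-180, 180).
    lon2_deg = (math.degrees(lon2) + 540.0) % 360.0 - 180.0
    return math.degrees(lat2), lon2_deg


# Hypothetical locality "5 mi N of <named place>"; the start point is a
# made-up placeholder, not a real gazetteer entry.
lat, lon = offset_point(42.5, -84.5, 0.0, 5 * MILES_TO_KM)
print(f"{lat:.5f}, {lon:.5f}")
```

The result is only the coordinate. The extent of the named place and the other sources of uncertainty still have to go into the maximum error distance separately.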

 

There are really two tools that would be nice. I've already mentioned the

first one, which would be based on Dave Vieglais' SPPFind tool, which I

have not yet tested. The second is the error calculator, which is

referenced in the MaNIS web pages, but is not yet functional. I've

finished the Error Calculator Tool except for the datum error

contributions and testing. I would like to suggest that charging ahead on

the lat_long determinations is fine, but leave off the error stuff until

the tool is ready for prime-time.  That error stuff is just too burdensome

to do by hand. Doing one pass for lat_longs and one for errors might

actually be more efficient, but we'll need evidence "from the trenches"

to figure out if this is true.

 

John W.

 

---------- Forwarded message ----------

Date: Sat, 17 Nov 2001 13:21:47 -0800

From:

To: tuco@socrates.Berkeley.EDU

Cc: bstein@oz.net

Subject: Vieglias routine

 

John W.  So much for theory.  On a more practical matter.  The rules indicate

that  "If the [SpecLoc] description includes an offset, use the furthest

extent of the named place in the direction of the offset."   So we should

NOT compute terminal lat/longs from the GNIS lat/longs and bearing?   I ask

because GNIS locs don't appear to take into account the furthest extent of

the named place.  Related, should we wait for the georeferencing tool

mentioned in the 10/18/01 email or just charge ahead?  I assume it was to

take GNIS locs and try to match them with occurrences in the MaNIS file

(from project description), then compute terminal lat/longs based on

distance and bearing.   Modifying the rules to allow the distance-bearing

computation based on GNIS lat/long would really increase georeferencing

rate, and as long as the technique was referenced, I don't see a problem.

 

 

 

 

>>> Posting number 139, dated 23 Nov 2001 10:29:58

Date:         Fri, 23 Nov 2001 10:29:58 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Vertical Datum?

In-Reply-To:  <3.0.32.20011120185230.00718380@pilot.msu.edu>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

XXXXX and all,

 

The Vertical Datum refers to the geometric model from which elevations are

determined. In our data we consider altitude (or elevation) as an

attribute of the locality, not as an attribute of the position. Or, to say

it another way, when we record positions digitally, we include latitude,

longitude, and horizontal datum, but we do not include elevation and

vertical datum.  In short, we treat elevation as a part of the locality,

so we do not need to consider the vertical datum since it has no bearing

on our georeferencing.

 

Note, unless I am mistaken there is no way to know the datum when using

Topozone. Someone please correct me if I'm wrong. This isn't really a big

problem as long as the error is calculated with an unknown datum.

 

John W.

 

On Tue, 20 Nov 2001, XXXXXXXXXX wrote:

 

> Dear Barbara,

> 

> Thanks for your reply to my earlier message.  I have another question for

> both you and John:

> 

> Do we need to note the "Vertical Datum" if one is provided on a map source?

>  One of the Michigan USGS maps that I looked at this week had the following:

> Horizontal Datum:  NAD1927

> Vertical Datum:  NGVD 1929

> 

> Also, it looks like we'll be using Topozone

> (www.topozone.com/findplace.asp) for georeferencing some of the Michigan

> localities (just point the cursor anywhere on the map and the coordinates

> of choice [UTM, lat-long decimal, or lat-long degrees and minutes] appear

> on the lower part of the screen).

> 

> Thanks,

> XXXXX

> 

> 

 

 

>>> Posting number 140, dated 26 Nov 2001 10:20:20

Date:         Mon, 26 Nov 2001 10:20:20 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Topozone - Datum

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

Hi John,

 

According to the topozone website (address below), it appears that the

given coordinates are based on NAD27.  (This is listed next to the

coordinate buttons - UTM, DecLatLong, etc. - on the website).

 

Please let me know if you have other information about this.

 

Thanks,

XXXXX

 

 

 

>Note, unless I am mistaken there is no way to know the datum when using

>Topozone. Someone please correct me if I'm wrong. This isn't really a big

>problem as long as the error is calculated with an unknown datum.

> 

>John W.

> 

 

> Also, it looks like we'll be using Topozone

> (www.topozone.com/findplace.asp) for georeferencing some of the Michigan

> localities (just point the cursor anywhere on the map and the coordinates

> of choice [UTM, lat-long decimal, or lat-long degrees and minutes] appear

> on the lower part of the screen).

> 

 

 

 

>>> Posting number 141, dated 3 Dec 2001 05:59:15

Date:         Mon, 3 Dec 2001 05:59:15 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Loading Lat_Longs back into databases

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

Dear All,

 

Last week I promised a message about the relationship between the

gazetteer and your databases - the bigger picture.

 

We've already talked about the static nature of the current MaNIS

gazetteer. As I've said, the gazetteer in its current form is a temporary

tool to aid in collaborative georeferencing. Once the network gets going

there will be a dynamic gazetteer as described in the NSF proposal.

Because our "snapshot" data are static and our databases are not, the

differences between the two will increase over time, especially for those

who are specifically editing locality-related data. I guess that when

people made this realization it caused some concern.

 

I designed the gazetteer with the issue of changing data in mind, and I've

done a few things to aid in data reconciliation when the lat_longs get

loaded back into your databases. For example, I've stored much more

information in the MaNIS gazetteer than is visible in the online

interface, including information that relates the localities (and

therefore the lat_longs)  back to the specimens themselves. The structure

of the gazetteer may be of interest, so I will post the Gazetteer

Entity-Relationship diagram as a document on the MaNIS website when I get

back to civilization.

 

Since I stored all of the original locality-related information along with

the references to the specimens, it will be possible (when the time comes

to load lat_long information into your databases) to compare the snapshot

locality data with the then-current locality data. For all of those

localities where there has been no change, the lat_long data can be loaded

without question. This first step should take care of most records for

most institutions. For the rest of the records, where the locality data no

longer exactly match the snapshot data, some analyses can be done to

determine if the differences can be considered "substantive," by which I

mean that they would affect the determination of the lat_long. For

example, a snapshot locality that is the same as the then-current locality

except that an elevation has been added can be considered as not

substantively changed and can therefore have its lat_long record loaded.

This step will be a little different for each institution. After doing

some bulk checking for differences such as in the foregoing example, I

envision making one visual pass over the remaining records, with the

original and the then-current localities side-by-side, putting a checkmark

in a column called "substantive" for those records that have had

substantive changes. When that pass has been made, all of the lat_longs

for records without a checkmark can be loaded. This third step should take

care of most of the remainder of the localities. What's left will be

locality-specimen relationships that have changed since the time when the

snapshot was taken. These records will have to be resolved by the

individual institutions.
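
The reconciliation passes described above might be sketched roughly as follows. This is only an illustration under stated assumptions: the field names, the set of "non-substantive" fields, and the `classify` helper are hypothetical and do not reflect the actual gazetteer schema or any institution's database.

```python
# Hypothetical field names; the real gazetteer schema is not shown here.
NON_SUBSTANTIVE_FIELDS = {"elevation"}


def classify(snapshot, current):
    """Compare one snapshot locality with its then-current version.

    Returns "load" when the lat_long can be loaded without review and
    "review" when a person should compare the two side by side."""
    if snapshot == current:
        return "load"  # first pass: nothing changed
    changed = {
        field
        for field in snapshot.keys() | current.keys()
        if snapshot.get(field) != current.get(field)
    }
    if changed <= NON_SUBSTANTIVE_FIELDS:
        return "load"  # bulk check: only fields that cannot move the point
    return "review"  # left for the visual pass


snapshot = {"county": "Ingham", "specloc": "Mason"}
with_elevation = {"county": "Ingham", "specloc": "Mason", "elevation": "270 m"}
reworded = {"county": "Ingham", "specloc": "2 mi N Mason"}

print(classify(snapshot, snapshot))        # load
print(classify(snapshot, with_elevation))  # load
print(classify(snapshot, reworded))        # review
```

Records whose specimen-to-locality relationship has itself changed are outside this sketch and would still go back to the owning institution.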

 

There are some tricks and techniques I haven't presented yet, but I hope

that what I've written above helps to clarify the bigger picture with

respect to georeferencing. Questions have proven useful thus far, so if

there's anything else about which you'd care to have me elaborate, please

ask.

 

In the spirit of looking forward, another thing to think about for the

future is the incorporation of the coordinates and metadata into your own

local databases.  Some institutions don't have attributes in their

databases to hold lat_long information. Similarly, not everyone (but there

are some!) has an attribute to accommodate maximum error distance. It would

be a shame to throw away all of this hard-earned and valuable data. At

this point I'm asking you to consider the ramifications of storing these

data so that there are no unpleasant surprises when the time comes to load

the data back into your databases.

 

John W.

 

>>> Posting number 142, dated 7 Dec 2001 07:24:45

Date:         Fri, 7 Dec 2001 07:24:45 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing Guide Revisions

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

Dear All,

 

While working through the development of the Georeferencing Calculator I

discovered minor numerical and typographical errors in the Georeferencing

Guidelines document. This message is just to alert you that I have made

revisions to that document. One particular change worth noting is in the

section on "Uncertainty associated with coordinate precision." It seemed

to me quite reasonable to assume that the coordinate precision should be

the same for both coordinates, and so I've rewritten that section to

reflect this assumption.

 

I've also added some calculation examples against which you might test

your understanding of the georeferencing concepts.

 

One detail of reading the datum error from a file eludes me at the

moment. It is the last remaining issue before the Georeferencing

Calculator becomes available.

 

John W.

 

>>> Posting number 143, dated 10 Dec 2001 13:58:27

Date:         Mon, 10 Dec 2001 13:58:27 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         

Subject:      Information from Topozone  -  NAD 27 Datum

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

Dear All,

 

John asked that I follow up with staff at Topozone

(www.topozone.com/findplace.asp) with regard to datum information on their

website's scanned maps (see previous message exchanges copied below).

 

Here is what I found out:

 

1.  For USGS QUAD MAPS (1:24,000 or 1:25,000):  the vast majority of these

original scanned maps on the Topozone website are based on the NAD 27.  If

any underlying Quad map was originally based on another datum (such as NAD

83 for example), Topozone has REPROJECTED that map into NAD 27.

 

2.  Thus, the Topozone cursor coordinates as well as the underlying Quad

map (whether original or reprojected) are ALWAYS in NAD 27.

 

3.  It was confirmed that all original MICHIGAN QUAD maps that were scanned

for the Topozone website are NAD 27.

 

John, please let us know if it is okay for us to list NAD 27 as the datum

instead of "Datum Unknown" for locality coordinates taken from the Topozone

website.

 

Thanks,

XXXXXX

 

 

 

>>> Posting number 144, dated 14 Dec 2001 15:31:53

 

>>> Posting number 145, dated 16 Dec 2001 11:40:45

Date:         Sun, 16 Dec 2001 11:40:45 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Information from Topozone  -  NAD 27 Datum

In-Reply-To:  <3.0.32.20011210135816.00717590@pilot.msu.edu>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

Thanks XXXXX, this is most excellent. We can use Topozone coordinates with

NAD27 recorded. They have no idea how big a favor they have done for

us. Everyone please list NAD27 with any coordinates derived from Topozone

and remember to record the Reference_Source as "Topozone 1:24000" or the

like.

 

 

 

>>> Posting number 146, dated 3 Jan 2002 10:14:21

Date:         Thu, 3 Jan 2002 10:14:21 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      number of decimals on decimal degrees

Mime-Version: 1.0

Content-Type: text/plain; format=flowed

 

MaNIS:  How many decimals are folks attaching to lat/long determinations?

I'm going with four on decimal degrees even though this is more than is

justified by the offset distances to the nearest mile or fractional mile.

As I understand it, John W's error calculator will attach the correct error

to lat/long determinations based on the offset direction(s), distance and

units.  Sorry if I missed this in previous discussions?

 

 

 

>>> Posting number 147, dated 7 Jan 2002 09:46:27

Date:         Mon, 7 Jan 2002 09:46:27 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: number of decimals on decimal degrees

In-Reply-To:  <F100rz71znUp8acXUgZ000178f6@hotmail.com>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Hi Folks,

   I'm back. Argentina started rioting when I left for Chile. I won't claim

that my leaving was the cause.

   Anyway, my recommendation is to store as many decimal places as your

source gives you and not to confuse those digits with accuracy or precision

- that's why we're using the explicit maximum error distance. I would

certainly caution that to use fewer digits is to introduce extra,

unwarranted errors. Refer to the table in the Georeferencing Guide at

http://elib.cs.berkeley.edu/manis/GeorefGuide.html to see the magnitude of

these errors. If you use 5 digits in a decimal degree coordinate, the error

will be on the same order of magnitude as that for most of today's accurate

GPS readings. The error calculator will also take into account the

precision of the recorded coordinates when calculating maximum error distances.
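
To give a feel for why dropping digits matters, the sketch below estimates the ground distance spanned by one unit in the last recorded decimal place. It is a spherical approximation with an assumed mean Earth radius, not the table from the Georeferencing Guide and not the Error Calculator's method; the function name is a placeholder.

```python
import math

# Length of one degree of latitude on a sphere of mean Earth radius (assumed).
METERS_PER_DEGREE = 2 * math.pi * 6371008.8 / 360


def rounding_step_m(decimals, lat_deg):
    """Ground size, in meters, of one unit in the last decimal place of a
    decimal-degree coordinate: (north-south, east-west) at the given latitude."""
    step_deg = 10.0 ** -decimals
    north_south = step_deg * METERS_PER_DEGREE
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


for decimals in (2, 4, 5):
    ns, ew = rounding_step_m(decimals, 45.0)
    print(f"{decimals} decimals: about {ns:.1f} m N-S, {ew:.1f} m E-W")
```

Under these assumptions, two decimal places correspond to roughly a kilometer north-south and five decimal places to roughly a meter.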

 

>MaNIS:  How many decimals are folks attaching to lat/long determinations?

>I'm going with four on decimal degrees even though this is more than is

>justified by the offset distances to the nearest mile or fractional mile.

>As I understand it, John W's error calculator will attach the correct error

>to lat/long determinations based on the offset direction(s), distance and

>units.  Sorry if I missed this in previous discussions?

> 

 

 

>>> Posting number 148, dated 7 Jan 2002 12:37:08

 

>>> Posting number 149, dated 7 Jan 2002 12:57:05

 

>>> Posting number 150, dated 7 Jan 2002 12:45:12

Date:         Mon, 7 Jan 2002 12:45:12 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Should not found SpecLocs default to county?

In-Reply-To:  <v02130501b85f9bf6b38a@[207.207.103.162]>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX, and all,

 

>John W.:  So I'm wondering about the Oregon records.  There are about 400

>with DecLat/longs that were already assigned when downloaded, but they only

>have two decimals.  Was this a formatting or rounding decision?  I'll leave

>them as is as I assume if someone assigned lat/long it is more accurate than

>the SpecLoc.

 

Actually, it was a formatting error. The decimal lat/longs that appear in

the download have been truncated to 2 decimal places. This wasn't my

original intention.  The truncation occurred somewhere in transferring

between Access and the Informix database from which the downloaded data are

taken. I'll try to find out where it occurred and fix the problem, then I

will update the decimal latitude and longitude values in the online

gazetteer. This shouldn't affect those who've already downloaded data

for georeferencing since we agreed that the localities that already have

lat/longs will not be georeferenced (again). If anyone is checking and

changing records that have lat_longs already, let me know.

 

>Related, if we cannot find a SpecLoc, should we default to county or leave

>it ungeoreferenced pending investigation by the contributing institution?

>So far not found SpecLocs are running at about 10%  due to discrepancies in

>SpecLoc and county, apparent typos, or ambiguous text.

 

If you cannot find the SpecLoc, leave it ungeoreferenced and say why in the

field called "NoGeorefBecause." If you find the SpecLoc and it is

unambiguously placed in the wrong county, go ahead and georeference it and

make a note to that effect in the "LocalityAnnotation" field in the

downloaded data file. These notes will eventually get back to the source

institution.

 

 

>>> Posting number 151, dated 7 Jan 2002 14:52:05

Date:         Mon, 7 Jan 2002 14:52:05 -0600

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Oregon lat/longs.

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

 

XXXX, and all:

 

"...wondering about the Oregon records. There are about 400..."

 

The Oregon records that had lat/long for specimens in the KU collection

should be redone with the new system.  Those that were added here were done

a couple of years ago using a program that calculated them for us so they

will not be as accurate as the current system we are using.

 

 

 

>>> Posting number 152, dated 8 Jan 2002 20:57:38

 

>>> Posting number 153, dated 16 Jan 2002 15:03:38

Date:         Wed, 16 Jan 2002 15:03:38 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing Error Calculator

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

At long last I'm ready to introduce the Georeferencing Error Calculator.

It's been some time in the making, and I apologize for the delay, but I

wanted to give you a product that wouldn't be a moving target due to

constant revision.  The application has been pretty well tested and I

believe you can use it with confidence in the results it

gives.  Nevertheless, if something doesn't seem quite right, try to figure

out why. Usually it means that the coordinate precision is set too low (the

coordinate precision always reverts to "nearest degree" if you change the

coordinate system). If you exhaust all possibilities of making sense of the

maximum error value that the program gives you (this includes reading the

manual and the georeferencing guidelines), then feel free to send me a

message asking what's going on. If you do, please be explicit about what

you are doing and what all of the parameters are for the calculation that

puzzles you.

 

The Georeferencing Guidelines and the Georeferencing Steps documents have

been modified to include references to the Error Calculator, and the Error

Calculator Manual has been added to the list on the Documents page on the

MaNIS website at the following URL:

 

http://dlp.cs.berkeley.edu/manis/Documents.html

 

Please read the manual so you know what to expect when loading the

Calculator into your browser.  In particular, you should be aware of the

browser constraints and the size of the java applet. It can be quite slow

to load the first time if your connection is slow.

 

Two points about making calculations are also worth emphasizing in advance.

I've already mentioned the first, which is that the coordinate precision

will revert to "nearest degree" if you change the coordinate system. If you

get an error that you think is excessive, the coordinate precision is

likely to be the culprit. Another possible culprit is having the datum set

to "not recorded" if you actually know what datum the coordinates were

taken in. The second important point is that all distance measurements in a

given calculation must be in the same units. For example, don't mix an

offset of 10 miles with an extent of named place of 3 kilometers. Both

measures need to be in one system or the other. The error distance will be

given in the same units as the measurements and all will be governed by

your choice in the Distance Units drop-down list.
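
For anyone preparing measurements before entering them, a minimal sketch of bringing mixed units into one system is shown below. The helper name and unit labels are placeholders chosen for illustration; the Calculator itself simply expects every distance in whichever unit is selected.

```python
# Conversion factors to kilometers (the international mile is exactly 1.609344 km).
KM_PER_UNIT = {"km": 1.0, "m": 0.001, "mi": 1.609344}


def to_km(value, unit):
    """Convert a distance to kilometers; raises KeyError for unknown units."""
    return value * KM_PER_UNIT[unit]


# The example from the message: an offset of 10 miles and an extent of 3 km.
offset_km = to_km(10, "mi")
extent_km = to_km(3, "km")
print(f"offset = {offset_km:.3f} km, extent = {extent_km:.3f} km")
```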

 

Enjoy!

 

>>> Posting number 154, dated 16 Jan 2002 15:28:45

Date:         Wed, 16 Jan 2002 15:28:45 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: CNMA: mammal collection at UNAM

In-Reply-To:  <5.1.0.14.1.20020107123724.00a00090@ibunam.ibiologia.unam.m x>

Mime-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"; format=flowed

Content-Transfer-Encoding: quoted-printable

 

Dear All,

 

I have changed all references to UNAM in the MaNIS documents and database

to be CNMA based on the following request from Fernando Cervantes. The

acronym was not changed in the Project Description document, which is a

copy of the document sent as part of the NSF grant application. Those of

you who downloaded localities previous to 16 January 2002 will still have

UNAM as a CollectionCode in your downloaded data. This will not present a

problem when you return the georeferenced data to me.

 

John W

 

>Dear John

> 

>    To better describe who and where we are at I would like to ask you for

> the following:

> 

>1. In the list of institutions participating in MaNIS and the contacts

>(web site), please include the name, position, and e-mail account of my

>assistants:

> 

>Yolanda Hortelano, yolahm@ibiologia.unam.mx

>Julieta Vargas, jvargas@ibiologia.unam.mx

> 

>2. In addition, please change the acronym of our collection.  Our mammal

>collection is known and registered as CNMA (after Colección Nacional de

>Mamíferos) and is hosted by Instituto de Biología, that belongs to

>Universidad Nacional Autónoma de México (UNAM).

> 

>Thank you for your help,

> 

>Fernando

>------------------------------------------------

>Fernando A. Cervantes

>Zoologia. Instituto de Biologia, UNAM

>Apartado Postal 70-153, Coyoacan

>Mexico, D. F. 04510

>Mexico

> 

>tel.: (525) 622 9143; fax: (525) 550 0164

>e-mail: fac@ibiologia.unam.mx

>sitio web: www.ibiologia.unam.mx/cnma

>------------------------------------------------

 

>>> Posting number 155, dated 17 Jan 2002 09:38:42

Date:         Thu, 17 Jan 2002 09:38:42 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: georeferencing

In-Reply-To:  <5.1.0.14.1.20020107123124.00a00ec0@ibunam.ibiologia.unam.m x>

Mime-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"; format=flowed

Content-Transfer-Encoding: quoted-printable

 

XXXXXXX,

Now that the Error Calculator is done and on the web I was able to check

the data you sent me in December.  I had no problems importing those data

into my system.  When I do this step I check for inconsistencies in the

data and fix them if I can.  The Determination References you provided are

excellent. I wish we could figure out which datum those sources use.

 

I'm curious why you chose to record degrees minutes seconds instead of

decimal degrees for the localities your team georeferenced. I'm only asking

to point out that it would have been easier to just copy and paste the two

decimal degree values. This would have been a little faster and it would

have left less room for error. Even so, there was only one coordinate error

I could find in your data. There was a 10 for decimal seconds where there

should have been a 0.

 

There are some limitations of the Alexandria Digital Library Data of which

everyone should be aware. As long as you recognize these limitations, the

ADL gazetteer is extremely useful. I'm including, below, a message to and a

response from Linda Hill about these limitations.

 

I noticed that none of the localities in the records you georeferenced have

maximum error distances. I hope you will provide these data in the future,

especially now that I've released the Error Calculator, which is supposed

to make the calculations much easier. When you do make error calculations,

be sure to use a coordinate precision of "nearest minute" for Alexandria

Digital Library data that come from NIMA. If you look at the values that

come up there is always either a 0 or a 59 in the seconds field for non-USA

named places. There is something wrong with a coordinate translation

algorithm somewhere that produces this problem. I recommend using the

decimal degree coordinates since they err less than the degrees minutes

seconds.
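
The "0 or 59" pattern can be reproduced with the Huauchinango values quoted in the exchange below. The sketch assumes, without confirmation from ADL, that the decimal degrees were stored as 32-bit floats and that the displayed seconds were truncated rather than rounded. Both are guesses that happen to match the published numbers, and all function names are illustrative.

```python
import struct


def as_float32(x):
    """Round-trip a Python float through a 32-bit float (assumed storage type)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def to_dms_truncating(value):
    """Degrees, minutes, seconds with each part truncated toward zero."""
    value = abs(value)
    degrees = int(value)
    minutes_full = (value - degrees) * 60
    minutes = int(minutes_full)
    seconds = int((minutes_full - minutes) * 60)
    return degrees, minutes, seconds


def to_dms_rounding(value):
    """Degrees, minutes, seconds after rounding to the nearest whole second."""
    total_seconds = round(abs(value) * 3600)
    degrees, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return degrees, minutes, seconds


# Original NIMA values: 20 deg 11' 00" N, 098 deg 03' 00" W.
lat = as_float32(20 + 11 / 60)
lon = as_float32(-(98 + 3 / 60))

print(f"{lat:.6f}, {lon:.6f}")                        # 20.183332, -98.050003
print(to_dms_truncating(lat), to_dms_truncating(lon))  # (20, 10, 59) (98, 3, 0)
print(to_dms_rounding(lat), to_dms_rounding(lon))      # (20, 11, 0) (98, 3, 0)
```

Rounding to the nearest second before splitting into minutes and seconds recovers the original whole-minute values in this example, which is one way a translation step could keep the two representations in sync.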

 

I especially appreciate the Locality Annotations your team provided and I

hope the other recipients of your georeferenced data do as well.

 

 

 

>John: Here's the situation. The data in our gazetteer for the example you

>used is NIMA. The original NIMA coordinates are:

> 

>NIMA: 20° 11' 00" N 098° 03' 00" W

> 

>NIMA points are all limited to 1 minute resolution, I believe, although they

>don't document this any way that I have seen.

> 

>We have two clients and they show the coordinates as:

> 

>CDL-Middleware client to ADL Gazetteer: Longitude W 98° 03' Latitude N 20° 11'

> 

>AOL client to ADL Gazetteer: Longitude: -98.050003 (98°3'0"W) Latitude:

>20.183332 (20°10'59"N)

> 

>The problem with the AOL client is that the original ddmmss values were

>converted to decimal degrees and then the ddmmss values that are shown in

>the interface are calculated from them, giving the impression that there

>is more resolution in the location than is warranted. As you point out, in

>your example there is obviously a problem with the '3' as the last digit

>in the longitude value. We are aware of these problems but have not gone

>back and fixed it. We have limited staff to work on the gazetteer and

>have put more work into other developments. What we intend to do is to

>phase out the AOL client and replace it with a client based on our

>middleware software (like the CDL client). We will be storing decimal

>degrees in our database but need to be smarter about the specificity

> 

>Neither the USGS nor NIMA clearly reference the geodetic basis of their

>coordinates. We are assuming that they are using WGS-84. In our revised

>Gazetteer Content Standard there is an element to declare the geodetic

>basis for the coordinates. We are setting the default value as WGS-84 but

>other bases can be entered. With our current gazetteer, I think you will

>not go far wrong with assuming WGS-84. Also, we have elements for making a

>statement about the 'accuracy' of the coordinates. In the future as we

>build up better data, these statements could give assistance in making the

>estimates that you need.

> 

>I had a look at your 'estimator' for maximum geospatial error in specifying

>locations. It looks very useful. I passed the URL on to our gazetteer team

>here so that they can see what you are doing.

> 

>We are still working on getting our gazetteer protocol server working

>properly. We solved a major parsing problem today. There is still more to

>do but you might start thinking about how you might embed gazetteer lookup

>in your script using our gazetteer service protocol.

> 

>I appreciate your feedback and apologize for the limitations of our

>gazetteer. We continue to work on it and welcome collaboration to 'make it

>right'.

> 

>- Linda

> 

> 

>John Wieczorek wrote:

> 

> > Hi again,

> > I have people engaged in georeferencing for the MaNIS Project now. My

> > first set of georeferenced data have just been returned and the ADL

> > gazetteer was among the Reference Sources used to get coordinates for the

> > data. My questions are about the coordinates themselves. I'll use a

> > specific example to better illustrate the questions.

> >

> > The locality in question is Huauchinango, Puebla, Mexico. The gazetteer

> > shows coordinates in two units, decimal degrees and degrees minutes

> > seconds. Specifically, for this example, the decimal degrees are

> > 20.183332, -98.050003. The degrees minutes seconds are 20°10'59"N,

> > 98°3'0"W. These two aren't the same when you get out to that sixth

> > decimal place in longitude, and they differ even more in latitude. I'm

> > wondering whether there is a way to know which is the original coordinate

> > system (i.e., the one without the error introduced by translation). Both

> > coordinates actually have tell-tale signs of tampering. That 3 out at the

> > end of the decimal longitude looks like a floating point error. The fact

> > that so many of the named places from this region have only 0 or 59 in

> > the seconds fields is also highly suspect. So, I wonder at what step the

> > translation(s) was(were) made - whether it comes from the original data

> > source (in this case NIMA) or whether it is post-processing done on your

> > end. If it is the former, I suppose we're stuck with it, but if it's the

> > latter I wonder if a better algorithm could be used to keep the

> > coordinates in sync. I can offer one, if that helps.

> >

> > Finally, I've probably asked this before, but is it possible to get the

> > datum information along with the coordinates? I suspect that information

> > is missing as metadata from the original data sources, but if it isn't

> > missing, is there any possibility that it could be among the data you

> > provide in the ADL gazetteer interface? It makes a great deal of

> > difference sometimes in determining the maximum error distance for the

> > coordinates assigned to a locality, and this will, in turn, affect

> > analyses further on down the road.

> >

> > Thanks bunches,

> > John W
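
The symptoms described in this exchange can be reproduced with a short Python sketch. It rests on two guesses that neither message confirms: that the source coordinates were whole minutes (20°11'N, 98°03'W) stored as single-precision floats, and that the interface truncated rather than rounded when converting back to degrees, minutes, and seconds. The function names are hypothetical, and this is not necessarily the algorithm John had in mind.

```python
import struct


def to_float32(value):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes, seconds plus N/S/E/W to signed decimal degrees."""
    sign = -1 if hemisphere in ("S", "W") else 1
    return sign * (degrees + minutes / 60 + seconds / 3600)


def decimal_to_dms(decimal_degrees, truncate=False):
    """Convert decimal degrees to unsigned (degrees, minutes, whole seconds).

    Working in total arcseconds lets a rounded value of 60 seconds carry
    into the minutes instead of surfacing as 59 seconds.
    """
    arcseconds = abs(decimal_degrees) * 3600
    total = int(arcseconds) if truncate else round(arcseconds)
    degrees, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return degrees, minutes, seconds


# Assumed originals: whole-minute coordinates stored in single precision.
lat = to_float32(dms_to_decimal(20, 11, 0, "N"))
lon = to_float32(dms_to_decimal(98, 3, 0, "W"))

print(f"{lat:.6f}, {lon:.6f}")             # 20.183332, -98.050003
print(decimal_to_dms(lat, truncate=True))  # (20, 10, 59)
print(decimal_to_dms(lat))                 # (20, 11, 0)
print(decimal_to_dms(lon))                 # (98, 3, 0)
```

Under these assumptions the stray 3 in the longitude and the 59 in the latitude seconds both come from the same single-precision round trip, and rounding to the nearest arcsecond recovers the whole-minute values.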

 

>>> Posting number 156, dated 20 Jan 2002 10:39:23

 

>>> Posting number 157, dated 31 Jan 2002 15:13:44

 

>>> Posting number 158, dated 31 Jan 2002 16:18:34

Date:         Thu, 31 Jan 2002 16:18:34 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing Update

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

I write for two purposes. The first is that I'm curious to know how many of

you have actually begun to georeference. So far, I know that CNMA and the

MVZ have begun. The reason I ask is that I would like to begin a discussion

on the list of techniques to make the task go faster. I don't really want

to do that until most everyone is actually getting their hands dirty. In

this way everyone will be able to benefit from the discussion. So, please

let me know either that you have already begun georeferencing, or when you

anticipate beginning.

 

My second purpose is to let you know that, due to my ignorance of the

details of two of your esteemed collections databases, I made some faulty

assumptions when I first processed the data for the online gazetteer.  As a

result, I need to reload data for UWBM and for ROM.  I have already

reprocessed the UWBM data and I'll try to load it into the gazetteer as

soon as possible (hopefully by Monday). The situation with ROM is more

complex and I anticipate making an update to the ROM data in about one

month. There are a few implications of this unfortunate necessity.

 

1) If you have not yet downloaded localities for georeferencing, wait to

make your downloads at least until I announce that the update for UWBM has

been done. Don't wait for the ROM update to be done unless for some reason

you weren't going to begin georeferencing for another month anyway.

 

2) If you have downloaded localities, but have not yet begun georeferencing

them, throw away the downloaded file(s) you have and download them again

after I announce that the UWBM update is complete. Again, don't wait for

the ROM update to be done unless you weren't going to begin georeferencing

for another month anyway.

 

3) If you downloaded and began georeferencing files that include UWBM

and/or ROM records, please discard those records (only) from your record

set, even if you happen to have already georeferenced some of them. My

suspicion is that not much actual georeferencing has commenced to date

(though I'd love to hear otherwise), so this is unlikely to be a big

problem. After discarding the UWBM and ROM records, please do another

download with the same criteria you used last time, but this time please

select UWBM in the Institution drop-down box. This will give you only the

UWBM records from your geographic area of interest. After they download

successfully, append these UWBM records to the records you've already begun

georeferencing and proceed as if nothing had happened.

 

When the ROM records are ready I'll make another announcement to the list

about downloading only ROM records to append to your working files. The

process will be exactly the same as in scenario 3, above. In the meantime,

ROM records will still be in the gazetteer, but please do not spend time to

georeference them. Throw them out now, or when I make the announcement, as

you prefer.

 

Thanks, and my sincere apologies for the inconvenience. I promise to try to

not make assumptions about other people's data anymore. I should know

better by now.

 

John W

 

>>> Posting number 159, dated 1 Feb 2002 17:29:10

 

>>> Posting number 160, dated 1 Feb 2002 17:33:01

 

>>> Posting number 161, dated 1 Feb 2002 18:24:45

 

>>> Posting number 162, dated 1 Feb 2002 15:42:04

 

>>> Posting number 163, dated 1 Feb 2002 19:27:30

Date:         Fri, 1 Feb 2002 19:27:30 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Gazetteer update

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

The promised gazetteer update is complete. Download with abandon!

 

John W

 

>>> Posting number 164, dated 4 Feb 2002 17:36:54

Date:         Mon, 4 Feb 2002 17:36:54 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Fwd: Georeferencing by MSU

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

Please read, there are some excellent questions raised here.

 

>Date: Mon, 04 Feb 2002 17:58:51 -0500

>To: tuco@socrates.Berkeley.EDU

>From:

>Subject: Georeferencing by MSU

> 

>Hi John,

> 

>XXXXXXX and I want to give you an update on georeferencing and relay

>some concerns/questions.

> 

>In late November, we downloaded records for several Michigan counties and

>have since practiced on the different types of localities.  Using

>Topozone, we worked individually on Eaton and Barry Counties and then

>compared and discussed our approaches and results.  Prior to the

>introduction of the error calculator, we reported our results as UTM

>coordinates in the Access file template provided.

> 

>With the availability of the error calculator (thank you very much!) and

>recent revised guidelines, we began recording original coordinates as

>decimal degrees for Barry County.  Our plan is to send you each of our

>Barry County files in the next few days.  We would appreciate your

>comments on our results and techniques before we proceed with the "real

>thing".

> 

>We have some questions and comments:

> 

>1. Evolving Guidelines - We would appreciate an announcement whenever

>there is an update to the guidelines, specifying which sections are

>altered, to ensure that we are always working with the most recent

>information.  Thanks again for all of your hard work with this!!

 

Point well taken. I've tried to be good about announcing the updates, but I

haven't always completely described what the changes were.

 

>2. Guidelines Questions - In the calculation example of Distance Along

>Orthogonal Directions, the Direction Precision is given as 45 degrees.  It

>seemed earlier in the document that "directional imprecision can be

>ignored" in such an example.  Are we misunderstanding something?

 

I included one too many lines in my copy and paste. The Direction Precision

should not figure into that calculation. I will remove the extraneous line

from the Georeferencing Guidelines.

 

>In the calculation example of Named Place Only/Bakersfield, the

>coordinates are 35 degrees, 22', 24"N and 119 degrees, 1' 4" W.  We

>understand from the example that these are the GNIS coordinates for

>Bakersfield.  In other  examples (e.g. Distance Along Orthogonal

>Directions and Distance at a Heading) the latitude and longitude

>coordinates are the same as for the Named Place/Bakersfield example. Since

>the actual localities are different (from Bakersfield), shouldn't the

>coordinates be different as well?

 

Absolutely. You win a prize for catching those mistakes. The "Distance

Along a Path" example was similarly problematic. I have changed the wording

as well as the values for Latitude, Longitude, Decimal Latitude, and

Decimal Longitude for these examples to reflect that the coordinates of the

locality are different from the coordinates of the named place mentioned in

the locality description.

 

>3.  Coordinates for the Center of a Township  -  If a locality is a

>township name only, is it preferable to use the coordinates for the

>township that are automatically provided by Topozone (via the place name

>search), or use the coordinates for the intersection of Sections 15, 16,

>21, and 22 (assuming the township consists of the "standard" 36 one-mile

>square sections)?

 

I was unaware that one could (and unable to figure out how to) find a

township, in the TRS sense, from the place name search on Topozone. I did

notice that you can find named townships (Michigan is full of them), but I

don't believe their coordinates correspond with the TRS sections they

occupy. Nevertheless, the coordinates we're looking for are those of the

intersection of center sections, as Laura mentioned above.
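
For reference, a standard township numbers its 36 sections in a serpentine pattern starting from the northeast corner, which is why Sections 15, 16, 21, and 22 meet at the center. The Python sketch below (function name hypothetical) builds that grid and picks out the four center sections. It assumes a regular 36-section township and does not compute any coordinates.

```python
def township_sections():
    """Return the 6 x 6 section grid of a standard township.

    Rows are listed north to south; each row is listed west to east.
    """
    grid = []
    for row in range(6):
        numbers = list(range(row * 6 + 1, row * 6 + 7))
        # Rows 0, 2, and 4 are numbered east to west, so reverse them
        # to read west to east like the others.
        if row % 2 == 0:
            numbers.reverse()
        grid.append(numbers)
    return grid


grid = township_sections()
for row in grid:
    print(" ".join(f"{n:2d}" for n in row))

# The four sections that share the corner at the center of the township.
center = sorted(grid[2][2:4] + grid[3][2:4])
print(center)  # [15, 16, 21, 22]
```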

 

>4.  Extent of an intersection - One of the localities that we recently

>georeferenced in Barry County was the intersection of two roads.  We used

>the coordinates from Topozone and estimated the extent of the intersection

>to be 50 meters.  Is this a reasonable estimate to use in general for this

>type of locality?  (The locality was considered as a named place for

>calculation of error).

 

That seems like a generous extent unless the roads are 12-lane highways or

something.  I would opt for something more like 10 meters for your everyday

two-lane roads.  Certainly, feel free to override my opinion if the

circumstances warrant it.

 

>5. Extent of a named place that lacks bounding boxes  - We have

>encountered named places that lack bounding boxes on both the Topozone

>image as well as a Michigan County Gazetteer book.  We have estimated

>extents of such places based on the clusters of buildings that appear as

>black squares on Topozone in 1:25,000 scale.  Is this type of estimate okay?

 

That's what I'd do, and that's what my georeferencers have been doing from

the outset.

 

>6.  Cursor Accuracy - Robin and I have different model computers that

>utilize different web browsers (I have Netscape; Robin has

>Explorer).  When Robin connects to Topozone, her computer cursor

>automatically changes to a crosshair.  I manually changed my computer

>cursor from the "standard" arrow to a crosshair.  I believe this has made

>a difference in attempting to pinpoint localities on the Topozone map.

 

Good idea. It hadn't occurred to me because we're all using Netscape, and

we're only using Topozone occasionally.  Just as a point of information,

for California we most often use Terrain Navigator from MapTech

(http://maptech.com/) to do our georeferencing.

 

>Thanks for all of your help!!

 

Thanks for your excellent questions and comments.

 

>XXXXXXXXX

 

 

>>> Posting number 165, dated 6 Feb 2002 11:48:03

Date:         Wed, 6 Feb 2002 11:48:03 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Should we save extents?

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

MaNISers:

Should we save extents?  In georeferencing, one variable that will not be

saved is the extent used to compute the error.   The extent cannot be

inferred from the locality descriptions unlike coordinate and offset

imprecision.     In addition, an extent for a populated place will vary

depending on the scale, map, year.  For many records it is the largest

component of the error.  To give folks an idea of how I computed the error,

I am annotating each record with the extent I used.   One could go

overboard and reference the extent, but I am assuming the same system used

to get lat/long (GNIS).   Would it be too much trouble to save extents in

the annotation field?

 

For TRS lat/longs, I am using the extents in the Guidelines update.  For

lookup on the MontanaTRS site I am assuming unknown datum and no error due

to scale as done in the Georef Guidelines examples for placename only.

Correct?

 

 

 

 

>>> Posting number 166, dated 7 Feb 2002 09:55:12

Date:         Thu, 7 Feb 2002 09:55:12 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Datum error significance

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

I figured it was worth answering this question on the list in case others

were wondering the same thing. The commonly used datums in the US are the

North American Datum 1927 (NAD 27), the North American Datum (NAD 83), and

the World Geodetic System 1984 (WGS 84). The difference between NAD 83 and

WGS 84 is quite small compared to the difference between NAD 27 and NAD 83.

All of the USGS maps are in one or the other of NAD 27 or NAD 83. I haven't

done an exhaustive search, but it looks like most US Forest Service and

Bureau of Land Management maps use NAD 27.

Anyway, the 79 m used in the Bakersfield example is the actual distance

between two points having the exact same latitude and longitude, but with

one of the points based on NAD 27 and the other based on WGS 84.  The Error

Calculator uses a pre-calculated matrix of the greatest difference between

these two datums in every 0.2 by 0.2 degree cell in the region between

84.69 degrees North, 179.48 degrees West and 13.69 degrees North, 51.48

degrees West. Outside of this region the calculator uses the assumption of

1 km error due to an unknown datum as documented in the Georeferencing

Guidelines.

When entering coordinates in the calculator it is important to enter the

correct hemisphere. Perhaps that goes without saying, but it is pretty easy

to enter decimal longitude erroneously (without the negative sign in front)

for localities in the western hemisphere. Doing so could seriously affect

the error contribution from an unknown datum.

 

John W.
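
The lookup described above can be sketched in Python as follows. This is only an illustration of the idea, not the Error Calculator's actual code: the `grid` dictionary, the row and column orientation, the function names, and the single demo cell value are all assumptions. Only the region bounds, the 0.2 degree cell size, and the 1 km fallback come from the message.

```python
import math

# Region covered by the pre-calculated matrix, as described above.
NORTH, SOUTH = 84.69, 13.69
WEST, EAST = -179.48, -51.48
CELL_DEGREES = 0.2
DEFAULT_ERROR_M = 1000.0  # 1 km assumed outside the region


def cell_for(latitude, longitude):
    """Return the (row, column) of the 0.2 degree cell containing a point.

    Rows count south from the northern edge and columns count east from
    the western edge; that orientation is an assumption of this sketch.
    """
    row = math.floor((NORTH - latitude) / CELL_DEGREES)
    column = math.floor((longitude - WEST) / CELL_DEGREES)
    return row, column


def datum_error_m(latitude, longitude, grid):
    """Return the unknown-datum error in meters for signed decimal degrees.

    `grid` is a hypothetical stand-in for the matrix: a dict mapping
    (row, column) to the greatest NAD 27 / WGS 84 difference in that cell.
    """
    inside = SOUTH <= latitude <= NORTH and WEST <= longitude <= EAST
    if not inside:
        return DEFAULT_ERROR_M
    return grid.get(cell_for(latitude, longitude), DEFAULT_ERROR_M)


# Demo grid with one cell near Bakersfield; 79 m is used for illustration only.
demo_grid = {cell_for(35.373333, -119.017778): 79.0}

print(datum_error_m(35.373333, -119.017778, demo_grid))  # 79.0
print(datum_error_m(35.373333, 119.017778, demo_grid))   # 1000.0
```

The second call shows the hemisphere problem: dropping the negative sign moves the point outside the covered region, so the error jumps to the 1 km default.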

 

 

>Date: Wed, 6 Feb 2002 11:45:19 -0800

>To: tuco@socrates.Berkeley.EDU

>From:

> 

>John:  Unknown datum question.  Fig 1 in the guidelines has the ranges of

>error for unknown datum.  For Bakersfield the range is 76-100 m error.

>Oregon, which I am georeferencing, is in the same 76-100 m band, so a

>midpoint would be 88 m.  Does 79 m used in the Georeferencing Guidelines

>examples for Bakersfield have some significance?  I realize this doesn't

>matter when using the web calculator, but just wondering because it makes a

>difference of several m when using Excel calculator.

> 

 

 

>>> Posting number 167, dated 13 Feb 2002 11:35:40

Date:         Wed, 13 Feb 2002 11:35:40 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      MSU PRACTICE RECORDS

In-Reply-To:  <3.0.32.20020213132000.00687df0@pilot.msu.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

Below are extracts from an exchange between me, XXXXXXXXXXX, and

XXXXXX stemming from a request to review a set of records that each of

them had georeferenced independently. Several points of interest to the

readers of this list were raised, including a continuing discussion of the

issue of extents raised by XXXXXXX on 6 Feb 2002.

 

I'd like to report that this exercise turned out to be a wonderful field

test of the georeferencing guidelines. The coordinates and errors were

remarkably similar, with the largest deviations corresponding to the most

vague locality descriptions. Go team!

 

John W

 

> >Topozone actually has

> >only two underlying scales, 1:25,000, and 1:100,000. The 1:50,000 and

> >1:200,000 versions are just zoomed out by a factor of two from their

> >counterparts. Also, the 1:25,000 Topozone scales actually consist of (were

> >derived from) a combination of 1:25,000 and 1:24,000 maps, which were

> >"resized." It doesn't make all that much difference in the error

> >determination if you use 1:25,000 vs. 1:24,000. However, I hope you are

> >using the 1:25,000 map scale contribution in the error calculator for the

> >1:50,000 Topozone maps. Similarly, use the 1:100,000 map scale for the

> >1:200,000 Topozone maps.

> 

>Good to know all of the above.  Actually, we used "gazetteer" from the

>dropdown on the error calculator for all of the Topozone practice records.

>We were following the example from the georeferencing guidelines where the

>coordinate source (Topozone) was considered to be a gazetteer, and thus

>selected "gazetteer" on the error calculator.  It sounds like we need to

>redo the MAX ERROR with the map scale incorporated.

 

Actually, there is a subtle distinction to make. In the Georeferencing

Guidelines document I said that the source for that "Distance Only" example

was a gazetteer, because the coordinates were for a named place and

Topozone uses the GNIS data to plot named places; thus, the ultimate source

of the coordinates for that example is the GNIS database, which is a

gazetteer. If you had used Topozone to measure on a map, then the map

itself is the source of the coordinates and should be so reflected in the

error calculations by selecting an appropriate map scale.

 

> >I'm very happy to see the extent information in there. I am ruminating over

> >the inclusion of a field in the download data for the extent. I'm

> >interested in your opinion on the subject. It seems like it would actually

> >be easier than writing it out in the remarks, especially if you can copy

> >and paste it among several records. However, I think we'd do well to add a

> >NamedPlace field as well so we know to what the extent refers.

> 

>XXXX and I have been meaning to reply to XXXX's message about extent.  My

>opinion is that extent should be included somewhere (and in the remarks

>field is fine with me) as a record of what was done in the georeferencing

>process.

 

I think the general sentiment is that the complete determination

(coordinates AND error) would be fully documented if we go ahead and add

the value of the extent to the data we capture. By having a base set of

rules along with recording extent, we will know the magnitude of

every contribution to the determination. Without recording extent, we are

left to wonder how the georeferencer arrived at his/her result. Would it be

onerous to include the extent in its own field? I think it will be easier

than adding it to the remarks, both for the georeferencer and for the

compiler of named place extents (me). Part of the reason I ask this is that

I'm thinking even bigger than MaNIS to the ubiquitous problem of

georeferencing, which could benefit by having a database of extents. The

GNIS data allows for features to be described by bounding boxes, which can

be interpreted to find extents. However, for most features the bounding box

reduces to a single point. This is true of all but the largest populated

place features in the GNIS database.  Given the paucity of extent data

available, and given that we (MaNIS georeferencers) will have to determine

extents for every named place we run across, we could assemble these data

and use them to provide added value to existing gazetteers. Furthermore,

these additional data could be used in the future to automate the process

of georeferencing and error calculation. If this is, indeed, a worthy goal,

then it makes sense to capture the information in its own field so that it

need not be parsed from remarks in the future.
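
To illustrate the last point, here is a minimal Python sketch contrasting a dedicated extent field with an extent buried in remarks. The record layout, field names, and sample text are hypothetical and are not the MaNIS download format.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeoreferencedLocality:
    """Hypothetical record layout; the field names are illustrative only."""
    locality: str
    named_place: Optional[str] = None
    extent_m: Optional[float] = None
    remarks: str = ""


def extent_from_remarks(remarks):
    """Try to recover an extent such as 'extent 50 m' from free-text remarks."""
    match = re.search(r"extent\D*(\d+(?:\.\d+)?)\s*m\b", remarks, re.IGNORECASE)
    return float(match.group(1)) if match else None


# With its own field, the extent is available directly.
record = GeoreferencedLocality(
    locality="intersection of two roads (placeholder text)",
    named_place="road intersection",
    extent_m=10.0,
)
print(record.extent_m)  # 10.0

# Parsing remarks only works when the wording happens to fit the pattern.
print(extent_from_remarks("Extent of intersection estimated at 50 m"))  # 50.0
print(extent_from_remarks("used about fifty meters for the extent"))    # None
```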

 

Comments are hereby solicited.

 

> >Overall, the agreement in the coordinates and the errors is astonishing.

> >The mean deviation in coordinates across the whole dataset is only about

> >300 meters and most of this is due to the two vague localities ("Barry

> >State Game Area" and "Yankee Springs Area"). For the most part the errors

> >take care of the differences. You have bolstered my faith in the system.

> 

>Yes - these were large areas that were actually adjacent to one another. I

>found them to be somewhat difficult to georeference.

> 

> >The one locality for which I cannot understand the discrepancy is "Clear

> >Lake Camp, 6 mi. E Delton." You might want to revisit that one to see where

> >the problem occurred.

> 

>I know what happened here - operator difference (or assumption error?) pure

>and simple.  I believe that Robin treated this as an offset, and I

>completely ignored the offset and focused on a "church camp" on the map

>that was on the shore of Clear Lake (the lake was about 6.5 miles east of

>Delton).  Thus, I treated this as a named place (and perhaps my assumption

>was an unwarranted big stretch) and Robin treated it as an offset.  I

>believe that Robin's choice was the better of the two.

> >

> >Nice.

> 

>Thanks again!

> 

>XXXXX

> 

> 

> >

> >John W

> >

> >>Attached are two files containing identical Barry County localities that we

> >>have georeferenced individually as practice with the MaNIS guidelines.  We

> >>would sincerely appreciate your critique of our work before we submit files

> >>for inclusion in the project.

> >>

> >>Thanks for all of your help.

> >>

> >>Sincerely,

> >>

> >>XXXXXX

 

 

>>> Posting number 168, dated 13 Feb 2002 11:55:01

Date:         Wed, 13 Feb 2002 11:55:01 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Should we save extents?

In-Reply-To:  <v0213050ab886050432cf@[207.207.103.162]>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

>For TRS lat/longs, I am using the extents in the Guidelines update.  For

>lookup on the MontanaTRS site I am assuming unknown datum and no error due

>to scale as done in the Georef Guidelines examples for placename only.

>Correct?

 

Correct.

 

>>> Posting number 169, dated 13 Feb 2002 12:21:25

Date:         Wed, 13 Feb 2002 12:21:25 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: MSU Practice  - More Comments

Comments:

In-Reply-To:  <3.0.32.20020213145319.00720da8@pilot.msu.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

More relevant exchanges.

 

> >>We do not and will not be using Excel for georeferencing.  We just used it

> >>this one time to send you the sample records via e-mail.  I am hoping that

> >>the data were not altered.

> >

> >Will you use Access then?

> 

>Yes - we are extremely happy with your Access template!  (Why would anyone

>want to use something else?)

 

Good question! I would have no problem accepting that everyone used it.

 

>On the template, we have found it useful to just "close up" the columns

>that we don't want to look at while georeferencing.  (You probably noticed

>this in the Excel version).

> 

>XXXXXXX (our IT person) will help us send the "real" files using the

>project protocol.

 

In so doing, be sure to preserve all of the precision in the numeric

fields. There are two ways to do this. The first is to bypass protocol and

just send me the Access database mdb file (preferably with a date in the

filename, e.g., msu_barry020213.mdb). The second is to change the data type

of those fields to text after the georeferencing is all done and then

export the data into a tab-delimited text file.
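
The second option can be mimicked outside Access. The Python sketch below is an analogue, not the project protocol: it converts each numeric value to text before writing a tab-delimited file so that no digits are dropped by a display format. The function name and column names are hypothetical.

```python
import csv
import io


def export_tab_delimited(records, fieldnames, stream):
    """Write records as tab-delimited text, converting floats to text first."""
    writer = csv.DictWriter(stream, fieldnames=fieldnames, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    for record in records:
        # repr() gives the shortest string that reads back as the same float,
        # so the exported text carries the full stored precision.
        writer.writerow({
            name: repr(value) if isinstance(value, float) else value
            for name, value in record.items()
        })


# Bakersfield coordinates from the Guidelines example, 35 22' 24" N, 119 1' 4" W.
latitude = 35 + 22 / 60 + 24 / 3600
longitude = -(119 + 1 / 60 + 4 / 3600)
fields = ["LocalityID", "DecimalLatitude", "DecimalLongitude"]
records = [{"LocalityID": 1, "DecimalLatitude": latitude,
            "DecimalLongitude": longitude}]

buffer = io.StringIO()
export_tab_delimited(records, fields, buffer)
print(buffer.getvalue())

# Reading the text back yields exactly the values that were stored.
buffer.seek(0)
row = next(csv.DictReader(buffer, delimiter="\t"))
print(float(row["DecimalLatitude"]) == latitude)  # True
```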

 

> >I'm composing a reply to your previous message, which I'll send out to the

> >list due to common items of interest, and as a way of introducing more

> >information on the issue of extents.

> >

>Okay.  Robin replied to me (from home) about extents.  Here is her "vote".

>FROM XXXXXX:  I'd vote for an actual column regarding extent

>information to assure that it was remembered.  I view the column headings

>as a checklist of things I need to provide and without reference to it, it

>could easily be forgotten with all the other components.

 

This is a valuable, practical point with which I entirely agree.

 

John W

 

>>> Posting number 170, dated 13 Feb 2002 12:33:47

Date:         Wed, 13 Feb 2002 12:33:47 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: MSU PRACTICE RECORDS

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

John, XXXX, XXXX:  Can I get copies of the data files?  I'd like to run

them through the lat/long calculator for comparison.

 

 

 

>>> Posting number 171, dated 14 Feb 2002 18:30:16

Date:         Thu, 14 Feb 2002 18:30:16 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Error Calculator:Coordinate Source & Topozone.com

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

Hi John,

 

Thanks for the helpful information about map scales and choices to make on

the error calculator when using Topozone.com for georeferencing.  I have

some additional questions about this.  The message exchanges (from

Mammal-Z-Net) are copied below.

 

>From John:

>> >Topozone actually has

>> >only two underlying scales, 1:25,000, and 1:100,000. The 1:50,000 and

>> >1:200,000 versions are just zoomed out by a factor of two from their

>> >counterparts. Also, the 1:25,000 Topozone scales actually consist of (were

>> >derived from) a combination of 1:25,000 and 1:24,000 maps, which were

>> >"resized." It doesn't make all that much difference in the error

>> >determination if you use 1:25,000 vs. 1:24,000. However, I hope you are

>> >using the 1:25,000 map scale contribution in the error calculator for the

>> >1:50,000 Topozone maps. Similarly, use the 1:100,000 map scale for the

>> >1:200,000 Topozone maps.

>> 

>From XXXXX:

>>Good to know all of the above.  Actually, we used "gazetteer" from the

>>dropdown on the error calculator for all of the Topozone practice records.

>>We were following the example from the georeferencing guidelines where the

>>coordinate source (Topozone) was considered to be a gazetteer, and thus

>>selected "gazetteer" on the error calculator.  It sounds like we need to

>>redo the MAX ERROR with the map scale incorporated.

 

>From John:

>Actually, there is a subtle distinction to make. In the Georeferencing

>Guidelines document I said that the source for that "Distance Only" example

>was a gazetteer, because the coordinates were for a named place and

>Topozone uses the GNIS data to plot named places; thus, the ultimate source

>of the coordinates for that example is the GNIS database, which is a

>gazetteer. If you had used Topozone to measure on a map, then the map

>itself is the source of the coordinates and should be so reflected in the

>error calculations by selecting an appropriate map scale.

> 

My questions:

 

1.  I understand (from exchange above) that if the locality that we want to

georeference is a named place (such as East Lansing or Beaver Island or

Fine Lake) and we enter this into the Place Name Search in Topozone and

Topozone gives us the coordinates of that place, then the Coordinate Source

that we select on the Error Calculator will be a Gazetteer (because

Topozone got those coordinates from GNIS).  Thus, I believe that we

calculated the error correctly in the practice records that contained

coordinates given by Topozone for named places (such as Fine Lake).  Is

this correct?

 

2.  Are the Topozone maps considered to be USGS or non-USGS maps?  For

example, if we used a Topozone.com map at 1:25,000 scale to measure the

distance from a town, shall we select USGS Map 1:25,000 or non-USGS Map

1:25,000 from the Coordinate Source dropdown on the Error Calculator?

 

Thanks again,

XXXXX

 

 

 

 

>>> Posting number 172, dated 14 Feb 2002 15:39:16

Date:         Thu, 14 Feb 2002 15:39:16 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Error Calculator:Coordinate Source & Topozone.com

In-Reply-To:  <3.0.32.20020214183015.0072e530@pilot.msu.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXXX, and all,

 

You are correct with respect to question 1, below. You got the coordinates

indirectly from GNIS for named places, therefore, the appropriate source is

a gazetteer.  If you use Topozone to find a locality, but do any kind of

measuring on the Topozone maps, then you are indirectly using a USGS map,

and you should select the appropriate scale in the coordinate source

dropdown box in the error calculator application. So, to explicitly answer

question 2, below, use "USGS Map 1:25,000" for Topozone maps at either

1:25,000 or 1:50,000. Use "USGS Map 1:100,000" for Topozone maps at either

1:100,000 or 1:200,000. While we're at it, here's a reminder to always use

NAD27 for Topozone-derived coordinates, whether from the gazetteer or from

the maps.

 

John W
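
The rule above can be written down as a small Python helper. This is a sketch for illustration only: the function and parameter names are hypothetical, and the returned labels follow the wording in this thread, which may not match the calculator's dropdown exactly.

```python
def coordinate_source(topozone_scale=None, from_place_name_search=False):
    """Pick the Error Calculator settings for Topozone-derived coordinates.

    Returns a (coordinate source, datum) pair. `topozone_scale` is the
    denominator of the displayed map scale, e.g. 50000 for 1:50,000.
    """
    datum = "NAD27"  # always NAD27 for Topozone, gazetteer or map
    if from_place_name_search:
        # Coordinates came indirectly from GNIS, which is a gazetteer.
        return "Gazetteer", datum
    if topozone_scale in (25_000, 50_000):
        return "USGS Map 1:25,000", datum
    if topozone_scale in (100_000, 200_000):
        return "USGS Map 1:100,000", datum
    raise ValueError(f"unexpected Topozone scale: {topozone_scale!r}")


print(coordinate_source(from_place_name_search=True))  # ('Gazetteer', 'NAD27')
print(coordinate_source(50_000))   # ('USGS Map 1:25,000', 'NAD27')
print(coordinate_source(200_000))  # ('USGS Map 1:100,000', 'NAD27')
```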

 

>Thanks for the helpful information about map scales and choices to make on

>the error calculator when using Topozone.com for georeferencing.  I have

>some additional questions about this.  The message exchanges (from

>Mammal-Z-Net) are copied below.

> 

> >From John:

> >> >Topozone actually has

> >> >only two underlying scales, 1:25,000, and 1:100,000. The 1:50,000 and

> >> >1:200,000 versions are just zoomed out by a factor of two from their

> >> >counterparts. Also, the 1:25,000 Topozone scales actually consist of (were

> >> >derived from) a combination of 1:25,000 and 1:24,000 maps, which were

> >> >"resized." It doesn't make all that much difference in the error

> >> >determination if you use 1:25,000 vs. 1:24,000. However, I hope you are

> >> >using the 1:25,000 map scale contribution in the error calculator for the

> >> >1:50,000 Topozone maps. Similarly, use the 1:100,000 map scale for the

> >> >1:200,000 Topozone maps.

> >>

> >From XXXXX:

> >>Good to know all of the above.  Actually, we used "gazetteer" from the

> >>dropdown on the error calculator for all of the Topozone practice records.

> >>We were following the example from the georeferencing guidelines where the

> >>coordinate source (Topozone) was considered to be a gazetteer, and thus

> >>selected "gazetteer" on the error calculator.  It sounds like we need to

> >>redo the MAX ERROR with the map scale incorporated.

> 

> >From John:

> >Actually, there is a subtle distinction to make. In the Georeferencing

> >Guidelines document I said that the source for that "Distance Only" example

> >was a gazetteer, because the coordinates were for a named place and

> >Topozone uses the GNIS data to plot named places; thus, the ultimate source

> >of the coordinates for that example is the GNIS database, which is a

> >gazetteer. If you had used Topozone to measure on a map, then the map

> >itself is the source of the coordinates and should be so reflected in the

> >error calculations by selecting an appropriate map scale.

> >

>My questions:

> 

>1.  I understand (from exchange above) that if the locality that we want to

>georeference is a named place (such as East Lansing or Beaver Island or

>Fine Lake) and we enter this into the Place Name Search in Topozone and

>Topozone gives us the coordinates of that place, then the Coordinate Source

>that we select on the Error Calculator will be a Gazetteer (because

>Topozone got those coordinates from GNIS).  Thus, I believe that we

>calculated the error correctly in the practice records that contained

>coordinates given by Topozone for named places (such as Fine Lake).  Is

>this correct?

> 

>2.  Are the Topozone maps considered to be USGS or non-USGS maps?  For

>example, if we used a Topozone.com map at 1:25,000 scale to measure the

>distance from a town, shall we select USGS Map 1:25,000 or non-USGS Map

>1:25,000 from the Coordinate Source dropdown on the Error Calculator?

> 

>Thanks again,

>XXXXX

> 

 

 

>>> Posting number 173, dated 15 Feb 2002 10:55:46

Date:         Fri, 15 Feb 2002 10:55:46 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: Error Calculator:Coordinate Source & Topozone.com

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

Hi John,

 

Thanks for the information.  We'll go ahead and recalculate the Max Error

Values on our "practice" records.

 

One minor question with respect to the word "measuring" in your response

below:  For some localities, such as road intersections for example, we get

the coordinates by placing the cursor on the Topozone map, and then

clicking to get the target coordinates of that particular locality.  We

really aren't "measuring", but the coordinates are still considered to be

derived from Topozone, and so the map scale information gets applied to the

error calculator - correct?

 

Thanks,

XXXXX

 

 

 

At 03:39 PM 02/14/2002 -0800, you wrote:

>XXXXX, and all,

> 

>You are correct with respect to question 1, below. You got the coordinates

>indirectly from GNIS for named places, therefore, the appropriate source is

>a gazetteer.  If you use Topozone to find a locality, but do any kind of

>measuring on the Topozone maps, then you are indirectly using a USGS map,

>and you should select the appropriate scale in the coordinate source

>dropdown box in the error calculator application. So, to explicitly answer

>question 2, below, use "USGS Map 1:25,000" for Topozone maps at either

>1:25,000 or 1:50,000. Use "USGS Map 1:100,000" for Topozone maps at either

>1:100,000 or 1:200,000. While we're at it, here's a reminder to always use

>NAD27 for Topozone-derived coordinates, whether from the gazetteer or from

>the maps.

> 

>John W

> 

 

 

>>> Posting number 174, dated 15 Feb 2002 09:09:40

Date:         Fri, 15 Feb 2002 09:09:40 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Error Calculator:Coordinate Source & Topozone.com

In-Reply-To:  <3.0.32.20020215105545.0072c878@pilot.msu.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXXX,

 

You aptly described exactly what I meant. Thank you.

 

John

 

>One minor question with respect to the word "measuring" in your response

>below:  For some localities, such as road intersections for example, we get

>the coordinates by placing the cursor on the Topozone map, and then

>clicking to get the target coordinates of that particular locality.  We

>really aren't "measuring", but the coordinates are still considered to be

>derived from Topozone, and so the map scale information gets applied to the

>error calculator - correct?

> 

>Thanks,

>XXXXX

> 

 

 

>>> Posting number 175, dated 15 Feb 2002 18:08:31

Date:         Fri, 15 Feb 2002 18:08:31 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: coordinate source?

Comments: cc: fsyu <fsyu@uaf.edu>

In-Reply-To:  <3C6D8138@webmail.uaf.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXXX and all,

 

There is no provision for georeferencing records that already have

coordinates, but this shouldn't necessarily deter you from doing so. If you

go this route, please be sure to note that you have provided these

additional data when you send them in to me. It makes a difference in how I

handle the data on this end.

 

To answer your specific question, you should put "original locality

description" in the DeterminationRef field in the downloaded data file and

use "locality description" as the Coordinate Source choice in the Error

Calculator.

 

John W

 

>Hi John,

> 

>Many Alaska data are already georeferenced, but don't have maximum error.

>I've

>been calculating max. error for them, but determination references are not

>recorded for most of them.  What should I enter in Coordinate source in Error

>Calculator?

> 

>XXXXXX

 

>>> Posting number 176, dated 20 Feb 2002 09:06:36

Date:         Wed, 20 Feb 2002 09:06:36 -1000

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Topo USA Ver. 3.0 by DeLorme

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

For anyone using DeLorme software Topo USA Ver. 3.0 (which I am using to do

Hawaii localities)  you will need this information for the georeferencing

calculator.  I just spoke with the Tech help people and got the information

that all topo maps, at all zoom levels, are based on USGS 1:24,000.  I

quite like this software as it allows me to place markers for all the

localities I've done which greatly speeds up any double checking I might

want to do.  Measuring distances is also easy, either by air or road.

 

XXXXX

 

 

>>> Posting number 177, dated 25 Feb 2002 14:36:49

Date:         Mon, 25 Feb 2002 14:36:49 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      MaNIS Server recommendations

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

Due to popular demand, I'm writing to give an updated recommendation for

the MaNIS server specifications. The requirements haven't changed since the

original specifications were sent out on 2 Oct 2001.  Nevertheless, I'll

reiterate the essentials of the configuration, ordered by importance:

 

1) dual processor Windows 2000 Professional - the Xeon processor is good

for our purposes; faster is better, but anything on the market today is

fast enough.

 

2) 512 MB RAM - more is better, but not at the cost of any of the other

essentials.

 

3) one fast SCSI hard drive - essential; faster is better; capacity is much

less important. 18GB is a good target capacity.

 

4) 10/100 Ethernet adapter - essential; most systems these days have one on

board.

 

5) 3 yr service on parts and labor - essential; we don't want anything to

break without warranty during the period of the grant.

 

6) CD-ROM drive - faster is better; a CD-RW may be a useful alternative, if

it fits your budget.

 

7) 17" Monitor - this machine is supposed to be a server, not a

workstation, so don't spend big money on a fancy display.

 

8) 1.44 MB diskette drive - less essential every day, but most machines

still come with one.

 

I've created a model system on the Dell website to give you an idea for a

recommended configuration. To look at the specifications for the system

you'll need to Retrieve EQuote #E001554835. You'll also need to enter

either the E-Quote name, which is "manis2," or my email address.

 

Let me know if you have any questions.

 

John W.

 

>>> Posting number 178, dated 27 Feb 2002 14:59:52

Date:         Wed, 27 Feb 2002 14:59:52 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Mystified

In-Reply-To:  <3.0.32.20020227173043.007327a8@pilot.msu.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Hi XXXX, XXXX, and all,

 

I have noticed the syndrome you mentioned and I tried to ignore it. That's

harder to do when someone else notices it. It's even worse when two people

notice it - it gets harder to remove the witnesses. I think I know why it

occurs, but I don't have a satisfactory solution yet. I actually made the

interface show 3 decimal places in the Maximum Error field so that this

inconsistency would make less of an impact on the results, which may

currently differ from the expected by up to .001 distance units. So, the

worst case scenario occurs when your distance units are miles, and then the

error (in the error) amounts to about 5.3 feet. This is probably acceptable

and worth trading in your concern for a life. :)  In the meantime, I'll

remain cognizant of the problem and try to work on its resolution.
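
The worst-case figure quoted above follows from simple unit conversion. A minimal worked check, assuming only the standard 5,280 feet per mile (the variable names are illustrative):

```python
FEET_PER_MILE = 5280

# The Maximum Error field shows 3 decimal places, so two runs of the
# calculator can differ by up to 0.001 of whatever distance unit is in use.
max_discrepancy_miles = 0.001

# Miles are the largest unit offered, so this is the worst case.
max_discrepancy_feet = max_discrepancy_miles * FEET_PER_MILE

print(f"{max_discrepancy_feet:.2f} feet")  # 5.28 feet, i.e. about 5.3 feet
```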

 

John

 

At 05:30 PM 2/27/02 -0500, you wrote:

>Hi John,

> 

>XXXX and I are mystified about some of the error values in our Barry

>County records (files sent to you in today's earlier message).

> 

>1.  In the first set of Barry County records (the files that we sent to you

>on 2/12/2002) we incorrectly chose Gazetteer as the error calculator

>coordinate source for Topozone for all records.  For the records that were

>TRS localities, we anticipated getting identical values for maximum error.

>This was not the case.  When XXXX used the error calculator on her

>computer, she got .716 as the error.  When I used the error calculator on

>my computer for these types of records, I got .715 as the error.

> 

>2.  In the second set of Barry County records (the files that we sent to

>you today 2/27/2002 where maximum error was recalculated with the

>appropriate Topozone map scale), our computers continue to give different

>error calculator values for some of the TRS localities that used an error

>calculator map scale of 1:25,000 (See Sec. 23, T1N, R7W,

>Sec. 24, T1N, R7W and

>T01N R07W Section 4)

> 

>3.  We were surprised at the above examples.  We then entered each other's

>coordinates using identical dropdown choices on the error calculator on our

>respective computers.  XXXX's computer still consistently returned an

>error of .723 for all of the TRS localities that had the 1:25,000 scale.

>However, XXXX's computer returned an error of .723 on some localities and

>.724 on others with the 1:25,000 scale.  Do we need to be concerned about

>this? (or shall we get a life?)

> 

>Thanks,

>XXXXX

 

 

>>> Posting number 179, dated 27 Feb 2002 16:24:51

Date:         Wed, 27 Feb 2002 16:24:51 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Sample of georeferencing from Baton Rouge

Comments: To:

In-Reply-To:  <OF09532E16.D5566143-ON86256B6D.00611AF2@lsu.edu>

Mime-Version: 1.0

Content-Type: multipart/mixed; boundary="=====================_-1683450515==_"

 

--=====================_-1683450515==_

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX, and all,

 

Very nicely done.  I can see that you've gone to a lot of trouble to

document the determination methods in the Remarks. There should be no

trouble for someone to figure out later what you did.  Some of the

techniques you used (and documented) will surely be useful to others, so

I'm attaching your file with this message to the mammal-z-net list.

 

I'm trying to decide if/how to make everyone's job a little easier, perhaps

by including a field for named place along with one for the extent. That

way we'll know unequivocally to what the extent refers. I've just started

having my georeferencers do this, and it seems to be better (faster anyway)

than trying to write that information out in plain English in the remarks.

I'm interested in feedback from you and anyone else with an opinion about

whether this change would have a positive effect on your georeferencing.

I'm hoping to set a policy on this subject once there has been ample time

for cogitation on it. In the meantime, I recommend that georeferencers add

two columns to their data, one for NamedPlace, followed by one for Extent,

and put these right before MaximumErrorDistance. Do not include an

ExtentUnits field; instead, use the same units as for the

MaximumErrorDistance and the MaxErrorUnits will refer to both measures.
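
For anyone handling the tab-delimited export with a script rather than by hand, the column change recommended above might look like the following sketch. It assumes the header name "MaxErrorDistance" as it appears in the attached batonrouge.txt file; the function name and the shortened sample header are made up for the example.

```python
def add_extent_columns(header):
    """Return a copy of the header with NamedPlace and Extent inserted
    immediately before MaxErrorDistance, in that order."""
    i = header.index("MaxErrorDistance")
    return header[:i] + ["NamedPlace", "Extent"] + header[i:]


# Shortened header for illustration; the real file has many more columns.
header = ["DecLat", "DecLong", "MaxErrorDistance", "MaxErrorUnits",
          "LatLongRemarks"]
new_header = add_extent_columns(header)
print(new_header)
```

No ExtentUnits column is added; the Extent value is understood to be in the units given by MaxErrorUnits.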

 

John W

 

>Hi John,

> 

>Here at LSU, we've downloaded all the Louisiana records from the MANIS

>database, and have begun georeferencing, starting with records from Baton

>Rouge (our home turf). We've learned a lot as we've worked through our

>first batch of records, especially from much of the recent email exchanges

>with other institutions, and we really appreciate the ease of use of the

>Error Calculator. We were wondering if you could look over a small (<20

>records) sample of some of the different types of localities we have

>georeferenced, just to see if we are on the right track. Our longest field

>is the LatLongRemarks, where we describe how we located the point and the

>extent that we estimated to calculate error with. We just wanted to make

>sure that you would be able to follow what we did if there are any

>questions with our georeferencing. Should we place the extents in a

>separate field, and if so, should we place it in any particular order with

>respect to the other fields? Let us know if you see any problems.

> 

>Many thanks,

> 

>XXXXXXX

 

>**********************************************************

 

--=====================_-1683450515==_

Content-Type: text/plain; charset="us-ascii"

Content-Disposition: attachment; filename="batonrouge.txt"

 

"LocalityID"    "CollectionCode"        "HigherGeog"    "SpecLocality"  "ElevationText" "MinElev"

"MaxElev"       "ElevUnits"     "LatText"       "LongText"      "TRS"   "Township"      "TownshipDir"

"Range" "RangeDir"      "TRSSection"    "TRSPart"       "DetByAgentID"  "DeterminedByPerson"

"DeterminedDate"        "DeterminationRef"      "OrigCoordSystem"       "Datum" "DecLat"

"DecLong"       "LatDeg"        "LatMin"        "LatSec"        "LatDir"        "LongDeg"

"LongMin"       "LongSec"       "LongDir"       "UTMZone"       "UTMEW" "UTMNS" "MaxErrorDistance"

"MaxErrorUnits" "LatLongRemarks"        "CaptiveFlag"   "NoGeorefBecause"       "LocalityAnnotation"

13056   "CAS"   "North America, USA, Louisiana" "Briar patch near LSU campus, East Baton Rouge"

"Dinakar Nethi" "1-22-02"       "Topozone - gazetteer"  "decimal degrees"       "NAD27" "30.4141"

"-91.1759"

"1.009" "mi"    "center point of LSU Campus obtained from topozone, estimated furthest extent of ""near

LSU campus"" from center as 1 mi"       "0"

28636   "FMNH"  "USA, Louisiana, Baton Rouge Par"       "Baton Rouge"

"Satya Maliakal"        "1-23-02"       "Topozone - gazetteer"  "decimal degrees"       "NAD27"

"30.4451"       "-91.1867"

"13.009"        "mi"    "used EBR Parish courthouse as center, furthest extent of BR city limits from

courthouse estimated at 13 mi"    "0"

47616   "KU"    "U S A, LOUISIANA, EAST BATON ROUGE PARISH"     "BATON ROUGE, 5 MI S OF"

"m"                                                                                     "Satya Maliakal"

"1-23-02"       "Topozone -1:100,000"   "decimal degrees"       "NAD27" "30.3725"       "-91.1867"

"15.903"        "mi"    "located point 5mi S of EBR Parish courthouse, furthest extent of BR city limits

from courthouse estimated at 13 mi"    "0"

71051   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "0.25 mi E jct. Highland and Lee (on

Highland), Baton Rouge"            "0"     "0"

"Satya Maliakal"        "1-28-02"       "Topozone -1:25,000"    "decimal degrees"       "NAD27"

"30.3911"       "-90.1562"

"38.412"        "m"     "located point 0.25 mi E of intersection of Highland and Lee on Highland,

estimated extent of intersection as 10 m"     "0"

71121   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "1 km S Baton Rouge, intersection Ben

Hur Rd. and Nicholson Rd., E tracks along fence line, 5 m"                "0"     "0"

"Satya Maliakal"        "1-28-02"       "Topozone -1:25,000"    "decimal degrees"       "NAD27"

"30.3841"       "-91.1687"

"43.413"        "m"     "located point at intersection of nicholson drive RR tracks and ben hur road,

assuming that 1 km S of BR refers to this intersection, estimated extent of intersection as 10 m with 5

m offset" "0"

71074   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "0.33 mi S of Baton Rouge City Limits on

Highland Rd"           "0"     "0"

"Satya Maliakal"        "1-28-02"       "Topozone -1:25,000"    "decimal degrees"       "NAD27"

"30.3687"       "-91.1227"

"38.414"        "m"     "point located .33 mi S of intersection of Highland Rd. and southern Baton Rouge

Corp. Limit on Highland Road, estimated extent of intersection as 10 m"        "0"

71248   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "10 mi S Baton Rouge on River Rd"

"16"    "16"    "meters"

"Satya Maliakal"        "1-28-02"       "Topozone -1:100,000"   "decimal degrees"       "NAD27"

"30.3533"       "-91.1808"

"14.041"        "mi"    "located point 10 mi S of EBR courthouse following River Road, furthest extent

of Baton Rouge city limits from courthouse estimated at 13 mi"   "0"

71268   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "11465 Robin Hood, Baton Rouge"

"0"     "0"

"Satya Maliakal"        "1-29-02"       "Topozone -1:25,000"    "decimal degrees"       "NAD27"

"30.4555"       "-91.0561"

"37.408"        "m"     "located 11465 Robin Hood with yahoo maps, then located this point with

topozone, estimated extent of property at 10 m" "0"

71243   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "10 mi N Baton Rouge, US 61"

"0"     "0"

"Satya Maliakal"        "1-29-02"       "Topozone -1:100,000"   "decimal degrees"       "NAD27"

"30.5503"       "-91.1969"

"14.041"        "mi"    "located point 10 mi N of BR along US 61 (starting from EBR Parish courthouse

latitude), furthest extent of Baton Rouge city limits estimated at 13 mi" "0"

71511   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "3.4 mi E, 1 mi N Baton Rouge on LA 37"

"0"     "0"

"Satya Maliakal"        "2-13-02"       "Topozone -1:25,000"    "decimal degrees"       "NAD27"

"30.4655"       "-91.1329"

"19.819"        "mi"    "located closest point 3.4 mi E and 1 mi N of EBR courthouse on LA 37, furthest

extent of BR city limits from courthouse estimated at 13 mi"    "0"

71294   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "2 mi N Baton Rouge on Miss. River"

"0"     "0"

"Satya Maliakal"        "2-08-02"       "Topozone -1:25,000"    "decimal degrees"       "NAD27"

"30.4733"       "-91.1927"

"14.017"        "mi"    "located point 2 mi N of EBR Parish courthouse following Mississippi River,

furthest extent of BR city limits from courthouse estimated at 13 mi"       "0"

71801   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "Baton Rouge on River Road"

"16"    "16"    "meters"

"Satya Maliakal"        "2-20-02"       "Topozone -1:100,000"   "decimal degrees"       "NAD27"

"30.3749"       "-91.2249"

"5.041" "mi"    "located point at center of River Rd. in Baton Rouge, estimated furthest exent of River

Rd. in BR from center at 5 mi"  "0"

71802   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "Baton Rouge Quad. 15' Sec 51, T7S, R2E"

"45"    "45"    "feet"

"Satya Maliakal"        "2-21-02"       "Topozone -1:25,000"    "decimal degrees"       "NAD27"

"30.4277"       "-91.0072"

"4.260" "mi"    "located point at center of T7S, R2E (unable to locate Quad. 15' Sec. 51)"      "0"

71897   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "Baton Rouge, Tulane Ave"

"0"     "0"

"Dinakar Nethi" "02-25-02"      "Topozone -1:25,000"    "decimal degrees"       "NAD27" "30.4019"

"-91.1652"

"0.527" "km"    "point located at approximate center of Tulane Ave., furthest extent of Tulane avenue

from center point estimated as .5 km"     "0"

71821   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "Baton Rouge, 2100 Stanford"

"0"     "0"

"Dinakar Nethi" "02-08-02"      "Topozone - 1:25,000"   "decimal degrees"       "NAD27" "30.4187"

"-91.1536"

"37.410"        "m"     "located 2100 Stanford with yahoo maps and then located this point on topozone,

extent of property estimated at 10 m"   "0"

 

 

--=====================_-1683450515==_

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

 

--=====================_-1683450515==_--

 

>>> Posting number 180, dated 7 Mar 2002 14:15:38

Date:         Thu, 7 Mar 2002 14:15:38 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: MaNIS

In-Reply-To:  <a05100301b8ad6761b1be@[141.211.110.228]>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX and all,

 

>Hello John,

>My apologies, when I am georeferencing I use the "hide" command under

>"column" in the "format" menu of excel to close down columns that I seldom

>or never use. In this way, I can see the decimal latitude and longitude

>columns, for example, directly next to the locality column on my computer

>screen.  I inadvertently forgot to "unhide" a few columns when I sent the

>excel files back to you.

 

I should have looked for that.

 

>A question for you: I have some localities where the data is obviously in

>error but cannot be corrected by me. Do you prefer that I reference the

>county center with a note in the locality annotation column, or not

>georeference the locality with a note in the NoGeorefBecause column?

 

There are two different classes of locality errors that you need to worry

about, those with internal inconsistencies that make the locality

impossible to determine (e.g., Hogback Creek, Inyo County - there are two

of these), and those that have an obvious error that can be corrected

unambiguously (e.g., Needles, Mojave Co., California - Mojave Co. is in

Arizona and Needles is in San Bernardino Co, California).

 

If there is an internal inconsistency in the locality information that

makes the locality impossible to determine unambiguously, do not provide

coordinates and error, but do put something like "internal inconsistency"

in the NoGeorefBecause field and explain the problem in the

LocalityAnnotation field (e.g., "there are two Hogback Creeks in Inyo

Co."). When the source institution gets the georeferenced data back,

they'll be able to see what the problem was for each locality that was not

georeferenced.

 

If there is an obvious error that doesn't make the georeferencing

ambiguous, go ahead and georeference the locality, but put your assumptions

in the LatLongRemarks field and definitely point out the error in the

LocalityAnnotation field. The source institution will be able to see what

your assumptions were and they'll be able to fix the errors you uncovered.

 

In summary, LatLongRemarks should be filled with information about how you

georeferenced, LocalityAnnotation should be filled with information about

errors or ambiguities - intended for the source institution, and

NoGeorefBecause should be a brief phrase describing your reason for not

georeferencing a locality (e.g., "internal inconsistency", "too vague", "no

specific locality").
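
As an illustration of the summary above, here is how the three fields might be filled for the two example localities. The helper function and its parameter names are hypothetical, and the wording of the LatLongRemarks entry for Needles is invented for the example. The field names and the other phrases come from the message.

```python
def fill_remark_fields(how_georeferenced="", problem_for_source="",
                       reason_not_georeferenced=""):
    """Map each kind of note to the field it belongs in."""
    return {
        "LatLongRemarks": how_georeferenced,          # method and assumptions
        "LocalityAnnotation": problem_for_source,     # errors or ambiguities
        "NoGeorefBecause": reason_not_georeferenced,  # brief phrase, or empty
    }


# Ambiguous locality: no coordinates, reason given, problem explained.
hogback = fill_remark_fields(
    problem_for_source="there are two Hogback Creeks in Inyo Co.",
    reason_not_georeferenced="internal inconsistency",
)

# Obvious, correctable error: georeference it, state the assumption,
# and point out the error for the source institution.
needles = fill_remark_fields(
    how_georeferenced="assumed Needles, San Bernardino Co., California",
    problem_for_source="Needles is in San Bernardino Co., California, "
                       "not Mojave Co.",
)
```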

 

John W

 

 

 

>>> Posting number 181, dated 9 Mar 2002 11:19:22

Date:         Sat, 9 Mar 2002 11:19:22 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Some other useful Excel operations for MaNIS work

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

In addition to Hide columns, some other Excel operations I have found

useful are:

 

1. AutoFilter (similar to Access):

First select a column or columns, then choose

 

Data Menu>AutoFilter>select (Custom...) from the scrollable pick list>pick

contains and enter data of interest.

 

Using custom contains filtering, you can pull out all records for a county

from the backward HighGeo field or get all occurrences of a placename in

SpecLoc.   Records can be worked with as desired.

 

Show All just under AutoFilter on the Data menu brings all records back.

 

2. Protect Worksheet: This will prevent inadvertent changes to MaNIS

records handed down from the mount but cells, columns, or rows can be left

open for data entry if you first select them, then under Format

cells>Protection tab>click unlocked.  Once a worksheet is locked you can

enter data manually or automatically (e.g., DecLat, DecLong, error) but still

lock out changes to the locality fields.   Protecting disables the Sort

capability.

 

3.  LookUp:

Works great for dynamic lookup (as you type) and automatic assignment of

data like a placename lat/long from another list like the GNIS download.

With about 5000 of these links in the Oregon records, my machine (196 MB

RAM) starts to bog down.   To get rid of the links but retain the data, do

a Copy, Paste Special, click Value.

 

I've been using LookUp in four columns after LocAnnotation. I enter a

placename (winnowed by user) that is then looked up and values for GNIS

placename, type of locality, county,  and DecLat, DecLong are returned.

Placename, type and county are for user verification and lat & long are for

computing lat/longs based on  offsets.

 

4. Concatenation:  For a text field this is done with "&", eg, columns A,

B, C  can be appended to D with

 "=D:D&", "&A:A&", "&B:B&", "&C:C" .  Enter this in the first field, then

fill down as needed.  Used to add misc. notes to memo fields of MaNIS.

 

You can flip the HighGeo to have county first for sorting by doing a Text

to columns (Data menu), then concatenating the columns with the county

column first.  Of course leave the original HighGeo unaltered.
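
For those working outside Excel, rough Python equivalents of the "contains" filter, the "&" concatenation, and the county-first flip are sketched below. These are illustrations only: the function names are made up, records are assumed to be dictionaries keyed by the column names used in the download (e.g. "HigherGeog", "SpecLocality"), and the flip assumes the county is the last comma-separated part of the higher geography, which will not hold for every record.

```python
def contains_filter(rows, field, text):
    """Keep the rows whose field contains text, ignoring case."""
    text = text.lower()
    return [row for row in rows if text in row.get(field, "").lower()]


def append_notes(existing, *notes):
    """Join an existing memo value and extra notes with ', ' separators,
    like =D&", "&A&", "&B&", "&C in a worksheet."""
    return ", ".join([existing, *notes])


def county_first(higher_geog):
    """Build a sort key with the last comma-separated part moved to the
    front. The original value is left untouched."""
    parts = [p.strip() for p in higher_geog.split(",")]
    return ", ".join(parts[-1:] + parts[:-1])


rows = [
    {"HigherGeog": "USA, Louisiana, East Baton Rouge Parish",
     "SpecLocality": "Baton Rouge, Tulane Ave"},
    {"HigherGeog": "USA, Louisiana, Baton Rouge Par",
     "SpecLocality": "Baton Rouge"},
]
tulane = contains_filter(rows, "SpecLocality", "tulane")
sort_key = county_first(rows[0]["HigherGeog"])
```

Here `tulane` holds only the first record, and `sort_key` is "East Baton Rouge Parish, USA, Louisiana".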

 

When you get tired of these, there is the underlying Visual Basic macro

editor which is fun if you like that sort of thing.

 

I'll probably stick with Excel through the project due to our "Mac-enabled"

status in the museum.  I use Windows at home, and will in the museum as soon as

our server arrives.

 

 

 

>>> Posting number 182, dated 11 Mar 2002 12:02:55

Date:         Mon, 11 Mar 2002 12:02:55 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Sending Data from MSU

In-Reply-To:  <3.0.32.20020311144354.006e023c@pilot.msu.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX, and all,

 

>XXXX and I have data from a few Michigan counties to send to you.  So far,

>we have Access.mdb files for Barry, Branch, and Muskegon ready to go, and

>Kent, Ionia, and Montcalm are forthcoming.  We have two questions for you:

> 

>1.  We minimize the width (what I call "closing up") of many columns on the

>template (basically ones that we don't fill in with data, or don't want to

>look at).  Do you want us to open these columns back up before we send the

>file to you?

 

Nope. They're fine all closed up.

 

>2.  Do you have a preference for how often we send files to you?  (Aren't

>you getting bombarded with georeferencing data??)

 

Yes, the deluge has begun. Well, it's best to have the work backed up, so

it seems that you should send them as you finish them. Keep a copy on your

end too, for the sake of safety - you never know when we'll get hit by "the

Big One."  To minimize the threat of loss, it's probably best to upload

them as described in the Georeferencing Steps document (i.e., ftp to

galaxy.cs.berkeley.edu/incoming/mvz). Then send me messages as they arrive

safely. Of course, if you are sending Excel (.xls) or Access (.mdb) files,

you don't need to export as tab-delimited text and you should change the

file type to binary when ftp-ing.

 

>Thanks,

>XXXX

> 

>P.S. Thanks for "secretly" adding the NamedPlace and Extent fields to the

>template.  (We moved them over next to the MaxError column in our tables).

 

OK, the secret is out. For those of you who may not be aware of it, there

is an Access Database template for georeferencing that can be accessed

through a link in Step Five on the GeorefSteps document at the following URL:

 

http://dlp.cs.berkeley.edu/manis/GeorefSteps.html

 

 

 

>>> Posting number 183, dated 11 Mar 2002 13:49:55

Date:         Mon, 11 Mar 2002 13:49:55 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      MaNIS Servers

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

I've been asked a couple of times about making hardware substitutions in

the Equipment portion of MaNIS subcontract budgets. The bottom line is that

each institution must have, when the time comes to connect to the network,

a DEDICATED machine with the specifications highlighted in my 25 Feb

message "MaNIS Server recommendations." Dedicated means that the sole

purpose of the machine is to support data provision to the network. Beyond

that, I'm not picky.

 

John W.

 

>>> Posting number 184, dated 12 Mar 2002 14:45:06

 

>>> Posting number 185, dated 19 Mar 2002 10:46:55

Date:         Tue, 19 Mar 2002 10:46:55 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Fwd: fraction format in the error calculator

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear XXXX, and all,

 

I'm glad you uncovered this bug. The error calculator is actually not as

smart as you expected it to be. The discrepancy you're experiencing arises

because the calculator interprets 1/2 as 1, ignoring everything after the

/. Therefore, please use only decimals or whole numbers in the Offset

Distance and Extent of Named Place fields.

 

John

 

 

> 

>Hi John

> 

>I notice that maximum error is noticeably affected by the format of the

>extent entered on the error calculator if the extent contains a

>fraction.  Since the extent field accepts both decimal and common

>fractions, I experimented with 0.5 and 1/2 for the locality of 3/8 mi. N

>of Casnovia, Kent County, MI.  I approached the situation "by road," used

>decimal degrees on Topozone, and obtained the coordinates of 43.2401 and

>-85.7901.  Datum is NAD27; coordinate precision, 0.0001; coordinate

>source, USGS map 1:25,000.  Distance precision of 1/8 was selected from

>the drop-down.  When the extent of the bounding box is expressed as 0.5 (a

>logical choice for TopoZone users), the maximum error is 0.641; but when

>it is expressed as 1/2 (in keeping with the format of distance precision),

>maximum error is 1.141.

> 

>Depending on the extent, one format may be easier to use than the

>other.  However, if both formats are allowed by the calculator but only

>one yields the desired maximum error, shouldn't the field be restricted to

>that format?  [Actually now I believe the extent is slightly less than 0.5

>miles, but remain curious about the discrepancy.]  Again, your assistance

>will be greatly appreciated.

> 

>XXXXX
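
For illustration only, here is a minimal Python sketch of how a parser that stops reading at the "/" produces the behavior described in the reply above, next to a parser that accepts common fractions. The calculator's actual code is not shown in this thread; the function names parse_leading_number and parse_distance are hypothetical.

```python
from fractions import Fraction


def parse_leading_number(text: str) -> float:
    """Mimic the reported behavior: read up to the first '/' and ignore the rest."""
    return float(text.split("/")[0])


def parse_distance(text: str) -> float:
    """Parse a decimal ("0.5"), a whole number ("3"), or a common fraction ("1/2")."""
    return float(Fraction(text.strip()))


print(parse_leading_number("1/2"))  # 1.0
print(parse_distance("1/2"))        # 0.5
print(parse_distance("0.5"))        # 0.5
```

The two maximum errors reported in the message, 0.641 and 1.141, differ by 0.5, which is consistent with the extent having been read as 1 rather than 0.5.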

 

 

>>> Posting number 186, dated 21 Mar 2002 15:51:25

Date:         Thu, 21 Mar 2002 15:51:25 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: georeferencing rivers

In-Reply-To:  <Pine.OSF.4.33.0203211410400.8199-100000@aurora.uaf.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX and all,

 

These are good questions. I'll put the answers right below each one.

 

>1. When I georeference rivers, should I take coordinates of the source or

>the drainage of the river? How much should extent of the river be?

 

The coordinates should be at the geographic center of the river, on the

river itself. The extent should be the distance to the furthest reach of

the river in either direction.

 

>2. An example: specific locality is "Brooks Range, Anaktiktoot", where

>Anaktiktoot is not on the map. Should I georeference for Brooks Range

>(which will be more than 600 miles in length)? There are many cases that

>higher geography is followed by unknown specific locality.

 

You should go ahead and put coordinates on the vague localities, even

though the maximum_error_distance will be large. Some of the higher

geographies that have no value or "no specific locality" in the locality

field can still be specific, such as islands.

 

>3. Related to my question 2: how much is too big to georeference? In many

>cases, only the name of the island, mountains, peninsula etc. are

>provided.

 

Do them all. The maximum_error_distance will be useful even if it is large.

 

John

 

>>> Posting number 187, dated 30 Mar 2002 09:00:41

Date:         Sat, 30 Mar 2002 09:00:41 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      UAM declat/longs truncated in MaNIS?

Mime-Version: 1.0

Content-Type: text/plain; format=flowed

 

John:  It looks like the UAM records in the gazetteer have the same problem

that KU's records had -- declat/longs only go to two decimals.  KU (XXX

XXXX) asked me to recompute KU's Oregon so I am overwriting  calculated

declat/longs.  Please advise on UAM records - there are several hundred.

 

Examples:

LocalityID  CollectionCode  Datum         DecLat   DecLong    LatDeg  LatMin  LatSec  LatDir  LongDeg  LongMin  LongSec  LongDir
186407      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       17       W
186662      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       10       W
186663      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       1        W
186721      UAM             not recorded  45.1600  -123.7300  45      10      1       N       123      44       6        W
186731      UAM             not recorded  45.2100  -123.6400  45      13      1       N       123      38       42       W
186514      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       32       W
186515      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       21       W
186516      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       2        W
186556      UAM             not recorded  44.2800  -123.7600  44      17      2       N       123      46       2        W
186557      UAM             not recorded  44.2800  -123.7500  44      17      2       N       123      45       2        W
186689      UAM             not recorded  45.3300  -123.7800  45      20      2       N       123      47       2        W
186690      UAM             not recorded  45.3300  -123.6400  45      20      2       N       123      38       49       W
186691      UAM             not recorded  45.3300  -123.6300  45      20      2       N       123      38       2        W

 

 

 

>>> Posting number 188, dated 1 Apr 2002 14:19:03

Date:         Mon, 1 Apr 2002 14:19:03 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: MaNIS questions

In-Reply-To:  <5.1.0.14.0.20020327144722.01df95c0@mail.fmnh.org>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear XXXX, and all,

 

I know that Barbara made a preliminary answer to the questions raised here.

I'll try to add a few points of explanation from which everyone on the list

might benefit.

 

I agree with Barbara's statement of the georeferencing priorities within

the MaNIS context. To summarize them, the MaNIS grant covers (only)

complete georeferencing for localities that have no lat_longs. Our hope is

that, through innovation and properly-guided cooperation, we will be able

to follow through on our promise to finish this. In fact, we hope that we

will be able to refine the process and the tools enough to actually get

ahead of the game. If we do get ahead, we will be able to turn our

attention next to those localities for which lat_longs exist without

supporting metadata.

 

I know we all have the desire to have consistent data quality, especially

when faced with making those data public. Within the context of our

project, however, cleaning up locality descriptions is neither covered, nor

is it recommended. Every change made to locality descriptions on your end

since the data were collected for the MaNIS gazetteer has the potential to

confound the process of properly reconnecting the georeferenced localities

with specimens in your database.

I have not yet explained the reconnecting part of the process, thinking

that what I've presented thus far is enough to swallow for the time being.

Perhaps a brief synopsis now would be of use to illustrate the potential

complications and to get people to think about the future of locality data

in institutional databases.

 

In the MaNIS gazetteer I have rendered unique occurrences of localities by

institution. These you can query on and see as results in the online MaNIS

gazetteer. Behind the scenes there is another table to cross-reference

unique localities to specimens. The specimens are linked to the localities

(and hence to the coordinates and metadata that georeferencing provide)

based on the locality string. Thus, if you change the locality string in

your database, it will not match the locality string for the same specimen

in the gazetteer. This is the crux of the issue, so it is important to

understand when it matters, and when it doesn't.

If the locality string in your database doesn't match the locality string

in the MaNIS gazetteer, but the locality really is exactly the same place

and would get the same coordinates when georeferenced, then the change

doesn't matter - the specimen will get the correct coordinates anyway.

However, if the change in your database effectively changes the place that

is described (resulting in different coordinates when georeferenced) then

the change DOES matter - it is what I have elsewhere called "substantive."

If a substantive change is made in your database and I apply the

georeferenced coordinates to the specimens that once referred to that

locality, the georeferenced data will be wrong. Therefore, there needs to

be a verification process when re-associating georeferenced localities with

individual databases. There are two steps to this process. The first is to

determine if the locality string in your database is the same as that in

the gazetteer. For all of those localities for which the locality strings

match, the georeferenced data can go into your database automatically, no

fuss, no questions asked. For the rest of the georeferenced localities from

the gazetteer, a comparison will have to be made between the then-current

locality and the georeferenced locality to determine if they still refer to

the same place. Imagine putting a check mark by each pair that still match.

The amount of checking to be done in this step is directly determined by

the number of changes you make to your locality strings between the time

when I collected the data for the gazetteer and the time when the data go

back into your database. Clearly, fewer changes mean less checking.
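
The two-step verification described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the dictionaries keyed by specimen ID, the sample IDs "A1" and "A2", and the function name split_for_verification are hypothetical, and the real gazetteer cross-reference table may be organized differently.

```python
def split_for_verification(gazetteer, current):
    """Separate localities that can be loaded automatically from those needing review.

    gazetteer: dict mapping a specimen ID to the locality string captured for the gazetteer
    current:   dict mapping the same specimen IDs to the locality string now in the database
    """
    automatic, needs_review = [], []
    for specimen_id, old_string in gazetteer.items():
        new_string = current.get(specimen_id)
        if new_string == old_string:
            # Step 1: identical strings, so the georeference can go back in as-is.
            automatic.append(specimen_id)
        else:
            # Step 2: a person must decide whether both strings still describe the same place.
            needs_review.append((specimen_id, old_string, new_string))
    return automatic, needs_review


# Hypothetical specimen IDs; the locality strings are borrowed from this list's discussions.
gazetteer = {"A1": "3/8 mi. N of Casnovia", "A2": "Brooks Range, Anaktiktoot"}
current = {"A1": "3/8 mi. N of Casnovia", "A2": "Brooks Range"}
automatic, needs_review = split_for_verification(gazetteer, current)
print(automatic)
print(needs_review)
```

Only the pairs in needs_review require a human check, so the length of that list grows with the number of locality strings edited after the gazetteer snapshot was taken.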

 

OK? Take a breath. Now, a topic for rumination as the project progresses.

Start thinking about incorporating the georeferenced coordinates and

metadata into your individual databases. Not one of the participating

institutions currently has the structure in its database to capture all of

the metadata we are gathering. It would be nice if we all could. We don't

want to throw away all of this hard work after all.

 

John W

 

 

 

>>> Posting number 189, dated 1 Apr 2002 15:37:38

Date:         Mon, 1 Apr 2002 15:37:38 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: UAM declat/longs truncated in MaNIS?

In-Reply-To:  <F569gG8WPbLgJyAUypU000104d9@hotmail.com>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX and XXXXX,

 

The problem is not exactly the same. UAM has both decimal lat_long and

degrees minutes seconds in its database. The decimal lat_longs often have

only two decimal places when there are fully specified degrees minutes

seconds, but this shouldn't affect what you're doing unless you want to

copy and paste lat_longs that UAM had already done to localities for other

institutions. If that's the case, recompute the decimal lat_longs for UAM

using the degrees minutes seconds values where the OrigCoordSystem is "deg.

min. sec."

 

XXXXX, you may want to put XXXX on recomputing decimal lat_longs for the

conditions described above.

 

General Reminder: Lat_Long recomputations should not be on MaNIS time

until/unless we finish the georeferencing of localities without lat_longs.
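
The recomputation described above is the standard degrees-minutes-seconds conversion. A minimal Python sketch follows, assuming a single-letter hemisphere code as in the LatDir and LongDir columns; the function name dms_to_decimal is hypothetical and not part of any MaNIS tool.

```python
def dms_to_decimal(degrees: int, minutes: int, seconds: float, direction: str) -> float:
    """Convert degrees, minutes, seconds plus a hemisphere letter to decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are negative by convention.
    if direction.upper() in ("S", "W"):
        value = -value
    return round(value, 7)


# LocalityID 186407 from the reported examples: 45 16 1 N, 123 53 17 W
lat = dms_to_decimal(45, 16, 1, "N")
lon = dms_to_decimal(123, 53, 17, "W")
print(f"{lat:.4f}, {lon:.4f}")  # 45.2669, -123.8881
```

For comparison, the stored decimal values for that record are 45.2600 and -123.8800, so the third and fourth decimal places are the ones being lost.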

 

>John:  It looks like the UAM records in the gazetteer have the same problem

>that KU's records had -- declat/longs only go to two decimals.  KU (XXX

>XXXX) asked me to recompute KU's Oregon so I am overwriting  calculated

>declat/longs.  Please advise on UAM records - there are several hundred.

> 

>Examples:

>LocalityID  CollectionCode  Datum         DecLat   DecLong    LatDeg  LatMin  LatSec  LatDir  LongDeg  LongMin  LongSec  LongDir
>186407      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       17       W
>186662      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       10       W
>186663      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       1        W
>186721      UAM             not recorded  45.1600  -123.7300  45      10      1       N       123      44       6        W
>186731      UAM             not recorded  45.2100  -123.6400  45      13      1       N       123      38       42       W
>186514      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       32       W
>186515      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       21       W
>186516      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       2        W
>186556      UAM             not recorded  44.2800  -123.7600  44      17      2       N       123      46       2        W
>186557      UAM             not recorded  44.2800  -123.7500  44      17      2       N       123      45       2        W
>186689      UAM             not recorded  45.3300  -123.7800  45      20      2       N       123      47       2        W
>186690      UAM             not recorded  45.3300  -123.6400  45      20      2       N       123      38       49       W
>186691      UAM             not recorded  45.3300  -123.6300  45      20      2       N       123      38       2        W

> 

 

 

>>> Posting number 190, dated 1 Apr 2002 16:47:19

Date:         Mon, 1 Apr 2002 16:47:19 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: MaNIS questions

In-Reply-To:  <5.0.0.25.2.20020401125307.024018f0@socrates.berkeley.edu>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

Fellow MANES:

John's message closed with this statement:

"Not one of the participating

institutions currently has the structure in its database to capture all of

the metadata we are gathering. It would be nice if we all could. We don't

want to throw away all of this hard work after all."

 

My response:  It has been a surprise to find ourselves dealing with the

topic of error estimates, etc in lat/long data, since that was not part of

the original scope of the project.  And indeed (in light of the above

quote) we do not have a capacity to absorb such information into our

present databases, let alone deciding how much time we have to care about

this.  Seeing the impact of the request for so much attention to error

estimates, I find it hard to support so much allocation of additional time

to this effort.

 

I have witnessed, over the years, many publications based on massive

datasets in which the authors were not able to document (or even care)

about variance in the quality and accuracy of the data.  Typically, they

just put on their blinders and accepted all the "AVAILABLE" data.  This is

just an inherent problem for those who move up the scale (allometric

analyses, macroecology, or whatever), and at such LARGE scales of analyses

they usually say that small local errors become insignificant, because of

the LARGE SCALE of the overall analysis.

 

I hope we can strike a balance here and get the big data entry and

conversion project done.  I don't want to see the project slowed down by

such a big commitment to accounting for aspects of the data (and the

corresponding time commitment) that were not built in to our original

estimates of what it would take to carry out the project.

 

Is this a helpful comment?

 

 

>>> Posting number 191, dated 1 Apr 2002 17:01:39

Date:         Mon, 1 Apr 2002 17:01:39 -0900

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Organization: University of Alaska Museum

Subject:      Re: MaNIS questions

MIME-Version: 1.0

Content-Type: multipart/mixed; boundary="------------4C7E03390063999F5E48C0EE"

 

 

UAM's online database (along with MVZ's) is displaying error estimates through

the Berkeley Digital Library Project's GIS viewer.  I assume that the

"finished" MaNIS project could look about the same.  That is, error estimates

will be a prominent and critical feature of the system.  Given that the GIS

viewer will map data points over satellite photos of much of the U.S., the

precision associated with the data points is critical.  The implication of "no

error" on such a fine-scale GIS layer is that the specimen came from a

specific tree or bush!  Our database contains max_errors from as small as a

few meters to as large as several tens of kilometers.  These are not arcane

details.

 

    XXXXXX

 

 

 

>>> Posting number 192, dated 2 Apr 2002 11:09:56

Date:         Tue, 2 Apr 2002 11:09:56 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Lat_Long metadata

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Oops, my mistake. There IS a collection with the structure to capture all

of the metadata. Two others, UAM and MVZ, have everything except "Extent of

Named Place."

 

Thanks XXXX, bright spot appreciated.

 

John W

 

 

>X-Sender: carlak@mail.bishopmuseum.org

>X-Mailer: QUALCOMM Windows Eudora Version 5.0.2

>Date: Tue, 02 Apr 2002 08:39:08 -1000

>To: John Wieczorek <tuco@socrates.Berkeley.EDU>

>From:

>Subject:

> 

>FYI:  in reference to your statement below....................

> 

>Start thinking about incorporating the georeferenced coordinates and

>metadata into your individual databases. Not one of the participating

>institutions currently has the structure in its database to capture all of

>the metadata we are gathering. It would be nice if we all could. We don't

>want to throw away all of this hard work after all.

> 

>Here's a bright spot to your day:  I have incorporated the MANIS locality

>structure into my Locality table and will thus be saving all the metadata

>for the BPBM specimens and for all new specimens into the collection that

>are completely georeferenced.

> 

>XXXX

>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 

 

>>> Posting number 193, dated 2 Apr 2002 12:00:20

Date:         Tue, 2 Apr 2002 12:00:20 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: georeferencing rivers

In-Reply-To:  <.20020401170449.0099fc90@pilot.msu.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX, and all,

 

First, I want to apologize for having given contradictory opinions on how

these vague localities should be treated. I stated at least once in the

past that we shouldn't bother with these kinds of localities. However, that

opinion was not based on unassailable logic. In both of the circumstances

described below in Robin's message the coordinates will be of limited

utility due to their very large maximum error. Nevertheless, providing the

coordinates and maximum error will allow the user to determine the extent

to which they ARE useful.

In replying to XXXX I first expressed the opinion that we should provide

maximum errors even in the truly vague cases. My unstated personal

justification for that opinion was that it makes the rules simpler. More

philosophically, by georeferencing all non-contradictory localities, we

don't need to answer the question "How big of an area is too vague?" We

cannot fully anticipate all of the uses to which the data will be put, so

we don't really have a basis on which to make that judgement. A locality

with coordinates and a maximum error distance is always more useful than a

locality without them. End of apology.

 

Now, back to the questions.

 

>John:

> 

>XXXXX's questions and your responses prompted additional questions re:

>georeferencing rivers and vague localities.

> 

>1.  Is it correct to assume that when one measures the length of a river

>to determine its geographic center the river's possibly winding path is

>taken into consideration; however, the extent is determined "as the crow

>flies" from the geographic center to the furthest reach?

 

You don't need to know the length of the river to determine its geographic

center, you need only take the means of the extremes of latitude and

longitude encompassing it. After that, you need to find the point on the

river nearest the geographic center. From there, the extent would be the

distance to the furthest point on the river.
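
The procedure just described (mean of the latitude and longitude extremes, then the nearest point on the river, then the distance to the furthest point) can be sketched in Python. The assumptions here are mine: the river is represented by a list of sampled (latitude, longitude) vertices, so "nearest point on the river" means the nearest sampled vertex; distances are great-circle distances on a sphere of radius 6371 km; and the function names and sample coordinates are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def georeference_river(points):
    """Return (coordinates on the river, extent in km) for a list of (lat, lon) vertices."""
    lats = [p[0] for p in points]
    lons = [p[1] for p in points]
    # Geographic center: means of the extremes of latitude and longitude.
    center = ((min(lats) + max(lats)) / 2, (min(lons) + max(lons)) / 2)
    # Move the center onto the river by taking the nearest sampled vertex.
    anchor = min(points, key=lambda p: haversine_km(center[0], center[1], p[0], p[1]))
    # Extent: distance from that vertex to the furthest vertex on the river.
    extent = max(haversine_km(anchor[0], anchor[1], p[0], p[1]) for p in points)
    return anchor, extent


# Hypothetical vertices, made up for this example (not a real river).
river = [(44.0, -123.0), (44.2, -123.4), (44.3, -123.5), (44.5, -124.0)]
anchor, extent = georeference_river(river)
print(anchor, round(extent, 1))
```

A denser set of vertices gives a better approximation of both the on-river point and the extent; the result would still need converting to whatever distance units the georeferencing template expects.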

 

>2.  Should we put coordinates on the following vague locality:

> 

>HigherGeog: Michigan, Barry County

>SpecLocality: "no specific locality recorded"

> 

>XXXX and I have not georeferenced such localities thus far, but it

>appears from your response that county center coordinates and the extent

>of Barry County should be provided.

 

Yes. These should be georeferenced. However, there isn't really a need for

you to do it. Such localities can be georeferenced automatically from a

table of county centroids when we're all done. In retrospect, it would have

probably been useful for me to do that before making the gazetteer

"public," but I didn't think it worth the delay at the time.

 

John W
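
As a rough illustration of the automatic county-level georeferencing mentioned above, the following Python sketch looks up a locality with no specific locality in a table of county centroids. Everything named here is an assumption: the function county_level_georeference, the table layout (higher geography string mapped to latitude, longitude, and extent), and the numbers, which are placeholders and not Barry County's actual centroid or extent.

```python
def county_level_georeference(higher_geog, spec_locality, centroids):
    """Return (lat, lon, extent) from the centroid table, or None if not applicable.

    Only localities whose specific locality is empty or begins with
    "no specific locality" are handled; all others are left for manual work.
    """
    cleaned = spec_locality.strip().strip('"').lower()
    if cleaned and not cleaned.startswith("no specific locality"):
        return None
    return centroids.get(higher_geog)


# PLACEHOLDER values for illustration only; not real centroid data.
centroids = {"Michigan, Barry County": (12.34, -56.78, 9.0)}
print(county_level_georeference(
    "Michigan, Barry County", '"no specific locality recorded"', centroids))
```

Localities that return None either have a usable specific locality or belong to a county missing from the table, and would still be georeferenced by hand.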

 

 

>>XXXX and all,

>> 

>>These are good questions. I'll put the answers right below each one.

>> 

>>>1. When I georeference rivers, should I take coordinates of the source or

>>>the drainage of the river? How much should extent of the river be?

>> 

>>The coordinates should be at the geographic center of the river, on the

>>river itself. The extent should be the distance to the furthest reach of

>>the river in either direction.

>> 

>>>2. An example: specific locality is "Brooks Range, Anaktiktoot", where

>>>Anaktiktoot is not on the map. Should I georeference for Brooks Range

>>>(which will be more than 600 miles in length)? There are many cases that

>>>higher geography is followed by unknown specific locality.

>> 

>>You should go ahead and put coordinates on the vague localities, even

>>though the maximum_error_distance will be large. Some of the higher

>>geographies that have no value or "no specific locality" in the locality

>>field can still be specific, such as islands.

>> 

>>>3. Related to my question 2: how much is too big to georeference? In many

>>>cases, only the name of the island, mountains, peninsula etc. are

>>>provided.

>> 

>>Do them all. The maximum_error_distance will be useful even if it is large.

>> 

>>John

> 

 

>>> Posting number 194, dated 2 Apr 2002 21:23:13

Date:         Tue, 2 Apr 2002 21:23:13 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         "Barbara R. Stein" <bstein@OZ.NET>

Subject:      Re: MaNIS questions

MIME-Version: 1.0

Content-Type: multipart/alternative;

              boundary="------------8D04441FBD1587A8D66E30D2"

 

 

Dear XXX et al.,

 

From the outset, this project has proceeded, and proceeded successfully,

because we have all been "on the same page."  Your email (see below) provides

an opportunity to reiterate what we said we were going to do, what we intend

to do, and exactly why we are doing it as stated.

 

John and I (particularly John) are extremely grateful to those of you who have

immersed yourselves in the intricacies of georeferencing and have been willing

to share your thoughts and insights with the list.  However, such discussions

in and of themselves have not added to the work load that was initially

budgeted or funded.  Quite to the contrary, both the "Coordinate

Georeferencing Activities" and "Implement Specimen Data Model" sections in the

MaNIS Project Description described providing georeferencing metadata as well

as the coordinates.  And we stated emphatically,

 

"Well-documented, georeferenced collecting events are crucial to biogeographic

data...."

 

This is exactly what we are doing.

 

The error calculator and spreadsheet templates that John provided make the

addition of metadata such as lat/long error a relatively trivial exercise and

one that should not be confused with the discussion of such topics on this

list.  Several individuals have chosen to probe that tool more closely and we

have all benefited from their interest and experimentation.  Their comments

have enhanced our understanding of the process and the resulting data, and

improved the tool, but they have not created more work.

 

Where confusion may have arisen, is in the following:

 

> And indeed (in light of the above

> quote) we do not have a capacity to absorb such information into our

> present databases, let alone deciding how much time we have to care about

> this.  Seeing the impact of the request for so much attention to error

> estimates, I find it hard to support so much allocation of additional time

> to this effort.

 

It is not your job to incorporate such information into your present databases

and we apologize for any confusion that John might have engendered in his

previous email.  This is a topic we will be discussing at our meeting at ASM

in June but perhaps it is worth clarifying now what John was intimating when

he made reference to this issue.

 

Think of your current dbms in two parts, the databases themselves and the

interfaces you now use to input, query and display those data in-house.  For

most of you, neither your databases nor your interfaces are currently designed

to handle any new fields (e.g., lat/long error).  However, we are expending a

great deal of time and effort to collect such data and want to make them

available to researchers.  Whereas it is a fairly tricky task (given

constraints of time and budget) to modify each of your interfaces to add new

fields, it is relatively easy to add those fields to your current databases

and migrate the data directly to the MaNIS servers along with your specimen

data.  This will happen when John writes the migration scripts for each of

your institutions.  Hence, the data will be displayed over the network and

available to you without impacting your current set-ups in-house.  In raising

this issue, he was merely letting you know that we are, in fact, moving ahead

and beginning to work on the next step of the project, creating the migration

scripts and software that will make the network function.

 

> I have witnessed, over the years, many publications based on massive

> datasets in which the authors were not able to document (or even care)

> about variance in the quality and accuracy of the data.  Typically, they

> just put on their blinders and accepted all the "AVAILABLE" data.  This is

> just an inherent problem for those who move up the scale (allometric

> analyses, macroecology, or whatever), and at such LARGE scales of analyses

> they usually say that small local errors become insignificant, because of

> the LARGE SCALE of the overall analysis.

 

Here I will part company with XXX and argue that it is our intention to do

better than what has always been done or has been done previously.  Neither

John nor I see this "inherent problem," particularly with the advent of

increased computing technology.  I participated in one of the planning

workshops for NEON (National Ecological Observatories Network) two years ago

and I can state unequivocally that the standard is changing/has changed.  The

kinds of publications to which XXX refers will no longer be acceptable (if

they even are at this time) because it is possible to document variance in

quality and accuracy of data, even for extremely large datasets.  Furthermore,

we believe we have designed the georeferencing protocol to do just that,

with relatively little overhead and impact to the participating institutions.

 

At this point everyone has at least begun the georeferencing process and from

what we can gather, once initial inertia is overcome, things actually progress

quite smoothly and quickly.  I may be premature in saying so, but it is our

hope that MVZ will have completed georeferencing the ca. 40,000+ localities

for California in the next two months.  How have we done this?  I would remind

each of you that our first priority is to provide georeferenced data to those

localities in our collections that currently have none!  It is not to add

error to localities that already have lat/long coordinates assigned to them,

it is not to verify already georeferenced localities, and it is not to clean

up locality descriptions.  Our budget figures were based on the number of

unique localities in our collections that lacked lat/long coordinates of any

sort.  I would also add, that while we cannot dictate whom you hire to do

georeferencing, your money will go lots farther if you hire undergraduates,

and it will go farthest if you hire work-study students.

 

We have all taken the first giant step.  What is needed now is to just keep

putting one foot in front of the other.  I guarantee you will amaze

yourselves.

 

Best,

Barbara

 

 

 

>>> Posting number 195, dated 3 Apr 2002 08:38:21

 

>>> Posting number 196, dated 3 Apr 2002 10:52:59

Date:         Wed, 3 Apr 2002 10:52:59 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Contemporary informatics science, etc.

In-Reply-To:  <3CAA91C1.79E00185@oz.net>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

Barbara et al.,

I appreciate the comments and forum that exist among our MaNIS group, and

I thank Barbara for her most recent.  I also agree that the developing

field of informatics is helping us to raise the bar on scientific

standards in general, and I don't wish my comments to be taken as an

endorsement of the crudeness of broad synthetic work done in the past

(without error estimates).  I also realize that for the many data fields

that we have entered into our XXXX mammal database (other than lat/long)

we will probably continue without error estimates for some time to come.

On the other hand we can only await the further development of these kinds

of massive data management projects in the future, assuming that financial

resources will remain available for this kind of thing.  It will be great

if we can be surprised by continued improvements in the overall quality of

the data that stand behind the specimens we hold in our collections.  I

obviously remain committed to assuring that we get our job done on this

current project.

XXX

 

 

 

>>> Posting number 197, dated 5 Apr 2002 15:49:39

Date:         Fri, 5 Apr 2002 15:49:39 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear all,

 

In late February when I was fixing my mistake with the UWBM Lat_Longs I

mentioned that I would be reloading ROM data at some time as well. That

time has come. The new ROM data have now been loaded into the gazetteer.

What does this mean for you?  If you haven't begun georeferencing yet

(though as far as I know, everyone has), you just need to download your

localities again and proceed as described in the Georeferencing Steps

document ( http://dlp.CS.Berkeley.EDU/manis/GeorefSteps.html ). If you have

downloaded localities and started georeferencing them, first you need to

remove any ROM records from the set. Next make another query in the MaNIS

gazetteer just like the original query that gave you the records you are

working on, but this time pick ROM in the Institution box on the MaNIS

Gazetteer page to get only ROM records for that combination of higher

geography. Download these ROM records and append them to the end of the

file you are working on.
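
A minimal sketch of the reload steps above, for anyone who keeps their working file in a script-friendly form. The column name "Institution" and the row layout are assumptions made for illustration; they are not the documented format of the gazetteer download.

```python
def refresh_rom_records(working_rows, new_rom_rows, institution_field="Institution"):
    """Drop stale ROM rows from the working set, then append the fresh ROM rows.

    The field name "Institution" is a hypothetical column label.
    """
    # Step 1: remove any ROM records from the set being worked on.
    kept = [row for row in working_rows if row.get(institution_field) != "ROM"]
    # Step 2: append the newly downloaded ROM records to the end of the file.
    return kept + list(new_rom_rows)


# Illustrative rows only; real downloads carry many more fields.
working = [
    {"Institution": "MVZ", "Locality": "Berkeley"},
    {"Institution": "ROM", "Locality": "old ROM record"},
]
fresh_rom = [{"Institution": "ROM", "Locality": "reloaded ROM record"}]
merged = refresh_rom_records(working, fresh_rom)
```

Georeferencing work already done on non-ROM records is left untouched; only the ROM rows are swapped out.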

 

Sorry for this inconvenience. I'm pretty sure I've got everything correct

now and that this kind of thing won't happen any more. So, everyone,

proceed with confidence.

 

My next undertaking will be to write the documentation for a new Calculator

that can calculate not only errors, but also coordinates. This calculator

will be VERY similar to the Error Calculator, so there won't be much new to

learn. The new calculator has already been tested; the results agree with

those given by Gary Shugart's Excel tool for the same localities. This is

good. I'll announce the new calculator as soon as I've posted the manual

for it, which should be next Friday or so after I return from San Diego.

 

Happy georeferencing!

 

John W

 

>>> Posting number 198, dated 5 Apr 2002 17:47:21

 

>>> Posting number 199, dated 15 Apr 2002 16:52:52

Date:         Mon, 15 Apr 2002 16:52:52 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      GNIS Website Gazeteering

 

Hello All,

 

I am a recent addition to the group, and I have thrown myself headlong into

the midst, hopefully well.

 

That said, I do have a question about a source.  I am using the USGS GNIS

website http://geonames.usgs.gov/pls/gnis/web_query.gnis_web_query_form and

I was wondering what, if any, experiences have been had.  Specifically, if

I read it correctly, it is a database of information culled from the USGS

maps.  I am just unsure of a few things...:

 

First, datum, scale, and other info.  The site refers to "7.5' by 7.5'

Map"; what other data can be culled just from that?

 

Second, it at times gives coordinates from multiple maps that are slightly

different.  How do I reconcile these variances?  Do I give my own best

combination, or has a process been agreed upon that I missed in going

through the past postings?

 

Thanks, and greetings to you all.

 

>>> Posting number 200, dated 15 Apr 2002 15:45:02

Date:         Mon, 15 Apr 2002 15:45:02 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      GNIS Info

MIME-Version: 1.0

Content-Type: multipart/alternative;

              boundary="----=_NextPart_000_0088_01C1E494.81AC73E0"

 

 

XXXXXX

 

I too am new at working on this MaNIS project, just started this week.

Anyways, I had the exact same question as you and talked to John

Wieczorek this morning at the Museum of Vertebrate Zoology in Berkeley,

CA.  He said that using the GNIS data is fine even though the

source of the database is not from one place.  These are the "givens"

for GNIS use with the "Error Calculator":

 

1)  Coordinate System: decimal degrees

2)  Coordinate Source: USGS map 1:25,000

3)  Datum: NAD27 (North American Datum 1927)

 

Make sure that you fill out the "Extent of Named Place Field" as much as

possible each time.  If anyone from this board has other suggestions, I

would be glad to hear them.
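
One way to keep the three "givens" above consistent from record to record is to hold them as fixed defaults and add only the per-locality values. The sketch below is illustrative: the dictionary keys and the function name are hypothetical labels, not the Error Calculator's actual field names, and the sample coordinates are made up.

```python
# Fixed Error Calculator settings for GNIS-derived coordinates, as listed above.
# The dictionary keys are hypothetical labels, not the calculator's real field names.
GNIS_GIVENS = {
    "coordinate_system": "decimal degrees",
    "coordinate_source": "USGS map 1:25,000",
    "datum": "NAD27",
}


def calculator_inputs(latitude, longitude, extent_of_named_place):
    """Combine the fixed GNIS settings with the values that vary per locality."""
    inputs = dict(GNIS_GIVENS)  # copy so the shared defaults are never mutated
    inputs["latitude"] = latitude
    inputs["longitude"] = longitude
    inputs["extent_of_named_place"] = extent_of_named_place
    return inputs


# Illustrative values only; they are not taken from GNIS.
example = calculator_inputs(41.76, -88.32, 5.0)
```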

 

Is anyone else converting the GNIS database to a shape file to be used

in ArcView to calculate distances?  If there are a lot of you, I will

start posting ArcView questions pertaining to this project here.

Thanks.

 

XXXXXXX

 

>>> Posting number 201, dated 16 Apr 2002 09:49:31

Date:         Tue, 16 Apr 2002 09:49:31 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         

Subject:      Re: GNIS Info

MIME-Version: 1.0

Content-Type: multipart/alternative;

              boundary="----=_NextPart_000_0022_01C1E52C.020D1CA0"

 

 

XXXXX--

 

Thanks for the quick reply; it was very helpful.

 

I am always interested in other people's experiences with ArcView.

 

 

John Wieczorek & Group--

 

I still am on the fence with the locations that give me two or more

different georeferencing points.  I have the feeling that, as they are

both "legitimate" sources (different USGS maps), I can just choose

one and indicate in the proper field in the database which I chose.

Does this seem acceptable/appropriate?

 

Thanks

 

XXXXX

 

 

 

 

 

>>> Posting number 202, dated 16 Apr 2002 10:04:18

Date:         Tue, 16 Apr 2002 10:04:18 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: GNIS Info

In-Reply-To:  <002501c1e555$eb1a1320$b16f0a0a@fmnh.org>

Mime-Version: 1.0

Content-Type: multipart/alternative;

              boundary="=====================_12236688==_.ALT"

 

 

XXXX and others

Having already georeferenced thousands of South American localities, I consider this an

important and poorly understood question.  My strong conviction is that simply

picking a point arbitrarily is apt to prove more misleading than leaving the

point undetermined.  If there are 28 "San Martin"s in Peru, for example, and

there is no additional information for specifying this (e.g., compiling an

expedition itinerary, locations of field activities immediately beforehand and

afterwards, and (rarely) the distributions of animals themselves), then

guessing--and being explicit about your guesses--can only be misleading.

 

Following this strategy with the Field Museum's 2300 locality records from Peru

led me to leave 14% of the localities unspecified.  However, I am confident

that the remaining 86% came from where they plot.

 

I would be interested in hearing the experiences of others and the druthers of

curators/collection managers on the data fidelity (vs accuracy) question.

Clearly, we need to embrace a community-wide standard.

XXXXX

 

 

 

>>> Posting number 203, dated 16 Apr 2002 09:11:41

Date:         Tue, 16 Apr 2002 09:11:41 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: GNIS Info

In-Reply-To:  <4.1.20020416095834.00a94a90@mail.fmnh.org>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

I agree wholeheartedly with XXXXX. If there is ambiguity in terms of a

multitude of potential named places for a given locality we should NOT

georeference it, but give the reason ("ambiguous" or "multiple possible

places" or something like that) in the NoGeorefBecause field. It may be

that some of these localities can be resolved by the host institution by

looking in field notes and the like. However, that's a time-consuming

activity and we should leave that until after the coordinates get

redistributed.

For the record, the other type of locality we should NOT georeference is

one that is in question (e.g., "Bakersfield?"). For these, put something

like "locality questionable" in the NoGeorefBecause field. The reason for

filling out the NoGeorefBecause field is so that the host institution knows

that someone actually looked at the locality. You wouldn't otherwise know

this if the Lat and Long were just blank. While reviewing, I might as well

remind everyone to make use of the Remarks field to alert host institutions

of likely errors such as misspellings as well as unusual assumptions that

were made in the course of the coordinate determination.
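
The rule above can be sketched as a small decision function. This is only an illustration under stated assumptions: the Lat, Long, and NoGeorefBecause field names come from this discussion, but the function name, the candidate-list format, the trailing "?" test for a questionable locality, and the wording used when no place is found are all hypothetical.

```python
def georeference_decision(locality, candidates):
    """Return Lat/Long/NoGeorefBecause values for one locality description.

    candidates: list of (latitude, longitude) tuples, one per named place
    that could match the locality (hypothetical input format).
    """
    record = {"Lat": None, "Long": None, "NoGeorefBecause": ""}
    if locality.strip().endswith("?"):
        # A locality that is itself in question, e.g. "Bakersfield?".
        record["NoGeorefBecause"] = "locality questionable"
    elif len(candidates) > 1:
        # Several possible named places: do not pick one arbitrarily.
        record["NoGeorefBecause"] = "multiple possible places"
    elif len(candidates) == 1:
        record["Lat"], record["Long"] = candidates[0]
    else:
        # Assumption: this case and its wording were not discussed on the list.
        record["NoGeorefBecause"] = "no matching named place found"
    return record


# Illustrative coordinates only.
questionable = georeference_decision("Bakersfield?", [(35.37, -119.02)])
ambiguous = georeference_decision("San Martin", [(-6.5, -76.4), (-8.0, -77.0)])
resolved = georeference_decision("Bakersfield", [(35.37, -119.02)])
```

In every branch where no coordinates are assigned, NoGeorefBecause is filled in, so the host institution can tell that someone actually looked at the locality.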

 

It's nice to see the list serving its purpose. Thanks for the questions and

responses!

 

John W

 

 

 

 

>>> Posting number 204, dated 16 Apr 2002 11:21:06

Date:         Tue, 16 Apr 2002 11:21:06 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         

Subject:      Re: GNIS Info

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding: 7bit

 

All,

 

I believe I am not being as clear in my situation as I thought.  Here is an

example.

 

Aurora, Illinois, is a city/town that spreads across multiple counties, and

is on 3 different USGS maps, according to the query form results I received.

As I understand it, the information on the site comes about in the same

manner as if I had all of these maps myself, and were picking the point, and

best approximating the lat & long according to