MaNIS Georeferencing Discussion
Archive
Following are extracts of the Georeferencing Listserv discussions accumulated during the MaNIS georeferencing project. Postings missing from the sequence were judged to have no lasting relevance to georeferencing. Messages have been edited to protect the guilty by masking names of individuals with XXXXXX.
>>> Posting number 1, dated 17 Jul 1999 14:12:50
-----------------------------------------------------------------------------
>>> Posting number 2, dated 17 Jul 1999 14:15:23
-----------------------------------------------------------------------------
>>> Posting number 3, dated 17 Jul 1999 14:16:03
-----------------------------------------------------------------------------
>>> Posting number 4, dated 17 Jul 1999 14:19:25
-----------------------------------------------------------------------------
>>> Posting number 5, dated 17 Jul 1999 14:19:59
-----------------------------------------------------------------------------
>>> Posting number 6, dated 17 Jul 1999 14:26:41
-----------------------------------------------------------------------------
>>> Posting number 7, dated 17 Jul 1999 14:22:50
-----------------------------------------------------------------------------
>>> Posting number 8, dated 17 Jul 1999 14:23:12
-----------------------------------------------------------------------------
>>> Posting number 9, dated 19 Jul 1999 09:29:01
-----------------------------------------------------------------------------
>>> Posting number 10, dated 23 Jul 1999 16:35:41
>>> Posting number 11, dated 3 Sep 1999 16:17:55
>>> Posting number 12, dated 17 Sep 1999 15:19:38
>>> Posting number 13, dated 17 Sep 1999 13:13:14
>>> Posting number 14, dated 17 Sep 1999 14:57:30
>>> Posting number 15, dated 20 Sep 1999 09:04:17
>>> Posting number 16, dated 24 Sep 1999 17:01:21
>>> Posting number 17, dated 28 Sep 1999 12:50:27
>>> Posting number 18, dated 15 Oct 1999 19:37:37
>>> Posting number 19, dated 17 Oct 1999 16:37:27
>>> Posting number 20, dated 18 Oct 1999 16:50:30
>>> Posting number 21, dated 19 Oct 1999 11:15:26
>>> Posting number 22, dated 19 Oct 1999 16:35:19
>>> Posting number 23, dated 20 Oct 1999 15:51:18
>>> Posting number 24, dated 20 Oct 1999 11:34:55
>>> Posting number 25, dated 20 Oct 1999 16:00:18
>>> Posting number 26, dated 10 Nov 1999 10:52:01
>>> Posting number 27, dated 10 Nov 1999 13:54:04
>>> Posting number 28, dated 17 Nov 1999 15:12:19
>>> Posting number 29, dated 18 Nov 1999 12:38:15
>>> Posting number 30, dated 18 Nov 1999 10:08:56
>>> Posting number 31, dated 18 Nov 1999 13:22:25
>>> Posting number 32, dated 19 Nov 1999 14:35:52
>>> Posting number 33, dated 3 Dec 1999 10:21:24
>>> Posting number 34, dated 3 Jan 2000 11:48:10
>>> Posting number 35, dated 3 Jan 2000 16:24:25
>>> Posting number 36, dated 18 May 2000 16:51:23
>>> Posting number 37, dated 18 May 2000 19:49:29
>>> Posting number 38, dated 23 May 2000 18:41:45
>>> Posting number 39, dated 24 May 2000 09:38:19
-----------------------------------------------------------------------------
>>> Posting number 40, dated 24 May 2000 12:15:39
>>> Posting number 41, dated 12 Jun 2000 15:45:50
>>> Posting number 42, dated 13 Jun 2000 09:31:26
>>> Posting number 43, dated 13 Jun 2000 09:59:02
>>> Posting number 44, dated 13 Jun 2000 09:17:08
>>> Posting number 45, dated 13 Jun 2000 07:49:43
>>> Posting number 46, dated 13 Jun 2000 09:04:22
>>> Posting number 47, dated 13 Jun 2000 08:54:22
>>> Posting number 48, dated 13 Jun 2000 11:11:31
>>> Posting number 49, dated 13 Jun 2000 13:23:46
>>> Posting number 50, dated 30 Jun 2000 16:25:38
>>> Posting number 51, dated 30 Jun 2000 17:14:31
>>> Posting number 52, dated 30 Jun 2000 23:29:35
>>> Posting number 53, dated 1 Jul 2000 07:35:15
>>> Posting number 54, dated 4 Jul 2000 11:04:23
>>> Posting number 55, dated 4 Jul 2000 10:07:33
>>> Posting number 56, dated 6 Jul 2000 00:00:0/
>>> Posting number 57, dated 5 Jul 2000 19:40:11
>>> Posting number 58, dated 5 Aug 2000 09:24:55
>>> Posting number 59, dated 5 Aug 2000 12:31:07
>>> Posting number 60, dated 7 Aug 2000 13:45:33
>>> Posting number 61, dated 15 Aug 2000 21:54:23
>>> Posting number 62, dated 23 Aug 2000 16:24:48
>>> Posting number 63, dated 30 Aug 2000 11:20:17
>>> Posting number 64, dated 22 Sep 2000 09:36:34
>>> Posting number 65, dated 29 Sep 2000 08:51:23
>>> Posting number 66, dated 2 Oct 2000 10:35:12
>>> Posting number 67, dated 5 Oct 2000 09:40:24
>>> Posting number 68, dated 17 Oct 2000 18:13:33
>>> Posting number 69, dated 1 Nov 2000 07:48:24
>>> Posting number 70, dated 1 Nov 2000 08:06:24
>>> Posting number 71, dated 28 Nov 2000 18:26:18
>>> Posting number 72, dated 29 Nov 2000 21:09:35
>>> Posting number 73, dated 30 Nov 2000 08:31:10
>>> Posting number 74, dated 30 Nov 2000 11:33:07
>>> Posting number 75, dated 14 Dec 2000 20:41:28
>>> Posting number 76, dated 15 Dec 2000 07:59:04
>>> Posting number 77, dated 26 Apr 2001 09:00:01
>>> Posting number 78, dated 16 May 2001 18:29:45
>>> Posting number 79, dated 16 May 2001 17:36:59
>>> Posting number 80, dated 18 May 2001 08:29:49
>>> Posting number 81, dated 24 May 2001 10:19:20
>>> Posting number 82, dated 25 May 2001 09:43:37
>>> Posting number 83, dated 11 Jun 2001 12:01:03
>>> Posting number 84, dated 11 Jun 2001 15:02:51
>>> Posting number 85, dated 11 Jun 2001 15:44:56
>>> Posting number 86, dated 29 Jun 2001 21:12:37
>>> Posting number 87, dated 4 Jul 2001 14:24:24
Date: Wed, 4 Jul 2001 14:24:24 -0700
Reply-To: "Mammalogy Z39.50 Network (Private)" <MAMMAL-Z-NET@USOBI.ORG>
Sender: "Mammalogy Z39.50 Network (Private)" <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: ROM higher geography
In-Reply-To: <sb433743.076@romfs7.rom.on.ca>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
I'm posting the following exchange to the list because there is information
contained herein that is relevant to everyone. The basic concepts of data
cleanliness, the gazetteer, and data updates are addressed in brief.
>Once I began working on the Bukedi inconsistency (2nd in your list) I saw
>that your methodology is missing many more errors/inconsistencies that
>exist in County and Province data.
Understood. My analysis reveals only the duplicates of
ORCT+ORCRY+ORPR+ORCY
I understand that there may be many other errors and inconsistencies in the
original data, but that is not a concern for the gazetteer. In fact, the
duplicates I pointed out aren't a problem either. I just wanted to alert
you to them since they came out in my analysis.
> The errors and inconsistencies are a direct reflection of the state of
> documentation on field catalogues or specimen cards, depending on the
> source of the automated record. We did not have the resources at the
> time of automation (nor do we now for that matter) to resolve what is a
> "Province" term and what is a "County" term for all
> countries. Additionally, we are looking at historical data that may no
> longer be reflected in the current political reality of our little world
> (e.g.,
> are used routinely to manage the collection and retrieve data. Continent
> and Country should be clean. The Province field should be clean for
>
> just finished cleaning up the Province field for
> County field should be clean for
> frequency listings for Country etc. for these priority sections of the db
> (and collection) in an effort to maintain the consistency of our
> data. For all other geographic locations, Province and County are not
> used for managing the collection, so the data clean up or enhancement has
> been a low priority. This is an ongoing situation that I have discussed
> with Judith with regard to the Manis Project. My understanding is that
> funding for documentational and staffing resources will be part of this
> "mission". I am afraid your listing of 13 inconsistencies barely
> scratches the surface of the data cleaning that is required and even more
> importantly, misses all kinds of erroneous or missing data. I currently
> do not have the maps, atlases, or gazetteers nor the staff/time to
> undertake this project which from a collections' perspective is of low
> priority. To do a proper job I cannot resolve all of the problems that
> you have identified without undertaking a full review of the entire
> country's data.
There is no requirement for any standard of cleanliness. It is my hope that
errors and inconsistencies will be noted during georeferencing and
forwarded to the attention of the institutions as a part of that
process. The tools are meant to identify the inconsistencies, not to
remedy them. What the institutions do with these notes is entirely up to them.
>I am not sure what you are currently attempting to do with the data so we
>may need to further discuss our respective needs to insure that we are not
>working at cross purposes. If work is to be globally undertaken, I would
>like our data to be the db of record - making long lists of changes for
>you to then repeat is a waste of effort and time; you will see the work
>generated by having two dbs of record by the simple changes that I have
>made this afternoon. Also, errors in interpretation or typos that are
>bound to occur should be avoided. Finally, the data you have is already
>out of date, since changes are made by me on a daily basis as errors etc.
>are encountered during the normal activities of managing the collection,
>fulfilling data requests, etc.
The institutional databases will always be the database of record. The
data I have from all of the institutions is just a snapshot, to be used for
georeferencing. I will not ask for these data again during the project, nor
will I make changes to the data I have received. When we have a network,
the gazetteer will be created and updated automatically whenever data
change and the snapshot will be obsolete. I've only created the snapshot
so that we have combined data to work with. When people begin to do
georeferencing using the gazetteer they will not change the data - they
will only make commentaries. Even the latitude and longitude are
commentaries in a sense. It is up to each institution to accept or reject
the commentaries and make changes based on them in its database.
>Regards,
>
> >>> John Wieczorek <tuco@socrates.Berkeley.EDU> 07/02/01 08:50PM >>>
>Attached is a tab-delimited file with the first row containing column
>headings. The contents of the file are combinations of higher geographic
>fields for which you have more than one interpretation in your
>database. The first field (highergeog) is a concatenation of the fields of
>higher geography that reveal duplication. The second field (geogid) is an
>identifier unique to the ROM higher geography data with one row for every
>unique combination of ORCT, ORCRY, ORPR, and ORCY. As you can see by the
>rows in the table, there are 13 places for which there are inconsistent
>placements of county vs. province, for example. It is not critical for my
>purposes to have these resolved, but since I noticed them I thought I might
>as well tell you. If you do make changes to these combinations, let me
>know which are correct and I'll do so on this end as well.
>>> Posting number 88, dated 10 Jul 2001 12:01:24
Date: Tue, 10 Jul 2001 12:01:24 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: cave localities
Mime-version: 1.0
Content-type: text/plain; charset="US-ASCII"
Content-transfer-encoding: 7bit
I've noticed that the USGS GNIS web site does not give information on cave
sites. (It does give locations of variants such as Boulder Cave
Campground.) Is this a protocol we wish to follow? Are there other web
sites that do list cave localities? What do you think?
Cheers,
>>> Posting number 89, dated 10 Jul 2001 13:40:25
Date: Tue, 10 Jul 2001 13:40:25 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Filtering data
In-Reply-To: <sb4b0d4a.070@romfs7.rom.on.ca>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
This message is in reply to a comment about
records for captive animals.
>I would recommend that you do not use any captive records for a
>gazetteer. Does that make sense?
In a restricted view of the utility of a gazetteer it does make sense to
exclude them. However, it is actually easier to include them, yet have them
flagged. This has the benefit that one can filter on the captive attribute.
This could be useful if you wanted to do a quick query of only captive
animals as well as for a query in which you want to leave them out. The
philosophy in general will be to have a home for all data that anyone deems
useful, yet to allow each institution to decide which data it will provide
through the filters implemented during migration.
A filter might do any one of the following:
1) exclude attributes altogether (e.g., not show a "CaptiveFlag" field)
2) exclude records based on the value of an attribute (e.g., not show
records of endangered species)
3) exclude certain values of an attribute (e.g., not show localities for
endangered species)
4) substitute a surrogate value for an attribute of a certain value (e.g.,
instead of showing locality with lat-long, show only county-level and
higher geography for endangered species)
These are just a few examples of what might be done at one institution, and
may vary between institutions. I encourage the participants to discuss
these issues, and to begin to make institutional decisions about filtering
rules when it comes time to set up the migration. The rules must be
clearly defined before I begin to write the creation scripts - I can't
afford to stay at any given institution (except maybe Hawaii, heh heh),
while the rules are being hashed out.
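The four kinds of filter listed above can be sketched in a few lines of Python. This is only an illustration, not the migration scripts themselves: the field names (Species, Locality, County, Latitude, Longitude, CaptiveFlag) and the species sets are hypothetical placeholders that each institution would replace with its own rules.

```python
from typing import Optional

# Hypothetical rule sets; each institution would define its own.
HIDDEN_FIELDS = {"CaptiveFlag"}          # 1) attributes never shown
EXCLUDED_SPECIES = {"species A"}         # 2) whole records withheld
BLANKED_SPECIES = {"species B"}          # 3) locality values withheld
GENERALIZED_SPECIES = {"species C"}      # 4) locality replaced by a surrogate


def apply_filters(record: dict) -> Optional[dict]:
    """Return a filtered copy of the record, or None if it is withheld."""
    species = record.get("Species")

    # 2) Exclude records based on the value of an attribute.
    if species in EXCLUDED_SPECIES:
        return None

    # 1) Exclude attributes altogether.
    out = {k: v for k, v in record.items() if k not in HIDDEN_FIELDS}

    if species in BLANKED_SPECIES:
        # 3) Exclude certain values of an attribute.
        out["Locality"] = None
        out["Latitude"] = None
        out["Longitude"] = None
    elif species in GENERALIZED_SPECIES:
        # 4) Substitute a surrogate: county-level geography, no lat-long.
        out["Locality"] = out.get("County")
        out["Latitude"] = None
        out["Longitude"] = None

    return out
```

Keeping the rules in data (the four sets) rather than in the function body is one way to let the rules vary between institutions without changing the script.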
>>> Posting number 90, dated 8 Aug 2001 13:10:05
>>> Posting number 91, dated 14 Sep 2001 08:48:17
>>> Posting number 92, dated 23 Sep 2001 17:24:24
>>> Posting number 93, dated 24 Sep 2001 20:07:31
Date: Mon, 24 Sep 2001 20:07:31 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Georeferencing Guidelines
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
Now that we are officially up and running I would like to provide the first
of two documents on the MaNIS collaborative georeferencing effort. This
first document is meant to open for discussion the issues associated with
turning specific locality descriptions into well-documented latitudes and
longitudes. The document does not explain what tools to use, or how to use
any of them - that will be in a forthcoming document. Instead, this
document focuses on the "theoretical aspects" of the task, our methods and
assumptions, upon which it would be helpful for us all to agree. To that
end, please read the Georeferencing Guidelines page, accessible from the
Documents page on the MaNIS website (see below). Comment by sending
messages to MAMMAL-Z-NET@USOBI.ORG. Let's try to get through this
discussion by 6 Oct.
http://dlp.cs.berkeley.edu/manis/Documents.html
Anticipating your enthusiastic participation,
John Wieczorek
>>> Posting number 94, dated 25 Sep 2001 18:30:16
Date: Tue, 25 Sep 2001 18:30:16 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Georeferencing text, for reference
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
It was pointed out to me that it might be prudent to have a text-only copy
of the document, with line numbers, to which everyone can refer in
discussions. I am including the full text of the GeorefGuide.html file
below for that purpose. The page itself can be found at the following URL:
http://dlp.cs.berkeley.edu/manis/GeorefGuide.html
1 MaNIS
2 The Mammal Networked Information System
3
4 John Wieczorek
5 24 September 2001
6 _________________________________________________
7
8 Georeferencing Guidelines
9
10 This document contains information about assigning geographic
11 coordinates and maximum errors for those coordinates to specific
12 locality descriptions. This document does not attempt to
13 describe the tools and methods for finding named places on maps
14 or gazetteers. The process of assigning coordinates and errors,
15 called georeferencing, can be rather complicated. The complexity
16 of the process can be greatly reduced and the consistency of the
17 results can be greatly increased by establishing simple
18 guidelines that cover most commonly encountered locality
19 descriptions. The guidelines for assigning coordinates for named
20 places are presented with examples in the section Determining
21 Latitude & Longitude.
22
23 There are several fundamental sources of error for specific
24 locality descriptions, and these vary in magnitude. It is
25 essential during georeferencing to determine and record the
26 greatest source of error among all possible sources. There are
27 numerous ways in which the maximum error of a geographic
28 coordinate might be expressed, but the most convenient is as a
29 distance, because its size and shape are constant over any
30 geodetic surface model. The sources of error and their
31 magnitudes are discussed primarily in the section Determining
32 Error.
33
34 An Appendix containing a description of the data that should be
35 captured for each georeferenced locality, a glossary, and
36 references are appended for the convenience of the reader.
37
38 Determining Latitude & Longitude
39
40 Geographic coordinates can be expressed in a number of different
41 coordinate systems (e.g. decimal degrees, degrees minutes
42 seconds, degrees decimal minutes, UTM, etc.). Conversions can be
43 made readily between coordinate systems, but decimal degrees
44 provide the most convenient coordinates to use for
45 georeferencing for no more profound a reason than that a
46 specific locality can be described with only two attributes:
47 decimal latitude and decimal longitude.
48
49 Named Places
50
51 The simplest of specific locality descriptions consist of only a
52 named place. Use the geographic center of a named place for the
53 latitude and longitude, and use the distance from that point to
54 the furthest point within that named place for the maximum error
55 distance. If the geographic center of the named place is not
56 within the confines of the shape of the named place, use the
57 point nearest to the geographic center that lies within the
58 shape.
59
60 Example: "Bakersfield"
61
62 Township Range Section (TRS) descriptions are essentially no
63 different from that of any other named place. It is necessary to
64 understand how TRS descriptions work and how they describe a
65 place. See the References section, below, for links to TRS
66 information.
67
68 Example: "E of Bakersfield, T29S R29E Sec. 34 NE 1/4"
69
70 Offsets
71
72 Offsets generally consist of combinations of distances and
73 directions from a named place. Use the geographic center of the
74 named place in the direction of the offset as a starting point.
75 Unless there is contrary information in the locality
76 description, measure the distance in the offset direction to
77 find the spot for the geographic coordinates. Offsets that do
78 not explicitly say that they were measured by air or by some
79 contour (e.g., by road, river, valley, etc.) should be
80 determined as if by air in a straight line.
81
82 Example: "10 mi E (by air) Bakersfield"
83
84 Example: "10 mi E of Bakersfield"
85
86 However, if there is no mention of the mode of measurement in
87 the locality description, but the measurement includes fractions
88 (e.g., 10.2 miles) and there is a road in the vicinity, use road
89 miles. Offsets that were described in the specific locality as
90 being measured by road should be determined using the contours
91 of the road rather than using a straight line. The methods for
92 determining the maximum error distances for these types of
93 specific locality descriptions are given in the Determining
94 Error section, below.
95
96 Example: "10.2 mi E of Bakersfield"
97
98 Example: "13 mi E (by road) Bakersfield"
99
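For the straight-line ("by air") case in the Offsets section above, the coordinates of the offset point can be sketched with the standard great-circle destination formula. This is a minimal sketch under stated assumptions: a spherical earth of mean radius 6371 km, a hypothetical function name offset_point, and an approximate, purely illustrative starting point for Bakersfield. It does not handle offsets measured along a road or other contour.

```python
import math

EARTH_RADIUS_KM = 6371.0   # assumption: spherical earth, mean radius
KM_PER_MILE = 1.609344


def offset_point(lat, lon, distance_km, bearing_deg):
    """Point reached by travelling distance_km from (lat, lon) along a
    great circle with the given initial bearing (0 = N, 90 = E).
    Longitude is not normalized, so this sketch should not be used
    across the 180-degree meridian."""
    phi1 = math.radians(lat)
    lam1 = math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_km / EARTH_RADIUS_KM   # angular distance in radians

    phi2 = math.asin(math.sin(phi1) * math.cos(delta)
                     + math.cos(phi1) * math.sin(delta) * math.cos(theta))
    lam2 = lam1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2))
    return math.degrees(phi2), math.degrees(lam2)


# "10 mi E of Bakersfield", starting from an approximate, illustrative center.
lat, lon = offset_point(35.37, -119.02, 10 * KM_PER_MILE, 90.0)
```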
100 Vagueness
101
102 At times, specific locality descriptions are fraught with
103 vagueness. It is not the purpose here to belittle localities of
104 this type; in fact, an honest admission of the unknown is
105 preferable to masking it with unwarranted precision.
106
107 The most important type of vagueness in a specific locality
108 description is one in which the locality is in question. No such
109 locality should be georeferenced.
110
111 Example: "Bakersfield?"
112
113 Many locality descriptions imply an offset from a named place
114 without definitive directions or distances. Use the geographic
115 center of the named place for the geographic coordinates. For
116 the maximum error distance, use the greatest distance that is
117 not likely to be considered in the area of another named place.
118 Clearly there is a measure of subjectivity involved here. Let
119 common sense prevail and document the assumptions made.
120
121 Example: "near Bakersfield"
122
123 Sometimes offset information is vague either in its direction or
124 in its distance. If the direction information is vague, record
125 the geographic coordinates of the center of the named place and
126 add the offset distance to the greatest extent of the named
127 place to get the maximum error distance.
128
129 Example: "5 mi from Bakersfield"
130
131 Uncertainty in the offset distance is a fact of the business.
132 Almost no localities are recorded with error estimates,
133 therefore every offset distance is inherently uncertain. The
134 addition of a modifier in the locality description, while an
135 honest observation, should not change the determination of the
136 geographic coordinates or of the maximum error.
137
138 Example: "about 3 mi E of Bakersfield"
139
140 The worst of situations arises when a specific locality
141 description is internally inconsistent. There are numerous
142 possible causes for inconsistencies. It is the task of those
143 georeferencing to determine the part of the description most
144 likely to be in error, ignore it for the purpose of the
145 determination, and document the decision to do so. The most
146 common source of inconsistency in a locality description comes
147 from trying to match elevation information with the rest of the
148 description. If there is no reasonable way to reconcile the
149 discrepancy, ignore the elevation.
150
151 Example: "10 mi W of Bakersfield, 6000 ft"
152
153 Determining Error
154
155 The process of georeferencing includes an assessment of the
156 possible sources of error in a geographic coordinate
157 determination. Errors may arise due to the extent of a locality,
158 due to unspecified precision in original measurements (distance
159 precision and directional precision), or due to not knowing the
160 datum under which coordinates were determined. It is essential
161 to determine which of these yields the greatest error and record
162 that value as the maximum error distance. Potential error
163 sources and guidelines for determining the magnitude of each for
164 a given specific locality are given in the paragraphs below.
165
166 Error due to the shape of a locality
167
168 Named places are not single points; they have extents. If a
169 locality description is no more specific than to describe a
170 named place or an offset from a named place, then the size of
171 the named place is a source of error. The treatment of error due
172 to the extent of a locality is described under the examples of
173 determining latitude and longitude, above.
174
175 Error due to an unknown datum
176
177 Seldom have geographic coordinates been recorded for a locality
178 in a natural history collection in which the underlying datum of
179 the coordinate system was given. Even now, when GPS coordinates
180 are being taken as definitive evidence of a location, the
181 geodetic datum is being ignored. Without recording the datum
182 with the coordinates, potential accuracy is being lost. Figure 1
183 shows the magnitude of error (in meters) over North America
184 based on not knowing the datum from which the coordinates were
185 taken.
186
187 [datumerror.jpg]
188
189 Figure 1. Map of North America showing the magnitude of
190 potential error from not knowing whether coordinates were taken
191 from NAD27, NAD83, or WGS84.
192
193 This map can be used as a rough guide for determining the
194 magnitude of error due to not knowing the datum from which the
195 geographic coordinates were recorded.
196
197 Precision
198
199 Precision is difficult to gauge from specific locality
200 descriptions; it may be reflected in the locality description,
201 but it is seldom, if ever, explicitly recorded. Furthermore, a
202 database record may not reflect, or may reflect incorrectly, the
203 precision inherent in the original measurement, especially if
204 the locality description has undergone interpretation from the
205 verbatim original description. Precision issues arise from both
206 distance measurements and directions in a locality description.
207 Potential errors from each of these sources are discussed in the
208 paragraphs below.
209
210 Error associated with distance precision
211
212 Distance may be recorded in a specific locality description with
213 or without significant digits, and those digits may or may not
214 be warranted. A conservative way to ensure that distance
215 precision is not inflated is to treat distance measurements as
216 integers with fractional remainders. Thus 10.25 becomes 10 1/4,
217 10.5 becomes 10 1/2, etc. Calculate the error for these distances
218 based on the fractional part of the distance, using 1 divided by
219 the denominator of the fraction.
220
221 Example: "10.5 mi N of Bakersfield" Fraction is 1/2, error should
222 be 0.5 mi.
223
224 Example: "10.6 mi N of Bakersfield" Fraction is 1/10, error
225 should be 0.1 mi.
226
227 Example: "10.75 mi N of Bakersfield" Fraction is 3/4, error should
228 be 0.25 mi.
229
230 If the distance is an integer, use an error of one unit.
231
232 Example: "10 mi N of Bakersfield" Error should be 1 mi.
233
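The distance precision rule above can be sketched as follows. This is one possible reading, not a definitive implementation: the list of candidate denominators is an assumption, chosen so that the sketch reproduces the four examples (in particular, 10.6 is treated as tenths rather than reduced to 3/5). Distances are passed as strings so that the digits recorded in the locality description are preserved exactly.

```python
from decimal import Decimal

# Assumption: the "fractions" a collector is likely to have meant, tried from
# coarsest to finest. Whole units, halves, quarters, then decimal places.
CANDIDATE_DENOMINATORS = (1, 2, 4, 10, 100, 1000)


def distance_precision_error(distance: str) -> Decimal:
    """Error (in the units of the distance) implied by its recorded precision:
    1 divided by the coarsest candidate denominator that fits the value."""
    value = Decimal(distance)
    for denominator in CANDIDATE_DENOMINATORS:
        if (value * denominator) % 1 == 0:
            return Decimal(1) / Decimal(denominator)
    raise ValueError(f"no candidate denominator fits {distance!r}")


for d in ("10.5", "10.6", "10.75", "10"):
    print(d, "mi ->", distance_precision_error(d), "mi")
```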
234 Error associated with directional precision
235
236 Direction is almost always expressed in specific locality
237 descriptions using cardinal and intercardinal directions rather
238 than degree headings. A conservative interpretation of these
239 directions allows for an error of 22.5 degrees to either side of
240 the recorded direction. Thus, ENE can be any direction between E
241 and NE, while NE can be any direction between ENE and NNE.
242
243 [directionerror.jpg]
244
245 The error distance resulting from imprecision in direction
246 increases with increasing offset distance. In fact the error
247 distance due to directional imprecision is 0.4142 times the
248 offset. Note, however, that when a locality description uses two
249 offsets based on cardinal directions (e.g., 1 mi N and 3 mi E of
250 Bakersfield), the distances and directions are likely to have
251 been measured on a map. In this case, directional imprecision
252 should be ignored.
253
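The factor 0.4142 quoted above matches the tangent of 22.5 degrees, so the directional error can be sketched as the offset distance times tan(22.5 degrees). The function name and the default parameter are illustrative assumptions; the sketch applies only to single-direction offsets, since the text above says to ignore directional imprecision when two cardinal offsets are given.

```python
import math

DIRECTION_HALF_WIDTH_DEG = 22.5   # allowed error to either side of the heading


def directional_error(offset_distance: float,
                      half_width_deg: float = DIRECTION_HALF_WIDTH_DEG) -> float:
    """Error distance due to directional imprecision, in the same units as
    the offset distance."""
    return offset_distance * math.tan(math.radians(half_width_deg))


# For "10 mi E of Bakersfield" this gives roughly 4.14 mi.
error_mi = directional_error(10.0)
```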
254 Appendix
255
256 Geographic Coordinate Data
257
258 Following are the essential attributes to be captured for each
259 locality while georeferencing.
260
261 Decimal_Latitude - the latitude coordinate (in decimal degrees) at
262 the center of a circle encompassing the whole of a specific
263 locality. Convention holds that decimal latitudes north of the
264 equator are positive numbers less than or equal to 90, while
265 those south are negative numbers greater than or equal to -90.
266 Example: -42.51 degrees (which is the same as 42d 30' 36" S).
267
268 Decimal_Longitude - the longitude coordinate (in decimal degrees)
269 at the center of a circle encompassing the whole of a specific
270 locality. Decimal longitudes west of the Greenwich Meridian are
271 considered negative and must be greater than or equal to -180,
272 while eastern longitudes are positive and less than or equal to
273 180. Example: -122.49 degrees (which is the same as 122d 29' 24"
274 W).
275
276 Maximum_Error_Distance - the upper limit of the distance from the
277 given latitude and longitude within which the described locality
278 must lie.
279
280 Maximum_Error_Units - the units of length in which the maximum
281 error is recorded (e.g., mi, km, m, and ft). Express maximum
282 error distance in the same units as the distance measurement in
283 the specific locality description.
284
285 Datum - the geometric description of a geodetic surface model
286 (e.g., NAD27, NAD83, WGS84). Datums are often recorded on maps
287 and in gazetteers, and can be specifically set for most GPS
288 devices. Use "not recorded" when the datum is not known.
289
290 Original_Coord_System - the coordinate system in which the raw
291 data are being entered. For the purpose of collaborative
292 georeferencing this value will be "decimal degrees." However,
293 existing geographic coordinates may be entered in degrees
294 minutes seconds, degrees decimal minutes, or UTM coordinates.
295
296 Reference - the reference source (e.g., map, gazetteer, or
297 software) used to determine the coordinates. Such information
298 should provide enough detail so that anyone can locate the
299 actual reference that was used (e.g., name, edition or version,
300 year). Lat_Long_Determined_By - the person or organization by
301 which the determination was made.
302
303 Lat_Long_Determined_Date - the date on which the determination was
304 made.
305
306 Remarks - comments on methods and assumptions used in determining
307 coordinates or errors when those methods or assumptions differ
308 from or expand upon the accepted guidelines.
309
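The two worked examples in the Decimal_Latitude and Decimal_Longitude entries can be checked with a short conversion sketch. The function names are hypothetical; the sign convention (S and W negative) and the valid ranges (-90 to 90 for latitude, -180 to 180 for longitude) are the usual ones for decimal degrees.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float,
                   hemisphere: str) -> float:
    """Convert degrees, minutes, seconds plus N/S/E/W to decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # Southern latitudes and western longitudes are negative.
    return -value if hemisphere in ("S", "W") else value


def is_valid_coordinate(latitude: float, longitude: float) -> bool:
    """True if both values fall inside the decimal-degree ranges."""
    return -90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0


lat = dms_to_decimal(42, 30, 36, "S")    # approximately -42.51
lon = dms_to_decimal(122, 29, 24, "W")   # approximately -122.49
```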
310 Glossary
311
312 Datum - A geodetic datum describes the size, shape, origin, and
313 orientation of a coordinate system for mapping the surface of
314 the earth.
315
316 Decimal degrees - degrees expressed as a single real number (e.g.,
317 -22.343456) rather than as a composite of degrees, minutes,
318 seconds, and direction (e.g., 7d 54' 18.32" E).
319
320 Geodetic surface model - a geometric description of the surface of
321 the earth.
322
323 Geographic coordinates - latitude and longitude, measured in any
324 of various coordinate systems.
325
326 Geographic center - To find the geographic center of a shape,
327 first find the extremes of both latitude and longitude within
328 the shape and then take their respective means.
329
330 UTM - Universal Transverse Mercator. A grid coordinate system
331 specifying a datum, zone, and offsets from the equator and from
332 the meridian of the zone. See the References section, below, for
333 more information.
334
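The glossary definition of geographic center, together with the Named Places rule for maximum error distance, can be sketched as below. This is a rough illustration under several assumptions: the named place is given as a list of (latitude, longitude) boundary points, distances use the haversine formula on a sphere of mean radius 6371 km, the shape does not cross the 180-degree meridian, and the case where the center falls outside the shape is not handled. The function names and the sample boundary are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0   # assumption: spherical earth, mean radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def geographic_center(points):
    """Mean of the latitude extremes and mean of the longitude extremes."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return (min(lats) + max(lats)) / 2, (min(lons) + max(lons)) / 2


def max_error_km(points):
    """Distance from the geographic center to the furthest boundary point."""
    center_lat, center_lon = geographic_center(points)
    return max(haversine_km(center_lat, center_lon, lat, lon)
               for lat, lon in points)


# Illustrative rectangular boundary, not a real city limit.
boundary = [(35.0, -119.2), (35.0, -119.0), (35.2, -119.0), (35.2, -119.2)]
center = geographic_center(boundary)
error_km = max_error_km(boundary)
```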
335 References
336
337 Township, Range, Section Information:
338
339 http://www.esg.montana.edu/gl/trs-data.html
340
341 Datum Information:
342
343 http://www.colorado.edu/geography/gcraft/notes/datum/datum_f.html
344 http://164.214.2.59/GandG/tm83581/tr83581a.htm
345 http://biology.usgs.gov/geotech/documents/datum.html
346
347 UTM Information:
348
349 http://www.nps.gov/prwi/readutm.htm
350 http://www.dmap.co.uk/ll2tm.htm
351
352 Note
353
354 Specific locality descriptions are inexact and seldom give
355 estimates of error. An ideal description of a specific locality
356 has no error. One way to achieve this ideal is to describe the
357 locality by a shape within which the exact locality must
358 certainly lie. The capture of shape data is certainly possible
359 with current GIS technology, and is even demonstrably more
360 efficient than the methods described above. However, there are
361 technical challenges yet to be met in order to make the capture
362 of shape data feasible in a collaborative Internet-based
363 georeferencing environment.
364
365 An alternative to using a shape to describe a locality is to use
366 a definitive point of arbitrarily high precision with an
367 attendant maximum error. This method, described in the foregoing
368 document, is a conservative expression of the locality which
369 satisfies the requirement that the exact locality must lie
370 within the space described.
371
372
373 _________________________________________________
374
375 Rev. 24 September 2001, JRW
376
377 University of California, Berkeley, CA 94720, Copyright 2001,
378 The Regents of the University of California.
>>> Posting number 95, dated 27 Sep 2001 10:45:45
Date: Thu, 27 Sep 2001 10:45:45 -1000
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Georeferencing document
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
John,
I went through your document this morning and find most of it clear and in
agreement with my own practices of georeferencing. I have some
observations and questions as follows:
A.
140 The worst of situations arises when a specific locality
141 description is internally inconsistent. There are numerous
142 possible causes for inconsistencies. It is the task of the those
143 georeferencing to determine the part of the description most
144 likely to be in error, ignore it for the purpose of the
145 determination, and document the decision to do so. The most
146 common source of inconsistency in a locality description comes
147 from trying to match elevation information with the rest of the
148 description. If there is no reasonable way to reconcile the
149 discrepancy, ignore the elevation.
150
151 Example: "10 mi W of Bakersfield, 6000 ft"
I have recently been through a georeferencing exercise in the herp
collection for which obtaining coordinates that agreed with the elevations
was critical. It was only through trying to match the description of the
location (distance and direction from X village) with the elevation given,
and finding that the given elevation at the described site was impossible,
that I uncovered major problems in the locality data provided for a large
number of herps on one particular collecting trip. In this case I was able
to contact the collector to ask about the inconsistencies and he determined
that his original distances were totally off because he was using miles on
a metric map. In this case the elevations were the correct piece of
information. I therefore caution against ignoring elevations out of hand.
B.
The section on Determining Latitude and Longitude does not include an example
for when coordinates are provided. For the sake of completeness, should
such an example be included, or, since they are being provided and not
determined, should this be taken up in another section? For example, when
coordinates are provided in degrees, minutes and seconds, do we translate
into decimals? How many decimal places do we go for minutes? For seconds?
Does it matter who provided the coordinates (collector, previous museum
person, someone else)? Under what circumstances, if any, should we
recalculate coordinates when they are provided by some previous source?
C.
210 Error associated with distance precision
211
212 Distance may be recorded in a specific locality description with
213 or without significant digits, and those digits may or may not
214 be warranted. A conservative way to insure that distance
215 precision is not inflated is to treat distance measurements as
216 integers with fractional remainders. Thus 10.25 becomes 10 1/4,
217 10.5 becomes 10 1/2, etc. Calculate the error for these distances
218 based on the fractional part of the distance, using 1 divided by
219 the denominator of the fraction.
Lines 217-219. Does this mean to "replace" the numerator with 1, and
divide by the denominator?
221 Example: "10.5 mi N of Bakersfield" Fraction is 1/2, error should
222 be 0.5 mi.
numerator is 1 to begin with, so doesn't answer the question.
224 Example: "10.6 mi N of Bakersfield" Fraction is 1/10, error
225 should be 0.1 mi.
Isn't the fraction of .6, 6/10? Did you replace the 6 with a 1 in order
to calculate the error?
227 Example: "10.75 mi N of Bakersfield" Fraction is 3/4, error should
228 be 0.25 mi.
Fraction this time is given as 3/4, not 1/4, but you could only get an
error of 0.25 by replacing the 3 with a 1 before dividing by 4.
As you can see, the examples are confusing.
All in all, it's a sound document. Thanks much.
>>> Posting number 96, dated 27 Sep 2001 20:34:47
Date: Thu, 27 Sep 2001 20:34:47 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: Gordon Jarrell <fnghj@AURORA.UAF.EDU>
Subject: Re: Georeferencing document
In-Reply-To: <5.0.2.1.1.20010927104434.00a2f7e0@mail.bishopmuseum.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Some good points. I've inserted my comments.
On Thu, 27 Sep 2001, XXXXXXX wrote:
> A.
> 140 The worst of situations arises when a specific locality
> 141 description is internally inconsistent. There are numerous
> 142 possible causes for inconsistencies. It is the task of the those
> 143 georeferencing to determine the part of the description most
> 144 likely to be in error, ignore it for the purpose of the
> 145 determination, and document the decision to do so. The most
> 146 common source of inconsistency in a locality description comes
> 147 from trying to match elevation information with the rest of the
> 148 description. If there is no reasonable way to reconcile the
> 149 discrepancy, ignore the elevation.
> 150
> 151 Example: "10 mi W of Bakersfield, 6000 ft"
>
> I have recently been through a georeferencing exercise in the herp
> collection for which obtaining coordinates that agreed with the elevations
> was critical. It was only through trying to match the description of the
> location (distance and direction from X village) with the elevation given,
> and finding that the given elevation at the described site was impossible,
> that I uncovered major problems in the locality data provided for a large
> number of herps on one particular collecting trip. In this case I was able
> to contact the collector to ask about the inconsistencies and he determined
> that his original distances were totally off because he was using miles on
> a metric map. In this case the elevations were the correct piece of
> information. I therefore caution against ignoring elevations out of hand.
>
The key words here are, "IF there is no way to reconcile the
discrepancy..." A possible resolution of the discrepancy might be to
treat it as "specific locality unknown." This might best be left to the
discretion of the individual collections. We have to judge individually
how bad our bad data are, i.e., whether or not we can reconcile them.
> B.
> Section on Determining Latitude and Longitude does not include an example
> for when coordinates are provided. For the sake of completeness, should
> such an example be included, or, since they are being provided and not
> determined, should this be taken up in another section? For example, when
> coordinates are provided in degrees, minutes and seconds, do we translate
> into decimals? how many decimal places do we go for minutes? for
> seconds? Does it matter who provided the
> coordinates? collector? previous museum person? someone else? Under
> what circumstances, if any, should we recalculate coordinates when they are
> provided by some previous source?
>
(I know John's answer to some of this one.) The coordinates define an
infinitely small point, no matter what the format. Precision is measured
with max_error, not the number of significant figures.
Nevertheless, we will have coordinates in which precision was implied by
the recorded format. We have to convert this implied imprecision into a
measure of max_error. At UAM we are using 2 km, a little over a nautical
mile, for coordinates that were recorded to the nearest whole minutes.
There are other examples, similar to the problems with distance precision:
64D 28' 30" N - What they meant to say, in terms of significant
figures, was probably 64D 28.5' N. I suppose in this example we would use
max_error= 1 km
We probably do need to develop a standard here. And yes, I'll bet we want
to be able to keep track of various determinations, re-determinations, who
did it, when, and how.
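As a worked illustration of the equivalence noted above (64D 28' 30" N is the same point as 64D 28.5' N), here is a minimal Python sketch that converts degrees, minutes, and seconds to decimal degrees. The function name and its signature are assumptions made for illustration; they are not part of any MaNIS tool.

```python
def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, hemisphere="N"):
    """Convert degrees/minutes/seconds plus a hemisphere letter to signed decimal degrees."""
    hemisphere = hemisphere.upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    # South and West are negative by the usual convention.
    return -value if hemisphere in ("S", "W") else value


from_seconds = dms_to_decimal(64, 28, 30, "N")   # 64D 28' 30" N
from_minutes = dms_to_decimal(64, 28.5, 0, "N")  # 64D 28.5' N
print(from_seconds, from_minutes)
```

Both calls describe the same point. The conversion says nothing about precision, which still has to be recorded separately as max_error.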
> C.
> 210 Error associated with distance precision
> 211
> 212 Distance may be recorded in a specific locality description with
> 213 or without significant digits, and those digits may or may not
> 214 be warranted. A conservative way to insure that distance
> 215 precision is not inflated is to treat distance measurements as
> 216 integers with fractional remainders. Thus 10.25 becomes 10 1/4,
> 217 10.5 becomes 10 1/2, etc. Calculate the error for these distances
> 218 based on the fractional part of the distance, using 1 divided by
> 219 the denominator of the fraction.
>
> Lines 217-219. Does this mean to "replace" the numerator with 1, and
> divide by the denominator?
>
> 221 Example: "10.5 mi N of Bakersfield" Fraction is 1/2, error should
> 222 be 0.5 mi.
>
> numerator is 1 to begin with, so doesn't answer the question.
>
> 224 Example: "10.6 mi N of Bakersfield" Fraction is 1/10, error
> 225 should be 0.1 mi.
>
> Isn't the fraction of .6, 6/10? Did you replace the 6 with a 1 in order
> to calculate the error?
>
> 227 Example: "10.75 mi N of Bakersfield" Fraction is 3/4, error should
> 228 be 0.25 mi.
>
> Fraction this time is given as 3/4, not 1/4, but you could only get an
> error of 0.25 by replacing the 3 with a 1 before dividing by 4.
>
> As you can see, the examples are confusing.
>
>
Looks like a typo in line 224.
I suggest replacing the sentence beginning in line 217 with:
The error is the resolution implied by the denominator. It can be
calculated as a distance by dividing one unit of distance by the
denominator.
Is that better? Or worse?
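One way to read the proposed wording is sketched below in Python. It finds the smallest denominator that expresses the fractional part exactly and returns one unit divided by that denominator. The list of candidate denominators is an assumption chosen so that the sketch reproduces the errors given in the three quoted examples (0.5, 0.1, and 0.25 mi); the guidelines do not specify such a list, and whole-number distances are deliberately left out.

```python
from fractions import Fraction

# Assumed set of denominators commonly used when recording distances
# (halves, thirds, quarters, eighths, tenths, hundredths). Illustrative only.
COMMON_DENOMINATORS = (2, 3, 4, 8, 10, 100)


def distance_precision_error(distance_text):
    """Return the error implied by the fractional part of a recorded distance.

    The error is one unit of distance divided by the smallest assumed
    denominator that expresses the fractional part exactly.
    """
    fractional = Fraction(distance_text) % 1
    if fractional == 0:
        raise ValueError("whole-number distances are outside this sketch")
    for denominator in COMMON_DENOMINATORS:
        if (fractional * denominator).denominator == 1:
            return Fraction(1, denominator)
    raise ValueError("fraction does not match any assumed denominator")


for text in ("10.5", "10.6", "10.75"):
    print(text, "mi ->", float(distance_precision_error(text)), "mi")
```

Under this reading the numerator never enters the calculation: 10.75 is 10 3/4, the denominator is 4, and the error is 1/4 mi.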
>>> Posting number 97, dated 28 Sep 2001 12:53:09
Date: Fri, 28 Sep 2001 12:53:09 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Georeferencing guidelines
Mime-version: 1.0
Content-type: multipart/alternative;
boundary="MS_Mac_OE_3084526390_196216_MIME_Part"
John et al.,
The georeferencing guidelines look great to me. The only (minor) quibble I
have would be with the second item under the subheading "Offsets" (lines
86-89). Here, you suggest that a locality that contains distance fractions
(such as "10.2 mi E Bakersfield") should be assumed to be road miles rather
than air miles. I see
it the other way around. Most field workers I know are careful to state "by
road" if their mileage was actually measured along a road. Otherwise, the
mileage is assumed to be taken directly from a map (i.e., air miles). I
don't see that the inclusion of fractions in the mileage should
automatically signal that the mileage was read from an odometer...it's easy
to get that level of precision using the distance scale printed on the map.
Let's see what the others think. Well done.
>>> Posting number 98, dated 28 Sep 2001 11:33:22
Date: Fri, 28 Sep 2001 11:33:22 -0700
Reply-To: Peter Rauch <peterr@socrates.Berkeley.EDU>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: Georeferencing guidelines
In-Reply-To: <OF482A362E.E38FA255-ON86256AD5.00621E6D@lsu.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
On Fri, 28 Sep 2001, XXXXXXXX wrote:
> The georeferencing guidelines look great to me. The only
> (minor) quibble I have would be with the second item under
> the subheading "Offsets" (lines 86-89). Here, you suggest
> that a locality that contains distance fractions (such as
> "10.2 mi E Bakersfield") should be assumed to be road miles
> rather than air miles. I see it the other way around. Most
> field workers I know are careful to state "by road" if their
> mileage was actually measured along a road.
On insect labels ;>) "by road" is just that much more text to
cram onto tiny labels. Maybe things are different with
vertebrate folks, especially for those who keep detailed field
notebooks. I think lots of folks keep careful track of their
odometers, and record road/track miles quite often. I suspect
that *either* assumption is likely to be wrong too often (i.e.,
when no explicit indication is given of which type of
measurement is done). Perhaps the classification should be
"Basis of measure not indicated" and let the "buyer beware"?
(I.e., the geographic analyst can then choose how she wishes to
interpret the distances --perhaps choosing to measure both ways
if a locality seems out of place under one or the other
measurement scheme.)
> Otherwise, the
> mileage is assumed to be taken directly from a map (i.e.,
> air miles). I don't see that the inclusion of fractions in
> the mileage should automatically signal that the mileage was
> read from an odometer...it's easy to get that level of
> precision using the distance scale printed on the map.
>>> Posting number 99, dated 30 Sep 2001 13:35:49
Date: Sun, 30 Sep 2001 13:35:49 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: FW: Locality comment
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
John et al.:
With regard to assigning coordinates to localities, there is a convention
that has been used here at KU for at least 50 years that will help with
localities that are given with reference to towns in the US. When the town
(e.g. Lawrence) was a county seat, distances were measured from the
courthouse. Frequently this was near the center of town, but it reduces the
error in estimating the distance from town because we don't need to worry
about the distance being measured from the city limits. If the locality is
3.5 mi NW of Lawrence, we still have the uncertainty associated with the angular
component. If the town is not a county seat, the Post Office is frequently
specified as the point of reference. We think this system was exported to
several other collections that are part of MANIS. In general, your
suggestions look quite reasonable (and conservative).
>>> Posting number 100, dated 12 Oct 2001 16:22:06
Date: Fri, 12 Oct 2001 16:22:06 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Georeferencing Commentary synopsis
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Hi folks,
I've been ruminating over the responses to the Georeferencing Guidelines
document, which was posted on the MaNIS website on 24 Sep 2001. That
document has generated interest in a wider community, including the
Alexandria Digital Library Project, so I feel it worthwhile to spend a
little extra effort to fill in some omissions. Below I will address the
points brought up in discussion and try to provide satisfactory solutions.
I would like to know if there are any objections to these solutions. My
next step will be to incorporate this information into the Guidelines
document and then announce the existence of that document to NHCOLL.
XXXXXXXX mentioned a convention to use the courthouse for a
point of reference for a county seat and to use a post office as a point of
reference for other towns. Since the Board on Geographic Names GNIS data
often follows this convention as well I see no conflict. Of course, this
convention applies only to the US, and only to those towns where there is a
single identifiable post office or a courthouse. For all other
determinations the current geographic center of the town, or the
coordinates given in a gazetteer, should be used. In either case it is best
to note something akin to "measured from the post office" or "measured from
the geographic center of Bakersfield" in the determination remarks.
XXXXXXXX brought up the topic of elevations as a critical part of the
determination criteria. I agree with her assessment and I propose that we
follow XXXXXXXX's advice, namely, that localities for which there are
internal inconsistencies should be deferred to the parent institution for
further investigation. I have designed the collaborative gazetteer to
allow annotations to both localities and higher geography. Through the
annotations, georeferencers can note inconsistencies for follow-up work.
Collaborators will be able to check the gazetteer for annotations that
apply to the data from their institution.
XXXXX also noted that there was no example of how to deal with existing
geographic coordinates. My original thought was that we should count these
localities as finished. Yet, there is merit in revisiting existing data,
both for validation and for edification, especially since none of the
existing coordinates have associated error. Nevertheless, we must remain
cognizant of our budgetary constraints. We were given funds to georeference
localities for which we didn't already have coordinates. All that aside,
XXXXX's point is well-taken. I will provide guidelines for existing
geographic coordinates in the forthcoming revised Georeferencing Guideline
document.
XXXXX asked whether we should translate coordinates from other coordinate
systems into decimal degrees for data entry. The gazetteer currently
accommodates the following coordinate systems:
decimal degrees
degrees, decimal minutes
degrees, minutes, decimal seconds
UTM
But that doesn't answer the question. I will endeavor to create an
interface in which the user will select the original coordinate system and
provide the data in that system. Behind the scenes the data will be stored
in that system AND will be translated to decimal degrees. There will be
decimal degrees and the original coordinates for every determination.
XXXXX's next topic was with respect to the precision stored in the
coordinate fields. There is no reason to truncate the values of coordinates
to conform to a predefined level of precision. For reasons described under
the section on Precision in the Georeferencing Guidelines document, it is
inappropriate to try to store precision information in the coordinate data.
Since the values of the coordinates do not make a statement about the
precision of the determination, keeping as many digits as your source
provides is the preferred method. Discarding digits may have an effect on
accuracy, so it is not recommended. Just for edification, a decimal degree
that records five digits to the right of the decimal can distinguish
between two places on the earth roughly one meter apart. Similarly, if you
want to maintain accuracy down to one meter, degrees and decimal minutes
should be recorded with 4 decimal places in the decimal minutes, and
degrees minutes seconds should be recorded with 2 decimal places in the
decimal seconds. Conversely, degrees minutes seconds measured to whole
seconds can introduce inaccuracies of up to 31 meters. Those measured to
whole minutes can introduce inaccuracies of up to 1.85 km. I'll make a
chart of this information for the document revision.
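The figures in the paragraph above can be checked with a short calculation. The Python sketch below assumes that one minute of arc along a meridian is one nautical mile (1852 m), which is a round figure rather than an exact geodetic value, and computes the ground distance spanned by one step in the last recorded digit. The names used are illustrative.

```python
# Assumption: one minute of arc along a meridian is one nautical mile (1852 m).
# The true value varies slightly with latitude.
METERS_PER_ARC_MINUTE = 1852.0

METERS_PER_UNIT = {
    "degrees": 60 * METERS_PER_ARC_MINUTE,
    "minutes": METERS_PER_ARC_MINUTE,
    "seconds": METERS_PER_ARC_MINUTE / 60,
}


def resolution_m(finest_unit, decimal_places):
    """Ground distance (m) spanned by one step in the last recorded digit."""
    return METERS_PER_UNIT[finest_unit] / 10 ** decimal_places


for unit, places in [("degrees", 5), ("minutes", 4), ("seconds", 2),
                     ("seconds", 0), ("minutes", 0)]:
    print(f"{unit:8s} {places} decimal places: {resolution_m(unit, places):10.3f} m")
```

With that assumption, five decimal places of a degree come to about 1.1 m, whole seconds to about 31 m, and whole minutes to 1.85 km, in line with the numbers quoted above.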
XXXXX's final question has to do with recording the information about who
determined the coordinates. This should certainly be among the best
practices within museums. At the MVZ these data are recorded by making a
reference to the actual person who made the determination. Since the data
are internal to the museum we can tell whether that person was also the
collector or another person on staff. Another possibility is to record the
role of the person who made the determination (e.g., 'collector',
'curatorial assistant', 'Joe's specific locality munger', etc.). Or, if you
only care whether the collector was the one to provide the coordinates, you
could include a DeterminedByCollector field. For MaNIS I intend to use the
name of the person who determines the coordinates, this name being
determined from a login to the online georeferencing interface.
A point of clarification is in order. When determinations are made, I
intend to treat them as opinions. They will not be stored directly with the
locality record, rather, they will refer to it. This allows any number of
lat/long opinions to be registered. The individual institutions will be
able to decide which one (if there are multiple opinions) will be the
"accepted" determination when they put the data back in their databases.
All of the coordinates that were provided in the data sent to me have been
turned into opinions and are already in the gazetteer.
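To make the "determinations as opinions" idea concrete, here is a minimal Python sketch of one possible data layout. Every class, field, and value in it is hypothetical; the message does not describe the actual gazetteer schema. The only point illustrated is that a determination refers to a locality instead of being stored in it, so any number of opinions can coexist.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Determination:
    """One lat/long opinion; it points at a locality rather than living in it."""
    locality_id: int
    latitude: float        # decimal degrees
    longitude: float       # decimal degrees
    max_error: float
    max_error_units: str   # e.g. "km"
    determined_by: str
    remarks: str = ""


@dataclass
class Gazetteer:
    """Hypothetical container holding all opinions for all localities."""
    determinations: List[Determination] = field(default_factory=list)

    def opinions_for(self, locality_id: int) -> List[Determination]:
        return [d for d in self.determinations if d.locality_id == locality_id]


# Made-up values, purely for illustration.
gazetteer = Gazetteer()
gazetteer.determinations.append(
    Determination(1, 35.37, -119.19, 1.0, "mi", "person A", "from original data"))
gazetteer.determinations.append(
    Determination(1, 35.37, -119.20, 2.0, "km", "person B", "measured from post office"))
print(len(gazetteer.opinions_for(1)), "opinions registered for locality 1")
```

An institution could then pick whichever opinion it accepts when loading the data back into its own database.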
XXXXXX made the following observation:
"There are other examples, similar to the problems with distance precision:
64D 28' 30" N - What they meant to say, in terms of significant
figures, was probably 64D 28.5' N. I suppose in this example we would use
max_error= 1 km"
I agree with XXXXXX's assessment of significance; however, the
determination of error is more complicated. Not all degrees are created
equal. Contrary to popular opinion, the distance between 64 degrees N and
65 degrees N is not the same as the distance between 10 degrees N and 11
degrees N. This is due to the oblateness (flattening from a perfect sphere)
of the earth. This may be a minor point, but the length of a degree of
longitude varies greatly, from roughly 110 km at the equator to 0 km at the
poles. My point is that I need to provide an interface in which one can
enter coordinates and the digits of precision and get back an error distance
based on those criteria.
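The variation in the length of a degree of longitude can be approximated with a cosine. The Python sketch below assumes a spherical earth and about 111 km per degree at the equator (rounded to 110 km in the message above). It ignores the oblateness that affects degrees of latitude, so it is only a rough guide.

```python
import math

# Assumption: roughly 111 km per degree of longitude at the equator, on a
# spherical earth.
KM_PER_DEGREE_AT_EQUATOR = 111.0


def km_per_degree_longitude(latitude_deg):
    """Approximate east-west length of one degree of longitude at a latitude."""
    return KM_PER_DEGREE_AT_EQUATOR * math.cos(math.radians(latitude_deg))


for lat in (0, 10, 64, 90):
    print(f"latitude {lat:2d}: {km_per_degree_longitude(lat):6.1f} km per degree")
```

At 64 degrees N a degree of longitude is less than half its equatorial length, so the same number of recorded digits implies a smaller east-west error there than at 10 degrees N.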
I will amend my wording and typos with respect to using fractions in the
distance precision error section.
XXXXXXXXX brought up a reasonable alternative view of how offsets should
be handled. The judgement of whether measurements are "by road" or "by air"
can be a tricky one. I want to propose a solution and see if I can get a
consensus.
Specific localities that actually say what the measurement method is (e.g.,
"2.8 mi (by road) E of Marysville") should use that method for determining
coordinates and errors. No special remark is necessary in these cases.
Specific localities that have two orthogonal measurements in them (e.g.,
"2.5 mi E and 1.5 mi N of Bakersfield") are always assumed to be "by
air." No special remark is necessary in these cases either. Furthermore,
no error due to direction imprecision should be used.
So much for the easy stuff.
Specific localities that have one linear offset measurement from a named
place, but that do not specify how that measurement was taken (e.g., "10.2
mi E of Yuma") are open for a case-by-case judgment. I propose that the
judgement itself always be documented in the remarks for the determination
(e.g., "Assumed 'by air' - no roads E out of Yuma", or "Assumed 'by road'
on Hwy. 80"). If there is no clear best choice, then use the midpoint
between the two possibilities as the geographic coordinate and assign an
error large enough to encompass the coordinates and errors of both methods.
In this case I would remark something like "Error encompasses both distance
by air and distance by road (Hwy. 80)". This is a conservative solution,
but it is relatively simple to do and to remember. This method is also
never "wrong," if by "wrong" we mean that the actual place is certainly
within our error distance from the given coordinates.
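The midpoint rule proposed above can be sketched as follows in Python. The sketch assumes a flat local coordinate system (for example, km east and km north of the named place), which is reasonable only over short distances. The function name is illustrative.

```python
import math


def encompassing_point_radius(point_a, error_a, point_b, error_b):
    """Midpoint of two candidate points and a radius covering both error circles.

    Points are (x, y) pairs in a flat local coordinate system; errors are
    radii in the same distance units.
    """
    (xa, ya), (xb, yb) = point_a, point_b
    midpoint = ((xa + xb) / 2, (ya + yb) / 2)
    half_separation = math.hypot(xb - xa, yb - ya) / 2
    # From the midpoint, the farthest edge of either circle is half the
    # separation plus that circle's own error.
    radius = half_separation + max(error_a, error_b)
    return midpoint, radius


# Hypothetical "by air" and "by road" candidates, 4 units apart.
center, radius = encompassing_point_radius((0.0, 0.0), 1.0, (4.0, 0.0), 2.0)
print(center, radius)
```

Because the radius covers both error circles in full, the true locality lies within it whichever measurement method was actually used.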
XXXXXXXXX brought up a question about what units should be used
for maximum error distance. I have set up the gazetteer so that the units
are entered (chosen actually) from a list of possible values (m, km, ft,
yds, mi). The distance and units should be chosen to make sense in the
context of the locality description. My conservative stance on translation
and recalculation issues is to "never adulterate data that can be
adulterated later." If you decide to put these data back into your
databases (and I certainly hope that you will), you can decide at that time
whether to normalize to a single unit of measure.
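If an institution later chooses to normalize, the conversion itself is simple. The Python sketch below uses the unit labels listed above and the standard definitions of the international foot, yard, and mile. The function and table names are illustrative assumptions.

```python
# Length of each listed unit in meters (international foot, yard, and mile).
METERS_PER_UNIT = {
    "m": 1.0,
    "km": 1000.0,
    "ft": 0.3048,
    "yds": 0.9144,
    "mi": 1609.344,
}


def to_meters(distance, units):
    """Convert an error distance to meters; the stored original is unchanged."""
    return distance * METERS_PER_UNIT[units]


print(to_meters(0.5, "mi"), "m")
```

Keeping the original distance and units alongside any normalized value follows the "never adulterate data that can be adulterated later" stance.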
XXXXXXX also brought up an essential issue of whether errors propagate and
should therefore be summed rather than simply choosing the greatest single
source of error. The answer is not a simple one, so bear with me.
XXXXXXX's specific example, "3 km N + 2 km W Bakersfield" is an instance
of a type of locality description for which I did not provide an example. A
proper description of the error for this example would be a bounding box
centered on the point 3 km N and 2 km W of Bakersfield. Each side of the
box would be 2 km in length (1 km error in any direction). Since we're
using a point and radius to characterize the error, we need a circle that
will circumscribe the above-mentioned bounding box. To do this, the radius
has to be the distance from the center coordinate to a corner. This could
either be calculated by the geometry of the bounding box (in the above
example it would be the 1 km distance to a side times the square root of 2,
or about 1.41 km) or measured on a map.
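The geometry can be written as a one-line calculation. In the Python sketch below, the two arguments are the errors in the east-west and north-south offsets (the half-widths of the bounding box), and the result is the radius of the circle through the corners. The function name is an assumption for illustration.

```python
import math


def circumscribing_radius(error_east_west, error_north_south):
    """Radius of the circle through the corners of an error bounding box.

    Each argument is the half-width of the box in that direction, in the
    same distance units.
    """
    return math.hypot(error_east_west, error_north_south)


# 1 km error in each offset, as in the "3 km N + 2 km W" example.
print(round(circumscribing_radius(1.0, 1.0), 3), "km")
```

For equal errors this is the single error times the square root of 2, which is smaller than the sum of the two errors.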
There remains the more general question of whether errors propagate. They
do, and they are non-linear, so to sum them is a mistake. The paragraph
above shows how a sum is not a satisfactory method of accommodating
multiple sources of error. As more sources of error come to bear, the
propagation gets even more "interesting." I'll spare you the details here,
but I'll make a point of explaining these sources and how they should be
dealt with in the Guidelines revision.
In addition to the issues brought up so far in discussion, I have a few to
add independently. First, I got the calculation for directional error
wrong. I'll update that in the revision. Second, it is probably obvious,
but I still need to state that the directional error can be ignored when
the distance is measured either "by road" or when the description gives two
orthogonal offsets (e.g., "2 mi E and 4 mi N"). Third, there is another
source of error inherent to reading maps. This error is based on the scale
and it reflects inherent errors in the maps themselves. I will quantify
these errors in the revision.
Aside from the revised georeferencing document, I'm currently working on
interfaces to do the georeferencing online. I'll send out a how-to guide
when the interface is ready to use. It is too soon to know when that will be.
So that everyone knows, my field season is about to begin. Eileen and I are
scheduled to leave for Argentina on 3 Nov and to return around New Year's day.
That's it for my update. Feel free to discourse on my proposed amendments
and thanks to everyone for the comments thus far.
John
>>> Posting number 101, dated 16 Oct 2001 12:43:55
>>> Posting number 102, dated 18 Oct 2001 19:30:33
Date: Thu, 18 Oct 2001 19:30:33 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Georeferencing Guideline Document Updated
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
It took almost two weeks, but the eagerly-awaited revision to the
Georeferencing Guidelines Document is finally complete. I have replaced the
original document, so the following URL now points to the revision:
http://dlp.cs.berkeley.edu/manis/GeorefGuide.html
I'm not including the line-numbered text of the document here since we are
presumably past the heated debates. Nevertheless, commentary is
always welcome.
When you read the revised document you are likely to be struck by the
complexities of determining error properly. Don't despair. My next task is
to create an error calculator. The idea is to have a web page on which you
can enter the relevant parameters and get a maximum error distance. This
tool will be a supplement to the georeferencing tool itself, the
development of which is underway.
John
>>> Posting number 103, dated 19 Oct 2001 12:29:38
>>> Posting number 104, dated 4 Nov 2001 21:44:44
Date: Sun, 4 Nov 2001 21:44:44 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: "Barbara R. Stein" <bstein@OZ.NET>
Subject: MaNIS--ready, set, georeference!
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="------------24FB9C29A003860042ABE8C3"
Dear All,
This is the moment I know you have all been waiting for! You will
notice a new Gazetteer link at the bottom of the MaNIS home page
(http://dlp.cs.berkeley.edu/manis). This is your gateway to hours of
georeferencing fun. But before starting to work, please read this
message in its entirety, print it out and post it next to the computer
that will be used for georeferencing. You’ll see why you need to print
it when you get near the bottom.
To begin, please review the updated Georeferencing Guidelines.
Next, you will want to read the Georeferencing Steps document. A hot
link to it appears at the top of the gazetteer page.
You will also want to read the text below the query screen on the
gazetteer main page.
After reading all of the above, you will query the gazetteer for a
locality of interest. The "Search" button returns a list of all higher
geographies containing the term entered and indicates how many unique
localities are contained in the result set. The list will not tell you
how many of those localities are already georeferenced. You will see
those data once you download the localities.
You may choose to “View” the queried localities either before or after
downloading BUT this function will not aid you in assigning lat/long
coordinates. Only those localities for which coordinates have already
been assigned get plotted using the GIS viewer (this is the same tool we
showed you at the ASM meeting, courtesy of the Berkeley Digital Library
Project).
Where the GIS viewer is most helpful is in pointing out erroneous
coordinates (e.g., if you view the georeferenced localities from
Algeria, 3 specimens appear in the Atlantic Ocean). By clicking on that
point on the map, you can see the locality record(s) for that point and
correct it/them or, if the locality is not yours, you can contact the
appropriate institution. The viewer also allows you to see how much
work you have accomplished!
Notes about the viewer: This is a Java applet and takes time to load.
Do not attempt to use it on older machines with inadequate memory.
Also, not all map layers exist for all parts of the world (e.g., you
will only get USGS 7.5' topo maps for the U.S.). How far you can zoom
and the level of resolution you see will depend on the map layers
available.
Additional notes: 1) This gazetteer is a static snapshot of your data
compiled for the sole purpose of georeferencing unique localities.
Corrections to specific localities should be made directly in
institutional databases. They will not be made in the gazetteer so
don't spend time fixing them in the downloaded files. 2) Below the
georeferencing steps you will see the complete list of fields that will
appear in your downloaded files. Those that are in bold are fields you
will fill. Those not in bold are needed by John to reassociate the data
in the gazetteer with the data in your institutional databases. DO NOT
alter the values in these fields!
For security purposes, we are not posting instructions on how to upload
georeferenced localities on the web site. Below is the complete text
for Step Eight of the Georeferencing Steps document. These instructions
are also being archived on the listserv should you forget to print out
this message. Follow the instructions below for uploading completed
files:
Step Eight - Upload Finished Localities
Upload the finished file of georeferenced localities by anonymous
FTP to galaxy.cs.berkeley.edu in the directory incoming/mvz/manis. Use
your favorite FTP client to connect to galaxy.cs.berkeley.edu. Log in as
anonymous, providing your email address as a password. Set the file type
to text. Change to the incoming/mvz/manis directory on galaxy. Transfer
your file.
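For anyone who would rather script Step Eight than use an interactive FTP client, the same sequence can be sketched with Python's standard `ftplib`. This is an illustration only. The host and directory come from the step above, the function name `upload_localities` and the file name in the usage comment are hypothetical, and the server may not accept connections from every site.

```python
import os
from ftplib import FTP

HOST = "galaxy.cs.berkeley.edu"     # host named in Step Eight
REMOTE_DIR = "incoming/mvz/manis"   # directory named in Step Eight

def upload_localities(ftp, local_path, email):
    """Log in anonymously, change directory, and send one file in text mode."""
    ftp.login(user="anonymous", passwd=email)   # email address as the password
    ftp.cwd(REMOTE_DIR)
    remote_name = os.path.basename(local_path)
    with open(local_path, "rb") as handle:
        # storlines performs a line-oriented (text mode) transfer
        ftp.storlines("STOR " + remote_name, handle)
    return remote_name

# Usage (hypothetical file name; requires network access):
# with FTP(HOST) as ftp:
#     upload_localities(ftp, "my_localities.txt", "you@example.edu")
```

Passing the connection in as an argument keeps the function easy to try against a stand-in object before touching the real server.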
Notice that the MVZ has already laid claim to all California localities
(see MaNIS Georef. Checklist in Step 2). Try as you might, we will not
relinquish this claim! It is therefore incumbent upon each of you to
lay claim to an equally prestigious set of localities.
Those of you paying attention will realize that John is now in Argentina
for two months. He hoped to have the Error Calculator completed before
leaving. He did not. However, once completed, you will simply enter
your lat/long coordinates and it will do all the work of calculating the
error in those values for you-- so it is worth the wait. Go ahead and
start georeferencing now. You will soon be able to go back and fill in
the errors needed as he will post the calculator from the field.
I wish I had more to report on the status of your subcontracts, but I do
not. Some of you will be able to begin work regardless. The
bureaucracy has a timeline of its own. We simply have to proceed as best
we can in the meantime.
Please continue to address any questions or comments to the list.
Ready, set, georeference!
Best,
Barbara
>>> Posting number 105, dated 6 Nov 2001 09:51:19
>>> Posting number 106, dated 6 Nov 2001 09:00:24
>>> Posting number 107, dated 6 Nov 2001 12:24:23
>>> Posting number 108, dated 6 Nov 2001 14:29:22
>>> Posting number 109, dated 6 Nov 2001 16:52:12
>>> Posting number 110, dated 6 Nov 2001 16:06:24
Date: Tue, 6 Nov 2001 16:06:24 -0600
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: "Patricia W. Freeman" <pfreeman1@UNL.EDU>
Subject: Re: MaNIS--ready, set, georeference!
Comments: cc: hgenoways1@unl.edu
In-Reply-To: <4.2.2.20011106122240.00abdfb8@packrat.musm.ttu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Dear members of MaNIS-
I am actually out of your official MaNIS loop, but I have already
georeferenced Nebraska for mammals, birds, herps, and fish (over 60,000
specimens) and will probably do South Dakota as well. I salvaged 8,000
herps and about 1,500 mammals from USD about two years ago.
All four vertebrate groups are on our web page and searchable to county.
Although we have already georeferenced all four collections, the complete
localities will not be put on the webpage until next semester (I hope). My
computer expert, using the Texas Tech georeferencing idea, modified and
wrote a conversion program that changed all our geographic localities to
georeferenced localities.
We now have a large NT server that has the USGS maps and gazetteers on it.
Since Hugh Genoways is rewriting the Mammals of Nebraska and has already
started gathering specimens for that purpose, all mammals and mammal data
used for that study will be automatically georeferenced and those data will
accompany the loaned materials on return to their home institution. I
expect that he has or will contact most of you who have Nebraska material.
Regards-
Trish Freeman
PS. Can any of you direct me to FISHNET or BIRDNET if there are such
things? I am already involved with HERPNET, although I do not know what is
happening with it. Maybe someday we will have VERTNET.
Patricia W. Freeman
Professor/ Curator of Zoology
University of Nebraska State Museum
Lincoln NE 68588-0514
402-472-6606
402-472-8949 (fax)
Natural history museums archive biological diversity.
http://www-museum.unl.edu/research/zoology/zoology.html
>>> Posting number 111, dated 7 Nov 2001 09:09:31
>>> Posting number 112, dated 7 Nov 2001 08:32:12
>>> Posting number 113, dated 8 Nov 2001 14:03:13
>>> Posting number 114, dated 8 Nov 2001 14:39:28
Date: Thu, 8 Nov 2001 14:39:28 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: MaNIS--ready, set, georeference!
In-Reply-To: <3BE6274C.F9AC2E10@oz.net>
Mime-version: 1.0
Content-type: multipart/alternative;
boundary="MS_Mac_OE_3088075168_258732_MIME_Part"
Dear all,
1. I have Internet Explorer 5 for Macintosh on a G4. I haven't been able
to download records from the Manis website.
2. Our grant submission allotted funds to each institution based on their
records to be geo-referenced. Does committing to a state/province or region
change all of this?
3. The process has changed considerably between when our records were
downloaded for John and the ASM meeting. I thought that our records were
being submitted so that John would have a snapshot of what the different
databases looked like in order to design the Manis database. I had planned
to clear up any inconsistencies, spelling errors, etc in our localities
before we geo-referenced and downloaded to the Manis database. This seems
to make sense, since many errors in locality records can be cleared up only
with the use of in-house resources such as field notes and catalogs. Now we
are committing to a region and giving our best opinion on perceived errors
(to be noted in the Locality Annotation) to other institutions (and
ourselves!) for them to rectify (or not) at their leisure. Since I haven't
been able to download records, I don't know how much this new scheme will
save time overall or be more time consuming!
4. There are many localities that are designated unique that simply differ
in syntax, spelling, etc. They are not necessarily next to each other.
Would editing our own version of the database first for these errors and
then downloading them into the Manis database work?
Cheers,
XXXXXXXXX
>>> Posting number 115, dated 8 Nov 2001 21:20:18
Date: Thu, 8 Nov 2001 21:20:18 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: permutations on "unique" localities in the gazetteer
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Dear All: I was wondering about many of the same points that XXXX
XXXXXXXXX mentioned in his email of 8 Nov. Especially after perusing the
gazetteer and seeing many permutations on "unique" localities. E.g.,
localities like Seattle, 20 mi N, 20 mi N of Seattle, Seattle, 20 mi north,
and north of Seattle 20 miles, have to be allowed because of institutional
style or preference. However, an entry such as Seatle, 20 mi N could be
corrected. Each is a unique record to the computer and will receive the
same lat/long by georeferencers? Once georeferenced, the permutations can
be identified, but if localities are entered differently, how much
efficiency is gained by having one institution georeference all records for
a region vs having each georeference their own records? In addition when
a typo like Seatle is corrected, it no longer is unique but of the same set
as the correct spelling. The typos will be deleted from the static
gazetteer after determining that they were corrected in the institutional
database (see comment from Barbara below)? It is unclear to me how
corrections in institutional databases will be mirrored in the static
gazetteer.
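To make the point about permutations concrete, here is a rough sketch of how the style variants and the typo could be told apart by machine. It is not how the MaNIS gazetteer was built. The abbreviation table, stop-word list, threshold, and function names are all assumptions made for illustration.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical abbreviation table and stop words, for illustration only.
ABBREV = {"north": "n", "south": "s", "east": "e", "west": "w",
          "mile": "mi", "miles": "mi"}
STOP_WORDS = {"of"}

def locality_key(text):
    """Lowercase, drop punctuation and stop words, abbreviate, sort tokens."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    tokens = [ABBREV.get(t, t) for t in tokens if t not in STOP_WORDS]
    return " ".join(sorted(tokens))

def group_variants(localities):
    """Map each normalized key to the original strings that share it."""
    groups = defaultdict(list)
    for loc in localities:
        groups[locality_key(loc)].append(loc)
    return dict(groups)

def probably_same(key_a, key_b, threshold=0.9):
    """Flag two keys as likely typo variants using difflib's similarity ratio."""
    return SequenceMatcher(None, key_a, key_b).ratio() >= threshold

localities = [
    "Seattle, 20 mi N",
    "20 mi N of Seattle",
    "Seattle, 20 mi north",
    "north of Seattle 20 miles",
    "Seatle, 20 mi N",
]
groups = group_variants(localities)
```

With these inputs the four style variants share one key and the misspelling gets its own key, which `probably_same` then flags as a near match. Sorting tokens discards word order, so this is a crude heuristic: any grouping it suggests would still need a person to confirm it.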
Although the idea of compiling a static gazetteer of unique localities
seemed like a good idea at the beginning, it does not seem doable at this
point. I would prefer to go back to the original plan of each institution
dealing with their own records and offering assistance to others as needed.
Once georeferencing is started and we get $ for the servers, the
gazetteer could be produced dynamically, or at least by frequent uploads -
rather than statically - and can be consulted, updated, corrected, winnowed
as needed.
From 4 Nov email of Barbara:
...
Additional notes: 1) This gazetteer is a static snapshot of your data
compiled for the sole purpose of georeferencing unique localities.
Corrections to specific localities should be made directly in
institutional databases. They will not be made in the gazetteer so
don't spend time fixing them in the downloaded files.
...
>>> Posting number 116, dated 9 Nov 2001 08:57:34
Date: Fri, 9 Nov 2001 08:57:34 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: "Barbara R. Stein" <bstein@OZ.NET>
Subject: Re: MaNIS--ready, set, georeference!
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
> 1. I have Internet Explorer 5 for Macintosh on a G4. I haven't been able
> to download records from the Manis website.
XXXX et al.,
We are checking this out and, with luck, will have a fix today. In the
meantime, you can download from a Mac using Netscape.
> 2. Our grant submission allotted funds to each institution based on their
> records to be geo-referenced. Does committing to a state/province or
> region change all of this?
No it does not. It was presumed that, in most instances, the majority of
localities for a given state, and the geographic expertise and resources
to untangle geographic problems, would reside with the institution in that
state. Therefore, it made sense that we should work cooperatively to
georeference. Each institution naturally will have many other specimens
collected outside that state. Each can choose to do only its own
localities, thereby encouraging duplicate effort, or we can attempt a more
altruistic approach and gain economies of scale. If, after georeferencing
all of California, the MVZ looks at its remaining collections and sees that
it has a tremendous amount of material from Brazil, Peru and Argentina,
and recognizes that it also has more geographic expertise in these regions
than any of the other institutions (and presumably more maps, gazetteers,
etc.), then we are going to offer to do all localities from those
countries for the sake of efficiency and making the money go as far as
possible. In return, we know we will benefit from the Bishop Museum doing
our PNG material, of which we have a fair number of specimens. We could do
it, yes. But they can probably do it more quickly and easily. This
approach also allows those with an interest in a particular region of the
world to get a good handle on what exists in our joint collections and, I
suspect, reach some very interesting summaries about those regions and the
state of our knowledge of their mammalian fauna.
> 3. The process has changed considerably between when our records were
> downloaded for John and the ASM meeting.
No it has not. All of this was discussed online during the proposal
preparation process beginning more than a year ago.
> I thought that our records were being submitted so that John would have
> a snapshot of what the different databases looked like in order to design
> the Manis database.
That is also true. There were always two objectives in giving John your
data.
> I had planned to clear up any inconsistencies, spelling errors, etc in
> our localities before we geo-referenced and downloaded to the Manis
> database.
The time to have cleared up those problems was before the data were sent
to John. Since this approach was outlined in the first proposal submission
over a year ago, it should not have come as a surprise. The money we
receive from NSF was never intended to pay institutions to clean up their
locality records. It is to georeference those records.
> This seems to make sense, since many errors in locality records can be
> cleared up only with the use of in-house resources such as field notes
> and catalogs. Now we are committing to a region and giving our best
> opinion on perceived errors (to be noted in the Locality Annotation) to
> other institutions (and ourselves!) for them to rectify (or not) at their
> leisure.
Since you haven't started to georeference, you will have to take my word
that your fears are probably worse than reality. Truly erroneous
localities become obvious quite quickly and if they are not your own,
simply email a query to the institution to which that locality belongs.
Multiple versions of the same locality also jump out quickly. The
advantage of using a single individual to georeference a region is that
s/he quickly becomes familiar with the localities in that place. My own
personal suggestion is that each PI sit down with the data and try this
process him- or herself before hiring a student to really get going on it.
It will give you confidence and a much better feel for how it all works.
And, if you love maps like I do, it can actually be quite a seductive
exercise. Your problem will be to keep working and not to get distracted
by the geography and all the places you would like to collect, have
collected, etc. Perhaps the most difficult aspect is recognizing place
names that are no longer in use. Again, review the georeferencing
guidelines which remind you not to dwell on any single seemingly
intractable locality.
> 4. There are many localities that are designated unique that simply
> differ in syntax, spelling, etc. They are not necessarily next to each
> other. Would editing our own version of the database first for these
> errors and then downloading them into the Manis database work?
I don't believe so. As mentioned above, each institution has known about
this approach for more than a year and could have, in that time, chosen to
direct part of its routine curatorial effort to cleaning up localities in
its db. The final distributed db will have whatever corrected specific
localities get made during the georeferencing process. We were not given
money to clean up our localities. We received this money to georeference.
You are under no obligation to correct localities for other institutions.
You are merely being asked to georeference them. Even if related
localities do not fall out in line with one another in your downloaded
files, if one individual works on all the localities for a given region,
s/he will not have trouble recalling that a lat/long for a similar place
was assigned just two days ago and one can scroll up the list to find it.
I am sure John will want to add his own comments to what I have written.
He generally has access to email about once a week. In the meantime, I
will let you know as soon as we solve the download problem. That does not
have to wait for him.
Best, Barbara
>>> Posting number 117, dated 9 Nov 2001 09:28:19
Date: Fri, 9 Nov 2001 09:28:19 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: "Barbara R. Stein" <bstein@OZ.NET>
Subject: Re: permutations on "unique" localities in the gazetteer
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
> Each is a unique record to the computer and will receive the
> same lat/long by georeferencers?
Yes.
> Once georeferenced, the permutations can
> be identified, but if localities are entered differently, how much
> efficiency is gained by having one institution georeference all records for
> a region vs having each georeference their own records?
Please refer to my reply to XXXXXX.'s previous message on this issue. Having a
fair amount of experience doing georeferencing, the MVZ and other instigators
of this proposal believe strongly that much efficiency can be gained by a
cooperative approach. Proof of our commitment is that the MVZ has agreed to do
all California localities for this project even though we completed
georeferencing our own localities for many counties in the state more than a
year ago. We believe we can just do it more efficiently and more painlessly
than any of you folks can. Even LACM didn't fight us on this point. I can
change the oil in my car but...
> In addition when
> a typo like Seatle is corrected, it no longer is unique but of the same set
> as the correct spelling. The typos will be deleted from the static
> gazetteer after determining that they were corrected in the institutional
> database (see comment from Barbara below)?
No, the typos will not be deleted from the static gazetteer. The static
gazetteer exists simply as a way to unite all localities from our respective
dbs for georeferencing and then return the georeferenced locs to their
respective dbs.
> It is unclear to me how
> corrections in institutional databases will be mirrored in the static
> gazetteer.
I repeat-- corrections in institutional dbs will not be mirrored in the static
gazetteer. Rather, your efforts will be mirrored in the final product--a
geographic dictionary coupled with the distributed db network and GIS viewer.
Please review our NSF proposal.
> Although the idea of compiling a static gazetteer of unique localities
> seemed like a good idea at the beginning, it does not seem doable at this
> point.
It has been done, for the purpose it was designed to carry out.
> I would prefer to go back to the original plan of each institution
> dealing with their own records and offering assistance to others as needed.
That is not what was agreed to or specified in the proposal.
> Once georeferencing is started and we get $ for the servers, the
> gazetteer could be produced dynamically, or at least by frequent uploads -
> rather than statically - and can be consulted, updated, corrected, winnowed
> as needed.
And it will be. You are exactly right.
Best,
Barbara
>>> Posting number 118, dated 9 Nov 2001 14:20:26
>>> Posting number 119, dated 9 Nov 2001 14:57:01
Date: Fri, 9 Nov 2001 14:57:01 -0600
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Static Gazetteer
MIME-version: 1.0
Content-type: multipart/alternative;
boundary="Boundary_(ID_WoSRKrESJwWTVCyCL0UPxw)"
Dear All,
To add to my last message, I don't think the static gazetteer was a
surprise, rather the timing of it was. When I sent the TTU site data to
John early in the summer, I told him that we are in the middle of verifying
and correcting our database. (We have been working on checking and
correcting our database for nearly three years; I happily report that we
are all but done now.) At the time, I told John that the corrected data
were NOT what was being sent to him. He implied that this was okay and
that the static gazetteer would be created at a later time. However, I may
have misunderstood him. Now, it seems that several of us have data that we
are not comfortable with in the already compiled gazetteer.
I did understand that the NSF money was to meant to cover database
corrections, but I thought we'd begin georeferencing only after the data
had been corrected. I think we're all looking for ways to simplify the
process and having the idiosyncrasies of years of data entry already fixed
would greatly facilitate the process. Is there some way to address this
problem (uncorrected data in the gazetteer)? Or do we push ahead with the
gazetteer as it is? In my mind, going ahead with it as it is will create
some additional work for those doing the georeferencing (because of the
duplications), but it will create a great deal of additional work for each
institution as errors are corrected. In our case at TTU, we will have to
go through the gazetteer (once we get the georeferenced records back),
compare all those records to the file we just spent three years updating
and update the whole thing all over again. Remember that not all of the
corrections will be simple typos or punctuation problems. We're correcting
incorrect data as well (e.g., wrong county names entered). If we could
have the opportunity to update the gazetteer with corrected data before the
process is too far along, it would help considerably.
>>> Posting number 120, dated 9 Nov 2001 15:09:14
>>> Posting number 121, dated 9 Nov 2001 15:59:31
Date: Fri, 9 Nov 2001 15:59:31 -0600
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Correction
MIME-version: 1.0
Content-type: multipart/alternative;
boundary="Boundary_(ID_spZx6thUFA8HhEMCCdkxcQ)"
Correction to my last note: I did understand that NSF money was NOT to be
used to make corrections to the databases.
Sorry for the slip.
XXXX.
>>> Posting number 122, dated 9 Nov 2001 15:13:09
Date: Fri, 9 Nov 2001 15:13:09 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: "Barbara R. Stein" <bstein@OZ.NET>
Subject: Re: Static Gazetteer
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
> ... We're correcting incorrect data as well (e.g., wrong county names entered). If we could have the
> opportunity to update the gazetteer with corrected data before the process is too far along, it would
> help considerably.
XXXXXX,
I am very sympathetic to the argument you put forth and am quite sure I would be operating out of my
league if I were to speak for John on this issue. However, I would like to offer several thoughts--
First, an encouraging thought, with the caveat that John will surely correct me if I am wrong-- The
locality ID field in your downloaded files (the one you have been warned not to alter!) will be used to
reassociate the georeferenced data with the records in your dbs--regardless of the content of those
records. So do not despair if you have corrected some of your localities since you sent John the data.
This was to be hoped for and should not present a problem. If records did have erroneous data (like a
wrong county), these will likely be difficult to georeference on the first pass and may be skipped, but
they should be easy to deal with by the home institutions once all the data are returned and we each
look for remaining unreferenced localities in our own dbs.
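The reassociation described above is, in effect, a join on the locality ID that ignores the descriptive text. The following is a sketch under stated assumptions and not John's actual procedure. The field names (`locality_id`, `locality`, `lat`, `lon`), the function name, and the coordinates are hypothetical.

```python
def reassociate(institution_records, georeferenced_rows):
    """Attach coordinates to records by locality ID, ignoring the text fields."""
    coords = {row["locality_id"]: (row["lat"], row["lon"])
              for row in georeferenced_rows if row["lat"] is not None}
    matched, unreferenced = [], []
    for rec in institution_records:
        if rec["locality_id"] in coords:
            lat, lon = coords[rec["locality_id"]]
            matched.append({**rec, "lat": lat, "lon": lon})
        else:
            unreferenced.append(rec)
    return matched, unreferenced

# The institution fixed a typo after the snapshot was taken; the ID is unchanged.
institution = [
    {"locality_id": 17, "locality": "Seattle, 20 mi N"},
    {"locality_id": 18, "locality": "locality with a corrected county"},
]
# Rows returned from georeferencing: made-up coordinates, one row skipped.
returned = [
    {"locality_id": 17, "locality": "Seatle, 20 mi N", "lat": 47.9, "lon": -122.3},
    {"locality_id": 18, "locality": "locality with a wrong county", "lat": None, "lon": None},
]
matched, unreferenced = reassociate(institution, returned)
```

Because the match uses only the ID, the corrected spelling in the institutional record is kept, and the skipped locality shows up in `unreferenced` for the home institution to handle.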
Second, we have committed to quite a large project over the course of three years and it is imperative
that we start working ASAP. It is simply not possible to delay georeferencing while each collection
takes time to verify and correct its locality data. Have the majority of collections made substantive
changes/corrections to their locality data since those data were sent to John? I don't know, but I
suspect the majority has not, even though we are all continually cleaning up our data on a daily basis.
So how long do we wait? Despite the fact that you have not received your money, we are already two
months into this project. We need to begin work. It could also be argued that we should delay because
of all the new specimens that have been entered into our dbs since the data were sent to John.... At
some point we must draw the line.
What I ask is that each institution lay claim to a set of localities, that they download those data, and
then spend a bit of time examining what's really there. Begin georeferencing. Become familiar with the
process we've outlined. It may be slow going initially, but as with all new techniques, it will become
quicker and easier with practice.
I sincerely regret any misunderstandings that may have occurred. It is important to keep communicating
and I thank you for your contributions.
Best, Barbara
>>> Posting number 123, dated 9 Nov 2001 16:10:16
Date: Fri, 9 Nov 2001 16:10:16 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: "Barbara R. Stein" <bstein@OZ.NET>
Subject: alternative download method
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
XXXX et al.,
Beneath the "Download" button there is now an alternative option for
those who may have experienced problems. Click on the link that says
"Alternate download method is here." A text file with the data should
display in the browser window. Go to the "File" menu and select "Save
As..." to save the file on your computer. Then open Excel and import
the file.
Best,
Barbara
>>> Posting number 124, dated 15 Nov 2001 08:18:59
Date: Thu, 15 Nov 2001 08:18:59 -0800
Reply-To: bstein@oz.net
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: Barbara Stein <bstein@OZ.NET>
Subject: downloading problems solved
Dear All,
I believe that the problems some individuals were having with downloading
locality data are now solved.
For those using IE on a Mac, an alternative download button has been added with
instructions. Click to download after viewing the list of specific localities
that result from your search and you will see the alternative option beneath
the original download button.
There is also no longer a problem with downloading large numbers of records
(e.g., >8500) so I hope you will feel emboldened.
Remember, the downloaded files need to be imported into your spreadsheet of
choice before you will see the headers and the data lined up in a way that
makes sense to you. Do not attempt to simply work with the downloaded files as
is.
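For those who prefer a script to a spreadsheet import, the same idea can be sketched with Python's `csv` module. The delimiter (tab) and the column names in the sample are assumptions; check them against the field list on the web site before relying on this.

```python
import csv
import io

def read_download(handle, delimiter="\t"):
    """Parse a delimited text download into a list of dicts keyed by header."""
    return list(csv.DictReader(handle, delimiter=delimiter))

# Hypothetical two-column sample standing in for a real download.
sample = io.StringIO("LocalityID\tSpecificLocality\n101\tSeattle, 20 mi N\n")
rows = read_download(sample)

# With a real file (hypothetical name):
# with open("manis_download.txt", newline="") as handle:
#     rows = read_download(handle)
```

Parsing by delimiter keeps commas inside a locality string from splitting the field, which is the usual cause of headers and data failing to line up.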
Lastly, the subcontract budgets have been set up and are in the hands of
Berkeley's SPO. It is up to that office to notify your SPOs that the money is
available. It is out of the MVZ's control at this point.
Best,
Barbara
>>> Posting number 125, dated 15 Nov 2001 11:09:32
>>> Posting number 126, dated 16 Nov 2001 07:38:49
Date: Fri, 16 Nov 2001 07:38:49 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Collaborative Georeferencing Theory II
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Dear all,
In this message I am responding to the discussion begun by XXXXXXXXX
on 8 Nov and continued by XXXXXXXXX. I will refer to both of their
messages herein. I realize that Barbara has already answered these points
while I was out contracting chilblains in the Patagonian wind, but it may
be a comfort to some to see the extent to which we are in agreement
without having had the benefit of communicating.
XXXXXXXX said...
[
2. Our grant submission allotted funds to each institution based on their
records to be geo-referenced. Does committing to a state/province or
region change all of this?
]
-------------------
No. Funding was based on the number (and difficulty) of the localities in
your collection that need to be georeferenced. In theory, if everyone does
the amount of georeferencing for which they were funded at the speeds we
deduced from experience, then all of the localities without coordinates
will be georeferenced under the funding we were given. In order to take
advantage of the pooling of like localities (i.e., those in the same area
on the map regardless of their source institution) we need to have people
commit to geographic areas that best suit them. Suitability includes not
only geographic areas of interest and of expertise, but also of scope. For
example, if I am institution X, given funding for 10 weeks of
georeferencing, then committing to a geographic area that will take 20
weeks to georeference may be good citizenship, but it is not good
finance. Basically, spend as many weeks on georeferencing as you are
listed for in the NSF Project Description. Details on georeferencing rates
(i.e., localities per hour for different classes of geography) were given
in the Project Implementation section of the NSF Project Description. If
you need to estimate what you are committing to in terms of time, read
that section. It will probably be worthwhile for everyone to monitor
his/her georeferencing rates. If your rates are significantly different
from those projected, send a message to the list. If you are going a lot
faster, we want to know how you're doing it. If you're going a lot slower,
maybe we can help increase your efficiency.
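As a worked example of the arithmetic, the time a claimed area will take can be estimated from locality counts and georeferencing rates. The rates, locality classes, counts, and the 40-hour week below are hypothetical placeholders. The real rates are in the Project Implementation section of the NSF Project Description.

```python
HOURS_PER_WEEK = 40  # assumption for illustration

def weeks_needed(counts, rates_per_hour, hours_per_week=HOURS_PER_WEEK):
    """Estimate weeks of work from locality counts and per-class hourly rates."""
    hours = sum(counts[cls] / rates_per_hour[cls] for cls in counts)
    return hours / hours_per_week

# Hypothetical rates (localities per hour) and a hypothetical claimed area.
rates = {"US": 20.0, "non-US": 10.0}
claim = {"US": 4000, "non-US": 2000}

funded_weeks = 10
needed = weeks_needed(claim, rates)   # 200 h + 200 h = 400 h, i.e. 10 weeks
within_budget = needed <= funded_weeks
```

Running the same calculation with your observed rates instead of the projected ones shows quickly whether a claim is drifting past the weeks you were funded for.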
-------------------
XXXXXXXX said...
[
3. The process has changed considerably between when our records were
downloaded for John [W.] and the ASM meeting. I thought that our records
were being submitted so that John [W.] would have a snapshot of what the
different databases looked like in order to design the Manis database.
]
-------------------
The last point is true, but it is not the only reason I gathered the
data. Following is an excerpt from the original message from Barbara Stein
asking that data be sent to John W.:
"NOTE: The data you send him will not be distributed in any way, shape,
or form; he will do nothing more than examine it and compare the structure
and general content of the files and then use this data to make the
initial global locality file that will be available for general
reference. This is extra work that is being done on MVZ's nickel, but
something we feel will keep this project on track and give you the most
bang for your buck."
At that time we already knew we would use a combined locality gazetteer;
it just had not yet been clearly stated how we would use it.
By the time of the ASM meeting I had almost finished the gazetteer and
its purpose was more definitively stated. Following is a quote from the
ASM 2001 meeting notes:
"While John [W.] begins work on developing the network, participants will
begin georeferencing. This is why John [W.] asked for your data. From those
data he will create a combined snapshot of unique localities, which will be
used for georeferencing."
-------------------
XXXXXXX said...
[
I had planned to clear up any inconsistencies, spelling errors, etc in our
localities before we geo-referenced and downloaded to the Manis
database. This seems to make sense, since many errors in locality records
can be cleared up only with the use of in-house resources such as field
notes and catalogs. Now we are committing to a region and giving our best
opinion on perceived errors (to be noted in the Locality Annotation) to
other institutions (and ourselves!) for them to rectify (or not) at their
leisure. Since I haven't been able to download records, I don't know how
much this new scheme will save time overall or be more time consuming!
]
and XXXXXXX said...
[
Dear All: I was wondering about many of the same points that XXXX
XXXXXX mentioned in his email of 8 Nov. Especially after perusing the
gazetteer and seeing many permutations on "unique" localities. E.g.,
localities like Seattle, 20 mi N, 20 mi N of Seattle, Seattle, 20 mi
north, and north of Seattle 20 miles, have to be allowed because of
institutional style or preference. However, an entry such as Seatle, 20
mi N could be corrected. Each is a unique record to the computer and will
receive the same lat/long by georeferencers? Once georeferenced, the
permutations can be identified, but if localities are entered
differently, how much efficiency is gained by having one institution
georeference all records for a region vs having each georeference their
own records?
]
-------------------
First, it would be nice if we each had clean and consistent data in our
databases. We don't. We vary greatly in how close we are to achieving that
aim, not only in terms of the raw amount of cleaning to do, but especially in
how long it would take each of us to do it. For this reason we cannot wait
for localities to be cleaned up before we start georeferencing.
Second, NSF provided funds to georeference localities, not to clean up
existing data. Nor did our methods and time estimates in the NSF proposal
depend on "clean" localities. I agree that it would be more efficient to
georeference ALREADY clean localities, but it is faster to georeference
them as they are than it is to clean them up and then georeference them.
Third, in answer to XXXX's last question, the methods presented in our
proposal have been tested and shown to be much more efficient than the
alternative of having each institution georeference only its own
localities. Forgive my digression into a lengthy answer, but this is an
extremely important matter.
The concept of uniqueness is, as XXXX points out, defined by the
computer's ability to distinguish one locality from another. Thus, "20 mi
N of Seattle" is a different record from "Seattle, 20 mi N." Furthermore,
there might be two localities "20 mi N of Seattle", one for UWBM and one
for PSM. There are several reasons for keeping these separate, the most
obvious and important of which is to be able to identify from which
institution a locality description came. So, with the MaNIS gazetteer I've
basically given everyone a list of their unique localities, but you could
each have done that yourselves. The real purpose behind the gazetteer is
to combine localities for all institutions by geographic regions. By far
the most time-consuming aspect of georeferencing is finding places on a
map. Thus, it behooves you to assemble localities that are likely to be in
roughly the same place and then find them on a map all at once. Once you
are on the right map you can get coordinates for all of the localities in
that area. So, suppose I have downloaded localities for which the county
is "Kern." At the top of my list of localities for Kern County is one from
UWBM that says "Bakersfield, 10 mi E; Rattlesnake Grade." I see that the
named place is Bakersfield, so I filter my Kern County records to show me
only those which contain the word "Bakersfield." It turns out that in Kern
County there are 117 localities from 10 institutions that mention
"Bakersfield." I get out my map of the Bakersfield area and start looking
for "Rattlesnake Grade." I can't find it on my map right away so I'm going
to skip this locality for the moment. The next twelve localities on my
list are from six different institutions, but they all have some variation
on "3 mi E of Bakersfield." I find this location on my map once, get the
coordinates and copy them to all twelve localities that match this
place. The next locality on my list is from MVZ and it says "Bakersfield,
6 mi N, 9 mi E; Rancheria Road (Rattlesnake Grade)." Oh, so that's where
Rattlesnake Grade is - on Rancheria Road. Now I can go figure out that
first locality, which I skipped at first.
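The lookup pattern in the Bakersfield example can be sketched in Python.
This is only an illustration under stated assumptions: the sample rows,
the function names, and the coordinate values are hypothetical
placeholders, not MaNIS data or MaNIS tools. It shows filtering one
county's localities by a named place, then copying one measured position
to every variant the georeferencer judges to be the same place.

```python
# Hypothetical sample rows: (institution, county, locality description).
LOCALITIES = [
    ("UWBM", "Kern", "Bakersfield, 10 mi E; Rattlesnake Grade"),
    ("MVZ", "Kern", "3 mi E of Bakersfield"),
    ("PSM", "Kern", "Bakersfield, 3 mi E"),
    ("MVZ", "Kern", "Bakersfield, 6 mi N, 9 mi E; Rancheria Road (Rattlesnake Grade)"),
    ("MVZ", "Inyo", "Bishop, 2 mi W"),
]


def mentions(rows, county, text):
    """Rows in one county whose description contains the text (case-insensitive)."""
    return [r for r in rows if r[1] == county and text.lower() in r[2].lower()]


def copy_coordinates(rows, lat, lon):
    """Attach one measured coordinate pair to every row judged to be the same place."""
    return [(inst, county, desc, lat, lon) for inst, county, desc in rows]


bakersfield = mentions(LOCALITIES, "Kern", "Bakersfield")
rattlesnake = mentions(LOCALITIES, "Kern", "Rattlesnake Grade")

# The georeferencer, not the code, decides which variants are the same place.
three_east = [r for r in bakersfield
              if r[2] in ("3 mi E of Bakersfield", "Bakersfield, 3 mi E")]
georeferenced = copy_coordinates(three_east, 35.0, -119.0)  # placeholder values
```

Searching on "Rattlesnake Grade" returns both the UWBM and MVZ rows, which
is how the second description can help locate the first.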
So, to answer XXXX's last question again, there are multiple ways in which
the combined localities aid in the overall efficiency of the
georeferencing process. From the illustrative example above, only the MVZ
had to possess the Kern County map; nobody had to go out and buy one. Only
one person had to find Bakersfield on a map, rather than one person from
each of the ten institutions that had localities from that area. It was
possible to find Rattlesnake Grade for all localities that mentioned it,
not just for the one that also happened to locate it on Rancheria Road. It
might not otherwise have been possible to georeference this locality or
maybe the error would have been much greater than it needed to be. The
single locality 3 mi E of Bakersfield could be found and measured once and
the results copied to all twelve localities that were really the same
place. While the foregoing is all well and good in theory, empirical
testing at the MVZ backs it up with hard numbers. Georeferencing rates
doubled when localities from three collections were combined versus when
they were done separately. Further increasing the number of collections
will result in even greater efficiency.
Now let me go back and address part of XXXX's comment that I have
neglected thus far.
XXXXXXXX said...
[
"Now we are committing to a region and giving our best opinion on
perceived errors (to be noted in the Locality Annotation) to other
institutions (and ourselves!) for them to rectify (or not) at their
leisure."
]
-------------------
I'm not sure what XXXX's point is here, but I'll try to explain the
Locality Annotation again. Locality Annotation is one of the fields in the
downloaded locality data. This field is provided as a courtesy to alert
the institution that provided a locality that there is something
inconsistent about it. It's not meant to be filled with opinions on
perceived errors, it is meant to note definitive inconsistencies. For
example, if I get a locality in the downloaded file for Inyo County that
says "Bakersfield", then there is a problem with the locality. It's not an
opinion, and it isn't a perceived error; it is simply true that
Bakersfield is not in Inyo County. It's up to me as the georeferencer to
decide whether this is enough of a problem to not georeference the
locality. In this particular case I could either choose to georeference
the locality, because I know that Bakersfield is in Kern County, or I
could choose not to georeference it simply because I'm doing Inyo County
and Bakersfield is out of my "jurisdiction." I would take the latter
option not because I'm a stickler for boundaries, but because
I'd have to go get another map and that would waste time. It might be
better to leave some inconsistent localities until later. Nevertheless,
since I've spent the energy to figure out that there is a problem with the
locality, I might as well extend the courtesy of noting what the problem
is. It'll save time for someone else later on. It is this philosophy that
led me to include the NoGeorefBecause field in the download as well. If
I'm able to determine that a locality cannot be georeferenced, I might as
well say so, and why, so that the next person who sees that this locality
doesn't have coordinates will not bother to try to determine them.
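A minimal Python sketch of the kind of definitive inconsistency the
Locality Annotation is meant to record. The place-to-county lookup, the
record layout, and the function name are assumptions made for
illustration; only the field name LocalityAnnotation comes from the
download file described above.

```python
# Hypothetical lookup of named place -> county. In practice this might be
# built from a gazetteer; here it holds only the example from the text.
PLACE_COUNTY = {"Bakersfield": "Kern"}


def annotate(record):
    """Return a copy of the record, noting any place/county conflict found."""
    out = dict(record)
    for place, county in PLACE_COUNTY.items():
        in_text = place.lower() in record["locality"].lower()
        if in_text and record["county"] != county:
            out["LocalityAnnotation"] = (
                f"{place} is in {county} County, not {record['county']} County"
            )
    return out


flagged = annotate({"county": "Inyo", "locality": "Bakersfield"})
clean = annotate({"county": "Kern", "locality": "Bakersfield, 3 mi E"})
```

The sketch only records the inconsistency. Whether to georeference the
locality anyway remains the georeferencer's decision.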
-------------------
XXXXXXXX said...
[
4. There are many localities that are designated unique that simply
differ in syntax, spelling, etc. They are not necessarily next to each
other. Would editing our own version of the database first for these
errors and then downloading them into the Manis database work?
]
-------------------
Yes. In theory it could work, but it is not practical. In addition to the
reasons I gave above, this kind of activity would take a great deal of my
time, which I hope you would agree could be better spent on other things.
-------------------
XXXXXXX said...
[
In addition when a typo like Seatle is corrected, it no longer is unique
but of the same set as the correct spelling. The typos will be deleted
from the static gazetteer after determining that they were corrected in
the institutional database (see comment from Barbara below)? It is unclear
to me how corrections in institutional databases will be mirrored in the
static gazetteer.
The comment from Barbara was...
[
...
Additional notes: 1) This gazetteer is a static snapshot of your data
compiled for the sole purpose of georeferencing unique localities.
Corrections to specific localities should be made directly in
institutional databases. They will not be made in the gazetteer so
don't spend time fixing them in the downloaded files.
...
]
-------------------
XXXX's question is well founded. I have nowhere yet described what will
happen to the georeferenced localities. I'll try now to clear up this part
of the grand scheme. I've already explained that I would like the
georeferenced localities to be sent back to me so that I can proof them,
load them back into the gazetteer, and keep a running status of the
georeferencing aspect of the project. In principle, you could download
sets of georeferenced localities for your institution at any time and load
them into your own database. But that isn't the most efficient way to go
about the problem. It would be better to wait until all georeferencing is
done, then download all localities for your institution and create the
lat_long records for them all at once, with my help, if necessary. Note
that I am not explaining how to create the lat_long records or how to
incorporate them in your database. The reason is that (almost) everyone's
database structure is different from everyone else's, so there is no one
single solution to fit all. That's why I offer my help to get these data
back into your databases, but I can only afford to do it one time for each
institution that needs it.
Now back to XXXX's question. Changes in your databases will not be
mirrored in the static gazetteer. There will be no changes whatsoever to
localities in the static gazetteer, as per Barbara's additional notes. If
you correct typographical errors in your database it will not affect the
georeferencing process. If you make a substantive change to a locality
(one that would affect how the locality is georeferenced), then there will
be an easily discernible discrepancy that can be resolved at the time when
lat_longs are incorporated into your database. Nevertheless, the more
changes you make to your localities during the georeferencing period, the
more work you will potentially create for yourself later.
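One way such discrepancies might be found when lat_longs are brought back
into a local database is a plain comparison of the snapshot text against
the current text. The Python sketch below assumes each locality has a
stable identifier shared by both copies; that identifier, the sample data,
and the function name are hypothetical.

```python
def changed_localities(snapshot, current):
    """Return ids whose text differs from the snapshot or is gone from the database."""
    return sorted(loc_id for loc_id, text in snapshot.items()
                  if current.get(loc_id) != text)


# Hypothetical data: id -> locality description.
snapshot = {1: "Seatle, 20 mi N", 2: "Bakersfield, 3 mi E", 3: "Bishop"}
current = {1: "Seattle, 20 mi N", 2: "Bakersfield, 3 mi E"}

to_review = changed_localities(snapshot, current)
```

A text comparison flags typographical corrections as well as substantive
changes, so each flagged locality still needs a person to decide whether
the change affects the georeference.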
-------------------
XXXXXX said...
[
Although the idea of compiling a static gazetteer of unique localities
seemed like a good idea at the beginning, it does not seem doable at this
point. I would prefer to go back to the original plan of each institution
dealing with their own records and offering assistance to others as
needed. Once georeferencing is started and we get $ for the servers, the
gazetteer could be produced dynamically, or at least by frequent uploads -
rather than statically - and can be consulted, updated, corrected,
winnowed as needed.
]
I hope I've done something to counter the above sentiment. Let me add
another note about the static gazetteer. It is an interim tool intended to
help us divide up the georeferencing responsibilities and to monitor
georeferencing progress. Your databases are not static. Yet, to function
effectively, we need a fixed target. The real end product of this endeavor
will include a dynamic gazetteer that will be drawn from the
continually-updated locality data contained in the participating
databases. At that point, when you add new data, or change existing data,
it will be reflected in the dynamic gazetteer without intervention.
I hope this clarifies the reasoning behind our approach to
georeferencing. Considerable thought and effort have gone into
establishing and testing the methods set forth here and elsewhere in the
MaNIS documents. Barbara and I remain convinced that this is the most
reasonable approach to an otherwise daunting task.
John W.
>>> Posting number 127, dated 16 Nov 2001 11:51:55
Date: Fri, 16 Nov 2001 11:51:55 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: Collaborative Georeferencing Theory II
In-Reply-To: <Pine.GSO.4.21.0111160737280.29268-100000@socrates.Berkeley.EDU>
Mime-version: 1.0
Content-type: text/plain; charset="US-ASCII"
Content-transfer-encoding: 7bit
John,
My intention was never to clean up our locality data with geo-referencing
funds! I was operating on the assumption that we would be responsible for
our own data and therefore it would have been worthwhile to clean it up on
our own dime before geo-referencing. Which gets to another question. I
have cleaned up localities in our database since downloading it to you. Is
this going to cause problems in downloading the newly geo-referenced
localities from MANIS into our current database? Can I continue to clean up
our own database? Did I understand you correctly when you said to leave
localities that have lat/long alone? The reason I ask is that I noticed
that when you transferred our lat/long to the Manis database, the minutes
were incorrectly interpreted as decimal degrees. Should I worry about this?
Will we have to change our database to accept decimal degrees? I appreciate
your thorough responses. I am trying to clarify and simplify our tasks.
That is my bottom line.
Cheers, XXXX
PS I didn't put this on the site, because I am seeking clarity not a debate.
>>> Posting number 128, dated 16 Nov 2001 15:59:44
>>> Posting number 129, dated 17 Nov 2001 12:55:06
Date: Sat, 17 Nov 2001 12:55:06 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: Collaborative Georeferencing Theory II
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Dear All: Thanks to John W. for the overview and examples. In summary, we
are georeferencing unique geographical entries rather than unique
localities. Unique can be a function of geography, institutional acronym,
syntax, typos, punctuation and errors. The goal is clearer.
XXXXXX
>>> Posting number 130, dated 17 Nov 2001 12:55:48
Date: Sat, 17 Nov 2001 12:55:48 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: "Barbara R. Stein" <bstein@OZ.NET>
Subject: Re: Questions about Georeferencing
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
XXXXXXXX wrote:
> Thanks for all of the great georeferencing information, steps, and
> guidelines! XXXXXXX and I have been familiarizing ourselves with the
> guidelines, steps and very helpful weblinks. We downloaded the Ingham
> County (Michigan) records into the Access template, and I feel that this
> county is a comfortable starting place for us (it is our institution's
> county).
Go for it! Starting is half the battle.
> Before we begin, I would appreciate clarification on a couple of items.
> Thank you for your time.
As always, I will provide my thoughts and John will weigh in when he's next
online.
> 1) Is it okay to use available "online" latitude and longitude
> coordinates, as long as Datum information, etc. are available?
Yes. Just make sure you specify the source of those coordinates in the
designated field on your spreadsheet.
> For example, the Township, Range, Section Information website
> (http://www.esg.montana.edu/gl/trs-data.html.) that is listed in the MaNis
> Georeferencing Guidelines has links whereby one can search for a named
> place, and the decimal degrees coordinates (to four decimal places) come up
> for that place (example, City of Mason, Michigan). Is it okay to use such
> on-line coordinates for georeferencing place names, or should all
> georeferencing be done with "hard copy" references?
We encourage you to take advantage of all available tools, that's why we
provided those URLs. There may be others as well. Just make sure your sources
are credible.
> 2) If the answer to the above question is that all georeferencing should
> be done with "hard copy" references, then ignore this one.
>
> A related question to 1): from the same website mentioned above, one can
> link to "TerraServer" and get (really interesting) aerial photos of places.
> With the aid of a labelled map, one can zoom in and find specific
> buildings (such as the Michigan State University Swine Barn - a real Ingham
> County example). From a zoomed aerial image, you can click on "Image Info"
> and get lat and long (non-decimal) coordinates for "tiles" (corners of
> squares) surrounding the image. Datum information is included in "Image
> Info".
>
> So my question is, is it okay for us to use these types of on-line aerial
> images for georeferencing?
I'm including this question just for completeness. The answer is, of course,
yes. And remember, do not worry about the type of coordinate data you record.
The error calculator will be able to convert data provided in any format (e.g.,
deg, min, sec; dec. degrees; etc.) into any other format. Knowing the datum,
providing the source of your coordinates, and noting any assumptions you have
made in assigning those coordinates are what's crucial.
> 3) With regard to the "DeterminedDate" data field in the download file -
> is there a specific format for the date data (i.e., MM/DD/YYYY or DD Month
> Spelled Out YYYY) that you would like us to use?
No, because most spreadsheet programs will dictate a format. It seemed
worthless for us to specify one. John will have to deal with that variety
later.
Best,
Barbara
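Dealing with mixed DeterminedDate formats later could look something like
the following Python sketch. The list of formats is an assumption based
only on the examples in the question, and the function name is
hypothetical.

```python
from datetime import date, datetime

# Candidate formats; an assumption, not a MaNIS specification.
FORMATS = ("%m/%d/%Y", "%d %B %Y", "%Y-%m-%d")


def parse_determined_date(text):
    """Try each known format in turn; return a date, or None if none match."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None
```

Note that a value such as 03/04/2001 is ambiguous between MM/DD and DD/MM.
This sketch always reads it as MM/DD, so the convention used by each
contributor would still need to be known.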
>>> Posting number 131, dated 17 Nov 2001 14:01:10
Date: Sat, 17 Nov 2001 14:01:10 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: download of GNIS dataset
Comments:
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
GNIS locality datasets for states can be downloaded from:
http://mapping.usgs.gov/www/gnis/gnisftp.html
The dataset for Washington consisted of 32K+ localities and Oregon had
50K+. Both loaded into Excel without problems (after unzipping), and
provide a good start on an authority file for locations + lat/longs. I
wish I had it back when we originally entered our data. Locations can be
found with a search or scrolling in Excel, or by loading into a database
program. As long as you don't need a map, lookup on the downloaded file is
faster than via the GNIS webpage. The downloaded file also has lat/longs
as decimals, which don't appear to be accessible on the GNIS webpage.
These can be entered into two fields of MaNIS with a copy/paste rather than
parsing or typing the dddmmss + direction string into the eight fields
required for ddd, mm, entry.
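For anyone who does need to parse the packed string rather than paste the
decimal columns, a minimal Python sketch follows. It assumes the string is
degrees, two-digit minutes, two-digit seconds, and a trailing N, S, E, or
W (for example 473000N). That layout and the function name are
assumptions, so check them against the downloaded file before relying on
this.

```python
def dms_string_to_decimal(value):
    """Convert 'ddmmssN' or 'dddmmssW' style text to signed decimal degrees."""
    value = value.strip()
    direction = value[-1].upper()
    digits = value[:-1]
    seconds = int(digits[-2:])
    minutes = int(digits[-4:-2])
    degrees = int(digits[:-4])
    decimal = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative in decimal degrees.
    return -decimal if direction in ("S", "W") else decimal
```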
>>> Posting number 132, dated 19 Nov 2001 07:53:03
>>> Posting number 133, dated 20 Nov 2001 10:41:10
>>> Posting number 134, dated 20 Nov 2001 10:57:38
>>> Posting number 135, dated 20 Nov 2001 18:52:31
Date: Tue, 20 Nov 2001 18:52:31 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Vertical Datum?
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Dear Barbara,
Thanks for your reply to my earlier message. I have another question for
both you and John:
Do we need to note the "Vertical Datum" if one is provided on a map source?
One of the Michigan USGS maps that I looked at this week had the following:
Horizontal Datum: NAD1927
Vertical Datum: NGVD 1929
Also, it looks like we'll be using Topozone
(www.topozone.com/findplace.asp) for georeferencing some of the Michigan
localities (just point the cursor anywhere on the map and the coordinates
of choice [UTM, lat-long decimal, or lat-long degrees and minutes] appear
on the lower part of the screen).
XXXXXXXX
>>> Posting number 136, dated 23 Nov 2001 10:20:39
Date: Fri, 23 Nov 2001 10:20:39 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Collaborative Georeferencing Theory II
In-Reply-To: <B81AAE5B.EB7%jrozdil@u.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
On Fri, 16 Nov 2001, John Rozdilsky wrote:
> John,
>
> My intention was never to clean up our locality data with geo-referencing
> funds! I was operating on the assumption that we would be responsible for
> our own data and therefore it would have been worthwhile to clean it up on
> our own dime before geo-referencing. Which gets to another question. I
> have cleaned up localities in our database since downloading it to you. Is
> this going to cause problems in downloading the newly geo-referenced
> localities from MANIS into our current database? Can I continue to clean up
> our own database?
XXXX and all,
There has been some confusion with respect to localities,
lat_longs, higher geographies, and the means by which data get back into
your local databases. I have neglected the discussion so far in favor of
getting people working, but clearly there is a great deal of anticipation
on the subject. I'll explain this stuff in detail on my trip into town
next week.
In the meantime, continue as you were. If you are in the midst of cleaning
up locality data and have a good reason to continue doing so at the
moment, go ahead. If you weren't cleaning up locality data, don't do so
for the sake of MaNIS.
>Did I understand you correctly when you said to leave
> localities that have lat/long alone? The reason I ask is that I noticed
> that when you transferred our lat/long to the Manis database, the minutes
> were incorrectly interpreted as decimal degrees. Should I worry about this?
It seems I have misinterpreted your latitude and longitude data, is that
correct? The original data should be ddmmss, not dd.dddd? Is this true of
all lat_long entries? If so, then I need to update the gazetteer with the
correct data. I can do this from here in Argentina, but I'll have to do it
the next time I come to town. You were right to worry about this. Even
though we don't have to georeference those localities that already have
coordinates (at least not in the first pass), we do want to be able to use
them for reference, so they should be made correct. It's probably a good
idea if every institution that provided some lat_long data do a little bit
of double checking to see if I've made the correct interpretation of your
data. If I made one mistake, I certainly am capable of making others.
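The size of this kind of misreading can be shown with a small Python
sketch. It assumes, purely for illustration, a value written as degrees
followed by two-digit minutes (47.30 meaning 47 degrees 30 minutes); the
actual layout of any institution's data is not given here. The figure of
111 km per degree of latitude is an approximation.

```python
KM_PER_DEGREE_LAT = 111.0  # approximate length of one degree of latitude


def ddmm_to_decimal(value_str):
    """Read 'dd.mm' (degrees, then two-digit minutes) as decimal degrees."""
    text = value_str.strip()
    sign = -1 if text.startswith("-") else 1
    deg_str, min_str = text.lstrip("+-").split(".")
    return sign * (int(deg_str) + int(min_str) / 60)


misread = float("47.30")            # treated as decimal degrees: 47.3
correct = ddmm_to_decimal("47.30")  # 47 degrees 30 minutes: 47.5
error_km = abs(correct - misread) * KM_PER_DEGREE_LAT
```

Under these assumptions the two readings differ by 0.2 degrees of
latitude, or roughly 22 km, which is large enough to make the existing
coordinates misleading as reference points.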
> Will we have to change our database to accept decimal degrees? I appreciate
> your thorough responses. I am trying to clarify and simplify our tasks.
> That is my bottom line.
You will not have to make changes in your database to accept decimal
degrees. You can use whatever coordinate system you like locally, and I
can give you your data in that format when it comes time to download data
from the gazetteer into your database.
For better or worse, I have been trying to simplify explanations -
sometimes at the expense of explaining the complete plan. I guess it's
turning out OK though, because all of the right questions are being asked.
John W.
>>> Posting number 137, dated 23 Nov 2001 10:24:34
Date: Fri, 23 Nov 2001 10:24:34 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Questions about Georeferencing
In-Reply-To: <3BF6CED4.5296BDB@oz.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Dear All,
Barbara has answered everything below perfectly well. I'm just "weighing
in" to say so.
On Sat, 17 Nov 2001, Barbara R. Stein wrote:
> XXXXXXXX wrote:
>
> > Thanks for all of the great georeferencing information, steps, and
> > guidelines! Robin Bolig and I have been familiarizing ourselves with the
> > guidelines, steps and very helpful weblinks. We downloaded the Ingham
> > County (Michigan) records into the Access template, and I feel that this
> > county is a comfortable starting place for us (it is our institution's
> > county).
>
> Go for it! Starting is half the battle.
>
> > Before we begin, I would appreciate clarification on a couple of items.
> > Thank you for your time.
>
> As always, I will provide my thoughts and John will weigh in when he's next
> online.
>
> > 1) Is it okay to use available "online" latitude and longitude
> > coordinates, as long as Datum information, etc. are available?
>
> Yes. Just make sure you specify the source of those coordinates in the
> designated field on your spreadsheet.
>
> > For example, the Township, Range, Section Information website
> > (http://www.esg.montana.edu/gl/trs-data.html.) that is listed in the MaNis
> > Georeferencing Guidelines has links whereby one can search for a named
> > place, and the decimal degrees coordinates (to four decimal places) come up
> > for that place (example, City of Mason, Michigan). Is it okay to use such
> > on-line coordinates for georeferencing place names, or should all
> > georeferencing be done with "hard copy" references?
>
> We encourage you to take advantage of all available tools, that's why we
> provided those URLs. There may be others as well. Just make sure your sources
> are credible.
>
> > 2) If the answer to the above question is that all georeferencing should
> > be done with "hard copy" references, then ignore this one.
> >
> > A related question to 1): from the same website mentioned above, one can
> > link to "TerraServer" and get (really interesting) aerial photos of places.
> > With the aid of a labelled map, one can zoom in and find specific
> > buildings (such as the Michigan State University Swine Barn - a real Ingham
> > County example). From a zoomed aerial image, you can click on "Image Info"
> > and get lat and long (non-decimal) coordinates for "tiles" (corners of
> > squares) surrounding the image. Datum information is included in "Image
> > Info".
> >
> > So my question is, is it okay for us to use these types of on-line aerial
> > images for georeferencing?
>
> I'm including this question just for completeness. The answer is, of course,
> yes. And remember, do not worry about the type of coordinate data you record.
> The error calculator will be able to convert data provided in any format (e.g.,
> deg, min, sec; dec. degrees; etc.) into any other format. Knowing the datum,
> providing the source of your coordinates, and noting any assumptions you have
> made in assigning those coordinates are what's crucial.
>
> > 3) With regard to the "DeterminedDate" data field in the download file -
> > is there a specific format for the date data (i.e., MM/DD/YYYY or DD Month
> > Spelled Out YYYY) that you would like us to use?
>
> No, because most spreadsheet programs will dictate a format. It seemed
> worthless for us to specify one. John will have to deal with that variety
> later.
>
> Best,
> Barbara
>
>>> Posting number 138, dated 23 Nov 2001 10:27:35
Date: Fri, 23 Nov 2001 10:27:35 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Vieglias routine (fwd)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
XXXX and all,
Don't confuse the lat_long determination with the error determination. You
can get the lat_long without the extents, but you need to use the extents
as one of the sources of uncertainty - which contributes to the maximum
error distance, but does not affect the lat_long itself.
The guidelines do allow the distance bearing computation to be made from
GNIS coordinates, and I agree, it would be a crime not to use those data.
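A distance-and-bearing computation of this kind can be sketched in Python
with the standard destination-point formula on a sphere. This is not the
method prescribed by the MaNIS guidelines or the planned tool; the
spherical Earth radius, the function name, and the starting coordinates
are assumptions for illustration. It returns only the offset position and
says nothing about the extent of the named place, which belongs to the
error determination.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation
KM_PER_MILE = 1.609344
BEARINGS = {"N": 0.0, "E": 90.0, "S": 180.0, "W": 270.0}


def offset_point(lat, lon, distance_km, bearing_deg):
    """Return (lat, lon) in decimal degrees reached from a start point along a bearing."""
    phi1 = math.radians(lat)
    lam1 = math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_km / EARTH_RADIUS_KM  # angular distance in radians

    phi2 = math.asin(math.sin(phi1) * math.cos(delta)
                     + math.cos(phi1) * math.sin(delta) * math.cos(theta))
    lam2 = lam1 + math.atan2(math.sin(theta) * math.sin(delta) * math.cos(phi1),
                             math.cos(delta) - math.sin(phi1) * math.sin(phi2))
    return math.degrees(phi2), math.degrees(lam2)


# "3 mi E" of a named place; the starting coordinates are placeholders.
lat, lon = offset_point(35.0, -119.0, 3 * KM_PER_MILE, BEARINGS["E"])
```

Whatever method is used, the source of the starting coordinates and the
technique should be noted with the record.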
I would very much like to provide the tool that can parse the localities
and calculate the lat_longs from any gazetteer. In February I'll likely be
collaborating with the Alexandria Digital Library Project to do just that.
I am currently awaiting the development of a protocol to communicate with
their Digital Gazetteer.
There are really two tools that would be nice. I've already mentioned the
first one, which would be based on Dave Vieglais' SPPFind tool, which I
have not yet tested. The second is the error calculator, which is
referenced in the MaNIS web pages, but is not yet functional. I've
finished the Error Calculator Tool except for the datum error
contributions and testing. I would like to suggest that charging ahead on
the lat_long determinations is fine, but leave off the error stuff until
the tool is ready for prime-time. That error stuff is just too burdensome
to do by hand. Doing one pass for lat_longs and one for errors might
actually be more efficient, but we'll need evidence "from the trenches"
to figure out if this is true.
John W.
---------- Forwarded message ----------
Date: Sat, 17 Nov 2001 13:21:47 -0800
From:
To: tuco@socrates.Berkeley.EDU
Cc: bstein@oz.net
Subject: Vieglias routine
John W. So much for theory. On a more practical matter. The rules indicate
that "If the [SpecLoc] description includes an offset, use the furthest
extent of the named place in the direction of the offset." So we should
NOT compute terminal lat/longs from the GNIS lat/longs and bearing? I ask
because GNIS locs don't appear to take into account the furthest extent of
the named place. Related, should we wait for the georeferencing tool
mentioned in the 10/18/01 email or just charge ahead? I assume it was to
take GNIS locs and try to match them with occurrences in the MaNIS file
(from project description), then compute terminal lat/longs based on
distance and bearing. Modifying the rules to allow the distance-bearing
computation based on GNIS lat/long would really increase georeferencing
rate, and as long as the technique was referenced, I don't see a problem.
>>> Posting number 139, dated 23 Nov 2001 10:29:58
Date: Fri, 23 Nov 2001 10:29:58 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Vertical Datum?
In-Reply-To: <3.0.32.20011120185230.00718380@pilot.msu.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
XXXXX and all,
The Vertical Datum refers to the geometric model from which elevations are
determined. In our data we consider altitude (or elevation) as an
attribute of the locality, not as an attribute of the position. Or, to say
it another way, when we record positions digitally, we include latitude,
longitude, and horizontal datum, but we do not include elevation and
vertical datum. In short, we treat elevation as a part of the locality,
so we do not need to consider the vertical datum since it has no bearing
on our georeferencing.
Note, unless I am mistaken there is no way to know the datum when using
Topozone. Someone please correct me if I'm wrong. This isn't really a big
problem as long as the error is calculated with an unknown datum.
John W.
On Tue, 20 Nov 2001, XXXXXXXXXX wrote:
> Dear Barbara,
>
> Thanks for your reply to my earlier message. I have another question for
> both you and John:
>
> Do we need to note the "Vertical Datum" if one is provided on a map source?
> One of the Michigan USGS maps that I looked at this week had the following:
> Horizontal Datum: NAD1927
> Vertical Datum: NGVD 1929
>
> Also, it looks like we'll be using Topozone
> (www.topozone.com/findplace.asp) for georeferencing some of the Michigan
> localities (just point the cursor anywhere on the map and the coordinates
> of choice [UTM, lat-long decimal, or lat-long degrees and minutes] appear
> on the lower part of the screen).
>
> Thanks,
> XXXXX
>
>
>>> Posting number 140, dated 26 Nov 2001 10:20:20
Date: Mon, 26 Nov 2001 10:20:20 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Topozone - Datum
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Hi John,
According to the topozone website (address below), it appears that the
given coordinates are based on NAD27. (This is listed next to the
coordinate buttons - UTM, DecLatLong, etc. - on the website).
Please let me know if you have other information about this.
Thanks,
XXXXX
>Note, unless I am mistaken there is no way to know the datum when using
>Topozone. Someone please correct me if I'm wrong. This isn't really a big
>problem as long as the error is calculated with an unknown datum.
>
>John W.
>
> Also, it looks like we'll be using Topozone
> (www.topozone.com/findplace.asp) for georeferencing some of the Michigan
> localities (just point the cursor anywhere on the map and the coordinates
> of choice [UTM, lat-long decimal, or lat-long degrees and minutes] appear
> on the lower part of the screen).
>
>>> Posting number 141, dated 3 Dec 2001 05:59:15
Date: Mon, 3 Dec 2001 05:59:15 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Loading Lat_Longs back into databases
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Dear All,
Last week I promised a message about the relationship between the
gazetteer and your databases - the bigger picture.
We've already talked about the static nature of the current MaNIS
gazetteer. As I've said, the gazetteer in its current form is a temporary
tool to aid in collaborative georeferencing. Once the network gets going
there will be a dynamic gazetteer as described in the NSF proposal.
Because our "snapshot" data are static and our databases are not, the
differences between the two will increase over time, especially for those
who are specifically editing locality-related data. I guess that when
people made this realization it caused some concern.
I designed the gazetteer with the issue of changing data in mind, and I've
done a few things to aid in data reconciliation when the lat_longs get
loaded back into your databases. For example, I've stored much more
information in the MaNIS gazetteer than is visible in the online
interface, including information that relates the localities (and
therefore the lat_longs) back to the specimens themselves. The structure
of the gazetteer may be of interest, so I will post the Gazetteer
Entity-Relationship diagram as a document on the MaNIS website when I get
back to civilization.
Since I stored all of the original locality-related information along with
the references to the specimens, it will be possible (when the time comes
to load lat_long information into your databases) to compare the snapshot
locality data with the then-current locality data. For all of those
localities where there has been no change, the lat_long data can be loaded
without question. This first step should take care of most records for
most institutions. For the rest of the records, where the locality data no
longer exactly match the snapshot data, some analyses can be done to
determine if the differences can be considered "substantive," by which I
mean that they would affect the determination of the lat_long. For
example, a snapshot locality that is the same as the then-current locality
except that an elevation has been added can be considered as not
substantively changed and can therefore have its lat_long record loaded.
This step will be a little different for each institution. After doing
some bulk checking for differences such as in the foregoing example, I
envision making one visual pass over the remaining records, with the
original and the then-current localities side-by-side, putting a checkmark
in a column called "substantive" for those records that have had
substantive changes. When that pass has been made, all of the lat_longs
for records without a checkmark can be loaded. This third step should take
care of most of the remainder of the localities. What's left will be
locality-specimen relationships that have changed since the time when the
snapshot was taken. These records will have to be resolved by the
individual institutions.
There are some tricks and techniques I haven't presented yet, but I hope
that what I've written above helps to clarify the bigger picture with
respect to georeferencing. Questions have proven useful thus far, so if
there's anything else about which you'd care to have me elaborate, please
ask.
In the spirit of looking forward, another thing to think about for the
future is the incorporation of the coordinates and metadata into your own
local databases. Some institutions don't have attributes in their
databases to hold lat_long information. Similarly, not everyone (but there
are some!) has an attribute to accommodate maximum error distance. It would
be a shame to throw away all of this hard-earned and valuable data. At
this point I'm asking you to consider the ramifications of storing these
data so that there are no unpleasant surprises when the time comes to load
the data back into your databases.
John W.
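The reconciliation steps described above (load unchanged localities, bulk-check for non-substantive differences, then review the rest by eye) might be sketched like this. The field names, the record layout, and the choice of "elevation" as the only non-substantive field are assumptions made for illustration; they are not the actual gazetteer schema.

```python
# Hypothetical set of fields whose change would not affect the lat_long.
NON_SUBSTANTIVE_FIELDS = {"elevation"}


def classify(snapshot, current):
    """Compare a snapshot locality record with the then-current record.

    Returns "load" when the lat_long can be loaded without question,
    or "review" when a person should compare the two side by side.
    """
    if snapshot == current:
        return "load"  # step 1: no change at all
    changed = {
        field
        for field in set(snapshot) | set(current)
        if snapshot.get(field) != current.get(field)
    }
    if changed <= NON_SUBSTANTIVE_FIELDS:
        return "load"  # step 2: only non-substantive fields differ
    return "review"    # step 3: needs a visual pass


snapshot = {"country": "USA", "state": "California",
            "specific_locality": "5 mi N Berkeley"}
with_elevation = dict(snapshot, elevation="300 ft")
moved = dict(snapshot, specific_locality="5 mi S Berkeley")

print(classify(snapshot, dict(snapshot)))   # identical record
print(classify(snapshot, with_elevation))   # elevation added only
print(classify(snapshot, moved))            # locality text changed
```

Records that come back as "review" correspond to the side-by-side pass with the "substantive" checkmark column.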
>>> Posting number 142, dated 7 Dec 2001 07:24:45
Date: Fri, 7 Dec 2001 07:24:45 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Georeferencing Guide Revisions
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Dear All,
While working through the development of the Georeferencing Calculator I
discovered minor numerical and typographical errors in the Georeferencing
Guidelines document. This message is just to alert you that I have made
revisions to that document. One particular change worth noting is in the
section on "Uncertainty associated with coordinate precision." It seemed
to me quite reasonable to assume that the coordinate precision should be
the same for both coordinates, and so I've rewritten that section to
reflect this assumption.
I've also added some calculation examples against which you might test
your understanding of the georeferencing concepts.
One detail of reading the datum error from a file eludes me at the
moment. It is the last remaining issue before the Georeferencing
Calculator becomes available.
John W.
>>> Posting number 143, dated 10 Dec 2001 13:58:27
Date: Mon, 10 Dec 2001 13:58:27 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Information from Topozone - NAD 27 Datum
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Dear All,
John asked that I follow up with staff at Topozone
(www.topozone.com/findplace.asp) with regard to datum information on their
website's scanned maps (see previous message exchanges copied below).
Here is what I found out:
1. For USGS QUAD MAPS (1:24,000 or 1:25,000): the vast majority of these
original scanned maps on the Topozone website are based on the NAD 27. If
any underlying Quad map was originally based on another datum (such as NAD
83 for example), Topozone has REPROJECTED that map into NAD 27.
2. Thus, the Topozone cursor coordinates as well as the underlying Quad
map (whether original or reprojected) are ALWAYS in NAD 27.
3. It was confirmed that all original MICHIGAN QUAD maps that were scanned
for the Topozone website are NAD 27.
John, please let us know if it is okay for us to list NAD 27 as the datum
instead of "Datum Unknown" for locality coordinates taken from the Topozone
website.
Thanks,
XXXXXX
>>> Posting number 144, dated 14 Dec 2001 15:31:53
>>> Posting number 145, dated 16 Dec 2001 11:40:45
Date: Sun, 16 Dec 2001 11:40:45 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Information from Topozone - NAD 27 Datum
In-Reply-To: <3.0.32.20011210135816.00717590@pilot.msu.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Thanks XXXXX, this is most excellent. We can use Topozone coordinates with
NAD27 recorded. They have no idea how big a favor they have done for
us. Everyone please list NAD27 with any coordinates derived from Topozone
and remember to record the Reference_Source as "Topozone 1:24000" or the
like.
>>> Posting number 146, dated 3 Jan 2002 10:14:21
Date: Thu, 3 Jan 2002 10:14:21 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: number of decimals on decimal degrees
Mime-Version: 1.0
Content-Type: text/plain; format=flowed
MaNIS: How many decimals are folks attaching to lat/long determinations?
I'm going with four on decimal degrees even though this is more than is
justified by offset distances to the nearest mile or fractional mile.
As I understand it, John W's error calculator will attach the correct error
to lat/long determinations based on the offset direction(s), distance and
units. Sorry if I missed this in previous discussions?
>>> Posting number 147, dated 7 Jan 2002 09:46:27
Date: Mon, 7 Jan 2002 09:46:27 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: number of decimals on decimal degrees
In-Reply-To: <F100rz71znUp8acXUgZ000178f6@hotmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Hi Folks,
I'm back. Argentina started rioting when I left for Chile. I won't claim
that my leaving was the cause.
Anyway, my recommendation is to store as many decimal places as your
source gives you and not to confuse those digits with accuracy or precision
- that's why we're using the explicit maximum error distance. I would
certainly caution that to use fewer digits is to introduce extra,
unwarranted errors. Refer to the table in the Georeferencing Guide at
http://elib.cs.berkeley.edu/manis/GeorefGuide.html to see the magnitude of
these errors. If you use 5 digits in a decimal degree coordinate, the error
will be on the same order of magnitude as that for most of today's accurate
GPS readings. The error calculator will also take into account the
precision of the recorded coordinates when calculating maximum error distances.
>MaNIS: How many decimals are folks attaching to lat/long determinations?
>I'm going with four on decimal degrees even though this is more than is
>justified by offset distances to the nearest mile or fractional mile.
>As I understand it, John W's error calculator will attach the correct error
>to lat/long determinations based on the offset direction(s), distance and
>units. Sorry if I missed this in previous discussions?
>
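To give a feel for how the number of stored decimal places relates to error, here is a rough sketch. It assumes a spherical Earth and treats one step of the last recorded digit in each coordinate as the two legs of a right triangle. These are simplifying assumptions of this sketch; the table in the Georeferencing Guide remains the reference for actual values.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumed mean Earth radius (spherical model)
METERS_PER_DEGREE = math.pi * EARTH_RADIUS_M / 180.0


def precision_uncertainty_m(latitude_deg, decimal_places):
    """Approximate distance (meters) spanned by one unit of the last
    recorded decimal place in both latitude and longitude."""
    step_deg = 10.0 ** (-decimal_places)
    lat_err = step_deg * METERS_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude.
    lon_err = step_deg * METERS_PER_DEGREE * math.cos(math.radians(latitude_deg))
    return math.hypot(lat_err, lon_err)


for places in range(1, 6):
    meters = precision_uncertainty_m(45.0, places)
    print(f"{places} decimal places: about {meters:,.1f} m")
```

Each decimal place dropped makes the figure ten times larger, which is why truncating coordinates adds error that the source never had.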
>>> Posting number 148, dated 7 Jan 2002 12:37:08
>>> Posting number 149, dated 7 Jan 2002 12:57:05
>>> Posting number 150, dated 7 Jan 2002 12:45:12
Date: Mon, 7 Jan 2002 12:45:12 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Should not found SpecLocs default to county?
In-Reply-To: <v02130501b85f9bf6b38a@[207.207.103.162]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX, and all,
>John W.: So I'm wondering about the Oregon records. There are about 400
>with DecLat/longs that were already assigned when downloaded, but they only
>have two decimals. Was this a formatting or rounding decision? I'll leave
>them as is as I assume if someone assigned lat/long it is more accurate than
>the SpecLoc.
Actually, it was a formatting error. The decimal lat/longs that appear in
the download have been truncated to 2 decimal places. This wasn't my
original intention. The truncation occurred somewhere in transferring
between Access and the Informix database from which the downloaded data are
taken. I'll try to find out where it occurred and fix the problem, then I
will update the decimal latitude and longitude values in the online
gazetteer. This shouldn't affect those who've already downloaded data
for georeferencing since we agreed that the localities that already have
lat/longs will not be georeferenced (again). If anyone is checking and
changing records that have lat_longs already, let me know.
>Related, if we cannot find a SpecLoc, should we default to county or leave
>it ungeoreferenced pending investigation by the contributing institution?
>So far not found SpecLocs are running at about 10% due to discrepancies in
>SpecLoc and county, apparent typos, or ambiguous text.
If you cannot find the SpecLoc, leave it ungeoreferenced and say why in the
field called "NoGeorefBecause." If you find the SpecLoc and it is
unambiguously placed in the wrong county, go ahead and georeference it and
make a note to that effect in the "LocalityAnnotation" field in the
downloaded data file. These notes will eventually get back to the source
institution.
>>> Posting number 151, dated 7 Jan 2002 14:52:05
Date: Mon, 7 Jan 2002 14:52:05 -0600
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Oregon lat/longs.
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
XXXX, and all:
"...wondering about the Oregon records. There are about 400..."
The Oregon records that had lat/long for specimens in the KU collection
should be redone with the new system. Those that were added here were done
a couple of years ago using a program that calculated them for us so they
will not be as accurate as the current system we are using.
>>> Posting number 152, dated 8 Jan 2002 20:57:38
>>> Posting number 153, dated 16 Jan 2002 15:03:38
Date: Wed, 16 Jan 2002 15:03:38 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Georeferencing Error Calculator
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
At long last I'm ready to introduce the Georeferencing Error Calculator.
It's been some time in the making, and I apologize for the delay, but I
wanted to give you a product that wouldn't be a moving target due to
constant revision. The application has been pretty well tested and I
believe you can use it with confidence in the results it
gives. Nevertheless, if something doesn't seem quite right, try to figure
out why. Usually it means that the coordinate precision is set too low (the
coordinate precision always reverts to "nearest degree" if you change the
coordinate system). If you exhaust all possibilities of making sense of the
maximum error value that the program gives you (this includes reading the
manual and the georeferencing guidelines), then feel free to send me a
message asking what's going on. If you do, please be explicit about what
you are doing and what all of the parameters are for the calculation that
puzzles you.
The Georeferencing Guidelines and the Georeferencing Steps documents have
been modified to include references to the Error Calculator, and the Error
Calculator Manual has been added to the list on the Documents page on the
MaNIS website at the following URL:
http://dlp.cs.berkeley.edu/manis/Documents.html
Please read the manual so you know what to expect when loading the
Calculator into your browser. In particular, you should be aware of the
browser constraints and the size of the Java applet. It can be quite slow
to load the first time if your connection is slow.
Two points about making calculations are also worth emphasizing in advance.
I've already mentioned the first, which is that the coordinate precision
will revert to "nearest degree" if you change the coordinate system. If you
get an error that you think is excessive, the coordinate precision is
likely to be the culprit. Another possible culprit is having the datum set
to "not recorded" if you actually know what datum the coordinates were
taken in. The second important point is that all distance measurements in a
given calculation must be in the same units. For example, don't mix an
offset of 10 miles with an extent of named place of 3 kilometers. Both
measures need to be in one system or the other. The error distance will be
given in the same units as the measurements and all will be governed by
your choice in the Distance Units drop-down list.
Enjoy!
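As a small illustration of the point about units, distances can be converted to a single unit before they are entered. The helper below is a hypothetical convenience and not part of the Error Calculator; it only normalizes units and does not reproduce the Calculator's error computation.

```python
# Conversion factors to kilometers (exact by definition of each unit).
KM_PER_UNIT = {
    "km": 1.0,
    "m": 0.001,
    "mi": 1.609344,
    "yd": 0.0009144,
    "ft": 0.0003048,
}


def to_km(value, unit):
    """Convert a distance in the given unit to kilometers."""
    try:
        return value * KM_PER_UNIT[unit]
    except KeyError:
        raise ValueError(f"unsupported distance unit: {unit!r}")


# The mixed-unit example from the message: 10 mi offset, 3 km extent.
offset_km = to_km(10, "mi")
extent_km = to_km(3, "km")
print(f"offset = {offset_km:.3f} km, extent = {extent_km:.3f} km")
```

With both values in kilometers, "km" would then be the matching choice in the Distance Units drop-down list.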
>>> Posting number 154, dated 16 Jan 2002 15:28:45
Date: Wed, 16 Jan 2002 15:28:45 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: CNMA: mammal collection at UNAM
In-Reply-To: <5.1.0.14.1.20020107123724.00a00090@ibunam.ibiologia.unam.mx>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; format=flowed
Content-Transfer-Encoding: quoted-printable
Dear All,
I have changed all references to UNAM in the MaNIS documents and database
to be CNMA based on the following request from Fernando Cervantes. The
acronym was not changed in the Project Description document, which is a
copy of the document sent as part of the NSF grant application. Those of
you who downloaded localities previous to 16 January 2002 will still have
UNAM as a CollectionCode in your downloaded data. This will not present a
problem when you return the georeferenced data to me.
John W
>Dear John
>
> To better describe who and where we are at I would like to ask you for
> the following:
>
>1. In the list of institutions participating in MaNIS and the contacts
>(web site), please include the name, position, and e-mail account of my
>assistants:
>
>Yolanda Hortelano, yolahm@ibiologia.unam.mx
>Julieta Vargas, jvargas@ibiologia.unam.mx
>
>2. In addition, please change the acronym of our collection. Our mammal
>collection is known and registered as CNMA (after Colección Nacional de
>Mamíferos) and is hosted by Instituto de Biología, that belongs to
>Universidad Nacional Autónoma de México (UNAM).
>
>Thank you for your help,
>
>Fernando
>------------------------------------------------
>Fernando A. Cervantes
>Zoologia. Instituto de Biologia, UNAM
>Apartado Postal 70-153, Coyoacan
>Mexico, D. F. 04510
>Mexico
>
>tel.: (525) 622 9143; fax: (525) 550 0164
>e-mail: fac@ibiologia.unam.mx
>sitio web: www.ibiologia.unam.mx/cnma
>------------------------------------------------
>>> Posting number 155, dated 17 Jan 2002 09:38:42
Date: Thu, 17 Jan 2002 09:38:42 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: georeferencing
In-Reply-To: <5.1.0.14.1.20020107123124.00a00ec0@ibunam.ibiologia.unam.mx>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; format=flowed
Content-Transfer-Encoding: quoted-printable
XXXXXXX,
Now that the Error Calculator is done and on the web I was able to check
the data you sent me in December. I had no problems importing those data
into my system. When I do this step I check for inconsistencies in the
data and fix them if I can. The Determination References you provided are
excellent. I wish we could figure out which datum those sources use.
I'm curious why you chose to record degrees minutes seconds instead of
decimal degrees for the localities your team georeferenced. I'm only asking
to point out that it would have been easier to just copy and paste the two
decimal degree values. This would have been a little faster and it would
have left less room for error. Even so, there was only one coordinate error
I could find in your data. There was a 10 for decimal seconds where there
should have been a 0.
There are some limitations of the Alexandria Digital Library Data of which
everyone should be aware. As long as you recognize these limitations, the
ADL gazetteer is extremely useful. I'm including, below, a message to and a
response from Linda Hill about these limitations.
I noticed that none of the localities in the records you georeferenced have
maximum error distances. I hope you will provide these data in the future,
especially now that I've released the Error Calculator, which is supposed
to make the calculations much easier. When you do make error calculations,
be sure to use a coordinate precision of "nearest minute" for Alexandria
Digital Library data that come from NIMA. If you look at the values that
come up there is always either a 0 or a 59 in the seconds field for non-USA
named places. There is something wrong with a coordinate translation
algorithm somewhere that produces this problem. I recommend using the
decimal degree coordinates since they err less than the degrees minutes
seconds.
I especially appreciate the Locality Annotations your team provided and I
hope the other recipients of your georeferenced data do as well.
>John: Here's the situation. The data in our gazetteer for the example you
>used is NIMA. The original NIMA coordinates are:
>
>NIMA: 20° 11' 00" N 098° 03' 00" W
>
>NIMA points are all limited to 1 minute resolution, I believe, although they
>don't document this in any way that I have seen.
>
>We have two clients and they show the coordinates as:
>
>CDL-Middleware client to ADL Gazetteer: Longitude W 98° 03' Latitude N 20° 11'
>
>AOL client to ADL Gazetteer: Longitude: -98.050003 (98°3'0"W) Latitude:
>20.183332 (20°10'59"N)
>
>The problem with the AOL client is that the original ddmmss values were
>converted to decimal degrees and then the ddmmss values that are shown in
>the interface are calculated from them, giving the impression that there is
>more resolution in the location than is warranted. As you point out, in your
>example there is obviously a problem with the '3' as the last digit in the
>longitude value. We are aware of these problems but have not gone back and
>fixed it. We have limited staff to work on the gazetteer and have put more
>work into other developments. What we intend to do is to phase out the AOL
>client and replace it with a client based on our middleware software (like
>the CDL client). We will be storing decimal degrees in our database but need
>to be smarter about the specificity.
>
>Neither the USGS nor NIMA clearly reference the geodetic basis of their
>coordinates. We are assuming that they are using WGS-84. In our revised
>Gazetteer Content Standard there is an element to declare the geodetic basis
>for the coordinates. We are setting the default value as WGS-84 but other
>bases can be entered. With our current gazetteer, I think you will not go far
>wrong with assuming WGS-84. Also, we have elements for making a statement
>about the 'accuracy' of the coordinates. In the future as we build up better
>data, these statements could give assistance in making the estimates that
>you need.
>
>I had a look at your 'estimator' for maximum geospatial error in specifying
>locations. It looks very useful. I passed the URL on to our gazetteer team
>here so that they can see what you are doing.
>
>We are still working on getting our gazetteer protocol server working
>properly. We solved a major parsing problem today. There is still more to do
>but you might start thinking about how you might embed gazetteer lookup in
>your script using our gazetteer service protocol.
>
>I appreciate your feedback and apologize for the limitations of our
>gazetteer. We continue to work on it and welcome collaboration to 'make it
>right'.
>
>- Linda
>
>
>John Wieczorek wrote:
>
> > Hi again,
> > I have people engaged in georeferencing for the MaNIS Project now. My first
> > set of georeferenced data have just been returned and the ADL gazetteer was
> > among the Reference Sources used to get coordinates for the data. My
> > questions are about the coordinates themselves. I'll use a specific example
> > to better illustrate the questions.
> >
> > The locality in question is Huauchinango, Puebla, Mexico. The gazetteer
> > shows coordinates in two units, decimal degrees and degrees minutes seconds.
> > Specifically, for this example, the decimal degrees are 20.183332,
> > -98.050003. The degrees minutes seconds are 20°10'59"N, 98°3'0"W. These two
> > aren't the same when you get out to that sixth decimal place in longitude,
> > and they differ even more in latitude. I'm wondering whether there is a way
> > to know which is the original coordinate system (i.e., the one without the
> > error introduced by translation). Both coordinates actually have tell-tale
> > signs of tampering. That 3 out at the end of the decimal longitude looks
> > like a floating point error. The fact that so many of the named places from
> > this region have only 0 or 59 in the seconds fields is also highly suspect.
> > So, I wonder at what step the translation(s) was(were) made - whether it
> > comes from the original data source (in this case NIMA) or whether it is
> > post-processing done on your end. If it is the former, I suppose we're
> > stuck with it, but if it's the latter I wonder if a better algorithm could
> > be used to keep the coordinates in sync. I can offer one, if that helps.
> >
> > Finally, I've probably asked this before, but is it possible to get the
> > datum information along with the coordinates. I suspect that information is
> > missing as metadata from the original data sources, but if it isn't
> > missing, is there any possibility that it could be among the data you
> > provide in the ADL gazetteer interface? It makes a great deal of difference
> > sometimes in determining the maximum error distance for the coordinates
> > assigned to a locality, and this will, in turn, affect analyses further on
> > down the road.
> >
> > Thanks bunches,
> > John W
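The Huauchinango discrepancy discussed above can be reproduced with a short sketch. It assumes, as a hypothesis only, that the gazetteer stored the decimal degrees as single-precision floats and then derived the displayed seconds by truncation; nothing in the exchange confirms how the ADL client was actually implemented. The function names are illustrative.

```python
import struct


def to_float32(value):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def dms_to_decimal(degrees, minutes, seconds):
    return degrees + minutes / 60 + seconds / 3600


def decimal_to_dms_truncated(decimal_degrees):
    """Convert to (d, m, s), dropping fractional seconds (assumed behavior)."""
    d = int(decimal_degrees)
    minutes_full = (decimal_degrees - d) * 60
    m = int(minutes_full)
    s = int((minutes_full - m) * 60)
    return d, m, s


def decimal_to_dms_rounded(decimal_degrees):
    """Convert to (d, m, s), rounding to the nearest whole second first."""
    total_seconds = round(decimal_degrees * 3600)
    d, remainder = divmod(total_seconds, 3600)
    m, s = divmod(remainder, 60)
    return d, m, s


# Original NIMA values: 20 deg 11' 00" N, 98 deg 03' 00" W (magnitudes only).
lat32 = to_float32(dms_to_decimal(20, 11, 0))
lon32 = to_float32(dms_to_decimal(98, 3, 0))

print(f"{lat32:.6f}", decimal_to_dms_truncated(lat32), decimal_to_dms_rounded(lat32))
print(f"{lon32:.6f}", decimal_to_dms_truncated(lon32), decimal_to_dms_rounded(lon32))
```

Under these assumptions the stored latitude falls just below 20 deg 11', so truncation shows 59 seconds, while rounding to the nearest second recovers the original minute values. That is one way the two representations could be kept in sync.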
>>> Posting number 156, dated 20 Jan 2002 10:39:23
>>> Posting number 157, dated 31 Jan 2002 15:13:44
>>> Posting number 158, dated 31 Jan 2002 16:18:34
Date: Thu, 31 Jan 2002 16:18:34 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Georeferencing Update
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
I write for two purposes. The first is that I'm curious to know how many of
you have actually begun to georeference. So far, I know that CNMA and the
MVZ have begun. The reason I ask is that I would like to begin a discussion
on the list of techniques to make the task go faster. I don't really want
to do that until most everyone is actually getting their hands dirty. In
this way everyone will be able to benefit from the discussion. So, please
let me know either that you have already begun georeferencing, or when you
anticipate beginning.
My second purpose is to let you know that, due to my ignorance of the
details of two of your esteemed collections databases, I made some faulty
assumptions when I first processed the data for the online gazetteer. As a
result, I need to reload data for UWBM and for ROM. I have already
reprocessed the UWBM data and I'll try to load it into the gazetteer as
soon as possible (hopefully by Monday). The situation with ROM is more
complex and I anticipate making an update to the ROM data in about one
month. There are a few implications of this unfortunate necessity.
1) If you have not yet downloaded localities for georeferencing, wait to
make your downloads at least until I announce that the update for UWBM has
been done. Don't wait for the ROM update to be done unless for some reason
you weren't going to begin georeferencing for another month anyway.
2) If you have downloaded localities, but have not yet begun georeferencing
them, throw away the downloaded file(s) you have and download them again
after I announce that the UWBM update is complete. Again, don't wait for
the ROM update to be done unless you weren't going to begin georeferencing
for another month anyway.
3) If you downloaded and began georeferencing files that include UWBM
and/or ROM records, please discard those records (only) from your record
set, even if you happen to have already georeferenced some of them. My
suspicion is that not much actual georeferencing has commenced to date
(though I'd love to hear otherwise), so this is unlikely to be a big
problem. After discarding the UWBM and ROM records, please do another
download with the same criteria you used last time, but this time please
select UWBM in the Institution drop-down box. This will give you only the
UWBM records from your geographic area of interest. After they download
successfully, append these UWBM records to the records you've already begun
georeferencing and proceed as if nothing had happened.
When the ROM records are ready I'll make another announcement to the list
about downloading only ROM records to append to your working files. The
process will be exactly the same as in scenario 3, above. In the meantime,
ROM records will still be in the gazetteer, but please do not spend time
georeferencing them. Throw them out now, or when I make the announcement, as
you prefer.
Thanks, and my sincere apologies for the inconvenience. I promise to try to
not make assumptions about other people's data anymore. I should know
better by now.
John W
>>> Posting number 159, dated 1 Feb 2002 17:29:10
>>> Posting number 160, dated 1 Feb 2002 17:33:01
>>> Posting number 161, dated 1 Feb 2002 18:24:45
>>> Posting number 162, dated 1 Feb 2002 15:42:04
>>> Posting number 163, dated 1 Feb 2002 19:27:30
Date: Fri, 1 Feb 2002 19:27:30 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Gazetteer update
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
The promised gazetteer update is complete. Download with abandon!
John W
>>> Posting number 164, dated 4 Feb 2002 17:36:54
Date: Mon, 4 Feb 2002 17:36:54 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Fwd: Georeferencing by MSU
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
Please read, there are some excellent questions raised here.
>Date: Mon, 04 Feb 2002 17:58:51 -0500
>To: tuco@socrates.Berkeley.EDU
>From:
>Subject: Georeferencing by MSU
>
>Hi John,
>
>XXXXXXX and I want to give you an update on georeferencing and relay
>some concerns/questions.
>
>In late November, we downloaded records for several Michigan counties and
>have since practiced on the different types of localities. Using
>Topozone, we worked individually on Eaton and Barry Counties and then
>compared and discussed our approaches and results. Prior to the
>introduction of the error calculator, we reported our results as UTM
>coordinates in the Access file template provided.
>
>With the availability of the error calculator (thank you very much!) and
>recent revised guidelines, we began recording original coordinates as
>decimal degrees for Barry County. Our plan is to send you each of our
>Barry County files in the next few days. We would appreciate your
>comments on our results and techniques before we proceed with the "real
>thing".
>
>We have some questions and comments:
>
>1. Evolving Guidelines - We would appreciate an announcement whenever
>there is an update to the guidelines, specifying which sections are
>altered, to ensure that we are always working with the most recent
>information. Thanks again for all of your hard work with this!!
Point well taken. I've tried to be good about announcing the updates, but I
haven't always completely described what the changes were.
>2. Guidelines Questions - In the calculation example of Distance Along
>Orthogonal Directions, the Direction Precision is given as 45 degrees. It
>seemed earlier in the document that "directional imprecision can be
>ignored" in such an example. Are we misunderstanding something?
I included one too many lines in my copy and paste. The Direction Precision
should not figure into that calculation. I will remove the extraneous line
from the Georeferencing Guidelines.
>In the calculation example of Named Place Only/Bakersfield, the
>coordinates are 35 degrees, 22', 24"N and 119 degrees, 1' 4" W. We
>understand from the example that these are the GNIS coordinates for
>Bakersfield. In other examples (e.g. Distance Along Orthogonal
>Directions and Distance at a Heading) the latitude and longitude
>coordinates are the same as for the Named Place/Bakersfield example. Since
>the actual localities are different (from Bakersfield), shouldn't the
>coordinates be different as well?
Absolutely. You win a prize for catching those mistakes. The "Distance
Along a Path" example was similarly problematic. I have changed the wording
as well as the values for Latitude, Longitude, Decimal Latitude, and
Decimal Longitude for these examples to reflect that the coordinates of the
locality are different from the coordinates of the named place mentioned in
the locality description.
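
As an illustration of the relationship between the Latitude/Longitude and Decimal Latitude/Decimal Longitude values mentioned above, here is a minimal sketch of the conversion, using the Bakersfield coordinates quoted in the question. The function name is hypothetical and is not part of the Guidelines or the Error Calculator.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus a hemisphere letter to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # South latitudes and west longitudes are negative by convention.
    return -value if hemisphere in ("S", "W") else value


# Bakersfield coordinates as quoted in the question above.
bakersfield_lat = dms_to_decimal(35, 22, 24, "N")
bakersfield_lon = dms_to_decimal(119, 1, 4, "W")
print(round(bakersfield_lat, 5), round(bakersfield_lon, 5))
```

A locality described as an offset from Bakersfield would start from these values but end up with different coordinates, which is the point of the correction.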
>3. Coordinates for the Center of a Township - If a locality is a
>township name only, is it preferable to use the coordinates for the
>township that are automatically provided by Topozone (via the place name
>search), or use the coordinates for the intersection of Sections 15, 16,
>21, and 22 (assuming the township consists of the "standard" 36 one-mile
>square sections)?
I was unaware that one could (and unable to figure out how to) find a
township, in the TRS sense, from the place name search on Topozone. I did
notice that you can find named townships (Michigan is full of them), but I
don't believe their coordinates correspond with the TRS sections they
occupy. Nevertheless, the coordinates we're looking for are those of the
intersection of center sections, as Laura mentioned above.
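
To see why Sections 15, 16, 21, and 22 define the township center, the following sketch lays out a "standard" township of 36 one-mile sections, numbered in the usual serpentine pattern starting with Section 1 in the northeast corner, and finds the corner those four sections share. The function names are hypothetical, and the sketch assumes a regular 6 x 6 township; irregular townships need to be checked on the map.

```python
def section_cell(section):
    """Return (row, col) of a section in a standard 6 x 6 township.

    Row 0 is the northern tier and col 0 is the western column.
    """
    if not 1 <= section <= 36:
        raise ValueError("section must be between 1 and 36")
    row, offset = divmod(section - 1, 6)
    # Even rows are numbered east to west, odd rows west to east.
    col = 5 - offset if row % 2 == 0 else offset
    return row, col


def cell_corners(section):
    """Return the four corners of a section as (miles south, miles east) of the NW township corner."""
    row, col = section_cell(section)
    return {(row, col), (row, col + 1), (row + 1, col), (row + 1, col + 1)}


# The only corner common to all four central sections.
shared = set.intersection(*(cell_corners(s) for s in (15, 16, 21, 22)))
print(shared)
```

The shared corner lies 3 miles south and 3 miles east of the northwest corner, that is, at the middle of the 6-mile square.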
>4. Extent of an intersection - One of the localities that we recently
>georeferenced in Barry County was the intersection of two roads. We used
>the coordinates from Topozone and estimated the extent of the intersection
>to be 50 meters. Is this a reasonable estimate to use in general for this
>type of locality? (The locality was considered as a named place for
>calculation of error).
That seems like a generous extent unless the roads are 12-lane highways or
something. I would opt for something more like 10 meters for your everyday
two-lane roads. Certainly, feel free to override my opinion if the
circumstances warrant it.
>5. Extent of a named place that lacks bounding boxes - We have
>encountered named places that lack bounding boxes on both the Topozone
>image as well as a Michigan County Gazetteer book. We have estimated
>extents of such places based on the clusters of buildings that appear as
>black squares on Topozone in 1:25,000 scale. Is this type of estimate okay?
That's what I'd do, and that's what my georeferencers have been doing from
the outset.
>6. Cursor Accuracy - Robin and I have different model computers that
>utilize different web browsers (I have Netscape; Robin has
>Explorer). When Robin connects to Topozone, her computer cursor
>automatically changes to a crosshair. I manually changed my computer
>cursor from the "standard" arrow to a crosshair. I believe this has made
>a difference in attempting to pinpoint localities on the Topozone map.
Good idea. It hadn't occurred to me because we're all using Netscape, and
we're only using Topozone occasionally. Just as a point of information,
for California we most often use Terrain Navigator from MapTech
(http://maptech.com/) to do our georeferencing.
>Thanks for all of your help!!
Thanks for your excellent questions and comments.
>XXXXXXXXX
>>> Posting number 165, dated 6 Feb 2002 11:48:03
Date: Wed, 6 Feb 2002 11:48:03 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Should we save extents?
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
MaNISers:
Should we save extents? In georeferencing, one variable that will not be
saved is the extent used to compute the error. The extent cannot be
inferred from the locality descriptions unlike coordinate and offset
imprecision. In addition, an extent for a populated place will vary
depending on the scale, map, year. For many records it is the largest
component of the error. To give folks an idea of how I computed the error,
I am annotating each record with the extent I used. One could go
overboard and reference the extent, but I am assuming the same system used
to get lat/long (GNIS). Would it be too much trouble to save extents in
the annotation field?
For TRS lat/longs, I am using the extents in the Guidelines update. For
lookup on the MontanaTRS site I am assuming unknown datum and no error due
to scale as done in the Georef Guidelines examples for placename only.
Correct?
>>> Posting number 166, dated 7 Feb 2002 09:55:12
Date: Thu, 7 Feb 2002 09:55:12 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Datum error significance
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
I figured it was worth answering this question on the list in case others
were wondering the same thing. The commonly used datums in the US are the
North American Datum 1927 (NAD 27), the North American Datum 1983 (NAD 83), and
the World Geodetic System 1984 (WGS 84). The difference between NAD 83 and
WGS 84 is quite small compared to the difference between NAD 27 and NAD 83.
All of the USGS maps are in one or the other of NAD 27 or NAD 83. I haven't
done an exhaustive search, but it looks like most US Forest Service and
Bureau of Land Management maps use NAD 27.
Anyway, the 79 m used in the Bakersfield example is the actual distance
between two points having the exact same latitude and longitude, but with
one of the points based on NAD 27 and the other based on WGS 84. The Error
Calculator uses a pre-calculated matrix of the greatest difference between
these two datums in every 0.2 by 0.2 degree cell in the region between
84.69 degrees North, 179.48 degrees West and 13.69 degrees North, 51.48
degrees West. Outside of this region the calculator uses the assumption of
1km error due to an unknown datum as documented in the Georeferencing
Guidelines.
When entering coordinates in the calculator it is important to enter the
correct hemisphere. Perhaps that goes without saying, but it is pretty easy
to enter decimal longitude erroneously (without the negative sign in front)
for localities in the western hemisphere. Doing so could seriously affect
the error contribution from an unknown datum.
John W.
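
The lookup described above can be sketched roughly as follows. This is not the Error Calculator's code: the table layout, the function names, and the 79.0 value stored for the Bakersfield cell are assumptions made for illustration, and falling back to 1 km for a cell missing from the table is a choice made here, not something stated in the message. Only the region bounds, the 0.2 degree cell size, and the 1 km fallback outside the region come from the text.

```python
# Bounds of the pre-calculated region described in the message above.
LAT_MIN, LAT_MAX = 13.69, 84.69
LON_MIN, LON_MAX = -179.48, -51.48
CELL_SIZE = 0.2            # degrees per grid cell
FALLBACK_ERROR_M = 1000.0  # assumed error outside the region (1 km)


def cell_index(lat, lon):
    """Return the (row, col) grid cell for a point, or None if outside the region."""
    if not (LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX):
        return None
    row = int((lat - LAT_MIN) / CELL_SIZE)
    col = int((lon - LON_MIN) / CELL_SIZE)
    return row, col


def unknown_datum_error_m(lat, lon, shift_table):
    """Look up the datum shift for a point; use the 1 km fallback when no cell applies."""
    idx = cell_index(lat, lon)
    if idx is None:
        return FALLBACK_ERROR_M
    return shift_table.get(idx, FALLBACK_ERROR_M)


# Hypothetical one-cell table; the value is a placeholder, not real data.
shift_table = {(108, 302): 79.0}
correct = unknown_datum_error_m(35.37, -119.02, shift_table)
wrong_sign = unknown_datum_error_m(35.37, 119.02, shift_table)  # negative sign omitted
print(correct, wrong_sign)
```

The second call shows the hemisphere problem: with the negative sign dropped, the point falls outside the region and the contribution jumps to the 1 km fallback.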
>Date: Wed, 6 Feb 2002 11:45:19 -0800
>To: tuco@socrates.Berkeley.EDU
>From:
>
>John: Unknown datum question. Fig 1 in the guidelines has the ranges of
>error for unknown datum. For Bakersfield the range 76-100 m error.
>Oregon, which I am georeferencing, is in the same 76-100 m band, so a
>midpoint would be 88 m. Does 79 m used in the Georeferencing Guidelines
>examples for Bakersfield have some significance? I realize this doesn't
>matter when using the web calculator, but just wondering because it makes a
>difference of several m when using Excel calculator.
>
>>> Posting number 167, dated 13 Feb 2002 11:35:40
Date: Wed, 13 Feb 2002 11:35:40 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: MSU PRACTICE RECORDS
In-Reply-To: <3.0.32.20020213132000.00687df0@pilot.msu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
Below are extracts from an exchange between me, XXXXXXXXXXX, and
XXXXXX stemming from a request to review a set of records that each of
them had georeferenced independently. Several points of interest to the
readers of this list were raised, including a continuing discussion of the
issue of extents raised by XXXXXXX on 6 Feb 2002.
I'd like to report that this exercise turned out to be a wonderful field
test of the georeferencing guidelines. The coordinates and errors were
remarkably similar, with the largest deviations corresponding to the most
vague locality descriptions. Go team!
John W
> >Topozone actually has
> >only two underlying scales, 1:25,000, and 1:100,000. The 1:50,000 and
> >1:200,000 versions are just zoomed out by a factor of two from their
> >counterparts. Also, the 1:25,000 Topozone scales actually consist (were
> >derived from) a combination of 1:25,000 and 1:24,000 maps, which were
> >"resized." It doesn't make all that much difference in the error
> >determination if you use 1:25,000 vs. 1:24,000. However, I hope you are
> >using the 1:25,000 map scale contribution in the error calculator for the
> >1:50,000 Topozone maps. Similarly, use the 1:100,000 map scale for the
> >1:200,000 Topozone maps.
>
>Good to know all of the above. Actually, we used "gazetteer" from the
>dropdown on the error calculator for all of the Topozone practice records.
>We were following the example from the georeferencing guidelines where the
>coordinate source (Topozone) was considered to be a gazetteer, and thus
>selected "gazetteer" on the error calculator. It sounds like we need to
>redo the MAX ERROR with the map scale incorporated.
Actually, there is a subtle distinction to make. In the Georeferencing
Guidelines document I said that the source for that "Distance Only" example
was a gazetteer, because the coordinates were for a named place and
Topozone uses the GNIS data to plot named places; thus, the ultimate source
of the coordinates for that example is the GNIS database, which is a
gazetteer. If you had used Topozone to measure on a map, then the map
itself is the source of the coordinates and should be so reflected in the
error calculations by selecting an appropriate map scale.
> >I'm very happy to see the extent information in there. I am ruminating over
> >the inclusion of a field in the download data for the extent. I'm
> >interested in your opinion on the subject. It seems like it would actually
> >be easier than writing it out in the remarks, especially if you can copy
> >and paste it among several records. However, I think we'd do well to add a
> >NamedPlace field as well so we know to what the extent refers.
>
>XXXX and I have been meaning to reply to XXXX's message about extent. My
>opinion is that extent should be included somewhere (and in the remarks
>field is fine with me) as a record of what was done in the georeferencing
>process.
I think the general sentiment is that the complete determination
(coordinates AND error) would be fully documented if we go ahead and add
the value of the extent to the data we capture. By having a base set of
rules along with recording extent, we will know the magnitude of
every contribution to the determination. Without recording extent, we are
left to wonder how the georeferencer arrived at his/her result. Would it be
onerous to include the extent in its own field? I think it will be easier
than adding it to the remarks, both for the georeferencer and for the
compiler of named place extents (me). Part of the reason I ask this is that
I'm thinking even bigger than MaNIS to the ubiquitous problem of
georeferencing, which could benefit by having a database of extents. The
GNIS data allows for features to be described by bounding boxes, which can
be interpreted to find extents. However, for most features the bounding box
reduces to a single point. This is true of all but the largest populated
place features in the GNIS database. Given the paucity of extent data
available, and given that we (MaNIS georeferencers) will have to determine
extents for every named place we run across, we could assemble these data
and use them to provide added value to existing gazetteers. Furthermore,
these additional data could be used in the future to automate the process
of georeferencing and error calculation. If this is, indeed, a worthy goal,
then it makes sense to capture the information in its own field so that it
need not be parsed from remarks in the future.
Comments are hereby solicited.
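
For discussion, here is a minimal sketch of what capturing the extent in its own field might look like, and how a table of named place extents could then be compiled without parsing remarks. The class, the field names, and the sample values are all hypothetical; they are not the fields of the actual download file.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Determination:
    decimal_latitude: float
    decimal_longitude: float
    max_error_m: float
    named_place: Optional[str] = None  # the place the extent refers to
    extent_m: Optional[float] = None   # extent used in the error calculation
    remarks: str = ""


def extents_by_place(determinations):
    """Collect the recorded extents for each named place."""
    table = {}
    for d in determinations:
        if d.named_place is not None and d.extent_m is not None:
            table.setdefault(d.named_place, []).append(d.extent_m)
    return table


# Made-up records, for illustration only.
records = [
    Determination(42.5, -85.4, 1500.0, named_place="Delton", extent_m=800.0),
    Determination(42.5, -85.4, 9000.0, named_place="Delton", extent_m=750.0),
    Determination(42.6, -85.3, 300.0),
]
print(extents_by_place(records))
```

With separate fields, differing extents recorded for the same place by different georeferencers also become easy to compare.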
> >Overall, the agreement in the coordinates and the errors is astonishing.
> >The mean deviation in coordinates across the whole dataset is only about
> >300 meters and most of this is due to the two vague localities ("Barry
> >State Game Area" and "Yankee Springs Area"). For the most part the errors
> >take care of the differences. You have bolstered my faith in the system.
>
>Yes - these were large areas that were actually adjacent to one another. I
>found them to be somewhat difficult to georeference.
>
> >The one locality for which I cannot understand the discrepancy is "Clear
> >Lake Camp, 6 mi. E Delton." You might want to revisit that one to see where
> >the problem occurred.
>
>I know what happened here - operator difference (or assumption error?) pure
>and simple. I believe that Robin treated this as an offset, and I
>completely ignored the offset and focused on a "church camp" on the map
>that was on the shore of Clear Lake (the lake was about 6.5 miles east of
>Delton). Thus, I treated this as a named place (and perhaps my assumption
>was an unwarranted big stretch) and Robin treated it as an offset. I
>believe that Robin's choice was the better of the two.
> >
> >Nice.
>
>Thanks again!
>
>XXXXX
>
>
> >
> >John W
> >
> >>Attached are two files containing identical Barry County localities that we
> >>have georeferenced individually as practice with the MaNIS guidelines. We
> >>would sincerely appreciate your critique of our work before we submit files
> >>for inclusion in the project.
> >>
> >>Thanks for all of your help.
> >>
> >>Sincerely,
> >>
> >>XXXXXX
>>> Posting number 168, dated 13 Feb 2002 11:55:01
Date: Wed, 13 Feb 2002 11:55:01 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Should we save extents?
In-Reply-To: <v0213050ab886050432cf@[207.207.103.162]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
>For TRS lat/longs, I am using the extents in the Guidelines update. For
>lookup on the MontanaTRS site I am assuming unknown datum and no error due
>to scale as done in the Georef Guidelines examples for placename only.
>Correct?
Correct.
>>> Posting number 169, dated 13 Feb 2002 12:21:25
Date: Wed, 13 Feb 2002 12:21:25 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: MSU Practice - More Comments
Comments:
In-Reply-To: <3.0.32.20020213145319.00720da8@pilot.msu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
More relevant exchanges.
> >>We do not and will not be using Excel for georeferencing. We just used it
> >>this one time to send you the sample records via e-mail. I am hoping that
> >>the data were not altered.
> >
> >Will you use Access then?
>
>Yes - we are extremely happy with your Access template! (Why would anyone
>want to use something else?)
Good question! I would have no problem accepting that everyone used it.
>On the template, we have found it useful to just "close up" the columns
>that we don't want to look at while georeferencing. (You probably noticed
>this in the Excel version).
>
>XXXXXXX (our IT person) will help us send the "real" files using the
>project protocol.
In so doing, be sure to preserve all of the precision in the numeric
fields. There are two ways to do this. The first is to bypass protocol and
just send me the Access database mdb file (preferably with a date in the
filename, e.g., msu_barry020213.mdb). The second is to change the data type
of those fields to text after the georeferencing is all done and then
export the data into a tab-delimited text file.
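
The second approach can be illustrated outside of Access with a short sketch: numeric values are turned into text before they are written, so no export step gets a chance to round them. The field names and the coordinate values below are made up for illustration, and the function name is hypothetical.

```python
import csv
import io


def export_tab_delimited(records, fieldnames):
    """Write records as tab-delimited text, converting floats to text first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        # repr() of a Python float is the shortest string that reads back
        # as exactly the same number, so no digits are lost.
        writer.writerow(
            {name: repr(v) if isinstance(v, float) else v for name, v in record.items()}
        )
    return buffer.getvalue()


# Made-up values, for illustration only.
rows = [{"DecimalLatitude": 42.595833, "DecimalLongitude": -85.308611}]
text = export_tab_delimited(rows, ["DecimalLatitude", "DecimalLongitude"])
print(text)

# By contrast, a fixed two-decimal format silently drops precision.
rounded = "%.2f" % 42.595833
print(rounded)
```

Two decimal places of a degree of latitude is on the order of a kilometer, so that kind of rounding would swamp most of the errors being calculated.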
> >I'm composing a reply to your previous message, which I'll send out to the
> >list due to common items of interest, and as a way of introducing more
> >information on the issue of extents.
> >
>Okay. Robin replied to me (from home) about extents. Here is her "vote".
>FROM XXXXXX: I'd vote for an actual column regarding extent
>information to assure that it was remembered. I view the column headings
>as a checklist of things I need to provide and without reference to it, it
>could easily be forgotten with all the other components.
This is a valuable, practical point with which I entirely agree.
John W
>>> Posting number 170, dated 13 Feb 2002 12:33:47
Date: Wed, 13 Feb 2002 12:33:47 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: MSU PRACTICE RECORDS
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
John, XXXX, XXXX: Can I get copies of the data files? I'd like to run
them through the lat/long calculator for comparison.
>>> Posting number 171, dated 14 Feb 2002 18:30:16
Date: Thu, 14 Feb 2002 18:30:16 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Error Calculator:Coordinate Source & Topozone.com
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Hi John,
Thanks for the helpful information about map scales and choices to make on
the error calculator when using Topozone.com for georeferencing. I have
some additional questions about this. The message exchanges (from
Mammal-Z-Net) are copied below.
>From John:
>> >Topozone actually has
>> >only two underlying scales, 1:25,000, and 1:100,000. The 1:50,000 and
>> >1:200,000 versions are just zoomed out by a factor of two from their
>> >counterparts. Also, the 1:25,000 Topozone scales actually consist (were
>> >derived from) a combination of 1:25,000 and 1:24,000 maps, which were
>> >"resized." It doesn't make all that much difference in the error
>> >determination if you use 1:25,000 vs. 1:24,000. However, I hope you are
>> >using the 1:25,000 map scale contribution in the error calculator for the
>> >1:50,000 Topozone maps. Similarly, use the 1:100,000 map scale for the
>> >1:200,000 Topozone maps.
>>
>From XXXXX:
>>Good to know all of the above. Actually, we used "gazetteer" from the
>>dropdown on the error calculator for all of the Topozone practice records.
>>We were following the example from the georeferencing guidelines where the
>>coordinate source (Topozone) was considered to be a gazetteer, and thus
>>selected "gazetteer" on the error calculator. It sounds like we need to
>>redo the MAX ERROR with the map scale incorporated.
>From John:
>Actually, there is a subtle distinction to make. In the Georeferencing
>Guidelines document I said that the source for that "Distance Only" example
>was a gazetteer, because the coordinates were for a named place and
>Topozone uses the GNIS data to plot named places; thus, the ultimate source
>of the coordinates for that example is the GNIS database, which is a
>gazetteer. If you had used Topozone to measure on a map, then the map
>itself is the source of the coordinates and should be so reflected in the
>error calculations by selecting an appropriate map scale.
>
My questions:
1. I understand (from exchange above) that if the locality that we want to
georeference is a named place (such as East Lansing or Beaver Island or
Fine Lake) and we enter this into the Place Name Search in Topozone and
Topozone gives us the coordinates of that place, then the Coordinate Source
that we select on the Error Calculator will be a Gazetteer (because
Topozone got those coordinates from GNIS). Thus, I believe that we
calculated the error correctly in the practice records that contained
coordinates given by Topozone for named places (such as Fine Lake). Is
this correct?
2. Are the Topozone maps considered to be USGS or non-USGS maps? For
Example, If we used a Topozone.com map at 1:25,000 scale to measure the
distance from a town, shall we select USGS Map 1:25,000 or non-USGS Map
1:25,000 from the Coordinate Source dropdown on the Error Calculator?
Thanks again,
XXXXX
>>> Posting number 172, dated 14 Feb 2002 15:39:16
Date: Thu, 14 Feb 2002 15:39:16 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Error Calculator:Coordinate Source & Topozone.com
In-Reply-To: <3.0.32.20020214183015.0072e530@pilot.msu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXXX, and all,
You are correct with respect to question 1, below. You got the coordinates
indirectly from GNIS for named places, therefore, the appropriate source is
a gazetteer. If you use Topozone to find a locality, but do any kind of
measuring on the Topozone maps, then you are indirectly using a USGS map,
and you should select the appropriate scale in the coordinate source
dropdown box in the error calculator application. So, to explicitly answer
question 2, below, use "USGS Map 1:25,000" for Topozone maps at either
1:25,000 or 1:50,000. Use "USGS Map 1:100,000" for Topozone maps at either
1:100,000 or 1:200,000. While we're at it, here's a reminder to always use
NAD27 for Topozone-derived coordinates, whether from the gazetteer or from
the maps.
John W
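
The rule above can be summarized in a small sketch. The function name and its parameters are hypothetical; the source labels are the ones quoted in this exchange.

```python
def coordinate_source(from_place_name_search, displayed_scale=None):
    """Pick the Error Calculator coordinate source for Topozone-derived coordinates."""
    if from_place_name_search:
        # Place name search results come from GNIS, so the source is a gazetteer.
        return "gazetteer"
    underlying = {
        25000: "USGS Map 1:25,000",
        50000: "USGS Map 1:25,000",    # zoomed-out view of the 1:25,000 maps
        100000: "USGS Map 1:100,000",
        200000: "USGS Map 1:100,000",  # zoomed-out view of the 1:100,000 maps
    }
    if displayed_scale not in underlying:
        raise ValueError("unexpected Topozone scale: %r" % (displayed_scale,))
    return underlying[displayed_scale]


# Datum to record for all Topozone-derived coordinates, per the reminder above.
TOPOZONE_DATUM = "NAD27"

print(coordinate_source(True))
print(coordinate_source(False, 50000))
print(coordinate_source(False, 200000))
```

Any coordinates read off the map image, whether by measuring or by other means, would go through the map-scale branch.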
>Thanks for the helpful information about map scales and choices to make on
>the error calculator when using Topozone.com for georeferencing. I have
>some additional questions about this. The message exchanges (from
>Mammal-Z-Net) are copied below.
>
> >From John:
> >> >Topozone actually has
> >> >only two underlying scales, 1:25,000, and 1:100,000. The 1:50,000 and
> >> >1:200,000 versions are just zoomed out by a factor of two from their
> >> >counterparts. Also, the 1:25,000 Topozone scales actually consist (were
> >> >derived from) a combination of 1:25,000 and 1:24,000 maps, which were
> >> >"resized." It doesn't make all that much difference in the error
> >> >determination if you use 1:25,000 vs. 1:24,000. However, I hope you are
> >> >using the 1:25,000 map scale contribution in the error calculator for the
> >> >1:50,000 Topozone maps. Similarly, use the 1:100,000 map scale for the
> >> >1:200,000 Topozone maps.
> >>
> >From XXXXX:
> >>Good to know all of the above. Actually, we used "gazetteer" from the
> >>dropdown on the error calculator for all of the Topozone practice records.
> >>We were following the example from the georeferencing guidelines where the
> >>coordinate source (Topozone) was considered to be a gazetteer, and thus
> >>selected "gazetteer" on the error calculator. It sounds like we need to
> >>redo the MAX ERROR with the map scale incorporated.
>
> >From John:
> >Actually, there is a subtle distinction to make. In the Georeferencing
> >Guidelines document I said that the source for that "Distance Only" example
> >was a gazetteer, because the coordinates were for a named place and
> >Topozone uses the GNIS data to plot named places; thus, the ultimate source
> >of the coordinates for that example is the GNIS database, which is a
> >gazetteer. If you had used Topozone to measure on a map, then the map
> >itself is the source of the coordinates and should be so reflected in the
> >error calculations by selecting an appropriate map scale.
> >
>My questions:
>
>1. I understand (from exchange above) that if the locality that we want to
>georeference is a named place (such as East Lansing or Beaver Island or
>Fine Lake) and we enter this into the Place Name Search in Topozone and
>Topozone gives us the coordinates of that place, then the Coordinate Source
>that we select on the Error Calculator will be a Gazetteer (because
>Topozone got those coordinates from GNIS). Thus, I believe that we
>calculated the error correctly in the practice records that contained
>coordinates given by Topozone for named places (such as Fine Lake). Is
>this correct?
>
>2. Are the Topozone maps considered to be USGS or non-USGS maps? For
>Example, If we used a Topozone.com map at 1:25,000 scale to measure the
>distance from a town, shall we select USGS Map 1:25,000 or non-USGS Map
>1:25,000 from the Coordinate Source dropdown on the Error Calculator?
>
>Thanks again,
>XXXXX
>
>>> Posting number 173, dated 15 Feb 2002 10:55:46
Date: Fri, 15 Feb 2002 10:55:46 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: Error Calculator:Coordinate Source & Topozone.com
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Hi John,
Thanks for the information. We'll go ahead and recalculate the Max Error
Values on our "practice" records.
One minor question with respect to the word "measuring" in your response
below: For some localities, such as road intersections for example, we get
the coordinates by placing the cursor on the Topozone map, and then
clicking to get the target coordinates of that particular locality. We
really aren't "measuring", but the coordinates are still considered to be
derived from Topozone, and so the map scale information gets applied to the
error calculator - correct?
Thanks,
XXXXX
At 03:39 PM 02/14/2002 -0800, you wrote:
>XXXXX, and all,
>
>You are correct with respect to question 1, below. You got the coordinates
>indirectly from GNIS for named places, therefore, the appropriate source is
>a gazetteer. If you use Topozone to find a locality, but do any kind of
>measuring on the Topozone maps, then you are indirectly using a USGS map,
>and you should select the appropriate scale in the coordinate source
>dropdown box in the error calculator application. So, to explicitly answer
>question 2, below, use "USGS Map 1:25,000" for Topozone maps at either
>1:25,000 or 1:50,000. Use "USGS Map 1:100,000" for Topozone maps at either
>1:100,000 or 1:200,000. While we're at it, here's a reminder to always use
>NAD27 for Topozone-derived coordinates, whether from the gazetteer or from
>the maps.
>
>John W
>
>>> Posting number 174, dated 15 Feb 2002 09:09:40
Date: Fri, 15 Feb 2002 09:09:40 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Error Calculator:Coordinate Source & Topozone.com
In-Reply-To: <3.0.32.20020215105545.0072c878@pilot.msu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXXX,
You aptly described exactly what I meant. Thank you.
John
>One minor question with respect to the word "measuring" in your response
>below: For some localities, such as road intersections for example, we get
>the coordinates by placing the cursor on the Topozone map, and then
>clicking to get the target coordinates of that particular locality. We
>really aren't "measuring", but the coordinates are still considered to be
>derived from Topozone, and so the map scale information gets applied to the
>error calculator - correct?
>
>Thanks,
>XXXXX
>
>>> Posting number 175, dated 15 Feb 2002 18:08:31
Date: Fri, 15 Feb 2002 18:08:31 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: coordinate source?
Comments: cc: fsyu <fsyu@uaf.edu>
In-Reply-To: <3C6D8138@webmail.uaf.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXXX and all,
There is no provision for georeferencing records that already have
coordinates, but this shouldn't necessarily deter you from doing so. If you
go this route, please be sure to note that you have provided these
additional data when you send them in to me. It makes a difference in how I
handle the data on this end.
To answer your specific question, you should put "original locality
description" in the DeterminationRef field in the downloaded data file and
use "locality description" as the Coordinate Source choice in the Error
Calculator.
John W
>Hi John,
>
>Many Alaska data are already georeferenced, but don't have maximum error.
>I've been calculating max. error for them, but determination references are
>not recorded for most of them. What should I enter in Coordinate source in
>Error Calculator?
>
>XXXXXX
>>> Posting number 176, dated 20 Feb 2002 09:06:36
Date: Wed, 20 Feb 2002 09:06:36 -1000
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Topo USA Ver. 3.0 by DeLorme
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
For anyone using DeLorme software Topo USA Ver. 3.0 (which I am using to do
Hawaii localities) you will need this information for the georeferencing
calculator. I just spoke with the Tech help people and got the information
that all topo maps, at all zoom levels, are based on USGS 1:24,000. I
quite like this software as it allows me to place markers for all the
localities I've done which greatly speeds up any double checking I might
want to do. Measuring distances is also easy, either by air or road.
XXXXX
>>> Posting number 177, dated 25 Feb 2002 14:36:49
Date: Mon, 25 Feb 2002 14:36:49 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: MaNIS Server recommendations
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
Due to popular demand, I'm writing to give an updated recommendation for
the MaNIS server specifications. The requirements haven't changed since the
original specifications were sent out on 2 Oct 2001. Nevertheless, I'll
reiterate the essentials of the configuration, ordered by importance:
1) dual processor Windows 2000 Professional - the Xeon processor is good
for our purposes; faster is better, but anything on the market today is
fast enough.
2) 512 MB RAM - more is better, but not at the cost of any of the other
essentials.
3) one fast SCSI hard drive - essential; faster is better; capacity is much
less important. 18GB is a good target capacity.
4) 10/100 Ethernet adapter - essential; most systems these days have one on
board.
5) 3 yr service on parts and labor - essential; we don't want anything to
break without warranty during the period of the grant.
6) CD-ROM drive - faster is better; a CD-RW may be a useful alternative, if
it fits your budget.
7) 17" Monitor - this machine is supposed to be a server, not a
workstation, so don't spend big money on a fancy display.
8) 1.44 MB diskette drive - less essential every day, but most machines
still come with one.
I've created a model system on the Dell website to give you an idea for a
recommended configuration. To look at the specifications for the system
you'll need to Retrieve EQuote #E001554835. You'll also need to enter
either the E-Quote name, which is "manis2," or my email address.
Let me know if you have any questions.
John W.
>>> Posting number 178, dated 27 Feb 2002 14:59:52
Date: Wed, 27 Feb 2002 14:59:52 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Mystified
In-Reply-To: <3.0.32.20020227173043.007327a8@pilot.msu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Hi XXXX, XXXX, and all,
I have noticed the syndrome you mentioned and I tried to ignore it. That's
harder to do when someone else notices it. It's even worse when two people
notice it - it gets harder to remove the witnesses. I think I know why it
occurs, but I don't have a satisfactory solution yet. I actually made the
interface show 3 decimal places in the Maximum Error field so that this
inconsistency would make less of an impact on the results, which may
currently differ from the expected by up to .001 distance units. So, the
worst case scenario occurs when your distance units are miles, and then the
error (in the error) amounts to about 5.3 feet. This is probably acceptable
and worth trading in your concern for a life. :) In the meantime, I'll
remain cognizant of the problem and try to work on its resolution.
John
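The worst-case figure above can be checked with a little arithmetic. The sketch below assumes only the 0.001 discrepancy quoted in the message and standard unit conversions; the function name is illustrative and is not part of the Error Calculator.

```python
# Feet per distance unit, using standard conversion factors.
FEET_PER_UNIT = {
    "mi": 5280.0,
    "km": 1000.0 / 0.3048,
    "m": 1.0 / 0.3048,
    "ft": 1.0,
}


def discrepancy_in_feet(units: str, discrepancy: float = 0.001) -> float:
    """Convert a discrepancy expressed in `units` into feet."""
    return discrepancy * FEET_PER_UNIT[units]


# Compare the size of a 0.001 discrepancy across the distance units.
for units in ("mi", "km", "m", "ft"):
    print(f"0.001 {units} = {discrepancy_in_feet(units):.2f} ft")
```

Miles give the largest value of the four units, 5.28 ft, which agrees with the "about 5.3 feet" quoted above.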
At 05:30 PM 2/27/02 -0500, you wrote:
>Hi John,
>
>XXXX and I are mystified about some of the error values in our Barry
>County records (files sent to you in today's earlier message).
>
>1. In the first set of Barry County records (the files that we sent to you
>on 2/12/2002) we incorrectly chose Gazetteer as the error calculator
>coordinate source for Topozone for all records. For the records that were
>TRS localities, we anticipated getting identical values for maximum error.
>This was not the case. When XXXX used the error calculator on her
>computer, she got .716 as the error. When I used the error calculator on
>my computer for these types of records, I got .715 as the error.
>
>2. In the second set of Barry County records (the files that we sent to
>you today 2/27/2002 where maximum error was recalculated with the
>appropriate Topozone map scale), our computers continue to give different
>error calculator values for some of the TRS localities that used an error
>calculator map scale of 1:25,000 (See Sec. 23, T1N, R7W,
>Sec. 24, T1N, R7W and
>T01N R07W Section 4)
>
>3. We were surprised at the above examples. We then entered each other's
>coordinates using identical dropdown choices on the error calculator on our
>respective computers. XXXX's computer still consistently returned an
>error of .723 for all of the TRS localities that had the 1:25,000 scale.
>However, XXXX's computer returned an error of .723 on some localities and
>.724 on others with the 1:25,000 scale. Do we need to be concerned about
>this? (or shall we get a life?)
>
>Thanks,
>XXXXX
>>> Posting number 179, dated 27 Feb 2002 16:24:51
Date: Wed, 27 Feb 2002 16:24:51 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Sample of georeferencing from Baton Rouge
Comments: To:
In-Reply-To: <OF09532E16.D5566143-ON86256B6D.00611AF2@lsu.edu>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=====================_-1683450515==_"
--=====================_-1683450515==_
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX, and all,
Very nicely done. I can see that you've gone to a lot of trouble to
document the determination methods in the Remarks. There should be no
trouble for someone to figure out later what you did. Some of the
techniques you used (and documented) will surely be useful to others, so
I'm attaching your file with this message to the mammal-z-net list.
I'm trying to decide if/how to make everyone's job a little easier, perhaps
by including a field for named place along with one for the extent. That
way we'll know unequivocally to what the extent refers. I've just started
having my georeferencers do this, and it seems to be better (faster anyway)
than trying to write that information out in plain English in the remarks.
I'm interested in feedback from you and anyone else with an opinion about
whether this change would have a positive effect on your georeferencing.
I'm hoping to set a policy on this subject once there has been ample time
for cogitation on it. In the meantime, I recommend that georeferencers add
two columns to their data, one for NamedPlace, followed by one for Extent,
and put these right before MaximumErrorDistance. Do not include an
ExtentUnits field; instead, use the same units as for the
MaximumErrorDistance and the MaxErrorUnits will refer to both measures.
John W
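For anyone working from tab-delimited files rather than the Access template, the recommended column placement might be sketched as follows. This is only an illustration: the helper name is hypothetical, and it assumes the error column is called MaxErrorDistance, as in the attached sample file.

```python
def add_extent_columns(header, error_column="MaxErrorDistance"):
    """Return a copy of `header` with NamedPlace and Extent placed
    immediately before the maximum error column."""
    if "NamedPlace" in header or "Extent" in header:
        return list(header)  # columns already present; leave unchanged
    position = header.index(error_column)
    return header[:position] + ["NamedPlace", "Extent"] + header[position:]


# A shortened header for illustration; real files have many more fields.
header = ["DecLat", "DecLong", "MaxErrorDistance", "MaxErrorUnits", "LatLongRemarks"]
print(add_extent_columns(header))
```

No ExtentUnits column is added, so MaxErrorUnits applies to both Extent and MaxErrorDistance.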
>Hi John,
>
>Here at LSU, we've downloaded all the Louisiana records from the MANIS
>database, and have begun georeferencing, starting with records from Baton
>Rouge (our home turf). We've learned a lot as we've worked through our
>first batch of records, especially from much of the recent email exchanges
>with other institutions, and we really appreciate the ease of use of the
>Error Calculator. We were wondering if you could look over a small (<20
>records) sample of some of the different types of localities we have
>georeferenced, just to see if we are on the right track. Our longest field
>is the LatLongRemarks, where we describe how we located the point and the
>extent that we estimated to calculate error with. We just wanted to make
>sure that you would be able to follow what we did if there are any
>questions with our georeferencing. Should we place the extents in a
>separate field, and if so, should we place it in any particular order with
>respect to the other fields? Let us know if you see any problems.
>
>Many thanks,
>
>XXXXXXX
>**********************************************************
--=====================_-1683450515==_
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: attachment; filename="batonrouge.txt"
"LocalityID" "CollectionCode" "HigherGeog" "SpecLocality" "ElevationText" "MinElev"
"MaxElev" "ElevUnits" "LatText" "LongText" "TRS" "Township" "TownshipDir"
"Range" "RangeDir" "TRSSection" "TRSPart" "DetByAgentID" "DeterminedByPerson"
"DeterminedDate" "DeterminationRef" "OrigCoordSystem" "Datum" "DecLat"
"DecLong" "LatDeg" "LatMin" "LatSec" "LatDir" "LongDeg"
"LongMin" "LongSec" "LongDir" "UTMZone" "UTMEW" "UTMNS" "MaxErrorDistance"
"MaxErrorUnits" "LatLongRemarks" "CaptiveFlag" "NoGeorefBecause" "LocalityAnnotation"
13056 "CAS" "North America, USA, Louisiana" "Briar patch near LSU campus, East Baton Rouge"
"Dinakar Nethi" "1-22-02" "Topozone - gazetteer" "decimal degrees" "NAD27" "30.4141"
"-91.1759"
"1.009" "mi" "center point of LSU Campus obtained from topozone, estimated furthest extent of ""near
LSU campus"" from center as 1 mi" "0"
28636 "FMNH" "USA, Louisiana, Baton Rouge Par" "Baton Rouge"
"Satya Maliakal" "1-23-02" "Topozone - gazetteer" "decimal degrees" "NAD27"
"30.4451" "-91.1867"
"13.009" "mi" "used EBR Parish courthouse as center, furthest extent of BR city limits from
courthouse estimated at 13 mi" "0"
47616 "KU" "U S A, LOUISIANA, EAST BATON ROUGE PARISH" "BATON ROUGE, 5 MI S OF"
"m" "Satya Maliakal"
"1-23-02" "Topozone -1:100,000" "decimal degrees" "NAD27" "30.3725" "-91.1867"
"15.903" "mi" "located point 5mi S of EBR Parish courthouse, furthest extent of BR city limits
from courthouse estimated at 13 mi" "0"
71051 "LSU" "USA, Louisiana, East Baton Rouge Parish" "0.25 mi E jct. Highland and Lee (on
Highland), Baton Rouge" "0" "0"
"Satya Maliakal" "1-28-02" "Topozone -1:25,000" "decimal degrees" "NAD27"
"30.3911" "-90.1562"
"38.412" "m" "located point 0.25 mi E of intersection of Highland and Lee on Highland,
estimated extent of intersection as 10 m" "0"
71121 "LSU" "USA, Louisiana, East Baton Rouge Parish" "1 km S Baton Rouge, intersection Ben
Hur Rd. and Nicholson Rd., E tracks along fence line, 5 m" "0" "0"
"Satya Maliakal" "1-28-02" "Topozone -1:25,000" "decimal degrees" "NAD27"
"30.3841" "-91.1687"
"43.413" "m" "located point at intersection of nicholson drive RR tracks and ben hur road,
assuming that 1 km S of BR refers to this intersection, estimated extent of intersection as 10 m with 5
m offset" "0"
71074 "LSU" "USA, Louisiana, East Baton Rouge Parish" "0.33 mi S of Baton Rouge City Limits on
Highland Rd" "0" "0"
"Satya Maliakal" "1-28-02" "Topozone -1:25,000" "decimal degrees" "NAD27"
"30.3687" "-91.1227"
"38.414" "m" "point located .33 mi S of intersection of Highland Rd. and southern Baton Rouge
Corp. Limit on Highland Road, estimated extent of intersection as 10 m" "0"
71248 "LSU" "USA, Louisiana, East Baton Rouge Parish" "10 mi S Baton Rouge on River Rd"
"16" "16" "meters"
"Satya Maliakal" "1-28-02" "Topozone -1:100,000" "decimal degrees" "NAD27"
"30.3533" "-91.1808"
"14.041" "mi" "located point 10 mi S of EBR courthouse following River Road, furthest extent
of Baton Rouge city limits from courthouse estimated at 13 mi" "0"
71268 "LSU" "USA, Louisiana, East Baton Rouge Parish" "11465 Robin Hood, Baton Rouge"
"0" "0"
"Satya Maliakal" "1-29-02" "Topozone -1:25,000" "decimal degrees" "NAD27"
"30.4555" "-91.0561"
"37.408" "m" "located 11465 Robin Hood with yahoo maps, then located this point with
topozone, estimated extent of property at 10 m" "0"
71243 "LSU" "USA, Louisiana, East Baton Rouge Parish" "10 mi N Baton Rouge, US 61"
"0" "0"
"Satya Maliakal" "1-29-02" "Topozone -1:100,000" "decimal degrees" "NAD27"
"30.5503" "-91.1969"
"14.041" "mi" "located point 10 mi N of BR along US 61 (starting from EBR Parish courthouse
latitude), furthest extent of Baton Rouge city limits estimated at 13 mi" "0"
71511 "LSU" "USA, Louisiana, East Baton Rouge Parish" "3.4 mi E, 1 mi N Baton Rouge on LA 37"
"0" "0"
"Satya Maliakal" "2-13-02" "Topozone -1:25,000" "decimal degrees" "NAD27"
"30.4655" "-91.1329"
"19.819" "mi" "located closest point 3.4 mi E and 1 mi N of EBR courthouse on LA 37, furthest
extent of BR city limits from courthouse estimated at 13 mi" "0"
71294 "LSU" "USA, Louisiana, East Baton Rouge Parish" "2 mi N Baton Rouge on Miss. River"
"0" "0"
"Satya Maliakal" "2-08-02" "Topozone -1:25,000" "decimal degrees" "NAD27"
"30.4733" "-91.1927"
"14.017" "mi" "located point 2 mi N of EBR Parish courthouse following Mississippi River,
furthest extent of BR city limits from courthouse estimated at 13 mi" "0"
71801 "LSU" "USA, Louisiana, East Baton Rouge Parish" "Baton Rouge on River Road"
"16" "16" "meters"
"Satya Maliakal" "2-20-02" "Topozone -1:100,000" "decimal degrees" "NAD27"
"30.3749" "-91.2249"
"5.041" "mi" "located point at center of River Rd. in Baton Rouge, estimated furthest exent of River
Rd. in BR from center at 5 mi" "0"
71802 "LSU" "USA, Louisiana, East Baton Rouge Parish" "Baton Rouge Quad. 15' Sec 51, T7S, R2E"
"45" "45" "feet"
"Satya Maliakal" "2-21-02" "Topozone -1:25,000" "decimal degrees" "NAD27"
"30.4277" "-91.0072"
"4.260" "mi" "located point at center of T7S, R2E (unable to locate Quad. 15' Sec. 51)" "0"
71897 "LSU" "USA, Louisiana, East Baton Rouge Parish" "Baton Rouge, Tulane Ave"
"0" "0"
"Dinakar Nethi" "02-25-02" "Topozone -1:25,000" "decimal degrees" "NAD27" "30.4019"
"-91.1652"
"0.527" "km" "point located at approximate center of Tulane Ave., furthest extent of Tulane avenue
from center point estimated as .5 km" "0"
71821 "LSU" "USA, Louisiana, East Baton Rouge Parish" "Baton Rouge, 2100 Stanford"
"0" "0"
"Dinakar Nethi" "02-08-02" "Topozone - 1:25,000" "decimal degrees" "NAD27" "30.4187"
"-91.1536"
"37.410" "m" "located 2100 Stanford with yahoo maps and then located this point on topozone,
extent of property estimated at 10 m" "0"
--=====================_-1683450515==_
Content-Type: text/plain; charset="us-ascii"; format=flowed
--=====================_-1683450515==_--
>>> Posting number 180, dated 7 Mar 2002 14:15:38
Date: Thu, 7 Mar 2002 14:15:38 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: MaNIS
In-Reply-To: <a05100301b8ad6761b1be@[141.211.110.228]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX and all,
>Hello John,
>My apologies, when I am georeferencing I use the "hide" command under
>"column" in the "format" menu of excel to close down columns that I seldom
>or never use. In this way, I can see the decimal latitude and longitude
>columns, for example, directly next to the locality column on my computer
>screen. I inadvertently forgot to "unhide" a few columns when I sent the
>excel files back to you.
I should have looked for that.
>A question for you: I have some localities where the data is obviously in
>error but cannot be corrected by me. Do you prefer that I reference the
>county center with a note in the locality annotation column, or not
>georeference the locality with a note in the NoGeorefBecause column?
There are two different classes of locality errors that you need to worry
about, those with internal inconsistencies that make the locality
impossible to determine (e.g., Hogback Creek, Inyo County - there are two
of these), and those that have an obvious error that can be corrected
unambiguously (e.g., Needles, Mojave Co., California - Mojave Co. is in
Arizona and Needles is in San Bernardino Co, California).
If there is an internal inconsistency in the locality information that
makes the locality impossible to determine unambiguously, do not provide
coordinates and error, but do put something like "internal inconsistency"
in the NoGeorefBecause field and explain the problem in the
LocalityAnnotation field (e.g., "there are two Hogback Creeks in Inyo
Co."). When the source institution gets the georeferenced data back,
they'll be able to see what the problem was for each locality that was not
georeferenced.
If there is an obvious error that doesn't make the georeferencing
ambiguous, go ahead and georeference the locality, but put your assumptions
in the LatLongRemarks field and definitely point out the error in the
LocalityAnnotation field. The source institution will be able to see what
your assumptions were and they'll be able to fix the errors you uncovered.
In summary, LatLongRemarks should be filled with information about how you
georeferenced, LocalityAnnotation should be filled with information about
errors or ambiguities - intended for the source institution, and
NoGeorefBecause should be a brief phrase describing your reason for not
georeferencing a locality (e.g., "internal inconsistency", "too vague", "no
specific locality").
John W
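The two classes of locality error and the three fields they touch can be summarized in a small sketch. The function and the "ambiguous"/"correctable" labels are hypothetical names chosen for illustration; only the field names and example phrases come from the message above.

```python
def annotate_locality(problem, note, assumptions=""):
    """Return the remark fields for a problem locality.

    problem: "ambiguous"   - internal inconsistency, cannot be georeferenced
             "correctable" - obvious error that can be fixed unambiguously
    note: explanation of the error, intended for the source institution
    assumptions: how the locality was georeferenced (correctable case only)
    """
    if problem == "ambiguous":
        # No coordinates or error: give the reason and explain the problem.
        return {
            "NoGeorefBecause": "internal inconsistency",
            "LocalityAnnotation": note,
            "LatLongRemarks": "",
        }
    if problem == "correctable":
        # Georeference anyway: record assumptions and point out the error.
        return {
            "NoGeorefBecause": "",
            "LocalityAnnotation": note,
            "LatLongRemarks": assumptions,
        }
    raise ValueError(f"unknown problem class: {problem}")


print(annotate_locality("ambiguous", "there are two Hogback Creeks in Inyo Co."))
print(annotate_locality(
    "correctable",
    "Mojave Co. is in Arizona; Needles is in San Bernardino Co., California",
    assumptions="assumed Needles, San Bernardino Co., California",
))
```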
>>> Posting number 181, dated 9 Mar 2002 11:19:22
Date: Sat, 9 Mar 2002 11:19:22 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Some other useful Excel operations for MaNIS work
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
In addition to Hide columns, some other Excel operations I have found
useful are:
1. AutoFilter (similar to Access):
First select a column or columns, then choose
Data Menu>AutoFilter>select (Custom...) from the scrollable pick list>pick
contains and enter data of interest.
Using custom contains filtering, you can pull out all records for a county
from the backward HighGeo field or get all occurrences of a placename in
SpecLoc. Records can be worked with as desired.
Show All just under AutoFilter on the Data menu brings all records back.
2. Protect Worksheet: This will prevent inadvertent changes to MaNIS
records handed down from the mount but cells, columns, or rows can be left
open for data entry if you first select them, then under Format
cells>Protection tab>click unlocked. Once a worksheet is locked you can
enter data manually or automatically (e.g., DecLat, DecLong, error) but still
lock out changes to the locality fields. Protecting disables the Sort
capability.
3. LookUp:
Works great for dynamic lookup (as you type) and automatic assignment of
data like a placename lat/long from another list like the GNIS download.
With about 5000 of these links in the Oregon records, my machine (196 MB
RAM) starts to bog down. To get rid of the links but retain the data, do
a Copy, Paste Special, click Value.
I've been using LookUp in four columns after LocAnnotation. I enter a
placename (winnowed by user), which is then looked up, and values for GNIS
placename, type of locality, county, and DecLat, DecLong are returned.
Placename, type and county are for user verification and lat & long are for
computing lat/longs based on offsets.
4. Concatenation: For a text field this is done with "&", eg, columns A,
B, C can be appended to D with
"=D:D&", "&A:A&", "&B:B&", "&C:C" . Enter this in the first field, then
fill down as needed. I use this to add misc notes to memo fields of MaNIS.
You can flip the HighGeo to have county first for sorting by doing a Text
to columns (Data menu), then concatenating the columns with the county
column first. Of course leave the original HighGeo unaltered.
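For those not working in Excel, the county-first flip described in item 4 might look something like this in Python. It is a rough sketch that assumes the county is the last comma-separated element of the higher geography string, which will not hold for every record; the function name is made up for illustration.

```python
def county_first(higher_geog: str) -> str:
    """Move the last comma-separated element (assumed to be the county)
    to the front, leaving the remaining elements in their original order."""
    parts = [part.strip() for part in higher_geog.split(",")]
    if len(parts) < 2:
        return higher_geog  # nothing to reorder
    return ", ".join([parts[-1]] + parts[:-1])


# Build a separate sort key; the original value is left unaltered.
original = "USA, Louisiana, East Baton Rouge Parish"
sort_key = county_first(original)
print(sort_key)
```

A string with no county at the end (for example one that stops at the state) would have its state moved to the front instead, so the result is best kept in a separate sorting column.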
When you get tired of these, there is the underlying Visual Basic macro
editor which is fun if you like that sort of thing.
I'll probably stick with Excel through the project due to our "Mac-enabled"
status in the museum. I use Windows at home and in the museum as soon as
our server arrives.
>>> Posting number 182, dated 11 Mar 2002 12:02:55
Date: Mon, 11 Mar 2002 12:02:55 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Sending Data from MSU
In-Reply-To: <3.0.32.20020311144354.006e023c@pilot.msu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX, and all,
>XXXX and I have data from a few Michigan counties to send to you. So far,
>we have Access.mdb files for Barry, Branch, and Muskegon ready to go, and
>Kent, Ionia, and Montcalm are forthcoming. We have two questions for you:
>
>1. We minimize the width (what I call "closing up") of many columns on the
>template (basically ones that we don't fill in with data, or don't want to
>look at). Do you want us to open these columns back up before we send the
>file to you?
Nope. They're fine all closed up.
>2. Do you have a preference for how often we send files to you? (Aren't
>you getting bombarded with georeferencing data??)
Yes, the deluge has begun. Well, it's best to have the work backed up, so
it seems that you should send them as you finish them. Keep a copy on your
end too, for the sake of safety - you never know when we'll get hit by "the
Big One." To minimize the threat of loss, it's probably best to upload
them as described in the Georeferencing Steps document (i.e., ftp to
galaxy.cs.berkeley.edu/incoming/mvz). Then send me messages as they arrive
safely. Of course, if you are sending Excel (.xls) or Access (.mdb) files,
you don't need to export as tab-delimited text and you should change the
file type to binary when ftp-ing.
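For anyone scripting the upload, the binary-versus-text choice could be sketched with Python's standard ftplib. The helper names are hypothetical, and the login details in the commented usage are an assumption; only the host and directory come from the message above.

```python
from ftplib import FTP
from pathlib import Path

# Excel and Access files must go up in binary mode.
BINARY_SUFFIXES = {".xls", ".mdb"}


def transfer_mode(filename: str) -> str:
    """Return "binary" for Excel/Access files, "ascii" for anything else."""
    suffix = Path(filename).suffix.lower()
    return "binary" if suffix in BINARY_SUFFIXES else "ascii"


def upload(ftp: FTP, filename: str) -> None:
    """Send one file over an open FTP connection in the appropriate mode."""
    command = f"STOR {Path(filename).name}"
    with open(filename, "rb") as handle:
        if transfer_mode(filename) == "binary":
            ftp.storbinary(command, handle)
        else:
            ftp.storlines(command, handle)


# Example usage (not run here; anonymous login is an assumption):
# ftp = FTP("galaxy.cs.berkeley.edu")
# ftp.login()
# ftp.cwd("incoming/mvz")
# upload(ftp, "barry.mdb")
# ftp.quit()
```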
>Thanks,
>XXXX
>
>P.S. Thanks for "secretly" adding the NamedPlace and Extent fields to the
>template. (We moved them over next to the MaxError column in our tables).
OK, the secret is out. For those of you who may not be aware of it, there
is an Access Database template for georeferencing that can be accessed
through a link in Step Five on the GeorefSteps document at the following URL:
http://dlp.cs.berkeley.edu/manis/GeorefSteps.html
>>> Posting number 183, dated 11 Mar 2002 13:49:55
Date: Mon, 11 Mar 2002 13:49:55 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: MaNIS Servers
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
I've been asked a couple of times about making hardware substitutions in
the Equipment portion of MaNIS subcontract budgets. The bottom line is that
each institution must have, when the time comes to connect to the network,
a DEDICATED machine with the specifications highlighted in my 25 Feb
message "MaNIS Server recommendations." Dedicated means that the sole
purpose of the machine is to support data provision to the network. Beyond
that, I'm not picky.
John W.
>>> Posting number 184, dated 12 Mar 2002 14:45:06
>>> Posting number 185, dated 19 Mar 2002 10:46:55
Date: Tue, 19 Mar 2002 10:46:55 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Fwd: fraction format in the error calculator
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear XXXX, and all,
I'm glad you uncovered this bug. The error calculator is actually not as
smart as you expected it to be. The discrepancy you're experiencing arises
because the calculator interprets 1/2 as 1, ignoring everything after the
/. Therefore, please use only decimals or whole numbers in the Offset
Distance and Extent of Named Place fields.
John
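To convert a common fraction to a decimal before typing it into the calculator, something like the following would do. It is a sketch using Python's standard fractions module; the second function only mimics the behavior described above (ignoring everything after the "/") and is not the calculator's actual code.

```python
from fractions import Fraction


def to_decimal(text: str) -> float:
    """Convert "1/2", "3/8", "0.5" or "2" to a decimal value.
    Mixed numbers such as "1 1/2" are not handled."""
    return float(Fraction(text.strip()))


def read_like_calculator(text: str) -> float:
    """Mimic the reported bug: keep only what precedes the "/"."""
    return float(text.split("/")[0])


intended = to_decimal("1/2")             # value the user meant
misread = read_like_calculator("1/2")    # value the calculator used
print(intended, misread, misread - intended)
```

The 0.5 gap between the intended and misread extents matches the difference between the two reported maximum errors (1.141 versus 0.641).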
>
>Hi John
>
>I notice that maximum error is noticeably affected by the format of the
>extent entered on the error calculator if the extent contains a
>fraction. Since the extent field accepts both decimal and common
>fractions, I experimented with 0.5 and 1/2 for the locality of 3/8 mi. N
>of Casnovia, Kent County, MI. I approached the situation "by road," used
>decimal degrees on Topozone, and obtained the coordinates of 43.2401 and
>-85.7901. Datum is NAD27; coordinate precision, 0.0001; coordinate
>source, USGS map 1:25,000. Distance precision of 1/8 was selected from
>the drop-down. When the extent of the bounding box is expressed as 0.5 (a
>logical choice for TopoZone users), the maximum error is 0.641; but when
>it is expressed as 1/2 (in keeping with the format of distance precision),
>maximum error is 1.141.
>
>Depending on the extent, one format may be easier to use than the
>other. However, if both formats are allowed by the calculator but only
>one yields the desired maximum error, shouldn't the field be restricted to
>that format? [Actually now I believe the extent is slightly less than 0.5
>miles, but remain curious about the discrepancy.] Again, your assistance
>will be greatly appreciated.
>
>XXXXX
>>> Posting number 186, dated 21 Mar 2002 15:51:25
Date: Thu, 21 Mar 2002 15:51:25 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: georeferencing rivers
In-Reply-To: <Pine.OSF.4.33.0203211410400.8199-100000@aurora.uaf.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX and all,
These are good questions. I'll put the answers right below each one.
>1. When I georeference rivers, should I take coordinates of the source or
>the drainage of the river? How much should extent of the river be?
The coordinates should be at the geographic center of the river, on the
river itself. The extent should be the distance to the furthest reach of
the river in either direction.
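One possible way to work this out from a traced river course is sketched below. It rests on several assumptions not stated in the message: the river is given as an ordered list of (lat, lon) vertices, "geographic center" is taken as the vertex nearest the halfway point along the course, and distances are great-circle distances on a sphere. All names are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def river_center_and_extent(course):
    """Return (center, extent_km) for an ordered list of river vertices."""
    # Cumulative length along the course at each vertex.
    lengths = [0.0]
    for previous, current in zip(course, course[1:]):
        lengths.append(lengths[-1] + haversine_km(previous, current))
    half = lengths[-1] / 2
    # Vertex on the river closest to the halfway point along its length.
    index = min(range(len(course)), key=lambda i: abs(lengths[i] - half))
    center = course[index]
    # Extent: distance from the center to the furthest vertex of the river.
    extent = max(haversine_km(center, vertex) for vertex in course)
    return center, extent


# Hypothetical three-vertex course, for illustration only.
center, extent = river_center_and_extent([(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)])
print(center, round(extent, 1))
```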
>2. An example: specific locality is "Brooks Range, Anaktiktoot", where
>Anaktiktoot is not on the map. Should I georeference for Brooks Range
>(which will be more than 600 miles in length)? There are many cases that
>higher geography is followed by unknown specific locality.
You should go ahead and put coordinates on the vague localities, even
though the maximum_error_distance will be large. Some of the higher
geographies that have no value or "no specific locality" in the locality
field can still be specific, such as islands.
>3. Related to my question 2: how much is too big to georeference? In many
>cases, only the name of the island, mountains, peninsula etc. are
>provided.
Do them all. The maximum_error_number will be useful even if it is large.
John
>>> Posting number 187, dated 30 Mar 2002 09:00:41
Date: Sat, 30 Mar 2002 09:00:41 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: UAM declat/longs truncated in MaNIS?
Mime-Version: 1.0
Content-Type: text/plain; format=flowed
John: It looks like the UAM records in the gazetteer have the same problem
that KU's records had -- declat/longs only go to two decimals. KU (XXX
XXXX) asked me to recompute KU's Oregon so I am overwriting calculated
declat/longs. Please advise on UAM records - there are several hundred.
Examples:
LocalityID  CollectionCode  Datum         DecLat   DecLong    LatDeg  LatMin  LatSec  LatDir  LongDeg  LongMin  LongSec  LongDir
186407      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       17       W
186662      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       10       W
186663      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       1        W
186721      UAM             not recorded  45.1600  -123.7300  45      10      1       N       123      44       6        W
186731      UAM             not recorded  45.2100  -123.6400  45      13      1       N       123      38       42       W
186514      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       32       W
186515      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       21       W
186516      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       2        W
186556      UAM             not recorded  44.2800  -123.7600  44      17      2       N       123      46       2        W
186557      UAM             not recorded  44.2800  -123.7500  44      17      2       N       123      45       2        W
186689      UAM             not recorded  45.3300  -123.7800  45      20      2       N       123      47       2        W
186690      UAM             not recorded  45.3300  -123.6400  45      20      2       N       123      38       49       W
186691      UAM             not recorded  45.3300  -123.6300  45      20      2       N       123      38       2        W
>>> Posting number 188, dated 1 Apr 2002 14:19:03
Date: Mon, 1 Apr 2002 14:19:03 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: MaNIS questions
In-Reply-To: <5.1.0.14.0.20020327144722.01df95c0@mail.fmnh.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear XXXX, and all,
I know that Barbara made a preliminary answer to the questions raised here.
I'll try to add a few points of explanation from which everyone on the list
might benefit.
I agree with Barbara's statement of the georeferencing priorities within
the MaNIS context. To summarize them, the MaNIS grant covers (only)
complete georeferencing for localities that have no lat_longs. Our hope is
that, through innovation and properly-guided cooperation, we will be able
to follow through on our promise to finish this. In fact, we hope that we
will be able to refine the process and the tools enough to actually get
ahead of the game. If we do get ahead, we will be able to turn our
attention next to those localities for which lat_longs exist without
supporting metadata.
I know we all have the desire to have consistent data quality, especially
when faced with making those data public. Within the context of our
project, however, cleaning up locality descriptions is neither covered, nor
is it recommended. Every change made to locality descriptions on your end
since the data were collected for the MaNIS gazetteer has the potential to
confound the process of properly reconnecting the georeferenced localities
with specimens in your database.
I have not yet explained the reconnecting part of the process, thinking
that what I've presented thus far is enough to swallow for the time being.
Perhaps a brief synopsis now would be of use to illustrate the potential
complications and to get people to think about the future of locality data
in institutional databases.
In the MaNIS gazetteer I have rendered unique occurrences of localities by
institution. These you can query on and see as results in the online MaNIS
gazetteer. Behind the scenes there is another table to cross-reference
unique localities to specimens. The specimens are linked to the localities
(and hence to the coordinates and metadata that georeferencing provides)
based on the locality string. Thus, if you change the locality string in
your database, it will not match the locality string for the same specimen
in the gazetteer. This is the crux of the issue, so it is important to
understand when it matters, and when it doesn't.
If the locality string in your database doesn't match the locality string
in the MaNIS gazetteer, but the locality really is exactly the same place
and would get the same coordinates when georeferenced, then the change
doesn't matter - the specimen will get the correct coordinates anyway.
However, if the change in your database effectively changes the place that
is described (resulting in different coordinates when georeferenced) then
the change DOES matter - it is what I have elsewhere called "substantive."
If a substantive change is made in your database and I apply the
georeferenced coordinates to the specimens that once referred to that
locality, the georeferenced data will be wrong. Therefore, there needs to
be a verification process when re-associating georeferenced localities with
individual databases. There are two steps to this process. The first is to
determine if the locality string in your database is the same as that in
the gazetteer. For all of those localities for which the locality strings
match, the georeferenced data can go into your database automatically, no
fuss, no questions asked. For the rest of the georeferenced localities from
the gazetteer, a comparison will have to be made between the then-current
locality and the georeferenced locality to determine if they still refer to
the same place. Imagine putting a check mark by each pair that still match.
The amount of checking to be done in this step is directly determined by
the number of changes you make to your locality strings between the time
when I collected the data for the gazetteer and the time when the data go
back into your database. Clearly, fewer changes mean less checking.
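The two-step verification might be pictured roughly as follows. This is a sketch only: it assumes each specimen can be identified by some id and that both the gazetteer and the institutional database can be reduced to a mapping from that id to a locality string. The names and sample ids are hypothetical, not the actual MaNIS tables.

```python
def split_for_verification(gazetteer, current):
    """Step 1: separate specimens whose locality strings still match.

    gazetteer: specimen id -> locality string as captured for the gazetteer
    current:   specimen id -> locality string now in the institution's database
    Returns (automatic, needs_review) lists of specimen ids.
    """
    automatic, needs_review = [], []
    for specimen_id, georeferenced_string in gazetteer.items():
        if current.get(specimen_id) == georeferenced_string:
            automatic.append(specimen_id)      # strings match: load as-is
        else:
            needs_review.append(specimen_id)   # step 2: a person compares the pair
    return automatic, needs_review


gazetteer = {
    "A1": "Needles, Mojave Co., California",
    "A2": "Hogback Creek, Inyo County",
}
current = {
    "A1": "Needles, San Bernardino Co., California",  # edited since capture
    "A2": "Hogback Creek, Inyo County",
}
automatic, needs_review = split_for_verification(gazetteer, current)
print(automatic, needs_review)
```

Every edited string lands in the review list, whether or not the change was substantive, which is why fewer changes mean less checking.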
OK? Take a breath. Now, a topic for rumination as the project progresses.
Start thinking about incorporating the georeferenced coordinates and
metadata into your individual databases. Not one of the participating
institutions currently has the structure in its database to capture all of
the metadata we are gathering. It would be nice if we all could. We don't
want to throw away all of this hard work after all.
John W
>>> Posting number 189, dated 1 Apr 2002 15:37:38
Date: Mon, 1 Apr 2002 15:37:38 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: UAM declat/longs truncated in MaNIS?
In-Reply-To: <F569gG8WPbLgJyAUypU000104d9@hotmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX and XXXXX,
The problem is not exactly the same. UAM has both decimal lat_long and
degrees minutes seconds in its database. The decimal lat_longs often have
only two decimal places when there are fully specified degrees minutes
seconds, but this shouldn't affect what you're doing unless you want to
copy and paste lat_longs that UAM had already done to localities for other
institutions. If that's the case, recompute the decimal lat_longs for UAM
using the degrees minutes seconds values where the OrigCoordSystem is "deg.
min. sec."
XXXXX, you may want to put XXXX on recomputing decimal lat_longs for the
conditions described above.
General Reminder: Lat_Long recomputations should not be on MaNIS time
until/unless we finish the georeferencing of localities without lat_longs.
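For anyone who does need to recompute, the conversion from degrees minutes seconds to decimal degrees is simple arithmetic. The sketch below is illustrative; the function name dms_to_decimal is an assumption, and the sample values are taken from the first UAM record quoted below (LocalityID 186407).

```python
def dms_to_decimal(degrees, minutes, seconds, direction):
    """Convert degrees/minutes/seconds plus a hemisphere letter to signed decimal degrees."""
    if direction not in ("N", "S", "E", "W"):
        raise ValueError("direction must be one of N, S, E, W")
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative by convention.
    return -value if direction in ("S", "W") else value


# LocalityID 186407: 45 16 1 N, 123 53 17 W
lat = dms_to_decimal(45, 16, 1, "N")
lon = dms_to_decimal(123, 53, 17, "W")
print(round(lat, 6), round(lon, 6))
```

For this record the recomputed values are about 45.266944 and -123.888056, where the gazetteer currently holds 45.2600 and -123.8800.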
>John: It looks like the UAM records in the gazetteer have the same problem
>that KU's records had -- declat/longs only go to two decimals. KU (XXX
>XXXX) asked me to recompute KU's Oregon so I am overwriting calculated
>declat/longs. Please advise on UAM records - there are several hundred.
>
>Examples:
>LocalityID  CollectionCode  Datum         DecLat   DecLong    LatDeg LatMin LatSec LatDir LongDeg LongMin LongSec LongDir
>186407      UAM             not recorded  45.2600  -123.8800  45     16     1      N      123     53      17      W
>186662      UAM             not recorded  45.2600  -123.8800  45     16     1      N      123     53      10      W
>186663      UAM             not recorded  45.2600  -123.8800  45     16     1      N      123     53      1       W
>186721      UAM             not recorded  45.1600  -123.7300  45     10     1      N      123     44      6       W
>186731      UAM             not recorded  45.2100  -123.6400  45     13     1      N      123     38      42      W
>186514      UAM             not recorded  44.2300  -123.8000  44     14     2      N      123     48      32      W
>186515      UAM             not recorded  44.2300  -123.8000  44     14     2      N      123     48      21      W
>186516      UAM             not recorded  44.2300  -123.8000  44     14     2      N      123     48      2       W
>186556      UAM             not recorded  44.2800  -123.7600  44     17     2      N      123     46      2       W
>186557      UAM             not recorded  44.2800  -123.7500  44     17     2      N      123     45      2       W
>186689      UAM             not recorded  45.3300  -123.7800  45     20     2      N      123     47      2       W
>186690      UAM             not recorded  45.3300  -123.6400  45     20     2      N      123     38      49      W
>186691      UAM             not recorded  45.3300  -123.6300  45     20     2      N      123     38      2       W
>
>>> Posting number 190, dated 1 Apr 2002 16:47:19
Date: Mon, 1 Apr 2002 16:47:19 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: MaNIS questions
In-Reply-To: <5.0.0.25.2.20020401125307.024018f0@socrates.berkeley.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Fellow MANES:
John's message closed with this statement:
"Not one of the participating
institutions currently has the structure in its database to capture all of
the metadata we are gathering. It would be nice if we all could. We don't
want to throw away all of this hard work after all."
My response: It has been a surprise to find ourselves dealing with the
topic of error estimates, etc. in lat/long data, since that was not part of
the original scope of the project. And indeed (in light of the above
quote) we do not have a capacity to absorb such information into our
present databases, let alone deciding how much time we have to care about
this. Seeing the impact of the request for so much attention to error
estimates, I find it hard to support so much allocation of additional time
to this effort.
I have witnessed, over the years, many publications based on massive
datasets in which the authors were not able to document (or even care)
about variance in the quality and accuracy of the data. Typically, they
just put on their blinders and accepted all the "AVAILABLE" data. This is
just an inherent problem for those who move up the scale (allometric
analyses, macroecology, or whatever), and at such LARGE scales of analyses
they usually say that small local errors become insignificant, because of
the LARGE SCALE of the overall analysis.
I hope we can strike a balance here and get the big data entry and
conversion project done. I don't want to see the project slowed down by
such a big commitment to accounting for aspects of the data (and the
corresponding time commitment) that were not built in to our original
estimates of what it would take to carry out the project.
Is this a helpful comment?
>>> Posting number 191, dated 1 Apr 2002 17:01:39
Date: Mon, 1 Apr 2002 17:01:39 -0900
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Organization: University of Alaska Museum
Subject: Re: MaNIS questions
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------4C7E03390063999F5E48C0EE"
UAM's online database (along with MVZ's) is displaying error estimates through
the Berkeley Digital Library Project's GIS viewer. I assume that the
"finished" MaNIS project could look about the same. That is, error estimates
will be a prominent and critical feature of the system. Given that the GIS
viewer will map data points over satellite photos of much of the U.S., the
precision associated with the data points is critical. The implication of "no
error" on such a fine-scale GIS layer is that the specimen came from a
specific tree or bush! Our database contains max_errors from as small as a
few meters to as large as several tens of kilometers. These are not arcane
details.
XXXXXX
>>> Posting number 192, dated 2 Apr 2002 11:09:56
Date: Tue, 2 Apr 2002 11:09:56 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Lat_Long metadata
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Oops, my mistake. There IS a collection with the structure to capture all
of the metadata. Two others, UAM and MVZ, have everything except "Extent of
Named Place."
Thanks XXXX, bright spot appreciated.
John W
>X-Sender: carlak@mail.bishopmuseum.org
>X-Mailer: QUALCOMM Windows Eudora Version 5.0.2
>Date: Tue, 02 Apr 2002 08:39:08 -1000
>To: John Wieczorek <tuco@socrates.Berkeley.EDU>
>From:
>Subject:
>
>FYI: in reference to your statement below....................
>
>Start thinking about incorporating the georeferenced coordinates and
>metadata into your individual databases. Not one of the participating
>institutions currently has the structure in its database to capture all of
>the metadata we are gathering. It would be nice if we all could. We don't
>want to throw away all of this hard work after all.
>
>Here's a bright spot to your day: I have incorporated the MANIS locality
>structure into my Locality table and will thus be saving all the metadata
>for the BPBM specimens and for all new specimens into the collection that
>are completely georeferenced.
>
>XXXX
>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> Posting number 193, dated 2 Apr 2002 12:00:20
Date: Tue, 2 Apr 2002 12:00:20 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: georeferencing rivers
In-Reply-To: <.20020401170449.0099fc90@pilot.msu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX, and all,
First, I want to apologize for having given contradictory opinions on how
these vague localities should be treated. I stated at least once in the
past that we shouldn't bother with these kinds of localities. However, that
opinion was not based on unassailable logic. In both of the circumstances
described below in Robin's message the coordinates will be of limited
utility due to their very large maximum error. Nevertheless, providing the
coordinates and maximum error will allow the user to determine the extent
to which they ARE useful.
In replying to XXXX I first expressed the opinion that we should provide
maximum errors even in the truly vague cases. My unstated personal
justification for that opinion was that it makes the rules simpler. More
philosophically, by georeferencing all non-contradictory localities, we
don't need to answer the question "How big of an area is too vague?" We
cannot fully anticipate all of the uses to which the data will be put, so
we don't really have a basis on which to make that judgement. A locality
with coordinates and a maximum error distance is always more useful than a
locality without them. End of apology.
Now, back to the questions.
>John:
>
>XXXXX's questions and your responses prompted additional questions re:
>georeferencing rivers and vague localities.
>
>1. Is it correct to assume that when one measures the length of a river
>to determine its geographic center the river's possibly winding path is
>taken into consideration; however, the extent is determined "as the crow
>flies" from the geographic center to the furthest reach?
You don't need to know the length of the river to determine its geographic
center, you need only take the means of the extremes of latitude and
longitude encompassing it. After that, you need to find the point on the
river nearest the geographic center. From there, the extent would be the
distance to the furthest point on the river.
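The river procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not a project tool: the river is represented as a list of (latitude, longitude) points in decimal degrees, distances use the haversine formula on a sphere with an assumed mean radius of 6371 km, and the function names are hypothetical. The simple mean of the longitude extremes would need extra care for a river crossing the 180th meridian.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius for a spherical approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def georeference_river(points):
    """Return (coordinates on the river, extent in km) for a river given as points."""
    lats = [p[0] for p in points]
    lons = [p[1] for p in points]
    # Geographic center: means of the extremes of latitude and longitude.
    center = ((min(lats) + max(lats)) / 2, (min(lons) + max(lons)) / 2)
    # The coordinates must lie on the river: take the river point nearest the center.
    on_river = min(points, key=lambda p: haversine_km(center, p))
    # Extent: straight-line distance from that point to the furthest point on the river.
    extent_km = max(haversine_km(on_river, p) for p in points)
    return on_river, extent_km


# Made-up river running along the equator, for illustration only.
coords, extent = georeference_river([(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)])
print(coords, round(extent, 1))
```

Note that the river's winding length never enters the calculation; only the bounding extremes and straight-line distances are used.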
>2. Should we put coordinates on the following vague locality:
>
>HigherGeog: Michigan, Barry County
>SpecLocality: "no specific locality recorded"
>
>XXXX and I have not georeferenced such localities thus far, but it
>appears from your response that county center coordinates and the extent
>of Barry County should be provided.
Yes. These should be georeferenced. However, there isn't really a need for
you to do it. Such localities can be georeferenced automatically from a
table of county centroids when we're all done. In retrospect, it would have
probably been useful for me to do that before making the gazetteer
"public," but I didn't think it worth the delay at the time.
John W
>>XXXX and all,
>>
>>These are good questions. I'll put the answers right below each one.
>>
>>>1. When I georeference rivers, should I take coordinates of the source or
>>>the drainage of the river? How much should extent of the river be?
>>
>>The coordinates should be at the geographic center of the river, on the
>>river itself. The extent should be the distance to the furthest reach of
>>the river in either direction.
>>
>>>2. An example: specific locality is "Brooks Range, Anaktiktoot", where
>>>Anaktiktoot is not on the map. Should I georeference for Brooks Range
>>>(which will be more than 600 miles in length)? There are many cases that
>>>higher geography is followed by unknown specific locality.
>>
>>You should go ahead and put coordinates on the vague localities, even
>>though the maximum_error_distance will be large. Some of the higher
>>geographies that have no value or "no specific locality" in the locality
>>field can still be specific, such as islands.
>>
>>>3. Related to my question 2: how much is too big to georeference? In many
>>>cases, only the name of the island, mountains, peninsula etc. are
>>>provided.
>>
>>Do them all. The maximum_error_distance will be useful even if it is large.
>>
>>John
>
>>> Posting number 194, dated 2 Apr 2002 21:23:13
Date: Tue, 2 Apr 2002 21:23:13 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: "Barbara R. Stein" <bstein@OZ.NET>
Subject: Re: MaNIS questions
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="------------8D04441FBD1587A8D66E30D2"
Dear XXX et al.,
From the outset, this project has proceeded, and proceeded successfully,
because we have all been "on the same page." Your email (see below) provides
an opportunity to reiterate what we said we were going to do, what we intend
to do, and exactly why we are doing it as stated.
John and I (particularly John) are extremely grateful to those of you who have
immersed yourselves in the intricacies of georeferencing and have been willing
to share your thoughts and insights with the list. However, such discussions
in and of themselves have not added to the work load that was initially
budgeted or funded. Quite to the contrary, both the "Coordinate
Georeferencing Activities" and "Implement Specimen Data Model" sections in the
MaNIS Project Description described providing georeferencing metadata as well
as the coordinates. And we stated emphatically,
"Well-documented, georeferenced collecting events are crucial to biogeographic
data...."
This is exactly what we are doing.
The error calculator and spreadsheet templates that John provided make the
addition of metadata such as lat/long error a relatively trivial exercise and
one that should not be confused with the discussion of such topics on this
list. Several individuals have chosen to probe that tool more closely and we
have all benefited from their interest and experimentation. Their comments
have enhanced our understanding of the process and the resulting data, and
improved the tool, but they have not created more work.
Where confusion may have arisen is in the following:
> And indeed (in light of the above
> quote) we do not have a capacity to absorb such information into our
> present databases, let alone deciding how much time we have to care about
> this. Seeing the impact of the request for so much attention to error
> estimates, I find it hard to support so much allocation of additional time
> to this effort.
It is not your job to incorporate such information into your present databases
and we apologize for any confusion that John might have engendered in his
previous email. This is a topic we will be discussing at our meeting at ASM
in June but perhaps it is worth clarifying now what John was intimating when
he made reference to this issue.
Think of your current dbms in two parts, the databases themselves and the
interfaces you now use to input, query and display those data in-house. For
most of you, neither your databases nor your interfaces are currently designed
to handle any new fields (e.g., lat/long error). However, we are expending a
great deal of time and effort to collect such data and want to make them
available to researchers. Whereas it is a fairly tricky task (given
constraints of time and budget) to modify each of your interfaces to add new
fields, it is relatively easy to add those fields to your current databases
and migrate the data directly to the MaNIS servers along with your specimen
data. This will happen when John writes the migration scripts for each of
your institutions. Hence, the data will be displayed over the network and
available to you without impacting your current set-ups in-house. In raising
this issue, he was merely letting you know that we are, in fact, moving ahead
and beginning to work on the next step of the project, creating the migration
scripts and software that will make the network function.
> I have witnessed, over the years, many publications based on massive
> datasets in which the authors were not able to document (or even care)
> about variance in the quality and accuracy of the data. Typically, they
> just put on their blinders and accepted all the "AVAILABLE" data. This is
> just an inherent problem for those who move up the scale (allometric
> analyses, macroecology, or whatever), and at such LARGE scales of analyses
> they usually say that small local errors become insignificant, because of
> the LARGE SCALE of the overall analysis.
Here I will part company with XXX and argue that it is our intention to do
better than what has always been done or has been done previously. Neither
John nor I see this "inherent problem," particularly with the advent of
increased computing technology. I participated in one of the planning
workshops for NEON (National Ecological Observatories Network) two years ago
and I can state unequivocally that the standard is changing/has changed. The
kinds of publications to which XXX refers will no longer be acceptable (if
they even are at this time) because it is possible to document variance in
quality and accuracy of data, even for extremely large datasets. Furthermore,
we believe we have designed the georeferencing protocol to do just that,
with relatively little overhead and impact to the participating institutions.
At this point everyone has at least begun the georeferencing process and from
what we can gather, once initial inertia is overcome, things actually progress
quite smoothly and quickly. I may be premature in saying so, but it is our
hope that MVZ will have completed georeferencing the ca. 40,000+ localities
for California in the next two months. How have we done this? I would remind
each of you that our first priority is to provide georeferenced data to those
localities in our collections that currently have none! It is not to add
error to localities that already have lat/long coordinates assigned to them,
it is not to verify already georeferenced localities, and it is not to clean
up locality descriptions. Our budget figures were based on the number of
unique localities in our collections that lacked lat/long coordinates of any
sort. I would also add, that while we cannot dictate whom you hire to do
georeferencing, your money will go lots farther if you hire undergraduates,
and it will go farthest if you hire work-study students.
We have all taken the first giant step. What is needed now is to just keep
putting one foot in front of the other. I guarantee you will amaze
yourselves.
Best,
Barbara
>>> Posting number 195, dated 3 Apr 2002 08:38:21
>>> Posting number 196, dated 3 Apr 2002 10:52:59
Date: Wed, 3 Apr 2002 10:52:59 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Contemporary informatics science, etc.
In-Reply-To: <3CAA91C1.79E00185@oz.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Barbara et al.,
I appreciate the comments and forum that exist among our Manis group, and
I thank Barbara for her most recent. I also agree that the developing
field of informatics is helping us to raise the bar on scientific
standards in general, and I don't wish my comments to be taken as an
endorsement of the crudeness of broad synthetic work done in the past
(without error estimates). I also realize that for the many data fields
that we have entered into our XXXX mammal database (other than lat/long)
we will probably continue without error estimates for some time to come.
On the other hand we can only await the further development of these kinds
of massive data management projects in the future, assuming that financial
resources will remain available for this kind of thing. It will be great
if we can be surprised by continued improvements in the overall quality of
the data that stand behind the specimens we hold in our collections. I
obviously remain committed to assuring that we get our job done on this
current project.
XXX
>>> Posting number 197, dated 5 Apr 2002 15:49:39
Date: Fri, 5 Apr 2002 15:49:39 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear all,
In late February when I was fixing my mistake with the UWBM Lat_Longs I
mentioned that I would be reloading ROM data at some time as well. That
time has come. The new ROM data have now been loaded into the gazetteer.
What does this mean for you? If you haven't begun georeferencing yet
(though as far as I know, everyone has), you just need to download your
localities again and proceed as described in the Georeferencing Steps
document ( http://dlp.CS.Berkeley.EDU/manis/GeorefSteps.html ). If you have
downloaded localities and started georeferencing them, first you need to
remove any ROM records from the set. Next make another query in the MaNIS
gazetteer just like the original query that gave you the records you are
working on, but this time pick ROM in the Institution box on the MaNIS
Gazetteer page to get only ROM records for that combination of higher
geography. Download these ROM records and append them to the end of the
file you are working on.
Sorry for this inconvenience. I'm pretty sure I've got everything correct
now and that this kind of thing won't happen any more. So, everyone,
proceed with confidence.
My next undertaking will be to write the documentation for a new Calculator
that can calculate not only errors, but also coordinates. This calculator
will be VERY similar to the Error Calculator, so there won't be much new to
learn. The new calculator has already been tested; the results agree with
those given by Gary Shugart's Excel tool for the same localities. This is
good. I'll announce the new calculator as soon as I've posted the manual
for it, which should be next Friday or so after I return from San Diego.
Happy georeferencing!
John W
>>> Posting number 198, dated 5 Apr 2002 17:47:21
>>> Posting number 199, dated 15 Apr 2002 16:52:52
Date: Mon, 15 Apr 2002 16:52:52 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: GNIS Website Gazeteering
Hello All,
I am a recent addition to the group, and I have thrown myself headlong into
the midst, hopefully well.
That said, I do have a question about a source. I am using the USGS GNIS
website http://geonames.usgs.gov/pls/gnis/web_query.gnis_web_query_form and
I was wondering what, if any, experiences have been had. Specifically, if
I read it correctly, it is a database of information culled from the USGS
maps. I am just unsure of a few things...:
First, datum, scale, and other info. The site refers to "7.5' by 7.5'
Map"; what other data can be culled just from that?
Second, it at times gives coordinates from multiple maps that are slightly
different. How do I reconcile these variances? Do I give my own best
combination, or has a process been agreed upon that I missed in going
through the past postings?
Thanks, and greetings to you all.
>>> Posting number 200, dated 15 Apr 2002 15:45:02
Date: Mon, 15 Apr 2002 15:45:02 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: GNIS Info
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="----=_NextPart_000_0088_01C1E494.81AC73E0"
XXXXXX
I too am new at working on this MaNIS project, just started this week.
Anyways, I had the exact same question as you and talked to John
Wieczorek this morning at the Museum of Vertebrate Zoology in Berkeley,
CA. He said that using the GNIS data is fine even though the
database does not come from a single source. These are the "givens"
for GNIS use with the "Error Calculator":
1) Coordinate System: decimal degrees
2) Coordinate Source: USGS map 1:25,000
3) Datum: NAD27 (North American Datum 1927)
Make sure that you fill out the "Extent of Named Place" field as much as
possible each time. If anyone from this board has other suggestions, I
would be glad to hear them.
Is anyone else converting the GNIS database to a shape file to be used
in ArcView to calculate distances? If there are a lot of you, I will
start posting ArcView questions pertaining to this project here.
Thanks.
XXXXXXX
>>> Posting number 201, dated 16 Apr 2002 09:49:31
Date: Tue, 16 Apr 2002 09:49:31 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: GNIS Info
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="----=_NextPart_000_0022_01C1E52C.020D1CA0"
XXXXX--
Thanks for the quick reply; it was very helpful.
I am always interested in other people's experiences with ArcView.
John Wieczorek & Group--
I still am on the fence with the locations that give me two or more
different georeferencing points. I have the feeling that, as they are
both "legitimate" sources (different USGS maps), I can just choose
one, and indicate in the proper field in the database which I chose.
Does this seem acceptable/appropriate?
Thanks
XXXXX
----- Original Message -----
From:
To: MAMMAL-Z-NET@USOBI.ORG
Sent: Monday, April 15, 2002 5:45 PM
Subject: [MANIS] GNIS Info
XXXXXXXXX
I too am new at working on this MaNIS project, just started this week.
Anyways, I had the exact same question as you and talked to John
Wieczorek this morning at the Museum of Vertebrate Zoology in Berkeley,
CA. He said that using the GNIS data is fine even though the
database does not come from a single source. These are the "givens"
for GNIS use with the "Error Calculator":
1) Coordinate System: decimal degrees
2) Coordinate Source: USGS map 1:25,000
3) Datum: NAD27 (North American Datum 1927)
Make sure that you fill out the "Extent of Named Place" field as much
as possible each time. If anyone from this board has other suggestions,
I would be glad to hear them.
Is anyone else converting the GNIS database to a shape file to be used
in ArcView to calculate distances? If there are a lot of you, I will
start posting ArcView questions pertaining to this project here.
Thanks.
>>> Posting number 202, dated 16 Apr 2002 10:04:18
Date: Tue, 16 Apr 2002 10:04:18 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: GNIS Info
In-Reply-To: <002501c1e555$eb1a1320$b16f0a0a@fmnh.org>
Mime-Version: 1.0
Content-Type: multipart/alternative;
boundary="=====================_12236688==_.ALT"
XXXX and others
Having already georeferenced thousands of South American localities, I consider
this an important and poorly understood question. My strong conviction is that simply
picking a point arbitrarily is apt to prove more misleading than leaving the
point undetermined. If there are 28 "San Martin"s in Peru, for example, and
there is no additional information for specifying this (e.g., compiling an
expedition itinerary, locations of field activities immediately beforehand and
afterwards, and (rarely) the distributions of animals themselves), then
guessing--and being explicit about your guesses--can only be misleading.
Following this strategy with the Field Museum's 2300 locality records from Peru
led me to leave 14% of the localities unspecified. However, I am confident
that the remaining 86% came from where they plot.
I would be interested in hearing the experiences of others and the druthers of
curators/collection managers on the data fidelity (vs accuracy) question.
Clearly, we need to embrace a community-wide standard.
XXXXX
>>> Posting number 203, dated 16 Apr 2002 09:11:41
Date: Tue, 16 Apr 2002 09:11:41 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: GNIS Info
In-Reply-To: <4.1.20020416095834.00a94a90@mail.fmnh.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
I agree wholeheartedly with XXXXX. If there is ambiguity in terms of a
multitude of potential named places for a given locality we should NOT
georeference it, but give the reason ("ambiguous" or "multiple possible
places" or something like that) in the NoGeorefBecause field. It may be
that some of these localities can be resolved by the host institution by
looking in field notes and the like. However, that's a time-consuming
activity and we should leave that until after the coordinates get
redistributed.
For the record, the other type of locality we should NOT georeference is
one that is in question (e.g., "Bakersfield?"). For these, put something
like "locality questionable" in the NoGeorefBecause field. The reason for
filling out the NoGeorefBecause field is so that the host institution knows
that someone actually looked at the locality. You wouldn't otherwise know
this if the Lat and Long were just blank. While reviewing, I might as well
remind everyone to make use of the Remarks field to alert host institutions
of likely errors such as misspellings as well as unusual assumptions that
were made in the course of the coordinate determination.
It's nice to see the list serving its purpose. Thanks for the questions and
responses!
John W
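The rule in this posting (do not georeference ambiguous or questionable localities, but always record why) can be sketched as a small decision function. This is an illustration only: the function name, the use of a "?" in the locality string as the test for a questionable locality, and the wording for the not-found case are assumptions; the field names DecLat, DecLong and NoGeorefBecause are the ones used on this list.

```python
def georeference_decision(locality, candidates):
    """Decide whether to assign coordinates or to fill in NoGeorefBecause.

    locality   -- the locality string as written in the record
    candidates -- list of (lat, lon) pairs, one per distinct named place
                  that could be the place described (hypothetical input)
    """
    record = {"DecLat": None, "DecLong": None, "NoGeorefBecause": ""}
    if "?" in locality:
        # The locality itself is in question, e.g. "Bakersfield?".
        record["NoGeorefBecause"] = "locality questionable"
    elif len(candidates) > 1:
        # Several distinct places share the name; do not pick one arbitrarily.
        record["NoGeorefBecause"] = "multiple possible places"
    elif not candidates:
        # Assumed wording; this case is not covered in the posting.
        record["NoGeorefBecause"] = "named place not found"
    else:
        record["DecLat"], record["DecLong"] = candidates[0]
    return record


# Made-up coordinates, for illustration only.
print(georeference_decision("San Martin", [(-6.5, -76.4), (-7.0, -77.0)]))
print(georeference_decision("Bakersfield?", [(35.4, -119.0)]))
```

Filling in NoGeorefBecause rather than leaving everything blank is what tells the host institution that someone actually looked at the locality.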
>>> Posting number 204, dated 16 Apr 2002 11:21:06
Date: Tue, 16 Apr 2002 11:21:06 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: GNIS Info
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
All,
I believe I am not being as clear in my situation as I thought. Here is an
example.
Aurora, Illinois, is a city/town that spreads across multiple counties, and
is on 3 different USGS maps, according to the query form results I received.
As I understand it, the information on the site comes about in the same
manner as if I had all of these maps myself, and were picking the point, and
best approximating the lat & long according to