Contrasting spatial, temporal and environmental patterns in observation- and specimen-based species occurrence data.
ABSTRACT: Species occurrence data record the location and time of an encounter with a species, and are valuable for many aspects of ecological and evolutionary analyses. A key distinction within species occurrence data is between (1) collected and preserved specimens that can be taxonomically validated (i.e., natural history collections), and (2) observations, which are more error prone but richer in terms of number and spread of observations. In this study we analyse the distribution in temporal, spatial, taxonomic and environmental coverage of specimen- and observation-based species occurrence data for land plants in Norway, a region with strong climatic and human population density gradients. Of 4.8 million species occurrence records, the majority (78%) were observations. However, there was a greater species richness in the specimen record (N = 4691) than in the observation record (N = 3193) and most species were recorded more as specimens than observations. Specimen data were on average older, and collected later during the year. Both record types were highly influenced by a small number of prolific contributors. The species most highly represented in the observation data set were widespread or invasive, while in the specimen records, taxonomically challenging species were overrepresented. Species occurrence records were unevenly spatially distributed. Both specimen and observation records were concentrated in regions of Norway with high human population density and with high temperatures and precipitation, but in different regions within Norway. Observation and specimen records thus differ in taxonomic, temporal, spatial and environmental coverage for a well-sampled group and study region, potentially influencing the ecological inferences made from studies utilizing species occurrence data.
The distribution of observation data dominates the dataset, so inferences of species diversity and distributions do not correspond to the evolutionary or physiological knowledge of species, which is based on specimen data. We make recommendations for users of biodiversity data, and collectors to better exploit the complementary strengths of these distinct biodiversity data types.
Project description:We review the Symbiota software platform for creating voucher-based biodiversity information portals and communities. Symbiota was originally conceived to promote small- to medium-sized, regionally and/or taxonomically themed collaborations of natural history collections. Over the past eight years the taxonomically diverse portals have grown into an important resource in North America and beyond for mobilizing, integrating, and using specimen- and observation-based occurrence records and derivative biodiversity information products. Designed to mirror the conceptual structure of traditional floras and faunas, Symbiota is exclusively web-based and employs a novel data model, information linking, and algorithms to provide highly dynamic customization. The themed portals enable meaningful access to biodiversity data for anyone from specialist to high school student. Symbiota emulates functionality of modern Content Management Systems, providing highly sophisticated yet intuitive user interfaces for data entry, batch processes, and editing. Each kind of content provision may be selectively accessed by authenticated information providers. Occupying a fairly specific niche in the biodiversity informatics arena, Symbiota provides extensive data exchange facilities and collaborates with other development projects to incorporate and not duplicate functionality as appropriate.
Project description:Geranium L. is a genus of over 350 species distributed throughout most of the world, except in lowland tropical areas. It is the largest genus of the Geraniaceae and is represented in the New World by 137 species. This dataset includes 8,937 records that cover the genus Geranium in the New World, providing an updated, taxonomically consistent and geographically sound account of the distribution of the 137 species of Geranium in America. Specimens from 128 herbaria were reviewed. These were supplemented by others collected during nine field trips, which allowed better knowledge of the variability of characters within populations and refinement of species distribution ranges. Each record represents a specimen that has been reviewed and in some cases collected by C. Aedo. Accepted scientific name, locality details, distribution status (introduced, native, naturalized, uncertain), geographic coordinates are given for 8,538 (95%) records, and habitat information for 3,952 (44%). All data have been released under a CC-BY license in a standardized format, which enables easy integration with other data, for example through GBIF.org.
Project description:A total of ca 800,000 occurrence records from the Australian Museum (AM), Museums Victoria (MV) and the New Zealand Arthropod Collection (NZAC) were audited for changes in selected Darwin Core fields after processing by the Atlas of Living Australia (ALA; for AM and MV records) and the Global Biodiversity Information Facility (GBIF; for AM, MV and NZAC records). Formal taxon names in the genus- and species-groups were changed in 13-21% of AM and MV records, depending on dataset and aggregator. There was little agreement between the two aggregators on processed names, with names changed in two to three times as many records by one aggregator alone compared to records with names changed by both aggregators. The type status of specimen records did not change with name changes, resulting in confusion as to the name with which a type was associated. Data losses of up to 100% were found after processing in some fields, apparently due to programming errors. The taxonomic usefulness of occurrence records could be improved if aggregators included both original and the processed taxonomic data items for each record. It is recommended that end-users check original and processed records for data loss and name replacements after processing by aggregators.
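The closing recommendation, that end-users compare original and processed records for data loss and name replacements, can be illustrated with a minimal sketch. It assumes both versions of the records are already loaded as dictionaries keyed by a shared record identifier; the function name `audit_fields` and the in-memory layout are hypothetical and are not the audit procedure used in the study. `scientificName` and `typeStatus` are Darwin Core terms, chosen here only as plausible fields to check.

```python
from typing import Dict, Iterable

Record = Dict[str, str]


def audit_fields(
    original: Dict[str, Record],
    processed: Dict[str, Record],
    fields: Iterable[str],
) -> Dict[str, Dict[str, int]]:
    """Count, per field, how many records changed or lost their value.

    Hypothetical helper: only records present in both inputs are compared,
    and only fields that were non-empty in the original are counted.
    """
    summary = {f: {"compared": 0, "changed": 0, "lost": 0} for f in fields}
    for record_id, orig in original.items():
        proc = processed.get(record_id)
        if proc is None:
            continue  # record missing from the processed set; not a field-level loss
        for field, counts in summary.items():
            before = (orig.get(field) or "").strip()
            after = (proc.get(field) or "").strip()
            if not before:
                continue  # nothing to lose or change
            counts["compared"] += 1
            if not after:
                counts["lost"] += 1      # value present originally, empty after processing
            elif after != before:
                counts["changed"] += 1   # value replaced, e.g. a substituted taxon name
    return summary
```

Dividing `lost` or `changed` by `compared` gives a per-field rate comparable in spirit to the percentages quoted above, although exact string comparison is a simplifying assumption and would flag purely cosmetic differences as changes.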
Project description:Occurrence-based databases such as the Palaeobiology database (PBDB) provide means of accommodating the heterogeneities of the fossil record when evaluating historical diversity patterns. Although palaeontologists have given ample attention to the effects of taxonomic practice on diversity patterns derived from synoptic databases (those using first and last appearances of taxa), workers have not examined the effects of taxonomic error on occurrence-based diversity studies. Here, we contrast diversity patterns and diversity dynamics between raw data and taxonomically vetted data in the PBDB to evaluate the effects of taxonomic errors. We examine three groups: Palaeozoic gastropods, Jurassic bivalves and Cenozoic bivalves. We contrast genus-level diversity patterns based on: (i) all occurrences assigned to a genus (i.e. both species records and records identifying only the genus), (ii) only occurrences for which a species is identified, and (iii) only occurrences for which a species is identified, but after vetting the genus to which the species is assigned. Extensive generic reassignments elevate origination and extinction rates within Palaeozoic gastropods and origination rates within Cenozoic bivalves. However, vetting increases generic richness markedly only for Cenozoic bivalves, and even then the increase is less than 10%. Moreover, the patterns of standing generic richness are highly similar under all three data treatments. Unless our results are unusual, taxonomic standardization can elevate diversity dynamics in some cases, but it will not greatly change inferred richness over time.
Project description:Background:Human activities are allowing the ever-increasing dispersal of taxa to beyond their native ranges. Understanding the patterns and implications of these distributional changes requires comprehensive information on the geography of introduced species. Current knowledge about the alien distribution of macrofungi is limited taxonomically and temporally, which severely hinders the study of human-mediated distribution changes for this taxonomic group. New information:Here, we present a database on the global alien distribution of macrofungi species. Data on the distribution of alien macrofungi were searched in a large number of data sources, including scientific publications, grey literature and online databases. The database compiled includes 1966 records (i.e. species x region combinations) representing 2 phyla, 7 classes, 22 orders, 82 families, 207 genera, 648 species and 31 varieties, forms or subspecies. Dates of introduction records range from 1753 to 2018. Each record includes the location where the alien taxon was identified and, when available, the date of first observation, the host taxa or other important information. This database is a major step forward to the understanding of human-mediated changes in the distribution of macrofungal taxa.
Project description:Background:Fungi play crucial roles in ecosystems and are among the species-richest organism groups on Earth. However, knowledge on their occurrence lags behind the data for animals and plants. Recent analyses of fungal occurrence data from Western, Central and Northern Europe provided important insights into the response of fungi to global warming. The consequences of the global changes for biodiversity on a larger geographical scale are not yet understood. Landscapes of Eastern Europe and particularly of eastern Ukraine, with their specific geological history, vegetation and climate, can add substantially new information about fungal diversity in Europe. New information:We describe the dataset and provide a checklist of aphyllophoroid fungi (non-gilled macroscopic Basidiomycota) from eastern Ukraine sampled in 16 areas between 2007 and 2011. The dataset was managed on the PlutoF biodiversity workbench (http://dx.doi.org/10.15156/BIO/587471) and can also be accessed via Global Biodiversity Information Facility (GBIF, parts of datasets https://doi.org/10.15468/kuspj6 and https://doi.org/10.15468/h7qtfd). This dataset includes 3418 occurrences, namely 2727 specimens and 691 observations of fructifications belonging to 349 species of fungi. With these data, the digitised CWU herbarium (V. N. Karazin Kharkiv National University, Ukraine) doubled in size. A most detailed description of the substrate's properties and habitat for each record is provided. The specimen records are supplemented by 26 nuclear ribosomal DNA ITS sequences and six 28S sequences. Additionally, 287 photographs depicting diagnostic macro- and microscopic features of fungal fruitbodies as well as studied habitats are linked to the dataset. Most of the specimens have at least one mention in literature and relevant references are displayed as associated with specimen data. In total, 16 publication references are linked to the dataset.
The dataset sheds new light on the fungal diversity of Eastern Europe. It is expected to complement other public sources of fungal occurrence information on continental and global levels in addressing macroecological and biogeographical questions.
Project description:The Eyasi Plateau Paleontological Expedition (EPPE) Laetoli specimen database contains 13716 records of plant and animal fossils (ca. 28248 specimens) collected by EPPE field teams working at Laetoli, Tanzania between 1998 and 2005. This dataset is a digital version of the original hard-copy specimen catalog, and it documents the discovery, stratigraphic provenience and taxonomic diversity of Plio-Pleistocene fauna and flora in northern Tanzania between 4.4 Ma and >200 ka. Laetoli is renowned for the discovery of important hominin fossils, including the lectotype for Australopithecus afarensis, one of our early hominin ancestors, the first record of Paranthropus aethiopicus outside Kenya-Ethiopia, and an early record of our own species Homo sapiens. This database is one of the few publicly available palaeoanthropological fossil datasets and serves as an example for expanding open access to primary fossil occurrence data in palaeoanthropology. The taxonomic identifications appearing in this dataset are the original field identifications and are provisional. Any taxonomic analysis employing this dataset should refer to updated taxonomic identifications published by specialists.
Project description:The value of natural history collections for conservation science research is increasingly recognized, despite their well-documented limitations in terms of taxonomic, geographic, and temporal coverage. Specimen-based analyses are particularly important for tropical plant groups for which field observations are scarce and potentially unreliable due to high levels of diversity-amplifying identification challenges. Specimen databases curated by specialists are rich sources of authoritatively identified, georeferenced occurrence data, and such data are urgently needed for large genera. We compared entries in a monographic database for the large Neotropical genus Myrcia in 2007 and 2017. We classified and quantified differences in specimen records over this decade and determined the potential impact of these changes on conservation assessments. We distinguished misidentifications from changes due to taxonomic remodeling and considered the effects of adding specimens and georeferences. We calculated the potential impact of each change on estimates of extent of occurrence (EOO), the most frequently used metric in extinction-risk assessments of tropical plants. We examined whether particular specimen changes were associated with species for which changes in EOO over the decade were large enough to change their conservation category. Corrections to specimens previously misidentified or lacking georeferences were overrepresented in such species, whereas changes associated with taxonomic remodeling (lumping and splitting) were underrepresented. Among species present in both years, transitions to less threatened status outnumbered those to more threatened (8% vs 3%, respectively). Species previously deemed data deficient transitioned to threatened status more often than to not threatened (10% vs 7%, respectively). 
Conservation scientists risk reaching unreliable conclusions if they use specimen databases that are not actively curated to reflect changing knowledge.
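To make the EOO comparison concrete, the sketch below computes extent of occurrence as the area of the minimum convex polygon around a species' occurrence points, which is the usual way the metric is estimated. It is an illustration only and not the authors' workflow: the function names are hypothetical, and coordinates are assumed to be already projected to a planar, equal-area system in kilometres, so that the result is in square kilometres.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # assumed planar (x, y) in km, not raw longitude/latitude


def convex_hull(points: List[Point]) -> List[Point]:
    """Return the convex hull in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> float:
        # Positive when o -> a -> b makes a counter-clockwise turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def eoo_area(points: List[Point]) -> float:
    """Area of the minimum convex polygon (shoelace formula); 0.0 if degenerate."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0
    twice_area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0
```

Running `eoo_area` on a species' points from each database version and comparing the two areas shows how one added, re-identified or newly georeferenced specimen at the edge of the range can shift the estimate, while a change to an interior point leaves it unchanged.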
Project description:This study details a workflow used to accession a large stonefly (Plecoptera) collection resulting from several donations. The eastern North American material of Kenneth W. Stewart (deceased, University of North Texas), the entire collection of Stanley W. Szczytko (deceased, University of Wisconsin, Stevens Point), and a small portion of the Barry C. Poulton collection (active, United States Geological Survey, Columbia, Missouri) were donated to the Illinois Natural History Survey in 2013. These 5,767 vials of specimens were processed to help preserve the specimen legacy of these world renowned Plecoptera researchers. The workflow used an industrialized approach to organize the specimens taxonomically, image the specimens and labels, and place the specimens into new storage. Utilizing the images as a verbatim data source, we transcribed labels in iterative steps that yielded more information with each pass. The data were normalized, locations georeferenced, all specimen data formatted to meet Darwin Core Archive format for occurrence data, and a data set created using Pensoft's Integrated Publishing Toolkit. This is the first time that any of the specimen data has been made available electronically. We also provide two important electronic supplements that include the Bill P. Stark (active, Mississippi College) Oklahoma field notebook for 1971 and 1972 detailing locations for many coded stonefly specimens in the Stewart collection, and the coded locations of B. C. Poulton's Arkansas and Missouri study. Again, we have linked coded labels in vials to normalized and georeferenced site data. We confirmed 243 stonefly species were contained within the collections, and the potential for many more species exists among the specimens identified to family and genus level.
Twenty-one new state, province, and other significant stonefly records are reported herein with all identifications verified by the senior author, often through consultation with other stonefly taxonomists. Researchers are encouraged to utilize the specimen data, form collaborations with the authors, and borrow specimens for research.
Project description:Background:The Odonata, dragonflies and damselflies, constitute one of the more charismatic and better-studied orders of insects. The approximately 6,000 extant species on Earth can be variously found on all continents, except Antarctica. A relatively stable taxonomy, a relative ease of species identification and an aquatic immature stage has made the Odonata a taxon of interest in documenting the symptoms of global environmental change, especially at higher latitudes. The Odonata fauna of the north-temperate Canadian province of Quebec includes 150 species, many of which are at the northern limits of their geographic distribution. New information:Quebec hosts multiple entomological specimen depositories, including seven publicly-accessible research collections. One of these, the University of Montreal's Ouellet-Robert Entomological Collection, houses an exceptionally large collection of Odonata. An initial specimen data capture project for this collection gathered 31,595 Quebec Odonata occurrence records, but several Quebec species were missing and geographic coverage was biased towards the Montreal region. To complement this dataset, we undertook to digitise the Odonata records of six other public research collections. They are, in order of Quebec Odonata collection size, the Laval University Entomological Collection, McGill University's Lyman Entomological Museum, the Insectarium of Montreal Research Collection, the Quebec Government's Insect Collection, Bishop's University's Insect Collection and the Laurentian Forestry Centre's René-Martineau Insectarium. Of the 40,447 total specimen occurrence records, 36,951 are identified to the species level, including 137 of the 150 species officially-recorded in Quebec and 2 non-nominotypical subspecies. We here summarise the data and highlight the strengths and weaknesses of the datasets. The complete dataset is available with this publication (Suppl. 
material 1), whereas the specimen data associated with each collection are available as Darwin Core archives at Canadensys.net and will be updated as appropriate.