Atlas Update: The end of data quality review is in sight!

By Nick Anich 25 Mar 2022
Red-bellied Woodpecker Melanerpes carolinus

Hello atlasers,

Although we realize the news updates out of Atlas Central have been infrequent lately, we have been quite busy running quality control on over 2.8 million observations collected by over 3,500 atlasers!

This dataset far exceeds the 172,000 observations collected for WBBA I, which will give us great explanatory power, but it takes a while to do the necessary screening!

The screening consists primarily of adjusting the breeding code up or down based on your comments and the time of year. Read more about how this looks on a checklist here.

The steps that have already occurred include:

  • We switched eBird portals for certain checklists (e.g., moving checklists into the atlas portal for conservation-priority birds, and moving checklists out of the atlas portal if they are long travelling counts or at general hotspots that cross block lines).
  • We screened records flagged by an automated process that checked each species for date outliers (based on the Breeding Guideline Bar Chart) and code outliers (based on the Acceptable Breeding Codes Chart).
  • We manually examined every single comment you added and adjusted codes if needed.
  • We screened data that was hidden from normal eBird downloads (sensitive species, zero-count records, records hidden by users).
  • We screened for protocol errors (location misplots, excessively long hours or miles) and brought slash and spuh records to species level where possible.
  • We developed code for R screening tools and our own internal postgres database for next steps, as the database is too big to even open in a single Excel spreadsheet!
  • We worked with eBird to develop further tools within eBird to facilitate quality control (the ability to adjust atlas codes while retaining the originally entered code, and the display of all breeding codes on a map pushpin).

For each species, our automated process flagged records for evaluation by time of year (i.e., before or after “B” on the Breeding Guideline Bar Chart) and by breeding code (i.e., code 3 or 4 on the Acceptable Breeding Codes Chart).
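For the curious, here is a rough sketch of what that kind of flagging could look like. It is written in Python for illustration only; our actual tools are in R and postgres and are not shown here. The species name, date window, code ratings, and the names `BREEDING_WINDOW`, `CODE_RATING`, `Record`, and `flag_record` are all made up for the example and do not come from the real charts.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical lookup tables with invented values (not the real charts).
# "B" window per species as (start, end) given in (month, day).
BREEDING_WINDOW = {
    "Species A": ((5, 1), (7, 31)),
}
# Acceptability rating per (species, breeding code); 3 or 4 gets flagged.
CODE_RATING = {
    ("Species A", "S"): 1,
    ("Species A", "NB"): 3,
}


@dataclass
class Record:
    species: str
    obs_date: date
    breeding_code: str


def flag_record(rec):
    """Return the reasons (possibly none) a record needs manual review."""
    reasons = []
    window = BREEDING_WINDOW.get(rec.species)
    if window is not None:
        start, end = window
        month_day = (rec.obs_date.month, rec.obs_date.day)
        if month_day < start:
            reasons.append("before B")
        elif month_day > end:
            reasons.append("after B")
    rating = CODE_RATING.get((rec.species, rec.breeding_code))
    if rating in (3, 4):
        reasons.append(f"code rated {rating}")
    return reasons
```

A record with an empty list of reasons passes through untouched; anything else goes to a reviewer, who decides whether the code should move up or down.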

We estimate that these processes have caught about 80% of the existing errors (although not all edits are live on the maps yet, particularly for 2019 records).

With the above processes complete, we have reached a milestone – the conclusion of the bulk data review!

There are 3 remaining processes which will allow us to finalize the dataset, hopefully by this fall:

1. Examining the Chronology Plot for outlying records for each code for a species (Spring 2022)

We will be examining very early or late records for each breeding code.

2. Examining the map view for remaining questionable records. [Coming Summer 2022 – You can help (this summer your atlas work will be mosquito-free)! Stay tuned for details]

We will be examining the range boundaries for each species.

3. Examining the final list of species in priority and specialty blocks (a final check to make sure none of the many levels of review messed anything up!) (Coming Fall 2022 — Looking for Principal Atlaser and County Coordinator help! Stay tuned for details)

We will be checking the final priority block lists to make sure we didn’t screw anything up!
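As an illustration of the first remaining step (very early or late records for each breeding code), here is a minimal Python sketch. It assumes records arrive as simple (code, date) pairs and that "outlying" just means the k earliest and k latest dates per code; the function name `chronology_extremes` and that cutoff rule are assumptions for the example, not a description of the Chronology Plot itself.

```python
from collections import defaultdict
from datetime import date


def chronology_extremes(records, k=1):
    """Map each breeding code to (k earliest, k latest) days of the year."""
    by_code = defaultdict(list)
    for code, obs_date in records:
        # Day of year lets records from different years line up.
        # Note: after February, leap years shift this by one day.
        by_code[code].append(obs_date.timetuple().tm_yday)
    extremes = {}
    for code, days in by_code.items():
        days.sort()
        extremes[code] = (days[:k], days[-k:])
    return extremes


# Example with invented records for one code.
sample = [
    ("FL", date(2019, 6, 1)),
    ("FL", date(2018, 7, 15)),
    ("FL", date(2017, 5, 2)),
]
result = chronology_extremes(sample)
```

The dates at either end of each code's list are the ones a reviewer would look at first.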

Thanks for your patience — we are as anxious as you are to see the final results come out! We anticipate by the end of 2022 we’ll be moving into analysis and writing mode with a clean dataset.

A big thanks to Tom Prestby, Nick Walton, Aaron Stutz, Jack Coulter, and Gabriel Foley for crucial assistance in the screening process.