A few weeks ago, I posted about Big Data in the field of ecology, specifically biodiversity informatics, where I looked at the holdings of the Neotoma Paleoecological Database and the Global Biodiversity Information Facility (GBIF). I made some comparisons between the two databases, though the units I compared were different. For Neotoma, I used the number of datasets that had been submitted, while for GBIF, I commented on the number of occurrences. These are two fundamentally different units, so I set out to resolve the issue and find out how many occurrences there are in Neotoma.

Counting every single taxon at every single level in Neotoma, there are over 18.9 million occurrence records, where each occurrence is an identification of a single type at a single space-time locus.

To be exact, there are 18,903,236 occurrence records, with a steeply increasing trend over the last several months. I suspect we’ll be up over 20 million by the middle of 2017.