Description
We should aim to have the best possible data for analyses of APC payments, so the data should be as complete as possible. In some cases contributors can, for pragmatic reasons, only contribute parts of their data (e.g. payments to a particular publisher) and have to postpone contributing the rest. This biases any view of the 'biggest recipients of APC payments'. Examples: MPDL has added more and more publishers to their data (https://github.com/OpenAPC/openapc-de/commits/master/data/mpg), while for LMU München there is only Springer/BMC data (https://github.com/OpenAPC/openapc-de/tree/master/data/lmu). This picture shows the huge impact this has on visualizing the APC recipients (German universities to publishers data, with LMU in turquoise):
While these issues might improve over time, there is another concern I would like to add to the picture: institutions deliberately choosing to hold back parts of their data. See e.g. this statement from the University of Leipzig (cc: @vielera):
Data on hybrid OA and APC above 2.000 EUR are not included (exceptions due to currency exchange rates). (https://github.com/OpenAPC/openapc-de/blob/0901f552ca166c5cba8e702c9e6807a443b39f19/data/unileipzig/README.md)
We all know that we can only gather part of the APC payments; there is a huge gray area. But if the data is available to a data provider, then I would really suggest that it be provided to the Open APC initiative as completely as possible. There might be problems or concerns I have not seen, but maybe we can at least agree on the general goal?
Maybe we can have a discussion here and/or during the workshop in April? What do you think?
