duplicate name when reading csv from zenodo #176

I was testing with the German data from the UvA Zenodo page and extracted this file: https://zenodo.org/records/14711244/files/de.tgz. Reading October 2017 for `deboo` from the extracted directory fails:

``` r
withr::with_options(list(
  "getRad.vpts_local_path_format"="{substr(radar, 1, 2)}/{radar}/{year}/{radar}_vpts_{year}{month}.csv.gz"),
  getRad::get_vpts("deboo", as.Date("2017-10-1"), source = "/media/bart/data_disk/tmp/"))
#> Called from: getRad::get_vpts("deboo", as.Date("2017-10-1"), source = "/media/bart/data_disk/tmp/")
#> debug: fetched_vpts <- radar_to_name(switch(dplyr::case_when(source == 
#>     "rmi" ~ "rmi", source %in% eval(formals("get_vpts_aloft")$source) ~ 
#>     "aloft", dir.exists(source) ~ "local"), rmi = purrr::map(radar, 
#>     ~get_vpts_rmi(.x, rounded_interval), .purrr_error_call = cl), 
#>     aloft = purrr::map(radar, ~get_vpts_aloft(.x, rounded_interval = rounded_interval, 
#>         source = source), .purrr_error_call = cl), local = get_vpts_local(radar, 
#>         rounded_interval, directory = source)))
#> Error in `purrr::map_chr()`:
#> ℹ In index: 1.
#> ℹ With name: deboo.
#> Caused by error:
#> ! Result must be length 1, not 2.
```
It seems that this is caused by two radar names in the file:
``` r
vroom::vroom("/media/bart/data_disk/tmp/de/deboo/2017/deboo_vpts_201710.csv.gz") |> dplyr::pull("radar") |> table()
#> Rows: 74200 Columns: 26
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr   (2): radar, source_file
#> dbl  (21): height, u, v, w, ff, dd, sd_vvp, eta, dens, dbz, dbz_all, n, n_db...
#> lgl   (2): gap, vcp
#> dttm  (1): datetime
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> 
#> 10132 deboo 
#> 53550 20650
```

It seems this change happened exactly in this month:

``` r
vroom::vroom("/media/bart/data_disk/tmp/de/deboo/2017/deboo_vpts_201709.csv.gz", show_col_types = FALSE) |> dplyr::pull("radar") |> table()
#> 
#> 10132 
#> 71400
vroom::vroom("/media/bart/data_disk/tmp/de/deboo/2017/deboo_vpts_201710.csv.gz", show_col_types = FALSE) |> dplyr::pull("radar") |> table()
#> 
#> 10132 deboo 
#> 53550 20650
vroom::vroom("/media/bart/data_disk/tmp/de/deboo/2017/deboo_vpts_201711.csv.gz", show_col_types = FALSE) |> dplyr::pull("radar") |> table()
#> 
#> deboo 
#> 71325
```
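To narrow down when the label switches within the month, the first and last timestamp per label can be compared. A rough sketch in base R; the `toy` data frame and its timestamps are invented for illustration and are not taken from the Zenodo file:

``` r
# Invented example data: two radar labels with made-up timestamps.
toy <- data.frame(
  radar = c("10132", "10132", "deboo", "deboo"),
  datetime = as.POSIXct(
    c("2017-10-01 00:00:00", "2017-10-23 12:00:00",
      "2017-10-23 12:05:00", "2017-10-31 23:55:00"),
    tz = "UTC"
  )
)

# One row per label with its first and last timestamp.
label_ranges <- do.call(rbind, lapply(split(toy, toy$radar), function(df) {
  data.frame(
    radar = df$radar[1],
    first = min(df$datetime),
    last = max(df$datetime)
  )
}))
print(label_ranges)
```

If the ranges do not overlap, the switch is a clean cut-over at one point in time rather than interleaved rows.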
It breaks in this function because `unique()` returns two different names:
``` r
getRad:::radar_to_name
#> function (vpts_df_list) 
#> {
#>     purrr::set_names(vpts_df_list, purrr::map_chr(vpts_df_list, 
#>         function(df) unique(dplyr::pull(df, .data$radar))))
#> }
#> <bytecode: 0x5db748176680>
#> <environment: namespace:getRad>
```
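The failure can be reproduced without the Zenodo file. A minimal sketch, assuming a made-up data frame (`toy_vpts` and `radar_names()` are hypothetical) whose `radar` column mixes the two labels seen above. Base R `vapply()` stands in for `purrr::map_chr()` here, since both require exactly one string per list element:

``` r
# Toy stand-in for the October 2017 file: one data frame, two radar labels.
toy_vpts <- list(
  deboo = data.frame(
    radar = c("10132", "10132", "deboo"),
    height = c(0, 200, 0)
  )
)

# Same idea as radar_to_name(): each data frame must yield exactly one name.
# vapply(..., character(1)) is used instead of purrr::map_chr().
radar_names <- function(vpts_df_list) {
  vapply(vpts_df_list, function(df) unique(df$radar), character(1))
}

# unique() returns two values, so the length-1 requirement is violated.
result <- tryCatch(radar_names(toy_vpts), error = function(e) e)
print(conditionMessage(result))
```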
@PietrH any suggestion for a good resolution? Which name should we pick? I guess we have both the ODIM and the WMO code in one CSV here.
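One possible direction, purely as a starting point for discussion: pick a single label per data frame and fail loudly when that is not possible. The sketch below uses base R so it runs standalone. `pick_radar_name()` and `radar_to_name_tolerant()` are hypothetical names, and the rule that an ODIM code is five lowercase letters while the other label is all digits is an assumption based on the two values seen in this file:

``` r
# Hypothetical helper: choose one label when several occur in a file.
# Assumption: ODIM codes look like five lowercase letters (e.g. "deboo"),
# while the other label is an all-digit code (e.g. "10132").
pick_radar_name <- function(radar_values) {
  candidates <- unique(radar_values)
  if (length(candidates) == 1) {
    return(candidates)
  }
  odim_like <- candidates[grepl("^[a-z]{5}$", candidates)]
  if (length(odim_like) != 1) {
    stop(
      "Cannot pick a single radar name from: ",
      paste(candidates, collapse = ", ")
    )
  }
  odim_like
}

# Tolerant variant of radar_to_name(), with vapply() in place of purrr::map_chr().
radar_to_name_tolerant <- function(vpts_df_list) {
  stats::setNames(
    vpts_df_list,
    vapply(vpts_df_list, function(df) pick_radar_name(df$radar), character(1))
  )
}

# Made-up data frame mixing both labels, as in the October 2017 file.
toy_vpts <- list(
  data.frame(radar = c("10132", "10132", "deboo"), height = c(0, 200, 0))
)
named <- radar_to_name_tolerant(toy_vpts)
print(names(named))
```

This only fixes the list names. The `radar` column itself would still contain both values, so harmonizing the column on read might be the cleaner option.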
