There are two ways we can add data provenance capabilities to downloads:
Separate folders for each version:
- harmonized/core/501c3-pz/v10/CORE-1989-501C3-CHARITIES-PZ-HRMN.csv
- harmonized/core/501c3-pz/v11/CORE-1989-501C3-CHARITIES-PZ-HRMN.csv
Version suffix in the dataset name:
- harmonized/core/501c3-pz/CORE-1989-501C3-CHARITIES-PZ-HRMN-v10.csv
- harmonized/core/501c3-pz/CORE-1989-501C3-CHARITIES-PZ-HRMN-v11.csv
There are merits to both. Folders are easier to manage. Versioned dataset names are nice primarily because people then have a record of the version they are using, provided they retain the original file names. But you can also determine the version using hash values.
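As a rough sketch of the hash-based check, the snippet below compares a local file's MD5 hash against a small manifest of per-version hashes. The manifest, the `identify_version()` helper, and the toy CSV contents are all hypothetical, made up for illustration; no such manifest is published yet. `md5sum()` comes from the `tools` package that ships with R.

```r
library(tools)  # provides md5sum()

# Stand-ins for two releases of the same file (toy contents, hypothetical columns).
fn <- "CORE-1989-501C3-CHARITIES-PZ-HRMN.csv"
dir_v10 <- file.path(tempdir(), "v10")
dir_v11 <- file.path(tempdir(), "v11")
dir.create(dir_v10, showWarnings = FALSE)
dir.create(dir_v11, showWarnings = FALSE)
writeLines(c("EIN,TOTREV", "1,100"), file.path(dir_v10, fn))
writeLines(c("EIN,TOTREV", "1,105"), file.path(dir_v11, fn))

# Hypothetical manifest: one hash per released version of this file.
manifest <- c(
  v10 = unname(md5sum(file.path(dir_v10, fn))),
  v11 = unname(md5sum(file.path(dir_v11, fn)))
)

# Return the version label whose hash matches the file, or NA if none matches.
identify_version <- function(path, manifest) {
  h <- unname(md5sum(path))
  hit <- names(manifest)[manifest == h]
  if (length(hit) == 0) NA_character_ else hit[1]
}

# A user's local copy keeps the original, unversioned file name.
local_copy <- file.path(tempdir(), fn)
file.copy(file.path(dir_v11, fn), local_copy, overwrite = TRUE)
detected <- identify_version(local_copy, manifest)
```

With something like this, the file name can stay unchanged across versions and the version is still recoverable after the fact, as long as a hash manifest is kept alongside each release.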
I think folders would preserve workflows better because the file names would remain intact; you are just changing the download or acquisition step (which could be a single argument).
base <- "https://nccsdata.s3.us-east-1.amazonaws.com/harmonized/core/501c3-pc/"
version <- "v11/"
filename <- "CORE-2013-501C3-CHARITIES-PC-HRMN.csv"
url <- paste0( base, version, filename )
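To make the "one argument" idea concrete, the same URL construction could be wrapped in a small helper. This is only a sketch: `build_core_url()` is a hypothetical name, and it assumes the versioned folder layout proposed above actually exists on the bucket.

```r
# Build the download URL for one harmonized CORE file.
# `version` is the only argument a user changes to switch releases;
# the file name stays the same across versions.
build_core_url <- function(filename, version = "v11",
                           base = "https://nccsdata.s3.us-east-1.amazonaws.com/harmonized/core/501c3-pc/") {
  paste0(base, version, "/", filename)
}

filename <- "CORE-2013-501C3-CHARITIES-PC-HRMN.csv"
url_v11 <- build_core_url(filename)                   # default version
url_v10 <- build_core_url(filename, version = "v10")  # pin an older release

# d <- read.csv(url_v11)  # not run here: assumes the versioned folder exists
```

Existing scripts would only need their acquisition step updated; everything downstream that refers to the file by name keeps working.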