
Check for uniqueness and non-missingness in an identifier variable
This function checks whether an identifier variable in a GADSdat object meets two key requirements for data quality:
The identifier variable must be unique (i.e., no duplicate values).
The identifier variable must not contain any missing values (NA).
These checks help ensure that the identifier variable can uniquely and reliably index rows within the data.
Value
A list with two components summarizing the issues found:
missing_ids: A data.frame with the row indices of observations where the identifier variable has missing values.
duplicate_ids: A data.frame with the values of duplicate identifiers (if any).
If there are no missing or duplicate identifiers, the respective data.frame will be empty.
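To make the two checks and the shape of the return value concrete, here is a minimal base R sketch that mimics them on a plain data.frame. It is not the eatFDZ implementation: the helper name check_id_sketch, the toy data, and the choice to name the duplicates column after idVar are assumptions made for illustration only.

```r
# Hypothetical helper for illustration; NOT the eatFDZ implementation of check_id().
check_id_sketch <- function(dat, idVar) {
  ids <- dat[[idVar]]

  # Row indices where the identifier is missing
  missing_rows <- which(is.na(ids))

  # Identifier values that occur more than once (missing values are ignored here)
  non_missing <- ids[!is.na(ids)]
  dup_values <- unique(non_missing[duplicated(non_missing)])

  # Assumption: the duplicates column is named after the identifier variable
  duplicate_ids <- stats::setNames(
    data.frame(dup_values, stringsAsFactors = FALSE),
    idVar
  )

  list(
    missing_ids = data.frame(Rows = missing_rows),
    duplicate_ids = duplicate_ids
  )
}

# Toy data: row 3 has a missing ID, and "A2" appears twice
toy <- data.frame(
  ID = c("A1", "A2", NA, "A2", "A3"),
  score = c(1, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)

res <- check_id_sketch(toy, idVar = "ID")
res$missing_ids    # one row: Rows = 3
res$duplicate_ids  # one row: ID = "A2"
```

With a clean identifier, both data.frames in the sketch have zero rows, which matches the empty-result behaviour described above.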
Examples
# Example usage
# Load example GADSdat object
GADSdat <- eatGADS::import_spss(system.file("extdata", "example_data2.sav", package = "eatFDZ"))
# Check identifier variable for uniqueness and non-missingness
id_check <- check_id(GADSdat, idVar = "ID")
# View rows with missing identifier values
print(id_check$missing_ids)
#> Rows
#> 1 12
# View duplicate identifier values
print(id_check$duplicate_ids)
#> ID
#> 11 SH0886