New update of kutils package for R available

I have uploaded kutils-1.31 to KRAN. To try it, set your repositories like so:

CRAN <- "http://rweb.crmda.ku.edu/cran"
KRAN <- "http://rweb.crmda.ku.edu/kran"
options(repos = c(KRAN, CRAN))

If you have kutils already, then fire away with an update:

update.packages(ask = FALSE, checkBuilt = TRUE)

In case you have not installed kutils before now (how could that possibly be?), run

install.packages("kutils", dep = TRUE)

Today's updates include these new functions:

importQualtrics : imports Qualtrics files (csv or xlsx), gets the column names correct, and avoids the corruption caused by lines 2 and 3 in most Qualtrics files. I need an example file for this one. If you have a Qualtrics file for which you have legal rights, let's see about it.

mergeCheck : reports on bad and unmatched values of the "by" (key) variable. Has examples!

anonomize : converts ID values to conceal the identity of survey participants

keyTemplateSPSS : creates a key describing the value_old, value_new changes implied by SPSS value labels
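
To make the ID-concealing idea behind anonomize concrete, here is a minimal base R sketch. It is not the kutils implementation. The helper name anonId and its prefix argument are hypothetical, chosen only for illustration.

```r
# Hypothetical helper, not part of kutils: replaces each distinct ID with a
# randomly assigned code, so repeated IDs still map to the same code.
anonId <- function(id, prefix = "anon") {
    uniq <- unique(id)
    # Shuffle the code numbers so they do not reveal the original ordering.
    codes <- sample(seq_along(uniq))
    paste0(prefix, "-", codes[match(id, uniq)])
}

id <- c("smith", "jones", "smith", "lee")
anon <- anonId(id)
anon
```

Rows 1 and 3 receive the same code because they share an ID. The codes themselves change from run to run unless you call set.seed() first.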

Please test mergeCheck. It is the function I've been promising you for some time (example pasted below). I need to hear how it handles your own key variables and such, including anything it mangles.
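
If you want to compare mergeCheck against something you can read in one sitting, the following base R sketch runs the same three kinds of checks (bad keys, duplicated keys, unmatched keys). It is only an approximation under my own assumptions, not the code inside kutils. The helper name checkKeys and the names of the returned list elements are hypothetical, and it treats whitespace-only keys as bad, which may differ from what mergeCheck does.

```r
# Hypothetical helper, not part of kutils: flags problem keys before a merge.
checkKeys <- function(x, y, by) {
    kx <- x[[by]]
    ky <- y[[by]]
    # Assumption: a key is "bad" if it is NA/NaN or blank after trimming.
    isBad <- function(k) is.na(k) | !nzchar(trimws(as.character(k)))
    # Flag every occurrence of a repeated key, not only the later copies.
    isDuped <- function(k) duplicated(k) | duplicated(k, fromLast = TRUE)
    list(badX = x[isBad(kx), , drop = FALSE],
         badY = y[isBad(ky), , drop = FALSE],
         dupedX = x[isDuped(kx) & !isBad(kx), , drop = FALSE],
         dupedY = y[isDuped(ky) & !isBad(ky), , drop = FALSE],
         unmatchedX = x[!(kx %in% ky), , drop = FALSE],
         unmatchedY = y[!(ky %in% kx), , drop = FALSE])
}

df1 <- data.frame(id = c(1:5, NA), x = rnorm(6))
df2 <- data.frame(id = c(2:6, 6), x = rnorm(6))
res <- checkKeys(df1, df2, by = "id")
res$unmatchedX
```

If your own data produce a different answer from mergeCheck than from a hand check like this one, that difference is exactly the kind of report I am looking for.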

Also, this release includes the miscellaneous little updates for the variable key that we've been implementing for the past 2 months.

keysPoolCheck : for several keys from related data sets, checks for mismatches in class

keysPool : tries to blend keys. We have had this since summer, but did not export it because it was not well tested. So please test it.
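
The class-mismatch problem that keysPoolCheck looks for can be illustrated directly on data frames. This sketch does not use variable keys and is not how kutils does it. The helper name classMismatch and the example data frames wave1 and wave2 are hypothetical.

```r
# Hypothetical helper, not part of kutils: tabulates the class of each shared
# column across a named list of data frames and keeps the rows that disagree.
classMismatch <- function(dfs) {
    shared <- Reduce(intersect, lapply(dfs, names))
    # One column of class names per data frame, one row per shared variable.
    classes <- do.call(cbind, lapply(dfs, function(d) {
        vapply(d[shared], function(col) class(col)[1], character(1))
    }))
    differs <- apply(classes, 1, function(r) length(unique(r)) > 1)
    classes[differs, , drop = FALSE]
}

dfs <- list(wave1 = data.frame(id = 1:3, score = c(1.5, 2, 3),
                               stringsAsFactors = FALSE),
            wave2 = data.frame(id = c("1", "2", "3"), score = c(2, 4, 6),
                               stringsAsFactors = FALSE))
mm <- classMismatch(dfs)
mm
```

Here id is integer in one data set and character in the other, so it is the only variable reported. A mismatch like that is what makes pooled keys, and later merges, go wrong.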

Why now?

Soon we will have a separate package, semTable, for the SEM table writer functions. That is, we have it now, and it will soon be on CRAN. That package relies on some general-purpose functions in kutils, so kutils must be uploaded to CRAN and approved before we can upload semTable.

mergeCheck output to tantalize you:

> library(kutils)
> example(mergeCheck)

mrgChc> df1 <- data.frame(id = 1:7, x = rnorm(7))

mrgChc> df2 <- data.frame(id = c(2:6, 9:10), x = rnorm(7))

mrgChc> mc1 <- mergeCheck(df1, df2, by = "id")
Merge difficulties detected

Unmatched cases from df1 and df2 :
df1
  id          x
1  1 -0.9292122
7  7  0.3819611
df2
  id          x
6  9 -1.3481932
7 10  0.6325223

mrgChc> ## Use mc1 objects mc1$keysBad, mc1$keysDuped, mc1$unmatched
mrgChc> df1 <- data.frame(id = c(1:3, NA, NaN, "", " "), x = rnorm(7))

mrgChc> df2 <- data.frame(id = c(2:6, 5:6), x = rnorm(7))

mrgChc> mergeCheck(df1, df2, by = "id")
Merge difficulties detected

Unacceptable key values
df1
    id          x
4 <NA> -0.5295721
5  NaN -0.1538291
6       0.3073913
Duplicated key values
df2
  id           x
4  5  0.57832104
5  6 -0.01897907
6  5  1.08232745
7  6  1.88448038
Unmatched cases from df1 and df2 :
df1
    id          x
1    1  0.9593052
4 <NA> -0.5295721
5  NaN -0.1538291
6       0.3073913
7      -0.3304785
df2
  id           x
3  4 -0.47677841
4  5  0.57832104
5  6 -0.01897907
6  5  1.08232745
7  6  1.88448038

mrgChc> df1 <- data.frame(idx = c(1:5, NA, NaN), x = rnorm(7))

mrgChc> df2 <- data.frame(idy = c(2:6, 9:10), x = rnorm(7))

mrgChc> mergeCheck(df1, df2, by.x = "idx", by.y = "idy")
Merge difficulties detected

Unacceptable key values
df1
  idx         x
6  NA -1.457980
7 NaN -0.707062
Unmatched cases from df1 and df2 :
df1
  idx          x
1   1  0.8150989
6  NA -1.4579796
7 NaN -0.7070620
df2
  idy         x
5   6 1.2318111
6   9 1.1306955
7  10 0.1466441