1.  

    Indexing a data frame sometimes changes the class of the result, which can have confusing consequences. For instance, selecting a single column with df[, j] returns a plain vector (e.g. 'integer', 'factor' or 'logical') instead of a one-column data frame. The following demonstrates this:



    df<- data.frame(num=1:10, al=letters[1:10], bool=c(rep(TRUE,5), rep(FALSE,5)) )

    rownames(df)<- df$al

    df
    #   num al  bool
    # a   1  a  TRUE
    # b   2  b  TRUE
    # c   3  c  TRUE
    # d   4  d  TRUE
    # e   5  e  TRUE
    # f   6  f FALSE
    # g   7  g FALSE
    # h   8  h FALSE
    # i   9  i FALSE
    # j  10  j FALSE

    class(df[,1])
    #[1] "integer"

    class(df[,2])
    #[1] "factor"   # "character" in R >= 4.0.0, where stringsAsFactors defaults to FALSE

    class(df[,3])
    #[1] "logical"

    df[,1]
    #[1]  1  2  3  4  5  6  7  8  9 10

    # Note that the following raises an error, because df[,1] is a vector!

    rowSums(df[,1])
    #Error in base::rowSums(x, na.rm = na.rm, dims = dims, ...) :
    #  'x' must be an array of at least two dimensions
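
    The error follows from the fact that a plain vector carries no dimensions, while rowSums() expects an array-like input with at least two. A minimal sketch of that check, using a reduced toy copy of the data frame above (the variable names col_vec, dims_vec and is_df_vec are introduced here purely for illustration):

    ```r
    # Toy data frame mirroring the first two columns of the example above
    df <- data.frame(num = 1:10, al = letters[1:10])

    # Single-column selection uses drop = TRUE by default
    col_vec <- df[, 1]

    # A plain vector has no dim attribute, so dim() returns NULL
    dims_vec <- dim(col_vec)

    # The result is no longer a data frame
    is_df_vec <- is.data.frame(col_vec)
    ```

    Checking dim() or is.data.frame() on an intermediate result is a quick way to spot where the data frame format was lost.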

    Setting the argument drop=FALSE when indexing keeps the result as a data frame, even if only a single column is selected.


    class(df[,1, drop=FALSE])
    #[1] "data.frame"

    df[,1, drop=FALSE] 

    #  num
    #a   1
    #b   2
    #c   3
    #d   4
    #e   5
    #f   6
    #g   7
    #h   8
    #i   9
    #j  10

    # No error raised by the following command!

    rowSums(df[,1, drop=FALSE])
    #  a  b  c  d  e  f  g  h  i  j 
    #  1  2  3  4  5  6  7  8  9 10 
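
    As a complement to drop=FALSE, list-style indexing with single brackets and no comma also keeps the data frame format, whereas double brackets always extract the column itself. The sketch below is an illustration only, again on a reduced toy copy of the data frame; the names by_list, by_name, by_elem and sums are assumptions made for this example:

    ```r
    # Toy data frame mirroring the first two columns of the example above
    df <- data.frame(num = 1:10, al = letters[1:10])
    rownames(df) <- df$al

    # Single brackets without a comma: the result stays a data frame
    by_list <- df[1]
    by_name <- df["num"]

    # Double brackets: the column is extracted as a vector
    by_elem <- df[["num"]]

    # rowSums() accepts the one-column data frame and keeps the row names
    sums <- rowSums(df["num"])
    ```

    Which form to prefer is a matter of taste: df[, 1, drop=FALSE] makes the intent explicit, while df[1] is shorter.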


About Me
I am a Postdoc researcher at the Neuromuscular Disorders Research lab and Genetic Determinants of Osteoporosis Research lab, in University of Helsinki and Folkhälsan RC. I specialize in Bioinformatics. I am interested in Machine learning and multi-omics data analysis. My go-to programming language is R.